Proposal

The main paper can be found here.

  • uses GNNs for particle reconstruction in detectors with irregular geometry
  • proposes 2 new distance-weighted GNN layers: GravNet and GarNet
  • the authors open-source their TensorFlow-based implementation here
  • the proposed computations assume no structure-dependent information from the input data, so the approach could generalize to other tasks such as tracking and jet identification

GravNet and GarNet layers

  • input: $$X$$ of dimension $$B \times V \times F_{in}$$
    • $$B$$ = number of elements in the batch
    • $$V$$ = number of detector hits per element
    • $$F_{in}$$ = input feature dimension per hit
  • perform a linear transformation with bias, producing $$Y$$
    • each vector in $$Y$$ has dimension $$S + F_{lr}$$
    • $$S$$ = dimension of the learned spatial representation of the input vector
    • $$F_{lr}$$ = number of learned features for the nodes in the resulting graph
  • graph construction (done for each element):
    • GravNet - a kNN graph is constructed for each element in the batch, based on the pairwise Euclidean distances between all hits in that element, computed in the learned $$S$$-dimensional space
    • GarNet - each of the $$S$$ values is interpreted as the distance from that hit to one of $$S$$ aggregators
    • the distance between the jth and kth vertices in such a graph is denoted $$d_{jk}$$
  • GravNet and GarNet layer computations (see the sketches after this list)
    • scale the features based on a potential function $$V_p$$
      • $$f_{jk}^i = f_j^i \, V_p(d_{jk})$$
      • GravNet: $$V_p(d) = e^{-d^2}$$
      • GarNet: $$V_p(d) = e^{-|d|}$$
    • aggregate the scaled features
      • $$f_k^i = \mathrm{agg}_j(f_{jk}^i)$$
      • both max and mean were tried
      • mean was the most effective aggregation function for their use case
    • transformation (output dimension = $$F_{out}$$):
      • in the case of GarNet, each aggregated feature $$f_k^i$$ is weighted again by the same potential in order to project it back to the original vertices
      • concatenate the input $$F_{in}$$ features with these aggregated features
      • perform a linear transformation with bias, followed by a tanh activation
  • a custom loss function is defined based on domain knowledge
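
To make the layer computation concrete, here is a minimal NumPy sketch of a GravNet-style layer for a single batch element, written under the assumptions stated in these notes. It is not the authors' TensorFlow code; the function name is chosen here and the random weight matrices stand in for learned parameters.

```python
import numpy as np

def gravnet_layer(x, s=4, f_lr=22, f_out=48, k=40, rng=np.random.default_rng(0)):
    """x: (V, F_in) hit features for a single batch element."""
    v, f_in = x.shape
    # 1) linear transformation with bias: S spatial coords + F_lr learned features
    w1, b1 = rng.normal(size=(f_in, s + f_lr)), np.zeros(s + f_lr)
    y = x @ w1 + b1
    coords, feats = y[:, :s], y[:, s:]                        # (V, S), (V, F_lr)
    # 2) kNN graph from pairwise Euclidean distances in the learned space
    d2 = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(-1)  # squared d_jk
    nbrs = np.argsort(d2, axis=1)[:, 1:k + 1]                 # k neighbours, self excluded
    # 3) scale neighbour features with the potential V_p(d_jk) = exp(-d_jk^2)
    pot = np.exp(-d2[np.arange(v)[:, None], nbrs])            # (V, k)
    scaled = feats[nbrs] * pot[..., None]                     # (V, k, F_lr)
    # 4) aggregate over neighbours (mean worked best; max is the alternative)
    agg = scaled.mean(axis=1)                                 # (V, F_lr)
    # 5) concatenate with the input features; dense layer with bias + tanh
    cat = np.concatenate([x, agg], axis=1)
    w2, b2 = rng.normal(size=(cat.shape[1], f_out)), np.zeros(f_out)
    return np.tanh(cat @ w2 + b2)

print(gravnet_layer(np.random.default_rng(1).normal(size=(128, 10))).shape)  # (128, 48)
```

Under the same caveats, a GarNet-style layer replaces the kNN graph with $$S$$ aggregators: the $$S$$ outputs of the dense layer are read as distances $$d_{jk}$$ to the aggregators, features are gathered at the aggregators with the potential $$e^{-|d|}$$, and then weighted again to project them back to the vertices. The mean in the gather step follows these notes; the paper's exact normalization may differ.

```python
import numpy as np

def garnet_layer(x, s=4, f_lr=20, f_out=32, rng=np.random.default_rng(0)):
    """x: (V, F_in) hit features for a single batch element."""
    v, f_in = x.shape
    w1, b1 = rng.normal(size=(f_in, s + f_lr)), np.zeros(s + f_lr)
    y = x @ w1 + b1
    dists, feats = y[:, :s], y[:, s:]            # distances to S aggregators, features
    pot = np.exp(-np.abs(dists))                 # V_p(d) = exp(-|d|), shape (V, S)
    # gather: each aggregator takes the mean of the potential-weighted vertex features
    agg = (pot[..., None] * feats[:, None, :]).mean(axis=0)       # (S, F_lr)
    # scatter: weight the aggregated features again to send them back to the vertices
    back = (pot[..., None] * agg[None, :, :]).reshape(v, -1)      # (V, S * F_lr)
    cat = np.concatenate([x, back], axis=1)
    w2, b2 = rng.normal(size=(cat.shape[1], f_out)), np.zeros(f_out)
    return np.tanh(cat @ w2 + b2)

print(garnet_layer(np.random.default_rng(1).normal(size=(128, 10))).shape)  # (128, 32)
```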

models

  • both local and global exchange of messages across sensors is proposed (a sketch of the resulting block structure follows this list)
  • GravNet model
    • has 4 blocks, each consisting of
      • concatenation of the vertex features with the mean of the vertex features (global exchange)
      • 3 dense layers with tanh activation (dim = 64)
      • one GravNet layer
    • GravNet layer parameters
      • $$k$$ for the kNN graph is set to 40
      • $$S$$ = 4
      • $$F_{lr}$$ = 22
      • $$F_{out}$$ = 48
    • the outputs of all 4 blocks are concatenated and passed to a final dense layer with dim = 128 and relu activation
  • GarNet model
    • has 4 blocks, each consisting of
      • concatenation of the vertex features with the mean of the vertex features (global exchange)
      • one dense layer with tanh activation (dim = 32)
      • 11 GarNet layers!
    • GarNet layer parameters
      • $$S$$ = 4
      • $$F_{lr}$$ = 20
      • $$F_{out}$$ = 32
    • the outputs of all 4 blocks are concatenated and passed to a final dense layer with dim = 48 and relu activation
  • batch norm is applied to the input and output of all blocks
  • for both models, the following 2 layers are applied at the end
    • a dense layer with dim = 3 and relu activation
    • another dense layer with dim = 2 and softmax activation
  • both models are trained using the Adam optimizer
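
As a rough illustration of the GravNet model's block structure, here is a NumPy sketch with hypothetical helpers: `gravnet_layer` is a dense stand-in for the real layer sketched earlier, `standardize` is a crude substitute for batch normalization, and all weights are random rather than learned.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, f_out, act):
    # random weights stand in for learned parameters
    return act(x @ rng.normal(size=(x.shape[1], f_out), scale=0.1))

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def standardize(x):
    # crude stand-in for batch normalization
    return (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-6)

def gravnet_layer(x):
    # dense stand-in for the GravNet layer (F_out = 48); see the layer sketch above
    return dense(x, 48, np.tanh)

def block(x):
    # global exchange: concatenate each vertex's features with their mean
    x = np.concatenate([x, np.broadcast_to(x.mean(axis=0), x.shape)], axis=1)
    for _ in range(3):                         # 3 dense layers, dim = 64, tanh
        x = dense(x, 64, np.tanh)
    return standardize(gravnet_layer(x))       # batch norm on the block output

x = standardize(rng.normal(size=(128, 10)))    # one element: 128 hits, 10 features
outs = []
for _ in range(4):                             # 4 blocks, each fed by the previous
    x = block(x)
    outs.append(x)
h = dense(np.concatenate(outs, axis=1), 128, relu)  # concat block outputs -> dense 128, relu
h = dense(h, 3, relu)                          # dense, dim = 3, relu
probs = dense(h, 2, softmax)                   # dense, dim = 2, softmax
print(probs.shape)                             # (128, 2)
```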