Proposal

Main paper can be found here.

  • propose a graph contrastive learning (GraphCL) framework for learning unsupervised representations of graph data
  • to incorporate priors they propose using various augmentation schemes
  • this approach additionally helps boost robustness against adversarial attacks

Summary

  • All the code is open-sourced here.
  • Authors stress the importance of pre-training methods for GNNs too

Augmentation

  • mostly focuses on graph-level augmentations
  • augmentation methods
    • node dropping - randomly discard certain nodes and their connections
    • edge perturbation - randomly add or drop a certain ratio of edges
    • attribute masking - randomly mask certain attributes of the vertices
    • subgraph - sample a subgraph by doing a random walk on the original graph
  • default augmentation ratio is 0.2
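
The four augmentations above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' released code: the function names, the `(2, E)` `edge_index` layout, masking with zeros, and the reading of `ratio` for the subgraph case (keep roughly `1 - ratio` of the nodes) are all assumptions made for this example.

```python
import numpy as np

def _rng(rng):
    # Hypothetical helper: fall back to a fresh generator when none is given.
    return np.random.default_rng() if rng is None else rng

def drop_nodes(edge_index, num_nodes, ratio=0.2, rng=None):
    """Discard a random `ratio` of nodes together with their edges."""
    rng = _rng(rng)
    dropped = rng.choice(num_nodes, size=int(num_nodes * ratio), replace=False)
    keep = np.ones(num_nodes, dtype=bool)
    keep[dropped] = False
    # An edge survives only if both of its endpoints survive.
    edge_mask = keep[edge_index[0]] & keep[edge_index[1]]
    return edge_index[:, edge_mask], keep

def perturb_edges(edge_index, num_nodes, ratio=0.2, rng=None):
    """Drop a random `ratio` of edges and add as many random ones."""
    rng = _rng(rng)
    num_edges = edge_index.shape[1]
    n_change = int(num_edges * ratio)
    kept = rng.choice(num_edges, size=num_edges - n_change, replace=False)
    # Assumption: new edges are sampled uniformly; duplicates and
    # self-loops are not filtered out in this sketch.
    added = rng.integers(0, num_nodes, size=(2, n_change))
    return np.concatenate([edge_index[:, kept], added], axis=1)

def mask_attributes(x, ratio=0.2, rng=None):
    """Mask the attributes of a random `ratio` of nodes (here: set to 0)."""
    rng = _rng(rng)
    x = x.copy()
    masked = rng.choice(x.shape[0], size=int(x.shape[0] * ratio), replace=False)
    x[masked] = 0.0
    return x

def random_walk_subgraph(edge_index, num_nodes, ratio=0.2, rng=None,
                         max_steps=1000):
    """Keep the nodes visited by a random walk (edges treated as undirected)."""
    rng = _rng(rng)
    target = max(1, int(num_nodes * (1 - ratio)))
    neighbors = [[] for _ in range(num_nodes)]
    for u, v in edge_index.T:
        neighbors[u].append(v)
        neighbors[v].append(u)
    current = int(rng.integers(num_nodes))
    visited = {current}
    for _ in range(max_steps):
        if len(visited) >= target or not neighbors[current]:
            break
        current = int(rng.choice(neighbors[current]))
        visited.add(current)
    keep = np.zeros(num_nodes, dtype=bool)
    keep[list(visited)] = True
    edge_mask = keep[edge_index[0]] & keep[edge_index[1]]
    return edge_index[:, edge_mask], keep
```

Each function returns a new view of the graph and leaves its inputs untouched, so two independent calls on the same graph yield a positive pair for the contrastive step.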

Graph contrastive learning

  • Flowchart
    • take the input graph
    • perform augmentations as described above to generate different graphs
    • pass these graphs through a GNN model to generate graph-level embeddings
    • pass each embedding through a projection head (in this case the authors used a 2-layer MLP)
    • apply a contrastive loss function (e.g. normalized temperature-scaled cross entropy loss) to maximize similarity between positive pairs and minimize similarity between negative pairs
  • $$\mathrm{sim}(z_i, z_j) = \frac{z_i^\top z_j}{\lVert z_i \rVert \, \lVert z_j \rVert}$$
    • $$z_*$$ = output of projection head for a given graph
    • $$i, j$$ = the two augmented views of a graph
  • $$l_n = -\log\frac{\exp(\mathrm{sim}(z_{n,i}, z_{n,j}) / \tau)}{\sum_{n' \neq n}\exp(\mathrm{sim}(z_{n,i}, z_{n',j}) / \tau)}$$
    • $$l_n$$ = loss for the $$n$$th graph
    • $$n \in [0, N)$$
    • $$N$$ = number of graphs in the minibatch
    • $$\tau$$ = temperature parameter
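
To make the loss concrete, here is a minimal NumPy sketch of the per-graph loss. It assumes that the negatives for graph $$n$$ pair its view $$i$$ with view $$j$$ of every other graph in the batch, and that the positive pair is left out of the denominator. The function name `graphcl_loss` and the default `tau=0.5` are hypothetical choices for this example, not values taken from the paper.

```python
import numpy as np

def graphcl_loss(z_i, z_j, tau=0.5):
    """Per-graph contrastive loss l_n.

    z_i, z_j: arrays of shape (N, d) holding the projection-head outputs
    for the two augmented views of the same N graphs (row n <-> graph n).
    Requires N >= 2 so that every graph has at least one negative.
    """
    # Normalize rows so that a dot product equals cosine similarity.
    z_i = z_i / np.linalg.norm(z_i, axis=1, keepdims=True)
    z_j = z_j / np.linalg.norm(z_j, axis=1, keepdims=True)
    # sim[n, m] = sim(z_{n,i}, z_{m,j})
    sim = z_i @ z_j.T
    scores = np.exp(sim / tau)
    pos = np.diag(scores)            # positive pairs: same graph, two views
    neg = scores.sum(axis=1) - pos   # negatives: view j of all other graphs
    return -np.log(pos / neg)

# Usage: three graphs whose two views agree perfectly and are mutually
# orthogonal; the batch loss is the mean of the per-graph losses.
z = np.eye(3)
batch_loss = graphcl_loss(z, z).mean()
```

Because the positive term is excluded from the denominator, the value can be negative when the views align well; only its relative size matters during optimization.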