GNN 101 (Part 2)


[figure]
The formula is as follows:

[figure]
Looking at this more closely, we find that in the SAGA pattern, ApplyEdge and ApplyVertex are conventional NN (Neural Network) operations from traditional deep learning that we can reuse, while Scatter and Gather are new operations introduced by GNNs. In other words, Graph Computing = Graph Ops + NN Ops.
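To make the decomposition concrete, here is a minimal PyTorch sketch of one SAGA-style message-passing layer. This is my own illustration, not code from the original slides; the names SAGALayer, apply_edge, and apply_vertex are assumptions chosen to mirror the terminology above.

```python
import torch
import torch.nn as nn

class SAGALayer(nn.Module):
    """One Scatter-ApplyEdge-Gather-ApplyVertex step (illustrative sketch)."""
    def __init__(self, dim):
        super().__init__()
        self.apply_edge = nn.Linear(2 * dim, dim)    # NN op: per-edge message
        self.apply_vertex = nn.Linear(2 * dim, dim)  # NN op: per-node update

    def forward(self, h, edge_index):
        src, dst = edge_index                        # edge_index: (2, num_edges)
        # Scatter (graph op): move node states onto their incident edges.
        edge_in = torch.cat([h[src], h[dst]], dim=-1)
        # ApplyEdge (NN op): compute a message for every edge.
        msg = torch.relu(self.apply_edge(edge_in))
        # Gather (graph op): sum messages back onto destination nodes.
        agg = torch.zeros_like(h).index_add_(0, dst, msg)
        # ApplyVertex (NN op): update each node from its old state and aggregate.
        return torch.relu(self.apply_vertex(torch.cat([h, agg], dim=-1)))
```

Only the scatter (h[src], h[dst]) and gather (index_add_) steps touch the graph structure; everything else is a plain dense NN op that existing deep-learning kernels already cover.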
[figure]
Different graph dataset scales
  • One big graph: can have up to billions of nodes and tens of billions of edges.

[figure]
  • Many small graphs

[figure]
Different graph tasks
  • Node-level prediction: predict the class or property of nodes in the graph

[figure]
  • Edge-level prediction: predict whether an edge exists between two nodes, as well as the class or property of that edge

[figure]
  • Graph-level prediction: predict the class or property of an entire graph or subgraph (a sketch of the corresponding prediction heads follows the figure below)

[figure]
How
Workflow
[figure]
Take fraud detection as an example:
  • Tabformer dataset
    [figure]
  • workflow
    [figure]
Software stack
[figure]
  • Compute plane

[figure]
  • Data plane

[figure]
SW Challenges
Graph Sampler
For many-small-graphs datasets, full-batch training works most of the time; full-batch training simply means training on the whole graph at once. When it comes to one-big-graph datasets, however, many real scenarios run into the neighbor explosion problem: a node's receptive field grows roughly as fan-out^L with the number of GNN layers L, so with an average degree of 10, a 3-layer GNN already touches on the order of 1,000 nodes per target node.
Neighbor Explosion:
[figure]
The graph sampler comes to the rescue: we sample only a fraction of the target nodes, and for each target node we further sample a sub-graph of its ego-network for training. This is called mini-batch training. Graph sampling is triggered for every data-loading step, and the number of hops in the sampled sub-graph equals the number of GNN layers, which means the graph sampler inside the data loader is a critical component of GNN training.
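Here is a minimal sketch of such mini-batch neighbor sampling, assuming DGL's dataloading API (NeighborSampler / DataLoader) and a toy random graph; other GNN frameworks expose an equivalent sampler, and all sizes below are illustrative.

```python
import dgl
import torch

# Toy stand-in for "one big graph": 1,000 nodes, 20,000 random edges.
g = dgl.rand_graph(1000, 20000)
g.ndata['feat'] = torch.randn(1000, 16)
train_nids = torch.arange(800)                        # target (seed) nodes

# One fan-out entry per hop, so len([10, 10]) matches a 2-layer GNN.
sampler = dgl.dataloading.NeighborSampler([10, 10])
dataloader = dgl.dataloading.DataLoader(
    g, train_nids, sampler, batch_size=128, shuffle=True, drop_last=False)

for input_nodes, output_nodes, blocks in dataloader:
    # blocks = one bipartite message-passing graph per hop / GNN layer.
    # Sampling re-runs on every iteration, so sampler throughput directly
    # bounds end-to-end training speed.
    batch_feats = g.ndata['feat'][input_nodes]
    break  # the GNN forward/backward pass over `blocks` would go here
```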
[figure]
Challenge: how to optimize the sampler, both as a standalone component and inside the training pipeline?
When the graph becomes huge (billions of nodes, tens of billions of edges), we face new at-scale challenges:
  • How to store the huge graph across nodes? -> graph partitioning (see the sketch after this list)
    • How to cut the graph while minimizing cross-partition connections?
  • How to build a training system with not only distributed model computation but also distributed graph storage and sampling?
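A sketch of the partitioning step, assuming DGL's dgl.distributed.partition_graph helper (METIS-based edge-cut partitioning, which tries to minimize cross-partition connections); the graph, names, and parameters below are illustrative.

```python
import dgl
import torch

g = dgl.rand_graph(10_000, 200_000)           # toy stand-in for the huge graph
g.ndata['feat'] = torch.randn(10_000, 16)

dgl.distributed.partition_graph(
    g,
    graph_name='toy',
    num_parts=4,                               # e.g., one partition per machine
    out_path='toy_partitions/',
    part_method='metis',                       # minimize the edge cut between parts
    balance_edges=True)                        # keep per-partition edge counts similar
```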

[figure]
A possible GNN distributed training architecture:
[figure]
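As a rough illustration of how such an architecture can look in code, here is a heavily hedged sketch assuming DistDGL-style APIs (dgl.distributed.*): the graph store and sampling are distributed across machines, while model computation remains an ordinary distributed PyTorch job. The exact names, the 'toy' graph, the train_mask field, and the required launcher / ip_config setup are all assumptions that depend on the deployment.

```python
import dgl
import torch

dgl.distributed.initialize('ip_config.txt')             # connect to graph-store servers
torch.distributed.init_process_group('gloo')             # model-computation side (env set by launcher)

g = dgl.distributed.DistGraph('toy')                     # partitioned, remote graph store
train_nids = dgl.distributed.node_split(                 # this trainer's share of seed nodes
    g.ndata['train_mask'], g.get_partition_book())

sampler = dgl.dataloading.NeighborSampler([10, 10])
dataloader = dgl.dataloading.DistNodeDataLoader(
    g, train_nids, sampler, batch_size=1024, shuffle=True)

for input_nodes, output_nodes, blocks in dataloader:     # remote sampling + feature fetch
    pass  # wrap the GNN in DistributedDataParallel and train as usual
```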
