This is a very detailed and clear intro to Graph Networks byDeepmind.
We define a graph to have node, edge, and global attributes. Globalattribute, u, for example canbe the gravitational field in a three body problem setting.
Global attribute is there to give a chance to any local node / edgeto know what’s happening in a global perspective (mostly places far awyfrom it). If we exclude the global <spanclass="math inline">u</span> (which aggregates information fromacross the nodes and edges), the information that a node has access toafter m steps of propagation is determined by the set of nodes and edgesthat are at most m hops away (Figure 7). This can be interpreted asbreaking down a complex computation into smaller elementary steps.
The whole GN consists of the following 6 functions:
\(\begin{array}{l}{\mathbf{e}_{k}^{\prime}=\phi^{e}\left(\mathbf{e}_{k},\mathbf{v}_{r_{k}},\mathbf{v}_{s_{k}},\mathbf{u}\right)}\\ {\mathbf{v}_{i}^{\prime}=\phi^{v}\left(}}_{i}^{\prime},\mathbf{v}_{i},\mathbf{u}\right)}\\ {\mathbf{u}^{\prime}=\phi^{u}\left({\bar{\mathbf{e}}}^{\prime},{\bar{\mathbf{v}}}^{\prime},\mathbf{u}\right)}\end{array}\begin{array}{l}{\bar}_{i}^{\prime}=\rho^{e\to v}\left(E_{i}^{\prime}\right)}\\ {\bar}^{\prime}=\rho^{e\to u}\left(E^{\prime}\right)}\\ {\bar}^{\prime}=\rho^{v\to u}\left(V^{\prime}\right)}\end{array}\)<p>ϕe isan edge specific function and is applied to every edge, same for nodeand global. ϕ can be anyfunction, people usually use Neural Networks.</p><p>On the other hand, ρ takesin a set as input, so needs to be an aggregate function invariant topermutations, like sum, mean, max, min, etc. These functions are notaffected by the order they are sent in.</p><h3 id="composing-multi-graph-block-architecture-4.3">Composing MultiGraph Block Architecture (4.3)</h3><p>A GN is simply three aggregate functions and three update functionsfor node, edge, global respectively. It always takes in a graphcomprised of edge, node, and global attributes as input, and returning agraph comprised of edge, node, and global attribute. The GN operates onany graph without caring about size of the graph, so it is extremelyeasy to stack multiple GN block together.</p><p>Figure 6 shows 3 commons ways of stacking GN blocks.</p><h3 id="graph-networks-advantage-3.2.4">Graph Networks Advantage(3.2.4)</h3><ol type="1"><li>First, graphs can express arbitrary relationships among entities, sothe way input interacts is not limited to a fixed modelarchitecture.</li><li>Graphs represent entities and their relations as sets, which areinvariant to permutations. However, we can still impose ordering byencoding the indices in the node or edge attribute.</li><li>A GN’s per-edge and per-node functions are reused across all edgesand nodes, respectively. Therefore, GN can easily operate on graphs ofany size.</li></ol><h3 id="limitations-5.2">Limitations (5.2)</h3><p>Notions like recursion, control flow, and conditional iteration arenot straightforward to represent with graphs.</p><h2 id="graphcast"><ahref=”https://www.science.org/doi/10.1126/science.adi2336”>GraphCast</a></h2><p>GraphCast by Deepmind is a perfect example to understandencode-process-decode GN structure and it is clearly written. Note themodel architecture is inside the <ahref=”https://www.science.org/doi/suppl/10.1126/science.adi2336/suppl_file/science.adi2336_sm.pdf”>supplementarymaterial</a>. The following notes mostly come from Section3 GraphCastmodel.</p><p>They defined two space for this task. One on the grid space based onearth’s latitude and logitude, where the atmospheric information comesfrom. The other on a “mesh space” based on a regular sphere, where theactual processing happens. The design was made on the observation that“Using the latitude-longitude grid is not an advisable representationdue to its spatial inhomogeneity, and high resolution at the poles whichdemands disproportionate compute resources.”</p>