Giant component

Large connected component of a random graph

{\displaystyle p=1/(n-1)} — An Erdős–Rényi–Gilbert random graph with 1000 vertices at the critical edge probability $p=1/(n-1)$ , showing a large component and many small ones. At this edge probability, the large component is not yet a giant component: it contains only a sublinear number of vertices.

In network theory, a giant component is a connected component of a given random graph that contains a significant fraction of the entire graph's vertices.

More precisely, in graphs drawn randomly from a probability distribution over arbitrarily large graphs, a giant component is a connected component whose fraction of the overall number of vertices is bounded away from zero. In sufficiently dense graphs distributed according to the Erdős–Rényi model, a giant component exists with high probability.

Giant component in Erdős–Rényi model

Giant components are a prominent feature of the Erdős–Rényi model (ER) of random graphs, in which each possible edge connecting pairs of a given set of n vertices is present, independently of the other edges, with probability p. In this model, if $p\leq {\frac {1-\epsilon }{n}}$ for any constant $\epsilon >0$ , then with high probability (in the limit as $n$ goes to infinity) all connected components of the graph have size O(log n), and there is no giant component. However, for $p\geq {\frac {1+\epsilon }{n}}$ there is with high probability a single giant component, with all other components having size O(log n). For $p=p_{c}={\frac {1}{n}}$ , intermediate between these two possibilities, the number of vertices in the largest component of the graph, $P_{\inf }$ is with high probability proportional to $n^{2/3}$ .^[1]

Giant component is also important in percolation theory.^[1]^[2] When a fraction of nodes, $q=1-p$ , is removed randomly from an ER network of degree $\langle k\rangle$ , there exists a critical threshold, $p_{c}={\frac {1}{\langle k\rangle }}$ . Above $p_{c}$ there exists a giant component (largest cluster) of size, $P_{\inf }$ . $P_{\inf }$ fulfills, $P_{\inf }=p(1-\exp(-\langle k\rangle P_{\inf }))$ . For $p<p_{c}$ the solution of this equation is $P_{\inf }=0$ , i.e., there is no giant component.

At $p_{c}$ , the distribution of cluster sizes behaves as a power law, $n(s)$ ~ $s^{-5/2}$ which is a feature of phase transition.

Alternatively, if one adds randomly selected edges one at a time, starting with an empty graph, then it is not until approximately $n/2$ edges have been added that the graph contains a large component, and soon after that the component becomes giant. More precisely, when t edges have been added, for values of t close to but larger than $n/2$ , the size of the giant component is approximately $4t-2n$ .^[1] However, according to the coupon collector's problem, $\Theta (n\log n)$ edges are needed in order to have high probability that the whole random graph is connected.

Graphs with arbitrary degree distributions

A similar sharp threshold between parameters that lead to graphs with all components small and parameters that lead to a giant component also occurs in random graphs with non-uniform degree distributions. The degree distribution does not define a graph uniquely. However under assumption that in all respects other than their degree distribution, the graphs are treated as entirely random, many results on finite/infinite-component sizes are known. In this model, the existence of the giant component depends only on the first two (mixed) moments of the degree distribution. Let a randomly chosen vertex have degree $k$ , then the giant component exists^[3] if and only if $\mathbb {E} [k^{2}]-2\mathbb {E} [k]>0.$ $\mathbb {E} [k]$ which is also written in the form of ${\langle k\rangle }$ is the mean degree of the network. Similar expressions are also valid for directed graphs, in which case the degree distribution is two-dimensional.^[4] There are three types of connected components in directed graphs. For a randomly chosen vertex:

out-component is a set of vertices that can be reached by recursively following all out-edges forward;
in-component is a set of vertices that can be reached by recursively following all in-edges backward;
weak component is a set of vertices that can be reached by recursively following all edges regardless of their direction.

Criteria for giant component existence in directed and undirected configuration graphs

Let a randomly chosen vertex has $k_{\text{in}}$ in-edges and $k_{\text{out}}$ out edges. By definition, the average number of in- and out-edges coincides so that $c=\mathbb {E} [k_{\text{in}}]=\mathbb {E} [k_{\text{out}}]$ . If $G_{0}(x)=\textstyle \sum _{k}\displaystyle P(k)x^{k}$ is the generating function of the degree distribution $P(k)$ for an undirected network, then $G_{1}(x)$ can be defined as $G_{1}(x)=\textstyle \sum _{k}\displaystyle {\frac {k}{\langle k\rangle }}P(k)x^{k-1}$ . For directed networks, generating function assigned to the joint probability distribution $P(k_{in},k_{out})$ can be written with two valuables $x$ and $y$ as: ${\mathcal {G}}(x,y)=\sum _{k_{in},k_{out}}\displaystyle P({k_{in},k_{out}})x^{k_{in}}y^{k_{out}}$ , then one can define $g(x)={\frac {1}{c}}{\partial {\mathcal {G}} \over \partial x}\vert _{y=1}$ and $f(y)={\frac {1}{c}}{\partial {\mathcal {G}} \over \partial y}\vert _{x=1}$ . The criteria for giant component existence in directed and undirected random graphs are given in the table below:


Type	Criteria
undirected: giant component	$\mathbb {E} [k^{2}]-2\mathbb {E} [k]>0$ ,^[3] or $G'_{1}(1)=1$ ^[4]
directed: giant in/out-component	$\mathbb {E} [k_{\text{in}}k_{out}]-\mathbb {E} [k_{\text{in}}]>0$ ,^[4] or $g'_{1}(1)=f'_{1}(1)=1$ ^[4]
directed: giant weak component	$2\mathbb {E} [k_{\text{in}}]\mathbb {E} [k_{\text{in}}k_{\text{out}}]-\mathbb {E} [k_{\text{in}}]\mathbb {E} [k_{\text{out}}^{2}]-\mathbb {E} [k_{\text{in}}]\mathbb {E} [k_{\text{in}}^{2}]+\mathbb {E} [k_{\text{in}}^{2}]\mathbb {E} [k_{\text{out}}^{2}]-\mathbb {E} [k_{\text{in}}k_{\text{out}}]^{2}>0$ ^[5]

References

^ ^a ^b ^c Bollobás, Béla (2001), "6. The Evolution of Random Graphs—The Giant Component", Random Graphs, Cambridge studies in advanced mathematics, vol. 73 (2nd ed.), Cambridge University Press, pp. 130–159, ISBN 978-0-521-79722-1.
^ Newman, M. E. J. (2010). Networks : an introduction. New York: Oxford University Press. OCLC 456837194.
^ ^a ^b Molloy, Michael; Reed, Bruce (1995). "A critical point for random graphs with a given degree sequence". Random Structures & Algorithms. 6 (2–3): 161–180. doi:10.1002/rsa.3240060204. ISSN 1042-9832.
^ ^a ^b ^c ^d Newman, M. E. J.; Strogatz, S. H.; Watts, D. J. (2001-07-24). "Random graphs with arbitrary degree distributions and their applications". Physical Review E. 64 (2): 026118. arXiv:cond-mat/0007235. Bibcode:2001PhRvE..64b6118N. doi:10.1103/physreve.64.026118. ISSN 1063-651X. PMID 11497662.
^ Kryven, Ivan (2016-07-27). "Emergence of the giant weak component in directed random graphs with arbitrary degree distributions". Physical Review E. 94 (1): 012315. arXiv:1607.03793. Bibcode:2016PhRvE..94a2315K. doi:10.1103/physreve.94.012315. ISSN 2470-0045. PMID 27575156. S2CID 206251373.