Giant component

Large connected component of a random graph
An Erdős–Rényi–Gilbert random graph with 1000 vertices at the critical edge probability p = 1 / ( n 1 ) {\displaystyle p=1/(n-1)} , showing a large component and many small ones. At this edge probability, the large component is not yet a giant component: it contains only a sublinear number of vertices.

In network theory, a giant component is a connected component of a given random graph that contains a significant fraction of the entire graph's vertices.

More precisely, in graphs drawn randomly from a probability distribution over arbitrarily large graphs, a giant component is a connected component whose fraction of the overall number of vertices is bounded away from zero. In sufficiently dense graphs distributed according to the Erdős–Rényi model, a giant component exists with high probability.

Giant component in Erdős–Rényi model

Giant components are a prominent feature of the Erdős–Rényi model (ER) of random graphs, in which each possible edge connecting pairs of a given set of n vertices is present, independently of the other edges, with probability p. In this model, if p 1 ϵ n {\displaystyle p\leq {\frac {1-\epsilon }{n}}} for any constant ϵ > 0 {\displaystyle \epsilon >0} , then with high probability (in the limit as n {\displaystyle n} goes to infinity) all connected components of the graph have size O(log n), and there is no giant component. However, for p 1 + ϵ n {\displaystyle p\geq {\frac {1+\epsilon }{n}}} there is with high probability a single giant component, with all other components having size O(log n). For p = p c = 1 n {\displaystyle p=p_{c}={\frac {1}{n}}} , intermediate between these two possibilities, the number of vertices in the largest component of the graph, P inf {\displaystyle P_{\inf }} is with high probability proportional to n 2 / 3 {\displaystyle n^{2/3}} .[1]

Giant component is also important in percolation theory.[1][2] When a fraction of nodes, q = 1 p {\displaystyle q=1-p} , is removed randomly from an ER network of degree k {\displaystyle \langle k\rangle } , there exists a critical threshold, p c = 1 k {\displaystyle p_{c}={\frac {1}{\langle k\rangle }}} . Above p c {\displaystyle p_{c}} there exists a giant component (largest cluster) of size, P inf {\displaystyle P_{\inf }} . P inf {\displaystyle P_{\inf }} fulfills, P inf = p ( 1 exp ( k P inf ) ) {\displaystyle P_{\inf }=p(1-\exp(-\langle k\rangle P_{\inf }))} . For p < p c {\displaystyle p<p_{c}} the solution of this equation is P inf = 0 {\displaystyle P_{\inf }=0} , i.e., there is no giant component.

At p c {\displaystyle p_{c}} , the distribution of cluster sizes behaves as a power law, n ( s ) {\displaystyle n(s)} ~ s 5 / 2 {\displaystyle s^{-5/2}} which is a feature of phase transition.

Alternatively, if one adds randomly selected edges one at a time, starting with an empty graph, then it is not until approximately n / 2 {\displaystyle n/2} edges have been added that the graph contains a large component, and soon after that the component becomes giant. More precisely, when t edges have been added, for values of t close to but larger than n / 2 {\displaystyle n/2} , the size of the giant component is approximately 4 t 2 n {\displaystyle 4t-2n} .[1] However, according to the coupon collector's problem, Θ ( n log n ) {\displaystyle \Theta (n\log n)} edges are needed in order to have high probability that the whole random graph is connected.

Graphs with arbitrary degree distributions

A similar sharp threshold between parameters that lead to graphs with all components small and parameters that lead to a giant component also occurs in random graphs with non-uniform degree distributions. The degree distribution does not define a graph uniquely. However under assumption that in all respects other than their degree distribution, the graphs are treated as entirely random, many results on finite/infinite-component sizes are known. In this model, the existence of the giant component depends only on the first two (mixed) moments of the degree distribution. Let a randomly chosen vertex have degree k {\displaystyle k} , then the giant component exists[3] if and only if E [ k 2 ] 2 E [ k ] > 0. {\displaystyle \mathbb {E} [k^{2}]-2\mathbb {E} [k]>0.} E [ k ] {\displaystyle \mathbb {E} [k]} which is also written in the form of k {\displaystyle {\langle k\rangle }} is the mean degree of the network. Similar expressions are also valid for directed graphs, in which case the degree distribution is two-dimensional.[4] There are three types of connected components in directed graphs. For a randomly chosen vertex:

  1. out-component is a set of vertices that can be reached by recursively following all out-edges forward;
  2. in-component is a set of vertices that can be reached by recursively following all in-edges backward;
  3. weak component is a set of vertices that can be reached by recursively following all edges regardless of their direction.

Criteria for giant component existence in directed and undirected configuration graphs

Let a randomly chosen vertex has k in {\displaystyle k_{\text{in}}} in-edges and k out {\displaystyle k_{\text{out}}} out edges. By definition, the average number of in- and out-edges coincides so that c = E [ k in ] = E [ k out ] {\displaystyle c=\mathbb {E} [k_{\text{in}}]=\mathbb {E} [k_{\text{out}}]} . If G 0 ( x ) = k P ( k ) x k {\displaystyle G_{0}(x)=\textstyle \sum _{k}\displaystyle P(k)x^{k}} is the generating function of the degree distribution P ( k ) {\displaystyle P(k)} for an undirected network, then G 1 ( x ) {\displaystyle G_{1}(x)} can be defined as G 1 ( x ) = k k k P ( k ) x k 1 {\displaystyle G_{1}(x)=\textstyle \sum _{k}\displaystyle {\frac {k}{\langle k\rangle }}P(k)x^{k-1}} . For directed networks, generating function assigned to the joint probability distribution P ( k i n , k o u t ) {\displaystyle P(k_{in},k_{out})} can be written with two valuables x {\displaystyle x} and y {\displaystyle y} as: G ( x , y ) = k i n , k o u t P ( k i n , k o u t ) x k i n y k o u t {\displaystyle {\mathcal {G}}(x,y)=\sum _{k_{in},k_{out}}\displaystyle P({k_{in},k_{out}})x^{k_{in}}y^{k_{out}}} , then one can define g ( x ) = 1 c G x | y = 1 {\displaystyle g(x)={\frac {1}{c}}{\partial {\mathcal {G}} \over \partial x}\vert _{y=1}} and f ( y ) = 1 c G y | x = 1 {\displaystyle f(y)={\frac {1}{c}}{\partial {\mathcal {G}} \over \partial y}\vert _{x=1}} . The criteria for giant component existence in directed and undirected random graphs are given in the table below:

Type Criteria
undirected: giant component E [ k 2 ] 2 E [ k ] > 0 {\displaystyle \mathbb {E} [k^{2}]-2\mathbb {E} [k]>0} ,[3] or G 1 ( 1 ) = 1 {\displaystyle G'_{1}(1)=1} [4]
directed: giant in/out-component E [ k in k o u t ] E [ k in ] > 0 {\displaystyle \mathbb {E} [k_{\text{in}}k_{out}]-\mathbb {E} [k_{\text{in}}]>0} ,[4] or g 1 ( 1 ) = f 1 ( 1 ) = 1 {\displaystyle g'_{1}(1)=f'_{1}(1)=1} [4]
directed: giant weak component 2 E [ k in ] E [ k in k out ] E [ k in ] E [ k out 2 ] E [ k in ] E [ k in 2 ] + E [ k in 2 ] E [ k out 2 ] E [ k in k out ] 2 > 0 {\displaystyle 2\mathbb {E} [k_{\text{in}}]\mathbb {E} [k_{\text{in}}k_{\text{out}}]-\mathbb {E} [k_{\text{in}}]\mathbb {E} [k_{\text{out}}^{2}]-\mathbb {E} [k_{\text{in}}]\mathbb {E} [k_{\text{in}}^{2}]+\mathbb {E} [k_{\text{in}}^{2}]\mathbb {E} [k_{\text{out}}^{2}]-\mathbb {E} [k_{\text{in}}k_{\text{out}}]^{2}>0} [5]

See also

References

  1. ^ a b c Bollobás, Béla (2001), "6. The Evolution of Random Graphs—The Giant Component", Random Graphs, Cambridge studies in advanced mathematics, vol. 73 (2nd ed.), Cambridge University Press, pp. 130–159, ISBN 978-0-521-79722-1.
  2. ^ Newman, M. E. J. (2010). Networks : an introduction. New York: Oxford University Press. OCLC 456837194.
  3. ^ a b Molloy, Michael; Reed, Bruce (1995). "A critical point for random graphs with a given degree sequence". Random Structures & Algorithms. 6 (2–3): 161–180. doi:10.1002/rsa.3240060204. ISSN 1042-9832.
  4. ^ a b c d Newman, M. E. J.; Strogatz, S. H.; Watts, D. J. (2001-07-24). "Random graphs with arbitrary degree distributions and their applications". Physical Review E. 64 (2): 026118. arXiv:cond-mat/0007235. Bibcode:2001PhRvE..64b6118N. doi:10.1103/physreve.64.026118. ISSN 1063-651X. PMID 11497662.
  5. ^ Kryven, Ivan (2016-07-27). "Emergence of the giant weak component in directed random graphs with arbitrary degree distributions". Physical Review E. 94 (1): 012315. arXiv:1607.03793. Bibcode:2016PhRvE..94a2315K. doi:10.1103/physreve.94.012315. ISSN 2470-0045. PMID 27575156. S2CID 206251373.