Network Properties and Random Graph Model
I. Key Network Properties
Problem: How to measure a network?
Answer: Using Network Properties.
1. Degree Distribution P(k)
P(k) is the probability that a random chosen node has degree k.
is number of nodes with degree k. Thus.
2. Path p, Average Distance and Diameter
A path p is a sequence of nodes in which each node is linked to the next one.
- Shortest Path Distance: the number of edges along the shortest path connecting two nodes.
- Diameter: the maximum distance between any pair of nodes in a graph. That is,
- Average path length for a connected graph or a SCC graph is defined as: . In practice, we compute the only over the connected pairs of nodes.
3. Cluster Coefficient c
A vertex’scluster coefficient c measures how a vertex’s neighbors are connected to each other. Assume as the number of edges between neighbors of vertex , and as the vertex degree.
The cluster coefficient can be calculated as
Globally, the Average clustering coefficient is .
4. Connectivity
To show the connectivity of graph, one can calculate the size of the largest connected component in graph, BTW, largest component is also known as giant component.
II. Measure Real-world Networks
In this part, we use MSN network as an exaple to show some properties of real-world networks.
1. Degree Distribution
Power law distribution: , where is a parameter whose value is typically between 2 and 3. The graph degree distribution is heavily skewed.
2. Clustering Coefficient
Average Clustering Coefficient of Real Graph can be really big(0.1140) compared to the random graph.
3. Connected Components
Nearly all of the vertices are in one largest(giant) connected component.
4. Diameter
Average path length is 6.6. Besides, 90% of the nodes can be reached in <8 hops.
5. Small World Effect(Six Degrees of Separation), 1967
A small-world network is a type of mathematical graph where most nodes are not neighbors of one another, but most nodes can be reached from every other by a small number of hops or steps.
In mathematical format, assuming L is the distance between two randomly chosen nodes, and N is the number of nodes in a network, then we have .
III. Graph Generation Model
There are four kinds of Graph Generation Model.
1. Random Graph Model(Erdos-Renyi Graph)
(1) Generation:
- : undirected graph on n nodes where each edge (u,v) appears i.i.d with probability p.
- : undirected graph with n nodes, and m edges picked uniformly at random.
(2) Degree Distribution P(k)
- Binomial Distribution: .
- .
(3) Clustering Coefficient
Thus,
And , so decreases with the graph size n.
Note: the if .
(4) Path Length
Randomly pick a node , and it will have:
- points whose distance is 1
- points whose distance is 2
- points whose distance is 3
- points whose distance is
At the same time, the number of vertices is . It means that: .
So .
In E-R Random Graph, dmax increase slowly with N.
(5) Giant Component
When p(n-1) = 1, the Giant Connected Component emerges.
When k=ln N, the fully connected graph emerges.
(6) Problems
- Degree distribution differs from that of real-world graph
- Giant component in most real networks does NOT emerge through a phase transition
- No local structure – Clustering Coefficient is too low.
Conclusion: Real-world network is not random!
2. Small-world Model[Watts-Strogatz ’98]
Problem: E-R random graph’s clustering is low! Need: High cluster and low diameter.
- Start with a low-dimensional regular lattice
- Has high clustering coefficient
- Rewire: Introduce randomness
- Add / remove edges to create shortcuts to join remote parts of the lattice.
- For each edge, with probability p move the other endpoint to a random node.
The more probability of rewiring p, the smaller clustering coefficient will be.
3. Barabasi-Albert(BA) Model
Problem: How to model the power-law distribution of node degree?
Solution: Introduce Growth and rich-get-richer.
(1) Assumptions
- Growth: the graphs grows continuously
- Preferential attachment(i.e., rich-get-richer): nodes with larger connectivity tend to receive new edges
(2) Model Definition
- Start with a small graph of vertices generated randomly.
- At each step, add a new vertex with edges connecting to distinct vertices already present in the graph. For each connection, the selection of the existing vertex is governed by the following equation..
(3)Properties
- The degree distribution of a BA graph follows power law distribution of γ=3.
- The diameter of BA model is .
4. Kronecker Graph Model
(1) Recursive Graph Generation
Problem: How can we think of network structure recursively? Self-similarity
- Object is similar to a part of itself: the whole has the same shape as one or more of the parts.
(2) Kronecker Graph Generation
(3) Stochastic Kronecker Graphs
Conclusion: Kronecker is similar to real-world graph.
- 本文作者: Zhang Xinmiao
- 本文链接: https://recoderchris.github.io/2021/02/04/图数据管理笔记2/
- 版权声明: 本博客所有文章除特别声明外,均采用 MIT 许可协议。转载请注明出处!