Clustering Coefficient
Triadic Closure
Triadic closure: The tendency for people who share connections in a social network to become connected.
Local clustering coefficient of a node: Fraction of pairs of the node’s friends that are friends with each other. $$ \text{The local clustering coefficient of node C}: \frac{\operatorname{\#\,of\,pairs\,of } C^{\prime} \text{s friends who are friends} }{\operatorname{\#\,of\,pairs\,of } C^{\prime} \text{s friends}} $$
Global Clustering Coefficient: Measuring clustering on the whole network:
-
Approach 1: Average local clustering coefficient over all nodes in the graph.
-
Approach 2: Measuring clustering on the whole network: percentage of “open triads” that are triangles in a network. $$ \text { Transitivity }=\frac{3 \ast \text { Number of closed triads }}{\text { Number of open triads }} $$
Distance
The eccentricity of a node $n$ is the largest distance between n and all other nodes.
The radius of a graph is the minimum eccentricity.
The periphery of a graph is the set of nodes that have eccentricity equal to the diameter.
The center of a graph is the set of nodes that have eccentricity equal to the radius.
Connectivity
An undirected graph is connected if, for every pair nodes, there is a path between them.
A directed graph is strongly connected if, for every pair nodes $u$ and $v$, there is a directed path from $u$ to $v$ and a directed path from $v$ to $u$.
A directed graph is weakly connected if replacing all directed edges with undirected edges produces a connected undirected graph.
Robustness
Network robustness: the ability of a network to maintain its general structural properties when it faces failures or attacks.
Type of attacks: removal of nodes or edges.
Structural properties: connectivity.
Centrality
Centrality measures identify the most important nodes in a network:
Centrality Measures:
-
Degree centrality important nodes have many connections. $C_{d e g}(v)=\frac{d_v}{|N|-1}$
-
Closeness centrality important nodes are close to other nodes. $C_{\text {close }}(v)=\frac{|N|-1}{\sum_{u \in N \backslash\{v\}} d(v, u)}$
-
Betweenness centrality important nodes connect other nodes. $C_{b t w}(v)=\sum_{s, t \in N} \frac{\sigma_{s, t}(v)}{\sigma_{s, t}}$
-
Load centrality
-
Page Rank
-
Katz centrality
-
Percolation centrality
How to measure the closeness centrality of a node when it cannot reach all other nodes?
Consider only nodes that L can reach and normalize by the fraction of nodes L can reach: $C_{\text {close }}(L)=\left[\frac{|R(L)|}{|N-1|}\right] \frac{|R(L)|}{\sum_{u \in R(L)} d(L, u)}$
Preferential Attachment Model
- Start with two nodes connected by an edge.
- At each time step, add a new node with an edge connecting it to an existing node.
- Choose the node to connect to at random with probability proportional to each node’s degree.
- The probability of connecting to a node $u$ of degree $k_u$ is $k_u / \sum_j k_j$.
As the number of nodes increases, the degree distribution of the network under the preferential attachment model approaches the power law $P(k) = C k^{-3}$ with constant $C$.
Preferential Attachment in NetworkX
barabasi_albert_graph(n, m) returns a network with $n$ nodes. Each new node attaches to $m$ existing nodes according to the Preferential Attachment model.
|
|
Small-World Phenomenon
Social networks tend to have high clustering coefficient and small average path length.
Small-world model:
- Start with a ring of $n$ nodes, where each node is connected to its $k$ nearest neighbors.
- Fix a parameter $p \in [0, 1]$
- Consider each edge $u, v$. With probability $p$ , select a node $w$ at random and rewire the edge $(u, v)$ so it becomes $(u, w)$.
Small World Model in NetworkX
watts_strogatz_graph(n, k, p) returns a small world network with $n$ nodes, starting with a ring lattice with each node connected to its $k$ nearest neighbors, and rewiring probability $p$.
connected_watts_strogatz_graph(n, k, p, t) runs watts_strogatz_graph(n, k, p) up to $t$ times, until it returns a connected small world network.
|
|
Link Prediction
Given a pair of nodes, how to assess whether they are likely to connect?
Measure 1: Common Neighbors
Measure 2: Jaccard Coefficient
Number of common neighbors normalized by the total number of neighbors. $$ \operatorname{jacc\_coeff}(X, Y)=\frac{|N(X) \cap N(Y)|}{|N(X) \cup N(Y)|} $$ Measure 3: Resource Allocation $$ \operatorname{ res\_alloc }(X, Y)=\sum_{u \in N(X) \cap N(Y)} \frac{1}{|N(u)|} $$ Measure 4: Adamic-Adar Index $$ \operatorname{ adamic\_adar }(X, Y)=\sum_{u \in N(X) \cap N(Y)} \frac{1}{\log (|N(u)|)} $$ Measure 5: Pref. Attachment
In the preferential attachment model, nodes with high degree get more neighbors. $$ \operatorname{ pref\_attach }(X, Y)=|N(X)||N(Y)| $$ Measure 6: Community Common Neighbors
Number of common neighbors with bonus for neighbors in same community. $$ \begin{gathered} \operatorname{cn\_soundarajan\_hopcroft }(X, Y)=|N(X) \cap N(Y)|+\sum_{u \in N(X) \cap N(Y)} f(u)\\ \text{where } f(u)=\begin{cases}1, & u \text { in same comm. as } X \text { and } Y \\ 0, & \text {otherwise }\end{cases} \end{gathered} $$ Measure 7: Community Resource Allocation $$ \begin{gathered} \operatorname{ ra\_soundarajan\_hopcroft }(X, Y)=\sum_{u \in N(X) \cap N(Y)} \frac{f(u)}{|N(u)|} \\ \text { where } f(u)= \begin{cases} 1, & u \text { in same comm. as } X \text { and } Y \\ 0, & \text {otherwise } \end{cases} \end{gathered} $$