30  Spectral Graph Theory and PageRank

Encoding graph topology into matrix spectrum for connectivity, clustering, and ranking.

30.1 Graph Matrices

NoteDefinition: Adjacency and Degree Matrices

For a graph \(G = (V, E)\) with \(n\) vertices:

  • Adjacency \(\bA\): \(A_{ij} = w_{ij}\) if \((i,j) \in E\), else \(0\).

  • Degree \(\bD\): Diagonal matrix with \(D_{ii} = d_i = \sum_{j} w_{ij}\).

NoteDefinition: Graph Laplacian

Defined as \(\bL = \bD - \bA\).

  • Discrete Derivative: \((\bL \boldsymbol{f})_i = \sum_{j \sim i} w_{ij} (f_i - f_j)\).

  • Quadratic Form: \(\bx^T \bL \bx = \sum_{(i,j) \in E} w_{ij} (x_i - x_j)^2\).

NoteTheorem: Spectral Properties

For a connected graph with \(0 = \lambda_1 \leq \lambda_2 \leq ... \leq \lambda_n\):

  1. \(\lambda_1 = 0\) with eigenvector \(\mathbf{1}\).

  2. Multiplicity of 0: Equals the number of connected components.

  3. Fiedler Value (\(\lambda_2\)): Measures algebraic connectivity. \(\lambda_2 > 0\) iff graph is connected.

NoteDefinition: Normalized Laplacians
  • Symmetric: \(\bL_{\text{sym}} = \bD^{-1/2} \bL \bD^{-1/2}\). (Eigenvalues in \([0, 2]\)).

  • Random Walk: \(\bL_{\text{rw}} = \bD^{-1} \bL = \bI - \bP\).

30.2 Spectral Clustering

NoteTheorem: Spectral Relaxation

Minimizing a graph cut (RatioCut) is NP-hard. Spectral methods relax this to an eigenvector problem:

  1. Compute \(k\) smallest eigenvectors of \(\bL\) (starting from \(\bv_2\)).

  2. Embed vertices in \(\fR^k\) using these eigenvectors.

  3. Cluster embedding via \(k\)-means.

30.3 Random Walks and PageRank

NoteDefinition: Random Walk Matrix

Transition matrix \(\bP = \bD^{-1} \bA\).

  • \(P_{ij} = w_{ij}/d_j\) is the probability of moving from \(j\) to \(i\).

  • Stationary Distribution: \(\boldsymbol{\pi}\) such that \(\bP\boldsymbol{\pi} = \boldsymbol{\pi}\). For undirected graphs, \(\pi_i = d_i / \sum d_j\).

NoteDefinition: PageRank and Google Matrix

Models a random surfer on a directed web graph with teleportation: \[ \begin{align} \mathbf{G} = d \bS + \frac{1-d}{N} \mathbf{J} \end{align} \]

  • \(d\) is the damping factor (standard: 0.85).

  • \(\bS\) is the link matrix (stochastic, handles dangling nodes).

  • \(\mathbf{J}\) is the all-ones matrix (teleportation).

Computation: PageRank vector \(\mathbf{r}\) is the dominant eigenvector of \(\mathbf{G}\), computed via power iteration.

30.4 Exercises

WarningExercise
  1. Construct \(\bL\) for a 3-vertex path graph. Find its eigenvalues by hand.

  2. Show that \(\bL \mathbf{1} = \bzero\) for any graph.

  3. Prove \(\bL\) is positive semi-definite using the quadratic form.

  4. Implement PageRank power iteration for a 10-node random graph.