33  Markov Chains and Stochastic Matrices

Modeling random processes where future state depends only on the current state.

33.1 Definitions

NoteDefinition: Column-Stochastic Matrix

Matrix \(\bP \in \fR^{n \times n}\) where:

  1. \(P_{ij} \geq 0\).

  2. \(\sum_i P_{ij} = 1\) (Columns sum to 1).

Property: \(\lambda=1\) is always an eigenvalue (dominant).

NoteDefinition: Markov Chain

Probability distributions \(\bx^{(k)}\) evolving by \(\bx^{(k+1)} = \bP \bx^{(k)}\).

  • After \(k\) steps: \(\bx^{(k)} = \bP^k \bx^{(0)}\).

  • Long-run behavior is governed by the eigendecomposition of \(\bP\).

NoteDefinition: Stationary Distribution

A probability vector \(\boldsymbol{\pi}\) such that \(\bP \boldsymbol{\pi} = \boldsymbol{\pi}\).

  • Right eigenvector of \(\bP\) for \(\lambda=1\).

  • Represents the equilibrium state of the chain.

33.2 Convergence and Mixing

NoteTheorem: Perron-Frobenius for Chains

If \(\bP\) is irreducible (all states reachable) and primitive (no cycles), then:

  1. \(\lambda_1 = 1\) is simple and unique.

  2. \(|\lambda_i| < 1\) for all other eigenvalues.

  3. \(\lim_{k\to\infty} \bP^k \bx^{(0)} = \boldsymbol{\pi}\) for any initial state.

NoteDefinition: Spectral Gap and Mixing Time

The gap \(\delta = 1 - |\lambda_2|\) controls convergence speed.

  • Mixing Time: \(k_{\varepsilon} \approx \log(1/\varepsilon) / \delta\).

  • Insight: Large gap = fast mixing. Connected graphs have larger gaps.

33.3 Properties and Design

NoteDefinition: Detailed Balance (Reversibility)

A chain is reversible if \(\pi_j P_{ij} = \pi_i P_{ji}\) for all \(i, j\).

  • Implication: Reversible chains have real eigenvalues.

  • Symmetry: \(\bD_{\boldsymbol{\pi}}^{-1/2} \bP \bD_{\boldsymbol{\pi}}^{1/2}\) is symmetric.

Tip

Design Tool: Metropolis-Hastings. Construct a reversible chain for a desired \(\boldsymbol{\pi}\) by accepting moves with probability \(a = \min(1, \frac{\pi_{\text{new}} Q(\text{old}|\text{new})}{\pi_{\text{old}} Q(\text{new}|\text{old})})\).

33.4 Exercises

WarningExercise
  1. Construct \(\bP\) for a 2-state chain with transition probabilities \(\{0.1, 0.2\}\). Find \(\boldsymbol{\pi}\).

  2. Prove that \(|\lambda| \leq 1\) for any stochastic matrix.

  3. Simulate 100 steps of PageRank on a 5-node graph. Check convergence rate vs. \(\lambda_2\).

  4. Find \(\boldsymbol{\pi}\) for a random walk on \(K_4\) (complete graph).

Let \(\bP\) be a row-stochastic matrix (\(\sum_j P_{ij} = 1\)) and let \(\bv\) be an eigenvector with eigenvalue \(\lambda\). Let \(v_k\) be the entry with the largest magnitude: \(|v_k| = \max_i |v_i|\). From \(\bP\bv = \lambda\bv\), the \(k\)-th component is: \[ \begin{align} \lambda v_k = \sum_j P_{kj} v_j. \end{align} \] Taking absolute values and using the triangle inequality: \[ \begin{align} |\lambda| |v_k| &\leq \sum_j P_{kj} |v_j| \\ &\leq \left( \sum_j P_{kj} \right) \max_i |v_i| \\ &= 1 \cdot |v_k|. \end{align} \] Since \(|v_k| > 0\), we conclude \(|\lambda| \leq 1\). The proof for column-stochastic matrices follows from \(\sigma(\bP) = \sigma(\bP^T)\).