33 Markov Chains and Stochastic Matrices
Modeling random processes where future state depends only on the current state.
33.1 Definitions
Matrix \(\bP \in \fR^{n \times n}\) where:
\(P_{ij} \geq 0\).
\(\sum_i P_{ij} = 1\) (Columns sum to 1).
Property: \(\lambda=1\) is always an eigenvalue (dominant).
Probability distributions \(\bx^{(k)}\) evolving by \(\bx^{(k+1)} = \bP \bx^{(k)}\).
After \(k\) steps: \(\bx^{(k)} = \bP^k \bx^{(0)}\).
Long-run behavior is governed by the eigendecomposition of \(\bP\).
A probability vector \(\boldsymbol{\pi}\) such that \(\bP \boldsymbol{\pi} = \boldsymbol{\pi}\).
Right eigenvector of \(\bP\) for \(\lambda=1\).
Represents the equilibrium state of the chain.
33.2 Convergence and Mixing
If \(\bP\) is irreducible (all states reachable) and primitive (no cycles), then:
\(\lambda_1 = 1\) is simple and unique.
\(|\lambda_i| < 1\) for all other eigenvalues.
\(\lim_{k\to\infty} \bP^k \bx^{(0)} = \boldsymbol{\pi}\) for any initial state.
The gap \(\delta = 1 - |\lambda_2|\) controls convergence speed.
Mixing Time: \(k_{\varepsilon} \approx \log(1/\varepsilon) / \delta\).
Insight: Large gap = fast mixing. Connected graphs have larger gaps.
33.3 Properties and Design
A chain is reversible if \(\pi_j P_{ij} = \pi_i P_{ji}\) for all \(i, j\).
Implication: Reversible chains have real eigenvalues.
Symmetry: \(\bD_{\boldsymbol{\pi}}^{-1/2} \bP \bD_{\boldsymbol{\pi}}^{1/2}\) is symmetric.
Design Tool: Metropolis-Hastings. Construct a reversible chain for a desired \(\boldsymbol{\pi}\) by accepting moves with probability \(a = \min(1, \frac{\pi_{\text{new}} Q(\text{old}|\text{new})}{\pi_{\text{old}} Q(\text{new}|\text{old})})\).
33.4 Exercises
Construct \(\bP\) for a 2-state chain with transition probabilities \(\{0.1, 0.2\}\). Find \(\boldsymbol{\pi}\).
Prove that \(|\lambda| \leq 1\) for any stochastic matrix.
Simulate 100 steps of PageRank on a 5-node graph. Check convergence rate vs. \(\lambda_2\).
Find \(\boldsymbol{\pi}\) for a random walk on \(K_4\) (complete graph).
Let \(\bP\) be a row-stochastic matrix (\(\sum_j P_{ij} = 1\)) and let \(\bv\) be an eigenvector with eigenvalue \(\lambda\). Let \(v_k\) be the entry with the largest magnitude: \(|v_k| = \max_i |v_i|\). From \(\bP\bv = \lambda\bv\), the \(k\)-th component is: \[ \begin{align} \lambda v_k = \sum_j P_{kj} v_j. \end{align} \] Taking absolute values and using the triangle inequality: \[ \begin{align} |\lambda| |v_k| &\leq \sum_j P_{kj} |v_j| \\ &\leq \left( \sum_j P_{kj} \right) \max_i |v_i| \\ &= 1 \cdot |v_k|. \end{align} \] Since \(|v_k| > 0\), we conclude \(|\lambda| \leq 1\). The proof for column-stochastic matrices follows from \(\sigma(\bP) = \sigma(\bP^T)\).