17 Eigenvalues and Eigenvectors
Reveals the scaling directions of a linear transformation. Essential for stability analysis, PCA, and spectral methods.
17.1 Definitions and Basic Properties
For \(\bA \in \fR^{n \times n}\), a scalar \(\lambda \in \fC\) is an eigenvalue if \(\exists\) nonzero \(\bv \in \fC^n\) such that: \[ \begin{align} \bA\bv = \lambda\bv. \end{align} \]
Spectrum \(\sigma(\bA)\): The set of all eigenvalues.
Invariance: Multiplying \(\bv\) by \(\bA\) only scales its length; its direction remains fixed.
\(p_{\bA}(\lambda) = \det(\bA - \lambda \bI)\). Eigenvalues are the roots of \(p_{\bA}(\lambda) = 0\).
Verify \(\bv = (1, 1)^T\) is an eigenvector of \(\bA = \begin{pmatrix}3&1\\1&3\end{pmatrix}\) and find \(\lambda\).
Why is the zero vector excluded from the definition of an eigenvector?
If \(\bB = \bV^{-1}\bA\bV\) (Similarity Transform), then \(\sigma(\bB) = \sigma(\bA)\). Insight: Eigenvalues are invariant under change of basis.
Trace: \(\sum \lambda_i = \text{tr}(\bA)\).
Determinant: \(\prod \lambda_i = \det(\bA)\).
Singularity: \(0 \in \sigma(\bA) \iff \bA\) is singular.
Inversion: If \(\bA\) is invertible, eigenvalues of \(\bA^{-1}\) are \(1/\lambda_i\).
17.2 Symmetric and Hermitian Matrices
Symmetric and Hermitian matrices possess special spectral properties that simplify analysis and computation.
If \(\bA = \bA^T\):
All eigenvalues \(\lambda_i\) are real.
Eigenvectors for distinct \(\lambda\) are orthogonal.
\(\bA\) is orthogonally diagonalizable: \(\bA = \bQ\bD\bQ^T\).
\(\bA = \bA^*\) (conjugate transpose). The complex analog of symmetric matrices.
- Spectral Theorem (Hermitian): Unitarily diagonalizable (\(\bA = \bQ\bD\bQ^*\)) with real eigenvalues.
Prove eigenvalues of \(\bA^2\) are \(\lambda_i^2\).
Diagonalize \(\bA = \begin{pmatrix}4&1\\1&4\end{pmatrix}\) by finding an orthonormal eigenbasis.
17.3 Iterative Algorithms
In practice, we compute eigenvalues via iteration, not by finding roots of the characteristic polynomial.
Repeatedly apply \(\bA\) to a unit vector: \(\bv^{(k)} = \frac{\bA\bv^{(k-1)}}{\|\bA\bv^{(k-1)}\|}\).
Convergence: \(\bv^{(k)} \to\) dominant eigenvector (for \(|\lambda_1| > |\lambda_2|\)).
Rate: \(O(|\lambda_2/\lambda_1|^k)\).
Assume \(\bA\) is diagonalizable with eigenvectors \(\{\bu_1, ..., \bu_n\}\) and eigenvalues \(|\lambda_1| > |\lambda_2| \geq ...\). Let \(\bx^{(0)} = \sum_{i=1}^n c_i \bu_i\) with \(c_1 \neq 0\). Applying \(\bA\) \(k\) times: \[ \begin{align} \bA^k \bx^{(0)} &= \sum_{i=1}^n c_i \lambda_i^k \bu_i \\ &= c_1 \lambda_1^k \left( \bu_1 + \sum_{i=2}^n \frac{c_i}{c_1} \left(\frac{\lambda_i}{\lambda_1}\right)^k \bu_i \right). \end{align} \] Since \(|\lambda_i/\lambda_1| < 1\) for all \(i > 1\), the summation term decays to zero as \(k \to \infty\). Thus \(\bA^k \bx^{(0)}\) aligns with \(\bu_1\). The normalization at each step prevents overflow while preserving this alignment.
Apply power iteration to \((\bA - \sigma\bI)^{-1}\).
Use: Finds the eigenvalue closest to the shift \(\sigma\).
Efficiency: Factorize \((\bA - \sigma\bI)\) once, then solve at each step (\(O(n^2)\)).
Update the shift \(\sigma_k\) to the current Rayleigh quotient \((\bv^{(k)})^T\bA\bv^{(k)}\) at each step.
- Convergence: Cubic for symmetric matrices (digits triple each step).
Implement power iteration to find \(\lambda_{\max}\) of a tridiagonal matrix.
Use RQI on a random \(4 \times 4\) symmetric matrix. Observe the rate of convergence.
17.4 Schur Decomposition and the QR Algorithm
The Schur decomposition provides the theoretical basis for the QR algorithm, the standard iterative method for computing the eigenvalues and eigenvectors of dense, general matrices.
Every square \(\bA\) satisfies \(\bA = \bQ\bT\bQ^*\), where \(\bQ\) is unitary and \(\bT\) is upper triangular.
- Schur Vectors: Columns of \(\bQ\) provide an orthonormal basis for nested invariant subspaces.
Iteratively compute QR factorizations and swap the factors. The goal is to transform \(\bA\) into an upper triangular (Schur) form, where the eigenvalues appear on the diagonal.
\(\bA_k = \bQ_k \bR_k\)
\(\bA_{k+1} = \bR_k \bQ_k = \bQ_k^T \bA_k \bQ_k\)
Convergence: \(\bA_k\) converges to the Schur form \(\bT\).
Performance: Raw QR iteration requires \(O(n^3)\) operations per step. Practical implementations, such as those in standard numerical libraries, typically use:
Hessenberg Reduction: Pre-process \(\bA\) to upper Hessenberg form in \(O(n^3)\) operations.
Triangularization: Subsequent QR steps on a Hessenberg matrix cost only \(O(n^2)\).
Shifts: Techniques to accelerate convergence to quadratic or cubic rates.
For \(\bA = \begin{pmatrix}3&1&0\\0&2&1\\0&0&2\end{pmatrix}\), identify the Schur vectors and eigenvalues by inspection.
Implement 50 steps of basic QR iteration on a random \(3 \times 3\) symmetric matrix. Check the off-diagonals.