17 Eigenvalues and Eigenvectors

Eigenvalues and eigenvectors reveal the scaling directions of a linear transformation. They are essential for stability analysis, PCA, and spectral methods.

Eigenvectors are directions preserved by a linear map. Along an eigenvector, the matrix acts like multiplication by a scalar. When enough linearly independent eigenvectors exist, this reduces the matrix transformation to independent one-dimensional actions.

17.1 Definitions and Basic Properties

Definition: Eigenvalues and Eigenvectors

For \(\bA \in \fR^{n \times n}\), a scalar \(\lambda \in \fC\) is an eigenvalue if \(\exists\) nonzero \(\bv \in \fC^n\) such that: \[ \begin{align} \bA\bv = \lambda\bv. \end{align} \]

Spectrum \(\sigma(\bA)\): The set of all eigenvalues.
Invariance: Multiplying \(\bv\) by \(\bA\) only rescales it by \(\lambda\); the line spanned by \(\bv\) is mapped to itself. A negative real \(\lambda\) reverses the orientation along that line.
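As a quick numerical check of the definition, the following sketch calls NumPy's `np.linalg.eig` and verifies \(\bA\bv = \lambda\bv\) for each returned pair. The matrix is an arbitrary illustrative choice, not one taken from these notes.

```python
import numpy as np

# Illustrative matrix (an assumption of this sketch, not from the notes).
A = np.array([[2.0, 0.0],
              [1.0, 5.0]])

# eig returns the eigenvalues and a matrix whose columns are eigenvectors.
eigenvalues, V = np.linalg.eig(A)

# Residual ||A v - lambda v|| for every eigenpair; each should be near zero.
residuals = [np.linalg.norm(A @ V[:, i] - eigenvalues[i] * V[:, i])
             for i in range(A.shape[0])]
print(eigenvalues, residuals)
```

The eigenvectors come back normalized to unit length; any nonzero scalar multiple of a column is an equally valid eigenvector.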

Definition: Characteristic Polynomial

\(p_{\bA}(\lambda) = \det(\bA - \lambda \bI)\). Eigenvalues are the roots of \(p_{\bA}(\lambda) = 0\).

Remark

(Characteristic polynomial as theory) The characteristic polynomial is useful for proofs and small hand examples. Numerical algorithms do not compute eigenvalues by forming \(p_{\bA}\) and finding its roots, because the roots of a polynomial can be extremely sensitive to small perturbations of its coefficients.
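The sensitivity can be observed with a small experiment. The sketch below, under the assumption that a diagonal matrix with eigenvalues \(1, \dots, 20\) is a fair test case, compares two routes: forming the characteristic polynomial coefficients with `np.poly` and rooting them with `np.roots`, versus calling `np.linalg.eigvals` on the matrix directly. Note that `np.roots` itself solves a companion-matrix eigenvalue problem internally, so any loss of accuracy here comes from passing through the coefficients, not from the root finder.

```python
import numpy as np

# Diagonal matrix with known eigenvalues 1, 2, ..., 20 (illustrative choice).
true_eigs = np.arange(1.0, 21.0)
A = np.diag(true_eigs)

# Route 1: form the characteristic polynomial coefficients, then find roots.
coeffs = np.poly(A)                  # coefficients, highest degree first
roots = np.roots(coeffs)
err_poly = np.max(np.abs(np.sort(roots.real) - true_eigs))

# Route 2: apply a standard eigenvalue solver to the matrix directly.
eigs = np.linalg.eigvals(A)
err_eig = np.max(np.abs(np.sort(eigs.real) - true_eigs))

print(f"error via polynomial roots: {err_poly:.2e}")
print(f"error via eigvals:          {err_eig:.2e}")
```

The largest coefficients of this degree-20 polynomial exceed what double precision can represent exactly, so the rounding incurred when storing them already perturbs the roots.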

Exercise

Verify \(\bv = (1, 1)^T\) is an eigenvector of \(\bA = \begin{pmatrix}3&1\\1&3\end{pmatrix}\) and find \(\lambda\).
Why is the zero vector excluded from the definition of an eigenvector?

Theorem: Similar Matrices

If \(\bB = \bV^{-1}\bA\bV\) (Similarity Transform), then \(\sigma(\bB) = \sigma(\bA)\). Eigenvalues are invariant under change of basis.
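A numerical illustration of the theorem, as a sketch: the matrices below are random, and \(\bV\) is built as a small perturbation of the identity purely so that it is safely invertible (both choices are assumptions of this example).

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
V = np.eye(4) + 0.1 * rng.standard_normal((4, 4))   # well-conditioned change of basis

# B = V^{-1} A V, computed with a linear solve instead of an explicit inverse.
B = np.linalg.solve(V, A @ V)

eig_A = np.linalg.eigvals(A)
eig_B = np.linalg.eigvals(B)

# eigvals returns eigenvalues in no guaranteed order, so match each
# eigenvalue of A to its nearest eigenvalue of B.
mismatch = max(np.min(np.abs(lam - eig_B)) for lam in eig_A)
print(mismatch)
```

The mismatch should be at the level of rounding error. An ill-conditioned \(\bV\) would still preserve the spectrum in exact arithmetic, but the computed eigenvalues of \(\bB\) could then be noticeably less accurate.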

Definition: Diagonalization

A matrix \(\bA\) is diagonalizable if there exists an invertible matrix \(\bV\) such that \[ \begin{align} \bA=\bV\bD\bV^{-1}, \end{align} \] where \(\bD\) is diagonal. The columns of \(\bV\) are eigenvectors and the diagonal entries of \(\bD\) are eigenvalues.
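The factorization can be checked numerically. This sketch uses an illustrative \(2 \times 2\) matrix with distinct eigenvalues \(2\) and \(5\) (chosen for this example only), rebuilds \(\bA\) from \(\bV\) and \(\bD\), and shows that a matrix power reduces to powers of the diagonal entries.

```python
import numpy as np

# Illustrative matrix with distinct eigenvalues 2 and 5.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

eigenvalues, V = np.linalg.eig(A)
D = np.diag(eigenvalues)
V_inv = np.linalg.inv(V)

# Reconstruct A from its eigendecomposition A = V D V^{-1}.
A_rebuilt = V @ D @ V_inv

# Powers act independently on each eigen-direction: A^5 = V D^5 V^{-1}.
A_pow5 = V @ np.diag(eigenvalues**5) @ V_inv

print(np.allclose(A_rebuilt, A))
print(np.allclose(A_pow5, np.linalg.matrix_power(A, 5)))
```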

Remark

(Multiplicity and defectiveness) Repeated eigenvalues do not guarantee enough eigenvectors. A matrix can have an eigenvalue of algebraic multiplicity \(2\) but only one independent eigenvector; such a matrix is defective and cannot be diagonalized.

Example

(Defective matrix) \[ \begin{align} \bA= \begin{pmatrix} 1&1\\ 0&1 \end{pmatrix} \end{align} \] has characteristic polynomial \((1-\lambda)^2\). Solving \((\bA-\bI)\bv=\bzero\) gives only one independent eigenvector, so \(\bA\) is not diagonalizable.
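The shortage of eigenvectors in this example can be confirmed numerically. The sketch below counts independent eigenvectors as \(n - \text{rank}(\bA - \bI)\) and then inspects the eigenvector matrix that `np.linalg.eig` returns; using the condition number as a defectiveness indicator is a heuristic of this example, not a rigorous test.

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
n = A.shape[0]

# Geometric multiplicity of lambda = 1 is dim null(A - I) = n - rank(A - I).
geometric_multiplicity = n - np.linalg.matrix_rank(A - np.eye(n))
print(geometric_multiplicity)   # 1, although the algebraic multiplicity is 2

# eig still returns two columns, but they are numerically parallel, so the
# eigenvector matrix is singular or extremely ill-conditioned.
_, V = np.linalg.eig(A)
cond_V = np.linalg.cond(V)
print(cond_V)
```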

Theorem: Key Properties

Trace: \(\sum_i \lambda_i = \text{tr}(\bA)\), with eigenvalues counted by algebraic multiplicity.
Determinant: \(\prod_i \lambda_i = \det(\bA)\), with the same convention.
Singularity: \(0 \in \sigma(\bA) \iff \bA\) is singular.
Inversion: If \(\bA\) is invertible, eigenvalues of \(\bA^{-1}\) are \(1/\lambda_i\).
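These identities are easy to sanity-check. The sketch below uses a random \(5 \times 5\) matrix shifted by a multiple of the identity so that it is comfortably invertible (an assumption made only to keep the inverse well-conditioned).

```python
import numpy as np

rng = np.random.default_rng(1)
A = 5.0 * np.eye(5) + 0.5 * rng.standard_normal((5, 5))

lam = np.linalg.eigvals(A)     # may be complex for a real nonsymmetric matrix

# Sum and product of the eigenvalues versus trace and determinant.
trace_gap = abs(lam.sum() - np.trace(A))
det_gap = abs(lam.prod() - np.linalg.det(A))

# Eigenvalues of the inverse should be the reciprocals 1 / lambda_i.
lam_inv = np.linalg.eigvals(np.linalg.inv(A))
inv_gap = max(np.min(np.abs(1.0 / l - lam_inv)) for l in lam)

print(trace_gap, det_gap, inv_gap)
```

Complex eigenvalues of a real matrix occur in conjugate pairs, so their sum and product are real up to rounding.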

17.2 Symmetric and Hermitian Matrices

Symmetric and Hermitian matrices possess special spectral properties that simplify analysis and computation.

The symmetric case is the cleanest eigenvalue theory. Orthogonal eigenvectors mean that diagonalization is also a numerically safe change of basis: \(\bQ^{-1}=\bQ^T\), so the transformation does not amplify Euclidean errors.

Theorem: Spectral Theorem (Real Symmetric)

If \(\bA = \bA^T\):

All eigenvalues \(\lambda_i\) are real.
Eigenvectors for distinct \(\lambda\) are orthogonal.
\(\bA\) is orthogonally diagonalizable: \(\bA = \bQ\bD\bQ^T\).
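The three statements can be observed with NumPy's symmetric solver `np.linalg.eigh`. This is a sketch on a random symmetric matrix (the matrix and seed are arbitrary choices for illustration).

```python
import numpy as np

rng = np.random.default_rng(2)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                      # symmetrize so that A = A^T

# eigh is intended for symmetric/Hermitian input: eigenvalues are returned
# as real numbers in ascending order, eigenvectors as orthonormal columns of Q.
eigenvalues, Q = np.linalg.eigh(A)

orthogonality_gap = np.linalg.norm(Q.T @ Q - np.eye(4))
reconstruction_gap = np.linalg.norm(Q @ np.diag(eigenvalues) @ Q.T - A)
print(eigenvalues, orthogonality_gap, reconstruction_gap)
```

Calling the general-purpose `np.linalg.eig` on a symmetric matrix also works, but it does not exploit symmetry and does not promise orthonormal eigenvectors.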

Definition: Hermitian Matrix

\(\bA = \bA^*\) (conjugate transpose). The complex analog of symmetric matrices.

Spectral Theorem (Hermitian): Unitarily diagonalizable (\(\bA = \bQ\bD\bQ^*\)) with real eigenvalues.
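The complex case looks the same in code, with the transpose replaced by the conjugate transpose. A minimal sketch on a random Hermitian matrix (again an arbitrary illustrative choice):

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
A = (M + M.conj().T) / 2               # Hermitian: A equals its conjugate transpose

# Q is complex with orthonormal columns; the eigenvalues are real floats.
eigenvalues, Q = np.linalg.eigh(A)

unitary_gap = np.linalg.norm(Q.conj().T @ Q - np.eye(3))
reconstruction_gap = np.linalg.norm(Q @ np.diag(eigenvalues) @ Q.conj().T - A)
print(eigenvalues, unitary_gap, reconstruction_gap)
```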

Exercise

Prove eigenvalues of \(\bA^2\) are \(\lambda_i^2\).
Diagonalize \(\bA = \begin{pmatrix}4&1\\1&4\end{pmatrix}\) by finding an orthonormal eigenbasis.

17.3 Iterative Algorithms

In practice, we compute eigenvalues via iteration, not by finding roots of the characteristic polynomial.

Most large eigenvalue computations do not compute the full spectrum. They target a few eigenvalues: the largest, smallest, or those closest to a shift. Matrix-vector products and linear solves are the basic primitives.

Definition: Power Iteration

Repeatedly apply \(\bA\) to a unit vector: \(\bv^{(k)} = \frac{\bA\bv^{(k-1)}}{\|\bA\bv^{(k-1)}\|}\).

Convergence: \(\bv^{(k)} \to\) dominant eigenvector (for \(|\lambda_1| > |\lambda_2|\)).
Rate: \(O(|\lambda_2/\lambda_1|^k)\).
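A minimal sketch of the iteration, assuming a fixed number of steps instead of a convergence test. The function name `power_iteration`, the test matrix, and the starting vector are all choices made for this example.

```python
import numpy as np

def power_iteration(A, v0, num_iters=200):
    """Estimate the dominant eigenpair of A (sketch: fixed step count)."""
    v = v0 / np.linalg.norm(v0)
    for _ in range(num_iters):
        w = A @ v
        v = w / np.linalg.norm(w)      # normalize to avoid overflow/underflow
    eigenvalue = v @ A @ v             # Rayleigh quotient of the unit vector v
    return eigenvalue, v

# Illustrative symmetric matrix with a well-separated dominant eigenvalue.
A = np.array([[4.0, 1.0, 2.0],
              [1.0, 3.0, 0.0],
              [2.0, 0.0, 1.0]])

lam, v = power_iteration(A, np.ones(3))
residual = np.linalg.norm(A @ v - lam * v)
print(lam, residual)
```

In practice one would stop when the residual \(\|\bA\bv - \lambda\bv\|\) drops below a tolerance. The only operation involving \(\bA\) is a matrix-vector product, which is why the method suits large sparse matrices.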

Remark

(Power iteration failure modes) Power iteration can fail or converge slowly if \(|\lambda_1|\approx|\lambda_2|\), if the starting vector has a tiny component in the dominant eigenvector direction, or if the dominant eigenvalue is not unique in magnitude.

Proof

Assume \(\bA\) is diagonalizable with eigenvectors \(\{\bu_1, \dots, \bu_n\}\) and eigenvalues ordered so that \(|\lambda_1| > |\lambda_2| \geq \dots \geq |\lambda_n|\). Let \(\bv^{(0)} = \sum_{i=1}^n c_i \bu_i\) with \(c_1 \neq 0\). Applying \(\bA\) \(k\) times: \[ \begin{align} \bA^k \bv^{(0)} &= \sum_{i=1}^n c_i \lambda_i^k \bu_i \\ &= c_1 \lambda_1^k \left( \bu_1 + \sum_{i=2}^n \frac{c_i}{c_1} \left(\frac{\lambda_i}{\lambda_1}\right)^k \bu_i \right). \end{align} \] Since \(|\lambda_i/\lambda_1| < 1\) for all \(i > 1\), the summation term decays to zero as \(k \to \infty\). Thus the direction of \(\bA^k \bv^{(0)}\) aligns with \(\bu_1\). The normalized iterate \(\bv^{(k)}\) is a scalar multiple of \(\bA^k \bv^{(0)}\), so the normalization at each step prevents overflow while preserving this alignment.

Definition: Inverse Iteration

Apply power iteration to \((\bA - \sigma\bI)^{-1}\).

Use: Finds the eigenvalue closest to the shift \(\sigma\).
Efficiency: Factorize \((\bA - \sigma\bI)\) once (\(O(n^3)\) for a dense matrix), then reuse the factors for an \(O(n^2)\) solve at each step.

Remark

(Shifted inverse iteration) If \(\lambda_i\) is an eigenvalue of \(\bA\), then \((\lambda_i-\sigma)^{-1}\) is an eigenvalue of \((\bA-\sigma\bI)^{-1}\). The eigenvalue closest to \(\sigma\) becomes largest in magnitude after inversion.
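A sketch of shifted inverse iteration under simplifying assumptions: a fixed step count, and `np.linalg.solve` called at every step for brevity even though it refactorizes each time. The function name, matrix, and shift value are illustrative choices.

```python
import numpy as np

def inverse_iteration(A, sigma, v0, num_iters=50):
    """Estimate the eigenpair of A whose eigenvalue is closest to sigma."""
    n = A.shape[0]
    shifted = A - sigma * np.eye(n)
    v = v0 / np.linalg.norm(v0)
    for _ in range(num_iters):
        # Solve (A - sigma I) w = v instead of forming the inverse.
        # Simplification: this refactorizes on every call; a careful
        # implementation would factorize `shifted` once and reuse it.
        w = np.linalg.solve(shifted, v)
        v = w / np.linalg.norm(w)
    return v @ A @ v, v                # Rayleigh quotient and unit eigenvector

# Illustrative symmetric matrix; the shift 2.5 targets its middle eigenvalue.
A = np.array([[4.0, 1.0, 2.0],
              [1.0, 3.0, 0.0],
              [2.0, 0.0, 1.0]])

lam, v = inverse_iteration(A, sigma=2.5, v0=np.ones(3))
print(lam)
```

The closer the shift is to the target eigenvalue relative to the others, the faster the iteration converges, because the convergence ratio is the ratio of the two smallest values of \(|\lambda_i - \sigma|\).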

Definition: Rayleigh Quotient Iteration (RQI)

Run inverse iteration, but update the shift \(\sigma_k\) at each step to the current Rayleigh quotient \((\bv^{(k)})^T\bA\bv^{(k)}\), where \(\bv^{(k)}\) has unit norm.

Convergence: Locally cubic for symmetric matrices (the number of correct digits roughly triples each step).
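The following is a minimal sketch of RQI for a real symmetric matrix. The function name, the fixed test matrix, the tolerance, and the handling of an exactly singular shifted matrix are assumptions of this example. It records the residual at each step so the convergence rate can be inspected.

```python
import numpy as np

def rayleigh_quotient_iteration(A, v0, tol=1e-12, max_iter=20):
    """RQI sketch for symmetric A: returns (eigenvalue, eigenvector, residuals)."""
    n = A.shape[0]
    v = v0 / np.linalg.norm(v0)
    sigma = v @ A @ v                          # initial Rayleigh quotient
    history = []
    for _ in range(max_iter):
        residual = np.linalg.norm(A @ v - sigma * v)
        history.append(residual)
        if residual < tol:
            break
        try:
            # The shifted matrix becomes nearly singular as sigma converges;
            # this is expected and the computed direction remains useful.
            w = np.linalg.solve(A - sigma * np.eye(n), v)
        except np.linalg.LinAlgError:
            break                              # shift is numerically an eigenvalue
        v = w / np.linalg.norm(w)
        sigma = v @ A @ v                      # update shift to new Rayleigh quotient
    return sigma, v, history

# Illustrative symmetric matrix (not from the notes).
A = np.array([[4.0, 1.0, 2.0],
              [1.0, 3.0, 0.0],
              [2.0, 0.0, 1.0]])

lam, v, history = rayleigh_quotient_iteration(A, np.ones(3))
print(lam)
print(history)
```

Which eigenvalue the iteration converges to depends on the starting vector. Unlike fixed-shift inverse iteration, the shifted matrix changes every step, so each step needs a fresh factorization.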

Definition: Eigenvalue Method Selection

Exercise

Implement power iteration to find \(\lambda_{\max}\) of a tridiagonal matrix.
Use RQI on a random \(4 \times 4\) symmetric matrix. Observe the rate of convergence.

17.4 Schur Decomposition and the QR Algorithm

The Schur decomposition provides the theoretical basis for the QR algorithm, the standard iterative method for computing the eigenvalues and eigenvectors of dense, general matrices.

Diagonalization may fail for defective matrices, but a Schur decomposition always exists over \(\fC\). This is why practical dense eigensolvers target Schur form rather than diagonal form.

Theorem: Schur Decomposition

Every square \(\bA\) satisfies \(\bA = \bQ\bT\bQ^*\), where \(\bQ\) is unitary and \(\bT\) is upper triangular.

Schur Vectors: Columns of \(\bQ\) provide an orthonormal basis for nested invariant subspaces.

Definition: QR Iteration

Iteratively compute QR factorizations and swap the factors. The goal is to transform \(\bA\) into an upper triangular (Schur) form, where the eigenvalues appear on the diagonal.

\(\bA_k = \bQ_k \bR_k\)
\(\bA_{k+1} = \bR_k \bQ_k = \bQ_k^T \bA_k \bQ_k\)

Convergence: Under suitable conditions (for example, when the eigenvalues have distinct magnitudes), \(\bA_k\) converges to the Schur form \(\bT\).
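The two-line recurrence can be sketched directly with `np.linalg.qr`. This is the unshifted textbook iteration with a fixed step count, shown on an illustrative symmetric matrix whose eigenvalues \(3 - \sqrt{3}\), \(3\), and \(3 + \sqrt{3}\) have distinct magnitudes; it is not how library eigensolvers are implemented.

```python
import numpy as np

def qr_iteration(A, num_steps=100):
    """Unshifted QR iteration: factor A_k = Q_k R_k, then set A_{k+1} = R_k Q_k."""
    Ak = np.array(A, dtype=float)
    for _ in range(num_steps):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q                     # equals Q^T Ak Q, so the spectrum is unchanged
    return Ak

# Illustrative symmetric matrix (not from the notes).
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])

T = qr_iteration(A)
lower_norm = np.linalg.norm(np.tril(T, -1))   # size of the strictly lower triangle
print(np.diag(T))                             # approximate eigenvalues
print(lower_norm)
```

For a symmetric input the limit is diagonal, not merely triangular, because every iterate stays symmetric. The subdiagonal entries shrink at a rate governed by ratios of neighbouring eigenvalue magnitudes, which is what shifts are designed to improve.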

Remark

(QR iteration performance) Raw QR iteration requires \(O(n^3)\) operations per step. Practical implementations, such as those in standard numerical libraries, typically use:

Hessenberg Reduction: Pre-process \(\bA\) to upper Hessenberg form in \(O(n^3)\) operations.
Triangularization: Subsequent QR steps on a Hessenberg matrix cost only \(O(n^2)\).
Shifts: Techniques to accelerate convergence to quadratic or cubic rates.

Exercise

For \(\bA = \begin{pmatrix}3&1&0\\0&2&1\\0&0&2\end{pmatrix}\), identify the Schur vectors and eigenvalues by inspection.
Implement 50 steps of basic QR iteration on a random \(3 \times 3\) symmetric matrix. Check the off-diagonals.