Eigenvalue Algorithms

Eigenvalue problems arise throughout scientific computing: structural vibration analysis, quantum chemistry, graph clustering, principal component analysis, and more. The choice of algorithm depends on two primary factors: whether the full decomposition is needed (all \(n\) eigenpairs) or only a few selected eigenvalues, and whether the matrix is dense or large and sparse.

  • Full decomposition — Dense symmetric matrices of moderate size ( \(n \lesssim 2000\)): use Cyclic Jacobi (num::eig_sym) or LAPACK dsyev.
  • Selected eigenvalues near a target — Use inverse or Rayleigh quotient iteration (num::inverse_iteration, num::rayleigh_iteration).
  • A few extreme eigenvalues of a large sparse matrix — Use Lanczos (num::lanczos).

Cyclic Jacobi Eigendecomposition

The Jacobi method computes the full spectral decomposition of a real symmetric matrix \(A = V\Lambda V^T\) by repeatedly applying Givens (plane) rotations that drive all off-diagonal entries to zero.

Similarity Transform

A Givens rotation \(G(p,q,\theta)\) is the identity with a \(2\times 2\) rotation block at rows/columns \(p\) and \(q\). The similarity transform

\[ A' = G^T A G \]

modifies only rows and columns \(p\) and \(q\). Setting \(A'_{pq} = 0\) requires

\[ (c^2 - s^2) A_{pq} + cs(A_{pp} - A_{qq}) = 0. \]

Rotation Parameters

Let \(t = \tan\theta\) and define

\[ \tau = \frac{A_{qq} - A_{pp}}{2 A_{pq}}. \]

Then \(t\) satisfies \(t^2 + 2\tau t - 1 = 0\), and taking the smaller root \(|t| \leq 1\) to minimize perturbation to other entries:

\[ t = \frac{\operatorname{sign}(\tau)}{|\tau| + \sqrt{1 + \tau^2}}, \qquad c = \frac{1}{\sqrt{1+t^2}}, \qquad s = ct. \]

When \(\tau \gg 1\) this gives \(t \approx 1/(2\tau) \approx 0\), avoiding catastrophic cancellation.
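These formulas translate directly into code. Below is a short pure-Python sketch (the function name is ours, not part of the num API) that computes \((c, s)\) and verifies the zeroing condition from the previous section:

```python
import math

def jacobi_rotation(app, aqq, apq):
    """Stable (c, s) for a Jacobi rotation zeroing the (p, q) entry.

    Implements tau = (A_qq - A_pp) / (2 A_pq),
    t = sign(tau) / (|tau| + sqrt(1 + tau^2)), c = 1/sqrt(1 + t^2), s = c t.
    """
    if apq == 0.0:
        return 1.0, 0.0                      # already zero: identity rotation
    tau = (aqq - app) / (2.0 * apq)
    sign = 1.0 if tau >= 0.0 else -1.0
    t = sign / (abs(tau) + math.sqrt(1.0 + tau * tau))
    c = 1.0 / math.sqrt(1.0 + t * t)
    s = c * t
    return c, s

# Check the zeroing condition (c^2 - s^2) A_pq + c s (A_pp - A_qq) = 0
app, aqq, apq = 2.0, 1.0, 0.5
c, s = jacobi_rotation(app, aqq, apq)
residual = (c * c - s * s) * apq + c * s * (app - aqq)
print(abs(residual) < 1e-12)   # True
```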

Diagonal and Off-Diagonal Update Formulas

After the rotation the updated diagonal entries are

\[ A'_{pp} = c^2 A_{pp} - 2cs\,A_{pq} + s^2 A_{qq}, \qquad A'_{qq} = s^2 A_{pp} + 2cs\,A_{pq} + c^2 A_{qq}, \]

and the off-diagonal row/column updates for \(r \neq p,q\) are

\[ A'_{rp} = c\,A_{rp} - s\,A_{rq}, \qquad A'_{rq} = s\,A_{rp} + c\,A_{rq}. \]

Eigenvectors are accumulated by applying the same rotation to the columns of \(V\).
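Combining the rotation parameters, the update formulas, and eigenvector accumulation gives a complete cyclic sweep. The following pure-Python sketch is illustrative only; it mirrors what num::eig_sym does conceptually, not its actual implementation:

```python
import math

def eig_sym_jacobi(A, sweeps=30, tol=1e-12):
    """Cyclic Jacobi sketch: returns (eigenvalues, V) with A = V diag(w) V^T.

    A is a symmetric list-of-lists matrix; small-n illustration only.
    """
    n = len(A)
    A = [row[:] for row in A]                          # work on a copy
    V = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = math.sqrt(2.0 * sum(A[p][q] ** 2
                                  for p in range(n) for q in range(p + 1, n)))
        if off < tol:                                  # convergence monitor
            break
        for p in range(n):
            for q in range(p + 1, n):
                if A[p][q] == 0.0:
                    continue
                tau = (A[q][q] - A[p][p]) / (2.0 * A[p][q])
                t = (1.0 if tau >= 0 else -1.0) / (abs(tau) + math.sqrt(1 + tau * tau))
                c = 1.0 / math.sqrt(1 + t * t)
                s = c * t
                for r in range(n):                     # column update: A <- A G
                    arp, arq = A[r][p], A[r][q]
                    A[r][p] = c * arp - s * arq
                    A[r][q] = s * arp + c * arq
                for r in range(n):                     # row update: A <- G^T A
                    apr, aqr = A[p][r], A[q][r]
                    A[p][r] = c * apr - s * aqr
                    A[q][r] = s * apr + c * aqr
                for r in range(n):                     # accumulate eigenvectors
                    vrp, vrq = V[r][p], V[r][q]
                    V[r][p] = c * vrp - s * vrq
                    V[r][q] = s * vrp + c * vrq
    return [A[i][i] for i in range(n)], V

evals, V = eig_sym_jacobi([[2.0, 1.0], [1.0, 2.0]])
print([round(x, 6) for x in sorted(evals)])   # [1.0, 3.0]
```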

Convergence

The off-diagonal Frobenius norm \(\sigma_{\mathrm{off}} = \sqrt{2\sum_{p<q} A_{pq}^2}\) is the convergence monitor. Each rotation decreases its square by exactly \(2A_{pq}^2\):

\[ \sigma_{\mathrm{off}}^2(A') = \sigma_{\mathrm{off}}^2(A) - 2A_{pq}^2. \]

For classical Jacobi, which zeroes the largest off-diagonal entry at each step, \(A_{pq}^2 \geq \sigma_{\mathrm{off}}^2 / (n(n-1))\), so every rotation satisfies

\[ \sigma_{\mathrm{off}}^2(A_{k+1}) \leq \left(1 - \frac{2}{n(n-1)}\right)\sigma_{\mathrm{off}}^2(A_k) \qquad \text{(linear phase)}; \]

cyclic sweeps over all \(\binom{n}{2}\) pairs show comparable linear-phase behavior in practice.

Near the solution, convergence transitions to a quadratic ultimate phase:

\[ \sigma_{\mathrm{off}}(A_{k+1}) \leq C\,\sigma_{\mathrm{off}}^2(A_k). \]

In practice 5–10 sweeps suffice for double precision with well-separated eigenvalues; up to 15 sweeps for clustered spectra.

Complexity

Each rotation costs \(O(n)\) flops for the row/column update and another \(O(n)\) for eigenvector accumulation. A full sweep visits \(n(n-1)/2\) pairs, giving roughly \(4n^3\) flops per sweep and \(O(n^3 \times \text{sweeps})\) overall.

Comparison with Other Eigensolvers

| Method | Structure | Cost | Finds | Use when |
|---|---|---|---|---|
| Cyclic Jacobi | Symmetric dense | \(O(n^3 \times \text{sweeps})\) | All | Small \(n\), high accuracy |
| LAPACK dsyev (Householder + QR) | Symmetric dense | \(\approx \tfrac{4}{3}n^3\) flops (reduction) | All | Standard dense symmetric |
| Power iteration | Any | \(O(n^2)\) per iteration | 1 dominant | Sparse, dominant only |
| Lanczos | Symmetric sparse | \(O(kn)\) per iteration | \(k\) extreme | Large sparse, \(k \ll n\) |

API: num::eig_sym, num::EigenResult


Power, Inverse, and Rayleigh Iteration

These single-vector methods find one eigenpair at a time and are most useful when only one or a few eigenvalues are needed.

Power Iteration

Expand the starting vector in the eigenbasis: \(v_0 = \sum_i \alpha_i u_i\) with \(\alpha_1 \neq 0\). After \(k\) applications of \(A\), the component along \(u_i\) ( \(i \geq 2\)) decays relative to the \(u_1\) component as \((\lambda_i/\lambda_1)^k\), so the iteration converges to the dominant eigenvector whenever \(|\lambda_1| > |\lambda_2|\).

The update rule is

\[ \mathbf{v}_{k+1} = \frac{A\mathbf{v}_k}{\|A\mathbf{v}_k\|}. \]

The eigenvalue estimate at each step is the Rayleigh quotient \(\lambda \approx \mathbf{v}_k^T A \mathbf{v}_k\).

Convergence rate: the angle between \(v_k\) and the true eigenvector \(u_1\) decays as

\[ \left|\frac{\lambda_2}{\lambda_1}\right|^k. \]

When \(|\lambda_1| \approx |\lambda_2|\) convergence is very slow; use Lanczos instead.
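A minimal sketch of the iteration in pure Python (illustrative only; num::power_iteration's actual signature may differ):

```python
import math

def power_iteration(matvec, n, iters=100):
    """Power iteration sketch: matvec applies A to a vector.

    Returns (eigenvalue estimate via Rayleigh quotient, unit eigenvector).
    """
    v = [1.0] + [0.0] * (n - 1)                 # arbitrary starting vector
    for _ in range(iters):
        w = matvec(v)                           # v_{k+1} = A v_k / ||A v_k||
        nrm = math.sqrt(sum(x * x for x in w))
        v = [x / nrm for x in w]
    lam = sum(vi * wi for vi, wi in zip(v, matvec(v)))   # Rayleigh quotient
    return lam, v

A = [[2.0, 1.0], [1.0, 2.0]]                    # eigenvalues 1 and 3
mv = lambda v: [sum(A[i][j] * v[j] for j in range(2)) for i in range(2)]
lam, v = power_iteration(mv, 2)
print(round(lam, 10))   # 3.0, the dominant eigenvalue
```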

API: num::power_iteration

Inverse Iteration

Replace \(A\) by \((A - \sigma I)^{-1}\). Its eigenvalues are \(\{1/(\lambda_i - \sigma)\}\), so the dominant eigenvalue of the shifted inverse corresponds to the \(\lambda_i\) **closest to \(\sigma\)**. Factor \((A - \sigma I)\) once ( \(O(n^3)\)), then solve cheaply ( \(O(n^2)\)) at each iteration.

The convergence factor at shift \(\sigma\) is

\[ r = \frac{|\lambda - \sigma|}{|\mu - \sigma|} \]

where \(\lambda\) is the nearest eigenvalue and \(\mu\) is the second nearest. Choosing \(\sigma\) close to the target makes \(r \approx 0\); typically 3–5 iterations reach machine precision from a shift within 10% of \(\lambda\).
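A pure-Python sketch of the method, using a small Gaussian-elimination solve in place of a cached factorization (a real implementation such as num::inverse_iteration would presumably factor \(A - \sigma I\) once and reuse it):

```python
import math

def solve(M, b):
    """Gaussian elimination with partial pivoting (small dense systems only)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(M, b)]   # augmented copy
    for k in range(n):
        piv = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[piv] = M[piv], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def inverse_iteration(A, sigma, iters=8):
    """Inverse iteration sketch: converges to the eigenpair nearest sigma."""
    n = len(A)
    B = [[A[i][j] - (sigma if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    v = [1.0] + [0.0] * (n - 1)
    for _ in range(iters):
        w = solve(B, v)                            # w = (A - sigma I)^{-1} v
        nrm = math.sqrt(sum(x * x for x in w))
        v = [x / nrm for x in w]
    lam = sum(v[i] * sum(A[i][j] * v[j] for j in range(n)) for i in range(n))
    return lam, v

# Eigenvalues of [[2,1],[1,2]] are 1 and 3; sigma = 0.9 targets lambda = 1
lam, v = inverse_iteration([[2.0, 1.0], [1.0, 2.0]], sigma=0.9)
print(round(lam, 10))   # 1.0
```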

API: num::inverse_iteration

Rayleigh Quotient Iteration

Update the shift at every step using the Rayleigh quotient of the current vector:

\[ \sigma_k = \mathbf{v}_k^T A \mathbf{v}_k \qquad (\|\mathbf{v}_k\| = 1). \]

For symmetric \(A\) this gives cubic convergence: if the current error is \(\epsilon\), then

\[ |\sigma_{k+1} - \lambda| \leq C\,|\sigma_k - \lambda|^3. \]

An error of \(10^{-4}\) becomes \(\sim 10^{-12}\) in one more step. The cost is \(O(n^3)\) per iteration (new LU at each step), so in practice RQI is used only as a cheap refinement pass (≤ 3 iterations) after a rough estimate from power or inverse iteration.
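The cubic decay can be observed numerically. A pure-Python sketch on a \(2\times 2\) example (the names and shapes here are ours, not num::rayleigh_iteration's actual API):

```python
import math

def solve2(M, b):
    """Cramer's rule for a 2x2 system (sufficient for this illustration)."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(b[0] * M[1][1] - b[1] * M[0][1]) / det,
            (M[0][0] * b[1] - M[1][0] * b[0]) / det]

def rayleigh_iteration(A, v, true_lam, iters=4):
    """RQI sketch: the shift is refreshed with the Rayleigh quotient before
    every solve, so each step amounts to a new factorization of (A - sigma I).
    Tracks the shift error against a known eigenvalue, for illustration."""
    errs = []
    for _ in range(iters):
        sigma = sum(v[i] * sum(A[i][j] * v[j] for j in range(2))
                    for i in range(2))
        errs.append(abs(sigma - true_lam))
        B = [[A[i][j] - (sigma if i == j else 0.0) for j in range(2)]
             for i in range(2)]
        w = solve2(B, v)
        nrm = math.sqrt(sum(x * x for x in w))
        v = [x / nrm for x in w]
    return sigma, v, errs

A = [[2.0, 1.0], [1.0, 2.0]]                 # eigenvalues 1 and 3
v0 = [0.96, 0.28]                            # rough unit guess near (1,1)/sqrt(2)
sigma, v, errs = rayleigh_iteration(A, v0, true_lam=3.0)
print(errs[0] > 1e-2 and errs[-1] < 1e-9)    # True: cubic error decay
```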

API: num::rayleigh_iteration

Method Selection

| Goal | Recommended method |
|---|---|
| Largest eigenvalue, well-separated | num::power_iteration |
| Eigenvalue nearest a known \(\sigma\) | num::inverse_iteration |
| Refine a rough estimate to machine precision | num::rayleigh_iteration |
| Several extreme eigenvalues, large sparse \(A\) | num::lanczos |

Lanczos Algorithm

Lanczos builds a \(k\)-dimensional Krylov subspace \(\mathcal{K}_k(A, v_1) = \operatorname{span}\{v_1, Av_1, \ldots, A^{k-1}v_1\}\) and extracts near-optimal approximations to the extreme eigenvalues of a large symmetric matrix. It is the method of choice when \(k \ll n\) and \(A\) is available only as a matrix-vector product (SpMV).

The Lanczos Relation

Starting from a unit vector \(v_1\), the \(k\)-step process constructs an orthonormal basis \(V_k = [v_1, \ldots, v_k]\) satisfying

\[ AV_k = V_k T_k + \beta_k \mathbf{v}_{k+1} \mathbf{e}_k^T, \]

where \(T_k\) is the symmetric tridiagonal Galerkin projection

\[ T_k = V_k^T A V_k \in \mathbb{R}^{k \times k}. \]

The three-term recurrence generating each new basis vector is: \(w = Av_j\), \(\alpha_j = v_j^T w\), \(w \leftarrow w - \alpha_j v_j - \beta_{j-1} v_{j-1}\), \(\beta_j = \|w\|\), \(v_{j+1} = w/\beta_j\).
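The recurrence is short enough to sketch directly in pure Python (illustrative only; not the num::lanczos implementation):

```python
import math

def lanczos(matvec, n, k):
    """k-step Lanczos sketch. Returns the basis V (list of vectors) and the
    diagonals alpha and off-diagonals beta of the tridiagonal T_k."""
    v = [1.0] + [0.0] * (n - 1)                    # unit start vector
    V, alphas, betas = [v], [], []
    v_prev, beta_prev = [0.0] * n, 0.0
    for j in range(k):
        w = matvec(V[-1])                          # w = A v_j
        a = sum(wi * vi for wi, vi in zip(w, V[-1]))          # alpha_j
        w = [wi - a * vi - beta_prev * pi                     # three-term step
             for wi, vi, pi in zip(w, V[-1], v_prev)]
        b = math.sqrt(sum(x * x for x in w))                  # beta_j
        alphas.append(a)
        betas.append(b)
        if b == 0.0:                               # invariant subspace found
            break
        v_prev, beta_prev = V[-1], b
        V.append([x / b for x in w])
    return V, alphas, betas

# A = tridiag(1, 2, 1), n = 3: eigenvalues are 2 and 2 +/- sqrt(2)
A = [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]]
mv = lambda v: [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]
V, alphas, betas = lanczos(mv, 3, 2)

# T_2 = [[alpha_1, beta_1], [beta_1, alpha_2]]; Ritz values in closed form
a1, a2, b1 = alphas[0], alphas[1], betas[0]
mid, rad = (a1 + a2) / 2.0, math.sqrt(((a1 - a2) / 2.0) ** 2 + b1 * b1)
print(mid + rad)   # largest Ritz value: 3.0 (true lambda_max = 2 + sqrt(2))
```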

Ritz Values and Cheap Residual Bound

The eigenvalues of \(T_k\) — called Ritz values \(\theta_1, \ldots, \theta_k\) — approximate the extreme eigenvalues of \(A\). If \(S\) is the eigenvector matrix of \(T_k\), the Ritz vectors \(\tilde{u}_i = V_k s_i\) satisfy

\[ \|A\tilde{u}_i - \theta_i \tilde{u}_i\|_2 = \beta_k |S_{k,i}|. \]

This provides an inexpensive stopping criterion: only \(\beta_k\) (already computed in the recurrence) and the last row of \(S\) (from the small eigenproblem on \(T_k\)) are needed — no explicit Ritz vector is required.
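The bound can be checked on a tiny worked example. The values below (basis, \(T_2\), \(\beta_2\)) are computed by hand for \(A = \mathrm{tridiag}(1,2,1)\) with \(n = 3\) and two Lanczos steps started from \(e_1\):

```python
import math

# By hand: V_2 = [e1, e2], T_2 = [[2, 1], [1, 2]], beta_2 = 1
A = [[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]]
beta_k = 1.0
theta = 3.0                                        # largest Ritz value of T_2
s = [1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)]   # its unit eigenvector

u = [s[0], s[1], 0.0]                              # Ritz vector u = V_2 s
Au = [sum(A[i][j] * u[j] for j in range(3)) for i in range(3)]
explicit = math.sqrt(sum((Au[i] - theta * u[i]) ** 2 for i in range(3)))
cheap = beta_k * abs(s[-1])                        # beta_k |S_{k,i}|: no Ritz vector
print(abs(explicit - cheap) < 1e-12)               # True: both equal 1/sqrt(2)
```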

Ghost Eigenvalues and Full Reorthogonalization

In floating-point arithmetic the three-term recurrence does not preserve orthogonality. Paige's analysis shows that orthogonality is lost precisely when a Ritz value converges to an eigenvalue \(\lambda_j\): subsequent Lanczos vectors drift back into the \(u_j\) eigenspace, causing \(\lambda_j\) to reappear as a spurious ghost eigenvalue.

Full reorthogonalization (MGS) projects \(w\) against all previous basis vectors at every step, at a cost of \(O(k^2 n)\) total, and guarantees orthogonality to machine precision. It requires storing all \(k\) vectors simultaneously.
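A sketch of the MGS projection step (the helper name is ours, not part of the num API):

```python
def full_reorthogonalize(w, basis):
    """Full reorthogonalization sketch: one modified Gram-Schmidt pass
    projecting the new Lanczos vector w against every previous basis vector."""
    for v in basis:
        h = sum(wi * vi for wi, vi in zip(w, v))   # coefficient <w, v>
        w = [wi - h * vi for wi, vi in zip(w, v)]  # subtract the projection
    return w

# After the three-term recurrence, w may have drifted back toward v1:
v1 = [1.0, 0.0, 0.0]
v2 = [0.0, 1.0, 0.0]
w = [1e-8, 0.3, 0.9]                # small spurious component along v1
w = full_reorthogonalize(w, [v1, v2])
print(sum(wi * vi for wi, vi in zip(w, v1)))   # 0.0
```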

Selective reorthogonalization (Parlett–Scott 1979) reorthogonalizes only against converged Ritz vectors, reducing the work by 2–10× when few eigenvalues have converged.

Thick Restart and ARPACK

Thick restart (Wu–Simon 2000) retains the \(p\) best Ritz pairs when restarting, forming the new starting basis \([\tilde{u}_1, \ldots, \tilde{u}_p, v_{k+1}]\). For symmetric problems it is mathematically equivalent to the implicitly restarted Lanczos scheme used in ARPACK, the backend for MATLAB eigs() and SciPy eigsh().

Complexity

| Phase | Cost |
|---|---|
| \(k\) matvecs | \(O(k \cdot \mathrm{nnz})\) sparse, \(O(kn^2)\) dense |
| Full reorthogonalization | \(O(k^2 n)\) |
| Inner eigensolver on \(T_k\) | \(O(k^3)\) (negligible for \(k \ll n\)) |

API: num::lanczos, num::LanczosResult