numerics
Direct solvers (LU, Thomas) factor the matrix once and solve in \(O(n^2)\) per right-hand side; they are exact up to floating-point and preferred when \(n\) is moderate or many right-hand sides share the same coefficient matrix. Iterative solvers (Jacobi, Gauss-Seidel, CG, GMRES) never form a factorization; they produce a sequence of approximate solutions and are preferred for large sparse systems where factorization fill-in would be prohibitive.
Solve \(Tx = d\) where
\[ T = \begin{pmatrix} b_0 & c_0 & & \\ a_1 & b_1 & c_1 & \\ & a_2 & b_2 & \ddots \\ & & \ddots & \ddots & c_{n-2} \\ & & & a_{n-1} & b_{n-1} \end{pmatrix} \]
Tridiagonal systems arise in 1D finite differences, implicit time integration (Crank-Nicolson), cubic spline interpolation, and ADI splitting for 2D/3D parabolic equations.
The Thomas algorithm is Gaussian elimination specialized to tridiagonal structure. The LU factors have the form
\[ L = \begin{pmatrix} 1 \\ l_0 & 1 \\ & l_1 & \ddots \\ & & \ddots & 1 \end{pmatrix}, \quad U = \begin{pmatrix} u_0 & c_0 \\ & u_1 & c_1 \\ & & \ddots & \ddots \\ & & & u_{n-1} \end{pmatrix} \]
where the recurrences follow directly from \(A = LU\):
\[ l_{i-1} = \frac{a_i}{u_{i-1}}, \qquad u_i = b_i - l_{i-1}\, c_{i-1} \]
Forward sweep (eliminate sub-diagonal, modify \(d\) in place):
\[ l \leftarrow a_i / u_{i-1}, \quad u_i \leftarrow b_i - l\, c_{i-1}, \quad d_i \leftarrow d_i - l\, d_{i-1}, \quad i = 1, \ldots, n-1 \]
Backward substitution:
\[ x_{n-1} = d_{n-1} / u_{n-1}, \qquad x_i = (d_i - c_i\, x_{i+1}) / u_i, \quad i = n-2, \ldots, 0 \]
The total cost is \(8n - 7\) FLOPs — asymptotically optimal, since merely reading the \(O(n)\) input already takes \(O(n)\) work.
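The two sweeps translate directly into code. A minimal NumPy sketch (array layout as above, with `a[0]` and `c[n-1]` unused; no pivoting, so it relies on the stability conditions below):

```python
import numpy as np

def thomas(a, b, c, d):
    """Solve the tridiagonal system T x = d in O(n).

    a: sub-diagonal (a[0] unused), b: diagonal,
    c: super-diagonal (c[n-1] unused), d: right-hand side.
    Inputs are length-n arrays; b and d are copied, not overwritten.
    """
    n = len(b)
    u = b.astype(float).copy()   # modified diagonal u_i
    y = d.astype(float).copy()   # modified right-hand side
    # forward sweep: eliminate the sub-diagonal
    for i in range(1, n):
        l = a[i] / u[i - 1]
        u[i] -= l * c[i - 1]
        y[i] -= l * y[i - 1]
    # backward substitution
    x = np.empty(n)
    x[-1] = y[-1] / u[-1]
    for i in range(n - 2, -1, -1):
        x[i] = (y[i] - c[i] * x[i + 1]) / u[i]
    return x
```

For example, the 1D Poisson matrix (`b = 2`, `a = c = -1`) is solved exactly in one pass.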
The algorithm is stable without pivoting when \(T\) is strictly diagonally dominant ( \(|b_i| > |a_i| + |c_i|\) for all \(i\)) or symmetric positive definite. Under diagonal dominance the pivots satisfy \(|u_i| > |c_i|\) at every step, so they remain bounded away from zero.
The forward sweep is inherently sequential within a single system. For \(m\) independent tridiagonal systems (common in ADI methods), assign one GPU thread per system: the \(m\) sweeps are fully independent and the kernel achieves near-peak memory bandwidth.
Write \(A = M - N\) and iterate
\[ M x_{k+1} = N x_k + b \quad \Longrightarrow \quad x_{k+1} = \underbrace{M^{-1}N}_{B}\, x_k + M^{-1}b \]
The iteration converges from any starting point if and only if the spectral radius \(\rho(B) < 1\). Decompose \(A = D + L + U\) (diagonal, strict lower, strict upper):
| Method | \(M\) | \(N\) |
|---|---|---|
| Jacobi | \(D\) | \(-(L+U)\) |
| Gauss-Seidel | \(D+L\) | \(-U\) |
| SOR( \(\omega\)) | \(\frac{1}{\omega}D + L\) | \((\frac{1}{\omega}-1)D - U\) |
Each component of the new iterate is computed from the old iterate only:
\[ x_i^{(k+1)} = \frac{1}{A_{ii}}\!\left(b_i - \sum_{j \neq i} A_{ij}\, x_j^{(k)}\right), \quad i = 0, \ldots, n-1 \]
Because all \(x_i^{(k+1)}\) depend solely on \(x^{(k)}\), Jacobi is embarrassingly parallel — all updates are independent.
Convergence condition: Jacobi converges under strict diagonal dominance ( \(|A_{ii}| > \sum_{j \neq i}|A_{ij}|\)) or for SPD \(A\). The iteration matrix is \(B_J = -D^{-1}(L+U)\). For the model Poisson problem (5-point stencil on an \(n \times n\) grid):
\[ \rho(B_J) = \cos\!\left(\frac{\pi}{n+1}\right) \approx 1 - \frac{\pi^2}{2n^2} \]
This requires \(O(n^2)\) iterations to reduce the error by \(1/e\) — Jacobi is too slow as a standalone solver for large Poisson problems but remains a useful multigrid smoother.
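A vectorized sketch of the Jacobi update (illustrative; it assumes \(A\) is strictly diagonally dominant so the iteration converges):

```python
import numpy as np

def jacobi(A, b, x0=None, tol=1e-10, max_iter=10_000):
    """Jacobi iteration: x_{k+1} = D^{-1} (b - (L+U) x_k)."""
    n = len(b)
    x = np.zeros(n) if x0 is None else x0.astype(float)
    D = np.diag(A)                # diagonal entries of A
    R = A - np.diag(D)            # off-diagonal part L + U
    for _ in range(max_iter):
        x_new = (b - R @ x) / D   # every component uses only the old iterate
        if np.linalg.norm(x_new - x, np.inf) < tol:
            return x_new
        x = x_new
    return x
```

Because `x_new` depends only on `x`, the whole update is a single matrix-vector product plus a pointwise divide — exactly the structure that makes Jacobi embarrassingly parallel.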
Each component uses the most recently computed values:
\[ x_i^{(k+1)} = \frac{1}{A_{ii}}\!\left(b_i - \sum_{j < i} A_{ij}\, x_j^{(k+1)} - \sum_{j > i} A_{ij}\, x_j^{(k)}\right) \]
The in-place update propagates fresh data immediately. For consistently ordered matrices — the model Poisson problem among them — Young's theory gives
\[ \rho(B_{GS}) = \rho(B_J)^2 \]
so Gauss-Seidel needs asymptotically half as many iterations as Jacobi. (The Stein-Rosenberg theorem makes the weaker guarantee that, when the Jacobi iteration matrix is nonnegative, the two methods converge or diverge together.)
SOR over-relaxes the Gauss-Seidel correction by \(\omega \in (0,2)\):
\[ x_i^{(k+1)} = x_i^{(k)} + \omega\!\left(\tilde{x}_i^{(GS)} - x_i^{(k)}\right) \]
For the model Poisson problem the optimal parameter is
\[ \omega_{\mathrm{opt}} = \frac{2}{1 + \sin(\pi/(n+1))} \approx 2 - \frac{2\pi}{n} \]
which reduces the spectral radius to
\[ \rho(B_{SOR,\,\omega_{\mathrm{opt}}}) = \omega_{\mathrm{opt}} - 1 \approx 1 - \frac{2\pi}{n} \]
requiring only \(O(n)\) iterations — a landmark improvement over the \(O(n^2)\) of plain Gauss-Seidel (D.M. Young, 1950).
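A component-wise sketch of the SOR sweep; setting `omega = 1` applies the full Gauss-Seidel correction, so the same routine serves both methods:

```python
import numpy as np

def sor(A, b, omega, x0=None, tol=1e-10, max_iter=10_000):
    """SOR: over-relax each Gauss-Seidel correction by omega in (0, 2)."""
    n = len(b)
    x = np.zeros(n) if x0 is None else x0.astype(float)
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            # Gauss-Seidel value using fresh x[:i] and old x[i+1:]
            sigma = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]
            x_gs = (b[i] - sigma) / A[i, i]
            x[i] += omega * (x_gs - x[i])   # over-relaxed correction
        if np.linalg.norm(x - x_old, np.inf) < tol:
            break
    return x
```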
Parallelism: natural-ordering GS has a sequential dependency chain. For structured grids, red-black ordering (color node \((i,j)\) by parity of \(i+j\)) decouples all red updates and all black updates into two independent parallel sweeps, preserving the same convergence rate.
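For the 5-point stencil, the two-color sweep can be vectorized with parity masks. A sketch, assuming `u` carries fixed Dirichlet boundary values and `f` already includes the \(h^2\) scaling:

```python
import numpy as np

def red_black_gs(u, f, sweeps=1):
    """Red-black Gauss-Seidel for the 5-point Laplacian:
    u_ij = (sum of 4 neighbors + h^2 f_ij) / 4.  Interior nodes are
    colored by the parity of i + j; each color updates in one
    vectorized pass, since same-color nodes never touch each other."""
    i, j = np.meshgrid(np.arange(u.shape[0]), np.arange(u.shape[1]),
                       indexing="ij")
    interior = ((i > 0) & (i < u.shape[0] - 1) &
                (j > 0) & (j < u.shape[1] - 1))
    for _ in range(sweeps):
        for color in (0, 1):                       # red sweep, then black
            mask = interior & ((i + j) % 2 == color)
            avg = np.zeros_like(u)
            avg[1:-1, 1:-1] = (u[:-2, 1:-1] + u[2:, 1:-1]
                               + u[1:-1, :-2] + u[1:-1, 2:]) / 4.0
            u[mask] = avg[mask] + f[mask] / 4.0
    return u
```

Each colored half-sweep is a pure stencil application, which maps directly onto SIMD lanes or GPU threads.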
For an SPD matrix \(A\), solving \(Ax = b\) is equivalent to minimizing the quadratic functional
\[ \phi(\mathbf{x}) = \frac{1}{2}\mathbf{x}^T A\mathbf{x} - \mathbf{b}^T\mathbf{x} \]
since \(\nabla\phi = A\mathbf{x} - \mathbf{b} = -\mathbf{r}\). The residual is thus the negative gradient — the direction of steepest descent; CG follows a sequence of search directions that span the Krylov subspace
\[ \mathcal{K}_k(A,\mathbf{r}_0) = \operatorname{span}\!\bigl\{\mathbf{r}_0,\, A\mathbf{r}_0,\, \ldots,\, A^{k-1}\mathbf{r}_0\bigr\} \]
The iterate \(x_k\) is the unique minimizer of \(\|\mathbf{x}-\mathbf{x}^*\|_A\) over \(\mathbf{x}_0 + \mathcal{K}_k\).
Two vectors are A-conjugate if \(\mathbf{p}^T A\mathbf{q} = 0\). Minimizing \(\phi\) along an A-conjugate direction leaves progress along previous directions undisturbed. CG maintains A-conjugacy using only the immediately preceding direction — a consequence of the three-term Lanczos recurrence.
At iteration \(k\), exact line minimization along the search direction \(\mathbf{p}_k\) gives
\[ \alpha_k = \frac{\mathbf{r}_k^T\mathbf{r}_k}{\mathbf{p}_k^T A \mathbf{p}_k} \]
After updating \(\mathbf{x}_{k+1} = \mathbf{x}_k + \alpha_k\mathbf{p}_k\) and \(\mathbf{r}_{k+1} = \mathbf{r}_k - \alpha_k A\mathbf{p}_k\), the new A-conjugate direction is
\[ \mathbf{p}_{k+1} = \mathbf{r}_{k+1} + \beta_k\mathbf{p}_k, \qquad \beta_k = \frac{\mathbf{r}_{k+1}^T\mathbf{r}_{k+1}}{\mathbf{r}_k^T\mathbf{r}_k} \]
The dominant cost per iteration is the matrix-vector product \(A\mathbf{p}_k\) — \(O(n^2)\) for dense \(A\), \(O(\mathrm{nnz})\) for sparse.
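Putting the recurrences together, a minimal CG sketch for SPD \(A\):

```python
import numpy as np

def cg(A, b, tol=1e-10, max_iter=None):
    """Conjugate gradients for SPD A: one matvec per iteration."""
    n = len(b)
    max_iter = n if max_iter is None else max_iter
    x = np.zeros(n)
    r = b.copy()                      # r_0 = b - A x_0 with x_0 = 0
    p = r.copy()
    rr = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rr / (p @ Ap)         # exact line minimization along p
        x += alpha * p
        r -= alpha * Ap
        rr_new = r @ r
        if np.sqrt(rr_new) < tol:
            break
        p = r + (rr_new / rr) * p     # new A-conjugate direction
        rr = rr_new
    return x
```

Note that only four vectors (`x`, `r`, `p`, `Ap`) are ever stored — the three-term recurrence keeps memory at \(O(n)\).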
CG terminates in at most \(n\) iterations in exact arithmetic. In practice:
\[ \frac{\|\mathbf{e}_k\|_A}{\|\mathbf{e}_0\|_A} \leq 2\left(\frac{\sqrt{\kappa}-1}{\sqrt{\kappa}+1}\right)^k \]
where \(\kappa = \kappa_2(A) = \lambda_{\max}/\lambda_{\min}\).
| \(\kappa\) | Iterations for 6-digit accuracy (from the bound) |
|---|---|
| \(10\) | ~22 |
| \(100\) | ~72 |
| \(10^4\) | ~725 |
| \(10^6\) | ~7250 |
Eigenvalue clustering accelerates convergence: if all eigenvalues except a small cluster lie in a single interval, CG quickly annihilates the corresponding polynomial and then converges as if the problem had only as many distinct eigenvalues as the cluster size.
Replace \(A\) with \(M^{-1}A\) where \(M \approx A\) is cheap to invert. The PCG scalars change to
\[ \alpha_k = \frac{\mathbf{r}_k^T\mathbf{z}_k}{\mathbf{p}_k^T A\mathbf{p}_k}, \qquad \beta_k = \frac{\mathbf{r}_{k+1}^T\mathbf{z}_{k+1}}{\mathbf{r}_k^T\mathbf{z}_k} \]
where \(\mathbf{z}_k = M^{-1}\mathbf{r}_k\) is the preconditioner solve. Common preconditioners: diagonal (Jacobi), incomplete Cholesky (IC0), and algebraic multigrid (AMG, giving \(\kappa(M^{-1}A) \approx O(1)\) for Poisson-type problems).
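A sketch of PCG with the simplest choice, the diagonal (Jacobi) preconditioner \(M = \operatorname{diag}(A)\):

```python
import numpy as np

def pcg_jacobi(A, b, tol=1e-10, max_iter=None):
    """Preconditioned CG with M = diag(A): z = M^{-1} r each iteration."""
    n = len(b)
    max_iter = n if max_iter is None else max_iter
    Minv = 1.0 / np.diag(A)           # diagonal preconditioner solve is O(n)
    x = np.zeros(n)
    r = b.copy()
    z = Minv * r
    p = z.copy()
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = Minv * r                  # preconditioner solve
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x
```

Swapping in IC0 or AMG only changes the line computing `z`; the rest of the loop is untouched.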
`cg_matfree` accepts a `MatVecFn` callback instead of an explicit matrix, enabling implicit time stepping ( \(A = M + \Delta t\, K\), never assembled), spectral methods (apply \(A\) via FFT in \(O(n\log n)\)), and structured operators (Toeplitz, circulant) with \(O(n)\) storage.
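The callback idea can be sketched as follows (the names `cg_matvec` and `laplacian_1d` are illustrative, not the `cg_matfree` API itself); here the operator is the 1D Dirichlet Laplacian stencil applied in \(O(n)\) with no matrix stored:

```python
import numpy as np

def cg_matvec(matvec, b, tol=1e-10, max_iter=None):
    """CG where the operator is supplied as a callable x -> A @ x."""
    n = len(b)
    max_iter = n if max_iter is None else max_iter
    x = np.zeros(n)
    r = b.copy()
    p = r.copy()
    rr = r @ r
    for _ in range(max_iter):
        Ap = matvec(p)                # the only place A is "used"
        alpha = rr / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rr_new = r @ r
        if np.sqrt(rr_new) < tol:
            break
        p = r + (rr_new / rr) * p
        rr = rr_new
    return x

def laplacian_1d(v):
    """Apply the tridiagonal stencil (-1, 2, -1) with zero Dirichlet ends."""
    out = 2.0 * v
    out[1:] -= v[:-1]
    out[:-1] -= v[1:]
    return out
```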
GMRES solves \(Ax = b\) for any invertible \(A\), including non-symmetric matrices where CG does not apply. At step \(k\) it finds
\[ x_k = \operatorname*{argmin}_{x \in x_0 + \mathcal{K}_k}\|Ax - b\|_2 \]
over the same Krylov subspace \(\mathcal{K}_k(A,r_0)\).
The Krylov basis is built by the Arnoldi process using Modified Gram-Schmidt (MGS). After \(k\) steps:
\[ A V_k = V_{k+1}\bar{H}_k \]
where \(V_k = [v_1 \mid \cdots \mid v_k]\) has orthonormal columns and \(\bar{H}_k \in \mathbb{R}^{(k+1)\times k}\) is the upper Hessenberg matrix of MGS coefficients. The MGS step at iteration \(j\) is:
\[ w \leftarrow Av_j, \qquad h_{i,j} = w^T v_i, \quad w \leftarrow w - h_{i,j} v_i \quad (i = 0,\ldots,j), \qquad h_{j+1,j} = \|w\|, \quad v_{j+1} = w / h_{j+1,j} \]
Using the Arnoldi relation and orthonormality of \(V_{k+1}\):
\[ \|Ax_k - b\|_2 = \|\bar{H}_k y_k - \beta e_1\|_2, \qquad \beta = \|r_0\|_2 \]
This is a small \((k+1) \times k\) least-squares problem solved incrementally via Givens rotations. At step \(j\), after applying the previous rotations to the new column, a new rotation annihilates the sub-diagonal entry \(H_{j+1,j}\):
\[ \begin{pmatrix} c_j & s_j \\ -s_j & c_j \end{pmatrix} \begin{pmatrix} H_{jj} \\ H_{j+1,j} \end{pmatrix} = \begin{pmatrix} d \\ 0 \end{pmatrix}, \qquad c_j = \frac{H_{jj}}{d},\quad s_j = \frac{H_{j+1,j}}{d},\quad d = \sqrt{H_{jj}^2 + H_{j+1,j}^2} \]
The rotated right-hand side vector \(g\) satisfies \(|g_{j+1}| = \|r_{j+1}\|_2\), giving the residual norm **without computing \(x_k\) explicitly** — used as the convergence monitor.
Once \(\|r_k\|_2 < \mathrm{tol}\), back-solve the \(k \times k\) upper triangular system \(H_{[k]}y = g_{[k]}\) and recover \(x_k = x_0 + V_k y\).
Full GMRES stores \(k\) vectors growing without bound. **Restarted GMRES( \(m\))** caps memory at \(m+1\) vectors of length \(n\): after \(m\) Arnoldi steps, update \(x_0\) from the current solution and restart.
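The pieces — MGS Arnoldi, Givens rotations, the free residual monitor, and restarting — assemble into a compact routine. A sketch of GMRES( \(m\)):

```python
import numpy as np

def gmres(A, b, m=30, tol=1e-10, max_restarts=50):
    """Restarted GMRES(m): MGS Arnoldi + Givens rotations, storing at
    most m+1 basis vectors of length n per restart cycle."""
    n = len(b)
    x = np.zeros(n)
    for _ in range(max_restarts):
        r = b - A @ x
        beta = np.linalg.norm(r)
        if beta < tol:
            return x
        V = np.zeros((n, m + 1))
        H = np.zeros((m + 1, m))
        V[:, 0] = r / beta
        g = np.zeros(m + 1)
        g[0] = beta                          # rotated right-hand side
        cs, sn = np.zeros(m), np.zeros(m)
        k = 0
        for j in range(m):
            # Arnoldi step with modified Gram-Schmidt
            w = A @ V[:, j]
            for i in range(j + 1):
                H[i, j] = w @ V[:, i]
                w -= H[i, j] * V[:, i]
            H[j + 1, j] = np.linalg.norm(w)
            if H[j + 1, j] > 1e-14:
                V[:, j + 1] = w / H[j + 1, j]
            # apply the previous Givens rotations to column j
            for i in range(j):
                H[i, j], H[i + 1, j] = (
                    cs[i] * H[i, j] + sn[i] * H[i + 1, j],
                    -sn[i] * H[i, j] + cs[i] * H[i + 1, j])
            # new rotation annihilates the sub-diagonal entry H[j+1, j]
            d = np.hypot(H[j, j], H[j + 1, j])
            cs[j], sn[j] = H[j, j] / d, H[j + 1, j] / d
            H[j, j], H[j + 1, j] = d, 0.0
            g[j + 1] = -sn[j] * g[j]
            g[j] = cs[j] * g[j]
            k = j + 1
            if abs(g[k]) < tol:              # residual norm for free
                break
        y = np.linalg.solve(H[:k, :k], g[:k])   # triangular back-solve
        x = x + V[:, :k] @ y
        if abs(g[k]) < tol:
            return x
    return x
```

The inner loop never forms \(x_k\); only `|g[k]|` is checked, and the solution is reconstructed once at the end of each cycle.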
GMRES convergence is governed by a polynomial approximation problem: for normal \(A\) (for general diagonalizable \(A = V\Lambda V^{-1}\), an extra factor \(\kappa_2(V)\) appears),
\[ \frac{\|r_k\|_2}{\|r_0\|_2} \leq \min_{\substack{p_k \in \mathcal{P}_k \\ p_k(0)=1}} \max_{\lambda \in \sigma(A)} |p_k(\lambda)| \]
Convergence is fast when the spectrum \(\sigma(A)\) is clustered away from the origin; it can stagnate for many steps then converge suddenly if one eigenvalue is an outlier far from the main cluster.
Preconditioning replaces \(A\) by \(M^{-1}A\) (left) or \(AM^{-1}\) (right); right preconditioning is usually preferred because it minimizes the true residual norm \(\|r_k\|_2\) rather than the preconditioned norm \(\|M^{-1}r_k\|_2\).