numerics
A matrix factorization rewrites \(A\) as a product of simpler matrices whose structure reduces solving, least-squares, and determinant problems to cheap triangular operations.
Computing \(A^{-1}\) explicitly costs an \(O(n^3)\) inversion followed by an \(O(n^2)\) multiplication \(x = A^{-1}b\), and is needlessly inaccurate. Factorization costs one \(O(n^3)\) pass up front; each subsequent solve (forward + backward substitution) costs only \(O(n^2)\).
When the same coefficient matrix appears with many right-hand sides—time steps of an ODE, Newton iterations, column-by-column inversion—the factorization is amortized:
\[ \underbrace{O(n^3)}_{\text{factor once}} + k \cdot \underbrace{O(n^2)}_{\text{solve}} \quad \ll \quad k \cdot O(n^3) \]
For \(k = n\) solves the factorization approach is on the order of \(n\) times cheaper.
Every non-singular \(A \in \mathbb{R}^{n \times n}\) admits
\[ PA = LU \]
where \(P\) is a permutation matrix recording the row swaps, \(L\) is unit lower triangular (\(L_{ii} = 1\)), and \(U\) is upper triangular.
\(L\) and \(U\) are stored packed in a single \(n \times n\) working matrix: \(U\) occupies the diagonal and above; \(L\) occupies the strict lower triangle with the implicit unit diagonal.
Gaussian elimination proceeds column by column. At step \(k\), the multipliers
\[ l_{ik} = \frac{a_{ik}^{(k)}}{a_{kk}^{(k)}}, \quad i = k+1, \ldots, n-1 \]
are stored in the lower triangle, and the trailing submatrix receives a rank-1 (Schur complement) update:
\[ a_{ij}^{(k+1)} = a_{ij}^{(k)} - l_{ik}\, a_{kj}^{(k)}, \quad i,j > k \]
After \(n-1\) steps the working matrix holds \(L\) (below diagonal) and \(U\) (diagonal and above).
Without pivoting, a tiny pivot \(a_{kk}^{(k)}\) inflates the multipliers catastrophically. Example: for
\[ A = \begin{pmatrix} \varepsilon & 1 \\ 1 & 1 \end{pmatrix}, \quad \varepsilon = 10^{-16} \]
without pivoting \(l_{21} \approx 10^{16}\) and \(U_{22} \approx -10^{16}\). Swapping rows first gives \(l_{21} = \varepsilon\) and \(U_{22} \approx 1\).
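The \(2 \times 2\) system can be traced step by step in double precision; the right-hand side \(b = (1, 2)^T\) below is an assumed example (the exact solution is then \(x \approx (1, 1)^T\)):

```python
# Elimination on A = [[eps, 1], [1, 1]], b = [1, 2], with and without pivoting.
eps = 1e-16

# Without pivoting: the multiplier 1/eps swamps everything it touches.
l21 = 1.0 / eps                 # ~1e16
u22 = 1.0 - l21 * 1.0           # ~-1e16: the original entry 1 is lost
y2 = 2.0 - l21 * 1.0            # forward substitution
x2 = y2 / u22                   # ~1, but the last digits are garbage
x1 = (1.0 - 1.0 * x2) / eps     # catastrophically wrong (should be ~1)

# With rows swapped first: pivot 1, multiplier eps, everything benign.
l21s = eps
u22s = 1.0 - l21s * 1.0         # ~1
y2s = 1.0 - l21s * 2.0
x2s = y2s / u22s                # ~1
x1s = (2.0 - 1.0 * x2s) / 1.0   # ~1, correct
```

The swapped variant recovers both components to full accuracy; the unpivoted one destroys \(x_1\) entirely.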
Partial pivoting — swapping row \(k\) with the row achieving \(\max_{i \geq k}|a_{ik}^{(k)}|\) — guarantees \(|L_{ij}| \leq 1\) for all \(i > j\), which bounds the growth factor:
\[ \rho(n) = \frac{\max_{i,j,k}\bigl|a_{ij}^{(k)}\bigr|}{\max_{i,j}|a_{ij}|} \leq 2^{n-1} \]
The \(2^{n-1}\) bound is the Wilkinson worst case; random matrices grow as \(O(n^{1/2})\) in practice, and diagonally dominant or SPD matrices have a growth factor bounded by a small constant.
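The elimination with partial pivoting can be sketched in pure Python, using 0-based indexing as in the formulas above; `lu_factor` is an illustrative name, not a library routine:

```python
# Factor PA = LU in place with partial pivoting. piv[k] records the row
# swapped into position k at step k (piv[k] == k means no swap). On return,
# U occupies the diagonal and above, L the strict lower triangle (unit
# diagonal implicit), exactly the packed storage described above.
def lu_factor(A):
    n = len(A)
    piv = list(range(n))
    for k in range(n - 1):
        # partial pivoting: bring the largest |a_ik|, i >= k, to the diagonal
        p = max(range(k, n), key=lambda i: abs(A[i][k]))
        if p != k:
            A[k], A[p] = A[p], A[k]
            piv[k] = p
        for i in range(k + 1, n):
            A[i][k] /= A[k][k]          # multiplier l_ik, stored in place
            # rank-1 Schur-complement update of the trailing submatrix
            for j in range(k + 1, n):
                A[i][j] -= A[i][k] * A[k][j]
    return A, piv

LU, piv = lu_factor([[2.0, 1.0], [4.0, 3.0]])
```

For this example the rows are swapped first, then a single multiplier \(0.5\) is stored below the diagonal.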
Given \(PA = LU\) and a right-hand side \(b\), solve \(Ax = b\) in three steps. First apply the recorded row swaps, \(y \leftarrow Pb\). Then forward substitution with the unit lower triangle:
\[ y_i = y_i - \sum_{j=0}^{i-1} L_{ij}\, y_j, \quad i = 1, \ldots, n-1 \]
Finally backward substitution with \(U\):
\[ x_i = \frac{1}{U_{ii}}\!\left(y_i - \sum_{j=i+1}^{n-1} U_{ij}\, x_j\right), \quad i = n-1, \ldots, 0 \]
Each triangular solve costs \(n^2\) flops; the factorization itself costs \(\frac{2}{3}n^3\) flops.
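A sketch of the solve phase under the same storage conventions (pure Python, 0-based; `lu_solve` is an illustrative name). The hard-coded packed factors belong to the assumed example \(A = [[2, 1], [4, 3]]\):

```python
# Solve Ax = b from packed LU factors and the pivot record, in O(n^2).
def lu_solve(LU, piv, b):
    n = len(LU)
    x = list(b)
    # step 1: apply the row swaps recorded during factorization (y = Pb)
    for k in range(n - 1):
        if piv[k] != k:
            x[k], x[piv[k]] = x[piv[k]], x[k]
    # step 2: forward substitution with the implicit-unit lower triangle
    for i in range(1, n):
        x[i] -= sum(LU[i][j] * x[j] for j in range(i))
    # step 3: backward substitution with U
    for i in range(n - 1, -1, -1):
        x[i] = (x[i] - sum(LU[i][j] * x[j] for j in range(i + 1, n))) / LU[i][i]
    return x

# packed factors of A = [[2, 1], [4, 3]], rows 0 and 1 swapped
x = lu_solve([[4.0, 3.0], [0.5, -0.5]], [1, 1], [1.0, 3.0])
```

Both substitution loops touch each stored entry once, which is where the \(n^2\) per-solve cost comes from.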
\[ \det(A) = (-1)^{\text{swaps}} \prod_{i=0}^{n-1} U_{ii} \]
where \(\text{swaps}\) counts the non-trivial row interchanges (\(\text{piv}[k] \neq k\)).
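The formula reads off directly from the packed factors. A sketch under the same conventions (`lu_det` is an illustrative name); the hard-coded factors below belong to the assumed example \(A = [[2, 1], [4, 3]]\), whose determinant is \(2\):

```python
# Sign-adjusted product of U's diagonal from the packed LU factors.
# piv[k] records the row swapped into position k (piv[k] == k: no swap).
def lu_det(LU, piv):
    d = 1.0
    for i in range(len(LU)):
        d *= LU[i][i]              # U's diagonal sits on the packed diagonal
    swaps = sum(1 for k, p in enumerate(piv) if p != k)
    return (-1) ** swaps * d

# packed factors of [[2, 1], [4, 3]]: one row swap, U diagonal (4, -0.5)
det = lu_det([[4.0, 3.0], [0.5, -0.5]], [1, 1])
```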
The inverse is computed by solving \(AX = I\) column by column: \(n\) calls to the triangular solve, each \(O(n^2)\), total \(O(n^3)\). Never compute \(A^{-1}\) explicitly when only \(A^{-1}b\) is needed; solve \(Ax = b\) directly.
| Operation | FLOPs | Notes |
|---|---|---|
| LU factorization | \(\frac{2}{3}n^3\) | Dominant one-time cost |
| Forward substitution | \(n^2\) | Per right-hand side |
| Backward substitution | \(n^2\) | Per right-hand side |
LU with partial pivoting is backward stable: the computed solution \(\hat{x}\) satisfies
\[ (A + \delta A)\,\hat{x} = b, \qquad \frac{\|\delta A\|}{\|A\|} \leq c(n)\,\rho(n)\,\varepsilon_{\mathrm{mach}} \]
where \(\rho(n) \leq 2^{n-1}\) is the growth factor and \(c(n)\) is a low-degree polynomial in \(n\). The condition number \(\kappa(A) = \|A\|\,\|A^{-1}\|\) enters only the forward error: \(\|\hat{x} - x\| / \|x\| \lesssim \kappa(A)\, c(n)\, \rho(n)\, \varepsilon_{\mathrm{mach}}\).
Every \(A \in \mathbb{R}^{m \times n}\) with \(m \geq n\) admits
\[ A = QR \]
where \(Q \in \mathbb{R}^{m \times m}\) is orthogonal ( \(Q^TQ = I\)) and \(R \in \mathbb{R}^{m \times n}\) is upper triangular. The economy (thin) form uses \(\hat{Q} \in \mathbb{R}^{m \times n}\) and \(\hat{R} \in \mathbb{R}^{n \times n}\), discarding the \(m - n\) trailing columns of \(Q\) that contribute only to the residual.
A Householder reflector is the symmetric orthogonal matrix formed as a rank-1 update of the identity:
\[ H = I - 2\hat{v}\hat{v}^T, \qquad H^T = H, \quad H^2 = I, \quad \det(H) = -1 \]
For a vector \(x \in \mathbb{R}^m\), choose \(\hat{v}\) so that \(Hx\) is a scalar multiple of \(e_1\):
\[ Hx = -\operatorname{sign}(x_0)\|x\|_2\, e_1 \]
Construction: form the unnormalized vector
\[ v = x + \operatorname{sign}(x_0)\|x\|_2\, e_1, \qquad \hat{v} = v / \|v\| \]
Sign convention: choose the sign matching \(x_0\), so that \(v_0 = x_0 + \operatorname{sign}(x_0)\|x\|\) is a sum of same-sign terms. The alternative \(v_0 = x_0 - \|x\|\) suffers catastrophic cancellation when \(x_0 \approx \|x\|\). Concisely:
\[ v_0 \mathrel{+}= \operatorname{sign}(x_0)\|x\| \]
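The construction and application of a single reflector can be sketched in pure Python (`householder_vector` is an illustrative name). \(H\) is never formed as a matrix; it is applied as \(Hx = x - 2(\hat{v}^Tx)\hat{v}\):

```python
import math

# Build the unit reflector vector v-hat with the cancellation-free sign choice.
def householder_vector(x):
    norm = math.sqrt(sum(t * t for t in x))
    v = list(x)
    sign = 1.0 if x[0] >= 0 else -1.0
    v[0] += sign * norm            # v0 = x0 + sign(x0)||x||: same-sign sum
    vnorm = math.sqrt(sum(t * t for t in v))
    return [t / vnorm for t in v]  # normalized v-hat

x = [3.0, 4.0]                     # ||x|| = 5
vhat = householder_vector(x)
s = sum(vi * xi for vi, xi in zip(vhat, x))
Hx = [xi - 2.0 * s * vi for xi, vi in zip(x, vhat)]  # ~ (-5, 0)
```

The result is \(-\operatorname{sign}(x_0)\|x\|_2\, e_1 = (-5, 0)^T\), as the formula above predicts.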
Apply \(r = \min(m-1, n)\) reflections. Step \(k\) zeros all entries of column \(k\) strictly below the diagonal by applying \(H_k = I - 2\hat{v}_k\hat{v}_k^T\) to the trailing submatrix \(R[k:m,\, k:n]\):
\[ R[k:m,\, j] \;\leftarrow\; R[k:m,\, j] - 2\hat{v}_k\!\left(\hat{v}_k^T R[k:m,\, j]\right), \quad j = k, \ldots, n-1 \]
After \(r\) steps, \(R\) is upper triangular. If the full \(Q\) is needed, it is recovered by accumulating
\[ Q = H_0 H_1 \cdots H_{r-1} \]
starting from \(Q = I_m\) and applying the reflectors in reverse order, \(H_{r-1}\) first and \(H_0\) last (backward accumulation), which is cheaper because the later reflectors act only on trailing rows and columns.
If only \(Q^Tb\) is required for a least-squares solve, apply each \(H_k\) directly to \(b\) in \(O(mn)\) total — never form \(Q\) explicitly.
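The factorization loop and the implicit application of \(Q^T\) can be sketched together (pure Python; function names are illustrative, and the sketch omits handling of an already-zero subcolumn):

```python
import math

# Householder QR on a list-of-lists matrix: each step builds v-hat for the
# subcolumn and reflects the trailing block; the v-hat vectors are kept so
# that Q^T b can be applied reflector by reflector without forming Q.
def householder_qr(A):
    m, n = len(A), len(A[0])
    R = [row[:] for row in A]
    vs = []                                   # reflector vectors v-hat_k
    for k in range(min(m - 1, n)):
        x = [R[i][k] for i in range(k, m)]
        norm = math.sqrt(sum(t * t for t in x))
        x[0] += (1.0 if x[0] >= 0 else -1.0) * norm   # safe sign choice
        vn = math.sqrt(sum(t * t for t in x))
        v = [t / vn for t in x]
        vs.append(v)
        for j in range(k, n):                 # reflect the trailing block
            s = sum(v[i] * R[k + i][j] for i in range(m - k))
            for i in range(m - k):
                R[k + i][j] -= 2.0 * s * v[i]
    return vs, R

def apply_qt(vs, b):
    # Q^T = H_{r-1} ... H_0, so H_0 hits b first; O(mn) total
    y = list(b)
    for k, v in enumerate(vs):
        s = sum(v[i] * y[k + i] for i in range(len(v)))
        for i in range(len(v)):
            y[k + i] -= 2.0 * s * v[i]
    return y
```

`apply_qt` is all a least-squares solve needs; the explicit \(Q\) is never built.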
Classical Gram-Schmidt orthogonalizes columns sequentially and is not backward stable: round-off in early columns corrupts later ones, and for ill-conditioned matrices the computed \(Q\) can be far from orthogonal.
Householder is backward stable regardless of \(\kappa(A)\):
\[ (A + \delta A) = \hat{Q}\hat{R}, \qquad \frac{\|\delta A\|_F}{\|A\|_F} = O(\varepsilon_{\mathrm{mach}}) \]
The normal equations \(A^TAx = A^Tb\) square the condition number (\(\kappa(A^TA) = \kappa(A)^2\)), so the computed solution carries a relative error of \(O(\varepsilon_{\mathrm{mach}}\,\kappa(A)^2)\); for \(\kappa(A) > 10^8\) in double precision they lose all significant digits while QR still gives correct residuals.
For the overdetermined system \(\min_x \|Ax - b\|_2\) with \(m > n\), use the isometry of \(Q\):
\[ \|Ax - b\|_2^2 = \|QRx - b\|_2^2 = \|Rx - Q^Tb\|_2^2 = \|\hat{R}x - \tilde{c}\|_2^2 + \|\bar{c}\|_2^2 \]
where \(c = Q^Tb\), \(\tilde{c} = c[0:n]\) (the matched component), and \(\bar{c} = c[n:m]\) (the unavoidable residual). The minimum is attained by solving the \(n \times n\) upper triangular system
\[ \hat{R}\,x = \tilde{c} \]
via backward substitution. The residual norm \(\|\bar{c}\|_2\) is available as a by-product at no extra cost.
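With assumed toy numbers (a \(3 \times 2\) problem where \(\hat{R}\) and \(c = Q^Tb\) are taken as given rather than computed), the final triangular solve and the free residual read:

```python
# Hypothetical post-factorization data: R-hat is 2x2 upper triangular,
# c = Q^T b splits into the matched part c[:2] and the leftover c[2:].
Rhat = [[2.0, 1.0], [0.0, 1.0]]
c = [3.0, 1.0, 2.0]
n = len(Rhat)

x = [0.0] * n
for i in range(n - 1, -1, -1):          # backward substitution: R-hat x = c[:n]
    x[i] = (c[i] - sum(Rhat[i][j] * x[j] for j in range(i + 1, n))) / Rhat[i][i]

residual = sum(t * t for t in c[n:]) ** 0.5   # ||c-bar||_2, free by-product
```

Here \(x = (1, 1)^T\) and the residual norm is \(|c_2| = 2\), with no extra work beyond the \(n^2\)-flop substitution.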
| Operation | FLOPs | Notes |
|---|---|---|
| Factorization | \(2mn^2 - \frac{2}{3}n^3\) | Building \(R\) and storing reflectors |
| Building \(Q\) explicitly | \(4m^2n - 2mn^2 + \frac{2}{3}n^3\) | Avoid when possible |
| Applying \(Q^T\) to a vector | \(2mn\) | Sufficient for least-squares |
| Backward substitution | \(n^2\) | Per right-hand side |