Sensitivity of problems vs. robustness of algorithms.
Why Factorize instead of Invert?
Rule: Never compute \(\bA^{-1}\) explicitly.
Cost: Computing \(\bA^{-1}\) takes \(3\times\) longer than LU (\(\frac{8}{3}n^3\) vs. \(\frac{2}{3}n^3\)).
Stability: Round-off in \(\bA^{-1}\) propagates into the entire product \(\bA^{-1}\bb\). Stable factorizations confine errors.
Efficiency: LU allows solving for new \(\bb\)’s in \(O(n^2)\) without storing a dense inverse.
For \(n=500\), compare total flops for solving 10 systems via \(\bA^{-1}\) vs. LU.
Construct \(\bA = \text{diag}(1, 1, 10^{-12})\). Compare residuals of inv(A) @ b and solve(A, b).
Numerical Stability
A property of the algorithm.
An algorithm is backward stable if its computed \(\hat{\bx}\) is the exact solution to a nearby problem: \((\bA + \delta \bA)\hat{\bx} = \bb + \delta \bb\), with \(\|\delta \bA\|/\|\bA\| = O(\varepsilon_{\text{mach}})\).
GEPP Stability: Gaussian Elimination with Partial Pivoting is backward stable for virtually all practical matrices. The growth factor \(\rho\) stays small, preventing exponential error amplification.
Explain why a small residual doesn’t always guarantee a small error (Forward Error).
Histogram growth factors for 100 random matrices via scipy.linalg.lu.
Condition Number
A property of the problem.
\(\kappa(\bA) = \|\bA\|\|\bA^{-1}\|\).
Interpretation: Worst-case error amplification. A perturbation of size \(\varepsilon\) in \(\bb\) can cause error up to \(\kappa(\bA) \varepsilon\) in \(\bx\).
Rule of Thumb: You expect at most \(16 - \log_{10}(\kappa(\bA))\) correct digits in double precision.
Let \(\bA\bx = \bb\) be the original system and \(\bA(\bx+\delta\bx) = \bb+\delta\bb\) the perturbed system. Subtracting gives \(\bA \delta\bx = \delta\bb\), so \(\delta\bx = \bA^{-1} \delta\bb\). Taking norms: \[
\begin{align}
\|\delta\bx\| \leq \|\bA^{-1}\| \|\delta\bb\|.
\end{align}
\] Also, from \(\bA\bx = \bb\), we have \(\|\bb\| \leq \|\bA\| \|\bx\|\), which implies \(1/\|\bx\| \leq \|\bA\|/\|\bb\|\). Combining these inequalities for the relative error: \[
\begin{align}
\frac{\|\delta\bx\|}{\|\bx\|} &\leq \|\bA^{-1}\| \|\delta\bb\| \frac{\|\bA\|}{\|\bb\|} \\
&= (\|\bA\| \|\bA^{-1}\|) \frac{\|\delta\bb\|}{\|\bb\|} \\
&= \kappa(\bA) \frac{\|\delta\bb\|}{\|\bb\|}.
\end{align}
\] Thus, the condition number \(\kappa(\bA)\) acts as the amplification factor for relative perturbations in the right-hand side.
Compute \(\kappa_2(H_n)\) for Hilbert matrices \(n=3, ..., 10\). How fast does it grow?
The Neumann Series: Use it to prove the perturbation bound: \(\frac{\|\delta \bx\|}{\|\bx\|} \leq \frac{\kappa(\bA)}{1 - ...}\).
Solver Selection Hierarchy
Algorithms are typically selected based on a trade-off between numerical stability and computational efficiency:
SVD: Most stable, most expensive (\(O(n^3)\) with large constant).
Householder QR: Very stable, no growth factor (\(O(mn^2)\)).
Cholesky: Stable for SPD systems, half the cost of LU.
GEPP (LU with Pivoting): Standard for general systems.
LU without Pivoting: Dangerously unstable; avoid.
The accuracy of a computed solution \(\hat{\bx}\) can be improved by iterative refinement:
Compute the residual \(\br = \bb - \bA\hat{\bx}\).
Solve \(\bA \boldsymbol{\delta} = \br\).
Update the estimate: \(\hat{\bx} \leftarrow \hat{\bx} + \boldsymbol{\delta}\).
Each step recovers digits lost due to ill-conditioning, up to the limits of machine precision.
Solve \(H_{10}\bx = H_{10}\mathbf{1}\). Apply one step of iterative refinement. How many digits are recovered?
Compare forward errors of solve (LU) vs lstsq (QR) on a Hilbert system.