14 Stability and Condition Number

Sensitivity of problems vs. robustness of algorithms.

Conditioning asks whether the mathematical problem is sensitive to small changes in the data. Stability asks whether the algorithm introduces only small changes to the problem. A good computation needs both: a well-conditioned problem and a stable algorithm.

14.1 Residual, Forward Error, and Backward Error

Definition: Residual and Forward Error

For a computed solution \(\hat{\bx}\) to the linear system \(\bA\bx=\bb\), the residual and forward error are \[ \begin{align} \br = \bb-\bA\hat{\bx} \qquad \text{and} \qquad \be = \hat{\bx}-\bx^*, \end{align} \] where \(\bx^*\) is the exact solution.

Residual: How well the computed answer satisfies the equations.
Forward error: How close the computed answer is to the exact answer.

Example

(Small residual, large error) Let \[ \begin{align} \bA=\begin{pmatrix}1&0\\0&10^{-8}\end{pmatrix}, \qquad \bb=\begin{pmatrix}1\\10^{-8}\end{pmatrix}. \end{align} \] The exact solution is \(\bx^*=(1,1)^T\). The approximation \(\hat{\bx}=(1,0)^T\) has residual \[ \begin{align} \br=\bb-\bA\hat{\bx}=(0,10^{-8})^T, \end{align} \] which is tiny in absolute norm, but the solution error is \((0,-1)^T\). The small singular value makes one direction hard to determine accurately.
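The numbers in this example can be checked with a few lines of NumPy. This is only a minimal sketch; the variable names (`x_exact`, `x_hat`) are chosen here for illustration and are not part of the original notes.

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 1e-8]])
b = np.array([1.0, 1e-8])
x_exact = np.array([1.0, 1.0])   # exact solution x*
x_hat = np.array([1.0, 0.0])     # the poor approximation from the example

r = b - A @ x_hat                # residual: how well x_hat satisfies the equations
e = x_hat - x_exact              # forward error: distance to the exact answer

print("residual      :", r, " norm =", np.linalg.norm(r))
print("forward error :", e, " norm =", np.linalg.norm(e))
```

The residual norm is of order \(10^{-8}\) while the forward error has norm 1, which is the mismatch the example describes.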

Exercise

Return to the definition above. Explain why the residual is computable but the forward error usually is not.
Verify every number in the small-residual example above.
Change the small diagonal entry from \(10^{-8}\) to \(10^{-12}\). How do the residual and forward error change?

Theorem: Residual-Error Bound

Let \(\bA\) be nonsingular, \(\bx^*\) solve \(\bA\bx=\bb\), and \(\hat{\bx}\) have residual \(\br=\bb-\bA\hat{\bx}\). If \(\bb\neq\bzero\), then \[ \begin{align} \frac{\|\hat{\bx}-\bx^*\|}{\|\bx^*\|} \leq \kappa(\bA)\frac{\|\br\|}{\|\bb\|}. \end{align} \] Thus a small relative residual guarantees a small relative forward error only when the problem is well-conditioned.
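As a numerical sanity check of the bound, the sketch below builds a random system with a known solution and a synthetic "computed" solution, then compares the relative forward error with \(\kappa(\bA)\) times the relative residual. The random test matrix, the seed, and the perturbation size \(10^{-6}\) are assumptions made for illustration; the 2-norm is used throughout.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50
A = rng.standard_normal((n, n))
x_exact = rng.standard_normal(n)
b = A @ x_exact                        # x_exact solves A x = b up to rounding

# Synthetic "computed" solution: the exact one plus a small perturbation.
x_hat = x_exact + 1e-6 * rng.standard_normal(n)
r = b - A @ x_hat

rel_err = np.linalg.norm(x_hat - x_exact) / np.linalg.norm(x_exact)
rel_res = np.linalg.norm(r) / np.linalg.norm(b)
kappa = np.linalg.cond(A)              # 2-norm condition number

print(f"relative residual         = {rel_res:.3e}")
print(f"relative forward error    = {rel_err:.3e}")
print(f"kappa * relative residual = {kappa * rel_res:.3e}")
```

For a typical random matrix the bound holds with room to spare; it is a worst-case statement, not a prediction of the actual error.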

Exercise

Starting from \(\br=\bb-\bA\hat{\bx}\) and \(\bb=\bA\bx^*\), show that \(\br=\bA(\bx^*-\hat{\bx})\).
Use the previous step to prove \(\|\hat{\bx}-\bx^*\|\leq \|\bA^{-1}\|\|\br\|\).
Use \(\|\bb\|\leq \|\bA\|\|\bx^*\|\) to finish the proof of the residual-error bound.
Apply the bound to the small-residual example above. Is the bound sharp?

14.2 Why Factorize instead of Invert?

Remark

(Explicit inverse rule) Never compute \(\bA^{-1}\) explicitly just to solve \(\bA\bx=\bb\).

Cost: Computing \(\bA^{-1}\) takes about \(3\times\) as many flops as an LU factorization (roughly \(2n^3\) vs. \(\frac{2}{3}n^3\)).
Stability: Round-off in \(\bA^{-1}\) propagates into the entire product \(\bA^{-1}\bb\). Stable factorizations confine errors.
Efficiency: LU allows solving for new \(\bb\)’s in \(O(n^2)\) without storing a dense inverse.
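The "factor once, solve many" pattern can be sketched with SciPy's `lu_factor` and `lu_solve`. The matrix size, seed, and number of right-hand sides below are arbitrary choices for illustration, and the explicit inverse appears only for comparison.

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(1)
n, k = 200, 10
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, k))        # k right-hand sides stored as columns

# Factor once: about (2/3) n^3 flops.
lu, piv = lu_factor(A)

# Every right-hand side then costs only O(n^2) (two triangular solves).
X_lu = lu_solve((lu, piv), B)

# For comparison only: form the dense inverse and multiply.
X_inv = np.linalg.inv(A) @ B

res_lu = np.linalg.norm(B - A @ X_lu) / np.linalg.norm(B)
res_inv = np.linalg.norm(B - A @ X_inv) / np.linalg.norm(B)
print(f"relative residual, LU reuse: {res_lu:.2e}")
print(f"relative residual, inverse : {res_inv:.2e}")
```

On a well-conditioned random matrix both residuals are tiny, so the difference between the two approaches is mostly one of cost; the stability gap tends to show up on ill-conditioned matrices.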

Exercise

Use the LU factorization costs from the LU chapter to compare total flops for solving 10 systems via \(\bA^{-1}\) vs. LU when \(n=500\).
Construct \(\bA = \text{diag}(1, 1, 10^{-12})\). Compare residuals of inv(A) @ b and solve(A, b) using the result above.
Explain the outcome using the distinction between residual and forward error from Section 14.1.

14.3 Numerical Stability

A property of the algorithm.

Definition: Backward Stability

An algorithm is backward stable if its computed \(\hat{\bx}\) is the exact solution to a nearby problem: \((\bA + \delta \bA)\hat{\bx} = \bb + \delta \bb\), with \(\|\delta \bA\|/\|\bA\| = O(\varepsilon_{\text{mach}})\) and \(\|\delta \bb\|/\|\bb\| = O(\varepsilon_{\text{mach}})\).

Theorem: Normwise Backward Error for Linear Systems

For a computed vector \(\hat{\bx}\) with residual \(\br=\bb-\bA\hat{\bx}\), the normwise relative backward error for the linear system \(\bA\bx=\bb\) is \[ \begin{align} \eta(\hat{\bx}) = \frac{\|\br\|} {\|\bA\|\|\hat{\bx}\|+\|\bb\|}. \end{align} \] This is the smallest relative perturbation size, measured normwise in both \(\bA\) and \(\bb\), that makes \(\hat{\bx}\) an exact solution of a nearby system.
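Because \(\eta(\hat{\bx})\) involves only \(\bA\), \(\bb\), and \(\hat{\bx}\), it can be evaluated directly. The helper `backward_error` below is a hypothetical name introduced for this sketch, and the 2-norm is an assumed choice; the test system is random.

```python
import numpy as np

def backward_error(A, x_hat, b):
    """Normwise relative backward error eta(x_hat), measured in the 2-norm."""
    r = b - A @ x_hat
    denom = np.linalg.norm(A, 2) * np.linalg.norm(x_hat) + np.linalg.norm(b)
    return np.linalg.norm(r) / denom

rng = np.random.default_rng(2)
A = rng.standard_normal((100, 100))
b = rng.standard_normal(100)
x_hat = np.linalg.solve(A, b)          # LAPACK LU with partial pivoting

eta = backward_error(A, x_hat, b)
print(f"eta = {eta:.2e}, machine epsilon = {np.finfo(float).eps:.2e}")
```

A value of \(\eta\) near machine epsilon says that the computed solution is as good as the rounding of the input data allows, regardless of how large the forward error is.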

Exercise

Suppose \((\bA+\delta\bA)\hat{\bx}=\bb+\delta\bb\). Show that \(\br=\delta\bb-\delta\bA\hat{\bx}\).
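Use the triangle inequality to prove the upper bound \[\|\br\|\leq \|\delta\bA\|\|\hat{\bx}\|+\|\delta\bb\|.\]
Assume \(\|\delta\bA\|\leq\varepsilon\|\bA\|\) and \(\|\delta\bb\|\leq\varepsilon\|\bb\|\), then divide by \(\|\bA\|\|\hat{\bx}\|+\|\bb\|\) to conclude that \(\varepsilon\geq\eta(\hat{\bx})\).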
Construct a rank-one perturbation in the direction of \(\br\) to see why the bound can be attained.
Compare this computable backward error with the residual-error bound in Section 14.1.

Remark

GEPP Stability: Gaussian Elimination with Partial Pivoting is backward stable for virtually all practical matrices. In practice the growth factor \(\rho\) stays small, so the exponential amplification permitted by the worst-case bound \(\rho\leq 2^{n-1}\) is almost never observed.

Remark

(Stable algorithms) Householder QR and Cholesky for SPD systems are backward stable under their usual assumptions. LU with partial pivoting is reliable in practice, though its worst-case growth factor can be large.
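The gap between typical and worst-case growth can be seen numerically. The sketch below uses the common proxy \(\max|u_{ij}|/\max|a_{ij}|\) for the growth factor (an assumption; the full definition also tracks intermediate entries) and compares a random matrix with the classic worst-case matrix that has 1 on the diagonal, \(-1\) below it, and 1 in the last column. The helper name `growth_factor` is hypothetical.

```python
import numpy as np
from scipy.linalg import lu

def growth_factor(A):
    """Proxy for the GEPP growth factor: max|U| / max|A|."""
    _, _, U = lu(A)                    # A = P @ L @ U with partial pivoting
    return np.abs(U).max() / np.abs(A).max()

rng = np.random.default_rng(3)
n = 10
rho_random = growth_factor(rng.standard_normal((n, n)))

# Worst-case example: 1 on the diagonal, -1 below it, 1 in the last column.
W = np.eye(n) - np.tril(np.ones((n, n)), k=-1)
W[:, -1] = 1.0
rho_worst = growth_factor(W)

print(f"random matrix : rho = {rho_random:.2f}")
print(f"worst case    : rho = {rho_worst:.0f}  (2**(n-1) = {2**(n-1)})")
```

The last column of \(U\) doubles at every elimination step for the structured matrix, which is why the bound \(2^{n-1}\) is attained there but essentially never on random input.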

Exercise

Use the definition of backward stability to explain why backward stability is a natural standard when input data are already uncertain.
Explain why a small residual does not always guarantee a small forward error. Use the residual-error bound from Section 14.1.
Compute \(\eta(\hat{\bx})\) from the normwise backward error theorem for the small-residual example in Section 14.1.
Histogram growth factors for 100 random matrices via scipy.linalg.lu.

14.4 Condition Number

A property of the problem.

Definition: Condition Number

\(\kappa(\bA) = \|\bA\|\|\bA^{-1}\|\).

Interpretation: Worst-case error amplification. A relative perturbation of size \(\varepsilon\) in \(\bb\) can cause a relative error of up to \(\kappa(\bA) \varepsilon\) in \(\bx\).
Rule of Thumb: Expect roughly \(16 - \log_{10}(\kappa(\bA))\) correct decimal digits in double precision, where \(\varepsilon_{\text{mach}}\approx 10^{-16}\).

Proof

Let \(\bA\bx = \bb\) be the original system and \(\bA(\bx+\delta\bx) = \bb+\delta\bb\) the perturbed system. Subtracting gives \(\bA \delta\bx = \delta\bb\), so \(\delta\bx = \bA^{-1} \delta\bb\). Taking norms: \[ \begin{align} \|\delta\bx\| \leq \|\bA^{-1}\| \|\delta\bb\|. \end{align} \] Also, from \(\bA\bx = \bb\), we have \(\|\bb\| \leq \|\bA\| \|\bx\|\), which implies \(1/\|\bx\| \leq \|\bA\|/\|\bb\|\). Combining these inequalities for the relative error: \[ \begin{align} \frac{\|\delta\bx\|}{\|\bx\|} &\leq \|\bA^{-1}\| \|\delta\bb\| \frac{\|\bA\|}{\|\bb\|} \\ &= (\|\bA\| \|\bA^{-1}\|) \frac{\|\delta\bb\|}{\|\bb\|} \\ &= \kappa(\bA) \frac{\|\delta\bb\|}{\|\bb\|}. \end{align} \] Thus, the condition number \(\kappa(\bA)\) acts as the amplification factor for relative perturbations in the right-hand side.
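The bound in this argument is attained for particular directions of \(\bb\) and \(\delta\bb\). The sketch below builds a matrix with a prescribed 2-norm condition number from an assumed SVD \(\bA = U\,\mathrm{diag}(s)\,V^T\), places \(\bb\) along the largest singular direction and \(\delta\bb\) along the smallest, and measures the amplification. The size, seed, target \(\kappa = 10^6\), and perturbation size are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(4)
n, kappa_target = 6, 1e6

# Build A = U diag(s) V^T with singular values from 1 down to 1/kappa_target.
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
s = np.logspace(0, -np.log10(kappa_target), n)
A = U @ np.diag(s) @ V.T

b = U[:, 0]                  # right-hand side along the largest singular direction
db = 1e-10 * U[:, -1]        # perturbation along the smallest singular direction

x = np.linalg.solve(A, b)
dx = np.linalg.solve(A, b + db) - x

amplification = (np.linalg.norm(dx) / np.linalg.norm(x)) / (
    np.linalg.norm(db) / np.linalg.norm(b))

print(f"kappa_2(A)    = {np.linalg.cond(A):.3e}")
print(f"amplification = {amplification:.3e}")
```

With these directions the measured amplification matches \(\kappa_2(\bA)\) up to rounding; for a generic \(\bb\) and \(\delta\bb\) it is usually much smaller.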

Exercise

Reproduce the perturbation argument above without looking.
Compute \(\kappa_2(H_n)\) for Hilbert matrices \(n=3,\dots,10\). How fast does it grow?
Use the definition of backward stability, the normwise backward error, and the perturbation bound above to explain why a stable algorithm on an ill-conditioned problem may still lose many digits.
The Neumann Series: Use it to prove the perturbation bound: \(\frac{\|\delta \bx\|}{\|\bx\|} \leq \frac{\kappa(\bA)}{1 - ...}\).

14.5 Solver Selection Hierarchy

Theorem: Solver Selection by Structure

Choose a solver from the most specific trustworthy structure in the problem:

If \(\bA\) is SPD and dense, use Cholesky; if it is SPD and large sparse, use CG with a preconditioner.
If \(\bA\) is general, square, dense, and nonsingular, use LU with partial pivoting.
If the problem is full-rank least squares, use Householder QR.
If the problem is rank-deficient, nearly rank-deficient, or requires numerical rank information, use the SVD.
If \(\bA\) is large, sparse, nonsymmetric, and only matrix-vector products are practical, use GMRES or another nonsymmetric Krylov method.

The more structure a solver exploits, the cheaper it can be; the less trustworthy the structure, the more robust the solver must be.
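For small dense problems, the dense branches of this hierarchy can be sketched as a dispatcher. Everything here is an illustrative assumption rather than a recommended implementation: the function name `solve_dense`, the tolerance `rank_tol`, the crude rank test on the diagonal of \(R\) (unpivoted QR is not a reliable rank detector in general), and the use of the generic `np.linalg.solve` where a dedicated triangular solve would normally be used. The sparse Krylov branches (CG, GMRES) are omitted.

```python
import numpy as np

def solve_dense(A, b, rank_tol=1e-10):
    """Hypothetical dispatcher for small dense problems; returns (x, method)."""
    m, n = A.shape
    if m == n:
        if np.allclose(A, A.T):
            try:
                L = np.linalg.cholesky(A)      # succeeds only if A is numerically SPD
                y = np.linalg.solve(L, b)      # forward substitution (generic solve here)
                return np.linalg.solve(L.T, y), "cholesky"
            except np.linalg.LinAlgError:
                pass                           # symmetric but not positive definite
        try:
            return np.linalg.solve(A, b), "lu" # LAPACK LU with partial pivoting
        except np.linalg.LinAlgError:
            pass                               # exactly singular: fall back to the SVD
    else:
        Q, R = np.linalg.qr(A)                 # Householder QR, reduced form
        d = np.abs(np.diag(R))
        if m > n and d.min() > rank_tol * d.max():
            return np.linalg.solve(R, Q.T @ b), "qr"
    x, *_ = np.linalg.lstsq(A, b, rcond=None)  # SVD-based minimum-norm least squares
    return x, "svd"

rng = np.random.default_rng(6)
M = rng.standard_normal((5, 5))
T = rng.standard_normal((8, 3))
D = T.copy()
D[:, 2] = D[:, 0]                              # duplicate column: rank-deficient

x_spd, m_spd = solve_dense(M.T @ M + np.eye(5), np.ones(5))
x_gen, m_gen = solve_dense(M, np.ones(5))
x_ls, m_ls = solve_dense(T, np.ones(8))
x_def, m_def = solve_dense(D, np.ones(8))
print(m_spd, m_gen, m_ls, m_def)
```

The order of the tests mirrors the principle stated above: the most specific structure is tried first, and the most robust method is the fallback.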

Exercise

For each case in the solver selection theorem above, identify the mathematical structure being used: symmetry, positive definiteness, sparsity, full column rank, or numerical rank.
Explain why Cholesky is inappropriate if the SPD assumption fails.
Explain why QR is preferred to normal equations for full-rank least squares, using the result above.
Explain why the SVD is the natural fallback when rank is uncertain, using the result above.
Explain why CG and GMRES fit the Krylov approximation principle from the result above.

Definition: Iterative Refinement

The accuracy of a computed solution \(\hat{\bx}\) can be improved by iterative refinement:

Compute the residual \(\br = \bb - \bA\hat{\bx}\).
Solve \(\bA \boldsymbol{\delta} = \br\).
Update the estimate: \(\hat{\bx} \leftarrow \hat{\bx} + \boldsymbol{\delta}\).

Each step recovers digits lost due to ill-conditioning, up to the limits of machine precision.
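The three steps can be sketched in a mixed-precision setting, which makes the recovered digits easy to see: the correction solves run in single precision while the residual is computed in double precision. This setup, the helper names `refine` and `solve_low`, and the well-conditioned random test matrix are assumptions made for illustration; a practical code would also reuse one LU factorization instead of re-solving from scratch.

```python
import numpy as np

def refine(A, b, x_hat, solve_low, steps=3):
    """Iterative refinement: residual in float64, correction from solve_low."""
    x = x_hat.astype(np.float64)
    for _ in range(steps):
        r = b - A @ x                  # step 1: residual in double precision
        delta = solve_low(r)           # step 2: solve A delta = r (low precision)
        x = x + delta                  # step 3: update the estimate
    return x

rng = np.random.default_rng(5)
n = 100
A = rng.standard_normal((n, n)) + n * np.eye(n)   # well-conditioned test matrix
x_exact = np.ones(n)
b = A @ x_exact

A32 = A.astype(np.float32)

def solve_low(rhs):
    """Single-precision solve, returned as float64."""
    return np.linalg.solve(A32, rhs.astype(np.float32)).astype(np.float64)

x0 = solve_low(b)
x1 = refine(A, b, x0, solve_low, steps=3)

err0 = np.linalg.norm(x0 - x_exact) / np.linalg.norm(x_exact)
err1 = np.linalg.norm(x1 - x_exact) / np.linalg.norm(x_exact)
print(f"relative error before refinement: {err0:.2e}")
print(f"relative error after refinement : {err1:.2e}")
```

The improvement depends on the residual being more accurate than the solve that produced \(\hat{\bx}\); when both are done in the same precision, the gain is typically much smaller.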

Remark

(Limitations) Iterative refinement helps when the residual can be computed accurately and the correction equation can be solved reliably. It cannot overcome a problem whose condition number is so large that the desired digits are not present in the data.

Exercise

Solve \(H_{10}\bx = H_{10}\mathbf{1}\). Apply one step of iterative refinement as defined above. How many digits are recovered?
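Compare forward errors of solve (LU) vs lstsq on a Hilbert system. Note that lstsq in NumPy and SciPy is SVD-based by default; in SciPy, lapack_driver='gelsy' selects a pivoted-QR driver instead. Interpret the outcome using the condition number of the Hilbert matrix.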
Choose a solver for each case in the solver selection theorem above: SPD, dense general square, full-rank least squares, rank-deficient least squares, large sparse SPD, and large sparse nonsymmetric. Justify each choice by citing one earlier result or exercise.