10  LU Factorization

Most direct methods for solving linear systems rely on matrix factorizations that reduce a complex system into multiple simpler ones. The LU factorization is the most fundamental of these, decomposing a square matrix \(\bA\) into a unit lower triangular matrix \(\bL\) and an upper triangular matrix \(\bU\). This process is essentially the matrix form of Gaussian elimination.

10.1 LU and Gaussian Elimination

Decomposing \(\bA\) into triangular factors allows us to solve \(\bA\bx = \bb\) by solving two triangular systems, which is computationally efficient.

NoteDefinition: LU Factorization

\(\bA = \bL\bU\): \[ \begin{align} \bL = \begin{pmatrix} 1 & 0 & ... \\ \ell_{21} & 1 & ... \\ \vdots & \vdots & \ddots \end{pmatrix}, \quad \bU = \begin{pmatrix} u_{11} & u_{12} & ... \\ 0 & u_{22} & ... \\ \vdots & \vdots & \ddots \end{pmatrix}. \end{align} \]

WarningExercise
  1. Compare free parameters: how many in \(\bL\) vs. \(\bU\)? Do they sum to \(n^2\)?

  2. If \(\bA = \bL\bU\) is invertible, prove \(\bL\) and \(\bU\) are also invertible.

Tip

Existence and Uniqueness. \(\bA\) admits a unique LU factorization iff all leading principal submatrices \(\bA_k\) are nonsingular (\(k=1, ..., n-1\)). The algorithm fails if a zero pivot is encountered.

Tip

Flop Cost. Computing \(\bL\) and \(\bU\) via Gaussian elimination costs .

Tip

The Solve Pattern. To solve \(\bA\bx = \bb\) given \(\bA = \bL\bU\):

  1. Solve \(\bL\by = \bb\) via forward substitution (\(O(n^2)\)).

  2. Solve \(\bU\bx = \by\) via backward substitution (\(O(n^2)\)).

Advantage: Once \(\bA\) is factored, each new \(\bb\) costs only \(O(n^2)\).

WarningExercise
  1. Factor \(\bA = \begin{pmatrix} 2 & 3 & 1 \\ 4 & 7 & 3 \\ -2 & -3 & 1 \end{pmatrix}\) by hand.

  2. Solve \(\bA\bx = (5, 15, 4)^T\) using your factors.

  3. SciPy: Use scipy.linalg.lu and explain why it always returns a permutation \(\bP\).

10.2 Pivoting and Stability

While mathematically elegant, basic LU factorization is numerically unstable if any pivot (the diagonal element used to eliminate entries below it) is small relative to the entries it is eliminating. This can lead to catastrophic rounding errors. Partial Pivoting mitigates this by ensuring that at each step, we swap rows to place the largest available entry in the pivot position.

NoteDefinition: LUP Decomposition

The result of Gaussian elimination with partial pivoting is recorded as \(\mathbf{P}\bA = \bL\bU\), where \(\bP\) is a permutation matrix that tracks the row swaps.

Tip

Stability and Growth Factor. The numerical stability of LUP is characterized by the growth factor \(\rho = \max|u_{ij}| / \max|a_{ij}|\). If \(\rho\) is large, the solution may be inaccurate. Partial pivoting keeps \(\rho\) small in practice (typically \(O(n)\)), ensuring the algorithm is backward stable. While the theoretical worst-case growth is \(2^{n-1}\), such matrices are rarely encountered in engineering applications.

WarningExercise
  1. Give a \(2 \times 2\) matrix where LU fails (due to a zero or small pivot) but LUP succeeds.

  2. Why is \(\bP^{-1}\) trivial to compute for any permutation matrix? (Hint: consider \(\bP^T\)).

  3. Complete pivoting (swapping both rows and columns) provides even stronger stability guarantees but is rarely used. Why? (Consider the trade-off between search cost and marginal stability gain).

10.3 Algorithm Cost Analysis

Understanding the computational cost is vital for choosing the right solver for large-scale engineering problems. The bulk of the work is in the factorization step.

Tip

Total Cost for \(k\) Right-Hand Sides. For a system with \(n\) unknowns and \(k\) different right-hand side vectors \(\bb\), the total cost is: \[ \begin{align} \text{Cost} = \frac{2}{3}n^3 + 2kn^2 \text{ flops}. \end{align} \] When \(k\) is large, the \(O(n^3)\) cost of factorization is amortized over many \(O(n^2)\) solves. For example, solving for 500 \(\bb\)’s for \(n=1000\) using one factorization is significantly faster than re-factorizing for each new \(\bb\).

Forward substitution solves \(\bL\by = \bb\) where \(\ell_{ii} = 1\). The \(i\)-th entry \(y_i\) is given by: \[ \begin{align} y_i = b_i - \sum_{j=1}^{i-1} \ell_{ij} y_j. \end{align} \] Counting the floating-point operations:

  • For a fixed \(i\), the summation requires \(i-1\) multiplications and \(i-1\) additions/subtractions.

  • Total flops: \(\sum_{i=1}^n 2(i-1) = 2 \sum_{k=0}^{n-1} k = 2 \frac{(n-1)n}{2} = n^2 - n\).

For large \(n\), each triangular solve costs approximately \(n^2\) flops.

WarningExercise
  1. Sum the operations for forward substitution and show the cost is exactly \(n^2 - n\).

  2. Use scipy.linalg.lu\_factor and lu\_solve for a system with 3 right-hand sides. Verify residuals.