18  Singular Value Decomposition

The Singular Value Decomposition (SVD) is a general matrix factorization that decomposes any \(m \times n\) matrix into a product of orthogonal transformations and a diagonal scaling matrix.

18.1 The SVD Theorem

NoteTheorem: Singular Value Decomposition (SVD)

For \(\bA \in \fR^{m \times n}\), there exist orthogonal matrices \(\bU \in \fR^{m \times m}, \bV \in \fR^{n \times n}\) and diagonal \(\bsigma \in \fR^{m \times n}\) such that: \[ \begin{align} \bA = \bU \bsigma \bV^T. \end{align} \]

  • Singular Values (\(\sigma_i\)): Non-negative, ordered \(\sigma_1 \geq \sigma_2 \geq ... \geq 0\).

  • Modes: \(\bA = \sum_{i=1}^r \sigma_i \bu_i \bv_i^T\) (Outer product expansion).

Tip

Geometric Interpretation: \(\bA\) rotates the input space (\(\bV^T\)), scales along the axes (\(\bsigma\)), and rotates the result to the output space (\(\bU\)).

18.2 The Four Fundamental Subspaces

The SVD provides optimal orthonormal bases for the spaces defined by \(\bA\) (rank \(r\)):

  1. First \(r\) columns of \(\bU\).

  2. Last \(n-r\) columns of \(\bV\).

  3. First \(r\) columns of \(\bV\).

  4. Last \(m-r\) columns of \(\bU\).

NoteTheorem: Relationship to Eigendecomposition
  • Right singular vectors (\(\bv_i\)) are eigenvectors of \(\bA^T\bA\).

  • Left singular vectors (\(\bu_i\)) are eigenvectors of \(\bA\bA^T\).

  • Singular values \(\sigma_i = \sqrt{\lambda_i(\bA^T\bA)}\).

18.3 Low-Rank Approximation

NoteTheorem: Eckart-Young-Mirsky Theorem

The best rank-\(k\) approximation (\(k < r\)) of \(\bA\) in spectral and Frobenius norms is the truncated SVD: \[ \begin{align} \bA_k = \sum_{i=1}^k \sigma_i \bu_i \bv_i^T. \end{align} \]

  • Error: \(\|\bA - \bA_k\|_2 = \sigma_{k+1}\).
Tip

Applications: Truncated SVD provides the computational basis for Principal Component Analysis (PCA), image compression, and latent semantic analysis.

18.4 The Pseudoinverse

NoteDefinition: Moore-Penrose Pseudoinverse

\(\bA^+ = \bV \bsigma^+ \bU^T\), where \((\bsigma^+)_{ii} = 1/\sigma_i\) if \(\sigma_i > 0\), else \(0\).

  • Minimum-Norm Solution: For any system \(\bA\bx = \bb\), the vector \(\hat{\bx} = \bA^+\bb\) is the shortest vector minimizing \(\|\bA\bx - \bb\|_2\).
WarningExercise
  1. Compute the SVD of \(\bA = \begin{pmatrix} 3 & 1 \\ 1 & 2 \end{pmatrix}\) by hand.

  2. Compression: Load a \(512 \times 512\) image. Plot the relative error \(\|\bA - \bA_k\|_F / \|\bA\|_F\) vs. \(k\). At what \(k\) is the image recognizable?

  3. Use the pseudoinverse to find the least squares solution to a rank-deficient \(3 \times 3\) system.