11  Symmetric Positive Definite (SPD) Matrices

Symmetric Positive Definite (SPD) matrices form one of the most important classes of matrices in scientific computing. They possess several remarkable properties that allow for algorithms that are twice as fast and significantly more stable than those for general matrices.

11.1 Definitions and Properties

SPD matrices arise naturally in a wide variety of engineering contexts, including optimization (as Hessian matrices of convex functions), statistics (as covariance matrices), and physical simulations (as stiffness matrices in structural mechanics).

NoteDefinition: Positive Definite

A symmetric matrix \(\bA \in \fR^{n \times n}\) is positive definite (PD) if the quadratic form \(\bx^T\bA\bx\) is strictly positive for every nonzero vector \(\bx \in \fR^n\): \[ \begin{align} \bx^T\bA\bx > 0 \quad \text{for every nonzero } \bx \in \fR^n. \end{align} \] If \(\bA\) is both symmetric and PD, it is SPD.

WarningExercise
  1. Test \(\bA = \begin{pmatrix}2 & 1 \\ 1 & 2\end{pmatrix}\) for PD property by expanding \(\bx^T\bA\bx\).

  2. Prove a diagonal matrix \(\bD\) is PD iff all diagonal entries \(d_i > 0\).

Tip

Spectral Characterization. A symmetric matrix is PD iff all its eigenvalues are strictly positive. Every SPD matrix is invertible (\(\det \bA = \prod \lambda_i > 0\)).

WarningExercise
  1. If \(\bA\) is SPD, show \(\bB = \bA + c\bI\) for \(c > 0\) is also SPD.

  2. Use np.linalg.eigvalsh to verify SPD property for several test matrices.

11.2 Geometry and Metric

An SPD matrix \(\bA\) can be thought of as defining a new geometry or “metric” on \(\fR^n\), where the distance and angle between vectors are weighted by the entries of \(\bA\). This perspective is fundamental to understanding the behavior of iterative solvers like the Conjugate Gradient method.

NoteDefinition: \(\bA\)-Inner Product

For SPD \(\bA\), the mapping \(\langle \bx, \by \rangle_\bA = \bx^T\bA\by\) satisfies the axioms of an inner product.

  • \(\bA\)-Norm: The induced norm is \(\|\bx\|_\bA = \sqrt{\bx^T\bA\bx}\).

  • Ellipsoid: The unit ball in this norm, \(\{\bx : \|\bx\|_\bA = 1\}\), is an \(n\)-dimensional ellipsoid whose principal axes are aligned with the eigenvectors of \(\bA\).

Tip

Physical Meaning: Directions with large eigenvalues are stiff'' orexpensive’’ (small axes in the unit ellipsoid); small eigenvalues represent ``sensitive’’ directions.

WarningExercise
  1. For \(\bA = \begin{pmatrix} 4 & 0 \\ 0 & 1 \end{pmatrix}\), compute the distance between \((1,0)^T\) and origin in both Euclidean and \(\bA\)-metrics.

  2. Where does the \(\bA\)-metric appear in statistics? (Mahalanobis distance).

11.3 Cholesky Decomposition

The symmetry and positive definiteness of \(\bA\) allow for a specialized version of the LU factorization known as the Cholesky decomposition. The decomposition can be interpreted as a generalized square root for SPD matrices, where \(\bA = \bL\bL^T\).

NoteDefinition: Cholesky Decomposition

Every SPD matrix \(\bA\) admits a unique decomposition \(\bA = \bL\bL^T\), where \(\bL\) is lower triangular with strictly positive diagonal entries.

Tip

Flop Cost. The computational cost of the Cholesky decomposition is , approximately half that of a general LU factorization.

Tip

Stability Advantage: Unlike general LU, Cholesky is guaranteed to be stable without pivoting. In code, np.linalg.cholesky is the standard numerical test for positive definiteness; if it raises a LinAlgError, the matrix is not PD.

WarningExercise
  1. Factor \(\bA = \begin{pmatrix} 4 & 2 \\ 2 & 3 \end{pmatrix}\) by hand.

  2. Why does Cholesky not require row swaps?

  3. Solve \(\bA\bx = \bb\) using Cholesky for \(\bA = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}\).

WarningExercise

(Applications)

  1. Covariance: Generate X = np.random.randn(100, 4). Verify X.T @ X is SPD via Cholesky.

  2. Gram Matrix: Prove that \(\mathbf{G} = \bX^T\bX\) is at least positive semi-definite for any \(\bX\), and PD if \(\bX\) has full column rank.

For any nonzero vector \(\bv \in \fR^n\), consider the quadratic form: \[ \begin{align} \bv^T \mathbf{G} \bv &= \bv^T (\bX^T \bX) \bv \\ &= (\bX \bv)^T (\bX \bv) \\ &= \|\bX \bv\|_2^2. \end{align} \] Since the squared Euclidean norm is always non-negative, \(\bv^T \mathbf{G} \bv \geq 0\), proving \(\mathbf{G}\) is positive semi-definite. If \(\bX\) has full column rank, then \(\bX \bv = \bzero\) iff \(\bv = \bzero\). Thus \(\bv^T \mathbf{G} \bv > 0\) for all nonzero \(\bv\), making \(\mathbf{G}\) positive definite.

WarningExercise

(SPD Condition Number) \(\kappa_2(\bA) = \lambda_{\max} / \lambda_{\min}\).

  1. If \(\bA\) is SPD with eigenvalues \(\{10, 0.01\}\), what is \(\kappa_2(\bA)\)?

  2. Verify \(\kappa_2(\bL) = \sqrt{\kappa_2(\bA)}\) where \(\bL\) is the Cholesky factor.