12  Orthogonal Matrices

Transformations that preserve lengths and angles. Numerically ideal due to optimal conditioning (\(\kappa_2(\bQ) = 1\)).

12.1 Orthogonality and Orthonormal Bases

NoteDefinition: Orthogonal Vectors

\(\bx, \by \in \fR^n\) are orthogonal if \(\bx^T\by = 0\).

  • Mutually Orthogonal Set: \(\bv_i^T\bv_j = 0\) for all \(i \neq j\).
NoteDefinition: Orthonormal Basis

A set \(\{\bq_1, ..., \bq_n\}\) where \(\bq_i^T\bq_j = \delta_{ij}\) (unit length and mutually orthogonal).

  • Matrix Form: If \(\bQ = [\bq_1, ..., \bq_n]\), then \(\bQ^T\bQ = \bI\).
Tip

Efficiency in Orthonormal Bases. In an orthonormal basis, the coefficients of a vector \(\bv = \sum c_i \bq_i\) are given by standard inner products: \(c_i = \bq_i^T\bv\). This allows the solution of a linear system, which generally requires \(O(n^3)\) operations, to be computed using a sequence of inner products in \(O(n^2)\) time.

WarningExercise
  1. Verify \(\bq_1 = \frac{1}{\sqrt{2}}(1, 1)^T, \bq_2 = \frac{1}{\sqrt{2}}(1, -1)^T\) is an orthonormal basis for \(\fR^2\).

  2. Express \(\bv = (3, -1)^T\) in this basis.

  3. Prove that any set of \(n\) mutually orthogonal nonzero vectors is linearly independent.

NoteDefinition: Orthogonal Matrix

Square matrix \(\bQ\) where \(\bQ^T = \bQ^{-1}\). Columns (and rows) form an orthonormal basis.

NoteTheorem: Invariance Properties

For any orthogonal \(\bQ\) and vectors \(\bx, \by\):

  1. Inner Product Preserving: \((\bQ\bx)^T(\bQ\by) = \bx^T\by\).

  2. Isometry: \(\|\bQ\bx\|_2 = \|\bx\|_2\).

  3. Conditioning: \(\kappa_2(\bQ) = 1\) (Optimally conditioned).

(1): Applying the Reversal Law for transposes and the fact that \(\bQ^T\bQ = \bI\): \[ \begin{align} (\bQ\bx)^T(\bQ\by) &= \bx^T \bQ^T \bQ \by \\ &= \bx^T (\bQ^T \bQ) \by \\ &= \bx^T \bI \by = \bx^T \by. \end{align} \]

(2): Using the result from (1) with \(\bx = \by\): \[ \begin{align} \|\bQ\bx\|_2^2 &= (\bQ\bx)^T(\bQ\bx) \\ &= \bx^T\bx = \|\bx\|_2^2. \end{align} \] Taking the square root gives \(\|\bQ\bx\|_2 = \|\bx\|_2\).

(3): From (2), we see that the induced norm is \(\|\bQ\|_2 = 1\). Similarly, \(\|\bQ^T\|_2 = 1\) because \(\bQ^T\) is also orthogonal. Thus: \[ \begin{align} \kappa_2(\bQ) = \|\bQ\|_2 \|\bQ^T\|_2 = 1 \cdot 1 = 1. \end{align} \]

WarningExercise
  1. Verify rotation matrix \(\bQ = \begin{pmatrix}\cos\theta & -\sin\theta \\ \sin\theta & \cos\theta\end{pmatrix}\) is orthogonal.

  2. Prove that the product of two orthogonal matrices is orthogonal.

  3. If \(\bQ\) is orthogonal, show \(\det(\bQ) = \pm 1\).

12.2 QR Decomposition

Factors \(\bA\) into an orthogonal \(\bQ\) and upper triangular \(\bR\). Default choice for least squares and eigenvalue algorithms.

NoteDefinition: QR Decomposition

For \(\bA \in \fR^{m \times n}\) (\(m \geq n\)): \(\bA = \bQ\bR\).

  • Full QR: \(\bQ \in \fR^{m \times m}, \bR \in \fR^{m \times n}\).

  • Thin QR: \(\bQ_1 \in \fR^{m \times n}, \bR_1 \in \fR^{n \times n}\). Columns of \(\bQ_1\) span \(\mathcal{R}(\bA)\).

NoteDefinition: Householder Reflector

A reflector \(\bH = \bI - 2\frac{\bv\bv^T}{\bv^T\bv}\) reflects \(\bx\) across the hyperplane \(\bv^\perp\).

  • Zeroing Property: For any \(\bx\), we can choose \(\bv\) such that \(\bH\bx = \pm \|\bx\| \be_1\).

  • Efficiency: Never form \(\bH\) explicitly. Apply as \(\bH\bx = \bx - \bv(2\frac{\bv^T\bx}{\bv^T\bv})\), a rank-1 update costing \(O(m)\).

Tip

Flop Cost. For \(\bA \in \fR^{m \times n}\) via Householder: \(\text{Cost} \approx 2mn^2 - \frac{2}{3}n^3\). Square case (\(m=n\)): \(\frac{4}{3}n^3\) (twice the cost of LU).

WarningExercise
  1. Find \(\bv\) s.t. \(\bH(3, 4)^T = (-5, 0)^T\).

  2. Use Householder to triangularize \(\bA = \begin{pmatrix} 1 & 2 \\ 2 & 3 \\ 2 & 4 \end{pmatrix}\).

  3. Compare thin vs. full QR shapes in NumPy for a \(5 \times 3\) matrix.

12.3 Gram-Schmidt Process

Tip

Classical vs. Modified Gram-Schmidt.

  • Classical (CGS): \(\bu_k = \bv_k - \sum_{j < k} \text{proj}_{\bu_j}(\bv_k)\). Numerically unstable for ill-conditioned \(\bA\) (\(\text{Error} \propto \kappa(\bA)^2\)).

  • Modified (MGS): Orthogonalize sequentially against partially updated vectors. Better stability (\(\text{Error} \propto \kappa(\bA)\)).

Tip

Numerical Hierarchy: Householder QR is the most stable (\(O(\varepsilon_{\text{mach}})\) loss of orthogonality) and is used by np.linalg.qr. MGS is used when \(\bQ\) columns are needed one by one (e.g., Arnoldi/GMRES).

WarningExercise
  1. Apply hand CGS to columns of \(\bA = \begin{pmatrix} 1 & 1 \\ 1 & 2 \\ 0 & 1 \end{pmatrix}\).

  2. Stability Experiment: Construct a nearly singular Vandermonde matrix. Compare loss of orthogonality (\(\|\bQ^T\bQ - \bI\|_F\)) for CGS, MGS, and Householder.

12.4 Orthogonal Projection

NoteDefinition: Orthogonal Projector

For a subspace \(S\) with orthonormal basis \(\bQ\): \(\bP = \bQ\bQ^T\).

  • Properties: \(\bP^2 = \bP\) (Idempotent), \(\bP^T = \bP\) (Symmetric).

  • Geometric View: \(\bP\bx\) is the closest vector in \(S\) to \(\bx\). Residual \(\br = \bx - \bP\bx\) lies in \(S^\perp = \mathcal{N}(\bQ^T)\).

WarningExercise
  1. Show that for any projector \(\bP\), the residual \(\bx - \bP\bx\) is orthogonal to \(\bP\bx\).

  2. Project \(\bx = (1, 2, 3)^T\) onto the span of \((1, 1, 1)^T\).