Orthogonal matrices represent the class of linear transformations that preserve lengths and angles — rotations and reflections. They are numerically ideal: multiplying by an orthogonal matrix introduces no error amplification (\(\kappa_2(\bQ) = 1\)). This section develops orthogonal matrices and their role in the QR factorization, which underpins least-squares solvers and the QR eigenvalue algorithm.
Two vectors \(\bx, \by \in \fR^n\) are called orthogonal if their inner product is zero: \[\bx^T\by = 0.\] Geometrically, orthogonal vectors are perpendicular. A set of vectors \(\{\bv_1, \bv_2, ..., \bv_k\}\) is mutually orthogonal if every pair of distinct vectors is orthogonal: \[\bv_i^T\bv_j = 0 \quad \text{for all } i \neq j.\]
Refer to the result above.
Are \((1, 2)^T\) and \((4, -2)^T\) orthogonal? Are \((1, 1)^T\) and \((1, -1)^T\) orthogonal?
If \(\bx \perp \by\) and \(\bx \perp \bz\), show that \(\bx \perp (\alpha\by + \beta\bz)\) for any scalars \(\alpha, \beta\).
In \(\fR^3\), find a nonzero vector orthogonal to both \((1, 0, 1)^T\) and \((0, 1, 1)^T\). (Hint: solve a homogeneous system or use the cross product.)
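The dot-product test (and the cross-product hint) can be checked directly in NumPy; a minimal sketch:

```python
import numpy as np

# Orthogonality check: two vectors are orthogonal iff their dot product is zero.
print(np.dot([1, 2], [4, -2]))    # 1*4 + 2*(-2) = 0: orthogonal
print(np.dot([1, 1], [1, -1]))    # 1 - 1 = 0: orthogonal

# A vector orthogonal to both (1,0,1) and (0,1,1), via the cross product.
w = np.cross([1, 0, 1], [0, 1, 1])
print(w, np.dot([1, 0, 1], w), np.dot([0, 1, 1], w))   # both dot products are 0
```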
A set of vectors \(\{\bq_1, \bq_2, ..., \bq_n\}\) in \(\fR^n\) is an orthonormal basis if the vectors are mutually orthogonal and each has unit length: \[\bq_i^T\bq_j = \delta_{ij} =
\begin{cases}
1 & \text{if } i = j \\
0 & \text{if } i \neq j.
\end{cases}\] Any \(n\) such vectors in \(\fR^n\) are automatically linearly independent and span \(\fR^n\). If we arrange them as columns of \(\bQ = [\bq_1, ..., \bq_n]\), then \(\bQ^T\bQ = \bI\).
Any vector \(\bv \in \fR^n\) can be expressed uniquely as a linear combination of an orthonormal basis \(\{\bq_1, ..., \bq_n\}\): \[\bv = \sum_{i=1}^n (\bq_i^T\bv)\,\bq_i.\] The coefficients \(\bq_i^T\bv\) are simple inner products — no linear system needs to be solved. This is the key computational advantage of orthonormal bases.
Refer to the result above.
Verify that \(\bq_1 = (1/\sqrt{2},\,1/\sqrt{2})^T\) and \(\bq_2 = (1/\sqrt{2},\,-1/\sqrt{2})^T\) form an orthonormal basis for \(\fR^2\).
Express \(\bv = (3, -1)^T\) as \(\sum_i (\bq_i^T\bv)\bq_i\) using the basis above. Verify the result equals \(\bv\).
A set of \(n\) mutually orthogonal nonzero vectors in \(\fR^n\) is linearly independent. Prove this. (Hint: assume \(\sum c_i\bv_i = \bzero\) and take the inner product with each \(\bv_j\).)
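The expansion formula \(\bv = \sum_i (\bq_i^T\bv)\,\bq_i\) can be verified numerically with the orthonormal basis from the exercises above; a quick NumPy sketch:

```python
import numpy as np

# Orthonormal basis of R^2 from the exercise above.
q1 = np.array([1.0, 1.0]) / np.sqrt(2)
q2 = np.array([1.0, -1.0]) / np.sqrt(2)
v = np.array([3.0, -1.0])

# Coefficients are plain inner products -- no linear system to solve.
c1, c2 = q1 @ v, q2 @ v
reconstruction = c1 * q1 + c2 * q2
print(reconstruction)   # recovers [ 3. -1.]
```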
A square matrix \(\bQ \in \fR^{n \times n}\) is orthogonal if its columns form an orthonormal set: \[\bQ^T\bQ = \bQ\bQ^T = \bI_n.\] Equivalently, \(\bQ^T = \bQ^{-1}\): the transpose of \(\bQ\) is its inverse.
Orthogonal matrices represent rotations, reflections, or combinations of these in \(\fR^n\). They preserve vector lengths and angles between vectors, making them particularly valuable in scientific computing where maintaining geometric properties is important.
Refer to the result above.
Verify that \(\bQ = \begin{pmatrix}\cos\theta & -\sin\theta \\ \sin\theta &
\cos\theta\end{pmatrix}\) is orthogonal for any \(\theta\). What geometric operation does multiplication by \(\bQ\) perform?
Is \(\bM = \begin{pmatrix}1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} &
1/\sqrt{2}\end{pmatrix}\) orthogonal? Why or why not?
If \(\bQ\) is orthogonal, show that \(\bQ^T\) is also orthogonal.
Let \(\bQ \in \fR^{n \times n}\) be an orthogonal matrix. For all \(\bx, \by \in \fR^n\):
\((\bQ\bx)^T(\bQ\by) = \bx^T\by\) (inner-product-preserving)
\(\|\bQ\bx\|_2 = \|\bx\|_2\) (isometry)
\(\kappa_2(\bQ) = 1\) (optimally conditioned)
(1): \((\bQ\bx)^T(\bQ\by) = \bx^T\bQ^T\bQ\by = \bx^T\bI\by = \bx^T\by\).
(2): Set \(\bx = \by\) in (1): \(\|\bQ\bx\|_2^2 = \bx^T\bx = \|\bx\|_2^2\).
(3): By (2), \(\|\bQ\|_2 = \max_{\|\bx\|=1}\|\bQ\bx\|_2 = 1\). Since \(\bQ^{-1} = \bQ^T\) is also orthogonal, \(\|\bQ^{-1}\|_2 = 1\). Thus \(\kappa_2(\bQ) = \|\bQ\|_2\|\bQ^{-1}\|_2 = 1\).
The condition number of 1 is the best possible. Multiplying by an orthogonal matrix introduces no amplification of errors, which is why orthogonal transformations (Householder reflections, Givens rotations) are preferred over Gaussian elimination when stability is paramount.
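These three properties are easy to confirm numerically for a rotation matrix; a small NumPy check (the angle is arbitrary):

```python
import numpy as np

# A rotation by theta: orthogonal, norm-preserving, condition number 1.
theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

x = np.array([3.0, -4.0])
print(np.allclose(Q.T @ Q, np.eye(2)))            # Q^T Q = I
print(np.linalg.norm(Q @ x), np.linalg.norm(x))   # both 5.0: isometry
print(np.linalg.cond(Q))                          # 1.0 up to rounding
```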
If \(\bQ \in \fR^{n \times n}\) satisfies \(\bQ^T\bQ = \bI\), then \(\bQ\bQ^T = \bI\) as well, so the rows of \(\bQ\) also form an orthonormal basis for \(\fR^n\).
Since \(\bQ\) is square, \(\bQ^T\bQ = \bI\) says that \(\bQ^T\) is a left inverse of \(\bQ\); for square matrices a left inverse is also the (two-sided) inverse, so \(\bQ^{-1} = \bQ^T\) and hence \(\bQ\bQ^T = \bQ\bQ^{-1} = \bI\).
For any orthogonal \(\bQ\), \(\det(\bQ) = \pm 1\). When \(\det(\bQ) = +1\), \(\bQ\) is a rotation; when \(\det(\bQ) = -1\), it includes a reflection.
From \(\bQ^T\bQ = \bI\): \(\det(\bQ^T)\det(\bQ) = 1\), and \(\det(\bQ^T) = \det(\bQ)\), so \(\det(\bQ)^2 = 1\).
If \(\bQ_1, \bQ_2 \in \fR^{n \times n}\) are orthogonal, then \(\bQ_1\bQ_2\) is also orthogonal.
\((\bQ_1\bQ_2)^T(\bQ_1\bQ_2) = \bQ_2^T(\bQ_1^T\bQ_1)\bQ_2 = \bQ_2^T\bI\bQ_2 = \bI\).
Refer to the propositions above.
For \(\bQ = \begin{pmatrix}0 & -1 \\ 1 & 0\end{pmatrix}\) (rotation by 90°), compute \(\det(\bQ)\) and verify the rows are orthonormal.
For the reflection \(\bQ = \begin{pmatrix}1 & 0 \\ 0 & -1\end{pmatrix}\), verify \(\det(\bQ) = -1\) and \(\bQ\bQ^T = \bI\).
Is \(\bQ_1 + \bQ_2\) necessarily orthogonal when \(\bQ_1\) and \(\bQ_2\) are? Give a counterexample.
QR Decomposition
The QR decomposition factors a matrix into an orthogonal factor \(\bQ\) and an upper triangular factor \(\bR\), combining the stability benefits of orthogonal transformations with the efficiency of triangular solves. It is the factorization of choice for least-squares problems and forms the basis of the QR eigenvalue algorithm.
For a matrix \(\bA \in \fR^{m \times n}\) with \(m \geq n\), the QR decomposition factors \(\bA\) as \[\bA = \bQ\bR,\] where \(\bQ \in \fR^{m \times m}\) is orthogonal (\(\bQ^T\bQ = \bI\)) and \(\bR \in \fR^{m \times n}\) is upper triangular with zeros below the main diagonal.
The QR factorization is particularly valuable in scientific computing because it transforms problems involving arbitrary matrices into problems with triangular matrices, which are much easier to solve. The orthogonality of \(\bQ\) ensures numerical stability in many applications, as orthogonal transformations preserve Euclidean norms and do not amplify errors.
Refer to the result above.
If \(\bA = \bQ\bR\), explain how to solve \(\bA\bx = \bb\) using \(\bQ\) and \(\bR\) separately. What is the cost of each step?
For \(\bA = \begin{pmatrix}1 & 0 \\ 0 & 1 \\ 0 & 0\end{pmatrix}\), write a QR factorization by inspection. (Hint: \(\bA\) already has orthonormal columns.)
Why does the QR decomposition always exist for any \(\bA \in \fR^{m \times n}\) with \(m \geq n\)? (Hint: consider Gram-Schmidt applied to the columns of \(\bA\).)
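The two-step solve from the first exercise can be sketched in NumPy. This is a sketch under the assumption that \(\bA\) is square and nonsingular; `np.linalg.solve` is used generically here, though the triangularity of \(\bR\) means plain back-substitution would suffice:

```python
import numpy as np

# Solve Ax = b via A = QR: multiply by Q^T, then back-substitute on R.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
b = rng.standard_normal(4)

Q, R = np.linalg.qr(A)        # square A, so Q is 4x4 orthogonal
y = Q.T @ b                   # O(n^2) matrix-vector product
x = np.linalg.solve(R, y)     # R triangular: back-substitution would do

print(np.allclose(A @ x, b))  # True
```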
If \(\bA \in \fR^{m \times n}\) (\(m \geq n\)) has full column rank, the first \(n\) columns of \(\bQ\) (from the full QR factorization \(\bA = \bQ\bR\)) form an orthonormal basis for \(\text{col}(\bA)\).
In the QR factorization of a full-column-rank matrix \(\bA\), the diagonal entries of \(\bR\) satisfy \(|r_{ii}| > 0\) for each \(i\). In particular, \(\bR\) is invertible.
Refer to the propositions above.
If \(r_{11} = 0\) in the QR factorization of \(\bA\), what does this imply about the first column of \(\bA\)?
For a \(3 \times 2\) full-rank matrix \(\bA\), identify which columns of the full \(\bQ \in \fR^{3 \times 3}\) span \(\text{col}(\bA)\) and which span \(\text{col}(\bA)^\perp\).
In NumPy, compute the full and thin QR of a \(4 \times 3\) matrix and verify:
(a) the first 3 columns of \(\bQ\) span \(\text{col}(\bA)\); (b) the 4th column of \(\bQ\) is orthogonal to each column of \(\bA\).
Householder Reflectors
A key method for computing the QR factorization is via Householder reflectors: orthogonal matrices that reflect a vector across a hyperplane and can zero out all entries below a chosen position in a column. Applying \(n\) such reflectors in sequence reduces \(\bA\) to upper triangular form.
Given a nonzero vector \(\bv \in \fR^m\), the Householder reflector is the matrix \[\bH = \bI - \frac{2\bv\bv^T}{\bv^T\bv}.\] Geometrically, \(\bH\bx\) reflects \(\bx\) across the hyperplane \(\bv^\perp = \{\bw : \bv^T\bw = 0\}\).
For any nonzero \(\bv \in \fR^m\), the Householder matrix \(\bH = \bI - 2\bv\bv^T/\|\bv\|_2^2\) satisfies:
\(\bH^T = \bH\) (symmetric)
\(\bH^T\bH = \bI\) (orthogonal)
\(\bH^2 = \bI\), so \(\bH^{-1} = \bH\) (involution)
\(\det(\bH) = -1\) (reflection, not rotation)
Let \(\tau = \bv^T\bv = \|\bv\|_2^2 > 0\).
(1): \(\bH^T = (\bI - 2\bv\bv^T/\tau)^T = \bI - 2\bv\bv^T/\tau = \bH\).
(2): \(\bH^T\bH = \bH^2 = (\bI - 2\bv\bv^T/\tau)^2 = \bI - 4\bv\bv^T/\tau +
4\bv(\bv^T\bv)\bv^T/\tau^2 = \bI - 4\bv\bv^T/\tau + 4\bv\bv^T/\tau = \bI\).
(3): Follows from (1) and (2): \(\bH^2 = \bH^T\bH = \bI\).
(4): \(\bH\) has eigenvalue \(-1\) (for any \(\bx \parallel \bv\): \(\bH\bv = \bv - 2\bv = -\bv\)) with multiplicity 1, and eigenvalue \(+1\) (for \(\bx \perp \bv\)) with multiplicity \(m-1\), so \(\det(\bH) = (-1)^1 \cdot 1^{m-1} = -1\).
For any nonzero \(\bx \in \fR^m\), there exists a Householder vector \(\bv\) such that \[\bH\bx = \alpha\be_1, \qquad |\alpha| = \|\bx\|_2,\] where \(\be_1 = (1,0,...,0)^T\). The choice \(\bv = \bx + \text{sign}(x_1)\|\bx\|_2\be_1\), i.e. \(\alpha = -\text{sign}(x_1)\|\bx\|_2\), avoids cancellation and is numerically preferred.
We want \(\bH\bx = \alpha\be_1\) for some scalar \(\alpha\). Since \(\bH\) is orthogonal, \(|\alpha| = \|\bx\|_2\). Setting \(\bv = \bx - \alpha\be_1\): \[\bH\bx = \bx - \frac{2\bv\bv^T\bx}{\bv^T\bv}.\] We compute \(\bv^T\bx = \|\bx\|_2^2 - \alpha x_1\) and, using \(\alpha^2 = \|\bx\|_2^2\), \(\bv^T\bv = \|\bx\|_2^2 - 2\alpha x_1 + \alpha^2 = 2(\|\bx\|_2^2 - \alpha x_1) = 2\bv^T\bx\), so \(\bH\bx = \bx - \bv = \alpha\be_1\). Choosing \(\alpha = -\text{sign}(x_1)\|\bx\|_2\) makes \(v_1 = x_1 - \alpha = x_1 + \text{sign}(x_1)\|\bx\|_2\) a sum of like-signed terms, so forming \(\bv\) involves no cancellation even when \(\bx\) is nearly parallel to \(\be_1\).
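The construction in the proof can be checked numerically. A minimal sketch (the helper name `householder_vector` is ours, not a library routine):

```python
import numpy as np

def householder_vector(x):
    """Return (v, alpha) with (I - 2 v v^T / v^T v) x = alpha * e1,
    using the cancellation-free sign alpha = -sign(x1) * ||x||."""
    alpha = -np.sign(x[0]) * np.linalg.norm(x) if x[0] != 0 else -np.linalg.norm(x)
    v = x.astype(float).copy()
    v[0] -= alpha                 # v = x - alpha * e1
    return v, alpha

x = np.array([3.0, 4.0])
v, alpha = householder_vector(x)
H = np.eye(2) - 2.0 * np.outer(v, v) / (v @ v)
print(H @ x)                            # [-5.  0.] (alpha = -5)
print(np.allclose(H @ H, np.eye(2)))    # involution: H^2 = I
```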
The key use of Householder reflectors in QR factorization is iterative zeroing: at step \(k\), choose \(\bH_k\) to zero all entries below the \((k,k)\) position in column \(k\) of the current matrix. After \(n\) such reflections, the result is upper triangular. Since each \(\bH_k\) is orthogonal, the product \(\bQ = \bH_1\bH_2\cdots\bH_n\) is orthogonal, giving \(\bA = \bQ\bR\). Each reflector is applied as \(\bH\bx = \bx - 2\bv(\bv^T\bx)/\|\bv\|^2\) — a rank-1 update costing only \(O(m)\), not \(O(m^2)\).
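The iterative zeroing described above can be sketched as follows. This is a teaching sketch (the function name `householder_qr` is ours); in practice one would call `np.linalg.qr`, and its sign conventions may differ from this sketch:

```python
import numpy as np

def householder_qr(A):
    """QR by successive Householder reflectors (teaching sketch)."""
    m, n = A.shape
    R = A.astype(float).copy()
    Q = np.eye(m)
    for k in range(n):
        x = R[k:, k]
        alpha = -np.sign(x[0]) * np.linalg.norm(x) if x[0] != 0 else -np.linalg.norm(x)
        v = x.copy()
        v[0] -= alpha                  # v = x - alpha * e1
        nv = np.linalg.norm(v)
        if nv == 0:                    # column already triangular
            continue
        v /= nv
        # Apply H_k = I - 2 v v^T as rank-1 updates; H_k is never formed.
        R[k:, :] -= 2.0 * np.outer(v, v @ R[k:, :])
        Q[:, k:] -= 2.0 * np.outer(Q[:, k:] @ v, v)   # accumulate Q = H_1 ... H_n
    return Q, R

A = np.array([[1.0, 2.0], [2.0, 3.0], [2.0, 4.0]])   # small 3x2 example
Q, R = householder_qr(A)
print(np.allclose(Q @ R, A), np.allclose(Q.T @ Q, np.eye(3)))
```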
Refer to the two results above.
Let \(\bx = (3, 4)^T \in \fR^2\). Find the Householder vector \(\bv\) such that \(\bH\bx = -5\be_1\). Write \(\bH\) explicitly and verify \(\bH\bx = (-5, 0)^T\).
For \(\bx = (1,2,2)^T\), find \(\bv\) and \(\bH\) such that \(\bH\bx = -3\be_1\). Verify \(\bH\) is symmetric and \(\bH^2 = \bI\).
In NumPy, apply two successive Householder reflectors to triangularize \(\bA = \begin{pmatrix} 1 & 2 \\ 2 & 3 \\ 2 & 4 \end{pmatrix}\) and recover \(\bQ\) and \(\bR\). Compare with np.linalg.qr(A).
The QR factorization of \(\bA \in \fR^{m \times n}\) (via Householder reflections) costs approximately \(2mn^2 - \frac{2}{3}n^3\) flops.
Refer to the result above.
For \(\bA \in \fR^{n \times n}\) (square), simplify the QR cost formula. Compare it to LU factorization cost (\(\frac{2}{3}n^3\)). Which is cheaper and by what factor?
Why does a zero diagonal entry in \(\bR\) imply \(\bA\) is rank-deficient? (Hint: \(\bA = \bQ\bR\) and \(\bQ\) is invertible.)
In NumPy, np.linalg.qr(A) returns the thin QR by default (mode='reduced'). What are the shapes of \(\bQ\) and \(\bR\) for a \(5 \times 3\) matrix? When would you want the full QR (mode='complete')?
Gram-Schmidt
The Gram-Schmidt process is a constructive algorithm for building an orthonormal basis from any set of linearly independent vectors. It directly yields the QR factorization: the orthonormal vectors form the columns of \(\bQ\) and the projection coefficients populate \(\bR\). Two variants exist — classical and modified — with different numerical stability properties.
Let \(S \subseteq \fR^n\) be a subspace. Its orthogonal complement is \[S^\perp = \{\bw \in \fR^n : \bw^T\bv = 0 \text{ for all } \bv \in S\}.\] \(S^\perp\) is itself a subspace of \(\fR^n\).
If \(S \subseteq \fR^n\) has dimension \(k\), then \(\dim(S^\perp) = n - k\) and \(S \cap S^\perp = \{\bzero\}\).
Every vector \(\bx \in \fR^n\) decomposes uniquely as \(\bx = \bx_S + \bx_{S^\perp}\) where \(\bx_S \in S\) and \(\bx_{S^\perp} \in S^\perp\). The component \(\bx_S\) is the orthogonal projection of \(\bx\) onto \(S\).
For \(\bA \in \fR^{m \times n}\) with columns spanning \(S = \text{col}(\bA)\), the orthogonal complement satisfies \(S^\perp = \ker(\bA^T)\), i.e., \(S^\perp = \{\bw : \bA^T\bw = \bzero\}\).
Refer to the definition and propositions above.
Let \(S = \text{span}\{(1,1,0)^T, (0,1,1)^T\}\) in \(\fR^3\). Find a basis for \(S^\perp\) by solving \(\bA^T\bw = \bzero\) where \(\bA\) has the spanning vectors as columns.
Verify that your answer to (1) satisfies \(\dim(S) + \dim(S^\perp) = 3\).
In NumPy, use the QR factorization of \(\bA\) to extract an orthonormal basis for \(\text{col}(\bA)\) and for \(\text{col}(\bA)^\perp = \ker(\bA^T)\).
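One way to carry out the last exercise, using the full QR factorization and the example subspace from the first exercise:

```python
import numpy as np

A = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])  # columns span S
Q, R = np.linalg.qr(A, mode='complete')              # full QR: Q is 3x3

basis_S = Q[:, :2]       # orthonormal basis for col(A)
basis_Sperp = Q[:, 2:]   # orthonormal basis for col(A)^perp = ker(A^T)

print(np.allclose(A.T @ basis_Sperp, 0))   # True: orthogonal to all of S
```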
The Gram-Schmidt process converts a set of linearly independent vectors \(\{\bv_1, \bv_2, ..., \bv_n\}\) into an orthonormal basis for the same subspace, by sequentially removing each vector’s projection onto the span of previously orthogonalized vectors.
Classical Gram-Schmidt Algorithm. Given linearly independent vectors \(\{\bv_1, ..., \bv_n\}\), compute an orthogonal set \(\{\bu_1, ..., \bu_n\}\) by: \[
\begin{align}
\bu_1 &= \bv_1, \\
\bu_k &= \bv_k - \sum_{j=1}^{k-1}\frac{\bu_j^T\bv_k}{\bu_j^T\bu_j}\bu_j,
\quad k = 2, ..., n.
\end{align}
\] Normalize each vector to obtain an orthonormal basis: \(\bq_i = \bu_i / \|\bu_i\|_2\).
The Gram-Schmidt process can be interpreted geometrically:
\(\bu_1\) is simply the first vector \(\bv_1\).
\(\bu_2\) is \(\bv_2\) minus its projection onto \(\bu_1\).
\(\bu_3\) is \(\bv_3\) minus its projections onto both \(\bu_1\) and \(\bu_2\).
Each \(\bu_k\) is the component of \(\bv_k\) that is orthogonal to the subspace spanned by \(\{\bu_1, ..., \bu_{k-1}\}\).
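The algorithm above translates directly into NumPy. A sketch (the function name `classical_gram_schmidt` is ours; it normalizes as it goes, so the projection coefficients become \(\bq_j^T\bv_k\), which is equivalent to the \(\bu_j^T\bv_k/\bu_j^T\bu_j\) form in the statement):

```python
import numpy as np

def classical_gram_schmidt(V):
    """Classical Gram-Schmidt on the columns of V (assumed independent)."""
    m, n = V.shape
    Q = np.zeros((m, n))
    for k in range(n):
        u = V[:, k].astype(float).copy()
        for j in range(k):
            # Classical: every projection uses the ORIGINAL column V[:, k].
            u -= (Q[:, j] @ V[:, k]) * Q[:, j]
        Q[:, k] = u / np.linalg.norm(u)
    return Q

V = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])   # v1, v2 from the example
Q = classical_gram_schmidt(V)
print(np.allclose(Q.T @ Q, np.eye(2)))    # orthonormal columns
print(Q[:, 1] * np.sqrt(2))               # proportional to (-1, 0, 1)
```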
Let \(\bv_1 = (1, 1, 1)^T\) and \(\bv_2 = (0, 1, 2)^T\) in \(\fR^3\).
Step 1: \(\bu_1 = \bv_1 = (1, 1, 1)^T\).
Step 2: \[
\begin{align}
\bu_2 &= \bv_2 - \frac{\bu_1^T\bv_2}{\bu_1^T\bu_1}\bu_1
= (0,1,2)^T - \frac{3}{3}(1,1,1)^T = (-1, 0, 1)^T.
\end{align}
\]
Normalizing: \[
\begin{align}
\bq_1 &= \tfrac{1}{\sqrt{3}}(1, 1, 1)^T, \qquad
\bq_2 = \tfrac{1}{\sqrt{2}}(-1, 0, 1)^T.
\end{align}
\]
Refer to the result above.
Apply Gram-Schmidt to \(\bv_1 = (1, 0, 1)^T\), \(\bv_2 = (1, 1, 0)^T\), \(\bv_3 = (0, 1, 1)^T\). Verify each pair of output vectors is orthogonal.
Show that the Gram-Schmidt process fails (produces \(\bu_k = \bzero\)) if and only if the input vectors are linearly dependent.
In NumPy, implement the classical Gram-Schmidt algorithm for a \(4 \times 3\) matrix \(\bA\) and compare your output \(\bQ\) with np.linalg.qr(A)[0].
The Gram-Schmidt process provides a constructive proof of the existence of an orthonormal basis for any subspace of \(\fR^n\). In practice, however, the classical algorithm can suffer from numerical instability due to accumulated rounding errors, especially for large sets of nearly dependent vectors. The modified Gram-Schmidt algorithm, which applies orthogonalization against each previous vector sequentially (using the partially orthogonalized vector at each step), offers better numerical stability and is preferred in computational applications.
Modified Gram-Schmidt Algorithm. Instead of computing all projections from the original vector \(\bv_k\), modified Gram-Schmidt re-uses the partially orthogonalized vector at each step: \[
\begin{align}
\bv_k^{(1)} &= \bv_k, \\
\bv_k^{(j+1)} &= \bv_k^{(j)} - \frac{\bu_j^T\,\bv_k^{(j)}}{\bu_j^T\bu_j}\bu_j,
\quad j = 1, ..., k-1, \\
\bu_k &= \bv_k^{(k)}.
\end{align}
\] Normalize as before: \(\bq_i = \bu_i / \|\bu_i\|_2\).
In exact arithmetic, classical and modified Gram-Schmidt produce identical results. The difference is numerical: modified Gram-Schmidt limits error accumulation by immediately removing components already seen.
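The modified variant differs from the classical one by a single line: each projection is taken against the working vector rather than the original column. A sketch (the function name `modified_gram_schmidt` is ours):

```python
import numpy as np

def modified_gram_schmidt(V):
    """Modified Gram-Schmidt on the columns of V (assumed independent)."""
    m, n = V.shape
    Q = np.zeros((m, n))
    for k in range(n):
        u = V[:, k].astype(float).copy()
        for j in range(k):
            u -= (Q[:, j] @ u) * Q[:, j]   # uses the partially reduced u
        Q[:, k] = u / np.linalg.norm(u)
    return Q

V = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
Q = modified_gram_schmidt(V)
print(np.allclose(Q.T @ Q, np.eye(2)))   # True
```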
Consider the matrix \(\bA = \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \end{bmatrix}\) whose columns are \(\bv_1 = (1,1,1)^T\) and \(\bv_2 = (0,1,2)^T\) from the previous example.
From the Gram-Schmidt process, \(\bq_1 = \frac{1}{\sqrt{3}}(1,1,1)^T\) and \(\bq_2 = \frac{1}{\sqrt{2}}(-1,0,1)^T\). The \(\bR\) matrix entries are: \[
\begin{align}
r_{11} &= \bq_1^T\bv_1 = \sqrt{3}, \quad r_{12} = \bq_1^T\bv_2 = \sqrt{3}, \quad
r_{22} = \bq_2^T\bu_2 = \sqrt{2}.
\end{align}
\]
Therefore: \[\bR = \begin{bmatrix} \sqrt{3} & \sqrt{3} \\ 0 & \sqrt{2} \end{bmatrix}, \qquad
\bQ\bR = \begin{bmatrix} \frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{2}} \\
\frac{1}{\sqrt{3}} & 0 \\ \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{2}} \end{bmatrix}
\begin{bmatrix} \sqrt{3} & \sqrt{3} \\ 0 & \sqrt{2} \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 1 & 1 \\ 1 & 2 \end{bmatrix} = \bA.\;✓\]
Both CGS and MGS produce the QR factorization in exact arithmetic, but differ numerically. Let \(\hat{\bQ}\) denote the computed factor. Then:
Classical (CGS): \(\|\hat{\bQ}^T\hat{\bQ} - \bI\|_2 = O\!\left(\varepsilon_{\text{mach}}\,\kappa(\bA)^2\right)\)
Modified (MGS): \(\|\hat{\bQ}^T\hat{\bQ} - \bI\|_2 = O\!\left(\varepsilon_{\text{mach}}\,\kappa(\bA)\right)\)
Both algorithms cost \(O(mn^2)\) flops for \(\bA \in \fR^{m \times n}\).
The key difference: CGS computes all projections from the original vector \(\bv_k\), so rounding errors in early projections contaminate all subsequent ones. MGS re-uses the partially orthogonalized vector at each step, immediately removing components already seen, which limits error accumulation. For maximum stability (at the same \(O(mn^2)\) cost), Householder QR achieves \(O(\varepsilon_{\text{mach}})\) loss of orthogonality regardless of \(\kappa(\bA)\); it is what np.linalg.qr actually uses.
The Gram-Schmidt process is closely related to the QR decomposition. Arranging the original vectors as columns of \(\bA\) and the orthonormal vectors as columns of \(\bQ\), we have \(\bA = \bQ\bR\) where \(\bR\) is upper triangular with entries: \[r_{ij} =
\begin{cases}
\bq_i^T\bv_j & \text{if } i \leq j \\
0 & \text{if } i > j.
\end{cases}\] The elements of \(\bR\) are precisely the projection coefficients computed during the Gram-Schmidt process.
Let \(\bA = \begin{pmatrix} 1 & 1 \\ 1 & 2 \\ 0 & 1 \end{pmatrix}\).
Apply classical Gram-Schmidt by hand to the columns of \(\bA\) to find an orthonormal basis \(\{\bq_1, \bq_2\}\) for \(\text{col}(\bA)\).
Write out \(\bQ \in \fR^{3 \times 2}\) and \(\bR \in \fR^{2 \times 2}\) and verify \(\bA = \bQ\bR\).
Apply modified Gram-Schmidt to the same columns. Do you get the same \(\bQ, \bR\) (in exact arithmetic)?
In NumPy, compute np.linalg.qr(A) and compare. Verify Q.T @ Q is close to \(\bI\).
(Orthogonal Projection) Let \(\bQ \in \fR^{m \times n}\) (with \(m > n\)) have orthonormal columns (thin QR). The orthogonal projector onto \(\text{col}(\bQ)\) is \(\bP = \bQ\bQ^T\).
Show that \(\bP^2 = \bP\) (idempotent) and \(\bP^T = \bP\) (symmetric).
Show that \(\bP\bx\) is the closest vector in \(\text{col}(\bQ)\) to \(\bx\), i.e., \(\|\bx - \bP\bx\|_2 \leq \|\bx - \by\|_2\) for all \(\by \in \text{col}(\bQ)\).
If \(\bA = \bQ\bR\) (full QR with \(\bQ \in \fR^{m \times m}\) orthogonal), what is the projection onto \(\text{col}(\bA)\) in terms of the thin \(\bQ_1\) (first \(n\) columns of \(\bQ\))?
Write NumPy code to project a vector x = np.random.randn(5) onto the column space of a random \(5 \times 3\) matrix A. Verify the residual x - P @ x is orthogonal to each column of A.
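A sketch of the last two parts of this exercise, under the assumption that a random Gaussian \(5 \times 3\) matrix has full column rank (true with probability 1):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))
Q, _ = np.linalg.qr(A)        # thin QR: Q is 5x3 with orthonormal columns
P = Q @ Q.T                   # orthogonal projector onto col(A)

print(np.allclose(P @ P, P), np.allclose(P.T, P))   # idempotent, symmetric

x = rng.standard_normal(5)
r = x - P @ x                 # residual of the projection
print(np.allclose(A.T @ r, 0))   # orthogonal to every column of A
```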
(Stability Experiment) Generate a nearly linearly dependent set of vectors:
V = np.column_stack([np.linspace(0, 1, 50)**k for k in range(6)])
This is a Vandermonde-like matrix with \(\kappa(\bV) \gg 1\).
Implement classical Gram-Schmidt and compute \(\|\hat{\bQ}^T\hat{\bQ} - \bI\|_F\).
Implement modified Gram-Schmidt and compute the same quantity. Compare.
Use np.linalg.qr(V) (Householder) and compute the same quantity. What do you observe?
Compute \(\kappa(\bV)\) via np.linalg.cond(V). Does the stability gap between CGS and MGS scale as the result above predicts?
The Gram-Schmidt process and QR factorization are foundational tools in numerical linear algebra, underlying algorithms for solving linear systems, least-squares problems, eigenvalue computation (the QR algorithm), and orthogonal basis construction in iterative solvers.
For each application below, identify which property of QR (orthogonality of \(\bQ\), triangularity of \(\bR\), or both) is the key ingredient and write one sentence explaining why.
Solving a square linear system \(\bA\bx = \bb\) via \(\bA = \bQ\bR\).
Computing the least-squares solution to an overdetermined system \(\bA\bx \approx \bb\).
One step of the QR eigenvalue algorithm: \(\bA_k = \bQ_k\bR_k\), \(\bA_{k+1} = \bR_k\bQ_k\).
Constructing a coordinate system aligned with a given set of data directions.