5 Introduction to Norms
A norm is a function that assigns a non-negative scalar ``size’’ to a vector or matrix. Norms are the fundamental tool for measuring errors, bounding perturbations, and defining convergence throughout scientific computing. This chapter introduces the key vector norm definitions and the families most used in practice. We will study norms more rigorously in later chapters, where we develop matrix norms, operator inequalities, Cauchy-Schwarz, Hölder’s inequality, and the condition number.
A function \(\|\cdot\| : \fR^n \to \fR\) is a vector norm if for all \(\bx, \by \in \fR^n\) and \(c \in \fR\):
\(\|\bx\| \geq 0\) and \(\|\bx\| = 0 \Leftrightarrow \bx = \bzero\) (positive definiteness)
\(\|c\bx\| = |c|\|\bx\|\) (absolute homogeneity)
\(\|\bx + \by\| \leq \|\bx\| + \|\by\|\) (triangle inequality)
For \(\bx \in \fR^n\) and \(p \geq 1\), the \(\ell^p\) norm is \[ \begin{align} \|\bx\|_p = \left(\sum_{i=1}^n |x_i|^p\right)^{1/p}. \end{align} \] The three most common cases are:
\(\|\bx\|_1 = \displaystyle\sum_{i=1}^n |x_i|\) (Manhattan norm, the sum of absolute values)
\(\|\bx\|_2 = \sqrt{\displaystyle\sum_{i=1}^n x_i^2} = \sqrt{\bx^T\bx}\) (Euclidean norm)
\(\|\bx\|_\infty = \max_{1 \leq i \leq n} |x_i|\) (max norm, the limit as \(p \to \infty\))
Physical Intuition. In scientific computing, the choice of norm depends on what you want to measure:
\(\ell_2\) (Energy): The standard measure of magnitude; corresponds to physical energy or distance.
\(\ell_\infty\) (Worst-case): Measures the single largest entry. Useful for
tolerance'' checks (e.g.,every entry must be below \(10^{-6}\)’’).\(\ell_1\) (Sparsity): Encourages or measures sparsity; often used in optimization and compressed sensing.
For \(\bx = (3, -4, 0)^T\): \[ \begin{align} \|\bx\|_1 = 3 + 4 + 0 = 7, \qquad \|\bx\|_2 = \sqrt{9 + 16 + 0} = 5, \qquad \|\bx\|_\infty = 4. \end{align} \] The unit balls \(\{\bx : \|\bx\|_p \leq 1\}\) in \(\fR^2\) are: a diamond (\(p = 1\)), a circle (\(p = 2\)), and a square (\(p = \infty\)). As \(p\) increases, the unit ball grows toward the \(\ell^\infty\) square.
Throughout these notes, \(\|\cdot\|\) without a subscript denotes the Euclidean norm \(\|\cdot\|_2\) unless stated otherwise. Errors and residuals are almost always measured in \(\|\cdot\|_2\). In NumPy: np.linalg.norm(x) computes \(\|\bx\|_2\); np.linalg.norm(x, ord=p) computes \(\|\bx\|_p\).
Numerical Zero. In floating-point arithmetic, the condition \(\|\bx\| = 0\) is rarely met exactly due to rounding. We instead use a tolerance \(\varepsilon\): we say \(\bx\) is ``numerically zero’’ if \(\|\bx\| \leq \varepsilon \cdot \|\bx_0\|\), where \(\bx_0\) is some initial reference vector.
For any two norms \(\|\cdot\|_\alpha\) and \(\|\cdot\|_\beta\) on \(\fR^n\), there exist constants \(c_1, c_2 > 0\) such that \[ \begin{align} c_1\|\bx\|_\beta \leq \|\bx\|_\alpha \leq c_2\|\bx\|_\beta \quad \text{for all } \bx \in \fR^n. \end{align} \] In particular: \(\|\bx\|_2 \leq \|\bx\|_1 \leq \sqrt{n}\,\|\bx\|_2\) and \(\|\bx\|_\infty \leq \|\bx\|_2 \leq \sqrt{n}\,\|\bx\|_\infty\).
Since \(\fR^n\) is finite-dimensional, the unit sphere \(S_\beta = \{\bx : \|\bx\|_\beta = 1\}\) is compact. Any norm \(\|\cdot\|_\alpha\) is a continuous function on \(\fR^n\), and by the Extreme Value Theorem, it attains its minimum \(c_1\) and maximum \(c_2\) on \(S_\beta\): \[ \begin{align} c_1 = \min_{\bx \in S_\beta} \|\bx\|_\alpha, \quad c_2 = \max_{\bx \in S_\beta} \|\bx\|_\alpha. \end{align} \] For any nonzero \(\by \in \fR^n\), let \(\bx = \by / \|\by\|_\beta \in S_\beta\). Then: \[ \begin{align} c_1 \leq \left\| \frac{\by}{\|\by\|_\beta} \right\|_\alpha \leq c_2 \quad \Rightarrow \quad c_1 \|\by\|_\beta \leq \|\by\|_\alpha \leq c_2 \|\by\|_\beta. \end{align} \] The explicit inequality \(\|\bx\|_1 \leq \sqrt{n}\|\bx\|_2\) follows from Cauchy-Schwarz applied to \(|\bx|\) and the all-ones vector \(\mathbf{1}\): \[ \begin{align} \|\bx\|_1 &= \sum_{i=1}^n |x_i| \cdot 1 \\ &\leq \left( \sum_{i=1}^n x_i^2 \right)^{1/2} \left( \sum_{i=1}^n 1^2 \right)^{1/2} \\ &= \sqrt{n} \|\bx\|_2. \end{align} \]
Refer to Defs~above–above and the result above.
For \(\bx = (-1, 2, -3, 4)^T\), compute \(\|\bx\|_1\), \(\|\bx\|_2\), \(\|\bx\|_\infty\) by hand. Verify all three norm-equivalence inequalities hold.
Verify the triangle inequality \(\|\bx + \by\|_2 \leq \|\bx\|_2 + \|\by\|_2\) for \(\bx = (1, 2)^T\), \(\by = (-3, 1)^T\) by computing both sides.
Using NumPy, confirm your hand calculations with
np.linalg.norm(x, 1),np.linalg.norm(x), andnp.linalg.norm(x, np.inf).Show that \(\|\bx\|_\infty = \lim_{p \to \infty}\|\bx\|_p\). (Hint: let \(M = \max_i|x_i|\); factor \(M\) out of \(\|\bx\|_p\) and take the limit.)