5 Introduction to Norms

A norm is a function that assigns a non-negative scalar ``size’’ to a vector or matrix. Norms are the fundamental tool for measuring errors, bounding perturbations, and defining convergence throughout scientific computing. This chapter introduces the key vector norm definitions and the families most used in practice. We will study norms more rigorously in later chapters, where we develop matrix norms, operator inequalities, Cauchy-Schwarz, Hölder’s inequality, and the condition number.

Definition: Vector Norm

A function \(\|\cdot\| : \fR^n \to \fR\) is a vector norm if for all \(\bx, \by \in \fR^n\) and \(c \in \fR\):

\(\|\bx\| \geq 0\) and \(\|\bx\| = 0 \Leftrightarrow \bx = \bzero\) (positive definiteness)
\(\|c\bx\| = |c|\|\bx\|\) (absolute homogeneity)
\(\|\bx + \by\| \leq \|\bx\| + \|\by\|\) (triangle inequality)

Definition: \(\ell^p\) Norms

For \(\bx \in \fR^n\) and \(p \geq 1\), the \(\ell^p\) norm is \[ \begin{align} \|\bx\|_p = \left(\sum_{i=1}^n |x_i|^p\right)^{1/p}. \end{align} \] The three most common cases are:

\(\|\bx\|_1 = \displaystyle\sum_{i=1}^n |x_i|\) (Manhattan norm, the sum of absolute values)
\(\|\bx\|_2 = \sqrt{\displaystyle\sum_{i=1}^n x_i^2} = \sqrt{\bx^T\bx}\) (Euclidean norm)
\(\|\bx\|_\infty = \max_{1 \leq i \leq n} |x_i|\) (max norm, the limit as \(p \to \infty\))

Remark

(Physical Intuition) In scientific computing, the choice of norm depends on what you want to measure:

\(\ell_2\) (Energy): The standard measure of magnitude; corresponds to physical energy or distance.
\(\ell_\infty\) (Worst-case): Measures the single largest entry. Useful for tolerance'' checks (e.g.,every entry must be below \(10^{-6}\)’’).
\(\ell_1\) (Sparsity): Encourages or measures sparsity; often used in optimization and compressed sensing.

Example

For \(\bx = (3, -4, 0)^T\): \[ \begin{align} \|\bx\|_1 = 3 + 4 + 0 = 7, \qquad \|\bx\|_2 = \sqrt{9 + 16 + 0} = 5, \qquad \|\bx\|_\infty = 4. \end{align} \] The unit balls \(\{\bx : \|\bx\|_p \leq 1\}\) in \(\fR^2\) are: a diamond (\(p = 1\)), a circle (\(p = 2\)), and a square (\(p = \infty\)). As \(p\) increases, the unit ball grows toward the \(\ell^\infty\) square.

Remark

Remark

(Numerical Zero) In floating-point arithmetic, the condition \(\|\bx\| = 0\) is rarely met exactly due to rounding. We instead use a tolerance \(\varepsilon\): we say \(\bx\) is ``numerically zero’’ if \(\|\bx\| \leq \varepsilon \cdot \|\bx_0\|\), where \(\bx_0\) is some initial reference vector.

Theorem: All Norms are Equivalent on \(\fR^n\)

For any two norms \(\|\cdot\|_\alpha\) and \(\|\cdot\|_\beta\) on \(\fR^n\), there exist constants \(c_1, c_2 > 0\) such that \[ \begin{align} c_1\|\bx\|_\beta \leq \|\bx\|_\alpha \leq c_2\|\bx\|_\beta \quad \text{for all } \bx \in \fR^n. \end{align} \] In particular: \(\|\bx\|_2 \leq \|\bx\|_1 \leq \sqrt{n}\,\|\bx\|_2\) and \(\|\bx\|_\infty \leq \|\bx\|_2 \leq \sqrt{n}\,\|\bx\|_\infty\).

Proof

Since \(\fR^n\) is finite-dimensional, the unit sphere \(S_\beta = \{\bx : \|\bx\|_\beta = 1\}\) is compact. Any norm \(\|\cdot\|_\alpha\) is a continuous function on \(\fR^n\), and by the Extreme Value Theorem, it attains its minimum \(c_1\) and maximum \(c_2\) on \(S_\beta\): \[ \begin{align} c_1 = \min_{\bx \in S_\beta} \|\bx\|_\alpha, \quad c_2 = \max_{\bx \in S_\beta} \|\bx\|_\alpha. \end{align} \] For any nonzero \(\by \in \fR^n\), let \(\bx = \by / \|\by\|_\beta \in S_\beta\). Then: \[ \begin{align} c_1 \leq \left\| \frac{\by}{\|\by\|_\beta} \right\|_\alpha \leq c_2 \quad \Rightarrow \quad c_1 \|\by\|_\beta \leq \|\by\|_\alpha \leq c_2 \|\by\|_\beta. \end{align} \] The explicit inequality \(\|\bx\|_1 \leq \sqrt{n}\|\bx\|_2\) follows from Cauchy-Schwarz applied to \(|\bx|\) and the all-ones vector \(\mathbf{1}\): \[ \begin{align} \|\bx\|_1 &= \sum_{i=1}^n |x_i| \cdot 1 \\ &\leq \left( \sum_{i=1}^n x_i^2 \right)^{1/2} \left( \sum_{i=1}^n 1^2 \right)^{1/2} \\ &= \sqrt{n} \|\bx\|_2. \end{align} \]

Exercise

Refer to Defs~above–above and the result above.

For \(\bx = (-1, 2, -3, 4)^T\), compute \(\|\bx\|_1\), \(\|\bx\|_2\), \(\|\bx\|_\infty\) by hand. Verify all three norm-equivalence inequalities hold.
Verify the triangle inequality \(\|\bx + \by\|_2 \leq \|\bx\|_2 + \|\by\|_2\) for \(\bx = (1, 2)^T\), \(\by = (-3, 1)^T\) by computing both sides.
Using NumPy, confirm your hand calculations with np.linalg.norm(x, 1), np.linalg.norm(x), and np.linalg.norm(x, np.inf).
Show that \(\|\bx\|_\infty = \lim_{p \to \infty}\|\bx\|_p\). (Hint: let \(M = \max_i|x_i|\); factor \(M\) out of \(\|\bx\|_p\) and take the limit.)