The Gram-Schmidt algorithm

This post is about the Gram-Schmidt procedure which transforms a set of vectors \(u_1, \dots u_n\) into a new set of vectors \(v_1, \dots, v_n\) which are orthonormal (ON) for some inner product \(\langle\cdot, \cdot \rangle\). This process is purely algebraïc and can be performed over any hilbert space, no matter its complexity: vectors, matrices, functions are vectors that can be ortho-normalized.

Deriving the algorithm

The first step

Suppose we want to achieve two things: we should have \(v_i \perp v_j\) and \(\lVert v_i\rVert = 1\) for any \(v_i, v_j\) in the new collection. Set \(v_1 = u_1 / \lVert u_1\rVert\) so that the second criterion is met. The simplest and most intuitive way to transform a set of vectors \(\{u_i\}_i\) into another \(\{v_i\}_i\) is by a linear map. Let us try to construct \(v_2\) as a linear combination of \(u_2\) and \(v_1\): \(v_2 = a\,u_2 + bv_1\). We must have

\begin{align} \langle v_1, v_2\rangle & = a\langle v_1, u_2\rangle + b\langle v_2, v_2\rangle \\ & = a\langle v_1, u_2\rangle + b = 0 \end{align}

So \(b\) must be proportional to \(-\langle v_1, u_2\rangle\) the projection of the new vector \(u_2\) onto \(v_1\). Further, if we want that \(\langle v_2, v_2\rangle = 1\) we must set \(a= 1/\lVert u_2 \rVert\) and thus $$b = -a\langle v_1, u_2\rangle =\; – \frac{\langle v_1, u_2\rangle}{\lVert u_2 \rVert}$$

Iterating the process

Suppose we have applied the process \(i-1\) times, so that we have an orthonormal family of \(i-1\) vectors \(\{v_j\}_j\) and a new vector \(u_i\). We wish to find a linear combination again:

$$ v_i = \gamma_0u_i + \sum_{j=1}^{i-1}\gamma_{j}v_j $$

If we solve for \(1 \leq k \leq i-1\) the equation \(\langle v_i, v_k\rangle = 0\), because we have \(\langle v_j, v_k\rangle = 0\) for any \(j\neq k\), we obtain that

$$ \langle v_i, v_k\rangle = \gamma_0\langle u_i, v_k\rangle + \gamma_k\langle v_k, v_k\rangle = \gamma_0\langle u_i, v_k\rangle + \gamma_k = 0 $$

this condition, together with the unit norm constraint, gives solutions

$$ \begin{cases} & \gamma_0 = \dfrac{1}{\lVert u_i \rVert} \\ & \\ & \gamma_k =\; -\gamma_0\langle u_i, v_k\rangle =\; -\dfrac{\langle u_i, v_k\rangle}{\lVert u_i \rVert} \end{cases} $$ we thus obtain the Gram-Schmidt algorithm at step \(i\):

$$ v_i \gets \dfrac{1}{\lVert u_i \rVert}\,u_i -\,\sum_{j=1}^{i-1}\dfrac{\langle u_i, v_j\rangle}{\lVert u_i \rVert}\,v_j $$

The QR decomposition

Assume the vectors \(u_1, \dots u_n\) are the columns \(a_j\in\mathbb{R}^n\) of a matrix \(\mathrm{A} = \left[ a_{ij} \right]_{i,j = 1}^n\). We can apply the algorithm to the vectors \(a_j\) with the inner product \(\langle x, y \rangle \triangleq x^\mathsf{T} y\). Denote \(q_1, \dots, q_n\) the ON family obtained. We can then revert the Gram-Schmidt equations to get

\begin{align} a_i = \lVert a_i\rVert\cdot\sum_{j=1}^i \langle a_i, q_j \rangle q_j &= \sum_{j=1}^i \lVert a_i\rVert\cdot\langle a_i, q_j \rangle q_j \\ &= \sum_{j=1}^i r_{ji}q_j = \left[\mathrm{QR}\right]_i \end{align} With \(\mathrm{Q}\) a unitary matrix and \(\mathrm{R}\) upper triangular, as \(r_{ji} = 0\) for \(j > i\).

Linear dependence

Finally, notice that if we have linearly dependent vectors in the original family \(\{u_i\}_i\), the Gram-Schmidt update becomes \(0\) after the rank of the original family is reached. If \(\{u_i\}_{i=1}^n\) has rank \(p\), then the ON family is

$$\{v_j\}_{j=1}^n = \{ v_1, \dots, v_p, 0, \dots, 0 \}$$