The Gram-Schmidt algorithm

March 24, 2021

This post is about the Gram-Schmidt procedure which transforms a set of vectors $u_1, \dots u_n$ into a new set of vectors $v_1, \dots, v_n$ which are orthonormal (ON) for some inner product $\langle\cdot, \cdot \rangle$. This process is purely algebraïc and can be performed over any hilbert space, no matter its complexity: vectors, matrices, functions are vectors that can be ortho-normalized.

Deriving the algorithm

The first step

Suppose we want to achieve two things: we should have $v_i \perp v_j$ and $\lVert v_i\rVert = 1$ for any $v_i, v_j$ in the new collection. Set $v_1 = u_1 / \lVert u_1\rVert$ so that the second criterion is met. The simplest and most intuitive way to transform a set of vectors $\{u_i\}_i$ into another $\{v_i\}_i$ is by a linear map. Let us try to construct $v_2$ as a linear combination of $u_2$ and $v_1$: $v_2 = a\,u_2 + bv_1$. We must have

\begin{align} \langle v_1, v_2\rangle & = a\langle v_1, u_2\rangle + b\langle v_2, v_2\rangle \\ & = a\langle v_1, u_2\rangle + b = 0 \end{align}

So $b$ must be proportional to $-\langle v_1, u_2\rangle$ the projection of the new vector $u_2$ onto $v_1$. Further, if we want that $\langle v_2, v_2\rangle = 1$ we must set $a= 1/\lVert u_2 \rVert$ and thus $$b = -a\langle v_1, u_2\rangle =\; – \frac{\langle v_1, u_2\rangle}{\lVert u_2 \rVert}$$

Iterating the process

Suppose we have applied the process $i-1$ times, so that we have an orthonormal family of $i-1$ vectors $\{v_j\}_j$ and a new vector $u_i$. We wish to find a linear combination again:

$$ v_i = \gamma_0u_i + \sum_{j=1}^{i-1}\gamma_{j}v_j $$

If we solve for $1 \leq k \leq i-1$ the equation $\langle v_i, v_k\rangle = 0$, because we have $\langle v_j, v_k\rangle = 0$ for any $j\neq k$, we obtain that

$$ \langle v_i, v_k\rangle = \gamma_0\langle u_i, v_k\rangle + \gamma_k\langle v_k, v_k\rangle = \gamma_0\langle u_i, v_k\rangle + \gamma_k = 0 $$

this condition, together with the unit norm constraint, gives solutions

$$ \begin{cases} & \gamma_0 = \dfrac{1}{\lVert u_i \rVert} \\ & \\ & \gamma_k =\; -\gamma_0\langle u_i, v_k\rangle =\; -\dfrac{\langle u_i, v_k\rangle}{\lVert u_i \rVert} \end{cases} $$ we thus obtain the Gram-Schmidt algorithm at step $i$:

$$ v_i \gets \dfrac{1}{\lVert u_i \rVert}\,u_i -\,\sum_{j=1}^{i-1}\dfrac{\langle u_i, v_j\rangle}{\lVert u_i \rVert}\,v_j $$

The QR decomposition

Assume the vectors $u_1, \dots u_n$ are the columns $a_j\in\mathbb{R}^n$ of a matrix $\mathrm{A} = \left[ a_{ij} \right]_{i,j = 1}^n$. We can apply the algorithm to the vectors $a_j$ with the inner product $\langle x, y \rangle \triangleq x^\mathsf{T} y$. Denote $q_1, \dots, q_n$ the ON family obtained. We can then revert the Gram-Schmidt equations to get

\begin{align} a_i = \lVert a_i\rVert\cdot\sum_{j=1}^i \langle a_i, q_j \rangle q_j &= \sum_{j=1}^i \lVert a_i\rVert\cdot\langle a_i, q_j \rangle q_j \\ &= \sum_{j=1}^i r_{ji}q_j = \left[\mathrm{QR}\right]_i \end{align} With $\mathrm{Q}$ a unitary matrix and $\mathrm{R}$ upper triangular, as $r_{ji} = 0$ for $j > i$.

Linear dependence

Finally, notice that if we have linearly dependent vectors in the original family $\{u_i\}_i$, the Gram-Schmidt update becomes $0$ after the rank of the original family is reached. If $\{u_i\}_{i=1}^n$ has rank $p$, then the ON family is

$$\{v_j\}_{j=1}^n = \{ v_1, \dots, v_p, 0, \dots, 0 \}$$