SVD when everything goes right
This post is about a derivation of the Singular Value Decomposition of a matrix \(\mathrm{A}\) when several conditions are met. These conditions are quite restrictive, but make derivations very easy.
Symmetric PD matrices
Let us begin with a symmetric matrix \(\mathrm{S}\in\mathbb{R}^{n\times n}\) which is assumed positive-definite with \(n\) distinct eigenvalues \(\{\lambda_i\}_i\).
It is easy to prove that the eigenvectors \(v_i\) are orthogonal :
$$0 < v_i^\mathsf{T}\mathrm{S}\,v_i = \lambda_i\lVert v_i \rVert^2 $$
gives that all eigenvalues are strictly positive, but we have for \(i \neq j\)
$$ \lambda_i\,v_i^\mathsf{T}v_j = (\mathrm{S}v_i)^\mathsf{T}v_j = v_i^\mathsf{T}\mathrm{S}^\mathsf{T}v_j = v_i^\mathsf{T}\mathrm{S}v_j = \lambda_j\,v_i^\mathsf{T}v_j$$
because \(\mathrm{S} = \mathrm{S}^\mathsf{T}\). As \(\lambda_i \neq \lambda_j\) we must have \(v_i^\mathsf{T}v_j = 0\).
A symmetric PD matrix \(\mathrm{S}\) can therefore be decomposed as \(\mathrm{S} = \mathrm{Q}\mathrm{\Lambda}\mathrm{Q}^\mathsf{T}\) with \(\mathrm{Q}\) orthogonal and \(\mathrm{\Lambda}\) diagonal with positive entries.
Eigendecomposition of a product
Suppose \(\mathrm{A}\) is square, and further that the matrices \(\mathrm{A}^\mathsf{T} \mathrm{A}\) and \(\mathrm{A}\mathrm{A}^\mathsf{T}\) have distinct eigenvalues.
Then, by our derivation in the previous section, we have that each of the two matrices can be decomposed as
$$
\mathrm{A}^\mathsf{T} \mathrm{A} = \mathrm{U}\mathrm{\Sigma}_1\mathrm{U}^\mathsf{T}
\qquad\mathrm{and}\qquad
\mathrm{A}\mathrm{A}^\mathsf{T} = \mathrm{V}\mathrm{\Sigma}_2\mathrm{V}^\mathsf{T}
$$
because these two matrices are always symmetric and positive-definite. Further, the eigenvalues \(\sigma^1_i\) and \(\sigma^2_i\) stored in \(\mathrm{\Sigma}_1\) and \(\mathrm{\Sigma}_2\) are equal: if \(\mathrm{A}^\mathsf{T} \mathrm{A}u_i = \sigma^1_i u_i\) then
$$\mathrm{A}\mathrm{A}^\mathsf{T} (\mathrm{A}u_i) = \sigma^1_i (\mathrm{A}u_i) = \sigma^2_j v_j\qquad\text{for some}\;j
$$
We can therefore always rearrange \(\mathrm{\Sigma}_1\) into \(\mathrm{\Sigma}_2\) by pre and post multiplying by a permutation: \(\mathrm{\Sigma}_1 = \mathrm{P}\mathrm{\Sigma}_2\mathrm{P}^\mathsf{T}\). By applying the same permutation to the matrix of eigenvectors we keep the decomposition
$$ \mathrm{A}\mathrm{A}^\mathsf{T} = \mathrm{V}\mathrm{\Sigma}_2\mathrm{V}^\mathsf{T} = (\mathrm{V}\mathrm{P}^\mathsf{T})\mathrm{P}\mathrm{\Sigma}_2\mathrm{P}^\mathsf{T}(\mathrm{V}\mathrm{P}^\mathsf{T})^\mathsf{T} = \tilde{\mathrm{V}}\mathrm{\Sigma}_1\tilde{\mathrm{V}}^\mathsf{T}$$
we can therefore assume \(\mathrm{\Sigma}_1 = \mathrm{\Sigma}_2 = \mathrm{\Sigma}\) for some appropriate orthogonal matrices \(\mathrm{U}, \mathrm{V}\).
Singular Value Decomposition
Because the eigenvalues stored in \(\mathrm{\Sigma}\) are strictly positive, we can take their square root \(\sqrt{\sigma_i}\) and store them in a new matrix \(\tilde{\mathrm{\Sigma}} = \mathrm{\Sigma}^{1 / 2}\) such that
$$\tilde{\mathrm{\Sigma}}\tilde{\mathrm{\Sigma}} = \tilde{\mathrm{\Sigma}}^\mathsf{T}\tilde{\mathrm{\Sigma}} =
\tilde{\mathrm{\Sigma}}\tilde{\mathrm{\Sigma}}^\mathsf{T} = \mathrm{\Sigma}
$$
the trick behind the SVD is to construct a specififc \(\mathrm{V}\) from the \(\mathrm{U}\)-decomposition. Consider the equalities
\begin{align}
\mathrm{A}\mathrm{A}^\mathsf{T} &= \mathrm{V}\mathrm{\Sigma}\mathrm{V}^\mathsf{T} \\
& = \left(\mathrm{A}\mathrm{U}\tilde{\mathrm{\Sigma}}^{-1}\right)\mathrm{\Sigma}\left(\tilde{\mathrm{\Sigma}}^{-1}\mathrm{U}^\mathsf{T} \mathrm{A}^\mathsf{T}\right) \\
& =
\left(\mathrm{A}\mathrm{U}\tilde{\mathrm{\Sigma}}^{-1}\right)\mathrm{\Sigma}\left(\mathrm{A}\mathrm{U}\tilde{\mathrm{\Sigma}}^{-1}\right)^\mathsf{T}
\end{align}
therefore, by setting \(\mathrm{V} \triangleq \mathrm{A}\mathrm{U}\tilde{\mathrm{\Sigma}}^{-1}\) we can do two things: orthogonally diagonalize the matrix \(\mathrm{A}\mathrm{A}^\mathsf{T}\) and ensure that \(\mathrm{A} = \mathrm{U}^\mathsf{T}\tilde{\mathrm{\Sigma}}\mathrm{V}\).
The matrices \(\mathrm{U}, \mathrm{V}\) are often named differently, so that the SVD decomposition of \(\mathrm{A}\) is
$$ \mathrm{A} = \mathrm{U}\mathrm{\Sigma}\mathrm{V}^\mathsf{T}
$$
for some unitary matrices \(\mathrm{U}, \mathrm{V}\) and a positive diagonal matrix \(\mathrm{\Sigma}\).