Expectation of a quadratic form: the trace trick

Suppose one wants to compute the expected value \(\mathbb{E}_X[X]\) of a quadratic form \(f(x) \triangleq x^\mathsf{T}\mathrm{A} x\). Here \(x\) is a \(n\) dimensional vector and \(\mathrm{A}\) a \(n\times n\) matrix. This post introduces a popular trick to compute the expectation of the quadratic form \(f\).

Trace definition and property

Recall the definition of the trace for a matrix \(\mathrm{X} = [\mathrm{X}_{ij}]_{i,j = 1}^n\): the trace is the sum of the diagonal elements $$ \mathrm{Tr}[\mathrm{X}] = \sum_{i=1}^n \mathrm{X}_{ii} $$ from which it is easy to derive that \(\mathrm{Tr}[\cdot]\) is linear. The trace of a \(1\times 1\) matrix is simply the value of its only element \(\mathrm{X}_{11}\). One easily finds, using any usual definition for integrals, that as a linear map one has

$$\int \mathrm{Tr}[g(\mathrm{X})]\,\mathrm{dX} = \mathrm{Tr}\left[ \int g(\mathrm{X})\,\mathrm{dX}\right] $$

Another important property is the cyclic nature of the trace: \(\mathrm{Tr}[\mathrm{XY}] = \mathrm{Tr}[\mathrm{YX}]\) which can be generalized to three matrices $$ \mathrm{Tr}[\mathrm{XYZ}] = \mathrm{Tr}[\mathrm{ZXY}] = \mathrm{Tr}[\mathrm{YZX}] $$ or any finite product of \(m\) matrices.

The trace trick

Suppose that \(X\) is a random variable distributed according to the density \(p\). Suppose the integral \(\Sigma = \mathrm{Cov}[X,X] = \mathbb{E}[XX^\mathsf{T}]\) exists. The expectation of the quadratic \(f\) is then equal to \(\mathrm{Tr}[\mathrm{A}\Sigma]\).

To prove this, we first use the fact that \(f(x)\) is a scalar, and therefore is equal to \(\mathrm{Tr}[f(x)]\). Then, by the cyclic property of the trace \(f(x) = \mathrm{Tr}[\mathrm{A}XX^\mathsf{T}]\). Finally, as the trace is linear, the trace of the expectation is equal to the expectation of the trace: $$ \mathbb{E}_X[\mathrm{Tr}[\mathrm{A}XX^\mathsf{T}]] = \mathrm{Tr}[\mathrm{A}\mathbb{E}_X[XX^\mathsf{T}]] $$

which concludes the proof.