Jensen's inequality: going to the limit

This post is about a proof of Jensen's Inequality, and more specifically a way to go from finite spaces to countable and uncountable spaces.

Finite case

Recall that a function \(f\) is convex on \([a,b]\) iff the inequality

$$ f(ta + (1-t)b) \leq tf(a) + (1-t)f(b) $$

holds for any real \(t\) between \(0\) and \(1\). We can easily generalize this relationship to any convex sum of points (a convex sum is a weighted sum \(\sum_{i=1}^n \gamma_ia_i\) for which \(\gamma_i \geq 0\) and \(\sum_{i=1}^n \gamma_i = 1\)) using induction on \(n\). The above inequality serves as the initialization(\(n\) equal to \(1\) and \(2\)). Assuming the inequality holds for \(n-1\) points we can notice that

$$ f\left(\sum_{i=1}^n \gamma_ia_i\right) = f\left(\gamma_na_n + \sum_{i=1}^{n-1} \gamma_ia_i\right) = f( ta + (1-t)b) $$

if we set \(t = \gamma_n\), \(a = a_n\), \(b = \sum_{i=1}^{n-1} \gamma_i'a_i = \sum_{i=1}^{n-1} \frac{\gamma_i}{1-\gamma_n} a_i \)

thus, by the convexity of \(f\), we have

$$ f\left(\sum_{i=1}^n \gamma_ia_i\right) \leq \gamma_n f(a_n) + (1 – \gamma_n)f\left(\sum_{i=1}^{n-1} \gamma_i'a_i\right) $$

however, since the \(\gamma_i'\) form a convex combination of \(n-1\) points, we can perform the induction step: \(f\left(\sum_{i=1}^{n-1} \gamma_i'a_i\right) \leq \sum_{i=1}^{n-1} \gamma_i'f(a_i)\). Thus we obtain that \(f\left(\sum_{i=1}^n \gamma_ia_i\right) \leq \sum_{i=1}^n \gamma_if(a_i)\) for any finite convex combination \(\{ (\gamma_i,a_i) \}_{i=1}^n\).

Countable case

A countable collection of points \((a_n)_{n\in\mathbb{N}}\) can be combined in a convex fashion using any positive sequence \((\gamma_n)_{n\in\mathbb{N}}\) such that \(\sum_{i=1}^\infty \gamma_i= \lim_{n\to\infty}\sum_{i=1}^n \gamma_i = 1\). This implies that the coefficients vanish at infinity: \(\gamma_i \to 0\) as \(i \to \infty\).

Assuming both \(\sum_{i=1}^n \gamma_i a_i\) and \(\sum_{i=1}^n \gamma_i f(a_i)\) converge, and assuming \(f\) is continuous, we can take limits on both sides of our inequality to yield:

$$ f\left(\sum_{i=1}^\infty \gamma_ia_i\right) \leq \sum_{i=1}^\infty \gamma_if(a_i) $$

Continuous case

Assuming we have a distribution \(p\) defined on the real line \(\mathbb{R}\), the integral $$\mathbb{E}_p[f(x)] = \int_\mathcal{X}f(x)p(x)\,\mathrm{d}x$$ can be approximated using countable sums. Indeed, the function \(p(x)\) can be approximated using a convex combination of dirac deltas: \(p(x) \approx p_n(x) = \sum_{i=1}^n\gamma_i \delta_{x_i}(x)\) (formally, convex combinations of diracs are weakly dense in the space of distributions). We have that \(\mathbb{E}_{p_n}[f(x)] \to \mathbb{E}_{p}[f(x)]\) as \(n \to \infty\) and

$$ f\left(\mathbb{E}_{p_n}[x]\right) = f\left(\sum_{i=1}^n\gamma_i x_i\right) \leq \sum_{i=1}^n\gamma_i f(x_i) = \mathbb{E}_{p_n}[f(x)] $$ and we can let \(n\) tend to \(\infty\) on each side of the inequality to conclude.