Jensen's inequality: going to the limit
This post is about a proof of Jensen's Inequality, and more specifically a way to go from finite spaces to countable and uncountable spaces.
Finite case
Recall that a function \(f\) is convex on \([a,b]\) iff the inequality
$$ f(ta + (1-t)b) \leq tf(a) + (1-t)f(b) $$
holds for any real \(t\) between \(0\) and \(1\).
We can easily generalize this relationship to any convex sum of points (a convex sum is a weighted sum \(\sum_{i=1}^n \gamma_ia_i\) for which \(\gamma_i \geq 0\) and \(\sum_{i=1}^n \gamma_i = 1\)) using induction on \(n\).
The above inequality serves as the initialization(\(n\) equal to \(1\) and \(2\)). Assuming the inequality holds for \(n-1\) points we can notice that
$$
f\left(\sum_{i=1}^n \gamma_ia_i\right) = f\left(\gamma_na_n + \sum_{i=1}^{n-1} \gamma_ia_i\right) = f( ta + (1-t)b)
$$
if we set \(t = \gamma_n\), \(a = a_n\), \(b = \sum_{i=1}^{n-1} \gamma_i'a_i = \sum_{i=1}^{n-1} \frac{\gamma_i}{1-\gamma_n} a_i \)
thus, by the convexity of \(f\), we have
$$
f\left(\sum_{i=1}^n \gamma_ia_i\right) \leq \gamma_n f(a_n) + (1 – \gamma_n)f\left(\sum_{i=1}^{n-1} \gamma_i'a_i\right)
$$
however, since the \(\gamma_i'\) form a convex combination of \(n-1\) points, we can perform the induction step: \(f\left(\sum_{i=1}^{n-1} \gamma_i'a_i\right) \leq \sum_{i=1}^{n-1} \gamma_i'f(a_i)\).
Thus we obtain that \(f\left(\sum_{i=1}^n \gamma_ia_i\right) \leq \sum_{i=1}^n \gamma_if(a_i)\) for any finite convex combination \(\{ (\gamma_i,a_i) \}_{i=1}^n\).
Countable case
A countable collection of points \((a_n)_{n\in\mathbb{N}}\) can be combined in a convex fashion using any positive sequence \((\gamma_n)_{n\in\mathbb{N}}\) such that \(\sum_{i=1}^\infty \gamma_i= \lim_{n\to\infty}\sum_{i=1}^n \gamma_i = 1\).
This implies that the coefficients vanish at infinity: \(\gamma_i \to 0\) as \(i \to \infty\).
Assuming both \(\sum_{i=1}^n \gamma_i a_i\) and \(\sum_{i=1}^n \gamma_i f(a_i)\) converge, and assuming \(f\) is continuous, we can take limits on both sides of our inequality to yield:
$$
f\left(\sum_{i=1}^\infty \gamma_ia_i\right) \leq \sum_{i=1}^\infty \gamma_if(a_i)
$$
Continuous case
Assuming we have a distribution \(p\) defined on the real line \(\mathbb{R}\), the integral
$$\mathbb{E}_p[f(x)] = \int_\mathcal{X}f(x)p(x)\,\mathrm{d}x$$
can be approximated using countable sums.
Indeed, the function \(p(x)\) can be approximated using a convex combination of dirac deltas: \(p(x) \approx p_n(x) = \sum_{i=1}^n\gamma_i \delta_{x_i}(x)\) (formally, convex combinations of diracs are weakly dense in the space of distributions).
We have that \(\mathbb{E}_{p_n}[f(x)] \to \mathbb{E}_{p}[f(x)]\) as \(n \to \infty\) and
$$
f\left(\mathbb{E}_{p_n}[x]\right) = f\left(\sum_{i=1}^n\gamma_i x_i\right) \leq \sum_{i=1}^n\gamma_i f(x_i) = \mathbb{E}_{p_n}[f(x)]
$$
and we can let \(n\) tend to \(\infty\) on each side of the inequality to conclude.