Jensen's inequality: going to the limit

April 4, 2021

This post is about a proof of Jensen's Inequality, and more specifically a way to go from finite spaces to countable and uncountable spaces.

Finite case

Recall that a function $f$ is convex on $[a,b]$ iff the inequality

$$ f(ta + (1-t)b) \leq tf(a) + (1-t)f(b) $$

holds for any real $t$ between $0$ and $1$. We can easily generalize this relationship to any convex sum of points (a convex sum is a weighted sum $\sum_{i=1}^n \gamma_ia_i$ for which $\gamma_i \geq 0$ and $\sum_{i=1}^n \gamma_i = 1$) using induction on $n$. The above inequality serves as the initialization($n$ equal to $1$ and $2$). Assuming the inequality holds for $n-1$ points we can notice that

$$ f\left(\sum_{i=1}^n \gamma_ia_i\right) = f\left(\gamma_na_n + \sum_{i=1}^{n-1} \gamma_ia_i\right) = f( ta + (1-t)b) $$

if we set $t = \gamma_n$, $a = a_n$, $b = \sum_{i=1}^{n-1} \gamma_i'a_i = \sum_{i=1}^{n-1} \frac{\gamma_i}{1-\gamma_n} a_i $

thus, by the convexity of $f$, we have

$$ f\left(\sum_{i=1}^n \gamma_ia_i\right) \leq \gamma_n f(a_n) + (1 – \gamma_n)f\left(\sum_{i=1}^{n-1} \gamma_i'a_i\right) $$

however, since the $\gamma_i'$ form a convex combination of $n-1$ points, we can perform the induction step: $f\left(\sum_{i=1}^{n-1} \gamma_i'a_i\right) \leq \sum_{i=1}^{n-1} \gamma_i'f(a_i)$. Thus we obtain that $f\left(\sum_{i=1}^n \gamma_ia_i\right) \leq \sum_{i=1}^n \gamma_if(a_i)$ for any finite convex combination $\{ (\gamma_i,a_i) \}_{i=1}^n$.

Countable case

A countable collection of points $(a_n)_{n\in\mathbb{N}}$ can be combined in a convex fashion using any positive sequence $(\gamma_n)_{n\in\mathbb{N}}$ such that $\sum_{i=1}^\infty \gamma_i= \lim_{n\to\infty}\sum_{i=1}^n \gamma_i = 1$. This implies that the coefficients vanish at infinity: $\gamma_i \to 0$ as $i \to \infty$.

Assuming both $\sum_{i=1}^n \gamma_i a_i$ and $\sum_{i=1}^n \gamma_i f(a_i)$ converge, and assuming $f$ is continuous, we can take limits on both sides of our inequality to yield:

$$ f\left(\sum_{i=1}^\infty \gamma_ia_i\right) \leq \sum_{i=1}^\infty \gamma_if(a_i) $$

Continuous case

Assuming we have a distribution $p$ defined on the real line $\mathbb{R}$, the integral $$\mathbb{E}_p[f(x)] = \int_\mathcal{X}f(x)p(x)\,\mathrm{d}x$$ can be approximated using countable sums. Indeed, the function $p(x)$ can be approximated using a convex combination of dirac deltas: $p(x) \approx p_n(x) = \sum_{i=1}^n\gamma_i \delta_{x_i}(x)$ (formally, convex combinations of diracs are weakly dense in the space of distributions). We have that $\mathbb{E}_{p_n}[f(x)] \to \mathbb{E}_{p}[f(x)]$ as $n \to \infty$ and

$$ f\left(\mathbb{E}_{p_n}[x]\right) = f\left(\sum_{i=1}^n\gamma_i x_i\right) \leq \sum_{i=1}^n\gamma_i f(x_i) = \mathbb{E}_{p_n}[f(x)] $$ and we can let $n$ tend to $\infty$ on each side of the inequality to conclude.