Bayesian linear regression in three languages
We have already covered the basics of bayesian linear regression in a previous post. Let's now implement our own regression routine in three languages !
Python
From prior to posterior
We only have to implement two routines: one for the posterior mean and variance, and one for the predictive distribution. The code for the posterior is pretty straightfoward when implemented naïvely:
def posterior(sigma_noise: Union[float, jnp.ndarray],
weight_covariance: Union[List[List[float]], jnp.ndarray],
X: jnp.ndarray,
y: jnp.ndarray) -> Tuple[jnp.ndarray, jnp.ndarray]:
'''
Returns the posterior distribution on the weights of the linear model y ~ x
'''
Q_inv = weight_covariance + X.T @ X / sigma_noise ** 2
Q = jnp.linalg.inv(Q_inv)
mean = Q @ X.T @ y / sigma_noise ** 2
return mean, Q
Let us visualize priors and posteriors for \(3\) different random initializations of the weight prior covariance \(\pmb{\Sigma}\): the three priors are plotted below:
After computing each posterior, we observe that they all neatly converge to similar distributions:
so that the prior \(\pmb{\Sigma}\) is not crucial as long as it is not extremely biased.
Predictive distribution
Recall that if the posterior is \(\mathcal{N}(\pmb{\mu}, \mathbf{Q})\) then the predictive distribution \(f_\star \mid \mathbf{x}_\star\) is \(\mathcal{N}(\pmb{\mu}^\mathsf{T}\mathbf{x}_\star, \mathbf{x}_\star^\mathsf{T}\mathbf{Q}\mathbf{x}_\star)\). Because our input is two-dimensional, we cannot easily plot the predictive distribution as a function of the test input \(\mathbf{x}_\star\). Instead, we'll plot the distribution of the output as \(\mathbf{x}_\star\) changes on the unit circle: \(\mathbf{x}_\star = \left[\cos\theta\;\sin\theta\right]\) for \(\theta\in [-\pi, \pi]\).
The results for our three initializations are illustrated below
notice that although our model is linear in \(\mathbf{x}\), it is not linear as a function of \(\theta\). The three models are similar as expected.
Julia
The code snippet to implement the posterior is similar:
using LinearAlgebra
function posterior(
sigma_noise::Float64,
weight_covariance::Matrix{Float64},
X::Matrix{Float64},
y::Matrix{Float64}
)::Tuple{Matrix{Float64}, Matrix{Float64}}
"""
Computes posterior mean and covariance for the
linear regression weights.
"""
Q_inv = weight_covariance + transpose(X) * X / ( sigma_noise ^ 2 )
Q = inv(Q_inv)
mean = Q * transpose(X) * y / ( sigma_noise ^ 2 )
return mean, Q
end
To be added.
R
In R, we can update the posterior through a function:
posterior <- function(
sigma_noise,
weight_covariance,
X,
y
) {
Q_inv <- weight_covariance + t(X) %*% X / (sigma_noise ^ 2)
Q <- solve(Q_inv)
mean <- Q %*% t(X) %*% y / (sigma_noise ^ 2)
results <- list("mean" = mean, "cov" = Q)
return(results)
}
To be added.