Let $X$ be a Gaussian vector over $\mathbb{R}^n$ with mean $\mu$ and covariance matrix $\Sigma$. Split $X$ in two blocks, say $X = (X_1, X_2)$, with respective sizes $n_1$ and $n_2$ (here $n = n_1 + n_2$). What is the conditional distribution of $X_2$ given $X_1$? First, let us split the mean and covariance of $X$ into the corresponding blocks:
$$
\mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix},
\qquad
\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix},
\tag{1}
$$
so that for example $X_1$ is a Gaussian with mean $\mu_1$ and covariance $\Sigma_{11}$. Obviously, since $\Sigma$ is symmetric, $\Sigma_{21} = \Sigma_{12}^\top$.
Theorem. The distribution of $X_2$ given $\{X_1 = x_1\}$ is a Gaussian random variable with mean
$$
\mu_{2|1} = \mu_2 + \Sigma_{21}\Sigma_{11}^{-1}(x_1 - \mu_1)
\tag{2}
$$
and with covariance
$$
\Sigma_{2|1} = \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}.
\tag{3}
$$
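As a quick numerical illustration, the theorem translates directly into a few lines of NumPy. In the sketch below, the dimensions, mean, covariance, and observed value $x_1$ are all made up for the example; it computes the conditional mean $\mu_2 + \Sigma_{21}\Sigma_{11}^{-1}(x_1 - \mu_1)$ and the conditional covariance $\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}$:

```python
import numpy as np

# Toy example: n = 3, split with n1 = 2, n2 = 1 (all numbers made up).
mu = np.array([0.0, 1.0, 2.0])
Sigma = np.array([[2.0, 0.5, 0.3],
                  [0.5, 1.0, 0.2],
                  [0.3, 0.2, 1.5]])
n1 = 2
mu1, mu2 = mu[:n1], mu[n1:]
S11, S12 = Sigma[:n1, :n1], Sigma[:n1, n1:]
S21, S22 = Sigma[n1:, :n1], Sigma[n1:, n1:]

x1 = np.array([0.2, 0.8])  # hypothetical observed value of X1

# Conditional mean and covariance of X2 given X1 = x1.
cond_mean = mu2 + S21 @ np.linalg.solve(S11, x1 - mu1)
cond_cov = S22 - S21 @ np.linalg.solve(S11, S12)
```

Using `np.linalg.solve` instead of explicitly inverting $\Sigma_{11}$ is both faster and numerically safer.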
Let $f$ be the joint density for $X = (X_1, X_2)$, namely
$$
f(x_1, x_2) = \frac{1}{(2\pi)^{n/2}\sqrt{\det\Sigma}}
\exp\Bigl(-\tfrac{1}{2}(x-\mu)^\top \Sigma^{-1} (x-\mu)\Bigr),
\qquad x = (x_1, x_2).
\tag{4}
$$
It is well known that the conditional distribution of $X_2$ given $\{X_1 = x_1\}$ has density
$$
f(x_2 \mid x_1) = \frac{f(x_1, x_2)}{f_1(x_1)},
\tag{5}
$$
where $f_1$ is the density of $X_1$.
We could perform this exact computation and find the claim of the theorem, but the calculation is tedious. To proceed, we need to find the expression of the inverse of $\Sigma$. That is doable, and indeed the famous Schur formulas tell us that
$$
\Sigma^{-1} =
\begin{pmatrix}
\Sigma_{11}^{-1} + \Sigma_{11}^{-1}\Sigma_{12} S^{-1} \Sigma_{21}\Sigma_{11}^{-1} & -\Sigma_{11}^{-1}\Sigma_{12} S^{-1} \\
- S^{-1}\Sigma_{21}\Sigma_{11}^{-1} & S^{-1}
\end{pmatrix},
\tag{6}
$$
where $S$ is called the Schur complement of the first block of $\Sigma$,
$$
S = \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}.
\tag{7}
$$
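The block-inversion formula is easy to check numerically. A small NumPy sketch (the matrix below is an arbitrary random positive-definite covariance, chosen only for illustration) assembles the block inverse and compares it with a direct inversion:

```python
import numpy as np

rng = np.random.default_rng(0)
# Arbitrary random symmetric positive-definite covariance, sizes n1 = 2, n2 = 3.
B = rng.standard_normal((5, 5))
Sigma = B @ B.T + 5.0 * np.eye(5)
n1 = 2
S11, S12 = Sigma[:n1, :n1], Sigma[:n1, n1:]
S21, S22 = Sigma[n1:, :n1], Sigma[n1:, n1:]

S = S22 - S21 @ np.linalg.solve(S11, S12)   # Schur complement of the first block
S11inv = np.linalg.inv(S11)
Sinv = np.linalg.inv(S)

# Assemble the inverse of Sigma block by block, following the Schur formula.
block_inverse = np.block([
    [S11inv + S11inv @ S12 @ Sinv @ S21 @ S11inv, -S11inv @ S12 @ Sinv],
    [-Sinv @ S21 @ S11inv,                        Sinv],
])

ok = np.allclose(block_inverse, np.linalg.inv(Sigma))
```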
We immediately recognize (3). By carefully reorganizing the terms inside the exponential, we would readily find that $f(x_2 \mid x_1)$ is proportional to
$$
\exp\Bigl(-\tfrac{1}{2}\bigl(x_2 - \mu_2 - \Sigma_{21}\Sigma_{11}^{-1}(x_1 - \mu_1)\bigr)^\top S^{-1} \bigl(x_2 - \mu_2 - \Sigma_{21}\Sigma_{11}^{-1}(x_1 - \mu_1)\bigr)\Bigr),
\tag{8}
$$
hence the theorem would be proved.
I find this method overly computational, and I never remember the block-inversion formula (6).
Instead, there is a simpler, more conceptual path: observe that $\log f(x_1, x_2)$ is a quadratic function of $(x_1, x_2)$, hence when $x_1$ is fixed, $x_2 \mapsto \log f(x_1, x_2)$ is still a quadratic function of $x_2$. But obviously, log-quadratic probability densities are precisely Gaussian densities. We just proved that
$$
\mathrm{Law}(X_2 \mid X_1 = x_1) = \mathcal{N}\bigl(\mu_{2|1}(x_1), \Sigma_{2|1}(x_1)\bigr)
\quad \text{for some } \mu_{2|1}(x_1) \text{ and } \Sigma_{2|1}(x_1).
\tag{9}
$$
Hence, all we have to do is to compute the conditional mean and the conditional covariance, namely
$$
\mu_{2|1}(x_1) = \mathbb{E}[X_2 \mid X_1 = x_1]
\quad \text{and} \quad
\Sigma_{2|1}(x_1) = \mathrm{Cov}(X_2 \mid X_1 = x_1).
\tag{10}
$$
To compute (10), there is a clever trick. The idea is to remove from $X_2$ the part which depends on $X_1$, to get something independent of $X_1$. Indeed, we want to find a matrix $A$ such that $X_2 - AX_1$ is independent of $X_1$. Since $X_1$ and $X_2 - AX_1$ are jointly Gaussian, they only need to be decorrelated, that is $\mathrm{Cov}(X_2 - AX_1, X_1) = 0$, which translates into $\Sigma_{21} - A\Sigma_{11} = 0$, hence
$$
A = \Sigma_{21}\Sigma_{11}^{-1},
\tag{11}
$$
and, for future reference,
$$
\mathrm{Cov}(X_2 - AX_1)
= \Sigma_{22} - A\Sigma_{12} - \Sigma_{21}A^\top + A\Sigma_{11}A^\top
= \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}.
\tag{12}
$$
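The decorrelation can also be seen empirically. A sketch (with an arbitrary toy mean and covariance) samples $X$, forms $Y = X_2 - AX_1$, and checks that the empirical cross-covariance of $Y$ with $X_1$ vanishes, while the empirical covariance of $Y$ approaches the Schur complement $\Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}$:

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))
Sigma = B @ B.T + 4.0 * np.eye(4)   # arbitrary SPD covariance, n1 = n2 = 2
mu = np.array([1.0, -1.0, 0.5, 2.0])
n1 = 2
S11, S12 = Sigma[:n1, :n1], Sigma[:n1, n1:]
S21, S22 = Sigma[n1:, :n1], Sigma[n1:, n1:]

A = S21 @ np.linalg.inv(S11)        # decorrelating matrix A = S21 S11^{-1}

X = rng.multivariate_normal(mu, Sigma, size=200_000)
X1, X2 = X[:, :n1], X[:, n1:]
Y = X2 - X1 @ A.T                   # one sample of Y = X2 - A X1 per row

C = np.cov(np.hstack([Y, X1]).T)    # empirical covariance of (Y, X1)
cross = C[:n1, n1:]                 # empirical Cov(Y, X1), should be ~0
cov_Y = C[:n1, :n1]                 # empirical Cov(Y)
schur = S22 - S21 @ np.linalg.solve(S11, S12)
```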
Now, we can compute the conditional mean: since $X_2 - AX_1$ is independent of $X_1$,
$$
\mathbb{E}[X_2 \mid X_1]
= \mathbb{E}[X_2 - AX_1 \mid X_1] + AX_1
= \mathbb{E}[X_2 - AX_1] + AX_1
= \mu_2 + \Sigma_{21}\Sigma_{11}^{-1}(X_1 - \mu_1).
\tag{13}
$$
For the conditional covariance, we note that
$$
X_2 - \mathbb{E}[X_2 \mid X_1] = X_2 - AX_1 - (\mu_2 - A\mu_1),
\tag{14}
$$
hence $X_2 - \mathbb{E}[X_2 \mid X_1]$ is independent of $X_1$, and in particular
$$
\mathrm{Cov}(X_2 \mid X_1)
= \mathrm{Cov}(X_2 - AX_1)
= \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12},
\tag{15}
$$
which is the announced conditional covariance. This completes the proof of the theorem.
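As a final sanity check: since the conditional mean is linear in $X_1$, an ordinary least-squares regression of centered samples of $X_2$ on $X_1$ should recover the matrix $A = \Sigma_{21}\Sigma_{11}^{-1}$, and the residual covariance should approach the Schur complement. A sketch with an arbitrary toy covariance:

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((3, 3))
Sigma = B @ B.T + 3.0 * np.eye(3)   # arbitrary SPD covariance, n1 = 2, n2 = 1
mu = np.array([0.5, -0.5, 1.0])
n1 = 2
S11, S12 = Sigma[:n1, :n1], Sigma[:n1, n1:]
S21, S22 = Sigma[n1:, :n1], Sigma[n1:, n1:]
A = S21 @ np.linalg.inv(S11)

X = rng.multivariate_normal(mu, Sigma, size=300_000)
X1c = X[:, :n1] - mu[:n1]           # centered X1 samples
X2c = X[:, n1:] - mu[n1:]           # centered X2 samples

# OLS slope of X2 on X1: recovers A, because E[X2 | X1] - mu2 = A (X1 - mu1).
A_hat = np.linalg.lstsq(X1c, X2c, rcond=None)[0].T

# Residual covariance: approaches the Schur complement, i.e. the conditional covariance.
resid = X2c - X1c @ A_hat.T
cov_resid = np.atleast_2d(np.cov(resid.T))
schur = S22 - S21 @ np.linalg.solve(S11, S12)
```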