

Two-level

Here we consider the full two-level model laid out in equations 2 and 3, applying the same ideas as in the previous section to infer on the second-level GLM height parameters $ \beta _g$. We substitute into the posterior for the full two-level model the summary result of the first-level model derived in the previous section. This provides a way of inferring on the full two-level model using just the summary result of the first level, i.e. without re-using the data $ Y$.

Considering equations 2 and 3, the full joint posterior for the two-level model is:

$\displaystyle p(\beta_g,\sigma_g^2,\vec{\beta_K},\vec{\sigma_K^2}\vert Y)$ $\displaystyle \propto$ $\displaystyle \prod_k \left\{ p(Y_k\vert\beta_k,\sigma_k^2)\right\} p(\vec{\beta_K}\vert\beta_g,\sigma_g^2)\, p(\beta_g,\sigma_g^2,\vec{\sigma_K^2}),$     (12)

where $ \vec{\sigma_K^2}$ is the $ (K\times 1)$ vector of first level variances $ \sigma_k^2$, and $ \vec{\beta_K}$ is the $ (K\times 1)$ vector of first level regression parameters $ \beta_k$ (for $ k=1\ldots K$). We set the prior to be the Berger-Bernardo reference prior for this full two-level model (see section 3.2):
$\displaystyle p(\beta_g,\sigma_g^2,\vec{\sigma_K^2})$ $\displaystyle =$ $\displaystyle \frac{1}{\sigma_g^2} \prod_k \frac{1}{\sigma_k^2}.$     (13)
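
To make the structure of equation 12 concrete, the following minimal sketch (Python, illustrative only and not part of this report) simulates data from a two-level GLM, assuming equations 2 and 3 take the standard form $ Y_k = X_k\beta_k + e_k$ with $ e_k \sim \mathcal{N}(0,\sigma_k^2 I)$ at the first level and $ \beta_k = X_{gk}\beta_g + \eta_k$ with $ \eta_k \sim \mathcal{N}(0,\sigma_g^2)$ at the second level; the dimensions, designs and variances used are arbitrary choices of ours.

\begin{verbatim}
# Illustrative Python sketch (not part of this report): simulate data from a
# two-level GLM of the assumed form
#   first level :  Y_k = X_k beta_k + e_k,     e_k ~ N(0, sigma_k^2 I)
#   second level:  beta_k = X_gk beta_g + n_k, n_k ~ N(0, sigma_g^2)
# All dimensions, designs and variances below are arbitrary choices.
import numpy as np

rng = np.random.default_rng(0)
K, T = 12, 100                      # K first-level analyses, T data points each
beta_g, sigma_g2 = np.array([2.0]), 0.5
X_g = np.ones((K, 1))               # second-level design: simple group mean
sigma_k2 = rng.uniform(0.5, 2.0, size=K)

X, Y = [], []
for k in range(K):
    X_k = rng.standard_normal((T, 1))             # first-level design (one EV)
    beta_k = X_g[k] @ beta_g + np.sqrt(sigma_g2) * rng.standard_normal()
    Y.append(X_k[:, 0] * beta_k + np.sqrt(sigma_k2[k]) * rng.standard_normal(T))
    X.append(X_k)

# OLS estimates and their variances, standing in for the first-level summaries
# mu_{beta_k} and sigma^2_{beta_k} Sigma_{beta_k} used below.
mu_bk = np.array([np.linalg.lstsq(X[k], Y[k], rcond=None)[0][0] for k in range(K)])
var_bk = sigma_k2 / np.array([X[k][:, 0] @ X[k][:, 0] for k in range(K)])
print(np.round(mu_bk, 2), np.round(var_bk, 3))
\end{verbatim}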

Note that this model specification gives the posterior distribution not only on the second-level parameters $ (\beta_g,\sigma^2_g)$ but also on the parameters of all of the first-level models $ (\vec{\beta_K},\vec{\sigma^2_K})$. However, if we are only interested in the top/second-level parameters, we may substitute the summary result from the first level into this two-level model and marginalise over $ (\vec{\beta_K},\vec{\sigma^2_K})$ (see appendix 10.6). This shows that the marginal distribution on $ (\beta_g,\sigma^2_g)$ does not depend on the original data directly, but only on the summary parameters from the first level, i.e. $ \mu_{\beta_k}$, $ \sigma_{\beta_k}^2\Sigma_{\beta_k}$ and $ \nu_{\beta_k}$:
$\displaystyle p(\beta_g,\sigma_g^2,\vec{\tau_K}\vert Y)$ $\displaystyle \propto$ $\displaystyle \prod_k \left\{ \mathcal{N}(\mu_{\beta_k};X_{gk}\beta_g,(\sigma_{\beta_k}^2\Sigma_{\beta_k})/\tau_k+\sigma_g^2I)\, \Gamma(\tau_{k};\nu_{\beta_k}/2,\nu_{\beta_k}/2) \right\}1/\sigma_g^2$     (14)

where $ X_{gk}$ is the $ k^{th}$ row vector of the second-level design matrix $ X_{g}$, and $ \vec{\tau_K}$ is a $ (K\times 1)$ vector of latent variables $ \tau_k$ for $ k=1\ldots K$ introduced for mathematical convenience (see appendix 10.6).
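The latent variables $ \tau_k$ implement the standard representation of a non-central t-distribution as a scale mixture of Normals: if $ \tau \sim \Gamma(\nu/2,\nu/2)$ and $ x\vert\tau \sim \mathcal{N}(m,s^2/\tau)$, then marginally $ x$ has a non-central t-distribution with $ \nu$ degrees of freedom. The short numerical check below (Python, illustrative only; the variable names and values are ours) confirms this identity, which is what allows the t-distributed first-level summaries to enter equation 14 through the Normal and Gamma factors.

\begin{verbatim}
# Illustrative Python check of the scale-mixture identity behind equation 14:
# if tau ~ Gamma(nu/2, rate nu/2) and x | tau ~ N(m, s^2/tau), then marginally
# x ~ t_nu(m, s^2).  The values of m, s and nu are arbitrary.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
m, s, nu, n = 0.7, 1.3, 8.0, 200_000

tau = rng.gamma(shape=nu / 2, scale=2.0 / nu, size=n)   # Gamma(nu/2, nu/2)
x = rng.normal(loc=m, scale=s / np.sqrt(tau))           # Normal with variance s^2/tau

direct = stats.t.rvs(df=nu, loc=m, scale=s, size=n, random_state=rng)
print(np.quantile(x,      [0.05, 0.5, 0.95]).round(3))  # quantiles agree closely
print(np.quantile(direct, [0.05, 0.5, 0.95]).round(3))
\end{verbatim}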

A special case of equation 14 arises when the variances $ \sigma^2_{\beta_k}\Sigma_{\beta_k}$ on the first-level GLM parameters are known with very high degrees of freedom ( $ \nu_{\beta_k} \to \infty$). This is equivalent to $ p(\beta_k\vert Y_k)$ in equation 10 being a Normal distribution instead of a t-distribution. In this case, the prior distribution on $ \vec{\tau_K}$ reduces to a delta function centered on $ \vec{\tau_K}=\vec{1}$ and the joint posterior distribution on the second-level parameters reduces to:

$\displaystyle p(\beta_g,\sigma_g^2\vert Y)$ $\displaystyle \propto$ $\displaystyle \prod_k \left\{ \mathcal{N}(\mu_{\beta_k};X_{gk}\beta_g,(\sigma_{\beta_k}^2\Sigma_{\beta_k})+\sigma_g^2I) \right\}1/\sigma_g^2.$     (15)
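
For the scalar case of a single second-level EV (so that $ X_{gk}$, $ \mu_{\beta_k}$ and $ \sigma_{\beta_k}^2\Sigma_{\beta_k}$ all reduce to scalars), equation 15 can be evaluated directly on a grid. The sketch below (Python, with made-up first-level summary values) does this and then marginalises over $ \sigma_g^2$ numerically; it is an illustration of the form of equation 15 only, not the approximation scheme used later in this report.

\begin{verbatim}
# Illustrative Python evaluation of equation 15 for the scalar case (one
# second-level EV), using made-up first-level summary values.
import numpy as np

mu_bk  = np.array([1.8, 2.3, 2.0, 1.5, 2.6, 2.1])        # hypothetical mu_{beta_k}
var_bk = np.array([0.10, 0.08, 0.15, 0.20, 0.05, 0.12])  # hypothetical variances
X_gk   = np.ones_like(mu_bk)                             # group-mean second-level design

def log_post(beta_g, sigma_g2):
    """log p(beta_g, sigma_g^2 | Y) up to an additive constant (equation 15)."""
    v = var_bk + sigma_g2                                 # marginal variance per analysis
    ll = -0.5 * np.sum(np.log(2 * np.pi * v) + (mu_bk - X_gk * beta_g) ** 2 / v)
    return ll - np.log(sigma_g2)                          # reference prior 1/sigma_g^2

bg_grid  = np.linspace(1.0, 3.0, 201)
sg2_grid = np.linspace(0.01, 1.0, 200)
lp = np.array([[log_post(b, s2) for s2 in sg2_grid] for b in bg_grid])

# Numerically marginalise over sigma_g^2 and locate the posterior mode of beta_g
post_bg = np.trapz(np.exp(lp - lp.max()), sg2_grid, axis=1)
print(bg_grid[post_bg.argmax()])
\end{verbatim}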

Equation 14 (or, in the special case, equation 15) gives us the joint posterior distribution of $ \beta _g$, $ \sigma _g^2$ and $ \vec{\tau_K}$. However, as in the first-level model, we are actually interested in inferring on the marginal distribution over the GLM height parameters, $ \beta _g$. This marginal posterior $ p(\beta_g\vert Y)$ cannot be obtained analytically. Therefore, we consider two approaches: a fast posterior approximation, and a slower but more accurate approach using Markov Chain Monte Carlo (MCMC) sampling. Crucially, in both approaches we assume that $ p(\beta_g\vert Y)$ is a multivariate non-central t-distribution:

$\displaystyle p(\beta_g\vert Y)$ $\displaystyle \propto$ $\displaystyle \int p(\beta_g,\sigma_g^2,\vec{\tau_K}\vert Y) d\sigma_g^2 d\vec{\tau_K}$ (16)
  $\displaystyle \approx$ $\displaystyle \mathcal{T}(\beta_g;\mu_{\beta_g},\sigma_{\beta_g}^2\Sigma_{\beta_g},\nu_{\beta_g})$ (17)

This assumption is crucial to the idea of splitting hierarchies into separate inferences on the different levels for higher-order models, as we shall see in section 4. We shall test the validity of this assumption later. The fast posterior approximation and MCMC approaches are the means by which we obtain the distribution parameters $ \mu_{\beta_g},\sigma_{\beta_g}^2\Sigma_{\beta_g},\nu_{\beta_g}$.
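
As an illustration of this final step, the sketch below (Python; the samples are synthetic stand-ins generated from a known t-distribution, not output of the MCMC scheme described later) shows one way of obtaining the parameters in equation 17 from samples of a single element of $ \beta _g$: a maximum-likelihood fit of a non-central t-distribution, yielding estimates of $ \mu_{\beta_g}$, $ \sigma_{\beta_g}^2$ and $ \nu_{\beta_g}$. The actual fast approximation and MCMC schemes are described in the following sections.

\begin{verbatim}
# Illustrative Python sketch: summarise samples of a single element of beta_g by
# a maximum-likelihood fit of a non-central t-distribution, giving estimates of
# mu_{beta_g}, sigma^2_{beta_g} and nu_{beta_g}.  The "samples" here are synthetic
# stand-ins, not output of the MCMC scheme described later.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
samples = stats.t.rvs(df=10, loc=2.0, scale=0.3, size=20_000, random_state=rng)

nu_hat, mu_hat, scale_hat = stats.t.fit(samples)    # returns (dof, location, scale)
print(f"mu_bg={mu_hat:.3f}  sigma2_bg={scale_hat**2:.4f}  nu_bg={nu_hat:.1f}")
\end{verbatim}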

