Figure 2 shows boxplots of the difference in
z-statistics between those obtained from a long MCMC chain of
200,000 samples and those obtained from the different inference
approaches considered. The intention is to consider the inference
from a very long MCMC time series as a ``gold standard''. To help
validate this assumption the first boxplot (labelled [MCMC])
compares this ``gold standard'' inference with another equally
long MCMC chain but with a different random seed. This allows us
to assess the inaccuracies in the ``gold standard'' due to the
finite length of the MCMC chain. In all four datasets the
difference in z-statistics for this is of the order of .
The second boxplot (labelled [BIDET]) compares our ``gold
standard'' to the inference obtained when we fit the non-central
multivariate t-distribution to the long MCMC chain with a
different random seed. This allows us to validate one of the
strongest assumptions that we make in this paper, namely that the
marginal posterior in equation 14 is a
non-central multivariate t-distribution. This is crucial to the
idea of being able to split hierarchies into inference on
different levels. Making this distributional assumption also
allows us to perform inference on shorter MCMC chains, and gives us some basis
for the fast approximation approach. This assumption is well
supported by these [BIDET] boxplots with the difference in
z-statistics being of the order of for all four datasets.
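As a rough illustration of the [BIDET] step, the sketch below fits a t-distribution to a chain of posterior samples and converts the fitted posterior into a z-statistic. It is a univariate simplification, not the multivariate fit described above, and it assumes that "non-central" means a shifted and scaled Student t. The function name `t_fit_to_z` is hypothetical, and the probability-to-z conversion shown is one common convention rather than necessarily the one used in this paper.

```python
import numpy as np
from scipy import stats


def t_fit_to_z(samples):
    """Fit a location-scale Student t to samples; return (z, dof).

    Assumption: a univariate, shifted and scaled t stands in for the
    multivariate distribution discussed in the text.
    """
    samples = np.asarray(samples, dtype=float)
    # Maximum-likelihood fit; scipy returns (df, loc, scale).
    dof, loc, scale = stats.t.fit(samples)
    # Posterior probability that the parameter is <= 0 under the fitted t.
    p = stats.t.sf(loc / scale, dof)
    # Convert that tail probability to an equivalent standard normal z.
    z = stats.norm.isf(p)
    return z, dof
```

The fitted `dof` returned here is the kind of per-voxel quantity that a histogram of degrees of freedom would summarise.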
Figure 2 also shows boxplots for the fast
approximation approaches. We show boxplots for the upper bound
(labelled [UPPER]) and lower bound (labelled [LOWER]). Of
particular interest is how good these bounds are at actually
bounding the ``gold standard'' [MCMC]. Hence, a third boxplot
(labelled [BOUND]) shows how far outside the bounds the ``gold
standard'' is. This shows a z-statistic difference of up to
for Dataset 2. This z-statistic difference of up to
between
the fast approximation bounds and the [MCMC] ``gold standard''
will be used later as part of the [HYBRID] inference approach (see
section 7.1.1).
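The [BOUND] quantity can be sketched as a signed distance from the interval formed by the two fast approximations. This is a minimal sketch assuming that the lower bound never exceeds the upper bound at any voxel; the function name `bound_excess` and its sign convention are hypothetical.

```python
import numpy as np


def bound_excess(z_gold, z_lower, z_upper):
    """Signed distance by which z_gold falls outside [z_lower, z_upper].

    Zero when the gold standard lies inside the bounds, positive when it
    exceeds the upper bound, negative when it falls below the lower bound.
    Assumes z_lower <= z_upper elementwise.
    """
    z_gold = np.asarray(z_gold, dtype=float)
    z_lower = np.asarray(z_lower, dtype=float)
    z_upper = np.asarray(z_upper, dtype=float)
    above = np.maximum(z_gold - z_upper, 0.0)
    below = np.maximum(z_lower - z_gold, 0.0)
    return above - below
```

Applied to arrays of per-voxel z-statistics, the returned values are what a boxplot of bound violations would display.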
The final boxplot shows the traditional inference approach of
ignoring the known fixed effects variance, estimating the total
mixed effects variance, and using OLS to perform inference
(labelled [OLS]). Because this approach ignores the fixed effects
variance, it serves as the ``gold standard'' for Dataset 1, in
which
. Indeed this is supported by
the boxplot. However, for Datasets 2 and 3,
and varies over
. For these datasets OLS will give unbiased
but very inefficient statistics, as the
information is ignored. These boxplots
illustrate the difference in z-statistics between OLS and the
``gold standard'' due to this inefficiency. In Dataset 4,
is sufficiently small compared to
so
that the differences between [OLS] and [MCMC] are negligible.
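For reference, the [OLS] approach can be sketched as an ordinary least squares fit to the lower-level effect estimates alone, with the total mixed effects variance estimated from the residuals. The names `ols_z`, `y`, `X` and `c` are hypothetical, and the t-to-z conversion is an assumed convention.

```python
import numpy as np
from scipy import stats


def ols_z(y, X, c):
    """OLS z-statistic for contrast c, ignoring lower-level variances.

    y: lower-level effect estimates, X: higher-level design matrix,
    c: contrast vector. Returns (z, dof).
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    c = np.asarray(c, dtype=float)
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    dof = n - p
    # Residual variance estimates the total mixed effects variance.
    sigma2 = resid @ resid / dof
    t = (c @ beta) / np.sqrt(sigma2 * (c @ xtx_inv @ c))
    # Convert the t-statistic to a z-statistic via its tail probability.
    z = stats.norm.isf(stats.t.sf(t, dof))
    return z, dof
```

With a single column of ones as `X` and `c = [1]`, this reduces to a one-sample t-test on the group mean.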
Figure 3 shows the z-statistics obtained for 20 voxels from the 3 Datasets for the inference approaches of [UPPER], [LOWER], [BIDET], and [OLS]. For Dataset 1 the correspondence of [OLS], [LOWER] and [BIDET] is reiterated. For Datasets 2 and 3 the difference between [BIDET] and [OLS] is illustrated, as is the small inaccuracy of the [UPPER] and [LOWER] fast approximation approaches compared with [BIDET].
Figure 4 shows the histograms for the four
different datasets of the degrees of freedom (DOF) obtained at
each voxel from fitting the non-central t-distribution to an MCMC
chain of 200,000 samples from the marginal posterior,
, as part of [BIDET]. For Dataset 1, we know that
the OLS solution is the correct one and that the DOF,
. In Dataset 4,
is sufficiently small compared to
so that the differences between [OLS] and [MCMC] are
negligible and the range of DOF match those found in Dataset 1.
Figure 4 shows that [BIDET] correctly finds the
DOF as being
for the majority of voxels in Dataset 1. However,
for Datasets 2 and 3 the OLS DOF will be
and
respectively. We should not expect [BIDET] to have the same DOF
values as these. Indeed, the histograms show that the DOF obtained
from [BIDET] varies from about these OLS DOF values to values up
to about
or
DOF. Without using [BIDET] there would be no
way of knowing, for a particular voxel, the required DOF.
Figure 5 shows boxplots of the
difference in z-statistics between those obtained from a long MCMC
chain of 200,000 samples and those obtained from using [BIDET] on
MCMC chains of varying sample sizes. This illustrates the need for
an MCMC chain of at least 20,000 samples to achieve accuracies of
the order of in z-statistics.
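The chain-length comparison can be sketched as follows. For brevity this sketch replaces the t-distribution fit with a simple mean-over-standard-deviation summary, which is an assumption and not the [BIDET] procedure itself; the function names are hypothetical. Truncated chains are taken as prefixes of a single chain, which is also an illustrative choice.

```python
import numpy as np


def z_from_chain(samples):
    """Crude z summary of a chain: posterior mean over posterior std.

    Assumption: a normal approximation stands in for the t-fit.
    """
    samples = np.asarray(samples, dtype=float)
    return samples.mean() / samples.std(ddof=1)


def z_error_by_length(reference, chain, lengths):
    """Difference in z between prefixes of `chain` and a long reference."""
    z_ref = z_from_chain(reference)
    chain = np.asarray(chain, dtype=float)
    return {n: z_from_chain(chain[:n]) - z_ref for n in lengths}
```

Collecting these differences over all voxels for each chain length would give the data behind boxplots of this kind.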