The maximum likelihood solutions given in equations
5-7 depend on knowledge of the latent dimensionality . In the noise free case this quantity can
easily be deduced from the rank of the covariance of the observations
Many other informal methods have been
proposed, the most popular choice being the "scree plot" where one
looks for a "knee" in the plot of ordered eigenvalues that signifies a
split between significant and presumably unimportant directions of the
data. With real FMRI data, however, the decision as to where to choose the
cutoff value is not obvious and a choice based on simple visual
inspection will be ambiguous (see figure 9(ii) for an example).
This problem is intensified by the fact
that the data set
is finite and thus
is being estimated by the
sample covariance of the set of observations
. Even in the absence of any source signals, i.e. when
contains a
finite number of samples from purely Gaussian isotropic noise only, the
eigenspectrum of the sample covariance matrix is not identical to
but
instead distributed around the true noise covariance: the eigenspectrum will
depict an apparent difference in the significance of individual directions
within the noise [Everson and Roberts, 2000].
![]() |
In the case of purely Gaussian noise, however, the sample covariance matrix
has a Wishart distribution and we can utilise
results from random matrix theory on the empirical
distribution function
for the eigenvalues of the covariance matrix of
a single random
-dimensional matrix
[Johnstone, 2000].
Suppose that
as
and
, then
almost surely, where the limiting distribution has a density
If we assume that the source distributions
are Gaussian, the
probabilistic ICA model (equation 2) reduces to the
probabilistic PCA model [Tipping and Bishop, 1999]. In this case, we can use more
sophisticated statistical criteria for model order selection. [Minka, 2000]
placed PPCA in the Bayesian framework and presented a Laplace approximation to
the posterior distribution of the model evidence that
can be calculated
efficiently from the eigenspectrum of the covariance matrix of
observations. When
min
, then
In order to account for the limited amount of data, we combine this estimate
with the predicted cumulative distribution and replace
by its
adjusted eigenspectrum
prior to evaluating the
model evidence. Other possible choices for model order selection for PPCA
include the Bayesian Information Criterion (BIC, [Kass and Raftery, 1993]) the
Akaike Information Criterion (AIC, [Akaike, 1969]) or Minimum
Description Length (MDL, [Rissanen, 1978]).
Note that the estimation of the model order in the case of the
probabilistic PCA model is based on the assumption of Gaussian source
distribution. [Minka, 2000], however, provides some empirical
evidence that the Laplace approximation works reasonably well in the
case where the source distributions are non-Gaussian.
As an example, figure 2 shows the eigenspectrum and different estimators of the intrinsic dimensionality for
different artificial data sets, where 10 latent sources with non-Gaussian distribution were introduced into
simulated AR data (i.e. auto-regressive noise where the AR parameters were
estimated from real resting state FMRI data) and real FMRI resting state noise
at peak levels of between and
of the mean signal intensity. Note
how the increase in AR order will increase the estimates of the latent
dimensionality, simply because there are more eigenvalues that fail the
sphericity assumption. Performing variance-normalisation and adjusting the
eigenspectrum using
in all cases improves the estimation. In the
case of Gaussian white noise the model assumptions are correct and the adjusted
eigenspectrum exactly matches equation 8. In most cases, the
different estimators give similar results once the data were variance
normalised
and the eigenspectrum was adjusted using
. Overall, the Laplace
approximation and the Bayesian Information Criterion appear to give
consistent estimates of the latent dimensionality even though the distribution of the embedded sources are non-Gaussian.