MELODIC - Probabilistic Independent Component Analysis for FMRI

Independent Component Analysis (ICA) is becoming a popular exploratory method for analysing complex data such as that from FMRI experiments. ICA attempts to decompose the data into (statistically) independent spatial maps (and associated time courses), which ideally each represent different artefacts or activation patterns. By using all of the 4D dataset together in the analysis, this kind of approach does not need a (temporal) model, in the way that the above research does. The application of such ``model-free'' methods, however, has been restricted both by the view that results can be uninterpretable and by the lack of ability to quantify statistical significance for estimated spatial maps.

In [4] we proposed a probabilistic ICA (PICA) model for FMRI which models the observations as mixtures of spatially non-Gaussian signals and artefacts in the presence of Gaussian noise. We demonstrated that the problem of ``overfitting'' can be overcome by objectively estimating the amount of Gaussian noise through Bayesian analysis of the number of activation and (non-Gaussian) noise sources. The approach proposed for estimating a suitable model order (i.e., how many ICA components to find) also allows for a unique decomposition of the data and reduces problems of interpretation, as each final component is more likely to be due to only one physical or physiological process (figs. 4, 5). The model also improved on standard ICA in other ways, such as voxel-wise temporal pre-whitening, variance normalisation of timeseries and the use of prior information about the spatio-temporal nature of the source processes. Finally, in order to statistically infer areas of activation from the estimated PICA maps, we used an alternative-hypothesis testing approach based on a non-spatial Gaussian/Gamma mixture model [6]. The above methodology is implemented as MELODIC (Multivariate Exploratory Linear Optimised Decomposition into Independent Components).
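The mixture-model inference step can be illustrated with a minimal sketch. It assumes a simplified two-class mixture (one Gaussian for background noise, one Gamma for positive activation) with hand-picked, hypothetical parameters. It is not the MELODIC implementation, in which the mixture parameters would be fitted to the histogram of each map. All function and parameter names are illustrative.

```python
import math

def gaussian_pdf(z, mean, sd):
    """Density of a normal distribution at z."""
    return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))

def gamma_pdf(z, shape, scale):
    """Density of a Gamma distribution at z (zero for z <= 0)."""
    if z <= 0.0:
        return 0.0
    return z ** (shape - 1.0) * math.exp(-z / scale) / (math.gamma(shape) * scale ** shape)

def activation_posterior(z, weight_active, noise_mean, noise_sd, shape, scale):
    """Posterior probability that map value z belongs to the Gamma
    ("activation") class rather than the Gaussian ("background") class."""
    p_noise = (1.0 - weight_active) * gaussian_pdf(z, noise_mean, noise_sd)
    p_active = weight_active * gamma_pdf(z, shape, scale)
    return p_active / (p_noise + p_active)

# Hypothetical parameters, chosen by hand for illustration only.
params = dict(weight_active=0.05, noise_mean=0.0, noise_sd=1.0, shape=4.0, scale=1.5)

z_values = [-1.0, 0.5, 2.0, 4.0, 6.0]
posteriors = [activation_posterior(z, **params) for z in z_values]

# A voxel is labelled active when the activation class is more probable
# than the background class (posterior > 0.5).
active = [p > 0.5 for p in posteriors]
```

Thresholding the posterior at 0.5 places equal loss on false positives and false negatives; a different threshold would encode a different trade-off between the two.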

**Figure 4:** GLM, classical ICA and PICA analyses of visual stimulus FMRI data: (a) GLM results using GRF-based inference. (b-e) IC maps from classical ICA having temporal correlation with GLM model; the secondary maps are uninterpretable. (f) eigenspectrum of the data covariance matrix and PICA estimate of the latent dimensionality. (g,h) PICA maps; the secondary map is postulated as V3/MT and is reliably found by PICA in such data.

**Figure 5:** Additional PICA maps from the visual activation data: (a) head motion (translation in Z), (b) sensory motor activation, (c) signal fluctuations in areas close to the sinuses (possibly due to interaction of field inhomogeneity with head motion), (d) high frequency MR 'ghost' and (e) 'resting-state' fluctuations/physiological noise.

As an application of this exploratory approach, we have applied PICA to find resting state networks (RSNs) without the need for ``seeding''. RSNs are low frequency (... Hz) spatially-distributed networks with self-consistent time-course that have previously been identified in resting FMRI data using correlation with a seed pixel's timecourse [10]. Their underlying cause is not yet proven, though it has been postulated that they reflect functional networks (as opposed to being physiologically based but functionally uninteresting). We have attempted to further characterise these networks using resting FMRI, both because of their potentially interesting nature, and because they represent a major cause of (currently unmodelled) structured ``noise'' in FMRI data.
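For contrast with the seed-free PICA approach, the seed-based identification of RSNs mentioned above can be sketched as a voxel-wise Pearson correlation with the seed's timecourse. This is a minimal illustration on synthetic data; the function names, the toy timecourses and the 0.05 Hz test frequency are assumptions, not details from [10].

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def seed_correlation_map(timecourses, seed_index):
    """Correlate every voxel timecourse with the timecourse of the seed voxel."""
    seed = timecourses[seed_index]
    return [pearson(seed, tc) for tc in timecourses]

# Synthetic data: 100 samples of a slow oscillation (hypothetical 0.05 Hz, 1 s sampling).
t = range(100)
slow = [math.sin(2.0 * math.pi * 0.05 * k) for k in t]
same_network = [2.0 * v + 3.0 for v in slow]                     # scaled and offset copy
unrelated = [math.cos(2.0 * math.pi * 0.05 * k) for k in t]      # orthogonal to the seed

correlations = seed_correlation_map([slow, same_network, unrelated], seed_index=0)
```

The resulting map depends entirely on which voxel is chosen as the seed, which is the dependence that a decomposition of the whole dataset avoids.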

The PICA approach appears to successfully separate different RSNs from each other and from other (physiological and scanner-related) components in resting (and even activation) FMRI data. Using low-TR data (... s) to avoid aliasing of cardiac-related and breathing-related components (and therefore to be able to unambiguously separate these components in the data), we have shown that RSNs are indeed not directly related to these components [14]. Using high resolution (2x1.5x1.75mm) data we have separated ``true'' RSNs (fig. 6(a)), which do indeed appear to lie purely within grey matter, from other networks having similar power spectra (... Hz) which appear to lie in larger blood vessels [14].
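The reason a low TR avoids aliasing can be checked with a short calculation. The sketch below computes the apparent frequency of a sinusoid after sampling once per TR. The cardiac and respiratory frequencies and both TR values are assumed, typical-looking numbers chosen for illustration; they are not taken from the experiments described here.

```python
def aliased_frequency(f_signal_hz, tr_s):
    """Apparent frequency (Hz) of a sinusoid at f_signal_hz when sampled every tr_s seconds."""
    sampling_rate = 1.0 / tr_s
    # Fold the signal frequency back into the band [0, sampling_rate / 2].
    return abs(f_signal_hz - round(f_signal_hz / sampling_rate) * sampling_rate)

CARDIAC_HZ = 1.1      # assumed heart rate of about 66 beats per minute
RESPIRATORY_HZ = 0.3  # assumed breathing rate of about 18 breaths per minute

apparent = {}
for tr in (3.0, 0.25):  # hypothetical "conventional" and "low" TR values, in seconds
    apparent[tr] = {
        "nyquist_hz": 0.5 / tr,
        "cardiac_hz": aliased_frequency(CARDIAC_HZ, tr),
        "respiratory_hz": aliased_frequency(RESPIRATORY_HZ, tr),
    }
```

Under these assumptions, a TR of 3 s folds the cardiac signal down to about 0.1 Hz and the respiratory signal to about 0.033 Hz, both in the low-frequency range, whereas a TR of 0.25 s (Nyquist frequency 2 Hz) leaves both at their true frequencies, where they can be told apart from slow fluctuations.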

**Figure 6:** RSNs found in 2x1.5x1.75mm resting FMRI data (a) and 4 RSN spatial maps identified as consistent across 7 subjects (b).

We have further investigated whether the number and spatial localisation of RSNs are consistent across different subjects. We have identified 4 RSNs which appear to have high repeatability when analysed across 7 subjects (posited as V1, spatial association and V2, motor area and attention area, see fig. 6(b)) [13].

The PICA methodology is further extended in [5] to the analysis of multi-session/multi-subject FMRI data via a novel exploratory probabilistic tensor-ICA model which provides a tri-linear decomposition and estimates the signal characteristics in the temporal, spatial and subject/session domains. The resulting data representation provides a rich source of information with which not only to infer the spatial characteristics of activation but also to learn about the variability across sessions/subjects.
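A tri-linear decomposition of this kind can be written in a generic three-way form. The notation below is assumed for illustration and is not quoted from [5]: $i$ indexes time points, $j$ voxels and $k$ subjects/sessions, and $R$ is the number of components.

```latex
x_{ijk} = \sum_{r=1}^{R} a_{ir}\, b_{jr}\, c_{kr} + \epsilon_{ijk}
```

Here $a_{ir}$ is the time course of component $r$, $b_{jr}$ its spatial map, $c_{kr}$ its strength in subject/session $k$, and $\epsilon_{ijk}$ a noise term. The factor $c_{kr}$ is what allows variability across sessions/subjects to be examined separately from the spatial and temporal characteristics.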