Accuracy and dimensionality

**Figure 7:** Spatio-temporal accuracy as a function of assumed dimensionality (i.e. when retaining different amounts of variance) for the simulated audio-visual activation data at the 3% level: (i) correlation between the extracted time course and the true signal time course; (ii) false positive rate; (iii) false negative rate (visual stim.: solid, aud. stim.: dashed); (iv) eigenspectrum of the data covariance matrix together with Laplace approximation to the model order.
Key: solid line, visual activation cluster; dashed line, auditory activation cluster.

Within the estimation steps, the number of components was chosen from the estimate of the Bayesian evidence (equation 10). A different choice of the number of components gives rise to a different model with a different quality of estimation. Under the model, the optimal number of components should match the column rank of the mixing matrix: restricting the number of components to the 'true' number of source processes avoids arbitrary splits of identified sources into separate component maps ('overfitting').

Figure 2 demonstrated that the model order can be inferred accurately when the data conform to the model. For real FMRI data, estimating the number of source processes is a much more difficult task. To assess how the spatio-temporal accuracy of the estimation depends on the estimated number of source processes, we performed ROC analysis on spatial maps obtained after projecting the data into subspaces of increasing dimensionality⁷.
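The link between an assumed dimensionality and the amount of variance retained can be illustrated with a minimal sketch. It assumes that the subspace is spanned by the leading eigenvectors of the data covariance matrix, so that only the eigenspectrum is needed. The function names and the eigenvalues are hypothetical and chosen for illustration; they are not taken from the analysis reported here.

```python
def retained_variance(eigenvalues, q):
    """Fraction of the total variance kept by the q largest eigenvalues."""
    ordered = sorted(eigenvalues, reverse=True)
    if not 0 <= q <= len(ordered):
        raise ValueError("q must lie between 0 and the number of eigenvalues")
    return sum(ordered[:q]) / sum(ordered)


def dimensionality_for_variance(eigenvalues, fraction):
    """Smallest q whose leading eigenvalues retain at least `fraction` of the variance."""
    for q in range(len(eigenvalues) + 1):
        if retained_variance(eigenvalues, q) >= fraction:
            return q
    return len(eigenvalues)


# Hypothetical eigenspectrum of a data covariance matrix (illustrative values only).
spectrum = [8.0, 4.0, 2.0, 1.0, 0.5, 0.25, 0.125, 0.125]

for q in (1, 2, 4, 8):
    print(q, retained_variance(spectrum, q))

print(dimensionality_for_variance(spectrum, 0.9))
```

Sweeping `q` over the full range and repeating the source estimation at each value would give curves of the kind plotted against dimensionality in figure 7.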

Figure 7 shows the temporal correlation and the final false-positive and false-negative rates over the range of possible dimensions for the data set with $3\%$ peak level activation, where the spatial maps were thresholded at the level. For both spatial and temporal accuracy, these plots suggest that the quality of estimation does not improve once the source signals are estimated in a subspace with more than about 30 dimensions. These results appear to be consistent for both artificial activation patterns and time courses. Reducing the number of sources below 30, however, leads to increasingly poor estimates.
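The three accuracy measures can be sketched as follows. This assumes the conventional definitions: the false-positive rate is the fraction of truly inactive voxels that survive thresholding, the false-negative rate is the fraction of truly active voxels that are missed, and temporal accuracy is the Pearson correlation between the estimated and true time courses. The exact definitions used for figure 7 may differ, and all names and values below are hypothetical.

```python
import math


def error_rates(detected, truth):
    """Return (false_positive_rate, false_negative_rate) for two boolean voxel masks.

    Assumes the ground truth contains both active and inactive voxels.
    """
    if len(detected) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    false_pos = sum(1 for d, t in zip(detected, truth) if d and not t)
    false_neg = sum(1 for d, t in zip(detected, truth) if t and not d)
    negatives = sum(1 for t in truth if not t)
    positives = len(truth) - negatives
    return false_pos / negatives, false_neg / positives


def pearson(x, y):
    """Pearson correlation between two equally long, non-constant time courses."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical ground truth (4 active voxels out of 10) and a thresholded map.
truth = [True, True, True, True, False, False, False, False, False, False]
detected = [True, True, True, False, True, False, False, False, False, False]
fpr, fnr = error_rates(detected, truth)
print(fpr, fnr)

# Hypothetical true and estimated time courses that differ only in scale.
print(pearson([0.0, 1.0, 0.0, 1.0], [0.0, 2.0, 0.0, 2.0]))
```

Because the correlation is invariant to scale, it compares only the shape of the extracted time course with the true signal, which suits component time courses whose amplitude is arbitrary.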

Overfitting would necessarily result in an increase of the false-negative rate, an effect that is shown in figure 7 (iii) for the auditory activation cluster. For this particular data set the effect is very subtle, since the artificial signal was introduced into resting-state FMRI data without any voxel-wise variation of the temporal signal. The artificial time courses are consistent within the clusters, and these clusters are therefore less likely than in 'real' FMRI data to be incorrectly split into different spatial maps. Although the quality of estimation does not degrade badly with increased dimensionality, it is still essential to find a good estimate of the lower-dimensional subspace: this not only dramatically decreases the computational load but, more importantly, provides better estimates of the noise, which are essential for the inferential steps. For this data set, the Laplace approximation to the evidence for model order (figure 7 (iv)) appears to work well.
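The dependence of the noise estimate on the chosen dimensionality can be illustrated with a small sketch. It assumes the standard probabilistic PCA result for isotropic Gaussian noise, in which the maximum-likelihood noise variance equals the mean of the discarded covariance eigenvalues. Whether the noise estimate used in the inferential steps takes exactly this form is an assumption, and the function name and eigenvalues are hypothetical.

```python
def ppca_noise_variance(eigenvalues, q):
    """Mean of the eigenvalues discarded when keeping the q largest ones.

    Under probabilistic PCA with isotropic Gaussian noise this is the
    maximum-likelihood estimate of the noise variance.
    """
    ordered = sorted(eigenvalues, reverse=True)
    if not 0 <= q < len(ordered):
        raise ValueError("q must leave at least one eigenvalue for the noise estimate")
    discarded = ordered[q:]
    return sum(discarded) / len(discarded)


# Hypothetical eigenspectrum: two strong sources above a noise floor near 1.
noise_spectrum = [9.0, 6.0, 1.5, 1.0, 1.0, 0.5]

for q in (1, 2, 4):
    print(q, ppca_noise_variance(noise_spectrum, q))
```

In this toy spectrum, choosing too few components lets signal variance leak into the noise estimate and inflates it, whereas choosing too many bases the estimate on only the smallest eigenvalues and deflates it.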

Christian F. Beckmann 2003-08-05