The analysis of EEG mapping data raises statistical problems when treatment effects are sought
with inferential tests, mainly because the structure of the data leads to multiple
comparisons: across time points, across variables (EEG bands) and across locations (electrodes).
Note first that a multiple testing correction such as the Bonferroni adjustment is not applicable here, as
a large number of highly correlated variables must be considered: 1008 measures for each
time point (among about 10) and for each dose (say 3 doses). Multivariate parametric methods such
as MANOVA cannot be recommended either, mainly because the small sample size (here
12 subjects) makes the assumptions difficult to assess and the models impractical to fit
(e.g. estimation of the covariance structure).
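To give a sense of scale, the figures quoted above can be turned into a short calculation. This is only an illustrative sketch: the family-wise level of 0.05, the paired design and the use of an exact sign-based test (sign test or Wilcoxon signed-rank) are assumptions made here, not details taken from the text. Under those assumptions, the smallest p-value attainable with 12 subjects is already far above the Bonferroni threshold, and the sample covariance matrix is necessarily rank-deficient.

```python
from fractions import Fraction

# Figures quoted in the text.
n_measures = 1008   # measures per time point
n_times = 10        # "about 10" time points
n_doses = 3         # "say 3" doses
n_subjects = 12

# Number of univariate tests if every measure is tested at every
# time point and for every dose.
n_tests = n_measures * n_times * n_doses

# Bonferroni per-test threshold (assumed family-wise level: 0.05).
alpha = Fraction(5, 100)
bonferroni_threshold = alpha / n_tests

# Assumption: paired design with an exact sign-based test. With n
# subjects there are 2**n equally likely sign patterns under the null,
# so the smallest two-sided p-value is 2 / 2**n.
min_exact_p = Fraction(2, 2 ** n_subjects)

# A sample covariance matrix estimated from n subjects has rank at
# most n - 1, far below the number of variables.
max_cov_rank = n_subjects - 1

print("number of tests:      ", n_tests)
print("Bonferroni threshold:  %.3g" % float(bonferroni_threshold))
print("smallest exact p:      %.3g" % float(min_exact_p))
print("threshold reachable:  ", min_exact_p <= bonferroni_threshold)
print("covariance rank <= %d for %d variables" % (max_cov_rank, n_measures))
```

Under these assumptions no single test could ever pass the corrected threshold, which complements the correlation argument given above.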
For these reasons, the statistical analysis of wake-EEG (pharmaco-EEG) data often takes the form of
a two-level analysis, as in [3]. That method, or at least its general approach, is
widely used, so the comments here refer to it alone. The first level, called the Statistical
Decision Tree (SDT), is exploratory and uses non-parametric tests; significant
results (at the leaves of the tree) are submitted to the second level, which is confirmatory and
combines qualitative criteria, called Descriptive Data Analysis (DDA), with a quantitative confirmation
by Principal Component Analysis (PCA). The first level produces two kinds of series of maps, one
map per time point: coloured maps of p-values for dose versus placebo
comparisons (each map has 28 points, which can be interpolated), and SDT-maps, which plot at
each electrode the value of the "decision" on the direction of the drug effect [3]. At
the second level, DDA is intended to confirm SDT results that meet a few qualitative
criteria, e.g. spatial and temporal coherence of the significant results (i.e.
clusters of electrodes that persist over time). The SDT/DDA results are then confirmed, or not, by a
t-test on the principal component (from a PCA of the electrodes) "overlapping" the spatial SDT/DDA
result (e.g. a frontal effect at FP1, FP2 and Fz, if overlapped by the back/front principal
component).
First of all, the term "Statistical Decision" in SDT is misleading, as the method does not attach
any level of confidence to the final conclusion. The SDT procedure is applied electrode by
electrode and time point by time point (in comparison to baseline), so the multiple testing problem is
still present. Comparing probability maps from one time point to another, or from one
EEG band to another, is also problematic. In particular, a probability map does not quantify the
observed effect, especially when it is produced from non-parametric tests. The only possible comparison
is in fact of the spread, provided it is a true spread (multiple testing again). The DDA procedure
certainly provides some control of false positives, by introducing qualitative criteria on the coherence of the
results. Although these criteria seem common sense, they may not suit all the designs or drugs
analysed, and they bring an interpretative or conclusive step into the statistical analysis from
which interpretations and conclusions are themselves drawn. The PCA confirmation plays a similar role
with respect to multiple testing, but this time in an acceptable way, as it is a statistical control.
Note that it would seem more logical to start with PCA as an exploratory tool, then to select
the electrodes contributing most to a principal component and/or meeting the DDA criteria, and finally
to confirm the hypotheses with SDT.
The main problem with the current method seems to be the multiple testing issue. The same issue
arises in the medical imaging literature when looking for activation with imaging
techniques such as PET. Solutions based on the distribution of local maxima can be derived
using random field theory, as in [22], or a permutation testing procedure, as in
[7]. Permutation testing would be preferable in this context, as only 28 "pixels"
form the "image" here, which makes the smooth random field approximation difficult to justify. The approach
is also univariate in a highly multivariate context with three particular features: temporal correlation,
spatial correlation and frequency-band correlation. Univariate (or separate) models
do not account for the spatial, temporal and frequency structures of the data, which are discussed
only a posteriori and qualitatively when conclusions are drawn from the results.
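The permutation idea mentioned above can be sketched for a single map of 28 electrodes. This is a generic maximum-statistic, sign-flipping scheme and not necessarily the procedure of [7]. The function names, the simulated data, the paired drug-minus-placebo design and the symmetry of the differences under the null hypothesis are all assumptions made for illustration.

```python
import random
from math import sqrt


def t_stat(values):
    """One-sample t statistic of a list of paired differences."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean / sqrt(var / n) if var > 0 else 0.0


def max_t_sign_flip(diffs, n_perm=2000, seed=0):
    """Family-wise adjusted p-values from the permutation maximum |t|.

    diffs[s][e] is the (hypothetical) drug-minus-placebo difference for
    subject s at electrode e. The sign of a subject's whole map is
    flipped at once, so the correlation between electrodes is preserved.
    """
    rng = random.Random(seed)
    n_sub, n_elec = len(diffs), len(diffs[0])
    observed = [abs(t_stat([diffs[s][e] for s in range(n_sub)]))
                for e in range(n_elec)]
    exceed = [1] * n_elec  # the observed labelling counts as one permutation
    for _ in range(n_perm):
        signs = [rng.choice((-1, 1)) for _ in range(n_sub)]
        max_t = max(abs(t_stat([signs[s] * diffs[s][e] for s in range(n_sub)]))
                    for e in range(n_elec))
        for e in range(n_elec):
            if max_t >= observed[e]:
                exceed[e] += 1
    return [count / (n_perm + 1) for count in exceed]


# Simulated data (not real EEG): 12 subjects, 28 electrodes, a shared
# subject-level term to induce spatial correlation, and an effect added
# to the first three electrodes only.
rng = random.Random(1)
n_subjects, n_electrodes = 12, 28
diffs = []
for _ in range(n_subjects):
    shared = rng.gauss(0, 1)
    row = [0.7 * shared + rng.gauss(0, 1) for _ in range(n_electrodes)]
    for e in range(3):
        row[e] += 3.0
    diffs.append(row)

p_adj = max_t_sign_flip(diffs)
for e, p in enumerate(p_adj):
    print("electrode %2d  adjusted p = %.4f" % (e, p))
```

Comparing each electrode with the permutation distribution of the maximum over all electrodes controls the family-wise error for the whole map without a smoothness assumption. With 12 subjects only 2^12 = 4096 sign patterns exist, so the adjusted p-values remain fairly coarse.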
The idea of using data reduction techniques is promising because it can, first, reduce the
multiple testing problem and, second, provide some modelling of the structure. PCA is not
fully appropriate for a multi-entry array, as it allows only bilinear modelling; a data
reduction method allowing multilinear modelling is needed. Multiway data reduction techniques
have already been used on multichannel evoked potential data, as in [6] with the
PARAFAC method. In that paper the input was the signal itself rather than its Fourier transform
(summarised on bands) as in our case (i.e. EEG analysis instead of qEEG analysis). The PARAFAC
method seems more appropriate in EEG analysis, where the focus is more on modelling the (long) time
course than on decomposing the variability. Choosing a trade-off between
modelling and decomposition that leans towards the latter, the purpose of this paper
is to describe another method for multi-entry data which retains most of the properties
of the PCA method and is therefore easier to understand.
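For reference, the distinction between bilinear and multilinear modelling can be written out in its usual textbook form. The notation is assumed here and is not taken from the paper: i, j and k could index, say, electrodes, frequency bands and time points, R is the number of components and e denotes the residual.

```latex
% Bilinear (PCA-type) model for a two-way table:
x_{ij} = \sum_{r=1}^{R} a_{ir}\, b_{jr} + e_{ij}

% Trilinear (PARAFAC-type) model for a three-way array:
x_{ijk} = \sum_{r=1}^{R} a_{ir}\, b_{jr}\, c_{kr} + e_{ijk}
```

Applying the bilinear form to a three-way array requires unfolding two of the modes into one, which is why the structure of the merged modes is no longer modelled separately.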
Didier Leibovici
2001-09-04