Next: Multi-subject analysis with GLM Up: General Linear Model Previous: Taking into account the

Applying GLM

GLM modelling involves choosing the ``design matrix'' $X$, which implies a model for the observed $y$, and choosing the covariance structure $V$ (e.g. an autocorrelation pattern), which relates to the sources of error. Put simply, $X$ models what is expected and $V$ models the errors. Whatever is left out of $X$ will show up in the errors and may need an appropriate structure to be properly taken into account for a ``good'' model. Conversely, whatever is put in $X$ is taken out of the errors. A good example is a confounding covariate: putting it in the design $X$ treats it as part of what is expected, so that it adjusts the other fixed effects and does not inflate the error variation.

The subject effect can be thought of as part of the model ($X$), because one would expect different responses from different subjects. In that case, however, the error variation does not take into account the sampling variation of the subjects (fixed approach), unless this variation is also properly modelled in $V$ (implying a random approach). Note that if the subjects are not included in the model ($X$), the sampling variation is accounted for in the errors, but is pooled with the other sources of error. We will come back to these points in the next section.

The difficulty with a GLM with structured covariance lies in estimating $V$ and $\beta$ at the same time. Mixed models are GLMs of this type in which part of the covariance structure comes from random effects. Part of the design describes fixed effects and part of it describes random effects:
\begin{displaymath}
y=X\beta+Z\gamma +\epsilon =X\beta+\eta
\end{displaymath} (28)

with $E(\gamma)=E(\epsilon)=0$, $var(\epsilon)=R$, $var(\gamma)= G$ and $cov(\gamma,\epsilon)=0$, or equivalently $E(\eta)=0$ and $V=var(\eta)=ZG{\;}^tZ+R$. Likelihood-based techniques such as REML (REstricted Maximum Likelihood, see [3] for example) find $\hat{V}$ by maximising
\begin{displaymath}
\ell_R(G,R)=-\frac{1}{2}\left[\log\vert V\vert+ \log\vert{\;}^tXV^{-1}X\vert +
{}^t(y-X\hat{\beta}_{GLS})V^{-1}(y-X\hat{\beta}_{GLS})\right]
\end{displaymath} (29)

and then plug it into the GLS estimator of $\beta$ to obtain the final $\hat{\hat{\beta}}_{GLS}$. For small samples, REML maximisation may be unreliable in the general case, and it remains problematic under the hypothesis of unequal within-subject covariances, but in many situations the maximisation problem simplifies and can even lead to direct calculation. In the general case the mixed model accepts any form for $Z$, $G$ and $R$, and an algorithm [12] then gives the REML estimate.

For balanced data ``everything becomes simpler''. This is because balance implies $\mathcal{I}m(VX)\subset\mathcal{I}m(X)$ in many situations, which makes OLS equivalent to GLS. This is particularly the case in fMRI (where continuous explanatory variables come from the convolution of ``balanced'' dummy variables with a model of the haemodynamic response, thus preserving the balanced aspect), and it is the reason why the two-stage or two-level approach is valid or optimal here.

In a mixed model for fMRI studies the random effect is the subject effect. $Z$ is then a matrix of dummy variables identifying the subjects ($i=1\cdots n$), making $ZG{\;}^tZ$ a well-structured matrix of the form $G \otimes 1_T{\;}^t1_T$. $R$ is usually $\sigma^2_w I\!d_{nT}$. As the subjects are assumed to be independent, $G=\sigma_s^2 I\!d_n$, so that $ZG{\;}^tZ= I\!d_n \otimes \sigma_s^2 1_T{\;}^t1_T$, implying a constant correlation pattern for measures within a subject; this is the pattern used in section 3.2.

Remark: in this model it is important to note that a pattern of autocorrelation between time measures is not introduced through $Z$ (random effects), as $\gamma$ would then be redundant with $\epsilon$; it is introduced in $R$.
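
The structure described above can be checked numerically. The following is only a minimal sketch in Python with NumPy, under assumptions that are not in the text: a toy balanced design with $n=4$ subjects and $T=6$ time points, an alternating on/off regressor standing in for a convolved stimulus, and arbitrary values for $\sigma_s^2$ and $\sigma_w^2$. The function names (`compound_symmetry_V`, `gls`, `ols`, `reml_criterion`) are hypothetical helpers written for this illustration. It builds $V=ZG{\;}^tZ+R$, verifies that $\mathcal{I}m(VX)\subset\mathcal{I}m(X)$ for this design, compares OLS with GLS, and evaluates the REML criterion (29) at the chosen $V$ (it does not maximise it).

```python
import numpy as np

def compound_symmetry_V(n, T, sigma_s2, sigma_w2):
    """V = Z G Z' + R with subject dummies Z, G = sigma_s2*Id_n, R = sigma_w2*Id_nT."""
    Z = np.kron(np.eye(n), np.ones((T, 1)))   # (nT x n) dummy variables, one column per subject
    G = sigma_s2 * np.eye(n)                  # independent subjects
    R = sigma_w2 * np.eye(n * T)              # no autocorrelation in this sketch
    return Z @ G @ Z.T + R

def gls(X, y, V):
    """Generalised least squares estimate of beta for a known V."""
    Vinv_X = np.linalg.solve(V, X)
    return np.linalg.solve(X.T @ Vinv_X, Vinv_X.T @ y)

def ols(X, y):
    """Ordinary least squares estimate of beta."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def reml_criterion(X, y, V):
    """Restricted log-likelihood (up to an additive constant) evaluated at V."""
    r = y - X @ gls(X, y, V)
    _, logdet_V = np.linalg.slogdet(V)
    _, logdet_XVX = np.linalg.slogdet(X.T @ np.linalg.solve(V, X))
    return -0.5 * (logdet_V + logdet_XVX + r @ np.linalg.solve(V, r))

# Toy settings (assumptions chosen for illustration only)
n, T = 4, 6
sigma_s2, sigma_w2 = 2.0, 1.0
rng = np.random.default_rng(0)

V = compound_symmetry_V(n, T, sigma_s2, sigma_w2)

# Balanced design: every subject has the same within-subject design (intercept + on/off)
X_w = np.column_stack([np.ones(T), np.tile([0.0, 1.0], T // 2)])
X = np.kron(np.ones((n, 1)), X_w)

# Simulate y = X beta + Z gamma + epsilon
subject_effect = np.repeat(rng.normal(0.0, np.sqrt(sigma_s2), n), T)
noise = rng.normal(0.0, np.sqrt(sigma_w2), n * T)
y = X @ np.array([1.0, 0.5]) + subject_effect + noise

beta_ols = ols(X, y)
beta_gls = gls(X, y, V)

# Im(VX) within Im(X): regressing VX on X should leave no residual
VX = V @ X
proj_resid = VX - X @ np.linalg.lstsq(X, VX, rcond=None)[0]

# Constant within-subject correlation sigma_s2 / (sigma_s2 + sigma_w2)
within_corr = V[0, 1] / V[0, 0]

print("OLS:", beta_ols)
print("GLS:", beta_gls)
print("max |residual of VX on X|:", np.abs(proj_resid).max())
print("within-subject correlation:", within_corr)
print("REML criterion at V:", reml_criterion(X, y, V))
```

In this toy case the two estimates coincide up to rounding because the intercept column of $X$ absorbs the $1_T{\;}^t1_T$ blocks of $V$. Adding an autocorrelation pattern would mean replacing the identity in `R`, not changing `Z`, in line with the remark above.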
Didier Leibovici 2001-03-01