... maxima)1
The point distribution is the distribution at a given voxel, i.e. of a single random variable; the field distribution is the continuous analogue of a multivariate distribution, i.e. the distribution of a vector of random variables.
... voxel2
The only spatial consideration comes when thresholding the statistic map.
... autocorrelation3
Robust estimates of the autocorrelation can be obtained by using a robust estimator for each time series and by spatially smoothing the resulting autocorrelation map.
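As a minimal sketch of this idea (not the paper's implementation; the array layout, the lag-1 AR model and the Gaussian smoothing kernel are my assumptions, and the plain rather than robust per-series estimator is used for brevity), one can estimate a per-voxel lag-1 autocorrelation from the residual time series and regularise it spatially:
\begin{verbatim}
import numpy as np
from scipy.ndimage import gaussian_filter

def smoothed_ar1_map(residuals, fwhm_vox=2.0):
    """residuals: (T, X, Y, Z) array of per-voxel residual time series."""
    r = residuals - residuals.mean(axis=0)           # demean each series
    num = (r[1:] * r[:-1]).sum(axis=0)               # lag-1 cross-products
    den = (r ** 2).sum(axis=0)                       # lag-0 sums of squares
    rho = np.where(den > 0, num / den, 0.0)          # raw AR(1) estimate
    sigma = fwhm_vox / (2 * np.sqrt(2 * np.log(2)))  # FWHM -> sigma
    return gaussian_filter(rho, sigma)               # spatial regularisation
\end{verbatim}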
... (i.i.d.)4
Note that independence is not realistic for a time series; the autocorrelation has to be taken into account (see further).
... activation)5
As already indicated in the introduction, this level of decision is said to be uncorrected, as it is based on the point distribution and does not use the field distribution (the multiple comparison problem).
... assumption:6
It is valid as a conditional distribution (conditional on the given subject). A fixed factor (in ANOVA language) is one whose levels (here the different subjects) encompass all the levels one can encounter in the population studied; e.g. gender is a fixed factor with two levels.
... statistics:7
If one supposes equality of the estimated variances, $\hat{\sigma}_1^2=\hat{\sigma}_2^2$, and $n_1=n_2=n$, like a ``group $Z$'', one obtains $t_o(fixed)=\frac{1}{\sqrt{2n}}\left(\sum_i t_{o1}(i)-\sum_i t_{o2}(i)\right)$, i.e. $Z_{group12}=\frac{1}{\sqrt{2}}(groupZ_1-groupZ_2)$, or $Z_{group12}=\frac{\frac{1}{\sqrt{n_1}}groupZ_1-\frac{1}{\sqrt{n_2}}groupZ_2}{\sqrt{(n_1+n_2)/(n_1 n_2)}}$ if $n_1\neq n_2$.
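A quick numerical check of the equal-$n$ identity above (my own illustration; the per-subject statistics are simulated):
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
n = 8
t1, t2 = rng.normal(size=n), rng.normal(size=n)  # per-subject statistics
z_group1 = t1.sum() / np.sqrt(n)                 # group Z, condition 1
z_group2 = t2.sum() / np.sqrt(n)                 # group Z, condition 2
t_fixed = (t1.sum() - t2.sum()) / np.sqrt(2 * n)
z_12 = (z_group1 - z_group2) / np.sqrt(2)
assert np.isclose(t_fixed, z_12)                 # the two forms agree
\end{verbatim}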
... comparisons8
Although Hotelling's $T^2$ might be used at this stage.
... relationship)9
As a trivial illustration of the link with the first section, let $X$ have two columns: one containing 1 when the condition is A and 0 otherwise (this column identifies condition A), and one identifying condition B in the same way. Then $(\beta_2 - \beta_1)$ is the activation looked for, as an estimate of the difference of the two condition means.
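A small sketch of this (illustrative only; the data are simulated and the effect size is arbitrary): with the two indicator columns, $\hat{\beta}_2-\hat{\beta}_1$ recovers the difference of the condition means.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(1)
cond = np.array([0, 1] * 10)                     # alternating A/B design
y = 2.0 + 1.5 * cond + rng.normal(scale=0.1, size=cond.size)

X = np.column_stack([cond == 0, cond == 1]).astype(float)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)     # OLS fit
diff_means = y[cond == 1].mean() - y[cond == 0].mean()
assert np.isclose(beta[1] - beta[0], diff_means)
\end{verbatim}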
... independent10
$E(\epsilon)=0$ and $var(\epsilon)=\sigma^2I\!d_T$ are called the Gauss-Markov conditions.
... ``residuals''11
To obtain the best estimate, one wants to minimise the variation between what is observed ($y_t$) and what is modelled ($X\beta$).
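In code this least-squares criterion reduces to the normal equations $({\;}^tXX)\,\hat{\beta}={\;}^tXy$; a minimal sketch, assuming $X$ is of full rank:
\begin{verbatim}
import numpy as np

def ols(X, y):
    """Solve (X'X) beta = X'y, i.e. minimise ||y - X beta||^2."""
    return np.linalg.solve(X.T @ X, X.T @ y)
\end{verbatim}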
... g-inverse12
A generalised inverse [12] is used when $X$ is not of full rank, which makes ${\;}^tXX$ non-invertible.
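A sketch of the rank-deficient case (my toy example; the Moore-Penrose pseudoinverse is one choice of g-inverse): ${\;}^tXX$ is singular, yet a least-squares solution still exists and the fitted values are unique.
\begin{verbatim}
import numpy as np

X = np.array([[1., 1., 0.],
              [1., 1., 0.],
              [1., 0., 1.],
              [1., 0., 1.]])   # column 1 = column 2 + column 3 (rank 2)
y = np.array([1., 1., 2., 2.])

# np.linalg.inv(X.T @ X) would raise here: the matrix is singular.
beta = np.linalg.pinv(X.T @ X) @ X.T @ y   # one of infinitely many solutions
assert np.allclose(X @ beta, y)            # fitted values are unique
\end{verbatim}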
... theorem13
The Gauss-Markov theorem is in fact established for any estimable function (see further) of $\beta$.
... estimates14
It can be shown that, for independent Normally distributed errors, the OLS estimate is also the maximum likelihood estimate and is BUE, i.e. best among all (not only linear) unbiased estimates.
... exists15
If $V$ is only positive semi-definite (and thus singular), everything is done with g-inverses, $V^-={\;}^tK^-K^-$; the BLUE property remains if and only if $VV^-X=X$ [11][12].
... (GLS)16
When $V$ is known, $\widehat{\beta}_{GLS}$ is also BLUE, and BUE under Normality.
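A minimal GLS sketch (assuming $V$ is known, symmetric positive definite, and factored as $V=K{\;}^tK$ by Cholesky): whitening by $K^{-1}$ and then applying OLS gives $\widehat{\beta}_{GLS}=({\;}^tXV^{-1}X)^{-1}{\;}^tXV^{-1}y$.
\begin{verbatim}
import numpy as np

def gls(X, y, V):
    """beta_GLS via whitening: V = K K', then OLS on K^{-1}X, K^{-1}y."""
    K = np.linalg.cholesky(V)
    Xw = np.linalg.solve(K, X)                 # whitened design
    yw = np.linalg.solve(K, y)                 # whitened data
    beta, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
    return beta
\end{verbatim}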
... estimate17
Or, more generally, with $V$ singular [3], $\hat{\sigma}^2=\frac{^t(y-X\hat{\beta})V^-(y-X\hat{\beta})}{trace(AV)}$, with ${\;}^t(y-X\hat{\beta})V^-(y-X\hat{\beta})/\sigma^2\sim\chi^2(trace(AV))$, where $y\sim(\mu=X\beta,\sigma^2V)$ and $A$ is the matrix resulting from the quadratic form of the residuals, i.e. $A={\;}^tP_{X\bot}V^-P_{X\bot}=V^-P_{X\bot}$, giving $trace(AV)=trace(VV^-)-trace(P_X)$, i.e. $T-rank(X)$ if $V$ is non-singular [12].
... statistic18
The distribution given (exact if $\min(q-k,p)=1$) is the classical McKeon (1974, Biometrika 61:381--383) approximate distribution, where $Df=4+\frac{(q-k)p+2}{B-1}$, $B=\frac{(rank(P_{X\bot})+(q-k)-p-1)(rank(P_{X\bot})-1)}{(rank(P_{X\bot})-p-3)(rank(P_{X\bot})-p)}$, and $\nu=(q-k)p\left[\frac{Df-2}{Df}\right]\left[\frac{rank(P_{X\bot})}{rank(P_{X\bot})-p-1}\right]$.
... BLUE)19
The situation is less critical in a number of cases where the covariance structure is known and few parameters have to be estimated.
... estimated-GLS20
It is nonetheless possible to do this for fMRI studies under the hypothesis of the same $V$ for all voxels, which gives a large sample from which to estimate $V$.
... anymore:21
Notice that GLS brings one back to OLS under a Gauss-Markov model assumption.
... freedom22
If $SS/E(MS)\sim\chi^2_f$ then $E(SS)=f\,E(MS)$ and $var(SS/E(MS))=var(SS)/E(MS)^2=2f$, hence $f=\frac{2E(SS)^2}{var(SS)}$.
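This moment-matching rule applied, for instance, to a weighted sum of independent $\chi^2$ variables (my own example, not from the text):
\begin{verbatim}
import numpy as np

def effective_df(weights, dfs):
    """f = 2 E(SS)^2 / var(SS) for SS = sum_i w_i chi2(f_i), independent."""
    w, f = np.asarray(weights), np.asarray(dfs)
    e_ss = (w * f).sum()                       # E(SS) = sum w_i f_i
    var_ss = (2 * w**2 * f).sum()              # var(SS) = sum 2 w_i^2 f_i
    return 2 * e_ss**2 / var_ss

print(effective_df([1.0, 1.0], [5, 5]))        # 10.0: exact for equal weights
\end{verbatim}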
... approximatively23
It would be exact if $P_{X'^\bot}V$ were idempotent [12], in which case $edf=trace(P_{X'^\bot}V)$ as well.
... assumption24
Without ``swamping'', the formulae given above for $\hat{\sigma}^2$ and (26) still hold if $V$ is replaced by $KV_I{\;}^tK$.
... errors25
Note that throughout the text $V$ sometimes denotes the covariance structure and sometimes only the correlation structure; no notational distinction has been made, as the context should make it clear enough. A common variance is usually assumed here, so the same scalar $\sigma^2$ allows one to pass from one to the other.
... variables)26
This is the case if only random intercepts are considered; one may instead consider all the parameters in $X$ as random, making $Z=(I\!d_n \otimes X)$.
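A sketch of this Kronecker construction (the dimensions are my toy choices): $Z=I\!d_n\otimes X$ is block-diagonal, with one copy of $X$ per subject.
\begin{verbatim}
import numpy as np

n, T, p = 3, 4, 2                                # subjects, scans, parameters
X = np.arange(T * p, dtype=float).reshape(T, p)  # toy within-subject design
Z = np.kron(np.eye(n), X)                        # one X block per subject
assert Z.shape == (n * T, n * p)
\end{verbatim}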
... variation27
Compound Symmetry is the covariance model used in section 3.2; here $J_T=1_T{\;}^t1_T=T\,P_{1_T}$.
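A quick check of the identity $J_T=T\,P_{1_T}$ (illustrative; $P_{1_T}$ is the orthogonal projector onto the constant vector $1_T$):
\begin{verbatim}
import numpy as np

T = 5
ones = np.ones((T, 1))
J = ones @ ones.T                    # J_T: the T x T matrix of ones
P1 = ones @ ones.T / (ones.T @ ones) # projector onto [1_T], i.e. J_T / T
assert np.allclose(J, T * P1)
\end{verbatim}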
... thing)28
and the projectors are expressed with or without the $V^{-1}$, according to whether or not the estimate is BLUE.
... parameters29
Instead of averaging the variances of the parameters, one may take their median or their maximum, depending on how conservative one wants to be.
... GLS30
Alternately estimating $\beta$ given the whole covariance structure, and estimating the covariances (at the different levels) given $\beta$, is known to converge to the maximum likelihood estimates under Normality.
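A toy sketch of such an alternating scheme (entirely my own construction: a single level, with an AR(1) working correlation whose parameter is re-estimated by moments at each pass):
\begin{verbatim}
import numpy as np

def ar1_corr(T, rho):
    idx = np.arange(T)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def alternating_gls(X, y, n_iter=10):
    T, rho = len(y), 0.0
    for _ in range(n_iter):
        Vi = np.linalg.inv(ar1_corr(T, rho))
        beta = np.linalg.solve(X.T @ Vi @ X, X.T @ Vi @ y)  # beta given V
        r = y - X @ beta
        rho = (r[1:] @ r[:-1]) / (r @ r)                    # V given beta
    return beta, rho
\end{verbatim}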
... projector31
Strictly speaking, $P_G$ should be denoted $P_{[G]}$, the projector onto the space generated by the columns of $G$.
....32
Note that $trace(V^-P_GV)=trace(P_GVV^-)$, and if $P_G=P_{F\bot}=I\!d-P_F$, this equals $trace(VV^-)-trace(P_F)=rank(V)-rank(P_F)$.