... maxima)1
The point distribution is the distribution at a given voxel, i.e. of a single random variable; the field distribution is the continuous analogue of a multivariate distribution, i.e. the distribution of a vector of random variables.
... voxel2
The only spatial consideration comes when thresholding the statistic map.
... autocorrelation3
Robust estimates of the autocorrelation can be obtained by using a robust estimator for each time series and by spatially smoothing the resulting autocorrelation map.
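As a minimal sketch of this idea (not the paper's implementation; the array layout, the lag-1 AR model and the Gaussian smoothing kernel are my assumptions, and the plain rather than robust per-series estimator is used for brevity), one can estimate a per-voxel lag-1 autocorrelation from the residual time series and regularise it spatially:
\begin{verbatim}
import numpy as np
from scipy.ndimage import gaussian_filter

def smoothed_ar1_map(residuals, fwhm_vox=2.0):
    """residuals: (T, X, Y, Z) array of per-voxel residual time series."""
    r = residuals - residuals.mean(axis=0)           # demean each series
    num = (r[1:] * r[:-1]).sum(axis=0)               # lag-1 cross-products
    den = (r ** 2).sum(axis=0)                       # lag-0 sums of squares
    rho = np.where(den > 0, num / den, 0.0)          # raw AR(1) estimate
    sigma = fwhm_vox / (2 * np.sqrt(2 * np.log(2)))  # FWHM -> sigma
    return gaussian_filter(rho, sigma)               # spatial regularisation
\end{verbatim}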
... (i.i.d.)4
Note that independence is not realistic for a time series; the autocorrelation has to be taken into account (see further).
... activation)5
As already indicated in the introduction, this level of decision is said to be uncorrected, as it is based on the point distribution and does not use the field distribution (the multiple comparison problem).
... assumption:6
It is valid as a conditional distribution (conditional on the given subject). A fixed factor (in ANOVA language) is one whose levels (here the different subjects) encompass all the levels one can encounter in the population studied; e.g. gender is a fixed factor with two levels.
... statistics:7
If one supposes equality of the estimated variances, $\hat{\sigma}_1^2=\hat{\sigma}_2^2$, and $n_1=n_2=n$, like a ``group $Z$'', one obtains $t_o(fixed)=\frac{1}{\sqrt{2n}}\left(\sum_i t_{o1}(i)-\sum_i t_{o2}(i)\right)$, i.e. $Z_{group12}=\frac{1}{\sqrt{2}}(groupZ_1-groupZ_2)$, or $Z_{group12}=\frac{\frac{1}{\sqrt{n_1}}groupZ_1-\frac{1}{\sqrt{n_2}}groupZ_2}{\sqrt{(n_1+n_2)/(n_1 n_2)}}$ if $n_1\neq n_2$.
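A quick numerical check of the equal-$n$ identity above (my own illustration; the per-subject statistics are simulated):
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
n = 8
t1, t2 = rng.normal(size=n), rng.normal(size=n)  # per-subject statistics
z_group1 = t1.sum() / np.sqrt(n)                 # group Z, condition 1
z_group2 = t2.sum() / np.sqrt(n)                 # group Z, condition 2
t_fixed = (t1.sum() - t2.sum()) / np.sqrt(2 * n)
z_12 = (z_group1 - z_group2) / np.sqrt(2)
assert np.isclose(t_fixed, z_12)                 # the two forms agree
\end{verbatim}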
... comparisons8
Although Hotelling's $T^2$ might be used at this stage.
... relationship)9
As a trivial illustration of the link with the first section, let $X$ have two columns: one containing 1 when the condition is A and 0 otherwise (this column identifies condition A), and one identifying condition B in the same way. Then $(\beta_2 - \beta_1)$ is the activation looked for, as an estimate of the difference of the two condition means.
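A small sketch of this (illustrative only; the data are simulated and the effect size is arbitrary): with the two indicator columns, $\hat{\beta}_2-\hat{\beta}_1$ recovers the difference of the condition means.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(1)
cond = np.array([0, 1] * 10)                     # alternating A/B design
y = 2.0 + 1.5 * cond + rng.normal(scale=0.1, size=cond.size)

X = np.column_stack([cond == 0, cond == 1]).astype(float)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)     # OLS fit
diff_means = y[cond == 1].mean() - y[cond == 0].mean()
assert np.isclose(beta[1] - beta[0], diff_means)
\end{verbatim}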
... independent10
$E(\epsilon)=0$ and $var(\epsilon)=\sigma^2I\!d_T$ are called the Gauss-Markov conditions.
... ``residuals''11
To obtain the best estimate, one wants to minimise the variation between what is observed ($y_t$) and what is modelled ($X\beta$).
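In code this least-squares criterion reduces to the normal equations $({\;}^tXX)\,\hat{\beta}={\;}^tXy$; a minimal sketch, assuming $X$ is of full rank:
\begin{verbatim}
import numpy as np

def ols(X, y):
    """Solve (X'X) beta = X'y, i.e. minimise ||y - X beta||^2."""
    return np.linalg.solve(X.T @ X, X.T @ y)
\end{verbatim}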
... g-inverse12
A generalised inverse [12] is used when $X$ is not of full rank, which makes ${\;}^tXX$ non-invertible.
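A sketch of the rank-deficient case (my toy example; the Moore-Penrose pseudoinverse is one choice of g-inverse): ${\;}^tXX$ is singular, yet a least-squares solution still exists and the fitted values are unique.
\begin{verbatim}
import numpy as np

X = np.array([[1., 1., 0.],
              [1., 1., 0.],
              [1., 0., 1.],
              [1., 0., 1.]])   # column 1 = column 2 + column 3 (rank 2)
y = np.array([1., 1., 2., 2.])

# np.linalg.inv(X.T @ X) would raise here: the matrix is singular.
beta = np.linalg.pinv(X.T @ X) @ X.T @ y   # one of infinitely many solutions
assert np.allclose(X @ beta, y)            # fitted values are unique
\end{verbatim}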
... theorem13
The Gauss-Markov theorem is in fact established for any estimable function (see further) of $\beta$.
... estimates14
It can be shown that, for independent Normally distributed errors, the OLS estimate is also the maximum likelihood estimate and is BUE, i.e. best among all (not only linear) unbiased estimates.
... exists15
If $V$ is only positive semi-definite (and thus singular), everything is done with g-inverses, $V^-={\;}^tK^-K^-$; the BLUE property remains if and only if $VV^-X=X$ [11][12].
... (GLS)16
When $V$ is known, $\widehat{\beta}_{GLS}$ is also BLUE, and BUE under Normality.
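A minimal GLS sketch (assuming $V$ is known, symmetric positive definite, and factored as $V=K{\;}^tK$ by Cholesky): whitening by $K^{-1}$ and then applying OLS gives $\widehat{\beta}_{GLS}=({\;}^tXV^{-1}X)^{-1}{\;}^tXV^{-1}y$.
\begin{verbatim}
import numpy as np

def gls(X, y, V):
    """beta_GLS via whitening: V = K K', then OLS on K^{-1}X, K^{-1}y."""
    K = np.linalg.cholesky(V)
    Xw = np.linalg.solve(K, X)                 # whitened design
    yw = np.linalg.solve(K, y)                 # whitened data
    beta, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
    return beta
\end{verbatim}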
... estimate17
Or, more generally, with $V$ singular [3], $\hat{\sigma}^2=\frac{^t(y-X\hat{\beta})V^-(y-X\hat{\beta})}{trace(AV)}$, with ${\;}^t(y-X\hat{\beta})V^-(y-X\hat{\beta})/\sigma^2\sim\chi^2(trace(AV))$, where $y\sim(\mu=X\beta,\sigma^2V)$ and $A$ is the matrix resulting from the quadratic form of the residuals, i.e. $A={\;}^tP_{X\bot}V^-P_{X\bot}=V^-P_{X\bot}$, giving $trace(AV)=trace(VV^-)-trace(P_X)$, i.e. $T-rank(X)$ if $V$ is non-singular [12].
... statistic18
The distribution given (exact if $\min(q-k,p)=1$) is the classical McKeon (1974, Biometrika 61:381--383) approximate distribution, where $Df=4+\frac{(q-k)p+2}{B-1}$, $B=\frac{(rank(P_{X\bot})+(q-k)-p-1)(rank(P_{X\bot})-1)}{(rank(P_{X\bot})-p-3)(rank(P_{X\bot})-p)}$, and $\nu=(q-k)p\left[\frac{Df-2}{Df}\right]\left[\frac{rank(P_{X\bot})}{rank(P_{X\bot})-p-1}\right]$.
... BLUE)19
The situation is less critical in a number of cases where the covariance structure is known and few parameters have to be estimated.
... estimated-GLS20
It is nonetheless possible to do this for fMRI studies under the hypothesis of the same $V$ for all voxels, which gives a large sample from which to estimate $V$.
... anymore:21
Notice that GLS brings one back to OLS under a Gauss-Markov model assumption.
... freedom22
If $SS/E(MS)\sim\chi^2_f$ then $E(SS)=f\,E(MS)$ and $var(SS/E(MS))=var(SS)/E(MS)^2=2f$, hence $f=\frac{2E(SS)^2}{var(SS)}$.
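This moment-matching rule applied, for instance, to a weighted sum of independent $\chi^2$ variables (my own example, not from the text):
\begin{verbatim}
import numpy as np

def effective_df(weights, dfs):
    """f = 2 E(SS)^2 / var(SS) for SS = sum_i w_i chi2(f_i), independent."""
    w, f = np.asarray(weights), np.asarray(dfs)
    e_ss = (w * f).sum()                       # E(SS) = sum w_i f_i
    var_ss = (2 * w**2 * f).sum()              # var(SS) = sum 2 w_i^2 f_i
    return 2 * e_ss**2 / var_ss

print(effective_df([1.0, 1.0], [5, 5]))        # 10.0: exact for equal weights
\end{verbatim}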
... approximatively23
It would be exact if $P_{X'^\bot}V$ were idempotent [12], in which case $edf=trace(P_{X'^\bot}V)$ as well.
... assumption24
Without ``swamping'', the formulae given above for $\hat{\sigma}^2$ and (26) still hold if $V$ is replaced by $KV_I{\;}^tK$.
... errors25
Note that throughout the text $V$ sometimes denotes the covariance structure and sometimes only the correlation structure; no notational distinction has been made, as the context should make it clear enough. A common variance is usually assumed here, so the same scalar $\sigma^2$ allows one to pass from one to the other.
... variables)26
This is the case if only random intercepts are considered; one may instead consider all the parameters in $X$ as random, making $Z=(I\!d_n \otimes X)$.
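A sketch of this Kronecker construction (the dimensions are my toy choices): $Z=I\!d_n\otimes X$ is block-diagonal, with one copy of $X$ per subject.
\begin{verbatim}
import numpy as np

n, T, p = 3, 4, 2                                # subjects, scans, parameters
X = np.arange(T * p, dtype=float).reshape(T, p)  # toy within-subject design
Z = np.kron(np.eye(n), X)                        # one X block per subject
assert Z.shape == (n * T, n * p)
\end{verbatim}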
... variation27
Compound Symmetry is the covariance model used in section 3.2; here $J_T=1_T{\;}^t1_T=T\,P_{1_T}$.
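A quick check of the identity $J_T=T\,P_{1_T}$ (illustrative; $P_{1_T}$ is the orthogonal projector onto the constant vector $1_T$):
\begin{verbatim}
import numpy as np

T = 5
ones = np.ones((T, 1))
J = ones @ ones.T                    # J_T: the T x T matrix of ones
P1 = ones @ ones.T / (ones.T @ ones) # projector onto [1_T], i.e. J_T / T
assert np.allclose(J, T * P1)
\end{verbatim}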
... thing)28
and the projectors are expressed with or without the $V^{-1}$, according to whether or not the estimate is BLUE.
... parameters29
Instead of averaging the variances of the parameters, one may take their median or their maximum, depending on how conservative one wants to be.
... GLS30
Alternately estimating $\beta$ given the whole covariance structure, and estimating the covariances (at the different levels) given $\beta$, is known to converge to the maximum likelihood estimates under Normality.
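A toy sketch of such an alternating scheme (entirely my own construction: a single level, with an AR(1) working correlation whose parameter is re-estimated by moments at each pass):
\begin{verbatim}
import numpy as np

def ar1_corr(T, rho):
    idx = np.arange(T)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def alternating_gls(X, y, n_iter=10):
    T, rho = len(y), 0.0
    for _ in range(n_iter):
        Vi = np.linalg.inv(ar1_corr(T, rho))
        beta = np.linalg.solve(X.T @ Vi @ X, X.T @ Vi @ y)  # beta given V
        r = y - X @ beta
        rho = (r[1:] @ r[:-1]) / (r @ r)                    # V given beta
    return beta, rho
\end{verbatim}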
... projector31
Strictly speaking, $P_G$ should be denoted $P_{[G]}$, the projector onto the space generated by the columns of $G$.
....32
Note that $trace(V^-P_GV)=trace(P_GVV^-)$, and if $P_G=P_{F\bot}=I\!d-P_F$, this equals $trace(VV^-)-trace(P_F)=rank(V)-rank(P_F)$.