Non-spatial without Class Proportions

We approximate the distribution in equation 3 by replacing the discrete labels, $x_i$, with $K \times 1$ continuous weights vectors, $\vec{w_{i}}$:

$\displaystyle p(\vec{w},\vec{\theta}\vert y) \propto \left[\prod_{i=1}^N \sum_{k=1}^K w_{ik}\,p(y_i\vert x_i=k,\theta_k)\right]p(\vec{w})p(\vec{\theta})$

(9)

where $\vec{w}=\{\vec{w_i}:i=1\ldots N\}$ and $\vec{w_i}=\{w_{ik}:k=1\ldots K\}$ is the continuous weights vector at voxel $i$. Equation 9 only approximates equation 3 if we apply certain constraints to the continuous weights vectors. If we choose a prior on the continuous weights vector, $\vec{w_i}$, with the constraints that $0<w_{ik}<1$ and $\sum_k w_{ik}=1$, then equation 9 will tend to equation 3 as $p(\vec{w_i})$ tends to delta functions at $w_{ik}=0$ and $w_{ik}=1$. To apply these constraints, the prior we use is:

$\displaystyle p(\vec{w}) = p(\vec{w}\vert\vec{\tilde{w}},\gamma)p(\vec{\tilde{w}})$

(10)

where:

$\displaystyle p(\vec{\tilde{w}})$	$\displaystyle =$	$\displaystyle \prod_{i=1}^N \prod_{k=1}^K p(\tilde{w}_{ik})$
$\displaystyle p(\tilde{w}_{ik})$	$\displaystyle =$	$\displaystyle Uniform(\tilde{w}_{ik};-\infty,+\infty)$	(11)

and $p(\vec{w}\vert\vec{\tilde{w}},\gamma) = \prod_{i} p(\vec{w_i}\vert\vec{\tilde{w}_i},\gamma)$ , where crucially $p(\vec{w_i}\vert\vec{\tilde{w}_i},\gamma)$ is a deterministic relationship by which $\vec{w_i}$ and $\vec{\tilde{w}_i}$ are related by the logistic transform:

$\displaystyle w_{ik}=\frac{\exp(\tilde{w}_{ik}/ \gamma)}{\sum_{l=1}^K \exp(\tilde{w}_{il}/ \gamma)}$

(12)

The normalising constant in the logistic transform, $\sum_{l=1}^K \exp(\tilde{w}_{il}/ \gamma)$, ensures that the condition $\sum_{k=1}^K w_{ik}=1$ is met. The transform also preserves the ordering of the weights within a voxel: $\tilde{w}_{ik}>\tilde{w}_{ij}$ if and only if $w_{ik}>w_{ij}$. Figure 1 shows how the logistic transform produces an approximation to the delta functions as $\gamma$ gets smaller. We fix the value of $\gamma$ to 0.05 and bound $-10 < \tilde{w}_{ik} < 10$. This gives the desired approximation to delta functions at 0 and 1, whilst ensuring that we can compute $\exp(\tilde{w}_{ik}/ \gamma)$ without causing overflow.
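As a minimal sketch (not the authors' implementation), the transform in equation 12 can be written in Python with NumPy. The function name `logistic_transform` and the example values are hypothetical. Subtracting the per-voxel maximum before exponentiating is a common stabilisation added here as an assumption; the text itself relies only on the bounds on $\tilde{w}_{ik}$.

```python
import numpy as np

GAMMA = 0.05          # value of gamma quoted in the text
W_TILDE_BOUND = 10.0  # bound on w_tilde quoted in the text


def logistic_transform(w_tilde, gamma=GAMMA):
    """Map an (N, K) array of unconstrained weights to weights that sum to 1 over k."""
    w_tilde = np.clip(np.asarray(w_tilde, dtype=float), -W_TILDE_BOUND, W_TILDE_BOUND)
    z = w_tilde / gamma
    # Subtracting the per-voxel maximum leaves equation 12 unchanged
    # but keeps the arguments of exp() at or below zero.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


# Hypothetical example: two voxels, three classes.
w_tilde = np.array([[0.3, 0.1, -0.2],
                    [-4.0, 9.5, 9.4]])
w = logistic_transform(w_tilde)
print(w)
print(w.sum(axis=1))
```

Each row of `w` sums to 1, and the ordering of the entries within a row matches the ordering of the corresponding row of `w_tilde`.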

To summarise, we now have two vectors of continuous weights at each voxel, with elements ${w}_{ik}$ and $\tilde{w}_{ik}$. The weights $\tilde{w}_{ik}$ have a prior which is uniform on the real line. We use the logistic transform to deterministically map the weights $\tilde{w}_{ik}$ to ${w}_{ik}$ at each voxel. The resulting ${w}_{ik}$ are continuous weights whose prior approximates delta functions at 0 and 1, so that they approximate the discrete labels.
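To illustrate how these weights enter equation 9, the sketch below evaluates the log of its data term, $\prod_i \sum_k w_{ik}\,p(y_i\vert x_i=k,\theta_k)$, starting from $\tilde{w}$. It is an illustration only: the Gaussian form of $p(y_i\vert x_i=k,\theta_k)$, the function names and the toy data are all assumptions, and the prior terms $p(\vec{w})$ and $p(\vec{\theta})$ are left out.

```python
import numpy as np


def softmax_weights(w_tilde, gamma=0.05):
    """Equation 12, with the per-voxel maximum subtracted for numerical stability."""
    z = np.asarray(w_tilde, dtype=float) / gamma
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def log_data_term(y, w_tilde, means, sds, gamma=0.05):
    """Log of prod_i sum_k w_ik p(y_i | x_i = k, theta_k), assuming Gaussian classes."""
    y = np.asarray(y, dtype=float)[:, None]     # shape (N, 1)
    w = softmax_weights(w_tilde, gamma)         # shape (N, K)
    # Assumed Gaussian density of each observation under each class, shape (N, K).
    dens = np.exp(-0.5 * ((y - means) / sds) ** 2) / (sds * np.sqrt(2.0 * np.pi))
    return float(np.sum(np.log(np.sum(w * dens, axis=1))))


# Hypothetical toy data: three voxels, two classes.
y = np.array([0.1, 2.9, 3.2])
means = np.array([0.0, 3.0])
sds = np.array([1.0, 1.0])
good = np.array([[5.0, -5.0], [-5.0, 5.0], [-5.0, 5.0]])  # weights agree with the data
bad = -good                                                # weights contradict the data
print(log_data_term(y, good, means, sds), log_data_term(y, bad, means, sds))
```

With $\gamma=0.05$ the weights are effectively 0 or 1, so the value for `good` matches the log-likelihood obtained by assigning each voxel a hard label, which is the sense in which equation 9 tends to equation 3.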

**Figure 1:** Consider the case of $K=2$ classes. [top] shows samples from the prior of $\tilde{w}_{i1}$, $\tilde{w}_{i1} \sim Uniform(-10,10)$ (samples from $\tilde{w}_{i2}$ are similar). [bottom] shows the samples of $w_{i1}$ ($w_{i2}$ is similar) obtained by passing the prior samples of $\tilde{w}_{i1}$ and $\tilde{w}_{i2}$ through the logistic transform (equation 12) with different values of $\gamma$. This shows how the transform produces a prior for $\vec{w_i}$ which approximates the desired delta functions at 0 and 1 as $\gamma$ gets smaller.
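The behaviour shown in Figure 1 can be checked numerically with a short sketch. It draws $\tilde{w}_{i1}$ and $\tilde{w}_{i2}$ from $Uniform(-10,10)$, applies equation 12 for $K=2$, and reports the fraction of $w_{i1}$ samples lying close to 0 or 1. The values of $\gamma$ other than 0.05, the tolerance `tol`, the sample count and the function name are illustrative assumptions, not the settings used for the figure.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 100_000

# Uniform(-10, 10) prior samples for the two classes, as in the Figure 1 caption.
w_tilde = rng.uniform(-10.0, 10.0, size=(n_samples, 2))


def fraction_near_ends(gamma, tol=0.01):
    """Fraction of transformed samples w_i1 lying within tol of 0 or 1."""
    # For K = 2, equation 12 reduces to a sigmoid of the scaled difference.
    d = (w_tilde[:, 0] - w_tilde[:, 1]) / gamma
    w1 = 0.5 * (1.0 + np.tanh(0.5 * d))  # overflow-free form of 1 / (1 + exp(-d))
    return float(np.mean((w1 < tol) | (w1 > 1.0 - tol)))


for gamma in (5.0, 1.0, 0.05):
    print(gamma, fraction_near_ends(gamma))
```

The reported fraction grows towards 1 as $\gamma$ shrinks, which is the concentration of prior mass at 0 and 1 described in the caption.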
