Model Fitting Using the EM Algorithm

A statistical model is complete only when both its functional form and its parameters are determined. The procedure for estimating the unknown parameters is known as model fitting. For an HMRF model, the parameter set $\theta = \{\theta_\ell, \ell \in \mathcal L\}$ must be estimated. If a Gaussian emission function is assumed for the observable random variable y, the parameters are the mean and standard deviation of each Gaussian class, so that $\theta_\ell=(\mu_\ell, \sigma_\ell)$. Since the class labels and the parameters are both unknown and strongly inter-dependent, the data set is said to be ``incomplete'' and the parameter-estimation problem is regarded as an ``incomplete-data'' problem. Many techniques have been proposed to solve this problem, among which the expectation-maximization (EM) algorithm [11] is the most widely used. The strategy underlying the EM algorithm is as follows: estimate the missing part as $\hat{\mathbf x}$ given the current estimate of $\theta$, use it to form the complete data set $\{\hat{\mathbf x}, \mathbf y\}$, and then estimate a new $\theta$ by maximizing the expectation of the complete-data log-likelihood, ${\mathcal E}\big[\log P(\mathbf x, \mathbf y\vert\theta)\big]$. Mathematically, the EM algorithm can be described as follows:
Start
Begin with an initial estimate $\theta^{(0)}$.
The E-step
Calculate the conditional expectation

\begin{displaymath}
Q(\theta\vert\theta^{(t)}) = \mathcal E \Big[\log P(\mathbf x, \mathbf y\vert\theta) \,\Big\vert\, \mathbf y, \theta^{(t)}\Big]
= \sum_{{\mathbf x} \in \mathcal X} p({\mathbf x}\vert{\mathbf y},\theta^{(t)})\cdot\log p({\mathbf x, \mathbf y}\vert\theta).
\end{displaymath} (22)

The M-step
Maximize $Q(\theta\vert\theta^{(t)})$ to obtain the next estimate

\begin{displaymath}
\theta^{(t+1)}=\arg\max_\theta Q(\theta\vert\theta^{(t)}).
\end{displaymath} (23)

Let $\theta^{(t+1)} \rightarrow \theta^{(t)}$ and repeat from the E-step.
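As a minimal sketch (our own illustration, not from the paper), this E/M alternation maps directly onto a generic loop; e_step and m_step below are hypothetical caller-supplied callbacks standing in for Equations (22) and (23):

\begin{verbatim}
import numpy as np

def em(y, theta0, e_step, m_step, max_iter=50, tol=1e-6):
    """Generic EM loop sketching the scheme above.

    e_step(y, theta) should return the posterior weights that define
    Q(theta | theta^(t)); m_step(y, weights) should return the theta
    maximizing Q.  Both are hypothetical callbacks, not functions
    defined in the text.
    """
    theta = np.asarray(theta0, dtype=float)
    for _ in range(max_iter):
        weights = e_step(y, theta)        # E-step: Eq. (22)
        theta_new = m_step(y, weights)    # M-step: Eq. (23)
        if np.max(np.abs(theta_new - theta)) < tol:
            theta = theta_new
            break
        theta = theta_new                 # theta^(t+1) -> theta^(t)
    return theta
\end{verbatim}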
Under certain reasonable conditions, the EM estimates converge locally to the ML estimates [32]. For the GHMRF model, the intensity distribution function, which depends on the parameter set $\theta$, is

\begin{displaymath}
p(y_i\vert\theta) = \sum_{\ell \in \mathcal L}
\frac{1}{\sqrt{2\pi\sigma_\ell^2}}
\exp\Big(-\frac{(y_i-\mu_\ell)^2}{2\sigma_\ell^2}\Big) \cdot
p(\ell\vert x_{{\mathcal N}_i}),
\end{displaymath} (24)

where $p(\ell\vert x_{{\mathcal N}_i})$ is the locally dependent probability of $x_i=\ell$, and the parameter set is $\theta=\{\mu_\ell, \sigma_\ell\vert\ell \in \mathcal L\}$.
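As an illustrative sketch only (the function name and array layout are ours, not the paper's), Equation (24) evaluates, for one voxel, a Gaussian mixture whose weights are the neighbourhood-dependent label probabilities:

\begin{verbatim}
import numpy as np

def intensity_density(y_i, mu, sigma, local_prior):
    """Sketch of Eq. (24): p(y_i | theta) for one voxel.

    mu, sigma   -- per-class Gaussian parameters (length-|L| arrays)
    local_prior -- p(l | x_{N_i}), the label probabilities implied by
                   the neighbourhood configuration (length-|L| array)
    """
    g = np.exp(-(y_i - mu) ** 2 / (2.0 * sigma ** 2)) \
        / np.sqrt(2.0 * np.pi * sigma ** 2)
    return float(np.sum(g * local_prior))

# e.g. two classes with means 0.3 and 0.8:
# intensity_density(0.7, np.array([0.3, 0.8]),
#                   np.array([0.1, 0.1]), np.array([0.4, 0.6]))
\end{verbatim}

The Q-function is then formulated as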

\begin{displaymath}
Q = \sum_{i \in \mathcal S}\sum_{\ell \in {\mathcal L}}
\left\{P^{(t)}(\ell\vert y_i)\left[\log p(\ell\vert x_{{\mathcal N}_i})
-\log\sigma_\ell-\frac{(y_i-\mu_\ell)^2}{2\sigma_\ell^2}\right]+C\right\},
\end{displaymath} (25)

where $C=-0.5\log(2\pi)$. Setting the derivatives of $Q$ with respect to $\mu_\ell$ and $\sigma_\ell$ to zero, we obtain
\begin{displaymath}
\mu_\ell^{(t+1)} = \frac{\sum_{i \in \mathcal S}P^{(t)}(\ell\vert y_i)\,y_i}
{\sum_{i \in \mathcal S}P^{(t)}(\ell\vert y_i)}
\end{displaymath} (26)

\begin{displaymath}
\big(\sigma_\ell^{(t+1)}\big)^2 = \frac{\sum_{i \in \mathcal S}P^{(t)}(\ell\vert y_i)(y_i-\mu_\ell)^2}
{\sum_{i \in \mathcal S}P^{(t)}(\ell\vert y_i)},
\end{displaymath} (27)

which are the same update equations as for the FGM model [4], except that

\begin{displaymath}P^{(t)}(\ell\vert y_i)=\frac{g^{(t)}(y_i;\theta_\ell)\cdot
P^{(t)}(\ell\vert x_{{\mathcal N}_i})}{p(y_i)}.
\end{displaymath} (28)

The calculation of the conditional probability $P^{(t)}(\ell\vert x_{{\mathcal N}_i})$ involves estimation of the class labels, which are obtained through the MRF-MAP estimation of Equation (21). We refer to this HMRF model-based EM algorithm as the HMRF-EM algorithm, and to the standard FM model-based EM algorithm as the FM-EM algorithm.
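The following sketch (our own illustration with hypothetical names, not FSL code) performs one HMRF-EM parameter update: the posteriors of Equation (28) are computed from the Gaussian densities and the local priors, and then the updates (26) and (27) are applied. The local priors are passed in by the caller, standing in for the MRF-MAP label estimation of Equation (21):

\begin{verbatim}
import numpy as np

def hmrf_em_step(y, mu, sigma, local_prior):
    """One EM parameter update under the HMRF model (sketch).

    y           -- flattened voxel intensities, shape (n,)
    mu, sigma   -- current class parameters, shape (L,)
    local_prior -- P^(t)(l | x_{N_i}) per voxel and class, shape (n, L);
                   in the paper these come from the MRF-MAP labels of
                   Eq. (21), here they are supplied by the caller.
    """
    # Gaussian densities g^(t)(y_i; theta_l), shape (n, L)
    g = np.exp(-(y[:, None] - mu) ** 2 / (2.0 * sigma ** 2)) \
        / np.sqrt(2.0 * np.pi * sigma ** 2)
    # Eq. (28): posteriors; the row sum is p(y_i) as in Eq. (24)
    joint = g * local_prior
    post = joint / joint.sum(axis=1, keepdims=True)
    # Eqs. (26) and (27): weighted mean and variance updates
    w = post.sum(axis=0)
    mu_new = post.T @ y / w
    var_new = (post * (y[:, None] - mu_new) ** 2).sum(axis=0) / w
    return mu_new, np.sqrt(var_new)
\end{verbatim}

Alternating this parameter update with the MRF-MAP label estimation of Equation (21) yields the full HMRF-EM algorithm.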