
Hidden Markov Random Field Model

The concept of a hidden Markov random field model is derived from hidden Markov models (HMMs), which are defined as stochastic processes generated by a Markov chain whose state sequence cannot be observed directly, only through a sequence of observations. Each observation is assumed to be a stochastic function of the state sequence. The underlying Markov chain changes its state according to an $l \times l$ transition probability matrix, where $l$ is the number of states. HMMs have been applied successfully to speech recognition [26,30] and handwritten script recognition [19]. Because original HMMs were designed as 1D Markov chains with first-order neighbourhood systems, they cannot be used directly in 2D/3D problems such as image segmentation. Here, we consider a special case of an HMM in which the underlying stochastic process is a Markov random field (MRF) instead of a Markov chain, and is therefore not restricted to 1D. We refer to this special case as a hidden Markov random field (HMRF) model.

Mathematically, an HMRF model is characterized by a hidden random field $X = \{X_i, i \in \mathcal S\}$, which is an MRF taking values in the label set $\mathcal L$; an observable random field $Y = \{Y_i, i \in \mathcal S\}$, in which each $Y_i$, given $X_i = \ell$, follows a known conditional distribution of the same functional form $f(y_i; \theta_\ell)$, called the emission probability function; and the assumption that, given any configuration $x$, the random variables $Y_i$ are conditionally independent of each other. Based on the above, we can write the joint probability of $(X,Y)$ as
 
\begin{eqnarray*}
P(y, x) & = & P(y\vert x)\,P(x) \\
 & = & P(x) \prod_{i \in \mathcal S} P(y_i\vert x_i).
\end{eqnarray*}
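
As a purely illustrative sketch of this factorization, the following Python fragment evaluates an unnormalized $\log P(y, x)$ for a small labelled signal. The Gaussian emission form (anticipating (12)), the Potts-style prior energy for $P(x)$, and all function names and parameter values are assumptions made for the example; only the split into a prior term plus a sum of per-site emission terms comes from the model itself.

\begin{verbatim}
import numpy as np

def log_joint(y, x, means, variances, neighbour_pairs, beta):
    # Emission term: by conditional independence, log P(y|x) is a sum of
    # per-site Gaussian log-densities log f(y_i; theta_{x_i}).
    mu, var = means[x], variances[x]
    log_emission = -0.5 * (np.log(2.0 * np.pi * var) + (y - mu) ** 2 / var)

    # Prior term: log P(x) up to its normalizing constant, using an
    # illustrative Potts-style energy that penalizes disagreeing neighbours.
    disagreements = sum(int(x[i] != x[j]) for i, j in neighbour_pairs)
    log_prior = -beta * disagreements

    return log_prior + log_emission.sum()

# Four sites in a chain, two classes centred at 0 and 2.
y = np.array([0.1, -0.2, 1.9, 2.2])
x = np.array([0, 0, 1, 1])
print(log_joint(y, x, means=np.array([0.0, 2.0]),
                variances=np.array([0.25, 0.25]),
                neighbour_pairs=[(0, 1), (1, 2), (2, 3)], beta=1.0))
\end{verbatim}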

According to the local characteristics of MRFs, the joint probability of any pair $(X_i, Y_i)$, given $X_i$'s neighbourhood configuration $X_{{\mathcal N}_i}$, is:

 \begin{displaymath}
P(y_i, x_i\vert x_{{\mathcal N}_i}) = P({y_i\vert x_i})P(x_i\vert x_{{\mathcal N}_i})
\end{displaymath} (9)
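
As a small numerical illustration of (9), the sketch below, which is a hypothetical example and not code from the paper, computes $P(x_i\vert x_{{\mathcal N}_i})$ for an assumed Potts-style MRF and multiplies it by a caller-supplied emission term $P(y_i\vert x_i)$:

\begin{verbatim}
import numpy as np

def label_conditional(neighbour_labels, n_labels, beta):
    # P(x_i = l | x_{N_i}) for an illustrative Potts-style MRF:
    # labels shared by more neighbours receive more probability mass.
    counts = np.bincount(neighbour_labels, minlength=n_labels)
    w = np.exp(beta * (counts - counts.max()))
    return w / w.sum()

def local_joint(y_i, neighbour_labels, emission, n_labels, beta):
    # P(y_i, x_i = l | x_{N_i}) = P(y_i | x_i = l) * P(x_i = l | x_{N_i}),
    # evaluated for every label l, as in equation (9).
    prior = label_conditional(neighbour_labels, n_labels, beta)
    likelihood = np.array([emission(y_i, l) for l in range(n_labels)])
    return likelihood * prior

# Example emission: two Gaussian classes with means 0 and 2, unit variance.
emission = lambda y, l: np.exp(-0.5 * (y - (0.0, 2.0)[l]) ** 2) / np.sqrt(2.0 * np.pi)
print(local_joint(1.7, np.array([1, 1, 0]), emission, n_labels=2, beta=0.5))
\end{verbatim}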

Thus, we can compute the marginal probability distribution of $Y_i$, conditional on the parameter set $\theta$ (which we treat here as a random variable) and on $X_{{\mathcal N}_i}$:
 
\begin{eqnarray*}
p(y_i\vert x_{{\mathcal N}_i},\theta) & = & \sum_{\ell \in \mathcal L} p(y_i, \ell\vert x_{{\mathcal N}_i},\theta) \\
 & = & \sum_{\ell \in \mathcal L} f(y_i;\theta_\ell)\,p(\ell\vert x_{{\mathcal N}_i}),
\end{eqnarray*} (10)

where $\theta = \{\theta_\ell, \ell \in \mathcal L\}$. We call this the hidden Markov random field (HMRF) model. Note that the concept of an HMRF is different from that of an MRF in the sense that the former is defined with respect to the pair of random variable families $(X,Y)$, whereas the latter is defined only with respect to $X$. More precisely, an HMRF model is specified by the hidden random field $X$, the emission distributions $f(y_i;\theta_\ell)$ of the observable field $Y$, and the conditional independence assumption stated above.

If we assume that the random variables $X_i$ are independent of each other, which means that for all $\ell \in \mathcal L$ and $i \in \mathcal S$ we have

\begin{displaymath}P(\ell\vert x_{{\mathcal N}_i})=P(\ell)=\omega_\ell,
\end{displaymath}

then equation (10) reduces to

\begin{displaymath}p(y\vert\theta)=\sum_{\ell \in \mathcal L}\omega_\ell \cdot
f(y;\theta_\ell),
\end{displaymath}

which is the definition of the finite mixture (FM) model. Therefore, an FM model is a degenerate special case of an HMRF model. It is clear from the above that the fundamental difference between the FM model and the HMRF model lies in their spatial properties: the FM model is spatially independent, whereas the HMRF model may be spatially dependent. The HMRF model is therefore more flexible for image modelling, in the sense that it can encode both the statistical and the spatial properties of an image.

With a Gaussian emission distribution, the FM model is usually known as the finite Gaussian mixture (FGM) or finite normal mixture (FNM) model. More specifically, the observable random variables have the following density function:

 \begin{displaymath}
p(y\vert\phi)=\sum_{\ell \in \mathcal L}\omega_\ell \cdot
g(y;\theta_\ell)
\end{displaymath} (11)

where

\begin{displaymath}
g(y;\theta_\ell)=\frac{1}{\sqrt{2\pi\sigma_\ell^2}}
\exp\left(-\frac{(y-\mu_\ell)^2}{2\sigma_\ell^2}\right)
\quad\text{and}\quad
\theta_\ell=(\mu_\ell, \sigma_\ell)^T.
\end{displaymath} (12)
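
For concreteness, the following sketch evaluates the FGM density (11) using the Gaussian components of (12); the two-class weights, means, and variances are illustrative values only:

\begin{verbatim}
import numpy as np

def fgm_density(y, weights, means, variances):
    # p(y | phi) = sum_l w_l * g(y; theta_l), with g the Gaussian density of (12).
    y = np.atleast_1d(np.asarray(y, dtype=float))[:, None]   # shape (n, 1)
    g = np.exp(-0.5 * (y - means) ** 2 / variances) / np.sqrt(2.0 * np.pi * variances)
    return (weights * g).sum(axis=1)

# Illustrative two-class mixture; the weights must sum to one.
print(fgm_density([0.0, 1.0, 2.0],
                  weights=np.array([0.4, 0.6]),
                  means=np.array([0.0, 2.0]),
                  variances=np.array([0.25, 0.25])))
\end{verbatim}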

Similarly, an HMRF model with a Gaussian emission distribution can be specified as:

\begin{displaymath}
p(y_i\vert x_{{\mathcal N}_i},\theta) = \sum_{\ell \in \mathcal L}
g(y_i;\theta_\ell)\,p(\ell\vert x_{{\mathcal N}_i}),
\end{displaymath} (13)

where g and $\theta_\ell$ are defined as in (12). We refer to this type of HMRF as the Gaussian hidden Markov random field (GHMRF) model.
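
To illustrate (13) numerically, the sketch below combines Gaussian emission terms with a neighbourhood-dependent label prior $p(\ell\vert x_{{\mathcal N}_i})$. The Potts-style form of that prior and all parameter values are assumptions made for the example; the model itself only requires that the prior be derived from the underlying MRF.

\begin{verbatim}
import numpy as np

def label_prior(neighbour_labels, n_labels, beta):
    # p(l | x_{N_i}) from an illustrative Potts-style MRF: labels shared
    # by more neighbours are more probable.
    counts = np.bincount(neighbour_labels, minlength=n_labels)
    w = np.exp(beta * (counts - counts.max()))
    return w / w.sum()

def ghmrf_density(y_i, neighbour_labels, means, variances, beta):
    # p(y_i | x_{N_i}, theta) = sum_l g(y_i; theta_l) p(l | x_{N_i}), as in (13),
    # with g the Gaussian density of (12).
    g = np.exp(-0.5 * (y_i - means) ** 2 / variances) / np.sqrt(2.0 * np.pi * variances)
    return float((g * label_prior(neighbour_labels, len(means), beta)).sum())

# A voxel with intensity 1.8 whose neighbours are mostly labelled as class 1.
print(ghmrf_density(1.8, np.array([1, 1, 1, 0]),
                    means=np.array([0.0, 2.0]),
                    variances=np.array([0.25, 0.25]),
                    beta=0.5))
\end{verbatim}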