Recall from equation 5 that in order to estimate the mixing matrix and the sources, we need to optimise an orthogonal rotation matrix $\boldsymbol{W}$ in the space of whitened observations $\tilde{\boldsymbol{x}}$.
In order to choose a technique for the unmixing step, note that all
previous results have highlighted the importance of non-Gaussianity of the
source distributions: the split into a non-Gaussian part plus additive Gaussian
noise is at the heart of the uniqueness results. Also, the estimation of the
intrinsic dimensionality is based on the identification of eigenvectors of the
data covariance matrix that violate the sphericity assumption of the isotropic
Gaussian noise model. Consistent with this, we will estimate the unmixing matrix
based on the principle of non-Gaussianity. [Hyvärinen et al., 2001]
have
presented an elegant fixed point algorithm that uses approximations to
neg-entropy in order to optimise for non-Gaussian source distributions
and give a clear account of the relation between this approach and
statistical independence. In brief, the individual sources are
obtained by projecting the whitened data $\tilde{\boldsymbol{x}}$
onto the individual rows of $\boldsymbol{W}$, i.e. the
$r$th source is estimated as $\hat{s}_r = \boldsymbol{w}_r^{\top}\tilde{\boldsymbol{x}}$.
Each vector $\boldsymbol{w}_r$ is updated via the fixed point iteration
$$\boldsymbol{w}_r^{+} = \mathrm{E}\{\tilde{\boldsymbol{x}}\, g(\boldsymbol{w}_r^{\top}\tilde{\boldsymbol{x}})\} - \mathrm{E}\{g'(\boldsymbol{w}_r^{\top}\tilde{\boldsymbol{x}})\}\,\boldsymbol{w}_r,$$
where $g'$ denotes the derivative of the non-linear contrast function
$g$. This is followed by a re-normalisation step
$\boldsymbol{w}_r \leftarrow \boldsymbol{w}_r^{+}/\|\boldsymbol{w}_r^{+}\|$ such that
$\boldsymbol{w}_r$ is of unit length.
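The single-unit update just sketched above can be written compactly in code. The following is a minimal NumPy illustration, not the authors' implementation; the tanh contrast and the toy Laplace-distributed data are assumptions made for the example:

```python
import numpy as np

def fastica_unit_step(w, X, g, g_prime):
    """One fixed-point update for a single unmixing vector.

    w : (n,) current unit-norm weight vector
    X : (n, T) whitened observations (zero mean, identity covariance)
    g, g_prime : contrast non-linearity and its derivative
    """
    u = w @ X                                    # projections w^T x for all samples
    # w+ = E{x g(w^T x)} - E{g'(w^T x)} w
    w_new = (X * g(u)).mean(axis=1) - g_prime(u).mean() * w
    return w_new / np.linalg.norm(w_new)         # re-normalise to unit length

# Usage sketch with the tanh contrast on toy non-Gaussian data
# (Laplace samples stand in for whitened observations here).
rng = np.random.default_rng(0)
X = rng.laplace(size=(2, 5000))
X -= X.mean(axis=1, keepdims=True)
w = np.array([1.0, 0.0])
for _ in range(20):
    w = fastica_unit_step(w, X, np.tanh, lambda u: 1.0 - np.tanh(u) ** 2)
```

After each step the weight vector is projected back onto the unit sphere, mirroring the re-normalisation described above.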
A proof of convergence and a discussion of the choice of the
non-linear function $g$ can be found in [Hyvärinen et al., 2001]. In order to
estimate $q$ sources, this estimation is simply performed $q$ times
under the constraint that the vectors $\boldsymbol{w}_r$
are mutually orthogonal.
The constraints on the norm and the mutual orthogonality ensure that
these vectors actually form an orthogonal rotation matrix
$\boldsymbol{W}$.
Thus, estimation of the sources is carried out under the assumption
that all marginal distributions of $\boldsymbol{W}\tilde{\boldsymbol{x}}$
are maximally non-Gaussian.
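The full deflation scheme, estimating the $q$ vectors one after another under the orthogonality constraint, can be sketched as follows. This is an illustrative NumPy sketch under assumed choices (tanh contrast, Gram–Schmidt orthogonalisation, eigendecomposition-based whitening), not the specific implementation used here:

```python
import numpy as np

def fastica_deflation(X, q, n_iter=100, seed=0):
    """Estimate q unmixing vectors one by one (deflation), keeping each
    new vector orthogonal to those already found, so that the rows of W
    form an orthogonal rotation matrix."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    W = np.zeros((q, n))
    g = np.tanh
    g_prime = lambda u: 1.0 - np.tanh(u) ** 2
    for r in range(q):
        w = rng.standard_normal(n)
        w /= np.linalg.norm(w)
        for _ in range(n_iter):
            u = w @ X
            w = (X * g(u)).mean(axis=1) - g_prime(u).mean() * w
            w -= W[:r].T @ (W[:r] @ w)   # Gram-Schmidt: stay orthogonal to earlier rows
            w /= np.linalg.norm(w)       # re-normalise to unit length
        W[r] = w
    return W                             # sources: S = W @ X

# Toy demonstration: mix two Laplace sources, whiten, then unmix.
rng = np.random.default_rng(1)
S_true = rng.laplace(size=(2, 5000))
A = np.array([[1.0, 0.6], [0.4, 1.0]])
X = A @ S_true
X -= X.mean(axis=1, keepdims=True)
# whiten via eigendecomposition of the sample covariance
d, E = np.linalg.eigh(np.cov(X))
X_white = E @ np.diag(d ** -0.5) @ E.T @ X
W = fastica_deflation(X_white, q=2)
```

Because each new vector is orthogonalised against the previous ones and re-normalised at every step, the returned matrix satisfies $\boldsymbol{W}\boldsymbol{W}^{\top} = \boldsymbol{I}$ up to numerical precision.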
The choice of the nonlinear function is domain specific and in our case will be strongly linked to the inferential steps that are being performed after IC estimation (see section 4 below).
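For illustration, the three contrast functions commonly discussed in [Hyvärinen et al., 2001] and their derivatives are listed below; which of them (if any) is appropriate for the present setting is exactly the domain-specific question deferred to section 4:

```python
import numpy as np

# Standard FastICA contrast non-linearities g and their derivatives g'
# (Hyvarinen et al., 2001); the choice trades robustness against
# sensitivity to sub-/super-Gaussian source distributions.
CONTRASTS = {
    # general-purpose, robust default
    "tanh": (lambda u: np.tanh(u),
             lambda u: 1.0 - np.tanh(u) ** 2),
    # suited to highly super-Gaussian (sparse) sources
    "exp":  (lambda u: u * np.exp(-u ** 2 / 2),
             lambda u: (1.0 - u ** 2) * np.exp(-u ** 2 / 2)),
    # kurtosis-based; fast but sensitive to outliers
    "cube": (lambda u: u ** 3,
             lambda u: 3.0 * u ** 2),
}

g, g_prime = CONTRASTS["tanh"]   # pick a contrast for the fixed point step
```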