In order to understand the computational difficulty of the problem it is helpful to consider the amount of calculations required. For instance, take a typical field of view (FOV) for the brain as 200mm cubed, and let each voxel be a cube of length d mm. Therefore, there are (200/d)3 voxels in the volume.
To interpolate this volume requires, generally, one evaluation of the interpolation function (acting on the floating image) for each voxel location in the reference image. For trilinear interpolation there are at least 8 floating point operations required for each interpolation evaluation. Therefore, the total cost of interpolation is at least operations.
In addition, to calculate the cost function typically requires at least one operation (to estimate variance) per voxel, making the number of operations required per voxel at least 9. Therefore, on a machine that could achieve 500 MFLOPS, the time required for one cost function evaluation, with voxel size of 1mm would be at least seconds. However, in practice there are some additional calculations required as well as memory access overheads, and so our implementation (on a Pentium3 500MHz) takes approximately 1 second for this evaluation.
Now consider implementing a search through the transformation space by constructing a grid with N different values for each parameter, so that there are a total of ND elements (where D is the Degree Of Freedom -- 12 for affine). The total number of calculations required to evaluate the cost function at each point is which would ideally take seconds at 500 MFLOPS. Table 1 shows the times required for a range of moderate parameters. From this table it can be seen that evaluations for 12 DOF using 1mm voxels are prohibitive, taking over 4000 years for a grid with N=10 and over 21 hours for a simple neighbourhood N=3. However, using 8mm cubed voxels improves the execution time although the biggest improvement comes from restricting the dimensionality of the transformation space (that is, the DOF).
For a simple, but usable, search routine it would be necessary to search with a grid of at least N=100, in which case even for 6 DOF and 8mm voxels the time taken would exceed 7 hours. This would be unacceptable for most users, especially given that our implementation would be likely to take over 50 hours. However, in combination with a local optimisation method, a coarse search method using 8mm voxels can be used to create a useful optimisation method, which is pursued in section 4.3.