Chapter 3 Example box: Single subject ICA

Introduction

The aim of this example is to get a better feeling for the types of noise present in fMRI data, and to get some experience with recognising different ICA signal and noise components.

This example is based on tools available in FSL, and the file names and instructions are specific to FSL. However, similar analyses can be performed in other neuroimaging software packages.

Please download the dataset for this example here:

Data download

The dataset you downloaded contains two example output folders created by running ICA on data from a single subject. In the example below, we will visualise the single-subject ICA components and you will label the components manually (using the guidelines described in Chapter 3). Then, you will compare your hand classification of the components to labels from semi-automated classification tools.

Manual classification

To understand how data quality and quantity influence the performance of ICA, you are going to look at two different examples. The first example uses multiband data.

Open a command line terminal and change directory into the directory you have downloaded called 'Data_3.1' (using cd). We are now going to use FSLeyes to classify these components as signal or noise. Use the following command to open FSLeyes in "MELODIC mode":

fsleyes --scene melodic -ad Rest_MB6.feat/filtered_func_data.ica &

This simultaneously shows the spatial map, time course and power spectrum for each component, and we can use this information to perform the classifications, as explained in Chapter 3 of the primer. You can label the components as 'Signal', 'Movement' etc within this view. You can add multiple labels to the same component. Any component that does not have a 'Signal' or 'Unknown' label will be considered noise in the subsequent clean-up step.

Make sure melodic_IC is selected, click on the Load labels button in the melodic IC classification window, and open the labels.txt file from the Rest_MB6.feat directory. FSLeyes will show a message saying that the label file does not refer to the melodic directory; this is because we renamed the directory, so click on Apply the labels to the current overlay. You will now see labels for all the components except for ten, which are labelled as Unknown (in yellow).

Please have a look at these 10 and label each one as either Unclassified noise or as Signal (note that FSLeyes allows you to include more informative labels if you wish). Remember that the components you are looking at were obtained from a 15-minute run of multiband data, and no smoothing was applied during preprocessing. Once you have classified all of the components, please save your results by overwriting the labels.txt file and close FSLeyes.

Next, open the following dataset (this is an example of non-multiband EPI data):

fsleyes --scene melodic -ad Rest_EPI.ica/filtered_func_data.ica &

Load the labels.txt file (in Rest_EPI.ica) in the same manner as before, and classify the 10 unknown components. Keep in mind that this is an older dataset with far fewer timepoints and larger voxels, and that smoothing was applied during preprocessing. Again, save the results by overwriting labels.txt and close FSLeyes.

Removing noise components from the data (clean-up)

You will now "clean up" your data by removing the components that you classified as noise. This is done using fsl_regfilt, which regresses the time courses of the noise components out of the data.

We first need to get a list of IDs of the noise components:

tail -n 1 Rest_EPI.ica/labels.txt

Now we can pass this list of numbers to fsl_regfilt (in the command below, replace 1,2,3 with the output of the above command, making sure to remove the square brackets and to also remove all of the spaces between numbers):

fsl_regfilt -i Rest_EPI.ica/filtered_func_data.nii.gz -d Rest_EPI.ica/filtered_func_data.ica/melodic_mix -o Rest_EPI.ica/filtered_func_data_clean.nii.gz -f 1,2,3

This command creates a new version of the preprocessed dataset from this subject, from which any variance that can be explained by the components labelled as noise has been removed. This 'cleaned' dataset can now be used for subsequent analyses.
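
The bracket and space removal described above can also be scripted. The following is a minimal sketch, assuming the last line of labels.txt has the form [1, 2, 3]; the function name noise_ids is hypothetical and is not part of FSL.

```shell
# Hypothetical helper (not an FSL tool): print the last line of a label file
# with square brackets and whitespace removed, e.g. "[1, 2, 3]" -> "1,2,3".
# Assumes the noise component IDs are stored on the last line of the file.
noise_ids() {
    tail -n 1 "$1" | sed 's/[][[:space:]]//g'
}

# Example use (requires FSL; same paths as in the clean-up command above):
# fsl_regfilt -i Rest_EPI.ica/filtered_func_data.nii.gz \
#     -d Rest_EPI.ica/filtered_func_data.ica/melodic_mix \
#     -o Rest_EPI.ica/filtered_func_data_clean.nii.gz \
#     -f "$(noise_ids Rest_EPI.ica/labels.txt)"
```

It is worth printing the output of noise_ids on its own first and checking it against the list shown by tail before passing it to fsl_regfilt.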

Automatic IC classification

To avoid manually labelling all of the components for every single subject, tools have been developed that aim to automatically identify components representing structured noise in fMRI data. We will take a look at two of these tools:

FIX is an automated classification algorithm that uses hand-labelled training data to train a multi-level classifier, which then labels signal and noise components in comparable new datasets. Several pre-trained classifiers are available and can be used if your data are comparable to the data they were trained on. For optimal results, you should retrain the classifier on your own data. The command used to run FIX was:
```
fix MB6_rest.feat training.RData 20 -m
```
AROMA is an alternative to FIX that specifically aims to identify motion artefacts. It does not require classifier (re-)training across studies. It uses four theoretically motivated spatial and temporal features embedded in a simple and robust classifier, and has been shown to minimise the impact of motion while improving resting-state network reproducibility. The command used to run AROMA was:
```
python2.7 ICA_AROMA.py \
  -in filtered_func_data.nii.gz \
  -out AROMA \
  -mc mc/prefiltered_func_data_mcf.par \
  -affmat reg/example_func2highres.mat \
  -warp reg/highres2standard_warp.nii.gz \
  -md filtered_func_data.ica
```

Now you are going to compare your own classification of signal and noise components to classifications done by FIX and by AROMA. Use the following commands to get a list of the component numbers that were classified as noise by each method, and compare them against your own results (that you passed to fsl_regfilt, above).

This command lists the components that were classified as noise by FIX:

tail -n 1 Rest_EPI.ica/fix4melview_training_thr30.txt

And this command lists the components which were classified as motion-related by AROMA:

cat Rest_EPI.ica/AROMA/classified_motion_ICs.txt

Do the results agree for the ten components that you classified (which were numbers 1, 13, 14, 24, 29, 33, 36, 37, 39 and 42)? Remember that the lists contain all the components that were labeled as noise. This means that the rest of the components were either labeled as Signal, or as Unknown.
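
The comparison can also be tabulated on the command line. The following is a rough sketch under stated assumptions: each method's noise list has already been reduced to a comma-separated string with no spaces or brackets, the FIX file stores its list in brackets on the last line, and the AROMA file holds a single comma-separated line. The function names in_list and compare_labels are hypothetical.

```shell
# Hypothetical helpers for comparing noise lists; not part of FSL.

# Succeeds if component ID $1 appears in the comma-separated list $2.
in_list() {
    case ",$2," in
        *",$1,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage: compare_labels MANUAL_LIST FIX_LIST AROMA_LIST ID [ID ...]
# Prints one line per ID: the ID, then "noise" or "-" for manual, FIX, AROMA.
compare_labels() {
    manual=$1; fix=$2; aroma=$3
    shift 3
    for ic in "$@"; do
        line=$ic
        for list in "$manual" "$fix" "$aroma"; do
            if in_list "$ic" "$list"; then
                line="$line noise"
            else
                line="$line -"
            fi
        done
        echo "$line"
    done
}

# Example use with the files from this example (file formats are assumptions):
# mine=$(tail -n 1 Rest_EPI.ica/labels.txt | sed 's/[][[:space:]]//g')
# fix=$(tail -n 1 Rest_EPI.ica/fix4melview_training_thr30.txt | sed 's/[][[:space:]]//g')
# aroma=$(tr -d '[:space:]' < Rest_EPI.ica/AROMA/classified_motion_ICs.txt)
# compare_labels "$mine" "$fix" "$aroma" 1 13 14 24 29 33 36 37 39 42
```

A "-" in the AROMA column does not necessarily mean that AROMA considers the component to be signal, because AROMA only flags motion-related components.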