We use an audio-visual dataset of echo planar images (EPI) acquired on a 3 Tesla system with repetition time (TR) = 3 seconds, echo time (TE) = 30 ms, in-plane resolution 4 mm and slice thickness 7 mm. The first 4 scans were removed, and the data were motion corrected using MCFLIRT (Jenkinson et al., 2002) and high-pass filtered as described in Woolrich et al. (2001). The data were not spatially smoothed. The visual stimulus was a reversing checkerboard presented as a boxcar (30 seconds on, 30 seconds off). The auditory stimulus was also a boxcar (45 seconds on, 45 seconds off). We perform inference with the same three models that we used on the artificial datasets.
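To make the stimulus timing concrete, the two boxcar designs can be sampled at the scan times as in the following minimal sketch. The number of retained scans (`n_scans = 180`) and the choice that each paradigm begins with an "on" block are assumptions made for illustration only; neither is stated in the text, and the helper name `boxcar` is hypothetical.

```python
import numpy as np

def boxcar(n_scans, tr, on_s, off_s, start_on=True):
    """Return a 0/1 boxcar sampled once per scan.

    n_scans  : number of scans (assumed value, not given in the text)
    tr       : repetition time in seconds
    on_s     : duration of each "on" block in seconds
    off_s    : duration of each "off" block in seconds
    start_on : assumption; whether the paradigm starts with an "on" block
    """
    t = np.arange(n_scans) * tr          # acquisition time of each scan
    phase = t % (on_s + off_s)           # position within the on/off cycle
    if start_on:
        return (phase < on_s).astype(float)
    return (phase >= off_s).astype(float)

TR = 3.0        # seconds, from the acquisition parameters
n_scans = 180   # hypothetical number of scans after discarding the first 4

visual = boxcar(n_scans, TR, on_s=30, off_s=30)    # 10 scans on, 10 scans off
auditory = boxcar(n_scans, TR, on_s=45, off_s=45)  # 15 scans on, 15 scans off
```

With TR = 3 seconds, the visual cycle spans 20 scans and the auditory cycle spans 30 scans, so the two regressors have different fundamental frequencies. In practice such boxcars are usually convolved with a haemodynamic response model before being fitted to the data.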