If you are carrying out a mid-level analysis (e.g., across sessions) whose output will be fed into a higher-level analysis (e.g., across subjects), it could be argued that the mid-level analysis should be mixed-effects. A mixed-effects analysis assumes that the sessions are randomly sampled from a "population" of sessions that the subject could produce, and it involves estimating each subject's session-to-session variance. However, it is common for only a small number of sessions to be collected per subject, which makes estimating each subject's session-to-session variance impractical. One solution is to assume a common session-to-session variance for all subjects, which provides enough data for that variance to be estimated. The downside is that you lose information about which subjects are good (i.e., low variance) and which are bad (i.e., high variance).

Hence, when only a small number of sessions has been collected for each subject (say, fewer than 10), it is recommended that you use a fixed-effects analysis at the mid-level. This in effect treats the multiple first-level sessions for each subject as if they were one long session. Although this ignores the session-to-session variability at this level, it is arguable that this variability is not of interest anyway (this is a somewhat philosophical debate). In any case, the combined session and subject variability will still affect, and be estimated at, the next level.

In short, a fixed-effects analysis is favoured at the mid-level because it avoids the practical problems of estimating the session-to-session variance when there are few sessions per subject, while retaining information about which subjects are good and which are bad.
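
As a rough illustration of what a fixed-effects combination at the mid-level amounts to, the sketch below pools one subject's sessions by inverse-variance weighting. It assumes each first-level session supplies an effect estimate and a within-session variance for that estimate. The function name, the subject labels, and all numbers are hypothetical, and this is the textbook form of a fixed-effects combination, not necessarily the exact computation performed by any particular analysis package.

```python
def fixed_effects_combine(estimates, variances):
    """Pool per-session effect estimates for one subject (fixed effects).

    Each session is weighted by the inverse of its within-session variance.
    No session-to-session variance term is estimated or added.
    """
    weights = [1.0 / v for v in variances]
    combined_var = 1.0 / sum(weights)
    combined_est = combined_var * sum(w * e for w, e in zip(weights, estimates))
    return combined_est, combined_var


# Hypothetical first-level results: (effect estimates, within-session variances).
sessions = {
    "subject_a": ([2.0, 4.0], [1.0, 1.0]),            # low-variance subject
    "subject_b": ([1.0, 3.0, 5.0], [4.0, 4.0, 4.0]),  # high-variance subject
}

# Each subject keeps its own combined variance, so the next level can still
# tell the low-variance subject from the high-variance one.
subject_level = {
    name: fixed_effects_combine(est, var) for name, (est, var) in sessions.items()
}

for name, (est, var) in subject_level.items():
    print(f"{name}: estimate={est:.3f}, variance={var:.3f}")
```

Because the combined variance is computed separately for each subject, the per-subject information that a common session-to-session variance would discard is carried forward to the higher-level analysis.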