Blog Archives

fMRI In Neuroscience: Efficiency of Event-related Experiment Designs

Event-related fMRI experiments are used to detect selectivity in the brain to stimuli presented over short durations. An event is generally modeled as an impulse function that occurs at the onset of the stimulus in question. Event-related designs are flexible in that many different classes of stimuli can be intermixed. These designs can minimize confounding behavioral effects due to subject adaptation or expectation. Furthermore, stimulus onsets can be modeled at frequencies that are shorter than the repetition time (TR) of the scanner. However, given such flexibility in design and modeling, how does one determine the schedule for presenting a series of stimuli? Do we space out stimulus onsets periodically across a scan period? Or do we randomize stimulus onsets? Furthermore what is the logic for or against either approach? Which approach is more efficient for gaining incite into the selectivity in the brain?

Simulating Two fMRI Experiments: Periodic and Random Stimulus Onsets

To get a better understanding of the problem of choosing efficient experiment design, let’s simulate two simple fMRI experiments. In the first experiment, a stimulus is presented periodically 20 times, once every 4 seconds, for a run of 80 seconds in duration. We then simulate a noiseless BOLD signal evoked in a voxel with a known HRF. In the second experiment, we simulate the noiseless BOLD signal evoked by 20 stimulus onsets that occur at random times over the course of the 80 second run duration.  The code for simulating the signals and displaying output are shown below:

rand('seed',12345);
randn('seed',12345);
TR = 1 % REPETITION TIME
t = 1:TR:20; % MEASUREMENTS
h = gampdf(t,6) + -.5*gampdf(t,10); % ACTUAL HRF
h = h/max(h); % SCALE TO MAX OF 1

% SOME CONSTANTS...
trPerStim = 4; % # TR PER STIMULUS FOR PERIODIC EXERIMENT
nRepeat = 20; % # OF TOTAL STIMULI SHOWN
nTRs = trPerStim*nRepeat
stimulusTrain0 = zeros(1,nTRs);

beta = 3; % SELECTIVITY/HRF GAIN

% SET UP TWO DIFFERENT STIMULUS PARADIGM...
% A. PERIODIC, NON-RANDOM STIMULUS ONSET TIMES
D_periodic = stimulusTrain0;
D_periodic(1:trPerStim:trPerStim*nRepeat) = 1;

% UNDERLYING MODEL FOR (A)
X_periodic = conv2(D_periodic,h);
X_periodic = X_periodic(1:nTRs);
y_periodic = X_periodic*beta;

% B. RANDOM, UNIFORMLY-DISTRIBUTED STIMULUS ONSET TIMES
D_random = stimulusTrain0;
randIdx = randperm(numel(stimulusTrain0)-5);
D_random(randIdx(1:nRepeat)) = 1;

% UNDERLYING MODEL FOR (B)
X_random = conv2(D_random,h);
X_random = X_random(1:nTRs);
y_random = X_random*beta;

% DISPLAY STIMULUS ONSETS AND EVOKED RESPONSES
% FOR EACH EXPERIMENT
figure
subplot(121)
stem(D_periodic,'k');
hold on;
plot(y_periodic,'r','linewidth',2);
xlabel('Time (TR)');
title(sprintf('Responses Evoked by\nPeriodic Stimulus Onset\nVariance=%1.2f',var(y_periodic)))

subplot(122)
stem(D_random,'k');
hold on;
plot(y_random,'r','linewidth',2);
xlabel('Time (TR)');
title(sprintf('Responses Evoked by\nRandom Stimulus Onset\nVariance=%1.2f',var(y_random)))


BOLD signals evoked by periodic (left) and random (right) stimulus onsets.

The black stick functions in the simulation output indicate the stimulus onsets and each red function is the simulated noiseless BOLD signal to those stimuli. The first thing to notice is the dramatically different variances of the BOLD signals evoked for the two stimulus presentation schedules. For the periodic stimuli, the BOLD signal quickly saturates, then oscillates around an effective baseline activation. The estimated variance of the periodic-based signal is 0.18. In contrast, the signal evoked by the random stimulus presentation schedule varies wildly, reaching a maximum amplitude that is roughly 2.5 times as large the maximum amplitude of the signal evoked by periodic stimuli. The estimated variance of the signal evoked by the random stimuli is 7.4, roughly 40 times the variance of the signal evoked by the periodic stimulus.

So which stimulus schedule allows us to better estimate the HRF and, more importantly, the amplitude of the HRF, as it is the amplitude that is the common proxy for voxel selectivity/activation? Below we repeat the above experiment 50 times. However, instead of simulating noiseless BOLD responses, we introduce 50 distinct, uncorrelated noise conditions, and from the simulated noisy responses, we estimate the HRF using an FIR basis set for each  repeated trial. We then compare the estimated HRFs across the 50 trials for the periodic and random stimulus presentation schedules. Note that for each trial, the noise is exactly the same for the two stimulus presentation schedules. Further, we simulate a selectivity/tuning gain of 3 times the maximum HRF amplitude and assume that the HRF to be estimated is 16 TRs/seconds in length. The simulation and output are below:

%% SIMULATE MULTIPLE TRIALS OF EACH EXPERIMENT
%% AND ESTIMATE THE HRF FOR EACH
%% (ASSUME THE VARIABLES DEFINED ABOVE ARE IN WORKSPACE)

% CREATE AN FIR DESIGN MATRIX
% FOR EACH EXPERIMENT
hrfLen = 16;  % WE ASSUME TO-BE-ESTIMATED HRF IS 16 TRS LONG

% CREATE FIR DESIGN MATRIX FOR THE PERIODIC STIMULI
X_FIR_periodic = zeros(nTRs,hrfLen);
onsets = find(D_periodic);
idxCols = 1:hrfLen;
for jO = 1:numel(onsets)
idxRows = onsets(jO):onsets(jO)+hrfLen-1;
for kR = 1:numel(idxRows);
X_FIR_periodic(idxRows(kR),idxCols(kR)) = 1;
end
end
X_FIR_periodic = X_FIR_periodic(1:nTRs,:);

% CREATE FIR DESIGN MATRIX FOR THE RANDOM STIMULI
X_FIR_random = zeros(nTRs,hrfLen);
onsets = find(D_random);
idxCols = 1:hrfLen;
for jO = 1:numel(onsets)
idxRows = onsets(jO):onsets(jO)+hrfLen-1;
for kR = 1:numel(idxRows);
X_FIR_random(idxRows(kR),idxCols(kR)) = 1;
end
end
X_FIR_random = X_FIR_random(1:nTRs,:);

% SIMULATE AND ESTIMATE HRF WEIGHTS VIA OLS
nTrials = 50;

% CREATE NOISE TO ADD TO SIGNALS
% NOTE: SAME NOISE CONDITIONS FOR BOTH EXPERIMENTS
noiseSTD = beta*2;
noise = bsxfun(@times,randn(nTrials,numel(X_periodic)),noiseSTD);

%% ESTIMATE HRF FROM PERIODIC STIMULUS TRIALS
beta_periodic = zeros(nTrials,hrfLen);
for iT = 1:nTrials
y = y_periodic + noise(iT,:);
beta_periodic(iT,:) = X_FIR_periodic\y';
end

% CALCULATE MEAN AND STANDARD ERROR OF HRF ESTIMATES
beta_periodic_mean = mean(beta_periodic);
beta_periodic_se = std(beta_periodic)/sqrt(nTrials);

%% ESTIMATE HRF FROM RANDOM STIMULUS TRIALS
beta_random = zeros(nTrials,hrfLen);
for iT = 1:nTrials
y = y_random + noise(iT,:);
beta_random(iT,:) = X_FIR_random\y';
end

% CALCULATE MEAN AND STANDARD ERROR OF HRF ESTIMATES
beta_random_mean = mean(beta_random);
beta_random_se = std(beta_random)/sqrt(nTrials);

% DISPLAY HRF ESTIMATES
figure
% ...FOR THE PERIODIC STIMULI
subplot(121);
hold on;
h0 = plot(h*beta,'k')
h1 = plot(beta_periodic_mean,'linewidth',2);
h2 = plot(beta_periodic_mean+beta_periodic_se,'r','linewidth',2);
plot(beta_periodic_mean-beta_periodic_se,'r','linewidth',2);
xlabel('Time (TR)')
legend([h0, h1,h2],'Actual HRF','Average \beta_{periodic}','Standard Error')
title('Periodic HRF Estimate')

% ...FOR THE RANDOMLY-PRESENTED STIMULI
subplot(122);
hold on;
h0 = plot(h*beta,'k');
h1 = plot(beta_random_mean,'linewidth',2);
h2 = plot(beta_random_mean+beta_random_se,'r','linewidth',2);
plot(beta_random_mean-beta_random_se,'r','linewidth',2);
xlabel('Time (TR)')
legend([h0,h1,h2],'Actual HRF','Average \beta_{random}','Standard Error')
title('Random HRF Estimate')


Estimated HRFs from 50 trials of periodic (left) and random (right) stimulus schedules

In the simulation outputs, the average HRF for the random stimulus presentation (right) closely follows the actual HRF tuning. Also, there is little variability of the HRF estimates, as is indicated by the small standard error estimates for each time points. As well, the selectivity/gain term is accurately recovered, giving a mean HRF with nearly the same amplitude as the underlying model. In contrast, the HRF estimated from the periodic-based experiment is much more variable, as indicated by the large standard error estimates. Such variability in the estimates of the HRF reduce our confidence in the estimate for any single trial. Additionally, the scale of the mean HRF estimate is off by nearly 30% of the actual value.

From these results, it is obvious that the random stimulus presentation rate gives rise to more accurate, and less variable estimates of the HRF function. What may not be so obvious is why this is the case, as there were the same number of stimuli and  the same number of signal measurements in each experiment. To get a better understanding of why this is occurring, let’s refer back to the variances of the evoked noiseless signals. These are the signals that are underlying the noisy signals used to estimate the HRF. When noise is added it impedes the detection of the underlying trends that are useful for estimating the HRF.  Thus it is important that the variance of the underlying signal is large compared to the noise so that the signal can be detected.

For the periodic stimulus presentation schedule, we saw that the variation in the BOLD signal was much smaller than the variation in the BOLD signals evoked during the randomly-presented stimuli. Thus the signal evoked by random stimulus schedule provide a better characterization of the underlying signal in the presence of the same amount of noise, and thus provide more information to estimate the HRF. With this in mind we can think of maximizing the efficiency of the an experiment design as maximizing the variance of the BOLD signals evoked by the experiment.

An Alternative Perspective: The Frequency Power Spectrum

Another helpful interpretation is based on a signal processing perspective. If we assume that neural activity is directly correspondent with the onset of a stimulus event, then we can interpret the train of stimulus onsets as a direct signal of the evoked neural activity. Furthermore, we can interpret the HRF as a low-pass-filter that acts to “smooth” the available neural signal in time. Each of these signals–the neural/stimulus signal and the HRF filtering signal–has with it an associated power spectrum. The power spectrum for a signal captures the amount of power per unit time that the signal has as a particular frequency $\omega$. The power spectrum for a discrete signal can be calculated from the discrete Fourier transform (DFT) of the signal $F(\omega)$ as follows

$P(\omega) = | F(\omega)|^2$

Below, we use Matlab’s $\text{fft.m}$ function to calculate the DFT and the associated power spectrum for each of the stimulus/neural signals, as well as the HRF.

%% POWER SPECTRUM ANALYSES
%% (ASSUME THE VARIABLES DEFINED ABOVE ARE IN WORKSPACE)

% MAKE SURE WE PAD SUFFICIENTLY
% FOR CIRCULAR CONVOLUTION
N = 2^nextpow2(nTRs + numel(h)-1);
nUnique = ceil(1+N/2); % TAKE ONLY POSITIVE SPECTRA

% CALCULATE POWER SPECTRUM FOR PERIODIC STIMULI EXPERIMENT
ft_D_periodic = fft(D_periodic,N)/N; % DFT
P_D_periodic = abs(ft_D_periodic).^2; % POWER
P_D_periodic = 2*P_D_periodic(2:nUnique-1); % REMOVE ZEROTH & NYQUIST

% CALCULATE POWER SPECTRUM FOR RANDOM STIMULI EXPERIMENT
ft_D_random = fft(D_random,N)/N; % DFT
P_D_random = abs(ft_D_random).^2; % POWER
P_D_random = 2*P_D_random(2:nUnique-1); % REMOVE ZEROTH & NYQUIST

% CALCULATE POWER SPECTRUM OF HRF
ft_h = fft(h,N)/N; % DFT
P_h = abs(ft_h).^2; % POWER
P_h = 2*P_h(2:nUnique-1); % REMOVE ZEROTH & NYQUIST

% CREATE A FREQUENCY SPACE FOR PLOTTING
F = 1/N*[1:N/2-1];

% DISPLAY STIMULI POWER SPECTRA
figure
subplot(131)
hhd = plot(F,P_D_periodic,'b','linewidth',2);
axis square; hold on;
hhr = plot(F,P_D_random,'g','linewidth',2);
xlim([0 .3]); xlabel('Frequency (Hz)');
set(gca,'Ytick',[]); ylabel('Magnitude');
legend([hhd,hhr],'Periodic','Random')
title('Stimulus Power, P_{stim}')

% DISPLAY HRF POWER SPECTRUM
subplot(132)
plot(F,P_h,'r','linewidth',2);
axis square
xlim([0 .3]); xlabel('Frequency (Hz)');
set(gca,'Ytick',[]); ylabel('Magnitude');
title('HRF Power, P_{HRF}')

% DISPLAY EVOKED SIGNAL POWER SPECTRA
subplot(133)
hhd = plot(F,P_D_periodic.*P_h,'b','linewidth',2);
hold on;
hhr = plot(F,P_D_random.*P_h,'g','linewidth',2);
axis square
xlim([0 .3]); xlabel('Frequency (Hz)');
set(gca,'Ytick',[]); ylabel('Magnitude');
legend([hhd,hhr],'Periodic','Random')
title('Signal Power, P_{stim}.*P_{HRF}')


Power spectrum of neural/stimulus (left), HRF (center), and evoked BOLD (right) signals

On the left of the output we see the power spectra for the stimulus signals. The blue line corresponds to the spectrum for the periodic stimuli, and the green line the spectrum for the randomly-presented stimuli. The large peak in the blue spectrum corresponds to the majority of the stimulus power at 0.25 Hz for the periodic stimuli, as this the fundamental frequency of the periodic stimulus presentation (i.e. every 4 seconds). However, there is little power at any other stimulus frequencies. In contrast the green spectrum indicates that the random stimulus presentation has power at multiple frequencies.

If we interpret the HRF as a filter, then we can think of the HRF power spectrum as modulating the power spectrum of the neural signals to produce the power of the evoked BOLD signals. The power spectrum for the HRF is plotted in red in the center plot. Notice how a majority of the power for the HRF is at frequencies less than 0.1 Hz, and there is very little power at frequencies above 0.2 Hz. If the neural signal power is modulated by the HRF signal power, we see that there is little resultant power in the BOLD signals evoked by periodic stimulus presentation (blue spectrum in the right plot). In contrast, because the power for the neural signals evoked by random stimuli are spread across the frequency domain, there are a number of frequencies that overlap with those frequencies for which the HRF also has power. Thus after modulating neural/stimulus power with the HRF power, the spectrum of the BOLD signals evoked by the randomly-presented stimuli have much more power across the relevant frequency spectrum than those evoked by the periodic stimuli. This is indicated by the larger area under the green curve in the right plot.

Using the signal processing perspective allows us to directly gain perspective on the limitations of a particular experiment design which are rooted in the frequency spectrum of the HRF. Therefore, another way we can think of maximizing the efficiency of an experimental design is maximizing the amount of power in the resulting evoked BOLD responses.

Yet Another Perspective Based in Statistics: Efficiency Metric

Taking a statistics-based approach leads to a formal definition of efficiency, and further, a nice metric for testing the efficiency of an experimental design. Recall that when determining the shape of the HRF, a common approach is to use the GLM model

$y = X \beta + \epsilon$

Here $y$ is the evoked BOLD signal and $X$ is a design matrix that links a set of linear model parameters $\beta$ to those responses. The variable $\epsilon$ is a noise term that is unexplained by the model. Using an FIR basis formulation of the model, the weights in $\beta$ represent the HRF to a stimulus condition.

Because fMRI data are a continuous time series, the underlying noise $\epsilon$ is generally correlated in time. We can model this noise as a Gaussian process with zero mean and a constant multivariate covariance $C_{\epsilon}$. Note that this is analogous to the Generalized Least Squares (GLS) formulation of the GLM. In general, the values that comprise $C_{\epsilon}$ are unknown and have to be estimated from the fMRI data themselves.

For a known or estimated noise covariance, the Maximum Likelihood Estimator (MLE) for the model parameters $\beta$(derivation not shown) is:

$\hat \beta = (X^TC_{\epsilon}^{-1}X)X^TC_{\epsilon}^{-1}y$

Because the ML estimator of the HRF is a linear combination of the design matrix $X$ and a set of corresponding responses, which are both random variables ($X$ can represent any possible experiment design, and $y$ is by definition random), the estimator is itself a random variable. It thus follows that the estimate for the HRF also has a variance. (We demonstrated how $\beta$ is a random variable in the 50 simulations above, where for each simulation X was held fixed, but due to the added noise $y$ was a random variable. For each noise condition, the estimate for $\beta$ took on different values.) We saw above how an HRF estimator with a large variance is undesirable, as it reduces our confidence in the estimates of the HRF shape and scale. Therefore we would like to determine an estimator that has a minimum overall variance.

A formal metric for efficiency of a least-squares estimator is directly related to the variance of the estimator. The efficiency is defined to be the inverse of the sum of the estimator variances. An estimator that has a large sum of variances will have a low efficiency, and vice versa. But how do we obtain the values of the variances for the estimator? The variances can be recovered from the diagonal elements of the estimator covariance matrix $C_{\hat \beta}$, giving the following definition for the efficiency, $E$

$E = 1/trace(C_{\hat \beta})$

In earlier post we found that the covariance matrix $C_{\hat \beta}$ for the GLS estimator (i.e. the formulation above) with a given noise covariance $C_{\epsilon}$ is:

$C_{\hat \beta} = (X^T C_{\epsilon}^{-1} X)^{-1}$.

Thus the efficiency for the HRF estimator is

$E = 1/trace((X^T C_{\epsilon}^{-1}X)^{-1})$

Here we see that the efficiency depends only on the known noise covariance (or an estimate of it), and the design matrix used in the model, but not the shape of the HRF. In general the noise covariance is out of the experimenter’s control (but see the take-homes below ), and must be dealt with post hoc. However, because the design matrix is directly related to the experimental design, the above expression gives a direct way to test the efficiency of experimental designs before they are ever used!

In the simulations above, the noise processes are drawn from an independent multivariate Gaussian distribution, therefore the noise covariance is equal to the identity (i.e. uncorrelated). We also estimated the HRF using the FIR basis set, thus our model design matrix was $X_{FIR}$. This gives the estimate the efficiency for the simulation experiments:

$E_{simulation} = 1/trace(X_{FIR}^T X_{FIR})$

Below we calculate the efficiency for the FIR estimates under the simulated experiments with periodic and random stimulus presentation designs.

%% ESTIMATE DESIGN EFFICIENCY
%% (ASSUME THE VARIABLES DEFINED ABOVE ARE IN WORKSPACE)

% CALCULATE EFFICIENCY OF PERIODIC EXPERIMENT
E_periodic = 1/trace(pinv(X_FIR_periodic'*X_FIR_periodic));

% CALCULATE EFFICIENCY OF RANDOM EXPERIMENT
E_random = 1/trace(pinv(X_FIR_random'*X_FIR_random));

% DISPLAY EFFICIENCY ESTIMATES
figure
bar([E_periodic,E_random]);
set(gca,'XTick',[1,2],'XTickLabel',{'E_periodic','E_random'});
title('Efficiency of Experimental Designs');
colormap hot;


Estimated efficiency for simulated periodic (left) and random (right) stimulus schedules.

Here we see that the efficiency metric does indeed indicate that the randomly-presented stimulus paradigm is far more efficient than the periodically-presented paradigm.

Wrapping Up

In this post we addressed the efficiency of an fMRI experiment design. A few take-homes from the discussion are:

1. Randomize stimulus onset times. These onset times should take into account the low-pass characteristics (i.e. the power spectrum) of the HRF.
2. Try to model selectivity to events that occur close in time. The reason for this is that noise covariances in fMRI are highly non-stationary. There are many sources of low-frequency physiological noise such as breathing, pulse, blood pressure, etc, all of which dramatically effect the noise in the fMRI timecourses. Thus any estimate of noise covariances from data recorded far apart in time will likely be erroneous.
3. Check an experimental design against other candidate designs using the Efficiency metric.

Above there is mention of the effects of low-frequency physiological noise. Until now, our simulations have assumed that all noise is independent in time, greatly simplifying the picture of estimating HRFs and corresponding selectivity. However, in a later post we’ll address how to deal with more realistic time courses that are heavily influenced by sources of physiological noise. Additionally, we’ll tackle how to go about estimating the noise covariance $C_{\epsilon}$ from more realistic fMRI time series.

fMRI in Neuroscience: Estimating Voxel Selectivity & the General Linear Model (GLM)

In a typical fMRI experiment  a series of stimuli are presented to an observer and evoked brain activity–in the form of blood-oxygen-level-dependent (BOLD) signals–are measured from tiny chunks of the brain called voxels. The task of the researcher is then to infer the tuning of the voxels to features in the presented stimuli based on the evoked BOLD signals. In order to make this inference quantitatively, it is necessary to have a  model of how BOLD signals are evoked in the presence of stimuli. In this post we’ll develop a model of evoked BOLD signals, and from this model recover the tuning of individual voxels measured during an fMRI experiment.

Modeling the Evoked BOLD Signals — The Stimulus and Design Matrices

Suppose we are running an event-related fMRI experiment where we present $C$ different stimulus conditions to an observer while recording the BOLD signals evoked in their brain over a series of $T$ consecutive fMRI measurements (TRs). We can represent the stimulus presentation quantitatively with a $T \times C$ binary Stimulus Matrix, $D$, whose entries indicate the onset of each stimulus condition (columns) at each point in time (rows). Now let’s assume that we have an accurate model of how a voxel is activated by a single, very short stimulus. This activation model is called hemodynamic response function (HRF), $h$, for the voxel, and, as we’ll discuss in a later post, can be estimated from the measured BOLD signals. Let’s assume for now that the voxel is also activated to an equal degree to all stimuli. In this scenario we can represent the BOLD signal evoked over the entire experiment with another $T \times C$ matrix $X$ called the Design Matrix that is the convolution of the stimulus matrix $D$ with the voxel’s HRF $h$.

$X = D * h$

Note that this model of the BOLD signal is an example of the Finite Impulse Response (FIR) model that was introduced in the previous post on fMRI Basics.

To make the concepts of $D$ and $X$ more concrete, let’s say our experiment consists of $C = 3$ different stimulus conditions: a light, a tone, and heat applied to the palm. Each stimulus condition is presented twice in a staggered manner during 80 TRs of fMRI measurements. The stimulus matrix and the design matrix are simulated here in Matlab:

TR = 1; % REPETITION TIME
t = 1:TR:20; % MEASUREMENTS
h = gampdf(t,6) + -.5*gampdf(t,10); % HRF MODEL
h = h/max(h); % SCALE HRF TO HAVE MAX AMPLITUDE OF 1

trPerStim = 30; % # TR PER STIMULUS
nRepeat = 2; % # OF STIMULUS REPEATES
nTRs = trPerStim*nRepeat + length(h);
impulseTrain0 = zeros(1,nTRs);

% VISUAL STIMULUS
impulseTrainLight = impulseTrain0;
impulseTrainLight(1:trPerStim:trPerStim*nRepeat) = 1;

% AUDITORY STIMULUS
impulseTrainTone = impulseTrain0;
impulseTrainTone(5:trPerStim:trPerStim*nRepeat) = 1;

% SOMATOSENSORY STIMULUS
impulseTrainHeat = impulseTrain0;
impulseTrainHeat(9:trPerStim:trPerStim*nRepeat) = 1;

% COMBINATION OF ALL STIMULI
impulseTrainAll = impulseTrainLight + impulseTrainTone + impulseTrainHeat;

% SIMULATE VOXELS WITH VARIOUS SELECTIVITIES
visualTuning = [4 0 0]; % VISUAL VOXEL TUNING
auditoryTuning = [0 2 0]; % AUDITORY VOXEL TUNING
somatoTuning = [0 0 3]; % SOMATOSENSORY VOXEL TUNING
noTuning = [1 1 1]; % NON-SELECTIVE

beta = [visualTuning', ...
auditoryTuning', ...
somatoTuning', ...
noTuning'];

% EXPERIMENT DESIGN / STIMULUS SEQUENCE
D = [impulseTrainLight',impulseTrainTone',impulseTrainHeat'];

% CREATE DESIGN MATRIX FOR THE THREE STIMULI
X = conv2(D,h'); % X = D * h
X(nTRs+1:end,:) = []; % REMOVE EXCESS FROM CONVOLUTION

% DISPLAY STIMULUS AND DESIGN MATRICES
subplot(121); imagesc(D); colormap gray;
xlabel('Stimulus Condition')
ylabel('Time (TRs)');
title('Stimulus Train, D');
set(gca,'XTick',1:3); set(gca,'XTickLabel',{'Light','Tone','Heat'});

subplot(122);
imagesc(X);
xlabel('Stimulus Condition')
ylabel('Time (TRs)');
title('Design Matrix, X = D * h')
set(gca,'XTick',1:3); set(gca,'XTickLabel',{'Light','Tone','Heat'});



Stimulus presentation matrix, D (left) and the Design Matrix X for an experiment with three stimulus conditions: a light, a tone, and heat applied to the palm

Each column of the design matrix above (the right subpanel in the above figure) is essentially a model of the BOLD signal evoked independently by each stimulus condition, and the total signal is simply a sum of these independent signals.

Modeling Voxel Tuning — The Selectivity Matrix

In order to develop the concept of the design matrix we assumed that our theoretical voxel is equally tuned to all stimuli. However, few voxels in the brain exhibit such non-selective tuning. For instance, a voxel located in visual cortex will be more selective for the light than for the tone or the heat stimulus. A voxel in auditory cortex will be more selective for the tone than for the other two stimuli. A voxel in the somoatorsensory cortex will likely be more selective for the heat than the visual or auditory stimuli. How can we represent the tuning of these different voxels?

A simple way to model tuning to the stimulus conditions in an experiment is to multiplying each column of the design matrix by a weight that modulates the BOLD signal according to the presence of the corresponding stimulus condition. For example, we could model a visual cortex voxel by weighting the first column of $X$ with a positive value, and the remaining two columns with much smaller values (or even negative values to model suppression). It turns out that we can model the selectivity of $V$ individual voxels simultaneously through a $C \times V$ Selectivity Matrix$\beta$. Each entry in $\beta$ is the amount that the $v$-th voxel (columns) is tuned to the $c$-th stimulus condition (rows). Given the design matrix and the selectivity matrix, we can then predict the BOLD signals $y$ of selectively-tuned voxels with a simple matrix multiplication:

$y = X\beta$

Keeping with our example experiment, let’s assume that we are modeling the selectivity of four different voxels: a strongly-tuned visual voxel, a moderately-tuned somatosensory voxel, a weakly tuned auditory voxel, and an unselective voxel that is very weakly tuned to all three stimulus conditions. We can represent the tuning of these four voxels with a $3 \times 4$ selectivity matrix. Below we define a selectivity matrix that represents the tuning of these 4 theoretical voxels and simulate the evoked BOLD signals to our 3-stimulus experiment.

% SIMULATE NOISELESS VOXELS' BOLD SIGNAL
% (ASSUMING VARIABLES FROM ABOVE STILL IN WORKSPACE)
y0 = X*beta;

figure;
subplot(211);
imagesc(beta); colormap hot;
axis tight
ylabel('Condition')
set(gca,'YTickLabel',{'Visual','Auditory','Somato.'})
xlabel('Voxel');
set(gca,'XTick',1:4)
title('Voxel Selectivity, \beta')

subplot(212);
plot(y0,'Linewidth',2);
legend({'Visual Voxel','Auditory Voxel','Somato. Voxel','Unselective'});
xlabel('Time (TRs)'); ylabel('BOLD Signal');
title('Activity for Voxels with Different Stimulus Tuning')
set(gcf,'Position',[100 100 750 540])
subplot(211); colorbar


Selectivity matrix (top) for four theoretical voxels and GLM BOLD signals (bottom) for a simple experiment

The top subpanel in the simulation output visualizes the selectivity matrix defined for the four theoretical voxels. The bottom subpanel plots the columns of the $T \times V$ matrix of voxel responses $y$. We see that the maximum response of the strongly-tuned visual voxel (plotted in blue) is larger than that of the other voxels, corresponding to the larger weight upper left of the selectivity matrix. Also note that the response for the unselective voxel (plotted in cyan) demonstrates the linearity property of the FIR model. The attenuated but complex BOLD signal from the unselective voxel results from the sum of small independent signals evoked by each stimulus.

Modeling Voxel Noise

The example above demonstrates how we can model BOLD signals evoked in noisless theoretical voxels. Though this noisless scenario is helpful for developing a modeling framework, real-world voxels exhibit variable amounts of noise (noise is any signal that cannot be accounted by the FIR model). Therefore we need to incorporate a noise term into our BOLD signal model.

The noise in a voxel is often modeled as a random variable $\epsilon$. A common choice for the noise model is a zero-mean Normal/Gaussian distribution with some variance $\sigma^2$:

$\epsilon \sim \mathcal N(0,\sigma^2)$

Though the variance of the noise model may not be known apriori, there are methods for estimating it from data. We’ll get to estimating noise variance in a later post when we discuss various sources of noise and how to account for them using more advance techniques. For simplicity, let’s just assume that the noise variance is 1 as we proceed.

Putting It All Together — The General Linear Model (GLM)

So far we have introduced on the concepts of the stimulus matrix, the HRF, the design matrix, selectivity matrix, and the noise model. We can combine all of these to compose a comprehensive quantitative model of BOLD signals measured from a set of voxels during an experiment:

$y = X\beta + \epsilon \\ = (D * h)\beta + \epsilon$

This is referred to as the General Linear Model (GLM).

In a typical fMRI experiment the researcher controls the stimulus presentation $D$, and measures the evoked BOLD responses $y$ from a set of voxels. The problem then is to estimate the selectivities of the voxels based on these measurments. Specifically, we want to determine the parameters $\hat \beta$ that best explain the measured BOLD signals during our experiment. The most common way to do this is a method known as Ordinary Least Squares (OLS) Regression. Using OLS the idea is to adjust the values of $\hat \beta$ such that the predicted model BOLD signals are as similar to the measured signals as possible. In other words, the goal is to infer the selectivity each voxel would have to exhibit in order to produce the measured BOLD signals. I showed in an earlier post that the optimal OLS solution for the selectivities $\hat \beta$ is given by:

$\hat \beta = (X^T X)^{-1} X^T y$

Therefore, given a design matrix $X$ and a set of voxel responses $y$ associated with the design matrix, we can calculate the selectivities of voxels to the stimulus conditions represented by the columns of the design matrix. This works even when the BOLD signals are noisy. To get a better idea of this process at work let’s look at a quick example based on our toy fMRI experiment.

Example: Recovering Voxel Selectivity Using OLS

Here the goal is to recover the selectivities of the four voxels in our toy experiment they have been corrupted with noise. First, we add noise to the voxel responses. In this example the variance of the added noise is based on a concept known as signal-to-noise-ration or SNR.  As the name suggests, SNR is the ratio of the underlying signal to the noise “on top of” the signal. SNR is a very important concept when interpreting fMRI analyses. If a voxel exhibits a low SNR, it will be far more difficult to estimate its tuning. Though there are many ways to define SNR, in this example it is defined as the ratio of the maximum signal amplitude to the variance of the noise model. The underlying noise model variance is adjusted to be one-fifth of the maximum amplitude of the BOLD signal, i.e. an SNR of 5.  Feel free to try different values of SNR by changing the value of the variable $\text{SNR}$ in the Matlab simulation. Noisy versions of the 4 model BOLD signals are plotted in the top subpanel of the figure below. We see that the noisy signals are very different from the actual underlying BOLD signals.

Noisy BOLD signals from 4 voxels (top) and GLM predictions (bottom) of the underlying BOLD signals

Here we estimate the selectivities $\hat \beta$ from the GLM using OLS, and then predict the BOLD signals in our experiment with this estimate. We see in the bottom subpanel of the above figure that the resulting GLM predictions of are quite accurate. We also compare the estimated selectivity matrix $\hat \beta$ to the actual selectivity matrix $\beta$ below. We see that OLS is able to recover the selectivity of all the voxels.

Actual (top) and estimated (bottom) selectivity matrices.

% SIMULATE NOISY VOXELS & ESTIMATE TUNING
% (ASSUMING VARIABLES FROM ABOVE STILL IN WORKSPACE)

SNR = 5; % (APPROX.) SIGNAL-TO-NOISE RATIO
noiseSTD = max(y0(:))./SNR; % NOISE LEVEL FOR EACH VOXEL
noise = bsxfun(@times,randn(size(y0)),noiseSTD);
y = y0 + noise;

betaHat = inv(X'*X)*X'*y % OLS
yHat = X*betaHat; % GLM PREDICTION

figure
subplot(211);
plot(y,'Linewidth',3);
xlabel('Time (s)'); ylabel('BOLD Signal');
legend({'Visual Voxel','Auditory Voxel','Somato. Voxel','Unselective'});
title('Noisy Voxel Responses');

subplot(212)
h1 = plot(y0,'Linewidth',3); hold on
h2 = plot(yHat,'-o');
legend([h1(end),h2(end)],{'Actual Responses','Predicted Responses'})
xlabel('Time (s)'); ylabel('BOLD Signal');
title('Model Predictions')
set(gcf,'Position',[100 100 750 540])

figure
subplot(211);
imagesc(beta); colormap hot(5);
axis tight
ylabel('Condition')
set(gca,'YTickLabel',{'Visual','Auditory','Somato.'})
xlabel('Voxel');
set(gca,'XTick',1:4)
title('Actual Selectivity, \beta')

subplot(212)
imagesc(betaHat); colormap hot(5);
axis tight
ylabel('Condition')
set(gca,'YTickLabel',{'Visual','Auditory','Somato.'})
xlabel('Voxel');
set(gca,'XTick',1:4)
title('Noisy Estimated Selectivity')
drawnow


Wrapping Up

Here we introduced the GLM commonly used for fMRI data analyses and used the GLM framework to recover the selectivities of simulated voxels. We saw that the GLM is quite powerful of recovering the selectivity in the presence of noise. However, there are a few details left out of the story.

First, we assumed that we had an accurate (albeit exact) model for each voxel’s HRF. This is generally not the case. In real-world scenarios the HRF is either assumed to have some canonical shape, or the shape of the HRF is estimated the experiment data. Though assuming a canonical HRF shape has been validated for block design studies of peripheral sensory areas, this assumption becomes dangerous when using event-related designs, or when studying other areas of the brain.

Additionally, we did not include any  physiological noise signals in our theoretical voxels. In real voxels, the BOLD signal changes due to physiological processes such as breathing and heartbeat can be far larger than the signal change due to underlying neural activation. It then becomes necessary to either account for the nuisance signals in the GLM framework, or remove them before using the model described above. In two upcoming posts we’ll discuss these two issues: estimating the HRF shape from data, and dealing with nuisance signals.

fMRI In Neuroscience: The Basics

Magnetic Resonance Imaging (MRI) is a procedure used to essentially “look inside” of materials in a non-invasive manner. As the name suggests, the procedure forms images by measuring the differences in resonances of different materials while in the presence of a strong magnetic field (the details of the MRI procedure are pretty fascinating, but I will save them for another post). In the setting of neuroscience, MRI is often used to image the inside of the brain while it is performing basic functions like hearing a tone, or pressing a button. This flavor of MRI is appropriately referred to as Functional MRI, or fMRI.

The main concept behind fMRI is that as the neurons in the brain function they consume fuel (sugar) and oxygen, which is supplied by blood flow. The harder the neurons in one part of the brain work, the more blood and therefore the more oxygen flows to that part of the brain. The levels of oxygen in the blood will also vary proportionately to how quickly oxygen is being consumed by the active neurons; the more activity, the faster oxygen is consumed. It turns out the resonant frequency of a tissue will vary depending on the level of oxygen present in the tissue. fMRI essentially measures the differences in resonances of your brain tissue based on these functionally-dependent levels of blood oxygen. This is commonly referred to as the Blood-Oxygen-Level-Dependent (BOLD) signal.

The BOLD signal is not measured from individual neurons, but rather from small (on the order of 2-3 mm) cubic regions of the brain called voxels (imagine your brain being composed of hundreds of thousands of tiny Leggos). Each voxel contains hundreds of thousands of neurons, so the BOLD signal measured from a voxel is indicative of the group activity of the neurons located within that voxel. As the neurons in a voxel become active due to brain function, the BOLD signal in each of voxel will vary over time. The work of many a neuroscientist is to accurately characterize the BOLD signal change and to relate (i.e. correlate) it to brain function.

Characterizing the BOLD Signal — the Hemodynamic Response Function (HRF)

Because the BOLD signal is based on blood flow, it is delayed in time from the onset of neural activity due to the period it takes for blood to flow into (and out of) the voxel. The BOLD signal generally peaks 4 to 6 seconds after the onset of neural activity, after which it decreases back toward baseline, even undershooting baseline amplitude around 8 to 12 seconds after onset. This undershoot is believed to be caused by ongoing neuronal metobolism that overconsumes the initial supply of oxygen to the voxel to sub-baseline levels. If there is no addition neural activity the BOLD signal eventually (after approximately 20 seconds) returns to baseline levels. These signal dynamics are referred to as a voxel’s Hemodynamic Response Function, or HRF. A common quantiative model of the HRF is a sum of two Gamma distributions. One of the distributions models the initial peak of the BOLD signal, and another (inverted) distribution models the undershoot. An example of a model HRF is shown below.

Model Hemodynamic Response Function (HRF)

%% MODEL OF THE HRF
t = 0:.1:20;
hrfModel = gampdf(t,6) + -.5*gampdf(t,10);

% DISPLAY
figure
plot(t,hrfModel,'r','Linewidth',2);
hold on;
hb = plot(t,zeros(size(t)),'k--');
title('Model HRF');
xlabel('Time From Activity Onset (s)');
ylabel('BOLD Signal');
legend(hb,'Baseline')


It is important to note that the MRI scanner doesn’t necessarily measure the BOLD signal at such a high temporal resolution as indicated above. Because of limitations imposed by both hardware and software, the required period for an individual fMRI measurement–or TR, short for Repetition Time–is on the order of 1 to 2 seconds. (However recent advances in parallell imaging and multiband excitation technologies are vastly shortening the required TR of fMRI measurements.)

Relating the HRF to Brain Activity — The Finite Impulse Response (FIR) Model

The HRF provides a model for how the BOLD signal in a voxel will vary due to a short burst of neural activity. Therefore, a useful interpretation of BOLD signal dynamics comes from signal processing. If we treat the HRF as a filter $h$ that operates over some finite length of time equal to the length of the HRF, then the measured BOLD signal $y(t)$ evoked by a series of bursts in neural activity can be modeled as a convolution of an impulse train $D(t)$ with the filter:

$y(t) = D(t) * h$

where $D(t)$ equals one at each  point in time where a burst of neural activity  occurs, and zero otherwise. The $(*)$ is the convolution operator (if you’re not familiar with convolution, it’s not terribly important here; just think of it as the operation of applying  a filter to some data). This is identical to the Finite Impulse Response (FIR) model used in signal processing. An example of the FIR model used to model the BOLD signal evoked by a sequence of three bursts of neural activity is shown below.

Demonstration of the FIR model of the BOLD signal

%% FIR MODEL OF BOLD RESPONSE

% HRF AS MEASURED BY MRI SCANNER
TR = 1;			% REPETITION TIME
t = 1:TR:20;	% MEASUREMENTS
h = gampdf(t,6) + -.5*gampdf(t,10);

% CREATE A STIMULUS IMPULSE TRAIN
% FOR THREE SEPARATE STIMULUS CONDITIONS
trPerStim = 30;			% # TR PER STIMULUS
nRepeat = 2;
nTRs = trPerStim*nRepeat + length(h);
impulseTrain0 = zeros(1,nTRs);

impulseTrainModel = impulseTrain0;
impulseTrainModel([1,24,28]) = 1;
boldModel = conv(impulseTrainModel,h);
boldModel(nTRs+1:end) = [];

% DISPLAY AN EXAMPLE OF FIR RESPONSE
figure
stem(impulseTrainModel*max(h),'k');
hold on;
plot(boldModel,'r','Linewidth',2); xlim([0,nTRs]);
title('HRF Convolved with Impulse Stimulus Train');
xlabel('Time (TRs)'); ylabel('BOLD Signal');
legend({'Activity Onset', 'Voxel Response'})
set(gcf,'Position',[100 100 750 380])


A useful property of the FIR model is that BOLD signals from overlapping HRFs sum linearly. This effect is displayed in the figure above. The BOLD signals evoked by the second and third impulses, which appear close to one another in time, combine linearly to form a larger and more complicated BOLD signal than that evoked by the first impulse alone. As we will see later, this linearity property makes it possible to determine the selectivity of neurons within a voxel even in the presence of rapidly-presented stimuli.

Block and Event-related fMRI Experiments

Using fMRI, neuroscientists design experiments that present stimuli having particular features that are hypothesized to be encoded by neurons in a particular region of interest in the brain. Given the evoked BOLD signal measured from a voxel during stimulus presentation, along with the HRF for the voxel, and the assumptions of FIR model framework, it is possible to calculate the degree to which the neurons in the voxel must be selective for each the stimulus features in order to produce the measured BOLD signals. We’ll delve more into how this selectivity is calculated in a later post, but for now let’s take look at two flavors of experiment designs.

The Block Design

One flavor of experimental design is to simply present a stimulus continuosly for a long period, followed by absence of stimuli for a long period. This is what is called a Block Design experiment. From the view of the FIR model, presenting the stimulus for a long period is equivalent to composing $D(t)$ of many consecutive impulses. Because the signal from the HRF of each impulse are assumed to sum linearly, the BOLD signal evoked by the block of impulses will be much larger than the signal evoked by a single brief stimulus presentation. This is demonstrated in the simulation below:

%% SIMULATE A BLOCK-DESIGN EXPERIMENT
%% (ASSUMING VARIABLES FRON ABOVE ARE IN WORKSPACE)

% CREATE BLOCK STIMULUS TRAIN
blocks = repmat([ones(8,1);zeros(8,1)], round(nTRs-10)/16,1);
blockImpulseTrain = impulseTrain0;
blockImpulseTrain(1:numel(blocks)) = blocks;
boldBlock = conv(blockImpulseTrain,h);

% DISPLAY BOLD RESPONSES FROM BLOCK DESIGN
figure
stem(blockImpulseTrain*max(h),'k');
hold on;
plot(boldBlock(1:nTRs),'r','Linewidth',2); xlim([0,nTRs]);
title('Simulated Block Design');
xlabel('Time (TRs)'); ylabel('BOLD Signal');
legend({'Stimulus Train', 'Voxel Response'})
set(gcf,'Position',[100 100 750 380])


Simulation of BOLD signals evoked by Block Design experiment

Here the height of the stimulus onset indicators (in black) are scaled to be the maximum height of the HRF from a single impulse. We see that the peak bold signal from the block design is much larger. The block design is used when the number of features/stimuli that the experimenter would like to probe is small. In this scenario, the increased signal amplitude results in a much more sensitive measurement of selectivity for the probed stimulus features, but at the cost of an inefficient model design. Therefore many separate block design experiments will have to be run to test various hypotheses.

The Event-related Design

An alternative to the block design is to instead show many types of stimuli for short duations. This is what is known as an Event-related design. For instance, say are interested in how visual, auditory, and somatosensory information are encoded in the brain. We could run three separate block design experiments, one for each stimulus modality, or we could run a single event-related design experiment, where we intermittently present a subject a light, a tone, and ask them to press a button. If there exists a voxel that is involved in encoding each of these stimuli to an equal degree, it would be difficult to discover this relationship by running three separate block design experiments. A simulation of such an experiment and the responses from such a voxel is shown below.

Simulation of an Event-related experiment

%% SIMULATE AN EVENT-RELATED EXPERIMENT
%% (ASSUMING VARIABLES FRON ABOVE ARE IN WORKSPACE)

% VISUAL STIMULUS
impulseTrainLight = impulseTrain0;
impulseTrainLight(1:trPerStim:trPerStim*nRepeat) = 1;

% AUDITORY STIMULUS
impulseTrainTone = impulseTrain0;
impulseTrainTone(5:trPerStim:trPerStim*nRepeat) = 1;

% SOMATOSENSORY STIMULUS
impulseTrainPress = impulseTrain0;
impulseTrainPress(9:trPerStim:trPerStim*nRepeat) = 1;

% COMBINATION OF ALL STIMULI
impulseTrainAll = impulseTrainLight + impulseTrainTone + impulseTrainPress;

% SIMULATE BOLD SIGNAL EVOKED BY EACH CONDITION
boldLight = conv(impulseTrainLight,h);
boldTone = conv(impulseTrainTone,h);
boldPress = conv(impulseTrainPress,h);
boldAll = conv(impulseTrainAll,h);

% DISPLAY STIMULUS ONSETS FOR EACH CONDITION
figure
subplot(211)
hold on
stem(impulseTrainLight,'k');
stem(impulseTrainTone,'b');
stem(impulseTrainPress,'g');
xlim([0,nTRs]);
xlabel('Time (TRs)'); ylabel('BOLD Signal');
legend({'Light Stimulus Train', 'Tone Stimulus Train','Press Stimulus Train'})
title('Impulse Trains for 3 Different Stimuli');

% DISPLAY COMBINATION OF BOLD RESPONSES FROM EACH CONDITION
subplot(212)
hold on;
plot(boldLight(1:nTRs),'k');
plot(boldTone(1:nTRs),'b');
plot(boldPress(1:nTRs),'g');
plot(boldAll(1:nTRs),'r','Linewidth',2);
xlim([0,nTRs]);
xlabel('Time (TRs)'); ylabel('BOLD Signal');
legend({'Response to Light','Response to Tone','Response to Press','Total Response'});
title('Simulation of BOLD Signal from Overlapping Stimuli');
set(gcf,'Position',[100 100 750 540])


In this example notice how the responses evoked by each class of stimulus overlap, resulting in a complex BOLD signal. However, if each class of stimulus were shown separately, as in the case of the block design, the evoked BOLD signal would look very different, as indicated by the individual responses to each stimulus (black, blue and green responses, respectively). Event-related designs are powerful in that we can probe many features simultaneously and therefore uncover correlated selectivities/tuning. However, event-related designs require that we have a very accurate model for the HRF. This requirement is greatly relaxed for block design experiments (in fact many block-designs assume very simplistic HRF shapes such as a triangle or square waveform).

Wrapping Up

In this post we introduced some of the basic concepts in fMRI used for neuroscience research, including the BOLD signal, the HRF, as well as Block and Event-related experiment designs. However, the underlying story used to present these concepts has been dramatically simplified. Here we ignore the effects of physiological noise, which can be a debilatating factor in many fMRI analyses. Also missing are the details of how to calculate a voxel’s selectivity to a set of stimulus features given measured data, an HRF, and an experimental paradigm. This is where the General Linear Model (GLM) that predominates fMRI research comes in. We also assume that the proposed model HRF is an accurate characterization of a voxel’s BOLD response dynamics. Though this is often a reasonable assumption for peripheral sensory areas of the brain, it is often a very poor assumption for other areas of the brain. It is therefore often necessary to estimate the shape of the HRF from fMRI data. We’ll discuss all of these issues: physiological noise, tuning/selectivity estimation using the GLM, as well as HRF estimation in a following series of posts on fMRI methods in neuroscience.