**Reference**

HowToGuides

Manuals

LabAlumni

DataAnalysis

Advice for...

**Admin**

analysis:nsb2016:week7

~~DISCUSSION~~

Goals:

- Understand the idea of the spectrogram and the inputs and outputs of typical implementations
- Apply your knowledge of spectral leakage to the selection of spectrogram parameters
- Cultivate an awareness of common pitfalls in spectrogram-based analysis
- Get started with using the FieldTrip toolbox

Resources:

- (for reference) Chronux toolbox slides
- (for reference) Fieldtrip documentation and spectrogram tutorial

The spectral estimation methods introduced in the previous module average spectral content over a number of different windows. This means that we get an improved estimate of the *overall* spectral content of the signal, but such an average cannot establish if the spectral content is changing over time. Neural signals rarely have stationary spectral properties, as illustrated by this snippet of a LFP from the rat ventral striatum (source: van der Meer and Redish, 2009)

The top trace illustrates some striking properties of the ventral striatal LFP: under certain conditions, gamma oscillations appear in “events” of a relatively stereotyped length, with frequencies alternating between ~80 Hz (“gamma-80” or “high gamma”) and ~50 Hz (“gamma-50” or “low gamma”).

How can we characterize the spectral properties of such a rapidly changing signal? One approach is illustrated in the colorful panel directly below the raw LFP trace – this is a **spectrogram**. A spectrogram estimates the spectral content of a signal over time. It has time and frequency axes, with power shown in the color dimension (note the colorbar: in this plot, red is high power, blue is low power).

The spectrogram is constructed by sliding a window along the signal and computing the power spectrum at each position of the window. As with the PSD estimators in the previous module such as Welch's method, the window is characterized by its size and shape (rectangular, Hamming, etc.) which control the tradeoffs resulting from spectral leakage. Unlike these PSD estimators however, to obtain a spectrogram we do not average across windows. Instead, we keep the spectrum for each position of the window, filling in each “time” column of the spectrogram as it moves along.

(As an aside, spectrograms are a widely used tool for time-frequency analysis; for instance, they are often the first step in voice recognition software. Plotting a spectrogram of your own voice can be done with just a few lines of MATLAB code.)

MATLAB's signal processing toolbox includes basic functionality for constructing spectrograms.

☛ Look at the documentation for the `spectrogram()`

function. Notice that the input arguments it takes are familiar to you from the last module: window, overlap, nfft, Fs. The usage we are looking for is the last one:

[S,F,T,P] = spectrogram(...) P is a matrix representing the Power Spectral Density (PSD) of each segment.

Let's give it a spin:

%% load the data % remember to cd to the correct folder here, may need to get this file from the lab database cfg = []; cfg.fc = {'R016-2012-10-03-CSC04a.Ncs'}; csc = LoadCSC(cfg); Fs = csc.cfg.hdr{1}.SamplingFrequency; %% restrict data cscR = restrict(csc,3282,3286); % if you don't have this, implement it (it's one line of code!) plot(cscR.tvec,cscR.data); % note there are various gamma oscillations present, as well as a large negative-going transient %% construct and plot the spectrogram [S,F,T,P] = spectrogram(cscR.data,hanning(512),256,1:200,Fs); imagesc(T,F,10*log10(P)); % converting to dB as usual set(gca,'FontSize',20); axis xy; xlabel('time (s)'); ylabel('Frequency (Hz)');

(If you can't access the server, I put this session on Google Drive here.)

You should see:

As in the example, color indicates signal power at a particular point (bin), corresponding to a specific frequency and a specific time. Some power is apparent at low frequencies (<10Hz), as well as in the gamma range (~50-100Hz).

Let's deconstruct this plot a little further. Since we specified 512-sample windows with a 256-sample overlap, we expect roughly 8 estimates per second. This is because our Fs is 2000, and therefore 8 “steps” of our moving window, moving at 256 samples per step (the window size minus the overlap equals the step size) fit into one second. Indeed we can discern approximately 8 “columns” (time bins) per second in the plot. For the frequency axis, we specified `1:200`

as the bins of interest, so our frequency axis has a resolution of 1 Hz per bin. The centers of the time and frequency bins are returned as the `T`

and `F`

outputs respectively.

An important point about the time axis is this:

>> T(1) ans = 0.1280

That is, the time corresponding to the first column (time bin) of the spectrogram is *not* 0. It is also not the time of the first sample in our data, as is readily verified:

>> cscR.tvec(1) ans = 3.2820e+03

In fact, the time of the first bin returned by `spectrogram()`

is the *center of the first window for which there is enough data*. The data passed to `spectrogram()`

is assumed to start at time 0. This makes sense when we remember that the input arguments do not contain timestamps (only voltage samples) so this function has no way of knowing the “experiment time” (`csc.tvec`

).

Thus, the time of the first spectrogram bin (`0.128`

) is because this is the center of the first 512-sample window for which there is enough data (`256/2000`

is `0.128`

), and the first sample of the data is assigned time 0.

Before we start investigating the effects of changing the spectrogram parameters, it is helpful to be able to see the raw LFP overlaid onto the spectrogram:

%% hold on; lfp_minmax = 25; lfp_cent = 125; % range and mean of LFP plotting tvec0 = cscR.tvec - cscR.tvec(1); % align LFP with spectrogram data = rescale(cscR.data,-lfp_minmax,lfp_minmax); data = data+lfp_cent; lfp_h = plot(tvec0,data,'k');

Note that to make the LFP and the spectrogram line up, the time of the first LFP sample needs to be zero. The rescaling of the LFP is just to make it easier to see when plotted overlaid onto the spectrogram; of course, if you prefer, you can plot it below or above without overlap by changing `lfp_cent`

.

Now that we can see the LFP and the spectrogram together in one plot, you can observe the correspondence between them. The sharp transient early in the data shows up as a dark red in the low frequency range. This is perhaps not obvious, because such a transient can hardly be described as an oscillation – it does not even complete a full period (cycle). So beware: if you see strong power in a spectrogram (or a PSD for that matter) this does not imply that there is a true oscillation in the signal.

In contrast to this low-frequency transient, you can see that there are various gamma-range oscillations which extend over multiple cycles, and can therefore be regarded as true oscillations rather than transients. For instance, at the start of this data segment there is some low gamma around 50-60Hz, and later some higher frequencies appear.

Another thing you will have noticed is that the spectrogram looks pretty coarse. Let's see if we can improve this by changing the input parameters.

In the example above, we used a 512-sample Hanning window with an overlap of 256; that is, the window moves in steps of 256 samples, and as a result we get a new power estimate every 0.128 seconds. In addition, we requested power estimates at integer frequencies from 1 to 200.

☛ Change the code above so that `spectrogram()`

uses an overlap of 384 samples (instead of 256) and returns power with 0.25 Hz resolution (instead of 1).

The spectrogram should now look somewhat smoother. Because of the larger overlap, the spectrogram now contains an estimate every 512-384 = 128 samples, rather than every 256. However, it is important to appreciate that increasing the frequency and time resolution in this way does not change the underlying estimate. This can be seen, for instance, by setting the frequency resolution to something really small (e.g. 0.01). This will take a while to compute, but if you compare the spectrograms side-by-side, it should be clear that the frequency estimates do not become more precise.

Similarly, you can decrease the overlap to `window_length - 1`

, resulting in a spectrogram that is ultra-smooth in appearance, but this cannot escape the underlying limitation that spectral content is estimated over a window that is 512 samples wide.

An important parameter that we know can affect our estimate is the shape of the window used.

☛ Change the window used to a rectangular one (`rectwin`

). What happens to the spectrogram?

Tracking changes over time using a spectrogram generally means that we don't want to use very large time windows. A large window would obscure rapid changes in spectral content, such as the tendency for a low-gamma to follow high-gamma events illustrated in the first figure, above. However, as you know from the previous module, the effects of spectral leakage are more prominent in smaller windows: there is fewer data to make up for the edge effects. This means that spectrograms tend to be particularly sensitive to the choice of window. A Hanning window is often a good default choice. As we will see below, however, there are methods we can use to improve further on this.

The most important choice in constructing spectrograms is the size of the time window. This controls the fundamental tradeoff between our ability to resolve *which frequencies* are present in the signal, and at *what time* those frequencies occur.

This tradeoff arises because of spectral leakage. With a large window, the edge effects are relatively small, and frequency content can be estimated relatively accurately. However, we cannot know at what time in this large window a given oscillatory event occurs. In contrast, using a small window, we may be able to localize more precisely in time when an event occurs. The price we pay is that we have increased spectral leakage, and we are now limited in our ability to determine the frequency of the event.

This idea is illustrated in panel (b) of the figure below:

Rectangular “time-frequency tiles”, with different widths and heights, correspond to the area in the spectrogram that would be highlighted by given oscillatory event of a fixed frequency for different choices of time window. For a small window, we can know with more precision when the event occurred, but we are uncertain about the frequency. In contrast, a large window constrains the frequency estimate, but now we are uncertain about time.

☛ For the piece of data used here, compare side by side (using `subplot()`

) the spectrogram obtained with 512- and 1024-sample windows. For a fair comparison, set the overlap such that you end up with a comparable number of time bins.

You should get something like this:

As expected, the larger window has tighter frequency estimates, but the spectrogram is now more smeared out in time.

These observations raise the question of if there are ways of picking the window size that are somehow optimal. In general, there is no substitute for looking at the raw data and tuning the parameters so that features of interest, such as the gamma events here, are well resolved in the spectrogram. However, there are some sophisticated methods available for which we have to move to the FieldTrip toolbox, which we will get to later in this module.

As noted above, the input to `spectrogram()`

is not a tsd containing samples with accompanying timestamps; rather, it is simply a list of samples assumed to be acquired at a constant sampling frequency `Fs`

. If the input in fact does not have a constant `Fs`

, as can occur for instance if there are gaps in the data, there will be trouble:

cscR = restrict(csc,3300,3340); [S,F,T,P] = spectrogram(cscR.data,rectwin(256),128,1:200,Fs); imagesc(T,F,10*log10(P)); % converting to dB as usual set(gca,'FontSize',20); axis xy; xlabel('time (s)'); ylabel('Frequency (Hz)'); hold on; lfp_minmax = 25; lfp_cent = 125; % range and mean of LFP plotting tvec0 = cscR.tvec - cscR.tvec(1); % align LFP with spectrogram data = rescale(cscR.data,-lfp_minmax,lfp_minmax); data = data+lfp_cent; lfp_h = plot(tvec0,data,'k'); xlim([tvec0(1) tvec0(end)]);

As you can see, `spectrogram()`

ignores the gap present in the LFP, causing the spectrogram and the LFP to become misaligned. In general, always verify that your data does not have gaps when computing spectrograms!

Another way for the spectrogram and LFP to become misaligned is when an incorrect sampling frequency is specified.

☛ Return to the original data segment (`cscR = Restrict(csc,3282,3286);`

) and compute the spectrogram using `Fs-100`

instead of `Fs`

. Plot the spectrogram and overlay the LFP. What do you notice?

Computing a spectrogram enables a number of further analyses, such as characterizing the relationship between power at different frequency bands. An example of this is shown by the correlation matrix in panel C of the first figure on this page, which reveals that ventral striatal gamma-50 and gamma-80 tend to be slightly anticorrelated in time.

Spectrograms can also be used to obtain a time series of power in a certain band over time, similar to the filtering-based approach in the previous module. Such a time series can then for instance be used to determine if power in a certain frequency band is related to a behavioral or task variable such as running speed.

Another common spectrogram-based analysis involves aligning the spectrogram to task events of interest, such as a button press, stimulus onset, or receipt of a reinforcer. To do this, we first need to extract the times of these events from the data files.

Apart from neural data stored in `.ncs`

and `.ntt`

files, the Neuralynx system also stores *events* in a file called `Events.Nev`

(see Module 2 for details). Here is an example usage for the specific task in this data:

cfg = []; cfg.eventList = {'Feeder 0 nosepoke','Feeder 1 nosepoke', ... '1 pellet cue','3 pellet cue','5 pellet cue', ... '1 pellet dispensed','3 pellet dispensed','5 pellet dispensed'}; cfg.eventLabel = {'n0','n1', ... 'c1','c3','c5', ... 'd1','d3','d5'}; evt = LoadEvents(cfg);

This result is the following:

>> evt evt = t: {1x8 cell} label: {'n0' 'n1' 'c1' 'c3' 'c5' 'd1' 'd3' 'd5'} cfg: [1x1 struct] >> evt.t ans = Columns 1 through 4 [1x58 double] [1x57 double] [1x35 double] [1x19 double] Columns 5 through 8 [1x34 double] [1x40 double] [1x25 double] [1x42 double]

Thus, for each item in `cfg.eventList`

we extract the corresponding event timestamps, placing them in an output structure field with a name specified in `cfg.eventLabel`

.

The events returned by `LoadEvents()`

above are already present in the Events file; this function merely presents them in a more usable format. However, for many tasks, some more processing is needed to obtain event times of interest, for instance “only those nosepokes that follow presentation of a cue predicting 5 pellets, and for which the rat does not un-poke for at least one second”. Creating appropriate “trial functions” that return usable event times is unglamorous, but a key analysis step for any data set. (Remember: “Garbage in, garbage out”!)

In any case, armed with some event times, we are now in a position to ask questions of the sort, “does reward delivery elicit changes in the ventral striatal LFP?” and “is the ventral striatal LFP sensitive to reward-predictive cues?” To do so, we will use the FieldTrip toolbox, which massively expands MATLAB's suite of built-in functions for spectral analysis.

FieldTrip is a toolbox originally developed for the analysis of human EEG and MEG data. Because these settings have historically faced complex signal processing challenges, such as eyeblink artifact removal, source reconstruction, and mapping across many electrodes, the methods developed to address these challenges are mature and have much to offer, particularly to rodent electrophysiologists!

The FieldTrip toolbox continues to be actively developed and maintained, primarily at the Donders Institute for Brain, Cognition and Behaviour in Nijmegen. It is supported by a popular mailing list, extensive documentation with tutorials, and various workshops organized across the globe (such as this one in Toronto).

☛ If you haven't already done so, clone the FieldTrip repository on GitHub. Make sure you have a shortcut that sets up MATLAB's path to include it. Also do a `git pull`

of the course repository.

**IMPORTANT: to make this section work, your path shortcut should add FieldTrip first, the course repository second!** This ensures that our (modified) functions take precedence over the FieldTrip code.

Let's load our familiar ventral striatal LFP into FieldTrip:

%% remember to cd to your data folder fc = {'R016-2012-10-03-CSC04a.ncs'}; data = ft_read_neuralynx_interp(fc);

You will get a number of warnings about discontinuous recording and expected sample mismatches. This is because FieldTrip expects an exactly constant `Fs`

in the data, which we know Neuralynx typically does not provide.

Next, let's look at the contents of the `data`

structure:

>> data data = hdr: [1x1 struct] label: {'R016-2012-10-03-CSC04a'} time: {[1x5256982 double]} trial: {[1x5256982 double]} fsample: 2000 sampleinfo: [1 5256982] cfg: [1x1 struct]

A quirk of FieldTrip is that it treats all data as organized into *trials*, a legacy of its event-related potential roots in human work. However, a quick `plot(data.time{1},data.trial{1})`

should assure you that all the data is there. At this point, FieldTrip thinks of this full recording session as one single “trial”, the first cell of `data.label`

, `data.time`

, and `data.trial`

.

To convert this single “trial” containing the entire recording into proper trials surrounding events of interest, FieldTrip requires a `trl`

variable that specifies the start and end indices (into the data) which are then passed to the `ft_redefinetrial()`

function, which cuts up the data into trials:

cfg = []; cfg.t = cat(2,getd(evt,'n0'),getd(evt,'n1')); cfg.mode = 'nlx'; cfg.hdr = data.hdr; cfg.twin = [-1 4]; trl = ft_maketrl(cfg); cfg = []; cfg.trl = trl; data_trl = ft_redefinetrial(cfg,data);

The above reorganizes the data into 4-second windows (from -1 to 4 seconds) around nosepoke times (`evt.n0`

and `evt.n1`

). `ft_maketrl()`

converts the event times from `LoadEvents()`

into the data indices that `ft_redefinetrial()`

needs.

☛ Inspect the `data_trl`

structure. Notice that there are now 115 trials instead of 1, and that each trial has its own time and data vectors. Plot the 47th trial as an example to verify this.

Now we are ready to construct the event-triggered spectrogram:

cfg = []; % start with empty cfg cfg.output = 'pow'; cfg.channel = 'R016-2012-10-03-CSC04a'; cfg.method = 'mtmconvol'; cfg.taper = 'hanning'; cfg.foi = 10:2:100; % frequencies of interest cfg.t_ftimwin = ones(size(cfg.foi)).*0.5; % window size: fixed at 0.5s cfg.toi = -0.5:0.05:3.5; % times of interest TFR = ft_freqanalysis(cfg, data_trl); figure cfg = []; cfg.channel = 'R016-2012-10-03-CSC04a'; ft_singleplotTFR(cfg, TFR);

That is quite a mouthful. Almost any FieldTrip function requires a `cfg`

struct as input that specifies the parameters of the task to be completed. To avoid unexpected results, it is good practice to start with an empty cfg (`cfg = [];`

) when moving on to a new type of operation, such as from trial segmentation to spectrogram construction in the above example. The function `ft_freqanalysis()`

computes the spectrogram based on the data and the parameters included in the `cfg`

file. `ft_singleplotTFR()`

plots the result, again based on the information in the `cfg`

file and the output of `ft_freqanalysis()`

.

You should get:

This plot is an *average* spectrogram over 115 trials, with time 0 corresponding to the time of nosepoking into the reward receptacle. As with the examples using `spectrogram()`

, we can control the smoothness of the spectrogram by specifying the time and frequency steps. For the frequency steps this is accomplished the same way in FieldTrip, but for the time axis we can now simply specify a vector with the window centers instead of a window size and overlap.

Looking at the results, it appears there are some beta (~20Hz) and gamma events visible in the average spectrogram.

☛ Change the `cfg.toi`

and `cfg.foi`

parameters to verify you can change the resolution of the time and frequency axes. What happens if you try to extend the time axis from -2 to 5? What would you do if you wanted data for that extended range?

FieldTrip provides many powerful tools for spectral analysis. This section highlights a few useful ones, but the tutorials and documentation of the functions involved (e.g. ft_freqanalysis) give a more complete overview.

Often we are not interested in absolute power levels, as provided per default by a spectrogram. Rather, we wish to compare spectral content following some event of interest to a baseline. This is easily accomplished:

figure cfg = []; cfg.baseline = [-2 0]; cfg.baselinetype = 'relative'; cfg.channel = 'R016-2012-10-03-CSC04a'; ft_singleplotTFR(cfg, TFR);

Note that the beta and gamma events now stand out clearly.

Up until now, we have used a *fixed* window size (number of samples, or equivalently, time) in computing spectrograms. The above example specified a 0.5s window.

☛ Recompute and plot the event-triggered spectrogram using a *frequency-dependent* window of 20 cycles by setting `cfg.t_ftimwin = 20./cfg.foi;`

and a frequency range of 1:100.

You should now have a beautiful image that balances time and frequency resolution:

☛ What causes the whitespace at the bottom of the image?

This frequency-dependent windowing can be thought of as computing many spectrograms with different windows in parallel, and then keeping different pieces of each so that low frequencies are estimated with good precision in the frequency domain (but poor temporal precision) and high frequencies are resolved from good temporal precision (but relatively poor frequency precision). This tradeoff is illustrated by the time-frequency tiles in the “B” panel of the figure earlier in this module.

(As a technical note, this frequency-dependent windowing is similar to what can be obtained with wavelet transforms.)

To set up a statistical comparison of the pre-nosepoke baseline and the oscillation patterns following the nosepoke, we perform two different frequency analyses, with different times of interest:

cfg = []; cfg.output = 'pow'; cfg.channel = 'R016-2012-10-03-CSC04a'; cfg.method = 'mtmconvol'; cfg.taper = 'hanning'; cfg.foi = 1:1:100; cfg.keeptrials = 'yes'; % need this for stats later cfg.t_ftimwin = 20./cfg.foi; % 20 cycles per time window cfg.toi = -1:0.05:0; % pre-nosepoke baseline TFR_pre = ft_freqanalysis(cfg, data_trl); cfg.toi = 0:0.05:1; % post-nosepoke TFR_post = ft_freqanalysis(cfg, data_trl); TFR_pre.time = TFR_post.time; % time axes should be identical for comparison

Now we run `ft_freqstatistics()`

to do the comparison between the two frequency analyses:

%% t-test cfg = []; cfg.channel = 'R016-2012-10-03-CSC04a'; cfg.latency = 'all'; cfg.trials = 'all'; cfg.frequency = 'all'; cfg.avgoverchan = 'no'; cfg.avgovertime = 'no'; cfg.avgoverfreq = 'no'; cfg.parameter = 'powspctrm'; cfg.method = 'stats'; cfg.statistic = 'ttest2'; cfg.alpha = 0.05; nTrials1 = size(TFR_pre.powspctrm,1); nTrials2 = size(TFR_post.powspctrm,1); cfg.design = cat(2,ones(1,nTrials1),2*ones(1,nTrials2)); % two conditions cfg.ivar = 1; % dimension of design var which contains the independent variable (group) stat = ft_freqstatistics(cfg,TFR_post,TFR_pre); cfg.parameter = 'stat'; ft_singleplotTFR(cfg,stat); % plot the t-statistic

The result shows the t-statistic of the bin-by-bin comparison between pre- and post-nosepoke power. Depending on what correction for multiple comparisons is to be applied, some time-frequency bins may approach significance. This is however only a single recording session.

The Statistics section of this walkthrough has some background on the different steps performed here; the event-related statistics tutorial has further examples of different statistics.

FieldTrip excels at processing many data channels in parallel. Processing all 16 channels from this session is as simple for the user as processing 1 channel. (Be aware however, that this will be mild torture for older computers; the following took about 5 minutes to run on my 3-year old laptop.)

%% load the data fc = FindFiles('*.ncs'); % get filenames of all LFPs recorded data_all = ft_read_neuralynx_interp(fc); % load them all -- this will take a while data_all.hdr.Fs = data_all.fsample; % for some reason this is missing from the header %% define layout for later plotting cfg = []; cfg.layout = 'ordered'; cfg.channel = data.label; layout = ft_prepare_layout(cfg, data_all);

Defining a channel layout is the only addition when dealing with multichannel data. Here, we are using one of the default layouts ('ordered'). However, layouts specific to particular EEG/MEG and intracranial probes are also available, so that results can be plotted topographically (i.e. matching the physical sensor layout).

Remarkably, the trial definition and spectrogram computations are identical to those for a single channel:

%% cfg = []; cfg.t = cat(2,getd(evt,'n0'),getd(evt,'n1')); cfg.mode = 'nlx'; cfg.hdr = data_all.hdr; cfg.twin = [-1 4]; trl = ft_maketrl(cfg); cfg = []; cfg.trl = trl; data_trl = ft_redefinetrial(cfg,data_all); %% cfg = []; % start with empty cfg cfg.output = 'pow'; cfg.method = 'mtmconvol'; cfg.taper = 'hanning'; cfg.foi = 1:100; % frequencies of interest cfg.toi = -0.5:0.05:3.5; % times of interest cfg.t_ftimwin = 20./cfg.foi; TFR = ft_freqanalysis(cfg, data_trl);

Only the final plot command is slightly different:

figure cfg = []; cfg.baseline = [-2 0]; cfg.baselinetype = 'relative'; cfg.layout = layout; ft_multiplotTFR(cfg, TFR); % note this is now multiplot rather than singleplot

You should get:

This shows the baseline-corrected, event-aligned spectrograms for all 16 channels in this session. There is also an average shown (the subplot on the lower right). Note that a subset of the channels, presumably located in the ventral striatum, show a very similar time-frequency pattern, whereas another subset do not show this at all. The channels that do not show this same pattern are likely located in the hippocampus.

analysis/nsb2016/week7.txt · Last modified: 2019/02/11 15:16 by mvdm

Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Share Alike 4.0 International