User Tools

Site Tools


analysis:course-w16:week10

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
Next revision Both sides next revision
analysis:course-w16:week10 [2016/02/14 15:17]
mvdm [Spike train analysis II: tuning curves, encoding, decoding]
analysis:course-w16:week10 [2017/01/05 15:16]
mvdm [Bayesian decoding]
Line 1: Line 1:
 ~~DISCUSSION~~ ~~DISCUSSION~~
- 
-:!: **UNDER CONSTRUCTION -- PLEASE DO NOT USE YET** :!: 
  
 ===== Spike train analysis II: tuning curves, encoding, decoding ===== ===== Spike train analysis II: tuning curves, encoding, decoding =====
Line 9: Line 7:
   * Learn to estimate and plot tuning curves, raw and smoothed   * Learn to estimate and plot tuning curves, raw and smoothed
   * Implement a basic Bayesian decoding algorithm   * Implement a basic Bayesian decoding algorithm
-  * Compare decoded and actual position by exporting to a movie file+  * Compare decoded and actual position by computing the decoding error
  
 Resources: Resources:
Line 86: Line 84:
  
 <code matlab> <code matlab>
-ENC_S = restrict(S,run_start,​run_end); +LoadMetadata;​ 
-ENC_pos = restrict(pos,​run_start,​run_end); +ENC_S = restrict(S,metadata.taskvars.trial_iv); 
 +ENC_pos = restrict(pos,​metadata.taskvars.trial_iv); 
 + 
 % check for empties and remove % check for empties and remove
 keep = ~cellfun(@isempty,​ENC_S.t);​ keep = ~cellfun(@isempty,​ENC_S.t);​
 ENC_S.t = ENC_S.t(keep);​ ENC_S.t = ENC_S.t(keep);​
 ENC_S.label = ENC_S.label(keep);​ ENC_S.label = ENC_S.label(keep);​
 + 
 S.t = S.t(keep); S.t = S.t(keep);
 S.label = S.label(keep);​ S.label = S.label(keep);​
 </​code>​ </​code>​
  
-We have created ''​ENC_''​ versions of our spike trains and position data, containing only data from when the rat was running on the track (the ''​run_start''​ and ''​run_end''​ variables have been previously generated by a different script) and removed all cells from the data set that did not have any spikes on the track.+We have created ''​ENC_''​ versions of our spike trains and position data, containing only data from when the rat was running on the track (using experimenter annotation stored in the metadata; ​''​trial_iv'' ​contains the start and end times of trials) and removed all cells from the data set that did not have any spikes on the track.
  
 ☛ Plot the above scatterfield again for the restricted spike train. Verify that no spikes are occurring off the track by comparing your plot to the previous one for the full spike trains, above. ☛ Plot the above scatterfield again for the restricted spike train. Verify that no spikes are occurring off the track by comparing your plot to the previous one for the full spike trains, above.
Line 116: Line 115:
 y_edges = SET_ymin:​SET_yBinSz:​SET_ymax;​ y_edges = SET_ymin:​SET_yBinSz:​SET_ymax;​
  
-occ_hist = histcn(pos_mat,​y_edges,​x_edges);​+occ_hist = histcn(pos_mat,​y_edges,​x_edges); ​% 2-D version of histc()
  
-no_occ_idx = find(occ_hist == 0); % NaN out bins rat never visited+no_occ_idx = find(occ_hist == 0); % NaN out bins never visited
 occ_hist(no_occ_idx) = NaN; occ_hist(no_occ_idx) = NaN;
  
-occ_hist = occ_hist .* (1/30); % convert to seconds using video frame rate+occ_hist = occ_hist .* (1/30); % convert ​samples ​to seconds using video frame rate (30 Hz)
  
 subplot(221);​ subplot(221);​
Line 161: Line 160:
 {{ :​analysis:​course-w16:​raw_tc.png?​nolink&​900 |}} {{ :​analysis:​course-w16:​raw_tc.png?​nolink&​900 |}}
  
-Note that from the occupancy map, you can see the rat spent relatively more time at the choice point compared to other segments of the track. However, the rough binning is not very satisfying. Let's see if we can do better with some smoothing:+Note that from the occupancy map, you can see the rat spent relatively more time at the base of the stem compared to other segments of the track. However, the rough binning is not very satisfying. Let's see if we can do better with some smoothing:
  
 <code matlab> <code matlab>
Line 170: Line 169:
  
 occ_hist(no_occ_idx) = NaN; occ_hist(no_occ_idx) = NaN;
-occ_hist = occ_hist .* (1/​30); ​% convert to seconds using video frame rate+occ_hist = occ_hist .* (1/30);
  
 subplot(221);​ subplot(221);​
Line 178: Line 177:
 % %
 spk_hist = histcn(spk_mat,​y_edges,​x_edges);​ spk_hist = histcn(spk_mat,​y_edges,​x_edges);​
-spk_hist = conv2(spk_hist,​kernel,'​same'​);​+spk_hist = conv2(spk_hist,​kernel,'​same'​); ​% 2-D convolution
 spk_hist(no_occ_idx) = NaN; spk_hist(no_occ_idx) = NaN;
  
Line 267: Line 266:
 In general, from the [[http://​en.wikipedia.org/​wiki/​Poisson_distribution | definition of the Poisson distribution]],​ it follows that In general, from the [[http://​en.wikipedia.org/​wiki/​Poisson_distribution | definition of the Poisson distribution]],​ it follows that
  
-\[P(n_i|\mathbf{x}) = \frac{(\tau f_i(\mathbf{x}))^{n_i}}{n_i!} e^{-\tau f_i (x)}\]+\[P(n_i|\mathbf{x}) = \frac{(\tau f_i(\mathbf{x}))^{n_i}}{n_i!} e^{-\tau f_i (\mathbf{x})}\]
  
 $f_i(\mathbf{x})$ is the average firing rate of neuron $i$ over $x$ (i.e. the tuning curve for position), $n_i$ is the number of spikes emitted by neuron $i$ in the current time window, and $\tau$ is the size of the time window used. Thus, $\tau f_i(\mathbf{x})$ is the mean number of spikes we expect from neuron $i$ in a window of size $\tau$; the Poisson distribution describes how likely it is that we observe the actual number of spikes $n_i$ given this expectation. $f_i(\mathbf{x})$ is the average firing rate of neuron $i$ over $x$ (i.e. the tuning curve for position), $n_i$ is the number of spikes emitted by neuron $i$ in the current time window, and $\tau$ is the size of the time window used. Thus, $\tau f_i(\mathbf{x})$ is the mean number of spikes we expect from neuron $i$ in a window of size $\tau$; the Poisson distribution describes how likely it is that we observe the actual number of spikes $n_i$ given this expectation.
Line 288: Line 287:
  
 The tuning curves take care of the $f_i(x)$ term in the decoding equations. Next, we need to get $\mathbf{n}$,​ the spike counts. The tuning curves take care of the $f_i(x)$ term in the decoding equations. Next, we need to get $\mathbf{n}$,​ the spike counts.
-=== Preparing tuning curves for decoding === 
- 
-With the math taken care of, we can now start preparing the data for the decoding procedure. First we need to make sure we have tuning curves for all neurons. 
- 
-Now we need to do the same for //all// cells. For now, we will revert to using the "​low-resolution"​ version (with 63x47 bins) with a small amount of smoothing. Even though this is not as good of an estimate as the high-resolution version, our decoding will be super slow if we try to run it on a high-resolution smoothed estimate. 
- 
-So first, let's inspect our updated tuning curve example: 
- 
-<code matlab> 
-kernel = gausskernel([4 4],2); % 2-D gaussian, width 4 bins, SD 2 
- 
-SET_xmin = 10; SET_ymin = 10; SET_xmax = 640; SET_ymax = 480; 
-SET_nxBins = 63; SET_nyBins = 47; 
- 
-spk_binned = ndhist(cat(1,​spk_x',​spk_y'​),​[SET_nxBins;​ SET_nyBins],​[SET_xmin;​ SET_ymin],​[SET_xmax;​ SET_ymax]); 
-spk_binned = conv2(spk_binned,​kernel,'​same'​);​ % smoothing 
- 
-occ_binned = ndhist(cat(1,​getd(pos,'​x'​),​getd(pos,'​y'​)),​[SET_nxBins;​ SET_nyBins],​[SET_xmin;​ SET_ymin],​[SET_xmax;​ SET_ymax]); 
-occ_mask = (occ_binned < 5); 
-occ_binned = conv2(occ_binned,​kernel,'​same'​);​ % smoothing 
- 
-occ_binned(occ_mask) = 0; % don't include bins with less than 5 samples 
- 
-VT_Fs = 30; 
-tc = spk_binned./​(occ_binned .* (1 / VT_Fs)); 
-tc(isinf(tc)) = NaN; 
- 
-pcolor(tc'​);​ shading flat; 
-axis xy; colorbar; axis off; 
-</​code>​ 
- 
-Then, we can do the same for all cells in our data set: 
- 
-<code matlab> 
-clear tc 
-nCells = length(S.t);​ 
-for iC = 1:nCells 
-    spk_x = interp1(pos.tvec,​getd(pos,'​x'​),​S.t{iC},'​linear'​);​ 
-    spk_y = interp1(pos.tvec,​getd(pos,'​y'​),​S.t{iC},'​linear'​);​ 
- 
-    spk_binned = ndhist(cat(1,​spk_x',​spk_y'​),​[SET_nxBins;​ SET_nyBins],​[SET_xmin;​ SET_ymin],​[SET_xmax;​ SET_ymax]); 
-    spk_binned = conv2(spk_binned,​kernel,'​same'​);​ 
-    ​ 
-    tc = spk_binned./​(occ_binned .* (1 / VT_Fs)); 
-    tc(isinf(tc)) = NaN; 
-    ​ 
-    all_tc{iC} = tc; 
-    ​ 
-end 
-</​code>​ 
- 
-Note that we don't need to recompute the occupancy because it is the same for all cells. 
- 
-Let's inspect the resulting tuning curves: 
- 
-<code matlab> 
-ppf = 25; % plots per figure 
-for iC = 1:​length(S.t) 
-    nFigure = ceil(iC/​ppf);​ 
-    figure(nFigure);​ 
-    ​ 
-    subplot(5,​5,​iC-(nFigure-1)*ppf);​ 
-    pcolor(all_tc{iC});​ shading flat; axis off; 
-    ​ 
-end 
-</​code>​ 
- 
-You will see a some textbook "place cells" with a clearly defined single place field. There are also cells with other firing patterns. 
- 
-☛ One cell has a completely green place map. What does this indicate, and under what conditions can this happen? 
- 
-Since we see cells with fields in some different locations, it seems unlikely that a single sensory cue or nonspatial source can account for this activity. Of course, numerous experiments have demonstrated that many place cells do not depend on any specific sensory cue to maintain a stable firing field. 
  
 === Preparing firing rates for decoding === === Preparing firing rates for decoding ===
  
-The tuning curves take care of the $f_i(x)$ term in the decoding equations. Now we need to get $\mathbf{n}$,​ which are simply ​spike counts:+To obtain ​spike counts ​within a given bin size, we can use histc():
  
 <code matlab> <code matlab>
 +%% make Q-mat
 clear Q; clear Q;
-binsize = 0.25;+binsize = 0.25; % seconds
  
 % assemble tvecs % assemble tvecs
-tvec_edges = ExpKeys.TimeOnTrack:binsize:ExpKeys.TimeOffTrack;+tvec_edges = metadata.taskvars.trial_iv.tstart(1):binsize:metadata.taskvars.trial_iv.tend(end);
 Q_tvec_centers = tvec_edges(1:​end-1)+binsize/​2;​ Q_tvec_centers = tvec_edges(1:​end-1)+binsize/​2;​
  
-for iC = length(S.t):-1:1+for iC = length(ENC_S.t):-1:1
    
-    spk_t = S.t{iC};+    spk_t = ENC_S.t{iC};
     Q(iC,:) = histc(spk_t,​tvec_edges);​     Q(iC,:) = histc(spk_t,​tvec_edges);​
 +    Q(iC,end-1) = Q(iC,​end-1)+Q(iC,​end);​ % remember last bin of histc() gotcha
    
 end end
-nActiveNeurons ​sum(> 0);+= Q(:,1:end-1);
 </​code>​ </​code>​
  
Line 385: Line 314:
  
 <code matlab> <code matlab>
-% look at it +imagesc(Q_tvec_centers,​1:​nCells,​Q)
-imagesc(tvec,​1:​nCells,​Q)+
 set(gca,'​FontSize',​16);​ xlabel('​time(s)'​);​ ylabel('​cell #'); set(gca,'​FontSize',​16);​ xlabel('​time(s)'​);​ ylabel('​cell #');
 </​code>​ </​code>​
- 
-You should see: 
- 
-{{ :​analysis:​course:​week10_qmat.png?​600 |}} 
- 
-If you zoom in to a smaller slice of time, you will notice that there are gaps in the data, i.e. segments without any activity whatsoever. This is a quirk of this particular data set: the epochs when the rat is in transit between the pedestals and the track have been removed to facilitate spike sorting. 
  
 Our Q-matrix only includes non-zero counts when the animal is running on the track; these episodes manifest as narrow vertical stripes. To speed up calculations later, let's restrict Q to those times only: Our Q-matrix only includes non-zero counts when the animal is running on the track; these episodes manifest as narrow vertical stripes. To speed up calculations later, let's restrict Q to those times only:
  
 <code matlab> <code matlab>
-LoadMetadata;​+%% only include data we care about (runs on the maze)
 Q_tsd = tsd(Q_tvec_centers,​Q);​ Q_tsd = tsd(Q_tvec_centers,​Q);​
-Q_tsd = restrict(Q_tsd,​metadata.taskvars.trial_iv); ​% metadata contains experimenter annotations - here, intervals when rat ran the track+Q_tsd = restrict(Q_tsd,​metadata.taskvars.trial_iv);​
 </​code>​ </​code>​
  
Line 409: Line 331:
 %% prepare tuning curves %% prepare tuning curves
 clear tc clear tc
-nBins = numel(occ_binned);+nBins = numel(occ_hist);
 nCells = length(S.t);​ nCells = length(S.t);​
 for iC = nCells:-1:1 for iC = nCells:-1:1
Line 417: Line 339:
 occUniform = repmat(1/​nBins,​[nBins 1]); occUniform = repmat(1/​nBins,​[nBins 1]);
 </​code>​ </​code>​
 +
  
 === Running the decoding algorithm === === Running the decoding algorithm ===
Line 551: Line 474:
    
 end end
-plot(Q_tvec_centers,​dec_err,'​.k'​);​ 
 </​code>​ </​code>​
  
Line 559: Line 481:
 % get trial id for each sample % get trial id for each sample
 trial_id = zeros(size(Q_tvec_centers));​ trial_id = zeros(size(Q_tvec_centers));​
-trial_idx = nearest_idx3(run_start,​Q_tvec_centers);​ % NOTE: on non-Windows,​ use nearest_idx.m+trial_idx = nearest_idx3(metadata.taskvars.trial_iv.tstart,​Q_tvec_centers);​ % NOTE: on non-Windows,​ use nearest_idx.m
 trial_id(trial_idx) = 1; trial_id(trial_idx) = 1;
 trial_id = cumsum(trial_id);​ trial_id = cumsum(trial_id);​
Line 574: Line 496:
 This yields: This yields:
  
-{{ :analysis:cosmo2014:dec_err_1step_250ms.png?600 |}}+{{ :analysis:course-w16:dec_err.png?nolink&600 |}}
  
-Thus, on average our estimate is 1.87 pixels away from the true position. Earlier laps seem to have some more outliers of bins where our estimate is bad (large distance) but there is no obvious trend across laps visible.+(Note, your plot might look a little different.) 
 + 
 +Thus, on average our estimate is 2.14 pixels away from the true position. Earlier laps seem to have some more outliers of bins where our estimate is bad (large distance) but there is no obvious trend across laps visible.
  
 ☛ How does the decoding accuracy depend on the bin size used? Try a range from very small (10ms) to very large (1s) bins, making sure to note the average decoding error for 50ms bins, for comparison with results in the next module. What factors need to be balanced if the goal is maximum accuracy? ☛ How does the decoding accuracy depend on the bin size used? Try a range from very small (10ms) to very large (1s) bins, making sure to note the average decoding error for 50ms bins, for comparison with results in the next module. What factors need to be balanced if the goal is maximum accuracy?
Line 595: Line 519:
 This gives: This gives:
  
-{{ :analysis:cosmo2014:2d_decerror_space_250ms.png?600 |}}+{{ :analysis:course-w16:dec_errspace.png?nolink&600 |}}
  
 It looks like the decoding error is on average larger on the central stem, compared to the arms of the maze. It looks like the decoding error is on average larger on the central stem, compared to the arms of the maze.
Line 603: Line 527:
 ==== Challenges ==== ==== Challenges ====
  
-Visual inspection of the animation or movie suggests that the decoding does a decent job of tracking the rat's true location. Howeverespecially because of the number of parameters involved in the analysis (bin sizehow firing rates are computedthe Poisson and independence assumptions,​ etc.) it is important to quantify how well we are doing.+★ Implement a decoding ​analysis on your own data. Remember that this does not necessarily requires using spiking data -- anything that you can construct ​tuning curve for would work! In this modulewe had something like 100 simultaneously recorded neuronsbut even if you have only oneyou can still attempt to use it for decodingQuantify decoding performance (errorfor a few relevant parameters. 
 + 
 +★ How does decoding performance scale with the number of cells used? This is an important ​issue if we want to figure out if we should invest resources in attempting to record from more neurons, or if we have all we need in data sets such as this one.
  
-★ Modify the visualization code above to also compute ​a //decoding error// for each frame. This should be the distance between the rat's actual location ​and the location with the highest posterior probability ​(the "​maximum a posteriori"​ or MAP estimate). Plot this error over timeexcluding those bins where no cells were activeHow does this error change over the course ​of the session? How does it change if you reduce the bin size for decoding to 100ms?+★ In famous paper, [[http://science.sciencemag.org/content/318/​5852/​900 | Johnson ​and Redish ​(2007)]] showed that the hippocampus transiently represents possible future trajectories as rats appeared to deliberate between choices (left? right?at a decision pointHoweverthey used a controversial "​two-step"​ decoding algorithm which attracted criticismRefer to the Methods section ​of that paper to figure out how they did the decoding, and modify the code above to implement their version. What differences do you notice?
analysis/course-w16/week10.txt · Last modified: 2018/07/07 10:19 (external edit)