wiki

====== Differences ====== This shows you the differences between two versions of the page.

--- analysis:course-w16:week10 [2016/02/14 19:47]
mvdm [Preparing firing rates for decoding]
+++ analysis:course-w16:week10 [2018/04/17 15:20] (current)
@@ Line 1: / Line 1: @@
 ~~DISCUSSION~~
-:!: **UNDER CONSTRUCTION -- PLEASE DO NOT USE YET** :!:
 ===== Spike train analysis II: tuning curves, encoding, decoding =====
@@ Line 9: / Line 7: @@
   * Learn to estimate and plot tuning curves, raw and smoothed
   * Implement a basic Bayesian decoding algorithm
-  * Compare decoded and actual position by exporting to a movie file
+  * Compare decoded and actual position by computing the decoding error
 Resources:
@@ Line 16: / Line 14: @@
   * (for reference) [[http://jn.physiology.org/content/79/2/1017.long | Zhang et al. 1998]], first application of decoding to place cell data, with nice explanations and derivations
   * (for reference) [[http://www.jneurosci.org/content/18/18/7411.long | Brown et al. 1998]], an example of a more sophisticated decoding method
-**Note**: this module uses an externally compiled function, ''ndhist''. By default, this will not work on non-Windows machines, but the source code is provided. If you are a non-Windows user and able to compile this, I would be grateful if you could push it to %%GitHub%%.
 ==== Introduction ====
@@ Line 83: / Line 78: @@
 This cell seems to have a place field just to the left of the choice point on the track (a T-maze). There are also a few spikes on the pedestals, where the rat rests in between runs on the track.
-This figure is a useful visualization of the raw data, but it is not a tuning curve.
 === Estimating tuning curves ===
-A simple way to obtain a tuning curve is to bin the spikes (not in time, as in the previous module, but now in space):
+This figure is a useful visualization of the raw data, but it is not a tuning curve. As a first step towards estimating this cell's tuning curve (or //encoding model//, we should restrict the spikes to only those occurring when the rat is running on the track:
 <code matlab>
-SET_xmin = 10; SET_ymin = 10; SET_xmax = 640; SET_ymax = 480;
+LoadMetadata;
-SET_nxBins = 63; SET_nyBins = 47;
+ENC_S = restrict(S,metadata.taskvars.trial_iv);
+ENC_pos = restrict(pos,metadata.taskvars.trial_iv);
+% check for empties and remove
+keep = ~cellfun(@isempty,ENC_S.t);
+ENC_S.t = ENC_S.t(keep);
+ENC_S.label = ENC_S.label(keep);
+S.t = S.t(keep);
+S.label = S.label(keep);
+</code>
-spk_binned = ndhist(cat(1,spk_x',spk_y'),[SET_nxBins; SET_nyBins],[SET_xmin; SET_ymin],[SET_xmax; SET_ymax]);
+We have created ''ENC_'' versions of our spike trains and position data, containing only data from when the rat was running on the track (using experimenter annotation stored in the metadata; ''trial_iv'' contains the start and end times of trials) and removed all cells from the data set that did not have any spikes on the track.
-imagesc(spk_binned');
+☛ Plot the above scatterfield again for the restricted spike train. Verify that no spikes are occurring off the track by comparing your plot to the previous one for the full spike trains, above.
-axis xy; axis off; colorbar;
-</code>
-''ndhist'', for "N-dimensional histogram" can bin data in more than one dimension, in this case a 2-D grid with ''x'' and ''y'' position as the dimensions. It needs to know the minimum and maximum values for each dimension, as well as the number of bins in each; the data should be formatted with one dimension per row.
+To estimate tuning curves from the data, we need to divide spike count by time spent for each location on the maze. A simple way of doing that is to obtain 2-D histograms, shown here for the position data:
-This gives:
+<code matlab>
+clear pos_mat;
+pos_mat(:,1) = getd(ENC_pos,'y'); % construct input to 2-d histogram
+pos_mat(:,2) = getd(ENC_pos,'x');
-{{ :analysis:course:week10_tc0.png?600 |}}
+SET_xmin = 80; SET_ymin = 0; % set up bins
+SET_xmax = 660; SET_ymax = 520;
+SET_xBinSz = 10; SET_yBinSz = 10;
-This spike count can then be divided by how much time the rat spent in each of the bins (the "occupancy",  a value in seconds) to get a firing rate in spikes per second:
+x_edges = SET_xmin:SET_xBinSz:SET_xmax;
+y_edges = SET_ymin:SET_yBinSz:SET_ymax;
+occ_hist = histcn(pos_mat,y_edges,x_edges); % 2-D version of histc()
+no_occ_idx = find(occ_hist == 0); % NaN out bins never visited
+occ_hist(no_occ_idx) = NaN;
+occ_hist = occ_hist .* (1/30); % convert samples to seconds using video frame rate (30 Hz)
+subplot(221);
+pcolor(occ_hist); shading flat; axis off; colorbar
+title('occupancy');
+</code>
+We can do the same thing for the spikes of our example neuron:
 <code matlab>
-occ_binned = ndhist(cat(1,getd(pos,'x'),getd(pos,'y')),[SET_nxBins; SET_nyBins],[SET_xmin; SET_ymin],[SET_xmax; SET_ymax]);
+% basic spike histogram
+clear spk_mat;
-% this is a sample count, so need to convert to seconds (1/30s per sample, the video tracker sampling rate) to get time
+iC = 7;
-VT_Fs = 30;
+spk_x = interp1(ENC_pos.tvec,getd(ENC_pos,'x'),ENC_S.t{iC},'linear');
-tc = spk_binned./(occ_binned .* (1./VT_Fs)); % firing rate is spike count divided by time
+spk_y = interp1(ENC_pos.tvec,getd(ENC_pos,'y'),ENC_S.t{iC},'linear');
+spk_mat(:,2) = spk_x; spk_mat(:,1) = spk_y;
+spk_hist = histcn(spk_mat,y_edges,x_edges);
-pcolor(tc'); shading flat; %  "shading flat" removes black grid in plot
+spk_hist(no_occ_idx) = NaN;
-axis xy; colorbar; axis off;
+subplot(222)
+pcolor(spk_hist); shading flat; axis off; colorbar
+title('spikes');
 </code>
-Note that the occupancy is obtained by multiplying the number of samples in each bin with the sampling interval.
+..and finally simply divide one by the other:
-You should get:
+<code matlab>
+% rate map
+tc = spk_hist./occ_hist;
-{{ :analysis:course:week10_tc1.png?600 |}}
+subplot(223)
+pcolor(tc); shading flat; axis off; colorbar
+title('rate map');
+</code>
-''pcolor()'' ensures that bins which are NaN (such as would occur from division by zero occupancy) show up as white; ''imagesc()'' shows NaNs the same way as zeros.
+This gives:
+{{ :analysis:course-w16:raw_tc.png?nolink&900 |}}
-We now have a tuning curve for our cell, sometimes also called a "rate map" because we are dealing with (x,y) position. However, it suffers from the same issues arising from binning identified in the [[analysis:course-w16:week9|previous module]]. An improvement would be to bin at a much finer scale, and then use smoothing:
+Note that from the occupancy map, you can see the rat spent relatively more time at the base of the stem compared to other segments of the track. However, the rough binning is not very satisfying. Let's see if we can do better with some smoothing:
 <code matlab>
-SET_nxBins = 630; SET_nyBins = 470; % more bins
+kernel = gausskernel([4 4],2); % Gaussian kernel of 4x4 pixels, SD of 2 pixels (note this should sum to 1)
-kernel = gausskernel([30 30],8); % 2-D gaussian for smoothing: 30 points in each direction, SD of 8 bins
+[occ_hist,~,~,pos_idx] = histcn(pos_mat,y_edges,x_edges);
+occ_hist = conv2(occ_hist,kernel,'same');
-% spikes
+occ_hist(no_occ_idx) = NaN;
-spk_binned = ndhist(cat(1,spk_x',spk_y'),[SET_nxBins; SET_nyBins],[SET_xmin; SET_ymin],[SET_xmax; SET_ymax]);
+occ_hist = occ_hist .* (1/30);
-spk_binned = conv2(spk_binned,kernel,'same'); % note, now a 2-dimensional convolution
-% occupancy
+subplot(221);
-occ_binned = ndhist(cat(1,getd(pos,'x'),getd(pos,'y')),[SET_nxBins; SET_nyBins],[SET_xmin; SET_ymin],[SET_xmax; SET_ymax]);
+pcolor(occ_hist); shading flat; axis off; colorbar
-occ_binned = conv2(occ_binned,kernel,'same');
+title('occupancy');
-occ_binned(occ_binned < 0.01) = 0; % minimum occupancy thresholding
+%
-tc = spk_binned./(occ_binned .* (1 / VT_Fs));
+spk_hist = histcn(spk_mat,y_edges,x_edges);
-tc(isinf(tc)) = NaN;
+spk_hist = conv2(spk_hist,kernel,'same'); % 2-D convolution
+spk_hist(no_occ_idx) = NaN;
-figure;
+subplot(222)
-pcolor(tc'); shading flat; axis off
+pcolor(spk_hist); shading flat; axis off; colorbar
-axis xy; colorbar;
+title('spikes');
+%
+tc = spk_hist./occ_hist;
+subplot(223)
+pcolor(tc); shading flat; axis off; colorbar
+title('rate map');
 </code>
-This gives:
+Now you should get:
-{{ :analysis:course:week10_tc2.png?600 |}}
+{{ :analysis:course-w16:smooth_tc.png?nolink&900 |}}
-Now we have a nice smooth estimate of this cell's place field. Note that the values on the colorbar have changed quite a bit as a result of the smoothing.
+These are well-formed tuning curves we can use for decoding. Of course we could bin more finely for increased spatial resolution, but this will slow down the decoding, so for now it's not worth it.
-☛ What does the ''occ_binned(occ_binned < 0.01) = 0;'' line accomplish? Uncomment it and re-run the code.
+Next we obtain a tuning curve for all our cells:
-☛ What happens if you don't do ''tc(isinf(tc)) = NaN;''?
+<code matlab>
+clear tc all_tc
+nCells = length(ENC_S.t);
+for iC = 1:nCells
+    spk_x = interp1(ENC_pos.tvec,getd(ENC_pos,'x'),ENC_S.t{iC},'linear');
+    spk_y = interp1(ENC_pos.tvec,getd(ENC_pos,'y'),ENC_S.t{iC},'linear');
+    clear spk_mat;
+    spk_mat(:,2) = spk_x; spk_mat(:,1) = spk_y;
+    spk_hist = histcn(spk_mat,y_edges,x_edges);
+    spk_hist = conv2(spk_hist,kernel,'same');
+    spk_hist(no_occ_idx) = NaN;
+    tc = spk_hist./occ_hist;
+    all_tc{iC} = tc;
+end
+</code>
+We can inspect the results as follows:
+<code matlab>
+%%
+ppf = 25; % plots per figure
+for iC = 1:length(ENC_S.t)
+    nFigure = ceil(iC/ppf);
+    figure(nFigure);
+    subtightplot(5,5,iC-(nFigure-1)*ppf);
+    pcolor(all_tc{iC}); shading flat; axis off;
+    caxis([0 10]);
+end
+</code>
+You will see a some textbook "place cells" with a clearly defined single place field. There are also cells with other firing patterns.
 The data in this module computes tuning curves for location, but the idea is of course more general. For continuous variables in particular, it is a natural and powerful way to describe the relationship between two quantities -- spikes and location in this case, but there is no reason why you couldn't do something like pupil diameter as a function of arm reaching direction, for instance!
@@ Line 188: / Line 266: @@
 In general, from the [[http://en.wikipedia.org/wiki/Poisson_distribution | definition of the Poisson distribution]], it follows that
-\[P(n_i|\mathbf{x}) = \frac{(\tau f_i(\mathbf{x}))^{n_i}}{n_i!} e^{-\tau f_i (x)}\]
+\[P(n_i|\mathbf{x}) = \frac{(\tau f_i(\mathbf{x}))^{n_i}}{n_i!} e^{-\tau f_i (\mathbf{x})}\]
 $f_i(\mathbf{x})$ is the average firing rate of neuron $i$ over $x$ (i.e. the tuning curve for position), $n_i$ is the number of spikes emitted by neuron $i$ in the current time window, and $\tau$ is the size of the time window used. Thus, $\tau f_i(\mathbf{x})$ is the mean number of spikes we expect from neuron $i$ in a window of size $\tau$; the Poisson distribution describes how likely it is that we observe the actual number of spikes $n_i$ given this expectation.
@@ Line 197: / Line 275: @@
 \[P(\mathbf{n}|\mathbf{x}) = \prod_{i = 1}^{N} \frac{(\tau f_i(\mathbf{x}))^{n_i}}{n_i!}
-e^{-\tau f_i (x)}\]
+e^{-\tau f_i (\mathbf{x})}\]
 An analogy here is simply to ask: if the probability of a coin coming up heads is $0.5$, what is the probability of two coints, flipped simultaneously, coming up heads? If the coins are independent then this is simply $0.5*0.5$.
@@ Line 209: / Line 287: @@
 The tuning curves take care of the $f_i(x)$ term in the decoding equations. Next, we need to get $\mathbf{n}$, the spike counts.
-=== Preparing tuning curves for decoding ===
-With the math taken care of, we can now start preparing the data for the decoding procedure. First we need to make sure we have tuning curves for all neurons.
-Now we need to do the same for //all// cells. For now, we will revert to using the "low-resolution" version (with 63x47 bins) with a small amount of smoothing. Even though this is not as good of an estimate as the high-resolution version, our decoding will be super slow if we try to run it on a high-resolution smoothed estimate.
-So first, let's inspect our updated tuning curve example:
-<code matlab>
-kernel = gausskernel([4 4],2); % 2-D gaussian, width 4 bins, SD 2
-SET_xmin = 10; SET_ymin = 10; SET_xmax = 640; SET_ymax = 480;
-SET_nxBins = 63; SET_nyBins = 47;
-spk_binned = ndhist(cat(1,spk_x',spk_y'),[SET_nxBins; SET_nyBins],[SET_xmin; SET_ymin],[SET_xmax; SET_ymax]);
-spk_binned = conv2(spk_binned,kernel,'same'); % smoothing
-occ_binned = ndhist(cat(1,getd(pos,'x'),getd(pos,'y')),[SET_nxBins; SET_nyBins],[SET_xmin; SET_ymin],[SET_xmax; SET_ymax]);
-occ_mask = (occ_binned < 5);
-occ_binned = conv2(occ_binned,kernel,'same'); % smoothing
-occ_binned(occ_mask) = 0; % don't include bins with less than 5 samples
-VT_Fs = 30;
-tc = spk_binned./(occ_binned .* (1 / VT_Fs));
-tc(isinf(tc)) = NaN;
-pcolor(tc'); shading flat;
-axis xy; colorbar; axis off;
-</code>
-Then, we can do the same for all cells in our data set:
-<code matlab>
-clear tc
-nCells = length(S.t);
-for iC = 1:nCells
-    spk_x = interp1(pos.tvec,getd(pos,'x'),S.t{iC},'linear');
-    spk_y = interp1(pos.tvec,getd(pos,'y'),S.t{iC},'linear');
-    spk_binned = ndhist(cat(1,spk_x',spk_y'),[SET_nxBins; SET_nyBins],[SET_xmin; SET_ymin],[SET_xmax; SET_ymax]);
-    spk_binned = conv2(spk_binned,kernel,'same');
-    tc = spk_binned./(occ_binned .* (1 / VT_Fs));
-    tc(isinf(tc)) = NaN;
-    all_tc{iC} = tc;
-end
-</code>
-Note that we don't need to recompute the occupancy because it is the same for all cells.
-Let's inspect the resulting tuning curves:
-<code matlab>
-ppf = 25; % plots per figure
-for iC = 1:length(S.t)
-    nFigure = ceil(iC/ppf);
-    figure(nFigure);
-    subplot(5,5,iC-(nFigure-1)*ppf);
-    pcolor(all_tc{iC}); shading flat; axis off;
-end
-</code>
-You will see a some textbook "place cells" with a clearly defined single place field. There are also cells with other firing patterns.
-☛ One cell has a completely green place map. What does this indicate, and under what conditions can this happen?
-Since we see cells with fields in some different locations, it seems unlikely that a single sensory cue or nonspatial source can account for this activity. Of course, numerous experiments have demonstrated that many place cells do not depend on any specific sensory cue to maintain a stable firing field.
 === Preparing firing rates for decoding ===
-The tuning curves take care of the $f_i(x)$ term in the decoding equations. Now we need to get $\mathbf{n}$, which are simply spike counts:
+To obtain spike counts within a given bin size, we can use histc():
 <code matlab>
+%% make Q-mat
 clear Q;
-binsize = 0.25;
+binsize = 0.25; % seconds
-tvec = ExpKeys.TimeOnTrack:binsize:ExpKeys.TimeOffTrack;
-for iC = length(S.t):-1:1
+% assemble tvecs
+tvec_edges = metadata.taskvars.trial_iv.tstart(1):binsize:metadata.taskvars.trial_iv.tend(end);
+Q_tvec_centers = tvec_edges(1:end-1)+binsize/2;
+for iC = length(ENC_S.t):-1:1
-    spk_t = S.t{iC};
+    spk_t = ENC_S.t{iC};
-    Q(iC,:) = histc(spk_t,tvec);
+    Q(iC,:) = histc(spk_t,tvec_edges);
+    Q(iC,end-1) = Q(iC,end-1)+Q(iC,end); % remember last bin of histc() gotcha
 end
-nActiveNeurons = sum(Q > 0);
+Q = Q(:,1:end-1);
 </code>
@@ Line 302: / Line 314: @@
 <code matlab>
-% look at it
+imagesc(Q_tvec_centers,1:nCells,Q)
-imagesc(tvec,1:nCells,Q)
 set(gca,'FontSize',16); xlabel('time(s)'); ylabel('cell #');
 </code>
-You should see:
+Our Q-matrix only includes non-zero counts when the animal is running on the track; these episodes manifest as narrow vertical stripes. To speed up calculations later, let's restrict Q to those times only:
-{{ :analysis:course:week10_qmat.png?600 |}}
+<code matlab>
+%% only include data we care about (runs on the maze)
-If you zoom in to a smaller slice of time, you will notice that there are gaps in the data, i.e. segments without any activity whatsoever. This is a quirk of this particular data set: the epochs when the rat is in transit between the pedestals and the track have been removed to facilitate spike sorting.
+Q_tsd = tsd(Q_tvec_centers,Q);
+Q_tsd = restrict(Q_tsd,metadata.taskvars.trial_iv);
+</code>
 The final step before the actual decoding procedure is to reformat the tuning curves a bit to make the decoding easier to run. Instead of keeping them as a 2-D matrix, we just unwrap this into 1-D:
 <code matlab>
+%% prepare tuning curves
 clear tc
-nBins = numel(occ_binned);
+nBins = numel(occ_hist);
 nCells = length(S.t);
 for iC = nCells:-1:1
@@ Line 325: / Line 339: @@
 occUniform = repmat(1/nBins,[nBins 1]);
 </code>
 === Running the decoding algorithm ===
@@ Line 331: / Line 346: @@
 <code matlab>
-len = length(tvec);
+%% decode
-p = nan(length(tvec),nBins);
+Q_tvec_centers = Q_tsd.tvec;
+Q = Q_tsd.data;
+nActiveNeurons = sum(Q > 0);
+len = length(Q_tvec_centers);
+p = nan(length(Q_tvec_centers),nBins);
 for iB = 1:nBins
     tempProd = nansum(log(repmat(tc(iB,:)',1,len).^Q));
@@ Line 338: / Line 358: @@
     p(:,iB) = exp(tempProd)*tempSum*occUniform(iB);
 end
+p = p./repmat(sum(p,2),1,nBins); % renormalize to 1 total probability
+p(nActiveNeurons < 1,:) = 0; % ignore bins with no activity
 </code>
-Note, on my 2-year old laptop this takes about 2 minutes to run. The computation can be speeded up substantially by only looping over bins that have non-zero occupancy, but this would complicate the code a bit.
+☛ Compare these steps with the equations above.
-☛ Compare these steps with the equations above. There is no log in there; why does it appear here?
+  * There is no log in the equations; why does it appear here?
+  * What parts of the equation correspond to the ''tempProd'' and ''tempSum'' variables?
-Finally we renormalize and set the posterior time windows without any neural activity to zero (done for display purposes):
-<code matlab>
-p = p./repmat(sum(p,2),1,nBins);
-p(nActiveNeurons < 1,:) = 0;
-</code>
 === Visualizing the results ===
@@ Line 356: / Line 373: @@
 <code matlab>
-xBinEdges = linspace(SET_xmin,SET_xmax,SET_nxBins+1);
+xBinned = interp1(ENC_pos.tvec,pos_idx(:,1),Q_tvec_centers);
-yBinEdges = linspace(SET_ymin,SET_ymax,SET_nyBins+1);
+yBinned = interp1(ENC_pos.tvec,pos_idx(:,2),Q_tvec_centers);
-xTempD = getd(pos,'x'); xTempR = pos.tvec;
-yTempD = getd(pos,'y');
-gS = find(~isnan(xTempD) & ~isnan(yTempD));
-xi = interp1(xTempR(gS),xTempD(gS),tvec,'linear');
-yi = interp1(xTempR(gS),yTempD(gS),tvec,'linear');
-xBinned = (xi-xBinEdges(1))./median(diff(xBinEdges));
-yBinned = (yi-yBinEdges(1))./median(diff(yBinEdges));
 </code>
@@ Line 373: / Line 380: @@
 <code matlab>
-goodOccInd = find(occ_binned > 0);
+goodOccInd = find(occ_hist > 0);
-for iT = 1:size(p,1)
+SET_nxBins = length(x_edges)-1; SET_nyBins = length(y_edges)-1;
+dec_err = nan(length(Q_tvec_centers),1);
+for iT = 1:length(Q_tvec_centers)
     cla;
-    temp = reshape(p(iT,:),[SET_nxBins SET_nyBins]);
+    temp = reshape(p(iT,:),[SET_nyBins SET_nxBins]);
-    toPlot = nan(SET_nxBins,SET_nyBins);
+    toPlot = nan(SET_nyBins,SET_nxBins);
     toPlot(goodOccInd) = temp(goodOccInd);
     pcolor(toPlot); axis xy; hold on; caxis([0 0.5]);
     shading flat; axis off;
     hold on; plot(yBinned(iT),xBinned(iT),'ow','MarkerSize',15);
+    % get x and y coordinates of MAP
+    [~,idx] = max(toPlot(:));
+    [x_map,y_map] = ind2sub(size(toPlot),idx);
+    if nActiveNeurons(iT) > 0
+        dec_err(iT) = sqrt((yBinned(iT)-y_map).^2+(xBinned(iT)-x_map).^2);
+    end
+    plot(y_map,x_map,'g*','MarkerSize',5);
-    h = title(sprintf('t %.2f, nCells %d',tvec(iT),nActiveNeurons(iT)));
+    h = title(sprintf('t %.2f, nCells %d, dist %.2f',Q_tvec_centers(iT),nActiveNeurons(iT),dec_err(iT)));
     if nActiveNeurons(iT) == 0
         set(h,'Color',[1 0 0]);
@@ Line 395: / Line 416: @@
 </code>
-Initially, when the rat is moving around at the base of the maze (its actual position is indicated by the white ''o''), no decoding results will be shown because there are no cells active. However, after a few seconds the rat starts running up the stem of the maze, and some pixels in the plot will change color, indicating $P(\mathbf{x}|\mathbf{n})$, the "posterior probability" that is the output of the decoding procedure. As you can see, the decoding seems to track the rat's actual location as it moves.
+This plot shows the posterior $P(\mathbf{x}|\mathbf{n})$, as the rat moves around the maze; its actual position is indicated by the white ''o'', and the pixel with the highest posterior probability is indicated by the green ''*''. As you can see, the decoding seems to track the rat's actual location as it moves.
 ☛ No decoding is available for those bins where no neurons are active, because we manually set the posterior to zero. However, there also seem to be some frames in the animation where some neurons are active (as indicated in the title), yet no decoded estimate is visible. What is the explanation for this?
-=== Exporting the results to a movie file ===
+=== Optional diversion: exporting the results to a movie file ===
-By making the MATLAB animation into a movie file, it is often easier to explore the results. To do this, we can run the animation code above, with a few small modifications. First, before entering the main plotting loop, set the figure to be used to a specific size:
+By making the %%MATLAB%% animation into a movie file, it is often easier to explore the results. To do this, we can run the animation code above, with a few small modifications. First, before entering the main plotting loop, set the figure to be used to a specific size:
 <code matlab>
@@ Line 423: / Line 444: @@
 </code>
-The above will only work if you have the XVid codec installed: I highly recommend this because it creates movie files that are an order of magnitude smaller than uncompressed files. If you have trouble with XVid, you can of course still save an uncompressed file for now. For longer movies, it is often required to save a file, say, every 500 frames, to prevent the ''f'' variable getting too large. These segments can then be merged with a video editing program such as [[http://www.virtualdub.org/ | VirtualDub]] (Windows only AFAIK; please suggest OSX/Linux alternatives if you know any that work well!).
+The above will only work if you have the %%XVid%% codec installed: I highly recommend this because it creates movie files that are an order of magnitude smaller than uncompressed files. If you have trouble with %%XVid%%, you can of course still save an uncompressed file for now. For longer movies, it is often required to save a file, say, every 500 frames, to prevent the ''f'' variable getting too large. These segments can then be merged with a video editing program such as [[http://www.virtualdub.org/ | VirtualDub]] (Windows only %%AFAIK%%; please suggest %%OSX/Linux%% alternatives if you know any that work well!).
+=== Quantifying decoding accuracy ===
+By running the above loop without plotting we can obtain the "decoding error", that is the distance between the maximum a posteriori (MAP) estimate and the rat's true position:
+<code matlab>
+%% get distance (no plotting)
+dec_err = nan(length(Q_tvec_centers),1);
+SET_nxBins = length(x_edges)-1; SET_nyBins = length(y_edges)-1;
+xBinned = interp1(ENC_pos.tvec,pos_idx(:,1),Q_tvec_centers);
+yBinned = interp1(ENC_pos.tvec,pos_idx(:,2),Q_tvec_centers);
+for iT = 1:length(Q_tvec_centers);
+    temp = reshape(p(iT,:),[SET_nyBins SET_nxBins]);
+    toPlot = nan(SET_nyBins,SET_nxBins);
+    toPlot(goodOccInd) = temp(goodOccInd);
+    % get x and y coordinates of MAP
+    [~,idx] = max(toPlot(:));
+    [x_map,y_map] = ind2sub(size(toPlot),idx);
+    if nActiveNeurons(iT) > 0
+        dec_err(iT) = sqrt((yBinned(iT)-y_map).^2+(xBinned(iT)-x_map).^2);
+    end
+end
+</code>
+A nice way to plot this is to average by lap as well as overall:
+<code matlab>
+% get trial id for each sample
+trial_id = zeros(size(Q_tvec_centers));
+trial_idx = nearest_idx3(metadata.taskvars.trial_iv.tstart,Q_tvec_centers); % NOTE: on non-Windows, use nearest_idx.m
+trial_id(trial_idx) = 1;
+trial_id = cumsum(trial_id);
+%plot(dec_err_tsd.tvec,trial_id,'.k');
+figure; set(gca,'FontSize',18);
+boxplot(dec_err,trial_id);
+xlabel('trial'); ylabel('decoding error (pixels)');
+av_error = nanmean(dec_err);
+title(sprintf('avg err %.2f',av_error));
+</code>
+This yields:
+{{ :analysis:course-w16:dec_err.png?nolink&600 |}}
+(Note, your plot might look a little different.)
+Thus, on average our estimate is 2.14 pixels away from the true position. Earlier laps seem to have some more outliers of bins where our estimate is bad (large distance) but there is no obvious trend across laps visible.
+☛ How does the decoding accuracy depend on the bin size used? Try a range from very small (10ms) to very large (1s) bins, making sure to note the average decoding error for 50ms bins, for comparison with results in the next module. What factors need to be balanced if the goal is maximum accuracy?
+A different way of looking at the decoding error is to plot it as a function of space:
+<code matlab>
+cfg = [];
+cfg.y_edges = y_edges; cfg.x_edges = x_edges;
+dec_err_tsd = tsd(Q_tvec_centers,dec_err);
+space_err = TSDbySpace(cfg,ENC_pos,dec_err_tsd);
+figure;
+pcolor(space_err); shading flat; axis off; colorbar; caxis([0 10]);
+</code>
+This gives:
+{{ :analysis:course-w16:dec_errspace.png?nolink&600 |}}
+It looks like the decoding error is on average larger on the central stem, compared to the arms of the maze.
+☛ What could be some reasons for this? Can you think of ways to test your suggestions?
 ==== Challenges ====
-Visual inspection of the animation or movie suggests that the decoding does a decent job of tracking the rat's true location. However, especially because of the number of parameters involved in the analysis (bin size, how firing rates are computed, the Poisson and independence assumptions, etc.) it is important to quantify how well we are doing.
+★ Implement a decoding analysis on your own data. Remember that this does not necessarily requires using spiking data -- anything that you can construct a tuning curve for would work! In this module, we had something like 100 simultaneously recorded neurons, but even if you have only one, you can still attempt to use it for decoding. Quantify decoding performance (error) for a few relevant parameters.
+★ How does decoding performance scale with the number of cells used? This is an important issue if we want to figure out if we should invest resources in attempting to record from more neurons, or if we have all we need in data sets such as this one.
-★ Modify the visualization code above to also compute a //decoding error// for each frame. This should be the distance between the rat's actual location and the location with the highest posterior probability (the "maximum a posteriori" or MAP estimate). Plot this error over time, excluding those bins where no cells were active. How does this error change over the course of the session? How does it change if you reduce the bin size for decoding to 100ms?
+★ In a famous paper, [[http://science.sciencemag.org/content/318/5852/900 | Johnson and Redish (2007)]] showed that the hippocampus transiently represents possible future trajectories as rats appeared to deliberate between choices (left? right?) at a decision point. However, they used a controversial "two-step" decoding algorithm which attracted criticism. Refer to the Methods section of that paper to figure out how they did the decoding, and modify the code above to implement their version. What differences do you notice?

wiki

User Tools

Site Tools

Page Tools