wiki

~~DISCUSSION~~ ===== Module 5: Visualizing neural data in MATLAB ===== Goals: * Load local field potential (LFP) and spike data from an example data set * Master basic and intermediate MATLAB plotting: ''plot'', ''axis'' and ''figure'' properties * Use interactive figure properties for dynamic data viewing * Learn to use functions with variable numbers of input arguments (using ''varargin'' and ''cfg'' inputs) effectively * Understand which file format to export figures to Resources: * (reference) MATLAB documentation for ''plot'', [[ http://www.mathworks.com/help/matlab/ref/figure_props.html | Figure properties ]], [[ http://www.mathworks.com/help/matlab/ref/axes_props.html | Axis properties ]] and [[ http://www.mathworks.com/help/matlab/ref/line_props.html | Line Properties ]] * (reference) [[http://www.mathworks.com/help/matlab/creating_plots/figures-plots-and-graphs.html | MATLAB plotting gallery ]] * (reference) [[http://www.vandermeerlab.org/BIOL377_assignment3_2012.pdf | Place cell assignment]] from %%MvdM%%'s BIOL377 course (if you want a gentle intro to what is going on in the data you are looking at; can be done in 30 min) * (optional) Tufte, [[ http://www.edwardtufte.com/tufte/books_vdqi | The Visual Display of Quantitative Information ]] ===== Introductory remarks ===== Visualization is an essential component of any data analysis workflow. You would never perform brain surgery or another intricate experimental procedure without a good view of what you are doing -- if conditions are met for a next step, whether a manipulation has the desired outcome -- and likewise, you should not do data analysis blind. This means that after every step you want to be able to see what changed. Also, for many analyses, a visualization is part of the final outcome. Most neuroscience papers have complex multipanel figures, such as this one from [[http://www.nature.com/nature/journal/v469/n7330/fig_tab/nature09633_F4.html | Dragoi and Tonegawa]] (Nature, 2011): {{:analysis:course:nature09633-f4.2.jpg?nolink&600|}} Notice that panel Aa, a highly selective and organized view of raw data, already contains LFP data (top) and spike data from 18 cells! This sort of figure is constructed by first producing the individual panels in MATLAB, and then touching up and combining them in Illustrator. We will do the MATLAB part here and aim to produce something similar to panel Aa. An important principle in graphical design for scientific communication is to only include those elements which contribute to the messages(s) you want to communicate. Remove and simplify as much as possible: this will make your figures more readable. The Tufte book linked to above is a beautiful statement of this (and other) guiding principles of graphic design. Even if you do not go as far as reading it, do at least make your figure legends big enough to be readable! :-) ==== Loading some data ==== First, make sure you have the folder ''R042-2013-08-18'' from the data share, and that this is placed in a sensible location (NOT in a %%GitHub%% or project folder!). This data set was collected from the dorsal CA1 region of the hippocampus of a rat performing a T-maze task, using a drive with 16 independently movable tetrodes. Spike and LFP data was recorded from each tetrode; possible spike events were detected online and stored for offline spike-sorting, and LFPs were sampled at 2kHz and bandpass filtered between 1-475Hz. (A quirk of this particular data set is that certain time intervals are cut out of the spike data, but not the LFP. So you may notice some odd looking gaps in the rasterplot later.) Use the shortcut button created in the previous module to set your path. In your project folder, create a folder with today's date and ''cd'' to it. This will be your "scratch" working directory with stuff that you don't necessarily intend to push to %%GitHub%%. Create a ''sandbox.m'' file in your daily folder. In this file, use cell mode to load some spike trains, a LFP, and position data as follows (recall you can use Ctrl+Enter to execute the code in a cell): <code matlab> %% cd to data folder -- replace this with yours fd = 'D:\data\DataAnalysisTutorial\R042-2013-08-18'; cd(fc); %% load the data (note, may need to unzip position data first) cfg = []; S = LoadSpikes(cfg) </code> If you are using your own data that has a missing or bad .clusters file, do this: <code matlab> cfg = []; cfg.useClustersFile = 0; S = LoadSpikes(cfg) </code> This should give a data structure containing spike train data: <code matlab> LoadSpikes: Loading 67 files... S = t: {1x67 cell} label: {1x67 cell} cfg: [1x1 struct] usr: [1x1 struct] </code> The details of this are discussed in [[analysis:nsb2014:week2|Module 2]]. In brief, each cell of ''S.t'' contains the spike times from a putative neuron. The qualifier "putative" is used because this is extracellular data and [[analysis:nsb2014:week4|spike-sorting]] is not perfect, so it's likely there will be some spikes missing and some spikes included that are not from this neuron. Always remember this even if I will omit the "putative" from now on for short. You can see we have loaded 67 neurons for now. Let's also load a LFP: <code matlab> cfg = []; cfg.fc = {'R042-2013-08-18-CSC03a.ncs'}; csc = LoadCSC(cfg); </code> This gives: <code matlab> LoadCSC: Loading 1 files... LoadCSC: R042-2013-08-18-CSC03a.ncs 0/17193 bad blocks found (0.00%). >> csc csc = tvec: [8802816x1 double] data: [1x8802816 double] label: {'R042-2013-08-18-CSC03a.ncs'} cfg: [1x1 struct] </code> ''csc'' is the common Neuralynx designation for "continuously sampled channel" and typically is an EEG or LFP type signal sampled and filtered so that high-frequency components such as spikes are not accessible. It is possible to have wide-band, 32kHz CSCs suitable for spike extraction, but these are not included in the current dataset. As discussed in [[analysis:nsb2014:week2|Module 3]] a LFP is defined by matching sets of sample timestamps (''csc.tvec'') and sampled data (''csc.data''). ==== ts objects: basic methods ==== The ''S'' variable that contains our spike times is a MATLAB struct that conforms to the ''ts'' ("timestamp") format introduced in [[analysis:nsb2014:week2|Module 2]]. Its ''.t'' field contains spike trains as lists of numbers, corresponding to the timestamps of the spikes. So, to plot a raster with the spikes of the first cell in our data set: <code matlab> nCell = 1; SET_SpikeHeight = 0.4; plot([S.t{nCell} S.t{nCell}],[nCell-SET_SpikeHeight nCell+SET_SpikeHeight],'k') set(gca,'YLim',[-10 10]); </code> ☛ Extend the example code above to make a rasterplot containing the spikes for all cells in ''S''. To facilitate re-use later, wrap your code in a function, %%PlotSpikeRaster%%, that plots the raster for any ''ts'' input. You should find that it takes a significant amount of time to plot that many spikes. In practice we often want to plot spikes for specific time intervals. A useful function that works on ''ts'', ''tsd'' and ''iv'' objects is ''restrict()''. ☛ Use MATLAB's ''help'' function to look up the usage information for ''restrict()''. In a new editor cell, use this function to create a ''S_r'' variable with data restricted to between 5950 and 6050 s. Calling your %%PlotSpikeRaster%% function on this restricted data should be much faster. Now that we are a bit more zoomed in, some interesting features of the rasterplot become visible. For instance, at time 5967, many cells are synchronously active. Could this be a sharp wave-ripple event (SWR)? ☛ Restrict the csc to the same time (5950 to 6050) and plot it in the same figure. You can use the ''rescale()'' function to fit the csc into a convienient data range (e.g. ''rescale(csc.data,-5,0)''). You should see that the synchronous activation around time 5967 is indeed associated with a striking oscillation in the LFP: {{ :analysis:nsb2014:m5_swr_example.png?600 |}} This high-frequency osillation (the "ripple") superimposed on a slower oscillation (the "sharp wave") is the signature associated with "replay" events in the hippocampus. We will explore this phenomenon in detail in later modules. ==== Figure properties, handles, and exporting ==== (Remember to execute the commands below from your ''sandbox.m'' file, not from the command line! That way you can more easily retrieve what you did later.) A MATLAB figure is actually a collection of objects, all with their own properties which you can access and edit. For instance, to change the figure background to black, do <code matlab> set(gcf,'Color',[0 0 0]); </code> ''gcf'' is MATLAB's name for "get current figure". The above command set its '''Color''' property to ''[0 0 0]'' which is black (zero red, zero green, zero blue in the RGB world). ☛ Set the background of the current //axes// to black (''gca'' is MATLAB's name for the current axes object; notice that the axes background is currently white, which looks ugly). Notice that now you can't see the axis lines and labels anymore, because they too became black when you changed the color of the axes. You want to set their color to be white, but you might not know the name of the relevant property. ☛ Use ''get(gca)'' to get a list of all the axes properties. Find the properties that correspond to the axes color and set them to white. Now that our figure looks nice, let's save it as an image file: <code matlab> print(gcf,'-dpng','-r300','R042-2013-08-18-LFPsnippet.png'); </code> Notice that the first argument of ''print'' is the current figure. The other arguments specify that we want a ''PNG'' file, with 300dpi resolution. PNG format is a good choice for saving figures because it uses lossless compression, JPG images, which use lossy compression, can have ugly artifacts (but more colors). Another useful save format is ''-dill'' which saves ''.ai'' Illustrator files. ☛ Look at the image file. The colors are all wrong! Solve this problem by turning off the ''InvertHardCopy'' property of the figure, and save again. (MATLAB does this by default to facilitate printing images on white paper.) Let's add some moderately fancy features to our plot. Starting from a currently selected rasterplot, do the following: <code matlab> %% add multi-unit activity to rasterplot ax1 = gca; ax1_pos = get(ax1,'Position'); ax2 = axes('Position',ax1_pos); % new axes with same position as first cfg = []; cfg.tvec = csc.tvec; cfg.sigma = 0.1; mua = getMUA(cfg,S); % obtain multi-unit activity xr = get(ax1,'XLim'); mua_r = restrict(mua,xr(1),xr(2)); % only keep the data we need axes(ax2); mua_hdl = plot(mua_r.tvec,mua_r.data,'Color',[0.7 0.7 0.7]); set(gca,'YAxisLocation','right','Box','off','XTick',[],'Color','none') linkaxes([ax1 ax2],'x'); </code> A few things to note here: * Using the ''Position'' property of the current axes (ax1, containing the rasterplot) we created a new set of axes (ax2). * In these new axes, we plotted the multi-unit activity, obtaining an output argument. Like ''gca'' and ''gcf'' this is a **handle** to graphics object; in this case, a handle to the multi-unit activity signal we just drew. * We set the properties of the new multi-unit axes to have the y-axis on the right (plus a few other properties). * ''linkaxes()'' was used to link the x-axis of both axes so that both update when zooming in. Try it! ☛ Change the properties of ''mua_hdl'' so that the line is red, dashed, and of width 2. 100 seconds of LFP is a lot to display. ☛ Use the 'XLim' axes property to zoom in to 5989 to 5990 s; while you are at it, increase the font size to 24. (You can change multiple properties with the same command, just keep adding property-value pairs.) It would be nice to be able to update what time window of the data we are looking at, without having to type these ''XLim'' commands. To do this we need to use a special Figure property, introduced in the next section. One final point of interest: different graphics objects are often (hierarchically) related. For instance, our figure currently contains one set of axes. We can obtain the axes handle directly, ''axes_handle = gca;'', but also indirectly, ''axes_handle = get(gcf,'Children');''. This works because currently, the axes are the only child of the figure. ==== Figure callback functions ==== First, another basic MATLAB skill: anonymous functions. Consider: <code matlab> sqr_fn = @(x) x.^2; sqr_fn(2) </code> ''sqr_fn'' defines a **function handle** to a so-called anonymous function, i.e. a function that is not in .m file form. The above example basically says "sqr_fn is a function of x, and should return x squared." (if you don't understand the ''.'' notation, it is important to look this up!) Look at the [[http://www.mathworks.com/help/matlab/ref/figure_props.html|KeyPressFcn]] figure property. This property specifies the function to call whenever a key is pressed with the figure active.. ☛ Implement the example shown in the page above. Notice the properties of the ''evt'' object passed to the keypress function, and that this function also gets passed the handle of the figure that generated the keypress event. The keypress function thus has access to everything we need to implement some useful navigation functionality. ☛ Create a keypress function that moves the x-limits of the current figure's axes to the left or right by 50% (i.e. if the current axes are from 1 to 2s, then pressing the right arrow key takes it to 1.5-2.5 s. It will be easier to do this with a ''.m'' file given the length of what you need to do. Hint: model your ''.m'' file on the template in the example given in the KeyPressFcn section of the [[http://www.mathworks.com/help/matlab/ref/figure_props.html|Figure properties ]] page, and define it thus: ''f_hdl = figure('KeyPressFcn',@your_function);'' if your function is called ''your_function.m''. Please keep in mind of the difference between ”==” and “strcmp” operators. The former treats two strings as vectors and does letter-wise comparison. Thus the two strings ought to be of the same dimension. The latter operator does lexical comparison and returns “1” if the two strings are identical, “0” otherwise. ==== Varargins ==== A standard function definition specifies the exact number of input and output arguments the function expects: ''function y = sqrt(x)'' has one input and one output argument. However, it is often useful to have a variable number of input arguments, so that you can override defaults or specify additional options as needed without complicating simple function calls. For instance, you want to be able to make quick plots with ''plot(x)'' but specify extra stuff when needed (''plot(x,'LineWidth',2)''). This is accomplished with the special argument ''varargin''. Here is the idea: <code matlab> function test_fun(a,varargin) b = 2; % set defaults for b and c c = 3; extract_varargin; % override b and c if specified fprintf('a is %d, b is %d, c is %d\n',a,b,c); </code> Test it: <code matlab> >> test_fun(1) a is 1, b is 2, c is 3 >> test_fun(1,'b',1) a is 1, b is 1, c is 3 >> </code> Note how specifying a "key-value" pair ('''b',1'') overwrote the default for ''b''. ''extract_varargin'' basically goes through the varargins and assigns each value to the corresponding key. ☛ Verify your understanding by calling ''test_fun'' such that ''b'' and ''c'' are both set to ''0''. Also note what happens when you do ''test_fun(1,'B',1)''! A different way of flexibly specifying function inputs is through the use of a ''cfg'' (configuration) input. An example of how this works can be seen in the ''getMUA()'' function used above. ==== Exercise ==== ☛ Extend your %%PlotSpikeRaster%% function as follows, using either the varargin or cfg strategy, and your knowledge of object handles. <code matlab> % required inputs: % % spikes: ts object with spike train % % optional inputs: % % csc: tsd object with LFP data (bonus points for accepting more than one gracefully) % % cscColor: [nCSC x 3] RGB values to plot CSCs, defaults to red % spikeColor: [nCells x 3] RGB values to plot spikes, defaults to black % % evt: ts object with event times to plot as vertical lines, defaults to none % % interactiveMode: boolean to enable/disable arrow key navigation of time axis </code> This function should produce a rasterplot of spikes -- one neuron per row -- with one or multiple LFPs plotted above it, similar to panel Aa of the Dragoi and Tonegawa figure above. The values on the y-axis are not important for now, so you can feel free to hide this axis and rescale everything to look good. Set the default x range to 2 seconds. Also, we want to be able to overlay events of interest on this plot (such as times when the animal received reward, or brain stimulation was triggered); these can be simple vertical lines that extend all the way through the plot. You can make up a few event times yourself for testing. Finally, for data exploration it is helpful to be able to skip through the data with the arrow keys; set it up so that the left and right arrows can be used to scroll left and right. For extra awesomeness, add zoom functionality that scales the LFPs when pressing the up and down arrows. ☛ When you are satisfied with your work, save a copy in your project folder and push it to GitHub. Verify that your version is visible from the web. === Hints === Everything you need to know to make this work has been introduced in this module: beyond this (and basic MATLAB) no special functions or tricks are needed. A few suggestions: * Start simple. Instead of trying to implement everything at once, take it one step at a time. First, plot a single line (corresponding to one spike). Then, plot a spike train for one neuron, and keep building. * Don't test on the full data set. Things will be faster if you use restricted data for testing. * Plot everything in a single set of axes, i.e. without using ''subplot()'' or ''axes()''; this means you have to rescale and add a baseline offset (a.k.a. DC component) to the LFP signals so that they don't overlap. * Develop in the ''sandbox'' file so that you can use cell mode and have easy access to the workspace. Only when everything is basically working, move to putting everything inside a function.

wiki

User Tools

Site Tools

Sidebar

Page Tools