analysis:course-w16:week4

This shows you the differences between two versions of the page.


analysis:course-w16:week4 [2016/01/21 13:46] mvdm [Detailed examination of Neuralynx time series data] |
analysis:course-w16:week4 [2018/07/07 10:19] (current) |
||
---|---|---|---|

Line 1: | Line 1: | ||

~~DISCUSSION~~ | ~~DISCUSSION~~ | ||

- | |||

- | :!: **Under construction, do not use!** :!: | ||

===== Anatomy of time series data, sampling theory ===== | ===== Anatomy of time series data, sampling theory ===== | ||

Line 93: | Line 91: | ||

==== Subsampling (decimating) time series data ==== | ==== Subsampling (decimating) time series data ==== | ||

- | In the real world, the frequency at which we can acquire data will be limited the properties of your experimental equipment. For instance, the maximum sampling rate on a typical Neuralynx system is 32 kHz. Thus, the highest-frequency signal we can detect is 16 kHz (the Nyquist frequency). Crucially, however, we cannot rule out the possibility that frequencies above 16 kHz are present in the signal we are sampling from! Thus, we risk **aliasing**: generating "phantom" frequencies in our sampled data that don't exist in the true signal. What to do? | + | In the real world, the frequency at which we can acquire data will be limited by the properties of your experimental equipment. For instance, the maximum sampling rate on a typical Neuralynx system is 32 kHz. Thus, the highest-frequency signal we can detect is 16 kHz (the Nyquist frequency). Crucially, however, we cannot rule out the possibility that frequencies above 16 kHz are present in the signal we are sampling from! Thus, we risk **aliasing**: generating "phantom" frequencies in our sampled data that don't exist in the true signal. What to do? |
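To make the aliasing risk concrete, the following minimal sketch computes where a pure tone would appear after sampling at 32 kHz. The test frequencies in ''f_true'' are hypothetical values chosen for illustration; the folding formula is the standard one for an ideal sampler with no anti-aliasing filter.

<code matlab>
% Apparent (aliased) frequency of pure tones sampled at Fs, no anti-aliasing filter.
Fs = 32000;                           % sampling rate (Hz), as in the text
f_nyq = Fs/2;                         % Nyquist frequency (Hz)
f_true = [1000 15000 20000 31000];    % hypothetical input frequencies (Hz)

% fold each frequency back into the range [0, f_nyq]
f_alias = abs(f_true - Fs*round(f_true/Fs));

for iF = 1:numel(f_true)
    fprintf('true %5d Hz -> appears at %5d Hz\n', f_true(iF), f_alias(iF));
end
</code>

Frequencies at or below the Nyquist frequency are unchanged, whereas the 20 kHz and 31 kHz tones show up at 12 kHz and 1 kHz respectively: "phantom" frequencies that cannot be told apart from real ones after sampling.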

The general solution is to apply an //anti-aliasing filter// to the data before sampling. To illustrate this, let's generate a signal consisting of two frequencies: | The general solution is to apply an //anti-aliasing filter// to the data before sampling. To illustrate this, let's generate a signal consisting of two frequencies: | ||

Line 125: | Line 123: | ||

<code matlab> | <code matlab> | ||

% sample at 12 Hz with different method | % sample at 12 Hz with different method | ||

+ | tvec1d = decimate(tvec1, dt); | ||

signal2d = decimate(signal1,dt); | signal2d = decimate(signal1,dt); | ||

Line 184: | Line 183: | ||

<code matlab> | <code matlab> | ||

xl = [1 1.04]; | xl = [1 1.04]; | ||

- | linkaxes('x','ax1',ax2); | + | linkaxes([ax1, ax2], 'x'); |

set(ax1,'XLim',xl); % see what I did there?) | set(ax1,'XLim',xl); % see what I did there?) | ||

</code> | </code> | ||

Line 204: | Line 203: | ||

You should obtain something like: | You should obtain something like: | ||

- | {{ :analysis:course:week3_fig1.png?nolink&600 |}} | + | {{ :analysis:course-w16:spline_recover.png?nolink&600 |}} |

Notice how the spline-interpolated sampled signal is a pretty good approximation to the original. In cases where you care about detecting the values and/or locations of signal peaks, such as during spike sorting, performing spline interpolation can often improve accuracy substantially! | Notice how the spline-interpolated sampled signal is a pretty good approximation to the original. In cases where you care about detecting the values and/or locations of signal peaks, such as during spike sorting, performing spline interpolation can often improve accuracy substantially! | ||
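As a rough illustration of the peak-detection point, the sketch below samples a sine wave whose peak falls between two samples, then compares the peak estimate from the raw samples with the one from a spline-interpolated version. All parameter values and variable names here are assumptions made for this example; they are not taken from the figure above.

<code matlab>
% Peak estimate from raw samples vs. spline-interpolated samples (toy example).
Fs = 1000; f0 = 100;                  % hypothetical sampling rate and signal frequency (Hz)
tvec = 0:1/Fs:0.05;
x = sin(2*pi*f0*tvec);
t_true = 0.0125;                      % true time of the second peak (s); true peak value is 1

% raw samples: take the largest sample in a window around the peak
win = tvec > 0.0095 & tvec < 0.0155;
t_win = tvec(win); x_win = x(win);
[pk_raw, idx] = max(x_win); t_raw = t_win(idx);

% spline interpolation onto a finer time base, then take the maximum
t_fine = linspace(0.0095, 0.0155, 601);
x_fine = interp1(tvec, x, t_fine, 'spline');
[pk_spline, idx] = max(x_fine); t_spline = t_fine(idx);

fprintf('raw:    peak %.3f, timing error %.2f ms\n', pk_raw, 1000*abs(t_raw - t_true));
fprintf('spline: peak %.3f, timing error %.2f ms\n', pk_spline, 1000*abs(t_spline - t_true));
</code>

With these settings the raw samples miss the peak by half a sample interval and underestimate its height, while the interpolated estimate lands much closer on both counts.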

+ | |||

==== Detailed examination of Neuralynx time series data ==== | ==== Detailed examination of Neuralynx time series data ==== | ||

+ | |||

+ | This section looks in some detail at how raw time series data is stored by the Neuralynx system. Even if you do not use this system in your own work, the lessons learned from seeing what can go wrong at the raw data level are universal! | ||

To get into the guts of actual Neuralynx data, we will not use the sanitized wrapper provided by ''LoadCSC()'' but instead use the loading function provided by Neuralynx. Using cell mode in a sandbox file as usual, ''cd'' into the ''R016-2012-10-08'' data folder you downloaded previously in Week 1. Then deploy the Neuralynx loader: | To get into the guts of actual Neuralynx data, we will not use the sanitized wrapper provided by ''LoadCSC()'' but instead use the loading function provided by Neuralynx. Using cell mode in a sandbox file as usual, ''cd'' into the ''R016-2012-10-08'' data folder you downloaded previously in Week 1. Then deploy the Neuralynx loader: | ||

Line 284: | Line 286: | ||

<code matlab> | <code matlab> | ||

- | >> run(FindFile('*keys.m')) | + | >> LoadExpKeys |

>> ExpKeys | >> ExpKeys | ||

Line 304: | Line 306: | ||

</code> | </code> | ||

- | In fact this data contains two recording sessions, called 'Value' and 'Risk' respectively (this refers to the distributions of food outcomes predicted by audio cues presented as the rat crossed the center of the track; we will not use this information for now). These sessions map onto the first and second elements of ''TimeOnTrack'' and ''TimeOffTrack'', which give the times (in seconds) of when the Value and Risk sessions started and ended, respectively. | + | In fact this data contains two recording sessions, called 'Value' and 'Risk' respectively (this refers to the distributions of food outcomes predicted by audio cues presented as the rat crossed the center of the track; we will not use this information for now, but the full task is described in the [[http://onlinelibrary.wiley.com/doi/10.1111/ejn.13069/fullpaper | paper]]). These sessions map onto the first and second elements of ''TimeOnTrack'' and ''TimeOffTrack'', which give the times (in seconds) of when the Value and Risk sessions started and ended, respectively. |

☛ Use the first element of ''ExpKeys.TimeOnTrack'' and ''ExpKeys.TimeOffTrack'' to find the indices of ''Timestamps'' corresponding to the Value session. Then, use these to create a new set of variables ''TimestampsValue'', ''SamplesValue'' et cetera. (Note that this is essentially what ''restrict()'' does; if you are confused by this, review the documentation on [[http://www.mathworks.com/help/matlab/math/matrix-indexing.html|Matrix Indexing]].) | ☛ Use the first element of ''ExpKeys.TimeOnTrack'' and ''ExpKeys.TimeOffTrack'' to find the indices of ''Timestamps'' corresponding to the Value session. Then, use these to create a new set of variables ''TimestampsValue'', ''SamplesValue'' et cetera. (Note that this is essentially what ''restrict()'' does; if you are confused by this, review the documentation on [[http://www.mathworks.com/help/matlab/math/matrix-indexing.html|Matrix Indexing]].) | ||
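If the indexing pattern is unfamiliar, here is a minimal sketch of the general idea on made-up data. The variables ''ts'', ''data'', ''t_on'' and ''t_off'' are hypothetical stand-ins, not the actual Neuralynx variables; the sketch assumes one column of data per timestamp and that timestamps and session times are in the same units, which you should verify for the real data.

<code matlab>
% Restricting data to a time window with logical indexing (toy data).
ts = 0:0.5:10;                              % hypothetical timestamps (s)
data = reshape(1:4*numel(ts), 4, []);       % hypothetical data, one column per timestamp
t_on = 2; t_off = 5;                        % hypothetical session start and end (s)

keep = ts >= t_on & ts <= t_off;            % logical index: true inside the window
ts_r = ts(keep);                            % restricted timestamps
data_r = data(:, keep);                     % matching columns of the data

fprintf('kept %d of %d timestamps\n', numel(ts_r), numel(ts));
</code>

The same logical index is applied to every variable that shares the time dimension, which keeps the restricted variables aligned with each other.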

Line 334: | Line 336: | ||

☛ Compute Neuralynx's true sampling rate from the observed mode of the timestamp diffs. | ☛ Compute Neuralynx's true sampling rate from the observed mode of the timestamp diffs. | ||

- | Close enough for practical purposes, but the differences could add up! | + | Close enough for practical purposes, but the differences could become significant for very long recording sessions! |
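To get a feel for how a small mismatch accumulates, the sketch below estimates a sampling rate from the mode of timestamp differences and then computes the timing error after one hour if the nominal rate were assumed instead. All numbers are hypothetical and deliberately exaggerated; they are not the values from this data set, and the toy data uses one timestamp per sample.

<code matlab>
% Accumulated timing error from a nominal vs. estimated sampling rate (toy numbers).
nominal_Fs = 2000;                          % hypothetical nominal rate (Hz)
dt_us = 501;                                % hypothetical actual sample interval (microseconds)
ts_us = (0:9999) * dt_us;                   % integer timestamps, one per sample
ts_us(5001:end) = ts_us(5001:end) + 3000;   % a single gap; the mode is unaffected by it

Fs_est = 1e6 / mode(diff(ts_us));           % estimated sampling rate (Hz)

rec_s = 3600;                               % recording length (s)
n_samples = rec_s * Fs_est;                 % samples actually acquired in that time
drift_s = n_samples/nominal_Fs - rec_s;     % error if sample times are computed from nominal_Fs

fprintf('estimated Fs: %.2f Hz, drift after %d s: %.2f s\n', Fs_est, rec_s, drift_s);
</code>

Using the mode rather than the mean makes the estimate robust to occasional gaps, and the drift grows linearly with recording length.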

Next: what is up with these clearly smaller values in the diff plot? Let's investigate: | Next: what is up with these clearly smaller values in the diff plot? Let's investigate: | ||

Line 351: | Line 353: | ||

☛ How does the ''LoadCSC()'' function handle these cases? | ☛ How does the ''LoadCSC()'' function handle these cases? | ||

+ | |||

+ | :!: NOTE: the above missing sample weirdness was a rare occurrence for our lab's Neuralynx system; one that was traced to a faulty framegrabber board driver which caused the computer to lock up periodically. Thanks to Neuralynx's warning and error reporting system in the acquisition software, we were immediately alerted that something unexpected was happening. In addition, the ''*events.Nev'' file contains event strings indicating suspect data blocks. | ||
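A simple automated check for irregular timestamps of this kind might look like the sketch below. It runs on synthetic timestamps with two gaps inserted by hand; the tolerance ''tol'' and all variable names are assumptions for this example, and this is not how ''LoadCSC()'' handles such cases.

<code matlab>
% Flagging irregular timestamp intervals and estimating missing samples (toy data).
dt_us = 500;                                   % hypothetical sample interval (microseconds)
ts_us = (0:999) * dt_us;
ts_us(401:end) = ts_us(401:end) + 4*dt_us;     % simulate 4 missing samples
ts_us(801:end) = ts_us(801:end) + 10*dt_us;    % simulate 10 missing samples

d = diff(ts_us);
expected = mode(d);                            % most common interval
tol = 0.5 * expected;                          % hypothetical tolerance

irregular_idx = find(abs(d - expected) > tol); % intervals that are too long or too short
n_missing = max(round(d(irregular_idx)/expected) - 1, 0);

for iG = 1:numel(irregular_idx)
    fprintf('irregular interval after sample %d: ~%d samples missing\n', ...
        irregular_idx(iG), n_missing(iG));
end
</code>

Intervals shorter than expected are flagged as well and reported with zero missing samples, so they can be inspected separately.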

+ | |||

+ | ==== Challenges ==== | ||

+ | |||

+ | ★ If you have your own time series data, find out how it is stored: with what precision? In blocks? Does the reported sampling rate match up with what is in the data? How can you convert from the raw data format to voltage (or whatever the quantity you are measuring is)? | ||

+ | |||

+ | ★ If you implemented your own file loader(s) back in Module 2, implement checks for missing samples and possible sampling frequency misalignments. | ||

+ | |||

+ | ★ Important! If you have your own idea of something you'd like to accomplish in this course, even if it isn't listed as an official challenge, ask me and we can make it count as one. What you do in this course should be as relevant as possible to your work! |

analysis/course-w16/week4.1453402019.txt.gz · Last modified: 2018/07/07 10:19 (external edit)

Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Share Alike 4.0 International