
~~DISCUSSION~~

===== Interactions between multiple signals: coherence, Granger causality, and phase-slope index =====

Goals:

  * Develop an intuition for what determines the coherence between two signals
  * Employ some different methods of estimating coherence to appreciate the tradeoffs involved
  * Understand the main limitation of the coherence measure: a lack of information about directionality
  * Grasp the concept of Granger causality in the context of autoregressive models
  * Explore cases where Granger causality can give inaccurate or misleading results, and learn how to detect such cases
  * Apply an alternative method, the phase-slope index, that can work in cases where G-causality fails

Resources:

  * (background reading, a brief review) [[http://www.ncbi.nlm.nih.gov/pubmed/16150631 | Fries (2005)]] Communication through coherence paper
  * (optional, a nice example application) [[http://www.ncbi.nlm.nih.gov/pubmed/17372196 | deCoteau et al. (2007)]] hippocampus-striatum coherence changes with learning
  * (technical background) [[http://arxiv.org/pdf/q-bio/0608035v1.pdf | Ding et al. (2006)]] theory of Granger causality and applications to neuroscience

==== Introduction ====


Characterizing and quantifying relationships between different signals, recorded from anatomically related areas in the brain, is an important tool in systems and cognitive neuroscience. Much evidence supports the idea that the effective flow of information along a fixed anatomical projection can be dynamically regulated, for instance by emphasizing bottom-up rather than top-down inputs in a task-related manner.

One possible mechanism for this routing of information is "communication through coherence" and its many variants (Fries, 2005), which propose that effective connectivity (i.e. the flow of information) depends on the degree to which two areas exhibit coherent oscillatory activity. In this module we define LFP coherence, explore its properties, and apply it to some example data. (For a brief review of what is meant by structural, functional, and effective connectivity, see [[http://www.scholarpedia.org/article/Brain_connectivity | here]].)

Coherence is an inherently //symmetric// measure, i.e. it cannot distinguish whether A influences B or the other way around. To address the directionality question, we also explore Granger causality and phase slopes.


In general, the above models $M$ are examples of **autoregressive** (AR) models: the dependent variable $Y(t)$ is regressed against linear combinations of past values of that variable itself, where the coefficients $a$ and $b$ can be thought of as the regression coefficients or weights of each past value. You may also encounter the term vector autoregressive (VAR) models; these are simply the multivariate extension of AR models. $M_2$ above is a VAR model, since it has two variables. There is a large literature on (V)AR models, since they are a major tool in forecasting all sorts of things, ranging from the stock market to the weather.
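At its heart, fitting an AR model is just a linear regression of the signal on time-lagged copies of itself. A minimal sketch in plain MATLAB may make this concrete (this is our own toy example with made-up coefficients 0.9 and -0.5, not part of the FieldTrip workflow used below):

<code matlab>
% simulate an AR(2) process: y(t) = 0.9*y(t-1) - 0.5*y(t-2) + noise
nSamples = 10000; y = zeros(nSamples,1);
for t = 3:nSamples
    y(t) = 0.9*y(t-1) - 0.5*y(t-2) + randn;
end

% recover the coefficients by least-squares regression on lagged copies
X = [y(2:end-1) y(1:end-2)]; % columns: lag 1, lag 2
b = X \ y(3:end); % estimated coefficients, should be close to [0.9; -0.5]
</code>

The toolboxes used below do essentially this, with more sophisticated estimation and model-order handling.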


=== Generating artificial data ===

To explore how to fit AR models to data, it's a good idea to start with some artificial data of which we know the structure. Earlier in this module we did so "by hand", but here we will use %%FieldTrip%%'s useful ''ft_connectivitysimulation()'':

<code matlab>
cfg = [];
cfg.ntrials = 1000;
cfg.triallength = 5; % in seconds
cfg.fsample = 1000;
cfg.nsignal = 2; % two signals, X and Y, which start out as identical white noise

cfg.method = 'linear_mix';
cfg.mix = [0; 0]; % multiply white noise for X and Y by this
cfg.delay = [0; 0]; % Y is n samples delayed relative to X (both 0)
cfg.bpfilter = 'no';
cfg.absnoise = 1; % add independent noise to both signals, so now X and Y should be independent

data = ft_connectivitysimulation(cfg);
data.label = {'X','Y'};
</code>

The above code generates 1000 trials of 5 seconds each of independent white noise for two signals $X$ and $Y$. We do so in a somewhat roundabout way, by first setting the common signal in X and Y to zero (''cfg.mix = [0; 0]'') and then adding independent noise of amplitude 1 to each (''cfg.absnoise = 1''). Why we do it this way will become clear later, when we generate more interesting combinations of signals.

☛ Verify that the two signals X and Y are indeed uncorrelated, as one would expect from independently generated white noise. One way to do so is to compute a correlation coefficient for each trial and plot the distribution of the resulting correlation coefficients (use ''corrcoef()'').
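One possible way to set up this check (a sketch only; the loop and variable names are ours, and this is just one of several reasonable approaches):

<code matlab>
% correlation coefficient between X and Y for each simulated trial
nTrials = length(data.trial);
cc = nan(nTrials,1);
for iT = 1:nTrials
    r = corrcoef(data.trial{iT}(1,:), data.trial{iT}(2,:));
    cc(iT) = r(1,2); % off-diagonal element is the X-Y correlation
end
hist(cc,50); xlabel('correlation coefficient (r)');
</code>

The distribution should be centered on zero.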

Next, we can fit our AR model:

<code matlab>
cfg_ar = [];
cfg_ar.order = 3;
cfg_ar.toolbox = 'bsmart';
mdata = ft_mvaranalysis(cfg_ar, data);
</code>

Note the ''order'' parameter, which specifies how far back to estimate coefficients for (the $L$ parameter in the equations above). Although this is %%FieldTrip%% code, it uses the [[http://www.brain-smart.org/ | BSMART]] toolbox under the hood to fit the model. The ''ft_mvaranalysis()'' function has some useful options we aren't using right now, such as the ability to estimate errorbars with the ''jackknife'' option; this takes a long time, however, so we skip it here.

What we are interested in are the coefficients $a$ and $b$: the extent to which we can predict each signal separately based on its own past, and how much that prediction can be improved by knowledge of the other signal.

To plot these coefficients, we can do:

<code matlab>
figure; subplot(221)

labels = {'X->X','X->Y';'Y->X','Y->Y'}; cols = 'rgbc';
nP = 0;
for iI = 1:cfg.nsignal
    for iJ = 1:cfg.nsignal
        nP = nP + 1;

        h(nP) = plot(1:cfg_ar.order,sq(mdata.coeffs(iI,iJ,:)),cols(nP));
        hold on;
        plot(1:cfg_ar.order,sq(mdata.coeffs(iI,iJ,:)),'.','MarkerSize',20,'Color',cols(nP));

    end
end
set(gca,'FontSize',18,'LineWidth',1); box off;
set(h,'LineWidth',2);
xlabel('lag (samples)'); ylabel('coefficient');
title('cfg.delay = [0; 0];');
legend(h,labels(:));
</code>

You should see that the coefficient values are very small (on the order of $10^{-4}$). This is what we expect from signals that we know to be uncorrelated: these values should not be statistically different from zero, meaning that we cannot predict anything about our signal based on its past -- the definition of white noise!

Let's now create some signals that do have some structure:

<code matlab>
%%
cfg.mix = [0.8; 0.8]; % X and Y are identical white noise with amplitude 0.8
cfg.absnoise = 0.2; % add amplitude 0.2 *independent* noise
cfg.delay = [0; 2]; % advance Y 2 samples relative to X

data = ft_connectivitysimulation(cfg);
data.label = {'X','Y'};
</code>

☛ Fit the VAR model again, and plot the coefficients in the next subplot. You should get something like:

{{ :analysis:course-w16:mvar1.png?nolink&600 |}}

Note how for the delay case, we correctly estimate that X can be predicted from Y, at the expected delay of 2 samples.

It is important to be aware of the limitations of Granger causality. The term "Granger-causes" is often used to emphasize the inherently descriptive nature of VAR models, which cannot distinguish true causality from a number of alternative scenarios. Prominent among these is the possibility of a common input Z affecting both X and Y, but with different time lags: X may then "Granger-cause" Y without any direct anatomical connection between them. A different, all-too-common case is when signals X and Y have different signal-to-noise ratios; we will highlight this issue in the next section. More generally, it is unclear what conclusions can be drawn from Granger causality in systems with recurrent (feedback) connections, which are of course ubiquitous in the brain -- a nice paper demonstrating and discussing this is [[http://neuralsystemsandcircuits.biomedcentral.com/articles/10.1186/2042-1001-1-9 | Kispersky et al. 2011]].

=== Spectrally resolved Granger causality ===

Given how ubiquitous oscillations are in neural data, it is often informative not to fit VAR models directly in the time domain (as we did in the previous section) but to go to the frequency domain. Intuitively, //spectrally resolved// Granger causality measures how much of the power in $X$, not accounted for by $X$ itself, can be attributed to $Y$ ([[http://www.sciencedirect.com/science/article/pii/S1053811908001328 | technical paper]]). To explore this, we'll generate some more artificial data:

<code matlab>
nTrials = 1000;

cfg = [];
cfg.ntrials = nTrials;
cfg.triallength = 5;
cfg.fsample = 1000;
cfg.nsignal = 2;

cfg.method = 'linear_mix';
cfg.mix = [0.5; 0.5];
cfg.delay = [0; 4];
cfg.bpfilter = 'yes';
cfg.bpfreq = [50 100]; % white noise gets filtered in this frequency band
cfg.absnoise = 0.5; % add independent noise to both signals

data = ft_connectivitysimulation(cfg);
data.label = {'X','Y'};
</code>

Note that we are now using the ''bpfilter'' cfg option, which filters the original white noise in the specified frequency band. Thus, X and Y are 50% identical signal with frequency content between 50 and 100 Hz, and 50% independent noise. (You can inspect what this looks like by doing ''ft_databrowser([],data)''.)

Next, we perform the frequency decomposition, %%FieldTrip%%-style:

<code matlab>
cfg_TFR = [];
cfg_TFR.channel = {'X','Y'};
cfg_TFR.channelcmb = {'X' 'Y'};
cfg_TFR.method = 'mtmfft';
cfg_TFR.output = 'fourier';
cfg_TFR.foi = 1:1:150;
cfg_TFR.taper = 'hanning';

TFR = ft_freqanalysis(cfg_TFR,data);
</code>

Now we can compute the Granger spectra:

<code matlab>
cfg_G = [];
cfg_G.method = 'granger';
cfg_G.channel = {'X','Y'};
cfg_G.channelcmb = {'X' 'Y'};

C = ft_connectivityanalysis(cfg_G,TFR);
</code>

...and plot the results:

<code matlab>
figure;
for iP = 1:4
    subplot(2,2,iP);
    plot(C.freq,C.grangerspctrm(iP,:));
    set(gca,'FontSize',14,'YLim',[0 0.5]);
    title([C.labelcmb{iP,:}]);
end
</code>

You should get something like:

{{ :analysis:course-w16:spectral_granger1.png?nolink&600 |}}

These panels show how much of the power in X (or Y) can be predicted based on itself, or on the other signal. You can see that the top right panel (Y→X) has higher coefficients than the reverse (X→Y), consistent with the 4-sample advancement of Y relative to X.

☛ Why are there non-zero coefficients for the X→Y direction? Test your hypothesis with artificial data.

Now, let's consider the following case:

<code matlab>
cfg = [];
cfg.ntrials = nTrials;
cfg.triallength = 5;
cfg.fsample = 1000;
cfg.nsignal = 2;

cfg.method = 'linear_mix';
cfg.mix = [1; 0.5]; % X bigger than Y
cfg.delay = [0; 0];
cfg.bpfilter = 'yes';
cfg.bpfreq = [50 100]; % white noise gets filtered in this frequency band
cfg.absnoise = 0.5; % add independent noise to both signals

data = ft_connectivitysimulation(cfg);
data.label = {'X','Y'};
</code>

Note that $X$ is the same signal as $Y$, but twice as large; there is //no delay// between them. Independent noise is then added to both $X$ and $Y$ as before.

☛ Compute the Granger cross-spectra for these signals as was done above.

You should see that X Granger-causes Y.

☛ How is this possible, given that we generated X and Y to have zero delay?

This case, in which two (near-)identical signals have different signal-to-noise ratios, is very common in neuroscience. As you have seen, Granger causality is easily fooled by it.

How can we detect whether we are dealing with a Granger-causality false positive like this? An elegant way is to //reverse// both signals and test again; if the Granger asymmetry persists after this, we have a tell-tale sign of a signal-to-noise Granger artifact.

☛ Reverse the two signals and compute the Granger cross-spectra, both for the zero-delay artifact case and for the true causal case above. Verify that this reverse-Granger test accurately distinguishes the two cases. [[http://www.sciencedirect.com/science/article/pii/S105381191401009X | This paper]] discusses these issues in more detail.
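In case it is unclear how to reverse the signals: FieldTrip stores each trial as a [nChannels x nSamples] matrix, so time-reversal can be sketched as follows (our own helper code, not a built-in FieldTrip operation):

<code matlab>
% flip each trial in time, keeping the rest of the data struct intact
data_rev = data;
for iT = 1:length(data_rev.trial)
    data_rev.trial{iT} = fliplr(data_rev.trial{iT}); % reverse the time axis
end
% ...then rerun ft_freqanalysis() and ft_connectivityanalysis() on data_rev
</code>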

==== Phase-slope index ====

If we have a situation such as the above, it is possible that a true lag or lead between two signals is obscured by different signal-to-noise ratios. If such a case is detected by the reverse-Granger analysis, how can we proceed to identify the true delay?

A possible solution is offered by the analysis of **phase slopes**: the idea that, for a given lead or lag between two signals, the phase lag (or lead) should depend systematically on frequency ([[http://arxiv.org/pdf/0712.2352.pdf | Nolte et al. (2008)]]; see also precedents in the literature such as [[http://science.sciencemag.org/content/308/5718/111.short | Schoffelen et al. (2005)]]).

[[https://github.com/mvdm/papers/tree/master/Catanese_vanderMeer2016 | Catanese and van der Meer (2016)]] diagram the idea as follows:

{{ :analysis:course-w16:psi_example.png?nolink&800 |}}

In the example in (**A**) above, the red signal always leads the blue signal by 5 ms, which results in a different phase lag at each frequency (20, 25, and 33.3 Hz in this example). This is because 5 ms is a much bigger slice of a full oscillation cycle at 33.3 Hz than it is at 25 Hz; the bottom panel shows the linear relationship between phase lag and frequency for the above examples, resulting in a positive slope for the red-blue phase difference, indicating a red-signal lead.

(**B**) shows the raw phase differences for an example real data session in the top panel: note that the phase lag as a function of frequency contains approximately linear regions in the "low-gamma" (45-65 Hz, green) and "high-gamma" (70-90 Hz, red) frequency bands, with slopes in opposite directions. The phase slope (middle panel) is the derivative of the raw phase lag; the reversal of the phase slope sign around 65-70 Hz indicates that high and low gamma are associated with opposite directionality in the vStr-mPFC system, with vStr leading for low gamma and mPFC leading for high gamma oscillations. The bottom panel shows the phase-slope index (PSI), which normalizes the raw phase slope by its standard deviation.

Thus, to summarize: the phase-slope index (PSI) is a normalized form of the phase slope, obtained by dividing the raw phase slope at each frequency by its standard deviation (estimated using a bootstrap). The phase slope itself is obtained by taking the derivative (slope) of the raw phase differences across frequencies; as discussed above, these raw phase differences can be obtained by estimating the phase (angle) of the cross-spectrum.
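That last step can be sketched schematically as follows. This is our own single-trial illustration, not the actual FieldTrip implementation (which averages tapered cross-spectra over trials and normalizes by a bootstrap estimate of the standard deviation); ''x'' and ''y'' stand for two signals of equal length:

<code matlab>
Fx = fft(x); Fy = fft(y); % spectra of the two signals
Sxy = Fx .* conj(Fy); % cross-spectrum of x and y
phi = rad2deg(unwrap(angle(Sxy))); % raw phase differences (degrees) by frequency
phase_slope = diff(phi); % slope of the phase difference across frequency bins
</code>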

The time lag (or lead) between two signals, given a phase slope, is:

\begin{equation}
t_{a-b} = \left[ \frac{\phi_{a-b}(f+df) - \phi_{a-b}(f)}{df} \right] / 360^{\circ}
\label{eq:psi}
\end{equation}

where $t_{a-b}$ is the time lag (or lead) in seconds between signals $a$ and $b$, to be inferred from the phase differences $\phi_{a-b}$ (in degrees) observed at frequencies $f$ and $f+df$. For instance, given a phase difference $\phi_{a-b} = 45^{\circ}$ between signals $a$ and $b$ at $f = 25$ Hz, and $\phi_{a-b} = 36^{\circ}$ at $f = 20$ Hz, $t_{a-b} = [(45-36)/(25-20)]/360 = 5$ ms (the example in panel **A** above). As $df \to 0$, the fraction shown in square brackets above corresponds to the derivative $\phi_{a-b}'(f)$, i.e. the phase slope. Positive time lags indicate that $a$ leads $b$.
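The arithmetic in this worked example is easy to check directly:

<code matlab>
dphi = 45 - 36; % change in phase difference (degrees)
df = 25 - 20; % change in frequency (Hz)
t_ab = (dphi/df)/360; % time lag in seconds: degrees per Hz, converted to cycles
% t_ab = 0.005, i.e. 5 ms, matching panel A above
</code>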

To test how this works, let's generate two signals with an ambiguous Granger relationship:

<code matlab>
nTrials = 1000;

cfg = [];
cfg.ntrials = nTrials;
cfg.triallength = 5;
cfg.fsample = 1000;
cfg.nsignal = 2;

cfg.method = 'linear_mix';
cfg.mix = [1; 0.3]; % X bigger than Y
cfg.delay = [0; 4];
cfg.bpfilter = 'yes';
cfg.bpfreq = [50 100]; % white noise gets filtered in this frequency band
cfg.absnoise = 0.5; % add independent noise to both signals

data = ft_connectivitysimulation(cfg);
data.label = {'X','Y'};
</code>

Note that Y leads X, but X has larger amplitude than Y.

☛ Verify that according to the Granger spectra, there is no evidence to support an asymmetric (Granger-causal) relationship between Y and X. Since we generated the signals with a 4-sample lead for Y, we know this to be incorrect.

Now, let's compute the phase slope. We start with the Fourier decomposition, as before:

<code matlab>
cfg_TFR = [];
cfg_TFR.channel = {'X','Y'};
cfg_TFR.channelcmb = {'X' 'Y'};
cfg_TFR.method = 'mtmfft';
cfg_TFR.output = 'fourier';
cfg_TFR.foi = 1:1:150;
cfg_TFR.taper = 'hanning';

TFR = ft_freqanalysis(cfg_TFR,data);
</code>

But now, we use a different method for the connectivity analysis:

<code matlab>
cfg_psi = [];
cfg_psi.method = 'psi';
cfg_psi.bandwidth = 8; % number of frequencies to compute slope over
cfg_psi.channel = {'X','Y'};
cfg_psi.channelcmb = {'X' 'Y'};

C = ft_connectivityanalysis(cfg_psi,TFR);
</code>

We plot the phase slope between Y and X:

<code matlab>
figure;
plot(C.freq,sq(C.psispctrm(2,1,:)));
xlabel('Frequency'); ylabel('Phase slope');
</code>

The positive phase slope correctly identifies that Y leads X.

☛ What are the units on the vertical axis?

==== Challenges ====

★ If you have your own data with at least two signals that you suspect may be related, identify an appropriate functional connectivity analysis and apply it to the data. Comment on why you chose that method.

★ The "theta" rhythm, which is about 8 Hz in moving rodents, is important in coordinating the spike timing of hippocampal neurons. However, theta frequencies also appear in LFPs recorded from other brain areas, including the prefrontal cortex and the ventral striatum. One hypothesis is that those areas simply "inherit" theta activity from their hippocampal inputs. Test this idea using data from R020, which has electrodes in hippocampus and ventral striatum, and your chosen connectivity analysis method.

analysis/course-w16/week11.1456169679.txt.gz · Last modified: 2018/07/07 10:19 (external edit)

Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Share Alike 4.0 International