analysis:nsb2017:week1

Differences

This shows you the differences between two versions of the page.

analysis:nsb2017:week1 [2017/07/01 18:18]
mvdm
analysis:nsb2017:week1 [2023/04/13 12:21]
Line 1: Line 1:
-~~DISCUSSION~~ 
  
-=== Module 1: Setting up === 
- 
-Goals: 
-  * Set up a working MATLAB installation with appropriate path shortcuts 
-  * Use %%GitHub%% to acquire the analysis code we will use 
-  * Perform some elementary %%GitHub%% operations (pull, add, commit, push) 
-  * Create a well-designed folder structure for your work, and be aware of naming conventions (in process, promoted data) 
-  * Connect to the lab database, download a data set, and test your path setup 
- 
-Resources: 
-  * Recommended: [[http://www.ploscompbiol.org/article/info%3Adoi%2F10.1371%2Fjournal.pcbi.1000424|Noble, A Quick Guide to Organizing Computational Biology Projects]] ([[http://www.ploscompbiol.org/article/fetchObject.action?uri=info%3Adoi%2F10.1371%2Fjournal.pcbi.1000424&representation=PDF|direct link to pdf]]) (yes, yours is one)
-  * Optional: Introduction to version control with %%GitHub%%: [[http://www.youtube.com/GitHubGuides|YouTube videos]], [[http://readwrite.com/2013/09/30/understanding-github-a-journey-for-beginners-part-1|ReadWrite blog]]
-  * More Git and GitHub resources: [[http://git-scm.com/book/en/Getting-Started-Git-Basics|more detailed doc pages]], [[http://stackoverflow.com/questions/tagged/git|git tagged questions on StackOverflow]], [[https://github.s3.amazonaws.com/media/progit.en.pdf|Pro Git manual]] (surprisingly readable)
-  * Optional: MATLAB documentation on [[http://www.mathworks.com/help/matlab/matlab_env/understanding-file-locations-in-matlab.html|File Locations]] and [[http://www.mathworks.com/help/matlab/matlab_env/what-is-the-matlab-search-path.html#br8ch8o|Search Paths]]
-  * Optional: vandermeerlab inventory of [[analysis:basicskills|basic computing skills]]
- 
-Step-by-step:​ 
- 
-=== Installing MATLAB === 
- 
-At MBL, MATLAB should already be installed on lab computers. Verify that it starts correctly: you should see its main window open, including a panel called "Command Window" greeting you with a prompt (''>>'').
- 
-=== Setting up GitHub === 
- 
-Next, we need to obtain some existing MATLAB code that we will build on in this module. To do this, we will use [[http://www.github.com|GitHub]].
- 
-Git is a system for "distributed version control": it keeps track of changes to a set of files, such as pieces of MATLAB code, made by one or more contributors. %%GitHub%% is a website that hosts Git repositories, so that the code and its history can be shared. Together, they make it easy to keep track of evolving code and to share improvements between collaborators. Two typical scenarios in which version control is useful: you want to run the exact code that you used to generate a figure a while ago, but you have since changed the code; or the same analysis suddenly gives a different result and you want to track down which change caused it. If you are new to %%GitHub%%, you can watch the videos under Resources above to get an overall idea of how it works and why it is useful.
- 
-If you don't already have a %%GitHub%% account, go to [[http://www.github.com|GitHub]] and sign up. E-mail me (mvdm at dartmouth dot edu) your account name, so I can give you write access to the code repository we will use.
- 
-Meanwhile, download and install the Git client of your choice if you don't already have one installed. This is a piece of software that will allow you to talk to %%GitHub%%, which is where the code is actually stored. For Windows, I recommend [[https://github-windows.s3.amazonaws.com/GitHubSetup.exe|GitHub Desktop for Windows]] as a user-friendly way to get started. For installing Git and setting up GitHub on various operating systems, see [[https://help.github.com/articles/set-up-git|GitHub: Set Up Git]].
- 
-Next, configure your client. For %%GitHub%% Windows, after starting up the %%GitHub%% %%GUI%% (the default window that opens when you run %%GitHub%%) you'll first need to sign in with your account, then click Tools > Options. Set the "Default Storage Directory" to something reasonable (such as ''D:\My_Documents\GitHub\''). Also check that your username and e-mail address look ok (I am ''mvdm'').
- 
-=== Cloning the module codebase === 
- 
-Now we are ready to use Git to create a local copy ("clone") of the module codebase. If this is your first time using Git, I recommend you open a Git Shell, which you can do by typing ''Git Shell'' in the search box of the Start menu. Once it is open, note your working directory (displayed at the prompt) and type ''git clone https://github.com/mvdm/nsb2017'', which will create a new folder ''nsb2017'' in your working directory.
- 
-As an alternative, you can navigate to the course repository [[https://github.com/mvdm/nsb2017|web page]] on %%GitHub%%. There, click the button "Clone in desktop", and your %%GitHub%% client should prompt you to accept the clone (tested successfully with Chrome, %%YMMV%%). If this fails, you can [[http://joe.blog.freemansoft.com/2014/04/github-clone-to-desktop-with-windows.html|drag]] the URL from your browser into your %%GitHub%% client.
- 
-Now, verify that the above steps have created a ''nsb2017'' folder with various subfolders and files in it, indicating that you have a local copy of the codebase. Because Git is tracking the contents of this folder, it is now easy to "pull" the latest version from %%GitHub%%, either from the command line:
- 
-<code>
-git pull
-</code>
- 
-or by clicking the ''Sync'' button in the %%GUI%%.
- 
-This "​pull"​ should do nothing, because you already have the latest version. The basic idea is that you can stay up-to-date easily as well as contribute to the codebase so that everyone else can benefit. As you might expect, that part is known as a "​push",​ which we will do in the next step. 
- 
-=== A first commit and push ===  
- 
-First, if you haven'​t "done a pull" recently, do one now before starting the next step. 
- 
-Open the ''README.md'' file in the ''nsb2017'' folder. The ''.md'' extension is for %%Markdown%%, a lightweight syntax for formatting text (syntax reference is [[https://help.github.com/articles/markdown-basics | here]]).
- 
-Add your name, affiliation and a brief description of your interests to the list, and save the file. Then go to your Git shell (open one with the gear icon in the top right of your client's %%GUI%%, or use the desktop icon) and type ''git status''. Git has noticed the change, but it says that this change is not yet "staged for commit". In other words, the file is modified, but the modification will not be part of the next commit until you add it. Let's fix this:
- 
-<code>
-git add README.md
-git commit -m "Added name to list in README file"
-</code>
- 
-If you now do a ''git status'' you will see that you are ahead of the origin (the online repository) by 1 commit. This makes sense because you just made a change. Let's push this by doing ''git push''. If you get an "access denied" type error, email me (mvdm at dartmouth dot edu) your %%GitHub%% username and I will give you permission. If everything goes to plan you should now be able to see the updated README file [[https://github.com/mvdm/nsb2017| on GitHub]]. As above, you can also use the %%GUI%% ''Sync'' button to accomplish the same steps (albeit in a less transparent manner).
- 
-A schematic of these basic operations (pull, commit, push) is shown below, using the amazing [[https://www.dokuwiki.org/plugin:graphviz|DokuWiki plugin]] for [[http://www.graphviz.org/|GraphViz]]:
- 
-<graphviz dot center>
-digraph G {
-  remote -> local [label=" pull"];
-  local -> staging [label=" commit"];
-  staging -> remote [label=" push"];
-}
-</graphviz>
- 
-What happens if in between your pull and push someone else pushes a change? In that case you cannot push your changes unless you do a pull first and [[http://stackoverflow.com/questions/161813/fix-merge-conflicts-in-git | resolve any conflicts]]. **In any case, you should make a habit of doing a pull first before starting to edit anything, in order to minimize conflicts.**
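- 
-If you prefer to stay inside MATLAB, one way to build this habit is to run the pull from there. The sketch below is only an illustration and not part of the course code: ''repo_dir'' is a placeholder that you would replace with the folder of your own clone, and it assumes that the ''git'' command-line client can be found on your system path.
- 
-<code matlab>
-repo_dir = pwd; % placeholder (assumption): replace with the folder of your local clone
- 
-% "git -C <folder>" runs the git command as if it had been started in that folder
-[status, msg] = system(sprintf('git -C "%s" pull', repo_dir));
- 
-if status == 0
-    disp(strtrim(msg)); % git's own report of what the pull did
-else
-    warning('git pull did not succeed in %s:\n%s', repo_dir, msg);
-end
-</code>
- 
-If ''repo_dir'' is not a Git repository, or Git is not installed, the call fails and you get the warning instead of a silent error.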
- 
-=== Using GitHub to acquire the FieldTrip toolbox === 
- 
-Using your experience from the previous section, create a local clone of the [[https://github.com/fieldtrip/fieldtrip|FieldTrip toolbox]]. If you are using the command line, make sure that you ''cd'' to your %%GitHub%% folder, i.e. that you are not within some other project such as ''nsb2017'', before cloning. If things worked correctly you should have ''fieldtrip'' and ''nsb2017'' folders within your %%GitHub%% folder; **not** a ''fieldtrip'' folder within your ''nsb2017'' folder!
- 
-We will use this toolbox extensively for the analysis of local field potential (LFP) data. Be aware that it is about 1.2GB in size! 
- 
-Note how using %%GitHub%% to obtain %%FieldTrip%% not only ensures you have the most recent version, but also enables you to easily incorporate future changes, revert to previous versions, etc. using pull and other git tools. 
- 
-=== Configuring MATLAB to use the code from GitHub === 
- 
-Now, we need to tell MATLAB where to find all this code we have just obtained. Open MATLAB and [[http://www.mathworks.com/help/matlab/matlab_env/create-matlab-shortcuts-to-rerun-commands.html | create a shortcut]] titled something like "Neural Data Analysis". The code for the shortcut should be:
- 
-<code matlab>
-restoredefaultpath; clear classes; % start with a clean slate
- 
-cd('D:\My_Documents\GitHub\nsb2017\shared'); % or, wherever your code is located -- NOTE \shared subfolder!
-p = genpath(pwd); % create list of all folders from here
-addpath(p);
- 
-cd('D:\My_Documents\GitHub\fieldtrip'); % or wherever you cloned FieldTrip
-ft_defaults; % FieldTrip's own path setup
-</code>
- 
-This ensures that whenever you click this button, you have a clean **path** (the set of folders, other than the current working directory, whose contents MATLAB can access) of only the MATLAB default plus your local versions of the two %%GitHub%% repositories. 
- 
-:!: When setting your path in MATLAB, add the ''shared'' folder only and //not// the parent folder ''nsb2017''. Adding the entire ''nsb2017'' folder will result in an error when you try to run the ''LoadCSC'' command later in the module.
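- 
-A quick way to check that the shortcut did what you expect is to ask MATLAB where it finds the functions used later in this module. This is a minimal sketch: the two names checked are the ones that appear in this module (''LoadCSC'' and ''ft_defaults''), and you can extend the list as needed.
- 
-<code matlab>
-required = {'LoadCSC', 'ft_defaults'}; % functions this module relies on
- 
-for i = 1:numel(required)
-    location = which(required{i}); % empty if MATLAB cannot find the function
-    if isempty(location)
-        warning('%s is not on the MATLAB path.', required{i});
-    else
-        fprintf('%s found at %s\n', required{i}, location);
-    end
-end
-</code>
- 
-The reported locations should point into your local ''nsb2017\shared'' and FieldTrip folders; if they point elsewhere, an older copy is shadowing the one you just cloned.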
- 
-Optional: if you don't like the ''.git'' folders in your path, you can get clever with [[http://www.mathworks.com/help/matlab/matlab_prog/regular-expressions.html|regular expressions]] to remove these from ''p'' before calling ''addpath(p)'':
- 
-<code>
-p = regexprep(p,'[^;]*\.git[^;]*;',''); % drop each ;-terminated path entry that contains .git
-</code>
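- 
-If regular expressions feel opaque, the same filtering can be written as split, filter, and join. The sketch below uses made-up folder names built with ''fullfile'' so that it runs on any operating system; in the shortcut you would apply the middle three lines to the output of ''genpath'' instead.
- 
-<code matlab>
-% Made-up example path (assumption): four folders, two of them inside .git
-base = fullfile('code', 'shared');
-entries = {base, fullfile(base, '.git'), fullfile(base, '.git', 'objects'), fullfile(base, 'util')};
-p = [strjoin(entries, pathsep) pathsep]; % genpath-style string with a trailing separator
- 
-folders = strsplit(p, pathsep);                         % one cell per folder
-folders = folders(~cellfun(@isempty, folders));         % drop the empty piece after the last separator
-is_git  = ~cellfun(@isempty, strfind(folders, '.git')); % true for folders with .git in their name
- 
-p_clean = strjoin(folders(~is_git), pathsep); % only the non-.git folders remain
-disp(p_clean);
-</code>
- 
-Using ''pathsep'' rather than a literal '';'' keeps the code independent of the operating system's path separator.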
- 
-=== Establish a sensible folder structure === 
- 
-So far, you have local %%GitHub%% repository clones added to MATLAB'​s path. But as you work on a project, you will write your own analysis code. You will also have data files to work with; some that you download as part of these modules, and some that you may have collected yourself. It is important to consider where all of these files will go, and how you will manage them. I recommend using three separate locations: 
- 
-  * //​%%GitHub%% folders//. Files in here you only change (or add) when you can improve what is already there. This content is backed up and version-controlled (i.e. you can see the complete history of changes and revert to any version you want) through the %%GitHub%% system. These files can be shared by multiple different projects, including working through these modules, analysis related to the data you collect, and perhaps a %%PhD%% project! For me, this folder is in ''​D:​\My_Documents\GitHub\''​. 
-  * //Project folders//. Each project has a home folder which holds the code for that project. As explained in the Noble paper linked to above, it can be helpful to create a new folder for each day you work on the project. If you find you are copying certain functions or snippets of code from day to day, those should be moved to the ''​shared''​ folder. It is critical that the contents of this folder are backed up in case of computer failure. I use Dropbox for this, so an example project folder I have is ''​D:​\My_Documents\Dropbox\projects\OccasionSettingNAccRecording\''​. As an alternative,​ you may also set up a %%GitHub%% repository of your own (it's free) so that you can track your progress. Either way, the important point is that you can always find what you did on a given date -- this should work together with your lab (analysis) notes where you keep track of issues, progress, paste figures, et cetera. 
-  * //Data folders//. Data, both raw and preprocessed,​ should live in a different place: ''​D:​\data\''​ in my case. This is because different projects may access the same data, and because backup strategies for data are typically different than for code. 
- 
-With this trifold division, when you want to work on a project, you would click the appropriate MATLAB shortcut for it first. Following the example above, this should add the appropriate %%GitHub%% folders to the path. Next, the ''​shared''​ folder of the project is also added to the path. Data is generally not added to the path, because some data files in different folders may have the same name. Then, you create a new folder with today'​s date, and you are ready to go! Note also that if you want someone else to be able to replicate your results, you need to tell them what path you used. 
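- 
-Creating the dated folder can itself be scripted, for instance at the end of your shortcut. This is a minimal sketch: ''project_dir'' is a placeholder (here a throwaway folder under MATLAB's temporary directory) that you would replace with your own project folder.
- 
-<code matlab>
-project_dir = fullfile(tempdir, 'example_project'); % placeholder (assumption): use your own project folder
- 
-% Folder named after today's date, year first
-day_dir = fullfile(project_dir, datestr(now, 'yyyy-mm-dd'));
- 
-if ~exist(day_dir, 'dir')
-    mkdir(day_dir); % also creates project_dir if it does not exist yet
-end
- 
-cd(day_dir); % start today's work here
-</code>
- 
-Putting the year first means that an alphabetical folder listing is also a chronological one.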
- 
-There are several situations when it is appropriate to move code from your //project folder// to a //​%%GitHub%% folder//: 
- 
-  * you improve a piece of code that was already on %%GitHub%% 
-  * you have a new piece of code in the //shared// project folder that is proving useful 
-  * you reach a milestone, such as an analysis that tests a certain hypothesis 
- 
-If you are an owner or collaborator of a %%GitHub%% repository, you will be able to push changes you make. I will enable this for you on the course repository (if you email me; you need to do this for the editing of the README file, above, to work), but to be accepted as a collaborator on a large project such as %%FieldTrip%%, you will need to show your work to the owners first (as can be done by creating a fork or branch and issuing a [[https://help.github.com/articles/using-pull-requests | pull request]]).
- 
-=== Grab an example data session === 
- 
-Next, let's get some data! Go to the NS&B share and find the ''​ExampleData''​ folder (within the MouseHippocampus folder). 
- 
-For this module you will need the ''R016-2012-10-08'' folder (containing data from one recording session), which you can find in the ''/promoted/R016'' folder. In these names, Rxxx identifies the rat and is followed by the date of the session. A good place to put the folder is in something like ''D:\data\promoted\''. As mentioned, in general you want to keep your data separate from your code; for instance, multiple analysis projects may use the same data, so you don't want to duplicate it.
- 
-The choice of the folder name ''​promoted''​ indicates that these are data folders for which preprocessing is completed. As will be explained further in Module 2 and others, preprocessing typically includes the renaming of raw data files, annotation, spike sorting, and a few other steps. In general, it is useful to keep promoted data separate from data still in process. 
- 
-=== Verify things are working === 
- 
-As explained in the Noble paper linked to above, create a folder with today's date in your project folder. Create a ''sandbox.m'' file in it, click your previously made shortcut to set up the paths, and use [[http://blogs.mathworks.com/videos/2011/07/26/starting-in-matlab-cell-mode-scripts/|Cell Mode]] in the MATLAB editor (type ''edit'' in the Command Window if you don't have one open yet) to check that you can load a data file:
- 
-<code matlab>
-%% load data
-cd('D:\Data\R016\R016-2012-10-08'); % replace this with where you saved the data
- 
-cfg = [];
-cfg.fc = {'R016-2012-10-08-CSC02d.ncs'}; % cell array with filenames to load
-csc = LoadCSC(cfg);
-</code>
- 
-When you execute the above cell (Ctrl+Enter when it is selected in the Editor; Command+Enter on OS X), you should get: 
- 
-<code matlab> 
-LoadCSC: Loading 1 files... 
-LoadCSC: R016-2012-10-08-CSC02d.ncs 44/10761 bad blocks found (0.41%). 
->> csc 
- 
-csc =  
- 
-     tvec: [5498360x1 double] 
-     data: [1x5498360 double] 
-    label: {'R016-2012-10-08-CSC02d.ncs'}
-      cfg: [1x1 struct] 
- 
->> csc.cfg 
- 
-ans =  
- 
-      history: [1x1 struct] 
-          hdr: {[1x1 struct]} 
-      ExpKeys: [1x1 struct] 
-    SessionID: 'R016-2012-10-08'
-</​code>​ 
- 
-What you have loaded is in fact a local field potential recorded from the rat ventral striatum. The different file types and data fields above will be explained in more detail in the next module. For now, let's just take a peek at the data: 
- 
-<code matlab>
-plot(csc.tvec,csc.data);
-xlim([1338.6 1339.2]);
-</code>
- 
-You should see some interesting oscillations -- we will explore these in detail in upcoming modules. If you see this, you have successfully completed this module! 
- 
-{{ :analysis:nsb2014:verify.png?600 |}}
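- 
-As a further sanity check, you can confirm that the loaded structure is internally consistent. The sketch below assumes that ''csc.tvec'' holds one time stamp per sample, in seconds; if no ''csc'' variable exists yet, it builds a small artificial stand-in (an arbitrary 8 Hz sine wave, not real data) so that the code runs on its own.
- 
-<code matlab>
-% Stand-in so this runs on its own; skipped if you already loaded csc above
-if ~exist('csc', 'var')
-    Fs_demo = 2000; % arbitrary sampling rate (Hz), used for the stand-in only
-    t = (0:1/Fs_demo:1)';
-    csc = struct('tvec', t, 'data', sin(2*pi*8*t)'); % Nx1 time stamps, 1xN samples
-end
- 
-% Each sample should have exactly one time stamp
-assert(numel(csc.tvec) == numel(csc.data), 'tvec and data differ in length');
- 
-% Estimate the sampling rate from the typical spacing between time stamps
-Fs = 1/median(diff(csc.tvec));
-fprintf('%d samples, approx. %.1f Hz, %.1f s from first to last time stamp\n', ...
-    numel(csc.data), Fs, csc.tvec(end) - csc.tvec(1));
-</code>
- 
-The median is used rather than the mean so that occasional gaps in the recording do not distort the estimate.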
analysis/nsb2017/week1.txt · Last modified: 2023/04/13 12:21 (external edit)