Project Management with git and R

Staying Organized

The semester is finally over, which means it’s time to put some serious dents into my research projects. I’ve got a couple floating around at this point, but the one I need to think about most is my PhD dissertation. From what I’ve heard it’s quite an intensive ordeal, and given that over the next year I need to write a multi-chapter document with a dissertation committee spread across thousands of miles, I think it might be important to try to stay organized.

A couple of different people I follow on Twitter have suggested that I use Scrivener to keep track of writing and whatnot, but for the past year or two I have really been trying to push myself to commit exclusively to using Free and Open Source Software (FOSS) for everything I do. For some time now I have been thinking about trying to write my entire dissertation using all things R and RStudio, and I think I’m ready to try it. I don’t want to get on a rant here about FOSS and how great R is, but let me just give a few reasons why I want to head down this path as opposed to using proprietary, paid software.

  1. It’s free
  2. Learning the ins and outs of all of this is going to make me a better programmer/researcher over the course of the next year (more skills = more employment options!)
  3. Working with this will help make my work more accessible and reproducible

That said, I’m not only going to try to write my thesis using free software (I’m sure there will be exceptions); I also want to document as much of the process as I can so that anyone else (especially those without computer backgrounds!) can do the same if they ever want to learn this set of tools. I’ve always thought it’s a shame that a lot of this software is not commonplace in the Humanities, and I hope that blogging about how to do it will make it easier for others who want to use it.

So where do you even start?

R Projects and Git(hub)

When starting a new project, the first thing that I normally do is create a place on my computer where the entirety of that project can live, and then link that up to GitHub so I can share what I do with the people I work with.

In this post I will just walk through how to set up a version-controlled project with R via GitHub.

Step One: Create a repository on GitHub

I find it’s easiest when starting a project to start on GitHub first, then copy that onto your computer.

The first thing to do is to log into your GitHub account (assuming you have one; it’s easy to sign up!) and create a new repository (repo) using the little plus at the top right of your home page, as you can see below.

Navigate to New Repository Page

From here you want to give your repo a name, a small description, and tick the box that asks you to initialize it with a README file. The screenshot below shows the page right before I click the green “Create Repository” button.

Setting up your repo

When choosing a name, it’s important to pick something easy to recognize and type. Once you click ‘Create Repository’, it will take you to your repo’s home page, which should look something like this:

The Blank Page

From here you are going to want to click the green ‘Clone or download’ button and then copy that link to your clipboard. The next screenshot shows what that will look like.

Clone the Git HTTPS

With the link copied, you then want to open up your terminal (if you’re a Windows user, make sure to get Git Bash!!) and then navigate to where you want your project to live. I keep all my projects in a directory called projects on my desktop for easy access. From here, you then want to use git to clone your project. If you’re totally new to using this kind of software to do work, you’ll also have to install git. If you don’t have git, this page will show you how to get it via Homebrew at the command line, which, SURPRISE! you also have to download separately!! You can get that here.
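If you are not sure whether git is already on your machine, a quick check from the terminal can save some confusion. This is just a minimal sketch, assuming a Unix-style shell (Terminal on a Mac, or Git Bash on Windows). The variable name git_status is my own, and the brew install git hint only applies if you went the Homebrew route.

```shell
# Check whether git is available on the PATH
if command -v git >/dev/null 2>&1; then
  git_status="installed"
  git --version   # prints the version that is installed
else
  git_status="missing"
  echo "git not found. With Homebrew on a Mac: brew install git"
fi
```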

Side note: when you are first starting out with all things computer-program-tech-whatever, you are going to need to get a lot of software before you can even start to use all the fun stuff. I remember this being particularly discouraging and no fun when I started. If you can, it’s best to try to get someone to slowly walk you through this via some pair programming. If not, don’t be afraid to ask questions when you start!

So getting back to what we were doing, you need to navigate to where you want your project to live (the first line) and then clone your repository by typing git clone yourproject.git, where yourproject.git stands for the link you just copied.

Cloning Your Repo

Now if you look in your projects folder (or just type ls, as in “list”, to see all the files that are in your directory), you’ll notice you have a new directory sent directly from GitHub heaven!
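For anyone who would rather read the steps as text than squint at a screenshot, here is a rough sketch of the navigate, clone, and list sequence. So that it runs without an internet connection, I am assuming a throwaway folder in place of my desktop and a local bare repository in place of the link copied from GitHub. The names sandbox, repo_url, and yourproject are placeholders, not anything git requires.

```shell
# ASSUMPTION: a throwaway folder stands in for the desktop, and a local bare
# repository stands in for the HTTPS link copied from GitHub.
sandbox="$(mktemp -d)"
git init --quiet --bare "$sandbox/yourproject.git"
repo_url="$sandbox/yourproject.git"   # in real use: the link on your clipboard

mkdir -p "$sandbox/projects"
cd "$sandbox/projects"                # navigate to where the project should live
git clone --quiet "$repo_url"         # creates a new folder called yourproject
ls                                    # the new directory shows up here
```

The stand-in repository is empty, so git will warn that you cloned an empty repository. A real repo created with the README box ticked would already contain that README file.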

Step Two: Making It an R Project

Since I’m going to be primarily using R to manage this project, I then need to turn this directory into an R project so that every time I open it, everything is managed through R.

In order to do that, I need to get out of the terminal and open up RStudio. From here you need to go to File > New Project > Existing Directory.

Then navigate to and select the folder you just cloned from GitHub.

If you then tick the box that says to open the project in a new session, you’ll see something like this:

Ta-da! We’ve now made our directory an R project and can save our changes on GitHub.
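If you are curious what actually changed on disk, the main thing RStudio adds is a small file ending in .Rproj inside the project folder. Here is a quick way to check for it from the terminal. The function name has_rproj is something I made up for this post, not a git or RStudio command.

```shell
# has_rproj is a hypothetical helper: it succeeds if the given folder
# contains at least one file ending in .Rproj
has_rproj() {
  ls "$1"/*.Rproj >/dev/null 2>&1
}

# Usage, run from inside the project folder:
if has_rproj .; then
  echo "This folder is an R project"
else
  echo "No .Rproj file here yet"
fi
```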

Step Three: Pushing It to the Cloud

Now all that’s left to do is sync up the changes on our local machine with what is on GitHub. We can bring the terminal back and, assuming that we are in our project’s directory (we can check that by typing pwd in the terminal), enter the regular cycle of updating a GitHub repo with the following commands:

git add .
git commit -m "Whatever you did"
git push 

Or how I did it with my project here:

If we then go back to our project page on GitHub, we should see that our changes have been uploaded!
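To see the whole add, commit, push cycle in one place without touching a real account, here is a sketch you can run offline. I am assuming a local bare repository in place of GitHub, and the name, email address, file name, and commit message are all placeholders. I have also added git status and git log, which are handy for checking what git thinks is going on before and after a commit.

```shell
# ASSUMPTION: a local bare repository stands in for GitHub so this runs
# offline; the name, email, and file below are placeholders.
demo="$(mktemp -d)"
git init --quiet --bare "$demo/remote.git"
git clone --quiet "$demo/remote.git" "$demo/yourproject"
cd "$demo/yourproject"
git config user.name "Your Name"
git config user.email "you@example.com"

echo "notes for chapter one" > notes.txt   # some new work
git status --short                         # lists notes.txt as untracked (??)
git add .                                  # stage everything that changed
git commit --quiet -m "Add chapter one notes"
git push --quiet -u origin HEAD            # -u sets the upstream on the first push
git log --oneline                          # one line per commit
```

With a repo cloned from GitHub that already has a README in it, the upstream is set for you, so a plain git push like the one above is enough.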

I should note that each time you start working on your project you want to run git pull first to make sure that you have the most up-to-date version of the project (for when you are eventually collaborating with other people).
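Here is a small sketch of why pulling first matters. I am assuming a local bare repository playing the part of GitHub and two clones playing "me" and "a collaborator", with every name and file a placeholder. The collaborator pushes a file, and it only lands on my machine once I pull.

```shell
# ASSUMPTION: a local bare repository plays the part of GitHub, and two
# clones play "me" and "a collaborator"; all names are placeholders.
lab="$(mktemp -d)"
git init --quiet --bare "$lab/remote.git"
git clone --quiet "$lab/remote.git" "$lab/me"
git clone --quiet "$lab/remote.git" "$lab/collaborator"

# The collaborator pushes a change while I am away
cd "$lab/collaborator"
git config user.name "Collaborator"
git config user.email "collaborator@example.com"
echo "feedback on chapter one" > feedback.txt
git add .
git commit --quiet -m "Add feedback"
git push --quiet -u origin HEAD
branch="$(git rev-parse --abbrev-ref HEAD)"   # whatever the default branch is called

# Start of my next work session: pull before doing anything else
cd "$lab/me"
git pull --quiet origin "$branch"
ls   # feedback.txt is now on my machine too
```

In a repo cloned from GitHub, a plain git pull does the same job. I only spell out origin and the branch name here because the stand-in repository started out empty.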

In writing this I actually realized there is SO MUCH that underlies this seemingly basic process of starting a project, so if anyone has any questions on this whole process, please let me know and I can try to make this post (and future ones) clearer.