Revision Control

Keeping track of your source code and developing as a team

At some point during your software development career, you've probably run into some form of revision control to keep track of the source code you're developing. A revision control system lets you manage software projects that have multiple team members and many different source code revisions. It keeps track of all the changes made by each team member throughout the development process. Most revision control systems will alert developers when the changes they've made conflict with changes made by someone else. Team members can also create different 'branches' of the development tree to experiment with new features and create altogether new projects. In this document, we'll go over some of the basics of revision control, and get you acquainted with a revision control system that we recommend you use.

There are multiple platforms available that offer revision control. In the past, you might have used revision control systems like CVS or Subversion. The OLPC core development teams use a revision control system called Git. Git was developed by Linux creator Linus Torvalds for managing development of the Linux kernel. Since its public release in 2005, Git has gained a lot of steam in the Open Source/Free Software communities for its ease of use and powerful management capabilities. Since it's the revision control system used by the OLPC core development team, we recommend you maintain your source code using Git.

A note about revision control systems - they're big and complicated. Few developers know how to put to use all of the features provided by a revision control system, Git included. Since it's unlikely that you'll be using the more complicated features, we're going to go over the basics in this section. At the end, we'll provide references in case you want to get your hands dirty. Right now, we're going to focus on the important stuff: installing git, importing a new project into git, making changes to your project, and sharing your project with other team members.

Things to know about Git

Git works a little differently from other revision control systems. A lot of the differences are fairly intuitive and easy to learn. Here are some of the main differences you'll need to know:

  • Repositories work differently. In Subversion and CVS, you usually have a single repository that exists on a server or some remote central place. All of the code and its history is stored on this server, which you check out from (pulling down updates from the server and merging them with your code) or commit to (pushing your code updates and merging them with the code on the server). Git does this a little differently. Each copy of the project will itself be a repository. Each of your team members will have a full repository, and not just a copy of the code, on their machine which they will have full access to. When you commit or update your project, you're committing and updating to the code that's on your machine. If you then want to put this code on a central server for everyone to see, you have to push parts of your repository to the server. This gives each individual user a lot more power over the way changes are made to the project.
  • URLs work differently. In Subversion, the location of the repository is identified by a URL. This URL will have the location of the server and a path inside the repository. In a Subversion repository, you normally have paths like trunk/, branches/ and tags/. In git, the URL contains only the location of the repository. Branches and tags are internal to the system, and are not tracked by different paths. One of these branches that git keeps track of is usually the main branch, called master.
  • Revision tracking is different. Subversion numbers revisions in a linear order. Git identifies revisions with hash IDs. This is a little intimidating at first, but is fairly easy to get used to. The latest revision on the branch you're working on is identified by the name HEAD. Its parent is HEAD^, and its grandparent is HEAD^^ or HEAD~2. You can keep typing carets after HEAD to go further back, but it's easier to use the equivalent syntax, HEAD~n (where n is the number of revisions back). This scheme is handy in case someone makes a huge mistake and you need to go back a few revisions. Typically, git's scheme results in fewer headaches than that of Subversion or CVS.
  • Committing is a little different. Each commit has an author field and a committer field. These record who wrote the change and who committed it. If one of your developers is unable to get to the repository and wants to make a commit, the developer can email you his or her changes and you can update the server while still keeping track of who created the changes. We'll show you in the installation section how to set the name that git records for your commits.
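
To make the HEAD^ and HEAD~n names more concrete, here is a minimal sketch that builds a throwaway repository with three commits and compares the two spellings. The temporary folder, the file name notes.txt, and the name and email are placeholders chosen for this example, not part of any real project.

```shell
set -e
repo=$(mktemp -d)                 # throwaway folder for the experiment
cd "$repo"
git init -q
git config user.name "Example Dev"          # placeholder identity, local to this repo
git config user.email "dev@example.com"

# Make three commits so that HEAD has a parent and a grandparent.
for n in 1 2 3; do
    echo "revision $n" > notes.txt
    git add notes.txt
    git commit -q -m "revision $n"
done

# HEAD^ and HEAD~1 both name the parent; HEAD^^ and HEAD~2 name the grandparent.
parent_caret=$(git rev-parse "HEAD^")
parent_tilde=$(git rev-parse "HEAD~1")
grand_caret=$(git rev-parse "HEAD^^")
grand_tilde=$(git rev-parse "HEAD~2")

git log --format='%h %s'            # one line per commit, newest first
git show -s --format='%s' "HEAD~2"  # message of the grandparent commit
```

Each name resolves to an ordinary hash id, so anywhere git accepts a hash you can use one of these shorthand names instead.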

We see here that the scheme Git uses to keep track of your code is different from Subversion and CVS. Subversion and CVS use a linear, stack-type scheme to keep track of code - new code is pushed to the server and is 'stacked' on top of old code. Git's scheme is like a graph in which each revision is linked to its parents, so lines of development can split apart and join back together. This scheme is much better when you have multiple developers all working on different parts of the project. For instance, if one developer is working on Activity programming with Python, and another is working on developing the UI in Glade, each can commit independently, making collisions much less frequent and easier to rectify.

Installing Git

Installing git is easy. If you're using the development package we demonstrated earlier (see emulation), installing git is as easy as opening your terminal and typing:

$ sudo apt-get install git-core

If you're on Windows or OS X, you can find installation packages on the Git project's website.

Once you've installed git, there are a few things to configure. Launch a terminal, and type in the following information.

$ git config --global user.name "Type in your user name here"
$ git config --global user.email yourname@whatever.com
$ git config --global core.editor "command for your favorite editor"

The first two lines will keep track of your name and email for commits. This is really important for keeping track of who did what, so you'll want to do this immediately. The third line will set your default text editor. This can be eclipse, vim, emacs, textmate, or whatever you use to edit code. This will come up later.
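
As a rough illustration of where the configured name and email end up, the sketch below sets them for a single throwaway repository (without --global, so your real settings are untouched) and then commits a change on behalf of someone else. The names, email addresses, and file name are made up for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
# Repository-local settings: these only apply inside this throwaway repo.
git config user.name "Example Maintainer"
git config user.email "maintainer@example.com"

echo "patch from a teammate" > feature.txt
git add feature.txt
# --author records who wrote the change; the configured user becomes the committer.
git commit -q --author="Example Contributor <contributor@example.com>" -m "Add feature"

author=$(git log -1 --format='%an <%ae>')
committer=$(git log -1 --format='%cn <%ce>')
echo "author:    $author"
echo "committer: $committer"
```

When you commit your own work without --author, both fields are filled in from user.name and user.email.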

Starting a new project

Starting a new project in git is easy. Start by going to the folder of the project you want to start, and type the following commands.

$ cd ~/yourprojectlocation
$ git init
$ git add .
$ git commit -a

git init initializes the repository, git add . adds the current folder and all of its files to the repository, and git commit -a creates the first import (the -a switch means 'all': it commits every change to the files git is tracking). After you type the last command, git will open a text editor so that you can write a brief commit message for your team (if you didn't set your default editor, git falls back to your system's default editor, usually vi or vim). Type in your commit message, save it, and close your text editor. You've just made your first commit.

If opening up a text editor is a pain and you just want to type in a quick message, you can use this syntax:

$ git commit -a -m "your commit message"

This will skip opening your text editor.

Once you've set up your git repository, a subfolder called .git will be created in your project's folder. This will keep track of your project tree. You can poke around in here if you like, but most users will never need to make changes to anything in this folder.
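
Here is one way the whole first import might look from start to finish, as a sketch you can run without touching a real project. The temporary folder stands in for ~/yourprojectlocation, and the two files, the identity, and the commit message are invented for the example.

```shell
set -e
project=$(mktemp -d)              # stands in for ~/yourprojectlocation
cd "$project"
echo 'print("hello")' > activity.py
echo "My first Activity" > README

git init -q
git config user.name "Example Dev"          # placeholder identity, local to this repo
git config user.email "dev@example.com"
git add .
git commit -q -m "Initial import"           # -m avoids opening an editor

ls -d .git                  # the repository itself lives in this subfolder
git ls-files                # the files git is now tracking
git log --format='%h %s'    # a single commit: the initial import
```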

Using Git

Now that you have your repository created, try making a few changes to your project's files. Once you've done that, the following command will show you the changes in your project folder that haven't yet been added or committed to the repository.

$ git diff

This is a very useful command, one that you'll likely find yourself using frequently.
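
The sketch below shows roughly what git diff reports at each stage. Plain git diff compares your working files with what has been staged by git add, while git diff --cached compares the staged files with the last commit. The file name and contents are placeholders.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"          # placeholder identity, local to this repo
git config user.email "dev@example.com"
echo "line one" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

echo "line two" >> notes.txt
git diff                                    # shows "+line two" as an unstaged change
changed=$(git diff --name-only)             # just the names of changed files

git add notes.txt
unstaged=$(git diff --name-only)            # empty now: the change has been staged
staged=$(git diff --cached --name-only)     # staged changes versus the last commit
```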

Now let's say you have a file in your repository that you want to check out. Maybe you've deleted this file, or you want to throw away your local edits and go back to the version in the repository.

$ git checkout path  # replace path with the path to the file you're looking for

This will pull the file from the repository and put it in your project folder, overwriting any uncommitted changes you've made to it.
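
A small sketch of both situations, an unwanted edit and a deleted file, using a made-up file called settings.cfg. The -- before the path is optional; it simply tells git that what follows is a file name rather than a branch name.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"          # placeholder identity, local to this repo
git config user.email "dev@example.com"
echo "good version" > settings.cfg
git add settings.cfg
git commit -q -m "Add settings"

echo "accidental edit" > settings.cfg
git checkout -- settings.cfg      # discard the edit, restore the committed version
after_edit=$(cat settings.cfg)

rm settings.cfg
git checkout -- settings.cfg      # a deleted file comes back the same way
after_delete=$(cat settings.cfg)
```

Because the uncommitted edit is gone for good after the checkout, it is worth running git diff first to see what you are about to throw away.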

Just like in Subversion, you can tell git to add, move, or remove a file in the repository.

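$ git add file         # replace file with the path to the file you're talking about
$ git mv file newfile  # replace newfile with the new name or location of the file
$ git rm file

git add works recursively - if you add a directory, it'll add all files and subdirectories. git mv will move a whole directory as well, but to remove a directory you need to use git rm -r.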
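
The following sketch runs the three commands on a small made-up project tree. Note that git mv takes both the old and the new path, and that removing a directory needs the -r switch. All file and folder names are placeholders.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"          # placeholder identity, local to this repo
git config user.email "dev@example.com"

mkdir -p src/icons
echo "a" > src/main.py
echo "b" > src/icons/logo.svg
echo "c" > old_name.txt
git add src old_name.txt           # adding a directory picks up everything under it
git commit -q -m "Add sources"

git mv old_name.txt new_name.txt   # a rename needs the old and the new path
git rm -q -r src/icons             # -r is required to remove a directory
git commit -q -m "Rename and remove"

git ls-files                       # what git tracks after the changes
```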

Now it's time to commit your changes to the repository.

$ git commit -a

Maintaining your repository

Git provides commands for keeping track of what is changing in your repository. Two useful commands for looking at these changes are log and blame. The log command lists the commits that have been made, most recent first. The blame command shows, for each line of a file, the revision, author, and time of the last change to that line.

$ git log
$ git blame filename

You can also use the log command to search for commits that added or removed a certain string.

$ git log -Sword  # replace word (but not -S) with the string you're searching for
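
As a quick illustration of log, log -S, and blame together, the sketch below makes two commits to a made-up file and then searches for the commit that introduced the word "stop". The file contents and commit messages are invented for the example.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"          # placeholder identity, local to this repo
git config user.email "dev@example.com"

echo "def start(): pass" > activity.py
git add activity.py
git commit -q -m "Add start"
echo "def stop(): pass" >> activity.py
git commit -q -a -m "Add stop"

git log --format='%h %an %s'             # every commit, newest first
found=$(git log -Sstop --format='%s')    # commits that added or removed the string "stop"
git blame activity.py                    # for each line: revision, author, and date
blame_lines=$(git blame activity.py | wc -l | tr -d ' ')
```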

Git also has a powerful tagging system. Tagging is usually used to keep track of milestones in your progress. You can tag your progress with words like roughdraft, alpha, beta, v1.0, or whatever you like.

$ git tag -a tagname  # replace tagname with the name of your new tag
$ git show tagname    # replace tagname with the tag you want to look at

The tag command tags the current revision of your project. This will again open up a text editor to prompt for your tag message. You can again avoid this by using the -m switch. Other useful switches are:

$ git tag -l               # list all of the tags
$ git tag -d tagname       # delete the tag
$ git tag -F file tagname  # read the tag message from file instead of opening an editor
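
Here is a minimal sketch of tagging two milestones, listing them, and deleting one. A tag marks a whole revision rather than an individual file, which the rev-parse lines try to show. The tag names, file, and messages are placeholders.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"          # placeholder identity, local to this repo
git config user.email "dev@example.com"

echo "draft" > README
git add README
git commit -q -m "First draft"
git tag -a roughdraft -m "Rough draft milestone"   # tags the current revision

echo "v1" > README
git commit -q -a -m "Release"
git tag -a v1.0 -m "Version 1.0"

git tag -l                                        # lists roughdraft and v1.0
tagged=$(git rev-parse 'roughdraft^{commit}')     # the revision the tag points to
first=$(git rev-parse "HEAD~1")                   # the "First draft" revision

git tag -d roughdraft                             # delete a tag; the commit itself stays
remaining=$(git tag -l)
```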

Branching and merging are very easy in git. Branching creates a new line of development, allowing you to make changes that won't affect the master branch (the equivalent of Subversion's trunk). A branch doesn't automatically pick up later commits made on the master; to bring it up to date, you merge the master into it. Merging is the act of taking a branch and combining its changes back into the master (aka trunk).

Merging and branching give the developer freedom to create new portions of the project to experiment with. If you're looking to add a new feature and are unsure how it might affect the project, or if you're experimenting with something and want to keep that experimentation separate from the project, branching is the way to go. If your changes wind up working out great, you can merge the branch back with the master. Git will handle most of the specifics of branching and merging for you, leaving you with a lot of freedom to implement new features.

Branching and merging in git is done like so:

$ git branch            # this will list all of the current branches
$ git branch branchname # replace branchname with the name of your new branch
$ git checkout branch   # replace branch with the name of the branch you want to check out

$ git merge branchname    # merge the branch named branchname into the branch you have checked out (usually the master). git will alert you if there are conflicts.
$ git cherry-pick commit  # replace commit with a hash id: applies the changes from that single commit to the current branch
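
The sketch below walks through one possible branch, merge, and cherry-pick cycle in a throwaway repository. The branch names experiment and fixes and all file names are invented for the example. The main branch name is read from git rather than typed in, since it is master in this document but may be different on newer installations.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"          # placeholder identity, local to this repo
git config user.email "dev@example.com"
echo "base" > app.txt
git add app.txt
git commit -q -m "Base"
main_branch=$(git symbolic-ref --short HEAD)   # usually master

# Work on an experimental branch without touching the main branch.
git branch experiment
git checkout -q experiment
echo "new feature" > feature.txt
git add feature.txt
git commit -q -m "Add experimental feature"

# Back on the main branch the new file is absent until we merge.
git checkout -q "$main_branch"
if [ -e feature.txt ]; then before_merge=present; else before_merge=absent; fi
git merge -q experiment

# cherry-pick copies one commit, not a whole branch.
git checkout -q -b fixes
echo "fixed" > fix.txt
git add fix.txt
git commit -q -m "Fix typo"
echo "wip" > wip.txt
git add wip.txt
git commit -q -m "Work in progress"
git checkout -q "$main_branch"
git cherry-pick "fixes~1"          # apply only the "Fix typo" commit

git log --graph --oneline --all    # draw the branches as a graph
```

After the last step the main branch has the feature and the typo fix, while the unfinished work stays on the fixes branch.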

In CVS and Subversion, branching and merging projects tends to be a tedious affair, making it difficult for users to take advantage of these capabilities. The opposite is the case with git - merging and branching are usually very easy and thus most git users like to take advantage of these capabilities. You might not ever want or need to use these capabilities, but we recommend you learn to take advantage of them because it gives each developer a lot more creative freedom with the project.

Working with a remote server

One of the important things to recognize at this point (in case it wasn't made clear before) is that the repository you've been working with up to now has been a local repository. That means all of the changes, commits, branches, etc have been made only on your local machine and have not been committed to the server.

Let's reiterate at this point how git is different from other revision control systems like Subversion and CVS - each git team member is keeping a local copy of the repository on their machine. In Subversion, users are directly committing and checking out from the server. In git, the committing and checking out happens on the local machine. If you want to share your work over a server (which you'll want to do if you're working with multiple people on multiple machines), you have to push or pull changes to and from the server.

First, let's assume that the project you've been working on might already exist on a server.

$ cd /.../yourprojectdirectory
$ git clone url   # This will make a clone of the project from url in a new folder inside the current directory
$ # the url you'll provide will usually be the ssh path for a server. It might look like this:
$ git clone ssh://netID@login.oit.duke.edu/~/path/to/repo.git

Now you have a copy of the current project from the server. Now let's say you've made a few changes, and you want your friend to be able to access your code. After committing your changes to the local version of your repository, you'll need to push the changes to the server.

$ git push

All of the changes you committed will now be sent to the remote server. Unlike the commit command, push doesn't open your text editor. If someone else has pushed changes that you don't have yet, git will refuse the push and ask you to pull down their changes first.

Now let's say you've been away for a little while and you want to pull down the updates from the repository.

$ git pull

It's as easy as that.
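
To try the clone, push, and pull cycle without access to a real server, you can let a bare repository on your own disk play the server's role. That substitution is an assumption of this sketch, as are the developer names alice and bob and the file shared.txt. With a real server you would use its ssh url in place of the local path.

```shell
set -e
base=$(mktemp -d)
cd "$base"
git init -q --bare server.git      # stands in for the remote server

# First developer: clone, commit, push.
git clone -q server.git alice      # git warns that the repository is empty; that's fine
cd alice
git config user.name "Alice Example"
git config user.email "alice@example.com"
echo "shared code" > shared.txt
git add shared.txt
git commit -q -m "Add shared code"
branch=$(git symbolic-ref --short HEAD)   # usually master
git push -q origin "$branch"

# Second developer: clone the project after the first push.
cd "$base"
git clone -q server.git bob

# The first developer pushes another change...
cd "$base/alice"
echo "more shared code" >> shared.txt
git commit -q -a -m "Extend shared code"
git push -q origin "$branch"

# ...and the second developer pulls it down.
cd "$base/bob"
git pull -q
git log --format='%h %an %s'
```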

Further Reading

This document covers only the basics of using git to manage your projects. Git is a large, powerful version control system, and has lots of features that were beyond the scope of this section. Git has had a recent surge in popularity resulting in dozens of tutorials and excellent manuals. Here are a few.

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License