get out of town

manifesto

Git is concise. Concise does not mean short.

Well crafted history documents your source's purpose and direction.

While git can cure mistakes, good design decisions prevent them.

When in doubt, branch.

guide

prelude

Git is a ridiculously powerful, often misunderstood tool. It can bring teams to a veritable nirvana of productivity, or silo a single expert developer in frustration. For a happy git state to be reached, the whole team must pursue a moderate level of understanding of the tool, and come to an agreement on how to interact with each other. This covers four major areas: organization, feature development, deployment/release strategies, and lastly, what to do when everything goes awry.

This guide will not attempt to instruct the budding git-ite in every detail; there are a multitude of powerful man pages and Stack Overflow questions to deal with that. It merely attempts to stake out a standard for working in teams.

I assume the following: that the reader has git installed and working, and has basic terminal/unix knowledge. A little knowledge of the git object model would not hurt.

If you need a quick intro on what git is and what it can do for you, see the resources below.

organization

commits

The most crucial part of organizing and maintaining a good project history is effective and descriptive commits. Each commit should represent one coherent change, and should have a strong commit message that thoroughly documents what went into the change. No commit should have side effects. If you discover some fix along the way, make two commits, one for each change. You can split up changes to the same file using git add -p. A common instance of this is a quick clean up of code (removing trailing spaces or adding formatting) while working on other changes. Separate the two kinds of changes for clarity.

The commit message is the most important part of the commit itself. It should represent your motives, your references, and your reasons as to why this change is entering the codebase at all. Good commit message style is well known, but for the sake of completeness:

A strong title message, less than 80 chars

A commit message description, if needed. *AND IT OFTEN IS*. If you are fixing
more than a typo or changing some color, you need to list why and how you came
to this point. Who asked you to do it, why you are doing it here, are there
patterns you followed, code you copied. Below it, I like to list out
references, man pages, stack overflow answers I used to make my decision.

References:
- http://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html

Among the git source team, they take git commit messages to an entirely different level: they treat the commits as documentation and part of the source itself. I have heard of pull requests being denied because the commits contained new phrases or words that brought confusion to the terminology in the plumbing of git. Let me repeat: the code was fine, but the commit was deemed unworthy.

Part of the reason for such pedantry is that they treat commits kind of like lawyers treat case law. They make the case for a new change based on the arguments that have been merged in already, just like a good attorney defends his client using prior cases.

I really like this approach, because it causes the programmer to think long and hard about what he/she is adding, and to carefully consider the design and thought that went into the history of the codebase, not just what is there now.

For more information, the best documentation on writing good git commits comes from OpenStack.

branching and forking

For reasons explained below in the deployment section, it's crucial to maintain a clean deployment branch. Traditionally this is dubbed master, and because systems like Heroku expect this naming convention, we'll stick to it. Master should be as straight as possible, and every commit should be deployable. That means that no untested, work-in-progress code should ever be committed to master. All development, no matter how trivial, should be on a separate branch. In fact, you shouldn't be working on any branch that other people are committing to. Working on a feature, and another developer joins the same task? One of the two of you (probably him) should branch or fork.

When a task or feature is in mind, you make a branch (git checkout -b [branchname]). Your branchname should be concise but descriptive, as it will help inform what is in progress on the project. If you are adding a required field to a registration form, a good branch name is add-require-to-registration; bubbas-branch or v14 is not acceptable.

(Some groups like to use ticket numbers for branch names, so the above is not a hard and fast rule. An employer of mine uses committer-initials/ticket-number, and uses a really clever script to decode what is what.)

As you work on your branch, you are free to do whatever you want. Feel like trying a different path? Branch off your branch. Need to overwrite a commit? Go right ahead. Make lots of little WIP commits as you debug a script? Fine by me.

* 34f6b96  (1 second)    <Evan Travers>   (HEAD, feature-branch-1) still working...
* 896c5b1  (18 seconds)  <Evan Travers>   WIP on txt
* 3bc2744  (35 seconds)  <Evan Travers>   make sure that this is on
* 18c2f95  (44 seconds)  <Evan Travers>   git has entered the room
this is bad. we should nuke it from orbit.

Just fix it, clean it, and document it before it is submitted to be merged into master.

A clean, master-ready branch is: a clean line of single-purpose, well documented commits, with all WIP or temp commits cleaned up by squashing or git reset --soft.

* 09239a9  (23 minutes)  <Evan Travers>   (HEAD, feature-branch-1) remove unneeded code, clean the /assets folder
* 745928d  (24 minutes)  <Evan Travers>   complete feature #1
* 18c2f95  (24 minutes)  <Evan Travers>   git has entered the room
a git rebase -i or git reset --soft later, and this is ready for master.
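As a rough sketch of the git reset --soft route, the script below rebuilds a history shaped like the messy one above in a throwaway repository and collapses the WIP commits into one. The file names and contents are placeholders; only the commit messages come from the logs shown here.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"

echo "hello" > README
git add README
git commit -q -m "git has entered the room"
base=$(git rev-parse HEAD)                # the commit just before the messy work

git checkout -q -b feature-branch-1
for msg in "make sure that this is on" "WIP on txt" "still working..."; do
  echo "$msg" >> notes.txt
  git add notes.txt
  git commit -q -m "$msg"
done

# Move the branch pointer back to $base. The working tree is untouched and
# every change from the three WIP commits stays staged.
git reset --soft "$base"
git commit -q -m "complete feature #1"

git log --oneline
```

This only rewrites commits on your own branch. If someone else has already pulled those WIP commits, rewriting them will cause them pain.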

Just remember that changing history is like a painkiller. It's ok to use it occasionally when you are hurting, but if you use it often it can be a sign of a problem.

A typical path to cleaning up your branch: git rebase master will ensure that you handle merge conflicts before the merge. If you've been bad, you may have to do some work here, especially if you've let your branch get too long. After that point, you can use git rebase -i to squash WIP commits, reword badly named commits, and generally clean up.
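A small sketch of the first half of that path, in a throwaway repository: master moves on while you work, and git rebase master replays your branch on top of it. File names and messages are invented for the example, and there is no conflict here, so the rebase runs straight through.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"

echo "hello" > README
git add README
git commit -q -m "git has entered the room"
git branch -M master                      # make sure the branch is called master

# Your feature branch gets a commit...
git checkout -q -b add-require-to-registration
echo "email is required" > registration.txt
git add registration.txt
git commit -q -m "require email on registration form"

# ...and meanwhile master gets one too.
git checkout -q master
echo "hotfix" > hotfix.txt
git add hotfix.txt
git commit -q -m "hotfix for login page"

# Replay the feature commits on top of the current master.
git checkout -q add-require-to-registration
git rebase -q master

git log --oneline --graph
# Next step, interactively: git rebase -i master
```

When there are conflicts, the rebase stops at the offending commit. Fix the files, git add them, and run git rebase --continue (or git rebase --abort to back out).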

A good way to think about preparing your branch for merging is to think about preparing a patch. That's what you are doing. Creating a concise, useful patch, with documentation and the changes listed out so that if anyone wants to know what you changed and why they can find out at a glance.

workflow

Here is an example of an ideal-ish workflow, for a single developer working on a feature.
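Sketched as commands, in a throwaway repository, that flow might look roughly like this. The branch name, file, and messages are illustrative, and the fast-forward-only merge at the end is my assumption about how to keep master straight, not a rule from any particular flow.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"
echo "hello" > README
git add README
git commit -q -m "git has entered the room"
git branch -M master

# 1. Branch for the task.
git checkout -q -b add-require-to-registration

# 2. Work, committing as messily as you like.
echo "email field" > registration.txt
git add registration.txt
git commit -q -m "WIP on txt"
echo "email is required" >> registration.txt
git commit -q -a -m "still working..."

# 3. Catch up with master (a no-op here, since master has not moved).
git rebase -q master

# 4. Clean up: collapse the WIP commits into one documented commit.
git reset --soft master
git commit -q -m "require email on registration form"

# 5. Bring it into master and remove the branch.
git checkout -q master
git merge -q --ff-only add-require-to-registration
git branch -q -d add-require-to-registration

git log --oneline
```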

Check out git common flow for more information.

deployment

When your feature branch is really fully prepared, tested, and vetted by all relevant authorities, it's time to bring it into the fold. If your shop follows a release pattern, it'd be a good idea to tag it now.

Heroku deploys whatever you push to its master branch, which is a very good reason to keep that master branch clean.
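A tiny sketch of tagging a release, in a throwaway repository. The version number and message are placeholders, and the push at the end is left as a comment because it assumes a remote named heroku that this example does not create.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"
echo "hello" > README
git add README
git commit -q -m "git has entered the room"

# An annotated tag records who tagged the release, when, and why.
git tag -a v1.0.0 -m "first release"

git tag -l
git describe

# Deploying would then be something like:
#   git push heroku master
```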

apocalypse

There are a few worst case scenarios, and quite a few tangled git-object trees that are possible. Some will be in the FAQ, but most problems can be solved using the all-powerful git rebase -i. There are many and varied guides on this topic, but the main principle is this: run git rebase -i [reference], where [reference] precedes (or wraps, if that helps you) the commits you want to change. You'll be presented with the history in the reverse of git log's order: oldest at the top, newest at the bottom. You have several useful options for each commit; all are explained in the man page.
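Normally git rebase -i opens the todo list in your editor and you change pick to squash, fixup, reword, and so on by hand. To show the mechanics in a script that runs unattended, the sketch below swaps the editor for a sed one-liner through GIT_SEQUENCE_EDITOR. That substitution is only a trick for this example, not how you would usually work; the file name is a placeholder.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"
echo "hello" > README
git add README
git commit -q -m "git has entered the room"
base=$(git rev-parse HEAD)                # the reference preceding the mess

for msg in "make sure that this is on" "WIP on txt" "still working..."; do
  echo "$msg" >> notes.txt
  git add notes.txt
  git commit -q -m "$msg"
done

# The todo list has one "pick" line per commit, oldest first. Keep the first
# as pick and turn the rest into fixup, which folds each into the commit
# above it and throws its message away.
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2,\$s/^pick/fixup/'" git rebase -q -i "$base"

git log --oneline
```

Afterwards the three commits are one, carrying the first commit's message. To also rewrite that message, mark the first line reword in a real interactive session.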

Just remember, if you find yourself in dire straits, it's sometimes because someone irresponsibly mangled history, but more often because of runaway commits that are too long or have side effects.

advanced theory

For those questions that escape into the esoteric, and for those who want to escalate from user to master. Which should be all of you.

turn your world upside down

To understand git, you have to gain some knowledge of unix philosophy. In particular, you should be thinking about git in relation to two main paradigms. One: everything should be a file. Two: simple, chainable applications are incredibly powerful once you master the toolset. If you have spent some time chaining tools together in bash (ls -la | grep pdf, anyone?), you'll have some idea of what I'm talking about.

So how does this help? Well, in one perspective-changing fashion: git is not a program. git is a file structure. The git tool isn't even really a program. It's a set of small tools designed to help you manage, filter and view the git system. This was a groundbreaking idea for me; I had been trying to understand the git system through the tools, not the other way around.

So you should be expending your efforts to understand the .git folder, and once you understand that you'll have a much better handle on what the tools are and how they should be used. If you wanted to be a total butterfly programmer, you could use git entirely using cat, touch, mkdir. Just saying.
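If you want to poke at that folder yourself, here is a small sketch in a throwaway repository. It assumes git's default on-disk layout, where a freshly committed branch is a plain file under .git/refs/heads; packed refs or other ref storage formats would look different.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"
echo "hello" > README
git add README
git commit -q -m "git has entered the room"

ls .git                                   # HEAD, config, objects, refs, ...

# HEAD is a one-line text file naming the current branch.
cat .git/HEAD

# The branch itself is a file holding the sha of its latest commit.
branch=$(git symbolic-ref --short HEAD)
cat ".git/refs/heads/$branch"
git rev-parse HEAD                        # same sha, via the tool this time
```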

To learn more, check out this chapter of the git book.

see the forest because of the trees

This is a gross over-simplification, but your .git folder contains four basic object types: Trees, blobs, commits, and tags. What are these things?

The simplest is the blob. It holds some data, usually the full contents of one version of a file. Git stores whole snapshots rather than diffs; the diffs you see are computed by comparing snapshots. Always remember that git tracks state, not just files, so that your history is the story of your changes building upon one another.

The second building block is the tree. And I don't mean our happy leafy oxygen providing friends, I mean computer science style. Therefore it's a connected series of nodes, these nodes being... you guessed it. Objects. A tree is like a directory on your computer. It lists names and points each one at a blob (a file) or at another tree (a subdirectory). The timeline of your project doesn't live in the trees themselves; it comes from commits, which we'll get to next.

The last piece of the puzzle, and the one you are most used to using, is the commit. The commit refers to a particular tree, pointing to it and describing it, and it also points back to its parent commit(s). So a commit is a reference to a snapshot of your project, plus the history that led up to that point. That's it.

Lastly you have tags. Tags are exactly what they sound like: a named label attached to an object (usually a commit) so that you can select/find it quickly.
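You can ask git what kind of object something is with git cat-file. The sketch below builds a throwaway repository with one file in a subdirectory and one annotated tag (paths and names are placeholders), then looks up one object of each of the four types.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"
mkdir src
echo "hello" > src/app.txt
git add src/app.txt
git commit -q -m "git has entered the room"
git tag -a v0.1 -m "first tag"

# -t prints the object's type.
git cat-file -t HEAD                      # commit
git cat-file -t 'HEAD^{tree}'             # tree
git cat-file -t HEAD:src/app.txt          # blob
git cat-file -t v0.1                      # tag

# -p pretty-prints the object's contents.
git cat-file -p HEAD                      # tree sha, author, committer, message
git cat-file -p 'HEAD^{tree}'             # one entry: a tree named src
git cat-file -p HEAD:src/app.txt          # hello
```

Only annotated tags (git tag -a) are objects of their own. A lightweight tag is just a ref pointing straight at a commit.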

walk the line, and the tree

Why does this matter? Because it gives us a framework to describe the git concepts we interact with using the git tool. If you remember you are dealing with this tree of objects, it's easier to visualize how changes percolate through the system, especially if you go all Back to the Future and decide to change the past to influence the future's course.

To learn more, check out the amazing and very well written Git Internals chapter of the Pro Git book.

fork and branch

Why would you ever fork? Isn't branching enough?

Where is fork in the git commands? Yep. You can go check. Not there. Fork is a construct we the community use to describe the process of cloning a repository to our local system for changes. If you think about it, every time you clone your own repository, you are forking the remote.

In this way, you should think of your GitHub account as your actual development presence. When you go to work on the project, you clone (fork) it to your account. You then clone your own repository to your computer, and work on branches and whatnot as normal. When you push to your GitHub repo, you are backing up your work and allowing your GitHub repo to behave as the gatekeeper between the master repo and your fork. Of course, there are many, many workflows possible; this is just a normal-ish one, sometimes called github flow.
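Here is a sketch of that arrangement using nothing but local directories, so it runs without a GitHub account. The names upstream, fork.git, and local are stand-ins I picked for the master repo, your fork on the server, and the clone on your computer.

```shell
set -e
# Placeholder identity for every repository created below.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
work=$(mktemp -d) && cd "$work"

# "upstream" stands in for the project's main repository.
git init -q upstream
(cd upstream && echo "project" > README && git add README && git commit -q -m "initial commit")

# The fork: a server-side copy that you own.
git clone -q --bare upstream fork.git

# Your working copy is a clone of the fork, with the original as a second remote.
git clone -q fork.git local
cd local
git remote add upstream ../upstream
git fetch -q upstream

# Work on a branch and push it to the fork, not straight to upstream.
git checkout -q -b add-require-to-registration
echo "email is required" > registration.txt
git add registration.txt
git commit -q -m "require email on registration form"
git push -q origin add-require-to-registration

git remote -v
```

From here, getting the branch into upstream is a matter of asking its owner to pull from your fork, which is what a pull request is.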

faq

Why merge rather than rebase new features into my feature branch?
Let's make sure we are on the same page. You've been working hard on your feature branch, and it's been some time. You find it's necessary to grab a new commit from another branch, whether it's master or someone else's in-progress feature. Simply can't live without it. You think to yourself: "I'm going to rebase my branch anyway to clean it up for master... why should I merge now, rather than rebasing my branch on top of the new feature?" The answer: deniability. When you rebase, it messes with the history, and doesn't clearly show where you pulled in the code from another branch. When you merge in the code from another branch, it will create a merge commit at the point that you pulled the code in. Imagine that you pull in a bug that breaks the layout in IE8... hunting down your coworker's code that is now splayed out throughout your history is going to be a difficult task. Wouldn't it be nice if that code you just pulled in was contained in a single commit, at the point in history where you pulled it in? Yep, that's a merge commit. So merge in that feature if you must, but merge, don't rebase. It's for your own protection.
My head is detached, now what?
Stay away from guillotines. A far more complete faq can be found here.
I've created too many WIP commits and need to clean them up, help!
git reset --soft [sha1] will undo all commits after the sha1 but keep their changes staged. I'll pick the commit before my work, reset back to that, use git reset . to unstage all the work, and then construct commits using git add --all --patch to add just the related work, slowly building the commits I should have made from the start.
What git "flow" do you recommend?
At the moment, I like git common flow.
My SCM guy is going to kill me when he finds out I [insert problem here].
Calm down. Follow the directions. (gorgeous diagram courtesy of the amazing Justin Hileman.)
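To illustrate the first answer above (merge, don't rebase, when pulling someone else's work into your feature branch), here is a sketch in a throwaway repository. Branch names, files, and messages are invented. The point is the single merge commit that marks exactly where master's work came in.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example Dev"        # placeholder identity
git config user.email "dev@example.com"
echo "hello" > README
git add README
git commit -q -m "git has entered the room"
git branch -M master

# Your feature branch has its own work...
git checkout -q -b feature-branch-1
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "start feature #1"

# ...and master gains the commit you simply can't live without.
git checkout -q master
echo "layout" > layout.txt
git add layout.txt
git commit -q -m "new layout helper"

# Pull it in with a merge; --no-edit accepts the default merge message.
git checkout -q feature-branch-1
git merge -q --no-edit master

git log --oneline --graph
git log --oneline --merges                # the one commit where master came in
```

If that pulled-in code turns out to be the culprit, the merge commit gives you a single place to look, and a single commit to revert.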

terms

For all the useful terms, check out gitglossary.

resources
