manifesto
Git is concise. Concise does not mean short.
Well crafted history documents your source's purpose and direction.
While git can cure mistakes, good design decisions prevent them.
When in doubt, branch.
guide
prelude
Git is a ridiculously powerful, often misunderstood tool. It can bring teams to a veritable nirvana of productivity, or silo a single expert developer in frustration. For a happy git state to be reached, the whole team must pursue a moderate level of understanding of the tool, and come to an agreement on how to interact with each other. This includes a few major areas: organization, feature development, deployment/release strategies, and lastly, what to do when everything goes awry.
This guide will not attempt to instruct the budding git-ite in every detail; there are a multitude of powerful man pages and Stack Overflow questions to deal with that. This is merely an attempt to stake out a standard for working with teams.
I assume the following: that the reader has git installed and working and has basic terminal/unix knowledge. A little knowledge of the git object model would not hurt either.
If you need a quick intro on what git is and what it can do for you, see the resources below.
organization
commits
The most crucial part of organizing and maintaining a good project history is effective and descriptive commits. Each commit should represent one coherent change, and should have a strong commit message that thoroughly documents what went into the change. No commit should have side effects. If you discover some fix along the way, make two commits, one for each change. You can split up changes to the same file using git add -p. A common instance of this is a quick clean up of code (removing trailing spaces or adding formatting) while working on other changes. Separate the two kinds of changes for clarity.
The commit message is the most important part of the commit itself. It should represent your motives, your references, and your reasons as to why this change is entering the codebase at all. Good commit message style is well known, but for the sake of completeness:
    A strong title message, less than 80 chars

    A commit message description, if needed. *AND IT OFTEN IS*. If you are
    fixing more than a typo or changing some color, you need to list why and
    how you came to this point. Who asked you to do it, why you are doing it
    here, are there patterns you followed, code you copied. Below it, I like
    to list out references, man pages, stack overflow answers I used to make
    my decision.

    References:
    - http://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html
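If you would rather not open an editor, one way to write a message of this shape is to feed it to git commit -F - from a heredoc. This is only a sketch: the file, the identity, and the story told in the message are invented for illustration.

```shell
#!/bin/sh
# Sketch: a title, a body explaining why, and a references list.
set -eu
cd "$(mktemp -d)"                       # throwaway repository
git init -q
git config user.name "Example Dev"      # placeholder identity
git config user.email "dev@example.com"

echo "email: required" > registration.txt   # hypothetical change
git add registration.txt

# -F - reads the whole commit message from standard input.
git commit -q -F - <<'EOF'
require an email address on the registration form

Support asked for this because accounts without an email cannot
reset their password. The check follows the same pattern as the
existing username validation.

References:
- http://tbaggery.com/2008/04/19/a-note-about-git-commit-messages.html
EOF

git log -1 --format='%s'    # the title line
git log -1 --format='%b'    # the description and references
```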
The git source team takes git commit messages to an entirely different level: they treat the commits as documentation and part of the source itself. I have heard of pull requests being denied because the commits contained new phrases or words that brought confusion to the terminology in the plumbing of git. Let me repeat: the code was fine, but the commit was deemed unworthy.
Part of the reason for such pedantry is that they treat commits kind of like lawyers treat case law. They make the case for a new change based on the arguments that have been merged in already, just like a good attorney defends his client using prior cases.
I really like this approach, because it causes the programmer to think long and hard about what he/she is adding, and to carefully consider the design and thought that went into the history of the codebase, not just what is there now.
For more information, this is the best documentation on writing good git commits from OpenStack.
branching and forking
For reasons explained below in the deployment section, it's crucial to maintain a clean deployment branch. Traditionally this is dubbed master, and because systems like Heroku expect this naming convention, we'll stick to it. Master should be as straight as possible, and every commit should be deployable. That means that no untested, work-in-progress code should ever be committed to master. All development, no matter how trivial, should be on a separate branch. In fact, you shouldn't be working on any branch that other people are committing to. Working on a feature, and another developer joins the same task? One of the two of you (probably the newcomer) should branch or fork.
When a task or feature is in mind, you make a branch (git checkout -b [branchname]). Your branchname should be concise, but descriptive, as it will help inform what is in progress on the project. If you are adding a required field to a registration form, a good branch name is add-require-to-registration; bubbas-branch or v14 is not acceptable.
(Some groups like to use ticket numbers for branch names, so the above is not a hard and fast rule. An employer of mine uses committer-initials/ticket-number, and uses a really clever script to decode what is what.)
As you work on your branch, you are free to do whatever you want. Feel like trying a different path? Branch off your branch. Need to overwrite a commit? Go right ahead. Make lots of little WIP commits as you debug a script? Fine by me.
    * 34f6b96 (1 second) <Evan Travers> (HEAD, feature-branch-1) still working...
    * 896c5b1 (18 seconds) <Evan Travers> WIP on txt
    * 3bc2744 (35 seconds) <Evan Travers> make sure that this is on
    * 18c2f95 (44 seconds) <Evan Travers> git has entered the room
Just fix it, clean it, document it before it is submitted to be merged into feature-branch-1
A clean, master-ready branch is: a clean line of single-purpose, well documented commits, with all WIP or temp commits cleaned up by squashing or git reset --soft.
    * 09239a9 (23 minutes) <Evan Travers> (HEAD, feature-branch-1) remove unneeded code, clean the /assets folder
    * 745928d (24 minutes) <Evan Travers> complete feature #1
    * 18c2f95 (24 minutes) <Evan Travers> git has entered the room
One git rebase -i or git reset --soft later, and this is ready for master. Just remember that changing history is like a painkiller. It's OK to use it occasionally when you are hurting, but if you use it often it can be a sign of a problem.
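Here is a minimal sketch of the git reset --soft route, assuming a feature branch cut from master with nothing but WIP commits on it. The repository, file name, and identity are placeholders; the commit messages are borrowed from the log listings above.

```shell
#!/bin/sh
# Sketch: collapse a run of WIP commits into one documented commit.
set -eu
cd "$(mktemp -d)"                          # throwaway repository
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is master
git config user.name "Example Dev"         # placeholder identity
git config user.email "dev@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "git has entered the room"

# Messy work on the feature branch.
git checkout -q -b feature-branch-1
for msg in "make sure that this is on" "WIP on txt" "still working..."; do
  echo "$msg" >> app.txt
  git commit -q -a -m "$msg"
done

# Move the branch pointer back to master; the combined changes stay staged.
git reset --soft master

# Record the work again as a single, properly described commit.
git commit -q -m "complete feature #1"

git log --oneline
```

Nothing is lost in the working tree; only the intermediate commits disappear from the branch.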
A typical path to cleaning up your branch: git rebase master will ensure that you handle merge conflicts before the merge. If you've been bad, you may have to do some work here, especially if you've let your branch get too long. After that point, you can use git rebase -i to squash WIP commits, reword badly named commits, and generally clean up.
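The two rebase steps can be sketched in a throwaway repository like this. Normally git rebase -i opens your editor; here GIT_SEQUENCE_EDITOR is pointed at a sed one-liner that rewrites the todo list, purely so the example runs unattended. File names, identity, and messages are placeholders.

```shell
#!/bin/sh
# Sketch: rebase a feature branch onto master, then fold its WIP commits.
set -eu
cd "$(mktemp -d)"                          # throwaway repository
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"         # placeholder identity
git config user.email "dev@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "git has entered the room"

# Feature work: one real commit followed by two WIP commits.
git checkout -q -b feature-branch-1
echo "one" > feature.txt
git add feature.txt
git commit -q -m "complete feature #1"
echo "two" >> feature.txt
git commit -q -a -m "WIP"
echo "three" >> feature.txt
git commit -q -a -m "WIP again"

# Meanwhile, master moves on.
git checkout -q master
echo "docs" > README
git add README
git commit -q -m "add README"

# Step 1: replay the branch on top of master. Conflicts would surface here.
git checkout -q feature-branch-1
git rebase -q master

# Step 2: keep the first "pick", turn every later one into "fixup".
GIT_SEQUENCE_EDITOR="sed -i.bak -e '2,\$s/^pick/fixup/'" git rebase -q -i master

git log --oneline
```

In the editor you would make the same change by hand, and could also use reword or squash where a message needs attention.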
A good way to think about preparing your branch for merging is to think about preparing a patch. That's what you are doing. Creating a concise, useful patch, with documentation and the changes listed out so that if anyone wants to know what you changed and why they can find out at a glance.
workflow
Here is an example of an ideal-ish workflow, for a single developer working on a feature.
- Identify the task.
- git pull (make sure no one else has done it!).
- git checkout -b task (make a new branch for this task).
- Make concise commits describing and encapsulating your changes. They should have descriptive and complete messages, following good git commit message practices.
- If your feature goes on for a sufficient time, and you discover you need code from another development branch, do not rebase. If you absolutely must, merge in that feature, noting carefully the reason for your merge. (Why, you ask?)
- Before you craft your pull request, rebase on your target branch to solve merge conflicts locally... this way your patch will merge cleanly.
- Are you done? Ran your tests and QA'd on staging? Time to join the mothership.
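The steps above might look roughly like this on the command line. It is a sketch under assumptions: a local bare repository stands in for the shared remote, the branch name is the one suggested earlier, and --ff-only is added to git pull just to keep the pull from creating a surprise merge.

```shell
#!/bin/sh
# Sketch: one developer, one task, one branch.
set -eu
work=$(mktemp -d)
cd "$work"

# Stand-in for the team's shared remote (normally a hosted repository).
git init -q --bare origin.git
git --git-dir=origin.git symbolic-ref HEAD refs/heads/master

git clone -q origin.git project 2> /dev/null   # hides the empty-repo warning
cd project
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"             # placeholder identity
git config user.email "dev@example.com"
echo "registration form" > form.txt
git add form.txt
git commit -q -m "git has entered the room"
git push -q -u origin master

# 1. Make sure you are up to date.
git pull -q --ff-only

# 2. Branch for the task.
git checkout -q -b add-require-to-registration

# 3. Commit in small, described steps.
echo "email: required" >> form.txt
git commit -q -a -m "require an email address on registration"

# 4. Before asking for a merge, rebase on the target branch.
git fetch -q origin
git rebase -q origin/master

# 5. Publish the branch for review.
git push -q -u origin add-require-to-registration

git log --oneline
```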
Check out git common flow for more information.
deployment
When your feature branch is really fully prepared, tested, and vetted by all relevant authorities, it's time to bring it into the fold. If your shop follows a release pattern, it'd be a good idea to tag it now.
Heroku deploys whatever you push to its master branch, which is a very good reason to keep that master branch clean.
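A small sketch of that last step, assuming the feature branch has already been rebased onto master so the merge can be a fast-forward. The tag name v1.0.0 and the remote name in the final comment are hypothetical.

```shell
#!/bin/sh
# Sketch: bring a finished branch into master and tag the release.
set -eu
cd "$(mktemp -d)"                          # throwaway repository
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"         # placeholder identity
git config user.email "dev@example.com"

echo "v0" > app.txt
git add app.txt
git commit -q -m "git has entered the room"

git checkout -q -b feature-branch-1
echo "feature" >> app.txt
git commit -q -a -m "complete feature #1"

# --ff-only refuses anything that would bend master's straight line.
git checkout -q master
git merge -q --ff-only feature-branch-1

# An annotated tag marks the release point.
git tag -a v1.0.0 -m "release 1.0.0"
git describe

# Deploying would then be a push, for example:
#   git push heroku master    # assumes a remote named "heroku" is configured
```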
apocalypse
There are a few worst case scenarios, and quite a few tangled git-object trees that are possible. Some will be in the FAQ, but most problems can be solved using the all-powerful git rebase -i. There are many and varied guides on this topic, but the main principle is this: run git rebase -i [reference] with a reference preceding (or wrapping, if that helps you) the commits you want to change. You'll be presented with the history in reverse order compared to git log: oldest at the top, newest at the bottom. You have several useful options for each commit; all are explained in the man page.
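To see the todo list and its ordering without changing anything, one trick is to use cat as the sequence editor: it prints the list and leaves it untouched, so the rebase replays the commits exactly as they were. This is a sketch in a throwaway repository with made-up commit messages.

```shell
#!/bin/sh
# Sketch: print the interactive rebase todo list without editing it.
set -eu
cd "$(mktemp -d)"                       # throwaway repository
git init -q
git config user.name "Example Dev"      # placeholder identity
git config user.email "dev@example.com"

for msg in "git has entered the room" "older change" "newer change"; do
  echo "$msg" >> app.txt
  git add app.txt
  git commit -q -m "$msg"
done

# The reference precedes the commits we want to look at.
base=$(git rev-parse HEAD~2)

# cat prints the todo file; git then carries on with the unchanged list.
todo=$(GIT_SEQUENCE_EDITOR=cat git rebase -i "$base" 2> /dev/null)

# Only the command lines, without the help text git appends.
printf '%s\n' "$todo" | grep '^pick'
```

The first pick line is the older commit and the last is the newer one, the opposite of what git log shows.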
Just remember, if you find yourself in dire straits, it's sometimes because someone irresponsibly mangled history, but more often because of runaway commits that are too long or have side effects.
advanced theory
For those questions that escape into the esoteric, and for those who want to escalate from user to master. Which should be all of you.
turn your world upside down
To understand git, you have to gain some knowledge of unix philosophy. In particular, you should be thinking about git in relation to two main paradigms. One: everything should be a file. Two: simple, chainable applications are incredibly powerful once you master the toolset. If you have spent some time chaining tools together in bash (ls -la | grep pdf, anyone?), you'll have some idea of what I'm talking about.
So how does this help? Well, in one perspective-changing fashion: git is not a program. git is a file structure. The git tool isn't even really a program. It's a set of small tools designed to help you manage, filter and view the git system. This was a groundbreaking idea for me; I had been trying to understand the git system through the tools, not the other way around.
So you should be expending your efforts to understand the .git folder, and once you understand that you'll have a much better handle on what the tools are and how they should be used. If you wanted to be a total butterfly programmer, you could use git entirely using cat, touch, mkdir. Just saying.
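A quick way to convince yourself that the .git folder is just files: poke around a fresh repository and store one object by hand. This sketch assumes git's default on-disk layout (loose objects under .git/objects), and the stored content is arbitrary.

```shell
#!/bin/sh
# Sketch: look inside .git and watch one object appear as a file.
set -eu
cd "$(mktemp -d)"        # throwaway repository
git init -q

ls .git                  # HEAD, config, objects/, refs/, ...
cat .git/HEAD            # a one-line text file naming the current branch

# Store some content as an object; git prints the object's id.
oid=$(echo "hello" | git hash-object -w --stdin)

# The object is a file whose path is derived from that id.
path=".git/objects/$(echo "$oid" | cut -c1-2)/$(echo "$oid" | cut -c3-)"
ls -l "$path"

# The file is compressed, so ask git to print it back.
git cat-file -p "$oid"
```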
To learn more, check out this chapter of the git book.
see the forest because of the trees
This is a gross over-simplification, but your .git folder contains four basic object types: Trees, blobs, commits, and tags. What are these things?
The simplest is the blob. It holds some data, usually the contents of a file. Git stores whole snapshots rather than diffs (a diff is computed by comparing two snapshots), but for now we'll just consider a blob as your changes. Always remember that git tracks state, not just files, so that your history is the story of your changes building upon one another.
The second building block is the tree. And I don't mean our happy leafy oxygen providing friends, I mean computer science style. Therefore it's a connected series of nodes, these nodes being... you guessed it. Objects. At its simplest, a tree can be a single line, beginning from your first file, and continuing on in a series of changes from that point on. Like a timeline. A tree is also like a directory on your computer. It references a bunch of other trees and/or blobs.
The last piece of the puzzle that you are most used to using is the commit. The commit refers to a particular tree, pointing to it and describing it. So a commit is a reference to a collection of changes that describe your history at that point. That's it.
Lastly you have tags. Tags are exactly what they sound like: a short name attached to an object (usually a commit) so that you can select or find it quickly.
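You can ask git what kind of object any name refers to with git cat-file -t. The sketch below creates one of each type in a throwaway repository; the file name, tag name, and identity are placeholders. Note that only an annotated tag (git tag -a) is stored as a tag object.

```shell
#!/bin/sh
# Sketch: one blob, one tree, one commit, one tag, and how to inspect them.
set -eu
cd "$(mktemp -d)"                       # throwaway repository
git init -q
git config user.name "Example Dev"      # placeholder identity
git config user.email "dev@example.com"

echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m "add greeting"
git tag -a v0.1 -m "first tag"

# Print the type behind each name: blob, tree, commit, tag.
for rev in HEAD:greeting.txt 'HEAD^{tree}' HEAD v0.1; do
  printf '%-18s %s\n' "$rev" "$(git cat-file -t "$rev")"
done

# A commit points at a tree; a tree lists blobs (and other trees).
git cat-file -p HEAD
git cat-file -p 'HEAD^{tree}'
```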
walk the line, and the tree
Why does this matter? Because it gives us a framework to describe the git concepts we interact with using the git tool. If you remember you are dealing with this tree of objects, it's easier to visualize how changes percolate through the system, especially if you go all Back to the Future and decide to change the past to influence the future's course.
To learn more, check out the amazing and very well written Git Internals chapter of the Pro Git book.
fork and branch
Why would you ever fork? Isn't branching enough?
Where is fork in the git commands? Yep. You can go check. Not there. Fork is a construct we the community used to describe the process of cloning a repository to our local system for changes. If you think about it, every time you clone your own repository, you are forking the remote.
In this way, you should think of your github account as your actual development presence. When you go to work on the project, you clone (fork) it to your repository. You then clone your own repository to your computer, then work on branches and whatnot as normal. When you push to your github repo, you are backing up your work and allowing your github to behave as the gatekeeper between the master repo and your fork. Of course, there are many, many workflows possible; this is just a normal-ish one, sometimes called github flow.
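Since a fork is just a clone with a social life, the whole arrangement can be imitated locally. In this sketch, bare repositories on disk stand in for the project's repository and for your fork on the hosting service; the directory names and the remote name upstream are conventions I am assuming, not anything git requires.

```shell
#!/bin/sh
# Sketch: project repo -> your fork -> your working clone, all on local disk.
set -eu
base=$(mktemp -d)
cd "$base"

# Something to fork: a project with one commit.
git init -q seed
cd seed
git config user.name "Example Dev"      # placeholder identity
git config user.email "dev@example.com"
echo "hello" > README
git add README
git commit -q -m "git has entered the room"
cd "$base"

git clone -q --bare seed upstream.git       # the project's repository
git clone -q --bare upstream.git fork.git   # "forking" is cloning it server-side

# Your computer: clone your fork, which becomes the remote "origin".
git clone -q fork.git work
cd work

# Keep a second remote pointing at the original project.
git remote add upstream "$base/upstream.git"
git fetch -q upstream

git remote -v
```

You push branches to origin (your fork) and fetch from upstream to stay current.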
faq
- Why merge rather than rebase new features into my feature branch?
- Let's make sure we are on the same page. You've been working hard on your feature branch, and it's been some time. You find it's necessary to grab a new commit from another branch, whether it's master or someone else's in-progress feature. Simply can't live without it. You think to yourself: "I'm going to rebase my branch anyway to clean it up for master... why should I merge now, rather than rebasing my branch on top of the new feature?" The answer: deniability. When you rebase, it messes with the history, and doesn't clearly show where you pulled in the code from another branch. When you merge in the code from another branch, it will create a merge commit at the point that you pulled the code in. Imagine that you pull in a bug that breaks the layout in IE8... hunting down your coworker's code that is now splayed out throughout your history is going to be a difficult task. Wouldn't it be nice if that code you just pulled in was contained in a single commit, at the point in history where you pulled it in? Yep, that's a merge commit. So merge in that feature if you must, but merge, don't rebase. It's for your own protection.
- My head is detached, now what?
- Stay away from guillotines. A far more complete faq can be found here.
- I've created too many WIP commits and need to clean them up, help!
- git reset --soft [sha1] will undo all commits back to the sha1 but keep the changes. I'll pick the commit before my work, reset back to that, and then use git reset . to unstage all the work, and construct commits using git add --all --patch to add just the related work, slowly building the commits I should have made from the start.
- What git "flow" do you recommend?
- At the moment, I like git common flow.
- My SCM guy is going to kill me when he finds out I [insert problem here].
- Calm down. Follow the directions. (gorgeous diagram courtesy of the amazing Justin Hileman.)
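For the merge-versus-rebase question above, here is a small sketch of what the merge commit buys you. Branch names, files, and messages are invented for illustration.

```shell
#!/bin/sh
# Sketch: pull another branch into a feature branch with a merge commit.
set -eu
cd "$(mktemp -d)"                          # throwaway repository
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"         # placeholder identity
git config user.email "dev@example.com"

echo "base" > app.txt
git add app.txt
git commit -q -m "git has entered the room"

# A coworker's in-progress feature.
git checkout -q -b other-feature
echo "helper" > other.txt
git add other.txt
git commit -q -m "add the helper"

# Your own feature, started from master.
git checkout -q -b my-feature master
echo "mine" > mine.txt
git add mine.txt
git commit -q -m "start my feature"

# Pull in the coworker's work and say why in the merge message.
git merge -q --no-ff -m "merge other-feature: my feature needs its helper" other-feature

git log --oneline --graph

# The borrowed code is pinned to one spot in your history.
git log --merges --format='%h %s'
```

If the borrowed code turns out to be the culprit, that one merge commit is where to look.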
terms
for all the useful terms, check out gitglossary.
resources
- git homepage
- git commit guide
- pro git book
- git common flow
- git immersion
- try git online
- git branching model
- my example .gitconfig
- changing history, and living with it
- please stay away from rebase
- Martin Fowler's Bliki: version control
- using pull requests
- useful git-ftp tool
- deploying wp sites via git
- neatest git ui I have seen so far
- the difference between head and the working tree
- your commits should tell a story (great talk on crafting commits)
- how to submit git changes by email rather than pull-request
- conventional commits - A specification for adding human and machine readable meaning to commit messages