Authoring Git Commits, for Professional Software Developers
When entering the field of software development, we fully understand that we must learn to write code. What isn't generally understood from the outset is that there are a myriad of other, less well-known topics and techniques that must also be mastered. One of the most important is the use of a Version Control System ('VCS').
As you may already know, a VCS allows developers to track changes in a group of files. While it might be fun and educational to build our own simple VCS from scratch, we really don't need to. A smart guy named Linus has already created the de facto tool of choice for software professionals today: git.
The purpose of this article is to help new (<= 5 years professional experience) software developers author better git commits.
I will assume that you have at least tinkered with git and are familiar with, and have used, the following commands:
Ok, let's dive in.
Commits are the life-blood of a git repository. Each is like an entry in an ever-growing journal detailing a project's development history. So, let's talk about creating and working with commits.
We'll start with a simple principle:
Know exactly what you are committing (before doing so).
Before running git commit, always run git status and git diff first! Or, if you use a git GUI, actually look at the diff before clicking the button!
Worst-case scenario: Many a developer has mistakenly committed and pushed private credentials to an open-source remote. (Been there, done that...)
More likely scenario: You added println statements or other temporary debugging aids while coding, which you meant to remove before committing. Avoid the unnecessary hassle and embarrassment of a git revert by always looking at the diff before committing.
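The review habit described above can be sketched as a short shell session. This is only an illustration under stated assumptions: it runs in a throwaway repository, and the file name app.py, the identity, and the commit messages are made up for the example.

```shell
#!/bin/sh
set -eu

# Throwaway repository so the example is safe to run anywhere.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"        # hypothetical identity
git config user.email "dev@example.com"   # hypothetical identity

printf 'print("hello")\n' > app.py
git add app.py
git commit -q -m "Add greeting script"

# Simulate a work session that leaves a debugging aid behind.
printf 'print("hello")\nprint("DEBUG: got here")\n' > app.py

git status --short        # which files changed?
git --no-pager diff       # what exactly changed? The DEBUG line shows up here.

# Remove the leftover, then stage and review what will actually be committed.
printf 'print("hello")\nprint("goodbye")\n' > app.py
git add app.py
git --no-pager diff --staged   # final look at the staged content
git commit -q -m "Print a farewell message"
```

Note that git diff shows unstaged changes, while git diff --staged shows what the next commit will contain, so it is worth running both once you start staging selectively.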
The video below offers a good example of using a git GUI to review changes before committing.
Now, git doesn't tell you how or when to commit, or how big commits should be, so I will: Commits should happen more frequently and should be smaller in size than you probably think. The more frequently they come, and the smaller they are, the better! We're talking about potentially dozens of commits per day!
If you're doing TDD (test-driven development) you should be committing every time a new test or new suite of tests is passing. If you're doing TCR (test && commit || revert) the commits will appear automatically and you'll actually have to squash them into larger, but still bite-sized, chunks. But what if you've been hacking for several hours and have a massive diff on your hands? (Happens to the best of us...)
Slice up large batches of un-staged/un-tracked changes into bite-sized commits.
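Here is one way that slicing might look on the command line. It is a sketch, not a prescription: the repository is a temporary one, and the file names (wordcount.py, README.md), identity, and messages are hypothetical.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"        # hypothetical identity
git config user.email "dev@example.com"   # hypothetical identity

printf 'def count(text):\n    return len(text.split())\n' > wordcount.py
printf 'Usage notes\n' > README.md
git add .
git commit -q -m "Initial word counter and README"

# Several hours of hacking later: two unrelated changes are sitting in the tree.
printf 'def count(text):\n    return len(text.strip().split())\n' > wordcount.py
printf 'Usage notes\n\nPass one line of text at a time.\n' > README.md

# Stage and commit one purpose at a time instead of reaching for `git add .`.
git add wordcount.py
git commit -q -m "Strip surrounding whitespace before counting"

git add README.md
git commit -q -m "Document one-line-at-a-time usage"

# When unrelated changes share a single file, `git add -p` lets you stage
# individual hunks. It is interactive, so it is left out of this script.
git --no-pager log --oneline
```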
Why all this fuss about small commits? Well, the tiny changes contained in small, single-purpose commits can each be amended, reverted, or re-ordered independently, and they can even be squashed into larger commits. Once squashed, though, those individual operations are no longer available.
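To make two of those operations concrete, the sketch below amends the most recent commit and then reverts it, all in a temporary repository. The file settings.ini, its contents, and the identity are invented for the example.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"        # hypothetical identity
git config user.email "dev@example.com"   # hypothetical identity

printf 'timeout = 30\n' > settings.ini
git add settings.ini
git commit -q -m "Add default timeout"

printf 'timeout = 30\nretries = 3\n' > settings.ini
git commit -q -a -m "Add retry setting"

# Amend: fold a correction into the most recent (still unpushed) commit.
printf 'timeout = 30\nretries = 5\n' > settings.ini
git commit -q -a --amend -m "Add retry setting with five attempts"

# Revert: undo exactly that one change by adding a new, inverse commit.
git revert --no-edit HEAD

git --no-pager log --oneline
```

Because the retry change lives in its own commit, reverting it leaves the timeout setting untouched. Had both settings been committed together, the revert would have removed both.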
Dave Thomas and Andy Hunt, in their classic book The Pragmatic Programmer, called version control 'a giant undo button--a project-wide time machine that can return you to those halcyon days...when the code actually compiled and ran.'
Still not convinced?
Martin Fowler echoed this sentiment in his book on refactoring in which he confesses: "I commit after each successful refactoring, so I can easily get back to a working state should I mess up later."
A commit after each successful refactoring--that's a lot of commits! If I've lost you at this point, let me reassure you that you don't have to push a stream of such tiny commits (more on that later). I'll also say that the so-called 'problem' of too many commits is a lot like the 'problem' of 'too many layers of indirection' in that it is usually not a problem.
Here's a demonstration of how I use Sublime Merge (my preferred git GUI) to go about staging a group of changes each in their own commit (that video is a bit long but the first minute is probably enough to get the gist):
If, like Jessica, the manager in "O Foreman, Where art Thou?", you and your team members review each and every commit pushed by every other team member (and you really should!), you can begin to leverage git commit messages as a communications tool. You can provide deeper context for decisions and refer to articles or blogs that helped you in your coding. Commit messages can serve as conversation starters that advance the state of the art for your organization.
Before moving on, let's pause for a moment to recognize that the act of composing a commit message is similar in nature to naming a variable or describing a test case. Giving names to things doesn't require deep knowledge of mathematics or computer science but it can be deceptively difficult! Sometimes I have to take my hands off the keyboard, sit back, and just think for a moment or two before a good commit message appears in my mind. Sometimes I write something only to rewrite it seconds later. It's worth a few moments of your time to author a decent message.
Use commit messages to communicate!
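As a rough illustration, a commit message with a short subject line and an explanatory body can be written straight from the command line by passing -m more than once; git joins the values as separate paragraphs. The file, the change, and the wording of the message below are all hypothetical.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"        # hypothetical identity
git config user.email "dev@example.com"   # hypothetical identity

printf 'cache_parsed_config = true\n' > app.conf
git add app.conf

# First -m is the subject line; the second becomes the body paragraph.
git commit -q \
  -m "Cache parsed config between requests" \
  -m "Parsing the config on every request showed up in profiling. Keep the parsed result in memory and reload it only when the file changes."

# Show the full message exactly as teammates will read it.
git --no-pager log -1 --format=%B
```

Running git commit with no -m at all opens your editor instead, which is usually the more comfortable way to write a longer body.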
Ok, so you've logged a few commits; all that's left is to push, right? Well, pushing is significant in that once commits are pushed, they become part of the history for anyone who might git pull from the moment the changes are uploaded. If you alter the commit history in any way once it's been pushed, everyone else on the team will have to repair their local copies (by resetting or rebasing onto the rewritten branch or, at worst, re-cloning the repository)!
So, all of that amending, re-ordering, and squashing we talked about earlier is off limits once the commits have been pushed. My recommendation is to push regularly and often (multiple times a day!), but not so often that you don't have a chance to consider a group of commits for squashing and re-ordering, etc...
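One non-interactive way to squash a run of local, unpushed commits is a soft reset followed by a fresh commit, sketched below in a temporary repository. The file slug.py, the identity, and the messages are made up. The interactive tool for squashing and re-ordering is git rebase -i, which is left out here only because it needs an editor.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"        # hypothetical identity
git config user.email "dev@example.com"   # hypothetical identity

printf 'Project notes\n' > README.md
git add README.md
git commit -q -m "Add README"
base=$(git rev-parse HEAD)   # pretend this is the last commit already pushed

# Three tiny local commits made while working.
printf 'def slugify(text):\n    return text.lower()\n' > slug.py
git add slug.py
git commit -q -m "Add slugify"

printf 'def slugify(text):\n    return text.strip().lower()\n' > slug.py
git commit -q -a -m "Strip whitespace"

printf 'def slugify(text):\n    return text.strip().lower().replace(" ", "-")\n' > slug.py
git commit -q -a -m "Replace spaces with hyphens"

# Squash: move the branch back to $base but keep every change staged,
# then record the combined result as a single commit.
git reset --soft "$base"
git commit -q -m "Add slugify helper"

git --no-pager log --oneline
```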
As we have learned from all great time-travel movies: don't alter history!
(If you absolutely have to, you can alter the history by using git push --force... Let the reader beware!)
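If you do end up rewriting pushed history, git also offers --force-with-lease, a more cautious variant that is not part of the advice above: it refuses to overwrite the remote branch if that branch has moved since you last saw it. The sketch below uses a local bare repository as a stand-in for a real remote; the paths, branch name, identity, and file are assumptions for illustration.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
git init -q --bare "$work/remote.git"     # stand-in for a hosted remote
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/main     # name the branch regardless of defaults
git config user.name "Example Dev"        # hypothetical identity
git config user.email "dev@example.com"   # hypothetical identity
git remote add origin "$work/remote.git"

printf 'v1\n' > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git push -q origin main

# Rewrite a commit that has already been pushed.
git commit -q --amend -m "Add project notes"

# A plain push is rejected because the rewritten commit is not a fast-forward.
if git push -q origin main 2>/dev/null; then
  echo "unexpected: plain push succeeded"
  exit 1
fi

# Overwrite the remote branch, but only if it still matches what we last saw.
git push -q --force-with-lease origin main
```

Teammates who already pulled the old commit still have to reconcile their copies afterwards, so this reduces the risk of clobbering someone else's work without removing the cost of rewriting history.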
Authoring commits in git (or any other VCS) is an important part of your work as a software developer. What will you do to make your commits communicate more effectively? I'll bet you've had a few ideas come to mind as you've read this article. Implement them without delay! Your team will thank you. Developers who inherit your code in the future will thank you. Your future self will even thank you!