Better Code Reviews with git History Rewriting
1 | # Rewriting history with git |
The following post is a walkthrough on how I take a larger pull request, and break it down into well-ordered commits. At Mozilla, there is a pretty strong code review culture. Each line of code added to a project gets reviewed by a peer before it is merged in. There can be several cycles of code review where changes are requested. This happens asynchronously, as it’s a globally distributed organization. It can be tough coming into this environment from more of an agency dev environment, where anything goes. However, in order to ship quality code, the code review is an excellent process to find bugs, enforce consistency, and to spread the knowledge of new code across the team.
Large PRs are hard to review. When this happens, it’s frustrating for the reviewer. It’s hard to follow the logic of lots of changes. Despite this, it sometimes makes sense to send in a large PR, especially with large architectural changes. In order to get back a quality review, I like to split my larger code chunks into smaller ones.
Here’s how I do it.
The code
For this example, I’m going to illustrate how I take code I wrote a couple of weeks ago, and rewrite its history. Right now, it’s a single commit. But often, I have lots of messy commits. In this case, if I can’t make sense of my commit history, I will squash it all into one.
The code I’m working with is adding keyboard shortcuts for a feature called “call tree transforms”. It adds labels to the context menu showing the shortcuts, and generally cleans up the shortcuts to be more readable. The shortcuts work across several different panels in the profiler.
The current commit history:
1 | 24200dfa (HEAD -> transform-shortcuts) Add call tree transform shortcuts |
The branch transform-shortcuts at commit 24200dfa is my work, and a8c2aaa9 is the main branch that is merged in upstream (which coincidentally is also my work from earlier.) The first thing I want to do is start an interactive rebase before my current commit.
Interactive rebasing
Normally when I rewrite my history, I use an interactive rebase. This will let me rewrite and re-order my commits The command (which I will type out in full) is as follows.
1 | # Begin an interactive rebase, which will present a screen where the history of |
However, I want to work before my current commit, so I’ll add the ^ modifier to main.
1 | # main^ refers to the commit before main. |
My last commit was a merge commit. I’ll truncate the list a bit, but it looks like this in the interactive rebase page.
1 | # ... |
I want to edit the commit before my current 24200dfa.
1 | # ... |
I save and exit vim with :wc.
1 | Stopped at e7ea6e08... Add a bit more test coverage for empty or null marker schema |
Now, I want to restore my work, and begin chopping it up. To do this, I use the handy git restore command. Historically, this was also the git checkout.
1 | git restore . --source transform-shortcuts |
My command will take the fully finished branch, and apply it to my working directory.
1 | ~/dev/profiler on profiler/(no branch, rebasing transform-shortcuts) |
Coming up with the commit plan
Now begins the fun work. I take the list of files here, and copy them over to my editor to begin creating a commit plan of action. This is a relatively small changeset compared to some I’ve done this way, but it really helps to be explicit here.
1 | res/css/style.css |
I run git diff and read through my code changes. I annotate each file listing with what happened in that file.
1 | res/css/style.css |
Now I re-order the commit messages in the order I want them
1 | Fix some minor styling issues with the context menu |
Normally, I start working through my list, and delete items as I do them, but for this tutorial I will write up a complete plan on how to commit each one.
1 | git add res/css/style.css |
git add --patch
The most interesting line to call out here is:
1 | git add --patch src/components/shared/CallNodeContextMenu.css |
My CSS file changes included some that I wanted to attribute to two different commits. Here I use the --patch flag to signal to git that I want to decide which parts to add. --patch is available for a variety of commands, including the git restore command used earlier. It is the command that really makes me feel like I’m in control of moving chunks of code around.
1 | --- a/CallNodeContextMenu.css |
Here is an example of the decision I can make. At the end it asks me a question on how I want to proceed. Entering ? will list out the full commands available.
1 | y - stage this hunk |
I chose y to “Yes, stage this hunk”. Another useful one is to split the hunks into smaller ones.
For example, my next code hunk included some that I wanted, and some that I didn’t for this commit.
1 |
|
Here I chose s, for splitting the hunk up.
1 | Split into 2 hunks. |
Finally when I was done I chose q for quitting.
1 | (1/1) Stage this hunk [y,n,q,a,d,e,?]? q |
For more advanced situations, I like to use the e option to directly edit the diff of the hunk. This is helpful to break apart edits that are on the same line, or that won’t split apart nicely into hunks.
The committed code
After running through all of my commit plan, I have a list of well-ordered commits.
1 | 437f93c3 (HEAD) Teach the flame graph how to use transform shortcuts |
I’m still in the interactive rebase mode, and my working directory is now clean.
1 | git rebase --continue |
Since I touched a merge commit in my history editing, I’ll run one final rebase against the main branch.
1 | git rebase main |
This will ensure the only changes are my new ones.
1 | b8339103 (HEAD -> transform-shortcuts) Add call tree transform shortcuts |
My original squashed commit is there: Add call tree transform shortcuts. It only contains the unneeded whitespace changes. I’ll get rid of it with a git reset --hard 44278a4d – although I could also do a git rebase --interactive main and drop the commit that way.
1 | b8339103 (HEAD -> transform-shortcuts) Add call tree transform shortcuts |
Now my commit history tells a nice story about what I’m adding with this new feature. I’ve taken the time to look through all of the code. Each commit is well-ordered. I can now go through and verify I have adequate test coverage for each commit, as the new changes are much more granular. I might go back and add more detailed, multi-line commit messages where it makes sense.
I add a new commit to the top of my stack with a test I was missing, and it’s ready for review.
1 | git commit --message 'Add a test for call tree transform keyboard shortcuts' |
The examples I gave of git commands were the full unabbreviated versions. I do this kind of work most days, so I have a list of aliases I prefer to use in order to do a bit less typing.
1 | alias g="git" |
Why it’s worth the effort
In an organization with a strong focus on code quality, taking the time to break up commits like this results in a much smoother review process. In the past, I’ve sent in big messy commits, and a detailed reviewer struggled through it. They did not have a fun time with it, and then I struggled with figuring out how to apply the feedback. This happened multiple times. Once I took the time to learn how to nicely split up my commits, even if I sent in a large stack of commits to review, the conversation was much smoother. Sometimes it’s fine to struggle through a few tough rounds of code review, but an even better outcome is that the quality of the review received is much better. The reviewer can more easily read the code, and the decisions to modify things along the way. It brings more intentionality to the process.
This rewriting does take time, at times several hours, or even a full morning. This usually happens for projects that I’ve been working on for more than a week. For example, I can take a big piece of complicated architectural work, and split it out into 20 commits. The process to write that code was messy and confusing. However, after splitting it up into logical chunks of code, a clear story can emerge.
Update: other tools
As I get feedback from this post, I’ll share it below.
Another option beyond git rebase -i is git revise. It includes some quality improvements to some of the rewriting process, as well as some speed improvements due to avoiding working on the file system, and doing things directly in memory. I use this quite frequently nowadays to stage some changes with git add --patch and then run git revise fb92283a to revise a specific patch.
Many people have good success with git absorb, which is based off of hg absorb. It can help absorb changes into earlier patches. I find it works sometimes for me, but not always. When it does, it saves time on some of the manual work from above.
