If you’ve been using Git for a while, you’re probably comfortable with git rebase, especially if you’ve been using git svn and keeping your history linear. Rebasing has certain caveats, but what I don’t see mentioned much, if at all, is how using git rebase can completely break things without you being aware of it. Worse, things can break while you’re genuinely convinced that what you’re doing is fine, to the point of not seeing any problem whatsoever.
A standard situation you’ll find yourself in is needing to introduce changes from a feature branch into the main development branch. Suppose you’re working on a simple application which displays a user’s current location on a map.
You decide you want to display other users who are nearby. You aren’t quite sure how far you want to develop it, so you create a branch. Using the existing current_location() function, you hack away and write a new users_nearby() function. The development branch you forked from has continued on while you were writing it.
Happy enough with the new code, you commit.
E is the commit containing the addition of users_nearby(), which uses the existing current_location(). You decide that you want to develop things a little further and display not just the nearby users’ locations, but also the number of them currently visible on the map. You write some more code and commit again, introducing visible_users_near(), which uses users_nearby().
By now, you’re fine with the way it functions, but you’re not happy with the way the code is written. Deciding that users_nearby() doesn’t actually need to depend on current_location(), you refactor users_nearby() and remove the dependence on current_location(). (This example is somewhat contrived, so how exactly users_nearby() would work after the refactoring isn’t clear; stay with me.)
You consider the new feature finished and want to bring the changes into the main development branch. It’s only a few changes, so why not rebase?
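For concreteness, the rebase might look something like the sketch below. The branch names feature and master are assumptions borrowed from the rev-list command later in this post, and rebase_and_land is just a hypothetical wrapper name; your branches are probably called something else.

```shell
# Sketch only: replay the commits on "feature" on top of "master",
# then fast-forward "master" so it points at the rebased feature tip.
# Branch names are assumptions, not something the scenario fixes.
rebase_and_land() {
    git checkout --quiet feature &&
    git rebase --quiet master &&
    git checkout --quiet master &&
    git merge --quiet --ff-only feature
}
```

After this, history is linear: the feature commits sit directly on top of the latest development commit, with no merge commit anywhere.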
You test that the code compiles, and everything works as expected. Nobody on the development team gives it a thought.
Some time later you’re hunting down a bug, and you end up at commit E. You recompile everything to try to find the cause, and… there are compilation errors. What happened?
It turns out that someone else was working on current_location() while you were writing users_nearby(). Their commits, including D, fundamentally changed current_location() in a way which breaks users_nearby(). E, F and G didn’t see any of those changes until they were rebased, but only G was tested for an error-free compile; E and F are broken because they were never tested against the upstream changes after rebasing.
What would the situation be if, instead of rebasing, a regular merge was done?
After a merge, the state of the code in M is the same state G was in when rebase was used. The difference is that E and F are no longer broken: a merge leaves them on their original base, from before current_location() changed.
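As a rough sketch, the merge alternative could look like this, again assuming the branches are called feature and master (merge_feature is a made-up name). The --no-ff flag isn’t strictly needed when the development branch has moved on, since Git creates a merge commit in that case anyway, but it makes the intent explicit.

```shell
# Sketch only: bring "feature" into "master" with an explicit merge commit.
# The feature commits are not rewritten, so they keep their original
# parents and hashes. Branch names are assumptions.
merge_feature() {
    git checkout --quiet master &&
    git merge --quiet --no-ff -m "Merge branch 'feature'" feature
}
```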
“I’m never rebasing again.”
I think it’s easy to see this example as proof that git rebase is broken and should be avoided at all costs. But when you’re working on small, short-lived, local branches, breakage isn’t very likely to occur, and it can be actively avoided.
One reason is that you can see the changes upstream and spot a relevant change as it happens. Another is that you could write a script which tests every rebased commit to see if anything problematic happens (for example, the build breaks), using something like git rev-list --reverse feature ^master to get the list of commit hashes you need.
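Here’s a minimal sketch of such a script. test_each_commit is a hypothetical helper, not something Git ships. It assumes a clean working tree, and the build command (make in the usage below) is a stand-in for whatever your project uses.

```shell
# Hypothetical helper: check out every commit that is on <branch> but not
# on <upstream>, oldest first, and run a command against each one.
# Usage: test_each_commit <upstream> <branch> <command> [args...]
test_each_commit() {
    upstream=$1
    branch=$2
    shift 2

    # Remember where we started (branch name, or hash if HEAD is detached).
    original=$(git symbolic-ref --short -q HEAD || git rev-parse HEAD)
    result=0

    for commit in $(git rev-list --reverse "$branch" "^$upstream"); do
        git checkout --quiet --detach "$commit" || { result=1; break; }
        if "$@"; then
            echo "ok     $commit"
        else
            echo "BROKEN $commit"
            result=1
        fi
    done

    # Go back to where we started, whatever happened above.
    git checkout --quiet "$original"
    return $result
}
```

Run it as test_each_commit master feature make after rebasing; it returns non-zero if any commit fails. If you’d rather have Git do the looping, git rebase --exec "make" master runs the command after each commit is replayed and stops at the first failure.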
If the feature branch you’re working on has been pushed somewhere “public”, you have absolutely nothing to worry about, because you shouldn’t be rebasing the feature branch anyway!