SourceTree

Merge or Rebase?

Posted on August 21, 2012

As you’re no doubt aware, Git and Mercurial are great at re-integrating divergent lines of development through merging. They have to be, since their design strongly encourages developers to commit changes in parallel in their own distributed environments. Eventually some or all of these commits have to be brought together into a shared graph, and merging and rebasing are two primary ways that let us do that. So which one do you use?

What does Merge or Rebase mean?

Let’s start by defining what merging and rebasing are.

Merging brings two lines of development together while preserving the ancestry of each commit history.

In contrast, rebasing unifies the lines of development by re-writing changes from the source branch so that they appear as children of the destination branch – effectively pretending that those commits were written on top of the destination branch all along.

Here’s a visual comparison between merging and rebasing a branch ‘feature/awesomestuff’ back to the master branch:

So merging keeps the separate lines of development explicit, while rebasing always ends up with a single linear path of development for both branches. But rebasing requires the commits on the source branch to be re-written, which changes their content and their SHAs. This has important ramifications which we’ll talk about below.
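If you prefer the command line, the same comparison can be reproduced in a throwaway repository. The following is a minimal sketch rather than a recommended workflow: the file names, commit messages, and the `merged` and `rebased` branch names are made up for illustration, and only the branch name ‘feature/awesomestuff’ comes from the example above.

```shell
set -e
cd "$(mktemp -d)"                                  # throwaway working directory
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
git init -q .
git symbolic-ref HEAD refs/heads/master            # name the initial branch 'master'

# Hypothetical helper: create a file named after the message and commit it.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b feature/awesomestuff
commit feature-work
feature_sha=$(git rev-parse HEAD)                  # SHA of the original feature commit
git checkout -q master
commit master-work                                 # master moves on in parallel

# Option 1: merge. The original feature commit is kept as-is and becomes
# the second parent of a new merge commit.
git checkout -q -b merged master
git merge -q --no-edit feature/awesomestuff

# Option 2: rebase. The feature commit is re-written on top of master,
# so it gets a new SHA.
git checkout -q -b rebased feature/awesomestuff
git rebase -q master
rebased_sha=$(git rev-parse HEAD)

echo "original feature commit: $feature_sha"
echo "after rebase:            $rebased_sha"
git log --graph --oneline merged
git log --graph --oneline rebased
```

The two `git log --graph` calls print the forked-and-joined history for `merged` and a single straight line for `rebased`.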

[An aside: merging in Git can sometimes result in a special case: the ‘fast-forward merge’. This only applies if there are no commits in the destination branch which aren’t already in the source branch. Fast-forward merges create no merge commit and the result looks like a rebase, because the commits just move over to the destination branch, except that no history re-writing is needed (we’ll talk about this re-writing in a second). If you want a merge commit to always be created, you can turn fast-forward merges off in SourceTree: check the ‘Create a commit’ option in the Merge dialog or set it globally in Preferences > Git.]
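As a rough command-line illustration of that aside, the sketch below merges the same feature branch twice into a destination that has no commits of its own: once with Git’s default behaviour and once with `--no-ff`. The `ff-demo` and `no-ff-demo` branch names, file names, and commit messages are assumptions made for this example.

```shell
set -e
cd "$(mktemp -d)"                                  # throwaway working directory
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
git init -q .
git symbolic-ref HEAD refs/heads/master            # name the initial branch 'master'

# Hypothetical helper: create a file named after the message and commit it.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b feature/awesomestuff
commit feature-work
feature_sha=$(git rev-parse HEAD)

# master has no new commits since the branch point, so a plain merge
# simply moves the branch pointer forward: no merge commit, same SHA.
git checkout -q -b ff-demo master
git merge -q feature/awesomestuff
ff_tip=$(git rev-parse HEAD)

# With --no-ff, Git creates a merge commit even though it could fast-forward.
git checkout -q -b no-ff-demo master
git merge -q --no-ff --no-edit feature/awesomestuff
no_ff_merges=$(git rev-list --merges --count HEAD)

echo "fast-forward tip equals feature tip: $ff_tip"
echo "merge commits created with --no-ff:  $no_ff_merges"
```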

So, what are the pros and cons of merging and rebasing?

Pros and Cons

Merging Pros

Merging Cons

Rebase Pros

Rebase Cons

Practical tips

Looking at the above pros/cons, it’s clear that it’s generally not a case of choosing between one or the other, but more a case of using each at the appropriate times.

To explore this further, let’s say you work in a development team with many committers, and that your team uses both shared branches as well as personal feature branches.

Shared branches

With shared branches, several people commit to the same branch, and they find that when pushing, they get an error indicating that someone else has pushed first. In this case, I would always recommend the ‘Pull with rebase’ approach. In other words, you’ll be pulling down other people’s changes and immediately rebasing your commits on top of these latest changes, allowing you to push the combined result back as a linear history. It’s important to note that your commits must not have been shared with others yet.

In SourceTree, you can do this in the Pull dialog:

You can also set this as the default behavior in your Preferences if you like:

By taking this approach, when developing in parallel on a shared branch, you and your colleagues can still create a linear history, which is much simpler to read than if each member merges whenever some commits are built in parallel.

The only time you shouldn’t rebase and should merge instead is if you’ve shared your outstanding commits already with someone else via another mechanism, e.g. you’ve pushed them to another public repository, or submitted a patch or pull request somewhere.
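Outside SourceTree, the ‘Pull with rebase’ approach described above corresponds to `git pull --rebase`. Here is a small simulation of the situation, assuming two hypothetical developers (the `alice` and `bob` clones) sharing one bare repository; all names and paths are invented for the example.

```shell
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

# Hypothetical helper: create a file named after the message and commit it.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

# A bare repository stands in for the team's shared remote.
git init -q --bare shared.git
git --git-dir=shared.git symbolic-ref HEAD refs/heads/master

# First developer creates the shared branch and pushes it.
git init -q alice
cd alice
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/shared.git"
commit base
git push -q -u origin master
cd "$work"

# Second developer clones, then both commit in parallel.
git clone -q shared.git bob
(cd alice && commit alice-change && git push -q origin master)
cd bob
commit bob-change

# Someone else pushed first, so this push is refused.
if git push -q origin master 2>/dev/null; then
  first_push=accepted
else
  first_push=rejected
fi
echo "first push: $first_push"

# Pull with rebase: fetch the new commits and replay the local one on top.
git pull -q --rebase origin master
git push -q origin master
git log --graph --oneline                          # a single straight line
```

Running `git config pull.rebase true` in a repository makes a plain `git pull` behave this way by default.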

Feature branches

Now let’s take the case where you deliberately create a separate branch for a feature you’re developing, and for the sake of this example, you are the only person working on that feature branch. This approach is common with git-flow and hg-flow for example. This feature branch may take a while to complete, and you’ll only want to re-integrate it into other lines of development once you’re done. So how do you manage that? There are actually two separate issues here that we must address.

The final merge: When building a feature on a separate branch, you’re usually going to want to keep these commits together in order to illustrate that they are part of a cohesive line of development. Retaining this context allows you to identify the feature development easily, and potentially use it as a unit later, such as merging it again into a different branch, submitting it as a pull request to a different repository, and so on. Therefore, you’re going to want to merge rather than rebase when you complete your final re-integration, since merging gives you a single defined integration point for that feature branch and allows easy identification of the commits that it comprised.
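To make the ‘single defined integration point’ concrete, here is a hedged sketch of a final merge performed with `--no-ff`, followed by one way of listing the commits the feature comprised. The commit messages and file names are illustrative assumptions.

```shell
set -e
cd "$(mktemp -d)"                                  # throwaway working directory
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"
git init -q .
git symbolic-ref HEAD refs/heads/master            # name the initial branch 'master'

# Hypothetical helper: create a file named after the message and commit it.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b feature/awesomestuff
commit feature-part-1
commit feature-part-2

# Final re-integration: always create a merge commit.
git checkout -q master
git merge -q --no-ff --no-edit feature/awesomestuff

# The merge commit's first parent is the previous master tip and its second
# parent is the feature tip, so this range lists exactly the feature commits.
feature_commits=$(git log --format=%s HEAD^1..HEAD^2)
echo "$feature_commits"
```

Because the whole feature hangs off one merge commit, that commit is also the natural handle if the feature later needs to be examined as a unit.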

Keeping the feature branch up to date: While you’re developing your feature branch, you may want to periodically keep it in sync with the branch which it will eventually be merged back into. For example, you may want to test that your new feature remains compatible with the evolving codebase well before you perform that final merge. There are two ways you can bring your feature branch up to date:

  1. Periodically merge from the (future) destination branch into your feature branch. This approach used to cause headaches in old systems like Subversion, but actually works fine in Git and Mercurial.
  2. Periodically rebase your feature branch onto the current state of the destination branch.

The pros and cons of each are generally similar to those for merging and rebasing. Rebasing keeps things tidier, making your feature branch appear quite compact. If you use the merge approach instead, it means that your feature branch will always branch off from its original base commit, which might have happened quite a long time ago. If your entire team did this and there’s a lot of activity, your commit history would contain a lot of parallel feature branches over a long period of time. Rebasing continually compacts each feature branch into a smaller space by moving its base commit to more recent history, cleaning up your commit graph.

The downside of rebasing your feature branches in order to keep them up to date is that this approach rewrites history. If you never push these branches outside your development machine, this is no problem. But if you do want to push them somewhere, say for backup or just visibility, then rebasing can cause issues. On Mercurial, it’s always a bad idea: you should never push branches you intend to rebase later. With Git, you have a bit more flexibility. For example, you could push your feature branches to a different remote to keep them separate from the rest, or you could push your feature branches to your usual remote, provided your development team is aware that these feature branches will likely be rewritten and therefore should not be checked out from this remote.
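The sketch below shows one form those issues can take in Git. A feature branch is pushed to a remote, then brought up to date in both ways; the rebased version can no longer be pushed normally, while the merged version can. The `backup.git` remote, the `feature/merged-instead` branch, and the file names are assumptions made for this example.

```shell
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.com"

# Hypothetical helper: create a file named after the message and commit it.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

git init -q --bare backup.git                      # stands in for the remote
git init -q dev
cd dev
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/backup.git"

commit base
git checkout -q -b feature/awesomestuff
commit feature-work
pushed_sha=$(git rev-parse HEAD)
git push -q origin feature/awesomestuff            # pushed for backup or visibility

git checkout -q master
commit master-work                                 # the destination branch moves on

# Option 1: merge master into a copy of the feature branch. The pushed commit
# stays in the history, so the remote branch can simply move forward.
git checkout -q -b feature/merged-instead "$pushed_sha"
git merge -q --no-edit master
if git push -q origin HEAD:refs/heads/feature/awesomestuff 2>/dev/null; then
  merge_push=accepted
else
  merge_push=rejected
fi

# Option 2: rebase the feature branch onto master. The pushed commit is
# replaced by a re-written one, so an ordinary push is refused.
git checkout -q feature/awesomestuff
git rebase -q master
rebased_sha=$(git rev-parse HEAD)
if git push -q origin feature/awesomestuff 2>/dev/null; then
  rebase_push=accepted
else
  rebase_push=rejected
fi

echo "push after merge:  $merge_push"
echo "push after rebase: $rebase_push"
```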

Conclusion

The consensus that I come across most frequently is that both merge and rebase are worth using. The time to use either is entirely dependent on the situation, the experience of your team, and the specific DVCS you’re using.

  1. When multiple developers work on a shared branch, pull & rebase your outgoing commits to keep history cleaner (Git and Mercurial)
  2. To re-integrate a completed feature branch, use merge (and opt out of fast-forward merges in Git)
  3. To bring a feature branch up to date with its base branch:
    1. Prefer rebasing your feature branch onto the latest base branch if:
      • You haven’t pushed this branch anywhere yet, or
      • You’re using Git, and you know for sure that other people will not have checked out your feature branch
    2. Otherwise, merge the latest base changes into your feature branch
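Rule 3 above can be written down as a tiny decision helper. This is only a restatement of the list in shell form; the function name `update_strategy` and its yes/no arguments are hypothetical, and the answers still have to come from you.

```shell
# Hypothetical helper: prints the approach suggested by rule 3 above.
# usage: update_strategy <pushed: yes|no> <vcs: git|hg> <checked_out_by_others: yes|no>
update_strategy() {
  pushed=$1
  vcs=$2
  others=$3
  if [ "$pushed" = "no" ]; then
    echo rebase      # the branch has not been pushed anywhere yet
  elif [ "$vcs" = "git" ] && [ "$others" = "no" ]; then
    echo rebase      # Git, and nobody else has checked the branch out
  else
    echo merge       # otherwise, merge the latest base changes in
  fi
}

update_strategy no hg no
update_strategy yes git no
update_strategy yes git yes
update_strategy yes hg no
```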

I hope this helps! Please let me know in the comments if you have any questions or suggestions.

 

  • Oliver Zhou

    Rebase is now much safer than before thanks to hg phases support

    • Anonymous

      Indeed – Mercurial will actually stop you rewriting changes (including via Rebase) that have already been pushed if you use 2.2 or above.

  • http://twitter.com/abbiekressner Abbie Kressner

    Great blog post!

  • ybart

    Would be great to have the option to show the original commit date instead of the rebased commit date.

    • Anonymous

      It’s already there in 1.5.3 – in Preferences > Git, check ‘Display author date instead of commit date in log’. Both dates are also displayed in the commit details when you select the line.

      • ybart

        Didn’t see that! Thanks for the tip. I noticed the information was available in the commit details panel, but it’s definitely simpler to have the info directly in the list.

  • http://twitter.com/PhillSparks Phill Sparks

    I’ve recently started using git smart-pull. It’s a ruby script that detects the best way to update the local branch (rebase or merge). It also stashes local changes. Maybe it’s something you can consider for SourceTree?

    http://github-displayer.heroku.com/geelen/git-smart/raw/master/docs/smart-pull.html

    • http://www.marnen.org Marnen Laibow-Koser

      That would just make it easier to rebase, which is usually a bad thing. Merge is to be preferred in nearly all cases, since it doesn’t rewrite history.

  • http://www.telecommutetojuryduty.com/ Dominick

    very clear and informative post; I feel like I understand these two approaches much better now. thanks Steve!

  • http://www.facebook.com/eugine.dubinin Eugine Dubinin

    Cool! Thank you very much. I was in doubt before reading this.

  • http://rakeroutes.com/ Stephen Ball

    Rather than decide between merge bubble and straight line of history, you can also use rebase to allow for clean, non-fast forward feature branch merges.

    1. rebase the feature branch against the destination
    2. use `merge --no-ff feature_branch` to pull it in to the destination branch

    This allows for a clear merge commit of a feature (which means it’s easily revertable as well as exceedingly easy to see in a graph) with no back and forth merge bubbling.

    • http://www.marnen.org Marnen Laibow-Koser

      What’s wrong with merge bubbling? Rebasing means your history lies, since it claims you branched off master where you didn’t. See http://paul.stadig.name/2010/12/thou-shalt-not-lie-git-rebase-ammend.html for why this is usually a bad idea. (I do occasionally rebase the feature branch, but I more often merge it. And I *always* merge it if I’ve pushed the feature branch to a remote—which I usually have—because force-pushing to a remote is *far* worse than merge bubbling.)

      In general, I think rebasing more than once in a blue moon is the sign of a broken Git workflow, because it rewrites history too much.

      I notice that as a Git novice, I used to rebase a lot and hardly ever merge. As an expert Git user today, I merge a lot and hardly ever rebase. I think that’s significant.

      • http://rakeroutes.com/ Stephen Ball

        Nothing is inherently “wrong” with merge bubbling: it all depends on your team and how you’re using your Git history.

        For us we value seeing a linear feature branch history over preserving what work was being done in parallel. With 8 devs even trivially parallel work quickly makes the log graph unusable and reduces the effectiveness of git bisect for getting context around a commit.

        We use rebase in two ways:

        1. present our development as a cohesive story (i.e. refactor 20, 30 commits down to a few that explain the actual feature development instead of the false paths).

        2. Align feature branches with the current master/HEAD so that the merge comes in as a linear history.

        Basically, how Linus describes in the classic “how to rebase” post: http://www.mail-archive.com/dri-devel@lists.sourceforge.net/msg39091.html

        I’m curious: what issues did you have rewriting history “too much”? I agree that rebase (or squashing) can be taken to an unpleasant extreme such as when a large feature is compressed down into a single poorly written commit. But rebase only changes HEAD and then plays back the commits: nothing is lost until you tell it to get lost.

        • http://www.marnen.org Marnen Laibow-Koser

          > For us we value seeing a linear feature branch history over preserving what work was being done in parallel.

          Ah, I see. In my experience, a desire to see a linear branch history is generally a sign of not using Git *as Git*. Of course many of us (myself included) grew up on tools like Subversion that encourage a linear history.

          But Git really is different—it encourages a non-linear history, and IMHO it’s at its best when that is respected, because it’s easier for Git’s amazing history analysis tools to work if the history is not prematurely linearized.

          > With 8 devs even trivially parallel work quickly makes the log graph unusable

          How are you trying to use it? I’ve worked on larger teams than that without rebase and had no problem at all.

          > and reduces the effectiveness of git bisect for getting context around a commit.

          Actually, the opposite is true. Git bisect works best if you keep the history non-linear. What are you referring to here?

          > present our development as a cohesive story (i.e. refactor 20, 30 commits down to a few that explain the actual feature development instead of the false paths).

          There’s no reason to do that. It’s a lie, and as explained in that Paul Stadig article I linked to, it makes your Git history less useful. The false paths may prove useful later on, and should not be thrown away.

          > I’m curious: what issues did you have rewriting history “too much”?

          Well, lying about the development history obscures what actually happened, and a rebase-heavy workflow requires force-pushing, which is rather a no-no.

          It comes down to this: if you want a linear history, use Subversion. If you want to use Git, respect its ability to handle branches and don’t artificially dumb down your history.

          • http://rakeroutes.com/ Stephen Ball

            Interesting. I’m 80% sure you’re trolling, shame on me. It’s like you didn’t even read my post or Linus’s post. I’ll hit on your points and share some more links though. :-)

            You take a very strange view to only use a subset of the full Git. Git is the only system I’ve used that can keep a linear history in such an elegant manner. SVN? Please.

            Ah, when I say “git log” I mean the DAS version of git log --graph. The pretty git log gets exceedingly noisy with a parallel history, the vertical bars pushing out all the other data.

            For Git bisect I mean that using it with a linear/feature history you can easily see the feature context surrounding the commit.

            All respect to Paul Stadig, but that’s a very shortsighted view of Git’s potential. Rewriting history is nothing to be scared of. Commits like “Fixed typo” are literally noise in your signal and far better to use fixup and squash while working on the development branch.

            Git push --force is also not inherently bad either, although you’d only need it if you have pushed your development branch to a remote. You should certainly not be rebasing or changing public history! But if you’re on a development branch and you’re using your remote origin for backup (?) then push -f is what you’d need.

            It comes down to this: Git provides wonderful functionality for deliberately telling the story of your code. Not using it and letting your history just happen is missing out.

            http://blog.izs.me/post/37650663670/git-rebase – Git is an editor. “Firmly against rebase” is like being firmly against backspace.

            http://rakeroutes.com/blog/deliberate-git/ – “Write the code, commit the code, then refactor the commits”

          • http://www.marnen.org Marnen Laibow-Koser

            > Interesting. I’m 80% sure you’re trolling, shame on me.

            Nope. Sorry if I gave that impression. I don’t pull my punches when discussing things like this, but I’m trying to approach this as a real discussion.

            > It’s like you didn’t even read my post or Linus’s post. I’ll hit on your points and share some more links though. :-)

            I had not read Linus’s post, but I have now; thanks for the reference. I do use rebase occasionally on my private branches as he suggests, but that’s about it.

            And of course I read your comment before replying. If there’s something you think I’m not getting, I’d love to know what it is.

            > You take a very strange view to only use a subset of the full Git.

            That’s not quite what I’m advocating. But Git has lots of commands that aren’t much use except in fairly restricted circumstances, and I believe rebase is one such. I use rebase every now and then, but merge is far more generally useful.

            > Git is the only system I’ve used that can keep a linear history in such an elegant manner. SVN? Please.

            My recollection from the last time I used SVN (admittedly a while ago) is that it’s a perfectly good VCS—provided you never make private commits and don’t do too much branching. In other words, it’s about equivalent to Git in maintaining a linear history on a public server. The reason Git is a better VCS than SVN, IMHO, is that maintaining a linear history on a public server is not the only thing that a VCS should do. :)

            > Ah, when I say “git log” I mean the DAS

            DAS?

            > version of git log --graph. The pretty git log gets exceedingly noisy with a parallel history, the vertical bars pushing out all the other data.

            I nearly always use SourceTree or GitX for this purpose. I think the nonlinear nature of Git makes GUI tools far more versatile for viewing complex histories.

            > For Git bisect I mean that using it with a linear/feature history you can easily see the feature context surrounding the commit.

            Git bisect behaves just fine with nonlinear history too. Are you referring to the case where you’re running bisect on a repo with two correct but mutually incompatible branches merged in?

            Or are you referring to the case in the izs.me blog post, where he talks about the problem of running bisect on a repo full of broken commits? I’m not advocating that either. I don’t commit till the tests pass—but then that commit is there to stay.

            > All respect to Paul Stadig, but that’s a very shortsighted view of Git’s potential. Rewriting history is nothing to be scared of. Commits like “Fixed typo” are literally noise in your signal and far better to use fixup and squash while working on the development branch.

            I’m sorry, but there I don’t agree at all. “Fixed typo” is noise, and I generally squash it on a private branch, true. I do rewrite to that extent, and I do try to group large changesets into smaller logical changes as I commit them, but that’s about the extent of it. Generally I want my history to reflect what actually occurred. Just because you *can* rewrite history doesn’t mean you should.

            > Git push --force is also not inherently bad either, although you’d only need it if you have pushed your development branch to a remote. You should certainly not be rebasing or changing public history! But if you’re on a development branch and you’re using your remote origin for backup (?) then push -f is what you’d need.

            I often do push my dev branches for backup, and so that my colleagues can see what I’m doing if they need to. In the latter case, they’ll pull my dev branch to look at it, so a subsequent push --force is right out. (And of course, there’s no way for me to know whether they’ve done that, short of asking them all.)

            Git push --force is a misfeature, in my opinion. Anything that goes into the public repo should stay there. (Eric Sink appears to agree with me, FWIW.)

            > It comes down to this: Git provides wonderful functionality for deliberately telling the story of your code. Not using it and letting your history just happen is missing out.

            I don’t quite let my history “just happen”; I do make sure my commits are logical and atomic. But I also make sure they’re truthful. I have zero interest in reorganizing my commits ex post facto to tell a different story. Why would I want to do that? That just makes things more confusing for later maintainers (who might even be me 2 years from now).

            In other words, the later maintainer should IMHO be able to figure out the *actual* history, not the one that you thought would look good.

            I’ll admit that I’m mystified at the idea that rearranging my history is somehow a good thing. Why do you do this? What is the point? What benefits do you get that make up for the loss of actual history?

            > http://blog.izs.me/post/376506… – Git is an editor. “Firmly against rebase” is like being firmly against backspace.

            This post appears to be saying that rewriting history lets you see differences at a higher level. But Git already has diff for that. When I look at Git history, I don’t want to see a pretty story; I want forensics. By rewriting your history, you’re throwing out your forensic data forever—so when a later maintainer needs it to track down a bug, it’s too late.

            Also, I disagree with the basic premise. Git isn’t an editor, it’s a VCS. Its job is to make it easy for projects to *keep* history, not *change* it. To the extent that your VCS doesn’t reliably record history, it isn’t a VCS.

            > http://rakeroutes.com/blog/del… – “Write the code, commit the code, then refactor the commits”

            That’s one of your own posts, right? I think you’re advocating putting in commits the sort of long explanations that I tend to think are more suited for an issue tracker or a README file. That may work for some projects, though.

            I *might* be willing to use rebase -i the way you describe, though—because that’s not really changing the commit history as such; it’s just expanding the commit messages and sometimes combining a couple of subsequent commits into one.

            Even here, though, brain fade can creep in. Particularly if I’m going to be writing a longer message for a lot of work, I’d rather write it as I do the work so I don’t forget details. Normally I put a lot of notes in the issue tracker as I work for this purpose.

            In short, then, I believe you’re manufacturing yourself a problem by trying to use Git more like a documentation engine or issue tracker. Your solution is IMHO reasonable, but only if you have the problem in the first place. I prefer a workflow where that problem wouldn’t arise.

  • Adrian von Gegerfelt

    I find the ‘Rebase current changes onto [otherbranch]‘ text to be scary. I don’t want to affect ‘otherbranch’, only my current branch. Is it possible to change the message to something in the lines with ‘Rebase and use [otherbranch] as source’, or something…

    • Anonymous

      Sorry, I think the current text is clearer. With rebase it’s very important to understand which commits will be changing, and the text makes clear that it’s your current commits that will move & be placed on top of the other branch you selected – ie the other branch isn’t modified, it’s just what the commits move on to.

      • Anonymous

        that isn’t very clear to be honest. I am with Adrian.

      • Jerry

        Totally agree. When I right-click my feature branch (which is checked out), the menu gives me the choice to “Rebase current changes onto “. So that doesn’t tell me what’s going to be changed. You’re assuming people already know what rebasing does, and that this terminology makes it clear in which direction it’s happening. I originally branched off of develop, not master. When they say “current changes”, do they mean in develop, or master? Does the system automatically know which branch to pull from to do the rebase? Vague, non-descript menus like these can cause people to mess up entire repos, and I know, because I’ve seen it happen.

      • http://about.me/mikeschinkel MikeSchinkel

        I agree with Adrian. That wording does not give me enough information to ever be willing to risk using it. I’ll sadly do from the command line instead.

        Maybe what would help is if there were some way to preview the Git command that will be run for all menu options?

  • saurav
  • Erik van der Neut

    Thanks Steve! This is a wonderfully clear explanation that gives me extra confidence in how to use GIT/SourceTree.

  • Anonymous

    Where can I read about how to deal with rebase conflicts? I have no idea what SourceTree is showing me when this happens, which file is from where, or what stage I am at. It’s extremely confusing.

    • Anonymous

      Rebase conflicts are essentially the same as merge conflicts, except that your changes are ‘replayed’ on top of the target branch, one at a time, meaning the rebase has to stop as soon as there’s a conflict. You have to resolve the conflicts the same way as you do with a merge (e.g. launching external merge tools) then continue the rebase process, since there may be more commits to ‘replay’ after this one. Rebase is a multi-stage process rather than a single stage like merge because each commit has to be modified as it is rebased. SourceTree automatically prompts you to continue the rebase process if you click Commit or other commit functions while you have a rebase in progress so you don’t have to remember to use the ‘Continue Rebase’ feature explicitly once you’ve resolved the conflicts.

      • Magbic

        I can’t seem to get this working on SourceTree 1.9.0, I created a merge conflict scenario for testing Pull with Rebase instead of Merge from Develop to my Feature Branch.

        SourceTree creates “(no branch / rebasing …)”. When I resolve my conflict and commit it, “HEAD” appears with my resolved file, but if I switch to my Feature Branch it doesn’t appear to be rebased. When I get prompted to Continue, this seems to do nothing and I find myself back at square 1.

        What am I not doing right?

  • Anonymous

    Excellent article. One of the best posts I’ve seen on the ‘always more to learn/consider’ topic of merge versus rebase.