Using Git and feature branches effectively

A bit of a debate has been raging for the last few weeks. It started when somebody commented on a few things that Martin Fowler wrote about git, feature branches and feature toggles. Martin Fowler makes a few very good points about how feature branches can cause a lot of testing and integration headaches. A lot of people seem to have picked up on this and a bit of an anti-feature-branch movement seems to be emerging.

The main problem I have with this is that these people are confusing problems and causes here and effectively blaming a solution they have for causing a problem they have. Roughly the argument goes as follows: feature branches cause people to accumulate change that then becomes very disruptive when it lands on the main branch. There will be CI breakage and lots of problems for people on other feature branches that are suddenly faced with merge issues. I think Martin Fowler actually gets it but some of his followers seem to be confused. By all means, feature toggles are a great way to get changes in early and have them exposed. It’s a powerful tool that you can use to get changes out earlier. Early is good: use it. However, it is not the case that if you use feature branches there has to be a lot of pain for everybody. In other words, feature branches are not inherently evil and don’t have to be problematic.

It’s not the practice of feature branching that is the problem but the fact that testing and continuous integration are not decentralized in a lot of organizations. In other words, until your changes land on the central branch, nobody is doing the due diligence of testing them. Even worse, that means you have not made sure your changes work before you add them to the main branch.

You can’t do decentralized versioning unless you also decentralize your testing and integration. Git has value when used as an SVN replacement. Git has more value when used as a DVCS. There is no good reason why you can’t do decentralized testing and integration with git. Rather the opposite: it has been designed with exactly this in mind. The whole point of git is divide and conquer. Break the changes up: decentralize the testing and integration work and solve the vast majority of problems before change is pushed upstream. If you push your problems along with your changes you are doing it wrong. Decentralized integration is a very clever strategy that is based on the notion that the effort involved with testing and integration scales exponentially rather than linearly with the amount of change. By decentralizing you have many people working on smaller sets of changes that are much easier to deal with: the collaborative effort on testing decreases and when it all comes together you have a much smaller set of problems to deal with.
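
One cheap way to decentralize testing is to let git run the tests on your own machine before anything leaves it. The sketch below installs a pre-push hook: a script that git runs before every push and whose non-zero exit status aborts the push. The function name install_pre_push_hook and the test command you pass in are assumptions for illustration only; substitute whatever runs your own test suite.

```shell
# install_pre_push_hook TEST_COMMAND
# Writes the pre-push hook of the current repository so that
# "git push" is aborted whenever TEST_COMMAND fails.
# (Hypothetical helper; TEST_COMMAND is whatever runs your tests.)
install_pre_push_hook() {
    test_cmd="$1"
    hook="$(git rev-parse --git-dir)/hooks/pre-push"
    mkdir -p "$(dirname "$hook")"
    # The heredoc is unquoted on purpose: $test_cmd is filled in now,
    # at install time, so the hook contains the literal command.
    cat > "$hook" <<EOF
#!/bin/sh
# Runs before every push; a non-zero exit status aborts the push.
if ! $test_cmd; then
    echo "pre-push: tests failed, push aborted" >&2
    exit 1
fi
EOF
    chmod +x "$hook"
}
```

Calling it as install_pre_push_hook "make test" (assuming your project has such a target) means a red test run never leaves your machine. Hooks are local to each clone, so every developer has to install it; that is the decentralized part.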

This is how the Linux kernel manages to remain stable, despite the fact that the amount of change there is measured in thousands of LOC/day. If you need proof that thousands of people can work collaboratively on millions of lines of code using git branches: look at the Linux kernel. The amount of change, integration and testing effort, etc. in whatever you are working on probably pales in comparison with that: your problems are easy.

Here are a few simple practices that will address most of the issues:

  1. No change that will break CI on the main branch is allowed on the main branch. Zero tolerance on this one. In git terms: rebase against main, run the CI tests on the feature branch, fix any problems, push. You can even automate this: Jenkins pretty much supports this out of the box. If main ever breaks, somebody doesn’t get the basics: educate them with a big clue bat. Rationale: it is vital to keep main stable at all times. That way everybody on a feature branch will know it is safe to rebase.
  2. Rebase against main frequently, especially if you do big changes. Rationale: you are doing the changes, it is you that will take the pain of doing the integration work when things go wrong, not everybody else. The earlier you know about problems, the easier it is to fix. Feature branches are not about stopping rebases. If your feature branch is way behind, you have done something very wrong. Don’t ever do that without good reason. If it is on the central branch, you will have to deal with it at some point: so get it over with ASAP.
  3. Commit frequently, keep commits as small as you can. Rationale: smaller commits are easier to analyze and fix when you have conflicts. Also, Git is really good at applying commits one by one. If you isolate the merge problems to a handful of commits, rebasing is pretty much painless.
  4. Push as early as you can. If the CI builds are green on your branch and you are confident things are fine, push and don’t wait. Don’t accumulate integration work for others. Feature branches are not about hiding change but about isolating change. There’s a difference. One is a communication problem and the other is a proven strategy of divide and conquer. If appropriate, feature toggles are indeed a great way to land experimental changes. I believe this was the main point Martin Fowler was trying to make.
  5. Communicate clearly around big code restructurings. Rationale: everybody rebasing against your changes will experience some pain. You are causing people to have to do work, so tell them it is going to happen. I always ask people to push their changes before I push my big changes. That way, I can fix the integration issues on my side before I push.
  6. Collaborate with people by pushing changes back and forth without involving the main branch. Git format-patch is your friend: you can do this by email. If somebody needs a change before you land it, you can give it to them.
  7. Be aware of the cost of things. Any time people spend on things like resolving merge conflicts, doing rebases, etc. costs you. Inevitably when you branch you accept that there is going to be some cost. Keeping a branch alive means you add cost. Don’t do it needlessly. So, branch and do what you have to do and then rebase, push and delete the branch. And again, not committing is effectively the same as branching in git. It has the same cost and risk attached to it.
  8. Don’t push branches to origin unless they are going to be long lived and need to be worked on by multiple people. For simple work, keep the branches local to your machine. If you are going to be doing all of the commits and integration work, the only valid reason for pushing it upstream would be backups. There are other ways to back up your work.
  9. Beyond a certain team size, the stable branch needs to be protected. Pull change rather than pushing it (this is a severely underused git feature). Junior on the third floor says his patch is ready: pull it, have a look at it and give him feedback but don’t allow him to push and bypass checks and balances. Push only works in small teams. Pull forces people to communicate.
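
To make practice 1 concrete, here is a minimal sketch of the rebase, test, push cycle as a shell function. It assumes the remote is called origin and the stable branch is called main; the function name land_feature and the test command argument are hypothetical. It is one way to do it, not the only one.

```shell
# land_feature TEST_COMMAND
# Rebases the current branch onto origin/main, runs the tests, and
# pushes to main only if both steps succeed.
# (Hypothetical helper; assumes remote "origin" and branch "main".)
land_feature() {
    test_cmd="$1"
    git fetch origin || return 1
    if ! git rebase origin/main; then
        # Put the branch back the way it was; resolve the conflicts
        # by hand on the feature branch and try again.
        git rebase --abort
        echo "rebase hit conflicts: fix them on the branch first" >&2
        return 1
    fi
    if ! sh -c "$test_cmd"; then
        echo "tests failed: not pushing" >&2
        return 1
    fi
    # A plain push, no force: it is rejected if main moved in the
    # meantime, in which case you simply run the cycle again.
    git push origin HEAD:main
}
```

Run it from your feature branch, for example as land_feature "make test" if that is how your tests run. Because the tests run after the rebase, what you test is exactly what lands on main.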

When will (feature) branches get you in trouble? Antipatterns:

  • You are working on a big change. You haven’t updated for days. Do I need to spell it out? That is just wrong. Update!
  • You are pushing a big change, all the CI builds go red. Oops, test before you push you dumb idiot. You deserve all the angry looks you are going to get. If this happens a lot, consider setting up your CI environment for having a stable branch for tested commits and an unstable branch for incoming commits that need to be tested. Jenkins supports this and it will keep unstable change out of stable.
  • You are doing a big change, somebody else is also doing a big change. You find out about that when you rebase and spend hours dealing with merge conflicts. Seriously: communicate before you do anything drastic and give people an opportunity to get their changes in before you ruin their day. Pushing massive changes that you know are going to cause problems when people rebase is a very egocentric thing to do. Be a team player.
  • You are working on a big change. You haven’t updated or committed anything for several days on your local branch. You are effectively using your local branch as a feature branch: everybody who is not pushing change is effectively on a feature branch. Not committing is NOT a strategy to avoid using feature branches. Also, you are not committing? Why???? What’s your excuse for not committing to your own local branch? Seriously, consider using version management and stop treating git as a file server for code backups.
  • You are working on a branch for an extended period of time but your CI builds only run on the central branch. Congratulations, you have just tossed out CI as a good practice. Fix it. Either have the discipline to run the CI tests manually on every commit to the branch or set up a CI build for it. Either way, don’t break the feedback loop you get with CI.
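
The stable/unstable setup mentioned above boils down to one rule: stable only moves forward to commits that have passed the tests. Jenkins can do this for you; the sketch below only shows the idea in plain git. The branch names stable and unstable, the function name promote_unstable and the test command are all assumptions for illustration.

```shell
# promote_unstable TEST_COMMAND
# Tests the tip of "unstable" and fast-forwards "stable" to it
# only when TEST_COMMAND succeeds.
# (Hypothetical helper; branch names are assumptions.)
promote_unstable() {
    test_cmd="$1"
    git checkout -q unstable || return 1
    if sh -c "$test_cmd"; then
        # Fast-forward only: this fails rather than creating a merge
        # if stable has commits that unstable does not contain.
        git checkout -q stable && git merge -q --ff-only unstable
    else
        echo "unstable is red: stable stays where it is" >&2
        return 1
    fi
}
```

Everybody pushes to unstable, the CI job runs something like this after each push, and people rebase against stable. A broken commit then costs its author some embarrassment, not the whole team a working branch.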

Now at this point I have to admit something: I don’t use feature branches a lot but I do tend to accumulate a few commits locally before I push them. Also, we haven’t set up stable and unstable branches in Jenkins (yet, planning to). We have the occasional breakage of our CI builds. I’m actually guilty of breaking some of the builds myself. The reason/weak excuse is: I’m having a hard time changing people’s way of working. You turn your back and people stop committing and continue treating git like they are using CVS. But I’m at least aware of what the real problem is here: not the fact that I have branches but the fact that our testing needs to be decentralized.

5 thoughts on “Using Git and feature branches effectively”

  1. Pingback: Help me, because I think Martin Fowler has a Merge Paranoia « Arialdo Martini

  2. The biggest problem with Feature Branches is that they remove your ability to http://c2.com/cgi/wiki?RefactorMercilessly . You mention “big changes” in a lot of your points, but if you’re really into doing TDD’s Red/Green/Refactor cycle and “Leaving the campground cleaner than you found it” http://programmer.97things.oreilly.com/wiki/index.php/The_Boy_Scout_Rule , you know that much more often what you have are small improvements, or little redesigns.

    In a feature branch situation, you end up with people fixing the same issues, most probably in different manners, or people having the codebase morph under their feet; and in both cases, big merge pains.

    More than once in a project I had to neglect my duties as a professional programmer to refactor and clean up on the small due to FEAR of someone else changing that code in a branch that lives for 2 weeks. Those same projects were infested with “big refactors”… from your points it sounds like that seems natural to you – it is not.

    By the time the merge is done and conflicts arise, the sequence of steps I took when refactoring will have vanished from my mind already, and it will be painful for me and the other programmer to sort it out.

    Some people are scared to merge “unfinished” code into the codebase, but most of the time, it doesn’t matter, or can just be disabled. In rails you could just switch it off based on environment, if you don’t wanna use Feature Toggles. Most of the time you can get away by just omitting it from the UI navigation and using a direct URL to access new functionality when you must.

    In my experience, the best programmers are those that take baby steps, incrementally building functionality, all the while keeping the code compiling and tests passing the majority of the time – certainly at each git commit. There’s no justification for the fear of working on master, other than failing to follow this discipline and having bad tests that don’t point out regressions well enough.

    • Good points. However, big refactorings are not necessarily a technical issue here. If anything, git is actually quite helpful when it comes to this. I have in the past done big refactorings on git branches. It actually works. E.g. big package renames in Java can cause a lot of fallout. I’ve done such changes on branches and successfully rebased against upstream changes for several weeks. The challenge here is communicating your change is going to happen and picking the right moment. With subversion, your only option is to tell everybody to sit on their hands and not commit while you cross your fingers and do the changes and commits.

      In git you do these things on a branch. Even if you don’t call it a branch, your local repository is for all practical purposes a branch. Not pushing your commits for a few days is exactly the same as having a feature branch. You’re just not keeping it in a safe place (i.e. in a backed up location on a server), which is not a very bright thing to do.

      The real problem here is the transactional view of a version control where anyone can claim a global lock (a.k.a. commit freeze) on the repository and do their thing. This does not scale to larger projects. Git was in fact designed to solve this problem but it requires you change the way you work.

      Instead of everybody pushing their changes commit by commit whenever they feel like it, you instead use pull requests (or the old school email based version of that as the Linux kernel guys do). A pull request is nothing but a serialized feature branch. Most large open source projects work this way by now for good reason: it works. Pull requests are very simple: either they merge cleanly without causing errors and test failures or they don’t. Only the former type gets merged typically.

      Using pull requests forces people to communicate about their changes. Suddenly a big refactoring becomes your problem and not everybody else’s. If your big refactoring pull request doesn’t apply cleanly to the master, it gets rejected and you have to do more work. You iterate on that while rebasing against upstream changes, effectively doing the integration work before your pull request is integrated. Whether you like it or not: you are on a feature branch at this point.

      If somebody else has a conflicting feature branch, you had better learn to talk to each other before you find out the hard way by having your patches rejected. Git allows you to rebase against each other so you can resolve whatever conflicts you have before you dump your changes on master.

      The key point in my blog post is that if you isolate yourself like this from the master, you have to also decentralize your testing. A key problem here is that in many organizations master is the only thing that is continuously built and tested. So everybody is anxious to have their half broken code on master so they can know the impact. If you’ve ever experienced a project with widespread commit anxiety, you’ll know that this is a very negative thing to have.

      If you practice continuous deployment, it is not acceptable for master to ever be in a broken state, because that means you can’t deploy. This implies that all change has to be fully tested and integrated before landing on master. By definition, then, practicing continuous deployment means you also have to use feature branches.
