Enforcing code conventions in Java

After many years of working with Java, I finally got around to enforcing code conventions in our project. The problem with code conventions is not agreeing on them (actually, that part is hard too since everybody seems to have their own preferences, but that’s beside the point) but enforcing them. For enforcement you can choose from a wide variety of code checkers such as checkstyle, pmd, and others. My problem with this approach is that checkers usually end up being some combination of too strict, too verbose, and too annoying. In any case nobody ever checks their output, and you need the discipline to fix any detected issues yourself. On most projects where I’ve tried checkstyle, it found thousands of trivial issues with the out of the box configuration. Pretty much every Java project I’ve ever been involved with had somewhat vague guidelines on code conventions and a very loose attitude to enforcing them. So you end up with loads of variation in whitespace, bracket placement, etc. Eventually people stop caring. It’s not a problem worthy of a lot of brain cycles and we are all busy.

Anyway, I finally found a solution to this problem that is completely unobtrusive: format source code as part of your build. Simply add the following blurb to your maven build section and save some formatter settings in XML format in your source tree. It won’t fix all your issues, but formatting related diffs should be a thing of the past. Either your code is fine, in which case it will pass the formatter unmodified, or you messed up, in which case the formatter will fix it for you.

<plugin><!-- mvn java-formatter:format -->
    <groupId>com.googlecode.maven-java-formatter-plugin</groupId>
    <artifactId>maven-java-formatter-plugin</artifactId>
    <version>0.4</version>
    <configuration>
        <configFile>${project.basedir}/inbot-formatter.xml</configFile>
    </configuration>

    <executions>
        <execution>
            <goals>
                <goal>format</goal>
            </goals>
        </execution>
    </executions>
</plugin>

This plugin formats the code using the specified formatter settings XML file, and it executes on every build, before compilation. You can create the settings file by exporting the Eclipse code formatter settings. IntelliJ users can use these settings as well since recent versions support the Eclipse formatter settings file format. The only thing you need to take care of is the organize imports settings in both IDEs. Eclipse comes with a default configuration that is very different from what IntelliJ does, and it is a bit of a pain to fix on the IntelliJ side. Eclipse has a notion of import groups that are each sorted alphabetically. It comes with four of these groups that represent imports with different prefixes, so javax.* and java.* and so on end up in different groups. This behavior is very tedious to emulate in IntelliJ and out of the scope of the exported formatter settings. For that reason, you may want to consider modifying things on the Eclipse side: remove all groups and simply sort all imports alphabetically. This behavior is easy to emulate in IntelliJ, and you can configure both IDEs to organize imports on save, which is good practice. Also, make sure not to allow .* imports and only import what you actually use (why load classes you don’t need?). If everybody does this, the only people causing problems will be those with poorly configured IDEs, and their code will get fixed automatically over time.

Anyone doing a mvn clean install to build the project will automatically fix any formatting issues that they or others introduced. The formatter can also be configured conservatively: set it up right and it won’t mess up things like manually added newlines and other manual formatting that you typically want to keep. But it will fix the small issues, like using the right number of spaces (or tabs, depending on your preferences) and having whitespace around brackets, braces, etc. The best part: it only adds about a second to your build time. So you can set this up and it basically just works, in a way that is completely unobtrusive.

Compliance problems introduced by people with poor IDE configuration skills or a relaxed attitude to code conventions (you know who you are) will automatically get fixed this way. Win-win. There’s always the odd developer who insists on using vi, emacs, notepad, or something similarly archaic that most IDE users would consider cruel and unusual punishment. Not a problem anymore; let them. These masochists will notice that whatever they think is correctly formatted Java might cause the build to create a few diffs on their edits. Ideally, this happens before they commit. And if not, you can yell at them for committing untested code: no excuses for not building your project before a commit.

Jruby & UTF-8

I just spent a day trying to force jruby to do the right things with UTF-8 content. Throughout my career, I’ve dealt with UTF-8 issues on pretty much any project I’ve ever touched. It seems that world+dog just can’t be trusted to do the right things when it comes to this. If you ever see mangled content in your browser: somebody fucked it up.

Lately, I’ve been experiencing issues with both requests and responses in our Sinatra-based application on top of ruby rack. Despite setting headers correctly and ensuring I hardcode everything to be UTF-8 end to end, it still mangles properly encoded UTF-8 content by applying a system default encoding both on the way in and on the way out of my system. Let’s just say I’ve been cursing a lot in the past few days. In terms of WTFs per second, I was basically not being very nice.

Anyway, Java has a notion of a default encoding that is set (inconsistently) from the environment you start it in and that you can only partially control with the file.encoding system property. On Windows and OS X this is not going to be UTF-8. That, combined with Ruby’s attitude of being generally sloppy about encodings and just hoping for the best, is basically a recipe for trouble. Ruby strings just pick up whatever system encoding is the default. Neither rack nor Sinatra tries to align that in any way with the http headers it deals with.

When you start jruby using the provided jruby script, it actually sets the file.encoding system property to UTF-8. However, this does not affect java.nio.charset.Charset.defaultCharset(), which JRuby uses a lot. IMHO that is a bug, and a serious one. Let me spell this out: any code that relies on java.nio.charset.Charset.defaultCharset() is broken. Unless you explicitly want to support legacy content that is not in UTF-8, in which case you should be very explicit about exactly which of the gazillion encodings you want. The chances of that aligning with the defaults are slim.
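
To see why this bites, here’s a minimal sketch (my own illustration, not JRuby code) showing how file.encoding and the default charset can disagree, plus the safe pattern of passing the charset explicitly everywhere:

import java.io.PrintStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetCheck {
    public static void main(String[] args) throws Exception {
        // these two can disagree: file.encoding is just a system property, while
        // defaultCharset() is determined once at JVM startup from the environment,
        // so setting the property afterwards has no effect on it
        System.out.println("file.encoding:  " + System.getProperty("file.encoding"));
        System.out.println("defaultCharset: " + Charset.defaultCharset());

        // the safe pattern: never rely on the default, always pass the charset
        byte[] utf8 = "héllo wörld".getBytes(StandardCharsets.UTF_8);
        String decoded = new String(utf8, StandardCharsets.UTF_8);

        // even System.out uses the default encoding unless you wrap it explicitly
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        out.println(decoded);
    }
}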

This broke in a particularly weird way for me: it was working fine on OS X (with its broken system defaults) but broke on our server, which runs Ubuntu 12.04 LTS. If you’ve ever used ubuntu, you may have noticed that terminals use UTF-8 by default. This is great; all operating systems should do this. There’s one problem though: it’s not actually system wide, and my mistake was assuming that it was. It’s fine when you log in and use a terminal. However, when starting jruby from an init.d script with a sudo, the encoding reverts to some shit ansi default from the nineteen sixties that is 100% guaranteed inappropriate for any modern web server. This default is then passed on to the jvm, which causes jruby with rack to do the wrong things.

The fix: add this to your script:

export LC_ALL=en_US.UTF-8
export LANG=en_US.UTF-8
export LANGUAGE=en_US.UTF-8

This forces the encoding to be correct in the bash script and should hopefully trickle down through the broken pile of software that just assumes the defaults make sense.

Jruby and Java at Localstream

Update. I’ve uploaded a skeleton project about the stuff in this post to Github. The presentation that I gave on this at Berlin Startup Culture on May 21st can be found here.

The server side code in the Localstream platform is a mix of Jruby and Java code. Over the past few months, I’ve gained a lot of experience using the two together and making the most of idioms and patterns in both worlds.

Ruby purists might wonder why you’d want to use Java at all. Likewise, Java purists might wonder why you’d waste your time on Jruby instead of more hipster friendly languages such as Scala, Clojure, or Kotlin. In this article I want to steer clear of that particular topic and instead focus on more productive things, such as what we use for deployment, dependency management, dependency injection, configuration, and logging. It is also an opportunity to introduce two of my new Github projects:

The Java ecosystem provides a lot of good, very well supported technology. This includes the jvm itself but also libraries such as Google’s guava, misc. Apache frameworks such as httpclient, commons-lang, commons-io, and commons-compress, the Spring framework, icu4j, and many others. Equivalents exist for Ruby, but mostly those equivalents leave a lot to be desired in terms of features, performance, design, etc. It didn’t take me long to conclude that a lot of the ruby stuff out there is sub-standard and not up to my expectations. That’s why I use Jruby instead of Ruby: it gives me the best of both worlds. The value of ruby is its simplicity and the language. The value of Java is access to an enormous amount of good software. Jruby allows me to have both.

Continue reading “Jruby and Java at Localstream”

Puppet

I recently started preparing to deploy our localstre.am codebase to an actual server. That means I’m currently having a lot of fun picking up some new skills and playing system administrator (while being aware of my shortcomings there). World plus dog seems to recommend using either Chef or Puppet for this, and since I know a few good people who are into puppet in a big way and since I’ve seen it used at Nokia, I chose the latter.

After getting things to a usable state, I wanted to share a few observations. They come from my background of having engineered systems for some time and having developed a general gut feeling about whether stuff is acceptable or not.

  • The puppet syntax is weird. It’s a so-called ruby DSL that tries to look a bit like json and only partially succeeds. So you end up with a strange mix of : and => depending on the context. I might be nitpicking here, but I don’t think this is a particularly well designed DSL. It feels more like a collection of evolved conventions for naming directories and files that happen to be backed by ruby. The conventions, naming, etc. are mostly nonsensical. For example, the puppet notion of a class is misleading. It’s not a class: it doesn’t have state, you don’t instantiate it, and no objects are involved. It’s closer to a ruby module. But in puppet, a module is actually a directory of puppet stuff (so more like a package). Ruby leaks through in lots of places anyway, so why not stick with the conventions of that language? For example, by using gems instead of coming up with your own convoluted notion of a package (aka module in puppet). It feels improvised and cobbled together, almost like the people who built this had no clue what they were doing and changed their minds several times. Apologies for being harsh if that offended you, BTW ;-).
  • The default setup of puppet (and chef) assumes a lot of middleware that doesn’t make any sense whatsoever for most smaller deployments (or big deployments, IMNSHO). Seriously, I don’t want a message broker anywhere near my servers any time soon, especially not ActiveMQ. The so-called masterless (puppet) or solo (chef) setups are actually much more appropriate for most people. They are simpler and have fewer moving parts. That’s a good thing when it comes to deployments.
  • It tries to be declarative. This is mostly a good thing, but sometimes it is just nice to have the order of things follow implicitly from the order in which you specify them. Puppet forces you to be explicit about order and thus ends up being very verbose about it. Most of that verbosity is quite pointless. Sometimes A really comes before B simply because I specified them in that order in one file.
  • It’s actually quite verbose compared to the equivalent bash script when it comes to basic stuff like starting a service or copying a file from a to b. Sometimes a “cp foo bar; chmod 644 bar” just does the trick. It kind of stinks that you end up with five line blobs for doing simple things like that. Why make it so tedious?
  • Like maven and ant in the Java world, it tries to be platform agnostic but only partially succeeds. A lot of platform dependencies creep in anyway, and puppet templates are generally not very portable. Things like package names, file locations, service names, etc. end up being platform specific anyway.
  • Speaking of which, puppet is more like ant than like maven. Like ant, all puppet does is provide the means to do different things. It doesn’t provide a notion of a sensible default way of doing things that you then customize, which is what maven does. Not that I’m a big fan of maven, but with puppet you basically have to babysit the deployment and worry about all the silly little details that are (or should be) bog standard between deployments: creating users, setting up & managing ssh keys, ensuring processes run with the appropriate restrictions, etc. This is a lot of work, and like a lot of IT stuff it feels repetitive and is thus a good candidate for automation. Wait … wasn’t puppet supposed to be that solution? The puppet module community provides some reusable stuff, but it’s just bits and pieces really, and not nearly enough for a sensible production ready setup for even the simplest of applications. It doesn’t look like I could get much value out of that community.

So, I think puppet at this stage is a bit of a mixed bag, and I still have to do a lot of work to actually produce a production ready system. Much more than I think is justified by the simplicity of the real world setups I’ve seen in the wild. Mostly, running a ruby or java application is not exactly rocket science. So why exactly does this stuff continue to be so hard & tedious despite a multi billion dollar industry trying to fix it for the last 20 years or so?

I don’t think puppet is the final solution in devops automation. It is simply too hard to do things with puppet and way too easy to get it wrong as well. There’s too much choice, a lack of sensible policy, and way too many pitfalls. It being an improvement at all merely indicates how shit things used to be.

At this point, puppet feels more like a tool to finish, in arbitrary ways, the job that linux distributors apparently couldn’t be bothered to do than like a tool for producing reliable & reproducible production quality systems. I could really use a tool that does the latter without the drama and attitude. What I need is a sensible out of the box experience for the following use case: here’s a war file, deploy it on those servers.

Anyway, I started puppetizing our system last week and have gotten it to the point where I can boot a bunch of vagrant virtual machines with the latest LTS ubuntu and have them run localstre.am in a clustered setup. Not bad for a week of tinkering, but I’m pretty sure I could have gotten to that point without puppet as well (possibly even sooner). And I still have a lot of work to do to set up a wide range of things that I would have hoped were solved problems (logging, backups, firewalls, chroot, load balancing a bog standard stateless http server, etc.). Most of this falls in the category of non value adding stuff that somebody simply has to do. Given that we are a two person company and I’m the CTO/server guy, that would be me.

I of course have the benefit of hindsight from my career at Nokia, where I watched Nokia waste/invest tens of millions over a few years on deploying simple bog standard Java applications (mostly) to servers. It seems simple things like “given a war file, deploy the damn thing to a bunch of machines” get very complicated when you grow the number of people involved. I really want to avoid needing a forty person ops team to do stupid shit like that.

So, I cut some corners. My time is limited and my bash skills are adequate enough that I basically only use puppet to get the OS into a usable enough state that I can hand off to a bash script, which does the actual work of downloading, untarring, chmodding, etc. needed to get our application running. Not very puppet like, but at least it gets the job done in 40 lines of code or so without intruding too much on my time. In those 40 lines, I install the latest sun jdk (tar ball), the latest jruby (another tar ball), our code base, and the two scripts that start elastic search and our jetty/mizuno based app server.

What would be actually useful is reusable machine templates for bog standard things like php and ruby capable servers, java tomcat servers, apache load balancers, etc. with sensible hardened configurations, logging, monitoring, and so on. The key benefit would be inheriting from a sensible setup and only changing the bits that actually need changing. It seems that is too much to ask for at this point, and consequently hundreds of thousands of system administrators (or the more hipster devops, if you are so inclined) continue to be busy creating endless minor variations of the same systems.

Using Java constants in JRuby

I’ve been banging my head against the wall for a while before finding a solution that doesn’t suck for a seemingly simple problem.

I have a project with mixed jruby and java code. In my Java code I have a bunch of string constants. I use these in a few places for type safety (and typo safety). Nothing sucks more than spending a whole afternoon debugging a project only to find that you swapped two letters in a string literal somewhere. Just not worth my time. Having run into this problem with ruby a couple of times now, I decided I needed a proper fix.

So, I want to use the same constants I have in Java in jruby. Additionally, I don’t want to litter my code with Fields:: prefixes. In java I would use static imports. In ruby you can use includes, except that doesn’t quite work for java constants. Also, I want my ruby code to complain loudly when I make a typo in any of these constants. So ruby constants don’t solve my problem, and ruby symbols don’t solve it either (not typo proof). Attribute accessors are kind of incompatible with modules (you need a class for that). So my idea was to simply add methods to the module dynamically.
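
For context, the Java side might look something like this hypothetical sketch (the actual my.package.Fields class isn’t shown in this post, and the constant values here are made up to match the usage example below):

// Hypothetical sketch of the Java constants class imported from jruby below;
// "my.package" isn't a legal Java package name, so an illustrative one is used
package com.example;

public final class Fields {
    private Fields() {
        // constants only, not meant to be instantiated
    }

    // the values are field names that double as ruby method names below
    public static final String EMAIL = "email";
    public static final String DISPLAY_NAME = "display_name";
}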

So, I came up with the following which allows me to define a CommonFields module and simply include that wherever I need it.

require 'jbundler'

java_import 'my.package.Fields'

# Hack to get some level of type safety: include the CommonFields module
# wherever you need to use field names, and then you can simply use the
# defined fields as if they were methods that return a string.
module CommonFields
  # fields that only exist on the ruby side
  @ruby_specific_fields = [
    :foooo,
    :bar
  ]

  @ruby_specific_fields.each do |field|
    # define a method with the same name as the field that returns its name
    CommonFields.send(:define_method, field.to_s) do
      field.to_s
    end
  end

  # add a method for each constant defined in the Java Fields class, using
  # the constant's value (a field name) as the method name
  Fields.constants.each do |field|
    CommonFields.send(:define_method, Fields.const_get(field)) do
      Fields.const_get(field)
    end
  end
end

include CommonFields

# email and display_name come from the Java constants; foooo is ruby specific
puts foooo, email, display_name

Basically, all this does is add the Java constants (email and display_name are two of them) to the module dynamically when you include it. After that, you just use the constant names and it just works ™. I also added a list of ruby symbols so I can have fields that don’t exist yet on the java side. This works pretty OK, and I’m hoping jruby does the right things with respect to inlining the method calls. Most importantly, any typo will lead to a NameError about the method not existing. This is good enough, as it will cause tests to fail and be highly visible, which is what I wanted.

Pretty neat, and I guess you could easily extend this approach to implement proper enums as well. I spotted a few enum like implementations, but they all suffer from the prefix verbosity problem I mentioned. The only remaining (minor) issue is that ruby treats upper case identifiers as constants, and my Java constant names are upper case as well. Given that I want to turn them into ruby methods, using the names directly is not going to work. Luckily, in my case the constant values are actually camel cased field names that I can simply use as the method name in ruby. So that’s what I did here. I guess I could have lower cased the names as well.

I’m relatively new to ruby so I was wondering what ruby purists think of this approach.

GeoTools

In the past few weeks I’ve been working on a little project on Github called GeoTools that you might be interested in if you are into geospatial stuff.

GeoTools is a set of tools for manipulating geo hashes and geometric shapes in the wgs 84 coordinate system. A geo hash is a representation of a coordinate that interleaves the bit representations of the latitude and longitude and base32 encodes the result. This string representation has a very useful property: nearby coordinates share a prefix. As is observed in this blog post: http://blog.notdot.net/2009/11/Damn-Cool-Algorithms-Spatial-indexing-with-Quadtrees-and-Hilbert-Curves, geo hashes effectively encode the path to a leaf in a quad tree.
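
To make the interleaving concrete, here is a minimal sketch of geo hash encoding in plain Java (my own illustration of the well known algorithm, not necessarily how GeoHashUtils implements it):

public class GeoHashExample {
    // the geo hash base32 alphabet (no a, i, l, or o)
    private static final String BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

    public static String encode(double latitude, double longitude, int length) {
        double minLat = -90, maxLat = 90, minLon = -180, maxLon = 180;
        StringBuilder hash = new StringBuilder();
        boolean evenBit = true; // even bits encode longitude, odd bits latitude
        int bit = 0;
        int ch = 0;
        while (hash.length() < length) {
            // halve the range and emit a 1 if the coordinate is in the upper half
            if (evenBit) {
                double mid = (minLon + maxLon) / 2;
                if (longitude >= mid) {
                    ch = (ch << 1) | 1;
                    minLon = mid;
                } else {
                    ch = ch << 1;
                    maxLon = mid;
                }
            } else {
                double mid = (minLat + maxLat) / 2;
                if (latitude >= mid) {
                    ch = (ch << 1) | 1;
                    minLat = mid;
                } else {
                    ch = ch << 1;
                    maxLat = mid;
                }
            }
            evenBit = !evenBit;
            if (++bit == 5) { // every five bits become one base32 character
                hash.append(BASE32.charAt(ch));
                bit = 0;
                ch = 0;
            }
        }
        return hash.toString();
    }

    public static void main(String[] args) {
        // nearby coordinates share a prefix
        System.out.println(encode(52.5200, 13.4050, 9));
        System.out.println(encode(52.5206, 13.4098, 9));
    }
}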

The algorithms used for implementing the functionality in GeoTools are mostly well known and commonly used in 2D graphics. However, I have not found any comprehensive open source implementation for the wgs 84 coordinate system.

The key functionality that motivated the creation of this library was the ability to cover shapes on a map with geo hashes for indexing purposes. To support that, I needed some elementary geometric operations implemented.

Features

  • GeoGeometry class with methods that allow you to:
    • calculate the distance between two coordinates using the haversine algorithm (see the sketch after this list)
    • check bounding box containment for a point
    • check polygon containment for a point
    • get the center of a polygon
    • get the bounding box of a polygon
    • convert a circle to a polygon
    • create a polygon from a point cloud
    • translate a wgs84 coordinate by x & y meters along the latitude and longitude
  • GeoHashUtils class with methods that allow you to:
    • check containment of a point in a geo hash
    • find the bounding box of a geo hash
    • find neighboring geo hashes east, west, south, or north of a geo hash
    • get the 32 sub geo hashes of a geo hash, or the north/south halves, or the NE, NW, SE, SW quarters
    • convert a geo hash to a BitSet and a BitSet to a geo hash
    • cover lines, paths, polygons, or circles with geo hashes
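
As promised in the list above, a hedged sketch of the haversine distance calculation (my own illustration, not necessarily GeoGeometry’s exact implementation; it assumes a spherical earth with a mean radius of 6371 km):

public class HaversineExample {
    private static final double EARTH_RADIUS_METERS = 6371000.0;

    // great circle distance in meters between two wgs 84 coordinates
    public static double distance(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return EARTH_RADIUS_METERS * 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a));
    }

    public static void main(String[] args) {
        // Berlin to Amsterdam, roughly 576 km
        System.out.println(distance(52.52, 13.405, 52.37, 4.895) / 1000 + " km");
    }
}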

I’ve deliberately kept the design simple and non object oriented. These classes have no external dependencies and only use the java.util package and the java.lang.Math class. Consequently, it should be easy to port this functionality to other languages. Also, I’ve not attempted to implement Point, Polygon, Path, Circle, or other classes to support this library. The reason for this is very simple: these things are commonly implemented in other frameworks, and any attempt from my side to impose my own implementation would conflict with the need of others to reuse their own classes. I think it is somewhat of a design smell that world + dog feels compelled to implement their own Point class.

So instead, a point is represented as a latitude, longitude pair of doubles or as an array of two doubles (like in geo json). Likewise, paths and polygons are represented as arrays of points. A circle is a point and a radius. This should enable anyone to integrate this functionality easily.
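
For example (class name and values are just illustrative):

// Illustration of the dependency-free representations described above;
// the library itself deliberately defines no Point or Polygon classes
public class Representations {
    public static void main(String[] args) {
        double[] point = { 52.5200, 13.4050 };  // a point: latitude, longitude
        double[][] polygon = {                  // a polygon: an array of points
            { 52.0, 13.0 }, { 52.0, 14.0 }, { 53.0, 14.0 }, { 53.0, 13.0 }
        };
        double[] circleCenter = { 52.52, 13.40 };
        double circleRadius = 1000.0;           // a circle: a point plus a radius in meters
        System.out.println(point.length + " " + polygon.length + " "
                + circleCenter.length + " " + circleRadius);
    }
}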

Jsonj: a new library for working with JSon in Java

Update 11 July 2011: I’ve done several commits on github since announcing the project here. New features have been added; bugs have been fixed; you can find jsonj on maven central now as well; etc. In short, head over to github for the latest on this.

I’ve just uploaded a weekend project to github. So, here it is, jsonj. Enjoy.

If you read my post on json a few weeks ago, you may have guessed that I’m not entirely happy with the state of json in Java relative to other languages that come with native support for json. I can’t fix that entirely but I felt I could do a better job than most other frameworks I’ve been exposed to.

So, I sat down on Sunday and started pushing this idea of taking the Collections framework and combining it with the design of the Json classes in GSon (which I use at work currently), throwing in some useful ideas that I’ve applied at work. The result is a nice, compact little framework that does most of what I need it to do. I will probably add a few more features and expand some of the existing ones. I use some static methods at work that I can probably redo in a much nicer way in this framework.
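
The core idea, in a hedged sketch (my illustration of the design direction, not jsonj’s actual API):

import java.util.LinkedHashMap;

// Sketch of the design direction: if the json object type simply is a Map,
// the whole Collections framework works on it for free
public class SketchJsonObject extends LinkedHashMap<String, Object> {
    // a fluent put makes building nested structures less painful
    public SketchJsonObject with(String key, Object value) {
        put(key, value);
        return this;
    }

    public static void main(String[] args) {
        SketchJsonObject o = new SketchJsonObject()
                .with("name", "jsonj")
                .with("weekend_project", true);
        // inherited Map methods come for free
        System.out.println(o.containsKey("name") + " " + o.size());
    }
}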

Continue reading “Jsonj: a new library for working with JSon in Java”

On Java, Json, and complexity

Time for a long overdue Saturday morning blog post.

Lately, I’ve been involved in some interesting new development at work that is all about key value stores, Json, Solr, and a few other technologies I might have mentioned a few times. In other words, I’m having loads of fun at work doing stuff I really like. We’ve effectively tossed out XML and SQL and are now all about Json blobs that live in KVs and are indexed in Solr. Nothing revolutionary, except it is if you have been exposed to the wonderfully bloated and complicated world of Java Enterprise Edition and associated technologies. I’ve found that there is a bunch of knee-jerk reflexes that cause Java developers to be biased toward doing all the wrong things here. This is symptomatic of the growing gap between the enterprise Java people on one side and the cool hip kids wielding cool new languages and tools on the other.

Continue reading “On Java, Json, and complexity”

Git and agile

I’ve been working with Subversion since 2004 (we used a pre 1.0 version at GX). I started hearing about git around the 2006-2007 time frame, when Linus Torvalds’ replacement for Bitkeeper started maturing enough for other people to use it. In 2008 I met people in Nokia working on Maemo (the Debian based OS for the N770, N800, N810, and recently the N900) who were really enthusiastic about it. They had to use it to work with all the upstream projects Maemo depends on, and they loved it. When I moved to Berlin, everybody there was using subversion, so I just conformed and ignored git/mercurial and all those other cool versioning systems for an entire year. It turns out that was lost time; I should have switched around 2007/2008. I’m especially annoyed by this because I’ve been aware since 2006 that decentralized versioning is superior to centralized versioning. If you don’t believe me: I had a workshop paper at SPLC 2006 on version management and variability management that pointed out the emergence of DVCSes in that context. I’ve wasted at least three years. Ages for the early adopter type of guy I still consider myself to be.

Anyway, after weighing the pros and cons for way too long, I switched from subversion to git last week. What triggered me to do this was, oddly, an excellent tutorial on Mercurial by Joel Spolsky. Nothing against Mercurial, but Git has the momentum in my view, and it definitely appears to be the bandwagon to jump on right now. I don’t see any big technical argument for using Mercurial instead of Git, and there’s github and no mercurial hub as far as I know. So, I took Joel’s good advice on Mercurial as a hint that it was time to get off my ass and get more serious about switching to anything other than Subversion. I had already decided in favor of git based on stuff I’d been reading on both versioning systems.

My colleagues of course haven’t switched (yet, mostly), but that is not an issue thanks to git-svn, which allows me to interface with svn repositories. I’d like to say making the switch was an easy ride, except it wasn’t. The reason is not git but me. Git is a powerful tool that has quite a few more features than Subversion. Martin Fowler has a nice diagram of “recommendability” versus “required skill”. Git is in the top right corner (highly recommended but you’ll need to learn some new skills) and Subversion is in the lower right (recommended, not much skill needed). The good news is that you need only a small subset of commands to cover the feature set provided by svn, and you can gradually expand what you use from there. Even with this small subset, git is worth the trouble IMHO, if only because world + dog is switching. The bad news is that you will just have to sit down and spend a few hours learning the basics. I spent a bit more than I planned to on this, but in the end I got there.

I should have switched around 2007/2008

The mistake I made that caused me to delay the switch for years was not realizing that git adds loads of value even when your colleagues are not using it: you will be able to collaborate more effectively even if you are the only one using git! There are two parts to my mistake.

The first part is that the whole point of git is branching. You don’t have a working copy, you have a branch. It’s exactly the same with git-svn: you don’t have a svn working copy but a branch forked off svn trunk. So what, you might think. Git excels at merging between branches. With svn, branching and merging is painful, so instead of having branches and merging between them, you avoid conflicts by updating often and committing often. With git-svn, you don’t update from svn trunk, you merge its changes into your local branch. You are working on a branch by default, and creating more than one is really not something to be scared of. It’s painless, even if you have a large amount of uncommitted work (which would get you in trouble with svn). Even if that work includes renaming the top level directories in your project (I did this). Even if other people are doing big changes in svn trunk. That’s a really valuable feature to have around. It means I can work on big changes to the code without having to worry about upstream svn commits. The type of changes nobody dares to take on because it would be too disruptive to deal with branching and merging and because there are “more important things” to do and we don’t want to “destabilize” trunk. Well, not any more. I can work on changes locally on a git branch for weeks if needed and push the result back to trunk when it is ready, while at the same time my colleagues and I keep committing big changes on trunk. The reason I’m so annoyed right now is that the time I spent resolving svn conflicts in the past four years was essentially unnecessary. Not switching four years ago was a big mistake.

The second part of my mistake was assuming I needed IDE support for git to be able to deal with refactoring, particularly class renames (which I do all the time in Eclipse). While there is egit now, it is still pretty immature. It turns out that assuming I needed Eclipse support was a false assumption. If you rename a file in a git repository and commit it, Git will automatically figure out that the file was renamed; you don’t need to tell it. A simple “mv foo.java bar.java” will work. On directories too. This is a really cool feature. So I can develop in eclipse without it even being aware of any git specifics, refactor and rename as much as I like, and git will keep tracking the changes for me. Even better, certain types of refactorings that are quite tricky with subclipse and subversive just work in git. I’ve corrupted svn work directories on several occasions when trying to rename packages and move stuff around. Git handles this effortlessly. Merges work so well because git can handle the situation where a locally renamed file needs changes from upstream merged into it. It’s a core feature, not an argument against using it. My mistake. I probably spent even more time on corrupted svn directories than on conflict resolution in the last three years.

Git is an Agile enabler

We have plenty of pending big changes and refactorings that we have been delaying because they are disruptive. Git allows me to work on these changes whenever I feel like it without having to finish them before somebody else starts introducing conflicting changes.

This is not just a technical advantage; it is a process advantage as well. Subversion forces you to serialize change so that you minimize the interactions between the changes. That’s another way of saying that subversion is all about waterfall. Git allows you to decouple changes instead and parallelize the work more effectively. Think multiple teams working on the same code base on unrelated changes. Don’t believe me? The linux kernel community has thousands of developers from hundreds of companies working on the same code base, touching large portions of the entire source tree. Git is why that works at all and why they push out stable releases every 6 weeks. Linux kernel development speed is measured in thousands of lines of code modified or added per day. Evaluating the incoming changes every day is a full time job for several people.

Subversion is causing us to delay necessary changes, i.e. changes that we would prefer to make if only they weren’t so disruptive. Delayed changes pile up to become technical debt. Think of git as a tool to manage your technical debt: you can work on business value adding changes (and keep the managers happy) and on disruptive changes at the same time, without the two interfering. In other words, you can be more agile. Agile has always been about technical enablers (refactoring tooling, unit testing frameworks, continuous integration infrastructure, version control, etc.) as much as it was about process. Having the infrastructure to do rapid iterations and release frequently is critical to the ability to release every sprint. You can’t do one without the other. Of course, tools don’t fix process problems. But then, process tends to be about workarounds for lacking tools as well. Decentralized version management is another essential tool in this context. You can compensate for not using it with process, but IMHO life is too short to play bureaucrat.

Not an easy ride

But as I said, switching from svn to git wasn’t a smooth ride. Getting familiar with the various git commands and how they differ from what I am used to in svn has taken some time, despite the fact that I understand how git works and how I am supposed to use it. I’m a git newbie and I’ve been making lots of beginner mistakes (mainly using the wrong git commands for the things I was trying to do). The good news is that I managed to get some pretty big changes committed back to the central svn repository without losing any work (which is the point of version management). The bad news is that I got stuck several times trying to figure out how to rebase properly, how to undo certain changes, and how to recover a messed up checkout on top of my local work directory from the local git repository. In short, I learned a lot, and I still have some more things to learn. On the other hand, I can track changes from svn trunk, have local topic branches, merge from those to the local git master, and dcommit back to trunk. That about covers all my basic needs.