Agile methods

Another interesting article by Martin Fowler: The New Methodology (and again I’m several months late). The predecessor of this article in 2000 was equally interesting. In fact it still is. If you have the time, read both of them.

Martin Fowler, Kent Beck, Erich Gamma and other people of that generation have greatly influenced the way I think about software engineering. I’ve been a scholar in software engineering and currently my work at Nokia also involves a great deal of software engineering research. I see these guys as the pragmatic force driving the industry forward. While we scholars are babbling about Model Driven Architectures, Architecture Description Languages, etc. in our ivory towers, these guys are all “stuff and no fluff”. It’s a good thing to consider once in a while that many ideas and concepts from the software engineering research community have done, and continue to do, a lot of damage. For example, the waterfall model (the mother of all software development methodologies) is still being taught in universities. Out-of-touch SE teachers still teach their students to do things one after the other (instead of iteratively).

The original papers on the waterfall model, iterative development and related methodology from the seventies are an interesting read. Their authors had a lot more vision than past and present proponents of these methods. But they are just that: the vision of software developers coming out of the cold in the seventies. We’ve learned a lot since.

If there’s one thing you can learn from Martin Fowler, Kent Beck or Alistair Cockburn, it is that you should never ever implement their methodology to the letter. If you are doing so, you didn’t get it. Agile is all about change, including changing the way you work on a permanent basis. The article I’m citing argues this in Martin Fowler’s usual clear fashion. Go read it already.

Intentional Software

From a very interesting article by Martin Fowler:

I’ve had the opportunity to spend a little time with Intentional Software and several of my colleagues at ThoughtWorks have worked closely with Intentional over the last year or so. As a result I’ve had the opportunity to peek behind the Intentional curtain – although I’m restricted in how much I can say about what I saw there. Fortunately they intend to start opening up about their work over the next year or so.

Whoohoo! I’ve been following the developments around this company for a few years now. Charles Simonyi is mostly known for being one of the influential architects at Microsoft responsible for creating and popularizing such things as WYSIWYG editing, Excel and Hungarian notation. Simply put, the guy is brilliant. A brilliant guy with a vision: intentional programming/software. Having worked for Microsoft from early on, he is one of the richer Microsoft millionaires/billionaires ™. A few years ago he quit his job at Microsoft to start his own company called Intentional Software. Before that he published a few articles on intentional programming which, frankly, include some ideas that are way beyond the imagination of the average C/C++/Java/C#/whatever programmer. While these guys fight over such petty things as syntax, he made it a first-class entity in his programming environment. Intentional programming is all about translating intentions to working code. If doing C++ style templates is your intention, define them in the core constructs provided by the intentional programming environment and write them.
But what am I doing, trying to summarize Martin Fowler’s excellent article in one paragraph? Go read his article. It will take you some time but it will be time well spent.

I’ve been a long time fan of Martin Fowler. I should check his site more often.

-Ofun

I found this article rather insightful: -Ofun

I agree with most of it. Many software projects (commercial, OSS, big and small) have strict guidelines with respect to write access to source repositories and the usage of these rights. As the author observes, many of these restrictions find their roots in the limited ability of legacy revision control systems to roll back undesirable changes and to merge sets of coherent changes, and not in any inherent process advantages (like enforcing reviews or preventing malicious commits). Consequently, this practice restricts programmers in their creativity.

Inviting creative developers to commit on a source repository is a very desirable thing. It should be made as easy as possible for them to do their thing.

On more than one occasion I have spent some time looking at source code from some OSS project (to figure out what was going wrong in my own code). Very often my hands start to itch to make some trivial changes (refactor a bit, optimize a bit, add some functionality I need). In all of these cases I ended up not doing these changes because committing the change would have required a lengthy process involving:
– get on the mailing list
– figure out who to discuss the change with
– discuss the change to get permission to send the change to this person
– wait for the person to accept/reject the change

This can be a lengthy process, and up front you already feel guilty about contacting the person about this trivial change, given your limited knowledge of the system. In short, the size of the project and its members scare off any interested developers except the ones determined to get their change in.

What I’d like to do is this:
– Check out Tomcat (I work with Tomcat a lot; fill in your favorite OSS project)
– Make some change I think is worthwhile having without worrying about consequences, opinions of others, etc.
– Commit it with a clear message why I changed it.
– Leave it to the people who run the project to laugh away my ignorance or accept the change as they see fit.

If the Apache people don’t want the change, fine. Undo it, don’t merge, whatever. But don’t restrict people’s right to suggest changes/improvements in any kind of way. If you end up rejecting 50% of the commits, that means you still got 50% useful stuff. The reviewing and merging workload can be distributed among people.
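To make the "commit first, review afterwards" idea a bit more concrete, here is a minimal Python sketch. It is not how Apache or any real version control system works; the `Commit` class, the `useful` flag (a stand-in for a reviewer's verdict), the reviewer names and the round-robin assignment are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class Commit:
    author: str
    message: str
    useful: bool  # hypothetical stand-in for the verdict a reviewer would reach


def review_round_robin(commits, reviewers):
    """Hand each commit to the next reviewer in turn and record the verdict."""
    accepted, rejected = [], []
    workload = {name: 0 for name in reviewers}
    for commit, reviewer in zip(commits, cycle(reviewers)):
        workload[reviewer] += 1
        (accepted if commit.useful else rejected).append(commit)
    return accepted, rejected, workload


# Hypothetical drive-by commits from someone without any standing in the project.
commits = [
    Commit("visitor", "Refactor a helper method", True),
    Commit("visitor", "Experimental change that breaks the build", False),
    Commit("visitor", "Add a missing null check", True),
    Commit("visitor", "Optimization that changes behaviour", False),
]
accepted, rejected, workload = review_round_robin(commits, ["alice", "bob"])
print(len(accepted), "accepted,", len(rejected), "rejected, workload:", workload)
```

Even with half of the commits rejected, the project keeps the other half, and no single reviewer carries the whole load.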

In my current job (for GX, the company that I am about to leave), I am the release manager. I am the guy in charge of the source repositories of the entire GX product line. I’d like to work as outlined above but we don’t. Non-product developers in the company need to contact me by mail if they want to get their changes in. Some of them do, most of them don’t. I’m convinced that I’d get a lot of useful changes otherwise. We use Subversion, which is nice but not very suitable for the way of working outlined above and in the article I quoted. Apache also uses Subversion, so I can understand why they don’t want to give people like me commit rights just like that.

So why is this post labelled as software engineering science? Well, I happen to believe that practice is ahead of the academic community (of which I am also a part) in some things. Practitioners have a fine nose for tools and techniques that work really well. Academic software engineering researchers don’t, for a variety of reasons:
– they don’t engineer that much software
– very few of them develop at all (I do, I’m an exception)
– they are not very familiar with the tools developers use

In the past two years in practice I have learned a number of things:
– version control is key to managing large software projects. Everything in a project revolves around putting stuff in and getting stuff out of the repository. If you didn’t commit it, it doesn’t exist. Committing it puts it on the radar of people who need to know about it.
– Using branches and tags is a sign the development process is getting more mature. It means you are separating development from maintenance activities.
– Doing branches and tags on the planned time and date is an even better sign: things are going according to some plan (i.e. this almost looks like engineering).
– Software design is something non software engineers (including managers and software engineering researchers) talk about, a lot. Software engineers are usually too busy to bother.
– Consequently, little software actually gets designed in the traditional sense of the word (create important looking sheets of paper with lots of models on them).
– Instead two or three developers get together for an afternoon and lock themselves up with a whiteboard and a requirements document to take the handful of important decisions that need to be taken.
– Sometimes these decisions get documented. This is called the architecture document.
– Sometimes a customer/manager (same thing really) asks for pretty pictures. Only in those cases is a design document created.
– Very little new software gets built from scratch.
– The version repository is the annotated history of the software you are trying to evolve. If important information about design decisions is not part of the annotated history, it is lost forever.
– Very few software engineers bother annotating their commits properly.
– Despite the benefits, version control systems are very primitive systems. I expect much of the progress in development practice in the next few years to come from major improvements in version control systems and the way they integrate into other tools such as bug tracking systems and document management systems.

Some additional observations on OSS projects:
– Open source projects have three important tools: the mailing list, the bug tracking system and the version control system (and to a lesser extent wikis). These tools are primitive compared to what is used in the commercial software industry.
– Few OSS projects have explicit requirements and design phases.
– In fact all of the processes used in OSS projects are about the use of the aforementioned tools.
– Indeed, few OSS projects have designs.
– Instead OSS projects evolve and build a reputation after a small group of people makes an initial commit of some prototype.
– Most of the life cycle of an OSS project consists of evolving it more or less ad hoc. Even if there is a roadmap, that usually only serves as a common frame of reference for developers rather than as a specification of things to implement.

I’m impressed by how well some OSS projects (Mozilla, KDE, Linux) are run and think that the key to improving commercial projects is to adopt some of the better practices in these projects.

Much commercial software actually evolves in a very similar fashion, despite manager types keeping up appearances by stimulating the creation of lengthy design and requirements documents, usually after the development has finished.

Scientific content should be free

A recurring topic on Slashdot (www.slashdot.org) and in the scientific community is open journals: peer reviewed scientific journals that make their articles available for free. Today, Slashdot commented on the ideas of the IEEE to maybe open their vast electronic library to the public. I am a big proponent of this and sincerely hope that they will do this. However, I am very critical of the discussions about the cost. These discussions appear to be influenced heavily by publishers who continuously try to make it appear that these costs have to be very high. They propose that authors cover a cost of $3000 (!!!!) per article.

This is where I disagree, because as far as I can see these costs don’t really exist (or rather, don’t have to exist). I used to be a Ph.D. student. I wrote articles and submitted them to conferences and journals. I also peer reviewed articles for journals and conferences. I never received a single dollar for this work from the publishers and, worse, now have to pay to get access to my own articles (well, I cheated by saving a copy).

My point is: all the relevant work in the publishing process is done by volunteers like me. Worldwide, scientists write articles for free and review other scientists’ articles for free. The only scientists who receive money from publishers are editors who, in some cases, get a modest compensation for their precious time from a publisher who makes a lot of money. This editorial work mostly consists of deciding what to publish and what to reject, organizing and coordinating the review process, etc. I’m convinced there would be plenty of people willing to donate their time to do this. I’m one of those people.

Historically we needed publishers to distribute the peer reviewed articles to libraries and this is why publishers have enjoyed an enormous revenue stream for centuries now. The profits made by publishers are huge (billions of dollars). They continue to be huge because scientists need to publish in their journals because of the journal rankings (which are based on references to articles).

Now that we have the internet, we no longer need publishers for distribution.

Well, not entirely. Of course you still have some hosting costs, site maintenance and maybe a bunch of people coordinating the whole review and distribution process and managing the content. My point is: the per article cost of the whole process is extremely low. It’s nowhere near the amount of dollars named in the article. I’d be surprised and shocked if it were more than a few dollars. An organization like the IEEE should be able to fund this using sponsoring, advertising and volunteer contributions.

Of course they’d have to reorganize how they work. A journal is a periodic bundle of articles intended for paper distribution. Electronic publishing is instant (not periodic) and essentially free of cost. Beyond organizing the process and hosting there is virtually no cost. The process which is currently optimized for paper distribution is therefore obsolete. You need volunteer authors, volunteer reviewers, volunteer editorial boards for specific scientific audiences, supporting staff and hosting (here are some real costs) and a means to establish article and editorial board rankings (this is mostly a technical problem).
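To illustrate why I call the ranking part mostly a technical problem, here is a minimal Python sketch that ranks articles and editorial boards by simply counting incoming references. The article identifiers, board names and the `rank_articles`/`rank_boards` helpers are hypothetical, and real ranking schemes are more involved than a plain count; this only shows that the raw ingredients are easy to compute once the articles are online.

```python
from collections import Counter

# Hypothetical data: (citing article, cited article) pairs.
citations = [
    ("a3", "a1"),
    ("a4", "a1"),
    ("a4", "a2"),
    ("a5", "a3"),
]

# Hypothetical mapping from each article to the editorial board that accepted it.
board_of = {
    "a1": "board-x",
    "a2": "board-y",
    "a3": "board-x",
    "a4": "board-y",
    "a5": "board-y",
}


def rank_articles(citations):
    """Return (article, reference count) pairs, most referenced first."""
    counts = Counter(cited for _, cited in citations)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def rank_boards(citations, board_of):
    """Return (board, reference count) pairs, summing over each board's articles."""
    counts = Counter(board_of[cited] for _, cited in citations)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


print(rank_articles(citations))
print(rank_boards(citations, board_of))
```

Ties are broken alphabetically here just to keep the output stable; any real scheme would need a better answer than that.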

Editorial boards consist of key members of a research community who invite other scientists to contribute articles and do peer reviews. The output of an editorial board consists of peer reviewed articles. Not for profit organizations like the IEEE can take care of the editing and hosting. This will require some funding. Funding is available from sponsoring, advertisements, research funds, universities, society memberships etc.

Considering the amounts that are saved by taking publishers out of the equation, this should be no problem. Universities would save millions if the whole scientific publishing community adopted this model.

So, IEEE, ACM and other not for profit scientific organizations: do your members and the scientific community a favour (and isn’t this what you exist for in the first place?) by making content available freely. There’s no shortage of scientists willing to do the writing and reviewing for you (I’m one of those people). The rest of the process can and should be optimized for online hosting. The costs involved with the latter part should be very modest. The benefits are enormous.