Maven: the way forward

A somewhat longer post today. My previous blog post set me off thinking about a couple of things I have pondered before and that fit together nicely into a potential way forward. In that previous post, and also in this one, I spent a lot of words criticizing maven. People would be right to criticize me for blaming maven, but that would be the wrong way to take my criticism. There’s nothing wrong with maven; it just annoys the hell out of me that it is needed at all and that I spend so much time waiting for it. In my view, maven is a symptom of a much bigger underlying problem: the Java server-side world (or rather, the entire solution space for pretty much all forms of development) is bloated with tools, frameworks, application servers, and other stuff, each designed to address some tiny problem with the others. Together they sort of work, but it isn’t pretty. What if we wiped all of that away, much like the Sun people did when they designed Java 20 years ago? What would be different? What would be the same? I cannot, of course, see this topic separately from my previous career as a software engineering researcher. In my view, a lot of developments of the past 20 years are now converging and could morph into something that radically improves on the existing state of the art. However, I’m not aware of any specific project taking on this issue in full, even though a lot of people are working on parts of the solution. What follows are essentially my thoughts on a number of topics centered on taking Java (the platform, not necessarily the language) as a baseline and exploring how I would like to see the platform morph into something worthy of the past 40 years of research and practice.

Architecture

Let’s start at the architecture level. Java packages were a mistake, as is now widely acknowledged. .Net namespaces are arguably better, and OSGi bundles, with explicit required and provided APIs as well as API versioning, are better still. To scale software into the cloud, where it must coexist with other software, including different (or identical) versions of itself, we need to get a grip on architecture.

The subject has been studied extensively (see here for a nice survey of some description languages) and I see OSGi as the most successful implementation to date of features that most other development platforms currently lack, omit, or half improvise. The main issue with OSGi is that it layers stuff on top of Java without really being part of it. Hence you end up with a mix of manifest files that go into jar files; annotations that go into your source code; and cruft in the form of framework extensions to hook everything up, complete with duplicate functionality for logging, publish/subscribe patterns, and even web service frameworks. The OSGi people are moving towards a more declarative approach. Take this to its ultimate conclusion and you end up with language-level support for basically everything OSGi is trying to do: explicit provided and required APIs, API versioning, events, dynamic loading/unloading, and isolation.
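To make this concrete, this is roughly what the declarative wiring looks like today in a bundle’s MANIFEST.MF, off to the side of the code it describes (bundle names and version ranges are invented for the example):

    Bundle-ManifestVersion: 2
    Bundle-SymbolicName: com.example.payments
    Bundle-Version: 1.2.0
    Export-Package: com.example.payments.api;version="1.2.0"
    Import-Package: org.osgi.framework;version="[1.4,2.0)",
     com.example.accounts.api;version="[2.0,3.0)"

Language-level support would mean expressing exactly this information where the code lives, instead of in a side file that can drift out of sync with it.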

A nice feature of Java that OSGi relies on is the class loader. When used properly, it allows you to create a class loader, let it load classes, execute the functionality, and then destroy the class loader, at which point everything it loaded becomes eligible for garbage collection. This is nice for dynamically loading and unloading functionality as well as for isolating functionality (for security and stability reasons). OSGi depends heavily on this feature and many application servers try to use it. However, the mechanisms involved are not exactly bulletproof, and there are notorious problems with e.g. memory leaks, which make engineers very conservative about relying on these mechanisms in a live environment.
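A minimal sketch of the pattern (the jar location and class name are made up, and I assume the plugin implements Runnable to keep things short):

    import java.net.URL;
    import java.net.URLClassLoader;

    public class PluginRunner {
        public static void runOnce(URL pluginJar, String className) throws Exception {
            // Give the plugin its own class loader so its classes are isolated
            // from the rest of the application.
            URLClassLoader loader = new URLClassLoader(new URL[] { pluginJar });
            Runnable plugin = (Runnable) loader.loadClass(className).newInstance();
            plugin.run();
            // Unloading is implicit: once loader and plugin become unreachable,
            // the garbage collector can reclaim the loader and every class it
            // defined. In practice a single leaked reference (a static cache, a
            // registered listener, a thread local) keeps the whole graph alive --
            // hence the memory leaks mentioned above.
        }
    }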

More recently, people have started to use dependency injection, where the need for something is expressed in the code (e.g. with an annotation) or externally (in some configuration file). At run time a dependency injection container then tries to fulfill the dependencies by creating the right objects and injecting them. Dependency injection improves testability and modularization enormously.
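In code that typically looks something like this (a sketch in the style of Guice or JSR-330; the interfaces are invented for the example):

    import javax.inject.Inject; // JSR-330; Guice and Spring have equivalents

    interface MessageStore {
        void save(String message);
    }

    public class MessageService {
        private final MessageStore store;

        @Inject // the container decides which MessageStore implementation to supply
        public MessageService(MessageStore store) {
            this.store = store;
        }

        public void post(String message) {
            store.save(message);
        }
    }

Because MessageService never constructs its dependency, a test can hand it an in-memory MessageStore while production wiring injects the real one.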

A feature of maven that people seem to like is its way of dealing with dependencies: you express what you need in the pom file and maven fetches the needed stuff from a repository. The maven, OSGi & Spring combo is about to happen, and when it does you’ll be specifying dependencies in four different places: Java imports, annotations, the pom file, and the OSGi manifest. But still, I think the combined feature set is worth having.
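For example, this snippet in a pom file is enough to make maven pull the library (and its own dependencies) from a repository:

    <dependency>
      <groupId>org.apache.lucene</groupId>
      <artifactId>lucene-core</artifactId>
      <version>2.4.1</version>
    </dependency>

The same dependency then shows up again as import statements in the code and, if you use OSGi, as an Import-Package entry in the manifest — exactly the duplication I mean.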

Language

Twenty years ago, Java was a fairly minimalistic language that took the best of the 20 years of OO languages before it and kept a useful subset. Inevitably, a lot got discarded or was never considered at all. Some mistakes were made, and over time the language absorbed less-than-perfect versions of the stuff that didn’t make it initially. So, Java has no language support for properties; this was sort of bolted on by the setter/getter convention introduced with JavaBeans. It has inner classes instead of closures and lambda functions. It has no real generics (parametrizable types), just complicated syntactic sugar that gets compiled to non-generic code. The initial concurrent programming concepts in the language were complex, broken, and dangerous to use; subsequent versions tweaked the semantics and added useful things like the java.util.concurrent package. The language is overly verbose, and 20 years after the fact there is now quite a bit of competition from languages that simply don’t suffer from all this. The good news is that most of those have implementations on top of the JVM. Let’s not let this degenerate into a language war, but clearly the language needs a proper upgrade. IMHO Scala could be a good direction, but it too has some compromises baked in and lacks support for the architectural features discussed above. Message passing and functional programming concepts are now seen as important features for scalability. These are tedious at best in Java, while Scala supports them well and with a much more concise syntax. Let’s just say a replacement of the Java language is overdue. On the other hand, it would be wrong to pick any single language as the language. Both .Net and the JVM are routinely used as generic runtimes for all sorts of languages. There’s also the LLVM project, a compiler toolchain that includes dynamic compilation in a VM as an option for basically anything GCC can compile.
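To illustrate the verbosity: something as trivial as sorting strings by length still takes a full anonymous inner class in Java, where a language with closures needs a one-liner (in Scala, roughly list.sortWith(_.length < _.length)):

    import java.util.Collections;
    import java.util.Comparator;
    import java.util.List;

    public class SortByLength {
        static void sort(List<String> strings) {
            // Five lines of ceremony around a single comparison expression.
            Collections.sort(strings, new Comparator<String>() {
                public int compare(String a, String b) {
                    return a.length() - b.length();
                }
            });
        }
    }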

Artifacts should be transient

So we now have a hypothetical language with support for all of the above. Let’s not linger on the details and move on to deployment and run time. The word compile comes from the early days of computing, when people had to punch holes into cards and then compile those into stacks and hand-feed them to big, noisy machines. In other words, compilation is a tedious but necessary evil. Java popularized the notions of just-in-time and partial, dynamic compilation. The difference is that just-in-time compilation merely moves the compilation step to the moment the class is loaded, whereas dynamic compilation goes a few steps further and takes run-time context into account to decide if and how to compile. IDEs tend to compile on the fly while you edit. So why bother with compilation after you finish editing and before you need to load your classes? There is no real technical reason to compile ahead of time beyond the minor one-time effort that might affect start-up. You might want the option to do this, but it should not be the default.

So, for most applications, the notion of generating binary artifacts before they are needed is redundant. If nothing needs to be generated, nothing needs to be copied or moved either. This is true for compiled and interpreted languages alike. A modern Java system basically uses a binary intermediate format that is generated before run time. That too is redundant. If you have dynamic compilation, you can just take the source code and execute it (generating any needed artifacts on the fly). You can still do in-IDE compilation for validation and static analysis purposes. The distinction between interpreted and statically compiled languages has become outdated, and as scripting languages show, not having to juggle binary artifacts simplifies life quite a bit. In other words, development artifacts (other than the source code) are transient, and with the transformation from code to running code automated and happening at run time, they should no longer be a consideration.
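Java 6 already ships the pieces for this: javax.tools exposes the compiler programmatically, so “running source code” is a few lines of glue (paths and class names invented for the sketch):

    import java.io.File;
    import java.net.URL;
    import java.net.URLClassLoader;
    import javax.tools.JavaCompiler;
    import javax.tools.ToolProvider;

    public class RunFromSource {
        public static void main(String[] args) throws Exception {
            // Compile Hello.java on the fly...
            JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
            if (compiler.run(null, null, null, "src/Hello.java") != 0) {
                throw new IllegalStateException("compilation failed");
            }
            // ...then load and run the result immediately. The .class file is
            // a transient by-product, not an artifact to package and ship.
            ClassLoader loader = new URLClassLoader(
                    new URL[] { new File("src").toURI().toURL() });
            loader.loadClass("Hello").getMethod("main", String[].class)
                  .invoke(null, (Object) new String[0]);
        }
    }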

That means no more build tools.

Without the need to transform artifacts ahead of run time, the need for tools doing and orchestrating this also changes. Much of what maven does is generating, copying, packaging, and gathering artifacts; an artifact in maven is just a euphemism for a file. This is actually pretty stupid work. With all of those artifacts redundant, why keep maven around at all? The answer is of course testing, continuous integration, application life cycle management, and other good practices (like generating documentation). Except that lots of other tools are involved in those as well. Your IDE is where you’d ideally review problems and issues. Something like Hudson, playing together with your version management tooling, is where you’d expect continuous integration to take place, and application life cycle management is something that belongs in your deployment environment. Architectural features of the language and run time, combined with good built-in application and component life cycle support, remove much of the need for external tooling and improve interoperability.

Source files need to go as well

Visual Age and Smalltalk pioneered the notion of non-file-based program storage, where you modify the artifacts in some kind of database. Intentional programming research is essentially about the notion that programs are just interpretations of more abstract things that get transformed (just in time) into executable code or into different views (editable in some cases). Martin Fowler has long been advocating IP and what he refers to as the language workbench. In a nutshell, if you stop thinking of development as editing a text file and start thinking of it as manipulating abstract syntax trees with a variety of tools (e.g. rename refactoring), you sort of get what IP and language workbenches are about. Incidentally, concepts such as APIs, API versions, and provided & required interfaces are quite easily implemented in a language-workbench-like environment.
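A toy illustration of the mindset shift (names invented; a real tool would resolve bindings rather than match names):

    import java.util.ArrayList;
    import java.util.List;

    // A program is a tree of nodes, not a pile of characters.
    public class Node {
        String kind;  // e.g. "class", "method", "call"
        String name;  // identifier, if any
        List<Node> children = new ArrayList<Node>();

        // Rename refactoring as a tree transformation: every reference is a
        // node, so a rename cannot miss occurrences or clobber unrelated text
        // the way textual search and replace can.
        void rename(String from, String to) {
            if (from.equals(name)) {
                name = to;
            }
            for (Node child : children) {
                child.rename(from, to);
            }
        }
    }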

Storage, versioning, access control, collaborative editing, etc.

Once you stop thinking in terms of files, you can start thinking about other useful features (beyond tree transformations), like versioning or collaborative editing. There have been some recent advances in software engineering that I see as key enablers here. The first is that version management systems are becoming decentralized, replicated databases. You don’t check out from git, you clone the repository and push back any changes you make. What if your IDE worked straight against your (cloned) repository? Then deployment becomes just a controlled sequence of replicating your local changes somewhere else (push based, pull based, or combinations of the two). A problem with this, of course, is that version management systems are still about manipulating text files. So they sort of require you to serialize your rich syntax trees to text, and you need tools to deserialize them in your IDE again. So text files are just another artifact that needs to be discarded.

This brings me to another recent advance: CouchDB, one of the non-relational databases currently receiving lots of (well-deserved) attention. It doesn’t store tables, it stores structured documents. Trees, in other words. Just what we need. It has some nice properties built in, one of which is replication: it’s built from the ground up to replicate all over the globe. The grand vision behind CouchDB is a cloud of all sorts of data where stuff just replicates to the place it is needed. To accomplish this, it builds on REST, map reduce, and a couple of other cool technologies. The point is, CouchDB already implements most of what we need. Building a git-like revision control system on top, for versioning arbitrary trees or collections of trees, can’t be that challenging.
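Storing such a tree is a plain HTTP PUT of a JSON document (the database name, document id, and structure are invented for the sketch; CouchDB listens on port 5984 by default):

    import java.io.OutputStream;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class CouchPut {
        public static void main(String[] args) throws Exception {
            URL url = new URL("http://localhost:5984/code/com.example.Foo");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("PUT");
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "application/json");
            // A syntax tree serialized as a JSON document.
            String doc = "{\"type\":\"class\",\"name\":\"Foo\","
                       + "\"methods\":[{\"name\":\"bar\",\"returns\":\"void\"}]}";
            OutputStream out = conn.getOutputStream();
            out.write(doc.getBytes("UTF-8"));
            out.close();
            System.out.println("HTTP " + conn.getResponseCode()); // 201 when created
        }
    }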

Imagine the following sequence of events. Developer A modifies his program. Developer B, working on the same part of the software, sees the changes (in real time, of course) and adds some more. Once both are happy, they mark the associated task as done. Somewhere on the other side of the planet, a test server locally replicates the changes related to the task and finds everything is OK. Eventually the change, along with other changes, is tagged off as a new stable release. A user accesses the application on his phone, and at the first opportunity (i.e. when connected) the changes are replicated to his local database. End to end, the word file or artifact appears nowhere. Also note that the bare minimum of data is transmitted: this is as efficient as it is ever going to get.

Conclusions

Anyway, just some reflections on where we are and where we need to go. Java did a lot of pioneering work in a lot of different domains, but it is time to move on from the way our grandfathers operated computers (well, mine won’t touch a computer if he can avoid it, but that’s a different story). Most people selling silver bullets in the form of maven, Ruby, continuous integration, etc. are stuck in the current way of thinking. These are great tools, but only in the context of what I see as a deeply flawed end-to-end system. A lot of additional cruft is under construction to support the latest cloud computing trends (which are essentially about managing a lot of files in a distributed environment). My point here is that taking a step back and rethinking things end to end might be worth the trouble. We’re so close to radically changing the way developers work. Remove files and source code from the equation and what is left for maven to do? The only right answer is: nothing.

Why do I think this needs to happen? Well, developers are currently wasting enormous amounts of time on essentially redundant things rather than on developing software. The last few weeks were pretty bad for me: I was mostly handling deployment and build configuration stuff. Tedious, slow, and maven is part of the problem.

Update 26 October 2009

Just around the time I was writing this, some people came up with Play, a framework plus server inspired by Python’s Django that provides a couple of cool features. The best one: no application server restarts required, just hit F5. This works for Java source changes as well. Clearly I’m not alone in viewing the Java server-side world as old and bloated. Obviously Play lacks a bit in functionality, but that’s easily fixed. I wonder how it combines with a decent dependency injection framework. My guess is: not well, because dependency injection frameworks require a context (i.e. state) to be maintained, and Play is designed to be stateless (like Django). Basically, each save potentially invalidates the context, requiring a full reload of that as well (i.e. a server restart). It seems the Play guys have identified the pain point in Java: server-side state comes at a price.

Lucene Custom Analyzer

A second neat trick I did with Lucene this week was to wrap the StandardAnalyzer with my own analyzer (see here for the other post on Lucene I did a few days ago).

The problem I was trying to address is very simple. I have a nice web service API for my search engine. The incoming query is handled by Lucene using the bundled QueryParser, which has a quite nice and elaborate query language that covers most of my needs. The problem is that it runs the StandardAnalyzer on everything, which means that all the terms in the query are tokenized. For text this is a good thing. However, I also have fields in my index that are not text.

The Lucene solution to this is to use untokenized fields in the index. Only problem: using untokenized fields in combination with the QueryParser is not recommended and tends not to work well, since everything in the query gets tokenized. So you are supposed to skip the QueryParser and construct your own Query programmatically. Nice, but not what I want, since it complicates my search API and I need to make complicated queries on the other end of it.

What I wanted is to match a url field against either the whole url or part of it (using wildcards). On top of that, I want to do this as part of a normal QueryParser query, e.g. keyword: foo and link: “http\://example.com/foo”. I’ve been doing this the wrong way for a while, letting Lucene tokenize the url. So http://example.com/foo becomes [http] [example.com] [foo] for Lucene. The StandardAnalyzer is actually quite smart about hostnames, as you can see; otherwise it would treat the . as a token separator as well.

This was working reasonably well for me. However, this week I ran into a nice borderline case where my url ended in …./s.a. Tokenization happens on characters like . and /. On top of that, the StandardAnalyzer that I use with the QueryParser also filters out stopwords like a, the, etc. Normally this is good (with text at least), but in my case it meant that the last a was dropped and my query got full matches against entries with a similar link ending in e.g. s.b. Not good.

Of course what I really want is to be able to use untokenized fields with the QueryParser. Instead, what I did this week was create an analyzer that, for selected fields, skips tokenization and treats the entire field content as a single token. I won’t reproduce my exact code here, but the recipe is quite easy (a rough sketch follows below):

  • extend Analyzer
  • override tokenStream(String field, Reader r)
  • if the field matches any of your special fields, return a custom TokenStream that returns the entire content of the Reader as a single Token; otherwise just delegate to a StandardAnalyzer instance.
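Something along these lines should do it (a sketch against the Lucene 2.x-era API; the field set is an example, and Lucene’s own KeywordTokenizer conveniently already emits a Reader’s whole content as one token):

    import java.io.Reader;
    import java.util.Set;
    import org.apache.lucene.analysis.Analyzer;
    import org.apache.lucene.analysis.KeywordTokenizer;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.analysis.standard.StandardAnalyzer;

    public class SelectiveAnalyzer extends Analyzer {
        private final Set<String> rawFields; // e.g. { "link" }
        private final Analyzer fallback = new StandardAnalyzer();

        public SelectiveAnalyzer(Set<String> rawFields) {
            this.rawFields = rawFields;
        }

        @Override
        public TokenStream tokenStream(String field, Reader reader) {
            if (rawFields.contains(field)) {
                // Whole field value as a single token: no splitting on
                // '.' or '/', no stopword removal.
                return new KeywordTokenizer(reader);
            }
            return fallback.tokenStream(field, reader); // normal text handling
        }
    }

Pass the same analyzer to both the IndexWriter and the QueryParser so that index-time and query-time treatment of those fields stays consistent.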

This is a great way to influence the tokenization and also enables a few more interesting hacks that I might explore later on.

OpenID 2.0 and concerns about it

It seems JanRain is finally readying the final version of OpenID 2.0. There’s a great overview on readwriteweb.com of some concerns that I mostly share. Together with another recent standard (OAuth), OpenID 2.0 could be a huge step forward for web security and privacy.

Let’s start with what OpenID is about and why, generally, it is a good idea. The situation on the web right now is that:

  • Pretty much every web site has its own identity solution, which means users have to keep track of dozens of accounts. Users generally have only one or two email addresses, so in practice most of these accounts are tied to a single email account. Imagine someone steals your gmail password and starts scanning your mail for all those nice account activation mails you’ve been getting for years. Hint: “mail me my password”, “reset my password”. In short, the current situation carries a lot of security risk: it’s basically all the downsides of a centralized identity solution without any of the advantages. There are many valid concerns about using OpenID, related to e.g. phishing. However, what most people overlook is that the current situation is much worse, and also that many OpenID providers actually address those concerns with various technical solutions and security practices. For example, myopenid.com and Verisign employ very sophisticated technologies that you won’t find on many websites where you would happily provide your credit card number. There is no technical reason whatsoever why OpenID providers can’t use the same or better authentication mechanisms than you probably already use with your bank.
  • While technically some websites could work together on identity, very few do, and the ones that do tend to have very strong business ties (e.g. banks, local governments, etc.). This means that in most cases reusable identity is only usable on a handful of partner sites. Google, Microsoft, and Yahoo are great examples: they each have partner programs that allow externals to authenticate people with them. Only problem: almost nobody seems to do that. So, reality check: OpenID is the only widespread single sign-on solution on the web. There is nothing else; all the other stuff is hopelessly locked into commercial verticals. Microsoft has been trying for years to get their Passport solution to do what OpenID is doing today. So far they have failed miserably.
  • Web sites are increasingly dependent on each other. Mashups started as an informal thing where site A used an API from site B and did something nice with it. Now these interactions are getting much more complex: the number of sites involved in a typical mashup is increasing, and so is the amount of privacy-sensitive data flying around in them. A very negative pattern that I’ve seen on several sites is the “please provide your gmail/hotmail/yahoo password and we’ll import your friends” type of feature. Do you really want to share years of private email conversations with a startup run from a garage in California? Of course not! This is not a solution but a cheap hack. The reality is that something like OpenID + OAuth is really needed, because right now many users are putting themselves in danger by happily handing out their usernames and passwords.
  • Social networks like Facebook authenticate people for the many little apps that plug into them. So far Facebook is the most successful here, and it provides a nice glimpse of what OpenID makes possible on a much larger scale. But it is still a centralized vertical. I am on Facebook and generally like what I see there, but I’m not really comfortable with the notion that they are the web from now on (which seems to be implied in their centralized business model). Recent scares with their overly aggressive advertisement schemes show that they can’t really be trusted.

OpenID is not a complete solution for the above problems, and it is important to realize that this is by design: it tries to solve only one problem, and to solve it well. But generally it is a vast improvement over what is used today. Additionally, it can be complemented with protocols like OAuth, which is about delegating permissions from one site to another on your behalf. OpenID and OAuth are very well integrated with the web architecture in the sense that they are not monolithic identity solutions but modular ones, designed to be combined with other modular solutions. This modular nature is essential because it allows for very diverse combinations of technology, which in turn allows different sites to implement the security they need in a compatible way. For example, for some sites allowing any OpenID provider would be a bad idea. So: implement whitelisting and work with a set of OpenID providers you trust (e.g. Verisign).

OpenID and OAuth provide a very decent base level of protection that is not available from any other widely used technology today. The closest thing to it is the Liberty Alliance/SAML/Microsoft family of identity products. These are designed for, and applied almost exclusively in, enterprise security products: you find them in banks, financial institutions, travel agencies, etc. They are also used on the web, but invariably only to build verticals. Both Google and Microsoft use technologies like these to power their identity solutions. In fact, many OpenID identity providers use these technologies too. For example, Microsoft is rumoured to be OpenID-enabling their solution, and several members of the Liberty Alliance (e.g. Sun) have been experimenting with OpenID as well. They are not mutually exclusive technologies.

It gets better though. Many OpenID providers are employing really advanced anti-phishing technologies. Currently, you and your cryptographically weak password are just sitting ducks for Russian/Nigerian/whatever scammers. Even if you think your password is good, it probably isn’t. OpenID doesn’t specify how to authenticate, so OpenID providers are competing on usability and anti-phishing features. For example, Verisign and myopenid.com employ techniques that make them vastly more secure than most websites out there, including some where you make financial transactions. There has been a lot of criticism of OpenID, and it has been picked up by those who implement it.

So, now on to OpenID 2.0. This version is quite important because it is the result of many companies discussing, for a very long time, what should be in it. In some respects there are a few regrettable compromises, and maybe not all of the spec is that good an idea (e.g. .name support). But generally it is a vast improvement over OpenID 1.1, which is what is currently in use and which is technically flawed in several ways that 2.0 fixes. The reason 2.0 matters is that many companies have been holding off on OpenID support until it was ready.

The hope/expectation is that those companies will start enabling OpenID logins for their sites over the next few months. The concern expressed here is that this may not actually happen and that the OpenID hype is in fact already past its peak. Looking at how few sites I can actually sign into with my OpenID today, I’d have to agree. As yet, no major website has adopted OpenID. Sure, there are plenty of identity providers that support OpenID, but very few relying parties accept identities from those providers. Most of the OpenID sites out there are simple blogs, web 2.0 startups, and the like. The problem seems to be that everybody is waiting for everybody else, and that everybody is afraid of giving up control over their own little cluster of users.

So, ironically, even though there are many millions of OpenIDs out there, most of their owners don’t use them (or are even aware of having one). Pretty soon OpenID will be the authentication system with the most users on the planet (if it isn’t already), and people don’t even know about it. Even the largest web sites have no more than something like a hundred million users (which is a lot), and several of those sites are already OpenID identity providers (e.g. AOL).

The reason I hope OpenID does get adopted is that if it doesn’t, it will take a very long time for something similar to emerge, which means the current, very undesirable situation will be prolonged. In my view a vast improvement over the current situation is needed, and besides OpenID there is very little in terms of solutions that can realistically be used today.

The reason I am posting this is that over the past few months my colleagues and I have been struggling with how to do security in decentralized smart spaces. If you check my publications site, you will find several recent workshop papers that give a high-level overview of what we are building. Most of these papers are pretty quiet on security so far, even though security and privacy are obviously huge concerns in a world where user devices use each other’s services and mash them up with commercial services both in the local network and on the internet. The solution we are applying in our research platform is a mix of OpenID, OAuth, and some rather cool add-ons to those that we have invented. Unfortunately I can’t say too much about our solutions yet, except that I am very excited about them. Over the next year, we should be able to push more information out into the public.

semantic vs Semantic

Interesting post on how microformats relate to the Semantic web as envisioned by the W3C.

The capital S is semantically relevant, since it distinguishes the Semantic web from the lower-case semantic web that microformats are all about. The difference is that the Semantic web requires technology that has been defined by the W3C but is not currently available in any mainstream products, such as the web browsers people use to browse the current web. These technologies include RDF, OWL, the SPARQL query language, XHTML 1.x and 2.x, and a few other rather obscure “standards” that you won’t find on a typical end-user PC or web server. I use quotes around the word standard here because I don’t believe the W3C is very effective at transferring its recommended standards over to industry in a proper way.
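By contrast, a microformat needs nothing beyond the HTML we already have plus some agreed-upon class names. An hCard, for example, marks up contact data like this (the person and URLs are invented):

    <div class="vcard">
      <span class="fn">Jane Doe</span>,
      <span class="org">Example Corp</span>,
      <a class="url" href="http://janedoe.example.com/">janedoe.example.com</a>
    </div>

Any browser renders it today, and crawlers that know the convention can extract the structure.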

Facebook

It seems I’ve been unaware of the little revolution that has been unfolding since May 24th. Before that date, facebook was yet another social network, popular mostly in the US. On that date, facebook opened up their API and made it possible for people to integrate their 3rd-party services into facebook. Marc Andreessen explained the concept in a lengthy post around mid June that is well worth reading. This too went unnoticed by me. In my defense, I was on vacation the first half of June and maybe a bit less connected than I usually am.

About two weeks ago, my neighbour, friend & colleague Christian del Rosso invited me to facebook. He must have noticed that I didn’t pick up on his earlier link to Marc Andreessen’s article. So I dutifully signed up, not expecting much of it but somewhat curious to find out why facebook was being mentioned so much lately. I’m already on linkedin and del.icio.us, so I thought I was pretty well covered in this web 2.0 thing. Apparently not.

In the past two weeks, I found that several people I know had recently created accounts on facebook. Facebook has the notion of networks and groups, and I’m in several of those now, all rapidly growing. Finally, in the last few days, I started exploring facebook a bit more in depth: updating my profile, exploring other people’s profiles, and finally figuring out that there’s a shitload of cool applications that integrate into facebook. The proverbial penny dropped only a few days ago.

I’m on iLike, mytravelmap, and flixster now. I have also hooked up my blog and del.icio.us to facebook, and of course installed the chuck norris fact generator. All very fun toys. The first three I would probably never have signed up for separately.

The only thing that I don’t like is that openid is not part of facebook. That’s a pity, because I believe the fully decentralized mashups enabled by openid are the future. Ultimately, facebook is another vertical, and the only open question is who will buy these guys (and for how much). It seems that .com bubble 2.0 is now well underway.

It would seem from the above that facebook is perfect. Of course it isn’t. I’ve encountered many issues so far: performance problems, parts of the site not working, strange errors, and failing ajax stuff. I also noticed that the entire thing seems to be written in PHP, which could give rise to some worries related to e.g. security and scalability. Opening it up to basically anybody who cares to develop 3rd-party stuff does not exactly make that better.

OSGi: some criticism

Over the past few weeks, I’ve dived into the wonderful world called OSGi. OSGi is a standardized (by a consortium, and soon also the JCP) set of Java interfaces and specifications that effectively layer a component model on top of Java. By component I don’t mean that it replaces JavaBeans with something else, but that it provides a much improved way of modularizing Java software into blobs that can be deployed independently.

OSGi is currently the closest thing to having support for the stuff that is commonly modeled using architecture description languages. ADLs have been used in industry to manage large software bases. Many ADLs are homegrown systems (e.g. Philips’ Koala) or experimental tools created in a university context (e.g. xADL). OSGi is similar to these languages in that it has:

  • A notion of components (bundles)
  • Dependencies between components
  • Provided and required APIs
  • API versioning

Making such things explicit, first-class citizens of a software system is a good thing: it improves manageability and consistency. Over the past few weeks I’ve certainly enjoyed exploring the OSGi framework and its concepts while working on actual code. However, it struck me that a lot of things are needlessly complicated or difficult.

Managing dependencies on other bundles is more cumbersome than handling imports in plain Java. Normally, when I want to use some library, I download it, put it on the classpath, type a few letters, and then ctrl+space my way through whatever API it exposes. In OSGi it’s more difficult: you download the bundle (presuming there is one) and then need to decide which of the packages it exposes you want to import.

I’m a big fan of the organize imports feature in Eclipse, which seems not to understand OSGi imports and exports at all. That means that for a single library bundle you may find yourself going back and forth to your bundle’s manifest file to manually add the packages you need. Eclipse PDE doesn’t seem to be that clever here. For me that is a step backwards.

Also, most libraries don’t actually ship as bundles. Bundles are a new concept that is not backwards compatible with good old jar files (the form most 3rd-party libraries come in). This is an unnecessary limitation. A more reasonable default would be to treat a non-OSGi jar file as a bundle that simply exports everything in it and puts everything it imports on the import path. It can’t be that hard to fish that information out of a jar file; at the very least, I’d like a tool to do that for me. Alternatively, and this is the solution I would prefer, it should be possible to add the library to the OSGi boot classpath. That would allow all bundles to access non-OSGi libraries without requiring any modifications to those libraries.
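For what it’s worth, generating such a wrapper is mechanical enough that tools like bnd automate it. A manifest that turns a plain jar into a bundle (library and version picked purely for illustration) is just a handful of headers:

    Bundle-ManifestVersion: 2
    Bundle-SymbolicName: org.apache.commons.logging
    Bundle-Version: 1.1.1
    Bundle-ClassPath: commons-logging-1.1.1.jar
    Export-Package: org.apache.commons.logging;version="1.1.1",
     org.apache.commons.logging.impl;version="1.1.1"

Which makes it all the more puzzling that the framework doesn’t just do something like this by default.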

Finally, I just hate having to deal with this retarded manifest file concept. I noticed that the bug requiring the manifest to end with an empty line still exists (weird stuff happens if it is missing). This is as annoying as the notion of having to use tabs instead of spaces in makefiles; I was banging my head against the wall over newline stuff back in 1997. The PDE adds a nice frontend for editing the manifest (including a friendly warning if you accidentally remove the newline). But the fact remains that it is a kludge to superimpose stuff on Java that is not part of Java.

Of course, with version 1.5 there is now a nicer way to do this: annotations. Understandably, OSGi needs to be backwards compatible with older versions of Java (hence the kludge is excused), but the way forward is obviously to deprecate this mechanism in newer editions of Java. Basically, I want to be able to specify import & export constraints at the class and method level. An additional problem is that packages don’t really have a first-class representation in Java: they are just referred to by name in classes (in the package declaration) but have no specification of their own. That makes it difficult to add package-level annotations (you can work around this with a package-info.java file).
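A sketch of that workaround, with a hypothetical @Export annotation standing in for the kind of language-level support I’d want (nothing like it exists in the JDK or OSGi today; all names are invented):

    // Export.java -- a made-up package-level annotation
    package com.example.annotations;

    import java.lang.annotation.ElementType;
    import java.lang.annotation.Retention;
    import java.lang.annotation.RetentionPolicy;
    import java.lang.annotation.Target;

    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.PACKAGE)
    public @interface Export {
        String version();
    }

    // package-info.java -- the only place Java accepts package-level annotations
    @Export(version = "1.2.0")
    package com.example.payments.api;

    import com.example.annotations.Export;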