Here’s a simple use case which I imagine happens a lot:
You’re writing a Java application. It needs to store data in XML and read that data back from XML. It used to work like this:
- Write some prototype XML that more or less has the information you need. At this point you expect things to evolve.
- Parse it into a DOM or write a SAX handler in Java.
- Write some code to extract data from the XML document and store it in memory as Java beans.
- Write some code to turn the beans back into XML.
- Tune the parsing code and the XML as required while you develop the rest of the application.
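As a rough illustration of the DOM variant of these steps, here is a minimal sketch using only the JDK's built-in JAXP classes. The `Foo` bean, the `FooDomCodec` class and the `<foo name="..."><bar>...</bar></foo>` layout are made up for this example; they are not from any real project.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical bean; the XML layout it maps to is an assumption for illustration.
class Foo {
    String name;
    String bar;
}

class FooDomCodec {
    // XML -> bean: parse into a DOM tree, then pick the values out by hand.
    static Foo fromXml(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        Element root = doc.getDocumentElement();
        Foo foo = new Foo();
        foo.name = root.getAttribute("name");
        foo.bar = root.getElementsByTagName("bar").item(0).getTextContent();
        return foo;
    }

    // bean -> XML: build a DOM tree and serialize it with an identity transform.
    static String toXml(Foo foo) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        Element root = doc.createElement("foo");
        root.setAttribute("name", foo.name);
        Element bar = doc.createElement("bar");
        bar.setTextContent(foo.bar);
        root.appendChild(bar);
        doc.appendChild(root);

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }
}
```

Even for two fields this is a fair amount of boilerplate, and every change to the XML means touching both methods.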
SAX is worse than the DOM: it is a very low-level API that requires you to write event handlers to get data out of it. The only nice thing about it is that it works ‘on the fly’, so you can avoid buffering the entire document in memory.
Neither is very nice to work with in the above scenario. In most cases all I want to do is something like this:
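To show what those event handlers look like, here is a minimal sketch that pulls two values out of a small document with the JDK's SAX parser. The `FooSaxHandler` class and the `<foo name="..."><bar>...</bar></foo>` layout are invented for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the "name" attribute of <foo> and the text of <bar>.
class FooSaxHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private boolean inBar;
    String name;
    String bar;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if ("foo".equals(qName)) {
            name = attrs.getValue("name");
        } else if ("bar".equals(qName)) {
            inBar = true;
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver text in several chunks, so accumulate it.
        if (inBar) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("bar".equals(qName)) {
            bar = text.toString();
            inBar = false;
        }
    }

    // Convenience entry point: run the parser over a string and return the filled handler.
    static FooSaxHandler parse(String xml) throws Exception {
        FooSaxHandler handler = new FooSaxHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler;
    }
}
```

All the state tracking (`inBar`, the text buffer) is on you, and it grows with every element you care about.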
Document foo = Document.parse("foo.xml", FooInterface.class);
FooInterface fooBean = (FooInterface)foo.getBean();
Bar barBean = (Bar)fooBean.getBar();
In short, I want to deserialize some XML, do stuff with it and then serialize it back, without spending too much time thinking about how that should be implemented. There are of course some problems with this:
- there are multiple possible XML structures to serialize to; which one is correct?
- the interfaces may not fully define the constraints this XML should match
- there might be other, non-Java applications that need to do stuff with the XML
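To make the first problem concrete, here is a rough read-only sketch of such a binding, built on `java.lang.reflect.Proxy` and a DOM element. Everything in it (`FooView`, `XmlBeans`, `bind`, the getter-to-XML naming rule) is hypothetical. Note that it has to guess whether a property lives in an attribute or in a child element, which is exactly the ambiguity described above.

```java
import java.io.StringReader;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical interface describing the data we expect in the XML.
interface FooView {
    String getName();
    String getBar();
}

class XmlBeans {
    // Parse a string and return the root element.
    static Element parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    // Returns a proxy that answers getXyz() from attribute "xyz" or, failing that,
    // from the text of the first descendant element <xyz>. Only String getters are handled.
    static <T> T bind(Element element, Class<T> type) {
        InvocationHandler handler = (proxy, method, args) -> {
            String methodName = method.getName();
            if (!methodName.startsWith("get") || methodName.length() <= 3) {
                throw new UnsupportedOperationException(methodName);
            }
            String property = Character.toLowerCase(methodName.charAt(3)) + methodName.substring(4);
            if (element.hasAttribute(property)) {
                return element.getAttribute(property);
            }
            NodeList children = element.getElementsByTagName(property);
            return children.getLength() > 0 ? children.item(0).getTextContent() : null;
        };
        return type.cast(Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] {type}, handler));
    }
}
```

With this in place, `XmlBeans.bind(XmlBeans.parse(xml), FooView.class).getBar()` reads the value straight from the document. Writing values back, nested beans and non-string types are all left out, and each would force another arbitrary decision about the XML structure.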
Here’s how Sun expects people to solve the above problem (using JAXB):
- Write an XML schema that describes your XML format
- Compile this schema into Java code: interfaces and implementations of beans that match the schema, plus some parsing and unparsing code
- Write code that depends on this generated code
This solution has a number of problems:
- You need to write an XML schema before you write any code. That doesn’t really work when you only have a rough idea of what should be in the XML.
- The code that you write to access the generated beans has a necessary level of indirection and is therefore ugly.
- Any time you edit the schema and regenerate your bean you potentially break this ugly code.
- XML schema is very complicated, so you are likely to iterate multiple times to get it right, even if the XML format does not evolve.
- The reference implementation of JAXB comes with strings attached. It’s a typical Sun thing, bundled with piles of documentation, a shitty installer, a shitty license and lots of dependencies on stuff you don’t really need. There aren’t many alternatives. Apache JaxMe is a project that intends to fix these issues, but at version 0.5 it is somewhat of a risk to use in a project.
In short, like so many XML-based tools, JAXB is an over-architected solution that doesn’t actually have many advantages over hand-written parsing and unparsing code for the simple use cases that are so common. I’ve written plenty of XML parser code based on the DOM API and I’ve decided to keep doing that. In fact, in the time it took me to figure out the above, I could have written the parsing code for what I am currently building.