Mobile Linux

A lot has been written about mobile and embedded device platforms lately (a.k.a. ‘phone’ platforms). Usually the articles cover the usual incumbents, Android, iOS, and Windows Phone, plus the handful of alternatives from RIM and others. Most of the debate revolves around the question whether iOS will crush Android, or the other way around. It’s a rather boring debate that generally involves a lot of fanboys from either camp highlighting this or that feature, the beautiful design, and other stuff.

Recently this three-way battle (or two-way battle really, depending on your views regarding Windows Phone) has gotten a lot more interesting. In my view, however, this ‘war’ was actually decided nearly a decade ago, before it even started, and mobile Linux won in a very unambiguous way. What is really interesting is how this is changing the market right now.

Continue reading “Mobile Linux”

Git svn voodoo

As discussed earlier on this site, I recently started using git in our svn-centric organization. Since I’m trying to convince some co-workers to do the same, I’d like to share a bit of git voodoo for working with multiple git repositories and a central svn repository. Most git tutorials don’t really show how to do this, even though it is quite easy. The approach below gives you all the flexibility of git while allowing you to interoperate seamlessly with your svn-using colleagues.
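To give a flavor of what the full post covers, here is a minimal sketch of the basic git-svn round trip (the multi-repository tricks are in the post itself). It is driven from Python purely for illustration; the svn URL and directory name are placeholders.

    import subprocess

    def git(*args, cwd=None):
        # Run a git command and fail loudly if it errors.
        subprocess.run(("git",) + args, cwd=cwd, check=True)

    # One-time setup: clone the central svn repository into a local git repository.
    # --stdlayout assumes the usual trunk/branches/tags layout.
    git("svn", "clone", "--stdlayout", "https://svn.example.com/repo", "work")

    # Day to day: commit locally as often as you like, then pull in the latest
    # svn revisions by rebasing your local commits on top of them...
    git("svn", "rebase", cwd="work")

    # ...and push your local commits back to svn as individual svn revisions.
    git("svn", "dcommit", cwd="work")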

Continue reading “Git svn voodoo”

N900 tweaking

I’ve been tweaking my N900 quite a bit (just because I can).

Power management. Sadly, there are some power-management issues with certain wifi routers. If you find your connections timing out, the fix is to go to Settings, Internet connections, edit the problematic connection, and go to the last page, which has an Advanced button. There, under ‘other’, set power management to intermediate or off.

With that sorted out, you’ll want to be offline most of the day. So don’t turn on SIP/IM/Facebook unless you need it, and switch it off right after you’re done. Push email is nice, but with 15 or 30 minute polling your battery will last longer.

To gain insight, of course install battery-eye, which plots a graph of your battery’s power reserves. Finally, you may want to install a few applets to dim the screen, toggle wifi, and switch between 2G and 3G. You can find these in the extras repository that is enabled by default in the application manager.

Apt-get. The application manager is nice but a bit sluggish, and it insists on refreshing catalogs after just about every tap. Use it to install openssh, and make sure to pick a good password (or set up key authentication). Then ssh into your N900 and use apt-get update and apt-get install just like you would on any decent Debian box. This is why you got this device.
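If you’d rather drive those apt-get steps from your desktop than type them on the device, a sketch along these lines works. This is my addition rather than part of the original tip: the hostname and credentials are placeholders, and it needs the third-party paramiko library.

    import paramiko

    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect("n900.local", username="root", password="change-me")

    # Refresh the catalogs once, then install whatever you like.
    for cmd in ("apt-get update", "apt-get install -y htop"):
        stdin, stdout, stderr = client.exec_command(cmd)
        print(stdout.read().decode(), stderr.read().decode())

    client.close()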

Finding stuff to install. Instead of listing all the crap I installed, I’ll provide something more useful: ways of finding crap to install.

  • Ovi store. Small selection of goodies. Check it out, but don’t count on finding too much there; included for completeness.
  • Misc sites with the latest cool stuff.
  • Advanced (i.e. don’t come crying when you mess up and have to reflash): enable the extras, extras-testing, and extras-devel repositories from here. Many useful things are provided there, and some of them have the potential to seriously mess up your device. Extras-devel is where all the good stuff comes from, but it’s very much like Debian unstable.

Browser extensions. The N900 browser supports extensions. Install the adblock and maemo geolocation extensions through the application manager.

Use the browser. Instead of native applications, you can rely on the browser and web applications:

  • Cotchin. A web-based Foursquare client. Relies on the geolocation API for positioning.
  • Google Reader for touch-screen phones.
  • Google Maps mobile. Includes Latitude, routing, and other cool features. Relies on the geolocation API for positioning.
  • Maemaps. A pretty cool, N900-optimized unofficial frontend for Google Maps.
  • Hahlo. A nice Twitter client in the browser.

Missing the point

Like most of you (probably), I’ve been reading the news around Google Buzz with interest. At this point, the regular-as-clockwork announcements from Google are treated somewhat routinely by the various technology blogs. Google announces foo, competitor bar says this, and expert John Doe says that. Bla bla bla, revolutionary, bla bla, similar to bla, bla. Etc. You might be tempted to dismiss Buzz as yet another Google service doomed to be ignored by most users. And you’d be right. Except it’s easy to forget that most of those announcements actually do have some substance. Sure, there have been a few less-than-exciting ones lately, and not everything Google touches turns to gold, but there is some genuinely cool stuff being pushed out into the world from Mountain View on a monthly, if not more frequent, basis.

So this week it’s Google Buzz. Personally, I think Buzz won’t last, at least not in its current Gmail-centric form. Focusing on Buzz is missing the point, however. It will have a lasting effect similar to what happened with RSS a few years back. The reason is very simple: Google is big enough to cause everybody else to implement its APIs, even if Buzz is not going to be a huge success. They showed this with OpenSocial, which world + dog now implements, despite it being very unsuccessful in user space. Google Wave: same thing so far. The net effect of Buzz and the APIs that come with it will be internet-wide endorsement of a new real-time notification protocol, PubSubHubbub. In effect this will take Twitter (already an implementer) to the next level. Think PubSubHubbub sinks and sources all over the internet and absolutely massive traffic between those sources and sinks. Every little internet site will be able to notify the world of whatever updates it has, and every person on the internet will be able to subscribe to such notifications directly or, more importantly, indirectly through whichever other websites choose to consume, funnel, and filter those notifications on their behalf. It’s so easy to implement that few will resist the temptation to do so.
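To back up that ‘easy to implement’ claim: on the subscriber side, PubSubHubbub boils down to one form-encoded POST to the hub, plus a callback endpoint that echoes the hub’s verification challenge and receives pushed updates. A minimal sketch of the subscription request; the hub, topic, and callback URLs are placeholders.

    from urllib.parse import urlencode
    from urllib.request import urlopen

    hub = "https://hub.example.com/"  # the hub advertised in the publisher's feed

    params = urlencode({
        "hub.mode": "subscribe",
        "hub.topic": "http://example.com/feed.atom",        # the feed you want pushed to you
        "hub.callback": "http://mysite.example.com/push",    # your own endpoint
        "hub.verify": "async",                               # hub confirms via the callback
    }).encode()

    # The hub replies 202 Accepted, then calls your callback with a hub.challenge
    # that you must echo back to confirm the subscription.
    with urlopen(hub, data=params) as response:
        print(response.status, response.reason)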

Buzz is merely the first large-scale consumer of PubSubHubbub notifications. FriendFeed tried something similar with RSS, was bought by Facebook, and was thereby successfully eliminated as a Facebook competitor. However, PubSubHubbub is the one protocol that Facebook won’t be able to ignore. For now they seem to stick with their closed-everything model. This means there is Facebook and the rest of the world, with well-guarded boundaries between the two. As the rest of the world becomes more interesting in terms of notifications, keeping Facebook isolated as it is today will become harder. Technically, there are no obstacles; the only reason Facebook is isolated is because it chooses to be. Anybody who is not Facebook has a stake in committing to PubSubHubbub in order to compete with Facebook. So Facebook becoming a consumer of PubSubHubbub-style notifications is a matter of time, if only because it will simply be the easiest way for them to syndicate third-party notifications (which is their core business). I’d be very surprised if they hadn’t got something implemented already. Facebook becoming a source of notifications is a different matter, though. The beauty of the whole thing is that the more notifications originate outside of Facebook, the less this will matter. Already some of their status updates are simply syndicated from elsewhere (e.g. mine go through Twitter). Facebook is merely a place people go to see an aggregated view of what their friends do. It is not a major source of information, and ironically the limitations imposed by Facebook make it less competitive as such.

So, those dismissing Buzz for whatever reason are missing the point: it’s the APIs, stupid! Open APIs, unrestricted syndication and aggregation of notifications, events, status updates, etc. It’s been talked about for ages, and it’s about to happen in the next few months. The first things to catch up will be those little social network sites that almost nobody uses but that collectively are used by everybody. Hook them up to Buzz, Twitter, etc. The result: more detailed event streams popping up outside of Facebook. Eventually people will start hooking up Facebook as well, with or without Facebook’s help. By that time, endorsement will seem like a good survival strategy for Facebook.

maven: good ideas gone wrong

I’ve had plenty of time to reflect on the state of server-side Java, technology, and life in general this week. The reason for all this extra ‘quality’ time was that I was stuck in an endless loop: waiting for maven to do its thing, observing that it failed in subtly different ways, tweaking some more, hitting arrow-up + enter (repeat last command), and twiddling my thumbs for two more minutes. This is about as tedious as it sounds. Never mind the actual problem, I fixed it eventually. But the key thing to remember here is that I lost a week of my life on stupid bookkeeping.

On to my observations:

  • I have more XML in my maven pom files than I ever had in my ant build.xml files four years ago, and that includes running unit tests, static code checkers, packaging jars & installers, etc. Maven does a lot of things I don’t need it to do, yet it has an uncanny ability not to do what I want when I need it to, or to first do things that are arguably redundant and time-consuming.
  • Maven documentation is fragmented across wikis, the javadoc of diverse plugins, forum posts, etc. Google can barely make sense of it. Neither can I. Seriously, I don’t care about your particular ideology regarding elegance: just tell me how the fuck I set parameter foo on plugin bar, what its goddamn default is, and what other parameters exist that I might be interested in.
  • For something that is supposed to save me time, I sure as hell am wasting a shitload of time making it do what I want, watching it do what it does (or not), and fixing the way it works. I had no idea compiling & packaging fewer than 30 .java files could be so slow.
  • Few people around me dare to touch pom files. It’s like magic, and I hate magicians myself. When it doesn’t work, they look at me to fix it. I’ve been there before and it was called ant. Maven just moved the problem: it didn’t solve a single problem I had five years ago doing the exact same shit with ant, nor did it make anything easier.
  • Maven compiling, testing, packaging, and deploying defeats the purpose of having incremental compilation and dynamic class (re)loading. It’s just insane how all this application server deployment shit throws you right back to the nineteen seventies. Edit, compile, test, integration-test, package, deploy, restart server, debug. Technically it is possible to do just edit, debug. Maven is part of the problem here, not the solution. It actually insists on this order of things (euphemistically referred to as a lifecycle) and makes you jump through hoops to get your work done in something resembling real time.
  • Nine out of ten times when maven enters the unit- and integration-test phases, I did not actually modify any code. Technically, that’s just a waste of time (which my employer gets to pay for). Maven is not capable of remembering what you did and what has changed since the last time you ran it, so, like any bureaucrat, it basically does maximum damage to compensate for its ignorance.
  • Life used to be simple with a source dir, an editor, a directory of jars, and an incremental compiler. Back in 1997, Java recompiles took me under 2 seconds on a 486 Windows NT 3.51 machine with ‘only’ 32 MB, UltraEdit, an IBM incremental Java compiler, and a handful of five-line batch files. Things have gotten slower and more tedious, and definitely not faster, since then. It’s not like I have many more lines of code around these days. Sure, I have plenty of dependencies, but those are resolved at run time, just like in 1997, and are a non-issue at compile time. However, I can’t just run my code: I first have to download the world, wrap things up in a jar or war, copy it to some other location, launch some application server, etc., before I’m even in a position to see whether I need to switch back to my editor to fix some minor detail.
  • Your deployment environment requires you to understand the ins and outs of where stuff needs to be deployed, what directory structures need to be there, etc. Basically, if you don’t understand this, writing pom files is going to be hard. If you do understand all this, pom files won’t save you much time and will be tedious instead; you could write your own bash scripts, .bat files, or ant files to achieve the same goals. Really, there are only so many ways you can zip a directory into a .jar or .war file and copy it from A to B.
  • Maven is brittle as hell. Few people on your project will understand how to modify a pom file, so they do what they always do, which is copy-paste bits and pieces that are known to more or less do what is needed elsewhere. The result is maven hell. You’ll be stuck with no-longer-needed dependencies, plugins that nobody has a clue about, redundant profiles for phases that are never executed, half-broken profiles for stuff that is actually needed, and random test failures. It’s ugly. It took me a week to sort out the stinking mess in the project I joined a month ago, and I still don’t like how it works. Pom.xml is the new build.xml: nobody gives a shit about what is inside these files, and people will happily copy-paste fragments until things work well enough for them to move on with their lives. Change one tiny fragment and all hell breaks loose, because it kicks the shit out of all the wrong assumptions embedded in the pom file.

Enough whining, now on to the solutions.

  • Dependency management is a good idea. However, your build file is the wrong place to express dependencies. OSGi gets this somewhat right, except it still externalizes dependency configuration from the language. Obviously, the solution is to integrate the component model into the language: using something must somehow imply depending on it. Possibly the specific version of what you depend on is something you might configure centrally, but beyond that: automate the shit out of it, please. Any given component or class should be self-describing. Build tools should be able to figure out the dependencies without us writing them down (a toy sketch of this idea follows after this list). How hard can it be? That probably means a language needs to come into existence that supersedes the existing ones; no language I know of gets this right.
  • Compilation and packaging are outdated ideas. Basically, the application server is the run-time of your code. Why doesn’t it just take your source code, derive its dependencies, and run it? Every step between editing and running your code is an opportunity to introduce mistakes & bugs. Shortening the distance between editor and run-time is good. Compilation is just an optimization. Sure, it’s probably a good idea for the server to cache the results somewhere, but don’t bother us with having to spoon-feed it stupid binaries in some weird zip file format. One of the reasons scripting languages are so popular is that they reduce the above-mentioned cycle to edit, F5, debug. There’s no technical reason whatsoever why this would not be possible with statically compiled languages like Java. Ideally, I would just tell the application server the URL of the source repository, give it the necessary credentials, and then simply alt-tab between my browser and my editor. Everything in between is stupid work that needs to be automated away.
  • The file system hasn’t evolved since the nineteen seventies. At the conceptual level, you modify a class or lambda function or whatever, that changes some behavior in your program, and you verify the change. In practice you have to worry about how code gets translated into binary (or ASCII) blobs on the file system, how to transfer those blobs to some repository (svn, git, whatever), then how to transfer them from wherever they are to wherever they need to be, and how they get picked up by your run-time environment. Eh, that’s just stupid bookkeeping; can I please have some more modern content management here (version management, rollback, auditing, etc.)? VisualAge actually got this (partially) right before it mutated into Eclipse: software projects are just databases. There’s no need for software to exist as text files other than to feed nineteen-seventies-style tool chains.
  • Automated unit, integration, and system testing are good ideas. However, squeezing them in between your run-time and your editor is just counterproductive. Deploy first, test afterwards, and automate & optimize everything in between to take the absolute minimum of time. Inserting automated tests between editing and manual testing is a particularly bad idea: essentially, it just adds time to your edit-debug cycle.
  • XML files are just fucking tree structures serialized in a particularly tedious way. Pom files are basically arbitrary, schema-less XML tree structures. That’s fine for machine-readable data, but editing it manually is just a bad idea. The less XML in my projects, the happier I get. The less I need to worry about transforming tree structures into object trees, the happier I get. In short, let’s get rid of this shit. Basically the content of my pom files is everything my programming language could not express. So we need more expressive programming languages, not entirely new languages to complement the existing ones. XML dialects are just programming languages without all of the conveniences of a proper IDE (debuggers, code completion, validation, testing, etc.).
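To make the first point above slightly more concrete, here is a toy sketch (in Python rather than Java, and nothing more than an illustration of the idea) of deriving third-party dependencies from the source itself instead of from a hand-maintained build file. It needs Python 3.10+ for sys.stdlib_module_names.

    import ast
    import pathlib
    import sys

    def third_party_imports(source_dir: str) -> set[str]:
        # Collect top-level imported module names that are neither part of the
        # standard library nor local modules in the source tree.
        files = list(pathlib.Path(source_dir).rglob("*.py"))
        local = {p.stem for p in files} | {p.parent.name for p in files}
        found = set()
        for path in files:
            tree = ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
            for node in ast.walk(tree):
                if isinstance(node, ast.Import):
                    found.update(alias.name.split(".")[0] for alias in node.names)
                elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                    found.add(node.module.split(".")[0])
        return found - set(sys.stdlib_module_names) - local

    if __name__ == "__main__":
        print(sorted(third_party_imports(".")))

A real build tool would of course still have to map those names to versioned artifacts, but the point stands: the information is already in the code.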

Ultimately, maven is just a stopgap. And not even particularly good at what it does.

update 27 October 2009

Somebody produced a great study on how much time is spent on incremental builds with various build tools. This backs up my key argument really well. The most startling outcome:

Java developers spend 1.5 to 6.5 work weeks a year (with an average of 3.8 work weeks, or 152 hours, annually) waiting for builds, unless they are using Eclipse with compile-on-save.

I suspect that where I work, we’re close to 6.5 weeks. Oh yeah, they single out maven as the slowest option here:

It is clear from this chart that Ant and Maven take significantly more time than IDE builds. Both take about 8 minutes an hour, which corresponds to 13% of total development time. There seems to be little difference between the two, perhaps because the projects where you have to use Ant or Maven for incremental builds are large and complex.

So, for anyone who still doesn’t get what I’m talking about here: build tools like maven are serious time-wasters. There are tools out there that reduce this time to close to zero. I repeat: Python + Django = edit, F5, edit, F5. No build/restart time whatsoever.

N900 & Slashdot

I just unleashed the stuff below in a Slashdot thread. Ten years ago I was a regular there (posting multiple times per day), and today I realized that I hadn’t actually bothered to sign into Slashdot since buying a Mac a few months ago. Anyway, since I spent time writing this I might as well repost it here. On a side note, they support OpenID for login now! Cool!

…. The next-gen Nokia phone [arstechnica.com] on the other hand (successor to the N900) will get all the hardware features of the iPhone, but with the openness of a Linux software stack. Want to make an app that downloads podcasts? Fine! Want to use your phone as a modem? No problem! In fact, no corporation enforcing its moral or business rules on how you use your phone, and no alienation of talented developers [macworld.com]!

You might make the case that the N900 already has the better hardware when you compare it to the iPhone. And for all the people dismissing Nokia as just a hardware company: there is a ton of non-trivial Nokia IPR in the software stack as well (not all OSS, admittedly) that provides lots of advantages in performance and energy efficiency: excellent multimedia support (something a lot of smartphones are really bad at), hardware acceleration, etc. Essentially, most vendors ship different combinations of chips coming from a very small range of companies, so from that point of view it doesn’t really matter what you buy. The software on top makes all the difference, and the immaturity of newer platforms such as Android can be a real deal-breaker when it comes to e.g. battery life, multimedia support, support for peripherals, etc. There’s a difference between running Linux on a phone and running it well. Nokia has invested heavily in the latter and employs masses of people specialized in tweaking hardware and software to get the most out of the hardware.

But the real beauty of the N900 for the Slashdot crowd is simply the fact that it doesn’t require hacks or cracks: Nokia actively supports & encourages hackers with features, open-source developer tools, websites, documentation, sponsoring, etc. Google does that to some extent with Android, but the OS is off-limits to normal users. Apple actively tries to stop people from bypassing the App Store and is pretty hostile to attempts to modify the OS in ways they don’t like. Forget about other platforms. Palm technically uses Linux, but they are keeping even the JavaScript + HTML API they have away from users. It might as well be completely closed source; you wouldn’t know the difference.

On the other hand, the OS on the N900 is Debian-based. As on Debian, package sources are configured in /etc/apt/sources.list, and apt-get and dpkg work just as you would expect on any decent Debian distribution. You have root access, so you can modify any file, including sources.list. Much of Ubuntu actually compiles with little or no modification, and most of the problems you are likely to encounter relate to the small screen size. All it takes to get to that software is pointing your phone at the appropriate repositories. At some point there was even a Nokia-sponsored Ubuntu port to ARM, so there is no lack of stuff that you can install, including stuff that is pretty pointless on a smartphone (like large parts of KDE). But hey, you can do it! Games, productivity tools, you name it, and there probably is some geek out there who managed to get it to build for Maemo. If you can write software, package it as a Debian package, and cross-compile it to ARM (using the excellent OSS tooling, of course), there’s a good chance it will just work.

So, you can modify the device to your liking at a level no other mainstream vendor allows. Having a modifiable Debian-style Linux system with free access to all of the OS, on top of what is essentially a very compact touch-screen device complete with multiple radios (Bluetooth, 3G, WLAN), sensors (GPS, motion, light, sound), graphics hardware, and a DSP, should be enough to make any self-respecting geek drool.

Now, with the N900 you get all of that, shipped as a fully functional smartphone with all of the features Nokia phones are popular for, such as excellent voice quality and phone features, decent battery life (of course, with all the radios turned on and video & audio playing non-stop, your mileage may vary), great build quality and form factor, good support for Bluetooth and other accessories, etc. It doesn’t get any more open in the current phone market, and this is still the largest mobile phone manufacturer in the world.

In other words, Nokia is sticking its neck out for you by developing and launching this device & platform while proclaiming it to be the future of Nokia smartphones. It’s risking a lot here, because there are lots of parties in the market that are in the business of denying developers freedom and securing exclusive access to mobile phone software. If you care about stuff like this, vote with your feet and buy this or similarly open (suggestions, anyone?) devices from operators that support you instead of preventing you from doing so. If Nokia succeeds here, that’s a big win for the OSS community.

Disclaimer: I work for Nokia and I’m merely expressing my own views and not representing my employer in any way. That being said, I rarely actively promote any of our products and I choose to do so with this one for one reason: I believe every single word of it.

Publications backlog

I’m now a bit more than half a year into my second ‘retirement’ from publishing (and I’m not even 35). The first one was when I was working as a developer at GX Creative Online Development in 2004-2005 and was paid to write code instead of text. In between then and my current job (back to coding), I was working at Nokia Research Center. So naturally I did lots of writing during that time, and naturally I changed jobs before things started to actually appear on paper. Anyway, I have just added three items to my publications page; PDFs will follow later. One of them is a magazine article for IEEE Pervasive Computing that I wrote together with my colleagues in Helsinki about the work we have been doing there for the past two years. I’m particularly happy about getting that one out. It was accepted for publication in August, and hopefully it will end up on actual dead trees soon. Once IEEE puts the PDF online, I’ll add it here as well. I’ve still got one more journal paper in the pipeline; hopefully I’ll get some news on that one soon. After that, I don’t have anything planned, but you never know, of course.

However, I must say that I’m quite disappointed with the whole academic publishing process, particularly when it comes to journal articles. It’s slow and tedious, the decision process is arbitrary, and ultimately only a handful of people read what you write, since most journals come with really tight access control. Typically that doesn’t even happen until 2-3 years after you write it (more in some cases). I suspect the only reason people read my stuff at all is that I’ve been putting the PDFs on my site. I get more hits per day (80-100 on average) on a few stupid blog posts than most of my publications have gotten in the past decade. From what I managed to piece together on Google Scholar, I’m not even doing that badly with some of my publications (in terms of citations). But really, academic publishing is a remarkably inefficient way of communicating.

Essentially the whole process hasn’t evolved much since the 17th century, when the likes of Newton and Leibniz started communicating their findings in journals and print. The only qualitative difference between a scientific article and a blog post is so-called peer review (well, that and the fact that writing articles is a shitload of work). This is sort of like the Slashdot moderation system, but performed by peers in the academic community (with about the same bias to the negative) who get to decide what is good enough for whatever workshop, conference, or journal you are targeting. I’ve done this chore as well, and I would say that, like on Slashdot, most of the material passing across my desk is of a rather mediocre level. Reading the average proceedings in my field is not exactly fun, since 80% tends to be pretty bad. Reading the stuff that doesn’t make it (40-60% for the better conferences) is worse, though. I’ve done my bit of flaming on Slashdot (though not recently) and still maintain excellent karma there (i.e. my peers like me there). Likewise, out of the 30+ publications on my publication page, only a handful are actually something I still consider to have been worth my time to write.

The reason there are so many bad articles out there is that the whole process is optimized for meeting the mostly quantitative goals that universities and research institutes set for their academic staff. To reach these goals, academics organize workshops and conferences with and for each other, which provide them with a channel for meeting those targets. The result is workshops full of junior researchers, like I once was, trying to sell their early efforts. Occasionally some really good stuff is published this way, but generally the more mature material is saved for conferences, which have a somewhat wider audience and stricter reviewing. Finally, the only thing that really counts in the academic world is journal publications.

Those are run by for-profit publishing companies that employ successful academics to do the content sorting and peer-review coordination for them. Funnily enough, these tend to also be the people running the conferences and workshops: basically, veterans of the whole peer-reviewing process. Journal sales are based on volume (e.g. once a quarter or once a month), reputation, and a steady supply of new material. This is a business model that the publishing industry has perfected over the centuries, and many millions in research money flow straight to publishers. It is based on a mix of good-enough papers that libraries & research institutes will pay to access, and the need of the people in these institutes to get published, which requires access to the published work of others. Good enough is of course a relative term here. If you set the bar too high, you’ll end up not having enough material to make running the journal commercially viable. If you set it too low, no one will buy it.

In other words, from top to bottom the scientific publishing process is optimized for keeping most of the academic world employed, while sorting out the bad eggs and boosting the reputation of those who perform well. Nothing wrong with that, except that for every Einstein there are tens of thousands of researchers who will never publish anything significant or groundbreaking but who get published anyway. Put differently, most published work is apparently worth the paper it is printed on (at least to the publishing industry), but not much more. I’ve always found the economics of academic publishing fascinating.

Anyway, just some Sunday morning reflections.

Photos Rome

I’ve been polishing my photos in Picasa and ended up using the nice sync feature to upload them to the corresponding photo sharing site as well. So go here to enjoy them.

Picasa is a bit of a downgrade, since I used to spend way too much time polishing with powerful tools such as Photoshop, Gimp, etc. However, I find I like the workflow in Picasa better, and while the few basic edits you can do there leave something to be desired, it’s good enough. I have the Gimp installed as well, but it’s so slow, buggy, and weird to work with that it’s offensive, and I won’t be investing in Photoshop on my new Mac since the price is just way too high. Technically I could go for Photoshop Elements, except it doesn’t come with some features that I really want (24- & 32-bit images, LAB mode, layers & ways to combine them, flexible masking, etc.). You can sort of do some of that in the Gimp, but it is frankly painful and the results tend to be underwhelming. I have some hopes that this KOffice photo thingy might live up to some of the hype; I’ll give it a try as soon as I can lay my hands on some Mac OS X binaries. Otherwise, if anyone knows of any other OSS photography tools for Mac, I’d be very interested. I’m already a Hugin user, as blogged earlier this week (see the above-linked album for some nice panoramas).

OpenID, the identity landscape, and social networks

I’m still getting used to no longer being in Nokia Research Center. One of my disappointments while in NRC, as a vocal proponent of OpenID, social networks, etc., was that despite lots of discussion on these topics, not much happened in terms of me getting room to work on them or convincing many people of my opinions about them. I have one publication that is due out whenever the magazine involved gets around to approving and printing the article. But that’s it.

So I take great pleasure in observing how things have been evolving lately and finding that I’ve been pushing the right topics all along. Earlier this week, Facebook became a relying party for OpenID. Outside the OpenID community and regular TechCrunch readers, this seems not to have been a major news story. Yet just about anybody I discussed this topic with in the past few years (you know who you are) insisted that “no way will a major network like Facebook ever use OpenID”. If you were one of those people: admit right now that you were wrong.

It seems to me that this is a result of the fact that the social networking landscape is maturing. As part of this maturation process, several open standards are emerging. Identity and authentication are very important topics here, and the consensus increasingly seems to be that no single company is going to own all 6-7 billion identities on this planet. So naturally, any company with the ambition to potentially separate 6-7 billion individuals from their money for some product or service will need to work with multiple identity providers.

Such companies naturally require a standard for doing so. That standard is OpenID. It has no competition; there is no alternative. There are plenty of proprietary APIs that only work with limited sets of identity providers, but none that, like OpenID, can work with all of them.
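For the sceptics, here is roughly what the relying-party side of OpenID 2.0 boils down to, as a hedged sketch: discover the user’s provider, redirect the browser there with a checkid_setup request, and verify the signed response that comes back. Discovery and response verification are omitted here, and the endpoint and URLs are placeholders.

    from urllib.parse import urlencode

    OP_ENDPOINT = "https://openid.example.com/server"  # found via Yadis/HTML discovery
    IDENTIFIER = "https://alice.example.com/"          # the identifier the user typed in

    def checkid_setup_url(return_to: str, realm: str) -> str:
        # Build the URL the user's browser is redirected to at their OpenID provider.
        params = {
            "openid.ns": "http://specs.openid.net/auth/2.0",
            "openid.mode": "checkid_setup",
            "openid.claimed_id": IDENTIFIER,
            "openid.identity": IDENTIFIER,
            "openid.return_to": return_to,  # the provider sends the user back here
            "openid.realm": realm,          # the site the user is asked to trust
        }
        return OP_ENDPOINT + "?" + urlencode(params)

    print(checkid_setup_url("https://mysite.example.com/openid/return",
                            "https://mysite.example.com/"))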

Similarly, major identity providers like Google and Facebook are stuck sharing a few hundred million users between them, so they shift their attention to somehow involving all the users that didn’t sign up with them. Pretty much all of them are OpenID providers already; Facebook just took the obvious next step of becoming a relying party as well. The economics are mindbogglingly simple: Facebook doesn’t make money from verifying people’s identity, but it does make money from people using its services. Becoming an OpenID relying party means the group of people who can access those services just grew to the entire internet population. Why wouldn’t they want that? Of course this doesn’t mean that world + dog will now be a Facebook user, but it does mean that one important obstacle has just disappeared.

BTW, Facebook’s current implementation is not very intuitive. I’ve been able to hook up myOpenID to my Facebook account, but I haven’t actually found a login page where I can log in with my OpenID yet. It seems this is still a work in progress.

Anyway, this concludes my morning blogging session. Haven’t blogged this much in months. Strange how the prospect of not having to work today is energizing me 🙂

wolframalpha

A few years ago, a good friend gave me a nice little present: 5 kilos of dead tree in the form of Stephen Wolfram’s “A New Kind of Science”. I never read it cover to cover and merely scanned a few pages with lots of pretty pictures before deciding that this wasn’t really my cup of tea. I also read some of the criticism of this book from the scientific community. I’m way out of my league there, so no comments from me except a few observations:

  • The presentation of the book is rather pompous and arrogant. The author tries to convince readers that they hold the most important piece of science ever produced in their hands.
  • This is what set off most of the criticism. Apparently, the author fails both to credit related work and to back up some of his crucial claims with proper evidence.
  • Apparently there are quite a few insufficiently substantiated claims, which affects the credibility of the book and of the author’s claims overall.
  • The author took an ivory-tower approach to writing the book: he quite literally dedicated a decade-plus of his life to it, during which he did not seek out much criticism from his peers.
  • So, the book is controversial and may either turn out to be the new relativity theory (relatively speaking) or a genuine dud. I’m out of my league deciding either way.

Anyway, the same Stephen Wolfram has for years been providing the #1 mathematical software IDE, Mathematica, which is one of the most popular software tools for anyone involved with mathematics. I’m not a mathematician and haven’t touched such tools in over 10 years (I dabbled a bit with linear algebra in college), but as far as I know, his company and product have a pretty solid reputation.

Now the same person has brought the approach he applied to his book, and his solid reputation as the owner of Mathematica, to the wonderful world of Web 2.0. That is something I know a thing or two about. Given the above, I was initially quite sceptical when the first, pretty wild rumors around wolframalpha started circulating. However, some hands-on experience has just changed my mind. So here’s my verdict:

This stuff is great & revolutionary!

No, it’s not Google. It’s not Wikipedia either. It’s not the Semantic Web either. Instead, it’s a knowledge-reasoning engine hooked up to some authoritative data sets. So it’s not crawling the web, it’s not user-editable, and it does not rely on traditional Semantic Web standards from e.g. the W3C (though it very likely uses similar technology).

This is the breakthrough that was needed. The Semantic Web community seems to be stuck in an endless loop, pondering pointless standards, query formats, and graph representations, and generally rehashing computer science topics that have been studied for 40 years now without producing many viable business models or products. Wikipedia is nice but very chaotic and unstructured as well. The marriage of the Semantic Web and Wikipedia is obvious, has been tried countless times, and has so far not produced interesting results. Google is very good at searching through the chaos that is the current web, but can be absolutely unhelpful with simple, fact-based questions. Most fact-based questions in Google return a Wikipedia article as one of the links. Useful, but it doesn’t directly answer the question.

This is exactly the gap that wolframalpha fills. There are many scientists and startups with the same ambition, but wolframalpha.com got to market first with a usable product that can answer a broad range of factual questions using knowledge imported into its system from trustworthy sources. It works beautifully for the facts and knowledge it has, and it allows users to do two things:

  • Find answers to pretty detailed queries from trustworthy sources. Neither Wikipedia nor Google can do this; at best they can point you at a source that has the answer and leave it up to you to judge the trustworthiness of that source.
  • Fact surfing! Just like surfing from one topic to the next on Wikipedia is a fun activity, I predict that drilling down into facts on wolframalpha will be equally fun and useful.

So what’s next? Obviously, wolframalpha.com will have competition. However, its core asset seems to be its reasoning engine combined with a huge fact database that is, to date, unrivaled. Improvements in both areas will solidify its position as market leader. I predict that several owners of large bodies of authoritative information will be itching to be part of this, and partnership deals will be announced. Wolframalpha could easily evolve into a crucial tool for knowledge workers; so crucial, even, that people might be willing to pay for access to certain information.

Some more predictions:

  • Several other startups will soon start competing with similar products. There should be dozens of companies working on related products already; maybe all they needed was somebody taking the first step.
  • Google likely has people working on such technologies; it will either launch or buy products in this space in the next two years.
  • Google’s main competitors, Yahoo and MS, have both been investing heavily in search technology and experience. They too will want a piece of this market.
  • With so much money floating around in this market, wolframalpha and similar companies should have no shortage of venture capital, despite the current crisis. Also, wolframalpha might end up being bought by Google or MS.
  • If not bought up or outcompeted (both of which I consider likely), wolframalpha will be the next Google.