Missing the point

Like most of you (probably), I’ve been reading the news around Google Buzz with interest. At this point, the regular-as-clockwork announcements from Google are treated somewhat routinely by the various technology blogs. Google announced foo, competitor bar says this and expert John Doe says that. Bla bla bla, revolutionary, bla bla similar to bla, bla. Etc. You might be tempted to dismiss Buzz as yet another Google service doomed to be ignored by most users. And you’d be right. Except it’s easy to forget that most of those announcements actually do have some substance. Sure, there have been a few less-than-exciting ones lately, and not everything Google touches turns to gold, but there is some genuinely cool stuff being pushed out into the world from Mountain View on a monthly, if not more frequent, basis.

So this week it’s Google Buzz. Personally, I think Buzz won’t last, at least not in its current Gmail-centric form. Focusing on Buzz, however, is missing the point. It will have a lasting effect similar to what happened with RSS a few years back. The reason is simple: Google is big enough to cause everybody else to implement its APIs, even if Buzz itself is not a huge success. They showed this with OpenSocial, which world + dog now implements despite it being very unsuccessful in user space. Google Wave, same thing so far. The net effect of Buzz and the APIs that come with it will be internet-wide endorsement of a new real-time notification protocol: PubSubHubbub. In effect this will take Twitter (already an implementer) to the next level. Think PubSubHubbub sinks and sources all over the internet and absolutely massive traffic between those sources and sinks. Every little internet site will be able to notify the world of whatever updates it has, and every person on the internet will be able to subscribe to such notifications directly or, more importantly, indirectly via whichever other websites choose to consume, funnel, and filter those notifications on their behalf. It’s so easy to implement that few will resist the temptation to do so.
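
To illustrate just how low the barrier is, here’s a minimal Python sketch of the subscriber side, based on my reading of the PubSubHubbub spec. The hub URL is the reference hub Google runs; the callback URL is made up, and you’d have to host a real endpoint there that echoes back the hub.challenge parameter the hub sends to verify the subscription.

    import urllib
    import urllib2

    def subscribe(hub, topic, callback):
        # One form-encoded POST to the hub is all it takes. The hub then
        # verifies intent by GETting the callback with a hub.challenge to
        # echo back, and POSTs Atom entries to it whenever the topic updates.
        params = urllib.urlencode({
            'hub.mode': 'subscribe',
            'hub.topic': topic,         # the feed you want to follow
            'hub.callback': callback,   # your publicly reachable endpoint
            'hub.verify': 'sync',
        })
        return urllib2.urlopen(hub, params).getcode()  # 204 on success

    subscribe('http://pubsubhubbub.appspot.com/',
              'http://www.jillesvangurp.com/feed/',
              'http://example.com/psh-callback')  # hypothetical callback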

Buzz is merely the first large-scale consumer of PubSubHubbub notifications. FriendFeed tried something similar with RSS, was bought by Facebook, and was successfully eliminated as a Facebook competitor. However, PubSubHubbub is the one protocol that Facebook won’t be able to ignore. For now they seem to be sticking with their closed-everything model. This means there is Facebook and the rest of the world, with well-guarded boundaries between them. As the rest of the world becomes more interesting in terms of notifications, keeping Facebook as isolated as it is today will become harder. Technically, there are no obstacles. The only reason Facebook is isolated is that it chooses to be. Anybody who is not Facebook has a stake in committing to PubSubHubbub in order to compete with Facebook. So Facebook becoming a consumer of PubSubHubbub-style notifications is a matter of time, if only because it will simply be the easiest way for them to syndicate third-party notifications (which is their core business). I’d be very surprised if they hadn’t got something implemented already. Facebook becoming a source of notifications is a different matter, though. The beauty of the whole thing is that the more notifications originate outside of Facebook, the less this will matter. Already some of their status updates are simply syndicated from elsewhere (e.g. mine go through Twitter). Facebook is merely a place people go to see an aggregated view of what their friends do. It is not a major source of information, and ironically the limitations imposed by Facebook make it less competitive as such.

So, those dismissing Buzz for whatever reason are missing the point: it’s the APIs, stupid! Open APIs, unrestricted syndication and aggregation of notifications, events, status updates, etc. It’s been talked about for ages, and it’s about to happen in the next few months. The first things to catch up will be those little social network sites that almost nobody uses but that collectively are used by everybody. Hook them up to Buzz, Twitter, etc. Result: more detailed event streams popping up outside of Facebook. Eventually people will start hooking up Facebook as well, with or without Facebook’s help. By that time, endorsement will seem like a good survival strategy for Facebook.

Indie Social Networking

I have a page elsewhere on this site where I try to keep track of the various accounts I have with social networks and other sites. I updated it earlier today with some interesting additions.

It seems decentralized social networking is finally starting to happen. It’s all very low-profile for now, but promising. It started somewhere last week when I noticed that one of my colleagues, John Kemp, was now microblogging via something called identi.ca. I noticed this because his status in Skype was telling me. Since we share similar interests in OpenID and a few other things, I decided to check it out. I never really bought into this Twitter stuff and gave up on updating my Facebook status regularly a long time ago. But identi.ca looks rather cool, so I signed up.

It’s basically Twitter minus some features (not yet implemented), with a few interesting twists:

  • You can sign in using OpenID
  • It’s open source. The software identi.ca is based on is called laconi.ca.
  • It’s completely open. It has all the hooks and obvious protocols implemented. For example, I microblog using an identi.ca contact in my Jabber client (Pidgin) over XMPP. There’s RSS, a Twitter-style API, and probably more (see the posting sketch after this list).
  • Your friends’ info is available as FOAF, thus enabling Google’s Social Graph search to work with the data there and in other places (like e.g. your WordPress linkdump).
  • It’s decentralized: you can have laconi.ca friends on different servers. As with email, there is no need for everybody to be on the same server.
  • It’s written in PHP, so you can probably install it on any decent hosting provider. You can now run your own microblog just like you can run your own blog.
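
To give you an idea of how hackable this is: since laconi.ca speaks a Twitter-style API, posting a notice from Python takes only a few lines. A sketch, with the endpoint path as I remember it from the identi.ca docs and obviously made-up credentials:

    import base64
    import urllib
    import urllib2

    def dent(user, password, status, server='http://identi.ca'):
        # laconi.ca mirrors the Twitter REST API, including basic auth
        request = urllib2.Request(server + '/api/statuses/update.json',
                                  urllib.urlencode({'status': status}))
        credentials = base64.b64encode('%s:%s' % (user, password))
        request.add_header('Authorization', 'Basic ' + credentials)
        return urllib2.urlopen(request).read()

    print dent('yourname', 'yourpassword', 'posting through the API')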

Of course, being low-profile, it’s only the usual suspects who are active there, i.e. people like me.

A second interesting site I bumped into is whoisi.com. It’s basically FriendFeed (or similar sites) with a few interesting twists:

  • You don’t have to sign in or register. You just start using it.
  • In fact, you can’t sign in, and there’s little need to: whoisi creates an account for you on the fly, which you can access using the cookie it sets automatically or a URL you can bookmark (a sketch of the idea follows this list).
  • You can follow any person on the web and associate feeds with that person.
  • There’s no concept of your profile on whoisi. It’s simply a tool for following people, anonymously. They don’t even have to use whoisi in order for you to follow them.
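
That no-registration trick is a neat little pattern by itself. I have no idea what whoisi’s actual code looks like, but a minimal Django-flavoured sketch of the idea (all names made up) might look like this:

    import uuid

    from django.http import HttpResponse

    def timeline(request, token=None):
        # Identify the visitor by a token carried either in the URL (a
        # hypothetical /t/<token>/ route) or in a cookie; mint a fresh
        # anonymous account on the first visit.
        token = token or request.COOKIES.get('anon') or uuid.uuid4().hex
        response = HttpResponse('followed people for account %s' % token)
        # The cookie makes the account sticky for this browser; bookmarking
        # the token URL makes it portable across browsers and machines.
        response.set_cookie('anon', token, max_age=365 * 24 * 3600)
        return response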

It’s run by Christopher Blizzard, who works at Mozilla. I’m not sure if he is doing this in his spare time or if there’s a bigger Mozilla Labs plan behind it. Either way, he’s obviously a cool guy with good ideas. Since whoisi didn’t know about me yet, I ended up following myself, which feels slightly narcissistic, and added most of the interesting feeds, including of course my identi.ca feed.

It occurs to me that, using identi.ca’s FOAF and Google’s Social Graph search, whoisi should be able to automatically find websites related to a person from a single URL, by just following the rel="me" links that Google can produce and then any friends via the rel="friend" links. Check out what Google finds out about me from providing www.jillesvangurp.com here.
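
In fact, you can play with this data in a few lines of Python. A sketch against the Social Graph API as I understand it from Google’s documentation (parameter and field names from memory, so treat with suspicion):

    import urllib
    import urllib2
    import simplejson  # the stdlib json module in Python 2.6+ works too

    def other_me(url):
        # Ask the Social Graph API for outgoing edges and "me" links
        params = urllib.urlencode({'q': url, 'edo': '1', 'fme': '1'})
        data = simplejson.load(urllib2.urlopen(
            'http://socialgraph.apis.google.com/lookup?' + params))
        for node in data.get('nodes', {}).values():
            for target, edge in node.get('nodes_referenced', {}).items():
                if 'me' in edge.get('types', []):
                    yield target

    for site in other_me('http://www.jillesvangurp.com'):
        print site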

This hooking up of simple building blocks is exactly the point of decentralized social networking. It’s nice to see some useful building blocks emerge that work towards making this happen. Basically, all the necessary building blocks are already there. From a single link it is possible to construct a very detailed view of what your friends are doing all over the web, fully automatically. True, all this is still a bit too difficult for the average user right now, but I imagine that a bit of search-and-discovery magic would go a long way towards making this just work on a lot of sites.

WP-OpenID

I’ve been enthusiastic about OpenID for a while but have so far not managed to OpenID-enable my site. WP-OpenID, the main OpenID plugin for WordPress, is under quite active development. Unfortunately, until recently, every version I tried had issues that prevented me from using it.

The author, Will Norris, was hired by Vidoop the other day to continue working on wp-openid in the context of the DiSo project. DiSo is another thing I’m pretty enthusiastic about. So, things are improving on the OpenID front.

Tonight, I managed to get version 2.1.9 of wp-openid to install without any issues on my WordPress 2.5.1 blog. I’ve been testing it, and it seems to at least accept my OpenID, www.jillesvangurp.com (which delegates to myOpenID), without issues.

So finally, my blog is OpenID-enabled.

The delegation bit is, by the way, courtesy of another WordPress plugin: OpenID Delegation. I’ve been using the 0.1 version for more than a year and it just works. Delegation is an OpenID concept whereby any website can delegate OpenID authentication to an external OpenID provider. This allows you to use a URL you own as your identity, and also to switch providers without losing control of your OpenID URL.

captcha

It seems the captcha plugin (capcc) I was using with WordPress has been broken for some time, probably since I installed WordPress 2.5 a few weeks ago. My friend Christian del Rosso pointed this out. I’ve now installed a different plugin (yacaptcha), which looks nicer and hopefully works better too.

So if you couldn’t comment because of this, try again.

Google Webmaster Central

I’ve been using Google Analytics on my site for a while and it is really great. Recently my good friend Mark de Lange, who designs and owns websites such as drukenbestel.nl (in Dutch) for a living, pointed out that Google also has a great webmaster section with lots of useful tools and goodies. After claiming ownership of your site, you can run some analysis and get useful advice on improving it. Cool!

Crypto Crap in Python

I’m looking into doing a little cryptographic stuff in Python. Nothing fancy, just some standard stuff. Not for the first time, I’m bumping into the brick wall of “batteries included”, the notion that the Python standard library comes with a lot of stuff that should be good enough for whatever you need to do. The only problem is that it doesn’t. XML parsing stinks in Python; HTTP I/O stinks (you need lots of third-party stuff to make it usable); no UTF-8 by default; etc.

Out of the box, Python is bloody useless unless you want to do some very simplistic stuff. My problem is very simple: I need to be able to sign stuff and verify signatures in a way that is compatible with how this is commonly done on the internet™. I.e. you’d expect some pretty mature, well-tested libraries to be around for whatever programming language you’d like to use. I know exactly where to go to get this stuff for Java, for example.

So we’re looking at some very basic capability to do stuff with algorithms like RSA, SHA-1, MD5, etc. Batteries not included with Python at all, so I Googled a bit to find out what people commonly use for this in Python and stumbled upon what seems to be the most popular library: pycrypto. It seems to have all the algorithms. Great! Only one minor detail has had me crawling all over Google for the entire afternoon:

Public keys usually come as base64-encoded thingies: how the hell do I get them in and out of the functions, classes, and whatnot provided by pycrypto? Batteries not included. After a long search, I found this nice post.

Basically, it tells me that various people have bothered to provide nice libraries with relevant code for Python, but somehow all of them have neglected to provide this very basic functionality that you will need, 100% guaranteed. That just sucks. In the hypothetical case that you’d actually want to use this stuff to do hypothetically useful things, like verifying a signature attached to some HTTP request, you will basically find yourself reverse-engineering this poorly documented library and figuring out how to get from a base64-encoded RSA key to a properly configured RSA class instance and back again. I had lots of fun (not) reading about the details of RSA, X.509, etc.
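
For what it’s worth, later pycrypto releases do eventually fix exactly this: RSA.importKey() (added in 2.1) accepts PEM- and DER-encoded keys directly, and the Crypto.Signature package (added in 2.5) does proper PKCS#1 v1.5 padding. With those in place, the whole dance shrinks to a sketch like this:

    import base64

    from Crypto.Hash import SHA
    from Crypto.PublicKey import RSA
    from Crypto.Signature import PKCS1_v1_5

    def verify(pem_key, message, b64_signature):
        # importKey accepts PEM or DER encoded key material directly
        key = RSA.importKey(pem_key)
        # PKCS1_v1_5 does the padding that raw key.verify() leaves to you
        verifier = PKCS1_v1_5.new(key)
        return verifier.verify(SHA.new(message),
                               base64.b64decode(b64_signature))

verify() returns True or False; the message has to be the exact bytes that were signed, so watch your encodings.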

Eventually I found some sample code here that seems to half do what I need. But I’d prefer to reuse something that is hassle-free instead of copy-pasting somebody else’s code and debugging it until it works as expected, basically reinventing the wheel by creating what would amount to Jilles’ private little Python crypto library. I have better things to do.

Web application scalability

It seems InfoQ picked up some stuff from a comment I left on TheServerSide about one of my pet topics (server-side Java).

The InfoQ article also mentions that I work at Nokia. I do indeed work for Nokia Research Center, and it’s a great place to work. They do, however, require me to point out that when making such comments I’m not actually representing them.

The discussion is pretty interesting, and I’ve recently also ventured into using things other than Java (mainly Python lately, with the Django framework). So far I dearly miss the development tooling, which ranges from non-existent to immature crap for most languages that are not Java. Invariably, the best IDEs for these languages are actually built in Java. For example, I’m using the Eclipse pydev extension for Python development. It’s better than nothing, but it still sucks compared to how I develop Java in the same IDE. Specifically: no quickfixes, only a handful of refactorings, no inline documentation, and barely working autocompletion make life hell. I had forgotten what it is like to actually have to type whole lines of code.

I understand the development situation is hardly better for other scripting languages. There’s been some progress on the Ruby front since Sun started pushing things on that side, but none of this stuff is actually production quality. Basically, the state of the art in programming environments is currently focused primarily on statically compiled OO languages like Java and C#. Using something else can be attractive from, for example, a language-feature point of view, but the price you pay is crappy tooling.

Python as a language is quite OK, although it is a bit out of date with things like non-UTF-8 strings and a few other issues that my fellow countryman Guido van Rossum is planning to fix in Python 3000. Not having explicit typing takes some getting used to, and it also means my workload is higher because I constantly have to use Google to look up stuff that Eclipse would just tell me (e.g. what methods and properties can I use on this HttpResponse object I’m getting from Django; what’s the name of the exception I’m supposed to be catching here; etc.). In my view that’s not progress, and it leads to sloppy coding practices where people don’t bother dealing with fault situations unless they have to (which, long term, in a large-scale server environment is pretty much always).

semantic vs Semantic

Interesting post on how microformats relate to the Semantic Web as envisioned by the W3C.

The capital S is semantically relevant, since it distinguishes it from the lower-case semantic web that microformats are all about. The difference is that the Semantic Web requires technology that has been defined by the W3C but is not currently available in any mainstream products, such as the web browsers that people use to browse the current web. These technologies include RDF, OWL (the Web Ontology Language), XHTML 1.x and 2.x, and a few other rather obscure “standards” that you won’t find on a typical end-user PC or web server. I use quotes around the word standard here because I don’t believe the W3C is very effective at transferring its recommended standards over to industry in a proper way.