VersionEye

During our recent acquisition, we had to do a bit of due diligence to cover various things related to the financials, legal structuring, etc. of Localstream. Part of this process was also doing a license review of our technical assets.

Doing license reviews is one of those chores that software architects are forced to do once in a while. If you are somewhat knowledgeable about open source licensing, you’ll know that there are plenty of ways companies can get themselves in trouble: inadvertently licensing their code base under GPLv3 simply by using similarly licensed libraries, violating license terms, reusing inappropriately licensed GitHub projects, etc. This is a big deal in the corporate world because it exposes you to nasty legal surprises. Doing a license review means going through the entire list of dependencies and transitive dependencies (i.e. dependencies of the dependencies) and reviewing the way each of them is licensed. Everything that gets bundled with your software is in scope for this review.
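
To give a feel for the bookkeeping involved, here is a minimal sketch (in Python, assuming an npm-based project with `npm` on the path; the details differ per package manager) of just the enumeration step: flattening the transitive dependency tree that you would then have to review license by license.

```python
import json
import subprocess

def flatten(deps, found=None):
    """Recursively collect name@version pairs from npm's nested dependency tree."""
    if found is None:
        found = set()
    for name, info in (deps or {}).items():
        found.add(f"{name}@{info.get('version', '?')}")
        flatten(info.get("dependencies"), found)  # recurse into transitive deps
    return found

# 'npm ls --all --json' prints the full tree, including transitive dependencies.
tree = json.loads(subprocess.check_output(["npm", "ls", "--all", "--json"]))
for dep in sorted(flatten(tree.get("dependencies", {}))):
    print(dep)  # every one of these needs a license check
```

And note that this only gets you the list; mapping each entry to a license and checking it against your distribution model is the actual work, and it has to be redone every time the list changes.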

I’ve done similar reviews at Nokia, where the legal risks were large enough to justify a very large legal department that concerned itself with such reviews. They had built tools on top of Lotus Notes to support the job, and there was no small amount of process involved in getting software past them. So this wasn’t exactly my favorite part of the job. A big problem with these reviews is that software changes constantly and a review is only valid for the specific combination of versions that you reviewed. Dependencies change all the time, and keeping track of the legal side is a hard problem that requires a lot of bookkeeping. This is tedious, and big companies get themselves into trouble all the time. Microsoft, for example, has had to withdraw products from the market on several occasions, Oracle and Google have been bickering over Android for ages, and SCO famously ended up suing world + dog over code they thought they owned the copyright to (Linux).

Luckily there’s a new Berlin-based company called VersionEye that makes keeping track of dependencies very easy. VersionEye is basically a social network for software. What it does is genius: it connects to your public or private source repositories (Bitbucket and GitHub are fully supported currently) and then picks apart your projects, looking for dependencies in Maven POM files, Bundler Gemfiles, npm and Bower manifests, and the many other files that list the dependencies of your software. It then builds lists of dependencies and transitive dependencies and provides details on their licenses as well. It does all this automatically. Even better, it also alerts you to outdated dependencies, allows you to follow specific dependencies, and generally solves a lot of headaches when it comes to keeping track of dependencies.

I’ve had the pleasure of drinking more than a few beers with VersionEye founder Robert Reiz and gave him some feedback early on. I was very impressed with how responsive he and his co-founder Timo were. They delivered all the features I asked for (and more) and are constantly adding new ones. They already support most dependency management tooling out there, so chances are very good that whatever you are using is already supported. If not, give them some feedback, and chances are they’ll add it if it makes sense.

So, when the time came to do the Localstream due diligence, using their product was a no-brainer, and it got the job done quickly. VersionEye gave me a very detailed overview of all Localstream dependencies across our Java, Ruby, and JavaScript components and made it trivially easy to export a complete list of all our dependencies, versions, and licenses for the due diligence.

VersionEye is a revolutionary tool that should be very high on the wish list of any software architect responsible for keeping track of software dependencies. This is useful for legal reasons but also a very practical way to stay on top of the tons of dependencies that your software has. If you are responsible for any kind of commercial software development involving open source components, you should take a look at this tool. Sign up, import all your GitHub projects, and play with it. It’s free to use for open source projects or to upload dependency files manually, and they charge a very reasonable fee for connecting private repositories.

Lumia 800

I got my Lumia 800 a few weeks before Christmas and have used it on a daily basis since. I just realized I never actually got around to doing a review. Since I work for Nokia, reviews are always a bit awkward: I can’t really write a negative one without upsetting some of my peers at Nokia. So, disclaimer: I work for Nokia; any opinions you read here are my own and do not necessarily represent any official Nokia position.

My general solution to this problem has been to simply not review any Nokia product unless I actually like it and won’t have to lie to you. So yes, I’m biased and definitely not objective, but at the same time I don’t work for marketing and would feel bad about propagating lies.

I had no problems doing an N900 review a while back or reviewing the N8. So, now it is the Lumia 800’s turn. The good news is, I like this phone. In fact, I like it so much I’d recommend it and its little brother, the 710, to anyone, including close friends and family whom I definitely would not want to burden with a bad product. To be clear, I don’t take this type of recommendation lightly, because if I’m wrong I have to deal with the consequences: people complaining about embarrassing shortcomings, etc.

CouchDB

We did a little exercise at work to come up with a plan to scale to absolutely massive levels. Not an entirely academic problem where I work. One of the options I am (strongly) in favor of is using something like CouchDB to scale out. I was aware of CouchDB before this, but over the past few days I have learned quite a bit more and am now even more convinced that CouchDB is a perfect match for our needs. For obvious reasons I can’t dive into what we want to do with it exactly, but itemizing what I like about CouchDB should give you a strong indication that it involves shitloads (think hundreds of millions) of data items served up to shitloads of users (likewise). Not unheard of in this day and age (e.g. Facebook, Google). But also not something any off-the-shelf solution is going to handle just like that.

Or so I thought …

The CouchDB wiki has a nice two-line description:

Apache CouchDB is a distributed, fault-tolerant and schema-free document-oriented database accessible via a RESTful HTTP/JSON API. Among other features, it provides robust, incremental replication with bi-directional conflict detection and resolution, and is queryable and indexable using a table-oriented view engine with JavaScript acting as the default view definition language.

This is not the whole story, but it gives a strong indication that quite a lot is being claimed here. So, let’s dig into the details a bit.

Document-oriented, schema-less storage. CouchDB stores JSON documents, so a document is nothing more than a JSON data structure. Fair enough. No schemas to worry about, just data: a tree with nodes, attributes, and values. Up to you to determine what goes in the tree.

Conflict resolution. Each document has special attributes for its identity and revision, plus some other CouchDB bookkeeping. Both id and revision are globally unique UUIDs (Update: the revision is not a UUID, thanks Matt). That means that any document stored in any instance of CouchDB anywhere on this planet is uniquely identifiable, and that any revision of such a document in any instance of CouchDB is also uniquely identifiable. Any conflicts are easily identified by simply examining the id and revision attributes. A simple conflict resolution mechanism is part of CouchDB; simple but effective for day-to-day replication.
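
To make that concrete, here’s a minimal sketch of what the revision mechanics look like over HTTP, using Python with the `requests` library against a hypothetical local CouchDB instance (database and document names are made up). Every update must quote the revision it was based on, and an update against a stale revision is rejected as a conflict:

```python
import requests

db = "http://localhost:5984/example"
requests.put(db)  # create the database (fine to ignore 'already exists' in a demo)

# Create a document; CouchDB answers with its id and first revision.
rev1 = requests.put(f"{db}/mydoc", json={"title": "hello"}).json()["rev"]

# Update it, quoting the revision our change is based on.
requests.put(f"{db}/mydoc", json={"title": "hello again", "_rev": rev1})

# An update based on the now-stale first revision is detected as a conflict.
stale = requests.put(f"{db}/mydoc", json={"title": "oops", "_rev": rev1})
print(stale.status_code)  # 409 Conflict
```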

Robust incremental replication. Two CouchDB nodes can replicate to each other. Since documents are globally unique, it is easy to figure out which document is on which node. Additionally, the revision id allows CouchDB to figure out what the correct revision is. Should you be so unlucky as to have conflicting changes on both nodes, there are ways of dealing with conflict resolution as well. What this means is that any node can replicate to any other node. All it takes is bandwidth and time. It’s bidirectional, so you can have a master-master setup where both nodes consume writes and propagate changes to each other. CouchDB uses the concept of “eventual consistency” to emphasize that a network of CouchDB nodes replicating to each other will eventually have the same data and be consistent with each other, regardless of the size of the network or how out of sync the nodes are at the beginning.
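
Triggering replication is itself just another REST call. A sketch along the same lines as above (node URLs hypothetical); running it in both directions gives you the master-master setup:

```python
import requests

def replicate(node, source, target):
    """Ask a CouchDB node to replicate source into target (one direction)."""
    return requests.post(f"{node}/_replicate",
                         json={"source": source, "target": target}).json()

a, b = "http://node-a:5984", "http://node-b:5984"

# Two one-way replications make a master-master pair.
replicate(a, f"{a}/example", f"{b}/example")
replicate(b, f"{b}/example", f"{a}/example")
```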

Fault tolerant. CouchDB uses a file as its datastore. Any write to a CouchDB instance appends to this file; data already in the file is never overwritten. That’s why it is fault tolerant: the only part of the file that can possibly get corrupted is the end, which is easily detected (on startup). Aside from that, CouchDB is rock solid and guaranteed never to touch your data once it has been committed to disk. New revisions don’t overwrite old ones; they are simply appended in full to the end of the file with a new revision id. You. Never. Overwrite. Existing. Data. Ever. Fair enough, it doesn’t get more robust than that. Allegedly, kill -9 is a supported shutdown mechanism.

Cleanup by replicating. Because the store is append-only, a lot of cruft can accumulate in the parts of the file that are never touched again. Solution: add an empty node and tell the others to replicate to it. Once they are done replicating, you have a clean node and you can start cleaning up the old ones. Easy to automate, so data store cleanup is not an issue. Update: as Jan and Matt point out in the comments, you can use the compact function, which is a bit more efficient.
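
The compact function is, as you’d expect by now, one more REST call; a sketch against the same hypothetical node as above:

```python
import requests

# Ask CouchDB to rewrite the database file without the superseded revisions.
# Compaction runs in the background; the call itself returns immediately.
resp = requests.post("http://node-a:5984/example/_compact",
                     headers={"Content-Type": "application/json"})
print(resp.json())  # {'ok': True}
```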

RESTful. CouchDB’s native protocol is REST operations over HTTP. This means several things. First of all, there are no dedicated binary protocols, CouchDB clients, drivers, etc. Instead you use normal REST and service-related tooling to access CouchDB. This is good, because it is exactly what has made the internet work all these years. Need caching? Pick your favorite caching proxy. Need load balancing? Same thing. Need access from language x on platform y? If it comes with HTTP support, you are ready to roll.

Incremental map reduce. Map reduce is easy to explain if you understand functional programming. If you’re not familiar with that, it’s a divide-and-conquer strategy for computing things concurrently over lists of items; very long lists, with millions or billions of items. It works as follows: the list is chopped into chunks, and the chunks are processed concurrently in a (large) cluster to calculate something. This is called the map phase. Then the results from each of the chunks are collected and combined. This is called the reduce phase. This is what Google uses to calculate e.g. PageRank and thousands of other things on their local copy of the web (which they populate by crawling the web regularly). CouchDB uses the same strategy as a generic querying mechanism. You define map and reduce functions in JavaScript, and CouchDB takes care of applying them to the documents in its store. Moreover, it is incremental: if you have n documents that have already been map-reduced and you add another document, it incrementally updates the results, i.e. it catches up real quick. Using this feature you can define views and query them simply by accessing them. The views are calculated on write (Update: actually it’s on read), so accessing a view is cheap, whereas writing involves the cost of storing plus the background task of updating all the relevant views, which you control yourself by writing good map and reduce functions. It’s concurrent, so you can simply add nodes to scale. You can use views to index specific attributes, run clustering algorithms, implement join-like query views, etc. Anything goes here. MS at one point had an experimental query optimizer backend for MS SQL that was implemented using map reduce; think expensive data-mining SQL queries running as map reduce jobs on a generic map reduce cluster.
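
To make this concrete, a sketch of defining and querying a view, continuing the hypothetical Python/`requests` setup from above. The map and reduce functions are JavaScript source shipped to CouchDB as strings inside a design document; here the map emits one row per tag and the built-in `_count` reduce tallies rows per key:

```python
import requests

db = "http://localhost:5984/example"

# A design document holding one view: map emits (tag, 1) rows,
# and the built-in _count reduce function counts rows per key.
design = {
    "language": "javascript",
    "views": {
        "by_tag": {
            "map": "function(doc) { (doc.tags || []).forEach(function(t) { emit(t, 1); }); }",
            "reduce": "_count",
        }
    },
}
requests.put(f"{db}/_design/stats", json=design)

# Query the view; group=true reduces per key instead of over everything.
rows = requests.get(f"{db}/_design/stats/_view/by_tag",
                    params={"group": "true"}).json()["rows"]
for row in rows:
    print(row["key"], row["value"])  # tag and number of documents carrying it
```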

It’s fast. It is implemented in Erlang, a language designed from the ground up to scale on massively parallel systems. It’s a bit of a weird language, but one with a long and very solid track record in high-performance, high-throughput systems. Additionally, CouchDB’s append-only, lock-free files are wickedly fast. The primary bottleneck is the available IO to disk; CouchDB developers are actually claiming sustained write throughput above 80% of the IO bandwidth to disk. Add nodes to scale out.

So CouchDB is an extremely scalable and fast storage system for documents that provides incremental map reduce for querying and mining the data; HTTP-based access and replication; and robust append-only, overwrite-never, lock-free storage.

Is that all?

No.

Meebo decided that this was all nice and dandy, but they needed to partition and shard their data instead of having all of it on every CouchDB node. So they came up with CouchDB Lounge. What CouchDB Lounge does is enabled by the RESTful nature of CouchDB: it’s a simple set of scripts on top of nginx (a popular HTTP proxy) and the Python Twisted framework (a popular IO-oriented framework for Python) that dynamically routes HTTP messages to the right CouchDB node. Each node hosts not one but several (configurable) CouchDB shards. As the shards fill up, new nodes can be added and the existing shards redistributed among them. Each shard calculates its own map reduce views; the scripts in front of the load balancer take care of reducing these views across all nodes into a coherent ‘global’ view. In other words, from the outside world, a CouchDB Lounge cluster looks just like any other CouchDB node; its sharded nature is completely transparent. Except it is effectively infinitely scalable, both in the number of documents it can store and in read/write throughput. You can run the full test suite that comes with CouchDB against a Lounge cluster and it will basically pass all tests; there’s no difference from a functional perspective.
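
I haven’t read the Lounge code, but the routing trick is simple enough to sketch conceptually: hash the document id to pick a shard, and proxy the request unchanged to whichever node currently hosts that shard. All names and numbers below are made up for illustration; having more shards than nodes is what lets you redistribute shards as new nodes are added:

```python
import hashlib

SHARDS = 16  # fixed number of shards, deliberately more than nodes
SHARD_TO_NODE = {s: f"http://node-{s % 4}:5984" for s in range(SHARDS)}

def route(doc_id: str) -> str:
    """Map a document id to the URL of the shard database that owns it."""
    shard = int(hashlib.md5(doc_id.encode()).hexdigest(), 16) % SHARDS
    return f"{SHARD_TO_NODE[shard]}/example_shard{shard}"

# The proxy forwards the original REST request to this URL unchanged.
print(route("mydoc"))
```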

So, CouchDB with CouchDB Lounge provides an off-the-shelf solution for storing, accessing, and querying shitloads of documents. Precisely what we need. If shitloads of users come along needing access, we can give them all the throughput they could possibly need by throwing more hardware into the mix. If shitloads is redefined to mean billions instead of millions, same solution. I’m sold. I want to get my hands dirty now. I’m totally sick and tired of having to deal with braindead ORM solutions that are neither easy, scalable, fast, robust, nor even remotely convenient. I have some smart colleagues who are specialized in this stuff, and way more who are not. The net result is a data layer that requires constant firefighting to stay operational. The non-experts routinely do things they shouldn’t, which then requires lots of magic from our DB & ORM gurus. And to be fair, I’m not an expert either. CouchDB is so close to being a silver bullet here that you’d have to be a fool to ignore the voices telling you it is all too good to be true. But then again, I’ve been looking for flaws and so far have not come up with anything substantial.

Sure, I have lots of unanswered questions, and I’m hardly a CouchDB expert; technically, any newbie with more than an hour’s experience coding against the thing outranks me here. But if you put it all together, you have an easy-to-understand storage solution that is used successfully by others in rather large deployments that seem to be doing quite well. If there are any limits in terms of the number of nodes, the number of documents, or indeed the read/write throughput, I’ve yet to identify them. All the available documentation suggests that there are no such limits, by design.

iMac 24″

Saturday morning I turned on my PC and the screen did not come on (I keep it on standby). Suspicious, but it has happened before. So I pressed the power button to do a reset, and then it did a single beep followed by 8 rapid beeps. That’s the BIOS telling you: this PC is FOOBARRED, please try installing a new motherboard. Or something. Good luck with that. Anyway, eight beeps and nothing.

After the predictable “godverdomme, kutzooi”, which needs no translation here, I calmed down and did what I was planning to do anyway (by coincidence), which was visiting the local Apple store. Or rather the Gravis M&S store on Ernst-Reuter-Platz here in Berlin, a nice big store specialized in reselling all the nice Apple gadgets, along with a helpdesk and good support options (I hate putting expensive hardware in the mail). I had basically already decided to go for an iMac. Question: which one. Eh … why go for anything less than the biggest one? Sure it costs money, but I’ll be spending the next few years glued to its screen. So: 24″, 3 GHz dual-core CPU, 4 GB RAM, a 1 TB disk drive, and a nice Nvidia chipset with 512 MB (which will no doubt run X-Plane just fine).

Anyway, they didn’t have one in store with an English keyboard, so they placed the order for me and told me “one and a half weeks”. To my pleasant surprise, I got the call today already that my new iMac was ready. So I fetched it, plugged it in, and enjoyed the famous Apple out-of-the-box experience, which is excellent.

Then I went to work installing the basics: Firefox, Adium, Skype, and some other essentials. I haven’t gotten around to applying all the tweaks I have on my work laptop but will of course be doing that, to fix e.g. the annoying home/end key behavior and a couple of other things.

I am now in need of:

  • A proper USB mouse (sorry, but the Mighty Mouse will join its brother in a drawer)
  • A USB 2.0 to SATA converter to read both internal drives in my old PC, with all my music, photos, and other essentials. I have a pretty recent backup on an external drive but had gotten a bit sloppy about backing up the last few months. BTW, I noticed that NTFS is read-only on Macs, so any tips for fixing that are welcome. MacFUSE seems to be one option; any alternatives?
  • To fix that, I will need a nice big new external drive to hook up to Time Machine.

Not all is great though:

  • The keyboard sucks compared to the one that came with my MacBook Pro last year. WTF is up with the weirdly small enter key and the weird symbols where it used to say page down, page up, etc.?
  • The Mighty Mouse still stinks.
  • All the MobileMe and .Mac spam on first launch is kind of annoying.

Anyway, happy to be online again.

Wolfram Alpha

A few years ago, a good friend gave me a nice little present: 5 kilos of dead tree in the form of Stephen Wolfram’s “A New Kind of Science”. I never read it cover to cover; I merely scanned a few pages with lots of pretty pictures before deciding that this wasn’t really my cup of tea. I also read some of the criticism of the book from the scientific community. I’m way out of my league there, so no comments from me except a few observations:

  • The presentation of the book is rather pompous and arrogant. The author tries to convince readers that they have the most important piece of science ever produced in their hands.
  • This is what set off most of the criticism. Apparently, the author fails both to credit related work and to back up some of his crucial claims with proper evidence.
  • Apparently there are quite a few insufficiently substantiated claims, which affects the credibility of the book and of the author’s claims overall.
  • The author took an ivory-tower approach to writing the book: he quite literally dedicated more than a decade of his life to it, during which he did not seek out much criticism from his peers.
  • So, the book is controversial and may either turn out to be the new relativity theory (relatively speaking) or a genuine dud. I’m out of my league deciding either way.

Anyway, the same Stephen Wolfram has for years been providing the #1 mathematical software IDE: Mathematica, one of the most popular software tools for anyone involved with mathematics. I’m not a mathematician and haven’t touched such tools in over 10 years (I dabbled a bit with linear algebra in college), but as far as I know his company and product have a pretty solid reputation.

Now the same person has brought the approach he applied to his book, and his solid reputation as the owner of Mathematica, to the wonderful world of Web 2.0. That is something I know a thing or two about. Given the above, I was initially quite sceptical when the first, pretty wild rumors around Wolfram Alpha started circulating. However, some hands-on experience has just changed my mind. So here’s my verdict:

This stuff is great & revolutionary!

No, it’s not Google. It’s not Wikipedia either. It’s not the Semantic Web either. Instead it’s a knowledge reasoning engine hooked up to some authoritative data sets. So it’s not crawling the web, it’s not user-editable, and it is not relying on traditional Semantic Web standards from e.g. the W3C (though it very likely uses similar technology).

This is the breakthrough that was needed. The Semantic Web community seems to be stuck in an endless loop, pondering pointless standards, query formats, and graph representations, and generally rehashing computer science topics that have been studied for 40 years now, without producing many viable business models or products. Wikipedia is nice but chaotic and unstructured. The marriage of the Semantic Web and Wikipedia is obvious, has been tried countless times, and has so far not produced interesting results. Google is very good at searching through the chaos that is the current web but can be absolutely unhelpful with simple, fact-based questions. Most fact-based questions in Google return a Wikipedia article as one of the links. Useful, but it doesn’t directly answer the question.

This is exactly the gap that Wolfram Alpha fills. There are many scientists and startups with the same ambition, but wolframalpha.com got to market first with a usable product that can answer a broad range of factual questions with knowledge imported into its system from trustworthy sources. It works beautifully for the facts and knowledge it has, and allows users to do two things:

  • Find answers to pretty detailed queries from trustworthy sources. Neither Wikipedia nor Google can do this; at best they can point you at a source that has the answer and leave it up to you to judge the trustworthiness of that source.
  • Fact surfing! Just as surfing from one topic to the next on Wikipedia is a fun activity, I predict that drilling down into facts on Wolfram Alpha will be equally fun and useful.

So what’s next? Obviously, Wolfram Alpha will have competition. However, its core asset seems to be the reasoning engine combined with a huge fact database that is to date unrivaled. Improvements in both areas will solidify its position as market leader. I predict that several owners of large bodies of authoritative information will be itching to be a part of this, and partnership deals will be announced. Wolfram Alpha could easily evolve into a crucial tool for knowledge workers; so crucial, even, that they might be willing to pay for access to certain information.

Some more predictions:

  • Several other startups will soon enter the market with competing products. There should be dozens of companies working on similar or related products. Maybe all they needed was somebody taking the first step.
  • Google likely has people working on such technologies; they will either launch or buy products in this space in the next two years.
  • Google’s main competitors are Yahoo and Microsoft, who have both been investing heavily in search technology and experience. They too will want a piece of this market.
  • With so much money floating around in this market, Wolfram Alpha and similar companies should have no shortage of venture capital, despite the current crisis. Also, Wolfram Alpha might end up being bought by Google or Microsoft.
  • If not bought or outcompeted (both of which I consider likely), Wolfram Alpha will be the next Google.

MacBook Pro, cons and pros

Having had my new Mac for a few weeks now, it is time for a review.

In a nutshell, consider me switched. Overall it’s great and a huge improvement over my slow XP-based laptop.

However, since everybody seems to focus on how great Macs are, I’m going to first focus on everything that I think sucks or otherwise annoys the hell out of me. Aside, of course, from Steve Jobs launching new models right after I got my Mac.

  • Key bindings. A matter of taste and habit, but also a matter of consistency, and my main gripe is with the latter. I use Mac Office (Entourage, Word, PowerPoint), Eclipse, a terminal window, and of course Firefox. Mac Office is the only one of these that preserves the quite sensible behaviour for the home and end keys that you find on just about any platform: home means beginning of line, end means end of line. The default in Mac applications is different: beginning of document and end of document. I need the former functionality dozens of times per day and the latter … well, never, actually. So it’s a mess. I ended up reconfiguring Eclipse, because looking at the Java imports each time I press home gets old real quick. Thankfully Eclipse is fully configurable. I also attempted configuring the Mac itself; very few applications seem to pay attention. The Terminal application has its own settings and ignores the Mac defaults anyway. Which leaves Firefox. Annoyingly, I’m still looking for a solution to that one.
  • Delete. The backspace is called a delete button. My USB keyboard has two delete buttons and a clear button. One of the delete buttons is actually a real delete button; command+delete only works with the backspace variant. The clear button seems equivalent to the (real) delete button. My laptop has one delete button, but it is not a real delete button. I normally use the delete button almost as much as the 26 letter keys. Gimme back my delete button! If this doesn’t sound very logical, consistent, or usable, that’s because it isn’t.
  • Mighty Mouse. Of course I got a Mighty Mouse with my Mac USB keyboard. Nice experiment, this touch-sensitive surface, but I really mean left click when my index finger clicks left of the wheel/ball thingy. Likewise for right clicks. This goes wrong a lot. Middle clicks are annoyingly difficult, and the side buttons require quite a bit of grip to press. Yet it is surprisingly easy to click them accidentally. In short, strongly considering hooking up a real mouse now.
  • Alt+tab. Exposé is nice, but often I just want to switch back and forth between two windows. This works fine as long as they are application windows, but not if they are document windows. So open two mails in separate windows and you can’t switch between them with alt+tab.
  • Window management. You can minimize windows, but then it is a lot more difficult to switch to them. Double-clicking a window title minimizes (on win32 this means maximize), so I accidentally minimize loads of windows which I then need to find back in the dock. Annoying. When minimized, windows are nowhere to be seen in alt+tab or Exposé. This sucks if you want to switch back to them, which is the whole point of minimizing rather than closing a window.
  • No file move supported in Finder. This stinks. No select file, command+x, command+v. No right click, move. No File->Move. Apparently possible to do with drag and drop and the option key. Defaults to copy though :-/.
  • Finder. In general, Finder is a bit underpowered if you are used to Windows Explorer. I miss my folder tree.
  • Time Machine + FileVault. Loads of trouble to get this working properly. It’s sort of backing up now, finally. But not exactly ‘it just works’.
  • No VGA connector. This means I need to drag along a converter whenever I go to meetings, because most projectors come without a DVI cable. Annoying.

In all fairness, three weeks is not enough to get rid of my Windows habits. Especially since I still have an XP machine at home.

However, I’m getting more efficient on the Mac by the day. I’ve absorbed tons of new tricks and have had loads of fun figuring out little issues. Here’s my list of stuff that put a big grin on my face:

  • Display arrangement. I have my laptop left of my 20″ screen, slightly lower. I managed to arrange the screens such that when I move my mouse horizontally, it moves to the other screen at more or less the same altitude. Cool.
  • Display settings persist. The display settings survive me unplugging the laptop, using a projector for a presentation, and plugging my monitor back in. Great & just the way it should be.
  • Ambient light adjustment. Quite funny: when I covered the rightmost sensor while pressing delete (or rather backspace), the screen dimmed. Turned out that with the desk light shining on one side of the laptop, covering the lit sensor with your hand causes the screen to compensate for the sudden darkness by dimming. Had a good laugh about that. It actually has two sensors, so this is only an issue if you are sitting in the dark next to a desk light.
  • Photo screensaver. Looks great with my vacation photos. Apparently my efforts to calibrate my Windows PC at home were reasonably successful, since the photos look excellent on the laptop, which of course is properly calibrated (being a Mac and all that). It displays different photos on both screens at the same time. Same for my desktop background, which updates every few seconds (without an apparent performance hit).
  • Exposé. Love it; partially compensates for the alt+tab issues. Inexplicably, it only shows one desktop if you use Spaces. I have it hooked up to my side mouse button.
  • It’s fast. It should be for this price, of course. But still. It is fast. Gone is the endless disk churning that comes with Windows.
  • It’s silent. This is the most silent laptop I’ve ever worked with. No vacuum cleaner type fans activating and deactivating all the time.
  • Multi touch touchpad. This is a really nice feature. Tap with two fingers -> context menu, drag with two fingers -> scroll. So much fun.

Failing power supply

In April 2007, I replaced a broken power supply in my PC with an Antec SmartPower 2.0 500W power supply; check my review here. A few days ago, my PC started producing a high-pitched noise. Really annoying. So I Googled a little, and what did I find: Antec SmartPower 2.0 500W power supplies apparently have a 21% failure rate. Tell-tale signs include the damn thing making a high-pitched noise.

I have to investigate a little further, but this probably means the power supply is failing after less than a year and a half. Out of warranty, of course. Damn it, really annoying to have to open that case again to replace the same part. The PC is now nearly three years old, and maybe I should just replace it altogether. Something quiet, fast, and reliable would be nice.

In a few weeks my new MacBook Pro should arrive at work (ordered yesterday). I was planning to wait and see if I like that and, if so, just upgrade to a nice Mac at home as well. Not fully convinced yet.

Feel free to recommend a decent PSU. It has to power an Nvidia 7800, two drives, lots of USB hardware, and an AMD 4400+ dual-core CPU.

Update: I ended up installing a Zalman ZM600-HP. It seems to have a few good reviews: http://www.tweaknews.net/reviews/zm600hp/. It’s expensive, overqualified for the job, and supposedly really good and quiet. Sadly the rest of my machine is still rather noisy.

Joost and video on demand

Screenshots And Video Of The New Joost

Joost has announced that they are changing the way their service works. Having used it quite a bit, I think this is probably the best thing for them, since it was based on a misguided channel/TV metaphor. However, I wonder (along with TechCrunch) what their added value really is. It used to be that p2p seemed like the only way to escape from blocky, tiny videos with low frame rates and audio/video sync problems (aka RealVideo; what happened to those guys anyway?).

Just last week I was looking at some videos on Vimeo and noticed that they have streaming HD now. Like YouTube, it starts streaming right away. Unlike YouTube, the video is sharp, full screen, high resolution, and mostly free from severe compression artifacts. In other words, they seem to have figured out a way to push large amounts of data to me cost-effectively. I didn’t measure it, but I estimate I was getting at least around 1 Mbps from them.

Doing this on a large scale used to be really expensive. However, in recent years, content delivery networks (CDNs) have emerged that can cost-effectively deliver large downloads to massive numbers of users. A CDN is actually similar to p2p. Essentially it involves having servers and bandwidth in every major provider network and keeping those servers in sync. Bandwidth inside a provider network is a lot easier to get, and for providers the benefit is that they don’t need to use expensive bit pipes from other providers to get the content to you. So as long as they don’t run out of local bandwidth (of which they have plenty), they will prefer this. Also, with fewer hops to the user, it is a lot easier to ensure there is actually enough bandwidth to the user. Essentially, this brings the best features of p2p to web streaming and makes Joost more or less redundant. Although arguably Joost still has a slight cost advantage here, since relying on a CDN of course costs money.

There are now several Flash-based streaming sites that use a CDN. What these services have in common is crappy content. There are only so many amateur, 3-minute video fragments I can take. Also, 3-minute “commercial” fragments of full content normally broadcast on really obscure TV channels in the middle of the night are hardly compelling. The reason for this is copyright legislation and the systematic ignoring of users outside the USA by media corporations.

Joost, flawed as it was, actually has some OK-ish content hidden inside it. I quite enjoyed watching episodes of Lexx (an obscure but fun Canadian SF series from the nineties), a few full-feature kung fu movies from the seventies, and a few documentaries. I wouldn’t pay for any of that, but if you are bored, it’s at least a way to pass some time. Joost never managed to convince media corporations to provide premium content, though, and they still haven’t solved that problem.

If you live inside the US, life is good, apparently. There’s Apple TV, Amazon, Hulu, and a few others like Netflix offering massive amounts of good-quality, pay-per-view-type HD content for download, and in some cases even streaming. Some of these services are ad-supported, some are subscription-based. Joost won’t stand a chance in that market.

However, for the roughly 5.8 billion people outside the US, life is not so good. Here in Finland there are only a handful of video-on-demand companies, whose offerings suck big time comparatively. Additionally, their UIs are in Finnish, which makes it extremely hard for me to use them or even to figure out what they are trying to offer me. The US-based services won’t deliver content outside the US, because that requires separate deals with media companies for each country. In the US, one deal helps you reach a population of around 250 million people. In Europe, countries are a lot smaller. My understanding is that to some extent this type of service is now also available in the UK and Germany, which are relatively large countries.

Finland has only 5 million inhabitants. In other words, no content for me. So, if I want to see a movie, I can hope one of the pay-per-view TV channels broadcasts it (I don’t have a subscription though); buy the DVD; go to the cinema; or hope one of the handful of local TV stations broadcasts something worth watching.

Google Chrome – First Impressions

First impression: Google delivered; I’ve never used a browser this fast. It’s great.

Yesterday, a cartoon was prematurely leaked detailing Google’s vision of what a browser could look like. Now, 24 hours later, I’m reviewing what until yesterday was a well-kept secret.

So here are my first impressions.

  • Fast and responsive. What can I say? Firefox 3 was an improvement over Firefox 2, but this is in a different league. There are still lots of issues with having many tabs open in Firefox; I’ve noticed it doesn’t like handling bitmaps, and switching tabs becomes unusable with a few dozen tabs open. Chrome does not have this issue at all. It’s faster than anything I’ve browsed with so far (pretty much any browser you can think of, probably).
  • Memory usage. Chrome starts a new process per domain, not per tab. I opened a lot of tabs in the same domain and the number of processes did not go up; go to a different domain and you get another Chrome process. However, it does seem to use a substantial amount of memory in total. Firefox 3 is definitely better here. Not an issue with the 2 GB I have, and the good news is that you get memory back when you close tabs. But still, 40-60 MB per domain is quite a lot.
  • Javascript performance. Seems fantastic. Gmail and Google Reader load in no time at all. Easily faster than Firefox 3.
  • UI. A bit spartan if you are used to Firefox with custom bells & whistles (I have about a dozen extensions). But it works and is responsive. I like it. Some random impressions:
    • no status bar (good)
    • very few buttons (good)
    • no separate search field (could be confusing for users)
    • tabs on top, looks good, unlike IE7.
    • mouse & keyboard. Mostly like in Firefox. Happy to see middle click works. However, / does not work and you need to type ctrl+f to get in-page search.
  • URL bar. So far so good, seems to copy most of the relevant features from Firefox 3. I like Firefox 3’s behaviour better though.
  • RSS feeds. There does not seem to be any support for subscribing to or reading feeds. Strange. If I somehow missed it, there’s a huge usability issue here. If not, I assume it will be added.
  • Bookmarks. An important feature for any browser. Google has partially duplicated Firefox 3’s behaviour with a little star icon but no tagging.
  • Extensions. None whatsoever :-(. If I end up not switching, this will be the reason. I need my extensions.
  • Import Firefox profile. Seems pretty good: passwords, browsing history, bookmarks, etc. were all imported. Except for my cookies.
  • Home screen. Seems nicer than a blank page but nothing I’d miss. Looks a bit empty on my 1600×1200 screen.
  • Missing in action. No spell checking, no search plugins (at least no obvious way for me to use them, even though all my Firefox search plugins are listed in the options screen), no print preview, no bookmarks management, no menu bar (good, don’t miss it).

So Google delivers on promises they never made. Just out of the blue there is Chrome, and the rest of the browser world has some catching up to do. Firefox and Safari are both working on the right things, of course, and have been a huge influence on Chrome (which Google gives them plenty of credit for). However, the fact is that Google is showing both of them that they can do much better.

Technically, I think the key innovation here is using multiple processes to handle tabs from different domains. This is a good idea from both a security and a performance point of view. Other browsers try to be clever here and do everything in one process, with less than stellar results. I see Firefox 3 still block the entire UI regularly, and that is just inherent to its architecture. This simply won’t happen with Chrome; worst case, one of the tabs becomes unusable and you just close it. Technically, you might wonder if they could not have done this with threads instead of processes.

So, I’m genuinely impressed. Google is delivering something exceptionally solid here. Download it and see for yourself.

Posting this from Chrome of course.

Songbird Beta (0.7)

Songbird Blog » Songbird Beta is Released!

Having played with several milestone builds of Songbird, I was keen to try this one. This is a big milestone for the music player & browser hybrid. Since I’ve blogged about this before, I will keep it short.

The good:

  • The new feathers (Songbird lingo for UI themes) look great. My only criticism is that it seems to be a bit of an iTunes rip-off.
  • Album art has landed.
  • Stability and memory usage are now acceptable for actually using the application.
  • Unlike iTunes, it actually supports the media buttons on my Logitech keyboard.

The bad (or not so good, since I have no big gripes):

  • Still no support for the iTunes-invented but highly useful compilation flag (bug 9090). This means that my well-organized library is now filled with all sorts of obscure artists that I barely know but apparently have one or two songs from. iTunes sorts these into the compilations corner, and I use this feature to keep a nice overview of artists and complete albums.
  • Despite being a media player with extension support, there appear to be no features related to sound quality. Not even an equalizer. Not even as an extension. This is a bit puzzling, because this used to be a key strength of Winamp, the AOL product that the Songbird founders used to be involved with.
  • Despite being a browser, common browser features are missing: no bookmarks, no apparent RSS feed support, no Google preconfigured in the search bar, etc. Some of these things are easily fixed with extensions.

Verdict: much closer than previous builds, but still no cigar. The key issue for me is compilation flag support. I’d also really like to see some options affecting audio playback quality. I can see how having a browser in my media player could be useful, but this is not a good browser nor a good media player yet.