Crypto Crap in Python

I’m looking into doing a little cryptographic stuff in Python. Nothing fancy, just some standard stuff. Not for the first time, I’m bumping into the brick wall of “batteries included”: the notion that the Python standard library comes with enough stuff to be good enough for whatever you need to do. The only problem is that it doesn’t. XML parsing stinks in Python; HTTP I/O stinks (you need lots of third-party stuff to make it usable); no UTF-8 by default; etc.

Out of the box, Python is bloody useless unless you want to do something very simplistic. My problem is very simple: I need to be able to sign stuff and verify signatures in a way that is compatible with how this is commonly done on the internet ™. You’d expect some pretty mature, well-tested libraries to be around for this in whatever programming language you’d like to use. I know exactly where to go to get this stuff for Java, for example.

So we’re looking at some very basic capability to do stuff with algorithms like RSA, SHA1, MD5, etc. Batteries are not included with Python at all, so I Google a bit to find out what people commonly use for this and stumble upon what seems to be the most popular library, pycrypto. It seems to have all the algorithms, great! Only one minor detail has had me crawling all over Google for the entire afternoon:

Public keys usually come as base64-encoded thingies: how the hell do I get them in and out of the functions, classes and whatnot provided by pycrypto? Batteries not included. After a long search, I find this nice post.

Basically it’s telling me that various people have bothered to provide nice libraries with relevant code for Python, but somehow all of them have neglected to provide this very basic functionality that you are 100% guaranteed to need. That just sucks. In the hypothetical case that you’d actually want to use this stuff to do hypothetically useful things, like verifying a signature attached to some HTTP request, you will find yourself reverse engineering this poorly documented library and figuring out how to get from a base64-encoded RSA key to a properly configured RSA class instance and back again. I had lots of fun (not) reading about the details of RSA, X.509, etc.

Eventually I found some sample code here that seems to half do what I need. But I’d prefer to reuse something that is hassle-free instead of copy-pasting somebody else’s code, debugging it until it works as expected, and basically reinventing the wheel by making what would amount to Jilles’ private little Python crypto library. I have better things to do.

3 Replies to “Crypto Crap in Python”

  1. # WTF are you smoking pal?
    # This is from the standard library
    >>> import base64
    >>> encoded = base64.b64encode('data to be encoded')
    >>> encoded
    'ZGF0YSB0byBiZSBlbmNvZGVk'
    >>> data = base64.b64decode(encoded)
    >>> data
    'data to be encoded'

  2. Spend some time looking at the Python library documentation at:

    http://docs.python.org/lib/lib.html

    The book “Python Essential Reference” by David Beazley is also very useful.

    Some of the “batteries included” modules you might be looking for:

    xml.dom
    xml.dom.minidom
    xml.sax
    md5
    sha
    hmac
    base64

    Python has so much HTTP stuff I won’t list it all but with just the included libraries you get enough HTTP client stuff to rewrite cURL and enough server stuff to write a simple HTTP server in a few lines of code. You also get full socket libraries in case you want to do something more low-level.

    Read the docs before going off like this!

  3. Well Suraj, there’s more to life than base64. The problem is that an RSA public key contains several data items that you need to parse once you decode the base64. No rocket science of course, but none of the third-party Python crypto libraries I’ve tried come with those batteries included (pretty mind-boggling, since you will need to do this). Each and every one of them will have you dive into the RSA spec and figure out all by yourself what the internals of base64-encoded RSA public keys look like.

    And Greg, I’m aware of those modules. It would be pretty sad indeed if there weren’t any DOM and SAX included, but there’s a bit more to XML processing these days than working with those low-level APIs. Personally I hate having to deal with DOM since it is such a clumsy API. If you’re used to working with XML in other languages, Python feels a little primitive. Regarding the various hash functions, those are useful of course but not enough for what I needed.

    Regarding HTTP libraries, I’m not impressed with the included batteries. Take proxy handling for example. Simple problem: how to configure Python such that it will use an HTTP proxy, but not for localhost. Neither urllib nor urllib2 seems to get this right. It’s all or nothing. Of course you can modify your dozens of calls to these modules to include some logic to use or not use a proxy depending on the hostname. Just one little example.

    Maybe I’m spoiled, or maybe the included batteries are not that great.
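
The per-hostname proxy logic described in the last reply could look roughly like the sketch below. It is written for Python 3’s `urllib.request` rather than the urllib/urllib2 modules discussed above. The proxy address, the bypass list, and the helper names are all assumptions for illustration.

```python
from urllib.parse import urlsplit
from urllib.request import ProxyHandler, build_opener

# Hypothetical proxy address and bypass list; adjust to your environment.
PROXIES = {"http": "http://proxy.example.com:3128"}
NO_PROXY_HOSTS = {"localhost", "127.0.0.1", "::1"}


def proxies_for(url):
    """Return the proxy mapping to use for url (empty means a direct connection)."""
    host = urlsplit(url).hostname
    return {} if host in NO_PROXY_HOSTS else PROXIES


def opener_for(url):
    """Build an opener that goes through the proxy except for local hosts."""
    # An explicit empty dict disables proxies, including ones from the environment
    return build_opener(ProxyHandler(proxies_for(url)))


def open_url(url, timeout=10):
    """Open url with the right proxy settings; one wrapper instead of dozens of call sites."""
    return opener_for(url).open(url, timeout=timeout)
```

Routing every request through one wrapper like `open_url` keeps the hostname check in a single place, which is the workaround the reply describes, not a fix in the library itself.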
