Using SQLObject with Twisted

Over the past week I’ve been working on a small personal project with Twisted. I needed to access a database to store and retrieve data, so I started with the obvious choice: the APIs provided by Twisted.

Twisted provides a database API called adbapi. The API is pretty straightforward, and the operations I wanted to perform were not rocket science anyway, so it served the purpose, but I wasn’t 100% satisfied.

I was mainly using the runQuery function and passing it regular SQL statements. I didn’t like that. Then I remembered SQLObject.

SQLObject is an ORM, providing an object-oriented API to several databases (I was aiming at MySQL and SQLite). This is what I was looking for. But there is a problem: accessing a database is a blocking operation.

Twisted uses the reactor pattern, so you can’t run a blocking operation in the event loop’s thread without stalling the whole program. Database libraries tend to be blocking, so Twisted’s adbapi runs database operations in another thread and delivers the results to a callback in the main thread. This creates the illusion of non-blocking database access.

In order to integrate SQLObject nicely with Twisted this is exactly what we want to do. We’ll defer all database operations to another thread and we’ll get results (or failures) in callback functions. The key function here is deferToThread which will run the specified function in the reactor thread pool and return a deferred. In order to make our life easier we’ll use a decorator which will run the decorated function in the reactor’s thread pool and return a deferred:

[gist]https://gist.github.com/793936[/gist]

Now let’s see a simple (yet full) example of how all this works:

[gist]https://gist.github.com/793982[/gist]

As you can see, the results (or errors) are retrieved asynchronously in the got_result/got_error callback functions, and since the operation was executed in a different thread it didn’t block the main event loop.

:wq

Setting core dump limits from Python

It’s very annoying when a program crashes, but it happens. When it does, if the system was configured to dump the core, you’ll get a core file with hopefully enough information to debug the crash.

But we don’t know whether the system will be configured to dump the core, so we may want to enable this from our Python program. We can use the resource module in the standard library. The following example shows how to enable core dumps if the --dump-core option is specified:

[gist]https://gist.github.com/772101[/gist]

PS: This worked just fine for the root user but didn’t seem to do the job for a normal user.
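
A possible explanation for the difference: an unprivileged process may raise its soft limit only up to the current hard limit, while raising the hard limit itself requires root. Below is a minimal sketch that stays within that constraint. It is not the code from the gist; the function names enable_core_dumps and parse_args are mine, and I am assuming the option is parsed with argparse.

```python
import argparse
import resource


def enable_core_dumps():
    """Raise the soft core-file size limit as far as the hard limit allows."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # Only the soft limit changes; the hard limit is passed back untouched,
    # so this call does not need root privileges.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--dump-core", action="store_true",
                        help="allow the process to write a core file on crash")
    return parser.parse_args(argv)


def main(argv=None):
    options = parse_args(argv)
    if options.dump_core:
        soft, hard = enable_core_dumps()
        print("core limit (soft, hard):", soft, hard)
```

If the hard limit is already 0 for the user, no core file will be written regardless; in that case the limit has to be raised outside the program, for example in the system’s limits configuration.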

Implementing registry pattern with class decorators

Registry is quite a common design pattern. I recently needed to implement it and found Python class decorators very useful, so I thought I’d write about it. :-)

In the registry pattern we have a global object, called the registry or registrar, which holds references to shared objects and is the only entity that can access them.

In my case, I had a plugin style architecture in which different plugins might be dynamically added without any configuration. The first implementation I made used a metaclass to add the plugin class to the registry:

[gist]https://gist.github.com/771884[/gist]

That worked, at least at the beginning. But later I found a limitation: plugins needed the Plugin metaclass, so I couldn’t use another metaclass on them if I needed to. And at some point I did feel the need to do that. I wanted the plugins to use the Singleton metaclass (to implement the Singleton pattern), but then I couldn’t use the Plugin metaclass because a class can have only one metaclass (and I didn’t want to create a SingletonPlugin metaclass). Class decorators saved the day:

[gist]https://gist.github.com/771888[/gist]

Class decorators are supported only in Python >= 2.6, but that wasn’t a problem. :-)
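
To make the idea concrete, here is a minimal sketch of a registry filled by a class decorator, combined with a Singleton metaclass. It is not the code from the gists: the names PLUGINS, register and EchoPlugin are made up for illustration, and it uses the Python 3 metaclass syntax rather than the Python 2 one I had at the time.

```python
PLUGINS = {}  # the registry: plugin name -> plugin class


def register(cls):
    """Class decorator that adds the class to the registry."""
    PLUGINS[cls.__name__] = cls
    return cls  # the class itself is returned unchanged


class Singleton(type):
    """Metaclass that hands out a single instance per class."""
    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


@register
class EchoPlugin(metaclass=Singleton):
    def run(self, text):
        return text


plugin = PLUGINS["EchoPlugin"]()
print(plugin.run("hello"))             # hello
print(plugin is PLUGINS["EchoPlugin"]())  # True
```

Since the decorator only records the class and returns it as is, the metaclass slot stays free for Singleton or anything else.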

:wq

Bye bye grep, hello ack!

This past week I’ve been enhancing my Vim configuration, since it’s the editor I use all the time. While browsing for interesting plugins I ran across Ack, a plugin that promises to be an enhancement for Vim’s built-in grep capability.

The Vim plugin uses the ack command on your system, which is a Perl replacement for grep. After discovering it on http://betterthangrep.com I didn’t look back. Ack is so much better than grep for finding text among code!

One of the good things is that Ack automagically ignores the files of well-known version control systems. According to the documentation:

Directories ignored by default:
autom4te.cache, blib, _build, .bzr, .cdv, cover_db, CVS, _darcs, ~.dep,
~.dot, .git, .hg, _MTN, ~.nib, .pc, ~.plst, RCS, SCCS, _sgbak and .svn

Let’s see a usage example:

[gist]https://gist.github.com/717782[/gist]

And we don’t need to explicitly exclude any of the directories listed above, because Ack will do it for us.

Happy ack-ing!

 

lambda vs. functools.partial

In Python we have a way to create anonymous functions at runtime: lambda. With lambda we can create a function that is not bound to any name. That is, we don’t need to def a function; a lambda can be used instead.

[gist]https://gist.github.com/707898[/gist]

Lambda can also be used to call another function with certain arguments fixed:

[gist]https://gist.github.com/707899[/gist]
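
Both uses can be sketched in a few lines. The function power and the sample data are made up for illustration and are not taken from the gists.

```python
def power(base, exponent):
    return base ** exponent


# An anonymous function, used where a named def would be overkill.
by_length = sorted(["pear", "fig", "banana"], key=lambda word: len(word))

# A lambda that calls power() with the exponent fixed to 2.
square = lambda base: power(base, 2)

print(by_length)   # ['fig', 'pear', 'banana']
print(square(7))   # 49
```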

However, lambda functions use late binding for the variables they reference. That is, when you create a lambda function that uses a variable from the enclosing scope, its value is not copied immediately; a reference to the scope is kept and the value is resolved when the function is called. Thus, if the value of that variable changes within that scope, the lambda function will see the changed value. Let’s see it in action:

[gist]https://gist.github.com/707901[/gist]
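
Here is a minimal sketch of the effect described above. The names greet, name and callbacks are mine, made up for illustration.

```python
def greet(name):
    return "hello " + name


name = "alice"
late = lambda: greet(name)  # `name` is looked up when `late` is called
name = "bob"
print(late())  # hello bob

# The same effect in a loop: every lambda sees the final value of i.
callbacks = [lambda: i for i in range(3)]
print([cb() for cb in callbacks])  # [2, 2, 2]
```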

Unexpected? It depends, but it was definitely unexpected when I ran into it. The solution is to bind early, by copying the value at the time the function is created. We can do this in two ways:

Using a default parameter value in the lambda function:

[gist]https://gist.github.com/707903[/gist]

Using functools.partial:

[gist]https://gist.github.com/707904[/gist]
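
Side by side, the two early-binding options could look like this sketch. The identity helper and the list names are made up for illustration.

```python
from functools import partial


def identity(value):
    return value


# Option 1: a default parameter value is evaluated when the lambda is created.
with_default = [lambda i=i: identity(i) for i in range(3)]

# Option 2: partial stores the argument when the partial object is created.
with_partial = [partial(identity, i) for i in range(3)]

print([f() for f in with_default])  # [0, 1, 2]
print([f() for f in with_partial])  # [0, 1, 2]

# A partial object also exposes what it was built from.
print(with_partial[1].func is identity, with_partial[1].args)  # True (1,)
```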

I personally like the second approach, using partial, since it’s more explicit, and I really see option one as a workaround for how lambda functions work.

 

Browsing RFCs with qRFCView

I usually need to browse through lots of RFCs, both for work and leisure. Reading them in the browser or even in a PDF reader doesn’t feel quite right to me, and long ago I found the ultimate tool for reading RFCs on my computer: qRFCView.

However, one day I wanted to read RFC6026 and suddenly realized that for some reason qRFCView wouldn’t let me choose an RFC number greater than 5000.

A quick search revealed that the number is hardcoded 🙁 It’s been listed as an important bug in Debian since January 2009.

The fix is really easy: just get the Debian source for qrfcview, modify src/main.cpp, raise the number to 9999 or whatever you think is best, and build the package again with debuild.

iPhone: getting IMAP push notifications with Python and Prowl

I’m a happy iPhone user. I was lucky to get the first iPhone soon after it was available and I just bought a new iPhone 4. This is my first time having an unlimited data plan on a mobile phone 🙂 so the first thing I did was to configure my email accounts, both personal and work.
It felt weird to configure a GMail account as Microsoft Exchange in order to sync contacts and get push notifications, but that did the trick. For my work email account I didn’t have push notifications, and polling is not cool at all. I know very little about email itself, but I found that in order to have push email you need a server and a client that both support IMAP IDLE.
As you may see below, the server does support IMAP IDLE:
[gist]https://gist.github.com/579679[/gist]
but it looks like iPhone Mail app doesn’t 🙁
I started looking around and finally found the solution: use an application called Prowl (2.39€) which will push whatever notification it gets through a REST API. The idea is basically to create a console IMAP IDLE client and use the Prowl API to send a new notification. Sounds like fun!
There is a Perl implementation here (http://github.com/mschmitt/GhettoPush/) but I don’t like Perl very much, so I had to do it in Python, of course 😉
I also found some other Python implementations which I didn’t like for one reason or another, so I coded a new one using imaplib2 by Piers Lauder (http://www.cs.usyd.edu.au/~piers/python/imaplib.html) and prowlpy, which I forked on GitHub to make a couple of really silly modifications (http://github.com/saghul/prowlpy).
It works in the background, can connect to an arbitrary number of accounts (configured with an ini-style file) and sends push notifications on their behalf. It also logs to syslog when running in the background.
This Prowl thing is just awesome. This is only one example of what you can do with it; anyone can write a really simple script to get instant notifications on their iPhone. Just give it a try!

 

Happy pushing!

Python: list vs. set

In Python we have several types of objects for storing values in an
array-like way: lists, tuples, dictionaries (they are really more like
hash tables) and sets. Lists and sets might look alike but they are
different. Let’s do a quick test:

[gist]https://gist.github.com/569053[/gist]

In the above snippet, we can see that iterating over all items in a
list takes just a little longer than in a set: 8.99us vs 6.87us. We are
talking microseconds here!

Now, if we look at how long it takes to verify whether the list or set
contains an object, we can see the difference: 1.99us vs 140ns. Sets
are an order of magnitude faster!
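
The membership comparison can be reproduced with the timeit module. This is
only a sketch: the container size (1000 items) and the repeat count are my
own choices, and the absolute numbers will vary from machine to machine.

```python
import timeit

items = list(range(1000))
as_list = items
as_set = set(items)
needle = 999  # last element: the worst case for a list scan

# Total time in seconds for 10000 membership tests on each container.
list_time = timeit.timeit(lambda: needle in as_list, number=10000)
set_time = timeit.timeit(lambda: needle in as_set, number=10000)

print("list: %.6fs  set: %.6fs" % (list_time, set_time))
```

The gap widens as the list grows, since the list has to be scanned while the
set lookup cost stays roughly constant.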

Why is that? A Python list is implemented as a dynamic array of object
pointers, so checking for a value means comparing against each element
in turn: its time complexity is O(n). Sets, on the other hand, are
implemented as hash tables, so the key is hashed and found (or not)
almost instantly, with an average time complexity of O(1).

For the curious, this is the list object declaration in Include/listobject.h

[gist]https://gist.github.com/569078[/gist]

and this is the set object declaration, in Include/setobject.h

[gist]https://gist.github.com/569085[/gist]

More on Python and hash tables here:

http://sites.google.com/site/usfcomputerscience/hash-tables-imp

:wq

Choosing a JSON library

When I download a library that wraps some API, I often find that different authors depend on different JSON libraries. Sometimes I don’t have that particular one installed, even though I have another one that could do the job just fine.

A while ago I ran across this while I was having a look at the Tropo WebAPI library (http://github.com/tropo/tropo-webapi-python) and ended up adding the following code:

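try:
    # cjson is tried first; alias its encode/decode to the
    # dumps/loads names that the other libraries provide
    import cjson as jsonlib
    jsonlib.dumps = jsonlib.encode
    jsonlib.loads = jsonlib.decode
except ImportError:
    try:
        # simplejson as bundled with Django
        from django.utils import simplejson as jsonlib
    except ImportError:
        try:
            import simplejson as jsonlib
        except ImportError:
            # standard library fallback (Python >= 2.6)
            import json as jsonlib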

The most commonly used JSON libraries are tried in order of speed, fastest first. You might want to read a nice comparison between libraries here:
http://blog.hill-street.net/?p=7
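
The same fallback chain can also be written as a loop, which avoids the nested try blocks and doesn’t modify the imported module. This is just a sketch of an alternative, not what I added to the Tropo library; the function name load_json_library and its candidates parameter are made up for illustration.

```python
import importlib


def load_json_library(candidates=("cjson", "simplejson", "json")):
    """Return (name, dumps, loads) for the first importable candidate."""
    for name in candidates:
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue
        # cjson calls these encode/decode; the others use dumps/loads.
        dumps = getattr(module, "dumps", None) or getattr(module, "encode")
        loads = getattr(module, "loads", None) or getattr(module, "decode")
        return name, dumps, loads
    raise ImportError("no JSON library available")


name, dumps, loads = load_json_library()
print(name, loads(dumps({"answer": [4, 2]})))
```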

:wq