Serializing DB operations with Twisted and SQLObject

Long time no write! A while ago I wrote a post about how to combine Twisted with SQLObject. The method described there, however, doesn’t work particularly well when using SQLite.

Since an SQLite database is a single file (or a memory area), write operations need to lock it. In the other post we used deferToThread, which runs a function on a different thread and returns a Deferred which fires when the operation is finished. The thread on which the operation runs is taken from the reactor thread pool, so it’s not always the same thread. This means that if more than one write operation is executed at almost the same time, the first operation will get the lock, and the others will need to wait. When using SQLObject, the operation will fail if the lock can’t be acquired, so let’s see what we can do about it.

One option would be to use the timeout parameter and set it to some reasonable value, so that the operation waits that amount of time before giving up on the lock. There is no guarantee that the lock will be acquired within that time, so this could be complemented with a wait-retry strategy.
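The timeout-plus-retry idea can be sketched with the standard library sqlite3 module rather than SQLObject (that swap is mine, to keep the example self-contained). The helper name execute_with_retry and its default values are hypothetical, and the retry is a simple linear back-off, not a tuned strategy:

```python
import sqlite3
import time


def execute_with_retry(path, sql, params=(), timeout=5.0, retries=3, delay=0.1):
    """Run one write statement, retrying if the database stays locked.

    `timeout` is handed to sqlite3.connect(), so SQLite itself waits up to
    that many seconds for the lock before raising OperationalError.
    Returns the number of retries that were needed.
    """
    for attempt in range(retries + 1):
        conn = sqlite3.connect(path, timeout=timeout)
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(sql, params)
            return attempt
        except sqlite3.OperationalError as exc:
            # Only retry lock contention; re-raise anything else.
            if "locked" not in str(exc) or attempt == retries:
                raise
            time.sleep(delay * (attempt + 1))  # simple linear back-off
        finally:
            conn.close()
```

Even with both mechanisms in place, a busy writer can still exhaust the retries, which is why the single-thread approach is attractive.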

Or we can just run all database operations in a single thread. This way only one operation is executed at a time, so we avoid the issue. Also, since we return Deferreds, nothing blocks waiting for results.

In order to do this I chose to use Twisted’s ThreadPool class, with a single thread. The reason for doing it this way is that Twisted provides us with nice functions to run operations on one of these pools and return a Deferred.

First we set up our thread pool:

[gist]https://gist.github.com/994097[/gist]

We will need to stop the thread pool when our application ends, hence the call to addSystemEventTrigger. It will run the specified function when the reactor is about to be stopped.

Then we’ll create a decorator which we’ll use to decorate functions that will run on the thread pool:

[gist]https://gist.github.com/994098[/gist]

Since the thread pool only has a single thread, we are effectively serializing all database operations, as long as they are called using our decorator. Let’s see a complete example:

[gist]https://gist.github.com/994096[/gist]
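For readers who want to try the idea without Twisted installed, here is a minimal sketch of the same serialization technique using only the standard library. It is not the code from the gists: concurrent.futures stands in for Twisted’s ThreadPool, a Future stands in for the Deferred, and the names db_executor, run_in_db_thread, add_message and count_messages are hypothetical.

```python
import sqlite3
import threading
from concurrent.futures import ThreadPoolExecutor
from functools import wraps

# One worker thread means submitted calls run strictly one after another.
db_executor = ThreadPoolExecutor(max_workers=1)


def run_in_db_thread(func):
    """Decorator: run `func` on the single DB thread and return a Future."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        return db_executor.submit(func, *args, **kwargs)
    return wrapper


_conn = None  # created lazily, inside the worker thread


def _connection():
    global _conn
    if _conn is None:
        _conn = sqlite3.connect(":memory:")
        _conn.execute("CREATE TABLE log (msg TEXT)")
    return _conn


@run_in_db_thread
def add_message(msg):
    conn = _connection()
    with conn:  # commit the insert
        conn.execute("INSERT INTO log VALUES (?)", (msg,))
    return threading.current_thread().name


@run_in_db_thread
def count_messages():
    return _connection().execute("SELECT COUNT(*) FROM log").fetchone()[0]
```

Every decorated call returns immediately with a Future, and because the pool has exactly one worker, two writes can never compete for the SQLite lock.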

Hope you find it useful!

:wq

lighttpd + PHP-FPM on Debian Squeeze

I’m not a super-BOFH but I do like to host my own blog (not this one, though) and play with stuff.

In the past I used Apache, but I never liked the VirtualHosts configuration, especially when it gets big. Don’t get me wrong, I’m not saying it’s not right, it’s just not right for me. So I switched to lighttpd and I loved the configuration. It also has good performance.

This lighttpd server hosts a couple of WordPress blogs among other stuff and I noticed some slowness from time to time. Since it’s not critical to me I didn’t bother much. But then, reading some of my RSS feeds, I came across PHP-FPM. I had never heard of it before.

It will run as a system daemon and take care of spawning PHP processes for handling requests when necessary, among other things. Its performance should be better than old FastCGI, so I decided to give it a try.

There is no PHP-FPM package on Debian Squeeze, but fortunately we can use the Dotdeb repository which does have one. Follow the instructions here and install PHP-FPM:

apt-get install php5-fpm

You may also want to upgrade all your PHP-related packages, since Dotdeb
contains good, up-to-date packages.

Now we need to configure lighttpd to use PHP-FPM. Let’s create a
configuration file for that in /etc/lighttpd/conf-available with the
following content:

[gist]https://gist.github.com/936469[/gist]

I suggest you name the file 10-fastcgi-fpm.conf. Now let’s disable the old FastCGI modules and enable our new one:

lighttpd-disable-mod fastcgi
lighttpd-disable-mod fastcgi-php
lighttpd-enable-mod fastcgi-fpm

And last, restart the server:

/etc/init.d/lighttpd force-reload

I didn’t benchmark it thoroughly, but my rtGUI page used to take up to 10 seconds to load and now it takes less than a second. Not bad!

:wq!

Checking Google Voice SIP service availability

During the past weeks Google has been turning the inbound SIP service in Google Voice on and off. You would wake up and see someone on Twitter claiming that it worked, and hours later it didn’t work anymore. So I thought about building a way to ping the Google Voice servers and present this information in a nice way.

When inbound SIP support is enabled Google Voice servers (which run OpenSIPS, by the way) respond to OPTIONS requests, so my idea was very simple: send a SIP OPTIONS message to GV servers and print the result somewhere. This somewhere had to be the web. But I didn’t want to host a website, so I thought “hey, maybe it’s time to try that Google App Engine!”.

Applications run in a sandboxed environment on GAE, so you can’t have extension modules or even access the socket module. But you can make outgoing HTTP requests. With these limitations in mind this is what I came up with:

GAE has cron job support, that is, it can do a GET request at regular intervals. So why not leverage this and make a GET request every 5 minutes to some web service which tells us the GV SIP service status, and store the result? This way we don’t need to connect anywhere every time someone wants to see this data, because it is kept up to date by the cron job.

This required a component which must be able to handle HTTP requests and generate SIP OPTIONS requests. I didn’t have this, but since I work every day with the python-sipsimple library and I know some Twisted, it took me very little time to build one. I called it SIPwPing and it’s available on my GitHub.
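To give an idea of what such a ping looks like on the wire, here is a minimal sketch that only builds the text of a SIP OPTIONS request following the RFC 3261 layout. It is not SIPwPing’s code and does not use python-sipsimple; the function name, the target host sip.example.com and the 192.0.2.1 local address are placeholders, and nothing is sent over the network.

```python
import uuid


def build_options_request(target_host, local_host="192.0.2.1", local_port=5060):
    """Return a minimal SIP OPTIONS request as bytes (RFC 3261 layout)."""
    # RFC 3261 requires the Via branch to start with this magic cookie.
    branch = "z9hG4bK" + uuid.uuid4().hex
    lines = [
        "OPTIONS sip:%s SIP/2.0" % target_host,
        "Via: SIP/2.0/UDP %s:%d;branch=%s" % (local_host, local_port, branch),
        "Max-Forwards: 70",
        "From: <sip:ping@%s>;tag=%s" % (local_host, uuid.uuid4().hex[:8]),
        "To: <sip:%s>" % target_host,
        "Call-ID: %s@%s" % (uuid.uuid4().hex, local_host),
        "CSeq: 1 OPTIONS",
        "Content-Length: 0",
    ]
    # Headers are CRLF-separated and followed by an empty line.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


print(build_options_request("sip.example.com").decode("ascii"))
```

A server that supports inbound SIP would be expected to answer such a request, which is all the status page needs to know.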

Some of you may think this is completely over-engineered. And you are right. But I wanted to play with GAE and this was a very good excuse to do so. 🙂

The application is now live at http://gvoice-sip-status.appspot.com and the source code will be available on my GitHub account later today.

PS: The application is using the free plan and if any quota is exceeded at some point I’ll not bother much, since I already had the fun making it work 🙂

:wq


FOSDEM: awesomeness^3

It has been a very long weekend; I’m writing this on the train while traveling back from FOSDEM. I knew the event was big, but I couldn’t imagine it was that big.

Everything was just perfect: the talks, the speakers, the facilities, the connection, … Especially the connection. At conferences the network usually sucks because the access points can’t cope with the load or there is not enough bandwidth for everyone… who knows. This was not the case at FOSDEM. Every device got a public IPv4 address and also an IPv6 one. We did a simple test and downloaded 10GB from an FTP server. It took 2 minutes.

I was lucky and got the opportunity to give a talk in the Open Source Telephony Devroom about SIPSIMPLE SDK, the SIP library we develop at work.

The talk went through the implementation of different server applications for SylkServer, a SIP application server we recently launched. Simple SIP client examples were also shown. The presentation is available for viewing and downloading here:

[slideshare id=6830569&doc=developingrichsipapplicationswithsipsimple-110206091415-phpapp02]

The code examples used for the presentation can be downloaded from my GitHub repository.

Once more, kudos to the organizers of FOSDEM. It was a really great event and I would definitely recommend it to anyone. Hope to be there again next year!

:wq

Managing timezone aware datetime objects

Dealing with date and time can be quite tricky, especially if we need to convert local times from different timezones. Let’s see what the standard Python datetime module does:

[gist]https://gist.github.com/805087[/gist]

The generated timestamp is in ISO 8601 format, more or less: I would rather drop the microseconds, but that’s not the biggest problem. The problem is that the offset from UTC is not added. If a timestamp is generated in local time, the offset from UTC must be provided. Let’s see an example:

2011-02-01T00:56:23+01:00

That means that the local time is 00:56:23, which is one hour ahead of UTC. Regular Python datetime objects are not timezone aware. Getting proper timezone information is a tough matter, but fortunately the dateutil module will help. Let’s see it in action:

[gist]https://gist.github.com/805107[/gist]

Hey! Now the timestamp is generated correctly! (I set the microsecond to 0 to keep the timestamp short; ISO 8601 itself permits fractional seconds.) We can also use the dateutil module to parse a timestamp and generate a timezone-aware object out of it:

[gist]https://gist.github.com/805117[/gist]
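As a complement to the dateutil approach, the same round trip can be sketched with the standard library alone, assuming Python 3.7 or later (for datetime.fromisoformat). The fixed +01:00 offset is taken from the example timestamp above; a fixed offset knows nothing about daylight saving rules, which is where a library like dateutil still earns its keep.

```python
from datetime import datetime, timedelta, timezone

# A fixed +01:00 offset, matching the example timestamp in the text.
plus_one = timezone(timedelta(hours=1))

aware = datetime(2011, 2, 1, 0, 56, 23, 987654, tzinfo=plus_one)
stamp = aware.replace(microsecond=0).isoformat()
print(stamp)  # 2011-02-01T00:56:23+01:00

# Parsing the string back yields a timezone-aware object again.
parsed = datetime.fromisoformat(stamp)
print(parsed.utcoffset())  # 1:00:00
print(parsed.astimezone(timezone.utc).isoformat())  # 2011-01-31T23:56:23+00:00
```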

One last note: if you need to perform operations such as calculating the difference between two datetime objects, they both need to be timezone aware, otherwise bad things happen:

[gist]https://gist.github.com/805121[/gist]
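The "bad things" can be reproduced with the standard library as well. This is a small sketch, not the gist’s code, and the variable names are mine: mixing a naive and an aware object in a subtraction raises TypeError, while two aware objects are compared by the instant they represent, whatever their offsets.

```python
from datetime import datetime, timedelta, timezone

aware = datetime(2011, 2, 1, 0, 56, 23, tzinfo=timezone(timedelta(hours=1)))
naive = datetime(2011, 2, 1, 0, 56, 23)

try:
    aware - naive
    mixed_error = None
except TypeError as exc:
    # Python refuses to guess which timezone the naive object is in.
    mixed_error = str(exc)
print(mixed_error)

# The same instant expressed in UTC: the difference is zero.
same_instant_utc = datetime(2011, 1, 31, 23, 56, 23, tzinfo=timezone.utc)
difference = aware - same_instant_utc
print(difference)  # 0:00:00
```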

:wq

Using SQLObject with Twisted

Over the past week I’ve been working on a small personal project with Twisted. I need to access a database to store and retrieve data, so I started with the obvious, using APIs provided by Twisted.

Twisted provides a database API called adbapi. The API is pretty straightforward, and the operations I wanted to perform were not rocket science anyway, so it served the purpose, but I wasn’t 100% satisfied.

I was using the runQuery function, mainly, and putting there a regular SQL statement. I didn’t like that. Then I remembered SQLObject.

SQLObject is an ORM, providing an object-oriented API to several databases (I was aiming at MySQL and SQLite). This is what I was looking for. But there is a problem: accessing a database is a blocking operation.

Twisted uses the reactor pattern, so you can’t run a blocking operation in the event loop’s thread without affecting the whole program. Database access libraries tend to be blocking, so Twisted runs database operations in another thread and then delivers the results to a callback in the main thread. This creates the illusion of non-blocking database access.

In order to integrate SQLObject nicely with Twisted this is exactly what we want to do. We’ll defer all database operations to another thread and we’ll get results (or failures) in callback functions. The key function here is deferToThread which will run the specified function in the reactor thread pool and return a deferred. In order to make our life easier we’ll use a decorator which will run the decorated function in the reactor’s thread pool and return a deferred:

[gist]https://gist.github.com/793936[/gist]

Now let’s see a simple (yet full) example of how all this works:

[gist]https://gist.github.com/793982[/gist]

As you can see the results (or errors) are retrieved in the got_result/got_error callback functions asynchronously, and as the operation was executed in a different thread this didn’t affect the main event loop.

:wq

Setting core dump limits from Python

It’s very annoying for a program to crash, but it happens. When it does, if the system was configured to dump the core, you’ll get a core file with hopefully enough information to debug the crash.

But we don’t know if the system will be configured to dump the core, so we may want to do this from our Python program. We can use the resource module in the standard library. The following example shows how to enable core dumps if the --dump-core option is specified:

[gist]https://gist.github.com/772101[/gist]

PS: This worked just fine for the root user but didn’t seem to do the job for a normal user.
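One possible explanation for the PS, offered only as an assumption: on Unix an unprivileged process may raise its soft limit up to the hard limit but may not raise the hard limit itself, so asking for an unlimited core size can fail for a normal user. The minimal sketch below (not the gist’s code; function names are mine) therefore raises the soft limit only as far as the existing hard limit allows.

```python
import argparse
import resource


def enable_core_dumps():
    """Raise the soft core-size limit as far as the hard limit allows."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # Any user may raise the soft limit up to the hard limit; only a
    # privileged process may raise the hard limit itself.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--dump-core", action="store_true",
                        help="allow the process to write a core file")
    args = parser.parse_args(argv)
    if args.dump_core:
        return enable_core_dumps()
    return resource.getrlimit(resource.RLIMIT_CORE)
```

If the hard limit is already 0, no core file will be written whatever the program does; in that case the limit has to be changed by the administrator.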

Implementing registry pattern with class decorators

Registry is quite a common design pattern which I recently needed to implement, and I found Python class decorators very useful, so I thought I’d write about it. :-)

In the registry pattern we have a global object we call the registry or registrar which contains references to shared objects and it’s the only entity which can access those.

In my case, I had a plugin style architecture in which different plugins might be dynamically added without any configuration. The first implementation I made used a metaclass to add the plugin class to the registry:

[gist]https://gist.github.com/771884[/gist]
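For reference, a metaclass-based registry along these lines might look like the following sketch. It uses Python 3 syntax rather than the Python 2 syntax of the time, and the names plugin_registry, PluginMeta and EchoPlugin are hypothetical, not taken from the gist.

```python
plugin_registry = {}


class PluginMeta(type):
    """Metaclass that records every plugin subclass as it is defined."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        # The base class itself has no bases; register only subclasses.
        if bases:
            plugin_registry[name] = cls


class Plugin(metaclass=PluginMeta):
    pass


class EchoPlugin(Plugin):
    pass


print(plugin_registry)
```

Simply defining a subclass of Plugin is enough to register it, with no configuration step.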

That worked, at least at the beginning. But later I found a limitation: plugins needed the Plugin metaclass, so I couldn’t use another metaclass on them if I needed to. And at some point I did feel the need to do that. I wanted the plugins to use the Singleton metaclass (to implement the Singleton pattern) but then I couldn’t use the Plugin metaclass because only one metaclass can be used (and I didn’t want to create a SingletonPlugin metaclass). And class decorators saved the day:

[gist]https://gist.github.com/771888[/gist]
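A minimal sketch of the decorator variant, again in Python 3 syntax and with hypothetical names (plugin, Singleton, EchoPlugin): the decorator takes care of registration, which leaves the metaclass slot free for something else.

```python
plugin_registry = {}


def plugin(cls):
    """Class decorator: add the class to the registry, return it unchanged."""
    plugin_registry[cls.__name__] = cls
    return cls


class Singleton(type):
    """Metaclass that hands back one shared instance per class."""
    _instances = {}

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


@plugin
class EchoPlugin(metaclass=Singleton):
    def run(self, text):
        return text


print(plugin_registry["EchoPlugin"] is EchoPlugin)  # True
print(EchoPlugin() is EchoPlugin())  # True
```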

Class decorators are supported only in Python >= 2.6, but that wasn’t a problem. :-)

:wq

Bye bye grep, hello ack!

This past week I’ve been enhancing my Vim configuration, since it’s the editor I use all the time. While browsing for interesting plugins I ran across Ack, a plugin that promises to be an enhancement for Vim’s builtin grep capability.

The Vim plugin uses the ack command on your system, which is a Perl replacement for grep. After discovering it on http://betterthangrep.com I didn’t look back. Ack is so much better than grep for finding text among code!

One of the good things is the fact that Ack will automagically ignore the directories of well-known version control systems. According to the documentation:

Directories ignored by default:
autom4te.cache, blib, _build, .bzr, .cdv, cover_db, CVS, _darcs, ~.dep,
~.dot, .git, .hg, _MTN, ~.nib, .pc, ~.plst, RCS, SCCS, _sgbak and .svn

Let’s see a usage example:

[gist]https://gist.github.com/717782[/gist]

And we don’t need to explicitly exclude any of the directories listed above, because Ack will do it for us.

Happy ack-ing!

 

lambda vs. functools.partial

In Python we have a way of creating anonymous functions at runtime: lambda. With lambda we can create a function that is not bound to any name. That is, we don’t need to def a function, and lambda can be used instead.

[gist]https://gist.github.com/707898[/gist]

Lambda can also be used to call another function while fixing certain arguments:

[gist]https://gist.github.com/707899[/gist]

However, lambda functions use late binding for the variables they reference. That is, when the body of a lambda uses a variable from the enclosing scope, its value is not copied when the lambda is created; a reference to the scope is kept and the value is looked up when the function is called. Thus, if the variable changes within that scope before the call, the lambda function will see the changed value. Let’s see it in action:

[gist]https://gist.github.com/707901[/gist]
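A small sketch of the effect (the function and variable names are mine, not the gist’s). The second half shows the classic form of the surprise, lambdas created in a loop:

```python
def greet(name):
    return "Hello, %s!" % name


name = "Alice"
late = lambda: greet(name)  # `name` is looked up when `late` is called
name = "Bob"
print(late())  # Hello, Bob!

# Every lambda shares the same loop variable, which ends up at 2.
callbacks = [lambda: i for i in range(3)]
print([f() for f in callbacks])  # [2, 2, 2]
```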

Unexpected? It depends, but it was definitely unexpected when I ran into it. The solution is to bind early, by copying the value at the time the function is created. We can do this in two ways:

Using a fixed parameter in the lambda function:

[gist]https://gist.github.com/707903[/gist]
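In sketch form (names are illustrative), the default value is evaluated once, when the lambda is created, so each function keeps its own copy. The price is that the captured value becomes an ordinary parameter which a caller can override by accident:

```python
# `i=i` evaluates the current `i` at creation time and stores it as a default.
callbacks = [lambda i=i: i for i in range(3)]
print([f() for f in callbacks])  # [0, 1, 2]

# The captured value is just a default argument, so it can be overridden.
print(callbacks[0](10))  # 10
```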

Using functools.partial:

[gist]https://gist.github.com/707904[/gist]
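A comparable sketch with partial (again with illustrative names): the argument is stored inside the partial object at creation time, and it can be inspected afterwards through the func and args attributes.

```python
from functools import partial


def greet(name):
    return "Hello, %s!" % name


name = "Alice"
early = partial(greet, name)  # the current value of `name` is stored now
name = "Bob"
print(early())  # Hello, Alice!
print(early.func is greet, early.args)  # True ('Alice',)
```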

I personally like the second approach, using partial, since it’s more explicit, and I really see option one as a workaround for how lambda functions work.