I had the pleasure of giving a presentation at the first ever Python Devroom at FOSDEM. I talked about how event loops work internally and how pyuv can help by abstracting away a lot of the problems behind a pretty simple-to-use API. I also introduced rose, a pyuv-based PEP-3156 event loop implementation, but I’ll write a follow-up post on that 🙂
Thanks a lot to everyone who attended the talk, and for those who couldn’t, here are the slides!
While searching for some information on Python locks I recently ran across this great post by David Beazley. In it, he explains how the synchronization primitives in the Python standard library are implemented. Basically, Lock is implemented as a binary semaphore in C, and the rest are implemented in pure Python. Even though the post is from 2009, this is still the case. UPDATE: As Antoine Pitrou points out in the comments, starting with CPython 3.2, RLock is now implemented in C.
This got me thinking. As you know I’ve created pyuv, a Python wrapper for libuv, and libuv includes cross-platform implementations of mutexes, semaphores, conditions, rwlocks and barriers, which I never bothered to add to pyuv. I didn’t add them because I thought they didn’t add any value, but after reading David’s article I decided to do a quick test: implement wrappers for a mutex and a condition variable and use them in a Queue implementation in order to see if there was any difference in performance. Not that I ever ran into performance issues related to that, but I was curious anyway 🙂
Someone may think “oh, but given that Python has the GIL, how does using multiple threads and speeding up the locks matter?”. The GIL is released whenever an I/O operation is performed, so if your Python application is multithreaded and mainly deals with I/O-bound tasks, the GIL is not that relevant. If your application is CPU-bound, however, you’d better have a look at the multiprocessing module.
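A minimal sketch of that point, assuming time.sleep() as a stand-in for a blocking I/O call (it also releases the GIL while it waits). The function names and the 0.2 second delay are made up for illustration. If the GIL serialized these threads, four of them would take roughly four times as long as one:

```python
import threading
import time


def blocking_task(delay):
    # time.sleep() blocks outside the interpreter and releases the GIL,
    # much like a blocking socket or file read would.
    time.sleep(delay)


def run_threads(n_threads, delay):
    """Run n_threads blocking tasks concurrently; return elapsed seconds."""
    threads = [threading.Thread(target=blocking_task, args=(delay,))
               for _ in range(n_threads)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


elapsed = run_threads(4, 0.2)
print("4 threads blocking 0.2s each took %.2fs in total" % elapsed)
```

The total should land close to 0.2 seconds rather than 0.8, because the waits overlap.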
So, let’s get into the code! I implemented Barrier, Condition, Mutex, RWLock and Semaphore in this pyuv branch, which directly wrap their libuv counterparts. Then I copied the Queue implementation from the standard library and used the freshly wrapped synchronization primitives:
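As a rough illustration of the idea (not the code from the pyuv branch), here is a stripped-down queue built on one mutex and one condition variable. It uses threading.Lock and threading.Condition from the standard library as stand-ins, since the exact API of the pyuv wrappers is not shown here; the class name SketchQueue is hypothetical. Swapping in another implementation would mean replacing the two objects created in __init__:

```python
import threading
from collections import deque


class SketchQueue:
    """Unbounded FIFO queue guarded by one mutex and one condition variable."""

    def __init__(self):
        self._items = deque()
        # Stand-ins: these two lines are where wrapped primitives would go.
        self._mutex = threading.Lock()
        # The condition shares the mutex, so wait() releases it atomically.
        self._not_empty = threading.Condition(self._mutex)

    def put(self, item):
        with self._mutex:
            self._items.append(item)
            self._not_empty.notify()  # wake one waiting consumer

    def get(self):
        with self._not_empty:
            # Re-check in a loop: another consumer may have taken the item.
            while not self._items:
                self._not_empty.wait()
            return self._items.popleft()


q = SketchQueue()
results = []


def consume_three():
    for _ in range(3):
        results.append(q.get())


consumer = threading.Thread(target=consume_three)
consumer.start()
for i in range(3):
    q.put(i)
consumer.join()
print(results)
```

The standard library Queue adds bounded sizes, timeouts and task tracking on top of the same pattern.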
For the testing part, I used the timeit function from IPython with 5 runs. Not sure if it’s the best way, but results suggest it is a good way 🙂 Here is the test script:
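A comparable measurement can be sketched with the standard library’s timeit module in place of IPython’s. The workload below (equal numbers of producer and consumer threads pushing 1000 items through a queue) is an assumption made for illustration, not the original test script, and producer_consumer is a hypothetical helper name:

```python
import queue
import threading
import timeit


def producer_consumer(queue_factory, n_threads, n_items=1000):
    """Push items through a queue with n_threads producers and consumers."""
    q = queue_factory()
    per_thread = n_items // n_threads

    def produce():
        for i in range(per_thread):
            q.put(i)

    def consume():
        for _ in range(per_thread):
            q.get()

    threads = [threading.Thread(target=produce) for _ in range(n_threads)]
    threads += [threading.Thread(target=consume) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return per_thread * n_threads  # number of items actually transferred


for n in (2, 4, 100):
    # repeat=5 gives five timed runs; the best one is the least noisy.
    best = min(timeit.repeat(lambda: producer_consumer(queue.Queue, n),
                             repeat=5, number=1))
    print("%3d threads: %.4fs" % (n, best))
```

Passing a different class as queue_factory would time another queue implementation under the same load.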
Here are the results:
The tests were run with 2, 4 and 100 threads, and since I was testing performance, I added PyPy to the mix. Now, as you can see in the results, the custom Queue is about 33% faster than the one in the standard library, so I was pretty happy about that. Until I tested PyPy. It just beats the shit out of both, which is awesome 🙂
The performance increase on CPython is nice, but there is a downside: libuv treats errors in these primitives a bit “abruptly” by calling abort(). This means that if you use the locks incorrectly your program will core dump. I personally like it, because it helps you find and fix the problem right away, but not everyone may like it.
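For contrast, here is what the same kind of misuse looks like with the standard library’s threading.Lock on CPython 3, where releasing a lock that was never acquired raises an exception the program can catch. This sketch only shows the standard library behaviour; it does not call into pyuv or libuv:

```python
import threading

lock = threading.Lock()
try:
    lock.release()  # misuse: the lock was never acquired
except RuntimeError as exc:
    # The interpreter keeps running and the error can be handled or logged.
    outcome = "recoverable: %s" % exc
else:
    outcome = "no error"
print(outcome)
```

With a primitive that calls abort() there is no except branch to fall into: the process terminates at the point of misuse.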
Yesterday NodeJS 0.8.0 was released, and libuv got its dedicated branch for maintenance of the 0.8.X version cycle. Since pyuv now implements all the features offered by libuv I chose to follow the same path and branch out pyuv 0.8.0.
From now on pyuv’s v0.8 branch will only get bugfixes from the corresponding libuv branch. pyuv’s master branch will use libuv master and have version number 0.9.0-dev; there will not be a release of that branch soon, I guess.
So, what’s new in pyuv 0.8.0 then?
FSPoll handle, for polling for file changes using the stat syscall
Several fixes in the fs module for Windows
Bugfixes that came with libuv
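To give an idea of what polling for file changes with the stat syscall means, here is a minimal pure-Python sketch of the technique using os.stat. It is not the FSPoll API; the helper names, the compared fields and the polling interval are assumptions made for illustration:

```python
import os
import tempfile
import threading
import time


def stat_signature(path):
    """Return the stat fields compared between polls (None if missing)."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None
    return (st.st_mtime_ns, st.st_size, st.st_ino)


def poll_for_change(path, interval=0.05, timeout=2.0):
    """Poll path until its stat signature changes; return True if it did."""
    previous = stat_signature(path)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        time.sleep(interval)
        if stat_signature(path) != previous:
            return True
    return False


with tempfile.NamedTemporaryFile(delete=False) as f:
    path = f.name


def modify():
    time.sleep(0.1)
    with open(path, "a") as fh:
        fh.write("changed")  # grows the file, so st_size differs


writer = threading.Thread(target=modify)
writer.start()
changed = poll_for_change(path)
writer.join()
os.unlink(path)
print("changed:", changed)
```

An event-loop handle does the same comparison from a timer callback instead of blocking in a sleep loop.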
For version 0.9, apart from following libuv’s development I have plans to make the following changes:
Move the getaddrinfo function to the util module
Overhaul the dns module, provide a raw wrapper on top of c-ares instead of exposing a complete DNS resolver (I’ll cover this in a future blog post, when I approach the implementation)
Last but not least, I’d like to thank the NodeJS and libuv developers again; they are really doing a great job!
With yesterday’s pyuv release the door was opened for integrating pyuv with other event loops or applications. Thanks to the Poll handle we can now create a regular socket in Python and put it in pyuv’s event loop, so we can also use pyuv to replace other event loops ;–)
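The underlying idea, handing a regular socket to an event loop and getting a callback when it becomes readable, can be sketched with the standard library’s selectors module standing in for pyuv’s loop. This is a swapped-in technique for illustration only and does not use the Poll handle API; on_readable is a hypothetical callback name:

```python
import selectors
import socket

sel = selectors.DefaultSelector()

# A connected pair of ordinary sockets, created outside the "loop".
reader, writer = socket.socketpair()
reader.setblocking(False)

received = []


def on_readable(sock):
    # Called once the selector reports the socket as readable.
    received.append(sock.recv(1024))


# Register the existing socket and attach the callback as user data.
sel.register(reader, selectors.EVENT_READ, on_readable)
writer.send(b"ping")

# One iteration of a very small event loop.
for key, events in sel.select(timeout=1.0):
    key.data(key.fileobj)

sel.unregister(reader)
reader.close()
writer.close()
sel.close()
print(received)
```

A Tornado IOLoop or Twisted reactor built on another loop needs the same primitive: watch a file descriptor it does not own and dispatch on readiness.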
I created a couple of projects for toying around with this feature. The first project implements a Tornado IOLoop which runs on top of pyuv, and the second one implements a Twisted reactor on top of pyuv.
They are not feature complete yet, but basics are working and I’ll be adding more features as time allows.
It’s been a while since I released pyuv. In this time the libuv guys have been really busy adding lots of cool stuff and with this release pyuv now wraps most parts of libuv. Here is a short changelog highlighting the important additions:
Made the default loop a singleton
Added TTY handle
Moved all exception definitions to a standalone file
Added set_membership function to UDP handle
Added ability to write a list of strings to IOStream objects
Added ability to send lists of strings on UDP handles
Added open function to Pipe handle
Added Process handle
Added ‘data’ attribute to all handles for storing arbitrary objects
Implemented pending_instances function on Pipe handle
Implemented nodelay, keepalive and simultaneous_accepts functions on TCP handle
Added ‘counters’ attribute to Loop
Added ‘poll’ function to Loop
Added new functions to fs module: unlink, mkdir, rmdir, rename, chmod, fchmod, link, symlink, readlink, chown, fchown, fstat
Added new functions to util module: uptime, get_process_title, set_process_title, resident_set_size, interface_addresses, cpu_info
For the next release I plan to add the missing functions for asynchronous filesystem operations and work on Windows support (I even got a pull request to start with, yay!). Stay tuned!
After working on this on and off in my free time, I’m happy to share it. pyuv is a Python interface to libuv, a high-performance, portable library for asynchronous network communications and more. libuv is the platform layer for the well-known NodeJS.
pyuv implements the following features:
Asynchronous DNS resolver
Running operations in a ThreadPool
Prepare, idle and check handles
System memory information
System load information
Getting executable path
Asynchronous filesystem operations (stat, lstat)
libuv implements more features, which I plan to implement as time allows. In the meantime, grab the source and have a look at the examples on the GitHub repository.
Documentation is available through ReadTheDocs here.