Implementing a ZeroMQ – WebSocket gateway

In the last post we saw a simple PubSub application using ZeroMQ. Today we are going to extend that and publish all the data we get from the ZeroMQ publishers to web clients by using WebSocket.

First problem I ran into was choosing a WebSocket server to suit my needs. I had a look at EventletGeventtxWebsocket and some others, but since what I wanted was quite simple I decided to go and write my own, in order to avoid such dependencies. Communicating both worlds (ZeroMQ and WebSocket) seemed like the big problem, so I thought the best way to solve it would be to put all sockets in a polling loop running on its own thread and avoid possible multithreading issues. This works thanks to the ZeroMQ Poller, which is able to poll ZeroMQ sockets and anything which defines a fileno() method. Awesome stuff.

Since the single threaded polling experiment worked I decided to make a small standalone project (without ZeroMQ, using select.poll instead) with it for possible future stuff. You can find it here.

Back to our gateway, lets go and check the code:

producer:

[gist]https://gist.github.com/1046944[/gist]

As you may have guessed, its the same simple multicast producer we saw on the previous post.

client web page:

[gist]https://gist.github.com/1051836[/gist]

consumer and WebSocket server:

[gist]https://gist.github.com/1051872[/gist]

I had never used poll nor ZeroMQ Poller before, but they were both pretty easy to get started with. In a nutshell, the server will take every piece of data it gets from the ZeroMQ socket and broadcast it to all the connected web clients through web sockets. The gateway is one way only, the input coming from the web sockets is not sent anywhere, I’ll leave that as an experiment for the reader. :–)

To test it just run the server and serve the page. You can use this to serve the page.

:wq

Implementing a PubSub based application with Python and ZeroMQ

A while back I came across this awesome post by Nicholas Piël about ZeroMQ and I added test ZeroMQ to my never ending ToCheck list. Today I found the excuse to check it out and here is what I did with it.

I wanted to implement a producer-consumer application to aggregate data contained in an arbitrary number of files in an arbitrary number of nodes. ZeroMQ seemed like a good tool for this, so lets see a helloworld example: producer:

[gist]https://gist.github.com/1046936[/gist]

consumer:

[gist]https://gist.github.com/1046938[/gist]

As you can see the code is very straightforward. We create a socket, declare it as a publisher (PUB) or subscriber (SUB) accordingly and subscribe to the topics which we care about (in the case of the subscriber).

Looks like we are writing or reading to/from a standard socket, but this socket is on steroids. You can write to it even if the other end is not connected, all the queuing logic has been taken care for you, it accepts multiple connections, … and many many other things.

UPDATE: After re-reading the docs it seems that ZeroMQ sockets are not really thread-safe unless you pass them using a “full memory barrier”. I guess the fact that Python has the GIL makes my example not crash :–)

ZeroMQ is also thread-safe, and this makes our code a lot simpler than it would if ZeroMQ weren’t thread safe. Lets see an example with 10 threads simultaneously sending messages through a ZeroMQ socket without any locking mechanism.

[gist]https://gist.github.com/1046942[/gist]

The approach we took is very flexible in the sense that any number of producers may connect to our consumer, but they need to know where (in terms of IP and port) our consumer is or will be. Could we change that somehow?

Yes we can. We can use multicast for this. We can make all producers join a multicast group and send their messages there. Then the consumer would also join the same group and get them. Of course this only makes sense in a controlled environment where data is not critical or no non-authorized party is able to join the multicast group.

The funny thing is that we only need to change one line of code in the producer and another one in the consumer in order to make them work in multicast mode.

producer:

[gist]https://gist.github.com/1046944[/gist]

consumer:

[gist]https://gist.github.com/1046947[/gist]

I’ve started a small project I’ll be working on in my free time, hope to post something about it soon :–)

:wq