In Python we have several types of objects for storing values in a
array-like way: lists, tuples, dictionaries (they are really more like
hash tables) and sets. Lists and sets might look alike but they are
different. Lets do a quick test:
[gist]https://gist.github.com/569053[/gist]
In the above snippet, we can see that iterating over all items in a
list takes just a little longer tan a a set: 8.99us vs 6.87us we are
talking micro-seconds here!
Now, if we look at how long it takes to verify is the list or set
contains an object, we can see the difference: 1.99us vs 140ns. Sets
are an order of magnitude faster!
Why is that? We can think that a python list is like a C linked list.
It’s time complexity when checking for a value is O(n). Sets, on the
other hand, are implemented as hash tables, so the key is hashed and
instantaneously found (or not) with a time complexity of O(1).
For the curious, this is the list object declaration in Include/listobject.h
[gist]https://gist.github.com/569078[/gist]
and this is the set object declaration, in Include/setobject.h
[gist]https://gist.github.com/569085[/gist]
More on python and hash tables here:
http://sites.google.com/site/usfcomputerscience/hash-tables-imp
:wq