import time
print('Last updated: %s' %time.strftime('%d/%m/%Y'))
Last updated: 17/06/2014
deque
container data type¶The collections
module in Python's standard library provides some convienient container datatypes that not only make your code more "Pythonic", but they also are optimized for performance for certain operations. Previously, we have seen how we can use the collections.Counter
class (Day 3: 6 different ways for counting elements using a dictionary) to write short and efficient code to quantify items.
Today, we will take a look at another collections
class: the deque
container data type.
Most of you might already be familiar with deque
s due to their popularity in C++ and might want to skip this short introduction.
For all the others: deque
(pronounced "deck") is short for "double-ended queue" and thus it is sometimes also written as "dequeue". This abstract data structure is quite similar to an array (or Python list), which is optimized for adding and removing elements from either the back or front.
The deque
class comes with methods that are already familiar from the Python list
type, such as
list
and deque
objects¶append(val)
appends and element to the right
clear()
removes all elements
count(val)
returns the number of elements that are equal to val
extend(iterable)
adds elements to the right from an iterable object
pop()
removes and returns an element from the right end
reverse()
reverses elements in-place; returns None
remove(val)
removes first occurence of val
deque
objects¶appendleft(val)
appends an element to the left
extendleft(iterable)
adds elements to the left from an iterable object
popleft()
removes and returns an element from the left end
rotate(n)
Rotate the deque n steps to the right. If n is negative, rotate to the left. Rotating one step to the right is equivalent to: d.appendleft(d.pop()).
import collections
a = collections.deque([1,2,3])
a.rotate(2)
print(a)
deque([2, 3, 1])
Note that deque
s don't have equivalents to the Python list
methods, such as index()
for looking up values, or the .sort()
method (although the sorted()
function can be used as a workaround). If we want to get a slice from a deque
, equivalent to my_list[0:3]
, we would need to use itertools
, for example.
import itertools
list(itertools.islice(my_deque, 0, 3))
import timeit
import collections
from copy import copy
labels = ['.append()', '.insert(0,x)\n.appendleft(x)',
'.pop()', '.pop(0)\n.popleft()', '.reverse()']
list_time, deque_time = [], []
n = 10000
l = [i for i in range(n)]
d = collections.deque([i for i in range(n)])
l_cp, d_cp = copy(l), copy(d)
list_time.append(min(timeit.Timer('l_cp.append(1)',
'from __main__ import l_cp').repeat(repeat=3, number=1000)))
deque_time.append(min(timeit.Timer('d_cp.append(1)',
'from __main__ import d_cp').repeat(repeat=3, number=1000)))
l_cp, d_cp = copy(l), copy(d)
list_time.append(min(timeit.Timer('l_cp.insert(0,1)',
'from __main__ import l_cp').repeat(repeat=3, number=1000)))
deque_time.append(min(timeit.Timer('d_cp.appendleft(1)',
'from __main__ import d_cp').repeat(repeat=3, number=1000)))
l_cp, d_cp = copy(l), copy(d)
list_time.append(min(timeit.Timer('l_cp.pop()',
'from __main__ import l_cp').repeat(repeat=3, number=1000)))
deque_time.append(min(timeit.Timer('d_cp.pop()',
'from __main__ import d_cp').repeat(repeat=3, number=1000)))
l_cp, d_cp = copy(l), copy(d)
list_time.append(min(timeit.Timer('l_cp.pop(0)',
'from __main__ import l_cp').repeat(repeat=3, number=1000)))
deque_time.append(min(timeit.Timer('d_cp.popleft()',
'from __main__ import d_cp').repeat(repeat=3, number=1000)))
l_cp, d_cp = copy(l), copy(d)
list_time.append(min(timeit.Timer('l_cp.reverse()',
'from __main__ import l_cp').repeat(repeat=3, number=1000)))
deque_time.append(min(timeit.Timer('d_cp.reverse()',
'from __main__ import d_cp').repeat(repeat=3, number=1000)))
import platform
import multiprocessing
def print_sysinfo():
print('\nPython version :', platform.python_version())
print('compiler :', platform.python_compiler())
print('\nsystem :', platform.system())
print('release :', platform.release())
print('machine :', platform.machine())
print('processor :', platform.processor())
print('CPU count :', multiprocessing.cpu_count())
print('interpreter:', platform.architecture()[0])
print('\n\n')
%matplotlib inline
import matplotlib.pyplot as plt
def plot():
plt.rcParams.update({'font.size': 12})
fig = plt.figure(figsize=(11,13))
# Setting the positions and width for the bars
pos = list(range(len(list_time)))
width = 0.2
# Plotting the bars
fig, ax = plt.subplots(figsize=(8,6))
plt.bar(pos, list_time, width,
alpha=0.5,
color='g',
label=labels[0])
plt.bar([p + width for p in pos], deque_time, width,
alpha=0.5,
color='b',
label=labels[1])
# Setting axis labels and ticks
ax.set_ylabel('time in seconds\n(for 1,000 operations on starting size 10,000)')
ax.set_title('Python list and collections.deque operations')
ax.set_xticks([p + 1.5 * width for p in pos])
ax.set_xticklabels(labels)
# Setting the x-axis and y-axis limits
plt.xlim(0-width, len(list_time))
plt.ylim([0, max(list_time + deque_time) * 1.5])
# Adding the legend and showing the plot
plt.legend(['list', 'collections.deque'], loc='upper left')
plt.grid()
plt.show()
plot()
print_sysinfo()
<matplotlib.figure.Figure at 0x1064c8ba8>
Python version : 3.4.1 compiler : GCC 4.2.1 (Apple Inc. build 5577) system : Darwin release : 13.2.0 machine : x86_64 processor : i386 CPU count : 4 interpreter: 64bit
As we can see in the plot above, the performance of the shared methods between Python list
and deque
objects are about the same. However, as expected, the appendleft(x)
and popleft(x)
methods on deques are significantly faster than workaround-approaches on Python list
s.
Thus, if our application requires a lot of adding and removing of items at both ends of the data structure, deque
s are definitely to be preferred over list
s.