This tutorial covers some advanced topics and techniques frequently used in Python: object-oriented programming, extending built-in types, functions as objects and decorators, modules and packages, memory management, and common Python idioms.
Prerequisites: the Basic and Intermediate ACM Python tutorials, or some experience in Python and programming in general.
Python is a flexible language. You can use it in three ways: procedural programming, object-oriented programming, and functional programming. Whichever approach you think you are using, behind the scenes Python always manipulates objects. Why? Because everything in Python, including numbers, types, and functions, is an object:
print type(1)
print type(int)
def f(): pass
print type(f)
<type 'int'>
<type 'type'>
<type 'function'>
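The output above shows that numbers, types, and functions all have a type of their own. A minimal sketch of what that means in practice follows. The function `greet` and the attribute `call_count` are made up for this illustration, and `print` is written with parentheses so that the snippet also runs on Python 3.

```python
def greet():
    """Return a fixed greeting."""
    return "hello"

# Integers, types and functions are all instances of `object`.
checks = [isinstance(x, object) for x in (1, int, greet)]
print(checks)

# Because a function is an object, it can carry attributes
# and be bound to more than one name.
greet.call_count = 0      # hypothetical attribute, attached at runtime
alias = greet             # a second name for the same function object
alias.call_count += 1

print(alias is greet)
print(greet.call_count)
print(alias())
```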
Let's create an example class Person and add a couple of fields and methods to it.
class Person(object):
    def __init__(self, name):
        self.name = name

    def WhatIsYourName(self):
        return self.name

p1 = Person('John')
p1.WhatIsYourName()
Now, let's create a small hierarchy of classes with a base class Person.
class Employee(Person):
    def __init__(self, name, organization):
        self.name = name
        self.employer = organization

    def WhereDoYouWork(self):
        return self.employer

p2 = Employee('Joshua', 'KAUST')
print '%s works at %s' % (p2.WhatIsYourName(), p2.WhereDoYouWork())
class PhD(Employee):
    title = 'PhD' # this will be a class-wide field!

    def __init__(self, name, organization, major):
        super(PhD, self).__init__(name, organization) # or you can write Employee.__init__(self, name, organization)
        self.major = major

    def WhatIsYourMajor(self):
        return self.major

p3 = PhD('Chris','KAUST','EE')
print '%s is a %s in %s at %s' % (p3.name, p3.title, p3.WhatIsYourMajor(), p3.employer)
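Both the class-wide field and the `super` call above rely on attribute lookup along the inheritance chain. The sketch below uses two stand-in classes, `Base` and `Derived` (hypothetical names, not part of the tutorial's hierarchy), to show that a class field is shared by all instances until one instance overrides it.

```python
class Base(object):
    title = 'generic'            # class-wide field, shared by all instances

    def __init__(self, name):
        self.name = name         # per-instance field


class Derived(Base):
    title = 'PhD'                # shadows Base.title for Derived and its instances

    def __init__(self, name, major):
        super(Derived, self).__init__(name)   # reuse the base initializer
        self.major = major


a = Derived('Chris', 'EE')
b = Derived('Dana', 'CS')
a.title = 'Dr.'                  # creates an instance attribute on `a` only

print(a.title)                   # found on the instance
print(b.title)                   # found on the class
mro_names = [c.__name__ for c in Derived.__mro__]
print(mro_names)                 # the order in which classes are searched
```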
Create a class Student that has Person as a base class.
Student should have fields major and school. school should be an optional field with None as the default value.
Add a method WhereDoYouStudy to the Student class.
Make the PhD class have two base classes: Employee and Student.
Add a WhereDoYouWork function to PhD (it will shadow Employee's WhereDoYouWork) that works the following way: if school is not defined for a PhD, i.e., is equal to None, then return the result of Employee's WhereDoYouWork, otherwise return the output of Student's WhereDoYouStudy.
You can extend the above cells or use the below cell for your code.
#%load solutions/oop_ex.py
Often, it is very useful to extend existing types and classes when an additional feature is needed. For this section, let's try to extend dict into a class Graph. But before this, let's have a closer look at how the basic types/classes in Python are organized.
# Check the type of class dict
type(dict)
# List all the methods, fields and properties dict class has
dir(dict)
# An example of 'magic' method with two underscores
dict.__getitem__?
NOTE: A well-written and more detailed guide on the so-called Python "magic methods" (with double underscores) can be found here.
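To see how such methods are used, here is a minimal sketch of a class that defines `__len__` and `__getitem__` itself. The class name `Squares` is an assumption made for this example only.

```python
class Squares(object):
    """A read-only, sequence-like object: Squares(n)[i] == i * i for 0 <= i < n."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        # Called by len(obj)
        return self.n

    def __getitem__(self, i):
        # Called by obj[i]
        if not 0 <= i < self.n:
            raise IndexError(i)
        return i * i


s = Squares(5)
print(len(s))      # dispatches to Squares.__len__
print(s[3])        # dispatches to Squares.__getitem__
print(list(s))     # iteration calls __getitem__ with 0, 1, 2, ... until IndexError
```

Built-in types such as dict work the same way: the square-bracket syntax is just a call to `__getitem__`.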
Now, knowing that dict is basically a class, we can build another class Graph and use dict as the base class. Our Graph will have the following structure: each key will be a node, and each value will be a list of the adjacent nodes. Graph will be initialized with a list of edges.
class Graph(dict):
    """Graph class, extends dict"""

    def __init__(self, edges):
        """Initialize graph with a list of 2-tuples. Each tuple is a directed edge."""
        for u, v in edges:
            try:
                self[u].append(v)
            except KeyError:
                self[u] = [v]
            # instead of try-except you can use: self[u] = self.get(u,[]) + [v]

    def __str__(self):
        """Returns a Graph representation. Called by 'print' for any object we would like to print."""
        representation = '\n'.join(["%i -> [%s]" % (u, ','.join([str(v) for v in V]))
                                    for u, V in self.iteritems()])
        return representation

g = Graph([(1,2), (2,3), (1,3)])
print g
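Since the graph is a dict, every dict method is available on it without extra work. The sketch below illustrates this with a reduced variant, here called `MiniGraph` (a hypothetical name), which swaps the try-except for `dict.setdefault` and is written so that it also runs on Python 3.

```python
class MiniGraph(dict):
    """Reduced variant of the Graph idea: adjacency lists stored in a dict subclass."""

    def __init__(self, edges):
        super(MiniGraph, self).__init__()
        for u, v in edges:
            # setdefault returns the existing list for u, or inserts and returns a new one
            self.setdefault(u, []).append(v)


g2 = MiniGraph([(1, 2), (2, 3), (1, 3)])

# Everything dict offers is inherited unchanged.
print(sorted(g2.keys()))
print(g2[1])
print(3 in g2)               # node 3 has no outgoing edges, so it is not a key
print(isinstance(g2, dict))
```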
Build an extension of dict called "SortedDict". Whenever you print an instance of such a dict, print it sorted by keys in ascending order. For the test, please use the dict below.
d = {100: "Hi", 7:"Week", 24:"Work", 2009:"KAUST"}
print d
#%load solutions/sorted_dict.py
As we mentioned, everything in Python is an object. Even a function. Let's check it out and see how we can exploit this.
dir(len)
# Check the help for the __call__ method
len.__call__?
Now, since functions are objects, we can use them as objects, e.g., pass them as arguments to other functions. Consider the following example.
# Apply function example
def apply(data, func):
    """Loop through the data and apply the provided function."""
    return [func(d) for d in data]
apply([1, -1, 2, 10, 100, -404], str)
# Bind function example
def bind(func, **kwargs):
    def bfunc(*args):
        return func(*args, **kwargs)
    return bfunc
As a quick example, let's make the function sorted sort the provided iterable object in reverse order.
sorted([1, -1, 2, 10, 100, -404])
sorted?
rev_sorted = bind(sorted, reverse=True)
rev_sorted([1, -1, 2, 10, 100, -404])
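The standard library ships the same pattern ready-made as `functools.partial`. The short sketch below reproduces the reverse-sorting example with it; the name `rev_sorted_partial` is chosen here only to avoid clashing with the function defined above.

```python
import functools

# partial(sorted, reverse=True) fixes the keyword argument up front,
# just like the hand-written bind does.
rev_sorted_partial = functools.partial(sorted, reverse=True)

result = rev_sorted_partial([1, -1, 2, 10, 100, -404])
print(result)   # [100, 10, 2, 1, -1, -404]
```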
The last pattern, where a function generates a new function based on the one provided, is extremely common and very useful. Let's consider an example, and first write a simple function that computes Fibonacci numbers.
(FYI: the Fibonacci sequence is defined by the recurrence F(n) = F(n-1) + F(n-2), with F(0) = 0 and F(1) = 1.)
def Fib(n):
    """Return the n-th Fibonacci number."""
    assert type(n) is int and n >= 0, "ERROR (Fib): index should be a non-negative integer!"
    # compare values with ==, not identity with 'is'
    return Fib(n-1) + Fib(n-2) if n > 1 else 1 if n == 1 else 0
[Fib(i) for i in range(15)]
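To get a feel for how much work this plain recursion does, the sketch below counts how often the function is entered. The name `naive_fib` and the `calls` attribute are assumptions made for this illustration, and the base case is simplified to `n` for brevity.

```python
def naive_fib(n):
    """Plain recursive Fibonacci that counts how many times it is entered."""
    naive_fib.calls += 1            # counter stored on the function object itself
    return naive_fib(n - 1) + naive_fib(n - 2) if n > 1 else n

naive_fib.calls = 0
value = naive_fib(20)

print(value)             # 6765
print(naive_fib.calls)   # 21891 calls for a single result
```

The number of calls grows about as fast as the Fibonacci numbers themselves, because the same sub-results are recomputed again and again.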
The Fibonacci function is recursive. Moreover, for every input n, it recomputes the values for all the previous indices, many of them over and over. Is this efficient? Of course not. Let's make it more efficient by wrapping the function so that every call goes through a cache first. We will do this using decorators.
def memoize(func):
    """A caching decorator. Checks and returns a cached value before applying the function itself."""
    cache = {}
    def cachedFunc(*args):
        if args not in cache:
            print "Cache miss!"
            cache[args] = func(*args)
        return cache[args]
    return cachedFunc

@memoize
def Fib(n):
    """Return the n-th Fibonacci number."""
    assert type(n) is int and n >= 0, "ERROR (Fib): index should be a non-negative integer!"
    return Fib(n-1) + Fib(n-2) if n > 1 else 1 if n == 1 else 0
[Fib(i) for i in range(16)]
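One side effect of wrapping is that the decorated function takes on the name and docstring of the inner wrapper. A possible variant, sketched below, uses `functools.wraps` from the standard library to copy that metadata over. The names `memoize_wrapped` and `fib`, and the exposed `cache` attribute, are illustrative choices rather than part of the tutorial's code.

```python
import functools

def memoize_wrapped(func):
    """Caching decorator that preserves the wrapped function's metadata."""
    cache = {}

    @functools.wraps(func)          # copy __name__, __doc__, ... from func
    def cached_func(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]

    cached_func.cache = cache       # expose the cache for inspection
    return cached_func


@memoize_wrapped
def fib(n):
    """Return the n-th Fibonacci number."""
    return fib(n - 1) + fib(n - 2) if n > 1 else n


print(fib(30))           # 832040
print(fib.__name__)      # 'fib', not 'cached_func'
print(len(fib.cache))    # 31: one entry for each argument from 0 to 30
```

Note that the `@` line is only shorthand for `fib = memoize_wrapped(fib)`, which is why the recursive calls inside `fib` also go through the cache.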
Implement a decorator that traces Fibonacci function calls. Whenever the Fib function is called, print its name and arguments. If it is called recursively, maintain the indent: each nested recursive call should be indented by two more spaces.
#%load solutions/fib_trace.py
NOTE: You can find a lot of useful decorators along with their design patterns in this library.
If you have used Python even a little, you have most probably noticed that the real power of the language is in its modules and packages, which let you import into your script code that you or somebody else developed and tested before. But how do you write your own package? Right, it is simple... because modules are also objects!
The simplest way is to create a file, say MyModule.py, put some functions and classes you implemented in it, and finally write
import MyModule
or something like
from MyModule import * # or from MyModule import (MyFunctionName, MyClassName)
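That modules are objects can be checked directly. The sketch below uses the standard `math` module as a stand-in for your own module, so it runs without creating any files.

```python
import math
import types

is_module = isinstance(math, types.ModuleType)   # a module is an instance of the module type
print(is_module)
print(math.__name__)                             # modules carry attributes like any object

# Names defined in a module are ordinary attributes, so getattr works on them.
sqrt = getattr(math, 'sqrt')
print(sqrt(16.0))

m = math                                         # a module can be bound to another name
print(m is math)
```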
Now, if you would like to build something BIG, say a whole package for sound processing, you would probably not want to cram everything into a single file, but would rather split the code logically into separate files or even folders. You then organize the code as a package: a top-level folder (sound) containing subfolders (such as effects) with module files (such as echo.py) inside them. (Example is taken from the Python documentation page)
Note the __init__.py files. They might be just empty (in most of the packages they are), but if you need to do something at the moment the module is imported (say, you want to print your module's version information and a license), the appropriate __init__.py is exactly the place to do it.
Also note the subfolders you might have with submodules. Large packages with large libraries use exactly this kind of structure, which allows programmers to import only the submodules they are interested in working with. Example:
import sound.effects.echo as echo
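To try this out without touching your project, the sketch below builds a tiny throwaway package in a temporary folder and imports a submodule from it. The package name `demo_sound`, its `version` variable, and the `echo` function are made up for this illustration; only the folder-plus-`__init__.py` layout mirrors the description above.

```python
import os
import sys
import tempfile

# Build a throwaway package on disk:
#   demo_sound/__init__.py
#   demo_sound/effects/__init__.py
#   demo_sound/effects/echo.py
root = tempfile.mkdtemp()
pkg = os.path.join(root, 'demo_sound')
effects = os.path.join(pkg, 'effects')
os.makedirs(effects)

with open(os.path.join(pkg, '__init__.py'), 'w') as f:
    f.write("version = '0.1'\n")      # executed once, when the package is first imported
with open(os.path.join(effects, '__init__.py'), 'w') as f:
    f.write("")                        # an empty file is enough to mark a subpackage
with open(os.path.join(effects, 'echo.py'), 'w') as f:
    f.write("def echo(text):\n    return text + ' ' + text\n")

sys.path.insert(0, root)               # make the temporary folder importable
import demo_sound.effects.echo as echo

echoed = echo.echo('hi')
pkg_version = sys.modules['demo_sound'].version
print(echoed)
print(pkg_version)
```

Importing the submodule also imports its parent packages, which is why the top-level `__init__.py` has already run and `version` is available.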
Let's have a look at a more concrete example (taken from Johansson's Python lectures).
%%file mymodule.py
"""
Example of a python module. Contains a variable called my_variable,
a function called my_function, and a class called MyClass.
"""
my_variable = 0
def my_function():
    """
    Example function
    """
    return my_variable

class MyClass:
    """
    Example class.
    """

    def __init__(self):
        self.variable = my_variable

    def set_variable(self, new_value):
        """
        Set self.variable to a new value
        """
        self.variable = new_value

    def get_variable(self):
        return self.variable
import mymodule
help(mymodule)
mymodule.my_variable
mymodule.my_function()
my_class = mymodule.MyClass()
my_class.set_variable(10)
my_class.get_variable()
In this section we shed some light on how memory is actually managed by Python and show you important techniques that will help you build efficient code and avoid common mistakes.
First, we should get a good feeling for what the actual sizes of objects in Python are, and how the sizes grow whenever we create lists/dictionaries/tuples out of some number of objects.
# The function for showing size of objects (source: http://deeplearning.net/software/theano/tutorial/python-memory-management.html)
import sys
def show_sizeof(x, level=0):
    print "\t" * level, x.__class__, sys.getsizeof(x), x
    if hasattr(x, '__iter__'):
        if hasattr(x, 'items'):
            for xx in x.items():
                show_sizeof(xx, level + 1)
        else:
            for xx in x:
                show_sizeof(xx, level + 1)
sys.getsizeof?
# Let's check the sizes of different objects (sizes are indicated for 64-bit Python)
show_sizeof(None) # 16 bytes for None
show_sizeof(1) # 24 bytes for 64-bit int - 3 times the size of int64_t in C
show_sizeof(2**500) # 92 bytes for 64-bit Python's long with unconstrained length
show_sizeof(0.5) # 24 bytes for 64-bit double
show_sizeof("") # 37 bytes for empty string
show_sizeof(u"") # 50 for empty unicode string
show_sizeof("Test") # 41 for not empty string (+1 byte per character)
show_sizeof(u"Test") # 58 for not empty unicode string (+2 bytes per character)
Python lists are actually dynamic arrays.
show_sizeof([]) # 72 bytes
show_sizeof([1, "test", 0.5]) # 96 bytes: the 72 bytes of the empty list + 8 bytes per element reference (64-bit pointers).
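The dynamic-array behaviour can be observed directly: as elements are appended, the size reported by `sys.getsizeof` stays flat for a while and then jumps when the underlying array is reallocated with spare capacity. The sketch below assumes CPython; the exact numbers depend on the interpreter version and platform, so none are hard-coded.

```python
import sys

sizes = []
items = []
for i in range(20):
    sizes.append(sys.getsizeof(items))   # size of the list object itself, not of its elements
    items.append(i)

# Several consecutive measurements share the same size: the list grows in steps.
distinct = sorted(set(sizes))
print(sizes)
print("%d distinct sizes in %d measurements" % (len(distinct), len(sizes)))
```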
All of these details seem to be minor, unless you start building large-scale applications, or process huge amounts of data for your projects. So, keep this in mind.
Here we just briefly discuss a pretty wasteful approach to manage memory allocation employed in Python. For thorough description and possible solutions you can refer to http://www.evanjones.ca/memoryallocator/.
We can run a small experiment using the memory profiler utility.
%%file memory-profile-me.py
import copy
import memory_profiler
@profile
def function():
    x = range(1000000) # allocate a big list
    y = copy.deepcopy(x)
    del x
    return y

if __name__=="__main__":
    function()
%run -m memory_profiler memory-profile-me.py
Filename: memory-profile-me.py

Line #    Mem usage    Increment   Line Contents
================================================
     5   15.883 MiB    0.000 MiB   @profile
     6                             def function():
     7   46.945 MiB   31.062 MiB       x = range(1000000) # allocate a big list
     8  149.242 MiB  102.297 MiB       y = copy.deepcopy(x)
     9  149.242 MiB    0.000 MiB       del x
    10  149.242 MiB    0.000 MiB       return y
So, what does the table above tell us? For some reason, Python didn't shrink the memory in use after we deleted x. What happened?
To speed up memory allocation, Python reuses already allocated chunks. It keeps several lists of free blocks for small objects (separate lists for different sizes). Whenever we create a new object, Python either allocates a new block or reuses a free one from one of the lists. That's why the memory usage didn't shrink when we deleted x.
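It also helps to keep in mind what `del` does on its own: it removes a name, and the object is released only once no references to it remain. The sketch below shows this with a weak reference as a probe. It assumes CPython's reference counting, and `Blob` is a hypothetical stand-in for a large object. Whether the released block is then handed back to the operating system is the separate question discussed above.

```python
import weakref

class Blob(object):
    """Stand-in for a large object (hypothetical helper for this sketch)."""
    pass

x = Blob()
y = x                        # a second reference to the same object
probe = weakref.ref(x)       # lets us observe the object without keeping it alive

del x                        # removes the name x only
alive_after_first_del = probe() is not None

del y                        # last reference gone; CPython releases the object right away
alive_after_second_del = probe() is not None

print(alive_after_first_del)
print(alive_after_second_del)
```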
Python is a nice language. If you knew other programming languages before you started using Python, you can easily switch to Python (learn it literally in 3-6 hours) and continue writing ugly, awkward code using idioms and structures from your previous languages. However, Python is beautiful, very succinct, and very expressive. This section shows you some essential well-known (or less known) Python idioms (or tricks, if you will) that can make your code shorter and more elegant.
# Basic for-loop comprehension
a = [1,2,3,4,5,6,7]
b = [x**2 for x in a]
print b
# For-loop comprehension with an if filter
c = [x**2 for x in a if x > 4]
print c
# Nested for-loop comprehension with an if filter
llist = [[1,2,3],(4,5,6),(7,8,9)]
c = [x**2 for sublist in llist for x in sublist if x % 2 == 0]
print c
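The order of the clauses in a nested comprehension is the same as in the equivalent nested loops: outermost loop first. The following sketch spells the last comprehension out as explicit loops and checks that both give the same result (the names `squares` and `expanded` are chosen for this example).

```python
llist = [[1, 2, 3], (4, 5, 6), (7, 8, 9)]

# Comprehension form: clauses are read left to right, outermost loop first
squares = [x**2 for sublist in llist for x in sublist if x % 2 == 0]

# Equivalent explicit loops
expanded = []
for sublist in llist:
    for x in sublist:
        if x % 2 == 0:
            expanded.append(x**2)

print(squares)               # [4, 16, 36, 64]
print(squares == expanded)   # True
```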
# In Python you can actually build infinitely nested (self-referential) lists and dicts (without running out of memory!)
a = [1,2,3]
a.append(a)
print a
a[3][3][3][3][3]
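The reason this does not exhaust memory is that the list stores a reference to itself, not a copy. A short check, written so that it also runs on Python 3:

```python
a = [1, 2, 3]
a.append(a)            # the fourth slot stores a reference to the list itself

print(len(a))          # 4: still only four slots
print(a[3] is a)       # True: the last slot is the very same object
print(a[3][3][0])      # 1: each extra [3] just follows the same reference again
```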
# Quick swap
a = 1
b = 2
print a,b
a, b = b, a
print a, b
# Nice math-like comparison notation
x = 5
print 3 < x < 8
print x < 10 < 5*x < 99
print 2 > x < 7
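A chained comparison is shorthand for the individual comparisons joined with `and`, with each middle operand evaluated only once. The sketch below makes this explicit for two of the expressions above (the variable names are chosen for this example).

```python
x = 5

chained = 2 > x < 7
expanded = (2 > x) and (x < 7)     # same meaning, written out

in_range = 3 < x < 8
in_range_expanded = (3 < x) and (x < 8)

print(chained)     # False, because 2 > 5 already fails
print(in_range)    # True
```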
# List flattening techniques
a = [[1,2,3],[4,5,6],[7,8,9]]
print sum(a,[]) # this works only with nested lists (or objects with __add__ method defined)
print [x for b in a for x in b] # works with any nested iterables
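A third option, not used above, is `itertools.chain.from_iterable` from the standard library. It accepts any nested iterables and avoids the repeated list copying that `sum(a, [])` performs, since that call builds a new list at every step. A minimal sketch:

```python
import itertools

nested = [[1, 2, 3], (4, 5, 6), [7, 8, 9]]   # a tuple is mixed in on purpose

# chain.from_iterable yields the elements of each inner iterable in turn
flat = list(itertools.chain.from_iterable(nested))
print(flat)   # [1, 2, 3, 4, 5, 6, 7, 8, 9]
```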
# Pairing elements of two lists
first, second = [1,2,3,4,5], [6,7,8,9,10]
print first
print second
paired = zip(first,second)
print paired
# Unpairing the elements back into two lists (NOTE: Lists turn into tuples. This is due to zip implementation.)
a, b = zip(*paired)
print a
print b
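The same `zip(*...)` trick transposes any list of rows into columns. The sketch below also shows that `zip` stops at the shortest input. The `list(...)` calls are there so that the snippet prints the same thing on Python 3, where `zip` returns an iterator.

```python
rows = [(1, 2, 3), (4, 5, 6)]

columns = list(zip(*rows))            # unpack the rows as separate arguments to zip
print(columns)                        # [(1, 4), (2, 5), (3, 6)]

uneven = list(zip([1, 2, 3], 'ab'))   # the third element has no partner and is dropped
print(uneven)                         # [(1, 'a'), (2, 'b')]
```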
# Providing function arguments via structures
def func(a, b, c, kw1='', kw2=0):
    print a
    print b
    print c
    print kw1
    print kw2
args = (1, 2, 3)
kwargs = {'kw1':'Hello', 'kw2':100}
func(*args,**kwargs)
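The star syntax also works in the other direction: in a function definition, `*args` and `**kwargs` collect whatever positional and keyword arguments the caller passes. A small sketch, where `describe` is a hypothetical helper:

```python
def describe(*args, **kwargs):
    """Collect all arguments and report how many positional ones and which keywords came in."""
    return len(args), sorted(kwargs)

packed_args = (1, 2, 3)
packed_kwargs = {'kw1': 'Hello', 'kw2': 100}

summary = describe(*packed_args, **packed_kwargs)
print(summary)   # (3, ['kw1', 'kw2'])
```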
# Enumeration of the elements
l = ["spam", "ham", "eggs"]
print list(enumerate(l))
print list(enumerate(l,5))
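In everyday code, enumerate is mostly used directly in a for-loop, replacing a manually maintained counter. A minimal sketch, with the start value set to 1 to produce a numbered list:

```python
l = ["spam", "ham", "eggs"]

lines = []
for i, item in enumerate(l, 1):          # i starts at 1 instead of the default 0
    lines.append("%d. %s" % (i, item))

print("\n".join(lines))
```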
Copyright 2014, Maruan Al-Shedivat, ACM Student Member.