Advanced Techniques in Python: Classes, Objects, Functions, Memory

This tutorial was prepared by the ACM Student Chapter of King Abdullah University of Science and Technology (KAUST).
It covers some advanced topics and techniques frequently used in Python.

The topics covered in this part are:

  • Classes, objects, and methods. OOP in Python.
  • Advanced techniques for manipulating objects and functions.
  • Modules and their structure. How to develop a Python module.
  • Python memory management.
  • The Pythonic way: Tips & Tricks.

Prerequisites: the basic ACM Python tutorial and/or some experience with Python and programming in general.

1. Object-oriented programming in Python

Python is a flexible language. You can use it in three styles: procedural, object-oriented, and functional. Whichever style you pick (or think you picked!), behind the scenes Python still manipulates objects. Why? Because every value, including a number, a type, or a function, is an object with a type of its own.

In [1]:
print type(1)
print type(int)

def f(): pass
print type(f)
<type 'int'>
<type 'type'>
<type 'function'>

1.1. How to create classes and build class hierarchies?

Let's create an example class Person and add a couple of fields and methods to it.

In [ ]:
class Person(object):
    def __init__(self, name):
        self.name = name
    
    def WhatIsYourName(self):
        return self.name

p1 = Person('John')
p1.WhatIsYourName()

Now, let's create a small hierarchy of classes with a base class Person.

In [ ]:
class Employee(Person):
    def __init__(self, name, organization):
        self.name = name
        self.employer = organization
    
    def WhereDoYouWork(self):
        return self.employer

p2 = Employee('Joshua', 'KAUST')
print '%s works at %s' % (p2.WhatIsYourName(), p2.WhereDoYouWork())
In [ ]:
class PhD(Employee):
    title = 'PhD' # this is a class-level field, shared by all instances!
    
    def __init__(self, name, organization, major):
        super(PhD, self).__init__(name, organization) # or you can write Employee.__init__(self, name, organization)
        self.major = major
    
    def WhatIsYourMajor(self):
        return self.major

p3 = PhD('Chris','KAUST','EE')
print '%s is a %s in %s at %s' % (p3.name, p3.title, p3.WhatIsYourMajor(), p3.employer)
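
A short sketch of what the hierarchy above gives you. It redefines compact versions of the three classes so that it runs on its own (on Python 2 or 3), then inspects the method resolution order and the difference between the class-level field `title` and instance fields. The second person, 'Dana', and the override of `title` on one instance are made up purely for illustration.

```python
class Person(object):
    def __init__(self, name):
        self.name = name

class Employee(Person):
    def __init__(self, name, organization):
        super(Employee, self).__init__(name)
        self.employer = organization

class PhD(Employee):
    title = 'PhD'  # class-level field, shared by all instances

    def __init__(self, name, organization, major):
        super(PhD, self).__init__(name, organization)
        self.major = major

p = PhD('Chris', 'KAUST', 'EE')
q = PhD('Dana', 'KAUST', 'CS')   # hypothetical second instance

# Attribute and method lookup follows the method resolution order (MRO).
mro_names = [cls.__name__ for cls in PhD.__mro__]
print(mro_names)

# An instance of a subclass is also an instance of every base class.
print(isinstance(p, Person))

# Assigning through an instance creates an instance field that shadows
# the class-level one; other instances still see the class value.
p.title = 'Dr.'
print('%s / %s / %s' % (p.title, q.title, PhD.title))
```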

Exercise:

  • Create a class Student that has Person as a base class.
  • Make Student have fields major and school. school should be an optional field with None as the default value.
  • Add a method WhereDoYouStudy to Student class.
  • Make PhD class have two base classes - Employee and Student.
  • Add WhereDoYouWork function to PhD (it will shadow Employee's WhereDoYouWork) the following way: if school is not defined for a PhD, i.e., is equal to None, then return the result Employee's WhereDoYouWork returns, otherwise return Student's WhereDoYouStudy output.

You can extend the above cells or use the below cell for your code.

In [ ]:
 
In [ ]:
#%load solutions/oop_ex.py

1.2. Extending existing Python types

Often, it is very useful to extend existing types and classes when an additional feature is needed. In this section, let's try to extend dict into a class Graph. But first, let's take a closer look at how the basic types/classes in Python are organized.

In [ ]:
# Check the type of class dict
type(dict)
In [ ]:
# List all the methods, fields and properties dict class has
dir(dict)
In [ ]:
# An example of 'magic' method with two underscores
dict.__getitem__?

NOTE: A well-written and more detailed guide to the so-called Python "magic methods" (with double underscores) can be found here.
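
As a small illustration of how magic methods work, the sketch below first checks that ordinary syntax maps onto double-underscore calls, and then overrides one of them in a subclass. `LoggingDict` and its `accessed` field are hypothetical names invented for this example.

```python
d = {'a': 1, 'b': 2}

# Built-in syntax is translated into calls to magic methods.
print(d['a'] == d.__getitem__('a'))        # indexing -> __getitem__
print(len(d) == d.__len__())               # len()    -> __len__
print(('a' in d) == d.__contains__('a'))   # in       -> __contains__

class LoggingDict(dict):
    """Hypothetical dict subclass that records which keys were read."""
    def __init__(self, *args, **kwargs):
        super(LoggingDict, self).__init__(*args, **kwargs)
        self.accessed = []

    def __getitem__(self, key):
        self.accessed.append(key)
        return super(LoggingDict, self).__getitem__(key)

ld = LoggingDict(a=1, b=2)
value = ld['b']          # goes through the overridden __getitem__
print(ld.accessed)
```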

Now, knowing that dict is basically a class, we can build another class Graph and use dict as the base class. Our Graph will have the following structure: each key will be a node, and each value will be a list of the adjacent nodes. Graph will be initialized with a list of edges.

In [ ]:
class Graph(dict):
    """Graph class, extends dict"""
    def __init__(self, edges):
        """Initialize graph with a list of 2-tuples. Each tuple is a directed edge."""
        for u, v in edges:
            try:
                self[u].append(v)
            except KeyError:  # first edge leaving node u
                self[u] = [v]
            # instead of try-except you can use: self[u] = self.get(u,[]) + [v]
    
    def __str__(self):
        """Returns a Graph representation. Called by 'print' for any object we would like to print."""
        representation = '\n'.join(["%i -> [%s]" % (u, ','.join([str(v) for v in V]))
                                    for u, V in self.iteritems()])
        return representation
        
In [ ]:
g = Graph([(1,2), (2,3), (1,3)])
print g
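
A variant of the same idea, sketched so that it runs on both Python 2 and 3: `setdefault` replaces the try-except, and `items()` replaces the Python 2-only `iteritems()`. The name `Graph2` is used only to avoid clobbering the class above.

```python
class Graph2(dict):
    """Adjacency-list graph built on top of dict (illustrative variant)."""
    def __init__(self, edges):
        super(Graph2, self).__init__()
        for u, v in edges:
            # setdefault returns the existing list or inserts a new empty one
            self.setdefault(u, []).append(v)

    def __str__(self):
        return '\n'.join('%s -> [%s]' % (u, ','.join(str(v) for v in vs))
                         for u, vs in self.items())

g2 = Graph2([(1, 2), (2, 3), (1, 3)])
print(g2)
```

As in the original class, a node that only has incoming edges (node 3 here) does not become a key.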

Exercise:

Build an extension of dict called "SortedDict". Whenever you print an instance of such a dict, it should be printed sorted by keys in ascending order. For the test, use the dict below.

In [ ]:
d = {100: "Hi", 7:"Week", 24:"Work", 2009:"KAUST"}
print d
In [ ]:
#%load solutions/sorted_dict.py

1.3. Functions are also objects, and we can play with them the same way!

As we mentioned, everything in Python is an object. Even a function. Let's check it out and see how we can exploit this.

In [ ]:
dir(len)
In [ ]:
# Check the help for the __call__ method
len.__call__?
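
A few quick checks of that claim, written to run on both Python 2 and 3. The function `f`, the attribute `calls_allowed`, and the dict `registry` are invented for this sketch.

```python
def f():
    """A function that does nothing."""
    pass

print(isinstance(f, object))        # a function is an instance of object
print(f.__name__)                   # functions carry metadata as attributes
print(f() == f.__call__())          # calling f is the same as f.__call__()

f.calls_allowed = 3                 # hypothetical attribute, added on the fly

# Functions can be stored in containers like any other value.
registry = {'noop': f, 'length': len}
length_of_abc = registry['length']('abc')
print('%d %d' % (f.calls_allowed, length_of_abc))
```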

Now, since functions are objects, we can use them as objects, e.g., as arguments to other functions. Consider the following example.

In [ ]:
# Apply function example
def apply(data, func):
    """Loop through the data and apply the provided function."""
    return [func(d) for d in data]

apply([1, -1, 2, 10, 100, -404], str)
In [ ]:
# Bind function example
def bind(func, **kwargs):
    def bfunc(*args):
        return func(*args, **kwargs)
    return bfunc

As a quick example, let's make function sorted sort the provided iterable object in the reverse order.

In [ ]:
sorted([1, -1, 2, 10, 100, -404])
In [ ]:
sorted?
In [ ]:
rev_sorted = bind(sorted, reverse=True)
rev_sorted([1, -1, 2, 10, 100, -404])
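
The standard library offers the same pattern ready-made as `functools.partial`. The sketch below repeats the `bind` definition so that it is self-contained, and checks that both approaches agree; the name `rev_sorted_partial` is made up for the comparison.

```python
import functools

def bind(func, **kwargs):
    """Return a new function with some keyword arguments pre-filled."""
    def bfunc(*args):
        return func(*args, **kwargs)
    return bfunc

data = [1, -1, 2, 10, 100, -404]

rev_sorted = bind(sorted, reverse=True)
rev_sorted_partial = functools.partial(sorted, reverse=True)

print(rev_sorted(data))
print(rev_sorted_partial(data) == rev_sorted(data))
```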

This last pattern, a function that generates a new function based on the one provided, is extremely common and very useful. Let's consider an example and first write a simple function that computes Fibonacci numbers.

(FYI: the Fibonacci sequence is defined by the recurrence F(n) = F(n-1) + F(n-2), with F(0) = 0 and F(1) = 1.)

In [ ]:
def Fib(n):
    """Return the n-th Fibonacci number."""
    assert type(n) is int and n >= 0, "ERROR (Fib): index should be a non-negative integer!"
    return Fib(n-1) + Fib(n-2) if n > 1 else 1 if n == 1 else 0
In [ ]:
[Fib(i) for i in range(15)]

The Fibonacci function is recursive. Moreover, for every input n it recomputes the values for all the previous indices, most of them many times over. Is this efficient? Of course not. Let's make it more efficient by wrapping the function in another callable that adds caching. We will do this with a decorator.

In [ ]:
import collections

def memoize(func):
    """A caching decorator. Checks and returns a cached value before applying the function itself."""
    cache = {}
    def cachedFunc(*args):
        if args not in cache:
            print "Cache miss!"
            cache[args] = func(*args)
        return cache[args]
    return cachedFunc

@memoize
def Fib(n):
    """Return the n-th Fibonacci number."""
    assert type(n) is int and n >= 0, "ERROR (Fib): index should be a non-negative integer!"
    return Fib(n-1) + Fib(n-2) if n > 1 else 1 if n == 1 else 0
In [ ]:
[Fib(i) for i in range(16)]
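
To get a feeling for how much work the cache saves, here is a rough sketch that counts how often the function body runs with and without memoization. The names `calls`, `fib_plain`, and `fib_cached` are made up for this comparison, and `functools.wraps` is an addition (not used in the cell above) that lets the decorated function keep its name and docstring.

```python
import functools

calls = {'plain': 0, 'cached': 0}   # hypothetical call counters

def memoize(func):
    cache = {}
    @functools.wraps(func)          # keep func's __name__ and __doc__
    def cached_func(*args):
        if args not in cache:
            cache[args] = func(*args)
        return cache[args]
    return cached_func

def fib_plain(n):
    calls['plain'] += 1
    return fib_plain(n - 1) + fib_plain(n - 2) if n > 1 else n

@memoize
def fib_cached(n):
    """Return the n-th Fibonacci number (memoized)."""
    calls['cached'] += 1
    return fib_cached(n - 1) + fib_cached(n - 2) if n > 1 else n

plain_result = fib_plain(20)
cached_result = fib_cached(20)
print('plain:  result %d, body executed %d times' % (plain_result, calls['plain']))
print('cached: result %d, body executed %d times' % (cached_result, calls['cached']))
```

With the cache, the body runs once per distinct argument, so the number of executions grows linearly with n instead of exponentially.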

Exercise:

Implement a decorator that traces calls of the Fibonacci function. Whenever Fib is called, print its name and arguments. For recursive calls, maintain the indentation: each nested call should be indented by two more spaces.

In [ ]:
#%load solutions/fib_trace.py

NOTE: You can find a lot of useful decorators along with their design patterns in this library.

2. What are modules in Python? How to develop your own?

If you have used Python even for a bit, you have most probably noticed that the real power of the language lies in its modules and packages: you import them into your script and reuse code that you or somebody else developed and tested before. But how do you write your own package? It is simple... because modules are also objects!

The simplest way is to create a file, say MyModule.py, put some functions and classes you implemented in it, and finally write

In [ ]:
import MyModule

or something like

In [ ]:
from MyModule import *  # or from MyModule import (MyFunctionName, MyClassName)
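
Since modules are objects, you can inspect them like any other value. A brief sketch using the standard `math` module (chosen only as a convenient stand-in for your own module); the alias `m2` is made up for the example.

```python
import sys
import math
import math as m2   # a repeated import returns the very same module object

print(type(math))                    # the module type
print(math.__name__)                 # the module's name, 'math'
print(m2 is math)                    # both names refer to one object
print(sys.modules['math'] is math)   # imported modules are cached in sys.modules
print(getattr(math, 'sqrt')(9.0))    # module contents are ordinary attributes
```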

Now, if you would like to build something BIG, say a whole package for sound processing, you would probably prefer not to dump everything into a single file, but rather to split the code logically into separate files or even folders. Then your structure should look like this (the example is taken from the Python documentation):

sound/                          Top-level package
      __init__.py               Initialize the sound package
      formats/                  Subpackage for file format conversions
              __init__.py
              wavread.py
              wavwrite.py
              aiffread.py
              aiffwrite.py
              auread.py
              auwrite.py
              ...
      effects/                  Subpackage for sound effects
              __init__.py
              echo.py
              surround.py
              reverse.py
              ...
      filters/                  Subpackage for filters
              __init__.py
              equalizer.py
              vocoder.py
              karaoke.py
              ...

Note the __init__.py files. They may be empty (in most packages they are), but if you need to do something at the moment the package is imported (say, print its version information and license), the appropriate __init__.py is exactly the place to do it.

Also note the subfolders holding subpackages. Large libraries use exactly this kind of structure, which lets programmers import only the submodules they are interested in. Example:

In [ ]:
import sound.effects.echo as echo
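
The `sound` package above is only an example layout, so the import will fail unless you create it. The sketch below builds a tiny throwaway package in a temporary directory and imports a submodule from it, to show the role of `__init__.py`. The package name `demo_sound`, its `__version__` value, and the body of the `echo` function are all invented for this illustration.

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
pkg = os.path.join(root, 'demo_sound')        # hypothetical package name
sub = os.path.join(pkg, 'effects')
os.makedirs(sub)

def write(path, text):
    with open(path, 'w') as fh:
        fh.write(text)

# Code in __init__.py runs once, when the package is first imported.
write(os.path.join(pkg, '__init__.py'), "__version__ = '0.1'\n")
write(os.path.join(sub, '__init__.py'), "")   # an empty file is enough
write(os.path.join(sub, 'echo.py'),
      "def echo(text, times=2):\n    return ' '.join([text] * times)\n")

sys.path.insert(0, root)                      # make the package importable
echo = importlib.import_module('demo_sound.effects.echo')
import demo_sound

print(demo_sound.__version__)
print(echo.echo('hi'))
```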

Let's have a look at a more concrete example (taken from Johansson's Python lectures).

In [ ]:
%%file mymodule.py
"""
Example of a python module. Contains a variable called my_variable,
a function called my_function, and a class called MyClass.
"""

my_variable = 0

def my_function():
    """
    Example function
    """
    return my_variable
    
class MyClass:
    """
    Example class.
    """

    def __init__(self):
        self.variable = my_variable
        
    def set_variable(self, new_value):
        """
        Set self.variable to a new value
        """
        self.variable = new_value
        
    def get_variable(self):
        return self.variable
In [ ]:
import mymodule
In [ ]:
help(mymodule)
In [ ]:
mymodule.my_variable
In [ ]:
mymodule.my_function()
In [ ]:
my_class = mymodule.MyClass() 
my_class.set_variable(10)
my_class.get_variable()

3. Memory management in Python

In this section we shed some light on how memory is actually managed by Python and show techniques that will help you write efficient code and avoid common mistakes.

3.1. Size of objects in Python

First, we should get a good feeling for the actual sizes of objects in Python, and for how those sizes grow whenever we create lists/dictionaries/tuples out of some number of objects.

In [1]:
# The function for showing size of objects (source: http://deeplearning.net/software/theano/tutorial/python-memory-management.html)
import sys

def show_sizeof(x, level=0):
    print "\t" * level, x.__class__, sys.getsizeof(x), x
    if hasattr(x, '__iter__'):
        if hasattr(x, 'items'):
            for xx in x.items():
                show_sizeof(xx, level + 1)
        else:
            for xx in x:
                show_sizeof(xx, level + 1)
In [2]:
sys.getsizeof?
In [ ]:
# Let's check the sizes of different objects (sizes are indicated for 64-bit Python)
show_sizeof(None)    # 16 bytes for None
show_sizeof(1)       # 24 bytes for 64-bit int - 3 times the size of int64_t in C
show_sizeof(2**500)  # 92 bytes for 64-bit Python's long with unconstrained length

show_sizeof(0.5)     # 24 bytes for 64-bit double
show_sizeof("")      # 37 bytes for empty string
show_sizeof(u"")     # 50 bytes for empty unicode string
show_sizeof("Test")  # 41 bytes for non-empty string (+1 byte per character)
show_sizeof(u"Test") # 58 bytes for non-empty unicode string (+2 bytes per character)

Python lists are actually dynamic arrays.

In [ ]:
show_sizeof([])                # 72 bytes
show_sizeof([1, "test", 0.5])  # 96 bytes: 72 for the empty list + 8 bytes per reference for each of the 3 elements (64-bit pointers).
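
One way to observe the dynamic-array behaviour is to watch `sys.getsizeof` while appending. The exact numbers depend on the interpreter version and platform, so this sketch (which assumes CPython) only counts how often the size changes rather than claiming particular values. The names `lst`, `sizes`, and `resizes` are made up for the example.

```python
import sys

lst = []
sizes = [sys.getsizeof(lst)]
for i in range(64):
    lst.append(i)
    sizes.append(sys.getsizeof(lst))

# The size stays flat for several appends and then jumps: the list
# over-allocates spare capacity so that most appends need no reallocation.
resizes = sum(1 for a, b in zip(sizes, sizes[1:]) if b != a)
print("appends: 64, size changes: %d" % resizes)
```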

All of these details may seem minor until you start building large-scale applications or processing huge amounts of data for your projects. So, keep this in mind.

3.2. Python internal memory management

Here we briefly discuss the rather wasteful approach to memory allocation that Python employs. For a thorough description and possible solutions, refer to http://www.evanjones.ca/memoryallocator/.

We can run a small experiment using the memory profiler utility.

In [ ]:
%%file memory-profile-me.py

import copy
import memory_profiler

@profile
def function():
    x = range(1000000)  # allocate a big list
    y = copy.deepcopy(x)
    del x
    return y

if __name__=="__main__":
    function()
In [3]:
%run -m memory_profiler memory-profile-me.py
Filename: memory-profile-me.py

Line #    Mem usage    Increment   Line Contents
================================================
     5   15.883 MiB    0.000 MiB   @profile
     6                             def function():
     7   46.945 MiB   31.062 MiB       x = range(1000000)  # allocate a big list
     8  149.242 MiB  102.297 MiB       y = copy.deepcopy(x)
     9  149.242 MiB    0.000 MiB       del x
    10  149.242 MiB    0.000 MiB       return y


So, what does the report above tell us? For some reason, Python did not release the memory after we deleted x. What happened?

To speed up memory allocation, Python reuses already allocated chunks. It keeps several free lists for small objects (separate lists for different sizes). Whenever we create a new object, Python either reuses a free block from one of these lists or allocates a new one. That is why the memory usage did not shrink when we deleted x: the freed blocks went back to Python's free lists rather than to the operating system.
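
The profiler report shows memory as seen from outside the interpreter. To look at the interpreter's own bookkeeping, one option is the standard `tracemalloc` module. Note that this goes beyond the tutorial's setup: `tracemalloc` exists only in Python 3.4 and later, so this sketch will not run on Python 2. In it, the memory that Python tracks as in use drops right after `del x`, because the blocks become available for reuse, whether or not the process hands them back to the OS.

```python
import tracemalloc

tracemalloc.start()

x = list(range(100000))                        # allocate a big list
with_x, _ = tracemalloc.get_traced_memory()    # returns (current, peak) in bytes

del x                                          # last reference gone, blocks are released
without_x, _ = tracemalloc.get_traced_memory()

tracemalloc.stop()

print("traced with x:    %d bytes" % with_x)
print("traced after del: %d bytes" % without_x)
```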

4. The Pythonic way: Tips & Tricks

Python is a nice language. If you knew other programming languages before starting with Python, you can switch to it easily (learning the basics in literally 3-6 hours) and then continue writing ugly, awkward code using idioms and structures from your previous languages. However, Python is beautiful, very succinct, and very expressive. This section shows some essential well-known (and less well-known) Python idioms, or tricks if you will, that can make your code shorter and more elegant.

List comprehensions (analogous syntax exists for dicts and sets, and the source can be any iterable object)

In [ ]:
# Basic for-loop comprehension
a = [1,2,3,4,5,6,7]
b = [x**2 for x in a]
print b
In [ ]:
# Comprehension with a filtering condition
c = [x**2 for x in a if x > 4]
print c
In [ ]:
# Nested for-loops with a filtering condition
llist = [[1,2,3],(4,5,6),(7,8,9)]
c = [x**2 for sublist in llist for x in sublist if x % 2 == 0]
print c
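
The same syntax carries over to dicts and sets, and a generator expression lets you feed a function without building an intermediate list. A small sketch reusing the list `a` from above (the names `squares`, `remainders`, and `total` are made up; dict and set comprehensions need Python 2.7 or later):

```python
a = [1, 2, 3, 4, 5, 6, 7]

squares = {x: x**2 for x in a if x > 4}   # dict comprehension
remainders = {x % 3 for x in a}           # set comprehension
total = sum(x**2 for x in a)              # generator expression, no list is built

print(squares)
print(remainders)
print(total)
```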

Infinite structures

In [ ]:
# In Python you can actually build self-referencing ("infinite") lists and dicts (without running out of memory!)
a = [1,2,3]
a.append(a)
print a
In [ ]:
a[3][3][3][3][3]
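
What makes this work is that a list stores references, not copies: the fourth element is the list itself, so no extra memory is needed no matter how deep you index. A minimal check (the dict `d` is added only to show that the same holds for dicts):

```python
a = [1, 2, 3]
a.append(a)

print(len(a))                  # the list holds three ints and one reference
print(a[3] is a)               # the last element is the very same object
print(a[3][3][3][0])           # indexing can go around the cycle indefinitely

d = {}
d['self'] = d                  # dicts can refer to themselves as well
print(d['self']['self'] is d)
```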

Other fancy stuff

In [ ]:
# Quick swap
a = 1
b = 2
print a,b
a, b = b, a
print a, b
In [ ]:
# Nice math-like comparison notation
x = 5
print 3 < x < 8
print x < 10 < 5*x < 99
print 2 > x < 7
In [ ]:
# List flattening techniques
a = [[1,2,3],[4,5,6],[7,8,9]]
print sum(a,[])                  # this works only with nested lists (or objects with __add__ method defined)
print [x for b in a for x in b]  # works with any nested iterables
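
For long inputs, `sum(a, [])` builds a new list at every step, so a third option worth knowing is `itertools.chain.from_iterable` from the standard library. A small sketch comparing the three approaches (the names `flat_sum`, `flat_comp`, and `flat_chain` are made up):

```python
import itertools

a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

flat_sum = sum(a, [])                                 # nested lists only
flat_comp = [x for b in a for x in b]                 # any nested iterables
flat_chain = list(itertools.chain.from_iterable(a))   # any nested iterables, lazy until list()

print(flat_chain)
print(flat_sum == flat_comp == flat_chain)
```
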
In [ ]:
# Pairing elements of two lists
first, second = [1,2,3,4,5], [6,7,8,9,10]
print first
print second

paired = zip(first,second)
print paired
In [ ]:
# Unpairing the elements back into two lists (NOTE: Lists turn into tuples. This is due to zip implementation.)
a, b = zip(*paired)
print a
print b
In [ ]:
# Providing function arguments via structures
def func(a, b, c, kw1 = '', kw2 = 0):
    print a
    print b
    print c
    print kw1
    print kw2

args = (1, 2, 3)
kwargs = {'kw1':'Hello', 'kw2':100}
func(*args,**kwargs)
In [ ]:
# Enumeration of the elements
l = ["spam", "ham", "eggs"]
print list(enumerate(l))
print list(enumerate(l,5))

Copyright 2015, Maruan Al-Shedivat, ACM Student Member.