In my courses and as part of Software Carpentry I'm teaching more and more in IPython notebooks. One of the things I/we teach is testing, so I wondered if I could teach it in a notebook as well. With quite a lot of help from Matthias Bussonnier I got it working so I thought I'd share.
We will test a function that calculates the GC-content of a DNA sequence. The GC-content is simply the percentage of bases in the DNA sequence that are either G's or C's. So, for example, the GC-content of 'ATTGC' is 40%.
The function we are testing is get_gc_content(), and it takes a single argument, which is a string representing a sequence. This function is in a custom module called dna_analysis.py.
We can use the %%file magic to save a block of code to a file, so we can use that to create the module that we're going to test.
%%file dna_analysis.py
"""Code for analyzing DNA sequences"""

from __future__ import division

def get_gc_content(seq):
    """Determine the GC content of a sequence"""
    seq = seq.upper()
    gc_content = 100 * (seq.count('G') + seq.count('C')) / len(seq)
    return gc_content
Writing dna_analysis.py
Next we write a set of tests in test_dna.py. Each test calls the function and checks to see if it returns the right value.
%%file test_dna.py
from dna_analysis import get_gc_content


def test_get_gc_content_zero():
    assert get_gc_content('ATTATTAAA') == 0

def test_get_gc_content_lowercase():
    assert get_gc_content('atgcatgc') == 50

def test_get_gc_content_multiline():
    sequence = """atta
gccg
attt
cccg"""
    assert get_gc_content(sequence) == 50
Writing test_dna.py
We would normally run nosetests from the command line. In IPython the ! prefix passes a command to the system shell, so we can just call !nosetests.
!nosetests
..F
======================================================================
FAIL: test_dna.test_get_gc_content_multiline
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/home/ethan/Dropbox/Teaching/ProgBiol/repo/ipynbs/test_dna.py", line 15, in test_get_gc_content_multiline
    assert get_gc_content(sequence) == 50
AssertionError

----------------------------------------------------------------------
Ran 3 tests in 0.018s

FAILED (failures=1)
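This is exactly the output we want, since the original function handles most basic cases but not multiline strings: the newline characters are counted by len(seq), so the result is 8/19 (about 42%) rather than 50%.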
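The post stops at the failing test, but one possible fix is to strip whitespace before counting. The sketch below is only an assumption about how that could be done, not the version used in the course. It assumes Python 3, where / is true division (under Python 2.7 the module's `from __future__ import division` line would still be needed).

```python
def get_gc_content(seq):
    """Determine the GC content of a sequence, ignoring whitespace."""
    # str.split() with no arguments splits on any run of whitespace,
    # including newlines, so joining the pieces removes line breaks
    seq = "".join(seq.split()).upper()
    gc_count = seq.count('G') + seq.count('C')
    return 100 * gc_count / len(seq)


multiline = """atta
gccg
attt
cccg"""
print(get_gc_content(multiline))  # prints 50.0
```

With a change along these lines saved to dna_analysis.py, all three tests in test_dna.py should pass. An empty sequence would still raise ZeroDivisionError, which might be worth a test of its own.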
There is also a nice example of running doctests in the notebook.
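That example isn't reproduced here, but a minimal sketch of the idea, using only the standard library's doctest module, might look like the following. The docstring examples are illustrative additions, not part of the original module, and the expected values assume Python 3 float division.

```python
import doctest


def get_gc_content(seq):
    """Determine the GC content of a sequence, as a percentage.

    >>> get_gc_content('ATTGC')
    40.0
    >>> get_gc_content('atgcatgc')
    50.0
    """
    seq = seq.upper()
    return 100 * (seq.count('G') + seq.count('C')) / len(seq)


# Find the examples embedded in the docstring and run them
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(get_gc_content, name="get_gc_content",
                        globs={"get_gc_content": get_gc_content}):
    runner.run(test)

# summarize() returns the counts of failed and attempted examples
results = runner.summarize()
print(results)
```

If both examples pass, results.failed is 0 and results.attempted is 2. In a notebook cell, calling doctest.testmod() is a shorter way to run every docstring example defined so far in the session.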
I wonder a bit about how valuable this is, since we probably wouldn't normally run tests this way (at least I don't at the moment). But for a short workshop where we're teaching all of the Python in a notebook, I think the reduced cognitive load of not switching environments for the testing portion might outweigh the cost of doing something in a way that isn't exactly how we would do it in day-to-day work.