Importing IPython Notebooks as Modules

People often want to import code from IPython Notebooks. This is difficult because Notebooks are not plain Python files, and thus cannot be imported by the regular Python machinery.

There is a flag in the notebook server that provides a certain workaround for a small set of cases, but I think it's gross so I won't even discuss it.

Fortunately, Python provides some fairly sophisticated hooks into the import machinery, so we can actually make IPython notebooks importable without much difficulty, and only using public APIs.

Forgive me if some of this is gross or wrong; I haven't really written import hooks before.

In [1]:
import io, os, sys, types
In [2]:
from IPython.nbformat import current
from IPython.core.interactiveshell import InteractiveShell

Import hooks typically take the form of two objects:

  1. a module loader, which takes a module name (e.g. 'IPython.display'), and returns a Module
  2. a module finder, which figures out whether a module might exist, and tells Python what loader to use
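
To make the finder/loader split concrete before looking at notebooks, here is a minimal sketch that imports a module from an in-memory dictionary instead of a file. Everything in it (`SOURCES`, `DictLoader`, `DictFinder`, `toy_hook_demo`) is a hypothetical name chosen for illustration. Note one assumption: the notebook code in this post uses the older `find_module`/`load_module` method names, while this sketch uses the `find_spec`/`exec_module` protocol from `importlib`, which is what recent Python 3 releases expect.

```python
import importlib.abc
import importlib.util
import sys

# Hypothetical in-memory "files": module name -> source text
SOURCES = {
    "toy_hook_demo": "def greet():\n    return 'hello from ' + __name__\n",
}


class DictLoader(importlib.abc.Loader):
    """Loader: turns a name the finder accepted into a populated module."""

    def create_module(self, spec):
        return None  # None asks Python to create a default, empty module

    def exec_module(self, module):
        # run the stored source inside the new module's namespace
        exec(SOURCES[module.__name__], module.__dict__)


class DictFinder(importlib.abc.MetaPathFinder):
    """Finder: decides whether a name is ours and says which loader to use."""

    def find_spec(self, fullname, path=None, target=None):
        if fullname not in SOURCES:
            return None  # not ours, let the other finders try
        return importlib.util.spec_from_loader(fullname, DictLoader())


finder = DictFinder()
sys.meta_path.append(finder)

import toy_hook_demo

greeting = toy_hook_demo.greet()
print(greeting)

sys.meta_path.remove(finder)  # tidy up; the imported module stays in sys.modules
```

The finder never executes anything and the loader never searches for anything, which is the same division of labour the notebook finder and loader follow.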
In [3]:
def find_notebook(fullname, path=None):
    """find a notebook, given its fully qualified name and an optional path"""
    name = fullname.rsplit('.', 1)[-1]
    if not path:
        path = ['']
    for d in path:
        nb_path = os.path.join(d, name + ".ipynb")
        if os.path.isfile(nb_path):
            return nb_path
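
A quick way to see what `find_notebook` does with a dotted name is to run it against a throwaway directory. The sketch below repeats the function so that it runs on its own; the directory and the `nb.other` / `nb.absent` names are made up for the demonstration.

```python
import os
import shutil
import tempfile


def find_notebook(fullname, path=None):
    """Same lookup as above: only the last dotted component names the file."""
    name = fullname.rsplit('.', 1)[-1]
    if not path:
        path = ['']
    for d in path:
        nb_path = os.path.join(d, name + ".ipynb")
        if os.path.isfile(nb_path):
            return nb_path


# throwaway directory standing in for a package directory
pkg_dir = tempfile.mkdtemp()
try:
    # an empty file is enough, since find_notebook only checks existence
    open(os.path.join(pkg_dir, "other.ipynb"), "w").close()

    found = find_notebook("nb.other", [pkg_dir])
    missing = find_notebook("nb.absent", [pkg_dir])
finally:
    shutil.rmtree(pkg_dir)

print(os.path.basename(found))
print(missing)
```

When nothing matches, the loop simply ends and the function returns `None` implicitly, which is what the finder later relies on to decline a name.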
            

Notebook Loader

Here we have our Notebook Loader. It's actually quite simple - once we figure out the filename of the module, all it does is:

  1. load the notebook document into memory
  2. create an empty Module
  3. execute every cell in the Module namespace

Since IPython cells can have extended syntax, the IPython transform is applied to turn each of these cells into their pure-Python counterparts before executing them. If all of your notebook cells are pure-Python, this step is unnecessary.
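
The three steps can be sketched without IPython at all, as long as every cell is pure Python (the case where the transform is unnecessary). The notebook document below is a hypothetical, heavily trimmed dictionary that only mirrors the fields the real loader reads (`worksheets`, `cells`, `cell_type`, `input`); `load_plain_notebook` is an illustrative name, not part of any library.

```python
import json
import types

# Hypothetical notebook document, reduced to the fields the loader uses
NOTEBOOK_JSON = json.dumps({
    "worksheets": [{
        "cells": [
            {"cell_type": "heading", "source": "My Notebook"},
            {"cell_type": "code", "language": "python",
             "input": "def foo():\n    return 'foo'"},
            {"cell_type": "code", "language": "python",
             "input": "answer = foo() * 2"},
        ]
    }]
})


def load_plain_notebook(name, text):
    nb = json.loads(text)                      # 1. load the document into memory
    mod = types.ModuleType(name)               # 2. create an empty module
    for cell in nb["worksheets"][0]["cells"]:
        if cell["cell_type"] == "code":
            exec(cell["input"], mod.__dict__)  # 3. run each cell in the module namespace
    return mod


demo = load_plain_notebook("demo_nb", NOTEBOOK_JSON)
print(demo.foo())
print(demo.answer)
```

Because every cell runs in the same `mod.__dict__`, the second cell can call `foo` defined by the first, just as cells share state in a live notebook.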

In [4]:
class NotebookLoader(object):
    """Module Loader for IPython Notebooks"""
    def __init__(self, path=None):
        self.shell = InteractiveShell.instance()
        self.path = path
    
    def load_module(self, fullname):
        """import a notebook as a module"""
        path = find_notebook(fullname, self.path)
        
        print ("importing IPython notebook from %s" % path)
                                       
        # load the notebook object
        with io.open(path, 'r', encoding='utf-8') as f:
            nb = current.read(f, 'json')
        
        
        # create the module and add it to sys.modules
        # if name in sys.modules:
        #    return sys.modules[name]
        mod = types.ModuleType(fullname)
        mod.__file__ = path
        mod.__loader__ = self
        sys.modules[fullname] = mod
        
        # extra work to ensure that magics that would affect the user_ns
        # actually affect the notebook module's ns
        save_user_ns = self.shell.user_ns
        self.shell.user_ns = mod.__dict__
        
        try:
            for cell in nb.worksheets[0].cells:
                if cell.cell_type == 'code' and cell.language == 'python':
                    # transform the input to executable Python
                    code = self.shell.input_transformer_manager.transform_cell(cell.input)
                    # run the code in the module
                    exec(code, mod.__dict__)
        finally:
            self.shell.user_ns = save_user_ns
        return mod
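
The save/swap/restore of `user_ns` in the loader is the part most likely to go wrong if a cell raises, so here is that pattern in isolation. This is only a sketch: `FakeShell` is a hypothetical stand-in for `InteractiveShell` with nothing but a `user_ns` attribute, and `swapped_user_ns` is an illustrative helper, not an IPython API.

```python
from contextlib import contextmanager


class FakeShell(object):
    """Hypothetical stand-in for InteractiveShell; only user_ns matters here."""

    def __init__(self):
        self.user_ns = {"origin": "interactive"}


@contextmanager
def swapped_user_ns(shell, namespace):
    """Temporarily point shell.user_ns at another namespace."""
    saved = shell.user_ns
    shell.user_ns = namespace
    try:
        yield namespace
    finally:
        # runs even if the body raises, like the loader's try/finally
        shell.user_ns = saved


shell = FakeShell()
module_ns = {}
try:
    with swapped_user_ns(shell, module_ns):
        shell.user_ns["defined_by_cell"] = 1  # lands in module_ns
        raise RuntimeError("simulated failing cell")
except RuntimeError:
    pass

print(module_ns)
print(shell.user_ns)
```

Anything written to `user_ns` while the swap is active ends up in the module's namespace, and the interactive namespace comes back untouched even though the simulated cell failed.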

The Module Finder

The finder is a simple object that tells you whether a name can be imported, and returns the appropriate loader. When you do:

import mynotebook

all this one does is check whether mynotebook.ipynb exists. If a notebook is found, it returns a NotebookLoader.

Any extra logic is just for resolving paths within packages.

In [5]:
class NotebookFinder(object):
    """Module finder that locates IPython Notebooks"""
    def __init__(self):
        self.loaders = {}
    
    def find_module(self, fullname, path=None):
        nb_path = find_notebook(fullname, path)
        if not nb_path:
            return
        
        key = path
        if path:
            # lists aren't hashable
            key = os.path.sep.join(path)
        
        if key not in self.loaders:
            self.loaders[key] = NotebookLoader(path)
        return self.loaders[key]
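
The "lists aren't hashable" comment in the finder is easy to verify, and doing so also shows one alternative for the cache key. The snippet below is a standalone sketch with placeholder values; using `tuple(path)` is a suggestion of mine, not what the finder above does.

```python
import os

loaders = {}
path = ["nb"]

# a list cannot be used as a dictionary key
try:
    loaders[path] = "loader"
    list_key_error = None
except TypeError as exc:
    list_key_error = exc

# the finder's approach: join the entries into one string
joined_key = os.path.sep.join(path)
loaders[joined_key] = "loader for nb"

# joining can map two different lists to the same key...
nested = "a" + os.path.sep + "b"
collides = os.path.sep.join(["a", "b"]) == os.path.sep.join([nested])

# ...whereas tuples keep them distinct and are hashable too
distinct = tuple(["a", "b"]) != tuple([nested])
loaders[tuple(path)] = "loader for nb (tuple key)"

print(type(list_key_error).__name__, collides, distinct)
```

For ordinary package paths the collision is unlikely to matter, so the joined string works fine in practice; the tuple is just a slightly more direct key.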

Register the hook

Now we register the NotebookFinder with sys.meta_path:

In [6]:
sys.meta_path.append(NotebookFinder())
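
One practical wrinkle: re-running the registration cell appends another finder each time. A possible guard is sketched below. The `register_once` helper is hypothetical, and the `NotebookFinder` here is a do-nothing stub so the snippet runs on its own. It compares class names rather than using `isinstance`, on the assumption that the class-definition cell may itself have been re-run, which creates a new class object.

```python
import sys


class NotebookFinder(object):
    """Stub with the same name as the finder above; it never finds anything."""

    def find_spec(self, fullname, path=None, target=None):
        return None

    def find_module(self, fullname, path=None):
        return None


def register_once(finder):
    """Append finder to sys.meta_path unless one of the same class name is there."""
    for existing in sys.meta_path:
        if type(existing).__name__ == type(finder).__name__:
            return existing
    sys.meta_path.append(finder)
    return finder


first = register_once(NotebookFinder())
second = register_once(NotebookFinder())  # returns the finder already registered

count = sum(type(f).__name__ == "NotebookFinder" for f in sys.meta_path)
print(count)

sys.meta_path.remove(first)  # undo the registration made by this sketch
```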

After this point, my notebooks should be importable.

Let's look at what we have in the CWD:

In [7]:
ls
Importing Notebooks.ipynb  bs.ipynb                   mynotebook.ipynb           nb/

So I should be able to import mynotebook.

Aside: displaying notebooks

Here is some simple code to display the contents of a notebook with syntax highlighting, etc.

In [8]:
from pygments import highlight
from pygments.lexers import PythonLexer
from pygments.formatters import HtmlFormatter

from IPython.display import display, HTML

formatter = HtmlFormatter()
lexer = PythonLexer()

# publish the CSS for pygments highlighting
display(HTML("""
<style type='text/css'>
%s
</style>
""" % formatter.get_style_defs()
))
In [9]:
def show_notebook(fname):
    """display a short summary of the cells of a notebook"""
    with io.open(fname, 'r', encoding='utf-8') as f:
        nb = current.read(f, 'json')
    html = []
    for cell in nb.worksheets[0].cells:
        html.append("<h4>%s cell</h4>" % cell.cell_type)
        if cell.cell_type == 'code':
            html.append(highlight(cell.input, lexer, formatter))
        else:
            html.append("<pre>%s</pre>" % cell.source)
    display(HTML('\n'.join(html)))

show_notebook("mynotebook.ipynb")

heading cell

My Notebook

code cell

def foo():
    return "foo"

code cell

def has_ip_syntax():
    listing = !ls
    return listing

code cell

def whatsmyname():
    return __name__

So my notebook has a heading cell and some code cells, one of which contains some IPython syntax.

Let's see what happens when we import it:

In [10]:
import mynotebook
importing IPython notebook from mynotebook.ipynb

Hooray, it imported! Does it work?

In [11]:
mynotebook.foo()
Out[11]:
'foo'

Hooray again!

Even the function that contains IPython syntax works:

In [12]:
mynotebook.has_ip_syntax()
Out[12]:
['Importing Notebooks.ipynb', 'bs.ipynb', 'mynotebook.ipynb', 'nb']

Notebooks in packages

We also have a notebook inside the nb package, so let's make sure that works as well.

In [13]:
ls nb
__init__.py   __init__.pyc  other.ipynb

Note that the __init__.py is necessary for nb to be considered a package, just like usual.

In [14]:
show_notebook(os.path.join("nb", "other.ipynb"))

markdown cell

This notebook just defines `bar`

code cell

def bar(x):
    return "bar" * x
In [15]:
from nb import other
other.bar(5)
importing IPython notebook from nb/other.ipynb

Out[15]:
'barbarbarbarbar'

So now we have importable notebooks, from both the local directory and inside packages.

I can even put a notebook inside IPython, to further demonstrate that this is working properly:

In [16]:
import shutil
from IPython.utils.path import get_ipython_package_dir

utils = os.path.join(get_ipython_package_dir(), 'utils')
shutil.copy("mynotebook.ipynb", os.path.join(utils, "inipython.ipynb"))

Then import the notebook from IPython.utils:

In [17]:
from IPython.utils import inipython
inipython.whatsmyname()
importing IPython notebook from /Users/minrk/Dropbox/dev/ip/mine/IPython/utils/inipython.ipynb

Out[17]:
'IPython.utils.inipython'

Even Cython magics

With a bit of extra magic for handling the IPython interactive namespace during load, even magics like %%cython can be used:

In [18]:
show_notebook('bs.ipynb')

code cell

%load_ext cythonmagic

markdown cell

Python Black-Scholes

code cell

from math import exp, sqrt, pow, log, erf

def std_norm_cdf(x):
    return 0.5*(1+erf(x/sqrt(2.0)))

def bs_py(s, k, t, v, rf, div, cp):
    """Price an option using the Black-Scholes model.
    
    s : initial stock price
    k : strike price
    t : expiration time
    v : volatility
    rf : risk-free rate
    div : dividend
    cp : +1/-1 for call/put
    """
    d1 = (log(s/k)+(rf-div+0.5*pow(v,2))*t)/(v*sqrt(t))
    d2 = d1 - v*sqrt(t)
    optprice = cp*s*exp(-div*t)*std_norm_cdf(cp*d1) - \
        cp*k*exp(-rf*t)*std_norm_cdf(cp*d2)
    return optprice

markdown cell

Cython Black-Scholes (the same, just with types)

code cell

%%cython
cimport cython
from libc.math cimport exp, sqrt, pow, log, erf

@cython.cdivision(True)
cdef double std_norm_cdf(double x) nogil:
    return 0.5*(1+erf(x/sqrt(2.0)))

@cython.cdivision(True)
def bs_cy(double s, double k, double t, double v,
                 double rf, double div, double cp):
    """Same as above, with Cython"""
    cdef double d1, d2, optprice
    with nogil:
        d1 = (log(s/k)+(rf-div+0.5*pow(v,2))*t)/(v*sqrt(t))
        d2 = d1 - v*sqrt(t)
        optprice = cp*s*exp(-div*t)*std_norm_cdf(cp*d1) - \
            cp*k*exp(-rf*t)*std_norm_cdf(cp*d2)
    return optprice

code cell

In [19]:
from bs import bs_py, bs_cy
print("Python")
%timeit bs_py(100.0, 100.0, 1.0, 0.3, 0.03, 0.0, -1)
print("Cython")
%timeit bs_cy(100.0, 100.0, 1.0, 0.3, 0.03, 0.0, -1)
importing IPython notebook from bs.ipynb
Python
100000 loops, best of 3: 3.03 µs per loop
Cython
1000000 loops, best of 3: 367 ns per loop
