#!/usr/bin/env python
# coding: utf-8

# # Using Fortran and C code with Python

# J.R. Johansson (jrjohansson at gmail.com)
# 
# The latest version of this [IPython notebook](http://ipython.org/notebook.html) lecture is available at [http://github.com/jrjohansson/scientific-python-lectures](http://github.com/jrjohansson/scientific-python-lectures).
# 
# The other notebooks in this lecture series are indexed at [http://jrjohansson.github.io](http://jrjohansson.github.io).

# In[1]:


get_ipython().run_line_magic('pylab', 'inline')
from IPython.display import Image


# The advantage of Python is that it is flexible and easy to program. The time it takes to setup a new calculation is therefore short. But for certain types of calculations Python (and any other interpreted language) can be very slow. It is particularly iterations over large arrays that is difficult to do efficiently.
# 
# Such calculations may be implemented in a compiled language such as C or Fortran. In Python it is relatively easy to call out to libraries with compiled C or Fortran code. In this lecture we will look at how to do that.
# 
# But before we go ahead and work on optimizing anything, it is always worthwhile to ask.... 

# In[2]:


Image(filename='images/optimizing-what.png')


# ## Fortran

# ### F2PY

# F2PY is a program that (almost) automatically wraps fortran code for use in Python: By using the `f2py` program we can compile fortran code into a module that we can import in a Python program.
# 
# F2PY is a part of NumPy, but you will also need to have a fortran compiler to run the examples below.

# ### Example 0: scalar input, no output

# In[3]:


get_ipython().run_cell_magic('file', 'hellofortran.f', 'C File  hellofortran.f\n        subroutine hellofortran (n)\n        integer n\n       \n        do 100 i=0, n\n            print *, "Fortran says hello"\n100     continue\n        end\n')


# Generate a python module using `f2py`:

# In[4]:


get_ipython().system('f2py -c -m hellofortran hellofortran.f')


# Example of a python script that use the module:

# In[5]:


get_ipython().run_cell_magic('file', 'hello.py', 'import hellofortran\n\nhellofortran.hellofortran(5)\n')


# In[6]:


# run the script
get_ipython().system('python hello.py')


# ### Example 1: vector input and scalar output

# In[7]:


get_ipython().run_cell_magic('file', 'dprod.f', '\n       subroutine dprod(x, y, n)\n    \n       double precision x(n), y\n       y = 1.0\n    \n       do 100 i=1, n\n           y = y * x(i)\n100    continue\n       end\n')


# In[8]:


get_ipython().system('rm -f dprod.pyf')
get_ipython().system('f2py -m dprod -h dprod.pyf dprod.f')


# The `f2py` program generated a module declaration file called `dsum.pyf`. Let's look what's in it:

# In[9]:


get_ipython().system('cat dprod.pyf')


# The module does not know what Fortran subroutine arguments is input and output, so we need to manually edit the module declaration files and mark output variables with `intent(out)` and input variable with `intent(in)`:

# In[10]:


get_ipython().run_cell_magic('file', 'dprod.pyf', 'python module dprod ! in \n    interface  ! in :dprod\n        subroutine dprod(x,y,n) ! in :dprod:dprod.f\n            double precision dimension(n), intent(in) :: x\n            double precision, intent(out) :: y\n            integer, optional,check(len(x)>=n),depend(x),intent(in) :: n=len(x)\n        end subroutine dprod\n    end interface \nend python module dprod\n')


# Compile the fortran code into a module that can be included in python:

# In[11]:


get_ipython().system('f2py -c dprod.pyf dprod.f')


# #### Using the module from Python

# In[12]:


import dprod


# In[13]:


help(dprod)


# In[14]:


dprod.dprod(arange(1,50))


# In[15]:


# compare to numpy
prod(arange(1.0,50.0))


# In[16]:


dprod.dprod(arange(1,10), 5) # only the 5 first elements


# Compare performance:

# In[17]:


xvec = rand(500)


# In[18]:


timeit dprod.dprod(xvec)


# In[19]:


timeit xvec.prod()


# ### Example 2: cummulative sum, vector input and vector output

# The cumulative sum function for an array of data is a good example of a loop intense algorithm: Loop through a vector and store the cumulative sum in another vector.

# In[20]:


# simple python algorithm: example of a SLOW implementation
# Why? Because the loop is implemented in python.
def py_dcumsum(a):
    b = empty_like(a)
    b[0] = a[0]
    for n in range(1,len(a)):
        b[n] = b[n-1]+a[n]
    return b


# Fortran subroutine for the same thing: here we have added the `intent(in)` and `intent(out)` as comment lines in the original fortran code, so we do not need to manually edit the fortran module declaration file generated by `f2py`. 

# In[21]:


get_ipython().run_cell_magic('file', 'dcumsum.f', 'c File dcumsum.f\n       subroutine dcumsum(a, b, n)\n       double precision a(n)\n       double precision b(n)\n       integer n\ncf2py  intent(in) :: a\ncf2py  intent(out) :: b\ncf2py  intent(hide) :: n\n\n       b(1) = a(1)\n       do 100 i=2, n\n           b(i) = b(i-1) + a(i)\n100    continue\n       end\n')


# We can directly compile the fortran code to a python module:

# In[22]:


get_ipython().system('f2py -c dcumsum.f -m dcumsum')


# In[23]:


import dcumsum


# In[24]:


a = array([1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0])


# In[25]:


py_dcumsum(a)


# In[26]:


dcumsum.dcumsum(a)


# In[27]:


cumsum(a)


# Benchmark the different implementations:

# In[28]:


a = rand(10000)


# In[29]:


timeit py_dcumsum(a)


# In[30]:


timeit dcumsum.dcumsum(a)


# In[31]:


timeit a.cumsum()


# ### Further reading

# 1. http://www.scipy.org/F2py
# 1. http://dsnra.jpl.nasa.gov/software/Python/F2PY_tutorial.pdf
# 1. http://www.shocksolution.com/2009/09/f2py-binding-fortran-python/

# ## C

# ## ctypes

# ctypes is a Python library for calling out to C code. It is not as automatic as `f2py`, and we manually need to load the library and set properties such as the functions return and argument types. On the other hand we do not need to touch the C code at all. 

# In[32]:


get_ipython().run_cell_magic('file', 'functions.c', '\n#include <stdio.h>\n\nvoid hello(int n);\n\ndouble dprod(double *x, int n);\n\nvoid dcumsum(double *a, double *b, int n);\n\nvoid\nhello(int n)\n{\n    int i;\n    \n    for (i = 0; i < n; i++)\n    {\n        printf("C says hello\\n");\n    }\n}\n\n\ndouble \ndprod(double *x, int n)\n{\n    int i;\n    double y = 1.0;\n    \n    for (i = 0; i < n; i++)\n    {\n        y *= x[i];\n    }\n\n    return y;\n}\n\nvoid\ndcumsum(double *a, double *b, int n)\n{\n    int i;\n    \n    b[0] = a[0];\n    for (i = 1; i < n; i++)\n    {\n        b[i] = a[i] + b[i-1];\n    }\n}\n')


# Compile the C file into a shared library:

# In[33]:


get_ipython().system('gcc -c -Wall -O2 -Wall -ansi -pedantic -fPIC -o functions.o functions.c')
get_ipython().system('gcc -o libfunctions.so -shared functions.o')


# The result is a compiled shared library `libfunctions.so`:

# In[34]:


get_ipython().system('file libfunctions.so')


# Now we need to write wrapper functions to access the C library: To load the library we use the ctypes package, which included in the Python standard library (with extensions from numpy for passing arrays to C). Then we manually set the types of the argument and return values (no automatic code inspection here!). 

# In[35]:


get_ipython().run_cell_magic('file', 'functions.py', "\nimport numpy\nimport ctypes\n\n_libfunctions = numpy.ctypeslib.load_library('libfunctions', '.')\n\n_libfunctions.hello.argtypes = [ctypes.c_int]\n_libfunctions.hello.restype  =  ctypes.c_void_p\n\n_libfunctions.dprod.argtypes = [numpy.ctypeslib.ndpointer(dtype=numpy.float), ctypes.c_int]\n_libfunctions.dprod.restype  = ctypes.c_double\n\n_libfunctions.dcumsum.argtypes = [numpy.ctypeslib.ndpointer(dtype=numpy.float), numpy.ctypeslib.ndpointer(dtype=numpy.float), ctypes.c_int]\n_libfunctions.dcumsum.restype  = ctypes.c_void_p\n\ndef hello(n):\n    return _libfunctions.hello(int(n))\n\ndef dprod(x, n=None):\n    if n is None:\n        n = len(x)\n    x = numpy.asarray(x, dtype=numpy.float)\n    return _libfunctions.dprod(x, int(n))\n\ndef dcumsum(a, n):\n    a = numpy.asarray(a, dtype=numpy.float)\n    b = numpy.empty(len(a), dtype=numpy.float)\n    _libfunctions.dcumsum(a, b, int(n))\n    return b\n")


# In[36]:


get_ipython().run_cell_magic('file', 'run_hello_c.py', '\nimport functions\n\nfunctions.hello(3)\n')


# In[37]:


get_ipython().system('python run_hello_c.py')


# In[38]:


import functions


# ### Product function:

# In[39]:


functions.dprod([1,2,3,4,5]) 


# ### Cumulative sum:

# In[40]:


a = rand(100000)


# In[41]:


res_c = functions.dcumsum(a, len(a)) 


# In[42]:


res_fortran = dcumsum.dcumsum(a)


# In[43]:


res_c - res_fortran


# ### Simple benchmark

# In[44]:


timeit functions.dcumsum(a, len(a))


# In[45]:


timeit dcumsum.dcumsum(a)


# In[46]:


timeit a.cumsum()


# ### Further reading

# * http://docs.python.org/2/library/ctypes.html
# * http://www.scipy.org/Cookbook/Ctypes

# ## Cython

# A hybrid between python and C that can be compiled: Basically Python code with type declarations.

# In[47]:


get_ipython().run_cell_magic('file', 'cy_dcumsum.pyx', '\ncimport numpy\n\ndef dcumsum(numpy.ndarray[numpy.float64_t, ndim=1] a, numpy.ndarray[numpy.float64_t, ndim=1] b):\n    cdef int i, n = len(a)\n    b[0] = a[0]\n    for i from 1 <= i < n:\n        b[i] = b[i-1] + a[i]\n    return b\n')


# A build file for generating C code and compiling it into a Python module.

# In[48]:


get_ipython().run_cell_magic('file', 'setup.py', '\nfrom distutils.core import setup\nfrom distutils.extension import Extension\nfrom Cython.Distutils import build_ext\n\nsetup(\n    cmdclass = {\'build_ext\': build_ext},\n    ext_modules = [Extension("cy_dcumsum", ["cy_dcumsum.pyx"])]\n)\n')


# In[49]:


get_ipython().system('python setup.py build_ext --inplace')


# In[50]:


import cy_dcumsum


# In[51]:


a = array([1,2,3,4], dtype=float)
b = empty_like(a)
cy_dcumsum.dcumsum(a,b)
b


# In[52]:


a = array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])


# In[53]:


b = empty_like(a)
cy_dcumsum.dcumsum(a, b)
b


# In[54]:


py_dcumsum(a)


# In[55]:


a = rand(100000)
b = empty_like(a)


# In[56]:


timeit py_dcumsum(a)


# In[57]:


timeit cy_dcumsum.dcumsum(a,b)


# ### Cython in the IPython notebook

# When working with the IPython (especially in the notebook), there is a more convenient way of compiling and loading Cython code. Using the `%%cython` IPython magic (command to IPython), we can simply type the Cython code in a code cell and let IPython take care of the conversion to C code, compilation and loading of the function. To be able to use the `%%cython` magic, we first need to load the extension `cythonmagic`:

# In[58]:


get_ipython().run_line_magic('load_ext', 'cythonmagic')


# In[62]:


get_ipython().run_cell_magic('cython', '', '\ncimport numpy\n\ndef cy_dcumsum2(numpy.ndarray[numpy.float64_t, ndim=1] a, numpy.ndarray[numpy.float64_t, ndim=1] b):\n    cdef int i, n = len(a)\n    b[0] = a[0]\n    for i from 1 <= i < n:\n        b[i] = b[i-1] + a[i]\n    return b\n')


# In[63]:


timeit cy_dcumsum2(a,b)


# ### Further reading

# * http://cython.org
# * http://docs.cython.org/src/userguide/tutorial.html
# * http://wiki.cython.org/tutorials/numpy

# ## Versions

# In[64]:


get_ipython().run_line_magic('reload_ext', 'version_information')

get_ipython().run_line_magic('version_information', 'ctypes, Cython')