This is one of the 100 recipes of the IPython Cookbook, the definitive guide to high-performance scientific computing and data science in Python.
First, we launch four IPython engines by running the following command in a console:
ipcluster start -n 4
Then, we create a client that will act as a proxy to the IPython engines. The client automatically detects the running engines.
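# In IPython 4.0 and later, this module lives in the separate ipyparallel package (from ipyparallel import Client).
from IPython.parallel import Client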
rc = Client()
Let's check which engines are running. The ids attribute lists the engine identifiers, so its length is the number of running engines.
rc.ids
To run commands in parallel over the engines, we can use the %px magic or the %%px cell magic.
%%px
import os
print("Process {0:d}.".format(os.getpid()))
We can specify which engines to run the commands on using the --targets or -t option.
%%px -t 1,2
# The os module has already been imported in the previous cell.
print("Process {0:d}.".format(os.getpid()))
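Outside a notebook cell, the same engine selection can be sketched without magics. This is a minimal sketch, not the recipe's own code: run_on_engines is a hypothetical helper name, and it assumes that client is a connected Client instance, that indexing the client with a list of engine ids returns a view restricted to those engines, and that the view's execute method accepts a code string and a block argument.

```python
def run_on_engines(client, targets, code):
    """Run a code string on the selected engines and wait for completion.

    Hypothetical helper: `client` is assumed to be a connected Client,
    and `targets` a list of engine ids such as [1, 2].
    """
    view = client[targets]                  # view restricted to the chosen engines
    return view.execute(code, block=True)   # blocks until every target is done

# Usage, assuming `rc` is the client created earlier and the engines are running:
# ar = run_on_engines(rc, [1, 2], "import os; print(os.getpid())")
# ar.display_outputs()
```

Passing the client in as an argument keeps the helper independent of how the cluster was started.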
By default, the %px magic executes commands in blocking mode: the cell returns when the commands have completed on all engines. It is possible to run non-blocking commands with the --noblock or -a option. In this case, the cell returns immediately, and the task's status and the results can be polled asynchronously from the IPython interactive session.
%%px -a
import time
time.sleep(5)
The previous command returned an AsyncResult instance that we can use to poll the task's status.
print(_.elapsed, _.ready())
The %pxresult magic blocks until the task finishes, then displays its output.
%pxresult
print(_.elapsed, _.ready())
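The submit, poll, and collect pattern above can be tried without a running cluster. The sketch below uses the standard library's multiprocessing.pool.ThreadPool as a stand-in for the engines. Its apply_async also returns an object with ready() and get() methods, so the hypothetical helper poll_until_ready should work unchanged on any result object exposing those two methods. The function slow_square and the timings are illustrative assumptions.

```python
import time
from multiprocessing.pool import ThreadPool

def poll_until_ready(async_result, interval=0.05, timeout=10.0):
    """Poll ready() until the task is done, then return its result.

    Hypothetical helper: works with any object exposing ready() and get().
    """
    deadline = time.monotonic() + timeout
    while not async_result.ready():
        if time.monotonic() > deadline:
            raise TimeoutError("task did not finish in time")
        time.sleep(interval)        # the caller could do other work here instead
    return async_result.get()

def slow_square(x):
    time.sleep(0.2)                 # simulate a long-running task
    return x * x

pool = ThreadPool(processes=1)      # stand-in for a single engine
pending = pool.apply_async(slow_square, (7,))   # returns immediately
value = poll_until_ready(pending)
pool.close()
pool.join()
print(value)                        # 49
```

Polling with a timeout avoids blocking the interactive session indefinitely if a task hangs.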
IPython provides convenient functions for the most common use cases, such as a parallel map function.
v = rc[:]
res = v.map(lambda x: x*x, range(10))
print(res.get())
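A parallel map returns its results in the order of the inputs, so the output matches what a serial map would produce. The sketch below illustrates this with the standard library's concurrent.futures.ThreadPoolExecutor as a stand-in for the engines. It is not the IPython API, and the four worker threads are only an assumption mirroring the four engines launched earlier.

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    return x * x

# Four worker threads stand in for the four engines (illustrative assumption).
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(square, range(10)))

serial = [square(x) for x in range(10)]

print(parallel)             # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
print(parallel == serial)   # True
```

A named function is used instead of a lambda because it also works with process-based pools, which need to pickle the function.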
You'll find all the explanations, figures, references, and much more in the book (to be released later this summer).
IPython Cookbook, by Cyrille Rossant, Packt Publishing, 2014 (500 pages).