from IPython.display import display, Image, HTML
from talktools import website, nbviewer
IPython is an open source, interactive computing environment for Python and other languages.
website('http://ipython.org')
import ipythonproject
ipythonproject.core_devs()
Fernando Perez | Brian Granger | Min Ragan-Kelley | Thomas Kluyver |
Matthias Bussonnier | Jonathan Frederic | Paul Ivanov | Evan Patterson |
Damian Avila | Brad Froehle | Zach Sailer | Robert Kern |
Jorgen Stenarson | Jonathan March | Kyle Kelley |
Notice that the output of the above Python code is an HTML table with embedded images. IPython generalizes the notion of output to include rich formats: HTML, PNG, JPEG, PDF, JavaScript, LaTeX, etc. This means that any Python object can declare rich representations that will be rendered and saved in the notebook.
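As a minimal sketch of how an object can declare a rich representation: IPython's display machinery looks for specially named methods such as `_repr_html_` on an object. The `Temperature` class below is a hypothetical example, not part of IPython. The method is called by hand here so the snippet also runs outside a notebook.

```python
class Temperature:
    """Hypothetical example class; only the _repr_html_ name is an IPython convention."""

    def __init__(self, celsius):
        self.celsius = celsius

    def __repr__(self):
        # Plain-text fallback, used by terminals and logs.
        return "Temperature({:.1f} C)".format(self.celsius)

    def _repr_html_(self):
        # In a notebook, IPython calls this method and renders the returned HTML.
        return "<b>{:.1f}</b> &deg;C".format(self.celsius)


t = Temperature(21.5)
print(repr(t))           # what a plain terminal would show
print(t._repr_html_())   # what the notebook would render as HTML
```

In a notebook, evaluating `t` as the last expression of a cell should show the HTML form, while a plain Python shell falls back to `__repr__`.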
The IPython Notebook is a web-based interactive computing environment that spans the full range of data-related activities:
How does IPython target these different activities?
The central focus of IPython is the writing and running of code. We try to make this as pleasant as possible:
Let's download some stock data into a Pandas DataFrame and then visualize the time series using Vincent/Vega/d3.
import vincent
import pandas as pd
vincent.initialize_notebook()
import pandas.io.data as web
all_data = {}
for ticker in ['AAPL', 'GOOG', 'IBM', 'YHOO', 'MSFT']:
    all_data[ticker] = web.get_data_yahoo(ticker, '1/1/2010', '1/1/2013')
price = pd.DataFrame({tic: data['Adj Close'] for tic, data in all_data.items()})
In the Notebook, DataFrame objects are represented as formatted HTML tables:
price[0:10]
Date | AAPL | GOOG | IBM | MSFT | YHOO |
---|---|---|---|---|---|
2010-01-04 | 205.70 | 626.75 | 122.62 | 27.88 | 17.10 |
2010-01-05 | 206.05 | 623.99 | 121.14 | 27.89 | 17.23 |
2010-01-06 | 202.77 | 608.26 | 120.35 | 27.72 | 17.17 |
2010-01-07 | 202.40 | 594.10 | 119.94 | 27.43 | 16.70 |
2010-01-08 | 203.75 | 602.02 | 121.14 | 27.62 | 16.70 |
2010-01-11 | 201.95 | 601.11 | 119.87 | 27.27 | 16.74 |
2010-01-12 | 199.65 | 590.48 | 120.83 | 27.09 | 16.68 |
2010-01-13 | 202.47 | 587.09 | 120.57 | 27.34 | 16.90 |
2010-01-14 | 201.29 | 589.85 | 122.49 | 27.89 | 17.12 |
2010-01-15 | 197.93 | 580.00 | 122.00 | 27.80 | 16.82 |
line = vincent.Line(price[['GOOG', 'AAPL', 'IBM', 'YHOO', 'MSFT']], width=600, height=300)
line.axis_titles(x='Date', y='Price')
line.legend(title='Ticker')
display(line)
Data science is a multi-language activity: R, Python, Julia, Scala, and others. The IPython architecture is language agnostic.
Let's fit a linear model in R and visualize the results:
import numpy as np
X = np.array([0,1,2,3,4])
Y = np.array([3,5,4,6,7])
%load_ext rmagic
The %%R syntax tells IPython to run the rest of the cell as R code:
%%R -i X,Y -o XYcoef
XYlm = lm(Y~X)
XYcoef = coef(XYlm)
print(summary(XYlm))
par(mfrow=c(2,2))
plot(XYlm)
Call:
lm(formula = Y ~ X)

Residuals:
   1    2    3    4    5
-0.2  0.9 -1.0  0.1  0.2

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   3.2000     0.6164   5.191   0.0139 *
X             0.9000     0.2517   3.576   0.0374 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.7958 on 3 degrees of freedom
Multiple R-squared:  0.81,  Adjusted R-squared:  0.7467
F-statistic: 12.79 on 1 and 3 DF,  p-value: 0.03739
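The intercept, slope and residuals reported by R can be cross-checked with the closed-form least-squares formulas. The following is a small sketch in plain Python (no R or NumPy required), using the same X and Y values as above. The variable names are illustrative only.

```python
# Same data as the X and Y arrays passed to R above.
X = [0, 1, 2, 3, 4]
Y = [3, 5, 4, 6, 7]

n = len(X)
mean_x = sum(X) / n
mean_y = sum(Y) / n

# Closed-form simple linear regression: slope = Sxy / Sxx.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))
sxx = sum((x - mean_x) ** 2 for x in X)

slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residuals are the observed values minus the fitted values.
residuals = [y - (intercept + slope * x) for x, y in zip(X, Y)]

print("intercept:", round(intercept, 4))
print("slope:", round(slope, 4))
print("residuals:", [round(r, 1) for r in residuals])
```

Up to rounding, the printed values should agree with the Estimate column and the Residuals row in the R summary.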
The %%language syntax is an IPython-specific extension to the Python language. This "magic command syntax" lets Python code call out to a wide range of other languages (Ruby, Bash, Julia, Fortran, Perl, Octave, Matlab, etc.).
In the IPython architecture, the kernel is a separate process that runs the user's code and returns the output to the frontend (Notebook, Terminal, etc.). Kernels talk to frontends using a well-documented message protocol (JSON over ZeroMQ and WebSockets). The default kernel that ships with IPython knows how to run Python code. However, there are now kernels in other languages:
Later this year, all users of the IPython Notebook will be able to choose which type of kernel to use for each notebook.
Here is a notebook that runs code in the native Julia kernel:
website("http://nbviewer.ipython.org/url/jdj.mit.edu/~stevenj/IJulia%20Preview.ipynb")
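To make the kernel message protocol mentioned above more concrete, here is a simplified sketch of the kind of JSON message a frontend sends to ask a kernel to run code. The helper name `make_execute_request` and the user name are hypothetical. The sketch only builds and round-trips the JSON; it does not open a ZeroMQ socket, and it leaves out the framing and message signing that the real protocol adds.

```python
import json
import uuid
from datetime import datetime, timezone


def make_execute_request(code, session_id):
    """Build a simplified execute_request message as a plain dict (illustrative only)."""
    return {
        "header": {
            "msg_id": uuid.uuid4().hex,        # unique id for this message
            "session": session_id,             # id shared by all messages of one session
            "username": "demo",                # hypothetical user name
            "msg_type": "execute_request",
            "date": datetime.now(timezone.utc).isoformat(),
        },
        "parent_header": {},                   # empty: this message is not a reply
        "metadata": {},
        "content": {"code": code, "silent": False},
    }


msg = make_execute_request("1 + 1", session_id=uuid.uuid4().hex)

# Any language that can produce and parse JSON can speak to a kernel,
# which is what makes the architecture language agnostic.
wire = json.dumps(msg)
decoded = json.loads(wire)
print(decoded["header"]["msg_type"], "->", decoded["content"]["code"])
```

Because the payload is ordinary JSON, a kernel written in Julia or R only needs to parse the same structure and reply in kind.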
Notebook documents are just JSON files stored on your filesystem. These files store everything related to a computation:
Notebook documents can be shared:
Notebook documents can be viewed by anyone on the web through http://nbviewer.ipython.org
website("http://nbviewer.ipython.org")
This allows people to compose and share reproducible stories that involve code and data.
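Because notebook documents are plain JSON, they can be inspected with nothing more than a JSON parser. The sketch below assumes the nbformat 4 layout (a top-level "cells" list); notebooks written by older IPython versions use a different layout. The notebook content here is a made-up in-memory example rather than a file read from disk.

```python
import json

# A tiny notebook document, assuming the nbformat 4 layout.
notebook_text = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Stock prices"]},
        {
            "cell_type": "code",
            "execution_count": 1,
            "metadata": {},
            "outputs": [],
            "source": ["price[0:10]"],
        },
    ],
})

# Parse it back, as any tool reading an .ipynb file would.
nb = json.loads(notebook_text)

# Collect the source of every code cell.
code_sources = [
    "".join(cell["source"])
    for cell in nb["cells"]
    if cell["cell_type"] == "code"
]
print(code_sources)
```

The same approach works on a real file by replacing the in-memory string with the contents of an .ipynb file.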
Earlier this year, Randall Munroe (xkcd) published a comic about regular expression golf. Peter Norvig from Google wanted to explore some of the algorithms related to this comic and shared his explorations as a notebook on nbviewer:
website("http://nbviewer.ipython.org/url/norvig.com/ipython/xkcd1313.ipynb")