To view this notebook properly, you need to disable the Mixed Content Blocking warning in your browser.
Get data (simulation, experiment control)
Manipulate and process data.
Visualize results... to understand what we are doing!
Communicate results: produce figures for reports or publications, write presentations.
IPython provides a rich architecture for interactive computing with:
Powerful interactive shells (terminal and Qt-based).
A browser-based notebook with support for code, text, mathematical expressions, inline plots and other rich media.
Support for interactive data visualization and use of GUI toolkits.
Flexible, embeddable interpreters to load into your own projects.
Easy to use, high performance tools for parallel computing.
%%bash
ls -lh ~/ | head -n 3
total 34M
-rw-rw-r-- 1 rmyeid rmyeid 69K Aug 31 22:11 aapl_ohlc.csv
-rw-rw-r-- 1 rmyeid rmyeid 13K Apr 23 17:14 Download.pdf
!uname -a
Linux einstein 3.13.0-24-generic #47-Ubuntu SMP Fri May 2 23:30:00 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
No need to introduce any special directive.
print("This is Python!")
This is Python!
def fact(n):
    if n <= 0:
        return 1
    return n*fact(n-1)
fact(20)
2432902008176640000
%%ruby
puts 'This is Ruby playing with Python!!!'
This is Ruby playing with Python!!!
You need to change the cell type to Markdown.
or
from IPython.display import IFrame
IFrame('http://nbviewer.ipython.org/', width='100%', height=350)
This is not a Python tutorial. We trust that you can pick up the language quickly by following any of these resources:
$ sudo apt-get install python-numpy python-scipy
$ sudo apt-get install python-scikits-learn python-pandas
$ sudo apt-get install python-nltk python-sympy python-pip
$ sudo pip install ipython
$ sudo pip install bokeh
This is harder in general, but you can use Homebrew, MacPorts, or the Enthought or Anaconda Python distributions (see the Windows instructions). Here is a Mac-specific tutorial.
$ brew install python
$ pip install virtualenv virtualenvwrapper
$ pip install numpy
$ brew install gfortran
$ pip install scipy
$ brew install freetype
$ pip install matplotlib
$ pip install ipython bokeh
Windows lacks a good packaging system, so the easiest way to set up a Python environment is to install a pre-packaged distribution. Some good alternatives are:
EPD and Anaconda CE are also available for Linux and Mac OS X.
%install_ext http://raw.github.com/jrjohansson/version_information/master/version_information.py
%load_ext version_information
%version_information numpy, scipy, matplotlib, sympy, scikit_learn, nltk, pandas
Installed version_information.py. To use it, type: %load_ext version_information
Software | Version |
---|---|
Python | 2.7.6 (default, Mar 22 2014, 22:59:56) [GCC 4.8.2] |
IPython | 2.2.0 |
OS | posix [linux2] |
numpy | 1.8.1 |
scipy | 0.13.3 |
matplotlib | 1.3.1 |
sympy | sympy |
scikit_learn | 0.15.1 |
nltk | 2.0.4 |
pandas | 0.13.1 |
Tue Sep 02 14:53:24 2014 EDT |
NumPy is the fundamental package for scientific computing with Python. It contains among other things:
A powerful N-dimensional array object
Sophisticated (broadcasting) functions
Tools for integrating C/C++ and Fortran code
Useful linear algebra, Fourier transform, and random number capabilities
import numpy as np
from __future__ import print_function
We can initialize arrays from Python lists or list of lists.
# a vector: the argument to the array function is a Python list
v = np.array([11, 12, 13, 14])
print('v =\n{}'.format(v))
# a matrix: the argument to the array function is a nested Python list
M = np.array([[2, 1], [3, 4]])
print('M =\n{}'.format(M))
print (type(v), type(M))
v =
[11 12 13 14]
M =
[[2 1]
 [3 4]]
<type 'numpy.ndarray'> <type 'numpy.ndarray'>
The array object has many useful attributes, like:
Also many operations are available:
print("v shape is {}".format(v.shape))
print("M shape is {}".format(M.shape))
print("Data type of v is {}".format(v.dtype))
print()
print("M transpose =\n{}".format(M.T))
print()
M.sort(axis=1)
print("M sorted by row =\n{}".format(np.asarray(M)))
print()
print("v stats are mean = {}, standard deviation = {:.4}, max = {}, min ={}".format(v.mean(), v.std(), v.max(), v.min()))
print()
print("Converting matrix M to a vector {}".format(M.flatten()))
print("Converting vector v to a matrix=\n{}".format(v.reshape(2,2)))
print()
print("M matrix size is {} and number of dimensions is {}".format(M.size, M.ndim))
v shape is (4,)
M shape is (2, 2)
Data type of v is int64

M transpose =
[[2 3]
 [1 4]]

M sorted by row =
[[1 2]
 [3 4]]

v stats are mean = 12.5, standard deviation = 1.118, max = 14, min =11

Converting matrix M to a vector [1 2 3 4]
Converting vector v to a matrix=
[[11 12]
 [13 14]]

M matrix size is 4 and number of dimensions is 2
Arrays can also be initialized from sequences of numbers or from random numbers.
x = np.arange(0, 10, 1) # arguments: start, stop, step
print("Create a range\n{}".format(x))
print()
# using linspace, both end points ARE included
x = np.linspace(0, 10, 41)
print("Create a spaced range\n{}".format(x))
print()
# uniform random numbers in [0, 1)
x = np.random.rand(4,4)
print("Create a uniform random matrix (4,4)\n{}".format(x))
print()
# a diagonal matrix
x = np.diag([1,2,3])
print("Create a diagonal matrix\n{}".format(x))
print()
x = np.zeros((3,3))
print("Create a zero matrix (3,3) \n{}".format(x))
Create a range
[0 1 2 3 4 5 6 7 8 9]

Create a spaced range
[  0.     0.25   0.5    0.75   1.     1.25   1.5    1.75   2.     2.25
   2.5    2.75   3.     3.25   3.5    3.75   4.     4.25   4.5    4.75
   5.     5.25   5.5    5.75   6.     6.25   6.5    6.75   7.     7.25
   7.5    7.75   8.     8.25   8.5    8.75   9.     9.25   9.5    9.75  10.  ]

Create a uniform random matrix (4,4)
[[ 0.31281701  0.09433053  0.04063735  0.61049305]
 [ 0.9849427   0.03703267  0.79554371  0.48745227]
 [ 0.46420451  0.17958533  0.32115069  0.33124119]
 [ 0.96986526  0.74728439  0.93486614  0.24787973]]

Create a diagonal matrix
[[1 0 0]
 [0 2 0]
 [0 0 3]]

Create a zero matrix (3,3)
[[ 0.  0.  0.]
 [ 0.  0.  0.]
 [ 0.  0.  0.]]
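The end-point behaviour of the two range constructors is easy to mix up. The following is a minimal sketch with illustrative variable names; it assumes NumPy 1.17 or later for `np.random.default_rng`, which is newer than the version this notebook was run with.

```python
import numpy as np

# arange excludes the stop value: 0, 0.25, 0.5, 0.75
a = np.arange(0, 1, 0.25)

# linspace includes both end points: 5 points from 0 to 1
b = np.linspace(0, 1, 5)

# A seeded generator makes random initialization reproducible between runs.
rng = np.random.default_rng(seed=0)
r = rng.random((2, 3))  # uniform samples in [0, 1)

print("arange:   {}".format(a))
print("linspace: {}".format(b))
print("random shape: {}".format(r.shape))
```

When the number of points matters more than the exact step, `linspace` is usually the safer choice, because a floating-point step in `arange` can make the length of the result hard to predict.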
ndarrays can be indexed using the standard Python $\mathbf{x}$[obj] syntax, where $\mathbf{x}$ is the array and obj the selection. There are three kinds of indexing available: record access, basic slicing, advanced indexing. Which one occurs depends on obj.
print("v[0] = {}\n".format(v[0]))
print("M =\n{}\n".format(M))
print("M[1, 1] = {}\n".format(M[1,1]))
print("M[1] = {}\n".format(M[1]))
print("M[1, :] = {}\n".format(M[1, :]))
print("M[:, 1] = {}\n".format(M[:, 1]))
print("M[1, :] = 0")
M[1, :] = 0
print("M =\n{}\n".format(M))
v[0] = 11

M =
[[1 2]
 [3 4]]

M[1, 1] = 4

M[1] = [3 4]

M[1, :] = [3 4]

M[:, 1] = [2 4]

M[1, :] = 0
M =
[[1 2]
 [0 0]]
A = np.array([[n+m*10 for n in range(5)] for m in range(5)])
print("A =\n{}\n".format(A))
print("A[1:4, 1:4]=\n{}\n".format(A[1:4, 1:4]))
print("A[::2, ::2]=\n{}\n".format(A[::2, ::2]))
print("A[ [1,4] ]=\n{}\n".format(A[[1,4]]))
print("A[ [1,4], [2,-1] ]=\n{}\n".format(A[[1,4],[2,-1]]))
A =
[[ 0  1  2  3  4]
 [10 11 12 13 14]
 [20 21 22 23 24]
 [30 31 32 33 34]
 [40 41 42 43 44]]

A[1:4, 1:4]=
[[11 12 13]
 [21 22 23]
 [31 32 33]]

A[::2, ::2]=
[[ 0  2  4]
 [20 22 24]
 [40 42 44]]

A[ [1,4] ]=
[[10 11 12 13 14]
 [40 41 42 43 44]]

A[ [1,4], [2,-1] ]=
[12 44]
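The three kinds of indexing behave differently with respect to memory, which the examples so far do not show. The sketch below is an illustration under assumed names (`block`, `rows`, and the `people` record array are hypothetical): basic slicing returns a view, advanced indexing returns a copy, and record access selects a named field of a structured array.

```python
import numpy as np

A = np.arange(25).reshape(5, 5)

# Basic slicing returns a view: writing to it changes A.
block = A[1:3, 1:3]
block[0, 0] = -1
print("A[1, 1] after writing to the slice: {}".format(A[1, 1]))

# Advanced (list) indexing returns a copy: A is left untouched.
rows = A[[0, 4]]
rows[0, 0] = 99
print("A[0, 0] after writing to the copy:  {}".format(A[0, 0]))

# Record access: select a field of a structured array by name.
people = np.array([("ada", 36), ("bob", 41)],
                  dtype=[("name", "U10"), ("age", "i4")])
ages = people["age"]
print("ages = {}".format(ages))
```

If an independent array is needed from a slice, calling `.copy()` on it avoids accidental modification of the original.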
A = np.array([[n+m*10 for n in range(5)] for m in range(5)])
print("A =\n{}\n".format(A))
print("A > 20 =\n{}\n".format(A > 20))
print("np.where(A > 20) =\n{}\n".format(np.where(A > 20)))
print("np.argwhere(A > 20) =\n{}\n".format(np.argwhere(A > 20)))
print("A - 10 =\n{}\n".format(A - 10))
print("A * 10 =\n{}\n".format(A * 10))
print("A * A =\n{}\n".format(A * A))
A =
[[ 0  1  2  3  4]
 [10 11 12 13 14]
 [20 21 22 23 24]
 [30 31 32 33 34]
 [40 41 42 43 44]]

A > 20 =
[[False False False False False]
 [False False False False False]
 [False  True  True  True  True]
 [ True  True  True  True  True]
 [ True  True  True  True  True]]

np.where(A > 20) =
(array([2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4]), array([1, 2, 3, 4, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4]))

np.argwhere(A > 20) =
[[2 1]
 [2 2]
 [2 3]
 [2 4]
 [3 0]
 [3 1]
 [3 2]
 [3 3]
 [3 4]
 [4 0]
 [4 1]
 [4 2]
 [4 3]
 [4 4]]

A - 10 =
[[-10  -9  -8  -7  -6]
 [  0   1   2   3   4]
 [ 10  11  12  13  14]
 [ 20  21  22  23  24]
 [ 30  31  32  33  34]]

A * 10 =
[[  0  10  20  30  40]
 [100 110 120 130 140]
 [200 210 220 230 240]
 [300 310 320 330 340]
 [400 410 420 430 440]]

A * A =
[[   0    1    4    9   16]
 [ 100  121  144  169  196]
 [ 400  441  484  529  576]
 [ 900  961 1024 1089 1156]
 [1600 1681 1764 1849 1936]]
print("np.linalg.det(A) = {}\n".format(np.linalg.det(A)))
np.linalg.det(A) = 0.0
try:
    print("np.linalg.inv(A) = {}\n".format(np.linalg.inv(A)))
except np.linalg.LinAlgError as e:
    print("Matrix is singular")
Matrix is singular
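The matrix is singular because its rows are linearly dependent: every row equals the first row plus a multiple of the all-ones vector. A small sketch of how one might confirm this, using `np.linalg.matrix_rank` and, as one possible fallback when no inverse exists, the pseudo-inverse from `np.linalg.pinv` (the variable names are illustrative):

```python
import numpy as np

A = np.array([[n + m * 10 for n in range(5)] for m in range(5)])

# Only two rows are linearly independent, so the rank is 2 rather than 5.
rank = np.linalg.matrix_rank(A)
print("rank(A) = {} of {}".format(rank, A.shape[0]))

# The Moore-Penrose pseudo-inverse is defined even for singular matrices.
A_pinv = np.linalg.pinv(A)

# It satisfies A * A_pinv * A == A up to floating-point error.
print(np.allclose(A.dot(A_pinv).dot(A), A))
```

Checking the rank is generally more robust than comparing a floating-point determinant to zero.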
To calculate $||\mathbf{v}||_2 = \sqrt{\mathbf{v}^T \mathbf{v}}$ if $\mathbf{v} \in \mathbb{R}^d$
v = np.arange(5)
print("v = {}\n".format(v))
print("||v|| = np.linalg.norm(v) = {}\n".format(np.linalg.norm(v)))
print("np.dot(v.T, v) = {}\n".format(np.dot(v.T, v)))
print("np.dot(v.T, v) ** 0.5 = {}".format(np.dot(v.T, v) ** 0.5))
v = [0 1 2 3 4]

||v|| = np.linalg.norm(v) = 5.47722557505

np.dot(v.T, v) = 30

np.dot(v.T, v) ** 0.5 = 5.47722557505
To calculate $\mathbf{v} \mathbf{v}^T \in \mathbb{R}^{d \times d}$
print("v.shape = {}".format(v.shape))
u = v[:, np.newaxis]
print("u = v[:, np.newaxis] =\n{}\n".format(u))
print("u.shape = {}".format(u.shape))
print("np.dot(u, u.T) =\n{}\n".format(np.dot(u, u.T)))
#print("np.linalg.inv(")
v.shape = (5,)
u = v[:, np.newaxis] =
[[0]
 [1]
 [2]
 [3]
 [4]]

u.shape = (5, 1)
np.dot(u, u.T) =
[[ 0  0  0  0  0]
 [ 0  1  2  3  4]
 [ 0  2  4  6  8]
 [ 0  3  6  9 12]
 [ 0  4  8 12 16]]
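The same outer product can be obtained without building the column vector by hand. As a brief sketch, `np.outer` and plain broadcasting give the same result as the `np.dot(u, u.T)` call (the names `outer_dot`, `outer_fn`, and `outer_bc` are only for illustration):

```python
import numpy as np

v = np.arange(5)
u = v[:, np.newaxis]        # column vector, shape (5, 1)

outer_dot = np.dot(u, u.T)  # explicit matrix product of (5, 1) and (1, 5)
outer_fn = np.outer(v, v)   # dedicated outer-product function
outer_bc = u * v            # broadcasting (5, 1) against (5,) gives (5, 5)

print(np.array_equal(outer_dot, outer_fn))
print(np.array_equal(outer_dot, outer_bc))
```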
You can apply arithmetic along specific dimensions, such as dividing each column by a specific value. You can also aggregate quantities, such as sums, over specific dimensions.
A = np.random.randint(0, 100, (4, 5))
v = np.arange(5) + 1.
u = np.arange(4) + 2.
print("A =\n{}\n".format(A))
print("A.max() = {}".format(A.max()))
print("A.max(axis=0) = {}".format(A.max(axis=0)))
print("A.min(axis=1) = {}".format(A.min(axis=1)))
print()
print("v = {}".format(v))
print("A / v =\n{}\n".format(A/v))
print()
print("u = {}".format(u))
print("(A.T - u).T =\n{}\n".format((A.T-u).T))
print("np.diff(A, axis=0) =\n{}\n".format(np.diff(A, axis=0)))
print("np.cumsum(A, axis=1) =\n{}\n".format(np.cumsum(A, axis=1)))
A =
[[51 98 56 95 17]
 [32 24  8 87  8]
 [ 5 95  1 17 45]
 [22  8 63 97 20]]

A.max() = 98
A.max(axis=0) = [51 98 63 97 45]
A.min(axis=1) = [17  8  1  8]

v = [ 1.  2.  3.  4.  5.]
A / v =
[[ 51.          49.          18.66666667  23.75         3.4       ]
 [ 32.          12.           2.66666667  21.75         1.6       ]
 [  5.          47.5          0.33333333   4.25         9.        ]
 [ 22.           4.          21.          24.25         4.        ]]

u = [ 2.  3.  4.  5.]
(A.T - u).T =
[[ 49.  96.  54.  93.  15.]
 [ 29.  21.   5.  84.   5.]
 [  1.  91.  -3.  13.  41.]
 [ 17.   3.  58.  92.  15.]]

np.diff(A, axis=0) =
[[-19 -74 -48  -8  -9]
 [-27  71  -7 -70  37]
 [ 17 -87  62  80 -25]]

np.cumsum(A, axis=1) =
[[ 51 149 205 300 317]
 [ 32  56  64 151 159]
 [  5 100 101 118 163]
 [ 22  30  93 190 210]]
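The double transpose in `(A.T - u).T` works, but the per-row operation can also be written directly with broadcasting. The sketch below uses a small deterministic matrix instead of random data so that the result is reproducible; `via_newaxis` and `row_fractions` are illustrative names.

```python
import numpy as np

A = np.arange(20).reshape(4, 5)
u = np.arange(4) + 2.0   # one value per row
v = np.arange(5) + 1.0   # one value per column

# Per-column arithmetic: v has shape (5,), which lines up with A's last axis.
col_scaled = A / v

# Per-row arithmetic: give u shape (4, 1) so it lines up with A's first axis.
via_transpose = (A.T - u).T
via_newaxis = A - u[:, np.newaxis]
print(np.array_equal(via_transpose, via_newaxis))

# keepdims=True keeps the reduced axis, so the result broadcasts back onto A.
row_sums = A.sum(axis=1, keepdims=True)   # shape (4, 1)
row_fractions = A / row_sums
print(row_fractions.sum(axis=1))
```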
matplotlib is a Python 2D plotting library that produces publication-quality figures in a variety of hardcopy formats and interactive environments across platforms. matplotlib can be used in Python scripts, the Python and IPython shells, web application servers, and six graphical user interface toolkits.
Best practice to import matplotlib
%matplotlib inline
import matplotlib.pyplot as plt
x = np.linspace(0, 5, 10)
y = x ** 2
fig, ax = plt.subplots()
ax.plot(x, x**2, label="$y = x^2$")
ax.plot(x, x**3, label="$y = x^3$")
ax.legend(loc=2); # upper left corner
ax.set_xlabel('x')
ax.set_ylabel('y', fontsize=38)
ax.set_title('Advertise Here');
xx = np.linspace(-0.75, 1., 100)
n = np.array([0,1,2,3,4,5])
fig, axes = plt.subplots(1, 4, figsize=(12,3))
axes[0].scatter(xx, xx + 0.25*np.random.randn(len(xx)))
axes[0].set_title("scatter")
axes[1].step(n, n**2, lw=2)
axes[1].set_title("step")
axes[2].bar(n, n**2, align="center", width=0.5, alpha=0.5)
axes[2].set_title("bar")
axes[3].fill_between(x, x**2, x**3, color="green", alpha=0.5);
axes[3].set_title("fill_between");
# A histogram
n = np.random.randn(100000)
fig, axes = plt.subplots(1, 2, figsize=(12,4))
axes[0].hist(n)
axes[0].set_title("Default histogram")
axes[0].set_xlim((min(n), max(n)))
axes[1].hist(n, cumulative=True, bins=50)
axes[1].set_title("Cumulative detailed histogram")
axes[1].set_xlim((min(n), max(n)));
from mpl_toolkits.mplot3d.axes3d import Axes3D
alpha = 0.7
phi_ext = 2 * np.pi * 0.5
def flux_qubit_potential(phi_m, phi_p):
    return 2 + alpha - 2 * np.cos(phi_p)*np.cos(phi_m) - alpha * np.cos(phi_ext - 2*phi_p)
phi_m = np.linspace(0, 2*np.pi, 100)
phi_p = np.linspace(0, 2*np.pi, 100)
X,Y = np.meshgrid(phi_p, phi_m)
Z = flux_qubit_potential(X, Y).T
fig = plt.figure(figsize=(8,6))
ax = fig.add_subplot(1,1,1, projection='3d')
ax.plot_surface(X, Y, Z, rstride=4, cstride=4, alpha=0.25)
cset = ax.contour(X, Y, Z, zdir='z', offset=-np.pi, cmap=plt.cm.coolwarm)
cset = ax.contour(X, Y, Z, zdir='x', offset=-np.pi, cmap=plt.cm.coolwarm)
cset = ax.contour(X, Y, Z, zdir='y', offset=3*np.pi, cmap=plt.cm.coolwarm)
ax.set_xlim3d(-np.pi, 2*np.pi);
ax.set_ylim3d(0, 3*np.pi);
ax.set_zlim3d(-np.pi, 2*np.pi);
To change the styling of your matplotlib figures, you have several options:
Use the mpltools package.
Change matplotlib.rcParams values.
Change matplotlibrc default values. Here is my personalized configuration file.
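As a minimal sketch of the `rcParams` option, `matplotlib.rc_context` applies settings temporarily and restores the previous values afterwards. The specific settings and the `Agg` backend below are only illustrative choices, not the configuration used in this notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt

default_lw = matplotlib.rcParams["lines.linewidth"]

# Settings given to rc_context only apply inside the with-block.
with matplotlib.rc_context({"lines.linewidth": 3.0, "axes.grid": True}):
    fig, ax = plt.subplots()
    (line,) = ax.plot([0, 1, 2], [0, 1, 4])
    styled_lw = line.get_linewidth()

plt.close(fig)
restored_lw = matplotlib.rcParams["lines.linewidth"]

print("inside: {}, restored: {}".format(styled_lw, restored_lw))
```

Assigning to `matplotlib.rcParams[...]` directly has the same effect but lasts for the rest of the session.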
import prettyplotlib as ppl
import matplotlib as mpl
np.random.seed(12)
fig, ax = plt.subplots(1)
# Show the whole color range
for i in range(8):
    x = np.random.normal(loc=i, size=1000)
    y = np.random.normal(loc=i, size=1000)
    ppl.scatter(ax, x, y, label=str(i))
ppl.legend(ax)
_ = ax.set_title('prettyplotlib `scatter` example\nshowing default color cycle and scatter params')
from IPython.display import IFrame
IFrame('http://matplotlib.org/gallery.html#lines_bars_and_markers', width='100%', height=550)
The mpld3 project brings together Matplotlib, and D3js, the popular Javascript library for creating interactive data visualizations for the web. The result is a simple API for exporting your matplotlib graphics to HTML code which can be used within the browser, within standard web pages, blogs, or tools such as the IPython notebook.
import mpld3
mpld3.enable_notebook()
np.random.seed(0)
P = np.random.random(size=10)
A = np.random.random(size=10)
x = np.linspace(0, 10, 100)
data = np.array([[x, Ai * np.sin(x / Pi)]
for (Ai, Pi) in zip(A, P)])
fig, ax = plt.subplots(2)
points = ax[1].scatter(P, A, c=P + A,
s=200, alpha=0.5)
ax[1].set_xlabel('Period')
ax[1].set_ylabel('Amplitude')
colors = plt.cm.ScalarMappable().to_rgba(P + A)
for (x, l), c in zip(data, colors):
    ax[0].plot(x, l, c=c, alpha=0.5, lw=3)
mpld3.disable_notebook()
Bokeh is a Python interactive visualization library that targets modern web browsers for presentation. Its goal is to provide elegant, concise construction of novel graphics in the style of D3.js, but also deliver this capability with high-performance interactivity over very large or streaming datasets. Bokeh can help anyone who would like to quickly and easily create interactive plots, dashboards, and data applications.
import bokeh
try:
    from bokeh.sampledata import us_counties, unemployment
except Exception:
    # The sample data is not available yet: download it, then retry the import.
    bokeh.sampledata.download()
    from bokeh.sampledata import us_counties, unemployment
from bokeh.plotting import *
colors = ["#F1EEF6", "#D4B9DA", "#C994C7", "#DF65B0", "#DD1C77", "#980043"]
county_xs=[
    us_counties.data[code]['lons'] for code in us_counties.data
    if us_counties.data[code]['state'] == 'tx'
]
county_ys=[
    us_counties.data[code]['lats'] for code in us_counties.data
    if us_counties.data[code]['state'] == 'tx'
]
county_colors = []
for county_id in us_counties.data:
    if us_counties.data[county_id]['state'] != 'tx':
        continue
    try:
        rate = unemployment.data[county_id]
        idx = min(int(rate/2), 5)
        county_colors.append(colors[idx])
    except KeyError:
        county_colors.append("black")
output_notebook()
patches(county_xs, county_ys, fill_color=county_colors, fill_alpha=0.7,
line_color="white", line_width=0.5, title="Texas Unemployment 2009")
show()
from IPython.display import IFrame
IFrame('http://bokeh.pydata.org/docs/gallery.html', width='100%', height=550)
from IPython.html.widgets import interact, RadioButtonsWidget, IntSliderWidget, TextWidget
def plot_sine(freq):
    x = np.linspace(-np.pi, np.pi, num=1000)
    plt.plot(x, np.sin(2*np.pi*freq*x))
interact(plot_sine, freq=(1, 10, 0.5))
<function __main__.plot_sine>
def plot_sine2(amplitude, color, title):
    fig, ax = plt.subplots(figsize=(4, 3),
                           subplot_kw={'axisbg':'#EEEEEE',
                                       'axisbelow':True})
    ax.grid(color='w', linewidth=2, linestyle='solid')
    x = np.linspace(0, 10, 1000)
    ax.plot(x, amplitude * np.sin(x), color=color,
            lw=5, alpha=0.4)
    ax.set_xlim(0, 10)
    ax.set_ylim(-10.1, 10.1)
    ax.set_title(title)
    return fig
interact(plot_sine2,
         amplitude=IntSliderWidget(min=0, max=10, step=1,value=1),
         color=RadioButtonsWidget(values=['blue', 'green', 'red']),
         title=TextWidget(value="Advertise here"))
<function __main__.plot_sine2>
from IPython.display import IFrame
IFrame('https://plot.ly/feed', width='100%', height=550)
pandas is a library for data manipulation and analysis:
Data structures: TimeSeries and DataFrame
An integrated group by engine for aggregating and transforming data sets
Input/Output tools: loading tabular data from flat files (CSV, delimited, Excel 2003), and saving and loading pandas objects from the fast and efficient PyTables/HDF5 format.
Memory-efficient "sparse" versions of the standard data structures for storing data that is mostly missing or mostly constant (some fixed value)
Moving window statistics (rolling mean, rolling standard deviation, etc.)
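To make the last item concrete, a moving-window mean averages each value with its predecessors inside a fixed-size window. The sketch below uses made-up prices and assumes a recent pandas, where the method form `Series.rolling(window).mean()` is available; the manual NumPy version is included only to show what is being computed.

```python
import numpy as np
import pandas as pd

# Hypothetical price series, for illustration only.
prices = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0, 15.0])
window = 3

# The first window - 1 entries are NaN because the window is not yet full.
mavg = prices.rolling(window).mean()

# The same averages computed by hand: convolve with a flat kernel.
manual = np.convolve(prices.values, np.ones(window) / window, mode="valid")

print(mavg)
print(manual)
```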
import pandas as pd
from pandas import Series, DataFrame
labels = ['a', 'b', 'c', 'd', 'e']
s = Series([1, 2, 3, 4, 5], index=labels)
s
a    1
b    2
c    3
d    4
e    5
dtype: int64
print("'b' in s = {}".format('b' in s))
print(" s['b'] = {}".format(s['b']))
'b' in s = True
 s['b'] = 2
mapping = s.to_dict()
mapping
{'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5}
Series(mapping)
a    1
b    2
c    3
d    4
e    5
dtype: int64
import pandas.io.data
import datetime
aapl = pd.io.data.get_data_yahoo('AAPL',
start=datetime.datetime(2006, 10, 1),
end=datetime.datetime(2012, 1, 1))
aapl.head()
Date | Open | High | Low | Close | Volume | Adj Close |
---|---|---|---|---|---|---|
2006-10-02 | 75.10 | 75.87 | 74.30 | 74.86 | 178159800 | 10.17 |
2006-10-03 | 74.45 | 74.95 | 73.19 | 74.08 | 197677200 | 10.07 |
2006-10-04 | 74.10 | 75.46 | 73.16 | 75.38 | 207270700 | 10.24 |
2006-10-05 | 74.53 | 76.16 | 74.13 | 74.83 | 170970800 | 10.17 |
2006-10-06 | 74.42 | 75.04 | 73.81 | 74.22 | 116739700 | 10.08 |
5 rows × 6 columns
aapl.to_csv('aapl_ohlc.csv')
!head aapl_ohlc.csv
Date,Open,High,Low,Close,Volume,Adj Close
2006-10-02,75.1,75.87,74.3,74.86,178159800,10.17
2006-10-03,74.45,74.95,73.19,74.08,197677200,10.07
2006-10-04,74.1,75.46,73.16,75.38,207270700,10.24
2006-10-05,74.53,76.16,74.13,74.83,170970800,10.17
2006-10-06,74.42,75.04,73.81,74.22,116739700,10.08
2006-10-09,73.8,75.08,73.53,74.63,109555600,10.14
2006-10-10,74.54,74.58,73.08,73.81,132897100,10.03
2006-10-11,73.42,73.98,72.6,73.23,142963800,9.95
2006-10-12,73.61,75.39,73.6,75.26,148213800,10.23
Reading a CSV file:
df = pd.read_csv('aapl_ohlc.csv', index_col='Date', parse_dates=True)
df.head()
Date | Open | High | Low | Close | Volume | Adj Close |
---|---|---|---|---|---|---|
2006-10-02 | 75.10 | 75.87 | 74.30 | 74.86 | 178159800 | 10.17 |
2006-10-03 | 74.45 | 74.95 | 73.19 | 74.08 | 197677200 | 10.07 |
2006-10-04 | 74.10 | 75.46 | 73.16 | 75.38 | 207270700 | 10.24 |
2006-10-05 | 74.53 | 76.16 | 74.13 | 74.83 | 170970800 | 10.17 |
2006-10-06 | 74.42 | 75.04 | 73.81 | 74.22 | 116739700 | 10.08 |
5 rows × 6 columns
df.index
<class 'pandas.tseries.index.DatetimeIndex'> [2006-10-02, ..., 2011-12-30] Length: 1323, Freq: None, Timezone: None
df[['Open', 'Close']].head()
Date | Open | Close |
---|---|---|
2006-10-02 | 75.10 | 74.86 |
2006-10-03 | 74.45 | 74.08 |
2006-10-04 | 74.10 | 75.38 |
2006-10-05 | 74.53 | 74.83 |
2006-10-06 | 74.42 | 74.22 |
5 rows × 2 columns
print(type(df['Open']))
print(type(df[['Open', 'Close']]))
<class 'pandas.core.series.Series'> <class 'pandas.core.frame.DataFrame'>
df['diff'] = df.Open - df.Close
df.head()
Date | Open | High | Low | Close | Volume | Adj Close | diff |
---|---|---|---|---|---|---|---|
2006-10-02 | 75.10 | 75.87 | 74.30 | 74.86 | 178159800 | 10.17 | 0.24 |
2006-10-03 | 74.45 | 74.95 | 73.19 | 74.08 | 197677200 | 10.07 | 0.37 |
2006-10-04 | 74.10 | 75.46 | 73.16 | 75.38 | 207270700 | 10.24 | -1.28 |
2006-10-05 | 74.53 | 76.16 | 74.13 | 74.83 | 170970800 | 10.17 | -0.30 |
2006-10-06 | 74.42 | 75.04 | 73.81 | 74.22 | 116739700 | 10.08 | 0.20 |
5 rows × 7 columns
close_px = df['Adj Close']
mavg = pd.rolling_mean(close_px, 40)
close_px.plot(label='AAPL')
mavg.plot(label='mavg')
plt.legend(loc='best')
<matplotlib.legend.Legend at 0x7f1d5d265c50>
df = pd.io.data.get_data_yahoo(['AAPL', 'Googl', 'GE', 'IBM', 'KO', 'MSFT', 'PEP'],
start=datetime.datetime(2010, 1, 1),
end=datetime.datetime(2013, 1, 1))['Adj Close']
rets = df.pct_change()
df.head()
Date | AAPL | GE | Googl | IBM | KO | MSFT | PEP |
---|---|---|---|---|---|---|---|
2010-01-04 | 29.08 | 13.33 | 313.69 | 121.19 | 25.02 | 27.31 | 53.54 |
2010-01-05 | 29.13 | 13.40 | 312.31 | 119.73 | 24.72 | 27.32 | 54.19 |
2010-01-06 | 28.66 | 13.33 | 304.43 | 118.95 | 24.71 | 27.15 | 53.64 |
2010-01-07 | 28.61 | 14.02 | 297.35 | 118.54 | 24.65 | 26.87 | 53.30 |
2010-01-08 | 28.80 | 14.32 | 301.31 | 119.73 | 24.19 | 27.05 | 53.13 |
5 rows × 7 columns
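The `rets = df.pct_change()` call computes day-over-day returns, that is, $(p_t - p_{t-1}) / p_{t-1}$ for each column. A small sketch with made-up prices (the column names `X` and `Y` are hypothetical) shows the equivalence with an explicit shift:

```python
import pandas as pd

# Hypothetical prices for two tickers, for illustration only.
prices = pd.DataFrame({"X": [100.0, 110.0, 99.0],
                       "Y": [50.0, 50.0, 55.0]})

# Built-in percentage change between consecutive rows.
rets = prices.pct_change()

# The same quantity written out: current price over previous price, minus one.
manual = prices / prices.shift(1) - 1

print(rets)
```

The first row is NaN because there is no previous price to compare against, which is why functions such as `corr` skip missing values when they are applied to returns.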
_ = pd.scatter_matrix(rets, diagonal='kde', figsize=(10, 10))
corr = rets.corr()
plt.imshow(corr, cmap='hot', interpolation='none')
plt.colorbar()
plt.xticks(range(len(corr)), corr.columns)
plt.yticks(range(len(corr)), corr.columns);
NLTK is a library for working with English-language text. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning. It also has an active discussion forum.
import nltk
Tokenization identifies sentence and word boundaries.
nltk.download("punkt")
[nltk_data] Downloading package 'punkt' to /home/rmyeid/nltk_data... [nltk_data] Package punkt is already up-to-date!
/usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters 
/usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters 
/usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters 
/usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters 
/usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters 
/usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters 
/usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters 
/usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters 
/usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters /usr/local/lib/python2.7/dist-packages/nltk/__init__.py:682: DeprecationWarning: object() takes no parameters
True
sentences = """This is Rami. At eight o'clock on Thursday morning James Arthur didn't feel very good."""
sents = nltk.sent_tokenize(sentences)
sents
['This is Rami.', "At eight o'clock on Thursday morning James Arthur didn't feel very good."]
words = nltk.word_tokenize(sents[1])
words
['At', 'eight', "o'clock", 'on', 'Thursday', 'morning', 'James', 'Arthur', 'did', "n't", 'feel', 'very', 'good', '.']
Part-of-speech tagging classifies words into categories such as nouns (NN), verbs (VB), and adjectives (JJ).
nltk.download("maxent_treebank_pos_tagger")
[nltk_data] Downloading package 'maxent_treebank_pos_tagger' to /home/rmyeid/nltk_data...
[nltk_data] Package maxent_treebank_pos_tagger is already up-to-date!
True
tagged = nltk.pos_tag(words)
tagged
[('At', 'IN'), ('eight', 'CD'), ("o'clock", 'JJ'), ('on', 'IN'), ('Thursday', 'NNP'), ('morning', 'NN'), ('James', 'NNP'), ('Arthur', 'NNP'), ('did', 'VBD'), ("n't", 'RB'), ('feel', 'VB'), ('very', 'RB'), ('good', 'JJ'), ('.', '.')]
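The tagger returns a plain list of (word, tag) tuples, so it can be post-processed with ordinary Python. Here is a minimal sketch that counts the tags and collects the proper nouns. The tagged list is copied from the output above so the snippet runs without NLTK, and the names `tag_counts` and `proper_nouns` are illustrative only.

```python
from collections import Counter

# Tagger output copied from the cell above, so NLTK is not needed here.
tagged = [('At', 'IN'), ('eight', 'CD'), ("o'clock", 'JJ'), ('on', 'IN'),
          ('Thursday', 'NNP'), ('morning', 'NN'), ('James', 'NNP'),
          ('Arthur', 'NNP'), ('did', 'VBD'), ("n't", 'RB'), ('feel', 'VB'),
          ('very', 'RB'), ('good', 'JJ'), ('.', '.')]

# Count how often each tag occurs in the sentence.
tag_counts = Counter(tag for _, tag in tagged)

# Keep only the words tagged as proper nouns (NNP).
proper_nouns = [word for word, tag in tagged if tag == 'NNP']

print(tag_counts)
print(proper_nouns)
```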
Named entity recognition identifies phrases in text that refer to persons, locations, and organizations.
nltk.download("maxent_ne_chunker")
nltk.download("words")
[nltk_data] Downloading package 'maxent_ne_chunker' to /home/rmyeid/nltk_data...
[nltk_data] Package maxent_ne_chunker is already up-to-date!
[nltk_data] Downloading package 'words' to /home/rmyeid/nltk_data...
[nltk_data] Package words is already up-to-date!
True
entities = nltk.chunk.ne_chunk(tagged)
list(entities.subtrees(filter=lambda x: x.label() == 'PERSON'))  # NLTK 3+; older releases used x.node
[Tree('PERSON', [('James', 'NNP'), ('Arthur', 'NNP')])]
Stemming removes suffixes and prefixes from words to reduce the sparsity of the vocabulary.
stemmer = nltk.stem.LancasterStemmer()
words = u"Stemming is funnier than a bummer says the sushi loving computer scientist".split()
[stemmer.stem(w) for w in words]
[u'stem', u'is', u'funny', u'than', u'a', u'bum', u'say', u'the', u'sush', u'lov', u'comput', u'sci']
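To see why stemming reduces vocabulary sparsity, here is a toy suffix stripper. It is only a sketch under stated assumptions: the rule list `SUFFIXES` and the function `toy_stem` are hypothetical and far cruder than the Lancaster stemmer used above, but they show how several surface forms collapse into one stem.

```python
# Hypothetical, tiny rule list; real stemmers use many more rules.
SUFFIXES = ("ing", "ed", "es", "s")

def toy_stem(word):
    """Strip the first matching suffix, keeping at least three letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word

words = "loves loving loved computes computing computed".split()
stems = [toy_stem(w) for w in words]

# Six distinct surface forms shrink to two distinct stems.
print(len(set(words)), "distinct words ->", len(set(stems)), "distinct stems")
print(sorted(set(stems)))
```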
Check if the data is available through an API or just downloadable! Here are some pointers:
from lxml import html
import requests
from IPython.display import IFrame
IFrame('http://econpy.pythonanywhere.com/ex/001.html', width='100%', height=250)
page = requests.get('http://econpy.pythonanywhere.com/ex/001.html')
tree = html.fromstring(page.text)
# Create a list of buyers.
buyers = tree.xpath('//div[@title="buyer-name"]/text()')
# Create a list of prices.
prices = tree.xpath('//span[@class="item-price"]/text()')
print('Buyers: ', buyers)
print()
print('Prices: ', prices)
Buyers: ['Carson Busses', 'Earl E. Byrd', 'Patty Cakes', 'Derri Anne Connecticut', 'Moe Dess', 'Leda Doggslife', 'Dan Druff', 'Al Fresco', 'Ido Hoe', 'Howie Kisses', 'Len Lease', 'Phil Meup', 'Ira Pent', 'Ben D. Rules', 'Ave Sectomy', 'Gary Shattire', 'Bobbi Soks', 'Sheila Takya', 'Rose Tattoo', 'Moe Tell']

Prices: ['$29.95', '$8.37', '$15.26', '$19.25', '$19.25', '$13.99', '$31.57', '$8.49', '$14.47', '$15.86', '$11.11', '$15.98', '$16.27', '$7.50', '$50.85', '$14.26', '$5.68', '$15.00', '$114.07', '$10.09']
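The scraped prices are still strings, so a cleaning step is usually needed before any analysis. A minimal sketch, using the first three buyers and prices from the output above (copied in so the snippet runs offline); the names `amounts`, `purchases`, and `total` are illustrative, and `Decimal` is one reasonable choice for currency rather than the only one.

```python
from decimal import Decimal

# First three entries copied from the scraped output above.
buyers = ['Carson Busses', 'Earl E. Byrd', 'Patty Cakes']
prices = ['$29.95', '$8.37', '$15.26']

# Drop the leading dollar sign and parse each price exactly.
amounts = [Decimal(p.lstrip('$')) for p in prices]

# Pair each buyer with the amount paid.
purchases = dict(zip(buyers, amounts))

total = sum(amounts)
print(purchases)
print('Total:', total)
```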
Extracting hyperlinks from the Google homepage.
from bs4 import BeautifulSoup
r = requests.get("http://www.google.com")
data = r.text
soup = BeautifulSoup(data, 'html.parser')  # name the parser explicitly
for link in soup.find_all('a'):
    print(link.get('href'))
http://www.google.com/imghp?hl=en&tab=wi
http://maps.google.com/maps?hl=en&tab=wl
https://play.google.com/?hl=en&tab=w8
http://www.youtube.com/?tab=w1
http://news.google.com/nwshp?hl=en&tab=wn
https://mail.google.com/mail/?tab=wm
https://drive.google.com/?tab=wo
http://www.google.com/intl/en/options/
http://www.google.com/history/optout?hl=en
/preferences?hl=en
https://accounts.google.com/ServiceLogin?hl=en&continue=http://www.google.com/
/chrome/index.html?hl=en&brand=CHNG&utm_source=en-hpp&utm_medium=hpp&utm_campaign=en
/advanced_search?hl=en&authuser=0
/language_tools?hl=en&authuser=0
/intl/en/ads/
/services/
https://plus.google.com/116899029375914044550
/intl/en/about.html
/intl/en/policies/
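Some of the extracted links are relative (for example `/preferences?hl=en`), so they cannot be fetched as they are. A minimal sketch, assuming Python 3, that resolves them against the page URL with the standard library's `urljoin`; the three sample links are copied from the output above, and `base` and `absolute` are illustrative names.

```python
from urllib.parse import urljoin

base = "http://www.google.com"  # the page the links were scraped from

# A few hrefs copied from the output above: one absolute, two relative.
hrefs = [
    "http://maps.google.com/maps?hl=en&tab=wl",
    "/preferences?hl=en",
    "/intl/en/about.html",
]

# urljoin leaves absolute URLs untouched and resolves relative ones.
absolute = [urljoin(base, href) for href in hrefs]

for url in absolute:
    print(url)
```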
NetworkX is a library to construct, manipulate, and visualize graphs. It contains:
Data structures for graphs, digraphs, and multigraphs.
Nodes and edges can hold arbitrary data
Generators for classic graphs, random graphs, and synthetic networks
Standard graph algorithms and Network analysis measures
import networkx as nx
G = nx.karate_club_graph()
nx.draw_spring(G)
plt.show()
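To make the idea of nodes and edges holding arbitrary data concrete, here is a small standard-library sketch of an undirected graph stored as a dictionary of dictionaries. It is an illustration only, not NetworkX code: the helper `add_edge`, the node names, and the `weight` attribute are all hypothetical.

```python
# graph[u][v] holds the attribute dictionary of the edge between u and v.
graph = {}

def add_edge(graph, u, v, **attrs):
    """Add an undirected edge; both directions share one attribute dict."""
    graph.setdefault(u, {})[v] = attrs
    graph.setdefault(v, {})[u] = attrs

add_edge(graph, "a", "b", weight=3)
add_edge(graph, "b", "c", weight=1)
add_edge(graph, "a", "c", weight=2)
add_edge(graph, "c", "d", weight=5)

# Degree of a node = number of neighbors.
degree = {node: len(neighbors) for node, neighbors in graph.items()}

# The best-connected node is a simple network analysis measure.
hub = max(degree, key=degree.get)
print(degree)
print("hub:", hub, "edge c-d data:", graph["c"]["d"])
```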
It is an interactive, collaborative analytics tool that integrates:
You can open a notebook from Google Drive. You can share notebooks as you would share a Google Doc. You can comment and edit collaboratively, in real time. There is zero setup, because all the computation happens in Chrome. You can even quickly and easily package your analytics pipeline into a GUI for people who don't want to program. In effect, you can go from zero to analytics with little friction.