IPython Notebook: your new SQL client¶

IPython Notebook is a Python client in your browser

cells of executable Python and rich HTML (Markdown) interspersed
Python runs in an IPython Notebook server (usually local)

In [1]:

rdbms = 'oracle mysql postgresql'.split()
ultimate_database = rdbms[-1]
print("We love %s!" % ultimate_database)

We love postgresql!

ipython_sql¶

An IPython extension piping data to and from sqlalchemy connections

In [2]:

%load_ext sql
%sql postgresql://will:longliveliz@localhost/shakes

Out[2]:

u'Connected: will@shakes'

OpenSourceShakespeare data

In [3]:

%sql SELECT * FROM work LIMIT 2

Out[3]:

workid	title	longtitle	year	genretype	notes	source	totalwords	totalparagraphs
12night	Twelfth Night	Twelfth Night, Or What You Will	1599	c	None	Moby	19837	1031
allswell	All's Well That Ends Well	All's Well That Ends Well	1602	c	None	Moby	22997	1025

In [4]:

result = _

In [5]:

result

Out[5]:

workid	title	longtitle	year	genretype	notes	source	totalwords	totalparagraphs
12night	Twelfth Night	Twelfth Night, Or What You Will	1599	c	None	Moby	19837	1031
allswell	All's Well That Ends Well	All's Well That Ends Well	1602	c	None	Moby	22997	1025

Python objects, like our result set, can describe themselves in good-looking HTML terms for the Notebook's sake. Nonetheless, our result set is basically just a Python list.

In [6]:

result[0]

Out[6]:

(u'12night', u'Twelfth Night', u'Twelfth Night, Or What You Will', 1599, u'c', None, u'Moby', 19837, 1031)

In [7]:

result[0][0]

Out[7]:

u'12night'

... except that we can also see the column names (keys), then use them to pick out column values.

In [8]:

result.keys

Out[8]:

[u'workid',
 u'title',
 u'longtitle',
 u'year',
 u'genretype',
 u'notes',
 u'source',
 u'totalwords',
 u'totalparagraphs']

In [9]:

result[0]['longtitle']

Out[9]:

u'Twelfth Night, Or What You Will'

Single-line SQL results can be assigned directly to a Python variable.

In [10]:

result = %sql SELECT workid, title FROM work WHERE title ILIKE '%as%'

In [11]:

result

Out[11]:

workid	title
asyoulikeit	As You Like It
measure	Measure for Measure
passionatepilgrim	Passionate Pilgrim

bind variables¶

In [12]:

play_name = 'Tempest'

In [13]:

%%sql 
SELECT  
        c.charname,
        c.description,
        c.speechcount
FROM    work w
JOIN    character_work cw ON (w.workid = cw.workid)
JOIN    character c ON (cw.charid = c.charid)
WHERE   w.title = :play_name
AND     c.description != 'presented by spirits'

Out[13]:

charname	description	speechcount
Adrian	a lord	9
Alonso	king of Naples	40
Antonio	the King's brother, the usurping Duke of Milan	57
Ariel	an airy spirit	45
Caliban	a savage and deformed slave	50
Ferdinand	son to the King of Naples	31
Francisco	a lord	2
Gonzalo	an honest old counsellor	52
Master	master of a ship	2
Miranda	daughter to Prospero	49
Prospero	the rightful Duke of Milan	115
Sebastian	the King's brother	67
Stephano	a drunken butler	60
Trinculo	a jester	39

In [14]:

result = _

Pandas!¶

In [15]:

import pandas as pd
dataframe = pd.DataFrame(result, columns=result.keys)
dataframe

Out[15]:

	charname	description	speechcount
0	Adrian	a lord	9
1	Alonso	king of Naples	40
2	Antonio	the King's brother, the usurping Duke of Milan	57
3	Ariel	an airy spirit	45
4	Caliban	a savage and deformed slave	50
5	Ferdinand	son to the King of Naples	31
6	Francisco	a lord	2
7	Gonzalo	an honest old counsellor	52
8	Master	master of a ship	2
9	Miranda	daughter to Prospero	49
10	Prospero	the rightful Duke of Milan	115
11	Sebastian	the King's brother	67
12	Stephano	a drunken butler	60
13	Trinculo	a jester	39

matplotlib¶

In [16]:

%matplotlib inline

In [17]:

import matplotlib.pyplot as plt

In [18]:

plt.pie(dataframe.speechcount, labels=dataframe.charname)

Out[18]:

([<matplotlib.patches.Wedge at 0x4959b10>,
  <matplotlib.patches.Wedge at 0x495d110>,
  <matplotlib.patches.Wedge at 0x495d650>,
  <matplotlib.patches.Wedge at 0x495db50>,
  <matplotlib.patches.Wedge at 0x4960090>,
  <matplotlib.patches.Wedge at 0x4960590>,
  <matplotlib.patches.Wedge at 0x4960a90>,
  <matplotlib.patches.Wedge at 0x4960f90>,
  <matplotlib.patches.Wedge at 0x49634d0>,
  <matplotlib.patches.Wedge at 0x49639d0>,
  <matplotlib.patches.Wedge at 0x4963ed0>,
  <matplotlib.patches.Wedge at 0x4968410>,
  <matplotlib.patches.Wedge at 0x4968910>,
  <matplotlib.patches.Wedge at 0x4968e10>],
 [<matplotlib.text.Text at 0x4959fd0>,
  <matplotlib.text.Text at 0x495d590>,
  <matplotlib.text.Text at 0x495da90>,
  <matplotlib.text.Text at 0x495df90>,
  <matplotlib.text.Text at 0x49604d0>,
  <matplotlib.text.Text at 0x49609d0>,
  <matplotlib.text.Text at 0x4960ed0>,
  <matplotlib.text.Text at 0x4963410>,
  <matplotlib.text.Text at 0x4963910>,
  <matplotlib.text.Text at 0x4963e10>,
  <matplotlib.text.Text at 0x4968350>,
  <matplotlib.text.Text at 0x4968850>,
  <matplotlib.text.Text at 0x4968d50>,
  <matplotlib.text.Text at 0x496b290>])

This presentation¶

View and download at http://nbviewer.ipython.org/6588378 or http://tinyurl.com/pgopen2013-ipy-sql

Installation¶

The simplest course is to download and install Anaconda, a Python distribution from Continuum Analytics that includes ipython, pandas, matplotlib, etc. Then

your_anaconda_directory/bin/pip install ipython_sql