IPython Notebook: your new SQL client

IPython Notebook is a Python client in your browser

  • cells of executable Python and rich HTML (Markdown) interspersed
  • Python runs in an IPython Notebook server (usually local)
In [1]:
rdbms = 'oracle mysql postgresql'.split()
ultimate_database = rdbms[-1]
print("We love %s!" % ultimate_database)
We love postgresql!

ipython_sql

An IPython extension piping data to and from sqlalchemy connections

In [2]:
%load_ext sql
%sql postgresql://will:longliveliz@localhost/shakes
Out[2]:
u'Connected: will@shakes'
In [3]:
%sql SELECT * FROM work LIMIT 2
Out[3]:
workid title longtitle year genretype notes source totalwords totalparagraphs
12night Twelfth Night Twelfth Night, Or What You Will 1599 c None Moby 19837 1031
allswell All's Well That Ends Well All's Well That Ends Well 1602 c None Moby 22997 1025
In [4]:
result = _
In [5]:
result
Out[5]:
workid title longtitle year genretype notes source totalwords totalparagraphs
12night Twelfth Night Twelfth Night, Or What You Will 1599 c None Moby 19837 1031
allswell All's Well That Ends Well All's Well That Ends Well 1602 c None Moby 22997 1025

Python objects, like our result set, can describe themselves in good-looking HTML terms for the Notebook's sake. Nonetheless, our result set is basically just a Python list.

In [6]:
result[0]
Out[6]:
(u'12night', u'Twelfth Night', u'Twelfth Night, Or What You Will', 1599, u'c', None, u'Moby', 19837, 1031)
In [7]:
result[0][0]
Out[7]:
u'12night'

... except that we can also see the column names (keys), then use them to pick out column values.

In [8]:
result.keys
Out[8]:
[u'workid',
 u'title',
 u'longtitle',
 u'year',
 u'genretype',
 u'notes',
 u'source',
 u'totalwords',
 u'totalparagraphs']
In [9]:
result[0]['longtitle']
Out[9]:
u'Twelfth Night, Or What You Will'

Single-line SQL results can be assigned directly to a Python variable.

In [10]:
result = %sql SELECT workid, title FROM work WHERE title ILIKE '%as%'
In [11]:
result
Out[11]:
workid title
asyoulikeit As You Like It
measure Measure for Measure
passionatepilgrim Passionate Pilgrim

bind variables

In [12]:
play_name = 'Tempest'
In [13]:
%%sql
SELECT  
        c.charname,
        c.description,
        c.speechcount
FROM    work w
JOIN    character_work cw ON (w.workid = cw.workid)
JOIN    character c ON (cw.charid = c.charid)
WHERE   w.title = :play_name
AND     c.description != 'presented by spirits'
Out[13]:
charname description speechcount
Adrian a lord 9
Alonso king of Naples 40
Antonio the King's brother, the usurping Duke of Milan 57
Ariel an airy spirit 45
Caliban a savage and deformed slave 50
Ferdinand son to the King of Naples 31
Francisco a lord 2
Gonzalo an honest old counsellor 52
Master master of a ship 2
Miranda daughter to Prospero 49
Prospero the rightful Duke of Milan 115
Sebastian the King's brother 67
Stephano a drunken butler 60
Trinculo a jester 39
In [14]:
result = _

Pandas!

In [15]:
import pandas as pd
dataframe = pd.DataFrame(result, columns=result.keys)
dataframe
Out[15]:
charname description speechcount
0 Adrian a lord 9
1 Alonso king of Naples 40
2 Antonio the King's brother, the usurping Duke of Milan 57
3 Ariel an airy spirit 45
4 Caliban a savage and deformed slave 50
5 Ferdinand son to the King of Naples 31
6 Francisco a lord 2
7 Gonzalo an honest old counsellor 52
8 Master master of a ship 2
9 Miranda daughter to Prospero 49
10 Prospero the rightful Duke of Milan 115
11 Sebastian the King's brother 67
12 Stephano a drunken butler 60
13 Trinculo a jester 39

matplotlib

In [16]:
%matplotlib inline
In [17]:
import matplotlib.pyplot as plt
In [18]:
plt.pie(dataframe.speechcount, labels=dataframe.charname)
Out[18]:
([<matplotlib.patches.Wedge at 0x4959b10>,
  <matplotlib.patches.Wedge at 0x495d110>,
  <matplotlib.patches.Wedge at 0x495d650>,
  <matplotlib.patches.Wedge at 0x495db50>,
  <matplotlib.patches.Wedge at 0x4960090>,
  <matplotlib.patches.Wedge at 0x4960590>,
  <matplotlib.patches.Wedge at 0x4960a90>,
  <matplotlib.patches.Wedge at 0x4960f90>,
  <matplotlib.patches.Wedge at 0x49634d0>,
  <matplotlib.patches.Wedge at 0x49639d0>,
  <matplotlib.patches.Wedge at 0x4963ed0>,
  <matplotlib.patches.Wedge at 0x4968410>,
  <matplotlib.patches.Wedge at 0x4968910>,
  <matplotlib.patches.Wedge at 0x4968e10>],
 [<matplotlib.text.Text at 0x4959fd0>,
  <matplotlib.text.Text at 0x495d590>,
  <matplotlib.text.Text at 0x495da90>,
  <matplotlib.text.Text at 0x495df90>,
  <matplotlib.text.Text at 0x49604d0>,
  <matplotlib.text.Text at 0x49609d0>,
  <matplotlib.text.Text at 0x4960ed0>,
  <matplotlib.text.Text at 0x4963410>,
  <matplotlib.text.Text at 0x4963910>,
  <matplotlib.text.Text at 0x4963e10>,
  <matplotlib.text.Text at 0x4968350>,
  <matplotlib.text.Text at 0x4968850>,
  <matplotlib.text.Text at 0x4968d50>,
  <matplotlib.text.Text at 0x496b290>])

This presentation

Installation

The simplest course is to download and install Anaconda, a Python distribution from Continuum Analytics that includes ipython, pandas, matplotlib, etc. Then

your_anaconda_directory/bin/pip install ipython_sql