Available on twitter as @aerogeek
The HDF5 format can store any kind of digital data, regardless of its origin.
The format is platform independent and is widely used in scientific codes.
HDF5 allows the inclusion of metadata and attributes.
HDF5 supports highly efficient chunked input/output operations.
PyTables is a package for managing hierarchical datasets, designed to cope efficiently and easily with extremely large amounts of data.
PyTables is built on top of the HDF5 library, using the Python language and the NumPy package.
You can download PyTables and use it for free.
Import PyTables
import tables as tb
We will use the file test.hdf for this simple tutorial.
data = tb.openFile("test.hdf","r")
The openFile command opens a given HDF5 file and returns a File object. The second argument sets the mode: "r" for read, "a" for append, or "w" for write.
Let's see what's in this HDF5 file.
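The three modes can be tried on a throwaway file. The sketch below is a minimal illustration, not part of the original notebook: it assumes Python 3 and PyTables 3.x, where the calls are spelled open_file, create_array, create_group and get_node (snake_case) rather than the camelCase names used in this tutorial, and it writes to a hypothetical temporary file, example.hdf, instead of test.hdf.

```python
import os
import tempfile

import numpy as np
import tables as tb

# Hypothetical stand-in for test.hdf, written to a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "example.hdf")

# "w" creates the file, replacing any existing file of the same name.
with tb.open_file(path, "w") as h5:
    h5.create_array("/", "dd", np.arange(100.0))

# "a" opens the existing file for reading and writing.
with tb.open_file(path, "a") as h5:
    group = h5.create_group("/", "somegroup", title="this is a new group")
    h5.create_array(group, "xxarray", np.random.rand(10, 10))

# "r" opens the file read-only.
with tb.open_file(path, "r") as h5:
    dd_shape = h5.get_node("/", "dd").shape
    group_title = h5.get_node("/somegroup")._v_title
    print(dd_shape, group_title)
```

Using the file as a context manager closes it automatically at the end of each with block.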
print data
test.hdf (File) ''
Last modif.: 'Tue May 27 17:46:41 2014'
Object Tree:
/ (RootGroup) ''
/dd (Array(100,)) ''
/xxarray (Array(1000, 10)) ''
/somegroup (Group) 'this is a new group'
/somegroup/xxarray (Array(1000, 1000)) ''
The root group contains the array dd of shape (100,), the array xxarray of shape (1000, 10), and the group somegroup, which in turn contains another xxarray of shape (1000, 1000).
/ represents the root group
The structure is similar to the picture we saw before.
Before accessing the data, let's load matplotlib so we can plot it.
import matplotlib.pyplot as plt
%matplotlib inline
l=plt.plot(data.getNode("/","dd"))
#load numpy
import numpy as np
print np.sum(data.getNode("/somegroup","xxarray"))
5004733.31628
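Notice the / in the where argument of getNode: it is the path of the group that contains the node, so "/" is the root group and "/somegroup" is the group somegroup.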
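The role of the where argument is easiest to see when two nodes share a name. The following sketch is an illustration under assumptions: it uses Python 3 and the PyTables 3.x snake_case spelling get_node instead of getNode, and it builds a small hypothetical file with an xxarray in the root group and another in somegroup.

```python
import os
import tempfile

import numpy as np
import tables as tb

# Hypothetical file with two arrays that share the name "xxarray".
path = os.path.join(tempfile.mkdtemp(), "example.hdf")
with tb.open_file(path, "w") as h5:
    h5.create_array("/", "xxarray", np.zeros((4, 2)))
    h5.create_group("/", "somegroup")
    h5.create_array("/somegroup", "xxarray", np.ones((3, 3)))

with tb.open_file(path, "r") as h5:
    # where="/" selects the array stored directly under the root group.
    top = h5.get_node("/", "xxarray")
    # where="/somegroup" selects the array of the same name inside the group.
    nested = h5.get_node("/somegroup", "xxarray")
    # The full path can also be passed as a single argument.
    same_nested = h5.get_node("/somegroup/xxarray")

    top_shape, nested_shape = top.shape, nested.shape
    same_path = nested._v_pathname == same_nested._v_pathname
    nested_sum = float(np.sum(nested.read()))
    print(top_shape, nested_shape, same_path, nested_sum)
```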
for node in data.walkNodes():
    print node
/ (RootGroup) ''
/dd (Array(100,)) ''
/somegroup (Group) 'this is a new group'
/xxarray (Array(1000, 10)) ''
/somegroup/xxarray (Array(1000, 1000)) ''
for group in data.walkGroups():
    print group
/ (RootGroup) ''
/somegroup (Group) 'this is a new group'
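The two traversals above can be combined with a filter on the node class. This is a hedged sketch rather than the notebook's own code: it assumes Python 3 and PyTables 3.x, where the methods are named walk_nodes and walk_groups, and it recreates the tutorial's layout with small arrays in a hypothetical temporary file.

```python
import os
import tempfile

import numpy as np
import tables as tb

# Hypothetical file mirroring the tree of test.hdf with small arrays.
path = os.path.join(tempfile.mkdtemp(), "example.hdf")
with tb.open_file(path, "w") as h5:
    h5.create_array("/", "dd", np.arange(5.0))
    h5.create_array("/", "xxarray", np.zeros((4, 2)))
    h5.create_group("/", "somegroup", title="this is a new group")
    h5.create_array("/somegroup", "xxarray", np.ones((3, 3)))

with tb.open_file(path, "r") as h5:
    # classname="Array" restricts the walk to Array nodes and skips groups.
    array_paths = sorted(
        node._v_pathname for node in h5.walk_nodes("/", classname="Array")
    )
    # walk_groups yields only groups, starting with the root group.
    group_paths = sorted(group._v_pathname for group in h5.walk_groups("/"))

print(array_paths)
print(group_paths)
```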
xxarray = data.getNode("/","xxarray").read()
Let's look at xxarray
print xxarray.shape
print xxarray[:10,:4]
(1000, 10)
[[ 37.65124901  66.32351938  26.63416963  16.65826532]
 [ 98.94743049  82.64344691  91.08102543  94.67738448]
 [ 58.44255102  21.29901262  66.38514329  51.73292396]
 [ 31.51240632  81.57204173  49.28576672  68.90711896]
 [  0.69827242  73.19891575  28.26325399  94.32989173]
 [ 49.33902333  36.06414164  12.88478684  77.96042142]
 [ 42.42653654  68.19470728  91.70381558  20.89961624]
 [ 76.69672676  98.17104868  65.40370551  52.33994386]
 [ 57.81297935  20.31001171   0.19128739  81.94616451]
 [ 15.77061071  72.08482339  66.38717341   9.50811376]]
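Calling read() loads the whole array into memory as a NumPy array. When only a corner of the data is needed, the node itself can be sliced, which returns just that region. The sketch below illustrates this under assumptions: Python 3, the PyTables 3.x snake_case names, and a hypothetical temporary file filled with arange values rather than the tutorial's data.

```python
import os
import tempfile

import numpy as np
import tables as tb

# Hypothetical file holding a (1000, 10) array like the tutorial's xxarray.
path = os.path.join(tempfile.mkdtemp(), "example.hdf")
values = np.arange(1000 * 10, dtype="float64").reshape(1000, 10)
with tb.open_file(path, "w") as h5:
    h5.create_array("/", "xxarray", values)

with tb.open_file(path, "r") as h5:
    node = h5.get_node("/", "xxarray")
    full = node.read()       # the entire array as a NumPy array
    corner = node[:10, :4]   # only the first 10 rows and 4 columns

    corner_shape = corner.shape
    matches = bool(np.array_equal(corner, full[:10, :4]))
    print(corner_shape, matches)
```

For arrays much larger than memory, slicing the node avoids materialising the full dataset first.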
For more info on HDF5 visit
For more info on pytables visit