This is one of the 100 recipes of the IPython Cookbook, the definitive guide to high-performance scientific computing and data science in Python.
import numpy as np
We create a memory-mapped array with a specific shape.
nrows, ncols = 1000000, 100
f = np.memmap('memmapped.dat', dtype=np.float32,
              mode='w+', shape=(nrows, ncols))
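With mode='w+', NumPy creates the file at its full size right away, so the disk footprint can be estimated up front as rows × columns × item size (400,000,000 bytes for the array above). The following is a minimal sketch that checks this on a much smaller array. The file name demo_size.dat, the temporary directory, and the reduced shape are assumptions chosen so that the example runs quickly.

```python
import os
import tempfile

import numpy as np

# Hypothetical small shape; the recipe itself uses (1000000, 100).
demo_shape = (1000, 100)
demo_path = os.path.join(tempfile.mkdtemp(), 'demo_size.dat')

# 'w+' creates (or overwrites) the file and maps it for reading and writing.
demo = np.memmap(demo_path, dtype=np.float32, mode='w+', shape=demo_shape)

# Expected size: number of elements times bytes per element.
expected_bytes = demo_shape[0] * demo_shape[1] * np.dtype(np.float32).itemsize
actual_bytes = os.path.getsize(demo_path)
print(expected_bytes, actual_bytes)

demo.flush()
del demo
```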
Let's fill the array with random values, one column at a time, because our system memory is limited!
for i in range(ncols):
    f[:, i] = np.random.rand(nrows)
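np.memmap uses C (row-major) order by default, so each column assignment above writes values that are spread across the whole file. One possible alternative is to fill the array in blocks of rows, each of which is contiguous on disk, while still holding only one block in memory at a time. This is only a sketch and not a benchmark: the file name demo_chunks.dat, the shape, and the block size chunk_rows are hypothetical.

```python
import os
import tempfile

import numpy as np

# Hypothetical sizes; adjust chunk_rows to the memory you can spare.
demo_rows, demo_cols = 10000, 20
chunk_rows = 1000
demo_path = os.path.join(tempfile.mkdtemp(), 'demo_chunks.dat')

demo = np.memmap(demo_path, dtype=np.float32, mode='w+',
                 shape=(demo_rows, demo_cols))

# Fill one block of rows at a time; only that block lives in memory.
for start in range(0, demo_rows, chunk_rows):
    stop = min(start + chunk_rows, demo_rows)
    demo[start:stop, :] = np.random.rand(stop - start, demo_cols)

demo.flush()
filled = np.count_nonzero(demo)
print(demo.shape, filled)
```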
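We save a copy of the last column of the array in memory.
# np.array copies the data; a plain slice would remain a view on the mapped file.
x = np.array(f[:, -1])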
Now, we flush the in-memory changes to disk and delete the object.
f.flush()
del f
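One way to confirm that the values really reached the file is to read the raw bytes back with np.fromfile, which bypasses the memory map. The sketch below assumes a hypothetical small file, demo_flush.dat, and calls flush() explicitly instead of relying on object deletion.

```python
import os
import tempfile

import numpy as np

# Hypothetical small array so that the check is quick.
demo_shape = (100, 5)
demo_path = os.path.join(tempfile.mkdtemp(), 'demo_flush.dat')

demo = np.memmap(demo_path, dtype=np.float32, mode='w+', shape=demo_shape)
demo[:] = 1.5

# Explicitly write pending changes to the file on disk.
demo.flush()

# Read the file as a plain array; dtype and shape must be supplied by hand.
on_disk = np.fromfile(demo_path, dtype=np.float32).reshape(demo_shape)
all_written = bool(np.all(on_disk == 1.5))
print(all_written)

del demo
```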
Reading a memory-mapped array from disk uses the same memmap function, but with a different file mode (here the default, 'r+', which opens the existing file for reading and writing). The data type and the shape need to be specified again, because this information is not stored in the file.
f = np.memmap('memmapped.dat', dtype=np.float32, shape=(nrows, ncols))
np.array_equal(f[:,-1], x)
del f
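When the data only needs to be read, mode='r' maps the file read-only, and the offset argument (in bytes) makes it possible to map just part of the file. The following sketch, with a hypothetical file demo_read.dat and hypothetical sizes, maps the last 100 rows of a small array without touching the rest.

```python
import os
import tempfile

import numpy as np

# Hypothetical small array written to disk first.
demo_rows, demo_cols = 1000, 10
demo_path = os.path.join(tempfile.mkdtemp(), 'demo_read.dat')
data = np.arange(demo_rows * demo_cols,
                 dtype=np.float32).reshape(demo_rows, demo_cols)

w = np.memmap(demo_path, dtype=np.float32, mode='w+',
              shape=(demo_rows, demo_cols))
w[:] = data
w.flush()
del w

# Map the whole file read-only.
r = np.memmap(demo_path, dtype=np.float32, mode='r',
              shape=(demo_rows, demo_cols))
same = np.array_equal(r, data)

# Map only the rows from start_row onward: skip start_row full rows, in bytes.
start_row = 900
offset_bytes = start_row * demo_cols * np.dtype(np.float32).itemsize
tail = np.memmap(demo_path, dtype=np.float32, mode='r',
                 offset=offset_bytes,
                 shape=(demo_rows - start_row, demo_cols))
tail_same = np.array_equal(tail, data[start_row:])
print(same, tail_same, tail.shape)
```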
You'll find all the explanations, figures, references, and much more in the book (to be released later this summer).
IPython Cookbook, by Cyrille Rossant, Packt Publishing, 2014 (500 pages).