This example is inspired by a real task in the lab. We obtained a carefully segmented image of gold nanoparticles and wanted to quantify various aspects of each group (i.e. single particles, dimers, trimers and large clusters). For example, how does the eccentricity vary with each subgroup? The color labels were pre-assigned, and utilities to juggle the various species were imported into pyparty.
Configure notebook style (see NBCONFIG.ipynb), add imports and paths. The %run magic used below requires IPython 2.0 or higher.
%run NBCONFIG.ipynb
Populating the interactive namespace from numpy and matplotlib
First let's look at the test data, which we have prelabeled from a previous analysis.
from pyparty.data import nanolabels, nanogold
NANOLABELS = nanolabels()
ax1, ax2 = splot(1,2)
showim(nanogold(), ax1, title='SEM nanoparticles')
showim(NANOLABELS, ax2, 'spectral',
title='size-segmented nanoparticles');
print 'unique colors:', np.unique(NANOLABELS)
unique colors: [ 0. 1. 2. 3. 4.]
The segmented image has five unique labels, with 0 being the background. The first task is to split the segmented image into masks for each of the particle categories. pyparty provides a utility to simplify this task, multi_mask(labeledimage, names, ignore=0), which is best illustrated with an example.
from pyparty.multi import multi_mask
from collections import OrderedDict
NAMES = ('singles', 'dimers', 'trimers', 'clusters')
axes = splot(2,2, figsize=(10,6))
masks = multi_mask(NANOLABELS, *NAMES, astype=OrderedDict)
for idx, (name, image) in enumerate(masks.items()):
showim(image, axes[idx], 'gray', title=name)
type(masks), masks.keys()[0], masks.values()[0].shape
(collections.OrderedDict, 'singles', (614, 1012))
We see that the return of multi_mask() is an OrderedDict of boolean arrays, each one corresponding to a subgroup of interest. For certain tasks, the masks are sufficient; for example, the fraction of the image occupied by dimers is the ratio of white pixels in the dimer mask to the total number of pixels:
L, W = NANOLABELS.shape
dimer_area = np.sum(masks['dimers'])  # count True pixels over the whole 2D mask
print 'Dimer coverage: %.1f%%' % \
(100.0 * ( float(dimer_area) / (L * W)) )
Dimer coverage: 6.7%
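The mask-splitting and coverage steps above can also be sketched in plain NumPy on a toy array. This is only an illustration of the idea, not pyparty's implementation: split_labels is a hypothetical helper, and it assumes that names are paired with the sorted label values.

```python
import numpy as np
from collections import OrderedDict

def split_labels(labeled, names, ignore=0):
    """Return an OrderedDict mapping each name to a boolean mask.

    Hypothetical helper, not pyparty's code: names are paired with the
    sorted unique label values, skipping the `ignore` value.
    """
    values = [v for v in np.unique(labeled) if v != ignore]
    if len(values) != len(names):
        raise ValueError('%d names given for %d labels'
                         % (len(names), len(values)))
    return OrderedDict((name, labeled == v) for name, v in zip(names, values))

# Toy 3x4 labeled image: 0 is background, 1 and 2 are two categories.
toy = np.array([[0, 1, 1, 0],
                [0, 2, 0, 0],
                [2, 2, 0, 1]])

toy_masks = split_labels(toy, ('singles', 'dimers'))

# Coverage = True pixels / total pixels. np.sum reduces over the whole
# array; the builtin sum would return per-column totals instead.
coverage = 100.0 * np.sum(toy_masks['dimers']) / toy.size
print('Dimer coverage: %.1f%%' % coverage)
```

Here three of the twelve pixels carry label 2, so the toy dimer coverage is 25%.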
For more complex tasks, working directly with masks can be cumbersome. Therefore, we turn to a simple Canvas container object called MultiCanvas.
See the related MultiCanvas tutorial. We will focus on the constructor, MultiCanvas.from_labeled(), which reads a labeled image into a container of names/canvases. The options of from_labeled are:
canvas.from_labels()
. multi-masks should still be fairly fast. Raise the maximum at your discretion!
One should note that we can easily get the aforementioned directly from the MultiCanvas via:
from pyparty import MultiCanvas
mc = MultiCanvas.from_labeled(NANOLABELS, *NAMES)
mc
No handlers could be found for logger "pyparty.tools.manager"
MultiCanvas (0xabd508c):
   singles - Canvas (0xaa81d4c) : 614 X 1012 : 1162 particles
   dimers - Canvas (0xaa817dc) : 614 X 1012 : 269 particles
   trimers - Canvas (0xabd532c) : 614 X 1012 : 46 particles
   clusters - Canvas (0xabd529c) : 614 X 1012 : 40 particles
Let's use MultiCanvas to do the following:
Plot the distribution of areas in each particle subgroup (i.e. pie chart or histogram). How does this compare to multiples of the mean single-particle area?
Before plotting, let's set some shared parameters like colors, as well as compute the mean singles area $\mu$.
mc.set_colors('r','g','y', 'magenta')
mc.mycolors
{'clusters': 'magenta', 'dimers': 'g', 'singles': 'r', 'trimers': 'y'}
# Take out singles, take mean
mu = np.mean(mc['singles'].area)
'Mean single particle area: %s pixels' % round(mu,1)
'Mean single particle area: 74.6 pixels'
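To make the mean-area step concrete, here is a minimal NumPy sketch. It assumes that a canvas's area attribute holds one pixel count per particle, which is how the value is used in this example. The particles array is a hypothetical toy image in which every particle has its own integer id.

```python
import numpy as np

# Hypothetical particle-labeled image: each particle has its own integer
# id (unlike NANOLABELS, where the label encodes the size category).
particles = np.array([[1, 1, 0, 2],
                      [1, 0, 0, 2],
                      [0, 3, 3, 2],
                      [0, 3, 3, 0]])

# bincount tallies pixels per id; drop index 0, the background.
areas = np.bincount(particles.ravel())[1:]
mean_area = areas.mean()

print('areas (pixels): %s' % areas.tolist())
print('mean area: %.1f pixels' % mean_area)
```

The three toy particles cover 3, 3 and 4 pixels, so their mean plays the role that mu plays for the singles canvas.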
While the histogram and pie chart are both built into MultiCanvas, the histogram takes more keywords to look nice, so let's do the pie chart first. We pass autopct='percent' to label each wedge with its percentage of the total (autopct also accepts 'count' or 'both', or any valid pie autopct value).
ax1, ax2 = splot(1, 2, figsize=(10,5))
chartkwds = {'autopct':'percent', 'shadow':True}
mc.pie(ax1, **chartkwds);
mc.pie(ax2, attr='area', explode=(0,0,0,0.1), **chartkwds);
We see that although clusters only account for 2.6% of the particle count, they make up over 10% of the coverage!
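The count percentages can be checked by hand from the particle counts printed in the MultiCanvas summary above. The sketch below also shows one way to build a label carrying both percentage and count. make_autopct is a hypothetical helper, not part of pyparty; it relies only on the fact that matplotlib's pie accepts a callable for autopct and passes it the wedge percentage.

```python
# Particle counts copied from the MultiCanvas summary above.
counts = [('singles', 1162), ('dimers', 269), ('trimers', 46), ('clusters', 40)]
total = sum(n for _, n in counts)

def make_autopct(n_total):
    """Build a callable suitable for matplotlib's `autopct`: it receives
    the wedge percentage and returns a 'percent (count)' label."""
    def fmt(pct):
        count = int(round(pct * n_total / 100.0))
        return '%.1f%% (%d)' % (pct, count)
    return fmt

label = make_autopct(total)
for name, n in counts:
    print('%-8s %s' % (name, label(100.0 * n / total)))
```

With 1517 particles in total, the 40 clusters come out at 2.6% of the count, matching the figure quoted above.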
Because there are very few clusters, but they are large in area, it's tough to get a nice histogram with them included. Therefore, we will drop them (they are the fourth and last entry):
del mc['clusters']
mc.names
['singles', 'dimers', 'trimers']
Now, we merely make the histogram:
BINS = 30
YMAX = XMAX = 300
mc.hist(attr='area', bins=BINS);
# Add vline at 1, 2, 3 times mean
plt.vlines((mu, 2*mu, 3*mu), 0, YMAX, linestyles='--')
# Add text to plot
amu = r'$A_\mu$'  # raw string so the backslash reaches mathtext untouched
textkwds = {'color':'blue', 'bbox':{'facecolor':'gray', 'alpha':.5}}
plt.text(25, 275, 'x < %s' % amu, **textkwds)
plt.text(80, 275, '%s < x < 2%s' % (amu, amu), **textkwds)
plt.text(158, 275, '2%s < x < 3%s' % (amu,amu), **textkwds)
plt.text(250, 200, 'x > 3%s' % amu, **textkwds)
plt.xlim(0,XMAX)
plt.ylim(0,YMAX);
We see that the single-particle distribution is nicely centered around the mean single-particle area. The histogram can be misleading, however, because it displays the number of particles rather than their relative share of the surface coverage. For that, the pie chart was more informative.
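The regions marked by the dashed lines can also be tallied numerically. The following is a minimal sketch using np.digitize with edges at one, two and three times the mean area. The areas array holds made-up values for illustration only, not measurements from the image.

```python
import numpy as np

mu = 74.6  # mean single-particle area from above, in pixels

# Hypothetical areas, for illustration only (not measured values).
areas = np.array([60.0, 70.0, 80.0, 140.0, 160.0, 230.0, 400.0])

# Region edges at 1, 2 and 3 times the mean; digitize returns, for each
# area, how many edges lie at or below it (0..3).
edges = mu * np.array([1, 2, 3])
region = np.digitize(areas, edges)

labels = ['x < A', 'A < x < 2A', '2A < x < 3A', 'x > 3A']
counts = np.bincount(region, minlength=len(labels))
# Weighting by area gives each region's share of the covered surface.
area_share = np.bincount(region, weights=areas, minlength=len(labels)) / areas.sum()

for text, n, share in zip(labels, counts, area_share):
    print('%-12s count=%d  area share=%.1f%%' % (text, n, 100.0 * share))
```

Comparing the two columns illustrates the point made above: a region can hold few particles yet account for a large share of the covered area.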
mc['singles'].patchshow(gcolor='red', title='Singles patches');
c_dimers = mc['dimers']
c_dimers.background = nanogold()
c_dimers.patchshow(pmap='jet');