Pandas ordering/grouping/slicing/? MultiIndex columns
I've managed to group my data correctly, but I would like to switch the fastest varying index when I'm displaying my dataframe.
Currently, I've got month varying fastest, but I would like it to be the slowest.
Basically, I'm trying to reorder some columns so that each value is under the month, rather than each much under the value.
Here's a sample data frame that I'm trying to work with.
import pandas as pd
from StringIO import StringIO
TESTDATA=StringIO("""2000, '1Jan', 45.1, 13.442, 13
2000, 'Feb', 46.1, 14.94, 17
2000, 'Mar', 25.1, 15.02, 14
2001, '1Jan', 85., 13.38, 12
2001, 'Feb', 16., 14.81, 15
2001, 'Mar', 49., 15.14, 17
2002, '1Jan', 90., 13.59, 15
2002, 'Feb', 33., 15.13, 22
2002, 'Mar', 82., 14.88, 10
2003, '1Jan', 47., 13.64, 17
2003, 'Feb', 34., 14.83, 16
2003, 'Mar', 78., 15.27, 22""")
df = pd.read_csv(TESTDATA, header=None, names=['year', 'month', 'weight', 'extent', 'rank']).set_index(['year', 'month']).unstack('month')
df
weight | extent | rank | |||||||
---|---|---|---|---|---|---|---|---|---|
month | '1Jan' | 'Feb' | 'Mar' | '1Jan' | 'Feb' | 'Mar' | '1Jan' | 'Feb' | 'Mar' |
year | |||||||||
2000 | 45.1 | 46.1 | 25.1 | 13.442 | 14.94 | 15.02 | 13 | 17 | 14 |
2001 | 85.0 | 16.0 | 49.0 | 13.380 | 14.81 | 15.14 | 12 | 15 | 17 |
2002 | 90.0 | 33.0 | 82.0 | 13.590 | 15.13 | 14.88 | 15 | 22 | 10 |
2003 | 47.0 | 34.0 | 78.0 | 13.640 | 14.83 | 15.27 | 17 | 16 | 22 |
df.columns
MultiIndex(levels=[[u'weight', u'extent', u'rank'], [u' '1Jan'', u' 'Feb'', u' 'Mar'']], labels=[[0, 0, 0, 1, 1, 1, 2, 2, 2], [0, 1, 2, 0, 1, 2, 0, 1, 2]], names=[None, u'month'])
That's nice, but I really want 'months' as the top level
Here's where I get lost, I've played with a few things and I can't get what I'm looking for. I know the index would look like this after the fact:
pd.MultiIndex(levels=[ ['1Jan', 'Feb', 'Mar'], [u'weight', u'extent', u'rank']],
labels=[ [0, 0, 0, 1, 1, 1, 2, 2, 2],[0, 1, 2, 0, 1, 2, 0, 1, 2]],
names=[None, u'month'])
And I know that just setting the index doesn't work:
df.columns = pd.MultiIndex(levels=[ ['1Jan', 'Feb', 'Mar'], [u'weight', u'extent', u'rank']],
labels=[ [0, 0, 0, 1, 1, 1, 2, 2, 2],[0, 1, 2, 0, 1, 2, 0, 1, 2]],
names=[u'month', None])
df
month | 1Jan | Feb | Mar | ||||||
---|---|---|---|---|---|---|---|---|---|
weight | extent | rank | weight | extent | rank | weight | extent | rank | |
year | |||||||||
2000 | 45.1 | 46.1 | 25.1 | 13.442 | 14.94 | 15.02 | 13 | 17 | 14 |
2001 | 85.0 | 16.0 | 49.0 | 13.380 | 14.81 | 15.14 | 12 | 15 | 17 |
2002 | 90.0 | 33.0 | 82.0 | 13.590 | 15.13 | 14.88 | 15 | 22 | 10 |
2003 | 47.0 | 34.0 | 78.0 | 13.640 | 14.83 | 15.27 | 17 | 16 | 22 |
Any help with answers or even search keywords would be apprecated.
month 1Jan Feb Mar
weight extent rank weight extent rank weight extent rank
year
2000 45 13.442 46 25 14.94 15.02 13 17 14
2001 85 13.380 16 49 14.81 15.14 12 15 17
2002 90 13.590 33 82 15.13 14.88 15 22 10
2003 47 13.640 34 78 14.83 15.27 17 16 22