Pandas

Titanic Exercise

Titanic - Women and children first?
Calculate the proportion of passengers that survived "grouped" by sex
Calculate the proportion of passengers that survived "grouped" by sex and pclass (Passenger Class)
Create age categories (Children <14 years, Teenagers 14-20 years, Adults 20-65 years, Seniors 65+ years) and calculate the proportion of passengers that survived and the number of passengers, "grouped" by sex, pclass and age category

Read the data from the CSV file ("../../data/titanic/titanic.txt") and calculate the proportion of passengers that survived "grouped" by sex

In [1]:

%matplotlib inline
import numpy as np
import pandas as pd

"""
Please, write your code here!
"""

titanic_frame = pd.read_csv("../../data/titanic/titanic.txt")
titanic_frame.groupby(['sex'])['survived'].mean()

Out[1]:

sex
female    0.663067
male      0.167059
Name: survived, dtype: float64
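The mean works as a proportion here because survived is coded as 0/1, so the group mean equals survivors divided by group size. A minimal sketch on a made-up five-row frame (the toy data and the names toy, by_mean and by_ratio are illustrative assumptions, not part of the Titanic file):

```python
import pandas as pd

# Hypothetical toy data, NOT the Titanic file: 1 = survived, 0 = did not survive.
toy = pd.DataFrame({
    "sex": ["female", "female", "female", "male", "male"],
    "survived": [1, 1, 0, 0, 1],
})

grouped = toy.groupby("sex")["survived"]

# Mean of a 0/1 column per group.
by_mean = grouped.mean()

# The same quantity computed explicitly: survivors divided by group size.
by_ratio = grouped.sum() / grouped.count()

print(by_mean)
print(by_ratio)
```

Both series should agree, which is why the one-line mean() in the cell above is enough.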

Calculate the proportion of passengers that survived "grouped" by sex and pclass (Passenger Class)

In [2]:

"""
Please, write your code here!
"""
titanic_frame.groupby(['sex', 'pclass'])['survived'].mean()

Out[2]:

sex     pclass
female  1st       0.937063
        2nd       0.878505
        3rd       0.370892
male    1st       0.329609
        2nd       0.144509
        3rd       0.116466
Name: survived, dtype: float64
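A result indexed by two keys can be easier to compare as a small table. One possible way, sketched on made-up data (toy, rates and table are hypothetical names), is to pivot the inner index level into columns with unstack:

```python
import pandas as pd

# Hypothetical toy data, NOT the Titanic file.
toy = pd.DataFrame({
    "sex": ["female", "female", "male", "male", "male", "female"],
    "pclass": ["1st", "3rd", "1st", "3rd", "3rd", "1st"],
    "survived": [1, 0, 1, 0, 0, 1],
})

# Series with a two-level index (sex, pclass), as in the cell above.
rates = toy.groupby(["sex", "pclass"])["survived"].mean()

# Move the pclass level into columns: one row per sex, one column per class.
table = rates.unstack("pclass")
print(table)
```

The same call on the real result would give one row per sex and one column per passenger class.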

Create age categories (Children <14 years, Teenagers 14-20 years, Adults 20-65 years, Seniors 65+ years) and calculate the proportion of passengers that survived and the number of passengers, "grouped" by sex, pclass and age category

In [3]:

"""
Please, write your code here!
"""
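# Bin ages into right-closed intervals: (0, 13], (13, 20], (20, 65], (65, 999].
# Any passenger with a missing age gets NaN and drops out of the groups below.
titanic_frame['age_category'] = pd.cut(titanic_frame['age'], [0, 13, 20, 65, 999])
# Mean of the 0/1 survived column = proportion of survivors in each group.
proportion = titanic_frame.groupby(['sex','pclass','age_category'])['survived'].mean()
# Number of passengers in each group (non-missing survived values).
count = titanic_frame.groupby(['sex','pclass','age_category'])['survived'].count()
count.name = 'count'
# Put both series side by side, aligned on the shared group index.
pd.concat([proportion, count], axis=1)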

Out[3]:

                             survived  count
sex    pclass age_category
female 1st    (0, 13]        0.000000      1
              (13, 20]       1.000000     14
              (20, 65]       0.952941     85
              (65, 999]      1.000000      1
       2nd    (0, 13]        1.000000     12
              (13, 20]       0.909091     11
              (20, 65]       0.854839     62
       3rd    (0, 13]        0.363636     11
              (13, 20]       0.611111     18
              (20, 65]       0.464286     28
male   1st    (0, 13]        1.000000      5
              (13, 20]       0.333333      3
              (20, 65]       0.327434    113
              (65, 999]      0.000000      4
       2nd    (0, 13]        1.000000     11
              (13, 20]       0.133333     15
              (20, 65]       0.080000    100
              (65, 999]      0.000000      1
       3rd    (0, 13]        0.375000     16
              (13, 20]       0.107143     28
              (20, 65]       0.095745     94
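The interval labels above come from pd.cut's default right-closed bins, so a fractional age such as 13.5 lands in (13, 20] even though the exercise describes children as under 14. A hedged alternative, assuming the categories are meant as half-open ranges [0, 14), [14, 20), [20, 65), [65, ∞), uses right=False with explicit labels and computes mean and count in one agg call. The toy frame and the names bins, labels and summary are illustrative only:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, NOT the Titanic file; the last row has a missing age.
toy = pd.DataFrame({
    "age": [0.5, 13.5, 14, 20, 64.9, 65, np.nan],
    "survived": [1, 1, 0, 1, 0, 0, 1],
})

# Assumed reading of the exercise: left-closed, right-open age ranges.
bins = [0, 14, 20, 65, np.inf]
labels = ["Children", "Teenager", "Adults", "Seniors"]

# right=False makes each bin [lower, upper), so 13.5 is a child and 14 a teenager.
toy["age_category"] = pd.cut(toy["age"], bins=bins, right=False, labels=labels)

# One pass over the groups: survival proportion and number of passengers.
# observed=True keeps only categories that actually occur in the data.
summary = toy.groupby("age_category", observed=True)["survived"].agg(["mean", "count"])
print(summary)
```

With agg the columns are named mean and count rather than survived and count. The row with a missing age receives no category and is left out of the summary.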
