Read the data from the CSV file ("../../data/titanic/titanic.txt") and calculate the proportion of passengers who survived, grouped by sex.
%matplotlib inline
import numpy as np
import pandas as pd
"""
Please, write your code here!
"""
titanic_frame = pd.read_csv("../../data/titanic/titanic.txt")
titanic_frame.groupby(['sex'])['survived'].mean()
sex
female    0.663067
male      0.167059
Name: survived, dtype: float64
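The mean works as a proportion here because `survived` is a 0/1 indicator, so the group mean equals survivors divided by group size. A minimal sketch of that equivalence, using a small hypothetical frame (`toy`) rather than the Titanic file:

```python
import pandas as pd

# Hypothetical toy data, not the Titanic file: 1 = survived, 0 = did not.
toy = pd.DataFrame({
    "sex": ["female", "female", "female", "male", "male"],
    "survived": [1, 1, 0, 0, 1],
})

grouped = toy.groupby("sex")["survived"]

# Mean of a 0/1 column per group ...
by_mean = grouped.mean()
# ... equals the number of survivors divided by the group size.
by_ratio = grouped.sum() / grouped.count()

print(by_mean)
print(by_ratio)
```

Both printed Series should agree, which is why a single `.mean()` call is enough in the solution above.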
Calculate the proportion of passengers who survived, grouped by sex and pclass (passenger class).
"""
Please, write your code here!
"""
titanic_frame.groupby(['sex', 'pclass'])['survived'].mean()
sex     pclass
female  1st       0.937063
        2nd       0.878505
        3rd       0.370892
male    1st       0.329609
        2nd       0.144509
        3rd       0.116466
Name: survived, dtype: float64
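Grouping by two keys returns a Series with a two-level index. If a sex-by-class grid is easier to read, one option is to move the `pclass` level into the columns with `unstack`. The sketch below assumes a small hypothetical frame (`toy`) in place of the real data:

```python
import pandas as pd

# Hypothetical toy data standing in for the Titanic frame.
toy = pd.DataFrame({
    "sex": ["female", "female", "male", "male", "male", "female"],
    "pclass": ["1st", "3rd", "1st", "3rd", "3rd", "1st"],
    "survived": [1, 0, 0, 0, 1, 1],
})

# Series indexed by (sex, pclass), as in the solution above.
rates = toy.groupby(["sex", "pclass"])["survived"].mean()

# Pivot the inner index level into columns: rows = sex, columns = pclass.
wide = rates.unstack("pclass")
print(wide)
```

The same call should apply to the `titanic_frame` result, giving one row per sex and one column per passenger class.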
Create age categories (Children <14 years, Teenagers 14-20 years, Adults 20-65 years, Seniors 65+ years) and calculate the survival proportion and the passenger count, grouped by sex, pclass and age category.
"""
Please, write your code here!
"""
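# Bin ages into right-closed intervals: (0, 13], (13, 20], (20, 65], (65, 999]
titanic_frame['age_category'] = pd.cut(titanic_frame['age'], [0, 13, 20, 65, 999])
# Survival proportion per group (mean of the 0/1 'survived' column)
proportion = titanic_frame.groupby(['sex','pclass','age_category'])['survived'].mean()
# Number of passengers per group (rows with a missing age fall in no bin and are excluded)
count = titanic_frame.groupby(['sex','pclass','age_category'])['survived'].count()
count.name = 'count'
# Put both results side by side, aligned on the shared index
pd.concat([proportion, count], axis=1)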
| sex | pclass | age_category | survived | count |
|---|---|---|---|---|
| female | 1st | (0, 13] | 0.000000 | 1 |
| | | (13, 20] | 1.000000 | 14 |
| | | (20, 65] | 0.952941 | 85 |
| | | (65, 999] | 1.000000 | 1 |
| | 2nd | (0, 13] | 1.000000 | 12 |
| | | (13, 20] | 0.909091 | 11 |
| | | (20, 65] | 0.854839 | 62 |
| | 3rd | (0, 13] | 0.363636 | 11 |
| | | (13, 20] | 0.611111 | 18 |
| | | (20, 65] | 0.464286 | 28 |
| male | 1st | (0, 13] | 1.000000 | 5 |
| | | (13, 20] | 0.333333 | 3 |
| | | (20, 65] | 0.327434 | 113 |
| | | (65, 999] | 0.000000 | 4 |
| | 2nd | (0, 13] | 1.000000 | 11 |
| | | (13, 20] | 0.133333 | 15 |
| | | (20, 65] | 0.080000 | 100 |
| | | (65, 999] | 0.000000 | 1 |
| | 3rd | (0, 13] | 0.375000 | 16 |
| | | (13, 20] | 0.107143 | 28 |
| | | (20, 65] | 0.095745 | 94 |
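The bins above are right-closed, so a fractional age such as 13.5 lands in `(13, 20]` even though the task describes children as under 14. A possible variant, sketched on hypothetical toy data, uses left-closed bins with named labels and computes both statistics in one `agg` call. Treating ages of exactly 20 as adults and exactly 65 as seniors is an assumption, since the task statement leaves those boundaries ambiguous.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the last row has a missing age.
toy = pd.DataFrame({
    "sex": ["female", "female", "male", "male", "male", "male"],
    "age": [13.5, 30.0, 13.5, 14.0, 70.0, np.nan],
    "survived": [1, 1, 0, 1, 0, 0],
})

# Left-closed bins: [0, 14), [14, 20), [20, 65), [65, inf).
# Assumption: age 20 counts as Adults and age 65 as Seniors.
toy["age_category"] = pd.cut(
    toy["age"],
    bins=[0, 14, 20, 65, np.inf],
    right=False,
    labels=["Children", "Teenager", "Adults", "Seniors"],
)

# One pass over the groups for both statistics.
# observed=True keeps only category combinations that occur in the data.
summary = (
    toy.groupby(["sex", "age_category"], observed=True)["survived"]
       .agg(["mean", "count"])
)
print(summary)
```

Rows with a missing age receive no category and drop out of the grouped result. Applying this variant to `titanic_frame` would shift some passengers between categories, so the numbers would not necessarily match the table above.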