Calculate the change in each variable between 2000 and 2010 as a percentage-point difference: each variable's 2010 percentage minus its 2000 percentage.
The variables included are loaded from the raw folder:
import pandas as pd
immg = pd.read_csv("./data/raw/immg.csv", header = 0)
black = pd.read_csv("./data/raw/black.csv", header = 0)
white = pd.read_csv("./data/raw/white.csv", header = 0)
high_edu = pd.read_csv("./data/raw/high_edu.csv", header = 0)
low_edu = pd.read_csv("./data/raw/low_edu.csv", header = 0)
un_rate = pd.read_csv("./data/raw/un_rate.csv", header = 0)
data = [('Native', un_rate), ('African American', black), ('White', white), ('US Citizens with High Education Level', high_edu), ('US Citizens with Low Education Level', low_edu)]
immg['Growth'] = immg['2010'] - immg['2000']
changes = immg.copy()
del changes['2000']
del changes['2010']
# For each (label, DataFrame) pair, compute the 2010 - 2000 difference
# and store it in `changes` under the group's label.
for name, group in data:
    group['Growth'] = group['2010'] - group['2000']
    changes[name] = group['Growth']
changes.to_csv('./data/cleaned/changes.csv', index = False)
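The percentage-point calculation above can be sketched on toy data. The frames, values, and the helper `point_change` below are made up for illustration and do not come from the raw files. Merging on `State` is also an assumption about how rows should be matched: the code above matches rows by their position (the default integer index), which works only if every CSV lists the states in the same order.

```python
import pandas as pd

# Hypothetical toy data: percentages for two states in 2000 and 2010.
immg_toy = pd.DataFrame(
    {"State": ["A", "B"], "2000": [10.0, 5.0], "2010": [12.5, 4.0]}
)
# Same states, deliberately listed in a different row order.
black_toy = pd.DataFrame(
    {"State": ["B", "A"], "2000": [20.0, 30.0], "2010": [21.0, 28.0]}
)

def point_change(df, label):
    """Return State plus the 2010 - 2000 difference, stored under `label`."""
    out = df[["State"]].copy()
    out[label] = df["2010"] - df["2000"]
    return out

# Merge on State so rows are matched by name rather than by position.
changes_toy = point_change(immg_toy, "Growth").merge(
    point_change(black_toy, "African American"), on="State"
)
print(changes_toy)
```

If the raw files are known to share the same row order, the positional assignment used above gives the same result and the merge is unnecessary.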
Display a sample of the file saved to the cleaned folder.
changes.head()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 5 entries, 0 to 4
Data columns (total 7 columns):
State                                    5 non-null values
Growth                                   5 non-null values
Native                                   5 non-null values
African American                         5 non-null values
White                                    5 non-null values
US Citizens with High Education Level    5 non-null values
US Citizens with Low Education Level     5 non-null values
dtypes: float64(6), object(1)
Copy the useful data files from the raw folder to the cleaned folder.
immg.to_csv('./data/cleaned/immg.csv', index = False)
black.to_csv('./data/cleaned/black.csv', index = False)
white.to_csv('./data/cleaned/white.csv', index = False)
high_edu.to_csv('./data/cleaned/high_edu.csv', index = False)
low_edu.to_csv('./data/cleaned/low_edu.csv', index = False)
un_rate.to_csv('./data/cleaned/un_rate.csv', index = False)
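The six nearly identical `to_csv` calls above could also be written as a loop over a name-to-frame mapping. This is only a sketch: the `frames` dictionary holds made-up stand-ins for the real DataFrames, and a temporary directory replaces `./data/cleaned` so the example runs anywhere.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-ins for the frames loaded earlier in the notebook.
frames = {
    "immg": pd.DataFrame({"State": ["A"], "2000": [10.0], "2010": [12.5]}),
    "black": pd.DataFrame({"State": ["A"], "2000": [30.0], "2010": [28.0]}),
}

# A temporary directory stands in for ./data/cleaned.
out_dir = tempfile.mkdtemp()

# Write each frame to <out_dir>/<name>.csv without the index column.
for name, frame in frames.items():
    frame.to_csv(os.path.join(out_dir, name + ".csv"), index=False)

print(sorted(os.listdir(out_dir)))
```

Keeping the file name and the frame together in one mapping means a new group only needs one new entry rather than a new line of code.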
List the team members contributing to this notebook, along with their responsibilities: