# Python Developer Salary Survey Results

The survey was conducted over a one-month period in February 2014 using a simple Google form. The anonymized data was then stored in an SQL Server database. The raw data is publicly accessible via the SlashDB API. In this notebook we do some simple analysis of the results using (what else) Python. Enjoy.

http://demo.slashdb.com/db/pystreet.html

### Survey data, JSON representation

In [18]:
%matplotlib inline
import pandas
import numpy
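
The later cells use two DataFrames, `responses_all` and `responses_usa`, whose loading step is not shown here. A minimal sketch of how JSON records could be turned into such frames follows. The sample records are made up, and the `country` column and the `load_responses` helper are assumptions for illustration. Only `years_experience`, `salary_usd` and `desired_salary_usd` appear in this notebook.

```python
import io
import pandas

# Hypothetical sample records; the real survey rows come from the SlashDB API.
SAMPLE_JSON = """[
  {"country": "United States", "years_experience": 3, "salary_usd": 85000, "desired_salary_usd": 100000},
  {"country": "Poland", "years_experience": 5, "salary_usd": 30000, "desired_salary_usd": 45000},
  {"country": "United States", "years_experience": 10, "salary_usd": 120000, "desired_salary_usd": 150000}
]"""

def load_responses(json_text):
    """Parse a JSON array of survey records into a DataFrame (illustrative helper)."""
    return pandas.read_json(io.StringIO(json_text))

responses_all = load_responses(SAMPLE_JSON)
# Assumption: a `country` column identifies U.S. respondents.
responses_usa = responses_all[responses_all["country"] == "United States"]
print(responses_all.shape, responses_usa.shape)
```

With real data, the JSON text would be fetched from the API instead of being embedded in the notebook.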

### USA Salary Breakdown

In [23]:
plot = responses_usa.hist(column=['salary_usd','desired_salary_usd'], bins=list(range(0, 500001, 25000)), figsize=(20,5), xrot=45, xlabelsize=15)


### Worldwide Salary Breakdown

In [24]:
plot = responses_all.hist(column=['salary_usd','desired_salary_usd'], bins=list(range(0, 500001, 25000)), figsize=(20,5), xrot=45, xlabelsize=15, ylabelsize=15)
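
The histograms group salaries into 25,000 USD bins from 0 to 500,000. As a rough sketch, the same binning can be tabulated as plain counts with `numpy.histogram`. The salaries below are hypothetical values, not survey data.

```python
import numpy
import pandas

# Hypothetical salaries, for illustration only.
salaries = pandas.Series([18000, 42000, 60000, 74999, 75000, 110000, 500000])

# Same edges as the histograms: 0, 25000, ..., 500000.
edges = numpy.arange(0, 500001, 25000)

# Each bin includes its left edge and excludes its right edge,
# except the last bin, which also includes 500000.
counts, _ = numpy.histogram(salaries, bins=edges)

# Label every bin by its lower edge and show only the non-empty ones.
table = pandas.Series(counts, index=edges[:-1], name="respondents")
print(table[table > 0])
```

A table like this makes it easy to read off exact counts that are hard to judge from bar heights alone.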


### How much more would Python developers like to earn?

In [25]:
x = responses_all[['salary_usd','desired_salary_usd']].mean()
plot = x.plot(kind="bar", figsize=(7,7), fontsize=15, title="Average salary and average desired salary worldwide")
# Relative raise each respondent would like, as a fraction of current salary
raise_pct = (responses_all['desired_salary_usd'] - responses_all['salary_usd']) / responses_all['salary_usd']
print("At a maximum {0:.0%}".format(raise_pct.max()))
print("On average {0:.0%}".format(raise_pct.mean()))
print("At a minimum {0:.0%}".format(raise_pct.min()))

At a maximum 317%
On average 46%
At a minimum -7%
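
The percentages above are per-respondent ratios of the form (desired - current) / current. A minimal sketch of that calculation on made-up numbers follows. The `desired_raise_stats` helper and the filter on non-positive salaries are assumptions added here to avoid division by zero; they are not part of the original analysis.

```python
import pandas

def desired_raise_stats(df):
    """Return max, mean and min of (desired - current) / current."""
    # Skip rows with a missing or non-positive current salary to avoid dividing by zero.
    valid = df[df["salary_usd"] > 0]
    ratio = (valid["desired_salary_usd"] - valid["salary_usd"]) / valid["salary_usd"]
    return {"max": ratio.max(), "mean": ratio.mean(), "min": ratio.min()}

# Hypothetical respondents, for illustration only.
sample = pandas.DataFrame({
    "salary_usd": [50000, 100000, 0, 80000],
    "desired_salary_usd": [75000, 90000, 60000, 80000],
})

stats = desired_raise_stats(sample)
for label in ("max", "mean", "min"):
    print("{0}: {1:.0%}".format(label, stats[label]))
```

Note that this is the mean of per-respondent ratios, which generally differs from the ratio of the two average salaries shown in the bar chart.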


### Salary as a function of years of experience

In [26]:
# Minimum and maximum salary per experience level, U.S. respondents
df = responses_usa[['years_experience','salary_usd']]
g = df.groupby('years_experience')
df = g.agg(['min', 'max'])
df.plot(kind="line", figsize=(12,7), title="U.S. Python developer salary range as a function of experience")

# Same breakdown for all respondents
df = responses_all[['years_experience','salary_usd']]
g = df.groupby('years_experience')
df = g.agg(['min', 'max'])
df.plot(kind="line", figsize=(12,7), title="Worldwide Python developer salary range as a function of experience")

Out[26]:
<matplotlib.axes.AxesSubplot at 0xa78e940>
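
To make the grouping step concrete, here is a small sketch of the min/max aggregation on hypothetical rows. The `spread` column is an addition for illustration and does not appear in the plots above.

```python
import pandas

# Hypothetical respondents, for illustration only.
sample = pandas.DataFrame({
    "years_experience": [1, 1, 5, 5, 5, 10],
    "salary_usd": [40000, 55000, 70000, 95000, 82000, 130000],
})

# One row per experience level, with the lowest and highest salary reported.
salary_range = sample.groupby("years_experience")["salary_usd"].agg(["min", "max"])

# Width of the range at each experience level.
salary_range["spread"] = salary_range["max"] - salary_range["min"]
print(salary_range)
```

Experience levels with a single respondent have a spread of zero, so the plotted range there reflects one data point rather than a real band.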