# Speakers Spider (PyCon 2014 sprint)

This is a project for gender analysis of conference speakers over time. We implemented multiple spiders using the Scrapy package to scrape speaker names from conference websites, and used a combination of the SexMachine package and Genderize.io to infer gender.
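How the two gender sources were combined is not shown in this notebook. One plausible arrangement, sketched here purely as an assumption, is to ask a local lookup first and consult a second source only when the first answer is 'andy' (the label used in the data for undecided names). The functions `local_lookup`, `remote_lookup` and `infer_gender` are hypothetical stand-ins introduced for illustration; they are stubbed with tiny dictionaries so the example runs without the SexMachine package or a request to Genderize.io.

```python
# Hypothetical sketch: combine two gender sources with a fallback.
# Labels follow the ones that appear in this notebook's data
# ('male', 'female', 'mostly_male', 'mostly_female', 'andy').

UNKNOWN = 'andy'

def local_lookup(first_name):
    """Stand-in for a local name dictionary (assumed, not the real package)."""
    table = {'alice': 'female', 'bob': 'male', 'robin': 'andy'}
    return table.get(first_name.lower(), UNKNOWN)

def remote_lookup(first_name):
    """Stand-in for a web-service lookup (assumed, no real HTTP call is made)."""
    table = {'robin': 'female', 'sasha': 'male'}
    return table.get(first_name.lower(), UNKNOWN)

def infer_gender(full_name, primary=local_lookup, fallback=remote_lookup):
    """Use the primary source; consult the fallback only if it is undecided."""
    parts = full_name.split()
    if not parts:
        return UNKNOWN
    first_name = parts[0]
    guess = primary(first_name)
    if guess == UNKNOWN:
        guess = fallback(first_name)
    return guess

speakers = ['Alice Smith', 'Bob Jones', 'Robin Lee', 'Sasha Ivanov', 'Xyzzy Q']
labels = [infer_gender(name) for name in speakers]
```

Passing the two sources in as arguments keeps the fallback logic testable without network access; the real detector and web service could be swapped in behind the same two-function interface.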

Let's first load the data to see what information we have collected so far:

In [47]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline

import seaborn as sns
sns.set_palette("deep", desat=.6)
sns.set_context(rc={"figure.figsize": (11, 6)})

import matplotlib as mpl
mpl.rcParams['font.sans-serif'].insert(0, 'Arial')
mpl.rcParams['font.sans-serif'].insert(0, 'Liberation Sans')
mpl.rcParams['font.family'] = 'sans-serif'

In [137]:
df = pd.read_csv('./ndata.csv')


Let's see how many conferences we scraped.

In [138]:
df['conference'].unique().size

Out[138]:
133

This means that we have scraped 133 unique conferences. Now let's see how many unique (conference, year) pairs we have.

In [139]:
conf_year = df.groupby(['conference', 'year'])
len(conf_year)

Out[139]:
262

So we have 262 (conference, year) pairs. On average, that is about two years of data per conference.
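The average can be checked directly: 262 pairs over 133 conferences is about 1.97 years each. As a minimal sketch, the same figure could be computed per conference with pandas. The rows below are made up for illustration; only the column names `conference` and `year` come from the data set.

```python
import pandas as pd

# Toy stand-in for the scraped data; one row per speaker.
toy = pd.DataFrame({
    'conference': ['A', 'A', 'A', 'B', 'B', 'C'],
    'year':       [2012, 2012, 2013, 2013, 2013, 2014],
})

# Number of distinct years seen for each conference.
years_per_conf = toy.groupby('conference')['year'].nunique()

# Average number of years per conference: A has 2, B has 1, C has 1.
avg_years = years_per_conf.mean()

# The same ratio for the counts reported above.
reported_avg = 262 / 133.0
```

Looking at `years_per_conf` itself, rather than only its mean, would also show how many conferences contribute a single year.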

Now let's see what data we have for the speakers' gender. There are several hundred names (563 in the counts below) marked as 'andy', which means that their gender could not be identified. We are going to ignore those names, on the assumption that the gender distribution among unidentified names is the same as among identified ones. In other words, we assume there is no reason for the gender prediction to work better or worse for names of one gender.

There are also names marked as mostly_male and mostly_female. We could merge them with the male and female classes, but for now we are going to ignore those as well.
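For comparison, the alternative of merging the "mostly" classes could look like the sketch below. It is not what this notebook does; the toy series is invented, and only the five label names are taken from the data.

```python
import pandas as pd

# Toy labels using the five categories present in the data.
gender = pd.Series(['male', 'mostly_male', 'female', 'andy',
                    'mostly_female', 'male'])

# Fold the "mostly" classes into their base class; leave other labels alone.
merged = gender.replace({'mostly_male': 'male', 'mostly_female': 'female'})

# Keep only the two identified classes and count them.
identified = merged[merged.isin(['male', 'female'])]
counts = identified.value_counts()
```

Applied to the counts reported in the next output, merging would give 10431 male and 1240 female names, about 10.6% female instead of about 10.0%.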

In [140]:
df['gender'].describe()

Out[140]:
count     12234
unique        5
top        male
freq       9854
Name: gender, dtype: object
In [141]:
df['gender'].value_counts()

Out[141]:
male             9854
female           1094
mostly_male       577
andy              563
mostly_female     146
dtype: int64
In [142]:
# take only male and female, ignore the rest
df = df[(df.gender == 'male') | (df.gender == 'female')]
df['gender'].value_counts()

Out[142]:
male      9854
female    1094
dtype: int64

## The Overall Gender Frequency

Now that our data is prepared, let's plot it to see what is going on.

In [143]:
gender_plot = df['gender'].value_counts().plot(kind='bar', title='Overall gender frequency in the collected data')
plt.savefig('gender_plot.png', bbox_inches='tight')


## Gender per Year

Using pandas' cross-tabulation, we can calculate a frequency table of genders per year and then plot it.

In [144]:
per_year = pd.crosstab(df['year'], df['gender'])
per_year

Out[144]:
gender female male
year
2001 10 236
2002 11 156
2003 20 142
2004 11 184
2005 17 205
2006 8 201
2007 25 324
2008 30 444
2009 38 538
2010 64 777
2011 95 1154
2012 244 2518
2013 371 2001
2014 150 974

14 rows × 2 columns

In [145]:
per_year_plot = per_year.plot(kind='bar', stacked=True, title="Gender per year")
plt.savefig(r'gender_per_year.png', bbox_inches='tight')


This plot does not show proportions, however. To get those, we can normalize the data per year so that each row gives the male and female shares.

In [146]:
per_year_perc = per_year.div(per_year.sum(axis = 1), axis = 0)
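The row-wise division can be illustrated on a small example. The sketch below uses invented rows with the same `year` and `gender` columns, and also shows that newer pandas versions can produce the same table in one step through the `normalize='index'` argument of `pd.crosstab` (an option that may not exist in the pandas version this notebook was run with).

```python
import pandas as pd

# Toy data with the same two columns used above.
toy = pd.DataFrame({
    'year':   [2012, 2012, 2012, 2012, 2013, 2013],
    'gender': ['male', 'male', 'male', 'female', 'male', 'female'],
})

counts = pd.crosstab(toy['year'], toy['gender'])

# Divide each row by its total so that every row sums to 1.
manual = counts.div(counts.sum(axis=1), axis=0)

# Same result in one step (needs a pandas version with the normalize argument).
direct = pd.crosstab(toy['year'], toy['gender'], normalize='index')
```

In this toy table 2012 comes out as 25% female and 2013 as 50% female, and both approaches agree.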

In [147]:
per_year_plot = per_year_perc.plot(kind='bar', stacked=True, title="Gender frequency per year")
plt.savefig(r'gender_freq_per_year.png', bbox_inches='tight')


As we can see, the percentage of women speakers is below 20% in every year, peaking at about 16% in 2013.
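The claim can be verified from the gender-per-year table shown earlier. The sketch below re-enters those counts by hand (so it does not depend on the CSV file) and computes the female share for each year.

```python
import pandas as pd

# Counts copied from the gender-per-year table above.
per_year = pd.DataFrame(
    {'female': [10, 11, 20, 11, 17, 8, 25, 30, 38, 64, 95, 244, 371, 150],
     'male':   [236, 156, 142, 184, 205, 201, 324, 444, 538, 777,
                1154, 2518, 2001, 974]},
    index=pd.Index(range(2001, 2015), name='year'),
)

# Share of female speakers among identified names, per year.
female_share = per_year['female'] / per_year.sum(axis=1)

peak_year = female_share.idxmax()
peak_share = female_share.max()
```

The highest share is 371 / (371 + 2001), roughly 0.156, in 2013.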

It would be interesting to see whether there is any conference where the percentage of women speakers is 20% or above.

In [160]:
conf_year = df.groupby(['conference', 'year'])
conf_year['gender'].value_counts(normalize = True).to_dict()

Out[160]:
{('Acts as Conference', 2009L, 'male'): 1.0,
('Agile Roots', 2009L, 'female'): 0.15789473684210525,
('Agile Roots', 2009L, 'male'): 0.84210526315789469,
('Agile Roots', 2010L, 'female'): 0.21212121212121213,
('Agile Roots', 2010L, 'male'): 0.78787878787878785,
('Airbnb Tech Talks', 2012L, 'female'): 0.125,
('Airbnb Tech Talks', 2012L, 'male'): 0.875,
('Aloha Ruby Conf', 2012L, 'female'): 0.045454545454545456,
('Aloha Ruby Conf', 2012L, 'male'): 0.95454545454545459,
('AltDevConf', 2012L, 'female'): 0.16666666666666666,
('AltDevConf', 2012L, 'male'): 0.83333333333333337,
('Ancient City Ruby', 2013L, 'female'): 0.10000000000000001,
('Ancient City Ruby', 2013L, 'male'): 0.90000000000000002,
('ArrrrCamp', 2010L, 'female'): 0.14285714285714285,
('ArrrrCamp', 2010L, 'male'): 0.8571428571428571,
('ArrrrCamp', 2011L, 'female'): 0.15384615384615385,
('ArrrrCamp', 2011L, 'male'): 0.84615384615384615,
('ArrrrCamp', 2012L, 'female'): 0.045454545454545456,
('ArrrrCamp', 2012L, 'male'): 0.95454545454545459,
('ArrrrCamp', 2013L, 'female'): 0.052631578947368418,
('ArrrrCamp', 2013L, 'male'): 0.94736842105263153,
('Barcelona Ruby Conference', 2012L, 'female'): 0.0625,
('Barcelona Ruby Conference', 2012L, 'male'): 0.9375,
('Big Ruby', 2013L, 'female'): 0.071428571428571425,
('Big Ruby', 2013L, 'male'): 0.9285714285714286,
('Big Ruby', 2014L, 'female'): 0.26666666666666666,
('Big Ruby', 2014L, 'male'): 0.73333333333333328,
('Burlington Ruby', 2013L, 'female'): 0.16666666666666666,
('Burlington Ruby', 2013L, 'male'): 0.83333333333333337,
2013L,
'female'): 0.23529411764705882,
2013L,
'male'): 0.76470588235294112,
('Chef Conf', 2012L, 'female'): 0.030303030303030304,
('Chef Conf', 2012L, 'male'): 0.96969696969696972,
('ChiPy', 2010L, 'male'): 1.0,
('ChiPy', 2012L, 'male'): 1.0,
('ChiPy', 2013L, 'female'): 0.11764705882352941,
('ChiPy', 2013L, 'male'): 0.88235294117647056,
('ChiPy', 2014L, 'male'): 1.0,
('Chicago Djangonauts', 2011L, 'male'): 1.0,
('Chicago Djangonauts', 2012L, 'male'): 1.0,
('Chicago Djangonauts', 2013L, 'male'): 1.0,
('Chicago Erlang User Group', 2012L, 'male'): 1.0,
('Chicago Erlang User Group', 2013L, 'male'): 1.0,
('Chicago Freelancers', 2013L, 'male'): 1.0,
('Chicago Web Conf', 2012L, 'female'): 0.25,
('Chicago Web Conf', 2012L, 'male'): 0.75,
('ChicagoLUG', 2013L, 'male'): 1.0,
('Continuum', 2013L, 'female'): 0.088235294117647065,
('Continuum', 2013L, 'male'): 0.91176470588235292,
('Copenhagen JS', 2012L, 'male'): 1.0,
('DevCon5', 2012L, 'male'): 1.0,
('DevconTLV Feb', 2013L, 'male'): 1.0,
('DevconTLV January', 2014L, 'male'): 1.0,
('DevconTLV June', 2013L, 'male'): 1.0,
('DevconTLV October', 2013L, 'male'): 1.0,
('DjangoCon', 2009L, 'male'): 1.0,
('DjangoCon', 2010L, 'male'): 1.0,
('DjangoCon', 2011L, 'male'): 1.0,
('DjangoCon', 2012L, 'male'): 1.0,
('DjangoCon AU', 2013L, 'male'): 1.0,
('Ember Conf', 2014L, 'female'): 0.20000000000000001,
('Ember Conf', 2014L, 'male'): 0.80000000000000004,
('Emerging Languages Camp', 2010L, 'female'): 0.16666666666666666,
('Emerging Languages Camp', 2010L, 'male'): 0.83333333333333337,
('Enthought', 2012L, 'female'): 0.025974025974025976,
('Enthought', 2012L, 'male'): 0.97402597402597402,
('Eric', 2013L, 'female'): 0.48148148148148145,
('Eric', 2013L, 'male'): 0.51851851851851849,
('Erlang DC', 2013L, 'female'): 0.076923076923076927,
('Erlang DC', 2013L, 'male'): 0.92307692307692313,
('Erlang Factory', 2013L, 'female'): 0.090909090909090912,
('Erlang Factory', 2013L, 'male'): 0.90909090909090906,
('Erlang Factory Lite Krakow, Poland', 2012L, 'male'): 1.0,
('Erlang Factory Lite Moscow, Russia', 2012L, 'male'): 1.0,
('Erlang Factory SF Bay Area', 2012L, 'female'): 0.018518518518518517,
('Erlang Factory SF Bay Area', 2012L, 'male'): 0.98148148148148151,
('Euro Clojure', 2012L, 'male'): 1.0,
('EuroPython', 2006L, 'male'): 1.0,
('EuroPython', 2008L, 'male'): 1.0,
('EuroPython', 2009L, 'male'): 1.0,
('EuroPython', 2010L, 'female'): 0.11764705882352941,
('EuroPython', 2010L, 'male'): 0.88235294117647056,
('EuroPython', 2011L, 'female'): 0.095652173913043481,
('EuroPython', 2011L, 'male'): 0.90434782608695652,
('EuroPython', 2012L, 'female'): 0.078260869565217397,
('EuroPython', 2012L, 'male'): 0.92173913043478262,
('EuroPython', 2013L, 'female'): 0.076190476190476197,
('EuroPython', 2013L, 'male'): 0.92380952380952386,
('EuroSciPy', 2008L, 'male'): 1.0,
('EuroSciPy', 2009L, 'female'): 0.066666666666666666,
('EuroSciPy', 2009L, 'male'): 0.93333333333333335,
('EuroSciPy', 2010L, 'female'): 0.05128205128205128,
('EuroSciPy', 2010L, 'male'): 0.94871794871794868,
('EuroSciPy', 2011L, 'female'): 0.042553191489361701,
('EuroSciPy', 2011L, 'male'): 0.95744680851063835,
('EuroSciPy', 2012L, 'female'): 0.035714285714285712,
('EuroSciPy', 2012L, 'male'): 0.9642857142857143,
('Farmhouse Conf', 2011L, 'female'): 0.54545454545454541,
('Farmhouse Conf', 2011L, 'male'): 0.45454545454545453,
('Flourish', 2012L, 'female'): 0.18181818181818182,
('Flourish', 2012L, 'male'): 0.81818181818181823,
('Fosdem', 2012L, 'female'): 0.072289156626506021,
('Fosdem', 2012L, 'male'): 0.92771084337349397,
('Fosdem', 2014L, 'female'): 0.055555555555555552,
('Fosdem', 2014L, 'male'): 0.94444444444444442,
('FreeGeek Chicago', 2012L, 'female'): 0.38461538461538464,
('FreeGeek Chicago', 2012L, 'male'): 0.61538461538461542,
('FreeGeek Chicago', 2013L, 'female'): 0.33333333333333331,
('FreeGeek Chicago', 2013L, 'male'): 0.66666666666666663,
('Fronteers', 2010L, 'female'): 0.17647058823529413,
('Fronteers', 2010L, 'male'): 0.82352941176470584,
('Fronteers', 2011L, 'female'): 0.15384615384615385,
('Fronteers', 2011L, 'male'): 0.84615384615384615,
('Fronteers', 2012L, 'female'): 0.125,
('Fronteers', 2012L, 'male'): 0.875,
('Fronteers', 2013L, 'female'): 0.17647058823529413,
('Fronteers', 2013L, 'male'): 0.82352941176470584,
('GORUCO', 2008L, 'male'): 1.0,
('GORUCO', 2009L, 'female'): 0.14285714285714285,
('GORUCO', 2009L, 'male'): 0.8571428571428571,
('GORUCO', 2012L, 'female'): 0.076923076923076927,
('GORUCO', 2012L, 'male'): 0.92307692307692313,
('GORUCO', 2013L, 'female'): 0.25,
('GORUCO', 2013L, 'male'): 0.75,
('Garden City Ruby', 2014L, 'female'): 0.21428571428571427,
('Garden City Ruby', 2014L, 'male'): 0.7857142857142857,
('Golden Gate Ruby Conference', 2009L, 'male'): 1.0,
('Golden Gate Ruby Conference', 2010L, 'female'): 0.23076923076923078,
('Golden Gate Ruby Conference', 2010L, 'male'): 0.76923076923076927,
('Golden Gate Ruby Conference', 2011L, 'female'): 0.076923076923076927,
('Golden Gate Ruby Conference', 2011L, 'male'): 0.92307692307692313,
('Golden Gate Ruby Conference', 2012L, 'female'): 0.22222222222222221,
('Golden Gate Ruby Conference', 2012L, 'male'): 0.77777777777777779,
('Golden Gate Ruby Conference', 2013L, 'female'): 0.21052631578947367,
('Golden Gate Ruby Conference', 2013L, 'male'): 0.78947368421052633,
('HTML 5.tx', 2013L, 'female'): 0.16666666666666666,
('HTML 5.tx', 2013L, 'male'): 0.83333333333333337,
('Ictev', 2013L, 'female'): 0.45070422535211269,
('Ictev', 2013L, 'male'): 0.54929577464788737,
('Ignite Buffalo', 2013L, 'female'): 0.20000000000000001,
('Ignite Buffalo', 2013L, 'male'): 0.80000000000000004,
('Ignite RailsConf', 2012L, 'male'): 1.0,
('International Conference on Functional Programming', 2012L, 'male'): 1.0,
('JRuby Conference', 2009L, 'male'): 1.0,
('JSConf', 2011L, 'male'): 1.0,
('JSConf', 2012L, 'female'): 0.20454545454545456,
('JSConf', 2012L, 'male'): 0.79545454545454541,
('JSConf EU', 2013L, 'female'): 0.22448979591836735,
('JSConf EU', 2013L, 'male'): 0.77551020408163263,
('Jax Conf', 2012L, 'male'): 1.0,
('Jenkins User Conference San Francisco', 2012L, 'male'): 1.0,
('Kiwi PyCon', 2013L, 'male'): 1.0,
('Kk', 2012L, 'female'): 0.14705882352941177,
('Kk', 2012L, 'male'): 0.8529411764705882,
('Kod.io', 2014L, 'female'): 0.23529411764705882,
('Kod.io', 2014L, 'male'): 0.76470588235294112,
('LA Ruby Conference', 2009L, 'male'): 1.0,
('LA Ruby Conference', 2010L, 'female'): 0.20000000000000001,
('LA Ruby Conference', 2010L, 'male'): 0.80000000000000004,
('LA Ruby Conference', 2011L, 'male'): 1.0,
('LA Ruby Conference', 2012L, 'male'): 1.0,
('LA Ruby Conference', 2013L, 'female'): 0.125,
('LA Ruby Conference', 2013L, 'male'): 0.875,
('LA Ruby Conference', 2014L, 'female'): 0.10000000000000001,
('LA Ruby Conference', 2014L, 'male'): 0.90000000000000002,
('LXJS', 2012L, 'female'): 0.034482758620689655,
('LXJS', 2012L, 'male'): 0.96551724137931039,
('La', 2012L, 'female'): 0.19444444444444445,
('La', 2012L, 'male'): 0.80555555555555558,
('Lca', 2013L, 'female'): 0.096385542168674704,
('Lca', 2013L, 'male'): 0.90361445783132532,
('Lone Star Ruby Conference', 2009L, 'female'): 0.11764705882352941,
('Lone Star Ruby Conference', 2009L, 'male'): 0.88235294117647056,
('Lone Star Ruby Conference', 2010L, 'female'): 0.076923076923076927,
('Lone Star Ruby Conference', 2010L, 'male'): 0.92307692307692313,
('Lone Star Ruby Conference', 2011L, 'female'): 0.064516129032258063,
('Lone Star Ruby Conference', 2011L, 'male'): 0.93548387096774188,
('Lone Star Ruby Conference', 2013L, 'female'): 0.15384615384615385,
('Lone Star Ruby Conference', 2013L, 'male'): 0.84615384615384615,
('MagRails', 2011L, 'female'): 0.125,
('MagRails', 2011L, 'male'): 0.875,
('MongoDB', 2012L, 'male'): 1.0,
('Mountain rb', 2010L, 'male'): 1.0,
('MountainWest RubyConf', 2007L, 'male'): 1.0,
('MountainWest RubyConf', 2008L, 'female'): 0.076923076923076927,
('MountainWest RubyConf', 2008L, 'male'): 0.92307692307692313,
('MountainWest RubyConf', 2009L, 'female'): 0.076923076923076927,
('MountainWest RubyConf', 2009L, 'male'): 0.92307692307692313,
('MountainWest RubyConf', 2010L, 'female'): 0.10000000000000001,
('MountainWest RubyConf', 2010L, 'male'): 0.90000000000000002,
('MountainWest RubyConf', 2011L, 'male'): 1.0,
('MountainWest RubyConf', 2012L, 'female'): 0.11764705882352941,
('MountainWest RubyConf', 2012L, 'male'): 0.88235294117647056,
('MountainWest RubyConf', 2013L, 'female'): 0.125,
('MountainWest RubyConf', 2013L, 'male'): 0.875,
('Nickel City Ruby Conference', 2013L, 'female'): 0.20000000000000001,
('Nickel City Ruby Conference', 2013L, 'male'): 0.80000000000000004,
('Northeast Scala Symposium', 2012L, 'male'): 1.0,
('OSCON', 2001L, 'female'): 0.04065040650406504,
('OSCON', 2001L, 'male'): 0.95934959349593496,
('OSCON', 2002L, 'female'): 0.065868263473053898,
('OSCON', 2002L, 'male'): 0.93413173652694614,
('OSCON', 2003L, 'female'): 0.12345679012345678,
('OSCON', 2003L, 'male'): 0.87654320987654322,
('OSCON', 2004L, 'female'): 0.056410256410256411,
('OSCON', 2004L, 'male'): 0.94358974358974357,
('OSCON', 2005L, 'female'): 0.076576576576576572,
('OSCON', 2005L, 'male'): 0.92342342342342343,
('OSCON', 2006L, 'female'): 0.039800995024875621,
('OSCON', 2006L, 'male'): 0.96019900497512434,
('OSCON', 2007L, 'female'): 0.088607594936708861,
('OSCON', 2007L, 'male'): 0.91139240506329111,
('OSCON', 2008L, 'female'): 0.086021505376344093,
('OSCON', 2008L, 'male'): 0.91397849462365588,
('OSCON', 2009L, 'female'): 0.092307692307692313,
('OSCON', 2009L, 'male'): 0.90769230769230769,
('OSCON', 2010L, 'female'): 0.072368421052631582,
('OSCON', 2010L, 'male'): 0.92763157894736847,
('OSCON', 2011L, 'female'): 0.089403973509933773,
('OSCON', 2011L, 'male'): 0.91059602649006621,
('OSCON', 2012L, 'female'): 0.13134328358208955,
('OSCON', 2012L, 'male'): 0.86865671641791042,
('OSCON', 2013L, 'female'): 0.20274914089347079,
('OSCON', 2013L, 'male'): 0.79725085910652926,
('OSCON', 2014L, 'female'): 0.18650793650793651,
('OSCON', 2014L, 'male'): 0.81349206349206349,
('OpenStack On Ales', 2013L, 'male'): 1.0,
('PSF', 2012L, 'female'): 0.084415584415584416,
('PSF', 2012L, 'male'): 0.91558441558441561,
('PSF', 2013L, 'female'): 0.12060301507537688,
('PSF', 2013L, 'male'): 0.87939698492462315,
('Pacific Northwest Scala', 2013L, 'male'): 1.0,
('Pumping Station: One', 2013L, 'male'): 1.0,
('PuppetConf', 2012L, 'female'): 0.063829787234042548,
('PuppetConf', 2012L, 'male'): 0.93617021276595747,
('PyCon AU', 2010L, 'male'): 1.0,
('PyCon AU', 2011L, 'female'): 0.5,
('PyCon AU', 2011L, 'male'): 0.5,
('PyCon AU', 2012L, 'male'): 1.0,
('PyCon AU', 2013L, 'male'): 1.0,
('PyCon Australia', 2013L, 'female'): 0.11904761904761904,
('PyCon Australia', 2013L, 'male'): 0.88095238095238093,
('PyCon CA', 2012L, 'female'): 0.15254237288135594,
('PyCon CA', 2012L, 'male'): 0.84745762711864403,
('PyCon CA', 2013L, 'female'): 0.17777777777777778,
('PyCon CA', 2013L, 'male'): 0.82222222222222219,
('PyCon DE', 2012L, 'female'): 0.018867924528301886,
('PyCon DE', 2012L, 'male'): 0.98113207547169812,
('PyCon DE', 2013L, 'female'): 0.016949152542372881,
('PyCon DE', 2013L, 'male'): 0.98305084745762716,
('PyCon US', 2007L, 'female'): 0.033333333333333333,
('PyCon US', 2007L, 'male'): 0.96666666666666667,
('PyCon US', 2008L, 'female'): 0.035714285714285712,
('PyCon US', 2008L, 'male'): 0.9642857142857143,
('PyCon US', 2009L, 'female'): 0.031746031746031744,
('PyCon US', 2009L, 'male'): 0.96825396825396826,
('PyCon US', 2010L, 'female'): 0.080459770114942528,
('PyCon US', 2010L, 'male'): 0.91954022988505746,
('PyCon US', 2011L, 'female'): 0.01282051282051282,
('PyCon US', 2011L, 'male'): 0.98717948717948723,
('PyCon US', 2012L, 'female'): 0.057142857142857141,
('PyCon US', 2012L, 'male'): 0.94285714285714284,
('PyCon US', 2013L, 'female'): 0.14285714285714285,
('PyCon US', 2013L, 'male'): 0.8571428571428571,
('PyCon US', 2014L, 'female'): 0.26315789473684209,
('PyCon US', 2014L, 'male'): 0.73684210526315785,
('PyGotham', 2011L, 'female'): 0.5,
('PyGotham', 2011L, 'male'): 0.5,
('PyOhio', 2010L, 'male'): 1.0,
('PyOhio', 2011L, 'male'): 1.0,
('PyOhio', 2012L, 'female'): 0.038461538461538464,
('PyOhio', 2012L, 'male'): 0.96153846153846156,
('PyOhio', 2013L, 'female'): 0.096774193548387094,
('PyOhio', 2013L, 'male'): 0.90322580645161288,
('PyTennessee', 2014L, 'female'): 0.25925925925925924,
('PyTennessee', 2014L, 'male'): 0.7407407407407407,
('Pygotham', 2012L, 'female'): 0.15384615384615385,
('Pygotham', 2012L, 'male'): 0.84615384615384615,
('Rails Israel', 2012L, 'male'): 1.0,
('Rails Israel', 2013L, 'female'): 0.083333333333333329,
('Rails Israel', 2013L, 'male'): 0.91666666666666663,
('RailsConf', 2012L, 'female'): 0.032258064516129031,
('RailsConf', 2012L, 'male'): 0.967741935483871,
('RailsConf', 2013L, 'female'): 0.13235294117647059,
('RailsConf', 2013L, 'male'): 0.86764705882352944,
('Rest Fest', 2012L, 'female'): 0.043478260869565216,
('Rest Fest', 2012L, 'male'): 0.95652173913043481,
('Rocky Mountain Ruby', 2011L, 'female'): 0.043478260869565216,
('Rocky Mountain Ruby', 2011L, 'male'): 0.95652173913043481,
('Rocky Mountain Ruby', 2012L, 'female'): 0.041666666666666664,
('Rocky Mountain Ruby', 2012L, 'male'): 0.95833333333333337,
('Rocky Mountain Ruby', 2013L, 'female'): 0.086956521739130432,
('Rocky Mountain Ruby', 2013L, 'male'): 0.91304347826086951,
('Ruby Conf Australia', 2013L, 'female'): 0.14285714285714285,
('Ruby Conf Australia', 2013L, 'male'): 0.8571428571428571,
('Ruby Conference', 2007L, 'female'): 0.032258064516129031,
('Ruby Conference', 2007L, 'male'): 0.967741935483871,
('Ruby Conference', 2008L, 'female'): 0.027027027027027029,
('Ruby Conference', 2008L, 'male'): 0.97297297297297303,
('Ruby Conference', 2009L, 'female'): 0.050000000000000003,
('Ruby Conference', 2009L, 'male'): 0.94999999999999996,
('Ruby Conference', 2010L, 'female'): 0.078125,
('Ruby Conference', 2010L, 'male'): 0.921875,
('Ruby Conference', 2011L, 'female'): 0.050847457627118647,
('Ruby Conference', 2011L, 'male'): 0.94915254237288138,
('Ruby Conference', 2012L, 'female'): 0.11363636363636363,
('Ruby Conference', 2012L, 'male'): 0.88636363636363635,
('Ruby Conference', 2013L, 'female'): 0.14000000000000001,
('Ruby Conference', 2013L, 'male'): 0.85999999999999999,
('Ruby Hoedown', 2007L, 'female'): 0.125,
('Ruby Hoedown', 2007L, 'male'): 0.875,
('Ruby Hoedown', 2008L, 'female'): 0.0625,
('Ruby Hoedown', 2008L, 'male'): 0.9375,
('Ruby Hoedown', 2010L, 'male'): 1.0,
('Ruby Lugdunum (RuLu)', 2012L, 'male'): 1.0,
('Ruby Midwest', 2011L, 'female'): 0.095238095238095233,
('Ruby Midwest', 2011L, 'male'): 0.90476190476190477,
('Ruby Midwest', 2013L, 'female'): 0.1875,
('Ruby Midwest', 2013L, 'male'): 0.8125,
('Ruby Nation', 2012L, 'male'): 1.0,
('Ruby On Ales', 2011L, 'male'): 1.0,
('Ruby On Ales', 2012L, 'female'): 0.16666666666666666,
('Ruby On Ales', 2012L, 'male'): 0.83333333333333337,
('Ruby On Ales', 2013L, 'female'): 0.26666666666666666,
('Ruby On Ales', 2013L, 'male'): 0.73333333333333328,
('RubyConf India', 2012L, 'female'): 0.037037037037037035,
('RubyConf India', 2012L, 'male'): 0.96296296296296291,
('RubyConf India', 2013L, 'male'): 1.0,
('RubyConf Uruguay', 2010L, 'male'): 1.0,
('RubyConf Uruguay', 2013L, 'female'): 0.10526315789473684,
('RubyConf Uruguay', 2013L, 'male'): 0.89473684210526316,
('Ruby|Web Conference', 2010L, 'male'): 1.0,
('SciPy', 2008L, 'male'): 1.0,
('SciPy', 2009L, 'male'): 1.0,
('SciPy', 2010L, 'female'): 0.046153846153846156,
('SciPy', 2010L, 'male'): 0.9538461538461539,
('SciPy', 2011L, 'female'): 0.063492063492063489,
('SciPy', 2011L, 'male'): 0.93650793650793651,
('SciPy', 2012L, 'male'): 1.0,
('SciPy', 2013L, 'female'): 0.10000000000000001,
('SciPy', 2013L, 'male'): 0.90000000000000002,
('Scotland Ruby', 2011L, 'male'): 1.0,
('Steel City Ruby', 2012L, 'female'): 0.125,
('Steel City Ruby', 2012L, 'male'): 0.875,
('Steel City Ruby', 2013L, 'female'): 0.36363636363636365,
('Steel City Ruby', 2013L, 'male'): 0.63636363636363635,
('Sunny Conf', 2010L, 'male'): 1.0,
('The Next Web', 2012L, 'female'): 0.086956521739130432,
('The Next Web', 2012L, 'male'): 0.91304347826086951,
('Troy', 2013L, 'female'): 0.10526315789473684,
('Troy', 2013L, 'male'): 0.89473684210526316,
('Waza', 2012L, 'male'): 1.0,
('Web Directions Code', 2012L, 'female'): 0.14285714285714285,
('Web Directions Code', 2012L, 'male'): 0.8571428571428571,
('Web Directions South', 2012L, 'male'): 1.0,
('Web Rebels', 2012L, 'male'): 1.0,
('Wicked Good Ruby', 2013L, 'female'): 0.16666666666666666,
('Wicked Good Ruby', 2013L, 'male'): 0.83333333333333337,
('Windy City DB', 2010L, 'male'): 1.0,
('Windy City DB', 2011L, 'female'): 0.25,
('Windy City DB', 2011L, 'male'): 0.75,
('Windy City DB', 2012L, 'male'): 1.0,
('Windy City Go', 2011L, 'female'): 0.16666666666666666,
('Windy City Go', 2011L, 'male'): 0.83333333333333337,
('Windy City Go', 2012L, 'female'): 0.25,
('Windy City Go', 2012L, 'male'): 0.75,
('Windy City Rails', 2009L, 'male'): 1.0,
('Windy City Rails', 2010L, 'male'): 1.0,
('Windy City Rails', 2011L, 'male'): 1.0,
('Windy City Rails', 2012L, 'male'): 1.0,
('X.Org Developer Conference', 2012L, 'male'): 1.0,
('confoo.ca', 2010L, 'female'): 0.043956043956043959,
('confoo.ca', 2010L, 'male'): 0.95604395604395609,
('confoo.ca', 2011L, 'female'): 0.019230769230769232,
('confoo.ca', 2011L, 'male'): 0.98076923076923073,
('confoo.ca', 2012L, 'female'): 0.066666666666666666,
('confoo.ca', 2012L, 'male'): 0.93333333333333335,
('confoo.ca', 2013L, 'female'): 0.13186813186813187,
('confoo.ca', 2013L, 'male'): 0.86813186813186816,
('confoo.ca', 2014L, 'female'): 0.13095238095238096,
('confoo.ca', 2014L, 'male'): 0.86904761904761907,
('curtin', 2014L, 'female'): 0.5,
('curtin', 2014L, 'male'): 0.5,
('developerweek.com', 2013L, 'female'): 0.12621359223300971,
('developerweek.com', 2013L, 'male'): 0.87378640776699024,
('developerweek.com', 2014L, 'female'): 0.13953488372093023,
('developerweek.com', 2014L, 'male'): 0.86046511627906974,
('djangocon.eu', 2011L, 'female'): 0.068965517241379309,
('djangocon.eu', 2011L, 'male'): 0.93103448275862066,
('djangocon.eu', 2013L, 'female'): 0.034482758620689655,
('djangocon.eu', 2013L, 'male'): 0.96551724137931039,
('djangocon.eu', 2014L, 'female'): 0.10344827586206896,
('djangocon.eu', 2014L, 'male'): 0.89655172413793105,
('eurUko', 2012L, 'female'): 0.071428571428571425,
('eurUko', 2012L, 'male'): 0.9285714285714286,
('jQuery Conference San Francisco', 2012L, 'female'): 0.083333333333333329,
('jQuery Conference San Francisco', 2012L, 'male'): 0.91666666666666663,
('jQuery Conference UK', 2012L, 'male'): 1.0,
('js.chi();', 2012L, 'male'): 1.0,
('meet.js SUMMIT', 2012L, 'female'): 0.0625,
('meet.js SUMMIT', 2012L, 'male'): 0.9375,
('openstack Summit Portland', 2013L, 'female'): 0.16666666666666666,
('openstack Summit Portland', 2013L, 'male'): 0.83333333333333337,
('openstack summit fall', 2012L, 'female'): 0.094736842105263161,
('openstack summit fall', 2012L, 'male'): 0.90526315789473688,
('railsberry', 2012L, 'female'): 0.10000000000000001,
('railsberry', 2012L, 'male'): 0.90000000000000002,
('rockymtnruby.com', 2010L, 'male'): 1.0,
('rockymtnruby.com', 2011L, 'female'): 0.043478260869565216,
('rockymtnruby.com', 2011L, 'male'): 0.95652173913043481,
('rockymtnruby.com', 2012L, 'female'): 0.17647058823529413,
('rockymtnruby.com', 2012L, 'male'): 0.82352941176470584,
('rockymtnruby.com', 2013L, 'female'): 0.1111111111111111,
('rockymtnruby.com', 2013L, 'male'): 0.88888888888888884,
('strangeloop.com', 2009L, 'male'): 1.0,
('strangeloop.com', 2011L, 'female'): 0.085714285714285715,
('strangeloop.com', 2011L, 'male'): 0.91428571428571426,
('strangeloop.com', 2012L, 'female'): 0.080000000000000002,
('strangeloop.com', 2012L, 'male'): 0.92000000000000004,
('strangeloop.com', 2013L, 'female'): 0.2441860465116279,
('strangeloop.com', 2013L, 'male'): 0.7558139534883721,
('strataconf', 2011L, 'female'): 0.125,
('strataconf', 2011L, 'male'): 0.875,
('strataconf', 2012L, 'female'): 0.16107382550335569,
('strataconf', 2012L, 'male'): 0.83892617449664431,
('strataconf', 2013L, 'female'): 0.17679558011049723,
('strataconf', 2013L, 'male'): 0.82320441988950277}
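Scanning a dictionary of this size by eye is error-prone. As a rough sketch, the (conference, year) pairs at or above the 20% threshold could be filtered out directly. The toy frame below is invented; the column names match the data set, and `summary` and `at_least_20` are names introduced here for illustration.

```python
import pandas as pd

# Toy stand-in for the filtered data; one row per speaker.
toy = pd.DataFrame({
    'conference': ['A'] * 5 + ['B'] * 4 + ['C'] * 2,
    'year':       [2013] * 5 + [2013] * 4 + [2014] * 2,
    'gender':     ['male', 'male', 'male', 'male', 'female',   # A: 1 of 5
                   'male', 'male', 'male', 'male',             # B: 0 of 4
                   'male', 'female'],                          # C: 1 of 2
})

# Female share per (conference, year): the mean of a boolean column.
is_female = toy['gender'] == 'female'
keys = [toy['conference'], toy['year']]
female_share = is_female.groupby(keys).mean()

# Number of speakers per pair, useful for spotting tiny samples.
n_speakers = is_female.groupby(keys).size()

summary = pd.DataFrame({'female_share': female_share, 'speakers': n_speakers})
at_least_20 = summary[summary['female_share'] >= 0.2].sort_values(
    'female_share', ascending=False)
```

Keeping the speaker count next to the share matters because a 50% share may rest on only a handful of identified names, which could be the case for some of the 0.5 entries in the output above.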

Let's plot a few of them.
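The plotting cells that follow repeat the same three lines with a different conference name each time. One possible refactoring, shown only as a sketch, wraps them in a helper; `plot_conference` is a name introduced here and the toy rows are invented.

```python
import matplotlib
matplotlib.use('Agg')  # headless backend for running outside a notebook; drop this line inside one
import pandas as pd

def plot_conference(df, name):
    """Stacked bar chart of speaker counts per year for one conference."""
    conf_data = df[df['conference'] == name]
    conf_data_year = pd.crosstab(conf_data['year'], conf_data['gender'])
    return conf_data_year.plot(kind='bar', stacked=True, title=name)

# Toy rows for illustration only.
toy = pd.DataFrame({
    'conference': ['X Conf', 'X Conf', 'X Conf', 'Y Conf'],
    'year':       [2012, 2013, 2013, 2013],
    'gender':     ['male', 'male', 'female', 'male'],
})

ax = plot_conference(toy, 'X Conf')
```

With such a helper, each of the cells would reduce to a single call such as `plot_conference(df, 'PyCon US')`.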

In [167]:
conf_data = df[(df.conference == 'Golden Gate Ruby Conference') ]
conf_data_year = pd.crosstab(conf_data['year'], conf_data['gender'])
conf_data_year.plot(kind = 'bar', stacked  = True, title = 'Golden Gate Ruby Conference')

Out[167]:
<matplotlib.axes.AxesSubplot at 0xbd4d8ec>
In [168]:
conf_data = df[(df.conference == 'strangeloop.com') ]
conf_data_year = pd.crosstab(conf_data['year'], conf_data['gender'])
conf_data_year.plot(kind = 'bar', stacked  = True, title = 'strangeloop.com')

Out[168]:
<matplotlib.axes.AxesSubplot at 0xc0dbacc>
In [169]:
conf_data = df[(df.conference == 'PyCon US') ]
conf_data_year = pd.crosstab(conf_data['year'], conf_data['gender'])
conf_data_year.plot(kind = 'bar', stacked  = True, title = 'PyCon US')

Out[169]:
<matplotlib.axes.AxesSubplot at 0xbc7396c>
In [171]:
conf_data = df[(df.conference == 'Cascadia Ruby') ]
conf_data_year = pd.crosstab(conf_data['year'], conf_data['gender'])
conf_data_year.plot(kind = 'bar', stacked  = True, title = 'Cascadia Ruby')

Out[171]:
<matplotlib.axes.AxesSubplot at 0xc11928c>
In [175]:
conf_data = df[(df.conference == 'Farmhouse Conf') ]
conf_data_year = pd.crosstab(conf_data['year'], conf_data['gender'])
conf_data_year.plot(kind = 'bar', stacked  = True, title = 'Farmhouse Conf')

Out[175]:
<matplotlib.axes.AxesSubplot at 0xc58396c>

Farmhouse Conf is one of the few conferences we scraped that has about the same number of female and male speakers. The conference is not strictly about coding. In addition, the equal balance of female and male speakers was enforced by the rules of the conference.