Let's first import the modules we're going to need to use plus define some handy lists we're going to need later on: import csv import StringIO import sys player_names = ["Dwayne Gretskeet", "South Park Cows", "the favorites", "Pmoda", "Fresh Baked Brownies", "JoshHum", "Pika Pikachus", "BenDover", "Mikhail's Team", "First Place", "Team Leaddz", "EWeb FTW", "HERRO. THIS IS DOG", "Matthew's Team"] scoring_categories = ["G", "A", "+/-", "PIM", "PPP", "SOG", "W", "GAA", "SV%", "SHO"] And now do the actual parsing into some data structures: csv_data = """Player,G,A,+/-,PIM,PPP,SOG,W,GAA,SV%,SHO Dwayne Gretskeet,154,191,26,379,109,1325,38,2.32,0.926,4 South Park Cows,141,191,46,282,115,1157,31,2.57,0.92,4 Mikhail's Team,138,160,40,268,97,1233,13,2.88,0.9,3 the favorites,132,153,82,307,76,1269,33,2.23,0.918,5 Pmoda,124,186,-25,318,97,1283,32,2.11,0.919,4 JoshHum,121,198,-22,227,115,1263,21,2.93,0.914,2 BenDover,119,206,74,248,88,1064,22,2.52,0.905,0 Pika Pikachus,113,201,59,177,103,1022,21,2.65,0.91,2 Fresh Baked Brownies,95,185,36,262,91,1092,25,2.01,0.926,5 Matthew's Team,89,169,9,256,97,1052,13,3.11,0.898,1 First Place,88,214,-51,186,108,932,29,2.64,0.909,2 HERRO. THIS IS DOG,83,148,51,245,70,912,19,2.84,0.914,2 EWeb FTW,81,149,-28,164,70,826,26,2.4,0.909,5 Team Leaddz,81,156,14,279,72,941,16,2.61,0.906,2 """ f = StringIO.StringIO(csv_data) reader = csv.DictReader(f) per_player_stats = {i['Player']: {x: i[x] for x in scoring_categories} for i in reader} Awesome. Let's take a look at our data (it should look pretty similar to our original CSV) by defining a print function for our data structure: from IPython.core.display import HTML def print_stats(stats, columns, keyword): output = '' output += '' for col in columns: output += '' output += "" for k,v in stats.iteritems(): output += '' for col in columns: output += '' output += '' output += "
' + keyword + '' + col + '
' + k + '' + str(v[col]) + '
" return HTML(output) def print_players(players, columns): return print_stats(players, columns, "Player") def print_categories(categories, columns): return print_stats(categories, columns, "Category") print_players(per_player_stats, scoring_categories) Nice! Now let's get into the real fun: ranking the data. What is a rank you ask? Well, let's take Goals as an example: The person with the most Goals will have a ranking of 1, second most -- 2, etc. We're going to do this for each stat category. First, let's define a function to rank each category: def rank(unranked, reverse=True): ranked = {} for x in unranked: ranked_values = sorted(unranked.values()) if reverse: ranked_values.reverse() ranked[x] = ranked_values.index(unranked[x]) + 1 # add 1 to account for zero indexing return ranked Great, but our data is structured such that it is indexed by player. But our ranking function takes a dictionary whose key is the stat category. Let's transpose our data so that this is the case: def transpose_players_to_categories(players): stat_categories = [] for player in players: stat_categories = players[player].keys() break categories = {category: {} for category in stat_categories} for stat in players: for cat in stat_categories: try: categories[cat][stat] = int(str(players[stat][cat])) except ValueError: categories[cat][stat] = float(str(players[stat][cat])) return categories unranked_stats = transpose_players_to_categories(per_player_stats) OK, so now we have our data in the right format, but it's still just the totals. We haven't ranked it yet. Let's print it to take a look at what our data looks like: print_categories(unranked_stats, player_names) Let's rank the data now. We're also going to rank all of our data in the opposite order in a seperate dictionary. I'll come back to why this is necessary a little later: (Notice that for GAA, lower is actually better, so we rank it in the other direction) ranked_stats = {} for stat in unranked_stats: if stat == "GAA": ranked_stats[stat] = rank(unranked_stats[stat], False) else: ranked_stats[stat] = rank(unranked_stats[stat]) backwards = {} for stat in unranked_stats: if stat != "GAA": backwards[stat] = rank(unranked_stats[stat], False) else: backwards[stat] = rank(unranked_stats[stat]) Perfect, let's display it: print_categories(ranked_stats, player_names) D'oh, so close! The data is still transposed. That's no good. Let's transpose it back to it's original format (and also transpose our backwards version while we're at it): players = {player: {} for player in per_player_stats} for stat in ranked_stats: for player in players: players[player][stat] = ranked_stats[stat][player] players_backwards = {player: {} for player in per_player_stats} for stat in backwards: for player in players_backwards: players_backwards[player][stat] = backwards[stat][player] Awesome! Let's print out the results: print_players(players, scoring_categories) Super. We now have the basis for the rest of the work we're going to do. Let's try to add everyone's average rank: for player in players: avg = sum(players[player].values())/float(len(players[player].values())) avg = int(avg * 100) / 100.0 # trick to round to at most 2 decimal places players[player]["AVG"] = avg print_players(players, ["AVG"]) Next, let's rank each player according to their average rank. At the same time I'm going to manually add in each player's actual rank according to Yahoo's scoring. ranked_avg = rank(transpose_players_to_categories(players)["AVG"], False) for player in players: players[player]["Rank"] = ranked_avg[player] players["South Park Cows"]["Actual"] = 1 players["the favorites"]["Actual"] = 2 players["Dwayne Gretskeet"]["Actual"] = 3 players["Pmoda"]["Actual"] = 4 players["JoshHum"]["Actual"] = 5 players["Fresh Baked Brownies"]["Actual"] = 6 players["BenDover"]["Actual"] = 7 players["Mikhail's Team"]["Actual"] = 8 players["HERRO. THIS IS DOG"]["Actual"] = 9 players["First Place"]["Actual"] = 10 players["Pika Pikachus"]["Actual"] = 11 players["Matthew's Team"]["Actual"] = 12 players["EWeb FTW"]["Actual"] = 13 players["Team Leaddz"]["Actual"] = 14 print_players(players, ["AVG", "Rank", "Actual"]) Alright, let's add everyone's diff between their Rank, and they Actual, Yahoo provided score: for player in players: players[player]["Diff"] = players[player]["Rank"] - players[player]["Actual"] print_players(players, ["AVG", "Rank", "Actual", "Diff"]) Alright, let's add the final few pieces of the puzzle to get back to where the Excel version was at. First the easy part, number of First and Top 5s per player: for player in players: number_ones = 0 top_fives = 0 for cat in players[player]: if cat in scoring_categories: if players[player][cat] == 1: number_ones += 1 if players[player][cat] < 6: top_fives += 1 players[player]["First"] = number_ones players[player]["Top 5"] = top_fives print_players(players, ["AVG", "Rank", "Actual", "Diff", "First", "Top 5"]) It turns out getting the number of Last Place and Bottom 5 finishes is a bit trickier than First and Top 5. We always know that the player with Rank == 1 is in First Place. Likewise, if someone's Rank is between 1 and 5 they are in the Top 5. But is the opposite true for Last Place and Bottom 5? No -- because of ties. There are 14 Players, but if multiple people are tied for last, they're also tied for second to last, and are ranked as 13th rather than 14th. Hey, remember that "backwards" stuff we did a while back? Well, now is when it's important. Ranked the opposite way, we're able to essentially treat Last and Bottom 5 exactly the same way as we did First and Top 5. for player in players_backwards: lasts = 0 bottom_fives = 0 for cat in players_backwards[player]: if cat in scoring_categories: if players_backwards[player][cat] == 1: lasts += 1 if players_backwards[player][cat] < 6: bottom_fives += 1 players[player]["Last"] = lasts players[player]["Bottom 5"] = bottom_fives print_players(players, ["AVG", "Rank", "Actual", "Diff", "First", "Top 5", "Last", "Bottom 5"]) Alright, let's take a look at everything all together now: print_players(players, scoring_categories + ["AVG", "Rank", "Actual", "Diff", "First", "Top 5", "Last", "Bottom 5"]) Aaanndddd, we're back to where we were with the Excel version. Wow, that took a bit longer than expected. It was fun, but took quite a bit more effort than the Excel version did. Lessons learned? Probably, but it's 5am and I'm going to sleep. Thanks for reading!