Notebook
Let's first import the modules we need and define some handy lists we'll use later on:
And now do the actual parsing into some data structures:
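As a rough idea of what this step can look like, here is a minimal sketch using the standard `csv` module. The sample rows, the column names (`Player`, `G`, `A`, `GAA`) and the `parse_stats` helper are all assumptions for illustration; the real CSV isn't shown here.

```python
import csv
import io

# Hypothetical sample data; the real CSV and its column names are assumptions.
SAMPLE_CSV = """Player,G,A,GAA
Alice,30,41,2.50
Bob,25,50,2.10
Carol,30,38,2.90
"""

# Assumed list of stat categories (one of the "handy lists").
STAT_CATEGORIES = ["G", "A", "GAA"]


def parse_stats(text):
    """Parse CSV text into {player: {category: float}}."""
    reader = csv.DictReader(io.StringIO(text))
    data = {}
    for row in reader:
        data[row["Player"]] = {cat: float(row[cat]) for cat in STAT_CATEGORIES}
    return data


players = parse_stats(SAMPLE_CSV)
```

Indexing by player first keeps each row of the CSV together, which is the shape the later steps start from.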
Awesome. Let's take a look at our data (it should look pretty similar to our original CSV) by defining a print function for our data structure:
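A print helper along these lines would do the job. This is only a sketch: the `format_table` name, the column widths, and the tiny sample dictionary are made up for illustration.

```python
def format_table(data, categories):
    """Build a fixed-width text table from {player: {category: value}}."""
    header = f"{'Player':<10}" + "".join(f"{cat:>8}" for cat in categories)
    lines = [header]
    for player, stats in data.items():
        cells = "".join(f"{stats[cat]:>8}" for cat in categories)
        lines.append(f"{player:<10}" + cells)
    return "\n".join(lines)


# Hypothetical sample data, indexed by player.
sample = {
    "Alice": {"G": 30, "GAA": 2.5},
    "Bob": {"G": 25, "GAA": 2.1},
}
table_text = format_table(sample, ["G", "GAA"])
print(table_text)
```

Returning the string (and printing it separately) makes the helper easy to reuse at each stage of the walkthrough.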
Nice! Now let's get into the real fun: ranking the data. What is a rank, you ask? Well, let's take Goals as an example: the person with the most Goals will have a ranking of 1, second most -- 2, etc. We're going to do this for each stat category. First, let's define a function to rank each category:
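Here is one way such a ranking could work, sketched for a single category. It assumes ties share the best available rank (two players tied for the most Goals are both ranked 1, and the next player is ranked 3). The `rank_category` name and the sample numbers are hypothetical.

```python
def rank_category(values, higher_is_better=True):
    """Rank {player: value}; tied players share the same (lowest) rank."""
    ranks = {}
    for player, value in values.items():
        if higher_is_better:
            better = sum(1 for v in values.values() if v > value)
        else:
            better = sum(1 for v in values.values() if v < value)
        # Rank is one more than the number of strictly better players.
        ranks[player] = better + 1
    return ranks


# Hypothetical Goals totals.
goals = {"Alice": 30, "Bob": 25, "Carol": 30, "Dave": 10}
goal_ranks = rank_category(goals)
```

Counting strictly better players is quadratic in the number of players, which is fine for a league-sized list and keeps the tie handling obvious.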
Great, but our data is structured such that it is indexed by player. But our ranking function takes a dictionary whose key is the stat category. Let's transpose our data so that this is the case:
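A small sketch of the transpose, assuming the data is a dictionary of dictionaries (`{player: {category: value}}`). The `transpose` name and the sample values are illustrative only.

```python
def transpose(data):
    """Swap the outer and inner keys of a dict of dicts."""
    flipped = {}
    for outer_key, inner in data.items():
        for inner_key, value in inner.items():
            flipped.setdefault(inner_key, {})[outer_key] = value
    return flipped


# Hypothetical data indexed by player.
by_player = {
    "Alice": {"G": 30, "GAA": 2.5},
    "Bob": {"G": 25, "GAA": 2.1},
}
by_category = transpose(by_player)
```

Because the function just swaps key levels, calling it twice gets you back to the original layout, which is handy once the ranking is done.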
OK, so now we have our data in the right format, but it's still just the totals. We haven't ranked it yet. Let's print it to see what it looks like:
Let's rank the data now. Notice that for GAA, lower is actually better, so we rank it in the other direction. We're also going to rank all of our data in the opposite order in a separate dictionary; I'll come back to why this is necessary a little later:
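To make the two directions concrete, here is a sketch that produces both a forward and a backward ranking per category. The `LOWER_IS_BETTER` set, the helper names, and the sample totals are assumptions; only the special handling of GAA comes from the text.

```python
# Categories where a smaller number is the better result (assumed set).
LOWER_IS_BETTER = {"GAA"}


def rank_values(values, higher_is_better):
    """Rank {player: value}; ties share the same (lowest) rank."""
    def beats(a, b):
        return a > b if higher_is_better else a < b
    return {
        player: 1 + sum(1 for other in values.values() if beats(other, value))
        for player, value in values.items()
    }


def rank_all(by_category):
    """Return (forward, backward) rankings for {category: {player: value}}."""
    forward, backward = {}, {}
    for category, values in by_category.items():
        higher = category not in LOWER_IS_BETTER
        forward[category] = rank_values(values, higher)
        backward[category] = rank_values(values, not higher)  # worst gets 1
    return forward, backward


# Hypothetical totals, already indexed by category.
totals = {
    "G": {"Alice": 30, "Bob": 25, "Carol": 30},
    "GAA": {"Alice": 2.5, "Bob": 2.1, "Carol": 2.9},
}
forward, backward = rank_all(totals)
```

In the backward dictionary a rank of 1 means "worst in this category", regardless of whether the category is one where higher or lower is better.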
Perfect, let's display it:
D'oh, so close! The data is still transposed. That's no good. Let's transpose it back to its original format (and also transpose our backwards version while we're at it):
Awesome! Let's print out the results:
Super. We now have the basis for the rest of the work we're going to do. Let's try to add everyone's average rank:
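A sketch of the average-rank step, assuming the ranks are back in `{player: {category: rank}}` form. The `Avg` key, the `add_average` helper, and the sample ranks are hypothetical.

```python
from statistics import mean


def add_average(ranks):
    """Return a copy of {player: {category: rank}} with an 'Avg' entry added."""
    result = {}
    for player, cat_ranks in ranks.items():
        row = dict(cat_ranks)  # copy so the input is left untouched
        row["Avg"] = mean(cat_ranks.values())
        result[player] = row
    return result


# Hypothetical per-category ranks.
ranks = {
    "Alice": {"G": 1, "A": 2, "GAA": 3},
    "Bob": {"G": 2, "A": 1, "GAA": 1},
}
with_avg = add_average(ranks)
```

The average is taken over the category ranks only, so it should be computed before any other derived columns are added to the row.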
Next, let's rank each player according to their average rank. At the same time I'm going to manually add in each player's actual rank according to Yahoo's scoring.
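Ranking by average works the same way as ranking a category where lower is better. A minimal sketch, with made-up averages and a hypothetical `rank_by_average` helper (the Yahoo ranks themselves are entered by hand and aren't reproduced here):

```python
def rank_by_average(averages):
    """Rank {player: average_rank}; the lowest average gets rank 1."""
    return {
        player: 1 + sum(1 for other in averages.values() if other < avg)
        for player, avg in averages.items()
    }


# Hypothetical average ranks.
averages = {"Alice": 2.0, "Bob": 1.5, "Carol": 2.0, "Dave": 3.5}
overall = rank_by_average(averages)
```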
Alright, let's add the diff between everyone's Rank and their Actual rank from Yahoo's scoring:
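A sketch of the diff column. The sign convention (`Actual - Rank`), the key names, and every number below are assumptions made for illustration; the real Yahoo ranks aren't shown here.

```python
def add_diff(table):
    """Return a copy of {player: row} with 'Diff' = Actual - Rank (assumed sign)."""
    return {
        player: {**row, "Diff": row["Actual"] - row["Rank"]}
        for player, row in table.items()
    }


# Hypothetical computed ranks and hand-entered actual ranks.
table = {
    "Alice": {"Rank": 2, "Actual": 1},
    "Bob": {"Rank": 1, "Actual": 3},
    "Carol": {"Rank": 3, "Actual": 2},
}
with_diff = add_diff(table)
```

With this convention a positive Diff means the player ranks better by average category rank than by Yahoo's scoring, and a negative Diff means the reverse.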
Alright, let's add the final few pieces of the puzzle to get back to where the Excel version was at. First the easy part, number of First and Top 5s per player:
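Counting these is just a matter of checking each category rank against a cutoff. A sketch with a hypothetical `count_finishes` helper and invented ranks:

```python
def count_finishes(ranks, cutoff):
    """Count, per player, the categories where their rank is <= cutoff."""
    return {
        player: sum(1 for rank in cat_ranks.values() if rank <= cutoff)
        for player, cat_ranks in ranks.items()
    }


# Hypothetical per-category ranks.
ranks = {
    "Alice": {"G": 1, "A": 6, "GAA": 2},
    "Bob": {"G": 3, "A": 1, "GAA": 1},
    "Carol": {"G": 7, "A": 9, "GAA": 5},
}
firsts = count_finishes(ranks, cutoff=1)
top_fives = count_finishes(ranks, cutoff=5)
```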
It turns out getting the number of Last Place and Bottom 5 finishes is a bit trickier than First and Top 5. We always know that the player with Rank == 1 is in First Place. Likewise, if someone's Rank is between 1 and 5 they are in the Top 5. But is the opposite true for Last Place and Bottom 5? No -- because of ties. There are 14 Players, but if multiple people are tied for last, they're also tied for second to last, and are ranked as 13th rather than 14th. Hey, remember that "backwards" stuff we did a while back? Well, now is when it's important. Ranked the opposite way, we're able to essentially treat Last and Bottom 5 exactly the same way as we did First and Top 5.
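The tie problem is easy to see with a small experiment. In this sketch, 14 hypothetical players have values 13 down to 2, plus two players tied on 1. The helper name and the numbers are invented; the tie behaviour matches the description above.

```python
def rank_values(values, higher_is_better=True):
    """Rank {player: value}; ties share the same (lowest) rank."""
    def beats(a, b):
        return a > b if higher_is_better else a < b
    return {
        player: 1 + sum(1 for other in values.values() if beats(other, value))
        for player, value in values.items()
    }


# 14 hypothetical players: P1..P12 have 13..2, P13 and P14 are tied on 1.
raw = list(range(13, 1, -1)) + [1, 1]
values = {f"P{i}": v for i, v in enumerate(raw, start=1)}

forward = rank_values(values, higher_is_better=True)
backward = rank_values(values, higher_is_better=False)  # worst gets rank 1

# Forward, nobody is ranked 14th: the tied players are both 13th.
tied_forward = (forward["P13"], forward["P14"])

# Backward, Last Place and Bottom 5 look just like First and Top 5.
last_place = sorted(p for p, r in backward.items() if r == 1)
bottom_five = sorted(p for p, r in backward.items() if r <= 5)
```

Checking `forward == 14` would miss both tied players, while `backward == 1` catches them, which is why the second dictionary is worth carrying around.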
Alright, let's take a look at everything all together now:
Aaanndddd, we're back to where we were with the Excel version. Wow, that took a bit longer than expected. It was fun, but took quite a bit more effort than the Excel version did. Lessons learned? Probably, but it's 5am and I'm going to sleep. Thanks for reading!