As explained in the Before week 1 notebook, each week of this class is an IPython notebook like this one. In order to follow the class, you simply start reading from the top, following the instructions.
Hint: And you can ask me - or any of the friendly Teaching Assistants - for help at any point if you get stuck!
This first lecture will go over a few different topics to get you started.
Now it's time to install Python. The most important thing to know before you begin is that we use Python 2.7, so DO NOT install Python 3.
Every year, someone doesn't read carefully enough, so let me say it again: DO NOT install Python 3.
In spite of the warnings above (and this is TRUE) someone still always installs Python 3. Don't do it. I'm still waiting for the right time to transition - I'm thinking that'll be next year.
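Hint: if you are unsure which version ended up on your machine, a small check like the one below can tell you. This is only a sketch; the helper name is_python_27 is made up for this illustration and is not part of any library.

```python
import sys


def is_python_27(version_info=None):
    """Return True if the given version (default: the running interpreter) is 2.7."""
    if version_info is None:
        version_info = sys.version_info
    # Only the major and minor numbers matter here, e.g. (2, 7)
    return tuple(version_info[:2]) == (2, 7)


print("Running Python %d.%d" % (sys.version_info[0], sys.version_info[1]))
print("Is this Python 2.7? %s" % is_python_27())
```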
Type "jupyter notebook" in your terminal, and the system should be ready to use in your favorite browser. Part 3 will teach you how to use the IPython Notebook. Note that if you want to use another Python distribution, that's fine, but we cannot promise to help you with anything other than Anaconda.
Video lecture: You get started on this part by watching the "IPython Notebook" video below.
from IPython.display import YouTubeVideo
YouTubeVideo("yC754EgHpck",width=600, height=337.5)
Oh, and I forgot a few important things when I made this video, so here are some additional tips & tricks on IPython.
YouTubeVideo("NDZyU_NlX0I",width=600, height=337.5)
Exercises
- Download the IPython file that I've prepared for you and save it somewhere where you can find it again. The link is here. (Hint: Be careful not to save this in .txt format - make sure the extension is .ipynb.)
- Work through exercises 1-9 in the file, solving the simple Python exercises in there. Also use this as a chance to become familiar with how the IPython notebook works. (And a little bit of json.)
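If json is new to you, here is a minimal sketch of the round trip between JSON text and Python objects, using the json module from the standard library. The string in raw is made up for illustration; it is not taken from the exercise file.

```python
import json

# A made-up JSON string, similar in shape to what a web API might return
raw = '{"name": "Ada", "languages": ["Python", "C"], "active": true}'

# JSON text -> Python objects (here: a dict containing a list and a boolean)
record = json.loads(raw)
print(record["name"])
print(record["languages"][0])

# Python objects -> indented JSON text that is easier to read
pretty = json.dumps(record, indent=2, sort_keys=True)
print(pretty)
```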
Now that you've completed working through the IPython notebook, it's time for the moment of truth! If you had great difficulty with the Python coding itself, you're going to be in trouble. Everything we do going forward in this class will depend on you being comfortable with Python. There is simply no way that you will be able to do well if you're also struggling with Python on top of everything else you'll be learning.
So if you're not 100% comfortable with Python, I recommend you stop right now, and follow Codecademy's excellent short course on Python programming before proceeding. This might seem tough, but the ability to program is a prerequisite for this class, and if you know how to program, you should be able to handle the Python questions above.
Ok, so you're now on top of Python, so let's get started with a quick overview of APIs.
Video lecture: Click on the image below to watch it on YouTube.
YouTubeVideo("9l5zOfh0CRo",width=600, height=337.5)
It's time for you to get to work. Take a look at the two texts below - just to get a sense of a more technical description of how APIs work.
Reading (just skim): Wikipedia page on APIs
Reading (just skim): Wikipedia page on REST for web services
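To make the readings a bit more concrete: a request to a REST-style web API is, at its core, just a URL that names a resource and carries some parameters. The sketch below takes such a URL apart using the standard library. The host api.example.com and the parameter values are placeholders chosen for illustration, not a real service.

```python
try:
    from urllib.parse import urlparse, parse_qs   # Python 3
except ImportError:
    from urlparse import urlparse, parse_qs       # Python 2.7

# Placeholder URL; only its shape matters for this illustration
url = "https://api.example.com/1.1/search/tweets.json?q=python&count=10"

parts = urlparse(url)
print(parts.netloc)   # which server to talk to
print(parts.path)     # which resource on that server

# The query string carries the parameters; parse_qs returns lists of strings
params = parse_qs(parts.query)
print(params["q"])
print(params["count"])
```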
In this final part, we're going to use Python to access Twitter's API. This is the main part of the work today!
Reading: Read through chapter 1 of Mining the Social Web, 2nd Edition (MTSW2e) by Matthew A. Russell. We will focus on content up to page 29, so you can skim the rest. The first chapter of the book is available for free. You can get it here.
In MTSW2e, Russell suggests that you install a virtual machine and run Python from there. That's not necessary.
But you will have to install the twitter Python library. The twitter Python library makes it easier to interact with the Twitter API by creating a wrapper around the API URLs, as well as loading the JSON for you.
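To give a feeling for what "a wrapper around the API URLs" can mean, here is a toy sketch in which attribute access builds up the resource path and a call adds the parameters. ToyApi and the host api.example.com are invented for this illustration. This is not the twitter library's actual code, which also handles authentication, the HTTP request itself and the JSON decoding.

```python
try:
    from urllib.parse import urlencode   # Python 3
except ImportError:
    from urllib import urlencode         # Python 2.7


class ToyApi(object):
    """Toy wrapper: api.a.b(x=1) becomes the URL <base>/a/b.json?x=1."""

    def __init__(self, base_url, parts=()):
        self._base_url = base_url
        self._parts = tuple(parts)

    def __getattr__(self, name):
        # Only called for attributes that do not exist yet
        if name.startswith("_"):
            raise AttributeError(name)
        # Each attribute adds one more segment to the resource path
        return ToyApi(self._base_url, self._parts + (name,))

    def __call__(self, **params):
        url = "%s/%s.json" % (self._base_url, "/".join(self._parts))
        if params:
            # Sort the parameters so the resulting URL is predictable
            url += "?" + urlencode(sorted(params.items()))
        return url


api = ToyApi("https://api.example.com/1.1")
print(api.trends.place(id=1))
print(api.search.tweets(q="python", count=10))
```

The point of the design is that you never type URLs by hand: the structure of the Python expression mirrors the structure of the URL.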
There are a number of Twitter libraries for Python, and it's important that you get the right one (they all do more or less the same thing, but they have different syntax).
In order to install the right library, use the Python installer pip. And make sure you use Anaconda's pip, which is in the Anaconda base folder (your system may already have pip installed as part of another Python distro). On my system (a Mac), I installed the twitter library by typing
>> ~/anaconda/bin/pip install twitter
at the terminal.
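Hint: to see whether the library landed in the Python you are actually running, you can try something like the sketch below from inside a notebook. The helper is_installed is a name made up for this illustration; it simply attempts the import and reports whether it worked.

```python
import sys


def is_installed(module_name):
    """Return True if module_name can be imported by the running interpreter."""
    try:
        __import__(module_name)
        return True
    except ImportError:
        return False


# The path shows which Python the notebook uses (it should point into Anaconda)
print(sys.executable)
print("twitter installed? %s" % is_installed("twitter"))
```

If the answer is False even though pip reported success, the most likely explanation is that a different pip, belonging to another Python installation, did the installing.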
Exercises
Solve the exercises below in a new notebook. So when I write "Work through example ...", I mean: do that in your notebook. Make sure that you add lots of comments in and around your code. Don't forget that these exercises will form the foundation for Assignment 1 when that time comes. So it is to your own advantage to write everything up nicely while it's fresh in your memory.
Questions about the text
- According to Chapter 1, what are some things that we humans might want technology to help us get?
- What's the maximum number of characters in a Tweet?
- How many monthly active users does the book say that we can find on Twitter?
- What is the key difference between connections on Twitter and connections on Facebook?
- Take a look at the tweet meta-data on pages 22-25. Which fields do you think are most interesting?
Coding your way through Chapter 1 of MTSW2e
- If you don't have one, start a Twitter account. (And follow @suneman).
- If you're having trouble registering as a developer, you can follow this guide. You might want to use Skype to do this, as international texting can be costly. If you don't get a text message (SMS only works for some Danish numbers), ask Twitter to call with the security code.
- Work through example 1.1 in the book. Note: Some of the code in the book is outdated due to changes to the Twitter API (1.0 -> 1.1). Updated code can be found here. Read the text in the book, but use the code from the link as you work through the next exercises.
- Work through example 1.2, and also retrieve the trends for Denmark. What are they? Hint: I found the Denmark WOE ID using this webpage.
- Work through example 1.3. Bonus: Can you think of other ways to plot nicely formatted json structures?
- Work through example 1.4. No need to calculate intersections for the Denmark trends. I'm pretty sure the overlap will be zero.
- Work through example 1.5, but choose your own term to query for. Make sure that it's a term that gets lots of hits.
- Extract the actual tweet text, and retweet count for all of the tweets you downloaded in example 1.5. Which tweet is the most popular one?
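Hint: two patterns come up again and again in these exercises, namely comparing lists of trends and picking values out of a list of tweets. The sketch below shows both on made-up data. It assumes that each trend is a dict with a name key and each tweet is a dict with text and retweet_count keys; check those names against what the API actually returns to you. Real responses contain many more fields.

```python
# Made-up stand-ins for API results, not real Twitter data
world_trends = [{"name": "#python"}, {"name": "#datascience"}, {"name": "#coffee"}]
us_trends = [{"name": "#python"}, {"name": "#football"}]

# Turning the names into sets makes the overlap a single operation
world_names = set(trend["name"] for trend in world_trends)
us_names = set(trend["name"] for trend in us_trends)
common = world_names & us_names
print(sorted(common))

# Made-up tweets with only the two fields we care about here
statuses = [
    {"text": "first tweet", "retweet_count": 3},
    {"text": "second tweet", "retweet_count": 12},
    {"text": "third tweet", "retweet_count": 0},
]

# Pair each count with its text; max() compares the count first
pairs = [(status["retweet_count"], status["text"]) for status in statuses]
most_popular = max(pairs)
print("%d retweets: %s" % most_popular)
```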
And you're done. Don't forget to congratulate yourself for getting all the way through the exercises!