Homework assignment 1

Your homework assignment takes the form of an IPython Notebook. To work on the assignment, you'll need to follow these steps:

(1) Click here to download a copy of this file (with extension .ipynb).

(2) Upload the copy that you just got from the server to your own IPython Notebook server. Here's how to do that:

(2a) First, make sure your IPython Notebook server is active (check the EC2 admin panel). Open a window to that server and make sure you're on the "Notebooks" screen.

(2b) Find the .ipynb file that you just downloaded on your computer, open a window to your IPython Notebook server, and drag it into the file list.

(2c) You've now got a copy of the assignment where you can work on it. Congratulations!

(2d) Make sure to change the name of your copy of the homework assignment to include your own name (e.g., change it to "Your Name - Data and Databases Homework Assignment 1").

(3) Below are a number of problem sets that require you to write code to achieve a goal. I've left a blank code cell after each question. Your job is to fill in the code cells so that running the code produces the requested output. Use the "Run" command to check that your code works and gives accurate results. Feel free to insert Markdown cells if you have any notes or explanations regarding your code. You must complete all of the problem sets!

(4) When you're done with your assignment, you need to submit it. Go to File > Download as... and click IPython Notebook. A .ipynb file will be downloaded to your computer. E-mail that file as an attachment to [email protected]. (We may change the procedure for submitting homework assignments in the future, as you become more familiar with GitHub.)

This homework assignment is due on June 3rd, 2014.

Problem set 1: Simple indexing and list functions

In the code below, a list of numbers has been assigned to the variable number_list. Run this cell before filling in the answers below.

In [2]:
number_list = [-13, 5, 12, 17, 20, 0]

Write an expression in the cell below, using square bracket index notation, that causes the 4th element of number_list (i.e., the number 17) to be displayed when you run the cell.

In [3]:
number_list[3]
Out[3]:
17
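
The reason index 3 selects the 4th element is that Python list indexes start counting at 0. Here is a minimal sketch of that idea, using a hypothetical list named letters that is not part of the assignment. It uses print(...) with a single argument so that it runs under both the Python 2 used in this notebook and Python 3.

```python
# Hypothetical example list, not part of the assignment.
letters = ["a", "b", "c", "d"]

first = letters[0]                # index 0 is the 1st element
fourth = letters[3]               # index 3 is the 4th element
last = letters[len(letters) - 1]  # the last valid index is len(list) - 1

print(first)   # a
print(fourth)  # d
print(last)    # d
```

Asking for letters[4] here would raise an IndexError, because the valid indexes of a four-element list run from 0 to 3.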

Now write an expression that evaluates to the number of items in the list (i.e., 6), using the len() function.

In [4]:
len(number_list)
Out[4]:
6

The following expression:

print max(number_list)

... will print the largest value in number_list (i.e., 20). Change the value of the variable x in the code below so that the expression

print sorted(number_list)[x]

... does the same thing. (i.e., when you run the cell, it should display 20.)

In [5]:
x = -1
print sorted(number_list)[x]
20
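
To see why a negative index works here, it may help to look at what sorted() returns. The sketch below uses a hypothetical list named values (not part of the assignment) and is written to run under both Python 2 and Python 3.

```python
# Hypothetical example list, not part of the assignment.
values = [3, -7, 10, 4]

ordered = sorted(values)  # a new list in ascending order: [-7, 3, 4, 10]
largest = ordered[-1]     # index -1 is the last element, so the largest value
smallest = ordered[0]     # index 0 is the first element, so the smallest value

print(largest)   # 10
print(smallest)  # -7
print(values)    # [3, -7, 10, 4]
```

Note that sorted() leaves the original list untouched; it builds and returns a new list.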

Problem set 2: Slices

Write an expression below that evaluates to a slice of number_list starting with its second element and ending with its fifth element (exclusive). I.e., the expression should evaluate to [5, 12, 17].

In [7]:
number_list[1:4]
Out[7]:
[5, 12, 17]

Write an expression below that evaluates to a slice of number_list starting with its third element and ending at the end of the list. I.e., the expression should evaluate to [12, 17, 20, 0].

In [8]:
number_list[2:]
Out[8]:
[12, 17, 20, 0]

Finally, fill in a value for the variable x below so that the expression below it evaluates to a slice of number_list starting at the second-to-last element of the list and ending at the end of the list. The expression should evaluate to [20, 0]. x should be a negative integer.

In [9]:
x = -2
print number_list[x:]
[20, 0]
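
The three slice forms used in this problem set can be compared side by side. This is only a sketch on a hypothetical list named letters (not part of the assignment), written to run under both Python 2 and Python 3.

```python
# Hypothetical example list, not part of the assignment.
letters = ["a", "b", "c", "d", "e"]

middle = letters[1:3]     # starts at index 1, stops before index 3
tail = letters[3:]        # starts at index 3, runs to the end of the list
last_two = letters[-2:]   # starts two elements from the end, runs to the end

print(middle)    # ['b', 'c']
print(tail)      # ['d', 'e']
print(last_two)  # ['d', 'e']
```

In each case the start index is included and the stop index, when given, is excluded.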

Problem set 3: Comprehensions

For this problem set, I'm introducing a new Python operator: the modulo operator, %. This operator returns the remainder of dividing one integer by another. For example:

In [10]:
print 22 % 3
1

This expression evaluates to 1 because the remainder of dividing 22 by 3 is 1. We can use the modulo operator to test whether or not a number is even, by using the number 2 on the right side of the operator:

In [11]:
print 100 % 2
print 101 % 2
0
1
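
One way to check what "remainder" means is to pair % with the // operator, which gives the whole-number part of a division. The // operator is not used elsewhere in this assignment; it appears here only as an illustration. The sketch is written to run under both Python 2 and Python 3.

```python
a = 22
b = 3

quotient = a // b   # whole-number part of the division: 7
remainder = a % b   # what is left over: 1

print(quotient)   # 7
print(remainder)  # 1

# The two results together rebuild the original number.
print(quotient * b + remainder == a)  # True
```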

Given the above information, write a list comprehension that evaluates to a list containing only the members of number_list that are odd, i.e., [-13, 5, 17]. Use the modulo operator in the membership expression of the list comprehension.

In [12]:
[x for x in number_list if x % 2 == 1]
Out[12]:
[-13, 5, 17]
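
The test x % 2 == 1 also catches negative odd numbers such as -13, because in Python the result of % takes the sign of the right-hand operand, so an odd number modulo 2 is always 1. A small sketch on a hypothetical list named values (not part of the assignment), written to run under both Python 2 and Python 3:

```python
# Hypothetical example list, not part of the assignment.
values = [-3, -2, 0, 7, 8]

odds = [v for v in values if v % 2 == 1]   # keep members with remainder 1
evens = [v for v in values if v % 2 == 0]  # keep members with remainder 0

print(-3 % 2)  # 1
print(odds)    # [-3, 7]
print(evens)   # [-2, 0, 8]
```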

Problem set 4: Splitting strings

In the cell below, a variable float_str is set to a string containing a list of floating-point numbers, separated by semicolons (;). (Make sure to run this cell before you proceed, so that the variable will be available in subsequent cells.)

In [14]:
float_str = "5.8;6.9;3.1;5.9;6.6;6.5;6.5;5.6;6;6.4;3.32;6.0;6.0;6.3;6.6;6.6"

Write an expression below that converts this string into a list of floating-point numbers. The type of the expression should be list and the type of individual elements in the list should be float. The expression should evaluate to something that looks like this:

[5.8, 6.9, 3.1, 5.9, 6.6, 6.5, 6.5, 5.6, 6.0, 6.4, 3.32, 6.0, 6.0, 6.3, 6.6, 6.6]

(Hint: you'll need to use the .split() method.)

In [15]:
[float(x) for x in float_str.split(';')]
Out[15]:
[5.8,
 6.9,
 3.1,
 5.9,
 6.6,
 6.5,
 6.5,
 5.6,
 6.0,
 6.4,
 3.32,
 6.0,
 6.0,
 6.3,
 6.6,
 6.6]
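
The conversion happens in two steps: .split() produces a list of strings, and float() turns each string into a number. The sketch below separates those steps on a hypothetical comma-separated string named temps_str (not part of the assignment). It is written to run under both Python 2 and Python 3.

```python
# Hypothetical example string, not part of the assignment.
temps_str = "1.5,2,3.25"

pieces = temps_str.split(",")       # a list of strings
temps = [float(p) for p in pieces]  # a list of floats

print(pieces)  # ['1.5', '2', '3.25']
print(temps)   # [1.5, 2.0, 3.25]
```

Note that float("2") gives 2.0, which is why the 6 in float_str shows up as 6.0 in the output above.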

Problem set 5: LeBron James

Write a code snippet below that displays LeBron James' average number of assists per game in the 2013-2014 regular season. Feel free to cut and paste from the course notes as appropriate to get the boilerplate code for loading the CSV file.

In [19]:
import csv
import urllib

url = "https://gist.githubusercontent.com/aparrish/cb1672e98057ea2ab7a1/raw/13166792e0e8436221ef85d2a655f1965c400f75/lebron_james.csv"
stats = list(csv.reader(urllib.urlopen(url)))

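# Skip the header row (stats[0]); column 22 is taken to be the assists column.
ast = [float(rec[22]) for rec in stats[1:]]
# Average over all games.
sum(ast) / len(ast)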
Out[19]:
6.337662337662338
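
The same pattern (read the rows, skip the header, convert one column to float, average it) can be tried without a network connection. The sketch below is not the course's boilerplate: it feeds csv.reader a hypothetical list of lines with made-up column names and values, and it looks up the column position by name instead of hard-coding a number. It is written to run under both Python 2 and Python 3.

```python
import csv

# Hypothetical CSV lines standing in for a downloaded file; the data is made up.
lines = [
    "game,points",
    "1,10",
    "2,20",
    "3,15",
]

rows = list(csv.reader(lines))  # csv.reader accepts any iterable of lines
header = rows[0]                # the first row holds the column names
col = header.index("points")    # find the column position by name

points = [float(rec[col]) for rec in rows[1:]]  # skip the header row
average = sum(points) / len(points)

print(average)  # 15.0
```

Converting to float before dividing matters under Python 2, where dividing one integer by another discards the fractional part.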