These problem sets focus on MongoDB and Tornado.
For this first problem set, we're going to build on your solution to Problem Set #2 ("Of Widgets and Pandas") in last week's homework assignment. Specifically, we're going to be working with the widgets listed on this page. You'll be creating a MongoDB database that has a document for every listed widget. But first, in the cell below, connect to your local MongoDB instance and make a variable named collection that points to a collection called widgets in the lede_program database. I've written the appropriate import statements for you, and included a line at the end that prints the full name of the collection (i.e., its database name plus the name of the collection). The cell's output should be the string lede_program.widgets.
import pymongo
# your code here
# end your code
print collection.full_name
In the cell below, write a Python statement that will remove all documents from this collection. We want to start fresh! I've left in a line that prints out the number of records in the collection. This number should be 0.
# your code here
# end your code
print collection.count()
Great! Now, the tough part. In the cell below, duplicate the code in the second code cell from Problem Set #2 in Assignment #5. There should be one key difference in your code, however: instead of creating an empty list and adding each document to the list, you should insert each document into the widgets collection. After you've executed the code, evaluating the expression list(collection.find()) should yield something like this (your ObjectId values will be different):
[{u'_id': ObjectId('53b4c0c92735fe3ff2977816'),
u'partno': u'C1-9476',
u'price': 2.7,
u'quantity': 512,
u'widgetname': u'Skinner Widget'},
{u'_id': ObjectId('53b4c0c92735fe3ff2977817'),
u'partno': u'JDJ-32/V',
u'price': 9.36,
u'quantity': 967,
u'widgetname': u'Widget For Furtiveness'},
... some widgets omitted for brevity ...
{u'_id': ObjectId('53b4c0c92735fe3ff297781e'),
u'partno': u'5B-941/F',
u'price': 13.26,
u'quantity': 919,
u'widgetname': u'Widget For Cinema'}]
(Hint: Pay attention to types! Make sure that price is a floating-point number and quantity is an integer when inserting the document into MongoDB.) I've included some scaffolding for you, including the Beautiful Soup import statement and the code to fetch the contents of widgets.html into a variable called html_str. I've also included, at the very end, the expression to show all documents in the collection.
from bs4 import BeautifulSoup
import urllib
html_str = urllib.urlopen("http://static.decontextualize.com/widgets.html").read()
# your code here
# end your code
list(collection.find())
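On the type hint above: text pulled out of HTML arrives as strings, so numeric fields need an explicit conversion before a document is inserted. The following is a minimal sketch of that conversion step only, not the assignment solution. The helper name to_document, the sample values, and the leading dollar sign on the price are all assumptions made for illustration.

```python
def to_document(partno, name, price_text, quantity_text):
    """Build a widget-style dict with properly typed numeric fields."""
    return {
        "partno": partno,
        "widgetname": name,
        # Strip an assumed leading "$" before converting to a float.
        "price": float(price_text.lstrip("$")),
        # Quantities are whole numbers, so convert to int.
        "quantity": int(quantity_text),
    }

# Hypothetical raw strings, as they might come out of a table cell.
doc = to_document("XX-0000", "Example Widget", "$4.50", "120")
```

Inserting the raw strings would still succeed, but MongoDB compares values within their own type, so a numeric query such as a $gt filter would not match string-valued fields.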
Nice. Your work is how I like my burgers: well done. (It's also how I like my steak: rare, and of the highest quality.)
Finally, in the cell below, write an expression that checks to ensure that the number of documents in the collection is equal to the number of widgets in widgets.html. The cell should contain a single expression that evaluates to True.
This problem set focuses on exercising your ability to write expressions with PyMongo that filter, limit, and sort lists of documents in a MongoDB collection, using the .find(), .sort() and .limit() methods.
First problem. In the cell below, write a statement that performs a MongoDB query returning a list containing one document: the least expensive widget in the catalog. That is, your code, when run, should evaluate to this (keeping in mind that your ObjectId will be different):
[{u'_id': ObjectId('53b4c0c92735fe3ff297781b'),
u'partno': u'MZ-556/B',
u'price': 2.35,
u'quantity': 948,
u'widgetname': u'Yellow-Tipped Widget'}]
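As a mental model for what chaining .sort() and .limit() on a cursor produces, here is the same idea in plain Python on an in-memory list. This is not PyMongo code, and the documents and the pages field are hypothetical.

```python
books = [
    {"title": "A", "pages": 310},
    {"title": "B", "pages": 120},
    {"title": "C", "pages": 205},
]

# Sort ascending by one field, then keep only the first result.
shortest = sorted(books, key=lambda d: d["pages"])[:1]
```

With PyMongo the ordering and truncation happen on the server, and the resulting cursor is turned into a list with list().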
Now, in the cell below, write an expression that returns a list of widget documents where the quantity of available widgets is greater than 900. These documents should only have a subset of available fields, namely partno and quantity. Your code, when run, should evaluate to this (note that the _id field is left out; the order of documents in the list might also be different):
[{u'partno': u'JDJ-32/V', u'quantity': 967},
{u'partno': u'MZ-556/B', u'quantity': 948},
{u'partno': u'5B-941/F', u'quantity': 919}]
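The shape of that result (a filtered list whose documents carry only some of their fields) can be modeled in plain Python as follows. This only illustrates filtering plus projection on hypothetical data; the helper project is made up here and is not part of PyMongo.

```python
def project(doc, fields):
    """Return a copy of doc containing only the named fields."""
    return {key: doc[key] for key in fields if key in doc}

inventory = [
    {"_id": 1, "sku": "A-1", "stock": 40, "color": "red"},
    {"_id": 2, "sku": "B-2", "stock": 75, "color": "blue"},
    {"_id": 3, "sku": "C-3", "stock": 90, "color": "green"},
]

# Keep documents with stock above 50, then keep only two fields of each.
well_stocked = [
    project(d, ["sku", "stock"]) for d in inventory if d["stock"] > 50
]
```

In .find(), the second argument plays the role of project. The _id field comes back by default unless the projection explicitly sets it to 0.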
Cool. Finally, in the cell below, write an expression that returns a list of widget documents where the word "Widget" occurs at the end of the widgetname string. Use the $regex query selector. The documents in the list should include only the widgetname field, and should be sorted by the widgetname field. That is, your code, when run, should evaluate to this (again, with no _id field):
[{u'widgetname': u'Infinite Widget'},
{u'widgetname': u'Manicurist Widget'},
{u'widgetname': u'Self-Knowledge Widget'},
{u'widgetname': u'Skinner Widget'},
{u'widgetname': u'Unshakable Widget'},
{u'widgetname': u'Yellow-Tipped Widget'}]
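The $ anchor is what restricts a pattern to the end of a string. One way to get a feel for it is Python's own re module, whose anchors behave the same way for a simple case like this one. The names, the word Gadget, and the field name in the filter below are hypothetical.

```python
import re

names = ["Pocket Gadget", "Gadget Polish", "Tiny Gadget", "Gadgeteer"]

# "Gadget$" matches only when "Gadget" is the last thing in the string.
pattern = re.compile(r"Gadget$")
ending_in_gadget = sorted(n for n in names if pattern.search(n))

# The same pattern string placed in a MongoDB-style filter document
# (the field name "name" is hypothetical).
query = {"name": {"$regex": "Gadget$"}}
```

Testing a pattern locally like this before putting it in a query makes it easier to tell a regex mistake from a query mistake.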
In this problem set, you'll make a very simple web API with Tornado.
This problem set works a little bit differently from the others! You'll be pasting into the cell below a program that you've written elsewhere. (As discussed in class, it's difficult to run a Tornado application inside of IPython Notebook.)
Here's how your web API should work. A request to the resource /oz should return the following response (as a JSON string):
{"result": "Toto, I've a feeling we're not in Kansas anymore."}
If the parameters pet and place are included in the query string, then the string in the response should use the values of those parameters in place of Toto and Kansas, respectively. For example, the following request with curl, assuming your web service is running on localhost port 8000...
curl -s "http://localhost:8000/oz?pet=Fluffy&place=Brooklyn"
... should print the following response:
{"result": "Fluffy, I've a feeling we're not in Brooklyn anymore."}
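The behavior described above boils down to two steps: look up each query parameter with a fallback default, then serialize a dictionary as JSON. Here is a minimal sketch of those two steps in plain Python, outside of Tornado. The function name build_reply and the greeting it produces are hypothetical, chosen so that this is not the assignment's answer.

```python
import json

def build_reply(params):
    """Fill a message template from query parameters, using defaults."""
    # dict.get returns the second argument when the key is missing.
    name = params.get("name", "World")
    mood = params.get("mood", "fine")
    message = "Hello, {0}! I hope you are {1}.".format(name, mood)
    return json.dumps({"result": message})

default_reply = build_reply({})
custom_reply = build_reply({"name": "Ada"})
```

Inside a Tornado handler, self.get_argument(name, default) does the lookup-with-default step, and self.write() accepts a dict and sends it as JSON.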
I've included the basic framework for the application for you. The part you need to fill in is in the definition of the get() method.
import tornado.httpserver
import tornado.ioloop
import tornado.options
import tornado.web
from tornado.options import define, options
define("port", default=8000, help="run on the given port", type=int)
tornado.options.parse_command_line()
class OzHandler(tornado.web.RequestHandler):
    def get(self):
        # your code here!
        # end your code
application = tornado.web.Application(handlers=[(r"/oz", OzHandler)])
http_server = tornado.httpserver.HTTPServer(application)
http_server.listen(options.port)
tornado.ioloop.IOLoop.instance().start()
Great job! I enjoyed writing this homework assignment and I hope you enjoyed completing it.