There are a bunch of ways to get at spreadsheet data from Python, but the simplest way is usually to export a version of the spreadsheet in csv (comma-separated values) format. CSV is really easy to work with from Python!
with open('sales.csv') as infile:
content = infile.read()
content
print(content)
Python's Standard Library has a special csv module that makes it easy to change that data into Python lists.
import csv
with open('sales.csv') as infile:
data = []
reader = csv.reader(infile)
for row in reader:
data.append(row)
print(data)
So far so good, but those quote marks around the numbers mean that right now, Python thinks they're strings, not numbers. We want to convert them, but carefully; if we try to convert a word like 'Michigan' to a number, we'll get an error:
float('Michigan')
We'll use a brand-new concept called exception handling. We try to change each piece of data into a number, except if that causes an error, we'll tell Python to leave it alone (instead of stopping the program with an error message, the way it normally would).
with open('sales.csv') as infile:
data = []
reader = csv.reader(infile)
for row in reader:
newrow = []
for item in row:
try:
newrow.append(float(item))
except ValueError:
newrow.append(item)
data.append(newrow)
print(data)
OK, now some new instructions arrive from the head office:
We need to make our sales numbers look better. Quote them in Canadian dollars.
This is a workshop on programming, not ethics, so we'll comply. We'll check Bank of Canada and use a number of $0.98 per Canadian dollar.
usd_to_cad = 0.98
fudgey_data = []
for row in data:
canadian_row = []
for val in row:
try:
canadian_val = round(val / usd_to_cad)
canadian_row.append(canadian_val)
except TypeError:
canadian_row.append(val)
fudgey_data.append(canadian_row)
print(data)
print(fudgey_data)
Now we have fudgey_data
, but we want to get it back out into a spreadsheet (well, into a CSV, and we'll let the spreadsheet program handle the CSV-to-spreadsheet conversion).
with open('fudgey.csv', 'w') as outfile: # the 'w' stands for 'write'
writer = csv.writer(outfile)
for canadian_row in fudgey_data:
writer.writerow(canadian_row)
Have a look at fudgey.csv! You can open fudgey.csv from a spreadsheet program, and it will know how to convert it into spreadsheet format.
Incidentally, if you can get permission to work in a different spreadsheet program altogether (but one that's compatible with MS Excel), check out Resolver One - it's a spreadsheet that lets you use Python directly in its cells! Really interesting project.
The main office writes again:
Our currency conversions need to be up-to-date. Redo the numbers daily with the current exchange rate.
There are a bunch of ways in Python to get data off a webpage. Probably the easiest way requires two extra packages: requests and BeautifulSoup. Put those packages on your computer:
pip install requests
pip install beautifulsoup4
import requests
page = requests.get('http://www.bankofcanada.ca/rates/exchange/')
print(page.text[:2000])
Ewww! Good thing we only printed the first 2000 characters! But when we eyeball the actual page in our browser, we know that we're looking for "1 CAD noon", so we use View Source in our browser and then search that. We find it in
<tr><th>1 CAD noon</th><td>0.9846 USD</td><td>0.9836 USD</td></tr>
Now we know that our program just needs to find the HTML element whose text is "1 CAD noon", then take the text of the next HTML element after that. The BeautifulSoup package makes that easy.
import bs4
soup = bs4.BeautifulSoup(page.text)
rate_element = soup.find('th', text="1 CAD noon").next.text
rate_element
Now we just need to extract the number from that string... .split()
will chop it into a list of separate words; we want the first word.
rate = rate_element.split()[0]
rate
usd_to_cad = float(rate)
usd_to_cad
Let's take the code we wrote up above and make a function of it, so it's easy to call.
def convert_currency(data, usd_to_foreign):
foreign_data = []
for row in data:
foreign_row = []
for val in row:
try:
foreign_val = round(val / usd_to_foreign)
foreign_row.append(foreign_val)
except TypeError:
foreign_row.append(val)
foreign_data.append(foreign_row)
return foreign_data
print(data)
print(convert_currency(data, usd_to_cad))