To facilitate model calibration, we propose a standard data set that contains all the necessary information. The data is held in a pandas table of type option_quotes, with one row per quote and 8 columns. This table is defined in quantlib/reference/data_structures.py. The column names are defined in quantlib/reference/names.py, and are as follows:
We include neither the dividend yield nor the risk-free rate in the data set: the implied forward price and risk-free rate are estimated from put-call parity.
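Put-call parity for European options at a single expiry reads C(K) - P(K) = D (F - K), where D = exp(-r T) is the discount factor and F the forward. The relation is linear in the strike K, so a straight-line fit of call-minus-put prices against strike yields D from the slope and F from the intercept. The sketch below illustrates the idea on synthetic prices. The function name `implied_forward_and_rate` and its arguments are hypothetical, and this is not necessarily the estimator used in the library.

```python
import numpy as np

def implied_forward_and_rate(strikes, call_mid, put_mid, T):
    """Estimate (forward, rate) for one expiry from put-call parity.

    Parity: C(K) - P(K) = D * (F - K), with D = exp(-r * T).
    A line fitted to C - P against K has slope -D and intercept D * F.
    """
    strikes = np.asarray(strikes, dtype=float)
    y = np.asarray(call_mid, dtype=float) - np.asarray(put_mid, dtype=float)
    slope, intercept = np.polyfit(strikes, y, 1)
    discount = -slope
    forward = intercept / discount
    rate = -np.log(discount) / T
    return forward, rate

# Synthetic check: build calls from known F and r, then recover both.
T, r_true, F_true = 0.5, 0.02, 1295.0
K = np.array([1200.0, 1250.0, 1300.0, 1350.0])
D = np.exp(-r_true * T)
put = np.array([10.0, 20.0, 40.0, 70.0])   # arbitrary put prices
call = put + D * (F_true - K)              # calls implied by parity
F_est, r_est = implied_forward_and_rate(K, call, put, T)
```

With real quotes the fit would use mid prices of near-the-money strikes, where bid-ask spreads are tightest; that choice is an assumption of this sketch, not something stated in the text.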
This notebook demonstrates the creation of such a data file by processing the quotes on S&P 500 index options (SPX) provided by the Chicago Board Options Exchange (CBOE).
Delayed SPX option quotes are published by the CBOE in comma-separated format. The file provides:
We provide below the procedure for converting the raw SPX option data file into the standardized option quotes data format.
from __future__ import print_function
import pandas
import datetime
import dateutil.parser
import re
import os
import quantlib.reference.names as nm
import quantlib.reference.data_structures as ds
def ExpiryMonth(s):
    """
    Convert an SPX contract month code into a month number (1-12)
    """
    call_months = "ABCDEFGHIJKL"
    put_months = "MNOPQRSTUVWX"
    try:
        m = call_months.index(s)
    except ValueError:
        m = put_months.index(s)
    # str.index is 0-based, while datetime.date expects months 1..12
    return m + 1
spx_symbol = re.compile("\\(SPX(1[0-9])([0-9]{2})([A-Z])([0-9]{3,4})-E\\)")
def parseSPX(s):
    """
    Parse an SPX quote string, return expiry date and strike
    """
    # split() returns the text around the match plus the four captured groups
    tokens = spx_symbol.split(s)
    if len(tokens) == 1:
        # no SPX symbol found: flag the record with a negative strike
        return {'dtExpiry': None, 'strike': -1}
    year = 2000 + int(tokens[1])
    day = int(tokens[2])
    month = ExpiryMonth(tokens[3])
    strike = float(tokens[4])
    dtExpiry = datetime.date(year, month, day)
    return {'dtExpiry': dtExpiry, 'strike': strike}
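As a self-contained illustration of the symbol layout, the sketch below decodes a quote description with `search` instead of `split` and also reports whether the month code denotes a call or a put. The helper name `decode_symbol` and the sample strings are hypothetical, made up to match the pattern; they are not taken from the CBOE file.

```python
import datetime
import re

# Same pattern as spx_symbol, written as a raw string
symbol_re = re.compile(r"\(SPX(1[0-9])([0-9]{2})([A-Z])([0-9]{3,4})-E\)")

def decode_symbol(s):
    """Return (expiry date, strike, is_call), or None if s holds no SPX symbol."""
    match = symbol_re.search(s)
    if match is None:
        return None
    yy, dd, code, strike = match.groups()
    offset = ord(code) - ord('A')      # 'A' -> 0, ..., 'X' -> 23
    if offset > 23:                    # 'Y' and 'Z' are not month codes
        return None
    month = offset % 12 + 1            # 'A'..'L' and 'M'..'X' both map to 1..12
    is_call = offset < 12              # 'A'..'L' are the call month codes
    expiry = datetime.date(2000 + int(yy), month, int(dd))
    return expiry, float(strike), is_call

# Hypothetical quote descriptions, for illustration only
call_info = decode_symbol("11 Feb 1300.00 (SPX1119B1300-E)")
put_info = decode_symbol("11 Feb 1300.00 (SPX1119N1300-E)")
no_match = decode_symbol("not an option symbol")
```

Returning `None` for unmatched strings is a design choice of this sketch; `parseSPX` instead returns a strike of -1, which the later filtering step relies on.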
The CSV file downloaded from the CBOE site can be converted into a standard option_quotes pandas data frame by the following function.
def read_SPX_file(option_data_file):
    """
    Read SPX csv file,
    return a data frame of type option_quotes
    (the spot price is stored in the nm.SPOT column)
    """
    # read two lines for spot price and trade date
    with open(option_data_file) as fid:
        lineOne = fid.readline()
        spot = float(lineOne.split(',')[1])
        lineTwo = fid.readline()
        dt = lineTwo.split('@')[0]
        dtTrade = dateutil.parser.parse(dt).date()
    print('Dt Calc: %s Spot: %f' % (dtTrade, spot))
    # read all option price records as a data frame
    df = pandas.read_csv(option_data_file, header=0, sep=',', skiprows=[0, 1])
    # split and stack calls and puts
    call_df = df[['Calls', 'Bid', 'Ask']]
    call_df = call_df.rename(columns={'Calls': 'Spec', 'Bid': 'PBid', 'Ask': 'PAsk'})
    call_df['Type'] = nm.CALL_OPTION
    put_df = df[['Puts', 'Bid.1', 'Ask.1']]
    put_df = put_df.rename(columns={'Puts': 'Spec', 'Bid.1': 'PBid',
                                    'Ask.1': 'PAsk'})
    put_df['Type'] = nm.PUT_OPTION
    # DataFrame.append was removed in pandas 2.0; concat does the same stacking
    df_all = pandas.concat([call_df, put_df], ignore_index=True)
    # parse Calls and Puts columns for strike and contract month
    # insert into data frame
    cp = [parseSPX(s) for s in df_all['Spec']]
    option_quotes = ds.option_quotes_template()
    option_quotes = option_quotes.reindex(index=range(len(cp)))
    # Fill the option_quotes data frame
    option_quotes[nm.STRIKE] = [x['strike'] for x in cp]
    option_quotes[nm.EXPIRY_DATE] = [x['dtExpiry'] for x in cp]
    option_quotes[nm.OPTION_TYPE] = df_all['Type']
    option_quotes[nm.EXERCISE_STYLE] = nm.EURO_EXERCISE
    option_quotes[nm.PRICE_BID] = df_all['PBid']
    option_quotes[nm.PRICE_ASK] = df_all['PAsk']
    option_quotes[nm.TRADE_DATE] = dtTrade
    # keep only records with a parsed strike and positive bid and ask;
    # copy so that the assignment below acts on an independent frame
    option_quotes = option_quotes[(option_quotes[nm.STRIKE] > 0) &
                                  (option_quotes[nm.PRICE_BID] > 0) &
                                  (option_quotes[nm.PRICE_ASK] > 0)].copy()
    option_quotes[nm.SPOT] = spot
    return option_quotes
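The column names 'Bid.1' and 'Ask.1' used in read_SPX_file come from pandas itself: when a header row repeats a name, read_csv appends a numeric suffix to the later occurrences. The sketch below shows this renaming and the split-and-stack step on a made-up two-row extract. The row values and the 'C'/'P' labels are placeholders, the latter standing in for nm.CALL_OPTION and nm.PUT_OPTION, whose actual values are not shown here.

```python
import io
import pandas

# Made-up extract in a side-by-side layout: calls on the left, puts on the right
raw = io.StringIO(
    "Calls,Bid,Ask,Puts,Bid,Ask\n"
    "c1,10.0,10.5,p1,1.0,1.2\n"
    "c2,5.0,5.4,p2,2.0,2.3\n"
)
df = pandas.read_csv(raw)
# The repeated 'Bid' and 'Ask' headers are renamed 'Bid.1' and 'Ask.1'
columns = list(df.columns)

new_names = {'Calls': 'Spec', 'Puts': 'Spec',
             'Bid': 'PBid', 'Bid.1': 'PBid',
             'Ask': 'PAsk', 'Ask.1': 'PAsk'}
calls = df[['Calls', 'Bid', 'Ask']].rename(columns=new_names).assign(Type='C')
puts = df[['Puts', 'Bid.1', 'Ask.1']].rename(columns=new_names).assign(Type='P')

# Stack puts under calls, renumbering the rows from 0
stacked = pandas.concat([calls, puts], ignore_index=True)
```

The result has one row per quote, with the option type carried in its own column.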
In the example below, the file 'SPX-Options-24jan2011.csv' was downloaded from the CBOE web site. The standardized option quotes data file is saved as a CSV file and as a pickled pandas data frame.
File paths are relative to the notebooks folder, so it's important that the notebook browser be started with the command:
ipython notebook --pylab inline path-to-the-notebooks-folder
option_data_file = os.path.join('..', 'data', 'SPX-Options-24jan2011.csv')
df_SPX = read_SPX_file(option_data_file)
print('%d records processed' % len(df_SPX))
# save a csv file and pickled data frame
df_SPX.to_csv(os.path.join('..', 'data', 'df_SPX_24jan2011.csv'), index=False)
df_SPX.to_pickle(os.path.join('..', 'data', 'df_SPX_24jan2011.pkl'))
Dt Calc: 2011-01-24 Spot: 1290.590000
1472 records processed
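The two saved formats behave differently when read back: the pickle restores the Python objects as stored, whereas the CSV returns dates as text unless they are parsed explicitly. The sketch below shows the difference on a toy frame written to a temporary directory. The column names 'strike' and 'expiry' are hypothetical stand-ins, not the names defined in the library.

```python
import datetime
import os
import tempfile

import pandas

# Toy stand-in for df_SPX, with hypothetical column names
toy = pandas.DataFrame({
    'strike': [1250.0, 1300.0],
    'expiry': [datetime.date(2011, 2, 19), datetime.date(2011, 3, 19)],
})

with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, 'toy.csv')
    pkl_path = os.path.join(tmp, 'toy.pkl')
    toy.to_csv(csv_path, index=False)
    toy.to_pickle(pkl_path)

    from_pkl = pandas.read_pickle(pkl_path)      # dates come back as date objects
    from_csv = pandas.read_csv(csv_path)         # dates come back as text
    from_csv_dates = pandas.read_csv(csv_path, parse_dates=['expiry'])

first_pkl = from_pkl['expiry'][0]
first_csv = from_csv['expiry'][0]
first_parsed = from_csv_dates['expiry'][0]       # a pandas Timestamp
```

The pickle is the more convenient input for later Python processing, while the CSV remains readable by other tools.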