Example using the data manager classes

This notebook shows how to use the data manager framework for simpler API usage and for caching capabilities.

Please note that to request a Bloomberg field using property access, the field name must be UPPER CASE (sid.PX_LAST, NOT sid.px_last).
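
To illustrate why case matters, here is a minimal sketch of how attribute-style field access can be dispatched in Python. `FieldAccessor` and its `fetch` callable are hypothetical stand-ins, not tia's implementation. The sketch only assumes that upper-case attribute names are treated as field requests and that anything else fails like a normal attribute lookup.

```python
class FieldAccessor:
    """Hypothetical accessor: upper-case attributes are treated as field names."""

    def __init__(self, sid, fetch):
        self.sid = sid
        self._fetch = fetch  # callable(sid, field) -> value; assumed for illustration

    def __getattr__(self, name):
        # __getattr__ is only called when normal attribute lookup fails
        if name.isupper():
            return self._fetch(self.sid, name)
        raise AttributeError(name)


# Stand-in data source instead of a live Bloomberg connection
prices = {("MSFT US EQUITY", "PX_LAST"): 47.59}
msft_demo = FieldAccessor("MSFT US EQUITY", lambda sid, fld: prices[(sid, fld)])

last = msft_demo.PX_LAST      # dispatched as a field request
try:
    msft_demo.px_last         # lower case: treated as an ordinary (missing) attribute
    lower_ok = True
except AttributeError:
    lower_ok = False
```

Routing only upper-case names to the data source keeps ordinary attributes and methods, which are lower case by Python convention, from being mistaken for field requests.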

In [1]:

import pandas as pd
import tia.bbg.datamgr as dm

Single Security Accessor

In [2]:

# create a DataManager for simpler api access
mgr = dm.BbgDataManager()
# retrieve a single security accessor from the manager
msft = mgr['MSFT US EQUITY']

In [3]:

# Can now access any Bloomberg field (as long as the name is upper case)
msft.PX_LAST, msft.PX_OPEN

Out[3]:

(47.590000000000003, 47.229999999999997)

In [4]:

# Access multiple fields at the same time
msft['PX_LAST', 'PX_OPEN']

Out[4]:

[47.59, 47.23]

In [5]:

# Or pass a list of fields
msft[['PX_LAST', 'PX_OPEN']]

Out[5]:

[47.59, 47.23]
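
The two indexing forms above give the same result because Python hands `obj['A', 'B']` to `__getitem__` as a tuple and `obj[['A', 'B']]` as a list. The sketch below shows one way an accessor could normalize both. `normalize_fields` and `Demo` are illustrative names, not part of tia.

```python
def normalize_fields(key):
    """Return a list of field names for a str, tuple, or list key."""
    if isinstance(key, str):
        return [key]
    return list(key)


class Demo:
    """Hypothetical container that just echoes the normalized key."""

    def __getitem__(self, key):
        return normalize_fields(key)


d = Demo()
from_tuple = d['PX_LAST', 'PX_OPEN']    # Python passes the tuple ('PX_LAST', 'PX_OPEN')
from_list = d[['PX_LAST', 'PX_OPEN']]   # Python passes the list ['PX_LAST', 'PX_OPEN']
single = d['PX_LAST']                   # a plain string becomes a one-element list
```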

In [6]:

# Have the manager default to returning a frame instead of values
mgr.sid_result_mode = 'frame'
msft.PX_LAST

Out[6]:

	PX_LAST
MSFT US EQUITY	47.59

In [7]:

# multiple fields returned as data frame
msft[['PX_LAST', 'PX_OPEN']]

Out[7]:

	PX_LAST	PX_OPEN
MSFT US EQUITY	47.585	47.23

In [8]:

# Retrieve historical data
msft.get_historical(['PX_OPEN', 'PX_HIGH', 'PX_LOW', 'PX_LAST'], '1/1/2014', '1/12/2014').head()

Out[8]:

	PX_OPEN	PX_HIGH	PX_LOW	PX_LAST
date
2014-01-02	37.350	37.40	37.10	37.16
2014-01-03	37.200	37.22	36.60	36.91
2014-01-06	36.850	36.89	36.11	36.13
2014-01-07	36.325	36.49	36.21	36.41
2014-01-08	36.000	36.14	35.58	35.76
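
The date strings passed to get_historical are ambiguous on their own: '1/12/2014' can be read month-first or day-first, and head() shows only the first five rows, so the output above does not settle which reading was used. The sketch below uses only the standard library to show both readings and an unambiguous ISO form. How the library itself parses the strings is not shown here and should be checked.

```python
from datetime import datetime

raw = "1/12/2014"

# The same string under the two common conventions
month_first = datetime.strptime(raw, "%m/%d/%Y").date()  # 12 January 2014
day_first = datetime.strptime(raw, "%d/%m/%Y").date()    # 1 December 2014

# ISO 8601 strings leave no room for interpretation
iso_start = datetime(2014, 1, 1).date().isoformat()
```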

Multi-security accessor

In [9]:

sids = mgr['MSFT US EQUITY', 'IBM US EQUITY', 'CSCO US EQUITY']
sids.PX_LAST

Out[9]:

	PX_LAST
CSCO US EQUITY	28.89
IBM US EQUITY	170.97
MSFT US EQUITY	47.58

In [10]:

sids.get_historical('PX_LAST', '1/1/2014', '11/12/2014').head()

Out[10]:

	IBM US EQUITY	CSCO US EQUITY	MSFT US EQUITY
date
2014-01-02	185.53	22.000	37.16
2014-01-03	186.64	21.980	36.91
2014-01-06	186.00	22.010	36.13
2014-01-07	189.71	22.310	36.41
2014-01-08	187.97	22.293	35.76

In [11]:

sids.get_historical(['PX_OPEN', 'PX_LAST'], '1/1/2014', '11/12/2014').head()

Out[11]:

	IBM US EQUITY		CSCO US EQUITY		MSFT US EQUITY
	PX_OPEN	PX_LAST	PX_OPEN	PX_LAST	PX_OPEN	PX_LAST
date
2014-01-02	187.21	185.53	22.17	22.000	37.350	37.16
2014-01-03	185.83	186.64	22.09	21.980	37.200	36.91
2014-01-06	187.15	186.00	21.96	22.010	36.850	36.13
2014-01-07	186.39	189.71	22.26	22.310	36.325	36.41
2014-01-08	189.33	187.97	22.29	22.293	36.000	35.76

Caching

In [12]:

#
# ability to cache requests in memory or in h5 file
#
ms = dm.MemoryStorage()
# pd.datetime was removed in newer pandas releases; pd.Timestamp.now() is the equivalent call
cmgr = dm.CachedDataManager(mgr, ms, pd.Timestamp.now())

In [13]:

cmsft = cmgr['MSFT US EQUITY']
cmsft.PX_LAST

Out[13]:

	PX_LAST
MSFT US EQUITY	47.585

In [14]:

%timeit msft.PX_LAST

1 loops, best of 3: 277 ms per loop

In [15]:

%timeit cmsft.PX_LAST

1000 loops, best of 3: 1.66 ms per loop
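
The gap between the two timings is what you would expect from any memoizing wrapper: the first request pays for the round trip and later ones are served from local storage. Here is a minimal sketch of that idea, assuming a hypothetical `CachingFetcher` class and a `slow_fetch` stand-in for the network call. It is not how CachedDataManager is implemented.

```python
import time


class CachingFetcher:
    """Hypothetical wrapper that memoizes results of a slow fetch callable."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._store = {}   # plays the role of an in-memory storage
        self.misses = 0    # number of times the slow path was taken

    def get(self, sid, field):
        key = (sid, field)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._fetch(sid, field)
        return self._store[key]


def slow_fetch(sid, field):
    time.sleep(0.05)  # stand-in for a network round trip
    return 47.585


cache = CachingFetcher(slow_fetch)
first = cache.get("MSFT US EQUITY", "PX_LAST")   # slow path, calls slow_fetch
second = cache.get("MSFT US EQUITY", "PX_LAST")  # served from the dict
```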

In [16]:

csids = cmgr['MSFT US EQUITY', 'IBM US EQUITY']
sids = mgr['MSFT US EQUITY', 'IBM US EQUITY']

In [17]:

%timeit sids.get_historical('PX_LAST', start='1/3/2000', end='1/3/2014').head()

1 loops, best of 3: 987 ms per loop

In [18]:

%timeit csids.get_historical('PX_LAST', start='1/3/2000', end='1/3/2014').head()

The slowest run took 371.09 times longer than the fastest. This could mean that an intermediate result is being cached 
1 loops, best of 3: 5.17 ms per loop

C:\Anaconda\lib\site-packages\pandas\core\index.py:1196: FutureWarning: using '-' to provide set differences with Indexes is deprecated, use .difference()
  "use .difference()",FutureWarning)

In [19]:

#
# HDF storage
# - note the NaturalNameWarning that the HDF API emits after executing. I decided to leave
#   the blanks in the security names instead of replacing them
#

import tempfile
fh, fp = tempfile.mkstemp()

h5storage = dm.HDFStorage(fp)  # Can set compression level for smaller files
# pd.datetime was removed in newer pandas releases; pd.Timestamp.now() is the equivalent call
h5mgr = dm.CachedDataManager(mgr, h5storage, pd.Timestamp.now())
h5msft = h5mgr['MSFT US EQUITY']
h5msft.PX_LAST

C:\Anaconda\lib\site-packages\tables\path.py:100: NaturalNameWarning: object name is not a valid Python identifier: 'MSFT US EQUITY'; it does not match the pattern ``^[a-zA-Z_][a-zA-Z0-9_]*$``; you will not be able to use natural naming to access this object; using ``getattr()`` will still work, though
  NaturalNameWarning)

Out[19]:

	PX_LAST
MSFT US EQUITY	47.59

In [20]:

# Notice no warning as it is taken from cache
h5msft.PX_LAST

Out[20]:

	PX_LAST
MSFT US EQUITY	47.59
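
The NaturalNameWarning is raised because the security name is used as a key that does not match the identifier pattern quoted in the warning text. The sketch below checks names against that pattern and shows what replacing the blanks could look like. `to_natural_name` is a hypothetical helper. The notebook deliberately keeps the blanks, so this is only an illustration of the alternative.

```python
import re

# Pattern quoted in the NaturalNameWarning text
NATURAL_NAME = re.compile(r"^[a-zA-Z_][a-zA-Z0-9_]*$")


def is_natural_name(name):
    """True if name could be used with natural (attribute-style) naming."""
    return NATURAL_NAME.match(name) is not None


def to_natural_name(name):
    """Hypothetical sanitizer: replace every non-identifier character with '_'."""
    cleaned = re.sub(r"[^a-zA-Z0-9_]", "_", name)
    # identifiers cannot start with a digit
    return "_" + cleaned if cleaned[:1].isdigit() else cleaned


sid = "MSFT US EQUITY"
valid_before = is_natural_name(sid)   # blanks break the pattern
safe = to_natural_name(sid)
valid_after = is_natural_name(safe)
```

Keeping the original names means the stored keys match the Bloomberg identifiers exactly, at the cost of the warning on first write.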

In [21]:

h5msft.get_historical('PX_LAST', start='1/2/2000', end='1/2/2014').head()

Out[21]:

	PX_LAST
date
2000-01-03	58.2813
2000-01-04	56.3125
2000-01-05	56.9063
2000-01-06	55.0000
2000-01-07	55.7188

In [22]:

%timeit h5msft.get_historical('PX_LAST', start='1/2/2000', end='1/2/2014')

100 loops, best of 3: 6.18 ms per loop

In [23]:

# Notice that only IBM triggers the warning: MSFT is already cached, so only IBM data is retrieved
h5sids = h5mgr['MSFT US EQUITY', 'IBM US EQUITY']
h5sids.get_historical('PX_LAST', start='1/3/2000', end='1/2/2014').tail()

C:\Anaconda\lib\site-packages\tables\path.py:100: NaturalNameWarning: object name is not a valid Python identifier: 'IBM US EQUITY'; it does not match the pattern ``^[a-zA-Z_][a-zA-Z0-9_]*$``; you will not be able to use natural naming to access this object; using ``getattr()`` will still work, though
  NaturalNameWarning)

Out[23]:

	IBM US EQUITY	MSFT US EQUITY
date
2013-12-26	185.35	37.44
2013-12-27	185.08	37.29
2013-12-30	186.41	37.29
2013-12-31	187.57	37.41
2014-01-02	185.53	37.16

In [24]:

# Not perfect, as it retrieves each security from the cache separately and then concatenates,
# BUT still better than a round trip to Bloomberg, and consistency comes for free
%timeit h5sids.get_historical('PX_LAST', start='1/3/2000', end='1/2/2014')

100 loops, best of 3: 14.8 ms per loop
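
The behaviour described above (fetch only the securities that are not cached yet, then assemble per-security results) can be sketched in a few lines. `get_many`, `fake_fetch`, and the dict-based store are illustrative assumptions, and the numbers returned by `fake_fetch` are placeholders rather than market data.

```python
def get_many(sids, cache, fetch_missing):
    """Return {sid: data} for sids, fetching only those absent from cache.

    cache is a dict; fetch_missing is a callable taking a list of sids and
    returning {sid: data}. Both are assumptions made for this sketch.
    """
    missing = [s for s in sids if s not in cache]
    if missing:
        cache.update(fetch_missing(missing))
    # assemble per-security results in the requested order
    return {s: cache[s] for s in sids}


requested = []


def fake_fetch(sids):
    requested.append(list(sids))          # record what went "over the wire"
    return {s: [1.0, 2.0] for s in sids}  # placeholder series


store = {"MSFT US EQUITY": [37.41, 37.16]}  # MSFT is already cached
result = get_many(["MSFT US EQUITY", "IBM US EQUITY"], store, fake_fetch)
```

After the call, `requested` holds a single request containing only IBM, which mirrors the single IBM warning in the HDF example.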