Original data from https://s3.amazonaws.com/zanran_storage/www.etftopics.com/ContentPages/13968652.xls , modified and available locally as ETFs.csv .
This notebook only works in Quantopian Research, but one only needs to modify the loading of the file to work with it locally.
import numpy as np
import pandas as pd
import matplotlib.pyplot as pyplot
import seaborn as sns
ETFs = local_csv('ETFs.csv', skipinitialspace=True,
na_values=['n/a','NaN'], thousands=',')
ETFs = ETFs.set_index('Symbol')
ETFs[:2]
Name | Index | Description | Category | Provider | Expense_Ratio | Dividend | Div_Yield | Market_Cap | Shares | ... | Beta | Asset | Size | Style | Region | Sector | Leveraged | Inverse | Top_Holdings | Chart | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Symbol | |||||||||||||||||||||
AAXJ | iShares MSCI All Country Asia ex Japan Index Fund | MSCI All Country Asia ex Japan Index | The index measures the performance of 11 devel... | Asia Pacific Equities | iShares | 0.7400% | $0.36 | 1.8400% | $1,687.89 | 31 | ... | 1 | Equity | NaN | NaN | Pacific ex-Japan | NaN | NaN | NaN | NaN | NaN |
ACWI | MSCI All Country World Index Fund | MSCI All Country World Index | The Index is designed to measure the performan... | Global Equities | iShares | 0.3500% | $0.36 | 1.4100% | $1,176.64 | 30 | ... | 1 | Equity | Large-Cap | Blend | Global/Multi-Region | NaN | NaN | NaN | NaN | NaN |
2 rows × 21 columns
del ETFs['Top_Holdings']
del ETFs['Chart']
ETFs.columns
Index([u'Name', u'Index', u'Description', u'Category', u'Provider', u'Expense_Ratio', u'Dividend', u'Div_Yield', u'Market_Cap', u'Shares', u'Avg_Vol', u'Beta', u'Asset', u'Size', u'Style', u'Region', u'Sector', u'Leveraged', u'Inverse'], dtype='object')
def p2f(x):
if type(x) is str:
return float(x.strip('%'))/100
else:
return x / 100
def d2f(x):
if type(x) is str:
return float(x.strip('$').replace(',',''))
else:
return x
def s2b(x):
if type(x) is str:
if x == 'Yes':
return True
else:
return False
else:
return False
# Apply conversions
ETFs.Expense_Ratio = ETFs.Expense_Ratio.apply(p2f)
ETFs.Dividend = ETFs.Dividend.apply(d2f)
ETFs.Div_Yield = ETFs.Div_Yield.apply(p2f)
ETFs.Market_Cap = ETFs.Market_Cap.apply(d2f)
ETFs.Inverse = ETFs.Inverse.apply(s2b)
ETFs[:3]
Name | Index | Description | Category | Provider | Expense_Ratio | Dividend | Div_Yield | Market_Cap | Shares | Avg_Vol | Beta | Asset | Size | Style | Region | Sector | Leveraged | Inverse | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Symbol | |||||||||||||||||||
AAXJ | iShares MSCI All Country Asia ex Japan Index Fund | MSCI All Country Asia ex Japan Index | The index measures the performance of 11 devel... | Asia Pacific Equities | iShares | 0.0074 | 0.36 | 0.0184 | 1687.89 | 31 | 527890 | 1 | Equity | NaN | NaN | Pacific ex-Japan | NaN | NaN | False |
ACWI | MSCI All Country World Index Fund | MSCI All Country World Index | The Index is designed to measure the performan... | Global Equities | iShares | 0.0035 | 0.36 | 0.0141 | 1176.64 | 30 | 379781 | 1 | Equity | Large-Cap | Blend | Global/Multi-Region | NaN | NaN | False |
ACWX | iShares MSCI ACWI ex US Index Fund | MSCI All Country World ex-U.S. Index | The index measures the performance of global e... | Global Equities | iShares | 0.0035 | 0.52 | 0.0147 | 654.07 | 17 | 161924 | 1 | Equity | Large-Cap | Blend | Ex-U.S. | NaN | NaN | False |
len(ETFs.index)
1020
ETFs.Category.value_counts()
Leveraged Equities 120 Emerging Markets Equities 46 Energy Equities 40 Large Cap Value Equities 36 Technology Equities 32 Large Cap Blend Equities 30 Currency 29 Financials Equities 28 Europe Equities 25 Foreign Large Cap Equities 25 Health & Biotech Equities 22 National Munis 21 Commodities 19 All Cap Equities 19 Commodity Producers Equities 18 Precious Metals 18 Government Bonds 18 Leveraged Commodities 18 Large Cap Growth Equities 18 Asia Pacific Equities 17 China Equities 16 Quantitative Methodology Indexes 16 Global Equities 16 Oil & Gas 15 Inverse Equities 15 Small Cap Blend Equities 15 Consumer Discretionary Equities 14 Miscellaneous Sector Equities 14 Utilities Equities 14 Real Estate 12 ... Consumer Staples Equities 11 Corporate Bonds 11 Latin America Equities 10 Materials 10 Foreign Small & Mid Cap Equities 10 Diversified Portfolio 10 Small Cap Value Equities 10 Global Real Estate 9 Mid Cap Value Equities 9 Metals 9 Communications Equities 8 Inverse Commodities 8 Mid Cap Growth Equities 8 Japan Equities 8 Leveraged Real Estate 7 Small Cap Growth Equities 7 Money Market 7 Leveraged Currency 6 Inflation-Protected Bonds 6 Hedge Fund 6 Long-Short 6 Preferred Stock/Convertible Bonds 5 International Government Bonds 4 Transportation Equities 4 Mortgage Backed Securities 3 High Yield Bonds 3 California Munis 3 New York Munis 3 Volatility 2 Emerging Markets Bonds 2 dtype: int64
ETFs.Provider.value_counts()
iShares 201 Invesco PowerShares 146 ProShares 111 State Street 89 Wisdom Tree 52 Vanguard 46 First Trust 44 Rydex 40 Barclays iPath 38 Claymore 36 Direxion 34 Van Eck 29 Merrill Lynch HOLDRs 17 ELEMENTS 16 UBS 13 PIMCO 10 Global X 10 IndexIQ 9 Charles Schwab 8 Grail Advisors 7 US Commodity Funds 7 EG Shares 6 XShares 6 RevenueShares 6 GlobalShares 5 FaithShares 5 ETF Securities 4 MacroMarkets 4 Jefferies 4 Credit Suisse 2 JP Morgan 2 Javelin Funds 2 Geary Advisors 2 ALPS 2 Goldman Sachs 2 GreenHaven 1 Fidelity 1 AdvisorShares 1 Sprott Asset Management 1 U.S. One 1 dtype: int64
ETFs.Expense_Ratio.plot(kind='hist')
<matplotlib.axes._subplots.AxesSubplot at 0x7f85b3e2c3d0>
ETFs.Dividend.plot(kind='hist', bins=20)
<matplotlib.axes._subplots.AxesSubplot at 0x7f85b3bd19d0>
ETFs.Div_Yield.plot(kind='hist', bins=20)
<matplotlib.axes._subplots.AxesSubplot at 0x7f85b3adf390>
ETFs.Market_Cap.plot(kind='hist',bins=40)
<matplotlib.axes._subplots.AxesSubplot at 0x7f85b3998510>
ETFs.Asset.value_counts()
Equity 725 Bond 106 Commodity 100 Currency 35 Real Estate 28 Multi-Asset 26 dtype: int64
ETFs.Size.value_counts()
Large-Cap 205 Small-Cap 79 Mid-Cap 53 Multi-Cap 24 dtype: int64
ETFs.Style.value_counts()
Blend 177 Value 85 Growth 59 dtype: int64
ETFs.Beta.plot(kind='hist',bins=8)
<matplotlib.axes._subplots.AxesSubplot at 0x7f85b38f3690>
ETFs.Region.value_counts()
U.S. 460 Global/Multi-Region 107 Emerging Markets 28 China 20 Europe 15 Japan 10 Developed Markets 9 Ex-U.S. 9 EAFE 8 Asia 7 India 6 Pacific ex-Japan 6 BRIC 5 Brazil 5 Latin America 5 Canada 3 Euro-Pacific 3 Middle East 3 Mexico 3 South Korea 2 Africa 2 Taiwan 2 Australia 2 Russia 2 Indonesia 2 Middle East/Africa 2 Israel 1 Colombia 1 Austria 1 France 1 Turkey 1 Ireland 1 Italy 1 Switzerland 1 Malaysia 1 United Kingdom 1 Germany 1 Pacific 1 Singapore 1 Chindia 1 Nordic 1 Poland 1 Chile 1 Belgium 1 Thailand 1 Hong Kong 1 Netherlands 1 South Africa 1 Peru 1 North America 1 Sweden 1 Vietnam 1 Spain 1 dtype: int64
ETFs.Sector.value_counts()
Energy 45 Technology 41 Financial 41 Industrial 32 Health & Biotech 25 Utilities 17 Retail & Consumer 15 Materials 12 Communications 11 Consumer Staples 6 Consumer Discretionary 6 Transportation 4 Building & Construction 3 Agriculture 3 Aerospace & Defense 2 Environmental 1 dtype: int64
ETFs.Leveraged.value_counts()
2x 116 3x 44 dtype: int64
ETFs.Inverse.value_counts()
False 914 True 106 dtype: int64
# http://www.etfreplay.com/
# http://etfdb.com/etf/###
# http://finance.yahoo.com/q/hl?s=### ETF holdings
# https://accounts.profunds.com/etfdata/ByFund/EFAD-psdlyhld.csv
# http://www.ishares.com/us/products/239709/ishares-russell-2000-growth-etf