Pair trading is a strategy for trading a pair of financial instruments that has historically had a 'mean-reverting' relationship. This means that, historically, the spread between the instruments' prices has traded within a tight region. If we detect that the current spread is outside its historical norm, we can take advantage of this by trading the instruments on the idea that the prices will return to their historical relationship. Essentially, we are exploiting slight deviations from a historical norm.
A basic way of approaching this goes as follows:
Most pair trading algorithms take a pair of positions: a long and a short. In this way, the trader isn't betting on direction, but simply betting that the two assets will eventually gravitate back to their usual relationship. Unfortunately, markets are not always predictable. Assets that were correlated over the last several years may not be correlated tomorrow. Correlations are tendencies for assets to move together, but at any given time they can diverge. This is called drift and is discussed later.
For this project, here is the list of stocks I used, which can be found at the beginning of all my proposed algorithms:
context.stocks = [
    (sid(4283), sid(5885)),    # Coke and Pepsi
    (sid(8229), sid(21090)),   # Walmart and Target
    (sid(8347), sid(23112)),   # Exxon Mobil and Chevron
    (sid(3496), sid(4521)),    # Home Depot and Lowe's
    (sid(20088), sid(25006)),  # Goldman Sachs and JP Morgan
    (sid(863), sid(25165)),    # BHP Billiton Limited (BHP) and BHP Billiton plc (BBL)
    (sid(1638), sid(1637)),    # Comcast K and Comcast A
    (sid(7784), sid(7767)),    # Unilever
    (sid(8554), sid(2174)),    # SPY and DIA
]
References for these pairs:
We use cointegration to determine if the spread time series shows a mean-reverting relationship. In this case, the following function in all algorithms does the test:
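import statsmodels.tsa.stattools as ts

def test_coint(pair):
    # Alternative: Augmented Dickey-Fuller unit root test on OLS residuals (from Brett's code).
    # Needs `import statsmodels.api as sm`.
    # result = sm.OLS(pair[1], pair[0]).fit()
    # dfResult = ts.adfuller(result.resid)
    # return dfResult[0] < dfResult[4]['10%']
    # Using the cointegration test built into statsmodels.
    # coint() returns (test statistic, p-value, critical values at 1%, 5%, 10%).
    result = ts.coint(pair[1], pair[0])
    # Treat the pair as cointegrated when the statistic is below
    # (more negative than) the 10% critical value.
    return result[0] < result[2][2]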
Brett shared his code with us earlier in the semester; it is the first (commented-out) check, which uses the Augmented Dickey-Fuller test. There is also a built-in coint() function in statsmodels. Both seem to give the same results.
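For reference, the two-step idea behind the residual-based check can be written out as below. This is a sketch of the standard Engle-Granger form, not a transcription of either code path: the symbols (y for the second series in the pair, x for the first, p for the number of lags) are my own notation, and the constant c reflects the default setting of adfuller.

```latex
\begin{align}
  y_t &= \beta x_t + \varepsilon_t
  && \text{(step 1: OLS of one price series on the other)} \\
  \Delta \hat{\varepsilon}_t &= c + \gamma\, \hat{\varepsilon}_{t-1}
      + \sum_{i=1}^{p} \delta_i\, \Delta \hat{\varepsilon}_{t-i} + u_t
  && \text{(step 2: ADF regression on the residuals)}
\end{align}
% H_0: \gamma = 0 (the residuals have a unit root, so no cointegration).
% H_0 is rejected when the t-statistic of \gamma is below the chosen critical value.
```

One caveat worth keeping in mind: because the residuals in step 2 are estimated rather than observed, the critical values for a cointegration test are not the same as those of a plain ADF test, so the two checks can disagree near the threshold even if they usually agree.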
Reference for Cointegration:
Once we are sure that the pair is cointegrated, we need a strategy for when to enter and exit the trade. We want to enter a trade when we detect a movement away from the historical norm, and we want to exit the trade when we see it move back. There is a lot of variation here, but the research I have done shows that the most common way to do this is:
I have parameterized a lot of the code and tried playing with some of the values (especially the threshold for exiting the trade, as some research said you should trade when the spread crosses its mean value), but not much came of my experimenting.
As will be discussed below in the algorithms section, when deciding to exit the trade, I didn't know if we should compare to the historical mean and standard deviation, or if we should use the values at the time of the trade. Sometimes using one method works better than the other, so I couldn't get a handle on which one would be better, if any.
Another concept I came across is the idea of dollar-neutral trades. With this method, the size of the order placed for each position is set in constant dollar terms. This is done by taking the dollar size of the position you want and dividing by the current price of the asset.
I did play around with just ordering a set number of shares, no matter what. Again, at times it seemed to help, and at other times it did not.
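As a rough illustration of the dollar-neutral sizing described above, here is a minimal sketch in plain Python. The function name, the dollars_per_leg parameter and the choice to round down to whole shares are assumptions made for the example; they are not taken from the algorithms themselves.

```python
def dollar_neutral_shares(dollars_per_leg, price_long, price_short):
    """Return (long_shares, short_shares) so both legs are worth about the same.

    Hypothetical helper: share counts are rounded down to whole units,
    and the short leg is returned as a negative number.
    """
    if price_long <= 0 or price_short <= 0:
        raise ValueError("prices must be positive")
    long_shares = int(dollars_per_leg // price_long)
    short_shares = -int(dollars_per_leg // price_short)
    return long_shares, short_shares


# Example: put roughly $10,000 into each leg.
print(dollar_neutral_shares(10000, 40.0, 95.0))  # (250, -105)
```

Because of the rounding, the two legs are only approximately equal in dollar terms (here $10,000 long versus $9,975 short).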
Reference link:
A stop-loss order is an order you can place to help protect yourself from losses on bets that do not turn out your way. If you are long an asset, for example, you can tell Quantopian to sell the position if the price goes below a certain value, to prevent further losses. It seems like a best practice, especially since we are making bets on our assets going in the right direction. This is implemented via a parameter called stopLossOrder, which is a boolean we set when we initialize the algo.
Just like most things in this project, it seems to help in a lot of cases, but doesn't in others.
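The stop-loss idea can be sketched independently of any trading platform. The function below is only an illustration: its name, the 5% default and the fractional-loss rule are assumptions, and it is not how stopLossOrder is wired into the algorithms.

```python
def stop_loss_hit(entry_price, current_price, is_long, max_loss_fraction=0.05):
    """Return True when a position has lost more than max_loss_fraction.

    Hypothetical helper: the loss is measured relative to the entry price.
    A long position loses when the price falls; a short loses when it rises.
    """
    if is_long:
        loss = (entry_price - current_price) / entry_price
    else:
        loss = (current_price - entry_price) / entry_price
    return loss > max_loss_fraction


# A long entered at 100 that now trades at 94 is down 6%, past the 5% limit.
print(stop_loss_hit(100.0, 94.0, is_long=True))   # True
print(stop_loss_hit(100.0, 96.0, is_long=True))   # False
```

In a pair trade the check would be applied to each leg, or to the combined position, depending on how much divergence one is willing to tolerate.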
References:
One concern with pairs trading is that the assets may eventually fall out of their relationship. For example, if one of the companies in the pair (assuming stocks are the asset) announces bad quarterly results, enters into a legal battle, or is acquired by a third-party firm, the historical relationship may dissolve. I attempt to correct for this by detecting if a pair has gone from being cointegrated to not cointegrated, and if so, dropping whatever position we have. There might be better ways of dealing with this, which can be a part of future work.
Here is where I list out the few algorithms I have on Quantopian. In some sense, the last algorithm has everything: a portfolio of stocks and all the parameters I can change. I just wanted to outline my thinking.
For all but the last algorithm below, I take the approach of looking only at one pair, not a portfolio of them. I wanted to focus on a single pair to make sure I understood what was going on, as well as for debugging purposes. The algorithms can be easily extended to loop over the stocks context variable, as is done in the last algorithm.
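The "was cointegrated, no longer is" check can be sketched as a comparison of two snapshots. Everything here is hypothetical: the function name, the dictionaries of booleans, and the ticker tuples used as keys are placeholders for whatever the algorithm actually stores per pair.

```python
def pairs_to_close(was_cointegrated, is_cointegrated):
    """Return the pairs that passed the cointegration test last time but fail it now.

    Both arguments map a pair to True/False. A pair missing from the
    current snapshot is treated as no longer cointegrated.
    """
    return [
        pair
        for pair, was in was_cointegrated.items()
        if was and not is_cointegrated.get(pair, False)
    ]


previous = {("KO", "PEP"): True, ("WMT", "TGT"): True, ("XOM", "CVX"): False}
current = {("KO", "PEP"): True, ("WMT", "TGT"): False, ("XOM", "CVX"): False}
print(pairs_to_close(previous, current))  # [('WMT', 'TGT')]
```

Any open position in the returned pairs would then be closed before new signals are evaluated.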
After testing the pair for cointegration, we start our strategy: we look for when the spread is 'out of bounds' relative to its historical average (the length of the history is a parameter, context.window_length) by testing whether the current spread is above the historical mean plus, or below the historical mean minus, some fraction of the historical standard deviation. If so, we long one stock and short the other, and then look for when the 'mean reversion' happens to lock in some profits. We detect this when the current spread and the historical mean spread 'cross paths'.
This algo also plots the historical mean ± the historical standard deviation, which helps in debugging.
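The entry and exit rule described above can be sketched roughly as follows. This is an illustration under assumptions rather than the code in the algorithm: the function name, the string labels, the position convention (+1 long the spread, -1 short the spread, 0 flat) and the entry_z default of 2.0 are all made up for the example.

```python
from statistics import mean, stdev


def spread_signal(spread_history, current_spread, position=0, entry_z=2.0):
    """Classify the current spread against its recent history.

    spread_history holds the last window_length spreads (at least two values).
    Returns "short_spread", "long_spread", "exit" or "hold".
    """
    mu = mean(spread_history)
    sigma = stdev(spread_history)
    if sigma == 0:
        return "hold"  # a flat history gives no usable band
    z = (current_spread - mu) / sigma

    if position == 0:
        if z > entry_z:
            return "short_spread"  # spread unusually wide: bet on it narrowing
        if z < -entry_z:
            return "long_spread"   # spread unusually narrow: bet on it widening
        return "hold"

    # In a trade: exit once the spread has come back across the historical mean.
    crossed = (position < 0 and z <= 0) or (position > 0 and z >= 0)
    return "exit" if crossed else "hold"


history = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95]
print(spread_signal(history, 1.3))               # short_spread
print(spread_signal(history, 0.99, position=-1)) # exit
```

Whether the mean and standard deviation should be frozen at entry or recomputed on every bar is exactly the open question raised earlier; the sketch recomputes them from whatever history is passed in.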
Location: https://www.quantopian.com/algorithms/5552aee307123e7be00004fc
Inspired by both Brett's code and code found on Quantopian (mostly here), this basic algorithm looks at one pair in the list and follows the above approach.
Location: https://www.quantopian.com/algorithms/5558ff7444017cb32c000632
I've seen some algorithms use a ratio to define the spread, while others use subtraction. This algorithm is a copy of the above using a different definition of spread.
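The two definitions of spread mentioned here differ only in one line. A small sketch, with hypothetical function names, makes the contrast concrete:

```python
def spread_difference(price_a, price_b):
    """Spread as a price difference, expressed in dollars."""
    return price_a - price_b


def spread_ratio(price_a, price_b):
    """Spread as a price ratio, which is unitless."""
    if price_b == 0:
        raise ValueError("price_b must be non-zero")
    return price_a / price_b


print(spread_difference(42.0, 40.0))  # 2.0
print(spread_ratio(42.0, 40.0))       # 1.05
```

The difference depends on the price level of the two assets, while the ratio does not, so the same entry threshold can behave quite differently under the two definitions.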
Location: https://www.quantopian.com/algorithms/5558215ea780c09bc20001e5
This is the complete algorithm. It trades a portfolio of pairs, with most variables being configurable in the initialize phase. Each pair gets its own set of parameter values, which allows for algorithm tuning per pair.
The results of this are a bit mixed. In the final algorithm, I was finally seeing some results, mainly from two tweaks: stop-loss ordering and changing window_length. Stop-loss ordering, at least for some pairs, really helps to mitigate losses. Also, I originally had the window length set to 30 or 60 on a lot of runs. Changing it to 14 for calculating the historical mean and standard deviation seemed to improve results across the board.
At the end of the day, pair trading is only as good as the pairs chosen, and some of these pairs do not seem to be good candidates, as their spread doesn't deviate for long enough to capture much profit.
However, what is apparent is that these approaches might be too simple for real-world use. There would have to be a lot more logic here to handle proper portfolio and risk management.
One avenue of future work would be to see if there are better ways of dealing with drift. Another would be to find sets of parameters that are tuned for all the pairs in a portfolio, and to keep retraining these parameters via some machine learning algorithm. This could be done weekly to re-check the parameters. This is beyond what Quantopian can offer, I think, but it would be interesting to see if it would help maintain profits over time.