pydata-ldn2014-writeup
01 - Interactive Financial Analytics with Python and IPython.ipynb
In [1]:
%autosave 10
Autosaving every 10 seconds
Source material
http://www.hilpisch.com/YH_PyData_Eurex_Tutorial.html
http://www.hilpisch.com/YH_PyData_Eurex_Tutorial.ipynb
Background
Data munging is not the only major problem, but it is a big one.
Sources, formats, cleaning, missing data.
Performance is a problem too.
Organisational problems
Teams are silos. People who need answers can't ask questions. People who can give answers can't express them.
Continuum Analytics vision
Simple, interactive, and collaborative, while still offering scalable performance.
Notes
pandas offers access to free financial data sources, but beware: the data is not clean and not reliable, though it is good for playing around.
Volatility clustering: if you plot the log difference in close prices (log returns), you notice that volatility comes in clusters; it isn't randomly distributed.
Investors want volatility, because it offers chances for short-term trading profits.
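A minimal sketch of the log-return calculation behind the volatility clustering note. The price series here is synthetic (a calm regime followed by a turbulent one, both invented for illustration), and the 20-day rolling window is an arbitrary choice, not something from the talk.

```python
import numpy as np
import pandas as pd

# Synthetic closing prices (hypothetical data, not from the talk):
# a calm regime followed by a turbulent one.
rng = np.random.default_rng(0)
daily_moves = np.concatenate([
    rng.normal(0.0, 0.005, 250),  # low-volatility regime
    rng.normal(0.0, 0.03, 250),   # high-volatility regime
])
close = pd.Series(
    100.0 * np.exp(np.cumsum(daily_moves)),
    index=pd.bdate_range("2013-01-01", periods=500),
    name="Close",
)

# Log return: log(P_t / P_{t-1}), i.e. the difference of log prices.
log_ret = np.log(close / close.shift(1)).dropna()

# Rolling standard deviation as a simple volatility proxy.
rolling_vol = log_ret.rolling(window=20).std()

# Average rolling volatility in each half of the sample.
calm_vol = rolling_vol.iloc[:249].mean()
turbulent_vol = rolling_vol.iloc[249:].mean()
print(calm_vol, turbulent_vol)
```

Plotting `log_ret` or `rolling_vol` makes the two regimes visible; on real data the clusters are less clean than in this constructed example.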
Do you have to shift Returns by 1 (back 1) before multiplying? No.
This model doesn't take portfolio rebalancing (?) into account.
Rebalancing means restoring your portfolio's target balance, e.g. 70% in X and 30% in Y.
Should use discounting rather than a simple sum of Earnings.
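To make the discounting point concrete, here is a small sketch comparing a simple sum with a discounted sum. The earnings figures, the 5% per-period rate, and the `present_value` helper are all hypothetical, chosen only for illustration.

```python
def present_value(cash_flows, rate):
    """Discount cash flows received at the end of periods 1, 2, ..."""
    return sum(cf / (1.0 + rate) ** t
               for t, cf in enumerate(cash_flows, start=1))


earnings = [100.0, 100.0, 100.0]  # hypothetical per-period earnings
rate = 0.05                       # hypothetical per-period discount rate

simple_sum = sum(earnings)
discounted = present_value(earnings, rate)
print(simple_sum, round(discounted, 2))  # 300.0 272.32
```

The simple sum overstates the value of later earnings; the gap grows with the rate and with the horizon.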
Err on the side of readability. Don't put too many operations, particularly in pandas, onto one line.
Unless performance, when measured, is an issue.
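As an illustration of the readability advice, the sketch below computes the same quantity twice: once as a dense one-liner and once as named steps. The prices, the window of 3, and the 252-day annualisation factor are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices.
prices = pd.DataFrame({"Close": [100.0, 101.0, 99.5, 102.0, 103.5, 101.0]})

# Dense one-liner: correct, but hard to scan and hard to debug.
dense = np.log(prices["Close"] / prices["Close"].shift(1)).rolling(3).std() * np.sqrt(252)

# The same computation as named steps.
log_ret = np.log(prices["Close"] / prices["Close"].shift(1))
rolling_std = log_ret.rolling(window=3).std()
annualised_vol = rolling_std * np.sqrt(252)
```

Each intermediate name can be inspected or plotted on its own, which is usually worth more in a notebook than the saved lines.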
VSTOXX vs EUROSTOXX
EUROSTOXX is mean reverting; the standard theory of stocks applies.
VSTOXX is kind of like an interest rate: quoted in percentage points, an aggregate, derived from the implied volatility of puts and calls.
Log returns help to compare two different time series in a mathematical way. This seems to be a common pattern.
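A sketch of why log returns make two series comparable: the levels below live on very different scales, but their log returns can be correlated directly. Both series are synthetic, and the negative relationship between them is built into the fake data as an assumption, not measured from real EUROSTOXX or VSTOXX quotes.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 500
index_moves = rng.normal(0.0, 0.01, n)
# Assumption for illustration: the volatility series moves against the index.
vol_moves = -4.0 * index_moves + rng.normal(0.0, 0.02, n)

# Two level series on very different scales (hypothetical starting values).
levels = pd.DataFrame({
    "index_level": 3000.0 * np.exp(np.cumsum(index_moves)),
    "vol_level": 20.0 * np.exp(np.cumsum(vol_moves)),
})

# Log returns put both series on the same footing.
log_rets = np.log(levels / levels.shift(1)).dropna()
corr = log_rets["index_level"].corr(log_rets["vol_level"])
print(corr)
```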
Good link:
http://scipy-lectures.github.io/advanced/mathematical_optimization/
High frequency trading data
High frequency data is not well covered by textbooks; even the data sizes alone change the game.
Worse, heterogeneous time intervals! Tick data comes when it comes, not at fixed intervals.
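One common way to deal with irregular tick spacing is to resample onto a fixed grid. The sketch below uses invented ticks and an arbitrary 5-second bar size; taking the last price per bar and forward-filling empty bars is one choice among several, not the method from the talk.

```python
import pandas as pd

# Hypothetical ticks arriving at irregular timestamps.
ticks = pd.Series(
    [100.0, 100.2, 100.1, 100.4, 100.3, 100.6],
    index=pd.to_datetime([
        "2014-02-21 09:00:00.120",
        "2014-02-21 09:00:00.950",
        "2014-02-21 09:00:03.400",
        "2014-02-21 09:00:07.020",
        "2014-02-21 09:00:09.990",
        "2014-02-21 09:00:17.500",
    ]),
    name="price",
)

# Gaps between consecutive ticks are not constant.
gaps = ticks.index.to_series().diff().dropna()

# Fixed 5-second bars: last observed price per bar,
# carrying the previous price forward through empty bars.
bars = ticks.resample(pd.Timedelta(seconds=5)).last().ffill()
print(bars)
```

The bar from 09:00:10 to 09:00:15 contains no tick, so it inherits the previous bar's price.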
!!AI Can you use numexpr to df.apply(...) some optimized function?
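One possible direction for this question, offered as a sketch rather than an answer from the talk: instead of pushing numexpr into `df.apply(...)`, express the calculation as a whole-column expression with `DataFrame.eval`, which pandas evaluates with numexpr when it is installed and with plain Python otherwise. The column names and the formula are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
df = pd.DataFrame(rng.standard_normal((1000, 3)), columns=["a", "b", "c"])

# Row-wise apply: one Python-level function call per row.
via_apply = df.apply(lambda row: row["a"] * row["b"] + row["c"], axis=1)

# Whole-column expression: uses numexpr if available, else falls back to Python.
via_eval = df.eval("a * b + c")
```

Any speed-up depends on the frame size and the expression, so it is worth timing both versions (for example with `%timeit`) before committing to one.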
Why Python?
Nothing compares to Python's sheer breadth.
What, in Ruby, comes close to NumPy, SciPy, and Pandas?
R?
Systems development, actual production code, web development, ..., Python can do it.
Performance?
Python has overcome this stigma.
Python is less a glue between system components or libraries, and more a glue between high performance methods.
LLVM, multi-core, GPUs, clusters.