Source material

Background

Data munging is not the only primary problem, but it is a big one.
- Sources, formats, cleaning missing data.
Performance is a problem too.
Organisational problems
- Teams are silos. People who need answers can't ask questions. People who can give answers can't express them.
Continuum Analytics vision
- Simple, interactive, collaborative, but still scalable performance

pandas offers access to free financial data sources, but beware: the data is neither clean nor reliable, though it is good for playing around.
Volatility clustering: if you plot the log difference of the close (log returns), you notice that volatility comes in clusters; it isn't randomly distributed.
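
A minimal sketch of the log-return calculation, using synthetic prices rather than a real feed. The series `close`, the 21-day window, and the 252-day annualisation factor are assumptions for illustration. The random data here is i.i.d., so it will not show the clustering visible in real market data.

```python
import numpy as np
import pandas as pd

# Synthetic closing prices (hypothetical data, not a real market series).
rng = np.random.default_rng(0)
close = pd.Series(
    100 * np.exp(np.cumsum(rng.normal(0, 0.01, 500))),
    index=pd.date_range("2013-01-01", periods=500, freq="B"),
    name="Close",
)

# Log return: log(P_t / P_{t-1}), i.e. the first difference of log prices.
log_returns = np.log(close).diff().dropna()

# Rolling annualised volatility (assumed: 21-day window, 252 trading days).
rolling_vol = log_returns.rolling(window=21).std() * np.sqrt(252)
```

Plotting `log_returns` and `rolling_vol` for a real series is where the clusters would show up.
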
Investors want volatility because it offers chances for short-term trading profit.
Do you have to shift Returns by 1 (back 1) before multiplying? No.
This model doesn't take portfolio rebalancing (?) into account.
- Rebalancing means restoring e.g. 70% in X, 30% in Y balance of your portfolio
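
Rebalancing as described above can be sketched in a few lines. The function `rebalance`, the asset names `X` and `Y`, and the prices are hypothetical; transaction costs are ignored.

```python
def rebalance(holdings, prices, target_weights):
    """Return the trade in units per asset that restores the target weights.

    Positive values mean buy, negative values mean sell.
    """
    total_value = sum(holdings[a] * prices[a] for a in holdings)
    trades = {}
    for asset, weight in target_weights.items():
        target_value = total_value * weight
        current_value = holdings[asset] * prices[asset]
        trades[asset] = (target_value - current_value) / prices[asset]
    return trades


# Portfolio has drifted to 80% X / 20% Y; the target is 70% / 30%.
holdings = {"X": 80.0, "Y": 20.0}
prices = {"X": 10.0, "Y": 10.0}
trades = rebalance(holdings, prices, {"X": 0.7, "Y": 0.3})
```

Here the result is to sell roughly 10 units of X and buy roughly 10 units of Y.
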
Should use discounting, rather than simple sum of Earnings
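
To make the difference concrete, here is a small sketch comparing a simple sum with a discounted sum. The constant per-period rate of 5% and the earnings figures are made-up assumptions, and cash flows are assumed to arrive at the end of each period.

```python
def discounted_sum(earnings, rate):
    """Present value of period-end cash flows at a constant per-period rate."""
    return sum(e / (1 + rate) ** t for t, e in enumerate(earnings, start=1))


earnings = [100.0, 100.0, 100.0]  # hypothetical earnings per period

simple = sum(earnings)                        # ignores the time value of money
pv = discounted_sum(earnings, rate=0.05)      # later earnings count for less
```

With a positive rate the discounted value is always below the simple sum, and the gap grows with the horizon.
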
Err on the side of readability. Don't put too many operations, particularly in pandas, onto one line.
- Unless performance, when measured, is an issue.
VSTOXX vs EUROSTOXX
- EUROSTOXX is mean reverting; the standard theory of stocks applies.
- VSTOXX is kind of like an interest rate: quoted in percentage points, an aggregate of the implied volatility of puts and calls.
Log returns help compare two different time series in a mathematical way. This seems to be a common pattern.
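
A sketch of why this works: log returns put series with very different levels onto a common scale, so they can be correlated directly. Both series below are synthetic, and the column names and the negative relationship built into them are assumptions for illustration, not measured market behaviour.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
idx = pd.date_range("2013-01-01", periods=250, freq="B")
shocks = rng.normal(0, 0.01, 250)
noise = rng.normal(0, 0.02, 250)

# Two hypothetical series on very different scales. The second is built to
# move against the first with a larger amplitude (illustrative assumption).
data = pd.DataFrame(
    {
        "index_level": 2500 * np.exp(np.cumsum(shocks)),
        "vol_level": 20 * np.exp(np.cumsum(-4 * shocks + noise)),
    },
    index=idx,
)

# Log returns are scale-free, so both columns become directly comparable.
log_rets = np.log(data / data.shift(1)).dropna()
corr = log_rets["index_level"].corr(log_rets["vol_level"])
```
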
Good link: http://scipy-lectures.github.io/advanced/mathematical_optimization/

High-frequency data is not well covered by textbooks; the data sizes alone change the game.
Worse, the time intervals are heterogeneous! Tick data comes when it comes, not at fixed intervals.
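
One common way to deal with irregular ticks is to resample them onto a fixed grid. The sketch below uses pandas `resample` on made-up ticks; taking the last price per one-second bar and forward-filling empty bars is an assumed convention, not the only reasonable one.

```python
import pandas as pd

# Hypothetical ticks arriving at irregular times.
ticks = pd.Series(
    [100.0, 100.2, 100.1, 100.4, 100.3],
    index=pd.to_datetime(
        [
            "2013-06-03 09:00:00.150",
            "2013-06-03 09:00:00.900",
            "2013-06-03 09:00:02.300",
            "2013-06-03 09:00:02.350",
            "2013-06-03 09:00:05.010",
        ]
    ),
    name="price",
)

# Last observed price in each one-second bar; bars without ticks carry the
# previous price forward.
bars = ticks.resample("1s").last().ffill()
```

The result is a regular one-second series that can be fed to methods expecting fixed intervals, at the cost of discarding intra-bar detail.
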
!!AI Can you use numexpr to df.apply(...) some optimized function?
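
One possible direction for this question, as a sketch only: instead of a row-wise `df.apply(...)`, express the calculation as a column expression with `DataFrame.eval`, which pandas documents as using numexpr as its engine when numexpr is installed and falling back to plain Python otherwise. The frame and the expression below are made up.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": np.arange(5.0), "b": np.arange(5.0, 10.0)})

# Row-wise apply: flexible, but calls a Python function once per row.
via_apply = df.apply(lambda row: row["a"] * row["b"] + 1.0, axis=1)

# Column expression: evaluated over whole columns in one call.
via_eval = df.eval("a * b + 1.0")
```

Whether this is actually faster depends on the data size and should be measured rather than assumed.
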

Nothing compares to Python's sheer breadth.
- What, in Ruby, comes close to NumPy, SciPy, and Pandas?
R?
- Systems development, actual production code, web development, ..., Python can do it.
Performance?
- Python has overcome this stigma.
- Python is less a glue between system components or libraries, and more a glue between high performance methods.
  - LLVM, multi-core, GPUs, clusters.
