Kalman and Bayesian Filters in Python

Introduction

Version 0.0

Author's note - this is obsolete, read the preface instead.

The Kalman filter was introduced to the world via papers published in 1958 and 1960 by Rudolf E. Kalman. This work built on work by Norbert Wiener. Kalman's early papers were extremely abstract, but researchers quickly realized that the papers described a very practical technique to filter noisy data. From then until now it has been an ongoing topic of research, and there are many books and papers devoted not only to the basics, but also to many specializations and extensions of the technique. If you are reading this, you have likely come across some of them.

If you are like me, you probably find them nearly impenetrable. I find that almost all start with very abstract math, assume familiarity with notation and naming conventions that I haven't seen before, and focus heavily on proof rather than exposition and teaching. This is perhaps understandable, but it is a regrettable situation and not necessary.

After struggling through this material for some time, things finally began making sense. A majority of my 'aha' moments were due to implementing and experimenting with various simple filters. What to make of an equation like $K_k=P_{k}^{-}H^{T}_{k}[H_k P^{-}_{k} H^{T}_{k} + R_k]^{-1}$ is initially puzzling. This is especially true when it pops out as the result of two to three pages of linear algebra, and each variable is given only a very abstract, mathematically rigorous definition. One book I have doesn't bother to define $R_k$ despite using it throughout its 4 pages of derivation of K. If instead I tell you that K is just a scaling factor for choosing how much of a measurement and how much of a prediction to use in the filter, what K is becomes obvious (although perhaps the computation is still a bit mysterious). After implementing a few Kalman filters for toy problems and varying the values of various matrices and constants I developed both an intuitive and fairly deep understanding of how Kalman filters work. This knowledge is indispensable; it is trivial to code the handful of linear equations for a Kalman filter, and they never change. I do mean a handful - you can implement the simplest Kalman filter in 10 lines of code.
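To make the idea of K as a scaling factor concrete, here is a minimal sketch of a one-dimensional Kalman filter. It assumes a single scalar quantity that stays roughly constant and is measured directly, so every matrix in the gain equation collapses to a plain number. The function name `kalman_1d`, its parameter names, and the default values are my own illustrative choices, not part of any library.

```python
def kalman_1d(measurements, x0=0.0, p0=1000.0, q=1e-5, r=0.1):
    """Filter scalar measurements of a (nearly) constant value.

    x0, p0 : initial estimate and its variance (illustrative defaults)
    q      : process noise variance (assumed)
    r      : measurement noise variance (assumed known)
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the model says "nothing changes", so only uncertainty grows.
        p = p + q
        # Gain: scalar form of K = P H^T (H P H^T + R)^-1 with H = 1.
        k = p / (p + r)
        # Update: k near 1 trusts the measurement, k near 0 trusts the prediction.
        x = x + k * (z - x)
        p = (1 - k) * p
        estimates.append(x)
    return estimates


# Repeated, noise-free readings of 10.0 pull the estimate toward 10.0.
print(kalman_1d([10.0] * 20, r=1.0)[-1])
```

Try changing `r`: a large value makes the filter distrust each measurement and respond slowly, while `r=0` makes it simply repeat the measurements.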

This needs a lot of editing. Bored with it for now.

While you do need to know some basic probability and some very basic linear algebra to understand and implement Kalman filters, by and large you really do not need to understand the complicated, multi-page derivations. The end result of all the math is a small handful of equations which you can use to perform very sophisticated filtering, smoothing, and tracking. Implementing the basic equations is never difficult. Kalman filter design is much more an art than a science - implementers spend a few minutes writing the basic equations, and then a lot of time tuning the filter for their specific problem.

I compare this to a student learning the equation for the volume of a sphere: $V(r) = \frac{4}{3}\pi r^3$. A student can use this equation without even understanding the functional notation $V(r)$, and certainly they do not need to be able to derive the equation via calculus. Eventually, in some domains, knowledge of the calculus will become useful, but an enormous amount of work can be done by only knowing the equation and how to apply it. Also, it is often useful to understand facts about a sphere, such as that it is the shape that encloses a given volume with the smallest surface area. You do not have to know how to prove that to use it to explain why bubbles are round, or to economically fence an area for livestock.

I argue the same is largely true with Kalman filters. In this book I will not prove the Kalman filter equations are correct, nor will I derive them. I will instead strive to build in you a physical, intuitive understanding of what the equations mean, and how they interact. Of course, if you are trying to navigate a spaceship to Mars you will require a more sophisticated understanding than I provide, and you will not view this as a useful resource. But for the rest of us, who perhaps want to track heads in video, or control a little hobby robot, or smooth some data, I think this approach should provide a lot of insight into how to create Kalman filters, and provide enough background that the standard texts become approachable.

Prerequisites

While Kalman filters are used in many domains, they were initially created to solve problems with missile tracking and navigation. Most of my examples will draw from physical examples such as these. However, I do this not just from a sense of history, but because I believe that it helps the student form strong physical intuitions about what the filter is doing. For example, we will learn how the Kalman filter creates and uses something called "hidden variables". This is normally presented in a highly abstract manner. However, it is actually quite simple. Suppose I know your position at several points in time. From that I can calculate your velocity even though I don't have any sensor that directly measures your velocity. In a Kalman filter, if I have a position sensor, the filter will generate the hidden, or unobserved, variable velocity. A lot of seemingly arcane terminology suddenly becomes concrete and clear. So you should have taken a basic physics course and understand equations like $d = \frac{a}{2} t^2 + v_0 t + d_0$.
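The idea of recovering velocity from positions alone can be sketched in a few lines. The positions and the one-second sample interval below are made-up values for illustration. Note that this is only the underlying intuition: a Kalman filter estimates the hidden variable in a statistically weighted way rather than by raw differencing.

```python
import numpy as np

# Hypothetical data: positions (m) sampled once per second for an
# object moving at a constant 2 m/s. The values are made up.
dt = 1.0
positions = np.array([0.0, 2.0, 4.0, 6.0, 8.0])

# Velocity is never measured; it is inferred from successive positions.
velocities = np.diff(positions) / dt
print(velocities)   # [2. 2. 2. 2.]
```

Five positions give four velocity estimates, one for each interval between samples.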

You will need some basic calculus - you should be able to integrate velocity to get distance, or take the derivative of the distance equation above to get the velocity equation. You should be familiar with trigonometry.

Kalman filtering depends heavily on statistics. I have a basic Kalman filtering textbook that devotes several chapters to statistics before even trying to discuss Kalman filtering. Again, I feel like this is a place where you can get a long way with a little information. So I will assume that you have had some exposure to probability. If you have done calculus you probably know how to perform basic probability computations. I will provide some remedial math for Gaussian distributions, but I will move fairly quickly through it.

Finally, you will probably want to have some exposure to linear algebra. I will assume that you understand matrices, arrays, and simple operations like matrix multiplication. In the more difficult moments we will use things like LU and Cholesky decomposition, and touch on things like eigenvalues and eigenvectors, but you really do not need to be able to do math at that level.

I am writing this in IPython Notebook. I use Python because of its excellent mathematics libraries in the form of numpy and scipy, and its strength as a general purpose programming language. Kalman filter books that do include code invariably use Matlab. This makes sense, as any professional engineer will have access to it, and long practice with it. However, I am addressing the hobbyist, and most cannot afford to, or do not want to, buy Matlab. Also, the interactive aspect of mixing code, results, and text in one document is irresistible to me. I have read texts and papers with intriguing graphs, and I struggle to understand how the graph was generated. With IPython Notebook nothing is hidden - the math is all revealed. If you wonder 'what would happen if I changed this parameter' - you can just type in a new value and see the results. Finally, because of the facilities of numpy and scipy you do not have to code or understand details of linear algebra - when the Kalman filter equations state that we need to find the Cholesky decomposition we will just call numpy.linalg.cholesky. I will spend a few words explaining what that means when we get to it, but to be honest you do not need to understand it to use the code.
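As a small illustration of how little code this takes, here is a sketch of calling numpy.linalg.cholesky. The matrix `P` is a made-up symmetric positive-definite example, not one taken from a filter in this book.

```python
import numpy as np

# A small symmetric positive-definite matrix; the values are made up.
P = np.array([[4.0, 2.0],
              [2.0, 3.0]])

# cholesky returns the lower-triangular factor L such that P = L @ L.T
L = np.linalg.cholesky(P)
print(L)
print(np.allclose(L @ L.T, P))   # True
```

Multiplying the factor by its own transpose reproduces the original matrix, which is all the check on the last line does.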

Since this is a book on doing Kalman filtering with Python I will expect that you know Python; I do not attempt to teach it. With that said, it is a very easy language to learn, and it reads much like pseudocode. If you are more familiar with another language I do not think you will have any major difficulties reading the source code. I purposefully restrict the code to the more basic features of Python to accommodate people with varying skill levels. If you use this code in your own work, feel free to use more advanced Python facilities if it strikes your fancy.

So that is a fair number of prerequisites, but I think you will already have most of them if you are seriously trying to solve a problem where Kalman filters are useful.

In [1]:
#format the book
import book_format
book_format.load_style()