Designing Nonlinear Kalman Filters

In [1]:
#format the book
%matplotlib inline
from __future__ import division, print_function
import matplotlib.pyplot as plt
import book_format
book_format.load_style()
Out[1]:

Introduction

Author's note: I was initially planning to have a design nonlinear chapter that compares various approaches. This may or may not happen, but for now this chapter has no useful content and I suggest not reading it.

We see that the Kalman filter reasonably tracks the ball. However, as already explained, this is a silly example; we can predict trajectories in a vacuum with arbitrary precision; using a Kalman filter in this example is a needless complication.

Kalman Filter with Air Drag

I will dispense with the step 1, step 2, type approach and proceed in a more natural style that you would use in a non-toy engineering problem. We have already developed a Kalman filter that does excellently at tracking a ball in a vacuum, but that does not incorporate the effects of air drag into the model. We know that the process model is implemented with $\textbf{F}$, so we will turn our attention to that immediately.

Notionally, the computation that $\textbf{F}$ computes is

$$x' = Fx$$

With no air drag, we had

$$ \mathbf{F} = \begin{bmatrix} 1 & \Delta t & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & \Delta t & \frac{1}{2}{\Delta t}^2 \\ 0 & 0 & 0 & 1 & \Delta t \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} $$

which corresponds to the equations

$$ \begin{aligned} x &= x + v_x \Delta t \\ v_x &= v_x \\ \\ y &= y + v_y \Delta t + \frac{a_y}{2} {\Delta t}^2 \\ v_y &= v_y + a_y \Delta t \\ a_y &= a_y \end{aligned} $$

From the section above we know that our new Euler equations must be

$$ \begin{aligned} x &= x + v_x \Delta t \\ v_x &= v_x \\ \\ y &= y + v_y \Delta t + \frac{a_y}{2} {\Delta t}^2 \\ v_y &= v_y + a_y \Delta t \\ a_y &= a_y \end{aligned} $$

Realistic 2D Position Sensors

The position sensor in the last example are not very realistic. In general there is no 'raw' sensor that provides (x,y) coordinates. We have GPS, but GPS already uses a Kalman filter to create a filtered output; we should not be able to improve the signal by passing it through another Kalman filter unless we incorporate additional sensors to provide additional information. We will tackle that problem later.

Consider the following set up. In an open field we put two transmitters at a known location, each transmitting a signal that we can detect. We process the signal and determine how far we are from that signal, with some noise. First, let's look at a visual depiction of that.

In [2]:
circle1=plt.Circle((-4, 0), 5, color='#004080', 
                   fill=False, linewidth=20, alpha=.7)
circle2=plt.Circle((4, 0), 5, color='#E24A33', 
                   fill=False, linewidth=5, alpha=.7)

fig = plt.gcf()
ax = fig.gca()

plt.axis('equal')
plt.xlim((-10, 10))
plt.ylim((-10, 10))

plt.plot ([-4, 0], [0, 3], c='#004080')
plt.plot ([4, 0], [0, 3], c='#E24A33')
plt.text(-4, -.5, "A", fontsize=16, horizontalalignment='center')
plt.text(4, -.5, "B", fontsize=16, horizontalalignment='center')

ax.add_artist(circle1)
ax.add_artist(circle2)
plt.show()

Here I have attempted to show transmitter A, drawn in red, at (-4,0) and a second one B, drawn in blue, at (4,0). The red and blue circles show the range from the transmitters to the robot, with the width illustrating the effect of the $1\sigma$ angular error for each transmitter. Here I have given the blue transmitter more error than the red one. The most probable position for the robot is where the two circles intersect, which I have depicted with the red and blue lines. You will object that we have two intersections, not one, but we will see how we deal with that when we design the measurement function.

This is a very common sensor set up. Aircraft still use this system to navigate, where it is called DME (Distance Measuring Equipment). Today GPS is a much more common navigation sytem, but I have worked on an aircraft where we integrated sensors like this into our filter along with the GPS, INS, altimeters, etc. We will tackle what is called multi-sensor fusion later; for now we will just address this simple configuration.

The first step is to design our state variables. We will assume that the robot is travelling in a straight direction with constant velocity. This is unlikely to be true for a long period of time, but is acceptable for short periods of time. This does not differ from the previous problem - we will want to track the values for the robot's position and velocity. Hence,

$$\mathbf{x} = \begin{bmatrix}x\\v_x\\y\\v_y\end{bmatrix}$$

The next step is to design the state transistion function. This also will be the same as the previous problem, so without further ado,

$$ \mathbf{x}' = \begin{bmatrix}1& \Delta t& 0& 0\\0& 1& 0& 0\\0& 0& 1& \Delta t\\ 0& 0& 0& 1\end{bmatrix}\mathbf{x}$$

The next step is to design the control inputs. We have none, so we set ${\mathbf{B}}=0$.

The next step is to design the measurement function $\mathbf{z} = \mathbf{Hx}$. We can model the measurement using the Pythagorean theorem.

$$ z_a = \sqrt{(x-x_A)^2 + (y-y_A)^2} + v_a\\[1em] z_b = \sqrt{(x-x_B])^2 + (y-y_B)^2} + v_b $$

where $v_a$ and $v_b$ are white noise.

We see an immediate problem. The Kalman filter is designed for linear equations, and this is obviously nonlinear. In the next chapters we will look at several ways to handle nonlinear problems in a robust way, but for now we will do something simpler. If we know the approximate position of the robot than we can linearize these equations around that point. I could develop the generalized mathematics for this technique now, but instead let me just present the worked example to give context to that development.

Instead of computing $\mathbf{H}$ we will compute the partial derivative of $\mathbf{H}$ with respect to the robot's position $\mathbf{x}$. You are probably familiar with the concept of partial derivative, but if not, it just means how $\mathbf{H}$ changes with respect to the robot's position. It is computed as the partial derivative of $\mathbf{H}$ as follows:

$$\frac{\partial \mathbf{h}}{\partial \mathbf{x}} = \begin{bmatrix} \frac{\partial h_1}{\partial x_1} & \frac{\partial h_1}{\partial x_2} &\dots \\ \frac{\partial h_2}{\partial x_1} & \frac{\partial h_2}{\partial x_2} &\dots \\ \vdots & \vdots \end{bmatrix} $$

Let's work the first partial derivative. We want to find

$$\frac{\partial }{\partial x} \sqrt{(x-x_A)^2 + (y-y_A)^2} $$

Which we compute as $$ \begin{aligned} \frac{\partial h_1}{\partial x} &= ((x-x_A)^2 + (y-y_A)^2))^\frac{1}{2} \\ &= \frac{1}{2}\times 2(x-x_a)\times ((x-x_A)^2 + (y-y_A)^2))^{-\frac{1}{2}} \\ &= \frac{x_r - x_A}{\sqrt{(x_r-x_A)^2 + (y_r-y_A)^2}} \end{aligned} $$

We continue this computation for the partial derivatives of the two distance equations with respect to $x$, $y$, $dx$ and $dy$, yielding

$$\frac{\partial\mathbf{h}}{\partial\mathbf{x}}= \begin{bmatrix} \frac{x_r - x_A}{\sqrt{(x_r-x_A)^2 + (y_r-y_A)^2}} & 0 & \frac{y_r - y_A}{\sqrt{(x_r-x_A)^2 + (y_r-y_A)^2}} & 0 \\ \frac{x_r - x_B}{\sqrt{(x_r-x_B)^2 + (y_r-y_B)^2}} & 0 & \frac{y_r - y_B}{\sqrt{(x_r-x_B)^2 + (y_r-y_B)^2}} & 0 \\ \end{bmatrix} $$

That is pretty painful, and these are very simple equations. Computing the Jacobian can be extremely difficult or even impossible for more complicated systems. However, there is an easy way to get Python to do the work for you by using the sympy module [1]. sympy is a Python library for symbolic mathematics. The full scope of its abilities are beyond this book, but it can perform algebra, integrate and differentiate equations, find solutions to differential equations, and much more. We will use it to compute our Jabobian!

First, a simple example. We will import sympy, initialize its pretty print functionality (which will print equations using LaTeX). We will then declare a symbol for numpy to use.

In [3]:
import sympy
from sympy import  init_printing
#from sympy.interactive import printing
init_printing(use_latex='mathjax')

phi, x = sympy.symbols('\phi, x')
phi
Out[3]:
$$\phi$$

Notice how we use a latex expression for the symbol phi. This is not necessary, but if you do it will render as LaTeX when output. Now let's do some math. What is the derivative of $\sqrt{\phi}$?

In [4]:
sympy.diff('sqrt(phi)')
Out[4]:
$$\frac{1}{2 \sqrt{\phi}}$$

We can factor equations.

In [5]:
sympy.factor('phi**3 -phi**2 + phi - 1')
Out[5]:
$$\left(\phi - 1\right) \left(\phi^{2} + 1\right)$$

sympy has a remarkable list of features, and as much as I enjoy exercising its features we cannot cover them all here. Instead, let's compute our Jacobian.

In [6]:
from sympy import symbols, Matrix
phi = symbols('\phi')
phi

x, y, xa, xb, ya, yb, dx, dy = symbols('x y x_a x_b y_a y_b dx dy')

H = Matrix([[sympy.sqrt((x-xa)**2 + (y-ya)**2)], 
            [sympy.sqrt((x-xb)**2 + (y-yb)**2)]])

state = Matrix([x, dx, y, dy])
H.jacobian(state)
Out[6]:
$$\left[\begin{matrix}\frac{x - x_{a}}{\sqrt{\left(x - x_{a}\right)^{2} + \left(y - y_{a}\right)^{2}}} & 0 & \frac{y - y_{a}}{\sqrt{\left(x - x_{a}\right)^{2} + \left(y - y_{a}\right)^{2}}} & 0\\\frac{x - x_{b}}{\sqrt{\left(x - x_{b}\right)^{2} + \left(y - y_{b}\right)^{2}}} & 0 & \frac{y - y_{b}}{\sqrt{\left(x - x_{b}\right)^{2} + \left(y - y_{b}\right)^{2}}} & 0\end{matrix}\right]$$

In a nutshell, the entry (0,0) contains the difference between the x coordinate of the robot and transmitter A's x coordinate divided by the distance between the robot and A. (2,0) contains the same, except for the y coordintates of the robot and transmitters. The bottom row contains the same computations, except for transmitter B. The 0 entries account for the velocity components of the state variables; naturally the range does not provide us with velocity.

The values in this matrix change as the robot's position changes, so this is no longer a constant; we will have to recompute it for every time step of the filter.

If you look at this you may realize that this is just a computation of x/dist and y/dist, so we can switch this to a trigometic form with no loss of generality:

$$\frac{\partial\mathbf{h}}{\partial\mathbf{x}}= \begin{bmatrix} -\cos{\theta_A} & 0 & -\sin{\theta_A} & 0 \\ -\cos{\theta_B} & 0 & -\sin{\theta_B} & 0 \end{bmatrix} $$

However, this raises a huge problem. We are no longer computing $\mathbf{H}$, but $\Delta\mathbf{H}$, the change of $\mathbf{H}$. If we passed this into our Kalman filter without altering the rest of the design the output would be nonsense. Recall, for example, that we multiply $\mathbf{Hx}$ to generate the measurements that would result from the given estimate of $\mathbf{x}$ But now that $\mathbf{H}$ is linearized around our position it contains the change in the measurement function.

We are forced, therefore, to use the change in $\mathbf{x}$ for our state variables. So we have to go back and redesign our state variables.

Please note this is a completely normal occurance in designing Kalman filters. The textbooks present examples like this as fait accompli, as if it is trivially obvious that the state variables needed to be velocities, not positions. Perhaps once you do enough of these problems it would be trivially obvious, but at that point why are you reading a textbook? I find myself reading through a presentation multiple times, trying to figure out why they made a choice, finally to realize that it is because of the consequences of something on the next page. My presentation is longer, but it reflects what actually happens when you design a filter. You make what seem reasonable design choices, and as you move forward you discover properties that require you to recast your earlier steps. As a result, I am going to somewhat abandon my step 1, step 2, etc., approach, since so many real problems are not quite that straightforward.

If our state variables contain the velocities of the robot and not the position then how do we track where the robot is? We can't. Kalman filters that are linearized in this fashion use what is called a nominal trajectory - i.e. you assume a position and track direction, and then apply the changes in velocity and acceleration to compute the changes in that trajectory. How could it be otherwise? Recall the graphic showing the intersection of the two range circles - there are two areas of intersection. Think of what this would look like if the two transmitters were very close to each other - the intersections would be two very long cresent shapes. This Kalman filter, as designed, has no way of knowing your true position from only distance measurements to the transmitters. Perhaps your mind is already leaping to ways of working around this problem. If so, stay engaged, as later sections and chapters will provide you with these techniques. Presenting the full solution all at once leads to more confusion than insight, in my opinion.

So let's redesign our state transition function. We are assuming constant velocity and no acceleration, giving state equations of $$ \dot{x}' = \dot{x} \\ \ddot{x}' = 0 \\ \dot{y}' = \dot{y} \\ \dot{y}' = 0$$

This gives us the the state transition function of

$$ \mathbf{F} = \begin{bmatrix}0 &1 & 0& 0\\0& 0& 0& 0\\0& 0& 0& 1\\ 0& 0& 0& 0\end{bmatrix}$$

A final complication comes from the measurements that we pass in. $\mathbf{Hx}$ is now computing the change in the measurement from our nominal position, so the measurement that we pass in needs to be not the range to A and B, but the change in range from our measured range to our nomimal position.

There is a lot here to take in, so let's work through the code bit by bit. First we will define a function to compute $\frac{\partial\mathbf{h}}{\partial\mathbf{x}}$ for each time step.

In [7]:
from math import sin, cos, atan2

def H_of(pos, pos_A, pos_B):
    """ Given the position of our object at 'pos' in 2D, and two 
    transmitters A and B at positions 'pos_A' and 'pos_B', return 
    the partial derivative of H
    """

    theta_a = atan2(pos_a[1]-pos[1], pos_a[0] - pos[0])
    theta_b = atan2(pos_b[1]-pos[1], pos_b[0] - pos[0])

    return np.array([[0, -cos(theta_a), 0, -sin(theta_a)],
                     [0, -cos(theta_b), 0, -sin(theta_b)]])

Now we need to create our simulated sensor.

In [8]:
from numpy.random import randn

class DMESensor(object):
    def __init__(self, pos_a, pos_b, noise_factor=1.0):
        self.A = pos_a
        self.B = pos_b
        self.noise_factor = noise_factor
        
    def range_of(self, pos):
        """ returns tuple containing noisy range data to A and B
        given a position 'pos'
        """
        
        ra = math.sqrt((self.A[0] - pos[0])**2 + (self.A[1] - pos[1])**2)
        rb = math.sqrt((self.B[0] - pos[0])**2 + (self.B[1] - pos[1])**2)
        
        return (ra + randn()*self.noise_factor, 
                rb + randn()*self.noise_factor)

Finally, we are ready for the Kalman filter code. I will position the transmitters at x=-100 and 100, both with y=-20. This gives me enough space to get good triangulation from both as the robot moves. I will start the robot at (0,0) and move by (1,1) each time step.

In [9]:
import math
from filterpy.kalman import KalmanFilter
import numpy as np
import book_plots as bp

pos_a = (100, -20)
pos_b = (-100, -20)

f1 = KalmanFilter(dim_x=4, dim_z=2)

f1.F = np.array ([[0, 1, 0, 0],
                  [0, 0, 0, 0],
                  [0, 0, 0, 1],
                  [0, 0, 0, 0]], dtype=float)

f1.R *= 1.
f1.Q *= .1

f1.x = np.array([[1, 0, 1, 0]], dtype=float).T
f1.P = np.eye(4) * 5.

# initialize storage and other variables for the run
count = 30
xs, ys = [], []
pxs, pys = [], []

# create the simulated sensor
d = DMESensor(pos_a, pos_b, noise_factor=3.)

# pos will contain our nominal position since the filter does not
# maintain position.
pos = [0,0]

for i in range(count):
    # move (1,1) each step, so just use i
    pos = [i, i]
    
    # compute the difference in range between the nominal track
    # and measured ranges
    ra,rb = d.range_of(pos)
    rx,ry = d.range_of((pos[0] + f1.x[0, 0], pos[1] + f1.x[2, 0]))
    z = np.array([[ra-rx], [rb-ry]])

    # compute linearized H for this time step
    f1.H = H_of (pos, pos_a, pos_b)

    # store stuff so we can plot it later
    xs.append(f1.x[0, 0]+i)
    ys.append(f1.x[2, 0]+i)
    pxs.append(pos[0])
    pys.append(pos[1])
    
    # perform the Kalman filter steps
    f1.predict()
    f1.update(z)


bp.plot_filter(xs, ys)
bp.plot_track(pxs, pys)
plt.legend(loc=2)
plt.show()

Linearizing the Kalman Filter

Now that we have seen an example of linearizing the Kalman filter we are in a position to better understand the math.

We start by assuming some function $\mathbf f$

References