Modeling the Last Flight of MH370 with a Markov Chain Monte Carlo Method

Conor L. Myhrvold


Harvard University, SEAS, IACS , Computational Science & Engineering (CSE)

AM207 : Advanced Scientific Computing: Stochastic Optimization Methods. Monte Carlo Methods for Inference and Data Analysis

Final Project, March 2014



Contact Info:

[email protected]

Twitter: @conormyhrvold

LinkedIn: www.linkedin.com/pub/conor-myhrvold/37/583/a7/


Current as of March 24, 2014 -> Typos/bugs fixed, March 29, 2014

Goal: The overall goal of this project is to use publically available information to pinpoint possible locations of flight MH370, a Malaysian Airlines Boeing 777-200ER which went missing in flight on March 8, 2014.

Let's get started. Here are the libraries we need. Note that after installing the Anaconda distribution of Python 2.7, which has IPython, you need to install seaborn and Basemap, which is part of matplotlib but is not included in the default installation. You need Basemap to make the maps -- there's really no other way around it.

In [103]:
#using Anaconda 64 bit distro of Python 2.7, Windows 7 environment (other specs will work though)
#import libraries

#numpy
import numpy as np

#scipy
import scipy as sp
from scipy import stats
from scipy.stats import norm
from scipy.stats import expon

#plotting
%matplotlib inline 
# ^^ make plots appear in IPython notebook instead of separate window
import matplotlib.pyplot as plt
import seaborn as sns
from IPython import display
from mpl_toolkits.basemap import Basemap # use for plotting lat lon & world maps

Overview - MH370

Malaysia Airlines 370 disappeared. A variety of information sources have been obtained, which can clue us into its possible locations. I have read a ton of articles and also reached out to some of the journalists (which I have a background in) and organizations involved, in an effort to get the best data possible to put into our Monte Carlo (MC) simulation. MC is an ideal approach to model where the airplane could be since unlike other air disasters, we don't know a lot about what happened yet. More than two weeks have passed and no wreckage has been found, let alone the airplane crash site or how it got there.

Getting the Data - 777 Landable Runways on Earth

Is the MH-370 on land? Let's follow the lead of WNYC and plot the runways where it could have landed. Note that we know the plane did not crash land; it would have sent a distress signal. So it either landed somewhere safely, or alternatively crashed in the water (likely the Indian Ocean).

Like WNYC did, we can download the data from XPlane, open the file in Wordpad (for Windows) and resave it as a text file with the extension .txt. (And rename it to w/e we want; in my case runways.txt .) Then, placing the file in the same directory, we can import the data into this IPython notebook and read the lines, appending them into a list to iterate through:

Alternatively you can get the exact data in the format currently written here: https://dl.dropboxusercontent.com/u/14868427/runways.txt, which you can use to easily follow along with this example in your own working IPython environment.

In [104]:
f = open('runways' + '.txt', 'r') #renamed XPlane data file -- massive list of runway properties around the world

line_list = []
for line in f.readlines():
    if '100   ' in line:
        line.rstrip('\n')
        line_list.append(line)
                
print "number of runways:", len(line_list)
number of runways: 34344

Here's an example of a runway entry we want to parse:

In [105]:
print line_list[0]
100   54.86   2   0 0.00 1 3 0 13   22.32526300  114.19222700  563.88   60.05 3  0 0 2 31   22.30395000  114.21587400  220.98    0.00 3  1 0 2


We discover that we want to grab entries 9&10, and 18&19, as those are the lat/lon coords of the starts and ends of the runways. we use those coordinates to get a runway center, which we plot, and the runway length, which we use to filter out the possible runways a Boeing 777 could land on.

An example below of a '100 filter' that won't work (so we need to filter at beginning):

In [106]:
#example of 1 that won't work ... need to filter more
print line_list[76]
14   49.02711300 -122.37642100   90 0 Abbotsford Tower


Specify, once we're in list form, that 100 has to occur at the beginning (signifying row that has runway coords) and we're good to go:

In [107]:
new_line_list = []
for line in line_list:
    if int(line[0]) == 1 and int(line[1]) == 0 and int(line[2]) == 0:
        new_line_list.append(line)

print "number of runways:", len(new_line_list)
    
number of runways: 33527

In [108]:
coord1 = []
coord2 = []
coord3 = []
coord4 = []
for line in new_line_list:
    temp = line.split()
    try:
        coord1.append(temp[9])
        coord2.append(temp[10])
        coord3.append(temp[18])
        coord4.append(temp[19])
    except:
        break

The test which detected the errant '100s' we don't want, came about by checking the length of our 4 coords.

In [109]:
#check to make sure all lengths equal

#print coord1[0:10]
print len(coord1)
print len(coord2)
print len(coord3)
print len(coord4)
33527
33527
33527
33527

In [110]:
print coord1[0:10]
print coord2[0:10]
print coord3[0:10]
print coord4[0:10]
['22.32526300', '29.88083567', '29.87181142', '-23.21480832', '-23.53246610', '-23.52883459', '29.65570843', '29.65007606', '32.77822536', '32.77741703']
['114.19222700', '-103.70108085', '-103.70136737', '-051.19038734', '133.68587381', '133.66776261', '-096.57985673', '-096.58204544', '-104.38455756', '-104.37910499']
['22.30395000', '29.86895821', '29.87798246', '-23.21352502', '-23.52755315', '-23.51384459', '29.64476323', '29.65039560', '32.78200130', '32.78280963']
['114.21587400', '-103.69317249', '-103.69288597', '-051.18127931', '133.68954508', '133.69025847', '-096.57897605', '-096.57678734', '-104.37814356', '-104.38359613']

Now append to a numpy array that will hold our airport coords:

In [111]:
coordlist = np.zeros((len(new_line_list), 4))
for i in xrange(len(coord1)):
    coordlist[i][0] = coord1[i]
    coordlist[i][1] = coord2[i]
    coordlist[i][2] = coord3[i]
    coordlist[i][3] = coord4[i]
#print len(coordlist)
print coordlist[0:10]
[[  22.325263    114.192227     22.30395     114.215874  ]
 [  29.88083567 -103.70108085   29.86895821 -103.69317249]
 [  29.87181142 -103.70136737   29.87798246 -103.69288597]
 [ -23.21480832  -51.19038734  -23.21352502  -51.18127931]
 [ -23.5324661   133.68587381  -23.52755315  133.68954508]
 [ -23.52883459  133.66776261  -23.51384459  133.69025847]
 [  29.65570843  -96.57985673   29.64476323  -96.57897605]
 [  29.65007606  -96.58204544   29.6503956   -96.57678734]
 [  32.77822536 -104.38455756   32.7820013  -104.37814356]
 [  32.77741703 -104.37910499   32.78280963 -104.38359613]]

Here's what a row looks like. Latitude is housed in rows 0 and 2; longitude in 1 and 3. They're pairs of coordinates that mark the beginning and end of the runway:

In [112]:
print coordlist[0]
[  22.325263  114.192227   22.30395   114.215874]

When I first averaged, an alarming reality surfaced; some runways had the same averaged coords! What was going on? I checked to make sure I was iteratring and adding correctly...and I was...

In [113]:
print coordlist[1][0]+coordlist[1][2], coordlist[1][1]+coordlist[1][3]
print coordlist[2][0]+coordlist[2][2], coordlist[2][1]+coordlist[2][3]

#probably are same runways, just in a criss cross format!
59.74979388 -207.39425334
59.74979388 -207.39425334

Okay, so mystery likely solved -- they're probably criss cross runways which would have same center point, at an airport. Now that we have the coordinate pairs, we average them to get the center of the runway in a numpy array:

In [114]:
lat = []
lon = []
for i in xrange(len(coordlist)):
    lat.append( (coordlist[i][0]+coordlist[i][2])/2.0 )
    lon.append( (coordlist[i][1]+coordlist[i][3])/2.0 )
    
print lat[0:10]; print lon[0:10]
[22.3146065, 29.874896939999999, 29.874896939999999, -23.214166669999997, -23.530009624999998, -23.52133959, 29.65023583, 29.65023583, 32.780113329999999, 32.780113330000006]
[114.20405049999999, -103.69712666999999, -103.69712667, -51.185833325000004, 133.687709445, 133.67901054000001, -96.579416390000006, -96.579416390000006, -104.38135056, -104.38135056]

I break up into lists for better bug detection (used to do it in one step, then had the 'same average' problem which I couldn't figure out until I split it up):

In [115]:
coordlist_average = np.zeros((len(coordlist), 2))

for i in xrange(len(lon)):
    coordlist_average[i][0] = lat[i]
    coordlist_average[i][1] = lon[i]

print coordlist_average[0:10]
print len(coordlist_average) #check to make sure they're all still there!
[[  22.3146065   114.2040505 ]
 [  29.87489694 -103.69712667]
 [  29.87489694 -103.69712667]
 [ -23.21416667  -51.18583333]
 [ -23.53000962  133.68770944]
 [ -23.52133959  133.67901054]
 [  29.65023583  -96.57941639]
 [  29.65023583  -96.57941639]
 [  32.78011333 -104.38135056]
 [  32.78011333 -104.38135056]]
33527

Right, so there are 33527 runways in the world. The next step is to calculate the length of those runways, to filter out where the 777 couldn't land. We can use a Haversine formula to get the runway distances (even though error due to curvature in this case is minimal, we can reuse this puppy later). Apparently, there's a Python module haversine which has this, but let's write this function ourselves:

In [116]:
earth_radius = 6371000 #in m, http://en.wikipedia.org/wiki/Earth

"""
Haversine equation.
Computes the great circle distance between two pairs of longitude/latitude.
Returns the distance in m or km depending on input (I use meters.) 
"""
def haversine(r,lat1,lon1,lat2,lon2):
    d = 2.0*r*np.arcsin(np.sqrt(np.sin(np.radians(lat2-lat1)/2.0)**2+np.cos(np.radians(lat1))*np.cos(np.radians(lat2))*np.sin(np.radians(lon2-lon1)/2.0)**2))
    return d   
In [117]:
#testing first data set result
haversine(earth_radius,coordlist[0][0],coordlist[0][1],coordlist[0][2],coordlist[0][3])
Out[117]:
3396.108094394885

That's an example of a runway length, in meters. That's a long runway! Now that we have established a way to get runway distance, we perform the calculation across a whole numpy array to get all of their distances, to filter out the ones that are too short to land a 777 on:

In [118]:
coordlist_len = np.zeros((len(coordlist),1))

for i in xrange(len(coordlist_len)):
    coordlist_len[i] = haversine(earth_radius,coordlist[i][0],coordlist[i][1],coordlist[i][2],coordlist[i][3])
    
print coordlist_len[0:10]
[[ 3396.10809439]
 [ 1525.02853835]
 [ 1067.51875355]
 [  941.64587632]
 [  662.21304719]
 [ 2835.27738788]
 [ 1220.0226416 ]
 [  509.35849511]
 [  732.01281279]
 [  732.01324227]]

Okay, so now we have the lengths of all the runways in the world. A 777 apparently needs at least a 5000 ft runway length to land.

In [119]:
#convert ft to m
min_run_dist = 5000* 0.3048 # 5000 ft *  0.3048 ft/m
print "The minimum runway distance to land a Boeing 777 is", min_run_dist, "meters."
The minimum runway distance to land a Boeing 777 is 1524.0 meters.

In [120]:
runway777_indices = []
for i in xrange(len(coordlist_len)):
    if coordlist_len[i] >= 1524.0 :
        runway777_indices.append(i)
print len(runway777_indices)
print runway777_indices[0:50]
7357
[0, 1, 5, 11, 23, 27, 28, 29, 31, 35, 45, 46, 50, 51, 60, 61, 64, 65, 73, 74, 77, 79, 81, 83, 87, 89, 90, 93, 96, 98, 101, 102, 104, 117, 118, 119, 125, 127, 128, 130, 131, 132, 133, 134, 137, 138, 139, 155, 158, 159]

This gives us the indices of the runways that are long enough. We've appended those indices to a list, so we can now access those same indices in where we stored our runway coordinates, to only plot runways long enough in the world, to land a 777. There are a little over 7000 of them.

In [121]:
master_777_runway_coords = coordlist[runway777_indices]
master_777_runway_avg = coordlist_average[runway777_indices]

print len(master_777_runway_coords) == len(master_777_runway_avg)
print "-----------------------"
print master_777_runway_coords
print "-----------------------"
print master_777_runway_avg
True
-----------------------
[[  22.325263    114.192227     22.30395     114.215874  ]
 [  29.88083567 -103.70108085   29.86895821 -103.69317249]
 [ -23.52883459  133.66776261  -23.51384459  133.69025847]
 ..., 
 [  37.09965032 -113.59303424   37.08153031 -113.593116  ]
 [  55.771524     12.300378     55.767022     12.347614  ]
 [  52.159333      4.404        52.173         4.432     ]]
-----------------------
[[  22.3146065   114.2040505 ]
 [  29.87489694 -103.69712667]
 [ -23.52133959  133.67901054]
 ..., 
 [  37.09059032 -113.59307512]
 [  55.769273     12.323996  ]
 [  52.1661665     4.418     ]]

Getting the Data - Flight MH370 Known Flight Coordinates + Inmarsat Coords

In [122]:
# The Inmarsat satellite is at 0,64.5 -- it's geostationary.
inmarsat_coord = [0, 64.5]

#Now we plot the plane's known positions

#Kuala Lumpur International Airport Coordinates: http://www.distancesfrom.com/my/Kuala-Lumpur-Airport-(KUL)-Malaysia-latitude-longitude-Kuala-Lumpur-Airport-(KUL)-Malaysia-latitude-/LatLongHistory/3308940.aspx
kualalumpur_coord = [2.7544829, 101.7011363]

#Pulau Perak coordinates: http://tools.wmflabs.org/geohack/geohack.php?pagename=Pulau_Perak&params=5_40_50_N_98_56_27_E_type:isle_region:MY
# http://en.wikipedia.org/wiki/Perak_Island -> Indonesia military radar detected near island
pulauperak_coord = [5.680556,98.940833]

# Igari Waypoint. Source: # http://www.fallingrain.com/waypoint/SN/IGARI.html Given in hours,minutes,seconds.
igariwaypoint_coord = [6. + 56./60. + 12./3600., 103. + 35./60. + 6./3600.] 

print "inmarsat lat/lon:", inmarsat_coord[0], inmarsat_coord[1]
print "kuala lumpur int'l airport coord lat/lon:", kualalumpur_coord[0],kualalumpur_coord[1]
print "pulau perak lat/lon:", pulauperak_coord[0],pulauperak_coord[1]
print "igari waypoint lat/lon:", igariwaypoint_coord[0],igariwaypoint_coord[1]
inmarsat lat/lon: 0 64.5
kuala lumpur int'l airport coord lat/lon: 2.7544829 101.7011363
pulau perak lat/lon: 5.680556 98.940833
igari waypoint lat/lon: 6.93666666667 103.585

Plotting the Data

Now let's plot. Recall our center-of-runway array:

In [123]:
master_777_runway_avg
Out[123]:
array([[  22.3146065 ,  114.2040505 ],
       [  29.87489694, -103.69712667],
       [ -23.52133959,  133.67901054],
       ..., 
       [  37.09059032, -113.59307512],
       [  55.769273  ,   12.323996  ],
       [  52.1661665 ,    4.418     ]])

We take that array and split up the lat and long coords for each for plotting purposes:

In [124]:
runway_lats = []
runway_lons = []
# split coordinates into list form now to follow plotting example
for i in xrange(len(master_777_runway_avg)):
    runway_lats.append(master_777_runway_avg[i][0])
    runway_lons.append(master_777_runway_avg[i][1])
In [125]:
#print type(runway_lats); print type(runway_lons)
#print runway_lats[0:10]
#print runway_lons[0:10]
In [126]:
#we similarly split up the plane & sat lat & lons

plane_lats = [2.7544829,(6.+56./60.+12./3600.),5.680556]
plane_lons = [101.7011363,(103.+35./60.+6./3600.),98.940833]

sat_lat = 0
sat_lon = 64.5

Now we'll make a map with those points on it, using Basemap, an extension of Matplotlib:

In [127]:
#Used Basemap template. Could substitute for any other number of projections, and provide specs for the lat/long displays etc.

# set figure size
fig = plt.figure(figsize=[21,15])

# setup Lambert Conformal basemap.
fig = Basemap(width=10000000,height=8000000,projection='lcc',resolution='c',lat_0=10,lon_0=90.,suppress_ticks=True)

#draw coasts
fig.drawcoastlines()

# draw boundary, fill background.
fig.drawmapboundary(fill_color='lightblue')
fig.fillcontinents(color='#FFD39B',lake_color='lightblue')

# draw parallels
parallels = np.arange(-50.,50,10.)
fig.drawparallels(np.arange(-50,50,10),labels=[1,1,0,1], fontsize=15)

# draw meridians
meridians = np.arange(50.,130.,10.)
fig.drawmeridians(np.arange(50,130,10),labels=[1,1,0,1], fontsize=15)

#translate coords into map coord system to plot

#Runway Locs
x,y = fig(runway_lons,runway_lats)

#Known 777 Locs
x2,y2 = fig(plane_lons,plane_lats)

#Inmarsat Satellite Loc
x3,y3 = fig(sat_lon,sat_lat)

# plot coords w/ filled circles
fig.plot(x,y,'ko',markersize=5,label='Landable Runways')
fig.plot(x2,y2,'bo',markersize=10,label='MH370 Flight Path')
fig.plot(x3,y3,'ro',markersize=10,label='Inmarsat 3-F1')

#draw arrows showing flight path
arrow1 = plt.arrow(x2[0],y2[0],x2[1]-x2[0],y2[1]-y2[0],linewidth=3,color='blue',linestyle='dashed',label='flight path')
arrow2 = plt.arrow(x2[1],y2[1],x2[2]-x2[1],y2[2]-y2[1],linewidth=3,color='blue',linestyle='dashed',label='flight path')

#make legend
legend = plt.legend(loc='lower center',fontsize=10,frameon=True,title='Legend',markerscale=1,prop={'size':15})
legend.get_title().set_fontsize('20')

#add title
plt.title('Landable Runways for a Boeing 777', fontsize=30)

#show below
plt.show()

We see that there are lots of runways where our missing plane could land. This map will be the basis for more complicated maps showing the plane's possible position at a given point in time, along with its flight path. The legend, by the way, is not covering any runways. The middle of the South Indian Ocean in this map is basically the middle of nowhere, but from our results this region of Ocean, especially further down, could become a very interesting area to search.

Satellite Ping Data -- the 5th Ping

Now we plot and incorporate the ping data from the satellite. Basically it's going to be within a big arc from the satellite. We actually don't know to the precision that Malayasia authorities claim to know. That's due to the uncertainty involved in inferring the geolocation from the engine ping. Really, it was never supposed to be used for location -- that data Malaysia Airlines did not subscribe for, so we are left with fairly substantial error in where this places the plane at the last ping mark. The other 4 pings would be critical, but those haven't been publically released, although we are now hearing that the ping time was not every hour but actually got longer over time, which indicates to some extent that MH370 moved away from the satellite (from a radial distance perspective) over time.

In [128]:
#Inmarsat satellite height off ground

sat_height = 42170 #m
elevation_angle = np.radians(40) #elevation angle of satellite; convert degrees to radians
earth_radius = 6371 #in km, http://en.wikipedia.org/wiki/Earth

#To get the satellite calculate we have to solve several equations which were computed on Mathematica

"""
Computes the ping arc distance from the satellite to the plane.
Returns the angle in degrees.
"""
def satellite_calc(radius,orbit,angle):
    interim = (np.sqrt(-radius**2+orbit**2*(1./np.cos(angle)**2))-orbit*np.tan(angle))/np.float(orbit+radius)
    return np.degrees(2*np.arctan(interim))

ping_arc_dist = satellite_calc(earth_radius,sat_height,elevation_angle)
print "ping arc distance in degrees:", ping_arc_dist

dist_from_sat = earth_radius*np.radians(satellite_calc(earth_radius,sat_height,elevation_angle))
print "distance from satellite", dist_from_sat
ping arc distance in degrees: 43.3540831616
distance from satellite 4820.7540969

There are two types of approximations we can make, for a given radius of the geostationary satellite or the airplane. The first is a simple circle, which is pretty good. However, Earth is not a sphere, and so at large radii this approximation is not ideal. So I adapt the simple circle to be more oblong like an ellipse. I don't include the spherical trigonometry / mathematical derivations, since they are not really in the scope of what we're trying to accomplish here in finding MH370.

In [129]:
"""
write circle function. generates x&y pairs in a circle, 360 degrees
angle in degrees, number of points to put in circle, satellite location lat/lon
"""
def make_circle1(radius,pts,lon_loc,lat_loc): 
    coords_array = np.zeros((pts, 2))
    for i in xrange(pts):
        coords_array[i][0] =  radius*np.cos(np.radians(i)) + lat_loc 
        coords_array[i][1] =  radius*np.sin(np.radians(i)) + lon_loc 
    return coords_array

"""
write ellipse function
since across the earth it's not actually a circle
derived from spherical trigonometry
"""
def make_circle(radius,pts,lon_loc,lat_loc):
    coords_array = np.zeros((pts, 2))
    for i in xrange(pts):
        coords_array[i][0] =  radius*np.cos(np.radians(i)) + lat_loc 
        #bug fix in line below -> *(np.pi/180.) needs to be moved 1 parentheses over. only has large effect at low latitudes.
        coords_array[i][1] =  radius*np.sin(np.radians(i))/(np.cos(np.cos(np.radians(radius))*(lat_loc+radius*np.cos(np.radians(i)))*(np.pi/180.))) + lon_loc 
    return coords_array

"""
write function that plots final destination from plane, from the point of the last ping.
this will be some period of time in between 0 minutes and an hour -- or else it would have pinged again.
make a point at a distance on a heading to see where the plane would be if it continued on a straight line,
from the 5th ping.
"""
def make_vector(ang_radius,heading,lon_loc,lat_loc):
    #bug fix -- lon needs to have lon offset & lat needs to have lat offset
    vec_lon = ang_radius*np.cos(np.radians(heading)) + lon_loc
    #bug fix in line below -> *(np.pi/180.) needs to be moved 1 parentheses over. only has large effect at low latitudes.
    vec_lat = ang_radius*np.sin(np.radians(heading))/(np.cos(np.cos(np.radians(ang_radius))*(lat_loc+ang_radius*np.cos(np.radians(heading)))*(np.pi/180.))) + lat_loc
    return vec_lat,vec_lon
In [130]:
#testing the two types of circles. Using input from:
#normal_prob_step(255.136, 30, 98.940833, 5.680556, 8.12971613367), from below,
#which initially did not work for the ellipse-like approximation.

test = make_circle(8.12971613367,360,98.940833,5.680556)

test_lat = []
test_lon = []
for i in xrange(len(test)):
    test_lat.append(test[i][0])
    test_lon.append(test[i][1])

test1 = make_circle1(8.12971613367,360,98.940833,5.680556)

test_lat1 = []
test_lon1 = []
for i in xrange(len(test1)):
    test_lat1.append(test1[i][0])
    test_lon1.append(test1[i][1])
#print test_lat

#create figure
plt.figure()
plt.plot(test_lon1,test_lat1,color='red',label='simple circle')
plt.plot(test_lon,test_lat,color='blue',label='ellipsoid')
legend = plt.legend(loc='lower left',fontsize=8,frameon=True,markerscale=1,prop={'size':5})
plt.title('comparing circle approximations to each other')
plt.legend()
plt.show()

Above we see the difference between the two circles. Remember that at the equator, 1 degree of latitude and longitude stay the same, but at the poles, the degrees of latitude contract tremendously. So our complicated formula does a slightly better job of incorporating this stretch of latitude as the circumference of Earth increases toward the equator.

Otherwise, what the bug did before, was effectively make it so that the airplane was going faster when it picked some directions. So instead of a constant speed in every direction, you had a constant speed for each heading but what that speed was, differed.

In [131]:
circle_pts = make_circle(ping_arc_dist,360,64.5,0)
#print circle_pts

circle_lat = []
for i in xrange(len(circle_pts)):
    circle_lat.append(circle_pts[i][0])

circle_lon = []
for i in xrange(len(circle_pts)):
    circle_lon.append(circle_pts[i][1])

print circle_lat[0:10]
print "-------------------"
print circle_lon[0:10]
[43.354083161614668, 43.347480124758938, 43.32767302554074, 43.294667897394362, 43.248474794001574, 43.189107786229151, 43.116584957842761, 43.030928399998473, 42.93216420451359, 42.820322455918792]
-------------------
[64.5, 65.387581023937116, 66.274618176340184, 67.16056873867629, 68.044892293124406, 68.927051859745077, 69.806515017937599, 70.682755007130297, 71.555251801804673, 72.423493156143252]

Now what we'll do is vary the distance from the satellite up and down by 5%, 10%, and 20% to build in error in where the plane could be. These figures are an arbitrary amount; but it shows the extent to which such a variation would effect the plane's location. And even a slight variation in the Insarmat radius can have a dramatic effect on where the plane is likely to be, 5 hours later. Of course, the error could be much more substantial as Inmarsat signal strength and signal travel time is not supposed to be used for geolocation. But then we don't have much to go on, and all evidence reported suggests that the Inmarsat data is decently reliable.

Inmarsat 20% error

In [132]:
err1_20per = 0.8*ping_arc_dist
err2_20per = 1.2*ping_arc_dist

circle_pts_err1_20per = make_circle(err1_20per,360,64.5,0)
circle_pts_err2_20per = make_circle(err2_20per,360,64.5,0)

circle_lon_err1_20per = []
for i in xrange(len(circle_pts_err1_20per)):
    circle_lon_err1_20per.append(circle_pts_err1_20per[i][1])
    
circle_lon_err2_20per = []
for i in xrange(len(circle_pts_err2_20per)):
    circle_lon_err2_20per.append(circle_pts_err2_20per[i][1])
    
circle_lat_err1_20per = []
for i in xrange(len(circle_pts_err1_20per)):
    circle_lat_err1_20per.append(circle_pts_err1_20per[i][0])
    
circle_lat_err2_20per = []
for i in xrange(len(circle_pts_err2_20per)):
    circle_lat_err2_20per.append(circle_pts_err2_20per[i][0])
In [133]:
print circle_lat_err1_20per[0:10]
print "--------------------"
print circle_lon_err1_20per[0:10]
print "--------------------"
print circle_lat_err2_20per[0:10]
print "--------------------"
print circle_lon_err2_20per[0:10]
[34.683266529291735, 34.677984099807155, 34.662138420432591, 34.635734317915485, 34.598779835201256, 34.551286228983322, 34.49326796627421, 34.424742719998783, 34.345731363610874, 34.256257964735035]
--------------------
[64.5, 65.188878981104665, 65.877377969510135, 66.565117672800966, 67.251720196246211, 67.936809734453092, 68.620013254448324, 69.300961167418293, 69.979287986414448, 70.65463296742287]
--------------------
[52.0248997939376, 52.016976149710722, 51.993207630648882, 51.953601476873231, 51.898169752801884, 51.82692934347498, 51.739901949411312, 51.63711407999817, 51.518597045416307, 51.384386947102556]
--------------------
[64.5, 65.570728584833461, 66.640789529381379, 67.709516640009824, 68.776246609632906, 69.840320444151047, 70.901084868830466, 71.957893708176215, 73.010109233052489, 74.057103469050048]

Inmarsat 10% error

In [134]:
err1_10per = 0.9*ping_arc_dist
err2_10per = 1.1*ping_arc_dist

circle_pts_err1_10per = make_circle(err1_10per,360,64.5,0)
circle_pts_err2_10per = make_circle(err2_10per,360,64.5,0)

circle_lon_err1_10per = []
for i in xrange(len(circle_pts_err1_10per)):
    circle_lon_err1_10per.append(circle_pts_err1_10per[i][1])
    
circle_lon_err2_10per = []
for i in xrange(len(circle_pts_err2_10per)):
    circle_lon_err2_10per.append(circle_pts_err2_10per[i][1])
    
circle_lat_err1_10per = []
for i in xrange(len(circle_pts_err1_10per)):
    circle_lat_err1_10per.append(circle_pts_err1_10per[i][0])
    
circle_lat_err2_10per = []
for i in xrange(len(circle_pts_err2_10per)):
    circle_lat_err2_10per.append(circle_pts_err2_10per[i][0])

Inmarsat 5% error

In [135]:
err1_5per = 0.95*ping_arc_dist
err2_5per = 1.05*ping_arc_dist

circle_pts_err1_5per = make_circle(err1_5per,360,64.5,0)
circle_pts_err2_5per = make_circle(err2_5per,360,64.5,0)

circle_lon_err1_5per = []
for i in xrange(len(circle_pts_err1_5per)):
    circle_lon_err1_5per.append(circle_pts_err1_5per[i][1])
    
circle_lon_err2_5per = []
for i in xrange(len(circle_pts_err2_5per)):
    circle_lon_err2_5per.append(circle_pts_err2_5per[i][1])
    
circle_lat_err1_5per = []
for i in xrange(len(circle_pts_err1_5per)):
    circle_lat_err1_5per.append(circle_pts_err1_5per[i][0])
    
circle_lat_err2_5per = []
for i in xrange(len(circle_pts_err2_5per)):
    circle_lat_err2_5per.append(circle_pts_err2_5per[i][0])

Now we make a map plot showing these circles. Keep in mind that the dashed red lines of Inmarsat error are the furthest away from the estimate, that the line could be with a given percentage of error. So I could shade the region in between, but I am displaying 3 error ranges so that would get to be a bit tricky. The furthest two dashed lines away from the solid red line are with the greatest error, 20%. In the middle is 10%, and closest to the solid line is the 5%.

In [160]:
#Plot the same map again ... just add the location coordinates of where it could be located

# set figure size
fig = plt.figure(figsize=[21,15])

# setup Lambert Conformal basemap.
fig = Basemap(width=10000000,height=8000000,projection='lcc',resolution='c',lat_0=5,lon_0=90.,suppress_ticks=True)

#draw coasts
fig.drawcoastlines()

# draw boundary, fill background.
fig.drawmapboundary(fill_color='lightblue')
fig.fillcontinents(color='#FFD39B',lake_color='lightblue')

# draw parallels
parallels = np.arange(-50.,50,10.)
fig.drawparallels(np.arange(-50,50,10),labels=[1,1,0,1], fontsize=15)

# draw meridians
meridians = np.arange(50.,130.,10.)
fig.drawmeridians(np.arange(50,130,10),labels=[1,1,0,1], fontsize=15)

#translate coords into map coord system to plot
#Runway Locs
x,y = fig(runway_lons,runway_lats)
#Known 777 Locs
x2,y2 = fig(plane_lons,plane_lats)
#Inmarsat Satellite Loc
x3,y3 = fig(sat_lon,sat_lat)

#Add location coords
x4,y4 = fig(99.8,6.35) #show Lankawi for electrical fire scenario

#Add circle coords -- 20% error
x5,y5 = fig(circle_lon,circle_lat)
x6,y6 = fig(circle_lon_err1_20per,circle_lat_err1_20per)
x7,y7 = fig(circle_lon_err2_20per,circle_lat_err2_20per)
#                     10% error
x8,y8 = fig(circle_lon_err1_10per,circle_lat_err1_10per)
x9,y9 = fig(circle_lon_err2_10per,circle_lat_err2_10per)
#                      5% error
x10,y10 = fig(circle_lon_err1_5per,circle_lat_err1_5per)
x11,y11 = fig(circle_lon_err2_5per,circle_lat_err2_5per)

# plot coords w/ filled circles
fig.plot(x,y,'ko',markersize=5,label='Landable Runways')
fig.plot(x2,y2,'bo',markersize=10,label='MH370 Flight Path')
fig.plot(x3,y3,'ro',markersize=10,label='Inmarsat 3-F1')
fig.plot(x4,y4,'go',markersize=10,label='Lankawi Island')

#draw circle showing extent of Inmarsat sat radar detection
fig.plot(x5,y5,'r-',markersize=5,label='Inferred MH370 Location from Satellite, 5th Ping')
fig.plot(x6,y6,'r--',markersize=5,label='with 5,10,20% error')
fig.plot(x7,y7,'r--',markersize=5)
fig.plot(x8,y8,'r--',markersize=5)
fig.plot(x9,y9,'r--',markersize=5)
fig.plot(x10,y10,'r--',markersize=5)
fig.plot(x11,y11,'r--',markersize=5)

#draw arrows showing flight path
arrow1 = plt.arrow(x2[0],y2[0],x2[1]-x2[0],y2[1]-y2[0],linewidth=3,color='blue',linestyle='dashed',label='flight path')
arrow2 = plt.arrow(x2[1],y2[1],x2[2]-x2[1],y2[2]-y2[1],linewidth=3,color='blue',linestyle='dashed',label='flight path')

#make legend
legend = plt.legend(loc='lower left',fontsize=10,frameon=True,title='Legend',markerscale=1,prop={'size':15})
legend.get_title().set_fontsize('20')

#add title
plt.title('Inmarsat Ping Estimation', fontsize=30)

#show below
plt.show()

Now we have the starting plot that we saw in so many newspaper websites all around the world in March. Lankawi Island is plotted as reference point -- in some scenarios, people believe it could have tried to head to an emergency landing there. We have gathered an impressive data set which forms the foundation of our Monte Carlo model.

Making Monte Carlo Model

Convert the Inmarsat line location and error bounds into probabilities

In [161]:
"""
ping_prob is specifically the 5th ping probability, which we have.
we center a normal probability distribution upon the location of the radius line.

d = dist_from_sat in my case and is more abstractly 'd', a distance
r = earth_radius in my case and is is more abstractly 'r', a radius
lat1 = sat latitude in my case
lon1 = sat longitude in my case
lat2, lon2 we iterate through for our function
err is the 5,10,or 20% error Inmarsat error
"""
def ping_prob(lat1,lon1,lat2,lon2,err,d,r): 
    return np.exp(-0.5*((haversine(r,lat1,lon1,lat2,lon2)-d)/(err*d))**2)/(d*np.sqrt(2*np.pi)) 
    #model a normal distribution above...manually...could also probably use:
    #stats.norm.pdf or something of that sort

Next we make a probability grid of where the plane could be located, across our latitudes and longitudes. We'll initialize this probability from the 5th ping perspective, since we've already calculated appropriate latitudes and longitudes from that.

In [162]:
#create lat/long grid of distance
lat_max = 55 #55 N
lat_min = -50 #50 S

lon_max = 130 #130 E
lon_min = 50 #50 E

lat_range = np.abs(lat_max)+np.abs(lat_min)
lon_range = np.abs(lon_max)-np.abs(lon_min)
print "latitude range in degrees:", lat_range
print "longitude range in degrees:", lon_range
latitude range in degrees: 105
longitude range in degrees: 80

In [163]:
# x coord = longitude
# y coord = latitude
mult_fac = 1 #Could adjust to get more resolution per lat/lon -- keep at 1 for now (change to 10 next version)

prob_grid = np.zeros((lat_range*mult_fac,lon_range*mult_fac)) #initialize grid as numpy array

for i in xrange(lat_range*mult_fac): #across all lat grid-lets
    for j in xrange(lon_range*mult_fac): #across all lon grid-lets (so now we're iterating through rows + columns)
        prob_grid[i][j] = ping_prob(sat_lat,sat_lon,(lat_min+i/np.float(mult_fac)),(lon_min+j/np.float(mult_fac)),0.2,dist_from_sat,earth_radius) 
        #assuming 20% error == 0.2
In [164]:
#print prob_grid.shape
#print prob_grid
#print np.sum(prob_grid)
In [165]:
plt.figure()
plt.plot(prob_grid)
plt.show()