# Where to Start Seafloor Searching, from the MH370 Markov Chain Monte Carlo (MCMC) Model¶

## Conor L. Myhrvold¶

Harvard University, SEAS, IACS , Computational Science & Engineering (CSE)

AM207 : Advanced Scientific Computing: Stochastic Optimization Methods. Monte Carlo Methods for Inference and Data Analysis

Final Project, Spring 2014. Current as of April 15.

### For articles about prior versions, see Fast Company Labs' : How I Narrowed Down The Location Of Malaysia Air Using "Monte Carlo" Data Models More About Our Methodology: Tracking MH370 With Monte Carlo Data Models This Data Model Shows MH370 Could Not Have Flown "Accidentally" To Its Destination With New Information, Our Data Models Point To Foul Play On Malaysia Air Flight 370 ¶

Contact Info:

conor.myhrvold@gmail.com

#### Let's get started. Here are the libraries we need. Note that after installing the Anaconda distribution of Python 2.7, which has IPython, you need to install seaborn and Basemap, which is part of matplotlib but is not included in the default installation. You need Basemap to make the maps -- there's really no other way around it.¶

In [43]:
#using Anaconda 64 bit distro of Python 2.7, Windows 7 environment (other specs will work though)
#import libraries

#numpy
import numpy as np

#scipy
import scipy as sp
from scipy import stats
from scipy.stats import norm
from scipy.stats import expon

#plotting
%matplotlib inline
# ^^ make plots appear in IPython notebook instead of separate window
import matplotlib.pyplot as plt
import seaborn as sns
from IPython import display
from mpl_toolkits.basemap import Basemap # use for plotting lat lon & world maps


### Functions for Making Our Monte Carlo Model¶

#### See prior versions for explanation on the setup of these. Here I list the functions we created over the course of Version 1 and Version 2 of this model. This time I'm listing the functions first, so the actual inputs are clearer to understand, in their own section. We also won't be modifying these throughout the notebook, so they'll all be here:¶

In [44]:
#Here's the ping arc distance function
"""
Computes the ping arc distance from the satellite to the plane.
Returns the angle in degrees.
"""
return np.degrees(2*np.arctan(interim))

# Here's the ellipse function (which is more realistic and favored over the simple circle function):
"""
write ellipse function
since across the earth it's not actually a circle
derived from spherical trigonometry
"""
coords_array = np.zeros((pts, 2))
for i in xrange(pts):
return coords_array

#Here's our Haversine equation formula
"""
Haversine equation.
Computes the great circle distance between two pairs of longitude/latitude.
Returns the distance in m or km depending on input (I use meters.)
"""
def haversine(r,lat1,lon1,lat2,lon2):
return d

# Convert the Inmarsat line location and error bounds into probabilities
"""
we center a normal probability distribution upon the location of the radius line.
d = a distance
lat1, lon1 is a latitude and longitude
lat2, lon2 we iterate through for our function
err is the error
"""
def ping_prob_normal(lat1,lon1,lat2,lon2,err,d,r):
return np.exp(-0.5*((haversine(r,lat1,lon1,lat2,lon2)-d)/(err*d))**2)/(d*np.sqrt(2*np.pi))

# Here's the function that picks a new location for the plane at each time step:
"""
takes in a heading, a starting location, and a std dev for picking a new heading
a std dev of 30, for example, with a normal distribution, means that -30 to 30 degrees will be
where most of the new heading draws will come out of (a std dev applies to both sides of the distribution.)
and the highest probability will be straight ahead. this is of course not including the ping probability.
"""
def normal_prob_step(old_heading, std_dev, start_lon, start_lat, ang_dist):

angle_list = range(0,360) # ; print angle_list

#we create a radius of 360 points that would be the heading for 360 possible degrees
circle = make_circle(ang_dist,len(angle_list),start_lon,start_lat)

weights = np.zeros(len(angle_list)) #make 360 array
for i in xrange(len(weights)):
weights[i] = np.exp(-(((np.float(angle_list[i])-180.)/std_dev)**2) /2.) / (std_dev*np.sqrt(2*np.pi))
#makes array of normally distributed weights as if heading was 180 degrees. Sort of a hack to make it periodic.

#Now we have to translate it back to whatever the heading should be, instead of 180 degrees
#Fortunately we can use numpy's roll. Implementing our own would be a pain.

#initialize new possible coordinates within an hr's distance, new weights for the odds, and new angles
#(depending on whether the plane would go off the grid or not)
new_circle = []
new_weights = []
new_angles = []

#make sure lat & lon are in bounds
for i in xrange(len(circle)):
if circle[i][0] >= lat_min and circle[i][0] <= lat_max and circle[i][1] >= lon_min and circle[i][1] <= lon_max:
new_circle.append(circle[i])
new_weights.append(s_weights[i])
new_angles.append(angle_list[i])

return new_circle,new_weights,new_angles

# Here's the same-heading vectors from the last ping, to whereever the plane may finally be located.
"""
write function that plots final destination from plane, from the point of the last ping.
this will be some period of time in between 0 minutes and an hour -- or else it would have pinged again.
make a point at a distance on a heading to see where the plane would be if it continued on a straight line,
from the 5th ping.
"""
return vec_lat,vec_lon

#MCMC Functions

#Best of Version 1
# Here's the main Monte Carlo function which puts it all together:
"""
a function which given a list of discrete probabilities for each destination point,
it will choose one of those points.

lon_init,lat_init -- last known point of plane in longitude and latitude
km_hop -- how far the plane went in the time interval, 1 hr. So in simplest case, the 777's cruising speed/hr.
std_dev -- the standard deviation of the heading, based on a normal distribution from the current heading (0 deg).
ping_percent_err -- what % error you assume in the Inmarsat 5th ping. either 2.5 or 5%.

"""

#initialize
plane_lat = np.zeros(5) #initialize plane location after each hop (assumed to be 1 hr for now)
plane_lon = np.zeros(5)
lat = lat_init
lon = lon_init

for i in xrange(len(plane_lat)):
#new_circle gives up possible coords for diff headings

raw_weights = np.zeros(len(new_circle))
for j in xrange(len(new_circle)):

probs = raw_weights / np.sum(raw_weights) #normalize

index = range(len(new_circle))
chosen = np.random.choice(index,size=None,p=probs)
#print "chosen",chosen

plane_lat[i],plane_lon[i] = new_circle[chosen] #update position
lat = plane_lat[i]
lon = plane_lon[i]

#at end of simulation, run the last location & heading for plane for 4 different times

new_plane_lat = np.zeros(9)
new_plane_lon = np.zeros(9)

for i in xrange(len(plane_lat)):
new_plane_lat[i] = plane_lat[i]
new_plane_lon[i] = plane_lon[i]

new_plane_lat[5] = route1[0]
new_plane_lat[6] = route2[0]
new_plane_lat[7] = route3[0]
new_plane_lat[8] = route4[0]
new_plane_lon[5] = route1[1]
new_plane_lon[6] = route2[1]
new_plane_lon[7] = route3[1]
new_plane_lon[8] = route4[1]

return new_plane_lat,new_plane_lon

#Best of Version 2
# Here's the main Monte Carlo function which puts it all together:
"""
a function which given a list of discrete probabilities for each destination point,
it will choose one of those points.

lon_init,lat_init -- last known point of plane in longitude and latitude
km_hop -- how far the plane went in the time interval, 1 hr. So in simplest case, the 777's cruising speed/hr.
std_dev -- the standard deviation of the heading, based on a normal distribution from the current heading (0 deg).
ping_percent_err -- what % error you assume in the Inmarsat 5th ping. either 2.5 or 5%.

"""

#initialize
plane_lat = np.zeros(6) #initialize plane location after each hop (assumed to be 1 hr for now)
plane_lon = np.zeros(6)
lat = lat_init
lon = lon_init

for i in xrange(len(plane_lat)):
#new_circle gives up possible coords for diff headings

raw_weights = np.zeros(len(new_circle))
for j in xrange(len(new_circle)):

probs = raw_weights / np.sum(raw_weights) #normalize

index = range(len(new_circle))
chosen = np.random.choice(index,size=None,p=probs)
#print "chosen",chosen

plane_lat[i],plane_lon[i] = new_circle[chosen] #update position
lat = plane_lat[i]
lon = plane_lon[i]

#at end of simulation, run the last location & heading for plane for 4 different times

new_plane_lat = np.zeros(10)
new_plane_lon = np.zeros(10)

for i in xrange(len(plane_lat)):
new_plane_lat[i] = plane_lat[i]
new_plane_lon[i] = plane_lon[i]

new_plane_lat[6] = route1[0]
new_plane_lat[7] = route2[0]
new_plane_lat[8] = route3[0]
new_plane_lat[9] = route4[0]
new_plane_lon[6] = route1[1]
new_plane_lon[7] = route2[1]
new_plane_lon[8] = route3[1]
new_plane_lon[9] = route4[1]

return new_plane_lat,new_plane_lon


### The Data - Flight MH370 Known Flight Coordinates + Inmarsat Coords¶

#### Here are some useful constants we need for converting purposes:¶

In [45]:
eq_deg_km = 111.32 # number of km/degree at eq Source: http://en.wikipedia.org/wiki/Decimal_degrees
earth_radius = 6371 #in km, http://en.wikipedia.org/wiki/Earth

#Inmarsat satellite information
sat_height = 42170 #Inmarsat satellite height off ground, in meters
elevation_angle = np.radians(40) #elevation angle of satellite; convert degrees to radians. Source: NYT Hong Kong Bureau


#### Here are the relevant lat/lon coordinates. We think of them as lat/lon, but note that Python wants them in lon/lat for plotting purposes!¶

In [46]:
# The Inmarsat satellite is at 0,64.5 -- it's geostationary.
inmarsat = [0, 64.5]

#Kuala Lumpur International Airport Coordinates: http://www.distancesfrom.com/my/Kuala-Lumpur-Airport-(KUL)-Malaysia-latitude-longitude-Kuala-Lumpur-Airport-(KUL)-Malaysia-latitude-/LatLongHistory/3308940.aspx
kualalumpur = [2.7544829, 101.7011363]

#Pulau Perak coordinates: http://tools.wmflabs.org/geohack/geohack.php?pagename=Pulau_Perak&params=5_40_50_N_98_56_27_E_type:isle_region:MY
# http://en.wikipedia.org/wiki/Perak_Island -> Indonesia military radar detected near island
pulauperak = [5.680556, 98.940833]

# Igari Waypoint. Source: # http://www.fallingrain.com/waypoint/SN/IGARI.html Given in hours,minutes,seconds.
igariwaypoint = [6. + 56./60. + 12./3600., 103. + 35./60. + 6./3600.]

print "inmarsat lat/lon:", inmarsat[0], inmarsat[1]
print "kuala lumpur int'l airport coord lat/lon:", kualalumpur[0],kualalumpur[1]
print "pulau perak lat/lon:", pulauperak[0],pulauperak[1]
print "igari waypoint lat/lon:", igariwaypoint[0],igariwaypoint[1]

inmarsat lat/lon: 0 64.5
kuala lumpur int'l airport coord lat/lon: 2.7544829 101.7011363
pulau perak lat/lon: 5.680556 98.940833
igari waypoint lat/lon: 6.93666666667 103.585


#### Here are the previously calculated 6 ping arcs in the second half of Version 2:¶

In [47]:
#ping arc distances as calculated in Mathematica
ping_distances = np.array([4036.99, 4194.65, 4352.32, 4509.99, 4667.65, 4825.32])
ping_times = np.array([0.9333, 1, 1, 1, 1, 1]) #make 1st hop slightly smaller owing to time difference
ping_arcs = np.array([34.8485, 36.2649, 37.6812, 39.0976, 40.5139, 41.9303, 43.3466])


#### And in Version 1, where we updated to the last ping each time:¶

In [48]:
ping_arc_dist = satellite_calc(earth_radius,sat_height,elevation_angle)
print "ping arc distance in degrees:", ping_arc_dist
print "distance from satellite", dist_from_sat

ping arc distance in degrees: 43.3540831616
distance from satellite 4820.7540969


#### Before we plot the data, let's formally define our latitude and longitude bounds up front. They correspond to an area over which MH370 could maximally fly:¶

In [49]:
#create lat/long grid of distance
lat_min = -50 #50 S
lat_max = 50  #50 N
lon_min = 50  #50 E
lon_max = 140 #130 E
lat_space = 5 #spacing for plotting latitudes and longitudes
lon_space = 5


### Plotting Satellite Ping Arcs and MH370 Known Locations¶

#### Version 2 -- All 6 Pings¶

In [50]:
#make points for 6 circles -- opt not to use for loop
ping_circle_211am = make_circle(ping_arcs[0],360,64.5,0)
ping_circle_311am = make_circle(ping_arcs[1],360,64.5,0)
ping_circle_411am = make_circle(ping_arcs[2],360,64.5,0)
ping_circle_511am = make_circle(ping_arcs[3],360,64.5,0)
ping_circle_611am = make_circle(ping_arcs[4],360,64.5,0)
ping_circle_711am = make_circle(ping_arcs[5],360,64.5,0)
ping_circle_811am = make_circle(ping_arcs[6],360,64.5,0)

#initialize lat & lon lists
circle_lon_211am = []
circle_lat_211am = []
circle_lat_311am = []
circle_lon_311am = []
circle_lat_411am = []
circle_lon_411am = []
circle_lat_511am = []
circle_lon_511am = []
circle_lat_611am = []
circle_lon_611am = []
circle_lat_711am = []
circle_lon_711am = []
circle_lat_811am = []
circle_lon_811am = []

for i in xrange(len(ping_circle_211am)): #they're all the same length so just do it once
circle_lat_211am.append(ping_circle_211am[i][0])
circle_lon_211am.append(ping_circle_211am[i][1])

circle_lat_311am.append(ping_circle_311am[i][0])
circle_lon_311am.append(ping_circle_311am[i][1])

circle_lat_411am.append(ping_circle_411am[i][0])
circle_lon_411am.append(ping_circle_411am[i][1])

circle_lat_511am.append(ping_circle_511am[i][0])
circle_lon_511am.append(ping_circle_511am[i][1])

circle_lat_611am.append(ping_circle_611am[i][0])
circle_lon_611am.append(ping_circle_611am[i][1])

circle_lat_711am.append(ping_circle_711am[i][0])
circle_lon_711am.append(ping_circle_711am[i][1])

circle_lat_811am.append(ping_circle_811am[i][0])
circle_lon_811am.append(ping_circle_811am[i][1])

In [51]:
"""
To conserve space I will write the first part of the plot as a function that I'll run, since these will remain the same.
I could pass in parameters I'd want to change, as variables as well, but it isn't necessary for my purposes.
"""
def figure_function_all_pings(fig):

#Draw coasts
fig.drawcoastlines()

#Draw boundary
fig.drawmapboundary(fill_color='lightblue')

#Fill background
fig.fillcontinents(color='white',lake_color='lightblue')

#Draw parallels
parallels = np.arange(lat_min,lat_max,lat_space)
fig.drawparallels(np.arange(lat_min,lat_max,lat_space),labels=[1,1,0,1], fontsize=15)

#Draw meridians
meridians = np.arange(lon_min,lon_max,lon_space)
fig.drawmeridians(np.arange(lon_min,lon_max,lon_space),labels=[1,1,0,1], fontsize=15)

#Translate coords into map coord system to plot

#Known 777 Locs
x,y = fig(kualalumpur[1],kualalumpur[0]) #plotted as lon,lat NOT lat,lon -- watch out!!
x2,y2 = fig(igariwaypoint[1],igariwaypoint[0])
x3,y3 = fig(pulauperak[1],pulauperak[0])

#Inmarsat Satellite Loc
x4,y4 = fig(inmarsat[1],inmarsat[0])

#Add circle coords -- these are for each ping. will not plot the 2.5 and 5% error
x5,y5 = fig(circle_lon_211am,circle_lat_211am)
x6,y6 = fig(circle_lon_311am,circle_lat_311am)
x7,y7 = fig(circle_lon_411am,circle_lat_411am)
x8,y8 = fig(circle_lon_511am,circle_lat_511am)
x9,y9 = fig(circle_lon_611am,circle_lat_611am)
x10,y10 = fig(circle_lon_711am,circle_lat_711am)
x11,y11 = fig(circle_lon_811am,circle_lat_811am)

#Draw circle showing extent of Inmarsat sat radar detection for each of the pings
fig.plot(x5,y5,'r--',markersize=5,label='1st Ping Arc')
fig.plot(x6,y6,'r-',markersize=5, label='Ping Arcs After Disappearance')
fig.plot(x7,y7,'r-',markersize=5)
fig.plot(x8,y8,'r-',markersize=5)
fig.plot(x9,y9,'r-',markersize=5)
fig.plot(x10,y10,'r-',markersize=5)
fig.plot(x11,y11,'r-',markersize=5)

# plot coords w/ filled circles
fig.plot(x,y,'bo',markersize=10,label='MH370 Flight Path')
fig.plot(x2,y2,'bo',markersize=10)
fig.plot(x3,y3,'go',markersize=10,label='MH370 Last Known Coords')
fig.plot(x4,y4,'ro',markersize=10,label='Inmarsat 3-F1')

#Draw arrows showing flight path
arrow1 = plt.arrow(x,y,x2-x,y2-y,linewidth=3,color='blue',linestyle='dashed',label='flight path')
arrow2 = plt.arrow(x2,y2,x3-x2,y3-y2,linewidth=3,color='blue',linestyle='dashed',label='flight path')

In [52]:
#Set figure size
fig = plt.figure(figsize=[30,20])

#Setup Basemap
fig = Basemap(width=10000000,height=18000000,projection='lcc',resolution='c',lat_0=10,lon_0=90,suppress_ticks=True)

#Call figure function
figure_function_all_pings(fig)

#Put other plots in here -- they don't have to be in any strict order necessarily.
# (We don't have any right now)

#Make legend
legend = plt.legend(loc='upper right',fontsize=10,frameon=True,title='Legend',markerscale=1,prop={'size':15})
legend.get_title().set_fontsize('20')

plt.title('Inmarsat Ping Estimation -- Individual Pings', fontsize=30)

#Show below
plt.show()


#### Version 1 -- last ping (at the time, what was thought to be 5th ping)¶

In [53]:
#plot the 5th ping arc

circle_pts = make_circle(ping_arc_dist,360,64.5,0)

circle_lat = []
for i in xrange(len(circle_pts)):
circle_lat.append(circle_pts[i][0])

circle_lon = []
for i in xrange(len(circle_pts)):
circle_lon.append(circle_pts[i][1])


#### Prepare to plot the 2.5% and 5% error:¶

In [54]:
# 2.5% error

err1_2per = 0.975*ping_arc_dist
err2_2per = 1.025*ping_arc_dist

circle_pts_err1_2per = make_circle(err1_2per,360,64.5,0)
circle_pts_err2_2per = make_circle(err2_2per,360,64.5,0)

circle_lon_err1_2per = []
circle_lon_err2_2per = []
circle_lat_err1_2per = []
circle_lat_err2_2per = []

for i in xrange(len(circle_pts_err1_2per)):
circle_lon_err1_2per.append(circle_pts_err1_2per[i][1])

for i in xrange(len(circle_pts_err2_2per)):
circle_lon_err2_2per.append(circle_pts_err2_2per[i][1])

for i in xrange(len(circle_pts_err1_2per)):
circle_lat_err1_2per.append(circle_pts_err1_2per[i][0])

for i in xrange(len(circle_pts_err2_2per)):
circle_lat_err2_2per.append(circle_pts_err2_2per[i][0])

# 5% error

err1_5per = 0.95*ping_arc_dist
err2_5per = 1.05*ping_arc_dist

circle_pts_err1_5per = make_circle(err1_5per,360,64.5,0)
circle_pts_err2_5per = make_circle(err2_5per,360,64.5,0)

circle_lon_err1_5per = []
circle_lon_err2_5per = []
circle_lat_err1_5per = []
circle_lat_err2_5per = []

for i in xrange(len(circle_pts_err1_5per)):
circle_lon_err1_5per.append(circle_pts_err1_5per[i][1])

for i in xrange(len(circle_pts_err2_5per)):
circle_lon_err2_5per.append(circle_pts_err2_5per[i][1])

for i in xrange(len(circle_pts_err1_5per)):
circle_lat_err1_5per.append(circle_pts_err1_5per[i][0])

for i in xrange(len(circle_pts_err2_5per)):
circle_lat_err2_5per.append(circle_pts_err2_5per[i][0])

In [211]:
"""
Version 1 figure creation -- equivalent to the above. I show the 2.5% & 5% errors.
"""
def figure_function_fifth_ping(fig):

#Draw coasts
fig.drawcoastlines()

#Draw boundary
fig.drawmapboundary(fill_color='lightblue')

#Fill background
fig.fillcontinents(color='white',lake_color='lightblue')

#Draw parallels
parallels = np.arange(lat_min,lat_max,lat_space)
fig.drawparallels(np.arange(lat_min,lat_max,lat_space),labels=[1,1,0,1], fontsize=15)

#Draw meridians
meridians = np.arange(lon_min,lon_max,lon_space)
fig.drawmeridians(np.arange(lon_min,lon_max,lon_space),labels=[1,1,0,1], fontsize=15)

#Translate coords into map coord system to plot

#Known 777 Locs
x,y = fig(kualalumpur[1],kualalumpur[0]) #plotted as lon,lat NOT lat,lon -- watch out!!
x2,y2 = fig(igariwaypoint[1],igariwaypoint[0])
x3,y3 = fig(pulauperak[1],pulauperak[0])

#Inmarsat Satellite Loc
x4,y4 = fig(inmarsat[1],inmarsat[0])

#Add circle coords -- these are for each ping. will not plot the 2.5 and 5% error
x5,y5 = fig(circle_lon,circle_lat)
x6,y6 = fig(circle_lon_err1_2per,circle_lat_err1_2per)
x7,y7 = fig(circle_lon_err2_2per,circle_lat_err2_2per)
x8,y8 = fig(circle_lon_err1_5per,circle_lat_err1_5per)
x9,y9 = fig(circle_lon_err2_5per,circle_lat_err2_5per)

#Draw circle showing extent of Inmarsat sat radar detection for each of the pings
fig.plot(x5,y5,'r-',markersize=5,label='Last Ping Arc')
fig.plot(x6,y6,'r--',markersize=5, label='2.5% and 5% Error')
fig.plot(x7,y7,'r--',markersize=5)
fig.plot(x8,y8,'r--',markersize=5)
fig.plot(x9,y9,'r--',markersize=5)

# plot coords w/ filled circles
fig.plot(x,y,'bo',markersize=10,label='MH370 Flight Path')
fig.plot(x2,y2,'bo',markersize=10)
fig.plot(x3,y3,'go',markersize=10,label='MH370 Last Known Coords')
fig.plot(x4,y4,'ro',markersize=10,label='Inmarsat 3-F1')

#Draw arrows showing flight path
arrow1 = plt.arrow(x,y,x2-x,y2-y,linewidth=3,color='blue',linestyle='dashed',label='flight path')
arrow2 = plt.arrow(x2,y2,x3-x2,y3-y2,linewidth=3,color='blue',linestyle='dashed',label='flight path')

In [212]:
#Set figure size
fig = plt.figure(figsize=[30,20])

#Setup Basemap
fig = Basemap(width=10000000,height=18000000,projection='lcc',resolution='c',lat_0=10,lon_0=90,suppress_ticks=True)

#Call figure function
figure_function_fifth_ping(fig)

#Put other plots in here -- they don't have to be in any strict order necessarily.
# (We don't have any right now)

#Make legend
legend = plt.legend(loc='upper right',fontsize=10,frameon=True,title='Legend',markerscale=1,prop={'size':15})
legend.get_title().set_fontsize('20')