%load_ext ipycache
%matplotlib inline
In this first step towards our map, we acquire data from OpenStreetMap for Singapore, filter and clean it using GeoPandas and the fuzzywuzzy library.
Recipes contained in this notebook:
- geopandas_osm
- fuzzywuzzy to clean data

Metro Extracts is a service by MapZen that packages up OSM data from key cities around the world for convenient download, updated weekly.
# Download the Singapore IMPOSM GeoJSON zipfile
!wget https://s3.amazonaws.com/metro-extracts.mapzen.com/singapore.imposm-geojson.zip
--2015-03-20 22:15:01-- https://s3.amazonaws.com/metro-extracts.mapzen.com/singapore.imposm-geojson.zip
Resolving s3.amazonaws.com (s3.amazonaws.com)... 54.231.10.80
Connecting to s3.amazonaws.com (s3.amazonaws.com)|54.231.10.80|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 17780196 (17M) [binary/octet-stream]
Saving to: ‘singapore.imposm-geojson.zip’
100%[======================================>] 17,780,196 306KB/s in 71s
2015-03-20 22:16:12 (243 KB/s) - ‘singapore.imposm-geojson.zip’ saved [17780196/17780196]
# Unzip the file
!unzip singapore.imposm-geojson.zip
Archive: singapore.imposm-geojson.zip
inflating: singapore-admin.geojson
inflating: singapore-aeroways.geojson
inflating: singapore-amenities.geojson
inflating: singapore-buildings.geojson
inflating: singapore-landusages.geojson
inflating: singapore-landusages_gen0.geojson
inflating: singapore-landusages_gen1.geojson
inflating: singapore-places.geojson
inflating: singapore-roads.geojson
inflating: singapore-roads_gen0.geojson
inflating: singapore-roads_gen1.geojson
inflating: singapore-transport_areas.geojson
inflating: singapore-transport_points.geojson
inflating: singapore-waterareas.geojson
inflating: singapore-waterareas_gen0.geojson
inflating: singapore-waterareas_gen1.geojson
inflating: singapore-waterways.geojson
The relevant files for us are `singapore-roads.geojson` and `singapore-admin.geojson`, which gives us administrative boundaries.
We know we'll have to do some filtering with `singapore-roads.geojson`, so let's open it up in Pandas...
import pandas as pd
df = pd.read_json('singapore-roads.geojson')
--------------------------------------------------------------------------- ValueError Traceback (most recent call last) <ipython-input-14-38a4a0a8709c> in <module>() ----> 1 df = pd.read_json('singapore-roads.geojson') /home/michelle/.virtualenvs/sgmap/local/lib/python2.7/site-packages/pandas/io/json.pyc in read_json(path_or_buf, orient, typ, dtype, convert_axes, convert_dates, keep_default_dates, numpy, precise_float, date_unit) 196 obj = FrameParser(json, orient, dtype, convert_axes, convert_dates, 197 keep_default_dates, numpy, precise_float, --> 198 date_unit).parse() 199 200 if typ == 'series' or obj is None: /home/michelle/.virtualenvs/sgmap/local/lib/python2.7/site-packages/pandas/io/json.pyc in parse(self) 264 265 else: --> 266 self._parse_no_numpy() 267 268 if self.obj is None: /home/michelle/.virtualenvs/sgmap/local/lib/python2.7/site-packages/pandas/io/json.pyc in _parse_no_numpy(self) 481 if orient == "columns": 482 self.obj = DataFrame( --> 483 loads(json, precise_float=self.precise_float), dtype=None) 484 elif orient == "split": 485 decoded = dict((str(k), v) /home/michelle/.virtualenvs/sgmap/local/lib/python2.7/site-packages/pandas/core/frame.pyc in __init__(self, data, index, columns, dtype, copy) 206 dtype=dtype, copy=copy) 207 elif isinstance(data, dict): --> 208 mgr = self._init_dict(data, index, columns, dtype=dtype) 209 elif isinstance(data, ma.MaskedArray): 210 import numpy.ma.mrecords as mrecords /home/michelle/.virtualenvs/sgmap/local/lib/python2.7/site-packages/pandas/core/frame.pyc in _init_dict(self, data, index, columns, dtype) 334 335 return _arrays_to_mgr(arrays, data_names, index, columns, --> 336 dtype=dtype) 337 338 def _init_ndarray(self, values, index, columns, dtype=None, /home/michelle/.virtualenvs/sgmap/local/lib/python2.7/site-packages/pandas/core/frame.pyc in _arrays_to_mgr(arrays, arr_names, index, columns, dtype) 4615 # figure out the index, if necessary 4616 if index is None: -> 4617 index = extract_index(arrays) 4618 
else: 4619 index = _ensure_index(index) /home/michelle/.virtualenvs/sgmap/local/lib/python2.7/site-packages/pandas/core/frame.pyc in extract_index(data) 4666 4667 if have_dicts: -> 4668 raise ValueError('Mixing dicts with non-Series may lead to ' 4669 'ambiguous ordering.') 4670 ValueError: Mixing dicts with non-Series may lead to ambiguous ordering.
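Aw, shucks... Pandas can't open GeoJSON files. `read_json` expects JSON that maps directly onto rows and columns, whereas a GeoJSON file is a nested `FeatureCollection` whose top-level values are a mix of strings, lists and dicts. That mix is most likely what triggers the "Mixing dicts with non-Series" error above.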
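To see why a table reader struggles here, it helps to look at how a GeoJSON file is laid out. The sketch below uses only the standard `json` module on a tiny hand-written FeatureCollection. The two features and the `crs` member are made up for illustration, and the real Metro Extracts file may be organised slightly differently. The point is that the tabular data sits one level down, inside `features`, rather than at the top level.

```python
import json

# A tiny, hand-written GeoJSON document (illustrative, not the real extract).
geojson_text = json.dumps({
    "type": "FeatureCollection",
    "crs": {"type": "name", "properties": {"name": "EPSG:4326"}},
    "features": [
        {"type": "Feature",
         "properties": {"name": "Orchard Road", "type": "primary"},
         "geometry": {"type": "LineString",
                      "coordinates": [[103.8284, 1.3069], [103.8287, 1.3067]]}},
        {"type": "Feature",
         "properties": {"name": "Unity Street", "type": "residential"},
         "geometry": {"type": "LineString",
                      "coordinates": [[103.8428, 1.2916], [103.8422, 1.2914]]}},
    ],
})

collection = json.loads(geojson_text)

# The top level mixes a string, a dict and a list: not a table.
top_level_kinds = {key: type(value).__name__ for key, value in collection.items()}
print(top_level_kinds)

# The rows live inside "features"; each feature's "properties" holds the columns.
rows = [dict(feature["properties"], geometry_type=feature["geometry"]["type"])
        for feature in collection["features"]]
for row in rows:
    print(row)
```

A GeoJSON-aware reader does this unpacking for us and also turns each `geometry` member into a proper geometry object.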
Luckily, GeoPandas comes to the rescue!
GeoPandas is an extension of Pandas that adds geographic capabilities. We'll see in a bit some of the awesome geographic manipulation we can do. But for now, it's sort of a relief that we can open GeoJSON files!
import geopandas as gpd
df = gpd.read_file("singapore-roads.geojson")
df
| | access | bridge | class | geometry | id | name | oneway | osm_id | ref | service | tunnel | type | z_order |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | None | 0 | highway | LINESTRING (103.8284047430461 1.3068665711867,... | 0 | Orchard Road | 1 | 4386520 | None | None | 0 | primary | 6 |
1 | None | 0 | highway | LINESTRING (103.8433569686602 1.28958115900517... | 1 | Merchant Loop | 1 | 9590308 | None | None | 0 | residential | 3 |
2 | None | 0 | highway | LINESTRING (103.8412715511506 1.28860734949448... | 2 | Clemenceau Avenue | 1 | 9590470 | None | None | 0 | primary | 6 |
3 | None | 0 | highway | LINESTRING (103.8420525768883 1.29002322057853... | 3 | Merchant Road | 1 | 9590561 | None | None | 0 | secondary | 5 |
4 | None | 0 | highway | LINESTRING (103.8447162618978 1.28850685047542... | 4 | Read Cresent | 1 | 9590577 | None | None | 0 | residential | 3 |
5 | None | 0 | highway | LINESTRING (103.8513752648741 1.39678611968956... | 5 | Tampines Expressway | 1 | 14058412 | TPE | None | 0 | motorway | 9 |
6 | None | 0 | highway | LINESTRING (103.8579654524252 1.39413458843965... | 6 | Seletar Expressway | 1 | 14061945 | SLE | None | 0 | motorway | 9 |
7 | None | 0 | highway | LINESTRING (103.8443906249596 1.31096833932369... | 7 | Central Expressway | 1 | 14295878 | CTE | None | 0 | motorway | 9 |
8 | None | 1 | highway | LINESTRING (103.8664486928081 1.29436714189819... | 8 | ECP Rochor Road Exit | 1 | 14458223 | None | None | 0 | motorway_link | 23 |
9 | None | 0 | highway | LINESTRING (103.8021676259133 1.27279203931056... | 9 | Telok Blangah Road | 1 | 14458224 | None | None | 0 | primary | 6 |
10 | None | 1 | highway | LINESTRING (103.6372540162506 1.34803621646112... | 10 | Ayer Rajah Expressway | 1 | 15004371 | AYE | None | 0 | motorway | 29 |
11 | None | 1 | highway | LINESTRING (103.8341078737844 1.27236565189612... | 11 | Ayer Rajah Expressway | 1 | 15005218 | AYE | None | 0 | motorway | 29 |
12 | None | 0 | highway | LINESTRING (103.8541000539577 1.39778775711882... | 12 | Seletar Expressway | 1 | 15092487 | SLE | None | 0 | motorway | 9 |
13 | None | 0 | highway | LINESTRING (103.8610271940164 1.40078252730367... | 13 | None | 1 | 15092593 | None | None | 0 | motorway_link | 3 |
14 | None | 0 | highway | LINESTRING (103.8541408738262 1.39762062196953... | 14 | Seletar Expressway | 1 | 15092594 | SLE | None | 0 | motorway | 9 |
15 | None | 0 | highway | LINESTRING (103.8408083671813 1.31467506836410... | 15 | None | 0 | 9584847 | None | None | 0 | footway | 0 |
16 | None | 0 | highway | LINESTRING (103.8578884227351 1.39304544394129... | 16 | None | 1 | 15092850 | None | None | 0 | motorway_link | 3 |
17 | None | 0 | highway | LINESTRING (103.8387570640176 1.31267631973334... | 17 | Scotts Road | 1 | 8096835 | None | None | 0 | primary | 6 |
18 | None | 0 | highway | LINESTRING (103.8392504228384 1.31344946648208... | 18 | Newton Road | 1 | 9585045 | None | None | 0 | primary | 6 |
19 | None | 0 | highway | LINESTRING (103.8373030552741 1.31470365065393... | 19 | Sarkies Road | 0 | 9585074 | None | None | 0 | residential | 3 |
20 | None | 0 | highway | LINESTRING (103.842835949559 1.291595917070989... | 20 | Unity Street | 0 | 9588130 | None | None | 0 | residential | 3 |
21 | None | 0 | highway | LINESTRING (103.8308129476469 1.30390055092971... | 21 | Paterson Hill | 1 | 9586040 | None | None | 0 | primary | 6 |
22 | None | 0 | highway | LINESTRING (103.8318304268731 1.30502766544945... | 22 | Patterson Road | 1 | 9585621 | None | None | 0 | primary | 6 |
23 | None | 0 | highway | LINESTRING (103.7749761452852 1.38575318818034... | 23 | Kranji Expressway | 1 | 15773693 | KJE | None | 0 | motorway_link | 3 |
24 | None | 0 | highway | LINESTRING (103.7114082945315 1.36711217079856... | 24 | None | 1 | 15773695 | None | None | 0 | motorway_link | 3 |
25 | None | 0 | highway | LINESTRING (103.8227325415305 1.27891518721688... | 25 | Central Expressway | 1 | 15774741 | CTE | None | 0 | motorway | 9 |
26 | None | 0 | highway | LINESTRING (103.840308386657 1.287336988249505... | 26 | Central Expressway | 1 | 15775018 | CTE | None | 1 | motorway | -11 |
27 | None | 0 | highway | LINESTRING (103.6635033709619 1.32316124241309... | 27 | Upper Jurong Road | 1 | 15775493 | None | None | 0 | motorway_link | 3 |
28 | None | 0 | highway | LINESTRING (103.8445550778999 1.31091209675340... | 28 | Central Expressway | 1 | 15775020 | CTE | None | 1 | motorway | -11 |
29 | None | 0 | highway | LINESTRING (103.8858461769513 1.35177773039972... | 29 | Hougang Ave 1 | 0 | 4887867 | None | None | 0 | residential | 3 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
62515 | None | 0 | highway | LINESTRING (103.8447872566178 1.34067682165575... | 62515 | None | 0 | 330213797 | None | parking_aisle | 0 | service | 0 |
62516 | None | 0 | highway | LINESTRING (103.8529749510948 1.34078017052188... | 62516 | None | 0 | 330213798 | None | parking_aisle | 0 | service | 0 |
62517 | None | 0 | highway | LINESTRING (103.8468275794882 1.33968608070064... | 62517 | None | 0 | 330213799 | None | driveway | 0 | service | 0 |
62518 | None | 0 | highway | LINESTRING (103.8468381406862 1.33498802397186... | 62518 | None | 0 | 330213801 | None | driveway | 0 | service | 0 |
62519 | None | 0 | highway | LINESTRING (103.8448426609977 1.34049996349880... | 62519 | None | 0 | 330213802 | None | driveway | 0 | service | 0 |
62520 | None | 0 | highway | LINESTRING (103.8517760874838 1.33962623391200... | 62520 | Lorong 6 Toa Payoh | 1 | 330213981 | None | None | 0 | secondary_link | 3 |
62521 | None | 0 | highway | LINESTRING (103.8549077341475 1.33705198380933... | 62521 | None | 0 | 330213781 | None | None | 0 | steps | 0 |
62522 | None | 0 | highway | LINESTRING (103.8467848317821 1.34170737665094... | 62522 | None | 0 | 330213766 | None | driveway | 0 | service | 0 |
62523 | None | 0 | highway | LINESTRING (103.846060970624 1.339905183649603... | 62523 | None | 0 | 330213782 | None | driveway | 0 | service | 0 |
62524 | None | 0 | highway | LINESTRING (103.8472062738736 1.33912952232991... | 62524 | None | 0 | 330213780 | None | driveway | 0 | service | 0 |
62525 | private | 0 | highway | LINESTRING (103.8568858632965 1.32973474997694... | 62525 | None | 0 | 330069948 | None | driveway | 0 | service | 0 |
62526 | None | 0 | highway | LINESTRING (103.8544328993328 1.34317957412434... | 62526 | None | 0 | 330213785 | None | driveway | 0 | service | 0 |
62527 | None | 0 | highway | LINESTRING (103.8468528928358 1.34090095374661... | 62527 | None | 0 | 330213778 | None | driveway | 0 | service | 0 |
62528 | private | 0 | highway | LINESTRING (103.8595746940155 1.34965694125876... | 62528 | None | 1 | 330785634 | None | driveway | 0 | service | 0 |
62529 | None | 0 | highway | LINESTRING (103.8546323886283 1.33303914766501... | 62529 | None | 0 | 330069911 | None | parking_aisle | 0 | service | 0 |
62530 | None | 0 | highway | LINESTRING (103.8442157784594 1.34044011671013... | 62530 | None | 0 | 330213803 | None | driveway | 0 | service | 0 |
62531 | None | 0 | highway | LINESTRING (103.77509433012 1.331970119734268,... | 62531 | None | 0 | 331376625 | None | None | 0 | footway | 0 |
62532 | private | 0 | highway | LINESTRING (103.8146235531243 1.35609767947631... | 62532 | None | 0 | 330872771 | None | None | 0 | service | 0 |
62533 | None | 0 | highway | LINESTRING (103.7916909174987 1.29639941814166... | 62533 | None | 0 | 330230340 | None | None | 0 | tertiary | 4 |
62534 | private | 0 | highway | LINESTRING (103.8600742554446 1.34619026992523... | 62534 | None | 1 | 330785637 | None | parking_aisle | 0 | service | 0 |
62535 | None | 0 | highway | LINESTRING (103.7745914997486 1.33213943417838... | 62535 | None | 0 | 331377095 | None | None | 0 | steps | 0 |
62536 | None | 0 | highway | LINESTRING (103.775110255736 1.332153348137646... | 62536 | None | 0 | 331376626 | None | None | 0 | footway | 0 |
62537 | None | 0 | highway | LINESTRING (103.7747972754715 1.33181027684075... | 62537 | None | 0 | 331377098 | None | None | 0 | footway | 0 |
62538 | None | 0 | highway | LINESTRING (103.7750795779704 1.33226591709727... | 62538 | None | 0 | 331376624 | None | None | 0 | steps | 0 |
62539 | None | 0 | highway | LINESTRING (103.8567095918727 1.34121167089725... | 62539 | None | 0 | 330213800 | None | parking_aisle | 0 | service | 0 |
62540 | private | 0 | highway | LINESTRING (103.8601053523054 1.34619722690486... | 62540 | None | 0 | 330785952 | None | driveway | 0 | service | 0 |
62541 | None | 0 | highway | LINESTRING (103.7747296335129 1.33193634066448... | 62541 | None | 0 | 331377096 | None | None | 0 | footway | 0 |
62542 | None | 0 | highway | LINESTRING (103.7747785838274 1.33163434069313... | 62542 | None | 0 | 331377097 | None | None | 0 | footway | 0 |
62543 | private | 0 | highway | LINESTRING (103.8595054594953 1.35042723816041... | 62543 | None | 0 | 330785633 | None | driveway | 0 | service | 0 |
62544 | None | 0 | highway | LINESTRING (103.8552120810517 1.33421789470830... | 62544 | None | 0 | 330069915 | None | driveway | 0 | service | 0 |
62545 rows × 13 columns
What's more, we can do geo stuff really easily, like plotting the contents of this GeoDataFrame.
df.plot()
<matplotlib.axes._subplots.AxesSubplot at 0x7f96aac7fa90>
Well, that was easy! But plotting this has shown us that there's too much data in here. Not only is Singapore (the dense diamond-shaped island) included in this set, there are also roads from Malaysia to the north and Indonesia to the south. We'll have to filter them out. To do that we'll need the data in the `singapore-admin.geojson` file.
admin = gpd.read_file('singapore-admin.geojson')
admin
| | admin_leve | geometry | id | name | osm_id | type |
---|---|---|---|---|---|---|
0 | 2 | POLYGON ((103.5682607272107 1.276999922341711,... | 0 | Singapura | -536780 | administrative |
1 | 6 | POLYGON ((103.75383539292 1.394411191244401, 1... | 1 | North West Community Development Council | -3831714 | administrative |
2 | 6 | POLYGON ((103.5682607272107 1.276999922341711,... | 2 | South West Community Development Council | -3831716 | administrative |
3 | 6 | POLYGON ((103.8567891361339 1.355489991496248,... | 3 | South East Community Development Council | -3831715 | administrative |
4 | 6 | POLYGON ((103.8733000604621 1.362206494328233,... | 4 | North East Community Development Council | -3831713 | administrative |
5 | 6 | POLYGON ((103.7926486337553 1.299063941341472,... | 5 | Central Singapore Community Development Council | -3831712 | administrative |
6 | 10 | POLYGON ((103.6607330681441 1.548248032044329,... | 6 | Taman Impian Emas | 232080984 | administrative |
7 | NaN | POLYGON ((103.915249893284 1.385568199577297, ... | 7 | Pasir Ris | 244243731 | administrative |
8 | NaN | POLYGON ((103.9200380554719 1.374212145700731,... | 8 | Tampines | 244243399 | administrative |
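# The first row (Singapura, admin level 2) is the boundary we want
# (.loc replaces the older .ix indexer, which newer pandas versions no longer provide)
sg_boundary = admin.loc[0].geometry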
sg_boundary
We filter to the roads that lie within this Singapore boundary. The `within` function is from the `shapely` geographic manipulation library, whose capabilities are exposed in an intuitive way by `geopandas`.
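As a toy illustration of what `within` tests, the sketch below calls shapely directly on a made-up square "boundary" and two made-up "roads" (none of these coordinates come from the Singapore data). A geometry only counts as within the boundary if it lies entirely inside it, so a road that crosses the border is excluded.

```python
from shapely.geometry import LineString, Polygon

# A made-up square boundary and two made-up roads.
boundary = Polygon([(0, 0), (10, 0), (10, 10), (0, 10)])
inside_road = LineString([(1, 1), (5, 5)])
crossing_road = LineString([(8, 8), (12, 12)])

# within: the whole line must lie inside the polygon.
inside_result = inside_road.within(boundary)
crossing_result = crossing_road.within(boundary)

# intersects: it is enough for the line to share any point with the polygon.
crossing_intersects = crossing_road.intersects(boundary)

print(inside_result, crossing_result, crossing_intersects)
```

If roads that straddle the border should be kept, `intersects` is the looser test to reach for.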
sg_roads = df[df.geometry.within(sg_boundary)]
sg_roads.plot()
<matplotlib.axes._subplots.AxesSubplot at 0x7f0e0c22d410>
That looks a lot more like it! We can also take a look at the relative sizes of the dataframes:
print(df.shape)
sg_roads.shape
(62545, 13)
(29451, 13)
We can also do standard `pandas` style filtering. For example, we can extract just the "highways" - meaning roads - from this dataframe with the following line of code.
sg_roads = sg_roads[sg_roads["class"] == 'highway']
sg_roads.shape
(28810, 13)
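For readers newer to pandas, this kind of filtering works in two steps: the comparison produces a boolean Series (a "mask"), and indexing the frame with that mask keeps only the rows where it is `True`. A minimal sketch on a made-up three-row frame (the values are illustrative, not taken from the extract):

```python
import pandas as pd

# Made-up rows standing in for the roads table.
toy = pd.DataFrame({
    "name": ["Orchard Road", "Example Rail Line", "Scotts Road"],
    "class": ["highway", "railway", "highway"],
})

# Step 1: the comparison yields one True/False value per row.
mask = toy["class"] == "highway"

# Step 2: indexing with the mask keeps only the True rows.
roads_only = toy[mask]

print(mask.tolist())
print(roads_only)
```

Note the bracket notation: `class` is a reserved word in Python, so attribute access such as `toy.class` would be a syntax error.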
We'll do a bit more filtering on the output presently, but first let's look at the second way of obtaining OSM data, which will bring us to the same point with very little code.
geopandas_osm
This second method uses a library called `geopandas_osm`, written by Jake Wasserman. It will eventually be incorporated into `geopandas.io`, but for now we use it as a stand-alone library. You can find the code here. It queries OpenStreetMap directly through their Overpass API.
It's more convenient as we don't need to download and unpack the data from Metro Extracts every time. Also, the Overpass API is updated roughly once a day, so if you find yourself making edits to OpenStreetMap directly that you want to see reflected in the data you download, it's a much shorter wait. Note, however, that the API does take quite some time to return results!
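For a sense of what such a request looks like, the sketch below builds the kind of Overpass QL query that asks for all ways tagged `highway` inside a bounding box, plus the nodes they reference. This is an assumption about the general shape of such a query, not a transcript of what `geopandas_osm` actually sends. The helper name `build_highway_query` and the rough bounding box are made up for illustration.

```python
def build_highway_query(south, west, north, east, timeout=180):
    """Return an Overpass QL query for highway ways in a bounding box.

    Overpass expects bounding boxes in (south, west, north, east) order.
    This helper is hypothetical and only builds the query string.
    """
    bbox = "{},{},{},{}".format(south, west, north, east)
    return (
        "[out:json][timeout:{}];".format(timeout)
        + 'way["highway"]({});'.format(bbox)
        + "(._;>;);"   # recurse down: also fetch the nodes of each way
        + "out body;"
    )

# Rough, hand-picked bounding box around Singapore (illustrative only).
query = build_highway_query(1.16, 103.59, 1.48, 104.10)
print(query)
```

A query like this would be sent to an Overpass endpoint. Large areas can take a while to come back, which is why the cell below caches its results.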
%%cache sgosm.pkl df2 sg_roads2
import geopandas_osm.osm
# Query for the highways within the `sg_boundary` we obtained earlier from `admin`.
df2 = geopandas_osm.osm.query_osm('way', sg_boundary, recurse='down', tags='highway')
# This gives us lots of columns we don't need, so we'll isolate it to the three we do need
sg_roads2 = df2[df2.type == 'LineString'][['highway', 'name', 'geometry']]
# display the GeoDataFrame that results
sg_roads2
| | highway | name | geometry |
---|---|---|---|
0 | primary | Orchard Road | LINESTRING (103.8284048 1.3068666, 103.8287382... |
1 | residential | Hougang Ave 1 | LINESTRING (103.8858462 1.3517778, 103.8859356... |
2 | primary | Scotts Road | LINESTRING (103.8387571 1.3126764, 103.83872 1... |
3 | tertiary | Keng Lee Road | LINESTRING (103.8395387 1.3132203, 103.8396493... |
4 | footway | NaN | LINESTRING (103.8408084 1.3146751, 103.8412438... |
5 | primary | Newton Road | LINESTRING (103.8392505 1.3134495, 103.8394598... |
6 | residential | Sarkies Road | LINESTRING (103.8373031 1.3147037, 103.8359738... |
7 | primary | Patterson Road | LINESTRING (103.8318305 1.3050277, 103.8315539... |
8 | secondary | Orchard Boulevard | LINESTRING (103.8348293 1.3004045, 103.8342528... |
9 | secondary | Grange Road | LINESTRING (103.8348293 1.3004045, 103.8346883... |
10 | primary | Paterson Hill | LINESTRING (103.830813 1.3039006, 103.8302888 ... |
11 | primary | River Valley Road | LINESTRING (103.8313547 1.2960058, 103.83227 1... |
12 | residential | Unity Street | LINESTRING (103.842836 1.291596, 103.8422129 1... |
13 | residential | Merbau Road | LINESTRING (103.8430392 1.2934171, 103.842244 ... |
14 | residential | Mohamed Sultan Road | LINESTRING (103.8387207 1.2909753, 103.8388778... |
15 | residential | Saiboo Street | LINESTRING (103.8387207 1.2909753, 103.8386978... |
16 | residential | Merchant Loop | LINESTRING (103.843357 1.2895812, 103.8436285 ... |
17 | primary | Clemenceau Avenue | LINESTRING (103.8412716 1.2886074, 103.8414779... |
18 | secondary | Merchant Road | LINESTRING (103.8420526 1.2900233, 103.8421061... |
19 | residential | Read Cresent | LINESTRING (103.8447163 1.2885069, 103.8448984... |
20 | motorway | Tampines Expressway | LINESTRING (103.8513753 1.3967862, 103.8518491... |
21 | motorway | Seletar Expressway | LINESTRING (103.8579655 1.3941346, 103.8580422... |
22 | motorway | Central Expressway | LINESTRING (103.8443907 1.3109684, 103.8445533... |
23 | motorway_link | ECP Rochor Road Exit | LINESTRING (103.8664487 1.2943672, 103.8653088... |
24 | primary | Telok Blangah Road | LINESTRING (103.8021677 1.2727921, 103.8022844... |
25 | motorway | Ayer Rajah Expressway | LINESTRING (103.6372541 1.3480363, 103.6373882... |
26 | motorway | Ayer Rajah Expressway | LINESTRING (103.8341079 1.2723657, 103.8361228... |
27 | motorway | Seletar Expressway | LINESTRING (103.8541001 1.3977878, 103.8545108... |
28 | motorway_link | NaN | LINESTRING (103.8610272 1.4007826, 103.8599639... |
29 | motorway | Seletar Expressway | LINESTRING (103.8541409 1.3976207, 103.8537589... |
... | ... | ... | ... |
29039 | service | NaN | LINESTRING (103.8962452 1.308929, 103.8962545 ... |
29040 | service | NaN | LINESTRING (103.8963869 1.3154738, 103.8964208... |
29041 | service | NaN | LINESTRING (103.8948332 1.3162965, 103.8949344... |
29042 | service | NaN | LINESTRING (103.9243887 1.3058938, 103.9244309... |
29043 | service | NaN | LINESTRING (103.9239223 1.3058523, 103.923993 ... |
29044 | service | NaN | LINESTRING (103.8982788 1.3119643, 103.8981389... |
29045 | service | NaN | LINESTRING (103.9230115 1.3075672, 103.9231694... |
29046 | service | NaN | LINESTRING (103.8950937 1.3163567, 103.8951445... |
29047 | service | NaN | LINESTRING (103.9218713 1.3068476, 103.9219578... |
29048 | service | NaN | LINESTRING (103.8958409 1.3167593, 103.8956646... |
29049 | service | NaN | LINESTRING (103.9259704 1.3064983, 103.9258357... |
29050 | service | NaN | LINESTRING (103.9255259 1.3062226, 103.9254309... |
29051 | service | NaN | LINESTRING (103.8956646 1.3167249, 103.8956338... |
29052 | service | NaN | LINESTRING (103.9261383 1.3067769, 103.926265 ... |
29053 | service | NaN | LINESTRING (103.9258459 1.3083791, 103.925754 ... |
29054 | service | NaN | LINESTRING (103.8971049 1.3127559, 103.8970379... |
29055 | service | NaN | LINESTRING (103.896544 1.3139329, 103.8966131 ... |
29056 | service | NaN | LINESTRING (103.8965426 1.3119683, 103.8968468... |
29057 | residential | Marine Drive | LINESTRING (103.9098645 1.3024331, 103.909354 ... |
29058 | service | NaN | LINESTRING (103.9087401 1.301585, 103.9089982 ... |
29059 | service | NaN | LINESTRING (103.9083074 1.3029351, 103.9089768... |
29060 | service | NaN | LINESTRING (103.9080775 1.3032895, 103.9088848... |
29061 | service | NaN | LINESTRING (103.9077598 1.303779, 103.9073851 ... |
29062 | service | NaN | LINESTRING (103.9071591 1.3023186, 103.9067438... |
29063 | service | NaN | LINESTRING (103.9056312 1.3028917, 103.9058463... |
29064 | service | NaN | LINESTRING (103.9061311 1.3012161, 103.9062264... |
29065 | service | NaN | LINESTRING (103.9079658 1.3034616, 103.9078315... |
29066 | service | NaN | LINESTRING (103.9089611 1.3019279, 103.9083074... |
29067 | path | NaN | LINESTRING (103.7561811 1.4075004, 103.7560779... |
29068 | path | NaN | LINESTRING (103.7560785 1.4074843, 103.7561811... |
29069 rows × 3 columns
[Saved variables df2, sg_roads2 to file '/home/michelle/Dropbox/Repositories/public-facing/SingaporeRoadnameOrigins/notebooks/sgosm.pkl'.]
There's a slight discrepancy in the number of roads, which could be due to changes in the OpenStreetMap database since the Metro Extract was extracted. In any case, we've arrived at roughly the same point. Let's carry on with the cleaning, with this GeoDataFrame.
Since we're doing name classification of roads, we can drop the roads with no names in the database.
named_roads = sg_roads2[sg_roads2.name.notnull()]
named_roads.shape
(11999, 3)
That cut down on a lot of roads! Let's also make sure that we're keeping only actual roads, not footpaths and the like:
named_roads.highway.value_counts()
residential 4716
primary 2571
secondary 1259
motorway 585
tertiary 571
unclassified 564
service 495
footway 225
cycleway 221
motorway_link 217
trunk 205
primary_link 104
pedestrian 55
track 37
construction 34
trunk_link 34
path 34
steps 27
secondary_link 19
proposed 8
tertiary_link 8
raceway 7
living_street 2
rest_area 1
dtype: int64
%%cache filtered_roads.pkl filtered_roads
accepted_road_types = [u'motorway', u'primary',
u'primary_link', u'residential', u'road',
u'secondary', u'secondary_link', u'tertiary',
u'tertiary_link', u'trunk', u'trunk_link', u'unclassified']
filtered_roads = named_roads[named_roads.highway.isin(accepted_road_types)]
filtered_roads.shape
[Skipped the cell's code and loaded variables filtered_roads from file '/home/michelle/Dropbox/Repositories/public-facing/SingaporeRoadnameOrigins/notebooks/filtered_roads.pkl'.]
filtered_roads
| | highway | name | geometry |
---|---|---|---|
0 | primary | Orchard Road | LINESTRING (103.8284048 1.3068666, 103.8287382... |
1 | residential | Hougang Ave 1 | LINESTRING (103.8858462 1.3517778, 103.8859356... |
2 | primary | Scotts Road | LINESTRING (103.8387571 1.3126764, 103.83872 1... |
3 | tertiary | Keng Lee Road | LINESTRING (103.8395387 1.3132203, 103.8396493... |
5 | primary | Newton Road | LINESTRING (103.8392505 1.3134495, 103.8394598... |
6 | residential | Sarkies Road | LINESTRING (103.8373031 1.3147037, 103.8359738... |
7 | primary | Patterson Road | LINESTRING (103.8318305 1.3050277, 103.8315539... |
8 | secondary | Orchard Boulevard | LINESTRING (103.8348293 1.3004045, 103.8342528... |
9 | secondary | Grange Road | LINESTRING (103.8348293 1.3004045, 103.8346883... |
10 | primary | Paterson Hill | LINESTRING (103.830813 1.3039006, 103.8302888 ... |
11 | primary | River Valley Road | LINESTRING (103.8313547 1.2960058, 103.83227 1... |
12 | residential | Unity Street | LINESTRING (103.842836 1.291596, 103.8422129 1... |
13 | residential | Merbau Road | LINESTRING (103.8430392 1.2934171, 103.842244 ... |
14 | residential | Mohamed Sultan Road | LINESTRING (103.8387207 1.2909753, 103.8388778... |
15 | residential | Saiboo Street | LINESTRING (103.8387207 1.2909753, 103.8386978... |
16 | residential | Merchant Loop | LINESTRING (103.843357 1.2895812, 103.8436285 ... |
17 | primary | Clemenceau Avenue | LINESTRING (103.8412716 1.2886074, 103.8414779... |
18 | secondary | Merchant Road | LINESTRING (103.8420526 1.2900233, 103.8421061... |
19 | residential | Read Cresent | LINESTRING (103.8447163 1.2885069, 103.8448984... |
20 | motorway | Tampines Expressway | LINESTRING (103.8513753 1.3967862, 103.8518491... |
21 | motorway | Seletar Expressway | LINESTRING (103.8579655 1.3941346, 103.8580422... |
22 | motorway | Central Expressway | LINESTRING (103.8443907 1.3109684, 103.8445533... |
24 | primary | Telok Blangah Road | LINESTRING (103.8021677 1.2727921, 103.8022844... |
25 | motorway | Ayer Rajah Expressway | LINESTRING (103.6372541 1.3480363, 103.6373882... |
26 | motorway | Ayer Rajah Expressway | LINESTRING (103.8341079 1.2723657, 103.8361228... |
27 | motorway | Seletar Expressway | LINESTRING (103.8541001 1.3977878, 103.8545108... |
29 | motorway | Seletar Expressway | LINESTRING (103.8541409 1.3976207, 103.8537589... |
30 | primary | Turf Club Avenue | LINESTRING (103.7572461 1.4233889, 103.757277 ... |
33 | motorway | Kranji Expressway | LINESTRING (103.7740277 1.3968194, 103.7740856... |
34 | motorway | Kranji Expressway | LINESTRING (103.7024781 1.3606672, 103.7029984... |
... | ... | ... | ... |
28587 | primary | Jurong Town Hall Road | LINESTRING (103.7456548 1.3238507, 103.7456587... |
28588 | primary | New Upper Changi Road | LINESTRING (103.9278275 1.3236756, 103.9294382... |
28589 | tertiary | Punggol Central | LINESTRING (103.9090633 1.4008567, 103.9089959... |
28590 | tertiary | Punggol Central | LINESTRING (103.9085407 1.4013837, 103.9090685... |
28593 | primary | Serangoon Road | LINESTRING (103.8635402 1.3219606, 103.8636524... |
28594 | secondary | Tanah Merah Coast Road | LINESTRING (103.9878283 1.3174213, 103.9873099... |
28595 | secondary | Tanah Merah Coast Road | LINESTRING (103.9819889 1.32008, 103.9822557 1... |
28596 | unclassified | Tanah Merah Ferry Road | LINESTRING (103.9880358 1.3171467, 103.9882954... |
28597 | unclassified | Toh Guan Road East | LINESTRING (103.7480459 1.3364131, 103.7477151... |
28598 | secondary | Toh Guan Road | LINESTRING (103.7471773 1.3321312, 103.7473666... |
28599 | secondary | Toh Guan Road | LINESTRING (103.7476578 1.3368285, 103.7476645... |
28600 | unclassified | Toh Tuck Link | LINESTRING (103.7587476 1.3306791, 103.7580119... |
28601 | residential | Toh Tuck Road | LINESTRING (103.7605157 1.3375774, 103.760884 ... |
28602 | residential | Toh Tuck Road | LINESTRING (103.7717609 1.3410362, 103.7718022... |
28603 | secondary | Yio Chu Kang Link | LINESTRING (103.8737945 1.3560351, 103.8735798... |
28618 | secondary | Harbour Drive | LINESTRING (103.7667516 1.2907781, 103.7671805... |
28619 | secondary | West Coast Ferry Road | LINESTRING (103.7666223 1.2908582, 103.7667516... |
28781 | residential | China Street | LINESTRING (103.8479024 1.2838138, 103.8482049... |
28784 | residential | Clarke Quay | LINESTRING (103.8448404 1.2907019, 103.8447671... |
28819 | secondary | Sungei Kadut Drive | LINESTRING (103.7465423 1.4078451, 103.7462293... |
28820 | secondary | Sungei Kadut Drive | LINESTRING (103.7469404 1.4300587, 103.7469354... |
28829 | unclassified | Macalister Road | LINESTRING (103.8301091 1.2795678, 103.8297781... |
28838 | residential | Countryside Walk | LINESTRING (103.8377302 1.3901191, 103.8380534... |
28923 | residential | Lorong 4 Toa Payoh | LINESTRING (103.850795 1.339425, 103.8508394 1... |
28924 | residential | Lorong 4 Toa Payoh | LINESTRING (103.8515894 1.3395955, 103.8513506... |
28964 | secondary_link | Lorong 6 Toa Payoh | LINESTRING (103.8517761 1.3396263, 103.8520148... |
28978 | residential | PIE | LINESTRING (103.67452 1.3306795, 103.6770103 1... |
28998 | residential | Nepal Park | LINESTRING (103.7878953 1.3008067, 103.7879641... |
28999 | residential | Nepal Park | LINESTRING (103.7878222 1.3015588, 103.7881668... |
29057 | residential | Marine Drive | LINESTRING (103.9098645 1.3024331, 103.909354 ... |
10636 rows × 3 columns
filtered_roads.plot()
<matplotlib.axes._subplots.AxesSubplot at 0x7f0e2136b890>
In case you're wondering, the empty spots are mostly reservoirs, followed by military areas and air fields. So this is looking pretty right.
Now we'll check whether the road names are correct. Since OpenStreetMap is crowdsourced, it's possible that there are typos. Also, often they use abbreviations for the streets - but only sometimes. For our own sanity, we'll expand out all the abbreviations to their full length.
abbrev = ['Ave', 'Dr', 'St', 'Blvd', 'Ter', 'Pl', 'Gr']
nonabbr = ['Avenue', 'Drive', 'Street', 'Boulevard', 'Terrace', 'Place', 'Grove']
for a, b in zip(abbrev, nonabbr):
    # abbreviation followed by a full stop, or at the very end of the name
    filtered_roads.replace('(.)' + a + r'(\.|$)', r'\1' + b, regex=True, inplace=True)
    # abbreviation followed by a space, e.g. "Hougang Ave 1"
    filtered_roads.replace('(.)' + a + ' ', r'\1' + b + ' ', regex=True, inplace=True)
abbrev = ['Lor', 'Jln', 'Kpg', 'Bkt']
nonabbr = ['Lorong', 'Jalan', 'Kampong', 'Bukit']
for a, b in zip(abbrev, nonabbr):
    # Malay prefixes: at the start of the name...
    filtered_roads.replace(r"^%s\.?\s" % a, b + ' ', regex=True, inplace=True)
    # ...or after a space in the middle of it
    filtered_roads.replace(r" %s\.?\s" % a, ' ' + b + ' ', regex=True, inplace=True)
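To make the intent of these patterns easier to test on individual names, here is a small stand-alone sketch using the standard `re` module. It is an alternative formulation rather than the code used above: it covers only three of the suffixes, and it requires the abbreviation to follow whitespace and end at a word boundary, which is slightly stricter than the `(.)` used in the notebook. The names `SUFFIXES` and `expand_suffixes` are made up for illustration.

```python
import re

# Illustrative subset of the suffix abbreviations handled above.
SUFFIXES = {"Ave": "Avenue", "Dr": "Drive", "St": "Street"}

# Match an abbreviation that follows whitespace (so never at the start of the
# name) and ends at a word boundary, optionally followed by a full stop.
SUFFIX_PATTERN = re.compile(
    r"(?<=\s)(" + "|".join(re.escape(key) for key in SUFFIXES) + r")\b\.?"
)

def expand_suffixes(name):
    """Expand abbreviated street suffixes in a single road name."""
    return SUFFIX_PATTERN.sub(lambda match: SUFFIXES[match.group(1)], name)

examples = ["Hougang Ave 1", "Tampines St. 21", "East Coast Dr", "Stamford Road"]
expanded = [expand_suffixes(name) for name in examples]
print(expanded)
```

Requiring the abbreviation not to be at the start of the name matters for `St`, which at the front of a name usually stands for "Saint" rather than "Street".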
froads = filtered_roads.copy(deep=True)
froads.shape
(10636, 3)
# quick filter because I realised that there's some carpark entrances and exits in there
# also remove flyovers
# ~ here is a negative: filter to just the roads that do not contain "exit", etc
froads = froads[~froads['name'].str.contains("exit", case=False)]
froads = froads[~froads['name'].str.contains("entrance", case=False)]
froads = froads[~froads['name'].str.contains("flyover", case=False)]
froads = froads[~froads['name'].str.contains("fyover", case=False)] # a typo
froads = froads[~froads['name'].str.contains("underpass", case=False)]
froads = froads[~froads['name'].str.contains("access", case=False)]
# remove a handful of "roads" that look like junctions between roads
froads = froads[~froads['name'].str.contains(";")]
froads.shape
(10490, 3)
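The chain of filters above can also be collapsed into a single pass by joining the keywords into one regular expression. The sketch below shows the idea on a made-up frame. The road names and the `unwanted` list are illustrative, and `na=False` is an extra safeguard that the cell above does not need, since unnamed roads were already dropped.

```python
import pandas as pd

# Made-up names standing in for the real road list.
toy = pd.DataFrame({"name": [
    "Orchard Road",
    "ECP Rochor Road Exit",
    "Carpark Entrance",
    "Example Flyover",
    "Scotts Road",
]})

# One alternation pattern instead of one filter per keyword.
unwanted = ["exit", "entrance", "flyover", "underpass", "access"]
pattern = "|".join(unwanted)

# True where the name contains any unwanted keyword, ignoring case.
is_unwanted = toy["name"].str.contains(pattern, case=False, na=False)

# ~ negates the mask, keeping only the rows that did not match.
kept = toy[~is_unwanted]
print(kept["name"].tolist())
```

Keeping the filters separate, as the notebook does, has its own advantage: it is easy to check how many rows each keyword removes.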
I'm going to compare them with a correctly-spelled list of roads I got off another website.
correct_roads = pd.read_csv("sg_roadnames.txt",skiprows=2, header=None)
correct_roads.columns = ["Roadname"]
correct_roads
Roadname | |
---|---|
0 | Abingdon Road |
1 | Adam Drive |
2 | Adam Park |
3 | Adam Road |
4 | Adis Road |
5 | Admiralty Drive |
6 | Admiralty Lane |
7 | Admiralty Link |
8 | Admiralty Road |
9 | Admiralty Road East |
10 | Admiralty Road West |
11 | Admiralty Street |
12 | Ah Hood Road |
13 | Ah Soo Garden |
14 | Ah Soo Walk |
15 | Aida Street |
16 | Airline Road |
17 | Airport Boulevard |
18 | Airport Cargo Road |
19 | Akyab Road |
20 | Albert Street |
21 | Alexandra Lane |
22 | Alexandra Road |
23 | Alexandra Terrace |
24 | Aliwal Street |
25 | Aljunied Avenue 1 |
26 | Aljunied Avenue 2 |
27 | Aljunied Avenue 3 |
28 | Aljunied Avenue 4 |
29 | Aljunied Avenue 5 |
... | ... |
3916 | Yishun Northview Drive |
3917 | Yishun Ring Road |
3918 | Yishun Street 11 |
3919 | Yishun Street 20 |
3920 | Yishun Street 21 |
3921 | Yishun Street 22 |
3922 | Yishun Street 61 |
3923 | Yishun Street 71 |
3924 | Yishun Street 72 |
3925 | Yishun Street 81 |
3926 | Yong Siak Street |
3927 | York Hill |
3928 | York Place |
3929 | York Road |
3930 | Youngberg Terrace |
3931 | Yuan Ching Road |
3932 | Yuk Tong Avenue |
3933 | Yung An Road |
3934 | Yung Ho Road |
3935 | Yung Kuang Road |
3936 | Yung Loh Road |
3937 | Yung Ping Road |
3938 | Yung Sheng Road |
3939 | Yunnan Crescent |
3940 | Yunnan Drive |
3941 | Yunnan Road |
3942 | Yunnan Walk |
3943 | Zion Road |
3944 | Zion Close |
3945 | Zehnder Road |
3946 rows × 1 columns
correct_roads.shape
(3946, 1)
So there are 3946 roads in the list of correct roads. How many are in the list of filtered roads? Roads can share names, so we need to count unique values.
len(froads['name'].value_counts())
3434
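One way to see how much the two lists overlap is plain set arithmetic on the names. This is a sketch with tiny made-up sets standing in for the real columns; `osm_names` and `reference_names` are hypothetical variables, not ones defined in this notebook.

```python
# Stand-ins for the unique OSM names and the reference road names
osm_names = {"Adam Road", "Aljuneid Avenue 1", "Zion Road"}
reference_names = {"Adam Road", "Aljunied Avenue 1", "Zion Road", "York Hill"}

# Names spelled identically in both lists
exact_matches = osm_names & reference_names
# OSM names with no identical entry in the reference list:
# these are the candidates for fuzzy matching
unmatched = osm_names - reference_names
```

Only the names in `unmatched` actually need a fuzzy lookup, which keeps the slow step small.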
It looks like we have an approximate subset, so it makes sense to try matching the roads in OSM with the roads from our "correct" set. We can use the fuzzywuzzy library from SeatGeek to do this.
from fuzzywuzzy import process
# To give us an idea of how this library works...note the transposition of the i and e
print(process.extractOne("Aljuneid Avenue 1", correct_roads.Roadname.values))
('Aljunied Avenue 1', 94)
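To get a feel for what a similarity score measures without installing anything, the standard library's `difflib` can do a rough version of the same lookup. This is a stand-in for illustration, not fuzzywuzzy's default scorer, so the number it produces will not match the 94 above. The candidate list is a small made-up subset of the reference names.

```python
from difflib import SequenceMatcher, get_close_matches

# A few reference names; the real list has thousands
candidates = ["Adam Road", "Aljunied Avenue 1", "Aljunied Avenue 2"]
query = "Aljuneid Avenue 1"  # note the transposed i and e

# get_close_matches returns the candidates sorted best-first
best = get_close_matches(query, candidates, n=1)[0]

# ratio() is 2 * (matching characters) / (total characters in both strings)
similarity = SequenceMatcher(None, query, best).ratio()
```

Because the ratio counts matching characters, a single transposition in a long name costs very little, which is why typos like this score in the 90s.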
Now we do it on all the unique roadname values.
roadname_values = pd.DataFrame(pd.unique(froads.name))
roadname_values.columns = ["name"]
%%cache augmentedosm.pkl augmented_roads
import re
# We'll augment our dataframe with the best guess output by fuzzywuzzy and its associated score
augmented_roads = roadname_values.copy(deep=True)
def correct_road(rd):
    # if there's an exact match, return it (note: fuzzywuzzy may already do this instead
    # of searching through everything for the top match, but I don't know for sure)
    if rd in correct_roads.Roadname.values:
        return rd, 100
    else:
        # use fuzzywuzzy to find the closest match
        closest_match, score = process.extractOne(rd, correct_roads.Roadname.values)
        # only use the closest match if there's the same number of words
        rd_words = rd.split()
        closest_match_words = closest_match.split()
        if len(closest_match_words) == len(rd_words):
            # in addition, if the only difference is in the number, then just return
            # the old number
            if ((closest_match_words[:-1] == rd_words[:-1]) and
                    (re.search(r"\d", closest_match_words[-1])) and
                    (re.search(r"\d", rd_words[-1]))):
                return rd, 100
            else:
                return closest_match, score
        else:
            return rd, 100
augmented_roads["corrected"], augmented_roads["score"] = zip(*augmented_roads.name.apply(correct_road))
augmented_roads
[Skipped the cell's code and loaded variables augmented_roads from file '/home/michelle/Dropbox/Repositories/public-facing/SingaporeRoadnameOrigins/notebooks/augmentedosm.pkl'.]
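The trickiest part of `correct_road` is the guard that keeps a name when only its trailing number differs from the closest match. Pulled out on its own, the check might look like the sketch below; `differs_only_in_number` is a hypothetical helper name, not something defined in this notebook.

```python
import re

def differs_only_in_number(original, candidate):
    """True if both names have the same words except for a final numbered word."""
    a, b = original.split(), candidate.split()
    if not a or len(a) != len(b):
        return False
    same_prefix = a[:-1] == b[:-1]
    # both last words must contain a digit, e.g. "75" and "50"
    both_numbered = (re.search(r"\d", a[-1]) is not None and
                     re.search(r"\d", b[-1]) is not None)
    return same_prefix and both_numbered

# A numbered street missing from the reference list should keep its own number
keep_original = differs_only_in_number("Woodlands Drive 75", "Woodlands Drive 50")
# A genuine typo is not caught by this guard, so the fuzzy match is used
typo_case = differs_only_in_number("Patterson Road", "Paterson Road")
```

Without this guard, a numbered street that is missing from the reference list would be silently renamed to its nearest numbered neighbour.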
Some of these are obviously wrong. Let's inspect the ones that did get changed more closely.
augmented_roads[(augmented_roads.score >= 90) & (augmented_roads.score < 100)]
name | corrected | score | |
---|---|---|---|
6 | Patterson Road | Paterson Road | 96 |
18 | Read Cresent | Read Crescent | 96 |
91 | Tomlison Road | Tomlinson Road | 96 |
101 | Chua Chu Kang Road | Choa Chu Kang Road | 94 |
255 | Mounbatten Road | Mountbatten Road | 97 |
326 | Choa Chu Kang Cresent | Choa Chu Kang Crescent | 98 |
344 | Sungei Kadut Cresent | Sungei Kadut Crescent | 98 |
382 | Neo Tiew Cresent | Neo Tiew Crescent | 97 |
398 | Woodland Centre Road | Woodlands Centre Road | 98 |
410 | Woodlands North Drive | North Woodlands Drive | 95 |
412 | Woodlands North Link | North Woodlands Link | 95 |
542 | Japanesen Garden Road | Japanese Garden Road | 98 |
555 | Seconf Chin Bee Road | Second Chin Bee Road | 95 |
579 | Student Walk | Students Walk | 96 |
587 | Neythai Road | Neythal Road | 92 |
588 | Fan Yong Road | Fan Yoong Road | 96 |
591 | Kian Teck Cresent | Kian Teck Crescent | 97 |
592 | Joon Koon Circle | Joo Koon Circle | 97 |
594 | Forth Lok Yang Road | Fourth Lok Yang Road | 97 |
856 | Eastwood Rd | Eastwood Road | 92 |
860 | Eastwwod Terrace | Eastwood Terrace | 94 |
1012 | Jelan Selamat | Jalan Selamat | 92 |
1013 | Jelan Sayang | Jalan Sayang | 92 |
1034 | Lorong Abu Taib | Lorong Abu Talib | 97 |
1050 | Ermani Street | Ernani Street | 92 |
1055 | Lakme Terrance | Lakme Terrace | 96 |
1063 | Jalan Terang Bulah | Jalan Terang Bulan | 94 |
1074 | Jalan Binjai | Jalan Binja | 96 |
1102 | Jalan Limau Bali | Jalan Limau Balli | 97 |
1227 | Tong Soon Green | Thong Soon Green | 97 |
... | ... | ... | ... |
2589 | Spingwood Walk | Springwood Walk | 97 |
2592 | Parkstone Rd | Parkstone Road | 92 |
2593 | Swanage Rd | Swanage Road | 91 |
2595 | Wareham Rd | Wareham Road | 91 |
2596 | Crescent Rd | Crescent Road | 92 |
2724 | Jalan Lebat Daum | Jalan Lebat Daun | 94 |
2775 | Chua Chu Kang Way | Choa Chu Kang Way | 94 |
2785 | Kampung Sireh | Kampong Sireh | 92 |
2801 | St. John's Road | St John's Road | 97 |
2818 | Kent Ridge Cresent | Kent Ridge Crescent | 97 |
2822 | King Albet Park | King Albert Park | 97 |
2831 | Saint Martin's Drive | St Martin's Drive | 92 |
2880 | Tai Thong Cresent | Tai Thong Crescent | 97 |
2890 | Fairway Drive | Fairways Drive | 96 |
2984 | Bt Merah View | Bukit Merah View | 90 |
2989 | Mohamed Ali Lane | Mohamad Ali Lane | 94 |
3045 | Tao Chin Road | Tao Ching Road | 96 |
3070 | Bright Hill Cresent | Bright Hill Crescent | 97 |
3075 | Senoko Cresent | Senoko Crescent | 97 |
3139 | Kampung Wak Hassan | Kampong Wak Hassan | 94 |
3187 | Arumugam Rd | Arumugam Road | 92 |
3196 | Japan Pelangi | Jalan Pelangi | 92 |
3209 | St Helena Rd | St Helena Road | 92 |
3210 | Queens Avenue | Queen's Avenue | 96 |
3261 | St. Margaret Road | St Margaret's Road | 95 |
3315 | Spingwood Terrace | Springwood Terrace | 97 |
3333 | Lengkok Lengkok | Lengkok Angsa | 95 |
3353 | Bah Soon Pah Rd | Bah Soon Pah Road | 94 |
3368 | Hai Sing Cresent | Hai Sing Crescent | 97 |
3400 | Sin Joo Walk | Sing Joo Walk | 96 |
105 rows × 3 columns
The ones with a score under 90, on the other hand, are mostly wrong, so for those it makes more sense to keep the original names supplied by OSM.
augmented_roads[(augmented_roads.score < 90)]
name | corrected | score | |
---|---|---|---|
15 | Merchant Loop | Merchant Road | 77 |
20 | Seletar Expressway | Seletar Road | 85 |
21 | Central Expressway | Central Circus | 63 |
25 | Kranji Expressway | Kranji Loop | 85 |
45 | Serangoon Viaduct | Serangoon Road | 77 |
103 | Elgin Bridge | Jalan Woodbridge | 64 |
106 | Esplanade Drive | Adam Drive | 85 |
197 | Buangkok Drive | Gul Drive | 85 |
202 | Compassvale Bow | Compassvale Road | 84 |
204 | Sengkang Square | Geylang Square | 76 |
206 | Rivervale Drive | Riviera Drive | 86 |
207 | Rivervale Crescent | Riverina Crescent | 86 |
245 | Buangkok Link | Gul Link | 85 |
273 | Boundary Close | Toh Close | 85 |
300 | Tampines Industrial Avenue 5 | Tampines Ind Avenue 5 | 86 |
353 | Brickland Road | Adam Road | 85 |
413 | South Woodlands Way | North Woodlands Way | 89 |
480 | Tampines Industrial Avenue 4 | Tampines Ind Avenue 4 | 86 |
623 | Pioneer Walk | Pine Walk | 86 |
624 | Sunview Road | View Road | 86 |
642 | Sago Lane | Admiralty Lane | 85 |
656 | Cantonment Close | Jago Close | 85 |
719 | Holland Link | Holland Plain | 88 |
770 | Jalan Pasu | Jalan Ampas | 86 |
1009 | Lengkong Lima | Lengkok Lima | 88 |
1010 | Lengkong Satu | Lengkok Satu | 88 |
1011 | Lengkong Empat | Lengkok Empat | 89 |
1043 | Lengkong Tiga | Lengkok Tiga | 88 |
1120 | Mayflower Link | Mayflower Lane | 86 |
1290 | Anderson Bridge | Anderson Road | 79 |
... | ... | ... | ... |
2948 | MacRitchie Viaduct | Maritime Avenue | 61 |
2965 | Simei Lane | Anchorvale Lane | 85 |
2975 | Tampines Industrial Avenue 1 | Tampines Ind Avenue 1 | 86 |
2979 | Tampines Industrial Avenue 2 | Tampines Ind Avenue 2 | 86 |
2980 | Tampines Industrial Avenue 3 | Tampines Ind Avenue 3 | 86 |
2981 | Tampines Link | Gul Link | 85 |
2990 | Bayfront Link | Gul Link | 85 |
3001 | Harbourfront Avenue | Bedok Avenue | 85 |
3015 | Stirling Walk | Kew Walk | 85 |
3042 | Technology Drive | Adam Drive | 85 |
3082 | Pavilion Circle | Gul Circle | 85 |
3085 | Pavilion Grove | Ash Grove | 85 |
3098 | Republic Cresent | Gul Crescent | 71 |
3099 | Republic Link | Gul Link | 85 |
3121 | Cleantech Loop | Clementi Loop | 74 |
3122 | CleanTech View | Sims View | 85 |
3123 | Cleantech Heights | Kew Heights | 85 |
3141 | Gambas Cresent | Tuas Crescent | 74 |
3146 | Tembeling Lane | Bali Lane | 85 |
3227 | Lentor Plain | Lentor Lane | 87 |
3239 | Chancery Court | Cameron Court | 74 |
3283 | Kian Teck Lane | Kian Teck Avenue | 87 |
3287 | Tuas South Boulevard | Tuas Avenue 1 | 85 |
3291 | Burgundy Drive | Burgundy Rise | 89 |
3349 | Greenwich Drive | Adam Drive | 85 |
3352 | Kitchener Link | Arts Link | 85 |
3384 | Pasir Ris Coast Industrial Park 6 | Pasir Ris Coast Ind Park 6 | 88 |
3393 | Kallang Airport Way | Kallang Way 1 | 87 |
3403 | Seletar Green Walk | Ah Soo Walk | 85 |
3412 | Sin Ming Lane | Sin Ming Avenue | 86 |
129 rows × 3 columns
final_roadnames = augmented_roads.copy(deep=True)
final_roadnames["final_name"] = final_roadnames.apply(lambda row: row["corrected"] if row.score >= 90 else row["name"], axis=1)
final_roadnames
name | corrected | score | final_name | |
---|---|---|---|---|
0 | Orchard Road | Orchard Road | 100 | Orchard Road |
1 | Hougang Avenue 1 | Hougang Avenue 1 | 100 | Hougang Avenue 1 |
2 | Scotts Road | Scotts Road | 100 | Scotts Road |
3 | Keng Lee Road | Keng Lee Road | 100 | Keng Lee Road |
4 | Newton Road | Newton Road | 100 | Newton Road |
5 | Sarkies Road | Sarkies Road | 100 | Sarkies Road |
6 | Patterson Road | Paterson Road | 96 | Paterson Road |
7 | Orchard Boulevard | Orchard Boulevard | 100 | Orchard Boulevard |
8 | Grange Road | Grange Road | 100 | Grange Road |
9 | Paterson Hill | Paterson Hill | 100 | Paterson Hill |
10 | River Valley Road | River Valley Road | 100 | River Valley Road |
11 | Unity Street | Unity Street | 100 | Unity Street |
12 | Merbau Road | Merbau Road | 100 | Merbau Road |
13 | Mohamed Sultan Road | Mohamed Sultan Road | 100 | Mohamed Sultan Road |
14 | Saiboo Street | Saiboo Street | 100 | Saiboo Street |
15 | Merchant Loop | Merchant Road | 77 | Merchant Loop |
16 | Clemenceau Avenue | Clemenceau Avenue | 100 | Clemenceau Avenue |
17 | Merchant Road | Merchant Road | 100 | Merchant Road |
18 | Read Cresent | Read Crescent | 96 | Read Crescent |
19 | Tampines Expressway | Tampines Expressway | 100 | Tampines Expressway |
20 | Seletar Expressway | Seletar Road | 85 | Seletar Expressway |
21 | Central Expressway | Central Circus | 63 | Central Expressway |
22 | Telok Blangah Road | Telok Blangah Road | 100 | Telok Blangah Road |
23 | Ayer Rajah Expressway | Ayer Rajah Expressway | 100 | Ayer Rajah Expressway |
24 | Turf Club Avenue | Turf Club Avenue | 100 | Turf Club Avenue |
25 | Kranji Expressway | Kranji Loop | 85 | Kranji Expressway |
26 | Prinsep Street | Prinsep Street | 100 | Prinsep Street |
27 | Tanglin Road | Tanglin Road | 100 | Tanglin Road |
28 | Alexandra Road | Alexandra Road | 100 | Alexandra Road |
29 | Nicoll Highway | Nicoll Highway | 100 | Nicoll Highway |
... | ... | ... | ... | ... |
3404 | Seletar North Link | Seletar North Link | 100 | Seletar North Link |
3405 | Ghim Moh Link | Ghim Moh Link | 100 | Ghim Moh Link |
3406 | Hougang Street 31 | Hougang Street 31 | 100 | Hougang Street 31 |
3407 | Hougang Street 32 | Hougang Street 32 | 100 | Hougang Street 32 |
3408 | Serangoon Lane | Serangoon Lane | 100 | Serangoon Lane |
3409 | Gambir Walk | Gambir Walk | 100 | Gambir Walk |
3410 | Upper Serangoon Crescent | Upper Serangoon Crescent | 100 | Upper Serangoon Crescent |
3411 | Ubi Close | Ubi Close | 100 | Ubi Close |
3412 | Sin Ming Lane | Sin Ming Avenue | 86 | Sin Ming Lane |
3413 | Compassvale Lane | Compassvale Lane | 100 | Compassvale Lane |
3414 | Lorong 5 Realty Park | Lorong 5 Realty Park | 100 | Lorong 5 Realty Park |
3415 | Wee Nam Road | Wee Nam Road | 100 | Wee Nam Road |
3416 | Tampines Street 72 | Tampines Street 72 | 100 | Tampines Street 72 |
3417 | Changi South Lane | Changi South Lane | 100 | Changi South Lane |
3418 | Telegraph Street | Telegraph Street | 100 | Telegraph Street |
3419 | Biopolis Street | Biopolis Street | 100 | Biopolis Street |
3420 | Biopolis Link | Biopolis Link | 100 | Biopolis Link |
3421 | Plymouth Avenue | Plymouth Avenue | 100 | Plymouth Avenue |
3422 | Gentle Road | Gentle Road | 100 | Gentle Road |
3423 | Leicester Road | Leicester Road | 100 | Leicester Road |
3424 | Simon Walk | Simon Walk | 100 | Simon Walk |
3425 | Joo Hong Road | Joo Hong Road | 100 | Joo Hong Road |
3426 | Florence Close | Florence Close | 100 | Florence Close |
3427 | Hoot Kiam Road | Hoot Kiam Road | 100 | Hoot Kiam Road |
3428 | Yishun Avenue 8 | Yishun Avenue 8 | 100 | Yishun Avenue 8 |
3429 | Choa Chu Kang Avenue 6 | Choa Chu Kang Avenue 6 | 100 | Choa Chu Kang Avenue 6 |
3430 | Clarke Quay | Clarke Quay | 100 | Clarke Quay |
3431 | Countryside Walk | Countryside Walk | 100 | Countryside Walk |
3432 | PIE | PIE | 100 | PIE |
3433 | Nepal Park | Nepal Park | 100 | Nepal Park |
3434 rows × 4 columns
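The selection rule itself is small enough to write as a standalone function, which makes the cut-off easy to experiment with. This is a sketch of the same logic as the lambda above; `choose_final_name` and its `threshold` parameter are hypothetical names.

```python
def choose_final_name(name, corrected, score, threshold=90):
    """Accept the fuzzy correction only when its score reaches the threshold."""
    return corrected if score >= threshold else name

# Rows taken from the tables above
accepted = choose_final_name("Patterson Road", "Paterson Road", 96)
rejected = choose_final_name("Merchant Loop", "Merchant Road", 77)
```

Here `accepted` takes the corrected spelling and `rejected` keeps the OSM name, matching the `final_name` column in the table.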
# Final clean-up
final_roadnames.loc[3432, 'final_name'] = 'Pan-Island Expressway' # was: PIE
final_roadnames.loc[2492, 'final_name'] = 'Woodlands Drive 75' # was: Woodlands Drive 50
# drop the intermediate columns in final_roadnames and write out to csv file
final_roadnames.drop(["score", "corrected"], inplace=True, axis=1)
final_roadnames.to_csv("singapore-roadnames-final.csv")
# write out our filtered roads GeoDataFrame to a GeoJSON file
filtered_roads.crs = None
filtered_roads.to_file("singapore-roads-filtered.geojson", driver="GeoJSON")