Exploring seafloor habitats: geographic analysis using IPython Notebook with GRASS & R

Massimo Di Stefano, Software Engineer

Rensselaer Polytechnic Institute, Troy, NY

Woods Hole Oceanographic Institution, Woods Hole, MA


In this Notebook GRASS GIS (Geographic Resource Analysis Support System) is used as GIS (Geographic Information System) engine, in combination with R used as environment for statistical computing.


In [8]:
cd /home/epy/notebook/
/home/epy/notebook

Import Modules and Export GRASS PNG Driver

In [9]:
import os
import sys
import grass.script as grass
from gis import *
from IPython.core.display import Image, HTML
In [10]:
!export GRASS_TRANSPARENT=TRUE
!export GRASS_TRUECOLOR=TRUE
!export GRASS_PNG_COMPRESSION=9
!export GRASS_PNG_AUTO_WRITE=TRUE

Workflow Description

  • Landform Extraction
  • Samples/Annotation/Raster OVERLAY
  • Hierarchical Clustering
  • Biological Community Assemblage Maps
  • Community-Density Probability Map

In [4]:
%pylab inline
Welcome to pylab, a matplotlib-based Python environment [backend: module://IPython.kernel.zmq.pylab.backend_inline].
For more information, type 'help(pylab)'.

Set the GRASS GIS environment to use a World Geodetic System WGS84

In [5]:
setGenv(location='lonlat')
{'MAPSET': 'PERMANENT', 'GISDBASE': '/home/epy/grass7data', 'LOCATION_NAME': 'lonlat'}
In [6]:
#grass.run_command('r.in.gdal', input='stell_10m.tiff', output='stell10m', flags='oe', overwrite=True)

Set the working region to the DTM extent and resolution

In [7]:
grass.read_command('g.region', rast='stell10m', flags='ap', res=0.0001).strip().split('\n')
Out[7]:
['projection: 3 (Latitude-Longitude)',
 'zone:       0',
 'datum:      wgs84',
 'ellipsoid:  wgs84',
 'north:      42:48N',
 'south:      42:04:59.88N',
 'west:       70:54W',
 'east:       70:01:59.52W',
 'nsres:      0:00:00.36',
 'ewres:      0:00:00.36',
 'rows:       7167',
 'cols:       8668',
 'cells:      62123556']

Set the value 198.3 to null value

In [8]:
#grass.run_command('r.mapcalc', expression="stell_na10m=if(abs(stell10m) < 198.3, stell10m)", overwrite=True)
#grass.run_command('r.null', map='stell_na10m', setnull=0)

Import Vector Points for Site 1

In [9]:
#grass.run_command('v.in.ogr', dsn='kdata/points/site1.shp', output='site1', flags='oe', overwrite=True)

Make a new vector type=Area as bbox for site1

In [10]:
#grass.run_command('g.region', vect='site1', flags='ap', res=0.0001)
#grass.run_command('v.in.region', type='area', output='site1area', overwrite=True)

Make a new GRASS Location UTM-WGS84 19 N (EPSG: 32619)

In [11]:
#grass.run_command('g.proj', epsg='32619', location='utm19N_wgs84')
In [12]:
setGenv(location='utm19N_wgs84')
{'MAPSET': 'PERMANENT', 'GISDBASE': '/home/epy/grass7data', 'LOCATION_NAME': 'utm19N_wgs84'}

Reproject the vector files from WGS84 (EPSG: 4326) to UTM-WGS84 19 N (EPSG: 32619)

In [13]:
#grass.run_command('v.proj', input='site1area', location='lonlat', mapset='PERMANENT', overwrite=True)
#grass.run_command('v.proj', input='site1', location='lonlat', mapset='PERMANENT', overwrite=True)

Reproject a portion of DTM based on the vector extent

In [14]:
grass.run_command('g.remove', rast='dtm1')
reproj(reg='site1', inp='stell_na10m', out='dtm1', res=10, buff=0, location='lonlat', mapset='PERMANENT')
grass.run_command('r.fillnulls', input='dtm1', output='dtm1', overwrite=True)
grass.run_command('r.out.png', input='dtm1', output='dtm1.png', overwrite=True)
Image(filename='dtm1.png')
Out[14]:

Check extension and resolution

In [15]:
!g.region rast=dtm1 -p
projection: 1 (UTM)
zone:       19
datum:      wgs84
ellipsoid:  wgs84
north:      4718670
south:      4716550
west:       392840
east:       397080
nsres:      10
ewres:      10
rows:       212
cols:       424
cells:      89888
In [17]:
HTML(region('dtm1')[0])
Out[17]:
projection1 (UTM)
zone19
datumwgs84
ellipsoidwgs84
north4718670
south4716550
west392840
east397080
nsres10
ewres10
rows212
cols424
cells89888
In [18]:
HTML(rasterinfo('dtm1')[0])
Out[18]:
north4718670
south4716550
east397080
west392840
nsres10
ewres10
rows212
cols424
cells89888
datatypeFCELL
min-133.76
max-56.76
title (dtm1)
units"none"
vertical_datum"none"
timestamp"none"

Generate 3D visualization

In [16]:
grass.run_command('m.nviz.image', elevation_map='dtm1', output='dtm1_3d', perspective=15, height=2000, color_map='dtm1', resolution_fine=1, resolution_coarse=1, format='tif')
os.system('convert dtm1_3d.tif dtm1_3d.png')
Image(filename='dtm1_3d.png')
Out[16]:

Morfological Features Extraction

In [11]:
elev = 'dtm1'
sf = 5
cl = 'elev1cl5'
clf='elev1cl5nn5'
vpoint='site1'
In []:
img = makemorfo(input=elev,nnwin=9, pmwin=15, resolution=10, overwrite=True, remove=False) # makemorfo(input=layer, overwrite=True)
In [21]:
pnglist = morf2png(img)
written  dtm1_avg.png
written  dtm1_min.png
written  dtm1_max.png
written  dtm1_er.png
written  dtm1_slope.png
written  dtm1_profc.png
written  dtm1_crosc.png
written  dtm1_minic.png
written  dtm1_maxic.png
written  dtm1_longc.png
In [22]:
imgslides(pnglist)
Out[22]:

Spatial Clustering

In [20]:
clusterize(img[3:], cl, k=5, ss=10, ns=500)
Image(filename=cl+'.png')
Loading required package: morfoclara
Loading required package: sp
Loading required package: XML

Attaching package: ‘XML’

The following object(s) are masked from ‘package:tools’:

    toHTML

GRASS GIS interface loaded with GRASS version: GRASS 7.0.svn (2012)
and location: utm19N_wgs84
Loading required package: rgdal
Geospatial Data Abstraction Library extensions to R successfully loaded
Loaded GDAL runtime: GDAL 1.9.0, released 2011/12/29
Path to GDAL shared files: /usr/share/gdal/1.9
Loaded PROJ.4 runtime: Rel. 4.7.1, 23 September 2009, [PJ_VERSION: 470]
Path to PROJ.4 shared files: (autodetected)
Warning messages:
1: statistics not supported by this driver 
2: In create2GDAL(dataset = dataset, drivername = drivername, type = type,  :
  mvFlag=NA unsuitable for type Byte
Out[20]:

Apply a nearest neighbors filter and export the results

In [23]:
grass.run_command('r.neighbors', flags='c', input=cl, output=clf, method='mode', size=sf, overwrite=True)
Out[23]:
0
In [24]:
grass.run_command('g.region', rast=clf, flags='ap')
grass.run_command('r.out.gdal', input=clf, output=clf+'.tif', format='GTiff', type='Byte', overwrite=True)
Out[24]:
0
In [13]:
# makepng doesnt' overwrite and fail silenty
makepng(rasterlist=[str(clf)],
        vectorlist=[str(vpoint)], 
        output=clf+'.png')
Image(filename=clf+'.png')
png done
Out[13]:

Vector Raster Overlay

In [26]:
grass.run_command('v.db.addcolumn', map=vpoint, columns="depth double precision,morfo int")
grass.run_command('v.what.rast', map=vpoint, rast=clf, col='morfo')
grass.run_command('v.what.rast', map=vpoint, rast=elev, col='depth')
Out[26]:
0
In [27]:
grass.read_command('db.columns', table=vpoint).strip().split('\n')
Out[27]:
['cat',
 'cat_',
 'category_n',
 'genus',
 'species',
 'eic_id',
 'imagename',
 'idcode',
 'length_mm',
 'target_cou',
 'subs',
 'sub_name',
 'sub_compl',
 'sub_struct',
 'sub_dom',
 'fov',
 'depth_m',
 'temp',
 'lat',
 'lon',
 'cruise_id',
 'pg_timesta',
 'kml_timest',
 'gid',
 'depth',
 'morfo']

Some Statistics

In [28]:
grass.read_command('v.univar', map=vpoint, col='morfo', type='point').strip().split('\n')
Out[28]:
['number of features with non NULL attribute: 57871',
 'number of missing attributes: 0',
 'number of NULL attributes: 0',
 'minimum: 1',
 'maximum: 5',
 'range: 4',
 'sum: 107156',
 'mean: 1.85164',
 'mean of absolute values: 1.85164',
 'population standard deviation: 0.841073',
 'population variance: 0.707403',
 'population coefficient of variation: 0.454232',
 'sample standard deviation: 0.84108',
 'sample variance: 0.707416',
 'kurtosis: 1.94295',
 'skewness: 1.18655']
In [29]:
grass.read_command('v.univar', map=vpoint, col='depth', type='point').strip().split('\n')
Out[29]:
['number of features with non NULL attribute: 57871',
 'number of missing attributes: 0',
 'number of NULL attributes: 0',
 'minimum: -131.95',
 'maximum: -58.45',
 'range: 73.5',
 'sum: -4.04018e+06',
 'mean: -69.8136',
 'mean of absolute values: 69.8136',
 'population standard deviation: 12.2363',
 'population variance: 149.727',
 'population coefficient of variation: 0.175271',
 'sample standard deviation: 12.2364',
 'sample variance: 149.73',
 'kurtosis: 10.7513',
 'skewness: -3.29567']

3D View

In [30]:
clf3d = clf+'_3d'
grass.run_command('m.nviz.image', elevation_map=elev, output=clf3d, perspective=15, height=2000, color_map=clf, resolution_fine=1, resolution_coarse=1, format='tif')
instruction = 'convert %s.tif %s.png' % (clf3d, clf3d)
os.system(instruction)
Image(filename=clf3d+'.png')
Out[30]:

KML export & Google Earth

In [45]:
gearth()
Out[45]:
In [33]:
grass.run_command('g.region', rast=clf, flags='ap')
grass.run_command('r.out.gdal', input=clf, output=clf+'.tif', format='GTiff', type='Byte', overwrite=True)
tif2kml2(clf+'.tif', clf, kmz=True, gtiff=False)
gdalbuildvrt elev1cl5nn5_tmp.vrt elev1cl5nn5.tif
['0...10...20...30...40...50...60...70...80...90...100 - done.']
gdalwarp -of VRT -t_srs EPSG:4326 elev1cl5nn5_tmp.vrt elev1cl5nn5_geo.vrt
['']
gdal_translate -of vrt -expand rgba elev1cl5nn5_geo.vrt elev1cl5nn5.vrt
['Input file size is 447, 169']
gdal2tiles.py -p geodetic -k elev1cl5nn5.vrt
['Generating Base Tiles:', '0...10...20...30...40...50...60...70...80...90...100 - done.', 'Generating Overview Tiles:']
cp elev1cl5nn5.tif /var/www/esr/elev1cl5nn5.tif
['']
kml saved in directory elev1cl5nn5
elev1cl5nn5.kmz
Out[33]:
'http://epi.whoi.edu/esr/elev1cl5nn5/doc.kml'
In [34]:
inst = "cp %s.tif /var/www/esr/" % clf
os.system(inst)
print "http://epi.whoi.edu/esr/%s.tif" % clf
http://epi.whoi.edu/esr/elev1cl5nn5.tif
In [50]:
makeGE(['http://epi.whoi.edu/esr/%s/doc.kml' % clf], clf+'.html', 800, 800)
http://epi.whoi.edu/esr/elev1cl5nn5.html
Out[50]:

Shaded Relief

In [37]:
shaded = 'dtm1shd'
rgb = 'elevshd'
grass.run_command('g.region', rast=elev, flags='ap')
grass.run_command('r.shaded.relief', input=elev, altitude=45, azimuth=225, output=shaded, overwrite=True)
grass.run_command('r.out.png', input=shaded, output=shaded+'.png', overwrite=True)
#Image(filename=shaded+'.png')
grass.run_command('r.his', h_map=elev, i_map=shaded, r_map='rr', b_map='bb', g_map='gg', overwrite=True)
grass.run_command('r.composite', red='rr', green='gg', blue='bb', output=rgb, overwrite=True)
grass.run_command('r.out.png', input=rgb, output=rgb+'.png', overwrite=True)
#Image(filename='rgb.png')
Out[37]:
0
In [38]:
Image(filename=shaded+'.png')
Out[38]:
In [40]:
Image(filename=rgb+'.png')
Out[40]:

Imagine the implications for science if scientists actually start showing the code they used to generate their graphs!

No more hiding that outlier point on the side of the graph


Massimo Di Stefano, Software Engineer

Rensselaer Polytechnic Institute, Troy, NY

Woods Hole Oceanographic Institution, Woods Hole, MA

In [15]:
cd
/home/epy

Publish it!