Let's explore a netCDF file from the Atlantic Real-Time Ocean Forecast System
To import netcdf4-python
import netCDF4
f = netCDF4.Dataset('data/rtofs_glo_3dz_f006_6hrly_reg3.nc', 'r')
for v in f.variables:
print(v),
MT Date Depth Y X Latitude Longitude u v temperature salinity
Let's put our Python object oriented knowledge to use
f
is an object, representing an open netCDF filevariables
is an attribute of f, in particular it is a dictionarytype(f.variables) # f.variables is a dictionary data structure
collections.OrderedDict
for d in f.dimensions:
print(d),
MT Y X Depth
print f.dimensions['Y']
<type 'netCDF4.Dimension'>: name = 'Y', size = 850
Get the variable objects for temperature and salinity
temp = f.variables['temperature']
sal = f.variables['salinity']
print temp
print sal
<type 'netCDF4.Variable'> float32 temperature(u'MT', u'Depth', u'Y', u'X') coordinates: Longitude Latitude Date standard_name: sea_water_potential_temperature units: degC _FillValue: 1.26765e+30 valid_range: [ -5.07860279 11.14989948] long_name: temp [90.9H] unlimited dimensions = (u'MT',) current size = (1, 10, 850, 712) <type 'netCDF4.Variable'> float32 salinity(u'MT', u'Depth', u'Y', u'X') coordinates: Longitude Latitude Date standard_name: sea_water_salinity units: psu _FillValue: 1.26765e+30 valid_range: [ 11.61832619 35.04695129] long_name: salinity [90.9H] unlimited dimensions = (u'MT',) current size = (1, 10, 850, 712)
We can get
ncattrs()
methodvar.att
syntaxfor att in temp.ncattrs():
print att
coordinates standard_name units _FillValue valid_range long_name
print temp.coordinates
print temp.standard_name
print temp.units
Longitude Latitude Date sea_water_potential_temperature degC
print 'Variables: ',
for v in f.variables:
print(v),
print
print 'Dimensions: ',
for d in temp.dimensions:
print(d),
Variables: MT Date Depth Y X Latitude Longitude u v temperature salinity Dimensions: MT Depth Y X
mt = f.variables['MT']
depth = f.variables['Depth']
print mt
print depth
<type 'netCDF4.Variable'> float64 MT(u'MT',) long_name: time units: days since 1900-12-31 00:00:00 calendar: standard axis: T unlimited dimensions = (u'MT',) current size = (1,) <type 'netCDF4.Variable'> float32 Depth(u'Depth',) standard_name: depth units: m positive: down axis: Z unlimited dimensions = () current size = (10,)
To access netCDF data, rather than just metadata, we will also need NumPy.
To import NumPy:
import numpy as np
mt[:] # Reads the netCDF variable MT, array of one element
array([ 41023.25])
depth[:] # Let's use NumPy to examine depth array
array([ 0., 100., 200., 400., 700., 1000., 2000., 3000., 4000., 5000.], dtype=float32)
y, x = f.variables['Y'], f.variables['X']
print y
print x
<type 'netCDF4.Variable'> int32 Y(u'Y',) point_spacing: even axis: Y unlimited dimensions = () current size = (850,) <type 'netCDF4.Variable'> int32 X(u'X',) point_spacing: even axis: X unlimited dimensions = () current size = (712,)
X
and Y
dimensions don't look like longitudes and latitudescoordinates
attribute, Latitude
and Longitude
standard_name
and units
in the CF Metadata Conventionslat = f.variables['Latitude']
lon = f.variables['Longitude']
print lat
print lon
<type 'netCDF4.Variable'> float32 Latitude(u'Y', u'X') standard_name: latitude units: degrees_north unlimited dimensions = () current size = (850, 712) <type 'netCDF4.Variable'> float32 Longitude(u'Y', u'X') standard_name: longitude units: degrees_east unlimited dimensions = () current size = (850, 712)
Aha! So we need to find array indices iy
and ix
such that Latitude[iy, ix]
is close to 50.0 and Longitude[iy, ix]
is close to -140.0 ...
(lat0,lon0)
and (lat1,lon1)
is (lat0-lat1)**2 + (lon0-lon1)**2
# Find array indices of lat and lon closest to specified point
lat_pt, lon_pt = 50.0, -140.0
# Read latitude and longitude from file into numpy arrays
latvals = lat[:]
lonvals = lon[:]
distsq_min = 1.0e30
ydim = f.dimensions['Y']
xdim = f.dimensions['X']
for iy in range(len(ydim)): # 0, 1, ..., 849
for ix in range(len(xdim)): # 0, 1, ..., 711
latval = latvals[iy, ix]
# force longitude into range -180 to 180
lonval = (lonvals[iy, ix] + 180) % 360 - 180
dist_sq = (latval - lat_pt)**2 + (lonval - lon_pt)**2
if dist_sq < distsq_min:
iy_min, ix_min, distsq_min = iy, ix, dist_sq
# How close did we get?
print iy_min, ix_min, lat[iy_min, ix_min], lon[iy_min, ix_min]
122 486 49.9867 -139.982
Operations and functions applied to whole numpy arrays is much faster than element-at-a-time operations in loops.
latvals = lat[:]
# get longitudes in range -180 to 180
lonvals = (lon[:] + 180) % 360.0 - 180.0
dist_sq = (latvals-lat_pt)**2 + (lonvals-lon_pt)**2
minindex_flattened = dist_sq.argmin() # 1D index of minimum dist_sq element
# Get 2D index for latvals and lonvals arrays from 1D index
iy_min, ix_min = np.unravel_index(minindex_flattened, latvals.shape)
print iy_min, ix_min, lat[iy_min, ix_min], lon[iy_min, ix_min]
122 486 49.9867 -139.982
On a sphere, trignometric formulas provide chord distance squared between two points on the sphere. Using trignometry, there is no need to worry about domain anomalies between -180 degrees and 180 degrees.
from math import pi
# for trignometry, need angles in radians
rad_factor = pi/180.0
lat_rad = lat_pt * rad_factor
lon_rad = lon_pt * rad_factor
latvals = lat[:] * rad_factor
lonvals = lon[:] * rad_factor
# Chord-distance squared formula, again no loops thanks to numpy
from numpy import cos, sin
delX = cos(lat_rad)*cos(lon_rad) - cos(latvals)*cos(lonvals)
delY = cos(lat_rad)*sin(lon_rad) - cos(latvals)*sin(lonvals)
delZ = sin(lat_rad) - sin(latvals);
chord_sq = delX**2 + delY**2 + delZ**2
minindex_flattened = chord_sq.argmin() # get 1D index of min chord_sq
# convert 1D index to 2D indices for latvals and lonvals
iy_min, ix_min = np.unravel_index(minindex_flattened, latvals.shape)
lat_deg = latvals[iy_min, ix_min] / rad_factor
lon_deg = lonvals[iy_min, ix_min] / rad_factor
print iy_min, ix_min, lat_deg, lon_deg # Sanity check
122 486 49.9867291955 -139.982049605
If getting indices for lots of coordinate pairs, use scipy.spatial.cKDTree. This takes some time to create a data structure from the 2D lat and lon arrays, but makes queries for indices of points given by (lat,lon) coordinates much faster.
from scipy.spatial import cKDTree
print "constructing kd-tree ..."
kdt = cKDTree(zip(np.ravel(cos(latvals)*cos(lonvals)),
np.ravel(cos(latvals)*sin(lonvals)),
np.ravel(sin(latvals))))
print "Querying kd-tree ..."
min_chordsq, min_index = kdt.query([cos(lat_rad)*cos(lon_rad),
cos(lat_rad)*sin(lon_rad),
sin(lat_rad)])
iy_min, ix_min = np.unravel_index(min_index, lat.shape)
print iy_min, ix_min, lat[iy_min, ix_min], lon[iy_min, ix_min]
constructing kd-tree ... Querying kd-tree ... 122 486 49.9867 -139.982
|----------+--------|
| Variable | Index |
|----------+--------|
| MT | 0 |
| Depth | 0 |
| Y | iy_min |
| X | ix_min |
|----------+--------|
sal = f.variables['salinity']
# Read values out of the netCDF file for temperature and salinity
print temp[0,0,iy_min,ix_min], temp.units
print sal[0,0,iy_min,ix_min], sal.units
6.46312 degC 32.6572 psu
from IPython.display import HTML
HTML('<iframe src=http://www.unidata.ucar.edu/software/netcdf-java/'
'formats/FileTypes.html width=700 height=350></iframe>')
pyudl
module¶get_latest_dods_url
: Given a top level THREDDS dataset URI, return the latest dataset.The following example showcases some nice netCDF features:
from netCDF4 import Dataset
from pyudl.tds import get_latest_dods_url
gfs_data_url = "http://thredds.ucar.edu/thredds/catalog/grib/NCEP/GFS/" \
"Global_0p5deg/catalog.xml"
latest = get_latest_dods_url(gfs_data_url)
data = Dataset(latest)
for v in data.variables:
print v,
lat lon isobaric isobaric1 pressure_difference_layer pressure_difference_layer_bounds potential_vorticity_surface isobaric2 height_above_ground sigma height_above_ground1 sigma_layer sigma_layer_bounds pressure_difference_layer1 pressure_difference_layer1_bounds isobaric3 height_above_ground2 height_above_ground_layer height_above_ground_layer_bounds depth_below_surface_layer depth_below_surface_layer_bounds height_above_ground3 altitude_above_msl isobaric4 height_above_ground_layer1 height_above_ground_layer1_bounds isobaric5 pressure_difference_layer2 pressure_difference_layer2_bounds height_above_ground4 time time1 time1_bounds time2 time2_bounds time3 Temperature_surface Temperature_maximum_wind Temperature_tropopause Temperature_isobaric Temperature_altitude_above_msl Temperature_height_above_ground Temperature_sigma Temperature_depth_below_surface_layer Temperature_pressure_difference_layer Temperature_potential_vorticity_surface Temperature_low_cloud_top_Mixed_intervals_Average Temperature_middle_cloud_top_Mixed_intervals_Average Temperature_high_cloud_top_Mixed_intervals_Average Potential_temperature_sigma Maximum_temperature_height_above_ground_Mixed_intervals_Interval Minimum_temperature_height_above_ground_Mixed_intervals_Interval Latent_heat_net_flux_surface_Mixed_intervals_Average Sensible_heat_net_flux_surface_Mixed_intervals_Average Specific_humidity_height_above_ground Specific_humidity_pressure_difference_layer Relative_humidity_zeroDegC_isotherm Relative_humidity_isobaric Relative_humidity_height_above_ground Relative_humidity_sigma_layer Relative_humidity_sigma Relative_humidity_pressure_difference_layer Relative_humidity_entire_atmosphere Relative_humidity_highest_tropospheric_freezing Precipitable_water_entire_atmosphere Precipitation_rate_surface_Mixed_intervals_Average Total_precipitation_surface_Mixed_intervals_Accumulation Convective_precipitation_surface_Mixed_intervals_Accumulation Water_equivalent_of_accumulated_snow_depth_surface Cloud_mixing_ratio_isobaric Categorical_Rain_surface_Mixed_intervals_Average Categorical_Freezing_Rain_surface_Mixed_intervals_Average Categorical_Ice_Pellets_surface_Mixed_intervals_Average Categorical_Snow_surface_Mixed_intervals_Average Convective_Precipitation_Rate_surface_Mixed_intervals_Average Potential_Evaporation_Rate_surface u-component_of_wind_maximum_wind u-component_of_wind_tropopause u-component_of_wind_isobaric u-component_of_wind_altitude_above_msl u-component_of_wind_height_above_ground u-component_of_wind_sigma u-component_of_wind_pressure_difference_layer u-component_of_wind_potential_vorticity_surface u-component_of_wind_planetary_boundary v-component_of_wind_maximum_wind v-component_of_wind_tropopause v-component_of_wind_isobaric v-component_of_wind_altitude_above_msl v-component_of_wind_height_above_ground v-component_of_wind_sigma v-component_of_wind_pressure_difference_layer v-component_of_wind_potential_vorticity_surface v-component_of_wind_planetary_boundary Vertical_velocity_pressure_isobaric Vertical_velocity_pressure_sigma Absolute_vorticity_isobaric Momentum_flux_u-component_surface_Mixed_intervals_Average Momentum_flux_v-component_surface_Mixed_intervals_Average Wind_speed_gust_surface Vertical_Speed_Shear_tropopause Vertical_Speed_Shear_potential_vorticity_surface U-Component_Storm_Motion_height_above_ground_layer V-Component_Storm_Motion_height_above_ground_layer Ventilation_Rate_planetary_boundary Pressure_surface Pressure_maximum_wind Pressure_tropopause Pressure_height_above_ground Pressure_potential_vorticity_surface Pressure_low_cloud_bottom_Mixed_intervals_Average Pressure_low_cloud_top_Mixed_intervals_Average Pressure_middle_cloud_bottom_Mixed_intervals_Average Pressure_middle_cloud_top_Mixed_intervals_Average Pressure_high_cloud_bottom_Mixed_intervals_Average Pressure_high_cloud_top_Mixed_intervals_Average Pressure_convective_cloud_bottom Pressure_convective_cloud_top Pressure_reduced_to_MSL_msl ICAO_Standard_Atmosphere_Reference_Height_maximum_wind ICAO_Standard_Atmosphere_Reference_Height_tropopause Geopotential_height_surface Geopotential_height_zeroDegC_isotherm Geopotential_height_maximum_wind Geopotential_height_tropopause Geopotential_height_isobaric Geopotential_height_potential_vorticity_surface Geopotential_height_highest_tropospheric_freezing Geopotential_height_anomaly_isobaric MSLP_Eta_model_reduction_msl 5-Wave_Geopotential_Height_isobaric Zonal_Flux_of_Gravity_Wave_Stress_surface_Mixed_intervals_Average Meridional_Flux_of_Gravity_Wave_Stress_surface_Mixed_intervals_Average Planetary_Boundary_Layer_Height_surface 5-Wave_Geopotential_Height_Anomaly_isobaric Pressure_of_level_from_which_parcel_was_lifted_pressure_difference_layer Downward_Short-Wave_Radiation_Flux_surface_Mixed_intervals_Average Upward_Short-Wave_Radiation_Flux_surface_Mixed_intervals_Average Upward_Short-Wave_Radiation_Flux_atmosphere_top_Mixed_intervals_Average Downward_Long-Wave_Radp_Flux_surface_Mixed_intervals_Average Upward_Long-Wave_Radp_Flux_surface_Mixed_intervals_Average Upward_Long-Wave_Radp_Flux_atmosphere_top_Mixed_intervals_Average Total_cloud_cover_entire_atmosphere_Mixed_intervals_Average Total_cloud_cover_boundary_layer_cloud_Mixed_intervals_Average Total_cloud_cover_low_cloud_Mixed_intervals_Average Total_cloud_cover_middle_cloud_Mixed_intervals_Average Total_cloud_cover_high_cloud_Mixed_intervals_Average Total_cloud_cover_convective_cloud Cloud_water_entire_atmosphere Cloud_Work_Function_entire_atmosphere_Mixed_intervals_Average Sunshine_Duration_surface Convective_available_potential_energy_surface Convective_available_potential_energy_pressure_difference_layer Convective_inhibition_surface Convective_inhibition_pressure_difference_layer Storm_relative_helicity_height_above_ground_layer Surface_Lifted_Index_surface Best_4_layer_Lifted_Index_surface Total_ozone_entire_atmosphere Ozone_Mixing_Ratio_isobaric Albedo_surface_Mixed_intervals_Average Land_cover_0__sea_1__land_surface Water_runoff_surface_Mixed_intervals_Accumulation Volumetric_Soil_Moisture_Content_depth_below_surface_layer Ground_Heat_Flux_surface_Mixed_intervals_Average Wilting_Point_surface Field_Capacity_surface Haines_Index_surface Ice_cover_surface
# Look at metadata for a specific variable
print data.variables['Temperature_surface']
<type 'netCDF4.Variable'> float32 Temperature_surface(u'time', u'lat', u'lon') long_name: Temperature @ Ground or water surface units: K missing_value: nan abbreviation: TMP Grib_Variable_Id: VAR_0-0-0_L1 Grib2_Parameter: [0 0 0] Grib2_Parameter_Discipline: Meteorological products Grib2_Parameter_Category: Temperature Grib2_Parameter_Name: Temperature Grib2_Level_Type: 1 Grib2_Generating_Process_Type: Forecast unlimited dimensions = () current size = (65, 361, 720)
It's good to close netCDF files, but not actually necessary when only reading.
f.close()
data.close()
!pwd
/Users/russ/git/tds-python-workshop