One of the primary purposes of programming is automatization. Whenever you work on data, you should try to avoid any manual intervention such as copying and pasting from one file to another, for example into an excel spreadsheet. This introduces a source of error and severely limits the amount of data you can process. Further, whenever you change parts of your program, it is likely that the manual steps have to be repeated. For all your work on data, good practice is to work on the rawest form of files you get, i.e., the ones supplied by a contractor or produced by the measurement apparatus.
One of the most important data sources in subsurface geoscience are well logs. These are cylindrical logging tools of a length between one and several dozens of meters that are being run up and down wellbores along a wireline and measure, collect and transmit rock formation data such as electrical resistivity, radiation, temperature and more, as a function of depth. This allows for accurate chracterization of lithologic layers, rock properties, oil saturation and more, to a high resolution.
%pylab inline
import os
import numpy as np
from scipy import integrate
import matplotlib.pyplot as plt
Populating the interactive namespace from numpy and matplotlib
A common well log file in the raw LAS format looks as follows (we print the first 55 lines and skip the rest of the data section):
fname = '1044944685.las'
lines_to_display = 55
with open(fname, 'r') as f:
for i in range(lines_to_display):
print f.readline().rstrip()
~Version Information VERS. 2.0: CWLS Log ASCII Standard - VERSION 2.0 WRAP. NO: One line per depth step ~Well Information Block STRT.FT 245.0000: START DEPTH STOP.FT 2530.5000: STOP DEPTH STEP.FT 0.5000: STEP NULL. -999.2500: NULL VALUE COMP. Vess Oil Corporation: COMPANY WELL. Wilson A #448: WELL FLD. El Dorado: FIELD LOC. 330' FNL & 1550' FWL: LOCATION CNTY. Butler: COUNTY SRVC. : SERVICE COMPANY DATE. Wed Apr 17 07-22-29 2013: LOG DATE UWI. : UNIQUE WELL ID STAT. Kansas: STATE SECT. 9: SECTION TOWN. 25S: TOWNSHIP RANG. 5E: RANGE API. 15-015-23969-00-00: API# PDAT.FT Ground Level: PERMANENT DATUM LMF.FT Kelly Bushing: LOG MEASURED FROM DMF.FT Kelly Bushing: DRILLING MEASURED FROM EKB.FT 1371: KB EDF.FT : DF EGL.FT 1365: GL DATE1. 4/17/2013: DATE1 ENGI1. D. Schmidt: RECORDED BY WITN1. Roger Martin: WITNESSED BY ~Curve Information Block DEPT.FT 0 000 00 00: Depth BVTX. : AVTX. : DCAL.GAPI CDL Caliper: RHOB. CDL Bulk Density: RHOC. CDL Density Correction: DPOR. CDL Density Porosity: CNLS. CN Limestone Porosity: GR. Gamma Ray: CILD. DIL Deep Conductivity: RILD. DIL Deep Resistivity: RILM. DIL Medium Resistivity: RLL3. DIL Shallow Resistivity: RxoRt. Rxo / Rt: SP.MV DIL Spontaneous Potential: DGA. Apparent Grain Density: ~Parameter Information Block ~A Depth BVTX AVTX DCAL RHOB RHOC DPOR CNLS GR 245.0000 0.0000 0.0000 1.7734 1.8916 0.2025 47.8581 29.3674 52.3655 245.5000 0.0000 0.0000 1.7738 1.9023 0.2070 47.2337 29.6285 49.6518 246.0000 0.0000 0.0000 1.7741 1.9094 0.2103 46.8160 30.6172 47.9509 246.5000 0.0000 0.0000 1.7740 1.9142 0.2122 46.5372 32.2842 50.2727 247.0000 0.0000 0.0000 1.7745 1.9168 0.2133 46.3889 33.6512 52.2234 247.5000 0.0000 0.0000 1.7739 1.9171 0.2146 46.3682 33.4263 50.6780
We'll try to keep things simpel here. Say we know that the data array always starts at line 49, then we can load the entire dataset by using loadtxt, telling the method to skip everything before that line/row.
las_data = np.loadtxt(fname, skiprows=49)
print 'readings/rows, logs/columns:'
print las_data.shape
print 'values:'
print las_data
readings/rows, logs/columns: (4544L, 9L) values: [[ 2.45000000e+02 0.00000000e+00 0.00000000e+00 ..., 4.78581000e+01 2.93674000e+01 5.23655000e+01] [ 2.45500000e+02 0.00000000e+00 0.00000000e+00 ..., 4.72337000e+01 2.96285000e+01 4.96518000e+01] [ 2.46000000e+02 0.00000000e+00 0.00000000e+00 ..., 4.68160000e+01 3.06172000e+01 4.79509000e+01] ..., [ 2.51550000e+03 0.00000000e+00 0.00000000e+00 ..., 7.16980000e+00 2.88235000e+01 9.78750000e+00] [ 2.51600000e+03 0.00000000e+00 0.00000000e+00 ..., 7.43490000e+00 2.97673000e+01 1.15010000e+00] [ 2.51650000e+03 0.00000000e+00 0.00000000e+00 ..., 7.61220000e+00 3.22248000e+01 6.37000000e-02]]
Which creates a two-dimensional array containing all logs (columns) and their readings (rows). If we further know that Depth
is in column 0
, and our log of interest RHOB
is in column 4
, we can extract the corresponding data:
las_depth = las_data[:,0] # all readings of log 0
las_rhob = las_data[:,4] # all readings of log 4
print las_depth
print las_rhob
[ 245. 245.5 246. ..., 2515.5 2516. 2516.5] [ 1.8916 1.9023 1.9094 ..., 2.5874 2.5829 2.5798]
We have now extracted our arrays of interest, and can plot the measured rock density as a function of depth:
plt.plot(las_rhob, las_depth)
# so that depth, albeit positive, increases downward
plt.gca().invert_yaxis()
# cosmetics
plt.xlim(0.,3.)
plt.xlabel('Specific Bulk Density [-]')
plt.ylabel('Depth [ft]')
plt.show()
We will operate on the extracted data now, and plot the result. The simple operation is
$\sigma(z) = g \int \rho dz$
to compute the overburden stress resulting from the lithostatic (rock) weight. We will therefore first convert the density from specific to [kg m-3] and the depth from [ft] to [m]
rhob = las_rhob*1.0E3
depth = las_depth*0.3048
and then use a scipy
function to perform the integration
# sigma(z) = g int rho dz
overburden_stress_pa = 9.81 * integrate.cumtrapz(rhob, depth, initial=0)
overburden_stress_mpa = overburden_stress_pa/1.0E6
plt.plot(overburden_stress_mpa, depth)
# so that depth, albeit positive, increases downward
plt.gca().invert_yaxis()
# cosmetics
plt.xlabel('$\sigma_v$ [MPa]')
plt.ylabel('Depth [m]')
plt.show()
DCAL
measures the diamater of the wellbore in [in]. Plot its readings vs depth, and integrate the cross-sectional area (use np.pi
as $\pi$) to plot the cumulative borehole volume in [m3] vs [m]. (Download the notebook and las file here)las_dcal = las_data[:,3] # all readings of log 3
plt.plot(las_dcal, las_depth)
# so that depth, albeit positive, increases downward
plt.gca().invert_yaxis()
# cosmetics
plt.xlabel('Wellbor Diameter [in]')
plt.ylabel('Depth [ft]')
plt.show()
dcal = las_dcal*0.0254 # from [in] to [m]
xarea = np.pi * dcal**2/4. # cross-sectional area [m2], dcal=diameter
# V(z) = int area dz
volume_m3 = integrate.cumtrapz(xarea, depth, initial=0)
plt.plot(volume_m3, depth)
# so that depth, albeit positive, increases downward
plt.gca().invert_yaxis()
# cosmetics
plt.xlabel('Wellbore Volume [m3]')
plt.ylabel('Depth [m]')
plt.show()