Variations in submarine channel sinuosity as a function of latitude and slope

The purpose of this notebook is to provide the data and the plotting / analysis scripts that were used in the paper: Sylvester, Z., Pirmez, C., Cantelli, A., & Jobe, Z. R. (2013). Global (latitudinal) variation in submarine channel sinuosity: COMMENT. Geology, 41(5), e287–e287. doi:10.1130/G33548C.1. If you have comments or questions, please send them to zoltan_dot_sylvester_at_gmail_dot_com.

The original article that is discussed in the comment and the reply are as follows:

Peakall, J., Kane, I. A., Masson, D. G., Keevil, G., McCaffrey, W., & Corney, R. (2011). Global (latitudinal) variation in submarine channel sinuosity. Geology, 40(1), 11–14. doi:10.1130/G32295.1.

Peakall, J., Wells, M. G., Cossu, R., Kane, I. A., Masson, D. G., Keevil, G. M., et al. (2013). Global (latitudinal) variation in submarine channel sinuosity: REPLY. Geology, 41(5), e288–e288. doi:10.1130/G34319Y.1.

Read and plot data

In [139]:
import pandas as pd
from scipy import stats
pd.set_option('display.mpl_style', 'default') # nicer plots than the default matplotlib style
In [189]:
fileurl = 'https://dl.dropboxusercontent.com/u/25694950/channel_sinuosities.csv'
data = pd.read_csv(fileurl)

The majority of the data comes from the book by Clark and Pickering (1996), which also forms the basis of the analysis by Peakall et al. (2011); graphs in the book were digitized using Graphclick. Data for the Amazon Channel comes from Pirmez and Flood (1995) (provided by Carlos Pirmez); and Popescu et al. (2001) (centerline digitized from map in paper and half-wavelength sinuosities calculated from centerline). See full references in paper.

Let's have a look at the first 20 lines of the data table:

In [191]:
data[:20]
Out[191]:
Name Slope Sinuosity Latitude Large_river
0 DeSoto 0.00148 1.24586 28 0
1 DeSoto 0.00228 1.31765 28 0
2 DeSoto 0.00181 2.04757 28 0
3 DeSoto 0.00187 2.00569 28 0
4 DeSoto 0.00251 2.23903 28 0
5 Arguello 0.00440 1.45093 33 0
6 Arguello 0.00464 1.36212 33 0
7 Arguello 0.00537 1.09572 33 0
8 Arguello 0.01455 1.56045 33 0
9 Arguello 0.01575 1.07500 33 0
10 Arguello 0.06686 1.10460 33 0
11 Monterey 0.00092 1.02485 35 0
12 Monterey 0.00136 1.06459 35 0
13 Monterey 0.00145 1.05607 35 0
14 Monterey 0.00143 1.13271 35 0
15 Monterey 0.00223 1.01634 35 0
16 Monterey 0.00249 1.08446 35 0
17 Monterey 0.00308 1.30300 35 0
18 Monterey 0.00320 1.17812 35 0
19 Monterey 0.00334 1.04472 35 0

20 rows × 5 columns

In [73]:
figure(figsize=(10,8))
data['Name'].value_counts().plot(kind='bar');
In [69]:
figure(figsize=(10,8))
groups = data.groupby('Large_river')
labels = ('no large river','large river')
colors = (plt.rcParams['axes.color_cycle'][4],plt.rcParams['axes.color_cycle'][0])
for name, group in groups:
    plot(group.Slope, group.Sinuosity, marker='o', linestyle='', ms=8, label=labels[name], color=colors[name])
legend(loc='best')
xlabel('valley slope', fontsize=18)
ylabel('sinuosity', fontsize=18)
axis([0.0,0.03,1.0,3.0]);
In [71]:
figure(figsize=(10,8))
groups = data.groupby('Large_river')
labels = ('no large river','large river')
colors = (plt.rcParams['axes.color_cycle'][4],plt.rcParams['axes.color_cycle'][0])
for name, group in groups:
    plot(group.Latitude, group.Sinuosity, marker='o', linestyle='', ms=8, label=labels[name], color=colors[name])
legend(loc='best')
xlabel('latitude', fontsize=18)
ylabel('sinuosity', fontsize=18)
axis([0.0,70,1.0,3.0]);

Analysis of correlations

In [131]:
def bootstrap_regression(x,y,nit):
    slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)
    r = np.zeros(nit)
    n = len(x)
    for i in range(1,nit):
        x1 = x[np.random.randint(n, size=n)]
        y1 = y[np.random.randint(n, size=n)]
        slope, intercept, r_value2, p_value2, std_err = stats.linregress(x1,y1)
        r[i] = r_value2
    if r_value>0:
        p_value_boot = len(r[r >= r_value])/float(len(r))
    else:
        p_value_boot = len(r[r <= r_value])/float(len(r))
    return r_value, p_value, p_value_boot
In [132]:
r_values = np.zeros([3,1])
p_values = np.zeros([3,1])
p_values_boot = np.zeros([3,1])
r_values[0], p_values[0], p_values_boot[0] = bootstrap_regression(data.Latitude,data.Sinuosity,1000)
# for variable pairs that involve slope, we can only use those data points that have slope measurements:
r_values[1], p_values[1], p_values_boot[1] = bootstrap_regression(data.Slope[isnan(data.Slope)==0],data.Sinuosity[isnan(data.Slope)==0],1000)
r_values[2], p_values[2], p_values_boot[2] = bootstrap_regression(data.Latitude[isnan(data.Slope)==0],data.Slope[isnan(data.Slope)==0],1000)

Create a dataframe that contains the result of the regression:

In [137]:
df = pd.DataFrame(np.hstack(([[292],[268],[268]],r_values,r_values**2,
                             p_values,p_values_boot)),columns=['n','Pearson R','R squared','p-value',
                            'p-value bootstrap'],index=('latitude and sinuosity','slope and sinuosity','latitude and slope'))
In [138]:
df
Out[138]:
n Pearson R R squared p-value p-value bootstrap
latitude and sinuosity 292 -0.347308 0.120623 1.058797e-09 0
slope and sinuosity 268 -0.277373 0.076936 4.020125e-06 0
latitude and slope 268 0.202860 0.041152 8.370798e-04 0

3 rows × 5 columns