# Detection of peaks in data

Marcos Duarte
[Laboratory of Biomechanics and Motor Control](http://demotu.org/)
Federal University of ABC, Brazil

One way to detect peaks (local maxima) or valleys (local minima) in data is to use the property that a peak (or valley) must be greater (or smaller) than its immediate neighbors. The function detect_peaks from the Python module detecta detects peaks (or valleys) based on this property and other characteristics. The function signature is:

ind = detect_peaks(x, mph=None, mpd=1, threshold=0, edge='rising', kpsh=False, valley=False, show=False, ax=None, title=True)
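
As a rough illustration of the neighbor-comparison property described above, here is a minimal NumPy sketch. It is not the detecta implementation: the name naive_peaks is hypothetical, and the sketch assumes strict inequalities, so it ignores flat-topped peaks, NaNs, and the first and last samples.

```python
import numpy as np

def naive_peaks(x):
    """Indices of samples strictly greater than both immediate neighbors."""
    x = np.asarray(x, dtype=float)
    if x.size < 3:
        return np.array([], dtype=int)
    dx = np.diff(x)  # dx[j] = x[j+1] - x[j]
    # Sample j+1 is a peak when the slope before it is positive
    # and the slope after it is negative.
    return np.where((dx[:-1] > 0) & (dx[1:] < 0))[0] + 1

print(naive_peaks([0, 1, 0, 2, 0, 3, 0, 2, 0, 1, 0]))  # [1 3 5 7 9]
```

Valleys can be found the same way by negating the data before the call.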


The parameters mph (minimum peak height), mpd (minimum peak distance), and threshold follow the convention of the MATLAB function findpeaks.m.
Let's see how to use detect_peaks. First, install the package; then import the necessary Python libraries and configure the environment:

## Installation

pip install detecta


Or

conda install -c duartexyz detecta

In [1]:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

from detecta import detect_peaks


Running the function examples:

In [3]:
>>> x = np.random.randn(100)
>>> x[60:81] = np.nan
>>> # detect all peaks and plot data
>>> ind = detect_peaks(x, show=True)
>>> print(ind)

>>> x = np.sin(2*np.pi*5*np.linspace(0, 1, 200)) + np.random.randn(200)/5
>>> # set minimum peak height = 0 and minimum peak distance = 20
>>> detect_peaks(x, mph=0, mpd=20, show=True)

>>> x = [0, 1, 0, 2, 0, 3, 0, 2, 0, 1, 0]
>>> # set minimum peak distance = 2
>>> detect_peaks(x, mpd=2, show=True)

>>> x = np.sin(2*np.pi*5*np.linspace(0, 1, 200)) + np.random.randn(200)/5
>>> # detection of valleys instead of peaks
>>> detect_peaks(x, mph=-1.2, mpd=20, valley=True, show=True)

>>> x = [0, 1, 1, 0, 1, 1, 0]
>>> # detect both edges
>>> detect_peaks(x, edge='both', show=True)

>>> x = [-2, 1, -2, 2, 1, 1, 3, 0]
>>> # set threshold = 2
>>> detect_peaks(x, threshold=2, show=True)

>>> x = [-2, 1, -2, 2, 1, 1, 3, 0]
>>> fig, axs = plt.subplots(ncols=2, nrows=1, figsize=(10, 4))
>>> detect_peaks(x, show=True, ax=axs[0], threshold=0.5, title=False)
>>> detect_peaks(x, show=True, ax=axs[1], threshold=1.5, title=False)

[ 6  9 11 13 17 19 21 23 26 30 33 35 37 40 42 45 47 51 54 57 83 86 89 91
94 96]

Out[3]:
array([1, 6])
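
Several of the examples above pass mph and mpd. The sketch below shows one plausible way such filters can work on a list of candidate peak indices. It is written under stated assumptions and is not taken from the detecta source: filter_peaks is a hypothetical helper, candidates at least mph high are kept, and among candidates closer than mpd samples the taller one wins.

```python
import numpy as np

def filter_peaks(x, ind, mph=None, mpd=1):
    """Hypothetical helper: keep candidate peaks that satisfy mph and mpd."""
    x = np.asarray(x, dtype=float)
    ind = np.asarray(ind, dtype=int)
    if mph is not None:
        # Minimum peak height: drop candidates lower than mph.
        ind = ind[x[ind] >= mph]
    if ind.size and mpd > 1:
        # Minimum peak distance: visit candidates from tallest to shortest
        # and keep one only if it is at least mpd samples away from every
        # peak already kept (boundary handling is an assumption).
        order = ind[np.argsort(x[ind])[::-1]]
        kept = []
        for i in order:
            if all(abs(i - k) >= mpd for k in kept):
                kept.append(i)
        ind = np.sort(np.array(kept, dtype=int))
    return ind

x = [0, 1, 0, 2, 0, 3, 0, 2, 0, 1, 0]
candidates = [1, 3, 5, 7, 9]
print(filter_peaks(x, candidates, mpd=3))  # [1 5 9]
print(filter_peaks(x, candidates, mph=2))  # [3 5 7]
```

The distance filter loops over the candidates in Python, so its cost grows with the number of peaks that survive the earlier steps.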

## Function performance

The function detect_peaks is relatively fast, but the minimum peak distance parameter (mpd) slows it down when the data has many peaks (more than about 1000). In that case, try to reduce the number of peaks by tuning the other parameters, or smooth the data before calling the function.
Here is a simple test of its performance:

In [4]:
x = np.random.randn(10000)
ind = detect_peaks(x)
print('Data with %d points and %d peaks\n' %(x.size, ind.size))
print('Performance (without the minimum peak distance parameter):')
print('detect_peaks(x)')
%timeit detect_peaks(x)
print('\nPerformance (using the minimum peak distance parameter):')
print('detect_peaks(x, mpd=10)')
%timeit detect_peaks(x, mpd=10)

Data with 10000 points and 3358 peaks

Performance (without the minimum peak distance parameter):
detect_peaks(x)
98.2 µs ± 1.95 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Performance (using the minimum peak distance parameter):
detect_peaks(x, mpd=10)
8.17 ms ± 105 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
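
The advice to smooth the data first can be sketched with a simple moving average. This is only an illustration: moving_average and count_local_maxima are hypothetical helpers, the window of 5 samples is an arbitrary choice to tune for your data, and the count uses a plain strict neighbor comparison rather than detect_peaks.

```python
import numpy as np

def count_local_maxima(x):
    """Count samples strictly greater than both immediate neighbors."""
    dx = np.diff(x)
    return int(np.sum((dx[:-1] > 0) & (dx[1:] < 0)))

def moving_average(x, window=5):
    """Smooth x with a centered moving average (edges are zero-padded)."""
    kernel = np.ones(window) / window
    return np.convolve(x, kernel, mode='same')

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
x = rng.standard_normal(10000)
x_smooth = moving_average(x, window=5)

print('peaks before smoothing:', count_local_maxima(x))
print('peaks after smoothing: ', count_local_maxima(x_smooth))
```

Smoothing also lowers and slightly shifts the peaks that remain, so any mph or threshold value should be chosen for the smoothed signal and not for the raw one.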