%matplotlib inline
import mir_eval, librosa, librosa.display, numpy, matplotlib.pyplot as plt, IPython.display as ipd
plt.style.use('seaborn-muted')
plt.rcParams['figure.figsize'] = (14, 5)
plt.rcParams['axes.grid'] = True
plt.rcParams['axes.spines.left'] = False
plt.rcParams['axes.spines.right'] = False
plt.rcParams['axes.spines.bottom'] = False
plt.rcParams['axes.spines.top'] = False
plt.rcParams['axes.xmargin'] = 0
plt.rcParams['axes.ymargin'] = 0
plt.rcParams['image.cmap'] = 'gray'
plt.rcParams['image.interpolation'] = 'none'
Evaluation method: compute the error between the estimated beat times and a reference list of beat annotations. Many metrics additionally compare the beat sequences at different metric levels in order to deal with the ambiguity of tempo, e.g. an estimate at double or half the reference tempo.
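The core idea is simple enough to sketch: count an estimated beat as a hit if it lands within a small tolerance window (mir_eval's default is 70 ms) of an unmatched reference beat, then summarize the hits with an F-measure. The toy function below only illustrates the idea; it is not mir_eval's implementation.
def toy_beat_f_measure(reference, estimated, tolerance=0.07):
    # Greedy matching: each reference beat can match at most one estimate.
    remaining = list(reference)
    hits = 0
    for t in estimated:
        diffs = [abs(t - r) for r in remaining]
        if diffs and min(diffs) <= tolerance:
            hits += 1
            remaining.pop(diffs.index(min(diffs)))
    if hits == 0:
        return 0.0
    precision = hits / len(estimated)
    recall = hits / len(reference)
    return 2 * precision * recall / (precision + recall)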
Let's evaluate a beat detector on the following audio:
y, sr = librosa.load('audio/prelude_cmaj.wav')
ipd.Audio(y, rate=sr)
Estimate the beats using librosa.beat.beat_track:
est_tempo, est_beats = librosa.beat.beat_track(y=y, sr=sr, start_bpm=120)
est_beats = librosa.frames_to_time(est_beats, sr=sr)
est_beats
array([1.021678 , 1.53251701, 1.9969161 , 2.48453515, 2.9721542 , 3.43655329, 3.90095238, 4.41179138, 4.89941043, 5.38702948, 5.85142857, 6.33904762, 6.82666667, 7.29106576])
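Note that beat_track can also return beat times directly by passing units='time', which skips the frames_to_time conversion above:
est_tempo, est_beats = librosa.beat.beat_track(y=y, sr=sr, start_bpm=120, units='time')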
Define a fictional reference annotation:
ref_beats = numpy.array([0, 0.50, 1.02, 1.53, 1.99, 2.48, 2.97,
3.43, 3.90, 4.41, 4.89, 5.38,
5.85, 6.33, 6.82, 7.29, 7.70])
Plot the estimated and reference beats together.
D = librosa.stft(y)
S = numpy.abs(D)
S_db = librosa.amplitude_to_db(S)
librosa.display.specshow(S_db, sr=sr, x_axis='time', y_axis='log')
plt.ylim(0, 8192)
plt.vlines(est_beats, 0, 8192, color='#00ff00')  # estimated beats as green lines
plt.scatter(ref_beats, 5000*numpy.ones_like(ref_beats), color='k', s=100)  # reference beats as black dots
Evaluate using mir_eval.beat.evaluate:
mir_eval.beat.evaluate(ref_beats, est_beats)
This raises a TypeError with older versions of mir_eval:
TypeError: slice indices must be integers or None or have an __index__ method
The traceback points into mir_eval.beat.p_score, where a NumPy change (floats are no longer valid slice indices) breaks the P-score computation. Upgrading mir_eval should resolve the error.
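Until then, each metric can be computed individually, for example:
f = mir_eval.beat.f_measure(ref_beats, est_beats)       # fraction of hits within +/-70 ms
cemgil, _ = mir_eval.beat.cemgil(ref_beats, est_beats)  # Gaussian error around each reference beat
print(f, cemgil)
(mir_eval.beat.trim_beats can optionally drop beats in the first five seconds, a common convention in beat evaluation.)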
mir_eval provides analogous evaluators for other tasks, e.g. mir_eval.chord.evaluate, mir_eval.melody.evaluate, and mir_eval.onset.evaluate.
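Chord evaluation takes labeled intervals rather than event times. The annotations below are made up purely for illustration:
ref_intervals = numpy.array([[0.0, 2.0], [2.0, 4.0]])  # fictional reference: (start, end) in seconds
ref_labels = ['C:maj', 'G:maj']
est_intervals = numpy.array([[0.0, 1.9], [1.9, 4.0]])  # fictional estimate
est_labels = ['C:maj', 'G:min']
mir_eval.chord.evaluate(ref_intervals, ref_labels, est_intervals, est_labels)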
Hidden benefits
mir_eval has tools for display and sonification.
import librosa.display
import mir_eval.display
Common plots: events, labeled_intervals, pitch, multipitch, piano_roll, segments, hierarchy, separation.
librosa.display.specshow(S_db, x_axis='time', y_axis='log')
mir_eval.display.events(ref_beats, color='w', alpha=0.8, linewidth=3)  # reference beats: solid white
mir_eval.display.events(est_beats, color='c', alpha=0.8, linewidth=3, linestyle='--')  # estimated beats: dashed cyan
y_harm, y_perc = librosa.effects.hpss(y, margin=8)  # larger margin gives a stronger separation
plt.figure(figsize=(12, 4))
mir_eval.display.separation([y_perc, y_harm], sr, labels=['percussive', 'harmonic'])
plt.legend()
Stacking the separated signals into a two-row array plays them as a stereo pair (percussive in one channel, harmonic in the other):
ipd.Audio(data=numpy.vstack([y_perc, y_harm]), rate=sr)
mir_eval.sonify renders annotations as audio, e.g. mir_eval.sonify.clicks for beat times and mir_eval.sonify.chords for chord labels.
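For example, a sketch that mixes clicks at the estimated beat times over the original signal (the 0.5 gain is an arbitrary choice):
clicks = mir_eval.sonify.clicks(est_beats, sr, length=len(y))  # one click per estimated beat
ipd.Audio(y + 0.5*clicks, rate=sr)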