Load, plot, and play simpleLoop.wav:
from essentia.standard import MonoLoader
x = MonoLoader(filename='simpleLoop.wav')()
print(x.shape)
(132300,)
t = arange(len(x))/44100.0
plot(t, x)
xlabel('Time (seconds)')
from IPython.display import Audio
Audio(x, rate=44100)
Before extracting features from a frame of audio, we first multiply the frame by a window, such as the Hamming window pictured below, to reduce artifacts caused by the edges of the frame.
plot(hamming(101))
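To make the effect of windowing concrete, here is a minimal NumPy-only sketch (it does not use Essentia). It assumes a hypothetical 1024-sample frame of a 1 kHz sinusoid at 44100 Hz, and compares how much energy leaks into distant frequency bins with and without a Hamming window. The frame length, test frequency, and bin cutoff are illustrative choices, not values taken from the notebook.

```python
import numpy as np

sr = 44100            # sampling rate, assumed to match simpleLoop.wav
frame_size = 1024     # hypothetical frame length
n = np.arange(frame_size)

# A 1 kHz sinusoid does not complete a whole number of cycles in this frame,
# so the frame edges act as discontinuities when the spectrum is computed.
frame = np.sin(2 * np.pi * 1000.0 * n / sr)

window = np.hamming(frame_size)
windowed = frame * window   # element-wise multiplication tapers the edges

rect_mag = np.abs(np.fft.rfft(frame))      # no window (rectangular)
hamm_mag = np.abs(np.fft.rfft(windowed))   # Hamming window

# Leakage far above the sinusoid's frequency, relative to each spectrum's peak.
far = slice(200, None)
rect_leak_db = 20 * np.log10(rect_mag[far].max() / rect_mag.max())
hamm_leak_db = 20 * np.log10(hamm_mag[far].max() / hamm_mag.max())

print(rect_leak_db, hamm_leak_db)
```

In this sketch the windowed frame should show a lower (more negative) far-bin leakage level than the unwindowed frame, which is the edge artifact the window is meant to reduce.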
Mel-frequency cepstral coefficients (MFCCs) are a set of features that describe the coarse overall shape of a spectrum but not the fine harmonic structure. MFCCs are often used to describe the timbre of a musical signal.
from essentia.standard import MFCC, Spectrum, Windowing, FrameGenerator
hamming_window = Windowing(type='hamming')
spectrum = Spectrum() # we just want the magnitude spectrum
mfcc = MFCC()
# MFCC returns (mel band energies, MFCC coefficients); keep the coefficients
mfccs = array([mfcc(spectrum(hamming_window(frame)))[1]
               for frame in FrameGenerator(x, frameSize=1024, hopSize=512)])
print(mfccs.shape)
(260, 13)
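For intuition about what an MFCC computation involves, the following is a simplified NumPy-only sketch: window the frame, take the power spectrum, pool it into triangular mel-spaced bands, take the log, and apply a DCT. It is not Essentia's implementation, and its numbers will not match the output of MFCC() above. The helper names (hz_to_mel, mel_filterbank, dct_ii, simple_mfcc), the 40-band filterbank, and the particular mel formula are assumptions chosen for illustration.

```python
import numpy as np

def hz_to_mel(f):
    # One common mel-scale formula (an assumption; other variants exist).
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_bands, n_bins, sr):
    """Triangular filters evenly spaced on the mel scale, shape (n_bands, n_bins)."""
    bin_freqs = np.linspace(0.0, sr / 2.0, n_bins)
    edges = mel_to_hz(np.linspace(0.0, hz_to_mel(sr / 2.0), n_bands + 2))
    bank = np.zeros((n_bands, n_bins))
    for i in range(n_bands):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (bin_freqs - lo) / (mid - lo)
        falling = (hi - bin_freqs) / (hi - mid)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def dct_ii(v, n_coeffs):
    """Unnormalized DCT-II of a 1-D vector, keeping the first n_coeffs terms."""
    n = len(v)
    k = np.arange(n_coeffs)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2.0 * n))
    return basis.dot(v)

def simple_mfcc(frame, sr, n_bands=40, n_coeffs=13):
    windowed = frame * np.hamming(len(frame))
    mag = np.abs(np.fft.rfft(windowed))
    band_energy = mel_filterbank(n_bands, len(mag), sr).dot(mag ** 2)
    log_energy = np.log(band_energy + 1e-10)   # small offset avoids log(0)
    return dct_ii(log_energy, n_coeffs)

# Usage on a synthetic noise frame, and on the same frame played twice as loud.
rng = np.random.RandomState(0)
frame = rng.randn(1024)
coeffs = simple_mfcc(frame, sr=44100)
louder = simple_mfcc(2.0 * frame, sr=44100)
print(coeffs.shape)
```

In this sketch, scaling the frame adds the same constant to every log band energy, so only the 0th coefficient changes while the remaining coefficients stay essentially the same. That is one reason the 0th coefficient is often left out when the goal is to describe spectral shape rather than level.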
Display MFCCs over time:
imshow(mfccs[:,1:].T, origin='lower', aspect='auto', interpolation='nearest') # Ignore the 0th MFCC
yticks(range(12), range(1,13)) # Ignore the 0th MFCC
ylabel('MFCC Coefficient Index')
xlabel('Frame Index')
In Essentia, the Centroid algorithm can be used to compute either the temporal or spectral centroid of a frame.
from essentia.standard import Centroid, Spectrum, Windowing, FrameGenerator
hamming_window = Windowing(type='hamming')
spectrum = Spectrum() # we just want the magnitude spectrum
centroid = Centroid(range=22050) # range = Nyquist frequency (44100/2), so the output is in Hz
centroids = array([centroid(spectrum(hamming_window(frame)))
                   for frame in FrameGenerator(x, frameSize=2048, hopSize=1024)])
plot(centroids)
ylabel('Spectral Centroid (Hz)')
xlabel('Frame Index')
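As a rough illustration of what the spectral centroid measures, the sketch below computes it directly with NumPy as the magnitude-weighted mean frequency of a frame. The function name spectral_centroid_hz and the two synthetic test tones are hypothetical, and the values are not expected to match Essentia's output exactly.

```python
import numpy as np

def spectral_centroid_hz(frame, sr):
    """Magnitude-weighted mean frequency of one frame, in Hz."""
    windowed = frame * np.hamming(len(frame))
    mag = np.abs(np.fft.rfft(windowed))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    total = mag.sum()
    if total == 0:
        return 0.0   # silent frame: the centroid is undefined, return 0 by convention
    return float((freqs * mag).sum() / total)

sr = 44100
n = np.arange(2048)
low = np.sin(2 * np.pi * 440.0 * n / sr)     # lower-pitched tone
high = np.sin(2 * np.pi * 4400.0 * n / sr)   # higher-pitched tone

c_low = spectral_centroid_hz(low, sr)
c_high = spectral_centroid_hz(high, sr)
print(c_low, c_high)
```

The higher tone should give the larger centroid. Because every bin contributes in proportion to its magnitude, spectral leakage pulls the result away from the tone's exact frequency, so the centroid is better read as a "center of mass" of the spectrum than as a pitch estimate.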