In one of the previous posts, I analyzed a guitar sample of a low E-string ringing. In this article, I'll try to synthesize a guitar sound from the frequency analysis of that real guitar tone. The hypothesis I'm going to test is that a convincing tone can be rebuilt from just two ingredients: the amplitudes of the harmonics and their decay rates.
from scipy.io import wavfile
(rate, tone) = wavfile.read('files/Low_tone.wav') #rate is the sampling frequency
samples = len(tone)
samples
120800
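That sample count corresponds to roughly 15 seconds of audio. A quick sanity check, assuming the 8 kHz sampling rate implied by the 4 kHz Nyquist frequency seen below (the `rate` variable above holds the real value):

```python
# Sanity check: recording duration in seconds, assuming rate = 8000 Hz
samples = 120800
rate = 8000
duration = samples / float(rate)
print(duration)  # 15.1
```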
NFFT = 1024 # the length of the windowing segments
# Pxx is the segments x freqs array of instantaneous power, freqs is
# the frequency vector, bins are the centers of the time bins in which
# the power is computed, and im is the matplotlib.image.AxesImage
# instance
figure(figsize=(10, 8))
ax1 = subplot(211)
plot(linspace(0, 1, samples) * samples / rate, tone)
xlabel('time (seconds)')
subplot(212, sharex=ax1)
Pxx, freqs, bins, im = specgram(tone, NFFT=NFFT, Fs=rate, noverlap=900,
cmap=cm.gist_heat)
xlabel('time (seconds)')
As one can see in the previous spectrogram, there are a lot of harmonics. In fact, we can see harmonics all the way up to the Nyquist frequency of 4 kHz. As the fundamental is at roughly 82 Hz, that's about $4000 / 82 \approx 49$ harmonics.
4000/82.
48.78048780487805
What we need to come up with is two things: an amplitude distribution between harmonics and their decay rate.
Let's look at the first frame of our spectrogram. How do we find the start? We can calculate the energy of each frame and look where it takes off.
print Pxx.shape, freqs.shape
(513, 966) (513,)
energy = sum(Pxx ** 2, axis=0)
energy.shape
(966,)
figure(figsize=(10, 4))
plot(bins, energy)
xlim(0, 4)
xticks(arange(0, 4, 0.25));
grid(True)
xlabel("time (s)");
From the previous graph, we know that the sound starts around 0.75 seconds. Let's take a closer look at the frequency content around that time. First, we need the right index.
argmax(bins > 0.75)
45
Now, we can plot the frequencies in the FFT at that point in time.
ampl = Pxx[:, 45]
plot(freqs, ampl)
xlabel("frequency (Hz)")
title("Initial frequency content of the low E-string")
On the previous figure, we can't really see the structure of the harmonics because of the linear y-axis scale. Let's see how that looks in log scale.
plot(freqs, log(ampl))
xlabel("frequency (Hz)")
title("Initial frequency content of the low E-string")
<matplotlib.text.Text at 0x97092f0>
We now see a lot more spikes, which is exactly what we expect. Let's see if we can find an envelope for that curve and just fit the structure of the frequencies on the envelope.
from scipy.signal import hilbert
signal = log(ampl)
envelope = abs(hilbert(signal))  # the envelope is the magnitude of the analytic signal, not its real part
plot(freqs, signal, label='original curve')
plot(freqs, envelope, label='hilbert envelope')
legend()
Apparently, this is not working very well. Instead, I'm going to extract the amplitudes at the harmonic frequencies $f_n = n f_0$, where $f_0$ is the fundamental frequency of our signal.
plot(freqs, log(ampl))
xlabel("frequency (Hz)")
title("Initial frequency content of the low E-string, zoomed on lower frequencies")
xlim(0, 600)
In our case, the frequency resolution of our plot is the following:
df = freqs[1] - freqs[0]
df
7.8125
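This number is simply the FFT bin spacing, i.e. the sampling rate divided by the window length. A quick check, again assuming an 8 kHz sampling rate:

```python
# The spectrogram's frequency resolution is rate / NFFT
rate = 8000   # assumed sampling rate of the recording
NFFT = 1024   # window length passed to specgram above
df = rate / float(NFFT)
print(df)  # 7.8125
```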
As this is a rather coarse resolution, I'm going to interpolate between the curve points with a cubic spline to make sure I don't miss an amplitude peak. I'm taking code from the interpolation cookbook recipes on the SciPy site.
from scipy.signal import cspline1d, cspline1d_eval
harmonics_freqs = arange(82.5, 4000, 82.5)
cspline = cspline1d(ampl)
harmonics_ampl = cspline1d_eval(cspline, harmonics_freqs, dx=df,x0=freqs[0])
plot(harmonics_freqs, log(harmonics_ampl), 'o')
plot(freqs, log(ampl))
This looks like most frequency amplitudes are matched, with some small errors. Let's now write a function that generates a sound with those amplitudes (normalizing them to a maximum of 1).
def gen_harmonic_sound(t, harmonics_freqs, harmonics_ampl):
    # normalizing step
    harmonics_ampl = harmonics_ampl / max(harmonics_ampl)
    signal = zeros(t.shape)
    for i in range(len(harmonics_freqs)):
        signal += harmonics_ampl[i] * sin(2 * pi * harmonics_freqs[i] * t)
    return signal
t = arange(0, 5, 1/8000.)
signal = gen_harmonic_sound(t, harmonics_freqs, harmonics_ampl)
plot(t, signal)
xlim(0, 0.5)
We can now play this signal in the HTML player.
import StringIO
import base64
import struct
import sys
from IPython.core.display import display, HTML

def wavPlayer(data, rate):
    """Display an HTML5 audio player for compatible browsers.

    The browser needs to know how to play wav through HTML5.
    There is no autoplay, to prevent the file from playing when the browser opens.
    Adapted from SciPy.io.
    """
    buffer = StringIO.StringIO()
    # RIFF header; the file size at offset 4 is patched in at the end
    buffer.write(b'RIFF')
    buffer.write(b'\x00\x00\x00\x00')
    buffer.write(b'WAVE')
    # fmt chunk
    buffer.write(b'fmt ')
    if data.ndim == 1:
        noc = 1
    else:
        noc = data.shape[1]
    bits = data.dtype.itemsize * 8
    sbytes = rate * (bits // 8) * noc
    ba = noc * (bits // 8)
    buffer.write(struct.pack('<ihHIIHH', 16, 1, noc, rate, sbytes, ba, bits))
    # data chunk
    buffer.write(b'data')
    buffer.write(struct.pack('<i', data.nbytes))
    if data.dtype.byteorder == '>' or (data.dtype.byteorder == '=' and sys.byteorder == 'big'):
        data = data.byteswap()
    buffer.write(data.tostring())
    # Determine the file size and place it at the correct
    # position at the start of the file.
    size = buffer.tell()
    buffer.seek(4)
    buffer.write(struct.pack('<i', size - 8))
    val = buffer.getvalue()
    src = """
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <title>Simple Test</title>
    </head>
    <body>
    <audio controls="controls" style="width:600px" >
      <source controls src="data:audio/wav;base64,{base64}" type="audio/wav" />
      Your browser does not support the audio element.
    </audio>
    </body>
    """.format(base64=base64.encodestring(val))
    display(HTML(src))
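As a side note, the player above relies on Python 2's `StringIO` and `base64.encodestring`. On Python 3, the same WAV-in-a-data-URI trick can be sketched with the standard `wave` module; `wav_data_uri` is a hypothetical helper name, not part of the original post:

```python
import base64
import io
import wave

import numpy as np

def wav_data_uri(data, rate):
    """Encode 16-bit mono PCM samples as a data: URI for an <audio> tag."""
    buf = io.BytesIO()
    with wave.open(buf, 'wb') as w:
        w.setnchannels(1)      # mono
        w.setsampwidth(2)      # 2 bytes per sample (int16)
        w.setframerate(rate)
        w.writeframes(data.astype('<i2').tobytes())
    b64 = base64.b64encode(buf.getvalue()).decode('ascii')
    return "data:audio/wav;base64," + b64

# Example: one second of a 440 Hz sine at 8 kHz
beep = (2 ** 13 * np.sin(2 * np.pi * 440 * np.arange(0, 1, 1 / 8000.))).astype(np.int16)
uri = wav_data_uri(beep, 8000)
```

The resulting string can be dropped directly into the `src` attribute of an `<audio>` source.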
wavPlayer((2 ** 13 * signal / max(abs(signal))).astype(np.int16), 8000)  # scale the float signal before casting to int16
Well, that doesn't sound like an E-string! Let's move on to the time decay, then.
There are a couple of things to model here. First, the sound of the E-string is damped over time. Second, the low and high frequencies are damped at different rates. So we'll first look at the global damping, and then at the frequency-by-frequency damping.
figure(figsize=(10, 4))
plot(bins, energy)
xlim(0, 4)
xticks(arange(0, 4, 0.25));
grid(True)
xlabel("time (s)");
From what we know about dissipation, we expect the damping to be exponential. Let's therefore fit an exponential function of time to the decaying part of the signal, $f(t) = \exp(-t/\tau)$, where $\tau$ is the characteristic decay time of our signal. We'll borrow some code from this SciPy tutorial.
from scipy import optimize
selection = bins > 0.75
fit_time = bins[selection]
fit_points = energy[selection]
fitfunc = lambda p, x: p[0]*exp(-p[1] * x) # Target function
errfunc = lambda p, x, y: fitfunc(p, x) - y # Distance to the target function
p0 = [1e11, -1.] # Initial guess for the parameters
p1, success = optimize.leastsq(errfunc, p0[:], args=(fit_time, fit_points))
time = linspace(fit_time[0], fit_time[-1], num=50)
plot(fit_time, fit_points, label='data') # Plot of the data
plot(time, fitfunc(p1, time), label = 'fit') # and the fit
legend()
p1
array([ 9.15646841e+11, 2.16959692e+00])
The fit gives a decay parameter of 2.17. Note that in the model $p_0 e^{-p_1 t}$, $p_1$ is a decay rate in s$^{-1}$, so the characteristic decay time of the energy is $1/2.17 \approx 0.46$ seconds.
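As an aside, the same fit can be done with `scipy.optimize.curve_fit`, which wraps the least-squares machinery in a friendlier interface. A minimal sketch on synthetic data (a stand-in for `fit_time` and `fit_points`, which only live inside the notebook; the parameter values are made up to mimic the fit above):

```python
import numpy as np
from scipy.optimize import curve_fit

def decay_model(t, a, k):
    # same model as fitfunc above: a * exp(-k * t)
    return a * np.exp(-k * t)

# Synthetic stand-in for (fit_time, fit_points) with 5% multiplicative noise
rng = np.random.RandomState(0)
t = np.linspace(0, 3, 200)
y = decay_model(t, 9.2e11, 2.17) * (1 + 0.05 * rng.standard_normal(t.size))

popt, pcov = curve_fit(decay_model, t, y, p0=[1e11, 1.0])
print(popt)  # close to [9.2e11, 2.17]
```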
We can now apply this model to the sine wave of our previous section.
t = arange(0, 5, 1/8000.)
signal = gen_harmonic_sound(t, harmonics_freqs, harmonics_ampl)
signal = signal * fitfunc(p1, t)
signal = signal / norm(signal, ord=inf) * 32000
signal = signal.astype(np.int16)
wavPlayer(signal, 8000)
plot(t, signal)
Interestingly, this starts to sound like something (but more like a bell!).
NFFT = 1024 # the length of the windowing segments
# Pxx is the segments x freqs array of instantaneous power, freqs is
# the frequency vector, bins are the centers of the time bins in which
# the power is computed, and im is the matplotlib.image.AxesImage
# instance
figure(figsize=(10, 8))
ax1 = subplot(211)
plot(linspace(0, 1, samples) * samples / rate, tone)
xlabel('time (seconds)')
subplot(212, sharex=ax1)
Pxx, freqs, bins, im = specgram(tone, NFFT=NFFT, Fs=rate, noverlap=900,
cmap=cm.gist_heat)
xlabel('time (seconds)')
As one can see on the spectrogram, the higher the frequency, the faster the decay. Why not look at the decay line by line for given frequencies? Let's start with the one that has the most power. First, let's check where the maximum power is.
figure(figsize=(10, 4))
plot(freqs, ampl)
xlabel("frequency (Hz)")
title("Initial frequency content of the low E-string")
xlim(0, 1000)
xticks(arange(0, 1000, 50));
plot(arange(82, 1000, 82), 100000 * ones(arange(82, 1000, 82).shape), 'o')
Apparently, this is the 4th harmonic, with a theoretical frequency of $4 \times 82.5 = 330$ Hz.
argmax(freqs >= 330)
43
argmax(freqs >= 2* 82.5)
22
plot(Pxx[22, :], label='2nd harmonic')
plot(Pxx[43, :], label='4th harmonic')
legend()
Clearly, they show different decay profiles. Let's build a function that allows us to compare the decay.
def plot_decay(harmonic):
    ind = argmax(freqs >= harmonic * 82.5)
    decay = Pxx[ind, :]
    decay = decay / max(decay)
    label = "harmonic %s" % harmonic
    plot(bins, decay, label=label)
plot_decay(2)
plot_decay(4)
plot_decay(5)
plot_decay(7)
plot_decay(10)
legend()
xlabel("time (s)")
xlim(0, 8)
Let's plot the same curves as before, but overlay the exponential decay calculated in the previous step.
p1
array([ 9.15646841e+11, 2.16959692e+00])
p2 = array([1., 1.])
normed_fitfunc = lambda p, x: fitfunc(p, x) / p[0]
figure(figsize=(16, 4))
fit_values = normed_fitfunc(p1, bins)
plot(bins+0.75, fit_values, label='exp decay, rate 2.17 /s')
plot(bins+0.75, normed_fitfunc(p2, bins), label='exp decay t=1 s')
plot_decay(2)
plot_decay(4)
plot_decay(5)
plot_decay(7)
plot_decay(10)
plot_decay(20)
legend()
xlabel("time (s)")
xlim(0, 8)
Let's play a guitar chord with a simple frequency based decay rule:
def gen_harmonic_sound_simple_decay(t, harmonics_freqs, harmonics_ampl):
    # normalizing step
    harmonics_ampl = harmonics_ampl / max(harmonics_ampl)
    signal = zeros(t.shape)
    for i in range(len(harmonics_freqs)):
        if i < 5:
            signal += harmonics_ampl[i] * sin(2 * pi * harmonics_freqs[i] * t) * exp(-t / 2.17)
        else:
            signal += harmonics_ampl[i] * sin(2 * pi * harmonics_freqs[i] * t) * exp(-t / 1.0)
    return signal
t = arange(0, 5, 1/8000.)
signal = gen_harmonic_sound_simple_decay(t, harmonics_freqs, harmonics_ampl)
signal = signal * fitfunc(p1, t)
signal = signal / norm(signal, ord=inf) * 32000
signal = signal.astype(np.int16)
wavPlayer(signal, 8000)
Sounds good!
What about a more elaborate rule, such as a decay time that varies linearly with the harmonic order?
figure(figsize=(16, 4))
for harmonic in [1, 2, 4, 5, 10]:
    plot_decay(harmonic)
    decay_time = 2.17 / 5. * harmonic
    label = "exp decay tau=%.2f" % (decay_time)
    plot(bins + 0.75, normed_fitfunc([1.0, decay_time], bins), label=label)
legend()
print decay_time
0.434
def gen_harmonic_sound_linear_decay(t, harmonics_freqs, harmonics_ampl):
    # normalizing step
    harmonics_ampl = harmonics_ampl / max(harmonics_ampl)
    signal = zeros(t.shape)
    for i in range(len(harmonics_freqs)):
        # use i + 1 so the fundamental (i = 0) doesn't get a zero decay time
        decay_time = 2.17 / 4.5 * (i + 1)
        signal += harmonics_ampl[i] * sin(2 * pi * harmonics_freqs[i] * t) * exp(-t / decay_time)
    return signal
t = arange(0, 5, 1/8000.)
signal = gen_harmonic_sound_linear_decay(t, harmonics_freqs, harmonics_ampl)
plot(t, signal)
signal = (15000*signal).astype(np.int16)
wavPlayer(signal, 8000)
Not bad at all!