In one of the previous posts, I analyzed a guitar sample of a low E-string ringing. In this article, I'll try to synthesize a guitar sound from the frequency analysis of that real guitar tone. The hypothesis I'm going to test is that a convincing tone can be rebuilt from just two ingredients: the amplitudes of the harmonics and their decay rates.
from scipy.io import wavfile
(rate, tone) = wavfile.read('files/Low_tone.wav') #rate is the sampling frequency
samples = len(tone)
samples
120800
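That sample count corresponds to roughly 15 seconds of audio. A quick sanity check, assuming the 8 kHz sampling rate implied by the 4 kHz Nyquist frequency seen below (the `rate` variable above holds the real value):

```python
# Sanity check: recording duration in seconds, assuming rate = 8000 Hz
samples = 120800
rate = 8000
duration = samples / float(rate)
print(duration)  # 15.1
```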
NFFT = 1024 # the length of the windowing segments
# Pxx is the segments x freqs array of instantaneous power, freqs is
# the frequency vector, bins are the centers of the time bins in which
# the power is computed, and im is the matplotlib.image.AxesImage
# instance
figure(figsize=(10, 8))
ax1 = subplot(211)
plot(linspace(0, 1, samples) * samples / rate, tone)
xlabel('time (seconds)')
subplot(212, sharex=ax1)
Pxx, freqs, bins, im = specgram(tone, NFFT=NFFT, Fs=rate, noverlap=900,
cmap=cm.gist_heat)
xlabel('time (seconds)')
As one can see in the previous spectrogram, there are a lot of harmonics. In fact, we can see harmonics all the way up to the Nyquist frequency of 4 kHz. As the fundamental is at roughly 82 Hz, that's about $4000 / 82 \approx 49$ harmonics.
4000/82.
48.78048780487805
What we need to come up with is two things: an amplitude distribution between harmonics and their decay rate.
Let's look at the first frame of our spectrogram. How do we find the start? We can calculate the energy of each frame and look where it takes off.
print Pxx.shape, freqs.shape
(513, 966) (513,)
energy = sum(Pxx ** 2, axis=0)
energy.shape
(966,)
figure(figsize=(10, 4))
plot(bins, energy)
xlim(0, 4)
xticks(arange(0, 4, 0.25));
grid(True)
xlabel("time (s)");
From the previous graph, we know that the sound starts around 0.75 seconds. Let's take a closer look at the frequency content around that time. First, we need the right index.
argmax(bins > 0.75)
45
Now, we can plot the frequencies in the FFT at that point in time.
ampl = Pxx[:, 45]
plot(freqs, ampl)
xlabel("frequency (Hz)")
title("Initial frequency content of the low E-string")
On the previous figure, we can't really see the structure of the harmonics because of the linear y-axis scale. Let's see how that looks in log scale.
plot(freqs, log(ampl))
xlabel("frequency (Hz)")
title("Initial frequency content of the low E-string")
<matplotlib.text.Text at 0x97092f0>
We now see a lot more spikes, which is exactly what we expect. Let's see if we can find an envelope for that curve and just fit the structure of the frequencies on the envelope.
from scipy.signal import hilbert
signal = log(ampl)
envelope = abs(hilbert(signal))  # the envelope is the magnitude of the analytic signal, not its real part
plot(freqs, signal, label='original curve')
plot(freqs, envelope, label='hilbert envelope')
legend()
Apparently, this is not working very well. Instead, I'm going to extract the amplitudes at the harmonic frequencies $f_n = n f_0$, where $f_0$ is the fundamental frequency of our signal.
plot(freqs, log(ampl))
xlabel("frequency (Hz)")
title("Initial frequency content of the low E-string, zoomed on lower frequencies")
xlim(0, 600)
In our case, the frequency resolution of our plot is the following:
df = freqs[1] - freqs[0]
df
7.8125
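This number is simply the FFT bin spacing, i.e. the sampling rate divided by the window length. A quick check, again assuming an 8 kHz sampling rate:

```python
# The spectrogram's frequency resolution is rate / NFFT
rate = 8000   # assumed sampling rate of the recording
NFFT = 1024   # window length passed to specgram above
df = rate / float(NFFT)
print(df)  # 7.8125
```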
As this is a rather coarse resolution, I'm going to interpolate between the curve points with a cubic spline to make sure I don't miss an amplitude peak. I'm taking code from the interpolation cookbook recipes on the SciPy site.
from scipy.signal import cspline1d, cspline1d_eval
harmonics_freqs = arange(82.5, 4000, 82.5)
cspline = cspline1d(ampl)
harmonics_ampl = cspline1d_eval(cspline, harmonics_freqs, dx=df,x0=freqs[0])
plot(harmonics_freqs, log(harmonics_ampl), 'o')
plot(freqs, log(ampl))
This looks like most frequency amplitudes are matched, with some small errors. Let's now write a function that generates a sound with those amplitudes (normalizing them to a maximum of 1).
def gen_harmonic_sound(t, harmonics_freqs, harmonics_ampl):
    # normalizing step
    harmonics_ampl = harmonics_ampl / max(harmonics_ampl)
    signal = zeros(t.shape)
    for i in range(len(harmonics_freqs)):
        signal += harmonics_ampl[i] * sin(2 * pi * harmonics_freqs[i] * t)
    return signal
t = arange(0, 5, 1/8000.)
signal = gen_harmonic_sound(t, harmonics_freqs, harmonics_ampl)
plot(t, signal)
xlim(0, 0.5)
We can now play this signal in the HTML player.
import StringIO
import base64
import struct
import sys
from IPython.core.display import display, HTML

def wavPlayer(data, rate):
    """Display an HTML5 audio player for compatible browsers.

    The browser needs to know how to play wav through HTML5.
    There is no autoplay, to prevent the file from playing when the browser opens.
    Adapted from SciPy.io.
    """
    buffer = StringIO.StringIO()
    # RIFF header; the file size at offset 4 is patched in at the end
    buffer.write(b'RIFF')
    buffer.write(b'\x00\x00\x00\x00')
    buffer.write(b'WAVE')
    # fmt chunk
    buffer.write(b'fmt ')
    if data.ndim == 1:
        noc = 1
    else:
        noc = data.shape[1]
    bits = data.dtype.itemsize * 8
    sbytes = rate * (bits // 8) * noc
    ba = noc * (bits // 8)
    buffer.write(struct.pack('<ihHIIHH', 16, 1, noc, rate, sbytes, ba, bits))
    # data chunk
    buffer.write(b'data')
    buffer.write(struct.pack('<i', data.nbytes))
    if data.dtype.byteorder == '>' or (data.dtype.byteorder == '=' and sys.byteorder == 'big'):
        data = data.byteswap()
    buffer.write(data.tostring())
    # Determine the file size and place it at the correct
    # position at the start of the file.
    size = buffer.tell()
    buffer.seek(4)
    buffer.write(struct.pack('<i', size - 8))
    val = buffer.getvalue()
    src = """
    <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <title>Simple Test</title>
    </head>
    <body>
    <audio controls="controls" style="width:600px" >
      <source controls src="data:audio/wav;base64,{base64}" type="audio/wav" />
      Your browser does not support the audio element.
    </audio>
    </body>
    """.format(base64=base64.encodestring(val))
    display(HTML(src))
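As a side note, the player above relies on Python 2's `StringIO` and `base64.encodestring`. On Python 3, the same WAV-in-a-data-URI trick can be sketched with the standard `wave` module; `wav_data_uri` is a hypothetical helper name, not part of the original post:

```python
import base64
import io
import wave

import numpy as np

def wav_data_uri(data, rate):
    """Encode 16-bit mono PCM samples as a data: URI for an <audio> tag."""
    buf = io.BytesIO()
    with wave.open(buf, 'wb') as w:
        w.setnchannels(1)      # mono
        w.setsampwidth(2)      # 2 bytes per sample (int16)
        w.setframerate(rate)
        w.writeframes(data.astype('<i2').tobytes())
    b64 = base64.b64encode(buf.getvalue()).decode('ascii')
    return "data:audio/wav;base64," + b64

# Example: one second of a 440 Hz sine at 8 kHz
beep = (2 ** 13 * np.sin(2 * np.pi * 440 * np.arange(0, 1, 1 / 8000.))).astype(np.int16)
uri = wav_data_uri(beep, 8000)
```

The resulting string can be dropped directly into the `src` attribute of an `<audio>` source.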
wavPlayer((2 ** 13 * signal / max(abs(signal))).astype(np.int16), 8000)  # scale the float signal before casting to int16
Well, that doesn't sound like an E-string! Let's move on to the time decay, then.
There are a couple of things to model here. First, the sound of the E-string is damped over time. Second, the low and high frequencies are damped at different rates. So we'll first look at the global damping, and then at the frequency-by-frequency damping.
figure(figsize=(10, 4))
plot(bins, energy)
xlim(0, 4)
xticks(arange(0, 4, 0.25));
grid(True)
xlabel("time (s)");
From what we know about dissipation, we expect the damping to be exponential. Let's therefore fit an exponential function of time to the decaying part of the signal, $f(t) = \exp(-t/\tau)$, where $\tau$ is the characteristic decay time of our signal. We'll borrow some code from this SciPy tutorial.
from scipy import optimize
selection = bins > 0.75
fit_time = bins[selection]
fit_points = energy[selection]
fitfunc = lambda p, x: p[0]*exp(-p[1] * x) # Target function
errfunc = lambda p, x, y: fitfunc(p, x) - y # Distance to the target function
p0 = [1e11, -1.] # Initial guess for the parameters
p1, success = optimize.leastsq(errfunc, p0[:], args=(fit_time, fit_points))
time = linspace(fit_time[0], fit_time[-1], num=50)
plot(fit_time, fit_points, label='data') # Plot of the data
plot(time, fitfunc(p1, time), label = 'fit') # and the fit
legend()
p1
array([ 9.15646841e+11, 2.16959692e+00])
The fit gives a decay parameter of 2.17. Note that in the model $p_0 e^{-p_1 t}$, $p_1$ is a decay rate in s$^{-1}$, so the characteristic decay time of the energy is $1/2.17 \approx 0.46$ seconds.
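As an aside, the same fit can be done with `scipy.optimize.curve_fit`, which wraps the least-squares machinery in a friendlier interface. A minimal sketch on synthetic data (a stand-in for `fit_time` and `fit_points`, which only live inside the notebook; the parameter values are made up to mimic the fit above):

```python
import numpy as np
from scipy.optimize import curve_fit

def decay_model(t, a, k):
    # same model as fitfunc above: a * exp(-k * t)
    return a * np.exp(-k * t)

# Synthetic stand-in for (fit_time, fit_points) with 5% multiplicative noise
rng = np.random.RandomState(0)
t = np.linspace(0, 3, 200)
y = decay_model(t, 9.2e11, 2.17) * (1 + 0.05 * rng.standard_normal(t.size))

popt, pcov = curve_fit(decay_model, t, y, p0=[1e11, 1.0])
print(popt)  # close to [9.2e11, 2.17]
```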
We can now apply this model to the sine wave of our previous section.
t = arange(0, 5, 1/8000.)
signal = gen_harmonic_sound(t, harmonics_freqs, harmonics_ampl)
signal = signal * fitfunc(p1, t)
signal = signal / norm(signal, ord=inf) * 32000
signal = signal.astype(np.int16)
wavPlayer(signal, 8000)
plot(t, signal)
Interestingly, this starts to sound like something (but more like a bell!).
NFFT = 1024 # the length of the windowing segments
# Pxx is the segments x freqs array of instantaneous power, freqs is
# the frequency vector, bins are the centers of the time bins in which
# the power is computed, and im is the matplotlib.image.AxesImage
# instance
figure(figsize=(10, 8))
ax1 = subplot(211)
plot(linspace(0, 1, samples) * samples / rate, tone)
xlabel('time (seconds)')
subplot(212, sharex=ax1)
Pxx, freqs, bins, im = specgram(tone, NFFT=NFFT, Fs=rate, noverlap=900,
cmap=cm.gist_heat)
xlabel('time (seconds)')
As one can see on the spectrogram, the higher the frequency, the faster the decay. Why not look at the decay line by line for given frequencies? Let's start with the one that has the most power. First, let's check where the maximum power is.
figure(figsize=(10, 4))
plot(freqs, ampl)
xlabel("frequency (Hz)")
title("Initial frequency content of the low E-string")
xlim(0, 1000)
xticks(arange(0, 1000, 50));
plot(arange(82, 1000, 82), 100000 * ones(arange(82, 1000, 82).shape), 'o')
Apparently, this is the 4th harmonic, with a theoretical frequency of $4 \times 82.5 = 330$ Hz.
argmax(freqs >= 330)
43
argmax(freqs >= 2* 82.5)
22
plot(Pxx[22, :], label='2nd harmonic')
plot(Pxx[43, :], label='4th harmonic')
legend()
Clearly, they show different decay profiles. Let's build a function that allows us to compare the decay.
def plot_decay(harmonic):
    ind = argmax(freqs >= harmonic * 82.5)
    decay = Pxx[ind, :]
    decay = decay / max(decay)
    label = "harmonic %s" % harmonic
    plot(bins, decay, label=label)
plot_decay(2)
plot_decay(4)
plot_decay(5)
plot_decay(7)
plot_decay(10)
legend()
xlabel("time (s)")
xlim(0, 8)
Let's plot the same curves as before, but overlay the exponential decay calculated in the previous step.
p1
array([ 9.15646841e+11, 2.16959692e+00])
p2 = array([1., 1.])
normed_fitfunc = lambda p, x: fitfunc(p, x) / p[0]
figure(figsize=(16, 4))
fit_values = normed_fitfunc(p1, bins)
plot(bins+0.75, fit_values, label='exp decay, rate 2.17 /s')
plot(bins+0.75, normed_fitfunc(p2, bins), label='exp decay t=1 s')
plot_decay(2)
plot_decay(4)
plot_decay(5)
plot_decay(7)
plot_decay(10)
plot_decay(20)
legend()
xlabel("time (s)")
xlim(0, 8)
Let's play a guitar chord with a simple frequency based decay rule:
def gen_harmonic_sound_simple_decay(t, harmonics_freqs, harmonics_ampl):
    # normalizing step
    harmonics_ampl = harmonics_ampl / max(harmonics_ampl)
    signal = zeros(t.shape)
    for i in range(len(harmonics_freqs)):
        if i < 5:
            signal += harmonics_ampl[i] * sin(2 * pi * harmonics_freqs[i] * t) * exp(-t / 2.17)
        else:
            signal += harmonics_ampl[i] * sin(2 * pi * harmonics_freqs[i] * t) * exp(-t / 1.0)
    return signal
t = arange(0, 5, 1/8000.)
signal = gen_harmonic_sound_simple_decay(t, harmonics_freqs, harmonics_ampl)
signal = signal * fitfunc(p1, t)
signal = signal / norm(signal, ord=inf) * 32000
signal = signal.astype(np.int16)
wavPlayer(signal, 8000)
Sounds good!
What about a more elaborate rule, such as a decay time that varies linearly with the harmonic order?
figure(figsize=(16, 4))
for harmonic in [1, 2, 4, 5, 10]:
    plot_decay(harmonic)
    decay_time = 2.17 / 5. * harmonic
    label = "exp decay tau=%.2f" % (decay_time)
    plot(bins + 0.75, normed_fitfunc([1.0, decay_time], bins), label=label)
legend()
print decay_time
0.434
def gen_harmonic_sound_linear_decay(t, harmonics_freqs, harmonics_ampl):
    # normalizing step
    harmonics_ampl = harmonics_ampl / max(harmonics_ampl)
    signal = zeros(t.shape)
    for i in range(len(harmonics_freqs)):
        # use i + 1 so the fundamental (i = 0) doesn't get a zero decay time
        decay_time = 2.17 / 4.5 * (i + 1)
        signal += harmonics_ampl[i] * sin(2 * pi * harmonics_freqs[i] * t) * exp(-t / decay_time)
    return signal
t = arange(0, 5, 1/8000.)
signal = gen_harmonic_sound_linear_decay(t, harmonics_freqs, harmonics_ampl)
plot(t, signal)
signal = (15000*signal).astype(np.int16)
wavPlayer(signal, 8000)
Not bad at all!