In [1]:

%pylab inline
from __future__ import print_function
from __future__ import division

Populating the interactive namespace from numpy and matplotlib

Fourier Analysis

The Fourier Transform

Fourier's theorem:

Any periodic function can be written as a Fourier series, i.e. as a sum of harmonic phasors.

The Fourier transform applies the same idea to a signal $s(t)$, measuring how much of each frequency $f$ it contains:

$$S(f) = \int_{-\infty}^{\infty} s(t) \cdot e^{- i 2\pi f t} dt$$

This multiplies the input function $s(t)$ by a phasor of frequency $f$ and integrates the product over all time, so the result $S(f)$ is a function of the chosen frequency.

This continuous version calculates the function for infinite time, and for a continuous and infinite set of frequencies.

To discretize, we need to make the time range and the set of frequencies be both finite and discrete.

The Discrete Fourier Transform

When working with discrete data, you need to use the Discrete Fourier Transform (DFT):

$$X_k = \sum_{n=0}^{N-1} x_n \cdot e^{-i 2 \pi k n / N}$$

http://en.wikipedia.org/wiki/Discrete_Fourier_transform

$$X_k = \sum_{n=0}^{N-1} x_n \cdot [\cos(-2 \pi k n / N) + i \sin(-2 \pi k n / N)]$$

$$X_k = \sum_{n=0}^{N-1} x_n \cdot [\cos(2 \pi k n / N) - i \sin(2 \pi k n / N)]$$
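As a rough sketch of the definition above, the sum can be evaluated directly and compared with NumPy's FFT. The helper name `naive_dft` is made up for this illustration, and the direct evaluation costs on the order of $N^2$ operations, so it is only meant for checking small inputs.

```python
import numpy as np

def naive_dft(x):
    """Direct evaluation of X_k = sum_n x_n * exp(-2j*pi*k*n/N)."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    n = np.arange(N)
    k = n.reshape((N, 1))                      # one row per output bin k
    phasors = np.exp(-2j * np.pi * k * n / N)  # N x N matrix of phasors
    return phasors.dot(x)

# Same test signal as in the notebook: 10 oscillations in 512 points
x = np.sin(np.linspace(0, 10 * 2 * np.pi, 512, endpoint=False))
X = naive_dft(x)
matches_numpy = np.allclose(X, np.fft.fft(x))
```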

In [2]:

phs = linspace(0, 10 * 2 * pi, 512, endpoint=False)
x = sin(phs)
plot(phs, x)
title('10 Oscillations in 512 points');

Now calculate the Fourier transform for bin $k=1$ :

In [3]:

phasor_phs = linspace(0, 2 * pi, 512, endpoint=False)
plot(x)
plot(sin(phasor_phs))
plot(cos(phasor_phs));

For $k=1$ and $N=512$:

$$X_1 = \sum_{n=0}^{511} x_n \cdot [\cos(2 \pi n / 512) - i \sin(2 \pi n / 512)]$$

In [4]:

subplot(121)
plot(x*sin(phasor_phs))
fill_between(arange(512), x*sin(phasor_phs))

subplot(122)
plot(x*cos(phasor_phs))
fill_between(arange(512), x*cos(phasor_phs))

gcf().set_figwidth(10)

print(sum(x*sin(phasor_phs)), sum(x*cos(phasor_phs)))

-6.54554535573e-15 1.55292445569e-14

We keep going for all values of $k$, e.g. $k=9$ :

$$X_9 = \sum_{n=0}^{511} x_n \cdot [\cos(2 \pi \cdot 9 \cdot n / 512) - i \sin(2 \pi \cdot 9 \cdot n / 512)]$$

In [5]:

subplot(121)
plot(x*sin(9*phasor_phs))
fill_between(arange(512), x*sin(9*phasor_phs))
subplot(122)
plot(x*cos(9*phasor_phs))
fill_between(arange(512), x*cos(9*phasor_phs))

gcf().set_figwidth(10)

print(sum(x*sin(9*phasor_phs)), sum(x*cos(9*phasor_phs)))

-2.88814111515e-14 4.35762537165e-15

And $k=10\ $ :

In [7]:

k = 10
subplot(121)
plot(x*sin(k*phasor_phs))
fill_between(arange(512), x*sin(k*phasor_phs))
subplot(122)
plot(x*cos(k*phasor_phs))
fill_between(arange(512), x*cos(k*phasor_phs))

gcf().set_figwidth(16)

print(sum(x*sin(k*phasor_phs)), sum(x*cos(k*phasor_phs)))

256.0 -9.42301792151e-15
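The value 256 is $N/2$: when the phasor frequency matches the signal, the product is $\sin^2$, which averages to $1/2$ over whole periods, while every other bin sums to (numerically) zero. A small sketch of that check, using a hypothetical helper `correlate_with_bin`:

```python
import numpy as np

N = 512
n = np.arange(N)
x = np.sin(2 * np.pi * 10 * n / N)   # 10 oscillations in N points

def correlate_with_bin(x, k):
    """Return (sum of x*sin, sum of x*cos) for the phasor of bin k."""
    phase = 2 * np.pi * k * n / len(x)
    return np.sum(x * np.sin(phase)), np.sum(x * np.cos(phase))

sin_sum_9, cos_sum_9 = correlate_with_bin(x, 9)     # both close to 0
sin_sum_10, cos_sum_10 = correlate_with_bin(x, 10)  # sine part close to N/2
```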

Now the whole Fourier transform for all bins $0 \le k < N$:

In [8]:

phs = linspace(0, 10.0 * 2.0 * pi, 512, endpoint=False)
x = sin(phs)
x.dtype

Out[8]:

dtype('float64')

In [9]:

dft = []
for k in range(len(x)):
    bin_phs = linspace(0, k * 2.0 * pi, 512, endpoint=False)
    fft_bin = complex(sum(x*cos(bin_phs)),
                     -sum(x*sin(bin_phs)))
    dft.append(fft_bin)

subplot(121)
plot(real(dft))
title('Real part')

subplot(122)
plot(imag(dft))
title('Imaginary part')

gcf().set_figwidth(10)

The magnitude spectrum is the length of the vector in the complex plane, which is the absolute value of the complex number:

$$ Magnitude\ spectrum = |X_k|$$

The phase spectrum is the angle of the vector in the complex plane:

$$Phase\ spectrum = \angle X_k$$

In [10]:

phs = linspace(0, 10.0 * 2.0 * pi, 512)
x = sin(phs)
subplot(121)
plot(abs(array(dft)))
title('The magnitude spectrum')
subplot(122)
plot(angle(array(dft)));
title('The phase spectrum')
gcf().set_figwidth(10)

Using the fft function from the fft module in numpy:

In [23]:

phs = linspace(0, 10.0 * 2.0 * pi, 512, endpoint=False)
x = 0.5 * cos(phs + 0.3 * pi)
subplot(121)
plot(abs(fft.fft(x)))
title('The magnitude spectrum')
subplot(122)
plot(angle(fft.fft(x)));
title('The phase spectrum')
gcf().set_figwidth(10)

In [39]:

fft.fft(x)[10]

Out[39]:

(124.26501033599166-90.283814752124243j)

In [24]:

plot(angle(fft.fft(x)));
xlim((0, 20))

Out[24]:

(0, 20)

In [25]:

angle(fft.fft(x))[10]

Out[25]:

0.94247779607693505

In [26]:

0.3 * pi

Out[26]:

0.9424777960769379
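The match between the phase of bin 10 and $0.3\pi$ can be reproduced without plotting. This is a minimal sketch that rebuilds the cosine test signal and reads amplitude and phase back from one bin; the variable names `est_amp` and `est_phase` are only illustrative. Note that the phase is measured relative to a cosine, so a pure sine would show up as $-\pi/2$.

```python
import numpy as np

N = 512
n = np.arange(N)
amp, phase = 0.5, 0.3 * np.pi
x = amp * np.cos(2 * np.pi * 10 * n / N + phase)

X = np.fft.fft(x)
est_amp = np.abs(X[10]) / (N / 2.0)   # magnitude scaled by N/2
est_phase = np.angle(X[10])           # phase relative to a cosine
```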

Real DFTs

When the input to the DFT is real (no imaginary part), the second half of the transform is the complex conjugate of the first half in reverse order, i.e. $X_{N-k} = X_k^*$. The real part mirrors around the center, and the imaginary part mirrors with its sign changed.

You can think of this as happening because the DFT harmonic "phasors" wrap around with phase inversion at the Nyquist frequency, which is in the middle of the transform output.

$$X_k = \sum_{n=0}^{N-1} x_n \cdot [\cos(2 \pi k n / N) - i \sin(2 \pi k n / N)]$$

In [27]:

N = 128

k1 = 10
k2 = N - k1

subplot(121)
plot(sin(linspace(0 , 2 * pi * k1, N, endpoint=False)))
plot(sin(linspace(0 , 2 * pi * k2, N, endpoint=False)))
subplot(122)
plot(cos(linspace(0 , 2 * pi * k1, N, endpoint=False)))
plot(cos(linspace(0 , 2 * pi * k2, N, endpoint=False)))

gcf().set_figwidth(10)

The frequencies mirror around after the first half of the bins, both in frequency and in phase (the phase is inverted).

In [28]:

plot(abs(fft.fft(x)))
argsort(abs(fft.fft(x)))[-2:]

Out[28]:

array([ 10, 502])

The second half of the FFT is the reversed complex conjugate of the first.

This also shows as a reflection of the magnitude spectrum, and a phase spectrum that is reflected with its sign inverted.

This property is called Hermitian symmetry, i.e. the FFT of a real signal is Hermitian around its center.

The transform can be performed more quickly and can take up less memory if this property can be exploited (as in the case of audio signals).
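The symmetry can be checked numerically. The sketch below uses an arbitrary random real signal (an assumption for illustration, not the notebook's data) and verifies both that $X_{N-k}$ is the conjugate of $X_k$ and that `rfft` returns the first $N/2 + 1$ bins of the full transform.

```python
import numpy as np

rng = np.random.RandomState(0)   # fixed seed so the check is repeatable
x = rng.randn(128)               # any real-valued signal will do
N = len(x)

X = np.fft.fft(x)
k = np.arange(1, N)
is_hermitian = np.allclose(X[N - k], np.conj(X[k]))

half = np.fft.rfft(x)            # only the non-redundant half
rfft_is_first_half = np.allclose(half, X[:N // 2 + 1])
```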

In [29]:

plot(abs(fft.rfft(x)))

Out[29]:

[<matplotlib.lines.Line2D at 0x7f2ad0190e10>]

Now we only get half the spectrum because we already know the other half.

In [30]:

len(fft.rfft(x))

Out[30]:

257

In [31]:

len(x)

Out[31]:

512

The number of points we get from the real FFT is $\frac{N}{2} + 1$ (for even $N$).

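Scaling the DFT

Because each DFT bin is a sum over $N$ products, its magnitude grows with the number of points. For a real sinusoid the energy is split between bins $k$ and $N-k$, so dividing the magnitude spectrum by $N/2$ recovers the amplitude of the sinusoid.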

In [34]:

N = 512
phs = linspace(0, 10 * 2 * pi, 512, endpoint=False)
x = 0.6 * sin(phs + 0.3 * pi)

plot(abs(fft.rfft(x))/ (N/2))

Out[34]:

[<matplotlib.lines.Line2D at 0x7f2ad0c0e5d0>]
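A small numeric check of the $N/2$ scaling. The DC offset of 0.25 is added here only for illustration (it is not in the notebook's signal) to show that the DC bin, and the Nyquist bin for even $N$, have no mirrored partner and are therefore scaled by $N$ rather than $N/2$.

```python
import numpy as np

N = 512
n = np.arange(N)
# 0.6 amplitude sinusoid on bin 10, plus a hypothetical DC offset of 0.25
x = 0.25 + 0.6 * np.sin(2 * np.pi * 10 * n / N + 0.3 * np.pi)

X = np.fft.rfft(x)
scale = np.full(len(X), N / 2.0)  # bins with a mirrored partner
scale[0] = N                      # DC bin
scale[-1] = N                     # Nyquist bin (N is even here)
amps = np.abs(X) / scale
```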

In [41]:

plot(real(fft.rfft(x))/ (N/2))

Out[41]:

[<matplotlib.lines.Line2D at 0x7f2ad0f6a810>]

In [38]:

plot(imag(fft.rfft(x))/ (N/2))

Out[38]:

[<matplotlib.lines.Line2D at 0x7f2ad24c0590>]

In [42]:

N = 512
phs = linspace(0, 10 * 2 * pi, 512, endpoint=False)
x = 0.6 * sin(phs) + 0.4 * sin(phs*3)

plot(x)

Out[42]:

[<matplotlib.lines.Line2D at 0x7f2ad0f021d0>]

In [43]:

plot(abs(fft.rfft(x))/ (N/2))

Out[43]:

[<matplotlib.lines.Line2D at 0x7f2ad0ddcf50>]

But the x scale is not telling us much about frequency...

In [44]:

fw = linspace(0, 0.5, 257)
X = abs(fft.rfft(x))/ (N/2)
plot(fw, X)
title('Normalized frequency scale')

Out[44]:

<matplotlib.text.Text at 0x7f2ad0cdc050>

In [45]:

fw = linspace(0, pi, 257)
X = abs(fft.rfft(x))/ (N/2)
plot(fw, X)
xticks(linspace(0, pi, 5), ['0', '$\pi/4$', '$\pi/2$', '$3\pi/4$', '$\pi$']) ;

title('Radians frequency scale');

In [46]:

sr = 44100
nyquist = sr/2.0
fw = linspace(0, nyquist, 257, endpoint=True)
X = abs(fft.rfft(x))/ (N/2)
plot(fw, X)
title('Hz frequency scale');

$$f = \frac{f_0 f_s}{N}$$

where $f$ is the frequency in Hz, $f_0$ is the number of oscillations within the analysis window (the bin number), $f_s$ is the sampling rate and $N$ is the size of the window.

In [47]:

10.0 * float(sr)/512, 30.0 * float(sr)/512

Out[47]:

(861.328125, 2583.984375)
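The same frequency axis can be obtained from `numpy.fft.rfftfreq`, which returns the center frequency of each `rfft` bin. A minimal sketch, where `bin_to_hz` is a hypothetical helper that just restates the formula above:

```python
import numpy as np

sr = 44100
N = 512

# Frequencies in Hz of the N/2 + 1 bins returned by rfft
freqs = np.fft.rfftfreq(N, d=1.0 / sr)

def bin_to_hz(k, sr, N):
    """f = f_0 * f_s / N for bin number k."""
    return k * sr / float(N)
```

With these values, bins 10 and 30 land on the two frequencies computed above, and the last bin is the Nyquist frequency.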

In [50]:

sr = 44100
nyquist = sr/2.0
fw = linspace(0, nyquist, 257)
X = abs(fft.rfft(x))/ (N/2)
plot(fw, X, 'o-')
title('Hz frequency scale');
xlim((500, 2700))

Out[50]:

(500, 2700)

Power Spectrum

The power spectrum can be computed by squaring the magnitude spectrum:

$$|X_k|^2$$

In [51]:

plot(fw, X**2)
title('Power spectrum');
xlim((500, 2700))

Out[51]:

(500, 2700)

This results in a sort of "warping" of the amplitude scale, making peaks more pronounced, and lower level detail less visible. This can be a useful technique when trying to emphasize peaks.
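To put a number on this "warping", the sketch below rebuilds the two-component signal (amplitudes 0.6 and 0.4) and compares the ratio of the two peaks before and after squaring. The variable names are illustrative.

```python
import numpy as np

N = 512
n = np.arange(N)
x = 0.6 * np.sin(2 * np.pi * 10 * n / N) + 0.4 * np.sin(2 * np.pi * 30 * n / N)

mag = np.abs(np.fft.rfft(x)) / (N / 2.0)
power = mag ** 2

mag_ratio = mag[30] / mag[10]        # 0.4 / 0.6
power_ratio = power[30] / power[10]  # (0.4 / 0.6) squared
```

The weaker peak is two thirds of the stronger one in the magnitude spectrum, but only four ninths of it in the power spectrum.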

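Fast Fourier Transform

The Fast Fourier Transform (FFT) is an algorithm that computes the same result as the DFT with far fewer operations, on the order of $N \log N$ instead of $N^2$. The classic form of the optimization applies when:

The number of points for the analysis is a power of 2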

In [53]:

for i in range(20):
    print(2**i)
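When a frame length is not a power of two, one common approach is to round the FFT size up and let the transform zero-pad the input. The helper `next_pow2` below is a made-up name for this sketch; the `n` argument of `numpy.fft.rfft` pads the input with zeros when `n` is larger than the input length.

```python
import numpy as np

def next_pow2(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (int(n) - 1).bit_length()

x = np.random.RandomState(0).randn(1000)  # arbitrary 1000-sample frame
n_fft = next_pow2(len(x))                 # rounds 1000 up to 1024
X = np.fft.rfft(x, n=n_fft)               # input is zero-padded to n_fft
```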

Short-Time Fourier Transform

One of the key assumptions of the DFT is that the signal is stationary (its frequency content does not change) within the analysis frame. (This relates to the assumption of periodicity.)

A trick to extract a time-varying spectrum from a signal is to perform short DFTs, each starting a bit later than the previous one.

In [54]:

from scipy.io import wavfile
sr, signal = wavfile.read('passport.wav')

In [55]:

!aplay passport.wav

Playing WAVE 'passport.wav' : Signed 16 bit Little Endian, Rate 44100 Hz, Mono

In [56]:

win1 = signal[0:1024]
win2 = signal[1024:2048]
win3 = signal[2048: 3072]

plot(abs(fft.rfft(win1)))
plot(abs(fft.rfft(win2)))
plot(abs(fft.rfft(win3)))

Out[56]:

[<matplotlib.lines.Line2D at 0x7f2ac6e2ee10>]

In [57]:

arange(0, 40000, 2048)

Out[57]:

array([    0,  2048,  4096,  6144,  8192, 10240, 12288, 14336, 16384,
       18432, 20480, 22528, 24576, 26624, 28672, 30720, 32768, 34816,
       36864, 38912])

In [58]:

win_start = arange(0, 40000, 2048)
win_len = 2048

mag_spectrum = []

for start in win_start:
    win = signal[start: start + win_len]
    X = fft.rfft(win)
    mag_spectrum.append(abs(X)/float(win_len/2))

imshow(mag_spectrum, aspect='auto')

Out[58]:

<matplotlib.image.AxesImage at 0x7f2ac6fd5dd0>

In [59]:

plot(abs(fft.rfft(signal)))

Out[59]:

[<matplotlib.lines.Line2D at 0x7f2ac6eae790>]

In [60]:

win_start = arange(0, 40000, 2048)
win_len = 2048

pow_spectrum = []

for start in win_start:
    win = signal[start: start + win_len]
    X = fft.rfft(win)
    pow_spectrum.append((abs(X)/float(win_len/2))**2)  # square of the scaled magnitude

imshow(pow_spectrum, aspect='auto')
title('Power spectrum')

Out[60]:

<matplotlib.text.Text at 0x7f2ac6df1490>

In [64]:

subplot(121)
imshow(array(mag_spectrum).T, aspect='auto')
subplot(122)
imshow(array(pow_spectrum).T, aspect='auto')

gcf().set_figwidth(10)
colorbar()

Out[64]:

<matplotlib.colorbar.Colorbar instance at 0x7f2abf417830>

In [62]:

array(mag_spectrum).shape

Out[62]:

(20, 1025)
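The frame-by-frame loop can be wrapped in a function. This is only a sketch: `stft_mag` and its `hop` parameter are names invented here, and a synthetic tone placed exactly on bin 20 stands in for the audio file so that the example runs on its own.

```python
import numpy as np

def stft_mag(signal, win_len=2048, hop=2048):
    """Magnitude spectra of successive frames (rows: frames, columns: bins)."""
    starts = range(0, len(signal) - win_len + 1, hop)  # complete frames only
    return np.array([np.abs(np.fft.rfft(signal[s:s + win_len])) / (win_len / 2.0)
                     for s in starts])

# Synthetic stand-in for the recording: 20 oscillations per 2048-sample frame
n = np.arange(40960)
test_signal = np.sin(2 * np.pi * 20 * n / 2048.0)

S = stft_mag(test_signal)
peak_bins = S.argmax(axis=1)   # strongest bin in each frame
```

Setting `hop` smaller than `win_len` makes successive frames overlap.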

Spectrogram

In [65]:

sout = specgram(signal[:40000], NFFT=2048, noverlap=0, window=window_none, Fs= sr);

sout[0].shape, sout[1].shape, sout[2].shape
colorbar();

In [66]:

imshow(10*log10(pow_spectrum), aspect='auto')
colorbar()

Out[66]:

<matplotlib.colorbar.Colorbar instance at 0x7f2abf18bd88>

In [68]:

imshow(10*log10(pow_spectrum).T, aspect='auto', interpolation='nearest')
colorbar()
ylim((0, 1024))

Out[68]:

(0, 1024)

So the specgram function in pylab plots the power spectrum on a decibel scale. The decibel scale is often more useful than the linear scale because it makes components with very different amplitudes visible in the same plot.
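For reference, a sketch of the two decibel conversions: power uses $10 \log_{10}$ and amplitude uses $20 \log_{10}$, so both give the same dB value for the same signal. The function names and the `floor` parameter, which avoids taking the logarithm of zero, are assumptions made for this example.

```python
import numpy as np

def power_to_db(p, floor=1e-12):
    """10*log10 of a power value, clipped below at `floor`."""
    return 10.0 * np.log10(np.maximum(p, floor))

def amplitude_to_db(a, floor=1e-6):
    """20*log10 of an amplitude value, clipped below at `floor`."""
    return 20.0 * np.log10(np.maximum(a, floor))

db_from_power = power_to_db(0.01)          # amplitude 0.1 squared
db_from_amplitude = amplitude_to_db(0.1)   # same level, from the amplitude
```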

Windowed analysis

In [69]:

N = 512
phs = linspace(0.2* pi, 7.8 * 2 * pi, N)
x = sin(phs)
plot(x)

Out[69]:

[<matplotlib.lines.Line2D at 0x7f2abed9d390>]

In [70]:

X = fft.rfft(x)
plot(abs(X))

Out[70]:

[<matplotlib.lines.Line2D at 0x7f2abee5e450>]

In [72]:

plot(abs(X)/len(X), 'o-')
xlim((0, 20))

Out[72]:

(0, 20)

In [73]:

plot(abs(X)/len(X), 'o:')
xlim((3, 12))
grid()
vlines(7.8, 0, 0.9)

Out[73]:

<matplotlib.collections.LineCollection at 0x7f2abef2ec90>

In [91]:

def plot_mag_spectrum(x, sr=44100, db=True):
    X = fft.rfft(x)                   # real FFT: len(x)/2 + 1 bins
    fw = linspace(0, sr/2.0, len(X))  # bin frequencies in Hz
    mag = abs(X)/len(X)               # len(X) is N/2 + 1, close to the N/2 scaling
    if db:
        plot(fw, 20*log10(mag))
        ylabel('Amplitude (dB)')
    else:
        plot(fw, mag)
        ylabel('Amplitude')
    xlabel('Frequency (Hz)'); title('Magnitude spectrum')
    xlim((0, sr/2.0))
    grid(True)
    
plot_mag_spectrum(x)

In [92]:

plot_mag_spectrum(x )
xlim(0, 2500)

Out[92]:

(0, 2500)

Why is there energy/amplitude around the center frequency, when only a single frequency was present?
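One way to see the effect numerically is to count how many bins carry noticeable amplitude when the sinusoid completes a whole number of cycles in the frame versus a fractional number. The helper `significant_bins` and its threshold are illustrative choices, not part of the notebook.

```python
import numpy as np

N = 512
n = np.arange(N)

def significant_bins(cycles, threshold=0.01):
    """Count rfft bins whose scaled magnitude exceeds `threshold`."""
    x = np.sin(2 * np.pi * cycles * n / N)
    mag = np.abs(np.fft.rfft(x)) / (N / 2.0)
    return int(np.sum(mag > threshold))

on_bin = significant_bins(8.0)    # whole number of cycles in the frame
off_bin = significant_bins(7.8)   # the frame cuts the last cycle short
```

With a whole number of cycles, the frame repeats seamlessly and only one bin is active; with 7.8 cycles, the discontinuity at the frame edge spreads energy into neighboring bins.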

Effect of analysis windows

In [94]:

N = 512
phs = linspace(0.6* pi, 107.2 * 2 * pi, N)
x = sin(phs)
plot_mag_spectrum(x)

x = 0.01 * sin(phs * 1.11)
plot_mag_spectrum(x)

ylim((-200, 0))

Out[94]:

(-200, 0)

In [95]:

N = 512
phs = linspace(0.6* pi, 107.2 * 2 * pi, N)
x = sin(phs) + (0.01 * sin(phs * 1.11))
plot_mag_spectrum(x)

Which are true components of the signal?

In [96]:

plot(hanning(N));

In [97]:

plot(hanning(N) * x);

In [102]:

plot(x)

Out[102]:

[<matplotlib.lines.Line2D at 0x7f2abed04bd0>]

In [103]:

plot(x)
xlim((0, 100))

Out[103]:

(0, 100)

In [98]:

plot_mag_spectrum(hanning(N) * x)

Ah! Much better!

But wait, isn't the amplitude wrong? It doesn't match the amplitude values when not using a window...

In [99]:

def plot_mag_spectrum(x, sr=44100, db=True, window=ones):
    # window is a function of the frame length; ones gives a rectangular window
    w = window(len(x))
    X = fft.rfft(w * x)
    fw = linspace(0, sr/2.0, len(X))
    mag = abs(X)/(sum(w)/2.0)  # scale by the window sum; assumes real FFT
    if db:
        plot(fw, 20*log10(mag))
    else:
        plot(fw, mag)
    ylabel('Amplitude (dB)' if db else 'Amplitude')
    xlabel('Frequency (Hz)'); title('Magnitude spectrum')
    xlim((0, sr/2.0))
    grid(True)

In [100]:

plot_mag_spectrum(x, window=hanning)

Windowing the analysis frame results in a tradeoff between main lobe width and sidelobe (leakage) level.

http://en.wikipedia.org/wiki/Windowing_function
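The amplitude correction can be checked without plotting. A window attenuates the frame, so dividing by $N/2$ underestimates the amplitude, while dividing by half the sum of the window samples compensates for it. A minimal sketch with an on-bin sinusoid of amplitude 0.6 (values chosen for illustration):

```python
import numpy as np

N = 512
n = np.arange(N)
x = 0.6 * np.sin(2 * np.pi * 10 * n / N)

w = np.hanning(N)
X = np.fft.rfft(w * x)

uncorrected = np.abs(X[10]) / (N / 2.0)        # roughly half the true amplitude
corrected = np.abs(X[10]) / (np.sum(w) / 2.0)  # close to 0.6
```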

There are many different window functions, which can be useful for different applications. In audio the most common are Hann (Hanning), Hamming, Kaiser and Bartlett, because they have lower sidelobe levels than the rectangular window.

In [101]:

plot_mag_spectrum(x, window=hanning)
plot_mag_spectrum(x, window=hamming)
plot_mag_spectrum(x, window=bartlett)
plot_mag_spectrum(x, window=ones)

xlim((5000, 12000))
legend(['Hanning', 'Hamming', 'Bartlett', 'Rectangular'], loc='best')

Out[101]:

<matplotlib.legend.Legend at 0x7f2abeb0fb90>

Zero padding

Zero padding consists of adding zeros at the end of an analysis frame, to improve smoothness of the spectrum or to adjust to make the window size a power of two.

In [104]:

def plot_mag_spectrum(x, sr=44100, db=True, window=ones, zp=0):
    # window is a function of the frame length; zp is the number of zeros appended
    w = window(len(x))
    padded_x = r_[w * x, zeros(zp)]
    X = fft.rfft(padded_x)
    fw = linspace(0, sr/2.0, len(X))
    mag = abs(X)/(sum(w)/2.0)  # scale by the window sum; assumes real FFT
    if db:
        plot(fw, 20*log10(mag))  # amplitude in dB uses 20*log10
    else:
        plot(fw, mag)
    ylabel('Amplitude (dB)' if db else 'Amplitude')
    xlabel('Frequency (Hz)'); title('Magnitude spectrum')
    xlim((0, sr/2.0))
    grid(True)

plot_mag_spectrum(x, window=hanning, zp=2048)
plot_mag_spectrum(x, window=hamming, zp=2048)
plot_mag_spectrum(x, window=bartlett, zp=2048)
plot_mag_spectrum(x, window=ones, zp=2048)

xlim((5000, 12000))
legend(['Hanning', 'Hamming', 'Bartlett', 'Rectangular'], loc='best')

Out[104]:

<matplotlib.legend.Legend at 0x7f2abe92f850>

In [105]:

plot_mag_spectrum(x, window=hanning, zp=2048)
plot_mag_spectrum(x, window=hamming, zp=2048)
plot_mag_spectrum(x, window=bartlett, zp=2048)
plot_mag_spectrum(x, window=ones, zp=2048)

xlim((9000, 9500))
ylim((-40, 5))
legend(['Hanning', 'Hamming', 'Bartlett', 'Rectangular'], loc='lower center')

Out[105]:

<matplotlib.legend.Legend at 0x7f2abef35b90>

Zero padding is equivalent to interpolating the spectrum; it does not give better frequency resolution.

It can, however, reveal structure that falls between the original bins, so it is more like upsampling the spectrum.
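The interpolation view can be verified directly: if a frame is padded to four times its length, every fourth bin of the padded spectrum equals the corresponding bin of the unpadded one, and the bins in between are new samples of the same underlying curve. A sketch with an arbitrary windowed test tone:

```python
import numpy as np

N = 64
n = np.arange(N)
x = np.hanning(N) * np.sin(2 * np.pi * 7.8 * n / N)

X = np.fft.rfft(x)                 # 33 bins
X_padded = np.fft.rfft(x, n=4 * N) # zero-padded to 256 points, 129 bins

# Every 4th padded bin falls on one of the original bin frequencies
same_at_original_bins = np.allclose(X_padded[::4], X)
```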

Spectrogram (again)

Because windowing makes the spectrum focus on the center of the window, it is common to overlap windows.

In [106]:

sout = specgram(signal[:20000], NFFT=2048, noverlap=0);

In [107]:

sout = specgram(signal[:20000], NFFT=2048, noverlap=1024);
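Overlap changes how far each frame advances (the hop size is `NFFT - noverlap`) and therefore how many spectra are computed. The sketch below counts frames for the two settings used above, assuming that only complete frames are kept; `frame_starts` is a hypothetical helper, not a matplotlib function.

```python
def frame_starts(n_samples, nfft, noverlap):
    """Start indices of complete frames of length nfft with the given overlap."""
    hop = nfft - noverlap
    return list(range(0, n_samples - nfft + 1, hop))

no_overlap = frame_starts(20000, 2048, 0)
half_overlap = frame_starts(20000, 2048, 1024)
```

With 50% overlap, twice as many frames cover the same stretch of signal, so samples near the edge of one window sit near the center of the next.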

The Inverse Fourier Transform

The Inverse Fourier transform can reconstruct the time domain representation of a frequency domain spectrum.

$$s(t) = \int_{-\infty}^{\infty} S(f) \cdot e^{i 2\pi f t} df$$

The only change in practice is the sign of the exponent!
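In the discrete case, NumPy's `irfft` undoes `rfft`, so a round trip returns the original samples. A minimal sketch with an arbitrary random signal; passing `n` explicitly makes the output length unambiguous, which matters for odd-length inputs.

```python
import numpy as np

x = np.random.RandomState(1).randn(16)  # arbitrary real signal
X = np.fft.rfft(x)                      # forward transform, 9 bins
x_back = np.fft.irfft(X, n=len(x))      # inverse transform

round_trip_ok = np.allclose(x, x_back)
```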

In [121]:

mag_spec = [0, 1,0,0,0,0,0,0,0]
phs_spec = [0, 0, 0,0,0,0,0,0,0]

In [122]:

X = [complex(cos(phs)* mag, sin(phs)* mag) for mag, phs in zip(mag_spec, phs_spec)]
plot(real(X))

Out[122]:

[<matplotlib.lines.Line2D at 0x7f2abe424110>]

In [123]:

plot(imag(X))

Out[123]:

[<matplotlib.lines.Line2D at 0x7f2abe350290>]

In [124]:

x = fft.irfft(X)
plot(x)

Out[124]:

[<matplotlib.lines.Line2D at 0x7f2abe302150>]

In [128]:

mag_spec = [0,0,0,1,0,0,0,0,0]
phs_spec = [0, 0, 0, pi/2,0,0,0,0,0]
X = [complex(cos(phs)* mag, -sin(phs)* mag) for mag, phs in zip(mag_spec, phs_spec)]
x = fft.irfft(X)
plot(x, 'o-')

Out[128]:

[<matplotlib.lines.Line2D at 0x7f2abdfdc7d0>]

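The inverse FT must be scaled. NumPy's irfft already divides by the output length $N$, so a bin of magnitude 1 produces a sinusoid of amplitude $2/N$; multiplying by $N/2$ (8 here, since $N=16$) restores an amplitude of 1.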

In [129]:

mag_spec = [0,1,0,0,0,0,0,0, 0]
phs_spec = [0, 0, 0,0,0,0,0,0, 0]
X = [complex(cos(phs)* mag, sin(phs)* mag) for mag, phs in zip(mag_spec, phs_spec)]
x = fft.irfft(X) * 8
plot(x)

Out[129]:

[<matplotlib.lines.Line2D at 0x7f2abdf185d0>]
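The factor of 8 can be confirmed numerically. In this sketch a 9-bin half spectrum with a single unit bin is inverted; `irfft` returns $2 \cdot (9 - 1) = 16$ samples, and the peak of the resulting cosine is $2/N$ until it is multiplied by $N/2$.

```python
import numpy as np

X = np.zeros(9, dtype=complex)  # half spectrum: bins 0..8
X[1] = 1.0                      # unit magnitude, zero phase in bin 1

x = np.fft.irfft(X)             # 16 time-domain samples
N = len(x)

peak_unscaled = x.max()               # 2/N
peak_scaled = (x * N / 2.0).max()     # restored to 1
```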

In [134]:

mag_spec = [0] + ([0,0,0,0,0,0,1] * 4)
print(len(mag_spec))
mag_spec += [0]

In [135]:

phs_spec = ones(29) * pi/2
X = [complex(cos(phs)* mag, -sin(phs)* mag) for mag, phs in zip(mag_spec, phs_spec)]

In [137]:

type([0,0,0,0,0,0,1])

Out[137]:

list

In [138]:

type(ones(29))

Out[138]:

numpy.ndarray

In [136]:

x = fft.irfft(X)
plot(x)

Out[136]:

[<matplotlib.lines.Line2D at 0x7f2abdde43d0>]

In [139]:

phs_spec = linspace(0, 1, 29)
X = [complex(cos(phs)* mag, -sin(phs)* mag) for mag, phs in zip(mag_spec, phs_spec)]
x = fft.irfft(X)
plot(x)

Out[139]:

[<matplotlib.lines.Line2D at 0x7f2abdd0d810>]

By: Andrés Cabrera mantaraya36@gmail.com For MAT course MAT 201A at UCSB

This ipython notebook is licensed under the CC-BY-NC-SA license: http://creativecommons.org/licenses/by-nc-sa/4.0/