%pylab inline
from __future__ import print_function
from __future__ import division
Populating the interactive namespace from numpy and matplotlib
Fourier's theorem:
Any periodic function can be written as a Fourier series. i.e. proves that any periodic function can be written as a Fourier series.
Or, any periodic function can be written as a sum of harmonic phasors.
$$S(f) = \int_{-\infty}^{\infty} s(t) \cdot e^{- i 2\pi f t} dt$$Which is multiplying the input function by phasors of the frequency $f$. So it describes how to calculate a value that is a function of the frequency chosen.
This continuous version calculates the function for infinite time, and for a continuous and infinite set of frequencies.
To discretize, we need to make the time range and the set of frequencies be both finite and discrete.
When working with discrete data, you need to use the Discrete Fourier Transform (DFT):
$$X_k = \sum_{n=0}^{N-1} x_n \cdot e^{-i 2 \pi k n / N}$$ $$X_k = \sum_{n=0}^{N-1} x_n \cdot [\cos(-2 \pi k n / N) + i \sin(-2 \pi k n / N)]$$$$X_k = \sum_{n=0}^{N-1} x_n \cdot [\cos(2 \pi k n / N) - i \sin(2 \pi k n / N)]$$phs = linspace(0, 10 * 2 * pi, 512, endpoint=False)
x = sin(phs)
plot(phs, x)
title('10 Oscillations in 512 points');
Now calculate the Fourier transform for bin $k=1$ :
phasor_phs = linspace(0, 2 * pi, 512, endpoint=False)
plot(x)
plot(sin(phasor_phs))
plot(cos(phasor_phs));
for $k= 1\ $ and $N=512$
$$X_1 = \sum_{n=0}^{511} x_n \cdot [\cos(2 \pi n / 512) - i \sin(2 \pi n / 512)]$$subplot(121)
plot(x*sin(phasor_phs))
fill_between(arange(512), x*sin(phasor_phs))
subplot(122)
plot(x*cos(phasor_phs))
fill_between(arange(512), x*cos(phasor_phs))
gcf().set_figwidth(10)
print(sum(x*sin(phasor_phs)), sum(x*cos(phasor_phs)))
-6.54554535573e-15 1.55292445569e-14
We keep going for all values of $k$, e.g. $k=9$ :
$$X_9 = \sum_{n=0}^{511} x_n \cdot [\cos(2 \pi \cdot 9 \cdot n / 512) - i \sin(2 \pi \cdot 9 \cdot n / 512)]$$subplot(121)
plot(x*sin(9*phasor_phs))
fill_between(arange(512), x*sin(9*phasor_phs))
subplot(122)
plot(x*cos(9*phasor_phs))
fill_between(arange(512), x*cos(9*phasor_phs))
gcf().set_figwidth(10)
print(sum(x*sin(9*phasor_phs)), sum(x*cos(9*phasor_phs)))
-2.88814111515e-14 4.35762537165e-15
And $k=10\ $ :
k = 10
subplot(121)
plot(x*sin(k*phasor_phs))
fill_between(arange(512), x*sin(k*phasor_phs))
subplot(122)
plot(x*cos(k*phasor_phs))
fill_between(arange(512), x*cos(k*phasor_phs))
gcf().set_figwidth(16)
print(sum(x*sin(k*phasor_phs)), sum(x*cos(k*phasor_phs)))
256.0 -9.42301792151e-15
Now the whole Fourier transform for all bins $0 < k < N\ \ $:
phs = linspace(0, 10.0 * 2.0 * pi, 512, endpoint=False)
x = sin(phs)
x.dtype
dtype('float64')
dft = []
for k in range(len(x)):
bin_phs = linspace(0, k * 2.0 * pi, 512, endpoint=False)
fft_bin = complex(sum(x*cos(bin_phs)),
-sum(x*sin(bin_phs)))
dft.append(fft_bin)
subplot(121)
plot(real(dft))
title('Real part')
subplot(122)
plot(imag(dft))
title('Imaginary part')
gcf().set_figwidth(10)
The magnitude spectrum is the length of the vector in the complex plain. This function is equivalent to finding the absolute value of the complex number:
$$ Magnitude\ spectrum = |X_n|$$The phase spectrum is the angle of the vector in the complex plane:
$$Phase\ spectrum = \angle X_n$$phs = linspace(0, 10.0 * 2.0 * pi, 512)
x = sin(phs)
subplot(121)
plot(abs(array(dft)))
title('The magnitude spectrum')
subplot(122)
plot(angle(array(dft)));
title('The phase spectrum')
gcf().set_figwidth(10)
Using the fft function from the fft module in numpy:
phs = linspace(0, 10.0 * 2.0 * pi, 512, endpoint=False)
x = 0.5 * cos(phs + 0.3 * pi)
subplot(121)
plot(abs(fft.fft(x)))
title('The magnitude spectrum')
subplot(122)
plot(angle(fft.fft(x)));
title('The phase spectrum')
gcf().set_figwidth(10)
fft.fft(x)[10]
(124.26501033599166-90.283814752124243j)
plot(angle(fft.fft(x)));
xlim((0, 20))
(0, 20)
angle(fft.fft(x))[10]
0.94247779607693505
0.3 * pi
0.9424777960769379
When the input to the DFT is real only (no imaginary part), the second half of the transform is the complex conjugate in reverse. i.e. it mirrors around the center, and the imaginary part changes sign.
You can think of this happening because the FFT harmonic "phasors" wrap around with phase inversion at the Nyquist frequency, which is in the middle of the transform output.
$$X_k = \sum_{n=0}^{N-1} x_n \cdot [\cos(2 \pi k n / N) - i \sin(2 \pi k n / N)]$$N = 128
k1 = 10
k2 = N - k1
subplot(121)
plot(sin(linspace(0 , 2 * pi * k1, N, endpoint=False)))
plot(sin(linspace(0 , 2 * pi * k2, N, endpoint=False)))
subplot(122)
plot(cos(linspace(0 , 2 * pi * k1, N, endpoint=False)))
plot(cos(linspace(0 , 2 * pi * k2, N, endpoint=False)))
gcf().set_figwidth(10)
The frequencies mirror around after the first half of bins, both in frequency and in phase (phase is inverted)
plot(abs(fft.fft(x)))
argsort(abs(fft.fft(x)))[-2:]
array([ 10, 502])
The second half of the FFT is the reversed complex conjugate of the first.
This also shows as a reflection of the amplitude spectrum, and a phase reversed and reflected phase spectrum.
This property is called Hermitian. i.e. the FFT of a real signal is Hermitian around its center
The transform can be performed more quickly and can take up less memory if this property can be exploited (as in the case of audio signals).
plot(abs(fft.rfft(x)))
[<matplotlib.lines.Line2D at 0x7f2ad0190e10>]
Now we only get half the spectrum because we already know the other half.
len(fft.rfft(x))
257
len(x)
512
The number of points we get from the real FFT is $\frac{N}{2} +1$
Because the DFT adds the multiplication of many points together, the magnitude spectrum needs scaling of the amplitude by $N/2$
N = 512
phs = linspace(0, 10 * 2 * pi, 512, endpoint=False)
x = 0.6 * sin(phs + 0.3 * pi)
plot(abs(fft.rfft(x))/ (N/2))
[<matplotlib.lines.Line2D at 0x7f2ad0c0e5d0>]
plot(real(fft.rfft(x))/ (N/2))
[<matplotlib.lines.Line2D at 0x7f2ad0f6a810>]
plot(imag(fft.rfft(x))/ (N/2))
[<matplotlib.lines.Line2D at 0x7f2ad24c0590>]
N = 512
phs = linspace(0, 10 * 2 * pi, 512, endpoint=False)
x = 0.6 * sin(phs) + 0.4 * sin(phs*3)
plot(x)
[<matplotlib.lines.Line2D at 0x7f2ad0f021d0>]
plot(abs(fft.rfft(x))/ (N/2))
[<matplotlib.lines.Line2D at 0x7f2ad0ddcf50>]
But the x scale is not telling us much about frequency...
fw = linspace(0, 0.5, 257)
X = abs(fft.rfft(x))/ (N/2)
plot(fw, X)
title('Normalized frequency scale')
<matplotlib.text.Text at 0x7f2ad0cdc050>
fw = linspace(0, pi, 257)
X = abs(fft.rfft(x))/ (N/2)
plot(fw, X)
xticks(linspace(0, pi, 5), ['0', '$\pi/4$', '$\pi/2$', '$3\pi/4$', '$\pi$']) ;
title('Radians frequency scale');
sr = 44100
nyquist = sr/2.0
fw = linspace(0, nyquist, 257, endpoint=True)
X = abs(fft.rfft(x))/ (N/2)
plot(fw, X)
title('Hz frequency scale');
where $f$ is the "real" frequency, $f_0$ is the number of oscillations within the analysis window, $f_s$ is the sampling rate and $N$ is the size of the window
10.0 * float(sr)/512, 30.0 * float(sr)/512
(861.328125, 2583.984375)
sr = 44100
nyquist = sr/2.0
fw = linspace(0, nyquist, 257)
X = abs(fft.rfft(x))/ (N/2)
plot(fw, X, 'o-')
title('Hz frequency scale');
xlim((500, 2700))
(500, 2700)
The power spectrum can be computed by squaring the magnitude spectrum:
$$|X_n|^2$$plot(fw, X**2)
title('Power spectrum');
xlim((500, 2700))
(500, 2700)
This results in a sort of "warping" of the amplitude scale, making peaks more pronounced, and lower level detail less visible. This can be a useful technique when trying to emphasize peaks.
It turns out that the computation of the DFT can be optimized if:
for i in range(20):
print(2**i)
1 2 4 8 16 32 64 128 256 512 1024 2048 4096 8192 16384 32768 65536 131072 262144 524288
One of the key assumptions of the DFT is that a signal is static within the analysis frame. (This relates to the assumption of periodicity)
A trick to extract a time-varying spectrum from a signal is to perform short DFTs, each starting a bit later than the previous one.
from scipy.io import wavfile
sr, signal = wavfile.read('passport.wav')
!aplay passport.wav
Playing WAVE 'passport.wav' : Signed 16 bit Little Endian, Rate 44100 Hz, Mono
win1 = signal[0:1024]
win2 = signal[1024:2048]
win3 = signal[2048: 3072]
plot(abs(fft.rfft(win1)))
plot(abs(fft.rfft(win2)))
plot(abs(fft.rfft(win3)))
[<matplotlib.lines.Line2D at 0x7f2ac6e2ee10>]
arange(0, 40000, 2048)
array([ 0, 2048, 4096, 6144, 8192, 10240, 12288, 14336, 16384, 18432, 20480, 22528, 24576, 26624, 28672, 30720, 32768, 34816, 36864, 38912])
win_start = arange(0, 40000, 2048)
win_len = 2048
mag_spectrum = []
for start in win_start:
win = signal[start: start + win_len]
X = fft.rfft(win)
mag_spectrum.append(abs(X)/float(win_len/2))
imshow(mag_spectrum, aspect='auto')
<matplotlib.image.AxesImage at 0x7f2ac6fd5dd0>
plot(abs(fft.rfft(signal)))
[<matplotlib.lines.Line2D at 0x7f2ac6eae790>]
win_start = arange(0, 40000, 2048)
win_len = 2048
pow_spectrum = []
for start in win_start:
win = signal[start: start + win_len]
X = fft.rfft(win)
pow_spectrum.append(abs(X)**2/float(win_len/2))
imshow(pow_spectrum, aspect='auto')
title('Power spectrum')
<matplotlib.text.Text at 0x7f2ac6df1490>
subplot(121)
imshow(array(mag_spectrum).T, aspect='auto')
subplot(122)
imshow(array(pow_spectrum).T, aspect='auto')
gcf().set_figwidth(10)
colorbar()
<matplotlib.colorbar.Colorbar instance at 0x7f2abf417830>
array(mag_spectrum).shape
(20, 1025)
sout = specgram(signal[:40000], NFFT=2048, noverlap=0, window=window_none, Fs= sr);
sout[0].shape, sout[1].shape, sout[2].shape
colorbar();
imshow(10*log10(pow_spectrum), aspect='auto')
colorbar()
<matplotlib.colorbar.Colorbar instance at 0x7f2abf18bd88>
imshow(10*log10(pow_spectrum).T, aspect='auto', interpolation='nearest')
colorbar()
ylim((0, 1024))
(0, 1024)
So the specgram function in pylab plots the Power spectrum on a decibel scale. The decibel scale is more useful than the linear scale as the relative amplitudes can be detected better.
N = 512
phs = linspace(0.2* pi, 7.8 * 2 * pi, N)
x = sin(phs)
plot(x)
[<matplotlib.lines.Line2D at 0x7f2abed9d390>]
X = fft.rfft(x)
plot(abs(X))
[<matplotlib.lines.Line2D at 0x7f2abee5e450>]
plot(abs(X)/len(X), 'o-')
xlim((0, 20))
(0, 20)
plot(abs(X)/len(X), 'o:')
xlim((3, 12))
grid()
vlines(7.8, 0, 0.9)
<matplotlib.collections.LineCollection at 0x7f2abef2ec90>
def plot_mag_spectrum(x, sr=44100, db=True):
X = fft.rfft(x)
fw = linspace(0, sr/2.0, len(X))
if db:
plot(fw,20*log10(abs(X)/len(X))) # assumes real FFT
else:
plot(fw,abs(X)/len(X)) # assumes real FFT
ylabel('Amplitude (dB)'); xlabel('Frequency (Hz)'); title('Magnitude spectrum')
xlim((0, sr/2.0))
grid(True)
plot_mag_spectrum(x)
plot_mag_spectrum(x )
xlim(0, 2500)
(0, 2500)
Why is there energy/amplitude around the center frequency, when only a single frequency was present?
N = 512
phs = linspace(0.6* pi, 107.2 * 2 * pi, N)
x = sin(phs)
plot_mag_spectrum(x)
x = 0.01 * sin(phs * 1.11)
plot_mag_spectrum(x)
ylim((-200, 0))
(-200, 0)
N = 512
phs = linspace(0.6* pi, 107.2 * 2 * pi, N)
x = sin(phs) + (0.01 * sin(phs * 1.11))
plot_mag_spectrum(x)
Which are true components of the signal?
plot(hanning(N));
plot(hanning(N) * x);
plot(x)
[<matplotlib.lines.Line2D at 0x7f2abed04bd0>]
plot(x)
xlim((0, 100))
(0, 100)
plot_mag_spectrum(hanning(N) * x)
Ah! Much better!
But wait, isn't amplitude wrong? It doesn't match the amplitude values when not using a window...
def plot_mag_spectrum(x, sr=44100, db=True, window=window_none):
w = window(len(x))
X = fft.rfft(window(len(x)) *x)
fw = linspace(0, sr/2.0, len(X))
if db:
plot(fw,20*log10(abs(X)/(sum(w)/2.0))) # assumes real FFT
else:
plot(fw,abs(X)/(sum(w)/2.0)) # assumes real FFT
ylabel('Amplitude'); xlabel('Frequency (Hz)'); title('Magnitude spectrum')
xlim((0, sr/2.0))
grid(True)
plot_mag_spectrum(x, window=hanning)
Windowing the analysis frame results in a tradeoff between main lobe width and sidelobe (leakage) level.
http://en.wikipedia.org/wiki/Windowing_function
There are many different functions which can be useful for different applications. In audio the most common are Hann (Hanning), Hamming, Kaiser and Bartlett, because they have lower sidelobe levels.
plot_mag_spectrum(x, window=hanning)
plot_mag_spectrum(x, window=hamming)
plot_mag_spectrum(x, window=bartlett)
plot_mag_spectrum(x, window=ones)
xlim((5000, 12000))
legend(['Hanning', 'Hamming', 'Bartlett', 'Rectangular'], loc='best')
<matplotlib.legend.Legend at 0x7f2abeb0fb90>
Zero padding consists of adding zeros at the end of an analysis frame, to improve smoothness of the spectrum or to adjust to make the window size a power of two.
def plot_mag_spectrum(x, sr=44100, db=True, window=window_none, zp=0):
w = window(len(x))
padded_x = r_[window(len(x)) *x, zeros(zp)]
X = fft.rfft(padded_x)
fw = linspace(0, sr/2.0, len(X))
if db:
plot(fw,10*log10(abs(X)/(sum(w)/2.0))) # assumes real FFT
else:
plot(fw,abs(X)/(sum(w)/2.0)) # assumes real FFT
ylabel('Amplitude'); xlabel('Frequency (Hz)'); title('Magnitude spectrum')
xlim((0, sr/2.0))
grid(True)
plot_mag_spectrum(x, window=hanning, zp=2048)
plot_mag_spectrum(x, window=hamming, zp=2048)
plot_mag_spectrum(x, window=bartlett, zp=2048)
plot_mag_spectrum(x, window=ones, zp=2048)
xlim((5000, 12000))
legend(['Hanning', 'Hamming', 'Bartlett', 'Rectangular'], loc='best')
<matplotlib.legend.Legend at 0x7f2abe92f850>
plot_mag_spectrum(x, window=hanning, zp=2048)
plot_mag_spectrum(x, window=hamming, zp=2048)
plot_mag_spectrum(x, window=bartlett, zp=2048)
plot_mag_spectrum(x, window=ones, zp=2048)
xlim((9000, 9500))
ylim((-40, 5))
legend(['Hanning', 'Hamming', 'Bartlett', 'Rectangular'], loc='lower center')
<matplotlib.legend.Legend at 0x7f2abef35b90>
Zero padding is actually similar to interpolating the spectrum, it doesn't really give better frequency resolution.
But it can reveal artifacts, so it is more like upsampling.
Because windowing makes the spectrum focus on the center of the window, it is common to overlap windows
sout = specgram(signal[:20000], NFFT=2048, noverlap=0);
sout = specgram(signal[:20000], NFFT=2048, noverlap=1024);
The Inverse Fourier transform can reconstruct the time domain representation of a frequency domain spectrum.
$$s(t) = \int_{-\infty}^{\infty} S(f) \cdot e^{i 2\pi f t} df$$The only change in practice is the sign of the exponent!
mag_spec = [0, 1,0,0,0,0,0,0,0]
phs_spec = [0, 0, 0,0,0,0,0,0,0]
X = [np.complex(cos(phs)* mag, sin(phs)* mag) for mag, phs in zip(mag_spec, phs_spec)]
plot(real(X))
[<matplotlib.lines.Line2D at 0x7f2abe424110>]
plot(imag(X))
[<matplotlib.lines.Line2D at 0x7f2abe350290>]
x = fft.irfft(X)
plot(x)
[<matplotlib.lines.Line2D at 0x7f2abe302150>]
mag_spec = [0,0,0,1,0,0,0,0,0]
phs_spec = [0, 0, 0, pi/2,0,0,0,0,0]
X = [np.complex(cos(phs)* mag, -sin(phs)* mag) for mag, phs in zip(mag_spec, phs_spec)]
x = fft.irfft(X)
plot(x, 'o-')
[<matplotlib.lines.Line2D at 0x7f2abdfdc7d0>]
The inverse FT must be scaled.
mag_spec = [0,1,0,0,0,0,0,0, 0]
phs_spec = [0, 0, 0,0,0,0,0,0, 0]
X = [np.complex(cos(phs)* mag, sin(phs)* mag) for mag, phs in zip(mag_spec, phs_spec)]
x = fft.irfft(X) * 8
plot(x)
[<matplotlib.lines.Line2D at 0x7f2abdf185d0>]
mag_spec = [0] + ([0,0,0,0,0,0,1] * 4)
print(len(mag_spec))
mag_spec += [0]
29
phs_spec = ones(29) * pi/2
X = [np.complex(cos(phs)* mag, -sin(phs)* mag) for mag, phs in zip(mag_spec, phs_spec)]
type([0,0,0,0,0,0,1])
list
type(ones(29))
numpy.ndarray
x = fft.irfft(X)
plot(x)
[<matplotlib.lines.Line2D at 0x7f2abdde43d0>]
phs_spec = linspace(0, 1, 29)
X = [np.complex(cos(phs)* mag, -sin(phs)* mag) for mag, phs in zip(mag_spec, phs_spec)]
x = fft.irfft(X)
plot(x)
[<matplotlib.lines.Line2D at 0x7f2abdd0d810>]
By: Andrés Cabrera mantaraya36@gmail.com For MAT course MAT 201A at UCSB
This ipython notebook is licensed under the CC-BY-NC-SA license: http://creativecommons.org/licenses/by-nc-sa/4.0/