%pylab inline
rcParams['figure.figsize'] = (16, 4) #wide graphs by default
Populating the interactive namespace from numpy and matplotlib
WARNING: pylab import has clobbered these variables: ['f'] `%matplotlib` prevents importing * from pylab and numpy
Some simple ready to use data sets: http://www.isds.duke.edu/~mw/ts_data_sets.html
!wget http://www.isds.duke.edu/~mw/data-sets/ts_data/carinae.txt
--2015-02-18 13:38:00-- http://www.isds.duke.edu/~mw/data-sets/ts_data/carinae.txt Resolving www.isds.duke.edu (www.isds.duke.edu)... 152.3.22.15 Connecting to www.isds.duke.edu (www.isds.duke.edu)|152.3.22.15|:80... connected. HTTP request sent, awaiting response... 301 Moved Permanently Location: https://stat.duke.edu/~mw/data-sets/ts_data/carinae.txt [following] --2015-02-18 13:38:00-- https://stat.duke.edu/~mw/data-sets/ts_data/carinae.txt Resolving stat.duke.edu (stat.duke.edu)... 152.3.22.15 Connecting to stat.duke.edu (stat.duke.edu)|152.3.22.15|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 20830 (20K) [text/plain] Saving to: ‘carinae.txt.8’ 100%[======================================>] 20.830 --.-K/s in 0,1s 2015-02-18 13:38:00 (140 KB/s) - ‘carinae.txt.8’ saved [20830/20830]
f = open('carinae.txt')
values = [float(l) for l in f]
--------------------------------------------------------------------------- ValueError Traceback (most recent call last) <ipython-input-15-d7d3b606c767> in <module>() ----> 1 values = [float(l) for l in f] ValueError: could not convert string to float:
Some characters couldn't be converted! Need to try something else.
f.seek(0)
values = []
for l in f:
try:
values.append(float(l))
except:
print 'line ignored: ', l
line ignored: line ignored:
plot(values)
[<matplotlib.lines.Line2D at 0x7f61af760410>]
[(i, value) for i, value in enumerate(values[:10])]
[(0, 8.95), (1, 8.94), (2, 8.0), (3, 6.62), (4, 6.35), (5, 5.95), (6, 5.86), (7, 5.95), (8, 6.2), (9, 6.09)]
Must clean some of the values which are invalid, but must keep the time series intact (i.e. leave the space)
time_series = [(i, value) for i, value in enumerate(values) if value > 1]
time_series = array(time_series)
time_series.shape
(1149, 2)
plot(*(time_series.T))
[<matplotlib.lines.Line2D at 0x7f61af6c2b10>]
plot(*(time_series.T), marker='o')
xlim((95,110))
grid()
Let's fill in the points instead.
time_series = []
for i,value in enumerate(values):
if value < 1:
value = (values[i+1] - values[i-1])/2.0
time_series.append(value)
time_series = array(time_series)
plot(time_series)
[<matplotlib.lines.Line2D at 0x7f61adc61550>]
Oops! More than one consecutive point missing! We could do this ourselves or use a package like pandas.
order = 24
win_size= 512
start = 0
signal = time_series[start:start+win_size]
# Adapted from scikits.talkbox https://pypi.python.org/pypi/scikits.talkbox
from scipy.linalg import toeplitz
p = order + 1
r = zeros(p, signal.dtype)
# Number of non zero values in autocorrelation one needs for p LPC
# coefficients
nx = min([p, signal.size])
x = correlate(signal, signal, 'full')
r[:nx] = x[signal.size-1:signal.size+order]
phi = dot(inv(toeplitz(r[:-1])), -r[1:])
lpc = np.concatenate(([1.], phi))
plot(lpc)
[<matplotlib.lines.Line2D at 0x7f61adc8cc50>]
LPC coefficients are the feedback coefficients of an IIR filter. The LPC filter has no zeros.
from scipy.signal import freqz
h,w = freqz([1], lpc)
plot(h ,abs(w))
[<matplotlib.lines.Line2D at 0x7f619e80e810>]
from scipy.signal import tf2zpk
def PoleZeroPlot(b, a):
(zeros,poles,gain) = tf2zpk(b, a)
angle = np.linspace(-np.pi,np.pi,50)
cirx = np.sin(angle)
ciry = np.cos(angle)
figure()
plot(poles.real, poles.imag, 'x', zeros.real, zeros.imag, 'o', cirx,ciry, 'k-')
grid()
xlim((-2, 2))
xlabel('Real')
ylim((-1.5, 1.5))
ylabel('Imag')
gcf().set_figwidth(5)
return (zeros,poles,gain)
PoleZeroPlot([1], lpc);
In Linear Predictive Coding the coefficients calculated from auto regression are stored together with information about the excitation, like frequency and whether the frame is voiced (vowel/harmonic) or unvoiced (consonant/noisy).
Excellent LPC example:
f = open('noise.mp3')
header = f.read(4)
header
'\xff\xfb\xa0\xc4'
ord(header[0])
255
hex(ord(header[0]))
'0xff'
bin(ord(header[0]))
'0b11111111'
for byte in header:
print '%02x,'%ord(byte),
ff, fb, a0, c4,
bin(ord('c'))
'0b1100011'
bin(ord('4'))
'0b110100'
frame = f.read(518)
new_header = f.read(4)
new_header
'\xff\xfb\xa0\xc4'
frame2 = f.read(518)
new_header = f.read(4)
new_header
'\xff\xfb\xa2\xc4'
By: Andrés Cabrera mantaraya36@gmail.com For MAT course MAT 201A at UCSB
This ipython notebook is licensed under the CC-BY-NC-SA license: http://creativecommons.org/licenses/by-nc-sa/4.0/