%matplotlib inline
%load_ext autoreload
%autoreload 2
import numpy as np
from matplotlib import pyplot as plt
plt.rcdefaults
myfigsize = (12.0, 8.0)
myfontsize = 14.
plt.rcParams['figure.figsize'] = myfigsize
The autoreload extension is already loaded. To reload it, use: %reload_ext autoreload
#import datasets for figures:
from astroML.datasets import fetch_rrlyrae_templates
from astroML.plotting import setup_text_plots
#setup_text_plots(fontsize=myfontsize, usetex=True)
#plt.rc('text', usetex=False)
Are my data consistent with the null hypothesis described by a model consisting of a constant signal plus measurement noise?
Frequentist Solution: Use hypothesis testing! What is the probability that we would obtain our data by chance if the null hypothesis of no variability were correct?
Compute $\chi^2$ and the corresponding $p$ values.
Consider a sin wave:
$$y(t) = A\sin(\omega t)$$sampled by $N \sim 100$ data points with homoscedastic errors with standard deviation $\sigma$.
The variance is:
$$V = \sigma^2 + A^2/2$$By a simple $\chi^2_{dof}$ analysis, assuming no model the detectable amplitude decreases very slowly as a function of the number of observations. For example, for $N = 100$ data points if we want a 3 $\sigma$ detection, the minimum detectable amplitude is $A = 0.92\sigma$, and for $N = 1000$ data points it only improves to $A = 0.52 \sigma$ (where $\sigma = \sqrt{2/N}$ assuming homoscedastic gaussian errors.
from IPython.html.widgets import interact
from IPython.html import widgets
def makeSine(a, sigma, numSamps):
fig, ax = plt.subplots(1,1)
x = np.linspace(0, 2 * np.pi, numSamps)
y = a * np.sin(x)
y += np.random.normal(scale=sigma, size=numSamps)
ax.plot(x, y, 'o', color='#75BEBB')
ax.errorbar(x, y, sigma, capsize=0, fmt='None', ecolor='#75BEBB')
ax.set_xlim([-0.1, 2 * np.pi + 0.1])
ax.set_xlabel(r'$\omega t$')
ax.set_ylabel(r'$\sin{(\omega t)} + \sigma$')
ax.text(0.1, 0.15, 'N: {}'.format(numSamps), transform=ax.transAxes)
chinusq = 1/(numSamps * sigma**2) * np.sum(y**2)
ax.text(0.1, 0.1, r'$\mathbf{\chi_{\nu}^{2}}$'+': {:.2f}'.format(chinusq), transform=ax.transAxes)
sig = np.sqrt(2. / numSamps)
ax.text(0.1, 0.05, '{:.2f}'.format((chinusq - 1.)/sig) + r' $\mathbf{\sigma}$', transform=ax.transAxes)
ax.grid(which='major')
sigma = 0.75
a = 0.69
numSamps = 1e2
#slider_ax = plt.axes([0.1, 0.1, 0.8, 0.02])
#slider = Slider(slider_ax, "Offset", -5, 5, valinit=0, color='#AAAAAA')
#slider.on_changed(makeSine(a, sigma, numSamps))
interact(makeSine,
a=widgets.FloatSliderWidget(min=0.,max=1.,step=0.05,value=a),
sigma=widgets.FloatSliderWidget(min=0.,max=1.,step=0.05,value=sigma),
numSamps=widgets.IntSliderWidget(min=0,max=1000,step=10,value=numSamps))
However, we'll see that our ability to discern a signal greatly improves if we specify a model.
Detection of a signal is essentially a hypothesis testing or model selection problem.
The quantitative description of a signal belongs to parameter estimation and regression problems.
In general, time series model fitting of $N$ observations $(t_j, y_j), j = 1, ..., N$ is of the form:
$$y_j(t_j) = \sum_{m=1}^{N}\beta_m T_m(t_j | \mathbf{\theta}_m) + \epsilon_j$$where $T_m(t|\mathbf{\theta}_m)$ need not be periodic.
Common models include, sinusoidal, exponential, and "chirp" signals:
$$T(t) = \sin{(\omega t)}$$$$T(t) = \exp{(-\alpha t)}$$$$T(t) = \sin{(\phi + \omega t + \alpha t^{2})}$$In Fourier analysis, general functions can be expressed by integrals or sums of simpler trigonometric functions. Below is an example of an RR Lyra light curve and its approximation by a sum of sinusoids.
plot1001()
The Fourier Transform can be defined as $$ H(f) = \int_{-\infty}^{\infty} h(t) \exp{(-i 2 \pi f t)} dt$$
With its inverse being $$h(t) = \int_{-\infty}^{\infty} H(f) \exp{(i 2 \pi f t)} df $$
and Euler's identity can be helpful:
$$e^{-i \omega t} = \cos{\omega t} - i \sin{\omega t}$$But in practice, the function of interest is most likely thrown at a computer, which performs discrete fourier analysis using a fast fourier transform (fft). This is defined as
$$y[k] = \sum_{n=0}^{N-1} \exp{(-2 \pi j \frac{kn}{N})}x[n]$$and the inverse transform is defined as
$$x[n] = \frac{1}{N}\sum_{n=0}^{N-1} \exp{(2 \pi j \frac{kn}{N})} y[k]$$Scipy has a library called fftpack
that contains several convenient methods, including one for taking the fast fourier transform (fft
), and one for taking its inverse (ifft
)
from scipy import fftpack
To recreate the figure above and test out fftpack
, lets first import some RR Lyrae templates using the astroML
method fetch_rrlyrae_templates()
templates = fetch_rrlyrae_templates()
type(templates)
dict
templates.keys()
['109u', '109r', '115z', '109z', '115r', '109g', '115i', '115g', '109i', '106z', '106r', '106u', '106i', '106g', '114r', '114z', '114g', '114i', '107z', '107u', '107r', '107i', '107g', '113z', '113r', '108u', '113i', '113g', '100i', '100g', '100z', '100u', '100r', '112z', '108r', '112r', '108z', '108g', '112i', '108i', '112g', '0r', '101i', '0u', '0z', '101g', '0g', '101z', '101u', '0i', '101r', '102z', '102u', '110g', '1r', '102r', '118r', '110i', '102g', '110u', '110r', '1g', '118i', '1i', '118g', '102i', '103i', '110z', '103g', '103z', '103r', '103u', '117r', '117z', '117g', '117i', '104r', '104u', '104z', '104g', '104i', '116r', '116z', '116g', '120g', '116i', '105r', '105u', '111g', '111i', '105z', '119r', '119i', '111r', '105g', '105i', '111z', '119g']
x, y = templates['115r'].T
x109u, y109u = templates['109u'].T
plt.plot(x, y, 'o')
plt.plot(x109u, y109u, 'o')
plt.ylim([1,0])
(1, 0)
Now lets take the fast fourier transform of one of those templates
help(fftpack.fft)
Help on function fft in module scipy.fftpack.basic: fft(x, n=None, axis=-1, overwrite_x=False) Return discrete Fourier transform of real or complex sequence. The returned complex array contains ``y(0), y(1),..., y(n-1)`` where ``y(j) = (x * exp(-2*pi*sqrt(-1)*j*np.arange(n)/n)).sum()``. Parameters ---------- x : array_like Array to Fourier transform. n : int, optional Length of the Fourier transform. If ``n < x.shape[axis]``, `x` is truncated. If ``n > x.shape[axis]``, `x` is zero-padded. The default results in ``n = x.shape[axis]``. axis : int, optional Axis along which the fft's are computed; the default is over the last axis (i.e., ``axis=-1``). overwrite_x : bool, optional If True, the contents of `x` can be destroyed; the default is False. Returns ------- z : complex ndarray with the elements:: [y(0),y(1),..,y(n/2),y(1-n/2),...,y(-1)] if n is even [y(0),y(1),..,y((n-1)/2),y(-(n-1)/2),...,y(-1)] if n is odd where:: y(j) = sum[k=0..n-1] x[k] * exp(-sqrt(-1)*j*k* 2*pi/n), j = 0..n-1 Note that ``y(-j) = y(n-j).conjugate()``. See Also -------- ifft : Inverse FFT rfft : FFT of a real sequence Notes ----- The packing of the result is "standard": If ``A = fft(a, n)``, then ``A[0]`` contains the zero-frequency term, ``A[1:n/2]`` contains the positive-frequency terms, and ``A[n/2:]`` contains the negative-frequency terms, in order of decreasingly negative frequency. So for an 8-point transform, the frequencies of the result are [0, 1, 2, 3, -4, -3, -2, -1]. To rearrange the fft output so that the zero-frequency component is centered, like [-4, -3, -2, -1, 0, 1, 2, 3], use `fftshift`. For `n` even, ``A[n/2]`` contains the sum of the positive and negative-frequency terms. For `n` even and `x` real, ``A[n/2]`` will always be real. This function is most efficient when `n` is a power of two, and least efficient when `n` is prime. If the data type of `x` is real, a "real FFT" algorithm is automatically used, which roughly halves the computation time. To increase efficiency a little further, use `rfft`, which does the same calculation, but only outputs half of the symmetrical spectrum. If the data is both real and symmetrical, the `dct` can again double the efficiency, by generating half of the spectrum from half of the signal. Examples -------- >>> from scipy.fftpack import fft, ifft >>> x = np.arange(5) >>> np.allclose(fft(ifft(x)), x, atol=1e-15) # within numerical accuracy. True
y_fft = fftpack.fft(y)
print(y.shape)
print(type(y_fft))
print(y_fft.shape)
(500,) <type 'numpy.ndarray'> (500,)
plt.plot(y_fft.real, 'o-')
#plt.yscale('log')
plt.xlim([-5, 501])
(-5, 501)
What is this telling us? Lets take a look at the help file to see what fftpack.fft
returned:
plt.plot(fftpack.fftshift(np.abs(y_fft)), '-o')
#plt.yscale('log')
plt.xlim([-5, 501])
(-5, 501)
Now what if we take just the first few modes of the Fourier transform and try recreating the original signal:
#set the number of modes you'd like to see
k = 6
#now zero out the rest:
y_fft[k + 1:-k] = 0
y_fit = fftpack.ifft(y_fft).real
plt.plot(np.concatenate([x, 1+x]), np.concatenate([y, y]), '--', label='Template (Data)')
plt.plot(np.concatenate([x, 1+x]), np.concatenate([y_fit, y_fit]), label='Model')
plt.legend(loc='best')
plt.xlabel('Phase')
plt.ylabel('Amplitude')
plt.ylim([1.1, -0.1])
(1.1, -0.1)
A good opportunity to play more with the interactive features of IPython Notebooks
def fftApprox(nmodes):
#zero out other modes:
y_fftnew = fftpack.fft(y)
y_fftnew[nmodes + 1:-nmodes] = 0
y_fit = np.abs(fftpack.ifft(y_fftnew))
#now plot it up.
fig = plt.figure()
axTop = fig.add_axes([0.05, 0.3, .92, 0.67])
axTop.plot(np.concatenate([x, 1+x]), np.concatenate([y, y]), '--', label='Template (Data)')
axTop.plot(np.concatenate([x, 1+x]), np.concatenate([y_fit, y_fit]), label='Model')
axTop.legend(loc='best')
axTop.set_ylabel('Amplitude')
axTop.set_ylim([1.1, -0.1])
axTop.set_xlim([0, 2])
axBot = fig.add_axes([0.05, 0.05, 0.92, 0.25])
axBot.plot(np.concatenate([x, 1+x]), np.concatenate([y - y_fit, y - y_fit]))
axBot.set_xlim([0, 2])
axBot.set_ylabel('Temp - Mod')
axBot.set_xlabel('Phase')
axBot.text(0.025, 0.05, '$\sigma$: '+'{:.4f}'.format(np.std(y - y_fit)), transform=axBot.transAxes)
interact(fftApprox,
nmodes=widgets.IntSliderWidget(min=1,max=50,step=1,value=6))
There are some special properties of the Fourier Transform that are worth mentioning. One is that if $h(t)$ is real and even (i.e. $h(t) = h(-t)$), $H(f)$ will be real and even as well. One example of this is the Gaussian:
$$h(t) = \mathcal{N}(0, \mu) = \exp{(-t^{2}/\sigma^{2})}$$$$H(f) = \exp{(-2 \pi^{2} \sigma^{2} f^{2} )}$$mu = 0.
sigma=1.
def gaussy(mu, sigma, xarr):
return np.exp(-(xarr - mu)**2/(2 * sigma**2))
def plotGaussianTransform(mu, sigma):
xarr = np.linspace(-40, 40, 1e3)
gy = gaussy(mu, sigma, xarr)
ngy = gy / np.max(gy)
plt.plot(xarr, ngy, '-', label=r'h(t)')
ffty = fftpack.fftshift(np.abs(fftpack.fft(gaussy(mu, sigma, xarr))))
nffty = ffty/np.max(ffty)
plt.plot(xarr, nffty, '-', label='H(f)')
plt.legend(loc='best')
interact(plotGaussianTransform,
mu = widgets.FloatSliderWidget(min=-2,max=45.,step=.25,value=mu),
sigma = widgets.FloatSliderWidget(min=1e-3,max=10.,step=.001,value=sigma))
help(fftpack.fftfreq)
Help on function fftfreq in module numpy.fft.helper: fftfreq(n, d=1.0) Return the Discrete Fourier Transform sample frequencies. The returned float array `f` contains the frequency bin centers in cycles per unit of the sample spacing (with zero at the start). For instance, if the sample spacing is in seconds, then the frequency unit is cycles/second. Given a window length `n` and a sample spacing `d`:: f = [0, 1, ..., n/2-1, -n/2, ..., -1] / (d*n) if n is even f = [0, 1, ..., (n-1)/2, -(n-1)/2, ..., -1] / (d*n) if n is odd Parameters ---------- n : int Window length. d : scalar, optional Sample spacing (inverse of the sampling rate). Defaults to 1. Returns ------- f : ndarray Array of length `n` containing the sample frequencies. Examples -------- >>> signal = np.array([-2, 8, 6, 4, 1, 0, 3, 5], dtype=float) >>> fourier = np.fft.fft(signal) >>> n = signal.size >>> timestep = 0.1 >>> freq = np.fft.fftfreq(n, d=timestep) >>> freq array([ 0. , 1.25, 2.5 , 3.75, -5. , -3.75, -2.5 , -1.25])
Parseval's Theorem
The power spectrum, or power spectral density (PSD), is a quantity of great interest in time series analysis. It is defined over the range $0 \le f < \infty$ as
$$PSD(f) \equiv |H(f)|^{2} + |H(-f)|^{2}$$The power at a given frequency is a quantitative statement about the importance of each frequency mode.
Parseval's Theorem tells us that the total power is the same whether compute in the frequency or time domain:
$$P_{tot} \equiv \int_{0}^{\infty} PSD(f) df = \int_{-\infty}^{\infty} |h(t)|^{2} dt$$def fourierPlots(func):
transform = fftpack.fftshift(np.abs(fftpack.fft(func)))**2
N = len(transform)
freqs = fftpack.fftshift(fftpack.fftfreq(len(func), 1.))
fig = plt.figure()
funcax = fig.add_axes([0.05, 0.05, .5, 0.5])
funcax.plot(func, '-')
funcax.set_xlabel('T [s]')
funcax.set_ylabel('Amplitude')
tranax = fig.add_axes([0.65, 0.05, .5, 0.5])
tranax.plot(freqs[N/2:], transform[N/2:]+transform[0:N/2][::-1], '-')
#tranax.plot(freqs, fftpack.fftshift(fftpack.fft(func).real), 'r-')
#tranax.plot(freqs, fftpack.fftshift(fftpack.fft(func).imag), 'g-')
tranax.set_xlabel('$f$ [Hz]')
tranax.set_ylabel('$PSD(f)$')
#tranax.set_ylim([-1.5, 1.5])
Again, the Fourier Transform and inverse are: $$ H(f) = \int_{-\infty}^{\infty} h(t) \exp{(-i 2 \pi f t)} dt$$ $$h(t) = \int_{-\infty}^{\infty} H(f) \exp{(i 2 \pi f t)} df $$
deltaFunc = np.zeros(500)
deltaFunc[249] = 1
fourierPlots(deltaFunc)
fourierPlots(np.zeros(500) + 10)
pulsesFunc = np.zeros(500)
pulsesFunc[0::50] = 1
fourierPlots(pulsesFunc)
pulsesFunc = np.zeros(500)
pulsesFunc[0::10] = 1
fourierPlots(pulsesFunc)
fourierPlots(np.sin(np.linspace(0, 100*np.pi, 5e2)))
fourierPlots(np.sin(np.linspace(0, 1250*np.pi, 5e2)))
Recall that Fourier transform and its inverse extend to infinity
$$ H(f) = \int_{-\infty}^{\infty} h(t) \exp{(-i 2 \pi f t)} dt$$$$h(t) = \int_{-\infty}^{\infty} H(f) \exp{(i 2 \pi f t)} dt $$However, the Discrete Fourier Transform and its inverse only extend through the range of our data set
$$H_{k} = \sum_{j=0}^{N-1} h_j \exp{[-i 2 \pi jk/N]}$$$$h_j = \frac{1}{N} \sum_{k=0}^{N-1}H_k \exp{[i 2 \pi jk/N]}$$or in SciPy notation: $$y[k] = \sum_{n=0}^{N-1} \exp{(-2 \pi j \frac{kn}{N})}x[n]$$ $$x[n] = \frac{1}{N}\sum_{n=0}^{N-1} \exp{(2 \pi j \frac{kn}{N})} y[k]$$
Can we still estimate the PSD even though we're only using the Discrete Fourier Transform?
Let us define h(t) to be band limited if H(f) = 0 for $|f| > f_c$, where $f_c$ is the band limit, or Nyquist critical frequency. For h(t) there is a resolution limit in $t$ space of $t_c = 1/(2 f_c)$ below which $h(t)$ appears smooth.
The Nyquist sampling theorem tells us that we can exactly reconstruct h(t) from evenly sampled data when $\Delta t \le t_c$ as
$$h(t) = \frac{\Delta t}{t_c} \sum_{k=-\infty}^{k=\infty} h_k \frac{\sin{[2 \pi f_c (t - k \Delta t)]}}{2 \pi f_c (t - k \Delta t)}$$where $\Delta t$ is the time sampling resolution.
Let's look at an example with the function $$h(t) = \sin{(2 \pi t/P)}$$
This function is band limited with a critical frequency of $f_c = 1/P$. The critical sampling rate is therefore $t_c = 1/(2 f_c) = P/2$. As long as we sample with time steps smaller than $P/2$ the Nyquist sampling theorem tells us we can fully reconstruct $h(t)$.
However, if $h(t)$ is not band-limited, or if the sampling rate is not sufficient, aliasing will occur, which prevents us from exactly reconstructing $h(t)$. Aliasing falsely transfers frequencies $|f| > f_c$ into the $-f_c < f < f_c$ range.
So the discrete Fourier transform is a good estimate of the true Fourier transform as long as we're dealing with a band limited function and $\Delta t < t_c$. The PSD in this case can be written as
$$PSD(f_k) = 2 \left(\frac{T}{N} \right)^2 \left[\left(\sum_{j=0}^{N-1} h_j \cos{(2 \pi f_k t_j)} \right)^2 + \left(\sum_{j=0}^{N-1} h_j \sin{(2 \pi f_k t_j)} \right)^2\right]$$and this is true only for noiseless evenly sampled data.
plt.axvspan(0, 5, ymax=0.5, color='#75BEBB', alpha=0.75)
plt.axvline(10, color='#EB6649', lw=2)
plt.xlim([0, 20])
plt.ylim([0, 2])
plt.xlabel('Frequency')
<matplotlib.text.Text at 0x10eb3ffd0>
tfull = np.linspace(0, 2 * np.pi, 1e3)
tlow = np.linspace(0, 2 * np.pi, 1e1)
omega_hi = 10.
omega_low = 1.
#1. show the high sampling high frequency curve
plt.plot(tfull, np.sin(omega_hi*tfull), lw=2, color='#C7B077')
#3. show the high sampling low frequency cuve
plt.plot(tfull, np.sin(omega_low*tfull), lw=4, color='#46959E')
#2. show the low sampling high frequency curve
plt.plot(tlow, np.sin(omega_hi*tlow), 'o', color='#BA4C37')
plt.xlim([0, 2 * np.pi])
(0, 6.283185307179586)
period = 10.
tcrit = period / 2.
tfull = np.linspace(0, 10, 1e3)
#1. show the high sampling high frequency curve
plt.plot(tfull, np.sin(2 * np.pi *tfull / period), lw=2, color='#C7B077')
tlow = np.linspace(0, 10, tcrit)
plt.plot(tlow, np.sin(2 * np.pi *tlow / period), 'o', color='#BA4C37')
[<matplotlib.lines.Line2D at 0x10acdd990>]
Window Function
The sampling window function in the time domain can be thought of as the sum of delta functions at sampled observation times.
The Fourier transform of a set of delta functions with spacing $\Delta t$ is another set of delta functions with spacing $1 / \Delta t$.
plot1004()
Runs at $\mathcal{O}(N \log N)$ instead of $\mathcal{O}(N^2)$. Check it out in numerical recipes if you're curious of the details.
There are some other variants too. One mentioned in the book is Welch's method, which computes multiple Fourier transforms over overlapping windows of the data to smooth noise effects in the resulting spectrum.
plot1006()
Trigonometric basis functions used in the Fourier transform have an infinite extent.
For this reason the Fourier transform may not be the best method for nonperiodic time series data.
Instead of using trigonometric basis functions, a better alternative may be to use basis functions that are localized.
Wavelets are one such family of basis functions.
Wavelets are localized in both time and frequency.
Individual wavelets are specified by a set of wavelet filter coefficients.
Given a wavelet, a complete orthonormal set of basis functions can be constructed by scaling and translations.
Some well known wavelet families are Daubechies, Mexican hat, and Haar wavelets.
Similar to the discrete fourier transform, a discrete wavelet transform (DWT) can be used to analyze the power spectrum of a time series as a function of time. This can be performed using the PyWavelets toolkit.
plot1007()
plot1008()
In both of these examples the wavelet is of the form
$$w(t|t_0, f_0, Q) = A \exp{[i 2 \pi f_0(t - t_0)] \exp{[-f_0^2(t - t_0)]^2/Q^2}}$$which has the transform
$$W(f|t_0, f_0, Q) = \left( \frac{\pi}{f_0^2/Q^2} \right)^{1/2} \exp{(-i 2 \pi f t_0)} \exp{\left[ \frac{-\pi^2 Q^2(f - f_0)^2}{Qf_0^2}\right]}$$actually, it's not technically a wavelet because it doesn't meet the admissibility criterion, so it should be called a "matched filter". However, a scaling and offset could make this a Morlet wavelet.
A similar method is called matching pursuit, which uses a large redundant set of nonorthogonal basis functions and uses the data to determine the appropriate set of basis functions (AKA, a dictionary). For an example, see Lachowicz and Done (2010).
A recent example is from (Metcalfe et al. 2013), where they used wavelet analysis to analyze the magnetic cycles of Eps Eri.
Hatched regions are outside of the "cone of influence", which suffer from edge effects. The weakest signal is shown with white and blue (and presumably black and purple), to the strongest signal being red and black.
10.2.5. Digital Filtering
Using this technique will always reduce the information content of your data, but if you're going to use it, make sure you use the Wiener filter to suppress ringing.
plot1010()
Optimization terminated successfully. Current function value: 574032608.556378 Iterations: 95 Function evaluations: 180
The low pass filtering mentioned here can only be done on evenly sampled data. For unevenly sampled data, use a Savitzky-Golay filter.
A periodic time series satisfies $y(t + P) = y(t)$, where $P$ is the period.
A convenient concept is phase-folding:
$$\phi = \frac{t}{P} - int \left( \frac{t}{P} \right)$$Lets take an example of a simple oscillator: $$y(t) = A\sin{(\omega t + \phi)} + \epsilon$$
This can be partially linearized to speed up the analysis through the use of trigonometric identities
$$y(t) = a \sin{(\omega t)} + b \cos{(\omega t)}$$(use the property $\sin{(x+y)} = \sin{x}\cos{y} + \cos{x}\sin{y}$ and draw a triangle to show if $\tan{\phi}=b/a$, then $\cos{\phi}=a/A$ and $\sin{\phi}=b/A$
Then the likelihood function is $$\mathcal{L}(\omega, a, b, \sigma|t, y) \equiv p({t, y}| \omega, a, b, \sigma) = \prod_{j=1}^{N} \frac{1}{\sqrt{2 \pi \sigma}} \exp{\left( \frac{-[y_j - a \sin{(\omega t_j)} - b\cos{(\omega t_j)}]^2} {2 \sigma^2}\right)}$$
Using Bayes' Theorem $$p(\omega, a, b, \sigma|{t, y}, I) = \frac{p({t, y}| \omega, a, b, \sigma, I) \times p(\omega, a, b, \sigma | I)} {p({t, y}|I)} $$
and assuming uniform priors, and ignoring the evidence, the posterior is then proportional to
$$p(\omega, a, b, \sigma | {t, y}) \propto \sigma^{-N} \exp{\left( \frac{-NQ}{2 \sigma^2}\right)}$$where $Q$ is a really long term defined in the book. A few approximations were used to remove some of the terms.
If we're only interested in periodicity, and not the phase of the periodicity, we can further reduce the number of terms by marginzalizing over them
$$p(\omega, \sigma | {t, y}) \propto \int p(\omega, a, b, \sigma | {t, y}) \ da \ db$$Finally, we're left at
$$p(\omega, \sigma | {t, y}) \propto \sigma^{-(N-2)}\exp{\left(\frac{-NV}{2 \sigma^2} + \frac{P(\omega)}{\sigma^2}\right)}$$where
$$V = \frac{1}{N} \sum_{j=1}^{N}y_j^2$$and $P(\omega)$ is known as the periodogram and is defined as
$$P(\omega) = \frac{1}{N}[I^2(\omega) + R^2(\omega)]$$where
$$I(\omega) = \sum_{j=1}^{N} y_j \sin{(\omega t_j)}$$$$R(\omega) = \sum_{j=1}^{N} y_j \cos{(\omega t_j)}$$Finally, when the noise is known, this further simplifies to
$$p(\omega | {t, y}, \sigma) \propto \exp{\left( \frac{P(\omega)}{\sigma^2}\right)}$$see the book for how to come about this without marginalizing over $a$ and $b$. Essentially, you just take the derivative of the posterior, set it to zero, and solve for $a_0$ and $b_0$. Then just role with those values.
The meaning of periodogram
But what is the best value of $\omega$ that is supported by the data? To answer this let's compute $\chi^2$
$$\chi^2(\omega) \equiv \frac{1}{\sigma^2} \sum_{j=1}^{N} [y_j - y(t_j)]^2 = \frac{1}{\sigma^2} \sum_{j=1}^{N} [y_j - a_0 \sin{(\omega t_j)} - b_0 \cos{(\omega t_j)}]^2 $$If we set $a_0 = b_0 = 0$ to get a constant model, the $\chi^2$ for that constant model is
$$\chi_0^2 = \frac{1}{\sigma^2} \sum_{j=1}^{N} y_j^2 = \frac{NV}{\sigma^2}$$which can be used to rewrite $\chi^2$ as
$$\chi^2(\omega) = \chi_0^2 \left[ 1 - \frac{2}{NV} P(\omega) \right]$$The Lomb-Scargle periodogram is defined as
$$P_{LS} (\omega) = \frac{2}{NV} P(\omega)$$making the above
$$\frac{\chi^2(\omega)}{\chi_0^2} = 1 - P_{LS} (\omega)$$That's great that we now have a way of dealing with unevenly sampled data and it's easy to incorporate heteroscedastic uncertainties. But how do we gauge the significance of peaks?
One way is by using the Bayesian Information Criterion:
$$\Delta BIC = \chi_0^2 - \chi^2(\omega_0) - (k_0 - k_{\omega})\ln{N}$$similarly, there's the Aikake Information Criterion
$$\Delta AIC = \frac{NV}{\sigma^2} P_{LS}(\omega_0) - 6$$This seems like a good route to take for large data sets. A more obvious solution to determining peak significance will be discussed soon.
The discrete PSD gives optimal results if
otherwise, you should consider the LS Periodogram
The discrete Fourier Transform is great when there are evenly sampled data. Unfortunately it cannot be used when the data are unevenly sampled, which is often what we deal with in astronomy. In the cases with unevenly sampled data the Lomb-Scargle is a great tool for searching for periodicity.
The normalized Lomb-Scargle periodogram is defined as:
$$P_{LS}(\omega) = \frac{1}{V}\left[ \frac{R^2(\omega)}{C(\omega)} + \frac{I^2(\omega)} {S(\omega)} \right]$$The meaning of the Lomb-Scargle Periodogram
The Lomb Scargle periodogoram can be thought of as an inverted plot of $\chi^2$ normalized by $\chi_0^2$ (the $\chi^2$ for a constant model).
The Lomb-Scargle does not save computational effort (common misconception).
Practical application of the Lomb-Scargle periodogram
Since the underlying model is still nonlinear in frequency, and the basis functions at different frequencies are not orthogonal, the global maximum must be found through grid search.
This can be done using SciPy
:
from scipy.signal.spectral import lombscargle
help(lombscargle)
Help on built-in function lombscargle in module scipy.signal._spectral: lombscargle(...) lombscargle(x, y, freqs) Computes the Lomb-Scargle periodogram. The Lomb-Scargle periodogram was developed by Lomb [1]_ and further extended by Scargle [2]_ to find, and test the significance of weak periodic signals with uneven temporal sampling. The computed periodogram is unnormalized, it takes the value ``(A**2) * N/4`` for a harmonic signal with amplitude A for sufficiently large N. Parameters ---------- x : array_like Sample times. y : array_like Measurement values. freqs : array_like Angular frequencies for output periodogram. Returns ------- pgram : array_like Lomb-Scargle periodogram. Raises ------ ValueError If the input arrays `x` and `y` do not have the same shape. Notes ----- This subroutine calculates the periodogram using a slightly modified algorithm due to Townsend [3]_ which allows the periodogram to be calculated using only a single pass through the input arrays for each frequency. The algorithm running time scales roughly as O(x * freqs) or O(N^2) for a large number of samples and frequencies. References ---------- .. [1] N.R. Lomb "Least-squares frequency analysis of unequally spaced data", Astrophysics and Space Science, vol 39, pp. 447-462, 1976 .. [2] J.D. Scargle "Studies in astronomical time series analysis. II - Statistical aspects of spectral analysis of unevenly spaced data", The Astrophysical Journal, vol 263, pp. 835-853, 1982 .. [3] R.H.D. Townsend, "Fast calculation of the Lomb-Scargle periodogram using graphics processing units.", The Astrophysical Journal Supplement Series, vol 191, pp. 247-253, 2010 Examples -------- >>> import scipy.signal First define some input parameters for the signal: >>> A = 2. >>> w = 1. >>> phi = 0.5 * np.pi >>> nin = 1000 >>> nout = 100000 >>> frac_points = 0.9 # Fraction of points to select Randomly select a fraction of an array with timesteps: >>> r = np.random.rand(nin) >>> x = np.linspace(0.01, 10*np.pi, nin) >>> x = x[r >= frac_points] >>> normval = x.shape[0] # For normalization of the periodogram Plot a sine wave for the selected times: >>> y = A * np.sin(w*x+phi) Define the array of frequencies for which to compute the periodogram: >>> f = np.linspace(0.01, 10, nout) Calculate Lomb-Scargle periodogram: >>> pgram = sp.signal.lombscargle(x, y, f) Now make a plot of the input data: >>> plt.subplot(2, 1, 1) <matplotlib.axes.AxesSubplot object at 0x102154f50> >>> plt.plot(x, y, 'b+') [<matplotlib.lines.Line2D object at 0x102154a10>] Then plot the normalized periodogram: >>> plt.subplot(2, 1, 2) <matplotlib.axes.AxesSubplot object at 0x104b0a990> >>> plt.plot(f, np.sqrt(4*(pgram/normval))) [<matplotlib.lines.Line2D object at 0x104b2f910>] >>> plt.show()
numSamps = 1e2
x = np.sort(np.random.uniform(low=0, high=1000, size=numSamps))
unc = np.random.normal(loc=0, scale=0.5, size=numSamps)
signalPer = 100.9 #days
y = np.sin(2 * np.pi * x / signalPer)
fig = plt.figure()
ax = plt.subplot(211)
ax.plot(x, y, 'o-', markersize=5)
ax.plot(x, y + unc, 'o')
ax.set_xlabel('Time')
ax.set_ylabel('Amplitude')
ax = plt.subplot(212)
pmin = 0.5
pmax = 1000.
omega = np.linspace(2 * np.pi / pmax, 2 * np.pi / pmin, 3e3)
P_SP = lombscargle(x, y, omega)
P_SP_Unc = lombscargle(x, y + unc, omega)
ax.plot(2 * np.pi/omega, P_SP, lw=2, alpha=0.5, label='No Noise')
ax.plot(2 * np.pi/omega, P_SP_Unc, lw=2, alpha=0.5, label='Noise')
ax.set_xscale('log')
ax.set_xlim([0.5, 1e3])
ax.set_xlabel('Period [d]')
ax.set_ylabel('Power')
ax.legend(loc='best')
fig.subplots_adjust(hspace=0.25)
-c:2: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future -c:3: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
False Alarm Probability
As mentioned before, the $\Delta$BIC can give a good idea of the significance of a signal if you know how to interpret it, and a large data set can help interpret the threshold.
Alternatively, the significance of a peak can be assessed using a False Alarm Probability (FAP).
A boostrap resampling is employed to compute the FAP, where the observation times are kept fixed and the y values are drawn with replacement. The peak in every sample is then recorded to assess the desired FAP levels. astroML
has a built in method for doing this, which is shown below.
from astroML.time_series import \
lomb_scargle, lomb_scargle_BIC, lomb_scargle_bootstrap
help(lomb_scargle)
Help on built-in function lomb_scargle in module astroML_addons.periodogram: lomb_scargle(...) (Generalized) Lomb-Scargle Periodogram with Floating Mean Parameters ---------- t : array_like sequence of times y : array_like sequence of observations dy : array_like sequence of observational errors omega : array_like frequencies at which to evaluate p(omega) generalized : bool if True (default) use generalized lomb-scargle method otherwise, use classic lomb-scargle. subtract_mean : bool if True (default) subtract the sample mean from the data before computing the periodogram. Only referenced if generalized is False. significance : None or float or ndarray if specified, then this is a list of significances to compute for the results. Returns ------- p : array_like Lomb-Scargle power associated with each frequency omega z : array_like if significance is specified, this gives the levels corresponding to the desired significance (using the Scargle 1982 formalism) Notes ----- The algorithm is based on reference [1]_. The result for generalized=False is given by equation 4 of this work, while the result for generalized=True is given by equation 20. Note that the normalization used in this reference is different from that used in other places in the literature (e.g. [2]_). For a discussion of normalization and false-alarm probability, see [1]_. To recover the normalization used in Scargle [3]_, the results should be multiplied by (N - 1) / 2 where N is the number of data points. References ---------- .. [1] M. Zechmeister and M. Kurster, A&A 496, 577-584 (2009) .. [2] W. Press et al, Numerical Recipies in C (2002) .. [3] Scargle, J.D. 1982, ApJ 263:835-853
PLSastroML = lomb_scargle(x, y + unc, unc, omega)
# Get significance via bootstrap
D = lomb_scargle_bootstrap(x, y + unc, unc, omega, generalized=True,
N_bootstraps=1000, random_state=0)
sig1, sig5 = np.percentile(D, [99, 95])
help(ax.axhline)
Help on method axhline in module matplotlib.axes._axes: axhline(self, y=0, xmin=0, xmax=1, **kwargs) method of matplotlib.axes._subplots.AxesSubplot instance Add a horizontal line across the axis. Parameters ---------- y : scalar, optional, default: 0 y position in data coordinates of the horizontal line. xmin : scalar, optional, default: 0 Should be between 0 and 1, 0 being the far left of the plot, 1 the far right of the plot. xmax : scalar, optional, default: 1 Should be between 0 and 1, 0 being the far left of the plot, 1 the far right of the plot. Returns ------- `~matplotlib.lines.Line2D` Notes ----- kwargs are the same as kwargs to plot, and can be used to control the line properties. e.g., Examples -------- * draw a thick red hline at 'y' = 0 that spans the xrange:: >>> axhline(linewidth=4, color='r') * draw a default hline at 'y' = 1 that spans the xrange:: >>> axhline(y=1) * draw a default hline at 'y' = .5 that spans the the middle half of the xrange:: >>> axhline(y=.5, xmin=0.25, xmax=0.75) Valid kwargs are :class:`~matplotlib.lines.Line2D` properties, with the exception of 'transform': agg_filter: unknown alpha: float (0.0 transparent through 1.0 opaque) animated: [True | False] antialiased or aa: [True | False] axes: an :class:`~matplotlib.axes.Axes` instance clip_box: a :class:`matplotlib.transforms.Bbox` instance clip_on: [True | False] clip_path: [ (:class:`~matplotlib.path.Path`, :class:`~matplotlib.transforms.Transform`) | :class:`~matplotlib.patches.Patch` | None ] color or c: any matplotlib color contains: a callable function dash_capstyle: ['butt' | 'round' | 'projecting'] dash_joinstyle: ['miter' | 'round' | 'bevel'] dashes: sequence of on/off ink in points drawstyle: ['default' | 'steps' | 'steps-pre' | 'steps-mid' | 'steps-post'] figure: a :class:`matplotlib.figure.Figure` instance fillstyle: ['full' | 'left' | 'right' | 'bottom' | 'top' | 'none'] gid: an id string label: string or anything printable with '%s' conversion. linestyle or ls: [``'-'`` | ``'--'`` | ``'-.'`` | ``':'`` | ``'None'`` | ``' '`` | ``''``] linewidth or lw: float value in points lod: [True | False] marker: unknown markeredgecolor or mec: any matplotlib color markeredgewidth or mew: float value in points markerfacecolor or mfc: any matplotlib color markerfacecoloralt or mfcalt: any matplotlib color markersize or ms: float markevery: unknown path_effects: unknown picker: float distance in points or callable pick function ``fn(artist, event)`` pickradius: float distance in points rasterized: [True | False | None] sketch_params: unknown snap: unknown solid_capstyle: ['butt' | 'round' | 'projecting'] solid_joinstyle: ['miter' | 'round' | 'bevel'] transform: a :class:`matplotlib.transforms.Transform` instance url: a url string visible: [True | False] xdata: 1D array ydata: 1D array zorder: any number See also -------- axhspan : for example plot and source code
fig = plt.figure()
ax = plt.subplot(111)
ax.plot(2 * np.pi/omega, PLSastroML, lw=2, alpha=0.5)
ax.set_xscale('log')
ax.set_xlim([0.5, 1e3])
ax.set_xlabel('Period [d]')
ax.set_ylabel('Power')
ax.axhline(y=sig5, color='#BA4C37', ls='--', lw=3, label='5% FAP')
ax.axhline(y=sig1, color='#982023', label='1% FAP')
ax.legend(loc='best')
<matplotlib.legend.Legend at 0x10f094d10>
plot1015()
Generalized Lomb-Scargle Periodogram
One problem with the 1982 version of the Lomb-Scargle periodogram is that it implicitly assumes the mean of the sample is the same as the mean of the true function. Since we are not evenly sampling the data, may be working with small data sets, or may not be sampling the full period, this is a bad assumption. As Cumming et al. (1999) pointed out, this can cause problems with aliases.
To correct for this, Cumming et al. (1999) added a floating mean to overcome this deficiency:
$$y = a \cos{\omega t} + b \sin{\omega t} + c$$(note the extra free parameter, $c$). They demonstrated its effectiveness and used it for a thorough statistical analysis of the detectability of planets from the Lick planet search program. Later, Zechmeister and Kurster (2009) wrote up the analytical derivation for this and referred to it as the "generalized Lomb-Scargle" periodogram (not to be confused with the Bretthorst (2001) generalized Lomb-Scargle periodogram, which is completely different).
Errata: There was a typo in Figure 10.15. An updated version can be found here, and a corrected version with a larger data set is reproduced below.
plot1015()
What happens if data have an underlying variability that is more complex than a single sinusoid?
#6 stars from the LINEAR data set. Periods estimated using the Lomb-Scargle Periodogram.
plot1017()
Downloading http://www.astro.washington.edu/users/ivezic/linear/allDataFinal/allLINEARfinal_targets.dat [=========================================] 1001.2kb / 1001.2kb Downloading http://www.astro.washington.edu/users/ivezic/linear/allDataFinal/allLINEARfinal_dat.tar.gz [=========================================] 16.05Mb / 16.05Mb @pickle_results: using precomputed results from 'LINEAR_LS.pkl' - omega_0 = 17.21695506 - omega_0 = 51.10600808 - omega_0 = 10.20062676 - omega_0 = 45.55493503 - omega_0 = 58.44512277 - omega_0 = 10.82722772
One way of going about this is to use multiple Fourier terms:
$$y(t) = b_0 + \sum_{m=1}^{M} a_m \sin{(m \omega t)} + b_m \cos{(m \omega t)}$$and then the periodogram becomes
$$P_{M}(\omega) = \frac{2}{V} \sum_{m=1}^{M}[R_{m}^2(\omega) + I_{m}^2(\omega)]$$from astroML.time_series import multiterm_periodogram, MultiTermFit
from astroML.datasets import fetch_LINEAR_sample
data = fetch_LINEAR_sample()
t, y, dy = data[14752041].T
print(np.min(t), np.max(t))
(52653.465077000001, 54620.262198999997)
pmin_linear = 1./48. #days
pmax_linear = 2.
omega_linear = np.linspace(2. * np.pi /pmax_linear, 2. * np.pi / pmin_linear, 1e4)
P_MT1 = multiterm_periodogram(t, y, dy, omega_linear, 1)
P_MT6 = multiterm_periodogram(t, y, dy, omega_linear, 6)
P_LS = lomb_scargle(t, y, dy, omega_linear, generalized=True)
print(P_MT.shape)
print(t.shape)
print(omega_linear.shape)
fig = plt.figure()
ax = fig.add_subplot(211)
ax.plot(2. * np.pi / omega_linear * 24., P_MT1, label='1 Multiterm')
ax.plot(2. * np.pi / omega_linear * 24., P_MT6, label='6 Multiterm')
ax.plot(2. * np.pi / omega_linear * 24., P_LS, label='Lomb Scargle')
#plt.xscale('log')
ax.set_xlim([pmin_linear*24., pmax_linear*24.])
ax.set_xlabel('P [hours]')
ax.set_ylabel('Power')
ax.legend(loc='best')
ax = fig.add_subplot(212)
maxper = 2. * np.pi / omega_linear[np.where(P_LS == np.max(P_LS))[0]]
ax.plot((t / maxper) % 1 , y, '.')
ax.set_ylim([np.max(y), np.min(y)])
ax.set_xlabel('Phase')
ax.set_ylabel('Mag')
(3000,) (253,) (10000,)
<matplotlib.text.Text at 0x11dde4850>
plot1018()
But how do we know the number of Fourier modes to include? 5, 10, 100, M = N/2 - 1?
Bayesian Information Criterion to the rescue:
$$\Delta BIC_{M} = \chi_0^2 P_M (\omega_0^M) - (2M + 1) \ln{N}.$$Add components and recreate the multi-term periodogram until $\Delta BIC$ is maximized.
plot1019()
Among other techniques for classification, one option is to use the multiterm periodogram to find the best periods, and then fit for the best coeffecients for all $m$ components:
$$y(t) = b_0 + \sum_{m=1}^{M} a_m \sin{(m \omega_0 t)} + b_m \cos{(m \omega_0 t)}$$$b_0$ is a generalization term. But that's not all! You can even fit multiple periods simultaneously:
$$y(t) = b_0 + \sum_{k=1}^{K}\sum_{m=1}^{M} a_{km} \sin{(m \omega_k t)} + b_{km} \cos{(m \omega_k t)}$$A nonparametric method based on the hypothesis that diferent physical classes will be clustered together in the multi-dimensional parameter space. Once the clusters have been identified, they can be assigned a physical meaning.
The below plots show the resulting 12 clusters from minimizing the BIC on two data attributes: $g-i$ color and $\log{P}$. ab-type RR Lyras are two separate clusters because a Gaussian is a poor description of their distribution. A is light curve amplitude.
plot1021()
@pickle_results: using precomputed results from 'LINEAR_clustering.pkl' number of components: 12 number of components: 15
The bottom 4 panels in the above set are Figure 10.21. This is the same data set, but it fits seven attributes, resulting in a 15-component Gaussian mixture solution. Despite the additional attributes and components, the clusters really didn't change much, indicating that not much new information was added.
Supervised Approach
When a training set of data is available, other methods include a Gaussian mixture model Bayes classifier (Section 9.3.5) and support vector machines (Section 9.6). They both did fairly well, and the GMMB slightly outperformed the SVM (see tables 2 and 3 on p. 450). Both methods struggled distinguishing EA (Algol-type) from EB/EW (contact) eclipsing binaries. However, that's not surprising since even experts have difficulty distinguishing the two.
All of the above analysis assumes some uncertainty in the measurement value. But what about when there are sparse data sets where single events are recorded with negligible error. One example is an X-ray data set, where individual photons are detected with negligible background.
The Rayleigh test is the best known classical test in this situation.
$$R^2 = \left( \sum_{j=1}^{N} \cos{(2 \pi \phi_j)}\right)^2 + \left( \sum_{j=1}^{N} \sin{(2 \pi \phi_j)}\right)^2$$This is carried out for a grid of periods, just like the periodogram.
Small $R^2 =>$ random
Large $R^2 =>$ periodic
Nonparametric Bayesian Approach
An interseting alternative was carried out by Gregory and Loredo (1992).
First, divide the data set into very small bins to create a sparse data set, where values are either 1 or 0 in each bin.
The expectation value is then
$$\mu(t) = r(t) \Delta t$$From Poisson statistics, the probability of detecting no photons is:
$$p(0) = e^{-r(t) \Delta t}$$And the probability of a single event/count is: $$p(1) = r(t)\Delta t e^{-r(t)\Delta t}$$
The data likelihood is then $$p(D|r, I) = (\Delta t)^N \exp{\left( -\int_{T} r(t) dt \right)} \prod_{j=1}^{N}r(t_j)$$
They then used a nonparametric function for the rate function, treating it as a piecewise constant function, $f_j$, with $M$ steps of the same width, and $\sum_j f_j = 1$.
They then marginalize the resulting pdf to produce an analog for the periodogram.
When little is known about the shape of the signal, this method is superior to a Fourier series expansion.
Bayesian Blocks
Similar to the Gregory-Loredo method, but with adaptive bin sizes.
Stationary signal with an event localized in time.
Examples are a microlensing event and bursts of emission.
Two parametric models are burst signal and chirp signal.
Super basic. You have a stationary signal
$$y(t) = b_0 + \epsilon$$at some point, there's a burst. This can be generalized in the form of
$$y_B (t | T, A, \theta) = A g_B (t - T|\theta)$$One option for $g_B$ is $g_B(t \alpha) = \exp{(-\alpha t)}$. Use MCMC to estimate the parameters.
plot1025()
@pickle_results: using precomputed results from 'matchedfilt_burst.pkl'
Doing the same thing as above, but with a "chirp" signal
$$y(t) = b_0 + A \sin{[\omega t + \beta t^2]}$$plot1026()
@pickle_results: using precomputed results from 'matchedfilt_chirp.pkl'
And here's a combination of the two -- a temporally localized exponentially decaying chirp signal
$$y_C(t|T, A, \phi, \omega, \beta) = A \sin{[\phi + \omega (t - T) + \beta (t - T)^2 ]}\exp{[-\alpha (t - T)]}$$that is riding on top of another signal
$$y(t) = b_0 + b_1 \sin{(\Omega_1 t)} \sin{(\Omega_2 t)}$$plot1027()
@pickle_results: using precomputed results from 'matchedfilt_chirp2.pkl'
For signals that are not just localized in time, but in frequency too, wavelets can be helpul.
plot1028()
Stochastic Processes: behavior that is not predictable forever, but is always there.
Underlying physics is too complex to deterministically predict future values (e.g. magnetic dynamos, long-term weather)
The correlation of two functions, $f(t)$ and $g(t)$, is
$$CF(\Delta t) = \frac{\lim_{T \rightarrow \infty} \frac{1}{T} \int_{(T)} f(t)g(t+\Delta t) dt}{\sigma_f \sigma_g}$$and the autocorrelation function compares a function to itself and different times
$$ACF(\Delta t) = \frac{\lim_{T \rightarrow \infty} \frac{1}{T} \int_{(T)} f(t)f(t+\Delta t) dt}{\sigma_f^2}$$If the time series repeats itself by shifting the time axis by $t_{lag}$, then the ACF will have a peak at $\Delta t = t_{lag}$.
This is analogous to computing the PSD in period space. For that reason, the ACF and PSD are known as "Fourier Pairs", which is known as the "Wiener-Khinchin theorem".
Structure Function
The structure function is closely related to the ACF.
$$SF(\Delta t) = SF_{\infty} [1 - ACF(\Delta t)]^{1/2}$$where $SF_{\infty}$ is the standard deviation of the time series over infinite time.
The structure function, the ACF, and the PSD are all mathematically equivalent and are all used; however, due to issues of noise and sampling, they may not always result in equivalent inferences about the data.
Examples of stochastic processes: $1/f$ and $1/f^2$ processes
The authors are really getting lazy here... Essentially, see [56] for details.
plot1029()
The above figure illustrates the connection between the PSDs for $1/f$ and $1/f^2$ processes and their corresponding time series.
The variance of the mean of a $1/f$ process does not decrease with time -- it's infinite.
Similar to problems with the PSD with unevenly sampled data, the discrete ACF only works for evenly sampled data. An alternative is the discrete correlation function (DCF, or UDCF), also known as the "slot autocorrelation function" in the physics community.
It's just binning the data.
astroML includes tools for computing the ACF of unevenly sampled data using two methods: the Scargle method and the Edelson Krolik method.
The difference (or lack thereof) is shown below)
plot1030()
Autoregressive models provide a good general description of processes that "retain memory", but are not periodic. A random walk is a good example:
$$y_i = \beta y_{i-1} + e_i$$$\beta > 1$ is known as a geometric random walk model, which is used extensively to model stock market data.
An autoregressive process of order $k$ for a discrete process is defined by
$$y_i = \sum_{j=1}^k a_j y_{i-j} + e_i$$In other words, the $ith$ value of $y$ is a linear combination of the $k$ previous values of $y$.
Related processes include
discrete
The equation for an autoregressive process shown above is only applicable for an evenly sampled data set. A generalization of this is the continuous autoregressive process (CAR($k$)).
A Gaussian process is also a great to model time series.
A CAR(1) process is also known as a damped random walk, or an Ornstein-Uhlenbeck process. The damped random walk is a $1/f^2$ process at high frequencies.
For evenly sampled data the CAR(1) process is equivalent to the AR(1) process with $a_1 = \exp{(-1/\tau)}$.
A damped random walk provides a good description of the optical continuum of quasars, and has been used to distinguish them from stars.
Astronomical data are usually unevenly sampled with heteroscedastic errors. For this reason, the generalized Lomb-Scargle periodogram is usually a good starting point for periodic variability.
If dealing with sparse data and an unknown model, the Gregory-Loredo method or Bayesian blocks can be really useful.
You can also drop in your own function to make a periodogram specific to your model. A good example of this is the box least squares (BLS) periodogram for transit data.
Another good non-parametric model is importance sampling, otherwise known as Approximate Bayesian Computing. Nikhil is a good person to talk to about this -- he has applied it to RV data with a Gaussian Mixture Model.
Other topics not discussed in this chapter include state-space models, Kalman filters, Markov chains, and stochastic differential equations.
# Author: Jake VanderPlas
# License: BSD
# The figure produced by this code is published in the textbook
# "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
# For more information, see http://astroML.github.com
# To report a bug or issue, use the following forum:
# https://groups.google.com/forum/#!forum/astroml-general
#----------------------------------------------------------------------
# This function adjusts matplotlib settings for a uniform feel in the textbook.
# Note that with usetex=True, fonts are rendered with LaTeX. This may
# result in an error if LaTeX is not installed on your system. In that case,
# you can set usetex to False.
def plot1001():
#------------------------------------------------------------
# Load the RR Lyrae template
templates = fetch_rrlyrae_templates()
x, y = templates['115r'].T
#------------------------------------------------------------
# Plot the results
fig = plt.figure(figsize=myfigsize)
fig.subplots_adjust(hspace=0)
kvals = [1, 3, 8]
subplots = [311, 312, 313]
for (k, subplot) in zip(kvals, subplots):
ax = fig.add_subplot(subplot)
# Use FFT to fit a truncated Fourier series
y_fft = np.fft.fft(y)
y_fft[k + 1:-k] = 0
y_fit = np.fft.ifft(y_fft).real
# plot the true value and the k-term reconstruction
ax.plot(np.concatenate([x, 1 + x]),
np.concatenate([y, y]), '--k', lw=2)
ax.plot(np.concatenate([x, 1 + x]),
np.concatenate([y_fit, y_fit]), color='gray')
label = "%i mode" % k
if k > 1:
label += 's'
ax.text(0.02, 0.1, label, ha='left', va='bottom',
transform=ax.transAxes)
if subplot == subplots[-1]:
ax.set_xlabel('phase')
else:
ax.xaxis.set_major_formatter(plt.NullFormatter())
if subplot == subplots[1]:
ax.set_ylabel('amplitude')
ax.yaxis.set_major_formatter(plt.NullFormatter())
ax.set_xlim(0, 2)
ax.set_ylim(1.1, -0.1)
#plt.show()
# Author: Jake VanderPlas
# License: BSD
# The figure produced by this code is published in the textbook
# "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
# For more information, see http://astroML.github.com
# To report a bug or issue, use the following forum:
# https://groups.google.com/forum/#!forum/astroml-general
import numpy as np
from matplotlib import pyplot as plt
from astroML.time_series import \
lomb_scargle, lomb_scargle_BIC, lomb_scargle_bootstrap
#----------------------------------------------------------------------
# This function adjusts matplotlib settings for a uniform feel in the textbook.
# Note that with usetex=True, fonts are rendered with LaTeX. This may
# result in an error if LaTeX is not installed on your system. In that case,
# you can set usetex to False.
from astroML.plotting import setup_text_plots
setup_text_plots(fontsize=8, usetex=True)
#------------------------------------------------------------
# Generate data where y is positive
np.random.seed(0)
N = 30
P = 0.3
t = P / 2 * np.random.random(N) + P * np.random.randint(100, size=N)
y = 10 + np.sin(2 * np.pi * t / P)
dy = 0.5 + 0.5 * np.random.random(N)
y_obs = y + np.random.normal(dy)
omega_0 = 2 * np.pi / P
#######################################################################
# Generate the plot with and without the original typo
for typo in [True, False]:
#------------------------------------------------------------
# Compute the Lomb-Scargle Periodogram
sig = np.array([0.1, 0.01, 0.001])
omega = np.linspace(17, 22, 1000)
# Notice the typo: we used y rather than y_obs
if typo is True:
P_S = lomb_scargle(t, y, dy, omega, generalized=False)
P_G = lomb_scargle(t, y, dy, omega, generalized=True)
else:
P_S = lomb_scargle(t, y_obs, dy, omega, generalized=False)
P_G = lomb_scargle(t, y_obs, dy, omega, generalized=True)
#------------------------------------------------------------
# Get significance via bootstrap
D = lomb_scargle_bootstrap(t, y_obs, dy, omega, generalized=True,
N_bootstraps=1000, random_state=0)
sig1, sig5 = np.percentile(D, [99, 95])
#Figure 10.4
# Author: Jake VanderPlas
# License: BSD
# The figure produced by this code is published in the textbook
# "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
# For more information, see http://astroML.github.com
# To report a bug or issue, use the following forum:
# https://groups.google.com/forum/#!forum/astroml-general
#----------------------------------------------------------------------
# This function adjusts matplotlib settings for a uniform feel in the textbook.
# Note that with usetex=True, fonts are rendered with LaTeX. This may
# result in an error if LaTeX is not installed on your system. In that case,
# you can set usetex to False.
def plot1004():
#from astroML.plotting import setup_text_plots
#setup_text_plots(fontsize=8, usetex=True)
#------------------------------------------------------------
# Generate the data
Nbins = 2 ** 15
Nobs = 40
f = lambda t: np.sin(np.pi * t / 3)
t = np.linspace(-100, 200, Nbins)
dt = t[1] - t[0]
y = f(t)
# select observations
np.random.seed(42)
t_obs = 100 * np.random.random(40)
D = abs(t_obs[:, np.newaxis] - t)
i = np.argmin(D, 1)
t_obs = t[i]
y_obs = y[i]
window = np.zeros(Nbins)
window[i] = 1
#------------------------------------------------------------
# Compute PSDs
Nfreq = Nbins / 2
dt = t[1] - t[0]
df = 1. / (Nbins * dt)
f = df * np.arange(Nfreq)
PSD_window = abs(np.fft.fft(window)[:Nfreq]) ** 2
PSD_y = abs(np.fft.fft(y)[:Nfreq]) ** 2
PSD_obs = abs(np.fft.fft(y * window)[:Nfreq]) ** 2
# normalize the true PSD so it can be shown in the plot:
# in theory it's a delta function, so normalization is
# arbitrary
# scale PSDs for plotting
PSD_window /= 500
PSD_y /= PSD_y.max()
PSD_obs /= 500
#------------------------------------------------------------
# Prepare the figures
fig = plt.figure()
fig.subplots_adjust(bottom=0.15, hspace=0.2, wspace=0.25,
left=0.12, right=0.95)
# First panel: data vs time
ax = fig.add_subplot(221)
ax.plot(t, y, '-', c='gray')
ax.plot(t_obs, y_obs, '.k', ms=4)
ax.text(0.95, 0.93, "Data", ha='right', va='top', transform=ax.transAxes)
ax.set_ylabel('$y(t)$')
ax.set_xlim(0, 100)
ax.set_ylim(-1.5, 1.8)
# Second panel: PSD of data
ax = fig.add_subplot(222)
ax.fill(f, PSD_y, fc='gray', ec='gray')
ax.plot(f, PSD_obs, '-', c='black')
ax.text(0.95, 0.93, "Data PSD", ha='right', va='top', transform=ax.transAxes)
ax.set_ylabel('$P(f)$')
ax.set_xlim(0, 1.0)
ax.set_ylim(-0.1, 1.1)
# Third panel: window vs time
ax = fig.add_subplot(223)
ax.plot(t, window, '-', c='black')
ax.text(0.95, 0.93, "Window", ha='right', va='top', transform=ax.transAxes)
ax.set_xlabel('$t$')
ax.set_ylabel('$y(t)$')
ax.set_xlim(0, 100)
ax.set_ylim(-0.2, 1.5)
# Fourth panel: PSD of window
ax = fig.add_subplot(224)
ax.plot(f, PSD_window, '-', c='black')
ax.text(0.95, 0.93, "Window PSD", ha='right', va='top', transform=ax.transAxes)
ax.set_xlabel('$f$')
ax.set_ylabel('$P(f)$')
ax.set_xlim(0, 1.0)
ax.set_ylim(-0.1, 1.1)
def plot1006():
# Author: Jake VanderPlas
# License: BSD
# The figure produced by this code is published in the textbook
# "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
# For more information, see http://astroML.github.com
# To report a bug or issue, use the following forum:
# https://groups.google.com/forum/#!forum/astroml-general
from scipy import fftpack
from matplotlib import mlab
from astroML.datasets import fetch_LIGO_large
#------------------------------------------------------------
# Fetch the LIGO hanford data
data, dt = fetch_LIGO_large()
# subset of the data to plot
t0 = 646
T = 2
tplot = dt * np.arange(T * 4096)
dplot = data[4096 * t0: 4096 * (t0 + T)]
tplot = tplot[::10]
dplot = dplot[::10]
fmin = 40
fmax = 2060
#------------------------------------------------------------
# compute PSD using simple FFT
N = len(data)
df = 1. / (N * dt)
PSD = abs(dt * fftpack.fft(data)[:N / 2]) ** 2
f = df * np.arange(N / 2)
cutoff = ((f >= fmin) & (f <= fmax))
f = f[cutoff]
PSD = PSD[cutoff]
f = f[::100]
PSD = PSD[::100]
#------------------------------------------------------------
# compute PSD using Welch's method -- no window function
PSDW1, fW1 = mlab.psd(data, NFFT=4096, Fs=1. / dt,
window=mlab.window_none, noverlap=2048)
dfW1 = fW1[1] - fW1[0]
cutoff = (fW1 >= fmin) & (fW1 <= fmax)
fW1 = fW1[cutoff]
PSDW1 = PSDW1[cutoff]
#------------------------------------------------------------
# compute PSD using Welch's method -- hanning window function
PSDW2, fW2 = mlab.psd(data, NFFT=4096, Fs=1. / dt,
window=mlab.window_hanning, noverlap=2048)
dfW2 = fW2[1] - fW2[0]
cutoff = (fW2 >= fmin) & (fW2 <= fmax)
fW2 = fW2[cutoff]
PSDW2 = PSDW2[cutoff]
#------------------------------------------------------------
# Plot the data
fig = plt.figure()
fig.subplots_adjust(bottom=0.1, top=0.9, hspace=0.3)
# top panel: time series
ax = fig.add_subplot(311)
ax.plot(tplot, dplot, '-k')
ax.set_xlabel('time (s)')
ax.set_ylabel('$h(t)$')
ax.set_ylim(-1.2E-18, 1.2E-18)
# middle panel: non-windowed filter
ax = fig.add_subplot(312)
ax.loglog(f, PSD, '-', c='#AAAAAA')
ax.loglog(fW1, PSDW1, '-k')
ax.text(0.98, 0.95, "Top-hat window",
ha='right', va='top', transform=ax.transAxes)
ax.set_xlabel('frequency (Hz)')
ax.set_ylabel(r'$PSD(f)$')
ax.set_xlim(40, 2060)
ax.set_ylim(1E-46, 1E-36)
ax.yaxis.set_major_locator(plt.LogLocator(base=100))
# bottom panel: hanning window
ax = fig.add_subplot(313)
ax.loglog(f, PSD, '-', c='#AAAAAA')
ax.loglog(fW2, PSDW2, '-k')
ax.text(0.98, 0.95, "Hanning (cosine) window",
ha='right', va='top', transform=ax.transAxes)
ax.set_xlabel('frequency (Hz)')
ax.set_ylabel(r'$PSD(f)$')
ax.set_xlim(40, 2060)
ax.set_ylim(1E-46, 1E-36)
ax.yaxis.set_major_locator(plt.LogLocator(base=100))
#Figure 10.7
# Author: Jake VanderPlas
# License: BSD
# The figure produced by this code is published in the textbook
# "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
# For more information, see http://astroML.github.com
# To report a bug or issue, use the following forum:
# https://groups.google.com/forum/#!forum/astroml-general
def plot1007():
from astroML.fourier import\
FT_continuous, IFT_continuous, sinegauss, sinegauss_FT, wavelet_PSD
#------------------------------------------------------------
# Sample the function: localized noise
np.random.seed(0)
N = 1024
t = np.linspace(-5, 5, N)
x = np.ones(len(t))
h = np.random.normal(0, 1, len(t))
h *= np.exp(-0.5 * (t / 0.5) ** 2)
#------------------------------------------------------------
# Compute an example wavelet
W = sinegauss(t, 0, 1.5, Q=1.0)
#------------------------------------------------------------
# Compute the wavelet PSD
f0 = np.linspace(0.5, 7.5, 100)
wPSD = wavelet_PSD(t, h, f0, Q=1.0)
#------------------------------------------------------------
# Plot the results
fig = plt.figure()
fig.subplots_adjust(hspace=0.05, left=0.12, right=0.95, bottom=0.08, top=0.95)
# First panel: the signal
ax = fig.add_subplot(311)
ax.plot(t, h, '-k', lw=1)
ax.text(0.02, 0.95, ("Input Signal:\n"
"Localized Gaussian noise"),
ha='left', va='top', transform=ax.transAxes)
ax.set_xlim(-4, 4)
ax.set_ylim(-2.9, 2.9)
ax.xaxis.set_major_formatter(plt.NullFormatter())
ax.set_ylabel('$h(t)$')
# Second panel: an example wavelet
ax = fig.add_subplot(312)
ax.plot(t, W.real, '-k', label='real part', lw=1)
ax.plot(t, W.imag, '--k', label='imag part', lw=1)
ax.text(0.02, 0.95, ("Example Wavelet\n"
"$t_0 = 0$, $f_0=1.5$, $Q=1.0$"),
ha='left', va='top', transform=ax.transAxes)
ax.text(0.98, 0.05,
(r"$w(t; t_0, f_0, Q) = e^{-[f_0 (t - t_0) / Q]^2}"
"e^{2 \pi i f_0 (t - t_0)}$"),
ha='right', va='bottom', transform=ax.transAxes)
ax.legend(loc=1)
ax.set_xlim(-4, 4)
ax.set_ylim(-1.4, 1.4)
ax.set_ylabel('$w(t; t_0, f_0, Q)$')
ax.xaxis.set_major_formatter(plt.NullFormatter())
# Third panel: the spectrogram
ax = plt.subplot(313)
ax.imshow(wPSD, origin='lower', aspect='auto', cmap=plt.cm.jet,
extent=[t[0], t[-1], f0[0], f0[-1]])
ax.text(0.02, 0.95, ("Wavelet PSD"), color='w',
ha='left', va='top', transform=ax.transAxes)
ax.set_xlim(-4, 4)
ax.set_ylim(0.5, 7.5)
ax.set_xlabel('$t$')
ax.set_ylabel('$f_0$')
#Figure 10.8
# Author: Jake VanderPlas
# License: BSD
# The figure produced by this code is published in the textbook
# "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
# For more information, see http://astroML.github.com
# To report a bug or issue, use the following forum:
# https://groups.google.com/forum/#!forum/astroml-general
def plot1008():
from astroML.fourier import FT_continuous, IFT_continuous
def wavelet(t, t0, f0, Q):
return (np.exp(-(f0 / Q * (t - t0)) ** 2)
* np.exp(2j * np.pi * f0 * (t - t0)))
def wavelet_FT(f, t0, f0, Q):
# this is its fourier transform using
# H(f) = integral[ h(t) exp(-2pi i f t) dt]
return (np.sqrt(np.pi) * Q / f0
* np.exp(-2j * np.pi * f * t0)
* np.exp(-(np.pi * (f - f0) * Q / f0) ** 2))
def check_funcs(t0=1, f0=2, Q=3):
t = np.linspace(-5, 5, 10000)
h = wavelet(t, t0, f0, Q)
f, H = FT_continuous(t, h)
assert np.allclose(H, wavelet_FT(f, t0, f0, Q))
#------------------------------------------------------------
# Create the simulated dataset
np.random.seed(5)
t = np.linspace(-40, 40, 2001)[:-1]
h = np.exp(-0.5 * ((t - 20.) / 1.0) ** 2)
hN = h + np.random.normal(0, 0.5, size=h.shape)
#------------------------------------------------------------
# Compute the convolution via the continuous Fourier transform
# This is more exact than using the discrete transform, because
# we have an analytic expression for the FT of the wavelet.
Q = 0.3
f0 = 2 ** np.linspace(-3, -1, 100)
f, H = FT_continuous(t, hN)
W = np.conj(wavelet_FT(f, 0, f0[:, None], Q))
t, HW = IFT_continuous(f, H * W)
#------------------------------------------------------------
# Plot the results
fig = plt.figure()
fig.subplots_adjust(hspace=0.05, left=0.12, right=0.95, bottom=0.08, top=0.95)
# First panel: the signal
ax = fig.add_subplot(311)
ax.plot(t, hN, '-k', lw=1)
ax.text(0.02, 0.95, ("Input Signal:\n"
"Localized spike plus noise"),
ha='left', va='top', transform=ax.transAxes)
ax.set_xlim(-40, 40)
ax.set_ylim(-1.2, 2.2)
ax.xaxis.set_major_formatter(plt.NullFormatter())
ax.set_ylabel('$h(t)$')
# Second panel: the wavelet
ax = fig.add_subplot(312)
W = wavelet(t, 0, 0.125, Q)
ax.plot(t, W.real, '-k', label='real part', lw=1)
ax.plot(t, W.imag, '--k', label='imag part', lw=1)
ax.legend(loc=1)
ax.text(0.02, 0.95, ("Example Wavelet\n"
"$t_0 = 0$, $f_0=1/8$, $Q=0.3$"),
ha='left', va='top', transform=ax.transAxes)
ax.text(0.98, 0.05,
(r"$w(t; t_0, f_0, Q) = e^{-[f_0 (t - t_0) / Q]^2}"
"e^{2 \pi i f_0 (t - t_0)}$"),
ha='right', va='bottom', transform=ax.transAxes)
ax.set_xlim(-40, 40)
ax.set_ylim(-1.4, 1.4)
ax.set_ylabel('$w(t; t_0, f_0, Q)$')
ax.xaxis.set_major_formatter(plt.NullFormatter())
# Third panel: the spectrogram
ax = fig.add_subplot(313)
ax.imshow(abs(HW) ** 2, origin='lower', aspect='auto', cmap=plt.cm.binary,
extent=[t[0], t[-1], np.log2(f0)[0], np.log2(f0)[-1]])
ax.set_xlim(-40, 40)
ax.text(0.02, 0.95, ("Wavelet PSD"), color='w',
ha='left', va='top', transform=ax.transAxes)
ax.set_ylim(np.log2(f0)[0], np.log2(f0)[-1])
ax.set_xlabel('$t$')
ax.set_ylabel('$f_0$')
ax.yaxis.set_major_locator(plt.MultipleLocator(1))
ax.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, *args: ("1/%i"
% (2 ** -x))))
#Figure 10.10
# Author: Jake VanderPlas
# License: BSD
# The figure produced by this code is published in the textbook
# "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
# For more information, see http://astroML.github.com
# To report a bug or issue, use the following forum:
# https://groups.google.com/forum/#!forum/astroml-general
def plot1010():
from scipy import optimize, fftpack
from astroML.filters import savitzky_golay, wiener_filter
#------------------------------------------------------------
# Create the noisy data
np.random.seed(5)
N = 2000
dt = 0.05
t = dt * np.arange(N)
h = np.exp(-0.5 * ((t - 20.) / 1.0) ** 2)
hN = h + np.random.normal(0, 0.5, size=h.shape)
Df = 1. / N / dt
f = fftpack.ifftshift(Df * (np.arange(N) - N / 2))
HN = fftpack.fft(hN)
#------------------------------------------------------------
# Set up the Wiener filter:
# fit a model to the PSD consisting of the sum of a
# gaussian and white noise
h_smooth, PSD, P_S, P_N, Phi = wiener_filter(t, hN, return_PSDs=True)
#------------------------------------------------------------
# Use the Savitzky-Golay filter to filter the values
h_sg = savitzky_golay(hN, window_size=201, order=4, use_fft=False)
#------------------------------------------------------------
# Plot the results
N = len(t)
Df = 1. / N / (t[1] - t[0])
f = fftpack.ifftshift(Df * (np.arange(N) - N / 2))
HN = fftpack.fft(hN)
fig = plt.figure()
fig.subplots_adjust(wspace=0.05, hspace=0.25,
bottom=0.1, top=0.95,
left=0.12, right=0.95)
# First plot: noisy signal
ax = fig.add_subplot(221)
ax.plot(t, hN, '-', c='gray')
ax.plot(t, np.zeros_like(t), ':k')
ax.text(0.98, 0.95, "Input Signal", ha='right', va='top',
transform=ax.transAxes, bbox=dict(fc='w', ec='none'))
ax.set_xlim(0, 90)
ax.set_ylim(-0.5, 1.5)
ax.xaxis.set_major_locator(plt.MultipleLocator(20))
ax.set_xlabel(r'$\lambda$')
ax.set_ylabel('flux')
# Second plot: filtered signal
ax = plt.subplot(222)
ax.plot(t, np.zeros_like(t), ':k', lw=1)
ax.plot(t, h_smooth, '-k', lw=1.5, label='Wiener')
ax.plot(t, h_sg, '-', c='gray', lw=1, label='Savitzky-Golay')
ax.text(0.98, 0.95, "Filtered Signal", ha='right', va='top',
transform=ax.transAxes)
ax.legend(loc='upper right', bbox_to_anchor=(0.98, 0.9), frameon=False)
ax.set_xlim(0, 90)
ax.set_ylim(-0.5, 1.5)
ax.yaxis.set_major_formatter(plt.NullFormatter())
ax.xaxis.set_major_locator(plt.MultipleLocator(20))
ax.set_xlabel(r'$\lambda$')
# Third plot: Input PSD
ax = fig.add_subplot(223)
ax.scatter(f[:N / 2], PSD[:N / 2], s=9, c='k')
ax.plot(f[:N / 2], P_S[:N / 2], '-k')
ax.plot(f[:N / 2], P_N[:N / 2], '-k')
ax.text(0.98, 0.95, "Input PSD", ha='right', va='top',
transform=ax.transAxes)
ax.set_ylim(-100, 3500)
ax.set_xlim(0, 0.9)
ax.yaxis.set_major_locator(plt.MultipleLocator(1000))
ax.xaxis.set_major_locator(plt.MultipleLocator(0.2))
ax.set_xlabel('$f$')
ax.set_ylabel('$PSD(f)$')
# Fourth plot: Filtered PSD
ax = fig.add_subplot(224)
filtered_PSD = (Phi * abs(HN)) ** 2
ax.scatter(f[:N / 2], filtered_PSD[:N / 2], s=9, c='k')
ax.text(0.98, 0.95, "Filtered PSD", ha='right', va='top',
transform=ax.transAxes)
ax.set_ylim(-100, 3500)
ax.set_xlim(0, 0.9)
ax.yaxis.set_major_locator(plt.MultipleLocator(1000))
ax.yaxis.set_major_formatter(plt.NullFormatter())
ax.xaxis.set_major_locator(plt.MultipleLocator(0.2))
ax.set_xlabel('$f$')
# Author: Jake VanderPlas
# License: BSD
# The figure produced by this code is published in the textbook
# "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
# For more information, see http://astroML.github.com
# To report a bug or issue, use the following forum:
# https://groups.google.com/forum/#!forum/astroml-general
def plot1015():
from astroML.time_series import\
lomb_scargle, lomb_scargle_BIC, lomb_scargle_bootstrap
#------------------------------------------------------------
# Generate Data
np.random.seed(0)
N = 30
P = 0.3
t = np.random.randint(100, size=N) + 0.3 + 0.4 * np.random.random(N)
y = 10 + np.sin(2 * np.pi * t / P)
dy = 0.5 + 0.5 * np.random.random(N)
y_obs = np.random.normal(y, dy)
#------------------------------------------------------------
# Compute periodogram
period = 10 ** np.linspace(-1, 0, 10000)
omega = 2 * np.pi / period
PS = lomb_scargle(t, y_obs, dy, omega, generalized=True)
#------------------------------------------------------------
# Get significance via bootstrap
D = lomb_scargle_bootstrap(t, y_obs, dy, omega, generalized=True,
N_bootstraps=1000, random_state=0)
sig1, sig5 = np.percentile(D, [99, 95])
#------------------------------------------------------------
# Plot the results
fig = plt.figure()
fig.subplots_adjust(left=0.1, right=0.9, hspace=0.25)
# First panel: the data
ax = fig.add_subplot(211)
ax.errorbar(t, y_obs, dy, fmt='.k', lw=1, ecolor='gray')
ax.set_xlabel('time (days)')
ax.set_ylabel('flux')
ax.set_xlim(-5, 105)
# Second panel: the periodogram & significance levels
ax1 = fig.add_subplot(212, xscale='log')
ax1.plot(period, PS, '-', c='black', lw=1, zorder=1)
ax1.plot([period[0], period[-1]], [sig1, sig1], ':', c='black')
ax1.plot([period[0], period[-1]], [sig5, sig5], ':', c='black')
ax1.annotate("", (0.3, 0.65), (0.3, 0.85), ha='center',
arrowprops=dict(arrowstyle='->'))
ax1.set_xlim(period[0], period[-1])
ax1.set_ylim(-0.05, 0.85)
ax1.set_xlabel(r'period (days)')
ax1.set_ylabel('power')
# Twin axis: label BIC on the right side
ax2 = ax1.twinx()
ax2.set_ylim(tuple(lomb_scargle_BIC(ax1.get_ylim(), y_obs, dy)))
ax2.set_ylabel(r'$\Delta BIC$')
ax1.xaxis.set_major_formatter(plt.FormatStrFormatter('%.1f'))
ax1.xaxis.set_minor_formatter(plt.FormatStrFormatter('%.1f'))
ax1.xaxis.set_major_locator(plt.LogLocator(10))
ax1.xaxis.set_major_formatter(plt.FormatStrFormatter('%.3g'))
def plot1015():
# Author: Jake VanderPlas
# License: BSD
# The figure produced by this code is published in the textbook
# "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
# For more information, see http://astroML.github.com
# To report a bug or issue, use the following forum:
# https://groups.google.com/forum/#!forum/astroml-general
from astroML.time_series import \
lomb_scargle, lomb_scargle_BIC, lomb_scargle_bootstrap
#------------------------------------------------------------
# Generate data where y is positive
np.random.seed(0)
N = 30
P = 0.3
t = P / 2 * np.random.random(N) + P * np.random.randint(100, size=N)
y = 10 + np.sin(2 * np.pi * t / P)
dy = 0.5 + 0.5 * np.random.random(N)
y_obs = y + np.random.normal(dy)
omega_0 = 2 * np.pi / P
#######################################################################
# Redo the plot without the typo
# We need a larger data range to actually get significant power
# with actual noisy data
#------------------------------------------------------------
# Generate data where y is positive
np.random.seed(0)
N = 300
P = 0.3
t = P / 2 * np.random.random(N) + P * np.random.randint(1000, size=N)
y = 10 + np.sin(2 * np.pi * t / P)
dy = 0.1 + 0.1 * np.random.random(N)
y_obs = y + np.random.normal(dy)
omega_0 = 2 * np.pi / P
#------------------------------------------------------------
# Compute the Lomb-Scargle Periodogram
sig = np.array([0.1, 0.01, 0.001])
omega = np.linspace(20.5, 21.1, 1000)
P_S = lomb_scargle(t, y_obs, dy, omega, generalized=False)
P_G = lomb_scargle(t, y_obs, dy, omega, generalized=True)
#------------------------------------------------------------
# Get significance via bootstrap
D = lomb_scargle_bootstrap(t, y_obs, dy, omega, generalized=True,
N_bootstraps=1000, random_state=0)
sig1, sig5 = np.percentile(D, [99, 95])
#------------------------------------------------------------
# Plot the results
fig = plt.figure()
# First panel: input data
ax = fig.add_subplot(211)
ax.errorbar(t, y_obs, dy, fmt='.k', lw=1, ecolor='gray')
ax.plot([-20, 320], [10, 10], ':k', lw=1)
ax.set_xlim(-20, 320)
ax.set_xlabel('$t$')
ax.set_ylabel('$y(t)$')
# Second panel: periodogram
ax = fig.add_subplot(212)
ax.plot(omega, P_S, '--k', lw=1, label='standard')
ax.plot(omega, P_S, '-', c='gray', lw=1)
ax.plot(omega, P_G, '-', color='#BA4C37', lw=1, label='generalized')
ax.legend(loc=2)
# plot the significance lines.
xlim = (omega[0], omega[-1])
ax.plot(xlim, [sig1, sig1], ':', c='black')
ax.plot(xlim, [sig5, sig5], ':', c='black')
# label BIC on the right side
ax2 = ax.twinx()
ax2.set_ylim(tuple(lomb_scargle_BIC(ax.get_ylim(), y_obs, dy)))
ax2.set_ylabel(r'$\Delta BIC$')
ax.set_xlabel('$\omega$')
ax.set_ylabel(r'$P_{\rm LS}(\omega)$')
ax.set_xlim(xlim)
ax.set_ylim(0, 0.12)
def plot1017():
# Author: Jake VanderPlas
# License: BSD
# The figure produced by this code is published in the textbook
# "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
# For more information, see http://astroML.github.com
# To report a bug or issue, use the following forum:
# https://groups.google.com/forum/#!forum/astroml-general
from astroML.decorators import pickle_results
from astroML.time_series import search_frequencies, lomb_scargle, MultiTermFit
from astroML.datasets import fetch_LINEAR_sample
#------------------------------------------------------------
# Load the dataset
data = fetch_LINEAR_sample()
ids = [14752041, 1009459, 10022663, 10025796, 11375941, 18525697]
#------------------------------------------------------------
# Compute the best frequencies
@pickle_results('LINEAR_LS.pkl')
def compute_best_frequencies(ids, n_eval=10000, n_retry=5, generalized=True):
results = {}
for i in ids:
t, y, dy = data[i].T
print " - computing power for %i (%i points)" % (i, len(t))
kwargs = dict(generalized=generalized)
omega, power = search_frequencies(t, y, dy, n_eval=n_eval,
n_retry=n_retry,
LS_kwargs=kwargs)
results[i] = [omega, power]
return results
results = compute_best_frequencies(ids, n_eval=10000, n_retry=5)
#------------------------------------------------------------
# Plot the phased light-curves
fig = plt.figure()
fig.subplots_adjust(hspace=0.1, bottom=0.06, top=0.94, left=0.12, right=0.94)
for i in range(6):
# get the data and best-fit angular frequency
t, y, dy = data[ids[i]].T
omega, power = results[ids[i]]
omega_best = omega[np.argmax(power)]
print " - omega_0 = %.10g" % omega_best
# do a fit to the first 4 Fourier components
mtf = MultiTermFit(omega_best, 4)
mtf.fit(t, y, dy)
phase_fit, y_fit, phased_t = mtf.predict(1000, return_phased_times=True)
# plot the phased data and best-fit curves
ax = fig.add_subplot(321 + i)
ax.errorbar(phased_t, y, dy, fmt='.k', ecolor='gray',
lw=1, ms=4, capsize=1.5)
ax.plot(phase_fit, y_fit, '-b', lw=2)
ax.set_xlim(0, 1)
ax.set_ylim(plt.ylim()[::-1])
ax.yaxis.set_major_locator(plt.MaxNLocator(4))
ax.text(0.03, 0.04, "ID = %i" % ids[i], ha='left', va='bottom',
transform=ax.transAxes)
ax.text(0.03, 0.96, "P = %.2f hr" % (2 * np.pi / omega_best * 24.),
ha='left', va='top',
transform=ax.transAxes)
ylim = ax.get_ylim()
ax.set_ylim(ylim[0], ylim[0] + 1.1 * (ylim[1] - ylim[0]))
if i < 4:
ax.xaxis.set_major_formatter(plt.NullFormatter())
if i % 2 == 0:
ax.set_ylabel('mag')
if i in (4, 5):
ax.set_xlabel('phase')
def plot1018():
# Author: Jake VanderPlas
# License: BSD
# The figure produced by this code is published in the textbook
# "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
# For more information, see http://astroML.github.com
# To report a bug or issue, use the following forum:
# https://groups.google.com/forum/#!forum/astroml-general
from astroML.time_series import multiterm_periodogram, MultiTermFit
from astroML.datasets import fetch_LINEAR_sample
#------------------------------------------------------------
# Get data
data = fetch_LINEAR_sample()
t, y, dy = data[14752041].T
#------------------------------------------------------------
# Do a single-term and multi-term fit around the peak
omega0 = 17.217
nterms_fit = 6
# hack to get better phases: this doesn't change results,
# except for how the phase plots are displayed
t -= 0.4 * np.pi / omega0
width = 0.03
omega = np.linspace(omega0 - width - 0.01, omega0 + width - 0.01, 1000)
#------------------------------------------------------------
# Compute periodograms and best-fit solutions
# factor gives the factor that we're dividing the fundamental frequency by
factors = [1, 2]
nterms = [1, 6]
# Compute PSDs for factors & nterms
PSDs = dict()
for f in factors:
for n in nterms:
PSDs[(f, n)] = multiterm_periodogram(t, y, dy, omega / f, n)
# Compute the best-fit omega from the 6-term fit
omega_best = dict()
for f in factors:
omegaf = omega / f
PSDf = PSDs[(f, 6)]
omega_best[f] = omegaf[np.argmax(PSDf)]
# Compute the best-fit solution based on the fundamental frequency
best_fit = dict()
for f in factors:
for n in nterms:
mtf = MultiTermFit(omega_best[f], n)
mtf.fit(t, y, dy)
phase_best, y_best = mtf.predict(1000, adjust_offset=False)
best_fit[(f, n)] = (phase_best, y_best)
#------------------------------------------------------------
# Plot the results
fig = plt.figure()
fig.subplots_adjust(left=0.1, right=0.95, wspace=0.25,
bottom=0.12, top=0.95, hspace=0.2)
for i, f in enumerate(factors):
P_best = 2 * np.pi / omega_best[f]
phase_best = (t / P_best) % 1
# first column: plot the PSD
ax1 = fig.add_subplot(221 + 2 * i)
ax1.plot(omega / f, PSDs[(f, 6)], '-', c='#46959E', label='6 terms', lw=2)
ax1.plot(omega / f, PSDs[(f, 1)], '-', c='#BA4C37', label='1 term')
ax1.grid(color='gray')
ax1.legend(loc=2)
ax1.axis('tight')
ax1.set_ylim(-0.05, 1.001)
ax1.xaxis.set_major_locator(plt.MultipleLocator(0.01))
ax1.xaxis.set_major_formatter(plt.FormatStrFormatter('%.2f'))
# second column: plot the phased data & fit
ax2 = fig.add_subplot(222 + 2 * i)
ax2.errorbar(phase_best, y, dy, fmt='.k', ms=4, ecolor='gray', lw=1,
capsize=1.5)
ax2.plot(best_fit[(f, 1)][0], best_fit[(f, 1)][1], '-', c='#BA4C37')
ax2.plot(best_fit[(f, 6)][0], best_fit[(f, 6)][1], '-', c='#46959E', lw=2)
ax2.text(0.02, 0.02, (r"$\omega_0 = %.2f$" % omega_best[f] + "\n"
+ r"$P_0 = %.2f\ {\rm hours}$" % (24 * P_best)),
ha='left', va='bottom', transform=ax2.transAxes)
ax2.grid(color='gray')
ax2.set_xlim(0, 1)
ax2.set_ylim(plt.ylim()[::-1])
ax2.yaxis.set_major_locator(plt.MultipleLocator(0.4))
# label both axes
ax1.set_ylabel(r'$P_{\rm LS}(\omega)$')
ax2.set_ylabel(r'${\rm mag}$')
if i == 1:
ax1.set_xlabel(r'$\omega$')
ax2.set_xlabel(r'${\rm phase}$')
def plot1019():
# Author: Jake VanderPlas
# License: BSD
# The figure produced by this code is published in the textbook
# "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
# For more information, see http://astroML.github.com
# To report a bug or issue, use the following forum:
# https://groups.google.com/forum/#!forum/astroml-general
from astroML.time_series import multiterm_periodogram, lomb_scargle_BIC
from astroML.datasets import fetch_LINEAR_sample
#------------------------------------------------------------
# Fetch the data
data = fetch_LINEAR_sample()
t, y, dy = data[14752041].T
omega0 = 17.217
# focus only on the region with the peak
omega1 = np.linspace(17.213, 17.220, 100)
omega2 = 0.5 * omega1
#------------------------------------------------------------
# Compute the delta BIC
terms = np.arange(1, 21)
BIC_max = np.zeros((2, len(terms)))
for i, omega in enumerate([omega1, omega2]):
for j in range(len(terms)):
P = multiterm_periodogram(t, y, dy, omega, terms[j])
BIC = lomb_scargle_BIC(P, y, dy, n_harmonics=terms[j])
BIC_max[i, j] = BIC.max()
#----------------------------------------------------------------------
# Plot the results
fig = plt.figure()
ax = [fig.add_axes((0.15, 0.53, 0.8, 0.37)),
fig.add_axes((0.15, 0.1, 0.8, 0.37))]
ax_inset = [fig.add_axes((0.15 + 7 * 0.04, 0.55, 0.79 - 7 * 0.04, 0.17)),
fig.add_axes((0.15 + 7 * 0.04, 0.12, 0.79 - 7 * 0.04, 0.17))]
ylims = [(22750, 22850),
(26675, 26775)]
omega0 = [17.22, 8.61]
for i in range(2):
# Plot full panel
ax[i].plot(terms, BIC_max[i], '-k')
ax[i].set_xlim(0, 20)
ax[i].set_ylim(0, 30000)
ax[i].text(0.02, 0.95, r"$\omega_0 = %.2f$" % omega0[i],
ha='left', va='top', transform=ax[i].transAxes)
ax[i].set_ylabel(r'$\Delta BIC$')
if i == 1:
ax[i].set_xlabel('N frequencies')
ax[i].grid(color='gray')
# plot inset
ax_inset[i].plot(terms, BIC_max[i], '-k')
ax_inset[i].xaxis.set_major_locator(plt.MultipleLocator(5))
ax_inset[i].xaxis.set_major_formatter(plt.NullFormatter())
ax_inset[i].yaxis.set_major_locator(plt.MultipleLocator(25))
ax_inset[i].yaxis.set_major_formatter(plt.FormatStrFormatter('%i'))
ax_inset[i].set_xlim(7, 19.75)
ax_inset[i].set_ylim(ylims[i])
ax_inset[i].set_title('zoomed view')
ax_inset[i].grid(color='gray')
def plot1021():
# Author: Jake VanderPlas
# License: BSD
# The figure produced by this code is published in the textbook
# "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
# For more information, see http://astroML.github.com
# To report a bug or issue, use the following forum:
# https://groups.google.com/forum/#!forum/astroml-general
from sklearn.mixture import GMM
from astroML.decorators import pickle_results
from astroML.datasets import fetch_LINEAR_geneva
#------------------------------------------------------------
# Get the Geneva periods data
data = fetch_LINEAR_geneva()
#----------------------------------------------------------------------
# compute Gaussian Mixture models
filetemplate = 'gmm_res_%i_%i.pkl'
attributes = [('gi', 'logP'),
('ug', 'gi', 'iK', 'JK', 'logP', 'amp', 'skew')]
components = np.arange(1, 21)
#------------------------------------------------------------
# Create attribute arrays
Xarrays = []
for attr in attributes:
Xarrays.append(np.vstack([data[a] for a in attr]).T)
#------------------------------------------------------------
# Compute the results (and save to pickle file)
@pickle_results('LINEAR_clustering.pkl')
def compute_GMM_results(components, attributes):
clfs = []
for attr, X in zip(attributes, Xarrays):
clfs_i = []
for comp in components:
print " - %i component fit" % comp
clf = GMM(comp, covariance_type='full',
random_state=0, n_iter=500)
clf.fit(X)
clfs_i.append(clf)
if not clf.converged_:
print " NOT CONVERGED!"
clfs.append(clfs_i)
return clfs
clfs = compute_GMM_results(components, attributes)
#------------------------------------------------------------
# Plot the results
fig = plt.figure()
fig.subplots_adjust(hspace=0.1, wspace=0.1)
class_labels = []
for i in range(2):
# Grab the best classifier, based on the BIC
X = Xarrays[i]
BIC = [c.bic(X) for c in clfs[i]]
i_best = np.argmin(BIC)
print "number of components:", components[i_best]
clf = clfs[i][i_best]
n_components = clf.n_components
# Predict the cluster labels with the classifier
c = clf.predict(X)
classes = np.unique(c)
class_labels.append(c)
# sort the cluster by normalized density of points
counts = np.sum(c == classes[:, None], 1)
size = np.array([np.linalg.det(C) for C in clf.covars_])
weights = clf.weights_
density = counts / size
# Clusters with very few points are less interesting:
# set their density to zero so they'll go to the end of the list
density[counts < 5] = 0
isort = np.argsort(density)[::-1]
# find statistics of the top 10 clusters
Nclusters = 6
means = []
stdevs = []
counts = []
names = [name for name in data.dtype.names[2:] if name != 'LINEARobjectID']
labels = ['$u-g$', '$g-i$', '$i-K$', '$J-K$',
r'$\log(P)$', 'amplitude', 'skew',
'kurtosis', 'median mag', r'$N_{\rm obs}$', 'Visual Class']
assert len(names) == len(labels)
i_logP = names.index('logP')
for j in range(Nclusters):
flag = (c == isort[j])
counts.append(np.sum(flag))
means.append([np.mean(data[n][flag]) for n in names])
stdevs.append([data[n][flag].std() for n in names])
counts = np.array(counts)
means = np.array(means)
stdevs = np.array(stdevs)
# define colors based on median of logP
j_ordered = np.argsort(-means[:, i_logP])
# tweak colors by hand
if i == 1:
j_ordered[3], j_ordered[2] = j_ordered[2], j_ordered[3]
color = np.zeros(c.shape)
for j in range(Nclusters):
flag = (c == isort[j_ordered[j]])
color[flag] = j + 1
# separate into foureground and background
back = (color == 0)
fore = ~back
# Plot the resulting clusters
ax1 = fig.add_subplot(221 + 2 * i)
ax1.scatter(data['gi'][back], data['logP'][back],
c='gray', edgecolors='none', s=4, linewidths=0)
ax1.scatter(data['gi'][fore], data['logP'][fore],
c=color[fore], edgecolors='none', s=4, linewidths=0)
ax1.set_ylabel(r'$\log(P)$')
ax2 = plt.subplot(222 + 2 * i)
ax2.scatter(data['amp'][back], data['logP'][back],
c='gray', edgecolors='none', s=4, linewidths=0)
ax2.scatter(data['amp'][fore], data['logP'][fore],
c=color[fore], edgecolors='none', s=4, linewidths=0)
#------------------------------
# set axis limits
ax1.set_xlim(-0.6, 2.1)
ax2.set_xlim(0.1, 1.5)
ax1.set_ylim(-1.5, 0.5)
ax2.set_ylim(-1.5, 0.5)
ax2.yaxis.set_major_formatter(plt.NullFormatter())
if i == 0:
ax1.xaxis.set_major_formatter(plt.NullFormatter())
ax2.xaxis.set_major_formatter(plt.NullFormatter())
else:
ax1.set_xlabel(r'$g-i$')
ax2.set_xlabel(r'$A$')
#------------------------------
# print table of means and medians directly to LaTeX format
#print r"\begin{tabular}{|l|lllllll|}"
#print r" \hline"
#for j in range(7):
# print ' &', labels[j],
#print r"\\"
#print r" \hline"
#for j in range(Nclusters):
# print " %i " % (j + 1),
# for k in range(7):
# print " & $%.2f \pm %.2f$ " % (means[j, k], stdevs[j, k]),
# print r"\\"
#print r"\hline"
#print r"\end{tabular}"
#------------------------------------------------------------
# Second figure
fig = plt.figure()
fig.subplots_adjust(left=0.11, right=0.95, wspace=0.3)
attrs = ['skew', 'ug', 'iK', 'JK']
labels = ['skew', '$u-g$', '$i-K$', '$J-K$']
ylims = [(-1.8, 2.2), (0.6, 2.9), (0.1, 2.6), (-0.2, 1.2)]
for i in range(4):
ax = fig.add_subplot(221 + i)
ax.scatter(data['gi'][back], data[attrs[i]][back],
c='gray', edgecolors='none', s=4, linewidths=0)
ax.scatter(data['gi'][fore], data[attrs[i]][fore],
c=color[fore], edgecolors='none', s=4, linewidths=0)
ax.set_xlabel('$g-i$')
ax.set_ylabel(labels[i])
ax.set_xlim(-0.6, 2.1)
ax.set_ylim(ylims[i])
#------------------------------------------------------------
# Save the results
#
# run the script as
#
# >$ python fig_LINEAR_clustering.py --save
#
# to output the data file showing the cluster labels of each point
import sys
if len(sys.argv) > 1 and sys.argv[1] == '--save':
filename = 'cluster_labels.dat'
print "Saving cluster labels to %s" % filename
from astroML.datasets.LINEAR_sample import ARCHIVE_DTYPE
new_data = np.zeros(len(data),
dtype=(ARCHIVE_DTYPE + [('2D_cluster_ID', 'i4'),
('7D_cluster_ID', 'i4')]))
for name in data.dtype.names:
new_data[name] = data[name]
new_data['2D_cluster_ID'] = class_labels[0]
new_data['7D_cluster_ID'] = class_labels[1]
fmt = ('%.6f %.6f %.3f %.3f %.3f %.3f %.7f %.3f %.3f '
'%.3f %.2f %i %i %s %i %i\n')
F = open(filename, 'w')
F.write('# ra dec ug gi iK JK '
'logP Ampl skew kurt magMed nObs LCtype '
'LINEARobjectID 2D_cluster_ID 7D_cluster_ID\n')
for line in new_data:
F.write(fmt % tuple(line[col] for col in line.dtype.names))
F.close()
def plot1025():
# Author: Jake VanderPlas
# License: BSD
# The figure produced by this code is published in the textbook
# "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
# For more information, see http://astroML.github.com
# To report a bug or issue, use the following forum:
# https://groups.google.com/forum/#!forum/astroml-general
# Hack to fix import issue in older versions of pymc
import scipy
import scipy.misc
scipy.derivative = scipy.misc.derivative
import pymc
from astroML.plotting.mcmc import plot_mcmc
from astroML.decorators import pickle_results
#----------------------------------------------------------------------
# Set up toy dataset
def burst(t, b0, A, alpha, T):
"""Burst model"""
y = np.empty(t.shape)
y.fill(b0)
mask = (t >= T)
y[mask] += A * np.exp(-alpha * (t[mask] - T))
return y
np.random.seed(0)
N = 100
b0_true = 10
A_true = 5
alpha_true = 0.1
T_true = 50
sigma = 1.0
t = 100 * np.random.random(N)
y_true = burst(t, b0_true, A_true, alpha_true, T_true)
y_obs = np.random.normal(y_true, sigma)
#----------------------------------------------------------------------
# Set up MCMC sampling
b0 = pymc.Uniform('b0', 0, 50, value=50 * np.random.random())
A = pymc.Uniform('A', 0, 50, value=50 * np.random.random())
T = pymc.Uniform('T', 0, 100, value=100 * np.random.random())
log_alpha = pymc.Uniform('log_alpha', -10, 10, value=0)
# uniform prior on log(alpha)
@pymc.deterministic
def alpha(log_alpha=log_alpha):
return np.exp(log_alpha)
@pymc.deterministic
def y_model(t=t, b0=b0, A=A, alpha=alpha, T=T):
return burst(t, b0, A, alpha, T)
y = pymc.Normal('y', mu=y_model, tau=sigma ** -2, observed=True, value=y_obs)
model = dict(b0=b0, A=A, T=T, log_alpha=log_alpha,
alpha=alpha, y_model=y_model, y=y)
#----------------------------------------------------------------------
# Run the MCMC sampling
@pickle_results('matchedfilt_burst.pkl')
def compute_MCMC_results(niter=25000, burn=4000):
S = pymc.MCMC(model)
S.sample(iter=niter, burn=burn)
traces = [S.trace(s)[:] for s in ['b0', 'A', 'T', 'alpha']]
M = pymc.MAP(model)
M.fit()
fit_vals = (M.b0.value, M.A.value, M.alpha.value, M.T.value)
return traces, fit_vals
traces, fit_vals = compute_MCMC_results()
labels = ['$b_0$', '$A$', '$T$', r'$\alpha$']
limits = [(9.2, 11.2), (2, 12), (45, 55), (0.0, 0.25)]
true = [b0_true, A_true, T_true, alpha_true]
#------------------------------------------------------------
# Plot the results
fig = plt.figure()
fig.subplots_adjust(bottom=0.1, top=0.95,
left=0.1, right=0.95,
hspace=0.05, wspace=0.05)
# This function plots multiple panels with the traces
plot_mcmc(traces, labels=labels, limits=limits, true_values=true, fig=fig,
bins=30, colors='k')
# Plot the model fit
ax = fig.add_axes([0.5, 0.7, 0.45, 0.25])
t_fit = np.linspace(0, 100, 101)
y_fit = burst(t_fit, *fit_vals)
ax.scatter(t, y_obs, s=9, c='k')
ax.plot(t_fit, y_fit, '-k')
ax.set_xlim(0, 100)
ax.set_xlabel('$t$')
ax.set_ylabel(r'$h_{\rm obs}$')
def plot1026():
# Author: Jake VanderPlas
# License: BSD
# The figure produced by this code is published in the textbook
# "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
# For more information, see http://astroML.github.com
# To report a bug or issue, use the following forum:
# https://groups.google.com/forum/#!forum/astroml-general
# Hack to fix import issue in older versions of pymc
import scipy
import scipy.misc
scipy.derivative = scipy.misc.derivative
import pymc
from astroML.plotting.mcmc import plot_mcmc
from astroML.decorators import pickle_results
#----------------------------------------------------------------------
# Set up toy dataset
def chirp(t, b0, beta, A, omega):
return b0 + A * np.sin(omega * t + beta * t * t)
np.random.seed(0)
N = 100
b0_true = 10
A_true = 5
beta_true = 0.01
omega_true = 0.1
sigma = 2.0
t = 100 * np.random.random(N)
y_true = chirp(t, b0_true, beta_true, A_true, omega_true)
y_obs = np.random.normal(y_true, sigma)
t_fit = np.linspace(0, 100, 1000)
y_fit = chirp(t_fit, b0_true, beta_true, A_true, omega_true)
i = np.argsort(t)
#----------------------------------------------------------------------
# Set up MCMC sampling
b0 = pymc.Uniform('b0', 0, 50, value=50 * np.random.random())
A = pymc.Uniform('A', 0, 50, value=50 * np.random.random())
log_beta = pymc.Uniform('log_beta', -10, 10, value=-4.6)
log_omega = pymc.Uniform('log_omega', -10, 10, value=-2.3)
# uniform prior on log(beta)
@pymc.deterministic
def beta(log_beta=log_beta):
return np.exp(log_beta)
# uniform prior on log(omega)
@pymc.deterministic
def omega(log_omega=log_omega):
return np.exp(log_omega)
@pymc.deterministic
def y_model(t=t, b0=b0, A=A, beta=beta, omega=omega):
return chirp(t, b0, beta, A, omega)
y = pymc.Normal('y', mu=y_model, tau=sigma ** -2, observed=True, value=y_obs)
model = dict(b0=b0, A=A,
log_beta=log_beta, beta=beta,
log_omega=log_omega, omega=omega,
y_model=y_model, y=y)
#----------------------------------------------------------------------
# Run the MCMC sampling (saving results to a pickle)
@pickle_results('matchedfilt_chirp.pkl')
def compute_MCMC_results(niter=20000, burn=2000):
S = pymc.MCMC(model)
S.sample(iter=niter, burn=burn)
traces = [S.trace(s)[:] for s in ['b0', 'A', 'omega', 'beta']]
M = pymc.MAP(model)
M.fit()
fit_vals = (M.b0.value, M.beta.value, M.A.value, M.omega.value)
return traces, fit_vals
traces, fit_vals = compute_MCMC_results()
labels = ['$b_0$', '$A$', r'$\omega$', r'$\beta$']
limits = [(9.5, 11.3), (3.6, 6.4), (0.065, 0.115), (0.00975, 0.01045)]
true = [b0_true, A_true, omega_true, beta_true]
#----------------------------------------------------------------------
# Find the Maximum a posteriori values
fig = plt.figure()
ax = plt.axes([0.5, 0.7, 0.45, 0.25])
t_fit = np.linspace(0, 100, 1001)
y_fit = chirp(t_fit, *fit_vals)
plt.scatter(t, y_obs, s=9, c='k')
plt.plot(t_fit, y_fit, '-k')
plt.xlim(0, 100)
plt.xlabel('$t$')
plt.ylabel(r'$h_{\rm obs}$')
# This function plots multiple panels with the traces
plot_mcmc(traces, labels=labels, limits=limits, true_values=true, fig=fig,
bins=30, bounds=[0.12, 0.08, 0.95, 0.91], colors='k')
def plot1027():
# Author: Jake VanderPlas
# License: BSD
# The figure produced by this code is published in the textbook
# "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
# For more information, see http://astroML.github.com
# To report a bug or issue, use the following forum:
# https://groups.google.com/forum/#!forum/astroml-general
# Hack to fix import issue in older versions of pymc
import scipy
import scipy.misc
scipy.derivative = scipy.misc.derivative
import pymc
from astroML.decorators import pickle_results
from astroML.plotting.mcmc import plot_mcmc
#----------------------------------------------------------------------
# Set up toy dataset
def chirp(t, T, A, phi, omega, beta):
"""chirp signal"""
signal = A * np.sin(phi + omega * (t - T) + beta * (t - T) ** 2)
signal[t < T] = 0
return signal
def background(t, b0, b1, Omega1, Omega2):
"""background signal"""
return b0 + b1 * np.sin(Omega1 * t) * np.sin(Omega2 * t)
np.random.seed(0)
N = 500
T_true = 30
A_true = 0.8
phi_true = np.pi / 2
omega_true = 0.1
beta_true = 0.02
b0_true = 0.5
b1_true = 1.0
Omega1_true = 0.3
Omega2_true = 0.4
sigma = 0.1
t = 100 * np.random.random(N)
signal = chirp(t, T_true, A_true, phi_true, omega_true, beta_true)
bg = background(t, b0_true, b1_true, Omega1_true, Omega2_true)
y_true = signal + bg
y_obs = np.random.normal(y_true, sigma)
t_fit = np.linspace(0, 100, 1000)
y_fit = (chirp(t_fit, T_true, A_true, phi_true, omega_true, beta_true) +
background(t_fit, b0_true, b1_true, Omega1_true, Omega2_true))
#----------------------------------------------------------------------
# Set up MCMC sampling
T = pymc.Uniform('T', 0, 100, value=T_true)
A = pymc.Uniform('A', 0, 100, value=A_true)
phi = pymc.Uniform('phi', -np.pi, np.pi, value=phi_true)
log_omega = pymc.Uniform('log_omega', -4, 0, value=np.log(omega_true))
log_beta = pymc.Uniform('log_beta', -6, 0, value=np.log(beta_true))
b0 = pymc.Uniform('b0', 0, 100, value=b0_true)
b1 = pymc.Uniform('b1', 0, 100, value=b1_true)
log_Omega1 = pymc.Uniform('log_Omega1', -3, 0, value=np.log(Omega1_true))
log_Omega2 = pymc.Uniform('log_Omega2', -3, 0, value=np.log(Omega2_true))
omega = pymc.Uniform('omega', 0.001, 1, value=omega_true)
beta = pymc.Uniform('beta', 0.001, 1, value=beta_true)
# uniform prior on log(Omega1)
@pymc.deterministic
def Omega1(log_Omega1=log_Omega1):
return np.exp(log_Omega1)
# uniform prior on log(Omega2)
@pymc.deterministic
def Omega2(log_Omega2=log_Omega2):
return np.exp(log_Omega2)
@pymc.deterministic
def y_model(t=t, T=T, A=A, phi=phi, omega=omega, beta=beta,
b0=b0, b1=b1, Omega1=Omega1, Omega2=Omega2):
return (chirp(t, T, A, phi, omega, beta)
+ background(t, b0, b1, Omega1, Omega2))
y = pymc.Normal('y', mu=y_model, tau=sigma ** -2, observed=True, value=y_obs)
model = dict(T=T, A=A, phi=phi, b0=b0, b1=b1,
log_omega=log_omega, omega=omega,
log_beta=log_beta, beta=beta,
log_Omega1=log_Omega1, Omega1=Omega1,
log_Omega2=log_Omega2, Omega2=Omega2,
y_model=y_model, y=y)
#----------------------------------------------------------------------
# Run the MCMC sampling (saving the results to a pickle file)
@pickle_results('matchedfilt_chirp2.pkl')
def compute_MCMC(niter=30000, burn=2000):
S = pymc.MCMC(model)
S.sample(iter=30000, burn=2000)
traces = [S.trace(s)[:] for s in ['T', 'A', 'omega', 'beta']]
return traces
traces = compute_MCMC()
labels = ['$T$', '$A$', r'$\omega$', r'$\beta$']
limits = [(29.75, 30.25), (0.75, 0.83), (0.085, 0.115), (0.0197, 0.0202)]
true = [T_true, A_true, omega_true, beta_true]
#------------------------------------------------------------
# Plot results
fig = plt.figure()
# This function plots multiple panels with the traces
axes_list = plot_mcmc(traces, labels=labels, limits=limits,
true_values=true, fig=fig,
bins=30, colors='k',
bounds=[0.14, 0.08, 0.95, 0.95])
for ax in axes_list:
for axis in [ax.xaxis, ax.yaxis]:
axis.set_major_locator(plt.MaxNLocator(5))
ax = fig.add_axes([0.52, 0.7, 0.43, 0.25])
ax.scatter(t, y_obs, s=9, c='k')
ax.plot(t_fit, y_fit, '-k')
ax.set_xlim(0, 100)
ax.set_xlabel('$t$')
ax.set_ylabel(r'$h_{\rm obs}$')
def plot1028():
# Author: Jake VanderPlas
# License: BSD
# The figure produced by this code is published in the textbook
# "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
# For more information, see http://astroML.github.com
# To report a bug or issue, use the following forum:
# https://groups.google.com/forum/#!forum/astroml-general
from astroML.fourier import FT_continuous, IFT_continuous, wavelet_PSD
#------------------------------------------------------------
# Define the chirp signal
def chirp(t, T, A, phi, omega, beta):
signal = A * np.sin(phi + omega * (t - T) + beta * (t - T) ** 2)
signal[t < T] = 0
return signal
def background(t, b0, b1, Omega1, Omega2):
return b0 + b1 * np.sin(Omega1 * t) * np.sin(Omega2 * t)
N = 4096
t = np.linspace(-50, 50, N)
h_true = chirp(t, -20, 0.8, 0, 0.2, 0.02)
h = h_true + np.random.normal(0, 1, N)
#------------------------------------------------------------
# Compute the wavelet PSD
f0 = np.linspace(0.04, 0.6, 100)
wPSD = wavelet_PSD(t, h, f0, Q=1.0)
#------------------------------------------------------------
# Plot the results
fig = plt.figure()
fig.subplots_adjust(hspace=0.05, left=0.1, right=0.95, bottom=0.1, top=0.95)
# Top: plot the data
ax = fig.add_subplot(211)
ax.plot(t + 50, h, '-', c='#AAAAAA')
ax.plot(t + 50, h_true, '-k')
ax.text(0.02, 0.95, "Input Signal: chirp",
ha='left', va='top', transform=ax.transAxes,
bbox=dict(boxstyle='round', fc='w', ec='k'))
ax.set_xlim(0, 100)
ax.set_ylim(-2.9, 2.9)
ax.xaxis.set_major_formatter(plt.NullFormatter())
ax.set_ylabel('$h(t)$')
# Bottom: plot the 2D PSD
ax = fig.add_subplot(212)
ax.imshow(wPSD, origin='lower', aspect='auto',
extent=[t[0] + 50, t[-1] + 50, f0[0], f0[-1]],
cmap=plt.cm.binary)
ax.text(0.02, 0.95, ("Wavelet PSD"), color='w',
ha='left', va='top', transform=ax.transAxes)
ax.set_xlim(0, 100)
ax.set_ylim(0.04, 0.6001)
ax.set_xlabel('$t$')
ax.set_ylabel('$f_0$')
def plot1029():
# Author: Jake VanderPlas
# License: BSD
# The figure produced by this code is published in the textbook
# "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
# For more information, see http://astroML.github.com
# To report a bug or issue, use the following forum:
# https://groups.google.com/forum/#!forum/astroml-general
from astroML.time_series import generate_power_law
from astroML.fourier import PSD_continuous
N = 1024
dt = 0.01
factor = 100
t = dt * np.arange(N)
random_state = np.random.RandomState(1)
fig = plt.figure()
fig.subplots_adjust(wspace=0.05)
for i, beta in enumerate([1.0, 2.0]):
# Generate the light curve and compute the PSD
x = factor * generate_power_law(N, dt, beta, random_state=random_state)
f, PSD = PSD_continuous(t, x)
# First axes: plot the time series
ax1 = fig.add_subplot(221 + i)
ax1.plot(t, x, '-k')
ax1.text(0.95, 0.05, r"$P(f) \propto f^{-%i}$" % beta,
ha='right', va='bottom', transform=ax1.transAxes)
ax1.set_xlim(0, 10.24)
ax1.set_ylim(-1.5, 1.5)
ax1.set_xlabel(r'$t$')
# Second axes: plot the PSD
ax2 = fig.add_subplot(223 + i, xscale='log', yscale='log')
ax2.plot(f, PSD, '-k')
ax2.plot(f[1:], (factor * dt) ** 2 * (2 * np.pi * f[1:]) ** -beta, '--k')
ax2.set_xlim(1E-1, 60)
ax2.set_ylim(1E-6, 1E1)
ax2.set_xlabel(r'$f$')
if i == 1:
ax1.yaxis.set_major_formatter(plt.NullFormatter())
ax2.yaxis.set_major_formatter(plt.NullFormatter())
else:
ax1.set_ylabel(r'${\rm counts}$')
ax2.set_ylabel(r'$PSD(f)$')
def plot1030():
# Author: Jake VanderPlas
# License: BSD
# The figure produced by this code is published in the textbook
# "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
# For more information, see http://astroML.github.com
# To report a bug or issue, use the following forum:
# https://groups.google.com/forum/#!forum/astroml-general
from astroML.time_series import lomb_scargle, generate_damped_RW
from astroML.time_series import ACF_scargle, ACF_EK
#------------------------------------------------------------
# Generate time-series data:
# we'll do 1000 days worth of magnitudes
t = np.arange(0, 1E3)
z = 2.0
tau = 300
tau_obs = tau / (1. + z)
np.random.seed(6)
y = generate_damped_RW(t, tau=tau, z=z, xmean=20)
# randomly sample 100 of these
ind = np.arange(len(t))
np.random.shuffle(ind)
ind = ind[:100]
ind.sort()
t = t[ind]
y = y[ind]
# add errors
dy = 0.1
y_obs = np.random.normal(y, dy)
#------------------------------------------------------------
# compute ACF via scargle method
C_S, t_S = ACF_scargle(t, y_obs, dy,
n_omega=2 ** 12, omega_max=np.pi / 5.0)
ind = (t_S >= 0) & (t_S <= 500)
t_S = t_S[ind]
C_S = C_S[ind]
#------------------------------------------------------------
# compute ACF via E-K method
C_EK, C_EK_err, bins = ACF_EK(t, y_obs, dy, bins=np.linspace(0, 500, 51))
t_EK = 0.5 * (bins[1:] + bins[:-1])
#------------------------------------------------------------
# Plot the results
fig = plt.figure()
# plot the input data
ax = fig.add_subplot(211)
ax.errorbar(t, y_obs, dy, fmt='.k', lw=1)
ax.set_xlabel('t (days)')
ax.set_ylabel('observed flux')
# plot the ACF
ax = fig.add_subplot(212)
ax.plot(t_S, C_S, '-', c='gray', lw=1,
label='Scargle')
ax.errorbar(t_EK, C_EK, C_EK_err, fmt='.k', lw=1,
label='Edelson-Krolik')
ax.plot(t_S, np.exp(-abs(t_S) / tau_obs), '-k', label='True')
ax.legend(loc=3)
ax.plot(t_S, 0 * t_S, ':', lw=1, c='gray')
ax.set_xlim(0, 500)
ax.set_ylim(-1.0, 1.1)
ax.set_xlabel('t (days)')
ax.set_ylabel('ACF(t)')