Notebook

In [85]:

%matplotlib inline
%load_ext autoreload
%autoreload 2

import numpy as np
from matplotlib import pyplot as plt

plt.rcdefaults
myfigsize = (12.0, 8.0)
myfontsize = 14.
plt.rcParams['figure.figsize'] = myfigsize

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload

In [86]:

#import datasets for figures:
from astroML.datasets import fetch_rrlyrae_templates
from astroML.plotting import setup_text_plots
#setup_text_plots(fontsize=myfontsize, usetex=True)
#plt.rc('text', usetex=False)

Chapter 10: Time Series Analysis¶

10.1 Main Concepts for Time Series Analysis¶

Is My Time Series Just Noise?¶

Are my data consistent with the null hypothesis described by a model consisting of a constant signal plus measurement noise?

Frequentist Solution: Use hypothesis testing! What is the probability that we would obtain our data by chance if the null hypothesis of no variability were correct?

Compute $\chi^2$ and the corresponding $p$ values.

Consider a sin wave:

$$y(t) = A\sin(\omega t)$$

sampled by $N \sim 100$ data points with homoscedastic errors with standard deviation $\sigma$.

The variance is:

$$V = \sigma^2 + A^2/2$$

By a simple $\chi^2_{dof}$ analysis, assuming no model the detectable amplitude decreases very slowly as a function of the number of observations. For example, for $N = 100$ data points if we want a 3 $\sigma$ detection, the minimum detectable amplitude is $A = 0.92\sigma$, and for $N = 1000$ data points it only improves to $A = 0.52 \sigma$ (where $\sigma = \sqrt{2/N}$ assuming homoscedastic gaussian errors.

In [3]:

from IPython.html.widgets import interact
from IPython.html import widgets

In [4]:

def makeSine(a, sigma, numSamps):
    fig, ax = plt.subplots(1,1)
    x = np.linspace(0, 2 * np.pi, numSamps)
    y = a * np.sin(x)
    y += np.random.normal(scale=sigma, size=numSamps)
    ax.plot(x, y, 'o', color='#75BEBB')
    ax.errorbar(x, y, sigma, capsize=0, fmt='None', ecolor='#75BEBB')
    ax.set_xlim([-0.1, 2 * np.pi + 0.1])
    ax.set_xlabel(r'$\omega t$')
    ax.set_ylabel(r'$\sin{(\omega t)} + \sigma$')
    ax.text(0.1, 0.15, 'N: {}'.format(numSamps), transform=ax.transAxes)
    chinusq = 1/(numSamps * sigma**2) * np.sum(y**2)
    ax.text(0.1, 0.1, r'$\mathbf{\chi_{\nu}^{2}}$'+': {:.2f}'.format(chinusq), transform=ax.transAxes)
    sig = np.sqrt(2. / numSamps)
    ax.text(0.1, 0.05, '{:.2f}'.format((chinusq - 1.)/sig) + r' $\mathbf{\sigma}$', transform=ax.transAxes)
    ax.grid(which='major')

sigma = 0.75
a = 0.69
numSamps = 1e2
#slider_ax = plt.axes([0.1, 0.1, 0.8, 0.02])
#slider = Slider(slider_ax, "Offset", -5, 5, valinit=0, color='#AAAAAA')
#slider.on_changed(makeSine(a, sigma, numSamps))
interact(makeSine, 
         a=widgets.FloatSliderWidget(min=0.,max=1.,step=0.05,value=a),
         sigma=widgets.FloatSliderWidget(min=0.,max=1.,step=0.05,value=sigma),
         numSamps=widgets.IntSliderWidget(min=0,max=1000,step=10,value=numSamps))

However, we'll see that our ability to discern a signal greatly improves if we specify a model.

10.2 Modeling Toolkit for Time Series Analysis¶

10.2.1. Parameter Estimation, Model Selection, and Classification for Time Series Data¶

Detection of a signal is essentially a hypothesis testing or model selection problem.

The quantitative description of a signal belongs to parameter estimation and regression problems.

In general, time series model fitting of $N$ observations $(t_j, y_j), j = 1, ..., N$ is of the form:

$$y_j(t_j) = \sum_{m=1}^{N}\beta_m T_m(t_j | \mathbf{\theta}_m) + \epsilon_j$$

where $T_m(t|\mathbf{\theta}_m)$ need not be periodic.

Common models include, sinusoidal, exponential, and "chirp" signals:

$$T(t) = \sin{(\omega t)}$$$$T(t) = \exp{(-\alpha t)}$$$$T(t) = \sin{(\phi + \omega t + \alpha t^{2})}$$

10.2.2. Fourier Analysis¶

In Fourier analysis, general functions can be expressed by integrals or sums of simpler trigonometric functions. Below is an example of an RR Lyra light curve and its approximation by a sum of sinusoids.

In [24]:

plot1001()

So what's going on?¶

The Fourier Transform can be defined as $$ H(f) = \int_{-\infty}^{\infty} h(t) \exp{(-i 2 \pi f t)} dt$$

With its inverse being $$h(t) = \int_{-\infty}^{\infty} H(f) \exp{(i 2 \pi f t)} df $$

and Euler's identity can be helpful:

$$e^{-i \omega t} = \cos{\omega t} - i \sin{\omega t}$$

10.2.3 Discrete Fourier Transform¶

But in practice, the function of interest is most likely thrown at a computer, which performs discrete fourier analysis using a fast fourier transform (fft). This is defined as

$$y[k] = \sum_{n=0}^{N-1} \exp{(-2 \pi j \frac{kn}{N})}x[n]$$

and the inverse transform is defined as

$$x[n] = \frac{1}{N}\sum_{n=0}^{N-1} \exp{(2 \pi j \frac{kn}{N})} y[k]$$

Scipy has a library called fftpack that contains several convenient methods, including one for taking the fast fourier transform (fft), and one for taking its inverse (ifft)

In [25]:

from scipy import fftpack

To recreate the figure above and test out fftpack, lets first import some RR Lyrae templates using the astroML method fetch_rrlyrae_templates()

In [26]:

templates = fetch_rrlyrae_templates()

In [27]:

type(templates)

Out[27]:

dict

In [28]:

templates.keys()

Out[28]:

['109u',
 '109r',
 '115z',
 '109z',
 '115r',
 '109g',
 '115i',
 '115g',
 '109i',
 '106z',
 '106r',
 '106u',
 '106i',
 '106g',
 '114r',
 '114z',
 '114g',
 '114i',
 '107z',
 '107u',
 '107r',
 '107i',
 '107g',
 '113z',
 '113r',
 '108u',
 '113i',
 '113g',
 '100i',
 '100g',
 '100z',
 '100u',
 '100r',
 '112z',
 '108r',
 '112r',
 '108z',
 '108g',
 '112i',
 '108i',
 '112g',
 '0r',
 '101i',
 '0u',
 '0z',
 '101g',
 '0g',
 '101z',
 '101u',
 '0i',
 '101r',
 '102z',
 '102u',
 '110g',
 '1r',
 '102r',
 '118r',
 '110i',
 '102g',
 '110u',
 '110r',
 '1g',
 '118i',
 '1i',
 '118g',
 '102i',
 '103i',
 '110z',
 '103g',
 '103z',
 '103r',
 '103u',
 '117r',
 '117z',
 '117g',
 '117i',
 '104r',
 '104u',
 '104z',
 '104g',
 '104i',
 '116r',
 '116z',
 '116g',
 '120g',
 '116i',
 '105r',
 '105u',
 '111g',
 '111i',
 '105z',
 '119r',
 '119i',
 '111r',
 '105g',
 '105i',
 '111z',
 '119g']

In [29]:

x, y = templates['115r'].T
x109u, y109u = templates['109u'].T

In [30]:

plt.plot(x, y, 'o')
plt.plot(x109u, y109u, 'o')
plt.ylim([1,0])

Out[30]:

(1, 0)

Now lets take the fast fourier transform of one of those templates

In [31]:

help(fftpack.fft)

Help on function fft in module scipy.fftpack.basic:

fft(x, n=None, axis=-1, overwrite_x=False)
    Return discrete Fourier transform of real or complex sequence.
    
    The returned complex array contains ``y(0), y(1),..., y(n-1)`` where
    
    ``y(j) = (x * exp(-2*pi*sqrt(-1)*j*np.arange(n)/n)).sum()``.
    
    Parameters
    ----------
    x : array_like
        Array to Fourier transform.
    n : int, optional
        Length of the Fourier transform.  If ``n < x.shape[axis]``, `x` is
        truncated.  If ``n > x.shape[axis]``, `x` is zero-padded. The
        default results in ``n = x.shape[axis]``.
    axis : int, optional
        Axis along which the fft's are computed; the default is over the
        last axis (i.e., ``axis=-1``).
    overwrite_x : bool, optional
        If True, the contents of `x` can be destroyed; the default is False.
    
    Returns
    -------
    z : complex ndarray
        with the elements::
    
            [y(0),y(1),..,y(n/2),y(1-n/2),...,y(-1)]        if n is even
            [y(0),y(1),..,y((n-1)/2),y(-(n-1)/2),...,y(-1)]  if n is odd
    
        where::
    
            y(j) = sum[k=0..n-1] x[k] * exp(-sqrt(-1)*j*k* 2*pi/n), j = 0..n-1
    
        Note that ``y(-j) = y(n-j).conjugate()``.
    
    See Also
    --------
    ifft : Inverse FFT
    rfft : FFT of a real sequence
    
    Notes
    -----
    The packing of the result is "standard": If ``A = fft(a, n)``, then
    ``A[0]`` contains the zero-frequency term, ``A[1:n/2]`` contains the
    positive-frequency terms, and ``A[n/2:]`` contains the negative-frequency
    terms, in order of decreasingly negative frequency. So for an 8-point
    transform, the frequencies of the result are [0, 1, 2, 3, -4, -3, -2, -1].
    To rearrange the fft output so that the zero-frequency component is
    centered, like [-4, -3, -2, -1,  0,  1,  2,  3], use `fftshift`.
    
    For `n` even, ``A[n/2]`` contains the sum of the positive and
    negative-frequency terms.  For `n` even and `x` real, ``A[n/2]`` will
    always be real.
    
    This function is most efficient when `n` is a power of two, and least
    efficient when `n` is prime.
    
    If the data type of `x` is real, a "real FFT" algorithm is automatically
    used, which roughly halves the computation time.  To increase efficiency
    a little further, use `rfft`, which does the same calculation, but only
    outputs half of the symmetrical spectrum.  If the data is both real and
    symmetrical, the `dct` can again double the efficiency, by generating
    half of the spectrum from half of the signal.
    
    Examples
    --------
    >>> from scipy.fftpack import fft, ifft
    >>> x = np.arange(5)
    >>> np.allclose(fft(ifft(x)), x, atol=1e-15)  # within numerical accuracy.
    True

In [32]:

y_fft = fftpack.fft(y)

In [33]:

print(y.shape)
print(type(y_fft))
print(y_fft.shape)

(500,)
<type 'numpy.ndarray'>
(500,)

In [34]:

plt.plot(y_fft.real, 'o-')
#plt.yscale('log')
plt.xlim([-5, 501])

Out[34]:

(-5, 501)

What is this telling us? Lets take a look at the help file to see what fftpack.fft returned:

In [35]:

plt.plot(fftpack.fftshift(np.abs(y_fft)), '-o')
#plt.yscale('log')
plt.xlim([-5, 501])

Out[35]:

(-5, 501)

Now what if we take just the first few modes of the Fourier transform and try recreating the original signal:

In [36]:

#set the number of modes you'd like to see
k = 6
#now zero out the rest:
y_fft[k + 1:-k] = 0

In [37]:

y_fit = fftpack.ifft(y_fft).real

In [38]:

plt.plot(np.concatenate([x, 1+x]), np.concatenate([y, y]), '--', label='Template (Data)')
plt.plot(np.concatenate([x, 1+x]), np.concatenate([y_fit, y_fit]), label='Model')
plt.legend(loc='best')
plt.xlabel('Phase')
plt.ylabel('Amplitude')
plt.ylim([1.1, -0.1])

Out[38]:

(1.1, -0.1)

A good opportunity to play more with the interactive features of IPython Notebooks

In [39]:

def fftApprox(nmodes):
    #zero out other modes:
    y_fftnew = fftpack.fft(y)
    y_fftnew[nmodes + 1:-nmodes] = 0
    y_fit = np.abs(fftpack.ifft(y_fftnew))
    
    #now plot it up. 
    fig = plt.figure()
    axTop = fig.add_axes([0.05, 0.3, .92, 0.67])
    axTop.plot(np.concatenate([x, 1+x]), np.concatenate([y, y]), '--', label='Template (Data)')
    axTop.plot(np.concatenate([x, 1+x]), np.concatenate([y_fit, y_fit]), label='Model')
    axTop.legend(loc='best')
    axTop.set_ylabel('Amplitude')
    axTop.set_ylim([1.1, -0.1])
    axTop.set_xlim([0, 2])
    
    axBot = fig.add_axes([0.05, 0.05, 0.92, 0.25])
    axBot.plot(np.concatenate([x, 1+x]), np.concatenate([y - y_fit, y - y_fit]))
    axBot.set_xlim([0, 2])
    axBot.set_ylabel('Temp - Mod')
    axBot.set_xlabel('Phase')
    axBot.text(0.025, 0.05, '$\sigma$: '+'{:.4f}'.format(np.std(y - y_fit)), transform=axBot.transAxes)

In [40]:

interact(fftApprox,
         nmodes=widgets.IntSliderWidget(min=1,max=50,step=1,value=6))

What are some common Fourier transforms?¶

Gaussian¶

There are some special properties of the Fourier Transform that are worth mentioning. One is that if $h(t)$ is real and even (i.e. $h(t) = h(-t)$), $H(f)$ will be real and even as well. One example of this is the Gaussian:

$$h(t) = \mathcal{N}(0, \mu) = \exp{(-t^{2}/\sigma^{2})}$$$$H(f) = \exp{(-2 \pi^{2} \sigma^{2} f^{2} )}$$

In [41]:

mu = 0.
sigma=1.
def gaussy(mu, sigma, xarr):
    return np.exp(-(xarr - mu)**2/(2 * sigma**2))

def plotGaussianTransform(mu, sigma):
    xarr = np.linspace(-40, 40, 1e3)
    gy = gaussy(mu, sigma, xarr)
    ngy = gy / np.max(gy)
    plt.plot(xarr, ngy, '-', label=r'h(t)')
    ffty = fftpack.fftshift(np.abs(fftpack.fft(gaussy(mu, sigma, xarr))))
    nffty = ffty/np.max(ffty)
    plt.plot(xarr, nffty, '-', label='H(f)')
    plt.legend(loc='best')

interact(plotGaussianTransform,
         mu = widgets.FloatSliderWidget(min=-2,max=45.,step=.25,value=mu),
         sigma = widgets.FloatSliderWidget(min=1e-3,max=10.,step=.001,value=sigma))

In [42]:

help(fftpack.fftfreq)

Help on function fftfreq in module numpy.fft.helper:

fftfreq(n, d=1.0)
    Return the Discrete Fourier Transform sample frequencies.
    
    The returned float array `f` contains the frequency bin centers in cycles
    per unit of the sample spacing (with zero at the start).  For instance, if
    the sample spacing is in seconds, then the frequency unit is cycles/second.
    
    Given a window length `n` and a sample spacing `d`::
    
      f = [0, 1, ...,   n/2-1,     -n/2, ..., -1] / (d*n)   if n is even
      f = [0, 1, ..., (n-1)/2, -(n-1)/2, ..., -1] / (d*n)   if n is odd
    
    Parameters
    ----------
    n : int
        Window length.
    d : scalar, optional
        Sample spacing (inverse of the sampling rate). Defaults to 1.
    
    Returns
    -------
    f : ndarray
        Array of length `n` containing the sample frequencies.
    
    Examples
    --------
    >>> signal = np.array([-2, 8, 6, 4, 1, 0, 3, 5], dtype=float)
    >>> fourier = np.fft.fft(signal)
    >>> n = signal.size
    >>> timestep = 0.1
    >>> freq = np.fft.fftfreq(n, d=timestep)
    >>> freq
    array([ 0.  ,  1.25,  2.5 ,  3.75, -5.  , -3.75, -2.5 , -1.25])

Parseval's Theorem

The power spectrum, or power spectral density (PSD), is a quantity of great interest in time series analysis. It is defined over the range $0 \le f < \infty$ as

$$PSD(f) \equiv |H(f)|^{2} + |H(-f)|^{2}$$

The power at a given frequency is a quantitative statement about the importance of each frequency mode.

Parseval's Theorem tells us that the total power is the same whether compute in the frequency or time domain:

$$P_{tot} \equiv \int_{0}^{\infty} PSD(f) df = \int_{-\infty}^{\infty} |h(t)|^{2} dt$$

In [43]:

def fourierPlots(func):
    transform = fftpack.fftshift(np.abs(fftpack.fft(func)))**2
    N = len(transform)
    freqs = fftpack.fftshift(fftpack.fftfreq(len(func), 1.))
    fig = plt.figure()
    funcax = fig.add_axes([0.05, 0.05, .5, 0.5])
    funcax.plot(func, '-')
    funcax.set_xlabel('T [s]')
    funcax.set_ylabel('Amplitude')
    tranax = fig.add_axes([0.65, 0.05, .5, 0.5])
    tranax.plot(freqs[N/2:], transform[N/2:]+transform[0:N/2][::-1], '-')
    #tranax.plot(freqs, fftpack.fftshift(fftpack.fft(func).real), 'r-')
    #tranax.plot(freqs, fftpack.fftshift(fftpack.fft(func).imag), 'g-')
    tranax.set_xlabel('$f$ [Hz]')
    tranax.set_ylabel('$PSD(f)$')
    #tranax.set_ylim([-1.5, 1.5])
    

Again, the Fourier Transform and inverse are: $$ H(f) = \int_{-\infty}^{\infty} h(t) \exp{(-i 2 \pi f t)} dt$$ $$h(t) = \int_{-\infty}^{\infty} H(f) \exp{(i 2 \pi f t)} df $$

Delta Function¶

In [44]:

deltaFunc = np.zeros(500)
deltaFunc[249] = 1
fourierPlots(deltaFunc)

Constant¶

In [46]:

fourierPlots(np.zeros(500) + 10)

Periodic Pulses¶

In [47]:

pulsesFunc = np.zeros(500)
pulsesFunc[0::50] = 1
fourierPlots(pulsesFunc)

In [48]:

pulsesFunc = np.zeros(500)
pulsesFunc[0::10] = 1
fourierPlots(pulsesFunc)

Sine Wave¶

In [49]:

fourierPlots(np.sin(np.linspace(0, 100*np.pi, 5e2)))

In [50]:

fourierPlots(np.sin(np.linspace(0, 1250*np.pi, 5e2)))

The Nyquist Sampling Theorem¶

Recall that Fourier transform and its inverse extend to infinity

$$ H(f) = \int_{-\infty}^{\infty} h(t) \exp{(-i 2 \pi f t)} dt$$$$h(t) = \int_{-\infty}^{\infty} H(f) \exp{(i 2 \pi f t)} dt $$

However, the Discrete Fourier Transform and its inverse only extend through the range of our data set

$$H_{k} = \sum_{j=0}^{N-1} h_j \exp{[-i 2 \pi jk/N]}$$$$h_j = \frac{1}{N} \sum_{k=0}^{N-1}H_k \exp{[i 2 \pi jk/N]}$$

or in SciPy notation: $$y[k] = \sum_{n=0}^{N-1} \exp{(-2 \pi j \frac{kn}{N})}x[n]$$ $$x[n] = \frac{1}{N}\sum_{n=0}^{N-1} \exp{(2 \pi j \frac{kn}{N})} y[k]$$

Can we still estimate the PSD even though we're only using the Discrete Fourier Transform?

Let us define h(t) to be band limited if H(f) = 0 for $|f| > f_c$, where $f_c$ is the band limit, or Nyquist critical frequency. For h(t) there is a resolution limit in $t$ space of $t_c = 1/(2 f_c)$ below which $h(t)$ appears smooth.

The Nyquist sampling theorem tells us that we can exactly reconstruct h(t) from evenly sampled data when $\Delta t \le t_c$ as

$$h(t) = \frac{\Delta t}{t_c} \sum_{k=-\infty}^{k=\infty} h_k \frac{\sin{[2 \pi f_c (t - k \Delta t)]}}{2 \pi f_c (t - k \Delta t)}$$

where $\Delta t$ is the time sampling resolution.

Let's look at an example with the function $$h(t) = \sin{(2 \pi t/P)}$$

This function is band limited with a critical frequency of $f_c = 1/P$. The critical sampling rate is therefore $t_c = 1/(2 f_c) = P/2$. As long as we sample with time steps smaller than $P/2$ the Nyquist sampling theorem tells us we can fully reconstruct $h(t)$.

However, if $h(t)$ is not band-limited, or if the sampling rate is not sufficient, aliasing will occur, which prevents us from exactly reconstructing $h(t)$. Aliasing falsely transfers frequencies $|f| > f_c$ into the $-f_c < f < f_c$ range.

So the discrete Fourier transform is a good estimate of the true Fourier transform as long as we're dealing with a band limited function and $\Delta t < t_c$. The PSD in this case can be written as

$$PSD(f_k) = 2 \left(\frac{T}{N} \right)^2 \left[\left(\sum_{j=0}^{N-1} h_j \cos{(2 \pi f_k t_j)} \right)^2 + \left(\sum_{j=0}^{N-1} h_j \sin{(2 \pi f_k t_j)} \right)^2\right]$$

and this is true only for noiseless evenly sampled data.

In [84]:

plt.axvspan(0, 5, ymax=0.5, color='#75BEBB', alpha=0.75)
plt.axvline(10, color='#EB6649', lw=2)
plt.xlim([0, 20])
plt.ylim([0, 2])
plt.xlabel('Frequency')

Out[84]:

<matplotlib.text.Text at 0x10eb3ffd0>

In [92]:

tfull = np.linspace(0, 2 * np.pi, 1e3)
tlow = np.linspace(0, 2 * np.pi, 1e1)
omega_hi = 10.
omega_low = 1.
#1. show the high sampling high frequency curve
plt.plot(tfull, np.sin(omega_hi*tfull), lw=2, color='#C7B077')

#3. show the high sampling low frequency cuve
plt.plot(tfull, np.sin(omega_low*tfull), lw=4, color='#46959E')

#2. show the low sampling high frequency curve
plt.plot(tlow, np.sin(omega_hi*tlow), 'o', color='#BA4C37')

plt.xlim([0, 2 * np.pi])

Out[92]:

(0, 6.283185307179586)

In [82]:

period = 10.
tcrit = period / 2.

tfull = np.linspace(0, 10, 1e3)
#1. show the high sampling high frequency curve
plt.plot(tfull, np.sin(2 * np.pi *tfull / period), lw=2, color='#C7B077')

tlow = np.linspace(0, 10, tcrit)
plt.plot(tlow, np.sin(2 * np.pi *tlow / period), 'o', color='#BA4C37')

Out[82]:

[<matplotlib.lines.Line2D at 0x10acdd990>]

In [ ]:

Window Function

The sampling window function in the time domain can be thought of as the sum of delta functions at sampled observation times.

The Fourier transform of a set of delta functions with spacing $\Delta t$ is another set of delta functions with spacing $1 / \Delta t$.

In [42]:

plot1004()

The fast Fourier transform¶

Runs at $\mathcal{O}(N \log N)$ instead of $\mathcal{O}(N^2)$. Check it out in numerical recipes if you're curious of the details.

There are some other variants too. One mentioned in the book is Welch's method, which computes multiple Fourier transforms over overlapping windows of the data to smooth noise effects in the resulting spectrum.

In [115]:

plot1006()

10.2.4 Wavelets¶

Trigonometric basis functions used in the Fourier transform have an infinite extent.

For this reason the Fourier transform may not be the best method for nonperiodic time series data.

Instead of using trigonometric basis functions, a better alternative may be to use basis functions that are localized.

Wavelets are one such family of basis functions.

Wavelets are localized in both time and frequency.

Individual wavelets are specified by a set of wavelet filter coefficients.

Given a wavelet, a complete orthonormal set of basis functions can be constructed by scaling and translations.

Some well known wavelet families are Daubechies, Mexican hat, and Haar wavelets.

Similar to the discrete fourier transform, a discrete wavelet transform (DWT) can be used to analyze the power spectrum of a time series as a function of time. This can be performed using the PyWavelets toolkit.

In [43]:

plot1007()

In [44]:

plot1008()

In both of these examples the wavelet is of the form

$$w(t|t_0, f_0, Q) = A \exp{[i 2 \pi f_0(t - t_0)] \exp{[-f_0^2(t - t_0)]^2/Q^2}}$$

which has the transform

$$W(f|t_0, f_0, Q) = \left( \frac{\pi}{f_0^2/Q^2} \right)^{1/2} \exp{(-i 2 \pi f t_0)} \exp{\left[ \frac{-\pi^2 Q^2(f - f_0)^2}{Qf_0^2}\right]}$$

actually, it's not technically a wavelet because it doesn't meet the admissibility criterion, so it should be called a "matched filter". However, a scaling and offset could make this a Morlet wavelet.

A similar method is called matching pursuit, which uses a large redundant set of nonorthogonal basis functions and uses the data to determine the appropriate set of basis functions (AKA, a dictionary). For an example, see Lachowicz and Done (2010).

A recent example is from (Metcalfe et al. 2013), where they used wavelet analysis to analyze the magnetic cycles of Eps Eri. Time series of Eps Eri

Wavelet Analysis of Eps Eri

Hatched regions are outside of the "cone of influence", which suffer from edge effects. The weakest signal is shown with white and blue (and presumably black and purple), to the strongest signal being red and black.

10.2.5. Digital Filtering

Low pass filtering: only low frequencies are admitted
High pass filtering: only high frequencies admitted
passband filtering: finite range of frequencies admitted
notch filtering: finite range of frequencies blocked

Using this technique will always reduce the information content of your data, but if you're going to use it, make sure you use the Wiener filter to suppress ringing.

In [45]:

plot1010()

Optimization terminated successfully.
         Current function value: 574032608.556378
         Iterations: 95
         Function evaluations: 180

The low pass filtering mentioned here can only be done on evenly sampled data. For unevenly sampled data, use a Savitzky-Golay filter.

10.3 Analysis of Periodic Time Series¶

A periodic time series satisfies $y(t + P) = y(t)$, where $P$ is the period.

A convenient concept is phase-folding:

$$\phi = \frac{t}{P} - int \left( \frac{t}{P} \right)$$

10.3.1. A Single Sinusoid Model¶

Lets take an example of a simple oscillator: $$y(t) = A\sin{(\omega t + \phi)} + \epsilon$$

This can be partially linearized to speed up the analysis through the use of trigonometric identities

$$y(t) = a \sin{(\omega t)} + b \cos{(\omega t)}$$

(use the property $\sin{(x+y)} = \sin{x}\cos{y} + \cos{x}\sin{y}$ and draw a triangle to show if $\tan{\phi}=b/a$, then $\cos{\phi}=a/A$ and $\sin{\phi}=b/A$

Then the likelihood function is $$\mathcal{L}(\omega, a, b, \sigma|t, y) \equiv p({t, y}| \omega, a, b, \sigma) = \prod_{j=1}^{N} \frac{1}{\sqrt{2 \pi \sigma}} \exp{\left( \frac{-[y_j - a \sin{(\omega t_j)} - b\cos{(\omega t_j)}]^2} {2 \sigma^2}\right)}$$

Using Bayes' Theorem $$p(\omega, a, b, \sigma|{t, y}, I) = \frac{p({t, y}| \omega, a, b, \sigma, I) \times p(\omega, a, b, \sigma | I)} {p({t, y}|I)} $$

and assuming uniform priors, and ignoring the evidence, the posterior is then proportional to

$$p(\omega, a, b, \sigma | {t, y}) \propto \sigma^{-N} \exp{\left( \frac{-NQ}{2 \sigma^2}\right)}$$

where $Q$ is a really long term defined in the book. A few approximations were used to remove some of the terms.

If we're only interested in periodicity, and not the phase of the periodicity, we can further reduce the number of terms by marginzalizing over them

$$p(\omega, \sigma | {t, y}) \propto \int p(\omega, a, b, \sigma | {t, y}) \ da \ db$$

Finally, we're left at

$$p(\omega, \sigma | {t, y}) \propto \sigma^{-(N-2)}\exp{\left(\frac{-NV}{2 \sigma^2} + \frac{P(\omega)}{\sigma^2}\right)}$$

where

$$V = \frac{1}{N} \sum_{j=1}^{N}y_j^2$$

and $P(\omega)$ is known as the periodogram and is defined as

$$P(\omega) = \frac{1}{N}[I^2(\omega) + R^2(\omega)]$$

where

$$I(\omega) = \sum_{j=1}^{N} y_j \sin{(\omega t_j)}$$$$R(\omega) = \sum_{j=1}^{N} y_j \cos{(\omega t_j)}$$

Finally, when the noise is known, this further simplifies to

$$p(\omega | {t, y}, \sigma) \propto \exp{\left( \frac{P(\omega)}{\sigma^2}\right)}$$

see the book for how to come about this without marginalizing over $a$ and $b$. Essentially, you just take the derivative of the posterior, set it to zero, and solve for $a_0$ and $b_0$. Then just role with those values.

The meaning of periodogram

But what is the best value of $\omega$ that is supported by the data? To answer this let's compute $\chi^2$

$$\chi^2(\omega) \equiv \frac{1}{\sigma^2} \sum_{j=1}^{N} [y_j - y(t_j)]^2 = \frac{1}{\sigma^2} \sum_{j=1}^{N} [y_j - a_0 \sin{(\omega t_j)} - b_0 \cos{(\omega t_j)}]^2 $$

If we set $a_0 = b_0 = 0$ to get a constant model, the $\chi^2$ for that constant model is

$$\chi_0^2 = \frac{1}{\sigma^2} \sum_{j=1}^{N} y_j^2 = \frac{NV}{\sigma^2}$$

which can be used to rewrite $\chi^2$ as

$$\chi^2(\omega) = \chi_0^2 \left[ 1 - \frac{2}{NV} P(\omega) \right]$$

The Lomb-Scargle periodogram is defined as

$$P_{LS} (\omega) = \frac{2}{NV} P(\omega)$$

making the above

$$\frac{\chi^2(\omega)}{\chi_0^2} = 1 - P_{LS} (\omega)$$

That's great that we now have a way of dealing with unevenly sampled data and it's easy to incorporate heteroscedastic uncertainties. But how do we gauge the significance of peaks?

One way is by using the Bayesian Information Criterion:

$$\Delta BIC = \chi_0^2 - \chi^2(\omega_0) - (k_0 - k_{\omega})\ln{N}$$

similarly, there's the Aikake Information Criterion

$$\Delta AIC = \frac{NV}{\sigma^2} P_{LS}(\omega_0) - 6$$

This seems like a good route to take for large data sets. A more obvious solution to determining peak significance will be discussed soon.

The discrete PSD gives optimal results if

signal is single harmonic with constant amplitude and phase
evenly sampled data
N >>1
Noise is gaussian
noise is homoscedastic

otherwise, you should consider the LS Periodogram

10.3.2 The Lomb-Scargle Periodogram¶

The discrete Fourier Transform is great when there are evenly sampled data. Unfortunately it cannot be used when the data are unevenly sampled, which is often what we deal with in astronomy. In the cases with unevenly sampled data the Lomb-Scargle is a great tool for searching for periodicity.

The normalized Lomb-Scargle periodogram is defined as:

$$P_{LS}(\omega) = \frac{1}{V}\left[ \frac{R^2(\omega)}{C(\omega)} + \frac{I^2(\omega)} {S(\omega)} \right]$$

The meaning of the Lomb-Scargle Periodogram

The Lomb Scargle periodogoram can be thought of as an inverted plot of $\chi^2$ normalized by $\chi_0^2$ (the $\chi^2$ for a constant model).

The Lomb-Scargle does not save computational effort (common misconception).

Practical application of the Lomb-Scargle periodogram

Since the underlying model is still nonlinear in frequency, and the basis functions at different frequencies are not orthogonal, the global maximum must be found through grid search.

This can be done using SciPy:

In [96]:

from scipy.signal.spectral import lombscargle

In [97]:

help(lombscargle)

Help on built-in function lombscargle in module scipy.signal._spectral:

lombscargle(...)
    lombscargle(x, y, freqs)
    
    Computes the Lomb-Scargle periodogram.
    
    The Lomb-Scargle periodogram was developed by Lomb [1]_ and further
    extended by Scargle [2]_ to find, and test the significance of weak
    periodic signals with uneven temporal sampling.
    
    The computed periodogram is unnormalized, it takes the value
    ``(A**2) * N/4`` for a harmonic signal with amplitude A for sufficiently
    large N.
    
    Parameters
    ----------
    x : array_like
        Sample times.
    y : array_like
        Measurement values.
    freqs : array_like
        Angular frequencies for output periodogram.
    
    Returns
    -------
    pgram : array_like
        Lomb-Scargle periodogram.
    
    Raises
    ------
    ValueError
        If the input arrays `x` and `y` do not have the same shape.
    
    Notes
    -----
    This subroutine calculates the periodogram using a slightly
    modified algorithm due to Townsend [3]_ which allows the
    periodogram to be calculated using only a single pass through
    the input arrays for each frequency.
    
    The algorithm running time scales roughly as O(x * freqs) or O(N^2)
    for a large number of samples and frequencies.
    
    References
    ----------
    .. [1] N.R. Lomb "Least-squares frequency analysis of unequally spaced
           data", Astrophysics and Space Science, vol 39, pp. 447-462, 1976
    
    .. [2] J.D. Scargle "Studies in astronomical time series analysis. II - 
           Statistical aspects of spectral analysis of unevenly spaced data",
           The Astrophysical Journal, vol 263, pp. 835-853, 1982
    
    .. [3] R.H.D. Townsend, "Fast calculation of the Lomb-Scargle
           periodogram using graphics processing units.", The Astrophysical
           Journal Supplement Series, vol 191, pp. 247-253, 2010
    
    Examples
    --------
    >>> import scipy.signal
    
    First define some input parameters for the signal:
    
    >>> A = 2.
    >>> w = 1.
    >>> phi = 0.5 * np.pi
    >>> nin = 1000
    >>> nout = 100000
    >>> frac_points = 0.9 # Fraction of points to select
     
    Randomly select a fraction of an array with timesteps:
    
    >>> r = np.random.rand(nin)
    >>> x = np.linspace(0.01, 10*np.pi, nin)
    >>> x = x[r >= frac_points]
    >>> normval = x.shape[0] # For normalization of the periodogram
     
    Plot a sine wave for the selected times:
    
    >>> y = A * np.sin(w*x+phi)
    
    Define the array of frequencies for which to compute the periodogram:
    
    >>> f = np.linspace(0.01, 10, nout)
     
    Calculate Lomb-Scargle periodogram:
    
    >>> pgram = sp.signal.lombscargle(x, y, f)
    
    Now make a plot of the input data:
    
    >>> plt.subplot(2, 1, 1)
    <matplotlib.axes.AxesSubplot object at 0x102154f50>
    >>> plt.plot(x, y, 'b+')
    [<matplotlib.lines.Line2D object at 0x102154a10>]
    
    Then plot the normalized periodogram:
    
    >>> plt.subplot(2, 1, 2)
    <matplotlib.axes.AxesSubplot object at 0x104b0a990>
    >>> plt.plot(f, np.sqrt(4*(pgram/normval)))
    [<matplotlib.lines.Line2D object at 0x104b2f910>]
    >>> plt.show()

In [98]:

numSamps = 1e2
x = np.sort(np.random.uniform(low=0, high=1000, size=numSamps))
unc = np.random.normal(loc=0, scale=0.5, size=numSamps)
signalPer = 100.9 #days
y = np.sin(2 * np.pi * x / signalPer)
fig = plt.figure()
ax = plt.subplot(211)
ax.plot(x, y, 'o-', markersize=5)
ax.plot(x, y + unc, 'o')
ax.set_xlabel('Time')
ax.set_ylabel('Amplitude')

ax = plt.subplot(212)
pmin = 0.5
pmax = 1000.
omega = np.linspace(2 * np.pi / pmax, 2 * np.pi / pmin, 3e3)
P_SP = lombscargle(x, y, omega)
P_SP_Unc = lombscargle(x, y + unc, omega)

ax.plot(2 * np.pi/omega, P_SP, lw=2, alpha=0.5, label='No Noise')
ax.plot(2 * np.pi/omega, P_SP_Unc, lw=2, alpha=0.5, label='Noise')
ax.set_xscale('log')
ax.set_xlim([0.5, 1e3])
ax.set_xlabel('Period [d]')
ax.set_ylabel('Power')
ax.legend(loc='best')

fig.subplots_adjust(hspace=0.25)

-c:2: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
-c:3: DeprecationWarning: using a non-integer number instead of an integer will result in an error in the future

False Alarm Probability

As mentioned before, the $\Delta$BIC can give a good idea of the significance of a signal if you know how to interpret it, and a large data set can help interpret the threshold.

Alternatively, the significance of a peak can be assessed using a False Alarm Probability (FAP).

A boostrap resampling is employed to compute the FAP, where the observation times are kept fixed and the y values are drawn with replacement. The peak in every sample is then recorded to assess the desired FAP levels. astroML has a built in method for doing this, which is shown below.

In [99]:

from astroML.time_series import \
    lomb_scargle, lomb_scargle_BIC, lomb_scargle_bootstrap

In [100]:

help(lomb_scargle)

Help on built-in function lomb_scargle in module astroML_addons.periodogram:

lomb_scargle(...)
    (Generalized) Lomb-Scargle Periodogram with Floating Mean
    
    Parameters
    ----------
    t : array_like
        sequence of times
    y : array_like
        sequence of observations
    dy : array_like
        sequence of observational errors
    omega : array_like
        frequencies at which to evaluate p(omega)
    generalized : bool
        if True (default) use generalized lomb-scargle method
        otherwise, use classic lomb-scargle.
    subtract_mean : bool
        if True (default) subtract the sample mean from the data before
        computing the periodogram.  Only referenced if generalized is False.
    significance : None or float or ndarray
        if specified, then this is a list of significances to compute
        for the results.
    
    Returns
    -------
    p : array_like
        Lomb-Scargle power associated with each frequency omega
    z : array_like
        if significance is specified, this gives the levels corresponding
        to the desired significance (using the Scargle 1982 formalism)
    
    Notes
    -----
    The algorithm is based on reference [1]_.  The result for generalized=False
    is given by equation 4 of this work, while the result for generalized=True
    is given by equation 20.
    
    Note that the normalization used in this reference is different from that
    used in other places in the literature (e.g. [2]_).  For a discussion of
    normalization and false-alarm probability, see [1]_.
    
    To recover the normalization used in Scargle [3]_, the results should
    be multiplied by (N - 1) / 2 where N is the number of data points.
    
    References
    ----------
    .. [1] M. Zechmeister and M. Kurster, A&A 496, 577-584 (2009)
    .. [2] W. Press et al, Numerical Recipies in C (2002)
    .. [3] Scargle, J.D. 1982, ApJ 263:835-853

In [101]:

PLSastroML = lomb_scargle(x, y + unc, unc, omega)
# Get significance via bootstrap
D = lomb_scargle_bootstrap(x, y + unc, unc, omega, generalized=True,
                           N_bootstraps=1000, random_state=0)
sig1, sig5 = np.percentile(D, [99, 95])

In [72]:

help(ax.axhline)

Help on method axhline in module matplotlib.axes._axes:

axhline(self, y=0, xmin=0, xmax=1, **kwargs) method of matplotlib.axes._subplots.AxesSubplot instance
    Add a horizontal line across the axis.
    
    Parameters
    ----------
    y : scalar, optional, default: 0
        y position in data coordinates of the horizontal line.
    
    xmin : scalar, optional, default: 0
        Should be between 0 and 1, 0 being the far left of the plot, 1 the
        far right of the plot.
    
    xmax : scalar, optional, default: 1
        Should be between 0 and 1, 0 being the far left of the plot, 1 the
        far right of the plot.
    
    Returns
    -------
    `~matplotlib.lines.Line2D`
    
    Notes
    -----
    kwargs are the same as kwargs to plot, and can be
    used to control the line properties.  e.g.,
    
    Examples
    --------
    
    * draw a thick red hline at 'y' = 0 that spans the xrange::
    
        >>> axhline(linewidth=4, color='r')
    
    * draw a default hline at 'y' = 1 that spans the xrange::
    
        >>> axhline(y=1)
    
    * draw a default hline at 'y' = .5 that spans the the middle half of
      the xrange::
    
        >>> axhline(y=.5, xmin=0.25, xmax=0.75)
    
    Valid kwargs are :class:`~matplotlib.lines.Line2D` properties,
    with the exception of 'transform':
    
      agg_filter: unknown
      alpha: float (0.0 transparent through 1.0 opaque)         
      animated: [True | False]         
      antialiased or aa: [True | False]         
      axes: an :class:`~matplotlib.axes.Axes` instance         
      clip_box: a :class:`matplotlib.transforms.Bbox` instance         
      clip_on: [True | False]         
      clip_path: [ (:class:`~matplotlib.path.Path`,         :class:`~matplotlib.transforms.Transform`) |         :class:`~matplotlib.patches.Patch` | None ]         
      color or c: any matplotlib color         
      contains: a callable function         
      dash_capstyle: ['butt' | 'round' | 'projecting']         
      dash_joinstyle: ['miter' | 'round' | 'bevel']         
      dashes: sequence of on/off ink in points         
      drawstyle: ['default' | 'steps' | 'steps-pre' | 'steps-mid' |                   'steps-post']         
      figure: a :class:`matplotlib.figure.Figure` instance         
      fillstyle: ['full' | 'left' | 'right' | 'bottom' | 'top' | 'none']         
      gid: an id string         
      label: string or anything printable with '%s' conversion.         
      linestyle or ls: [``'-'`` | ``'--'`` | ``'-.'`` | ``':'`` | ``'None'`` |                   ``' '`` | ``''``]
      linewidth or lw: float value in points         
      lod: [True | False]         
      marker: unknown
      markeredgecolor or mec: any matplotlib color         
      markeredgewidth or mew: float value in points         
      markerfacecolor or mfc: any matplotlib color         
      markerfacecoloralt or mfcalt: any matplotlib color         
      markersize or ms: float         
      markevery: unknown
      path_effects: unknown
      picker: float distance in points or callable pick function         ``fn(artist, event)``         
      pickradius: float distance in points         
      rasterized: [True | False | None]         
      sketch_params: unknown
      snap: unknown
      solid_capstyle: ['butt' | 'round' |  'projecting']         
      solid_joinstyle: ['miter' | 'round' | 'bevel']         
      transform: a :class:`matplotlib.transforms.Transform` instance         
      url: a url string         
      visible: [True | False]         
      xdata: 1D array         
      ydata: 1D array         
      zorder: any number         
    
    See also
    --------
    axhspan : for example plot and source code

In [102]:

fig = plt.figure()
ax = plt.subplot(111)
ax.plot(2 * np.pi/omega, PLSastroML, lw=2, alpha=0.5)
ax.set_xscale('log')
ax.set_xlim([0.5, 1e3])
ax.set_xlabel('Period [d]')
ax.set_ylabel('Power')
ax.axhline(y=sig5, color='#BA4C37', ls='--', lw=3, label='5% FAP')
ax.axhline(y=sig1, color='#982023', label='1% FAP')
ax.legend(loc='best')

Out[102]:

<matplotlib.legend.Legend at 0x10f094d10>

In [103]:

plot1015()

Generalized Lomb-Scargle Periodogram

One problem with the 1982 version of the Lomb-Scargle periodogram is that it implicitly assumes the mean of the sample is the same as the mean of the true function. Since we are not evenly sampling the data, may be working with small data sets, or may not be sampling the full period, this is a bad assumption. As Cumming et al. (1999) pointed out, this can cause problems with aliases.

To correct for this, Cumming et al. (1999) added a floating mean to overcome this deficiency:

$$y = a \cos{\omega t} + b \sin{\omega t} + c$$

(note the extra free parameter, $c$). They demonstrated its effectiveness and used it for a thorough statistical analysis of the detectability of planets from the Lick planet search program. Later, Zechmeister and Kurster (2009) wrote up the analytical derivation for this and referred to it as the "generalized Lomb-Scargle" periodogram (not to be confused with the Bretthorst (2001) generalized Lomb-Scargle periodogram, which is completely different).

Errata: There was a typo in Figure 10.15. An updated version can be found here, and a corrected version with a larger data set is reproduced below.

In [82]:

plot1015()

10.3.3. Truncated Fourier Series Model¶

What happens if data have an underlying variability that is more complex than a single sinusoid?

In [104]:

#6 stars from the LINEAR data set. Periods estimated using the Lomb-Scargle Periodogram.
plot1017()

Downloading http://www.astro.washington.edu/users/ivezic/linear/allDataFinal/allLINEARfinal_targets.dat
[=========================================]  1001.2kb / 1001.2kb   
Downloading http://www.astro.washington.edu/users/ivezic/linear/allDataFinal/allLINEARfinal_dat.tar.gz
[=========================================]  16.05Mb / 16.05Mb   
@pickle_results: using precomputed results from 'LINEAR_LS.pkl'
 - omega_0 = 17.21695506
 - omega_0 = 51.10600808
 - omega_0 = 10.20062676
 - omega_0 = 45.55493503
 - omega_0 = 58.44512277
 - omega_0 = 10.82722772

One way of going about this is to use multiple Fourier terms:

$$y(t) = b_0 + \sum_{m=1}^{M} a_m \sin{(m \omega t)} + b_m \cos{(m \omega t)}$$

and then the periodogram becomes

$$P_{M}(\omega) = \frac{2}{V} \sum_{m=1}^{M}[R_{m}^2(\omega) + I_{m}^2(\omega)]$$

In [105]:

from astroML.time_series import multiterm_periodogram, MultiTermFit
from astroML.datasets import fetch_LINEAR_sample
data = fetch_LINEAR_sample()
t, y, dy = data[14752041].T

In [106]:

print(np.min(t), np.max(t))

(52653.465077000001, 54620.262198999997)

In [107]:

pmin_linear = 1./48. #days
pmax_linear = 2.
omega_linear = np.linspace(2. * np.pi /pmax_linear, 2. * np.pi / pmin_linear, 1e4)
P_MT1 = multiterm_periodogram(t, y, dy, omega_linear, 1)
P_MT6 = multiterm_periodogram(t, y, dy, omega_linear, 6)

P_LS = lomb_scargle(t, y, dy, omega_linear, generalized=True)

In [132]:

print(P_MT.shape)
print(t.shape)
print(omega_linear.shape)
fig = plt.figure()
ax = fig.add_subplot(211)
ax.plot(2. * np.pi / omega_linear * 24., P_MT1, label='1 Multiterm')
ax.plot(2. * np.pi / omega_linear * 24., P_MT6, label='6 Multiterm')
ax.plot(2. * np.pi / omega_linear * 24., P_LS, label='Lomb Scargle')
#plt.xscale('log')
ax.set_xlim([pmin_linear*24., pmax_linear*24.])
ax.set_xlabel('P [hours]')
ax.set_ylabel('Power')
ax.legend(loc='best')
ax = fig.add_subplot(212)
maxper = 2. * np.pi / omega_linear[np.where(P_LS == np.max(P_LS))[0]]
ax.plot((t / maxper) % 1 , y, '.')
ax.set_ylim([np.max(y), np.min(y)])
ax.set_xlabel('Phase')
ax.set_ylabel('Mag')

(3000,)
(253,)
(10000,)

Out[132]:

<matplotlib.text.Text at 0x11dde4850>

In [108]:

plot1018()

But how do we know the number of Fourier modes to include? 5, 10, 100, M = N/2 - 1?

Bayesian Information Criterion to the rescue:

$$\Delta BIC_{M} = \chi_0^2 P_M (\omega_0^M) - (2M + 1) \ln{N}.$$

Add components and recreate the multi-term periodogram until $\Delta BIC$ is maximized.

In [143]:

plot1019()

10.3.4. Classification of Periodic Light Curves¶

Among other techniques for classification, one option is to use the multiterm periodogram to find the best periods, and then fit for the best coeffecients for all $m$ components:

$$y(t) = b_0 + \sum_{m=1}^{M} a_m \sin{(m \omega_0 t)} + b_m \cos{(m \omega_0 t)}$$

$b_0$ is a generalization term. But that's not all! You can even fit multiple periods simultaneously:

$$y(t) = b_0 + \sum_{k=1}^{K}\sum_{m=1}^{M} a_{km} \sin{(m \omega_k t)} + b_{km} \cos{(m \omega_k t)}$$

Gaussian Mixture Clustering Analysis¶

A nonparametric method based on the hypothesis that diferent physical classes will be clustered together in the multi-dimensional parameter space. Once the clusters have been identified, they can be assigned a physical meaning.

The below plots show the resulting 12 clusters from minimizing the BIC on two data attributes: $g-i$ color and $\log{P}$. ab-type RR Lyras are two separate clusters because a Gaussian is a poor description of their distribution. A is light curve amplitude.

In [148]:

plot1021()

@pickle_results: using precomputed results from 'LINEAR_clustering.pkl'
number of components: 12
number of components: 15

The bottom 4 panels in the above set are Figure 10.21. This is the same data set, but it fits seven attributes, resulting in a 15-component Gaussian mixture solution. Despite the additional attributes and components, the clusters really didn't change much, indicating that not much new information was added.

Supervised Approach

When a training set of data is available, other methods include a Gaussian mixture model Bayes classifier (Section 9.3.5) and support vector machines (Section 9.6). They both did fairly well, and the GMMB slightly outperformed the SVM (see tables 2 and 3 on p. 450). Both methods struggled distinguishing EA (Algol-type) from EB/EW (contact) eclipsing binaries. However, that's not surprising since even experts have difficulty distinguishing the two.

10.3.5. Analysis of Arrival Time Data¶

All of the above analysis assumes some uncertainty in the measurement value. But what about when there are sparse data sets where single events are recorded with negligible error. One example is an X-ray data set, where individual photons are detected with negligible background.

The Rayleigh test is the best known classical test in this situation.

$$R^2 = \left( \sum_{j=1}^{N} \cos{(2 \pi \phi_j)}\right)^2 + \left( \sum_{j=1}^{N} \sin{(2 \pi \phi_j)}\right)^2$$

This is carried out for a grid of periods, just like the periodogram.

Small $R^2 =>$ random

Large $R^2 =>$ periodic

Nonparametric Bayesian Approach

An interseting alternative was carried out by Gregory and Loredo (1992).

First, divide the data set into very small bins to create a sparse data set, where values are either 1 or 0 in each bin.

The expectation value is then

$$\mu(t) = r(t) \Delta t$$

From Poisson statistics, the probability of detecting no photons is:

$$p(0) = e^{-r(t) \Delta t}$$

And the probability of a single event/count is: $$p(1) = r(t)\Delta t e^{-r(t)\Delta t}$$

The data likelihood is then $$p(D|r, I) = (\Delta t)^N \exp{\left( -\int_{T} r(t) dt \right)} \prod_{j=1}^{N}r(t_j)$$

They then used a nonparametric function for the rate function, treating it as a piecewise constant function, $f_j$, with $M$ steps of the same width, and $\sum_j f_j = 1$.

They then marginalize the resulting pdf to produce an analog for the periodogram.

When little is known about the shape of the signal, this method is superior to a Fourier series expansion.

Bayesian Blocks

Similar to the Gregory-Loredo method, but with adaptive bin sizes.

10.4. Temporally Localized Signals¶

Stationary signal with an event localized in time.

Examples are a microlensing event and bursts of emission.

Two parametric models are burst signal and chirp signal.

10.4.1. Searching for a Burst Signal¶

Super basic. You have a stationary signal

$$y(t) = b_0 + \epsilon$$

at some point, there's a burst. This can be generalized in the form of

$$y_B (t | T, A, \theta) = A g_B (t - T|\theta)$$

One option for $g_B$ is $g_B(t \alpha) = \exp{(-\alpha t)}$. Use MCMC to estimate the parameters.

In [175]:

plot1025()

@pickle_results: using precomputed results from 'matchedfilt_burst.pkl'

10.4.2. Searching for Chirp Signal¶

Doing the same thing as above, but with a "chirp" signal

$$y(t) = b_0 + A \sin{[\omega t + \beta t^2]}$$

In [176]:

plot1026()

@pickle_results: using precomputed results from 'matchedfilt_chirp.pkl'

And here's a combination of the two -- a temporally localized exponentially decaying chirp signal

$$y_C(t|T, A, \phi, \omega, \beta) = A \sin{[\phi + \omega (t - T) + \beta (t - T)^2 ]}\exp{[-\alpha (t - T)]}$$

that is riding on top of another signal

$$y(t) = b_0 + b_1 \sin{(\Omega_1 t)} \sin{(\Omega_2 t)}$$

In [177]:

plot1027()

@pickle_results: using precomputed results from 'matchedfilt_chirp2.pkl'

For signals that are not just localized in time, but in frequency too, wavelets can be helpul.

In [178]:

plot1028()

10.5 Analysis of Stochastic Processes¶

Stochastic Processes: behavior that is not predictable forever, but is always there.

Underlying physics is too complex to deterministically predict future values (e.g. magnetic dynamos, long-term weather)

10.5.1. The Autocorrelation and Structure Functions¶

The correlation of two functions, $f(t)$ and $g(t)$, is

$$CF(\Delta t) = \frac{\lim_{T \rightarrow \infty} \frac{1}{T} \int_{(T)} f(t)g(t+\Delta t) dt}{\sigma_f \sigma_g}$$

and the autocorrelation function compares a function to itself and different times

$$ACF(\Delta t) = \frac{\lim_{T \rightarrow \infty} \frac{1}{T} \int_{(T)} f(t)f(t+\Delta t) dt}{\sigma_f^2}$$

If the time series repeats itself by shifting the time axis by $t_{lag}$, then the ACF will have a peak at $\Delta t = t_{lag}$.

This is analogous to computing the PSD in period space. For that reason, the ACF and PSD are known as "Fourier Pairs", which is known as the "Wiener-Khinchin theorem".

Structure Function

The structure function is closely related to the ACF.

$$SF(\Delta t) = SF_{\infty} [1 - ACF(\Delta t)]^{1/2}$$

where $SF_{\infty}$ is the standard deviation of the time series over infinite time.

The structure function, the ACF, and the PSD are all mathematically equivalent and are all used; however, due to issues of noise and sampling, they may not always result in equivalent inferences about the data.

Examples of stochastic processes: $1/f$ and $1/f^2$ processes

The authors are really getting lazy here... Essentially, see [56] for details.

In [181]:

plot1029()

The above figure illustrates the connection between the PSDs for $1/f$ and $1/f^2$ processes and their corresponding time series.

$1/f$: AKA memory noise, flicker noise, and red noise
$1/f^2$: brownian motion, random walk, red noise (in which case $1/f$ is known as pink noise)

The variance of the mean of a $1/f$ process does not decrease with time -- it's infinite.

10.5.2. Autocorrelation and Structure Function for Evenly and Unevenly Sampled Data¶

Similar to problems with the PSD with unevenly sampled data, the discrete ACF only works for evenly sampled data. An alternative is the discrete correlation function (DCF, or UDCF), also known as the "slot autocorrelation function" in the physics community.

It's just binning the data.

astroML includes tools for computing the ACF of unevenly sampled data using two methods: the Scargle method and the Edelson Krolik method.

The difference (or lack thereof) is shown below)

In [183]:

plot1030()

10.5.3. Autoregressive Models¶

Autoregressive models provide a good general description of processes that "retain memory", but are not periodic. A random walk is a good example:

$$y_i = \beta y_{i-1} + e_i$$

$\beta > 1$ is known as a geometric random walk model, which is used extensively to model stock market data.

An autoregressive process of order $k$ for a discrete process is defined by

$$y_i = \sum_{j=1}^k a_j y_{i-j} + e_i$$

In other words, the $ith$ value of $y$ is a linear combination of the $k$ previous values of $y$.

Related processes include

moving average (MA): $y_i$ depends only on past values of noise
autogressive moving average (ARMA): a combination of AR and MA
autoregressive integrated moving average (ARIMA): ARMA + random walk
Kalman filtering: state-space or dynamic linear modeling

discrete

The equation for an autoregressive process shown above is only applicable for an evenly sampled data set. A generalization of this is the continuous autoregressive process (CAR($k$)).

A Gaussian process is also a great to model time series.

10.5.4. Damped Random Walk Model¶

A CAR(1) process is also known as a damped random walk, or an Ornstein-Uhlenbeck process. The damped random walk is a $1/f^2$ process at high frequencies.

For evenly sampled data the CAR(1) process is equivalent to the AR(1) process with $a_1 = \exp{(-1/\tau)}$.

A damped random walk provides a good description of the optical continuum of quasars, and has been used to distinguish them from stars.

10.6. Which Method Should I Use for Time Series Analysis?¶

Astronomical data are usually unevenly sampled with heteroscedastic errors. For this reason, the generalized Lomb-Scargle periodogram is usually a good starting point for periodic variability.

If dealing with sparse data and an unknown model, the Gregory-Loredo method or Bayesian blocks can be really useful.

You can also drop in your own function to make a periodogram specific to your model. A good example of this is the box least squares (BLS) periodogram for transit data.

Another good non-parametric model is importance sampling, otherwise known as Approximate Bayesian Computing. Nikhil is a good person to talk to about this -- he has applied it to RV data with a Gaussian Mixture Model.

Other topics not discussed in this chapter include state-space models, Kalman filters, Markov chains, and stochastic differential equations.

In [ ]:

Appendix: Figure Code¶

In [5]:

# Author: Jake VanderPlas
# License: BSD
#   The figure produced by this code is published in the textbook
#   "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
#   For more information, see http://astroML.github.com
#   To report a bug or issue, use the following forum:
#    https://groups.google.com/forum/#!forum/astroml-general


#----------------------------------------------------------------------
# This function adjusts matplotlib settings for a uniform feel in the textbook.
# Note that with usetex=True, fonts are rendered with LaTeX.  This may
# result in an error if LaTeX is not installed on your system.  In that case,
# you can set usetex to False.
def plot1001():
    #------------------------------------------------------------
    # Load the RR Lyrae template
    templates = fetch_rrlyrae_templates()
    x, y = templates['115r'].T

    #------------------------------------------------------------
    # Plot the results
    fig = plt.figure(figsize=myfigsize)
    fig.subplots_adjust(hspace=0)

    kvals = [1, 3, 8]
    subplots = [311, 312, 313]

    for (k, subplot) in zip(kvals, subplots):
        ax = fig.add_subplot(subplot)

        # Use FFT to fit a truncated Fourier series
        y_fft = np.fft.fft(y)
        y_fft[k + 1:-k] = 0
        y_fit = np.fft.ifft(y_fft).real

        # plot the true value and the k-term reconstruction
        ax.plot(np.concatenate([x, 1 + x]),
                np.concatenate([y, y]), '--k', lw=2)
        ax.plot(np.concatenate([x, 1 + x]),
                np.concatenate([y_fit, y_fit]), color='gray')

        label = "%i mode" % k
        if k > 1:
            label += 's'

        ax.text(0.02, 0.1, label, ha='left', va='bottom',
                transform=ax.transAxes)

        if subplot == subplots[-1]:
            ax.set_xlabel('phase')
        else:
            ax.xaxis.set_major_formatter(plt.NullFormatter())

        if subplot == subplots[1]:
            ax.set_ylabel('amplitude')
        ax.yaxis.set_major_formatter(plt.NullFormatter())

        ax.set_xlim(0, 2)
        ax.set_ylim(1.1, -0.1)

    #plt.show()

In [6]:

# Author: Jake VanderPlas
# License: BSD
#   The figure produced by this code is published in the textbook
#   "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
#   For more information, see http://astroML.github.com
#   To report a bug or issue, use the following forum:
#    https://groups.google.com/forum/#!forum/astroml-general
import numpy as np
from matplotlib import pyplot as plt

from astroML.time_series import \
    lomb_scargle, lomb_scargle_BIC, lomb_scargle_bootstrap

#----------------------------------------------------------------------
# This function adjusts matplotlib settings for a uniform feel in the textbook.
# Note that with usetex=True, fonts are rendered with LaTeX.  This may
# result in an error if LaTeX is not installed on your system.  In that case,
# you can set usetex to False.
from astroML.plotting import setup_text_plots
setup_text_plots(fontsize=8, usetex=True)

#------------------------------------------------------------
# Generate data where y is positive
np.random.seed(0)
N = 30
P = 0.3

t = P / 2 * np.random.random(N) + P * np.random.randint(100, size=N)
y = 10 + np.sin(2 * np.pi * t / P)
dy = 0.5 + 0.5 * np.random.random(N)
y_obs = y + np.random.normal(dy)

omega_0 = 2 * np.pi / P

#######################################################################
# Generate the plot with and without the original typo

for typo in [True, False]:
    #------------------------------------------------------------
    # Compute the Lomb-Scargle Periodogram
    sig = np.array([0.1, 0.01, 0.001])
    omega = np.linspace(17, 22, 1000)

    # Notice the typo: we used y rather than y_obs
    if typo is True:
        P_S = lomb_scargle(t, y, dy, omega, generalized=False)
        P_G = lomb_scargle(t, y, dy, omega, generalized=True)
    else:
        P_S = lomb_scargle(t, y_obs, dy, omega, generalized=False)
        P_G = lomb_scargle(t, y_obs, dy, omega, generalized=True)

    #------------------------------------------------------------
    # Get significance via bootstrap
    D = lomb_scargle_bootstrap(t, y_obs, dy, omega, generalized=True,
                               N_bootstraps=1000, random_state=0)
    sig1, sig5 = np.percentile(D, [99, 95])

In [7]:

#Figure 10.4
# Author: Jake VanderPlas
# License: BSD
#   The figure produced by this code is published in the textbook
#   "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
#   For more information, see http://astroML.github.com
#   To report a bug or issue, use the following forum:
#    https://groups.google.com/forum/#!forum/astroml-general

#----------------------------------------------------------------------
# This function adjusts matplotlib settings for a uniform feel in the textbook.
# Note that with usetex=True, fonts are rendered with LaTeX.  This may
# result in an error if LaTeX is not installed on your system.  In that case,
# you can set usetex to False.
def plot1004():

    #from astroML.plotting import setup_text_plots
    #setup_text_plots(fontsize=8, usetex=True)

    #------------------------------------------------------------
    # Generate the data
    Nbins = 2 ** 15
    Nobs = 40
    f = lambda t: np.sin(np.pi * t / 3)

    t = np.linspace(-100, 200, Nbins)
    dt = t[1] - t[0]
    y = f(t)

    # select observations
    np.random.seed(42)
    t_obs = 100 * np.random.random(40)

    D = abs(t_obs[:, np.newaxis] - t)
    i = np.argmin(D, 1)

    t_obs = t[i]
    y_obs = y[i]
    window = np.zeros(Nbins)
    window[i] = 1

    #------------------------------------------------------------
    # Compute PSDs
    Nfreq = Nbins / 2

    dt = t[1] - t[0]
    df = 1. / (Nbins * dt)
    f = df * np.arange(Nfreq)

    PSD_window = abs(np.fft.fft(window)[:Nfreq]) ** 2
    PSD_y = abs(np.fft.fft(y)[:Nfreq]) ** 2
    PSD_obs = abs(np.fft.fft(y * window)[:Nfreq]) ** 2

    # normalize the true PSD so it can be shown in the plot:
    # in theory it's a delta function, so normalization is
    # arbitrary

    # scale PSDs for plotting
    PSD_window /= 500
    PSD_y /= PSD_y.max()
    PSD_obs /= 500

    #------------------------------------------------------------
    # Prepare the figures
    fig = plt.figure()
    fig.subplots_adjust(bottom=0.15, hspace=0.2, wspace=0.25,
                        left=0.12, right=0.95)

    # First panel: data vs time
    ax = fig.add_subplot(221)
    ax.plot(t, y, '-', c='gray')
    ax.plot(t_obs, y_obs, '.k', ms=4)
    ax.text(0.95, 0.93, "Data", ha='right', va='top', transform=ax.transAxes)
    ax.set_ylabel('$y(t)$')
    ax.set_xlim(0, 100)
    ax.set_ylim(-1.5, 1.8)

    # Second panel: PSD of data
    ax = fig.add_subplot(222)
    ax.fill(f, PSD_y, fc='gray', ec='gray')
    ax.plot(f, PSD_obs, '-', c='black')
    ax.text(0.95, 0.93, "Data PSD", ha='right', va='top', transform=ax.transAxes)
    ax.set_ylabel('$P(f)$')
    ax.set_xlim(0, 1.0)
    ax.set_ylim(-0.1, 1.1)

    # Third panel: window vs time
    ax = fig.add_subplot(223)
    ax.plot(t, window, '-', c='black')
    ax.text(0.95, 0.93, "Window", ha='right', va='top', transform=ax.transAxes)
    ax.set_xlabel('$t$')
    ax.set_ylabel('$y(t)$')
    ax.set_xlim(0, 100)
    ax.set_ylim(-0.2, 1.5)

    # Fourth panel: PSD of window
    ax = fig.add_subplot(224)
    ax.plot(f, PSD_window, '-', c='black')
    ax.text(0.95, 0.93, "Window PSD", ha='right', va='top', transform=ax.transAxes)
    ax.set_xlabel('$f$')
    ax.set_ylabel('$P(f)$')
    ax.set_xlim(0, 1.0)
    ax.set_ylim(-0.1, 1.1)

In [8]:

def plot1006():
    # Author: Jake VanderPlas
    # License: BSD
    #   The figure produced by this code is published in the textbook
    #   "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
    #   For more information, see http://astroML.github.com
    #   To report a bug or issue, use the following forum:
    #    https://groups.google.com/forum/#!forum/astroml-general
    from scipy import fftpack
    from matplotlib import mlab

    from astroML.datasets import fetch_LIGO_large

    #------------------------------------------------------------
    # Fetch the LIGO hanford data
    data, dt = fetch_LIGO_large()

    # subset of the data to plot
    t0 = 646
    T = 2
    tplot = dt * np.arange(T * 4096)
    dplot = data[4096 * t0: 4096 * (t0 + T)]

    tplot = tplot[::10]
    dplot = dplot[::10]

    fmin = 40
    fmax = 2060

    #------------------------------------------------------------
    # compute PSD using simple FFT
    N = len(data)
    df = 1. / (N * dt)
    PSD = abs(dt * fftpack.fft(data)[:N / 2]) ** 2
    f = df * np.arange(N / 2)

    cutoff = ((f >= fmin) & (f <= fmax))
    f = f[cutoff]
    PSD = PSD[cutoff]
    f = f[::100]
    PSD = PSD[::100]

    #------------------------------------------------------------
    # compute PSD using Welch's method -- no window function
    PSDW1, fW1 = mlab.psd(data, NFFT=4096, Fs=1. / dt,
                          window=mlab.window_none, noverlap=2048)

    dfW1 = fW1[1] - fW1[0]

    cutoff = (fW1 >= fmin) & (fW1 <= fmax)
    fW1 = fW1[cutoff]
    PSDW1 = PSDW1[cutoff]

    #------------------------------------------------------------
    # compute PSD using Welch's method -- hanning window function
    PSDW2, fW2 = mlab.psd(data, NFFT=4096, Fs=1. / dt,
                          window=mlab.window_hanning, noverlap=2048)

    dfW2 = fW2[1] - fW2[0]

    cutoff = (fW2 >= fmin) & (fW2 <= fmax)
    fW2 = fW2[cutoff]
    PSDW2 = PSDW2[cutoff]

    #------------------------------------------------------------
    # Plot the data
    fig = plt.figure()
    fig.subplots_adjust(bottom=0.1, top=0.9, hspace=0.3)

    # top panel: time series
    ax = fig.add_subplot(311)
    ax.plot(tplot, dplot, '-k')
    ax.set_xlabel('time (s)')
    ax.set_ylabel('$h(t)$')

    ax.set_ylim(-1.2E-18, 1.2E-18)

    # middle panel: non-windowed filter
    ax = fig.add_subplot(312)
    ax.loglog(f, PSD, '-', c='#AAAAAA')
    ax.loglog(fW1, PSDW1, '-k')
    ax.text(0.98, 0.95, "Top-hat window",
            ha='right', va='top', transform=ax.transAxes)

    ax.set_xlabel('frequency (Hz)')
    ax.set_ylabel(r'$PSD(f)$')

    ax.set_xlim(40, 2060)
    ax.set_ylim(1E-46, 1E-36)
    ax.yaxis.set_major_locator(plt.LogLocator(base=100))

    # bottom panel: hanning window
    ax = fig.add_subplot(313)
    ax.loglog(f, PSD, '-', c='#AAAAAA')
    ax.loglog(fW2, PSDW2, '-k')
    ax.text(0.98, 0.95, "Hanning (cosine) window",
            ha='right', va='top', transform=ax.transAxes)

    ax.set_xlabel('frequency (Hz)')
    ax.set_ylabel(r'$PSD(f)$')

    ax.set_xlim(40, 2060)
    ax.set_ylim(1E-46, 1E-36)
    ax.yaxis.set_major_locator(plt.LogLocator(base=100))

In [9]:

#Figure 10.7
# Author: Jake VanderPlas
# License: BSD
#   The figure produced by this code is published in the textbook
#   "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
#   For more information, see http://astroML.github.com
#   To report a bug or issue, use the following forum:
#    https://groups.google.com/forum/#!forum/astroml-general

def plot1007():
    from astroML.fourier import\
        FT_continuous, IFT_continuous, sinegauss, sinegauss_FT, wavelet_PSD

    #------------------------------------------------------------
    # Sample the function: localized noise
    np.random.seed(0)

    N = 1024
    t = np.linspace(-5, 5, N)
    x = np.ones(len(t))

    h = np.random.normal(0, 1, len(t))
    h *= np.exp(-0.5 * (t / 0.5) ** 2)

    #------------------------------------------------------------
    # Compute an example wavelet
    W = sinegauss(t, 0, 1.5, Q=1.0)

    #------------------------------------------------------------
    # Compute the wavelet PSD
    f0 = np.linspace(0.5, 7.5, 100)
    wPSD = wavelet_PSD(t, h, f0, Q=1.0)

    #------------------------------------------------------------
    # Plot the results
    fig = plt.figure()
    fig.subplots_adjust(hspace=0.05, left=0.12, right=0.95, bottom=0.08, top=0.95)

    # First panel: the signal
    ax = fig.add_subplot(311)
    ax.plot(t, h, '-k', lw=1)
    ax.text(0.02, 0.95, ("Input Signal:\n"
                         "Localized Gaussian noise"),
            ha='left', va='top', transform=ax.transAxes)

    ax.set_xlim(-4, 4)
    ax.set_ylim(-2.9, 2.9)
    ax.xaxis.set_major_formatter(plt.NullFormatter())
    ax.set_ylabel('$h(t)$')

    # Second panel: an example wavelet
    ax = fig.add_subplot(312)
    ax.plot(t, W.real, '-k', label='real part', lw=1)
    ax.plot(t, W.imag, '--k', label='imag part', lw=1)

    ax.text(0.02, 0.95, ("Example Wavelet\n"
                         "$t_0 = 0$, $f_0=1.5$, $Q=1.0$"),
            ha='left', va='top', transform=ax.transAxes)
    ax.text(0.98, 0.05,
            (r"$w(t; t_0, f_0, Q) = e^{-[f_0 (t - t_0) / Q]^2}"
             "e^{2 \pi i f_0 (t - t_0)}$"),
            ha='right', va='bottom', transform=ax.transAxes)

    ax.legend(loc=1)

    ax.set_xlim(-4, 4)
    ax.set_ylim(-1.4, 1.4)
    ax.set_ylabel('$w(t; t_0, f_0, Q)$')
    ax.xaxis.set_major_formatter(plt.NullFormatter())

    # Third panel: the spectrogram
    ax = plt.subplot(313)
    ax.imshow(wPSD, origin='lower', aspect='auto', cmap=plt.cm.jet,
              extent=[t[0], t[-1], f0[0], f0[-1]])

    ax.text(0.02, 0.95, ("Wavelet PSD"), color='w',
            ha='left', va='top', transform=ax.transAxes)

    ax.set_xlim(-4, 4)
    ax.set_ylim(0.5, 7.5)

    ax.set_xlabel('$t$')
    ax.set_ylabel('$f_0$')

In [10]:

#Figure 10.8
# Author: Jake VanderPlas
# License: BSD
#   The figure produced by this code is published in the textbook
#   "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
#   For more information, see http://astroML.github.com
#   To report a bug or issue, use the following forum:
#    https://groups.google.com/forum/#!forum/astroml-general
def plot1008():
    from astroML.fourier import FT_continuous, IFT_continuous

    def wavelet(t, t0, f0, Q):
        return (np.exp(-(f0 / Q * (t - t0)) ** 2)
                * np.exp(2j * np.pi * f0 * (t - t0)))


    def wavelet_FT(f, t0, f0, Q):
        # this is its fourier transform using
        # H(f) = integral[ h(t) exp(-2pi i f t) dt]
        return (np.sqrt(np.pi) * Q / f0
                * np.exp(-2j * np.pi * f * t0)
                * np.exp(-(np.pi * (f - f0) * Q / f0) ** 2))


    def check_funcs(t0=1, f0=2, Q=3):
        t = np.linspace(-5, 5, 10000)
        h = wavelet(t, t0, f0, Q)

        f, H = FT_continuous(t, h)
        assert np.allclose(H, wavelet_FT(f, t0, f0, Q))

    #------------------------------------------------------------
    # Create the simulated dataset
    np.random.seed(5)

    t = np.linspace(-40, 40, 2001)[:-1]
    h = np.exp(-0.5 * ((t - 20.) / 1.0) ** 2)
    hN = h + np.random.normal(0, 0.5, size=h.shape)

    #------------------------------------------------------------
    # Compute the convolution via the continuous Fourier transform
    # This is more exact than using the discrete transform, because
    # we have an analytic expression for the FT of the wavelet.
    Q = 0.3
    f0 = 2 ** np.linspace(-3, -1, 100)

    f, H = FT_continuous(t, hN)
    W = np.conj(wavelet_FT(f, 0, f0[:, None], Q))
    t, HW = IFT_continuous(f, H * W)

    #------------------------------------------------------------
    # Plot the results
    fig = plt.figure()
    fig.subplots_adjust(hspace=0.05, left=0.12, right=0.95, bottom=0.08, top=0.95)

    # First panel: the signal
    ax = fig.add_subplot(311)
    ax.plot(t, hN, '-k', lw=1)

    ax.text(0.02, 0.95, ("Input Signal:\n"
                         "Localized spike plus noise"),
            ha='left', va='top', transform=ax.transAxes)

    ax.set_xlim(-40, 40)
    ax.set_ylim(-1.2, 2.2)
    ax.xaxis.set_major_formatter(plt.NullFormatter())
    ax.set_ylabel('$h(t)$')

    # Second panel: the wavelet
    ax = fig.add_subplot(312)
    W = wavelet(t, 0, 0.125, Q)
    ax.plot(t, W.real, '-k', label='real part', lw=1)
    ax.plot(t, W.imag, '--k', label='imag part', lw=1)

    ax.legend(loc=1)
    ax.text(0.02, 0.95, ("Example Wavelet\n"
                         "$t_0 = 0$, $f_0=1/8$, $Q=0.3$"),
            ha='left', va='top', transform=ax.transAxes)
    ax.text(0.98, 0.05,
            (r"$w(t; t_0, f_0, Q) = e^{-[f_0 (t - t_0) / Q]^2}"
             "e^{2 \pi i f_0 (t - t_0)}$"),
            ha='right', va='bottom', transform=ax.transAxes)

    ax.set_xlim(-40, 40)
    ax.set_ylim(-1.4, 1.4)
    ax.set_ylabel('$w(t; t_0, f_0, Q)$')
    ax.xaxis.set_major_formatter(plt.NullFormatter())

    # Third panel: the spectrogram
    ax = fig.add_subplot(313)
    ax.imshow(abs(HW) ** 2, origin='lower', aspect='auto', cmap=plt.cm.binary,
              extent=[t[0], t[-1], np.log2(f0)[0], np.log2(f0)[-1]])
    ax.set_xlim(-40, 40)

    ax.text(0.02, 0.95, ("Wavelet PSD"), color='w',
            ha='left', va='top', transform=ax.transAxes)

    ax.set_ylim(np.log2(f0)[0], np.log2(f0)[-1])
    ax.set_xlabel('$t$')
    ax.set_ylabel('$f_0$')

    ax.yaxis.set_major_locator(plt.MultipleLocator(1))
    ax.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, *args: ("1/%i"
                                                                     % (2 ** -x))))

In [11]:

#Figure 10.10
# Author: Jake VanderPlas
# License: BSD
#   The figure produced by this code is published in the textbook
#   "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
#   For more information, see http://astroML.github.com
#   To report a bug or issue, use the following forum:
#    https://groups.google.com/forum/#!forum/astroml-general
def plot1010():
    from scipy import optimize, fftpack
    from astroML.filters import savitzky_golay, wiener_filter

    #------------------------------------------------------------
    # Create the noisy data
    np.random.seed(5)
    N = 2000
    dt = 0.05

    t = dt * np.arange(N)
    h = np.exp(-0.5 * ((t - 20.) / 1.0) ** 2)
    hN = h + np.random.normal(0, 0.5, size=h.shape)

    Df = 1. / N / dt
    f = fftpack.ifftshift(Df * (np.arange(N) - N / 2))
    HN = fftpack.fft(hN)

    #------------------------------------------------------------
    # Set up the Wiener filter:
    #  fit a model to the PSD consisting of the sum of a
    #  gaussian and white noise
    h_smooth, PSD, P_S, P_N, Phi = wiener_filter(t, hN, return_PSDs=True)

    #------------------------------------------------------------
    # Use the Savitzky-Golay filter to filter the values
    h_sg = savitzky_golay(hN, window_size=201, order=4, use_fft=False)

    #------------------------------------------------------------
    # Plot the results
    N = len(t)
    Df = 1. / N / (t[1] - t[0])
    f = fftpack.ifftshift(Df * (np.arange(N) - N / 2))
    HN = fftpack.fft(hN)

    fig = plt.figure()
    fig.subplots_adjust(wspace=0.05, hspace=0.25,
                        bottom=0.1, top=0.95,
                        left=0.12, right=0.95)

    # First plot: noisy signal
    ax = fig.add_subplot(221)
    ax.plot(t, hN, '-', c='gray')
    ax.plot(t, np.zeros_like(t), ':k')
    ax.text(0.98, 0.95, "Input Signal", ha='right', va='top',
            transform=ax.transAxes, bbox=dict(fc='w', ec='none'))

    ax.set_xlim(0, 90)
    ax.set_ylim(-0.5, 1.5)

    ax.xaxis.set_major_locator(plt.MultipleLocator(20))
    ax.set_xlabel(r'$\lambda$')
    ax.set_ylabel('flux')

    # Second plot: filtered signal
    ax = plt.subplot(222)
    ax.plot(t, np.zeros_like(t), ':k', lw=1)
    ax.plot(t, h_smooth, '-k', lw=1.5, label='Wiener')
    ax.plot(t, h_sg, '-', c='gray', lw=1, label='Savitzky-Golay')

    ax.text(0.98, 0.95, "Filtered Signal", ha='right', va='top',
            transform=ax.transAxes)
    ax.legend(loc='upper right', bbox_to_anchor=(0.98, 0.9), frameon=False)

    ax.set_xlim(0, 90)
    ax.set_ylim(-0.5, 1.5)

    ax.yaxis.set_major_formatter(plt.NullFormatter())
    ax.xaxis.set_major_locator(plt.MultipleLocator(20))
    ax.set_xlabel(r'$\lambda$')

    # Third plot: Input PSD
    ax = fig.add_subplot(223)
    ax.scatter(f[:N / 2], PSD[:N / 2], s=9, c='k')
    ax.plot(f[:N / 2], P_S[:N / 2], '-k')
    ax.plot(f[:N / 2], P_N[:N / 2], '-k')

    ax.text(0.98, 0.95, "Input PSD", ha='right', va='top',
            transform=ax.transAxes)

    ax.set_ylim(-100, 3500)
    ax.set_xlim(0, 0.9)

    ax.yaxis.set_major_locator(plt.MultipleLocator(1000))
    ax.xaxis.set_major_locator(plt.MultipleLocator(0.2))
    ax.set_xlabel('$f$')
    ax.set_ylabel('$PSD(f)$')

    # Fourth plot: Filtered PSD
    ax = fig.add_subplot(224)
    filtered_PSD = (Phi * abs(HN)) ** 2
    ax.scatter(f[:N / 2], filtered_PSD[:N / 2], s=9, c='k')

    ax.text(0.98, 0.95, "Filtered PSD", ha='right', va='top',
            transform=ax.transAxes)

    ax.set_ylim(-100, 3500)
    ax.set_xlim(0, 0.9)

    ax.yaxis.set_major_locator(plt.MultipleLocator(1000))
    ax.yaxis.set_major_formatter(plt.NullFormatter())
    ax.xaxis.set_major_locator(plt.MultipleLocator(0.2))
    ax.set_xlabel('$f$')

In [12]:

# Author: Jake VanderPlas
# License: BSD
#   The figure produced by this code is published in the textbook
#   "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
#   For more information, see http://astroML.github.com
#   To report a bug or issue, use the following forum:
#    https://groups.google.com/forum/#!forum/astroml-general
def plot1015():
    from astroML.time_series import\
        lomb_scargle, lomb_scargle_BIC, lomb_scargle_bootstrap

    #------------------------------------------------------------
    # Generate Data
    np.random.seed(0)
    N = 30
    P = 0.3

    t = np.random.randint(100, size=N) + 0.3 + 0.4 * np.random.random(N)
    y = 10 + np.sin(2 * np.pi * t / P)
    dy = 0.5 + 0.5 * np.random.random(N)
    y_obs = np.random.normal(y, dy)

    #------------------------------------------------------------
    # Compute periodogram
    period = 10 ** np.linspace(-1, 0, 10000)
    omega = 2 * np.pi / period
    PS = lomb_scargle(t, y_obs, dy, omega, generalized=True)

    #------------------------------------------------------------
    # Get significance via bootstrap
    D = lomb_scargle_bootstrap(t, y_obs, dy, omega, generalized=True,
                               N_bootstraps=1000, random_state=0)
    sig1, sig5 = np.percentile(D, [99, 95])

    #------------------------------------------------------------
    # Plot the results
    fig = plt.figure()
    fig.subplots_adjust(left=0.1, right=0.9, hspace=0.25)

    # First panel: the data
    ax = fig.add_subplot(211)
    ax.errorbar(t, y_obs, dy, fmt='.k', lw=1, ecolor='gray')
    ax.set_xlabel('time (days)')
    ax.set_ylabel('flux')
    ax.set_xlim(-5, 105)

    # Second panel: the periodogram & significance levels
    ax1 = fig.add_subplot(212, xscale='log')
    ax1.plot(period, PS, '-', c='black', lw=1, zorder=1)
    ax1.plot([period[0], period[-1]], [sig1, sig1], ':', c='black')
    ax1.plot([period[0], period[-1]], [sig5, sig5], ':', c='black')

    ax1.annotate("", (0.3, 0.65), (0.3, 0.85), ha='center',
                 arrowprops=dict(arrowstyle='->'))

    ax1.set_xlim(period[0], period[-1])
    ax1.set_ylim(-0.05, 0.85)

    ax1.set_xlabel(r'period (days)')
    ax1.set_ylabel('power')

    # Twin axis: label BIC on the right side
    ax2 = ax1.twinx()
    ax2.set_ylim(tuple(lomb_scargle_BIC(ax1.get_ylim(), y_obs, dy)))
    ax2.set_ylabel(r'$\Delta BIC$')

    ax1.xaxis.set_major_formatter(plt.FormatStrFormatter('%.1f'))
    ax1.xaxis.set_minor_formatter(plt.FormatStrFormatter('%.1f'))
    ax1.xaxis.set_major_locator(plt.LogLocator(10))
    ax1.xaxis.set_major_formatter(plt.FormatStrFormatter('%.3g'))

In [13]:

def plot1015():
    # Author: Jake VanderPlas
    # License: BSD
    #   The figure produced by this code is published in the textbook
    #   "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
    #   For more information, see http://astroML.github.com
    #   To report a bug or issue, use the following forum:
    #    https://groups.google.com/forum/#!forum/astroml-general
    from astroML.time_series import \
        lomb_scargle, lomb_scargle_BIC, lomb_scargle_bootstrap

    #------------------------------------------------------------
    # Generate data where y is positive
    np.random.seed(0)
    N = 30
    P = 0.3

    t = P / 2 * np.random.random(N) + P * np.random.randint(100, size=N)
    y = 10 + np.sin(2 * np.pi * t / P)
    dy = 0.5 + 0.5 * np.random.random(N)
    y_obs = y + np.random.normal(dy)

    omega_0 = 2 * np.pi / P

    #######################################################################
    # Redo the plot without the typo
    # We need a larger data range to actually get significant power
    # with actual noisy data

    #------------------------------------------------------------
    # Generate data where y is positive
    np.random.seed(0)
    N = 300
    P = 0.3

    t = P / 2 * np.random.random(N) + P * np.random.randint(1000, size=N)
    y = 10 + np.sin(2 * np.pi * t / P)
    dy = 0.1 + 0.1 * np.random.random(N)
    y_obs = y + np.random.normal(dy)

    omega_0 = 2 * np.pi / P


    #------------------------------------------------------------
    # Compute the Lomb-Scargle Periodogram
    sig = np.array([0.1, 0.01, 0.001])
    omega = np.linspace(20.5, 21.1, 1000)

    P_S = lomb_scargle(t, y_obs, dy, omega, generalized=False)
    P_G = lomb_scargle(t, y_obs, dy, omega, generalized=True)

    #------------------------------------------------------------
    # Get significance via bootstrap
    D = lomb_scargle_bootstrap(t, y_obs, dy, omega, generalized=True,
                               N_bootstraps=1000, random_state=0)
    sig1, sig5 = np.percentile(D, [99, 95])

    #------------------------------------------------------------
    # Plot the results
    fig = plt.figure()

    # First panel: input data
    ax = fig.add_subplot(211)
    ax.errorbar(t, y_obs, dy, fmt='.k', lw=1, ecolor='gray')
    ax.plot([-20, 320], [10, 10], ':k', lw=1)

    ax.set_xlim(-20, 320)
    ax.set_xlabel('$t$')
    ax.set_ylabel('$y(t)$')

    # Second panel: periodogram
    ax = fig.add_subplot(212)
    ax.plot(omega, P_S, '--k', lw=1, label='standard')
    ax.plot(omega, P_S, '-', c='gray', lw=1)
    ax.plot(omega, P_G, '-', color='#BA4C37', lw=1, label='generalized')
    ax.legend(loc=2)

    # plot the significance lines.
    xlim = (omega[0], omega[-1])
    ax.plot(xlim, [sig1, sig1], ':', c='black')
    ax.plot(xlim, [sig5, sig5], ':', c='black')

    # label BIC on the right side
    ax2 = ax.twinx()
    ax2.set_ylim(tuple(lomb_scargle_BIC(ax.get_ylim(), y_obs, dy)))
    ax2.set_ylabel(r'$\Delta BIC$')

    ax.set_xlabel('$\omega$')
    ax.set_ylabel(r'$P_{\rm LS}(\omega)$')
    ax.set_xlim(xlim)
    ax.set_ylim(0, 0.12)

In [14]:

def plot1017():
    # Author: Jake VanderPlas
    # License: BSD
    #   The figure produced by this code is published in the textbook
    #   "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
    #   For more information, see http://astroML.github.com
    #   To report a bug or issue, use the following forum:
    #    https://groups.google.com/forum/#!forum/astroml-general

    from astroML.decorators import pickle_results
    from astroML.time_series import search_frequencies, lomb_scargle, MultiTermFit
    from astroML.datasets import fetch_LINEAR_sample

    #------------------------------------------------------------
    # Load the dataset
    data = fetch_LINEAR_sample()
    ids = [14752041, 1009459, 10022663, 10025796, 11375941, 18525697]


    #------------------------------------------------------------
    # Compute the best frequencies
    @pickle_results('LINEAR_LS.pkl')
    def compute_best_frequencies(ids, n_eval=10000, n_retry=5, generalized=True):
        results = {}
        for i in ids:
            t, y, dy = data[i].T
            print " - computing power for %i (%i points)" % (i, len(t))
            kwargs = dict(generalized=generalized)
            omega, power = search_frequencies(t, y, dy, n_eval=n_eval,
                                              n_retry=n_retry,
                                              LS_kwargs=kwargs)
            results[i] = [omega, power]

        return results

    results = compute_best_frequencies(ids, n_eval=10000, n_retry=5)

    #------------------------------------------------------------
    # Plot the phased light-curves
    fig = plt.figure()
    fig.subplots_adjust(hspace=0.1, bottom=0.06, top=0.94, left=0.12, right=0.94)

    for i in range(6):
        # get the data and best-fit angular frequency
        t, y, dy = data[ids[i]].T
        omega, power = results[ids[i]]
        omega_best = omega[np.argmax(power)]
        print " - omega_0 = %.10g" % omega_best

        # do a fit to the first 4 Fourier components
        mtf = MultiTermFit(omega_best, 4)
        mtf.fit(t, y, dy)
        phase_fit, y_fit, phased_t = mtf.predict(1000, return_phased_times=True)

        # plot the phased data and best-fit curves
        ax = fig.add_subplot(321 + i)
        ax.errorbar(phased_t, y, dy, fmt='.k', ecolor='gray',
                    lw=1, ms=4, capsize=1.5)
        ax.plot(phase_fit, y_fit, '-b', lw=2)

        ax.set_xlim(0, 1)
        ax.set_ylim(plt.ylim()[::-1])
        ax.yaxis.set_major_locator(plt.MaxNLocator(4))

        ax.text(0.03, 0.04, "ID = %i" % ids[i], ha='left', va='bottom',
                transform=ax.transAxes)
        ax.text(0.03, 0.96, "P = %.2f hr" % (2 * np.pi / omega_best * 24.),
                ha='left', va='top',
                transform=ax.transAxes)

        ylim = ax.get_ylim()
        ax.set_ylim(ylim[0], ylim[0] + 1.1 * (ylim[1] - ylim[0]))

        if i < 4:
            ax.xaxis.set_major_formatter(plt.NullFormatter())
        if i % 2 == 0:
            ax.set_ylabel('mag')

        if i in (4, 5):
            ax.set_xlabel('phase')

In [15]:

def plot1018():
    # Author: Jake VanderPlas
    # License: BSD
    #   The figure produced by this code is published in the textbook
    #   "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
    #   For more information, see http://astroML.github.com
    #   To report a bug or issue, use the following forum:
    #    https://groups.google.com/forum/#!forum/astroml-general

    from astroML.time_series import multiterm_periodogram, MultiTermFit
    from astroML.datasets import fetch_LINEAR_sample

    #------------------------------------------------------------
    # Get data
    data = fetch_LINEAR_sample()
    t, y, dy = data[14752041].T

    #------------------------------------------------------------
    # Do a single-term and multi-term fit around the peak
    omega0 = 17.217
    nterms_fit = 6

    # hack to get better phases: this doesn't change results,
    #  except for how the phase plots are displayed
    t -= 0.4 * np.pi / omega0

    width = 0.03
    omega = np.linspace(omega0 - width - 0.01, omega0 + width - 0.01, 1000)

    #------------------------------------------------------------
    # Compute periodograms and best-fit solutions
    #  factor gives the factor that we're dividing the fundamental frequency by
    factors = [1, 2]
    nterms = [1, 6]

    # Compute PSDs for factors & nterms
    PSDs = dict()
    for f in factors:
        for n in nterms:
            PSDs[(f, n)] = multiterm_periodogram(t, y, dy, omega / f, n)

    # Compute the best-fit omega from the 6-term fit
    omega_best = dict()
    for f in factors:
        omegaf = omega / f
        PSDf = PSDs[(f, 6)]
        omega_best[f] = omegaf[np.argmax(PSDf)]

    # Compute the best-fit solution based on the fundamental frequency
    best_fit = dict()
    for f in factors:
        for n in nterms:
            mtf = MultiTermFit(omega_best[f], n)
            mtf.fit(t, y, dy)
            phase_best, y_best = mtf.predict(1000, adjust_offset=False)

            best_fit[(f, n)] = (phase_best, y_best)

    #------------------------------------------------------------
    # Plot the results
    fig = plt.figure()
    fig.subplots_adjust(left=0.1, right=0.95, wspace=0.25,
                        bottom=0.12, top=0.95, hspace=0.2)

    for i, f in enumerate(factors):
        P_best = 2 * np.pi / omega_best[f]
        phase_best = (t / P_best) % 1

        # first column: plot the PSD
        ax1 = fig.add_subplot(221 + 2 * i)
        ax1.plot(omega / f, PSDs[(f, 6)], '-', c='#46959E', label='6 terms', lw=2)
        ax1.plot(omega / f, PSDs[(f, 1)], '-', c='#BA4C37', label='1 term')
        ax1.grid(color='gray')
        ax1.legend(loc=2)

        ax1.axis('tight')
        ax1.set_ylim(-0.05, 1.001)
        ax1.xaxis.set_major_locator(plt.MultipleLocator(0.01))
        ax1.xaxis.set_major_formatter(plt.FormatStrFormatter('%.2f'))

        # second column: plot the phased data & fit
        ax2 = fig.add_subplot(222 + 2 * i)
        ax2.errorbar(phase_best, y, dy, fmt='.k', ms=4, ecolor='gray', lw=1,
                     capsize=1.5)
        ax2.plot(best_fit[(f, 1)][0], best_fit[(f, 1)][1], '-', c='#BA4C37')
        ax2.plot(best_fit[(f, 6)][0], best_fit[(f, 6)][1], '-', c='#46959E', lw=2)

        ax2.text(0.02, 0.02, (r"$\omega_0 = %.2f$" % omega_best[f] + "\n"
                              + r"$P_0 = %.2f\ {\rm hours}$" % (24 * P_best)),
                 ha='left', va='bottom', transform=ax2.transAxes)

        ax2.grid(color='gray')
        ax2.set_xlim(0, 1)
        ax2.set_ylim(plt.ylim()[::-1])
        ax2.yaxis.set_major_locator(plt.MultipleLocator(0.4))

        # label both axes
        ax1.set_ylabel(r'$P_{\rm LS}(\omega)$')
        ax2.set_ylabel(r'${\rm mag}$')
        if i == 1:
            ax1.set_xlabel(r'$\omega$')
            ax2.set_xlabel(r'${\rm phase}$')

In [16]:

def plot1019():
    # Author: Jake VanderPlas
    # License: BSD
    #   The figure produced by this code is published in the textbook
    #   "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
    #   For more information, see http://astroML.github.com
    #   To report a bug or issue, use the following forum:
    #    https://groups.google.com/forum/#!forum/astroml-general
    from astroML.time_series import multiterm_periodogram, lomb_scargle_BIC
    from astroML.datasets import fetch_LINEAR_sample

    #------------------------------------------------------------
    # Fetch the data
    data = fetch_LINEAR_sample()

    t, y, dy = data[14752041].T
    omega0 = 17.217

    # focus only on the region with the peak
    omega1 = np.linspace(17.213, 17.220, 100)
    omega2 = 0.5 * omega1

    #------------------------------------------------------------
    # Compute the delta BIC
    terms = np.arange(1, 21)
    BIC_max = np.zeros((2, len(terms)))

    for i, omega in enumerate([omega1, omega2]):
        for j in range(len(terms)):
            P = multiterm_periodogram(t, y, dy, omega, terms[j])
            BIC = lomb_scargle_BIC(P, y, dy, n_harmonics=terms[j])
            BIC_max[i, j] = BIC.max()

    #----------------------------------------------------------------------
    # Plot the results
    fig = plt.figure()

    ax = [fig.add_axes((0.15, 0.53, 0.8, 0.37)),
          fig.add_axes((0.15, 0.1, 0.8, 0.37))]
    ax_inset = [fig.add_axes((0.15 + 7 * 0.04, 0.55, 0.79 - 7 * 0.04, 0.17)),
                fig.add_axes((0.15 + 7 * 0.04, 0.12, 0.79 - 7 * 0.04, 0.17))]
    ylims = [(22750, 22850),
             (26675, 26775)]
    omega0 = [17.22, 8.61]

    for i in range(2):
        # Plot full panel
        ax[i].plot(terms, BIC_max[i], '-k')
        ax[i].set_xlim(0, 20)
        ax[i].set_ylim(0, 30000)
        ax[i].text(0.02, 0.95, r"$\omega_0 = %.2f$" % omega0[i],
                   ha='left', va='top', transform=ax[i].transAxes)

        ax[i].set_ylabel(r'$\Delta BIC$')
        if i == 1:
            ax[i].set_xlabel('N frequencies')
        ax[i].grid(color='gray')

        # plot inset
        ax_inset[i].plot(terms, BIC_max[i], '-k')
        ax_inset[i].xaxis.set_major_locator(plt.MultipleLocator(5))
        ax_inset[i].xaxis.set_major_formatter(plt.NullFormatter())
        ax_inset[i].yaxis.set_major_locator(plt.MultipleLocator(25))
        ax_inset[i].yaxis.set_major_formatter(plt.FormatStrFormatter('%i'))
        ax_inset[i].set_xlim(7, 19.75)
        ax_inset[i].set_ylim(ylims[i])
        ax_inset[i].set_title('zoomed view')
        ax_inset[i].grid(color='gray')

In [17]:

def plot1021():
    # Author: Jake VanderPlas
    # License: BSD
    #   The figure produced by this code is published in the textbook
    #   "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
    #   For more information, see http://astroML.github.com
    #   To report a bug or issue, use the following forum:
    #    https://groups.google.com/forum/#!forum/astroml-general
    from sklearn.mixture import GMM

    from astroML.decorators import pickle_results
    from astroML.datasets import fetch_LINEAR_geneva

    #------------------------------------------------------------
    # Get the Geneva periods data
    data = fetch_LINEAR_geneva()

    #----------------------------------------------------------------------
    # compute Gaussian Mixture models

    filetemplate = 'gmm_res_%i_%i.pkl'
    attributes = [('gi', 'logP'),
                  ('ug', 'gi', 'iK', 'JK', 'logP', 'amp', 'skew')]
    components = np.arange(1, 21)

    #------------------------------------------------------------
    # Create attribute arrays
    Xarrays = []
    for attr in attributes:
        Xarrays.append(np.vstack([data[a] for a in attr]).T)


    #------------------------------------------------------------
    # Compute the results (and save to pickle file)
    @pickle_results('LINEAR_clustering.pkl')
    def compute_GMM_results(components, attributes):
        clfs = []

        for attr, X in zip(attributes, Xarrays):
            clfs_i = []

            for comp in components:
                print "  - %i component fit" % comp
                clf = GMM(comp, covariance_type='full',
                          random_state=0, n_iter=500)
                clf.fit(X)
                clfs_i.append(clf)

                if not clf.converged_:
                    print "           NOT CONVERGED!"

            clfs.append(clfs_i)
        return clfs

    clfs = compute_GMM_results(components, attributes)

    #------------------------------------------------------------
    # Plot the results
    fig = plt.figure()
    fig.subplots_adjust(hspace=0.1, wspace=0.1)

    class_labels = []

    for i in range(2):
        # Grab the best classifier, based on the BIC
        X = Xarrays[i]
        BIC = [c.bic(X) for c in clfs[i]]
        i_best = np.argmin(BIC)

        print "number of components:", components[i_best]

        clf = clfs[i][i_best]
        n_components = clf.n_components

        # Predict the cluster labels with the classifier
        c = clf.predict(X)
        classes = np.unique(c)

        class_labels.append(c)

        # sort the cluster by normalized density of points
        counts = np.sum(c == classes[:, None], 1)
        size = np.array([np.linalg.det(C) for C in clf.covars_])
        weights = clf.weights_
        density = counts / size

        # Clusters with very few points are less interesting:
        # set their density to zero so they'll go to the end of the list
        density[counts < 5] = 0
        isort = np.argsort(density)[::-1]

        # find statistics of the top 10 clusters
        Nclusters = 6

        means = []
        stdevs = []
        counts = []

        names = [name for name in data.dtype.names[2:] if name != 'LINEARobjectID']
        labels = ['$u-g$', '$g-i$', '$i-K$', '$J-K$',
                  r'$\log(P)$', 'amplitude', 'skew',
                  'kurtosis', 'median mag', r'$N_{\rm obs}$', 'Visual Class']

        assert len(names) == len(labels)

        i_logP = names.index('logP')

        for j in range(Nclusters):
            flag = (c == isort[j])
            counts.append(np.sum(flag))
            means.append([np.mean(data[n][flag]) for n in names])
            stdevs.append([data[n][flag].std() for n in names])

        counts = np.array(counts)
        means = np.array(means)
        stdevs = np.array(stdevs)

        # define colors based on median of logP
        j_ordered = np.argsort(-means[:, i_logP])

        # tweak colors by hand
        if i == 1:
            j_ordered[3], j_ordered[2] = j_ordered[2], j_ordered[3]

        color = np.zeros(c.shape)
        for j in range(Nclusters):
            flag = (c == isort[j_ordered[j]])
            color[flag] = j + 1

        # separate into foureground and background
        back = (color == 0)
        fore = ~back

        # Plot the resulting clusters
        ax1 = fig.add_subplot(221 + 2 * i)
        ax1.scatter(data['gi'][back], data['logP'][back],
                    c='gray', edgecolors='none', s=4, linewidths=0)
        ax1.scatter(data['gi'][fore], data['logP'][fore],
                    c=color[fore], edgecolors='none',  s=4, linewidths=0)

        ax1.set_ylabel(r'$\log(P)$')

        ax2 = plt.subplot(222 + 2 * i)
        ax2.scatter(data['amp'][back], data['logP'][back],
                    c='gray', edgecolors='none', s=4, linewidths=0)
        ax2.scatter(data['amp'][fore], data['logP'][fore],
                    c=color[fore], edgecolors='none', s=4, linewidths=0)

        #------------------------------
        # set axis limits
        ax1.set_xlim(-0.6, 2.1)
        ax2.set_xlim(0.1, 1.5)
        ax1.set_ylim(-1.5, 0.5)
        ax2.set_ylim(-1.5, 0.5)

        ax2.yaxis.set_major_formatter(plt.NullFormatter())
        if i == 0:
            ax1.xaxis.set_major_formatter(plt.NullFormatter())
            ax2.xaxis.set_major_formatter(plt.NullFormatter())
        else:
            ax1.set_xlabel(r'$g-i$')
            ax2.set_xlabel(r'$A$')


        #------------------------------
        # print table of means and medians directly to LaTeX format
        #print r"\begin{tabular}{|l|lllllll|}"
        #print r"   \hline"
        #for j in range(7):
        #    print '   &', labels[j],
        #print r"\\"
        #print r"   \hline"

        #for j in range(Nclusters):
        #    print "   %i " % (j + 1),
        #    for k in range(7):
        #        print " & $%.2f \pm %.2f$ " % (means[j, k], stdevs[j, k]),
        #    print r"\\"

        #print r"\hline"
        #print r"\end{tabular}"

    #------------------------------------------------------------
    # Second figure
    fig = plt.figure()
    fig.subplots_adjust(left=0.11, right=0.95, wspace=0.3)

    attrs = ['skew', 'ug', 'iK', 'JK']
    labels = ['skew', '$u-g$', '$i-K$', '$J-K$']
    ylims = [(-1.8, 2.2), (0.6, 2.9), (0.1, 2.6), (-0.2, 1.2)]

    for i in range(4):
        ax = fig.add_subplot(221 + i)
        ax.scatter(data['gi'][back], data[attrs[i]][back],
                   c='gray', edgecolors='none', s=4, linewidths=0)
        ax.scatter(data['gi'][fore], data[attrs[i]][fore],
                   c=color[fore], edgecolors='none', s=4, linewidths=0)
        ax.set_xlabel('$g-i$')
        ax.set_ylabel(labels[i])

        ax.set_xlim(-0.6, 2.1)
        ax.set_ylim(ylims[i])

    #------------------------------------------------------------
    # Save the results
    #
    # run the script as
    #
    #   >$ python fig_LINEAR_clustering.py --save
    #
    # to output the data file showing the cluster labels of each point
    import sys
    if len(sys.argv) > 1 and sys.argv[1] == '--save':
        filename = 'cluster_labels.dat'

        print "Saving cluster labels to %s" % filename

        from astroML.datasets.LINEAR_sample import ARCHIVE_DTYPE
        new_data = np.zeros(len(data),
                            dtype=(ARCHIVE_DTYPE + [('2D_cluster_ID', 'i4'),
                                                    ('7D_cluster_ID', 'i4')]))

        for name in data.dtype.names:
            new_data[name] = data[name]
        new_data['2D_cluster_ID'] = class_labels[0]
        new_data['7D_cluster_ID'] = class_labels[1]

        fmt = ('%.6f   %.6f   %.3f   %.3f   %.3f   %.3f   %.7f   %.3f   %.3f   '
               '%.3f    %.2f     %i     %i      %s          %i              %i\n')


        F = open(filename, 'w')
        F.write('#    ra           dec       ug      gi      iK      JK     '
                'logP       Ampl    skew      kurt    magMed    nObs  LCtype  '
                'LINEARobjectID  2D_cluster_ID   7D_cluster_ID\n')
        for line in new_data:
            F.write(fmt % tuple(line[col] for col in line.dtype.names))
        F.close()

In [18]:

def plot1025():
    # Author: Jake VanderPlas
    # License: BSD
    #   The figure produced by this code is published in the textbook
    #   "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
    #   For more information, see http://astroML.github.com
    #   To report a bug or issue, use the following forum:
    #    https://groups.google.com/forum/#!forum/astroml-general

    # Hack to fix import issue in older versions of pymc
    import scipy
    import scipy.misc
    scipy.derivative = scipy.misc.derivative
    import pymc

    from astroML.plotting.mcmc import plot_mcmc
    from astroML.decorators import pickle_results

    #----------------------------------------------------------------------
    # Set up toy dataset
    def burst(t, b0, A, alpha, T):
        """Burst model"""
        y = np.empty(t.shape)
        y.fill(b0)
        mask = (t >= T)
        y[mask] += A * np.exp(-alpha * (t[mask] - T))
        return y

    np.random.seed(0)

    N = 100
    b0_true = 10
    A_true = 5
    alpha_true = 0.1
    T_true = 50
    sigma = 1.0

    t = 100 * np.random.random(N)

    y_true = burst(t, b0_true, A_true, alpha_true, T_true)
    y_obs = np.random.normal(y_true, sigma)


    #----------------------------------------------------------------------
    # Set up MCMC sampling
    b0 = pymc.Uniform('b0', 0, 50, value=50 * np.random.random())
    A = pymc.Uniform('A', 0, 50, value=50 * np.random.random())
    T = pymc.Uniform('T', 0, 100, value=100 * np.random.random())
    log_alpha = pymc.Uniform('log_alpha', -10, 10, value=0)


    # uniform prior on log(alpha)
    @pymc.deterministic
    def alpha(log_alpha=log_alpha):
        return np.exp(log_alpha)


    @pymc.deterministic
    def y_model(t=t, b0=b0, A=A, alpha=alpha, T=T):
        return burst(t, b0, A, alpha, T)

    y = pymc.Normal('y', mu=y_model, tau=sigma ** -2, observed=True, value=y_obs)

    model = dict(b0=b0, A=A, T=T, log_alpha=log_alpha,
                 alpha=alpha, y_model=y_model, y=y)


    #----------------------------------------------------------------------
    # Run the MCMC sampling
    @pickle_results('matchedfilt_burst.pkl')
    def compute_MCMC_results(niter=25000, burn=4000):
        S = pymc.MCMC(model)
        S.sample(iter=niter, burn=burn)
        traces = [S.trace(s)[:] for s in ['b0', 'A', 'T', 'alpha']]

        M = pymc.MAP(model)
        M.fit()
        fit_vals = (M.b0.value, M.A.value, M.alpha.value, M.T.value)

        return traces, fit_vals

    traces, fit_vals = compute_MCMC_results()

    labels = ['$b_0$', '$A$', '$T$', r'$\alpha$']
    limits = [(9.2, 11.2), (2, 12), (45, 55), (0.0, 0.25)]
    true = [b0_true, A_true, T_true, alpha_true]

    #------------------------------------------------------------
    # Plot the results
    fig = plt.figure()
    fig.subplots_adjust(bottom=0.1, top=0.95,
                        left=0.1, right=0.95,
                        hspace=0.05, wspace=0.05)

    # This function plots multiple panels with the traces
    plot_mcmc(traces, labels=labels, limits=limits, true_values=true, fig=fig,
              bins=30, colors='k')

    # Plot the model fit
    ax = fig.add_axes([0.5, 0.7, 0.45, 0.25])
    t_fit = np.linspace(0, 100, 101)
    y_fit = burst(t_fit, *fit_vals)

    ax.scatter(t, y_obs, s=9, c='k')
    ax.plot(t_fit, y_fit, '-k')
    ax.set_xlim(0, 100)
    ax.set_xlabel('$t$')
    ax.set_ylabel(r'$h_{\rm obs}$')

In [19]:

def plot1026():
    # Author: Jake VanderPlas
    # License: BSD
    #   The figure produced by this code is published in the textbook
    #   "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
    #   For more information, see http://astroML.github.com
    #   To report a bug or issue, use the following forum:
    #    https://groups.google.com/forum/#!forum/astroml-general

    # Hack to fix import issue in older versions of pymc
    import scipy
    import scipy.misc
    scipy.derivative = scipy.misc.derivative
    import pymc

    from astroML.plotting.mcmc import plot_mcmc
    from astroML.decorators import pickle_results

    #----------------------------------------------------------------------
    # Set up toy dataset
    def chirp(t, b0, beta, A, omega):
        return b0 + A * np.sin(omega * t + beta * t * t)

    np.random.seed(0)

    N = 100
    b0_true = 10
    A_true = 5
    beta_true = 0.01
    omega_true = 0.1
    sigma = 2.0

    t = 100 * np.random.random(N)

    y_true = chirp(t, b0_true, beta_true, A_true, omega_true)
    y_obs = np.random.normal(y_true, sigma)

    t_fit = np.linspace(0, 100, 1000)
    y_fit = chirp(t_fit, b0_true, beta_true, A_true, omega_true)

    i = np.argsort(t)

    #----------------------------------------------------------------------
    # Set up MCMC sampling
    b0 = pymc.Uniform('b0', 0, 50, value=50 * np.random.random())
    A = pymc.Uniform('A', 0, 50, value=50 * np.random.random())
    log_beta = pymc.Uniform('log_beta', -10, 10, value=-4.6)
    log_omega = pymc.Uniform('log_omega', -10, 10, value=-2.3)


    # uniform prior on log(beta)
    @pymc.deterministic
    def beta(log_beta=log_beta):
        return np.exp(log_beta)


    # uniform prior on log(omega)
    @pymc.deterministic
    def omega(log_omega=log_omega):
        return np.exp(log_omega)


    @pymc.deterministic
    def y_model(t=t, b0=b0, A=A, beta=beta, omega=omega):
        return chirp(t, b0, beta, A, omega)

    y = pymc.Normal('y', mu=y_model, tau=sigma ** -2, observed=True, value=y_obs)

    model = dict(b0=b0, A=A,
                 log_beta=log_beta, beta=beta,
                 log_omega=log_omega, omega=omega,
                 y_model=y_model, y=y)


    #----------------------------------------------------------------------
    # Run the MCMC sampling (saving results to a pickle)
    @pickle_results('matchedfilt_chirp.pkl')
    def compute_MCMC_results(niter=20000, burn=2000):
        S = pymc.MCMC(model)
        S.sample(iter=niter, burn=burn)
        traces = [S.trace(s)[:] for s in ['b0', 'A', 'omega', 'beta']]

        M = pymc.MAP(model)
        M.fit()
        fit_vals = (M.b0.value, M.beta.value, M.A.value, M.omega.value)

        return traces, fit_vals

    traces, fit_vals = compute_MCMC_results()

    labels = ['$b_0$', '$A$', r'$\omega$', r'$\beta$']
    limits = [(9.5, 11.3), (3.6, 6.4), (0.065, 0.115), (0.00975, 0.01045)]
    true = [b0_true, A_true, omega_true, beta_true]

    #----------------------------------------------------------------------
    # Find the Maximum a posteriori values
    fig = plt.figure()

    ax = plt.axes([0.5, 0.7, 0.45, 0.25])
    t_fit = np.linspace(0, 100, 1001)
    y_fit = chirp(t_fit, *fit_vals)
    plt.scatter(t, y_obs, s=9, c='k')
    plt.plot(t_fit, y_fit, '-k')
    plt.xlim(0, 100)
    plt.xlabel('$t$')
    plt.ylabel(r'$h_{\rm obs}$')

    # This function plots multiple panels with the traces
    plot_mcmc(traces, labels=labels, limits=limits, true_values=true, fig=fig,
              bins=30, bounds=[0.12, 0.08, 0.95, 0.91], colors='k')

In [20]:

def plot1027():
    # Author: Jake VanderPlas
    # License: BSD
    #   The figure produced by this code is published in the textbook
    #   "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
    #   For more information, see http://astroML.github.com
    #   To report a bug or issue, use the following forum:
    #    https://groups.google.com/forum/#!forum/astroml-general

    # Hack to fix import issue in older versions of pymc
    import scipy
    import scipy.misc
    scipy.derivative = scipy.misc.derivative
    import pymc

    from astroML.decorators import pickle_results
    from astroML.plotting.mcmc import plot_mcmc

    #----------------------------------------------------------------------
    # Set up toy dataset
    def chirp(t, T, A, phi, omega, beta):
        """chirp signal"""
        signal = A * np.sin(phi + omega * (t - T) + beta * (t - T) ** 2)
        signal[t < T] = 0
        return signal


    def background(t, b0, b1, Omega1, Omega2):
        """background signal"""
        return b0 + b1 * np.sin(Omega1 * t) * np.sin(Omega2 * t)


    np.random.seed(0)
    N = 500
    T_true = 30
    A_true = 0.8
    phi_true = np.pi / 2
    omega_true = 0.1
    beta_true = 0.02
    b0_true = 0.5
    b1_true = 1.0
    Omega1_true = 0.3
    Omega2_true = 0.4
    sigma = 0.1

    t = 100 * np.random.random(N)

    signal = chirp(t, T_true, A_true, phi_true, omega_true, beta_true)
    bg = background(t, b0_true, b1_true, Omega1_true, Omega2_true)

    y_true = signal + bg

    y_obs = np.random.normal(y_true, sigma)

    t_fit = np.linspace(0, 100, 1000)
    y_fit = (chirp(t_fit, T_true, A_true, phi_true, omega_true, beta_true) +
             background(t_fit, b0_true, b1_true, Omega1_true, Omega2_true))


    #----------------------------------------------------------------------
    # Set up MCMC sampling
    T = pymc.Uniform('T', 0, 100, value=T_true)
    A = pymc.Uniform('A', 0, 100, value=A_true)
    phi = pymc.Uniform('phi', -np.pi, np.pi, value=phi_true)
    log_omega = pymc.Uniform('log_omega', -4, 0, value=np.log(omega_true))
    log_beta = pymc.Uniform('log_beta', -6, 0, value=np.log(beta_true))
    b0 = pymc.Uniform('b0', 0, 100, value=b0_true)
    b1 = pymc.Uniform('b1', 0, 100, value=b1_true)
    log_Omega1 = pymc.Uniform('log_Omega1', -3, 0, value=np.log(Omega1_true))
    log_Omega2 = pymc.Uniform('log_Omega2', -3, 0, value=np.log(Omega2_true))
    omega = pymc.Uniform('omega', 0.001, 1, value=omega_true)
    beta = pymc.Uniform('beta', 0.001, 1, value=beta_true)


    # uniform prior on log(Omega1)
    @pymc.deterministic
    def Omega1(log_Omega1=log_Omega1):
        return np.exp(log_Omega1)


    # uniform prior on log(Omega2)
    @pymc.deterministic
    def Omega2(log_Omega2=log_Omega2):
        return np.exp(log_Omega2)


    @pymc.deterministic
    def y_model(t=t, T=T, A=A, phi=phi, omega=omega, beta=beta,
                b0=b0, b1=b1, Omega1=Omega1, Omega2=Omega2):
        return (chirp(t, T, A, phi, omega, beta)
                + background(t, b0, b1, Omega1, Omega2))

    y = pymc.Normal('y', mu=y_model, tau=sigma ** -2, observed=True, value=y_obs)

    model = dict(T=T, A=A, phi=phi, b0=b0, b1=b1,
                 log_omega=log_omega, omega=omega,
                 log_beta=log_beta, beta=beta,
                 log_Omega1=log_Omega1, Omega1=Omega1,
                 log_Omega2=log_Omega2, Omega2=Omega2,
                 y_model=y_model, y=y)


    #----------------------------------------------------------------------
    # Run the MCMC sampling (saving the results to a pickle file)
    @pickle_results('matchedfilt_chirp2.pkl')
    def compute_MCMC(niter=30000, burn=2000):
        S = pymc.MCMC(model)
        S.sample(iter=30000, burn=2000)
        traces = [S.trace(s)[:] for s in ['T', 'A', 'omega', 'beta']]
        return traces

    traces = compute_MCMC()

    labels = ['$T$', '$A$', r'$\omega$', r'$\beta$']
    limits = [(29.75, 30.25), (0.75, 0.83), (0.085, 0.115), (0.0197, 0.0202)]
    true = [T_true, A_true, omega_true, beta_true]

    #------------------------------------------------------------
    # Plot results
    fig = plt.figure()

    # This function plots multiple panels with the traces
    axes_list = plot_mcmc(traces, labels=labels, limits=limits,
                          true_values=true, fig=fig,
                          bins=30, colors='k',
                          bounds=[0.14, 0.08, 0.95, 0.95])

    for ax in axes_list:
        for axis in [ax.xaxis, ax.yaxis]:
            axis.set_major_locator(plt.MaxNLocator(5))

    ax = fig.add_axes([0.52, 0.7, 0.43, 0.25])
    ax.scatter(t, y_obs, s=9, c='k')
    ax.plot(t_fit, y_fit, '-k')
    ax.set_xlim(0, 100)
    ax.set_xlabel('$t$')
    ax.set_ylabel(r'$h_{\rm obs}$')

In [21]:

def plot1028():
    # Author: Jake VanderPlas
    # License: BSD
    #   The figure produced by this code is published in the textbook
    #   "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
    #   For more information, see http://astroML.github.com
    #   To report a bug or issue, use the following forum:
    #    https://groups.google.com/forum/#!forum/astroml-general

    from astroML.fourier import FT_continuous, IFT_continuous, wavelet_PSD

    #------------------------------------------------------------
    # Define the chirp signal
    def chirp(t, T, A, phi, omega, beta):
        signal = A * np.sin(phi + omega * (t - T) + beta * (t - T) ** 2)
        signal[t < T] = 0
        return signal


    def background(t, b0, b1, Omega1, Omega2):
        return b0 + b1 * np.sin(Omega1 * t) * np.sin(Omega2 * t)

    N = 4096
    t = np.linspace(-50, 50, N)
    h_true = chirp(t, -20, 0.8, 0, 0.2, 0.02)
    h = h_true + np.random.normal(0, 1, N)

    #------------------------------------------------------------
    # Compute the wavelet PSD
    f0 = np.linspace(0.04, 0.6, 100)
    wPSD = wavelet_PSD(t, h, f0, Q=1.0)

    #------------------------------------------------------------
    # Plot the  results
    fig = plt.figure()
    fig.subplots_adjust(hspace=0.05, left=0.1, right=0.95, bottom=0.1, top=0.95)

    # Top: plot the data
    ax = fig.add_subplot(211)
    ax.plot(t + 50, h, '-', c='#AAAAAA')
    ax.plot(t + 50, h_true, '-k')

    ax.text(0.02, 0.95, "Input Signal: chirp",
            ha='left', va='top', transform=ax.transAxes,
            bbox=dict(boxstyle='round', fc='w', ec='k'))

    ax.set_xlim(0, 100)
    ax.set_ylim(-2.9, 2.9)
    ax.xaxis.set_major_formatter(plt.NullFormatter())
    ax.set_ylabel('$h(t)$')

    # Bottom: plot the 2D PSD
    ax = fig.add_subplot(212)
    ax.imshow(wPSD, origin='lower', aspect='auto',
              extent=[t[0] + 50, t[-1] + 50, f0[0], f0[-1]],
              cmap=plt.cm.binary)

    ax.text(0.02, 0.95, ("Wavelet PSD"), color='w',
            ha='left', va='top', transform=ax.transAxes)

    ax.set_xlim(0, 100)
    ax.set_ylim(0.04, 0.6001)
    ax.set_xlabel('$t$')
    ax.set_ylabel('$f_0$')

In [22]:

def plot1029():
    # Author: Jake VanderPlas
    # License: BSD
    #   The figure produced by this code is published in the textbook
    #   "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
    #   For more information, see http://astroML.github.com
    #   To report a bug or issue, use the following forum:
    #    https://groups.google.com/forum/#!forum/astroml-general

    from astroML.time_series import generate_power_law
    from astroML.fourier import PSD_continuous

    N = 1024
    dt = 0.01
    factor = 100

    t = dt * np.arange(N)
    random_state = np.random.RandomState(1)

    fig = plt.figure()
    fig.subplots_adjust(wspace=0.05)

    for i, beta in enumerate([1.0, 2.0]):
        # Generate the light curve and compute the PSD
        x = factor * generate_power_law(N, dt, beta, random_state=random_state)
        f, PSD = PSD_continuous(t, x)

        # First axes: plot the time series
        ax1 = fig.add_subplot(221 + i)
        ax1.plot(t, x, '-k')

        ax1.text(0.95, 0.05, r"$P(f) \propto f^{-%i}$" % beta,
                 ha='right', va='bottom', transform=ax1.transAxes)

        ax1.set_xlim(0, 10.24)
        ax1.set_ylim(-1.5, 1.5)

        ax1.set_xlabel(r'$t$')

        # Second axes: plot the PSD
        ax2 = fig.add_subplot(223 + i, xscale='log', yscale='log')
        ax2.plot(f, PSD, '-k')
        ax2.plot(f[1:], (factor * dt) ** 2 * (2 * np.pi * f[1:]) ** -beta, '--k')

        ax2.set_xlim(1E-1, 60)
        ax2.set_ylim(1E-6, 1E1)

        ax2.set_xlabel(r'$f$')

        if i == 1:
            ax1.yaxis.set_major_formatter(plt.NullFormatter())
            ax2.yaxis.set_major_formatter(plt.NullFormatter())
        else:
            ax1.set_ylabel(r'${\rm counts}$')
            ax2.set_ylabel(r'$PSD(f)$')

In [23]:

def plot1030():
    # Author: Jake VanderPlas
    # License: BSD
    #   The figure produced by this code is published in the textbook
    #   "Statistics, Data Mining, and Machine Learning in Astronomy" (2013)
    #   For more information, see http://astroML.github.com
    #   To report a bug or issue, use the following forum:
    #    https://groups.google.com/forum/#!forum/astroml-general

    from astroML.time_series import lomb_scargle, generate_damped_RW
    from astroML.time_series import ACF_scargle, ACF_EK

    #------------------------------------------------------------
    # Generate time-series data:
    #  we'll do 1000 days worth of magnitudes

    t = np.arange(0, 1E3)
    z = 2.0
    tau = 300
    tau_obs = tau / (1. + z)

    np.random.seed(6)
    y = generate_damped_RW(t, tau=tau, z=z, xmean=20)

    # randomly sample 100 of these
    ind = np.arange(len(t))
    np.random.shuffle(ind)
    ind = ind[:100]
    ind.sort()
    t = t[ind]
    y = y[ind]

    # add errors
    dy = 0.1
    y_obs = np.random.normal(y, dy)

    #------------------------------------------------------------
    # compute ACF via scargle method
    C_S, t_S = ACF_scargle(t, y_obs, dy,
                           n_omega=2 ** 12, omega_max=np.pi / 5.0)

    ind = (t_S >= 0) & (t_S <= 500)
    t_S = t_S[ind]
    C_S = C_S[ind]

    #------------------------------------------------------------
    # compute ACF via E-K method
    C_EK, C_EK_err, bins = ACF_EK(t, y_obs, dy, bins=np.linspace(0, 500, 51))
    t_EK = 0.5 * (bins[1:] + bins[:-1])

    #------------------------------------------------------------
    # Plot the results
    fig = plt.figure()

    # plot the input data
    ax = fig.add_subplot(211)
    ax.errorbar(t, y_obs, dy, fmt='.k', lw=1)
    ax.set_xlabel('t (days)')
    ax.set_ylabel('observed flux')

    # plot the ACF
    ax = fig.add_subplot(212)
    ax.plot(t_S, C_S, '-', c='gray', lw=1,
            label='Scargle')
    ax.errorbar(t_EK, C_EK, C_EK_err, fmt='.k', lw=1,
                label='Edelson-Krolik')
    ax.plot(t_S, np.exp(-abs(t_S) / tau_obs), '-k', label='True')
    ax.legend(loc=3)

    ax.plot(t_S, 0 * t_S, ':', lw=1, c='gray')

    ax.set_xlim(0, 500)
    ax.set_ylim(-1.0, 1.1)

    ax.set_xlabel('t (days)')
    ax.set_ylabel('ACF(t)')

In [23]: