We introduce the pyBlocxs with the default MCMC samplers in Sherpa, including standard Bayesian methods for exploring parameter space and the addition of the calibration uncertainties. The algorithms are published in the following articles:
Van Dyk, et al. (2001), Analysis of Energy Spectra with Low Photon Counts via Bayesian Posterior Simulation https://ui.adsabs.harvard.edu/abs/2001ApJ...548..224V/abstract
Lee et al (2011), Accounting for Calibration Uncertainties in X-ray Analysis: Effective Areas in Spectral Fitting https://ui.adsabs.harvard.edu/abs/2011ApJ...731..126L/abstract
Xu et al (2014), A Fully Bayesian Method for Jointly Fitting Instrumental Calibration and X-Ray Spectral Models https://ui.adsabs.harvard.edu/abs/2014ApJ...794...97X/abstract
This notebook works in CIAO 4.13
First import Sherpa
from sherpa.astro.ui import *
We are only working with spectra, so imaging with ds9 will not be shown. We will use "matplotlib" for visualization. This is the default package for plotting in Sherpa.
We use the Poisson likelihood statistics called "cash" in Sherpa and select an optimization method for the initial fit - nelder-mead simplex algorithm.
set_stat('cash')
set_method('neldermead')
Next load the data, filter and setup a simple absorbed power law model. Note that the calibration files, ARF and RMF, which are needed for the X-ray analysis and a background fileare read and when the file "acis.pi" is loaded into the session.
load_data("acis.pi")
ignore(":0.5,7.0:")
set_model(xsphabs.abs1*xspowerlaw.p1)
read ARF file acis.arf read RMF file acis.rmf read background file acis_bg.pi
Run the initial fit using simplex and cash. This returns the best fit model parameters, absorption column, power law photon index and normalization.
fit()
WARNING: data set 1 has associated backgrounds, but they have not been subtracted, nor have background models been set Dataset = 1 Method = neldermead Statistic = cash Initial fit statistic = 3.15508e+07 Final fit statistic = -89797.1 at function evaluation 540 Data points = 444 Degrees of freedom = 441 Change in statistic = 3.16406e+07 abs1.nH 0.0716477 p1.PhoIndex 1.17604 p1.norm 0.000533859
Next get the covariance matrix with covar(). The matrix is later used in the MCMC run.
covar()
WARNING: data set 1 has associated backgrounds, but they have not been subtracted, nor have background models been set Dataset = 1 Confidence Method = covariance Iterative Fit Method = None Fitting Method = neldermead Statistic = cash covariance 1-sigma (68.2689%) bounds: Param Best-Fit Lower Bound Upper Bound ----- -------- ----------- ----------- abs1.nH 0.0716477 -0.00811361 0.00811361 p1.PhoIndex 1.17604 -0.0242164 0.0242164 p1.norm 0.000533859 -1.37268e-05 1.37268e-05
Plot the model and the data.
plot_fit()
WARNING: The displayed errorbars have been supplied with the data or calculated using chi2xspecvar; the errors are not used in fits with cash
There are a few MCMC samplers in Sherpa that can be used to explore the parameter space. All are based on the Metropolis and Metropolis-Hastings algorithms. The standard metropolismh
runs MCMC with Metropolis-Metropolis-Hastings criterium for accepting parameters.
list_samplers()
['metropolismh', 'mh', 'pragbayes', 'fullbayes']
Select MetropolisMH and check the options:
set_sampler("MetropolisMH")
get_sampler()
{'log': False, 'inv': False, 'defaultprior': True, 'priorshape': False, 'priors': (), 'originalscale': True, 'scale': 1, 'sigma_m': False, 'p_M': 0.5}
Here we list several options for running the MCMC sampler:
defaultprior – indicates that all parameters have the default flat prior in Sherpa
inv – indicates which parameters are on the inverse scale.
log – indicates which parameters are on the logarithm scale (natural log).
originalscale – shows that the parameters are on the original scale for a given parameter
p_M – the proportion of jumps generated by the MCMC jumping rule.
priorshape – indicates which parameters have user-defined prior functions.
scale – A scalar multiple of the output of covar() used in the scale of the t-distribution
We first run the MCMC with the default settings. get_draws
is the function which runs the chains.
help(get_draws)
Help on function get_draws in module sherpa.astro.ui: get_draws(id=None, otherids=(), niter=1000, covar_matrix=None) Run the pyBLoCXS MCMC algorithm. The function runs a Markov Chain Monte Carlo (MCMC) algorithm designed to carry out Bayesian Low-Count X-ray Spectral (BLoCXS) analysis. It explores the model parameter space at the suspected statistic minimum (i.e. after using `fit`). The return values include the statistic value, parameter values, and an acceptance flag indicating whether the row represents a jump from the current location or not. For more information see the `sherpa.sim` module and [1]_. Parameters ---------- id : int or str, optional The data set that provides the data. If not given then all data sets with an associated model are used simultaneously. otherids : sequence of int or str, optional Other data sets to use in the calculation. niter : int, optional The number of draws to use. The default is ``1000``. covar_matrix : 2D array, optional The covariance matrix to use. If ``None`` then the result from `get_covar_results().extra_output` is used. Returns ------- stats, accept, params The results of the MCMC chain. The stats and accept arrays contain ``niter+1`` elements, with the first row being the starting values. The params array has ``(nparams, niter+1)`` elements, where nparams is the number of free parameters in the model expression, and the first column contains the values that the chain starts at. The accept array contains boolean values, indicating whether the jump, or step, was accepted (``True``), so the parameter values and statistic change, or it wasn't, in which case there is no change to the previous row. The `sherpa.utils.get_error_estimates` routine can be used to calculate the credible one-sigma interval from the params array. See Also -------- covar : Estimate the confidence intervals using the covariance method. fit : Fit a model to one or more data sets. plot_cdf : Plot the cumulative density function of an array. plot_pdf : Plot the probability density function of an array. plot_scatter : Create a scatter plot. plot_trace : Create a trace plot of row number versus value. set_prior : Set the prior function to use with a parameter. set_sampler : Set the MCMC sampler. get_sampler : Return information about the current MCMC sampler. Notes ----- The chain is run using fit information associated with the specified data set, or sets, the currently set sampler (`set_sampler`) and parameter priors (`set_prior`), for a specified number of iterations. The model should have been fit to find the best-fit parameters, and `covar` called, before running `get_draws`. The results from `get_draws` is used to estimate the parameter distributions. References ---------- .. [1] "Analysis of Energy Spectra with Low Photon Counts via Bayesian Posterior Simulation", van Dyk, D.A., Connors, A., Kashyap, V.L., & Siemiginowska, A. 2001, Ap.J., 548, 224 http://adsabs.harvard.edu/abs/2001ApJ...548..224V Examples -------- Fit a source and then run a chain to investigate the parameter distributions. The distribution of the stats values created by the chain is then displayed, using `plot_trace`, and the parameter distributions for the first two thawed parameters are displayed. The first one as a cumulative distribution using `plot_cdf` and the second one as a probability distribution using `plot_pdf`. Finally the acceptance fraction (number of draws where the chain moved) is displayed. Note that in a full analysis session a burn-in period would normally be removed from the chain before using the results. >>> fit() >>> covar() >>> stats, accept, params = get_draws(1, niter=1e4) >>> plot_trace(stats, name='stat') >>> names = [p.fullname for p in get_source().pars if not p.frozen] >>> plot_cdf(params[0,:], name=names[0], xlabel=names[0]) >>> plot_pdf(params[1,:], name=names[1], xlabel=names[1]) >>> accept[:-1].sum() * 1.0 / len(accept - 1) 0.4287 The following runs the chain on multiple data sets, with identifiers 'core', 'jet1', and 'jet2': >>> stats, accept, params = get_draws('core', ['jet1', 'jet2'], niter=1e4)
stats, accept, params = get_draws(niter=1e3)
# check the parameters at the minimum statistic value based on the sampler:
WARNING: data set 1 has associated backgrounds, but they have not been subtracted, nor have background models been set Using Priors: abs1.nH: <function flat at 0x7ff2312fc670> p1.PhoIndex: <function flat at 0x7ff2312fc670> p1.norm: <function flat at 0x7ff2312fc670>
The stats array contains the fit statistic for each iteration (the first element of these arrays is the starting point of the chain, so there will be 10001 elements in this case). We can check the minimum of the statistics for the best fit parameter values and the standard deviations.
bestfit = params[::,stats.argmin()].T
# print bestfit values and standard deviations
print(bestfit)
[params[i].std() for i in [0,1,2]]
[7.16476507e-02 1.17603887e+00 5.33858688e-04]
[0.007708456321037247, 0.022955572070114207, 1.3342403008224056e-05]
The traces of each arrays show the steps for the entire chain and can be plotted with the Sherpa default plotting functions.
# visualization of the MCMC run - Cash at each iteration
plot_trace(stats)
plot_trace(params[0])
plot_trace(params[1])
# dependence between parameters
plot_scatter(params[0],params[1])
The relation between the two parameters can be visualized as a scatter plot above.
The distributions for each parameter can be plotted as histograms and as a cummulative distributions which provides an easy way to access the median and upper/lower 1 sigma interval.
# plot the distribution for photon index gamma
plot_pdf(params[1])
# plot the cummulative density for photon index, print the median and upper/lower confidence values.
plot_cdf(params[1])
print(get_cdf_plot().median)
print(get_cdf_plot().upper)
print(get_cdf_plot().lower)
1.176220928128069 1.1989846584438775 1.156263869252614
bestfit1 = params[::,stats.argmin()].T
# print bestfit values and standard deviations
print(bestfit1)
[params[i].std() for i in [0,1,2]]
[7.16476507e-02 1.17603887e+00 5.33858688e-04]
[0.007708456321037247, 0.022955572070114207, 1.3342403008224056e-05]
Changing the prior for photon index to a gaussian prior. We use one of the predefined Sherpa model normgauss1d
to set the prior.
normgauss1d.g1
g1.pos=2.
g1.fwhm=0.5
set_prior(p1.PhoIndex,g1)
Below, set_sampler_opt
is used to set a prior for each parameter
set_sampler_opt('defaultprior',False) # not using the default prior for each parameter
# below define the prior shape for p1.gamma, use default for abs.nH and p1.ampl
set_sampler_opt('priorshape', [False, True, False])
set_sampler_opt('originalscale', [True, True, True])# all prior is defined on the original parameter scale
# Run sampler
stats, accept, params = get_draws(niter=1e3)
WARNING: data set 1 has associated backgrounds, but they have not been subtracted, nor have background models been set Using Priors: abs1.nH: <function flat at 0x7ff2312fc670> p1.PhoIndex: normgauss1d.g1 Param Type Value Min Max Units ----- ---- ----- --- --- ----- g1.fwhm thawed 0.5 1.17549e-38 3.40282e+38 g1.pos thawed 2 -3.40282e+38 3.40282e+38 g1.ampl thawed 1 -3.40282e+38 3.40282e+38 p1.norm: <function flat at 0x7ff2312fc670>
plot_trace(params[1])
plot_scatter(params[0],params[1])
plot_scatter(params[0][100:200],params[1][100:200])
plot_cdf(params[1])
print(get_cdf_plot().median)
print(get_cdf_plot().upper)
print(get_cdf_plot().lower)
1.1845307095043711 1.211167826392162 1.1621568846734065
bestfit1 = params[::,stats.argmin()].T
# print bestfit values and standard deviations
print(bestfit1)
[params[i].std() for i in [0,1,2]]
[7.38638019e-02 1.18463750e+00 5.38155302e-04]
[0.008279386702399395, 0.024729391151673632, 1.4244726633682999e-05]
If the uncertainties on the ARF are provided then they can be included in the evaluation of the best fit parameters via MCMC.
This is the standard loop for running MCMC
Here we include the additional step to incorporate draws from the calibration data
pragbayes
sampler will include this step if the required data is available.
set_sampler('pragbayes')
get_sampler() # information about the available options - Note simarf and nsubiters.
{'log': False, 'inv': False, 'defaultprior': True, 'priorshape': False, 'priors': (), 'originalscale': True, 'scale': 1, 'sigma_m': False, 'p_M': 0.5, 'simarf': None, 'nsubiters': 10}
set_sampler_opt('simarf', 'aref_Cedge.fits')
set_sampler_opt('nsubiters',5)
get_sampler()
{'log': False, 'inv': False, 'defaultprior': True, 'priorshape': False, 'priors': (), 'originalscale': True, 'scale': 1, 'sigma_m': False, 'p_M': 0.5, 'simarf': 'aref_Cedge.fits', 'nsubiters': 5}
print(get_model())
apply_rmf(apply_arf((27358.24455786 * (xsphabs.abs1 * xspowerlaw.p1)))) Param Type Value Min Max Units ----- ---- ----- --- --- ----- abs1.nH thawed 0.0716477 0 100000 10^22 atoms / cm^2 p1.PhoIndex thawed 1.17604 -2 9 p1.norm thawed 0.000533859 0 1e+24
stat_sim, accept_sim, params_sim = get_draws(niter=1e3)
WARNING: data set 1 has associated backgrounds, but they have not been subtracted, nor have background models been set Using Priors: abs1.nH: <function flat at 0x7ff2312fc670> p1.PhoIndex: normgauss1d.g1 Param Type Value Min Max Units ----- ---- ----- --- --- ----- g1.fwhm thawed 0.5 1.17549e-38 3.40282e+38 g1.pos thawed 2 -3.40282e+38 3.40282e+38 g1.ampl thawed 1 -3.40282e+38 3.40282e+38 p1.norm: <function flat at 0x7ff2312fc670> Draw rejected: parameter boundary exception: [-1.99466812e-02 8.99252054e-01 4.32438144e-04] Draw rejected: parameter boundary exception: [-9.88342308e-02 7.28437628e-01 2.99137917e-04]
Display the PDF for Photon index (gamma) from the MCMC with a default calibration file and from PragBayes with draws from the calibration data.
plot_pdf(params_sim[1], name='', xlabel = 'Photon Index')
plot_pdf(params[1], name='', xlabel = 'Photon Index',overplot =True,clearwindow=False)
Display Photon Index values obtained with the default calibration and compare to the PragBayes values with the draws from calibration data.
plot_scatter(params_sim[1],params[1][0:1001], name="(Parameter)", xlabel="params_sim", ylabel="params")
len(params_sim[0])
1001
This is an experimental function in Sherpa to include both fitting of the model and the calibration data (finding the best ARF within a calibration uncertainties). The algorithm is described in Xu et al (2014), A Fully Bayesian Method for Jointly Fitting Instrumental Calibration and X-Ray Spectral Models https://ui.adsabs.harvard.edu/abs/2014ApJ...794...97X/abstract
print(list_samplers())
['metropolismh', 'mh', 'pragbayes', 'fullbayes']
set_sampler('fullbayes')
get_sampler()
{'log': False, 'inv': False, 'defaultprior': True, 'priorshape': False, 'priors': (), 'originalscale': True, 'scale': 1, 'sigma_m': False, 'p_M': 0.5, 'simarf': None, 'p_M_arf': 0.5, 'sigma_arf': 0.1}
set_sampler_opt('simarf','aref_Cedge.fits')
stat_sim1, accept_sim1, params_sim1 = get_draws(niter=1e3)
WARNING: data set 1 has associated backgrounds, but they have not been subtracted, nor have background models been set Using Priors: abs1.nH: <function flat at 0x7ff2312fc670> p1.PhoIndex: normgauss1d.g1 Param Type Value Min Max Units ----- ---- ----- --- --- ----- g1.fwhm thawed 0.5 1.17549e-38 3.40282e+38 g1.pos thawed 2 -3.40282e+38 3.40282e+38 g1.ampl thawed 1 -3.40282e+38 3.40282e+38 p1.norm: <function flat at 0x7ff2312fc670> Draw rejected: parameter boundary exception: [-9.98918103e-04 1.00637732e+00 4.48082558e-04] Draw rejected: parameter boundary exception: [-6.16929812e-02 7.60192563e-01 3.46722396e-04]
plot_pdf(params_sim1[1], name='', xlabel = 'Photon Index')
plot_pdf(params[1], name='', xlabel = 'Photon Index',overplot =True,clearwindow=False)
plot_pdf(params_sim[1], name='', xlabel = 'Photon Index',overplot =True,clearwindow=False)
# Use the python package from Dan Foreman-Mackey https://github.com/dfm/corner.py to make a corner plots
import corner
parvals = [p.val for p in get_model().pars if not p.frozen]
print(parvals)
[0.07164765072846567, 1.176038869172598, 0.0005338586882594435]
corner.corner(params[:].T,
quantiles=[0.16, 0.5, 0.84],
plot_contours=False, truths=parvals)
params.shape
(3, 1001)