Last modified: 13 Dec 2024

URL: https://cxc.cfa.harvard.edu/sherpa/threads/pyblocxs/

Bayesian Analysis in Sherpa

Sherpa Threads (CIAO 4.17 Sherpa)


Overview

Synopsis:

pyBLoCXS is a sophisticated Markov chain Monte Carlo (MCMC) based algorithm designed to carry out Bayesian Low-Count X-ray Spectral (BLoCXS) analysis in the Sherpa environment. The algorithm may be used to explore parameter space at a suspected minimum using a pre-defined Sherpa model to high-energy X-ray spectral data.

Run this thread if:

Use this thread if you are interested in conducting Bayesian Low-Counts X-ray Spectral (BLoCXS) analysis in the Sherpa environment.

Last Update: 13 Dec 2024 - reviewed for CIAO 4.17, updated returned outputs and figures.


Contents


Introduction to pyBLoCXS

The pyBLoCXS code is a Python extension to Sherpa that explores model parameter space at a suspected minimum using a pre-defined Sherpa model to high-energy X-ray spectral data. pyBLoCXS includes a flexible definition of priors and can efficiently probe the model parameter space to characterize the posterior distributions. It allows for variations in the calibration information and can be used to compute posterior predictive p-values for the likelihood ratio test (see Protassov, et al. 2002, ApJ 571, 542, Statistics, Handle with Care: Detecting Multiple Model Components with the Likelihood Ratio Test).

The Sherpa get_draws function runs a pyBLoCXS chain using fit information associated with the user-specified data set(s), the currently set sampler, and parameter priors for a specified number of iterations. It returns an array of simulated fit statistic values, an array of True or False accepted/rejected Booleans, and a 2-D array of associated model parameter values.

A general description of the MCMC techniques employed by the pyBLoCXS algorithm can be found in the following references:

The Sherpa pyBLoCXS functions include:


Getting Started

Please follow the Obtaining Data Used in Sherpa thread in order to obtain the sample data used in this thread. The spectral data products for the PKS 1127-145 quasar core are located in the pyblocxs sub-directory.


Fitting a Model to the Spectrum

The pyBLoCXS routine requires information about a fit to a data set in order to run; therefore, we begin by loading a source spectrum into Sherpa, fitting a model to it, and calculating confidence intervals on thawed parameters using the covariance method.

The source spectrum is loaded with load_pha, and the source model expression to be used for fitting is defined with set_model. We decide to restrict the analysis to the 0.5–7.0 keV energy range, using the ignore command.

sherpa> load_pha(1, "core.pi")
sherpa> set_model(xsphabs.abs0 * powlaw1d.p1)
sherpa> ignore(None, 0.5)
dataset 1: 0.0073:14.9504 -> 0.511:14.9504 Energy (keV)

sherpa> ignore(7., None)
dataset 1: 0.511:14.9504 -> 0.511:6.9934 Energy (keV)

We fit an absorbed power-law model to the data using the robust, Nelder-Mead Simplex fit optimization method and Cash fit statistic in order to select a Poisson likelihood as appropriate for pyBLoCXS simulations.

sherpa> set_method("neldermead")
sherpa> set_stat("cash")

Then, we fit and estimate confidence intervals on all thawed parameters with the Sherpa covariance command.

sherpa> fit()
WARNING: data set 1 has associated backgrounds, but they have not been subtracted, nor have background models been set
Dataset               = 1
Method                = neldermead
Statistic             = cash
Initial fit statistic = 1.01254e+08
Final fit statistic   = -461303 at function evaluation 658
Data points           = 444
Degrees of freedom    = 441
Change in statistic   = 1.01716e+08
   abs0.nH        0.0721599   
   p1.gamma       1.24384     
   p1.ampl        0.000676157 
[NOTE]
Note

Sherpa has recognized that the PHA data has an associated bakground data set but it isn't being used (it can not be subtracted when using the Cash statistic, and no model has been set up to fit it). For this example we will ignore the background contribution, but in an actual analysis you will have to decide how to handle the background.

sherpa> plot_fit_resid(yerrorbars=False)
WARNING: The displayed errorbars have been supplied with the data or calculated using chi2xspecvar; the errors are not used in fits with wstat
WARNING: The displayed errorbars have been supplied with the data or calculated using chi2xspecvar; the errors are not used in fits with wstat

Figure 1: Starting point for the MCMC analysis

[A plot of the best-fit model overlaying the data (top plot) and the residuals about this plot (bottom). There's a lot of scatter but the fit seems reasonable.]
[Print media version: A plot of the best-fit model overlaying the data (top plot) and the residuals about this plot (bottom). There's a lot of scatter but the fit seems reasonable.]

Figure 1: Starting point for the MCMC analysis

sherpa> covar()
WARNING: data set 1 has associated backgrounds, but they have not been subtracted, nor have background models been set
Dataset               = 1
Confidence Method     = covariance
Iterative Fit Method  = None
Fitting Method        = neldermead
Statistic             = cash
covariance 1-sigma (68.2689%) bounds:
   Param            Best-Fit  Lower Bound  Upper Bound
   -----            --------  -----------  -----------
   abs0.nH         0.0721599  -0.00455609   0.00455609
   p1.gamma          1.24384   -0.0126143    0.0126143
   p1.ampl       0.000676157 -9.20228e-06  9.20228e-06

Running the pyBLoCXS Routine

pyBLoCXS simulations are executed with the get_draws function, using the fit information associated with our specified data set(s)—and the current sampler and parameter priors—for a specified number of iterations. The simplest case uses the default MetropolisMH sampler, and flat priors for all parameters.

The list of all available samplers is returned by the list_samplers function, and the currently set sampler may be viewed with get_sampler_name, as demonstrated below.

sherpa> print(get_sampler_name())
MetropolisMH

sherpa> print(list_samplers())
['metropolismh', 'mh', 'pragbayes', 'fullbayes']

The sampler—the type of jumping rule to be used in MCMC—can be either MH, MetropolisMH, pragbayes, or fullbayes (note that fullbayes is not thoroughly tested and not recommended at this time). Metropolis and Metropolis-Hastings are the two standard algorithms for sampling the model parameter space (the posterior). Metropolis jumps to the new set of parameters from the current parameter set, while Metropolis-Hastings always jumps from the best fit. The Sherpa sampler MH is a pure Metropolis-Hastings implementation. MetropolisMH is a mixture of Metropolis and Metropolis-Hastings, selected with a probability p_M. PragBayes is used when the effective area calibration uncertainty is to be included in the calculation.

[NOTE]
Note

At each nominal MCMC iteration, a new calibration product is generated, and a series of N (option in set_sampler_opt) MCMC sub-iteration steps are carried out, choosing between Metropolis and Metropolis-Hastings types of samplers with probability p_M (option in set_sampler_opt). Only the last of these sub-iterations are kept in the chain.

The get_sampler function is available for accessing the default settings of the current sampler, and set_sampler_opt for modifying those settings.

sherpa> get_sampler()
{'log': False,
 'inv': False,
 'defaultprior': True,
 'priorshape': False,
 'priors': (),
 'originalscale': True,
 'scale': 1,
 'sigma_m': False,
 'p_M': 0.5}

We accept the default settings for the initial, simplest case and run the sampler using the get_draws function, specifying the names of the output arrays and the number of iterations.

sherpa> stats, accept, params = get_draws(niter=1e4)

Here we use Python syntax to name the data arrays created by the simulations: stats, accept, and params. stats contains a 1-D array of floating-point statistic values; accept contains a 1-D array of True or False accepted/rejected Booleans for each iteration; and params contains a N-D array of parameter values, where N is the number of free parameters in the model. In this example, N is equal to 3.

The get_draws command above produces the following screen output, showing which parameters are being sampled and what prior function is assumed for the simulation (the hexadecimal value will not be the same as shown below):

Using Priors:
abs0.nH: <function flat at 0x16a43f50>
p1.gamma: <function flat at 0x16a43f50>
p1.ampl: <function flat at 0x16a43f50>

Visualizing the Results of the pyBLoCXS Simulations

There are several Sherpa plotting functions available for visualizing statistics and parameter values for each accepted iteration; these include:

Fit Statistic Values

The plot_trace function can be used to plot by default the entire array of statistic values for each sample. We can use the associated get_trace_plot command to specify plot preferences, such as the desired value for line thickness, and then enter the name of the fit statistic array returned by get_draws into the plot_trace expression (stats) to create the plot. Finally, we can save the resulting plot to a PNG format image file with Matplotlib's savefig function.

sherpa> get_trace_plot().plot_prefs
{'xerrorbars': False,
 'yerrorbars': False,
 'ecolor': None,
 'capsize': None,
 'barsabove': False,
 'xlog': False,
 'ylog': False,
 'linestyle': '-',
 'linecolor': None,
 'color': None,
 'marker': 'None',
 'markerfacecolor': None,
 'markersize': None,
 'xaxis': False,
 'ratioline': False}

sherpa> plot_trace(stats, name='Stats')
sherpa> plt.savefig('trace_stats.png')

Figure 2: Trace Plot of Fit Statistic Values

[Trace plot of fit statistic array returned by get_draws.]
[Print media version: Trace plot of fit statistic array returned by get_draws.]

Figure 2: Trace Plot of Fit Statistic Values


Model Parameter Values

We may also examine the distribution of model parameters resulting from the simulations, stored in the params array returned by get_draws. Like the other data arrays output by get_draws, this array is a standard NumPy ndarray, which means we can obtain information about the array using Python syntax, for example:

sherpa> print(params.shape)
(3, 10001)

sherpa> print(params.dtype)
float64

sherpa> print(params.size)
30003

By manipulating the params array in this way, we can access the simulation results for any of the three model parameters sampled. Below, we filter params to return just the values for iterations 1011 to 1021 of the first parameter, abs0.nH:

sherpa> print(params[0, 1010:1020])
[0.06226211 0.06226211 0.06658911 0.06658911 0.06996455 0.06996455
 0.06996455 0.07968891 0.07471671 0.07655606]

Likewise, we can list the parameter values for the second parameter, p1.gamma, over this range:

sherpa> print(params[1, 1010:1020])
[1.22337425 1.22337425 1.23948616 1.23948616 1.23803053 1.23803053
 1.23803053 1.26654783 1.2588446  1.26032146]

This range was chosen at random, and has no "theoretical" significance. Figure 5 shows how the parameter values vary together.

The probability distribution of parameter values may be plotted with the plot_pdf command, as shown below for the gamma parameter:

sherpa> plot_pdf(params[1, :], name=r'$\Gamma$')

Figure 3: Probability Distribution of a Selected Parameter

[Plot of the probability distribution of the gamma parameter values from the simulations]
[Print media version: Plot of the probability distribution of the gamma parameter values from the simulations]

Figure 3: Probability Distribution of a Selected Parameter

The plot_cdf function displays the cumulative distribution, with the median and 1σ bounds marked on the plot:

sherpa> plot_cdf(params[1, :], name=r'$\Gamma$')

Figure 4: Cumulative Distribution of a Selected Parameter

[Plot of the cumulative probability distribution of the gamma parameter values from the simulations]
[Print media version: Plot of the cumulative probability distribution of the gamma parameter values from the simulations]

Figure 4: Cumulative Distribution of a Selected Parameter

To access all information about the displayed distribution, the get_pdf_plot and get_cdf_plot functions may be used; for example:

sherpa> print(get_cdf_plot())
points = [1.1983,1.1983,1.1983,1.1983,1.2011,1.2011,1.2011,1.2011,1.2017,1.2018,
 1.2019,1.2019,1.2024,1.203 ,1.2038,1.2038,1.2038,1.2041,1.2043,1.2055,
...
 1.284 ,1.2854,1.2866,1.2866,1.2866,1.2881,1.2882,1.2883,1.2884,1.2929,
 1.2974]
x      = [9.9990e-05,1.9998e-04,2.9997e-04,3.9996e-04,4.9995e-04,5.9994e-04,
 6.9993e-04,7.9992e-04,8.9991e-04,9.9990e-04,1.0999e-03,1.1999e-03,
...
 9.9960e-01,9.9970e-01,9.9980e-01,9.9990e-01,1.0000e+00]
y      = [1.2439,1.2476,1.2443,1.2315,1.2668,1.2668,1.2229,1.2295,1.2295,1.2295,
 1.2258,1.2419,1.2419,1.2419,1.2419,1.2428,1.2339,1.2191,1.2499,1.2418,
...
 1.2382,1.2483,1.2578,1.2588,1.2202,1.2463,1.2602,1.2468,1.2458,1.2261,
 1.2261]
median = 1.2441830280630497
lower  = 1.2314057569145453
upper  = 1.2565747593145233
xlabel = x
ylabel = p(<=x)
title  = CDF: $\Gamma$
plot_prefs = {'xlog': False, 'ylog': False, 'label': None, 'xerrorbars': False, 'yerrorbars': False, 'color': None, 'linestyle': '-', 'linewidth': None, 'marker': 'None', 'alpha': None, 'markerfacecolor': None, 'markersize': None, 'ecolor': None, 'capsize': None}

To access the values for the median of the distribution, and the lower and upper bounds, we can use the function directly, as follows:

sherpa> get_cdf_plot().median
        1.2441830280630497

sherpa> get_cdf_plot().lower
        1.2314057569145453

sherpa> get_cdf_plot().upper
        1.2565747593145233

We can also display the values of the parameters in a scatter plot, showing correlations between different parameters. Here we display abs0.nH and p1.gamma, showing parameter correlation typical of an X-ray data set.

sherpa> plot_scatter(params[0, :], params[1, :], name=r'n$_H$ vs. $\Gamma$', xlabel='n$_H$', ylabel=r'$\Gamma', markersize=1)

Figure 5: Scatter Plot of Two Correlated Model Parameters

[Scatter plot showing correlation of NH and Gamma model parameter values]
[Print media version: Scatter plot showing correlation of NH and Gamma model parameter values]

Figure 5: Scatter Plot of Two Correlated Model Parameters

Finally, we can access the data arrays and preferences defining the scatter plot with the get_scatter_plot function:

sherpa> print(get_scatter_plot())
x      = [0.0722,0.0705,0.0747,0.0667,0.0805,0.0805,0.063 ,0.0618,0.0618,0.0618,
 0.0688,0.0722,0.0722,0.0722,0.0722,0.0738,0.0656,0.0662,0.0707,0.0743,
...
 0.0758,0.0804,0.0745,0.0762,0.0662,0.0729,0.0733,0.0692,0.0695,0.0667,
 0.0667]
y      = [1.2439,1.2476,1.2443,1.2315,1.2668,1.2668,1.2229,1.2295,1.2295,1.2295,
 1.2258,1.2419,1.2419,1.2419,1.2419,1.2428,1.2339,1.2191,1.2499,1.2418,
...
 1.2382,1.2483,1.2578,1.2588,1.2202,1.2463,1.2602,1.2468,1.2458,1.2261,
 1.2261]
yerr   = None
xerr   = None
xlabel = n$_H$
ylabel = $\Gamma$
title  = Scatter: n$_H$ vs. $\Gamma$
plot_prefs = {'xlog': False, 'ylog': False, 'label': None, 'xerrorbars': False, 'yerrorbars': True, 'color': None, 'linestyle': 'None', 'linewidth': None, 'marker': '.', 'alpha': None, 'markerfacecolor': None, 'markersize': None, 'ecolor': None, 'capsize': None}

This can be compared to the standard error analysis; for example overplotting the reg_proj contours with:

sherpa> reg_proj(abs0.nh, p1.gamma, overplot=True)
WARNING: data set 1 has associated backgrounds, but they have not been subtracted, nor have background models been set

Figure 6: Comparing scatter to region-projection

[The one-, two-, and three-sigma contours added by the reg_proj command seem to match the data (but a more-quantitative analysis is really needed than this visual comparison).]
[Print media version: The one-, two-, and three-sigma contours added by the reg_proj command seem to match the data (but a more-quantitative analysis is really needed than this visual comparison).]

Figure 6: Comparing scatter to region-projection


Scripting It

The file, fit.py , performs the primary commands used above. It can be executed by typing '%run -i fit.py' on the Sherpa command line. The Sherpa script command may be used to save everything typed on the command line in a Sherpa session:

sherpa> script(filename="sherpa.log", clobber=False)

(Note that restoring a Sherpa session from such a file could be problematic since it may include syntax errors, unwanted fitting trials, et cetera.)


History

15 Dec 2011 new for CIAO 4.4
11 Dec 2013 reviewed for CIAO 4.6: no changes
19 Mar 2015 updated for CIAO 4.7, no content change.
09 Dec 2015 reviewed for CIAO 4.8, no content change.
01 Dec 2016 reviewed for CIAO 4.9, no content change.
29 May 2018 reviewed for CIAO 4.10, no content change.
13 Dec 2018 reviewed for CIAO 4.11, no content change.
10 Dec 2019 Updated for CIAO 4.12, changing ChIPS to matplotlib; added a workaround for plot_cdf being broken in CIAO 4.12.
21 Dec 2020 Updated for plot changes in CIAO 4.13 and to note that plot_cdf works again.
28 Mar 2022 reviewed for CIAO 4.14, fixed typos, no content change.
06 Dec 2022 reviewed for CIAO 4.15, no content change.
28 Nov 2023 reviewed for CIAO 4.16, fixed typos, updated returned outputs, no content change.
13 Dec 2024 reviewed for CIAO 4.17, updated returned outputs and figures.