Last modified: 11 December 2023

URL: https://cxc.cfa.harvard.edu/sherpa/snapshot.html

Sherpa Snapshot


This page offers a bird's-eye view of the Sherpa software, leaving the detailed and exhaustive list of commands, features, and functionality to other portions of the website. The goal is to give the new or returning Sherpa user a quick and easy route through a basic fitting session, without having to wade through the extensive collection of Sherpa threads, help files, and other references just to learn or recall the basics, such as loading data, plotting data, or fitting a model to data.

The format of the page is essentially a cross between a Sherpa command reference, a command help file, and an analysis thread (users should refer to those separate areas of the website for any information not found here). It is like a Sherpa command reference in that it lists the commands available for performing a given function; e.g., the group_* commands for binning a data set in various ways. It is reminiscent of the "Examples" section of a Sherpa help file in that it shows how to use a given command for a variety of data types; e.g., how to load data from a PHA Type I versus Type II file with the load_pha command. Finally, the sections are organized like those of a Sherpa thread, to indicate the flow of analysis in a Sherpa session; e.g., data should be filtered before being grouped, before moving on to defining model expressions for fitting, and so on.


Loading Data

Load source, background, and instrument response data from FITS table and image files, as well as ASCII tables.

Background data and ARF and RMF responses may be loaded separately from the source data, or automatically read into the session when the source data is loaded, as long as the appropriate filenames are recorded in the source file header.

Non-grating source spectrum read from a Chandra Type I PHA file.

load_pha("source.pi.fits")    # Load a source data set and associated
                              # responses, and assign to default data
                              # set 1.
load_arf("arf.fits")
load_rmf("rmf.fits")

load_bkg("bkg.pi.fits")
load_bkg_arf("bkg.arf.fits")  # These two are not needed if the ANCRFILE/RESPFILE 
load_bkg_rmf("bkg.rmf.fits")  # keywords are set in bkg.pi.fits

subtract()                    # subtract the background data from default source data set 1

Grating source spectra read from rows of a Chandra Type II PHA file.

load_pha("pha2[TG_M=-1]")  # Load the HEG and MEG -1 order spectra and responses,
 or
load_pha("pha2[#row=3,9]")   # and assign to data sets 1 and 2. 

load_arf(1, "heg_-1.arf")
load_arf(2, "meg_-1.arf")

load_rmf(1, "heg_-1.rmf")
load_rmf(2, "meg_-1.rmf")

1-D radial profile data columns read from a FITS table file.

# Load columns containing the radial midpoint of 
# annular source regions, net counts per square pixel,
# and the error on the net counts per square pixel.
load_data("rprofile_rmid.fits[cols rmid,sur_bri,sur_bri_err]") 

The first three x, y, and y error data columns read from an ASCII table.

load_data("data.dat", 3)

Counts data loaded from a 2-D file.

load_image("image.fits")

Loading multiple related data sets

Sherpa allows multiple data sets to be loaded and analyzed or fit, either individually, in subgroups, or all together. The datastack module extends the Sherpa syntax (i.e., it provides versions of many common Sherpa functions) to simplify running commands on multiple data sets.

# It is possible to say 'from sherpa.astro.datastack import *'
# to avoid needing the 'datastack.' prefix.
#
from sherpa.astro import datastack

# The text file src.lis contains multiple file names, one per line
datastack.load_pha("@src.lis")

# Show the files
datastack.show_stack()

# Set up a single source model for all the datasets (using the syntax []
# to refer to all the data sets in the stack)
datastack.set_source([], xsphabs.gal * xspowerlaw.pl)

# Subtract the background from each data set
datastack.subtract([])

# The default notice command already works on all data sets, so
# no need to use the datastack version
notice(0.5, 7)

# Appending __ID to a component name means that a component will
# be created for each data set, with the __ID text replaced by
# an integer. In this case an absorbed APEC model is fit to
# each data set, and each data set has its own normalization for
# the APEC model, set by the scale1d model (this is chosen over
# const1d because its integrate flag is not set, and hence it is
# not scaled by the bin width).
#
datastack.set_source([], scale1d.c__ID * xsphabs.gal * xsapec.src)
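# Freeze the APEC normalization so that the per-dataset scale1d
# components alone control the amplitude of each model.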
freeze(src.norm)

fit()

# This will create a plot window for each data set.
datastack.plot_fit([])

# Remove the models and data sets in the stack
datastack.clear_models()
datastack.clear_stack()

Setting Data Units

Data analysis in Sherpa may be done in energy, wavelength, or channel space, with the default being energy for PHA imaging data and channel for grating data. The analysis units of a particular session can be set and checked using the following commands.

get_analysis()

set_analysis("wave")
set_analysis("energy")
set_analysis("channel")

When using image data, the analysis can be done in logical, physical, or world coordinates (although use of the world coordinate system is strongly discouraged, since the models assume a Cartesian coordinate system).

# Return the current coordinate setting
get_coord()

# set the coordinate setting for the default data set
set_coord("physical")

# set the coordinate setting for the data set with label "clus"
set_coord("clus", "physical")

Filtering Data

If you do not wish to display or model the full range of source data loaded into a session, you may use the following commands to mask one or multiple data ranges, as well as check the current data filter setting.

Note: filtering should be done before grouping so that the grouping is only applied to the filtered data.

show_filter()

notice()              # Include the full range of data in the analysis.
notice(5.)            # Include data between 5 keV and the high endpoint.
notice(hi=10.)        # Include all data from the low endpoint up to 10 keV.
notice("0.1:5, 6:7")  # Include data in the 0.1 - 5 keV and 6 - 7 keV ranges.
notice_id(2, 1., 6.)  # Include only the 1.0-6.0 keV data range, only in data set 2.

ignore()              # Mask the full range of data.
ignore(5.)            # Mask data above 5 keV.
ignore(hi=10.)        # Mask all data up to 10 keV.
ignore("0.1:5, 6:7")  # Mask data in the 0.1 - 5 keV and 6 - 7 keV ranges.
ignore_id(2, 1., 6.)  # Mask the 1.0-6.0 keV range only in data set 2.

Grouping Data

The following commands are available for "grouping" a spectral data set in Sherpa, i.e., combining energy, wavelength, or channel bins until there are enough counts per group for spectral fitting.

Note: grouping should be done after filtering so that the grouping is only applied to the filtered data.

group_bins(23)            # Group default data set 1 into 23 bins.

group_bins(23, bkg_id=1)  # Group the background associated with source
                          # data set 1 into 23 bins.

group_counts(3, 20)       # Group data set 3 to include a minimum of 20 counts
                          # per bin.

group_snr(2, 10)          # Group data set 2 so that each group of bins has
                          # a minimum SNR of 10.

group_adapt(4, 13)        # Adaptively group low SNR regions in data set 4,
                          # with at least 13 counts per group of bins.

group_adapt_snr(100)      # Adaptively group low SNR regions in data set 1,
                          # with a minimum SNR per group of 100.

group_width("src1", 16)   # Divide the bins of data set "src1" into
                          # groups of 16.

[Updated] In CIAO 4.16 the group_xxx calls now automatically set the tabStops argument so that the grouping is done using the existing filter. Arguments like tabStops=~get_data().mask are therefore no longer needed, but will still work.
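
For example, a minimal sketch of the filter-then-group workflow (reusing the source.pi.fits file loaded above):

load_pha("source.pi.fits")
notice(0.5, 7.)       # Filter first ...
group_counts(20)      # ... then group: channels outside the filter are now
                      # skipped automatically, with no tabStops argument needed.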


Displaying Data

A 1-D source or background data set may be displayed using the appropriate plot_* command, and 2-D data with an image_* command. 1-D data are plotted in the units currently set in the session, with the default being Counts/sec/keV vs keV; 2-D data are displayed in Counts in the DS9 frame.

The data units of the y-axis of a plot may be changed to Counts using the appropriate set_analysis setting, and data may also be plotted in photon flux units; see the Sherpa threads and help files for instructions.
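
For example, a minimal sketch (see "ahelp set_analysis" for the full argument list; the third argument selects "rate" or "counts" for the y-axis):

set_analysis(1, "energy", "counts")  # y-axis in Counts rather than Counts/sec
plot_data()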

plot_data()                  # Plot data set 1 in default Counts/sec/keV vs keV units.

plot_data(2, overplot=True)  # Display data set 2 in a plot window
                             # previously opened and displaying other data.  

plot_bkg()                   # Plot the background associated with data set 1.

plot_arf()                   # Plot the ARF response associated with data set 1.

image_data(3)                # Open an image of 2-D data set 3 in DS9.

The plot commands can take optional arguments to control the plot appearance, such as axis scaling and line properties. The following example shows data sets 1 and 2 in the same plot, using logarithmic scales for both axes, and with the second data set drawn with an alpha transparency of 0.5.

plot_data(xlog=True, ylog=True)
plot_data(2, overplot=True, alpha=0.5)

Defining Model Expressions

Sherpa provides functionality for defining both single and multi-component model expressions for fitting to data, where the associated ARF and RMF responses can be automatically convolved with the expression, or manually applied to individual model components (e.g., to model an instrumental background component of a multi-component background model expression, which should not get folded through the instrument response).

The set_source/set_model and set_bkg commands automatically convolve the ARF*RMF instrument response with the defined model expression before fitting, while set_full_model/set_bkg_full_model are available for manually applying (or not applying) the instrument response to individual components of a full, multi-component model expression.

# Browse available Sherpa models.
list_models()

# Assign Sherpa model 'powlaw1d' to data set 1, and give it model ID "p1".
set_source(powlaw1d.p1)
 
# Print the parameters for the p1 model component
print(p1)

# Assign the sum of the 'beta2d' and 'const2d' models
# to 2-D data set 3, and give them model IDs "b1" and "c1".
set_source(3, beta2d.b1 + const2d.c1)

# Assign an absorbed power-law model to the second background of data set 2.
set_bkg(2, xsphabs.a1 * p1, 2)

# Display model parameter values and ranges for all models assigned to data
# sets in the session.
show_model()

# Display the current source model for data set 2
src2 = get_source(2)
print(src2)

# Loop over the individual components in the source expression
for cpt in src2.parts:
    print(cpt)
    print("")

Setting model parameter values

set_par(g1.pos, val=15.5, freeze=True)   # Set the mean position
                                         # parameter of a gauss1d
                                         # model named "g1" to value
                                         # 15.5 and do not allow
                                         # to vary during a fit. 

p1.gamma = 0.8                           # Set initial model parameter values
p1.gamma.min = 0.3                       # and ranges for a power-law model
p1.gamma.max = 10.                       # component named "p1".

freeze(p1)                               # Freeze and thaw individual or sets of
thaw(p1.ampl)                            # model parameters before a fit.

Manually defining model expressions

rsp = get_response()                     # Store the source and background
bkg_rsp = get_response(bkg_id=1)         # ARF*RMF responses associated with
                                         # data set 1 to variables for
                                         # use in the set_*_full_model() expressions 
 

# Define the background scaling factor to use in the set_full_model()
# expression, which includes both source and background components for
# data set 1.

bkg_scale = get_exposure()*get_backscal()/get_exposure(bkg_id=1)/get_backscal(bkg_id=1)


# Manually define the full source and background model expressions
# for data set 1. Apply separate instrument responses to the source
# and background components, and scale the background component in
# the source model expression to account for differences in exposure 
# time and extraction region area.

set_full_model(rsp(xsphabs.abs1*powlaw1d.p1) + bkg_scale*bkg_rsp(abs1*powlaw1d.p2))

set_bkg_full_model(bkg_rsp(abs1*p2))

Displaying Models

As with plotting data, models can be plotted with plot_model (convolved with any instrument model) and plot_source (no instrument model).

# Show the models from the first and second datasets. This includes the
# ARF and RMF files for PHA datasets (the output of the
# get_model function).
#
plot_model()
plot_model(2, overplot=True)

# Show the models from the first and second datasets. This shows the model
# before passing through the instrument response (for PHA datasets) and
# corresponds to the output of get_source.
#
plot_source()
plot_source(2, overplot=True)

For PHA data sets there is a difference between

plot_fit()

and

plot_data()
plot_model(overplot=True)

as the first version displays the model grouped to match the data, whereas the second uses the binning of the instrument response to display the model.


Simulating Data

The Sherpa fake* commands may be used to simulate a data set, e.g., to assist in proposal writing. To simulate a PHA data set, a source model expression must be defined and assigned to a data set ID, such as "faked", which is then populated with simulated data values by the fake_pha command, as shown below.

To simulate a generic data set with the fake command, an extra step is required: you must first create a blank data set on a specified grid before defining a model expression and running the simulation.

Simulating a PHA data set

set_source("faked", "xsphabs.abs1*xspowerlaw.p1")
fake_pha("faked", arf="arf.fits", rmf="rmf.fits", exposure = 1200, grouped=False)

Simulating a generic 1-D or 2-D data set

dataspace1d(0.1,10,0.01)
set_source(gauss1d.g1)
fake()

dataspace2d([256,256])
set_source(beta2d.bb+const2d.cc)
fake()

Setting Fit Parameters

The fit statistic and optimization method to be used in fitting a data set may be specified with the appropriate Sherpa set_* commands, and the current settings may be checked with show_* commands. The default fit statistic is Chi2Gehrels, and the default method is Levenberg-Marquardt.

list_stats()              # List available fit statistics and methods
list_methods()

show_stat()
show_method()

set_stat("cash")         
set_method("neldermead")

Fitting Data

The Sherpa fit* functions provide the flexibility of fitting one or multiple source and/or background data sets, either individually or simultaneously.

fit()       # Fit all current data sets with their assigned models.
fit(1,2)    # Fit only data sets 1 and 2.
fit(4)      # Fit only data set 4.
 
fit_bkg()   # Fit only the current background data sets, ignoring the
            # associated source data.

fit_bkg(2)  # Fit only the background associated with source data set 2.

Displaying Fit Results

The Sherpa plot_fit* commands may be used to display a fitted model curve overlaid on the data, in default units of Counts/sec/keV versus keV for 1-D PHA data. Fit residuals may also be plotted using the appropriate command, as measured counts minus predicted counts, and delta chi values as residuals divided by uncertainties.

For fitted 2-D data, the image_* commands may be used to display in DS9 the images of the fitted model, residuals, and data-to-model ratio, in units of counts.

plot_fit()           # Display the fitted model and residuals 
plot_fit_delchi()    # for 1-D data set 1.
plot_fit_resid()


image_fit(4)         # Display the fitted model and residuals
image_resid(4)       # for 2-D data set 4.
image_ratio(4)

get_fit_results()    # Return the fit statistics resulting from
                     # the most recent fit in the session.

Calculating Fit Statistics and Fluxes

The Sherpa calc* commands are available for calculating the fit statistics for a given data set and its assigned model, as well as for calculating unconvolved, integrated model photon and energy fluxes.

calc_stat(3)                   # Return the initial fit statistic for data set 3.

calc_stat_info()               # Return the goodness-of-fit statistics resulting
                               # from the most recent fit in the session.

calc_photon_flux()             # Return the integrated, unconvolved model photon
                               # flux over the entire data range of the default 
                               # data set 1.

calc_energy_flux(0.1,7.0,1,2)  # Return the integrated, unconvolved
                               # model energy flux in the 0.1-7. keV
                               # data range of the second background
                               # associated with source data set 1.

Estimating Errors on Model Parameters

The confidence intervals on model parameter values resulting from a Sherpa fit may be calculated using the confidence command, and plotted using the accompanying *_proj commands. The confidence settings may be viewed and edited using the appropriate get and set commands.

get_conf_opt()               # Return the current settings to be used when the 
                             # confidence estimation method is run.

set_conf_opt("fast", False)  # Use the current fit optimization method
                             # in the confidence calculation, instead
                             # of the faster default method "levmar". 

conf()                       # Run the confidence calculation on all current
                             # data sets with assigned models, and
                             # return a table of model parameter
                             # confidence intervals.  

int_proj(p1.gamma)           # Display a confidence plot of fit
                             # statistic versus thawed
                             # parameter value, for the 'gamma' parameter of
                             # the power law model named "p1".

reg_proj(p1.gamma, p1.ampl)  # Display a confidence contour of fit
                             # statistic versus two thawed model parameters,
                             # for the 'gamma' and 'ampl' parameters of
                             # the power law model named "p1".

Bayesian analysis for Poisson statistics

Once a best-fit location has been found, confidence intervals can be estimated using the Markov chain Monte Carlo (MCMC) based routine get_draws, which comes from the Python Bayesian Low-Count X-ray Spectral (pyBLoCXS) package.

# Run the MCMC for 1000 iterations for data set 1
chain = get_draws()

# Increase the number of iterations to 10000
chain = get_draws(niter=10000)

# Use datasets 1, 2, and 3
chain = get_draws(1, [2,3], niter=10000)

# Change the sampler to use the Metropolis Hastings method. The available
# samplers are given by list_samplers
# and the current method is returned by get_sampler_name.
set_sampler("mh")
chain = get_draws()

# Priors can be set on one or more parameters in the fit:
# in this case the prior on the kT parameter of the therm component
# is set to a Gaussian centered at 2.5 keV with a FWHM of 1.2 keV:
create_model_component("normgauss1d", "gprior")
gprior.pos = 2.5
gprior.fwhm = 1.2
set_prior(therm.kt, gprior)
# Once the prior has been set the set_sampler_opt
# command should be used to change the 'defaultprior', 'priorshape',
# and 'originalscale' options for the sampler.
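#
# A sketch of those calls (an assumption, not taken from this page:
# priorshape and originalscale are per-thawed-parameter settings, here
# assuming kT is the second of three thawed parameters; check
# "ahelp set_sampler_opt" for the exact form these options take):
set_sampler_opt("defaultprior", False)
set_sampler_opt("priorshape", [False, True, False])
set_sampler_opt("originalscale", [True, True, True])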

# The return value of get_draws is a tuple listing the statistic value,
# an acceptance flag, and the parameter values for each iteration.
(stats, accept, params) = chain

# Plot a trace of the statistic value versus iteration number
plot_trace(stats, name='Statistic')

# A 1 indicates that the iteration jumped to a new parameter value,
# a 0 that it did not.
plot_trace(accept, name='Acceptance Flag')

# Plot the value of the first fitted parameter per iteration
plot_trace(params[0, :], name='parameter 1')

# A scatter plot of the first two fitted parameters
plot_scatter(params[0, :], params[1, :])

# Plot the cumulative density function of the first parameter
plot_cdf(params[0, :])

# Plot the probability density function of the first parameter
plot_pdf(params[0, :])

# Calculate the one-sigma confidence interval of the third parameter
# (the return values are the median, one-sigma lower value, and
# one-sigma upper value)
from sherpa import utils
(medval, loval, hival) = utils.get_error_estimates(params[2, :])

Saving Data and Sessions

A particular Sherpa fitting session may be saved to either a human-readable or a binary-format file and later restored. 1-D and 2-D data and model arrays may be saved to FITS tables or images, as well as to ASCII files.

save("session1.save")                # Record a Sherpa session to a binary
                                     # format file and  restore later with 
                                     # restore("session1.save").
                         
save_all("script.txt")               # Record everything typed at the command line
                                     # in a session to a text file. Edit the file
                                     # and restore later with "execfile(script.txt)".

save_pha(2, "data2.pha")             # Save PHA data set 2 to a PHA FITS file.

save_image("srcimg","output.fits")   # Save  data set "srcimg" to a FITS
                                     # image file.


# Write the x and y data arrays of data set "src" to a text file named
# "source.ascii", where the output file contains column names "x" and
# "y"

save_data("src", "source.ascii", ["x", "y"])

# Repeat, but write out as a FITS table

save_data("src", "source.fits", ["x", "y"], ascii=False)


# Write the arrays of x and y data values of data set 2 to an ASCII
# file named "data_xy.txt". Note that the default value for the ascii
# option varies with function (True for save_data and save_arrays,
# False for save_model).

d2 = get_data(2)
save_arrays("data_xy.txt", [d2.x, d2.y], ascii=True)

save_model(2, "source.txt", ascii=True)  # Save the convolved source
                                         # model for data set 2