Last modified: December 2020

URL: https://cxc.cfa.harvard.edu/sherpa/ahelp/datastack.html
AHELP for CIAO 4.17 Sherpa

datastack

Context: data

Synopsis

Sherpa extension package for modeling stacks of data.

Syntax

from sherpa.astro import datastack

Description

Datastack is a Sherpa extension package for manipulating a stack of related datasets and simultaneously fitting a model to them. It provides stack-enabled (i.e. vectorized) versions of the key Sherpa commands used to load data, set source models, get and set parameters, fit, and plot. The current implementation is based on Tom Aldcroft's contrib package (https://cxc.harvard.edu/contrib/datastack/) wrapping Sherpa functions. The addition of the CIAO DM stack syntax allows reading and filtering the stacks of data using standard DM filtering syntax.

For X-ray spectral analysis, a datastack means that a group of related datasets, for instance 10 observations of the same source taken at different times, can be analyzed as if they were a single merged dataset. This has the important advantage of using the appropriate individual response functions while hiding the complexity (or tedium) of defining models and parameters for all the datasets.

Loading the Routines

This package needs to be imported to a Sherpa session, for example:

from sherpa.astro import datastack

Datastack File Input

The file stack syntax (e.g. '@file.lis') is available for Sherpa load_* commands. The details of the DM stack concept can be found in the "stack" ahelp file.

The following load_* functions support file stack syntax:

The datastack-specific load_* calls are slightly different than the default Sherpa calls, as they do not take a data ID. The "@" character entered with the filename signals a stack file and all the files listed in the "@file.lis" are added to the datastack. Individual files can be added to the datastack later. You can also use multiple datastack identifiers, providing a way to treat several datastacks in one Sherpa session.

sherpa> from sherpa.astro.datastack import *
sherpa> load_data("@file.lis")
read ARF file obs1/q9407.warf
read RMF file obs1/q9407.wrmf
read ARF (background) file obs1/q9407_bkg.warf
read RMF (background) file obs1/q9407_bkg.wrmf
read background file obs1/q9407_bkg.pi
read ARF file obs2/q9708.warf
read RMF file obs2/q9708.wrmf

After importing the datastack package, the Sherpa load_data command loads the PHA files listed in the file.lis stack, as well as the associated response files pointed to by the headers of those PHA files.

sherpa> show_stack()
1: obs1/q9407.pi OBS_ID: 9407 MJD_OBS: 54437.6143274
2: obs2/q9708.pi OBS_ID: 9708 MJD_OBS: 54445.5404309

unix% more file.lis
obs1/q9407.pi
obs2/q9708.pi

Note that after importing datastack with "from sherpa.astro.datastack import *", datastacks are used by default to manage Sherpa data IDs. New files can either be added to the existing datastack, or input as a separate file. The default datastack is referred to as [].

The following syntax is allowed:

An example of an unsupported syntax is:

load_arrays

The datastack version of load_arrays has a slightly different interface than the one in standard Sherpa. In standard Sherpa, load_arrays must always get an ID as the first argument.

load_arrays accepts an array of arrays as the first argument when no datastack identifier is provided, as in the example below:

x1 = ... an array ...
y1 = ... an array ...
x2 = ... an array ...
y2 = ... an array ...
x3 = ... an array ...
y3 = ... an array ...

sherpa> load_arrays([[x1,y1], [x2,y2], [x3,y3]])

Datastack Identifier

By default, the datastack is identified by '[]'. In general, it can also be:

For example:

sherpa> ds = DataStack()

'ds' is both a datastack instance and identifier, and another valid identifier is:

sherpa> ds[1,2] 

meaning ds[1,2] is an identifier for datasets 1 and 2 from the datastack 'ds'. If picking an identifier from the default datastack, the syntax:

sherpa> [3,4]

may be used, where [3,4] is the datastack identifier for datasets 3 and 4 from the default datastack.

Datastack Functions

The following functions are compatible with datastacks:

In order for the function to operate on a datastack, the datastack identifier has to be included as the first argument of that function.

Setting Models for a Datastack

The following model function can be used for assigning a model to a datastack:

They can be used to assign a model to all the datasets in a datastack or to a dataset subset with the data indicator '__ID'.


Examples

Example 1

sherpa> from sherpa.astro.datastack import *
sherpa> load_data("@file.lis")

Import the datastack package into Sherpa and use the load_data command to load the PHA files listed in file.lis into the default datastack instance.

Example 2

sherpa> set_source([],'xsphabs.abs1*xsapec.kt1')

Here the same model is assigned to both datasets in the default datastack, identified by '[]'.

sherpa> show_source()
Model: 1
(xsphabs.abs1 * xsapec.kt1)
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   abs1.nH      thawed            1            0       100000 10^22 atoms / cm^2
   kt1.kT       thawed            1        0.008           64        keV
   kt1.Abundanc frozen            1            0            5           
   kt1.redshift frozen            0       -0.999           10           
   kt1.norm     thawed            1            0        1e+24           

Model: 2
(xsphabs.abs1 * xsapec.kt1)
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   abs1.nH      thawed            1            0       100000 10^22 atoms / cm^2
   kt1.kT       thawed            1        0.008           64        keV
   kt1.Abundanc frozen            1            0            5           
   kt1.redshift frozen            0       -0.999           10           
   kt1.norm     thawed            1            0        1e+24           

These data will be simultaneously fit, assuming this model.

Example 3

sherpa> set_source([1,2], "powlaw1d.p__ID")
sherpa> set_source([3,4], "brokenpowerlaw.bp__ID")

These commands set different models for different subsets of the default datastack.

sherpa> show_source()
Model: 1
powlaw1d.p1
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   p1.gamma     thawed            1          -10           10           
   p1.ref       frozen            1 -3.40282e+38  3.40282e+38           
   p1.ampl      thawed            1            0  3.40282e+38           

Model: 2
powlaw1d.p2
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   p2.gamma     thawed            1          -10           10           
   p2.ref       frozen            1 -3.40282e+38  3.40282e+38           
   p2.ampl      thawed            1            0  3.40282e+38           

Model: 3
brokenpowerlaw.bp3
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   bp3.refer    frozen         5000  1.17549e-38  3.40282e+38  angstroms
   bp3.ampl     thawed            1  1.17549e-38  3.40282e+38  angstroms
   bp3.index1   thawed          0.1          -10           10           
   bp3.index2   thawed         -0.1          -10           10           

Model: 4
brokenpowerlaw.bp4
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   bp4.refer    frozen         5000  1.17549e-38  3.40282e+38  angstroms
   bp4.ampl     thawed            1  1.17549e-38  3.40282e+38  angstroms
   bp4.index1   thawed          0.1          -10           10           
   bp4.index2   thawed         -0.1          -10           10           

The use of the '__ID' data indicator assigns a model to all the datasets in a datastack of subset beginning with the model component name preceding the data indicator. The model parameters are assigned names with the string '__ID' replaced by 1,2,3,... as appropriate, as seen in the above examples.

Example 4

sherpa> freeze([], 'abs1')
sherpa> thaw(ds, 'poly.c0')
sherpa> thaw([1,3], 'poly.c1')

These are examples of different ways to freeze and thaw parameters using the various datastack instances.


Utility Functions for Datastacks

These functions operate on datstack instances.

The show_stack function lists the datasets currently in the stack and it displays the contents of the OBS_ID and MJD_OBS cards in each file header, if available.

Querying Datastacks

Datastack instances can be queried to find which datasets in the stack match some given requirements. The two main methods to do so are:

query_by_header_keyword is useful for any datasets that have a 'header' attribute whose elements can be accessed, for example as a Python dictionary (e.g. header['XTENSION']), like a DataPHA.

query(func) is a general approach, and accepts any function that is defined according to the following signature:

func(dataset) returns Boolean

The query function will return a list of all dataset IDs in the stack for which 'func' returns True.

From these two methods, one can derive any number of more specific methods. The output of the query functions can be used as a stack identifier to perform some actions, e.g.:

sherpa> f = query_by_header_keyword('INSTRUME', 'ACIS')
sherpa> set_source(f, "powlaw1d.p__ID")

or

sherpa> set_source(ds[f], "powlaw1d.p__ID")

A third call is implemented by default:

This query returns the datasets(s) matching an OBS_ID, by using the OBS_ID keyword in the file header, if available.

Cleaning a Session

The datastack module offers the clean() function, that clears all models, datasets, and invokes Sherpa's clean() function.

Object Oriented Approach

In this approach, the user does not need to "import *", and can just use the objects defined in the datastack module, as shown below:

sherpa> from sherpa.astro import datastack as dsmod
sherpa> ds = dsmod.DataStack()
sherpa> dsmod.set_template_id("ID")
sherpa> ds.load_pha("@pha.lis")
sherpa> ds.show_stack()
sherpa> f = ds.query_by_header_keyword('INSTRUME', 'ACIS')
sherpa> ds[f].set_source("powlaw1d.pID")

Changes in CIAO 4.13

The ungroup and unsubtract commands have been added to the datastack module.


Bugs

See the bugs pages on the Sherpa website for an up-to-date listing of known bugs.

See Also

data
copy_data, dataspace1d, dataspace2d, delete_data, fake, get_axes, get_bkg_chisqr_plot, get_bkg_delchi_plot, get_bkg_fit_plot, get_bkg_model_plot, get_bkg_plot, get_bkg_ratio_plot, get_bkg_resid_plot, get_bkg_source_plot, get_counts, get_data, get_data_contour, get_data_contour_prefs, get_data_image, get_data_plot, get_data_plot_prefs, get_dep, get_dims, get_error, get_quality, get_specresp, get_staterror, get_syserror, group, group_adapt, group_adapt_snr, group_bins, group_counts, group_snr, group_width, load_arf, load_arrays, load_ascii, load_bkg, load_bkg_arf, load_bkg_rmf, load_data, load_grouping, load_image, load_multi_arfs, load_multi_rmfs, load_pha, load_quality, load_rmf, load_staterror, load_syserror, load_table, pack_image, pack_pha, pack_table, set_data, set_quality, ungroup, unpack_arf, unpack_arrays, unpack_ascii, unpack_bkg, unpack_data, unpack_image, unpack_pha, unpack_rmf, unpack_table
filtering
get_filter, load_filter, set_filter
info
get_default_id, list_bkg_ids, list_data_ids, list_response_ids
modeling
add_model, add_user_pars, clean, load_table_model, load_template_interpolator, load_template_model, load_user_model, save_model, save_source
plotting
plot_data, set_xlinear, set_xlog, set_ylinear, set_ylog
saving
save_arrays, save_data, save_delchi, save_error, save_filter, save_grouping, save_image, save_pha, save_quality, save_resid, save_staterror, save_syserror, save_table
statistics
load_user_stat
utilities
calc_data_sum, calc_data_sum2d, calc_ftest, calc_kcorr, calc_mlr, calc_model_sum2d, calc_source_sum2d, get_rate
visualization
contour, contour_data, contour_ratio, histogram1d, histogram2d, image_data, rebin