Last modified: 8 Apr 2022

URL: https://cxc.cfa.harvard.edu/sherpa/threads/nustar_sim/

Simulating NuSTAR X-Ray Spectra with Sherpa

Sherpa Threads (CIAO 4.14 Sherpa)


Overview

Synopsis:

This thread illustrates the use of the Sherpa fake_pha command to simulate a hard X-ray emission spectrum of a supernova remnant which might be imaged by the detectors aboard the Nuclear Spectroscopic Telescope Array (NuSTAR) satellite, launched in June 2012. NuSTAR, the first focusing high-energy X-ray mission sensitive across the 5-80 keV range, is aimed at studying objects such as black holes, supernovae, and extremely active galaxies. Here, we use Sherpa to simulate a 1-D PHA spectrum using the currently available NuSTAR imaging ARF and RMF response files, in addition to a source model expression tailored to the Chandra X-ray spectrum of SN 1979C, suspected to harbor an accreting stellar-mass black hole.

Last Update: 8 Apr 2022 - Updated for CIAO 4.14, use mask array to better group spectrum to selected energy band.


Quick Start to Simulating Data

All that is required to simulate a 1-D PHA NuSTAR data set in Sherpa is the following:

  • a source model expression defined with set_source
  • proposal ARF & RMF instrument response files downloaded from the NuSTAR website
  • exposure time for the simulation in seconds

fake_pha will do the rest:

unix% sherpa

# Search available Sherpa models.

sherpa> list_models()
['absorptionedge',
 'absorptiongaussian',
 'absorptionlorentz',
 'absorptionvoigt',
...
 'xszvphabs',
 'xszwabs',
 'xszwndabs',
 'xszxipcf']

#  Define a source model expression and assign it to an ID.

sherpa> emis = xsmekal.tplasma + xsmekal.tplasma2 + xspowerlaw.powlaw
sherpa> set_source("faked", xswabs.gal * emis)

# Set model parameter values.

sherpa> set_par(gal.nH, 0.025, frozen=True)
sherpa> tplasma.redshift = 0.005  # frozen by default
sherpa> tplasma.kT = 1.1401      
sherpa> tplasma.norm = 3.86263e-06 
sherpa> tplasma2.redshift = 0.005  # frozen by default
sherpa> tplasma2.kT = 0.659763    
sherpa> tplasma2.norm = 2.62793e-05 
sherpa> powlaw.PhoIndex = 2.44436     
sherpa> powlaw.norm = 7.87378e-05


# Fake a PHA data set over the grid defined by the input response.

sherpa> fake_pha("faked", arf="point_30arcsecRad_1arcminOA.arf", rmf="nustar.rmf", exposure=200000., backscal=1.0)

# Return information on the faked data set and associated responses.

sherpa> show_data("faked")

sherpa> calc_data_sum(id="faked")
sherpa> calc_energy_flux(id="faked")
sherpa> calc_photon_flux(id="faked") 

# Plot simulated data.

sherpa> plot_data("faked")

# Save simulated data.

sherpa> save_pha("faked", "nustar_sim_200ksec.pha")

This represents the quickest and simplest way to simulate a NuSTAR PHA data set in Sherpa. Simply start CIAO and Sherpa; define a source model expression in Sherpa and assign it to a data set ID such as "faked"; set model parameter values; and then run fake_pha according to your specifications. Read on to find a detailed explanation of each step of this quick-start recipe.


Simulating Data Step by Step:

Downloading Calibration Response Files

To simulate a NuSTAR X-ray spectrum, it is necessary to define an instrument response using the appropriate ARF and RMF files currently available for the mission. NuSTAR response files are available for download from the "For Proposers" page of the NuSTAR website (Caltech); the files used in this thread are:

  • point_30arcsecRad_1arcminOA.arf: unweighted effective area for a 1' off-axis circular extraction region of radius 30''
  • nustar.rmf: RMF, includes detector efficiency

Figure 1: Effective Area Comparison

[Effective Area Comparison]

Plot of the effective area of the combined NuSTAR detector units, compared to those of current X-ray observatories [1].

[1] http://www.nustar.caltech.edu/page/researchers


Establishing the Source Model Expression for the Simulation

We will use the Sherpa fake_pha command to simulate a NuSTAR PHA spectrum using a defined source model expression and instrument response as input. Details on the functionality of fake_pha, with other examples of its usage, are available in the ahelp file and the other Sherpa simulation threads.

The source model chosen for the simulation is defined with the Sherpa set_source command, as follows, where it is assigned to a string data set ID, "faked".

sherpa> emis = xsmekal.tplasma + xsmekal.tplasma2 + xspowerlaw.powlaw
sherpa> set_source("faked", xswabs.gal * emis)

sherpa> set_par(gal.nH, 0.025, frozen=True)

sherpa> tplasma.redshift = 0.005  # frozen by default
sherpa> tplasma.kT = 1.1401      
sherpa> tplasma.norm = 3.86263e-06 

sherpa> tplasma2.redshift = 0.005  # frozen by default
sherpa> tplasma2.kT = 0.659763    
sherpa> tplasma2.norm = 2.62793e-05 

sherpa> powlaw.PhoIndex = 2.44436     
sherpa> powlaw.norm =  7.87378e-05 

Here we use a source model consisting of a combination of two thermal MEKAL components and a power-law, with parameter values chosen to give a good approximation to the Chandra imaging spectra of SN 1979C in the 0.6-3 keV range (Patnaude et al., 2011). We can view the model parameter settings in this case using the print command, as the Sherpa show_source command only displays model information for models already assigned to data sets.

sherpa> print(gal)
xswabs.gal
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   gal.nH       frozen        0.025            0        1e+06 10^22 atoms / cm^2

sherpa> print(tplasma)
xsmekal.tplasma
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   tplasma.kT   thawed       1.1401       0.0808         79.9        keV
   tplasma.nH   frozen            1        1e-06        1e+20       cm-3
   tplasma.Abundanc frozen            1            0         1000           
   tplasma.redshift frozen        0.005       -0.999           10           
   tplasma.switch frozen            1            0            1           
   tplasma.norm thawed  3.86263e-06            0        1e+24           


sherpa> print(tplasma2)
xsmekal.tplasma2
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   tplasma2.kT  thawed     0.659763       0.0808         79.9        keV
   tplasma2.nH  frozen            1        1e-06        1e+20       cm-3
   tplasma2.Abundanc frozen            1            0         1000           
   tplasma2.redshift frozen        0.005       -0.999           10           
   tplasma2.switch frozen            1            0            1           
   tplasma2.norm thawed  2.62793e-05            0        1e+24           

sherpa> print(powlaw)
xspowerlaw.powlaw
   Param        Type          Value          Min          Max      Units
   -----        ----          -----          ---          ---      -----
   powlaw.PhoIndex thawed      2.44436           -3           10           
   powlaw.norm  thawed  7.87378e-05            0        1e+24           

Defining the Instrument Response for the Simulation

Note that the steps in this section are optional, as ARF and RMF (or RSP) filenames may be directly entered into the arf and rmf arguments of the fake_pha expression shown in the next section of this thread. However, if you would like to view the details of or modify the response used in the simulation, you should load the NuSTAR ARF and RMF responses into Sherpa as follows:

sherpa> nustar_arf = unpack_arf("point_30arcsecRad_1arcminOA.arf")
sherpa> nustar_rmf = unpack_rmf("nustar.rmf")
sherpa> print(nustar_arf)
name     = point_30arcsecRad_1arcminOA.arf
energ_lo = Float64[4096]
energ_hi = Float64[4096]
specresp = Float64[4096]
bin_lo   = None
bin_hi   = None
exposure = None
ethresh  = 1e-10


sherpa> print(nustar_rmf)
name     = nustar.rmf
detchans = 4096
energ_lo = Float64[4096]
energ_hi = Float64[4096]
n_grp    = UInt64[4096]
f_chan   = UInt32[1014481]
n_chan   = UInt32[1014481]
matrix   = Float64[5627679]
offset   = 0
e_min    = Float64[4096]
e_max    = Float64[4096]
ethresh  = 1e-10

The response data is loaded into variables nustar_arf and nustar_rmf with the unpack_arf and unpack_rmf commands. These variables will be used to assign the instrument response to the faked data set in the next section of this thread, "Running the Simulation with fake_pha."


Running the Simulation with fake_pha

Simulating the NuSTAR spectrum with Sherpa involves convolving the chosen source model expression with the corresponding NuSTAR response, and applying Poisson noise to the counts predicted by the model. All of these steps are performed by the fake_pha command, which has four required arguments: data set or model ID, ARF, RMF, and exposure time. We set the fake_pha arf and rmf arguments to the nustar_arf/nustar_rmf variables defined in the previous section, and the exposure argument to a value of 200 kiloseconds.

sherpa> fake_pha("faked", arf=nustar_arf, rmf=nustar_rmf, exposure=2e5)

This command associates the data/model ID "faked" with a simulated spectrum which is calculated over the grid defined by the NuSTAR instrument response, using the input exposure time and source model expression. Poisson noise is added to the modeled data.
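
The forward-folding and noise steps described above can be pictured with a small, self-contained sketch. It is not Sherpa's implementation: the three-bin response, the flux values, and the helper names (poisson_draw, fold_and_fake) are hypothetical and chosen only for illustration.

```python
import math
import random


def poisson_draw(mean, rng):
    """Draw one Poisson deviate with Knuth's method (adequate for small means)."""
    limit = math.exp(-mean)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def fold_and_fake(model_flux, arf, rmf, exposure, rng):
    """Fold a photon spectrum through a toy response, then add Poisson noise.

    model_flux[j]: photons/cm^2/s in energy bin j (already integrated over the bin)
    arf[j]:        effective area in cm^2 for energy bin j
    rmf[j][i]:     probability that a photon in energy bin j is recorded in channel i
    """
    n_chan = len(rmf[0])
    predicted = [0.0] * n_chan
    for j, flux in enumerate(model_flux):
        rate = flux * arf[j]  # counts/s contributed by energy bin j
        for i in range(n_chan):
            predicted[i] += exposure * rate * rmf[j][i]
    faked = [poisson_draw(mu, rng) for mu in predicted]
    return predicted, faked


# Hypothetical toy inputs (not the NuSTAR responses).
rng = random.Random(42)
model_flux = [2.0e-5, 1.0e-5, 5.0e-6]
arf = [400.0, 300.0, 100.0]
rmf = [[0.9, 0.1, 0.0],
       [0.1, 0.8, 0.1],
       [0.0, 0.2, 0.8]]
predicted, faked = fold_and_fake(model_flux, arf, rmf, exposure=1000.0, rng=rng)
print("predicted counts:", predicted)
print("faked counts:    ", faked)
```

The predicted counts are deterministic, while the faked counts change with the random seed, which is why repeated fake_pha runs give different totals.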

We may use the show_data command to inspect the basic properties of the new data set; observe that the "name" field reads "faked", and the "exposure" field has been set to the chosen exposure time:

sherpa> show_data()
Data Set: faked
Filter: 1.6000-165.4400 Energy (keV)
Noticed Channels: 1-4096
name           = faked
channel        = Int64[4096]
counts         = Float64[4096]
staterror      = None
syserror       = None
bin_lo         = None
bin_hi         = None
grouping       = None
quality        = None
exposure       = 200000.0
backscal       = None
areascal       = None
grouped        = False
subtracted     = False
units          = energy
rate           = True
plot_fac       = 0
response_ids   = [1]
background_ids = []

RMF Data Set: faked:1
name     = nustar.rmf
detchans = 4096
energ_lo = Float64[4096]
energ_hi = Float64[4096]
n_grp    = UInt64[4096]
f_chan   = UInt32[1014481]
n_chan   = UInt32[1014481]
matrix   = Float64[5627679]
offset   = 0
e_min    = Float64[4096]
e_max    = Float64[4096]
ethresh  = 1e-10

ARF Data Set: faked:1
name     = point_30arcsecRad_1arcminOA.arf
energ_lo = Float64[4096]
energ_hi = Float64[4096]
specresp = Float64[4096]
bin_lo   = None
bin_hi   = None
exposure = None
ethresh  = 1e-10

The calc_data_sum and calc_energy_flux/calc_photon_flux commands may be used to return the total counts and integrated energy/photon flux of the faked data set over the entire energy range (as in this example), or within a specified 'lo' to 'hi' range:

sherpa> calc_data_sum(id="faked")
           246.0

sherpa> calc_energy_flux(id="faked")   # ergs/cm^2/sec
           2.0588096302472468e-13

sherpa> calc_photon_flux(id="faked")   # photons/cm^2/sec
           2.902919956116689e-05
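
To illustrate what an integrated photon or energy flux over a 'lo' to 'hi' band means, here is a minimal sketch that integrates an unabsorbed power law analytically. It uses only the powlaw parameter values quoted in this thread; the function names are hypothetical. Because it ignores the absorption and thermal components, its results are not expected to match the calc_energy_flux and calc_photon_flux values shown above.

```python
import math

KEV_TO_ERG = 1.602176634e-9  # 1 keV expressed in erg


def powerlaw_photon_flux(norm, gamma, lo, hi):
    """Integral of norm * E**-gamma over [lo, hi] keV, in photons/cm^2/s."""
    if gamma == 1.0:
        return norm * math.log(hi / lo)
    p = 1.0 - gamma
    return norm * (hi ** p - lo ** p) / p


def powerlaw_energy_flux(norm, gamma, lo, hi):
    """Integral of E * norm * E**-gamma over [lo, hi] keV, in erg/cm^2/s."""
    if gamma == 2.0:
        return KEV_TO_ERG * norm * math.log(hi / lo)
    p = 2.0 - gamma
    return KEV_TO_ERG * norm * (hi ** p - lo ** p) / p


# Power-law parameters from this thread, integrated over 5-80 keV.
photon_flux = powerlaw_photon_flux(7.87378e-05, 2.44436, 5.0, 80.0)
energy_flux = powerlaw_energy_flux(7.87378e-05, 2.44436, 5.0, 80.0)
print(photon_flux, energy_flux)
```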

Tip: Including Background Contribution

NuSTAR observations often include a significant background contribution to the spectra which, depending on your science goals, should be accounted for in your simulation. The downloaded NuSTAR simulation files package includes background PHA spectral data. These background data can be appropriately scaled and included in the simulated spectra by using the bkg and backscal arguments in fake_pha, as demonstrated below.

  sherpa> nustar_bkg = unpack_bkg("bgd_30arcsec.pha")
  WARNING: file 'nu10002020001A01_sr.arf' not found
  WARNING: file 'nu10002020001A01_sr.rmf' not found

  sherpa> fake_pha("faked", arf=nustar_arf, rmf=nustar_rmf, exposure=200000, 
                   bkg=nustar_bkg, backscal=nustar_bkg.backscal)

The warnings are due to nonexistent response files referenced in the background PHA file header and will not affect the simulated spectrum. Comparing the counts and fluxes with those of the earlier simulation, which excluded the background arguments, shows that the background data have been incorporated into the results simulated with the same model.

  sherpa> calc_data_sum(id="faked")
             1450.0

  sherpa> calc_energy_flux(id="faked")   # ergs/cm^2/sec
             2.0585882218932305e-13

  sherpa> calc_photon_flux(id="faked")   # photons/cm^2/sec
             2.9023013208479222e-05
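
As a rough illustration of what scaling a background spectrum "appropriately" usually means, the sketch below applies the conventional exposure-time and BACKSCAL ratios to a list of background counts. It is an assumption that fake_pha follows exactly this convention internally. The function name and the numbers are hypothetical.

```python
def scale_background(bkg_counts, src_exposure, bkg_exposure,
                     src_backscal, bkg_backscal):
    """Scale background counts to the source exposure and extraction area."""
    factor = (src_exposure / bkg_exposure) * (src_backscal / bkg_backscal)
    return [c * factor for c in bkg_counts]


# Hypothetical values: a background taken with half the source exposure and
# the same BACKSCAL, so only the exposure ratio changes the counts.
scaled = scale_background([10, 0, 4],
                          src_exposure=200000.0, bkg_exposure=100000.0,
                          src_backscal=1.0, bkg_backscal=1.0)
print(scaled)
```

Passing backscal=nustar_bkg.backscal, as in the command above, corresponds to a BACKSCAL ratio of one in this picture.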
  

The simulated data set may be visualized with the plot_data command, as shown below, and saved with plt.savefig. Before plotting, we group the counts in the faked data spectrum such that each bin contains at least 1 count, and set both the X- and Y-axes of the plot to a logarithmic scale.

sherpa> group_counts("faked", 1)

sherpa> set_xlog()
sherpa> set_ylog()
sherpa> plot_data("faked")

sherpa> plt.savefig("nustar_sim_data_200ksec.ps")

The resulting plot is shown in Figure 2.

Figure 2: Plot of simulated NuSTAR source spectrum

[Plot of simulated NuSTAR source spectra]

Plot of simulated hard X-ray NuSTAR imaging spectrum created with the Sherpa fake_pha command.
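
The grouping applied before plotting, where each bin must hold at least a given number of counts, can be sketched in a few lines. This is an illustration of the idea rather than the group_counts source. The helper name and the sample counts are hypothetical, and how a real tool flags the trailing channels that never reach the threshold is not covered here.

```python
def group_min_counts(counts, min_counts):
    """Accumulate consecutive channels until each group reaches min_counts.

    Returns (groups, leftover), where groups is a list of channel-index lists
    and leftover holds trailing channels that never reached the threshold.
    """
    groups, current, total = [], [], 0
    for index, value in enumerate(counts):
        current.append(index)
        total += value
        if total >= min_counts:
            groups.append(current)
            current, total = [], 0
    return groups, current


# Hypothetical sparse spectrum, grouped to at least 1 count per bin.
groups, leftover = group_min_counts([0, 0, 1, 0, 2, 0, 0, 1, 0, 0], 1)
print(groups)    # [[0, 1, 2], [3, 4], [5, 6, 7]]
print(leftover)  # [8, 9]
```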

Note that had we not applied a customized, properly normalized source model expression to the faked data, we would have had to re-normalize the simulated data to match the known flux (or counts) of SN 1979C. This process is illustrated in the threads "Simulating Chandra ACIS-S Spectra with Sherpa" and "Simulating 1-D Data: the Sherpa fake_pha Command."


Writing the Simulated Data to Output Files

We may use the save_pha command to write the ungrouped simulated data set to a PHA file, with a header containing the exposure time value and paths to the corresponding response files:

sherpa> ungroup("faked")
sherpa> save_pha("faked", "nustar_sim_200ksec.pha")

sherpa> !dmkeypar nustar_sim_200ksec.pha EXPOSURE echo+
200000.0

sherpa> !dmkeypar nustar_sim_200ksec.pha ANCRFILE echo+
point_30arcsecRad_1arcminOA.arf

sherpa> !dmkeypar nustar_sim_200ksec.pha RESPFILE echo+
nustar.rmf

Fitting the Simulated Data

A data set simulated with fake_pha may be fit as any other data set in Sherpa. For example, we can fit the simulated data with the same source model expression used to create it (and which is already assigned to it at this point), or define a different model. In this example, we choose to first fit the full range of the data using only the power-law model component already assigned to it (minus the thermal plasma models), and then fit again using the xskerrbb model, a multi-temperature blackbody model for a thin accretion disk around a Kerr black hole.

While the detectors onboard NuSTAR may be sensitive to 1.5-165 keV photons, the mirrors only focus ~5-80 keV photons, so we restrict the data to this energy band with the notice command and group the spectral bins to a minimum of one count using only data from that band. Before fitting, we check the energy range being considered in the analysis with the show_filter command:

sherpa> notice(5, 80)
sherpa> mask = get_data("faked").mask
sherpa> group_counts("faked", 1, tabStops=~mask)
  
sherpa> show_filter()
Data Set Filter: faked
5.0000-80.0000 Energy (keV)

If the data has been grouped, the filter uses the energies at the edges of the detector channels that contain the energies requested in the notice command; however, since we previously ungrouped the data, we can now group the data to the restricted energy range using an inverted mask array ~mask.
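
The relationship between the noticed-channel mask and the inverted array passed as tabStops can be sketched without Sherpa. The channel edges below are hypothetical, and the strict inequalities are an assumption about how channels that merely touch the band edge are treated. In the thread itself the mask is a NumPy boolean array, so ~mask performs the inversion.

```python
def notice_mask(e_min, e_max, lo, hi):
    """True for channels whose energy range overlaps the [lo, hi] keV band."""
    return [top > lo and bottom < hi for bottom, top in zip(e_min, e_max)]


def invert(mask):
    """Plain-list equivalent of ~mask for a NumPy boolean array."""
    return [not flag for flag in mask]


# Hypothetical channel edges in keV (not the NuSTAR RMF grid).
e_min = [1.6, 3.0, 5.0, 20.0, 80.0]
e_max = [3.0, 5.0, 20.0, 80.0, 165.44]

mask = notice_mask(e_min, e_max, 5.0, 80.0)  # True = channel is noticed
tab_stops = invert(mask)                     # True = channel is skipped when grouping
eligible = [i for i, stop in enumerate(tab_stops) if not stop]
print(mask, tab_stops, eligible)
```

Only the channels listed in eligible would take part in the grouping, which keeps counts outside 5-80 keV from being merged into bins at the band edges.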

Then, we change the fit statistic from the default χ²-Gehrels to C-Stat, the fit method from Levenberg-Marquardt to Nelder-Mead Simplex, and change the model assigned to data set "faked" from gal*(tplasma+tplasma2+powlaw) to just powlaw.

sherpa> set_source("faked", powlaw)
sherpa> set_method("neldermead")
sherpa> set_stat("cstat")
sherpa> fit("faked")
Dataset               = faked
Method                = neldermead
Statistic             = cstat
Initial fit statistic = 88.1156
Final fit statistic   = 86.484 at function evaluation 319
Data points           = 112
Degrees of freedom    = 110
Probability [Q-value] = 0.952461
Reduced statistic     = 0.786219
Change in statistic   = 1.6316
   powlaw.PhoIndex   2.41286     
   powlaw.norm    8.17932e-05 
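
For reference, the quantities in the fit summary can be reproduced in outline. The sketch below implements the usual form of the Cash-based C statistic, C = 2 Σ [M − D + D (ln D − ln M)], and the reduced statistic. It is a simplified illustration with hypothetical helper names, and it omits safeguards that a production implementation would need, such as handling non-positive model values.

```python
import math


def cstat(data, model):
    """C = 2 * sum(M - D + D * (ln D - ln M)); the log term is zero when D == 0."""
    total = 0.0
    for d, m in zip(data, model):
        term = m - d
        if d > 0:
            term += d * (math.log(d) - math.log(m))
        total += term
    return 2.0 * total


def reduced_stat(stat, n_points, n_free):
    """Return (statistic per degree of freedom, degrees of freedom)."""
    dof = n_points - n_free
    return stat / dof, dof


# Values reported by the power-law fit above: 112 grouped bins, 2 free parameters.
value, dof = reduced_stat(86.484, 112, 2)
print(dof, round(value, 4))
```

With 112 data points and two thawed parameters (PhoIndex and norm) there are 110 degrees of freedom, consistent with the fit output.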

We can visualize the fit using the plot_fit command:

sherpa> plot_fit("faked")
WARNING: The displayed errorbars have been supplied with the data or calculated using chi2xspecvar; the errors are not used in fits with cstat

The resulting plot is shown in Figure 3.

Figure 3: Plot of power-law fit to simulated NuSTAR source spectrum

[Plot of fit to simulated NuSTAR spectrum]

Plot of power-law model fit to the simulated NuSTAR X-ray spectrum.

Next, we change the source model for data set "faked" from powlaw to an absorbed xskerrbb, using the more contemporary xsphabs photoelectric absorption model, and re-fit the data with the prior knowledge that the source has a mass of ~5.2 M_sun and lies ~15.2 Mpc away:

sherpa> set_source("faked", xsphabs.abs1*xskerrbb.b1)
sherpa> abs1.nh = gal.nh.val
sherpa> freeze(abs1)
sherpa> b1.Mbh = 5.2 # M_sun
sherpa> b1.Dbh._max = 16e3 # modify hard upper-limit
sherpa> b1.Dbh = 15200 # in kpc
sherpa> freeze(b1.Mbh)
sherpa> freeze(b1.Dbh)
sherpa> thaw(b1.eta)
sherpa> thaw(b1.i)
sherpa> thaw(b1.hd)

sherpa> fit("faked")
Dataset               = faked
Method                = neldermead
Statistic             = cstat
Initial fit statistic = 3194.22
Final fit statistic   = 92.9077 at function evaluation 1402
Data points           = 112
Degrees of freedom    = 106
Probability [Q-value] = 0.814026
Reduced statistic     = 0.876487
Change in statistic   = 3101.32
   b1.eta         0.883648    
   b1.a           0.717193    
   b1.i           52.5313     
   b1.Mdd         2.82277     
   b1.hd          4.01007     
   b1.norm        12.7345     

sherpa> thaw(b1.Mbh)
sherpa> b1.Mbh.max = 7
sherpa> b1.Mbh.min = 3
sherpa> fit("faked")
Dataset               = faked
Method                = neldermead
Statistic             = cstat
Initial fit statistic = 92.9077
Final fit statistic   = 92.9077 at function evaluation 835
Data points           = 112
Degrees of freedom    = 105
Probability [Q-value] = 0.794559
Reduced statistic     = 0.884835
Change in statistic   = 0
   b1.eta         0.883648    
   b1.a           0.717193    
   b1.i           52.5313     
   b1.Mbh         5.2         
   b1.Mdd         2.82277     
   b1.hd          4.01007     
   b1.norm        12.7345     

This time, we visualize the new fit using the Kerr black hole model with plot_fit_ratio:

sherpa> plot_fit_ratio("faked")
WARNING: The displayed errorbars have been supplied with the data or calculated using chi2xspecvar; the errors are not used in fits with cstat
WARNING: The displayed errorbars have been supplied with the data or calculated using chi2xspecvar; the errors are not used in fits with cstat
sherpa> plt.yscale('linear')

The residuals plot is shown in Figure 4.

Figure 4: Plot of 'xskerrbb' model fit and ratio to the best-fit

[Plot of fit to simulated NuSTAR spectrum]

Plot of the best-fit to the simulated NuSTAR spectrum, using a multi-temperature blackbody model for a thin accretion disk around a Kerr black hole. The bottom plot shows the ratio of the data to the model values.

The plot may be saved to a PostScript (or other format) file with the plt.savefig command:

sherpa> plt.savefig("nustar_sim_fit_delchi_200ksec.ps")

Examining the Fit Results

We can explore the errors associated with the best-fit model parameters using the confidence command; the 68% confidence level values are returned by default (1σ), and may be changed with the set_conf_opt command. In this example, we run the confidence routine to check the errors on all of the thawed model parameters:

sherpa> set_conf_opt("fast", False)
sherpa> conf("faked")
b1.i lower bound:       -18.5291
b1.eta lower bound:     -0.805932
b1.eta upper bound:     -----
b1.i upper bound:       31.8524
b1.Mbh lower bound:     -----
b1.a lower bound:       -----
b1.Mbh upper bound:     -----
b1.a upper bound:       0.269324
b1.Mdd -: WARNING: The confidence level lies within (9.101852e-01, 9.101721e-01)
b1.Mdd lower bound:     -1.9126
b1.norm -: WARNING: The confidence level lies within (1.786498e+00, 1.786459e+00)
b1.norm lower bound:    -10.948
b1.Mdd upper bound:     -----
b1.norm +: WARNING: The confidence level lies within (2.004329e+03, 2.004332e+03)
b1.norm upper bound:    1991.6
b1.hd -: WARNING: The confidence level lies within (1.817647e+00, 1.817630e+00)
b1.hd lower bound:      -2.19243
b1.hd upper bound:      -----
Dataset               = faked
Confidence Method     = confidence
Iterative Fit Method  = None
Fitting Method        = neldermead
Statistic             = cstat
confidence 1-sigma (68.2689%) bounds:
   Param            Best-Fit  Lower Bound  Upper Bound
   -----            --------  -----------  -----------
   b1.eta           0.883648    -0.805932        -----
   b1.a             0.717193        -----     0.269324
   b1.i              52.5313     -18.5291      31.8524
   b1.Mbh                5.2        -----        -----
   b1.Mdd            2.82277      -1.9126        -----
   b1.hd             4.01007     -2.19243        -----
   b1.norm           12.7345      -10.948       1991.6

The set_conf_opt command issued before conf in the example above ensures that the currently set fit optimization method is used in the confidence calculation, instead of the (faster) Sherpa default Levenberg-Marquardt method.

The output of the confidence command shows that the hard upper and lower limits were hit for several parameters (denoted by the -----). This occurs when the parameter bound found by confidence lies outside the hard limit boundary for a model parameter, and could result from low signal-to-noise in the data, a model that is poorly suited to the data, or systematic errors in the data, among other causes.

The warning messages printed in the confidence output shown above occur where confidence cannot locate the minimum value of the fit statistic function, even though it is bracketed within an interval (perhaps due to poor resolution of the data or a discontinuity). In such cases, when the openinterval option of confidence is set to False (default), the confidence function will return the average of the open interval which brackets the minimum value.
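
The idea behind a 1σ bound, and why a bound can be missing, can be sketched for a single parameter. The helper below searches for the parameter value at which the statistic rises by 1 above its minimum and returns None when that rise is not reached before a hard limit. It is a toy illustration with hypothetical names. It assumes the statistic increases monotonically away from the best fit and, unlike a full confidence calculation, does not re-fit any other thawed parameters at each trial value.

```python
def upper_bound(stat, best, hard_max, delta=1.0, tol=1e-8):
    """Offset above `best` at which stat rises by `delta`, or None if not reached."""
    target = stat(best) + delta
    if stat(hard_max) < target:
        return None  # bound lies beyond the hard limit ('-----' in the output)
    lo, hi = best, hard_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if stat(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) - best


def toy_stat(p):
    """Hypothetical parabolic statistic with minimum at p = 2 and width 0.5."""
    return ((p - 2.0) / 0.5) ** 2


print(upper_bound(toy_stat, best=2.0, hard_max=10.0))  # close to 0.5
print(upper_bound(toy_stat, best=2.0, hard_max=2.3))   # None: limit reached first
```

A lower bound follows from the same search mirrored towards the hard minimum.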

The region-projection (reg_proj) and interval-projection (int_proj) plotting commands are available to further explore the quality of a fit by visualizing the fit statistic as a function of one or two model parameter values.


Scripting It

The file fit.py is a Python script which performs the primary commands used above; it can be executed by typing %run -i fit.py on the Sherpa command line.

The Sherpa script command may be used to save everything typed on the command line in a Sherpa session:

sherpa> script(filename="sherpa.log", clobber=False)


History

21 Jan 2011 original version
15 Dec 2011 reviewed for CIAO 4.4: a work-around for a save_pha bug was added
04 Dec 2013 reviewed for CIAO 4.6: no changes
20 Feb 2015 updated for CIAO 4.7 and CY17 Chandra CfP utilizing the current NuSTAR proposal responses.
15 Dec 2015 updated for CIAO 4.8 and CY18 Chandra CfP utilizing the version 3 of the NuSTAR proposal responses.
30 Nov 2016 updated for CIAO 4.9, fit results updated; no new content.
16 Jul 2018 reviewed for CIAO 4.10: no content change
11 Dec 2019 Updated for CIAO 4.12: use matplotlib rather than ChIPS.
27 Apr 2021 Includes information on including background data to simulated spectrum.
08 Apr 2022 Updated for CIAO 4.14, use mask array to better group spectrum to selected energy band.