Simulating NuSTAR X-Ray Spectra with Sherpa
Sherpa Threads (CIAO 4.14 Sherpa)
This thread illustrates the use of the Sherpa fake_pha command to simulate a hard X-ray emission spectrum of a supernova remnant which might be imaged by the detectors aboard the Nuclear Spectroscopic Telescope Array (NuSTAR) satellite, launched in June 2012. NuSTAR, the first focusing high-energy X-ray mission sensitive across the 5-80 keV range, is aimed at studying objects such as black holes, supernovae, and extremely active galaxies. Here, we use Sherpa to simulate a 1-D PHA spectrum using the currently available NuSTAR imaging ARF and RMF response files, in addition to a source model expression tailored to the Chandra X-ray spectrum of SN 1979C, suspected to harbor an accreting stellar-mass black hole.
Last Update: 8 Apr 2022 - Updated for CIAO 4.14, use mask array to better group spectrum to selected energy band.
- Quick Start to Simulating Data
- Simulating Data Step by Step:
- Scripting It
Quick Start to Simulating Data
All that is required to simulate a 1-D PHA NuSTAR data set in Sherpa is the following:
- a source model expression defined with set_source
- proposal ARF & RMF instrument response files downloaded from the NuSTAR website
- exposure time for the simulation in seconds
fake_pha will do the rest:
unix% sherpa
# Search available Sherpa models.
sherpa> list_models()
['absorptionedge', 'absorptiongaussian', 'absorptionlorentz', 'absorptionvoigt', ... 'xszvphabs', 'xszwabs', 'xszwndabs', 'xszxipcf']
# Define a source model expression and assign it to an ID.
sherpa> emis = xsmekal.tplasma + xsmekal.tplasma2 + xspowerlaw.powlaw
sherpa> set_source("faked", xswabs.gal * emis)
# Set model parameter values.
sherpa> set_par(gal.nH, 0.025, frozen=True)
sherpa> tplasma.redshift = 0.005 # frozen by default
sherpa> tplasma.kT = 1.1401
sherpa> tplasma.norm = 3.86263e-06
sherpa> tplasma2.redshift = 0.005 # frozen by default
sherpa> tplasma2.kT = 0.659763
sherpa> tplasma2.norm = 2.62793e-05
sherpa> powlaw.PhoIndex = 2.44436
sherpa> powlaw.norm = 7.87378e-05
# Fake a PHA data set over the grid defined by the input response.
sherpa> fake_pha("faked", arf="point_30arcsecRad_1arcminOA.arf", rmf="nustar.rmf", exposure=200000., backscal=1.0)
# Return information on the faked data set and associated responses.
sherpa> show_data("faked")
sherpa> calc_data_sum(id="faked")
sherpa> calc_energy_flux(id="faked")
sherpa> calc_photon_flux(id="faked")
# Plot simulated data.
sherpa> plot_data("faked")
# Save simulated data.
sherpa> save_pha("faked", "nustar_sim_200ksec.pha")
This represents the quickest and simplest way to simulate a NuSTAR PHA data set in Sherpa. Simply start CIAO and Sherpa; define a source model expression in Sherpa and assign it to a data set ID such as "faked"; set model parameter values; and then run fake_pha according to your specifications. Read on to find a detailed explanation of each step of this quick-start recipe.
Simulating Data Step by Step:
Downloading Calibration Response Files
To simulate a NuSTAR X-ray spectrum, it is necessary to define an instrument response using the appropriate ARF and RMF files currently available for the mission. NuSTAR response files are available for download from the "For Proposers" page of the NuSTAR website (Caltech); the files used in this thread are:
- point_30arcsecRad_1arcminOA.arf — unweighted effective area for a 1' off-axis circular extraction region of radius 30''
- nustar.rmf — RMF includes detector efficiency
Figure 1: Effective Area Comparison
Establishing the Source Model Expression for the Simulation
We will use the Sherpa fake_pha command to simulate a NuSTAR PHA spectrum using a defined source model expression and instrument response as input. Details on the functionality of fake_pha, with other examples of its usage, are available in the ahelp file and the other Sherpa simulation threads.
The source model chosen for the simulation is defined with the Sherpa set_source command, as follows, where it is assigned to a string data set ID, "faked".
sherpa> emis = xsmekal.tplasma + xsmekal.tplasma2 + xspowerlaw.powlaw
sherpa> set_source("faked", xswabs.gal * emis)
sherpa> set_par(gal.nH, 0.025, frozen=True)
sherpa> tplasma.redshift = 0.005 # frozen by default
sherpa> tplasma.kT = 1.1401
sherpa> tplasma.norm = 3.86263e-06
sherpa> tplasma2.redshift = 0.005 # frozen by default
sherpa> tplasma2.kT = 0.659763
sherpa> tplasma2.norm = 2.62793e-05
sherpa> powlaw.PhoIndex = 2.44436
sherpa> powlaw.norm = 7.87378e-05
Here we use a source model consisting of a combination of two thermal MEKAL components and a power-law, with parameter values chosen to give a good approximation to the Chandra imaging spectra of SN 1979C in the 0.6-3 keV range (Patnaude et al., 2011). We can view the model parameter settings in this case using the print command, as the Sherpa show_source command only displays model information for models already assigned to data sets.
sherpa> print(gal)
xswabs.gal
   Param               Type          Value        Min        Max   Units
   -----               ----          -----        ---        ---   -----
   gal.nH              frozen        0.025          0      1e+06   10^22 atoms / cm^2
sherpa> print(tplasma)
xsmekal.tplasma
   Param               Type          Value        Min        Max   Units
   -----               ----          -----        ---        ---   -----
   tplasma.kT          thawed       1.1401     0.0808       79.9   keV
   tplasma.nH          frozen            1      1e-06      1e+20   cm-3
   tplasma.Abundanc    frozen            1          0       1000
   tplasma.redshift    frozen        0.005     -0.999         10
   tplasma.switch      frozen            1          0          1
   tplasma.norm        thawed  3.86263e-06          0      1e+24
sherpa> print(tplasma2)
xsmekal.tplasma2
   Param               Type          Value        Min        Max   Units
   -----               ----          -----        ---        ---   -----
   tplasma2.kT         thawed     0.659763     0.0808       79.9   keV
   tplasma2.nH         frozen            1      1e-06      1e+20   cm-3
   tplasma2.Abundanc   frozen            1          0       1000
   tplasma2.redshift   frozen        0.005     -0.999         10
   tplasma2.switch     frozen            1          0          1
   tplasma2.norm       thawed  2.62793e-05          0      1e+24
sherpa> print(powlaw)
xspowerlaw.powlaw
   Param               Type          Value        Min        Max   Units
   -----               ----          -----        ---        ---   -----
   powlaw.PhoIndex     thawed      2.44436         -3         10
   powlaw.norm         thawed  7.87378e-05          0      1e+24
Defining the Instrument Response for the Simulation
Note that the steps in this section are optional, as ARF and RMF (or RSP) filenames may be directly entered into the arf and rmf arguments of the fake_pha expression shown in the next section of this thread. However, if you would like to view the details of or modify the response used in the simulation, you should load the NuSTAR ARF and RMF responses into Sherpa as follows:
sherpa> nustar_arf = unpack_arf("point_30arcsecRad_1arcminOA.arf")
sherpa> nustar_rmf = unpack_rmf("nustar.rmf")
sherpa> print(nustar_arf)
name     = point_30arcsecRad_1arcminOA.arf
energ_lo = Float64
energ_hi = Float64
specresp = Float64
bin_lo   = None
bin_hi   = None
exposure = None
ethresh  = 1e-10
sherpa> print(nustar_rmf)
name     = nustar.rmf
detchans = 4096
energ_lo = Float64
energ_hi = Float64
n_grp    = UInt64
f_chan   = UInt32
n_chan   = UInt32
matrix   = Float64
offset   = 0
e_min    = Float64
e_max    = Float64
ethresh  = 1e-10
The response data is loaded into variables nustar_arf and nustar_rmf with the unpack_arf and unpack_rmf commands. These variables will be used to assign the instrument response to the faked data set in the next section of this thread, "Running the Simulation with fake_pha."
Running the Simulation with fake_pha
Simulating the NuSTAR spectrum with Sherpa involves convolving the chosen source model expression with the corresponding NuSTAR response, and applying Poisson noise to the counts predicted by the model. All of these steps are performed by the fake_pha command, which has four required arguments: data set or model ID, ARF, RMF, and exposure time. We set the fake_pha arf and rmf arguments to the nustar_arf/nustar_rmf variables defined in the previous section, and the exposure argument to a value of 200 kiloseconds.
sherpa> fake_pha("faked", arf=nustar_arf, rmf=nustar_rmf, exposure=2e5)
This command associates the data/model ID "faked" with a simulated spectrum which is calculated over the grid defined by the NuSTAR instrument response, using the input exposure time and source model expression. Poisson noise is added to the modeled data.
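The forward-folding and noise steps described above can be sketched in plain Python. This is a minimal illustration, not Sherpa's implementation: the three-bin response, the flux values, and the helper names `fold_model` and `poisson_draw` are all hypothetical, chosen only to make the arithmetic easy to follow.

```python
import math
import random


def fold_model(photon_flux, arf, rmf, exposure):
    """Return predicted counts per channel for a toy response.

    photon_flux[j] : model flux integrated over energy bin j (photons/cm^2/s)
    arf[j]         : effective area in energy bin j (cm^2)
    rmf[i][j]      : probability that a photon in energy bin j lands in channel i
    exposure       : exposure time (s)
    """
    predicted = []
    for row in rmf:
        rate = sum(r * a * f for r, a, f in zip(row, arf, photon_flux))
        predicted.append(exposure * rate)
    return predicted


def poisson_draw(mean, rng):
    """Draw one Poisson deviate with Knuth's multiplication method.

    Fine for the small means in this toy example; too slow for large means.
    """
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


# Hypothetical 3-bin, 3-channel response; each RMF column sums to 1.
photon_flux = [1.0e-6, 5.0e-7, 2.0e-7]
arf = [100.0, 300.0, 200.0]
rmf = [
    [0.8, 0.1, 0.0],
    [0.2, 0.8, 0.2],
    [0.0, 0.1, 0.8],
]

predicted = fold_model(photon_flux, arf, rmf, exposure=2.0e5)
rng = random.Random(42)  # fixed seed so the sketch is repeatable
simulated = [poisson_draw(mu, rng) for mu in predicted]

print("predicted counts:", [round(mu, 2) for mu in predicted])
print("simulated counts:", simulated)
```

Because every RMF column sums to 1 in this toy case, the total predicted counts equal the exposure times the sum of area times flux; the simulated counts scatter around the predicted values from one random seed to the next.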
We may use the show_data command to inspect the basic properties of the new data set; observe that the "name" field reads "faked", and the "exposure" field has been set to the chosen exposure time:
sherpa> show_data()
Data Set: faked
Filter: 1.6000-165.4400 Energy (keV)
Noticed Channels: 1-4096
name           = faked
channel        = Int64
counts         = Float64
staterror      = None
syserror       = None
bin_lo         = None
bin_hi         = None
grouping       = None
quality        = None
exposure       = 200000.0
backscal       = None
areascal       = None
grouped        = False
subtracted     = False
units          = energy
rate           = True
plot_fac       = 0
response_ids   =
background_ids =
RMF Data Set: faked:1
name     = nustar.rmf
detchans = 4096
energ_lo = Float64
energ_hi = Float64
n_grp    = UInt64
f_chan   = UInt32
n_chan   = UInt32
matrix   = Float64
offset   = 0
e_min    = Float64
e_max    = Float64
ethresh  = 1e-10
ARF Data Set: faked:1
name     = point_30arcsecRad_1arcminOA.arf
energ_lo = Float64
energ_hi = Float64
specresp = Float64
bin_lo   = None
bin_hi   = None
exposure = None
ethresh  = 1e-10
The calc_data_sum and calc_energy_flux/calc_photon_flux commands may be used to return the total counts and integrated energy/photon flux of the faked data set over the entire energy range (as in this example), or within a specified 'lo' to 'hi' range:
sherpa> calc_data_sum(id="faked")
246.0
sherpa> calc_energy_flux(id="faked") # ergs/cm^2/sec (integrated over the full range)
2.0588096302472468e-13
sherpa> calc_photon_flux(id="faked") # photons/cm^2/sec (integrated over the full range)
2.902919956116689e-05
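To make the meaning of an integrated flux concrete, the sketch below integrates a bare power law analytically between two energies. It is an illustration under stated assumptions: it uses only the power-law component values from this thread, ignores the absorption and thermal components, and so will not reproduce the numbers returned by Sherpa. The function names are hypothetical.

```python
import math

KEV_TO_ERG = 1.602176634e-9  # 1 keV expressed in erg


def powerlaw_photon_flux(norm, index, lo, hi):
    """Integrate norm * E**(-index) from lo to hi (keV): photons/cm^2/s."""
    if math.isclose(index, 1.0):
        return norm * math.log(hi / lo)
    power = 1.0 - index
    return norm * (hi**power - lo**power) / power


def powerlaw_energy_flux(norm, index, lo, hi):
    """Integrate E * norm * E**(-index) from lo to hi (keV): erg/cm^2/s."""
    # Multiplying the integrand by E lowers the effective index by one.
    return KEV_TO_ERG * powerlaw_photon_flux(norm, index - 1.0, lo, hi)


# Power-law parameters from this thread, integrated over 5-80 keV.
photon = powerlaw_photon_flux(7.87378e-05, 2.44436, 5.0, 80.0)
energy = powerlaw_energy_flux(7.87378e-05, 2.44436, 5.0, 80.0)
print(f"photon flux: {photon:.4e} photons/cm^2/s")
print(f"energy flux: {energy:.4e} erg/cm^2/s")
```

Integrating over an energy range removes the per-keV dependence, which is why an integrated flux carries units of photons/cm^2/s or erg/cm^2/s rather than a flux density.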
NuSTAR observations often include a significant background contribution to the spectra which, depending on your science goals, should be accounted for in your simulation. Background PHA spectral data are included in the downloaded NuSTAR simulation files package. This background data can be appropriately scaled and included in the simulated spectra by using the bkg and backscal arguments in fake_pha, as demonstrated below.
sherpa> nustar_bkg = unpack_bkg("bgd_30arcsec.pha")
WARNING: file 'nu10002020001A01_sr.arf' not found
WARNING: file 'nu10002020001A01_sr.rmf' not found
sherpa> fake_pha("faked", arf=nustar_arf, rmf=nustar_rmf, exposure=200000, bkg=nustar_bkg, backscal=nustar_bkg.backscal)
The warnings arise because the background PHA file header references response files that are not present locally; they do not affect the simulated spectrum. Comparing the counts and fluxes with those from the earlier simulation, which omitted the background arguments, shows that the background data has been incorporated into the simulated results for the same model.
sherpa> calc_data_sum(id="faked")
1450.0
sherpa> calc_energy_flux(id="faked") # ergs/cm^2/sec (integrated over the full range)
2.0585882218932305e-13
sherpa> calc_photon_flux(id="faked") # photons/cm^2/sec (integrated over the full range)
2.9023013208479222e-05
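As a rough guide to what "appropriately scaled" means for a background spectrum, the sketch below applies the usual exposure-time and BACKSCAL ratios to a set of background counts. Whether fake_pha applies exactly this form internally is an assumption; the function name and the input values are hypothetical.

```python
def scale_background(bkg_counts, src_exposure, bkg_exposure,
                     src_backscal, bkg_backscal):
    """Scale background counts per channel to the source exposure and region.

    The factor is the product of the exposure ratio and the BACKSCAL
    (extraction area) ratio between the source and background spectra.
    """
    factor = (src_exposure / bkg_exposure) * (src_backscal / bkg_backscal)
    return [c * factor for c in bkg_counts]


# Hypothetical background: observed for half as long as the simulated
# source exposure, with the same BACKSCAL as the source region.
bkg_counts = [10, 20, 5]
scaled = scale_background(bkg_counts,
                          src_exposure=200000.0, bkg_exposure=100000.0,
                          src_backscal=1.0, bkg_backscal=1.0)
print(scaled)
```

Setting the simulated data set's backscal equal to the background's, as done in this thread, makes the area ratio 1, so only the exposure ratio changes the background level.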
The simulated data set may be visualized with the plot_data command, as shown below, and saved with plt.savefig. Before plotting, we group the counts in the faked data spectrum such that each bin contains at least 1 count, and set both the X- and Y-axes of the plot to a logarithmic scale.
sherpa> group_counts("faked", 1)
sherpa> set_xlog()
sherpa> set_ylog()
sherpa> plot_data("faked")
sherpa> plt.savefig("nustar_sim_data_200ksec.ps")
The resulting plot is shown in Figure 2.
Figure 2: Plot of simulated NuSTAR source spectrum
Note that had we not applied a customized, properly normalized source model expression to the faked data, we would have had to re-normalize the simulated data to match the known flux (or counts) of SN 1979C. This process is illustrated in the threads "Simulating Chandra ACIS-S Spectra with Sherpa" and "Simulating 1-D Data: the Sherpa fake_pha Command."
Writing the Simulated Data to Output Files
We may use the save_pha command to write the ungrouped simulated data set to a PHA file, with a header containing the exposure time value and paths to the corresponding response files:
sherpa> ungroup("faked")
sherpa> save_pha("faked", "nustar_sim_200ksec.pha")
sherpa> !dmkeypar nustar_sim_200ksec.pha EXPOSURE echo+
200000.0
sherpa> !dmkeypar nustar_sim_200ksec.pha ANCRFILE echo+
point_30arcsecRad_1arcminOA.arf
sherpa> !dmkeypar nustar_sim_200ksec.pha RESPFILE echo+
nustar.rmf
Fitting the Simulated Data
A data set simulated with fake_pha may be fit as any other data set in Sherpa. For example, we can fit the simulated data with the same source model expression used to create it (and which is already assigned to it at this point), or define a different model. In this example, we choose to first fit the full range of the data using only the power-law model component already assigned to it (minus the thermal plasma models), and then fit again using the xskerrbb model, a multi-temperature blackbody model for a thin accretion disk around a Kerr black hole.
While the detectors onboard NuSTAR may be sensitive to 1.5-165 keV photons, the mirrors only focus ~5-80 keV photons, so we restrict the data to an energy band in this range with the notice command and group the spectral bins to a minimum of one count using only data from the specified band. Before fitting, we check the energy range of the data set being considered in the analysis with the show_filter command:
sherpa> notice(5, 80)
sherpa> mask = get_data("faked").mask
sherpa> group_counts("faked", 1, tabStops=~mask)
sherpa> show_filter()
Data Set Filter: faked
5.0000-80.0000 Energy (keV)
If the data has been grouped, the filter uses the energies at the edges of the detector channels that contain the energies requested in the notice command; however, since we previously ungrouped the data, we can now group the data to the restricted energy range using an inverted mask array ~mask.
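The effect of passing an inverted mask as tab stops can be illustrated with a small pure-Python grouping routine. This is a simplified sketch, not the algorithm Sherpa uses: the function name, the toy counts, and the choice to drop an under-filled trailing group are assumptions made for illustration.

```python
def group_min_counts(counts, min_counts, tab_stops=None):
    """Return a list of (start, stop) index ranges, stop exclusive.

    Adjacent channels are merged until a group holds at least min_counts.
    Channels flagged True in tab_stops are skipped and also end any group
    being built. A trailing group that falls short is dropped here; that
    choice is an assumption of this sketch.
    """
    if tab_stops is None:
        tab_stops = [False] * len(counts)
    groups = []
    start, total = None, 0
    for i, (c, stop) in enumerate(zip(counts, tab_stops)):
        if stop:
            start, total = None, 0  # abandon any partial group
            continue
        if start is None:
            start = i
        total += c
        if total >= min_counts:
            groups.append((start, i + 1))
            start, total = None, 0
    return groups


counts = [3, 0, 0, 1, 0, 2, 0, 0, 4, 1]
# True marks a noticed channel, as in the mask of a filtered data set.
mask = [False, False, True, True, True, True, True, True, False, False]
tab_stops = [not m for m in mask]  # pure-Python analogue of ~mask

print("with tab stops:   ", group_min_counts(counts, 1, tab_stops))
print("without tab stops:", group_min_counts(counts, 1))
```

Without tab stops, some groups in this toy example combine channels from inside and outside the noticed band, so counts from outside the band leak into the grouped spectrum. With the inverted mask, every group lies entirely within the noticed channels.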
Then, we change the fit statistic from the default χ2-Gehrels to C-Stat, the fit method from Levenberg-Marquardt to Nelder-Mead Simplex, and change the model assigned to data set "faked" from gal*(tplasma+tplasma2+powlaw) to just powlaw.
sherpa> set_source("faked", powlaw)
sherpa> set_method("neldermead")
sherpa> set_stat("cstat")
sherpa> fit("faked")
Dataset               = faked
Method                = neldermead
Statistic             = cstat
Initial fit statistic = 88.1156
Final fit statistic   = 86.484 at function evaluation 319
Data points           = 112
Degrees of freedom    = 110
Probability [Q-value] = 0.952461
Reduced statistic     = 0.786219
Change in statistic   = 1.6316
   powlaw.PhoIndex   2.41286
   powlaw.norm       8.17932e-05
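For readers unfamiliar with C-Stat, the following sketch evaluates the statistic for a list of observed counts and model-predicted counts, using the form C = 2 Σ [M - D + D (ln D - ln M)]. It is a minimal illustration; the function name is hypothetical, and raising an error for a non-positive model value is a choice of this sketch rather than Sherpa's behavior.

```python
import math


def cstat(data, model):
    """C = 2 * sum(M - D + D * (ln D - ln M)); the log term is 0 when D = 0."""
    total = 0.0
    for d, m in zip(data, model):
        if m <= 0:
            raise ValueError("model prediction must be positive")
        term = m - d
        if d > 0:
            term += d * (math.log(d) - math.log(m))
        total += term
    return 2.0 * total


observed = [0, 2, 1, 5, 3]
predicted = [0.4, 1.8, 1.5, 4.2, 3.3]
print(f"C-stat = {cstat(observed, predicted):.4f}")
```

The statistic is zero when the model matches the data exactly and grows as the two diverge. Unlike χ2 variants, it remains well defined for bins with zero or very few counts, which is why it suits a spectrum grouped to a minimum of one count per bin.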
We can visualize the fit using the plot_fit command:
sherpa> plot_fit("faked")
WARNING: The displayed errorbars have been supplied with the data or calculated using chi2xspecvar; the errors are not used in fits with cstat
The resulting plot is shown in Figure 3.
Figure 3: Plot of power-law fit to simulated NuSTAR source spectrum
Next, we change the source model for data set "faked" from powlaw to an absorbed xskerrbb—using the more contemporary xsphabs photoelectric absorption model—and re-fit the data with the prior knowledge that the source is ~5.2 M⊙ and ~15.2 Mpc away:
sherpa> set_source("faked", xsphabs.abs1*xskerrbb.b1)
sherpa> abs1.nh = gal.nh.val
sherpa> freeze(abs1)
sherpa> b1.Mbh = 5.2 # M_sun
sherpa> b1.Dbh._max = 16e3 # modify hard upper-limit
sherpa> b1.Dbh = 15200 # in kpc
sherpa> freeze(b1.Mbh)
sherpa> freeze(b1.Dbh)
sherpa> thaw(b1.eta)
sherpa> thaw(b1.i)
sherpa> thaw(b1.hd)
sherpa> fit("faked")
Dataset               = faked
Method                = neldermead
Statistic             = cstat
Initial fit statistic = 3194.22
Final fit statistic   = 92.9077 at function evaluation 1402
Data points           = 112
Degrees of freedom    = 106
Probability [Q-value] = 0.814026
Reduced statistic     = 0.876487
Change in statistic   = 3101.32
   b1.eta         0.883648
   b1.a           0.717193
   b1.i           52.5313
   b1.Mdd         2.82277
   b1.hd          4.01007
   b1.norm        12.7345
sherpa> thaw(b1.Mbh)
sherpa> b1.Mbh.max = 7
sherpa> b1.Mbh.min = 3
sherpa> fit("faked")
Dataset               = faked
Method                = neldermead
Statistic             = cstat
Initial fit statistic = 92.9077
Final fit statistic   = 92.9077 at function evaluation 835
Data points           = 112
Degrees of freedom    = 105
Probability [Q-value] = 0.794559
Reduced statistic     = 0.884835
Change in statistic   = 0
   b1.eta         0.883648
   b1.a           0.717193
   b1.i           52.5313
   b1.Mbh         5.2
   b1.Mdd         2.82277
   b1.hd          4.01007
   b1.norm        12.7345
This time, we visualize the new fit using the Kerr black hole model with plot_fit_ratio:
sherpa> plot_fit_ratio("faked")
WARNING: The displayed errorbars have been supplied with the data or calculated using chi2xspecvar; the errors are not used in fits with cstat
WARNING: The displayed errorbars have been supplied with the data or calculated using chi2xspecvar; the errors are not used in fits with cstat
sherpa> plt.yscale('linear')
The residuals plot is shown in Figure 4.
Figure 4: Plot of 'xskerrbb' model fit and ratio to the best-fit
As with the data plot earlier in this thread, the plot may be saved to a PostScript (or other format) file with the plt.savefig command.
Examining the Fit Results
We can explore the errors associated with the best-fit model parameters using the confidence command; the 68% confidence level values are returned by default (1σ), and may be changed with the set_conf_opt command. In this example, we run the confidence routine to check the errors on all of the thawed model parameters:
sherpa> set_conf_opt("fast", False)
sherpa> conf("faked")
b1.i lower bound: -18.5291
b1.eta lower bound: -0.805932
b1.eta upper bound: -----
b1.i upper bound: 31.8524
b1.Mbh lower bound: -----
b1.a lower bound: -----
b1.Mbh upper bound: -----
b1.a upper bound: 0.269324
b1.Mdd -: WARNING: The confidence level lies within (9.101852e-01, 9.101721e-01)
b1.Mdd lower bound: -1.9126
b1.norm -: WARNING: The confidence level lies within (1.786498e+00, 1.786459e+00)
b1.norm lower bound: -10.948
b1.Mdd upper bound: -----
b1.norm +: WARNING: The confidence level lies within (2.004329e+03, 2.004332e+03)
b1.norm upper bound: 1991.6
b1.hd -: WARNING: The confidence level lies within (1.817647e+00, 1.817630e+00)
b1.hd lower bound: -2.19243
b1.hd upper bound: -----
Dataset               = faked
Confidence Method     = confidence
Iterative Fit Method  = None
Fitting Method        = neldermead
Statistic             = cstat
confidence 1-sigma (68.2689%) bounds:
   Param        Best-Fit   Lower Bound   Upper Bound
   -----        --------   -----------   -----------
   b1.eta       0.883648     -0.805932         -----
   b1.a         0.717193         -----      0.269324
   b1.i          52.5313      -18.5291       31.8524
   b1.Mbh            5.2         -----         -----
   b1.Mdd        2.82277       -1.9126         -----
   b1.hd         4.01007      -2.19243         -----
   b1.norm       12.7345       -10.948        1991.6
The set_conf_opt command issued before conf in the example above ensures that the currently set fit optimization method is used in the confidence calculation, instead of the (faster) Sherpa default Levenberg-Marquardt method.
The output of the confidence command shows that the hard upper and lower limits were hit for several parameters (denoted by the -----). This occurs when the parameter bound found by confidence lies outside the hard limit boundary for a model parameter, and could result from an issue with the signal-to-noise of the data, the applicability of the model to the data, or systematic errors in the data, among other things.
The warning messages printed in the confidence output shown above occur where confidence cannot locate the minimum value of the fit statistic function, even though it is bracketed within an interval (perhaps due to poor resolution of the data or a discontinuity). In such cases, when the openinterval option of confidence is set to False (default), the confidence function will return the average of the open interval which brackets the minimum value.
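The idea of bracketing a confidence bound can be illustrated with a one-parameter toy problem: a single bin with an observed count, a model mean as the only parameter, and a bisection search for the values where C-Stat rises by 1 above its minimum (the 1σ criterion for one parameter). This is a simplified sketch; the real confidence routine re-fits the other thawed parameters at each trial value and uses its own root-finder, and the function names here are hypothetical.

```python
import math


def cstat_single(count, mu):
    """C-stat for one bin with observed count and model mean mu."""
    term = mu - count
    if count > 0:
        term += count * math.log(count / mu)
    return 2.0 * term


def bisect_bound(count, inside, outside, delta=1.0, tol=1e-10):
    """Find mu between inside and outside where the statistic rises by delta.

    inside  : a value with stat - stat_min < delta (for example the best fit)
    outside : a value with stat - stat_min > delta
    """
    best = cstat_single(count, count)  # the minimum is at mu = count
    while abs(outside - inside) > tol:
        mid = 0.5 * (inside + outside)
        if cstat_single(count, mid) - best < delta:
            inside = mid
        else:
            outside = mid
    return 0.5 * (inside + outside)


count = 100
lower = bisect_bound(count, inside=count, outside=count - 50.0)
upper = bisect_bound(count, inside=count, outside=count + 50.0)
print(f"1-sigma interval: -{count - lower:.3f} / +{upper - count:.3f}")
```

If no trial value inside a parameter's allowed range raises the statistic by the required amount, a search of this kind has nothing to converge on, which corresponds to the ----- entries in the confidence output.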
The region-projection and interval-projection plotting commands are available to further explore the quality of a fit, through visualization of fitted model parameter value(s) as a function of fit statistic value.
Scripting It
The file fit.py is a Python script which performs the primary commands used above; it can be executed by typing %run -i fit.py on the Sherpa command line.
The Sherpa script command may be used to save everything typed on the command line in a Sherpa session:
sherpa> script(filename="sherpa.log", clobber=False)
|21 Jan 2011||original version|
|15 Dec 2011||reviewed for CIAO 4.4: a work-around for a save_pha bug was added|
|04 Dec 2013||reviewed for CIAO 4.6: no changes|
|20 Feb 2015||updated for CIAO 4.7 and CY17 Chandra CfP utilizing the current NuSTAR proposal responses.|
|15 Dec 2015||updated for CIAO 4.8 and CY18 Chandra CfP utilizing the version 3 of the NuSTAR proposal responses.|
|30 Nov 2016||updated for CIAO 4.9, fit results updated; no new content.|
|16 Jul 2018||reviewed for CIAO 4.10: no content change|
|11 Dec 2019||Updated for CIAO 4.12: use matplotlib rather than ChIPS.|
|27 Apr 2021||Includes information on including background data to simulated spectrum.|
|08 Apr 2022||Updated for CIAO 4.14, use mask array to better group spectrum to selected energy band.|