Introduction to Fitting ASCII Data with Errors: Single-Component Source Models
![[CXC Logo]](/ciao/imgs/cxc-logo.gif)
Sherpa Threads (CIAO 4.0 Beta)
[Python Syntax]
OverviewLast Update: 14 Nov 2007 - rewritten for CIAO 4.0 Beta 3 Synopsis: One-dimensional data from an ASCII file are empirically fit with polynomials of several orders. A parameter expression is defined to link the polynomial offset with one of the constants. The data and fits are plotted, and the plots are customized with ChIPS commands. |
Contents
- Reading ASCII Data & Errors Into Sherpa
- Plotting Data
- Establishing a Model Component
- Viewing Method & Statistic Settings
- Thawing Model Parameters & Fitting
- Examining Fit Results
- Linking Model Parameters
- Independently Fitting a Second Dataset
- Checking Sherpa Session Status
- Exiting Sherpa
- History
- Images
Reading ASCII Data & Errors Into Sherpa
In this thread, we wish to fit 1-D data from the following ASCII dataset:
sherpa> !more data1.dat 0.5 1.6454 0.04114 1.5 1.7236 0.04114 2.5 1.9472 0.04114 3.5 2.2348 0.04114 4.5 2.6187 0.04114 5.5 2.8642 0.04114 6.5 3.1263 0.04114 7.5 3.2073 0.04114 8.5 3.2852 0.04114 9.5 3.3092 0.04114 10.5 3.4496 0.04114
This dataset is input into Sherpa using the load_data() function with the first argument setting the data set id to 1 (default), the second indicating the filename and the third indicating the number of columns to read. The first and second columns are read in as x and y(x), the third column is read in as the statistical errors:
sherpa> load_data(1, "data1.dat", 3) sherpa> print(get_data()) name = data1.dat x = Float64[11] y = Float64[11] staterror = Float64[11] syserror = None sherpa> print(get_data().x) [ 0.5 1.5 2.5 3.5 4.5 5.5 6.5 7.5 8.5 9.5 10.5] sherpa> print(get_data().staterror) [ 0.04114 0.04114 0.04114 0.04114 0.04114 0.04114 0.04114 0.04114 0.04114 0.04114 0.04114]
To examine a content of the data we use print get_data() command.
Plotting Data
Now the dataset may be plotted:
sherpa> plot_data()
The CIAO software package includes a plotting application called ChIPS. ChIPS plotting can be used within Sherpa to modifying the appearance of a plot. The set_plot_xlabel() and set_plot_ylabel() commands are used to add axis labels:
Figure 1
shows the resulting plot.
Establishing a Model Component
A polynomial model will be used to fit the data. The Sherpa model name for a 1-D polynomial function is polynom1d.
The model is established using the set_source() function, specifying the model name for the session. Here, the generic polynom1d model is named model1:
sherpa> set_source(polynom1d.model1)
The get_model_type() and get_model() commands may be used to examine the details of the established model. The initial parameter settings - thawed or frozen, initial values, minimum and maximum allowed values - are listed:
sherpa> get_model_type("model1")
polynom1d
sherpa> print(get_model())
model1
Param Type Value Min Max Units
----- ---- ----- --- --- -----
model1.c0 thawed 1 -1e+120 1e+120
model1.c1 frozen 0 -1e+120 1e+120
model1.c2 frozen 0 -1e+120 1e+120
model1.c3 frozen 0 -1e+120 1e+120
model1.c4 frozen 0 -1e+120 1e+120
model1.c5 frozen 0 -1e+120 1e+120
model1.c6 frozen 0 -1e+120 1e+120
model1.c7 frozen 0 -1e+120 1e+120
model1.c8 frozen 0 -1e+120 1e+120
model1.offset frozen 0 -1e+120 1e+120
Viewing Method & Statistic Settings
We use Sherpa's default optimization method and statistics for these polynomial fits. To view the current method and statistics settings:
sherpa> print(get_method_name()) levmar sherpa> print(get_method()) ftol = 1.19209289551e-07 xtol = 1.19209289551e-07 gtol = 1.19209289551e-07 maxfev = 500 epsfcn = 1.19209289551e-07 factor = 100.0 verbose = 0 sherpa> print(get_stat_name()) chi2gehrels
Thawing Model Parameters & Fitting
To start, we wish to fit these data with a first-order polynomial. The c1 parameter of model1 needs to be thawed so that it will be allowed to vary during the fit:
sherpa> thaw(model1.c1) sherpa> print(get_model()) model1 Param Type Value Min Max Units ----- ---- ----- --- --- ----- model1.c0 thawed 1 -1e+120 1e+120 model1.c1 thawed 0 -1e+120 1e+120 model1.c2 frozen 0 -1e+120 1e+120 model1.c3 frozen 0 -1e+120 1e+120 model1.c4 frozen 0 -1e+120 1e+120 model1.c5 frozen 0 -1e+120 1e+120 model1.c6 frozen 0 -1e+120 1e+120 model1.c7 frozen 0 -1e+120 1e+120 model1.c8 frozen 0 -1e+120 1e+120 model1.offset frozen 0 -1e+120 1e+120
The dataset is then fit using the default optimization method and statistics:
sherpa> fit() Statistic value = 151.827 at function evaluation 6 Data points = 11 Degrees of freedom = 9 Probability [Q-value] = 3.68798e-28 Reduced statistic = 16.8697 model1.c0 1.58227 model1.c1 0.198455
The fit information includes the chi-squared statistic value for chi2gehrels, goodness-of-fit and reduced chi-square along with the best-fit parameter values.
To plot the fit model and the data, use the plot_fit() function:
By default, the fit is plotted as a red line over the data,
as shown in Figure 2
. Clearly
a first-order polynomial does not give a good fit to the
dataset.
The c2 parameter of model1 is thawed so that a second-order polynomial may be fit to the data:
sherpa> thaw(model1.c2) sherpa> fit() Statistic value = 59.0027 at function evaluation 8 Data points = 11 Degrees of freedom = 8 Probability [Q-value] = 7.31065e-10 Reduced statistic = 7.37534 model1.c0 1.30826 model1.c1 0.347303 model1.c2 -0.0135317
Plot the fit and the data:
sherpa> plot_fit() sherpa> set_plot_xlabel("X = Off-Axis (arcmin)") sherpa> set_plot_ylabel("F(X) = SNR")
Figure 3
shows the plot of the
second-order polynomial fit.
Finally, the data are fit with a third-order polynomial. The c3 parameter of model1 is thawed, then the data is fit again and plotted:
sherpa> thaw(model1.c3) sherpa> fit() Statistic value = 30.8491 at function evaluation 10 Data points = 11 Degrees of freedom = 7 Probability [Q-value] = 6.62869e-05 Reduced statistic = 4.40701 model1.c0 1.49843 model1.c1 0.1447 model1.c2 0.0322936 model1.c3 -0.00277729 sherpa> plot_fit() sherpa> set_plot_xlabel("X = Off-Axis (arcmin)") sherpa> set_plot_ylabel("F(X) = SNR")
Figure 4
shows the plot of the
third-order polynomial fit together with the data.
Examining Fit Results
The best fit model with the data and residuals may be plotted in the same window:
sherpa> plot_fit_resid()
Various modifications may be made to the plots:
sherpa> set_current_plot("plot2") # Label the x-axis; reformat the y-axis tickmarks sherpa> set_plot_xlabel("X = Off-Axis (arcmin)") sherpa> set_axis_tickformat("ay1","%1.2f") # Label the y-axis; reformat the y-axis tickmarks sherpa> set_current_plot("plot1") sherpa> set_plot_ylabel("F(X) = SNR") sherpa> set_axis_tickformat("ay1","%1.2f") # Change the title; adjust the space between the two plots sherpa> set_plot_title("ACIS 25000 Counts Per Chip") sherpa> adjust_grid_ygap(0.04) # Add a label that describes the third-order polynomial fit sherpa> add_label(3.0,1.7,"F(X)=(1.4984)+(0.1447)X+(0.0323)X^2+(-0.0028)X^3")
The "#" sign indicates a comment and will be ignored if entered on the command line.
Figure 5
, is saved as a JPG
image and a PS file:
Use the projection() function - which may be shortened to proj() - to estimate the confidence intervals for the thawed parameters. By default, the 1-sigma upper and lower bounds are returned:
sherpa> proj() projection 1-sigma bounds: Param Best-Fit Lower Bound Upper Bound ----- -------- ----------- ----------- model1.c0 1.49843 -0.0520806 0.0520806 model1.c1 0.1447 -0.0413683 0.0413683 model1.c2 0.0322936 -0.00874421 0.00874421 model1.c3 -0.00277729 -0.000523298 0.000523298
In CIAO 3.4, the goodness command was used to get the chi-squared goodness-of-fit. This information is now reported with the best-fit values after a fit; it is no longer necessary to run a separate command. get_fit_results() allows access to this information after the fit has been performed.
For the final fit (third-order polynomial), the results are:
Statistic value = 30.8491 at function evaluation 10 Data points = 11 Degrees of freedom = 7 Probability [Q-value] = 6.62869e-05 Reduced statistic = 4.40701
Linking Model Parameters
Instead of empirically fitting a polynomial to the data as before, we now wish to fit a first order polynomial such that the offset constant parameter is the following product:
offset = 4.3979 * c1
where c1 is the first order coefficient and 4.3979 = log(25000).
First, a new polynom1d model is established and is named model2:
sherpa> set_source(polynom1d.model2)
For a third-order polynomial fit, thaw the first three constant parameters of model2:
The offset parameter of model2 is linked to the value of the c1 parameter:
sherpa> model2.offset = (model2.c1)*(4.3979)
The resulting source model expression is:
sherpa> print(get_model()) model2 Param Type Value Min Max Units ----- ---- ----- --- --- ----- model2.c0 thawed 1 -1e+120 1e+120 model2.c1 thawed 0 -1e+120 1e+120 model2.c2 thawed 0 -1e+120 1e+120 model2.c3 thawed 0 -1e+120 1e+120 model2.c4 frozen 0 -1e+120 1e+120 model2.c5 frozen 0 -1e+120 1e+120 model2.c6 frozen 0 -1e+120 1e+120 model2.c7 frozen 0 -1e+120 1e+120 model2.c8 frozen 0 -1e+120 1e+120 model2.offset linked 0 -1e+120 1e+120 sherpa> print(get_model().offset) val = 0 min = -1e+120 default = 0 max = 1e+120 units = frozen = True link = (model2.c1 * 4.3979)
Here the details of the link of the model2.offset are shown along with the initial settings.
The dataset is then fit:
sherpa> fit() Statistic value = 30.8491 at function evaluation 25 Data points = 11 Degrees of freedom = 7 Probability [Q-value] = 6.62869e-05 Reduced statistic = 4.40701 model2.c0 1.64339 model2.c1 0.193666 model2.c2 0.0251972 model2.c3 -0.00277729
Plot the best fit model and add axis labels:
sherpa> plot_fit() sherpa> set_plot_xlabel("X = Off-Axis (arcmin)") sherpa> set_plot_ylabel("F(X) = SNR")
Figure 6
shows the resulting plot.
To compare the fitting results obtained using model2 to those obtained using model1:
sherpa> print(model1) model1 Param Type Value Min Max Units ----- ---- ----- --- --- ----- model1.c0 thawed 1.49843 -1e+120 1e+120 model1.c1 thawed 0.1447 -1e+120 1e+120 model1.c2 thawed 0.0322936 -1e+120 1e+120 model1.c3 thawed -0.00277729 -1e+120 1e+120 model1.c4 frozen 0 -1e+120 1e+120 model1.c5 frozen 0 -1e+120 1e+120 model1.c6 frozen 0 -1e+120 1e+120 model1.c7 frozen 0 -1e+120 1e+120 model1.c8 frozen 0 -1e+120 1e+120 model1.offset frozen 0 -1e+120 1e+120 sherpa> print(model2) model2 Param Type Value Min Max Units ----- ---- ----- --- --- ----- model2.c0 thawed 1.64339 -1e+120 1e+120 model2.c1 thawed 0.193666 -1e+120 1e+120 model2.c2 thawed 0.0251972 -1e+120 1e+120 model2.c3 thawed -0.00277729 -1e+120 1e+120 model2.c4 frozen 0 -1e+120 1e+120 model2.c5 frozen 0 -1e+120 1e+120 model2.c6 frozen 0 -1e+120 1e+120 model2.c7 frozen 0 -1e+120 1e+120 model2.c8 frozen 0 -1e+120 1e+120 model2.offset linked 0.851724 -1e+120 1e+120
Again, the chi-squared goodness-of-fit was reported with the fit results:
Statistic value = 30.8491 at function evaluation 25 Data points = 11 Degrees of freedom = 7 Probability [Q-value] = 6.62869e-05 Reduced statistic = 4.40701
Independently Fitting a Second Dataset
Finally, we fit a different dataset with a third-order polynomial.
This dataset and errors are read and loaded into Sherpa as dataset with id="2" so that the previous dataset (with the default id="1") isn't overwritten:
sherpa> !more data2.dat 0.5 2.4602 0.06151 1.5 2.4602 0.06151 2.5 2.6373 0.06151 3.5 2.9393 0.06151 4.5 3.2628 0.06151 5.5 3.4131 0.06151 6.5 3.4999 0.06151 7.5 3.6038 0.06151 8.5 3.6374 0.06151 9.5 3.8728 0.06151 10.5 4.3325 0.06151 sherpa> load_data(2, "data2.dat", 3) sherpa> print(get_data()) name = data1.dat x = Float64[11] y = Float64[11] staterror = Float64[11] syserror = None
Dataset id=2 may now be plotted. Note that you need to specify the id in the plot_data() function:
sherpa> plot_data(2) sherpa> set_plot_xlabel("X = Off-Axis (arcmin)") sherpa> set_plot_ylabel("F(X) = SNR")
Another polynom1d model - model3 - is established and the first three constants are thawed:
sherpa> set_source(2,polynom1d.model3) sherpa> thaw(model3.c1,model3.c2,model3.c3)
Dataset id=2 is then fit with the model3. Note that id has been explicitly stated in the fit() function:
sherpa> fit(2) Statistic value = 35.5324 at function evaluation 10 Data points = 11 Degrees of freedom = 7 Probability [Q-value] = 8.88089e-06 Reduced statistic = 5.07605 model3.c0 2.19527 model3.c1 0.286636 model3.c2 -0.0234649 model3.c3 0.0013769
And the best fit model is plotted:
sherpa> plot_fit(2) sherpa> set_plot_xlabel("X = Off-Axis (arcmin)") sherpa> set_plot_ylabel("F(X) = SNR")
This dataset may need another order in the polynomial, so we free model3.c4 and fit again:
sherpa> thaw(model3.c4) sherpa> fit(2) Statistic value = 1.18643 at function evaluation 12 Data points = 11 Degrees of freedom = 6 Probability [Q-value] = 0.97755 Reduced statistic = 0.197739 model3.c0 2.60526 model3.c1 -0.407013 model3.c2 0.254528 model3.c3 -0.0377019 model3.c4 0.00177631 sherpa> plot_fit(2) sherpa> set_plot_xlabel("X = Off-Axis (arcmin)") sherpa> set_plot_ylabel("F(X) = SNR")
Figure 7
shows the resulting plot.
Checking Sherpa Session Status
A variety of information about the final status of this Sherpa session may be obtained:
# method and statistic setting
sherpa> print(get_method_name())
levmar
sherpa> print(get_stat_name())
chi2gehrels
# observed counts in each dataset
sherpa> calc_data_sum(1)
29.4115
sherpa> calc_data_sum(2)
36.1193
# what types of models are defined
sherpa> get_model_type("model1")
'polynom1d'
sherpa> get_model_type("model2")
'polynom1d'
sherpa> get_model_type("model3")
'polynom1d'
# final fit values for the source models
sherpa> print(get_model(1))
model2
Param Type Value Min Max Units
----- ---- ----- --- --- -----
model2.c0 thawed 1.64339 -1e+120 1e+120
model2.c1 thawed 0.193666 -1e+120 1e+120
model2.c2 thawed 0.0251972 -1e+120 1e+120
model2.c3 thawed -0.00277729 -1e+120 1e+120
model2.c4 frozen 0 -1e+120 1e+120
model2.c5 frozen 0 -1e+120 1e+120
model2.c6 frozen 0 -1e+120 1e+120
model2.c7 frozen 0 -1e+120 1e+120
model2.c8 frozen 0 -1e+120 1e+120
model2.offset linked 0.851724 -1e+120 1e+120
sherpa> print(get_model(2))
model3
Param Type Value Min Max Units
----- ---- ----- --- --- -----
model3.c0 thawed 2.60526 -1e+120 1e+120
model3.c1 thawed -0.407013 -1e+120 1e+120
model3.c2 thawed 0.254528 -1e+120 1e+120
model3.c3 thawed -0.0377019 -1e+120 1e+120
model3.c4 thawed 0.00177631 -1e+120 1e+120
model3.c5 frozen 0 -1e+120 1e+120
model3.c6 frozen 0 -1e+120 1e+120
model3.c7 frozen 0 -1e+120 1e+120
model3.c8 frozen 0 -1e+120 1e+120
model3.offset frozen 0 -1e+120 1e+120
History
| 14 Nov 2007 | rewritten for CIAO 4.0 Beta 3 |
