AHELP for CIAO 4.3 Sherpa v1

# cstat

Context: statistics

## Synopsis

A maximum likelihood function

## Description

The cstat statistic is equivalent to the XSPEC implementation of the Cash statistic.

Counts are sampled from the Poisson distribution, and so the best way to assess the quality of model fits is to use the product of individual Poisson probabilities computed in each bin i, or the likelihood L:

`L = (product)_i [ M(i)^D(i)/D(i)! ] * exp[-M(i)]`

where M(i) = S(i) + B(i) is the sum of source and background model amplitudes, and D(i) is the number of observed counts, in bin i.

The cstat statistic (Cash 1979, ApJ 228, 939) is derived by (1) taking the logarithm of the likelihood function, (2) changing its sign, (3) dropping the factorial term (which remains constant during fits to the same dataset), (4) adding an extra data-dependent term, and (5) multiplying by two:

`C = 2 * (sum)_i [ M(i) - D(i) + D(i)*[log D(i) - log M(i)] ]`
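The derivation can be checked numerically with a minimal sketch in plain Python. The function names and the count and model values are illustrative assumptions, not part of Sherpa. The sketch also assumes that the D(i)*log D(i) term is taken as its limit, zero, when a bin has no counts. Because the dropped factorial term and the added term depend only on the data, the difference in C between two models should equal -2 times the difference in their log-likelihoods.

```python
import math

def poisson_log_likelihood(data, model):
    """ln L = sum_i [ D(i)*ln M(i) - M(i) - ln D(i)! ]."""
    return sum(d * math.log(m) - m - math.lgamma(d + 1)
               for d, m in zip(data, model))

def cstat(data, model):
    """C = 2 * sum_i [ M(i) - D(i) + D(i)*(ln D(i) - ln M(i)) ]."""
    total = 0.0
    for d, m in zip(data, model):
        total += m - d
        if d > 0:  # assumption: D*ln(D) is taken as its limit, 0, when D = 0
            total += d * (math.log(d) - math.log(m))
    return 2.0 * total

counts = [3, 0, 7, 2]            # illustrative observed counts D(i)
model_a = [2.5, 0.4, 6.0, 2.2]   # illustrative model amplitudes M(i)
model_b = [3.0, 0.1, 7.5, 1.8]

# Data-only terms cancel when comparing two models on the same dataset.
delta_c = cstat(counts, model_a) - cstat(counts, model_b)
delta_lnl = (poisson_log_likelihood(counts, model_a)
             - poisson_log_likelihood(counts, model_b))
print(delta_c, -2.0 * delta_lnl)
```

The extra data-dependent term also makes C vanish when the model reproduces the counts exactly, which is what allows the value of the statistic itself to be interpreted.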

The factor of two exists so that the change in the cstat statistic from one model fit to the next, (Delta)C, is distributed approximately as (Delta)chi-square when the number of counts in each bin is high (> 5). One can then in principle use (Delta)C instead of (Delta)chi-square in certain model comparison tests. However, unlike chi-square, the cstat statistic may be used regardless of the number of counts in each bin.
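As a rough illustration of such a comparison, the sketch below converts (Delta)C into an approximate p-value. It assumes two nested models that differ by exactly one free parameter, so that (Delta)C is compared against a chi-square distribution with one degree of freedom. The function name and the statistic values are hypothetical, and the approximation only holds in the high-count regime described above.

```python
import math

def delta_c_p_value(c_simple, c_complex):
    """Approximate p-value for one extra free parameter, treating
    Delta C as chi-square distributed with 1 degree of freedom."""
    delta_c = c_simple - c_complex
    if delta_c < 0:
        raise ValueError("the more complex nested model should not fit worse")
    # Survival function of chi-square with 1 dof: P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(delta_c / 2.0))

# Illustrative statistic values, not taken from real fits.
p = delta_c_p_value(112.4, 105.1)
print(p)
```

A small p-value suggests that the extra parameter improves the fit by more than chance alone would explain, under the stated assumptions.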

The advantage of cstat over Sherpa's implementation of cash is that one can assign an approximate goodness-of-fit measure to a given value of the cstat statistic, i.e. the observed statistic, divided by the number of degrees of freedom, should be of order 1 for good fits.
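The goodness-of-fit measure amounts to a single division, sketched here under the usual assumption that the number of degrees of freedom is the number of bins minus the number of free model parameters. The function name and input values are illustrative only.

```python
def reduced_cstat(stat_value, n_bins, n_free_params):
    """Observed cstat value divided by the degrees of freedom."""
    dof = n_bins - n_free_params  # assumption: dof = bins - free parameters
    if dof <= 0:
        raise ValueError("need more bins than free parameters")
    return stat_value / dof

# Illustrative fit: 100 bins, 2 free parameters, cstat = 103.9
print(reduced_cstat(103.9, 100, 2))
```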

### Background Subtraction

The background should not be subtracted from the data when this statistic is used. It should be modeled simultaneously with the source.

### Zero and negative values

The cstat statistic evaluates the logarithm of the number of counts in each bin, which is undefined when that number is zero or negative. The behavior in this case is controlled by the truncate and trunc_value settings in the .sherpa.rc file; see "ahelp sherparc" for details on this file.

If truncate is set to True (the default), then log(<trunc_value>) is substituted into the equation, and the statistics calculation proceeds. The default trunc_value is 1.0e-25.

If truncate is set to False, cstat returns an error and stops the calculation when the number of counts in a bin is zero or negative. The trunc_value setting is not used.
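The two behaviors can be pictured with a small stand-alone helper. This is a hypothetical sketch that mirrors the description above. It is not Sherpa's own code, and the function name and error type are assumptions.

```python
import math

def log_counts(counts, truncate=True, trunc_value=1.0e-25):
    """Logarithm of the counts in one bin, following the truncate rule."""
    if counts > 0:
        return math.log(counts)
    if truncate:
        # Substitute log(trunc_value) so the calculation can proceed.
        return math.log(trunc_value)
    # truncate is False: stop with an error; trunc_value is not used.
    raise ValueError("cannot take the log of zero or negative counts")

print(log_counts(4))   # ordinary bin
print(log_counts(0))   # truncated bin, uses log(1.0e-25)
```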

## Example

```
sherpa> set_stat("cstat")
sherpa> show_stat()
Statistic: CStat
```

Set the fitting statistic and then confirm the new value.

## Bugs

See the bugs pages on the Sherpa website for an up-to-date listing of known bugs.