Subsections

13. wavdetect Theory

This chapter is a draft copy of the paper "A Wavelet-Based Algorithm for the Spatial Analysis of Poisson Data'' by P.E. Freeman, V. Kashyap, R. Rosner, D.Q. Lamb, Ap. J. Suppl. Series, Vol. 138, p. 185, 2002. The published paper is available online via astro-ph: http://arXiv.org/abs/astro-ph/0108429 .

13.1 Source Detection Using Wavelet Functions

A wavelet is a member of a family of oscillatory scaleable functions which deviates from zero within a limited spatial regime, and has zero normalization. Perhaps the simplest example of such a function is a positively valued "top hat," with amplitude and width , flanked by two negatively valued troughs with total integrated area . (This in fact describes the Haar wavelet family.) Any function with zero normalization satisfying the property

$\displaystyle W_{a,\sigma}(x) \equiv \frac{1}{\sigma} \, W\left(\frac{x-a}{\sigma}\right) \,$

(13.1)

may be used as a wavelet function. $\sigma$ is the scaling or dilation parameter and

is the translation parameter.

The importance of wavelet functions only began to be recognized in early 1980s, when they were used in the analysis of seismic data (Goupillaud et al. 1984). Wavelets are now used as tools in atomic physics, the study of fractal structures, time series, sound, and image analysis, etc. Slezak, Bijaoui, & Mars (1990) were the first to apply wavelets in astronomy, applying them to galaxy catalogue data to search for statistically significant clusters and voids. The interested reader may find more information on wavelets in Daubechies (1992) or Holschneider (1995).

Wavelet transforms, convolutions of an arbitrary analyzable function with a set of suitably scaled wavelet functions, provides a superior means to detect and characterize sources in an image, compared with classical "sliding-box''^13.1 and Fourier methods. Because a wavelet is a function with limited spatial extent, it can be used to determine source locations, like the sliding-box. Unlike the sliding-box, however, it can also be used to identify the dominant frequencies in the analyzed function, i.e. to characterize source shape and extent. Such characterization could also be done with Fourier analysis, but unlike Fourier methods, wavelets determine localized frequency information, as a consequence of their localization in both the spatial and Fourier domains.

The program wtransform applies wavelets to the problem of detecting sources in X-ray data, and the associated program wrecon analyzes the detected sources and constructs the source list. While others have developed codes with similar analysis goals (e.g. Vikhlinin, Forman, & Jones 1994; Rosati et al. 1995; Grebenev, Forman, & Jones 1995), wtransform and wrecon are considerably more complex, e.g. correcting for exposure variation across the field of view (FOV), and making error estimates, while not requiring that the point spread function (PSF) have a particular functional form (which is particularly important for analysis of Chandra images). Another program of similar complexity but using different methods is the palermo wdetect program of Damiani et al. (1997).

In §13.2 we describe the basic recipe one would follow to detect sources in data, without reference to any particular wavelet function. We do this is to underscore the fact that the basic details of the method are not dependent on the choice of wavelet function, so long as that function is simple, with one centralized positive amplitude mode. The fact that the PSF of X-ray detectors often has Gaussian-like shape motivates our use of the Marr wavelet, or "Mexican Hat" (MH) function, which we describe in Appendix 13.5.

The program wtransform uses the correlation of the MH function with a given image to identify putative source pixels. In §13.3 we describe in detail the steps that wtransform follows, in particular discussing how the program estimates the local background at each pixel, how it corrects for the systematic shift that a pixel's correlation value will take if the exposure varies within the limited spatial extent of the wavelet function, and how it performs error analysis. In appendices which relate to §13.3, we present derivations of analytic quantities related to the MH function (Appendix 13.6), as well as describe the simulations used to estimate the threshold correlation value for source detection in each pixel (Appendix 13.7). In §13.4 we discuss in detail how the source list is constructed by wrecon , and in an appendix we describe how wrecon computes a default background image for use in determining source properties (Appendix 13.8).

13.2 Basic Algorithm

The algorithm by which one uses a simple unimodal wavelet function $W(\sigma_x,\sigma_y,x,y)$ to detect sources in a two-dimensional image is conceptually simple.^13.2 This function is convolved with to produce a "correlation image" :

$\displaystyle C(\sigma_x,\sigma_y,x,y)~=~{\int}{\int}~dx'~dy'~W(\sigma_x,\sigma_y,x-x',y-y')~D(x',y')~{\equiv}~<W{\ast}D> .$

(13.2)

(Note that the data are assumed to have units counts, and not counts per second.) The expectation value of $C(\sigma_x,\sigma_y,x,y)$ is zero, if there are no sources within the limited spatial extent of the wavelet function, and the background count rate is locally constant, because the normalization of $W(\sigma_x,\sigma_y,x,y)$ is zero.

A clump of counts will manifest itself as a local maximum in a correlation image if the scale sizes $(\sigma_x,\sigma_y)$ are approximately the same as, or larger than, the size of the clump. One can see why this is so more easily if we write $C~=~<PW{\ast}D> + <NW{\ast}D>$ , where and denote the positive and negative amplitude portions of the wavelet function, respectively. If the clump is contained completely within , then the contribution of the positive term to outweighs that of the negative term, producing a maximum.^13.3 If the scale sizes are smaller, then the wavelet function will extend over a smaller region within the clump, within which the inferred counts amplitude will tend more and more to a constant. Hence $C \rightarrow 0$ . For larger scale sizes, the correlation value tends asymptotically to a maximum, $C_{\rm max}$ .

Source detection boils down to the question: how large must the value $C(\sigma_x,\sigma_y,x,y)$ become before we may safely accept that a source has contributed counts to the observed clump? To answer this, we compute the Type I error, or the probability that we would erroneously accept a source when the null hypothesis, that there is no source, is correct. We compute this quantity, also called the significance, in each image pixel :

$\displaystyle S_{i,j}~=~\int_{C_{i,j}}^{\infty} dC p(C \vert n_{B,i,j}) .$

(13.3)

$n_{B,i,j}$ is the inferred number of background counts within the limited spatial extent of the wavelet function, and $p(C \vert n_{B,i,j})$ is the probability sampling distribution for

given $n_{B,i,j}$ . By definition, $0 \leq S_{i,j} \leq 1$ ; if a source is present, $S_{i,j} \rightarrow$ 0. If $S_{i,j} \leq S_o$ , a user-defined threshold (e.g. 10 $^{-6}$ ), pixel

is identified as a source pixel. (Note that this identification may be made for any correlation image pixel, not just those which are local maxima in correlation space.)

If $n_{B,i,j}$ is estimated from the raw data themselves, this estimate will be biased if source counts are present, so that $S_{i,j} \ga S_{i,j,{\rm true}}$ . Thus an iterative procedure is used to remove source counts from the image:

estimate $B_{i,j}$ , the number of background counts in pixel , and compute $C_{i,j} = <W{\ast}D>_{i,j}$ ;
compute $S_{i,j}$ ;
determine which pixels are associated with (the strongest) sources;
replace $D_{i,j}$ with $B_{i,j}$ in those pixels, and call the new data image ;
estimate $B_{i,j}'$ using , and compute $C_{i,j}' = <W{\ast}D'>_{i,j}$ ;
compute $S_{i,j}'$ ;
determine which pixels are associated with (weaker) sources;
etc.

One continues to iterate until $B_{i,j}^{\rm final}$ is determined. (The usual number of iterations depends upon many factors, but is usually $\approx$ 3-4.) With this final background estimate, one computes a final significance $S_{i,j}^{\rm final}$ for each value $C_{i,j} = <W{\ast}D>_{i,j}$ (the correlation of the wavelet function with the raw image data) so that a final listing of source pixels may be made.

After this algorithm is used to determine lists of source pixels for many wavelet scale sizes, cross-identification of pixels across scales is performed to create the final source list. This cross-identification allows the weeding out of spurious sources which may be detected even if a high significance threshold is used.

13.3 Source Pixel Identifications

13.3.1 Background Estimation

In §13.2, we noted how the condition $<NW{\ast}D> \leq$ 0 can be interpreted to mean that the performs a "background subtraction." It is thus natural to use a function of $<NW{\ast}D>$ to estimate in pixel , if it is a priori unknown:

$\displaystyle B_{i,j}~=~\frac{1}{N_{i,j}}<NW{\ast}D>_{i,j} .$

(13.4)

$N_{i,j}$ is the normalization. Variations in exposure within the limited spatial extent of the wavelet function will affect the estimate of $N_{i,j}$ , which we estimate as $N_{i,j} = \frac{<NW{\ast}E>_{i,j}}{E_{i,j}}$ . (If exposure variations may be safely ignored, then $N_{i,j}$ can be computed analytically; see Appendix 13.6.) The normalized (or flat-field) background $B_{i,j,{\rm norm}}$ is directly related to $B_{i,j}$ :

$\displaystyle B_{i,j}~=~E_{i,j}\frac{<NW{\ast}D>_{i,j}}{<NW{\ast}E>_{i,j}}~=~E_{i,j}B_{i,j,{\rm norm}}$

(13.5)

$B_{i,j,{\rm norm}}$ is an estimate of the number of counts that would be detected if there were no exposure variations; it differs from the number of photons seen from a similarly sized region of sky by a factor related to the efficiencies of the mirror assembly and detector. Note that $B_{i,j,{\rm norm}}$ is set to zero in pixels with exposure less than a user-specified threshold.

We stress that one should not interpret $B_{i,j}$ blindly as the actual background amplitude $B_{i,j,{\rm true}}$ ; it should only be viewed as a computation made in order to select source pixels. For instance, if a small wavelet scale is applied in a region with a large PSF, then $B_{i,j} \gg B_{i,j,{\rm true}}$ around any source, because the estimate of $B_{i,j}$ is unavoidably biased by the source counts which lie in the region spanned by . In wrecon , we provide a default method of combining information across scales to estimate $B_{i,j,{\rm true}}$ (which we discuss in Appendix 13.8); it is this quantity which is used to help determine source properties.

13.3.2 Correlation Image

13.3.2.0.1 Analytic Correlation.

We detect sources in the image

by correlating these data with the wavelet function:

$\displaystyle C_{i,j}~$	$\displaystyle =$	$\displaystyle ~<W{\ast}D>_{i,j}$
	$\displaystyle =$	$\displaystyle ~\sum_{i'} \sum_{j'} \left[ \int_{{\Delta}x'} \int_{{\Delta}y'} dx' dy' W(\sigma_x,\sigma_y,x',y')\right] D_{i',j'}$	(13.6)
	$\displaystyle =$	$\displaystyle ~\sum_{i'} \sum_{j'} W_{i-i',j-j'} D_{i',j'}$	(13.7)

The wavelet is centered in pixel

(e.g.

= [0.5,0.5] for pixel

= [1,1]), and ${\Delta}x'$ and ${\Delta}y'$ represent the limits of integration over pixel

. The integral in Eq. (13.6) can be solved analytically if the MH function (Eq. [13.37]) is used; we present this solution in Appendix 13.6. For the MH function, summation over

and

within an ellipse with semi-major and minor axes of length $\approx$ 5 $\sigma$ is sufficient to determine $C_{i,j}$ to high accuracy.

13.3.2.0.2 Calculation using the Fast Fourier Transform.

Analytic integration may be too computationally intensive if the image or scale size is large. In these situations, we invoke the Correlation Theorem, which allows us to use Fast Fourier Transforms (FFTs) to compute

$\displaystyle C~\approx~{\rm FFT}^{-1} [N \times {\rm FFT}(W) \times {\rm FFT}^{\ast}(D) ] .$

(13.8)

The right side of this equation may be read as follows: "the inverse FFT of the product of a normalization factor

, the FFT of

, and the complex conjugate of the FFT of

." To avoid problems associated with FFT aliasing, we pad the image

with zeros; an axis-length in the padded image is, in general, the next power of two larger than the length of that axis in the unpadded image (e.g. for an image of size 1024 $\times$ 1024, the padded image has size 2048 $\times$ 2048, with the contents of the unpadded image in the center). We compute the FFT using realft and associated routines of Press et al. (1992). For the particular case of the MH function, we may compute the Fourier Transform analytically; we present the derivation in Appendix 13.6.

13.3.2.1 Correcting for Exposure Variation

Variations in exposure within the limited spatial extent of the MH function, e.g. caused by the shadows of mirror supports and ribs, or by the edge of the FOV itself, will reduce the accuracy of the estimate of $C_{i,j}$ . This quantity may be overestimated or underestimated, depending upon the location of pixel relative to regions of reduced exposure, but an overestimate is more damaging, as it can lead to the detection of a spurious source. An overestimate will occur if the exposure reduction primarily occurs within . To see this, let us assume the counts in regions without exposure variation to be a constant, and that the exposure reduction causes a reduction in counts ${\Delta}D$ . Then:

$\displaystyle C_{i,j}~$	$\displaystyle =$	$\displaystyle ~<PW{\ast}D>_{i,j}~+~<NW{\ast}(D-{\Delta}D)>_{i,j}$
	$\displaystyle =$	$\displaystyle ~-<NW{\ast}{\Delta}D>_{i,j}$
	$\displaystyle >$	$\displaystyle ~0$

In this example, $C_{i,j}^{\rm novar} =$ 0. Thus exposure variations can lead to the detection of spurious sources near shadows (but not within them) and near the FOV edge. Below, we present two methods for correcting the estimate $C_{i,j}$ .

13.3.2.1.1 "Full" Exposure Correction.

We represent the data as the sum of source counts

and background counts

$\displaystyle C_{i,j}~=~<W{\ast}D>_{i,j}~=~<W{\ast}S>_{i,j}~+~<W{\ast}B>_{i,j}$

(13.9)

We rewrite the second term in terms of the exposure

and normalized background $B_{\rm norm}$ :

$\displaystyle <W{\ast}B>_{i,j} = <W{\ast}(EB_{\rm norm})>_{i,j} .$

(13.10)

We estimate what the correlation value would be if $E~=~E_{i,j}$ throughout the region spanned by the wavelet function:

$\displaystyle C_{i,j,{\rm cor}}~=~C_{i,j}~-~<W{\ast}(EB_{\rm norm})>_{i,j}~+~E_{i,j}<W{\ast}B_{\rm norm}>_{i,j} .$

(13.11)

As noted above, $C_{i,j,{\rm cor}} \la C_{i,j}$ if exposure reductions primarily occur in

. As a default,

is set to one in each pixel, so that the edge-of-field correction may be made even if the exposure map is not read; this default array is then overwritten if the exposure map is entered. We note that the calculation of $<W{\ast}B_{\rm norm}>_{i,j}$ is prone to systematic error unless $B_{\rm norm}$ is extrapolated over regions of low exposure (where $B_{\rm norm}$ is set to zero) and beyond the edge of the FOV.^13.4

The source counts, , may also be altered by variations in exposure if the source is extended; however, correcting requires knowledge of $S_{\rm norm}$ , which is a priori unknown. Also, our prime motivation for correcting $C_{i,j}$ is to reduce the number of spurious sources, and correcting does not help us in this regard.

13.3.2.1.2 "Fast" Exposure Correction.

The full exposure correction requires wtransform to perform calculations during each iteration, since the value of $B_{i,j,{\rm norm}}$ will change from iteration to iteration until sources are cleansed from the image. However, we can make a "faster" exposure correction if we assume that $B_{\rm norm}$ is locally constant (which is often a safe assumption, especially if $\sigma_x$ and $\sigma_y$ are small). If this is the case, then $<W{\ast}B_{\rm norm}>_{i,j}$ = 0, and

$\displaystyle C_{i,j,{\rm cor}}~=~C_{i,j}~-~B_{i,j,{\rm norm}}<W{\ast}E>_{i,j} .$

(13.12)

This correction is deemed "fast" because wtransform calculates $<W{\ast}E>_{i,j}$ once, as opposed to once per iteration. To ensure accuracy, however, the full exposure correction is performed once the estimate of $B_{i,j,{\rm norm}}$ has stabilized.

13.3.2.2 Calculation of Significance

In §13.2, we show how the significance is computed in each pixel. However, for reasons discussed fully in Appendix 13.7, wtransform does not compute significances directly, but rather compares each correlation value $C_{i,j}$ with a threshold correlation value $C_{i,j,o}$ . This threshold is found by estimating the lower bound on the integral in Eq. (13.3), given , a user-specified significance threshold (e.g. 10 $^{-6}$ ), and $q_{i,j} = 2{\pi}\sigma_x\sigma_yB_{i,j}$ . If $C_{i,j} > C_{i,j,o}$ , pixel is identified as a possible source pixel.

In wtransform , the user actually specifies two threshold significance values: $S_{o,sc}$ , for source cleansing, and . During iterations, $S_{i,j}$ is compared to $S_{o,sc}$ , which may be set to be much larger than (e.g. 10 $^{-2}$ ) to help reduce the bias caused by source counts in estimates of $B_{i,j}$ .

13.3.3 Error Estimation

The analytic calculations of $C_{i,j}$ and $B_{i,j}$ involve the pixel-by-pixel summation of the independent Poisson-distributed random variable $D_{i,j}$ . To compute the variances of these quantities, we use the general formula

$\displaystyle V[Y]~=~V[\sum_m a_m X_m]~=~\sum_m a_m^2 V[X_m] + 2 \sum_m \sum_{n > m} a_m a_n {\rm cov}[X_m,X_n] .$

(13.13)

where $Y = \sum_m a_m X_m$ , and

are the random variables (see, e.g., Eadie et al. 1971). If

cannot be written simply as a linear function of random variables, we use the approximation formula

$\displaystyle V[Y]~\approx~\sum_m \sum_n \frac{{\partial}Y}{{\partial}X_m} \frac{{\partial}Y}{{\partial}X_n} {\rm cov}[X_m,X_n]$

(13.14)

depends directly on $D_{i,j}$ , the covariance terms are zero, and the approximation formula simplies to the familiar formula $V[Y]~\approx~\sum_m \left(\frac{{\partial}Y}{{\partial}X_m}\right)^2 V[X_m]$ .

13.3.3.0.1 Background.

We rewrite Eq. (13.5) in terms of summations:

$\displaystyle B_{i,j}~=~\frac{E_{i,j}}{<NW{\ast}E>_{i,j}}\sum_{i'}\sum_{j'} NW_{i-i',j-j'} D_{i',j'}$

(13.15)

The random variables $D_{i',j'}$ have variance $V[D_{i',j'}] = D_{i',j'}$ .

$\displaystyle V[B_{i,j}]~$	$\displaystyle =$	$\displaystyle ~\left(\frac{E_{i,j}}{<NW{\ast}E>_{i,j}}\right)^2\sum_{i'}\sum_{j'} NW_{i-i',j-j'}^2 D_{i',j'}$	(13.16)
	$\displaystyle =$	$\displaystyle ~\left(\frac{E_{i,j}}{<NW{\ast}E>_{i,j}}\right)^2 < NW^2{\ast}D >_{i,j}$	(13.17)
	$\displaystyle =$	$\displaystyle ~E_{i,j}^2 V[B_{i,j,{\rm norm}}]$

This quantity is calculated only after the data are cleansed of sources. It may be calculated analytically (Eq. [13.16]) or using FFTs (Eq. [13.17]).

13.3.3.0.2 Correlation: No Exposure Correction.

$\displaystyle V[C_{i,j}]~$	$\displaystyle =$	$\displaystyle ~V[\sum_{i'} \sum_{j'} W_{i-i',j-j'}D_{i',j'}]$
	$\displaystyle =$	$\displaystyle ~\sum_{i'} \sum_{j'} W_{i-i',j-j'}^2 D_{i',j'}$	(13.18)
	$\displaystyle =$	$\displaystyle ~<W^2{\ast}D>_{i,j}$

This variance is computed once, after the first iteration, either analytically (Eq. [13.18]) or through the use of FFTs (Eq. [13.19]). To speed computation when using Fourier Transforms, we use the analytically-derived quantity

. Details on its derivation may be found in Appendix 13.6.

13.3.3.0.3 Correlation: Full Exposure Correction.

Rewriting Eq. (13.11) in terms of summations and rearranging terms, we write the full exposure correction as:

$\displaystyle C_{{\rm cor},i,j}~$	$\displaystyle =$	$\displaystyle ~\sum_{i'} \sum_{j'} \left[W_{i-i',j-j'}-\left(\sum_{i''} \sum_{j... ...,j'-j''}(E_{i,j}-E_{i'',j''})}{<NW{\ast}E>_{i'',j''}} \right) \right] D_{i',j'}$
	$\displaystyle =$	$\displaystyle ~\sum_{i'} \sum_{j'} a_{i,i',i'',j,j',j''} D_{i',j'}$

The variance is then

$\displaystyle V[C_{{\rm cor},i,j}]~=~\sum_{i'} \sum_{j'} a_{i,i',i'',j,j',j''}^2 D_{i',j'}$

(13.19)

The presence of one summation over image pixels within another makes the computation time for any one given pixel $\sim$ 1 CPU-minute. Thus in wtransform the covariance terms are currently ignored and the variance is computed as

$\displaystyle V[C_{{\rm cor},i,j}]~=~V[C_{i,j}] + < W^2{\ast}(E^2V[B_{\rm norm}]) >_{i,j} + E_{i,j}^2 < W^2{\ast}V[B_{\rm norm}] >_{i,j} .$

(13.20)

Initial tests indicate that this formula is accurate away from sources, but greatly overestimates the variance around sources. These tests indicate that the true variance around sources does not differ greatly from the true variance away from sources ( $\la$ 50%), so that this equation can provide the analyst a good estimate of the magnitude of the variances.

13.4 Constructing the Source List

The program wrecon is used to construct the final source list after wtransform has been run at all desired scales. To construct this list, it uses:

lists of correlation space maxima whose amplitudes are greater than the threshold specified in wtransform , created at each wavelet scale;
correlation images created by wtransform at each wavelet scale;
and "flux images" computed within wrecon itself.

Next we discuss how the flux images are computed, then show how they are used to define the cells within the FOV that delineate the extent of a putative source and which allow us to estimate its properties.

13.4.1 Flux Image

The flux image is created by smoothing the raw data, then subtracting the normalized background from the smoothed image^13.5 on a pixel-by-pixel basis:

$\displaystyle F_{i,j}~=~{\rm max}\left(\frac{<PW{\ast}D>_{i,j}}{<PW{\ast}E>_{i,j}}~-~B_{i,j,{\rm norm}},0\right) .$

(13.21)

We use

as the smoothing function because it has the desirable properties of being localized, and, for the particular case of the MH function, of mimicking the shape of a canonical Gaussian PSF.

The scale pairs $(\sigma_x,\sigma_y)$ at which to compute the flux images are user-specified; the number of pairs can be less than the number of wavelet scale pairs examined overall. Using fewer pairs reduces computation time, but may lead to a less-precise determination of source properties. Hereafter, we assume that $\sigma = \sigma_x = \sigma_y$ , forcing symmetric wavelets.

13.4.1.1 The Source Cell

To determine source properties, it is necessary to define a "cell" on the detector, in which some portion of a source's counts are detected. If we choose the size and shape of the cell, then we need detailed knowledge of the PSF to estimate the fraction of a source's counts that are recorded in the cell. However, wrecon and the other Chandra detection algorithms assume no details about a detector's PSF except for its characteristic size $r_{\rm PSF}$ , which is computed for Chandra data using the routine psf_calc_size; its functional form is unknown. But even if we did have detailed knowledge of the PSF, its use would provide a good estimate of the total source counts only if the source were point-like.

wrecon uses the flux images to define the source cell. In regions where there are no sources, the mean value of is zero; only in the vicinity of sources does deviate markedly from zero. Hence wrecon defines the source cell by

using the flux image computed at the scale $\sigma_F$ closest to the PSF size, by minimizing the function $\vert \log_2 r_{\rm PSF} - \log_2 \sigma_{F,i} \vert$ ;
determining the local maximum in -space corresponding to , the estimated location of the putative source;
and determining those pixels for which is the local maximum in -space. Pixels for which $F_{i,j} =$ 0 are not considered.

The use of $<PW{\ast}E>$ in Eq. (13.21) makes

a normalized quantity, which is beneficial since exposure variations can cause spurious flux-image maxima, which would reduce the cell size and adversely affect the determination of source properties. We stress, however, that the flux image is not used in the actual determination of source properties.

A cell defined in this manner has advantageous properties:

there are no a priori assumptions about source shape and extent;
saddle points in -space allow the definition of boundaries between nearby sources;
and the creation of the flux image involves smearing, or de-focusing, data, so that cell boundaries will lie outside the region where the source is evident to the eye. Hence, at least for isolated sources, nearly all source counts should have been detected within cell boundaries.

This approach is not always optimal when analyzing extended sources: such sources may have more than one local maximum in counts space, leading to multiple "sources" being listed. One way to counteract this is to set $\sigma_{F,\rm min} \gg r_{\rm PSF}$ , as this will increase the probability of the extended object being associated with one maximum in -space (which will ensure a sufficiently large cell). However, it also increases the probability that point sources in the vicinity of the extended source will be "swallowed up" by the larger source cell, so care must be exercised when interpreting derived source properties such as count rate. This approach is also not optimal if PSF is bimodal, as it is far off-axis for Chandra ; in this case, wtransform will output two correlation maxima for a well-sampled source, and wrecon will characterize each maximum independently of the other. Only with the a posteriori application of a detailed PSF can the two "sources" be properly combined.

It is also possible for a source cell to contain two or more underlying sources. This is indicated when there are multiple correlation maxima in the source cell at the PSF scale. The code indicates such a condition with a flag in the source list; it does not currently attempt to "break up" the cell to refine source property estimates.

13.4.1.2 Source Selection Algorithm

To create the source list, wrecon examines the lists of correlation maxima in ascending order of wavelet size. If the location of a maximum is not already associated with a source cell (see below), it will be provisionally associated with a real source if

$F_{i_C,j_C} \neq$ 0;
and the ellipse defined by

$\displaystyle \frac{(x-i_C)^2}{2\sigma_{\rm cur}^2}~+~\frac{(y-j_C)^2}{2\sigma_{\rm cur}^2}~=~1$

does not contain one or more sources first detected at scales smaller than the currently considered scale $\sigma_{\rm cur}$ . This is an important check when $\sigma_{\rm cur} \ga r_{\rm PSF}$ , as sources in a crowded field merge, possibly creating "new" sources at new locations in the FOV.

After determining for the putative source its cell and properties, wrecon first scans through the lists of all correlation maxima at all scales, and marks as "examined" those which lie within the current source cell, and then verifies that the putative source appears at one or more scales $\ga r_{\rm PSF}$ . It does this by determining the scale size closest to the PSF size, $\sigma_{\rm PSF}$ (by minimizing $\vert \log_2 r_{\rm PSF} - \log_2 \sigma_i \vert$ ), then counting the number of correlation maxima that lie within the source cell for $\sigma_i \geq \sigma_{\rm PSF}$ . If the number is zero, wrecon rejects the putative source.

13.4.2 Source Properties

13.4.2.0.1 Location.

wrecon calculates the expectation values $x_{\rm source}$ and $y_{\rm source}$ by summing pixel numbers within the source cell, using the data $D_{i,j}$ as a weighting function:

$\displaystyle x_{\rm source}~$	$\displaystyle =$	$\displaystyle ~\frac{1}{D_o}\sum_{i \in sc}\sum_{j \in sc} D_{i,j} i$	(13.22)
$\displaystyle y_{\rm source}~$	$\displaystyle =$	$\displaystyle ~\frac{1}{D_o}\sum_{i \in sc}\sum_{j \in sc} D_{i,j} j$	(13.23)

where $D_o = \sum_{i \in sc}\sum_{j \in sc} D_{i,j}$ , and

represents the source cell. We note that the estimate of location may be biased if the source cell is truncated because of the presence of a nearby source.

A preferred weighting function would be the source flux $D_{i,j}-E_{i,j}B_{i,j,{\rm norm}}$ , but its use greatly complicates the computation of variances $V[x_{\rm source}]$ and $V[y_{\rm source}]$ . (For similar reasons, $F_{i,j}$ is not used as a weighting function.) $D_{i,j}$ and $D_{i,j}-E_{i,j}B_{i,j,{\rm norm}}$ will give similar estimates if the source is strong or if background is small.

The variance of the location estimate, $V[x_{\rm source}]$ , is estimated using Eq. (13.14):

$\displaystyle V[x_{\rm source}]~$	$\displaystyle =$	$\displaystyle ~\sum_{i \in sc}\sum_{j \in sc} \left(\frac{{\partial}x_{\rm source}}{{\partial}D_{i,j}}\right)^2 V[D_{i,j}]$
	$\displaystyle =$	$\displaystyle ~\sum_{i \in sc}\sum_{j \in sc} \left(\frac{i-x_{\rm source}}{D_o}\right)^2 D_{i,j}$	(13.24)

$V[y_{\rm source}]$ is computed similarly.

13.4.2.0.2 Counts.

The source counts are estimated by summing count flux within the source cell:

$\displaystyle C_{\rm source}~=~\sum_{i \in sc}\sum_{j \in sc} (D_{i,j}-E_{i,j}B_{{\rm norm},i,j})$

(13.25)

It is possible to rewrite this equation in the form $C = \sum_i \sum_j A_{i,j} D_{i,j}$ , which in theory makes the variance directly computable. However, this calculation is computationally intensive and is not performed by the current version of the program (see the discussion around Eq. [13.65] for details). The variance is computed using the formula

$\displaystyle V[C_{\rm source}]~\approx~\sum_{i \in sc}\sum_{j \in sc} D_{i,j} + E_{i,j}^2V[B_{{\rm norm},i,j}]$

(13.26)

The missing covariance terms are positive, so Eq. (13.26) underestimates the true variance.

13.4.2.0.3 Source Exposure Time.

The effective exposure time of the source is the expectation value of

, where

is the total detector livetime, $E_{i,j}$ is the fractional exposure in each bin within the source cell ( $0 \leq E_{i,j} \leq 1$ ), and $D_{i,j}$ is the weighting function:

$\displaystyle t_{\rm source}~=~\frac{t_o}{D_o} \sum_{i \in sc}\sum_{j \in sc} D_{i,j}E_{i,j}$

(13.27)

$V[t_{\rm source}]$ is computed in an analogous manner to $V[x_{\rm source}]$ , by replacing with $t_oE_{i,j}$ :

$\displaystyle V[t_{\rm source}]~$	$\displaystyle =$	$\displaystyle ~\sum_{i \in sc}\sum_{j \in sc} \left(\frac{{\partial}t_{\rm source}}{{\partial}D_{i,j}}\right)^2 V[D_{i,j}]$
	$\displaystyle =$	$\displaystyle ~\sum_{i \in sc}\sum_{j \in sc} \left(\frac{t_oE_{i,j}-t_{\rm source}}{D_o}\right)^2 D_{i,j}$	(13.28)

13.4.2.0.4 Source Count Rate.

The source count rate is

$\displaystyle R_{\rm source,C}~=~\frac{C_{\rm source}}{t_{\rm source}} .$

(13.29)

The variance on the count rate estimate is

$\displaystyle V[R_{\rm source,C}]~$	$\displaystyle \approx$	$\displaystyle ~\left(\frac{{\partial}R_{\rm source,C}}{{\partial}C_{\rm source}... ...\partial}R_{\rm source,C}}{{\partial}t_{\rm source}}\right)^2 V[t_{\rm source}]$
	$\displaystyle =$	$\displaystyle ~\frac{1}{t_{\rm source}^2} \left( V[C_{\rm source}] + R_{\rm source,C}^2V[t_{\rm source}] \right)$	(13.30)

where $V[C_{\rm source}]$ and $V[t_{\rm source}]$ are given in eqs. (13.26) and (13.28) respectively. Equation (13.30) underestimates the true variance.

13.4.2.0.5 Background Counts in Source Cell.

The number of background counts in the source cell is

$\displaystyle B_{\rm source}~=~\sum_{i \in sc}\sum_{j \in sc} E_{i,j}B_{i,j,{\rm norm}} .$

(13.31)

The (underestimated) variance is given by

$\displaystyle V[B_{\rm source}]~\approx~\sum_{i \in sc}\sum_{j \in sc} E_{i,j}^2B_{i,j,{\rm norm}} .$

(13.32)

13.4.2.0.6 Background Count Rate in Source Cell.

The background count rate in the source cell is

$\displaystyle R_{\rm source,B}~=~\frac{B_{\rm source}}{t_{\rm source}} ,$

(13.33)

which has (underestimated) variance

$\displaystyle V[R_{\rm source,B}]~\approx~\frac{1}{t_{\rm source}^2} \left( V[B_{\rm source}] + R_{\rm source,B}^2V[t_{\rm source}] \right) .$

(13.34)

13.4.2.0.7 Source Image.

wrecon constructs a noise-free image of detected sources using the equation

$\displaystyle S_{i,j}~=~\sum_{k=1}^N \frac{C_{i,j,k}}{\sigma_{x,k}\sigma_{y,k}} \nu_{i,j,k} ,$

(13.35)

where

is the number of input scale pairs, and $\nu_{i,j,k}$ = 1 if the following conditions hold:

$C_{i,j,k} >$ 0;
the associated local correlation maximum is identified as a source pixel by wtransform ;
and the associated local correlation maximum is contained within a source cell.

Otherwise, $\nu_{i,j,k}$ = 0. The second condition ensures that random maxima which are not associated with a source but which happen to lie within a source cell are not included in the source image; the last condition ensures that putative sources rejected by wrecon are also not included in the image.

13.5 The Mexican Hat (MH) Function

The source detection algorithm we present in this paper should not depend on the details of function itself, at least for simple wavelets which have one central positive mode. In this Appendix, we describe the Marr wavelet function, or "Mexican Hat function,'' (MH) which is used by wtransform and wrecon .

A wavelet function may be derived from a real-valued, non-negative, infinitely differentiable function $\phi(x,y)$ that satisfies the condition $\int \phi(x,y) =$ 1 (see, e.g., Holschneider 1995). One function that satisfies these requirements is the Gaussian:

$\displaystyle \phi(x,y)~=~\frac{1}{2{\pi}\sigma_x\sigma_y}\exp\left(-\frac{x^2}{2\sigma_x^2}-\frac{y^2}{2\sigma_y^2}\right) ,$

(13.36)

The relationship between $\phi$ and the wavelet function

in two dimensions is

$\displaystyle W(\sigma_x,\sigma_{y},x,y)~$	$\displaystyle =$	$\displaystyle ~[(x \frac{\partial}{{\partial}x} + 1) + (y \frac{\partial}{{\partial}y} + 1)] \phi(x,y)$
	$\displaystyle =$	$\displaystyle ~[((x,y){\cdot}{\nabla}) + 2]\phi(x,y)$
	$\displaystyle =$	$\displaystyle \frac{1}{2{\pi}\sigma_x\sigma_y}\left[2-\frac{x^2}{\sigma_x^2}-\frac{y^2}{\sigma_y^2}\right]e^{-\frac{x^2}{2\sigma_x^2}-\frac{y^2}{2\sigma_y^2}}$	(13.37)

The integral of

over all space is, by definition, zero. This allows us to disregard the factor $\frac{1}{2{\pi}\sigma_x\sigma_y}$ in Eq. (13.37) when performing correlations.

The MH has a positive kernel surrounded by a negative annulus; the boundary between these two regions is an ellipse with axis-lengths $\sqrt{2}\sigma_x$ and $\sqrt{2}\sigma_y$ :

$\displaystyle \frac{x^2}{2\sigma_x^2}~+~\frac{y^2}{2\sigma_y^2}~=~1$

(13.38)

The area of this ellipse, used to compute correlation thresholds, is $A(\sigma_x,\sigma_y) = 2{\pi}\sigma_x\sigma_y$ . The amplitude of the MH function is 2 at its center, regardless of scale; its minimum amplitude is $\approx$ -0.27 on the ellipse defined by axis-lengths 2 $\sigma_x$ and 2 $\sigma_y$ .

The MH function has several advantageous properties which motivate its use.

When rotationally symmetric, its Gaussian-like positive kernel has similar shape to a canonical PSF.
The kernel's limited extent allows only those sources with intrinsically or instrumentally broadened size $\sim (\sigma_x,\sigma_y)$ to be strongly detected.
It has an analytically-derivable Fourier Transform, speeding its correlation with data via the Correlation Theorem (Appendix 13.6).
It has a limited extent in Fourier space, such that limited, discrete sampling in $\sigma_x$ and $\sigma_y$ (e.g. at values separated by factors of two) is sufficient to sample the entire frequency domain.
Its limited extent in both spatial and Fourier domains helps to minimize effects of aliasing.

13.6 Solutions to Integrals Involving the MH Function

13.6.1 Analytic Integration of the MH Function

13.6.1.1 Pixel-by-Pixel Integration

If the width of the wavelet function is of order the size of the image pixel, then using FFTs to compute $<W{\ast}D>$ and other related quantities becomes impractible. To determine $C_{i,j}$ , for instance, we integrate on a pixel-by-pixel basis, as shown in Eq. (13.7).

We substitute the form of in Eq. (13.37) into Eq. (13.7):

$\displaystyle W_{\rm p}~=~\int_{x_1}^{x_2} dx \int_{y_1}^{y_2} dy \left[ 2 - \f... ...-\frac{x^2}{2\sigma_{x}^2}\right) \exp\left(-\frac{y^2}{2\sigma_{y}^2}\right) .$

(13.39)

Here,

represent coordinates relative to the center of pixel

, and the integration is carried out over the pixel

$\displaystyle x1$	$\displaystyle =$	$\displaystyle i'-i-\frac{1}{2}~~~~y1=j'-j-\frac{1}{2}$
$\displaystyle x2$	$\displaystyle =$	$\displaystyle i'-i+\frac{1}{2}~~~~y2=j'-j+\frac{1}{2}$

$W(x,y) \approx$ 0 if

are outside an ellipse with axis-lengths 5 $\sigma_x$ and 5 $\sigma_y$ , limiting the range of integration.

If we expand Eq. (13.39), we find that there are two basic integrals which must be performed. The first is:

$\displaystyle \int_{x_1}^{x_2} dx \exp\left(-\frac{x^2}{2\sigma_{x}^2}\right)$	$\displaystyle =$	$\displaystyle \sqrt{2}\sigma_x \int_{t_1}^{t_2} dt \exp(-t^2)$
	$\displaystyle =$	$\displaystyle \int_0^{t_2} dt \exp(-t^2) - \int_0^{t_1} dt \exp(-t^2)$
	$\displaystyle =$	$\displaystyle \sqrt{\frac{\pi}{2}} \sigma_x \left[{\rm erf}\left(\frac{x_2}{\sqrt{2}\sigma_x}\right) - {\rm erf}\left(\frac{x_1}{\sqrt{2}\sigma_x}\right)\right]$
	$\displaystyle =$	$\displaystyle \sqrt{\frac{\pi}{2}} \sigma_x x_{\rm erf}$

where the substitution $t = \frac{x}{\sqrt{2}\sigma_x}$ is made, and ${\rm erf}(x)$ is the error function (see, e.g., Press et al. 1992). The second integral is:

$\displaystyle \int_{x_1}^{x_2} dx \frac{x^2}{\sigma_x^2} \exp\left(-\frac{x^2}{2\sigma_{x}^2}\right) = 2\sqrt{2}\sigma_x \int_{t_1}^{t_2} dt t^2 \exp(-t^2) .$

This integral can be solved by parts, by taking the derivative of

and the integral of $t\exp(-t^2)$ , leaving an integral of $\exp(-t^2)$ (solved above). Leaving out intermediate steps, we present the solution:

$\displaystyle x_1 \exp\left(-\frac{x_1^2}{2\sigma_{x}^2}\right) - x_2 \exp\left... ...igma_x x_{\rm erf}~=~x_{\rm diff} - \sqrt{\frac{\pi}{2}} \sigma_x x_{\rm erf} .$

Taking the products of these solutions, and exchanging

for

when needed, we find:

$\displaystyle W_{\rm p}=\pi\sigma_x\sigma_y{x_{\rm erf}}{y_{\rm erf}} - \sqrt{\... ...\rm erf}\left(y_{\rm diff} + \sqrt{\frac{\pi}{2}} \sigma_y y_{\rm erf}\right) .$

(13.40)

13.6.1.2 Integration of the Negative Annulus

If exposure variations and the FOV edge may be ignored in the computation of the background, then we may replace $<NW{\ast}E>_{i,j}$ in Eq. (13.5) with a background normalization factor, which we derive here.

The overall normalization of the MH function is zero. Hence the background normalization factor may be derived by integrating over its positive core:

$\displaystyle N_{\rm back}~=~\int_{-\sqrt{2}\sigma_{x}}^{\sqrt{2}\sigma_{x}} dx... ...\frac{x^2}{2\sigma_{x}^2}\right) \exp\left(-\frac{y^2}{2\sigma_{y}^2} \right) .$

(13.41)

The limits of integration reflect that the core extends over an ellipse with axes $\sqrt{2}\sigma_x$ and $\sqrt{2}\sigma_y$ . We reparametrize the integral using polar coordinates $(r,\theta)$ , after first mapping the boundary ellipse to a boundary circle using the transformation $y' = \frac{\sigma_{x}}{\sigma_{y}}y$ :

$\displaystyle N_{\rm back}~$	$\displaystyle =$	$\displaystyle ~\frac{\sigma_{y}}{\sigma_{x}} \int_{-\sqrt{2}\sigma_{x}}^{\sqrt{... ...^2+y'^2}{\sigma_{x}^2} \right] \exp\left(-\frac{x^2+y'^2}{2\sigma_{x}^2}\right)$	(13.42)
	$\displaystyle =$	$\displaystyle ~\frac{\sigma_{y}}{\sigma_{x}}\int_0^{2{\pi}} d\theta \int_0^{\sq... ...2 - \frac{r^2}{\sigma_{x}^2}\right] \exp\left(-\frac{r^2}{2\sigma_{x}^2}\right)$	(13.43)
	$\displaystyle =$	$\displaystyle ~2{\pi}\frac{\sigma_{y}}{\sigma_{x}} \left[\int_0^{\sqrt{2}\sigma... ...dr \frac{r^3}{\sigma_{x}^2} \exp\left(-\frac{r^2}{2\sigma_{x}^2}\right)\right].$	(13.44)

The determinant of the Jacobian of the transformation from $(x,y') \rightarrow (r,\theta)$ is

We evaluate the second integral using integration by parts:

$\displaystyle \int_0^{\sqrt{2}\sigma_{x}} dr \frac{r^3}{\sigma_{x}^2} \exp\left... ...int_0^{\sqrt{2}\sigma_{x}} dr 2 r \exp\left(-\frac{r^2}{2\sigma_{x}^2}\right) .$

(13.45)

The second integral in Eq. (13.45) cancels with the first integral in Eq. (13.44), leaving:

$\displaystyle N_{\rm back}~$	$\displaystyle =$	$\displaystyle ~2{\pi}\frac{\sigma_{y}}{\sigma_{x}} \left[r^2\exp\left(-\frac{r^2}{2\sigma_{x}}\right)\mid_0^{\sqrt{2}\sigma_{x}}\right]$
	$\displaystyle =$	$\displaystyle ~2{\pi}\frac{\sigma_{y}}{\sigma_{x}} \frac{2\sigma_{x}^2}{\exp(1)}$
	$\displaystyle =$	$\displaystyle ~\frac{4{\pi}\sigma_{x}\sigma_{y}}{\exp(1)} .$	(13.46)

13.6.2 Analytic Fourier Transform of the MH Function

The Fourier Transform of is

$\displaystyle FT(W)~$	$\displaystyle =$	$\displaystyle ~\int_{-\infty}^{+\infty} \int_{-\infty}^{+\infty} dx dy W(x,y) e^{2{\pi}i(k_{x}x+k_{y}y)}$
	$\displaystyle =$	$\displaystyle ~\int_{-\infty}^{+\infty} dy e^{2{\pi}ik_{y}y} \int_{-\infty}^{+\infty} dx W(x,y) \cos(2{\pi}k_{x}x)$
		$\displaystyle ~+~i \int_{-\infty}^{+\infty} dy e^{2{\pi}ik_{y}y} \int_{-\infty}^{+\infty} dx W(x,y) \sin(2{\pi}k_{x}x)$	(13.47)
	$\displaystyle =$	$\displaystyle ~\int_{-\infty}^{+\infty} dy e^{2{\pi}ik_{y}y} \int_{-\infty}^{+\infty} dx W(x,y) \cos(2{\pi}k_{x}x)$
	$\displaystyle =$	$\displaystyle ~\left[\int_{-\infty}^{+\infty} dy \cos(2{\pi}k_{y}y) + i\int_{-\... ...\sin(2{\pi}k_{y}y)\right] \int_{-\infty}^{+\infty} dx W(x,y) \cos(2{\pi}k_{x}x)$
	$\displaystyle =$	$\displaystyle ~\int_{-\infty}^{+\infty} dy \cos(2{\pi}k_{y}y) \int_{-\infty}^{+\infty} dx W(x,y) \cos(2{\pi}k_{x}x)$
	$\displaystyle =$	$\displaystyle ~\int_{-\infty}^{+\infty} dy \cos(2{\pi}k_{y}y) \times C_{\rm W}$	(13.48)

The wave-number

equals $\frac{2(i_k-1)k_{\rm Nyquist}}{l}$ , where

is the pixel number in Fourier space, $k_{\rm Nyquist} = \frac{1}{2}$ is the Nyquist wave-number, and

is half the length of the relevant axis in the padded image. The fourth integral in Eq. (13.47) and the second integral in Eq. (13.48) are zero, as the integrands are products of even and odd functions.

$\displaystyle C_{\rm W}~$	$\displaystyle =$	$\displaystyle ~\exp\left(-\frac{y^2}{2\sigma_{y}^2}\right) \int_{-\infty}^{+\in... ...gma_{y}^2}\right] \exp\left(-\frac{x^2}{2\sigma_{x}^2}\right)\cos(2{\pi}k_{x}x)$
	$\displaystyle =$	$\displaystyle ~\exp\left(-\frac{y^2}{2\sigma_{y}^2}\right) \left[ \left(2 - \fr... ...ac{1}{\sigma_{x}^2} \Psi_{\rm 2,c}(2{\pi},\frac{1}{\sqrt{2}\sigma_{x}}) \right]$	(13.49)

$\Psi_{\rm0,c}(p,q,{\lambda})$ and $\Psi_{\rm 2,c}(a,p)$ represent two integral solutions which one may find, e.g., in Gradshteyn & Ryzhik (1980; formulae 3.896-2 and 3.952-4 respectively):

$\displaystyle \Psi_{\rm0,c}(p,q,{\lambda})~=~\frac{1}{q}\sqrt{\pi}\exp\left(-\frac{p^2}{4q^2}\right)\cos(p{\lambda}) ,$

and

$\displaystyle \Psi_{\rm 2,c}(a,p)~=~\sqrt{\pi}\frac{2p^2-a^2}{4p^5}\exp\left(-\frac{a^2}{4p^2}\right) .$

Substituting these quantities into Eq. (13.49), one finds that

$\displaystyle C_{\rm W}~=~\exp\left(-\frac{y^2}{2\sigma_{y}^2}-2{\pi}^2k_{x}^2\... ...\pi}} \left[ 4{\pi}^2k_{x}^2\sigma_{x}^2 + 1 - \frac{y^2}{\sigma_{y}^2} \right]$

(13.50)

Substituting Eq. (13.50) into Eq. (13.48) and solving, we find:

	$\displaystyle FT(W)$
$\displaystyle =$	$\displaystyle ~\exp(-2{\pi}^2k_{x}^2\sigma_{x}^2) \sigma_{x} \sqrt{2{\pi}} \int... ...ight) \left[ 4{\pi}^2k_{x}^2\sigma_{x}^2 + 1 - \frac{y^2}{\sigma_{y}^2} \right]$
$\displaystyle =$	$\displaystyle ~\exp(-2{\pi}^2k_{x}^2\sigma_{x}^2) \sigma_{x} \sqrt{2{\pi}} \lef... ...1}{\sigma_{y}^2}\Psi_{\rm 2,c}(2{\pi}k_{y},\frac{1}{\sqrt{2}\sigma_{y}})\right]$
$\displaystyle =$	$\displaystyle ~(2{\pi})^3\sigma_{x}\sigma_{y}(k_{x}^2\sigma_{x}^2 + k_{y}^2\sigma_{y}^2)\exp[-2{\pi}^2(k_{x}^2\sigma_{x}^2+k_{y}^2\sigma_{y}^2)].$	(13.51)

To compute the error of the correlation values in each pixel, we calculate $<W^2{\ast}D>_{i,j}$ using an analytic Fourier Transform . The derivation of this function is similar to the derivation of . A new integral which appears is:

$\displaystyle \int_{-\infty}^{+\infty} dx \frac{x^4}{\sigma_{x}^4} \exp\left(-\frac{x^2}{\sigma_{x}^2}\right) \cos(2{\pi}k_{x}x)$

(13.52)

One may solve this integral by parts, differentiating the term $\frac{x^3}{\sigma_{x}^2}\cos(2{\pi}k_{x}x)$ and integrating the term $\frac{x}{\sigma_{x}^2}\exp\left(-\frac{x^2}{\sigma_{x}^2}\right)$ . The integral

$\displaystyle \int_{-\infty}^{+\infty} dx x^3 \sin(2{\pi}k_{x}x) \exp\left(-\frac{x^2}{\sigma_{x}^2}\right)$

(13.53)

has solution (see, e.g., Gradshteyn & Ryzhik 1980, formula 3.952-5):

$\displaystyle \int_{-\infty}^{+\infty} dx x^3 \sin(2{\pi}k_{x}x) \exp\left(-\fr... ...rac{k_{x}}{\sigma_{x}^2}-2{\pi}^3k_{x}^3}{2}\exp(-{\pi}^2\sigma_{x}^2k_{x}^2) .$

(13.54)

The final solution is

$\displaystyle FT(W^2)~=~\left[2{\pi}\sigma_{x}\sigma_{y} + {\pi}^5\sigma_{x}\si... ...\sigma_{y}^2)^2\right]\exp[-{\pi}^2(k_{x}^2\sigma_{x}^2+k_{y}^2\sigma_{y}^2)] .$

(13.55)

13.7 Computation of Threshold Correlation Values

An image pixel is associated with an astronomical source if the significance $S_{i,j}$ of its correlation value $C_{i,j}$ is greater than a user-defined threshold significance :

$\displaystyle S_{i,j}~=~\int_{C_{i,j}}^{\infty} p(C \vert n_{B,i,j})) > S_o .$

$n_{B,i,j}$ is the inferred number of background counts within the limited spatial extent of the wavelet function, and $p(C \vert n_{B,i,j})$ is the probability sampling distribution for

given $n_{B,i,j}$ . This sampling distribution does not depend upon the scale size of the wavelet function. We assume that the background count rate is locally constant, so that instead of $n_{B,i,j}$ , we use the expected number of background counts in the positive kernel of the wavelet, a quantity which we denote $q_{i,j}$ . For the MH function, $q_{i,j} = 2{\pi}{\sigma_x}{\sigma_y}B_{i,j}$ .

We find that $p(C \vert q_{i,j})$ does not have analytic form for those values of $q_{i,j}$ most likely to be seen in analyses of Chandra images, so we must estimate this distribution using simulations. The computation time needed for such simulations means that we cannot accurately assign significances to pixels if $S_{i,j} \la 10^{-7}$ . So instead of computing the significance, wtransform compares the value of $C_{i,j}$ in each pixel with a threshold correlation value $C_{i,j,o}(S_o,q_{i,j})$ , defined by the equation

$\displaystyle S_o~=~\int_{C_{i,j,o}}^{\infty} dC p(C \vert q_{i,j}) .$

(13.56)

We determine $C_{i,j,o}$ by:

randomly selecting values ${\log}q_o$ from the range -10 $\leq {\log}q_o \leq$ 3.25;
creating 1024 $\times$ 1024 images, sampling data in each pixel from the Poisson distribution, with the expected number of counts in each pixel being $B_o = \frac{q_o}{2{\pi}\sigma^2}$ , and with $\sigma = 4$ pixels;
computing $C_{i,j}$ and $q_{i,j}$ in each image pixel;
and recording $C_{i,j}$ in bins of size ${\Delta}{\log}q_{i,j}$ = 0.2, for -6.9 $\leq {\log}q_{i,j} \leq$ 3.1, with one bin being used for all values of ${\log}q_{i,j} <$ -6.9.^13.6 From these distributions $p(C \vert q_{i,j})$ , we can determine $C_{i,j,o}$ .

If a simulated dataset has no counts, we do not analyze it. We chose ${\log}q_{i,j} =$ 3.25 as a upper limit because of the report by Damiani et al. (1997) that for ${\log}q_{i,j} \ga$ 3, the probability sampling distribution is analytically representable as a Gaussian with width $\sigma = \sqrt{q_{i,j}}$ . (Note that the value of $q_{i,j}$ that we use in this work is larger than that used by Damiani et al. by a factor of $2{\pi}$ .) We return to this point below.

By analyzing over 50,000 simulated images, we are able to determine 25 values of in each ${\log}q_{i,j}$ bin, for values of $S_o \ga$ 10 $^{-7}$ , using the central 68% (17) of the values to estimate variances. We fit simple functions to the estimated threshold values, using the $\chi^2$ method and the estimated variances. We present our results below. These functions describe the observed threshold values well, except in the regime ${\log}q_{i,j} \la 0$ and ${\log}S_o \ga -4$ , where a binned look-up table is used instead. We note that these fits allow us in principle to compute threshold correlation values for significance values below our computational lower limit $S_o \sim$ 10 $^{-7}$ (such as for $S_o \sim$ 10 $^{-9}$ , the significance corresponding to one false source pixel in an Chandra HRC image).

In the regime ${\log}q_{i,j} \la 0$ and ${\log}S_o \la -4$ , ${\log}C_o(q_{i,j})$ is well-described by the parabola:

$\displaystyle {\log}C_o(q_{i,j})~=~A_{\rm lo}({\log}q_{i,j})^2 + B_{\rm lo}{\log}q_{i,j} + C_{\rm lo} ,$

(13.57)

with

$\displaystyle A_{\rm lo}~$	$\displaystyle =$	$\displaystyle ~0.00462$
$\displaystyle B_{\rm lo}~$	$\displaystyle =$	$\displaystyle ~0.0661$
$\displaystyle C_{\rm lo}~$	$\displaystyle =$	$\displaystyle ~-0.0154({\log}S_o)^2 - 0.252({\log}S_o) - 0.031$

(Note that errors have not been estimated for these parameters.)

In the regime ${\log}q_{i,j} \ga 0$ , $C_o(q_{i,j})$ is described by the function

$\displaystyle C_o(q_{i,j})~=~A_{\rm hi}\sqrt{q_{i,j}} + B_{\rm hi} ,$

(13.58)

a function used by Damiani et al., with

$\displaystyle A_{\rm hi}~=~\left\{ \begin{array} {l@{\quad \quad}l} -0.509~{\lo... ...g}S_o \geq -7 \\ -0.509~{\log}S_o + 1.897 & {\log}S_o < -7 \end{array} \right.$

$\displaystyle B_{\rm hi}~=~\begin{array} {l@{\quad}} -1.115~{\log}S_o - 1.038 \end{array}$

For values ${\log}q_{i,j} \sim 0$ , we find that we must sometimes use the linear equation

$\displaystyle {\log}C_o(q_{i,j})~=~A_{\rm mid}{\log}q_{i,j} + B_{\rm mid}$

(13.59)

to represent the threshold, with

$\displaystyle A_{\rm mid}~=~0.00182({\log}S_o)^3 + 0.0279({\log}S_o)^2 + 0.158{\log}S_o + 0.607$

(13.60)

and

$\displaystyle B_{\rm mid}~=~\left\{ \begin{array} {l@{\quad \quad}l} -0.064{\lo... ...g}S_o \leq -2\\ -0.064{\log}S_o + 0.612 & {\log}S_o < -7 \end{array} \right.$

If the probability sampling distribution is Gaussian with width $\sigma = \sqrt{q_{i,j}}$ , then the significance is given by:

$\displaystyle S_{i,j}~$	$\displaystyle =$	$\displaystyle ~\frac{1}{\sqrt{2{\pi}q_{i,j}}}\int_{C_{i,j}}^{\infty} dC \exp\left(-\frac{C_{i,j}^2}{2q_{i,j}}\right)$
	$\displaystyle =$	$\displaystyle ~1 - \frac{1}{2} - \frac{1}{\sqrt{2{\pi}q_{i,j}}}\int_0^{C_{i,j}} dC \exp\left(-\frac{C_{i,j}^2}{2q_{i,j}}\right)$
	$\displaystyle =$	$\displaystyle ~\frac{1}{2} \left[1 - {\rm erf}\left(\frac{C_{i,j}}{\sqrt{2q_{i,j}}}\right)\right]$	(13.61)

We find that if we use Eq. (13.58) in the regime ${\log}q_{i,j} \ga$ 3, then the derived values of $C_{i,j,o}$ are larger than those predicted by Eq. (13.61) above. Because it is more conservative, we use Eq. (13.58) for all values ${\log}q_{i,j} \ga$ 0.

13.8 Normalized Background Map

If the user does not provide a normalized background image, wrecon will create such an image by combining the normalized background images that wtransform creates at each wavelet scale:

$\displaystyle B_{{\rm norm},i,j}^{\rm total}~$	$\displaystyle =$	$\displaystyle ~\sum_{k=1}^N \epsilon_{i,j,k} \sigma_k^2 B_{{\rm norm},i,j,k}$	(13.62)
$\displaystyle w_{i,j}^{\rm total}~$	$\displaystyle =$	$\displaystyle ~\sum_{k=1}^N \epsilon_{i,j,k} \sigma_k^2$	(13.63)

where

$\displaystyle \epsilon_{i,j,k}~=~\left\{ \begin{array} {l@{\quad \quad}l} 1&\sigma_k \ga r_{{\rm PSF},i,j} \\ 0&{\rm otherwise} \end{array} \right.$

and is the number of scales used. " $\sigma_k \ga r_{\rm PSF}$ " denotes all $\sigma > r_{\rm PSF}$ , along with the largest $\sigma < r_{\rm PSF}$ . For instance, if $r_{\rm PSF}$ = 9 pixels, $\epsilon =$ 1 for the standard progression values $\sigma$ = [8,8 $\sqrt{2}$ ,16,...], and $\epsilon =$ 0 for $\sigma$ = [...4,4 $\sqrt{2}$ ]. We exclude the smallest scales because for $\sigma \ll r_{\rm PSF}$ , the background is greatly overestimated in the vicinity of sources. The weighting reflects that the area of the FOV used to estimate the background goes as $\sigma^2$ . The default background for wrecon is then:

$\displaystyle B_{{\rm norm},i,j}^{\rm default}~=~\frac{B_{{\rm norm},i,j}^{\rm total}}{w_{i,j}^{\rm total}} .$

(13.64)

We may algebraically manipulate the combination of eqs. (13.5), (13.62), and (13.63), to express the default normalized background as

$\displaystyle B_{{\rm norm},i,j}^{\rm default}~$	$\displaystyle =$	$\displaystyle ~\sum_{i'} \sum_{j'} \left[ \sum_{k=1}^N \frac{\epsilon_{i,j,k}\sigma_k^2}{<NW{\ast}E>_{i,j,k}} NW_{i-i',j-j',k} \right] D_{i',j'}$
	$\displaystyle =$	$\displaystyle ~\sum_{i'} \sum_{j'} \left[ \sum_{k=1}^N a_{i,j,k} NW_{i-i',j-j',k} \right] D_{i',j'}$	(13.65)

The presence of the term $a_{i,j,k}$ makes it impossible to compute the variance of the default background estimate using FFTs. Hence the current version of the program computes the variance of the background estimate by summing variance estimates at each scale:

$\displaystyle V[B_{{\rm norm},i,j}^{\rm default}]~=~\sum_{k=1}^N \left(\frac{\epsilon_{i,j,k}\sigma_k^2}{w_{i,j}^{\rm total}}\right)^2 V[B_{{\rm norm},i,j,k}]$

(13.66)

The missing covariance terms are positive, so Eq. (13.66) provides an underestimate of the true variance. If the user wishes to reduce the contribution of covariance terms, he or she may apply a sparser set of scales, since the magnitude of the covariance is proportional to the amount of overlap between negative annuli. For instance, if only scales separated by a factor of 2 $\sqrt{2}$ are used, the covariance terms will be $\approx$ 0, since the overlap of annuli is minimal.^13.7

13.9 References

Damiani, F., Maggio, A., Micela, G., & Sciortino, S. 1997, ApJ 483, 350

Daubechies, I. 1992, Ten Lectures on Wavelets (Philadelphia: SIAM)

Eadie, W. T., Drijard, D., James, F. E., Roos, M., & Sadoulet, B. 1971, Statistical Methods in Experimental Physics (Amsterdam: North-Holland)

Goupillaud, P., Grossmann, A., & Morlet, J. 1984, Geoexploration 23, 85

Gradshteyn, I., & Ryzhik, I. 1980, Table of Integrals, Series, and Products (San Diego: Academic Press)

Grebenev, S. A., Forman, W., & Jones, C. 1995, ApJ 445, 607

Holschneider, M. 1995, Wavelets: An Analysis Tool (Oxford: Oxford University Press)

Micela, G., Sciortino, S., Kashyap, V., Harnden, F. R., Jr., & Rosner, R. 1996, ApJS 102, 75

Press, W. H., Teukolsky, S. A., Vetterling, W. T., & Flannery, B. P.1992, Numerical Recipes (Cambridge: Cambridge Univ. Press)

Rosati, P., Della Cecai, R., Burg, R., Norman, C., & Giacconi, R. 1995, ApJL 445, 11

Slezak, E., Bijaoui, A., & Mars, G. 1990, A&A 227, 301

Vikhlinin, A., Forman, W., & Jones, C. 1994, ApJ 435, 162

cxchelp@head.cfa.harvard.edu