AHELP for CIAO 4.2
Mexican-Hat Wavelet source detection (wtransform+wrecon)
wavdetect infile outfile scellfile imagefile defnbkgfile [scales] [regfile] [clobber] [ellsigma] [interdir] [bkginput] [bkgerrinput] [outputinfix] [sigthresh] [bkgsigthresh] [falsesrc] [sigcalfile] [exptime] [expfile] [expthresh] [bkgtime] [maxiter] [iterstop] [xoffset] [yoffset] [eband] [eenergy] [psftable] [log] [verbose]
wavdetect correlates the image with wavelets of different scales (selected by the user) and then searches the results for significant correlations. It is a wrapper for the tools wtransform and wrecon, and works in two steps:
- Step 1: wtransform detects probable source pixels within a dataset by repeatedly correlating it with "Mexican Hat" wavelet functions with different scale sizes.
- Step 2: wrecon generates a source list with information from each wavelet scale. For each source, a cell is computed that contains the majority of the source flux, and source properties are computed within that cell.
- Separates closely-spaced point sources.
- Finds extended sources so long as wavelet scales are chosen appropriately.
- Slower than celldetect.
- The tool requires a lot of memory. Data structures for a 512x512 image use up 36 Mb. A 2048x2048 image requires over 300 Mb. For images larger than 2048x2048, at least 1 Gb of memory is recommended. Datasets that do not fit in physical memory will page heavily to disk and processing will run very slowly.
The "wavdetect step-by-step" section at the end of this document contains a detailed description of how the tool identifies source detections. For a description of the theory and operation of wavdetect, see the wavdetect section of the CIAO Detect Manual and the paper "A Wavelet-Based Algorithm for the Spatial Analysis of Poisson Data", by P. E. Freeman et al. (ApJS 2002, v138, p.185; astro-ph/0108429).
If the instrument/detector is neither Chandra nor ROSAT PSPC, wavdetect cannot compute point-spread-function sizes. The tool will still run but the source characteristics will not be reliable. If the size is wrong (as the sizes returned by the PSF lib may be), then the wrong scales may be used for the determination of source parameters, and thus the source properties may be wrongly estimated. But the *detection* process is unaffected.
wavdetect my_input.fits source_list.fits source_cell.fits image.fits background.fits expfile=none
Read a primary image from "my_input.fits", generating a source list "source_list.fits", a source cell image "source_cell.fits", a source image "image.fits", and a normalized background image "background.fits".
wavdetect my_input.fits source_list.fits source_cell.fits image.fits background.fits scales="1 2 4 8 16"
Run wavdetect with 5 scales. Note that quotes around the scale list are required.
wavdetect "my_input.fits[events]" source_list.fits source_cell.fits image.fits background.fits scales="2 4 8" maxiter=3
Run wavdetect on an event list in file my_input.fits, blockname "events". Use 3 scales, and allow 3 cleansing iterations per scale.
wavdetect my_input.fits source_list.fits source_cell.fits image.fits background.fits expfile=none exptime=0
Read a primary image from "my_input.fits", generating a source list "source_list.fits", a source cell image "source_cell.fits", a source image "image.fits", and a normalized background image "background.fits". No exposure map is input, and exposure time is not set. A value of 1.0 will be used to estimate rate parameters in the output source list.
wavdetect my_input.fits source_list.fits source_cell.fits image.fits background.fits expfile=none exptime=10000.0
Read a primary image from "my_input.fits", generating a source list "source_list.fits", a source cell image "source_cell.fits", a source image "image.fits", and a normalized background image "background.fits". No exposure map is input, and the exposure time is set to 10000.0. This value will be used to estimate rates in the output source list.
wavdetect my_input.fits source_list.fits source_cell.fits image.fits background.fits expfile=expmap.fits exptime=0
Read a primary image from "my_input.fits", generating a source list "source_list.fits", a source cell image "source_cell.fits", a source image "image.fits", and a normalized background image "background.fits". An exposure map, expmap.fits, is input. Values from the exposure map at source positions will be used to estimate rates in the output source list.
wavdetect my_input.fits source_list.fits source_cell.fits image.fits background.fits expfile=expmap.fits exptime=10000.0
Read a primary image from "my_input.fits", generating a source list "source_list.fits", a source cell image "source_cell.fits", a source image "image.fits", and a normalized background image "background.fits". Both exposure map and exposure time are set. Rates in the output source list will be estimated from exptime, scaled by the quantity (expmap value at source location)/(max expmap value).
wavdetect my_input.fits source_list.fits source_cell.fits image.fits background.fits expfile=none scales="1 2 4" falsesrc=1.0
Run wavdetect with 3 scales, specifying a false source rate of 1 per scale. The user should thus expect (on average) 3 false sources in the output sourcelist.
For other wavdetect examples, please see the CIAO Science Thread Running wavdetect.
Parameter=infile (file required filetype=input)
Input image file. For images larger than 2048x2048, at least 1 Gb of memory is recommended.
Pixel values should be reasonable numbers. WAVDETECT probability distributions have been derived assuming Poisson statistics. WAVDETECT simply will not work with fluxed images. The calculated correlation values are so small that no detections occur.
Parameter=outfile (file required filetype=output autoname=yes)
File name of output source list.
If auto-naming is used (outfile=.), the output file will have the suffix "_src".
Parameter=scellfile (file required filetype=output autoname=yes)
File name of the image showing the source cells, which delimit the image pixels which are used to estimate source properties (e.g. count rate).
If auto-naming is used (scellfile=.), the output file will have the suffix "_scell".
Parameter=imagefile (file required filetype=output autoname=yes)
Output file containing the reconstructed source image.
If auto-naming is used (imagefile=.), the output file will have the suffix "_image".
Parameter=defnbkgfile (file required filetype=output autoname=yes)
Default normalized background.
During its second stage, wavdetect makes one normalized background estimate from the stack of per-scale normalized backgrounds produced during the first stage, and writes the result to this file. If auto-naming is used (defnbkgfile=.), the output file will have the suffix "_nbkg".
Parameter=scales (string default=2.0 4.0)
The wavelet radii, in pixels, are given by this list of numbers. The list must be in quotes and can be separated by spaces, semicolons (;), or commas (,).
Wtransform produces a complete set of outputs for each scale. Small scales tend to detect small features, and larger scales larger features. The default parameter file only specifies 2 wavelet scales (2 and 4). This is fine for experimentation, but for completeness a scale list of "1 2 4 8 16" would be a reasonable default for Chandra data.
Values are typically 2**n, where n is an integer, and the smallest and largest values of n (n_lo and n_hi) are chosen with respect to instrumental PSF sizes. For the ROSAT PSPC, for instance, this could mean a smallest scale of 1 pixel and a largest scale of 16 pixels. Note that larger scales might need to be used to characterize (i.e. derive good source property estimates for) extended sources.
2**n should not be larger than the size of the image (in pixels) divided by 5; otherwise the extent of the wavelet function is larger than the image itself, which could lead to strange results.
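As a rough illustration of the two guidelines above (powers of two, and no scale larger than one fifth of the image size), the following Python sketch builds a candidate scale list. The helper name suggest_scales and its arguments are hypothetical and are not part of CIAO; the function only formats a string that could be passed to the scales parameter.

```python
def suggest_scales(image_size_pix, n_lo=0, n_hi=4):
    """Return a scales string of 2**n values for n_lo <= n <= n_hi,
    dropping any scale larger than image_size_pix / 5.

    Hypothetical helper for illustration only; not a CIAO function.
    """
    limit = image_size_pix / 5.0
    scales = [2 ** n for n in range(n_lo, n_hi + 1) if 2 ** n <= limit]
    return " ".join(str(s) for s in scales)

print(suggest_scales(1024))   # 1 2 4 8 16
print(suggest_scales(64))     # 1 2 4 8   (16 exceeds 64/5 = 12.8)
```

The resulting string still has to be quoted on the command line, e.g. scales="1 2 4 8 16".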
Note that wavdetect has only one parameter to specify scale size. If users want control over the xscale and yscale separately, or if they want to specify which scale sizes will be used for flux estimates, or if they want to experiment with many different runs of the second part of the process without repeating the initial phase, then they should run wtransform followed by (multiple) runs of wrecon.
Parameter=regfile (file default= autoname=yes)
File for ASCII region output.
If auto-naming is used (regfile=.), the output file will have the suffix "_reg". NB: autonaming for regfile is not currently operational.
Parameter=clobber (boolean not required default=no)
If set to "yes", existing output files will be overwritten.
Parameter=ellsigma (real default=3.0)
Size, in sigmas, to make the elliptical source regions.
ellsigma is a multiplicative factor applied to sigma, the standard deviation of the distribution, to scale the major and minor axes of the ellipses for each source. ellsigma affects both the outfile and the ASCII region file (regfile). This feature is included so that the graphics overlay will be more visible and under the user's control. Often a value greater than 3 is helpful.
Parameter=interdir (file not required filetype=output default=.)
Directory for intermediate results.
Parameter=bkginput (file not required filetype=input default=)
Pre-determined background input image.
Use a previously computed background in the specified file in place of constructing a new default normalized background. When using bkginput, enter "none" for the default normalized background file.
Parameter=bkgerrinput (boolean not required default=no)
If yes, use background error in file bkginput.
Parameter=outputinfix (file not required default=)
If needed to avoid file naming collisions with other wavdetect users, set to a unique [string] to give the set of outputs differing names. The string will be embedded in the file names.
Parameter=sigthresh (real not required default=1e-06)
Threshold for identifying a pixel as belonging to a source.
*After* the final background estimate B is made, the threshold correlation is recomputed by solving for C_o:
sigthresh = integral from C_o to infinity of PSD(C | B) dC
Here, C = avg[W*D], where D are the *raw* data. (The iterations are simply to compute background. Once we have that, we go back to the beginning.)
If C is greater than or equal to C_o in a pixel, then that pixel is considered to be associated with a source. If the pixel is also the location of a local maximum in C-space, the location of that maximum is output to a FITS table.
sigthresh should not be significantly larger than one over the number of pixels in the image; for a 1024x1024 image, that means sigthresh ~ 1.e-6. If, in this case, sigthresh = 1.e-5, there will be approximately 10 strong background fluctuations detected as sources.
sigthresh should not be smaller than *roughly* 1.e-9 - 1.e-10; even at this point, the accuracy of the computed detection thresholds is unknown (though probably fine for most applications).
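The pixel-count argument above is simple arithmetic: the expected number of background-only pixels that exceed the threshold is roughly sigthresh times the number of pixels. A minimal Python sketch, assuming every pixel is an independent trial (the function name is hypothetical):

```python
def expected_false_pixels(sigthresh, nx, ny):
    """Approximate number of background fluctuations expected above
    threshold in an nx-by-ny image, treating each pixel as one trial.

    Illustrative only; wavdetect does not expose such a function.
    """
    return sigthresh * nx * ny

print(round(expected_false_pixels(1e-6, 1024, 1024), 2))   # 1.05
print(round(expected_false_pixels(1e-5, 1024, 1024), 2))   # 10.49
```

This reproduces the numbers quoted above: about one false detection at sigthresh = 1.e-6 and about ten at 1.e-5 for a 1024x1024 image.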
Parameter=bkgsigthresh (real not required default=0.001)
Significance threshold for cleansing data during iterations.
Significance threshold for cleansing data during iterations. In each pixel, during each iteration, the background, and correlation of the wavelet function and the current data (C' = avg[W*D']), are estimated. The background estimate B' implies a probability sampling distribution (PSD) for C', i.e. the distribution that C' would have if there was locally no source in the image and D' were sampled from background. A threshold C_o' is calculated from this PSD, using the supplied value of bkgsigthresh (e.g. 10**-2) by solving the following for C_o':
bkgsigthresh = integral from C_o' to infinity of PSD(C' | B') dC'
If C' is greater than or equal to C_o', the data in the pixel is replaced with the background estimate. In this way, putative sources are eliminated from the data image, along with weak (but otherwise undetectable) sources and background fluctuations, so that a better background estimate can be made.
This value should be no larger than 0.05 (the usual 5 per cent, or 95 per cent, statistical criterion for rejecting the null hypothesis, which is that the pixel in question has data sampled solely from the background); it should not be smaller than sigthresh.
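The two bounds just stated can be checked before a run. The sketch below is a hypothetical pre-flight check written in Python; it is not part of wavdetect and only encodes the guidance given in this section.

```python
def check_bkgsigthresh(bkgsigthresh, sigthresh):
    """Return a list of warnings if bkgsigthresh falls outside the
    range recommended in the text (sigthresh <= bkgsigthresh <= 0.05).

    Hypothetical helper; the tool itself may not enforce these limits.
    """
    warnings = []
    if bkgsigthresh > 0.05:
        warnings.append("bkgsigthresh is larger than 0.05")
    if bkgsigthresh < sigthresh:
        warnings.append("bkgsigthresh is smaller than sigthresh")
    return warnings

print(check_bkgsigthresh(0.001, 1e-6))   # [] (the default values pass)
print(check_bkgsigthresh(0.1, 1e-6))     # ['bkgsigthresh is larger than 0.05']
```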
Parameter=falsesrc (real not required default=-1.0)
Number of false sources allowed per scale in the image.
That number is combined with image size and scale information, using additional simulation results contained in the file defined by the sigcalfile parameter, to determine the threshold significance for each pixel; consequently the value given for the parameter sigthresh is ignored. If falsesrc is set to a negative number, the code uses sigthresh instead to determine thresholds.
The calibrated range of wavdetect is 0.1 to 1e-8 false sources per pixel. Depending on the significance settings and the input image size, users can go outside of this range, in which case the results become unpredictable.
Parameter=sigcalfile (file not required filetype=ARD default=$ASCDS_CALIB/wtsimresult.fits)
Data file for use by the false-source algorithm.
The file contains simulation results that are used with the false-source algorithm. This file is included with the release and may be used for both Chandra and non-Chandra data. The user may not specify "CALDB" for this parameter.
Parameter=exptime (real default=0 units=seconds)
Time of the exposure, in seconds.
If set to zero and no exposure file is used, the program sets the exposure time to 1.0 when estimating source rates.
If set to a non-zero value and no exposure file is used, the program uses the input value when estimating source rates.
If set to zero and an exposure file is used, the program uses the value from the exposure file when estimating source rates.
If set to a non-zero value and an exposure file is used, the program uses the quantity exptime*(expfile value at source/max expfile value) when estimating source rates.
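The four cases above can be summarized as a small decision function. This Python sketch is only a restatement of the rules in this section; the function name and arguments are hypothetical, and exp_at_source and exp_max stand for the exposure-map value at the source position and the map maximum.

```python
def effective_exposure(exptime, exp_at_source=None, exp_max=None):
    """Exposure value used to convert counts to a rate, following the
    four exptime/expfile cases described in the text.

    Pass exp_at_source=None when no exposure file is used.
    Hypothetical helper for illustration; not wavdetect's own code.
    """
    if exp_at_source is None:
        # No exposure file: use exptime, or 1.0 if exptime is zero.
        return exptime if exptime != 0 else 1.0
    if exptime == 0:
        # Exposure file only: use the map value at the source.
        return exp_at_source
    # Both set: scale exptime by the relative exposure at the source.
    return exptime * (exp_at_source / exp_max)

print(effective_exposure(0))                          # 1.0
print(effective_exposure(10000.0))                    # 10000.0
print(effective_exposure(0, 5000.0, 10000.0))         # 5000.0
print(effective_exposure(10000.0, 5000.0, 10000.0))   # 5000.0
```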
Parameter=expfile (file filetype=input default=)
Exposure file. Image of the data set exposure time.
Parameter=expthresh (real not required default=0.1)
If the relative exposure is less than the parameter expthresh, then the pixel in question is not analyzed.
For each pixel, a relative exposure is calculated (pixel exposure value over maximum value of exposure in map). If this relative exposure is less than the parameter expthresh, then the pixel in question is not analyzed: the background, correlation, et al., are set to zero.
expthresh should not be less than 0.1 (or else the accuracy of normalization will be too low); if set close to 1, only very limited regions of the FOV will be considered when source correlation maxima are listed. Typical values would be 0.1-0.2.
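The relative-exposure test described above can be sketched in a few lines of Python. The exposure map is represented here as a plain list of rows and is assumed to have a positive maximum; the function name and the sample values are made up for illustration.

```python
def analyzed_mask(expmap, expthresh=0.1):
    """Return a same-shaped list of booleans: True where the pixel's
    exposure relative to the map maximum is at least expthresh.

    Assumes the map maximum is positive. Illustrative sketch only.
    """
    max_exp = max(max(row) for row in expmap)
    return [[(value / max_exp) >= expthresh for value in row]
            for row in expmap]

# Made-up exposure values, e.g. near a chip edge.
expmap = [[100.0, 50.0, 5.0],
          [ 80.0, 20.0, 0.0]]
print(analyzed_mask(expmap))
# [[True, True, False], [True, True, False]]
```

Pixels marked False correspond to those for which the background and correlation are set to zero.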
Parameter=bkgtime (real not required default=0)
Livetime of predetermined background.
Livetime of predetermined background. This parameter should only be used *if* the user has provided a normalized background image (rather than having wavdetect calculate it). A value of 0 causes the exposure time to be used as the background time. If bkgtime is non-zero and no background map is supplied, the source parameters will be adversely affected (i.e. wrong).
Parameter=maxiter (integer not required default=2)
Maximum number of cleansing iterations per scale pair. If maxiter is greater than necessary, the program will usually stop on its own (see iterstop).
Parameter=iterstop (real not required default=0.0001)
If the ratio of *newly* cleansed pixels to the overall number of pixels in the dataset is less than the parameter iterstop, the program quits iterating and uses the current background estimate as the final background estimate.
The user specifies how many iterations the program should go through to calculate the background. However, the background is for all intents and purposes calculated if there are very few new pixels being cleansed (see bkgsigthresh). If the ratio of *newly* cleansed pixels to the overall number of pixels in the dataset is less than the parameter iterstop, the program quits iterating and uses the current background estimate as the final background estimate.
A typical value is 0.001: stop iterating when only one of every thousand pixels is being cleansed. This parameter should not be larger than 1, nor smaller than one over the number of pixels in the dataset.
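The interplay of maxiter and iterstop can be written as a single stopping test. The Python sketch below assumes the iteration limit is checked first and that iterations are counted from 1; the order of checks inside wavdetect is not documented here, and the function name is hypothetical.

```python
def should_stop(newly_cleansed, total_pixels, iteration,
                iterstop=0.0001, maxiter=2):
    """True when cleansing should end: either the iteration limit has
    been reached, or the fraction of newly cleansed pixels has dropped
    below iterstop.

    Illustrative only; the check order is an assumption.
    """
    if iteration >= maxiter:
        return True
    return (newly_cleansed / total_pixels) < iterstop

npix = 1024 * 1024
print(should_stop(500, npix, iteration=1))   # False: about 4.8e-4 of pixels are new
print(should_stop(50, npix, iteration=1))    # True: about 4.8e-5 is below iterstop
print(should_stop(500, npix, iteration=2))   # True: maxiter reached
```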
Parameter=xoffset (integer not required default=INDEF units=pixels)
Offsets of x- and y-axis for the calculation of the off-axis angle.
By default (when both offsets are set to INDEF), wavdetect calculates the off-axis angle using the nominal pointing of the data file as the origin. Users may change this behavior by specifying offsets to any numerical values; the offsets provide the location of the optical axis with respect to the center of the data file.
A typical scenario in which users may choose to use offsets is when their data file is a sum of two or more observations with different pointings. The nominal pointing in such a data file is usually poorly defined, and the off-axis angle could be calculated from an undesired origin, leading to suboptimal selection of the detect cell size. The ability to set the origin explicitly with x/yoffset solves this problem.
Parameter=yoffset (integer not required default=INDEF units=pixels)
Parameter=eband (real default=1.4967 units=keV)
Energy, in keV, at which the PSF size should be computed.
For ROSAT PSPCB, eband and eenergy parameters are ignored. The values 1 keV and .393 are used.
Parameter=eenergy (real not required default=0.393)
Percentage of encircled energy, expressed as a fraction of 1.0, which the PSF size should contain.
Percentage of encircled energy, expressed as a fraction of 1.0, which the PSF size (see "eband") should contain. Suggested values: .393 (one-sigma percentage threshold for a two-dimensional Gaussian), or 0.500. The efficiency of the algorithm is reduced if this fraction is made too high (say, 0.900) or too low.
The encircled energy fraction of the PSF is used to estimate the PSF size, in pixels, at the location of each detected source. The default value (0.393) is the one-sigma integrated volume of a normalized two-dimensional Gaussian, thus the estimated PSF size should be approximately one sigma. The PSF size is used only to select a flux image; a source cell is computed using this image, and source properties are estimated using the data within the computed cell. NB: Unlike celldetect, the PSF library and eenergy have no effect on the detection process in wavdetect.
If the value of eenergy is set too high, there is the risk that the computed source cell will contain more than one source, rendering source property estimates moot. On the other hand, if the value is set too low, then the computed cell may be too small and contain only a fraction of the source counts.
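The origin of the 0.393 default can be verified directly: for a circular two-dimensional Gaussian, the fraction of the volume inside radius k*sigma is 1 - exp(-k**2/2). The Python sketch below evaluates this and its inverse. It assumes an ideal Gaussian, which a real PSF only approximates, and the function names are hypothetical.

```python
import math

def gaussian_encircled_fraction(k):
    """Fraction of a circular 2-D Gaussian's volume inside radius k*sigma."""
    return 1.0 - math.exp(-0.5 * k * k)

def radius_in_sigma(fraction):
    """Inverse relation: radius, in units of sigma, enclosing `fraction`."""
    return math.sqrt(-2.0 * math.log(1.0 - fraction))

print(round(gaussian_encircled_fraction(1.0), 3))   # 0.393
print(round(radius_in_sigma(0.5), 3))               # 1.177
print(round(radius_in_sigma(0.9), 3))               # 2.146
```

Under this idealization, eenergy=0.5 corresponds to a radius of about 1.18 sigma and eenergy=0.9 to about 2.15 sigma, which illustrates how quickly the implied PSF size grows as the fraction is raised.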
For ROSAT PSPCB, eband and eenergy parameters are ignored. The values 1 keV and .393 are used.
Parameter=psftable (file filetype=ARD default=)
Table of PSF size data.
FITS file of PSF information. A file for Chandra is included with the release. The user may not specify "CALDB" for this parameter.
It is possible to set psftable="" to allow wavdetect to run on merged datasets. In this mode, wrecon will use the smallest wavelet scale for all reconstructions. This may result in many small, spurious sources and potential loss of extended and/or off-axis angle sources.
Users have to be aware that any sources that wavdetect identifies are merely candidate sources; all properties - including the actual existence of the source - should be scrutinized.
Parameter=log (boolean not required default=no)
If set to "no", log information will go to stdout. If set to "yes", file "wavdetect.log" will be created.
Parameter=verbose (integer not required default=0 min=0 max=5)
Level of log output. 0=none, 5=highest.
`wavdetect' operates on its input in two stages: first (in `wtransform'), it detects putative source pixels within a dataset by repeatedly correlating it with "Mexican Hat" wavelet functions with different scale sizes. At each scale, the original image is correlated with the wavelet. Pixels with sufficiently large positive correlation values are removed from the image (as assumed sources), and subsequent correlations are performed at the same scale. This procedure of extracting source pixels from an image is called "cleansing". Typically, when very few source candidates are being found, or when an iteration-count limit is reached, the cleansing process stops. At this point, the estimated background (estimated from the cleansed image) is used to set detection thresholds, which are applied to the initial correlation image array values to identify putative sources. A set of outputs for the given wavelet scale is generated: a table of candidate sources (identified by correlation maxima), an image of the correlation of the data with the wavelet function, an image of the normalized (or flat-field) background (the image minus the "source" pixels), and the normalized background error image. Then the tool moves on to the next scale and repeats the process with a fresh copy of the image.
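To make the shape of the correlating function in the first stage concrete, here is a small Python sketch of a circularly symmetric two-dimensional "Mexican Hat" function. The functional form and normalization are assumed from the Freeman et al. paper cited earlier rather than taken from this help file, and the function name is hypothetical. The sketch shows two properties relevant to the description above: the function is positive near its center and negative in the surrounding annulus, and it sums to approximately zero, so a flat background produces no net correlation.

```python
import math

def mexican_hat(x, y, sigma):
    """Assumed form: (2 - r^2/sigma^2) * exp(-r^2 / (2 sigma^2)) / (2 pi sigma^2),
    with r^2 = x^2 + y^2. Illustrative sketch, not wtransform's code."""
    r2 = (x * x + y * y) / (sigma * sigma)
    return (2.0 - r2) * math.exp(-0.5 * r2) / (2.0 * math.pi * sigma * sigma)

sigma = 2.0
half_width = int(10 * sigma)   # sample a grid reaching 10 sigma on each side
total = sum(mexican_hat(x, y, sigma)
            for x in range(-half_width, half_width + 1)
            for y in range(-half_width, half_width + 1))

print(mexican_hat(0, 0, sigma) > 0)   # True: positive core
print(mexican_hat(4, 0, sigma) < 0)   # True: negative beyond sqrt(2)*sigma
print(abs(total) < 1e-6)              # True: the function sums to ~0
```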
The second stage (`wrecon') generates a source list from information from the first stage at each wavelet scale [correlation maxima tables, normalized (flat-field) background images (with errors), and correlation images]. Each correlation maximum from stage 1 is tested to see if it represents an independent, new, source, or a source seen at other scales. For each source, a cell is computed that contains the majority of the source flux, and within that cell, source properties (location, count flux, etc.) are computed.
The X_ERR and Y_ERR are simple (net) counts-weighted variances around the centroid; RA_ERR & DEC_ERR are simply scaled to get to their correct units. So they represent a statistical 1-sigma error.
However, there are many places where systematic errors can be introduced.
- The quality of the background, which can be affected by choice of wavelet scales. (This assumes that the tool is run such that wavdetect generates & uses its own background.)
- There can be some shifts between where the simple centroid is located and the true source location, especially for sources far off-axis.
- The data processing: pixel randomization, pixel quantization, aspect reconstruction, etc.
None of these are accounted for in wavdetect's error estimates.
See the bugs page for this tool on the CIAO website for an up-to-date listing of known bugs.