Last modified: 22 August 2023

Why does using an exclude filter take so long?


The [exclude sky=region(ciao.reg)] filter, especially with dmextract, can be extremely slow, sometimes requiring hours or days to complete. This is often noticed when trying to exclude hundreds or even thousands of sources from, for example, some type of extended emission.

The [exclude ] directive does not simply change the "is this point inside" logic; it actually inverts the logic of the region itself so that it can be stored in the output file's subspace. It uses De Morgan's laws to invert the logic in the region. Since CIAO regions support complex logical AND (*), OR (+), and NOT (!) operations, the general solution to inverting a region is an O(N^2) operation; that is, the run time of the tool grows quadratically with the number of sources being excluded. To reduce this, CIAO has some optimizations to deal with groups of shapes whose bounding boxes overlap; however, these do not help for dense source regions that cover the same area.
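As a rough illustration of the De Morgan step, the plain-Python sketch below checks that "not inside any source" is the same test as "outside every source". It does not use CIAO at all: the in_circle helper and the SOURCES list are hypothetical stand-ins for shapes in a region file, and this is not how the Data Model library implements the inversion.

```python
import math

def in_circle(x, y, cx, cy, r):
    """True if (x, y) lies inside or on the circle centred at (cx, cy)."""
    return math.hypot(x - cx, y - cy) <= r

# Hypothetical source circles (x, y, radius); not taken from a real region file.
SOURCES = [(10.0, 10.0, 3.0), (20.0, 15.0, 2.0), (12.0, 11.0, 2.5)]

def excluded_as_union(x, y):
    """NOT (A OR B OR C): test the union of sources, then negate it."""
    return not any(in_circle(x, y, *s) for s in SOURCES)

def excluded_as_product(x, y):
    """(NOT A) AND (NOT B) AND (NOT C): the De Morgan form, one term per shape."""
    return all(not in_circle(x, y, *s) for s in SOURCES)

# Compare the two forms on a coarse grid of test points.
grid = [(i * 0.5, j * 0.5) for i in range(60) for j in range(60)]
mismatches = sum(excluded_as_union(x, y) != excluded_as_product(x, y)
                 for x, y in grid)
print("mismatches:", mismatches)  # prints "mismatches: 0"
```

For a plain list of shapes the inverted form is a single chain of AND NOT terms, which is why this case does not need the general-purpose inversion.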

While CIAO does support complex AND, OR, and NOT operations, most users, especially in this scenario, are not making use of complicated logic patterns. Most often the region being excluded simply contains ellipses (or maybe circles) generated by one of the source detection tools, e.g. wavdetect.

unix% cat ciao.reg
ellipse(...A...)
ellipse(...B...)
ellipse(...C...)
...

In this scenario, it is easier and faster for the user to manually invert the region and avoid using the [exclude ] syntax altogether.

In CIAO 4.11, the Python region module was updated to make this type of operation much easier. Users simply need to subtract the source region from the field(). For example, these commands:

from region import *
src = CXCRegion("ciao.reg")
invert = field()-src
invert.write( "invert_ciao.fits", fits=True, clobber=True)

can be saved to a Python script, invert.py, and then run:

unix% python invert.py
unix% dmlist invert_ciao.fits data,clean,array
Region Block: Field()&!Ellipse(4254.36,3654.59,7.27874,4.41364,145.504)&!Ellipse...
#  POS(X,Y)                                 SHAPE              R[2]                 ROTANG[2]            COMPONENT 
                                  NaN NaN Field                               NaN                  NaN          1
                                                                              NaN                  NaN           
      4254.3550724638      3654.5869565217 !Ellipse                   7.2787399292       145.5038452148          1
                                                                     4.4136433601                  NaN           
      4379.0974576271      3670.9406779661 !Ellipse                   7.5883789062       164.3294067383          1
                                                                     6.1752810478                  NaN           
...

This file can then be used as a normal CIAO filter:

unix% dmextract "evt.fits[sky=region(invert_ciao.fits)][bin sky=annulus(4096,4096,0:1000:10)]" ...etc...

This will run much faster than with the [exclude ] syntax.

[NOTE]
Regions in celestial coordinates

This technique works best for regions stored in physical coordinates. For regions stored in celestial coordinates, users should use dmmakereg to convert them to physical coordinates before inverting the logic.

unix% dmmakereg region="region(ds9.reg)" out="ciao.reg" kernel=ascii wcsfile=acis_evt.fits

The wcsfile is necessary to provide the coordinate transform information.

There are of course examples of more complicated regions: sources that have other nearby sources excluded, sources that include a polygon for the field of view, or maybe exclude a rectangular region for the readout streak. This technique will also handle all the logic needed to invert these complex regions.
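To see what inverting such a region involves, the plain-Python sketch below takes a source A with a nearby source B excluded, plus a separate source C, and checks the inverted logic against a direct negation. The circle helper and the shapes A, B, and C are hypothetical; the sketch only illustrates the boolean logic and is not the region module's implementation.

```python
import math

def circle(cx, cy, r):
    """Return a point-membership test for a circle (hypothetical helper)."""
    return lambda x, y: math.hypot(x - cx, y - cy) <= r

# Hypothetical shapes: source A with nearby source B excluded, plus source C.
A = circle(10.0, 10.0, 4.0)
B = circle(13.0, 10.0, 2.0)
C = circle(25.0, 25.0, 3.0)

def source(x, y):
    """The region to exclude: (A AND NOT B) OR C."""
    return (A(x, y) and not B(x, y)) or C(x, y)

def inverted(x, y):
    """De Morgan applied twice: (NOT A OR B) AND NOT C."""
    return (not A(x, y) or B(x, y)) and not C(x, y)

# The inverted form should agree with NOT source at every test point.
grid = [(i * 0.5, j * 0.5) for i in range(70) for j in range(70)]
agree = all(inverted(x, y) == (not source(x, y)) for x, y in grid)
print("inverted form matches NOT source everywhere:", agree)
```

Note that the inverted form now contains an OR: points inside B are part of the inverted region again even though they lie within A. Terms like this are what make inverting a general region more involved than inverting a plain list of shapes.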