Last modified: December 2022

URL: https://cxc.cfa.harvard.edu/ciao/ahelp/dmgroup.html
AHELP for CIAO 4.15

dmgroup

Context: Tools::Table

Synopsis

Group a specified column in a table with various options

Syntax

dmgroup infile outfile grouptype grouptypeval binspec xcolumn ycolumn
[tabspec] [tabcolumn] [stopspec] [stopcolumn] [errcolumn] [maxlength]
[clobber] [verbose]

Description

'dmgroup' takes a FITS table as input and generates a new table by grouping elements of the column 'xcolumn' in the user-specified column 'ycolumn'. Elements in 'xcolumn' are grouped according to the grouping method chosen (NONE, BIN, SNR, NUM_BINS, NUM_CTS, ADAPTIVE, ADAPTIVE_SNR, BIN_WIDTH, MIN_SLOPE, MAX_SLOPE, BIN_FILE). Each grouping method has its own required and optional parameters. The tabspec, tabcolumn, stopspec, and stopcolumn parameters may be used to restrict and refine the grouping with any method except NONE. Input files may also be given as stacks ('@stackfile.txt'), provided the infile and outfile stacks contain the same number of files.

With all grouping methods, grouping-specific columns will be created if they do not yet exist in the input file. The GROUPING column's values designate the groups. The QUALITY column denotes the quality of each group. The GRP_NUM column enumerates the groups. The CHANS_PER_GROUP column counts the number of channels within each group. The GRP_DATA column gives the total number of 'counts' in each group. The GRP_STAT_ERR column gives the total statistical error of each group.

GROUPTYPE OPTIONS:

Incomplete Groups:

For each grouping option, a group may be left incomplete due to insufficient counts or an insufficient number of rows. This may occur because of a tab interval, the end of the dataset, or the presence of a previously grouped row. All rows in such an incomplete group will be given a quality flag=2. For example, for the option NUM_BINS with the grouptypeval=4 and an input file with 26 rows, the resulting 4 groups would be (in the case of no tab intervals): group 1 (rows 1-6), group 2 (rows 7-12), group 3 (rows 13-18), and group 4 (rows 19-24), with rows 25 and 26 ungrouped and given quality flags=2.
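
The row arithmetic in the NUM_BINS example above can be sketched in Python. This is only an illustration of the counting described in this section, not dmgroup's implementation: the function name num_bins_layout is hypothetical, tab intervals are ignored, and rows are numbered from 1 as in the text.

```python
def num_bins_layout(n_rows, n_groups):
    """Split rows 1..n_rows into n_groups equal-sized groups.

    Returns (groups, leftover). groups is a list of (first_row, last_row)
    tuples; leftover lists the rows that do not fit into a complete group
    and would therefore carry quality flag 2.
    """
    if n_groups <= 0:
        raise ValueError("n_groups must be positive")
    size = n_rows // n_groups  # rows per complete group
    groups = []
    if size > 0:
        groups = [(i * size + 1, (i + 1) * size) for i in range(n_groups)]
    leftover = list(range(n_groups * size + 1, n_rows + 1))
    return groups, leftover


groups, leftover = num_bins_layout(26, 4)
print(groups)    # [(1, 6), (7, 12), (13, 18), (19, 24)]
print(leftover)  # [25, 26]
```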

Bin, tab, and stop Specifications:

The binning specification, tab specification, and stop specification all share the same syntax, which is consistent with the dmbinning syntax. When used, the binspec, tabspec, and stopspec are applied to the 'xcolumn', 'tabcolumn', and 'stopcolumn' column data respectively. The syntax is spec='min:max:step,min:max:#bins,...'. One or two of the three values may be omitted (e.g. spec='1::10'). Any omitted min, max, or step value is taken from the TLMINn, TLMAXn, and TDBINn keywords of the file, if available. The values given in these specifications must be within the data value range of the table.
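
As a reading aid, the spec syntax can be taken apart with a few lines of Python. This is a minimal sketch of the 'min:max:step,min:max:#bins,...' form only; the function name parse_spec is hypothetical, and the sketch does not look up omitted values in file keywords as dmgroup does.

```python
def parse_spec(spec):
    """Parse 'min:max:step,min:max:#bins,...' into a list of dicts.

    Omitted fields are returned as None. A third field starting with '#'
    is read as a number of bins, otherwise as a step size.
    """
    ranges = []
    for part in spec.split(","):
        fields = [f.strip() for f in part.split(":")]
        if len(fields) > 3:
            raise ValueError(f"too many fields in {part!r}")
        fields += [""] * (3 - len(fields))  # pad omitted trailing fields
        lo, hi, last = fields
        entry = {
            "min": float(lo) if lo else None,
            "max": float(hi) if hi else None,
            "step": None,
            "nbins": None,
        }
        if last.startswith("#"):
            entry["nbins"] = int(last[1:])
        elif last:
            entry["step"] = float(last)
        ranges.append(entry)
    return ranges


print(parse_spec("1::10"))
print(parse_spec("10:40:5,25:32:#1"))
```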

Tabs mark intervals of data that should not be binned over (e.g. instrumental features). They can be used with any grouping option except NONE. Each row covered by a tab specification is left ungrouped and given a quality flag=5. If the column name given in 'tabcolumn' is a valid column in the input file, the tab specification(s) apply to the data in that column. If 'tabcolumn' is left blank, the tab specification(s) apply to the row numbers of the input table.

Stops are used to group specified ranges of data at maximum resolution. That is, each channel within the stopspec will be in its own group, and will be marked with quality=0. Stops can be used for all grouping methods except NONE.
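
The effect of tabs and stops on individual rows can be sketched as follows. This is an illustration under stated assumptions, not the tool's code: the function name mark_tabs_and_stops is hypothetical, ranges are treated as inclusive, and letting a tab take precedence over a stop when both cover a row is an assumption of this sketch.

```python
def mark_tabs_and_stops(values, tabs=(), stops=()):
    """Label each row as ('tab', 5), ('stop', 0), or ('free', None).

    values: column data (e.g. channel numbers), one entry per row.
    tabs, stops: iterables of inclusive (low, high) ranges.
    Rows labelled 'free' are left to the chosen grouping method.
    """
    def inside(v, ranges):
        return any(lo <= v <= hi for lo, hi in ranges)

    labels = []
    for v in values:
        if inside(v, tabs):        # assumption: tabs win over stops
            labels.append(("tab", 5))
        elif inside(v, stops):
            labels.append(("stop", 0))
        else:
            labels.append(("free", None))
    return labels


channels = [47, 48, 49, 50, 51]
print(mark_tabs_and_stops(channels, tabs=[(48, 48)], stops=[(50, 150)]))
```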

Output Table Additions:

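The output table contains all the columns present in the input table, plus the following additional columns:

GROUPING: designates the groups
QUALITY: the quality of each group
GRP_NUM: the group number
CHANS_PER_GROUP: the number of channels in each group
GRP_DATA: the total number of 'counts' in each group
GRP_STAT_ERR: the total statistical error of each group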


Examples

Example 1

dmgroup in.fits out.fits grouptype=BIN xcolumn=pha grouptypeval=""
binspec="10:40:5" ycolumn=counts tabspec="25:32:#1" tabcolumn=pha

Group the data in column 'counts' of the input file in.fits using the BIN method. Begin grouping where the pha column=10, creating new groups whenever pha increases by 5 until reaching 40. Exclude channels where pha is between 25 and 32 inclusive, marking them with bad quality.

Example 2

dmgroup @instack.txt @outstack.txt grouptype=SNR xcolumn=channel
grouptypeval=30 binspec="" ycolumn=counts

Group the data in column 'counts' of each of the input files listed in instack.txt, mapping the output to the files listed in outstack.txt. Construct groups with a minimum signal-to-noise ratio of 30.
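
When no errcolumn is given, the error on a group is sqrt(grp_data), so a group's signal-to-noise ratio is sum/sqrt(sum) = sqrt(sum), and a minimum ratio of 30 corresponds to at least 900 counts per group. The sketch below illustrates that arithmetic with a simple left-to-right accumulation. The function name group_by_snr is hypothetical, and the accumulation order is an assumption rather than a description of dmgroup's algorithm.

```python
import math


def group_by_snr(counts, min_snr):
    """Accumulate consecutive rows until sum / sqrt(sum) >= min_snr.

    Returns (groups, incomplete): groups is a list of lists of 0-based row
    indices; incomplete holds trailing rows that never reached min_snr.
    """
    groups, current, total = [], [], 0.0
    for i, c in enumerate(counts):
        current.append(i)
        total += c
        if total > 0 and total / math.sqrt(total) >= min_snr:
            groups.append(current)
            current, total = [], 0.0
    return groups, current


groups, incomplete = group_by_snr([400, 500, 100, 900, 50], min_snr=30)
print(groups)      # [[0, 1], [2, 3]]
print(incomplete)  # [4]
```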

Example 3

dmgroup in.fits out.fits grouptype=ADAPTIVE xcolumn=chipx
grouptypeval=100 binspec="" ycolumn=counts
tabspec="55710000:55760000:#1" tabcolumn=time

Group the data in column 'counts' of in.fits creating groups with a minimum of 100 events each. The rows specified by the tab specification (rows in which the time has values 55710000 to 55760000 inclusive) will remain ungrouped and flagged bad. Note that although an xcolumn is given ('chipx'), it is not used in this example.

Example 4

dmgroup in.fits out.fits grouptype=NUM_BINS xcolumn=channel
grouptypeval=20 binspec="" ycolumn=counts stopspec="50:150:"
stopcolumn="channel"

Group the data in column 'counts' of in.fits, making 20 equal-sized (if possible) groups of 'channel'. Group channels 50-150 inclusive at the highest resolution.

Example 5

dmgroup in.fits out.fits grouptype=NUM_CTS xcolumn=pi grouptypeval=270
binspec="" ycolumn=counts

Group the data in column 'counts' of in.fits creating groups containing a minimum of 270 events.

Example 6

dmgroup in.fits out.fits grouptype=NONE xcolumn=time grouptypeval=0
binspec="" ycolumn=counts

Reset the grouping, grouping everything at the highest resolution. Group each channel by itself and give it a good quality (0).

Example 7

dmgroup in.fits out.fits grouptype=MAX_SLOPE xcolumn=channel
grouptypeval=100 binspec="" ycolumn=counts

Group those features in in.fits that increase or decrease at a rate higher than 100 counts/channel.

Example 8

dmgroup in.fits out.fits grouptype=BIN_FILE xcolumn=channel
grouptypeval=0 binspec="/data/pha/grouped.fits" ycolumn=counts

Group in.fits the same way /data/pha/grouped.fits has been grouped. Reconstruct the original binspec using the 'channel' column.

Example 9

dmgroup asol1.fits grouped_asol1.fits grouptype=MAX_SLOPE
grouptypeval=1e-5 xcol=TIME ycol=RA maxlen=50
dmcopy grouped_asol1.fits"[grouping=1]" sampled_asol1.fits

Group all the rows in the aspect solution file where delta(RA)/delta(TIME) is greater than 1e-5. Setting maxlen=50 means that no group may be longer than 50, here in seconds (since xcolumn is TIME).

The second command, dmcopy, selects the rows (times) at the start of each group. This is a simple way to create a "thinned" file for quick plotting.
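
The selection on grouping=1 can be pictured with a small Python sketch. It assumes the common convention in which GROUPING is 1 on the first row of a group and -1 on the rows that continue it. This page only states that grouping=1 marks the start of each group, so the -1 value and both function names are assumptions made for illustration.

```python
def group_starts(grouping):
    """Return the 0-based indices of rows that start a group."""
    return [i for i, g in enumerate(grouping) if g == 1]


def rows_per_group(grouping):
    """Count the rows in each group, in order of appearance."""
    sizes = []
    for g in grouping:
        if g == 1:
            sizes.append(1)   # a new group begins here
        elif sizes:
            sizes[-1] += 1    # continuation of the current group
    return sizes


grouping = [1, -1, -1, 1, 1, -1]
print(group_starts(grouping))    # [0, 3, 4]
print(rows_per_group(grouping))  # [3, 1, 2]
```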


Parameters

name          type     ftype   def   min  max  reqd
infile        string   input                   yes
outfile       string   output                  yes
grouptype     string           NONE            yes
grouptypeval  real             0               yes
binspec       string                           yes
xcolumn       string                           yes
ycolumn       string                           yes
tabspec       string                           no
tabcolumn     string                           no
stopspec      string                           no
stopcolumn    string                           no
errcolumn     string                           no
clobber       boolean          no              no
verbose       integer          0     0    5    no
maxlength     real             0     0         no

Detailed Parameter Descriptions

Parameter=infile (string required filetype=input)

The input file. Can be a FITS table or a stack of tables.

Parameter=outfile (string required filetype=output)

The output file. Can be a file name for FITS output or a stack of output filenames.

Parameter=grouptype (string required default=NONE)

Type of grouping: one of NONE, BIN, SNR, NUM_BINS, NUM_CTS, ADAPTIVE, ADAPTIVE_SNR, BIN_WIDTH, MIN_SLOPE, MAX_SLOPE, BIN_FILE. The value must be given in upper case.

Parameter=grouptypeval (real required default=0)

The numerical value used for the methods SNR, NUM_BINS, NUM_CTS, ADAPTIVE, ADAPTIVE_SNR, MIN_SLOPE, MAX_SLOPE, BIN, and BIN_WIDTH. Ignored otherwise.

Parameter=binspec (string required)

The binning specification having syntax consistent with the Data Model dmbinning syntax. For grouptype=BIN, if binspec is unspecified but grouptypeval is set, then grouptypeval will be used as the binning specification.

Parameter=xcolumn (string required)

The x-axis of the data. Only used by the following grouping methods: BIN, BIN_FILE, BIN_WIDTH, MIN_SLOPE, and MAX_SLOPE. Typical values, depending on file type, are 'channel', 'pha', 'pi', 'time', or 'energy', although any column can be used. It must contain monotonically increasing data when the BIN method is selected.

Parameter=ycolumn (string required)

The name of the column of the input table containing the data to be counted and used to compute the GRP_DATA column. Typical values, depending on the input file type, are 'counts', 'rate', or 'surface_brightness', although any column can be used.

Parameter=tabspec (string not required)

The tab specification having syntax consistent with the Data Model dmbinning syntax. Denotes the channels not to be grouped, but instead marked as bad.

Parameter=tabcolumn (string not required)

The name of the column of the input table to which the tabspec is applied. It must contain monotonically increasing data.

Parameter=stopspec (string not required)

The stop specification having syntax consistent with the Data Model dmbinning syntax. Denotes the channels to be grouped at the highest resolution.

Parameter=stopcolumn (string not required)

The name of the column of the input table to which the stopspec is applied. It must contain monotonically increasing data.

Parameter=errcolumn (string not required)

The name of the column of the input table containing the data used to calculate the GRP_STAT_ERR values, and the signal-to-noise ratio when using the SNR and ADAPTIVE_SNR methods. If no value is given, sqrt(grp_data) is used.

Parameter=clobber (boolean not required default=no)

Specifies if an existing output file should be overwritten.

Parameter=verbose (integer not required default=0 min=0 max=5)

Specifies the level of verbosity in displaying diagnostic messages.

Parameter=maxlength (real not required default=0 min=0)

Specifies the maximum size of a group. If set to 0, no maximum is applied. Only applies to the methods BIN, SNR, NUM_CTS, ADAPTIVE, ADAPTIVE_SNR, MIN_SLOPE, and MAX_SLOPE; it is ignored otherwise.


Bugs

Caveats

Minimum and maximum values for the binspec parameter

If there is no min/max given in the binspec parameter, the tool returns the min/max for the datatype for the specified column. If this is a large range, it will fail with

# dmgroup (CIAO): dsALLOCERR -- ERROR: Could not allocate memory.
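
To see why a missing range can exhaust memory, compare the number of bins implied by an explicit range with the number implied by a full datatype range. The sketch below assumes a 32-bit signed integer column and a step of 1. Both are illustrative assumptions, as is the helper name bin_count, and the sketch says nothing about how dmgroup allocates memory internally.

```python
def bin_count(lo, hi, step=1):
    """Number of bins needed to cover the inclusive range [lo, hi]."""
    return (hi - lo) // step + 1


INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

print(bin_count(1, 4096))               # 4096
print(bin_count(INT32_MIN, INT32_MAX))  # 4294967296
```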

Workarounds:

  1. Provide a binning specification in the binspec parameter.

  2. If the column range is missing or too large, a range can be forced by filtering the column. For example, to establish a PHA range:

    unix% pset dmgroup infile="acis_evt2.fits[pha=1:4096]" 
    

See Also

dm
dmbinning, dmfiltering, dmopt
tools
apply_fov_limits, dmappend, dmcopy, dmextract, dmfilth, dmgroup, dmgti, dmimghist, dmjoin, dmmerge, dmpaste, dmregrid, dmsort, dmtabfilt, dmtcalc, dmtype2split, get_fov_limits, get_sky_limits, tgextract, tgextract2