Using ASCII Files in CIAO
CIAO 4.15 Science Threads
Overview
Synopsis:
CIAO users are familiar with the flexible filtering and binning capability that the Data Model tools provide with FITS files. The same tools also work on ASCII (text) files containing tables via the "ASCII kernel."
The ASCII kernel allows easy text file manipulation by the tools dmlist, dmcopy, dmstat, and dmtcalc. The majority of the other DM-specific tools also support ASCII input; refer to the Limitations section of this thread for exceptions.
Related Links:
- Help file for the DM ASCII kernel: "ahelp dmascii"
- Making Images & Filtering Data: an Introduction to the Chandra Data Model, an overview of the four core DM tools
- Full details on all the dmtools can be found in their individual ahelp files.
Last Update: 13 Jan 2022 - Review for CIAO 4.14. No changes.
Filtering Data
If a basic text file - sample.dat - contains a table with three columns and four rows:
21.0 41.3 21.8
22.0 41.1 20.2
23.0 43.8 17.3
24.0 12.3 11.1
then dmlist may be used to select a range of data from two of the columns:
unix% dmlist "sample.dat[col3=11:20][cols col2,col3]" data,clean
# col2 col3
43.80 17.30
12.30 11.10
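To make the effect of the filter concrete, the following Python sketch reproduces the same selection without CIAO. It is an illustration only and not how the DM is implemented: the function name `select` is hypothetical, and the range is treated as a closed interval, which is an assumption (see the DM ahelp files for the exact handling of range endpoints).

```python
# Contents of sample.dat from the example above: three unnamed columns.
SAMPLE = """\
21.0 41.3 21.8
22.0 41.1 20.2
23.0 43.8 17.3
24.0 12.3 11.1
"""


def select(text, filter_col, lo, hi, cols):
    """Mimic "[colN=lo:hi][cols ...]" on a whitespace-separated table.

    Columns are addressed by the default names col1, col2, ...
    Assumption: the range is treated as closed (lo <= value <= hi).
    """
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        values = [float(v) for v in line.split()]
        # Map the default column names onto this row's values.
        named = {f"col{i + 1}": v for i, v in enumerate(values)}
        if lo <= named[filter_col] <= hi:
            rows.append([named[c] for c in cols])
    return rows


selected = select(SAMPLE, "col3", 11, 20, ["col2", "col3"])
for row in selected:
    print(" ".join(f"{v:.2f}" for v in row))
```

Only the last two rows of sample.dat have a col3 value between 11 and 20, so only those rows are returned, restricted to col2 and col3.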
By default, unnamed columns are referred to as "col1", "col2", etc. If column names are provided in the file, they may be used in the filter:
unix% cat input.txt
# ROW time ccd_id energy pi
1 87.5272969157 0 14707.57031250 1008
2 87.5272969157 0 13968.8378906250 957
3 87.5272969157 0 15152.52343750 1024
4 87.5683369190 7 268.5079650879 19
5 87.5683369190 7 1101.3159179688 76
6 87.5683369190 7 2045.5782470703 141
...

unix% dmlist "input.txt[time=10:100][cols energy]" data
--------------------------------------------------------------------------------
Data for Table Block input.txt
--------------------------------------------------------------------------------

ROW energy

1 14707.57031250
2 13968.8378906250
3 15152.52343750
4 268.5079650879
5 1101.3159179688
6 2045.5782470703
7 13929.0205078125
8 3547.0227050781
9 1672.4017333984
...
ASCII to FITS; FITS to ASCII
The DM creates FITS format output by default; to write a text file instead, the kernel option must be included in the output file name every time.
To copy a FITS file to a simple text output format, e.g. to be used by another piece of code:
unix% dmcopy input.fits "output.txt[opt kernel=text/simple]"
To create a basic FITS file from ASCII input:
unix% dmcopy input.txt output.fits
DM filter syntax may be included when creating a new file. Here dmcopy is used to create an output file of filtered data in FITS or text format:
unix% dmcopy "sample.dat[col3=11:20][cols col2,col3]" filtered.fits

unix% dmcopy "sample.dat[col3=11:20][cols col2,col3]" "filtered.txt[opt kernel=text/simple]"
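When these commands are generated from a script, the bracketed file names have to be assembled and quoted correctly. The Python sketch below shows one way to build them. The helper name `vfile` is hypothetical, and nothing is executed: the resulting argument list would only work when passed to something like subprocess.run() on a system where CIAO is installed.

```python
import shlex


def vfile(name, filters=(), cols=(), opts=None):
    """Build a DM file string such as
    sample.dat[col3=11:20][cols col2,col3] or filtered.txt[opt kernel=text/simple].
    """
    spec = name
    for row_filter in filters:
        spec += f"[{row_filter}]"
    if cols:
        spec += "[cols " + ",".join(cols) + "]"
    if opts:
        # Several options share one bracket as a comma-separated list.
        spec += "[opt " + ",".join(f"{k}={v}" for k, v in opts.items()) + "]"
    return spec


infile = vfile("sample.dat", filters=["col3=11:20"], cols=["col2", "col3"])
outfile = vfile("filtered.txt", opts={"kernel": "text/simple"})

# Argument list in the same order as the command line shown above.
command = ["dmcopy", infile, outfile]

# shlex.join adds the shell quoting that the brackets and spaces require.
print(shlex.join(command))
```

Passing the arguments as a list avoids shell quoting problems altogether; the quoted form is printed only so the command can be compared with the ones above.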
About the Kernel
Output Text Formats: kernel option
There are four output formats allowed with the ASCII kernel: text/raw, text/simple, text/dtf, and text/tsv. The output is specified by including the "kernel" option in the output file name.
- [opt kernel=text/raw]
Simple text table format consists of free-format columns with no header keywords. The format understands only two datatypes: numbers (treated as double precision) and text strings. All columns are scalar and are given the default names "col1", "col2", etc.
- [opt kernel=text/simple]
This format is similar to text/raw, but has an optional header defining the column names. In its simplest form, the header consists of a single line of whitespace-separated column names preceded by the comment character; see the "comment" and "colnames" options below. The text/simple option is compatible with the SM plotting program.
- [opt kernel=text/dtf]
Data Text Format (DTF) is a pseudo-FITS format with support for headers and data subspaces. Free format tables are the default, but fixed-format fields are also supported, as described in the Using the ASCII Kernel Manual (PS).
- [opt kernel=text/tsv]
Generic TSV format files are recognized, as well as the extended header detail provided by the Chandra Source Catalog (CSC) output format. The TSV flavor cannot be auto-determined; it must be specified either by this kernel syntax or by putting the TSV flavor specification line at the top of the file (#TEXT/TAB-SEPARATED-VALUES).
Note that some of the additional header info (e.g. UCD) from the CSC format will be lost, since the DM does not yet support these concepts.
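As a rough picture of how the two simplest layouts differ, the Python sketch below writes the same rows once without a header (in the style of text/raw) and once with a single comment line of column names (in the style of text/simple). The function name `to_text` is hypothetical, and the number formatting and column alignment that the DM actually writes will differ; only the presence or absence of the header line is taken from the descriptions above.

```python
def to_text(rows, colnames=None, comment="#"):
    """Return rows as whitespace-separated text.

    With colnames, a single header line is written first: the comment
    character followed by the whitespace-separated column names.
    """
    lines = []
    if colnames:
        lines.append(f"{comment} " + " ".join(colnames))
    for row in rows:
        lines.append(" ".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


rows = [[43.8, 17.3], [12.3, 11.1]]

raw_style = to_text(rows)                                # no header at all
simple_style = to_text(rows, colnames=["col2", "col3"])  # one header line

print(raw_style)
print(simple_style)
```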
Additional Options
There are several other options that may also be used to qualify a text file. Multiple options are specified as a comma-separated list. You can use these options to allow CIAO to read tables in text files with slightly different formats from the default, for example by skipping header lines or changing the field separator.
- [opt sep=:], [opt sep=:,white], [opt sep=":;"]
Define the given character (e.g. ":", used here, or "/") to be the separator for data fields. With the "sep" option, every instance of the character starts a new field. This example represents four fields, with the second one being empty:
14.1::23.2:15.1
If the "white" qualifier is included, the separator is treated as whitespace. This means that if you have multiple separator characters next to each other, they only count as one separator. The same example - "14.1::23.2:15.1" - then represents only three fields.
More than one character may be defined as the separator. For instance, [opt sep=":;"] defines both ":" and ";" as separators. The only printable characters which may not be used are single quote ('), double quote ("), and backslash (\).
- [opt skip=3]
Skip the given number of lines (e.g. 3) at the beginning of the file. This helps handle some formats with fixed headers.
- [opt comment=#]
Lines that begin with the given character (e.g. "#", the default) prior to the first data line will be treated as comments. There is one special comment line which the "colnames" option controls, as described in the next item.
- [opt colnames=first]
The first comment-character line is treated as a space-separated list of column names. The value "first" is the default; other possible values are "last" (the last comment-character line prior to the first data line) and "none" (none of the lines are treated as a colnames definition).
- [opt nullstr="",NaN]
When specified on an input file, the nullstr value specifies an arbitrary string which represents a NULL value. This is in addition to the 'default' NULL values for each datatype:
- INTEGER: {empty}, {tnull}, -, INDEF, INF
- REAL: {empty}, -, NaN, INDEF
- STRING: {empty}
When specified on an output file, the string will be used to represent all NULL values.
The defaults for each output format are:
- text/raw - comment='#',sep=" \t\r",white,colnames=none,skip=0,nullstr='"",NaN'
- text/simple - comment='#',sep=" \t\r",white,colnames=first,skip=0,nullstr='"",NaN'
- text/dtf - sep=" \t\r",white,colnames=none,skip=0,nullstr='"",NaN'
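The way the sep, white, skip, comment, and colnames options interact can be pictured with a small pure-Python reader. This is only a sketch of the behaviour as described in this thread, not the DM's parser: `split_fields` and `read_table` are hypothetical names, all values are kept as strings, and details such as the handling of blank lines are assumptions.

```python
def split_fields(line, sep, white=False):
    """Split a line at every separator character.

    Without "white", adjacent separators produce empty fields.
    With "white", runs of separators count as a single separator.
    """
    fields = [""]
    for ch in line:
        if ch in sep:
            fields.append("")   # each separator starts a new field
        else:
            fields[-1] += ch
    if white:
        fields = [f for f in fields if f]
    return fields


def read_table(text, sep=" \t\r", white=True, skip=0, comment="#",
               colnames="first"):
    """Return (column names, rows of string fields) for a text table."""
    comments, rows = [], []
    for line in text.splitlines()[skip:]:
        if not rows and line.startswith(comment):
            # Comment lines are only recognised before the first data line.
            comments.append(line[len(comment):])
        elif line.strip():
            rows.append(split_fields(line, sep, white))
    names = None
    if comments and colnames == "first":
        names = comments[0].split()
    elif comments and colnames == "last":
        names = comments[-1].split()
    if names is None:
        # Fall back to the default names col1, col2, ...
        width = len(rows[0]) if rows else 0
        names = [f"col{i + 1}" for i in range(width)]
    return names, rows


text = "fixed header line\n# a note\n# x y z\n1:2:3\n4::6\n"
names, rows = read_table(text, sep=":", white=False, skip=1, colnames="last")
print(names, rows)
```

Here skip=1 drops the fixed header line, colnames="last" takes the column names from the final comment line, and sep=":" without "white" keeps the empty second field in the last row.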
Limitations
The ASCII kernel was developed to allow CIAO users to use the familiar DM syntax in manipulating and filtering text files; it is not intended as a replacement for the FITS kernel in pipelines.
The following are some limitations of the kernel. Refer also to the ASCII Kernel section of the Data Model bugs pages for known bugs.
- The kernel does not always work well with other tools, e.g. dmextract and acis_process_events.
- To create a dataset with more than one block or to write header keywords, you have to use the DTF flavor. Therefore, the simple and raw formats cannot be used with any CIAO tools which require multiple blocks or header keywords.
- Files larger than 2 GByte are not supported in the ASCII kernel.
- Header lines longer than 1024 bytes are not supported. Data lines may be arbitrarily long.
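The two size limits above can be checked before a text file is handed to a CIAO tool. The Python sketch below is one possible pre-flight check, under stated assumptions: `check_ascii_limits` is a hypothetical name, "2 GByte" is taken to mean 2 * 1024**3 bytes, header lines are taken to be the comment-character lines before the first data line, and the line terminator is not counted towards the 1024 bytes.

```python
import os
import tempfile


def check_ascii_limits(path, comment="#", max_bytes=2 * 1024**3,
                       max_header=1024):
    """Return a list of problems found; an empty list means none."""
    problems = []
    if os.path.getsize(path) > max_bytes:
        problems.append("file larger than 2 GByte")
    with open(path, "rb") as fh:
        for number, line in enumerate(fh, start=1):
            if not line.startswith(comment.encode()):
                break  # first data line reached; data lines may be any length
            if len(line.rstrip(b"\r\n")) > max_header:
                problems.append(
                    f"header line {number} longer than {max_header} bytes")
    return problems


# Demonstration on a temporary file whose header line is too long.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "long_header.txt")
    with open(path, "w") as fh:
        fh.write("# " + "x" * 2000 + "\n1 2 3\n")
    problems = check_ascii_limits(path)

print(problems)
```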
History
14 Dec 2007 | new for CIAO 4.0 |
02 Jan 2008 | updated for CIAO 4.1: added nullstr to the Additional Options section; images are now supported (removed item from Limitations section) |
25 Jan 2010 | reviewed for CIAO 4.2: no changes |
11 Jan 2011 | updated for CIAO 4.3: TSV format is now supported |
03 Jan 2012 | reviewed for CIAO 4.4: no changes |
03 Dec 2012 | Review for CIAO 4.5; no changes |
25 Nov 2013 | Review for CIAO 4.6. No changes. |
17 Dec 2014 | Reviewed for CIAO 4.7; no changes. |
13 Jan 2022 | Review for CIAO 4.14. No changes. |