Skip to the navigation links
Last modified: 20 December 2024

URL: https://cxc.cfa.harvard.edu/ciao/scripting/io.html

Reading and writing files in Python

The CIAO Data Model can be used to read and write FITS and ASCII files in Python, using the crates module. This provides a high-level interface to data; the cxcdm module is also provided for those cases where low-level access is required and the user is familiar with the Data Model documentation.

The AstroPy I/O module can be used to read and write files with Python, but the CIAO helpdesk does not provide support. The AstroPy I/O modules do not recognize Data Model syntax, such as column selection, filtering, or binning.


The following imports are assumed below:

import numpy as np
from matplotlib import pyplot as plt

import pycrates

Many Crates routines can be used either as a function, such as get_colvals, or as a method on an object, in this case the get_column method of the TABLECrate object returned by read_file. This guide prefers the method option, but either can be used (although note that sometimes there are differences in what the calls return).


For most examples the following event file will be used:

% dmlist acisf13770_repro_evt2.fits blocks

--------------------------------------------------------------------------------
Dataset: acisf13770_repro_evt2.fits
--------------------------------------------------------------------------------

     Block Name                          Type         Dimensions
--------------------------------------------------------------------------------
Block    1: PRIMARY                        Null
Block    2: EVENTS                         Table        16 cols x 85650    rows
Block    3: GTI3                           Table         2 cols x 1        rows
Block    4: GTI2                           Table         2 cols x 1        rows
Block    5: GTI1                           Table         2 cols x 2        rows
Block    6: GTI0                           Table         2 cols x 1        rows
Block    7: GTI6                           Table         2 cols x 1        rows

Reading a single block from a FITS file

The read_file function will read in the "most interesting" block (in this case the "EVENTS block), returning a "crate" object:

cr = pycrates.read_file("acisf13770_repro_evt2.fits")
print(cr)
   Crate Type:        <TABLECrate>
   Crate Name:        EVENTS
   Ncols:             16
   Nrows:             85650

print(cr.get_filename())
acisf13770_repro_evt2.fits
print(cr.get_colnames())
['time', 'expno', 'ccd_id', 'node_id', 'chip(chipx, chipy)',
'tdet(tdetx, tdety)', 'det(detx, dety)', 'sky(x, y)', 'phas',
'pha', 'pha_ro', 'energy', 'pi', 'fltgrade', 'grade', 'status']

There are a number of routines in the crates module for accessing data, such as get_col_names and get_colvals, as well as methods on the crate object, such as get_colnames and get_column (use the Python help routine for help). Some of the routines will return the data as a cratedata object:

tdata = cr.get_column("time")
print(tdata)
  Name:     time
  Shape:    (85650,)
  Datatype: float64
  Nsets:    85650
  Unit:     s
  Desc:     S/C TT corresponding to mid-exposure
  Eltype:   Scalar
  Range:
     Min:   455646659.46568
     Max:   455677834.84234

print(type(tdata))
print(f"name={tdata.name} units={tdata.unit} desc={tdata.desc}")
<class 'pycrates.cratedata.CrateData'>
name=time units=s desc=S/C TT corresponding to mid-exposure
tvals = tdata.values
print(type(tvals))
print(tvals[0:4])
<class 'numpy.ndarray'>
[4.55647616e+08 4.55647616e+08 4.55647616e+08 4.55647616e+08]
x = cr.get_column("x").values
y = cr.get_column("y").values

plt.hexbin(x, y, mincnt=1)
plt.gca().set_aspect(aspect=1)
plt.colorbar()
[A binned representation of the X and Y columns of the event file, similar to that created by DS9.]
[Print media version: A binned representation of the X and Y columns of the event file, similar to that created by DS9.]

The SKY coordinates (that is, x and y) displayed using hex binning.

The data can be filtered, for instance to remove the low and high energy values:

energy = cr.get_column("energy").values
idx, = np.where((energy >= 500) & (energy < 2000))

plt.hexbin(x[idx], y[idx], mincnt=1)
plt.gca().set_aspect(aspect=1)
plt.colorbar()
[The filter applied to the data means that the faing emission of the source is much-more visible in the data now.]
[Print media version: The filter applied to the data means that the faing emission of the source is much-more visible in the data now.]

The events have been filtered to accept only those with energies between 500 and 2000 eV.

Reading an ASCII file

% dmcopy "pcadf13770_001N001_asol1.fits[cols time,ra,dec]" "asol.dat[opt kernel=text/simple]"
% file asol.dat
asol.dat: ASCII text
% head -4 asol.dat
#TEXT/SIMPLE
# time ra dec
4.5564730880322e+08 286.7639309292 4.524806469711
4.5564730905947e+08 286.7640088001 4.524777927964
  
acr = pycrates.read_file("asol.dat")
print(acr)
   Crate Type:        <TABLECrate>
   Crate Name:        asol.dat
   Ncols:             3
   Nrows:             113138

As an example, the aspect solution can be plotted:

ra = acr.get_column("ra").values
dec = acr.get_column("dec").values
t = acr.get_column("time").values

dt = t - t[0]
dra = 3600 * (ra - ra.mean())
ddec = 3600 * (dec - dec.mean())

plt.plot(dt, dra, '-', alpha=0.5, c="black", label="RA")
plt.plot(dt, ddec, '-', alpha=0.5, c="gray", label="Dec")
plt.legend()
plt.xlabel(r"$\Delta T$ (s)")
plt.ylabel(r"$\Delta$ (arcsec)")
[The RA and DEC values vary with time periodically (via a Lissajous curve which means the values are related but they do not just simply repeat).]
[Print media version: The RA and DEC values vary with time periodically (via a Lissajous curve which means the values are related but they do not just simply repeat).]
[NOTE]
Note

The ASCII format does not have all the metadata that the FITS format has, as can be seen in the lack of extra information - such as a description, units, or a data range - in the CrateData object:

print(acr.get_column("time"))
  Name:     time
  Shape:    (113138,)
  Datatype: float64
  Nsets:    113138
  Unit:
  Desc:
  Eltype:   Scalar
  Range:
     Min:   -1.7976931348623157e+308
     Max:   1.7976931348623157e+308

This is partly because the "simple" ASCII format was used when creating the asol.dat file. If opt kernel=text/dtf were used in the dmcopy call above then more metadata would be available, but the file itself would be harder to parse by code outside of CIAO.

Reading in all the blocks in a file

All the blocks can be accessed using the read_dataset routine:

ds = pycrates.read_dataset("acisf13770_repro_evt2.fits")
print(type(ds))
<class 'pycrates.cratedataset.CrateDataset'>
print(ds)
   Crate Dataset:
     File Name:         acisf13770_repro_evt2.fits
     Read-Write Mode:   rw
     Number of Crates:  7
       1)     Crate Type:        <IMAGECrate>
   Crate Name:        PRIMARY

       2)     Crate Type:        <TABLECrate>
   Crate Name:        EVENTS
   Ncols:             16
   Nrows:             85650

       3)     Crate Type:        <TABLECrate>
   Crate Name:        GTI3
   Ncols:             2
   Nrows:             1

       4)     Crate Type:        <TABLECrate>
   Crate Name:        GTI2
   Ncols:             2
   Nrows:             1

       5)     Crate Type:        <TABLECrate>
   Crate Name:        GTI1
   Ncols:             2
   Nrows:             2

       6)     Crate Type:        <TABLECrate>
   Crate Name:        GTI0
   Ncols:             2
   Nrows:             1

       7)     Crate Type:        <TABLECrate>
   Crate Name:        GTI6
   Ncols:             2
   Nrows:             1

[NOTE]
Note

Although read_dataset has read in enough of the data file to recognize the different blocks, and the columns in these blocks, it will not read in the actual data values until a method like get_column or routines like get_colvals and get_piximg are called.

Reading the metadata

The keyword values can be found either as a cratedata object

ekey = cr.get_key("EXPOSURE")
print(ekey)
   Name:   EXPOSURE
   Value:  28309.22660496
   Unit:   s
   Desc:   Exposure time
   Groups: ['BASIC', 'EVENT', 'O_1', 'O_B', 'O_E']

or as a scalar

print(cr.get_key_value("EXPOSURE"))
28309.22660496
[NOTE]
Note

The CIAO Data Model is not the same as the FITS system, and so some FITS header keys, particularly those that represent how column and image data are stored (such as TUNIT1 and TTYPE1), are not available as keywords. The values are available through other means - such as accessing the list of column names in a table crate, or the data type stored in the values field of a CrateData object.

Some of the metadata is accessible with the Subspace and History and Comments routines of the crates module.

Writing out a FITS block

Although it is possible to create a crate directly - with TABLECrate or IMAGECrate - it is easiest to read in a file, edit it, and then write it out. This can be done with the write_file routine or with the write method of the Crate class.

As an example, the ASCII file from above can have

  1. a units field added to the TIME column;
  2. a new column, set to the time written as an offset from the first time and stored in kiloseconds;
  3. and converted to FITS format.
acr = pycrates.read_file("asol.dat")
t = acr.get_column("time")
t.unit = "s"

newcol = pycrates.CrateData()
newcol.name = "dt"
newcol.unit = "ks"
newcol.desc = "Offset time"
newcol.values = (t.values - t.values[0]) / 1000
acr.add_column(newcol)

acr.write("asol.fits", clobber=True)

This creates the file:

% dmlist asol.fits cols

--------------------------------------------------------------------------------
Columns for Table Block asol.dat
--------------------------------------------------------------------------------

ColNo  Name                 Unit        Type             Range
   1   time                 s            Real8          -Inf:+Inf
   2   ra                                Real8          -Inf:+Inf
   3   dec                               Real8          -Inf:+Inf
   4   dt                   ks           Real8          -Inf:+Inf            Offset time

Writing out an ASCII file

The DataModel opt argument is used to specify the output file format in a write method call, write_file call, or write_dataset call.

acr.write("asol_copy.dat[opt kernel=text/dtf]", clobber=True)

Since this uses the CIAO Data Text format, which uses a FITS-like encoding in ASCII, the output includes much of the metadata, such as column units and descriptions, as the FITS format does. Using "opt kernel=text/simple" would produce an ASCII output file with no metadata.

% head asol_copy.dat
#TEXT/DTF
XTENSION= "TABLE   "
HDUNAME = "asol.dat"
EXTNAME = "asol.dat"
TFIELDS =                    4
TTYPE1  = "time    "
TFORM1  = "1D      "           / data format of field.
TUNIT1  = "s       "           / physical unit of field.
TTYPE2  = "ra      "
TFORM2  = "1D      "           / data format of field.
% dmlist asol_copy.dat cols

--------------------------------------------------------------------------------
Columns for Table Block asol.dat
--------------------------------------------------------------------------------

ColNo  Name                 Unit        Type             Range
   1   time                 s            Real8          -Inf:+Inf
   2   ra                                Real8          -Inf:+Inf
   3   dec                               Real8          -Inf:+Inf
   4   dt                   ks           Real8          -Inf:+Inf            Offset time

Writing out all the FITS blocks

The CrateDataset object also has a write method, or write_dataset call, that will write all the blocks out. For example

ds.write("copy.fits", clobber=True)