SKA Telemetry Archive Tutorial

Overview

This document gives a tutorial introduction to the SKA Telemetry Archive (also known as the thermal database), including basic configuration and a number of examples.

The telemetry archive consists of:

  • Tools to ingest and compress telemetry from the CXC Chandra archive.
  • Compressed telemetry files for data relevant to thermal, aspect, science instrument, and other spacecraft subsystems. These are FITS format files.
  • Observation summary SQL database (as a single SQLite3 file).
  • A tool to retrieve telemetry values.

Configure to use the archive

To set up and use the archive it is best to start with a “clean” environment by opening a new terminal window. This tutorial assumes you are using csh or tcsh.

There are several ways to set up to use the archive, but the steps below are one simple approach that should work for most users. In most cases in this tutorial the highlighted text (in gray boxes) can be cut and pasted directly into the terminal window.

From a 32-bit linux machine, go to a directory where you intend to do archive-related analysis and create a working directory:

cd ~/my_working_area   # for example..
mkdir thermal_work
cd thermal_work

The actual name thermal_work is not important. Next, create links to the archive retrieval script fetch and the observation SQL database db.sql3:

# On the HEAD LAN
setenv SKA /proj/sot/ska
#
# On the OCC Greta LAN
setenv SKA /home/SOT/ska
#
# Make links
ln -s $SKA/bin/fetch ./
ln -s $SKA/data/telem_archive/db.sql3 ./

Basic Functionality Test

To test the basic functionality of your setup, try the following to get a short summary of the available command line options for the main archive retrieval script:

./fetch --help

Remember that this assumes you are in the thermal_work directory that was created in the initial setup. Depending on the way your PATH environment variable is set, you might not need the ‘./’ before the fetch command. Next try to get some actual data:

./fetch --start 2006:329:23:00:00 --stop 2006:329:23:04:00 ephin2eng:

You should see:

date,5ephint,5eiot,quality
2006:329:23:00:00.000,104.544090271,137.819366455,0
2006:329:23:00:32.790,104.544090271,137.819366455,0
2006:329:23:01:05.590,104.544090271,137.819366455,0
2006:329:23:01:38.390,104.544090271,137.819366455,0
2006:329:23:02:11.190,104.544090271,137.819366455,0
2006:329:23:02:44.000,104.544090271,137.819366455,0
2006:329:23:03:16.800,104.544090271,137.819366455,0
2006:329:23:03:49.600,104.544090271,137.819366455,0
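Since fetch writes plain CSV, its output can be post-processed with standard tools. As a sketch (not part of the toolsuite itself), here is how the sample output above could be read with Python's csv module; the embedded string simply reproduces the first two data rows shown above:

```python
import csv
import io

# First rows of the sample fetch output shown above; in practice this would
# be read from the --outfile file or from a pipe.
sample = """\
date,5ephint,5eiot,quality
2006:329:23:00:00.000,104.544090271,137.819366455,0
2006:329:23:00:32.790,104.544090271,137.819366455,0
"""

rows = list(csv.DictReader(io.StringIO(sample)))

# Keep only good-quality samples of the 5EIOT column
temps = [float(r["5eiot"]) for r in rows if r["quality"] == "0"]
print(len(rows), temps[0])
```

The same pattern applies to any table/column combination, since fetch always emits a date column, the requested MSID columns, and a trailing quality column.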

Next try out the observation database:

sqlite3 db.sql3

You should get the SQLite interactive query tool prompt like sqlite>. Now try to find the number of ACIS CCDs that were on for observations in a certain date range:

select obsid, kalman_datestart, kalman_datestop, num_ccd_on
  from observations
  where kalman_datestart > '2007:113'
    and kalman_datestop  < '2007:116'
  order by kalman_datestart ;

You should see:

sqlite> select obsid, kalman_datestart, kalman_datestop, num_ccd_on from observations
   ...>   where kalman_datestart > '2007:113'
   ...>     and kalman_datestop < '2007:116'
   ...>   order by kalman_datestart ;
7606|2007:113:01:58:21.456|2007:113:02:05:11.456|6
7639|2007:113:23:02:15.060|2007:114:09:11:58.337|4
58424|2007:114:16:07:47.863|2007:114:17:41:37.163|6
58423|2007:114:19:46:27.863|2007:114:20:07:59.364|0
58422|2007:114:23:25:07.864|2007:115:02:33:17.215|6
8005|2007:115:11:33:17.466|2007:115:11:56:08.916|6
7790|2007:115:13:22:37.466|2007:115:13:40:43.967|1
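The same query can also be run non-interactively from Python using the standard sqlite3 module. The sketch below assumes only the four columns used above; an in-memory database with one sample row (taken from the output above) stands in for db.sql3, which in a real session you would open with sqlite3.connect("db.sql3"):

```python
import sqlite3

# Stand-in for: conn = sqlite3.connect("db.sql3")
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE observations
                  (obsid INTEGER, kalman_datestart TEXT,
                   kalman_datestop TEXT, num_ccd_on INTEGER)""")
conn.execute("""INSERT INTO observations VALUES
                  (7606, '2007:113:01:58:21.456', '2007:113:02:05:11.456', 6)""")

# Same query as in the interactive session above
rows = conn.execute(
    """SELECT obsid, kalman_datestart, kalman_datestop, num_ccd_on
         FROM observations
        WHERE kalman_datestart > '2007:113'
          AND kalman_datestop  < '2007:116'
        ORDER BY kalman_datestart""").fetchall()
print(rows)
```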

Fetch: Basics and Examples

The main tool for extracting telemetry data from the archive is the python script fetch. Based on the directives given via the command line, fetch will output an ASCII table of telemetry values at a uniformly spaced time sampling.

The basic syntax for calling fetch is:

fetch [options] table_col_spec1 [table_col_spec2 ...]

A table_col_spec is a table column specifier that selects one or more columns (MSIDs) within the telemetry archive tables. The allowed formats are:

<table>:<col1>,[<col2>,...]   # col1,... in specified table
<col1>,..                     # col1,... within any table (must be unique)

The default table_col_spec (if none is supplied) is ephin2eng. Some examples of valid values of table_col_spec are:

acis2eng_curr:                        # All values in acis2eng_curr table
acis2eng_curr:1dahacu                 # Just 1DAHACU from acis2eng_curr table
acis2eng_curr:1dahacu,1deicacu        # 1DAHACU,1DEICACU from acis2eng_curr table
1dahacu,1deicacu                      # 1DAHACU,1DEICACU from acis2eng_curr table
cobsrqid 5ephint 5eiot aosaest1       # Specified MSIDs
thm1eng: pcad7eng: aspect_mode        # Useful quantities for EPHIN thermal analysis
point_suncentang,dist_satearth        # Values from CXC definitive ephemeris

One example that will fail is shown below. In this case the comma-separated column names are expected to be found within the same table:

cobsrqid,5ephint,5eiot,aosaest1       # Doesn't work

Options

The available command line options are:

-h, --help               Show this help message and exit
--obsid=OBSID            Return data for OBSID
--start=START            Start date of processing
--stop=STOP              Stop date of processing
--dt=DT                  Sampling interval (sec)
--outfile=OUTFILE        File for fetch output (default = stdout)
--statusfile=STATUSFILE  Write out fetch status each status-interval seconds
--status-interval=INTRVL Time interval between statusfile update and file size check (sec)
--max-size=MAX_SIZE      Approximate output file size limit (default = None)
--ignore-quality         Output bad quality rows (suppressed by default)
--mind-the-gaps          Abort if data gap detected (instead of just setting quality=1)
--file-format=FILE_FMT   Format for output table data
--time-format=TIME_FMT   Format for output time stamp

In more detail the options are:

--obsid
Process over the Kalman interval for the given ObsId. This overrides any --start and --stop values.
--start, --stop
Specify the time range of the query. Both options require a value in one of the available DateTime Formats. The default values are 2007:001:23:00:00 and 2007:002:01:00:00, respectively.
--dt
Specify the time sampling for output data (in seconds). The default value is 32.8 seconds. Enhancing fetch to automatically determine the time sampling based on the first output column is a high priority. Note that fetch never averages, so if the output sampling period is longer than the natural sampling of the data column, then data values will be missed.
--outfile
Write the output table to the specified file. Default is writing to STDOUT.
--statusfile

Write a status file with the given name during processing. This provides a way to track progress during long requests. The file is rewritten approximately once every status-interval seconds. An example output is:

current_row: 82400
total_rows: 787609
percent_complete: 10.5
process_start: Tue Jan 27 12:43:24 2009
current_time: Tue Jan 27 12:43:36 2009
datestart: 2008:001:12:00:00.000
datestop: 2008:300:12:00:00.000
columns: date 5ephint 5eiot quality
status: processing
--status-interval
Write the status file at the given time interval (seconds).
--max-size

Stop processing if the output file grows beyond the given size (bytes). This limit is approximate since the file size is checked only once every status-interval seconds. If the limit is exceeded then the processing status will be something like:

status: File size limit 1000000 bytes exceeded
--ignore-quality
The fetch default is to suppress any row in which any of the output column values has bad quality in the input telemetry (e.g. a missing minor frame). With this option included, bad-quality rows are output with a quality column value of 1, and the bad column values are set to None.
--mind-the-gaps
Gaps in telemetry are normally just skipped so that the output data file has similar gaps. With this flag set the program will abort if a data gap is detected.
--file-format=FILE_FORMAT
Provide output table data in the specified FILE_FORMAT. The available options are csv (comma-separated), tab (tab-separated), rdb (currently the same as tab; to be fixed), and fits (FITS; not yet implemented). The default is csv.
--time-format=TIME_FORMAT
Specify the format for the date column in the output table data. Available options are given in the DateTime Formats section. The default is date.
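The status file written via --statusfile is simple "key: value" text, so it is easy to poll from a monitoring script. A sketch (not part of the toolsuite; the sample text reproduces part of the example status output above):

```python
# Parse a fetch status file ("key: value" per line) into a dict.
# In practice this text would be read from the --statusfile file.
status_text = """\
current_row: 82400
total_rows: 787609
percent_complete: 10.5
status: processing
"""

status = dict(line.split(": ", 1) for line in status_text.splitlines())
done = float(status["percent_complete"])
print(status["status"], done)
```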

Examples

fetch --start 2006:100:00:00:00 --stop 2006200.123456 --statusfile status.dat ephin2eng:
fetch --start 2006:100:00:00:00 --stop 2006200.123456 --dt 300 ephin2eng:
fetch --file-format=tab --time-format secs --outfile ephin_data.tab  ephin2eng:
fetch --start 2007:001 --stop 2007:005 >! ephin_data.csv

Notice that the ‘=’ (equals sign) in the options is optional.

DateTime Formats

Fetch (and the SKA telemetry archive toolsuite in general) supports a wide range of formats for representing date-time stamps. Note that within this and other documents for this tool suite, the words ‘time’ and ‘date’ are used interchangeably to mean a date-time stamp.

The available formats are listed in the table below:

Format    Description                                      System
-------   ----------------------------------------------   ------
secs      Elapsed seconds since 1998-01-01T00:00:00        tt
numday    DDDD:hh:mm:ss.ss.. (elapsed days and time)       utc
jd*       Julian Day                                       utc
mjd*      Modified Julian Day = JD - 2400000.5             utc
date      YYYY:DDD:hh:mm:ss.ss..                           utc
caldate   YYYYMonDD at hh:mm:ss.ss..                       utc
fits      FITS date/time format YYYY-MM-DDThh:mm:ss.ss..   tt
unix*     Unix time (seconds since 1970.0)                 utc
greta     YYYYDDD.hhmmss[sss]                              utc

* Ambiguous for input parsing and only available as output formats.

The default time “System” for the different formats is either tt (Terrestrial Time) or utc (UTC). Since TT differs from UTC by around 64 seconds it is important to be consistent in specifying the time format. Utilities to convert between the formats exist and can be made available if there is interest.
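For quick scripting, the fixed-width formats can be rearranged with plain string slicing. The sketch below is only an illustration of the greta layout (YYYYDDD.hhmmss), not the toolsuite's own converter, which handles all the formats (and the tt/utc distinction) properly:

```python
def greta_to_date(greta):
    """Rearrange a greta stamp YYYYDDD.hhmmss into date form YYYY:DDD:hh:mm:ss.

    String slicing only: no time-system (tt vs utc) handling is done here.
    """
    day, frac = greta.split(".")
    frac = (frac + "000000")[:6]   # pad fractional part out to hhmmss
    return "%s:%s:%s:%s:%s" % (day[:4], day[4:7],
                               frac[0:2], frac[2:4], frac[4:6])

print(greta_to_date("2006200.123456"))
```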

In general, when supplying a date as an input (e.g. for the --start option), the format will be automatically detected. If you enter a date in a format that is not recognized, the tool will print a full code traceback (not too pretty), but the key is the final line, which indicates a problem with the input time format:

Traceback (most recent call last):
  File "/proj/sot/ska/share/telem_archive/fetch.py", line 222, in ?
    main()
  File "/proj/sot/ska/share/telem_archive/fetch.py", line 48, in main
    for date in get_date_stamps(opt.start, opt.stop, opt.dt):
  File "/proj/sot/ska/share/telem_archive/fetch.py", line 98, in get_date_stamps
    datestop  = Chandra.Time.DateTime(stop).mxDateTime
  File "build/bdist.linux-i686/egg/Chandra/Time.py", line 298, in __getattr__
  File "build/bdist.linux-i686/egg/Chandra/Time.py", line 251, in convert
Chandra.Time.ChandraTimeError: Invalid input format 'None'

Observation Database: Basics and Examples

The observation database is an SQL table that contains detailed information about each observation. An “observation” means an interval at least 150 seconds long where the OBC reports being in KALMAN mode (via the AOACASEQ MSID) and there are no changes in the Obsid or SIM position. For analysis purposes this can be considered a time of stable pointing.

In addition to basic timing information about each observation (i.e. the start and stop of the Kalman interval), the database captures the contents of the CXCDS obspar file (i.e. the as-run OCAT values [??]) for each observation.

Starting SQLite

To use the observation table you need to start up the interactive SQLite database browser via the command (recalling that db.sql3 is a link to the real file):

sqlite3 db.sql3

You can get help on the browser meta-commands (i.e. commands on top of the SQL query language) by typing .help. The meta-commands most likely to be useful are those which specify the formatting and output file for query results:

.header(s) ON|OFF      Turn display of headers on or off
.help                  Show this message
.mode MODE ?TABLE?     Set output mode where MODE is one of:
                         csv      Comma-separated values
                         column   Left-aligned columns.  (See .width)
                         html     HTML <table> code
                         insert   SQL insert statements for TABLE
                         line     One value per line
                         list     Values delimited by .separator string
                         tabs     Tab-separated values
                         tcl      TCL list elements
.output FILENAME       Send output to FILENAME
.output stdout         Send output to the screen
.quit                  Exit this program
.separator STRING      Change separator used by output mode and .import
.show                  Show the current values for various settings
.width NUM NUM ...     Set column widths for "column" mode

Basic SQL queries

The syntax for basic SQL queries is straightforward. In the example below the SQL keywords are in upper case and column names are in lower case. In general whitespace and carriage returns are not significant. The query statement must be ended with a semicolon to tell the browser to execute the query.

SELECT obsid, kalman_datestart, kalman_datestop, num_ccd_on
  FROM observations
  WHERE kalman_datestart > '2007:113'
    AND kalman_datestop  < '2007:116'
  ORDER BY kalman_datestart ;

The SELECT <columns> and FROM <table> clauses are required, while the rest are optional. The WHERE clause can contain any reasonable boolean expression with AND, OR and parentheses as needed.

One slight trick in the above query is using a text comparison to limit the query date range. This works because the date fields have a fixed format, so string comparison does the right thing even when comparing strings of different length. The SQLite database is very lightweight and does not support a true date field type.
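The text-comparison trick can be checked directly in Python, which uses the same lexicographic ordering for strings that SQLite applies to TEXT columns:

```python
# Fixed-format date strings (YYYY:DDD:hh:mm:ss.sss) sort correctly as text,
# which is why a plain string comparison can bound a date range.
datestart = "2007:113:01:58:21.456"

in_range = ("2007:113" < datestart < "2007:116")
print(in_range)
```

Any time stamp on day 113 sorts after the bare day string '2007:113' (the longer string wins once the shared prefix is exhausted), and every stamp on days 113-115 sorts before '2007:116'.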

Further information on SQL can be obtained at: