Ska runtime environment 1.0: standalone

The Ska runtime environment consists of an integrated suite of Python packages and associated applications. As of version 1.0 the installation and update of these packages is done using the Anaconda Python distribution and the Conda package manager. This allows simple installation on standalone platforms such as linux or Mac laptops. Similar support for Windows is expected.

Note

Version 1.0 is currently in development and is not approved for flight use. In particular, many packages have been updated to newer versions, but no formal verification has been done.

Anaconda is a binary Python distribution that has all the major scientific Python packages that are needed for science and engineering analysis. The Anaconda installation is managed by the Conda binary package manager. Together these make it extremely easy to install, update, and remove Python packages.

The great thing about Anaconda is that an installation just lives in a single directory that you choose and that you own. If you don’t want it any more, removal is as simple as removing that one directory. Conversely you can have multiple Anacondas, or better yet multiple (entirely distinct) working environments within a single Anaconda distribution. The only caveat is that you should have separate Anaconda root directories for each machine architecture, e.g. linux-32 and linux-64 should not be mixed.

Installation

If you already have an Anaconda Python distribution on your platform and you want to install Ska into it, go straight to Installing Ska. If you don’t know what Anaconda is, don’t have it installed, or want to make a separate installation for Ska, then follow the steps for New Anaconda.

New Anaconda

The steps for creating a new Anaconda environment are as follows:

  • Navigate to http://conda.pydata.org/miniconda.html and download the appropriate miniconda installer for your OS.

  • Open a terminal window and change to the downloads directory for your browser.

  • The installer is called something like Miniconda-<version>-<OS>.sh. Find it and then do:

    % bash Miniconda-3.5.5-Linux-x86_64.sh  # Use the actual file name
  • Hit ENTER, then agree to the license terms, and finally specify the installation directory. This will be created by the installer and needs to be a directory that doesn’t currently exist. A fine choice is ~/anaconda (which is what is assumed for the rest of the installation), but you may want to install to a different directory depending on the details of your system.

  • When it asks about updating your .bashrc file at the end, most likely you should decline (answer no), but if you want to make this Ska/Anaconda your default Python computing environment, then answer yes. (This also assumes you are running the bash shell, which is typical for Mac and modern linux distros).

  • Now put the Anaconda bin directory into your path:

    # For csh / tcsh
    % set path=(${HOME}/anaconda/bin $path)
    % rehash

    # For bash
    % export PATH=${HOME}/anaconda/bin:$PATH
    % hash -r

This gives you a minimal working Python along with the package manager utility conda.
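Prepending (rather than appending) the directory matters because the shell takes the first match on PATH. The lookup can be sketched in Python (a toy illustration with throwaway directories, not part of Ska):

```python
import os
import tempfile

def which(cmd, path):
    """Minimal sketch of how the shell resolves a command name: scan the
    PATH directories in order and return the first executable match.
    First match wins, which is why ~/anaconda/bin must be prepended."""
    for d in path.split(os.pathsep):
        candidate = os.path.join(d, cmd)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None

# Two throwaway directories stand in for ~/anaconda/bin and /usr/bin:
anaconda_bin = tempfile.mkdtemp()
system_bin = tempfile.mkdtemp()
for d in (anaconda_bin, system_bin):
    exe = os.path.join(d, 'python')
    open(exe, 'w').close()
    os.chmod(exe, 0o755)

path = os.pathsep.join([anaconda_bin, system_bin])
print(which('python', path) == os.path.join(anaconda_bin, 'python'))  # True
```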

Installing Ska

To install the full environment of Ska packages, do the following (but you may want to read the note below on Conda Environments first). This assumes that you have set your path to use the Anaconda Python so that the conda command is available:

% conda install --channel=ska ska

Congratulations, you now have all the Ska packages installed on your local machine! This includes:

agasc                 pexpect           Ska.CIAO
asciitable            pyfits            Ska.DBI
astropy               pyger             Ska.engarchive
Chandra.cmd_states    pyparsing         Ska.File
Chandra.Maneuver      pyqt              Ska.ftp
Chandra.taco          pytables          Ska.Matplotlib
Chandra.Time          pytest            Ska.Numpy
chandra_models        python            Ska.ParseCM
django                pytz              Ska.quatutil
egenix-mx-base        pyyaks            Ska.report_ranges
hdf5                  pyyaml            Ska.Shell
ipython               pyzmq             Ska.Sun
jinja2                qt                Ska.Table
kadi                  quaternion        Ska.tdb
matplotlib            setuptools        ska_path
nose                  six               ska_sync
numexpr               Ska.arc5gl        sqlite
numpy                 Ska.astro         tornado
                                        xija
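A quick way to confirm that key packages from this list are importable is a small helper like the following (a sketch, not part of Ska; the stdlib module names in the example are just to demonstrate the helper itself):

```python
import importlib

def check_imports(names):
    """Return the subset of the given module names that fail to import.
    Handy for sanity-checking an installed package set."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing

# With a full Ska install you would pass e.g. ['numpy', 'kadi', 'xija'];
# stdlib names here just show the behavior:
print(check_imports(['json', 'no_such_module']))  # ['no_such_module']
```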

Note

Conda Environments

For people who need stable or reproducible environments, conda provides an extremely useful capability to create and maintain independent environments within a single distribution. See the Conda create docs for more info, or contact a member of the aspect team for help if you are interested.

Test

As a quick smoke test that things are working, confirm that the following works and makes a line plot:

% which ska_sync
/home/aldcroft/anaconda/bin/ska_sync

% ipython --matplotlib --classic
>>> import matplotlib.pyplot as plt
>>> from Chandra.Time import DateTime
>>> import ska_path
>>> plt.plot([1, 2])
>>> DateTime(100).date
'1998:001:00:00:36.816'
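The DateTime result above can be understood as follows: CXC seconds count Terrestrial Time (TT) seconds from the 1998.0 epoch, and TT led UTC by 63.184 s in 1998. A minimal sketch of the conversion (valid only for this epoch; the real Chandra.Time handles leap seconds generally):

```python
from datetime import datetime, timedelta

def cxc_secs_to_date(secs):
    """Convert CXC seconds (TT seconds since 1998.0) to a UTC date
    string.  Assumes TT - UTC = 63.184 s, which held from mid-1997
    through 1998; Chandra.Time accounts for leap seconds in general."""
    epoch_tt = datetime(1998, 1, 1)
    utc = epoch_tt + timedelta(seconds=secs - 63.184)
    return utc.strftime('%Y:%j:%H:%M:%S') + '.%03d' % (utc.microsecond // 1000)

print(cxc_secs_to_date(100))  # 1998:001:00:00:36.816
```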

Ska data

To make full use of the Ska environment you need various package data, e.g.:

  • Ska engineering archive
  • Kadi databases
  • Commanded states databases
  • AGASC star catalog
  • TDB (MSIDs)

The very first thing to do is define the disk location where you want to store the Ska data. Packages and applications in the Ska runtime environment use an environment variable SKA to define the root directory of data and other shared resources. On the HEAD or GRETA networks this root is the familiar /proj/sot/ska.

For a machine (e.g. your laptop) not on the HEAD or GRETA networks you need to make a directory that will hold the Ska data. A reasonable choice is putting this in a ska directory in your home directory:

% mkdir ~/ska
% setenv SKA ${HOME}/ska  # csh / tcsh
% export SKA=${HOME}/ska  # bash

If you have made a local Ska environment on a machine on the HEAD or GRETA networks, you can just set the SKA environment variable to point at the existing root:

% setenv SKA /proj/sot/ska  # csh / tcsh
% export SKA=/proj/sot/ska  # bash
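Packages then locate their data relative to this root. The pattern can be sketched as follows (an illustration only, not the actual Ska API; the `ska_data_path` helper and the `data/agasc` subdirectory are hypothetical names):

```python
import os

def ska_data_path(*parts):
    """Join path components under the SKA data root, falling back to the
    HEAD/GRETA location when the environment variable is not set.
    (Hypothetical helper for illustration only.)"""
    ska = os.environ.get('SKA', '/proj/sot/ska')
    return os.path.join(ska, 'data', *parts)

os.environ['SKA'] = os.path.join(os.environ.get('HOME', '/tmp'), 'ska')
print(ska_data_path('agasc'))  # e.g. /home/user/ska/data/agasc
```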

Syncing data

For a machine not on the HEAD or GRETA networks you need to get at least a subset of the data onto your machine. This is done by means of a script called ska_sync that is installed as part of Ska. This assumes you have set the SKA environment variable and created a directory as shown above.

The first thing is to try running it and getting the online help:

% ska_sync --help
usage: ska_sync [-h] [--user USER] [--install] [--force]

Synchronize data files for Ska runtime environment

Arguments
=========

optional arguments:
  -h, --help   show this help message and exit
  --user USER  User name for remote host
  --install    Install sync config file in default location
  --force      Force overwrite of sync config file

Next you need to install a copy of the template configuration file into your Ska root directory. This will let you customize which data you want and how to get it. This installation is done with:

% echo $SKA  # Check the SKA root directory
/Users/aldcroft/ska

% ska_sync --install
Wrote ska sync config file to /Users/aldcroft/ska/ska_sync_config

Now you need to edit the file it just created to set the remote host for getting data and the remote host user name. Choose either kadi.cfa.harvard.edu (HEAD) or the IP address for chimchim (OCC), and put in the corresponding user name:

# Host machine to supply Ska data (could also be chimchim but kadi works
# from OCC VPN just as well).
host: kadi.cfa.harvard.edu

# Remote host user name.  Default is local user name.
# user: name
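The config file uses simple `key: value` lines with `#` comments. How such a file is read can be sketched as follows (illustration only; the real ska_sync has its own parser):

```python
# Parse a minimal "key: value" config with comment lines, like the
# ska_sync_config excerpt above:
config_text = """\
host: kadi.cfa.harvard.edu
# user: name
"""

config = {}
for line in config_text.splitlines():
    line = line.strip()
    if line and not line.startswith('#'):
        key, _, value = line.partition(':')
        config[key.strip()] = value.strip()

print(config['host'])  # kadi.cfa.harvard.edu
```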

Finally do the sync step:

% ska_sync
Loaded config from /Users/aldcroft/ska/ska_sync_config

COPY and PASTE the following at your terminal command line prompt:

  rsync -arzv --progress --size-only --files-from="/Users/aldcroft/ska/ska_sync_files" \
    aldcroft@kadi.cfa.harvard.edu:/proj/sot/ska/ "/Users/aldcroft/ska/"

As instructed, copy and paste the rsync line; you will be prompted for your password. This will sync the relevant Ska data files into your local Ska root.

As long as you don’t change your config file, you can just re-run that same command to re-sync as needed.

Note

Support for syncing Engineering archive data is coming. This is slightly trickier because the whole archive is a few hundred gigabytes, which is not large by big data standards, but big enough that you probably don’t want the whole thing.

Testing data

To test that the data are really there make sure you can reproduce the following:

% ipython --classic
>>> from Ska.tdb import msids
>>> msids.find('tephin')
[<MsidView msid="TEPHIN" technical_name="EPHIN SENSOR HOUSING TEMP">]

>>> from kadi import events
>>> events.normal_suns.filter('2014:001')
<NormalSun: start=2014:207:07:04:09.331 dur=65207>

>>> from Chandra.cmd_states import fetch_states
>>> fetch_states('2011:100', '2011:101', vals=['obsid'])
    ...
    SOME WARNINGS WHICH ARE OK and will get patched up later
    ...
[ ('2011:100:11:53:12.378', '2011:101:00:26:01.434', 418823658.562, 418868827.618, 13255)
  ('2011:101:00:26:01.434', '2011:102:13:39:07.421', 418868827.618, 419002813.605, 12878)]

To Do

  • Package scripts (use entry points where possible)
    • pyger: cp pyger.sh $(INSTALL_BIN)/pyger in Makefile
  • Proceduralize the process of building packages. Need to get the order right and force upload to binstar, etc.