rdbplt - plot columns in an RDB table
rdbplt reads an RDB format database file and plots two columns against each other. The data may be transformed logarithmically. Independent positive and negative going error bars may also be plotted. Points may be connected, and may also have markers drawn. For more detailed usage instructions, see the later sections.
rdbplt uses an IRAF-compatible parameter interface. A template parameter file is in /proj/axaf/simul/lib/uparm/rdbplt.par.
The name of the RDB file to plot. If the filename is the string stdin, the data are read from the standard input stream.
The data plotted along the X and Y axes. It may be a column name, column index designator, or record index designator.
Either the string none, indicating that no error bars are to be plotted for that axis, or a comma separated list of column specifications. Spaces may be used to increase readability. See Specifying the Data for more info.
This parameter specifies the transformations to be applied to the data. It is one of: linlin, linlog, loglin, loglog.
A string of flags (0 or 1) indicating whether the axis limits should be flipped (1), i.e. whether the axes have coordinate values increasing to the right (for X) or up (for Y), or to the left (for X) or down (for Y). The flags are ordered as X and Y. For example, 01 flips only the Y axis.
A string of flags (0 or 1) indicating which limits are to be autoscaled (1). The flags are ordered as xmin, xmax, ymin, and ymax. For example, 0101 autoscales xmax and ymax only.
yes|no|line style(s)
Either the string no (indicating that no lines are to be drawn), the string yes (indicating that the input data are to be connected by lines whose styles are selected automatically), or a comma separated list of line specifications. Spaces may be used to increase readability.
A comma separated list of marker specifications, or none, indicating that no markers should be drawn. See Markers and Lines. If there are more data sets than markers, new marker types are generated automatically. Spaces may be used to increase readability.
This is a comma separated list of colors for the data sets. Colors may be one of black, blue, bluecyan, bluemagenta, chartreuse, cyan, darkgray, green, greencyan, lightgray, magenta, orange, red, redmagenta, white, or yellow.
If the list is prefixed with cycle:, colors in the set will be cycled if the number of data sets exceeds the number of specified colors. If the list is composed only of cycle, then a default list of colors will be cycled through (this is a subset of the available colors selected to avoid colors which are too similar).
The column containing data set identifiers, or the string none to indicate a single data set.
Explicitly specified plot limits, used if the appropriate flag in autoscale is turned off. The limits are specified in the transformed data space.
The labels for the plot axes. If not specified, these will be the names of the columns which are plotted.
The plot title.
The plot subtitle.
The height of the plot frame, in inches. If zero, the plot will use the largest possible height.
The width of the plot frame, in inches. If zero, the plot will use the largest possible width.
If true, the scale of the axes will be the same (where scale means plot units per inch of paper).
If true, plot a legend. See legend_opts and legend_text.
legend_opts takes a list of parameters and flags which change how legends are drawn. See Plot Legend for more info.
A list of texts to be drawn in the legend, one element per dataset.
If there are more datasets than texts, the last one is repeated.
Elements in the list are separated by the characters "\n".
If the string %s appears in the text, it will be replaced by the contents of the break column for the dataset.
The height of the characters, in dimensionless units, where 1 is the default.
The default line width. This affects everything (including characters).
The algorithm used to specify padding between the data and the frame. One of none, pgplot, or percent.
If limit_alg is percent, this specifies the fraction of the plot window to be used as padding.
The PGPLOT device to which to output the plot.
Print out a simple help message and exit.
Print out rdbplt's version and exit.
A list of debug flags. None are presently defined.
The file to be read in is specified via the input parameter.
If it is the string stdin, the data are read from the standard input stream.
The data to plot are selected with the xcol and ycol parameters. They take a specification which has one of the following forms:
The name of the column in the RDB file containing the data. These are case sensitive!
The one-based index of a column (the first column is number 1). These are written as the index preceded by an underscore, e.g.
xcol=_3
The string _NR, indicating that the record numbers are to be used for the data values.
xcol=_NR
Record numbers start at 1.
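The three forms of column specification can be illustrated with a short Python sketch. This is not rdbplt's own code: the function name resolve_column, the in-memory table layout, and the order in which the forms are tried (_NR, then column name, then _N index) are assumptions made for illustration.

```python
def resolve_column(spec, names, rows):
    """Return the values selected by a column specification.

    names: list of column names; rows: list of rows (each a list of values).
    The lookup order used here is an assumption, not documented behavior.
    """
    if spec == "_NR":
        # Record numbers start at 1.
        return list(range(1, len(rows) + 1))
    if spec in names:
        # Column names are matched exactly (they are case sensitive).
        idx = names.index(spec)
    elif spec.startswith("_") and spec[1:].isdigit():
        # _N selects the N-th column, counting from 1.
        idx = int(spec[1:]) - 1
    else:
        raise KeyError(f"unknown column specification: {spec!r}")
    return [row[idx] for row in rows]


names = ["x", "y", "break"]
rows = [["0", "1", "set1"], ["1", "2", "set1"], ["9", "9", "set2"]]
print(resolve_column("_2", names, rows))   # second column, i.e. "y"
print(resolve_column("_NR", names, rows))  # record numbers
```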
To plot error bars, specify the columns containing the errors via the xerr and yerr parameters. Error bars are correctly transformed if the logarithm of the data is taken.
The transformations applied to the data before being plotted are specified with the axisxfrm parameter. It can take the following values:
linlin
No transformation is applied.
linlog
The base 10 logarithm of the Y values is plotted.
loglin
The base 10 logarithm of the X values is plotted.
loglog
The base 10 logarithm of both axes is plotted.
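As a rough illustration of the four axisxfrm values, the following Python sketch applies the corresponding base 10 logarithms to paired X and Y values. The function name transform is hypothetical, and the sketch makes no attempt to reproduce how rdbplt treats non-positive values.

```python
import math


def transform(xs, ys, axisxfrm):
    """Apply an axisxfrm-style transformation to paired X and Y values."""
    # The first three letters describe the X axis, the last three the Y axis.
    modes = {
        "linlin": (False, False),
        "linlog": (False, True),
        "loglin": (True, False),
        "loglog": (True, True),
    }
    log_x, log_y = modes[axisxfrm]
    tx = [math.log10(x) if log_x else x for x in xs]
    ty = [math.log10(y) if log_y else y for y in ys]
    return tx, ty


# linlog leaves X alone and takes the logarithm of Y.
print(transform([1.0, 10.0], [100.0, 1000.0], "linlog"))
```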
Normally rdbplt will autoscale the axes. To force axis limits, use the autoscale parameter in conjunction with the xmin, xmax, ymin, ymax parameters. autoscale should be set to a string of 1's and 0's, indicating whether a limit is to be autoscaled (1) or whether the appropriate limit parameter is to be used (0). The digits are ordered xmin, xmax, ymin, ymax. For instance, to autoscale on xmax and ymax, and fix the rest,
autoscale=0101 xmin=0 ymin=-33
Note that axis limits are specified in the transformed data space.
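The way the autoscale flags select between data-derived and user-supplied limits can be sketched in Python as follows. The function name resolve_limits and the dictionaries are illustrative assumptions; only the flag order (xmin, xmax, ymin, ymax) comes from the description above.

```python
def resolve_limits(autoscale, data_limits, fixed_limits):
    """Choose each plot limit from the data (flag 1) or the user (flag 0)."""
    names = ("xmin", "xmax", "ymin", "ymax")
    if len(autoscale) != 4 or set(autoscale) - {"0", "1"}:
        raise ValueError("autoscale must be four characters, each 0 or 1")
    return {
        name: data_limits[name] if flag == "1" else fixed_limits[name]
        for name, flag in zip(names, autoscale)
    }


# Mirrors autoscale=0101 xmin=0 ymin=-33; the data limits are made up.
data = {"xmin": 1.5, "xmax": 9.0, "ymin": -2.0, "ymax": 40.0}
fixed = {"xmin": 0.0, "ymin": -33.0}
print(resolve_limits("0101", data, fixed))
```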
If it makes more sense for the axis coordinates to increase in the opposite sense from normal (i.e. right to left, or top to bottom), use axisflip. For example,
axisflip=01
flips the sense of the Y axis.
A single RDB table may contain multiple data sets to be overplotted. The X and Y values for the different data sets are not in separate columns, but are concatenated in a single pair of X and Y columns, with a third column indicating the set to which the data belong (this is known in RDB parlance as a break column).
Here's how it works. Each data set has a unique identifier in the break column. rdbplt decides that a new data set has begun when the value in the break column changes. This means that all data points for a given data set should be contiguous in the file. For example, here is a valid multiple data set RDB table:
x	y	break
N	N	S
0	1	set1
1	2	set1
9	9	set2
10	10	set2
The break column is specified via the break parameter. If set to none, a single data set is plotted. Otherwise, it should be set to the name of a column in the data file containing the data set identifier. The contents of the break column are compared as strings, regardless of the actual RDB column type.
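The rule that a new data set begins whenever the break column value changes can be illustrated with Python's itertools.groupby, which groups only contiguous runs. This is a sketch of the idea, not rdbplt's implementation; the function name split_data_sets is hypothetical.

```python
from itertools import groupby


def split_data_sets(rows, break_index):
    """Group contiguous rows whose break-column value (as a string) matches."""
    return [
        (key, list(group))
        for key, group in groupby(rows, key=lambda row: str(row[break_index]))
    ]


rows = [(0, 1, "set1"), (1, 2, "set1"), (9, 9, "set2"), (10, 10, "set2")]
for name, members in split_data_sets(rows, 2):
    print(name, len(members))

# In an unsorted table "set1" appears in two separate runs,
# so it is split into two data sets.
unsorted_rows = [(0, 1, "set1"), (9, 9, "set2"), (1, 2, "set1")]
print(len(split_data_sets(unsorted_rows, 2)))
```

This is why the rows of each data set must be contiguous in the file.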
Creating such RDB tables from multiple RDB tables is fairly straightforward, using the rdbcat and/or sorttbl RDB commands.
One would of course like to see the data, and this is accomplished by laying down either markers or lines, or both.
Markers are specified by the marker parameter. It can be simply set to yes, or no (or none), in which case one gets markers or one doesn't. In general, it takes a comma separated list of one or more marker specifications (in case there's more than one data set). Spaces may be used to increase readability.
If the markers should be read from a column in the data file, and the column contains marker codes, set
marker=column_name
If instead the column contains strings which should be plotted, set
marker=%column_name
The marker type may also be explicitly set:
marker=code
If there are more data sets than marker specifications, rdbplt will automatically choose a new marker to use (by incrementing the marker code for the last specified marker). There are thousands of them ranging from interesting map symbols to letters and digits, so it'll be pretty uncommon to have them repeat.
Marker codes range from -8 to several thousand. Negative markers are filled polygons, positive are symbols, letters, etc. The PGPLOT manual describes them all. -1 is especially useful, as it's a very small dot.
To omit markers for a particular data set, set its spec to none:
marker=1,2,3,none,4,5
You can specify optional attributes for the markers (to override the global attributes) by appending them to the marker spec (with a / character between them). Attributes are of the form
height/linewidth/color
e.g.,
marker=column_name/height/linewidth/color
The height of the markers, specified in character height units (see Miscellaneous for a definition). It defaults to the value of the char_height parameter.
The thickness of the line used to draw the markers. It defaults to the value of the line_width parameter. A thickness of 1 gives the thinnest lines.
The color of the marker. The available colors are listed under the color entry in PARAMETERS.
All attributes are optional (leave the field empty), and trailing empty attributes may be removed:
marker=col/3//blue
marker=col/3
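The slash separated layout, with empty and omitted trailing fields, can be sketched in Python as below. The function name parse_marker_spec is hypothetical, and attribute values are kept as plain strings since their exact types are not stated here; None stands for "use the global default".

```python
def parse_marker_spec(spec):
    """Split marker/height/linewidth/color, allowing empty or missing fields."""
    fields = [field.strip() for field in spec.split("/")]
    if len(fields) > 4:
        raise ValueError(f"too many fields in marker spec: {spec!r}")
    # Trailing attributes that were left off are treated as empty.
    fields += [""] * (4 - len(fields))
    marker, height, linewidth, color = fields
    return {
        "marker": marker,
        "height": height or None,
        "linewidth": linewidth or None,
        "color": color or None,
    }


print(parse_marker_spec("col/3//blue"))
print(parse_marker_spec("col/3"))
```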
Connecting lines are controlled by the connect parameter. It can be simply set to yes, or no, in which case one gets lines or one doesn't. In this case, if there is more than one data set, rdbplt will automatically cycle through the available line styles. There are unfortunately only 5 styles, so it can get confusing.
For more control over the lines' appearance, connect may also be given a comma separated list of line specifications (one per data set), which have the following forms:
style/width/color
Only style is required; the rest are optional, and follow the same rules as the marker attribute specs as to leaving out attributes. Spaces may be used to increase readability.
A line style may be one of the following types: dash, dashdot1, dashdot2, dot, full, none. A style of none indicates that no line should be drawn for that particular data set.
This is the width of the line. It defaults to the value of the line_width parameter. A thickness of 1 gives the thinnest lines.
The color of the line. The available colors are listed under the color entry in PARAMETERS.
Just as with markers, if fewer line specifications are present than there are data sets, rdbplt will cycle through line styles.
Colors for markers and lines are specified with the color parameter, which takes a comma separated list of colors, one per data set. If there are more data sets than there are color specifications, the default color (whatever that means) will be used for the unspecified ones. This parameter is useful for easily getting the same color for both lines and markers in a data set.
If the list is prefixed with cycle:, colors in the set will be cycled if the number of data sets exceeds the number of specified colors. If the list is composed only of cycle, then a default list of colors will be cycled through (this is a subset of the available colors selected to avoid colors which are too similar).
The available colors are listed under the color entry in PARAMETERS.
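The three behaviors of the color list (plain list, cycle: prefix, bare cycle) might be modeled as in the Python sketch below. The function name assign_colors and the contents of DEFAULT_CYCLE are assumptions; the actual default list used by rdbplt is not given here. None stands for the default color.

```python
# Hypothetical stand-in for rdbplt's built-in cycle list.
DEFAULT_CYCLE = ("black", "red", "green", "blue")


def assign_colors(color_spec, n_sets, default_cycle=DEFAULT_CYCLE):
    """Return one color (or None for the default color) per data set."""
    spec = color_spec.strip()
    if spec == "cycle":
        colors, cycle = list(default_cycle), True
    elif spec.startswith("cycle:"):
        colors = [c.strip() for c in spec[len("cycle:"):].split(",")]
        cycle = True
    else:
        colors, cycle = [c.strip() for c in spec.split(",")], False

    result = []
    for i in range(n_sets):
        if i < len(colors):
            result.append(colors[i])
        elif cycle:
            # Wrap around to the start of the list.
            result.append(colors[i % len(colors)])
        else:
            # Without cycling, extra data sets get the default color.
            result.append(None)
    return result


print(assign_colors("cycle:red,blue", 5))
print(assign_colors("red,blue", 3))
```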
Because PGPLOT attempts to fill the plot window, and the window may not be square, plotting data where both axes have the same units may lead to squashed or stretched looking plots. To ensure that both axes have the same ratio of plot units to output pixels (or inches), set the justify parameter to yes. This will plot square regions as square.
Nicely determined padding between the plotted data and the plot frame makes for a well balanced plot. The padding between the data and the labeled axes is determined by the limit_alg parameter, with these possible values:
The determined or specified data min and max limits will be used. This may cause data values to appear on top of the plot frame.
PGPLOT's algorithm is used, which uses the determined or specified data min and max and finds "nice" rounded numbers near them.
The padding is a fraction of the plot window; the limit_val parameter specifies the fraction.
The size of the frame (in inches) may be specified with the height and width parameters. A value of zero indicates that the frame should be the maximum possible size in that direction allowed by the plotting device.
A title, subtitle, and X and Y axis labels are specified via the title, subtitle, xlabel, and ylabel parameters, respectively. PGPLOT allows one to specify different fonts (including Greek letters), symbols, and super- and sub- scripts.
It's nice to know which points went with which data set. rdbplt can construct a legend fairly automatically. To have it do so, set the legend parameter to yes.
The text associated with each data set is taken from the legend_text parameter. This is a set of strings (optionally one per data set) to be written next to the marker and line in the legend. Strings are separated by the two characters \n. If the characters %s are found within a string, they are replaced by the contents of the break column for that data set. If there are fewer strings than data sets, the last is repeated.
This sounds a bit complicated, but it's not. If you want to make very spiffy legends, and know how many data sets are there ahead of time, specify the legend strings completely:
legend_text='My First Data Set\nMy Second Data Set'
Or, say the break column contains the independent parameter which varies between data sets, and you want to insert that into the legend:
legend_text='%s [ppm / day]'
Or, say the break column has everything in there that you want:
legend_text=%s
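The legend_text rules (split on the two characters \n, substitute %s, repeat the last string) can be sketched in Python. The function name legend_labels is hypothetical and the break values are made up.

```python
def legend_labels(legend_text, break_values):
    """Build one legend label per data set from a legend_text-style string."""
    # The separator is a literal backslash followed by n, not a newline.
    texts = legend_text.split("\\n")
    labels = []
    for i, value in enumerate(break_values):
        # If there are fewer texts than data sets, the last one is reused.
        text = texts[min(i, len(texts) - 1)]
        labels.append(text.replace("%s", value))
    return labels


print(legend_labels("%s [ppm / day]", ["0.1", "0.5"]))
print(legend_labels("My First Data Set\\nMy Second Data Set", ["a", "b", "c"]))
```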
The position of the legend and how it looks is controlled by the legend_opts parameter, which takes a space or semicolon separated list of keyword-value pairs or booleans. Pairs have the form
keyword=value
while booleans appear as simply
keyword
For example,
legend_opts='x=0.3 ll'
Note the use of the quotes to get the spaces past the shell.
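A minimal Python sketch of how such a list of keyword=value pairs and bare booleans could be split is shown below. The function name parse_legend_opts is hypothetical, and values are left as strings rather than converted to numbers.

```python
import re


def parse_legend_opts(opts):
    """Parse a space or semicolon separated list of options."""
    parsed = {}
    for token in re.split(r"[;\s]+", opts.strip()):
        if not token:
            continue
        key, sep, value = token.partition("=")
        # A bare keyword is a boolean that is switched on.
        parsed[key] = value if sep else True
    return parsed


print(parse_legend_opts("x=0.3 ll"))
```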
The x coordinate of the legend box, as a fraction of the plot frame width. By default this refers to the left edge of the legend box. See ul or ll. Defaults to zero.
The y coordinate of the legend box, as a fraction of the plot frame height. By default this refers to the top edge of the legend box. See ul or ll. Defaults to one.
The fraction of a character height to skip between legend entries. Defaults to 1.5.
The length of illustrative lines drawn in the legend, as a fraction of the plot frame width. Defaults to 0.05.
Boolean. Draw a box around the legend. Defaults to off.
Padding between the legend box and its contents, in character heights.
The legend character height, in dimensionless units, where 1 is the default.
Boolean. Specifies that the x and y legend positions refer to the upper left of the legend.
Boolean. Specifies that the x and y legend positions refer to the lower left of the legend.
The device (or file) to which to write the plot is specified with the device parameter.
X window display devices are:
/xserve
/xdisp
(an alternate device which doesn't require as many colors)
To create a PostScript file that can be printed, use a device name of filename/?? where filename is the name of the file to be created, and /?? is one of the following:
/ps
landscape PostScript
/vps
portrait PostScript
/cps
landscape color PostScript
/vcps
portrait color PostScript
For the entire list of supported devices, specify ?
device='?'
Note the quotes to pass the question mark by the shell.
The default character height for most things is specified via the char_height parameter. A height of 1 is approximately 1/40 of the plot window's height.
The default line width is specified by the line_width parameter. A width of 1 is very thin.
It is possible to specify a global marker spec, and then override it for individual data sets. The global spec should be separated from the data set ones by a colon.
marker=global_def:ds1,ds2
marker=col/1/3/blue://red,none,/3/yellow
Note that the global default may be none if the majority of data sets should not have markers associated with them.
Remember that spaces may be used to increase readability.
With a global spec, one need not specify any data set specific attributes; the marker type will be incremented automatically as usual.
It is possible to specify a global connect spec, and then override it for individual data sets. The global spec should be separated from the data set ones by a colon.
connect=global_def:ds1,ds2
connect=full/1/3/blue://red,none,/3/yellow
Note that the global default may be none if the majority of data sets should not be connected. Remember that spaces may be used to increase readability.
With a global spec, one need not specify any data set specific attributes; the line style will be changed automatically as usual.
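The global_def:ds1,ds2 layout shared by marker and connect can be pulled apart as in the Python sketch below. It only separates the pieces; how rdbplt merges each data set spec with the global one is not modeled. The function name split_global_spec is hypothetical.

```python
def split_global_spec(value):
    """Split 'global:ds1,ds2' into the global spec and per data set specs."""
    # Spaces are only there for readability, so drop them first.
    value = value.replace(" ", "")
    if ":" in value:
        global_spec, _, rest = value.partition(":")
    else:
        global_spec, rest = None, value
    per_set = rest.split(",") if rest else []
    return global_spec, per_set


print(split_global_spec("col/1/3/blue : //red, none, /3/yellow"))
print(split_global_spec("1,2,3,none,4,5"))
```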
The xerr and yerr parameters control drawing of error bars. Spaces are ignored in error bar specifications, so use them as required to make things readable. Error bar specs come in two forms: as a column name with optional attributes, or as attributes. Each axis has in general two specs for the error bars, one for the positive going side, the other for the negative going side. If only one spec is provided, it'll be used for both. When plotting multiple data sets, one can give a global default spec and then override that for individual data sets.
The column name spec (hereafter col_spec) looks like
column/X/line_style/line_width/color
where column is the column name, X is the cross bar length (in units of the character height), line_style is the style of the line used to plot the bar, line_width is its thickness, and color is the error bar color. The attributes are optional, and may be left empty to specify only a particular attribute:
xerr='column / / full / / blue'
rdbplt does a reasonable job for setting defaults if an attribute isn't set.
The pure attributes spec (hereafter attr_spec) looks like a col_spec, but the column name (and the first / character) aren't required:
X/line_style/line_width/color
There are several ways to provide the specs for a single error bar:
If the same error is to be used for negative and positive going error bars (i.e., x - error, x + error), specify a single spec.
For independent error bars, specify two specs, with the negative going errors first. For col_specs, separate them with commas; for attr_specs, separate them with % characters:
neg_col_spec, pos_col_spec
neg_attr_spec % pos_attr_spec
(note that spaces are allowed)
This results in y - err_neg and y + err_pos.
For one-sided error bars, set the column name for the other side to none:
none,pos_col_spec
none % pos_attr_spec
Finally, here's how to combine the specs for one or more data sets.
If there's a single data set, or you're going to use the same specs for all of your data sets,
xerr=col_spec
To provide a default set of specs and then override them for multiple data sets,
xerr=col_spec:attr_spec,attr_spec
If the default spec is to be used for a dataset, provide an empty attr_spec:
xerr='col_spec : , attr_spec'
To not plot error bars for a data set, give it an attr_spec of none:
xerr='col_spec : none, attr_spec'
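Putting these rules together, one possible reading of the xerr and yerr grammar is sketched in Python below: the part before the colon holds the default col_specs (negative and positive sides separated by a comma), and the part after it holds one attr_spec per data set (sides separated by %). The function names and the column names dy_lo and dy_hi are hypothetical, and the sketch does not interpret the individual attributes.

```python
def split_sides(spec, separator):
    """Return (negative, positive) specs; a single spec serves both sides."""
    parts = spec.split(separator)
    if len(parts) == 1:
        return parts[0], parts[0]
    if len(parts) == 2:
        return parts[0], parts[1]
    raise ValueError(f"expected at most two specs in {spec!r}")


def parse_error_bars(value):
    """Split an xerr/yerr value into default and per data set specs."""
    # Spaces are ignored in error bar specifications.
    value = value.replace(" ", "")
    default, _, overrides = value.partition(":")
    result = {"default": split_sides(default, ","), "per_set": []}
    if overrides:
        for attr_spec in overrides.split(","):
            # An empty attr_spec means "use the default spec".
            result["per_set"].append(split_sides(attr_spec, "%"))
    return result


print(parse_error_bars("dy_lo/1, dy_hi/1 : , 2//red % 2//blue"))
print(parse_error_bars("dy_lo : none, 0.5"))
```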
If you unexpectedly find lots of strange symbols plotted, you're probably plotting multiple data sets and have forgotten to sort the RDB table on the input column.
Copyright 2007 Smithsonian Astrophysical Observatory. This software is released under the GNU General Public License. You may find a copy at
http://www.fsf.org/copyleft/gpl.html
D. Jerius <djerius@cfa.harvard.edu>