An extensible ASCII table reader and writer for Python 2 and 3.
Asciitable can read and write a wide range of ASCII table formats via built-in Extension Reader Classes:
At the top level asciitable looks like many other ASCII table interfaces since it provides default read() and write() functions with long lists of parameters to accommodate the many variations possible in commonly encountered ASCII table formats. Under the hood, however, asciitable is built on a modular and extensible class structure. The basic functionality required for reading or writing a table is largely broken into independent base class elements so that new formats can be accommodated by modifying the underlying class methods as needed.
Warning
This package is no longer being developed.
The asciitable package has been moved into the Astropy project and is now known as astropy.io.ascii. The new version is highly compatible with asciitable, and most existing code should continue to work.
The astropy.io.ascii package is being actively developed and contains many new features and bug fixes relative to asciitable. Users are strongly encouraged to migrate to astropy.io.ascii. If you have any questions or problems please send mail to the AstroPy mailing list (astropy@scipy.org).
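For users planning to migrate, a minimal sketch of the equivalent call with astropy.io.ascii (the file name and delimiter here are placeholders; most read() keywords carry over unchanged):

from astropy.io import ascii

# Equivalent of asciitable.read('table.dat', delimiter='|') in the successor package
data = ascii.read('table.dat', delimiter='|')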
Copyright: Smithsonian Astrophysical Observatory (2011)
Author: Tom Aldcroft (aldcroft@head.cfa.harvard.edu)
| OS | Python version |
|---|---|
| Linux | 2.4, 2.6, 2.7, 3.2 |
| MacOS 10.6 | 2.7 |
| Windows XP | 2.7 |
The latest release of the asciitable package is available on the Python Package Index at http://pypi.python.org/pypi/asciitable.
The latest git repository version is available at https://github.com/taldcroft/asciitable or with:
git clone git://github.com/taldcroft/asciitable.git
The asciitable package includes a number of component modules that must be made available to the Python interpreter.
The easiest way to install asciitable is with pip install or easy_install. Either one will work, but pip is the more "modern" alternative. The following will download and install the package:
pip install [--user] asciitable
** OR **
easy_install [--user] asciitable
The --user option will install asciitable in a local user directory instead of within the Python installation directory structure. See the discussion on where packages get installed for more information. The --user option requires Python 2.6 or later.
Download and untar the package tarball, then change into the source directory:
tar zxf asciitable-<version>.tar.gz
cd asciitable-<version>
If you have the nose module installed then at this point you can run the test suite:
nosetests # Python 2
nosetests3 # Python 3
There are several methods for installing. Choose ONE of them.
Python site-packages
If you have write access to the python site-packages directory you can do:
python setup.py install
Local user library
If you are running Python 2.6 or later the following command installs the asciitable module to the appropriate local user directory:
python setup.py install --user
The majority of commonly encountered ASCII tables can be easily read with the read() function:
import asciitable
data = asciitable.read(table)
where table is the name of a file, a string representation of a table, or a list of table lines. By default read() will try to guess the table format by trying all the supported formats. If this does not work (for unusually formatted tables) then one needs to give asciitable additional hints about the format, for example:
data = asciitable.read('t/nls1_stackinfo.dbout', data_start=2, delimiter='|')
data = asciitable.read('t/simple.txt', quotechar="'")
data = asciitable.read('t/simple4.txt', Reader=asciitable.NoHeader, delimiter='|')
table = ['col1 col2 col3', '1 2 hi', '3 4.2 there']
data = asciitable.read(table, delimiter=" ")
The read() function accepts a number of parameters that specify the detailed table format. Different Reader classes can define different defaults, so the descriptions below sometimes mention “typical” default values. This refers to the Basic reader and other similar Reader classes.
There are four ways to specify the table to be read:
The first two options are distinguished by the presence of a newline in the string. This assumes that valid file names will not normally contain a newline.
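As a sketch of the accepted inputs (the file-like form is an assumption here, based on the base Inputter description later in this document):

import asciitable

data = asciitable.read('t/simple.txt')                # name of a file
data = asciitable.read('col1 col2\n1 2\n3 4')         # string containing the whole table (has newlines)
data = asciitable.read(['col1 col2', '1 2', '3 4'])   # list of table lines
data = asciitable.read(open('t/simple.txt'))          # file-like object (assumed form)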
By default the output from read() is a NumPy record array object. This powerful container efficiently supports both column-wise and row access to the table and comes with the full NumPy stack of array manipulation methods.
If NumPy is not available or desired then set numpy=False. The output of read() will then be a dictionary of Column objects, keyed by column name.
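For example, a minimal sketch of reading without NumPy (the column access shown mirrors the DictLikeNumpy API documented further below):

import asciitable

lines = ['x y', '1 2.5', '3 4.5']
data = asciitable.read(lines, numpy=False)
data['x']           # Column object for column 'x'
data['x'][1]        # second value in column 'x'
data.dtype.names    # column names, in order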
read() can accept a few more parameters that allow for code-level customization of the reading process. These will be discussed in the Advanced table reading section.
data_Splitter: Splitter class to split data columns
header_Splitter: Splitter class to split header columns
Inputter: Inputter class
Outputter: Outputter class
Asciitable can replace string values in the input data before they are converted. The most common use case is probably a table which contains string values that are not a valid representation of a number, e.g. "..." for a missing value or "". If Asciitable cannot convert all elements in a column to a numeric type, it will format the column as strings. To avoid this, fill_values can be used at the string level to fill missing values with the following syntax, which replaces <old> with <new> before the type conversion is done:
fill_values = <fill_spec> | [<fill_spec1>, <fill_spec2>, ...]
<fill_spec> = (<old>, <new>, <optional col name 1>, <optional col name 2>, ...)
Within the <fill_spec> tuple the <old> and <new> values must be strings. These two values are then followed by zero or more column names. If column names are included the replacement is limited to those columns listed. If no columns are specified then the replacement is done in every column, subject to filtering by fill_include_names and fill_exclude_names (see below).
The fill_values parameter in read() takes a single <fill_spec> or a list of <fill_spec> tuples. If several <fill_spec> apply to a single occurrence of <old> then the first one determines the <new> value. For instance the following will replace an empty data value in the x or y columns with "1e38" while empty values in any other column will get "-999":
asciitable.read(table, fill_values=[('', '1e38', 'x', 'y'), ('', '-999')])
The following shows an example where string information needs to be exchanged before the conversion to float values happens. Here no_rain and no_snow are replaced by 0.0:
table = ['day rain snow',   # column names
         #--- ------- --------
         'Mon 3.2 no_snow',
         'Tue no_rain 1.1',
         'Wed 0.3 no_snow']
asciitable.read(table, fill_values=[('no_rain', '0.0'), ('no_snow', '0.0')])
Sometimes these rules apply only to specific columns in the table. Columns can be selected with fill_include_names or excluded with fill_exclude_names. Also, column names can be given directly with fill_values:
asciidata = ['text,no1,no2', 'text1,1,1.', ',2,']
asciitable.read(asciidata, fill_values=('', 'nan', 'no1', 'no2'), delimiter=',')
Here, the empty value '' in column no2 is replaced by nan, but the text column remains unaltered.
If the numpy module is available, then the default output is a NumPy masked array, where all values that were replaced by fill_values are masked. See the description of the NumpyOutputter class for information on disabling masked arrays.
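As a sketch (assuming NumPy is installed so that masked output is enabled), the mask on the returned array can be inspected with the standard numpy.ma API:

import asciitable

asciidata = ['text,no1,no2', 'text1,1,1.', ',2,']
data = asciitable.read(asciidata, fill_values=('', 'nan', 'no1', 'no2'), delimiter=',')
data['no2'].mask          # True for the entry that was empty and got replaced
data['no2'].filled(-999)  # or substitute a chosen value for the masked entries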
If the guess parameter in read() is set to True (which is the default) then read() will try to guess the table format by cycling through a number of possible table format permutations and attempting to read the table in each case. The first format which succeeds will be used to read the table. To succeed the table must be successfully parsed by the Reader and satisfy the following column requirements:
- At least two table columns
- No column names are a float or int number
- No column names begin or end with space, comma, tab, single quote, double quote, or a vertical bar (|).
These requirements reduce the chance for a false positive where a table is successfully parsed with the wrong format. A common situation is a table with numeric columns but no header row, and in this case asciitable will auto-assign column names because of the restriction on column names that look like a number.
The order of guessing is shown by this Python code:
for Reader in (Rdb, Tab, Cds, Daophot, Ipac):
    read(Reader=Reader)

for Reader in (CommentedHeader, BasicReader, NoHeader):
    for delimiter in ("|", ",", " ", "\\s"):
        for quotechar in ('"', "'"):
            read(Reader=Reader, delimiter=delimiter, quotechar=quotechar)
Note that the FixedWidth derived readers are not included in the default guess sequence (including them causes problems), so to read such tables one must explicitly specify the reader class with the Reader keyword.
If none of the guesses succeed in reading the table (subject to the column requirements) a final try is made using just the user-supplied parameters but without checking the column requirements. In this way a table with only one column or column names that look like a number can still be successfully read.
The guessing process respects any values of the Reader, delimiter, and quotechar parameters that were supplied to the read() function. Any guesses that would conflict are skipped. For example the call:
dat = asciitable.read(table, Reader=NoHeader, quotechar="'")
would only try the four delimiter possibilities, skipping all the conflicting Reader and quotechar combinations.
Guessing can be disabled in two ways:
import asciitable
data = asciitable.read(table) # guessing enabled by default
data = asciitable.read(table, guess=False) # disable for this call
asciitable.set_guess(False) # set default to False globally
data = asciitable.read(table) # guessing disabled
Asciitable converts the raw string values from the table into numeric data types by using converter functions such as the Python int and float functions. For example int("5.0") will fail while float("5.0") will succeed and return 5.0 as a Python float.
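For example, the two built-in converters behave differently on the same string value:

int("5.0")     # raises ValueError
float("5.0")   # returns 5.0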
The default set of converters for the BaseOutputter class is defined as follows:
default_converters = [asciitable.convert_list(int),
                      asciitable.convert_list(float),
                      asciitable.convert_list(str)]
These take advantage of the convert_list() function which returns a 2-element tuple. The first element is a function that will convert a list of values to the desired type. The second element is an asciitable class that specifies the type of data produced. This element should be one of StrType, IntType, or FloatType.
The conversion code steps through each applicable converter function and tries to call the function with a column of string values. If the call succeeds without raising an exception then that converter is used; otherwise the code moves on to the next converter function.
Use the converters keyword argument in order to force a specific data type for a column. This should be a dictionary with keys corresponding to the column names. Each dictionary value is a list similar to default_converters shown above. For example:
# col1 is int, col2 is float, col3 is string
converters = {'col1': [asciitable.convert_list(int)],
              'col2': [asciitable.convert_list(float)],
              'col3': [asciitable.convert_list(str)]}
read('file.dat', converters=converters)
Note that it is also possible to specify a list of converter functions that will be tried in order:
converters = {'col1': [asciitable.convert_list(float),
                       asciitable.convert_list(str)]}
read('file.dat', converters=converters)
If the numpy module is available then the NumpyOutputter is selected by default. In this case the default converters are:
default_converters = [asciitable.convert_numpy(numpy.int),
                      asciitable.convert_numpy(numpy.float),
                      asciitable.convert_numpy(numpy.str)]
These take advantage of the convert_numpy() function which returns a 2-element tuple (converter_func, converter_type) as described in the previous section. The type provided to convert_numpy() must be a valid numpy type, for example numpy.int, numpy.uint, numpy.int8, numpy.int64, numpy.float, numpy.float64, numpy.str.
The converters for each column can be specified with the converters keyword:
converters = {'col1': [asciitable.convert_numpy(numpy.uint)],
              'col2': [asciitable.convert_numpy(numpy.float32)]}
read('file.dat', converters=converters)
This section is not finished. It will discuss ways of making custom reader functions and how to write custom Reader, Splitter, Inputter and Outputter classes. For now please look at the examples and especially the code for the existing Extension Reader Classes.
Define a custom reader functionally
def read_rdb_table(table):
    reader = asciitable.Basic()
    reader.header.splitter.delimiter = '\t'
    reader.data.splitter.delimiter = '\t'
    reader.header.splitter.process_line = None
    reader.data.splitter.process_line = None
    reader.data.start_line = 2

    return reader.read(table)
Define custom readers by class inheritance
# Note: Tab and Rdb are already included in asciitable for convenience.
class Tab(asciitable.Basic):
    def __init__(self):
        asciitable.Basic.__init__(self)
        self.header.splitter.delimiter = '\t'
        self.data.splitter.delimiter = '\t'
        # Don't strip line whitespace since that includes tabs
        self.header.splitter.process_line = None
        self.data.splitter.process_line = None
        # Don't strip data value spaces since that is significant in TSV tables
        self.data.splitter.process_val = None
        self.data.splitter.skipinitialspace = False

class Rdb(asciitable.Tab):
    def __init__(self):
        asciitable.Tab.__init__(self)
        self.data.start_line = 2
Create a custom splitter.process_val function
# The default process_val() normally just strips whitespace.
# In addition have it replace empty fields with -999.
def process_val(x):
    """Custom splitter process_val function: Remove whitespace at the beginning
    or end of value and substitute -999 for any blank entries."""
    x = x.strip()
    if x == '':
        x = '-999'
    return x

# Create an RDB reader and override the splitter.process_val function
rdb_reader = asciitable.get_reader(Reader=asciitable.Rdb)
rdb_reader.data.splitter.process_val = process_val
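The customized reader is then used like any other Reader object (the file name below is only a placeholder):

# Read an RDB file, substituting -999 for blank entries
data = rdb_reader.read('t/sample.rdb')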
Asciitable is able to write ASCII tables out to a file or file-like object using the same class structure and basic user interface as for reading tables.
As a very simple example:
import numpy as np
import asciitable

x = np.array([1, 2, 3])
y = x**2
asciitable.write({'x': x, 'y': y}, 'outfile.dat', names=['x', 'y'])
A number of data formats for the input table are supported:
The example below highlights that the get_reader() function returns a Reader object that supports keywords and table metadata. The Reader object can then be an input to the write() function and allow for any associated metadata to be written.
Note that in the current release there is no support for actually writing the available keywords or other metadata, but the infrastructure is available and this is the top priority for development.
# Get a Reader object
table = asciitable.get_reader(Reader=asciitable.Daophot)
# Read a table from a file. Return a NumPy record array object and also
# update column and metadata attributes in the "table" Reader object.
data = table.read('t/daophot.dat')
# Write the data in a variety of ways using as input both the NumPy record
# array and the higher-level Reader object.
asciitable.write(table, "table.dat", Writer=asciitable.Tab )
asciitable.write(table, open("table.dat", "w"), Writer=asciitable.NoHeader )
asciitable.write(table, sys.stdout, Writer=asciitable.CommentedHeader )
asciitable.write(table, sys.stdout, Writer=asciitable.Rdb, exclude_names=['CHI'] )
asciitable.write(table, sys.stdout, formats={'XCENTER': '%12.1f',
'YCENTER': lambda x: round(x, 1)},
include_names=['XCENTER', 'YCENTER'])
Asciitable.read returns a data object that can be an input to the write() function. If NumPy is available the default data object type is a NumPy record array. However it is possible to use asciitable without NumPy in which case a DictLikeNumpy object is returned. This object supports the most basic column and row indexing API of a NumPy structured array. This object can be used as input to the write() function.
table = asciitable.get_reader(Reader=asciitable.Daophot, numpy=False)
data = table.read('t/daophot.dat')
asciitable.write(data, sys.stdout)
A NumPy structured array (aka record array) can serve as input to the write() function.
data = numpy.zeros((2,), dtype=('i4,f4,a10'))
data[:] = [(1, 2., 'Hello'), (2, 3., "World")]
asciitable.write(data, sys.stdout)
A doubly-nested structure of iterable objects (e.g. lists or tuples) can serve as input to write(). The outer layer represents rows while the inner layer represents columns.
data = [[1, 2, 3],
        [4, 5.2, 6.1],
        [8, 9, 'hello']]
asciitable.write(data, 'table.dat')
asciitable.write(data, 'table.dat', names=['x', 'y', 'z'], exclude_names=['y'])
A dictionary containing iterable objects can serve as input to write(). Each dict key is taken as the column name while the value must be an iterable object containing the corresponding column values. Note the difference in output between this example and the previous example.
data = {'x': [1, 2, 3],
        'y': [4, 5.2, 6.1],
        'z': [8, 9, 'hello world']}
asciitable.write(data, 'table.dat', names=['x', 'y', 'z'])
Specifying the names argument is necessary if the order of the columns matters. The specified values must match the keys in the data dict.
The write() function accepts a number of parameters that specify the detailed output table format. Different Reader classes can define different defaults, so the descriptions below sometimes mention “typical” default values. This refers to the Basic reader and other similar Reader classes.
Some Reader classes, e.g. Latex or AASTex, accept additional keywords that can customize the output further. See the documentation of these classes for details.
There are two ways to specify the output for the write operation:
There are five possible formats for the data table that is to be written:
For each key (column name) use the given value to convert the column data to a string. If the format value is string-like then it is used as a Python format statement, e.g. ‘%0.2f’ % value. If it is a callable function then that function is called with a single argument containing the column value to be converted. Example:
asciitable.write(table, sys.stdout, formats={'XCENTER': '%12.1f',
                                             'YCENTER': lambda x: round(x, 1)})
This can be used to fill missing values in the table or replace values with special meaning. The syntax is the same as used on input. See the Replace bad or missing values section for more information on the syntax. When writing a table, all values are converted to strings before any value is replaced. Thus, you need to provide the string representation (stripped of whitespace) for each value. Example:
asciitable.write(table, sys.stdout, fill_values=[('nan', 'no data'),
                                                 ('-999.0', 'no data')])
The key elements in asciitable are:
Each of these elements is an inheritable class with attributes that control the corresponding functionality. In this way the large number of tweakable parameters is modularized into manageable groups. Where it makes sense these attributes are actually functions that make it easy to handle special cases.
An extensible ASCII table reader and writer.
Copyright: Smithsonian Astrophysical Observatory (2010)
Author: Tom Aldcroft (aldcroft@head.cfa.harvard.edu)
Read the input table. If numpy is True (default) return the table in a numpy record array. Otherwise return the table as a dictionary of column objects using plain python lists to hold the data. Most of the default behavior for various parameters is determined by the Reader class.
Initialize a table reader allowing for common customizations. Most of the default behavior for various parameters is determined by the Reader class.
Write the input table to filename. Most of the default behavior for various parameters is determined by the Writer class.
Initialize a table writer allowing for common customizations. Most of the default behavior for various parameters is determined by the Writer class.
Return a tuple (converter_func, converter_type). The converter function converts a list into a list of the given python_type. This argument is a function that takes a single argument and returns a single value of the desired type. In general this will be one of int, float or str. The converter type is used to track the generic data type (int, float, str) that is produced by the converter function.
Return a tuple (converter_func, converter_type). The converter function converts a list into a numpy array of the given numpy_type. This type must be a valid numpy type, e.g. numpy.int, numpy.uint, numpy.int8, numpy.int64, numpy.float, numpy.float64, numpy.str. The converter type is used to track the generic data type (int, float, str) that is produced by the converter function.
Set the default value of the guess parameter for read().
Parameters: guess – New default guess value (True|False)
Bases: object
Class providing methods to read an ASCII table using the specified header, data, inputter, and outputter instances.
Typical usage is to instantiate a Reader() object and customize the header, data, inputter, and outputter attributes. Each of these is an object of the corresponding class.
There is one method inconsistent_handler that can be used to customize the behavior of read() in the event that a data row doesn’t match the header. The default behavior is to raise an InconsistentTableError.
Return lines in the table that match header.comment regexp
Adjust or skip data entries if a row is inconsistent with the header.
The default implementation does no adjustment, and hence will always trigger an exception in read() any time the number of data entries does not match the header.
Note that this will not be called if the row already matches the header.
Returns: list of strings to be parsed into data entries in the output table. If the length of this list does not match ncols, an exception will be raised in read(). Can also be None, in which case the row will be skipped.
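As a sketch, a subclass can override inconsistent_handler() to silently drop short rows instead of raising InconsistentTableError (the subclass name and its behavior are illustrative, not part of asciitable):

import asciitable

class SkipBadRowsReader(asciitable.Basic):
    """Illustrative reader that drops rows whose length does not match the header."""
    def inconsistent_handler(self, str_vals, ncols):
        if len(str_vals) != ncols:
            return None      # returning None tells read() to skip this row
        return str_vals

lines = ['a b c',
         '1 2 3',
         '4 5']              # short row: skipped instead of raising an exception
reader = asciitable.get_reader(Reader=SkipBadRowsReader)
data = reader.read(lines)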
Read the table and return the results in a format determined by the outputter attribute.
The table parameter is any string or object that can be processed by the instance inputter. For the base Inputter class table can be one of:
Parameters: table – table input
Returns: output table
Write table as list of strings.
Parameters: table – asciitable Reader object
Returns: list of strings corresponding to ASCII table
Bases: object
Base table data reader.
alias of str
Set the data_lines attribute to the lines slice comprising the table data values.
Return a generator that returns a list of column values (as strings) for each data line.
Set the fill value for each column and then apply that fill value.
In the first step it is determined which value from fill_values applies to which column, using fill_include_names and fill_exclude_names. In the second step all replacements are done for the appropriate columns.
Strip out comment lines and blank lines from list of lines
Parameters: lines – all lines in table
Returns: list of lines
alias of DefaultSplitter
Bases: object
Base table header reader
Return the column names of the table
Initialize the header Column objects from the table lines.
Based on the previously set Header attributes find or create the column names. Sets self.cols with the list of Columns. This list only includes the actual requested columns after filtering by the include_names and exclude_names attributes. See self.names for the full list.
Parameters: lines – list of table lines
Returns: None
Return the number of expected data columns from data splitting. This is either explicitly set (typically for fixedwidth splitters) or set to self.names otherwise.
Generator to yield non-comment lines
alias of DefaultSplitter
Bases: object
Get the lines from the table input and return a list of lines. The input table can be one of:
Get the lines from the table input.
Parameters: table – table input
Returns: list of lines
Process lines for subsequent use. In the default case do nothing. This routine is not generally intended for removing comment lines or stripping whitespace. These are done (if needed) in the header and data line processing.
Override this method if something more has to be done to convert raw input lines to the table rows. For example the ContinuationLinesInputter derived class accounts for continuation characters if a row is split into lines.
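As a sketch, a custom Inputter could override process_lines() to discard a fixed preamble before any header or data processing (the subclass and its behavior are illustrative, not part of asciitable):

import asciitable

class SkipPreambleInputter(asciitable.core.BaseInputter):
    """Illustrative Inputter that discards the first two raw input lines."""
    def process_lines(self, lines):
        return lines[2:]

lines = ['Written by some instrument',   # preamble to discard
         'on 2011-01-01',
         'col1 col2',
         '1 2',
         '3 4']
data = asciitable.read(lines, Inputter=SkipPreambleInputter, guess=False)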
Bases: object
Output table as a dict of column objects keyed on column name. The table data are stored as plain python lists within the column objects.
Bases: object
Base splitter that uses python’s split method to do the work.
This does not handle quoted values. A key feature is the formulation of __call__ as a generator that returns a list of the split line values at each iteration.
There are two methods that are intended to be overridden, first process_line() to do pre-processing on each input line before splitting and process_val() to do post-processing on each split string value. By default these apply the string strip() function. These can be set to another function via the instance attribute or be disabled entirely, for example:
reader.header.splitter.process_val = lambda x: x.lstrip()
reader.data.splitter.process_val = None
Parameters: delimiter – one-character string used to separate fields
Remove whitespace at the beginning or end of line. This is especially useful for whitespace-delimited files to prevent spurious columns at the beginning or end.
Remove whitespace at the beginning or end of value.
Bases: object
Table column.
The key attributes of a Column object are:
Bases: asciitable.core.BaseSplitter
Default class to split strings into columns using python csv. The class attributes are taken from the csv Dialect class.
Typical usage:
# lines = ...
splitter = asciitable.DefaultSplitter()
for col_vals in splitter(lines):
    for col_val in col_vals:
        ...
Remove whitespace at the beginning or end of line. This is especially useful for whitespace-delimited files to prevent spurious columns at the beginning or end. If splitting on whitespace then replace unquoted tabs with space first
Remove whitespace at the beginning or end of value.
Bases: dict
Provide minimal compatibility with numpy rec array API for BaseOutputter object:
table = asciitable.read('mytable.dat', numpy=False)
table.field('x')        # List of elements in column 'x'
table.dtype.names       # Get column names in order
table[1]                # Returns row 1 as a list
table[1][2]             # 3rd column in row 1
table['col1'][1]        # Row 1 in column col1
for row_vals in table:  # Iterate over table rows
    print row_vals      # Print list of vals in each row
Bases: object
Bases: exceptions.ValueError
Bases: asciitable.core.BaseOutputter
Output the table as a numpy.rec.recarray
Missing or bad data values are handled at two levels. The first is in the data reading step: if data.fill_values is set then any occurrences of a bad value are replaced by the corresponding fill value. At the same time a boolean list mask is created in the column object.
The second stage is when converting to numpy arrays, which by default generates masked arrays if data.fill_values is set and plain arrays if it is not. In the rare case that plain arrays are needed, set auto_masked (default = True) and default_masked (default = False) to control this behavior as follows:
| auto_masked | default_masked | fill_values | output |
|---|---|---|---|
| – | True | – | masked_array |
| – | False | None | array |
| True | – | dict(..) | masked_array |
| False | – | dict(..) | array |
To set these values use:
Outputter = asciitable.NumpyOutputter()
Outputter.default_masked = True
The following classes extend the base Reader functionality to handle different table formats. Some, such as the Basic Reader class, are fairly general and include a number of configurable attributes. Others, such as Cds or Daophot, are specialized to read certain well-defined but idiosyncratic formats.
Bases: asciitable.latex.Latex
Write and read AASTeX tables.
This class implements some AASTeX specific commands. AASTeX is used for the AAS (American Astronomical Society) publications like ApJ, ApJL and AJ.
It derives from Latex and accepts the same keywords (see Latex for documentation). However, the keywords header_start, header_end, data_start and data_end in latexdict have no effect.
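A sketch of writing a small table with this class (column order for the dict input is fixed with names, as described in the writing section above):

import sys
import asciitable
import asciitable.latex

data = {'name': ['bike', 'car'], 'mass': [75, 1200]}
asciitable.write(data, sys.stdout, Writer=asciitable.AASTex, names=['name', 'mass'])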
Bases: asciitable.core.BaseReader
Read a character-delimited table with a single header line at the top followed by data lines to the end of the table. Lines beginning with # as the first non-whitespace character are comments. This reader is highly configurable.
rdr = asciitable.get_reader(Reader=asciitable.Basic)
rdr.header.splitter.delimiter = ' '
rdr.data.splitter.delimiter = ' '
rdr.header.start_line = 0
rdr.data.start_line = 1
rdr.data.end_line = None
rdr.header.comment = r'\s*#'
rdr.data.comment = r'\s*#'
Example table:
# Column definition is the first uncommented line
# Default delimiter is the space character.
apples oranges pears
# Data starts after the header column definition, blank lines ignored
1 2 3
4 5 6
Bases: asciitable.core.BaseReader
Read a CDS format table: http://vizier.u-strasbg.fr/doc/catstd.htx. Example:
Table: Spitzer-identified YSOs: Addendum
================================================================================
Byte-by-byte Description of file: datafile3.txt
--------------------------------------------------------------------------------
Bytes Format Units Label Explanations
--------------------------------------------------------------------------------
1- 3 I3 --- Index Running identification number
5- 6 I2 h RAh Hour of Right Ascension (J2000)
8- 9 I2 min RAm Minute of Right Ascension (J2000)
11- 15 F5.2 s RAs Second of Right Ascension (J2000)
--------------------------------------------------------------------------------
1 03 28 39.09
Basic usage
Use the asciitable.read() function as normal, with an optional readme parameter indicating the CDS ReadMe file. If not supplied it is assumed that the header information is at the top of the given table. Examples:
>>> import asciitable
>>> table = asciitable.read("t/cds.dat")
>>> table = asciitable.read("t/vizier/table1.dat", readme="t/vizier/ReadMe")
>>> table = asciitable.read("t/cds/multi/lhs2065.dat", readme="t/cds/multi/ReadMe")
>>> table = asciitable.read("t/cds/glob/lmxbrefs.dat", readme="t/cds/glob/ReadMe")
Using a reader object
When a Cds reader object is created with a readme parameter passed to it at initialization, then the header information for the specified table is taken from the readme file when the read method is executed with a table filename. An InconsistentTableError is raised if the readme file does not have header information for the given table.
>>> readme = "t/vizier/ReadMe"
>>> r = asciitable.get_reader(asciitable.Cds, readme=readme)
>>> table = r.read("t/vizier/table1.dat")
>>> # table5.dat has the same ReadMe file
>>> table = r.read("t/vizier/table5.dat")
If no readme parameter is specified, then the header information is assumed to be at the top of the given table.
>>> r = asciitable.get_reader(asciitable.Cds)
>>> table = r.read("t/cds.dat")
>>> # The following gives InconsistentTableError, since no
>>> # readme file was given and table1.dat does not have a header.
>>> table = r.read("t/vizier/table1.dat")
Traceback (most recent call last):
...
InconsistentTableError: No CDS section delimiter found
Caveats:
Code contribution to enhance the parsing to include metadata in a Reader.meta attribute would be welcome.
Not available for the Cds class (raises NotImplementedError)
Bases: asciitable.core.BaseReader
Read a file where the column names are given in a line that begins with the header comment character. The default delimiter is the <space> character.:
# col1 col2 col3
# Comment line
1 2 3
4 5 6
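For example, a sketch of reading the table shown above:

import asciitable

lines = ['# col1 col2 col3',
         '# Comment line',
         '1 2 3',
         '4 5 6']
data = asciitable.read(lines, Reader=asciitable.CommentedHeader, guess=False)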
Bases: asciitable.core.BaseReader
Read a DAOphot file. Example:
#K MERGERAD = INDEF scaleunit %-23.7g
#K IRAF = NOAO/IRAFV2.10EXPORT version %-23s
#K USER = davis name %-23s
#K HOST = tucana computer %-23s
#
#N ID XCENTER YCENTER MAG MERR MSKY NITER \
#U ## pixels pixels magnitudes magnitudes counts ## \
#F %-9d %-10.3f %-10.3f %-12.3f %-14.3f %-15.7g %-6d
#
#N SHARPNESS CHI PIER PERROR \
#U ## ## ## perrors \
#F %-23.3f %-12.3f %-6d %-13s
#
14 138.538 256.405 15.461 0.003 34.85955 4 \
-0.032 0.802 0 No_error
The keywords defined in the #K records are available via the Daophot reader object:
reader = asciitable.get_reader(Reader=asciitable.DaophotReader)
data = reader.read('t/daophot.dat')
for keyword in reader.keywords:
    print keyword.name, keyword.value, keyword.units, keyword.format
Bases: asciitable.core.BaseReader
Read or write a fixed width table with a single header line that defines column names and positions. Examples:
# Bar delimiter in header and data
| Col1 | Col2 | Col3 |
| 1.2 | hello there | 3 |
| 2.4 | many words | 7 |
# Bar delimiter in header only
Col1 | Col2 | Col3
1.2 hello there 3
2.4 many words 7
# No delimiter with column positions specified as input
Col1 Col2Col3
1.2hello there 3
2.4many words 7
See the Fixed-width Gallery for specific usage examples.
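A sketch of reading the first (bar-delimited) example above; recall from the guessing section that the FixedWidth readers are not part of the guess sequence and must be requested explicitly:

import asciitable

lines = ['| Col1 |     Col2    | Col3 |',
         '| 1.2  | hello there |    3 |',
         '| 2.4  | many words  |    7 |']
data = asciitable.read(lines, Reader=asciitable.FixedWidth, guess=False)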
Bases: asciitable.fixedwidth.FixedWidth
Read or write a fixed width table which has no header line. Column names are either input (names keyword) or auto-generated. Column positions are determined either by input (col_starts and col_ends keywords) or by splitting the first data line. In the latter case a delimiter is required to split the data line.
Examples:
# Bar delimiter in header and data
| 1.2 | hello there | 3 |
| 2.4 | many words | 7 |
# Compact table having no delimiter and column positions specified as input
1.2hello there3
2.4many words 7
This class is just a convenience wrapper around FixedWidth but with header.start_line = None and data.start_line = 0.
See the Fixed-width Gallery for specific usage examples.
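A sketch of reading the bar-delimited headerless example above; column names are auto-generated and positions are taken from the first data line:

import asciitable

lines = ['| 1.2 | hello there | 3 |',
         '| 2.4 | many words  | 7 |']
data = asciitable.read(lines, Reader=asciitable.FixedWidthNoHeader,
                       delimiter='|', guess=False)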
Bases: asciitable.fixedwidth.FixedWidth
Read or write a fixed width table which has two header lines. The first header line defines the column names and the second implicitly defines the column positions. Examples:
# Typical case with column extent defined by ---- under column names.
col1 col2 <== header_start = 0
----- ------------ <== position_line = 1, position_char = "-"
1 bee flies <== data_start = 2
2 fish swims
# Pretty-printed table
+------+------------+
| Col1 | Col2 |
+------+------------+
| 1.2 | "hello" |
| 2.4 | there world|
+------+------------+
See the Fixed-width Gallery for specific usage examples.
Bases: asciitable.core.BaseReader
Read an IPAC format table: http://irsa.ipac.caltech.edu/applications/DDGEN/Doc/ipac_tbl.html:
\name=value
\ Comment
| column1 | column2 | column3 | column4 | column5 |
| double | double | int | double | char |
| unit | unit | unit | unit | unit |
| null | null | null | null | null |
2.0978 29.09056 73765 2.06000 B8IVpMnHg
Or:
|-----ra---|----dec---|---sao---|------v---|----sptype--------|
2.09708 29.09056 73765 2.06000 B8IVpMnHg
Caveats:
Overcoming these limitations would not be difficult, code contributions welcome from motivated users.
Not available for the Ipac class (raises NotImplementedError)
Bases: asciitable.core.BaseReader
Write and read LaTeX tables.
This class implements some LaTeX specific commands. Its main purpose is to write out a table in a form that LaTeX can compile. It is beyond the scope of this class to implement every possible LaTeX command; instead the focus is to generate syntactically valid LaTeX tables. This class can also read simple LaTeX tables (one line per table row, no \multicolumn or similar constructs); specifically, it can read the tables that it writes.
When reading a LaTeX table, the following keywords are accepted:
When writing a LaTeX table, some keywords can customize the format. Care has to be taken here, because Python interprets \ in a string as an escape character. In order to pass this to the output, either format your strings as raw strings with the r specifier or use a double \\. Examples:
caption = r'My table \label{mytable}'
caption = 'My table \\label{mytable}'
The default is \begin{table}. The following would generate a table, which spans the whole page in a two-column document:
asciitable.write(data, sys.stdout, Writer=asciitable.Latex,
                 latexdict={'tabletype': 'table*'})
If not present all columns will be centered.
This will appear above the table as it is the standard in many scientific publications. If you prefer a caption below the table, just write the full LaTeX command as latexdict['tablefoot'] = r'\caption{My table}'
Each one can be a string or a list of strings. These strings will be inserted into the table without any further processing. See the examples below.
Keys in this dictionary should be names of columns. If present, a line in the LaTeX table directly below the column names is added, which contains the values of the dictionary. Example:
import asciitable
import asciitable.latex
import sys

data = {'name': ['bike', 'car'], 'mass': [75, 1200], 'speed': [10, 130]}
asciitable.write(data, sys.stdout, Writer=asciitable.Latex,
                 latexdict={'units': {'mass': 'kg', 'speed': 'km/h'}})
If the column has no entry in the units dictionary, it defaults to ' '.
Run the following code to see where each element of the dictionary is inserted in the LaTeX table:
import asciitable
import asciitable.latex
import sys
data = {'cola': [1,2], 'colb': [3,4]}
asciitable.write(data, sys.stdout, Writer=asciitable.Latex,
                 latexdict=asciitable.latex.latexdicts['template'])
Some table styles are predefined in the dictionary asciitable.latex.latexdicts. The following generates a table in the style preferred by A&A and some other journals:
asciitable.write(data, sys.stdout, Writer=asciitable.Latex,
                 latexdict=asciitable.latex.latexdicts['AA'])
As an example, this generates a table which spans all columns and is centered on the page:
asciitable.write(data, sys.stdout, Writer=asciitable.Latex,
                 col_align='|lr|',
                 latexdict={'preamble': r'\begin{center}', 'tablefoot': r'\end{center}',
                            'tabletype': 'table*'})
Shorthand for:
latexdict['caption'] = caption
If not present this will be auto-generated for centered columns. Shorthand for:
latexdict['col_align'] = col_align
Bases: asciitable.core.BaseReader
Read a table from a data object in memory. Several input data formats are supported:
Output of asciitable.read():
table = asciitable.get_reader(Reader=asciitable.Daophot)
data = table.read('t/daophot.dat')
mem_data_from_table = asciitable.read(table, Reader=asciitable.Memory)
mem_data_from_data = asciitable.read(data, Reader=asciitable.Memory)
Numpy structured array:
data = numpy.zeros((2,), dtype=[('col1','i4'), ('col2','f4'), ('col3', 'a10')])
data[:] = [(1, 2., 'Hello'), (2, 3., "World")]
mem_data = asciitable.read(data, Reader=asciitable.Memory)
Numpy masked structured array:
data = numpy.ma.zeros((2,), dtype=[('col1','i4'), ('col2','f4'), ('col3', 'a10')])
data[:] = [(1, 2., 'Hello'), (2, 3., "World")]
data['col2'] = numpy.ma.masked
mem_data = asciitable.read(data, Reader=asciitable.Memory)
In the current version all masked values will be converted to nan.
Sequence of sequences:
data = [[1, 2, 3],
        [4, 5.2, 6.1],
        [8, 9, 'hello']]
mem_data = asciitable.read(data, Reader=asciitable.Memory, names=('c1','c2','c3'))
Dict of sequences:
data = {'c1': [1, 2, 3],
        'c2': [4, 5.2, 6.1],
        'c3': [8, 9, 'hello']}
mem_data = asciitable.read(data, Reader=asciitable.Memory, names=('c1','c2','c3'))
Not available for the Memory class (raises NotImplementedError)
Bases: asciitable.basic.Basic
Read a table with no header line. Columns are auto-named using header.auto_format which defaults to "col%d". Otherwise this reader is the same as the Basic class from which it is derived. Example:
# Table data
1 2 "hello there"
3 4 world
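For example, a sketch of reading the table above; the columns come out named col1, col2 and col3:

import asciitable

lines = ['# Table data',
         '1 2 "hello there"',
         '3 4 world']
data = asciitable.read(lines, Reader=asciitable.NoHeader, guess=False)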
Bases: asciitable.basic.Tab
Read a tab-separated file with an extra line after the column definition line. The RDB format meets this definition. Example:
col1 <tab> col2 <tab> col3
N <tab> S <tab> N
1 <tab> 2 <tab> 5
In this reader the second line is just ignored.
Bases: asciitable.basic.Basic
Read a tab-separated file. Unlike the Basic reader, whitespace is not stripped from the beginning and end of lines. By default whitespace is still stripped from the beginning and end of individual column values.
Example:
col1 <tab> col2 <tab> col3
# Comment line
1 <tab> 2 <tab> 5
These classes provide support for extension readers.
Bases: asciitable.core.BaseData
CDS table data reader
Skip over CDS header by finding the last section delimiter
alias of FixedWidthSplitter
Bases: asciitable.core.BaseHeader
Initialize the header Column objects from the table lines for a CDS header.
Parameters: lines – list of table lines
Returns: list of table Columns
Bases: asciitable.core.BaseHeader
Header class for which the column definition line starts with the comment character. See the CommentedHeader class for an example.
Return only lines that start with the comment regexp. For these lines strip out the matching characters.
Bases: asciitable.core.BaseInputter
Inputter where lines ending in continuation_char are joined with the subsequent line. Example:
col1 col2 col3
1 \
2 3
4 5 \
6
Bases: asciitable.core.BaseHeader
Read the header from a file produced by the IRAF DAOphot routine.
Initialize the header Column objects from the table lines for a DAOphot header. The DAOphot header is specialized so that we just copy the entire BaseHeader get_cols routine and modify as needed.
Parameters: lines – list of table lines
Returns: list of table Columns
Bases: asciitable.core.BaseSplitter
Split line based on fixed start and end positions for each col in self.cols.
This class requires that the Header class will have defined col.start and col.end for each column. The reference to the header.cols gets put in the splitter object by the base Reader.read() function just in time for splitting data lines by a data object.
Note that the start and end positions are defined in the pythonic style so line[start:end] is the desired substring for a column. This splitter class does not have a hook for process_lines since that is generally not useful for fixed-width input.
Bases: asciitable.core.BaseHeader
Fixed width table header reader.
The key settable class attributes are:
Initialize the header Column objects from the table lines.
Based on the previously set Header attributes find or create the column names. Sets self.cols with the list of Columns. This list only includes the actual requested columns after filtering by the include_names and exclude_names attributes. See self.names for the full list.
Parameters: lines – list of table lines
Returns: None
Split line on the delimiter and determine column values and column start and end positions. This might include null columns with zero length (e.g. for header row = "| col1 || col2 | col3 |" or header2_row = "----- ------- -----"). The null columns are stripped out. Returns the values between delimiters and the corresponding start and end positions.
Parameters: line – input line
Returns: (vals, starts, ends)
Bases: asciitable.core.BaseData
Base table data reader.
alias of FixedWidthSplitter
Bases: asciitable.core.BaseData
IPAC table data reader
alias of FixedWidthSplitter
Bases: asciitable.core.BaseHeader
IPAC table header
Initialize the header Column objects from the table lines.
Based on the previously set Header attributes find or create the column names. Sets self.cols with the list of Columns. This list only includes the actual requested columns after filtering by the include_names and exclude_names attributes. See self.names for the full list.
Parameters: lines – list of table lines
Returns: list of table Columns
Generator to yield IPAC header lines, i.e. those starting and ending with delimiter character.
alias of BaseSplitter
Bases: asciitable.core.BaseHeader
Bases: asciitable.core.BaseData
Bases: asciitable.core.BaseSplitter
Split LaTeX table data. Default delimiter is &.
Join values together and add a few extra spaces for readability
Remove whitespace at the beginning or end of line. Also remove \\ at the end of the line.
Remove whitespace and {} at the beginning or end of value.
Bases: asciitable.latex.LatexHeader
In a deluxetable some header keywords differ from standard LaTeX.
This header is modified to take that into account.
Bases: asciitable.latex.LatexData
In a deluxetable the data is enclosed in \startdata and \enddata
Bases: asciitable.latex.LatexSplitter
extract column names from a deluxetable
This splitter expects the following LaTeX code in a single line:
\tablehead{\colhead{col1} & ... & \colhead{coln}}
extract column names from tablehead