EleFits  4.0.0
A modern C++ API on top of CFitsIO
Software design

Purpose and scope

The purpose of this page is to describe the internal implementation of EleFits. Usage of the API is described in the main pages of this documentation. When implementation and usage are titghly coupled, links are provided to get the user point-of-view.

Library organization

Layers

The package is split in two layers:

  • A high-level object-oriented layer (namespace Fits, see below) which aims at handling Fits files and their contents with an as-simple-as-possible modern API:
    • The file structure is mapped to a class tree (file, HDUs, records and co. are represented by dedicated classes),
    • Data classes represent the data contents of the file,
    • Service classes provide access to each node of the file structure;
  • A low-level procedural layer (namespace Cfitsio, see below) which consists in C++ wrapper functions of selected CFitsIO functions, not intended to end-users:
    • C arrays are wrapped with aforementioned data classes,
    • The use of raw pointers is limited to fitsfile*, which is itself wrapped in the service classes,
    • Templates wrap the CFitsIO type codes,
    • Exceptions wrap the CFitsIO error codes.

Namespaces

The API is organized in namespaces as follows:

  • Euclid is the top level namespace, which contains everything;
  • Cfitsio corresponds to the low-level layer module:
  • Fits is the end-user namespace:
    • Data classes and service classes are implemented in it,
    • Fits::Test contains test helper classes and functions like fixtures;
  • Whatever the level, Internal namespaces contain implementation details, not displayed to the end-user.

Modules

This translates into the following set of modules:

  • EleFitsData contains basic data-storage structures: Records, binary table columns and image rasters are implemented as light structures to abstract from CFitsIO (void) raw pointers; See Data classes for more details on their API and usage.
  • EleCfitsioWrapper is the low-level layer: The set of CFitsIO functions are wrapped in C++ functions to secure memory usage, ease type handling through templates, and throw exceptions instead of returning error codes; Yet, this API is not meant for end-users: it exposes the fitsfile* type and is still a bit verbose and cumbersome.
  • EleFits exposes the end-user API: It provides set of read-write services in dedicated classes; See File and HDU handlers for more details.
  • EleFitsExamples provides examples! Among others, the module compares CFitsIO, EleCfitsioWrapper and EleFits implementations of the same program, which demonstrates the reading and writing of records, images and binary tables with the three APIs; Of course, this is also where the tutorial is implemented.
  • EleFitsValidation contains validation executables:
    • Generator programs allow creating dummy Fits files from scratch;
    • The performances of EleFits are compared to those of CFitsIO;
    • A validation script checks that all of the provided programs run smoothly.

Auxiliary classes, like exceptions, which are useful to both EleCfitsioWrapper and EleFits are implemented in EleFitsData.

The general organization of the free functions, data classes and service classes in modules is illustrated below in the form of a class diagram.

Source files

Inside each module, the source files are organized as follows:

  • The declarations are in files with .h extension in a folder named after the module;
  • The inline (inc. template) definitions are in files with .hpp extension in a subfolder impl of the header folder;
  • The other library definitions are in files with .cpp extension in the folder src/lib;
  • Tests are implemented in the folder tests/src, in files with _test suffix and .cpp extension;
  • The programs are implemented in .cpp files in the folder src/program.

Type codes

Overview

Simple things first:

  • Indices and lengths are implemented as longs instead of std::size_ts (see On types rationale).
  • Strings are generally handled as std::string; additionally, a few shortcuts with const char * are provided; In the latter case, a std::string version is still provided.

As for the binary parts of Fits files (image and binary table data), the mapping is not straightforward. CFitsIO functions are not consistent with this respect: half of them work with built-in types (e.g. CFitsIO type code TSHORT corresponds to short) and the other half with fixed types (e.g. CFitsIO type code SHORT_IMG corresponds to std::int16_t). Check the CFitsIO type mapping table for more details. This makes the user responsible of knowing the underlying architecture, or to spend runtime in conversions at each reading and writing operation.

Type code handling

One of the main purposes of EleFits is to abstract from type codes, and to provide homogeneous template functions. This is implemented as type traits in class Cfitsio::TypeCode, which provides one method per column of the CFitsIO type mapping table. Template specialization are made with built-in types or fixed types depending on what the CFitsIO functions expect. For example, TSHORT is TypeCode<short>::forImage() and SHORT_IMG is TypeCode<std::int16_t>::bitpix().

Note
Internally, CFitsIO does the opposite when working with built-in types: it detects the length of those types and selects the appropriate fixed type. Therefore, a possible improvement would be to bypass top-level CFitsIO functions, and avoid casting fixed types to built-ins and back. Yet, the runtime benefit is expected to be minimal for such a costly development effort (see benchmark results).

Specific types: strings

As stated above, strings are represented with std::string, while CFitsIO is using const char * or even char *. Most of the template functions are therefore specialized for std::string. It is thus required to specialize reading and writing methods for strings. Moreover, strings are treated very differently from other types in binary table columns: they are considered by CFitsIO like scalar columns with a repeat count greater than one. Therefore, Fits::Column methods are also specialized.

One classical transform needed to interface with CFitsIO is to convert a sequence of strings (e.g. a std::vector<std::string>) into a char **. This must be done carefully to keep the memory clean. A class dedicated to the transform is implemented: CStrArray, which can be built from a sequence of strings and has a getter which returns a char **.

Specific types: variant values

Another specific type is VariantValue, which is an alias for boost::any as of today (see note below). It was added to the library to handle large sets of heterogeneous records. It is obviously necessary to provide read and write functions for this type, and they have to work with any underlying value type. This is implemented as runtime switches in read and write functions.

In addition, Record<VariantValue> can be cast to Record<T> if T is compatible with the underlying VariantValue value type. This cast is more complex than a mere boost::any_cast because the size of an integer record cannot be known at compile time. For example, 666 is read as a short, while 1,000,000 is read as an int or long. Since there is no way for the user to know the value and therefore the deduced type of a record, the casting service should allow to request an int when a short was read. Method Record::cast() is in charge of handling the various cases.

Note
A welcome improvement would be to switch from boost::any to boost::variant or std::variant (C++17). This would allow relying on compile-time dispatch (visitor pattern) instead of runtime dispatch (switches). This is the role of Fits::Record::cast.

Variadic templates

Reading and writing collections of heterogeneous objects, like columns of different types, requires using variadic templates (or abstraction classes like VariantValue which cost too much for thousands of values).

Variatic template design patterns to be documented:

  • recursive implementation
  • partial specialization
  • return parameters
  • packs vs. tuples
  • TSeq vs. variadic
  • use of forward and forward_as_tuple
  • seqForeach() and seqTransform()

Data classes

Data classes are Record, Raster, Column and their child or helper classes (e.g. ColumnInfo).

To give as much freedom as possible to the user, Raster and Column are interfaces with as few requirements as possible. Implementations are provided to allow users working without writing their own implementations, although this is a possibility. Raster and Column are implemented as analogously as possible. Theycome with two flavors:

File handlers

File operations to be documented.

Vector of HDUs to be documented.

Nested classes are generally avoided, in order to simplify name resolution. The passkey idiom is preferred, to make constructors private.

HDU handlers

To be documented:

  • Handling of HDUs through vector of pointers
  • HDU factory, inc. passkey idiom
  • Access through const references