EleFits  5.0.0
A modern C++ API on top of CFITSIO
On the API design

Understanding the design helps you use EleFits more effectively.

A journey to the API design

The main reason we started implementing EleFits was to make usage safer than with CFITSIO and CCfits. This means improving type safety and memory management, and also preventing API usage mistakes. This comes with simpler parameters (no pointers, structured groups of parameters...), fewer of them, and less duplication.

Let's consider the following use case: In the fourth extension of some file "catalog.fits", read columns "ID" and "RADEC" as vectors of std::string and std::complex<double>, respectively. Also, get the name of the extension.

An object-based API following the CFITSIO approach (one single class does everything) could be something like this:

FitsFile f("catalog.fits", FileMode::READ);
f.access(4);
auto name = f.readHduName();
auto id = f.readColumn<std::string>("ID");
auto radec = f.readColumn<std::complex<double>>("RADEC");

Here, FitsFile is responsible for everything: accessing and creating the HDUs, reading and writing the header units as well as the image and binary table data units. The code is simple to read, but to offer as many features as EleFits provides, FitsFile would need several hundred methods.

A CCfits-like approach would go down to the level of HDUs but not header units and data units:

FitsFile f("catalog.fits", FileMode::READ);
const auto& hdu = f.access(4); // Returns an Hdu
auto name = hdu.readName();
auto id = hdu.readColumn<std::string>("ID");
auto radec = hdu.readColumn<std::complex<double>>("RADEC");

We have decoupled HDU access and creation on one hand (in FitsFile) from reading and writing on the other hand (in Hdu), which makes the classes smaller, but there are still major issues: for example, methods to read image data can be called on an hdu that represents a binary table extension. This can be easily mitigated by creating classes ImageHdu and BintableHdu.
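The mitigation can be pictured with a minimal, self-contained sketch. It is not the EleFits implementation: every Sketch-prefixed name is hypothetical, the file content is mocked in memory, and the use of dynamic_cast in the accessor is an assumption made for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <typeinfo>
#include <utility>
#include <vector>

// Hypothetical base class: only what is common to all HDUs (here, the name).
class SketchHdu {
public:
  explicit SketchHdu(std::string name) : m_name(std::move(name)) {}
  virtual ~SketchHdu() = default;
  const std::string& readName() const { return m_name; }

private:
  std::string m_name;
};

// Image-only operations live in the image class...
class SketchImageHdu : public SketchHdu {
public:
  using SketchHdu::SketchHdu;
  std::vector<float> readRaster() const { return {}; } // Mocked: no pixels
};

// ...and column operations in the binary table class.
class SketchBintableHdu : public SketchHdu {
public:
  using SketchHdu::SketchHdu;
  std::vector<std::string> readColumnNames() const { return {"ID", "RADEC"}; }
};

// Mocked file holding one image HDU and one binary table HDU.
class SketchFile {
public:
  SketchFile() {
    m_hdus.push_back(std::make_unique<SketchImageHdu>("PRIMARY"));
    m_hdus.push_back(std::make_unique<SketchBintableHdu>("CATALOG"));
  }

  // The caller states which kind of HDU is expected;
  // a mismatch throws std::bad_cast instead of misreading the data.
  template <typename T = SketchHdu>
  const T& access(std::size_t index) const {
    return dynamic_cast<const T&>(*m_hdus.at(index));
  }

private:
  std::vector<std::unique_ptr<SketchHdu>> m_hdus;
};

// Returns true if asking for an image at the given index is rejected.
inline bool sketchImageAccessFails(const SketchFile& f, std::size_t index) {
  try {
    f.access<SketchImageHdu>(index);
    return false;
  } catch (const std::bad_cast&) {
    return true;
  }
}
```

With such a split, calling readRaster() on a SketchBintableHdu does not compile, and requesting the wrong HDU type at a given index fails at the access point rather than deep inside a read call.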

To further reduce the coupling between header units and data units, classes have been introduced as follows:

FitsFile f("catalog.fits", FileMode::READ);
const auto& hdu = f.access<BintableHdu>(4); // Get a BintableHdu instead of an Hdu
auto name = hdu.readName(); // From parent class Hdu
auto id = hdu.columns().read<std::string>("ID");
auto radec = hdu.columns().read<std::complex<double>>("RADEC");

Last but not least, for performance, it is preferable to read several columns at the same time:

FitsFile f("catalog.fits", FileMode::READ);
const auto& hdu = f.access<BintableHdu>(4);
auto name = hdu.readName();
auto catalog = hdu.columns().readSeq<std::string, std::complex<double>>({"ID", "RADEC"});

One issue remains with the above proposal. Imagine a more realistic use case where more columns should be read:

auto catalog = hdu.columns().readSeq<
    std::string, std::complex<double>, double, float, double, float, std::int16_t>(
    {"ID", "RADEC", "Z", "FLUX", "SHAPE", "SPECTRUM", "THUMBNAIL"});

It is very easy to make mistakes in the order of the arguments. To solve this, we can write each column's data type alongside its name:

auto catalog = hdu.columns().readSeq(
    as<std::string>("ID"),
    as<std::complex<double>>("RADEC"),
    as<double>("Z"),
    as<float>("FLUX"),
    as<double>("SHAPE"),
    as<float>("SPECTRUM"),
    as<std::int16_t>("THUMBNAIL"));

We have introduced a new template class to store the type as the template parameter and the name as a member variable. This is the purpose of TypedKey in EleFits, which is instantiated with function as() for conciseness.
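To make the idea concrete, here is a simplified, self-contained sketch of such a key class and its helper function. It is not the EleFits code: the Sketch-prefixed names are hypothetical, the key is restricted to a name (no index), and the reader is a mock which returns empty columns instead of reading a file.

```cpp
#include <cassert>
#include <complex>
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical key: the value type is the template parameter,
// the column name is a member variable.
template <typename T>
struct SketchTypedKey {
  std::string name;
};

// Hypothetical helper: only the value type has to be spelled out.
template <typename T>
SketchTypedKey<T> sketchAs(std::string name) {
  return SketchTypedKey<T>{std::move(name)};
}

// Hypothetical column: a name and a vector of values.
template <typename T>
struct SketchColumn {
  std::string name;
  std::vector<T> data;
};

// Mock reader: the return type is deduced from the keys alone,
// so each column type cannot drift away from its name.
// A real reader would fill the data from the file.
template <typename... Ts>
std::tuple<SketchColumn<Ts>...> sketchReadSeq(const SketchTypedKey<Ts>&... keys) {
  return std::make_tuple(SketchColumn<Ts>{keys.name, {}}...);
}

// Example use: the type and the name of each column sit side by side.
inline void sketchDemo() {
  auto catalog = sketchReadSeq(
      sketchAs<std::string>("ID"),
      sketchAs<std::complex<double>>("RADEC"));
  static_assert(
      std::is_same<decltype(std::get<1>(catalog).data), std::vector<std::complex<double>>>::value,
      "The second column type follows the second key");
  assert(std::get<0>(catalog).name == "ID");
  assert(std::get<1>(catalog).name == "RADEC");
}
```

Because the types are deduced from the arguments, adding, removing or reordering a column is a single-line change, which is the property the positional template list lacks.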

The above code snippet reads very fluently, almost as a sentence: read the sequence of columns made of "ID" as a string, "RADEC" as a complex of doubles, "Z" as a double, and so on.

Drawbacks of API fluency

The main drawback of fluent APIs is that they often involve many classes which should be seen as implementation details. For example, the above line can be decomposed as follows:

const std::string fileName = "catalog.fits";
const FileMode mode = FileMode::READ;
const long hduIndex = 4;
const std::string idName = "ID";
const std::string radecName = "RADEC";
using Id = std::string;
using Radec = std::complex<double>;
const TypedKey<Id, std::string> idKey(idName);
const TypedKey<Radec, std::string> radecKey(radecName);
MefFile f(fileName, mode);
const BintableHdu& hdu = f.access<BintableHdu>(hduIndex);
const BintableColumns& du = hdu.columns();
std::tuple<VecColumn<Id>, VecColumn<Radec>> coords = du.readSeq(idKey, radecKey);

Here we see how verbose the implementation can be for a rather simple example. There are five EleFits types involved: MefFile, BintableHdu, BintableColumns, TypedKey, VecColumn. Among them, two are meant for fluency and should not be instantiated explicitly: BintableColumns and TypedKey.

For clearer code, the solution lies in between the two extreme code examples given above. BintableColumns can be replaced with BintableHdu::columns() and TypedKey with as(). Also, auto should be favored over explicit return types. Finally, a more natural writing of the example is:

const std::string fileName = "catalog.fits";
const long hduIndex = 4;
const std::string idName = "ID";
const std::string radecName = "RADEC";
using Id = std::string;
using Radec = std::complex<double>;
MefFile f(fileName, FileMode::READ);
const auto& hdu = f.access<BintableHdu>(hduIndex);
auto coords = hdu.columns().readSeq(as<Id>(idName), as<Radec>(radecName));
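As a side note on favoring auto, the following standalone sketch shows how a tuple of columns can be consumed without repeating its type. It only relies on the standard library: sketchReadCoords() is a hypothetical stand-in for a readSeq()-like call, plain vectors replace the column class, and the values are made up for the example.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for what a readSeq()-like call returns:
// one container per requested column, in the order of the keys.
std::tuple<std::vector<std::string>, std::vector<std::complex<double>>> sketchReadCoords() {
  std::vector<std::string> ids{"src-1", "src-2"};
  std::vector<std::complex<double>> radecs{
      std::complex<double>(10.5, -3.25),
      std::complex<double>(11.0, -3.5)};
  return std::make_tuple(ids, radecs);
}

// The caller does not repeat the tuple type: auto carries it.
std::size_t sketchCountRows() {
  const auto coords = sketchReadCoords();
  const auto& ids = std::get<0>(coords);    // std::vector<std::string>
  const auto& radecs = std::get<1>(coords); // std::vector<std::complex<double>>
  assert(ids.size() == radecs.size());
  return ids.size();
}
```

With C++17, the two std::get lines can be merged into a structured binding such as const auto& [ids, radecs] = coords;.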

How the documentation copes with fluency

Although they are simple to read, large fluent APIs are hard to document, due to the relatively large number of classes. Because it is not convenient to browse many class documentation pages when trying to perform a simple operation, the documentation of EleFits is built around examples. For example, if you check the documentation of BintableColumns::readSeq(), you will see that, although the parameters are of type TypedKey, the method documentation explains that they can be fed with a name or index and provides simple and more complex examples, so that the reader doesn't need to check the TypedKey documentation itself.

Additionally, the main documentation pages are not those of the namespaces or classes, but the Module pages, in which related classes are gathered.