2 Starlink data structures

One of the sources of irritation when using applications software is the variety of data formats in existence. The typical package has lots of applications solely for the purpose of reading in different types of data.

A major preoccupation of Starlink since its inception has been to design a format that is standard and yet can accommodate most of the data objects one might wish to store. The solution, the NDF (Extensible n-Dimensional Data Format), uses the Hierarchical Data System (HDS, SUN/92) and is described in awesome detail in SGP/38.

However, the essence of the system is simple: data objects in an NDF are stored in a logical hierarchical structure which can be compared to the VMS directory structure. At each level there are objects which may be primitives or structures. Primitives actually contain data, like ordinary VMS files, whereas structures contain further levels of objects, like VMS directory files.

There are defined locations for standard items such as the main data array, axes, title, units etc. (see Appendix A). The only mandatory item is the main data array; all other items are optional. Non-standard items are stored in extensions as described in Section 16.

However, the huge advantage of this system is that the programmer doesn’t need to know the details of the format at all! A set of routines has been provided to give access to all the standard components of an NDF. A full description of this NDF subroutine library is given in SUN/33; see also Appendix B.

For example, the title of an NDF can be read into the character string VALUE by the call below:

        CALL NDF_CGET (NDF, 'TITLE', VALUE, STATUS)

The units and label can be accessed by replacing 'TITLE' with 'UNITS' and 'LABEL' respectively.
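As a minimal sketch of how such calls might sit inside a complete ADAM task, the routine below reads all three character components and reports them. The routine name SHOWHD and the parameter name IN are invented for illustration, and the 80-character buffer is an arbitrary choice. NDF_BEGIN, NDF_ASSOC, NDF_CGET and NDF_END are NDF library routines (SUN/33); MSG_SETC and MSG_OUT belong to the message system. The sketch assumes that the include file SAE_PAR is available and that an interface file declares the parameter IN.

```fortran
      SUBROUTINE SHOWHD( STATUS )
*  Hypothetical ADAM task: report TITLE, LABEL and UNITS of an NDF.
      IMPLICIT NONE
      INCLUDE 'SAE_PAR'
      INTEGER STATUS
      INTEGER NDF
      INTEGER I
      CHARACTER * ( 5 ) COMP( 3 )
      CHARACTER * ( 80 ) VALUE
      DATA COMP / 'TITLE', 'LABEL', 'UNITS' /

*  Do nothing if an error has already occurred.
      IF ( STATUS .NE. SAI__OK ) RETURN

*  Start an NDF context and obtain the input NDF through the
*  (assumed) parameter IN.
      CALL NDF_BEGIN
      CALL NDF_ASSOC( 'IN', 'READ', NDF, STATUS )

      DO I = 1, 3
*  NDF_CGET leaves VALUE untouched if the component is undefined,
*  so establish a default first.
         VALUE = '<undefined>'
         CALL NDF_CGET( NDF, COMP( I ), VALUE, STATUS )
         CALL MSG_SETC( 'NAME', COMP( I ) )
         CALL MSG_SETC( 'VAL', VALUE )
         CALL MSG_OUT( ' ', '^NAME: ^VAL', STATUS )
      END DO

*  End the NDF context, which releases the identifier.
      CALL NDF_END( STATUS )
      END
```

Setting VALUE before each call matters because optional components such as UNITS may be absent; the SPECTRUM example used in this section, for instance, has no top-level UNITS component.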

Of course, you will still want to know what the format actually looks like. You can examine the contents of a sample NDF using TRACE¹. NDFs have the default file extension ‘.SDF’, and a selection is contained in the ADAM_EXAMPLES directory. The example below uses SPECTRUM.SDF.

  $ TRACE SPECTRUM
  
  SPECTRUM  <NDF>
  
    DATA_ARRAY(852)  <_REAL>       56.47374,97.49321,68.82304,82.95155,
                                   ... 820.8976,570.0729,471.8835,449.574
    TITLE          <_CHAR*30>      'HR6259 - a Red Giant in w Cen'
    LABEL          <_CHAR*4>       'Flux'
    AXIS(1)        <AXIS>          {structure}
  
    Contents of AXIS(1)
       LABEL          <_CHAR*20>      'Wavelength'
       UNITS          <_CHAR*20>      'Angstroms'
       DATA_ARRAY(852)  <_REAL>       3849.26,3849.79,3850.32,3850.849,
                                      ... 4298.309,4298.838,4299.368,4299.897
  End of Trace.

The indentation reflects the hierarchy of the data objects. For example, the first-level objects in the structure above are DATA_ARRAY, TITLE, LABEL and AXIS(1). The first three of these are primitives, whereas AXIS(1) is a structure and contains the second-level objects LABEL, UNITS and DATA_ARRAY.
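Purely to illustrate the distinction between primitives and structures, the sketch below walks the first level of a container file with HDS routines (SUN/92) and reports which kind each object is. The routine name LISTOB is invented, and the code assumes that a copy of SPECTRUM.SDF sits in the current directory and that the include files SAE_PAR and DAT_PAR are available. Ordinary applications would use the NDF routines rather than look inside the file this way.

```fortran
      SUBROUTINE LISTOB( STATUS )
*  Hypothetical ADAM task: list the first-level objects in SPECTRUM.
      IMPLICIT NONE
      INCLUDE 'SAE_PAR'
      INCLUDE 'DAT_PAR'
      INTEGER STATUS
      INTEGER NCOMP
      INTEGER I
      LOGICAL STRUC
      CHARACTER * ( DAT__SZLOC ) LOC
      CHARACTER * ( DAT__SZLOC ) CLOC
      CHARACTER * ( DAT__SZNAM ) NAME

      IF ( STATUS .NE. SAI__OK ) RETURN

*  Open the container file (assumed to be in the current directory)
*  and count the objects at the top level.
      CALL HDS_OPEN( 'SPECTRUM', 'READ', LOC, STATUS )
      CALL DAT_NCOMP( LOC, NCOMP, STATUS )

      IF ( STATUS .EQ. SAI__OK ) THEN
         DO I = 1, NCOMP
*  Get a locator to the I-th object, then its name and kind.
            CALL DAT_INDEX( LOC, I, CLOC, STATUS )
            CALL DAT_NAME( CLOC, NAME, STATUS )
            CALL DAT_STRUC( CLOC, STRUC, STATUS )
            CALL MSG_SETC( 'NAME', NAME )
            IF ( STRUC ) THEN
               CALL MSG_OUT( ' ', '^NAME is a structure', STATUS )
            ELSE
               CALL MSG_OUT( ' ', '^NAME is a primitive', STATUS )
            END IF
            CALL DAT_ANNUL( CLOC, STATUS )
         END DO
      END IF

*  Annulling the top-level locator releases the container file.
      CALL DAT_ANNUL( LOC, STATUS )
      END
```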

The output also indicates that the main data array is of type _REAL² and is a 1-d array with 852 elements; the first few and last few values are shown, separated by an ellipsis. Similarly, TITLE is an object of type _CHAR and length 30, and has the value 'HR6259 - a Red Giant in w Cen'.
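The same information can be obtained from within a program without inspecting the file layout. The sketch below is one possible way to do so, using the NDF routines NDF_TYPE and NDF_DIM (SUN/33). The routine name SHOWSZ and the parameter name IN are invented for illustration, and the include files SAE_PAR and NDF_PAR are assumed to be available.

```fortran
      SUBROUTINE SHOWSZ( STATUS )
*  Hypothetical ADAM task: report the type and shape of the main
*  data array of an NDF.
      IMPLICIT NONE
      INCLUDE 'SAE_PAR'
      INCLUDE 'NDF_PAR'
      INTEGER STATUS
      INTEGER NDF
      INTEGER NDIM
      INTEGER DIM( NDF__MXDIM )
      INTEGER I
      CHARACTER * ( NDF__SZTYP ) TYPE

      IF ( STATUS .NE. SAI__OK ) RETURN

      CALL NDF_BEGIN
      CALL NDF_ASSOC( 'IN', 'READ', NDF, STATUS )

*  Numeric type of the main data array, e.g. '_REAL'.
      CALL NDF_TYPE( NDF, 'Data', TYPE, STATUS )

*  Number of dimensions and the size of each one.
      CALL NDF_DIM( NDF, NDF__MXDIM, DIM, NDIM, STATUS )

      IF ( STATUS .EQ. SAI__OK ) THEN
         CALL MSG_SETC( 'TYPE', TYPE )
         CALL MSG_OUT( ' ', 'Type: ^TYPE', STATUS )
         DO I = 1, NDIM
            CALL MSG_SETI( 'AXIS', I )
            CALL MSG_SETI( 'SIZE', DIM( I ) )
            CALL MSG_OUT( ' ', 'Dimension ^AXIS: ^SIZE', STATUS )
         END DO
      END IF

      CALL NDF_END( STATUS )
      END
```

Given the trace shown above, running such a task on SPECTRUM.SDF should report a type of _REAL and a single dimension of 852 elements.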

Note that the example above happens to show a primitive NDF, i.e. one in which DATA_ARRAY is a primitive object. SGP/38 describes other variants in which DATA_ARRAY is a structure, which allows the data to be stored in a variety of ways.

¹ Type ADAMSTART to set up the symbol TRACE.

² HDS data types _REAL and _CHAR correspond to Fortran types REAL and CHARACTER respectively (see Appendix C).