7 Bad-Pixel Methods

 7.1 Quality
 7.2 Magic or Undefined Value
 7.3 Implementation

Starlink standard data formats support two methods of handling bad data: magic value (which flags specified pixels as undefined) and data quality (a more general mechanism, which may be used to indicate any attribute of selected pixels, including “badness”). Magic value is simple and efficient. Data quality is flexible and preserves the original data.

7.1 Quality

To flag a data value as “bad”, an associated data-quality value can be used. The data-quality values form an array of 8-bit unsigned integers, one per element of the associated data array, whose bits describe attributes of the corresponding data value. (A single value applying to all elements of the data array is also possible, but will rarely be useful.) The recommended way to use data quality is to regard the 8 bits as eight independent logical masks, one mask per attribute.

As its name implies, data-quality is a qualitative description of the data value. It is frequently used to flag bad pixels, but is also useful for “good” attributes, e.g. which regions of a picture constitute the sky sample. It is not in any sense an error estimate (though groups of bits might be used to convey some numerical meaning); it finds application in circumstances where an error estimate is not meaningful. Here are some examples of how data quality might be used:

Sometimes a simple true/false mask is not enough. In such cases it is possible to use combinations of bits to indicate both the presence of the condition and to what subclass of that condition the pixel belongs. For example, a group of three data quality bits could be used not only to flag saturation but also to grade the degree of saturation, on a scale of 1–7.
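The three-bit saturation grade described above could be packed and unpacked as in the following minimal Python sketch. The bit layout (grade in bits 0 to 2) and the names SAT_SHIFT, SAT_MASK, set_saturation and get_saturation are assumptions made for illustration only; they are not defined by any Starlink package.

```python
# Hypothetical layout: bits 0-2 hold a saturation grade (0 = not saturated,
# 1-7 = increasing degree of saturation); bits 3-7 stay free for 1-bit flags.
SAT_SHIFT = 0
SAT_MASK = 0b111 << SAT_SHIFT


def set_saturation(quality: int, grade: int) -> int:
    """Return the 8-bit quality value with its saturation field set to grade."""
    if not 0 <= grade <= 7:
        raise ValueError("grade must be in the range 0-7")
    # Clear the old field, keep the other bits, then insert the new grade.
    return (quality & ~SAT_MASK & 0xFF) | (grade << SAT_SHIFT)


def get_saturation(quality: int) -> int:
    """Return the saturation grade; zero means the pixel is not saturated."""
    return (quality & SAT_MASK) >> SAT_SHIFT


# Keep an unrelated flag in bit 7 and grade the pixel as saturation level 5.
q = set_saturation(0b10000000, 5)
```

A non-zero field signals the presence of the condition and its value gives the subclass, so one group of bits serves both purposes.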

Clearly, not all values stored in the data system will have associated data-quality; that would be unnecessary and quite wasteful of resources. Normally, data-quality values are associated with basic observational or measured data.

7.2 Magic or Undefined Value

The alternative method for handling bad pixels is the so-called magic value method, where a pixel is assigned a special flag when it has an undefined value—it corresponds to a dead element in a CCD chip, for example, or is the result of division by zero. This terminology should not be confused with the HDS “undefined state”, where a data object exists, but has no value(s) assigned to it. In this document “undefined” means “having a magic value”, unless explicitly stated. An undefined pixel will always be bad, unless repaired in some fashion, and so the data-quality technique is not applicable.

The method is efficient on space: it can always be applied without increasing the data-storage requirement because the flag or magic value replaces the unwanted data value. (For applications where it is important to retain pixel values, or where there is a degree of badness, data quality should be used.) The method enables an application to discover whether a given pixel is bad as soon as it is accessed.

Alternative techniques, based on a list of bad pixels, would be less efficient, because the list would have to be searched repeatedly to see whether given pixels are bad. Such methods would be especially inefficient if large areas of pixels were undefined.

Once a bad pixel has been detected, the application can take appropriate action – flagging the corresponding output pixel as bad, or attempting a repair, perhaps via a choice of interpolation methods.
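As an illustration of the repair option, the sketch below fills bad pixels in a one-dimensional array by linear interpolation between the nearest good neighbours, and leaves a pixel bad when no such pair exists. It is only one of many possible repair schemes. The constant BAD is a hypothetical stand-in for the magic value of the data type, and repair_1d is an invented name.

```python
BAD = -1.0e38  # hypothetical stand-in for the magic value; the number is arbitrary


def repair_1d(data):
    """Return a copy of data with bad pixels linearly interpolated.

    A bad pixel lacking a good neighbour on either side cannot be
    repaired and is left bad in the output.
    """
    out = list(data)
    good = [i for i, v in enumerate(data) if v != BAD]
    for i, v in enumerate(data):
        if v != BAD:
            continue
        left = max((g for g in good if g < i), default=None)
        right = min((g for g in good if g > i), default=None)
        if left is None or right is None:
            continue  # no bracketing pair: propagate the bad pixel
        frac = (i - left) / (right - left)
        out[i] = data[left] + frac * (data[right] - data[left])
    return out
```

The input array is never modified, so the caller can still choose to propagate the bad pixels instead of using the repaired copy.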

The HDS undefined state must not be used to indicate bad pixels. If an application finds a data-object in this state, it must report an error, so that the malfunctioning application which created the object can be identified and corrected. The error is fatal.

7.3 Implementation

General-purpose applications, like those in the KAPPA package, should support both magic-value and quality arrays. It will usually be best to look only for the magic-value case in the scientific algorithm part of the code, having dealt with any data-quality information in a preliminary pass which converts flagged pixels into magic-value ones.
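The preliminary pass might look like the following Python sketch, which replaces every pixel flagged by the quality array with the magic value so that the scientific algorithm only has to test for one thing. It assumes the [BADBITS] mask described in Section 7.3.1; BAD is a hypothetical stand-in for the magic value and apply_quality is an invented name.

```python
BAD = -1.0e38  # hypothetical stand-in for the magic value of the data type


def apply_quality(data, quality, badbits):
    """Return a copy of data with quality-flagged pixels set to the magic value.

    A pixel is flagged when its quality value shares at least one set bit
    with badbits. Existing magic values in data pass through unchanged.
    """
    return [BAD if (q & badbits) != 0 else v for v, q in zip(data, quality)]
```

Because a copy is returned, the original data values are preserved alongside the quality array.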

The groups of true/false logical flags involved in the data quality mechanism are stored as integer values. We picture these integers as having conventional binary encoding, and adopt the convention that 1 represents true and 0 false. Specific VAX representations and conventions are not followed and are not in any way involved with the discussion.

7.3.1 Data Quality

Quality is an 8-bit value associated with each datum, and is stored as an unsigned byte. A value of zero (i.e. all quality flags set to 0, meaning false) implies an “ordinary” value which can be accepted at face value by application programs.

General-purpose applications

In general-purpose applications, the data-quality values are regarded as a set of 8 independent masks, each of which is 1 bit deep. Whether a given pixel is to be included in the processing or not (i.e. whether it is “bad”) is determined by comparing its quality value with a bit pattern stored in a <_UBYTE > data object [BADBITS] within the <QUALITY > structure. The following logical expression is evaluated:

   BAD = QUALITY ^ BADBITS

where ^ is the logical AND operation.

Note that if a bit in [BADBITS] is zero (i.e. false), the corresponding data-quality mask is ignored. Setting [BADBITS] to zero therefore turns off all 8 data-quality masks and allows inspection or processing of the pixels whatever their status. For a single bit, the above expression has the following truth table:

   QUALITY bit   BADBITS bit   BAD
        0             0        FALSE
        0             1        FALSE
        1             0        FALSE
        1             1        TRUE

and the overall logical value of BAD is the OR of the results for all eight bits—just one of which has to be TRUE to make the resulting pixel bad.

An example may clarify this. Assume [BADBITS] is 01001010 (where the bits of the binary number are written with the most significant at the left, and are numbered from the right beginning with zero). For this [BADBITS] value, a pixel with a [QUALITY] value of 10100100 is interpreted as non-bad, because bits 2, 5 and 7, which are set in the data-quality value, are not set in [BADBITS]. However, a [QUALITY] value of 10100110 generates a bad value because bit 1, which is set in the data-quality value, is also set in [BADBITS]. If the data object [BADBITS] is not present, its value is assumed to be 00000000, and general-purpose applications will accept as “good” any pixel, irrespective of the corresponding data-quality value.
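The worked example can be checked with a few lines of Python. This is a minimal sketch of the test itself, not of any Starlink routine; the function name is_bad is an assumption, and the default argument models the case where [BADBITS] is absent.

```python
def is_bad(quality: int, badbits: int = 0b00000000) -> bool:
    """Return True if any bit set in quality is also set in badbits.

    The bitwise AND performs the per-bit test; comparing the result with
    zero is the OR over all eight bits.
    """
    return (quality & badbits) != 0


BADBITS = 0b01001010  # bits 1, 3 and 6 select the masks that mean "bad"

first = is_bad(0b10100100, BADBITS)   # bits 2, 5, 7 set: none selected
second = is_bad(0b10100110, BADBITS)  # bit 1 set and selected
absent = is_bad(0b11111111)           # no [BADBITS]: every pixel is accepted
```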

The rules and conventions for the processing of data-quality values and their associated data, taking into account the possible presence of undefined values, are as follows.

Rules
Conventions

If a [QUALITY] array is present it is assumed that it is to be used to define bad pixels unless:

If [QUALITY] is not present the magic-value method is assumed.

There is no one ideal way of handling data quality in general-purpose routines. Methods will evolve as experience with real applications and data is gained. The main considerations are:

Specialist applications

Applications can be as sophisticated and specialised as they like in their use of data quality, and are at liberty to assign specific meanings to values of data quality, e.g. a fiducial mark, vignetting, saturation. The details of how data-quality information is encoded within the 8 bits are specific to each kind of data source and specialist package. A description of how quality will be interpreted must be given in the documentation for each package that uses the technique. However, it is possible to identify some general features of data-quality processing.

Each data-quality value can be regarded as a set of bit groups, each containing one or more bits. The recommended approach is to use single bits, each with an independent meaning, to form eight 1-bit deep logical masks. However, it is also permissible to take several bits (which ought to be contiguous) and interpret them as a positive integer. Single bit fields are used to contain a flag (1 = .TRUE., 0 = .FALSE.) for some feature (e.g. “pixel in fiducial”). Multiple-bit fields are used to contain code numbers or degree of quality.

It is envisaged that most manipulation of data-quality values will be done quite transparently by those applications which know how to use them to advantage, without the user being aware of the mechanism. However, it is expected that there will be some cases where users will want to manipulate data quality explicitly, and there will be various data-quality editing applications, often using graphics or image displays. For example, there will be instances where the user wishes to view a picture on a display and select which pixels are to be temporarily flagged as “wrong”, rather than trust some automatic algorithm.

Since the data quality codes are stored separately from the actual data, data-quality editing will normally be a reversible process, leaving the data values themselves untouched.
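The reversibility can be seen in a small Python sketch in which a user flags a pixel and later withdraws the flag. The choice of bit 0 for the user's mark and the names set_flag and clear_flag are assumptions made for illustration.

```python
def set_flag(quality, pixels, bit):
    """Return a copy of the quality array with the given bit set for pixels."""
    out = list(quality)
    for i in pixels:
        out[i] |= 1 << bit
    return out


def clear_flag(quality, pixels, bit):
    """Return a copy of the quality array with the given bit cleared for pixels."""
    out = list(quality)
    for i in pixels:
        out[i] &= ~(1 << bit) & 0xFF  # stay within 8 bits
    return out


data = [10.0, 99.0, 12.0]
quality = [0b00000000, 0b00000000, 0b00000100]

edited = set_flag(quality, [1], bit=0)     # user marks pixel 1 as "wrong"
restored = clear_flag(edited, [1], bit=0)  # and later withdraws the mark
```

Only the quality array changes in either direction; the data array is never touched, and the unrelated flag in bit 2 of the third pixel survives both edits.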

(n.b. The implementation of data quality is largely unchanged from the Wright-Giddings proposal.)

7.3.2 Magic Values

For each of HDS’s primitive TYPEs, the magic-value method uses the values given in Table 10. Alert readers will note that these are the same as the bad values used by HDS.


Table 10: Magic values for bad pixels

Data TYPE      Value                      Hexadecimal pattern

<_BYTE >       -128                       80
<_UBYTE >      255                        FF
<_WORD >       -32768                     8000
<_UWORD >      65535                      FFFF
<_INTEGER >    -2147483648                80000000
<_REAL >       -1.7014117E+38             FFFFFFFF
<_DOUBLE >     -1.701411834604923D+38     FFFFFFFFFFFFFFFF

Use of “undefined data” flags must be restricted to three operations: (i) setting a datum to “undefined”, (ii) testing whether a datum is in the undefined state, and (iii) replacing an undefined datum with a valid value (using an assignment statement). Arithmetic operations on undefined data values are banned. Magic values are applicable to both scalar and vector data objects. There are some exceptions and these are individually noted.
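The restriction means that every arithmetic operation must be guarded by a test. The Python sketch below adds two arrays pixel by pixel, using only the permitted operations: it tests each operand, and sets the output pixel to undefined if either operand is undefined. BAD is a hypothetical stand-in for the magic value of the data type, and add_arrays is an invented name.

```python
BAD = -1.0e38  # hypothetical stand-in for the magic value; the number is arbitrary


def add_arrays(a, b):
    """Add two arrays pixel by pixel, propagating undefined pixels.

    The magic value is only ever tested for or assigned; it never takes
    part in the addition itself.
    """
    return [BAD if (x == BAD or y == BAD) else x + y for x, y in zip(a, b)]
```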

For efficiency, pixel values are tested inline for equality with the magic value of the appropriate type. However, the numerical values given above must not be written explicitly in the code; instead, variables called VAL__BAD<T>, where <T> is the one or two-letter type code (e.g. see SUN/7), should be used. These variables are specified via an INCLUDE file with logical name BAD_PAR. Here is a trivial example, which computes the mean of a one-dimensional REAL array. (n.b. Actual applications would include comments and defences against rounding errors, excluded for brevity here.)

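        INTEGER I, N, NPIX
        PARAMETER (NPIX = 100)
  
        REAL DATA(NPIX), SUM, MEAN
  
*  Define the VAL__BAD<T> magic-value constants.
        INCLUDE 'BAD_PAR'
  
*  Sum and count only the valid pixels.
        SUM = 0.0
        N = 0
        DO I = 1, NPIX
           IF ( DATA( I ) .NE. VAL__BADR ) THEN
              SUM = SUM + DATA( I )
              N = N + 1
           END IF
        END DO
  
*  With no valid pixels the mean is itself undefined.
        IF ( N .EQ. 0 ) THEN
           MEAN = VAL__BADR
        ELSE
           MEAN = SUM / REAL( N )
        END IF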

Note that only valid pixels are counted and summed.

For reasons of efficiency of processor time and work space, and to permit easier portability and adaptation of general-purpose subroutine libraries, a flag called [BAD_PIXEL] may be provided within a structure to denote whether undefined pixels are present. Only if it is present and set to .FALSE. will it be permissible to bypass magic-value testing. Thus, many packages will support two sets of algorithmic subroutines; one which tests magic values, and one which does not.
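The two-routine arrangement could be organised as in the following Python sketch. The unchecked routine is selected only when the flag is present and false. Representing an absent [BAD_PIXEL] flag by None is an assumption of this sketch, as are the function names and the stand-in constant BAD.

```python
BAD = -1.0e38  # hypothetical stand-in for VAL__BADR; the number is arbitrary


def mean_checked(data):
    """Mean of the valid pixels; returns BAD if there are none."""
    good = [v for v in data if v != BAD]
    return sum(good) / len(good) if good else BAD


def mean_unchecked(data):
    """Mean with no magic-value tests; assumes non-empty data with no bad pixels."""
    return sum(data) / len(data)


def mean(data, bad_pixel=None):
    """Choose the routine from the flag; None models an absent [BAD_PIXEL]."""
    if bad_pixel is False:
        return mean_unchecked(data)
    # Flag absent or true: bad pixels may be present, so test every value.
    return mean_checked(data)
```

Treating an absent flag the same as a true one is the safe default, at the cost of some unnecessary tests.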