5 More on Calling C from FORTRAN

 5.1 General Description
 5.2 Declaration of a Function
 5.3 Declaration of Arguments
 5.4 Arguments – and Pointers to Them
 5.5 Type Specifiers
 5.6 Logical Values
 5.7 External Names
 5.8 Common Blocks

As the examples in the appendix on machine specific details show, different computers handle subroutine interfaces in different ways. This apparently makes it difficult to write portable programs that are a mixture of FORTRAN and C. However, it is only the C code that differs and fortunately the differences can be hidden by suitable C macros so that the same code can be compiled on all types of hardware mentioned in this document. The macros have been constructed in such a way that they can accommodate other subroutine passing mechanisms; however, it is not possible to guess all the types of mechanisms that we might come across.

The macros can be used in a C function by including the file f77.h. This file will naturally be stored in different places on different types of system, even if it is only the syntax of the file name that is different. It would be a pity if all of the implementation specific details were hidden away in these macros, only for each C source file to need an implementation specific #include statement. Fortunately there is a way around this problem that is described in Section 13 on compiling and linking.

Let us now consider an example that illustrates the use of the F77 macros. The following example generates a banner which consists of some hyphens, followed by some stars and finally the same number of hyphens again. There are also some blanks at the beginning of the line and between the hyphens and stars. The work is done in the subroutine BANNER and the form of the output is governed by the three arguments FIRST, MIDDLE and GAP. For example, CALL BANNER( LINE, 5, 10, 3 ) would return with LINE set to the following character string.

     -----   **********   -----

Example 3 – Passing arguments between FORTRAN and C.
FORTRAN program:
        PROGRAM F1
        INTEGER FIRST, MIDDLE, GAP
        CHARACTER*(80) LINE
  
        FIRST = 5
        MIDDLE = 10
        GAP = 3
        CALL BANNER( LINE, FIRST, MIDDLE, GAP )
        PRINT *, LINE
  
        END

C function:
  #include "f77.h"
  
  F77_SUBROUTINE(banner)( CHARACTER(line), INTEGER(first), INTEGER(middle),
                         INTEGER(gap) TRAIL(line) )  {
    GENPTR_CHARACTER(line)
    GENPTR_INTEGER(first)
    GENPTR_INTEGER(middle)
    GENPTR_INTEGER(gap)
    int i, j;       /* Loop counters.  */
    char *cp;       /* Pointer to a character.  */
  
  /* Make cp point to the beginning of the string line.  */
    cp = line;
  
  /* First blanks.  */
    for( i = 0, j = 0 ; (j < line_length) && (i < *gap) ; i++, j++ )
       *cp++ = ' ';
  
  /* First hyphens.  */
    for( i = 0 ; (j < line_length) && (i < *first) ; i++, j++ )
       *cp++ = '-';
  
  /* More blanks.  */
    for( i = 0 ; (j < line_length) && (i < *gap) ; i++, j++ )
       *cp++ = ' ';
  
  /* Middle stars.  */
    for( i = 0 ; (j < line_length) && (i < *middle) ; i++, j++ )
       *cp++ = '*';
  
  /* More blanks.  */
    for( i = 0 ; (j < line_length) && (i < *gap) ; i++, j++ )
       *cp++ = ' ';
  
  /* Last hyphens.  */
    for( i = 0 ; (j < line_length) && (i < *first) ; i++, j++ )
       *cp++ = '-';
  }

The FORTRAN part of this example is completely standard; it is the C code that needs further explanation. Firstly there is the declaration of the subroutine with the macro F77_SUBROUTINE. This handles the fact that some computer systems require a trailing underscore to be added to the name of the C function if it is to be called from FORTRAN. In the same statement there are the function’s dummy arguments, declared using the macros CHARACTER, INTEGER and TRAIL. The INTEGER macro declares the appropriate dummy argument to handle an incoming argument passed from FORTRAN. This will usually be declared to be “pointer to int”. The CHARACTER and TRAIL macros come in pairs. The CHARACTER macro declares the appropriate argument to handle the incoming character variable and TRAIL may declare an extra argument to handle those cases where an extra hidden argument is added to specify the length of a character argument. On some machines, TRAIL will be null and on account of this there should not be a comma before any TRAIL macros. When TRAIL is not null, then it will add the comma itself. If there are several TRAIL macros then there must not be a comma directly in front of any of them.

The next set of macros are the GENPTR_type macros, one for each argument of the FORTRAN subroutine (TRAIL arguments are not counted as separate arguments for this purpose). These handle the ways that subprogram arguments may be passed on different machines. They ensure that a pointer to the argument exists. On most systems, this is exactly what is passed from the FORTRAN program and so the macros for numeric arguments are null. If a particular system passed the value of an argument, rather than its address, then these macros would generate the appropriate pointers.

The CHARACTER, TRAIL and GENPTR_CHARACTER macros have to cope with the different ways that systems deal with passing character variables. Although the way that these macros are implemented can be a bit complex, what the programmer sees is essentially simple. For each character argument, the macros generate a pointer to a character variable and an integer holding the length of that character variable. The above example will create the variable line of type char * and variable line_length of type int. If these are available directly as function arguments, then the macro GENPTR_CHARACTER will be null, otherwise it will generate these two variables from the arguments. The best way of seeing what is going on is to compile a function with macro expansion turned on and list the output.
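As a rough illustration of what such an expansion might look like, the sketch below uses simplified stand-in definitions of the macros for one common Unix-style convention: a trailing underscore on the routine name, arguments passed by address, and the character length appended as a hidden int passed by value. These definitions are assumptions made for this example and are not the contents of the real f77.h, which should always be used in practice. The routine name STARS is hypothetical.

```c
/* Simplified stand-ins for the f77.h macros: ASSUMED definitions for one
   Unix-style convention, not the real header.  */
#define F77_SUBROUTINE(name)   void name##_
#define CHARACTER(arg)         char *arg
#define INTEGER(arg)           int *arg
#define TRAIL(arg)             , int arg##_length
#define GENPTR_CHARACTER(arg)
#define GENPTR_INTEGER(arg)

/* Hypothetical routine: CALL STARS( LINE, N ) writes up to N stars.
   Note that there is no comma in front of TRAIL.  */
F77_SUBROUTINE(stars)( CHARACTER(line), INTEGER(n) TRAIL(line) )
{
  GENPTR_CHARACTER(line)
  GENPTR_INTEGER(n)
  int i;

  /* With the stand-ins above the declaration has expanded to:
       void stars_( char *line, int *n, int line_length )        */
  for( i = 0 ; (i < line_length) && (i < *n) ; i++ )
     line[i] = '*';
}
```

With these stand-ins the GENPTR macros expand to nothing, which matches the description above of systems where the pointer and the length arrive directly as function arguments.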

There is an important difference between this example and the one in the cookbook. In this case, an int variable containing the length of the character argument is generated automatically whereas in the example in the cookbook the length was passed explicitly. In fact, the int variable was also generated in the example in the cookbook, but it was not used. It is more portable to explicitly pass the length of CHARACTER variables and to ignore the automatically generated length as this will cope with the situation where the length cannot be generated automatically. No such machines are known to the author at present, but Murphy’s Law would indicate that the next machine that we desperately need to use will have this problem.

Although the use of these macros may seem a bit strange at first, once any pointers have been generated, the rest of the code is standard C.

Something that has not yet been considered is whether to write the code in upper or lower case. All of the examples in this document have the FORTRAN code in upper case and the C code in lower case, thereby following common practice. Normally it makes no difference whether code is written in upper case or lower case. Where it does matter is in declaring external symbols. External symbols are names of routines and names of common blocks (FORTRAN) or variables declared extern (C). The linker must be able to recognise that the external symbols in the FORTRAN routines are the same external symbols in the C functions. On a VMS system, the VAX C compiler will fold all external symbols to upper case by default, although there is a compiler option to fold them all to lower case or leave them as written in the source code. The VAX FORTRAN compiler will generate all external symbols in upper case. On Unix systems, the FORTRAN compiler will typically fold external names to lower case (and add a trailing underscore), whereas the C compiler will leave the case unchanged. Consequently, all external symbols in C functions that might be referenced from FORTRAN should be coded in lower case.

5.1 General Description

Having considered an example of using the macros to write a C function that is to be called from FORTRAN, let us look at all of the macros in more detail. You will notice that some of the macros are prefixed by F77 while others are not. Those that do not have the F77 prefix are those that occur in standard places in the source code and so are unlikely to be confused with other macros. The macros that do have the F77 prefix are those that declare a C function, together with others that are less commonly used and which, when they are used, can occur anywhere within the body of the C routine. A full description of each macro is available in appendix E.

The whole ethos of the F77 macros is to try to isolate the FORTRAN/C interface to the beginning of the C function. Within the body of the C function, the programmer should not need to be aware of the fact that this function is designed to be called from FORTRAN. It is not possible to achieve this completely and at the same time retain portability of code, but the intention is there none the less.

5.2 Declaration of a Function

There are two types of macros involved in declaring a C function that is to be called from FORTRAN: those for the function name and those for the function arguments. If the C function is to be treated as a FORTRAN subroutine, then it should be declared with the macro F77_SUBROUTINE. This will declare the C function to be of type void and will generate the correct form of the routine name, handling such things as appending a trailing underscore where required.

If the C function is to be treated as a FORTRAN numerical or logical function, then it should be declared with one of the macros F77_type_FUNCTION. These macros will declare the function to be of the appropriate type, e.g. a function declared with F77_INTEGER_FUNCTION is likely to be of type int.
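A minimal sketch of such a function follows. The macro definitions shown are assumed stand-ins for a system that appends a trailing underscore and maps a FORTRAN INTEGER onto a C int; they are not taken from the real f77.h. The function name ISUM is hypothetical.

```c
/* ASSUMED stand-ins for the f77.h macros (trailing-underscore convention,
   FORTRAN INTEGER taken to be C int); not the real header.  */
#define F77_INTEGER_FUNCTION(name)  int name##_
#define INTEGER(arg)                int *arg
#define GENPTR_INTEGER(arg)

/* Hypothetical function, used from FORTRAN as  K = ISUM( I, J )  */
F77_INTEGER_FUNCTION(isum)( INTEGER(i), INTEGER(j) )
{
  GENPTR_INTEGER(i)
  GENPTR_INTEGER(j)

  /* The arguments are pointers, so they are dereferenced before use.  */
  return *i + *j;
}
```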

The declaration of a C function that is to be treated as a FORTRAN character function is more complex than one that returns a scalar numeric or logical value. The first argument of the function should be CHARACTER_RETURN_VALUE(return_value), where return_value is a variable of type “pointer to char”. Although character functions work perfectly well on all current Starlink hardware, it is one of the more difficult things to guess how other manufacturers might implement them. Consequently, it is recommended that character functions be avoided where possible and that a subroutine that returns a character argument be used instead.

5.3 Declaration of Arguments

Scalar arguments are declared with the macros INTEGER, REAL, DOUBLE, LOGICAL, CHARACTER and TRAIL. (Or the non-standard BYTE, WORD, UBYTE, UWORD or POINTER.) The macros that declare numeric and logical arguments take account of the fact that a FORTRAN integer variable may correspond to a C type of int on one machine, but to long int on another. They also handle the mechanism that is used to pass the arguments.

Character arguments are more complex as different computers use differing mechanisms for passing the arguments. To take account of this, for every argument that is declared using the CHARACTER (or CHARACTER_ARRAY) macro, there should be a corresponding TRAIL macro at the end of the list of dummy arguments. As mentioned in a preceding example, there should not be a comma before any TRAIL macros.

C differs from FORTRAN in that it has pointer variables. These are often used to manipulate arrays, rather than by using array subscripts. The macros that are used to declare array arguments do in fact declare them to be arrays. If programmers wish to manipulate these arrays by means of pointer arithmetic, then for maximum portability they should declare separate pointers within the C function that point to the array argument.

Array arguments are declared by one of the macros type_ARRAY. The macros that declare numeric or logical array arguments declare the arrays to be pointers to type. To enable the C function to process the array correctly, the dimensions of the array should be passed as additional arguments.

The F77 macros do not allow you to declare fixed sized dimensions for an array that is a dummy argument. Normally, it is necessary to pass the dimensions as arguments of the routine anyway, but there are circumstances where the dimensions of the array will be fixed, e.g. an array might specify a rotation in space and hence is always 3 x 3. What is gained by declaring the fixed dimensions of the array is that subscript calculations can be done on arrays of more than one dimension. Unfortunately, such declarations cannot be made portable as some FORTRAN systems pass arrays by descriptor. If you really must declare arrays with fixed dimensions, you can do so as follows:

  F77_SUBROUTINE(subname)( F77_INTEGER_TYPE array[3][3] )
  {
    ...
    elem = array[i][j];
    ...
  }

This example declares the dummy argument to be an INTEGER array of fixed size. Although the subscript calculation can be performed as the routine knows the size of the array, the sizeof operator does not return the full size of the array, because the compiler treats a parameter declared as array[3][3] as the pointer (*array)[3]. Note also that C varies the last subscript fastest whereas FORTRAN varies the first, so array[i][j] in C refers to the FORTRAN element ARRAY(J+1,I+1). All things considered, it is better to have the dimensions of arrays passed as separate arguments and to do the subscript arithmetic yourself with pointers. Here is an example of initializing an array of arbitrary size and arbitrary number of dimensions.

Example 4 – Passing an array of arbitrary size from FORTRAN to C.
FORTRAN program:
        PROGRAM ARY
  
        INTEGER NDIMS, DIM1, DIM2, DIM3
        PARAMETER( NDIMS = 3, DIM1 = 5, DIM2 = 10, DIM3 = 2 )
  
        INTEGER DIMS( NDIMS )
        INTEGER A( DIM1, DIM2, DIM3 )
  
        DIMS( 1 ) = DIM1
        DIMS( 2 ) = DIM2
        DIMS( 3 ) = DIM3
        CALL INIT( A, NDIMS, DIMS )
  
        END

C function:
  #include "f77.h"
  
  F77_SUBROUTINE(init)( INTEGER_ARRAY(a), INTEGER(ndims), INTEGER_ARRAY(dims) )
  {
    GENPTR_INTEGER_ARRAY(a)
    GENPTR_INTEGER(ndims)
    GENPTR_INTEGER_ARRAY(dims)
  
    int *ptr = &a[0];  /* ptr now points to the first element of a.  */
    int size = 1;      /* Declare and initialize size.  */
    int i;             /* A loop counter.  */
  
    /* Find the number of elements in a.  */
  
    for( i = 0; i < *ndims ; i++ )
       size = size * dims[i];
  
    /* Set each element of a to zero.  */
  
    for( i = 0 ; i < size ; i++ )
       *ptr++ = 0;
  }

In this example, each element of the array a is accessed via the pointer ptr, which is incremented each time around the last loop.
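When individual elements are needed rather than a sweep over the whole array, the subscript arithmetic has to follow FORTRAN's storage order, in which the first subscript varies fastest. The helper below is a hypothetical sketch in plain C: it uses none of the F77 macros, and it takes int where portable code would use F77_INTEGER_TYPE. It converts zero-based subscripts into an offset from the start of the array.

```c
/* Hypothetical helper: offset of element (sub[0], sub[1], ...) in an array
   stored in FORTRAN order.  Subscripts here are zero-based, so FORTRAN's
   A(I,J,K) corresponds to sub = { I-1, J-1, K-1 }.  */
int fortran_offset( int ndims, const int *dims, const int *sub )
{
  int offset = 0;
  int stride = 1;     /* Elements between successive values of sub[i].  */
  int i;

  for( i = 0 ; i < ndims ; i++ )
  {
     offset += sub[i] * stride;
     stride *= dims[i];
  }
  return offset;
}
```

Inside a function such as init above, a[ fortran_offset( *ndims, dims, sub ) ] would then address a single element, given a suitable local array sub.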

5.4 Arguments – and Pointers to Them

When a FORTRAN program calls a subprogram, it is possible for the value of any of its arguments to be altered by that subprogram. In the case of C, a function cannot return modified values of arguments to the calling routine if what is passed is the value of the argument. If a C function is to modify one of its arguments, then what is passed must be a pointer to the value to be modified rather than the actual value. Consequently in C functions that are designed to be called from FORTRAN, all function arguments should be treated as though the address of the actual argument had been passed, not its value. This means that the arguments should be referenced as *arg from within the C function and not directly as arg. This may seem odd to a FORTRAN programmer, but is natural to a C programmer.

To ensure that there always exists a pointer to each dummy argument, the first lines of code in the body of any C function that is to be called from FORTRAN should be GENPTR macros for each of the function arguments. The macros GENPTR_type always result in there being a C variable of type “pointer to type” for all non-character variables. For example, GENPTR_INTEGER(first) ensures that there will be a variable declared as int *first. On all current types of system, this macro will actually be null since the pointer is available directly as an argument. However, the macro should be present to guard against future computers working in a different way. For example, if a particular system passed FORTRAN variables by value rather than by reference, then this macro would construct the appropriate pointer.

Character arguments are different in that the GENPTR macro ensures that there are two variables available, one of type “pointer to char” that points to the actual character data, and one of type int that is the length of the character variable. The name of the variable that holds the length of the character string is constructed by appending “_length” to the name of the character variable. For example, if a function is declared to have a dummy argument with the macro CHARACTER(ch) and a corresponding TRAIL(ch), then after the execution of whatever the macro GENPTR_CHARACTER(ch) expands into, there will be a “pointer to character” variable called ch and an integer variable called ch_length. Although the length of a character variable is directly accessible through the int variable ch_length, it is better to pass the length of the character variable explicitly if maximum portability is sought. This is because, although it works on all currently supported platforms, it may not be possible to gain access to the length on some machines.
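A FORTRAN character variable has a fixed length and no terminating null, so a C function that fills one normally pads the unused part with blanks. The sketch below assumes that the pointer and the length are already available as ch and ch_length, as described above. The helper name fill_fortran_string is hypothetical and is not part of f77.h.

```c
#include <string.h>

/* Hypothetical helper: copy the C string src into the FORTRAN character
   variable at ch, truncating or padding with blanks to ch_length.  No
   terminating null is written.  ch_length is assumed to be >= 0.  */
void fill_fortran_string( const char *src, char *ch, int ch_length )
{
  int n = (int) strlen( src );

  if( n > ch_length ) n = ch_length;
  memcpy( ch, src, (size_t) n );
  memset( ch + n, ' ', (size_t) ( ch_length - n ) );
}
```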

It is important to remember that what is available after the execution of what a GENPTR macro expands into will be a pointer to the dummy argument, not a variable of numeric or character type. Consequently the body of the code should refer to it as *arg and not as arg. In a long C function, it may be worth copying scalar arguments into local variables to avoid having to remember to put the * on each reference to an argument. If the variable is changed in the function, then it should of course be copied back into the argument at the end of the function. Alternatively you could define C macros to refer to the pointers, such as

  #define STATUS *status

Note that although ANSI C will allow the above as status and STATUS are distinct names, you should beware of the possibility of a computer that does not have lower case characters. Such machines used to exist in abundance, but at present, this does seem a remote possibility.
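The first approach described above, copying scalar arguments into local variables and copying them back at the end, might look like the following sketch. The macro definitions are assumed stand-ins for a trailing-underscore, pass-by-address system and are not the real f77.h; the routine HALVE and its error convention are hypothetical.

```c
/* ASSUMED stand-ins for the f77.h macros; not the real header.  */
#define F77_SUBROUTINE(name)  void name##_
#define INTEGER(arg)          int *arg
#define GENPTR_INTEGER(arg)

/* Hypothetical routine: CALL HALVE( VALUE, STATUS ) halves an even VALUE
   and sets STATUS to 1 if VALUE is odd.  */
F77_SUBROUTINE(halve)( INTEGER(value), INTEGER(status) )
{
  GENPTR_INTEGER(value)
  GENPTR_INTEGER(status)
  int lvalue = *value;      /* Local copies of the arguments.  */
  int lstatus = *status;

  if( lstatus == 0 )
  {
     if( lvalue % 2 == 0 )
        lvalue = lvalue / 2;
     else
        lstatus = 1;        /* Odd input: report an error.  */
  }

  *value = lvalue;          /* Copy the results back to the caller.  */
  *status = lstatus;
}
```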

Array arguments should have pointers generated (if necessary) by using the GENPTR_type_ARRAY macros. All arrays are handled by these macros.

5.5 Type Specifiers

There are macros F77_type_TYPE which expand to the C data type that corresponds to the FORTRAN data type of the macro name, e.g. on a particular computer F77_INTEGER_TYPE may expand to int. These are usually not needed explicitly within user written code, but can be required when declaring common blocks, casting values from a variable of one type to one of a different type and when using the sizeof operator.

5.6 Logical Values

The macros F77_FALSE and F77_TRUE expand to the numerical values that FORTRAN treats as false and true (e.g. 0 and 1). They should be used when setting logical values to be returned to the calling FORTRAN routine. There are also macros F77_ISFALSE and F77_ISTRUE that should be used when testing a function argument for truth or falsehood.
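The following sketch shows both kinds of macro in use. The definitions are assumed stand-ins that use the values 0 and 1 quoted above; the real f77.h may use other values and other tests on some systems, which is the reason for going through the macros at all. The function INRANGE is hypothetical.

```c
/* ASSUMED stand-ins, using the example values 0 and 1; not the real
   header.  */
#define F77_LOGICAL_FUNCTION(name)  int name##_
#define INTEGER(arg)                int *arg
#define LOGICAL(arg)                int *arg
#define GENPTR_INTEGER(arg)
#define GENPTR_LOGICAL(arg)
#define F77_FALSE                   0
#define F77_TRUE                    1
#define F77_ISTRUE(var)             ( var )
#define F77_ISFALSE(var)            ( !( var ) )

/* Hypothetical function: INRANGE( N, STRICT ) is true if 0 <= N <= 10 or,
   when STRICT is true, if 0 < N < 10.  */
F77_LOGICAL_FUNCTION(inrange)( INTEGER(n), LOGICAL(strict) )
{
  GENPTR_INTEGER(n)
  GENPTR_LOGICAL(strict)
  int ok;

  /* Test the incoming logical with the macro, not with a bare "if".  */
  if( F77_ISTRUE( *strict ) )
     ok = ( *n > 0 ) && ( *n < 10 );
  else
     ok = ( *n >= 0 ) && ( *n <= 10 );

  /* Return the FORTRAN representation of true or false.  */
  return ok ? F77_TRUE : F77_FALSE;
}
```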

5.7 External Names

The macro F77_EXTERNAL_NAME handles the difference between the actual external name of a function called from FORTRAN and a function that apparently has the same name when called from C. Typically this involves appending an underscore character to a name. This macro is not normally needed directly by the programmer, but is called by other macros.

5.8 Common Blocks

There are two macros that deal with common blocks, F77_NAMED_COMMON and F77_BLANK_COMMON. They are used when declaring external structures that correspond to FORTRAN common blocks and when referring to components of those structures in the C code. The following declares a common block named “block” that contains three INTEGER variables and three REAL variables.

  extern struct
  {
   F77_INTEGER_TYPE i,j,k;
   F77_REAL_TYPE a,b,c;
  } F77_NAMED_COMMON(block);

The corresponding FORTRAN statements are

        INTEGER I,J,K
        REAL A,B,C
        COMMON /BLOCK/ I,J,K,A,B,C

Within the C function the variables would be referred to as:

F77_NAMED_COMMON(block).i, F77_NAMED_COMMON(block).j, etc.
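A short sketch of such references inside a C function is given below. The macro and type definitions are assumed stand-ins (trailing underscore on the block name, INTEGER as int, REAL as float) and are not the real f77.h. The helper bump_common is hypothetical, and the structure is defined here without extern only so that the sketch links without a FORTRAN program.

```c
/* ASSUMED stand-ins for the f77.h macros; not the real header.  */
#define F77_NAMED_COMMON(name)  name##_
#define F77_INTEGER_TYPE        int
#define F77_REAL_TYPE           float

/* In a real program the storage belongs to the FORTRAN COMMON /BLOCK/ and
   this declaration would begin with "extern".  */
struct
{
 F77_INTEGER_TYPE i,j,k;
 F77_REAL_TYPE a,b,c;
} F77_NAMED_COMMON(block);

/* Hypothetical helper that reads and updates members of the common block:
   increments I, sets A to twice B and returns I + J + K.  */
int bump_common( void )
{
  F77_NAMED_COMMON(block).i += 1;
  F77_NAMED_COMMON(block).a = 2.0f * F77_NAMED_COMMON(block).b;
  return F77_NAMED_COMMON(block).i + F77_NAMED_COMMON(block).j
         + F77_NAMED_COMMON(block).k;
}
```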

Note that all that these macros do is to hide the actual name of the external structure from the programmer. If a computer implemented the correspondence between FORTRAN common blocks and C global data in a completely different way, then these macros would not provide portability to such an environment.

On account of this, it is best to avoid using common blocks where possible, but of course, if you need to interface to existing FORTRAN programs, this may not be practical.