Casting

When people first start coding in C they are often confused by casting and when they need to use it. Casting is the act of changing a value of one type into a similar value of another type. The resultant value may actually be different to the starting one, either because its size has changed (the number of bytes of memory used to store the value) or because it has been written into a type whose bits mean different things to the original (see Signed Values later on).

Casting by the Compiler

Casting can happen automatically, being used by the compiler transparently to the coder. A good example of this is in a piece of math code:

    int iVal;
    long lVal;
    double dVal;

    iVal = 2;
    lVal = 4;
    dVal = iVal * lVal;

Here, because an int value is being multiplied by a long value and the result is to be stored in a double value, the compiler has to do some casting of its own. As the compiler sees that the * operator has operands of different types (and potentially different memory sizes too) it has to make them the same. It does this by taking the lower type (usually the smaller in size) and casting it into the higher one. So in this example the int is cast into a long before the * is performed. The result, now a long, is then cast into a double before being stored into dVal. That's 2 casts already and we have hardly done anything. This of course is a fairly extreme example deliberately using different operand types.

There are some fairly simple rules that govern how the compiler will apply this automatic casting in math. As long as no unsigned types are involved, they are applied in this order:

  • If either operand is a long double, convert the other one to a long double
  • Else, if either operand is a double, convert the other one to a double
  • Else, if either operand is a float, convert the other one to a float
  • Otherwise, convert char and short to int
  • Then, if either operand is long, convert the other one to long
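The rules above can be watched in action with a small sketch. The function names multiply_mixed and char_plus_short_is_int are made up for this illustration and are not part of any library.

```c
/* Hypothetical helper: multiplies an int by a long and hands the
   result back as a double, mirroring the example above. */
double multiply_mixed(int iVal, long lVal)
{
    /* iVal is converted to long, the multiplication is done in long,
       and the long result is converted to double on return. */
    return iVal * lVal;
}

/* Hypothetical helper: reports whether char + short is worked out as int. */
int char_plus_short_is_int(void)
{
    char  c = 1;
    short s = 2;

    /* sizeof does not evaluate c + s, it only looks at the type of the
       expression, which is int once both operands have been promoted. */
    return sizeof(c + s) == sizeof(int);
}
```

Calling multiply_mixed(2, 4) gives 8.0, and char_plus_short_is_int() returns 1 because neither operand stays a char or a short once the addition is performed.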

Function Arguments

Automatic casting is also used when passing arguments to functions and when assigning function return values. When a prototype is in scope, each argument is converted to the declared type of its parameter, just as if it had been assigned to a variable of that type. If no prototype is found for a function specifying the argument types, then char and short values are cast to int and float goes to double. Needless to say it is best practice to use function prototypes when coding in ANSI C (old style K&R C does not have function prototyping) to avoid strange values being passed to your functions.
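One place where the "no prototype" promotions can still be seen in ANSI C is the variable part of a variadic function, since those arguments have no declared types either. The following is a minimal sketch under that assumption; sum_values is a hypothetical name.

```c
#include <stdarg.h>

/* Hypothetical variadic helper: adds up `count` floating-point arguments.
   The "..." arguments have no declared types, so any float passed here
   is promoted to double before the call is made. */
double sum_values(int count, ...)
{
    va_list args;
    double  total = 0.0;
    int     i;

    va_start(args, count);
    for (i = 0; i < count; i++)
        {
        /* Must be read as double: a float never arrives here as a float. */
        total += va_arg(args, double);
        }
    va_end(args);

    return total;
}
```

A call such as sum_values(2, 1.5f, 2.25f) passes two float constants, yet the function has to fetch them as double values to get the right answer of 3.75.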

Signed Values

A quick word about signed values and types before we go any further. A signed type like int or long can hold a value that is either negative, positive or zero. An unsigned type like unsigned long or unsigned int can only hold positive or zero values. Often, signed and unsigned types of the same basic type (unsigned int and int for example) take up the same number of bytes of memory. The difference is in how the underlying hardware treats the individual bit values in the bytes that make up the value. Usually there is what is called a sign-bit in a signed value that determines whether the value is positive or negative.

Consider what happens then to a signed value whose sign-bit is set (indicating a negative value) that is cast to an unsigned type of the same basic type. The values of the bits do not change in this instance, only their meaning. For example:

    int sVal;
    unsigned int uVal;

    sVal = -1;
    uVal = sVal;

Here the compiler casts sVal to an unsigned int following the general rules above. Assuming the types are the same number of bytes in size, uVal will be a very large positive number. Why?

The answer lies in how negative numbers are represented and good old binary math. To hold a value of -1 and be able to perform binary math on it (don't forget computers only use binary) all the bits in the number are set, including the sign-bit. This is the two's complement scheme used by practically every machine you are likely to meet. If the number of bytes required to store an int on our imaginary machine is 4 and the sign-bit is the left most one, then -1 would be represented by a binary number consisting of 32 1s (0xffffffff in hexadecimal). Casting this to an unsigned int leaves the number and values of the bits the same, but now there is no sign-bit. The left most bit is treated as part of the number. So we still have 32 1s, and the resulting value is 2^32 - 1 (4294967295), a fairly large positive number. Any negative number when cast to an unsigned type will miraculously change its value. The size of the change depends on the number of bits in the unsigned type, which differs between machines and between types: for types of the same size, C defines the result as the original value plus 2^N, where N is that number of bits.
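A short sketch of the same conversion follows. It compares against UINT_MAX from limits.h rather than a hard-coded number, so it does not assume a 4 byte int; to_unsigned is a hypothetical name used only here.

```c
#include <limits.h>

/* Hypothetical helper: performs the same assignment as the example above. */
unsigned int to_unsigned(int sVal)
{
    /* No explicit cast is needed. A negative value wraps round,
       ending up as sVal + (UINT_MAX + 1). */
    unsigned int uVal = sVal;

    return uVal;
}
```

Here to_unsigned(-1) comes back as UINT_MAX, the largest value an unsigned int can hold, while a value that was already positive, such as 5, passes through unchanged.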

Explicit Casting

Sometimes it is necessary to explicitly cast a value from one type to another to achieve some desired side-effect (an example later), or just because it clarifies the code you have written (good practice). Probably the most needlessly used bit of explicit casting in ANSI C is with the returned pointer value from one of the malloc family of functions.

The return type from the malloc functions is the generic pointer type, which is void * (or char * in K&R C). This pointer type is special in C because it can hold the value of a pointer to any other data type without losing information. That is, you can cast any such pointer to the generic pointer type and back again without the value changing in any way at all. This is most useful for hiding data structures inside libraries and modules: you return a generic pointer in answer to a request for a new object, and expect a generic pointer argument to be passed back into the accessor or modifier functions. On receipt, you can safely cast it back into a pointer to your special structure type, known only to the library or module.
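The data hiding idea can be sketched roughly as below. Everything here (CounterT, counterNew, counterAdd, counterGet, counterFree) is invented for illustration; in a real module the structure definition would live in the .c file and only the function prototypes would be published.

```c
#include <stdlib.h>

/* Private to the hypothetical module: callers never see this layout. */
typedef struct {
    long count;
} CounterT;

/* Hands out a new object as a generic pointer, or NULL on failure. */
void *counterNew(void)
{
    CounterT *c = malloc(sizeof *c);

    if (c != NULL)
        c->count = 0;
    return c;               /* CounterT * becomes void * on return */
}

/* Modifier: turns the generic pointer back into the private type. */
void counterAdd(void *handle, long amount)
{
    CounterT *c = handle;

    c->count += amount;
}

/* Accessor: same conversion, read only. */
long counterGet(void *handle)
{
    CounterT *c = handle;

    return c->count;
}

/* Releases an object obtained from counterNew. */
void counterFree(void *handle)
{
    free(handle);
}
```

The caller only ever holds a void *, so the layout of CounterT can change without any calling code needing to be touched.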

In ANSI C the generic pointer is also special in another way (hinted at above). A variable of type void * can have any other data pointer variable assigned to it, and can itself be assigned to any other data pointer variable, without the need for explicit casting. This is because of the guaranteed no-loss conversion. In K&R C this no-loss conversion was also true of the char * pointer but it was not blessed with the no casting ability. As mentioned above the return type of the malloc functions is the generic pointer, therefore in ANSI C you do not need to cast the returned value in order to assign it to a pointer variable of another type.
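As a minimal sketch of the point about malloc, the helper below allocates without a single cast. The names make_filled_array and round_trip are hypothetical. Note that this relies on the code being compiled as C; C++ has different rules.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates `count` ints, each set to `value`.
   Returns NULL if the allocation fails; the caller must free() the result. */
int *make_filled_array(size_t count, int value)
{
    /* No cast: the void * returned by malloc converts to int * by itself. */
    int   *array = malloc(count * sizeof *array);
    size_t i;

    if (array == NULL)
        return NULL;

    for (i = 0; i < count; i++)
        array[i] = value;

    return array;
}

/* Hypothetical helper: sends a pointer through void * and back again. */
int *round_trip(int *ptr)
{
    void *generic = ptr;    /* int * to void *, no cast needed  */

    return generic;         /* void * back to int *, no cast needed */
}
```

The pointer that comes out of round_trip compares equal to the one that went in, which is the no-loss guarantee described above.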

Casting for side-effect is useful too. Consider the following code:

    float fVal1;
    float fVal2;
    int   iVal1;
    int   iVal2;

    iVal1 = 10;
    iVal2 = 3;
    fVal1 = iVal1 / iVal2;
    fVal2 = (float )iVal1 / iVal2;

The values assigned to fVal1 and fVal2 are different. If the intention was to find the exact result of dividing the 2 integer values, the first assignment statement above fails. This is because, although you might expect the result to be cast to a float (and it is), unfortunately the math has already been done in integer arithmetic by then, which throws the remainder away. The result is therefore 3.

In the second assignment statement one of the integer values has been explicitly cast to a float. When the compiler sees this it is forced, as a side-effect, to convert the other value to float and only then perform the divide operation. The result is therefore already a float and no further casting is needed. The answer is also the expected value of 3 1/3, or as near to it as a float can hold.
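The two assignments can be wrapped up as a pair of functions for comparison. This is only a sketch, and the names int_divide and float_divide are made up for it.

```c
/* Hypothetical helper: the divide is done on two ints, so the remainder
   is discarded before the result is converted to float. */
float int_divide(int a, int b)
{
    return a / b;
}

/* Hypothetical helper: casting one operand forces the other to float
   as well, so the divide itself is done in floating point. */
float float_divide(int a, int b)
{
    return (float)a / b;
}
```

With the values from the example, int_divide(10, 3) gives exactly 3, while float_divide(10, 3) gives a value just over 3.33.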

Pointers to Structures and Casting - a Practical Example

When using structures for passing objects into and out of functions it is often useful to be able to pass or receive from a choice of structure types according to the action to be performed or as the result of an action that has taken place. This can be achieved by using pointers to structures, casting, and a handy property of the C structure.

The function would expect a generic pointer as input, to allow any type of structure pointer to be passed. This allows for the choice. How then to distinguish what type of structure was actually passed, and whether it was valid for the function in question, without having to explicitly pass in a type in another argument? Part of the answer lies in the fact that with C structures you are guaranteed that the constituent fields are stored in the order they were declared, and that the first field sits at the very start of the structure.

Utilising this feature means that you can use the first field of any structure as an identifier. Consider the following code:

    typedef enum {
                 StringType,
                 IntVarType,
                 Lexicon
                 } StructIdT;
    typedef struct {
                   StructIdT identifier;
                   } IdT, *StructIdPtr;

    typedef struct {
                   StructIdT id;
                   char      array[500];
                   } StringT, *StringPtr;
    typedef struct {
                   StructIdT id;
                   long      lVal;
                   char      name[32];
                   } IntVarT, *IntVarPtr;
    typedef struct {
                   StructIdT id;
                   int       lType;
                   char      name[32];
                   } LexT, *LexPtr;

Here we have defined an enumerated type and 4 structure types, along with pointer types for those structures. The structures all have the same type of initial field, into which we can store a value to represent the type of structure.
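The layout guarantee that this trick leans on can be checked with the standard offsetof macro. The sketch below uses cut-down copies of two of the structures, and the function names idIsFirst and fieldsAreOrdered are hypothetical.

```c
#include <stddef.h>

/* Hypothetical cut-down copies of two of the structures above. */
typedef enum { StringType, IntVarType, Lexicon } StructIdT;

typedef struct {
    StructIdT id;
    char      array[500];
} StringT;

typedef struct {
    StructIdT id;
    long      lVal;
    char      name[32];
} IntVarT;

/* Returns 1 if the id field sits at the very start of both structures. */
int idIsFirst(void)
{
    return offsetof(StringT, id) == 0 && offsetof(IntVarT, id) == 0;
}

/* Returns 1 if the fields are laid out in the order they were declared. */
int fieldsAreOrdered(void)
{
    return offsetof(IntVarT, id) < offsetof(IntVarT, lVal)
        && offsetof(IntVarT, lVal) < offsetof(IntVarT, name);
}
```

Both functions return 1 on any conforming compiler. The exact offsets of the later fields may differ between machines because of padding, but the first field is always at offset 0.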

Next let us define a simple function that will accept and decode each of the 3 main structures above printing out their contents:

    #include <stdio.h>   /* for printf */

    void decodeStruct(void *sPtr) {
    StructIdPtr sTypePtr;
    StringPtr   strPtr;
    IntVarPtr   iVarPtr;
    LexPtr      lPtr;

    /* Look at the first field only, to find out what we were given */
    sTypePtr = sPtr;
    switch (sTypePtr->identifier)
      {
      case StringType: /* A String Type */
                       strPtr = sPtr;
                       printf("A StringType structure\n");
                       printf("Value : \"%.500s\"\n", strPtr->array);
                       break;
      case IntVarType: /* An Integer Variable Type */
                       iVarPtr = sPtr;
                       printf("An IntVarType structure\n");
                       printf("Name  : %.32s\n", iVarPtr->name);
                       printf("Value : %ld\n", iVarPtr->lVal);
                       break;
      case Lexicon   : /* A Lexicon Type */
                       lPtr = sPtr;
                       printf("A Lexicon structure\n");
                       printf("Name : %.32s\n", lPtr->name);
                       printf("Type : %d\n", lPtr->lType);
                       break;
      default        : /* Unrecognized type ERROR */
                       printf("Oh dear, unrecognized structure\n");
      }
    }

By casting the incoming generic pointer to StructIdPtr it is possible to determine the declared type of the structure because all 4 of the structures have the same first field type. This value can then be used to cast the incoming pointer again to retrieve the information from the passed pointer. This does assume, of course, that the type of the structure was filled in correctly. An added advantage of this method is that the type stays with the structure rather than being passed in separately. There are other ways to perform the same kind of function as above as I am sure you can appreciate.
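One way to make sure the type is always filled in correctly is to set it in a small initialising function for each structure. The sketch below is a variation on the code above rather than a copy of it: the names initString, initIntVar and structName are hypothetical, and the identifier is read through a pointer to the enumerated type itself, which leans only on the first-field guarantee.

```c
#include <string.h>

/* Hypothetical cut-down copies of the types used above. */
typedef enum { StringType, IntVarType, Lexicon } StructIdT;

typedef struct {
    StructIdT id;
    char      array[500];
} StringT;

typedef struct {
    StructIdT id;
    long      lVal;
    char      name[32];
} IntVarT;

/* Sets the identifier and copies the text, always leaving a terminator. */
void initString(StringT *s, const char *text)
{
    s->id = StringType;
    strncpy(s->array, text, sizeof s->array - 1);
    s->array[sizeof s->array - 1] = '\0';
}

/* Sets the identifier, the value and the (possibly truncated) name. */
void initIntVar(IntVarT *v, const char *name, long value)
{
    v->id   = IntVarType;
    v->lVal = value;
    strncpy(v->name, name, sizeof v->name - 1);
    v->name[sizeof v->name - 1] = '\0';
}

/* Works out what kind of structure a generic pointer refers to. */
const char *structName(void *sPtr)
{
    switch (*(StructIdT *)sPtr)
      {
      case StringType: return "StringType";
      case IntVarType: return "IntVarType";
      case Lexicon   : return "Lexicon";
      default        : return "unrecognized";
      }
}
```

A structure set up with initString is reported by structName as "StringType", and one set up with initIntVar as "IntVarType", without the caller ever passing the type separately.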




Content of this page Copyright Robert Quince 1996 - 2005.