Casting
When people first start coding in C they are often confused by casting
and when they need to use it. Casting is the act of changing a value
of one type into a similar value of another type. The resultant value
may actually be different from the starting one because it could have
had its size changed (the number of bytes of memory used to store the
value) or because it has been written into a type whose bits mean
different things from the original (see Signed Values later on).
Casting by the Compiler
Casting can happen automatically, applied by the compiler without the
coder having to ask for it. A good example of this is in a piece of math code:
int iVal;
long lVal;
double dVal;
iVal = 2;
lVal = 4;
dVal = iVal * lVal;
Here, because an int value is being multiplied by a long
value and the result is to be stored in a double value, lots of
casting is employed by the compiler. As the compiler sees that the *
operator has operands of different types (and potentially of different
memory sizes too) it has to make them the same. It does this by taking
the lower type (usually the smaller in size) and casting it into the
higher one. So in this example the int is cast into a long before
the * is performed. The result, now a long, is then cast
into a double before being stored into dVal. That's 2 casts in one
statement (3 if you count the int constant 4 being cast to a long when
it is assigned to lVal) and we have hardly done anything. This of
course is a fairly extreme example, deliberately using different
operand types.
There are some fairly simple rules that govern how the compiler will apply
this automatic casting in math.
- If either operand is a long double, convert the other one
to a long double
- Else, if either operand is a double, convert the other
one to a double
- Else, if either operand is a float, convert the other one
to a float
- Otherwise, convert char and short to int
- Then, if either operand is long, convert the other one
to long
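The rules above can be checked with a small sketch like the one below. The
function names are made up for illustration. sizeof is used only because it
reports the size of the type an expression ends up with, without actually
evaluating the expression.

```c
#include <stddef.h>

/* Both char operands are converted to int before the addition,
   so the expression has the size of an int. */
size_t sizeOfCharSum(void)
{
    char a = 1;
    char b = 2;
    return sizeof(a + b);
}

/* The int operand is converted to long before the multiplication,
   so the expression has the size of a long. */
size_t sizeOfIntTimesLong(void)
{
    int iVal = 2;
    long lVal = 4;
    return sizeof(iVal * lVal);
}

/* The float operand is converted to double before the addition,
   so the expression has the size of a double. */
size_t sizeOfFloatPlusDouble(void)
{
    float fVal = 1.5f;
    double dVal = 2.5;
    return sizeof(fVal + dVal);
}

/* The multiplication is done as long, then the result is
   converted to double when it is returned. */
double mixedProduct(int iVal, long lVal)
{
    return iVal * lVal;
}
```

Note that on machines where int and long happen to be the same size the
second function cannot tell the two types apart, although the conversion
still takes place.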
Function Arguments
Automatic casting is also used when passing arguments to functions and
when assigning function return values. The rules followed are the same
as for math with the exception that, if no prototype is found for a
function specifying the argument types, then char and
short values are cast to int and float
goes to double. Needless to say it is best practice to use
function prototypes when coding in ANSI C (old style K&R C does not
have function prototyping) to avoid strange values being passed to
your functions.
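One place where these no-prototype promotions can still be seen in ANSI C is
in the variable part of a variadic function's argument list, where the
compiler has no declared types to go on and applies the same promotions. The
sketch below relies on that; sumIntegers and sumReals are hypothetical helpers
written for this illustration.

```c
#include <stdarg.h>

/* Adds up `count` integer arguments. char and short arguments
   arrive as int, so every argument is read back as an int. */
long sumIntegers(int count, ...)
{
    va_list args;
    long total = 0;
    int i;

    va_start(args, count);
    for (i = 0; i < count; i++) {
        total += va_arg(args, int);
    }
    va_end(args);
    return total;
}

/* Adds up `count` floating point arguments. float arguments
   arrive as double, so every argument is read back as a double. */
double sumReals(int count, ...)
{
    va_list args;
    double total = 0.0;
    int i;

    va_start(args, count);
    for (i = 0; i < count; i++) {
        total += va_arg(args, double);
    }
    va_end(args);
    return total;
}
```

Calling sumIntegers(2, c, s) with a char c and a short s, or
sumReals(2, f, 2.5) with a float f, works only because of those promotions.
Asking va_arg for a char or a float instead would be an error.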
Signed Values
A quick word about signed values and types before we go any further.
A signed type like int or long can hold a value
that is either negative, positive or zero. An unsigned type like
unsigned long or unsigned int can only hold
positive or zero values. Often, signed and unsigned types of the
same basic type (unsigned int and int for example)
take up the same number of bytes of memory. The difference is in
how the underlying hardware treats the individual bit values in
the bytes that make up the value. Usually there is what is called
a sign-bit in a signed value that determines whether the
value is positive or negative.
Consider what happens then to a signed value whose sign-bit is
set (indicating a negative value) that is cast to an unsigned
type of the same basic type. The values of the bits do not change
in this instance, only their meaning. For example:
int sVal;
unsigned int uVal;
sVal = -1;
uVal = sVal;
Here the compiler casts sVal to an unsigned int
following the general rules above. Assuming the types are the
same number of bytes in size, uVal will be a very large
positive number. Why?
The answer lies in how negative numbers are represented and good
old binary math. To hold a value of -1 and be able to perform
binary math on it (don't forget computers only use binary) all
the bits in the number are set, including the sign-bit (this is
the two's complement scheme used by virtually all machines). If the
number of bytes required to store an int on our
imaginary machine is 4 and the sign-bit is the leftmost one,
then -1 would be represented by a binary number consisting of
32 1s (0xffffffff in hexadecimal). Casting this to an
unsigned int leaves the number and values of the bits
the same, but now there is no sign-bit. The leftmost bit is
treated as part of the number. We still have 32 1s, so the
resulting value is 2^32 - 1 (4294967295), a fairly large
positive number. Any negative number when cast to an unsigned
type will miraculously change its value. Unfortunately the
resulting value cannot be stated in general terms because it
depends on the hardware and the number of bytes used to hold
values of the types involved.
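A minimal sketch of the conversion described above follows. The helper name
toUnsigned is made up for this example. Rather than hard-coding 4 byte values,
the results are best compared against UINT_MAX from limits.h, which is the
largest unsigned int on whatever machine the code is compiled for.

```c
#include <limits.h>

/* Converts a signed int to an unsigned int. Negative inputs
   wrap around: the result is the input plus UINT_MAX + 1. */
unsigned int toUnsigned(int sVal)
{
    unsigned int uVal;

    uVal = (unsigned int)sVal;
    return uVal;
}
```

With this, toUnsigned(-1) gives UINT_MAX and toUnsigned(-2) gives
UINT_MAX - 1, while zero and positive values pass through unchanged.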
Explicit Casting
Sometimes it is necessary to explicitly cast a value from one
type to another to achieve some desired side-effect (an example later), or
just because it clarifies the code you have written (good practice).
Probably the most needlessly used bit of explicit casting in ANSI C is with
the returned pointer value from one of the malloc family of functions.
The return type from the malloc functions is the generic
pointer type, which is void * (or char * in K&R
C). This pointer type is special in C because it can hold the value of a
pointer to any other type without losing information, that is, you can cast
any pointer to the generic pointer type and back again without the value
changing in any way at all. This is most useful in data structure hiding in
libraries and modules, where you return a generic pointer to a request for a
new object, and expect a generic pointer type argument back in to the accessor
or modifier functions, which, upon receipt, you can safely cast back into
a pointer to your special structure type known only to the library or module.
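The data hiding idea can be sketched roughly as follows. Everything here
(CounterT, counterNew, counterIncrement, counterFree) is a hypothetical module
invented for illustration, and in a real library the structure definition
would live in the module's source file rather than in its public header.

```c
#include <stdlib.h>

/* Private to the module: callers never see this type. */
typedef struct {
    int count;
} CounterT;

/* Creates a new object and hands it back as a generic pointer.
   Returns NULL if the memory could not be allocated. */
void *counterNew(void)
{
    CounterT *cPtr = malloc(sizeof *cPtr);

    if (cPtr != NULL) {
        cPtr->count = 0;
    }
    return cPtr;
}

/* Modifier: converts the generic pointer back to the real type.
   The caller must pass a pointer obtained from counterNew(). */
int counterIncrement(void *handle)
{
    CounterT *cPtr = handle;

    cPtr->count = cPtr->count + 1;
    return cPtr->count;
}

/* Releases an object created by counterNew(). */
void counterFree(void *handle)
{
    free(handle);
}
```

The caller only ever holds a void *, so the layout of CounterT can change
without any calling code needing to be touched.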
In ANSI C the generic pointer is special in another way too (hinted at
above). A variable of type void * can have any other pointer variable
assigned to it without the need for explicit casting, and a void * value
can likewise be assigned to a pointer variable of any other type. This is
because of the guaranteed no-loss conversion. In K&R C this no-loss
conversion was also true of the char * pointer but it was not blessed with
the no casting ability. As mentioned above the return type of the malloc
functions is the generic pointer, therefore in ANSI C you do not need to
cast the returned value in order to assign it to a pointer variable of
another type.
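As a short sketch of the point about malloc, the hypothetical function below
allocates an array of int without casting the returned pointer. Using
sizeof *values rather than sizeof(int) is a stylistic choice made here, not
something the text above requires.

```c
#include <stdlib.h>

/* Allocates an array of `count` ints holding 0, 1, 2, ...
   Returns NULL if the memory could not be allocated.
   The caller releases the array with free(). */
int *makeIntArray(size_t count)
{
    int *values;
    size_t i;

    values = malloc(count * sizeof *values);   /* no cast needed */
    if (values == NULL) {
        return NULL;
    }
    for (i = 0; i < count; i++) {
        values[i] = (int)i;   /* explicit cast: size_t to int */
    }
    return values;
}
```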
Casting for side-effect is useful too. Consider the following code:
float fVal1;
float fVal2;
int iVal1;
int iVal2;
iVal1 = 10;
iVal2 = 3;
fVal1 = iVal1 / iVal2;
fVal2 = (float )iVal1 / iVal2;
The values assigned to fVal1 and fVal2 are different. If the intention
was to find the exact result of dividing the 2 integer values, the first
assignment statement above fails. This is because, although you might
expect the result to be cast to a float (and quite right too),
unfortunately the math has already been done in integer arithmetic by
then, therefore the result is 3.
In the second assignment statement one of the integer values has been explicitly
cast to a float. When the compiler sees this it is forced, as a
side-effect, to convert the other value to float and only then perform
the divide operation. The result is therefore already a float and no
further casting is needed. The answer is also correct, being the expected
value of 3 1/3 (to the precision a float can hold).
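The two assignments can be wrapped up in a pair of functions to make the
difference easy to try out. The function names are invented for this sketch.

```c
/* Integer division happens first; only the truncated
   result is converted to float. */
float divideTruncating(int iVal1, int iVal2)
{
    return iVal1 / iVal2;
}

/* The explicit cast makes the first operand a float, which forces
   the second to be converted too, so the division is done in
   floating point. */
float divideExact(int iVal1, int iVal2)
{
    return (float)iVal1 / iVal2;
}
```

Calling divideTruncating(10, 3) gives exactly 3, whereas divideExact(10, 3)
gives a value of roughly 3.3333.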
Pointers to Structures and Casting - a Practical Example
When using structures for passing objects into and out of functions it is
often useful to be able to pass or receive from a choice of structure types
according to the action to be performed or as the result of an action that
has taken place. This can be achieved by using pointers to structures,
casting, and a handy property of the C structure.
The function would expect a generic pointer as input, to allow any type of
structure pointer to be passed. This allows for the choice. How then to
distinguish what type of structure was actually passed, and whether it was
valid for the function in question, without having to explicitly pass in a
type in another argument? Part of the answer lies in the fact that with C
structures you are guaranteed the order and offset from the start of the
structure of the constituent fields.
Utilising this feature means that you can use the first field of any
structure as an identifier. Consider the following code:
typedef enum {
StringType,
IntegerVar,
Lexicon
} StructIdT;
typedef struct {
StructIdT identifier;
} IdT, *IdPtr;
typedef struct {
StructIdT id;
char array[500];
} StringT, *StringPtr;
typedef struct {
StructIdT id;
long lVal;
char name[32];
} IntVarT, *IntVarPtr;
typedef struct {
StructIdT id;
int lType;
char name[32];
} LexT, *LexPtr;
Here we have defined an enumerated value and 4 structures of differing
types, along with pointers to those structures. The structures all have the
same type of initial field into which we can store a value to represent the
type of structure.
Next let us define a simple function that will accept and decode each of
the 3 main structures above printing out their contents:
#include <stdio.h>   /* for printf() */

void decodeStruct(void *sPtr) {
    IdPtr     sTypePtr;   /* used only to read the identifier field */
    StringPtr strPtr;
    IntVarPtr iVarPtr;
    LexPtr    lPtr;

    /* Every structure starts with a StructIdT, so read that first */
    sTypePtr = sPtr;
    switch (sTypePtr->identifier)
    {
    case StringType: /* A String Type */
        strPtr = sPtr;
        printf("A StringType structure\n");
        printf("Value : \"%.500s\"\n", strPtr->array);
        break;
    case IntegerVar: /* An Integer Variable Type */
        iVarPtr = sPtr;
        printf("An IntegerVar structure\n");
        printf("Name : %.32s\n", iVarPtr->name);
        printf("Value : %ld\n", iVarPtr->lVal);
        break;
    case Lexicon: /* A Lexicon Type */
        lPtr = sPtr;
        printf("A Lexicon structure\n");
        printf("Name : %.32s\n", lPtr->name);
        printf("Type : %d\n", lPtr->lType);
        break;
    default: /* Unrecognized type ERROR */
        printf("Oh dear, unrecognized structure\n");
    }
}
By casting the incoming generic pointer to IdPtr it is
possible to determine the declared type of the structure because all 4 of
the structures have the same first field type. This value can then be used
to cast the incoming pointer again to retrieve the information from the
passed pointer. This does assume of course that the type of the structure
was filled in correctly. An added advantage of this method is that
the type stays with the structure rather than being passed in separately.
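For something that compiles on its own, here is a cut-down sketch of the same
technique. It repeats trimmed copies of the typedefs so that it is
self-contained, and the two functions (structIdOf and intVarValue) are
hypothetical helpers. As a variation, it reads the identifier through a
pointer to the first field's own type rather than through a separate
identifier structure. This relies on the C guarantee that a pointer to a
structure, suitably converted, points to its first field.

```c
typedef enum {
    StringType,
    IntegerVar,
    Lexicon
} StructIdT;

typedef struct {
    StructIdT id;
    char array[500];
} StringT;

typedef struct {
    StructIdT id;
    long lVal;
    char name[32];
} IntVarT;

/* Reads the identifier held in the first field of any of the
   structures above. The caller must pass one of those types. */
StructIdT structIdOf(const void *sPtr)
{
    const StructIdT *idPtr = sPtr;   /* points at the first field */

    return *idPtr;
}

/* Returns lVal if sPtr really is an IntVarT, else `fallback`. */
long intVarValue(const void *sPtr, long fallback)
{
    const IntVarT *iVarPtr;

    if (structIdOf(sPtr) != IntegerVar) {
        return fallback;
    }
    iVarPtr = sPtr;
    return iVarPtr->lVal;
}
```

A caller fills in the id field when the structure is created, for example
IntVarT v = { IntegerVar, 42L, "answer" }, and from then on any function
handed &v as a void * can find out what it has been given.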
There are other ways to perform the same kind of function as above as I am
sure you can appreciate.