Memory Allocation

Dynamic allocation of memory can confuse even accomplished programmers, let alone novices. Most of the problems I have come across arise when more complicated data structures need to be constructed, such as argument vectors.

The Basics

The standard functions malloc and calloc enable the programmer to access the computer's runtime memory space. The number of bytes of this memory that the programmer wants is passed as an argument and the address of the allocated memory is the return value, a void * suitable for casting into the required form. The calloc function takes a further argument which specifies the number of elements of the requested size that the programmer would like, for example:

    ptr = calloc(10, 50);

would give the programmer 10 lots of 50 bytes.

The major difference between malloc and calloc is that calloc initializes the contents of the returned memory bytes to zeroes, whereas malloc returns the memory with whatever values happened to be there at the time.
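As a minimal sketch of that difference, the two helpers below (hypothetical names of my own, not library functions) check that a block is all zeroes and build a zeroed block from malloc plus memset, which should give the same result as calloc. The NULL check is there because either allocation function returns NULL when the request cannot be satisfied.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if all n bytes at p are zero, else 0. */
int all_zero(const unsigned char *p, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (p[i] != 0)
            return 0;
    return 1;
}

/* Hypothetical helper: allocate count * size bytes with malloc and clear
   them by hand, giving the same contents as calloc(count, size). */
void *zeroed_malloc(size_t count, size_t size)
{
    void *p;
    if (size != 0 && count > (size_t)-1 / size)
        return NULL;                 /* count * size would overflow */
    p = malloc(count * size);
    if (p != NULL)
        memset(p, 0, count * size);  /* malloc leaves the bytes uninitialized */
    return p;
}
```

Calling all_zero on a block straight from calloc(10, 50) should return 1 for all 500 bytes; the same test on a block straight from malloc may or may not, which is exactly the point.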

Take care with allocated memory

Care must be taken when storing values in the returned memory to ensure that the program does not write more data than the block can hold. This is very bad for the program. Let us investigate why.

In general, when a program that uses dynamically allocated memory is run, the operating system reserves a chunk of memory for it to use. The malloc and calloc functions carve the smaller blocks that the program requests during its execution out of this reserved chunk. To keep track of which blocks are in use and which are free within the large chunk, the functions typically reserve a few more bytes than the program requests each time and store the necessary tracking information there. So if the program writes too much data to one of the returned blocks, it can destroy this tracking information. Then, either when the program calls the free function to return the memory block to the large chunk or when the next request is made, the allocation functions can no longer distinguish properly between free and used blocks.

This often leads to program failure because of the allocation functions:

  • looping endlessly, hanging the program when trying to free a block
  • crashing the program with a core dump because the tracking information contains an invalid memory address
  • repeatedly requesting more memory from the operating system because they think that all of the reserved chunk has been used up, eventually causing either the program or the machine to crash because there is no more memory available
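One way to avoid overrunning a block is to pass its size along with its address and refuse to write beyond it. The function below is a sketch only; copy_bounded is a hypothetical name of mine, and it assumes the caller knows how many bytes were requested for the block.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into a block of dst_size bytes without
   ever writing past its end. Returns 1 if the whole string fitted,
   0 if it had to be truncated (or dst_size was 0). */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);
    if (dst_size == 0)
        return 0;
    if (len >= dst_size) {
        memcpy(dst, src, dst_size - 1);  /* keep one byte for the '\0' */
        dst[dst_size - 1] = '\0';
        return 0;
    }
    memcpy(dst, src, len + 1);           /* includes the terminating '\0' */
    return 1;
}
```

With an 8 byte block, copy_bounded(block, 8, "short") copies the whole string, whereas a longer string is cut to 7 characters plus the terminator instead of trampling on whatever follows the block.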

It is also important never to call the free function more than once with the same returned block address. Doing so will almost certainly cause the allocation functions to fail, and with them the calling program.
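A common defensive habit is to reset a pointer to NULL as soon as its block has been freed, since calling free with NULL is defined to do nothing. A minimal sketch, with release_ints being a hypothetical helper name:

```c
#include <stdlib.h>

/* Hypothetical helper: free the block that *pp points to and reset the
   caller's pointer to NULL. Calling it again is then harmless, because
   free(NULL) does nothing. */
void release_ints(int **pp)
{
    if (pp != NULL) {
        free(*pp);
        *pp = NULL;
    }
}
```

This only protects the one pointer variable that is passed in. Any other copies of the same address elsewhere in the program are not reset and must still not be freed.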

Arrays

The creation of an array during runtime can be quite a problem, depending on what data type the array is to hold. For simple data types such as char or int, creating the array is trivial. For example:

    #define N_ELEMENTS 50
    int *iArray;
    iArray = (int *)malloc(N_ELEMENTS * sizeof(int));

would result in an array that could be used in exactly the same manner as one hard-coded into a program as:

    int iArray[50];

This type of malloc construction can also be used to create arrays of structures in a very similar manner by just replacing the int with the structure data type. The sizeof operator returns the number of bytes of memory that are required to store an object of the type given. For structures this includes any padding bytes that may be needed to allow the structure to be arrayed correctly. This means that if you manually counted the number of bytes in a structure and compared that to the value returned from sizeof there may be a difference in favour of sizeof.

This difference is not a problem because, when the array is accessed in the normal way, the compiler uses the same sizeof value when calculating the address of the subscripted element. See Arrays and Strings for more information.
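To illustrate, here is a sketch of an array of structures created with malloc. The struct record type and both function names are made up for the example. The second function measures the distance in bytes between two neighbouring elements, which should always equal sizeof(struct record), padding included.

```c
#include <stdlib.h>

/* Hypothetical record type used only for illustration. */
struct record {
    char tag;
    int  value;
};

/* Allocate n records; sizeof(struct record) already includes any padding.
   Returns NULL if the allocation fails. */
struct record *make_records(size_t n)
{
    struct record *r = (struct record *)malloc(n * sizeof(struct record));
    size_t i;
    if (r == NULL)
        return NULL;
    for (i = 0; i < n; i++) {
        r[i].tag = 'r';
        r[i].value = (int)i;
    }
    return r;
}

/* Distance in bytes between consecutive elements of the array.
   r must point to at least one record. */
size_t record_stride(const struct record *r)
{
    return (size_t)((const char *)&r[1] - (const char *)&r[0]);
}
```

On many machines sizeof(struct record) comes out larger than sizeof(char) + sizeof(int), but the exact figure depends on the compiler and the platform, so it is best never to hard code it.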

Multi-dimensional Arrays

Straightforward multi-dimensional arrays are not possible with a plain malloc call. With a hard-coded 2 dimensional array the compiler knows the number of columns in each row and can therefore perform the required calculations to access an element. So for a 3 by 2 array:

    int iMArray[3][2];
    iMArray[2][1] = 20;

the compiler would access the element by transforming the statement into something like this:

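    *(int *)((char *)iMArray + (2 * (2 * sizeof(int))) + (1 * sizeof(int))) = 20;

where the (2 * sizeof(int)) is the size of a row, that is 2 columns of int, and the cast to char * makes the arithmetic count in bytes. The compiler cannot work this value out for memory reached through a plain int * returned by malloc, because the pointer carries no information about the number of columns.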
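The row arithmetic can of course be done by hand if the program remembers the number of columns itself. The sketch below stores the whole array in one block and works out the position of an element from its row and column; grid_alloc and grid_at are hypothetical names, not standard functions.

```c
#include <stdlib.h>

/* Hypothetical helper: allocate rows * cols ints as one zeroed block.
   Returns NULL if the allocation fails. */
int *grid_alloc(size_t rows, size_t cols)
{
    return (int *)calloc(rows * cols, sizeof(int));
}

/* Hypothetical helper: address of element [row][col] in a block whose
   rows each hold cols ints. The caller must keep row and col in range. */
int *grid_at(int *grid, size_t cols, size_t row, size_t col)
{
    return grid + row * cols + col;   /* skip 'row' whole rows, then 'col' ints */
}
```

For a 3 by 2 block, *grid_at(g, 2, 2, 1) = 20; stores into the last of the six ints. The price is that the convenient g[2][1] notation is lost, which is what the array of pointers described next gets back.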

It is possible to simulate a multi-dimensional array with malloc though, in order that it can be accessed in the same manner as that given above for the hard-coded array. To do this you have to create an array of pointers.

Arrays of Pointers

Things become more complicated when it is necessary to represent an array of strings for example. This is because we are now considering either 2 dimensional arrays or, more commonly put, arrays of arrays.

The string case, an array of arrays of char, is pretty complicated to explain, so let us consider a simpler example:

    int *ipArray[10];
    int  iMArray[10][20];
    iMArray[2][6] = 12;
    ipArray[2][6] = 12;

Both the assignment statements are correct C code, but only the first one will succeed at this point. This is because the int * array only has enough memory reserved for 10 int pointers, and even these are unusable as they don't point to anything useful yet, whereas the hard-coded array has enough memory now for the desired 200 ints.

To make the statement work the pointers (the array rows) need to be filled in to point to the array column data before the assignment is performed. This can be a trivial for loop and malloc construct, inserted after the array definitions:

    int *ipArray[10];
    int  iMArray[10][20];
    int i;
    for (i = 0; i < 10; i++) ipArray[i] = (int *)malloc(20 * sizeof(int));
    iMArray[2][6] = 12;
    ipArray[2][6] = 12;

This would then allocate the 10 rows of 20 ints required. More memory is used with a malloc based multi-dimensional array compared to a hard-coded one because of the intermediate pointers required, the 10 int *s in this case.
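The loop above does not check whether malloc succeeded, and the rows also need to be freed one by one when the array is finished with. A sketch of both jobs follows, using hypothetical helper names rows_alloc and rows_free:

```c
#include <stdlib.h>

/* Hypothetical helper: fill rows[0..n_rows-1] with pointers to n_cols ints.
   Returns 1 on success. On failure it frees the rows already allocated,
   leaves every pointer it touched as NULL, and returns 0. */
int rows_alloc(int **rows, size_t n_rows, size_t n_cols)
{
    size_t i;
    for (i = 0; i < n_rows; i++) {
        rows[i] = (int *)malloc(n_cols * sizeof(int));
        if (rows[i] == NULL) {
            while (i > 0) {
                i--;
                free(rows[i]);
                rows[i] = NULL;
            }
            return 0;
        }
    }
    return 1;
}

/* Free each row exactly once and clear the pointers. */
void rows_free(int **rows, size_t n_rows)
{
    size_t i;
    for (i = 0; i < n_rows; i++) {
        free(rows[i]);
        rows[i] = NULL;
    }
}
```

With int *ipArray[10]; a call of rows_alloc(ipArray, 10, 20) takes the place of the for loop, and rows_free(ipArray, 10) hands the memory back afterwards.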

The same assignment statement can be used for the malloc array because the compiler again performs some statement transformations:

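    *(*(ipArray + 2) + 6) = 12;

Pointer arithmetic is scaled automatically, so in bytes the row pointer is read from offset (2 * sizeof(int *)) in ipArray and the value is stored at offset (6 * sizeof(int)) from that row pointer. It can be seen that 2 memory accesses are required, the first one to get the address of the column data for the row, and the second to store the desired value in the calculated location.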

Expanding the above example to handle an array of strings is not too bad once it is understood that the arrays have to be constructed before they can be used, thus:

    char cStrArray[10][20];
    char *cpArray[10];
    int i;
    for (i = 0; i < 10; i++) cpArray[i] = (char *)malloc(20 * sizeof(char));
    strcpy(cStrArray[5], "A hat is on the mat");
    strcpy(cpArray[5], "A hat is on the mat");
    cStrArray[5][3] = 'u';
    cpArray[5][3] = 'u';

is perfectly okay. The character access to change the 'a' to a 'u' in the char * case would be performed something like this:

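    *(*(cpArray + 5) + 3) = 'u';

very similar indeed to the int example. In bytes, the row pointer is read from offset (5 * sizeof(char *)) in cpArray and the character is stored at offset (3 * sizeof(char)) from it.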
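The fixed 20 byte rows above only just hold the example string: 19 characters plus the terminating '\0'. When the strings vary in length it may be safer to size each row from the string itself. A minimal sketch, with copy_string as a hypothetical helper name:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a malloc'd copy of s, or NULL on failure.
   The block is strlen(s) + 1 bytes; the extra byte holds the '\0'. */
char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *p = (char *)malloc(n * sizeof(char));
    if (p != NULL)
        memcpy(p, s, n);
    return p;
}
```

A row is then filled with cpArray[5] = copy_string("A hat is on the mat"); and, once the result has been checked against NULL, cpArray[5][3] = 'u'; works exactly as before. Each row still has to be passed to free once when it is no longer needed.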




Content of this page Copyright © Robert Quince 1996 - 2005.