CSC270 October 28 Tutorial: C pointers and Makefiles


[ King 11, 12, 15.4, 17 ]


C pointers

[ King 11.1, 11.2, 11.3 ]

Each byte of memory has a different address. Each variable is stored in some number of contiguous bytes of memory. The address of a variable is the address of the first byte in which it's stored.

    0100
    0101 xxx  <-- An integer might take 4 bytes. This integer's
    0102 xxx      address is 0101.
    0103 xxx
    0104 xxx
    0105
    0106
    0107 ccc  <-- A character takes only one byte. This character's
    0108          address is 0107.

In C, a pointer is a variable that stores an address. If a pointer `p' contains the address of a variable `x', we say p points to x. This is typically drawn with an arrow from p to x:

      p            x
    +---+        +---+
    | O |    --> | 5 |
    +-+-+   /    +---+
      |     |
      +-----+

Each pointer can point to many different variables at different times during the program execution, but each of those variables must be of the same type.

Declarations

The following declares integers `a' and `b' and a "pointer to integer" `p'. The asterisk before `p' shows that it is a pointer. There can be a space between the asterisk and `p' if you wish.

    int a, b;
    int *p;

``Address of''

To get `p' pointing to `a', we assign `p' the address of `a':

    p = &a;

The ampersand, when appearing in front of a variable, gives that variable's address. The above statement is read `p is assigned the address of a'.

Dereferencing

To assign a value to the variable pointed to by `p', we use the asterisk.

    *p = 55;
    printf( "%d %d\n", a, *p );

    output ---> 55 55

The assignment above is read `the integer pointed to by p is assigned 55'. This operation ``dereferences'' p. In other words, p is a ``reference'' to something and *p is the actual something.

To get the value, we also use the asterisk:

    b = *p;
    printf( "%d\n", b );

    output ---> 55

Layout in memory

Suppose things are laid out in memory as follows:

    0100 aaa  <-- integer a
    0101 aaa
    0102 aaa
    0103 aaa
    0104 bbb  <-- integer b
    0105 bbb
    0106 bbb
    0107 bbb
    0108
    0109
    010A
    010B
    010C ppp  <-- pointer p
    010D ppp
    010E ppp
    010F ppp

Then after the following statements, memory would look as shown below (the hyphens are placeholders for the remainder of each variable).

    a = 55;
    p = &a;
    b = *p + 22;

    0100 55    <-- integer a
    0101 -
    0102 -
    0103 -
    0104 77    <-- integer b
    0105 -
    0106 -
    0107 -
    0108
    0109
    010A
    010B
    010C 0100  <-- pointer p
    010D -
    010E -
    010F -
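
The three operations above (declare, take an address, dereference) can be put together in one small function. This is only a sketch: the function name `pointer_demo' is made up for illustration and is not from King, and the address printed on the last line depends on your machine.

```c
#include <stdio.h>

/* Hypothetical helper: repeats the steps from the notes and returns b. */
int pointer_demo(void)
{
    int a, b;
    int *p;

    p = &a;          /* p is assigned the address of a */
    *p = 55;         /* the integer pointed to by p is assigned 55 */
    b = *p + 22;     /* read a through p, so b becomes 77 */

    printf("a = %d, b = %d, *p = %d\n", a, b, *p);

    /* %p prints an address; the exact value varies from run to run */
    printf("p holds %p, which is the address of a\n", (void *) p);

    return b;
}
```

Calling pointer_demo() from main should return 77, matching the memory picture above.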

Pointers as Arguments

[ King 11.4 ]

C functions use "call-by-value". The arguments of the function call are evaluated, their values are passed in to the function as parameters, and any modifications to the parameters are not returned from the function call.

To modify a variable that is one of the function arguments, you must pass a POINTER to that variable:

    #include <stdio.h>

    void f( int *p )
    {
      *p = *p + 1;
    }

    int main( void )
    {
      int i, j;

      i = 0;
      j = 1;
      f( &i );
      f( &j );
      printf( "i = %d, j = %d\n", i, j );   /* prints: i = 1, j = 2 */
      return 0;
    }

Above, f takes a pointer to an integer. It increments the integer pointed to by `p'. The call to `f' must pass in a pointer to an integer: &i is the address of `i' (in other words, a pointer to `i').
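
A common use of pointer arguments is exchanging two variables. The sketch below contrasts a version that takes pointers with one that takes plain values. Both function names are made up for illustration.

```c
/* Exchange the two integers that x and y point to. */
void swap(int *x, int *y)
{
    int tmp = *x;

    *x = *y;
    *y = tmp;
}

/* Call-by-value version: only the local copies x and y are exchanged,
   so the caller's variables are left untouched. */
void swap_by_value(int x, int y)
{
    int tmp = x;

    x = y;
    y = tmp;
}
```

With `int i = 1, j = 2;', the call swap( &i, &j ) leaves i equal to 2 and j equal to 1, while swap_by_value( i, j ) changes nothing in the caller.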

Pointers and structs

[ King 17.3, 17.4, 17.5 ]

Pointers are often used with structures in C. For example, a linked-list node looks like

    struct ll_node {
      int data;
      struct ll_node *next;
    };

Such a node contains data and a pointer `next' to the next node on the list. Here, `struct ll_node' is the data type. Let's define an LL_NODE type:

    typedef struct ll_node {
      int data;
      struct ll_node *next;
    } LL_NODE;

Note that it would not work to use "LL_NODE *" inside the structure definition, because the name LL_NODE does not exist until the typedef is complete. Inside the definition, use the "struct ll_node *" form.

A linked list has a pointer to its first node, which is initialized to NULL. `NULL' is defined in stdlib.h (and in several other standard headers, such as stddef.h and stdio.h).

    LL_NODE *head;

    head = NULL;

A new node is created with malloc, which is declared in stdlib.h.

    LL_NODE *p;

    p = (LL_NODE *) malloc( sizeof( LL_NODE ) );

The fields of a structure would normally be referenced with the `dot' operator, as in s.data and s.next for a structure `s'. BUT ... since p is a POINTER to the structure, we use another operator, the `arrow'. The following initializes the new structure and adds it to the head of the list.

    p->data = 0;
    p->next = NULL;
    head = p;

The following creates a list of nodes storing 5,4,3,2,1,0 in that order. Nodes are successively added to the *head* of the list as the index i increases.

    LL_NODE *head, *p;
    int i;

    head = NULL;
    for (i=0; i < 6; i++) {
      p = (LL_NODE *) malloc( sizeof( LL_NODE ) );
      p->data = i;
      p->next = head;
      head = p;
    }

Trace it.

When you no longer need something that you allocated with `malloc', BE SURE to release it so that the memory can be reused:

    free( p );

Here p must be a pointer that was returned by an earlier call to malloc.
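
The list-building loop above does not check whether malloc succeeded, and it never frees the nodes. The following sketch shows one way to package those steps. The helper names `push_front', `list_length' and `free_list' are made up for illustration, and returning the old head when malloc fails is just one possible convention.

```c
#include <stdlib.h>

typedef struct ll_node {
    int data;
    struct ll_node *next;
} LL_NODE;

/* Add a node holding `value' at the head of the list.
   Returns the new head, or the old head unchanged if malloc fails. */
LL_NODE *push_front(LL_NODE *head, int value)
{
    LL_NODE *p = (LL_NODE *) malloc(sizeof(LL_NODE));

    if (p == NULL)
        return head;        /* out of memory: list left as it was */
    p->data = value;
    p->next = head;
    return p;
}

/* Count the nodes by following `next' pointers until NULL. */
int list_length(const LL_NODE *head)
{
    int n = 0;

    while (head != NULL) {
        n++;
        head = head->next;
    }
    return n;
}

/* Free every node. The next pointer is saved before each free,
   because a node must not be read after it has been freed. */
void free_list(LL_NODE *head)
{
    while (head != NULL) {
        LL_NODE *next = head->next;

        free(head);
        head = next;
    }
}
```

Starting from `head = NULL' and calling `head = push_front( head, i )' for i from 0 to 5 builds the same 5,4,3,2,1,0 list as the loop above; free_list( head ) then releases all six nodes.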

Pointer Notation Tricks

If `s' is a structure and `p' is a pointer to it, there are several ways to reference the fields of the structure. The following references to `data' are all equivalent.

    LL_NODE s, *p;

    p = &s;          /* p must point to s first */

    s.data = 5;
    p->data = 5;
    (*p).data = 5;

The last is interesting. (*p) is the thing pointed to by `p'. Since this thing IS a structure, we use the dot notation to reference a field.
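
The parentheses in (*p).data are required: the dot binds more tightly than the asterisk, so *p.data would be read as *(p.data). The short sketch below updates the same field through all three notations. The function name `notation_demo' is made up for illustration.

```c
#include <stddef.h>

typedef struct ll_node {
    int data;
    struct ll_node *next;
} LL_NODE;

/* Hypothetical helper: changes one field three different ways. */
int notation_demo(void)
{
    LL_NODE s, *p;

    p = &s;                      /* p points to s */
    s.next = NULL;

    s.data = 5;                  /* dot on the structure itself */
    p->data = p->data + 1;       /* arrow on the pointer: now 6 */
    (*p).data = (*p).data + 1;   /* dereference, then dot: now 7 */

    return s.data;
}
```

All three statements touch the same integer, so the function returns 7.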

Pointers and Arrays

[ King 12.1, 12.2, 12.3 ]

In C, the name of an array is almost always treated as a pointer to the first element of the array! (The main exceptions are when the name is the operand of sizeof or &.) This means that you can pass arrays into functions without incurring the cost of copying the whole array.

For example, suppose f() takes an array of integers as an argument. The following works because the array is passed as a pointer to its first element.

    #include <stdio.h>

    void f( int *a, int size )
    {
      int i;

      for (i=0; i < size; i++)
        printf( "a[%d] = %d\n", i, a[i] );
    }

    int main( void )
    {
      int x[10] = { 0 };   /* initialize so that f() prints defined values */

      f( x, 10 );
      return 0;
    }

In passing `x' in to the function, we're really passing a pointer to the first element of the array. We could just as well have done

    f( &(x[0]), 10 );

since this also passes in the address of the first element. Or, we could have passed in only the middle four elements of the array:

    f( &(x[3]), 4 );

This passes in a pointer to element x[3] and tells f() that there are four elements in the array that starts at that address:

                 &(x[3])
                    |
                    v
      +---+---+---+---+---+---+---+---+---+---+
    x |   |   |   |   |   |   |   |   |   |   |
      +---+---+---+---+---+---+---+---+---+---+
        0   1   2   3   4   5   6   7   8   9

f() thinks it has the following array:

      +---+---+---+---+
    a |   |   |   |   |
      +---+---+---+---+
        0   1   2   3
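
To make the `middle four elements' idea concrete, here is a sketch that adds up part of an array. The names `sum' and `array_demo' are made up for illustration. The array is filled with x[i] = i so that the result is easy to check by hand.

```c
/* Add up `size' integers starting at the address in `a'.
   The function cannot tell whether `a' is the start of an array
   or points somewhere into the middle of one. */
int sum(const int *a, int size)
{
    int i, total = 0;

    for (i = 0; i < size; i++)
        total += a[i];          /* a[i] means the same as *(a + i) */
    return total;
}

/* Hypothetical helper: sums elements 3, 4, 5 and 6 of a ten-element array. */
int array_demo(void)
{
    int x[10];
    int i;

    for (i = 0; i < 10; i++)
        x[i] = i;

    return sum(&x[3], 4);       /* 3 + 4 + 5 + 6 */
}
```

Inside sum(), element a[0] is the caller's x[3]. Note that sizeof cannot recover the array length inside sum(), since `a' is only a pointer there; that is why the size is passed as a separate argument.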

Dynamic Array Allocation

[ King 17.3 ]

Since an array can be accessed through a pointer to its first element, we can allocate arrays dynamically. The following allocates an array of 100 integers and sets all entries to zero.

    int *a, i;

    a = (int *) malloc( 100 * sizeof( int ) );
    for (i=0; i < 100; i++)
      a[i] = 0;

We could just as well have used a pointer to move through the array:

    int *a, *p, i;

    a = (int *) malloc( 100 * sizeof( int ) );
    p = a;
    for (i=99; i>=0; i--) {
      *p = 0;
      p++;
    }

The statement `p++' means `increment p'. When applied to pointers, this means `increment the address in p by the size of the thing p points to'. In other words (in this case), `point p to the next integer following it in memory'.
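
Neither fragment above checks the result of malloc, which returns NULL when no memory is available. The following sketch wraps the allocation in a function that does check. The name `make_zeroed' is made up for illustration, and treating a non-positive size as an error is an assumption of this sketch.

```c
#include <stdlib.h>

/* Allocate an array of n integers, all set to zero.
   Returns NULL if n is not positive or if malloc fails.
   The caller is responsible for calling free() on the result. */
int *make_zeroed(int n)
{
    int *a, *p;

    if (n <= 0)
        return NULL;

    a = (int *) malloc((size_t) n * sizeof(int));
    if (a == NULL)
        return NULL;

    /* a + n is the address just past the last element;
       p++ moves p forward by one whole integer each time */
    for (p = a; p < a + n; p++)
        *p = 0;

    return a;
}
```

A caller would write `a = make_zeroed( 100 );', test `a' against NULL before using it, and finish with `free( a );'. The standard library function calloc also returns zeroed memory and could replace the loop.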

Multidimensional Arrays

Pointers and multidimensional arrays are slightly more involved. See [ King 12.4 ], although that doesn't go into very much detail.
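
As a small taste of what `more involved' means, the sketch below passes a two-dimensional array to a function. It is only an illustration and not a substitute for King 12.4. The names `COLS', `sum2d' and `grid_demo' are made up. The key point is that the function must know the number of columns, because the parameter is really a pointer to a whole row.

```c
#define COLS 3

/* Add up a rows-by-COLS array. The parameter `m' is a pointer to
   the first row, so m + 1 points to the second row, and so on. */
int sum2d(int m[][COLS], int rows)
{
    int r, c, total = 0;

    for (r = 0; r < rows; r++)
        for (c = 0; c < COLS; c++)
            total += m[r][c];   /* same element as *(*(m + r) + c) */
    return total;
}

/* Hypothetical helper: builds a 2-by-3 array and uses both notations. */
int grid_demo(void)
{
    int grid[2][COLS] = { { 1, 2, 3 }, { 4, 5, 6 } };

    /* *(*(grid + 1) + 1) is grid[1][1], which is 5 */
    return sum2d(grid, 2) + *(*(grid + 1) + 1);
}
```

Here sum2d( grid, 2 ) is 21 and grid[1][1] is 5, so grid_demo() returns 26.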


Makefiles

[ King 15.4 ]

`make' is a Unix program that helps in compiling large programs that occupy more than one source file. Suppose you had a program that occupied three source files:

    maze.c
    read.c
    shortest-matrix.c

Typically, what you do is compile each file separately and then link them together:

    % gcc -c maze.c
    % gcc -c read.c
    % gcc -c shortest-matrix.c
    % gcc -o mazem maze.o read.o shortest-matrix.o

The -c flag causes gcc to stop after producing an object file with the .o suffix, without linking. Thus, the first three lines above produce the files

    maze.o
    read.o
    shortest-matrix.o

The last line above links the three *.o files into an executable called `mazem'.

It would be painful if you had to do this by hand every time you made a change in your source code. In fact, you might forget which files you made changes to and then forget to recompile, resulting in a debugging nightmare!

`make' deals with this. You must create a `Makefile' that defines the dependences between your files. For example,

    mazem              depends upon  maze.o, read.o, and shortest-matrix.o
    maze.o             depends upon  maze.c
    read.o             depends upon  read.c
    shortest-matrix.o  depends upon  shortest-matrix.c

The corresponding Makefile would contain

    mazem: maze.o read.o shortest-matrix.o
            gcc -o mazem maze.o read.o shortest-matrix.o

    maze.o: maze.c
            gcc -c maze.c

    read.o: read.c
            gcc -c read.c

    shortest-matrix.o: shortest-matrix.c
            gcc -c shortest-matrix.c

The lines of the form

    FILE: FILE1 FILE2 FILE3 ...

define dependences, where FILE depends upon FILE1 FILE2 FILE3 ... If you change any one of the files to the right of the colon, `make' will recreate the file to the left of the colon. The names after the colon are separated by whitespace (spaces or TABs).

The line below the dependency line gives a Unix command to recreate FILE from FILE1, FILE2, FILE3, ... There can be more than one command line if necessary.

WARNING: Each command line must start with a TAB, not with spaces. Otherwise, `make' will not work.

NOTE: If a *.o file depends only upon a *.c file of the same name, no dependency line is necessary; `make' knows what to do. Thus, the Makefile above could be shortened to:

    mazem: maze.o read.o shortest-matrix.o
            gcc -o mazem maze.o read.o shortest-matrix.o

Including compilation flags

There are several variables that can be used in the Makefile. The most important is CC, which is the string used by `make' to run the C compiler. If you want to include flags with every C compilation, include the following at the top of your Makefile:

    CC = gcc -g -Wall

Then `make' will use that string every time it does a C compilation. However, you must then use $(CC) in your Makefile everywhere that you do a compilation. The Makefile would change to

    CC = gcc -g -Wall

    mazem: maze.o read.o shortest-matrix.o
            $(CC) -o mazem maze.o read.o shortest-matrix.o

Including header files *.h

If your source files depend upon other *.h files that contain definitions, it's often a good idea to state this in the Makefile. For example, if read.c has a line of the form

    #include "defs.h"

then this is reflected in the Makefile as

    read.o: read.c defs.h

Thus, `make' will recompile read.o if there is a change to read.c OR to defs.h. As before, no compilation statement is necessary since `make' knows how to create read.o from read.c.