Strings and Pointers
Prof. Brian D. Davison
Computer Science & Engineering, Lehigh University
Using strings in C
- There are many useful functions to help us use strings in C.
- For example, since C does not let us assign entire arrays, we use the strcpy(3c) function to copy one string to another:
#include <string.h>
char string1[] = "Hello, world!";
char string2[20];
strcpy(string2, string1);
- Another function, strcmp(3c), lets us compare strings.
char string3[] = "this is";
char string4[] = "a test";
if (strcmp(string3, string4) == 0)
    printf("strings are equal\n");
else
    printf("strings are different\n");
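This code fragment will print "strings are different". Notice that strcmp does not return a boolean result: it returns 0 when the strings are equal, and a negative or positive value when the first string sorts before or after the second.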
- And of course the string concatenate function (which we implemented previously) is available as strcat(3c).
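Because strcmp only promises a negative, zero, or positive result, code should test the sign rather than a specific number. A minimal sketch of that idea follows; the helper name compare_sign is hypothetical, not part of the C library.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: reduce strcmp's result to -1, 0, or +1.
   strcmp only promises a negative, zero, or positive value,
   so we test the sign rather than a specific number. */
int compare_sign(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);   /* both must be real strings */
    int r = strcmp(a, b);
    if (r < 0)
        return -1;    /* a sorts before b */
    if (r > 0)
        return 1;     /* a sorts after b */
    return 0;         /* strings are equal */
}
```

For the strings used above, compare_sign("this is", "a test") is positive, since 't' sorts after 'a'.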
Using strings in C
- Often you'll need to know how long a string is
- E.g., to see if a copy of it will fit into a destination buffer
- You can call strlen(3c), which returns the length of the string (i.e., the number of characters in it), not including the terminating null:
char string7[] = "abc";
int len = strlen(string7);
printf("%d\n", len);
strlen might be implemented as
int mystrlen(char str[]) {
    int i;
    for (i = 0; str[i]; i++)
        ;
    return i;
}
- And of course we can print strings using printf() with the %s format (e.g., printf("%s\n", string7);).
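The fit check mentioned earlier (will a copy fit into the destination buffer?) can be sketched with strlen as follows. The helper name copy_if_fits and its return convention are assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copy src into dest only if it fits.
   dest_size is the total size of dest in bytes.
   Returns 1 on success, 0 if src (plus its '\0') would not fit. */
int copy_if_fits(char *dest, size_t dest_size, const char *src)
{
    assert(dest != NULL && src != NULL);
    if (strlen(src) + 1 > dest_size)   /* +1 for the terminating null */
        return 0;
    strcpy(dest, src);
    return 1;
}
```

Note the +1: strlen does not count the terminating null, but the buffer must still hold it.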
Pointers
- With pointers, we can manipulate addresses or contents referenced by the addresses.
- We can first declare a pointer variable
int *ip;
which tells the compiler that variable ip is of type (int *) or, equivalently, that *ip is of type int.
Pointers contain addresses. (Note that ip is currently uninitialized, so its value is undefined.)
We can set it to the address of another variable easily, as in
int i = 5;
ip = &i;
At this point, the contents of ip and the address of i are the same: ip refers to the memory location of i, which contains the number 5.
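Since ip and i name the same memory location, a write through the pointer changes i. A small sketch (the function name write_through_pointer is hypothetical):

```c
#include <assert.h>

/* Hypothetical demo: write to i through ip and return i's new value. */
int write_through_pointer(void)
{
    int i = 5;
    int *ip = &i;      /* ip now holds the address of i */

    assert(*ip == 5);  /* reading through ip yields i's value */
    *ip = 7;           /* writing through ip changes i itself */
    return i;
}
```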
Pointer Declarations
- As with any other variable types, you can initialize the value of a pointer variable when you declare it, as in
int *ip = &i;
but you cannot initialize the value of the memory location to which it points,
as something like
int *ip = 5;
will at best tell the compiler to use address 5 as the initial value for ip; most compilers warn about or reject the missing cast. The contents of address 5 are undefined, and probably off-limits to your program anyway.
While the compiler thinks these are equivalent,
int *j;
int* j;
the latter leads to possible problems later, such as writing
int* i,j;
when you wanted two pointers (this declares i as a pointer to int, but j as a plain int).
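To make the pitfall concrete, here is a sketch that places both declaration forms side by side. The function and variable names are made up for this example.

```c
#include <assert.h>

/* Hypothetical demo of how the * binds to each declarator. */
int declaration_demo(void)
{
    int x = 1, y = 2;
    int *p, q;    /* p is an int *, but q is a plain int */
    int *r, *s;   /* both r and s are pointers */

    p = &x;
    q = y;        /* q holds a value, not an address */
    r = &x;
    s = &y;
    assert(p == r);           /* both point at x */
    return *p + q + *r + *s;  /* 1 + 2 + 1 + 2 */
}
```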
Pointer arithmetic
- In addition to single variables, pointers can be used to access parts of an array.
int *ip;
int a[20];
ip = &a[3];
Given that ip points to element 3 of a, we can use pointer arithmetic to access elements before or after 3, as in
ip++;
*ip = 7;
*(ip+1) = 8;
*(ip-2) = 3;
which set element 4 to 7, element 5 to 8 and element 2 to 3.
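The same steps can be wrapped in a function to check the result. This is only a sketch: the name poke_elements is hypothetical, and the caller is assumed to pass an array of at least 6 elements.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: apply the pointer arithmetic above to a,
   which must have at least 6 elements. Returns how far ip ended up
   from the start of the array, in elements. */
int poke_elements(int a[])
{
    assert(a != NULL);
    int *ip = &a[3];
    ip++;            /* ip now points to a[4] */
    *ip = 7;         /* a[4] = 7 */
    *(ip + 1) = 8;   /* a[5] = 8 */
    *(ip - 2) = 3;   /* a[2] = 3 */
    return (int)(ip - a);   /* pointer difference, in elements */
}
```

Subtracting two pointers into the same array gives the number of elements between them, so the function returns 4 here.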
String operations using pointers
- mystrcmp() using pointers
char *p1 = &str1[0], *p2 = &str2[0];
while (1) {
    if (*p1 != *p2)
        return *p1 - *p2;
    if (*p1 == '\0' || *p2 == '\0')
        return 0;
    p1++;
    p2++;
}
- mystrcpy() using pointers
char *dp = &dest[0], *sp = &src[0];
while (*sp != '\0')
    *dp++ = *sp++;
*dp = '\0';
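The two fragments above omit their function headers. One way to complete them is sketched below; the signatures, the const qualifiers, and returning dest from mystrcpy (mirroring strcpy) are assumptions, not something the fragments specify.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed signature: returns <0, 0, or >0 like strcmp. */
int mystrcmp(const char *str1, const char *str2)
{
    assert(str1 != NULL && str2 != NULL);
    const char *p1 = str1, *p2 = str2;
    while (1) {
        if (*p1 != *p2)
            return *p1 - *p2;
        if (*p1 == '\0')     /* equal here, so *p2 is '\0' too */
            return 0;
        p1++;
        p2++;
    }
}

/* Assumed signature: dest must be large enough for src plus '\0'. */
char *mystrcpy(char *dest, const char *src)
{
    assert(dest != NULL && src != NULL);
    char *dp = dest;
    const char *sp = src;
    while (*sp != '\0')
        *dp++ = *sp++;
    *dp = '\0';
    return dest;
}
```

Since the characters are known to be equal when the second test runs, checking only *p1 for '\0' is enough.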
Null pointers
- A null pointer is a special value that is known not to point anywhere. Such a pointer is never valid to dereference.
- One way to get a null pointer is to use the constant NULL:
#include <stdio.h>
int *ip = NULL;
and then you can test the value of ip to see if it is a valid pointer, as in
if (ip)
    printf("%d\n", *ip);
Null pointers are useful as markers to say that the pointer is not ready for use, or for failure when you would otherwise return a valid pointer.
For example, the strstr(3c) function returns a pointer to the first occurrence of one string within another string, but returns a null if not found.
Initializing pointers to NULL also helps prevent the use of uninitialized pointers (i.e., those with undefined values), which can cause hard-to-reproduce problems.
NULL is really a macro for a null pointer constant, typically 0 or ((void *)0), much like the null character '\0' is also the number 0.
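As an illustration of a null return signalling failure, here is a sketch built on strstr. The wrapper name find_offset and its -1 convention are assumptions for this example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: offset of needle within haystack, or -1. */
long find_offset(const char *haystack, const char *needle)
{
    assert(haystack != NULL && needle != NULL);
    const char *hit = strstr(haystack, needle);
    if (hit == NULL)    /* null pointer signals "not found" */
        return -1;
    return (long)(hit - haystack);
}
```

The result of strstr is tested against NULL before it is used, which is the pattern to follow for any function that may return a null pointer.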
Pointers and arrays
- It turns out that pointers and arrays have much in common.
int a[10];
int *ip;
ip = a;
It is as if we had written ip = &a[0];
We can also use array subscripting with pointers. E.g., ip[3], *(ip+3), and a[3] all refer to the same element, so
ip[3] == *(ip+3) && *(ip+3) == a[3]
is also valid and evaluates to true.
This is how the compiler lets us pass arrays as parameters!
- A function call: myfunc(a,10) is actually myfunc(&a[0],10)
- Similarly, the definition void myfunc(int array[], int size) is treated as if it had been void myfunc(int *array, int size) since later uses of array[x] are still permitted when array is a pointer.
Fun, eh?
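A short sketch of a function that takes an array parameter and is then called with an array name, with &a[0], and with a pointer into the middle of the array. The name sum_array is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper. The parameter "int array[]" is treated
   exactly as if it had been written "int *array". */
int sum_array(int array[], int size)
{
    assert(array != NULL && size >= 0);
    int total = 0;
    for (int i = 0; i < size; i++)
        total += array[i];    /* array[i] is *(array + i) */
    return total;
}
```

Given int a[10], the calls sum_array(a, 10) and sum_array(&a[0], 10) are the same call, and sum_array(a + 7, 3) sums only the last three elements.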
Strings as pointers
- Since arrays and pointers can be used interchangeably, it is common to refer to and manipulate character pointers as strings.
- This means:
- Any function declared to take a string (char array) will also accept a character pointer, since even if an array is passed, the function actually receives a pointer to the first element of the array.
- printf's %s actually expects a character pointer.
- Many programs extensively manipulate strings as character pointers and never explicitly declare any actual arrays.
- One caveat in initialization, however.
char string1[] = "Hello 1";
char *string2 = "Hello 2";
string1[0] = 'J';
string2[0] = 'J';
The first assignment is fine; the second may crash! The first declaration created an array with the initial contents of "Hello 1". The second declaration created a pointer to a string constant, which might be placed in an area of memory that is read-only.
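One common way around this caveat is to copy the string constant into an array you own and modify the copy. A minimal sketch, assuming a hypothetical helper copy_and_patch and a caller-supplied buffer:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copy a (possibly read-only) string into a
   caller-supplied array and change the first character of the copy.
   Returns dest, or NULL if the text would not fit. */
char *copy_and_patch(char *dest, size_t dest_size, const char *text, char first)
{
    assert(dest != NULL && text != NULL);
    if (strlen(text) + 1 > dest_size)
        return NULL;
    strcpy(dest, text);      /* dest is writable storage we own */
    if (dest[0] != '\0')
        dest[0] = first;     /* safe: modifies the copy, not the literal */
    return dest;
}
```

Declaring the parameter as const char * also lets the compiler flag accidental writes to the original string.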