You are on page 1of 37

Strings, Arrays, and Pointers

CS-2303
System Programming Concepts
(Slides include materials from The C Programming Language, 2nd edition, by Kernighan and Ritchie and
from C: How to Program, 5th and 6th editions, by Deitel and Deitel)

CS-2303, C-Term 2010 Strings, Arrays, and Poin 1


ters
Summary

• Arrays and pointers are closely related


• Let int A[25];
int *p; int i, j;
i n ter s
• Let p = A; is on o f po gfu l
a n in
r
pa nly m e
o m
• Then p points to A[0] o te: C l, but o ray
N lega ar
is e same
p + i points to A[i]in th
&A[j] == p+j
*(p+j) is the same as A[j]
CS-2303, C-Term 2010 Strings, Arrays, and Poin 6
ters
Summary (continued)

• If void f(int A[], int arraySize);

• Then f(&B[i], bSize-i) calls f with


subarray of B as argument
• Starting at element i, continuing for bSize-i
elements

CS-2303, C-Term 2010 Strings, Arrays, and Poin 7


ters
Review (concluded)
void f(int A[], int arraySize);
and
• void f(int *A, int arraySize);
are identical!

• Most C programmers use pointer notation


rather than array notation
• In these kinds of situations
CS-2303, C-Term 2010 Strings, Arrays, and Poin 8
ters
Additional Notes

• A pointer is not an integer …


• Not necessarily the same size
• May not be assigned to each other
• … except for value of zero!
• Called NULL in C; defined in <stdio.h>
• Means “pointer to nowhere”
• void * is a pointer to no type at all
• May be assigned to any pointer type
• Any pointer type may be assigned to void *
CS-2303, C-Term 2010 Strings, Arrays, and Poin 9
ters
Additional Notes

• A pointer is not an integer …


• Not necessarily the same size
• May not be assigned to each other
e
in th
• … except for value of zero! - ch eck i
n g
ty p e
f e a ts
• Called NULL in C; defined in <stdio.h> t o big
De mpiler g e t in
• Means “pointer to nowhere” co y way to m o st
e as e r y in
A bl ess a
• void * is a pointer to no typet atlutall r o u n e c
ely ograms
A b so C p r
• May be assigned to any pointer typelarge
• Any pointer type may be assigned to void *
CS-2303, C-Term 2010 Strings, Arrays, and Poin 10
ters
Characters in C
• char is a one-byte data type capable C99 & of holding a
C++ in
new d troduc
character ata ty ea
w pe ca
• Treated as an arithmetic integer type char for lled
text internat
• (Usually) unsigned ional
• May be used in arithmetic expressions
• Add, subtract, multiply, divide, etc.
• Character constants
• 'a', 'b', 'c', …'z', '0', '1', … '9', '+', '-', '=',
'!', '~', etc, '\n', '\t', '\0', etc.
• A-Z, a-z, 0-9 are in order, so that arithmetic can be done

CS-2303, C-Term 2010 Strings, Arrays, and Poin 12


ters
Strings in C
• Definition:– A string is a character array ending in
'\0' — i.e.,
– char s[256];
– char t[] = "This is an initialized string!";
– char *u = "This is another string!";

• String constants are in double quotes


• May contain any characters
• Including \" and \' — see p. 38, 193 of K&R
• String constants may not span lines
• However, they may be concatenated — e.g.,
"Hello, " "World!\n" is the same as "Hello, World!\n"

CS-2303, C-Term 2010 Strings, Arrays, and Poin 13


ters
Strings in C (continued)

• Let
– char *u = "This is another string!";
• Then
– u[0] == 'T'
u[1] == 'h'
u[2] == 'i'

u[21] == 'g'
u[22] == '!'
u[23] == '\0'
CS-2303, C-Term 2010 Strings, Arrays, and Poin 14
ters
Support for Strings in C
• Most string manipulation is done through
functions in <string.h>

• String functions depend upon final '\0'


• So you don’t have to count the characters!
• Examples:–
– int strlen(char *s) – returns length of string
• Excluding final '\0'
– char* strcpy(char *s, char *ct) – Copies string ct
to string s, return s
• s must be big enough to hold contents of ct
• ct may be smaller than s

– See K&R pp. 105-106 for various implementations


CS-2303, C-Term 2010 Strings, Arrays, and Poin 15
ters
Support for Strings in C (continued)
• Examples (continued):–
– int strcmp(char *s, char *t)
• lexically compares s and t, returns <0 if s < t, >0 if
s > t, zero if s and t are identical
– D&D §8.6, K&R p. 106
– char* strcat(char *s, char *ct)
• Concatenates string ct to onto end of string s, returns s
• s must be big enough to hold contents of both strings!
• Other string functions
– strchr(), strrchr(), strspn(), strcspn()
strpbrk(), strstr(), strtok(), …
– K&R pp. 249-250; D&D §8.6
CS-2303, C-Term 2010 Strings, Arrays, and Poin 16
ters
Character functions in C

• See <ctype.h>
• These return or false (0) or true (non-zero)
int isdigit(int c) int isalpha(int c)
int isalnum(int c) int isxdigit(int
c)
int islower(int c) int isupper(int c)
int isspace(int c) int iscntrl(int c)
int ispunct(int c) int isprint(int c)
int isgraph(int c)
• These change case (if appropriate)
int toupper(int c) int tolower(int c)
CS-2303, C-Term 2010 Strings, Arrays, and Poin 17
ters
String Conversion Functions in C

• See <stdlib.h>
• D&D §8.4; K&R p. 251-252
double atof(const char *s)
int atoi(const char *s)
long atol(const char *s)

double strtod(const char *s, char **endp)


long strtol(const char *s, char **endp,
int base)
unsigned long strtoul(const char *s, char
**endp, int base)
CS-2303, C-Term 2010 Strings, Arrays, and Poin 18
ters
Dilemma

• Question:–
– If strings are arrays of characters, …
– and if arrays cannot be returned from functions,

– how can we manipulate variable length strings


and pass them around our programs?
• Answer:–
– Use storage allocated in The Heap!
CS-2303, C-Term 2010 Strings, Arrays, and Poin 19
ters
Definition — The Heap

• A region of memory provided by most


operating systems for allocating storage not
in Last in, First out discipline
• I.e., not a stack

• Must be explicitly allocated and released


• May be accessed only with pointers
• Remember, an array is equivalent to a pointer

• Many hazards to the C programmer


CS-2303, C-Term 2010 Strings, Arrays, and Poin 20
ters
Static Data Allocation
0xFFFFFFFF
stack
(dynamically allocated)
This SP
is Th
e H ea p
.
heap
address space (dynamically allocated)

static data

program code
PC
0x00000000 (text)

CS-2303, C-Term 2010 Strings, Arrays, and Poin 21


ters
Allocating Memory in The Heap
• See <stdlib.h>
void *malloc(size_t size);
void free(void *ptr);
void *calloc(size_t nmemb, size_t size);
void *realloc(void *ptr, size_t size);

• malloc() — allocates size bytes of memory


from the heap and returns a pointer to it.
• NULL pointer if allocation fails for any reason
• free() — returns the chunk of memory pointed
to by ptr
• Must have been allocated by malloc or calloc

CS-2303, C-Term 2010 Strings, Arrays, and Poin 22


ters
Allocating Memory in The Heap
• See <stdlib.h>
void *malloc(size_t size);
void free(void *ptr);
void *calloc(size_t nmemb, size_t size);
void *realloc(void *ptr, size_t size);

• malloc() — allocates size bytes of memory r big-


n d /o
from the heap and returns a pointer to it. fault a r
te
• NULL pointer if allocation fails for any reason
ta t p o in
ion bad
g m en or if
• free() — returns the chunk ofSmemory e
e err pointed
tim
to by ptr
• Must have been allocated by malloc or calloc

CS-2303, C-Term 2010 Strings, Arrays, and Poin 23


ters
Allocating Memory in The Heap
• See <stdlib.h>
void *malloc(size_t size);
void free(void *ptr);
void *calloc(size_t nmemb, size_t size);
void *realloc(void *ptr, size_t size);

• malloc() — allocates size bytes of memoryof chunkr


e o
s iz
from the heap and returns a pointer tokit.w s oc()
no ll
• NULL pointer if allocation fails for any ( )
reason
e b y ma
fre ocated )
• free() — returns the chunk of memory all loc(pointed
al
to by ptr c
• Must have been allocated by malloc or calloc

CS-2303, C-Term 2010 Strings, Arrays, and Poin 24


ters
Notes

• calloc() is just a variant of malloc()


• malloc() is analogous to new in C++ and
Java
• new in C++ actually calls malloc()
• free() is analogous to delete in C++
• delete in C++ actually calls free()
• Java does not have delete — uses garbage
collection to recover memory no longer in use

CS-2303, C-Term 2010 Strings, Arrays, and Poin 25


ters
Typical usage of malloc() and free()
char *getTextFromSomewhere(…);

int main(){
char * txt;
…;
txt = getTextFromSomewhere(…);
…;
printf("The text returned is %s.", txt);
free(txt);
}

CS-2303, C-Term 2010 Strings, Arrays, and Poin 26


ters
Typical usage of malloc() and free()
char * getTextFromSomewhere(…){
char *t; getTextF
romSomew
... creates a n here()
ew string u
t = malloc(stringLength); malloc() sing
...
return t;
}

int main(){
char * txt;
…;
txt = getTextFromSomewhere(…);
…;
printf("The text returned is %s.", txt);
free(txt);
}

CS-2303, C-Term 2010 Strings, Arrays, and Poin 27


ters
Typical usage of malloc() and free()
char * getTextFromSomewhere(…){
char *t;
...
t = malloc(stringLength);
...
return t;
} n e d to
g
assi on
t is
int main(){ r to tex f u n cti
te g
char * txt; Poin in callin
txt
…;
txt = getTextFromSomewhere(…);
…;
printf("The text returned is %s.", txt);
free(txt);
}

CS-2303, C-Term 2010 Strings, Arrays, and Poin 28


ters
Usage of malloc() and free()
char *getText(…){
char *t;
...
t = malloc(stringLength);
...
return t;
}
b e r to
m em
int main(){
u s t r e
in t e d to
char * txt; i n ( ) m ra g e p o
…; ma th e s to
f r ee t
txt = getText(…); x
by t
…;
printf("The text returned is %s.", txt);
free(txt);
}

CS-2303, C-Term 2010 Strings, Arrays, and Poin 29


ters
Definition – Memory Leak

• The steady loss of available memory due to


forgetting to free() everything that was
malloc’ed.
• Bug-a-boo of most large C and C++ programs

• If you “forget” the value of a pointer to a


piece of malloc’ed memory, there is no
way to find it again!
• Killing the program frees all memory!
CS-2303, C-Term 2010 Strings, Arrays, and Poin 30
ters
Questions?

CS-2303, C-Term 2010 Strings, Arrays, and Poin 31


ters
String Manipulation in C

• Almost all C programs that manipulate text


do so with malloc’ed and free’d memory
• No limit on size of string in C
• Need to be aware of sizes of character
arrays!
• Need to remember to free storage when it is
no longer needed
• Before forgetting pointer to that storage!

CS-2303, C-Term 2010 Strings, Arrays, and Poin 32


ters
Input-Output Functions

• printf(const char *format, ...)


• Format string may contain %s – inserts a string
argument (i.e., char *) up to trailing '\0'
• scanf(const char *format, ...)
• Format string may contain %s – scans a string into
argument (i.e., char *) up to next “white space”
• Adds '\0'
• Related functions
• fprintf(), fscanf() – to/from a file
• sprintf(), sscanf() – to/from a string
CS-2303, C-Term 2010 Strings, Arrays, and Poin 33
ters
Example Hazard
char word[20];
…;
scanf("%s", word);
• scanf will continue to scan characters from input
until a space, tab, new-line, or EOF is detected
• An unbounded amount of input
• May overflow allocated character array
• Probable corruption of data!
• scanf adds trailing '\0'
• Solution:–
scanf("%19s", word);

CS-2303, C-Term 2010 Strings, Arrays, and Poin 34


ters
Questions?

CS-2303, C-Term 2010 Strings, Arrays, and Poin 35


ters
Command Line Arguments

• See §5.10
• By convention, main takes two arguments:-
int main(int argc, char *argv[]);
– argc is number of arguments An
arra
y
– argv[0] is string name of program itself char of poi
(i.e nters
., o
– argv[i] is argument i in string form f st to
ri
n gs
• i.e., i < argc )
– argv[argc] is null pointer!
• Sometimes you will see (the equivalent)
int main(int argc, char **argv);
CS-2303, C-Term 2010 Strings, Arrays, and Poin 36
ters
Example — PA #2

• Instead of prompting user game parameters,


simply take them from command line
% life 100 50 1000
arg
would play the rGame
a gv[ a ofv[1Life on a grid of
0] rgum ] – fi
100  50 squares for–1000
nament generations.
e of
r st
pro
g ra
pro m
gra
m

CS-2303, C-Term 2010 Strings, Arrays, and Poin 37


ters
Example — PA #2 (continued)
int main(int argc, char *argv[])){
int xSize, ySize, gens;
char grid[][];
grid:– an array of unknown size
if (argc <= 1) { declared; no storage allocated
printf(“Input parameters:- ");
scanf("%d%d%d", &xSize, &ySize, &gens);
} else {
xSize = atoi(argv[1]);
ySize = atoi(argv[2]);
gens = atoi(argv[3]);
}

grid = calloc(xSize*ySize, sizeof char);

/* rest of program using grid[i][j] */

free(grid);
return 0;
} // main(argc, argv)

CS-2303, C-Term 2010 Strings, Arrays, and Poin 38


ters
Example — PA #2 (continued)
int main(int argc, char *argv[])){
int xSize, ySize, gens;
char grid[][];
if (argc <= 1) {
printf(“Input parameters:- ");
scanf("%d%d%d", &xSize, &ySize, &gens);
} else { Convert argument #1 to
xSize = atoi(argv[1]); Convert
ySize = atoi(argv[2]); int forargument
xSize #2 to
gens = atoi(argv[3]); Convert
int forargument
ySize #3 to
} int for gens
grid = calloc(xSize*ySize, sizeof char);

/* rest of program using grid[i][j] */

free(grid);
return 0;
} // main(argc, argv)

CS-2303, C-Term 2010 Strings, Arrays, and Poin 39


ters
Example — PA #2 (continued)
int main(int argc, char *argv[])){
int xSize, ySize, gens;
char grid[][];
if (argc <= 1) {
printf(“Input parameters:- ");
scanf("%d%d%d", &xSize, &ySize, &gens);
} else {
xSize = atoi(argv[1]);
ySize = atoi(argv[2]);
gens = atoi(argv[3]);
}

grid = calloc(xSize*ySize, sizeof char); grid:– allocate array


/* rest of program using grid[i][j] */

free(grid);
return 0;
} // main(argc, argv)

CS-2303, C-Term 2010 Strings, Arrays, and Poin 40


ters
Example — PA #2 (continued)
int main(int argc, char *argv[])){
int xSize, ySize, gens;
char grid[][];
if (argc <= 1) {
printf(“Input parameters:- ");
scanf("%d%d%d", &xSize, &ySize, &gens);
} else {
xSize = atoi(argv[1]);
ySize = atoi(argv[2]);
gens = atoi(argv[3]);
}

grid = calloc(xSize*ySize, sizeof char);

/* rest of program using grid[i][j] */


Be sure to free() the
free(grid); allocated array before
return 0;
exiting!
} // main(argc, argv)

CS-2303, C-Term 2010 Strings, Arrays, and Poin 41


ters
Questions?

CS-2303, C-Term 2010 Strings, Arrays, and Poin 42


ters

You might also like