You are on page 1of 8

1. Do array subscripts always start with zero?

Yes. If you have an array a[MAX] (in which MAX is some value known at compile time), the first element
is a[0], and the last element is a[MAX-1]. This arrangement is different from what you would find in some
other languages. In some languages, such as some versions of BAI!, the elements would
"e a[1] through a[MAX], and in other languages, such as #ascal, you can have it either way.
This variance can lead to some confusion. The $first element$ in non%technical terms is the $&ero'th$ element
according to its array inde(. If you're using spoken words, use $first$ as the opposite of $last.$ If that's not
precise enough, use pseudo%!. You might say, $The elements a su" one through a su" eight,$ or, $The second
through ninth elements of a.$
There's something you can do to try to fake array su"scripts that start with one. )on't do it. The techni*ue is
descri"ed here only so that you'll know why not to use it.
Because pointers and arrays are almost identical, you might consider creating a pointer that would refer to
the same elements as an array "ut would use indices that start with one. +or e(ample,
/* don't do this!!! */
int a0[ MAX ];
int *a1 = a0 - 1; /* & a[ -1 ] */
Thus, the first element of a0 (if this worked, which it might not) would "e the same as a1[1]. The last
element of a-, a0[MAX-1], would "e the same as a1[MAX]. There are two reasons why you shouldn't do
this.
The first reason is that it might not work. According to the A.I/I0 standard, it's undefined (which is a Bad
Thing). The pro"lem is that &a[-1] might not "e a valid address1 Your program might work all the time with
some compilers, and some of the time with all compilers. Is that good enough2
The second reason not to do this is that it's not !%like. #art of learning ! is to learn how array indices work.
#art of reading (and maintaining) someone else's ! code is "eing a"le to recogni&e common ! idioms. If you
do weird stuff like this, it'll "e harder for people to understand your code. (It'll "e harder for you to
understand your own code, si( months later.)
2. Is it valid to address one element beyond the end of an array?
It's valid to address it, "ut not to see what's there. (The really short answer is, $Yes, so don't worry a"out
it.$) 3ith most compilers, if you say
int i, a[MAX], j;
then either i or j is at the part of memory 4ust after the last element of the array. The way to see
whether i or j follows the array is to compare their addresses with that of the element following the array.
The way to say this in ! is that either
& i == & a[ MAX ]
is true or
& a[ MAX ] == & j
is true. This isn't guaranteed1 it's 4ust the way it usually works. The point is, if you store something
in a[MAX], you'll usually clo""er something outside the a array. 5ven looking at the value of a[MAX] is
technically against the rules, although it's not usually a pro"lem. 3hy would you ever want to say &a[MAX]2
There's a common idiom of going through every mem"er of a loop using a pointer. Instead of
for ( i = 0; i < MAX; ++i )
{
/* do something */;
}
! programmers often write this,
for ( p = a; p < & a[ MAX ]; ++p )
{
/* do something */;
}
The kind of loop shown here is so common in e(isting ! code that the ! standard says it must work.
3. Can the sizeof operator be used to tell the size of an array passed to a function?
.o. There's no way to tell, at runtime, how many elements are in an array parameter 4ust "y looking at the
array parameter itself. 6emem"er, passing an array to a function is e(actly the same as passing a pointer to
the first element. This is a 7ood Thing. It means that passing pointers and arrays to ! functions is very
efficient.
It also means that the programmer must use some mechanism to tell how "ig such an array is. There are
two common ways to do that. The first method is to pass a count along with the array. This is
what memcpy() does, for e(ample,
char source[ MAX ], dest[ MAX ];
/* ... */
memcpy( dest, source, MAX );
The second method is to have some convention a"out when the array ends. +or e(ample, a ! $string$ is 4ust
a pointer to the first character1 the string is terminated "y an A!II .89 ('\0') character. This is also
commonly done when you have an array of pointers1 the last is the null pointer. !onsider the following
function, which takes an array of char*s. The last char* in the array is N!!1 that's how the function knows
when to stop.
void printMany( char *strings[] )
{
int i;
i = 0;
while ( strings[ i ] != NULL )
{
puts( strings[ i ] );
++i;
}
}
:ost ! programmers would write this code a little more cryptically,
void printMany( char *strings[] )
{
while ( *strings )
{
puts( *strings++ );
}
}
! programmers often use pointers rather than indices. You can't change the value of an array tag, "ut
"ecause strings is an array parameter, it's really the same as a pointer. That's why you can increment strings.
Also,
"hi#e ( *strin$s )
means the same thing as
"hi#e ( *strin$s %= N!! )
and the increment can "e moved up into the call to p&ts().
If you document a function (if you write comments at the "eginning, or if you write a $manual page$ or a
design document), it's important to descri"e how the function $knows$ the si&e of the arrays passed to it.
This description can "e something simple, such as $null terminated,$ or $elephants
has n&m'#ephants elements.$ (0r $arr should have ;< elements,$ if your code is written that way. 8sing
hard coded num"ers such as ;< or => or ;-?> is not a great way to write ! code, though.)
4. Is it better to use a pointer to naviate an array of values! or is it better to use a subscripted
array name?
It's easier for a ! compiler to generate good code for pointers than for su"scripts.
ay that you have this,
/* X is some type */
X a[ MAX ]; /* array */
X *p; /* pointer */
X x; /* element */
int i; /* index */
@ere's one way to loop through all elements,
/* version (a) */
for ( i = 0; i < MAX; ++i )
{
x = a[ i ];
/* do something with x */
}
0n the other hand, you could write the loop this way,
/* version (b) */
for ( p = a; p < & a[ MAX ]; ++p )
{
x = *p;
/* do something with x */
}
3hat's different "etween these two versions2 The initiali&ation and increment in the loop are the same. The
comparison is a"out the same1 more on that in a moment. The difference is "etween (=a[i] and (=*p. The
first has to find the address of a[i]; to do that, it needs to multiply i "y the si&e of an X and add it to the
address of the first element of a. The second 4ust has to go indirect on the p pointer. Indirection is fast1
multiplication is relatively slow.
This is $micro efficiency.$ It might matter, it might not. If you're adding the elements of an array, or simply
moving information from one place to another, much of the time in the loop will "e spent 4ust using the array
inde(. If you do any I/0, or even call a function, each time through the loop, the relative cost of inde(ing will
"e insignificant.
ome multiplications are less e(pensive than others. If the si&e of an X is ;, the multiplication can "e
optimi&ed away (; times anything is the original anything). If the si&e of an X is a power of ? (and it usually
is if X is any of the "uilt%in types), the multiplication can "e optimi&ed into a left shift. (It's like multiplying "y
;- in "ase ;-.)
3hat a"out computing &a[MAX] every time though the loop2 That's part of the comparison in the pointer
version. Isn't it as e(pensive computing a[i] each time2 It's not, "ecause &a[MAX] doesn't change during
the loop. Any decent compiler will compute that, once, at the "eginning of the loop, and use the same value
each time. It's as if you had written this,
/* how the compiler implements version (b) */
X *temp = & a[ MAX ]; /* optimization */
for ( p = a; p < temp; ++p )
{
x = *p;
/* do something with x */
}
This works only if the compiler can tell that a and :AA can't change in the middle of the loop. There are two
other versions1 "oth count down rather than up. That's no help for a task such as printing the elements of an
array in order. It's fine for adding the values or something similar. The inde( version presumes that it's
cheaper to compare a value with &ero than to compare it with some ar"itrary value,
/* version (c) */
for ( i = MAX - 1; i >= 0; --i )
{
x = a[ i ];
/* do something with x */
}
The pointer version makes the comparison simpler,
/* version (d) */
for ( p = & a[ MAX - 1 ]; p >= a; --p )
{
x = *p;
/* do something with x */
}
!ode similar to that in version (d) is common, "ut not necessarily right. The loop ends only when p is less
than a. That might not "e possi"le.
The common wisdom would finish "y saying, $Any decent optimi&ing compiler would generate the same code
for all four versions.$ 8nfortunately, there seems to "e a lack of decent optimi&ing compilers in the world. A
test program (in which the si&e of an A was not a power of ? and in which the $do something$ was trivial)
was "uilt with four very different compilers. Bersion (") always ran much faster than version (a), sometimes
twice as fast. 8sing pointers rather than indices made a "ig difference. (!learly, all four compilers
optimi&e &a[MAX] out of the loop.)
@ow a"out counting down rather than counting up2 3ith two compilers, versions (c) and (d) were a"out the
same as version (a)1 version (") was the clear winner. (:ay"e the comparison is cheaper, "ut decrementing
is slower than incrementing2) 3ith the other two compilers, version (c) was a"out the same as version (a)
(indices are slow), "ut version (d) was slightly faster than version (").
o if you want to write porta"le efficient code to navigate an array of values, using a pointer is faster than
using su"scripts. 8se version (")1 version (d) might not work, and even if it does, it might "e compiled into
slower code.
:ost of the time, though, this is micro%optimi&ing. The $do something$ in the loop is where most of the time
is spent, usually. Too many ! programmers are like half%sloppy carpenters1 they sweep up the sawdust "ut
leave a "unch of two%"y%fours lying around.
". Can you assin a different address to an array ta?
.o, although in one common special case, it looks as if you can. An array tag is not something you can put
on the left side of an assignment operator. (It's not an $lvalue,$ let alone a $modifia"le lvalue.$) An array is
an o"4ect1 the array tag is a pointer to the first element in that o"4ect.
+or an e(ternal or static array, the array tag is a constant value known at link time. You can no more change
the value of such an array tag than you can change the value of C.
Assigning to an array tag would "e missing the point. An array tag is not a pointer. A pointer says, $@ere's
one element1 there might "e others "efore or after it.$ An array tag says, $@ere's the first element of an
array1 there's nothing "efore it, and you should use an inde( to find anything after it.$ If you want a pointer,
use a pointer.
In one special case, it looks as if you can change an array tag,
void f( char a[ 12 ] )
{
++a; /* legal! */
}
The trick here is that array parameters aren't really arrays. They're really pointers. The preceding e(ample is
e*uivalent to this,
void f( char *a )
{
++a; /* certainly legal */
}
You can write this function so that the array tag can't "e modified. 0ddly enough, you need to use pointer
synta(,
void f( char * const a )
{
++a; /* illegal */
}
@ere, the parameter is an lvalue, "ut the const keyword means it's not modifia"le.
#. $hat is the difference between array_name and &array_name?
0ne is a pointer to the first element in the array1 the other is a pointer to the array as a whole.
An array is a type. It has a "ase type (what it's an array of ), a si&e (unless it's an $incomplete$ array), and a
value (the value of the whole array). You can get a pointer to this value,
char a[ MAX ]; /* array of MAX characters */
char *p; /* pointer to one character */
/* pa is declared below */
pa = & a;
p = a; /* = & a[ 0 ] */
After running that code fragment, you might find that p and pa would "e printed as the same value1 they
"oth point to the same address. They point to different types of :AA characters.
The wrong answer is
char *( ap[ MAX ] );
which is the same as this,
char *ap[ MAX ];
This code reads, $ap is an array of MAX pointers to characters.$
%. $hy can&t constant values be used to define an array&s initial size?
There are times when constant values can "e used and there are times when they can't. A ! program can
use what ! considers to "e constant e(pressions, "ut not everything !DD would accept.
3hen defining the si&e of an array, you need to use a constant e(pression. A constant e(pression will always
have the same value, no matter what happens at runtime, and it's easy for the compiler to figure out what
that value is. It might "e a simple numeric literal,
char a[ )1* ];
0r it might "e a $manifest constant$ defined "y the preprocessor,
#define MAX 512
/* ... */
char a[ MAX ];
0r it might "e a si&eof,
char a[ si+e,-( str&ct cache./ject ) ];
0r it might "e an e(pression "uilt up of constant e(pressions,
char /&-[ si+e,-( str&ct cache./ject ) * MAX ];
5numerations are allowed too.
An initiali&ed const int varia"le is not a constant e(pression in !,
int ma( = )1*; 0* n,t a c,nstant e(pressi,n in 1 *0
char /&--er[ ma( ]; 0* n,t 2a#i3 1 *0
8sing c,nst ints as array si&es is perfectly legal in !DD1 it's even recommended. That puts a "urden on !D
D compilers (to keep track of the values of const int varia"les) that ! compilers don't need to worry a"out.
0n the other hand, it frees !DD programs from using the ! preprocessor *uite so much.
'. $hat is the difference between a strin and an array?
An array is an array of anything. A string is a specific kind of an array with a well%known convention to
determine its length.
There are two kinds of programming languages, those in which a string is 4ust an array of characters, and
those in which it's a special type. In !, a string is 4ust an array of characters (type char), with one wrinkle, a
! string always ends with a N! character. The $value$ of an array is the same as the address of (or a pointer
to) the first element1 so, fre*uently, a ! string and a pointer to char are used to mean the same thing.
An array can "e any length. If it's passed to a function, there's no way the function can tell how long the
array is supposed to "e, unless some convention is used. The convention for strings is N! termination1 the
last character is an A!II .89 ('\0') character.
In !, you can have a literal for an integer, such as the value of >?1 for a character, such as the value of '*'1 or
for a floating%point num"er, such as the value of 45*e1 for a -#,at or 3,&/#e.
There's no such thing as a literal for an array of integers, or an ar"itrary array of characters. It would "e very
hard to write a program without string literals, though, so ! provides them. 6emem"er, ! strings
conventionally end with a N! character, so ! string literals do as well. $si( times nine$ is ;E characters long
(including the N! terminator), not 4ust the ;> characters you can see.
There's a little%known, "ut very useful, rule a"out string literals. If you have two or more string literals, one
after the other, the compiler treats them as if they were one "ig string literal. There's only one
terminating N! character. That means that $@ello, $ $world$ is the same as $@ello, world$, and that
char message[] =
"This is an extremely long prompt\n"
"How long is it?\n"
"It's so long,\n"
"It wouldn't fit on one line\n";
3hen defining a string varia"le, you need to have either an array that's long enough or a pointer to some
area that's long enough. :ake sure that you leave room for the .89 terminator. The following e(ample code
has a pro"lem,
char greeting[ 12 ];
strcpy( greeting, "Hello, world" ); /* trouble */
There's a pro"lem "ecause greeting has room for only ;? characters, and $@ello, world$ is ;< characters long
(including the terminatingN! character). The N! character will "e copied to someplace "eyond the greeting
array, pro"a"ly trashing something else near"y in memory. 0n the other hand,
char $reetin$[ 1* ] = 67e##,, ",r#36; 0* n,t a strin$ *0
is 0F if you treat greeting as a char array, not a string. Because there wasn't room for the N! terminator,
the N! is not part of greeting. A "etter way to do this is to write
char $reetin$[] = 67e##,, ",r#36;
to make the compiler figure out how much room is needed for everything, including the
terminating N! character.
tring literals are arrays of characters (type char), not arrays of constant characters (type const char). The
A.I ! committee could have redefined them to "e arrays of c,nst char, "ut millions of lines of code
would have screamed in terror and suddenly not compiled. The compiler won't stop you from trying to modify
the contents of a string literal. You shouldn't do it, though. A compiler can choose to put string literals in
some part of memory that can't "e modifiedGin 60:, or somewhere the memory mapping registers will
for"id writes. 5ven if string literals are someplace where they could "e modified, the compiler can make them
shared.
+or e(ample, if you write
char *p = "message";
char *q = "message";
p[ 4 ] = '\0'; /* p now points to "mess" */
(and the literals are modifia"le), the compiler can take one of two actions. It can create two separate string
constants, or it can create 4ust one (that "oth p and 8 point to). )epending on what the compiler
did, 8 might still "e a message, or it might 4ust "e a mess.

You might also like