
C++ Lecture Notes

Department of Computer Science


University of Cape Town
Prepared by: Dr P Marais
© 2000-2006

Contents

1 Introduction
2 A Simple C++ Program
  2.1 Hello World
  2.2 Sorting Words
3 Compiling and Debugging a C++ Program
  3.1 The Program
  3.2 Compiling and Linking the Program
    3.2.1 Makefiles: Automating Compilation
  3.3 Debugging your C++ Program
    3.3.1 The DDD Debugger under FreeBSD
4 Basic C++ Syntax and Constructs
  4.1 Basic Program Structure
    4.1.1 Header and Source File Conventions
    4.1.2 Multiple Source Files
  4.2 Scope and Namespaces
    4.2.1 Namespaces
  4.3 The C++ Pre-processor
    4.3.1 Pre-processor Directives
    4.3.2 Pre-processor String Operations
  4.4 Types in C++
    4.4.1 Simple Types
    4.4.2 Arrays
    4.4.3 Structures, Unions and Enumerations
    4.4.4 Class Types
    4.4.5 Variable Initialisers
    4.4.6 Type Conversions
  4.5 Pointers and References
    4.5.1 Multiple Indirection
    4.5.2 The Pointer void*
    4.5.3 Relationship between Arrays and Pointers
    4.5.4 Dynamic Allocation
    4.5.5 Function Pointers
    4.5.6 References
  4.6 Containers
  4.7 C++ Specific Operators and Expression Syntax
  4.8 C++ strings
    4.8.1 Character Arrays
    4.8.2 The string class
  4.9 Functions
    4.9.1 Arguments and Return Types
    4.9.2 Overloading
    4.9.3 Function Inlining
5 Simple I/O in C++
  5.1 The Standard I/O Streams
    5.1.1 Reading from stdin
  5.2 I/O Formatting
    5.2.1 Formatting to/from memory
  5.3 File-based I/O
6 Classes in C++
  6.1 Defining a Class
  6.2 Class Constructors and Destructors
    6.2.1 The Copy Constructor
    6.2.2 Return Value Optimisation
  6.3 The Pointer this
  6.4 Class Type Conversions
  6.5 Constant Objects and Methods
  6.6 Operator Overloading
    6.6.1 Type Conversion using Operator Overloading
  6.7 Friends of a Class
7 Inheritance
  7.1 Simple Inheritance
  7.2 Inheritance Access Control
  7.3 Virtual Functions
    7.3.1 Virtual Destructors
    7.3.2 Identifying Object Type
  7.4 Abstract Classes
  7.5 Multiple Inheritance
8 C++ Templates
  8.1 Class Templates
  8.2 Placement of Template Code
  8.3 Iterators and the Standard Template Library
    8.3.1 How to use Iterators
    8.3.2 Types of Iterators
    8.3.3 Common Containers
    8.3.4 Examples of Template Containers
  8.4 Function Templates
  8.5 Algorithms and the STL
    8.5.1 Predicates and Function Objects
    8.5.2 Other Useful Algorithms
9 Exceptions
  9.1 Exception Syntax
  9.2 Specifying Exceptions
10 Using C and C++
  10.1 Linking C code to C++ code
  10.2 Writing C Code
    10.2.1 Function Prototypes
    10.2.2 I/O Under C
    10.2.3 Memory Management Under C
    10.2.4 Miscellaneous
Bibliography

Chapter 1

Introduction
Both Java and C++ are object-orientated languages, in the sense that an object-based approach
to code design is supported by each. A C++ program usually consists of a collection of classes
together with methods (functions) which operate on objects of those classes. C++ is really C
with objects, and as such it includes the C programming language as a subset: one need not
use the extensions provided by C++. In the case of Java, the language was designed specifically
with object orientated programming in mind, and one is compelled to utilise objects for even
the most trivial of tasks.
Another fundamental difference arises from the requirement for platform independence in the
case of Java. Java programs are executed by an interpreter, the Java Virtual Machine (JVM),
which must be implemented for each platform. A C++ program, on the other hand, is compiled
to (highly optimized) platform-specific machine code, which is executed directly on the
hardware. If one takes a set of compiled Java classes (byte code), one can execute this byte
code on any platform which has a JVM. However, when one ports a C++ program to another
platform, at the very least a code recompilation will be necessary. In many cases the C++ code
will contain non-standard functionality, which may not be supported in a different environment,
requiring a rewrite of source code. The American National Standards Institute (ANSI) has spent
a considerable amount of time introducing standards to which good programmers are expected
to adhere. This ensures that code can be ported with a minimum of fuss. In general, interpreted
programs execute considerably slower than compiled programs, although the introduction of JIT
Java compilers has offset this concern somewhat.
Unlike C++, Java has a well defined collection of class libraries (packages) which each JVM
must support to be compliant with the Java standard. The standard libraries of C++ fulfil
a similar role, but the functionality of these libraries is somewhat less extensive. For example,
support for threads and GUIs is built into Java, whilst C++ programmers must look to third-party libraries to make use of such facilities. Of course, both languages are continuing to evolve,
so new functionality is being continually added.
There are many syntactic similarities between the languages, but these may be misleading.
For example, the rules governing class inheritance are quite different although the syntax looks
familiar. Furthermore, C++ has access to a pre-processor which parses the source code prior to
compilation, and can perform tasks such as code inclusion, macro expansion and the selection
of esoteric compiler operations. Java does not support such a mechanism, which can lead to
unwieldy code. The pre-processor allows one to conditionally compile code: so one can check
the platform, for example, and compile a different code segment to handle some inconsistency.
C++ includes support for class templates. Templates enable one to create a single class which
can operate on any number of different types. So, if one has a List template, one can insert
objects of arbitrary type into the list without requiring a separate List class for each type.
Furthermore, the recent draft standard for C++ provides a specification for a Standard Template
Library (STL), providing access to a wide array of useful templates and thus simplifying the
task of porting code between systems. Until very recently Java had nothing comparable, but
it has now introduced a templating scheme (generics), thus further diluting the original simplicity of
the language.
While Java appears to offer some advantages over C++, the latter is well established and is
widely used by programming professionals. This fact is reason enough to learn the language;
however, as the language evolves, a number of new and powerful extensions are appearing, and
these will ensure that C++ remains in use for years to come.

Structure
These notes are set out as follows:
Chapter 2 presents two examples of working C++ programs.
Chapter 3 provides information on how to build and debug a C++ program.
Chapter 4 examines basic C++ syntax and ideas such as pointers and basic types.
Chapter 5 looks at the basic C++ I/O facilities.
Chapter 6 introduces and examines C++ classes in detail.
Chapter 7 covers inheritance under C++.
Chapter 8 examines class and function templates.
Chapter 9 introduces the exception handling system in C++.
Chapter 10 discusses some of the issues around C legacy code.

Chapter 2

A Simple C++ Program


As with most things, examples are often the best way to understand new concepts.

2.1 Hello World

Our first C++ program will be the traditional Hello World, which does little to illustrate the
finer points of the language, but should help to ease the transition!
Here it is, in both a traditional C and the newer C++ version:
/* The C version */
#include <stdio.h>
int main(void)
{
    printf("Hello World!");
    return 0;
}
/* The C++ version; both compile under C++ */
#include <iostream>
int main(void)
{
    std::cout << "Hello World!";   /* cout lives in namespace std */
    return 0;
}
The first statement is an include directive which simply inserts code from the indicated file.
These files are usually header files: they contain variable/function/class declarations which enable
the compiler to do proper type checking. You can, however, include any file you like at any
point in a C/C++ program. The pre-processor will fetch the text and insert it prior to compiling the
source code. In this example, the header files are required to use the I/O facilities of C/C++;
leaving them out will result in a compilation error.
Next, we see what appears to be a function definition. The function main() provides an entry
point into your program and must be present in order to build an executable program. Unlike
Java, C++ does not associate this function with a class. In fact, you can write complete
functional C/C++ programs without ever referring to classes at all. The OO extensions to
C++ co-exist with the older procedural form of C. We are free to define classes with member
functions, or simply write free functions which are not connected to any class or object.
Note that we have declared main() to have a void argument list, since we do not wish to parse
command line arguments. In general, one cannot redefine an argument list, but main() is a
special function which provides the execution entry point into the program, so it is not subject to
the same rules. A zero return value signals to any calling process that all is well (it is good form
to write your programs in this way).
The C program uses the printf() function to perform output (it writes to stdout), while the
C++ program uses the output operator << to perform the same task. Although it is not clear
at this point, there are some fairly sophisticated translations occurring behind the scenes in the
C++ output statement. The entity cout is the standard output stream (which is actually an
object), and the text is being sent to that object for proper formatting and printing.

2.2 Sorting Words

The next program will read in a number of words from stdin and hold them in a vector, which
is very similar to the Java Vector. It will then sort the words before iterating through the vector
and printing each in turn. Here's the code:
#include <iostream>
#include <vector>
#include <string>
#include <algorithm>
using namespace std;   // the library names used below live in namespace std

/* Program to buffer some words, sort them, and then display them */
int main(void)
{
    vector<string> words;
    string word;
    while (cin >> word) words.push_back(word);
    cout << "Words entered (sorted): \n";
    sort(words.begin(), words.end());
    for (unsigned int i = 0; i < words.size(); i++)
        cout << words[i] << " ";
    cout << endl;
    return 0;
}
First, we have to include all the required header files. In this example, we will need I/O capability
(to read/write strings). We will also need to use a vector and string, and we therefore include
the appropriate header files. Unlike Java, C/C++ does not have built-in class types: the
compiler can build the source code to add to our executable, but we need to tell it where to
look. The header file algorithm makes all the necessary declarations to use algorithms such
as sorting and searching. Note that these algorithms are generic and will work on any type of
entity provided it is properly defined.
Next, we see a multi-line comment, which uses /* and */ to delimit the comment. Comment
syntax is identical for Java and C++, but C++ does not have a Javadoc syntax.
The main() function will be executed and then the program will terminate with a return code
of 0.

We then declare a vector to hold our input words. C++ requires us to indicate the type that
will be held in this vector; Java does not, since it uses a simple inheritance hierarchy in which
all objects are derived from class Object. In this example, we will hold entities of type string,
which is indicated by using angle brackets. Now we set up a simple loop which will read words
until the input terminates. Each time we fetch a new word we push it onto the back of the
vector. This is necessary since we created the vector with 0 size, and the push_back() member
function ensures that new space is created as we add elements to the vector. We can only do
an indexed insertion if the vector is big enough, i.e. has had enough cells allocated; otherwise
we will generate an error. While this may seem bizarre, it is consistent with the way C++ treats
containers and will become more natural once that topic has been dealt with in detail.
Sorting (which is based on an optimised quicksort) is accomplished by using the generic sort
algorithm. We pass it arguments indicating the start and end of the container to be sorted (a
vector in this case) and it will sort the data in place.
The for loop simply iterates through the vector, printing out each element in turn. There are many
ways to move through such a container, but in this case a simple indexing expression makes
sense. The output manipulator endl inserts a newline into the output stream and flushes the
output buffer.

Chapter 3

Compiling and Debugging a C++ Program
Before advancing to more complex topics, some practical issues need to be addressed, in particular the process of building a C++ executable from a source description.

3.1 The Program

A C++ program consists of several source and header files. The source files contain the program
code, and the header files usually contain class and function declarations. In general, then, we
have the following:
file1.CC,...,fileN.CC; hfile1.h,...,hfileM.h
where the .CC files contain the C++ source code and the .h files are header files. Note that the
extension for a C++ source file can be whatever you wish it to be. Common choices are .cpp,
.c++, .C and .cc. Header files usually have the extension .h, but this is also a matter of choice.
It should be noted that under the new C++ draft standard, the standard library header files have no extension at
all. However, in order to maintain compatibility with C applications older style header files are
still present.

3.2 Compiling and Linking the Program

To produce an executable file we need to compile and link the source files. Each source file will
be compiled into an object file, containing the relocatable machine code for the relevant source
code. These object files must then be combined to produce a single executable. A program
called a linker is responsible for this task. Most compilers can link the object code (perhaps by
calling a linker in the background), so one usually does not need to know about the specifics of
the linking process. The linker has the additional task of ensuring that code from libraries is
present within the final executable program. Libraries are similar to Java packages, but because
there is no standard virtual machine, the executable must contain all the necessary code to
implement every function we require (unless the libraries are shared libraries, as explained below). For example, the code which allows I/O is contained in
the standard C++ library, so we can call the appropriate functions in our source code, as long
as we tell the compiler where to find the actual program code. The compiler (usually) passes
this information to the linker; within the source code, we inform the compiler that we are using
this external function by including an appropriate header file in the C++ source code, for
example:
#include <iostream>
The header file contains a description of the appropriate functions which enables the compiler
to process each source file, even though it has not yet retrieved the required code. The linker
will then search a list of user-specified libraries to read the required code and resolve all external
references. A number of standard libraries are searched by default, but others, such as the maths
library, must be explicitly specified.
In the interests of reducing the size of the executable, most linkers support shared objects or
dynamic linking. In this case, the code from the library is not actually resident within the
executable. Instead, the operating system intercepts calls to unresolved functions, and reads the
required code into memory as the program executes. Although there is some overhead involved,
shared libraries (or objects) can dramatically reduce the size of the executable and also provide
an easy way to plug in extensions to a program without having to recompile the application.

3.2.1 Makefiles: Automating Compilation

The commands required to compile and link programs can be tedious to type: they often have
a multitude of switches or options, and several steps may be required to produce the final
executable.
For example, if we are using the GNU C++ compiler, g++, and we wish to compile a program
consisting of two source files, file1.CC and file2.CC into an executable myprog, which refers to
libraries libm and libX (the maths and X libraries, resp.), we would require something like
the following:
g++ file1.CC file2.CC -o myprog -lm -lX
Alternatively, we could do the linking explicitly:
g++ file1.CC -c
g++ file2.CC -c
g++ file1.o file2.o -o myprog -lm -lX
In this case, each source file is compiled to an object file (using the -c option) and then linked
in the final invocation to the compiler, which specifies the required library information. Clearly,
repeating such a string of commands every time you modify source code is going to become
painful.
A Makefile provides a means of automating project rebuilding by ensuring that various code
dependencies are always up to date. If you modify a file, and have specified the Makefile
correctly, then all the source files are rebuilt (if necessary) and re-linked correctly, all using a
simple command: make. The file containing the rules is usually called Makefile, although one
can use any valid file name. If the default name is not used, the -f flag followed by the file name
must be passed to make.

Makefiles contain a list of file dependencies which determine when a particular target has to be
recompiled/rebuilt. For example, if a source file is modified, the corresponding object file will
have to be rebuilt. The action that is invoked if the dependency is triggered must contain a list
of commands to recreate the target and forms part of the Makefile specification. An example
might be the following:
file1.o: file1.CC header1.h
	g++ -c file1.CC
This states that file1.o depends on file1.CC (obviously) and a header file, header1.h. If either of
these is modified, the next invocation of the make command will ensure that file1.o is rebuilt using
the stated rule. If file1.o happens to form part of some other dependency, then a rebuild will
trigger that rule, and so on. If make is invoked without an argument, the first rule encountered
in the Makefile will be processed. One can specify a certain target if desired, e.g. make myprog.
Also observe that the command lines following the dependency must each begin with a tab. Rules
should be separated by an empty line.
Makefiles also allow for variable substitution and have simple conditional statements. An
example Makefile is shown below. Note how variables are used to make sure that we can easily
switch between different compilers, or add additional libraries to all the relevant commands:
# This is a Makefile comment
CC=g++ # the compiler name
LIBS=-lm -lX # the libraries we will reference

# the normal build rules
file1.o: file1.CC file1.h
	$(CC) file1.CC -c

file2.o: file2.CC file2.h
	$(CC) file2.CC -c

myprog: file1.o file2.o mainheader.h
	$(CC) file1.o file2.o -o myprog $(LIBS)

# other rules; no dependency; e.g. make clean
clean:
	rm -f *.o
It can become tedious to set up a Makefile if there are many source files, so a number of macros
are available. For example, to compile all source (.CC) files to object (.o) files, we can use a rule
such as the following (a default rule such as this already exists in make):
.CC.o:
	$(CC) -c $<
The $< macro means that for every file that satisfies this rule, the source (.CC) file name will be
substituted when the rule is invoked. However, one should be careful that all the dependencies
are considered. The above rule will not rebuild the .o files if any header file is modified.
There are a number of very complex rules one can specify: the manual pages for make contain
additional information. One should also be aware that there are enhanced versions of make
available which may have a slightly different syntax.

It is possible (with g++) to have the compiler generate a list of dependencies for you by using
the -M switch:
g++ -M s1.C s2.C sN.C > incl.defs
In this case we will have a list of dependencies in the file incl.defs. We can then write a
Makefile such as the following:
CC=g++
SOURCES=s1.C s2.C s3.C
OBJECTS=s1.o s2.o s3.o

myprog: $(OBJECTS)
	$(CC) $(OBJECTS) -o myprog $(LIBS)

.C.o:
	$(CC) -ansi -c $<

# type make depend to build dependencies
depend:
	$(CC) -M $(SOURCES) > incl.defs

# include dependencies; placed after the rules so that myprog stays the first (default) target
include incl.defs
For the above example, we can type make depend to generate the list of dependencies for the
indicated files, and then simply make to rebuild myprog. Remember that make will process the
first rule in the file if no target is specified; if there had been another rule before the one to
build our application we would have used make myprog instead. There are versions of make
which allow you to assign the output of a shell command to a variable. This would simplify the
Makefile shown here, since we could then generate a list of source files by simply using ls *.C
rather than having to list each file explicitly. The version of make supplied by GNU allows this.

3.3 Debugging your C++ Program

Your programs will, unless they are extremely simple, contain logic errors during development
(and perhaps even afterwards!). These can be notoriously difficult to discover, unless you utilise
sensible coding techniques. The compiler will identify obvious syntactical errors but can provide
no run-time assistance for logic errors. Although C++ is a strongly typed language, the existence
of constructs such as pointers creates a great deal of scope for rather nasty errors. In fact, pointer-based
errors are probably the largest source of program bugs for languages which support them.
Here are some simple issues to bear in mind when working with pointers. Pointers will be
introduced in the next chapter. A pointer is what Java calls a reference, although it has a
special syntax under C++. You should re-read this section once you have completed the next
chapter.
Pointer Initialisation: Always initialise a pointer when it is declared. If the pointer is not
pointing to an object or region of memory, set it to the value NULL.
Check Pointer Returns: If a function returns a pointer, check that the returned value is non-NULL:
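// plain new throws std::bad_alloc on failure; the nothrow form (from <new>) returns NULL instead
if ((memory = new (std::nothrow) MyObject[3000]) == NULL) {
    cerr << "Eek!" << endl;
    exit(1);
}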
A NULL return value usually indicates that the function invocation failed. C++ can use
exceptions to signal such errors, but many programmers still use the old C-style convention
of returning a NULL pointer in such cases.
Allocation Bounds Checking: Remember that C++ usually uses new to allocate arrays or
blocks of memory. Unless you use an array class, you will not be protected from overrunning the ends of the allocated block during a memory reference. Writing to a piece
of memory outside the block will almost certainly cause the application to crash at some
point, since the memory management infrastructure will become corrupted.
Delete Unused Memory: Because C++ allows you to manage your memory resources, you
should ensure that you delete dynamically allocated memory or objects when you are
done with them. Failure to do so can result in memory leaks which can drain all the
memory resources of the system.
There are two other common sources of errors, macro expansions and precedence issues. As was
mentioned elsewhere, a macro expansion is a simple text substitution with no respect for the
normal C++ syntactical rules. A common mistake when using macros involves the use of the
auto increment/decrement operator:
#define TWO(a) ((a)+(a))
if (TWO(b++) == 6) ...
// expands to: if ( ((b++) + (b++)) == 6) ...
As the comment indicates, the variable b will be incremented twice, which is not what one would
expect (strictly, modifying b twice within a single expression like this is undefined behaviour). A macro expansion is not equivalent to a function call.
By precedence issues I mean those errors which arise because our assumptions about the order of
evaluation in an expression or control structure are wrong. When one looks at a simple expression
like a+b*c, we know that the second half of the expression is evaluated first. However, when
one overloads operators it is easy to forget the precedence or associativity rules and the order
in which operations occur might be very different from our expectations. In terms of control
structure precedence, perhaps the most common error is the dangling else statement. Because
of the way in which code is indented it is easy to forget that, in the absence of the appropriate
{}s, the else clause of an if-else statement binds to the closest enclosing if statement:
if (...)
    if (...)
        { ... }
else
    { ... }
The else clause binds to the second if statement, since it is closer. Other problems can emerge
with the order of evaluation in expressions which use the auto increment/decrement operations,
but these should be familiar by now.

When developing code it is common practice to insert debugging statements, in other words,
statements which are only used during development and can be disabled or removed when the
final (bug-free!!!) version of the product is built. The pre-processor can be used to accomplish
this, in conjunction with macro parameters to the compiler. Consider the following code:
#include <assert.h>
#include <iostream>
// normal C++ code
void ComputeValue(int a, int b, int c)
{
#ifdef DEBUG_ON
    // Code to include during debug stage:
    assert(a==b);
    std::cerr << "The value of a is " << a <<
        " and b is " << b << std::endl;
#endif
    // normal C++ code
}
If the macro constant DEBUG_ON is defined, the compiler will include the enclosed code section
when compiling the function. One can either define the constant in the program itself (which
requires editing the source code) or use the compiler's macro definition flag to set the value:
g++ -g -D DEBUG_ON source_file.C
In many cases, project Makefiles will have rules which build debug or stable versions by exploiting
this form of the macro definition. The above example makes use of the assert() macro. Assert
evaluates its argument, and if it is false, terminates the program with a diagnostic message
which indicates the line in which the assertion failed.
There are a number of debuggers available for C++. These range from complex (expensive)
software suites with fancy visual interfaces, to text-based free software, such as gdb, the GNU debugger. Debuggers provide an execution environment in which the application can be monitored
as it executes. To use a debugger, your executable must include debug information. Include
the -g flag on your compile line to achieve this. Debuggers support (at least) the following
operations:
Source Step: You can examine the stack, CPU registers, program variables etc. as you execute
each line of source code.
Breakpoints: You can insert a marker at specific lines of source code which will cause the
program to stop execution when it reaches the corresponding line of code. These are
called breakpoints.
Function Tracing: If a line of code invokes a function call, you can choose to step into the
function (in which case your source line indicator will move to the appropriate line in the
source code) or to step over the function, in which case the source line will be executed
and you can examine the state of the variables etc. once this has been accomplished.


Stack Trace: You can back up to any previous function stack frames in the call stack and
examine the state of those variables. You move back down the call stack by stepping into
functions once more.
Variable Manipulation: You can print out the value of variables and in many cases assign
new values to see what impact this has.
Core Files: Under UNIX, if a program crashes, a file named core (a memory dump of the
crashed process) is written to the working directory. If you invoke a debugger on a core file,
it will jump to the statement which caused the crash and you can move back and forth
along the call stack. Note that in many cases the system automatically eliminates core
files to save space.
The means by which these operations are achieved is software dependent so you will have to
consult the manual pages for more information. If you wish to allow code debugging, you may
need to ensure that the compiler inserts additional debug information. Once the code is free of
errors, the compilation can be optimised by removing the debug switches. While debugging, it is
also a good idea to turn off aggressive compiler optimisations, since these can (on rare occasions)
generate error conditions. A final observation: if it ain't broke, don't fix it! Avoid the temptation to
continually optimise your code. You may cause yourself a great deal of heartache!

3.3.1 The DDD Debugger under FreeBSD

DDD is a front-end to the GNU debugger gdb. It supports all the features of gdb and adds many
useful facilities. The following section will describe how one can use some of these features.
The first step is to ensure that your program is compiled with debug information. To do this,
simply add the -g flag to the compile line. You can now start up the debugger in one of two
ways:
Directly: Simply type ddd; using this approach you will need to read in the executable from the
File->Open Program menu item.
Executable: Type ddd ExecutableName; the debugger will start with your executable loaded,
and will show the source code in the source window.
Note that if there is a core file present for the executable, the debugger will usually open this
too and move the source pointer to indicate which instruction caused the crash. The default
resource settings on many systems disable core dumps, however, so you will either have to do
without or run the program within the debugger to trigger the crash and proceed from there.
Once the executable is loaded, the source window opens and you may place breakpoints at
various source lines and so on. To place a breakpoint, simply double click on the left side of
a source code line. A stop symbol will appear and during execution, the debugger will stop
the program at that point so you can check/set variable values or perform any number of other
useful operations.
The rest of the discussion assumes that the program to read and sort some words introduced in
Chapter 2 has been typed in and compiled.
We can generate an error by adding a null pointer dereference somewhere before the return
statement in main():



int *ip = NULL;
*ip = 5; // illegal write at address 0x0!!!

When the executable is run, the debugger will stop at the second line and declare that a
segmentation fault has occurred. This indicates an illegal write to a memory location, and is a
common source of errors in a C++ program. In this case, the error was easily identified since
we had initialised ip to be NULL (as one should always do!). In general, finding a memory error
is much harder: we could, for example, halt at a breakpoint and print out the value of a
pointer variable (ip, in this example) to see whether its current value looks reasonable (very
small memory addresses usually indicate that something has gone wrong somewhere else).
Print/Plotting Values
To print out a value just click on it, and then click on the print button or right-click and select
print from the options provided. The right-click menu also allows several other options, such as
identifying the type of the variable and the ability to plot the data it refers to. This is useful
for checking whether data in an array/vector is what you expect it to be (a quick visual check
can identify the symptoms of a problem elsewhere). You can see how this works by adding the
following line to your source code, just after the start of main():
int data[5]={1,3,-1,2,5};
Then, place a breakpoint just after this line to ensure that the debugger will stop. Select the
data and click on plot: a graph of the data will be displayed. This works for 2D arrays too, as long
as the dimensions are present. You can set the display options as you wish.
Stepping through Code
Perhaps the most useful feature of a debugger is the ability to step through lines of source code,
or even an instruction at a time. Using this feature, you can trace the logic of your program,
checking the values at each point and in some cases, changing them to see what such changes lead
to further on. In general, step and next both advance one source line at a time: step enters
any function called on that line, while next executes the call and stops at the following line.
The stepi and nexti commands do the same one machine instruction at a time. After reaching a
breakpoint, you may use continue to resume execution. The kill command will terminate
execution immediately.
Stack Frames and Function Calls
When a function call is made, the arguments are passed and returned using a stack frame. This
provides a way of tracking the values of variables as you move into and out of function calls.
For example, if a function X calls a function Y, and you step into Y (using step), you will be
able to see all of Y's variables, but the variables for X and functions that called X will no longer
be in scope (unless they are global). To return from a currently executing function, you can use
the finish command. Once you are back in main(), which has no calling frame to return to, use
continue to let the program run to completion. The main use of a stack frame is to allow you to see
what the variables within a function are set to, rather than simply stepping over the function
call (although you can do that if you wish).

Chapter 4

Basic C++ Syntax and Constructs


There is a great deal of overlap between the syntax and constructs of Java and C++. For
example, both languages have identical if, for and do/while control structures. The only
difference between the two sets of control structures is the existence of a labelled break and
continue in Java, which has no analogue in C++. Because of these similarities, we shall
assume that the reader is comfortable with elementary control structures, and not refer to them
further. The remainder of this chapter will deal with more significant differences between the
two languages.

4.1 Basic Program Structure

A C++ program consists of a combination of the following:


header files and other pre-processor directives,
type definitions (structures, objects etc),
variable declarations and definitions,
function prototypes and definitions.

4.1.1 Header and Source File Conventions

A header file contains the class definitions and function/variable declarations which are required
for correct type and syntax checking during compilation. The header file itself is inserted using an
#include directive, which simply inserts the file into the current source file. This new file is then compiled.
Header files are simply a convenient way of collecting the information required by the compiler
into logical units: if they did not exist you would have to cut and paste the text in by hand,
which would be very tedious! A header file can refer to other header files and may contain any
valid C/C++ code.
In general, every class that you create (or that is part of the C++ standard) will have its own
header file which contains the class definition etc. This is a good practice to follow. An example
is the following:


#ifndef MYCLASS_H
#define MYCLASS_H
/* definitions go here */
class MyClass { };
#endif

Following the C++ convention, the source (function definitions and other function code) will
go in a C++ source file (usually extension .cpp or .cc) and the class and other declarations
will go into the header file which has no extension[1]. Thus, our source file would be called
MyClass.cpp and the header file would be called MyClass. The purpose of the #ifndef (if
symbol not defined) directive is to ensure that the header file is not included multiple times.
If this were to happen, the compiler would complain about multiple definitions and halt. Defining
a symbol such as MYCLASS_H (the name is, again, a convention) means that the pre-processor will
simply skip the code between the #ifndef and #endif directives if the header file has previously
been encountered. In that case, the symbol will have been defined and the pre-processor will
branch over the code to look for the next valid text to insert. Thus when two include statements
are encountered:
#include <MyClass>
#include <MyClass>
only the first will result in pre-processor code substitution. By the time the second include
directive is reached, the symbol MYCLASS_H will have been defined and the #ifndef statement
will cause the preprocessor to ignore the code it surrounds (no substitution takes place).
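The effect of the guard can be seen without a second file. In the sketch below the guarded text is simply pasted twice, which is what the pre-processor produces when a header is included twice. The names POINT_H, Point and sumCoordinates are hypothetical and serve only as an illustration.

```cpp
#include <cassert>

// First copy: POINT_H is not yet defined, so this text is kept.
#ifndef POINT_H
#define POINT_H
struct Point { int x; int y; };
#endif

// Second copy: POINT_H is now defined, so everything up to #endif
// is skipped and Point is not defined a second time.
#ifndef POINT_H
#define POINT_H
struct Point { int x; int y; };
#endif

int sumCoordinates(const Point &p) { return p.x + p.y; }
```

If the two #ifndef/#define/#endif wrappers are removed, the compiler should reject the second definition of Point, which is the error the guard exists to prevent.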
A final note on header files concerns the use of the angle brackets: <>. If the header file name is
enclosed within angle brackets the system will search a set of predefined paths since this form is
usually used for system header files. Your current directory will not be searched. If you create
your own header files, you should enclose them in double quotes "" instead. This will cause the
compiler to search your local directory first. Note that in the case of the angle bracket form, you
can modify the compiler search paths using option flags and thus still use your local directory,
but this is not conventional practice.

4.1.2 Multiple Source Files

The #include directive also allows us to separate function prototypes and type definitions from
the source code. Variables may be declared at any point in the program, and if they are declared
outside of a function they are global. In other words, they are visible from every function within
the source file. It is considered bad practice to declare global variables, since they are prone to
unintended side effects. Variables declared in one source file can be referenced by functions in
another source file. The extern keyword is used to achieve this:
// C++ file 1 - variable declaration
float myval;
// C++ file 2 - needs to refer to this
// external variable.
extern float myval;
[1] Some old versions of C++ use a .h extension.


The same strategy works for functions: if a function is declared and defined in another file, then
we must include an extern prototype within the new file:
// file 1 - function definition
int Afunc(int *t1) { ... }
// file 2 - external function ref
extern int Afunc(int *t);
If a function or variable at file scope is declared static, then it cannot be referenced outside
of that file. For variables or functions declared in remote files or libraries, it is often more
convenient to create a single header file which contains the appropriate external prototypes or
variable declarations:
#ifndef MYHDR_H
#define MYHDR_H
struct ANewStruct { ... };
typedef float REAL;
extern int func1(int *t1);
extern float * func2(char ch);
extern int GlobIntVal;
#endif
C++ allows such a header file to be included in the file where the actual definition takes place:
the extern keyword is simply ignored in this case. This allows us to create a single header file
for our function prototypes and global variable declarations.

4.2 Scope and Namespaces

The scope of a variable is the section of source code over which it is visible, i.e. may be
referenced. Scoping rules allow one to make this determination, and for the most part these
rules are easily understood and unambiguous. Within C++, a pair of curly braces ({ }) usually
defines a new scope: variables declared within this code segment will not usually be visible to
external code. Since curly braces can be nested, this implies that variable scope may be nested
too. In general, a variable declared in an outer scope will be visible to code in an
inner scope unless a variable of the same name is declared there. In that case, the code within
the inner scope block will refer to the variable defined there, rather than the outer variable declaration.
In addition to the general observation that pairs of {}s delimit a scope, we also have file or
global scope. Any global variable or function declared outside of a function will reside at this
scope level, which encompasses all others.
Consider the following example:
int i = 10;
for (int j = 0; j < 5; j++) { int i = j;
cout << i << endl; }


In this case the code in the for statement cannot see the variable i defined in the enclosing
scope block. Thus, assignments made to the local version of i only apply to the local version.
If you wish to access the version of i that resides at the global scope level, you can use C++'s
scope resolution operator, ::. Thus one might write:
int i = 10;
for (int j = 0; j < 5; j++) { int i = j;
cout << ::i << endl; }

in which case the value 10 will be printed on every iteration of the loop, since the global variable is
being referred to. The scope resolution operator can also be used to resolve function ambiguities.
For example, we have seen how the association of functions to classes is achieved (a class is a
scope too). Consider the following example:
int i(int j) {
cout << "The function I invoked with arg "
<< j << endl; return 1; }
int main(void) { int i = 3;
cout << i(5) << endl; // won't compile
cout << ::i(5) << endl; // works fine
return 0; }
In this case there is a variable called i as well as a function with the same name. Both of these
entities are visible from the scope block of the main function, but the variable i is within the
inner scope and obscures the function i(). C++ thus treats i(5) as an attempt to call the integer
variable as though it were a function, which is a type error.
A variable is only visible within a scope after it has been declared! Thus, if one moves the
occluding variable statement further down in the code sequence, the value of the variable in the
outer scope will be referenced up until that point. The traditional C practice, and one which is
still useful, is to declare all your variables at the start of a new scope block.
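The point-of-declaration rule can be checked with a small sketch. The variable name level and the two function names are hypothetical; the first function reads the outer variable before the inner declaration, the second reads both after it.

```cpp
#include <cassert>

int level = 10;   // file (global) scope

// Returns the value seen BEFORE the local declaration.
int beforeLocal()
{
    int seen = level;   // refers to the global: no local exists yet
    int level = 3;      // from here on, level means the local
    (void)level;        // silence unused-variable warnings
    return seen;
}

// Combines the values seen AFTER the local declaration.
int afterLocal()
{
    int level = 3;
    return level * 100 + ::level;   // local is 3, ::level is the global 10
}
```

Declaring all variables at the start of the block, as recommended above, avoids the situation in beforeLocal() where the same name means two different things within one function.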
Classes have different scoping and visibility rules, which we will address in Chapter 6.

4.2.1 Namespaces

A namespace is a scope construct which can be used to ensure that labels (variable names,
function names etc) within a program can be uniquely referred to without fear of duplication.
There is a strong possibility that a particular variable or function will be multiply defined if we
are working on a project which consists of many source files. A namespace defines a scope and
associates a name with it. Any declaration placed within a namespace can be unambiguously
referred to. The scope resolution operator :: is used to refer to members within a namespace:
NameSpaceName::member. Note that the default global namespace has no name, and so we use
::member if we wish to refer to such members in this namespace. A namespace is declared as
follows:


namespace name {
// declarations
}
The declarations will usually be function prototypes or variable declarations and are subject
to the normal language constraints. You can also define new types and/or classes within a
namespace. Here's a simple example:
namespace LibA {
const char TheChar = 'a';
int function(void);
}
namespace LibB {
const char Bchar = 'b';
int function(void);
}
In this case we have declared two namespaces, called LibA and LibB, each of which contains the
indicated declarations. Observe that the function function() is common to both namespaces.
We cannot refer to the function or variables directly: they are invisible by default. We can
access members of a namespace in three ways:
Explicit qualification: We use the scope operator to access members of a namespace. For
instance, LibA::TheChar or LibB::function() will access the corresponding members of
the namespaces LibA and LibB.
using Declaration: The keyword using can be employed to make a member of a namespace
visible to the current code segment, e.g. using LibA::function;. The effect of such a
statement is to declare the variable or function at that position in the code, thus making it
visible. Normal scope considerations then apply.
using Directive: If the keyword using is applied to the name of a namespace, e.g.
using namespace LibA;, all the members of that namespace become visible to the rest of the
program. However, unlike a using declaration, the members are not bound to the point in the
code at which the directive is encountered: they are declared at global scope.
Using directives should be used cautiously. In the example above, we have an identical function
in each namespace. Consequently, after opening both namespaces with a directive, an unqualified
call to function() is ambiguous and results in a compiler error. Here are some properties of namespaces:
Namespaces are extensible. If we define a new namespace which has the same name as an
existing one, this is considered an extension of the existing namespace, and the declarations
are merged.
We may have an unnamed namespace. In this case, the declarations are bound to the
current source file (just like global static variable and function declarations) and are only
visible to the code in this file.
Namespaces may have aliases. An alternative name can be defined instead of the original
name. Thus, if we have a library which resides in a namespace MyBigCPPLibrary we can
alias this name to something shorter, say MyLib, to avoid the extra typing:



namespace MyLib = MyBigCPPLibrary;
This name can be used in subsequent namespace declarations or directives.
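The access methods and properties listed above can be gathered into one short sketch. It reuses LibA, LibB, TheChar and function() from the earlier example and gives the functions trivial bodies; the alias A, the variable fileLocalCounter and the use... function names are hypothetical additions made only for illustration.

```cpp
#include <cassert>

namespace LibA { int function(void) { return 1; } }
namespace LibB { int function(void) { return 2; } }

// Extension: reopening LibA adds to the existing namespace.
namespace LibA { const char TheChar = 'a'; }

// Alias: a shorter name for an existing namespace.
namespace A = LibA;

// Unnamed namespace: visible only within this source file.
namespace { int fileLocalCounter = 0; }

int useExplicitQualification() { return LibA::function() + LibB::function(); }

int useDeclaration()
{
    using LibB::function;   // function now means LibB::function here
    return function();
}

int useDirective()
{
    using namespace LibA;   // all of LibA becomes visible
    ++fileLocalCounter;
    return function() + (TheChar == 'a' ? 10 : 0);
}

int useAlias() { return A::function(); }
```

Adding a second directive, using namespace LibB;, inside useDirective() would make the unqualified call to function() ambiguous, which is the problem described above.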

The C++ draft employs namespaces to handle C legacy code and to avoid the problems associated with global name spaces. More specifically, all the functions in the standard C++ library
are bound to a namespace called std (rather unfortunately!) and to ensure compatibility with
old C code, the existing C header files are recomposed to make use of these protected functions
and variables. If you are writing C++ code with no legacy C code, it can become tedious to
use the namespace qualification all the time. In this case, it is safe to import all the members
of std into the global namespace to save typing:
#include <iostream>
#include <string>
using namespace std;
string mystring; // otherwise: std::string mystring;

Note that some implementations of C++ do not properly support the standard in this area.

4.3 The C++ Pre-processor

The C/C++ pre-processor scans the source code for embedded instructions prior to compilation.
These commands usually modify the code prior to compilation, although they can be used to
change the way, for example, that the compiler generates code (using #pragma instructions).

4.3.1 Pre-processor Directives

We have seen how one can use conditional statements (#ifdef/#ifndef) to ensure that code
is conditionally included. It is important to emphasize that in such cases, as in the case of the
#include directive, the code is inserted (expanded) before the compiler is invoked. There are a
number of directives, but we shall consider only one other: #define.
This directive allows us to define constants which can subsequently be used throughout our
code. Note that C++ does have a const keyword, but it applies to variables rather than to
literal values such as a number. The const of C++ is essentially the same as Java's final
keyword. In the case of a define directive, however, the pre-processor scans the code looking for
occurrences of the macro we have defined. When it finds one, it physically substitutes the value
of the defined macro into the code. For example,
#define MYNUMBER 22
...
if (variable1 == MYNUMBER)
expands to
...
if (variable1 == 22)


prior to compilation. This is known as macro expansion. Macros can accept arguments, and
refer to other previously defined macros. Thus, we might have
#define VALU 33
#define NEWVALU (VALU + 22)
A typical example of a macro with arguments is MAX(a,b), which returns the maximum of the
two values a and b:
#define MAX(a,b) ( (a) > (b) ? (a):(b) )

The two arguments to the macro occur in brackets, just as an argument list for a function.
The remaining code snippet contains the code to be inserted by the pre-processor, once the
appropriate arguments have been substituted:
if (MAX(i,j) > 3) exit(0);
else break;
expands to
if ( ((i) > (j) ? (i) : (j)) > 3) exit(0);
else break;

The above example uses the ternary operator ?:, which is a compact version of the if-then-else
statement. The operator evaluates the expression before the ? and if it is true, returns the
expression after the ?. Otherwise it returns the expression after the :. Note how the macro
allows us to simplify the code, whilst avoiding the expense of a function call. There is obviously
a trade-off between code size and speed, but in this instance the additional work required to set
up the function call suggests that a macro is a better alternative. Note the extensive use
of parentheses: although they are not necessary in this instance, you should remember that the
macro expansion process substitutes the defined statement as is. Consider the following:
#define TEST 2 + 3
if (TEST * 5 > 2)
expands to
if ( 2 + 3 * 5 > 2)
Is this really what you wanted? To avoid such issues, always make sure to protect your expression
appropriately.
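The difference that the protective parentheses make can be checked directly. In this sketch TEST_UNSAFE and TEST_SAFE are hypothetical names for the unprotected and protected forms of the TEST macro above.

```cpp
#include <cassert>

#define TEST_UNSAFE 2 + 3      // expands textually, no grouping
#define TEST_SAFE   (2 + 3)    // parentheses keep the sum together

// Expands to 2 + 3 * 5, which is evaluated as 2 + (3 * 5).
int unsafeProduct() { return TEST_UNSAFE * 5; }

// Expands to (2 + 3) * 5.
int safeProduct() { return TEST_SAFE * 5; }
```

The unprotected macro yields 17 and the protected one 25, because multiplication binds more tightly than addition once the text has been substituted.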
A #define directive can span multiple lines. In this case, each line except the last should be
terminated by a \, with no trailing space:
#define LONGDEF(a) if ( (a) > 3 || (a) < 7) { \
PerformFunc1(a); }\
else { \
exit(0); }


Remember, this code snippet will be substituted as it is defined, with the argument appropriately
expanded. In this case, surrounding the if statement with parentheses would have resulted in a
syntax error. You can also define string literals as constants:
#define MYSTRING "A char array, not an object!"
By convention the name of the macro is capitalised to distinguish it from a variable or function
call. If you need to use a constant value or string at several places in your code, define a macro
and use the macro name instead. If you later need to change the value of the macro, it only
needs to be changed in one place. Also note that one can place pre-processor directives at any
point in the source file, although it is usual to place them at the top. Thus, one can define a
macro close to its intended target.
Macros must be used with some caution. Macro code substitution can lead to some unexpected
results. Consider the following:
int a = 2, b = 3, val;
val = MAX(a++,b++);
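This code expands to:
int a = 2, b = 3, val;
val = ( (a++) > (b++) ? (a++):(b++) );
In this case the comparison increments both variables once, and the selected branch then
increments b a second time. The result is val = 4, a = 3 and b = 5, rather than the intended
val = 3, a = 3 and b = 4. Such errors can be extremely hard to find.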
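Tracing the expansion with a = 2 and b = 3 makes the problem visible. The sketch below compares the macro with an ordinary function, which evaluates each argument exactly once; maxOf, Outcome, withMacro and withFunction are hypothetical names used only for this comparison.

```cpp
#include <cassert>

#define MAX(a,b) ( (a) > (b) ? (a) : (b) )

// An ordinary function evaluates each argument exactly once.
int maxOf(int a, int b) { return a > b ? a : b; }

struct Outcome { int a; int b; int val; };

Outcome withMacro()
{
    int a = 2, b = 3;
    int val = MAX(a++, b++);   // b++ appears twice in the expansion
    Outcome r = { a, b, val };
    return r;
}

Outcome withFunction()
{
    int a = 2, b = 3;
    int val = maxOf(a++, b++); // each increment happens once
    Outcome r = { a, b, val };
    return r;
}
```

With the macro, the comparison 2 > 3 is false, so the second b++ is evaluated: val becomes 4 and b ends at 5. With the function, val is 3 and b ends at 4. This is the price paid for avoiding the function call.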

4.3.2 Pre-processor String Operations

In addition to macro expansion, the C++ pre-processor has a set of string operators which can
be useful for many tasks. The first of these is stringizing: taking an identifier and turning it into
a string of characters using the macro operator #. Consider the following example:
#define PRINT(x) cerr << #x "=" << x << endl;
This will turn the x into a character string and concatenate this to the "=" to form the final
macro substitution
// PRINT(y) expands to:
cerr << "y" "=" << y << endl; // "y" "=" becomes "y="
Note that two strings in a macro are concatenated into one string when there is only white space
separating them.
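To inspect the result of stringizing, the following sketch uses a variant of PRINT that writes to a stream supplied by the caller instead of cerr. PRINT_TO and describe are hypothetical names introduced for this illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Variant of PRINT that writes to a caller-supplied stream so the
// result can be inspected; #x becomes the text of the argument.
#define PRINT_TO(os, x) (os) << #x "=" << (x) << '\n'

std::string describe()
{
    int y = 42;
    std::ostringstream out;
    PRINT_TO(out, y);        // expands to: (out) << "y" "=" << (y) << '\n'
    PRINT_TO(out, y + 1);    // the whole expression text is stringized
    return out.str();
}
```

The second call shows that # is not restricted to single identifiers: the complete argument text, here y + 1, is turned into the string.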
Another useful pre-processor string facility is token pasting: the creation of a new identifier from
two supplied tokens using the macro operator ##. This is illustrated in the following code:


#ifdef USE_DEBUGV
#define WRAPPER(x) debug_##x
#else
#define WRAPPER(x) x
#endif
....
extern int f(int);
extern int debug_f(int);
int y;
// normal function call, unless USE_DEBUGV is defined
WRAPPER(f(y));
In this example, we have created a debug and normal version of a function; when the macro
USE_DEBUGV is defined, we wish to use the former. We have wrapped the
function so that the correct version will be called depending on the definition of the WRAPPER
macro. The preprocessor macro will expand to either debug_f(y) or simply f(y) as required.
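A complete version of the token pasting idea might look as follows. The function bodies are assumptions made for this sketch: the debug version adds 1000 to its result purely so that the two versions can be told apart, and callWrapped is a hypothetical helper.

```cpp
#include <cassert>

#define USE_DEBUGV   // remove this line to select the normal version

#ifdef USE_DEBUGV
#define WRAPPER(x) debug_##x
#else
#define WRAPPER(x) x
#endif

int f(int v)       { return v; }         // normal version
int debug_f(int v) { return v + 1000; }  // debug version, marks its result

int callWrapped(int y)
{
    return WRAPPER(f(y));   // debug_f(y) here, since USE_DEBUGV is defined
}
```

The ## operator joins debug_ to the first token of the argument, f, so the remaining tokens (y) follow the new identifier unchanged.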

4.4 Types in C++

C++ supports simple types, aggregate types and class types. Aggregate types in C++ are those
types which group data into logically connected entities.

4.4.1 Simple Types

The simple types in C++ are similar to those in Java, with a few caveats. The char type
corresponds to Java's byte type. A major difference between Java and C++ is the architecture
independence of the former. Integer ranges in C++ are determined by the particular compiler
and platform on which the application is compiled, while Java defines a standard size for these types.
The simple types are:
int: Architecture dependent integer,
float: Single precision floating point (usually IEEE),
double: Double precision floating point (usually IEEE),
short: Short integer (no wider than int, often half its width),
long: Long integer (at least as wide as int, often twice its width),
char: One byte quantity usually used to store characters.
C++ also permits long to serve as a type modifier for integers and floating point numbers.
Thus, we can have a long double type which is at least as wide as a double (how much wider,
if at all, depends on the compiler). Note that long int and long are synonyms.
Because there are such platform dependencies, C/C++ includes an operator, sizeof(), which
returns the size of a given type in bytes. Thus, on all platforms sizeof(char) will yield 1, while
sizeof(int) may vary.
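A short sketch of sizeof in use is given below. Only the size of char and the ordering of the integer sizes are guaranteed, so the example avoids assuming particular values for the other types; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// sizeof yields a value of type std::size_t, measured in bytes
// (more precisely, in units of sizeof(char)).
std::size_t charSize()   { return sizeof(char); }
std::size_t intSize()    { return sizeof(int); }
std::size_t doubleSize() { return sizeof(double); }

// Only an ordering of the integer sizes is guaranteed.
bool sizesAreOrdered()
{
    return sizeof(char) <= sizeof(short)
        && sizeof(short) <= sizeof(int)
        && sizeof(int) <= sizeof(long);
}

// sizeof can also be applied to a variable or an expression.
std::size_t elementsIn10Ints()
{
    int data[10];
    return sizeof(data) / sizeof(data[0]);
}
```

Dividing the size of an array by the size of one element, as in the last function, is a common way of obtaining the element count without hard-coding it.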
C++ also allows both signed and unsigned versions of integral types such as char and int:



signed long;
unsigned char;

Signed types can represent both positive and negative values, while unsigned types can only
represent non-negative values. In the latter case, however, the range of positive values is roughly
twice that of the signed version. The default for ints is signed, while whether a plain char is
signed or unsigned depends on the compiler and platform.
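The ranges in force on a given platform can be read from the standard header climits. The sketch below is only an illustration (the function names are hypothetical); it reports the char ranges, tests whether a plain char is signed, and shows that unsigned arithmetic wraps around rather than becoming negative.

```cpp
#include <cassert>
#include <climits>

// Ranges reported by <climits> for the platform being compiled on.
int smallestSignedChar()  { return SCHAR_MIN; }
int largestSignedChar()   { return SCHAR_MAX; }
int largestUnsignedChar() { return UCHAR_MAX; }

// True if a plain char behaves as a signed type on this platform.
bool plainCharIsSigned() { return CHAR_MIN < 0; }

// Unsigned arithmetic wraps around instead of going negative.
unsigned int zeroMinusOne()
{
    unsigned int u = 0;
    return u - 1;            // wraps to UINT_MAX
}
```

On a typical machine with 8-bit chars the signed range is -128 to 127 and the unsigned range 0 to 255, which is the "roughly twice" relationship mentioned above.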
C++ has the following variable qualifiers[2]:
extern: This indicates that the variable has been defined outside the current scope, usually in
a different file.
static: The variable is bound to the scope in which it is defined such as a class object,
function or file. This keyword has a similar effect in Java.
const: The variable cannot have its value modified from its initial value. A value must be
assigned when the variable is declared.
auto (deprecated): This ensures that the variable will be created on the stack, which means that
when it goes out of scope it will be destroyed. This is not often used, since local variables
are auto by default.
register: This instructs the compiler to use a register on the CPU to store the variable, since
this may speed up processing. This is only a hint, however, and there is no guarantee
that a register will actually be used.
volatile: The variable is protected from compiler optimisations which might corrupt the value
in certain cases (when, for example, threads are accessing shared memory).
Static variables come in two flavours: global and local. Global static variables are bound to the
source file in which they are defined and persist for the duration of the program. They cannot
be referenced from another file. Local static variables are defined within functions and are bound
to the function. Once they are initialised, their value persists across function calls. They cannot
be referenced by other functions, but they can be freely modified by the functions within which
they reside. Such variables can be used, for example, to keep track of the number of times a
given function is invoked. Static class members will be discussed in Chapter 6.
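The call-counting use of a local static variable mentioned above can be sketched as follows. The function names are hypothetical; the second function is included only for contrast.

```cpp
#include <cassert>

// Counts how many times it has been called. The static local is
// initialised once and keeps its value between calls.
int countCalls()
{
    static int calls = 0;
    ++calls;
    return calls;
}

// An ordinary (automatic) local is re-created on every call.
int countCallsWithoutStatic()
{
    int calls = 0;
    ++calls;
    return calls;   // always 1
}
```

Successive calls to countCalls() return 1, 2, 3 and so on, while the version without static returns 1 every time. No other function can read or reset the counter.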
C++ offers a type creation mechanism, or more correctly, a means to create a type alias, which
is not available in Java. One can use the typedef keyword, as follows:
typedef int Bool;
typedef struct Structure St;
typedef int (*fptr)(int,int);
In the first example, the type Bool has been created. Bool is now a proper type, and can be
used wherever an int is legal. The second example shows how one can redefine the aggregate
type Structure so that we can refer to it in a more compact manner, as St. The final (rather
esoteric) example creates a synonym, called fptr, for a pointer to a function returning an int
and taking two int arguments, which makes subsequent declarations of such a pointer far more
readable! Observe that in this final example, the peculiar syntax of function pointers requires
that the typedef statement be structured in a somewhat different way. In general, the rightmost
argument is the new synonym for the type that precedes it.

[2] Some of these keywords can appear in different contexts. Here we are concerned only with their
effect on variables.

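To see what the function pointer synonym buys us, consider the sketch below. The typedef is the one from the text; add, multiply, apply and applyAll are hypothetical functions written for this illustration.

```cpp
#include <cassert>

typedef int (*fptr)(int,int);   // as in the text

int add(int a, int b)      { return a + b; }
int multiply(int a, int b) { return a * b; }

// Without the typedef this parameter would be written
// int (*op)(int,int).
int apply(fptr op, int x, int y) { return op(x, y); }

// An array of function pointers is also easy to declare.
int applyAll(int x, int y)
{
    fptr ops[2] = { add, multiply };
    return apply(ops[0], x, y) + apply(ops[1], x, y);
}
```

The declaration fptr ops[2] would otherwise have to be written int (*ops[2])(int,int), which is considerably harder to read.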
The draft standard for C++ has introduced a new type: bool. A variable of this type may only
assume the values true or false, as in Java. Unlike Java, however, C++ converts an integer
assigned to a bool implicitly: zero becomes false and any other value becomes true.

4.4.2 Arrays

An array is a collection of items which are contiguous in memory, i.e. they are placed one after
the other, starting from the 1st element. Unlike Java, a C/C++ array is not an object and has
no methods associated with it. The programmer must keep track of the array size and make
sure it grows appropriately if more space is required. There is no bounds checking of any sort:
the program must ensure that only valid array elements are manipulated.
An array is declared in the following manner:
type name[n0][n1]...[nm];
where there can be as many dimensions as desired. Access to elements is via the usual indexed
notation, with numbering starting at zero e.g.
int iarray[3] = {0,1,2};
cout << "1st element is: " << iarray[0] << endl;
If type is a class, the constructor will be invoked when the objects are created, and the
destructor will be invoked when they go out of scope. Note that the size of the array should
be specified at compile time, although many C++ compilers will allow run-time array sizing.
In general it is preferable to use dynamic memory allocation (see Section 4.5.4) for such arrays,
since this will work with all compilers.
Some valid array declarations are shown below:
int array[10];
// Constructor is invoked for each object
MyObject obs[10];
double mx[10][12];
//an array of 10 float pointers
float *ptrarray[10];
#define ARRAY_DIM 20
float simplearray[ARRAY_DIM];
Observe that array is not a C++ keyword. If one performs array initialisation when declaring
the array, it is not necessary to include the left-most dimension: the compiler can deduce this
from the initialisation list. Consider the following:



int iargs[][2][2] = { { {1,2},{3,4} },
{ {4,7},{8,9} }, { {6,3},{2,2} } }; // missing dim. = 3

Such incomplete type specifications can also be used in argument lists:


int myFunction(char namearray[][20],
char aname[]) {...}
There is a very close relationship between arrays and pointers, which will be discussed in Section 4.5.3.
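Because an array carries no record of its own length, a function that receives one must also be told how many elements it holds. The following is a minimal sketch of that convention; the names sumArray and sumOfExample are illustrative assumptions and do not appear elsewhere in these notes.

```cpp
#include <cstddef>

// Sum the first n elements of an int array.
// The caller must supply n: the array itself does not know its size.
int sumArray(const int values[], std::size_t n) {
    int total = 0;
    for (std::size_t i = 0; i < n; i++) {
        total += values[i];   // no bounds check: i must stay below n
    }
    return total;
}

// For a true array (not a pointer parameter) the element count can be
// computed from the size of the array and the size of one element.
int sumOfExample() {
    int iarray[] = {1, 2, 3, 4};
    return sumArray(iarray, sizeof(iarray) / sizeof(iarray[0]));
}
```

The sizeof calculation only works where the array declaration is visible; inside sumArray the parameter is just a pointer, which is why the length is passed separately.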

4.4.3 Structures, Unions and Enumerations

Both C and C++ support structures. A structure is a collection of data (fields) which logically
belongs together, often called a record in other languages (such as Pascal). Database applications
make extensive use of structures since they are essentially collections of records. A structure
definition looks similar to that of a class, and C++ treats the struct keyword as a special kind
of class declaration: one in which all the members are public by default. Structures in C are not permitted
to have member functions, and do not have any kind of inheritance mechanism. C++ permits
you to use either the class or struct keyword to define a class; by convention class is used.
It is common practice to use a C-style struct for simple data structures in which inheritance
or access protection are of no consequence. An example of a database structure might be:
struct DataEntry {
char name[40];
char telnum[40];
int PersId;
};
The keyword struct is followed by the name of the structure, which then becomes a type. In
other words one can then code statements such as:
int ParseData(DataEntry AnewEntry) {
if (AnewEntry.PersId == 22) cout << "G'day Bruce!";
return 0; // the function is declared to return an int
}
Observe that the syntax for accessing a structure field is the same as that for accessing a class
variable in Java. You can define structures for use in class definitions, in which case access
protection can be provided by the encapsulating class. As with classes, you can define nested
structures.
If you need just one instance of a structure you can even declare a singleton structure:
struct {
char name[40];
char telnum[40];
int PersId;
} MyStruct;


This declares and defines a single struct called MyStruct. No other instances of this structure
can be created, since it has no associated type name.
A union is a special kind of structure which can be used for esoteric purposes such as low-level
data translation. A union consists of a region of memory which is shared by the members of a
structure. For example, consider the following:
union Translate {
char chars[4];
int value;
};
In this case, we can refer to the data represented by this entity in two ways: either through a
character array of length 4 or through an integer (assumed here to occupy 4 bytes). Thus, if we were receiving a stream of
byte data, and we knew that the data represented an integer, we could use the value field to
extract the integer value corresponding to each 4-byte sequence. Note that standard C++ only
guarantees the value of the member that was written most recently, so reading value after
writing chars relies on compiler-specific behaviour. It is up to the programmer to
keep track of which field is currently in use. Unions are not commonly used, and are referred to
here for completeness only. They are a legacy from the days in which C was used extensively
for low-level system programming. Although a union can contain member functions (in C++),
it may not be used in an inheritance hierarchy.
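One common way of keeping track of which union field is in use is to store a small tag next to the union. The sketch below illustrates the idea; the names ValueKind, TaggedValue, makeInt, makeFloat and asDouble are hypothetical and chosen only for this example.

```cpp
// The tag records which member of the union was written last.
enum ValueKind { HOLDS_INT, HOLDS_FLOAT };

struct TaggedValue {
    ValueKind kind;
    union {
        int   ivalue;
        float fvalue;
    } data;            // both members share the same storage
};

TaggedValue makeInt(int v) {
    TaggedValue t;
    t.kind = HOLDS_INT;
    t.data.ivalue = v;
    return t;
}

TaggedValue makeFloat(float v) {
    TaggedValue t;
    t.kind = HOLDS_FLOAT;
    t.data.fvalue = v;
    return t;
}

// Consult the tag first, so that only the member in use is ever read.
double asDouble(TaggedValue t) {
    if (t.kind == HOLDS_INT) return t.data.ivalue;
    return t.data.fvalue;
}
```

Every function that writes to the union must also update the tag; the compiler does not do this for you.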
One final point to note concerns assignment of one structure to another. C++ defines a basic
field-by-field copy for this aggregate type. However, if a particular field is a pointer to some
other structure, that structure will not be duplicated: the copy is shallow. This issue will be
dealt with in more detail when we consider class copy constructors.
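The effect of a shallow copy can be seen in a few lines. This is a sketch only: the structure Buffer and the function copyIsShallow are assumptions made for illustration.

```cpp
struct Buffer {
    int  length;
    int *data;     // points at storage that lives outside the structure
};

// Returns true if a copy made by assignment shares its storage
// with the original while keeping its own copy of plain fields.
bool copyIsShallow() {
    int storage[3] = {1, 2, 3};
    Buffer a = {3, storage};
    Buffer b;
    b = a;             // field-by-field copy: b.data now equals a.data
    b.data[0] = 99;    // writes through the shared pointer
    b.length = 1;      // plain fields are independent copies
    return a.data[0] == 99 && a.length == 3;
}
```

After the assignment both structures point at the same three integers, so a change made through b is visible through a, while changing b.length leaves a.length alone.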
Another useful construct is the enumeration. An enumeration defines a set of named integer
constants which can be used wherever scope considerations allow. The constants are integers,
and can be used to encode useful information. For example, we might define the following:
enum DaysOfWeek {SUN,MON,TUE,WED,THUR,FRI,SAT} days;
which defines a new enumeration called DaysOfWeek and declares a variable, days, of this type.
One can then write code like the following:
void getDay(DaysOfWeek dd) {
if (dd == SUN) cout << "It's Sunday";
else if (dd == MON) cout << "It's Monday";
... }
which is significantly more efficient than doing string comparisons, but maintains readability.
In this case, if dd was cast to an int, the integer values would range from 0 to 6. By default
the first field of the enumeration starts at 0. One can reset the value of any enumeration field.
Subsequent fields will increment based on the specified integer value:
enum OpCodes {FLAG=3, OPFAIL, NULLP, OPPASS=7,
NOP} myopcode;
In this example, the corresponding integer values will be 3, 4, 5, 7 and 8. Because C++ is a
strongly typed language, and an enumeration defines a type, it is not permissible to assign an
int to an enumeration variable without an explicit cast. The set of constants defined by the enumeration are bound to
the appropriate scope (usually global or class).
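The numbering rules and the one-way nature of the conversion can be checked with a short sketch. The enumeration repeats the OpCodes example above; the helper names codeValue and codeFromInt are assumptions introduced here.

```cpp
enum OpCodes { FLAG = 3, OPFAIL, NULLP, OPPASS = 7, NOP };

// An enumerator converts to int without any cast...
int codeValue(OpCodes c) {
    return c;
}

// ...but an int needs an explicit cast to become an OpCodes value.
// The caller is responsible for passing a value that makes sense.
OpCodes codeFromInt(int raw) {
    return (OpCodes)raw;
}
```

For instance, codeValue(OPFAIL) evaluates to 4 and codeFromInt(5) yields NULLP.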


4.4.4 Class Types

C++ class variables are defined using the class keyword. They look very similar to Java class
definitions, although there are many subtle differences. These will be highlighted in Chapter 6.
A class consists of data and methods/functions. If there are no functions, a class is essentially an
old C-style struct. If the class is associated with methods, these need to be defined. In Java the
method code is inserted into the class definition. In C++, one need not do this; in fact, there are
performance implications in such code inlining, and implementors are encouraged to separate
the method code from the declarations, by placing this code in a separate implementation file.
Once the class has been properly defined, we have access to a new custom type and we can
declare instances of this type. A simple example is given below. The header file, MyDetails,
containing the class declaration, is
// Header file for MyDetails
#ifndef MYDETAILS
#define MYDETAILS
class MyDetails {
private:
char name[30];
int IDnumber;
public:
void PrintMyDetails(void);
void UpdateMyDetails(char nm[], int id);
};
#endif
Unlike Java, C++ access specifiers (public, private etc.) apply to groups of members. Everything following private:, for example, is considered to be private. When the next access
keyword, public:, is encountered the access of the members that follow changes to public, and
so on.
The implementation file, MyDetails.cpp, is then
#include <iostream>
#include "MyDetails"
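void MyDetails::PrintMyDetails(void) {
std::cout << name << std::endl << IDnumber << std::endl;
}
void MyDetails::UpdateMyDetails(char nm[], int id) {
int i = 0;
// copy at most 29 characters so that the terminator still fits in name[30]
while (i < 29 && nm[i] != '\0') { name[i] = nm[i]; i++; }
name[i] = '\0'; // terminate the copied string
IDnumber = id;
}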
Observe that the methods have to be associated with the class declaration, since they have been
moved outside of the class. This is done using the scope operator :: and the class name.
Finally, the class is used by a programmer, who needs to include the class declaration (header
file) before he can access this new type:


#include <iostream>
#include "MyDetails"
using namespace std;
int main(void) {
char name[20] = "Joe Bloggs";
int IDnum = 1234567;
MyDetails details;
details.UpdateMyDetails(name, IDnum);
cout << "Personal Information follows:\n";
details.PrintMyDetails();
return 0;
}
As you can see, accessing a class member requires the . operator, as in Java. The main()
function, which provides an entry point into your program, is never part of a class. It is a free
function and should always be in some sort of driver source file.
The details of class structure will be fully explored in Chapter 6.

4.4.5 Variable Initialisers

C++ allows variable initialisers for all types. Class initialisation is a more complex topic which
we will leave for later chapters. As in Java, a variable can be initialised when it is created, or
a value can be assigned later. It is good practice to initialise variables when they are created,
since it is very easy to lose track of a variable and try to access it before a valid value has been
assigned. This is particularly so with pointers, where accessing an uninitialised pointer can lead
to bizarre errors and even an application crash.
The compiler performs certain default initialisations for you. For example, global and static
variables are initialised to zero.
Initialisation is accomplished by assigning a value at declaration:
float myfloat = 33.0;
To initialise more complex entities such as structures or arrays, one uses initialiser lists. In the
case of arrays, for example, we can write the following:
double mydbl[] = {1.2, 2.2, 3.3, 5.5};
char mystring[] = "A string constant";
char String[] = {'a', 'b', 'c', '\0'};
The first example initialises a double array of size 4 with the indicated values. Observe that
the array dimension can be deduced by the compiler based on the number of items in the
initialiser list. The second example shows that a string constant can be used to initialise a char
array. The compiler creates sufficient space to contain all the characters and the terminating


null byte. In the final example, a string has been initialised in a more laborious fashion: note
the inclusion of the final null character which terminates the string.
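The extra byte reserved for the terminator can be observed directly. The following sketch uses two hypothetical helper functions, storageForAbc and lengthOfAbc, to compare the storage the compiler sets aside with the length that the standard library function std::strlen reports.

```cpp
#include <cstddef>
#include <cstring>

// Size in bytes of an array initialised from a string constant:
// the three visible characters plus the terminating null byte.
std::size_t storageForAbc() {
    char text[] = "abc";
    return sizeof(text);
}

// std::strlen counts characters up to, but not including, the null byte.
std::size_t lengthOfAbc() {
    char text[] = {'a', 'b', 'c', '\0'};
    return std::strlen(text);
}
```

Both arrays occupy four bytes, but the string they hold has length three. Omitting the '\0' from the second initialiser would leave std::strlen reading past the end of the array.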
Structures are initialised similarly, although now each item in the list is assigned to a particular
field within the structure. The assignment proceeds in the order in which fields are encountered,
and can include sub-structure initialisers. Consider the following example:
struct S1 { float fval; int ival; char name[40]; };
struct Record { char IdCode; S1 details; };
Record r1 = { 'A', {1.2,15,"BillyBob"} };
If a particular field requires more than one item, we simply surround the field initialiser values
with curly braces. To initialise multi-dimensional arrays we use a similar strategy:
double array[3][3][2] = {
{ {1.0,2.0}, {3.0,4.0},{5.0,6.0}},
{ {7.0,8.0}, {9.0,10.0}, {1.0,12.0}},
{ {3.0,4.0}, {5.0,1.0}, {2.0,2.0}} };
In this example we have an array with 3 dimensions: an array of arrays of arrays! The
outermost initialiser list is associated with the first dimension, the list nested in this one is
associated with the next dimension, and so on.

4.4.6 Type Conversions

C++ enforces strong type checking. In other words, unlike C, where almost any kind of variable
can be assigned to another, C++ insists that a type cast be used for non-trivial conversions. In
C++, as in Java, type conversions may be automatic or explicit. Automatic type conversions
occur when, for example, the compiler attempts to match types between a parameter list and a
function call. The following list illustrates a number of automatic type conversions:
Simple Types: In arithmetic expressions, C++ will convert the type of each argument to match
that of the widest type. Thus, if you are adding a float and an int, the integer will
be promoted to a float. Similarly, working with a float and a double will ensure that
the float is promoted to a double. During assignment, the type on the right hand side
of the statement will be converted to the target type (if possible). When a class object is
involved, special considerations come into play, since C++ allows operator overloading.
Function Arguments and Return Type: C++ will attempt to convert the parameters passed
into a function to the appropriate types, and will also attempt to convert the return type,
if required.
Control Statements: For statements such as if, while and switch, C++ will convert values
as necessary to ensure that an integral type is present in the test. Since the introduction
of the type bool, control structures which require Boolean truth values are cast to this
type.
Pointer Conversions: Array and function names are converted to pointers of the appropriate
kind. Automatic conversion from a pointer of any type to void* may also occur, but
explicit casts are required for the other direction. The integer 0 will be cast to the NULL
pointer, 0x0, if required.


Class Conversions: Class objects may be converted automatically, provided appropriately defined overloaded operators exist. This will be discussed in the next chapter.
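A few of the automatic conversions listed above can be sketched as follows. The function names are assumptions made for this illustration; each function isolates one rule.

```cpp
// Both operands are int, so the division is integral and the
// fractional part is discarded.
int intQuotient() {
    return 7 / 2;
}

// One operand is a double, so the int is promoted before dividing.
double mixedQuotient() {
    return 7 / 2.0;
}

// On assignment the right hand side is converted to the target type;
// converting a double to an int discards the fractional part.
int assignedFromDouble() {
    int n = 0;
    double d = 3.9;
    n = d;
    return n;
}

// In a control statement any non-zero value is treated as true.
bool isNonZero(int v) {
    if (v) return true;
    return false;
}
```

Note that the conversion in assignedFromDouble loses information silently, although some compilers can be asked to warn about it.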
If an automatic type conversion is not possible, the programmer can force a conversion using the
cast operator. C++ supports two variants of this operator:
(type)old_type; // eg: (int)5 or (int)(x+2)
type(old_type); // eg: int(5) or int(x+2)

The 1st form is identical to the cast operator used in Java and C, but the second is only available
in C++. The argument of the cast operator may be an lvalue or an rvalue, and the converted
value is returned by the operator. Explicit type conversion may cause unforeseen consequences:
for example, converting a pointer to a short, and then converting back to a pointer type will
result in truncation of the address information. Subsequent use of the pointer is likely to corrupt
the program. The cast operator can be overloaded in a class: we shall see examples of this later.
The C++ draft standard has introduced 4 new type conversion operators: static_cast, dynamic_cast,
const_cast and reinterpret_cast. These operators will not be used here, and are mentioned
for completeness only.
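The two cast spellings, and the loss of information that a narrowing cast can cause, are sketched below. The function names are illustrative assumptions, and lowByte assumes a platform on which a char is 8 bits wide.

```cpp
// Both spellings perform the same conversion from double to int.
int castPrefix(double x) {
    return (int)x;       // form shared with C and Java
}

int castFunctional(double x) {
    return int(x);       // form available only in C++
}

// Casting to a narrower type discards the bits that do not fit:
// with an 8-bit char only the low byte of the value survives.
int lowByte(int value) {
    return (unsigned char)value;
}
```

With 8-bit chars, lowByte(300) evaluates to 44, since 300 = 256 + 44. The same kind of truncation is what damages an address that is cast to a short and back.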

4.5 Pointers and References

Pointers are fundamental to C/C++ and are conceptually difficult to visualise for many new to
programming. A pointer is simply the address of an item in memory, while a pointer variable is
a variable which contains such an address. Memory locations hold data, and the usefulness of
pointers arises from the fact that they point the way to the location of the data in memory.
We cannot manipulate data if we do not know where it resides in memory. In many cases, the
compiler simplifies this task for us, by providing convenient labels (variable names) by which we
can refer to these memory locations, but there are restrictions imposed by the language which
mean that this is insufficient for our purposes.
The most common reason for requiring a pointer is that we may be expected to deal with a
memory reference that is not coded explicitly into our program. Say, for example, that during
code execution, the operating system allows our program access to a segment of shared memory.
How will it communicate the location of the data? The code is static: it has been compiled,
and the executable cannot change. However, if the program has been written correctly, the OS
can return a pointer to the block of shared memory, and this value can then be used to gain
access to that memory. In this example, one would make a system function call requesting access
to the memory, and would then receive the pointer value (the starting address) which could be
placed in a pointer variable for later use.
A pointer variable is declared as follows:
type * varname;
The * signifies that varname is a pointer to type, i.e. it will hold the address of an item of that type. Note
that one may mix pointer and non-pointer variables in a declaration:
int *iptr, j, *k; // iptr is an int*, j is an int, k is an int*


A pointer provides an entry point into memory: we can navigate through memory with respect
to the pointer. Consider the following: we have a block of memory which is referred to by a
pointer ptr. How do we access the actual data to which the pointer refers? The dereference
operator * provides that function. So, if we know that a data item resides at address 0x100000
(i.e. ptr = 0x100000) we can use *ptr to access the actual data item itself. Here's an example:
float *ptr;
ptr = GetFloatFromSharedMemory();
cout << "The value of 1st float is " << *ptr << endl;
After the function call, the pointer variable ptr contains the address of the data in memory.
In order to print that data, rather than the address, we must dereference the pointer.
Although it is usually far nicer to let the compiler deal with issues of addressing, we often
need to determine the memory address of a particular variable. C++ provides the address-of
operator, &, to achieve this. Thus we might code something like the following:
float *ptr;
float afval = 32, afval2 = 22;
ptr = &afval; // ptr = memory address of afval.
*ptr = afval2; // the value of afval changed
// to 22 through a ptr access
cerr << "The value of afval is " << afval <<
" and it resides at memory location " <<
ptr << endl;
Addresses can only refer to one item in memory. However, if we know where a given item is
located, and how far away any other item is, we can use the specified pointer to walk about
memory and reference any item we desire. This is where the danger arises, since it is easy to
move to the wrong location and if the data is then modified, an application crash might result.
How then do we move about memory? Assuming that a function call has returned a block
of 1024 float values, how can we access each one in turn? We simply shift the pointer by
adding or subtracting values to it! Remember, memory is assumed to be a contiguous sequence
of locations, starting at 0x0 and increasing. Thus, if we have a block of memory, which is by
definition contiguous, we can increment or decrement the memory address to move onto the next
location. Consider the following piece of code:
float *fptr;
fptr = GetBlockOfFloats(1024);
cout << "Float at fptr is " << *fptr <<
"; float at position fptr + 20 is " <<
*(fptr+20) << endl;
All we have done here, is print out the value of the float at the address pointed to by fptr,
which is the 1st float of 1024. We then print out the float which is 20 steps higher in memory
i.e. the 21st float in the block. Observe that the address is updated in units of the appropriate
size: if we add one to the pointer, we move one float forward, not one byte or one word. If the
pointer type was int*, adding one would adjust the pointer in units of size int.
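The following sketch walks through a block of floats with a pointer and checks the claim that adding one to a float pointer moves it by the size of one float. The function names sumBlock and stepInBytes are assumptions made for this example.

```cpp
#include <cstddef>

// Sum n floats by stepping a pointer through the block.
float sumBlock(const float *block, std::size_t n) {
    float total = 0.0f;
    const float *end = block + n;          // one past the last element
    for (const float *p = block; p != end; p++) {
        total += *p;                       // dereference, then move on
    }
    return total;
}

// Measure, in bytes, how far a float pointer moves when 1 is added.
// Casting to char* lets us count individual bytes.
std::size_t stepInBytes() {
    float block[2] = {0.0f, 0.0f};
    const char *first  = (const char *)&block[0];
    const char *second = (const char *)(&block[0] + 1);
    return (std::size_t)(second - first);
}
```

The value returned by stepInBytes is sizeof(float), whatever that happens to be on the machine in use.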


Since pointers are of such fundamental importance in C++, and one need not initialise the
pointer when it is declared, a means is required to indicate that a pointer does not yet refer to valid data. C++
defines the constant NULL, which is the address 0x0. If you create a pointer variable you
should initialise it to NULL to reduce the chance of a pointer error later. The Java keyword null
undefines an object, and may only be assigned to object references. Many functions, such as the C allocation function malloc(),
will return NULL if they cannot honour a memory request, so it is good practice to insert code
to check that a valid (non-NULL) address has been returned. In standard C++ the plain new
operator signals failure by throwing a std::bad_alloc exception instead; the std::nothrow form
(declared in the header <new>), used below, returns NULL on failure.
float *ptr = NULL;
if ((ptr = new (std::nothrow) float [30000000]) == NULL)
{ cerr << "Allocation error!"; exit(1); }
This code snippet shows how one should proceed; note that C++ allows any valid expression in
parentheses, including assignments. In this case, the value that is assigned to ptr is substituted
for the parenthesised expression.

4.5.1 Multiple Indirection

In certain cases it is desirable to store the address of a variable in memory, which itself contains
an address. This is known as multiple indirection, and the indirection may continue for as
many levels as required. Each new level requires that one add an additional * after the type
declaration, thus for two levels of indirection, we would write:
float **ptr;
We can see that (*ptr) has type float *, therefore to access an actual float element (rather
than an address), we must use *(*ptr) or simply **ptr. Note that an address will always be
the same size for a given architecture, so the requirement for additional *s does not mean that
a different type of address will be used. Rather, it informs the compiler that when it sees the
code sequence *ptr for example, this will not be an actual float but the address of a float.
Although multiple indirection may seem esoteric it has two very practical uses. Firstly, it may
be used to provide a call by reference facility for function parameters, and secondly, it provides
the ability to create complex structures in memory, as shown in Section 4.5.4 below.
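Two levels of indirection can be sketched with a pair of small functions. The names readThrough and retarget are hypothetical and serve only to illustrate the notation.

```cpp
// Follow both levels of indirection to reach the float itself.
float readThrough(float **pp) {
    return **pp;        // *pp is a float*, **pp is the float
}

// Change where the caller's pointer points: *pp is the caller's float*.
void retarget(float **pp, float *newTarget) {
    *pp = newTarget;
}
```

Given float a = 1.5f, b = 2.5f; and float *p = &a;, the call readThrough(&p) yields 1.5. After retarget(&p, &b) the pointer p holds the address of b, so the same call yields 2.5.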

4.5.2 The Pointer void*

C++ allows the declaration of a generic pointer, denoted by


void * ptr;
This pointer has no specific type, and before it can be used it must be cast to the appropriate
type:
float *fptr = (float*)ptr;
Some functions, particularly those based on earlier C code, return void * pointers, or have
such pointers in their argument list. In the latter case, a valid pointer type must be substituted
for each argument. An example of such a function prototype is



void* test(void* something);

in which the argument accepts a pointer to some unspecified type, and can return a pointer to
anything. Such a construct does not exist in Java, since it violates strong typing. However, it
can be very useful, since you can write functions which process items of arbitrary type without
requiring complex constructs such as templates.
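As a sketch of such a generic function, the routine below exchanges the contents of two items of any simple type. The name swapBytes is an assumption, and the approach is only suitable for plain data (built-in types and C-style structs), not for objects that manage resources.

```cpp
#include <cstddef>

// Swap two items of the same size, whatever their type.
// The void* arguments are cast to unsigned char* so that the
// items can be exchanged one byte at a time.
void swapBytes(void *a, void *b, std::size_t size) {
    unsigned char *pa = (unsigned char *)a;
    unsigned char *pb = (unsigned char *)b;
    for (std::size_t i = 0; i < size; i++) {
        unsigned char tmp = pa[i];
        pa[i] = pb[i];
        pb[i] = tmp;
    }
}
```

A call such as swapBytes(&x, &y, sizeof(int)) works for two ints, and the same function handles two doubles if sizeof(double) is passed. The price of this flexibility is that the compiler can no longer check that both arguments really have the same type.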

4.5.3 Relationship between Arrays and Pointers

C++ considers the array name to be a pointer to the first element of the array. Thus,
char mystring[40];
cout << "The character at location 20 is "
<< mystring[20] << " which is the same as "
<< *(mystring + 20) << endl;
and we may use the array name wherever a pointer (of the correct type) may be used.
Note that arrays are indexed from zero, as one would expect. Any pointer in C++ can be
dereferenced using either an array index or through the dereference operator. This equivalence
arises from the way strings or contiguous memory structures are handled in assembler. It is far
more efficient to pass around a pointer to the beginning of a memory block, than to pass around
the actual block itself. The code snippet also illustrates an important point: the indirection
operator has higher precedence than +, so you must enclose the expression in parentheses before
dereferencing.
Although it is not obvious, even multi-dimensional arrays are stored as a sequence of consecutive
items in memory. The mapping can become a little confusing, but for 2D arrays we have an array
of arrays, in which each row of the array follows the preceding row, from top to bottom. The
following code snippet illustrates this:
char arrays[20][10], *ptr;
int i = 10, j = 9;
ptr = &arrays[0][0]; // ptr to start of array of char; Note types!!
cout << "char at (i,j) is " << arrays[i][j] <<
" which is also " <<
*(ptr + 10*i + j) << endl;
One can also refer to sub-arrays; in the above example, this means each row, of which there
are 20. Thus we can write the following:
cout << "10th subarray is " << arrays[9] << endl;
in which arrays[9] is interpreted as a pointer to the 1st element of the 10th row. Because
a pointer to a char is how C++ represents a string, the above statement will attempt to
print all the characters in the indicated row. If the string has been properly constructed (null
terminated), all will be well, otherwise the program is likely to crash. One could also code this
pointer explicitly: &arrays[9][0]. This will point to the same character, and because of the
way the data is organised in memory, successive characters will be part of the appropriate row.
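The idea that a row of a 2D array can be treated as a pointer to its first element is sketched below for an int array. The constants ROWS and COLS and the three function names are assumptions made for this illustration.

```cpp
const int ROWS = 3;
const int COLS = 4;

// Fill the grid so that element (r, c) holds r*10 + c.
void fillGrid(int grid[ROWS][COLS]) {
    for (int r = 0; r < ROWS; r++)
        for (int c = 0; c < COLS; c++)
            grid[r][c] = r * 10 + c;
}

// Read element (r, c) with the usual array syntax.
int viaIndex(int grid[][COLS], int r, int c) {
    return grid[r][c];
}

// Read the same element through a row pointer: grid[r] is converted
// to a pointer to the first element of row r.
int viaRowPointer(int grid[][COLS], int r, int c) {
    int *row = grid[r];
    return *(row + c);
}
```

Both access functions return the same value for any valid (r, c). Note that the parameter declarations omit the left-most dimension, as described in Section 4.4.2.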

4.5.4 Dynamic Allocation

For many programs arrays are inadequate since they do not allow one to increase the allocated
memory space as the program runs: they are static constructs in that respect. The solution
is to use dynamic memory allocation. Memory acquired in this way is created on the heap,
and will persist until it is freed up by the programmer or the program terminates. This should
be contrasted to auto variables (local variables), created on the stack, which are destroyed
automatically when they go out of scope.
To acquire a block of memory sufficient for N items, we use the new operator:
type *ptr = new type [N];
where type is a valid type. This allocates space for the required number of items of the indicated
type and returns a pointer to the start of the allocated memory. If the type is a class, then the
appropriate constructors will be called to initialise each object. Because C++ does not use a
garbage collection strategy to manage memory allocation, it is your responsibility to destroy the
memory once you have finished using it. Although the operating system will clean up when your
program exits, you only have a finite amount of memory to work with so it is usually sensible to
free up memory when you no longer need it. In Java you can set the object reference to null.
In C++ you must use the delete operator:
delete [] ptr;
For object types, delete invokes the destructor to ensure that proper clean up takes place.
It is essential that you call delete with a pointer to the start of the allocated block. If you shift
the pointer elsewhere, you will corrupt the memory management system and your program will
almost certainly crash. If you require a single item, the above statements become:
type *ptr = new type;
delete ptr;
In C the functions malloc() and free() are used to manage dynamic memory allocation. Although
you can mix both systems, this is not advisable since attempting to free memory with the wrong
function will corrupt the memory management system.
It is often useful to dynamically allocate a 2D array, which we can access using the usual array
syntax. The following code shows how to do this:
float **array2D;
int rows = 10, cols = 20, k, l;
array2D = new float* [rows];
for (int i=0; i < rows; i++)
array2D[i] = new float [cols];
...
if (array2D[k][l] < 20.0) ...


This works because we have declared array2D to have type "pointer to pointer to float". The
first new creates a 1D array in which each element is a pointer to a float. We then allocate
the actual memory pointed to by each element using a second new operation. Remember that
array2D contains an address. The fact that the value at that address contains another address
is irrelevant: we can use an array syntax to manipulate the pointer array2D. Thus, array2D[k]
returns an address, and we can then dereference this address as:
*(array2D[k] + l) or
(array2D[k])[l] or
array2D[k][l].
To recap: any pointer variable can be dereferenced using either the * operator, or array indexing.
The type that is returned will depend on the type of the pointer.
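Memory acquired in two stages must also be released in two stages. The sketch below wraps the allocation shown earlier in a function and adds the matching clean-up; the names create2D and destroy2D are assumptions made for this example.

```cpp
// Allocate a rows x cols array of float with every element set to zero.
float **create2D(int rows, int cols) {
    float **a = new float*[rows];          // the array of row pointers
    for (int i = 0; i < rows; i++) {
        a[i] = new float[cols];            // storage for one row
        for (int j = 0; j < cols; j++)
            a[i][j] = 0.0f;
    }
    return a;
}

// Release in the reverse order: each row first, then the row pointers.
void destroy2D(float **a, int rows) {
    for (int i = 0; i < rows; i++)
        delete [] a[i];
    delete [] a;
}
```

Deleting only the outer array would leave every row allocated but unreachable, which is a memory leak.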

4.5.5 Function Pointers

C++ contains some rather exotic constructs; one of these is the function pointer. Just like a
variable, a function resides in memory, and one can find the address of the entry point into
this function. Java provides no such facility. The name of the function serves as a pointer
to the entry point of that function, just as an array name provides access to the associated
memory. Of course, a function is a very different entity, and the syntax for function pointers
is somewhat obscure. In the first instance, a function takes arguments, and may return a
value. If strong type-checking is to be enforced, the compiler needs to be informed what
the types of these entities are. Assuming we wish to define a pointer to a function which takes
no arguments, and returns a float, how might we do this? Here are two examples:
float (*mfptr)(void);
void * (*vfptr)(void *, long, char);
The first shows a function pointer, mfptr, which points to a function which takes no arguments
and returns a float. Note that in the above examples, the () are required around the pointer
name; if they are not present, the compiler will assume this is a function declaration!
What about more complex constructs, such as arrays of function pointers? The following
declares an array of function pointers corresponding to the second example above:
void* (*farray[10])(void *, long, char) ;
Function pointers are dereferenced by a standard function invocation:
mfptr()
farray[5](fptr,100,'f')
One use of function pointers is to allow users to pass their own specialised functions into a
generic routine, such as QuickSort. Other applications might involve databases of functions or
compiler design.
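The idea of passing a user-supplied function into a generic routine can be sketched as follows. A simple insertion sort stands in for QuickSort to keep the example short; the typedef CompareFn and all the function names are assumptions made for this illustration.

```cpp
#include <cstddef>

// Pointer to a caller-supplied comparison function, which returns
// true if its first argument should come before its second.
typedef bool (*CompareFn)(int, int);

bool ascending(int a, int b)  { return a < b; }
bool descending(int a, int b) { return a > b; }

// Insertion sort that asks 'before' how any two items should be ordered.
void sortInts(int values[], std::size_t n, CompareFn before) {
    for (std::size_t i = 1; i < n; i++) {
        int key = values[i];
        std::size_t j = i;
        // shift items along while key belongs in front of them
        while (j > 0 && before(key, values[j - 1])) {
            values[j] = values[j - 1];
            j--;
        }
        values[j] = key;
    }
}
```

The same routine sorts in either direction: sortInts(v, n, ascending) or sortInts(v, n, descending). The function name is passed without parentheses, since it is the address of the function that is required and not the result of calling it.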

4.5.6 References

C++ introduces the idea of a reference to a variable. Note that this is not the same as a Java
reference! A Java reference is simply a disguised pointer. To see why C++ references are useful, we must first
understand how parameters are passed to functions. When a function is called with arguments,
the contents of each argument are copied into each formal parameter (the variables listed in the
argument list of the function). This is known as call by value and is the usual way function
arguments are processed. For example,
void MyFunc (int tt) {
tt = 22;
}
has a formal int parameter called tt. Note that tt is assigned a value of 22 within the function.
However, consider the following function call:
int myval = 33;
MyFunc(myval);
cout << " Value is now: " << myval << endl;

The value which is printed out will be 33, not 22, as one might have thought. This occurs
because the variable myval is passed by value, leaving the external variable unaffected by the
operation of the function. The variable within the function changed, but this has no bearing on
events outside of the function! So how do we use functions to modify variables? There are two
choices: we can allow the function to return a value, which we can then assign to a variable,
or we can pass the parameter by reference. If we pass an argument by reference, the function
associates the formal parameter with the argument we are passing into the function. They are,
in fact, one and the same and changes made to the parameter will actually be applied to the
variable argument.
We can therefore rewrite the function above to accept a reference to an int:
void MyFunc (int& tt) {
tt = 22;
}
Note that a reference to a type is written as type&. When we execute the function, the variable
which is passed into the function will be changed to have value 22, as we wished. Reference
arguments cannot usually accept constant values. This makes sense since a reference is really
a disguised pointer, and you cannot take the address of a constant value such as the number
5. However, if we use a constant reference, declared as const type&, we can pass in constant
values. In this case, the compiler is notified that we will not try to modify the (constant) value
we pass in and will thus allow the code to compile (it creates space for the constant, and connects
the reference to this).
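The difference between a plain reference and a constant reference can be sketched with two small functions. The names addOne and twice are assumptions made for this example.

```cpp
// Plain reference: the function may modify the caller's variable,
// so the argument must be a variable and not a constant.
void addOne(int& n) {
    n = n + 1;
}

// Constant reference: the value may be read but not modified,
// so constants such as 5 are accepted as arguments.
int twice(const int& n) {
    return 2 * n;
}
```

Given int v = 4;, the call addOne(v) leaves v equal to 5, and both twice(v) and twice(5) are legal. The call addOne(5), by contrast, is rejected by the compiler.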
If the type has fields, such as a class or aggregate type, we simply use the . notation to
access/modify fields within the function.
Prior to the introduction of references, pointer arguments were used to emulate call by reference (as they still are in C).
It is important to understand how this works, since this practice is still in widespread use. The
following code shows how the above example can be recoded to use this approach:



void MyFunc (int *tt) {
*tt = 22;
}
int myval = 33;
MyFunc(&myval);
cout << " Value is now: " << myval << endl;

The output will be 22, the same as the answer generated when a reference was used. You can see
that the function has been redefined to accept a pointer to the appropriate argument. Within
the function body, that pointer is dereferenced and the new value is assigned. Although a copy
of the variable myval is passed into the function, that copy contains the address of the variable
we wish to modify, rather than the variable itself. All we need access to in order to modify
a variable is the address. The dereference statement within the function modifies the variable
by changing its contents directly. The pointer variable which is passed into the function is not
modified in any way. Indeed, if we removed the * within the function, the assignment would
apply to the pointer variable, and since only a copy is modified, this change would not affect the
actual argument passed into the function. If we wish to modify a pointer variable, we must pass
a pointer to a pointer as an argument, and dereference it appropriately within the function:
void MyFunc(float **ptr) {
    // modify a float pointer
    *ptr = NULL;
}
float value = 1.0f;
float *TopLevelPointer = &value; // points at a real float
MyFunc(&TopLevelPointer);
// TopLevelPointer will now be NULL
The fundamental point to remember is this: unless a reference argument is present, all variables
are passed by value. For large objects it is generally more efficient to pass arguments by reference,
since a reference is essentially a pointer and an address has a small fixed size. If you pass a class object
or structure by value, the entire object is duplicated and passed into the function, which is both
time consuming and wasteful of space.
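The difference between the two calling conventions can be sketched as follows. The Record type and the function names are invented purely for illustration; the point is only that the by-value version copies the whole object on every call, while the const reference version does not and may also be handed a constant such as 5.

```cpp
#include <string>
#include <vector>

// Hypothetical record type, used only for this illustration.
struct Record {
    std::string name;
    std::vector<double> samples;
};

// By value: the whole Record (including its vector) is copied on every call.
double totalByValue(Record r) {
    double sum = 0.0;
    for (std::vector<double>::size_type k = 0; k < r.samples.size(); k++)
        sum += r.samples[k];
    return sum;
}

// By const reference: no copy is made, and the function cannot modify r.
double totalByRef(const Record& r) {
    double sum = 0.0;
    for (std::vector<double>::size_type k = 0; k < r.samples.size(); k++)
        sum += r.samples[k];
    return sum;
}

// A const reference may also bind to a constant value such as 5.
int twice(const int& x) { return 2 * x; }
```

Both total functions return the same answer; only the cost of the call differs. A call such as twice(5) compiles because the parameter is a const reference.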
References may also be defined for variables other than formal arguments of a function. However,
in this case,
the reference must be initialised to refer to an existing variable, and
the reference cannot refer to any other variable: it is a fixed alias.
Here is an example:
int N = 2, M = 7;
int& myint = N;
Once this code sequence has been executed, myint and N are the same item. The statement
myint = M will not change the reference (which is illegal); it will simply assign the value of M to
N, and therefore to myint as well.
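A small sketch of this fixed-alias behaviour is given below. The function name aliasDemo is an assumption made for the example; it simply checks that assigning through the reference changes N, and that the reference still names N afterwards.

```cpp
// Returns true if the reference behaves as a fixed alias for N.
bool aliasDemo() {
    int N = 2, M = 7;
    int& myint = N;   // myint is now another name for N
    myint = M;        // assigns 7 to N; does NOT rebind myint to M
    M = 100;          // changing M afterwards has no effect on N
    return N == 7 && myint == 7 && &myint == &N;
}
```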

4.6

Containers

If we wish to store a number of items of the same type, and access each item in a consistent
manner, we can use a C++ container. Containers come in many different flavours, but they all
share some basic features. At the very least, you can iterate through a container, to select each
item in turn. We will discuss containers in great detail in Chapter 8; for the moment we will
introduce two containers that we will encounter at many points in these notes: vector and list.
These containers are logically very different, but can be accessed consistently because of the
way C++ defines them. A vector is a random-access container: we can set/read any item it
contains using array indexing. To access an item in a list, we must move forwards from the
start of the list or backwards from its end. The vector can, however, also be traversed from
front to back or back to front: we simply increment or decrement the index value.
To declare a container to hold items of type object, we use the following syntax:
#include <list>
#include <vector>
using namespace std;
vector<object > MyVector;
list<object > MyList;
Each of these containers has an associated header file, which has the same name as the container itself.
Containers are defined within the namespace std; if we do not employ a using directive, we
would have to qualify these container types:
std::vector<object > MyVector;
Containers can contain objects or simple types. The container itself is an object, and knows
how many items it contains. The size() method returns the number of elements in a container.
To move through the elements of a container in turn, we employ iterators. An iterator is simply
a special type attached to each container which provides a consistent way to visit each element.
Conceptually, an iterator is a pointer to an element in the container. Thus, iterators support:
increment: you may use the ++ operator on an iterator to move to the next element in the
container;
dereference: you can return the value pointed to using the * operator;
comparison: you can compare iterator values using != and ==.
The following code will illustrate simple iteration through the two containers we have introduced:
#include <iostream>
#include <list>
#include <vector>
using namespace std;
vector<int> MyVector;
list<int> MyList;
int main(void) {
    int a[6] = {0,1,2,3,4,5};
    for (int k = 0; k < 6; k++) {
        MyVector.push_back(a[k]);
        MyList.push_back(a[k]);
    }
    vector<int>::iterator i = MyVector.begin();
    list<int>::iterator j = MyList.begin();
    while (i != MyVector.end())
        { cout << "Value in vector: " << *i << endl; i++; }
    while (j != MyList.end())
        { cout << "Value in list: " << *j << endl; j++; }
    // size() returns an unsigned type, so the index uses a matching type
    for (vector<int>::size_type p = 0; p < MyVector.size(); p++) {
        cout << "Vector[" << p << "] is: " << MyVector[p] << endl;
    }
    return 0;
}

Sequence containers such as vector and list support the push_back() method, which inserts a new
element at the back (end) of the container. Insertion and deletion in containers will be covered later. Observe that
the same methods are used to do this, regardless of the type of the container. Thus, provided
we obey some simple rules, we can write one function and use it with any container we desire!
Having populated the container, we fetch an iterator (pointer) to the first element, using the
begin() method and then step this iterator using ++. To access an element we dereference
the iterator, as we would with a normal pointer. Finally, to test whether we have reached the
end of the container (i.e. visited all the items in this case), we test the iterator against the
special sentinel iterator returned by the end() method. Once we iterate past the last item in
a container, the iterator will be set to this sentinel, and the loop test will fail. This program
illustrates a very general method to iterate in turn through every element of a container. For
a vector, however, we can use array indexing to access each element, as shown in the last loop.
If we write the loop in this way, then we can only use this code for another container which
supports the [] operation. It is thus less general.
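To make the point about generality concrete, here is a minimal sketch of one function that works with both containers. It relies on a function template, a feature introduced properly later in these notes; the names sumAll and sumIndexed are invented for the example.

```cpp
#include <list>
#include <vector>

// Sums the elements of any container of ints that provides begin()/end().
// Only ++, * and != are used on the iterator, so vector and list both work.
template <typename Container>
int sumAll(const Container& c) {
    int total = 0;
    typename Container::const_iterator it = c.begin();
    while (it != c.end()) {
        total += *it;
        ++it;
    }
    return total;
}

// Vector-only version: relies on [] and so cannot be used with a list.
int sumIndexed(const std::vector<int>& v) {
    int total = 0;
    for (std::vector<int>::size_type p = 0; p < v.size(); p++)
        total += v[p];
    return total;
}
```

Passing a list<int> to sumIndexed would not compile, whereas sumAll accepts either container unchanged.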

4.7

C++ Specific Operators and Expression Syntax

There is a great deal of overlap between the operators and precedence rules of C++ and Java.
In fact C++ only really differs in the following respects:
Java has no unary * operator: this operator is used to dereference pointers. Operator
precedence can be important when dealing with this operator. For example:
const char *q = "A string", *p = q;
int cnt = 0;
while (*p++ != '\0') cnt++;
counts the number of characters in a string. Note that *p++ includes a post-increment
(unary) operator as well as a (unary) dereference operator. The ++ has higher precedence
and binds to the pointer p, so the expression is parsed as *(p++). However, because ++ is a
post-increment operator, p++ yields the value p held before the increment, so the comparison
is made with the character that p pointed to at the start of the test. The pointer then shifts
onto the next character in the array.
The unary operator & returns the address of its operand. Java has no equivalent. This
operator has the same precedence as *,
C++ has the binary arrow operator, -> which is shorthand for (*.). Thus if we have a
pointer to a struct S containing a field a, we can access the field in two ways:
(*sptr).a or
sptr->a
C++ does not have a zero-fill right shift operator (>>> in Java),
C++ has the scope resolution operator, ::, to resolve ambiguous class references,
Java does not provide full support for the , operator. In C++ the following expression is
valid at any point in the program:
a = y, y = z+2, z = 2;
The sub-expressions are evaluated from left to right, and the value of the whole expression
is the value of the rightmost sub-expression.
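The operators in this list can be exercised with a short sketch. The struct S and the function names are assumptions made for illustration. Note that the comma operator evaluates its operands from left to right and yields the value of the last one.

```cpp
// Hypothetical struct used to illustrate . versus ->
struct S { int a; };

// Counts characters up to (not including) the terminating '\0'.
int countChars(const char* p) {
    int cnt = 0;
    while (*p++ != '\0') cnt++;   // parsed as *(p++)
    return cnt;
}

// Sets the field through a pointer using both notations.
int setField(S* sptr, int v) {
    (*sptr).a = v;            // explicit dereference, then member access
    sptr->a = sptr->a + 1;    // the same field via the arrow operator
    return sptr->a;
}

// The comma operator evaluates left to right and yields the last value.
int commaDemo() {
    int y = 1, z = 5, a = 0;
    int last = (a = y, y = z + 2, z = 2);
    // now a == 1, y == 7, z == 2 and last == 2
    return a * 100 + y * 10 + last;
}
```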
Expressions are largely unchanged between the two languages. A value may be an lvalue or an
rvalue. Variables which are not const are examples of (modifiable) lvalues, since they can occur on the left
side of an assignment statement. Numeric literals such as 5 and temporary results such as a + b are examples of
rvalues: they can only occur on the right-hand side of an assignment. A const variable has storage but
cannot be assigned to, so in practice it too can only appear on the right-hand side. Normal variables
can be used wherever an rvalue is expected, so they may appear on either side.
You can assign a value within an expression which is part of a larger expression:
t = ((a = log(x)) + 5.1);
In this case the variable a is assigned the value log(x) and this value is passed back to the rest
of the expression.
One major difference between C++ and Java is operator overloading. In C++ you can overload
operators to change their behaviour for each class. Thus, given class Matrix, we can overload
the + operator so that the following is valid:
Matrix A, B, C;
C = A + B;
Java does not allow operator overloading. This means that explicit method calls are needed to
perform operations on objects and also leads to code which may be less intuitive.
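As a minimal sketch of what such an overload looks like, the following uses a small two-component vector in place of the Matrix class, which is not defined in these notes. The type Vec2 and its members are invented for the example.

```cpp
// Hypothetical 2-component vector, standing in for the Matrix class above.
struct Vec2 {
    double x, y;
    Vec2(double x_, double y_) : x(x_), y(y_) {}
};

// Overloaded + for Vec2: component-wise addition.
Vec2 operator+(const Vec2& lhs, const Vec2& rhs) {
    return Vec2(lhs.x + rhs.x, lhs.y + rhs.y);
}
```

With this definition in place, the statement C = A + B is valid for Vec2 objects, in the same way as it would be for Matrix.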
The C++ standard introduces 3 new logical operators: and, or and not. These are alternative
spellings of the usual C/C++ operators &&, || and !, and behave identically. As with those
operators, the operands are converted to bool: C++ assumes that 0 is logically false, and any
non-zero integer value represents true.


4.8

C++ strings

C made exclusive use of arrays of 1-byte chars to represent a string of characters. Java uses
2-byte unicode characters as its basic building block and defines a String class to manipulate
these. C++ added its own string class, but the usage of character arrays is still widespread:
when efficiency is of prime concern, direct char manipulations may be preferable. More recently,
the wide character type wchar_t was introduced, which allows for multi-byte Unicode character
strings (its size is platform dependent, typically 2 or 4 bytes), but this is
not widely used. The following provides a short review of the functionality available with both
these constructs.

4.8.1

Character Arrays

A string in this context is a null-terminated array of type char. The programmer is responsible
for ensuring that the array has adequate space to hold the entire string or any subsequent
additions. The programmer must keep track of the string size. Here are a few common functions
(you must include cstring to use these):
size_t strlen(const char *str): This function will return the length of a null-terminated char array.
Note that the null character is not counted.
char *strcat(char *s1, const char *s2): The contents of the array s2 are appended to the contents of array s1, and a pointer to the start of s1 is returned.
The new string will be null terminated.
char *strcpy(char *s1, const char *s2): The contents of s2 are copied into s1. A
pointer is returned to s1.
int strcmp(const char *s1, const char *s2): The strings are compared and an integer is
returned to indicate their ordering. A zero means they are identical, a negative number
indicates that s1 is lexicographically smaller than s2 and a positive number indicates that
s1 is lexicographically greater than s2. Related functions exist which compare only the first
n characters (strncmp) or, on many platforms, ignore case.
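The following sketch shows these four functions working together. The helper names buildString and compareSign are invented for the example, and the caller is assumed to supply a buffer that is large enough, which is exactly the responsibility described above.

```cpp
#include <cstring>

// Builds "abcd" + "efg" in a caller-supplied buffer and returns its length.
// Assumes buf has room for at least 8 chars (7 characters plus the '\0').
std::size_t buildString(char* buf) {
    std::strcpy(buf, "abcd");   // buf now holds "abcd"
    std::strcat(buf, "efg");    // buf now holds "abcdefg"
    return std::strlen(buf);    // the terminating '\0' is not counted
}

// Reduces strcmp's result to -1, 0 or +1, since only its sign is meaningful.
int compareSign(const char* s1, const char* s2) {
    int r = std::strcmp(s1, s2);
    return (r > 0) - (r < 0);
}
```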
Note that many compilers will accept a string literal such as "abcd" where a char* is expected.
Such literals are really immutable (to use Java terminology)
and should thus be treated as type const char*. In particular, when declaring a literal initialiser one
should use const:
const char *s = "abcd"; // should NOT be modified!
// s is a POINTER to string table entry
Note that the declaration
char s[] = "abcd"; // abcd COPIED into allocated storage
creates space that we may modify (if we wish); the character sequence "abcd" is only used by
the compiler when building s[] and thus has no associated storage. While it is possible to use
string literal assignments outside of initialisation statements, this practice is discouraged since
modifying these literals can cause a program crash.


Method                          Description
length()                        return string length
at(index)                       string indexing with bounds checking
insert(index, strng)            insert new string at index
erase(index, length)            erase range of characters from string
substr(index, length)           return a substring
replace(index, length, strng)   replace substring with new string
find(text, start)               return index of substring

Table 4.1: A subset of the string class methods.

4.8.2

The string class

The C++ string class shares many similarities with its Java counterpart, but adds in functionality from the StringBuffer class. A number of operators have been overloaded to support
string operations:
{+, +=}: string concatenation, concatenation with assignment;
{<, >, <=, >=, ==, !=}: lexicographical string comparison;
[]: string character indexing (no boundary checking).
This syntax eliminates the need for method calls to perform simple tasks such as appending
characters. Here are some examples:
#include <iostream>
#include <string>
using namespace std;
string a("abcd");
string b = a + "efghijk"; // concatenate
a += "lmn";
cout << "Length of string " << a << " is " << a.length() << endl;
cout << "2nd character is " << a[1] << endl;
if (a == "ace")
    cout << "You're an ace!" << endl;
else if (a < "ace") // a is alphabetically less
    cout << "You ain't so hot";
Some other string functions are listed in Table 4.1. If you need access to the internal char*
buffer, you can use the c_str() method:
string a("abcd");
const char *sptr = a.c_str(); // note return type!
There are several other functions which are documented in the appropriate class definitions.
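A brief sketch of some of these member functions in use is given below, using the standard member names erase() and substr(). The helper functions editString, findIndex and atThrows are invented for the example.

```cpp
#include <stdexcept>
#include <string>

// Applies a few string methods to a copy of the input.
std::string editString(std::string s) {
    s.insert(0, ">> ");           // insert new string at index 0
    s.erase(s.length() - 1, 1);   // erase the last character
    s.replace(3, 1, "X");         // replace 1 character at index 3
    return s;
}

// Returns the index of text within s, or -1 if it does not occur.
int findIndex(const std::string& s, const std::string& text) {
    std::string::size_type pos = s.find(text, 0);
    return (pos == std::string::npos) ? -1 : static_cast<int>(pos);
}

// at() checks bounds and throws std::out_of_range; [] does not check.
bool atThrows(const std::string& s, std::string::size_type index) {
    try {
        char c = s.at(index);
        (void)c;
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```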


4.9

Functions

Functions in Java and C++ are almost identical. The syntax is essentially the same,
return type FunctionName (argument list )
{ function body }
There are a number of important differences, however:
1. Functions in C++ need not be associated with a class: one can have general purpose
utility functions which can be called from any function (even those in a class).
2. Functions may be referenced through function pointers; this is not possible in Java.
3. A class member function definition may be placed within the class definition, or placed
outside. Java does not permit such a separation.
4. Functions in C++ may accept variable length arguments (so-called varargs).
5. Functions in C++ may have default arguments:
int Myfunc(char tt, int theint = 10);
The function Myfunc can then be invoked without the need to specify the last (default)
parameter:
(void)Myfunc('T');
Note that it is not permissible to have non-default parameters following a default
parameter.
Functions may be declared and defined separately. In the case of class member functions,
the scope resolution operator is used to disambiguate class member function names, since a
class defines a scope and name overlap may occur with global functions. A function body also
constitutes a scope with local variables visible only within the function.
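Two of the differences listed above, default arguments and function pointers, can be sketched briefly. All of the function names here are invented for the example.

```cpp
// A free function with a default argument (not attached to any class).
int scale(int value, int factor = 10) {
    return value * factor;
}

int addOne(int x) { return x + 1; }
int square(int x) { return x * x; }

// Takes a pointer to any function of the form int f(int) and calls it twice.
int applyTwice(int (*fn)(int), int x) {
    return fn(fn(x));
}
```

The call scale(3) uses the default factor, whereas scale(3, 2) overrides it. Either addOne or square may be passed to applyTwice, with or without the & operator.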

4.9.1

Arguments and Return Types

A C++ function may accept any valid type as an argument. If the function takes no arguments,
the argument list may be left empty or declared void. Similarly, if the function does not return
a value, the return type should be declared void. A function prototype should be declared for
every function which is not bound to a class (the prototyping occurs naturally in the case of
class member functions). The prototype provides the compiler with the call signature of the
function (argument types) which is used to enforce strong type checking and allows function
overloading (discussed below). A function may also be defined to have default parameters. In
this instance, the function may be called with fewer arguments than it has parameters.
If a function returns a value which is not going to be assigned or used in the subsequent code,
one may cast the function return value to void:
(void)MyFunctionReturnsAnInt(myargs);


This is not a common practice and is not obligatory: the compiler will simply discard the return
value.
If one requires more than one return value from a function there are two suggested methods:
References/Pointers: You can simply add a new reference or pointer argument to the function
argument list for each new return value. When the function is called, you pass in the
variable (in the case of a reference) or a pointer to a variable (in the case of a pointer).
An example of the pointer method is shown here:
int multReturn(int *arg1, int *arg2, int *arg3) {
*arg1 = ...;
*arg2 = ...;
*arg3 = ...;
return ...; }
int Answ1, Answ2, Answ3;
(void)multReturn(&Answ1, &Answ2, &Answ3);
Structures: You can define a structure containing all the relevant values and return this:
struct RetVal { int ans1; int ans2; int ans3; };
RetVal multiRetTypes(arglist) {
RetVal rv;
rv.ans1 = ...;
rv.ans2 = ...;
rv.ans3 = ...;
return rv; }
RetVal ret = multiRetTypes(...);
Note that the actual structure was passed back since it only has 3 fields. For a large
structure, a pointer should be returned, with the structure having been allocated using
new within the function. One could also pass a preallocated structure by reference.
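The pointer method was shown above; the reference and structure methods might look as follows. This is only a sketch: the integer division task and the names divide, DivResult and divideStruct are invented for the example.

```cpp
// Reference version: quotient and remainder come back through q and r,
// while the ordinary return value reports success.
bool divide(int num, int den, int& q, int& r) {
    if (den == 0) return false;
    q = num / den;
    r = num % den;
    return true;
}

// Structure version: both results are packed into one returned object.
struct DivResult { int quot; int rem; };

DivResult divideStruct(int num, int den) {
    DivResult rv;
    rv.quot = num / den;   // the caller must ensure den != 0
    rv.rem = num % den;
    return rv;
}
```

Note that with the reference version the caller passes the variables themselves, not their addresses.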
A final point to note concerns the application of the keywords const, static and extern to
functions.
A function may be declared const, provided it is a class member function. The significance
of this declaration will be discussed in Chapter 6. A function prototype can be preceded by
const, in which case the value which is returned cannot serve as an lvalue. If an argument
is declared to be const, the code within the function cannot change the argument in any way.
This is useful when passing in pointers, which usually allow the function code to modify the
item they are pointing to.
A static function is one which is bound to a class rather than a class instance, as in Java.
In the case of a non-class function, static binds the function to the file it is defined in, and
prohibits functions in other source files from accessing it. An argument to a function cannot be
declared static.
A function prototype may be declared as extern if the function definition is not visible from
the current scope. This usually means the function has been defined in a separate source file.
If a function is static, an extern reference will be illegal. This keyword cannot be used with
function arguments.
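A short sketch of a const argument and a file-level static function follows. The names clampToByte and sumClamped are invented for the example.

```cpp
#include <cstddef>

// static at file scope: visible only within this source file.
static int clampToByte(int v) {
    return v < 0 ? 0 : (v > 255 ? 255 : v);
}

// The pointed-to data is const, so the function may read but not modify it.
// Writing data[0] = 1 here would be rejected by the compiler.
int sumClamped(const int* data, std::size_t n) {
    int total = 0;
    for (std::size_t k = 0; k < n; k++)
        total += clampToByte(data[k]);
    return total;
}
```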


4.9.2

Overloading

In C++, a function may be overloaded: multiple definitions of functions with the same name
may exist, without violating scoping rules. The function argument lists must be different: the
return type is ignored for overloading considerations. Function overloading is used extensively
with class inheritance, as we shall see, but it is also useful outside of this context. The following
example shows 3 overloaded functions, which all return the same type, but accept different
arguments:
int GetIntFrom(char *str);
int GetIntFrom(float flt);
int GetIntFrom(float mantissa, float exponent);
No special keyword is required to indicate that overloaded functions are present. C++ performs
name mangling on each function name, which generates a unique function name which encodes
the function arguments. This allows the compiler to determine which function version is actually
being invoked within a given segment of the program.
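To see overload selection at work, the three prototypes can be given simple bodies. The bodies below are assumptions made for illustration (in particular, the two-argument version is arbitrarily taken to mean mantissa times 2 to the power exponent), and the first overload takes const char* so that it can accept string literals.

```cpp
#include <cmath>
#include <cstdlib>

// Three overloads distinguished only by their argument lists.
int GetIntFrom(const char* str) {
    return std::atoi(str);            // parse the leading digits
}

int GetIntFrom(float flt) {
    return static_cast<int>(flt);     // truncate towards zero
}

int GetIntFrom(float mantissa, float exponent) {
    // illustrative choice: mantissa * 2^exponent, truncated
    return static_cast<int>(std::ldexp(mantissa, static_cast<int>(exponent)));
}
```

The compiler picks the version to call from the types and number of the arguments supplied, for example GetIntFrom("42") or GetIntFrom(3.9f).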

4.9.3

Function Inlining

If a small segment of code is heavily utilised by a program, the overhead of a function call can
impact negatively on performance. One can use pre-processor macros to avoid function calls,
but there are limitations associated with this approach. C++ introduces the inline compiler
directive, which instructs the compiler to generate equivalent code which does not require a
function call:
inline int AShortFunction(int arg);
Of course, the executable size will increase with each inline function, but in certain cases the
trade-off may be highly beneficial. Some compilers do not, however, honour an inline directive,
and treat it as a hint instead. It is often possible to force code inlining by passing a flag to the
compiler.
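One limitation of the macro approach is that a macro argument may be evaluated more than once. The sketch below contrasts the two; the names SQUARE_MACRO, squareInline and nextValue are invented for the example.

```cpp
// Function-like macro: the argument text is pasted in twice.
#define SQUARE_MACRO(x) ((x) * (x))

// Inline function: the argument is evaluated exactly once and is type checked.
inline int squareInline(int x) { return x * x; }

// Helper with a visible side effect, used to count evaluations.
int nextValue(int& counter) { return ++counter; }
```

Calling squareInline(nextValue(c)) increments c once, whereas SQUARE_MACRO(nextValue(c)) expands to two calls and increments it twice.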

Chapter 5

Simple I/O in C++


Most programs require input from some source, usually the console or a file. We have already
seen how we can write a message to the console using cout and cerr, but we have not yet seen
how we can read information from the console into our program (besides using command line
parameters). C++ defines a number of standard input and output streams. A stream is simply
an I/O channel which may be connected to a large number of different I/O devices. The details
are hidden from the end-user: one simply reads and/or writes to the appropriate stream and
the low level system I/O libraries ensure that everything proceeds transparently.

5.1

The Standard I/O Streams

There are 3 output streams, cout, cerr and clog and one input stream, cin. Error messages
should be sent to cerr or clog since cout permits output redirection, which might mean that
error messages would not appear on the console. C++ streams are class objects and one may
apply various methods to them to change their behaviour. Operator overloading is used to define
the behaviours of the << and >> operators which may be applied to stream objects. We shall
return to this topic later; for now some examples will help to clarify things.
A typical statement to read data into a variable would be the following:
int i;
cout << "Enter an integer value: " << endl;
cin >> i;

In this code snippet a character string is printed to cout (usually the console) and when the
user enters an integer and presses return, the value will be inserted into the variable i. By
default, cin reads data from the console. However, both cin and cout can be redirected in
which case input and/or output will be bound to different sources.
Multiple variables can be assigned or printed with one statement:
int i = 0, j = 0;
float f = 0.0f;
cout << "The current value of f is " << f <<
    " and (i,j) has value (" << i <<
    "," << j << ")" << endl;
cout << "Enter a new int and float value: ";
cin >> i >> f;
cout << "\nThe value of i is " << i <<
    " ; f has val: " << f << endl;

In this example we have used both endl and \n to format the output string. Both of these
approaches may be used to insert a newline (carriage return) into the output stream. C introduced a number of escape characters which could be inserted into character strings to modify
the output behaviour. Each escape character begins with a \ to distinguish it from a normal
ASCII printable character. To insert a \ into a string one must use the control character \\.
Some other escape characters are:
\a when printed produces an alert (beep)
\t move to next tab stop
\b moves the printing position one step left
\" print a double quote
\NNN prints the character with octal code NNN
\xNN prints the character with hex code NN
The instruction endl is an example of an I/O manipulator which modifies the behaviour of
the associated stream. In this case, the standard output stream is adjusted so that subsequent
character output occurs on a new line¹. Other manipulators perform tasks such as determining
the number of decimal places to which float values should be printed, white space filtering and
so on. A manipulator is an instruction to the stream object to modify its behaviour; escape
sequences are simple embedded codes which have very limited functionality by comparison. One
can also write manipulators to further generalise stream behaviour.
The complications associated with I/O arise almost exclusively from console input. C++ assumes that whitespace (blanks, tabs, newlines) separate data items on an input line, which can
cause complications when we need to read in a character array:
char mystring[100];
cout << "Enter a string and press return: ";
cin >> mystring;
cout << "The string is :" << mystring << endl ;
If one entered the string hello world!, only the first word would be placed in the string! To
overcome such a limitation we must use one of the other methods associated with the cin stream:
getline(). We may then write:
cin.getline(mystring,100);
to read at most 99 characters from the input stream into the character string mystring. Note
that we require space for the additional null terminator of the string, which is why only 99
characters can possibly be read.
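The difference between the two kinds of read can be sketched without a keyboard by letting a string stream stand in for cin (string streams are mentioned again in the section on formatting to memory). The function names readWord and readLine are invented for the example.

```cpp
#include <sstream>
#include <string>

// Reads with >> from simulated console input: stops at the first whitespace.
std::string readWord(const std::string& typed) {
    std::istringstream in(typed);   // stands in for cin
    char mystring[100] = "";
    in.width(100);                  // never store more than 99 chars + '\0'
    in >> mystring;
    return std::string(mystring);
}

// Reads with getline() instead: keeps everything up to the newline.
std::string readLine(const std::string& typed) {
    std::istringstream in(typed);   // stands in for cin
    char mystring[100] = "";
    in.getline(mystring, 100);      // at most 99 chars plus '\0'
    return std::string(mystring);
}
```

Given the input hello world!, readWord returns only the first word while readLine returns the whole line. A line longer than 99 characters is cut off at 99 characters by getline().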
¹ The output buffer is also flushed.

5.1.1

Reading from stdin

Often we wish to read in an arbitrary number of input items from the console. We cannot (efficiently) allocate space beforehand, and we need a way of terminating the input once we have
entered all the data. The following code shows how one can do this:
vector<string> items;
string s;
while (!cin.eof())
{
cin >> s >> ws;
items.push_back(s);
}
We are now faced with a problem: how do we signal EOF (end-of-file) for stdin? If we are
typing in data on a console, we have an EOF character (under UNIX, this is ^D (ctrl-d), entered
on a separate line) which we can type to immediately terminate input. If we have redirected a
file into our program to generate input, then the redirection mechanism automatically signals
end-of-file once the file is exhausted; our program will still think it is reading from the console, and will close the
input stream correctly once all the data has been received.
The ws (a special I/O manipulator that removes whitespace) is required for correct behaviour
in this case. Without it, the whitespace following the last input item (a newline) remains in the
input buffer, so the end of the input has not yet been encountered and eof() still returns false.
The loop would thus be executed again after the last valid input, but the read attempt
would fail (white space is not valid data here) and the string s would remain unmodified, leading
to a duplicated item. The ws ensures that there is nothing remaining in the input buffer after
the last word is read. It will also cause the eofbit to be set after the last item is read, since
it encounters the end of the input buffer while discarding white space. The loop then correctly
terminates, since the call to eof() determines that all data in the input buffer has been
processed.
Note that the eof() method only returns true if a previous I/O operation has encountered the
end-of-file.
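A commonly used alternative, sketched here, is to test the result of the read itself rather than calling eof(): the extraction expression evaluates to false as soon as a read fails, so no item is ever added after a failed read and the ws manipulator is not needed. The function names are invented for the example, and a string stream stands in for cin in the wrapper.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads whitespace-separated words until the stream runs out of data.
// The loop tests the stream after each read, so a failed read adds nothing.
std::vector<std::string> readItems(std::istream& in) {
    std::vector<std::string> items;
    std::string s;
    while (in >> s)
        items.push_back(s);
    return items;
}

// Convenience wrapper: treats text as if it had been typed at the console.
std::vector<std::string> readItemsFromText(const std::string& text) {
    std::istringstream in(text);
    return readItems(in);
}
```

Since readItems accepts any input stream, calling readItems(cin) reads from the console in the same way.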

5.2

I/O Formatting

The standard I/O library in C++ includes extensive formatting capability. This section provides
a brief overview and shows how one can use some of these facilities.
Formatting can be done in two ways:
setf(): This is a method available to cout. By specifying a series of flags, one can set the
behaviour/formatting of the stream. For example,
cout.setf(ios::hex, ios::basefield);
will set the output format to hexadecimal. The flags are constants from an enumeration
defined inside ios, which is why they are qualified. The enumeration name is used as
the second argument, and the actual constant as the first. Some examples are shown in
Table 5.1.

Group Name          Purpose                          Constants
ios::basefield      set base of output value         ios::hex, ios::dec, ios::oct
ios::adjustfield    set justification for printing   ios::left, ios::right, ios::internal
ios::floatfield     set format for float printing    ios::scientific, ios::fixed

Table 5.1: Formatting flags

Name            Purpose
dec, oct, hex   print number in this base
ws              consume whitespace
ends            insert a null character to terminate an output string

Table 5.2: Simple Manipulators


The setf() function also has a single argument version, which is used in conjunction with
unsetf() to turn stream behaviours on and off.
manipulators: For ease of use, many of the flags can be set directly in the output statement
by using specially defined manipulators (those which take an argument are declared in the
header iomanip). For example,
cout << setprecision(6) << f << endl;
will print out the value of f using 6 digits of precision. Another useful manipulator is setw.
This allows you to specify an integer which defines the width of the output field for the
next output item in the stream. This is useful for lining up columns of text data. Note
that data will never be truncated, so the width is only respected if the data size can be
accommodated, otherwise the full data item will be printed.
Manipulators do not need to have an argument. The endl manipulator simply inserts
a newline and then flushes the output buffer. Some other manipulators of this kind are
listed in Table 5.2.
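Both formatting styles can be tried out with the sketch below. To make the result easy to inspect, the output is written to a string stream rather than to cout; the function names toHex and toColumn are invented for the example.

```cpp
#include <iomanip>
#include <ios>
#include <sstream>
#include <string>

// Formats n in hexadecimal using setf() with a flag and its group.
std::string toHex(int n) {
    std::ostringstream out;
    out.setf(std::ios::hex, std::ios::basefield);
    out << n;
    return out.str();
}

// Formats f in fixed notation with 2 decimal places, in a field of width 8.
std::string toColumn(double f) {
    std::ostringstream out;
    out << std::fixed << std::setprecision(2) << std::setw(8) << f;
    return out.str();
}
```

A value that does not fit in the requested width is printed in full rather than truncated, as described above.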

5.2.1

Formatting to/from memory

Sometimes we wish to treat memory as a source/destination for input or output. For example,
for reasons of efficiency, we may wish to read input from a disk block cached in memory, rather
than performing a new set of expensive disk reads. We can do this using an input strstream:
we can attach a stream to the data block and then use the standard input functions to extract
data from this block. The code below shows how this can be done:
#include <iostream>
#include <string>
#include <strstream>
using namespace std;
const char* input = "data1 2.34 data2 6";
string str1, str2;
float f1;
int i1;
istrstream str(input); // attach the stream
// read data
str >> str1 >> f1 >> str2 >> i1;

We can also format data into a piece of memory:
#include <iostream>
#include <strstream>
using namespace std;
char output[100];
int ival1 = 42;
ostrstream ostr(output,100); // attach the stream (100 bytes max)
// write data
ostr << "The value is: " << ival1 << endl << ends;
cout << "Streamed data: " << output << endl;


In this case, we must use the special manipulator ends to signal the end of the output stream.
Note that C++ also has string based versions of these streams, called stringstreams, which
manipulate strings instead of character buffers. To use them you need to include the header
file sstream. A method str() can be used to extract the formatted string after writing to a
stringstream object.
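The same two tasks might be written with string streams as follows. This is a sketch only: the struct Parsed and the function names are invented for the example.

```cpp
#include <sstream>
#include <string>

// Hypothetical holder for the four fields in the example input.
struct Parsed {
    std::string str1, str2;
    float f1;
    int i1;
};

// Extracts the fields from text using an input string stream.
Parsed parseFields(const std::string& text) {
    Parsed p;
    p.f1 = 0.0f;
    p.i1 = 0;
    std::istringstream in(text);
    in >> p.str1 >> p.f1 >> p.str2 >> p.i1;
    return p;
}

// Formats a value into a string using an output string stream.
std::string describe(int value) {
    std::ostringstream out;
    out << "The value is: " << value;
    return out.str();   // no ends manipulator is needed here
}
```

Because the stream manages its own storage, there is no fixed buffer size to choose and no terminator to insert.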

5.3

File-based I/O

We have examined streams which are associated with the standard input and output devices. In
many cases, however, we need to read or write information from a file. A stream can be bound
to a file on disk, just as it can be bound to the keyboard. In this case, however, we have to
instruct the operating system to open a stream to the file of interest.
There are two basic file stream types: input file streams and output file streams. The former
only allows us to read data from the file, whilst the latter only allows us to write data to the
file. One can also open an input/output file stream, which allows us to perform both kinds of
access on the file.
Thus we have:
ifstream Allows read operations only on the associated file
ofstream Allows write operations only on the associated file
fstream Allows read/write operations on the associated file.
C++ has an I/O class hierarchy which encompasses all the I/O behaviours we have discussed
so far. For now, all we need to know is that each of the file types is actually a class type and
that we can instantiate an object of the appropriate type:
ifstream MyFile;
MyFile.open("file.dat");
if (!MyFile) {
cerr << "File open failed!" << endl;
}


This declares a new file object called MyFile which supports input. The physical file association
occurs when we invoke the open method on the ifstream object; we pass the file name as an
argument. Note that we can also create and open the file with one statement:
ifstream MyFile("file.dat");
if (!MyFile) {
cerr << "File open failed!" << endl;
}
The operator ! has been overloaded so that it returns true if some error condition has occurred
on the stream (such as the file being non-existent). This functionality is provided by the standard
libraries; we shall see how to overload our own operators later on.
Once we have finished using the file, we can disconnect it from the stream:
MyFile.close();
Although files are (usually) closed automatically when file objects go out of scope, it is good
practice to close them yourself. If your program terminates unexpectedly the file may not be
closed and data will be lost. If we wish to write data to a file, we simply replace ifstream by
ofstream.
Reading and writing of data can be accomplished in several ways. We have the overloaded
operators << and >> for output and input respectively. These operators work in the same
manner on all stream objects, since they are inherited from a base I/O class (called ios). For
example, having opened our input file, we may read data as follows:
int i;
while (!MyFile.eof()) {
MyFile >> i >> ws;
cout << "The next data item is " <<
i << endl;
}
The member function eof() returns true if the stream object has no more data available (end-of-file condition). The internal eofbit state bit is set to true when an input operation fails due
to lack of data. Note how we use ws once again to ensure that extra white space is discarded. As
before, this allows us to force an end-of-file event before our next attempt to read data (which
would fail, but cause an extra loop iteration).
One may also simply test the file object at each iteration, since this will evaluate to false as
soon as a file read operation fails. In this case, however, the code must be restructured to avoid
printing the output message if no valid data was available.
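One way of restructuring the loop so that the stream itself is the test is sketched below. The function names and the idea of collecting the values in a vector are assumptions made for the example; writeInts is included only so that the sketch can create its own input file.

```cpp
#include <fstream>
#include <istream>
#include <vector>

// Writes one integer per line; returns false if the file could not be written.
bool writeInts(const char* filename, const std::vector<int>& values) {
    std::ofstream out(filename);
    if (!out) return false;
    for (std::vector<int>::size_type k = 0; k < values.size(); k++)
        out << values[k] << '\n';
    out.close();
    return !out.fail();
}

// Reads integers until a read fails; the stream itself is the loop test.
std::vector<int> readInts(std::istream& in) {
    std::vector<int> values;
    int i;
    while (in >> i)          // false as soon as a read operation fails
        values.push_back(i);
    return values;
}

// Opens the named file and reads every integer from it.
// Returns an empty vector if the file cannot be opened.
std::vector<int> readIntsFromFile(const char* filename) {
    std::ifstream file(filename);
    if (!file)
        return std::vector<int>();
    return readInts(file);   // file is closed when it goes out of scope
}
```

Since the body of the loop only runs after a successful read, no value is processed twice and ws is not required.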
We need not use >> and << for I/O; as we did with the standard I/O streams, we can use more
powerful member functions. This is illustrated by the following code:
#define ARRY_SZ 40
char array[ARRY_SZ];
ifstream myfile("abc.dat", ios::binary); // open file in binary mode
while (!myfile.eof()) {
    memset(array,0,ARRY_SZ);
    myfile.read(array, ARRY_SZ-1);
    cout << "Next byte data chunk: "
         << array << endl;
}

The read() method attempts to read N bytes from the input and places them in the character
array indicated by the 1st argument. Note that this data may actually be a stream of integers,
but it is read in byte by byte, thus one can perform a raw data read using this function. Observe
that a null byte will not be inserted by read(). Since we wish to print out the data as a string
(although it may not be!) we must insert a null character ourselves; this is why we will read
in a maximum of one less than the array size. At every iteration we fill the array with zero bytes
so that we do not have to keep track of the end of the string explicitly. Of course, if the data
happens to consist of a long sequence of zero bytes we will not see any printed data.
This kind of input operation is appropriate for binary files: a collection of arbitrary data with
unspecified formatting. It is up to some higher logic to impose order on the acquired data, by
recognising, for example, that it is actually a sequence of records. If read() cannot fetch the
required number of bytes it sets the EOF flag as well as a flag to indicate that a file operation
has failed. One can test these flags to determine the nature of the file error.
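When the final chunk is shorter than requested, the stream member function gcount() reports how many bytes the last read() actually stored. The sketch below uses it to count the bytes in a stream; the helper names countBytes and countBytesInText and the chunk size of 16 are assumptions chosen for illustration.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: counts the bytes in a stream by reading fixed-size
// chunks. A short final chunk makes read() fail, but gcount() still says
// how many bytes were stored, so those bytes are not lost.
std::size_t countBytes(std::istream& in) {
    const std::size_t CHUNK = 16;
    char buffer[CHUNK];
    std::size_t total = 0;
    while (in.read(buffer, CHUNK) || in.gcount() > 0) {
        total += static_cast<std::size_t>(in.gcount());
    }
    return total;
}

// Convenience wrapper so the sketch can be tried without a data file.
std::size_t countBytesInText(const std::string& bytes) {
    std::istringstream in(bytes);
    return countBytes(in);
}
```

After the loop, in.eof() and in.fail() can be inspected to distinguish a normal end of file from some other failure.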
Note that the file was opened in binary mode. Usually a file will be opened in text mode and
when reading or writing characters certain translations will occur. In binary mode the source
file is simply treated as a collection of bytes. There are a number of other mode specifiers, such
as ios::ate which opens the file and moves to the end and ios::trunc which will empty the
contents of a file if it exists or create a zero length file. These mode flags can be combined using
a bitwise OR:
ofstream myfile("abc", ios::binary | ios::ate);
These mode values are class constants which are defined in the I/O base class ios.
If one is dealing with ASCII files, then reading chunks of data in this fashion complicates matters.
In this case it is better to acquire whole lines of character data at a time. We have already seen
how we can read lines of data from cin; it turns out that the same method can be used (it is
inherited through the I/O hierarchy) to read a newline-terminated string of characters from a
file:
char array[100];
while (myfile.getline(array,100))
cout << "The next line is " << array << endl;

When no more lines can be read the loop terminates, with the appropriate error bits set to
indicate the nature of the failure.
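If the maximum line length is not known in advance, the fixed-size buffer can be avoided by using the non-member function std::getline together with a std::string. The following is a small sketch under that assumption; the helper names readLines and readLinesFromText are hypothetical.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: reads newline-terminated lines of any length.
// std::getline returns the stream, which tests false once no further
// line can be extracted.
std::vector<std::string> readLines(std::istream& in) {
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        lines.push_back(line);
    }
    return lines;
}

// Convenience wrapper so the sketch can be tried without a data file.
std::vector<std::string> readLinesFromText(const std::string& text) {
    std::istringstream in(text);
    return readLines(in);
}
```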
To read a single character from an input source we can use the get() function. The function
returns the character code for that character, or EOF (typically -1) if it has reached end-of-file:
cout << "Enter a character and press return: ";
int ch = cin.get();
cout << "You typed " << (char)ch << endl;

Note that in all cases console input operations are only completed when the return key has
been pressed. If you wish your program to respond to a single key press only, you should use a
terminal management package such as curses.
There is a great deal more one can say about I/O under C++, but this brief introduction should
allow you to perform simple I/O tasks such as basic console and file I/O. It is now time to
examine class structure in C++, so that we can understand how the various constructs we have
seen in this section actually work.

Chapter 6

Classes in C++
Object orientated programming centres on the use of class objects to enable data encapsulation
and code reuse. The basic issues of OO design and programming will be familiar to those who
have worked with Java. Both Java and C++ use the class keyword to define a class type.
However, there are a great many differences in the way that objects are created and utilised.
Some of these differences are trivial (mere issues of syntax); others are more significant and
mean that concepts from Java will have to be modified to fit into the C++ view of things. We
shall start with the simple issues and gradually work our way up the scale of complexity.

6.1 Defining a Class

A basic class definition consists of member variables and functions. C++ allows one to separate
the member function definition from the actual function declaration, something which is not
supported under Java. A class is defined as follows:
class ClassName {
access specifier:
member variables
member functions
...
access specifier:
member variables
member functions
...
};
The keyword class is reserved, as it is in Java. Member functions and variables are defined as
they are outside a class. Of course, they are now part of the class scope and as such cannot be
referred to directly. You must use the class member operator . to access a member variable
or function, assuming you are permitted access.
The access specifier must be one of the following:
public The member variable or function is accessible from any point in the program for which
the class object is in scope.
private The member variable or function is only accessible from within the member functions of the
class itself, and by so-called friend functions.
protected This specifier has the same effect as private, except that the associated variables and
functions are also accessible from derived classes; this is not the case with private members.
Java also has these three access specifiers, but they are applied in a somewhat different fashion.
Within a Java class, each method must have an explicit protection specifier; C++ assumes that
the indicated specifier applies to all subsequent members. By default, C++ assumes that all
members in a class are private. Within Java, package access (which does not exist in C++)
is assumed if no specifier is given.
Individual member functions may be defined inline, as in Java:
class ObName {
...
return type functionName ( arguments )
{ function body }
...
};
or the definition and declaration may be separated:
class ObName {
...
return type functionName ( arguments );
...
};
...
return type ObName::functionName ( arguments )
{ function body }
...
We have already seen how this separation should be handled (Chapter 3): the class declaration
(which includes member function prototypes) should be placed in a header file, while the actual
function definitions should be placed in a separate source file. If the function definition is
included within the class definition, the associated code will be inlined and the compiler will
try to optimise the code to avoid the overhead of a function call (which would be expensive for
functions containing only one or two statements). However, even for functions which are defined
outside of the class, we can suggest to the compiler that it inline the function (Section 4.9.3).
As a rule of thumb, only functions containing a couple of statements should be included within
the class definition; for anything more extensive, the function definition should be placed in the
appropriate source file.
To conclude this section, we present a bare-bones implementation of a Matrix class. This class
will serve as a running example throughout the remainder of this text and will be expanded and
improved as we meet new concepts.
#ifndef MATRIX_H
#define MATRIX_H

#include <cstddef> // for NULL

class Matrix
{
private:
    int rows, cols;
    double **space;
    bool ValidIndex(int i, int j);
public:
    Matrix(void) {rows = cols = 0; space = NULL; }
    Matrix(int r, int c);
    ~Matrix();
    int GetRowSize(void){ return rows;}
    int GetColSize(void){ return cols;}
    double g(int i, int j);            // get element (i,j)
    void s(int i, int j, double val);  // set element (i,j)
};
#endif
In this class there are three member variables, which are all in the private section of the
class. These variables cannot be accessed from outside of the class, or from any classes which
are derived from this one. Apart from the private helper ValidIndex(), all of the member functions have been placed in the public section.
These functions provide the interface to the object, i.e. we can use them to access the object's
member variables and functionality.
#include <iostream>
#include "Matrix.h"
using namespace std; // for cerr
// Class method definitions
bool Matrix::ValidIndex(int i, int j) {
    if ( i < 0 || j < 0 || i > rows -1 || j > cols -1)
    { cerr << "Array out of bounds!"; return false; }
    else return true;
}
Matrix::Matrix(int r, int c) {
    int i;
    space = new double *[r];
    for (i = 0; i < r; i++) {
        space[i] = new double [c];
    }
    rows = r; cols = c;
}
Matrix::~Matrix() {
    if (space != NULL) {
        for (int i = 0; i < rows; i++)
            delete [] space[i];
        delete [] space;
    }
}
double Matrix::g(int i, int j) {
    if (i < 0 || j < 0 || i > rows -1 || j > cols -1 )
    { cerr << "Array bounds error!"; return 0.0; }
    return space[i][j];
}
void Matrix::s(int i, int j, double val) {
    if (i < 0 || j < 0 || i > rows -1 || j > cols -1 )
    { cerr << "Array bounds error!"; return; } // do not write out of bounds
    space[i][j] = val;
}

The C++ constructs used here should now make sense. Note the use of the scope resolution
operator :: to bind the function definitions to the class. Remember that a class defines a scope,
thus we need a means of unambiguously referencing members. To access members of a class we
use the . operator for references or direct class objects, or the -> operator for pointers to class
objects:

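Matrix anInstance, *ptrTo;
// Standard-conforming compilers report allocation failure by throwing
// std::bad_alloc; the NULL test only matters for an older, non-throwing new.
if ((ptrTo = new Matrix) == NULL)
{
    cerr << "Object creation failed! Exiting...";
    exit(1);
}
cout << anInstance.g(0,0) << ptrTo->g(0,0) << endl;
delete ptrTo;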

These operators will only work if an instance of an object is available for binding. For static
members, we require a different means of accessing the member since it is not bound to an
instance of the class, but to the class itself. Once more the operator :: comes to the rescue. The
following example illustrates this:

class StaticExample {
private:
static int val;
static const int fixed = 3;
public:
static int Getvalue(void) { return val; }
};
// one-off definition; must be at global (namespace) scope
int StaticExample::val = 0;
...
cout << StaticExample::Getvalue() << endl;

However, if the variable is declared to be static const and is of an integral type, we can place the initialiser
within the class. This works because the constant can be resolved at compile time, unlike the
usual static case in which the variable will only receive a value at run-time.
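A common use of a static member is to share one piece of state between all instances of a class. The sketch below keeps a count of live objects; the class name Tracked and its members are hypothetical and serve only to illustrate the static syntax described above.

```cpp
// Hypothetical class that counts how many of its instances currently exist.
class Tracked {
private:
    static int liveCount;            // one variable shared by all instances
    static const int limit = 100;    // integral constant: in-class initialiser
public:
    Tracked()               { ++liveCount; }
    Tracked(const Tracked&) { ++liveCount; }
    ~Tracked()              { --liveCount; }
    static int LiveCount()  { return liveCount; }
    static int Limit()      { return limit; }
};

// One-off definition of the static member, at global scope.
int Tracked::liveCount = 0;

// Two objects exist while the return value is computed; both are
// destroyed when the function returns.
int countInsideScope() {
    Tracked a, b;
    return Tracked::LiveCount();
}
```

Note that the static functions are called through the class name and the :: operator, without any object instance.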

6.2 Class Constructors and Destructors

When a class object is instantiated, both C++ and Java invoke a special function known as a
constructor. The purpose of the constructor is to provide a means to perform any necessary
initialisation on the object, such as the allocation of dynamic memory or simply assigning default
values to member variables. If you do not provide a constructor for the class, one is automatically
generated by the compiler. This default constructor performs a number of house keeping tasks
but does not assign values to variables and so on. Constructors can be overloaded, thus we may
have several constructor functions with the same name but different signatures. This enables us
to instantiate a class object in a number of different ways, as we shall see.
A constructor must satisfy the following constraints:
1. It may not have a return value: no type must be indicated, not even void. The compiler
generates the appropriate return information behind the scenes.
2. The constructor must have the same name as the class.
3. A constructor need not be public, but this limits class usability.
4. Any number of constructors can be defined, provided they have unique argument lists.
The type of the class can also serve as an argument; this allows us to define a special
kind of constructor, a copy constructor.
If we study the Matrix class we see that two constructors are defined:
Matrix(void) {rows = cols = 0; space = NULL; }
Matrix(int r, int c);

The first of these will be called when we instantiate an object without passing any arguments:
Matrix MyMatrix;
Matrix *PtrToMx = new Matrix;
If we wish our Matrix object to be initialised to a certain non-default size, we can issue the
following declarations:
Matrix MyMatrix(2,2); // create a 2 x 2 matrix
Matrix *PtrToMx = new Matrix(2,2);
Now, the second constructor will be invoked when the object is created, and the appropriate
internal initialisations will occur. We could define additional constructors to, for example,
initialise a Matrix object from a 2D array. Java and C++ both allow constructor overloading
and utilise the same sort of signature matching scheme to determine which one to utilise.
A constructor can also be used to initialise data members which cannot be initialised in the
normal manner. In the case of classes, member variables declared as const must be initialised
when the object is created, as must references. C++ allows us to use an initialiser list on a
constructor. For example, we could rewrite our second constructor as follows:

Matrix(int r, int c) : rows(r), cols(c)
{ other statements }

This means that the member variables rows and cols will be initialised to the indicated values.
If we have a const or a reference, they must be initialised using such a list. The parentheses can
contain any valid C++ expression. It is also permissible to have an empty function body, if a
non-empty initialisation list is present. Note that we do not have to initialise all data members
using an initialisation list; we are free to mix the two modes of initialisation.
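To make the const and reference cases concrete, here is a small sketch in which two members can only be set through the initialiser list while a third is assigned in the constructor body. The class Account and the function demoLedger are hypothetical names invented for this example.

```cpp
// Hypothetical class mixing the two modes of initialisation.
class Account {
private:
    const int id;      // const member: must be set in the initialiser list
    double& ledger;    // reference member: must also be set there
    double balance;    // ordinary member: either style works
public:
    Account(int i, double& total, double opening)
        : id(i), ledger(total)       // initialiser list
    {
        balance = opening;           // ordinary assignment in the body
        ledger += opening;           // updates the variable referred to
    }
    int Id() const { return id; }
    double Balance() const { return balance; }
};

// Both accounts refer to the same running total.
double demoLedger() {
    double total = 0.0;
    Account a(1, total, 10.0);
    Account b(2, total, 5.5);
    return total;                    // 15.5
}
```

Attempting to assign to id or ledger inside the constructor body instead would be rejected by the compiler.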
If we declare an array of objects, the default constructor will be invoked for each one. It is
possible to allow each array element to invoke a different constructor, but there are constraints.
We can use an array initialisation list, as follows:

Matrix aa[] = {Matrix(2,2),Matrix(3,3),Matrix(4,4)};

However, this will only work with auto arrays; if one requires dynamically allocated arrays,
it might be better to write a class which returns such a construct.
C++ has another specialised class function, a destructor which is in some sense the inverse of
a constructor. Just as a constructor has the job of initialising a class, a destructor ensures that
the class is properly eliminated when the object goes out of scope, or is destroyed by a delete
instruction. A destructor satisfies the following rules:

1. A destructor must not have a return type or an argument list; overloading is thus not
possible for this function.
2. The destructor should normally be in the public section of the class definition, and must have the
same name as the class, preceded by a tilde, ~.

The compiler generates a default destructor to perform necessary housekeeping if you do not
define one. Note, however, that if you declare a constructor that allocates memory dynamically
you must define a destructor, or your object will not be properly destroyed when it is discarded.
The destructor for the Matrix class provides a typical example of such a function.
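The two situations in which a destructor runs (scope exit and delete) can be observed with a small sketch. The class Buffer and the flag it sets are hypothetical; the flag exists only so that the destructor call can be detected from outside.

```cpp
// Hypothetical class that owns a dynamically allocated array.
class Buffer {
private:
    double* data;
    bool* released;    // set to true by the destructor (for observation only)
    // Copying is not handled in this sketch, so it is disabled by declaring
    // the copy operations private and leaving them undefined.
    Buffer(const Buffer&);
    Buffer& operator=(const Buffer&);
public:
    Buffer(int n, bool* flag) : data(new double[n]), released(flag) {
        *released = false;
    }
    ~Buffer() {
        delete [] data;    // give back what the constructor allocated
        *released = true;
    }
};

bool destroyedAtScopeExit() {
    bool flag = false;
    {
        Buffer b(8, &flag);
    }                      // b goes out of scope here: destructor runs
    return flag;
}

bool destroyedByDelete() {
    bool flag = false;
    Buffer* p = new Buffer(8, &flag);
    delete p;              // destructor runs, then the memory is freed
    return flag;
}
```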
Java does not have destructors, since it manages all memory allocation and cleanup using a
garbage collection scheme. When an object can no longer be reached through any live reference, the system
may reclaim it at some later point, thus freeing the programmer from the need to bother about
proper cleanup. However, Java does allow one to write a finalize method, which will be
invoked when an object is actually destroyed by the system. There are some problems with
this approach however, since the garbage collector is not required to harvest an object at
any point, and may not do so at all as the program runs, even if the object is unreachable.
This means that the memory required by the program may grow very large before the garbage
collector starts reclaiming memory resources, which can lead to performance issues. Although
there is no delete operator in Java, one can make an object eligible for collection by
assigning null to every reference that refers to it. The system will then (perhaps) reclaim the memory associated
with that object later.

6.2.1 The Copy Constructor

One often wishes to initialise a new object to be the same as an existing one. In this case, one
can use a copy constructor to achieve this. C++ generates a default copy constructor which
does a field by field copy from the source object to the (new) target object. Naturally, this
copy is shallow in the sense that only field values are copied: any memory which a pointer
field may point to will not be duplicated. If our class contains member variables that point to
dynamically allocated blocks of memory, we must write our own copy constructor to ensure that
a deep copy occurs. A copy constructor is invoked by the compiler when code of the following
sort is encountered:
MyObject oldObj;
MyObject X(oldObj);  // copy constructor
MyObject Y = oldObj; // copy constructor: this is an initialisation
// does NOT call copy constr.; = is an operator
X = oldObj;
Note that a simple assignment outside of a type declaration will not call the copy constructor.
The assignment operator when used in any other context is treated as a true binary operator
(which may be overloaded) and C++ will generate a default copy operation which does a field
by field copy from the source to the destination. If a deep copy is required, we should overload
the = operator we shall see how to do this shortly.
A copy constructor should be defined as follows:
class name(const class name& optName) opt init list;
The use of const in the argument is not technically required, but it guarantees that the constructor will not modify its argument (as indeed it should not) and allows constant objects and temporaries to be copied. The use of a reference is obligatory:
if the argument were passed by value, the copy constructor would itself have to be called to build the argument, leading to endless recursion. More generally, you should pass class objects to functions by reference where possible, since
this avoids an unnecessary copy.
For our Matrix class, we could write the following copy constructor:
Matrix(const Matrix& oldMx) : rows(oldMx.rows),
cols(oldMx.cols) {
space = new double * [rows];
for (int i = 0; i < rows; i++) {
space[i] = new double [cols];
for (int j = 0; j < cols; j++)
space[i][j] = oldMx.space[i][j];
}
}
There are several things to note here. Firstly, we may use an initialiser list to set some of the
new class object's member variables, but this is not mandatory. Secondly, because the argument
is a reference, we use the . operator to access the member variables of the source object.
Finally, we require an inner loop to populate the new matrix with the source matrix values.
The function definition for the overloaded assignment operator would look very similar, but the
syntax required to overload an operator is very different. Observe that the constructor does not
modify the source object at all, thus the use of const is appropriate.
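The effect of a deep copy can be checked on a one-dimensional analogue of the Matrix class. In the sketch below, the class IntArray and the function copyIsIndependent are hypothetical; assignment is deliberately disabled so that only the copy constructor is exercised.

```cpp
// Hypothetical class owning a dynamically allocated array of int.
class IntArray {
private:
    int size;
    int* data;
    // Assignment is not needed for this sketch; it is declared private and
    // left undefined so that a shallow copy cannot happen by accident.
    IntArray& operator=(const IntArray&);
public:
    IntArray(int n) : size(n), data(new int[n]) {
        for (int i = 0; i < size; i++) data[i] = 0;
    }
    // Deep copy: allocate fresh storage, then copy the element values.
    IntArray(const IntArray& other)
        : size(other.size), data(new int[other.size]) {
        for (int i = 0; i < size; i++) data[i] = other.data[i];
    }
    ~IntArray() { delete [] data; }
    int Get(int i) const { return data[i]; }
    void Set(int i, int v) { data[i] = v; }
};

// After a deep copy, changing the copy leaves the original untouched.
bool copyIsIndependent() {
    IntArray a(3);
    a.Set(0, 7);
    IntArray b(a);         // copy constructor
    b.Set(0, 99);
    return a.Get(0) == 7 && b.Get(0) == 99;
}
```

With the compiler-generated (shallow) copy constructor, both objects would share one array and the destructor would release it twice.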

6.2.2 Return Value Optimisation

A common requirement is for a member function to return a copy of a class variable. This
return by value causes the class copy constructor to be invoked, to build a new class object
with the contents of the temporary variable passed back from the function. Thus, a constructor
is called twice. The C++ compiler supports a special optimisation that can build a returned
object in-place i.e. in the space set aside for the return value. For example,
B A::f(int x) {
B b(x*x);
return b;
}
B bb = a.f(5);
can be recoded as
B A::f(int x) {
return B(x*x);
}
B bb = a.f(5);
where the returned object is directly built in the space allocated for bb.

6.3 The Pointer this

A class member function may be applied through the . or -> operator:


Object.method(args);
ObjectPtr->method(args);
In either case C++ includes an unspecified parameter in the argument list of the member
function, which is a pointer to the object on which the method was invoked. This pointer may
be accessed from within a member function as this. Since it is a pointer, dereferencing it will
return the entire invoking object. One of the common uses of this within a member function
is to return the invoking object:
class C { public: C& GetC(void); };
C& C::GetC(void)
{ return *this; }
Note that in this example, a reference to the object is returned, which is more efficient than
returning the entire object. Java has a similar construct, but it is a reference to the invoking
instance, rather than a pointer.
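Returning *this by reference makes it possible to chain several calls on the same object. The following sketch illustrates this with a hypothetical Counter class; the member names are assumptions made for the example.

```cpp
// Hypothetical class whose Add method returns the invoking object.
class Counter {
private:
    int value;
public:
    Counter() : value(0) {}
    Counter& Add(int n) { value += n; return *this; }
    int Value() const { return value; }
    // this holds the address of the invoking object.
    bool IsSameObject(const Counter& other) const { return this == &other; }
};

// Each call to Add returns a reference to c, so the calls can be chained.
int chained() {
    Counter c;
    c.Add(1).Add(2).Add(3);
    return c.Value();      // 6
}
```

Had Add returned a Counter by value, the second and third calls would have modified temporary copies rather than c.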

6.4 Class Type Conversions

We have already seen that we may cast one type to another, in which case the source type
assumes the characteristics of the target type. C++ knows how to cast simple types and pointers,
but for more complex entities such as class types there is no obvious method for performing the
conversion. A C++ programmer may write type conversion constructors, which are constructors
which take a single argument, different from the class to which they are bound, and convert
this argument to an object of the class. Subsequently, explicit casts (using either (type)object
or type(object)) can be formed as desired. Since C++ often performs implicit casts between
types, the compiler will attempt to use a class defined type conversion constructor whenever a
cast is appropriate. If it cannot identify such a constructor, and the conversion is not trivial, a
compile-time error will result.
A type conversion constructor should thus be defined as follows:
class name(type optName) opt init list;
where the source type may be a reference, and the initialisation list is optional, as usual. Note
that although these constructors can also be invoked in a declaration, they are not variants of a
copy constructor. The latter only accepts a single argument of the same type, so no conversion
between types occurs. In the case of our Matrix class, it may be useful to convert from some
other matrix representation to our own. In this case, we might write something like the
following:
class ForeignMatrixType { ... };
...
Matrix::Matrix(ForeignMatrixType& fmt) { ... }
...
ForeignMatrixType f,g;
Matrix A = f, B, C;
B = g;
C = B + f;
We have omitted details of the other class, since this is simply an illustration of the points
discussed above. Observe how we can mix the two class types in the last line: C++ will
implicitly call our type conversion constructor to convert f appropriately. Alternatively, we
could force an explicit type conversion:
C = B + (Matrix)f;
This code example shows operator overloading in action, a subject we will address shortly.
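Since the Matrix example omits its details, a complete (if trivial) type conversion constructor may help. In this sketch the class Celsius and the functions warmer and implicitDemo are hypothetical; the point is only to show where the compiler applies the conversion implicitly.

```cpp
// Hypothetical class with a type conversion constructor from double.
class Celsius {
private:
    double degrees;
public:
    Celsius(double d) : degrees(d) {}    // converts a double to a Celsius
    double Degrees() const { return degrees; }
};

// Expects two Celsius objects and returns the larger reading.
double warmer(const Celsius& a, const Celsius& b) {
    return a.Degrees() > b.Degrees() ? a.Degrees() : b.Degrees();
}

// The first argument is a double: the compiler builds a temporary Celsius
// from it using the conversion constructor. The second is converted
// explicitly.
double implicitDemo() {
    return warmer(21.5, Celsius(19.0));
}
```

The arguments are declared as const references so that the temporary created by the implicit conversion can be bound to them.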

6.5 Constant Objects and Methods

C++ allows one to define constant objects: these are objects which cannot be modified by
invoking member functions or accessing member variables. For example, the following declares
a constant 4 x 4 Matrix object:
const Matrix mymatrix(4,4);

Any attempt to modify the contents of the object will generate an error. Furthermore, it is also
illegal to pass a constant object by reference (or its address by pointer) to a function which does not explicitly declare the appropriate
argument to be const.
Unfortunately, C++ assumes that all object method calls will attempt to change the object and
will therefore generate a compilation error when any method is invoked on the constant object.
Thus, in the case of our Matrix class even a simple query function like GetRowSize() will be
flagged as illegal. If we are to work with constant objects, we must explicitly declare which
methods are constant methods i.e. those that do not modify the object in any way. We do this
by placing the const keyword after the argument list, but before the method body. To allow
for the use of constant Matrix objects, we should redefine the following methods as const:
int GetRowSize(void) const { ...};
int GetColSize(void) const { ...};
double g(int i, int j) const { ...};
The method s(i,j,val) sets an element of the array, and cannot therefore be declared const. The
compiler checks to make sure we do not attempt to modify the object from a const function.
A const function can also return a const object. An object returned in this manner cannot
be an lvalue, since it is now immutable. It can, however, be assigned to another variable and
manipulated through that variable. A const method can also be applied to non-constant object
instances. C++ also supports the notion of a constant pointer to an object, as opposed to a
pointer to a constant object. In the former case, the pointer variable may not be modified,
whilst in the latter case, the object itself may not be modified. Here are some examples:
const Object *ptr; // pointer to constant object
// *ptr = ... is illegal!
Object * const ptr; // constant pointer to object
// ptr = ... is illegal!
const Object *const ptr; // const pointer to a const
// object; ptr = ... and *ptr = ... both illegal
Java does not have a means of declaring constant methods. A final method is constant in
some sense, but that constancy has nothing to do with prohibiting a method from modifying
an object to which it is bound. However, when final is applied to a member variable, that
variable becomes immutable, just as a C++ const member variable is immutable.
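The interaction between constant objects and constant methods can be sketched with a small class. The names Point, sumCoords and constDemo are hypothetical and used purely for illustration.

```cpp
// Hypothetical class with two const query methods and one modifier.
class Point {
private:
    int x, y;
public:
    Point(int px, int py) : x(px), y(py) {}
    int X() const { return x; }          // may be called on const objects
    int Y() const { return y; }
    void MoveTo(int px, int py) { x = px; y = py; }   // modifies: not const
};

// p is a reference to a constant object, so only const methods are allowed.
int sumCoords(const Point& p) {
    // p.MoveTo(0, 0);   // would not compile: MoveTo is not a const method
    return p.X() + p.Y();
}

int constDemo() {
    const Point origin(3, 4);    // constant object
    Point q(1, 1);               // ordinary object
    q.MoveTo(5, 6);              // fine: q is not const
    return sumCoords(origin) + sumCoords(q);   // 7 + 11
}
```

The same const methods work on both origin and q, which is why query functions should be declared const as a matter of habit.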

6.6 Operator Overloading

We have examined method overloading in a limited context thus far, namely that of class constructors. It is possible to overload any method within a class, except the destructor. Overloading simply provides multiple versions of a single function. Most of the intricacies involved in
member function overloading arise from inheritance, so we will defer a more detailed examination until we deal with that subject. We can, however, discuss the way in which operators are
overloaded. This is a facility which Java does not provide, although in our opinion it is a very
useful one which leads to elegant and clear code.

C++ permits most of the operator set to be overloaded, including the index operator, ([]), the
function call operator (), new and delete. The operators which cannot be overloaded are: .,
.*, ::, ?: and sizeof (nor can the pre-processor symbols # and ##). The precedence and associativity of the original operators remain unchanged
by the overloading process.
In order to differentiate operator overloading from method overloading, a special method definition is required:
ret_type operator op_symbol ( args );
Both unary and binary versions of a particular operator can co-exist. The return type and
arguments will depend on the nature of the operator. For example, consider the code fragment
below, which overloads the assignment operator for the Matrix class:
Matrix& operator=(const Matrix& A) {
// avoid A = A type assignments
if (this == &A) return *this;
if (space != NULL) {
for (int i = 0; i < rows; i++)
delete [] space[i];
delete [] space;
}
rows = A.rows; cols = A.cols;
space = new double * [A.rows];
for (int i = 0; i < A.rows; i++) {
space[i] = new double [A.cols];
for (int j = 0; j < A.cols; j++)
space[i][j] = A.space[i][j];
}
return *this;
}
Perhaps the most interesting construct here is the pointer this. In C++ this points to the
object instance to which the method has been applied. Thus, to access the object itself, one
must use *this. Observe that although the assignment operator is a binary operator, we only
have one argument in the argument list. This is not really unexpected: the overloaded operator
is a method which gets applied to a specific instance of the class. Thus if we see A = B; this
is essentially equivalent to the method call invocation A.operator=(B);. We do not need to
indicate the left-most argument, C++ knows that it will be the object to which the operator is
applied. Next, examine the argument and return types. We pass a reference to the object and
return a reference too, since this is more efficient. In some cases we are compelled to return a
reference because the return value can serve as an lvalue; in this case we return one purely for
efficiency, to avoid an object copy.
Once we have defined an overloaded operator, C++ will automatically use that operator if it
cannot use the default operator:
Matrix operator + (const Matrix& B) const { // const reference: B may be a temporary
    Matrix temp(*this); // copy constructor used!
    if (B.cols != cols || B.rows != rows) {
        cerr << "Can't add these mxs!";
        return *this; }
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < cols; j++)
            temp.space[i][j] += B.space[i][j];
    return temp;
}

Because an add operation does not change either of its operands, we must create a temporary
Matrix object to contain our result. We then pass this back by value: we cannot use a reference,
although it is more efficient, because the variable temp will go out of scope when the function
returns. However, the return statement copies its argument, using the default (or overloaded)
copy constructor into the receiving variable. Note that if we need to define a copy constructor we
will almost certainly need to overload the assignment operator, to ensure that the appropriate
deep copying occurs in both instances.
Having overloaded these operators, we can now overload the += operator very simply:
Matrix& operator +=(Matrix& B) {
*this = *this + B; //use of overloaded = and +
return *this; // return modified object
}
In this case, we modify the left-most argument (the object itself) which is why we return *this.
Note that one must attempt to cater for all possible usages of the operator, thus coding a void
return type would not have been satisfactory here. If we encountered an expression such as !(A
+= B) (assuming that appropriate overloading had been performed) then a void return type
would result in an error.
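The need for a non-void return type can be seen in a compact sketch. The class Vec2 below is hypothetical and much simpler than Matrix, but its + and += operators follow the same pattern as those above.

```cpp
// Hypothetical two-component vector with overloaded + and +=.
class Vec2 {
public:
    double x, y;
    Vec2(double px, double py) : x(px), y(py) {}

    // Neither operand is changed, so the result is returned by value.
    Vec2 operator+(const Vec2& rhs) const {
        return Vec2(x + rhs.x, y + rhs.y);
    }

    // The left operand is changed, so a reference to it is returned.
    Vec2& operator+=(const Vec2& rhs) {
        *this = *this + rhs;     // uses operator+ and the default =
        return *this;
    }
};

// Because += returns a reference to the modified object, its result can
// take part in a further expression.
double chainedAdd() {
    Vec2 a(1.0, 2.0), b(3.0, 4.0), c(10.0, 20.0);
    (a += b) += c;               // a becomes (14, 26)
    return a.x + a.y;            // 40
}
```

With a void return type, the expression (a += b) += c would be rejected by the compiler.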
Unary operators (such as !) are overloaded in a similar fashion. Now however, we have no
argument list, since the operator acts on the object instance alone. If we wish to write a Matrix
inverse operator we could proceed as follows:
Matrix operator ! () {
Matrix temp(*this);
// ... code to invert matrix temp
return temp;
}
In this case the inverse of the matrix is computed on a local copy which is then returned by
value. This behaviour is, however, one we have chosen and not strictly necessary.
Another interesting operator is the function call operator, (). We can use this to achieve
Matrix indexing ([] will not work, since only one argument can be taken):
double& operator() (int i, int j) {
// ...error checking code
return space[i][j]; }

We could use this functionality as follows:


Matrix A(2,2);
A(0,0) = 0.0; A(1,1) = 1.0;
A(0,1) = A(1,0) = 1.0;
Note that we returned a reference because the returned value can serve as an lvalue.
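A self-contained version of this indexing operator is sketched below. It is not the Matrix class itself: the hypothetical class Grid stores its elements in a single std::vector in row-major order (an assumption of this sketch) and provides a second, const version of the operator for use on constant objects.

```cpp
#include <cassert>
#include <vector>

// Hypothetical two-dimensional grid indexed with operator().
class Grid {
private:
    int rows, cols;
    std::vector<double> cells;   // row-major storage (assumption of the sketch)
public:
    Grid(int r, int c) : rows(r), cols(c), cells(r * c, 0.0) {}

    // Non-const version: returns a reference, so the result is an lvalue.
    double& operator()(int i, int j) {
        assert(i >= 0 && i < rows && j >= 0 && j < cols);
        return cells[i * cols + j];
    }

    // Const version: returns a copy, for read-only access.
    double operator()(int i, int j) const {
        assert(i >= 0 && i < rows && j >= 0 && j < cols);
        return cells[i * cols + j];
    }
};

double trace2x2() {
    Grid g(2, 2);
    g(0, 0) = 1.5; g(1, 1) = 2.5;
    g(0, 1) = g(1, 0) = 9.0;
    const Grid& view = g;        // read through the const version
    return view(0, 0) + view(1, 1);   // 4.0
}
```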

6.6.1 Type Conversion using Operator Overloading

We have already seen how a type conversion constructor can be defined to allow automatic type
conversions to a given class. However, this framework will not work if we wish to convert to a
type other than that of the associated class. For example, we can conceive of a way of converting
from our Matrix class to a double, perhaps by constructing a matrix norm, but this conversion
cannot be effected by a type conversion constructor, since the destination type is not of type
Matrix.
C++ allows us to overload the type casting operator. For example, if we wished to allow
conversion from a Matrix to double, we could write the following:
operator double () const {
double mxNorm = 0.0;
// compute matrix norm...
return mxNorm; }
The target type can be any valid type, and we have declared the function to be const since it
will not modify its operand. Note that no argument or return type can be specified: the compiler
will generate the appropriate types behind the scenes, and will automatically use this operator
to effect any necessary casts. Although this example is somewhat contrived, the existence of this
type conversion facility is important and ensures that arbitrary type conversions are possible.
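A complete, if simple, conversion operator is shown in the sketch below. The class Fraction and the function onePlusHalf are hypothetical names chosen for the example.

```cpp
// Hypothetical class that can be converted to double.
class Fraction {
private:
    int num, den;
public:
    Fraction(int n, int d) : num(n), den(d) {}
    // Conversion operator: no return type or arguments are written.
    operator double() const {
        return static_cast<double>(num) / den;
    }
};

// The compiler applies the conversion implicitly in the initialisation.
double onePlusHalf() {
    Fraction f(1, 2);
    double d = f;          // uses operator double()
    return d + 1.0;        // 1.5
}
```

An explicit cast such as (double)f invokes the same operator.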
In summary, we present some guidelines to follow when overloading operators:
Generalise Try to ensure that your overloaded operator will function in all possible expressions.
Use References Always pass arguments by reference and try to use references for your return
type too. You only need to return objects by value if the operator does not modify the
object to which it is bound.
Be Sensible Some operators have a well established meaning and you may cause great confusion if you overload them willy-nilly. The operators new and delete are good examples of
this.
Unary Operators These should always modify a copy of the operand except for the ++ and
-- operators. The post and prefix operators require special techniques to differentiate
them. The postfix variant is defined with a dummy int argument, which is not used but
allows the compiler to separate the definitions. They should properly have the following
prototypes:
const classname & operator ++ (); //prefix ++x
classname operator ++ (int); // postfix x++
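The two prototypes above might be implemented along the following lines. The class Counter and the helper functions are hypothetical names introduced only for this sketch.

```cpp
// Hypothetical class used to contrast the prefix and postfix forms.
class Counter {
    int value;
public:
    explicit Counter(int v = 0) : value(v) {}
    int get() const { return value; }

    // Prefix ++x: modify the object, then return a reference to it.
    const Counter& operator++() { ++value; return *this; }

    // Postfix x++: the dummy int only distinguishes the signature.
    Counter operator++(int) {
        Counter old(*this);  // copy of the state before the increment
        ++value;
        return old;          // returned by value: 'old' is a local object
    }
};

int prefixResult()  { Counter c(5); return (++c).get(); }  // yields 6
int postfixResult() { Counter c(5); return (c++).get(); }  // yields 5
```

The postfix form must return by value because the object it returns (the old state) ceases to exist when the function ends; this is also why the prefix form is usually the cheaper of the two.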

Basic operator overloading provides a great deal of flexibility, but there are some limitations.
In particular, the left-most operand is always assumed to be an object of the appropriate type.
This need not be so. Consider the * operator, and imagine that we allowed multiplication
of a Matrix by a constant: c * A;. How could we overload * to achieve this? The left-most
argument is a double and the right-most argument is an object: the usual overloading strategy
will not work, since a member operator is only considered when its left operand is already an
object of the class. Of course, if we have defined a type conversion constructor from double to
Matrix, then we could explicitly convert c to a Matrix first. Nonetheless, this objection is an important one and
we require a means to circumvent this restriction. We do this using friend functions.

6.7

Friends of a Class

Encapsulation of class members and functions is one of the fundamental tenets of object orientated design and programming. Java applies this model rigorously, but C++, partly because of
its piecemeal evolution, allows certain liberties when working with classes. Perhaps the most
blatant example of this is the so-called friend function. For reasons of efficiency and elegance,
C++ permits a class to define a list of functions which can access all the members of the class as
if they were class methods. Although this seems to violate encapsulation, the right to access class
internals can only be granted by the class itself, and must be specified in the class definition.
A friend function is declared as follows:
class ClassName {
...
friend ret_type FunctionName (args);
...
};
ret_type FunctionName (args) { ...}
Note that the keyword friend is only applied to the function declaration within the class. One
can declare a whole class to be a friend of another:
class ClassName {
...
friend class ExternalClass;
...
};
In this case, all the member functions of the class ExternalClass will have access to ClassName's
internal variables and functions.
We can use friend functions to obtain symmetrical binary operators for the problematic case
we mentioned before. Thus, we can write:
class Matrix {
...
// the form Obj * double
Matrix operator*(double c);
// the form double * Obj
friend Matrix operator*(double c,
const Matrix& A);
...
};
Matrix operator*(double c, const Matrix& A) {
Matrix temp(A);
// multiply logic
return temp;
}
In this code snippet, we have both a member and a friend version of the * operator. This is not an
issue, since overloading allows us to have several versions of a function provided their
signatures differ. Because the friend is not a member of the class, it has no implicit this pointer. We must
therefore pass an object reference as one of the function arguments: the compiler will supply
the appropriate Matrix operand when it invokes the function.
Another interesting example of a friend operator function is the output operator, <<. The
associativity of this operator is left to right, and for correct usage it must return a reference to
an ostream object:
ostream& operator<<( type );
Because of the associativity constraints, if we wish to write our own output operator for Matrix
objects we cannot use a member function: the left-operand will be an ostream object. Thus,
we must use a friend function:
friend ostream& operator<<(ostream& theStream,
const Matrix& M );
...
ostream& operator<<(ostream& theStream,
const Matrix& M )
{
theStream << "Matrix size is " <<
M.rows << "X" << M.cols << endl << endl;
for (int i =0; i < M.rows; i++) {
for (int j = 0; j < M.cols; j++)
theStream << M(i,j) << " ";
theStream << endl;
}
return theStream;
}
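The utility of operator overloading is clear from this example. We have used the overloaded ()
operator to access the matrix elements, although we could directly access the member variables
to do this. Note that, because M is a const reference, the call M(i,j) requires a const version
of the () operator; the non-const version shown earlier cannot be applied to a const object.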
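The two friend operators discussed in this section can be combined into one small, compilable sketch. The vector-based storage, the fill-value constructor and the helper demo() are assumptions made to keep the example self-contained; only the operator signatures follow the text.

```cpp
#include <cstddef>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical, simplified Matrix with symmetric scalar multiplication.
class Matrix {
    int rows, cols;
    std::vector<double> space;
public:
    Matrix(int r, int c, double fill = 0.0)
        : rows(r), cols(c), space(r * c, fill) {}

    double& operator()(int i, int j)       { return space[i * cols + j]; }
    double  operator()(int i, int j) const { return space[i * cols + j]; }

    Matrix operator*(double c) const {       // the form Obj * double
        Matrix temp(*this);
        for (std::size_t k = 0; k < temp.space.size(); ++k) temp.space[k] *= c;
        return temp;
    }
    // Friends: declared inside the class, defined outside it.
    friend Matrix operator*(double c, const Matrix& A);   // double * Obj
    friend std::ostream& operator<<(std::ostream& os, const Matrix& M);
};

Matrix operator*(double c, const Matrix& A) {
    Matrix temp(A);
    for (std::size_t k = 0; k < temp.space.size(); ++k)
        temp.space[k] *= c;                  // direct access to private data
    return temp;
}

std::ostream& operator<<(std::ostream& os, const Matrix& M) {
    for (int i = 0; i < M.rows; i++) {       // private members: friend access
        for (int j = 0; j < M.cols; j++)
            os << M(i, j) << " ";            // uses the const operator()
        os << "\n";
    }
    return os;      // returning the stream allows chaining: cout << A << B
}

// Formats 2.0 * A for a 2x2 matrix of ones into a string.
std::string demo() {
    Matrix A(2, 2, 1.0);
    std::ostringstream out;
    out << 2.0 * A;   // * binds tighter than <<, so the product is printed
    return out.str();
}
```

Because operator<< takes any std::ostream, the same function serves std::cout, file streams and string streams alike.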

Chapter 7

Inheritance
The concept of inheritance is fundamental to object orientated programming, and as such is
supported by both Java and C++. A class may be sub-classed or derived from another, in the
sense that it inherits the characteristics of that class and complements them with additional
information. We thus speak of derived classes, and use the word inheritance to refer to the actual
process by which this is accomplished. A derived class will contain the member functions and
variables of its predecessor, in addition to new member variables and functions. The primary
purpose of inheritance is to extend the utility of a class, and to thus promote the reuse of existing
code. In other words, if we find that our class definition is too general, we can narrow our focus by
refining the class definition. One of the most useful features of inheritance is dynamic function
binding, which allows us to invoke the same function on different derived classes in the sure
knowledge that the appropriate version will be called when the program executes. C++ also
permits multiple inheritance, which allows a sub-class to be derived from several parent classes.
Java does not support multiple inheritance, but allows one to use interfaces to achieve similar
functionality. C++ does not have the super keyword: you must keep track of a given class's
parent.

7.1

Simple Inheritance

The notation used by C++ and Java is very different for class inheritance. Java uses the extends
keyword, whilst C++ does not have a special keyword. Assume we have class B which inherits
from (extends) class A. We would declare such a class as follows:
class A { ...};
class B : public A { ... };

The second line declares class B to be a sub-class of class A, as indicated by the type name after
the :. The keyword public determines how members are inherited, and will be discussed
shortly. The new sub-class will contain the data and function members of its parent class, and
will add additional members in its own class definition. If a variable has the same name in
the derived class, it will hide the variable from the parent class. Otherwise the variable can be
accessed as if it were defined in the sub-class. Similarly, if a function from the base class has
the same name as one defined in the sub-class, the sub-class function will hide or override the
base class function. If the data members and functions are uniquely named, they will all be
accessible from the newly derived class. If a variable or function is hidden by the derived class,
we can use the scope operator :: to access it. Thus, if A has a member int f(void) and B has
a member float f(int), we can access A's version of f() by using A::f(). A function name
will be hidden irrespective of the argument list: in this case only the name matters, rather than
the function signature.
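The hiding rule and the use of the scope operator can be checked with a short sketch. The function bodies and the helper functions viaScope(), viaMember() and ownF() are invented for illustration; only the signatures int f(void) and float f(int) come from the text.

```cpp
class A {
public:
    int f() { return 1; }
};

class B : public A {
public:
    // Different argument list, same name: this hides A::f() inside B.
    float f(int x) { return x * 0.5f; }

    int callBase() { return A::f(); }   // scope operator reaches the hidden f
};

int   viaScope()  { B b; return b.A::f(); }     // qualified call from outside
int   viaMember() { B b; return b.callBase(); } // qualified call from inside B
float ownF()      { B b; return b.f(4); }       // B's own version
// B b; b.f();   // would not compile: A::f() is hidden by B::f(int)
```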
Note that private members of the base class are effectively not inherited: although they still form
part of every derived object, they will not be accessible from
the derived class. Java applies an identical access policy, but does not allow the same level of
access control, as we shall see.
A derived class may itself serve as a base class for further refinement. This leads to the notion
of indirect bases. For example, if we defined a class C derived from class B, we would say that
A is an indirect base class of C, while B is a direct base class of C.
There are certain functions which cannot be inherited:
The = operator Defined by default for each class, and so hides the base class definition.
Friends If friends could be inherited, then any class could use inheritance to access the parent
class internals.
Constructors and Destructors These are individual to each class. C++ automatically invokes the base class constructors and destructors when a derived class is created or destroyed.
The following example shows a few of these ideas in action. Let us assume we have a base class
A, and a derived class B, defined as follows:
class A {
private:
int p;
int f1(int);
public:
int q;
float f2(float);
};
class B : public A {
private:
char f1(char);
public:
int f1(float);
int f2(float);
int f3(float);
};
Class A contains a private member p; this variable will never be visible in any class directly or
indirectly derived from A. Similarly, the private member function f1() of A will not be accessible
from within B, and cannot be made accessible to any sub-class of B. Note that within B, B's own
private version, f1(char), can be called alongside the public f1(float): the two have different
signatures, so in this case function
overloading, rather than overriding (through inheritance), is in operation. If the function f1()
in class A had been in the public section, then this function would have been hidden by the
overloaded f1() functions in B.

There is a third access specifier: protected. In C++, as in Java, a protected member can only
be accessed from within the inheritance hierarchy, and by friend functions. Unlike a private
member, which remains private and cannot be accessed by a derived class, a protected member
can be accessed by any derived class, but cannot be accessed from outside of the inheritance
hierarchy. Thus protected provides a level of access control between private (no access at all)
and public (access for everyone).
class A {
private:
int p;
protected:
int q;
float f2(float);
};
class B : public A {
public:
int f1(float);
};
int B::f1(float tt)
{
if (q == 1) return 1;
else return f2(tt); }
...
B testObject;
testObject.q = 1; // ILLEGAL!!!
testObject.f1(2.0); // FINE - public method
If we wish people to extend a class we have written, it may be advantageous to declare members
to be protected rather than public.

7.2

Inheritance Access Control

C++ introduced another level of access control which is lacking in Java, namely that of inheritance access control. Java relies on the class access control mechanism to provide appropriate
control of the object internals throughout the inheritance hierarchy. In many cases this is sufficient. However, it is sometimes desirable to ensure that an inherited member assumes different
access rights in a derived class. Java retains the access specifier from the parent class. In
other words, a public member remains public in the derived class, a protected member remains protected and private members are not inherited at all. This corresponds to public
inheritance in C++. C++ makes provision for 3 inheritance schemes:
public public members remain public, protected members remain protected, private members
are not inherited.
private public and protected members become private, private members are not inherited.
protected public members become protected, protected members remain protected, private
members are not inherited.

If you use private inheritance you will lose access to all base class members further down the
inheritance hierarchy, so this is rarely used. You might do this to isolate a base class from any
tampering, but for this to work the sub-class must have its own protected or public interface.
Unlike in Java, every class can be sub-classed (Java has the keyword final which can be used to
block sub-classing; C++ only gained a comparable final specifier with C++11).
To ensure that the access specifiers propagate untouched, you can use public inheritance. Of
course, private members will no longer be accessible, but this is always the case. If you use this
approach, you can continue to derive new classes from any previously derived class. However, at
each step, the private data members of the base class will no longer be accessible in the derived
class. This is not always desirable. If, on the other hand, you declare the base class members
to be protected, you can enjoy many of the security benefits of private access whilst allowing
future extensions of your class.
If you use protected inheritance, you will force public members from the base class to become
protected in the derived class. This adds an additional layer of security to your class and will
stop public access to the new class members. In general, you want to ensure that users of your
class use the provided public member functions to access and modify the object, since they may
otherwise break the code. Making the member variables private is often too restrictive. It
is usually a good idea to use protected or public inheritance with all the appropriate base
members suitably protected. The following example illustrates a few of these points:
class Base {
protected:
int var1;
public:
int var2;
};
class PrivEx : private Base {
private:
int priv;
public:
int pub1;
};
class ProtEx : protected Base {
private:
int priv;
public:
int pub1;
void f1(void)
{ var2 = 1; } // OK - var2 is protected!!
};
PrivEx privEx;
ProtEx protEx;
...
privEx.var2 = 1; // ERROR - private in new class
protEx.var2 = 1; // ERROR - protected in new class
protEx.pub1 = 1; // OK - in public section
The existence of inheritance access control allows complete flexibility in determining precisely
how your derived class access will be structured. In fact one can go further, and use what is
known as an access declaration to selectively modify the access of certain members in the derived
class:
class Base {
protected:
int vprot;
public:
int prot;
float flot;
};
class Deriv : public Base {
protected:
Base::prot; // member is now protected
};
Deriv myObject;
myObject.flot = 1.0; // valid...
myObject.prot = 2; // ERROR!!!
The only restriction placed on such access declarations is that the new declaration cannot make
the member more accessible than it was in the base class. Thus in the above example, a
declaration like Base::vprot under the public section would have been invalid.
Let us put this all together by referring back to our Matrix class. Assume that we wish to derive
a ComplexMatrix class from our (real valued) Matrix class. One of the goals of object orientated
programs is code reuse, so we wish to use as much of our existing code as possible to achieve
this. Let us make a small modification to our base class to ensure that it can be extended: we
will change the private specifier to protected. We do this so that the internal variables (size
and storage) will be accessible through the entire inheritance hierarchy, but not to outside code.
We assume that those sub-classing our Matrix (or its descendants!) will know what they are
doing! We now define our new class:
struct Complex {double x, y; } ;
class ComplexMatrix : public Matrix {
protected:
double **ImaginaryPart;
public:
ComplexMatrix(void)
{ ImaginaryPart = NULL; }
ComplexMatrix(int r, int c);
~ComplexMatrix();
Complex g(int i, int j);
void s(int i, int j, Complex val);
};
Complex ComplexMatrix::g(int i, int j) {
Complex temp = {0.0, 0.0}; // returned unchanged if the index is invalid
if (ValidIndex(i,j)) {
temp.x = space[i][j];
temp.y = ImaginaryPart[i][j];
}
return temp;
}
void ComplexMatrix::s(int i, int j, Complex val) {
if (ValidIndex(i,j)) {
space[i][j] = val.x;
ImaginaryPart[i][j] = val.y;
}
}
// the initialisation list passes r and c on to the base class constructor
ComplexMatrix::ComplexMatrix(int r, int c) : Matrix(r,c) {
int i;
ImaginaryPart = new double* [r];
for (i = 0; i < r; i++) {
ImaginaryPart[i] = new double [c];
}
}
ComplexMatrix::~ComplexMatrix() {
if (ImaginaryPart != NULL) {
for (int i = 0; i < rows; i++)
delete [] ImaginaryPart[i];
delete [] ImaginaryPart;
}
}

This example only implements the extension for the basic Matrix class, but operator overloading
etc can also be coded. The operators from the base class will still be present and we can use
Matrix manipulations to compute certain ComplexMatrix operations (such as summation). We
will not show this however, since it adds nothing to this example.
The new ComplexMatrix class illustrates several features of inheritance:
Member reuse The new class has access to all the non-private base class member variables
and functions. Observe that we use rows and cols to represent the dimensions of both a
Matrix and a ComplexMatrix object. We also reuse the Matrix space, treating it as the
real part of the complex matrix. The protected auxiliary function ValidIndex() is also
inherited and may be freely used; this saves us from writing a new version.
Base Class Initialisation The constructor functions initialise the base class in the standard
way: by a call to the appropriate base class constructor, made through the initialisation
list. Observe that the initialisation list belongs with the constructor definition, where the
function body is given,
and that the constructors do not refer to the base class variables. The destructor will
automatically invoke the destructor for the base class when it is called.
Polymorphism We have defined functions like ValidIndex() which can bind to either a
Matrix object or a ComplexMatrix object. This is known as polymorphism and is an
important feature of object orientated design. The C++ compiler will ensure that
the correct code is generated when the member function is called, regardless of the object
it is attached to. The functions s() and g() are also examples of polymorphic functions,
but ones which are overloaded (multiple versions exist, defined through inheritance in this
case).
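The order in which base and derived constructors and destructors run can be observed with a small sketch. The classes Base and Derived, the global string trace and the helper lifeCycle() are hypothetical and exist only to record the sequence of calls.

```cpp
#include <string>

std::string trace;   // records the order of calls (global only for brevity)

class Base {
protected:
    int rows;
public:
    explicit Base(int r) : rows(r) { trace += "[Base ctor]"; }
    ~Base()                        { trace += "[Base dtor]"; }
};

class Derived : public Base {
    double* extra;
public:
    // The base class is initialised first, through the initialisation list.
    explicit Derived(int r) : Base(r), extra(new double[r]) {
        trace += "[Derived ctor]";
    }
    ~Derived() {
        delete [] extra;
        trace += "[Derived dtor]";
    }   // ~Base() runs automatically after this body
    int size() const { return rows; }   // inherited protected member
};

// Creates and destroys one Derived object and reports what was called.
std::string lifeCycle() {
    trace.clear();
    { Derived d(3); }   // constructed and destroyed within this scope
    return trace;
}
```

Construction proceeds from the base class outwards and destruction runs in the reverse order, which is why a derived destructor only needs to release what the derived class itself allocated.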

7.3

Virtual Functions

C++ allows us to define virtual functions. A virtual function declared in a base class may be
re-implemented (overridden) in any directly or indirectly derived class, and the version that is
called is then chosen according to the actual type of the object. The base class contains a function
declaration preceded by the keyword virtual:
class Base {
public:
virtual void myfunc(int i);
...
};
class Sub1 : public Base {
public:
void myfunc(int i) { ... }
...
};
class Sub2 : public Base {
public:
void myfunc(int i) { ... }
...
};
class Sub3 : public Base {
public:
void myfunc(int i) { ... }
...
};

We must use virtual functions if we wish to allow dynamic binding. Usually the object a member
function refers to is known at compile time, and the compiler is able to generate the appropriate
code as it runs; this is known as static binding. However, in certain cases, the compiler
cannot determine which class the function will be bound to until the code actually executes,
and must therefore generate code to allow dynamic binding. When a virtual function
is present, the dynamic binding mechanism will be invoked to determine the particular version
of the function which should be called. The need for dynamic binding usually arises from the use
of pointers to objects. A pointer to a base class can be made to point to a derived class object,
and the type of the pointer then no longer determines which version of a function should be called:
the actual object pointed to must be examined to make sure that the correct function (based on the
object or sub-object being pointed to) is invoked. If a function is not declared virtual, then
static binding will take place and the type of the object pointer will determine which version of
the function is called. This is all a little confusing so here's an example:
Base *polyarray[4]; // polymorphic array! (array of Base pointers)
polyarray[0] = new Sub1;
polyarray[1] = new Sub2;
polyarray[2] = new Sub3;
polyarray[3] = new Sub2;
for (int i = 0; i < 4; i++)
polyarray[i]->myfunc(i); // dynamic binding

This example creates a polymorphic array of objects, i.e. objects of several types are present
in the array (actually pointers to those objects). Now, because myfunc() is a virtual function,
the version of this function appropriate to each object type will be invoked as we step through
the loop. If we did not declare this function to be virtual, then the type of the array (pointer
to Base) would mean that the version of myfunc() associated with Base would be invoked for
every element of the array. This occurs because the type of the pointer is statically deduced
from the declaration of the array (type Base) and the member function invoked will be that
associated with the Base class. This is usually not desirable, since the function for each class
will usually provide information from the derived class, and we will thus only be looking at part
of the object. In general, one should use virtual functions whenever a derived class needs to
redefine the behaviour of a given method and we wish to treat all derived classes in the same
manner. This simplifies code and may lead to more elegant solutions.
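The difference between the two binding schemes can be seen by calling a virtual and a non-virtual function through the same Base pointers. The member names vname() and sname() and the helper collect() are hypothetical; the class names follow the example above.

```cpp
#include <string>

class Base {
public:
    virtual std::string vname() const { return "Base"; }  // dynamically bound
    std::string sname() const { return "Base"; }           // statically bound
    virtual ~Base() {}
};
class Sub1 : public Base {
public:
    std::string vname() const { return "Sub1"; }  // overrides Base::vname
    std::string sname() const { return "Sub1"; }  // merely hides Base::sname
};
class Sub2 : public Base {
public:
    std::string vname() const { return "Sub2"; }
    std::string sname() const { return "Sub2"; }
};

// Calls one of the two functions through Base pointers and joins the results.
std::string collect(bool useVirtual) {
    Base* polyarray[3] = { new Base, new Sub1, new Sub2 };
    std::string result;
    for (int i = 0; i < 3; i++) {
        result += useVirtual ? polyarray[i]->vname() : polyarray[i]->sname();
        result += " ";
        delete polyarray[i];   // safe: Base has a virtual destructor
    }
    return result;
}
```

With the virtual function each object reports its own class; with the non-virtual function the declared pointer type (Base) decides, so every element reports "Base".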
Java implicitly uses a dynamic binding scheme. C++, on the other hand, does not assume that
all base class functions are virtual: there is a performance penalty associated with implementing
the dynamic binding mechanism and the programmer should be free to invoke member functions
as they see fit. Here are some points to bear in mind when defining virtual functions.
Declaration the base class must declare the function to be virtual. Derived classes need not
use the virtual keyword.
Parameters the signature of the virtual function should be the same in each sub-class (Note:
this means the return type can be different, but we shall ignore this subtlety and assume
it must be identical).
Binding when the function is virtual, the type of the object that the pointer or reference actually refers to dynamically determines which version is called. Otherwise the
declared type of the pointer or reference statically determines the version to be called at
compile time.
Virtual Operators Operators can be declared virtual and dynamically bound. This topic is
advanced, so we will not discuss it further.

7.3.1

Virtual Destructors

The ability to cast object pointers rather arbitrarily can cause problems with destructors. Specifically, if a delete operation is called on a pointer to a derived class which has undergone a type
conversion, the correct chain of destructors may not be invoked. This can lead to memory leaks
in applications which allocate memory within their constructors. This simple example illustrates
the problem:
class Base {
public:
int *block;
Base()
{ block = new int[30];
cout << "30 ints lost" << endl; }
~Base()
{ delete [] block;
cout << "30 ints recovered" << endl; }
...
};
class Deriv : public Base {
public:
int *more;
Deriv() { more = new int[20];
cout << "20 ints lost" << endl; }
~Deriv()
{ delete [] more;
cout << "20 ints recovered" << endl; }
...
};
Base *b1, *b2;
// b1 = new Base;
b2 = new Deriv;
// delete b1; OK - Base destructor called
delete b2; // Argh! - only the Base destructor is called!!!
The output from this code will typically be:
30 ints lost
20 ints lost
30 ints recovered
and we can see that we have indeed lost 20 ints! The problem arises because we have used static
binding and the type of b2 determines the choice of destructor at compile time. If, however, we
make the base class destructor virtual we ensure that the correct destructor will be invoked
when b2 is destroyed, since dynamic binding uses the current type to determine which destructor
will be invoked. It is good practice to declare a destructor virtual if your class will serve as
a base class for new classes. Note that constructors cannot be virtual: we explicitly indicate
which constructor should be called, so there is no ambiguity.
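A corrected version of the example, with a virtual base class destructor, might look like the sketch below. The global string trace and the helper destroyThroughBase() are assumptions added so that the order of destructor calls can be inspected.

```cpp
#include <string>

std::string trace;   // records which destructors run (global only for brevity)

class Base {
    int* block;
public:
    Base() : block(new int[30]) {}
    virtual ~Base() { delete [] block; trace += "~Base "; }
};

class Deriv : public Base {
    int* more;
public:
    Deriv() : more(new int[20]) {}
    ~Deriv() { delete [] more; trace += "~Deriv "; }   // implicitly virtual
};

// Deletes a Deriv object through a Base pointer and reports what ran.
std::string destroyThroughBase() {
    trace.clear();
    Base* b2 = new Deriv;
    delete b2;            // dynamic binding: ~Deriv runs, then ~Base
    return trace;
}
```

Both blocks of memory are now released, because the destructor is selected from the type of the object rather than from the type of the pointer.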

7.3.2

Identifying Object Type

A final point to note is the existence of the typeid operator. This is part of the Runtime Type
Identification (RTTI) system and allows us to (dynamically) determine the type of an object:
if (typeid(*polyarray[0]) == typeid(Base)) {...}
cout << typeid(*polyarray[0]).name() << endl;
You must include <typeinfo> to use this facility. The operator returns an object of class type_info.
This class has a member function name() which returns the name of the class type, as illustrated
in the example code. The exact string returned by name() depends on the compiler.
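One subtlety worth illustrating is that typeid only reports the dynamic type when the class has at least one virtual function. The classes and helper functions in this sketch are hypothetical.

```cpp
#include <typeinfo>

class Base { public: virtual ~Base() {} };  // polymorphic: has a virtual function
class Sub1 : public Base {};
class Plain {};                              // no virtual functions
class PlainSub : public Plain {};

// For a polymorphic class, typeid(*p) reports the type of the object itself.
bool seesDynamicType() {
    Sub1 s;
    Base* p = &s;
    return typeid(*p) == typeid(Sub1);
}

// Without virtual functions, typeid(*p) is just the declared (static) type.
bool seesStaticType() {
    PlainSub s;
    Plain* p = &s;
    return typeid(*p) == typeid(Plain);
}
```

Both functions return true, which shows that RTTI relies on the same mechanism as dynamic binding.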

7.4

Abstract Classes

Both Java and C++ support abstract classes. An abstract class contains one or more pure virtual
functions. A pure virtual function is one which does not have an associated implementation:
it is merely a declaration of a function which all sub-classes must implement. Java uses the
abstract keyword to indicate such functions; C++ uses the following notation:
virtual ret_type FunctionName(arg_list) = 0;
This notation suggests that the function has no body, which is indeed the case. An abstract
class cannot be instantiated: only a derived class in which all the pure virtual functions have
been implemented can be instantiated. If a sub-class fails to implement all the pure virtual
functions, it becomes abstract itself. Non-abstract member functions and variables may also be
part of an abstract class; they are subject to the standard rules governing inheritance. As the
use of the term virtual suggests, pure virtual functions are bound dynamically.
An abstract class provides a means to define a standard behaviour which all sub-classes must
support. The interface mechanism in Java provides a more rigorous means of achieving this,
although it places strict constraints on composition of the class. The following example shows
how we might begin to define a more general Matrix class using a higher level of abstraction:
class Matrix {
protected:
int rows, cols;
bool ValidIndex(int i, int j);
public:
Matrix(int r, int c) : rows(r), cols(c) {}
virtual void s(int i, int j, void *data) = 0;
virtual void *g(int i, int j) = 0;
...
};
class RealMatrix : public Matrix {
protected:
double **space;
public:
RealMatrix(int r=0, int c=0) : Matrix(r,c)
{...}
void s(int i, int j, void *data)
{ space[i][j] = *((double*)data); ...}
void *g(int i, int j)
{ return &space[i][j]; }
...
};
struct Complex {double x, y; };
class ComplexMatrix : public Matrix {
protected:
Complex **space;
public:
ComplexMatrix(int r=0, int c=0) : Matrix(r,c)
{...}
void s(int i, int j, void *data)
{ space[i][j].x = ((Complex*)data)->x;...}
void *g(int i, int j)
{ return &space[i][j]; }
...
};
In this example, the abstract base class Matrix contains the fundamental attributes rows and
cols which describe its dimensions. The function ValidIndex() is inherited normally for each
concrete implementation of Matrix; it only needs to use information from the (abstract) base
class. The functions s() and g() are used to set and get a specified element of a matrix.
This is a facility we want all our derived classes to implement, so we declare them as abstract
functions. We cannot provide a description of these functions in the base class, since they are
intimately involved with the (concrete) physical representation of the data. We thus define
them appropriately for each new sub-class. Within each sub-class we also declare and allocate
storage for the matrix variant: again, this is tied to an actual implementation, so it would be
inappropriate (and difficult) to place such code in the base class. Observe that the abstract class
can still have a constructor and destructor: although it cannot be directly instantiated, a subclass object will call the appropriate base class constructor/destructors when it is created. We
have also used void * pointers in the virtual functions to ensure that we can deal with generic
data. We shall soon see that Templates can be used more effectively and easily to obtain generic
functions and classes. The list of abstract functions should contain all those functions we want
every type of matrix to implement. Our example is simplistic, but it illustrates the general idea
and utility of an abstract class.
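A compact, compilable variant of this design is sketched below. To keep it short it departs from the example above in several ways that should be read as assumptions: g() returns a double instead of a void pointer, storage uses std::vector, and the IdentityMatrix class and the trace() and traceOf() functions are invented for illustration.

```cpp
#include <vector>

class Matrix {                       // abstract base class
protected:
    int rows, cols;
    bool ValidIndex(int i, int j) const {
        return i >= 0 && i < rows && j >= 0 && j < cols;
    }
public:
    Matrix(int r, int c) : rows(r), cols(c) {}
    virtual ~Matrix() {}
    virtual double g(int i, int j) const = 0;   // pure virtual: no body
    // Ordinary member written purely in terms of the abstract interface.
    double trace() const {
        double t = 0.0;
        for (int i = 0; i < rows && i < cols; i++) t += g(i, i);
        return t;
    }
};

class RealMatrix : public Matrix {
    std::vector<double> space;
public:
    RealMatrix(int r, int c, double fill) : Matrix(r, c), space(r * c, fill) {}
    double g(int i, int j) const {
        return ValidIndex(i, j) ? space[i * cols + j] : 0.0;
    }
};

class IdentityMatrix : public Matrix {     // stores no elements at all
public:
    explicit IdentityMatrix(int n) : Matrix(n, n) {}
    double g(int i, int j) const {
        return (ValidIndex(i, j) && i == j) ? 1.0 : 0.0;
    }
};

double traceOf(const Matrix& M) { return M.trace(); }  // works for any sub-class
// Matrix m(2, 2);   // would not compile: Matrix is abstract
```

The base class never needs to know how elements are stored: trace() is written once and works for every concrete matrix type that implements g().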

7.5

Multiple Inheritance

Under C++ one can define a sub-class which is derived from more than one base class. This
is known as multiple inheritance. The derived class will contain member functions and variables
from all the parent classes, and each parent class can have a different access specifier. Here is
an example declaration:
class A : public B, public C, ..., public Z {...}
The rules regarding function visibility (overloading/overriding) are the same in both single and
multiple inheritance. However, the ability to inherit methods from a number of base classes is
associated with its own problems. Object initialisations occur as one would expect: a constructor
must initialise each base class individually, or the default constructor will be used. The following
code shows a simple example of multiple inheritance.
class B
{
private:
int varb;
public:
B(int i = 0) { varb = i; }
};

class C
{
private:
int varc;
public:
C(int j = 0) { varc = j; }
};
class A : protected B, protected C
{
public:
A(int d, int c) : B(d), C(c) {...}// init each sub-object
};

One can also define virtual functions and abstract base classes; in the latter case, the newly
derived class will be abstract if it does not implement all pure virtual functions defined in its
base classes. Java does not provide for multiple inheritance. It uses interfaces to obtain some of
the same functionality. However, within Java, we can only implement (inherit from) multiple
base classes if they are all abstract. In certain cases this may not be desirable: one essentially
has to anticipate all future uses of each class and take precautions by abstracting them to a very
high level. Furthermore, an interface may only contain data members which are constant and
all member functions are public.
One problem that may occur when multiple inheritance (in the sense of C++) is available, is
ambiguity. What happens if a function or data member is present in more than one base class?
The compiler will not know which member is being referred to when a derived class object
invokes the member function or accesses the member variable. One can use the scope resolution
operator to solve some of these problems. Consider the following example:
class A { public: int var; };
class B { public: int var; };
class C : public A, public B {
public:
void printvar(void)
{
// ERROR! - ambiguous
cout << "Var is " << var << endl;
// OK - refers to A's version
cout << "Var is " << A::var << endl;
}
};
Both class A and class B contain the member var. In this case, we can use the scope resolution
operator to specify which sub-object is referred to. The example is fairly obvious: a more
insidious variant arises when one or more of the base classes were derived from a common
parent class:
class A {
public:
int var;
void printvar(void)
{ cout << "Var is " << var << endl; }
};
class B : public A { public: int b; };
class C : public A { public: int c; };
class D : public B, public C {
public:
void print(void)
{
cout << "Var is " << A::var << endl; // ERROR! - which A !?
}
};
In this case, we will have two A sub-objects within the final derived class D, and it is thus
ambiguous to refer to A::member. We can choose to refer to B's or C's copy of the sub-object A,
but this is not desirable since we then have two sets of variables and functions where we really
only (logically) wish to have one.
The solution to these problems is to use virtual inheritance. If a base class for our multiple
inheritance hierarchy was generated from a common ancestor, virtual inheritance ensures that
only one copy of the sub-object for that ancestor is ever present in any subsequently derived
class:
class A {...};
class B : public virtual A {...};
class C : public virtual A {...};
class D : public B, public C {...};
When we use the virtual keyword as indicated in the example above, all references to inherited
members of A (now called the virtual base class) in D will refer to one logical copy of the A
sub-object. One may think of the virtual inheritance mechanism as creating a shared version of
the common sub-object in the derived class. Remember that, even with this facility, a derived
class member will hide a base class member if the names/signatures are identical. The issue of
constructor invocation for virtual base classes is complicated: the mechanism ensures that
the base class constructor is called only once, and it is the most derived class (D in the
example) that invokes it, ignoring any calls made by the intermediate classes. If you design your classes correctly
it is usually possible to avoid these issues (you can always determine whether public/protected
members of a parent class with particular names are present).
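As a minimal sketch of the points above, the following diamond uses virtual inheritance so that D contains a single A sub-object. The constructor arguments, the static counter and the helper countSubObjects() are hypothetical, added only to make the behaviour observable; they are not part of the original example.

```cpp
#include <iostream>

// Hypothetical classes for illustration only.
class A {
public:
  int var;
  static int constructed;              // counts how many A sub-objects were built
  A(int v = 0) : var(v) { ++constructed; }
};
int A::constructed = 0;

class B : public virtual A {
public:
  B() : A(1) {}                        // ignored when B is a base of D
};

class C : public virtual A {
public:
  C() : A(2) {}                        // ignored when C is a base of D
};

class D : public B, public C {
public:
  D() : A(42), B(), C() {}             // the most derived class initialises A
  int get() { return var; }            // unambiguous: only one A sub-object
};

// Builds a D and reports how many A sub-objects that created.
int countSubObjects() {
  A::constructed = 0;
  D d;
  std::cout << "var = " << d.get() << std::endl;
  return A::constructed;
}
```

Removing the virtual keyword from B and C would make the unqualified use of var inside D::get() ambiguous again, since D would then hold two A sub-objects.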
In conclusion, C++ (like its predecessor C) gives all the available options to the programmer,
and lets them decide how they wish to use that freedom. Languages such as Java sacrifice some
of this flexibility to try and eliminate issues such as the one above. Ultimately, as with most
things, there are trade-offs involved and you must decide whether the added complexity justifies
the increased versatility.

Chapter 8

C++ Templates
Class inheritance is a very powerful concept, and the ability to extend an existing class can
lead to code reuse and elegant solutions to many problems. However, there are certain common
situations in which inheritance by itself leads to unwieldy solutions. Consider a List class,
one which allows us to logically connect objects of a specific type into one coherent linear
structure. We can create specific class types which can hold linear lists of that type,
but what happens if we wish to deal with many such lists? Or if we wish a single list to
contain several different types of class object? The conventional approach is to use inheritance
to refine an abstract base class as we require, adding attributes appropriate for the type of
object we wish to store. Unfortunately, since the variety of objects we could store in a list is
limitless, we will need to sub-class our List base class for each new class type, a time-consuming
approach. Fortunately, C++ provides a better alternative: class templates. A class template
or generic class can represent any class type with the same general properties. Thus, we can
build a generic class (template) which can store entities of any type, and manipulate them
appropriately. The idea of generic templates can be extended to function templates: rather
than writing a new implementation of a function (such as a sort algorithm) for every new class
type, we can build a generic function which can handle arbitrary types. Templates reduce code
complexity considerably, and the existence of the Standard Template Library, which implements
a number of useful generic classes, allows rapid development of applications.

8.1 Class Templates

Class templates enable us to write a once-off blueprint for a whole collection of classes, by
introducing a new class definition notation. Let us look at a simple example. Our Matrix class
is defined to work on data of type double. If we wished to have a matrix which dealt with, say,
int data, we could sub-class our abstract Matrix base class to accomplish this, or we could
use the following class template:
template<typename T>
class Matrix {
private:
  T **space;
  int rows, cols;
  bool ValidIndex(int i, int j);
public:
  Matrix(int r = 0, int c = 0);
  ~Matrix();
  void s(int i, int j, T& data);
  T& g(int i, int j) {
    if (ValidIndex(i,j))
      return space[i][j];
    else {
      cerr << "Matrix bounds error!" << endl;
      return space[0][0];
    }
  }
  ...
};

This is only a partial implementation, but the basic structure of the template definition should
be clear. A new keyword, template, has been introduced, which informs the compiler that
this is a template class, and not one to be taken at face value. A template has a number
of associated template or generic parameters; in this case there is only one, called T. This
parameterises the class definition. The parameter assumes a type rather than some numerical
value, and is preceded by another new keyword, typename, which tells the compiler that T is
actually the name of a type. One can use the keyword class instead,
but this can be confusing.
We shall see later that a template can also take (integral) numerical arguments (value parameters), in which case the corresponding parameter does not have the typename keyword before it.
The basic idea, which is fairly obvious from this example, is to substitute the generic parameter
T everywhere we wish to use an arbitrary type. For example, the elements within the matrix
will be of type T, rather than int or double.
To use the template, we must ensure the creation of a class instance. We do this by declaring
variables in the following manner:
Matrix<double> realmat;
Matrix<MyObject> objmat;
We have created two instances of the generic matrix class, one which contains double elements,
and one which contains objects of some class MyObject. Of course, performing matrix arithmetic
on class objects may not be sensible, but the idea of parameterising a class is well illustrated
by this example. One may imagine that the compiler replaces T by the types specified within
the <> and generates new classes (i.e. source code) for each variant. This is known as template
instantiation.
Member functions of a template class must also be defined to cope with generic parameters.
Thus, for example, the implementation of s() would look something like the following:
template<typename T>
void Matrix<T>::s(int i, int j, T& data)
{
if (ValidIndex(i,j))
space[i][j] = data;
else
cerr << "Matrix Bounds error!" << endl;
}


Of course, for this to work the class type T must have the assignment operator appropriately
overloaded or a simple shallow copy must suffice. Observe that the same substitution mechanism
is at work: the template keyword is required to ensure that the compiler invokes this mechanism.
Remember that each instance of the generic matrix class will have its appropriately substituted
functions; this is why we require the fully qualified class name unless we inline the definition.
The generic parameters may be either typename parameters, in which case the parameter can
represent any type, or value parameters, which can only assume integral (not floating point)
numerical values. The value must be computable at compile time. Furthermore, both typename
and value parameters can have default values. The following example illustrates both of these
concepts:
template<typename T=Matrix, int size=10>
class Array {
private:
  T *array;
  int sz;
public:
  Array(int ents = size) : sz(ents)
  { array = new T [sz]; }
  ~Array()
  { delete [] array; }
};
Having defined our class in this manner, the following declarations would be legal:
// creates an array of 10 MyObject objects
Array<MyObject> myo1;
// creates an array of 10 Matrix objects
Array<> myo2;
Note that if you assign a default value to a template parameter, all subsequent parameters must
also have default values to eliminate possible ambiguity.
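The following is a small sketch of how type and value parameters, and their defaults, behave together. FixedBuffer and its members are hypothetical names invented for this illustration; the value parameter N sizes an ordinary array, which is only possible because it is known at compile time.

```cpp
// Hypothetical fixed-capacity buffer: T is a type parameter, N a value parameter.
template<typename T = double, int N = 4>
class FixedBuffer {
private:
  T data[N];          // N is a compile-time constant, so it can size an array
  int used;
public:
  FixedBuffer() : used(0) {}
  // Appends v; returns false if the buffer is already full.
  bool add(const T& v) {
    if (used == N) return false;
    data[used++] = v;
    return true;
  }
  int size() const { return used; }
  int capacity() const { return N; }
  const T& at(int i) const { return data[i]; }
};
```

With this definition FixedBuffer<> means FixedBuffer<double, 4>, FixedBuffer<int> means FixedBuffer<int, 4>, and FixedBuffer<char, 2> overrides both defaults. Each distinct combination of arguments is a separate class.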
A generic class supports all the functionality of a normal class. For example, we may have
static data members and functions, we may declare friend functions or classes and so on. A
generic class can also refer to another generic class. To enable all this to work, however, we must
be sure to declare and reference the template classes correctly. It is also possible to define an
explicitly specialized class, in which case we override the template parameters by explicitly
defining a class with a fixed set of parameters. The compiler will then use this class when it
encounters a use of the template which matches the specialized class:
template<>
class Array<Matrix> {
  // specialised functions and data
  ...
};
In standard C++ the specialised class is introduced by template<> with an empty parameter list (some older compilers accepted the definition without it). There is no parameterisation: we are fixing the parameters by setting them to those listed between the <> after the class name. If we have such a
class definition the template implementation for this particular case will not be used.
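A minimal sketch of explicit specialisation follows, written with the template<> prefix that standard C++ requires. TypeName is a hypothetical class used only to show which definition the compiler selects.

```cpp
#include <string>

// Hypothetical general template: used for every type without a specialisation.
template<typename T>
class TypeName {
public:
  static std::string get() { return "unknown"; }
};

// Explicit specialisation for int: chosen instead of the general template
// whenever TypeName<int> is requested.
template<>
class TypeName<int> {
public:
  static std::string get() { return "int"; }
};
```

Here TypeName<double>::get() uses the general template, while TypeName<int>::get() uses the specialised class. The two definitions need not share any members.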


8.2 Placement of Template Code

The placement of template code is a somewhat contentious issue. The template class definition
should clearly go in a header file. However, the member function definitions are conventionally
also placed within the same header file! This is very different from the way we usually do things,
but the rationale is as follows. A template function definition does not actually create storage at
its point of declaration. It is simply a blueprint for an eventual template instantiation. In a sense
the compiler treats a template as a macro definition which will be substituted and expanded
during compilation. Some compilers only support this mode of definition (or member inlining)
for templates, so it is best to follow this convention.

8.3 Iterators and the Standard Template Library

Class templates are extremely useful, but there are a certain number of templates which are
commonly used, particularly those which support general lists, trees, collections of objects etc.
Rather than re-inventing the wheel every time, the C++ standard has defined a number of
standard template classes, which together constitute the standard template library (STL). One
still has to use the appropriate header files, but this is a minor inconvenience when compared
to the benefits.
A central property of many of the data structures encountered in Computer Science is that of
data traversal. For example, in a linked list, we usually have to move forwards and backwards
in the list. In an array, we often wish to do the same thing; indeed, the linear nature of these
constructs encourages this mode of thinking. The templates in the STL which logically require
such a navigation facility make use of a special helper class known as an iterator. An iterator
allows us to visit each of the data elements in the container class in a consistent manner, which
permits greater code re-use and simplification. In fact, the C++ standard defines a number of
different iterator varieties. For example, a bi-directional iterator allows one to move forwards
and backwards through the object collection. The particular template determines which
types of iterators are available for that class. Iterators and templates allow the development
of completely general algorithms, in which all references to data have been abstracted away. In
other words, these algorithms can work on any kind of data without the requirement for special
versions defined through inheritance (or otherwise).

8.3.1 How to use Iterators

An iterator is often conceptualised as a nested class within the container class. For example,
if we have some template class called Collection, we may have a class definition which looks
something like the following:
template<typename T>
class Collection {
...
class iterator {
...
};
...
};


The nested definition means that we must refer to the iterator type associated with this class
as follows:
Collection<type >::iterator li;
An iterator will overload a number of operators, depending on its type. The choice is made
consistently, according to the standard. For example, in the case of the list template
class, the iterator allows both forward and backward movement, and it will overload the
++ and -- operators to achieve this. The unary * (dereference) operator will be overloaded to
return a reference to the actual data element at the iterator's current location. For classes within
the STL, class member functions which allow you to manipulate the iterator bound to a class are
available. For example, the list generic class has begin() and end() functions which return
an iterator which points to the start of the list, or to just past its end, respectively. Thus one could
write something like:
list<float> flist;
list<float>::iterator li = flist.begin();
while (li != flist.end())
{ cout << *li << ":" << endl; li++; }
Note that several iterators can be bound to one class instance. The example given above shows
one way of traversing the list, by using a second iterator to represent the end of the underlying
list. This iterator serves as a means of locating the list end and is not dereferenced at all.
In general, there are additional special iterators which remove the need to do such a manual
traversal.
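The traversal pattern described above can be sketched as a small generic helper. sumAll is a hypothetical function, not part of the STL; it uses const_iterator, the read-only counterpart of iterator, so that it can accept a const container.

```cpp
#include <list>
#include <vector>

// Sums the elements of any container that provides begin()/end() iterators.
template<typename Container>
double sumAll(const Container& c) {
  double total = 0.0;
  typename Container::const_iterator it = c.begin();
  while (it != c.end()) {   // end() is only compared against, never dereferenced
    total += *it;           // dereference yields the element at the current position
    ++it;                   // advance to the next element
  }
  return total;
}
```

The same function body works unchanged for list<float> and vector<int>, because both containers expose iterators with the same operators.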

8.3.2 Types of Iterators

Every container class must provide support for at least a standard (forward) iterator, and a
const iterator, for those cases in which the container class is declared to be const:
ContainerClass::iterator
ContainerClass::const_iterator
The basic operations supported by all containers and their associated iterators are as follows:
begin() Returns an iterator which is set to the first element of the container;
end() Returns an iterator which is set to the one-past-end element of the container; this
is a sentinel which is used to stop the iterator from overrunning the end of the container;
increment The operator ++, which advances the iterator to the next container element;
comparison The operators != and == are available to test one iterator against another;
element access The dereference operators * and -> are available for accessing the data to
which the iterator points.
If a container is reversible, it provides the additional methods rbegin() and rend() which allow
the container to be navigated from back to front:

ContainerClass::reverse_iterator
ContainerClass::const_reverse_iterator

For example, the containers list, vector and deque are reversible. To traverse the container
in the opposite direction one simply initialises the reverse iterator using rbegin() and then
walks the container using the ++ operator. Note that a list actually provides a bi-directional
iterator, which also supports the iterator decrement -- operator.
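A brief sketch of back-to-front traversal with rbegin() and rend() is given below. reversedCopy is a hypothetical helper written for this illustration.

```cpp
#include <list>
#include <vector>

// Returns the elements of src in back-to-front order.
std::vector<int> reversedCopy(const std::list<int>& src) {
  std::vector<int> out;
  std::list<int>::const_reverse_iterator r = src.rbegin();
  while (r != src.rend()) {
    out.push_back(*r);
    ++r;               // ++ on a reverse iterator moves towards the front
  }
  return out;
}
```

Note that the loop has exactly the same shape as a forward traversal; only the iterator type and the begin/end functions differ.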
There are many other kinds of iterators, but only a handful of these are likely to be encountered
in practice:
Insertion Iterators These are used to fill a container with data from a particular source. Some
generic algorithms require this facility. Depending on the container, one can insert at the
front, back or some location in between. The following example copies an array of floats
into the specified container using front insertion (so the elements end up in reverse order):
float B[] = {1,2,3,4,5};
list<float> dest;
copy(B, B+5, front_inserter(dest));
// insert elements from B into container dest
I/O stream iterators These iterators come in two flavours which can be used to iteratively
extract data from an input stream, and to iteratively write out data to an output stream.
They are usually passed into generic algorithms to provide a data source or sink:
ifstream input("datafile.dat");
istream_iterator<string> source(input), EndSentinel;
ostream_iterator<string> output(cout, "\n");
vector<string> lines; // this stores the data from input
copy(source, EndSentinel, back_inserter(lines));
copy(lines.begin(), lines.end(), output);
*output++ = "Last line!!!"; // can write directly to stream too!
The generic function copy() takes 3 iterators as its arguments: the first two point to the
beginning and end of the source, whilst the last points towards the destination for the
data.
Note that by its nature, an istream_iterator has an end, whereas an ostream_iterator
does not. A sentinel iterator (here a default-constructed istream_iterator) is used to test for the end of the input stream. In the example,
the input stream is copied element by element into a vector container; with string elements, each element is one whitespace-separated word. The container is
then copied onto the output. The iterators used above mean that additional
work is done when inserting data into the destination. For example, the output iterator
will write a newline after each output element (as requested in the constructor).
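The two kinds of iterator above can be tried without any files by substituting string streams for the file and console streams. This is a sketch under that assumption; joinWords and copyToFront are hypothetical helpers.

```cpp
#include <algorithm>
#include <iterator>
#include <list>
#include <sstream>
#include <string>
#include <vector>

// Reads whitespace-separated words from text and writes each one,
// followed by sep, to a string that is returned.
std::string joinWords(const std::string& text, const char* sep) {
  std::istringstream input(text);
  std::istream_iterator<std::string> source(input), endSentinel;
  std::vector<std::string> words;
  std::copy(source, endSentinel, std::back_inserter(words));

  std::ostringstream output;
  std::copy(words.begin(), words.end(),
            std::ostream_iterator<std::string>(output, sep));
  return output.str();
}

// Copies [first, last) into a list using front insertion,
// which leaves the elements in reverse order.
std::list<float> copyToFront(const float* first, const float* last) {
  std::list<float> dest;
  std::copy(first, last, std::front_inserter(dest));
  return dest;
}
```

Because the ostream_iterator writes the separator after every element, the result of joinWords ends with a trailing separator.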

8.3.3 Common Containers

A number of common templatized container classes are specified by the C++ standard. The
following list provides a brief description of the more commonly used containers:


vector A vector is a resizable array and is similar to the Java Vector class. There are methods
to, amongst other things, test the size of the vector and append new elements.
queue/deque A deque is a queue which allows insertions and deletions on both ends.
list The standard doubly-linked linear list implementation.
map/multimap These are associative containers, which provide item lookups/insertions
based on key values (with associated attributes). For example, one can do things like
the following:
map <string, int> mymap;
mymap["key1"] = 5;
cout << mymap["key1"] << endl; // prints 5
set Maintains a list of ordered elements which can be rapidly checked for membership. Only
unique items may exist in a set.
priority_queue Provides rapid and efficient access to the largest element in a data set.
stack The usual stack data structure.
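To give a feel for a few of the containers listed above, here is a short sketch using priority_queue, stack and deque. The function names are hypothetical and exist only for this illustration.

```cpp
#include <deque>
#include <queue>
#include <stack>

// Returns the largest of the n values in v using a priority_queue (assumes n > 0).
int largest(const int* v, int n) {
  std::priority_queue<int> pq;
  for (int i = 0; i < n; i++) pq.push(v[i]);
  return pq.top();          // the largest element is always at the top
}

// Pushes 1..n onto a stack and returns the value on top (last in, first out).
int lastIn(int n) {
  std::stack<int> s;
  for (int i = 1; i <= n; i++) s.push(i);
  return s.top();
}

// A deque accepts insertions at both ends.
std::deque<int> bothEnds() {
  std::deque<int> d;
  d.push_back(2);
  d.push_front(1);
  d.push_back(3);
  return d;                 // holds 1 2 3
}
```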

8.3.4 Examples of Template Containers

Here are two simple examples. The first simply creates a vector of pointers to objects and then
places each new element at the end of the vector. The use of a vector of pointers is based on
efficiency: if we did not use pointers each new class object would have to be duplicated inside
the vector since the elements are passed by value. This is wasteful of space and time consuming
(requires constructor calls).
#include <vector>
#include <string>
#include <iostream>
using namespace std;

class Record {
private:
  string name;
  int recNum;
public:
  Record(const string& val, int id = -1) : name(val), recNum(id) {}
  string getName(void) { return name; }
  int getNum(void) { return recNum; }
};

int main(void) {
  vector<Record*> data; // starts empty; push_back() appends at the end
  string name;
  for (int i = 0; i < 8; i++) {
    cin >> name;
    data.push_back(new Record(name, i));
  }
  vector<Record*>::iterator ptr;
  for (ptr = data.begin(); ptr != data.end(); ptr++) {
    cout << "Record " << (*ptr)->getNum() <<
      " is " << (*ptr)->getName() << endl;
    delete *ptr; // the vector only stores pointers, so we free each object ourselves
  }
  return 0;
}

The next example counts the number of occurrences of words in a file and prints out this
information. It uses the associative map container in which the key is a word and the value is
the number of times this word occurs. An iterator is used to traverse all the elements in the
map. When dereferenced, the iterator yields a pair structure. A pair has two fields, first and
second: the former stores the key and the latter the associated value. One interesting feature
is provided by the map indexing operator []: if the key is not present in the map, it is created
with the default value (which may involve a call to a constructor if the value is an object). You
can use the count() or find() member functions to see if a given key exists without causing a
new value to be inserted. Note that this simple program will not deal with case or punctuation.
#include <iostream>
#include <fstream>
#include <map>
#include <string>
using namespace std;

int main(int argc, char *argv[])
{
  map<string,int> wordct; // associative array (string,int)
  map<string,int>::iterator ptr;
  string word;
  int ct = 0;
  if (argc != 2) {
    cout << "usage: wordcount <filename>" << endl;
    return 1;
  }
  ifstream ifs(argv[1]);
  // if no key match, (key,val) created w. default value (0)
  while (ifs >> word)
    wordct[word] = wordct[word]+1;
  ifs.close();
  // print it all out
  cout << "Word Count: " << endl;
  for (ptr = wordct.begin(); ptr != wordct.end(); ptr++)
  {
    cout << (*ptr).second << ": " << (*ptr).first<< endl;
    ct += (*ptr).second;
  }
  cout << "Total: " << ct << endl;
  return 0;
}
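The difference between the indexing operator and find(), mentioned before the word-count program, can be sketched as follows. lookup is a hypothetical helper; it takes the map by const reference, which is only possible because find() does not modify the map (operator[] may insert and therefore cannot be used on a const map).

```cpp
#include <map>
#include <string>

// Looks up a word without modifying the map; returns 0 if it is absent.
int lookup(const std::map<std::string,int>& m, const std::string& word) {
  std::map<std::string,int>::const_iterator it = m.find(word);
  if (it == m.end()) return 0;    // find() returns end() when the key is missing
  return it->second;              // second holds the value stored for the key
}
```

Calling lookup() for a missing key leaves the size of the map unchanged, whereas evaluating m[word] for the same key would add a new entry with the default value 0.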
Another useful container is the set, and its more general form, the multiset. A set may only
contain one instance of a particular entity, whereas a multiset can hold multiple instances of
the same entity. The following simple example illustrates how they differ.
#include <iostream>
#include <set>
using namespace std;

int main(void)
{
  int numbers[]={0,1,2,3,4,4,6,8,0,3,2,1,11};
  set<int> s;
  multiset<int> ms;
  for (int i = 0; i < sizeof(numbers)/sizeof(int); i++) {
    s.insert(numbers[i]);
    ms.insert(numbers[i]);
  }
  set<int>::iterator sptr = s.begin();
  multiset<int>::iterator msptr = ms.begin();
  while (sptr != s.end())
  {
    cout << "Entry " << *sptr << " occurs " << s.count(*sptr)
         << " times" << endl;
    sptr++;
  }
  while (msptr != ms.end())
  {
    // count the entry the multiset iterator points to (not *sptr, which is now at s.end())
    cout << "Entry " << *msptr << " occurs " << ms.count(*msptr)
         << " times" << endl;
    msptr++;
  }
  return 0;
}
As one would expect, the set reports every integer exactly once, whereas the multiset reports multiple occurrences of the repeated integers (and, because it stores every copy, lists each of them once per copy).
These examples show how easy it is to use the STL containers: you simply include the appropriate header file and off you go! Of course, you will need to find out which member functions
exist and the various sorts of iterators each container supports. This information is covered in
the documentation that accompanies any good C++ installation.

8.4 Function Templates

C++ allows us to define generic functions, that is, functions which can operate on arbitrary
data types. The template mechanism is also used to achieve this, but these generic functions
or algorithms are not associated with any particular class. The STL contains both generic
classes and generic functions; the one complements the other. A typical example of the kind
of algorithm which can usefully be abstracted is sorting. Under C, a certain degree of function
templating was possible, through the use of generic (void*) pointers and function pointers.
C++ refined those ideas and now provides a framework which enables a far more sophisticated
level of abstraction.
A function template is declared in a similar manner to a class template:
template<template_args> ret_type fname(args);
where the template keywords and arguments are used as before. A function template declaration
looks just like the declaration of a template class member function.
Note that it is usually unnecessary to explicitly include the template argument when calling the function, since the compiler
deduces the type from the arguments of the function call.
The next section examines some of the algorithms available under the STL.

8.5 Algorithms and the STL

In addition to the rich set of templated containers, the STL contains a number of generic
functions, also known as algorithms. Because they are generic they are type-independent and can
thus be used on any well-defined type. In this context well-defined is taken to mean that the
arguments to the algorithm will support the required operations. In most cases, this amounts
to ensuring that the iterators involved support consistent traversals. For example, a sorting
algorithm requires random access to the underlying container, thus requiring that the container
to be sorted supports random access iterators. Remember that a more capable iterator can
be used wherever a less capable one is required. Thus, an algorithm that requires an iterator which
moves forward will still work for any bi-directional iterator. Our class of containers is thus not
unnecessarily restricted.
The other class of requirements concerns function objects. A function object is an instance of
a class that overloads the function call operator. This allows the object to behave in a similar
manner to a conventional function. The kinds of adequacy required by algorithms usually involve
the number of arguments in the function call list. For example, a sorting algorithm requires two
arguments to compare, thus any function object passed into the algorithm to modify the basic
comparison process must support two arguments: it will be a binary function object. This
will become clearer with a few examples.

8.5.1 Predicates and Function Objects

A predicate is a function which evaluates to true or false. Such functions are widely used
in algorithms since in many cases we must compare multiple arguments and make a binary
decision on this basis. In the case of the STL Algorithms, predicate functions can be replaced
by more general predicate function objects. Of course, you can still use a simple function if
this is appropriate for your needs. The following example shows how one might apply a binary
predicate to two arguments, using either a function f(), or a function object F():


#include <iostream>
using namespace std;
// Functions and Function Objects
/* Simple Function */
bool f( int& x, int& y) { return x < y; }
/* function object - the power of Objects at your disposal */
class F {
private:
int x;
public:
F(int y = 0) : x(y) {}
bool operator()(int& y, int& z) { return y < z; }
};
/* Algorithm Eval() - apply a BinaryFunction to two arguments... */
template <typename T, typename BinaryFunction>
bool Eval( T& x, T& y, BinaryFunction B) { return B(x,y); }
int main(int argc, char* argv[])
{
int x = 3, y = 4;
cout << Eval(x,y,f) << endl; // apply a function
cout << Eval(y,x, F()) << endl; // apply function object
return 0;
}
As the example indicates, the algorithm Eval() can be used to apply either a simple function
f, through its function pointer, or a more complex object F(). Note that to use a function
object, we need to create an instance of that type, hence the constructor call in the argument
list to Eval(). By contrast, the function f() already exists as an entry point into code loaded
into memory. Although not used here, one might choose to use a function object to accumulate
information about its arguments and pass these to some other class or to invoke some other
class function to parse the argument data and so on. While some of these operations may be
possible with a normal function, they become increasingly unwieldy as one requires increasing
sophistication.
To facilitate the smooth and easy operation of the STL algorithms, a number of predefined
function objects (either unary or binary) exist. For example, less is used by algorithms when
ordering is required between container elements; it is a binary function object that uses the < operator of the contained elements to return true/false. An example of an algorithm which uses
this is sort(). Other predefined function objects include greater, not_equal_to, less_equal
etc, which all perform in the expected way.
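As a short sketch of a predefined function object in use, the hypothetical helper below passes greater<int> to the standard sort() algorithm in order to reverse the default ordering. The predefined function objects are declared in the <functional> header.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Returns a copy of v sorted into descending order, using the predefined
// function object greater<int> in place of the default less<int>.
std::vector<int> sortedDescending(std::vector<int> v) {
  std::sort(v.begin(), v.end(), std::greater<int>());
  return v;
}
```

Note that, as with the user-defined function object F earlier, an instance must be created in the argument list, hence the trailing parentheses in greater<int>().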
Example: Summing Anything
As it happens the type of operation shown here is so common that the STL already contains
a number of algorithms to handle repeated applications of functions or function objects. The
simplest of these is for_each(), an example of which is shown below. This algorithm requires a
pair of iterators delimiting a range and a function object (or function), and its sole purpose is to apply the latter to
every element in that range. In the simplest scenario, we could pass in two delimiting
pointers and apply a function to each item within the range. This example is somewhat more
sophisticated: it defines a generic summing routine which accumulates information on each
argument. Study the example closely to see how it works!
#include <iostream>
#include <algorithm>
#include <vector>
#include <string>
using namespace std;

template <typename T>
class Sum {
private:
  int N;
  T sum;
public:
  Sum() : N(0), sum() {} // sum() value-initialises, so copies never read garbage
  Sum(const Sum& S) : N(S.N),sum(S.sum) {}
  void operator()(const T & v) {
    if (N == 0) sum = v;
    else sum = sum + v;
    N++;
  }
  T getsum(void) { return sum; }
  friend ostream & operator<< (ostream& os, const Sum<T>& obj)
  {
    os << obj.sum;
    return os;
  }
};

int main(int argc, char* argv[])
{
  int v[] = {5,6,7};
  Sum<int> S = for_each(v,v+3,Sum<int>());
  cout << "Sum is: " << S << endl;
  vector<string> w;
  w.push_back("The ");
  w.push_back("rain ");
  w.push_back("in ");
  w.push_back("Spain...");
  Sum<string> Q = for_each(w.begin(),w.end(),Sum<string>());
  cout << "Concatenation is: " << Q << endl;
  return 0;
}
In this example, for_each() applies a particular instance of the class Sum to each element in the
container. Because of the generic nature of the algorithm, and the fact that the class string has
overloaded + to mean concatenation, we are able to sum a collection of strings together and get
something sensible. Note that the same object instance is being applied to every element; this
object is passed back by value after all the applications are complete. This places an additional
requirement on the function object: it must support a copy constructor. Of course, since any
properly written class should do so, this is not a problematic restriction.
Example: Sorting Generically
Consider this declaration of a generic (trivial) sort function [2]:
template<typename IT, typename Compare>
void sort(IT first, IT last, Compare cmp)
{
for (IT k = first; k != last; k++)
swap(*k, *min_element(k, last, cmp));
}
The template argument IT represents an iterator of some sort, whilst the argument Compare is
a helper class which encapsulates the comparison operation to be used in sorting. This example
is quite subtle, so we will examine it in some detail. The functions swap() and min_element()
are generic functions which swap two elements, and find the smallest element in a sequence,
respectively. They are defined as follows:
template<typename IT, typename Compare>
IT min_element(IT first, IT last, Compare comp)
{
IT m = first;
for (IT i=++first; i != last; i++)
if (comp(*i, *m)) m = i;
return m;
}
template<typename T>
void swap(T& a, T& b)
{
T temp=a;
a=b;
b=temp;
}
The definition of swap needs no further explanation. A number of interesting things are happening in min_element(). Firstly, the template parameter IT is a general iterator: it can be used for
both a fully fledged iterator (such as that associated with the STL list or vector templates)
or for a pointer into a simple C++ array. This is possible because an iterator is considered to be a movable
pointer and the * operator is overloaded so that an iterator dereference will yield the actual
data being pointed to. Thus, in the example above, the comparison object comp will be called with arguments that are the actual data types contained in the list/vector that the iterator is bound to.
Alternatively, if a normal array pointer is passed in, the dereference operation will work normally
too. This is abstraction in its purest form! A further subtlety is represented by the comparison
argument (cmp in sort(), comp in min_element()). In fact, this is not a function at all, but an object of a class which encapsulates
the concrete comparison operations required to actually compare data items. The function call
operator, (), is overloaded to achieve this. For example, if we wished to compare strings (sorted
alphabetically), we could define a function object class CompareStrings:
class CompareStrings {
public:
  bool operator()(const char *str1, const char *str2)
  { return strcmp(str1,str2) < 0; }
};
With these definitions we can write the following code:
list<const char *> sl;
const char *array[10] = {"aaaa", "abab", "abca",
                         "bbbb", "baba"};
// populate string list
// use iterator; pass object of type CompareStrings
sort(sl.begin(), sl.end(), CompareStrings());
// array pointer; pass object of type CompareStrings
sort(array, array+5, CompareStrings());
One further point to note here is that the iterator function end() returns an element which is
just after the end of the list. One cannot dereference this element; it is simply an end-of-list
designator. To be consistent with this approach, the second array argument used in the array
version of sort passes a pointer which is one step outside of the valid array data. This will not
cause a problem however, since this pointer will not be dereferenced.
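The pieces of this example can be assembled into one compilable sketch as shown below. The namespace notes is a hypothetical addition, used here only so that the hand-written sort(), swap() and min_element() cannot be confused with the versions of the same name in the standard library; the calls inside sort() are qualified for the same reason.

```cpp
#include <cstring>
#include <list>

namespace notes {   // hypothetical namespace, avoids clashing with std::sort etc.

template<typename T>
void swap(T& a, T& b) { T temp = a; a = b; b = temp; }

// Returns an iterator to the smallest element in [first, last);
// the range must not be empty.
template<typename IT, typename Compare>
IT min_element(IT first, IT last, Compare comp) {
  IT m = first;
  for (IT i = ++first; i != last; i++)
    if (comp(*i, *m)) m = i;
  return m;
}

// Trivial selection sort: needs only forward iteration, so it works
// for list iterators as well as plain array pointers.
template<typename IT, typename Compare>
void sort(IT first, IT last, Compare cmp) {
  for (IT k = first; k != last; k++)
    notes::swap(*k, *notes::min_element(k, last, cmp));
}

} // namespace notes

// Function object which orders C-style strings alphabetically.
class CompareStrings {
public:
  bool operator()(const char* s1, const char* s2) const
  { return std::strcmp(s1, s2) < 0; }
};
```

A call such as notes::sort(array, array+5, CompareStrings()) sorts an array of const char* in place, and notes::sort(sl.begin(), sl.end(), CompareStrings()) does the same for a list<const char*>.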

8.5.2 Other Useful Algorithms

There are many algorithms in the STL. A few have been listed below, along with the types of
iterators supported and the nature of any associated function objects.
fill: void fill(ForwardIterator first, ForwardIterator last, const T& value). Fills the indicated range with value.
generate: void generate(ForwardIterator first, ForwardIterator last, Generator gen). Invokes the generator function gen once for each element in the range and assigns the generated value to the corresponding element. An example would be a random number generator.
transform: OutputIterator transform(InputIterator first, InputIterator last, OutputIterator result, UnaryFunction f). Applies the function object f to each element in the input range, and copies the resulting value into the corresponding position in result. An iterator to the end of the output range is returned.
for_each: UnaryFunction for_each(InputIterator first, InputIterator last, UnaryFunction f). Applies the function object f to each element in the input range, discarding any return value. The function object is returned after the operation is complete.


count: int count(InputIterator first, InputIterator last, const EqualityComparable& value). Counts all the items equal to value in the range. A second version of this algorithm, count_if, takes a UnaryPredicate and counts only the items for which the predicate is true.
find: InputIterator find(InputIterator first, InputIterator last, const EqualityComparable& value). Returns an iterator to the first occurrence of value, otherwise returns last. A second version of this algorithm, find_if, takes a UnaryPredicate and returns an iterator to the first item for which the predicate is true.
min/max/swap: the functions min and max return the smaller and larger of their two arguments, while swap simply exchanges its two arguments, which are passed by reference.
Other useful algorithms include sort and a number of numerical and set-based utilities. Details
can be found in any good C++ reference.
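A few of the algorithms listed above can be combined in a short sketch. The names Counter, square, isEven, squares and countEven are made up for this illustration.

```cpp
#include <algorithm>
#include <vector>

// Generator function object: yields 1, 2, 3, ... on successive calls.
class Counter {
    int next;
public:
    Counter() : next(1) {}
    int operator()() { return next++; }
};

// Unary function: squares its argument.
int square(int x) { return x * x; }

// Unary predicate: true for even values.
bool isEven(int x) { return x % 2 == 0; }

// Hypothetical helper: builds the sequence 1, 4, 9, ..., n*n.
std::vector<int> squares(int n)
{
    std::vector<int> v(n);
    std::fill(v.begin(), v.end(), 0);                       // every element 0
    std::generate(v.begin(), v.end(), Counter());           // 1, 2, ..., n
    std::transform(v.begin(), v.end(), v.begin(), square);  // square in place
    return v;
}

// Hypothetical helper: counts the even values using the predicate version.
int countEven(const std::vector<int>& v)
{
    return static_cast<int>(std::count_if(v.begin(), v.end(), isEven));
}
```

Note that transform may write its output over its own input, as it does here. For squares(5) the result holds 1, 4, 9, 16, 25, of which two values are even.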


Chapter 9

Exceptions
An exception is an event which interrupts the normal flow of execution within a program. C++
and Java provide a mechanism to handle exceptions. By using this mechanism and applying
good design principles we can ensure that the program responds in a controlled manner when it
encounters such events. Of course, we cannot cater for all exceptional events, but those which
arise from errors in program logic (such as division by zero) or failure of system resources
(such as an abortive attempt to allocate memory) can be dealt with in a unified and consistent
manner. Programs written in this way are, in general, more robust and easier to maintain.

9.1 Exception Syntax

C++ has a number of standard exception classes, all of which are derived from a base class
called exception. To use exceptions, the header file stdexcept must be included. A number of
methods are associated with this class, and these can be used to determine the specific nature
of the exception. The standard exception hierarchy is illustrated in the Figure below. We can
also derive new exception classes from the existing classes in the hierarchy, but we shall not
investigate this topic here.
Figure: the standard exception hierarchy (indentation denotes derivation).
exception
    logic_error
        domain_error
        invalid_argument
        length_error
        out_of_range
    runtime_error
        range_error
        overflow_error

The system itself may generate exceptions which we can handle if we wish. An example of
such a system exception is bad_alloc, which indicates that our last attempt to dynamically
allocate memory failed. Exceptions have a default behaviour which prints out an appropriate
text comment and terminates the program (unless we handle the exception). A non-system
exception, i.e. one generated by our program, must be thrown (generated) by the offending code
section. For example, if we wish to notify the system that a divide by zero error is imminent,
we might code the following:
int a, b, c;
// assign values to a and b...
if (b == 0)
    throw overflow_error("Division by zero!");
c = a/b;

When the value of b is zero, an exception of type overflow_error will be raised by the
program and, since no handler has been provided, execution will terminate. The message passed
to the exception constructor can be extracted for later use.
In order to trap the exception and ensure that program operation can continue, one must use
try and catch statements. A try block, introduced by the try keyword, contains a block of
code which may raise an exception. The exception may be thrown by our code or by
the system. In either case, if we wish to allow the program to continue operating, we must
write an exception handler to catch the offending exception. The catch keyword is used to
introduce the exception handler(s) associated with a given try statement. The program code
specified within the handler is responsible for recovering from the error condition. Assuming a
valid exception handler exists, program execution will resume after the try/catch block. Here is
a simple example which illustrates these ideas:
int a, b, c;
// assign values to a and b...
try {
    if (b == 0)
        throw overflow_error("Division by zero!");
    c = a/b;
}
catch (overflow_error&)
{ cout << "Division by zero!"; }
// program execution continues here...
In this case, the throw statement raises an overflow_error which is caught by the indicated
exception handler. The explicit test is needed: C++ does not itself throw an exception for an
integer division by zero, whose behaviour is undefined. There can be any number of exception
handlers, but only one try block.
C++ has a catch-all exception handler which allows the program to acknowledge that an
unknown error has occurred before continuing. This handler is written as catch(...), and
must be the last one in the list, since it will match (and thus absorb) any exception.
The general form of the try/catch statement is:
try {
    statements
}
catch (exception-type) {
    statements
}
...
catch (exception-type) {
    statements
}
Once an exception has been thrown within a function, a matching procedure is invoked to see
which handler (if available) should process the event. The handlers are inspected in order, and
the first matching handler will be utilised. The rules for matching an exception of type E to
a handler of type T are as follows:


1. T is the same type as E;
2. T is a (public) base class of E;
3. E and T are pointers for which 1 or 2 holds for the underlying types;
4. T is a reference and 1 or 2 holds for the type to which T refers.
If no match is found, the current function immediately exits and the exception is passed to the
calling function. This process continues until a valid handler is encountered and if none can be
found the program will terminate. The standard class exception has a virtual function named
what() which returns the error string we specified when the exception was thrown. This can
be used to provide additional information on the cause of the exception, or to specify where it
occurred.
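The matching and propagation rules can be illustrated with a short sketch. The function names inner, middle and whichHandler are hypothetical. The handlers are listed from most derived to most general, since the first one that matches is used.

```cpp
#include <stdexcept>
#include <string>

// Throws an out_of_range, which derives from logic_error and exception.
void inner() { throw std::out_of_range("index 7 in inner()"); }

// Has no matching handler, so the exception passes on to the caller.
void middle()
{
    try { inner(); }
    catch (std::overflow_error&) { /* not reached: unrelated type */ }
}

// Hypothetical helper: reports which handler was selected.
std::string whichHandler()
{
    try { middle(); }
    catch (std::out_of_range& e) { return std::string("out_of_range: ") + e.what(); }
    catch (std::logic_error& e)  { return std::string("logic_error: ") + e.what(); }
    catch (std::exception& e)    { return std::string("exception: ") + e.what(); }
    catch (...)                  { return "unknown"; }
    return "no exception";
}
```

Here the first handler is chosen. Had the logic_error or exception handler been listed first, it would have matched instead (rule 2), and the more specific handler would never be reached.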
If you need to manipulate the exception object, you can give the catch statement a named
parameter as follows:
catch (T e): traps exceptions of type T and gives your handler code access to a copy of the
exception. Exceptions of classes derived from T will also be trapped, but will be copied as
objects of type T (sliced), losing the derived-class information in the process.
catch (T& e): as above, but now exceptions derived from T are not copied or converted, and
we can find more detailed information on the responsible error condition.
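The difference between the two forms can be checked with typeid. The following is a minimal sketch with made-up function names. Some compilers warn about the by-value handler, which is exactly the practice being demonstrated.

```cpp
#include <stdexcept>
#include <typeinfo>

// Caught by value: the handler receives a copy of the std::exception
// base part only, so the derived type is no longer visible.
bool derivedTypeSeenByValue()
{
    bool seen = false;
    try { throw std::overflow_error("too big"); }
    catch (std::exception e) {
        seen = (typeid(e) == typeid(std::overflow_error));
    }
    return seen;
}

// Caught by reference: the handler refers to the original exception
// object, so its real type can still be identified.
bool derivedTypeSeenByReference()
{
    bool seen = false;
    try { throw std::overflow_error("too big"); }
    catch (std::exception& e) {
        seen = (typeid(e) == typeid(std::overflow_error));
    }
    return seen;
}
```

The first function returns false and the second returns true. For this reason handlers are normally written with a reference parameter.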
The next example shows a number of these concepts in action. The function reciprocal
computes the inverse of a given number and as such can generate an error (when the number
is 0). When this happens we report this fact by throwing the exception to the calling function,
along with a more useful diagnostic message. As the example indicates, the statement which
generated the exception is within a try/catch block, and an appropriate handler exists to process
the exception. Control is passed to the handler (which simply modifies the counter to exclude the
invalid reciprocal) and the function resumes operation after the last catch clause.

#include <cstdlib>    // atoi, exit
#include <iostream>
#include <stdexcept>
#include <typeinfo>
using namespace std;

float reciprocal(float den)
{
    if (den == 0.0)
        throw overflow_error(" in reciprocal()");
    return 1/den;
}

float AvgReciprocals(int start, int end, float *Seq,
                     int size)
{
    float sum = 0.0;
    int NumSummed = 0;
    for (int i = start; i <= end; i++) {
        if (i < 0 || i > size - 1)
            throw out_of_range(" in AvgReciprocals()");
        NumSummed++;
        try { sum += reciprocal(Seq[i]); }
        catch (overflow_error&)  // handler code
        { NumSummed--; }
        catch (...)              // this will catch anything else
        { throw; }               // and simply rethrow it!
    }
    if (NumSummed == 0) return 0.0;
    return sum/NumSummed;
}

// average reciprocals of set of inputs
int main(int argc, char **argv)
{
    float arraydata[] = {0.0,1.0,
        0.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0};
    int firstIndex, lastIndex, size = 10;
    if (argc != 3) exit(0);
    firstIndex = atoi(argv[1]);
    lastIndex = atoi(argv[2]);
    try {
        cout << "Average is " <<
            AvgReciprocals(firstIndex, lastIndex,
                           arraydata, size) << endl;
    }
    catch (exception& e) {
        cout << typeid(e).name() << ":" <<
            e.what() << endl;
    }
    return 0;
}


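Observe that no message is printed when reciprocal() throws: the handler in AvgReciprocals()
deals with the exception silently, and the default message would only be printed if the program
were forced to terminate because an appropriate handler did not exist. Also note that control is not passed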
back to the function that generated the exception. To ensure added robustness, we have included
the catch-all handler. However, since we will not know how to process an unexpected exception,
we simply pass the responsibility up the call chain to the invoking function: we rethrow the
exception. A throw statement with no arguments is used to accomplish this. As we stated above,
as soon as an exception occurs, the function in which this happens terminates and control is
passed to the invoking function. Thus, in the example we have shown, rethrowing the exception
forces us to return to main() which must either service the exception or terminate. This is
appropriate behaviour for a serious, unanticipated condition!
Within main() itself, we have anticipated trouble from our reciprocal function and taken adequate precautions by placing the statement within a try block. The handler we have provided
is generic: it will accept any exception derived from the standard exception hierarchy. Further,
because we have used a reference parameter, we can gain access to the original exception, identifying both its type and displaying the error string we provided. If we had not provided an
error string, a default message would be displayed. The diagnostic string can be very useful for
debugging purposes, so a meaningful message should be included.
Finally, we should recall that an exception will abort the current statement within the calling
function, but will not affect the state of the system prior to that call, nor undo anything the
statement has already done. Thus, in our last try statement we should really have called the
function AvgReciprocals() prior to printing the result. If you recall our earlier discussion about
operator overloading and precedence within the I/O class, the cout statement is actually a chained
series of function calls. If an exception occurs within AvgReciprocals() and propagates back to
main(), the initial text string after cout may already have been printed, even though the rest of
the statement is abandoned. (From C++17 onwards the operands of << are evaluated strictly from
left to right, so the string is always printed first; earlier standards leave the order unspecified.)
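One way to avoid a partially printed line is to finish the computation before any output is produced. The sketch below uses hypothetical names (average, report) and collects the text in a string stream so that the result is easy to inspect; the same ordering applies when writing to cout.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a computation that can fail.
float average(int count)
{
    if (count == 0)
        throw std::overflow_error(" in average()");
    return 10.0f / count;
}

// Produces either the complete result line or the diagnostic,
// never a fragment of the result line.
std::string report(int count)
{
    std::ostringstream out;
    try {
        float result = average(count);   // may throw: nothing written yet
        out << "Average is " << result;
    }
    catch (std::exception& e) {
        out << "Error:" << e.what();
    }
    return out.str();
}
```

With this arrangement report(4) yields "Average is 2.5", while report(0) yields only the diagnostic text.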

9.2 Specifying Exceptions

C++ provides an exception specification mechanism for functions which allows the programmer
to indicate which exception types the function can legally generate. With this information it
becomes easy to protect code based on these functions with the appropriate exception handlers.
Java has a similar specification mechanism; however, within Java all exceptions must be encapsulated in a class object derived from the superclass Throwable. Java introduces a new keyword,
throws, to introduce the specification.
If a function is to have an exception specification, this must be present in both the prototype
declaration and the definition. The general form of the definition is as follows:
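return_type fname(parameters) throw(E1, E2, ...);
where there may be an arbitrary number of Ei specified. If the throw argument list is empty,
the function cannot generate any exception. Finally, if the throw clause is missing, we have a
normal function definition and any type of exception can be raised by the function. Note that
specifications of this form were deprecated in C++11 and removed in C++17, which keeps only
the no-throw case (written noexcept), so the examples in this section require a compiler set to
an earlier standard.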
If the exception specification is incorrect i.e. the function generates an illegal exception, the
system will invoke a special function called unexpected() to deal with this. Note, however,
that in this case the program is faulty and the function should be appropriately rewritten. The
default behaviour of unexpected() is to terminate the application immediately. However, in
certain cases we may wish to recover from the error condition. This can be achieved by redefining
the action of unexpected(). C++ allows us to write an alternative function which is then used
in its place. This is accomplished as follows:

void NewUnexpectedFn() { ... }

int main(void) {
    ...
    set_unexpected(NewUnexpectedFn);
    ...
}
The new function can generate an exception which we know how to handle and throw this
back to the invoking function. To be completely rigorous the function prototypes defined above
should be rewritten as follows:
float reciprocal(float den)
    throw(overflow_error);
float AvgReciprocals(int start, int end,
    float *Seq, int size) throw(out_of_range);
If we specify an exception list for AvgReciprocals(), the rethrow we use in the catch-all handler
must be removed since this can throw any kind of exception. Of course, if the exception
specification for reciprocal() is completely correct this is not an issue. Observe that because
we have a handler that can trap exceptions of type overflow_error, we do not have to include
this in the specification for AvgReciprocals().
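Compilers following C++17 or later no longer accept throw(E1, E2, ...) lists. What remains is the no-throw promise, written noexcept. The following is a minimal sketch of that modern form, assuming a C++11 or later compiler; the function twice is made up for the illustration.

```cpp
#include <stdexcept>

// Promises not to throw. If an exception did escape from this
// function, std::terminate() would be called.
int twice(int x) noexcept { return 2 * x; }

// No specification: the function may throw any exception.
float reciprocal(float den)
{
    if (den == 0.0f)
        throw std::overflow_error(" in reciprocal()");
    return 1.0f / den;
}

// The noexcept operator queries a function's promise at compile time
// without actually calling the function.
const bool twiceIsNoexcept      = noexcept(twice(1));
const bool reciprocalIsNoexcept = noexcept(reciprocal(1.0f));
```

Here twiceIsNoexcept is true and reciprocalIsNoexcept is false. A list of permitted exception types can no longer be stated in the declaration and is usually recorded in a comment instead.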

Chapter 10

Using C and C++


The language C was the precursor to C++ and shares many of the same features. While C++
is largely backward compatible with C, one has to take special precautions when trying to combine
both. This chapter presents a few C-specific functions and shows how to use both C and C++
together, when this becomes necessary.

10.1 Linking C code to C++ code

A great deal of C legacy code exists in the world, and often the investment in time and money
is such that this code cannot simply be discarded. When the C code exists in libraries, the
necessary functions can be linked along with C++ functions, provided the libraries are binary
compatible (ELF vs COFF etc). However, because of the name mangling which occurs in C++,
C function definitions within the object code will be structured differently and the compiler
needs to be aware that name mangling is turned off for these functions. This is achieved by
placing a linkage specifier around the C function prototypes:
extern "C" {
    int myOldCFunction(int *arg);
    float myOldFloatFn(float fp);
}
One can also apply the linkage specifier to an entire header file (the #include directive must
be on a line of its own):
extern "C" {
#include <OldPrototypes.h>
}
The linkage specifier indicates that the function prototypes belong to C and this allows the
compiler to make the appropriate modifications in its function setup.
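The linkage specifier can be tried out in a single file. In the sketch below the body given for myOldCFunction is invented purely so that the fragment is self-contained. In a real project the definition would live in a separately compiled C object file or library.

```cpp
// Declaration with C linkage: the compiler does not mangle this name.
extern "C" {
    int myOldCFunction(int *arg);
}

// Stand-in definition (an assumption for this sketch), also given
// C linkage so that it matches the declaration above.
extern "C" int myOldCFunction(int *arg)
{
    return *arg + 1;
}

// Ordinary C++ code calls the C function like any other function.
int callLegacy(int value)
{
    return myOldCFunction(&value);
}
```

A header intended for use from both languages usually wraps the extern "C" braces in #ifdef __cplusplus ... #endif, since a C compiler does not understand the linkage specifier.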

10.2 Writing C Code

On occasion you may be called to write C code. As a competent programmer you should be
conversant with both C and C++. For the most part, the basic structure of the programs
will be identical. The main areas in which there is divergence are I/O and dynamic memory
allocation. Of course, C does not support classes, or any inheritance mechanism. If you need
such functionality you should use C++. The following section provides a few pointers to bear
in mind when writing or trying to understand C legacy code.

10.2.1 Function Prototypes

Function prototypes under C were largely a matter of choice. The original C language was
tightened up by ANSI, but some differences still remain between C++ prototypes and their C
counterparts:
1. There are no default arguments;
2. A prototype with no return type may be interpreted as returning an int; an empty
argument list, on the other hand, means that the arguments are unspecified rather than
that there are none (write void for the latter).
3. Under some old versions of C, all that was required was the function name, and even this
was optional as long as the function definition was parsed prior to the function invocation.
ANSI C is, fortunately, fairly specific about declaring function prototypes so later C code should
be more familiar.

10.2.2 I/O Under C

To use the I/O libraries under C, you need to include the header file stdio.h. This defines the
basic streams and the structures necessary to manipulate them. Output is achieved by calling
the function printf(), for stdout, or fprintf() for file based output. Input is accomplished
through the scanf() function, for stdin, and fscanf() for file-based input. Files are opened and
closed much the same way as in C++, but rather than being manipulated through an fstream
object, they are manipulated through a FILE structure.
The I/O functions use format strings to specify the expected layout of input items and to
specify the format for data output. Types can be coerced easily, leading to unexpected results
if you try to read, for example, the wrong kind of item.
The following piece of C code illustrates some of these function calls:
#include <stdio.h>
#include <stdlib.h>   /* exit() */
int main(void)
{
    char name[100];
    FILE *fp = NULL;
    printf("Please enter the name of a file: ");
    scanf("%99s", name);   /* width limit protects the buffer */
    if ((fp = fopen(name, "r")) == NULL)
    { fprintf(stderr, "Unable to open file - exiting.\n"); exit(1); }
    /* read in some text ... */
    /* test the result of fscanf(): feof() only becomes true after a read has failed */
    while (fscanf(fp, "%99s", name) == 1)
        fprintf(stdout, "%s", name);
    fclose(fp);
    return 0;
}


The format string used here was %s, which indicates a character string. There are similar codes
for integers (%d), floating point numbers (%f) etc. Observe that no objects exist: fp points to a
struct which contains all the information on a file. A pointer to this struct is returned by the
call to fopen(). The file is opened as a read-only file, hence the argument "r", and is opened in
text mode by default. If the string had been "rb", the file would have been opened in binary mode.
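The format codes can be tried out without touching a file by using snprintf() and sscanf(), the string-based relatives of fprintf() and fscanf(). The sketch below is written so that it also compiles as C++ (hence the <cstdio> header); the function names formatRecord and parseRecord are hypothetical.

```cpp
#include <cstddef>
#include <cstdio>

// Formats an int, a double and a word into buf using printf-style codes.
// Returns the number of characters the full text requires.
int formatRecord(char *buf, std::size_t size,
                 int id, double value, const char *name)
{
    // %d: integer, %.2f: floating point with 2 decimals, %s: string
    return std::snprintf(buf, size, "%d %.2f %s", id, value, name);
}

// Parses the same layout back. Returns the number of items converted.
int parseRecord(const char *buf, int *id, float *value, char name[32])
{
    // scanf-style functions store their results through pointers;
    // the width in %31s keeps the word within the 32 byte buffer
    return std::sscanf(buf, "%d %f %31s", id, value, name);
}
```

Formatting the values 7, 2.5 and "abc" produces the text "7 2.50 abc", and parsing that text recovers all three items. The return value of the scanning function should always be checked, since a mismatch between the format string and the input is not detected at compile time.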

10.2.3 Memory Management Under C

Under C the compile time operator sizeof must be used to ensure that the correct number
of bytes are allocated during a memory request. C++ uses the type to determine this information itself. The function calls malloc() and free() are used to allocate and free memory,
respectively. The following code snippet shows how they work:
float *fptr = NULL;
int N = 1000;
if ((fptr = (float*)malloc(N*sizeof(float))) == NULL)
{ printf("Memory request failed..."); exit(0); }
/* do something */
free(fptr);
Note that malloc() returns a generic pointer (void *) to a collection of bytes. A C compiler
converts this to the target pointer type implicitly, but the cast shown is required if the code is
compiled as C++. In this example space for N floats will be allocated and released.
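For comparison, the C and C++ styles of allocation can be placed side by side. This is a small sketch with made-up function names. It is compiled as C++, so the cast on malloc() is needed.

```cpp
#include <cstdlib>

// C style: request raw bytes, sized with sizeof, and cast the void *.
float *makeArrayC(int n)
{
    float *p = (float *)std::malloc(n * sizeof(float));
    if (p == NULL) return NULL;        // malloc reports failure with NULL
    for (int i = 0; i < n; i++)        // the memory starts uninitialised
        p[i] = 0.0f;
    return p;
}

// C++ style: new[] works out the size from the type and throws
// bad_alloc on failure instead of returning NULL.
float *makeArrayCpp(int n)
{
    return new float[n]();             // () sets every element to zero
}
```

Memory obtained from malloc() must be released with free(), and memory obtained from new[] with delete[]. The two mechanisms must never be mixed for the same block.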

10.2.4 Miscellaneous

Before leaving this section it is worth noting a few other quirks of C. Firstly, unlike C++,
local variables in traditional (C89) C must be declared at the top of a block, just after the
opening curly brace; you cannot declare variables at arbitrary points within the function
(C99 relaxed this rule). Secondly, local variables are not given an initial value automatically:
always make sure to assign an initial value when you create a variable, rather than relying on
the system to provide one. Finally, be wary of implicit type conversions
in C! An example of this is adding a char to an int. Many type-checked languages require a
cast; C will quite happily go along and apply such conversions without informing you, under
the assumption that you understand the full implications of what you have typed... which, as we
know, is not always true!
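The effect of such silent conversions is easy to demonstrate. The functions below are made up for this sketch. The same conversions are applied by C++, so the fragment behaves identically in both languages.

```cpp
// A char is silently promoted to int when used in arithmetic.
int nextCode(char c) { return c + 1; }

// Integer division discards the fractional part before the
// result is converted to float.
float halfWrong(int n) { return n / 2; }

// An explicit cast makes the intended floating point division visible.
float halfRight(int n) { return static_cast<float>(n) / 2; }
```

For an argument of 3, halfWrong returns 1 while halfRight returns 1.5. The compiler accepts both versions without complaint, so the burden of spotting the difference falls on the programmer.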


