
Experiments in

Undergraduate
Mathematics
A Mathematica-Based Approach

Phillip Kent
Phil Ramsden
John Wood

Department of Mathematics


Imperial College

Imperial College Press


Published by
Imperial College Press
516 Sherfield Building
Imperial College
London SW7 2AZ

Distributed by
World Scientific Publishing Co. Pte. Ltd.
P O Box 128, Farrer Road, Singapore 912805
USA office: Suite 1B, 1060 Main Street, River Edge, NJ 07661
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

British Library Cataloguing-in-Publication Data


A catalogue record for this book is available from the British Library.

All trademarks are acknowledged as the property of their respective owners.

EXPERIMENTS IN UNDERGRADUATE MATHEMATICS


A Mathematica-Based Approach
Copyright © 1996 Imperial College of Science, Technology and Medicine
First Published 1996
Reprinted 1997
All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means,
electronic or mechanical, including photocopying, recording or any information storage and retrieval
system now known or to be invented, without written permission from the copyright owner.

ISBN 1-86094-027-7
1-86094-028-5 (pbk)

Printed in UK by J. W. Arrowsmith Ltd.


Acknowledgements
The development of the contents of this book was funded (1993-95) by the
Teaching and Learning Technology Programme (TLTP), a special funding
initiative of the United Kingdom higher education funding councils. Information
on TLTP can be found on the World-Wide-Web at the URL
http://www.icbl.hw.ac.uk/tltp/.

We are especially grateful to Richard Noss of the University of London Institute
of Education, who made key contributions, both formal and informal, to the
process by which the contents of this book evolved. Professor Noss’ formal
evaluative role was made logistically possible, and its quality and value
enhanced, by the dedication and skill of his fieldworker, Rachel Shapton.

Jack Abramsky, and the staff and students of Kingston College, made willing
experimental subjects and, when invited, trenchant (though always polite) critics;
without their timely contributions our educational ideas could easily have
developed in directions unrelated to, and unconstrained by, reality.

David Klug, Richard Templer and Ian Gould, lecturers in the Chemistry
Department at Imperial College, made the bold decision to introduce
Mathematica into their first-year course on mathematics. They provided us with
our most important testing ground for evaluating prototype versions of the
mathematical experiments in this book. We thank them, and the many
undergraduate students, in the Chemistry department and elsewhere, who worked
on the experiments and provided us with invaluable information and insights.

Finally, we wish to thank Dan Moore for all the detailed discussions about
Chapter 11.
Preface
This book takes its place amongst a growing number about using a computer
mathematics system* (CMS) as an environment for learning mathematics. These
systems are already changing both the way mathematics is done and the way
mathematics is used. We can be sure, then, that they will have a role in the way
mathematics is learnt and taught. But they have been around for a relatively
short time, and the pace of curricular change is slow; in any case, there is still a
lively debate among educators about what the scale and nature of their role will
be, or ought to be.

We believe that a CMS allows us to present mathematics to students as an
experimental subject. The term “experiment” that we use in this book is not used
casually or rhetorically. In a laboratory experiment, in clear contrast to lectures
or tutorials, the student, though perhaps guided by a lab assignment script, can
exert a measure of control over the process. Experiments, too, are often
collaborative, in the sense that students can pool their ideas and energies; such
interactions are less common in traditional mathematics learning. Finally, and
most importantly, an approach based in part upon experiment encourages the
student to formulate mathematical conjectures. This conjecturing process is
central to mathematics and to mathematical problem-solving, but has
traditionally played only a minor role in the mathematical activity of students.

Our approach is related to the doctrine of constructivism: the idea that conceptual
learning involves the learner building, bit-by-bit, his or her own understanding of
mathematical knowledge. A consequence of this idea is that, in mathematics at
least, mere instruction is unlikely to be effective: at any rate, not on its own.
Constructivism emphasises the importance of cognitively challenging
mathematical activity.

* We prefer this term to the more traditional “Computer Algebra System”, which fails to
convey the power and scope of modern mathematical software.

The topics covered in this book roughly correspond to those traditionally studied
in the A-level school syllabus in England and Wales, though recent and
continuing curriculum changes mean that several topics are now studied for the
first time at university. The first year undergraduate with mathematics ‘A’ level
will find this book most useful for revision: for strengthening his or her
understanding of topics first met at school. In North American terms, this book
is suitable for freshman calculus.

We welcome your comments and thoughts on the contents of this book. Send
them along to:

The METRIC Project


Mathematics Department, Imperial College
London SW7 2BZ
England.

Electronic mail: metric-proj@ic.ac.uk

World-Wide-Web: http://metric.ma.ic.ac.uk/Message.html

Phillip Kent, Phil Ramsden and John Wood - May 1996.


Table of contents

Chapter 0: Getting Started with Mathematica

Chapter 1: Foundations
Functions & Graphs
Functions; functions in Mathematica; composing functions;
graphing functions; graphing experimental data.

Trigonometry
About trigonometry; degrees and radians; trig functions from
0 to π/2; special angles; trig functions from 0 to 2π; amplitude,
frequency and phase; reciprocal functions (sec, cosec and cot);
trig identities; small angles.

Series 1 (Sequences)
Sequences; converging and diverging; arithmetic and geometric
progressions (APs and GPs); series; summing APs; summing GPs;
summing to infinity; recurrence sequences; Fibonacci numbers;
non-linearity and chaos.

Chapter 2: Calculus
Differentiation 1
Gradients of straight lines; chords and tangents; limits and
derivatives; differentiation of x^n; trigonometric functions;
exponential functions; derivatives of inverses.

Differentiation 2
Product rule; quotient rule; composite functions (chain rule);
maxima and minima; stationary points; implicitly defined functions.

Integration 1
Areas under curves; the Riemann approximation and the
trapezium rule; indefinite integrals; integration and
differentiation; integration of x^n; integrals of common functions;
simple rules and patterns.

Integration 2
Definite integrals; the trapezium rule; parabolic segments;
Simpson’s rule.

Integration 3
Manipulating the integrand; substitutions; change of variable
methods; integration by parts; areas in the plane; volumes of
revolution.

Series 2 (Series)
Binomial expansions; polynomial approximations; Maclaurin
series.

Chapter 3: Vectors and Matrices
Vectors 1
Introduction to vectors; negative and equal vectors; scalar
multiplication; vector addition; components; the i-j basis;
modulus.

Vectors 2
The scalar (or dot) product; angles between vectors; three-dimensional
vectors; the i-j-k basis; the vector (or cross) product.

Vectors 3
Vector and Cartesian equations of a line; parallel, intersecting
and skew lines; vector and Cartesian equations of a plane; the
angle between two planes.

Matrices
Introduction to matrices; transformations of the plane;
matrix multiplication; addition and subtraction; combining
transformations; matrix equations as transformations;
equations as lines; equations as algebra; inverses; determinants.

Chapter 4: Complex Numbers
Complex Numbers 1
Square roots of negative numbers; complex roots of quadratics;
arithmetic of complex numbers; complex conjugate.

Complex Numbers 2
The Argand diagram; geometrical interpretations of complex
arithmetic; modulus and argument; mod-arg arithmetic.

Complex Numbers 3
De Moivre’s theorem; nth roots; exponential form; loci.
To the learner
This is a book about doing mathematics. Mathematics is not a subject you can
learn just by reading about it - though that can sometimes make you think
you’ve learned it. Learning mathematics means getting your hands dirty: getting
to grips with the subject directly. What’s different about this book is that you’re
invited to do mathematics armed with Mathematica. When you see what
Mathematica can do for you, you’ll understand what an important difference that
is.

Most of the “doing” in this book happens in the mathematical experiments.
Though these sometimes take up less space on the page than what we’ve called
the Reading, they are not less important. On the contrary: they’re what this book
is really all about.

The structure of this book


The book is divided into five chapters. The first (Getting Started) is a quick
introduction to the Mathematica software. Then comes Foundations, basic topics
on which the following ones rest. The three main chapters, Calculus, Vectors &
Matrices and Complex Numbers, can then be studied independently of one
another.

Each topic is broken down into a number of modules (one, two or three), each
containing material for approximately one hour of study at the computer
(Differentiation, for example, has two modules). The title page of each chapter
lists the contents of all its modules.

Each module contains about a half dozen experiments, indicated by a shaded
background, which consist of a set of activities to be carried out at the computer.
Almost every experiment has a section of Preparatory Reading, to be read
before you start, and some have Post-experiment Reading, to be read when you
have finished.

Practice questions
Placed at appropriate points in each module you’ll find practice questions.
These are an electronic version of those you find in ordinary mathematics
textbooks.

There is a difference, though. In a textbook, the questions are fixed. But the
practice questions, and their answers, are generated by the computer. Each time
you set yourself a question, it will be slightly different from the last time. The
system makes no judgements and it does not keep scores.

Mathematics and Mathematica


This book is for learning or revising mathematics, not for becoming a
Mathematica expert. Of course, you will end up learning quite a bit about
Mathematica, and that skill will always be useful to you. And doing
Mathematica counts as doing mathematics, because Mathematica is that kind of
program. Actually, you may well become unsure where Mathematica ends and
mathematics begins; most Mathematica users get like that eventually.

By the way: if you find yourself getting interested in more advanced uses of the
program, we have devised some Additional Activities. These can be found on our
World-Wide-Web site (http://metric.ma.ic.ac.uk/).

Finally, we’ll quote a passage from the “advice to the learner” section of another
book* which we like very much and which we think goes well here:

“‘If I am asked to solve a problem, or do something on the computer, and I am
not told how to do it, what should I do?’ The answer is very simple. Make some
guesses (better yet, conjectures) and try them out on the computer. Ask

* Learning Abstract Algebra with ISETL, Ed Dubinsky and Uri Leron, Springer-Verlag,
1994. Page xii.

yourself if it worked - or what part of your guess worked and what part did not.
Try to explain why. Then refine your guess and try again. And again. Keep
repeating this cycle until you understand what is going on. The most important thing
for you to remember is not to think of these explorations in terms of success and
failure. Whenever the computer result is different from what you expected, think of
this as an opportunity for you to improve your understanding. Remember: instead of
just being stuck, not knowing what to do next, you now have an opportunity to
experiment, to make conjectures and try them out, and to gradually refine your
conjectures until you are satisfied with your understanding of the topic at hand.”
To the lecturer
We envisage three ways in which this book might be used:

• as an integral part of a course based around computer laboratory sessions,
  with or without lectures.
• as a supplement to either a conventional lecture course, or a lab course using
  another set of Mathematica-based materials.
• as revision material for mathematics required for a course that you are
  teaching.

The book does not assume any previous experience with Mathematica or similar
software on the part of learners. Neither does it set out to teach Mathematica
explicitly. However, the authors’ experience in piloting the contents of this book
suggests that learners can, and do, become very skilled Mathematica users quite
quickly.

The METRIC software


We have provided special Mathematica functions for tasks where working from
scratch would be too cumbersome, or too intricate for a novice user. These
functions work with the standard Mathematica syntax; we have not attempted to
implement a simplified, novice version. The standard syntax, though not
transparent, is inherently well designed and naturally mathematical in structure,
and supports students’ mathematical thinking.

We have also provided computer-generated, randomised practice questions.
Students can set themselves such questions at any time, but are invited to do so at
specific points in the text.

Our software contains a logging mechanism that allows students’ sessions with
Mathematica, or selected parts of those sessions, to be automatically recorded.
Full directions for activating this mechanism are to be found in the text file
LOGGING.TXT in the METRIC package.

Computer labs and orientation sessions


In our experience, even students completely new to Mathematica can become
quite proficient users provided they are given two or three hours of “orientation”
in a computer lab with teacher/instructor support. The module Getting Started is
designed particularly for such sessions, with Functions & Graphs a natural
second module to follow.

Adapting the materials


You will, of course, be able to adapt the curricular materials in this book for your
own purposes, for example by selecting particular experiments, or parts of
experiments. We hope, too, that you will feel free to modify the special
functions for your own uses, and perhaps improve them.

You are also invited to make free use of the Practice Question code, and to author
your own questions. This is easier than it sounds; the design of the practice
questions has been deliberately kept simple for this very purpose. Instructions on
how to write practice questions are to be found in the text file PRACTICE.TXT
in the METRIC package.

We would be pleased to hear about any modifications that you do carry out.
Please contact us at the addresses given in the Preface.

Additional resources
Please check our World-Wide-Web site (http://metric.ma.ic.ac.uk/)
for additional materials that we will be making available from time to time.
The software for this book
To perform the experiments in this book you will need Mathematica version 2.0
or later with the notebook front end, together with the METRIC software
package.

The METRIC package consists of

• Mathematica custom functions that are used throughout the experiments;
• practice questions, and the Mathematica code for running them;
• code for logging students’ Mathematica sessions;
• text files containing detailed technical instructions, including directions for
  installation.

Disk access
The hardcover edition of this book includes a floppy disk (DOS format) with the
complete set of programs.

You can also get the floppy disk by mail by writing to:

The METRIC Project (EUM Disk)


Mathematics Department
Imperial College
London SW7 2BZ
England.

The cost is £15 sterling. Please send a cheque payable to “Imperial College” and
drawn on a British bank.

Internet access
The METRIC package is freely available on the World-Wide Web, at the URL
http://metric.ma.ic.ac.uk/

You may also use anonymous FTP or Gopher to metric.ma.ic.ac.uk (IP
address 155.198.192.26). The programs are available at the MathSource
archive in the USA:
http://www.wolfram.com/mathsource/

Platforms and formats


The floppy disk contains three archive files for DOS, Macintosh and Unix:
mathetic.exe, mathetic.sea and mathetic.tar, respectively. There
is also a directory, source, containing the files in normal form. Installation
instructions are given in the file INSTALL.TXT.

Updates and on-line information


For the latest updates to the software and learning modules, please visit our
WWW site at:
http://metric.ma.ic.ac.uk/
or use anonymous FTP or Gopher to metric.ma.ic.ac.uk (IP address
155.198.192.26).

Mathematica version 3.0


As we go to press, Mathematica version 3.0 has not yet been released. It seems
likely that the directory structure will change somewhat, and therefore that
installation instructions will have to be modified slightly. We shall post
information on this at the above WWW site.
Getting started with
Mathematica

About this module


Aims
• To help you start using Mathematica;
• To provide an idea about what Mathematica is and what it can do.

Prerequisites
You need to be able to use a keyboard and mouse-windows computer user-interface
(e.g. Microsoft Windows or the MacOS). If you have not used this sort
of computer system before there are tutorial programs available, or you might like
to start working with someone who is more familiar.

There is no special mathematics knowledge required for this work but you should:

• try to understand the Mathematica statements, and
• try out your own ideas rather than just “do what you are told”.

What is this thing called Mathematica?

Mathematica is a computer program created by Wolfram Research Inc. and is a
type of system usually known as a “computer algebra system” or a “computer
mathematics system”. It helps you do lots of different mathematical tasks. Such
programs are now essential tools for scientists, engineers and mathematicians and
you need to become familiar with them.

Mathematica has two parts: the Kernel and the Front End.

The Kernel is the main part of the system, which accepts Mathematica commands,
processes them and sends back results. This is called evaluating the
command.

The Front End is the part of the system that handles such things as screen
display, printing and the creation of Mathematica documents.

Mathematica documents are called Notebooks. A Notebook is a bit like a word-processor
document; you can type and edit commands, send them to the Kernel
for evaluation, display the results and save your work.

You send commands to be evaluated by holding down the “shift” key and pressing
the “return” key (which may be marked “enter” or “↵” on some keyboards).

In[1]:=
2 + 2
Out[1]=
4

[Figure: the Notebook and the Kernel]

When you start Mathematica you actually start just the Front End; the Kernel only
starts when it is needed. It’s fine to have more than one Notebook open at a time,
but don’t start more than one copy of Mathematica.

During this module you will work within a Notebook to try out some of
Mathematica’s capabilities.

Notebook management
A Mathematica Notebook is, as we’ve said, a bit like a word-processor document;
you can scroll up and down, and edit things in the usual “mousey” way. But
Mathematica Notebooks have certain special features you don’t find in other types
of document.

A Notebook can contain different types of things: Mathematica commands, their
outputs, error messages, pictures, and just plain text. Mathematica needs, and so
do you, to know what is what and to keep different types of things separate. The
Notebook is divided into cells and these are identified by “square brackets” on the
right of the window.

Mathematica shows the relation between an input to the kernel and its output into
the Notebook by grouping the cells. This is shown below, where the graphical
and textual output are shown grouped with the input command by an extra bracket
further to the right; in this case the whole group has been selected. You can use
groups to organise your own Notebooks.

In[5]:=
Plot[Sin[x], {x, -Pi, Pi}]

Out[5]=
-Graphics-

Operations on whole (groups of) cells can be accomplished by selecting their
brackets using the mouse, and then doing the operation. You can also cut, copy,
and paste entire cells, and selections of text between cells: this can save you
having to retype the whole of a long Mathematica command. But you can also
re-evaluate a previous command, in any input cell, by clicking inside it or selecting
its cell bracket, and pressing “shift-return” to evaluate.

Using outputs
Each output from Mathematica has a number. If you want to use the contents of
output number 35, say, you can include it in a Mathematica command using %35.
The symbol % by itself means “the most recent output”.
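
As a small illustration of referring back to outputs, here is a sketch of three inputs evaluated one after another. The numbers are our own choice, and the comments assume a fresh session in which the first output is labelled Out[1].

```mathematica
2 + 2      (* in a fresh session this output is labelled Out[1] *)
% * 10     (* % is the most recent output, so this is 4 * 10 *)
%1 + %2    (* Out[1] + Out[2]; this assumes the labels are 1 and 2 *)
```

If your session has already produced some outputs, the numbers after % will need changing to match the labels you actually see.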

Mathematica syntax
Mathematica is a computer language with its own grammar and spelling rules,
which in computerese are collectively called its syntax. The Kernel is necessarily
strict about these rules and you will need to be careful with them. You’ll pick up
a lot of it as you proceed through this work but here are some key points:

Mathematica commands start with a CAPITAL letter, e.g. Sin[x]. If the
command is really several words in English joined together then each one starts
with a capital but WithNoSpacesInBetween. Letter case matters: the word
Fun is different from the word fun which is different from the word fuN.

Mathematica uses a lot of brackets and all the different sorts of them, ( [ { } ] ).
It matters which type of bracket you use in a command:

Most Mathematica functions need inputs (“arguments” in computerese) and
these must be put inside square brackets, for example: Sin[x]. Two or more
inputs are separated by commas, for example:
Plot[Sin[x], {x, 0, 2 Pi}, F

Curly brackets, called “braces”, {...} are used to make a list, usually to allow
several objects to be treated as one.

Parentheses, or “ordinary” brackets, (like these), are used to group terms
together just as we do in algebra: it’s a good idea to use them in complicated
expressions to make sure the meaning is clear.
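
The three kinds of bracket can be seen side by side in a few short inputs. This is only a sketch with examples of our own choosing; each line can be evaluated separately.

```mathematica
Sin[Pi/2]                    (* square brackets enclose the input of a function *)
{1, 2, 3}                    (* braces make a list *)
(1 + 2) * 3                  (* parentheses group terms, as in algebra *)
Plot[Sin[x], {x, 0, 2 Pi}]   (* the braces hold the variable and its range as one input *)
```

Swapping one type of bracket for another, for example typing Sin(Pi/2), changes the meaning of the command, so it is worth checking the brackets first when something unexpected comes back.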

Arithmetical operations
The common arithmetical operations are carried out using the symbols:
+, -, *, / and ^ (for powers).

Multiplication can also be signified by a space (or spaces), e.g. 3 x means the
same as 3*x. For the rest of the time, spaces have no significance except for
clarifying commands. For example, it’s usual to put a space after a comma (like
above), but not usual to write, say, Sin[ x ], though you can if you like. It is
sometimes better to indicate all the multiplications in a complicated command
with *’s so that you don’t get the two meanings of the space character confused.
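
A few inputs of our own choosing show these operations at work, including the two ways of writing a product:

```mathematica
3 * 4        (* explicit multiplication: 12 *)
3 4          (* a space also means multiplication: 12 *)
2^10         (* powers use ^: 1024 *)
(2 + 3)/5    (* parentheses make the grouping explicit: 1 *)
```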

Variable names

Mathematica variable names can be long, but they must not begin with a number,
because Mathematica interprets, say, 2dimension as 2*dimension.
Names can end with a number, though: x1 is a useful way of writing in
Mathematica a subscripted variable like x₁. Note also that combinations of letters
without spaces are interpreted as new variables: ax does not mean a*x.
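
The difference between a x and ax is easy to check for yourself. In this sketch the names a, x and x1, and the values given to them, are our own illustrative choices:

```mathematica
a = 2; x = 5;
a x               (* a times x: 10 *)
a*x               (* the same product: 10 *)
ax                (* a new, unrelated symbol; it comes back unchanged as ax *)
x1 = 7            (* a name may end with a digit *)
Clear[a, x, x1]   (* remove the values so they do not affect later work *)
```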

Getting help
You are certain to get stuck, and we all know how often computers seem to go
wrong. Here are some ways to recover.

• Use the Help system, which is especially useful for finding out about
  Mathematica functions.

• Ask people, other students, the course demonstrators, anybody.

• If a command doesn’t work properly check the error messages (usually in red
  or blue text).

• If your input is simply returned unchanged, with no error messages to help,
  check that you have spelt the command correctly, that the number of inputs is
  correct, that you haven’t left out any commas, and that the types of the inputs
  are appropriate (for example, you may have typed a number when a list is
  required).

• If Mathematica seems to have stopped, Abort the calculation or (more
  drastically) Quit the Kernel, using the Action menu.

• If everything seems to have gone wrong Quit from Mathematica (via the File
  menu) and start again. Learn to Save your work so that you can recover from
  these situations.

Printing, saving and opening Notebooks


Mathematica Notebooks can be printed and saved in standard ways, using the
appropriate commands on the File menu. If you wish to save a Notebook to a new
location, such as a floppy disk, use Save As. Use Open to load a Notebook from
disk.

It is particularly useful to use Print Selection (of cells) to get just the cells you
want.

Experiment 1: Arithmetic

Experiment 2: Numbers

23/29 + 12/5
23/29 + 11/55
You can use N to get them

Another example of

Sin[Pi/3]
Cos[Pi/5]
Tan[Pi/8]

can use N here too to

Sin[N[Pi/3]]
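
The contrast between exact answers and the decimal approximations that N produces can be sketched with the same kinds of input. The particular pairings below are our own illustration, not the experiment's full script:

```mathematica
23/29 + 12/5       (* exact arithmetic: the answer is a single fraction *)
N[23/29 + 12/5]    (* N turns it into a decimal approximation *)
Sin[Pi/3]          (* exact value: Sqrt[3]/2 *)
N[Sin[Pi/3]]       (* approximately 0.866 *)
```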

Experiment 3: Algebra
1) Mathematica works with
(x + 3)^2 + x + 3

told what to do

note about how



Experiment 4: Equations

Experiment 5: Plotting graphs



Experiment 6: Some built-in functions


Functions and Graphs


Introduction
Definitions
Functions are a very important idea in mathematics. A function is a rule that
takes one object and converts it to another. The “object” going in may be a
number, a variable or a coordinate pair (or triple in three dimensions). The
important thing is that, for any given input, there is only one output.

The set of all possible “objects”, or arguments, for which a function’s behaviour
is defined is called the function’s domain. The corresponding set of values which
the function yields is called its range.

Graphs are pictures of functions made by plotting the input against the output
using a coordinate system: most familiar is the Cartesian (x,y) system.

Notation
There are several different notations for functions. For example, the function that
takes a number, squares it, takes its sine and multiplies the square and the sine
together can be written:

y = x² sin x, or
y(x) = x² sin x, or
f(x) = x² sin x,

or
f: x → x² sin x.
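
For comparison, the same function could be written in Mathematica along the following lines. This is a preview sketch only; defining your own functions is the subject of Experiment 1, and the test value Pi/2 is our own choice.

```mathematica
f[x_] := x^2 Sin[x]   (* x_ stands for "any input"; := sets up the rule *)
f[Pi/2]               (* exact value: Pi^2/4 *)
N[f[Pi/2]]            (* numerical value, approximately 2.4674 *)
```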


Mathematica issues
Functions
Mathematica uses square brackets to enclose function inputs, where we might
usually use round ones. For example:
Sin[Pi]
Exp[2]
Mathematica has many functions built in, and you can easily define your own
functions in Mathematica as we’ll see shortly.

Graphs
Computer-generated graphs in Mathematica are built by making a table of x and y
coordinate pairs across a range of x. Mathematica chooses the y-ranges on graphs
so that it shows “the most interesting part” of the graph corresponding to the
x-range you chose. You can force it to use a certain y-range if you use an option,
like this:
Plot[x^2, {x, -7, 7}, PlotRange -> {-5, 25}]

Experiment 1: Functions
Preparatory reading
Mathematica has all the common functions built in, and some fairly uncommon
ones too. You may have seen some of them in the Getting Started module. In
this experiment you will be creating your own functions in Mathematica.
The general quadratic function has two even terms (the x² term and the constant),
and one odd one (the x term). To make the general quadratic into an even or odd
function the offending terms have to be eliminated, by setting b = 0 for evenness,
and a = c = 0 for oddness.

Experiment 2: Composing functions


Preparatory reading
Composing means applying one function after another. In traditional notation,
there is a possible confusion about the order of application when applying
functions one after the other:

g(x) means "apply function g to x".

fg(x) = f(g(x)) means "apply function f to g(x)".

The function fg (sometimes written f.g or f ∘ g) is called the composite of f
and g. Notice that fg means "apply the function g first, then apply f to the output".
It is easy to get confused, because the function to be applied last is written first.
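
The effect of the order of application can be seen in a short sketch. The two functions f and g defined here are hypothetical examples chosen only to make the difference visible:

```mathematica
f[x_] := x^2     (* "square the input" *)
g[x_] := x + 1   (* "add one to the input" *)
f[g[x]]          (* g first, then f: (1 + x)^2 *)
g[f[x]]          (* f first, then g: 1 + x^2 *)
```

The two results differ, which is why it matters that fg means "g first, then f".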
Inverse functions can be problematic, because of our insistence that a function
must only have one output. But, for example, there are two numbers whose
squares are 9, namely 3 and -3. The “inverse square” of 9, then, is not uniquely
defined.

The sine function is even worse. For example, the equation sin x = 0.5 has an
infinite number of solutions and the convention is to choose the value that lies
between -π/2 and π/2. With this restriction, the inverse of the sine function is
written:

arcsin x, or sin⁻¹ x.

The other inverse trigonometric functions are written similarly. These functions
are discussed more fully in the Trigonometry module. Note that the sin⁻¹ x
notation is extremely ambiguous: it does not mean (sin x)⁻¹ = 1/sin x, it simply
means “the angle whose sine is x”.
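
In Mathematica the inverse sine is the built-in function ArcSin. The sketch below, with inputs of our own choosing, contrasts it with the reciprocal of the sine:

```mathematica
ArcSin[1/2]    (* the angle between -Pi/2 and Pi/2 whose sine is 1/2: Pi/6 *)
Sin[5 Pi/6]    (* 5 Pi/6 also has sine 1/2, but lies outside that interval *)
1/Sin[Pi/6]    (* the reciprocal of the sine, a different thing altogether: 2 *)
```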

Inverses
For simpler functions there is a way to find the inverse, based on trying to solve
(or rearrange) the function definition to get x in terms of y. For example:

y = 4x + 2
y − 2 = 4x
x = (y − 2)/4

We can now write down the inverse function, switching the roles of x and y:

y = (x − 2)/4

It’s a convention that x be the independent (input) variable and y the dependent
(output) variable.

There are some trivial self-inverting functions like “add zero” or “multiply by 1”.
The reciprocal function, y = l/x, is a simple, non-trivial one. There are others.
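
Both ideas can be tried out in Mathematica. In this sketch Solve does the rearranging for the example y = 4x + 2, and the function name r for the reciprocal is our own choice:

```mathematica
Solve[y == 4 x + 2, x]   (* rearranges to give x in terms of y *)
r[x_] := 1/x             (* the reciprocal function *)
r[r[x]]                  (* applying it twice gives back x, so r is self-inverse *)
```

Note the double equals sign, which Mathematica uses for equations; a single = assigns a value instead.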

Experiment 3: Graphing functions


Preparatory reading
A graph is just a picture of a function. Mathematically the graph of a function f is
the (infinite) set of points (x,y) such that y = f(x).

When we draw graphs we usually choose some values of x, calculate f(x) and plot
each of the points (x,y) until there are enough to be able to draw a curve through
the points.

Mathematica has a similar approach, and like us it knows to fill in more points at
places where the function is changing rapidly.

In x-y graphs there is an implicit distinction between the two axes. The x-axis is
usually used for the independent variable (independent in the sense that we can
usually choose any number to put into the function). The y-axis represents the
dependent variable (that is, dependent on x).

Post-experiment reading
The Plot command, though very useful and important, has a major flaw: it
always assumes that the functions it is plotting are continuous. So when an input
function isn’t continuous, such as 1/(x − 2) (discontinuous at x = 2), or tan x
(discontinuous at x = ±π/2, ±3π/2, ...), or sin(1/x) (discontinuous at x = 0) then
spurious lines connecting large positive and large negative function values will
appear. The failure in the case of sin(1/x) is even more drastic because the period
of the function goes to zero at x = 0. The moral of this story is: always think
twice about computer output!
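Versions of Mathematica later than the one this text was written for provide an Exclusions option for Plot, so the sketch below assumes such a version. It draws tan x twice, once with the spurious joining lines and once without; the names naive and careful are ours.

```mathematica
(* With exclusions switched off, Plot joins the branches of tan x with
   spurious near-vertical lines at x = Pi/2 and x = 3 Pi/2 *)
naive = Plot[Tan[x], {x, 0, 2 Pi}, PlotRange -> {-10, 10},
  Exclusions -> None]

(* Telling Plot where the function breaks removes the spurious lines *)
careful = Plot[Tan[x], {x, 0, 2 Pi}, PlotRange -> {-10, 10},
  Exclusions -> {Cos[x] == 0}]
```

Comparing the two pictures side by side is a useful reminder that the first one is an artefact of the plotting method, not a property of the function.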

The graphs of a function and its inverse function are simply reflections of each
other in the line y = x. Self-inverse functions are symmetrical about this line.

Experiment 4: Graphing data


Preparatory reading
We can use the coordinate system of a graph to represent sets of pairs of numbers,
with or without knowing a function that links them. A typical example would be
the results of an experiment where, say, the temperature T of a system is measured
at different times t, leading to a set of coordinate pairs $(t, T)$.

We’ve ordered them in this way because we tend to put the independent variable
on the horizontal axis, and with this experimental data it seems natural to treat
time as the independent variable.

We don’t know if there’s a functional relationship between the two
measurements. Graphing the data is part of our attempt to find out. The simplest
case to look for is a linear relationship: we try to fit a straight line through the
points.

All straight lines have an equation like this:


y=mx+c

where m is the gradient of the line (= $\tan\theta$), and c is the intercept on the y-axis.

for later use:


dataGraph=Li

commands define

a=$; c= 45;
fitLine = Plot[m * x
Show[{dataGraph, fit

Post-experiment reading
Fit works by choosing the curve that minimises the (vertical) distances of the
data points from that curve. It performs a least squares fit: it minimises the sum
of the squares of the distances from the data points to the curve (a straight line in
this case):
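The least squares idea can be tried out directly. In this sketch the data values are invented for illustration, and the names data, line, sumSquares and best are ours; Fit, NMinimize, ListPlot and Show are built-in functions in recent versions of Mathematica.

```mathematica
(* Invented (time, temperature) readings, for illustration only *)
data = {{0, 45.2}, {1, 50.1}, {2, 54.8}, {3, 60.3}, {4, 64.9}};

(* Least-squares straight line: a combination of the functions 1 and t *)
line = Fit[data, {1, t}, t]

(* Sum of squared vertical distances from the points to the line m t + c *)
sumSquares[m_, c_] := Total[(#[[2]] - (m #[[1]] + c))^2 & /@ data]

(* Minimising that sum directly should give the same gradient and intercept *)
best = NMinimize[sumSquares[m, c], {m, c}]

(* Plot the points and the fitted line together *)
Show[ListPlot[data], Plot[line, {t, 0, 4}]]
```

The agreement between Fit and the direct minimisation is the point of the example: Fit is doing nothing more mysterious than choosing m and c to make sumSquares as small as possible.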

Experiment 5: Functions from graphs


Preparatory reading
This final Experiment is intended to help you see the relationships between
common functions and their graphs.

GiveQuestion[

LastAnswer[
to check your answer. Y
are randomly generated,

For questions on trigonome


GiveQuestion["trig gr

LastAnswer["trig g

GiveQuestion["grap
is section uses this
ing back to the Inst
Trigonometry

Introduction
This module covers:

the definitions of the trigonometric functions sine, cosine and tangent;

some of their mathematical properties as functions;

the reciprocal trig functions: secant, cosecant and cotangent.

You will need to be familiar with the following:

right-angled triangles and Pythagoras’ Theorem;

some coordinate geometry: plotting points and curves on a graph.

Mathematica notation
$\pi$ (= 3.1415926...) is written as Pi. Mathematica always keeps values as
accurately as possible, so will show the number $\pi$ as Pi, not as a numerical
approximation. To force it to show you a numerical value you need to use the N
function.

For infinity, use the capitalised word Infinity in place of the usual symbol $\infty$.
Mathematica requires square brackets around the arguments of functions, so sin x
becomes Sin[x].
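A few lines typed into Mathematica show these conventions at work. This is only an illustrative sketch; the particular inputs are our choice.

```mathematica
Pi            (* stays as the exact symbol Pi *)
N[Pi]         (* a numerical approximation *)
N[Pi, 20]     (* the same number to 20 significant digits *)
Sin[Pi/6]     (* exact arguments give exact results where possible *)
N[Sin[1]]     (* Sin[1] stays symbolic until N is applied *)
1/Infinity    (* Infinity can be used in simple arithmetic *)
```

Notice the square brackets and the capital letters: typing sin(x) or pi will not do what you expect.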
About trigonometry
Trigonometry is the part of mathematics dealing with the properties of triangles.
(From the Greek, trigonon = triangle). You should already know something about
the geometric properties of triangles, and right-angled triangles in particular.
Pythagoras’ Theorem relates the lengths of the sides in a right-angled triangle, a
and b, and the hypotenuse h:

$$ h^2 = a^2 + b^2. $$

You should also have learnt about the trigonometric (or “trig”) functions: sine,
cosine and tangent. Although these are defined as ratios of sides in right-angled
triangles, they are used very widely in advanced mathematics, and appear in some
surprising places.

Trig functions are important; you should have a good grasp of them and have
them in your “toolbox” of mathematical methods.

Experiment 1: Degrees and radians


Preparatory reading
In your previous study of trigonometry you may have been using only the degree
as the unit of angle, where a circle contains 360°.

In this module, and as a general rule for university-level mathematics, we will be
using the radian, where a circle contains $2\pi$ radians. (1 radian is the angle
subtended by a circular arc of radius 1 unit and arc-length 1 unit.)

1° is $\pi/180$ radians, so multiplying any number of degrees by the factor $\pi/180$
gives the radian measure of the angle.
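The conversion factor is easy to package as a function. In this sketch toRadians is our own name; Degree is Mathematica's built-in constant equal to Pi/180.

```mathematica
(* Convert degrees to radians by multiplying by Pi/180 *)
toRadians[deg_] := deg Pi/180

toRadians[90]        (* a right angle *)
toRadians[360]       (* a full circle *)
N[toRadians[1]]      (* one degree as a decimal number of radians *)

(* The built-in constant Degree is the same factor Pi/180 *)
Sin[30 Degree]
```

Writing 30 Degree is therefore just a convenient way of typing an angle in degrees while Mathematica continues to work in radians.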

Post-experiment reading
1 radian = 57.2958 degrees;

1° = 0.01745329251994329577 radians.

The special angles (30, 45, 60 and 90 degrees) are found in these right-angled
triangles:

[Figure: right-angled triangles containing the special angles]

Experiment 2: Trig functions from 0 to $\pi/2$


Preparatory reading

[Figure: right-angled triangle with sides labelled Opposite and Adjacent]

For the angle t:

$$ \sin t = \frac{\text{opposite}}{\text{hypotenuse}}, \qquad \cos t = \frac{\text{adjacent}}{\text{hypotenuse}}, \qquad \tan t = \frac{\text{opposite}}{\text{adjacent}}. $$

It is usual to label angles with the Greek letter $\theta$ (“theta”). We can’t use Greek
letters easily on the computer, so we’ll be using t instead.

In a right-angled triangle, t can have values between 0° and 90°, and therefore
these definitions are only valid for t between 0° and 90°, or $0 \le t \le \pi/2$ radians.

Post-experiment reading
You should become familiar with this result:

Experiment 3: Trig functions from 0 to $2\pi$


Preparatory reading
Knowing the definitions of the trig functions in right-angled triangles does not tell
us how to calculate their values, except for the special angles above. The
calculation of values is a separate topic, and in this era of computers and pocket
calculators something we can leave out.

For other ranges of values of t, we could define these functions by considering the
exterior angles of an obtuse-angled triangle (some books do this). Instead, we will
consider the trig functions as defined in the unit circle:

Here we've drawn a unit circle (a circle of radius 1), and an arbitrary point $P(t)$
which subtends an (acute) angle $t$ at the origin. The equation of the circle is

$$ x^2 + y^2 = 1. $$

$P(t)$ depends on $t$ and has coordinates $(x, y)$. The right-angled triangle containing $t$
has a hypotenuse of length 1, since this is the radius of the circle.

It follows from the definition of sine and cosine that:


$$ x = \cos t, \qquad y = \sin t, \qquad \tan t = \frac{y}{x} = \frac{\sin t}{\cos t}. $$

Sine, cosine, tangent and their related functions are often called circular
functions because of their definition in the unit circle.

We divide the unit circle into four quadrants, the quarters which are slices of
angle $\pi/2$ (that's 90°, but this is the last time we mention degrees!).

We refer to them as:

1st quadrant: from 0 to $\pi/2$;

2nd quadrant: from $\pi/2$ to $\pi$;

3rd quadrant: from $\pi$ to $3\pi/2$;

4th quadrant: from $3\pi/2$ to $2\pi$ radians.

1) Use Mathematica to gr
together type:

Plot CSinttI I it

It's laid out like this, u



Post-experiment reading
You should be familiar with the shapes of the graphs of the three main
trigonometric functions between 0 and $2\pi$:

[Figure: graphs of sin t, cos t and tan t]

Many properties of the trig functions can be found from these graphs and the unit
circle diagram. For example, they show the range of the sine and cosine
functions:

$$ -1 \le \sin t \le +1 \quad\text{or}\quad |\sin t| \le 1 $$
$$ -1 \le \cos t \le +1 \quad\text{or}\quad |\cos t| \le 1 $$

A circle “contains” $2\pi$ radians, which means that the point P reaches the same
place after each turn of $2\pi$. Its associated angle will look the same, but the
circular definition is $2\pi$ (or $4\pi$ or $6\pi$ etc.) larger. Similarly for turns in the
negative direction, P will return every $2\pi$:

$$ P(t) = P(t + 2\pi) = P(t + 4\pi) = \dots = P(t - 2\pi) = P(t - 4\pi) = \dots $$
$$ \therefore\; P(t) = P(t + 2n\pi), $$

for any integer n, which may be positive, negative or zero.



We defined the trig functions in terms of the point P, so they have the same
property:
$$ \sin t = \sin(t + 2n\pi) $$
$$ \cos t = \cos(t + 2n\pi) $$
$$ \tan t = \tan(t + 2n\pi). $$

With tangent we can go further. The graph of tan t doesn’t only repeat every $2\pi$
radians: it actually repeats every $\pi$ radians, as is clear from the graph. Thus,

$$ \tan t = \tan(t + n\pi). $$

We have found the value of the trig functions for any angle, using just the values
in the interval 0 to $\pi/2$. The trig functions are periodic (that is, they repeat
themselves) and the period (the length of one repeating cycle) is $2\pi$ for sin and
cos, and $\pi$ for tan.
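The periodicity relations can be checked numerically. This sketch uses an arbitrary test angle t0 of our choosing and the built-in Chop to discard rounding error; the variable names are ours.

```mathematica
(* An arbitrary test angle, in radians *)
t0 = 0.7;

(* Differences that should all be zero, up to rounding error *)
sinCheck = Table[Chop[Sin[t0 + 2 n Pi] - Sin[t0], 10^-8], {n, -3, 3}]
cosCheck = Table[Chop[Cos[t0 + 2 n Pi] - Cos[t0], 10^-8], {n, -3, 3}]
tanCheck = Table[Chop[Tan[t0 + n Pi] - Tan[t0], 10^-8], {n, -3, 3}]
```

Try replacing n Pi by 2 n Pi + 1 in one of the tables to see what a failed check looks like.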

Experiment 4: Amplitude, frequency and phase

Preparatory reading
The periodicity of sine and cosine functions makes them useful for describing
periodic behaviours, such as waves and vibrations. Such applications usually
require that the trig functions be transformed, and there are really four kinds of
transformations we can do. Let’s start with a sine function:

we can multiply it by a number, a: $a \sin t$, e.g. $5 \sin t$

multiply the argument by a number, w: $\sin wt$, e.g. $\sin 2t$

or add a number to the argument: $\sin(t + p)$, e.g. $\sin(t + \pi/3)$

or just add a number to the result: $c + \sin t$, e.g. $2 + \sin t$


In the most general case we need to look at the function: $c + a \sin(wt + p)$.

Post-experiment reading
The parameter c moves the curve “in the y-direction” (vertical translation).

The parameter a determines the range of the curve and is called the amplitude.

The parameter w gives the number of cycles that occur as t changes by $2\pi$; this
quantity is called the angular frequency (often written using the Greek letter $\omega$,
“omega”).

The parameter p moves the curve through a horizontal translation and is known as
the phase. Note that a positive phase has the effect of moving the curve in the
negative t-direction (it shifts to the left).

The period, T, of our transformed sine function is $T = 2\pi/w$. The frequency of the
oscillation is the number of cycles during 1 unit of the variable t, that is, $1/T$ or
$w/2\pi$. For sound or light waves, and other time-dependent oscillations, frequency
is normally measured by the number of cycles per second; the SI unit for this is
the hertz (Hz).
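Here is a small sketch that puts the four parameters together. The numerical values, and the names wave, period and frequency, are our own choices for illustration.

```mathematica
(* Illustrative parameter values, chosen by us *)
c = 2; a = 5; w = 3; p = Pi/3;

wave[t_] := c + a Sin[w t + p]

period = 2 Pi/w          (* T = 2 Pi/w *)
frequency = 1/period     (* w/(2 Pi) cycles per unit of t *)

(* The plain sine curve and the transformed one, over two periods of the latter *)
Plot[{Sin[t], wave[t]}, {t, 0, 2 period}]

(* The transformed curve takes the same value one period later *)
Chop[wave[1. + period] - wave[1.], 10^-8]
```

Changing one parameter at a time and replotting is a quick way to see which feature of the curve each one controls.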

Experiment 5: Reciprocal functions


Preparatory reading
Recall the triangle labelled for our original definition of the trigonometric
functions:

[Figure: right-angled triangle with sides labelled Opposite and Adjacent]

There are three other ratios we can define, often known as the reciprocal
trigonometric functions. (“Reciprocal” means “one over”).

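Secant: $\dfrac{\text{hypotenuse}}{\text{adjacent}} = \dfrac{1}{\cos t} = \sec t$,

Cosecant: $\dfrac{\text{hypotenuse}}{\text{opposite}} = \dfrac{1}{\sin t} = \operatorname{cosec} t$,

Cotangent: $\dfrac{\text{adjacent}}{\text{opposite}} = \dfrac{1}{\tan t} = \cot t$.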
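Mathematica has these three functions built in, under the names Sec, Csc (not "Cosec") and Cot. The sketch below checks each against its definition at an arbitrary test angle of our choosing, then plots all three.

```mathematica
t0 = 0.9;   (* an arbitrary test angle in radians *)

secOK = Chop[Sec[t0] - 1/Cos[t0]] == 0
cscOK = Chop[Csc[t0] - 1/Sin[t0]] == 0
cotOK = Chop[Cot[t0] - 1/Tan[t0]] == 0

(* Graphs of the three reciprocal functions between 0 and 2 Pi *)
Plot[{Sec[t], Csc[t], Cot[t]}, {t, 0, 2 Pi}, PlotRange -> {-5, 5}]
```

The PlotRange setting matters here: without it the very large values near the breaks would squash the rest of the picture.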


Remember that the P1


negative values across

2 ) Copy the shapes of the


functions also have a p

Post-experiment reading
Secant and cosecant have ranges between $-\infty$ and $+\infty$, excluding the interval
(-1, 1); cotangent takes every real value.

Experiment 6: Trigonometric identities


Preparatory reading
Because the trigonometric functions arise so frequently you will quite often
encounter expressions, including quite complex expressions, involving
trigonometric functions. We often need to manipulate these expressions, to
simplify them or get them into a more useful form. To do so we can call on some
important identities.

An identity is a statement to the effect that two expressions are exactly the
same, irrespective of the values of the variables involved. For example:

$$ (x + 2)^2 = x^2 + 4x + 4 $$

This is always true, for any value of x. It is not an equation for a particular value
of x. To emphasise this difference we use the special identity sign $\equiv$.

Plot cos2t and sin


Show [

Plotstyle->RO
Plot[(Cos[t])^2,

Post-experiment reading
You should know the following identities by heart:

$$ \cos^2 t + \sin^2 t \equiv 1; $$
$$ \cos(2t) \equiv \cos^2 t - \sin^2 t \equiv 2\cos^2 t - 1 \equiv 1 - 2\sin^2 t; $$
$$ \sin(2t) \equiv 2 \sin t \cos t. $$
Remember that identities are true for all values of t. The functions on each side of
the “$\equiv$” sign are the same.
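These identities can be confirmed symbolically. The sketch below uses the built-in functions Simplify and TrigExpand; the choice of checks is ours.

```mathematica
Simplify[Cos[t]^2 + Sin[t]^2]     (* the Pythagorean identity *)
TrigExpand[Cos[2 t]]              (* double-angle formula for cosine *)
TrigExpand[Sin[2 t]]              (* double-angle formula for sine *)

(* Each difference below should simplify to zero for every t *)
Simplify[Cos[2 t] - (2 Cos[t]^2 - 1)]
Simplify[Cos[2 t] - (1 - 2 Sin[t]^2)]
```

Asking Mathematica to simplify the difference of the two sides is a general-purpose way of testing a suspected identity.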

Experiment 7: Small angles


Preparatory reading
We sometimes try to replace functions with simpler ones, most typically by
polynomial functions, that is functions of powers of a variable.

There are polynomial approximations to the trigonometric functions which give
particularly good approximations near 0.

Post-experiment reading
If $|t| \ll 1$ (with t measured in radians), then

sin t is approximately equal to t.

tan t is approximately equal to t.

cos t is approximately equal to $1 - \dfrac{t^2}{2}$.

To understand why these work, find out about the representation of sine and
cosine by power series. This is covered in the Series 2 module.
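As a preview, Mathematica's built-in Series function produces the leading terms of these power series, and a short table shows how small the errors of the approximations are. The test values of t and the name errors are our choices.

```mathematica
(* Leading terms of the power series about t = 0 *)
Series[Sin[t], {t, 0, 3}]
Series[Cos[t], {t, 0, 3}]
Series[Tan[t], {t, 0, 3}]

(* Errors of the small-angle approximations at a few small values of t *)
errors = Table[{t, Sin[t] - t, Tan[t] - t, Cos[t] - (1 - t^2/2)},
   {t, {0.2, 0.1, 0.05}}]
```

Watch how quickly the errors shrink as t is halved: that rate is a clue to the next term in each series.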
Series 1
[Title-page diagram for this module]

Experiment 1: Making sequences


Preparatory reading
A sequence is an ordered list of numbers. Sequences are usually generated by a
mathematical rule (a function) which allows us to work out the terms. The
sequence then is the output when that rule is applied to an ordered set of integers.
For example:

$$ f(n) = 2n \;\to\; 2, 4, 6, 8, 10, \dots $$

$$ f(n) = 3^n \;\to\; 3, 9, 27, 81, 243, \dots $$

where “...” means “and so on for ever”. In each case, we could find the 100th
term, or the 2000th, or the nth, for any n, because we know the generating
function which defines the general, nth term. Thus the two examples above could
be written:

$$ 2, 4, 6, 8, 10, \dots, 2n, \dots $$

$$ 3, 9, 27, 81, 243, \dots, 3^n, \dots $$
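In Mathematica a generating rule becomes a function definition, and the built-in Table command lists the terms. The names f, g, evens and powers in this sketch are ours.

```mathematica
f[n_] := 2 n
g[n_] := 3^n

evens = Table[f[n], {n, 1, 5}]     (* first five terms of 2n *)
powers = Table[g[n], {n, 1, 5}]    (* first five terms of 3^n *)

f[100]      (* the 100th term of the first sequence *)
g[2000]     (* Mathematica copes with very large exact integers *)
```

Because the rule gives the nth term directly, finding the 2000th term is no harder than finding the 5th.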

Post-experiment reading
There are basically three things that happen to sequences as n gets larger:

They can get closer and closer to some value and are said to converge to that
value,

OR

They get bigger (positive or negative) with no maximum value as n increases
and are said to diverge to (plus or minus) infinity,

OR

They keep “visiting” a set of different values, or a range of values without
converging. Such sequences are also said to diverge.

In the mathematical language of limits, we seek the behaviour of the sequence as
n tends to infinity. In traditional notation we would write, for example:

$$ \lim_{n \to \infty} \frac{1}{n} = 0, $$

which is read: “the limit of one over n, as n tends to infinity, is zero”. The
sequence generated by $1/n$ is, therefore, a convergent sequence, which converges to
zero.

There are methods for determining the limiting behaviour of sequences: you will
study them later.
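In the meantime, Mathematica's built-in Limit command will evaluate many such limits. The examples in this sketch are our own, one for each kind of behaviour described above.

```mathematica
convergent = Limit[1/n, n -> Infinity]                 (* converges to 0 *)
another = Limit[(2 n + 1)/(n + 3), n -> Infinity]      (* converges to 2 *)
divergent = Limit[n^2, n -> Infinity]                  (* diverges to infinity *)

(* (-1)^n keeps visiting the values -1 and 1, so it does not converge *)
visiting = Table[(-1)^n, {n, 1, 8}]
```

For the third kind of behaviour there is no limit to find, which is why that case is shown as a list of terms.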

Experiment 2: Progressing
Preparatory reading
Arithmetic progressions
An arithmetic sequence or arithmetic progression (AP) is a sequence in which
each term differs from the one before by a constant. This constant is called the
common difference. For example the sequence

10, 12, 14, 16, ...

is an AP with first term 10 and a common difference of 2.

In general, call the first term a and the common difference d. The general
arithmetic sequence is:

$$ a,\; a + d,\; a + 2d,\; a + 3d,\; \dots,\; a + (n - 1)d,\; \dots $$

The nth term, the general term we need, is there. Can you see why its multiplier
is $(n - 1)$?

Geometric progressions
A geometric sequence or geometric progression (GP) is one in which each term
is a constant multiplied by the previous one. This constant is called the common
ratio. For example

2, 4, 8, 16, ...

is a GP with first term 2 and a common ratio of 2.

In general, if the first term of the sequence is a, and the common ratio is r, then
the sequence is:

$$ a,\; ar,\; ar^2,\; ar^3,\; \dots,\; ar^{n-1},\; \dots $$


The nth term is $ar^{n-1}$.
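The two general terms translate directly into Mathematica. In this sketch ap and gp are our own function names, and the examples reproduce the two sequences quoted above.

```mathematica
ap[a_, d_, n_] := a + (n - 1) d     (* nth term of an AP *)
gp[a_, r_, n_] := a r^(n - 1)       (* nth term of a GP *)

apTerms = Table[ap[10, 2, n], {n, 1, 4}]
gpTerms = Table[gp[2, 2, n], {n, 1, 4}]
```

Setting n = 1 in either definition gives back the first term a, which is one way to remember why the formulas involve n - 1 and not n.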

Experiment 3: Summing APs


Preparatory reading
Series notation
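A series is the sum of a sequence. The normal notation for summation is $\Sigma$ (a
capital Greek “sigma”). So, instead of writing the sum of an AP like this:

$$ 2 + 5 + 8 + 11 + 14, $$

we may use the general term to write:

$$ \sum_{n=1}^{5} (3n - 1). $$

Similarly, the infinite sum of this GP:

$$ 8 + 4 + 2 + 1 + 0.5 + 0.25 + \dots $$

may be written:

$$ \sum_{n=1}^{\infty} 8\left(\tfrac{1}{2}\right)^{n-1} \quad\text{or}\quad \sum_{n} 8\left(\tfrac{1}{2}\right)^{n-1} \quad\text{or even}\quad \sum 8\left(\tfrac{1}{2}\right)^{n-1}. $$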

The convention for sums is that the count is assumed to be up to infinity if the
upper limit is omitted, and to start at 1 if the lower limit is omitted. You can even
leave out the counter variable ( n ) if this is obvious from the expression. In all
cases the counter is assumed to be going up in 1’s.

Post-experiment reading
Summing an arithmetic sequence
“Add up all the numbers between 1 and 1000.’’

This problem was set for the great mathematician Gauss when he was in primary
school, probably to keep him quiet for a few days. His solution, which you
re-derived in the experiment, can be easily generalised.

Let’s call the sum of the first n terms of an arithmetic sequence $S_n$:

$$ S_n = a + (a + d) + (a + 2d) + \dots + (a + (n - 2)d) + (a + (n - 1)d) $$

and, writing it backwards:

$$ S_n = (a + (n - 1)d) + (a + (n - 2)d) + \dots + (a + 2d) + (a + d) + a $$

Add them. Each of the n pairs of corresponding terms sums to $2a + (n - 1)d$, so:

$$ 2S_n = n\,(a + a + (n - 1)d) $$

$$ \Rightarrow\; S_n = \frac{n}{2}\,(2a + (n - 1)d). $$

A useful alternative way of writing this is:

$S_n$ = (number of terms) × (average of first and last terms).
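The formula is easily checked against direct summation. In this sketch apSum, gauss and direct are our own names; Sum is the built-in summation command.

```mathematica
(* Sum of the first n terms of an AP, from the formula above *)
apSum[a_, d_, n_] := n/2 (2 a + (n - 1) d)

(* Gauss's problem: a = 1, d = 1, n = 1000 *)
gauss = apSum[1, 1, 1000]

(* Direct summation for comparison *)
direct = Sum[k, {k, 1, 1000}]

(* The earlier example 2 + 5 + 8 + 11 + 14 *)
small = apSum[2, 3, 5]
```

Both routes to Gauss's sum give 500500, and the five-term example gives 40.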



Experiment 4: Summing GPs

Post-experiment reading
The general geometric series summed to n terms is:

$$ S_n = a + ar + ar^2 + \dots + ar^{n-2} + ar^{n-1} $$

(multiply by r)

$$ rS_n = ar + ar^2 + \dots + ar^{n-2} + ar^{n-1} + ar^n. $$

Now subtract and most of the terms disappear:

$$ S_n - rS_n = a - ar^n $$

$$ \therefore\; S_n = \frac{a(1 - r^n)}{1 - r} \quad\text{or}\quad S_n = \frac{a(r^n - 1)}{r - 1}. $$

The first formulation is more convenient when r < 1, the second when r > 1. The
case when r = 1 is not covered. Can you see from the derivation why not? What
is the right formula for Sn in this case?
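A quick check of the formula against direct summation, using the GP 8, 4, 2, 1, ... as the example. The names gpSum, formula and direct in this sketch are ours.

```mathematica
(* Sum of the first n terms of a GP, valid when r is not 1 *)
gpSum[a_, r_, n_] := a (1 - r^n)/(1 - r)

(* 8 + 4 + 2 + 1 + 1/2 + 1/4 *)
formula = gpSum[8, 1/2, 6]
direct = Sum[8 (1/2)^(k - 1), {k, 1, 6}]
```

Both give 63/4. Deliberately, the sketch does not try gpSum with r = 1: think about what would go wrong before you try it yourself.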

Experiment 5: Summing to infinity


Preparatory reading
Summing to infinity
We can allow the pattern of terms to continue forever but the sort of
manipulations we use to sum APs and GPs cannot be used for series with an
infinite number of terms.

In general, if we have any infinite sequence (AP or GP or something else) we can
form the partial sums up to each term. If we call the sum to n terms $S_n$ then we
can build a new sequence:

$$ S_1,\; S_2,\; S_3,\; \dots,\; S_n,\; \dots $$

This new sequence will either converge or diverge. If it converges then its limit is
the sum to infinity of the original series.

You can probably see that for arithmetic sequences, and those geometric
sequences with $|r| \ge 1$, the terms do not get smaller in size (and usually get larger,
maybe large and negative) as n increases and hence that the sum of terms will not
settle down as we take more and more terms of the series. These series diverge.

For a geometric sequence with $|r| < 1$, the absolute value of each term will be
smaller than that of the previous one, but what can we say about the sum?

If $|r| < 1$, then $r^n \to 0$ as $n \to \infty$, so:

$$ S_n = \frac{a(1 - r^n)}{1 - r}, $$
$$ S_\infty = \lim_{n \to \infty} S_n = \frac{a}{1 - r} \quad (\text{for } |r| < 1). $$
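For the GP 8 + 4 + 2 + 1 + ... we have a = 8 and r = 1/2, so the formula predicts a sum to infinity of 16. The sketch below watches the partial sums approach that value; the names partial and total are ours.

```mathematica
(* Partial sums of 8 + 4 + 2 + 1 + ... for n = 5, 10 and 20 *)
partial = Table[N[Sum[8 (1/2)^(k - 1), {k, 1, n}]], {n, {5, 10, 20}}]

(* Mathematica can also take the sum to infinity directly *)
total = Sum[8 (1/2)^(k - 1), {k, 1, Infinity}]
```

The partial sums creep up towards 16 without ever exceeding it, which is exactly what convergence of the sequence of partial sums means.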

A note of caution
Although it’s clear that a series in which the terms are getting bigger can’t
converge to a limit, it is not enough for the terms in a series to be getting smaller
for its sum to infinity to exist.
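The standard example (our choice; it is not named in the text above) is the harmonic series 1 + 1/2 + 1/3 + ..., whose terms shrink to zero while its partial sums grow without limit. A minimal sketch, with harmonic as our own function name:

```mathematica
(* Terms 1/k shrink to zero, yet the partial sums keep growing *)
harmonic[n_] := Sum[1./k, {k, 1, n}]

growth = Table[{n, harmonic[n]}, {n, {10, 100, 1000, 10000}}]
```

Each tenfold increase in n adds roughly the same amount to the sum: slow growth, but growth that never stops.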

Experiment 6: Recurrence sequences


Preparatory reading
Recurrence sequences are defined by deriving each term from one or more of the
previous ones. An example is the Fibonacci sequence given by:

$$ u_n = u_{n-1} + u_{n-2}. $$

Fibonacci derived this sequence to describe the population growth of rabbits, taking
$u_n$ to be the number of pairs of rabbits in the nth generation.

We need to assign values to the first two terms, $u_1$ and $u_2$. Fibonacci started with
one pair of rabbits, and assumed that it took them two seasons to breed, so we
have $u_1 = u_2 = 1$.
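A recurrence like this translates almost word for word into Mathematica. In the sketch below the name u follows the text and rabbits is our own; Fibonacci is a built-in function.

```mathematica
(* Starting values *)
u[1] = 1;
u[2] = 1;

(* Each later term is the sum of the previous two; writing u[n] = ... inside
   the definition makes Mathematica remember values it has already found *)
u[n_] := u[n] = u[n - 1] + u[n - 2]

rabbits = Table[u[n], {n, 1, 10}]

(* The built-in Fibonacci function uses the same starting values *)
Table[Fibonacci[n], {n, 1, 10}]
```

Without the remembering trick the definition still works, but it recomputes earlier terms over and over and becomes very slow for large n.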

The second example in this experiment is based on the nonlinear recurrence
relation:

$$ x_n = c - x_{n-1}^2 $$

where c is a number. This is nonlinear because it has a squared term in it; this
apparently simple fact implies some absolutely remarkable mathematical
properties.

Post-experiment reading
The sequence generated by $x_n = c - x_{n-1}^2$ exhibits remarkable behaviour. For
some values of c the sequence converges (whatever the starting value, $x_1$) but for
other values it will visit two, four, or more different values, getting closer each
visit in a “convergent” sort of way. For yet other values of c even stranger
behaviour can be observed, and this sequence demonstrates the essential
characteristic of chaotic systems, namely that very small changes in the values of
c can have disproportionately large impact on the limiting behaviour of the
sequence and the value(s) it visits.
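The built-in NestList command applies a rule repeatedly, which is just what a recurrence needs. In this sketch the names step and tail, the starting value 0 and the three values of c are our own choices.

```mathematica
(* One step of the recurrence x_n = c - x_(n-1)^2 *)
step[c_][x_] := c - x^2

(* Terms x_1000 to x_1032, starting from x_1 = 0 (our choice) *)
tail[c_] := Drop[NestList[step[c], 0., 1031], 999]

tail[0.5]     (* expected to settle on a single value *)
tail[1.0]     (* expected to alternate between two values *)
tail[1.9]     (* try this one and look for a pattern *)
```

Discarding the first 999 terms means we only see the long-term behaviour, not the approach to it.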

The diagram on the title page of this module shows the results of plotting $x_{1000}$ to
$x_{1032}$ for c between 0.5 and 2.

If x and c are allowed to be complex (see Complex Numbers in Chapter 4), even
richer patterns of behaviour occur. The well-known Mandelbrot Set stems from
analysis of these patterns.
Differentiation 1

Experiment 1: Gradients of straight lines


Preparatory reading
The gradient of a straight line is a measure of its steepness. It’s equal to the
number of units the line rises for each unit step to the right. The following graph,
then, has gradient 2: it goes up 2 units for every 1 unit along.

If the graph falls instead of rises, then the gradient is negative: here’s a line with
gradient -3.

To calculate the gradient of a straight line, fix two points on it, $(x_0, y_0)$ and
$(x_1, y_1)$. The line’s gradient is then

$$ \frac{y_1 - y_0}{x_1 - x_0}, $$

or, more simply,

$$ \frac{\text{change in } y}{\text{change in } x}. $$

You’ll have spotted, probably, that there’s a correspondence between a function’s
equation and its graph. The function with equation $y = 2x - 4$ has a graph with
gradient 2, and the function with equation $y = 6 - 3x$ has a graph with gradient -3.

In general, the function with equation $y = mx + c$ has a graph with gradient m.

Experiment 2: Chords and tangents


Preparatory reading
It's often important for us, as users of mathematics, to know about gradients of
graphs. If the horizontal axis represents time, for instance, then the gradient of
the graph represents how rapidly the quantity is changing in time (velocity =
gradient of distance-time graph, acceleration = gradient of velocity-time graph,
inflation = gradient of price-time graph, etc.)

The problem is, few graphs we study are straight lines. We need, then, some idea
of gradient for curves.

Suppose we're interested in the gradient of a curve at a given point. Now: we
already know about gradients of straight lines. So we can draw a straight line
through our point, exactly as steep as the curve...

[Figure: a curve with a straight line touching it at the given point]

...and then ask: what's the gradient of this line?


This line, which just touches the curve, is called a tangent.

The bad news is that, in general, it's hard to calculate gradients of tangents. Far
harder, for example, than it is to calculate the gradient of a line that crosses the
curve at two given points.

Lines like this are called chords or secants. Because they pass through two
points whose coordinates we know, we can use the formula

$$ \frac{\text{change in } y}{\text{change in } x} $$

to find their gradients.

BIG IDEA: The closer together the two points, the more the chord looks like a
tangent...
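The big idea can be tried numerically without any special plotting commands. In this sketch the function $x^2$, the point x = 2, the step sizes and the names chordGradient and gradients are all our own choices.

```mathematica
f[x_] := x^2

(* Gradient of the chord joining (x0, f(x0)) to (x0 + h, f(x0 + h)) *)
chordGradient[x0_, h_] := (f[x0 + h] - f[x0])/h

(* Move the second point closer and closer to x0 = 2 *)
gradients = Table[{h, chordGradient[2, h]}, {h, {1, 0.5, 0.1, 0.01, 0.001}}]
```

Look down the second column of the table: the chord gradients appear to be homing in on a single number.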

The first part of the e
produce this partied
Plotchord[x^2, {x,
PlotRange->{-5, 25}]
then "shift-return" i

1) Generate a dia

Generate a fresh diagram.


the second point a litt
of the chord. Repeat,
Note down the gradient ea

3) Write a short report s


at the point x = 2. Exp
results of your experiment

4) Use the same technique to


x = 1, x = 3 and x =

5) What do your findings

Note: this section uses


try going back to the Instru

Experiment 3: Limits and derivatives


Preparatory reading
We can express the ideas you met in Experiment 2 like this:

the gradient of the tangent at the point P is the limit, as Q approaches P , of the
gradient of the chord PQ.

A limit is a number that we can get as close to as we like, even though we may
not be able to quite reach it.

Let’s think about the equation $y = x^2$. If P has coordinates $(2, 4)$, and Q has
coordinates $(2 + h, (2 + h)^2)$, then the gradient of the chord PQ is given by

$$ \frac{\text{change in } y}{\text{change in } x} = \frac{(2 + h)^2 - 4}{h}. $$

As h gets progressively smaller (we say it tends to zero) Q approaches P and the
gradient of the chord PQ gets closer to that of the tangent at P. The gradient of
this tangent, then, is the limit as h tends to zero of the above quantity. We write
this as

$$ \lim_{h \to 0} \frac{(2 + h)^2 - 4}{h}. $$

Notice, though, that

$$ (2 + h)^2 - 4 = 4 + 4h + h^2 - 4 = 4h + h^2, $$

and therefore that

$$ \lim_{h \to 0} \frac{(2 + h)^2 - 4}{h} = \lim_{h \to 0} \frac{4h + h^2}{h} = \lim_{h \to 0} (4 + h) = 4. $$

This shows us, then, that the gradient of the tangent at x = 2 is 4. Does this fit in
with what you found?
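Mathematica can carry out the same two steps, the cancellation and the limit. This is a small sketch using the built-in Simplify and Limit; the names chord, simplified and tangent are ours.

```mathematica
(* Gradient of the chord PQ for y = x^2 with P at x = 2 *)
chord = ((2 + h)^2 - 4)/h;

simplified = Simplify[chord]      (* the h cancels, leaving 4 + h *)
tangent = Limit[chord, h -> 0]    (* the limit as Q approaches P *)
```

Note that simply substituting h = 0 into chord would mean dividing zero by zero, which is why a limit is needed.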

More generally, the gradient of the graph $y = f(x)$ at the point $(x, f(x))$ is given by

$$ \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}. $$

This is called the derivative of f, and is written $f'(x)$ or sometimes just $f'$.

Experiment 4: Differentiation
Preparatory reading
In the preparatory reading for Experiment 3, we saw a mathematical argument
proving that the gradient of the graph $y = x^2$ at the point $(2, 4)$ was 4. We can use
a very similar argument to prove that the gradient of the graph $y = x^2$ at the point
$(x, x^2)$ is $2x$, whatever the value of x may be:

$$ \lim_{h \to 0} \frac{(x + h)^2 - x^2}{h} = \lim_{h \to 0} \frac{2xh + h^2}{h} = \lim_{h \to 0} (2x + h) = 2x. $$

The derivative of $x^2$, then, is $2x$. We write

if $f(x) = x^2$ then $f'(x) = 2x$

or

if $y = x^2$ then $\dfrac{dy}{dx} = 2x$.

Finding a function's derivative is called differentiating the function.

We could repeat these kinds of calculation every time we need to differentiate
(what we call differentiation from first principles), but that would be
time-consuming. Alternatively, we could rely on Mathematica to do the job for us
every time; that, though, would be a kind of cheating.

It would be ideal to have sets of rules that allow us to differentiate relatively
quickly without relying on technology and without getting bogged down in
calculations involving limits. In this experiment, you'll use Mathematica to
explore some rules of that type.

We've already met one Mathematica approach to differentiation: the f'[x]
idea you saw in Experiment 3. Another uses something called the D
operator. Try the following:
D[x^2, x]

1) Using either the D operator or the f'[x] technique, find the derivatives of


x 3 , J? and x. What do you notice? Explore further.

2) Find the derivatives of $\sqrt{x}$, $1/x$, and $1/x^2$. Do these fit in with your findings
from part 1?

3) Write a short report summarising your fi


of the form $f(x) = x^n$.
4) Try the following:

Post-experiment reading
The observations

if $f(x) = x^2$ then $f'(x) = 2x$,

if $f(x) = x^3$ then $f'(x) = 3x^2$,

if $f(x) = x^4$ then $f'(x) = 4x^3$,

etc. can be summed up in the single rule

if $f(x) = x^n$ then $f'(x) = n x^{n-1}$.

This rule isn't hard to prove from first principles. It applies to any function of the
form

$$ f(x) = x^n, $$

including cases where n is fractional or negative. Thus, for example, suppose
$f(x) = \sqrt{x}$. Then

$$ f(x) = x^{1/2} $$

and thus

$$ f'(x) = \tfrac{1}{2} x^{-1/2}. $$

You'll notice, too, that the derivative of, say, $5x^3$ is $5 \times 3x^2$ (or $15x^2$), and that the
derivative of, say, $5x^3 + 2x^2$ is $5 \times 3x^2 + 2 \times 2x$ (or $15x^2 + 4x$). This works quite
generally: the derivative of $a f(x) + b g(x)$ is

$$ a f'(x) + b g'(x). $$
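Each of these rules can be confirmed with the D operator. The examples in this sketch are taken from the text above; leaving n, a, b, f and g unspecified asks Mathematica for the general rule.

```mathematica
D[x^n, x]                  (* the general power rule *)
D[Sqrt[x], x]              (* the case n = 1/2 *)
D[1/x^2, x]                (* the case n = -2 *)
D[5 x^3 + 2 x^2, x]        (* constant multiples and sums *)
D[a f[x] + b g[x], x]      (* the same rule for unspecified f and g *)
```

Mathematica may arrange its answers differently from the way you would write them by hand (for instance with the lowest power first), but they should agree with the rules above.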

Experiment 5: Trigonometric functions



Post-experiment reading
It isn't hard to show, from first principles, that the derivative of sin x is, in fact,
cos x. You need to know two facts:

that, for small values of h, $\sin h$ is very close to $h$ and $\cos h$ is very close to
$1 - h^2/2$;

that $\sin(A + B) = \sin A \cos B + \cos A \sin B$.


The argument then goes like this. Suppose $f(x) = \sin x$. Then

$$
\begin{aligned}
f'(x) &= \lim_{h \to 0} \frac{\sin(x + h) - \sin x}{h} \\
      &= \lim_{h \to 0} \frac{\sin x \cos h + \cos x \sin h - \sin x}{h} \\
      &= \lim_{h \to 0} \frac{\sin x \left(1 - \frac{h^2}{2}\right) + h \cos x - \sin x}{h} \\
      &= \lim_{h \to 0} \left(\cos x - \frac{h}{2} \sin x\right) \\
      &= \cos x.
\end{aligned}
$$

A similar method can be used to show that the derivative of cos x is -sin x. Note,
though, that all this only works if you measure all angles in radians.
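The results, and the warning about radians, can both be seen in Mathematica. In this sketch the names are ours; Degree is the built-in constant Pi/180.

```mathematica
dSin = D[Sin[x], x]
dCos = D[Cos[x], x]

(* The limit from first principles *)
fromLimit = Limit[(Sin[x + h] - Sin[x])/h, h -> 0]

(* If x is an angle in degrees, an extra factor of Pi/180 appears *)
inDegrees = D[Sin[x Degree], x]
```

The last line shows why radians are preferred: only in radians is the derivative of sine exactly cosine, with no awkward constant in front.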

Experiment 6: Exponential functions and e


Preparatory reading
Exponential functions are functions of the form $y = a^x$, such as $y = 2^x$. Note that
these functions are not the same as things like $y = x^2$ or $y = x^3$: we can’t apply the
same rules when we differentiate them. The general shape of the graph of $y = a^x$
is more or less the same for all values of a greater than 1: they all look roughly like this:

The graph begins very flat, so for x negative the gradient is close to zero. Then, as
x gets larger, the graph gets very quickly steeper: the gradient rises rapidly. If we
were to plot a graph of gradient against x, then, we’d get something that begins
close to zero, then rises very rapidly: something very like the graph of the
function itself!

This suggests that the derivative of an exponential function might be another
exponential function, or something very like one.

Post-experiment reading
The results of this experiment strongly suggest that there is a close relationship
between exponential functions and their derivatives. This is indeed the case, as
the following argument makes clear. If $f(x) = a^x$, then

$$ f'(x) = \lim_{h \to 0} \frac{a^{x+h} - a^x}{h} = a^x \lim_{h \to 0} \frac{a^h - 1}{h} = L\,a^x, \quad\text{where } L = \lim_{h \to 0} \frac{a^h - 1}{h}. $$

This means that the derivative of $a^x$ is just some multiple of $a^x$ itself. If we
choose the value of a carefully, we can make L equal to 1. In this case, the
derivative of $a^x$ will simply be $a^x$. The value of a for which L = 1 is known as e,
and is about 2.7182818284590452354.

The function $f(x) = e^x$ is so important that it is known as the exponential
function, sometimes also written exp(x). Remember the derivative of $e^x$ is $e^x$.
Check using Mathematica if you like!
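One way to make that check is to estimate L for a few values of a using a small value of h. In this sketch estimateL and estimates are our own names, and h = 0.000001 is an arbitrary small step.

```mathematica
(* Numerical estimate of L = lim (a^h - 1)/h, using a small h *)
estimateL[a_] := (a^h - 1)/h /. h -> 10.^-6

estimates = {estimateL[2], estimateL[3], estimateL[E]}

(* The derivatives themselves *)
D[a^x, x]
D[E^x, x]
```

The estimate for a = 2 is below 1 and the one for a = 3 is above 1, so the special value e must lie somewhere between 2 and 3.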

Experiment 7: Derivatives of inverses


Preparatory reading
Reminder: the inverse of a function is that function in reverse: if $f^{-1}$ is the
inverse of $f$ then

$$ f(a) = b \iff f^{-1}(b) = a. $$

The inverse of the function $f(x) = a^x$ is known as the logarithmic function
$f^{-1}(x) = \log_a x$.

The inverse of the exponential function $f(x) = e^x$ is known as the natural
logarithm, and is written $f^{-1}(x) = \ln x$, or sometimes $f^{-1}(x) = \log x$, though the
latter notation may be more familiar to you as standing for $\log_{10} x$. In
Mathematica, the natural logarithm is written Log[x] (and the base-10 logarithm
as Log[10, x]).

Summary

Function          Derivative

$x^n$             $n x^{n-1}$

$\sin x$          $\cos x$

$\cos x$          $-\sin x$

$e^x$             $e^x$

$\ln x$           $\dfrac{1}{x}$

$\arcsin x$       $\dfrac{1}{\sqrt{1 - x^2}}$

$\arctan x$       $\dfrac{1}{1 + x^2}$
Differentiation 2

Experiment 1: Products and quotients


Preparatory reading
If you've performed the experiments in the Differentiation 1 module, you'll
already know the importance of being able to find gradients of curved graphs, and
you'll already know how to do this (how to differentiate) in the case of a
fairly wide range of simple functions.

We now seek techniques that allow us to differentiate things like

$$ y = \frac{x^5 \sin x + e^{\cos x}}{2x + x^3}: $$

more complicated functions built from the simple ones we can already handle.

In this experiment, you are asked to seek techniques for differentiating products
and quotients. Thus, if we know how to differentiate $f(x)$ and $g(x)$, then we can
also differentiate both $f(x)\,g(x)$ and $f(x)/g(x)$.

Post-experiment reading
We can sum up the rules for differentiating products and quotients in the form of
two formulae. The first, the product rule, is this:

$$ \frac{d}{dx}(uv) = v \frac{du}{dx} + u \frac{dv}{dx}, $$

or

$$ [uv]' = u'v + uv'. $$

The second, the quotient rule, can be stated like this:

This corresponds to the way Mathematica expresses the formula, but it is not the
usual or preferred way of writing it. The quotient rule is more often expressed
(and more easily remembered) as

$$ \frac{d}{dx}\left(\frac{u}{v}\right) = \frac{v \dfrac{du}{dx} - u \dfrac{dv}{dx}}{v^2} $$

or

$$ \left[\frac{u}{v}\right]' = \frac{u'v - uv'}{v^2}. $$
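Both rules can be recovered from Mathematica by differentiating a product and a quotient of unspecified functions u and v. In this sketch the names product and quotient are ours; Together is a built-in command that combines terms over a common denominator.

```mathematica
product = D[u[x] v[x], x]         (* the product rule *)
quotient = D[u[x]/v[x], x]        (* the quotient rule, as Mathematica writes it *)
Together[quotient]                (* combined over a common denominator *)

(* A concrete check of the product rule *)
example = D[x^5 Sin[x], x]
```

Applying Together to the quotient should bring Mathematica's form of the rule into the more familiar single-fraction form.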

Experiment 2: Composite functions


Preparatory reading
A composite function is one made from two or more simpler functions strung
together. For example:

$$ y = e^{\cos x} $$

cosine → exponential function


or

$$ y = \cos(e^x) $$

exponential function → cosine


In this experiment, we seek a rule which enables us to differentiate a composite
function when we know how to differentiate the two simple functions which make
it up.

Post-experiment reading
The rule you have just discovered is called the chain rule. It can be written like
this:

if the derivative of u(x) is u‘(x)and the derivative of v(x) is v’(x)then the


derivative of u[v(x)] is v’(x)u‘ [v(x)];

or, perhaps more simply, like this:


$$\frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx}.$$

For example: if y = sin(5 + 11x) then we let u = 5 + 11x. Then y = sin u, which
gives

$$\frac{dy}{du} = \cos u \quad\text{and}\quad \frac{du}{dx} = 11.$$

It follows that

$$\frac{dy}{dx} = 11\cos u = 11\cos(5 + 11x).$$
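
The chain rule result for sin(5 + 11x) can be checked against a direct numerical estimate of the gradient. The following is a small Python sketch (not the book's Mathematica code); the sample point x = 0.4 and the helper `derivative` are assumptions made for illustration.

```python
import math

def derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def y(x):
    return math.sin(5 + 11 * x)

def dy_dx(x):
    # Chain rule: dy/du = cos u with u = 5 + 11x, and du/dx = 11.
    return 11 * math.cos(5 + 11 * x)

x = 0.4  # arbitrary sample point
chain_rule_value = dy_dx(x)
numeric_value = derivative(y, x)
print(chain_rule_value, numeric_value)
```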

Questions
We’ve included a feature which allows you
practice questions and their answers. There
differentiation rules, To generate a question o
GiveQuestion["product rule"]
not forgetting to “shift-return”. To

You can do this as o


and repetitions shou

return”.]

Experiment 3: Max and min


Preparatory reading
A local maximum on a graph is “the top of a hill”: a point higher than all points
close to it.

Note that it isn’t necessarily the highest point attained: merely the highest in its
immediate vicinity.

Similarly a local minimum is “the bottom of a valley”: a point lower than all
points close to it.

A general term meaning “either maximum or minimum” is turning point.

Many real-life problems have to do with maximising and minimising quantities:
an insight into the nature of maxima and minima is a useful thing to have. In this
experiment, we use calculus to seek one.

Post-experiment reading
The stationary points on the graph of y = f(x) are those points where the graph is
locally horizontal: where f'(x) = 0.

All turning points are stationary points, but not all stationary points are turning
points: it's possible for the gradient to be zero at a point which is neither a
maximum nor a minimum. This third type of stationary point is called a point of
stationary (or horizontal) inflexion.

At a stationary point P where the second derivative is positive, the gradient is zero
and rising. This means the gradient must be negative to the left of P, and positive
to the right of P: we have a minimum.

[Figure: a graph near a minimum, annotated “f'(x) zero and rising” and “gradient of f'(x) positive”.]

Similarly, a stationary point where the second derivative is negative is a
maximum.

A stationary point where the second derivative is zero can be either a maximum,
or a minimum, or a point of horizontal inflexion.
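
The second-derivative test described above can be sketched in a few lines. This is a Python illustration, not the book's Mathematica code; the example function f(x) = x^3 - 3x and the function name `classify` are assumptions chosen for the sketch.

```python
def classify(second_derivative_value):
    """Apply the second-derivative test at a stationary point."""
    if second_derivative_value > 0:
        return "minimum"
    if second_derivative_value < 0:
        return "maximum"
    return "inconclusive"  # could be a maximum, a minimum or an inflexion

# Illustrative function (an assumption): f(x) = x^3 - 3x.
def f_prime(x):
    return 3 * x**2 - 3

def f_second(x):
    return 6 * x

for x in (-1, 1):
    assert f_prime(x) == 0  # both points are stationary
    print(x, classify(f_second(x)))

# x^4 has a minimum at 0 and x^3 an inflexion at 0, yet both have a zero
# second derivative there, so the test alone cannot tell them apart.
print(classify(12 * 0**2), classify(6 * 0))
```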

GiveQuestion[

Experiment 4: Implicitly defined functions


Preparatory reading
So far, we've only met graphs of functions of the form y = f(x). However, it's
entirely possible to specify a graph by means of an equation of which y is not the
subject: an example is the circle equation $x^2 + y^2 = 4$.

Equations such as this are said to define graphs implicitly. The circle example
also defines a function y = f(x) implicitly if we restrict its range to be either all
positive or all negative (since every input can have only one output).

Each point on the graph of an implicitly defined function has a gradient, so it must
be possible to make sense of the idea of the derivative of y with respect to x. In
this experiment, you seek a way of doing that.

Post-experiment reading
Implicit equations give us, typically, expressions for dy/dx which involve both x
and y (ordinary explicit equations give us dy/dx as a function of x only). To find
such expressions, we simply differentiate the implicit equation term by term,
remembering that

$$\frac{d}{dx}\,f(y) = f'(y)\,\frac{dy}{dx}$$

(a result which is quite easy to prove from the chain rule).



For example:

$$y^2 = x^2(3 - x) \;\Rightarrow\; 2y\,\frac{dy}{dx} = 6x - 3x^2$$

$$\Rightarrow\; \frac{dy}{dx} = \frac{6x - 3x^2}{2y}.$$
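
As a quick check on this example, we can compare the implicit result with a numerical gradient on the upper branch of the curve. This is a Python sketch (not the book's Mathematica code); the sample point x = 1 and the helper names are assumptions.

```python
import math

def upper_branch(x):
    """Upper half of the curve y^2 = x^2 (3 - x), valid for x <= 3."""
    return math.sqrt(x**2 * (3 - x))

def implicit_gradient(x, y):
    """dy/dx from the implicit result (6x - 3x^2) / (2y)."""
    return (6 * x - 3 * x**2) / (2 * y)

def numeric_gradient(f, x, h=1e-6):
    """Central-difference estimate of the gradient of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

x = 1.0                      # sample point (an arbitrary choice)
y = upper_branch(x)          # here y is the square root of 2
formula_value = implicit_gradient(x, y)
numeric_value = numeric_gradient(upper_branch, x)
print(formula_value, numeric_value)
```

Note that the formula needs both x and y, exactly as the text says: on the lower branch the same x gives the opposite sign of gradient.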
Integration 1

Experiment 1: Areas under curves


Preparatory reading
It’s important for users of mathematics to have a way of calculating the area under
a curve (or, more strictly, the area enclosed by the curve and the x-axis.) On a
velocity-time graph, for instance, this area represents distance or displacement.
There are two problems, however. Firstly, there’s the issue of how we calculate
areas under curves. Secondly, it’s not even clear what, precisely, we mean by the
area of an irregular shape. Definitions like “the amount of space inside” don’t
really get us anywhere.

Both problems can be solved at a stroke. It was first done, somewhat informally,
by Newton and Leibniz in the seventeenth century, and the reasoning was then
made more watertight by Riemann two hundred years or so later. Here’s a
simplified version of the argument.

We do know what we mean by the area of a rectangle: just its length times its
width. We know how to calculate it too. But every irregular shape, including the
area under the curve y = f(x), can be approximated as a collection of rectangles
(the Riemann approximation).

If we know the width of each rectangle, and if we also know all the heights
(which we can get from the equation of the curve), then we can calculate the area
of them all, and thus the approximate area under the curve.

BIG IDEA: The thinner all the rectangles, the better the approximation.
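
The "big idea" is easy to see numerically. Below is a minimal Python sketch (not the book's Mathematica code) of the Riemann approximation using the left edge of each strip for the height. The curve y = 3x^2 between 0 and 1, whose true area is 1, and the name `left_riemann_sum` are assumptions chosen for illustration.

```python
def left_riemann_sum(f, a, b, strips):
    """Approximate the area under f from a to b using thin rectangles,
    taking each rectangle's height from the left edge of its strip."""
    width = (b - a) / strips
    heights = (f(a + i * width) for i in range(strips))
    return sum(heights) * width

def curve(x):
    return 3 * x**2  # illustrative curve; true area on [0, 1] is 1

for strips in (10, 100, 1000):
    print(strips, left_riemann_sum(curve, 0, 1, strips))
```

The estimates creep towards 1 as the strips get thinner, but only slowly, which is one reason the approximation is of more theoretical than practical use.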

Experiment 2: “Area so far”


Preparatory reading
The Riemann approximation is quite a good theoretical basis for our work, but it
isn’t a very good way of calculating actual areas. To get anything like a good
approximation, we usually need a very small strip width.

You’re going to do some more numerical experiments of the type you’ve just met;
it would be nice to have an approximate method which does a little better. We can
obtain one by treating each strip, not as a rectangle, but as a long, thin trapezium.

[Figure: a trapezium, with sides labelled a and b.]

The formula for the area of such a trapezium is

$$\frac{h(a + b)}{2},$$

where h, a and b are as shown. We can get quite a good approximation for any
given area by dividing it into trapezoidal strips, and repeatedly using this formula.

This approach is called the trapezium rule. You’ll meet it again in more detail in
a later series of experiments.

Post-experiment reading
In general, we can say that the area under the graph of $y = 3x^2$ up to the point x is
$x^3 + c$, where the value of c depends on where the measurement of area begins.
The function (or, strictly, the family of functions) $x^3 + c$ can be thought of as an
“area so far” function for $y = 3x^2$. More formally, it is known as the indefinite
integral of $3x^2$. Finding it is known as integrating $3x^2$.

The notation we use for indefinite integrals recalls the Riemann approximation.
You’ll recall that this involves the idea of slicing the area into thin strips which
we treat as rectangles. If the width of each of these strips is δx, then the height
will be the y-value on the left edge of the strip, which we’ll call y.

The area of each strip is thus y δx, so the total area might reasonably be written as

$$\sum y\,\delta x,$$

where Σ is the usual symbol for summation. We’re interested in letting δx get
gradually smaller (and therefore letting the number of strips get larger) and noting
the behaviour of the approximation. In other words, we think of the true area as
being equal to

$$\lim_{\delta x \to 0} \sum y\,\delta x.$$

The standard notation for indefinite integrals is simply a shorthand form of the
above, namely

$$\int y\,dx.$$

Note that the Σ has turned into an elongated “s” (known as the integral sign) and
the δ has turned into a d.

On the basis of the first two experiments, then, we can conjecture that

$$\int 3x^2\,dx = x^3 + c,$$

where the value of c depends on the point from which the area is measured.

Experiment 3: Integration and differentiation

Post-experiment reading
The fact that integration happens to be, as it were, “differentiation backwards”, is
so important that it’s known by the following rather grandiose name: the
Fundamental Theorem of the Calculus. It’s not all that hard to prove, though
we won’t.

It gives us the basis for finding indefinite integrals (always assuming we don’t
have Mathematica to hand). To integrate f(x), we try to find a function whose
derivative is f(x). Suppose we find one, and suppose we call it F(x): then any
function of the form F(x) + c (where c is any constant) will also have derivative
f(x), and the indefinite integral of f(x) is F(x) + c.
For example, suppose we’re trying to find the indefinite integral of $8x^7$. We know
from our work on differentiation that the derivative of $x^8$ is $8x^7$. It follows that the
indefinite integral of $8x^7$ is $x^8 + c$.

Experiment 4: Integration of $x^n$

Post-experiment reading
The last activity suggests the following:
$$\int x^n\,dx = \frac{x^{n+1}}{n+1} + c.$$

It will be noted that the formula appears to be valid whatever the sign of n, and
whether or not n is a whole number. For example,

$$\int \sqrt{x}\,dx = \int x^{1/2}\,dx = \frac{x^{3/2}}{\tfrac{3}{2}} + c$$

and

$$\int \frac{1}{x^2}\,dx = \int x^{-2}\,dx = \frac{x^{-1}}{-1} + c = -\frac{1}{x} + c.$$

The only exception is the integral of $x^{-1}$, which does not fit the pattern. Since the
derivative of ln x is 1/x, it follows (from the Fundamental Theorem of the
Calculus) that the integral of 1/x is ln x + c.

Experiment 5: Other simple functions



Post-experiment reading
By reversing the relevant differentiation results, we can establish the following:

$$\int \sin x\,dx = -\cos x + c;$$

$$\int \cos x\,dx = \sin x + c;$$

$$\int e^x\,dx = e^x + c;$$

$$\int \frac{1}{\sqrt{1 - x^2}}\,dx = \sin^{-1} x + c;$$

$$\int \frac{1}{1 + x^2}\,dx = \tan^{-1} x + c.$$

You will notice that a number of familiar functions are missing from the list we
have built up: what, for example, is the integral of tan x, or of ln x? It turns out
that we need to move rather beyond the “what is the function whose derivative is
this?” technique in order to tackle even these functions, let alone more
complicated ones. This we begin to do in the final experiment of this module; we
take the process further in the Integration 2 module.

Experiment 6: Simple rules and patterns



Post-experiment reading
Since we can differentiate “term by term” in a long expression, it follows that we
can integrate “term by term” too. For example:

$$\int \left(5x^3 + \frac{7}{x}\right) dx = 5\int x^3\,dx + 7\int \frac{1}{x}\,dx = \frac{5x^4}{4} + 7\ln x + c.$$

Since the derivative of, say, sin (2x - 1) is 2 cos (2x - 1) (as you will know from
your work on differentiation of composite functions), it follows that the integral
of 2 cos (2x - 1) is sin (2x - 1) + c, and therefore that the integral of cos (2x - 1) is
$\frac{1}{2}\sin(2x - 1) + c$. In general, if

$$\int f(x)\,dx = F(x) + c$$

then, if a and b are constants,

$$\int f(ax + b)\,dx = \frac{1}{a}\,F(ax + b) + c.$$

Simply divide by the coefficient of x.

Note that we are still a long way from a general rule for integrating composite
functions. In fact, there is no such general rule; no technique exists, for example,
for integrating symbolically even so simple a composite function as

There are two more sets of q


from the fourth set, type
GiveQuestiont ‘lint,
not forgetting to “shift-enter”.
LeLstAnswer I 9nt ,
I’

For a question from t


OiveQuestion t I‘
You can do this as oft
Integration 2

This essential first step



Experiment 1: Definite integrals


Preparatory reading
If you have looked at the Integration 1 module you may recall the close
connection between the idea of integration and the calculation of areas under
curves. The first module’s treatment of symbolic, indefinite integration will now
be of use to us in devising a method - an exact one, not an approximation - for
finding such areas. The key ideas are these:

1. The indefinite integral is really an “area so far” function, which tells us the
area under the curve “up to” a certain value of x. (It includes an arbitrary constant
whose value depends on what point we start measuring from.)

2. We might typically be interested in finding the area under a certain curve
(what’s known as the definite integral of the function) between, say, x = 3 and
x = 8.

3. We can get, from the indefinite integral, an expression for the area “up to”
x = 8...

...and for the area “up to” x = 3:

A way of using these two expressions may already have occurred to you.

Post-experiment reading
A definite integral calculation is set out formally as follows:

It’s important to note the following things.

(i) The definite integral notation: $\int_a^b f(x)\,dx$ means “the definite integral of f(x)
between x = a and x = b”. The numbers a and b are called the limits of
integration.

(ii) The square bracket notation: $[F(x)]_a^b$ is shorthand for “F(b) - F(a)”. We just
evaluate what’s inside the square brackets at each of the limits of integration, then
subtract one from the other.

(iii) We don’t include the constant of integration (if we did, it would cancel
anyway).

(iv) Where the curve dips below the x-axis (in other words, in those regions for
which the function is negative), the integral regards enclosed areas above the x-
axis as positive and below as negative. (So for sections of the curve below the x-
axis, it is incorrect to speak of the area ‘under’ the curve.)

This explains the result you probably got when you integrated sin x between 0 and
2π.

It's not at first obvious why Mathematica had so much trouble integrating 1/x
between x = -2 and x = 2. It looks OK to reason as follows:

$$\int_{-2}^{2} \frac{1}{x}\,dx = \big[\ln|x|\big]_{-2}^{2} = \ln|2| - \ln|-2| = \ln 2 - \ln 2 = 0.$$

Here's the problem, though: the domain of integration (the set of numbers
between -2 and 2) clearly includes the number zero, where the functions 1/x and
ln x are not defined. The number zero is called a singularity of the integral.

Singularities aren't always disastrous, but they can be, as here. The integral
$\int_{-2}^{2} \frac{1}{x}\,dx$ is said to be undefined.

Experiment 2: The trapezium rule


Preparatory reading
Using the indefinite, symbolic, integral to deduce actual areas (definite integrals)
is the ideal method when it works, because the areas we get are exact. Symbolic
integration is notoriously hard, though; functions don’t have to get all that
complicated before we run out of techniques for integrating them symbolically.

Instead, we often fall back on the approximate methods, such as the Riemann
approximation and the trapezium rule, that you have briefly met if you have
worked in the Integration 1 module. The Riemann approximation is, as we’ve
said elsewhere, of theoretical rather than practical interest; in practice, we
generally want the trapezium rule or something better.

In this experiment you study the trapezium rule in more detail. As you’ll recall,
we split the area we want to find into a number of strips, each of which we think
of as being a trapezium (they’re not: that’s where the approximation comes in).

Suppose the width of each trapezium is h, and the y-values (ordinates) at each
trapezium boundary are $y_0$, $y_1$, $y_2$, etc, as shown in the diagram below.

Then the area of the first trapezium is, by the standard formula, $\frac{h}{2}(y_0 + y_1)$.

So, if there are n trapezia (and therefore n + 1 ordinates, from $y_0$ up to $y_n$) then the
total area of all the trapezia is

$$\frac{h}{2}(y_0 + y_1) + \frac{h}{2}(y_1 + y_2) + \frac{h}{2}(y_2 + y_3) + \dots + \frac{h}{2}(y_{n-1} + y_n)$$

$$= \frac{h}{2}\,(y_0 + 2y_1 + 2y_2 + \dots + 2y_{n-2} + 2y_{n-1} + y_n).$$

This serves as an approximation for the true area.

It’s a good idea, when doing trapezium rule calculations, to add up the numbers $y_1$
to $y_{n-1}$ before doubling: that way, you only have to double once. Here’s a worked
example.

To estimate, using eleven ordinates (and therefore ten strips), the integral $\int_1^2 \frac{1}{x}\,dx$.

h = 0.1

x      y = 1/x
1.0    1.000000
1.1    0.909091
1.2    0.833333
1.3    0.769231
1.4    0.714286
1.5    0.666667
1.6    0.625000
1.7    0.588235
1.8    0.555556
1.9    0.526316
2.0    0.500000

Sum of the first and last ordinates: 1.500000
Sum of the remaining ordinates: 6.18771

$$\text{Area} = \frac{0.1}{2}\,(1.5 + 2 \times 6.18771) = 0.693771.$$
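
The worked example can be reproduced with a short Python sketch of the trapezium rule. This is not the book's TrapeziumRule command, whose options are not shown here; the function name `trapezium_rule` and its signature are assumptions. As recommended above, the interior ordinates are summed first and doubled once.

```python
import math

def trapezium_rule(f, a, b, strips):
    """Estimate the integral of f from a to b using equal-width trapezia."""
    h = (b - a) / strips
    ordinates = [f(a + i * h) for i in range(strips + 1)]
    ends = ordinates[0] + ordinates[-1]
    interior = sum(ordinates[1:-1])      # add first, then double once
    return h / 2 * (ends + 2 * interior)

estimate = trapezium_rule(lambda x: 1 / x, 1, 2, 10)
print(estimate)            # compare with the worked value 0.693771
print(math.log(2))         # the exact value of the integral, ln 2
```

Comparing the two printed numbers shows the estimate is slightly too large, as we'd expect for a curve that bends upwards like 1/x.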

u will haw alrea


have worked on the

TrapszieuleC
TarbulateQ->Truel
1) Use the above Muthe

Experiment 3: Parabolic segments


Preparatory reading
We can, of course, make the trapezium rule as accurate as we like by making h
sufficiently small. However, it’s possible, without changing the value of h, to
make the same data work harder for us. The argument (and it’s a clever one) goes
like this.

The trapezium rule depends on our taking segments of the graph, and pretending
that each is simpler than it is. In fact, we pretend that each is as simple as it
possibly could be: a straight line segment. We’d lose some simplicity, but
possibly gain some accuracy, if we pretended instead that each segment was the
second simplest thing it could possibly be.

The next simplest function after a linear one is a quadratic, and in the same way
the next simplest curve after a straight line segment is a segment of a quadratic
curve - what’s called a parabolic segment.

That’s the strategy, then: approximate the curve, not by a sequence of straight line
segments, but by a sequence of parabolic ones.

Post-experiment reading
In this experiment you encountered the main drawback of trying to approximate a
curve by a set of parabolic segments rather than a set of straight lines: namely,
that each segment needs three points to specify it rather than two.

This leaves us with two options if we wish to use the “parabolic segments” idea to
find approximate integrals. Either we sample the function in more places than we
would for the trapezium rule, or we sample in the same number of places but
settle for fewer segments. In practice, even if we take the second option we
usually get a better approximation than the trapezium rule would give us.

This experiment seems to indicate that the area under the parabola that passes
through $(-h, y_0)$, $(0, y_1)$ and $(h, y_2)$ between x = -h and x = h is $\frac{h}{3}(y_0 + 4y_1 + y_2)$.
This is illustrated on the next page:

Now, the fact that, in the case we examined, the y-axis happened to be exactly in
the middle is irrelevant. $\frac{h}{3}(y_0 + 4y_1 + y_2)$ is also the area under the parabola that
passes through the points $(l - h, y_0)$, $(l, y_1)$ and $(l + h, y_2)$:
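
The three-point area formula can be confirmed exactly for any particular quadratic. Here is a minimal Python sketch using exact fractions (not the book's Mathematica code); the quadratic 2x^2 - 3x + 5, the half-width h = 1/2 and the function names are all illustrative assumptions.

```python
from fractions import Fraction

def exact_area(a, b, c, h):
    """Integral of a x^2 + b x + c from -h to h, via the antiderivative."""
    def antiderivative(x):
        return a * x**3 / 3 + b * x**2 / 2 + c * x
    return antiderivative(h) - antiderivative(-h)

def three_point_area(y0, y1, y2, h):
    """The formula h/3 (y0 + 4 y1 + y2) from the text."""
    return h * (y0 + 4 * y1 + y2) / 3

# Illustrative quadratic and half-width (assumptions).
a, b, c = Fraction(2), Fraction(-3), Fraction(5)
h = Fraction(1, 2)

def parabola(x):
    return a * x**2 + b * x + c

area_by_integration = exact_area(a, b, c, h)
area_by_formula = three_point_area(parabola(-h), parabola(0), parabola(h), h)
print(area_by_integration, area_by_formula)
```

Because the arithmetic is done with fractions, the two results agree exactly rather than approximately.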

Experiment 4: Simpson’s rule


Preparatory reading
Presented with a function for which we wish to find an approximate definite
integral, we divide the range of integration into an even number of strips, each of
width h. This gives us half that number of parabolic segments, each of width 2h,
as shown:

The area of the first pair of strips put together is approximately equal to that under
the first parabolic segment, namely $\frac{h}{3}(y_0 + 4y_1 + y_2)$. The area of the second pair
is, similarly, approximately $\frac{h}{3}(y_2 + 4y_3 + y_4)$. That of the third pair is
approximately $\frac{h}{3}(y_4 + 4y_5 + y_6)$, and so on.

This is known as Simpson’s rule. As with the trapezium rule, it’s a good idea to
add those ordinates which need to be multiplied by 2 or 4 first, and then perform
the multiplication.

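Here is a worked example. To estimate the integral $\int_1^2 \frac{1}{x}\,dx$, using eleven
ordinates (and therefore ten strips, and a strip width of 0.1):

h = 0.1

x      y = 1/x
1.0    1.000000
1.1    0.909091
1.2    0.833333
1.3    0.769231
1.4    0.714286
1.5    0.666667
1.6    0.625000
1.7    0.588235
1.8    0.555556
1.9    0.526316
2.0    0.500000

Sum of the first and last ordinates: 1.500000
Sum of the remaining even-numbered ordinates ($y_2$, $y_4$, $y_6$, $y_8$): 2.72817
Sum of the odd-numbered ordinates ($y_1$, $y_3$, $y_5$, $y_7$, $y_9$): 3.45954

$$\text{Area} = \frac{0.1}{3}\,(1.5 + 2 \times 2.72817 + 4 \times 3.45954) = 0.69315.$$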
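
Here is a minimal Python sketch of Simpson's rule that reproduces the worked example. It is not the book's SimpsonsRule command; the function name `simpsons_rule` and its signature are assumptions. The ordinates to be multiplied by 4 and by 2 are summed separately before multiplying, as suggested above.

```python
import math

def simpsons_rule(f, a, b, strips):
    """Estimate the integral of f from a to b by Simpson's rule.
    The number of strips must be even."""
    if strips % 2 != 0:
        raise ValueError("Simpson's rule needs an even number of strips")
    h = (b - a) / strips
    y = [f(a + i * h) for i in range(strips + 1)]
    ends = y[0] + y[-1]
    odd = sum(y[1:-1:2])     # y1, y3, ...: multiplied by 4
    even = sum(y[2:-1:2])    # y2, y4, ...: multiplied by 2
    return h / 3 * (ends + 4 * odd + 2 * even)

estimate = simpsons_rule(lambda x: 1 / x, 1, 2, 10)
print(estimate)        # compare with the worked value 0.69315
print(math.log(2))     # the exact value of the integral, ln 2
```

With the same eleven ordinates, this estimate lies much closer to ln 2 than the trapezium rule's 0.693771.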

Post-experiment reading
The two approximate integration methods you’ve met can be applied to a far
wider range of functions than can any symbolic technique, including the more
sophisticated ones in the next module. There’s a drawback, though, and it’s an
obvious one: because these methods are approximate, our results may not be
entirely trustworthy. There will be, in each case, an error: a difference between
our estimate and the true value of the integral.

Often - though not always - this error will be quite small. In almost all cases of
practical importance, Simpson’s rule outperforms the trapezium rule for the same
number of ordinates, provided the strip width is small enough.

Mathematica’s NIntegrate function uses a highly sophisticated range of
numerical integration techniques. It usually outperforms both the trapezium rule
and Simpson’s rule, except where the strip width is extremely small. (A very
small strip width can give rise to other kinds of accuracy problems, though, as
well as making the execution time for TrapeziumRule and SimpsonsRule
excessive.) NIntegrate is usually accurate to about fifteen decimal places.
This makes it quite as good as Integrate for most practical applications of
definite integration (with the advantage of working for a far larger class of
functions and often using far less memory). Furthermore, it is possible to set
options within NIntegrate which make it as accurate as you choose.

GiveQuestion I ’
and
Integration 3

Hold down the “shift” key


Mathematica’s res

This essential fi

The following Mathem

Commands that com


Integrate, D, E
HoldForm, Relea

Special commands for this modul

To find out more ab



Experiment 1: Manipulating the integrand


Preparatory reading
Integration is quite a lot harder to do than differentiation. There are no general
rules for dealing with products, quotients or composite functions, so fairly simple
functions (like $\cos(x^2)$, for instance) can prove very difficult to integrate
symbolically.

The thing we’re trying to integrate is called the integrand. One approach to
integrals we don’t immediately know how to handle is to see if we can rewrite the
integrand in an algebraically equivalent, but more “integration-friendly”, form.

Try some exampl


more) polynomia
and compare with the
Mathematica does to

Post-experiment reading
(i) If the integrand is in the form of the product of two or more bracketed
expressions, it’s usually best to expand the brackets before integrating. For
example

$$\int (x^3 + 7)(x^2 - 5x + 1)\,dx = \int (x^5 - 5x^4 + x^3 + 7x^2 - 35x + 7)\,dx$$

$$= \frac{x^6}{6} - x^5 + \frac{x^4}{4} + \frac{7x^3}{3} - \frac{35x^2}{2} + 7x + c.$$
(ii) Integrands involving products of trigonometric functions are best recast in
their equivalent forms involving multiple angles. For example

$$\int \sin^2 x\,dx = \int \tfrac{1}{2}(1 - \cos 2x)\,dx = \tfrac{1}{2}\left(x - \tfrac{1}{2}\sin 2x\right) + c = \tfrac{1}{4}(2x - \sin 2x) + c.$$
(iii) Integrands of the form

$$\frac{p(x)}{q(x)},$$

where p and q are both polynomials, can be best tackled by factorising q(x), then
expressing the function in partial fractions.

At its simplest, the method of partial fractions depends on the fact that

$$\frac{ax + b}{(cx + d)(ex + f)}$$

can always be expressed in the form

$$\frac{A}{cx + d} + \frac{B}{ex + f},$$

the problem then being merely to find the values of the constants A and B.

Here is a simple example. To perform the integral

$$\int \frac{1}{x^2 - 3x + 2}\,dx$$

we first reflect that

$$\frac{1}{x^2 - 3x + 2} \equiv \frac{1}{(x - 1)(x - 2)} \equiv \frac{A}{x - 1} + \frac{B}{x - 2},$$

where A and B are constants yet to be determined and where “≡” is the identity
sign, meaning “is equal, for all values of x, to”.

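Now, if

$$\frac{1}{(x - 1)(x - 2)} \equiv \frac{A}{x - 1} + \frac{B}{x - 2},$$

then 1 ≡ A(x - 2) + B(x - 1). Setting x = 2 gives 1 = B, and setting x = 1 gives
1 = -A. Hence

$$\frac{1}{x^2 - 3x + 2} \equiv \frac{1}{x - 2} - \frac{1}{x - 1},$$

and thus

$$\int \frac{1}{x^2 - 3x + 2}\,dx = \int \left(\frac{1}{x - 2} - \frac{1}{x - 1}\right) dx = \ln|x - 2| - \ln|x - 1| + c.$$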
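
Both the partial-fraction identity and the resulting integral can be checked numerically. The sketch below is in Python, not Mathematica; the limits x = 3 to x = 5 (chosen to avoid the singularities at x = 1 and x = 2) and the helper `simpson` are assumptions made for illustration.

```python
import math

def integrand(x):
    return 1 / (x**2 - 3 * x + 2)

def partial_fractions(x):
    return 1 / (x - 2) - 1 / (x - 1)

def antiderivative(x):
    return math.log(abs(x - 2)) - math.log(abs(x - 1))

def simpson(f, a, b, strips=1000):
    """Simpson's rule with an even number of strips."""
    h = (b - a) / strips
    y = [f(a + i * h) for i in range(strips + 1)]
    return h / 3 * (y[0] + y[-1] + 4 * sum(y[1:-1:2]) + 2 * sum(y[2:-1:2]))

# The identity should hold at any x other than 1 and 2.
for x in (0.0, 1.5, 3.0, 10.0):
    assert abs(integrand(x) - partial_fractions(x)) < 1e-12

exact = antiderivative(5) - antiderivative(3)   # definite integral from 3 to 5
numeric = simpson(integrand, 3, 5)
print(exact, numeric)
```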
It can also be shown that

$$\frac{ax^2 + bx + c}{(dx + e)(fx^2 + g)}$$

can always be expressed in the form

$$\frac{A}{dx + e} + \frac{Bx + C}{fx^2 + g},$$

which allows us to use this method on a wider range of functions.*

* Note that this is not intended to be a comprehensive treatment of partial fractions. In
particular, we have ignored the important case in which there are repeated factors in the
denominator.

Experiment 2: Change of variable, type 1


Preparatory reading
Many integrals one meets can’t be dealt with in the manner of the last activity: no
alternative, friendlier form of the integrand exists. An example is

$$\int x e^{x^2}\,dx;$$

there are many others.

However, consider the function $u = x^2$. It's clear that $\frac{du}{dx} = 2x$, and therefore that
$x = \frac{1}{2}\frac{du}{dx}$. The integral can be written, then, as

$$\int \frac{1}{2}\,e^u\,\frac{du}{dx}\,dx,$$

and it's not hard to show that this is in turn equivalent to

$$\int \frac{1}{2}\,e^u\,du;$$

we can think of this as "the dx's cancelling", though that's not strictly what's
going on.

Performing the integration gives us $\frac{1}{2}e^u + c$, which is equivalent to $\frac{1}{2}e^{x^2} + c$.

Here's how we'd set out the calculation:

$$\int x e^{x^2}\,dx = \frac{1}{2}\int e^u\,du \qquad (u = x^2 \Rightarrow du = 2x\,dx \Rightarrow x\,dx = \tfrac{1}{2}\,du)$$

$$= \frac{1}{2}e^u + c$$

$$= \frac{1}{2}e^{x^2} + c.$$

The method, known as integration by change of variable, depends on spotting a
function, which we call u, which appears buried somewhere in the integrand and
whose derivative, du/dx, is one of the parts of a product.

Post-experiment reading
In the case of definite integration, we can change the x-limits on the integral into
u-limits: this saves our having to convert back to x at the end. For example

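$$\int_1^2 x e^{x^2}\,dx = \frac{1}{2}\int_{x=1}^{x=2} e^u\,du \qquad (u = x^2 \Rightarrow du = 2x\,dx \Rightarrow x\,dx = \tfrac{1}{2}\,du)$$

$$= \frac{1}{2}\int_{u=1}^{u=4} e^u\,du \qquad (x = 1 \Rightarrow u = 1;\ x = 2 \Rightarrow u = 4)$$

$$= \frac{1}{2}\,(e^4 - e).$$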
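
To see that changing the limits really does give the same answer, we can estimate both the original x-integral and the transformed u-integral numerically and compare them with ½(e^4 - e). This is a Python sketch (not the book's Mathematica code); the helper `simpson` is an assumption.

```python
import math

def simpson(f, a, b, strips=1000):
    """Simpson's rule with an even number of strips."""
    h = (b - a) / strips
    y = [f(a + i * h) for i in range(strips + 1)]
    return h / 3 * (y[0] + y[-1] + 4 * sum(y[1:-1:2]) + 2 * sum(y[2:-1:2]))

# Original integral: x * exp(x^2) between x = 1 and x = 2.
in_x = simpson(lambda x: x * math.exp(x**2), 1, 2)

# After the change of variable u = x^2: (1/2) exp(u) between u = 1 and u = 4.
in_u = simpson(lambda u: 0.5 * math.exp(u), 1, 4)

exact = 0.5 * (math.exp(4) - math.e)
print(in_x, in_u, exact)
```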

Experiment 3: Change of variable, type 2


Preparatory reading
In all the change of variable examples you've met so far, you've expressed the
new variable, u, in terms of the old variable, x. Sometimes, it's possible to work
the other way round: the old variable is expressed in terms of the new one. Here's
an example, in which we integrate $\sqrt{36 - x^2}$ using the change of variable
x = 6 sin t.

$$\int_0^3 \sqrt{36 - x^2}\,dx = \int_{x=0}^{x=3} \sqrt{36 - 36\sin^2 t}\;6\cos t\,dt \qquad (x = 6\sin t \Rightarrow dx = 6\cos t\,dt)$$

$$= \int_0^{\pi/6} 36\cos^2 t\,dt \qquad (x = 0 \Rightarrow t = 0;\ x = 3 \Rightarrow t = \tfrac{\pi}{6})$$

$$= 9\,\big[2t + \sin 2t\big]_0^{\pi/6}$$

$$= 3\pi + \frac{9\sqrt{3}}{2}.$$
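$$\frac{d}{dx}(uv) = v\,\frac{du}{dx} + u\,\frac{dv}{dx} \;\Rightarrow\; u\,\frac{dv}{dx} = \frac{d}{dx}(uv) - v\,\frac{du}{dx}.$$

Integrating both sides with respect to x gives

$$\int u\,\frac{dv}{dx}\,dx = uv - \int v\,\frac{du}{dx}\,dx,$$

which can be written more simply as

$$\int u\,dv = uv - \int v\,du.$$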
The technique of integration by parts, based on this formula, uses the following
strategy: to integrate a product, try to differentiate one half of it, and integrate the
other, and end up with something simpler. In the case of the integral

$$\int x\cos x\,dx,$$

for instance, we can differentiate x, which gives 1, and integrate cos x, which
gives sin x. Our new product will then be sin x × 1, or just sin x. Here’s how it
works:

$$u = x, \qquad dv = \cos x\,dx$$

$$du = dx, \qquad v = \sin x$$

$$\int x\cos x\,dx = x\sin x - \int \sin x\,dx = x\sin x + \cos x + c.$$

Experiment 5: Areas in the plane


Preparatory reading
Examine the following diagram.

It’s fairly clear that if we subtract the integral of the lower curve from that of the
upper, between two appropriate limits, then what’s left will be the shaded area.

What’s perhaps less immediately clear is that the same is true in the case of areas
that lie below the x-axis, or even those that “straddle” it, like that shown below.

To state it precisely: if the curves y = f(x) and y = g(x) do not cross in the domain
a < x < b, and if y = f(x) is the upper of the two in this region, then the area
enclosed between the two curves between x = a and x = b is

$$\int_a^b \big(f(x) - g(x)\big)\,dx.$$

What happens if the two curves do cross? You are invited to consider this case in
the following experiment.

Plot, on the same pair of a


typing
PlOt[{X”2, XI, {x
Find the finite area b
Integrate [x -
) Use a similar technique
and y = sin x for PO
Imagine a curve in the plane, defined by y = f(x), and imagine we rotate it about
the x-axis, causing a solid shape to be swept out.

In the limit as we make the disc thinner (that is, as δx tends to zero) we have
exact equality, and can write

$$\text{volume} = \int_a^b \pi y^2\,dx,$$

where a and b are the lower and upper x-limits of the shape.

Our curve in the first diagram of this section was actually $y = x^2 - x^4 + \frac{1}{5}$
between x = 0 and x = 1, which means that the volume of the solid shown is

$$\int_0^1 \pi \left(x^2 - x^4 + \tfrac{1}{5}\right)^2 dx = \frac{187\pi}{1575}.$$
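
The stated volume can be checked with exact arithmetic. This Python sketch (not the book's VolumeOfRevolution command) assumes the curve is y = x^2 - x^4 + 1/5, which is the reading consistent with the stated answer 187π/1575; the dictionary representation of polynomials and the function names are also assumptions.

```python
from fractions import Fraction

def multiply(p, q):
    """Multiply two polynomials stored as {power: coefficient} dictionaries."""
    product = {}
    for i, a in p.items():
        for j, b in q.items():
            product[i + j] = product.get(i + j, 0) + a * b
    return product

def integrate_0_to_1(p):
    """Integrate a polynomial from 0 to 1, term by term."""
    return sum(coefficient / Fraction(power + 1) for power, coefficient in p.items())

# Assumed curve: y = x^2 - x^4 + 1/5.
curve = {2: Fraction(1), 4: Fraction(-1), 0: Fraction(1, 5)}

# volume = pi * (integral of y^2 from 0 to 1); we compute the factor multiplying pi.
volume_over_pi = integrate_0_to_1(multiply(curve, curve))
print(volume_over_pi)  # 187/1575
```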

We’ve written a c
automates this cu

2) Check your answer to t


VOlWil8OfRevOlUt

VolwmeOfRevolut
IllustrateQ -
The integration invol
for -2 I xI 1 is extre
be done numerically.

VolumeOfRevoluti
IllustrateQ ->
IntFunction -
Series 2

Experiment 1: Binomial expansions


Preparatory reading

Expressions such as $(1 + x)^n$ are called binomial, meaning “with two terms”. It
might help you to recall how such expressions are expanded:

$$(1 + x)^2 = (1 + x)(1 + x) = 1(1 + x) + x(1 + x) = 1 + x + x + x^2 = 1 + 2x + x^2.$$

$$(1 + x)^3 = (1 + x)(1 + x)^2 = 1(1 + 2x + x^2) + x(1 + 2x + x^2) = 1 + 3x + 3x^2 + x^3.$$

In the final expression, with like terms collected, the number multiplying each
power of x is called its coefficient. So in the expansion of $(1 + x)^2$ the coefficient
of x is 2 and the coefficient of $x^2$ is 1.

Mathematica expands
quad = ExpanB[(x+Z)
The inverse of this
Factor t quati]

Post-experiment reading
The coefficients form a pattern known as Pascal’s Triangle (after Blaise Pascal, a
French mathematician, philosopher and theologian). The first five lines of
Pascal’s Triangle are shown below.

1 1

1 2 1

1 3 3 1

1 4 6 4 1

1 5 10 10 5 1

(So, for example, $(1 + x)^4 = 1 + 4x + 6x^2 + 4x^3 + x^4$.)

Pascal’s Triangle is formed using the following rule: each element is the sum of
the two directly above it.

It may not be immediately obvious why the Pascal’s Triangle pattern should arise
in the context of binomial expansions. The second “10” in line 5, for example, is
the $x^3$ coefficient in the expansion of $(1 + x)^5$. It is formed by adding 6 and 4: the
$x^2$ and $x^3$ coefficients, respectively, in the expansion of $(1 + x)^4$. To see why this
should be so, reflect that

$$(1 + x)^5 = (1 + x)(1 + x)^4$$

$$= (1 + x)(1 + 4x + 6x^2 + 4x^3 + x^4)$$

$$= 1 + 4x + 6x^2 + 4x^3 + x^4 + x + 4x^2 + 6x^3 + 4x^4 + x^5$$

$$= \dots + (4 + 6)x^3 + \dots.$$
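
The "sum of the two directly above" rule translates directly into a few lines of code. Here is a minimal Python sketch (not the book's Mathematica code); the function name `pascal_rows` is an assumption, and the rows are numbered as in the text, so line 1 is "1 1".

```python
def pascal_rows(count):
    """Return the first `count` lines of Pascal's Triangle, starting from 1 1."""
    rows = []
    row = [1, 1]
    for _ in range(count):
        rows.append(row)
        # Each inner element is the sum of the two elements directly above it.
        row = [1] + [left + right for left, right in zip(row, row[1:])] + [1]
    return rows

for line in pascal_rows(5):
    print(line)
```

The last line printed is the list of coefficients in the expansion of (1 + x)^5.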

Experiment 2: Factorials and the binomial coefficients

Preparatory reading
For binomial expansions $(1 + x)^n$ when n is very large, constructing Pascal’s
Triangle is not a suitable method for generating the binomial coefficients. Think
about $(1 + x)^{100}$: the Triangle simply becomes too big to print or display on a
computer screen.

There is a better way to do it, which involves the use of the idea of a factorial.

Post-experiment reading
The factorial of n, written n!, is the product

$$n \times (n - 1) \times (n - 2) \times \dots \times 2 \times 1.$$

The exception is 0!, which by convention is held to be 1.

Now: consider the expansion

$$(1 + x)^6 = 1 + 6x + 15x^2 + 20x^3 + 15x^4 + 6x^5 + x^6,$$

and think of the coefficients in terms of factorials:

$$\text{coefficient of } x^0 = 1 = \frac{6!}{0!\,6!},$$
$$\text{coefficient of } x^1 = 6 = \frac{6!}{1!\,5!},$$
$$\text{coefficient of } x^2 = 15 = \frac{6!}{2!\,4!},$$
$$\text{coefficient of } x^3 = 20 = \frac{6!}{3!\,3!},$$
$$\text{coefficient of } x^4 = 15 = \frac{6!}{4!\,2!},$$
$$\text{coefficient of } x^5 = 6 = \frac{6!}{5!\,1!},$$
$$\text{coefficient of } x^6 = 1 = \frac{6!}{6!\,0!}.$$

More generally, the coefficient of $x^r$ in the expansion of $(1 + x)^n$ is

$$\frac{n!}{r!\,(n - r)!}.$$

This formula allows us to calculate binomial coefficients without needing to
construct Pascal’s Triangle. It is the best option for comparatively large values of
n.

The accepted shorthand notation for $\frac{n!}{r!\,(n - r)!}$ is $\binom{n}{r}$ or ${}^nC_r$; we often say “n
choose r”.
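
The factorial formula is easy to try out. The following Python sketch (not the book's Mathematica code) computes binomial coefficients from factorials and compares them with Python's built-in `math.comb`; the function name `binomial_coefficient` is an assumption.

```python
import math

def binomial_coefficient(n, r):
    """n choose r, computed from the factorial formula n! / (r! (n - r)!)."""
    return math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

# The coefficients in the expansion of (1 + x)^6.
sixth_row = [binomial_coefficient(6, r) for r in range(7)]
print(sixth_row)

# A case where building Pascal's Triangle by hand would be tedious:
# the coefficient of x^3 in (1 + x)^100.
print(binomial_coefficient(100, 3), math.comb(100, 3))
```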

Experiment 3: $(a + bx)^n$
Preparatory reading
The ideas you have met generalise to binomial expansions such as powers of (1 + 2x) or
(5 + 7x). In this experiment we look at binomials of the form $(a + bx)^n$, where n
is a positive integer.

Post-experiment reading
In general

$$(a + bx)^n = a^n + n\,a^{n-1}\,bx + \dots + \frac{n!}{r!\,(n - r)!}\,a^{n-r}(bx)^r + \dots + n\,a\,(bx)^{n-1} + (bx)^n.$$

It is conventional to order these expansions in increasing powers of x, and there is
a good reason for doing so which we'll consider shortly.

The completely general case for two variables x and y, is:

$$(ay + bx)^n = (ay)^n + n\,(ay)^{n-1}\,bx + \dots + \frac{n!}{r!\,(n - r)!}\,(ay)^{n-r}(bx)^r + \dots + n\,ay\,(bx)^{n-1} + (bx)^n.$$

This can also be written as:

pructiw questions
binomial expansions

GiveQuestion[
To see the answer type:
LastAnswer ["bi

Giveguestion[

and repetitions should be


commands: simply click

Note: this section us


try going bock to the

Experiment 4: Other kinds of n


Preparatory reading
Everything we’ve done so far has been valid only when n is a positive integer.
We now want to consider expanding binomials to negative and non-integer
powers such as:

Is it possible that the same pattern of coefficients that we found for positive
integer n,

$$\frac{n!}{r!\,(n - r)!},$$

works for negative and non-integer values?

Expand C (1 + x

Series[(l + xIA4,
Series[ (1 + x)"4,

etc. Compare the outp


Expand[(l + XI^

3) Evaluate the foll

Series t (1 +
Series[(l +

4) Try the general expansion


Series[(l+~)~n,

Post-experiment reading
This experiment suggests that for all values of n (including negative and
fractional ones) we can write

$$(1 + x)^n = 1 + nx + \frac{n(n - 1)}{2!}\,x^2 + \frac{n(n - 1)(n - 2)}{3!}\,x^3 + \dots.$$

This is indeed the case, although the proof is outside the scope of this module.
This proposition is known as the binomial theorem.

For positive, integer n, the coefficients

$$1,\quad n,\quad \frac{n(n - 1)}{2!},\quad \frac{n(n - 1)(n - 2)}{3!},\quad \text{etc}$$

are exactly the same as the factorial expressions

$$\frac{n!}{r!\,(n - r)!}$$

which you met in Experiment 2. The last such coefficient is

$$\frac{n(n - 1)(n - 2) \cdots 2 \cdot 1}{n!},$$

which is equal to 1; after that, they’re all zero. The expansion is therefore finite in
length, and holds true for all values of x.

For negative or fractional n, though, the coefficients never become zero, and the
expansion takes the form of an infinite series.

Experiment 5: Convergence
Preparatory reading
Consider the statement

(an example of the pattern you discovered in Experiment 4).

At first sight, it is hard to see how an infinite series can possibly be the same as a
finite binomial expression. What, we may ask, does it even mean to say that two
such things are “equal”? The answer is that, for certain values of x, the value of
the infinite expression on the right-hand side will approach that of the finite
expression on the left as we take more and more terms. We say that, for these
values of x, the series is convergent to the finite expression.

Note that convergence need not happen for all values of x. What happens when
convergence does not occur is one of the questions you are asked to address in
this experiment.
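The idea of "taking more and more terms" can be tried out directly. The sketch below uses the series $1 - x + x^2 - x^3 + \dots$ for $(1 + x)^{-1}$; the function name partialSum and the two trial values of x are choices made for this illustration only.

```mathematica
(* Sum of the terms of 1 - x + x^2 - x^3 + ... up to and including x^k *)
partialSum[x_, k_] := Sum[(-1)^r x^r, {r, 0, k}]

(* For x = 0.5 the partial sums settle down near 1/(1 + 0.5) *)
Table[partialSum[0.5, k], {k, 0, 10}]
1/(1 + 0.5)

(* For x = 1.5 they swing further and further away from 1/(1 + 1.5) *)
Table[partialSum[1.5, k], {k, 0, 10}]
1/(1 + 1.5)
```

Compare each list with the value printed underneath it before moving on to the experiment.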
Evaluate the following i


Series[(1 + x)^(-
Now evaluate

Mathematica.

2) Evaluate the foll

Normal[Series[(1

This sets up a user-


piece of Mathemati

Try the following input


mySeries[x, 2]
mySeries[x, 3]

xlist = Table[myS
Describe the output you se

4) Substitute the value x = 0.


list1 = xlist /. x
Describe the behaviour of t

Generate a plot of these


ListPlot[list
Describe what you s
… other values of $x$, the series has no finite sum, and it makes no sense to say
that it is “equal” to some finite expression. … statement that

$$(1 + x)^n = 1 + nx + \frac{n(n-1)}{2!}\,x^2 + \frac{n(n-1)(n-2)}{3!}\,x^3 + \dots ,$$ …

Experiment 6: Polynomial approximations


Preparatory reading
A polynomial in $x$ is a finite series of terms of the form $ax^n$, where $n$ is a
non-negative whole number; for example:

$9 + 2x + 15x^2 - x^5$, $-4 + 2x^3 + x^{11}$, $2 - x^2 + 5x^3$, etc.

The degree of a polynomial is the highest power of $x$ that it contains. Quadratics
are polynomials of degree 2; cubics are polynomials of degree 3; quartics are
polynomials of degree 4; quintics are of degree 5.

When we say, for example:

$$(1 + x)^{-1} = 1 - x + x^2 - x^3 + x^4 - \dots ,$$

we mean that, for a certain range of $x$, the non-polynomial function:

$$(1 + x)^{-1}$$

and the quartic polynomial:

$$1 - x + x^2 - x^3 + x^4$$

have values close to each other.
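"Close to each other" is easy to put to the test numerically. This is a small sketch with names (reciprocal, quartic) and sample values of x chosen just for the illustration:

```mathematica
(* The function and its quartic approximation *)
reciprocal[x_] := (1 + x)^(-1)
quartic[x_] := 1 - x + x^2 - x^3 + x^4

(* Columns: x, the function, the polynomial, and the size of the gap *)
Table[{x, reciprocal[x], quartic[x], Abs[reciprocal[x] - quartic[x]]},
  {x, {0.1, 0.3, 0.6, 0.9}}] // TableForm
```

Notice how the size of the gap changes as x moves away from zero.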

Post-experiment reading
The idea that functions which aren’t themselves polynomials can nonetheless be
approximated by them turns out to be one of the most useful in mathematics,
because polynomials are so easy to work with.

Although we introduced the idea of polynomial approximations and infinite series
using the binomial expansion, it turns out that many functions can be
approximated by suitable polynomials. Typically, as with the binomial
expansions, the higher the polynomial’s degree, the better the approximation
and/or the wider the range of values of $x$ for which it is useful.

As with binomial expansions, we can think of the successive polynomial
approximations as leading towards an infinite series expansion in $x$. We call this
expansion the function’s Maclaurin Series. This is what Mathematica calculates
with its Series command when the expansion is taken about $x = 0$.

The Maclaurin series for $\ln(1 + x)$ is convergent only for $-1 < x \le 1$, and the one
for $\arctan x$ only for $-1 \le x \le 1$. These ranges are similar to, but not quite the
same as, the range of convergence for the binomial expansion (which is
$-1 < x < 1$, as you may recall). However, the Maclaurin series for $e^x$, $\sin x$ and
$\cos x$ actually converge for all values of $x$, although convergence is slow for
large values.
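The five series you met in this experiment are

$$e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots + \frac{x^n}{n!} + \dots$$

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots + \frac{(-1)^n x^{2n+1}}{(2n+1)!} + \dots$$

$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \dots + \frac{(-1)^n x^{2n}}{(2n)!} + \dots$$

$$\log(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \dots + \frac{(-1)^{n+1} x^n}{n} + \dots \qquad (-1 < x \le 1)$$

$$\arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \dots + \frac{(-1)^n x^{2n+1}}{2n+1} + \dots \qquad (-1 \le x \le 1)$$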

Deriving Maclaurin series


This section shows, without rigorous proof, how we find the coefficients in a
Maclaurin series.

Consider any function, $f(x)$, and assume that it has a power series expansion with
constant coefficients $a_0$, $a_1$, $a_2$ etc.

$$f(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 + \dots .$$

By substituting $x = 0$ we can find the first coefficient $a_0$:

$$f(0) = a_0 .$$

Now differentiate $f$ with respect to $x$:

$$f'(x) = a_1 + 2a_2 x + 3a_3 x^2 + 4a_4 x^3 + \dots ;$$

then, substituting $x = 0$ again:

$$f'(0) = a_1 .$$

If we now differentiate once more, we obtain

$$f''(x) = 2a_2 + 3\cdot 2\,a_3 x + 4\cdot 3\,a_4 x^2 + \dots ,$$

so that $f''(0) = 2a_2$. And so on. In general:

$$f^{(n)}(0) = n!\,a_n ,$$

where $f^{(n)}$ means the $n$th derivative of $f$.

In general, then,

$$f(x) = f(0) + x f'(0) + \frac{x^2}{2!} f''(0) + \dots + \frac{x^n}{n!} f^{(n)}(0) + \dots .$$

This is known as Maclaurin’s Theorem.
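The recipe "differentiate n times, put x = 0, divide by n!" can be carried out mechanically. The sketch below is one possible way of doing so; the names maclaurinCoefficient and maclaurinPolynomial are invented for this illustration, and the comparison with Series is simply a cross-check.

```mathematica
(* n-th coefficient from the rule a_n = f^(n)(0)/n! *)
maclaurinCoefficient[f_, n_] := Derivative[n][f][0]/n!

(* Polynomial built from the coefficients a_0 up to a_degree *)
maclaurinPolynomial[f_, x_, degree_] :=
  Total[Table[maclaurinCoefficient[f, n] x^n, {n, 0, degree}]]

(* Build the degree-7 polynomial for sin x, then compare with Series *)
maclaurinPolynomial[Sin, x, 7]
maclaurinPolynomial[Sin, x, 7] == Normal[Series[Sin[x], {x, 0, 7}]]
```

The same two lines can be tried with Cos or Exp in place of Sin.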


Vectors 1

Introduction to vectors
Vector and scalar quantities
Some physical quantities, such as mass and time, can be completely represented
by simple numbers, for example: “30 kilograms”; “7.5 seconds”.

For other types of quantities, such as force and acceleration, it’s important to
specify two things:
i. how large the quantity is, and
ii. in what direction it is acting.

We might, for example, talk about: “A force of 50 Newtons acting on a bearing of
120 degrees”.

Quantities which have direction as well as magnitude are called vector
quantities. In contrast, those which have magnitude only (like mass and time) are
called scalar quantities.

Other physically important vector quantities include:

* Displacement (the answer to the question: “how far away, and in what
direction?”)
* Velocity (the answer to the question: “how fast, and in what direction?”)
* Acceleration (the answer to the question: “how is the velocity changing?”)

Note that the magnitude of a displacement is distance, and the magnitude of a
velocity is speed.

Directed line segments


The most natural way to represent a vector quantity diagrammatically is as a line
segment with an arrow on it:

[Figure: a vector drawn as an arrowed line segment, with a scale bar marking one unit]

The length of the line segment represents the vector’s magnitude, in whatever
units we’re using: Newtons, metres per second, etc. The direction of the arrow
represents the vector’s direction. The vector shown above is 5 units long and it
acts at an angle of 40 degrees anticlockwise from the direction “East”.

QUESTION: In maths, it’s usual to measure angles anticlockwise from a
horizontal, right-pointing baseline. How did this convention arise, do you think?
Give this problem a little thought before checking the answer at the bottom of the
page.

ANSWER: Turning anticlockwise from a right-pointing horizontal line means turning
from the x-axis to the y-axis. Since x comes before y in the alphabet, this seems logical: at
least to a mathematician! It also means that sine and cosine functions start off positive,
which is tidy. But basically this is just a convention. Angles going clockwise from the
right-pointing horizontal line are counted as negative: for example, a direction of -90
degrees means due South. Map-makers and navigators use a rather different convention:
they measure angles (“bearings”) clockwise from North.

Experiment 1: Drawing vectors with Mathematica
Preparatory reading
This experiment introduces some Mathematica commands for drawing vectors of
given magnitude and direction.

The vector in the diagram above has a magnitude of 5 units and a direction of 40
degrees. We are going to call this the vector {5, 40}, using braces (curly brackets,
that is). This is not standard mathematical notation, but we need a neat way of
representing vectors if we want Mathematica to process them. Braces, left { and
right }, are an important part of Mathematica’s language; they are used whenever
we want to collect things together into a single bigger “thing”. Mathematically,
that “thing” is a type of set, but in Mathematica it’s called a list.

There are a number of notations for vector quantities. In this module we will refer
to any general vector using a lower case letter in bold type: v is our favourite
name (“v for vector”). When writing by hand, you indicate a vector by
underlining, either with a straight line, or some people use a wavy line like the
tilde symbol “~”. Another notation is to put an arrow over the top of a letter to
indicate that it’s a vector, $\vec{v}$ for example; we will reserve this notation for
displacement vectors (see Experiment 2).
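If the course's own drawing commands are not to hand, a vector given as a {magnitude, direction} list can be drawn with built-in graphics alone. The following is a rough sketch which assumes a recent version of Mathematica, where Arrow is available without loading a package; the name drawVector is made up for this illustration and is not the command used in the experiment.

```mathematica
(* Draw {magnitude, direction in degrees} as an arrow starting at the origin *)
drawVector[{magnitude_, direction_}] := Graphics[
  {Arrow[{{0, 0},
     magnitude {Cos[direction Degree], Sin[direction Degree]}}]},
  Axes -> True, PlotRange -> All]

(* The vector {5, 40} from the text *)
drawVector[{5, 40}]
```

Multiplying the direction by Degree converts it to radians, which is what Cos and Sin expect.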
ctor[{5, 110}, ShowEas


That last statement is an example of speci

penta -point
Experiment 2: Negative vectors and equal vectors
Preparatory reading
Displacement vectors
Consider two points, A and B. The displacement vector from A to B is called,
not surprisingly, the vector AB. In textbooks, vector AB will be written in bold
type like this, or with a direction arrow over the top as in the diagram below.
When you write it down, you should also use an arrow. The computer won’t
easily let us use either convention on the screen but it should be clear from the
context when we mean a vector.

Remember that the vector AB is defined only by its magnitude and direction: if
we move it away from A and B, it’s still the same vector, and we can still call it
AB:

[Figure: the vector AB drawn again, moved away from A and B]

Negative vectors
Given a displacement vector AB connecting the points A and B, how does it
relate to the vector BA? The magnitudes must be equal, but the directions are
opposite. In fact, we can say that BA is the negative vector of AB, that is,

BA = -AB

Equal vectors
Two vectors are equal if they have the same magnitude and the same direction.
They do not have to match in any other way and, in particular, they don’t have to
be acting in the same place. In the following diagram, the displacement vector
from Farnby to Lowford is equal to the displacement vector from Highton to
Bixleigh, even though they’re in different places:

[Figure: map showing the displacement vectors from Farnby to Lowford and from Highton to Bixleigh]

If two vectors have the same magnitude, but not the same direction, they are not
equal. The two vectors in the diagram below are unequal, although they have
equal magnitude, because they have different directions.

instruct the Vector



vector of { 2.3,47

This section uses t

questions on equality of vectors. To gener



Experiment 3: Multiplication by a scalar


Preparatory reading
If we have some vector v of magnitude m and direction d degrees, that is v = {m, d}
in the notation we’ve been using, what do we mean by multiplying v by a
scalar, say

3 “times” v = 3 “times” {m, d}?

If the scalar is greater than zero then what we do is to leave the direction
unchanged and multiply the magnitude by 3, so

3 v = {3m, d}.

Although we’ve written “3 v” just like ordinary multiplication, it’s important to
remember that even though the notation is the same the rule is quite different (for
example, try typing 3 * {m, d} in Mathematica: what happens?)

Furthermore, what does it mean to multiply a vector by a scalar less than zero? Or
equal to zero?

be considered as

}1 is equivalent to:

Post-experiment reading
Here are all the possible cases for scalar multiplication, s “times” v: if the scalar s
is greater than zero, the outcome is a vector of the same direction with magnitude
s times the original. If s is less than zero, the outcome is a vector of magnitude |s|
times the original with a reversed direction (the original direction plus, or minus,
180 degrees). And if s = 0, the outcome is the zero vector.

The zero vector, written as 0 (a bold nought, or underlined if you’re writing it by
hand) is the vector with zero magnitude and indeterminate direction (that is, it just
hasn’t got a definite direction).
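The three cases can be collected into a single rule for vectors in {magnitude, direction} form. This is only a sketch: the name scalarMultiply is invented here, the reversed direction is brought back into the range from -180 up to (but not including) 180 degrees as a tidying choice, and Indeterminate is used to stand for the zero vector's missing direction.

```mathematica
(* Scalar multiplication for a vector given as {magnitude, direction in degrees} *)
scalarMultiply[s_, {m_, d_}] := Which[
  s > 0, {s m, d},                               (* same direction *)
  s < 0, {Abs[s] m, Mod[d + 180, 360, -180]},    (* reversed direction *)
  True, {0, Indeterminate}                       (* s = 0: the zero vector *)
]

scalarMultiply[3, {5, 40}]
scalarMultiply[-2, {5, 40}]
scalarMultiply[0, {5, 40}]
```

The scalar s needs to be an actual number here, so that the comparisons s > 0 and s < 0 can be decided.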

Experiment 4: Vector addition


Preparatory reading
Consider the following problem:

B is 5 metres due East of A. C is 12 metres due South of B. How far away from A
is C, and in what direction?

We could recast this in vector terms as follows:

Vector AB has magnitude 5 metres and direction 0 degrees. Vector BC has
magnitude 12 metres and direction -90 degrees. Find the magnitude and direction
of vector AC.

Or, more simply:

What single displacement vector does the job of “{5, 0} followed by {12, -90}”?

You’ll notice that, according to this definition:

AC = AB + BC .

The statement of vector addition called the Triangle Rule can be summed up like
this:

(i) link the vectors head to tail;

(ii) join the tail of the first to the head of the second: this is the answer, and
then...

(iii) ... use trigonometry to calculate its magnitude and direction.
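For the problem above, step (iii) can be worked through as follows. Here $\phi$ is a symbol introduced for this sketch: the angle that AC makes below the direction "East". The triangle ABC is right-angled at B, so Pythagoras' theorem and a single tangent are all that is needed.

```latex
|AC| = \sqrt{5^2 + 12^2} = \sqrt{169} = 13,
\qquad
\tan\phi = \frac{12}{5}
\;\Rightarrow\;
\phi \approx 67.4^\circ
```

Since C lies to the South of the East-pointing line through A, the direction is negative, so in the braces notation AC is approximately {13, -67.4}.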



various inputs. The answer is

applied:

Post-experiment reading
The procedure for the Parallelogram Rule introduced in the experiment is:

(i) link the two vectors “tail to tail”;

(ii) complete the parallelogram formed by the vectors (the dashed lines in
Mathematica’s diagram);

(iii) the sum of the two vectors is the diagonal of the parallelogram.

The Triangle and Parallelogram Rules are exactly equivalent.

There’s a big problem with doing vector addition in magnitude-direction form,
which is to do with finding lengths of sides and angles in triangles. The first
example we chose was easy to solve because the triangle was right-angled; in
general this will not be the case and the trigonometrical calculations required will
be rather harder. There is in fact an alternative way to represent vectors, called
component form, which makes vector addition very easy to do; more on this in the
next experiment.

The fact that two vectors can be added together means that we can add any
number of vectors together, by a process of repeatedly adding up all the connected
pairs of vectors until we get to the single, resultant vector. For example, consider
the following diagram:

[Figure: points A, B, C and D joined in turn by the vectors AB, BC and CD]

Here, we have three displacement vectors that we wish to sum. Suppose we decide
first to add BC and CD, then:

AB + (BC + CD) = AB + BD = AD

Write down similar sets of equations for four, five, six and more displacement vectors.
Convince yourself that the order in which vectors are paired doesn’t change the
answer; this is the associative property of vector addition which we’ll come back
to in Experiment 6.

Vector subtraction is the same process as addition, since for any subtraction
v - w we may write:

v - w = v + (-w)

and we know that the vector -w is one having the same magnitude as w but
reversed direction (rotation by +180, or -180, degrees; see following figure).

Experiment 5: Components
Preparatory reading
Adding two vectors produces the single vector that does the job of the original
two. We’re now going to reverse the process and ask: what two vectors are
equivalent to a single vector?

Specifically: given a vector v, what horizontal vector can be added to what
vertical vector to make v?

For example, in the diagram below, the single vector on the left is equivalent to
the two perpendicular vectors on the right (according to the Parallelogram Rule):

These two vectors are called the horizontal and vertical components of v.
Together they’re the perpendicular components of v, and we can describe every
vector uniquely in terms of its components. We will tend from now on to express
vectors in component form instead of magnitude-direction form.

The process of finding the components of a vector is sometimes called “resolving
the vector into its components”.

[Figure: a vector v resolved into its horizontal and vertical components]

Post-experiment reading
Vectors in component form
For a vector v = {r, θ}, the perpendicular components are:

$$v_x = r\cos\theta, \qquad v_y = r\sin\theta .$$

There is a standard notation for vectors in component form: in fact, there’s more
than one. The first one we’ll mention looks like this:

$$\begin{pmatrix} v_x \\ v_y \end{pmatrix}$$

It is called column notation. In Mathematica we will generally use a “curly
bracket” notation for vectors in component form, {3.5, 6.06} for example, as
we also did for magnitude-direction form. This is potentially confusing: it is
important to keep in mind that curly brackets are a general way within
Mathematica to group things together into lists, and the meaning of a list depends
entirely on the context in which it is used.
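The two formulas for the components translate directly into a conversion rule. In this sketch the name toComponents is invented for the illustration, and the vector {7, 60} is an assumed example, chosen because its components come out close to the {3.5, 6.06} quoted above.

```mathematica
(* Convert {magnitude, direction in degrees} into {horizontal, vertical} components *)
toComponents[{r_, theta_}] := {r Cos[theta Degree], r Sin[theta Degree]}

toComponents[{7, 60}]      (* exact components *)
N[toComponents[{7, 60}]]   (* decimal approximation *)
```

Note that the input list and the output list look alike but mean different things: the first is {magnitude, direction}, the second is {horizontal, vertical}.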

Position vectors
In circumstances where we’ve defined a pair of axes and an origin, O, the vector
OA (A’s displacement from the origin) is called A’s position vector. It tells us
where the point A is.

[Figure: a point A and its position vector OA, drawn on a pair of axes]

For a position vector, the coordinates of A, $(x, y)$ say, are equivalent to the
components of the vector OA, $\begin{pmatrix} x \\ y \end{pmatrix}$. And the polar coordinates of A,
$r = \sqrt{x^2 + y^2}$ and $\theta = \arctan(y/x)$, are equivalent to the magnitude and direction
of OA.

Vector addition in component form

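When the vectors $\begin{pmatrix} a \\ b \end{pmatrix}$ and $\begin{pmatrix} c \\ d \end{pmatrix}$ are added, the resultant is the vector $\begin{pmatrix} a + c \\ b + d \end{pmatrix}$.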

For example,
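As one illustration, with numbers chosen for this sketch rather than taken from the original, Mathematica adds lists component by component, so adding or subtracting vectors in component form needs nothing more than the ordinary + and - signs:

```mathematica
(* Two vectors in component form; the values are arbitrary examples *)
v = {3, 2};
w = {1, -4};

v + w    (* add the corresponding components *)
v - w    (* subtract them, the same as v + (-w) *)
```

Compare this with the effort needed to add the same two vectors in magnitude-direction form.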

Scalar multiplication in component form

With vectors in component form we can perform scalar multiplication simply by
multiplying both components individually by the scalar, no matter whether the
scalar is positive, negative or zero. There’s an interesting point to do with the
zero case: we said earlier that the zero vector, 0, has zero magnitude and
“indeterminate” direction. How is it that we can say definitely that 0 is a vector
whose components are all zero?

Experiment 6: Addition by components


Preparatory reading
There’s an alternative notation for vectors in component form which can be very
useful.

The idea is this: let i be a horizontal vector of length one unit, and let j be a
vertical vector of length one unit. i and j are both unit vectors: their magnitude is
one unit. Then, for example, the vector

$$\begin{pmatrix} 3 \\ 2 \end{pmatrix}$$

is the vector sum (the resultant) of three lots of i and two lots of j, as shown in
the diagram:

And we can write it as:

3i + 2j

The unit vectors i and j are called the Cartesian basis in two dimensions.

Post-experiment reading
The rule for addition of vectors in component form is simply to add together the
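components themselves. Thus:

$$\begin{pmatrix} a \\ b \end{pmatrix} + \begin{pmatrix} c \\ d \end{pmatrix} = \begin{pmatrix} a + c \\ b + d \end{pmatrix}$$

or equivalently: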

(ai + bj) + (ci + dj) = (a + c)i + (b + d)j .

Vector addition is both commutative,

v+w=w+v

for any vectors v and w, and associative,

(a + b) + c = a + (b + c)

for any vectors a, b and c.

Experiment 7: Converting back, and the modulus of a vector
Preparatory reading
The magnitude of a vector is such an important thing that there are several
different notations for it. For a vector v, the magnitude can be written as $v$, like a
normal variable (or without underlining, if you’re writing by hand), or as $|\mathbf{v}|$, using
the “modulus” sign. In fact, the name modulus is often used instead of
magnitude. The modulus (absolute value) of a number and the modulus of a
vector are related quantities: can you see why? The modulus notation is useful
because it can be applied over complicated expressions, for example:

means the modulus of the result of adding together all the vectors inside. Note
that, although the modulus signs look like brackets, they definitely don’t behave
like brackets, for example:

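$$3|\mathbf{a} + \mathbf{b}| \ne 3\mathbf{a} + 3\mathbf{b} \ne 3|\mathbf{a}| + 3|\mathbf{b}|$$

in general.

The magnitude, or modulus, of the vector $x\mathbf{i} + y\mathbf{j}$, for any $x$ and $y$, is $\sqrt{x^2 + y^2}$.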
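Converting back from components to magnitude-direction form needs this modulus together with an angle. The sketch below uses the two-argument form ArcTan[x, y], which takes account of the quadrant in which the point (x, y) lies; that choice, and the name toMagnitudeDirection, belong to this illustration and are not necessarily the route taken in the experiment.

```mathematica
(* Convert {x, y} components into {magnitude, direction in degrees} *)
toMagnitudeDirection[{x_, y_}] := {Sqrt[x^2 + y^2], ArcTan[x, y]/Degree}

N[toMagnitudeDirection[{3, 4}]]
N[toMagnitudeDirection[{-3, 4}]]
```

The two vectors have the same magnitude, but their directions differ; compare what ArcTan[4/3] and ArcTan[4/(-3)] would give with one argument.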

deli-3, $11

Plot[ArcTan[x], {x, -

s fro
Vectors 2

Experiment 1: The scalar product


Preparatory reading
In Vectors 1 (Experiment 3) we looked at what happens when we multiply a
vector by a scalar. Now we want to ask: what does it mean to “multiply” a vector
by a vector?

What this means is by no means obvious. We can motivate things with an
important physical law. Newton’s Principle of Work can be stated as:

“The work done by a force F in moving through a displacement r at an angle of θ
degrees to its line of action is |F||r| cos θ.”

This will be our definition for the scalar product. Given two vectors a and b
with an angle θ between them, their scalar product is:

$$\mathbf{a}\cdot\mathbf{b} = |\mathbf{a}|\,|\mathbf{b}|\cos\theta$$

The product is denoted by a dot, we say “a dot b”, and so it is often also called the
dot product. It’s called a product because it behaves in some ways like the
product operation in arithmetic. And it’s called the “scalar product” rather than
just the product of vectors because there’s another kind of product, the “vector
product”, which will appear with three-dimensional vectors in a later experiment.

Post-experiment reading
If a.b = 0 and neither a nor b is zero then the angle between a and b must be 90
degrees.

The properties of the scalar product:

(i) the scalar product is a number,

(ii) a.a = |a|²,

(iii) a.b = b.a (commutativity),

(iv) a.(b + c) = a.b + a.c (distributivity).

What about associativity? That is meaningless for the scalar product, since a.b is a
scalar and so (a.b).c cannot be performed.

Experiment 2: The scalar product by components, and angles between vectors
Preparatory reading
It is often important to find the angles between two vectors. In two dimensions,
it’s not hard to convert the vectors into magnitude-direction form and then
subtract the angles, but in three dimensions, which we move to in the next
experiment, there is no single angle for direction and the problem is harder. In
both cases we can do the whole thing easily in component form by making use of
the scalar product.

Post-experiment reading
Given any two vectors whose components are $\mathbf{a} = a_1\mathbf{i} + a_2\mathbf{j}$, $\mathbf{b} = b_1\mathbf{i} + b_2\mathbf{j}$, then

$$\mathbf{a}\cdot\mathbf{b} = a_1 b_1 + a_2 b_2 .$$

In words, then, the rule for doing scalar products in component form is: multiply
together the corresponding components, and add up the answers.

The formula for the angle between two vectors is a simple rearrangement of the
scalar product formula:

$$\mathbf{a}\cdot\mathbf{b} = ab\cos\theta \quad\Rightarrow\quad \cos\theta = \frac{\mathbf{a}\cdot\mathbf{b}}{ab}$$

It is often very important to know whether two (non-zero) vectors are
perpendicular or not (whether the angle between them is 90 degrees). To do this,
we only need to check if a.b = 0; we need not go to the trouble of finding out the
vector magnitudes as well.
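Both the angle formula and the perpendicularity test are short enough to try directly. This sketch relies on Mathematica's built-in dot operator and Norm; the name angleBetween and the sample vectors are choices made for the illustration.

```mathematica
(* Angle in degrees between two non-zero vectors given in component form *)
angleBetween[a_, b_] := N[ArcCos[(a . b)/(Norm[a] Norm[b])]/Degree]

angleBetween[{1, 0}, {1, 1}]

(* Perpendicularity needs only the scalar product itself *)
{3, 4} . {4, -3}
```

The second result is zero, so those two vectors are perpendicular, and no magnitudes had to be worked out to find that.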
Experiment 3: Vectors in three dimensions - i, j and k
Preparatory reading
You should be familiar with the fact that any point in three-dimensional space can
be represented by three coordinates, (x, y , z). In a similar way, vectors in three
dimensional space can be represented by three perpendicular components in the
directions of the x, y and z axes. We can write down a 3-D vector in column form,
for example:

The Cartesian basis (see Vectors 1, Experiment 6) can be extended from two into
three dimensions by introducing a third basis vector, k, which is one unit long in
the z direction, in addition to i and j. This diagram shows the vector 2i + 3j + 4k:

We won’t here be using any magnitude-direction form for 3-D vectors, although
it is possible to do so: the complication is that two angles are required to specify
the direction of a vector, which makes the trigonometry pretty hard.

Adding vectors in three dimensions works in just the same way as in two: simply
add the components. For example:

(2i + 3j + 4k) + (3i - 5j + k) = 5i - 2j + 5 k .

The same geometrical constructs for vector addition, the Triangle Rule and
Parallelogram Rule, can be applied in three dimensions as in two. Since we will
always be working in component form for 3-D vectors we won’t (fortunately)
need to use the Rules for calculating vector additions, but it’s still important to be
aware geometrically of what’s happening in the addition process.

Post-experiment reading
This diagram shows the geometrical construction required to determine v, the
magnitude of the vector 2i + 3j + 4k:
In the lower right-angled triangle, with sides of length 2, 3 and $w$, Pythagoras’
theorem tells us:

$$w^2 = 2^2 + 3^2$$

and then in the upper right-angled triangle, with sides of length $w$, 4 and $v$, again
by Pythagoras’ theorem:

$$v^2 = w^2 + 4^2 = 2^2 + 3^2 + 4^2 = 29 .$$

Hence the magnitude is $v = \sqrt{29}$.

The magnitude of any vector having components $x\mathbf{i} + y\mathbf{j} + z\mathbf{k}$ is $\sqrt{x^2 + y^2 + z^2}$.
(You could verify this simply by replacing the numbers 2, 3 and 4 in the above
diagram with $x$, $y$ and $z$.)
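The two-step Pythagoras argument can be checked against Mathematica's built-in Norm, which gives the magnitude of a list of components. The variable names below are just for this sketch.

```mathematica
(* The two right-angled triangles from the diagram, one after the other *)
w = Sqrt[2^2 + 3^2];
v = Sqrt[w^2 + 4^2]

(* The same magnitude in a single step *)
Norm[{2, 3, 4}]
v == Norm[{2, 3, 4}]
```

The last line should give True.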

Experiment 4: The scalar product and angles between vectors in three dimensions
Preparatory reading
In this experiment we look at the scalar product in three dimensions. The
definition for the scalar product of two three-dimensional vectors a and b is, as it
was in two dimensions:

$$\mathbf{a}\cdot\mathbf{b} = |\mathbf{a}|\,|\mathbf{b}|\cos\theta$$

where θ is the angle between the vectors. One way to think about this is that a
and b lie in a (unique) plane in 3-D space, and within that plane we are once again
in a 2-D space. The formula for the angle between two three-dimensional vectors
is, as before, a simple rearrangement of the scalar product formula:

$$\theta = \cos^{-1}\!\left(\frac{\mathbf{a}\cdot\mathbf{b}}{ab}\right)$$

where $a$ and $b$ denote the magnitudes of a and b.

This formula gives us a quick test for perpendicularity of vectors: if the dot
product of two (non-zero) vectors is zero, those vectors are perpendicular,
otherwise they cannot be.

rt [31-2k*Sur

Post-experiment reading
The general rule for calculating the scalar product of 3-D vectors in component
form is the same as it was for 2-D vectors: multiply together the corresponding
components, and add up the answers.

The following diagram shows how the angles between the vectors a, -a, b and -b
are related:

One can read off that, for example,

(angle between a and -b) = 180 - (angle between a and b).

Experiment 5: The vector product


Preparatory reading
In this experiment we return again to the question of the possible meaning of
multiplying a vector with a vector. This time we’ll define a product whose
outcome is a vector rather than a scalar, hence it is called the “vector product”. As
with the scalar product we’ll motivate the definition with a situation in mechanics
where the product appears naturally.

You may recall from your work in mechanics this definition for the moment of a
force in two dimensions:
Given a force F which acts through a point P, whose position vector is r, the
moment of F about the origin O is defined to be

$$|\mathbf{r}|\,|\mathbf{F}|\sin\theta$$

where θ is the angle between r and F.

In two dimensions we need to state whether the moment is a clockwise or
anticlockwise one. In the above diagram a clockwise moment is shown, one
which would cause a clockwise rotation if the system were pivoted at O. In three
dimensions the situation is more complicated because we need to state not only
the sense (clockwise or anticlockwise) of the moment, but also the direction of the
axis about which it acts: that is, the axis about which the system would rotate if it
were able to do so freely.

Hence a moment in three dimensions possesses both a magnitude, |r||F| sin θ, and
a direction: the direction of the axis of rotation. In short, it must be a vector
quantity.

In the above diagram, if a three-dimensional moment is represented then we can
say that it shows a moment T of magnitude |r||F| sin θ which is directed at right
angles to the paper.

Our argument so far is deficient in one vital respect: consider what the problem is,
and how it might be solved, before reading any further.

The problem is: there are two directions at right angles to the paper: into it, and
out of it. We have to specify which. How can we do that?
That’s a bit of a trick question, because there is no natural solution to the problem.
We simply have to make a choice - an arbitrary one - and stick to it from here
on.

The universally-accepted rule is this: choose the direction of T so that the triple of
vectors (r, F, T) forms a right-handed system.

This is a difficult idea; here’s what it means:

(i) Imagine yourself with a screwdriver in your right hand.

(ii) Approach the vectors r and F in such a way that when you turn the
screwdriver in the tightening (clockwise) direction it turns from r to F.

(iii) Then the direction in which the screwdriver is pointing is the
direction of T.

So, then, according to this rule what is the direction of the moment T in the
diagram above?

Answer: into the paper, as the following diagram demonstrates:

[Figure: the screwdriver turning from r to F, pointing into the paper]

The vector moment is one example of a general product of vectors. In general,
given two vectors a and b, at an angle θ from one another, we define their vector
product as:

$$\mathbf{a} \times \mathbf{b} = (|\mathbf{a}|\,|\mathbf{b}|\sin\theta)\,\hat{\mathbf{n}}$$

where $\hat{\mathbf{n}}$ denotes the unit vector which is perpendicular to both a and b in such a
way that the triple of vectors (a, b, $\hat{\mathbf{n}}$) forms a right-handed system.

Vector product is often also called the cross product, because of the “cross”
symbol used in its notation. Another notation you may sometimes see, which
means exactly the same thing, is $\mathbf{a} \wedge \mathbf{b}$.

Post-experiment reading
The result that i × j = -j × i shouldn’t be too surprising, since in the Preparatory
Reading we spoke about the vector moment T = r × F having a direction defined
by turning an imaginary screwdriver from r to F. If instead we wanted to turn
from F to r we must be describing the vector -T = F × r.

There’s another way to look at the results for i × j etc. We know that the
Cartesian basis (i, j, k) is a right-handed system. So when we seek the vector i × j
we’re looking for a vector of magnitude 1 (a unit vector) in the direction which
forms a right-handed system with i and j: that’s got to be k. If we consider now
the permutations (“shufflings”) of (i, j, k), then just two of them are right-handed:
(j, k, i) and (k, i, j). These imply a couple more cross product results. The other
permutations, (j, i, k), (k, j, i) and (i, k, j), are not right-handed which tells us
immediately that j × i can’t be the same as i × j.

General properties of the vector product:

(i) the outcome of the vector product is always a vector;

(ii) for any vector a, a × a = 0;

(iii) the vector product is not commutative, because for any a and b,

a × b = -b × a

We say that the vector product is anti-commutative, because the product one way
around is the negative of the product the other way around.

(iv) the vector product is distributive: a × (b + c) = a × b + a × c for all vectors a,
b and c.

(v) the vector product is not associative: the relation (a × b) × c = a × (b × c) is not
generally true.

The fact that the vector product is distributive is really the thing which justifies
our calling it a “product” in the first place.
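These properties can be spot-checked with Mathematica's built-in Cross function. The sketch below is a check on particular vectors, not a proof; the names iHat, jHat, kHat, a, b and c and the sample components are chosen for this illustration.

```mathematica
(* The Cartesian basis in component form *)
iHat = {1, 0, 0}; jHat = {0, 1, 0}; kHat = {0, 0, 1};

Cross[iHat, jHat] == kHat      (* right-handed system *)
Cross[jHat, iHat] == -kHat     (* anti-commutative *)

(* Distributivity on three sample vectors *)
a = {1, 2, 3}; b = {4, 5, 6}; c = {7, 8, 10};
Cross[a, b + c] == Cross[a, b] + Cross[a, c]

(* Non-associativity: these two results differ *)
Cross[Cross[iHat, iHat], jHat]
Cross[iHat, Cross[iHat, jHat]]
```

The first of the last two lines gives the zero vector, because i × i = 0, while the second gives i × k, which is not zero.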

Experiment 6: The vector product in component form
Preparatory reading
In this experiment we seek a general formula for the vector product of vectors
expressed in component form. The distributivity property found in the last
experiment is key for this, because it means we can expand out the brackets in the
general expression:
$$(a_x\mathbf{i} + a_y\mathbf{j} + a_z\mathbf{k}) \times (b_x\mathbf{i} + b_y\mathbf{j} + b_z\mathbf{k})$$

where we’re writing any vector a as $a_x\mathbf{i} + a_y\mathbf{j} + a_z\mathbf{k}$ with $a_x$ denoting the x
component of a, and so forth.

Post-experiment reading
The general formula for the vector product in component form is:

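$$\mathbf{a} \times \mathbf{b} = (a_y b_z - a_z b_y)\,\mathbf{i} + (a_z b_x - a_x b_z)\,\mathbf{j} + (a_x b_y - a_y b_x)\,\mathbf{k} .$$

It turns out that the formula can also be written as a determinant of a matrix:

$$\mathbf{a} \times \mathbf{b} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_x & a_y & a_z \\ b_x & b_y & b_z \end{vmatrix} .$$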
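The component formula can be compared, symbol by symbol, with what Mathematica's built-in Cross function returns. In this sketch ax, ay, az, bx, by and bz are plain symbols standing for the components, and the name componentFormula is made up for the illustration.

```mathematica
(* General vectors with symbolic components *)
a = {ax, ay, az};
b = {bx, by, bz};

(* The component formula, written as a list of i, j and k components *)
componentFormula = {ay bz - az by, az bx - ax bz, ax by - ay bx};

(* The difference should expand to the zero vector *)
Expand[Cross[a, b] - componentFormula]
```

For this to work, none of the six symbols may have been given a value earlier in the session.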
Vectors 3

Experiment 1: Vector equation of a line


Preparatory reading
This module is concerned with applying some of the vector theory covered in the
earlier modules, Vectors 1 and Vectors 2, to the analysis of straight lines in two
and three dimensions, and planes in three dimensions.

Consider the problem of extending the familiar and useful x-y coordinate
geometry to three dimensions. For instance, the equation of a straight line in two
dimensions is always of the form

y = mx + c, where m and c are constants.

Finding the equation of a straight line in three dimensions turns out to be


surprisingly difficult unless we adopt a vector approach. In this experiment we’ll
look first at the 2-dimensional equation in vector form, and then move up to 3
dimensions.

Post-experiment reading
To specify a straight line, in two or three dimensions, we need just two pieces of
information:
(i) the position of a point on the line, and
(ii) the direction of the line.

Each of these pieces of information can be specified by a vector: we can specify a
point by its position vector from the origin (let’s call this vector a), and the
direction of the line through that point can be specified by a second vector, d,
chosen to be parallel to the line.

Then, any point on the straight line will have the position vector

r = a + t d,

where the parameter t is some real number. This expression is the general vector
equation of a line. The vector d is called the direction vector of the line. Note
that it’s only the direction of d that matters, not its magnitude (provided that d
is not the zero vector).

It’s not difficult to convert between a vector equation and the equivalent Cartesian
equation in the 2-dimensional case. Consider the line

r = i + 3j + t(2i - j), or r = (1 + 2t)i + (3 - t)j.

If a general point on the line has coordinates ( x , y), then its position vector is
r = x i + y j. Eliminate r, and equate the coefficients of i and j:

$$x = 1 + 2t, \quad y = 3 - t \quad\Rightarrow\quad y = \tfrac{1}{2}(7 - x).$$
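
One hedged way to check this elimination in Mathematica is to ask Solve for y and t in terms of x. This is our own sketch rather than a command from the experiment.

```mathematica
ClearAll[x, y, t];

(* Component equations of the line r = i + 3j + t(2i - j) *)
eqns = {x == 1 + 2 t, y == 3 - t};

(* Solving for y and t in terms of x eliminates the parameter *)
sol = Solve[eqns, {y, t}];

(* Compare with the Cartesian form y = (7 - x)/2 *)
Simplify[(y /. First[sol]) == (7 - x)/2]    (* True *)
```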

The conversion in the 3-dimensional case is more complicated, and is the subject
of the next experiment.

Experiment 2: Cartesian equations of a line in 3-D
Preparatory reading
We wish to convert the vector equation for a straight line in three dimensions to a
Cartesian equation, which is expressed in terms of x, y and z, the coordinates of
the points on the line. Let’s start by looking at the line whose vector equation is

r = i - 2j + k + t(2i + 5j + 3k), or

r = (1 + 2t)i + (-2 + 5t)j + (1 + 3t)k.


If a general point on the line has coordinates (x, y, z), then its position vector is:

r = xi + yj + zk.

It follows that

xi + yj + zk = (1 + 2t)i + (-2 + 5t)j + (1 + 3t)k,

from which we deduce that

x = 1 + 2t, y = -2 + 5t, z = 1 + 3t.

Making t the subject of each of these equations gives:

$$t = \frac{-1 + x}{2}, \quad t = \frac{2 + y}{5}, \quad t = \frac{-1 + z}{3}.$$

Finally, equating the three right-hand sides gives:

$$\frac{-1 + x}{2} = \frac{2 + y}{5} = \frac{-1 + z}{3}.$$

This set of equations is true whenever the point (x, y, z ) lies on the line and is false
otherwise. It therefore describes the line completely. We call these equations the
Cartesian equations for this line (or sometimes, loosely, the Cartesian equation).
Notice the strong resemblance between the Cartesian equations we ended up with
and the vector equation they came from.
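
The claim that every point of the line satisfies the Cartesian equations can be checked with a short sketch. The name p for a general point on the line is our own.

```mathematica
ClearAll[t];

(* A general point on the line r = i - 2j + k + t(2i + 5j + 3k) *)
p = {1, -2, 1} + t {2, 5, 3};

(* The three fractions in the Cartesian equations, evaluated at that point *)
Simplify[{(-1 + p[[1]])/2, (2 + p[[2]])/5, (-1 + p[[3]])/3}]
(* {t, t, t} *)
```

All three fractions reduce to the same value t, so they are equal for every point on the line.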

Post-experiment reading
In part 3 of this experiment you met vector equations where one or two of the
direction coefficients (the coefficients of the direction vector) were set to zero.

Consider the line r = 2i + 3j - k + t(3i - j), whose z direction coefficient is zero.
This rearranges to the three equations:

$$t = \frac{x - 2}{3}, \quad t = 3 - y, \quad z = -1.$$

z does not depend on t, so the line has a constant value of z and we write its
Cartesian equations as:

$$\frac{x - 2}{3} = 3 - y, \quad z = -1.$$

Geometrically, this means that the line runs perpendicular to the z-axis. Consider
next the line r = 5i - 6 j + 2k + t j , which has two zero direction coefficients.
Rearranging as before:

x = 5, t = y + 6, z = 2.

So this line has both x and z constant, and y may take any value. The latter fact is
understood implicitly because the Cartesian equations are written simply as:

x=5, z=2.

Geometrically, this line runs parallel to the y-axis (compare the 2-dimensional
case of a line like x = 5).

Experiment 3: Parallel, intersecting and skew lines
Preparatory reading
In two dimensions, two straight lines either intersect or are parallel. In three
dimensions, things are more complicated because the lines may be neither
intersecting nor parallel. Such lines are called skew lines.

The question is, how can we tell if two lines are parallel, intersecting or skew by
looking at their equations? The parallel case is quite straightforward. For
example, these two lines are parallel:

r1 = 3i + j + s(2i - 3j + k), r2 = 5i + 3j + 2k + t(4i - 6j + 2k).


The reason is that the second direction vector is a scalar multiple of the first:

4i - 6 j + 2k = 2(2i - 3j + k) .

If the two direction vectors are parallel, then so are the lines.

The harder question is this: given that two lines aren’t parallel, how can we tell
whether they’re intersecting or skew? We’ll approach this by finding out if they
intersect. For example, how do we decide whether the lines:

r1 = -2i - j + 7k + s(5i - 3j + 2k), r2 = 3i - 4j + 11k + t(2i + j - k)

(which clearly aren’t parallel) intersect or not?

Broadly, the approach should be to ask “if they do intersect, what are the values
of s and t at the point of intersection?” If there is a point of intersection, then at
that point we have r1 = r2, from which we deduce (by equating the i, j and k
components) that:

-2 + 5s = 3 + 2t, -1 - 3s = -4 + t, 7 + 2s = 11 - t.

Think about trying to solve these equations simultaneously.
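
If you want to check your own attempt afterwards, here is one possible sketch: solve the i and j component equations for s and t, then test whether those values also satisfy the k component equation. The component lists are read off from the two line equations above.

```mathematica
ClearAll[s, t];

(* The two lines, written as component lists *)
r1 = {-2, -1, 7} + s {5, -3, 2};
r2 = {3, -4, 11} + t {2, 1, -1};

(* Solve the i and j component equations for s and t *)
sol = Solve[Thread[Take[r1, 2] == Take[r2, 2]], {s, t}]
(* {{s -> 1, t -> 0}} *)

(* Do these values also satisfy the k component equation? *)
(r1[[3]] == r2[[3]]) /. sol
(* {False} *)
```

Three equations in two unknowns usually have no common solution. Here the third equation fails, so these two lines do not meet: they are skew.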



Post-experiment reading
You probably noticed that the PlotLines3D command was calculating the
distance of “closest approach” for skew lines. The mathematics behind this is a
little beyond the scope of this module, so we’ll just give an outline that you can
fill in if you wish. The distance between any point on one line and any point on
another line is the magnitude of the vector joining the points. This distance is a
function of the parameters of each line; let’s call it D(s, t). At the closest
approach of the lines we have both ∂D/∂s = 0 and ∂D/∂t = 0. Solving these
equations simultaneously for s and t tells us the end-points, and hence the
magnitude, of the line of closest approach.
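
The outline can be sketched in Mathematica as follows. As an assumption for illustration we use the pair of lines from the preparatory reading, and we minimise the squared distance (called d2 here, our own name), which has its minimum at the same values of s and t as the distance itself. This is not necessarily how PlotLines3D does the calculation.

```mathematica
ClearAll[s, t];

(* The two lines from the preparatory reading *)
r1 = {-2, -1, 7} + s {5, -3, 2};
r2 = {3, -4, 11} + t {2, 1, -1};

(* Squared distance between general points on the two lines *)
d2 = Expand[(r1 - r2).(r1 - r2)];

(* Both partial derivatives vanish at the closest approach *)
sol = Solve[{D[d2, s] == 0, D[d2, t] == 0}, {s, t}];

(* Distance of closest approach *)
N[Sqrt[d2 /. First[sol]]]    (* approximately 1.544 *)
```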

Part 6 of the experiment asked you to find the angle between a pair of straight
lines. This ought to be simple: it's just the angle between the direction vectors
which we can find using the scalar product in the normal way. Or can we? The
angle between the lines given in the experiment is:

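$$\cos^{-1}\left[\frac{(5\mathbf{i} - 3\mathbf{j} + 2\mathbf{k}) \cdot (-2\mathbf{i} + \mathbf{j} + \mathbf{k})}{|5\mathbf{i} - 3\mathbf{j} + 2\mathbf{k}|\,|-2\mathbf{i} + \mathbf{j} + \mathbf{k}|}\right] = \cos^{-1}\left(\frac{-11}{\sqrt{38}\sqrt{6}}\right) = 136.76^\circ$$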

The problem is that every pair of lines defines two angles, and usually we want
the acute one: the one less than 90 degrees.

There are two ways round this. Either subtract the answer from 180 degrees if it's
obtuse, or (and this is quite clever, and has exactly the same effect) stop the
problem arising in the first place by knocking any minus signs off the dot product.
Thus the acute angle between the lines is:

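$$\cos^{-1}\left(\frac{|(5\mathbf{i} - 3\mathbf{j} + 2\mathbf{k}) \cdot (-2\mathbf{i} + \mathbf{j} + \mathbf{k})|}{|5\mathbf{i} - 3\mathbf{j} + 2\mathbf{k}|\,|-2\mathbf{i} + \mathbf{j} + \mathbf{k}|}\right) = 43.24^\circ$$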
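
Both versions of the calculation can be reproduced with built-in commands. The names d1, d2 and c are our own.

```mathematica
(* Direction vectors of the two lines *)
d1 = {5, -3, 2};
d2 = {-2, 1, 1};

(* Cosine of the angle between the direction vectors *)
c = d1.d2/(Norm[d1] Norm[d2]);

N[ArcCos[c]/Degree]         (* about 136.76, the obtuse angle *)
N[ArcCos[Abs[c]]/Degree]    (* about 43.24, the acute angle *)
```

Taking the absolute value before applying ArcCos is the "knocking off the minus sign" trick described above.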

Experiment 4: Planes
Preparatory reading
To specify a plane in 3-D, we need to know two things:
(i) a point that the plane goes through, and
(ii) a direction to which the plane is perpendicular.

As with the straight line, both of these things can be specified by vectors. Study
the diagram on the next page. We have the following problem: to set up a vector
equation for the plane passing through the point A, which has position vector a =
OA, perpendicular to the vector n.

This is really the same as saying: “make a statement that is true for all points P in
the plane, and false for all those outside it”. In this experiment you’ll explore
how such a statement can be made.

Post-experiment reading
Given any point P in the plane (with position vector r = OP), one thing we can
say is that the vector AP is perpendicular to n. In other words,

AP . n = 0.

But OA + AP = OP and therefore

AP = OP - OA = r - a.

Thus (r - a) . n = 0, or

r . n = a . n.

This is what we mean by the (general) vector equation of a plane.

Here’s how it works in practice. Suppose we want to find an equation for the
plane passing through the point (2, -3, 1) perpendicular to the vector 4i + 5j + 4k.
We set:

n = 4i + 5j + 4k, a = 2i - 3j + k

and thus, for all points in the plane, the position vector r satisfies

r . (4i + 5j + 4k) = (2i - 3j + k) . (4i + 5j + 4k),

i.e. r . (4i + 5j + 4k) = -3.

The vector 4i + 5 j + 4k is called the plane’s normal vector.

In the experiment we used an alternative form of vector equation for a plane:

r = A + λB + μC

where A is a point in the plane, B and C are two vectors parallel to the plane
(but not parallel to each other), and λ and μ are parameters.

Cartesian equation of a plane


If the general vector r has a component representation xi + y j + zk, can you find a
Cartesian equation for the plane r . (4i + 5 j + 4k) = -3 ? Try this before reading
on.

Substituting for r in the vector equation, we get:

r . (4i + 5j + 4k) = (xi + yj + zk) . (4i + 5j + 4k) = 4x + 5y + 4z,

so the Cartesian equation of the plane is 4x + 5y + 4z = -3.

Every plane has a Cartesian equation of the form ax + by + cz = constant. There
are no complications like there were in the case of lines.
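
The same conversion can be sketched in Mathematica by writing r . n = a . n with a general position vector. The names n, a and r follow the text; the lists are just their components.

```mathematica
ClearAll[x, y, z];

n = {4, 5, 4};     (* normal vector *)
a = {2, -3, 1};    (* position vector of the known point *)
r = {x, y, z};     (* general position vector *)

(* Vector equation r.n == a.n written out in components *)
r.n == a.n         (* 4 x + 5 y + 4 z == -3 *)

(* The known point itself satisfies the equation *)
(r.n == a.n) /. {x -> 2, y -> -3, z -> 1}    (* True *)
```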

Angle between two planes


Any two planes which aren’t parallel intersect. The angle between them is
defined as being:

the angle between the two lines, one in each plane, which are perpendicular to
the line of intersection.

The diagram below illustrates the two lines referred to.

The angle between two planes is the same as the angle between their normal
vectors, as the following “edge-on” view shows.

The planes are shown, “edge-on”, as thick lines. The normal vectors are shown as
arrows. The fact that the two angles are equal is clear.

To find the angle between two planes, then, all we need to do is find the angle
between their normal vectors. Remember, though, that this angle may be obtuse,
in which case we would normally want to subtract it from 180 degrees (as with
lines, it’s usually the acute angle we’re after).
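
As a hedged illustration, here is the acute angle between the plane 4x + 5y + 4z = -3 and the plane z = 0, found from their normal vectors. The choice of z = 0 as the second plane, and the names n1 and n2, are our own.

```mathematica
n1 = {4, 5, 4};    (* normal to 4x + 5y + 4z = -3 *)
n2 = {0, 0, 1};    (* normal to z = 0 *)

(* Abs keeps the cosine positive, so the result is the acute angle *)
N[ArcCos[Abs[n1.n2]/(Norm[n1] Norm[n2])]/Degree]    (* about 58.0 *)
```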

There is no experiment for looking at angles between planes, but we have
provided a special command called PlotPlanes that may be used to visualize
planes and their intersections. For example, this command plots a single plane:

PlotPlanes[{4x+5y+4z==-3},
{x,-10,10}, {y,-10,10},
{z,-10,10}]

You may need to adjust the x, y and z plotting ranges for best viewing of some
planes. This command has many options (check with ?PlotPlanes); one you
may want to use is EquationType->NormalEquation, so that equations
may be entered in vector form. Another useful option is ViewPoint, which
allows the viewing position to be moved (see Vectors 2, Experiment 3). You can
plot a number of planes simultaneously by making a list of them as the first input
to the command. For example:

PlotPlanes[{4x+5y+4z==-3, z==0}, {x,-10,10},
{y,-10,10}, {z,-10,10}]

Matrices

Introduction
We can start by thinking of a matrix (plural matrices) as simply a rectangular
array (i.e. a table) of numbers, like this excerpt from a personal finances
spreadsheet:

In fact, this is something of an oversimplification: a matrix is really a rather
special kind of array, with special significance and properties. In this module we
will study the use of matrices to represent transformations, geometrical objects
and equations.

Notation
Matrices
The usual notation for a matrix is as a rectangular array of numbers enclosed by
big brackets, like this:

In Mathematica each row of a matrix is represented by a list and the whole matrix
is shown as a list of those lists, so, for example:

$\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ is represented as {{a, b}, {c, d}} and

$\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}$ as {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}}.

Vectors
Vectors are considered as matrices having only one row or column:

the row vector (3 4) is represented as {{3, 4}} and

the column vector $\begin{pmatrix} x \\ y \end{pmatrix}$ as {{x}, {y}}.

In text we normally refer to vectors using lower-case bold letters, e.g. v, r, b.
Similarly we usually use UPPER-CASE BOLD for matrices, e.g. A, M, I.

(In hand-written work the modern convention is simply to use lower-case letters
for vectors and upper case for matrices, and their being vectors or matrices is
implicit from their context. An older convention is to underline vectors and
double-underline matrices.)

Elements of a matrix
Each element of a matrix (sometimes called an entry) is referred to by its row and
column number (in that order). For example, we might say that the “row 2,
column 3” element of

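is 7. We often use row-column suffixes to identify the matrix elements:

$$\mathbf{A} = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}$$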

In Mathematica we use the Part command to select elements. If the matrix
above has been defined in Mathematica by this statement:

A = {{a11, a12, a13}, {a21, a22, a23}, {a31, a32, a33}}

then we can access the first row as:

Part[A, 1]

for which there is also the shorthand form:

A[[1]]

Similarly for the other two rows. Individual elements can be accessed by, for
example:

A[[1, 3]]

which is short for:

Part[A, 1, 3]
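
Here is the same idea on a small numerical matrix of our own choosing, called m purely for illustration.

```mathematica
m = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};

Part[m, 2]           (* {4, 5, 6}, the second row *)
m[[2]]               (* {4, 5, 6}, the shorthand form *)
m[[2, 3]]            (* 6, the row 2, column 3 element *)
Transpose[m][[1]]    (* {1, 4, 7}, the first column *)
```

Part selects rows first, so one simple way to pick out a column is to transpose the matrix and then take a row.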

Experiment 1: Transformations of the plane


Preparatory reading
Matrices are a way of representing certain types of transformations of vectors.
We will work in the 2-dimensional Cartesian plane. The point with coordinates
(x, y) will be represented by the column vector

$$\begin{pmatrix} x \\ y \end{pmatrix}.$$

We can get matrices to "act on" vectors in the following way. Given a 2 x 2
square matrix A and a two-dimensional column vector v, we define the matrix
product Av according to the following rule:

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} ax + by \\ cx + dy \end{pmatrix}.$$

This defines a new, transformed point. We describe this operation as multiplying
A by v (see Experiment 2 for a fuller description).
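
In Mathematica the dot operator carries out this multiplication. The sketch below applies it first to a general 2 x 2 matrix and then to a numerical example. The matrix rot, which rotates the plane by 90 degrees anticlockwise, is our own illustrative choice.

```mathematica
ClearAll[a, b, c, d, x, y];

(* General 2 x 2 matrix acting on a general column vector *)
{{a, b}, {c, d}}.{{x}, {y}}
(* {{a x + b y}, {c x + d y}} *)

(* A numerical example: rotating the point (3, 4) by 90 degrees *)
rot = {{0, -1}, {1, 0}};
rot.{{3}, {4}}
(* {{-4}, {3}} *)
```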

We’re also going to describe points


Post-experiment reading
The transpose of a matrix is made by interchanging rows and columns. Thus, the
transpose of the matrix

is

We can represent a set of column vectors by a single rectangular matrix: the four
points

can be represented by the matrix

Certain kinds of transformations can be represented by 2 x 2 square matrices of
numbers, which multiply the 2 x 1 column vector of any point. Matrices are
associated with linear transformations: those in which straight lines are
transformed into straight lines (although they may be stretched, rotated, reflected,
or any combination of those three).

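A good way of identifying the transformation associated with any matrix is to
look at how it affects the unit vectors. It helps to bear in mind that

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} a \\ c \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} b \\ d \end{pmatrix}.$$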

Two important general points come out of this experiment. First, notice that we
can use matrices to represent transformations and to represent sets of points we
wish to transform. This is one example among many of the remarkable versatility
of the matrix idea. Secondly, the idea of multiplying a matrix by a vector that we
took into this experiment has led us naturally into a method for multiplying a
matrix by another matrix. This is explored further in Experiment 2.

Experiment 2: Matrix multiplication


Preparatory reading
A matrix is said to have dimension m x n if it has m rows and n columns:

$\begin{pmatrix} a & b & c \\ d & e & f \end{pmatrix}$ has dimension 2 x 3,

$\begin{pmatrix} x \\ y \end{pmatrix}$ has dimension 2 x 1, etc.

It’s important to remember which way round it goes!

A square matrix has the same number of rows as columns (m = n). A column
vector has only one column (n = 1). A row vector has only one row (m = 1).

Not all matrix multiplications are possible. In this experiment you will find how
matrix multiplication works, and when it doesn’t!

Post-experiment reading
Two matrices can only be multiplied together if the number of columns of the first
is equal to the number of rows of the second. To put it another way, if a matrix A
has dimension m x n and a matrix B has dimension p x q then the product matrix
AB only exists if n = p.

If the product exists, then the matrix AB has dimension m x q.

Matrix multiplication is not in general commutative: AB is not necessarily equal
to BA. As you can see from the dimensionality conditions, if AB exists then BA
may not exist at all, or may be a matrix of completely different dimension. Because
order is important, there are two useful terms for referring to it: in the product
AB we may say that A pre-multiplies B, or that B post-multiplies A.

Each element of the product is a sum of products: the element in row i and
column j of AB is found by multiplying each element of row i of A by the
corresponding element of column j of B, and adding the results. This is more
difficult to say in words than to understand in practice. Using the subscript
notation it looks like this:

$$(\mathbf{AB})_{ij} = \sum_{k=1}^{n} a_{ik}\, b_{kj}.$$
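
The rule can be tried out directly in Mathematica, where the dot operator performs matrix multiplication. The two matrices below are our own illustrative choices, not taken from the experiment.

```mathematica
ClearAll[k];

a = {{1, 2, 3}, {4, 5, 6}};       (* dimension 2 x 3 *)
b = {{1, 0}, {0, 1}, {2, -1}};    (* dimension 3 x 2 *)

a.b                (* {{7, -1}, {16, -1}} *)
Dimensions[a.b]    (* {2, 2} *)
Dimensions[b.a]    (* {3, 3} *)

(* Row 1, column 2 element of a.b, built from row 1 of a and column 2 of b *)
Sum[a[[1, k]] b[[k, 2]], {k, 1, 3}]    (* -1 *)
```

Both products exist here, but they have different dimensions, so they cannot possibly be equal.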

Addition and subtraction of matrices


We have not said anything yet about addition and subtraction of matrices.
There’s not a great deal to say. Matrices can only be added and subtracted when
they have identical dimensions, and the process is carried out by adding or
subtracting the corresponding elements. For example:

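$$\begin{pmatrix} 1 & -3 \\ 0 & 5 \end{pmatrix} + \begin{pmatrix} -2 & 6 \\ 1 & -2 \end{pmatrix} = \begin{pmatrix} -1 & 3 \\ 1 & 3 \end{pmatrix},$$

$$\begin{pmatrix} a & b & c \end{pmatrix} - \begin{pmatrix} 1 & 0 & -1 \end{pmatrix} = \begin{pmatrix} a - 1 & b & c + 1 \end{pmatrix}.$$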

Scalar multiplication
To multiply a matrix by a scalar you simply multiply every element by the scalar,
for example:

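$$k\begin{pmatrix} -1 & 0 & z \\ a & 3 & k \\ k^2 & 5 & 0 \end{pmatrix} = \begin{pmatrix} -k & 0 & kz \\ ak & 3k & k^2 \\ k^3 & 5k & 0 \end{pmatrix}.$$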

Experiment 3: Combining transformations


Preparatory reading
This experiment looks at the question of applying a second transformation to the
output of the first: for example, "Magnify by 2, then rotate by 90 degrees".

If the first transformation is represented by a matrix N and the second by a
matrix M, we are considering the combined transformation:

M(Np)

Post-experiment reading
Matrix representations of transformations satisfy this important result:

M.(N.p) = (M.N).p

Matrix multiplication is said to satisfy the law of associativity. Therefore the
product of two transformation matrices represents their combined transformation,
done in right-to-left order, so M.N.p means “apply N to p, then apply M to
the result”.

In general matrix multiplication is not commutative, so it makes a difference in
which order we apply transformations to the plane. Consider for example the
different effects of

"Rotate by 90 degrees and then stretch by a factor of 2 in the x-direction"

and

"Stretch by 2 in the x-direction and then rotate by 90 degrees"



There are some transformations that are commutative. For example the
combination “stretching by 2 in the x-direction” and “stretching by 3 in the x-
direction” will give an “x-stretch” of 6 whichever way round the two are
multiplied.
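
These statements can be checked with a short sketch. We assume here that "rotate by 90 degrees" means an anticlockwise rotation, and the names rot, str and str3 are our own.

```mathematica
ClearAll[x, y];

rot = {{0, -1}, {1, 0}};    (* rotate by 90 degrees anticlockwise *)
str = {{2, 0}, {0, 1}};     (* stretch by 2 in the x-direction *)

str.rot    (* rotate, then stretch: {{0, -2}, {1, 0}} *)
rot.str    (* stretch, then rotate: {{0, -1}, {2, 0}} *)

(* Associativity: one transformation after the other equals the product *)
str.(rot.{x, y}) == (str.rot).{x, y}    (* True *)

(* Two stretches in the x-direction commute *)
str3 = {{3, 0}, {0, 1}};
str.str3 == str3.str                    (* True *)
```

Remember that the matrix written on the right is the one applied first.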

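The matrix:

$$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$

represents the transformation “do nothing”. It is called the 2 x 2 identity matrix,
$\mathbf{I}_2$. For any m x 2 matrix A,

$$\mathbf{A}\,\mathbf{I}_2 = \mathbf{A},$$

and for any 2 x n matrix B,

$$\mathbf{I}_2\,\mathbf{B} = \mathbf{B}.$$

The n x n identity matrix is usually written $\mathbf{I}_n$ (it has to be square).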

Experiment 4: Matrix equations as transformations
Preparatory reading
One of the main uses of matrices is to express potentially very large sets of
equations of the form,
Ap=b

in which, typically, A is a given matrix, p is a vector of unknowns and b is a
given vector. Here we will restrict ourselves to the simplest category of equations
in which A is a square matrix. From what you know about matrix multiplication
you can see that, if A is n x n then p will be n x 1 (because it’s a vector) and b
must also be n x 1.

With n = 2, we might have, for example:

$$\begin{pmatrix} 2 & 6 \\ 5 & 3 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 2 \\ 17 \end{pmatrix}.$$

We can interpret this in two ways:

(i) to find the pair of unknowns (x, y) (which is a point in the plane, of course)
which this matrix transforms to the point (2, 17), or,
(ii) to solve the pair of equations we get by multiplying out the left-hand side and
equating corresponding elements:
2x + 6y = 2
and 5x + 3y = 17.

These two interpretations are equivalent. The second form, which we know as
simultaneous equations, may be more familiar and is often treated as searching for
the intersection of two lines.

When we have a large number of such equations (several hundred is common in
some types of applications) it becomes extremely useful to represent them in a
matrix form and to use matrix techniques to solve them.

Note that if A is n x n then the matrix equation

Ap = b

represents n equations, involving n unknowns.

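In the experiment you will consider equations such as:

$$\begin{pmatrix} 2 & 6 \\ 5 & 3 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 2 \\ 17 \end{pmatrix}$$

and you can probably solve this one to get $\mathbf{p} = \begin{pmatrix} 4 \\ -1 \end{pmatrix}$ (or x = 4, y = -1 if you
prefer). You will look at one representation of matrix equations like Ap = b and
use it to find solutions, where possible.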
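
For comparison, here is a sketch of how the pair of equations 2x + 6y = 2 and 5x + 3y = 17 can be solved in matrix form with the built-in Solve command. The names m and b are our own.

```mathematica
ClearAll[x, y];

m = {{2, 6}, {5, 3}};    (* matrix of coefficients *)
b = {2, 17};             (* right-hand sides *)

(* Thread turns the vector equation into a list of two ordinary equations *)
Solve[Thread[m.{x, y} == b], {x, y}]
(* {{x -> 4, y -> -1}} *)

(* Check: the matrix sends the point (4, -1) to (2, 17) *)
m.{4, -1}
(* {2, 17} *)
```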

Post-experiment reading
Unless we can describe the transformation very simply this method of solving
equations is essentially “trial and error”. It does show, however, one of the
problems that can occur with equations like:

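(i) $\begin{pmatrix} 1 & 2 \\ 3 & 6 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}$ with $b_2 \neq 3b_1$, or

(ii) $\begin{pmatrix} 1 & 2 \\ 3 & 6 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 3 \\ 9 \end{pmatrix}$.

In equation (i) the matrix takes any point (X, Y) to the point (X + 2Y, 3(X + 2Y));
that is, all the points of the plane will be mapped onto the line y = 3x. If the target
point is not on that line then there can’t be any points that will be mapped onto it,
so equation (i) can have no solutions.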

Equation (ii), however, will have infinitely many solutions, because there are infinitely
many points (a whole line of them) for which X + 2Y = 3.
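
Both situations can be reproduced in a short sketch. The matrix is the one that sends (X, Y) to (X + 2Y, 3(X + 2Y)). The target (3, 10) is a hypothetical point chosen by us to lie off the line y = 3x, while (3, 9) lies on it.

```mathematica
ClearAll[x, y];

m = {{1, 2}, {3, 6}};

Det[m]       (* 0 *)
m.{x, y}     (* {x + 2 y, 3 x + 6 y} *)

(* Target off the line y = 3x: no solutions *)
Solve[Thread[m.{x, y} == {3, 10}], {x, y}]    (* {} *)

(* Target on the line y = 3x: Reduce describes a whole line of solutions *)
Reduce[Thread[m.{x, y} == {3, 9}], {x, y}]
```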

Experiment 5: Matrix equations as lines


Preparatory reading
In this experiment you will consider the same problem of matrix equations like:

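$$\begin{pmatrix} 2 & 6 \\ 5 & 3 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 2 \\ 17 \end{pmatrix}$$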

but this time interpret them as representing intersections of lines. We’ll take the
same examples as in the previous experiment to see what happens.

Post-experiment reading
Each row of the equations corresponds to a straight-line relationship between x
and y, and their intersection is the point where both relationships are true.

If the lines do not intersect at all we can’t find a solution.

If the two lines end up being the same single line then we have infinitely many
solutions.

Experiment 6: Matrix equations as algebra


Preparatory reading
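One algebraic approach to the solution of matrix equations is based on finding an
inverse, just as we would in a simple equation like:

3 . x = 21

The multiplicative inverse of 3 is $3^{-1}$ (i.e. $\tfrac{1}{3}$) and we find the solution by
multiplying both sides of the equation by $3^{-1}$:

$$\begin{aligned} 3 \cdot x &= 21 \\ 3^{-1} \cdot 3 \cdot x &= 3^{-1} \cdot 21 \\ 1 \cdot x &= 7 \end{aligned}$$

For a matrix equation it will look the same:

$$\begin{aligned} \mathbf{M}\mathbf{x} &= \mathbf{b} \\ \mathbf{M}^{-1}\mathbf{M}\mathbf{x} &= \mathbf{M}^{-1}\mathbf{b} \\ \mathbf{I}\mathbf{x} &= \mathbf{M}^{-1}\mathbf{b} \end{aligned}$$

The problem becomes, therefore, that of finding the multiplicative inverse $\mathbf{M}^{-1}$ of
any matrix M: the matrix such that $\mathbf{M}^{-1}\mathbf{M} = \mathbf{I}$.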

Post-experiment reading
The determinant of a 2 x 2 matrix is defined by:

$$\det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc.$$

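The determinant can help us to find the inverse for a 2 x 2 matrix: you can check
that

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \begin{pmatrix} \dfrac{d}{ad - bc} & \dfrac{-b}{ad - bc} \\ \dfrac{-c}{ad - bc} & \dfrac{a}{ad - bc} \end{pmatrix}.$$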

When the determinant of the matrix is 0, this formula breaks down, because of the
division by 0. This corresponds to those cases you have studied earlier in which
there was no unique solution to the matrix equation. Such matrices do not have
inverses, and are called singular.
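
Mathematica's built-in Det and Inverse commands make these ideas easy to try. The sketch below uses the matrix from the equations 2x + 6y = 2, 5x + 3y = 17, and then a singular matrix for contrast. The name m is our own.

```mathematica
m = {{2, 6}, {5, 3}};

Det[m]                   (* -24 *)
Inverse[m]               (* {{-1/8, 1/4}, {5/24, -1/12}} *)
Inverse[m].m             (* {{1, 0}, {0, 1}} *)

(* Solving m.p == {2, 17} by multiplying both sides by the inverse *)
Inverse[m].{2, 17}       (* {4, -1} *)

(* A singular matrix: its determinant is zero, so it has no inverse *)
Det[{{1, 2}, {3, 6}}]    (* 0 *)
```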

Experiment 7: Into 3-D


Preparatory reading
We’ve been working in 2-dimensional space with 2 x 2 matrices, because it’s
easier to draw and it’s easier to follow the matrix calculations, but everything
extends to 3-D space where 3 x 3 matrices represent the transformations of 3 x 1
matrices (position vectors).

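Our general matrix equation in 3-D is like this:

Mx = b

Note that the vector x and the vector b are different from the scalar elements x and
b.

The matrix product gives:

$$\begin{pmatrix} a & b & c \\ d & e & f \\ g & h & l \end{pmatrix}\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} ax + by + cz \\ dx + ey + fz \\ gx + hy + lz \end{pmatrix}.$$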

So the matrix equation represents three simultaneous equations in three
unknowns. These equations represent planes in 3-D space:
ax + by + cz = p
dx + ey + fz = q
gx + hy + lz = r

To solve these equations, that is to find the inverse of the matrix, is somewhat
harder than in the 2-D case. In the experiment you can let Mathematica do that
for you and you’ll see why the usual approach to 3 x 3 matrices, without the
computer, is to try and rearrange them to get a simpler form.

Post-experiment reading
A matrix equation can be made more accessible by manipulating the matrix to a
simpler form; as long as the same operation is carried out on both sides, the
solutions are not altered.

Typical operations are:

(i) multiplying a whole row by a constant:

(ii) replacing one row by the sum of itself and another row:

These can always be used to reduce the matrix to an amenable form.

If the determinant of the matrix is 0 the equations are not independent and either
we get no solutions (they are inconsistent) or an infinite number of solutions
(there is redundancy).
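
As a sketch of these ideas, the hypothetical system below (our own example, not one from the experiment) is solved directly and then by row reduction of the augmented matrix, using the built-in RowReduce command.

```mathematica
ClearAll[x, y, z];

(* Hypothetical system: x + y + z = 6, 2y + 5z = -4, 2x + 5y - z = 27 *)
m = {{1, 1, 1}, {0, 2, 5}, {2, 5, -1}};
b = {6, -4, 27};

Det[m]    (* -21: not zero, so there is a unique solution *)

Solve[Thread[m.{x, y, z} == b], {x, y, z}]
(* {{x -> 5, y -> 3, z -> -2}} *)

(* Augmented matrix: m with b attached as a fourth column *)
aug = Transpose[Append[Transpose[m], b]];

(* Row operations reduce the left-hand block to the identity matrix *)
RowReduce[aug]
(* {{1, 0, 0, 5}, {0, 1, 0, 3}, {0, 0, 1, -2}} *)
```

The last column of the reduced matrix holds the solution, because each row operation was applied to both sides of the equation at once.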
Complex Numbers 1

Experiment 1: Square roots of negative numbers


Post-experiment reading
When you square a positive number, you get a positive answer, and when you
square a negative number, you also get a positive answer. So, it seems as if there
aren’t any numbers whose squares are negative. This is the same as saying that
negative numbers don’t have square roots.

But is this really true? It certainly fits in with the graph of y = √x: no part of this
curve exists for x < 0. But it doesn’t fit in with the output from Mathematica’s
Sqrt command. Mathematica seems to handle square roots of negative numbers
with complete ease, even if its output is a little unusual.

Like Mathematica, we’re going to proceed on the assumption that we can take
square roots of negative numbers. As this experiment clearly shows, our answers
won’t be like any numbers we’ve met before: they’re neither positive nor
negative, for instance, and they don’t show up on ordinary graphs! However, that
doesn’t stop them being sensible, and useful, things to talk about.

The simplest of these so-called imaginary numbers is √(-1), nearly always
called i for short (though electrical engineers prefer j, to avoid confusion with
electric current). As you’ve seen, Mathematica uses a capital I: this is a
peculiarity which you won’t find in books. Other imaginary numbers are formed
when we take square roots of other negative numbers. For example,

$$\sqrt{-2.6} = \sqrt{2.6 \times (-1)} = \sqrt{2.6} \times \sqrt{-1} = 1.61245\,i.$$

For technical reasons, Mathematica represents this as

0. + 1.61245 I
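
A few inputs of our own choosing show the pattern:

```mathematica
Sqrt[-1]      (* I *)
Sqrt[-4]      (* 2 I *)
Sqrt[-2.6]    (* 0. + 1.61245 I *)
I^2           (* -1 *)
```

The exact inputs give exact answers. The decimal input 2.6 gives an approximate answer, and Mathematica shows its real part as 0. to signal this.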

Working with imaginary numbers is a lot more straightforward than one might
think. The key is to treat them just like ordinary real numbers, but to remember
the important equation

$$i^2 = -1.$$

Experiment 2: Quadratic roots and complex numbers
Preparatory reading
Consider the quadratic equation

$$x^2 + 1 = 0.$$
If we refuse to accept the existence of imaginary numbers (that is, if we restrict
ourselves to the so-called real numbers) then this equation doesn’t have a
solution. If we allow imaginary numbers into the picture, then it has two, namely
+i and -i.

You’ve probably come across other quadratic equations which don’t have “real”
solutions. An example is the equation

$$x^2 + 2x + 5 = 0:$$

you can tell this has no real solutions because the graph of $y = x^2 + 2x + 5$ doesn’t
cross the x-axis.

But perhaps it has imaginary solutions? Or perhaps it has solutions of some
other, third kind?

(Note: the terms “real” and “imaginary” are among the most unfortunate in
mathematics. There’s no sense in which one kind of number can be said to be
more “real” than another. But it’s unlikely that the terminology will change, so
we’ll stick with it.)


Post-experiment reading
Now is a good time to recall the quadratic formula, which says that if
$ax^2 + bx + c = 0$, then

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$

If $b^2 - 4ac$ turns out negative, then the formula involves our taking the square root
of a negative quantity, and imaginary numbers come into the picture. For
example, in the case of the equation $x^2 + 2x + 5 = 0$, the solutions are given by

$$x = \frac{-2 \pm \sqrt{2^2 - 4 \times 1 \times 5}}{2} = \frac{-2 \pm \sqrt{-16}}{2} = \frac{-2 \pm 4i}{2} = -1 \pm 2i.$$

The number $b^2 - 4ac$ is called the discriminant of the quadratic.

Numbers like -1 + 2i, which are made up of a real number and an imaginary
number added together, are called complex numbers.

Every complex number can be broken down, then, into a real number and an
imaginary one. It turns out that every complex number can be broken down like
this in just one way: associated with each complex number is a unique real part
and a unique imaginary part.

This isn’t hard to prove. Suppose that

a + bi = c + di,
where a , b, c and d are all real.

Then

a - c = di - bi,
and thus

$$(a - c)^2 = (di - bi)^2 = i^2 (d - b)^2 = -(d - b)^2.$$

Rearranging this equation gives

$$(a - c)^2 + (d - b)^2 = 0.$$

But (a - c) and (d - b) are both real, so their squares are either positive or zero.
Neither square can be positive, or the left-hand side would be positive. So
they’re both zero, which means that a = c and b = d: the only way two complex
numbers can be equal is if their real and imaginary parts are.

The shorthand for “the real part of” is Re, and the shorthand for “the imaginary
part of” is Im (Mathematica, too, uses Re and Im). Thus Re(3 + 4i) = 3, and
Im(3 + 4i) = 4. Not 4i, you’ll notice!
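
The worked example and the Re and Im shorthand can both be checked in Mathematica, remembering that it writes i as a capital I:

```mathematica
ClearAll[x];

(* The quadratic with no real solutions *)
Solve[x^2 + 2 x + 5 == 0, x]
(* {{x -> -1 - 2 I}, {x -> -1 + 2 I}} *)

(* Real and imaginary parts *)
Re[3 + 4 I]    (* 3 *)
Im[3 + 4 I]    (* 4 *)
```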

Experiment 3: Adding, subtracting, multiplying
Preparatory reading
Addition and subtraction of complex numbers is very straightforward, and
multiplication nearly as much so. The rule is to treat i just like an ordinary
symbolic quantity, but to remember that whenever $i^2$ appears, it can be
immediately replaced by -1.

Post-experiment reading
To add or subtract complex numbers, we can simply add or subtract the real and
imaginary parts separately. For example

$$(10 + 15i) - (3 - 12i) = (10 - 3) + \{15 - (-12)\}i = 7 + 27i.$$

Multiplication is a little more involved, but not much. Here, we can’t operate on
the real and imaginary parts separately. Instead, we simply expand the bracketed
expression and then convert $i^2$ into -1. For example:

$$(-4 + 5i)(5 - 8i) = -20 + 32i + 25i - 40i^2 = -20 + 57i - 40 \times (-1) = 20 + 57i.$$
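
Both worked examples can be checked in Mathematica. The last line mimics the hand method: it treats a lower-case i as an ordinary symbol (our own device, since Mathematica's imaginary unit is the capital I) and then replaces its square by -1.

```mathematica
(* Direct arithmetic with Mathematica's imaginary unit I *)
(10 + 15 I) - (3 - 12 I)    (* 7 + 27 I *)
(-4 + 5 I) (5 - 8 I)        (* 20 + 57 I *)

(* The hand method: expand with a plain symbol i, then set i^2 to -1 *)
ClearAll[i];
Expand[(-4 + 5 i) (5 - 8 i)] /. i^2 -> -1    (* 20 + 57 i *)
```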

Experiment 4: Conjugating and dividing


Preparatory reading
Division of complex numbers is rather less straightforward. It’s far from clear,
for instance, how we would set about performing a calculation such as

$$\frac{3 - i}{2 + 3i}.$$

Division of a complex number by a real number is far easier, of course. For
example,

$$\frac{5 + 2i}{3} = \frac{5}{3} + \frac{2}{3}i.$$

It looks like it would be profitable to look for a way of converting complex
denominators into real denominators. That is what this experiment is about.

Post-experiment reading
If $z_1 = x + iy$, then we can ensure that $z_1 z_2$ is real by setting $z_2 = x - iy$. This is
because

$$(x + iy)(x - iy) = x^2 - ixy + ixy - i^2 y^2 = x^2 - (-1)y^2 = x^2 + y^2,$$

which is real.
The complex number x - iy is known as the complex conjugate of x + iy. The
complex conjugate of z is written $\bar{z}$ or $z^*$, and Conjugate[z] in
Mathematica. Thus, $\overline{4 - 3i} = 4 + 3i$, and $(2 + 5i)^* = 2 - 5i$.

As we’ve seen, if we multiply any complex number by its conjugate, the answer
is always real (and, as it happens, never negative). This fact gives us a good method for
dividing complex numbers, namely

(i) multiply top and bottom of the fraction by the complex conjugate of the
denominator;
(ii) expand the brackets (the denominator will now become real);
(iii) perform the division, which will now be relatively easy.

For example:

$$\frac{2 + i}{4 - 3i} = \frac{(2 + i)(4 + 3i)}{(4 - 3i)(4 + 3i)} = \frac{5 + 10i}{25} = \frac{1}{5} + \frac{2}{5}i.$$
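
The three steps can be followed in Mathematica using the built-in Conjugate command, as in this sketch:

```mathematica
(* The conjugate of the denominator *)
Conjugate[4 - 3 I]                  (* 4 + 3 I *)

(* A number times its conjugate is real *)
(4 - 3 I) Conjugate[4 - 3 I]        (* 25 *)

(* Numerator after multiplying top and bottom by the conjugate *)
(2 + I) Conjugate[4 - 3 I]          (* 5 + 10 I *)

(* Mathematica performs the whole division automatically *)
(2 + I)/(4 - 3 I) == 1/5 + 2 I/5    (* True *)
```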

Complex Numbers 2

Experiment 1: The Argand diagram


Preparatory reading
If you have worked on the first Complex Numbers module, you will recall that
each complex number $z$ has a unique real part, $\operatorname{Re}(z)$, and a unique
imaginary part, $\operatorname{Im}(z)$. You will also recall that complex numbers are
added by adding the real and the imaginary parts separately: thus the sum of the
complex numbers $3 - 4i$ and $2 + 5i$ is $(3 + 2) + (-4 + 5)i$, or just $5 + i$.

This method of addition is, as you may have spotted, exactly equivalent to the
way in which vectors are added. We can, indeed, think of complex numbers in
this way. The complex number $7 + 2i$ can be thought of as corresponding to the
vector

\[ \begin{pmatrix} 7 \\ 2 \end{pmatrix}. \]

Equivalently, we can think of the complex number $7 + 2i$ as corresponding to the
point (7, 2). In this way, we identify each complex number with a point in a two-
dimensional plane (rather as real numbers can be identified with points lying
along a line). This plane is called (unsurprisingly) the complex plane, or (less
obviously) the Argand diagram, after the Swiss mathematician Jean-Robert Argand.


Since complex numbers can be thought of as vectors, it is to be expected that
addition and subtraction of complex numbers will work in the same way as
addition and subtraction of vectors.


Post-experiment reading
Adding complex numbers, then, is like vector addition; just as with vectors, we
can add two complex numbers diagrammatically like this:

(i) complete the parallelogram defined by the line segments joining the origin
to each of the two numbers;
(ii) draw the diagonal which starts at the origin.

This diagonal represents the sum. As for subtraction: the vector representing the
difference $z_1 - z_2$ is simply the vector which joins the point $z_2$ to the
point $z_1$. These principles are illustrated in the figures below.

[Figures: Addition and Subtraction of complex numbers on the Argand diagram]

Experiment 2: Multiplication and division by i


Preparatory reading
Like addition and subtraction, multiplication and division of complex numbers
have their geometrical interpretations in the Argand diagram. However, these are
a little subtler and require more careful analysis, so in this experiment we shall
concentrate on a special example: multiplication and division by $i$.

Post-experiment reading
As you saw, multiplying a complex number by $i$ corresponds to rotating the vector
that represents it anticlockwise through 90°, while leaving the length of this
vector unchanged (this is illustrated in the diagram below). Unsurprisingly,
dividing by $i$ corresponds to a clockwise rotation through 90°.

Each of these observations is a special case of a far more general rule, which
describes what happens geometrically when we multiply or divide any two
complex numbers.
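
The rotation can also be checked algebraically. The sketch below assumes that x
and y are unassigned symbols (ComplexExpand treats them as real), and the sample
value 3 + 2i is chosen here only for illustration.

```mathematica
(* Multiplying x + iy by i sends the point (x, y) to (-y, x) *)
ComplexExpand[I (x + I y)]     (* I x - y *)

(* Dividing by i sends the point (x, y) to (y, -x) *)
ComplexExpand[(x + I y)/I]     (* -I x + y *)

(* For a sample number: the length is unchanged, the angle gains a quarter turn *)
w = 3 + 2 I;
{Abs[w], Abs[I w]}             (* both Sqrt[13] *)
N[Arg[I w] - Arg[w]]           (* approximately 1.5708, that is Pi/2 *)
```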

[Figure: multiplication by i as an anticlockwise rotation through 90°]

Experiment 3: Modulus and argument


Preparatory reading
The general rule about the geometrical effect of multiplication and division, like
the special cases of it you met in the Post-experiment reading for Experiment 2, is
best expressed in terms of angles and lengths. We therefore begin this experiment
by inviting you to explore an “angle-length” description of complex numbers.

As you’ve seen, we can think of a complex number as a directed line segment (a
vector) in the Argand diagram. We’ve been describing this vector by specifying
its “x-component” (the real part) and “y-component” (the imaginary part). An
alternative approach, though, would be to specify the length of the vector and its
direction: that would also correspond to a unique complex number.

The length of the vector representing a complex number $z$ is called the modulus
of $z$. In standard mathematical notation, we write $|z|$ (“mod z”), but Mathematica
uses Abs[z] (short for “absolute value”). The angle between the positive real axis
and the vector representing $z$, measured anticlockwise, is called the argument of
$z$. The standard mathematical notation is $\arg z$, but Mathematica uses Arg[z].

The diagram shows the modulus and argument of 1 + 2i.
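
As a quick check of these two functions on the same number, one might type
something like the following (a sketch only; the variable name w is an
arbitrary choice):

```mathematica
w = 1 + 2 I;
Abs[w]                 (* the modulus: Sqrt[5] *)
Arg[w]                 (* the argument, in radians: ArcTan[2] *)
N[{Abs[w], Arg[w]}]    (* approximately {2.23607, 1.10715} *)
```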


Post-experiment reading


Both conventions are widely in use. Mathematica uses the latter: the “negative
angles” option. This is also the convention we shall adopt. One of its
consequences is that the argument of a complex number z always satisfies the
relation

\[ -\pi < \arg z \le \pi. \]

There’s a fairly simple relationship between the modulus and argument of a
complex number and its real and imaginary parts.

If $z = x + iy$, then $|z| = \sqrt{x^2 + y^2}$ and, provided $x \ne 0$,
$\tan(\arg z) = \dfrac{y}{x}$.

Also: if $z$ has modulus $r$ and argument $\theta$, then $r\cos\theta = x$ and
$r\sin\theta = y$. This means, of course, that we can write

\[ z = r(\cos\theta + i\sin\theta). \]

This is called the polar, or modulus-argument, form of the number.
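
The conversions in both directions can be tried out as follows. This is a sketch
with a sample number chosen for illustration; the names w, r and theta are
assumptions made here. It also shows why the relation tan(arg z) = y/x needs a
little care: the ratio y/x alone does not tell us which quadrant the number
lies in.

```mathematica
(* From real and imaginary parts to modulus and argument *)
w = -1 - I;
r = Abs[w]                 (* Sqrt[2] *)
theta = Arg[w]             (* -3 Pi/4, which lies in the range -Pi < arg z <= Pi *)

(* Caution: y/x is the same for -1 - I and for 1 + I *)
ArcTan[Im[w]/Re[w]]        (* Pi/4, the argument of 1 + I *)
ArcTan[Re[w], Im[w]]       (* -3 Pi/4: the two-argument form uses the signs of x and y *)

(* And back again, using the polar form *)
Expand[r (Cos[theta] + I Sin[theta])]   (* -1 - I *)
```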

Experiment 4: Multiplication and division


Preparatory reading
It is when we express complex numbers in polar form that the geometrical
interpretation of multiplication and division makes most sense. In this
experiment, you are asked to explore the effect of multiplication and division on
the modulus and argument of complex numbers.

Post-experiment reading
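The following rules apply in general:

\[ |z_1 z_2| = |z_1|\,|z_2|, \qquad \arg(z_1 z_2) = \arg z_1 + \arg z_2 \]

(“When we multiply two complex numbers, we multiply the moduli and add the
arguments.”)

\[ \left|\frac{z_1}{z_2}\right| = \frac{|z_1|}{|z_2|}, \qquad \arg\left(\frac{z_1}{z_2}\right) = \arg z_1 - \arg z_2 \]

(“When we divide one complex number by another, we divide the moduli and
subtract the arguments.”)

If the sum or difference of the arguments falls outside the range
$-\pi < \arg z \le \pi$, we add or subtract $2\pi$ to bring it back within that range.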

These rules provide us with an explanation (which the reader is invited to supply)
for why multiplication by $i$ corresponds to rotation anticlockwise through 90°,
with no change in the modulus.
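
The multiply-and-add and divide-and-subtract rules can be tested numerically. The
sketch below uses two sample numbers chosen here for illustration (the names w1
and w2 are assumptions); in each pair of outputs the two entries should agree. The
last line shows what happens when the sum of the arguments leaves the range
$-\pi < \arg z \le \pi$: the two values then differ by $2\pi$.

```mathematica
w1 = 1 + I;  w2 = -1 + Sqrt[3] I;     (* arguments Pi/4 and 2 Pi/3 *)

(* Moduli multiply and divide *)
N[{Abs[w1 w2], Abs[w1] Abs[w2]}]      (* both approximately 2.82843 *)
N[{Abs[w1/w2], Abs[w1]/Abs[w2]}]      (* both approximately 0.707107 *)

(* Arguments add and subtract *)
N[{Arg[w1 w2], Arg[w1] + Arg[w2]}]    (* both approximately 2.87979 *)
N[{Arg[w1/w2], Arg[w1] - Arg[w2]}]    (* both approximately -1.309 *)

(* Out-of-range case: 2 Pi/3 + 2 Pi/3 exceeds Pi *)
N[{Arg[w2 w2], Arg[w2] + Arg[w2]}]    (* approximately {-2.0944, 4.18879} *)
```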
Complex Numbers 3

Experiment 1: Raising to powers


Preparatory reading
Recall that when we multiply two complex numbers, we multiply their moduli
and add their arguments. This may suggest to you what might happen to the
modulus and argument of a complex number if we were to square it, say (or
perhaps to cube it, or to raise it to the power 4). In this experiment, you are
invited to explore this question.


Post-experiment reading
Consider a complex number (call it $z$) whose modulus is 1. Let its argument be $\theta$.
So:

\[ z = \cos\theta + i\sin\theta. \]

Let us now consider the number:

\[ z^6 = z \times z \times z \times z \times z \times z. \]

The modulus of $z^6$ must be:

\[ 1 \times 1 \times 1 \times 1 \times 1 \times 1, \]

which is equal to 1.

The argument of $z^6$ must be:

\[ \theta + \theta + \theta + \theta + \theta + \theta, \]

which is $6\theta$.

Thus:

\[ (\cos\theta + i\sin\theta)^6 = \cos 6\theta + i\sin 6\theta. \]

It’s fairly clear that this would work with $z^7$, or $z^8$, or $z^{21}$ etc. In general, then:

\[ (\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta. \]

This is de Moivre’s theorem. It can be shown to be true even if $n$ is negative,
and true in a certain sense if $n$ is fractional.

As an example of the use of de Moivre’s theorem, let z = 3 + 4i.

Then

\[ |z| = \sqrt{3^2 + 4^2} = 5, \]

and

\[ \arg z = \arctan(4/3) = 0.927295 \text{ rad}. \]

So

\[ z = 5(\cos 0.927295 + i\sin 0.927295). \]

Thus

\[
\begin{aligned}
z^6 &= 5^6(\cos(6 \times 0.927295) + i\sin(6 \times 0.927295)) \\
&= 15625(\cos 5.56377 + i\sin 5.56377) \\
&= 15625(0.752192 - 0.658944i) \\
&= 11753 - 10296i.
\end{aligned}
\]

This method is fairly easily adaptable to negative whole number powers too.
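
The worked example can be reproduced in Mathematica, both by direct
multiplication and by way of the modulus and argument. This is a sketch under
the assumption of a fresh session; the names w, r and theta are chosen here for
illustration. The last two lines try a negative whole-number power in the same
way.

```mathematica
w = 3 + 4 I;

(* Exact arithmetic: the power is multiplied out directly *)
w^6                                        (* 11753 - 10296 I *)

(* The same result via modulus and argument, as in de Moivre's theorem *)
r = Abs[w];  theta = Arg[w];               (* 5 and ArcTan[4/3] *)
N[r^6 (Cos[6 theta] + I Sin[6 theta])]     (* approximately 11753. - 10296. I *)

(* A negative whole-number power: the two values should agree *)
N[r^-2 (Cos[-2 theta] + I Sin[-2 theta])]
N[w^-2]
```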

Experiment 2: nth roots


Preparatory reading
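Once we know that $|-2 + 2i| = 2\sqrt{2}$, and that $\arg(-2 + 2i) = 3\pi/4$ radians, de
Moivre’s theorem gives us quite a good way of making sense of

\[ (-2 + 2i)^{1/3}. \]

The reasoning runs like this:

\[
\begin{aligned}
(-2 + 2i)^{1/3} &= \left\{2\sqrt{2}\left(\cos\frac{3\pi}{4} + i\sin\frac{3\pi}{4}\right)\right\}^{1/3} \\
&= \sqrt{2}\left(\cos\frac{\pi}{4} + i\sin\frac{\pi}{4}\right) \\
&= \sqrt{2}\left(\frac{1}{\sqrt{2}} + i\frac{1}{\sqrt{2}}\right) \\
&= 1 + i.
\end{aligned}
\]

This is one candidate, then, for the title of “cube root of $-2 + 2i$”. There are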
hidden subtleties to this question, however, which you are invited to explore in
this experiment.

Post-experiment reading
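The number $1 + i$, then, is not the only value of $(-2 + 2i)^{1/3}$. It is the most
natural, though, and we call it the principal root. It is the value Mathematica
returns.

There are, as you saw, two other roots. These arise because the cosine and sine
functions are periodic: they repeat themselves every $2\pi$ radians. It follows
that, for any whole number $m$,

\[
\begin{aligned}
\left\{\cos\left(\theta + \frac{2m\pi}{3}\right) + i\sin\left(\theta + \frac{2m\pi}{3}\right)\right\}^3
&= \cos(3\theta + 2m\pi) + i\sin(3\theta + 2m\pi) \\
&= \cos 3\theta + i\sin 3\theta.
\end{aligned}
\]

If $\sqrt{2}\left(\cos\frac{\pi}{4} + i\sin\frac{\pi}{4}\right)$ is the principal root of the equation

\[ z^3 = -2 + 2i, \]

then, we can be sure that
$\sqrt{2}\left(\cos\left[\frac{\pi}{4} + \frac{2\pi}{3}\right] + i\sin\left[\frac{\pi}{4} + \frac{2\pi}{3}\right]\right)$
is another root. We can be equally certain that
$\sqrt{2}\left(\cos\left[\frac{\pi}{4} - \frac{2\pi}{3}\right] + i\sin\left[\frac{\pi}{4} - \frac{2\pi}{3}\right]\right)$
is yet another. This accounts for the other two solutions you observed in the
experiment.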

These three are the only roots there are: all the other values of $m$ simply give
copies of these, again because of the periodic nature of sin and cos.

The three roots, if displayed on an Argand diagram, show a typical “spokes of a
wheel” pattern, with the angle between adjacent “spokes” being $2\pi/3$ radians,
exactly one-third of a full circle.

[Figure: the three cube roots of $-2 + 2i$ on the Argand diagram]

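In general, if $r(\cos\theta + i\sin\theta)$ is the principal root of the equation
$z^n = a$, then the other roots are given by those complex numbers of the form

\[ r\left(\cos\left[\theta + \frac{2m\pi}{n}\right] + i\sin\left[\theta + \frac{2m\pi}{n}\right]\right) \]

(where $m$ is a whole number) whose arguments fall within the range

\[ -\pi < \arg z \le \pi. \]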
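
For the cube roots of $-2 + 2i$, this recipe can be sketched in Mathematica as
follows. The names a, r, theta and roots are chosen here for illustration, and
the values m = -1, 0, 1 are the ones that keep the arguments within the stated
range in this particular case.

```mathematica
(* Principal root: cube root of the modulus, one third of the argument *)
a = -2 + 2 I;
r = Abs[a]^(1/3);  theta = Arg[a]/3;     (* equal to Sqrt[2] and Pi/4 *)

(* The three roots, with arguments -5 Pi/12, Pi/4 and 11 Pi/12 *)
roots = Table[r (Cos[theta + 2 m Pi/3] + I Sin[theta + 2 m Pi/3]), {m, -1, 1}];
N[roots]   (* approximately 0.366025 - 1.36603 I, 1. + 1. I, -1.36603 + 0.366025 I *)

(* Each root cubed should return a, up to rounding error *)
Chop[N[roots^3] - a]                     (* {0, 0, 0} *)
```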



Experiment 3: Exponential form


Preparatory reading
The study of complex numbers throws up a surprising link between (on one hand)
the trigonometrical functions and (on the other) the exponential function. This, in
turn, gives us a way of making sense of the idea of raising numbers to imaginary
powers, and provides us with a third alternative form for expressing complex
numbers.

Post-experiment reading
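The three results

\[ \operatorname{cis}\theta_1 \operatorname{cis}\theta_2 = \operatorname{cis}(\theta_1 + \theta_2) \]

\[ \operatorname{cis}\theta_1 / \operatorname{cis}\theta_2 = \operatorname{cis}(\theta_1 - \theta_2) \]

\[ (\operatorname{cis}\theta)^n = \operatorname{cis} n\theta \]

(where $\operatorname{cis}\theta$ stands for $\cos\theta + i\sin\theta$) closely resemble the three
fundamental properties of indices, namely

\[ a^x a^y = a^{x+y}, \qquad a^x / a^y = a^{x-y}, \qquad (a^x)^n = a^{nx}. \]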

This suggests that “cis” is rather like an exponential function. A thorough
explanation of this idea is beyond the scope of this treatment. However, it can be
shown that provided $\theta$ is measured in radians,

\[ e^{i\theta} = \cos\theta + i\sin\theta. \]

This surprising link between the exponential function and the trigonometrical
functions is very important. One of its consequences is that if the complex
number $z$ has modulus $r$ and argument $\theta$ then

\[ z = re^{i\theta}. \]

This is known as the exponential form of the complex number. It is equivalent to
the modulus-argument form, and perhaps rather easier to work with.
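
Mathematica can be used to confirm this link. The sketch below assumes that t is
an unassigned symbol (ComplexExpand treats it as real); the names w and expForm,
and the sample number, are chosen here for illustration.

```mathematica
(* The exponential of an imaginary number, written out in terms of cos and sin *)
ComplexExpand[Exp[I t]]          (* Cos[t] + I Sin[t] *)

(* Exponential form of a sample number, built from its modulus and argument *)
w = 1 + I;
expForm = Abs[w] Exp[I Arg[w]]   (* Sqrt[2] E^(I Pi/4) *)
ComplexExpand[expForm]           (* 1 + I *)

(* A number raised to an imaginary power *)
Exp[I Pi]                        (* -1 *)
```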

Experiment 4: Loci
Preparatory reading
In the section on the complex plane, we said that we could think of a complex
number either
(i) as a vector joining the origin to a certain point in the plane, or
equivalently
(ii) as that point itself.

In this section, we concentrate on the second of those two ideas: complex
numbers are points in the plane. That means that if we make a statement which is
true of some complex numbers but not of others, we can draw, on an Argand
diagram, a picture of those points where the statement is true. Such a picture is
called a locus (plural loci).

For example, the following diagram shows the locus of all complex numbers z for
which

\[ |z^3 + z^* + i| < 2: \]

[Figure: the locus, shown as a shaded region with a broken perimeter line]

The shaded area contains all complex numbers for which the statement is true:
where there is no shading, the statement is false. The broken perimeter line is a
conventional way of indicating that the statement is false at the boundary of the
region.

This is a rich and fascinating subject. For example, many of the most beautiful of
the chaos pictures you may have seen are, in fact, loci in the complex plane.
We’ll begin, though, by looking at some straightforward cases.

Post-experiment reading
Perhaps the subtlest loci you have studied (with the possible exception of some
you may have devised yourself) belong to the type

arg( z) = a.

As an example let us consider the locus

\[ \arg\left(\frac{z - 1}{z + 1}\right) = \frac{\pi}{4}. \]

This can, of course, be written

\[ \arg(z - 1) - \arg(z + 1) = \frac{\pi}{4}, \]

and it is in this modified form that we are best able to make sense of it.

In the following diagram, the complex number $z$ is represented by the labelled
point. The angle $\arg(z - 1)$ is represented by $\theta$, and the angle $\arg(z + 1)$
by $\phi$. It is not hard to show, by an argument involving the sum of the angles
of a triangle (the details of which are left to you), that
$\arg(z - 1) - \arg(z + 1)$ is the angle subtended at $z$, as shown.

The locus, then, is the set of points for which this angle is $\pi/4$ radians. There is a
well-known geometrical theorem which states that this set of points corresponds
to the curve shown: the major arc of a circle passing through the points (-1, 0)
and (1, 0). Another famous geometrical theorem states that the centre of this
circle is placed symmetrically in such a way that the angle subtended there is
exactly twice the angle at the circumference: in this case, a right angle. In our
example, then, this places the centre at the point (0, 1), as shown.
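
This conclusion can be tested numerically. A circle with centre (0, 1) that
passes through (-1, 0) and (1, 0) has radius $\sqrt{2}$, so its points can be written
as $i + \sqrt{2}e^{it}$. The sketch below uses that parametrisation; the function name
circlePoint is an assumption made here for illustration.

```mathematica
(* Points on the circle with centre i and radius Sqrt[2] *)
circlePoint[t_] := I + Sqrt[2] Exp[I t]

(* Three points on the major arc, above the real axis *)
N[Table[Arg[(circlePoint[t] - 1)/(circlePoint[t] + 1)], {t, {0, Pi/2, Pi}}]]
N[Pi/4]     (* 0.785398; each entry above should match this value *)

(* A point on the minor arc, below the real axis, does not lie on the locus *)
N[Arg[(circlePoint[-Pi/2] - 1)/(circlePoint[-Pi/2] + 1)]]   (* approximately -2.35619, that is -3 Pi/4 *)
```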
