Mathematics
Textbook
CHOO YAN MIN
This version: 2nd January 2019.
Latest version here.
Revision in progress.1
1
This textbook was first completed in Jul 2016. For 1.5 years afterwards, only minor changes were made.
But starting Feb 2018, I’ll be completely rewriting this textbook. In particular, I’ll be working to (a)
slow down the pace of this textbook; and (b) explain things more clearly and simply. Less importantly,
I’ll also be removing the old (9740) material that is no longer on the current (9758) syllabus and making
trivial formatting changes (to make the book beautifuller).
As with everything I do, please let me know if you spot any errors or have any feedback. Thank you.
i, Contents www.EconsPhDTutor.com
Errors? Feedback? Email me!
Notices:
You do not have to comply with the license for elements of the material in the public domain
or where your use is permitted by an applicable exception or limitation. No warranties are
given. The license may not give you all of the permissions necessary for your intended use.
For example, other rights such as publicity, privacy, or moral rights may limit how you use
the material.
Le savant n’étudie pas la nature parce que cela est utile; il l’étudie parce qu’il
y prend plaisir et il y prend plaisir parce qu’elle est belle. Si la nature n’était
pas belle, elle ne vaudrait pas la peine d’être connue, la vie ne vaudrait pas
la peine d’être vécue.
The scientist does not study nature because it is useful to do so. He studies
it because he takes pleasure in it, and he takes pleasure in it because it is
beautiful. If nature were not beautiful it would not be worth knowing, and life
would not be worth living.
[W]hoever does not love and admire mathematics for its own internal splendours,
knows nothing whatever about it.
1 Just To Be Clear 3
2 PSLE Review: Division 4
2.1 Long Division . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2 Dividing By Zero . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
3 Logic 8
3.1 True, False, and Indeterminate Statements . . . . . . . . . . . . . . . . . . 9
3.2 The Conjunction AND and the Disjunction OR . . . . . . . . . . . . . . . 10
3.3 The Negation NOT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
3.4 Equivalence ⇐⇒ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3.5 De Morgan’s Laws: Negating the Conjunction and Disjunction . . . . . . 13
3.6 The Implication P Ô⇒ Q . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.7 The Converse Q Ô⇒ P . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.8 Affirming the Consequent (or The Fallacy of the Converse) . . . . . . . . 19
3.9 The Negation NOT- (P Ô⇒ Q) . . . . . . . . . . . . . . . . . . . . . . . 20
3.10 The Contrapositive NOT-Q Ô⇒ NOT-P . . . . . . . . . . . . . . . . . . 23
3.11 (P Ô⇒ Q AND Q Ô⇒ P ) ⇐⇒ (P ⇐⇒ Q) . . . . . . . . . . . . . . . 25
3.12 Other Ways to Express P Ô⇒ Q (Optional) . . . . . . . . . . . . . . . . 26
3.13 The Four Categorical Propositions and Their Negations . . . . . . . . . . 27
3.14 Chapter Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4 Sets 32
4.1 The Elements of a Set Can Be Pretty Much Anything . . . . . . . . . . . 33
4.2 In ∈ and Not In ∉ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.3 The Order of the Elements Doesn’t Matter . . . . . . . . . . . . . . . . . 36
4.4 n(S) Is the Number of Elements in the Set S . . . . . . . . . . . . . . . . 37
4.5 The Ellipsis “. . . ” Means Continue in the Obvious Fashion . . . . . . . . . 37
4.6 Repeated Elements Don’t Count . . . . . . . . . . . . . . . . . . . . . . . 38
4.7 R Is the Set of Real Numbers . . . . . . . . . . . . . . . . . . . . . . . . . 39
4.8 Z Is the Set of Integers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
4.9 Q Is the Set of Rational Numbers . . . . . . . . . . . . . . . . . . . . . . . 41
4.10 A Taxonomy of Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
4.11 More Notation: + , − , and 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
4.12 The Empty Set ∅ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
4.13 Intervals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
4.14 Subset Of ⊆ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
4.15 Proper Subset Of ⊂ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4.16 Union ∪ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
4.17 Intersection ∩ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.18 Set Minus ∖ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.19 The Universal Set E . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.20 The Set Complement A′ . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.21 De Morgan’s Laws . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.22 Set-Builder Notation (or Set Comprehension) . . . . . . . . . . . . . . . . 56
4.23 Chapter Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5 O-Level Review 61
5.1 Some Mathematical Vocabulary . . . . . . . . . . . . . . . . . . . . . . . . 61
5.2 The Absolute Value or Modulus Function . . . . . . . . . . . . . . . . . . 62
5.3 The Factorial n! . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
5.4 Exponents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
5.5 Rationalising the Denominator with a Surd . . . . . . . . . . . . . . . . . 69
5.6 Logarithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
5.7 Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6 Graphs 77
6.1 Ordered Pairs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
6.2 The Cartesian Plane . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
6.3 A Graph is Any Set of Points . . . . . . . . . . . . . . . . . . . . . . . . . 80
6.4 The Graph of An Equation . . . . . . . . . . . . . . . . . . . . . . . . . . 82
6.5 Graphing with the TI84 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
6.6 The Graph of An Equation with Constraints . . . . . . . . . . . . . . . . 86
6.7 Intercepts and Roots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.8 Lines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
6.9 Horizontal, Vertical, and Oblique Lines . . . . . . . . . . . . . . . . . . . . 98
6.10 Finding the Equation of a Line . . . . . . . . . . . . . . . . . . . . . . . . 99
6.11 Perpendicular Lines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
6.12 The Difference between Lines, Line Segments, and Rays . . . . . . . . . . 102
6.13 Asymptotes and Limit Notation . . . . . . . . . . . . . . . . . . . . . . . . 103
6.14 Maximum and Minimum Points . . . . . . . . . . . . . . . . . . . . . . . . 107
27 Sequences 378
27.1 Sequences Are Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 380
27.2 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382
27.3 Arithmetic Combinations of Sequences . . . . . . . . . . . . . . . . . . . . 383
28 Series 384
28.1 Convergent and Divergent Series . . . . . . . . . . . . . . . . . . . . . . . 385
41 Collinearity 474
42 The Vector Product 476
42.1 The Angle between Two Vectors Using the Vector Product . . . . . . . . 479
42.2 The Length of the Rejection Vector . . . . . . . . . . . . . . . . . . . . . . 480
59 Coplanarity 605
59.1 Coplanarity of Lines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 608
67 Limits 647
67.1 Limits, Informally Defined . . . . . . . . . . . . . . . . . . . . . . . . . . . 647
67.2 Examples Where The Limit Does Not Exist . . . . . . . . . . . . . . . . . 652
67.3 Rules for Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 659
73 Concavity 734
73.1 Inflexion Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 738
73.2 Stationary and Non-Stationary Points of Inflexion . . . . . . . . . . . . . 741
73.3 The First Derivative Test for Inflexion Points (FDTI) . . . . . . . . . . . 742
73.4 The Second Derivative Test for Inflexion Points (SDTI) . . . . . . . . . . 743
77 Integration 800
77.1 An Important Warning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 804
77.2 A Sketch of How We Can Find the Area under a Curve . . . . . . . . . . 806
77.3 Some Basic Rules of Integration . . . . . . . . . . . . . . . . . . . . . . . . 808
77.4 The First Fundamental Theorem of Calculus (FTC1) . . . . . . . . . . . . 811
78 Antidifferentiation 816
78.1 The Antiderivative Is Not Unique ... . . . . . . . . . . . . . . . . . . . . . 818
78.2 ... But It Is Unique Up to a COI . . . . . . . . . . . . . . . . . . . . . . . 818
78.3 How, Precisely, Should We Use the Antidifferentiation Symbol ∫ ? . . . . 820
78.4 Rules of Antidifferentiation . . . . . . . . . . . . . . . . . . . . . . . . . . 821
81.2 $\int f' \exp f = \exp f + C$ . . . . . . . . . . . . . . . . . . . . . . . . 853
81.3 $\int \frac{f'}{f} = \ln|f| + C$ . . . . . . . . . . . . . . . . . . . . . . . . 855
81.4 $\int (f)^n \cdot f' = \frac{1}{n+1} (f)^{n+1} + C$ . . . . . . . . . . . . . . . 858
81.5 Building a Derivative . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 860
81.6 More Challenging Applications of the Substitution Rule . . . . . . . . . . 861
81.7 An Alternative Formula for IBP . . . . . . . . . . . . . . . . . . . . . . . 869
81.8 The Substitution Rule with the TOT and IBP . . . . . . . . . . . . . . . 870
81.9 Finding the Antiderivative of an Inverse Function (optional) . . . . . . . . 871
82 Term-by-Term Integration 874
114 Past-Year Questions for Part VI. Prob. and Stats. 1187
115 All Past-Year Questions, Listed and Categorised 1224
115.1 2017 (9758) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1224
115.2 2016 (9740) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1226
115.3 2015 (9740) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1227
115.4 2014 (9740) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1228
115.5 2013 (9740) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1229
115.6 2012 (9740) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1230
115.7 2011 (9740) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1231
115.8 2010 (9740) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1232
115.9 2009 (9740) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1233
115.10 2008 (9740) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1234
115.11 2007 (9740) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1235
115.12 2008 (9233) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1236
115.13 2007 (9233) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1237
115.14 2006 (9233) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1238
Index 1767
Abbreviations Used in This Textbook 1771
Singlish Used in This Textbook 1775
YouTube Ad 1778
Tuition Ad 1779
• FREE! This book is free. But if you paid any money for it, I certainly hope your money
is going to me! This book is free because:
1. It is a shameless advertising vehicle for my awesome tutoring services.
2. The marginal cost of reproducing this book is zero and I am a benevolent maximiser of
social welfare. If you don’t understand what that last sentence means, you should read
my economics textbooks.4 (Quick translation: I’m a very nice guy.)
• LyX rocks!5 A big thank you to all the developers and those who have helped to fund
its development.
2
Three Letter Abbreviations. In the US for example, abbreviations are viewed by some as dumbing down.
In contrast, in Singapore, the ability to use as many abbreviations as possible and even create one’s own
abbreviations is nearly a mark of intelligence. There shall therefore be very many abbreviations in this
textbook.
3
In 2017, the current 9758 syllabus was examined for the first time and the previous 9740 syllabus for the
last time.
4
I’m working on these. You can find half-completed versions on my website.
5
This book was written using LyX. LaTeX is the typesetting program used by most economists and
scientists. But LaTeX can be annoying to use. LyX is a user-friendly, GUI, for-dummies version of LaTeX.
With LyX, you can actually clearly see on screen the equations you’re typing, as you’re typing them.
It is quite pointless to try working out one’s maths in LaTeX because you can’t clearly tell what you’re
writing. In contrast, it is perfectly feasible and indeed easy to work out one’s maths in LyX. For example,
if you have countless lines of tedious algebra to do, you can do it in LyX and copy-paste/document every
step of the way. Otherwise you’d probably be doing it on pen and paper which is messy and which you’ll
probably misplace.
LyX has boosted my productivity by countless hours over the years and you should use LyX too!
• Maths, Math, and Matzz.
This book uses British English. And so the word mathematics shall be abbreviated as maths
(and not math). By the way, Singaporeans used to pronounce maths as matzz (similar to
how they pronounce clothes as klotes). But at some point between 2005 and 2015, perhaps
after watching too much American TV and too many American movies, Singaporeans decided they’d switch to
the American math. Perhaps in 50 years, we’ll all be Amos Yees trying to speak annoying
pseudo American English. But in the meantime, I’ll continue to say matzz. This is my way
of promoting and preserving Singlish (and also sticking it to the ghost of LKY).
Carefully work through all the examples and exercises. Merely moving your eyeballs is
not the same as working. Working means having pencil and paper by your side and going
through each example/exercise word-by-word, line-by-line.
For example, I might say something like “x² − y² = 0. Thus, (x − y)(x + y) = 0.” If it’s not
obvious to you why the first sentence implies the second, stop right there and work on it
until you understand why. And again, if you can’t figure it out yourself, don’t be shy about
asking someone for help. Don’t just let your eyeballs fly over these sentences and pretend
that your brain “gets” it. Such self-deceit will only cost you in the long run.
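To spell out the step in that example, here is a short worked derivation (a sketch of one way to see it, not the only way). Expanding the product recovers the difference of two squares:

```latex
\begin{align*}
(x - y)(x + y) &= x^2 + xy - yx - y^2 \\
               &= x^2 - y^2.
\end{align*}
```

The two expressions are equal for all x and y. So if x² − y² = 0, then (x − y)(x + y) = 0 as well.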
The Chinese believe in “eating bitterness” and work for work’s sake. That is not my view.
The exercises in this textbook are not to make you suffer or somehow strengthen your moral
fibre. Instead, as with learning to ride a bike or swim, practice makes perfect. The best
way to learn and master any material is by practising, doing, and occasionally failing. You
may struggle initially and get a few bruises, but eventually, you’ll get good.
I strongly advise that you do every one of the exercises in this textbook. They will help
you learn. They will also serve as a check that you’ve actually “got it”. And if you haven’t,
then well, as mentioned, keep working until you get it or seek help. The seemingly-easy
way out of pretending you’ve got it and flipping to the next page is actually the hard way
out, because it will only cost you more grief in the long run.
6
Spoiler: Voldemort dies, Harry Potter lives happily ever after.
• Confused? Good!
誨女知之乎!知之為知之,不知為不知,是知也。
Shall I teach you what wisdom means? To know what you know and know
what you do not know — this then is wisdom.
... he discovered that, when he imagined his education was completed, it had
in fact not commenced; and that, although he had been at a public school and
a university, he in fact knew nothing. To be conscious that you are ignorant
is a great step to knowledge.
There are always some students who, when probed, say they are not at all confused and
have no questions to ask. In my experience, these are usually precisely the worst students.
These students have such poor understanding of the material that they do not know
what they do not know.
And so, when you find yourself confused, don’t panic or despair. Your confusion is actually
good news! To be confused is to know that you do not know and thus to have taken your
first step towards wisdom. You now know what you need to work on and what questions
to ask your friends and teachers.
Of course, merely knowing what you do not know is not enough. You need to actually act
on this as well. Be proactive: your education and your life are in your own hands!
Feynman was once asked by a Caltech faculty member to explain why spin
1/2 particles obey Fermi-Dirac statistics. He gauged his audience perfectly
and said, “I’ll prepare a freshman lecture on it.” But a few days later he
returned and said, “You know, I couldn’t do it. I couldn’t reduce it to
the freshman level. That means we really don’t understand it.”
There is a view in some philosophical circles that anything that can be understood
by people who have not studied philosophy is not profound enough
to be worth saying. To the contrary, I suspect that whatever cannot be said
clearly is probably not being thought clearly either.
I agree: If you can’t explain something simply, you don’t understand it well
enough.7
This saying is useful for the instructor. But it also yields the learner the following useful
corollary (in mathematics, a corollary is a statement that follows readily from another):
This learning technique may be dubbed learning by teaching and is a perfectly good
one. As the Latin proverb goes, Docendo discimus — by teaching, we learn.
One way to implement this technique8 is for you and your friends to get together and explain
concepts to each other aloud. For this technique to work, you and your friends must
challenge each other and be demanding of each other. Do not be content until you’ve
actually given each other the full and correct explanation of the concept at hand.
7
This quote or some similar variant is often misattributed to Einstein. But as Einstein himself once said,
“73% of Einstein quotes are misattributed.”
8
Another way is to hand over the classroom to students. But this is probably too adventurous a move in
Singapore, where the closest thing is probably the officially-sanctioned and of course graded project work
presentation.
• Lessons from the science of learning.
Unfortunately, the science of learning is still very much in its infancy. But do check out
this short review of the literature: “How We Learn: What Works, What Doesn’t” (2013).
According to this review, the two most effective study/learning techniques are:
1. Self-testing.
2. Distributed practice. (Sometimes also called spaced practice.)
The next three are:
3. Elaborative interrogation.
4. Self-explanation.
5. Interleaved practice.
Two commonly-used but ineffective techniques are (a) highlighting; and (b) rereading.
Again, be warned that the science of learning is very much in its infancy, so you should
take with a large dose of salt any advice (including mine about learning by teaching).
Nonetheless, there’s probably no harm trying out different techniques and seeing what
works for you.
You’ve probably forgotten some (or most?) of it, but unfortunately, you are still assumed to
know all of O-Level Maths (2017 syllabus) and “some” (OK, more like a lot) of Additional
Maths (pp. 14–15 of your A-Level syllabus lists what you need to know from ‘A’ Maths).
(To take H2 Maths, most JCs require that you at least passed ‘A’ Maths.)9
Littered around this textbook are occasional “O-Level Reviews” (e.g. Ch. 5 and 19). These
reviews will usually be very quick and hopefully you’ll have no difficulty with them. But if
you do, go back and review your O-Level Maths and ‘A’ Maths!
• Web-based calculators.
Google is probably the quickest for simple calculations. Type any calculation into your
browser’s Google search bar and the answer will instantly show up.
Wolfram Alpha is somewhat more advanced (but also slower). Enter sin x for example and
you’ll get graphs, the derivative, the indefinite integral, the Maclaurin series, and a bunch
of other stuff you neither know nor care about. In this textbook, you’ll sometimes see
(usually at the end of an example or an exercise answer) a clickable Wolfram Alpha logo
that will bring you to the relevant computation on Wolfram Alpha.
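For a sense of what that output covers, these are the standard results for sin x (written out here from well-known calculus facts, not copied from Wolfram Alpha’s actual output):

```latex
\begin{align*}
\frac{\mathrm{d}}{\mathrm{d}x} \sin x &= \cos x, \\
\int \sin x \, \mathrm{d}x &= -\cos x + C, \\
\sin x &= x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots
\end{align*}
```

The last line is the Maclaurin series of sin x, and C is an arbitrary constant of integration.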
Symbolab is a much less powerful alternative to Wolfram Alpha. However, it’s perfectly
good for simple algebra and somewhat quicker, so you may sometimes prefer using it.
The Derivative Calculator (https://www.derivative-calculator.net/) and the Integral Calculator are
probably unbeatable for the specific purposes of differentiation and integration. Both give
step-by-step solutions for anything you want to differentiate or integrate. As with Wolfram Alpha,
you’ll sometimes see clickable logos and that’ll bring you to the relevant computations. (Note that
unfortunately, after clicking, you’ll also have to either click “Go!” or hit Enter.)10
I also made this Collection of Spreadsheets (click “Make a copy”). These are for doing tedious
and repetitive calculations you’ll often encounter in H2 Maths (with vectors, complex
numbers, etc.).11
9
Some kiasu JCs, like HCI, even require that you got at least a B3 for both Maths & Additional Maths.
10
The site author informed me that not having a direct link was deliberate and defensive.
11
As with anything I do, I welcome any feedback on these spreadsheets. (Perhaps in the future I will make
a more attractive version.)
• Other online resources.
There are way too many websites that try to cover primary, secondary, and lower-level
undergraduate maths. Unfortunately, some of them are awful and get things wrong.
Three websites I like (though are probably a bit advanced for JC students) are:
12
The flagship SE site is where you can ask any programming question and (often) see it
answered amazingly quickly. For computing in general, there are also many other SE sites.
SE sites are like Yahoo! Answers or Quora, but less stupid. The worst SE site is probably Politics beta,
but even there, the average question or answer is probably better than that on Yahoo! Answers or Quora.
13
There are also many other wonderful SE sites that you should explore. Unfortunately, despite my
magnificent contributions, Economics beta is not exactly thriving. It seems that economists, unlike
programmers or mathematicians, have learnt all too well that contributing to the public good is folly.
14
Well, depending on which jurisdiction you live in. Of course, in Singapore, unless told otherwise, you
should assume that everything is illegal.
15
Note though that these sites are constantly playing whac-a-mole with the fascist authorities and so the
URLs often change. If the links here aren’t correct, please first let me know so I can correct them. Then
simply google to find the current working URLs. For Sci-Hub, the page usually lists the
latest up-and-running URLs.
Use of Graphing Calculators
You are required to know how to use a graphing calculator.16
This textbook will give only a few examples involving graphing calculators.
There is no better way of learning to use it than to play around with it yourself. By the
time you sit down for your A-Level exams, you should have had plenty of practice with it.
You can also use any of the seven calculators in the following list (last updated by SEAB
on Dec 1st, 2017 — PDF).17
The following graphing calculator models are approved for use ONLY in subjects examined at H1, H2 and
H3 Levels of the A-Level curriculum.
Note: All graphing calculators must be reset prior to any examination.
This textbook will stick with the TI-84 PLUS Silver Edition,18 which I’ll simply call the
TI84. (My understanding is that most students use a TI calculator and that the five
approved TI calculators are pretty similar.)
I’ll always start each example with the calculator freshly reset.19
16
Pretty bizarre that in this age of the smartphone, they want you to learn how to use these clunky and
now-useless devices from the ’80s and ’90s. It is the equivalent of learning to program a VCR. (The TI-81
was designed in the 1980s and first sold in 1990. The TI-84 PLUS was first sold in 2004 and represents
only a modest improvement over the original TI-81.)
IMHO it’d be much better to teach you some simple programming or Excel (or whatever spreadsheet
program). “B-b-but ... how would such learning be tested in an exam format?” Ay, there’s the rub. In
the Singapore education system, anything that cannot be “examified” is not worth learning.
Of course, there are some folks over in Texas who don’t mind. Nor do those lucky few MOE teachers
and administrators who get to go on all-expenses-paid “business” trips to Texas to “learn” more about
the calculators.
17
They forgot a U in INSTRMENTS.
18
Operating System, Version 2.55MP — available at the TI website.
19
I’ve never actually bought or owned a graphing calculator. All my screenshots here of the TI84 are
actually from the emulator Wabbitemu. (Yup, you can download Atari or DOS emulators to play
decades-old games; and you can likewise download an emulator for this decades-old piece of junk.)
Exam Tips for Towkays
Use your graphing calculator as much as possible. You are always allowed to use your
graphing calculator.
Some exam questions will explicitly instruct you not to use your calculator, but this just
means that your written answer should not include any hint that you used your calculator.
(Nonetheless, you can still cheat and use your calculator for guidance and to check your
answer.)
Instructions to not use your calculator include:
• “Do not use a calculator in answering this question.”
• “Without using your calculator ...”
• “Use a non-calculator method ...”
• “Find the exact value of ...”
• “Express your answer in terms of √3 or π.”
20
This scene takes place at about 57:30–58:00 of the movie (YouTube clip).
They removed the brand name and also the model name, so I can’t quite tell what model it is. One
website claims it’s a TI-86, while another claims it’s an exact match to the TI-83 PLUS.
Zooming in, it looks like Spider-Man is doing some sort of combinatorics on his notepad. I can’t tell
what exactly’s on the calculator screen. Many of the buttons on the calculator seem to be permanently
depressed, which is weird.
Slowing down, we see that he first punches in 5 8 ( , then a moment later 5 7 (or maybe 2 6 ) —
so it’s probably all just rubbish that he’s punching in. The latter keystrokes do get him out of the place
though. (That’s all for my brilliant movie analysis of the week.)
Preface/Rant
When you’re very structured almost like a religion ... Uniforms, uniforms,
uniforms ... everybody is the same. Look at structured societies like Singapore
where bad behaviour isn’t tolerated. You are extremely punished. Where
are the creative people? Where are the great artists? Where are the great
musicians? Where are the great singers? Where are the great writers? Where
are the athletes? All the creative elements seem to disappear.
The most dangerous man, to any government, is the man who is able to think
things out for himself.
Let us contrast this with the goal of the Singapore education system, which is to:
At first glance, these two goals do not seem to be in conflict. After all, we’d expect a
student who genuinely understands her H2 Maths to also contribute to GDP growth.
The conflict only arises with the keyword docile. An education system that imparts genuine
understanding tends also to encourage independent thinking and discourage docility.
On the one hand, to maximise GDP growth, Gahmen wants “creative” innovators. On the
other, it doesn’t want too much of a challenge to the status quo (especially politically). Its
goal is thus to turn docile test-taking drones into docile creative innovators.
21
Unfortunately, the audio at this BBC webpage seems to be broken.
22
The following resources are provided with the efficient Type 1 student in mind: (a) The H2 Mathematics
CheatSheet, all the formulae you’ll ever need on two sides of an A4 sheet of paper. (b) The H1
Mathematics Textbook, which is written more simply, and which covers a subset of the H2 syllabus.
(c) The H2 Maths Exercise Book (coming “soon”), which teaches you how to mindlessly apply formulae
and give the “correct” answer to every exam question. (d) My totally awesome tuition classes!
Unfortunately, creative innovation is completely incompatible with docility. A populace
trained to avoid the slightest transgression is not one that is capable of producing anything
new. The result is lip service to buzzwords like “creativity” and half-hearted education
reform. Once a decade or so, some technocrat comes up with an inane four-letter campaign
(FLC) like TSLN 1997 and TLLM 2005 that brings us precisely nowhere.23
To this ambivalence and pussy-footing, add (i) the deep-rooted Confucian love of exams
and rote-learning; and (ii) the elitist British educational system we inherited.24 Altogether,
despite superficial appearances to the contrary, we’ve had very little change over the years.
Administrators, teachers, and students alike remain completely fixated with exams.
There is a place for testing. The problem is that as currently constructed, our exams
and education system do not test for genuine understanding. Instead, they merely test if
you’ve mastered the art of mimicry and if you’re able to follow instructions, recipes, and
algorithms. In other words, they test if you are an obedient monkey capable of performing
tricks you’ve practised over and over and over again.
I have a study in mind: Gather all the students who got As for their A-Level maths exams,
5, 10, 15, 20 years after the exam. Get them to do that exact same A-Level exam they
took years ago. Ask them if they remember anything from their JC maths education or
if their JC maths education had any value whatsoever. I suspect that most will get close
to 0, remember absolutely nothing, and consider their JC maths education to have been
completely worthless. If these suspicions are correct, then JC maths education has no value,
except and only as a selection device.
(... Which, of course, is of paramount importance in elitist, social-Darwinist Singapore.
Grades help differentiate the President’s Scholar from the “mere” PSC scholar and the
lowly McDonald’s employee from the dalit cleaner. Grades help differentiate those who should
reproduce from those who shouldn’t.)25
23
In 1997, Thinking Schools, Learning Nation (or, as was joked, Sinking Schools, Burning Nation). In
2005, Teach Less, Learn More. These campaigns may now be found, alongside such gems as Goal 2010,
in the ash heap of history. The current FLC is probably ESGS (Every School a Good School).
24
From an April 12, 2013 Straits Times interview with Tharman:
ST: DPM, why do you think we are the way we are?
A: Well, we inherited the British system, which is quite academically biased and in Britain, of course,
quite an elitist system. We also inherited a Chinese education culture, which is also quite academically
oriented, a strong emphasis on values and character education but quite academically oriented and quite
test-oriented. And I think the combination of a British and East Asian educational ethos has created a
particular form of meritocracy which achieved a lot in 40 years. But as we go forward and we think about
the type of inclusive society we want, it’s not just about wages, which we are working on, it’s about how
you view yourself and others at the workplace, wherever you live, how we view fellow Singaporeans, do
you view them as equals, do you do things together. That has to start from young and it has to continue
through life.
25
One will recall the infamous Graduate Mothers’ Priority Scheme and Small Family Incentive Scheme.
Those two Orwellian/Nazi schemes have since been scrapped.
However, the Social Development Unit (SDU), now rebranded as the Social Development Network (SDN),
lives on. The SDU was established at around the same time (1984) as the aforementioned schemes “to
encourage social interaction and marriage among graduate singles”. I do not know if this still remains
the rebranded SDN’s explicit, written goal, but it would surprise me if this did not still remain at least
its implicit and unwritten goal.
In A Mathematician’s Lament (2002, 2009), Paul Lockhart describes (pre-tertiary) maths
education in the US as being “stupid and boring”, “formulaic”, and “mindless” “pseudo-
maths”.26 The same may be said of maths education in Singapore. But at least the typical
US student has the consolation that only a very small portion of her life will have been
squandered on such “mindless” “pseudo-maths”.
The same cannot be said for the typical Singaporean student. By the time she turns 18, she
will have — just for the single subject of maths alone — clocked many thousands of hours
attending school and tuition classes; doing homework, practice exam questions, assessment
books, and Ten Year Series (TYS); taking common tests, promos, prelims, mid-year exams,
and end-of-year exams; ad infinitum, ad nauseam.
The 8 Confucian East Asian countries27 perform splendidly on international tests like the
triennial Programme for International Student Assessment (PISA). In the 2015 PISA, these
8 countries all ranked in the top 11 (out of 70 countries/regions, see p. xliii). Singapore
did especially well and topped all three categories (science, reading, mathematics).
“What,” the Western educator enquires, “is the magic here?”28 But here there is no magic.
To me, the explanation for why East Asian students do so well on these tests is obvious:
While the American teen is “wasting” her time on typical, “useless” teenager-ly pursuits,
the Singaporean teen is seated obediently in front of his desk, doing yet another soul-
crushing TYS question. And once in a while, a 10-year-old commits suicide due to poor
exam results.29 Kids the world over commit suicide for a variety of reasons, but only in
East Asia do they regularly do so due to poor exam results.
In South Korea, legislators have even passed (ill-enforced) laws barring hagwons (private
cram schools) from operating past 10 p.m. To the American teen, it is utterly mind-
blowing that (a) anyone would be in school past 10 p.m.; and (b) this practice grew to be
so common that legislators saw fit to pass laws against it. But in East Asia, this isn’t at
all strange.
To me, the fact that East Asian students bust their butts is obviously the single most
important explanation for why they do so well on international tests. Yet strangely, in the
countless papers and books that I’ve come across seeking to explain why some countries do
better than others, this explanation is rarely ever considered.
26
By the way, Lockhart explains why this is so and what maths really is far more eloquently and clearly than
I ever could. I strongly recommend that every student and instructor of maths read A Mathematician’s
Lament. There are two versions — a 2002 25-page PDF that circulated online and a 2009 book version.
27
Japan, Korea, Viet Nam, and the five Chinese-majority countries (China, Hong Kong, Macao, Singapore,
and Taiwan).
28
In the US, “Singapore Math” has acquired something of a mythical status. As with weight loss, Americans
are constantly on the lookout for some magic, painless solution to their mediocre education systems.
29
For example, in 2001, 10-year-old Lysher Loh jumped to her death from her fifth floor apartment (source).
She “had been disappointed with her mid-year examination results and had found the workload heavy.”
She had also “told her maid Lorna Flores two weeks before her death she did not want to be reincarnated
as a human being because she never wanted to have to do homework again.” In 2016, another 11-year-
old “killed himself over his exam results by jumping from his bedroom window in the 17th-storey flat”
(source). For more of such stories, see this page.
Figure 2: Average test score by group and treatment: U.S. vs. Shanghai
[Bar chart omitted. Vertical axis: score (out of 25). Horizontal axis: Control vs. Treatment, with bars for U.S. and Shanghai schools.]
Notes: Average score for students who received no incentives (Control) and for
students who received incentives (Treatment) by school and track.
A related explanation is that East Asians are trained from young to take every test seriously.
In contrast, American kids couldn’t care less about some inconsequential PISA test.30 (Indeed,
one might argue that if made to do an hours-long inconsequential PISA test, the truly
intelligent kid would simply click through it as quickly as possible.)
This explanation has recently received some academic attention. In “Measuring Success in
Education: The Role of Effort on the Test Itself” (2017), experimenters gave a PISA-like
test to students in several schools in the US and Shanghai, China. At each location, the
treatment group was given a financial incentive (i.e. a bribe) to do well, while the control
group was given nothing.
In the US, the treatment group students performed significantly better than the control
group. In contrast, in Shanghai, the treatment group students performed no better.
This suggests that robotic Shanghai students are conditioned to always try their hardest,
whether or not there’s any financial incentive. In contrast, US students may not be trying
particularly hard when the stakes are low or zero (as is the case with PISA), but will try
a little harder when there’s some financial incentive.
Note also that “students learn about the incentives just before taking the test, so any
impact on performance can only operate through increased effort on the test itself rather
than through, for example, better preparation or more studying” (p. 4). The remaining US-
Shanghai performance gap could thus very well be eliminated if the former were incentivised
(through carrot and stick) to prepare and work half as hard as the latter.
A similar but more recent study is “Taking PISA Seriously: How Accurate Are Low Stakes
30
East Asians care as much about exams as Americans do professional sports, while Americans care as
much about exams as East Asians do professional sports.
Exams?” (2018). One of its conclusions is that “a country can rise up to 15 places in
rankings if its students took the exam seriously.”
Altogether then, there is very little that Western educators can learn from East Asia. The
only lesson is this: If you want your students to do well on tests like the PISA, then:
• Train them from young to take every test seriously; and
• Force them to work their butts off (thereby destroying their childhood and adolescence).
Singapore produces world champion test-takers and gold medallists at the various
International Olympiads. But as currently constituted, the Singapore education system will
never produce a Fields Medallist or a Nobel Laureate. And as Steve Wozniak suggests,
Singapore will never produce a world-beating innovator like an Apple or a
Google. The reason is that unlike taking tests (be it your J1 Promos or your IMO), such
endeavours require more than mere monkey-see-monkey-do mimicry.31
The goal of this textbook is to impart genuine understanding. I suspect that the sincere
pursuit of this goal will do more to promote GDP growth than any of Gahmen’s current
educational policies.
But quite aside from any such instrumental value, I believe that a genuine understanding
of maths (and indeed any other material) is intrinsically valuable. (GDP growth is
lovely, but despite what Gahmen would have you believe, it is not all that makes life
worthwhile.) And heck, learning can even be that three-letter word banished from the
Singapore education system (and apparently also playgrounds and void decks): F-U-N.
These were actual signs posted in Singapore. They became viral and were removed in
June 2013 (New Paper story) and in March 2016 (Straits Times story), respectively.
31
Here are two excuses I’ve come across for why Singapore has produced no Nobel Laureates: (a) Singapore’s
population is too small; and (b) Singapore was until fairly recently very poor.
But consider Denmark (population 5.7M), Finland (5.5M), and Norway (5.2M), whose populations are
similar to or even smaller than Singapore’s (5.5M) and who were producing Nobel Laureates when they
were far poorer than Singapore is today. We could also point to tiny Saint Lucia (180,000) which has
produced two Nobel Laureates.
I am thus accepting bets for this proposition: “By 2050, no born-and-bred Singaporean will have won a
Fields medal or a Nobel Prize (Peace excluded).” (We’ll need to work out what exactly “born-and-bred”
means, but that can be ironed out.)
Now, what do I mean by “imparting genuine understanding”?
Personal anecdote: As a JC student, I remember being deeply mystified by why the scalar
product had such a simple algebraic definition and yet could at the same time also tell us
about the cosine of the angle between the two vectors. I never figured it out. But this
didn’t matter, because this was simply “yet another formula” that we learnt for the sole
purpose of answering exam questions.32
I remember being confused about the difference between the sample mean, the mean of the
sample mean, the variance of the sample mean, and the sample variance. But this confusion
didn’t matter, because once again, all we needed to do to get an A was to mindlessly apply
formulae and algorithms. Monkey see, monkey do.
This textbook is thus partly in response to my unhappy and unsatisfactory experience as a
cog in the Singapore educational system. In other words, this is the textbook I wish I had
had when I was a JC student.
Almost all results are proven. I try to supply the intuition for each result in the simplest
possible terms. Many proofs are relegated to the appendices, but where a proof is especially
simple and beautiful, I encourage the student to savour it by leaving it in the main text.
In the rare instances where proofs are entirely omitted from this book — usually because
they are too advanced — I make sure to clearly state so, lest the student wonder whether
the result is supposed to be obvious (as I often did when I was a JC student).
This textbook follows the Singapore A-Level syllabus.33 And so, a good deal of mindless
formulae is unavoidable. Even so, I try in this textbook to give the student a tiny glimpse
of what maths really is — “the art of explanation”.34 I try to plant a thoughtcrime in the
student’s mind: Maths is not merely another pain to be endured, but can at times be a joy.
And so for example, this textbook explains:
• A bit of intuition behind differentiation, integration, and the Fundamental Theorems of
Calculus. (To get an A, no understanding of these is necessary. Instead, one need merely
know how to “do” differentiation and integration problems.)
• Why the Central Limit Theorem is so amazing. (To get an A, one need merely treat
the CLT as yet another mysterious mathematical trick that helps solve exam questions.
It isn’t necessary to appreciate why it is so amazing, where it might possibly come from, or
what relevance it has to everyday life.)
• A bit of intuition behind the Maclaurin series. (To get an A, it suffices to know how to
mindlessly apply this strange formula that falls out from the sky.)
• Why it is terribly wrong to believe that “a high correlation coefficient means a good
model”. (Yet this is exactly what your A-Level examiners seem to believe. See Ch. 108.9.)
32
I remember complaining about this to a classmate and he responded, “But that’s how we’ve always been
taught maths what. It’s just a bunch of formulae.” He was of course right.
Today, the intellectually-curious student can easily find the answer on the internet. But at that time
(2001–02), the internet was not quite as developed and so one could not easily find answers online.
33
Another of my quixotic desires is to change that too. Or as Lockhart says, “[T]hrow the stupid curriculum
and textbooks out the window!”
34
Lockhart, A Mathematician’s Lament (2002, 2009).
Two Reasons Why Even Type 1 Pragmatists Should Read This Textbook
(1) The A-Level exams now include more curveball or out-of-syllabus questions.
Previously, the A-Level exam questions were always perfectly predictable. If you had no
problem doing past year exam questions, you’d have no problem getting an A.
But starting in 2017 (coinciding with the new and supposedly reduced syllabus), curveball
questions now carry a weight of perhaps 10–20%. For example, in 2017, out of absolutely
nowhere, students were suddenly asked to use something called D’Alembert’s ratio test and
to explain whether a series converges (see Exercise 483).
I can find no official, publicly-available statement announcing (much less explaining) this
change. I have heard only that JC maths teachers were informed by MOE of this change
ahead of time. My guess is that this is the MOE’s highly-creative method of creating
creative students.
In my humble opinion, this change is complete cow manure. It serves only to add further
pressure to the already-miserable Singapore student.
But I will confess that selfishly, I welcome this change because it increases the value of this
textbook. The student who carefully studies this textbook will be rewarded with a true
and deep understanding of all the H2 Maths material and hence be fully prepared to bat
away any curveball.
Take for example Exercise 512 (9758 N2017/I/6). This unfamiliar problem will likely have
come as a shock for many a Singaporean monkey drilled a thousand times over to “do”
computational problems involving 3D geometry. In contrast, any student who bothered to
read this textbook’s Part III even once will have enjoyed solving this problem.
(2) If you’re intending to do more maths in the future (e.g. physics, economics,
engineering), then this textbook will actually save you time in the long run.
Merely doing well in A-Level H2 Maths may give you the illusion that you’ve actually
learnt or understood the material. Down the road, this may cost you more time.
Another personal anecdote: When I began my undergraduate studies (in a small US college),
I was still the typical kiasu Singaporean monkey trained to believe that life was a
competitive, Social Darwinist race. And so I skipped a whole bunch of first- and second-
year maths classes (Calculus I, Calculus II, Statistics, and Linear Algebra), thinking I had
already covered all the material back in JC.
On paper, I may indeed have covered all this material. But in practice, all I’d learnt in JC
was monkey see, monkey do. I’d learnt enough to do well on the exams, but not enough to
actually understand or use any of the material.
It was only many years later, with the benefit of hindsight, that I began to see how much of
a mistake I had made. Skipping those classes saved me time and put me “ahead of the race”
in the short run. But in the long run, this actually cost me dearly. I would actually have
saved more time by not skipping those seemingly-elementary first- and second-year classes!
(And of course, I would’ve saved even more time by skipping the Singapore education
system, but alas, that wasn’t an option.)
This textbook thus offers the sort of A-Level maths education I wished I had received.35
You’ll be spending two years on H2 Maths anyway. And so, instead of wasting these two
years learning mindless algorithms you’ll forget a month after the A-Level exams, why not
spend this time actually learning and understanding the material that you can go on to
actually use?
And of course, in my completely humble and unbiased opinion, the best way to learn and
understand H2 Maths is by studying this textbook.
I conclude this Preface/Rant by expressing my hope that even if you the instructor or
student do not use this textbook as your primary instructional or learning material, you
will still find it perfectly useful as an authoritative and reliable reference.
P.S. This textbook is far from perfect. To steal a certain neighbourhood school’s motto,
the best is yet to be. I hope to keep improving this textbook, but I can only do so with your
help. So if you have any feedback or spot any errors, please feel free to email
me. (As you can tell, I am pretty merciless about criticising others. So please don’t be shy
about pointing out the many foolish mistakes that are surely still lurking in this textbook.)
35
Please note that there is no knock or diss here on my JC Maths teachers. My JC Maths teachers and
indeed most of my Singapore teachers were generally pretty good and did the best they could within
the stultifying confines of the Singapore educational system. My critique here applies to the Singapore
educational system. As the hip-hop cliché goes, I don’t hate the player(s); I hate the game.
PISA 2015 Mean Scores
Country Science Reading Maths Country Science Reading Maths
Singapore 556 535 564 Lithuania 475 472 478
Japan 538 516 532 Croatia 475 487 464
Estonia 534 519 520 CABA (Argen.) 475 475 456
Taiwan 532 497 542 Iceland 473 482 488
Finland 531 526 511 Israel 467 479 470
Macao 529 509 544 Malta 465 447 479
Canada 528 527 516 Slovak Rep. 461 453 475
Viet Nam 525 487 495 Greece 455 467 454
Hong Kong 523 527 548 Chile 447 459 423
BSJG (China) 518 494 531 Bulgaria 446 432 441
Korea 516 517 524 UAE 437 434 427
NZ 513 509 495 Uruguay 435 437 418
Slovenia 513 505 510 Romania 435 434 444
Australia 510 503 494 Cyprus 433 443 437
UK 509 498 492 Moldova 428 416 420
Germany 509 509 506 Albania 427 405 413
Netherlands 509 503 512 Turkey 425 428 420
Switzerland 506 492 521 Trin. & Tobago 425 427 417
Ireland 503 521 504 Thailand 421 409 415
Belgium 502 499 507 Costa Rica 420 427 400
Denmark 502 500 511 Qatar 418 402 402
Poland 501 506 504 Colombia 416 425 390
Portugal 501 498 492 Mexico 416 423 408
Norway 498 513 502 Montenegro 411 427 418
US 496 497 470 Georgia 411 401 404
Austria 495 485 497 Jordan 409 408 380
France 495 499 493 Indonesia 403 397 386
Sweden 493 500 494 Brazil 401 407 377
Czechia 493 487 492 Peru 397 398 387
Spain 493 496 486 Lebanon 386 347 396
Latvia 490 488 482 Tunisia 386 361 367
Russia 487 495 494 FYROM 384 352 371
Luxembourg 483 481 486 Kosovo 378 347 362
Italy 481 485 490 Algeria 376 350 360
Hungary 477 470 477 Dominican Rep. 332 358 328
Notes: B-S-J-G = Beijing-Shanghai-Jiangsu-Guangdong; CABA = Ciudad Autónoma de Buenos Aires;
FYROM = Former Yugoslav Republic of Macedonia. Source: “PISA 2015 Results in Focus” (PDF), p. 5.
The glory of [maths] is its complete irrelevance to our lives. That’s why it’s
so fun!
1. Just To Be Clear
In this textbook, we’ll stick to these standard conventions:
• Greater than means “strictly greater than” (>). So I won’t bother saying “strictly”,
unless it’s something I want to emphasise.
• Less than means “strictly less than” (<).
• If I want to say greater than or equal to (≥) or smaller than or equal to (≤), I’ll
say exactly that.
• Positive means “greater than zero” (> 0).
• Negative means “less than zero” (< 0).
• Non-negative means “greater than or equal to zero” (≥ 0).
• Non-positive means “less than or equal to zero” (≤ 0).
• Zero is neither positive nor negative. Instead, it is both non-negative and non-
positive.36
Names of some punctuation marks:
Remark 1. Some writers refer to (), [], and {} as round, square, and curly brackets
— we’ll avoid these terms. Instead, as stated above, we’ll strictly refer to (), [], and {}
as parentheses, brackets, and braces.37
36
Note though that in France, positif and négatif mean ≥ 0 and ≤ 0, so that 0 is both positif and négatif.
On this, see e.g. Wiktionary.
37
Note that there is actually another pair of brackets ⟨⟩ called angle brackets. If we’re using angle
brackets, then we’ll want to be careful to distinguish them from [] by referring to the latter as square
brackets. Happily, we won’t be using angle brackets at all in this textbook. And so, we’ll simply call
[] brackets.
2. PSLE Review: Division
r = 9 − 4q = 9 − 4 × 2 = 1.
r = 17 − 3q = 17 − 3 × 5 = 2.
x = dq + r.
We call x the dividend, d the divisor, q the quotient, and r the remainder.
Note that thus defined, the quotient and remainder are unique.39
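The relationship x = dq + r is easy to check by computer. Here is a minimal Python sketch that uses the built-in divmod. The function name divide_with_remainder is my own illustrative choice, and the sketch assumes a non-negative dividend and a positive divisor.

```python
def divide_with_remainder(x, d):
    """Return (q, r) with x = d*q + r and 0 <= r < d.

    Assumes x is a non-negative integer and d is a positive integer.
    """
    if d <= 0 or x < 0:
        raise ValueError("this sketch assumes x >= 0 and d > 0")
    q, r = divmod(x, d)  # built-in: quotient and remainder in one step
    return q, r


for x, d in [(9, 4), (17, 3)]:
    q, r = divide_with_remainder(x, d)
    # Check the defining relationship x = dq + r.
    assert x == d * q + r and 0 <= r < d
    print(f"{x} = {d} x {q} + {r}")
```

The two pairs tested are the ones computed above: 9 divided by 4 leaves remainder 1, and 17 divided by 3 leaves remainder 2.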
      10s  1s
       1   2
  7 )  8   7     Explanation
       7   0     10 × 7 = 70
       1   7     87 − 70 = 17
       1   4     2 × 7 = 14
           3     17 − 14 = 3
Thus: $87 \div 7 = 12 + \frac{3}{7} = 12\frac{3}{7}$.
We call 87 the dividend, 7 the divisor, 12 the quotient, and 3 the remainder.
      100s 10s  1s
            5   3
 17 )  9    1   2     Explanation
       8    5   0     50 × 17 = 850
            6   2     912 − 850 = 62
            5   1     3 × 17 = 51
            1   1     62 − 51 = 11
Thus: $912 \div 17 = 53 + \frac{11}{17} = 53\frac{11}{17}$.
We call 912 the dividend, 17 the divisor, 53 the quotient, and 11 the remainder.
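The long division procedure used in the two examples above can also be written as a short program. The following Python sketch is only one way to do it, under the assumption that both numbers are positive integers. The function name long_division and the format of the recorded steps are hypothetical choices of mine.

```python
def long_division(dividend, divisor):
    """Digit-by-digit long division for positive integers.

    Returns (quotient, remainder, steps). Each step is a pair
    (multiple, product): the multiple of the divisor chosen for one
    quotient digit, and the amount subtracted from the running total.
    """
    digits = str(dividend)
    quotient, remainder, steps = 0, 0, []
    for i, ch in enumerate(digits):
        remainder = remainder * 10 + int(ch)   # bring down the next digit
        q_digit = remainder // divisor         # largest digit that fits
        remainder -= q_digit * divisor
        quotient = quotient * 10 + q_digit
        place = 10 ** (len(digits) - 1 - i)    # 100s, 10s, 1s, ...
        if q_digit:
            steps.append((q_digit * place, q_digit * place * divisor))
    return quotient, remainder, steps


for dividend, divisor in [(87, 7), (912, 17)]:
    q, r, steps = long_division(dividend, divisor)
    for multiple, product in steps:
        print(f"{multiple} x {divisor} = {product}")
    print(f"{dividend} = {divisor} x {q} + {r}")
```

For 87 ÷ 7, the recorded steps are 10 × 7 = 70 and 2 × 7 = 14, matching the Explanation column above.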
Exercise 1. Do the long division for 8 057 ÷ 39. Identify the dividend, divisor, quotient,
and remainder. (Answer on p. 1387.)
2.2. Dividing By Zero
Dividing by zero is a common mistake. Students have little trouble avoiding this mistake
if the divisor is obviously a big fat zero. Instead, students usually make this mistake when
the divisor is an unknown constant or variable that might be zero.
Example 6. Solve x (x − 1) = (2x − 2) (x − 1). (That is, find the values of x for which
the equation is true. We call these values of x the solutions to the equation.)
Here’s a wrong solution: “Divide both sides by x − 1 to get x = 2x − 2. So x = 2.” ✗
The correct solution considers two possible cases:
Case 1. If x − 1 = 0, then the equation is true. So, x = 1 is a possible solution.
Case 2. If x − 1 ≠ 0, then we can divide both sides by x − 1 to get x = 2x − 2.
So, x = 2 is another possible solution.
Conclusion. The two possible solutions are x = 1 and x = 2.
Moral of the story. Dividing by zero may cause us to lose perfectly valid solutions. So,
always make sure the divisor is non-zero. If you’re not sure whether it equals zero, then
break up your analysis into two cases, as was done in the above example: Case 1. The
divisor is zero. Case 2. The divisor is non-zero.
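To see concretely how dividing by x − 1 loses a solution in Example 6, here is a small Python sketch. It simply tries every integer from −10 to 10 (an arbitrary range I picked for illustration), so it is a sanity check and not a proof. The names lhs, rhs, solutions, and after_dividing are my own.

```python
def lhs(x):
    """Left-hand side of the equation in Example 6."""
    return x * (x - 1)


def rhs(x):
    """Right-hand side of the equation in Example 6."""
    return (2 * x - 2) * (x - 1)


candidates = range(-10, 11)  # arbitrary search range, integers only

# Values that satisfy the original equation.
solutions = [x for x in candidates if lhs(x) == rhs(x)]
print(solutions)  # [1, 2]

# Values that satisfy the equation left after dividing by x - 1.
after_dividing = [x for x in candidates if x == 2 * x - 2]
print(after_dividing)  # [2]
```

The solution x = 1 disappears in the second list because that is exactly the value at which the divisor x − 1 equals zero.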
$\frac{1}{0}$, $\frac{50}{0}$, $\frac{-17.1}{0}$, $\frac{\infty}{0}$, $\frac{-\infty}{0}$.
40
The special case is 0 ÷ 0, which is indeterminate. This means that 0 ÷ 0 is sometimes undefined, but
can sometimes be defined under certain circumstances.
3. Logic
Big surprise — you’ve secretly been using logic your whole life.
Logic isn’t explicitly on your H2 Maths syllabus.41 But spending an hour or two on logic
pays huge dividends — you’ll learn to reason better, both in maths and in everyday life.
This chapter is thus a brief and gentle introduction to logic. Here we merely present some
of the most basic but also some of the most useful results from logic. (If you truly can’t be
bothered, please at least check out the one-page summary of this chapter on p. 31.)
First, try this appetiser.42
Example 7. The Wason Four-Card Puzzle. In a special deck of cards, each card
has a letter on one side and a number on the other. You are shown these four cards:
A Z 1 8
Betsy the Bimbotic Blonde now comes along and makes the following claim:
“If a card has a vowel on one side, then it has an even number on the other side.”
You suspect that Betsy is wrong. To prove that she’s wrong, which of the above four
cards should you turn over? (The goal is to turn over as few cards as possible.)43
The above puzzle baffles most who are untrained in logical thinking. Right now, that
probably includes you. But by p. 22 of this textbook, you’ll have had some training in
logic and thus be able to solve this puzzle easily.
41
If it were up to me, the H2 Maths syllabus would devote at least a little time to logic. Instead, that time
is spent on learning to compute the volume of the revolution of a curve around the y-axis. Which is a
doggie trick that (1) students will forget two weeks after the final A-Level exam; and (2) is completely
useless unless you’re planning to be an engineer or physicist, in which case it is still completely useless
since down the road, you’ll be learning it again (and probably more properly).
42
The wording here is a slightly-modified version of Wason (1966, pp. 145–146).
43
Answer: A and 1. We’ll explain this on p. 22.
3.1. True, False, and Indeterminate Statements
Then statements A and C are true, while statements B and D are false.44
As shorthand, we’ll use the green checkmark ✓ to denote that a statement is true; and a
red crossmark ✗ to denote that it’s false.
M : “x > 0.”
N : “x > 1.”
O: “x is a positive number.”
Note that the truth values of M, N, and O depend on the value of x. That is, whether
each statement is true or false depends on the value of x.
So, without being given further information, each of these three statements can neither
be said to be true nor said to be false. Instead, we say that each is indeterminate.
But if we’re told that:
• x = 5, then all three statements are true.
• x = 0.5, then statements M and O are true, while N is false.
• x = −1, then all three statements are false.
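The three cases above can be reproduced in a few lines of Python. This is only a sketch: the function names statement_m, statement_n, and statement_o are hypothetical, and I treat “x is a positive number” as meaning x > 0, following the conventions of Ch. 1.

```python
def statement_m(x):
    """M: x > 0."""
    return x > 0


def statement_n(x):
    """N: x > 1."""
    return x > 1


def statement_o(x):
    """O: x is a positive number, i.e. x > 0."""
    return x > 0


# Once x is given, each statement gets a definite truth value.
for x in [5, 0.5, -1]:
    print(x, statement_m(x), statement_n(x), statement_o(x))
```

Until a value of x is supplied, none of the three functions can return anything, which mirrors the idea that the statements are indeterminate.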
44
In this textbook, we will not define the terms statement, true, and false. We’ll take for granted that
“everybody knows” what these terms mean (even if they don’t).
3.2. The Conjunction AND and the Disjunction OR
A: “Germany is in Europe.” ✓
B: “Germany is in Asia.” ✗
C: “1 + 1 = 2.” ✓
D: “1 + 1 = 3.” ✗
Using the logical connective AND (called the conjunction), we can form the following
statements (which we also call conjunctions):
Exercise 4. Continue with the above example. Explain if each of the following statements
is true or false. (Answer on p. 1388.)
A: “Germany is in Europe.” ✓
B: “Germany is in Asia.” ✗
C: “1 + 1 = 2.” ✓
D: “1 + 1 = 3.” ✗
Then the negations of statements A, B, C, and D are simply:
• A and C are true; thus, their negations NOT-A and NOT-C must be false.
• B and D are false; thus, their negations NOT-B and NOT-D must be true.
The negation of the negation simply brings us back to the original statement:
Exercise 5. Let E: “It’s raining”, F: “The grass is wet”, G: “I’m sleeping”, and H:
“My eyes are shut”.
Write down NOT-E, NOT-F, NOT-G, and NOT-H. (Answer on p. 1388.)
Remark 2. AND, OR, and NOT are our three most basic logical connectives. Using
these three basic connectives, we can build ever more complex statements.
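Python happens to have the same three connectives built in as and, or, and not. Here is a small sketch using the statements A, B, C, and D from the earlier examples. The particular combinations printed are my own picks for illustration.

```python
A = True   # "Germany is in Europe."
B = False  # "Germany is in Asia."
C = True   # "1 + 1 = 2."
D = False  # "1 + 1 = 3."

# Conjunction: true only if both parts are true.
print(A and C)  # True
print(A and B)  # False

# Disjunction: true if at least one part is true.
print(A or B)   # True
print(B or D)   # False

# Negation flips the truth value.
print(not A)    # False
print(not B)    # True

# The negation of the negation is the original statement.
print((not (not A)) == A)  # True
```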
Definition 6. We say that two statements P and Q are equivalent and write:
P ⇐⇒ Q,
M : “x > 0.”
N : “x > 1.”
O: “x is a positive number.”
Observe that if M is true, then O is also true. And if M is false, then O is also false. It
is impossible that one is true while the other is false. And so, we say that M and O are
equivalent and write M ⇐⇒ O.
In contrast, it is possible that M is true while N is false — this is the case when x = 0.5.
And so we say that M and N are not equivalent and write M ⇎ N.
α: “x = 3.”
β: “x + 2 = 5.”
γ: “x² = 9.”
Observe that if α is true, then β is true. And if α is false, then β is also false. It is
impossible that one is true while the other is false. And so, we say that α and β are
equivalent and write α ⇐⇒ β.
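One rough way to test a claimed equivalence is to compare the two statements at several values of x. The Python sketch below does this for a handful of values that I chose arbitrarily. Agreement on a sample does not prove equivalence, but a single disagreement does disprove it. The helper name equivalent_on is hypothetical, and the comparison of α with γ is my own addition, based on the arithmetic fact that (−3)² = 9.

```python
def equivalent_on(p, q, values):
    """True if p and q have the same truth value at every tested value."""
    return all(p(x) == q(x) for x in values)


test_values = [-3, -1, 0, 0.5, 1, 2, 3, 5]  # an arbitrary sample, not a proof


def m(x): return x > 0
def n(x): return x > 1
def o(x): return x > 0          # "x is a positive number"
def alpha(x): return x == 3
def beta(x): return x + 2 == 5
def gamma(x): return x ** 2 == 9


print(equivalent_on(m, o, test_values))          # True
print(equivalent_on(m, n, test_values))          # False (they differ at x = 0.5)
print(equivalent_on(alpha, beta, test_values))   # True
print(equivalent_on(alpha, gamma, test_values))  # False (they differ at x = -3)
```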
Exercise 7. Continue with the above example. Explain if the negation of each of the
following statements is true. (Answer on p. 1388.)
Which is true.
Example 17. Let G: “I’m sleeping” and H: “My eyes are shut”.
Then the implication G ⇒ H is the statement:
Which is true.
This all seems simple enough. However, you may find the formal definition of P ⇒ Q a
little strange and unintuitive:
NOT-P OR Q.
What confuses students most about the above definition is this: From a false hypothesis,
any conclusion may be drawn! That is:
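The definition NOT-P OR Q can be tabulated by machine. The Python sketch below prints all four combinations of truth values. The function name implies is my own; Python has no built-in implication operator.

```python
from itertools import product


def implies(p, q):
    """P => Q, defined as NOT-P OR Q."""
    return (not p) or q


# Print the full truth table: P, Q, then P => Q.
for p, q in product([True, False], repeat=2):
    print(p, q, implies(p, q))
```

The last two rows of the output are the ones with a false hypothesis, and in both of them the implication comes out true, whatever Q is.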
Example 21. Let E: “It’s raining” and F : “The grass is wet”. Then
Note that F ⇒ E is false. One way to prove that a statement is false is by supplying
a counterexample. A counterexample that shows that F ⇒ E is false is any scenario
where the grass is wet even though it isn’t raining. We can easily think of three such
counterexamples:
1. The rain just stopped.
2. Someone is watering the grass.
3. A dog is peeing on the grass.
Observe that in the above example, E ⇒ F is true but its converse F ⇒ E is false.
This proves that an implication and its converse are not always equivalent. Let’s
jot this down formally:
(P ⇒ Q) ⇎ (Q ⇒ P).
Fact 4. (Q ⇒ P) ⇐⇒ (NOT-Q OR P).
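A truth table also shows exactly where an implication and its converse part ways. In the following Python sketch, implies is a hypothetical helper built from the NOT-P OR Q definition, and differences collects the truth-value pairs at which the two statements disagree.

```python
from itertools import product


def implies(p, q):
    """P => Q, defined as NOT-P OR Q."""
    return (not p) or q


# Pairs (P, Q) at which the implication and its converse disagree.
differences = [(p, q) for p, q in product([True, False], repeat=2)
               if implies(p, q) != implies(q, p)]
print(differences)  # [(True, False), (False, True)]
```

So the two agree when P and Q have the same truth value and disagree otherwise, which is why they cannot be equivalent in general.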
Exercise 11. Write down the converse of each statement. Then explain whether this
converse is true. (Answer on p. 1389.)
(a) “If Tin Pei Ling (TPL) is a genius, then the Nazis won World War II (WW2).”
(b) “If TPL is a genius, then the Allies won WW2.”
(c) “If π is rational, then I am the king of the world.”
(d) “If π is rational, then Lee Hsien Loong is Lee Kuan Yew’s son.”
Exercise 13. Fill in the blanks with (i) must be true; (ii) must be false; or (iii) could be
true or false. (Answer on p. 1390.)
1. “P ⇒ Q.”
2. “Q.”
3. “Therefore, P .”
Example 22. Examples of affirming the consequent or the fallacy of the converse:
When spelt out so explicitly, affirming the consequent or the fallacy of the converse
seems rather silly. But unfortunately, people make this error all the time. Hopefully you’ll
now be able to avoid it.
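We can check by brute force that the two premises do not force the conclusion. The Python sketch below keeps only those truth-value pairs where both “P ⇒ Q” and “Q” hold, then asks whether P is true in every one of them. The names implies and cases are my own.

```python
from itertools import product


def implies(p, q):
    """P => Q, defined as NOT-P OR Q."""
    return (not p) or q


# Keep only the cases where both premises ("P => Q" and "Q") hold.
cases = [(p, q) for p, q in product([True, False], repeat=2)
         if implies(p, q) and q]
print(cases)  # [(True, True), (False, True)]

# Is the conclusion "P" true in every such case?
print(all(p for p, q in cases))  # False
```

The case (False, True) satisfies both premises even though P is false, so concluding P is an error.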
45
By the way, such a chain of reasoning is called a syllogism. A syllogism has two or more statements
called premises, followed by a conclusion.
3.9. The Negation NOT-(P ⇒ Q)
Which of the following correctly negates I ⇒ J? In other words, which of the following
statements is NOT-(I ⇒ J)?
(a) “If x is German, then x is not European.”
(b) “If x is not German, then x is European.”
(c) “Some x is German and not European.”
(d) “Some x is European and not German.”
This is tricky and you should take as long as you need to think about it, before reading
the answer/explanation on the next page. The point of this exercise is to demonstrate to
yourself that it isn’t obvious what the negation of an implication is. (Or if it’s obvious,
it’ll demonstrate that you’re pretty smart.)
(As I’ve repeatedly stressed, do not do the intellectually-lazy thing of skipping ahead.
Give it at least three minutes of honest effort before going to the next page.)
Or don’t.
Whatevs.
Exercise 15. Prove the above Fact. (Hint in footnote.)46 (Answer on p. 1390.)
46
Look at the definition of P ⇒ Q (Definition 7). What is its negation?
We can now easily solve the Wason Four-Card Puzzle.
Example 7. The Wason Four-Card Puzzle. In a special deck of cards, each card
has a letter on one side and a number on the other. You are shown these four cards:
A Z 1 8
Betsy the Bimbotic Blonde now comes along and makes the following claim:
“If a card has a vowel on one side, then it has an even number on the other side.”
You suspect that Betsy is wrong. To prove that she’s wrong, which of the above four
cards should you turn over? (The goal is to turn over as few cards as possible.)
The answer is that we should turn over A and 1. Here are two explanations:
Solution I. By Fact 5, the negation of P ⟹ Q is P AND NOT-Q. Thus, the negation
of Betsy’s claim is:
“A card has a vowel on one side AND an odd number on the other side.”
In case you weren’t convinced, here’s Solution II, which doesn’t directly use Fact 5. We
can also call this the brute-force case-by-case method:
• An odd number behind A would prove Betsy wrong. So, we should turn over A.
• An odd number behind Z would not prove Betsy wrong. Nor would an even number.
So, we needn’t turn over Z.
• A vowel behind 1 would prove Betsy wrong. So, we should turn over 1.
• A vowel behind 8 would not prove Betsy wrong. Nor would a consonant. So, we
needn’t turn over 8.
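If you like, the case-by-case method of Solution II can also be sketched in Python. This is only an illustration: the names `is_vowel`, `violates`, and `must_turn` are made up here, and we assume the hidden faces come from a small sample of letters and numbers.

```python
# Brute-force check of the Wason Four-Card Puzzle (illustrative sketch).
# Assumption: hidden letters come from LETTERS and hidden numbers from NUMBERS.
LETTERS = ["A", "E", "Z", "K"]
NUMBERS = [1, 2, 7, 8]

def is_vowel(letter):
    return letter in "AEIOU"

def violates(letter, number):
    """A card proves Betsy wrong if it has a vowel AND an odd number."""
    return is_vowel(letter) and number % 2 == 1

def must_turn(visible):
    """A card is worth turning over if some hidden face would prove Betsy wrong."""
    if isinstance(visible, str):  # we see a letter, so the hidden side is a number
        return any(violates(visible, n) for n in NUMBERS)
    return any(violates(l, visible) for l in LETTERS)  # we see a number

cards = ["A", "Z", 1, 8]
to_turn = [c for c in cards if must_turn(c)]
print(to_turn)  # ['A', 1]
```

Only the cards that could hide a vowel-with-odd-number combination survive the filter, which matches the answer given above.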
Observe that both I ⟹ J and its contrapositive NOT-J ⟹ NOT-I are true.
Observe that both J ⟹ I and its contrapositive NOT-I ⟹ NOT-J are false.
Observe that both E ⟹ F and its contrapositive NOT-F ⟹ NOT-E are true.
Observe that both F ⟹ E and its contrapositive NOT-E ⟹ NOT-F are false.
47
Here we make the implicit assumption that the logical connective OR is commutative.
Fact 6 is especially useful on those occasions when it’s hard to prove an implication but
easy to prove its contrapositive:48
Example 26. It’s not obvious how we can prove the following implication:
If x^4 − x^3 + x^2 ≠ 1, then x ≠ 1.
But it is easy to prove its contrapositive (just plug in x = 1):
If x = 1, then x^4 − x^3 + x^2 = 1.
Example 27. It’s not obvious how we can prove the following implication:
But don’t worry. In H2 Maths, you won’t be required to write any proofs; the above is just
FYI and to illustrate why the contrapositive is useful.
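If you'd like to convince yourself that an implication and its contrapositive always agree, here is a small Python sketch that checks every combination of truth values. The helper `implies` is our own and assumes the usual convention that "P implies Q" is false only when P is true and Q is false.

```python
from itertools import product

def implies(p, q):
    # Assumption: "P implies Q" is false only when P is true and Q is false.
    return (not p) or q

rows = list(product([True, False], repeat=2))

for p, q in rows:
    # Columns: P, Q, P => Q, contrapositive, converse
    print(p, q, implies(p, q), implies(not q, not p), implies(q, p))

# The implication always agrees with its contrapositive ...
same_as_contrapositive = all(implies(p, q) == implies(not q, not p) for p, q in rows)
# ... but it does not always agree with its converse.
same_as_converse = all(implies(p, q) == implies(q, p) for p, q in rows)
print(same_as_contrapositive, same_as_converse)  # True False
```

The second result is a reminder of why affirming the consequent is a fallacy: the converse can fail even when the implication holds.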
Exercise 17. The statement “If x is German, then x is European” is true. Which of the
following statements is its contrapositive? Which are true? (Answer on p. 1391.)
(a) “If x is European, then x is German.”
(b) “If x is not German, then x is not European.”
(c) “If x is not German, then x is European.”
(d) “If x is not European, then x is not German.”
(e) “If x is not European, then x is German.”
48
These examples are from .
3.11. (P ⟹ Q AND Q ⟹ P) ⇐⇒ (P ⇐⇒ Q)
To show that two statements are equivalent, we can use Definition 12. So for example:
• M ⇐⇒ O, because it is impossible that one is true while the other is false.
• I ⇎ J (counterexample: if x = Emmanuel Macron, then I is false while J is true).
• M ⇎ N (counterexample: if x = 0.5, then M is true while N is false).
And so, to show that P ⇐⇒ Q, we can show that P ⟹ Q and Q ⟹ P are both true.
And to show that P ⇎ Q, we can show that either P ⟹ Q or Q ⟹ P is false.
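The claim in this section's title can be checked mechanically with a truth table. Below is a minimal Python sketch. The helpers `implies` and `equivalent` are our own names, and we assume that two statements are equivalent exactly when they have the same truth value.

```python
from itertools import product

def implies(p, q):
    # Assumption: an implication fails only when p is true and q is false.
    return (not p) or q

def equivalent(p, q):
    # Assumption: equivalent statements always share the same truth value.
    return p == q

rows = list(product([True, False], repeat=2))
both_directions_match = all(
    (implies(p, q) and implies(q, p)) == equivalent(p, q) for p, q in rows
)
print(both_directions_match)  # True
```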
Exercise 18. Continue with the last example: Is N ⇐⇒ O true? (Answer on p. 1391.)
Exercise 19. Let X: “John is a Singapore citizen”, Y : “John has a National Registration
Identity Card (NRIC)”, and Z: “John has a pink NRIC”. Are any two of these three
statements equivalent? (Answer on p. 1391.)
Example 30. Let E: “It is raining” and F : “The grass is wet”. Then all of the following
statements are exactly equivalent:
Exercise 20. Let G: “I’m sleeping” and H: “My eyes are shut”. Construct the exact
same table as we just did, but for G ⟹ H. (Answer on p. 1391.)
49
See this Washington Post story: “There really are 50 Eskimo words for ‘snow’”.
50
It’s far from obvious, but implies is logically equivalent to only if. And thus, the symbol ⟹ can be
read aloud not only as “implies”, but also as “only if”.
3.13. The Four Categorical Propositions and Their Negations
Note first that in mathematics and logic, some means at least one.
The four categorical propositions are:51
The six subjects used are: “Korean”, “German”, “animal”, “mammal”, “Korean”, and
“German”.
The six predicates used are: “Asian”, “European”, “a dog”, “a bat”, “eats dogs” (or “a
dog-eater”), and “eats bats” (or “a bat-eater”).
51
These are also called the A, E, I, and O propositions.
Exercise 21. For the given pair of subject S and predicate P , write down the corres-
ponding UA, UN, PA, and PN. (Answer on p. 1391.)
S P
(a) Donzer Kiki
(b) Donzer Cancer
(c) Bachelor Married
(d) Bachelor Smoke
Exercise 22. Is each of the following statements always true? If so, explain why. If not,
supply a counterexample. (Answer on p. 1392.)
(a) The UA and UN are negations of each other.
(b) The PA and UN are negations of each other.
The above examples show that in general, the negation of the UA is the PN, while the
negation of the UN is the PA:
Statement                  Negation
UA: “All S are P.”         PN: “Some S is NOT-P.”
UN: “No S is P.”           PA: “Some S is P.”
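Here is a small Python sketch of why the UA and the PN (and likewise the UN and the PA) always have opposite truth values. Everything in it is made up for illustration: the "subjects" are whole numbers and the predicate P is "is even".

```python
def is_p(x):
    # Hypothetical predicate P: "x is even".
    return x % 2 == 0

def ua(subjects):
    """UA: All S are P."""
    return all(is_p(s) for s in subjects)

def un(subjects):
    """UN: No S is P."""
    return not any(is_p(s) for s in subjects)

def pa(subjects):
    """PA: Some S is P."""
    return any(is_p(s) for s in subjects)

def pn(subjects):
    """PN: Some S is NOT-P."""
    return any(not is_p(s) for s in subjects)

samples = [[2, 4, 6], [1, 3, 5], [1, 2, 3]]
for subjects in samples:
    # In every row, UA and PN differ, and so do UN and PA.
    print(subjects, ua(subjects), pn(subjects), un(subjects), pa(subjects))
```

Notice that for [1, 2, 3] both the UA and the UN are false, so those two cannot be negations of each other.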
Statement
(a) “All donzers are kiki.”
(b) “No donzer is kiki.”
(c) “Some donzer is kiki.”
(d) “Some donzer is not kiki.”
(e) “All bachelors are married.”
(f) “No bachelor is married.”
(g) “Some bachelor is married.”
(h) “Some bachelor is not married.”
(i) “All donzers cause cancer.”
(j) “No donzer causes cancer.”
(k) “Some donzer causes cancer.”
(l) “Some donzer does not cause cancer.”
(m) “All bachelors smoke.”
(n) “No bachelor smokes.”
(o) “Some bachelor smokes.”
(p) “Some bachelor does not smoke.”
Exercise 24. While trying to excuse the less-than-perfect play of a basketball player, a
commentator remarks, “Everybody is not LeBron James.” Rewrite this statement into
the form of a categorical proposition. Identify the type of categorical proposition, the
subject, and the predicate.
Write down its negation. Then state if the commentator’s statement is true or false. If
false, what should he have said instead? (Answer on p. 1392.)
Statement Negation
UA: “All S are P .” PN: “Some S is NOT-P .”
UN: “No S is P .” PA: “Some S is P .”
[Figure: the set A, a box containing 1, 3, and 5, and the set B, a box containing 100 and 200.]
The set A contains three elements, namely the numbers 1, 3, and 5. Informally, it is
a “box” containing the numbers 1, 3, and 5.
The set B contains two elements, namely the numbers 100 and 200. Informally, it is
a “box” containing the numbers 100 and 200.
Note that when we talk about a set, we refer to both the box and the things inside it.
Mathematical punctuation:
• Braces {} — are used to denote the “container”.
• A comma means “and” and is used to separate the elements within a set.
Exercise 25. Write down C, the set of the first 7 positive integers. (Answer on p. 1393.)
Exercise 26. Write down D, the set of even prime numbers. (Answer on p. 1393.)
Example 36. Let V be the set of the four largest cities in the US. Then V =
{New York City, Los Angeles, Chicago, Houston}.
Example 37. Let L be the set of suits in the game of bridge. Then L = {♠, ♡, ♢, ♣}.
Example 38. Let E = {3, π2 , The Clementi Mall, Love, the colour green}.
[Figure: The set E]
The set E contains exactly five elements: two numbers — 3 and π2 ; a shopping centre
— The Clementi Mall; an abstract concept called love (denoted in the figure above by a
red heart); and even the colour green (denoted by a green rectangle).
Example 39. Let S be the set of Singapore citizens. Then S contains about 3.4M
elements,53 including Lee Hsien Loong, Ho Ching, and Chee Soon Juan.
Example 40. Let U be the set of United Nations (UN) member states. Then U contains
exactly 193 elements,54 including Afghanistan, Singapore, and Zimbabwe.
Exercise 27. Write down X, the set of Singapore Prime Ministers (both past and
present). (Answer on p. 1393.)
52
Actually, there are some restrictions on what can go into a set, but these technicalities are beyond the
scope of the A-Levels.
53
According to SingStat, the number of Singapore citizens in 2017 was about 3,439,200.
54
According to this UN webpage, the most recent and 193rd state to join the UN was South Sudan in
2011.
A set can even contain other sets:
Example 41. Let F be the set that contains the sets A = {1, 3, 5} and B = {100, 200}.
[Figure: the set F (a box containing the boxes A and B) and the set G (a box containing the numbers 1, 3, 5, 100, and 200).]
The sets F and G look very similar. So, is G the same set as F ?
Nope. The set F contains exactly two elements, namely the sets A and B.
In contrast, G contains exactly five elements, namely the numbers 1, 3, 5, 100, and 200.
And so, F and G are not the same.
You can think of F as a box that itself contains two boxes, namely A and B, each of
which contains some numbers. In contrast, G is a box that contains no boxes; instead, it
simply contains five numbers.
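The "box inside a box" picture can be tried out in Python. This is a rough sketch only: Python insists that a set placed inside another set be a `frozenset`, which is a quirk of the language and not of the mathematics.

```python
A = frozenset({1, 3, 5})
B = frozenset({100, 200})

F = {A, B}               # a box containing two boxes
G = {1, 3, 5, 100, 200}  # a box containing five numbers

print(len(F), len(G))    # 2 5
print(F == G)            # False
print(A in F, 1 in F)    # True False
```

The last line shows that the number 1 is in the box A, but it is not itself one of the two things sitting directly inside F.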
Exercise 29. Let I be the set whose elements are A, B, and G. (Answer on p. 1393.)
Example 42. Let J = {1, 2, 3, 4, 5, 6, 7}. Then 1 ∈ J, 2 ∈ J, 3 ∈ J, etc. You can read these
statements aloud as “1 is in J”, “2 is in J”, “3 is in J”, etc.
We can also write 1, 2, 3 ∈ J (read aloud as “1, 2, and 3 are in J”).
Also, 8 ∉ J, 9 ∉ J, 10 ∉ J, etc. (read aloud as “8 is not in J”, “9 is not in J”, “10 is not
in J”, etc.). We can also write 8, 9, 10 ∉ J (read aloud as “8, 9, and 10 are not in J”).
Definition 10. Two sets A and B are equal if every element that is in A is also in B and
every element that is in B is also in A.
One implication of the above definition55 is that the order in which we write out the
elements of a set does not matter:
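[Figure: two boxes containing the numbers 2, 4, and 6 in different arrangements are equal, as are two boxes containing a cow 🐄 and a chicken 🐔 in different arrangements. Captions: The set C, The set D.]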
Exercise 31. Is each of the following pairs of sets equal? (Answer on p. 1393.)
(a) {1, 2, 3} and {3, 2, 1}. (b) {{1} , 2, 3} and {{3} , 2, 1}.
55
Actually, in set theory, this is not a definition, but an axiom (known as the Axiom of Extensionality).
But here for simplicity, I’ll just call it a definition.
4.4. n(S) Is the Number of Elements in the Set S
Let S be a set. Then the number of elements in S is denoted by:
n(S).
Exercise 32. Let X be the set of Singapore Prime Ministers (past and present). Then
what is n (X)? (Answer on p. 1393.)
Remark 5. Note that most writers denote the number of elements in the set S by ∣S∣.56
But for some reason, your A-Level syllabus (p. 16) instead uses the notation n (S), so
that’s what we’ll have to use too.
Example 48. L is the set of all odd positive integers smaller than 100. So in set notation,
we can write L = {1, 3, 5, 7, 9, 11, . . . , 99}.
Example 49. M is the set of all negative integers greater than −100. So in set notation,
we can write M = {−99, −98, −97, . . . , −2, −1}.
What is “obvious” to you may not be obvious to your reader. So only use the ellipsis when
you’re confident it will be obvious to your reader! And as I did with the sets above, never
be shy to write a few more of the set’s elements (doing so costs you nothing except maybe
a few more seconds and some ink).
Exercise 33. In the above examples, what are n (L) and n (M )? (Answer on p. 1393.)
Exercise 34. Let N be the set of even integers greater than 100 but smaller than 1,000.
Write down N in set notation. (Answer on p. 1393.)
56
Or cardA. See ISO 80000-2:2009, Item No. 2-5.5.
4.6. Repeated Elements Don’t Count
Another implication of Definition 10 is that repeated elements don’t count (they’re
simply ignored):
[Figure: a box containing 2, 4, and 6 is equal to a box containing 2, 2, 4, and 6.]
Example 51. {Cow, Chicken} = {Cow, Cow, Chicken} = {Chicken, Cow, Chicken}.
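Python's built-in sets behave the same way, which makes for a quick sanity check. In the sketch below, the variable names `farm_1` and `farm_2` are our own.

```python
farm_1 = {"Cow", "Chicken"}
farm_2 = {"Chicken", "Cow", "Chicken"}  # different order, one repeated element

print(farm_1 == farm_2)  # True
print(len(farm_2))       # 2

# A list, by contrast, keeps track of both order and repetition.
print(["Cow", "Chicken"] == ["Chicken", "Cow", "Chicken"])  # False
```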
Moreover,
Exercise 36. C is the set of even prime numbers. Find n(C). (Answer on p. 1393.)
Now, by the way, what exactly is a real number? This sounds like a “dumb” question, but
is actually a profound one that was satisfactorily resolved only from the late 19th century.
Indeed, this question is a little beyond the scope of the A Levels.
And so for the A Levels, we’ll simply pretend — as we did in secondary school — that
“everyone knows” what real numbers are (even though, as the quotes below suggest, they
actually don’t). We shall not attempt to define or construct the real numbers.
Note that with an infinite set, we cannot explicitly list out all its elements. And so, when
writing out an infinite set, we’ll sometimes find it helpful to use the ellipsis.
When writing out Z above, we used two ellipses. The first ellipsis says we continue “left-
wards” in the “obvious” fashion, with −4, −5, −6, etc. The second says we continue “right-
wards” in the “obvious” fashion, with 4, 5, 6, etc.
Exercise 37. H is the set of all prime numbers. With the aid of an ellipsis, write down
H in set notation. (Answer on p. 1393.)
It’s not on your syllabus, but a natural number is simply any positive integer:58
N = {1, 2, 3, . . . }.
57
According to Heinrich Weber (in his 1893 obituary for Kronecker), Kronecker made this remark at an
1886 lecture to the Berliner Naturforscher-Versammlung.
58
There’s actually a little bit of a debate as to whether N should include 0.
4.9. Q Is the Set of Rational Numbers
Definition 14. A rational number (or simply rational) is any real number that can be
expressed as the ratio of two integers.
Any other real number is called an irrational number (or simply irrational).
Q is for quotient.59
Example 54. 16 ∈ Q because we can express 16 as the ratio of two integers (e.g. 16/1).
−1.87 ∈ Q because we can express −1.87 as the ratio of two integers (e.g. −187/100).
Example 55. √2, π ∉ Q. In words: “√2 and π are not elements of the set of rational
numbers.” Or more simply: “√2 and π are irrational.”
(Note though that this is far from obvious. It takes a little work to prove that √2 is
irrational and even more work to prove that π is irrational.)
As you probably already know from secondary school, any number whose decimal repres-
entation (eventually) recurs is rational. All other numbers are irrational.60
Example 56. 1/3 = 0.33333⋯ = 0.3̅ is rational and sure enough, it has the recurring digit
3. We will use the overbar to denote recurring digit(s).
1/7 = 0.142857142857142857⋯ = 0.1̅4̅2̅8̅5̅7̅ is rational and sure enough, it has the recurring
digits 142857.
Similarly, 16 = 16.000⋯ = 16.0̅ and 1.87 = 1.87000⋯ = 1.870̅ are rational and have the
recurring digit 0. (Of course, when the recurring digit is 0, we usually don’t bother
writing it.)
Example 57. √2 ≈ 1.4142135623 . . . and π ≈ 3.1415926535 . . . are irrational. And sure
enough, their digits never recur.
(But again, this is far from obvious and takes some work to prove.)
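To see why the decimal digits of a ratio of two integers must eventually recur, we can do the long division by machine. The Python sketch below (the function name `decimal_expansion` is our own) stops as soon as a remainder repeats. Since there are only finitely many possible remainders, this must happen sooner or later. Note that this illustrates only one direction of the claim above (rational implies recurring) and that we assume a positive numerator and denominator.

```python
def decimal_expansion(numerator, denominator):
    """Return (integer part, non-recurring digits, recurring digits).

    Assumption: numerator and denominator are positive integers.
    """
    integer_part, remainder = divmod(numerator, denominator)
    digits = []
    seen = {}  # remainder -> position in `digits` where it first appeared
    while remainder not in seen:
        seen[remainder] = len(digits)
        digit, remainder = divmod(remainder * 10, denominator)
        digits.append(str(digit))
    start = seen[remainder]  # the digits recur from this position onwards
    return integer_part, "".join(digits[:start]), "".join(digits[start:])

print(decimal_expansion(1, 3))      # (0, '', '3')
print(decimal_expansion(1, 7))      # (0, '', '142857')
print(decimal_expansion(187, 100))  # (1, '87', '0')
```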
59
Also quoziente in Italian, Quotient in German, and quotient in French.
60
We prove this in Fact 188 (p. 1254) of the Appendices.
4.10. A Taxonomy of Numbers
Below is a taxonomy of the types of numbers you’ll encounter in this textbook. We’ll study
complex and imaginary numbers only later on in Part IV (quick preview: i = √−1 is
an imaginary number; indeed, i is the imaginary unit).
Real numbers are either rational or irrational. In turn, rational numbers are either integers
or non-integers.
Complex numbers C
    Reals R
        Rationals Q
            Integers Z
            Non-integers
        Irrationals
    Imaginary numbers
Remark 6. This textbook uses blackboard bold font and writes ℝ, ℤ, and ℚ. Note
though that some other writers instead use bold font and write 𝐑, 𝐙, and 𝐐. Your
A-Level syllabus and exams use only the former, so that’s what we’ll do too.
61
Actually, the truth is somewhat more complicated. For example, some writers call ∞ and −∞ extended
real numbers. But in this textbook, I’ll keep it simple and insist that infinity is not a number.
4.11. More Notation: + , − , and 0
To create a new set that contains only the positive elements of the old set, append a
superscript plus sign (+ ) to the name of a set:
1. Z+ = {1, 2, 3, . . . } is the set of all positive integers.
2. Q+ is the set of all positive rational numbers.
3. R+ is the set of all positive real numbers.
To create a new set that contains only the negative elements of the old set, append a
superscript minus sign (− ) to the name of a set:
1. Z− = {−1, −2, −3, . . . } is the set of all negative integers.
2. Q− is the set of all negative rational numbers.
3. R− is the set of all negative real numbers.
(As we’ll learn later, there is no such thing as a positive or negative complex number.
Hence, there are no sets denoted C+ or C− .)
To add the number 0 to a set, append a subscript zero (0 ) to its name:
1. Z+0 = {0, 1, 2, 3, . . . } is the set of all non-negative integers. Z−0 = {0, −1, −2, −3, . . . } is the
set of all non-positive integers.
2. Q+0 is the set of all non-negative rational numbers. Q−0 is the set of all non-positive
rational numbers.
3. R+0 is the set of all non-negative real numbers. R−0 is the set of all non-positive real
numbers.
Exercise 38. Which of the sets introduced above are finite? (Answer on p. 1393.)
Remark 7. The three pieces of notation introduced on this page (+ , − , and 0 ) aren’t terribly
important or widely used. I give them a quick mention only because they’re listed on p.
16 of your A-Level syllabus.
Definition 16. The empty set is the set {}. It is often also denoted ∅.
[Figure: The empty set, {} or ∅.]
Informally, the empty set {} = ∅ is the “container” with nothing inside. Hence the name.
Example 58. In 2016, the set of all Singapore Ministers who are younger than 30 is {}
or ∅. This means there is no Singapore Minister who is younger than 30.
Example 59. The set of all even prime numbers greater than 2 is {} or ∅. This means
there is no even prime number that is greater than 2.
Example 60. The set of numbers that are greater than 4 and smaller than 4 is {} or ∅.
This means there is no number that is simultaneously greater than 4 and smaller than 4.
Example 61. The set {∅} is not the same as the set ∅.
{∅} is a set containing a single element, namely the empty set.
Informally, {∅} is a box containing an empty box — it is not empty.
In contrast, ∅ is the empty set.
Informally, it is simply an empty box.
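The difference between an empty box and a box containing an empty box can be checked in Python. This is a sketch only: we use `frozenset()` for the empty set because Python requires a set placed inside another set to be a `frozenset`.

```python
empty = frozenset()       # the empty set: an empty box
box_with_empty = {empty}  # a box containing an empty box

print(len(empty))               # 0
print(len(box_with_empty))      # 1
print(empty == box_with_empty)  # False
```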
We can also rewrite the two sets as:
{∅, 3, {∅}}
Exercise 39. Tricky: Let S = {{{}} , ∅, {∅} , {}}. What is n (S)? (Answer on p. 1393.)
Example 63. Let A = (0, 3). Then A is the set of real numbers that are > 0 and < 3. So,
√2 ≈ 1.41 ∈ A, but 0, 3 ∉ A.
[Figure: the interval (0, 3) on a number line from −3 to 4.]
Example 64. Let B = [0, 3]. Then B is the set of real numbers that are ≥ 0 and ≤ 3. So,
0, √2, 3 ∈ B.
[Figure: the interval [0, 3] on a number line from −3 to 4.]
Example 65. Let C = (0, 3]. Then C is the set of real numbers that are > 0 and ≤ 3. So,
√2, 3 ∈ C, but 0 ∉ C.
[Figure: the interval (0, 3] on a number line from −3 to 4.]
Example 66. Let D = [0, 3). Then D is the set of real numbers that are ≥ 0 and < 3.
So, 0, √2 ∈ D, but 3 ∉ D.
[Figure: the interval [0, 3) on a number line from −3 to 4.]
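The four kinds of interval differ only in whether each endpoint is included. Here is a minimal Python sketch of that idea. The function `in_interval` and its parameters `closed_low` and `closed_high` are hypothetical names chosen for this illustration.

```python
import math

def in_interval(x, low, high, closed_low, closed_high):
    """Check whether x lies in the interval from low to high.

    closed_low / closed_high say whether each endpoint is included,
    i.e. whether we write a square bracket or a parenthesis.
    """
    above = x >= low if closed_low else x > low
    below = x <= high if closed_high else x < high
    return above and below

root2 = math.sqrt(2)
intervals = [("(0, 3)", False, False), ("[0, 3]", True, True),
             ("(0, 3]", False, True), ("[0, 3)", True, False)]

for name, closed_low, closed_high in intervals:
    members = [x for x in (0, root2, 3)
               if in_interval(x, 0, 3, closed_low, closed_high)]
    print(name, members)
```

Each printed row lists which of 0, √2, and 3 belong to that interval, and should agree with the four examples above.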
Exercise 40. Let X = [1, 1] , Y = (1, 1) , Z = (1, 1.01). Find n (X), n (Y ), n (Z). Express
the set X in another way and the set Y in another two ways. (Answer on p. 1393.)
Exercise 41. Express R, R+ , R+0 , R− , and R−0 in interval notation. (Answer on p. 1394.)
62
One good argument in favour of reverse bracket notation is that it avoids confusing the open interval
(a, b) with the ordered pair (a, b) (we’ll learn more about ordered pairs in Ch. 6). However, by
and large, the reverse bracket notation remains uncommon, except in continental Europe and especially
France (where it was introduced by the Bourbaki group).
4.14. Subset Of ⊆
A ⊆ B.
Example 67. Let M = {1, 2}, N = {1, 2, 3}, O = {1, 2, 4, 5}, and P = {3, 2, 1}. Then:
• M is a subset of N , O, and P .
We write M ⊆ N , M ⊆ O, and M ⊆ P .
• N is a subset of P , but not of M or O.
We write N ⊆ P , but N ⊈ M and N ⊈ O.
• O is not a subset of M , N , or P .
We write O ⊈ M , O ⊈ N , and O ⊈ P .
• P is a subset of N , but not of M or O.
We write P ⊆ N , but P ⊈ M and P ⊈ O.
Note that N is a subset of P and P is a subset of N . Indeed, the sets N and P are equal.
We have the following fact:
Fact 8. Two sets are equal ⇐⇒ They are subsets of each other.
Proof. In the following chain of reasoning, the first ⇐⇒ simply uses our definition of when
two sets are equal (Definition 10). The second ⇐⇒ simply uses the above definition.
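The claims in Example 67, and Fact 8 for the particular sets N and P, can be spot-checked in Python, where `<=` between two sets means "is a subset of". This is a sanity check on one example and not a proof.

```python
M = {1, 2}
N = {1, 2, 3}
O = {1, 2, 4, 5}
P = {3, 2, 1}

print(M <= N, M <= O, M <= P)  # True True True
print(N <= P, N <= M, N <= O)  # True False False
print(O <= M, O <= N, O <= P)  # False False False

# Fact 8 for N and P: they are equal, and each is a subset of the other.
print(N == P, N <= P and P <= N)  # True True
```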
Exercise 43. True or false: “The set of current Singapore Prime Minister(s) is a subset
of the set of current Singapore Minister(s).” (Answer on p. 1394.)
Exercise 44. Let A and B be sets. Explain whether each of the following statements is
true. (If false, give a counterexample.) (Answer on p. 1394.)
(a) A ⊆ B ⟹ A = B.
(b) B ⊆ A ⟹ A = B.
(c) A = B ⟹ A ⊆ B.
(d) A = B ⟹ B ⊆ A.
(e) A = B ⇐⇒ A ⊆ B.
(f) A = B ⇐⇒ B ⊆ A.
A ⊂ B.
Example 68. Let M = {1, 2}, N = {1, 2, 3}, O = {1, 2, 4, 5}, and P = {3, 2, 1}. Then:
Note that N = P and so by the above Definition, N is not a proper subset of P and P
is not a proper subset of N .
Exercise 45. Let S be the set of all squares and R be the set of all rectangles. Is S ⊂ R?
(Answer on p. 1394.)
Exercise 48. True or false statement: “If A is a subset of B, then A is either a proper
subset of or is equal to B.” (Answer on p. 1394.)
Remark 9. The A-Level syllabus (p. 16) uses the symbol ⊆ to mean “subset of” and ⊂ to
mean “proper subset of”. So this is what we’ll use in this textbook.
However, confusingly enough, some writers use the symbol ⊂ to mean “subset of” and ⊊
to mean “proper subset of”. We will not follow such practice in this textbook. This is
just FYI, in case you get confused when reading other mathematical texts!
Definition 20. The union of A and B is the set of elements that are in A OR B and is
denoted A ∪ B.
Example 69. Let T = {1, 2}, U = {3, 4}, and V = {1, 2, 3}.
Then T ∪ U = {1, 2, 3, 4}, T ∪ V = {1, 2, 3}, and U ∪ V = {1, 2, 3, 4}. And T ∪ U ∪ V =
{1, 2, 3, 4}.
Exercise 50. Let S be the set of squares and R be the set of rectangles. What is S ∪ R?
(Answer on p. 1394.)
Exercise 51. What is the union of the set of rationals and the set of irrationals? (Answer
on p. 1394.)
Definition 21. The intersection of A and B, denoted A ∩ B, is the set of elements that
are in A AND B. Two sets intersect if their intersection contains at least one element.
A ∩ B ≠ ∅.
Definition 22. We say that two sets are mutually exclusive or disjoint if they do not
intersect, i.e. their intersection is empty:
A ∩ B = ∅.
Example 70. Let T = {1, 2}, U = {3, 4}, and V = {1, 2, 3}.
Then T ∩ U = ∅, T ∩ V = {1, 2}, U ∩ V = {3}, and T ∩ U ∩ V = ∅.
Exercise 53. Let S be the set of squares and R be the set of rectangles. What is S ∩ R?
(Answer on p. 1394.)
Exercise 54. What is the intersection of the set of rationals and the set of irrationals?
(Answer on p. 1394.)
Definition 23. A set minus B is the set of elements that are in A AND not in B and is
denoted A ∖ B.
Example 71. Let T = {1, 2}, U = {3, 4}, and V = {1, 2, 3}.
Then T ∖ U = T , T ∖ V = ∅, and U ∖ V = {4}.
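Python's set operators mirror the three operations just defined: `|` for union, `&` for intersection, and `-` for set minus. The sketch below re-checks Examples 69 to 71 with the same sets T, U, and V.

```python
T = {1, 2}
U = {3, 4}
V = {1, 2, 3}

print(T | U, T | V, U | V)  # unions
print(T & U, T & V, U & V)  # intersections (Python prints the empty set as set())
print(T - U, T - V, U - V)  # set minus

# T and U are mutually exclusive (disjoint): their intersection is empty.
print(T.isdisjoint(U))      # True
```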
Exercise 55. Continue with the above example. Write down V ∖ T and V ∖ U . (Answer
on p. 1394.)
Here are some examples to illustrate why the set minus notation is sometimes very convenient and
allows us to avoid writing ugly monstrosities:
Example 72. Without the set minus sign, we’d write (−∞, 1) ∪ (1, ∞) to denote the set
of all real numbers except 1.
With it, we can write the same set more simply as R ∖ {1}.
Example 73. Without the set minus sign, we’d write ⋯ ∪ (−3, −2) ∪ (−2, −1) ∪ (−1, 0) ∪
(0, 1) ∪ (1, 2) ∪ (2, 3) ∪ ⋯ to denote the set of all real numbers that aren’t integers.
With it, we can write the same set more simply as R ∖ Z.
Example 74. In the context of a roll of a die, the universal set might be the set of all
possible outcomes:
E = {1, 2, 3, 4, 5, 6} .
Example 75. In the context of a spin of a European-style roulette wheel, the universal
set might be the set of all possible outcomes:
E = {0, 1, 2, 3, . . . , 36} .
Note that in American-style roulette, there is a 38th possible outcome — double zero 00.
And so in the American context, the universal set might instead be:
E = {00, 0, 1, 2, 3, . . . , 36} .
Example 76. In the context of a game of chess, the universal set might be the set of all
possible outcomes for White:
Remark 10. I give the universal set notation E a quick mention here only because it
appears on your A-Level syllabus (p. 16). We will rarely (if ever) make use of this piece
of notation in this textbook.
Example 77. Let A = {2, 3}. If the relevant context is the roll of a die, then:
Example 78. Let B = {2, 4, 6, . . . }. If the relevant context is the positive integers, then:
Example 79. Let C = R+ . If the relevant context is all real numbers, then:
E = R and C′ = R ∖ C = R−0.
Remark 11. Just so you know (JSYK), some writers write A^c or Ā instead of A′.
4.21. De Morgan’s Laws
It turns out there’s a deep connection between logic and set theory. In particular:
• The intersection ∩ corresponds to the logical connective AND (the conjunction).
• The union ∪ corresponds to the logical connective OR (the disjunction).
Earlier in logic, we had De Morgan’s Laws:
• Fact 1: The negation of the conjunction P AND Q is NOT-P OR NOT-Q.
• Fact 2: The negation of the disjunction P OR Q is NOT-P AND NOT-Q.
We now have the following De Morgan’s Laws for set theory:
Fact 9. (P ∩ Q)′ = P′ ∪ Q′.
Fact 10. (P ∪ Q)′ = P′ ∩ Q′.
[Figure: two Venn diagrams. In the first, (P ∩ Q)′ = P′ ∪ Q′ is in yellow. In the second, (P ∪ Q)′ = P′ ∩ Q′ is in yellow.]
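For a small universal set, both of De Morgan's Laws can be verified by brute force. The Python sketch below assumes the universal set is the six outcomes of a die roll and tries every possible pair of subsets P and Q. This is a check on one small example and not a proof of the general facts.

```python
from itertools import combinations

E = frozenset(range(1, 7))  # assumed universal set: the outcomes of a die roll

def complement(s):
    """The complement of s relative to the universal set E."""
    return E - s

# Every subset of E (there are 2^6 = 64 of them).
subsets = [frozenset(c) for r in range(len(E) + 1) for c in combinations(E, r)]

fact_9 = all(complement(p & q) == complement(p) | complement(q)
             for p in subsets for q in subsets)
fact_10 = all(complement(p | q) == complement(p) & complement(q)
              for p in subsets for q in subsets)
print(len(subsets), fact_9, fact_10)  # 64 True True
```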
🐎 🐘 🐖
🐁
• P ′ ∩ Q′ is the set of animals that are short AND lean.
• P ∪ Q is the set of animals that are tall OR fat.
• (P ∪ Q) ′ = P ′ ∩ Q′ is the set of animals that are short AND lean (yellow region below).
🐎 🐘 🐖
🐁
4.22. Set-Builder Notation (or Set Comprehension)
Previously, we simply wrote out a set using the method of set enumeration. That is to
say, we simply enumerated (i.e. listed out) all their elements:
Example 81. The set of Singapore PMs (both past and present) is:
Where the set has too many elements to list out, we can use the ellipsis “. . . ”. But this too
counts as the method of set enumeration:
We now introduce a second method of writing out a set, called set-builder notation or
set comprehension:
Example 83. The set of Singapore PMs (both past and present) is:
In set-builder notation, the mathematical punctuation mark colon “∶” means such that.
Following the colon is the property or criterion that x must satisfy in order to be an
element of the set. Hence, S is
Example 84. T = {x ∶ x ∈ Z, x > 100}. (Recall that the comma “,” means AND.)
T is “the set that contains all elements x such that x is an integer AND x > 100”. (This
time, following the colon are two properties or criteria that x must satisfy in order to be
an element of the set.)
We could also have written T as “the set that contains all integers x such that x > 100”:
T = {x ∈ Z ∶ x > 100}.
Example 85. Using set-builder notation, the set of ASEAN members is:
{x ∶ x is a member of ASEAN} .
This is “the set that contains all elements x such that x is a member of ASEAN”.
Example 86. Let Familee be the set of individuals who have ever been members of
Singapore’s Royal or First Family. Now consider:
A is “the set that contains all elements x such that x has ever been the Singapore PM
AND x has never been a member of the Familee”. And so:
A = {GCT} .
Example 87. Let B = {x ∶ x is a member of ASEAN, x has fewer than 5,000 islands}.
That is, B is “the set that contains all elements x such that x is a member of ASEAN
AND x has fewer than 5,000 islands”. Then:
Exercise 56. Rewrite the sets S and T using set enumeration. (Answer on p. 1394.)
Hence: B = {1}.
This says that B is “the set that contains all positive real numbers x such that x^2 − 1 = 0”.
R+ = (0, ∞) = {x ∈ R ∶ x > 0} .
This is “the set that contains all reals x such that x is greater than 0”.
Similarly, the set of non-negative reals may be written as:
R+0 = [0, ∞) = {x ∈ R ∶ x ≥ 0} .
This is “the set that contains all reals x such that x is greater than or equal to 0”.
Similarly, we have:
{2, 4, 6, 8, . . . } = {x ∶ x = 2k, k ∈ Z+ } .
This is “the set that contains all elements x such that x equals 2k AND k is a positive
integer”. Notice that here we introduce a second placeholder or dummy variable k to
help us describe the set.
Again, both the letters or symbols x and k could’ve been replaced by any other letters
or symbols and we’d still have the same set. For example:
{2, 4, 6, 8, . . . } = {p ∶ p = 2q, q ∈ Z+ }
= {⋆ ∶ ⋆ = 2⧫, ⧫ ∈ Z+ }
= {☺ ∶ ☺ = 2☹, ☹ ∈ Z+ } .
Of course, it is customary and thus preferable to stick to letters like x, k, p, and q rather
than weird symbols like shapes and faces.
Another way to write down the set of positive even numbers:
{2, 4, 6, 8, . . . } = {x ∶ x/2 ∈ Z+ } .
This is “the set that contains all elements x such that x divided by 2 is a positive integer”.
Actually, there is a simpler way to write the above set without using two placeholder
variables. We can simply write:
{2, 4, 6, 8, . . . } = {2k ∶ k ∈ Z+ } .
In words, this is “the set of all elements 2k such that k is a positive integer”.
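Set-builder notation has a close cousin in Python called a set comprehension. The sketch below is only an analogy: a computer cannot hold an infinite set, so we assume k runs from 1 to 10 only, and the names `evens_a` and `evens_b` are our own.

```python
# Assumption: we only look at k = 1, ..., 10 (and x = 1, ..., 20).
evens_a = {2 * k for k in range(1, 11)}            # like {2k : k in Z+}
evens_b = {x for x in range(1, 21) if x % 2 == 0}  # like {x : x/2 in Z+}

print(sorted(evens_a))     # [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
print(evens_a == evens_b)  # True
```

As in the mathematical notation, the letters k and x are placeholders and could be swapped for any other names.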
Remark 13. Following the A-Level syllabus (p. 16), in set-builder notation, we use the
colon “∶” to mean such that. Note though that some writers use the pipe “∣” instead.
Whenever we talk about a set, we refer to both the container and the objects inside.
• ∈ means in and ∉ means not in.
• The ellipsis “. . . ” means continue in the obvious fashion.
• The order of the elements doesn’t matter and repeated elements don’t count:
{1, 2, 3} = {3, 3, 2, 1, 3, 2, 1, 1, 1, 1, 2, 3} .
63
Not on your syllabus: N is the set of natural numbers.
5. O-Level Review
The expression on the LHS contains the terms 1 and 4. The expression on the RHS
contains the terms 2 and 3.
Informally, the difference between an equation and an expression is this:
The coefficient on x is 5.
The constant term or more simply constant is 6. (The constant term is simply any
real number64 that does not involve any variables.)
64
When we study complex numbers in Part IV, constants will also include complex numbers.
5.2. The Absolute Value or Modulus Function
The absolute value (or modulus) function, denoted ∣⋅∣, is defined for all x ∈ R by:65
\[
|x| = \begin{cases}
x & \text{for } x \geq 0, \\
-x & \text{for } x < 0.
\end{cases}
\]
Example 97.
\[
\frac{7}{|7|} = \frac{7}{7} = 1 \quad \text{and} \quad \frac{-7}{|-7|} = \frac{-7}{7} = -1.
\]
Fact 11. If x ≠ 0, then:
\[
\frac{x}{|x|} = \begin{cases}
1, & \text{if } x > 0, \\
-1, & \text{if } x < 0.
\end{cases}
\]
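The piecewise definition of ∣x∣ and Fact 11 translate almost word for word into Python. The function names `absolute_value` and `x_over_abs_x` below are our own (Python also has a built-in `abs`).

```python
def absolute_value(x):
    """|x|: x itself for x >= 0, and -x for x < 0."""
    return x if x >= 0 else -x

def x_over_abs_x(x):
    """x / |x|, which Fact 11 says is 1 for x > 0 and -1 for x < 0."""
    if x == 0:
        raise ZeroDivisionError("x / |x| is undefined at x = 0")
    return x / absolute_value(x)

for x in (7, -7, 0.5, -3):
    print(x, absolute_value(x), x_over_abs_x(x))
```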
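Definition 25. Let $n \in \mathbb{Z}_0^+$. Then n-factorial, denoted n!, is defined by:
\[
n! = \begin{cases}
1 & \text{for } n = 0, \\
1 \times 2 \times \cdots \times n & \text{for } n > 0.
\end{cases}
\]
Or equivalently:66
\[
n! = \begin{cases}
1 & \text{for } n = 0, \\
(n - 1)! \times n & \text{for } n > 0.
\end{cases}
\]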
And so: 0! = 1,
1! = 0! × 1 = 1,
2! = 1! × 2 = 1 × 2,
3! = 2! × 3 = 1 × 2 × 3,
4! = 3! × 4 = 1 × 2 × 3 × 4,
5! = 4! × 5 = 1 × 2 × 3 × 4 × 5,
6! = 5! × 6 = 1 × 2 × 3 × ⋅ ⋅ ⋅ × 6,
⋮ ⋮ ⋮
You may be wondering, “Why is 0! defined as 1?” It turns out this is the definition that
causes us the least overall inconvenience — we’ll appreciate this a little better when we
study combinatorics in Part VI.
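The two equivalent definitions of n! correspond to two ways of writing the same function in Python: a running product and a recursive call. This is a sketch with made-up function names (Python's standard library already offers `math.factorial`).

```python
def factorial_product(n):
    """n! as the product 1 x 2 x ... x n (an empty product, 1, when n = 0)."""
    result = 1
    for k in range(1, n + 1):
        result *= k
    return result

def factorial_recursive(n):
    """n! via the recursive definition: 0! = 1 and n! = (n - 1)! x n."""
    return 1 if n == 0 else factorial_recursive(n - 1) * n

print([factorial_recursive(n) for n in range(7)])  # [1, 1, 2, 6, 24, 120, 720]
```

Notice that the recursion only works because it has somewhere to stop, namely 0! = 1.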
65
A slightly more formal definition of this function is given as Definition 78, after we’ve learnt a little more
about functions.
66
This latter equivalent definition is an example of a recursive definition.
5.4. Exponents
Definition 26. Let b be a non-zero real number and x be an integer. Then b to the power
of x, denoted $b^x$, is defined as the following number:
\[
b^x = \begin{cases}
1 & \text{for } x = 0, \\
\underbrace{b \cdot b \cdots b}_{x \text{ times}} & \text{for } x > 0, \\
\underbrace{\dfrac{1}{b} \cdot \dfrac{1}{b} \cdots \dfrac{1}{b}}_{|x| \text{ times}} & \text{for } x < 0.
\end{cases}
\]
Given the expression $b^x$, we call b the base and x the exponent.
In the special case where the base is zero, i.e. b = 0, we define:
\[
0^x = \begin{cases}
0 & \text{for } x > 0, \\
1 & \text{for } x = 0, \\
\text{Undefined} & \text{for } x < 0.
\end{cases}
\]
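We also have:
\[
2^{-1} = \frac{1}{2^1} = \frac{1}{2} = 0.5, \quad 2^{-2} = \frac{1}{2^2} = \frac{1}{4} = 0.25, \quad 2^{-3} = \frac{1}{2^3} = \frac{1}{8} = 0.125,
\]
\[
2^{-4} = \frac{1}{2^4} = \frac{1}{16} = 0.0625, \quad 2^{-20} = \frac{1}{2^{20}} = \frac{1}{1\,048\,576} = 0.000\,000\,953\,674\,316\,406\,25.
\]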
We now examine the special case where the base is zero, i.e. b = 0:
So: $0^2 = 0 \cdot 0 = 0$, $0^7 = 0 \cdot 0 \cdot 0 \cdot 0 \cdot 0 \cdot 0 \cdot 0 = 0$, $0^{55} = 0 \cdot 0 \cdots 0 \cdot 0 = 0$.
• If x ∈ Z−, then $0^x = \dfrac{1}{0^{|x|}} = \dfrac{1}{0}$ is undefined.
So, $0^{-1}$ and $0^{-50}$ are undefined.
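The definition of integer powers, including the special cases for base zero, can be written as a short Python function. This is a sketch: the name `power` is ours, we use `Fraction` to keep the answers exact, and we assume the base b is an integer or a `Fraction`.

```python
from fractions import Fraction

def power(b, x):
    """b to the power of x, for an integer exponent x.

    Assumption: b is an integer or a Fraction, so that 1/b is exact.
    """
    if b == 0 and x < 0:
        raise ZeroDivisionError("0 to a negative power is undefined")
    factor = b if x >= 0 else Fraction(1, b)  # multiply by b, or by 1/b
    result = Fraction(1)                      # covers the case x = 0
    for _ in range(abs(x)):                   # |x| factors in total
        result *= factor
    return result

print(power(2, 3), power(2, -2), power(2, -20))  # 8 1/4 1/1048576
print(power(0, 0), power(0, 7))                  # 1 0
```

Starting the running product at 1 is what makes any base, including 0, come out as 1 when x = 0.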
67
One convenience this definition affords is this: For all b ∈ R, we simply have $b^0 = 1$.
But note that some writers (including ) argue that we should simply leave $0^0$ undefined. But in my
judgment, $0^0 = 1$ is probably the definition that will cause us the least inconvenience.
We next define $b^{1/x}$, in the case where b is non-negative and x is a non-zero integer:
Definition 27. Let b > 0 and x be a non-zero integer. Then b to the power of 1/x, also
called the xth root of b, is defined to be the number a > 0 that satisfies:
\[ a^x = b. \]
We write:
\[ a = b^{1/x} = \sqrt[x]{b}. \]
Remark 14. $\sqrt[x]{b}$ and $b^{1/x}$ are just two different ways to write exactly the same thing.
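\[ 8^{1/3} = \sqrt[3]{8} = 2, \quad \left(\frac{1}{8}\right)^{1/3} = 0.125^{1/3} = \sqrt[3]{\frac{1}{8}} = \sqrt[3]{0.125} = 0.5 = \frac{1}{2}, \]
\[ 16^{1/4} = \sqrt[4]{16} = 2, \quad \left(\frac{1}{16}\right)^{1/4} = 0.0625^{1/4} = \sqrt[4]{\frac{1}{16}} = \sqrt[4]{0.0625} = 0.5 = \frac{1}{2}. \]
We also have:
\[ 1\,048\,576^{1/20} = \sqrt[20]{1\,048\,576} = 2. \]
And:
\[ \left(\frac{1}{1\,048\,576}\right)^{1/20} = 0.000\,000\,953\,674\,316\,406\,25^{1/20} = \sqrt[20]{\frac{1}{1\,048\,576}} = \frac{1}{2}. \]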
Remark 15. It is true that $b^2 = 25$ has two solutions, namely b = 5 and b = −5. However,
it is wrong to write any of the following:
$25^{1/2} = \pm 5$. ✗    $25^{1/2} = -5$. ✗    $\sqrt{25} = \pm 5$. ✗    $\sqrt{25} = -5$. ✗
By the above Definition, $25^{1/2}$ or $\sqrt{25}$ refers to the positive square root:
$25^{1/2} = \sqrt{25} = 5$. ✓
To talk about the negative square root −5, you must (simply) stick a minus sign in front:
$-25^{1/2} = -\sqrt{25} = -5$. ✓
In general, given any non-negative real number x, we have $\sqrt{x} \geq 0$.
Exercise 60. True or false: “If x ∈ R, then $\sqrt{x^2} = x$.” (Answer on p. 1396.)
$a^2 = -1$.
(Later on, when we study complex numbers in Part IV, we will define $i = (-1)^{1/2} = \sqrt{-1}$.)
Note also the requirement in Definition 27 that x ≠ 0. If x = 0, then $\sqrt[0]{b}$ or $b^{1/0}$ is simply
undefined.
For certain special cases of $b^{1/x}$, we have some special notation or terminology:
• If x = 1, we will not write $a = \sqrt[1]{b}$. Instead, we will simply write a = b.
• If x = 2, we will not write $a = \sqrt[2]{b}$. Instead, we will simply write $a = \sqrt{b}$. We will also call
Definition 28. Let b > 0 and x ∈ Q. Suppose x = m/n for some integers m and n. Then
we define:
\[ b^x = \left(b^{1/n}\right)^m. \]
Altogether, we have:
\[ b^x = b^{m/n} = \left(b^{1/n}\right)^m = \left(\sqrt[n]{b}\right)^m. \]
\[ 2^{13.83} = 2^{1383/100} = \left(\sqrt[100]{2}\right)^{1383} = (1.006\,9\ldots)^{1383} \approx 14\,563, \]
\[ 2^{2.\dot{3}} = 2^{7/3} = \left(\sqrt[3]{2}\right)^{7} = (1.259\,9\ldots)^{7} \approx 5.039\,7. \]
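To see Definition 28 at work numerically, here is a small sketch in Python. It first takes the nth root and then raises the result to the power m. The function name `rational_power` is our own, and because it uses floating-point arithmetic, the results are only approximations.

```python
import math


def rational_power(b: float, m: int, n: int) -> float:
    """Approximate b^(m/n) as (b^(1/n))^m, for b > 0 and a non-zero integer n."""
    if b <= 0:
        raise ValueError("the base must be positive")
    if n == 0:
        raise ValueError("n must be non-zero")
    nth_root = b ** (1.0 / n)  # the nth root of b
    return nth_root ** m       # raise the root to the power m


# 2^(1383/100) and 2^(7/3), computed the long way round
print(rational_power(2, 1383, 100))
print(rational_power(2, 7, 3))

# Both agree (up to rounding) with Python's built-in exponentiation
print(math.isclose(rational_power(2, 1383, 100), 2 ** 13.83))
```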
Remark 16. Note well the requirement that the bases a and b are positive.
If for example b = −1 < 0, then:
Our proof below is only of the special and simple case where the exponents x and y are
positive integers. (The proof where they are instead negative integers is similar.)
You can nonetheless take for granted that the above laws hold for all real x, y.68
(a) \[ b^x b^y = \underbrace{b \cdot b \cdot \dots \cdot b}_{x \text{ times}} \times \underbrace{b \cdot b \cdot \dots \cdot b}_{y \text{ times}} = \underbrace{b \cdot b \cdot \dots \cdot b}_{x+y \text{ times}} = b^{x+y}. \]
(b) Since x > 0, we have −x < 0 and ∣−x∣ = x. Now:
\[ b^{-x} = \frac{1}{b^{|-x|}} = \frac{1}{b^x}. \]
(c) If x ≥ y so that x − y ≥ 0, then:
\[ b^{x-y} = \underbrace{b \cdot b \cdot \dots \cdot b}_{x-y \text{ times}} = \underbrace{b \cdot b \cdot \dots \cdot b}_{x-y \text{ times}} \times \frac{\overbrace{b \cdot b \cdot \dots \cdot b}^{y \text{ times}}}{\underbrace{b \cdot b \cdot \dots \cdot b}_{y \text{ times}}} = \frac{\overbrace{b \cdot b \cdot \dots \cdot b}^{x \text{ times}}}{\underbrace{b \cdot b \cdot \dots \cdot b}_{y \text{ times}}} = \frac{b^x}{b^y}. \]
If instead x < y so that x − y < 0, then:
\[ b^{x-y} = \frac{1}{\underbrace{b \cdot b \cdot \dots \cdot b}_{|x-y| \text{ times}}} = \frac{1}{\underbrace{b \cdot b \cdot \dots \cdot b}_{y-x \text{ times}}} = \frac{1}{\underbrace{b \cdot b \cdot \dots \cdot b}_{y-x \text{ times}}} \times \frac{\overbrace{b \cdot b \cdot \dots \cdot b}^{x \text{ times}}}{\underbrace{b \cdot b \cdot \dots \cdot b}_{x \text{ times}}} = \frac{\overbrace{b \cdot b \cdot \dots \cdot b}^{x \text{ times}}}{\underbrace{b \cdot b \cdot \dots \cdot b}_{y \text{ times}}} = \frac{b^x}{b^y}. \]
(d) \[ \left(b^x\right)^y = \bigl(\underbrace{b \cdot b \cdot \dots \cdot b}_{x \text{ times}}\bigr)^y = \overbrace{(b \cdot b \cdot \dots \cdot b) \times \dots \times (b \cdot b \cdot \dots \cdot b)}^{y \text{ times}} = \underbrace{b \cdot b \cdot \dots \cdot b}_{xy \text{ times}} = b^{xy}. \]
68
Even though we haven’t even defined $b^x$ in the case where x ∉ Q! See Ch. 121.17 for a further discussion.
Exercise 61. Simplify each expression. (Answer on p. 1396.)
Recall that $(a + b)(a - b) = a^2 - b^2$.
Remark 17. We only ever use the ∓ notation if we’re also already using the ± notation
and have a need to indicate that the signs go the opposite way. So for example, we’d never
write, “$x^2 = 17 \implies x = \mp\sqrt{17}$,” because we can simply write, “$x^2 = 17 \implies x = \pm\sqrt{17}$”.
Exercise 64. Let a, b, c ∈ R with a ≠ 0 and $b^2 - 4ac > 0$. Prove the following.69
\[ \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{2c}{-b \mp \sqrt{b^2 - 4ac}}. \] (Answer on p. 1397.)
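Before attempting the proof, it can help to check the claimed identity numerically for one choice of coefficients. The sketch below is only a sanity check, not a proof. It additionally assumes c ≠ 0 (our own assumption, made so that neither denominator on the right-hand side is zero), and the function name `both_forms` is hypothetical.

```python
import math


def both_forms(a: float, b: float, c: float):
    """Return the two roots computed by each form of the quadratic formula."""
    disc = b * b - 4 * a * c
    if a == 0 or c == 0 or disc <= 0:
        raise ValueError("this sketch assumes a != 0, c != 0 and b^2 - 4ac > 0")
    root = math.sqrt(disc)
    # Left-hand side: the + sign first, then the - sign
    lhs = ((-b + root) / (2 * a), (-b - root) / (2 * a))
    # Right-hand side: the signs go the opposite way (-, then +)
    rhs = (2 * c / (-b - root), 2 * c / (-b + root))
    return lhs, rhs


# Example: x^2 - 3x + 2 = 0, whose roots are 2 and 1
lhs, rhs = both_forms(1, -3, 2)
print(lhs, rhs)
```

Observe how the ± on the left is paired with the ∓ on the right: the "+" root on the left matches the "−" expression on the right.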
69
The LHS is the quadratic formula in the more familiar form. The RHS is also the quadratic formula,
5.6. Logarithms
Informally, logarithms are simply the inverse of exponents. A bit more formally:
Definition 30. Let b, n ∈ R+ and x ∈ R. If $b^x = n$, then we write $x = \log_b n$ and call x the
logarithm of n to base b.
ln x: natural logarithm of x
lg x: logarithm of x to base 10
We will never write log. We will only ever write $\log_b$ and even then fairly rarely.
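(f) \[ \log_b \frac{1}{a} = \log_b a^{-1} \overset{(e)}{=} -\log_b a. \]
(g) The first step here uses Proposition 1(a):
\[ b^{\log_b a + \log_b c} = b^{\log_b a}\, b^{\log_b c} \overset{(d)}{=} ac \overset{\star}{\implies} \log_b (ac) = \log_b a + \log_b c. \]
(h) \[ \log_b \frac{a}{c} = \log_b \left(a \cdot \frac{1}{c}\right) \overset{(g)}{=} \log_b a + \log_b \frac{1}{c} \overset{(f)}{=} \log_b a - \log_b c. \]
(i) \[ b^{\log_b a} \overset{(d)}{=} a. \]
Divide by $\log_c b \neq 0$:
\[ \log_b a = \frac{\log_c a}{\log_c b}. \]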
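The laws derived above can be spot-checked numerically. The following Python sketch is only an illustration: the helper `log_base` is our own, it additionally assumes b ≠ 1 (so that we never divide by ln 1 = 0), and the particular numbers are arbitrary.

```python
import math


def log_base(b: float, n: float) -> float:
    """Return the logarithm of n to base b, for b > 0, b != 1 and n > 0."""
    if b <= 0 or b == 1 or n <= 0:
        raise ValueError("this sketch needs b > 0, b != 1 and n > 0")
    return math.log(n) / math.log(b)  # uses the change-of-base law with base e


a, b, c = 8.0, 2.0, 4.0

# (f) log_b(1/a) = -log_b(a)
print(math.isclose(log_base(b, 1 / a), -log_base(b, a)))
# (g) log_b(ac) = log_b(a) + log_b(c)
print(math.isclose(log_base(b, a * c), log_base(b, a) + log_base(b, c)))
# (h) log_b(a/c) = log_b(a) - log_b(c)
print(math.isclose(log_base(b, a / c), log_base(b, a) - log_base(b, c)))
# b^(log_b a) = a
print(math.isclose(b ** log_base(b, a), a))
# Change of base: log_b(a) = log_c(a) / log_c(b)
print(math.isclose(log_base(b, a), log_base(c, a) / log_base(c, b)))
```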
              Term    Coefficient
0th-degree    −3      −3
1st-degree    7x      7

[Figure: the polynomial $3x^2 + 4x - 5$, with $3x^2$, $4x$, and $-5$ labelled as its 2nd-, 1st-, and 0th-degree terms, and 3, 4, and −5 labelled as the corresponding coefficients.]

              Term    Coefficient
0th-degree    −5      −5
1st-degree    4x      4
2nd-degree    3x^2    3

              Term    Coefficient
0th-degree    9       9
1st-degree    2x      2
2nd-degree    0x^2    0
3rd-degree    −5x^3   −5
You get the idea. We also have 4th-, 5th-, 6th-, . . . degree (or quartic, quintic, sextic,
. . . ) polynomials and equations.
In this textbook, we’ll almost always consider only polynomials in one variable. So
unless otherwise stated, when we say polynomial, we’ll always mean a polynomial in one
variable.70
70
But in case you were wondering, an example of a polynomial in two variables is Ax + Bxy + Cy.
In general, the degree of each term in a polynomial is the sum of the exponents on the variables. And
the polynomial’s degree is simply the highest such degree. So here, the term Ax has degree 1, Bxy has
degree 2, and Cy has degree 1. So this polynomial is of degree 2.
Another example of a polynomial in two variables is $Ax^2 + Bxy + Cy^2 + Dx + Ey + F$. Despite looking
more complicated than the previous example, this polynomial also has degree 2 because the greatest
sum of exponents on any term is again 2.
And by the way, as Ch. 117.8 (Appendices) discusses, the conic section is, in general, described by this
2nd-degree polynomial equation in two variables:
$Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0$.
Part I.
Functions and Graphs
This is true for all of science. Successes were largely due to forgetting com-
pletely about what one ultimately wanted, or whether one wanted anything
ultimately; in refusing to investigate things which profit, and in relying solely
on guidance by criteria of intellectual elegance; it was by following this rule
that one actually got ahead in the long run, much better than any strictly
utilitarian course would have permitted.
We now introduce a new mathematical object called an ordered pair (a, b). Like the
sets {Cow, Chicken} and {−5, 4}, you can think of an ordered pair as a container with two
objects. But unlike sets, with ordered pairs, the order matters (hence the name).71
Definition 32. Given the ordered pair (a, b), we call a its first or x-coordinate and b its
second or y-coordinate.
Two ordered pairs are equal if and only if both their x- and y-coordinates are equal.
The ordered pair (Cow, Chicken) has x-coordinate Cow and y-coordinate Chicken.
The ordered pair (Chicken, Cow) has x-coordinate Chicken and y-coordinate Cow.
Since these two ordered pairs have different x- and y-coordinates, they are not equal:
(Cow, Chicken) ≠ (Chicken, Cow).
To distinguish an ordered pair from a set with two elements, we use parentheses (instead
of braces). Be very clear that the ordered pair (a, b) is a completely different mathematical
object from the set {a, b}:
(a, b) ≠ {a, b}.
71
For the formal definition of an ordered pair, see p. 1257 in the Appendices.
Example 117. (−5, 4) and (4, −5) are ordered pairs. (−5, 4) has x-coordinate −5 and
y-coordinate 4, while (4, −5) has x-coordinate 4 and y-coordinate −5. Since these two
ordered pairs have different x- and y-coordinates, they are not equal:
(−5, 4) ≠ (4, −5).
This is in contrast to what we saw above with sets, where {−5, 4} = {4, −5}.
Again, be very clear that the ordered pair (a, b) is a completely different mathematical
object from the set {a, b}:
(a, b) ≠ {a, b}.
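If you know a little programming, the same distinction shows up in Python, whose tuples behave like ordered pairs and whose sets behave like (finite) mathematical sets. This is only an analogy, and the variable names below are our own:

```python
# Tuples are ordered: swapping the coordinates gives a different pair.
pair_1 = (-5, 4)
pair_2 = (4, -5)
print(pair_1 == pair_2)  # False

# Sets are unordered: the same two elements give the same set.
set_1 = {-5, 4}
set_2 = {4, -5}
print(set_1 == set_2)  # True

# An ordered pair and a set are different kinds of object altogether.
print(pair_1 == set_1)  # False

# A pair may repeat an object, but a set cannot contain an element "twice".
print(len(("Cow", "Cow")), len({"Cow", "Cow"}))  # 2 1
```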
The ordered pair (Cow, Cow) has x-coordinate Cow and y-coordinate Cow.
The ordered pair (Chicken, Chicken) has x-coordinate Chicken and y-coordinate Chicken.
Remark 19. Confusingly, (−5, 4) can denote two entirely different things:
• In Ch. 4.13, we learnt that (−5, 4) denotes the set of real numbers between −5 and 4.
• Here we learn that (−5, 4) denotes the ordered pair with the x-coordinate −5 and the
y-coordinate 4.
This is an unfortunate and confusing situation. But don’t worry.
In the Oxford English Dictionary, no fewer than 645 different meanings are given for the
word run.73 But one rarely has trouble telling from the context which of these is meant
when someone uses the word run.
Likewise, you’ll rarely have trouble telling from the context whether by (−5, 4), the writer
means a set of real numbers or an ordered pair.
72
The word distinct is just a synonym for not equal.
73
According to the NYT, the OED editor Peter Gilliver spent nine months working on the word run.
Previously, the word set was said to have the most different meanings, at 430 (see e.g. Guinness Book
of World Records).
6.2. The Cartesian Plane
We will usually be concerned with ordered pairs of real numbers (rather than ordered
pairs of cows and chickens):
Definition 33. A point (in two-dimensional space) is any ordered pair of real numbers.74
By the way, what a point is depends on the context. In one-dimensional space, a point is
simply any real number. Later on in Part III, we’ll also be looking at three-dimensional
space — in that case, a point will be any ordered triple of real numbers. But for now, we’ll
be concerned only with two-dimensional space. And so for now, whenever we say point, it
should be understood that we’re talking about an ordered pair of real numbers.
The cartesian plane75 is the set of all points (i.e. the set of all ordered pairs of real
numbers). Formally:
{(x, y) ∶ x, y ∈ R} .
The origin O is the point at which the x- and y-axes intersect.
Example 120. Depicted below is the cartesian plane, centred on the origin O = (0, 0).
Note that the cartesian plane stretches out to ±∞ in both the x- and y-directions.
Also depicted are three points A = (−5, 4), B = (1, 1), C = (2, −3).
[Figure: the cartesian plane, showing the x- and y-axes, the origin O = (0, 0), and the points A = (−5, 4), B = (1, 1), and C = (2, −3).]
74
Note that a point is itself a zero-dimensional object.
75
The cartesian plane (and more generally cartesian geometry) is named after René Descartes
(1596–1650), who’s also the dude who came up with “Cogito ergo sum” (“I think, therefore I am”).
76
There is some disagreement over whether to capitalise cartesian. Indeed, in
your syllabus, it is capitalised on p. 19 but not on pp. 7–8! (My guess is that while pp. 1–15 were written
by the local Singapore authorities, pp. 16–20 were simply copy-pasted from some standard Cambridge
notation template.) My personal preference is to capitalise cartesian, but it seems that the A-Level
exams do not do so. I shall therefore follow the sacred A-Level exams by not capitalising cartesian.
6.3. A Graph is Any Set of Points
You’re probably used to thinking of a graph (or a curve) as a “drawing”. But formally:
Example 121. Consider G = {(−5, 4) , (1, 1) , (2, −3)}. G is a set that contains three
points. And so by definition, G is also a graph.
We’ve defined graph as a noun (it is a set of points). But at the slight risk of confusion,
we’ll also use graph as the verb meaning to draw a graph. So, we can either say, “The
graph G is drawn below,” or, “G is graphed below”.
[Figure: the graph G = {(−5, 4), (1, 1), (2, −3)}, which contains three points.]
Example 122. Consider H = {(−5, 4) , (2, −3)}. H is a set that contains two points. And
so by definition, H is also a graph. H is graphed below.
[Figure: the graph H = {(−5, 4), (2, −3)}, which contains two points.]
[Figure: the graph I = {(1, 1)}, which contains one point.]
If a set contains at least one element that isn’t a point, then it isn’t a graph:
Example 126. Each of the sets R, Q, and Z contains at least one element that isn’t a
point. (Indeed, each contains no points at all and infinitely many elements that are not
points.) Thus, R, Q, and Z are not graphs.
You may be used to thinking of a graph as a “drawing”. But you should now think of a
graph as being simply a set of points. A “drawing” of a graph is not the graph itself, but
merely a visual aid.78
Remark 20. Your A-Level exams seem to use the terms graph and curve interchangeably
(i.e. as entirely equivalent synonyms), so that’s what this textbook will do too.
77
Actually, a number can also be regarded as a point in one-dimensional space. However, as stated earlier,
for now, whenever we say point, we mean a point in two-dimensional space. And so, following this usage,
1 here is not a point.
78
Albeit a tremendously helpful one. Indeed, analytic or cartesian geometry was one of the major
milestones in the history of mathematics. The idea of combining algebra and geometry is today “obvious”
even to the secondary school student, but it wasn’t always obvious to mathematicians.
6.4. The Graph of An Equation
We’ll be looking mostly at graphs of equations (and shortly, also of functions):
Definition 37. The graph of an equation is the set of points (x, y) for which the equation
is true.
G = {(x, y) ∶ y = x + 2} .
We can say, “Below we’ve graphed the equation y = x + 2” or more simply, “Below is
graphed y = x + 2,” or, “Below is graphed G”.
G is the set of points (x, y) for which the equation y = x + 2 is true. And so, G contains:
• (5, 7), because 7 = 5 + 2.
• (1, 3), because 3 = 1 + 2.
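The idea that a graph is simply the set of points making an equation true can be mirrored in code. In the Python sketch below (an illustration only, with names of our own choosing), `in_graph` tests whether a given point belongs to the graph of y = x + 2, and `sample` collects a few of the infinitely many points in that graph.

```python
def in_graph(point) -> bool:
    """Return True if the point (x, y) satisfies the equation y = x + 2."""
    x, y = point
    return y == x + 2


print(in_graph((5, 7)))  # True, because 7 = 5 + 2
print(in_graph((1, 3)))  # True, because 3 = 1 + 2
print(in_graph((1, 1)))  # False, because 1 is not 1 + 2

# A finite sample of the graph: the points with x = -3, -2, ..., 6
sample = {(x, x + 2) for x in range(-3, 7)}
print(all(in_graph(p) for p in sample))  # True
```

The full graph cannot be stored as a finite set, which is why the membership test is the more faithful picture of the definition.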
[Figure: the graph of y = x + 2, with the point (5, 7) marked.]
$H = \{(x, y) : y = x^2\}$.
H is the set of points (x, y) for which the equation $y = x^2$ is true. And so, H contains:
$I = \{(x, y) : x^2 + y^2 = 1\}$.
I is the set of points (x, y) for which the equation $x^2 + y^2 = 1$ is true. And so, I contains:
• (1, 0), because $1^2 + 0^2 = 1$.
• $\left(\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2}\right)$, because:
\[ \left(\frac{\sqrt{2}}{2}\right)^2 + \left(\frac{\sqrt{2}}{2}\right)^2 = 1. \]
[Figure: the graph of the equation $x^2 + y^2 = 1$, with the points (1, 0) and $\left(\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2}\right)$ marked.]
P.S. An alternative to Step 5 is to enter “$-Y_1$” instead of “$-\sqrt{1 - X^2}$”. To do so, replace
Step 5 with the following instructions: First press (-) to enter the minus sign, as was
done in Step 5. Next press VARS to bring up the VARS menu. Then press ⟩ to go to
the Y-VARS menu. Now press ENTER to select “1: Function...”. Press ENTER again
to select “1: Y1 ”. Altogether, we will have entered “Y2 = −Y1 ”. Now go to Step 6.
Example 132. Let G1 be the graph of the equation y = x + 2 with the constraint x ≥ 3.
Equivalently, let G1 be the graph of y = x + 2, x ≥ 3. Then:
G1 = {(x, y) ∶ y = x + 2, x ≥ 3} .
G1 is the set of points (x, y) for which (y = x + 2 AND x ≥ 3) is true. And so, G1 contains:
• (5, 7), because 7 = 5 + 2 AND 5 ≥ 3.
• (3, 5), because 5 = 3 + 2 AND 3 ≥ 3.
G1 does not contain (1, 3), because 1 ≱ 3.
[Figure: the graph G1 = {(x, y) ∶ y = x + 2, x ≥ 3}, with the points (5, 7) and (3, 5) marked in the graph, and the point (1, 3) marked outside it.]
Above we labelled our graph as G1 = {(x, y) ∶ y = x + 2, x ≥ 3}. But going forward, we’ll
be a little lazy/sloppy and simply label it as y = x + 2, x ≥ 3 (as done below), with the
understanding that this is the graph that satisfies the labelled equation and constraint.
Nonetheless, you should always bear in mind that a graph is a set of points.
[Figure: the same graph, now labelled y = x + 2, x ≥ 3, with the points (5, 7), (3, 5), and (1, 3) marked.]
G2 = {(x, y) ∶ y = x + 2, x > 3} .
G2 is exactly the same as G1 but with one difference — the constraint (inequality) is now
strict, so that this time, G2 does not contain (3, 5).
[Figure: the graph of y = x + 2, x > 3, with the points (5, 7), (3, 5), and (1, 3) marked.]
$H_1 = \{(x, y) : y = x^2, x \leq 2\}$.
H1 is the set of points (x, y) for which ($y = x^2$ AND x ≤ 2) is true. And so, H1 contains:
• (1, 1), because $1^2 = 1$ AND 1 ≤ 2.
• (2, 4), because $2^2 = 4$ AND 2 ≤ 2.
H1 does not contain (3, 9), because 3 ≰ 2.
[Figure: the graph of $y = x^2$, x ≤ 2, with the points (1, 1), (2, 4), and (3, 9) marked.]
$H_2 = \{(x, y) : y = x^2, x < 2\}$.
H2 is exactly the same as H1 but with one difference: the constraint (inequality) is now
strict, so that this time, H2 does not contain (2, 4).
[Figure: the graph of $y = x^2$, x < 2, with the points (1, 1), (2, 4), and (3, 9) marked.]
[Figure: the graph of $x^2 + y^2 = 1$, $x \geq -\frac{1}{2}$, with the points $\left(-\frac{1}{2}, \frac{\sqrt{3}}{2}\right)$, $\left(\frac{\sqrt{2}}{2}, \frac{\sqrt{2}}{2}\right)$, (1, 0), and $\left(-\frac{1}{2}, -\frac{\sqrt{3}}{2}\right)$ marked.]
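[Figure: the graph J.]
J is the graph of:
\[ y = \begin{cases} -x, & \text{for } x \leq 0 \\ x^2, & \text{for } x > 0. \end{cases} \]
But this is cumbersome and difficult to read. And so, we’ll usually simply specify J as
was done above.
[Figure: the graph K.]
K is the graph of:
\[ y = \begin{cases} x, & \text{for } x \neq 2 \\ 0, & \text{for } x = 2. \end{cases} \]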
Example 140. The graph of the equation y = x + 2 has horizontal or x-intercept (−2, 0)
and vertical or y-intercept (0, 2).
We can also more simply say, “The equation y = x+2 has horizontal or x-intercept (−2, 0)
and vertical or y-intercept (0, 2).”
Since the equation y = x + 2 has x-intercept with x-coordinate −2, we say that −2 is a
root of the equation y = x + 2.
[Figure: the graph of y = x + 2, with the x-intercept (−2, 0) and the y-intercept (0, 2) marked.]
Remark 21. Just so you know, some writers (including your TI84) also call a root a zero.
So in the above example, they’d say that the zero of the equation y = x+2 is −2. However,
we will avoid using the term zero in this textbook because it does not appear on your
A-Level exams or syllabus.
[Figure: the graph of $y = x^2$, with the point (0, 0) marked.]
Example 142. The equation $x^2 + y^2 = 1$ has two x-intercepts (−1, 0) and (1, 0), two
y-intercepts (0, −1) and (0, 1), and two roots −1 and 1.
[Figure: the graph of $x^2 + y^2 = 1$, with the points (−1, 0), (1, 0), (0, 1), and (0, −1) marked.]
[Figure: the graph of $y = x^2 - 1$, with the points (−1, 0), (1, 0), and (0, −1) marked.]
Exercise 68. For each of the following equations, write down any x-intercept(s),
y-intercept(s), and roots. (Answer on p. 1400.)
(a) y = 2.
(b) $y = x^2 - 4$.
(c) $y = x^2 + 2x + 1$.
(d) $y = x^2 + 2x + 2$.
ax + by + c = 0,
where a, b, c ∈ R and it is not the case that both a and b are zero. The line’s gradient is the
number −a/b (provided b ≠ 0; if b = 0, then we say that the line’s gradient is undefined).
You may find this definition a little puzzling — didn’t we always simply write lines as:
y = dx + e?
\[ ax + by + c = 0 \iff by = -ax - c \iff y = \underbrace{-\frac{a}{b}}_{d}\, x \underbrace{- \frac{c}{b}}_{e}. \]
But writing a line as y = dx + e also has one big disadvantage — it can’t describe the case
where b = 0, i.e. vertical lines:
Here’s what we’ll do in this textbook: If we know for sure that a line isn’t vertical, then
79
Our definition here covers only lines in two-dimensional space. In Part IV (Vectors), we’ll learn of a
more general definition (namely Definition 109) of a line that covers also higher-dimensional spaces and
which will replace Definition 39.
we’ll write it in the form y = dx + e, because of the aforementioned advantages. Otherwise,
we’ll write it as ax + by + c = 0.
Definition 40. We say that a graph is horizontal if any two points in that graph have
the same y-coordinate; and vertical if any two points have the same x-coordinate.
Definition 41. We say that a line is oblique (or slanted) if it is neither horizontal nor
vertical.
Example 145. The line y = 1 is horizontal, the line x = 2 is vertical, and the line y = x+1
is oblique.
[Figure: a horizontal line, the vertical line x = 2, and the oblique line y = x + 1.]
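Fact 14. The line containing the distinct points $(a_1, b_1)$ and $(a_2, b_2)$ is:
\[ (a_2 - a_1)(y - b_1) = (b_2 - b_1)(x - a_1). \]
Example 147. The line containing the points (1, 2) and (−1, 3) is:
\[ (-1 - 1)(y - 2) = (3 - 2)(x - 1) \quad\text{or}\quad -2y = x - 5 \quad\text{or}\quad y = -\frac{1}{2}x + \frac{5}{2}. \]
[Figure: the lines $y = -\frac{1}{2}x + \frac{5}{2}$ and $y = \frac{5}{2}x - 5$, with the points (−1, 3), (1, 2), (2, 0), and (4, 5) marked.]
The line containing the points (2, 0) and (4, 5) is:
\[ (4 - 2)(y - 0) = (5 - 0)(x - 2) \quad\text{or}\quad 2y = 5x - 10 \quad\text{or}\quad y = \frac{5}{2}x - 5. \]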
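Fact 14 is mechanical enough to hand to a computer. The sketch below (in Python, with function names of our own choosing, and assuming integer coordinates so that fractions stay exact) rearranges the equation in Fact 14 into the form Ax + By + C = 0 and, when the line is not vertical, also reports its gradient and y-intercept.

```python
from fractions import Fraction


def line_through(p1, p2):
    """Return (A, B, C) with Ax + By + C = 0 for the line through two distinct points."""
    (a1, b1), (a2, b2) = p1, p2
    if p1 == p2:
        raise ValueError("the two points must be distinct")
    # Rearranging (a2 - a1)(y - b1) = (b2 - b1)(x - a1):
    A = b2 - b1
    B = -(a2 - a1)
    C = (a2 - a1) * b1 - (b2 - b1) * a1
    return A, B, C


def gradient_and_intercept(A: int, B: int, C: int):
    """Return (d, e) with y = dx + e, or None if the line is vertical (B = 0)."""
    if B == 0:
        return None
    return Fraction(-A, B), Fraction(-C, B)


print(line_through((1, 2), (-1, 3)))                           # (1, 2, -5), i.e. x + 2y - 5 = 0
print(gradient_and_intercept(*line_through((1, 2), (-1, 3))))  # gradient -1/2, intercept 5/2
print(gradient_and_intercept(*line_through((2, 0), (4, 5))))   # gradient 5/2, intercept -5
```

Working with Ax + By + C = 0 rather than y = dx + e means the same function also copes with vertical lines.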
Exercise 69. In each of the following, write down the equation of the line that contains
the two given points. (Answer on p. 1401.)
(a) (4, 5) and (7, 9).
(b) (1, 2) and (−1, −3).
y − q = m (x − p) .
Proof. The line contains the distinct points (p, q) and (p + 1, q + m).
And so by Fact 14 then, it may also be described by:
(p + 1 − p) (y − q) = (q + m − q) (x − p) or y − q = m (x − p).
Example 148. The line that contains the point (−1, 2) and has gradient 3 is:
y − 2 = 3 [x − (−1)] or y = 3x + 5.
[Figure: the lines y = 3x + 5 and y = −2x + 19, with the points (−1, 2) and (7, 5) marked.]
The line that contains the point (7, 5) and has gradient −2 is:
y − 5 = −2 (x − 7) or y = −2x + 19.
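The two computations above follow one recipe: expand y − q = m(x − p) to get y = mx + (q − mp). A minimal Python sketch of that recipe (the function name is hypothetical):

```python
def line_from_point_and_gradient(p, q, m):
    """Return (d, e) such that the line y - q = m(x - p) can be written as y = dx + e."""
    return m, q - m * p


print(line_from_point_and_gradient(-1, 2, 3))  # (3, 5), i.e. y = 3x + 5
print(line_from_point_and_gradient(7, 5, -2))  # (-2, 19), i.e. y = -2x + 19
```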
Exercise 70. In each of the following, write down the equation of the line that contains
the given point and has the given gradient. (Answer on p. 1401.)
(a) (4, 5) and 3.
(b) (1, 2) and −2.
In Part IV (Vectors), we will give a more general definition (namely Definition 116) of when
two lines are said to be perpendicular and which will replace the above Definition.
Remark 22. Confusingly, some other writers use ray to mean a (finite) line segment. But
this textbook will strictly reserve the word ray to mean a “half-infinite line”.
Remark 23. According to ISO 80000-2: 2009,81 AB may be used to denote the line
segment from point A to point B. This notation is convenient and allows us to distinguish
between the line AB and the line segment AB.
However, your A-Level syllabus and exams (see e.g. N2007/I/6) do not seem to use this
notation.82 And so, this textbook shall not do so either.
For us, the only way to tell whether AB is a line or a line segment is to see if we say “the
line AB” or “the line segment AB”. We must thus always be absolutely clear if we’re
talking about a line or a line segment.
80
For the formal definitions of a line, a line segment, and a ray, see Definition 222 in the Appendices.
81
See Item No. 2-8.4. The 40-something-page PDF costs a mind-blowing 158 Swiss Francs or about S$214
at the ISO store. As always, you may or may not be able to find free versions of this document elsewhere
on the interwebz.
82
Indeed, they don’t seem to be very careful about distinguishing between lines and line segments.
6.13. Asymptotes and Limit Notation
Informally, an asymptote is a line that a graph “gets ever closer to”.
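Example 152. Consider $y = \dfrac{1}{x - 1} + 2$. We have:
\[ \lim_{x \to 1^-} y = \lim_{x \to 1^-} \left(\frac{1}{x - 1} + 2\right) = -\infty. \tag{☺1} \]
For the A-Levels, you need only know, roughly and informally, what the above line ☺1
says.83 And as you can probably guess, it says:
As $x \to 1^-$, $y \to -\infty$.
Similarly, we have:
\[ \lim_{x \to 1^+} y = \lim_{x \to 1^+} \left(\frac{1}{x - 1} + 2\right) = \infty. \tag{☺2} \]
[Figure: the graph of $y = \frac{1}{x - 1} + 2$, with the vertical asymptote x = 1 and the horizontal asymptote y = 2. As $x \to 1^-$, $y \to -\infty$. As $x \to 1^+$, $y \to \infty$. As $x \to -\infty$, $y \to 2^-$. As $x \to \infty$, $y \to 2^+$.]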
83
Ch. 121.2 (Appendices) formally and precisely defines what ☺1 means and what asymptotes are.
Next, we also have:
\[ \lim_{x \to -\infty} y = \lim_{x \to -\infty} \left(\frac{1}{x - 1} + 2\right) \overset{1}{=} 2^-, \]
\[ \lim_{x \to \infty} y = \lim_{x \to \infty} \left(\frac{1}{x - 1} + 2\right) \overset{2}{=} 2^+. \]
Both $\overset{1}{=}$ and $\overset{2}{=}$ say that y = 2 is a horizontal asymptote for the graph of $y = \dfrac{1}{x - 1} + 2$.
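To get a feel for what these limit statements say, we can simply plug in numbers. The Python sketch below tabulates y = 1/(x − 1) + 2 for x close to 1 (from either side) and for x far from 0. It is only a numerical illustration, not a proof, and the sample values of x are arbitrary choices of ours.

```python
def f(x: float) -> float:
    """The function y = 1/(x - 1) + 2 from the example."""
    return 1 / (x - 1) + 2


# Approaching x = 1 from the left: y becomes very negative
for x in (0.9, 0.99, 0.999, 0.9999):
    print(x, f(x))

# Approaching x = 1 from the right: y becomes very large
for x in (1.1, 1.01, 1.001, 1.0001):
    print(x, f(x))

# Far to the left, y is slightly below 2; far to the right, slightly above 2
for x in (-1e3, -1e6, 1e3, 1e6):
    print(x, f(x))
```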
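[Figure: the graph of $y = e^x + 1$, with the horizontal asymptote y = 1. As $x \to -\infty$, $y \to 1^+$.]
[Figure: the graph of y = tan x, with the vertical asymptotes $x = \frac{\pi}{2}$ and $x = -\frac{\pi}{2}$. As $x \to \left(\frac{\pi}{2}\right)^-$, $y \to \infty$. As $x \to \left(\frac{\pi}{2}\right)^+$, $y \to -\infty$.]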
Exercise 72. With the aid of formal limit notation, explain why the line x = −π/2 is also
a vertical asymptote for the graph of y = tan x. (Answer on p. 1401.)
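Example 155. Consider $y = \dfrac{1}{x - 1} + x$.
We note in passing that x = 1 is a vertical asymptote (can you explain why?).
Observe that this graph has the oblique asymptote y = x, because:
\[ \lim_{x \to -\infty} y = \lim_{x \to -\infty} \left(\frac{1}{x - 1} + x\right) \overset{1}{=} x^-, \]
\[ \lim_{x \to \infty} y = \lim_{x \to \infty} \left(\frac{1}{x - 1} + x\right) \overset{2}{=} x^+. \]
[Figure: the graph of $y = \frac{1}{x - 1} + x$, with the oblique asymptote y = x and the vertical asymptote x = 1. As $x \to \infty$, $y \to x^+$. As $x \to -\infty$, $y \to x^-$.]
Thus, y = x is an oblique asymptote for the graph of $y = \dfrac{1}{x - 1} + x$.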
Example 156. Consider the graph of $y = -x^2$. The point D = (0, 0) is a global maximum,
because it is at least as high as any other point. Indeed, it is also a strict global
maximum, because it is strictly higher than any other point.
[Figure: the graph of $y = -x^2$, with the point D = (0, 0) marked.]
84
Here in this subchapter, we’ll merely give the informal definitions of the eight types of points introduced.
For their formal definitions, see Definition 224 (Appendices).
Example 157. Consider the graph of y = sin x. Let
\[ A = \left(-\frac{3\pi}{2}, 1\right), \quad B = \left(\frac{\pi}{2}, 1\right), \quad\text{and}\quad C = \left(\frac{5\pi}{2}, 1\right). \]
[Figure: the graph of y = sin x, with the points A, B, and C marked.]
Indeed, y = sin x has infinitely many global maxima: for k ∈ Z, the following point is a
global maximum:
\[ \left(\left(2k + \frac{1}{2}\right)\pi, 1\right). \]
Note though that y = sin x has no strict global maximum, because no point is strictly
higher than any other point.
Obviously, if a point is strictly higher than any other point, then it must also be at least
as high as any other point. And so, in general:
However, the converse is not true. That is, a global maximum need not be strict. In the
above example, each of A, B, and C is a global maximum, but none is a strict global
maximum. Indeed, the graph of y = sin x has no strict global maximum.
We next introduce the concept of a local maximum:
F = (1, 11)
Informally, a strict local maximum (point) of a graph is a point that’s (strictly) higher
than any “nearby” point (that’s also in the graph).
Example 159. In the last example, the points E and F are also strict local maxima,
because each is (strictly) higher than any “nearby” point.
Again and obviously, if a point is strictly higher than any “nearby” point, then it must also
be at least as high as any “nearby” point. And so in general:
However, the converse is not true. That is, a local maximum need not be strict:
Example 160. Consider the graph of y = 3. The point G = (1, 3) is a local maximum,
because it’s at least as high as any “nearby” point. However, it is not a strict local
maximum, because it is not strictly higher than any “nearby” point.
Actually, every point in y = 3 is a local maximum (though not a strict local maximum)!
[Figure: the graph of y = 3, with the point G = (1, 3) marked.]
Indeed, every point in y = 3 is a global maximum (though not a strict global maximum)!
Example 161. Consider the graph of $y = x^2$. The point K = (0, 0) is a global minimum,
because it is at least as low as any other point. Indeed, it is also a strict global
minimum, because it is (strictly) lower than any other point.
[Figure: the graph of $y = x^2$, with the point K = (0, 0) marked.]
\[ H = \left(-\frac{5\pi}{2}, -1\right), \quad I = \left(-\frac{\pi}{2}, -1\right), \quad\text{and}\quad J = \left(\frac{3\pi}{2}, -1\right). \]
[Figure: the graph of y = sin x, with the points H, I, and J marked.]
Indeed, y = sin x has infinitely many global minima: for k ∈ Z, the following point is a
global minimum:
\[ \left(\left(2k - \frac{1}{2}\right)\pi, -1\right). \]
2
Note though that y = sin x has no strict global minimum, because no point is strictly
lower than any other point.
Obviously, if a point is strictly lower than any other point, then it must also be at least as
low as any other point. And so in general:
However, the converse is not true. That is, a global minimum need not be strict. In
the above example, each of H, I, and J is a global minimum, but none is a strict global
minimum. Indeed, the graph of y = sin x has no strict global minimum.
Example 163. In the graph of $y = 6x^5 - 15x^4 - 10x^3 + 30x^2$, neither L = (0, 0) nor
M = (2, −8) is a global minimum, because neither is at least as low as any other point.
However, each is a local minimum, because each is at least as low as any “nearby” point.
[Figure: the graph of $y = 6x^5 - 15x^4 - 10x^3 + 30x^2$, with the points L = (0, 0) and M = (2, −8) marked.]
Informally, a strict local minimum (point) of a graph is a point that’s (strictly) lower
than any “nearby” point (that’s also in the graph).
Example 164. In the last example, the points L and M are also strict local minima,
because each is (strictly) lower than any “nearby” point.
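The claims in the last two examples can be probed numerically. In the Python sketch below, `is_lower_than_neighbours` (a helper name of our own) compares the height of the graph at a point with its height a small step to either side. Sampling just two neighbours is of course only a rough sanity check, not a proof that a point is a strict local minimum.

```python
def p(x: float) -> float:
    """The polynomial y = 6x^5 - 15x^4 - 10x^3 + 30x^2 from the example."""
    return 6 * x**5 - 15 * x**4 - 10 * x**3 + 30 * x**2


def is_lower_than_neighbours(x0: float, step: float = 1e-3) -> bool:
    """Check that the graph is strictly lower at x0 than a small step to either side."""
    return p(x0) < p(x0 - step) and p(x0) < p(x0 + step)


print(p(0), p(2))                   # 0 and -8, the heights of L and M
print(is_lower_than_neighbours(0))  # True
print(is_lower_than_neighbours(2))  # True

# Neither L nor M is a global minimum: the graph is lower still at x = -2
print(p(-2))                        # -232
```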
Again and obviously, if a point is strictly lower than any “nearby” point, then it must also
be at least as low as any “nearby” point. And so in general:
However, the converse is not true. That is, a local minimum need not be strict:
Example 165. Consider the graph of y = 3. The point G = (1, 3) is a local minimum,
because it’s at least as low as any “nearby” point. However, it is not a strict local
minimum, because it is not strictly lower than any “nearby” point.
Actually, every point in y = 3 is a local minimum (though not a strict local minimum)!
[Figure: the graph of y = 3, with the point G = (1, 3) marked.]
Indeed, every point in y = 3 is a global minimum (though not a strict global minimum)!
Remember turning points? We’ll formally define what these are in Definition 63. For
now, we’ll rely on your intuitive understanding of what turning points are. We’ll also
mention that every turning point is a strict extremum (i.e. either a strict maximum or
minimum), but not every strict extremum is a turning point.
[Figure: the graph of y = x (left) and the graph of y = x, x ≥ −1 (right), with the point I = (−1, −1) marked on the latter.]
In contrast, the graph of y = x with the constraint x ≥ −1 (right) has one extremum,
namely I = (−1, −1), which is a global minimum, strict global minimum, local minimum,
and strict local minimum. Observe though that I is not a turning point.
[Figure: a graph with the points A = (−4, −81), B = (−3, −81), C = (−2, 76), D = (0, 0), E = (2, 44), F = (4, −32), G = (5, 125), and H = (6, 125) marked.]
The following table says that A = (−4, −81) is a local maximum, a global minimum, and
a local minimum, but is not any of the other five types of extrema and is not a turning
point. You should verify that the table is correct.
                        A   B   C   D   E   F   G   H
Global maximum                                  ✓   ✓
Strict global maximum
Local maximum           ✓       ✓       ✓       ✓   ✓
Strict local maximum            ✓       ✓
Global minimum          ✓   ✓
Strict global minimum
Local minimum           ✓   ✓       ✓       ✓       ✓
Strict local minimum                ✓       ✓
Turning point                   ✓   ✓   ✓   ✓
Besides the eight points (A–H), does this graph have any other extrema?85
85
Yes. In fact, there are infinitely many other extrema. For every p < −3, the point (p, −81) is, like A, a
global minimum, local maximum, and local minimum. And for every q > 5, the point (q, 125) is, like H,
a global maximum, local maximum, and local minimum.
Example 168. Let G = {A, B, C} be the graph consisting of three isolated86 points:
A = (−5, 4), B = (1, 1), and C = (2, −3).
Interestingly, each of A, B, and C is a strict local maximum of G. This is because each
of A, B, and C has no “nearby” point. It is thus trivially or vacuously true that each of
A, B, and C is strictly higher than any “nearby” point.
By the same token, it is likewise trivially or vacuously true that each of A, B, and C is
a strict local minimum of G.
[Figure: the graph G, consisting of the three isolated points A = (−5, 4), B = (1, 1), and C = (2, −3).]
Exercise 74. Identify each graph’s extrema and turning points. (Answer on p. 1402.)
(a) $y = x^2 + 1$. (b) $y = x^2 + 1$, −1 ≤ x ≤ 1.
(c) y = cos x. (d) y = cos x, −1 ≤ x ≤ 1.
Exercise 75. Explain whether each statement is true or false. (Answer on p. 1403.)
Example 169. The reflection of the point P = (−2, 0) in the point Q = (0, 1) is the point:
R = (2, 2) .
[Figure: the points P = (−2, 0), Q = (0, 1), and R = (2, 2).]
As the above example shows, intuitively, we know what the reflection of one point P in
another Q is. Let us now try to formalise our intuition.
Let the reflection of P in Q be the point R. Intuitively, we want R to satisfy two properties:
As we’ll explain on the next page, the following definition of a reflection point “works”,
in the sense that it satisfies the above two properties. Or in other words, it successfully
formalises our intuition about what a reflection should be.
Definition 44. Let P = (a, b) and Q = (c, d) be distinct points. Then the reflection of P
in Q is the point:
R = (2c − a, 2d − b) .
On the next page, we give an informal proof-by-picture that the above definition “works”.87
87
For a more formal proof, see Fact 194 in the Appendices.
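Definition 44 translates directly into a one-line computation. The following is only a minimal sketch; the function name reflect_in_point is made up for illustration and does not appear elsewhere in this textbook.

```python
def reflect_in_point(p, q):
    """Reflect the point p = (a, b) in the point q = (c, d), as in Definition 44."""
    a, b = p
    c, d = q
    return (2 * c - a, 2 * d - b)


# Example 169: reflecting P = (-2, 0) in Q = (0, 1).
print(reflect_in_point((-2, 0), (0, 1)))  # (2, 2)
```

Note that Q ends up being the midpoint of P and R, which is one way to check an answer by hand.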
Informal proof-by-picture. Suppose P = (a, b) and Q = (c, d) are points and we want to find
the reflection of P in Q.
To get from point P to point Q, go c − a units right and d − b units up (a negative value means going left or down instead).
Now, if from point Q, we again go c − a units right and d − b units up, we arrive at point
R = (2c − a, 2d − b). And so, “obviously”:
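[Figure: the points P = (a, b), Q = (c, d), and R = (2c − a, 2d − b). Each of the steps from P to Q and from Q to R is labelled c − a horizontally and d − b vertically.]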
Exercise 76. What is the reflection of (8, 5) in (−2, 4)? (Answer on p. 1403.)
Remark 24. By the way, the above is an example of how maths often proceeds. We start
with our intuition of what a reflection of a point in a point ought to be. We then try
to write down a definition (in this case Definition 44) that formalises our intuition. We
then verify that our definition does indeed satisfy our intuition.
But as we go along, we may very well discover that our definition suffers from flaws or
contradictions. Or perhaps it is simply not the most convenient definition possible. If so,
we may decide to go back and write down a new definition.
Example 170. Consider the point P = (−3, 2) and the line y = 2x + 3. You are told that
Q = (−1, 1) is the point in the line that’s closest to P .
Then the reflection of P in the line is simply the reflection of P in Q, which is R = (1, 0).
[Figure: the line y = 2x + 3 with the points P = (−3, 2), Q = (−1, 1), and R = (1, 0).]
As before, let us formalise our intuition about what the reflection of a point in a graph is,
by writing down a formal definition:
Definition 45. Let P be a point and G be a graph. Let Q be the point in G that’s
closest to P . Then the reflection of P in G is the reflection of P in the point Q.
Note that the above definition is the general one, where G can be any sort of graph. But
to keep things simple, we will look only at cases where G is a line.
Here are two simple but useful results:
Example 171. The reflection of (−2, 1) in y = x is (1, −2). The reflection of (0, 1) in
y = x is (1, 0).
[Figure: the line y = x, with (−2, 1) reflected to (1, −2) and (0, 1) reflected to (1, 0).]
Corollary 2. The reflection of the point (p, q) in the line y = −x is the point (−q, −p).
Example 172. The reflection of (−2, 1) in y = −x is (−1, 2). The reflection of (0, 1) in
y = −x is (−1, 0).
[Figure: the line y = −x, with (−2, 1) reflected to (−1, 2) and (0, 1) reflected to (−1, 0).]
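Definition 45 can be tried out numerically for a non-vertical line y = mx + c. The sketch below assumes the standard foot-of-the-perpendicular formula for the closest point (a formula that this chapter does not derive), and then applies Definition 44. The function names are made up for illustration.

```python
from fractions import Fraction


def closest_point_on_line(p, m, c):
    """Point on the line y = m*x + c that is closest to p (foot of the perpendicular)."""
    px, py = Fraction(p[0]), Fraction(p[1])
    m, c = Fraction(m), Fraction(c)
    x = (px + m * (py - c)) / (1 + m * m)
    return (x, m * x + c)


def reflect_in_line(p, m, c):
    """Reflect p in the line y = m*x + c: reflect p in the closest point Q (Definition 45)."""
    qx, qy = closest_point_on_line(p, m, c)
    return (2 * qx - p[0], 2 * qy - p[1])


print(closest_point_on_line((-3, 2), 2, 3) == (-1, 1))  # Example 170: True
print(reflect_in_line((-3, 2), 2, 3) == (1, 0))         # Example 170: True
print(reflect_in_line((-2, 1), 1, 0) == (1, -2))        # Example 171: True
print(reflect_in_line((-2, 1), -1, 0) == (-1, 2))       # Example 172: True
```

Exact fractions are used so that the comparisons are not affected by rounding. Vertical lines (x = constant) are not handled by this sketch.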
Exercise 77. Find the reflections of (3, 2) in y = x and y = −x. (Answer on p. 1403.)
Definition 46. Let G and L be graphs. The reflection of G in L is the set of points
obtained by reflecting every point in G in L.
Again, the above definition is the general one, where L can be any sort of graph. But again,
to keep things simple, we will look only at cases where L is a line:
[Figure: the graphs of y = x² and y = 4 − x², together with the line y = 2.]
Definition 47. Suppose the graph G is identical⁸⁸ to its reflection in the line l. Then we
say that G is symmetric in l, or that l is a line (or axis) of symmetry for G.
Example 174. The graph of y = x² is symmetric in the line x = 0 (also the y-axis).
Or equivalently: x = 0 is a line or axis of symmetry for y = x².
88
The word identical is a synonym for the word equal.
Example 175. The graph of y = 1/x is symmetric in the lines y = x and y = −x.
[Figure: the graph of y = 1/x, with the lines y = x and y = −x.]
Example 176. The graph of x² + y² = 1 is symmetric in every line through the origin.
For example, it is symmetric in the lines y = 2x and y = −x.
[Figure: the graph of x² + y² = 1, with the lines y = −x and y = 2x.]
The above two examples were easy enough to solve. In the next chapter, we’ll review the
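solution to the quadratic equation ax² + bx + c = 0. For now, here’s a quick example:
[Figure: the graphs of y = x² − 1 and y = 8, which intersect at (−3, 8) and (3, 8); and the graphs of y = x³ − 3x and y = 3x − x², which intersect at (−3, −18), (0, 0), and (2, 2).]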
Thus, the equation x³ − 3x = 3x − x² (x ∈ R) has three solutions: −3, 0, and 2. Equivalently,
its solution set is {−3, 0, 2}.
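The three solutions can be double-checked by direct substitution. The short sketch below simply tests a handful of integers; it is a check, not a method for solving cubic equations. Since a cubic equation has at most three real solutions, finding three of them means there are no others.

```python
def lhs(x):
    return x**3 - 3 * x


def rhs(x):
    return 3 * x - x**2


# Keep every integer from -10 to 10 at which the two sides agree.
solutions = [x for x in range(-10, 11) if lhs(x) == rhs(x)]
print(solutions)  # [-3, 0, 2]
```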
By the way, don’t worry. We don’t know how to solve cubic equations and we won’t be
learning to do so.
However, we are required to know how to use a graphing calculator to find the solutions
to this and indeed just about any equation. We’ll learn how to do so later.
[Figure: the graph of y = sin x, with the points (−π, 0) and (2π, 0) marked.]
The above two inequalities were easy enough to solve. In Ch. 24.2, we’ll learn the general
solution to the quadratic inequality ax² + bx + c > 0. For now, here’s a quick example:
[Figure: the graphs of y = x² − 1, y = 8, y = x³ − 3x, and y = 3x − x², with the points (−3, 8), (3, 8), (2, 2), and (−3, −18) marked.]
Definition 48. Given an equation (or inequality) involving a single variable, any number
that satisfies the equation (or inequality) is called its solution. And the set of all such
solutions is called its solution set.
For now, we’ll be dealing only with real numbers. And so for now, we’ll be looking only
at real solutions and real solution sets. But just so you know, here’s an example of an
equation with solutions that are not real:
Example 186. The equation x² + 1 = 0 (x ∈ R) has no solution and its solution set is ∅
(the empty set). But this is only because we’ve specified that x ∈ R.
In contrast, the equation x² + 1 = 0 (x ∈ C) has two solutions: −i and i. And its solution
set is {−i, i}. We’ll learn more about this in Part IV.
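y = ax² + bx + c (a ≠ 0).
To find the roots, set y = 0:
$$0 = ax^2 + bx + c.$$
Divide by a (allowed because a ≠ 0), then add and subtract $\frac{b^2}{4a^2}$:
$$0 = x^2 + \frac{b}{a}x + \frac{b^2}{4a^2} - \frac{b^2}{4a^2} + \frac{c}{a}.$$
Complete the square:
$$0 = \left(x + \frac{b}{2a}\right)^2 - \frac{b^2}{4a^2} + \frac{c}{a}.$$
Rearrange:
$$\left(x + \frac{b}{2a}\right)^2 = \frac{b^2}{4a^2} - \frac{c}{a} = \frac{b^2 - 4ac}{4a^2}.$$
Take the square root:
$$x + \frac{b}{2a} = \pm\sqrt{\frac{b^2 - 4ac}{4a^2}} = \frac{\pm\sqrt{b^2 - 4ac}}{2a}.$$
Rearrange to get the quadratic formula (i.e. the two roots of the quadratic equation):
$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$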
The quadratic formula is not on the List of Formulae (MF26) and so sadly, you’ll have to
memorise it. Unfortunately, I’ve never come across a good mnemonic that works (for me).
Look for one that works for you. (Lemme know if you think you’ve found a good one!)
89
Technically, this is a quadratic equation in two variables. And technically, when we simply speak of the
quadratic equation, we’re referring to ax² + bx + c = 0. But here we shall be a little sloppy and also call
y = ax² + bx + c “the” quadratic equation.
One mnemonic is to sing the quadratic formula to the tune of Pop Goes the Weasel.90
Arranged by a musical genius so that each syllable matches each note:
Moderato
x e-quals to ne-ga-tive b,
On the next page, Fact 20 summarises the key features of the quadratic equation and also
reviews some of the concepts we’ve gone through in previous chapters. Six examples follow.
90
I checked out about ten versions on YouTube and unfortunately I found all of them to be very annoying
and cannot recommend them. Maybe I’ll make one — don’t worry, I won’t be the one singing.
Fact 20. Given the quadratic equation $y = ax^2 + bx + c$,
1. The y-intercept is (0, c).
2. The sign of the discriminant determines the number of x-intercepts:
(a) If $b^2 - 4ac > 0$, then there are two x-intercepts (i.e. two real roots):
$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$
We can factorise the quadratic polynomial:
$$ax^2 + bx + c = a\left(x - \frac{-b + \sqrt{b^2 - 4ac}}{2a}\right)\left(x - \frac{-b - \sqrt{b^2 - 4ac}}{2a}\right).$$
(b) If $b^2 - 4ac = 0$, then there is one x-intercept (i.e. one real root), where the graph
just touches the x-axis:
$$x = -\frac{b}{2a}.$$
We can factorise the quadratic polynomial:
$$ax^2 + bx + c = a\left(x + \frac{b}{2a}\right)^2.$$
(c) If $b^2 - 4ac < 0$, then there are no x-intercepts (i.e. no real roots). There is also
no way to factorise the quadratic polynomial $ax^2 + bx + c$ (unless we use complex
numbers).
3. There is one line of symmetry, which is vertical:
$$x = -\frac{b}{2a}.$$
4. There is one turning point:
$$\left(-\frac{b}{2a}, -\frac{b^2}{4a} + c\right).$$
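The features listed in Fact 20 can be computed mechanically from a, b, and c. The following is a minimal sketch only; the function name quadratic_features and the dictionary keys are made up for illustration.

```python
import math


def quadratic_features(a, b, c):
    """Key features of y = a*x**2 + b*x + c (a != 0), following Fact 20."""
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic")
    disc = b * b - 4 * a * c
    if disc > 0:
        root = math.sqrt(disc)
        x_intercepts = sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])
    elif disc == 0:
        x_intercepts = [-b / (2 * a)]
    else:
        x_intercepts = []  # no real roots
    return {
        "y_intercept": (0, c),
        "discriminant": disc,
        "x_intercepts": x_intercepts,
        "line_of_symmetry": -b / (2 * a),
        "turning_point": (-b / (2 * a), c - b * b / (4 * a)),
    }


print(quadratic_features(1, 3, 1)["turning_point"])  # (-1.5, -1.25)
print(quadratic_features(1, 2, 1)["x_intercepts"])   # [-1.0]
print(quadratic_features(1, 1, 1)["x_intercepts"])   # []
```

The x-intercepts are returned as floating-point approximations, so an irrational root such as (−3 + √5)/2 appears as a decimal rather than in exact form.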
We can distinguish between six cases of the quadratic equation, depending on whether ...
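[Figure: the graph of y = x² + 3x + 1, with the line of symmetry x = −3/2, the y-intercept D = (0, 1), the x-intercepts A = ((−3 − √5)/2, 0) and C = ((−3 + √5)/2, 0), and the turning point B = (−3/2, −5/4).]
3. There is one (vertical) line of symmetry: $x = -\frac{b}{2a} = -\frac{3}{2 \times 1} = -\frac{3}{2}$.
4. The one turning point is:
$$B = \left(-\frac{b}{2a}, c - \frac{b^2}{4a}\right) = \left(-\frac{3}{2}, 1 - \frac{3^2}{4 \times 1}\right) = \left(-\frac{3}{2}, -\frac{5}{4}\right).$$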
5. Since the coefficient 1 on x2 is positive, the graph is ∪-shaped, with the turning point
being the strict global minimum.
[Figure: the graph of y = x² + 2x + 1, with the line of symmetry x = −1, the y-intercept B = (0, 1), and the x-intercept A = (−1, 0).]
A = (−1, 0).
$$x^2 + 2x + 1 = [x - (-1)]^2 = (x + 1)^2.$$
3. There is one (vertical) line of symmetry: $x = -\frac{b}{2a} = -\frac{2}{2 \times 1} = -1$.
4. In general, if a quadratic equation has only one real root, then the turning point is
also the x-intercept:
$$A = \left(-\frac{b}{2a}, c - \frac{b^2}{4a}\right) = \left(-1, 1 - \frac{2^2}{4 \times 1}\right) = (-1, 0).$$
5. Since the coefficient 1 on x2 is positive, the graph is ∪-shaped, with the turning point
being the strict global minimum.
[Figure: the graph of y = x² + x + 1, with the line of symmetry x = −1/2, the y-intercept B = (0, 1), and the turning point A = (−1/2, 3/4).]
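[Figure: the graph of y = −x² + 3x − 1, with the line of symmetry x = 3/2, the y-intercept A = (0, −1), the x-intercepts B = ((3 − √5)/2, 0) and D = ((3 + √5)/2, 0), and the turning point C = (3/2, 5/4).]
3. There is one (vertical) line of symmetry: $x = -\frac{b}{2a} = -\frac{3}{2 \times (-1)} = \frac{3}{2}$.
4. The one turning point is:
$$C = \left(-\frac{b}{2a}, c - \frac{b^2}{4a}\right) = \left(\frac{3}{2}, -1 - \frac{3^2}{4 \times (-1)}\right) = \left(\frac{3}{2}, \frac{5}{4}\right).$$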
5. Since the coefficient −1 on x2 is negative, the graph is ∩-shaped, with the turning point
being the strict global maximum.
[Figure: the graph of y = −x² + 2x − 1, with the line of symmetry x = 1, the y-intercept A = (0, −1), and the x-intercept B = (1, 0).]
B = (1, 0).
$$-x^2 + 2x - 1 = -(x - 1)^2.$$
3. There is one (vertical) line of symmetry: $x = -\frac{b}{2a} = -\frac{2}{2 \times (-1)} = 1$.
4. In general, if a quadratic equation has only one real root, then the turning point is
also the x-intercept:
$$B = \left(-\frac{b}{2a}, c - \frac{b^2}{4a}\right) = \left(1, -1 - \frac{2^2}{4 \times (-1)}\right) = (1, 0).$$
5. Since the coefficient −1 on x2 is negative, the graph is ∩-shaped, with the turning point
being the strict global maximum.
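[Figure: the graph of y = −x² + x − 1, with the line of symmetry x = 1/2, the y-intercept A = (0, −1), and the turning point B = (1/2, −3/4).]
$$B = \left(-\frac{b}{2a}, c - \frac{b^2}{4a}\right) = \left(\frac{1}{2}, -1 - \frac{1^2}{4 \times (-1)}\right) = \left(\frac{1}{2}, -\frac{3}{4}\right).$$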
5. Since the coefficient −1 on x2 is negative, the graph is ∩-shaped, with the turning point
being the strict global maximum.
[Figure: the six graphs y = x² + x + 1, y = x² + 2x + 1, y = x² + 3x + 1, y = −x² + 3x − 1, y = −x² + 2x − 1, and y = −x² + x − 1 on one set of axes.]
Exercise 79. Sketch each graph, identifying any intercepts, lines of symmetry, turning
points, and extrema.
(a) y = 2x² + x + 1.
(b) y = −2x² + x + 1.
(c) y = x² + 4x + 4. (Answers on p. 1404.)
Strictly speaking,91 this description of functions is incorrect and suffers from (at least) two
big problems:
1. It fails to make any mention of the domain and the codomain.
But whenever we specify a function, we must also specify the domain and codomain.92
2. It incorrectly suggests that a function must always be some sort of a “formula”.
But it needn’t be. A function simply maps or assigns every element in the domain to
(exactly) one element in the codomain. There need be nothing logical or formulaic about
how this mapping or assignment is done. We will illustrate this important point with many
examples below.
Here is the correct way to describe functions:93
91
Pedagogical note: H2 Maths is equivalent to first- and even second-year university courses in many
countries. (Well, at least on paper. See my Preface/Rant.) My view is that while at earlier levels,
the above incorrect description of a function may have been suitable, at this level, it is no longer so.
Definition 50 is not at all difficult (especially when compared with a lot of the junk that’s already in H2
Maths). At the cost of very little additional pain and time, the student gains a far better understanding
of what functions are and how they work, and thus saves herself more grief in the long run.
92
Things become so much simpler if we make it clear from the outset and indeed insist that the very
definition of a function includes the specification of a domain and codomain. Instead we have the
present rigmarole where we ask students to explain which values “to exclude” from the domain, as if we
were doing some ad hoc repair to make the function “work”.
93
Definition 50 will serve us very well. Note though that it is still not quite correct! The formal and
correct definition of a function is that it is a set. See Definition 226 (Appendices).
Example 193. Let f be the function with:
🐄
enough. If we wanted to, we
could write it out formally,
like so: Produces eggs
In the above example, the mapping rule “makes sense” — we simply map each animal to
its role. In the next example, the mapping rule “makes no sense”. Nonetheless, we have a
perfectly well-defined function all the same:
[Figure: an arrow diagram from the domain to the codomain {Produces eggs, Guards the home, Produces milk}.]
Here the mapping rule “makes no sense”: it maps Cow to Produces eggs and Chicken to Guards the home. Nonetheless, this is a well-defined function, because it maps every element in the domain to (exactly) one element in the codomain.
Again, a very similar example, but this time the mapping rule “makes no sense”:
94
The reason there are more ISO three-letter country codes than UN members is that not every “country”
is a UN member. For example, Greenland is assigned the ISO code GRL, but is a territory of Denmark
and is not a UN member state. The Holy See (or Vatican City) is a fully sovereign state and is
assigned the ISO code VAT, but is not a UN member state. Taiwan is, for all intents and purposes, a
fully sovereign state and is assigned the ISO code TWN, but is not a UN member state (thanks to a big
bully and evil empire next door).
Now, a function consists of the above three pieces. Hence — and here’s a somewhat subtle
point — two functions are identical (i.e. equal) if and only if they have the same domain,
codomain, and mapping rule.
The function h
🐄 Produces eggs
Produces milk
🐔
The domain The codomain
The function h1
🐄 Produces eggs
Produces milk
🐔
The domain The codomain
The functions h and h1 look very similar. Indeed, they have the same domain and
mapping rule.
However, their codomains are different. And so h and h1 are distinct:
h ≠ h1 .
One might think, “Aiyah, the codomain not very important wat. Both h and h1 map Cow
to Produces Milk and Chicken to Produces eggs. So just call them the same function
lah!” But this thinking is wrong.
Again, to reiterate, stress, and emphasise, a function consists of three pieces: the domain,
the codomain, and the mapping rule. Two functions are identical if and only if they have
the same domain, codomain, and mapping rule.
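One way to see why the codomain matters is to store a function as all three pieces and compare them together. The sketch below is illustrative only: the helper make_function is made up, and the two codomains chosen for h and h1 are assumptions (the text says only that they differ).

```python
def make_function(domain, codomain, rule):
    """Bundle the three pieces of a (finite) function into one comparable object."""
    return (frozenset(domain), frozenset(codomain), dict(rule))


rule = {"Cow": "Produces milk", "Chicken": "Produces eggs"}

# Assumed codomains, chosen only so that they differ.
h = make_function({"Cow", "Chicken"}, {"Produces eggs", "Produces milk"}, rule)
h1 = make_function(
    {"Cow", "Chicken"},
    {"Produces eggs", "Guards the home", "Produces milk"},
    rule,
)

print(h[0] == h1[0] and h[2] == h1[2])  # True: same domain and mapping rule
print(h == h1)                          # False: the codomains differ
```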
The function l
1
1
2
3
2
4
The function l1
1
1
2
2
4
Again, the functions l and l1 look very similar — they have the same domain and mapping
rule. However, their codomains are different. And so, l and l1 are distinct:
l ≠ l1 .
The “function” f
1
100
200
3
Unfortunately, f is not a function, because f fails to map every element in the domain
to (exactly) one element in the codomain — in particular, f fails to map 5 to any element
in the codomain.
The “function” g
1
100
200
3
Unfortunately, g is not a function, because g fails to map every element in the domain
to (exactly) one element in the codomain — in particular, g maps 1 to two elements,
namely 100 and 200.
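The two ways of failing to be a function can be checked mechanically when the sets are finite. In the sketch below, an alleged function is given as a list of (input, output) arrows. The helper is_well_defined is made up for illustration, and the specific arrows for f and g are assumptions chosen to match the descriptions above (f leaves 5 unmapped; g maps 1 twice).

```python
def is_well_defined(domain, codomain, pairs):
    """True if the arrows in pairs map every domain element to exactly one codomain element."""
    domain, codomain = set(domain), set(codomain)
    # No arrow may start outside the domain or end outside the codomain.
    if any(x not in domain or y not in codomain for x, y in pairs):
        return False
    # Every element of the domain needs exactly one image.
    return all(len({y for u, y in pairs if u == x}) == 1 for x in domain)


f_pairs = [(1, 100), (3, 200)]            # assumed arrows: 5 is not mapped anywhere
g_pairs = [(1, 100), (1, 200), (3, 200)]  # assumed arrows: 1 is mapped to two elements

print(is_well_defined({1, 3, 5}, {100, 200}, f_pairs))  # False
print(is_well_defined({1, 3}, {100, 200}, g_pairs))     # False
print(is_well_defined({1, 3}, {100, 200}, f_pairs))     # True
```

The last line shows that removing 5 from the domain turns the first set of arrows into a well-defined function.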
The function f
5
3
6
7
4
8
There is usually more than one way to write down a mapping rule. The mapping
rule in the above example was written informally as “Double it”. But if we wanted to, it
could’ve been written more formally as:
We could also have written the mapping rule explicitly, stating what each element in the
domain is to be mapped to:
The mathematical punctuation mark ↦ means maps to. And so, here’s yet another way
to write the mapping rule:
“f ∶ 3 ↦ 6 and f ∶ 4 ↦ 8.”
So, altogether, in this example alone, we’ve given four different but entirely equivalent ways
of writing out the mapping rule. You can choose to write the mapping rule however you
like. What’s important is that you make clear how the mapping rule maps each element in
the domain to (exactly) one element in the codomain. If you haven’t made it sufficiently
clear, then you have failed to communicate to others what your function is and your function
is not well-defined.
Here are eight ways to say aloud f (4) = 8 or f ∶ 4 ↦ 8:
Since −1, 1.5, π, 0 ∉ Domain(g), g(−1), g(1.5), g(π), and g(0) are undefined.
Likewise, since Cow, Chicken ∉ Domain(g), g (Cow) and g (Chicken) are undefined.
The mapping rule was given informally above. We could’ve written it more formally as:
With the aid of an ellipsis, we could also have written it down explicitly:95
95
But in general, this may not be possible.
So far, we’ve written down functions with the aid of tables and/or figures. But going
forward, we’ll want to write down functions more concisely and without the aid of
tables or figures.
Here are nine entirely equivalent and formal ways to write down the function g from the
last example:
1. Let g be the function that maps every element x in the domain Z to the element x² in
the codomain R.
2. Let g be the function that has domain Z, codomain R, and maps every x ∈ Z to x² ∈ R.
3. Let g ∶ Z → R be the function defined by g(x) = x².
4. Let g ∶ Z → R be the function defined by g ∶ x ↦ x².
5. Let g ∶ Z → R be defined by g(x) = x².
6. Let g ∶ Z → R be defined by g ∶ x ↦ x².
7. Define g ∶ Z → R by g(x) = x².
8. Define g ∶ Z → R by g ∶ x ↦ x².
9. Define g ∶ Z → R by x ↦ x².
In Statements 3–9, the domain comes after the colon, the codomain after the → arrow,
and the mapping rule at the end of the statement.
The general version of Statement 9 is:
Define f ∶ A → B by x ↦ f(x),
where A is Domain(f), B is Codomain(f), and x ↦ f(x) is the mapping rule.
In words, the function f is defined to have domain A, codomain B, and mapping rule
x ↦ f (x). (Note that once again, x here is merely a dummy or placeholder variable, that
could’ve been replaced by any other symbol like y, z, ☺, or ☀.)
Three examples on the next page:
Observe that −1, 0, Cow, Chicken ∉ R+ = Domain(h). Thus, the following are all undefined:
Observe that 1.5, π, Cow, Chicken ∉ Z = Domain(i). Thus, the following are all undefined:
Observe that −1, −3.2, Cow, Chicken ∉ Z = Domain(j). Thus, the following are all undefined:
96
The functions g and h have different domains and so g ≠ h. Likewise, the functions i and j have different
domains and so i ≠ j.
In the above examples, we actually cheated a little. Or rather, we took it for granted that
the specified mapping rule applied to all elements in the domain. If we wanted to be extra
careful (or pedantic), then we should “really” have written:
This is because we can have piecewise functions like the following, where there are
different mapping rules for different elements in the domain:
In words, the function k doubles integers that are less than or equal to 5, but adds one
to those that are greater than 5.
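A piecewise mapping rule like this one corresponds naturally to an if/else branch. The sketch below assumes that the domain of k is the set of integers Z, as the wording above suggests; inputs outside that assumed domain are rejected rather than silently accepted.

```python
def k(x):
    """Double integers that are at most 5; add one to integers greater than 5."""
    if not isinstance(x, int):
        raise ValueError(f"{x!r} is not in the (assumed) domain Z")
    if x <= 5:
        return 2 * x
    return x + 1


print(k(5), k(6), k(-2))  # 10 7 -4
```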
A common mistake is to believe that f (x) denotes a function. But this is wrong.
For the next two examples, let S be the set of human beings.
Example 216. Let h ∶ S → R be the height function. That is, h gives each human being’s
height (rounded to the nearest centimetre).
Then we have, for example, h (Joseph Schooling) = 184.
Example 217. Let w ∶ S → R be the weight function. That is, w gives each human
being’s weight (rounded to the nearest kilogram).
Then we have, for example, w (Joseph Schooling) = 74.
This may seem like an excessively pedantic distinction. But maths is precise and pedantic.
In maths, we are gentlemen (and ladies) who say what we mean and mean what we say.
There is no room for ambiguity or alternative interpretations.
In H2 Maths, we’ll usually encounter only nice functions. So, we’ll often encounter functions
like f , g, h, and i, but not functions like j, k, l, or m.
Remark 25. The term nice function is not standard and is used in this textbook for
brevity’s sake (so we don’t have to keep saying “a real-valued function of a real variable”).
97
No to both, because j and k have different domains and so too do l and m.
10.5. Graphs of Functions
Let f be a nice function. Then the graph of f is simply the graph of the equation y = f (x)
with the constraint x ∈ Domain(f ). A little more formally:
Definition 51. Given a nice function f , its graph is the following set of points:98
By the way, strictly speaking, we should say that the point (3, 6) is on the graph of f .
That is, we should explicitly state both the x- and y-coordinates of any point we are
talking about.
However, since the x-coordinate is sufficient for identifying a point on the graph of a
function, we will sometimes be lazy and say things like:
98
Actually, going by the formal definition of a function (Definition 226 in the Appendices), there is no
difference between a function and its graph. A function is its graph. See the discussion in the Appendices.
Example 222. Define f1 ∶ R+ → R by f1 (x) = 2x.
Note that the graph of f1 does not contain the point (0, 0), because 0 ∉ R+ .
{. . . , (−3, −3) , (−2, −2) , (−1, −1) , (0, 0) , (1, 1) , (2, 2) , (3, 3) , . . . }
Note that again, the graph of g1 does not contain the point (0, 0), because 0 ∉ R+ .
Exercise 82. Fill in the blanks with “at least one element”; “every element”; or “exactly
one element”. (Answer on p. 1405.)
A function maps _____ in its domain to _____ in its codomain.
Exercise 83. What do we call a function whose ...
99
By the way, this function is sometimes called the Heaviside function.
Exercise 85. (i) Verify that the functions given below are well-defined. (ii) Which (if
any) of them are equivalent? (Answer on p. 1405.)
Exercise 86. Define f ∶ R+ → R by “round it off to the nearest integer (half-integers are
rounded up)”. (Answer on p. 1405.)
(a) What are f (3), f (π), f (3.5), f (3.88), and f (0)?
Exercise 87. Let A = {Lion, Eagle} and B = {Fat, Tall}. Can we construct a well-defined
function using A as the domain and B as the codomain? (Answer on p. 1405.)
Exercise 88. Explain whether each of the following alleged functions is in fact well-
defined. (Answer on p. 1405.)
(a) Define a ∶ {Cow, Chicken, Dog} → {Produces eggs, Guards the home, Produces milk}
by “match the animal to its role”.
(b) Define b ∶ {Cow, Chicken, Dog} → {Produces eggs, Produces milk} by “match the
animal to its role”.
(c) Let c have the set of UN member states as its domain, the set of cities as its codomain,
and the mapping rule “match the state to its most splendid city”.
(d) Let d have the set of UN member states as its domain, the set of cities as its codomain,
and the mapping rule “match the state to a city with over 10M people”.
Exercise 90. Continuing with the above exercise, how can we change the domains of n
and o so that n and o become well-defined?100 (Answer on p. 1407.)
100
The sharp student may have noticed that one trivial answer here is to simply change the domain to
the empty set ∅. Then it is trivially or vacuously true that every element in the domain is mapped
to exactly one element in the codomain. If you’ve noticed and are bothered by this, please change the
question to “What are the largest subsets of R to which the domains of n and o can be changed, so that
they become well defined?”
10.6. The Range of a Function
Informally, the range is the set of elements in the codomain that are “hit”.101 Formally:
In general, the range is not the same thing as the codomain.102 This is because in
general, not every element in the codomain need be hit by the function.
Instead, in general, the range is a subset of the codomain. (Indeed, it is typically a proper
subset of the codomain; in other words, it is typically “smaller” than the codomain.)
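For a function with a finite domain, the range can be computed by collecting every value that is actually “hit”. The example below is hypothetical (it is not one of this chapter’s examples): the squaring rule on the domain {−2, −1, 0, 1, 2}, with the integers from −10 to 10 as the codomain.

```python
def range_of(domain, rule):
    """The set of elements of the codomain that are actually hit."""
    return {rule(x) for x in domain}


domain = range(-2, 3)            # -2, -1, 0, 1, 2
codomain = set(range(-10, 11))   # -10, ..., 10
rng = range_of(domain, lambda x: x * x)

print(sorted(rng))       # [0, 1, 4]
print(rng <= codomain)   # True: the range is a subset of the codomain
print(rng == codomain)   # False: here it is a proper subset
```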
Because this is such a common point of confusion, let me repeat:
101
If D is the domain of the function f , then we can also call the range of f the image of D under f and
denote it f (D).
102
Unfortunately and very confusingly, for a minority of writers, the term range is synonymous with
codomain. In the A-Level syllabus and exams and in this textbook, we will follow majority practice by
insisting that range is not the same thing as codomain.
Sometimes, every element in the codomain is “hit” — in such cases, the range is equal to
the codomain. Examples:
1
100
200
3
1
100
200
3
Exercise 92. Let f be a function. Then which of the following must be true?
(a) Range(f ) ⊆ Domain(f ).
(b) Range(f ) ⊆ Codomain(f ).
(c) Range(f ) ⊂ Domain(f ).
(d) Range(f ) ⊂ Codomain(f ).
(e) Range(f ) = Domain(f ).
(f) Range(f ) = Codomain(f ). (Answer on p. 1407.)
f is continuous everywhere
because you can draw its entire
graph without lifting your pencil.
103
For the formal definition of continuity, see Ch. 121.6 in the Appendices.
Example 238. The exponential function exp is continuous everywhere.
∣⋅∣ is continuous
everywhere because you
can draw its entire graph
without lifting your pencil.
As stated, most functions we’ll encounter are continuous everywhere. However, we’ll oc-
casionally encounter functions that aren’t continuous everywhere. For example, functions
with vertical asymptotes:
[Figure: the graph of tan, with the x-axis marked at −π/2 and π/2.]
Indeed, tan is continuous on each interval ((k − 1/2)π, (k + 1/2)π), for k ∈ Z.
And so, we can actually say that the tan function is continuous everywhere except at
each x = (k + 1/2)π, for k ∈ Z.
Example 241. Define h ∶ R ∖ {0} → R by h(x) = 1/x.
Then h is not continuous everywhere, because to draw its entire graph, you must lift your pencil.
Note though that h is continuous on R− because you can draw this portion of the graph without lifting your pencil.
Similarly, h is also continuous on R+ because you can draw this portion of the graph without lifting your pencil.
And so, we can actually say that h is continuous everywhere except at x = 0.
[Figure: the graph of h.]
Note though that i is continuous on (−∞, 1) because you can draw this portion of the
graph without lifting your pencil.
Similarly, i is continuous on (1, ∞) because you can draw this portion of the graph without
lifting your pencil.
And so, we can actually say that i is continuous everywhere except at x = 1.
To repeat, most functions we’ll encounter in A-Level maths will be continuous everywhere.
Indeed, all functions we’ll encounter will be continuous everywhere except possibly at a
set of isolated points. Informally, a point in a set is said to be isolated if it isn’t close
to any other point in the set.104 For example, tan is continuous everywhere except on
{(k + 1/2) π ∶ k ∈ Z}.
104
For the formal definition, see Definition 250 (Appendices).
This isn’t something you need to know, but just to illustrate, here’s a somewhat exotic
function whose domain is R but which is nowhere-continuous.
105
Or the characteristic function of the rationals. Named after the German mathematician Peter
Gustav Lejeune Dirichlet (1805–59).
106
We will explain in a little more detail why this is so in Part V (Calculus), Ch. 68.
12. When a Function Is Increasing or Decreasing
Informally, we know what it means for a function to be increasing, decreasing, strictly
increasing, and/or strictly decreasing:
Figure to be
inserted here.
Note: At x = 0, f is both decreasing and increasing, but neither strictly decreasing nor
strictly increasing. This follows from the formal definitions (below).
Formal definitions:
Definition 53. Let f be a nice function. Given a set of points S ⊆ Domain(f), we say
that f is:
We will find it convenient to have a word that describes a function that’s either increasing
or decreasing:
Definition 54. If a function is increasing or decreasing (on a set), then we say that it is
monotonic (on that set).
If a function is strictly increasing or strictly decreasing (on a set), then we say that it is
strictly monotonic (on that set).
If a function is (strictly) monotonic on its domain, then we simply say that it is a (strictly)
monotonic function.
Exercise 93. We will review trigonometric functions in Ch. 19. For now, here is the
graph of the sine function sin ∶ R → R:
Figure to be
inserted here.
Write down the sets on which sin is (a) increasing; (b) decreasing; (c) strictly increasing;
and/or (d) strictly decreasing. What are the points at which sin is (e) increasing but
not strictly increasing; and (f) decreasing but not strictly decreasing? (Answer on p.
169.)
A93. For every integer k, sin is:
(a) Increasing on [−π/2 + 2kπ, π/2 + 2kπ];
(b) Decreasing on [π/2 + 2kπ, 3π/2 + 2kπ];
(c) Strictly increasing on (−π/2 + 2kπ, π/2 + 2kπ);
(d) Strictly decreasing on (π/2 + 2kπ, 3π/2 + 2kπ);
(e) Increasing but not strictly increasing at each point x = 2kπ + π/2;
(f) Decreasing but not strictly decreasing at each point x = 2kπ + π/2.
Figure to be
inserted here.
(f + g) ∶ R → R by (f + g)(x) = 7x + 5 + x³.
(f − g) ∶ R → R by (f − g)(x) = 7x + 5 − x³.
(f ⋅ g) ∶ R → R by (f ⋅ g)(x) = (7x + 5)x³.
(kf) ∶ R → R by (kf)(x) = 2(7x + 5).
(f/g) ∶ R ∖ {0} → R by (f/g)(x) = (7x + 5)/x³.
Evaluating each of these five functions at 1, we have:
A word about the domain. For kf, the domain is simply Domain(f). For each of f + g,
f − g, and f ⋅ g, the domain is simply Domain(f) ∩ Domain(g) (i.e. the set of numbers that
are in both the domains of f and g).
But for the quotient function f /g, it’s a little trickier figuring out what the domain is.
Observe that g (0) = 0. And so, in order for f /g to be well-defined, we must restrict the
domain by removing the element 0. Otherwise, (f /g) (0) would be undefined and f /g
would fail to be a well-defined function. Thus, the domain of f /g is:
In words, the domain of f /g is the set of numbers x that are in both the domains of f
and g, but remove those for which g (x) equals zero.
In words, the domain of h/i is the set of numbers x that are in both the domains of h
and i, but remove those for which i (x) equals zero.107
107
Here the sharp student may wonder, “But isn’t √(x + 1) perfectly well-defined for all x ∈ [−1, ∞)? So
couldn’t the domain of h/i instead be [−1, ∞)?” Great point. The thing is, we really want to think
of (h/i)(x) as being equal to h(x) divided by i(x) (without doing any simplification beforehand and
without thinking of h/i as being simply the “formula” √(x + 1)). So, if i(x) is undefined, then (h/i)(x)
should also be undefined.
Formal definitions of the five functions:
Definition 55. Let f and g be nice functions and k ∈ R. Then the sum, difference,
product, constant multiple, and quotient functions — denoted f + g, f − g, f ⋅ g, kf , and
f /g — have codomain R and domain and mapping rule as given below:
Quotient f/g: the domain is Domain(f) ∩ Domain(g) ∖ {x ∶ g(x) = 0} and the mapping rule is (f/g)(x) = f(x)/g(x).
Remark 26. As we’ll learn shortly, f g refers to a function that’s entirely different from
f ⋅ g. So take great care to write f ⋅ g if that’s what you mean.108
f ∶ R → R by f(x) = 7x + 5, g ∶ R → R by g(x) = x³,
h ∶ [−1, ∞) → R by h(x) = x + 1; and i ∶ [−1, ∞) → R by i(x) = √(x + 1).
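The domain rule for the quotient function in Definition 55 can be sketched in code by pairing each function with a test for membership in its domain. This is an illustration only: the representation (a pair of a domain test and a mapping rule) and the helper quotient are made up here, and the rules h(x) = x + 1 and i(x) = √(x + 1) are taken as assumptions consistent with footnote 107.

```python
import math


def quotient(f, g):
    """Build f/g from two (domain test, mapping rule) pairs, following Definition 55."""
    f_in_domain, f_rule = f
    g_in_domain, g_rule = g

    def in_domain(x):
        # x must be in both domains, and g(x) must not be zero.
        return f_in_domain(x) and g_in_domain(x) and g_rule(x) != 0

    def rule(x):
        if not in_domain(x):
            raise ValueError(f"{x} is not in the domain of the quotient")
        return f_rule(x) / g_rule(x)

    return in_domain, rule


f = (lambda x: True, lambda x: 7 * x + 5)
g = (lambda x: True, lambda x: x**3)
h = (lambda x: x >= -1, lambda x: x + 1)
i = (lambda x: x >= -1, lambda x: math.sqrt(x + 1))

fg_in_domain, fg_rule = quotient(f, g)
print(fg_in_domain(0), fg_rule(1))                     # False 12.0

hi_in_domain, hi_rule = quotient(h, i)
print(hi_in_domain(-1), hi_in_domain(0), hi_rule(3))   # False True 2.0
```

The domain test is applied before any division takes place, so −1 is excluded from the domain of h/i even though the simplified “formula” √(x + 1) would happily accept it.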
108
Unfortunately and very confusingly, a minority of writers do use f g to mean f ⋅ g. In the A-Level
syllabus and exams and in this textbook, we will follow majority practice by insisting that f g ≠ f ⋅ g.
14. Inverse Functions
Example 247. Define f ∶ {Cow, Chicken} → {Produces eggs, Produces milk} by “match
the animal to its role”.
[Figure: arrow diagrams of the function f, from its domain {Cow, Chicken} to its codomain {Produces eggs, Produces milk}, and of the function f⁻¹, which runs in the opposite direction.]
Given the function f , its inverse function (or simply inverse) is denoted f −1 and satisfies:
Definition 56 will formally define the inverse function. But first, more examples:
Example 248. Define g ∶ {Cow, Dog, Chicken} → {Produces eggs, Guards the home,
Produces milk} by “match the animal to its role”.
[Figure: arrow diagrams of the function g and of the function g⁻¹.]
In each of the above examples, the original function’s range was identical to its codomain.
And thus, the inverse function’s domain was simply the original function’s codomain.
However, in general, this need not be the case:
[Figure: arrow diagram of the function h, showing its domain and its codomain.]
1. Use the fact that h−1 (h (x)) = x to “invert” the mapping rule.
So again, here we simply “invert” the mapping rule from “match the animal to its role”
to “match the role to the corresponding animal”. (This corresponds to inverting the ↦
arrows in the above figure.)
Thus, the inverse function of h is h−1 ∶ {Produces eggs, Produces milk} → {Cow, Chicken}
defined by “match the role to the corresponding animal”.
[Figure: arrow diagram of the function h⁻¹, showing its codomain {Cow, Chicken} and its domain {Produces eggs, Produces milk}.]
j⁻¹ (j (x)) = x ⇐⇒ j⁻¹ (2x) = x
⇐⇒ j⁻¹ (y) = x (Let y = j (x) = 2x.)
⇐⇒ j⁻¹ (y) = y/2 (Do the algebra: x = y/2.)
109
By the way, note that once again, y here is merely a dummy or placeholder variable that we use for i−1 .
We could’ve replaced y with any other symbol like w, z, ,, or ☀. (We would however avoid using x
because this is the dummy or placeholder variable that we already used for i.)
Example 252. Define k ∶ R ∖ {0} → R by k (x) = 1/x.
Three-step procedure to get the inverse function k −1 :
1. Set Domain (k −1 ) = Range(k) = R ∖ {0}.
2. Set Codomain (k −1 ) = Domain(k) = R ∖ {0}.
3. Use the fact that k −1 (k (x)) = x to “invert” the mapping rule:
k⁻¹ (k (x)) = x
⇐⇒ k⁻¹ (1/x) = x
⇐⇒ k⁻¹ (y) = x (Let y = k (x) = 1/x.)
⇐⇒ k⁻¹ (y) = 1/y (Do the algebra: x = 1/y.)
Thus, we define k⁻¹ ∶ R ∖ {0} → R ∖ {0} by k⁻¹ (y) = 1/y.
Let’s verify that this inverse function “works”, i.e. k −1 (k (x)) = x, for some values of x:
k⁻¹ (k (1)) = k⁻¹ (1/1) = k⁻¹ (1) = 1/1 = 1. ✓
k⁻¹ (k (0.3)) = k⁻¹ (1/0.3) = k⁻¹ (10/3) = 1/(10/3) = 0.3. ✓
k⁻¹ (k (0.8)) = k⁻¹ (1/0.8) = k⁻¹ (5/4) = 1/(5/4) = 0.8. ✓
110
They have identical domains and mapping rules. However, they have different codomains and are
therefore not equal.
Example 253. Define l ∶ R+0 → R by l (x) = x².
Three-step procedure to get the inverse function l−1 :
1. Set Domain (l−1 ) = Range(l) = R+0 .
2. Set Codomain (l−1 ) = Domain(l) = R+0 .
3. Use the fact that l−1 (l (x)) = x to “invert” the mapping rule:
l⁻¹ (l (x)) = x
⇐⇒ l⁻¹ (x²) = x
⇐⇒ l⁻¹ (y) = x (Let y = l (x) = x².)
⇐⇒ l⁻¹ (y) = √y (Do the algebra: x = ±√y.)
Note that in the last step here, we discard −√y, because the codomain of l⁻¹ is the set
of non-negative real numbers.
Thus, we define l⁻¹ ∶ R+0 → R+0 by l⁻¹ (y) = √y.
Let’s verify that this inverse function “works”, i.e. l⁻¹ (l (x)) = x holds, for some values
of x:
l⁻¹ (l (1)) = l⁻¹ (1²) = l⁻¹ (1) = √1 = 1. ✓
l⁻¹ (l (3)) = l⁻¹ (3²) = l⁻¹ (9) = √9 = 3. ✓
l⁻¹ (l (8)) = l⁻¹ (8²) = l⁻¹ (64) = √64 = 8. ✓
Definition 56. Let f be a function. Then its inverse function (or simply inverse),
denoted f −1 , has the following domain, codomain, and mapping rule:
Domain: Range(f ).
Codomain: Domain(f ).
Mapping rule: If y = f (x), then f −1 (y) = x.
As we’ll see on the next page, the inverse function f −1 isn’t always well-defined. In such
cases, we say that “the inverse function does not exist” or more simply, “the function has
no inverse”.
Example 254. Define f ∶ {Cow, Chicken} → {Yum, Yuck} by f (Cow) = Yum and
f (Chicken) = Yum.
[Figure: arrow diagram of the function f , which maps both Cow and Chicken to Yum. Nothing is mapped to Yuck.]
Say we try to get the inverse function f −1 through the usual three-step procedure:
1. Set Domain (f⁻¹) = Range(f ) = {Yum}. ✓
2. Set Codomain (f⁻¹) = Domain(f ) = {Cow, Chicken}. ✓
So far so good. But in the third step, we run into trouble:
3. If we try to “invert” the mapping rule by inverting the ↦ arrows, we get the following
“function”:
[Figure: arrow diagram of the “function” f⁻¹, which maps Yum to both Cow and Chicken.]
But as you should know very well by now, this “function” f −1 isn’t well-defined, because
it maps the element Yum in its domain to more than one element in its codomain.
And so in this case, we say that “the inverse function f −1 does not exist” or more simply,
“the function f has no inverse”.
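For functions with small finite domains, the three-step procedure can be mimicked in Python. The sketch below stores a function as a dictionary and uses a made-up helper called invert. The dictionary roles assumes the natural matching (Cow ↦ Produces milk, Chicken ↦ Produces eggs) from the earlier example.

```python
def invert(mapping):
    """Try to invert a finite function given as a dict {x: f(x)}.

    Returns the inverse as a dict {f(x): x}, or None if two distinct
    inputs share an output (so that the inverse is not well-defined).
    """
    inverse = {}
    for x, y in mapping.items():
        if y in inverse:      # y is already "hit" by another element
            return None
        inverse[y] = x        # invert the arrow x |-> y
    return inverse

# Assumed matching for the earlier cow-and-chicken example
roles = {"Cow": "Produces milk", "Chicken": "Produces eggs"}
# The function f of Example 254
taste = {"Cow": "Yum", "Chicken": "Yum"}

print(invert(roles))
print(invert(taste))
```

The keys of the returned dictionary are exactly the range of the original function, which matches step 1 of the procedure.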
[Figure: arrow diagram of the function g, from the domain {0, 1, 2} to the codomain {0, 1, 2, 3}.]
Say we try to get the inverse function g −1 through the usual three-step procedure:
1. Set Domain (g⁻¹) = Range(g) = {0, 1}. ✓
2. Set Codomain (g⁻¹) = Domain(g) = {0, 1, 2}. ✓
So far so good. But in the third step, we run into trouble:
3. If we try to “invert” the mapping rule by inverting the ↦ arrows, we get the following
“function”:
[Figure: arrow diagram of the “function” g⁻¹, with the arrows of g inverted.]
But again, this “function” g −1 is not well-defined, because it maps the element 1 in its
domain to more than one element in its codomain.
And so in this case, we say that “the inverse function g −1 does not exist” or more simply,
“the function g has no inverse”.
In words, a function is one-to-one or invertible if any two distinct elements in its domain
correspond to two distinct elements in the codomain.
Here are two equivalent ways to rewrite the above definition — a function f is one-to-one
or invertible if:
• For every y ∈ Range(f ), there is exactly one x ∈ Domain(f ) such that f (x) = y.
• f (x1 ) = f (x2 ) implies x1 = x2 . (This is the contrapositive of the above definition.)
Graphically and informally, we can also use the horizontal line test (HLT):
The name one-to-one is apt — every element in the codomain is “hit” by exactly one
element in the domain. In contrast, a function that isn’t one-to-one is many-to-one,
where at least one element in the codomain is “hit” by more than one element in the
domain. (By the way, is it possible that a function is one-to-many?111 )
The name invertible is also apt, because as our examples above illustrate, a function has
a well-defined inverse if and only if it is invertible. Let’s formally jot this down as a result:
111
Nope. A one-to-many function would be one that maps an element in the domain to more than one
element in the codomain — but this violates our cardinal requirement that a function maps each element
in the domain to (exactly) one element in the codomain.
Remark 27. There is actually a third name for one-to-one or invertible functions — they’re
also called injective functions (or simply injections). But we won’t use this third name
in this textbook.
To show that a function f is not invertible, simply find a counterexample. That is, simply
find some x1 ≠ x2 such that f (x1 ) = f (x2 ):
[Figure: graph of h. HLT: The horizontal line y = 8 intersects the graph of h twice, at x = −4 and at x = 2.]
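Hunting for a counterexample is easy to automate on a finite sample of the domain. In the sketch below, the helper find_counterexample and the squaring function are chosen only for illustration.

```python
from itertools import combinations

def find_counterexample(f, sample):
    """Search a finite sample of the domain for x1 != x2 with f(x1) == f(x2)."""
    for x1, x2 in combinations(sample, 2):
        if x1 != x2 and f(x1) == f(x2):
            return x1, x2
    return None

square = lambda x: x**2

# On a sample that includes negative numbers, a counterexample turns up.
print(find_counterexample(square, range(-3, 4)))
# On a sample of non-negative numbers, none is found.
print(find_counterexample(square, range(0, 4)))
```

A word of caution: finding a counterexample proves that a function is not invertible, but failing to find one on a sample proves nothing. To show that a function is invertible, we still need an argument that covers the whole domain.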
Exercise 96. Determine if each function is invertible. And if it is, write down its inverse.
(a) a ∶ R → R defined by a (x) = x² − 1.
(b) b ∶ R+0 → R defined by b (x) = x² − 1.
(c) c ∶ R → [−1, ∞) defined by c (x) = x² − 1. (Answer on p. 1409.)
Fact 22. Let f be an invertible function and f −1 be its inverse. Then f and f −1 are
reflections of each other in the line y = x.
f −1 (f (x)) = x
⇐⇒ f −1 (x + 1) = x
⇐⇒ f −1 (y) = x (Let y = f (x) = x + 1.)
⇐⇒ f −1 (y) = y − 1 (Do the algebra: x = y − 1.)
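Fact 22 can be spot-checked numerically. Reflecting a point (a, b) in the line y = x gives the point (b, a). So if (a, b) is on the graph of f , then (b, a) should be on the graph of f⁻¹. The sketch below checks this for a handful of arbitrarily chosen points, using the function and inverse from the derivation just above.

```python
f = lambda x: x + 1        # the function from the derivation above
f_inv = lambda y: y - 1    # its inverse, as derived above

# A few points (x, f(x)) on the graph of f (x-values chosen arbitrarily)
points_on_f = [(x, f(x)) for x in (-2, 0, 1.5, 3)]

# Reflecting a point (a, b) in the line y = x gives (b, a).
reflected = [(b, a) for (a, b) in points_on_f]

# Each reflected point (p, q) should satisfy q = f_inv(p).
on_inverse = all(f_inv(p) == q for (p, q) in reflected)
print(on_inverse)
```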
[Figure: graphs of f and f⁻¹, together with the line y = x.]
g⁻¹ (g (x)) = x
⇐⇒ g⁻¹ (2x) = x
⇐⇒ g⁻¹ (y) = x (Let y = g (x) = 2x.)
⇐⇒ g⁻¹ (y) = y/2 (Do the algebra: x = y/2.)
[Figure: graphs of g and g⁻¹, together with the line y = x.]
h⁻¹ (h (x)) = x
⇐⇒ h⁻¹ (1/x) = x
⇐⇒ h⁻¹ (y) = x (Let y = h (x) = 1/x.)
⇐⇒ h⁻¹ (y) = 1/y (Do the algebra: x = 1/y.)
Thus, the inverse function is h⁻¹ ∶ R ∖ {0} → R ∖ {0} defined by h⁻¹ (x) = 1/x. Below are
graphed h and h⁻¹. Observe that they are reflections of each other in the line y = x.
Indeed, they look exactly the same. (Is h = h⁻¹?112 )
[Figure: graphs of h and h⁻¹, which coincide, together with the line y = x.]
112
Nope. They have the same domains and mapping rules. But they have different codomains and are
thus different.
Example 264. The function i ∶ R+0 → R defined by i (x) = x² is invertible. Thus, its
inverse exists and we can find it as usual:
i⁻¹ (i (x)) = x
⇐⇒ i⁻¹ (x²) = x
⇐⇒ i⁻¹ (y) = x (Let y = i (x) = x².)
⇐⇒ i⁻¹ (y) = √y (Do the algebra: x = ±√y.)
Note that in the last step here, we discard −√y, because the codomain of i⁻¹ is the set
of non-negative real numbers.
Thus, the inverse function is i⁻¹ ∶ R+0 → R+0 defined by i⁻¹ (x) = √x. Below are graphed i
and i⁻¹. Observe that they are reflections of each other in the line y = x.
[Figure: graphs of i and i⁻¹, together with the line y = x.]
[Figure: graph of j⁻¹, together with the line y = x.]
In the above example, it’s actually possible to write down the inverse function’s mapping
rule114 — we just don’t know how to, because we haven’t learnt to solve cubic equations.
In the next example, it is impossible to do so. Nonetheless, we can again use Fact 22 to
sketch the graph of the inverse function.
113
Let x₂ > x₁. Then x₂³ + x₂ > x₁³ + x₁. We’ve just proven that for any x₁ ≠ x₂, we must have j (x₁) ≠ j (x₂).
Thus, j is invertible.
114
In case you were wondering, it’s j⁻¹ (x) = ∛(x/2 + √(x²/4 + 1/27)) + ∛(x/2 − √(x²/4 + 1/27)).
Example 266. The function k ∶ R → R defined by k (x) = x⁵ + x is invertible. (Can you
verify this?)115 Thus, the inverse function k⁻¹ ∶ R → R exists.
But unfortunately, it is impossible116 to write down the mapping rule of the inverse
function k⁻¹ ∶ R → R.
But again, even though we cannot write down the mapping rule of k −1 , if we already have
the graph of k, we can use Fact 22 to sketch the graph of k −1 .
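Even though we cannot write down a formula for k⁻¹, a computer can still approximate its values. The sketch below uses bisection, a standard numerical technique that is not part of this textbook’s method: to evaluate k⁻¹ at y, we search for the x with k (x) = y. The helper name k_inverse, the starting bracket, and the tolerance are all choices made here for illustration.

```python
def k(x):
    return x**5 + x

def k_inverse(y, tol=1e-12):
    """Approximate the inverse of k at y by bisection: find x with k(x) = y."""
    lo, hi = -1.0, 1.0
    while k(lo) > y:          # widen the bracket until k(lo) <= y <= k(hi)
        lo *= 2
    while k(hi) < y:
        hi *= 2
    while hi - lo > tol:      # k is strictly increasing, so we can halve the bracket
        mid = (lo + hi) / 2
        if k(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(k_inverse(2))           # k(1) = 2, so this should be close to 1
print(k_inverse(34))          # k(2) = 34, so this should be close to 2
```

Bisection works here precisely because k is strictly increasing, which is also the reason it is invertible.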
[Figure: graphs of k and k⁻¹, together with the line y = x.]
Exercise 97. Write down the inverse of each function. Then graph both the function
and its inverse. (Answers on pp. 1410–1411.)
(a) f ∶ (0, 1] → R defined by f (x) = x + 1.
(b) g ∶ (0, 1] → R defined by g (x) = 2x.
(c) h ∶ (0, 1] → R defined by h (x) = 1/x.
(d) i ∶ (0, 1] → R defined by i (x) = x².
115
Let x₂ > x₁. Then x₂⁵ + x₂ > x₁⁵ + x₁. We’ve just proven that for any x₁ ≠ x₂, we must have k (x₁) ≠ k (x₂).
Thus, k is invertible.
116
Abel’s impossibility theorem says there is no algebraic solution for polynomials of degree 5 and
above. That is, unlike the quadratic formula which gives us algebraic expressions for the quadratic
equation’s two roots, it is impossible to write down a similar formula for polynomials of degree 5 and
above. One implication of this is that it is impossible to write down k −1 here as an algebraic expression.
14.3. The Intersection of f and f −1
By Fact 22, a function f and its inverse f −1 are reflections of each other in the line y = x.
Observe then that if f intersects the line y = x at some point, then f −1 must also intersect
y = x at the very same point. Thus, any point at which f intersects y = x is also a point at
which f and f −1 intersect.
[Figure: graphs of g and g⁻¹, together with the 45° line (or y = x). The point (1, 1) is marked.]
The graph of g intersects the line y = x at the points (0, 0) and (1, 1). And so, g and g⁻¹
should also intersect at these points.
Fact 23. Suppose the function f has inverse f −1 . Then any point at which f intersects
the line y = x is also a point at which f intersects f −1 .
The above statement sounds perfectly plausible. But unfortunately, it is false. (Those
writing your A-Level exams have assumed it to be true at least twice in the recent past.)117
Example 269. Define f ∶ R ∖ {0} → R ∖ {0} by: f (x) = 1/x.
Then f ’s inverse is f⁻¹ ∶ R ∖ {0} → R ∖ {0}, also defined by:
f⁻¹ (x) = 1/x.
Observe that interestingly, here f is its own inverse.118 That is, f = f⁻¹. And
so, f and f⁻¹ share infinitely many intersection points.
However, only (−1, −1) and (1, 1) are on the line y = x. Every other intersection
point is not. For example, f and f⁻¹ intersect at the point (3, 1/3), but this point
is not on y = x.
[Figure: graph of f = f⁻¹, together with the 45° line (or y = x). The points (−1, −1), (1, 1), and (3, 1/3) are marked.]
that the function in the last counterexample was unusual because it was
(a) not continuous everywhere; and
(b) its own inverse.
Consider then the function g ∶ R → R defined by g (x) = −x³. It is (a) continuous
everywhere; and (b) isn’t its own inverse.
Its inverse g⁻¹ ∶ R → R is defined by:
g⁻¹ (x) = −∛x.
[Figure: graphs of g and g⁻¹, together with the 45° line (or y = x). The points (−1, 1) and (0, 0) are marked.]
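We can check numerically where this g meets its inverse and which of those points lie on the line y = x. The sketch below tests only the three candidate values x = −1, 0, 1 (chosen here by inspection), and the helper g_inv computes the real cube root by hand.

```python
import math

def g(x):
    return -x**3

def g_inv(x):
    # minus the real cube root of x (copysign keeps the sign of x)
    return -math.copysign(abs(x) ** (1 / 3), x)

for x in (-1, 0, 1):
    same_height = math.isclose(g(x), g_inv(x), abs_tol=1e-12)   # do g and g_inv meet at x?
    on_line = g(x) == x                                        # is the point on y = x?
    print(x, same_height, on_line)
```

If the sketch is right, the two graphs meet at all three values of x, but only one of those meeting points lies on the line y = x.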
117
Exercises 467(iii) (N2011/II/3) and 475(iii) (N2008-II-4). (Indeed, this subchapter was inspired by
those two questions.) One set of published TYS answers baldly claims, “As y = f (x) is a reflection of
y = f −1 (x) about the line y = x, the point of intersection of the two curves must meet on y = x.”
118
If you like big words, a function that’s its own inverse is called an involution.
Two Results That Come Kinda Close (Optional)
Our last two examples show that the following statement is false:
The invertible functions examined in our last four examples were all continuous on an
interval. And so sure enough, in each case, the function intersected its inverse on the line
y = x at least once.
The second result says that by adding the peculiar assumption that f and f −1 intersect at
an even number of points, we can obtain the stronger result that all of the intersection
points are on the line y = x.
Fact 268 is illustrated by Example 268, but not by the other three of the last four examples
examined, because the hypotheses of Fact 268 do not apply in those three examples (can
you explain why?).119
119
In each of the three examples, the function intersects its inverse once, infinitely many times, and thrice
(respectively). And so in each case, we violate the hypothesis in Fact 268 that they intersect an even
number of times.
14.4. Domain Restriction to Create an Invertible Function
We saw that some functions were not invertible. And so, for such functions, the inverse
function simply does not exist.
Nonetheless, it turns out we can always120 restrict the domain of a non-invertible function
to create a brand new function that is invertible.
[Figure: two panels, one showing g and g⁻¹ and the other showing h and h⁻¹, each together with the line y = x.]
120
We can always simply restrict the domain to be the empty set! The function thus formed would have
an empty domain and an empty range. It would thus be vacuously true that this function is invertible
(because no element in its range is hit more than once).
As the above example illustrates, there is usually more than one way to restrict the domain
of a non-invertible function to create an invertible function.
[Figure: two panels, one showing i and i⁻¹ and the other showing j and j⁻¹, each together with the line y = x.]
Exercise 98. Define f ∶ R ∖ {1} → R by f (x) = 1/(x − 1)². (Answer on p. 1412.)
(a) Prove that f is not invertible.
Let g be the function created by restricting the domain of f to (1, ∞).
(b) Prove that g is invertible, then write down the inverse function g −1 .
Let h be the function created by restricting the domain of f to (−∞, 1).
(c) Prove that h is invertible, then write down the inverse function h−1 .
Proof. If f is strictly monotonic, then for any distinct x1 , x2 ∈ Domain(f ), we have f (x1 ) ≠
f (x2 ). And so, by Definition 57, f is invertible.
However, if we make the additional assumptions that the function is continuous and has
an interval as its domain, then the converse of Fact 26 is true:
Fact 26 tells us that any strictly monotonic function has an inverse. The following result
tells us a little more:
(f g) (x) = f (g (x)) = f (x + 1) = 2 (x + 1) = 2x + 2.
So for example: (f g) (0) = 2 × 0 + 2 = 2.
Note also that for the composite function f g, we first apply the function g, then apply
the function f . So for example, to compute, say f g(7), we first compute g(7) = 7 + 1 = 8,
then compute f (g(7)) = f (8) = 2 ⋅ 8 = 16.
Conversely, for the composite function gf , we first apply the function f , then apply the
function g. So for example, to compute, say gf (7), we first compute f (7) = 2 ⋅ 7 = 14,
then compute g (f (7)) = g (14) = 14 + 1 = 15.
(A common mistake is to instinctively go from left to right. So with f g, one might
mistakenly apply f before g. And with gf , one might mistakenly apply g before f .)
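The order of application is easy to see in code. Below is a small sketch in which the helper name compose is made up for illustration, and f and g are the two functions used in the computations above.

```python
def compose(f, g):
    """Return the composite function fg: apply g first, then f."""
    return lambda x: f(g(x))

f = lambda x: 2 * x
g = lambda x: x + 1

fg = compose(f, g)   # first g, then f
gf = compose(g, f)   # first f, then g

print(fg(0), fg(7))
print(gf(7))
```

Notice that in f(g(x)), the innermost function g is the one that gets to x first, even though f is written on the left.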
Again, note that h ⋅ i = i ⋅ h, but hi ≠ ih. The composite function ih ∶ R → R is defined by:
(ih) (x) = i (h (x)) = i (x² − 1) = x²/2 − 1/2.
So for example: (ih) (0) = 0²/2 − 1/2 = −1/2 ≠ (hi) (0) = −1.
Definition 58. Let f and g be functions with Range(g) ⊆ Domain(f ). Then the com-
posite function f g is defined to have:
Domain: Domain(g);
Codomain: Codomain(f ); and
Mapping rule: (f g) (x) = f (g (x)).
Remark 28. The composite function f g may also be written as f ○ g (read aloud as “f
circle g”). We use this alternative piece of notation whenever we want to be extra careful
about distinguishing f ○ g from f ⋅ g.
The condition Range(g) ⊆ Domain(f ) is important. It ensures that for any x ∈ Domain(g),
we have g (x) ∈ Domain(f ) and hence that f (g (x)) is well-defined.
If this condition fails, then we simply say that the composite function f g does not exist
or is undefined:
would be undefined. Hence, the composite function f g simply does not exist.
The rest of this example is explicitly excluded from your syllabus.121 Nonetheless, spending
a minute or two reading it will earn you a better understanding of composite functions.
Recall that given a non-invertible function, we can restrict its domain to create a brand
new function that is invertible.
Here we can similarly restrict the domain of g to create a brand new function h, so that
the composite function f h exists.
For example, we can restrict the domain of g to (−1, ∞) and get the brand new function
h ∶ (−1, ∞) → R defined by h (x) = x + 1. Now Range(h) = R+ is a subset of Domain(f ) =
R+ . We thus have the composite function f h ∶ (−1, ∞) → R defined by:
(f h) (x) = f (h (x)) = f (x + 1) = ln (x + 1) .
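A computer runs into the very same problem when Range(g) is not a subset of Domain(f ). In the sketch below, we take f to be the natural logarithm (as the mapping rule of f h above suggests) and g (x) = x + 1. Python’s math.log raises a ValueError when given a number outside its domain.

```python
import math

f = math.log           # natural logarithm, defined only for positive numbers
g = lambda x: x + 1    # domain R
h = g                  # same rule as g, but we agree to feed it only x > -1

try:
    f(g(-5))           # g(-5) = -4 is not in Domain(f)
except ValueError as err:
    print("f(g(-5)) is undefined:", err)

print(f(h(0)))         # h(0) = 1 is in Domain(f)
```

Python itself does not keep track of domains. The restriction to (−1, ∞) is something we must remember ourselves, which is exactly what defining the new function h does for us on paper.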
Example 283. Define i ∶ R+0 → R by i (x) = √x and j ∶ R → R by j (x) = x − 3.
Observe Range(j) = R is not a subset of Domain(i) = R+0 . And so for example,
would be undefined. Hence, the composite function ij simply does not exist.
Again, the rest of this example is explicitly excluded from your syllabus.
Again, we can restrict the domain of j to create a brand new function k, so that the
composite function ik exists.
For example, we can restrict the domain of j to [3, ∞) and get the brand new function
k ∶ [3, ∞) → R defined by k (x) = x − 3. Now Range(k) = R+0 is a subset of Domain(i) = R+0 .
We thus have the composite function ik ∶ [3, ∞) → R defined by:
(ik) (x) = i (k (x)) = i (x − 3) = √(x − 3).
121
See p. 5 of your syllabus.
We can use a single function to build a composite function.
The composite function f f is usually simply written as f². So, the line above can also
be written as:
f³ (x) = 8x.
So for example: f³ (1) = 8 and f³ (3) = 24.
Remark 29. The Singapore-Cambridge A-Level exams and syllabus use f² to mean the
composite function f f , f³ to mean f f², f⁴ to mean f f³, etc.
Later on in Part V (Calculus), we’ll use a similar-looking but totally different piece of
notation. We’ll use f ′, f ′′, f ′′′, f⁽⁴⁾, etc. to denote “the first derivative of”, the “second
derivative of”, the “third derivative of”, the “fourth derivative of”, etc. And in general,
f⁽ⁿ⁾ means “the nth derivative of”.
Take care not to confuse the composite function fⁿ with the nth derivative f⁽ⁿ⁾.
g³ (x) = g (1/2 + x/4) = 1 − (1/2 + x/4)/2 = 3/4 − x/8.
So for example: g³ (1) = 5/8 and g³ (3) = 3/8.
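Repeated composition is just a loop. The sketch below assumes the rule g (x) = 1 − x/2, which is inferred from the displayed computation of g³ and is not restated in the text here. The helper name iterate is also made up. Exact fractions are used so that no rounding creeps in.

```python
from fractions import Fraction

def g(x):
    return 1 - x / 2      # assumed rule, inferred from the computation of g³ above

def iterate(func, n, x):
    """Apply func to x a total of n times, i.e. evaluate the composite func^n at x."""
    for _ in range(n):
        x = func(x)
    return x

print(iterate(g, 3, Fraction(1)), iterate(g, 3, Fraction(3)))
print(float(iterate(g, 50, Fraction(1))))   # many applications of g
```

Try changing the starting value in the last line. You should find that the answer barely changes.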
(a) Let n = 4.
(i) Explain whether the composite function g n exists. If it does exist:
(ii) Write down the function g n ; and
(iii) Evaluate g n (1) and g n (3).
(b) Repeat part (a), but now let n = 5.
(c) Repeat part (a), but now let n = 6.
Part (d) is a little harder, but is also the sort of curveball that the A-Level examiners
like to throw in to make you squirm:
(d) Let n be a positive integer. Write down the function g n . (You need not prove that
g n exists nor that the function g n you’ve written down is the correct one.) Hence,
prove that for any x ∈ R, we have:
2
lim g n (x) = . (Answer on p. 1413.)
n→∞ 3
Big hint for part (d): The Jacobsthal numbers are 1, 1, 3, 5, 11, 21, 43, . . . , where each
new number equals the sum of the number before it and twice the number before that.
So for example, 11 = 5 + 2 × 3 and 21 = 11 + 2 × 5. You are told that the nth Jacobsthal
number is given by
(2ⁿ − (−1)ⁿ) /3.
16.1. y = f (x) + a
The graph of y = f (x) + a is simply the graph of f translated (or shifted) upwards by a
units. (Note that if a < 0, then we have a negative upward shift, i.e. a downward shift.)
Example 286. The black graph below is the function f ∶ R → R defined by f (x) = x³ − 1.
The graph of y = f (x) + 2 = x³ + 1 is simply that of f translated upwards by 2 units.
[Figure: graphs of f , y = f (x) + 2, and y = f (x) − 3, with y-intercepts (0, −1), (0, 1), and (0, −4), respectively.]
122
This assertion is more formally stated and proven as Fact 196 in the Appendices.
Example 287. The black graph below is the function g ∶ R ∖ {0} → R defined by g (x) =
1/x.
The graph of y = g (x) + 1 = 1/x + 1 is simply that of g translated upwards by 1 unit.
Since g has horizontal asymptote y = 0 (the x-axis), y = g (x)+1 has horizontal asymptote
y = 1.
[Figure: graphs of g and y = g (x) + 1. The horizontal asymptote of y = g (x) + 1 is y = 1. x = 0 is a vertical asymptote for both g and y = g (x) + 1.]
Note that with an upward or downward shift, any vertical asymptotes remain unchanged,
because a vertical line translated upwards or downwards is simply the same vertical line.
And so here, both g and y = g (x) + 1 have the vertical asymptote x = 0 (the y-axis).
The two lines of symmetry for g are y = x and y = −x. Thus, the two lines of symmetry
for y = g (x) + 1 are simply the same, but translated upwards by 1 unit — y = x + 1 and
y = −x + 1.
Example 288. The black graph below is the function f ∶ R → R defined by f (x) = x³ − 1.
Also graphed are these two equations:
y = f (x + 2) = (x + 2)³ − 1 = x³ + 6x² + 12x + 7,
y = f (x − 1) = (x − 1)³ − 1 = x³ − 3x² + 3x − 2.
The first equation is simply f translated leftwards by 2 units. The second is simply f
translated rightwards by 1 unit.
[Figure: graphs of y = f (x + 2), f , and y = f (x − 1).]
[Figure: graphs of g and y = g (x + 1), with the lines y = x, y = −x, y = x + 1, and y = −x − 1, and the vertical line x = −1. y = 0 is a horizontal asymptote for both g and y = g (x + 1).]
Note that with a leftward or rightward shift, any horizontal asymptotes remain unchanged,
because a horizontal line translated leftwards or rightwards is simply the same horizontal
line. And so here, both g and y = g (x + 1) have the horizontal asymptote y = 0 (the
x-axis).
The two lines of symmetry for g are y = x and y = −x. Thus, the two lines of symmetry
for y = g (x + 1) are simply the same, but translated leftwards by 1 unit — y = x + 1 and
y = − (x + 1) = −x − 1.
Example 290. The black graph below is the function f ∶ R → R defined by f (x) = x³ − 1.
The graph of y = 2f (x) = 2x³ − 2 is simply that of f stretched vertically (outwards from
the x-axis) by a factor of 2.
[Figure: graphs of f , y = 2f (x), and y = (1/2) f (x), with y-intercepts (0, −1), (0, −2), and (0, −1/2), respectively, and common x-intercept (1, 0).]
The graph of y = 0.5f (x) = 0.5x³ − 0.5 is simply that of f compressed vertically (inwards
towards the x-axis) by a factor of 2.
When a graph is stretched vertically, any y-intercepts, lines of symmetry, turning
points, and asymptotes stretch along with it.124
In this example, f has y-intercept (0, −1). And so, y = 2f (x) and y = 0.5f (x) simply
have y-intercepts (0, −2) and (0, −0.5).
Under a vertical stretch, any x-intercepts remain unchanged. Here, all three graphs have
the same x-intercept (1, 0).
124
This assertion is more formally stated and proven as Fact 196 in the Appendices.
To get y = −af (x) (where a > 0), first reflect f in the x-axis to get y = −f (x), then stretch
vertically by a factor of a.
Example 291. The black graph below is the function f ∶ R → R defined by f (x) = x³ − 1.
Suppose we want to graph y = −2f (x).
To do so, first reflect f in the x-axis to get y = −f (x) = −x³ + 1.
[Figure: graphs of f , y = −f (x), and y = −2f (x), with y-intercepts (0, −1), (0, 1), and (0, 2), respectively, and common x-intercept (1, 0).]
Example 292. The black graph below is the function f ∶ R → R defined by f (x) = x³ − 1.
The graph of y = f (2x) = (2x)³ − 1 = 8x³ − 1 is simply that of f compressed horizontally
(inwards towards the y-axis) by a factor of 2.
[Figure: graphs of y = f (2x), f , and y = f (x/2), with x-intercepts (1/2, 0), (1, 0), and (2, 0), respectively, and common y-intercept (0, −1).]
125
This assertion is more formally stated and proven as Fact 196 in the Appendices.
To get y = f (−ax) (where a > 0), first reflect f in the y-axis to get y = f (−x), then compress
horizontally by a factor of a.
Example 293. The black graph below is the function f ∶ R → R defined by f (x) = x³ − 1.
Say we want to graph y = f (−2x).
To do so, first reflect f in the y-axis to get y = f (−x) = (−x)³ − 1 = −x³ − 1.
[Figure: graphs of f , y = f (−x), and y = f (−2x), with x-intercepts (1, 0), (−1, 0), and (−1/2, 0), respectively, and common y-intercept (0, −1).]
Example 294. Some unknown function f is graphed below. You are told only that
points A, B, and C are (approximately) (−1.4, 0), (0.8, −1.1), and (1.4, 0).
Armed only with this knowledge, we will try to graph four equations: y = 2f (x) + 1,
y = 2f (x + 1), y = f (2x) + 1, and y = f (2x + 1).
[Figure: graph of f , with the points A ≈ (−1.4, 0), B ≈ (0.8, −1.1), and C ≈ (1.4, 0) marked.]
[Figure: graphs of y = 2f (x) and y = 2f (x) + 1; of y = f (x + 1) and y = 2f (x + 1); of y = f (2x) and y = f (2x) + 1; and of f , y = f (x + 1), and y = f (2x + 1).]
Summary:
Fact 27. Let a, b > 0 and c, d ∈ R. Let f be a nice function. Then to get the graph of
y = af (bx + c) + d, follow these steps:
1. Translate leftwards by c units, to get y = f (x + c).
2. Compress horizontally (inwards towards y-axis) by a factor of b, to get y = f (bx + c).
3. Stretch vertically (outwards from x-axis) by a factor of a, to get y = af (bx + c).
4. Translate upwards by d units, to get y = af (bx + c) + d.
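The four steps of Fact 27 can be followed point by point. In the sketch below, the helper name transform_point is made up, f is the cubic used in the examples of this chapter, and the values of a, b, c, and d are chosen arbitrarily for illustration. Each step moves a point (x, y) on the graph of f , and at the end we check that the moved point really lies on y = af (bx + c) + d.

```python
def transform_point(x, y, a, b, c, d):
    """Image of a point (x, y) on y = f(x) under the four steps of Fact 27."""
    x = x - c    # 1. translate leftwards by c units
    x = x / b    # 2. compress horizontally by a factor of b
    y = a * y    # 3. stretch vertically by a factor of a
    y = y + d    # 4. translate upwards by d units
    return x, y

f = lambda x: x**3 - 1
a, b, c, d = 2, 2, 1, 3      # hypothetical values, for illustration only

for x in (-1.0, 0.0, 1.0, 2.5):
    new_x, new_y = transform_point(x, f(x), a, b, c, d)
    # The moved point should lie on the graph of y = a f(bx + c) + d.
    assert abs(new_y - (a * f(b * new_x + c) + d)) < 1e-9

print(transform_point(1, f(1), a, b, c, d))
```

Note the order of the two horizontal steps: the translation by c comes before the compression by b. Swapping them would give the graph of y = af (b (x + c)) + d instead.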
Exercise 102. Use f from the last example to graph these equations. (Hint: You can
make use of what was already shown in the above example.)
(a) y = −2f (x) − 1. (b) y = 2f (−x) + 1. (Answers on p. 1416.)
(c) y = −2f (x + 1). (d) y = 2f (−x + 1). (Answers on p. 1417.)
(e) y = −f (2x) + 1. (f) y = f (−2x) + 1. (Answers on p. 1418.)
(g) y = −f (2x + 1). (h) y = f (−2x + 1). (Answers on p. 1419.)
Example 295. The black graph below is the function f ∶ R → R defined by f (x) = x³ − 1.
y = ∣f (x)∣ is the red dotted graph.
[Figure: graphs of f and y = ∣f (x)∣. Where f < 0, the two graphs are reflections of each other in the x-axis. Where f ≥ 0, the two graphs coincide.]
Example 296. The black graph below is the function g ∶ R ∖ {0} → R defined by g (x) =
1/x.
y = ∣g (x)∣ is the red dotted graph.
[Figure: graphs of g and y = ∣g (x)∣. Where g < 0, they’re reflections of each other in the x-axis.]
Example 297. The black graph below is the function f ∶ R → R defined by f (x) = x³ − 1.
y = f (∣x∣) is the red dotted graph.
Example 298. The black graph below is the function g ∶ R ∖ {0} → R defined by g (x) =
1/x.
y = g (∣x∣) is the red dotted graph.
[Figure: graphs of g and y = g (∣x∣). 1. The right portions coincide. 2. To get the left portion, simply reflect the right portion in the y-axis.]
Example 299. The black graph below is the function f ∶ R → R defined by f (x) = x³ − 1.
To graph 1/f , we make use of the above three rules:
1. Small becomes big.
In particular: Where f → 0− , we have 1/f → −∞; and where f → 0+ , we have 1/f → ∞.
So, x = 1 is a vertical asymptote of 1/f .
2. Big becomes small.
In particular: Where f → −∞, we have 1/f → 0− ; and where f → ∞, we have 1/f → 0+ .
So, y = 0 is a horizontal asymptote of 1/f .
3. Intersect at f (x) = ±1.
So, the graphs of f and 1/f intersect at (0, −1) and (∛2, 1).
[Figure: graphs of f and 1/f , with intersection points (0, −1) and (∛2, 1), vertical asymptote x = 1, and horizontal asymptote y = 0.]
[Figure: graphs of g and 1/g, with the points (0, 2) and (0, 0.5) marked and horizontal asymptote y = 0.]
[Figure: graph with the points A ≈ (−1.4, 0), B ≈ (0.8, −1.1), and C ≈ (1.4, 0) marked.]
Exercise 104. Describe a sequence of transformations that would transform the graph
of y = 1/x onto y = 3 − 1/(5x − 2). (Answer on p. 1421.)
ln t is the area under y = 1/x, between x = 1 and x = t.
[Figure: graph of y = 1/x. ln t is defined as the shaded area between x = 1 and x = t.]
The above definition is considered (slightly) informal because the mapping rule is described
using geometry. After we’ve learnt about the definite integral in Part V (Calculus), we will
give a formal definition of the natural logarithm function (see Definition 184).
Remark 30. In Singapore and some other Britishy bits of the world, ln is usually read
aloud as lawn. In the US, it’s usually read out loud as “el en”.
The notation ln was probably first published in Steinhauser (1875, p. 277), where it stood
for the Latin Logarithmus naturalis.
[Figures: the area under y = 1/x between x = 1 and x = t, for five values of t:
ln 4 = 1.386 . . .
ln 5 = 1.609 . . .
ln 1 = 0 because there’s no area!
ln 0.9 = −0.105 . . .
ln 0.5 = −0.693 . . .]
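We haven’t learnt any calculus yet, but a computer can estimate these areas by slicing the region into many thin strips and adding up the strips’ areas. The sketch below does this with the midpoint rule, a standard numerical technique that is not part of this textbook’s definition. The helper name ln_as_area and the number of strips are choices made here. Python’s built-in math.log is printed alongside for comparison.

```python
import math

def ln_as_area(t, strips=100_000):
    """Approximate ln t as the signed area under y = 1/x from x = 1 to x = t,
    using the midpoint rule with the given number of strips."""
    width = (t - 1) / strips          # negative when t < 1, so the area counts as negative
    midpoints = (1 + (k + 0.5) * width for k in range(strips))
    return sum(width / x for x in midpoints)   # each strip has height 1/x

for t in (4, 5, 1, 0.9, 0.5):
    print(t, round(ln_as_area(t), 3), round(math.log(t), 3))
```

When t < 1, we are measuring from right to left, which is why the strips have negative width and the total comes out negative.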
[Figure: graph of ln, with x-intercept 1.]
As x → 0+ , ln x → −∞. Thus, ln has the vertical asymptote x = 0 (also the y-axis).
Definition 59. The exponential function, denoted exp, is the inverse of the natural
logarithm function.
Since Range (ln) = R, the above Definition says that exp has:
Domain: R,
Codomain: R+ ,
Mapping rule: ln y = x ⇐⇒ exp x = y.
Example 306. Since exp is the inverse of ln, we know from the earlier examples that:
[Figure: graphs of ln and exp.]
Definition 60. Euler’s number, denoted e, is defined as the number that satisfies:
ln e = 1 or equivalently, e = exp 1.
[Figure: graph of y = 1/x. We define e so that the area under the graph between x = 1 and x = e equals 1.]
eˣ = exp x.
The above Fact justifies why we can and will often write eˣ in place of exp x.
Remark 32. Note that while interesting, Euler’s number e is by itself not particularly
important. What’s really important are the natural logarithm and exponential functions.127
For this reason, we will in this textbook often write exp x rather than eˣ. This is to
remind you that this expression is not simply a number e raised to the power of x, but is
the value of the exponential function at x.
126
It was Leonhard Euler (1707–1783) himself who first used the letter e to denote this number. Presumably
he did not do this to honour himself. Calling e Euler’s number is simply an honour conferred by
posterity. (Confusingly, there is also another number called Euler’s constant γ ≈ 0.577 215 664 . . . But
fortunately we will not encounter γ in A-Level maths.)
127
Indeed, one mathematician Walter Rudin (1966 [1987]) goes so far as to call the exponential function
“the most important function in mathematics”.
We also have these two lovely results:
Theorem 1. e = 1/0! + 1/1! + 1/2! + 1/3! + . . .
Proof. Happily, we’ll learn to prove this in Part V (Calculus); see p. 769.
Theorem 2. e = lim (n → ∞) (1 + 1/n)ⁿ.
Proof. Happily, we’ll learn to prove this in Part V (Calculus); see p. 770.
Although we can’t prove either of the above theorems right now, we can nonetheless
numerically “verify” that they are at least plausible:
To “verify” Theorem 1, define: f ∶ Z+0 → R by f (n) = 1/0! + ⋅ ⋅ ⋅ + 1/n!.
Then write:
f (0) = 1/0! = 1/1 = 1.
f (1) = f (0) + 1/1! = 1 + 1/1 = 2.
f (2) = f (1) + 1/2! = 2 + 1/2 = 2.5.
f (3) = f (2) + 1/3! = 2.5 + 1/6 = 2.666 . . .
f (4) = f (3) + 1/4! = 2.666 . . . + 1/24 = 2.708 . . .
f (5) = f (4) + 1/5! = 2.708 . . . + 1/120 = 2.716 . . .
We see that f rapidly converges towards e = 2.718 281 828 459 . . . By f (6), we have e correct
to three decimal places. Lovely.
Similarly, to “verify” Theorem 2, define: g ∶ Z+ → R by g (n) = (1 + 1/n)ⁿ.
Then write:
g (1) = (1 + 1/1)¹ = 2.
g (2) = (1 + 1/2)² = 2.25.
g (3) = (1 + 1/3)³ = 2.370 . . .
g (4) = (1 + 1/4)⁴ = 2.441 . . .
g (5) = (1 + 1/5)⁵ = 2.488 . . .
g (10) = (1 + 1/10)¹⁰ = 2.593 742 . . .
g (100) = (1 + 1/100)¹⁰⁰ = 2.704 813 . . .
g (1 000) = (1 + 1/1 000)¹ ⁰⁰⁰ = 2.716 923 . . .
g (10⁶) = (1 + 1/10⁶)^(10⁶) = 2.718 280 . . .
g (10⁹) = (1 + 1/10⁹)^(10⁹) = 2.718 281 . . .
We see that g also converges towards e = 2.718 281 828 459 . . . , though much less rapidly
than f . Even g (100) gets e correct to only one decimal place. And even g (10⁶) gets e
correct to only five decimal places.
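These numerical “verifications” are easy to reproduce. Here is a short sketch, where f and g are the two functions just defined and the values of n printed are chosen arbitrarily.

```python
import math

def f(n):
    """Partial sum 1/0! + 1/1! + ... + 1/n! (Theorem 1)."""
    return sum(1 / math.factorial(k) for k in range(n + 1))

def g(n):
    """The expression (1 + 1/n)^n (Theorem 2)."""
    return (1 + 1 / n) ** n

for n in (1, 2, 3, 4, 5, 6):
    print(n, f(n), g(n))

# How far from e are f(6) and g(100)?
print(abs(math.e - f(6)), abs(math.e - g(100)))
```

For very large n, computing (1 + 1/n)ⁿ on a computer becomes sensitive to rounding, so don’t read too much into the last few digits printed.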
18. O-Level Review: The Derivative
[Figure: graph of y = 2x + 1, with ∆x = 1 and ∆y = 2 marked.]
[Figure: graph of y = −x − 1, with ∆x = 1 and ∆y = −1 marked.]
Remark 33. Slope is a perfectly good synonym for gradient. However, your A-Level
syllabus and exams do not use the word slope. And so we’ll stick to using only the word
gradient.
[Figure: the graph of $y = x^2 + 1$, with tangent lines at (−2, 5) (gradient −4), (0, 1) (gradient 0), and (3, 10) (gradient 6).]
The blue line is tangent to the graph at the point (0, 1). It can be shown that this line’s
gradient is 0. Therefore, the graph’s gradient at this point is 0.
The green line is tangent to the graph at the point (3, 10). It can be shown that this
line’s gradient is 6. Therefore, the graph’s gradient at this point is 6.
[Figure: the graph of $y = x^3 + 1$, with tangent lines at (−2, −7) (gradient 12), (0, 1) (gradient 0), and (3, 28) (gradient 27).]
The blue line is tangent to the graph at the point (0, 1). It can be shown that this line’s
gradient is 0. Therefore, the graph’s gradient at this point is 0.
The green line is tangent to the graph at the point (3, 28). It can be shown that this
line’s gradient is 27. Therefore, the graph’s gradient at this point is 27.
Example 311. Define f ∶ R → R by f (x) = 5x. The graph of f is simply the graph of the
equation y = 5x. We don’t yet know how, but it’s possible to show that at every point,
this graph’s gradient is equal to 5:
$$\frac{dy}{dx} = f'(x) = 5 \text{ for all } x.$$
So, for example, at x = −2, 0, 3, and indeed any other point, the graph’s gradient is 5.
Similarly, for the graph of $y = x^2 + 1$, at the points x = −2, 0, and 3, the graph’s gradient is −4, 0, and 6, respectively.
Formally, $\left.\frac{dy}{dx}\right|_{x=a} = g'(a)$ is read aloud as “the derivative of y with respect to x, evaluated
at a” or “the derivative of g at a”.
A little less formally, we can simply read it as “the gradient at a”.
That is, for the graph of $y = x^3 + 1$, at the points x = −2, 0, and 3, the graph’s gradient is 12, 0, and 27, respectively.
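We can't yet compute these gradients exactly, but we can estimate them numerically. The sketch below uses a symmetric difference quotient with a small step h. The helper name gradient and the step size are my own choices for illustration, and the results are only approximations.

```python
def gradient(func, a, h=1e-6):
    """Estimate the gradient of func at x = a using a symmetric difference quotient."""
    return (func(a + h) - func(a - h)) / (2 * h)

def square_plus_one(x):
    return x**2 + 1

def cube_plus_one(x):
    return x**3 + 1

# Estimated gradients of y = x^2 + 1 and y = x^3 + 1 at x = -2, 0, 3.
for a in (-2, 0, 3):
    print(a, round(gradient(square_plus_one, a), 4), round(gradient(cube_plus_one, a), 4))
```

The estimates should sit very close to the gradients −4, 0, 6 and 12, 0, 27 quoted above.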
Example 314. A particle P travels along a line. Its eastward displacement x (metres)
from the point O at time t (seconds) is given by:
[Figure: P’s position along the line, and graphs of displacement x (m) and velocity v (m s⁻¹) against time t (s).]
At each instant of time, v tells us what P ’s velocity is — in other words, the rate at
which x is changing per “infinitesimally small” unit of time t. If v > 0, then P is travelling
eastwards. And if v < 0, then P is travelling westwards.
From the above graph, we can tell that:
• During t ∈ [0, 1), P travels eastwards.
• At t = 1, P stops.
• During t ∈ (1, 3), P travels westwards.
• At t = 3, P stops.
• During t > 3, P travels eastwards.
(Example continues on the next page ...)
a (t) = 6t − 12.
[Figure: graph of acceleration a (m s⁻²) against time t (s).]
At each instant of time, a tells us what P ’s acceleration is — in other words, the rate
at which v is changing per “infinitesimally small” unit of time t. If a > 0, then P ’s
eastwards velocity is increasing (or equivalently, its westwards velocity is decreasing).
And if a < 0, then P ’s eastwards velocity is decreasing (or equivalently, its westwards
velocity is increasing).
From the above graph, we can tell that:
• During t ∈ [0, 2), a < 0 and so P ’s eastwards velocity is decreasing (or equivalently, its westwards
velocity is increasing).
• During t > 2, a > 0 and so P ’s eastwards velocity is increasing (or equivalently, its westwards velocity is decreasing).
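As a quick sanity check on the sign of the acceleration, here is a small Python sketch. It uses only the formula a(t) = 6t − 12 given above; the helper describe and its wording are my own, purely for illustration.

```python
def a(t):
    """Acceleration of P at time t, using the formula a(t) = 6t - 12 from the example."""
    return 6 * t - 12

def describe(t):
    """Say what the sign of a(t) means for P's eastwards velocity."""
    if a(t) > 0:
        return "eastwards velocity increasing"
    if a(t) < 0:
        return "eastwards velocity decreasing"
    return "eastwards velocity momentarily not changing"

for t in (0, 1, 2, 3, 4):
    print(f"t = {t}: a = {a(t)}, {describe(t)}")
```

The sign of a flips at t = 2, which is exactly where the graph of a crosses the horizontal axis.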
x
f is differentiable everywhere
because its graph is
“smooth” and has no “kinks”.
Example 316. The sine function sin is both continuous everywhere and differentiable
everywhere.
sin is differentiable
everywhere because
its graph is “smooth”
and has no “kinks”.
128
We will formally define the concept of differentiability in Part V (Calculus).
Example 317. The exponential function exp is both continuous everywhere and differ-
entiable everywhere.
The above examples may leave you wondering, “So aren’t continuity and differentiability
just the same thing?” Well, it turns out that every differentiable function must also
be continuous.129 (This is exactly what we meant when we said that differentiability is a
stronger condition than continuity.)
However, the converse is false — not every continuous function is differentiable; that is,
continuity does not imply differentiability. Differentiable functions are thus a subset of
continuous functions. Beautiful Venn diagram drawn by an artistic genius:
[Venn diagram: the differentiable functions lie inside the continuous functions, which lie inside the set of all functions.]
Let’s now look at some examples of functions that are continuous but not differentiable.
The classic example is the absolute value function:
129
This assertion is formally stated and proven as Theorem 19 in the Appendices.
Example 318. The absolute value function ∣⋅∣ is continuous everywhere, because you can
draw its entire graph without lifting your pencil.
However, it is not differentiable everywhere, because it is not “smooth” everywhere. In
particular, it has a “kink” at x = 0.
We can say though that the absolute value function is differentiable everywhere except
at x = 0. Or equivalently, it is differentiable on R ∖ {0}.
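One informal way to "see" the kink numerically is to compare the gradient just to the left of a point with the gradient just to the right. The sketch below does this with one-sided difference quotients; the helper name one_sided_quotients and the step size h are my own illustrative choices, and this is not how differentiability is formally defined.

```python
def one_sided_quotients(func, a, h=1e-6):
    """Return (left, right) difference quotients of func at x = a."""
    left = (func(a) - func(a - h)) / h
    right = (func(a + h) - func(a)) / h
    return left, right

# At the kink x = 0, the two one-sided estimates disagree.
print(one_sided_quotients(abs, 0))
# Away from the kink (here x = 2), they agree closely.
print(one_sided_quotients(abs, 2))
```

At x = 0 the left and right estimates have opposite signs, whereas at x = 2 they are almost identical.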
The function g is continuous everywhere, because you can draw its entire graph without
lifting your pencil.
However, it is not differentiable everywhere because, like ∣⋅∣, it has a “kink” at x = 0.
Nonetheless, again, we can say that g is differentiable everywhere except at x = 0. Or
equivalently, g is differentiable on R ∖ {0}.
g is not differentiable
everywhere because it
has a kink at x = 0.
g is
differentiable
on R ∖ {0}.
Example 320. The tangent function tan is not continuous everywhere. Thus, it is not
differentiable everywhere either.
[Figure: graph of tan. tan is not differentiable everywhere, but it is differentiable on $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$.]
It is however both (i) continuous and (ii) differentiable on the interval $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$. This is
because on this interval, its graph (i) can be drawn without lifting your pencil; and (ii) is “smooth”
and has no “kinks”.
Indeed, tan is both continuous and differentiable on every interval:
$$\left(\left(k - \frac{1}{2}\right)\pi, \left(k + \frac{1}{2}\right)\pi\right), \text{ for } k \in \mathbb{Z}.$$
Happily, most functions we’ll encounter in A-Level maths will be both continuous and
differentiable. There are however exceptions, as we’ve seen here and in Ch. 11.
Exercise 105. Graphed below are three functions f , g, and h. State if each is continuous
everywhere and/or differentiable everywhere. If not, state the set of points on which each
function is continuous or differentiable. (Answer on p. 1422.)
[Figure: graphs of the three functions f, g, and h.]
You’re probably wondering where the Chain Rule is. Don’t worry, it is the topic of the
next subchapter.
130
For a formal statement of these Rules of Differentiation, see Proposition 20 (Appendices).
Example 321. $\frac{d}{dx} 5 = 0$. (Constant Rule)
Example 322. $\frac{d}{dx} 500 = 0$. (Constant Rule)
Example 323. $\frac{d}{dx} (-200) = 0$. (Constant Rule)
Example 324. $\frac{d}{dx} (5x^3) = 5 \left(\frac{d}{dx} x^3\right) = 5 (3x^2) = 15x^2$. (Constant Factor Rule)
Example 325. $\frac{d}{dx} (500x^{0.3}) = 500 \left(\frac{d}{dx} x^{0.3}\right) = 500 (0.3x^{-0.7}) = 150x^{-0.7}$. (CFR)
Example 326. $\frac{d}{dx} (-200x^{-1}) = -200 \left(\frac{d}{dx} x^{-1}\right) = -200 (-x^{-2}) = 200x^{-2}$. (CFR)
Example 327. $\frac{d}{dx} x^3 = 3x^2$. (Power Rule)
Example 328. $\frac{d}{dx} x^{0.3} = 0.3x^{-0.7}$. (Power Rule)
Example 329. $\frac{d}{dx} x^{-1} = -x^{-2}$. (Power Rule)
$\frac{d}{dx} (x^3 + 500x^{0.3}) = \frac{d}{dx} x^3 + \frac{d}{dx} (500x^{0.3}) = 3x^2 + 150x^{-0.7}$. (Sum and Difference Rules)
$\frac{d}{dx} (x^3 - 500x^{0.3}) = \frac{d}{dx} x^3 - \frac{d}{dx} (500x^{0.3}) = 3x^2 - 150x^{-0.7}$. (Sum and Difference Rules)
$\frac{d}{dx} (x^3 \sin x) = 3x^2 \sin x + x^3 \cos x$. (Product Rule)
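$$\frac{d}{dx} \frac{e^x e^x}{\ln x} = \frac{\ln x \left(e^x e^x + e^x e^x\right) - \left(e^x e^x\right) \frac{1}{x}}{(\ln x)^2} = \frac{2e^{2x} \ln x - \frac{1}{x} e^{2x}}{(\ln x)^2} = \frac{2e^{2x}}{\ln x} \left(1 - \frac{1}{2x \ln x}\right).$$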
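It can be reassuring to check a few of these derivatives numerically. The sketch below compares a symmetric difference quotient against the claimed derivative at a single point. The helper derivative, the test point x0 = 2, and the selection of examples are my own choices; agreement at one point is evidence, not a proof.

```python
import math

def derivative(func, x, h=1e-6):
    """Numerically estimate the derivative of func at x (symmetric difference quotient)."""
    return (func(x + h) - func(x - h)) / (2 * h)

# Each entry: (description, function, claimed derivative).
checks = [
    ("5x^3", lambda x: 5 * x**3, lambda x: 15 * x**2),
    ("500x^0.3", lambda x: 500 * x**0.3, lambda x: 150 * x**-0.7),
    ("-200x^-1", lambda x: -200 / x, lambda x: 200 * x**-2),
    ("x^3 sin x", lambda x: x**3 * math.sin(x),
     lambda x: 3 * x**2 * math.sin(x) + x**3 * math.cos(x)),
    ("e^x e^x / ln x", lambda x: math.exp(x) * math.exp(x) / math.log(x),
     lambda x: 2 * math.exp(2 * x) / math.log(x) * (1 - 1 / (2 * x * math.log(x)))),
]

x0 = 2.0  # an arbitrary test point where every function above is defined
for name, func, claimed in checks:
    print(name, derivative(func, x0), claimed(x0))
```

In each row, the numerical estimate and the value of the claimed derivative should agree to several decimal places.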
Exercise 106. Find $\frac{dy}{dx}$ and $\left.\frac{dy}{dx}\right|_{x=0}$ for each of the following. (Answer on p. 1422.)
(a) $y = x^2$.
(b) $y = 3x^5 - 4x^2 + 7x - 2$.
(c) $y = (x^2 + 3x + 4)(3x^5 - 4x^2 + 7x - 2)$.
Exercise 107. For each of the following, find $\frac{dy}{dx}$ without using the chain rule.
(a) $y = e^x \ln x$.
(b) $y = x^2 e^x \ln x$.
(c) $y = \frac{\sin x}{x}$.
(d) $y = \tan x$, given that $\tan x = \frac{\sin x}{\cos x}$ and $\sin^2 x + \cos^2 x = 1$.
(e) $y = \frac{1}{z}$, where z is a variable that can be expressed in terms of x. (Leave your answer
in terms of z and $\frac{dz}{dx}$.)
Use (e) to solve (f), (g) and (h):
(f) $y = \operatorname{cosec} x$, where $\operatorname{cosec} x = \frac{1}{\sin x}$.
(g) $y = \sec x$, where $\sec x = \frac{1}{\cos x}$.
(h) $y = \cot x$, where $\cot x = \frac{1}{\tan x}$. (Answers on p. 1422.)
One mnemonic is to think of the derivatives on the RHS as fractions — in which case,
the dy’s get cancelled out and we’re left with dz/dx. (In Part V, we’ll explain why this is
merely a mnemonic and why it is wrong to think of derivatives as fractions.)
The chain rule has the following informal interpretation:
Example 334. Say that if I add to a cup of water 1 g of Milo, its water volume increases
by 2 cm3 . And if the cup’s water volume increases by 1 cm3 , its water level rises by 0.3 cm.
Then by common sense, if I add 1 g of Milo, its water level should rise by 2 × 0.3 = 0.6 cm.
Let’s now rewrite the above common-sense observations more formally:
Let x be the mass (g) of Milo in a cup of water, y be the total volume (cm3 ) of water in
the cup, and z be the water level (cm) in the cup.
• When x increases by 1 g, y increases by 2 cm³.
Formally: $\frac{dy}{dx} = 2$ cm³ g⁻¹.
• When y increases by 1 cm³, z increases by 0.3 cm.
Formally: $\frac{dz}{dy} = 0.3$ cm cm⁻³ = 0.3 cm⁻².
• And so, by the chain rule, when x increases by 1 g, z increases by 2 × 0.3 = 0.6 cm.
Formally: $\frac{dz}{dx} = \frac{dz}{dy} \frac{dy}{dx} = 0.3$ cm⁻² × 2 cm³ g⁻¹ = 0.6 cm g⁻¹.
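Here is the Milo example as a small Python sketch. The two functions volume and level are hypothetical stand-ins I've made up to match the rates stated above (2 cm³ per gram and 0.3 cm per cm³); in particular, the starting volume of 200 cm³ is purely an assumption and doesn't affect the rates.

```python
def volume(x):
    """Hypothetical total volume y (cm^3) when the cup holds x grams of Milo."""
    return 200 + 2 * x  # assumed 200 cm^3 to start, plus 2 cm^3 per gram

def level(y):
    """Hypothetical water level z (cm) when the total volume is y cm^3."""
    return 0.3 * y

def rate(func, at, h=1e-6):
    """Numerically estimate the rate of change of func at the given point."""
    return (func(at + h) - func(at - h)) / (2 * h)

x0 = 5.0                                        # an arbitrary mass of Milo (g)
dy_dx = rate(volume, x0)                        # cm^3 per g
dz_dy = rate(level, volume(x0))                 # cm per cm^3
dz_dx = rate(lambda x: level(volume(x)), x0)    # cm per g, computed directly

print(dz_dy * dy_dx, dz_dx)
```

The product of the two separate rates and the directly computed rate should both come out at about 0.6 cm per gram, as the chain rule says.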
131
For a formal statement and proof of the Chain Rule, see Theorem XXX (Appendices).
Examples of how to “use” the chain rule:
A slightly more complicated example, where we use the Chain Rule more than once:
Exercise 109. Let F, m, v, t, and p denote force, mass, velocity, time, and mo-
mentum. Momentum is defined as the product of mass and velocity.
(a) Newton’s Second Law of Motion states that the rate of change of momentum (of
an object) is equal to the force applied (to that object). Write down this law in
mathematical notation.
(b) Acceleration a is defined as the rate of change of velocity. Explain why Newton’s
Second Law simplifies to F = ma if mass is constant. (Answer on p. 124.12.)
Exercise 110. In Part V (Calculus), Fact 157, we will formally state and prove that:
$$\frac{d}{dx} \ln x \overset{1}{=} \frac{1}{x}.$$
But assuming $\overset{1}{=}$ is true, we can quite easily prove that $\frac{d}{dx} \exp x = \exp x$:
(a) Use the Chain Rule to write down an expression for $\frac{d}{dx} \ln (\exp x)$.
(b) What do you observe about the expression ln (exp x)? Use this observation to write
down another expression for $\frac{d}{dx} \ln (\exp x)$.
(c) Then conclude that $\frac{d}{dx} \exp x = \exp x$. (Answer on p. 1424.)
Remark 34. By the way, you needn’t mug the following derivatives because they are on
your List MF26. (We’ll review the inverse trigonometric functions sin−1 , cos−1 , and tan−1
in Ch. 19.7.)
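$\frac{d}{dx} \sin^{-1} x = \frac{1}{\sqrt{1 - x^2}}$
$\frac{d}{dx} \cos^{-1} x = -\frac{1}{\sqrt{1 - x^2}}$
$\frac{d}{dx} \tan^{-1} x = \frac{1}{1 + x^2}$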
[Figure: graphs of f, g, and h. For f: f′ < 0 on R⁻, f′ = 0 at x = 0, and f′ > 0 on R⁺. For g: g′ > 0 on R⁻, g′ = 0 at x = 0, and g′ < 0 on R⁺. For h: h′ > 0 everywhere.]
Informally and intuitively, a turning point (of a function) is where the graph (of that
function) “turns”. Formally:
f ′ (0) = 0.
[Figure: graph of f, with the point K.]
Moreover, K is also a turning point because it is both a stationary point and a strict
extremum (in particular, it is a strict local minimum.)
Remark 35. The term turning point is rarely or never used by mathematicians. How-
ever, it appears on your O- and A-Level syllabuses and exams. We shall therefore have to
use it. Definition 63 is merely my attempt to formally define what I believe your A-Level
examiners mean by this term.
g ′ (0) = 0.
[Figure: graph of g, with the point D.]
Moreover, D is also a turning point because it is both a stationary point and a strict
extremum (in particular, it is a strict local maximum.)
By definition, every turning point is a stationary point. However, the converse is false —
that is, a stationary point need not be a turning point.
[Venn diagram: the turning points lie inside the stationary points, which lie inside the set of all points.]
[Figure: graph of f; (0, 0) is a stationary point but not a turning point.]
[Figure: graph of j, with the points A = (−4, −81), B = (−3, −81), C = (−2, 76), D = (0, 0), E = (2, 44), F = (4, −32), G = (5, 125), and H = (6, 125).]
132
j is continuous everywhere, but not differentiable everywhere. It is differentiable everywhere except at
B and G.
Example 344. The sine function has infinitely many stationary and turning points. For
every integer k, the point $I_k = ((0.5 + k)\pi, (-1)^k)$ is both a stationary and a turning point.
[Figure: graph of sin, with the points $I_{-3} = \left(-\frac{5\pi}{2}, -1\right)$, $I_{-2} = \left(-\frac{3\pi}{2}, 1\right)$, $I_{-1} = \left(-\frac{\pi}{2}, -1\right)$, $I_0 = \left(\frac{\pi}{2}, 1\right)$, $I_1 = \left(\frac{3\pi}{2}, -1\right)$, and $I_2 = \left(\frac{5\pi}{2}, 1\right)$.]
Note that at every even integer k, the point $I_k = ((0.5 + k)\pi, 1)$ is a strict local maximum,
while at every odd integer l, the point $I_l = ((0.5 + l)\pi, -1)$ is a strict local minimum.
Now, here’s a subtle but important point. Every turning point is either a strict local
maximum or minimum. But the converse is false — that is, a strict local maximum or
minimum need not be a turning point.
[Figure: graph of i, with the point I = (−1, −1).]
However, the derivative of i at I is not equal to zero. Indeed, the derivative of i at I does
not even exist. Hence, I is not a stationary point and cannot be a turning point either.
[Figure: a graph with the points A = (−5, 4), B = (1, 1), and C = (2, −3).]
You may recall from secondary school that there were also things called inflexion
points. Don’t worry, we haven’t covered these yet and will do so in Part V (Calculus).
y
E
A
B C D F
As with line segments, ∣AB∣ and ∣CD∣ will denote the lengths of the arcs AB and CD.
Remark 36. Just so you know, some writers use the word arc to refer to any “smooth”
curve. This shall not be the practice of this textbook. In this textbook, we will strictly
reserve the word arc to mean a subset of the circumference of a circle.
133
For the formal definition of an arc, see Definition 228 in the Appendices.
19.1. Angles
Let a and b be rays. Informally, the angle between a and b is the anti-clockwise rotation a
must undergo to coincide with b.
Now, going the other way, observe that we also get the ray OA if we rotate the ray OB
clockwise (c.w.) by 360○ − α; or by 720○ − α; or by 1080○ − α; etc. But:
And so, by the same reasoning as before, let us simply regard the angles α, α−360○ , α−720○ ,
α − 1080○ , etc. as being equal.
That is, for any integer k, the angles α and α + k ⋅ 360○ are equal. This is what we mean,
when we say that angles are periodic, with period 360○ .
19.2. The Radian
In primary and secondary school, we used the degree (○ ) to measure angles. But from
here on out, we’ll use the radian instead.134
Definition 64. Let AB be an arc of a circle of radius r. Then the magnitude, in radians
(rad), of the angle α subtended by the arc is defined to be the following number:
$$\frac{|AB|}{r}.$$
Refer to the circle on the left. It has centre O and radius r. And so, by the above Definition:
$$\alpha = \frac{|AB|}{r} \quad \text{and} \quad \beta = \frac{|CD|}{r}.$$
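[Figure: two circles of radius r. Left: the arcs AB and CD subtend the angles α and β. Right: the full circumference has length 2πr; the full angle is 2π rad.]
Now refer to the circle on the right. By the above Definition, the full angle 360° equals
2π radians, because this is the ratio of the circle’s entire circumference to its radius r:
$$\text{Full angle} = 360^\circ \overset{1}{=} \frac{2\pi r}{r} \text{ rad} = 2\pi \text{ rad}.$$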
134
Indeed, the radian is the SI unit for angles.
By Definition 64, the angle subtended by an arc whose length equals r has magnitude 1 rad.
To figure out what 1 rad is in degrees, simply135 divide $\overset{1}{=}$ (previous page) by 2π:
$$1 \text{ rad} = \frac{2\pi \text{ rad}}{2\pi} = \frac{360^\circ}{2\pi} \approx 57.3^\circ.$$
[Figure: a circle with centre O and radius r; the arc AB has length r and subtends an angle of 1 rad.]
By the way, observe that by definition, the radian is the ratio of two lengths. Thus, it
is actually a “unitless” unit or a “pure number”. And so, going forward, we will not even
bother writing the unit “rad” (as we’ve been doing so far).
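If you want to convert between degrees and radians on a computer, the following Python sketch may help. The function arc_angle is my own direct transcription of Definition 64, while math.degrees and math.radians are standard-library conversions.

```python
import math

def arc_angle(arc_length, radius):
    """Angle in radians subtended by an arc, per Definition 64: arc length / radius."""
    return arc_length / radius

r = 3.0                                   # an arbitrary radius
print(arc_angle(r, r))                    # an arc of length r subtends 1 rad
print(arc_angle(2 * math.pi * r, r))      # the full circumference subtends 2*pi rad
print(math.degrees(1))                    # 1 rad expressed in degrees
print(math.radians(180), math.pi)         # the straight angle in radians
```

The third line of output is the approximate value 57.3° quoted above, shown to more decimal places.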
Refer to the circle on the left. The straight angle 180○ is that subtended by the semicircle
and equals π, because this is the ratio of a semicircle’s length πr to the radius r.
[Figure: two circles with centre O and radius r. Left: the semicircle of length πr; the straight angle is π. Right: the right angle is π/2.]
Now refer to the circle on the right. The right angle 90° is one-quarter of a full angle
and thus equal to π/2. By convention, we depict the right angle as a square and any other
angle as a sector of a circle.
135
This computation requires that we know the value (or at least the approximate value) of π.
Altogether, we have seven names for angles, depending on their magnitude:
Here are a few convenient terms that we may use in this textbook:
• Complementary if $\alpha + \beta = \frac{\pi}{2}$;
• Supplementary if α + β = π; and
• Explementary or conjugate if α + β = 2π.
Remark 37. By convention, angles are usually denoted by lower-case Greek letters α
(alpha), β (beta), γ (gamma), and θ (theta). Your List of Formulae (MF26, p. 3) also
uses upper-case Latin letters like A, B, P , and Q, and so we’ll use those too.
Proof. Let D be the point on AB that is the base of the perpendicular from the point C.
[Figure: the triangle ABC, with the point D on AB.]
Observe that the three triangles ABC, ACD, and CBD are similar. Thus:
$$\frac{|AC|}{|AB|} = \frac{|AD|}{|AC|} \quad \text{and} \quad \frac{|BC|}{|AB|} = \frac{|BD|}{|BC|}.$$
[Figure: a right triangle with angle θ, hypotenuse h, opposite side o, and adjacent side a.]
Here now are the right-triangle definitions of sine and cosine you’ll recall from secondary
school:
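We then use the sine and cosine functions to define another four trigonometric functions:
The tangent function $\tan : \mathbb{R} \setminus \left\{\left(k + \frac{1}{2}\right)\pi : k \in \mathbb{Z}\right\} \to \mathbb{R}$ by $\tan\theta = \frac{\sin\theta}{\cos\theta}$.
The cosecant function $\operatorname{cosec} : \mathbb{R} \setminus \{k\pi : k \in \mathbb{Z}\} \to \mathbb{R}$ by $\operatorname{cosec}\theta = \frac{1}{\sin\theta}$.
The secant function $\sec : \mathbb{R} \setminus \left\{\left(k + \frac{1}{2}\right)\pi : k \in \mathbb{Z}\right\} \to \mathbb{R}$ by $\sec\theta = \frac{1}{\cos\theta}$.
The cotangent function $\cot : \mathbb{R} \setminus \{k\pi : k \in \mathbb{Z}\} \to \mathbb{R}$ by $\cot\theta = \frac{\cos\theta}{\sin\theta}$.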
Remark 38. Your A-Level syllabus and exams denote the cosecant function by cosec and
so we’ll do that too. Be aware though that many writers instead denote it by csc.
(For now, don’t worry about the complicated-looking domains of the trigonometric functions
— we’ll discuss them in a moment.)
The above Definition together with the right-triangle definitions of sine and cosine give us:
大 跤 嫂
tuā kha só
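(a) $\sin^2 x + \cos^2 x = \frac{o^2}{h^2} + \frac{a^2}{h^2} = \frac{o^2 + a^2}{h^2} \overset{P}{=} \frac{h^2}{h^2} = 1$.
(b) $1 + \tan^2 x = 1 + \frac{o^2}{a^2} = \frac{a^2 + o^2}{a^2} \overset{P}{=} \frac{h^2}{a^2} = \sec^2 x$.
(c) $1 + \cot^2 x = 1 + \frac{a^2}{o^2} = \frac{o^2 + a^2}{o^2} \overset{P}{=} \frac{h^2}{o^2} = \operatorname{cosec}^2 x$.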
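The three identities just proven can also be spot-checked numerically. In the sketch below, the helper identity_gaps (my own name) returns, for each identity, the left-hand side minus the right-hand side, so each number should be essentially zero. The test angles are arbitrary, and deliberately include non-acute ones.

```python
import math

def identity_gaps(x):
    """Return LHS - RHS for identities (a), (b), and (c) at the angle x (radians)."""
    s, c = math.sin(x), math.cos(x)
    gap_a = s**2 + c**2 - 1
    gap_b = 1 + (s / c) ** 2 - (1 / c) ** 2   # 1 + tan^2 x - sec^2 x
    gap_c = 1 + (c / s) ** 2 - (1 / s) ** 2   # 1 + cot^2 x - cosec^2 x
    return gap_a, gap_b, gap_c

for x in (0.3, 1.0, 2.5, -4.0):
    print(x, identity_gaps(x))
```

Any tiny non-zero values in the printout are floating-point rounding noise rather than a failure of the identities.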
136
Credit to Tom J at TheStudentRoom.co.uk.
137
To mug is to study hard and, especially, to engage in rote learning or memorisation. I consider mug to
be Singlish and so italicise it.
It seems though that the phrase mug up may have originated in Britain. To my knowledge, this phrase
isn’t in current usage in Britain. I have however come across South Asians using mug up in this sense.
(In efficient Singlish though, the preposition up is simply dropped.)
138
Our proof here is actually not quite general, because it implicitly refers to the right triangle and thus
implicitly assumes that x is acute.
It’s not difficult to remember these particular values of sin, cos, and tan:
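Fact 30.
x     | 0              | π/6            | π/4            | π/3            | π/2
sin x | $\sqrt{0}/2$ | $\sqrt{1}/2$ | $\sqrt{2}/2$ | $\sqrt{3}/2$ | $\sqrt{4}/2$
cos x | $\sqrt{4}/2$ | $\sqrt{3}/2$ | $\sqrt{2}/2$ | $\sqrt{1}/2$ | $\sqrt{0}/2$
tan x | 0              | $1/\sqrt{3}$ | 1              | $\sqrt{3}$   | N.A.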
Proof. The results for x = 0 and x = π/2 are “obvious” and require no proof.
Here we merely give an informal proof-by-picture that:
(a) $\sin\frac{\pi}{6} = \frac{1}{2}$, (b) $\sin\frac{\pi}{3} = \frac{\sqrt{3}}{2}$, and (c) $\sin\frac{\pi}{4} = \frac{\sqrt{2}}{2}$.
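$$\sin\angle BAD = \frac{|BD|}{|AB|}, \quad \text{or} \quad \sin\frac{\pi}{6} = \frac{0.5}{1} = \frac{1}{2}.$$
[Figure: a triangle with the points B, D, and C, where |BD| = 0.5 and one angle is π/3.]
Thus, $|DE| = \sqrt{2}$. And so:
$$\sin\angle DEF = \frac{|DF|}{|DE|}, \quad \text{or} \quad \sin\frac{\pi}{4} = \frac{\sqrt{2}}{2}.$$
[Figure: a right triangle with the points D, E, and F, where one side has length 1 and the angle at E is π/4.]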
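To complete the proof, we use $\cos\theta = \sqrt{1 - \sin^2\theta}$ and $\tan\theta = \frac{\sin\theta}{\cos\theta}$ and write:
$$\cos\frac{\pi}{6} = \sqrt{1 - \sin^2\frac{\pi}{6}} = \sqrt{1 - \left(\frac{1}{2}\right)^2} = \frac{\sqrt{3}}{2}. \qquad \tan\frac{\pi}{6} = \frac{\sin(\pi/6)}{\cos(\pi/6)} = \frac{1/2}{\sqrt{3}/2} = \frac{1}{\sqrt{3}} = \frac{\sqrt{3}}{3}.$$
$$\cos\frac{\pi}{3} = \sqrt{1 - \sin^2\frac{\pi}{3}} = \sqrt{1 - \left(\frac{\sqrt{3}}{2}\right)^2} = \frac{1}{2}. \qquad \tan\frac{\pi}{3} = \frac{\sin(\pi/3)}{\cos(\pi/3)} = \frac{\sqrt{3}/2}{1/2} = \sqrt{3}.$$
$$\cos\frac{\pi}{4} = \sqrt{1 - \sin^2\frac{\pi}{4}} = \sqrt{1 - \left(\frac{\sqrt{2}}{2}\right)^2} = \frac{\sqrt{2}}{2}. \qquad \tan\frac{\pi}{4} = \frac{\sin(\pi/4)}{\cos(\pi/4)} = \frac{\sqrt{2}/2}{\sqrt{2}/2} = 1.$$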
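The pattern in Fact 30 (√0/2, √1/2, √2/2, √3/2, √4/2 for sine, and the same in reverse for cosine) is easy to check with a few lines of Python. The helper fact30_row is my own name for the pattern; math.sin and math.cos serve as the independent reference values.

```python
import math

# The five angles of Fact 30, in order.
angles = [0.0, math.pi / 6, math.pi / 4, math.pi / 3, math.pi / 2]

def fact30_row(k):
    """Return the (sin, cos) values that Fact 30 claims for the k-th angle (k = 0, ..., 4)."""
    return math.sqrt(k) / 2, math.sqrt(4 - k) / 2

for k, angle in enumerate(angles):
    claimed_sin, claimed_cos = fact30_row(k)
    print(k, claimed_sin, math.sin(angle), claimed_cos, math.cos(angle))
```

In each row, the claimed value and the library value should agree up to floating-point rounding.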
Fun Fact
141
Victor Katz, A History of Mathematics: An Introduction, (2009, p. 253).
19.5. Sine and Cosine: The Unit-Circle Definitions
Unfortunately, the right-triangle definitions of sine and cosine suffer from a slight problem: they “work” only if
θ is an angle in a right triangle, i.e. only if θ ∈ [0, π/2].
In order for sin and cos to also “work” more generally, that is, for any θ ∈ R, we must turn to the unit-circle
definitions.
We first divide the cartesian plane into four Quadrants called I, II, III, and IV: Quadrant I is where x and y
are positive, then go anti-clockwise.
[Figure: the cartesian plane divided into Quadrants I, II, III, and IV.]
The positive x-axis is a ray that starts at the origin. For convenience, let’s call it x+ .
Let α be any angle. Rotate x+ anticlockwise by α to produce the ray a.
Let A = (Ax , Ay ) be the point at which a intersects the unit circle centred on the origin.
[Figure: the unit circle centred on the origin O, the rays x+ and a, the angle α, and the points Ā and $A = (A_x, A_y)$.]
That is, sine and cosine are simply given by the y- and x-coordinates of the point A.
$$\sin\alpha = \frac{o}{h} = \frac{|A\bar{A}|}{|OA|} = \frac{A_y}{1} = A_y \quad \text{and} \quad \cos\alpha = \frac{a}{h} = \frac{|O\bar{A}|}{|OA|} = \frac{A_x}{1} = A_x,$$
which indeed coincides with the unit-circle definitions from the previous page.
In the other Quadrants, the right-triangle definitions will continue to give the “correct”
magnitudes of sine and cosine. However, they may now give the “wrong” signs. (“Correct”
and “wrong” here are as determined by the unit-circle definitions.)
Consider for example the angle β ∈ (π, 3π/2) in Quadrant III. As before, rotate x+ anti-
clockwise by β to produce the ray b. Let B = (Bx , By ) be the point at which b intersects
the unit circle centred on the origin.
[Figure: the unit circle centred on the origin O, the rays x+ and b, the angle β, and the points B̄ and $B = (B_x, B_y)$.]
In contrast, looking at the right triangle OB̄B, the right-triangle definitions give:
Remark 39. Note though that the unit-circle definitions are considered informal, because
they rely on drawings. One way to formally define the sine and cosine functions is to
use their power series — something we’ll learn about in Part V. Another way is to use
Euler’s identity — something we’ll learn about in Part IV.
For A-Level maths, our above unit-circle definitions will be more than good enough. But
if you’re interested, this textbook’s “official” formal definitions of sine and cosine are
given in Definitions 177 and 178 in the Appendices.
Summary of the sign of each of sine, cosine, and tangent, in each of the four Quadrants:
[Figure: graphs of sin, cos, and tan.]
Exercise 113. Graph the functions cosec, sec, and cot. (Answer on the next page.)
More generally, if we shift any of the six graphs to the left or right by an integer multiple
of 2π, we get the exact same graphs. Algebraically:
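$$\sin\theta = \sin(\theta + 2\pi) = \sin(\theta - 2\pi), \qquad \cos\theta = \cos(\theta + 2\pi) = \cos(\theta - 2\pi),$$
$$\sec\theta = \sec(\theta + 2\pi) = \sec(\theta - 2\pi), \qquad \operatorname{cosec}\theta = \operatorname{cosec}(\theta + 2\pi) = \operatorname{cosec}(\theta - 2\pi).$$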
• tan and cot don’t just have period 2π — they also have period π.
That is, after every π, they “repeat”. If we shift their graphs to the left or right by an
integer multiple of π, we get the exact same graphs. Algebraically:
Domain Range
sin R [−1, 1]
cos R [−1, 1]
tan R ∖ {(k + 0.5) π ∶ k ∈ Z} R
cosec R ∖ {kπ ∶ k ∈ Z} R ∖ (−1, 1)
sec R ∖ {(k + 0.5) π ∶ k ∈ Z} R ∖ (−1, 1)
cot R ∖ {kπ ∶ k ∈ Z} R
Recall142 that the domain of f /g must exclude any values of x for which g (x) = 0. This is
exactly what we must do here to get the domains of tan, cosec, sec, and cot:
For any integer k, sin (kπ) = 0. And so, for cosec = 1/ sin and cot = cos / sin, we must
exclude {kπ ∶ k ∈ Z} from the domain.
And for any integer k, cos ((k + 0.5) π) = 0. And so, for sec = 1/ cos and tan = sin / cos, we
must exclude {(k + 0.5) π ∶ k ∈ Z} from the domain.
• sin translated leftwards by π/2 is cos:
$$\sin\left(\theta + \frac{\pi}{2}\right) = \cos\theta.$$
Equivalently, cos translated rightwards by π/2 is sin:
$$\cos\left(\theta - \frac{\pi}{2}\right) = \sin\theta.$$
• sin and cos translated left- or rightwards by π are their own reflections in the x-axis:
142
Ch. 13.
Fact 31 lists a whole bunch of useful trigonometric identities.
A, B, A ± B ∉ {(k + 0.5) π ∶ k ∈ Z} .
Thankfully, you needn’t mug the above because they’re on List of Formulae, MF26 (p. 3).
Whenever you see a question with trigonometric functions, put MF26 (p. 3) next to you!
143
These additional conditions ensure that cos A, cos B, and cos (A ± B) are non-zero — and hence that
tan A, tan B, tan (A ± B) are well-defined. Otherwise, (c) is false, because an undefined mathematical
object cannot be said to be equal to anything — indeed, not even to another undefined object!
Here we’ll give only an informal proof-by-picture144 of the sine and cosine Addition Formu-
lae.145 This proof-by-picture covers only the special case where A, B, and A + B are acute.
The subsequent exercises then ask you to prove the remaining formulae.
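[Figure: the rectangle PQSU containing the right triangle PRT, with the angles A and B at P and with lengths labelled 1, sin B, cos B, sin A cos B, cos A cos B, cos A sin B, and sin (A + B).]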
1. ∣P T ∣ = cos B. Thus:
We now have:
144
Credit to Blue .
145
My mnemonic from earlier also kinda works here: sine is normal, while cosine is weird.
146
Construction details. Let A, B, A + B ∈ (0, π/2). Let P U be the horizontal ray that starts at P . Rotate
P U anticlockwise by A to get the ray P T . Now rotate P T anticlockwise by B to get the ray P R. Pick
R so that ∣P R∣ = 1. Pick T so that it is the perpendicular drop of R onto the ray P T . Now construct
the rectangle P QSU that completely contains the right triangle P RT , with R on the line segment QS
and T on the line segment SU .
147
I.e. ∠RT S + ∠P T U = π/2 — see Definition 67.
Exercise 114. Use the figure148 to prove the Subtraction Formulae for Sine
and Cosine in the special case where A is acute and B < A. (Answer on p. 1426.)
[Figure: a rectangle with the labelled points Q, S, and R.]
Remark 40. The Double-Angle Formulae are on List MF26, so strictly speaking, you
needn’t memorise them. Nonetheless, it’s a good idea to have them committed to memory
so you can solve problems that much more quickly.
Remark 41. The Triple-Angle Formulae are not on List MF26. So if they ever come up
on exams, you’ll want to be able to either derive them from scratch or recall them.
For cos 3A = 4 cos3 A − 3 cos A, there’s the Hokkien mnemonic “$1.30 = $4.30 − $3”:
箍 三 等於 四箍 三 減 三 箍 。
khoo sam sì khoo sam sam khoo .
$1.30 = $4.30 − $3 .
148
The construction of this figure is very similar to before, except now we start with the vertical ray P Q,
rotate it clockwise by A − B to get the ray P R, then rotate P R clockwise by B to get P T .
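Exercise 118. Prove the following Half-Angle Formulae. (Answer on p. 1428.)
$$\sin\frac{A}{2} = \begin{cases} \sqrt{\dfrac{1 - \cos A}{2}}, & \text{for } \dfrac{A}{2} \text{ in Quadrant I or II}, \\ -\sqrt{\dfrac{1 - \cos A}{2}}, & \text{for } \dfrac{A}{2} \text{ in Quadrant III or IV}, \end{cases}$$
$$\cos\frac{A}{2} = \begin{cases} \sqrt{\dfrac{1 + \cos A}{2}}, & \text{for } \dfrac{A}{2} \text{ in Quadrant I or IV}, \\ -\sqrt{\dfrac{1 + \cos A}{2}}, & \text{for } \dfrac{A}{2} \text{ in Quadrant II or III}. \end{cases}$$
Hint: $\cos A = \cos\left(\frac{A}{2} + \frac{A}{2}\right)$.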
Exercise 119. Prove the following Sum to Product or Product to Sum Formulae.
$$\sin P + \sin Q = 2 \sin\frac{P + Q}{2} \cos\frac{P - Q}{2}, \qquad \cos P + \cos Q = 2 \cos\frac{P + Q}{2} \cos\frac{P - Q}{2},$$
$$\sin P - \sin Q = 2 \cos\frac{P + Q}{2} \sin\frac{P - Q}{2}, \qquad \cos P - \cos Q = -2 \sin\frac{P + Q}{2} \sin\frac{P - Q}{2}.$$
(Answers on p. 1429.)
Hint: $P = \frac{P + Q}{2} + \frac{P - Q}{2}$ and $Q = \frac{P + Q}{2} - \frac{P - Q}{2}$.
Fun Fact
The above S2P or P2S Formulae are also known as the Prosthaphaeresis Formulae.
Sounds cheem, but that’s just the combination of the Greek words for addition and
subtraction — prosthesis and aphaeresis. So yea, something you can totally use to impress
your friends and family.
The P2S Formulae will be particularly useful when we do integration, because they allow
us to rewrite an otherwise-difficult-to-integrate product into an easy-to-integrate sum.
Exercise 120. Rewrite each expression using the P2S Formulae: (Answer on p. 1429.)
But this is not the case! Very confusingly, sin² denotes the function sin ⋅ sin. That is:
$$\sin^2\frac{\pi}{2} = \left(\sin\frac{\pi}{2}\right)^2 = (1)^2 = 1, \quad \text{but} \quad \sin\left(\sin\frac{\pi}{2}\right) = \sin 1 \approx 0.841.$$
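The difference between squaring the sine and applying the sine twice is easy to see on a computer. This is a small Python sketch; the variable names are mine.

```python
import math

x = math.pi / 2

square_of_sine = math.sin(x) ** 2         # what the notation sin^2 x means
sine_of_sine = math.sin(math.sin(x))      # the composite sin(sin x), which it does NOT mean

print(square_of_sine, sine_of_sine)
```

The first number is 1 (up to rounding), while the second is sin 1, which is noticeably smaller.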
And in general, for any positive integer n, sinn does not denote sin ○ sin ○ ⋅ ⋅ ⋅ ○ sin. That is:
[Figure: graphs of sin, cos, and tan, together with the horizontal line y = 1.]
And so, if we want to construct the inverses of sin, cos, and tan, we’ll have to first restrict
their domains:
Exercise 121. Use the HLT to show that the following domain restrictions will create
invertible functions. Then write down their inverses. (Answer on the next page.)
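Function | Domain | Restricted domain
sin | R | $\left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$
cos | R | [0, π]
tan | $\mathbb{R} \setminus \left\{\left(k + \frac{1}{2}\right)\pi : k \in \mathbb{Z}\right\}$ | $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$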
Below are the graphs of sinR , cosR , and tanR . Clearly, by the HLT, each is invertible.
[Figure: graphs of sinR, cosR, and tanR, drawn together with the graphs of sin, cos, and tan.]
Definition 69. The inverse trigonometric functions arcsine, arccosine, and arctangent
are denoted sin−1 , cos−1 , and tan−1 and are defined as follows:
149
Note that the domain restrictions given here are somewhat arbitrary. For example, with sin, we could
equally well have chosen to restrict the domain to [π/2, 3π/2] instead. Nonetheless, by convention, these
domain restrictions are standard and so they are what we’ll use.
The three inverse trigonometric functions are graphed below. Observations:
• sin−1 and cos−1 both have domain [−1, 1].
sin−1 has endpoints (−1, −π/2) and (1, π/2), while cos−1 has endpoints (−1, π) and (1, 0).
• In contrast, tan−1 has domain R and no endpoints.
[Figure: graphs of sin⁻¹, cos⁻¹, and tan⁻¹.]
Note that each function has a range that’s equal to its codomain. Each inverse trigonometric
function’s range (or equivalently codomain) is also called its set of principal values. So,
we have the following table (which also appears on List MF26, p. 3):
Remark 42. The previous subchapter noted that sin² = sin ⋅ sin. This is confusing and
contradicts our earlier use of f² to denote the composite function f ○ f .
Here, to add to our confusion, sin−1 doesn’t mean 1/ sin, as would be logical given that
sin2 = sin ⋅ sin. Instead, sin−1 x denotes the inverse sine or arcsine function!
This tremendously confusing notation is one reason why many writers prefer to denote
the three inverse trigonometric functions by arcsin, arccos, and arctan.
However and unfortunately, your A-Level exams and syllabus insist on using the notation
sin−1 , cos−1 , and tan−1 — and so that’s what we’ll have to do too.
Lemma 1. (a) If x ∈ R, then $\cos(\tan^{-1} x) = \frac{1}{\sqrt{1 + x^2}}$ and $\sin(\tan^{-1} x) = \frac{x}{\sqrt{1 + x^2}}$.
(b) If x ∈ [−1, 1], then $\sin(\cos^{-1} x) = \sqrt{1 - x^2} = \cos(\sin^{-1} x)$.
Let’s first give an informal proof-by-picture:
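(a) [Figure: a right triangle with angle tan⁻¹ x, opposite side x, adjacent side 1, and hypotenuse $\sqrt{1 + x^2}$.]
$$\sin(\tan^{-1} x) = \frac{O}{H} = \frac{x}{\sqrt{1 + x^2}}, \qquad \cos(\tan^{-1} x) = \frac{A}{H} = \frac{1}{\sqrt{1 + x^2}}.$$
(b) [Figure: a right triangle with angles cos⁻¹ x and sin⁻¹ x, hypotenuse 1, and other sides x and $\sqrt{1 - x^2}$.]
$$\sin(\cos^{-1} x) = \frac{O}{H} = \frac{\sqrt{1 - x^2}}{1}, \qquad \cos(\sin^{-1} x) = \frac{A}{H} = \frac{\sqrt{1 - x^2}}{1}.$$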
Formal proof:
150
Here we’ve actually omitted a step. Recall that if $a = \sqrt{b^2}$, then a = ∣b∣. And so here, taking square roots
of the equation $1 + x^2 = \sec^2 y$, we should instead get $\sqrt{1 + x^2} = |\sec y| = |\sec(\tan^{-1} x)|$. We next observe
that $\tan^{-1} x \in \left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$ and hence $\sec(\tan^{-1} x) > 0$; thus, we can get rid of the absolute value sign.
Next, $\sin^2 y = 1 - \cos^2 y = 1 - \frac{1}{1 + x^2} = \frac{x^2}{1 + x^2}$. Taking square roots, we have151 $\sin y = \frac{x}{\sqrt{1 + x^2}}$
or $\sin(\tan^{-1} x) = \frac{x}{\sqrt{1 + x^2}}$.
(b) Let $y = \cos^{-1} x$. Then x = cos y and $1 - x^2 = 1 - \cos^2 y = \sin^2 y$. Taking square roots,152
We have the following obvious result that is immediate from the definitions of the inverse
trigonometric functions:
The following Corollary is nearly immediate from the above Lemma and Fact:
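Corollary 4. If x ∈ (−1, 1), then $\tan(\sin^{-1} x) = \frac{x}{\sqrt{1 - x^2}}$ and, provided x ≠ 0, $\tan(\cos^{-1} x) = \frac{\sqrt{1 - x^2}}{x}$.
Proof. By tan = sin / cos and the above Lemma and Fact, we have:
$$\tan(\sin^{-1} x) = \frac{\sin(\sin^{-1} x)}{\cos(\sin^{-1} x)} = \frac{x}{\sqrt{1 - x^2}} \quad \text{and} \quad \tan(\cos^{-1} x) = \frac{\sin(\cos^{-1} x)}{\cos(\cos^{-1} x)} = \frac{\sqrt{1 - x^2}}{x}.$$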
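Here is a numerical spot-check of Lemma 1 and Corollary 4, using Python's math.atan, math.acos, and math.asin for the three inverse trigonometric functions. The helper names are my own; each helper returns left-hand side minus right-hand side, so every number should be essentially zero. Note that I only test the second formula of the Corollary at non-zero x, since it involves dividing by x.

```python
import math

def lemma_a_gaps(x):
    """LHS - RHS for Lemma 1(a), for any real x."""
    root = math.sqrt(1 + x * x)
    return math.cos(math.atan(x)) - 1 / root, math.sin(math.atan(x)) - x / root

def lemma_b_gaps(x):
    """LHS - RHS for Lemma 1(b), for x in [-1, 1]."""
    root = math.sqrt(1 - x * x)
    return math.sin(math.acos(x)) - root, math.cos(math.asin(x)) - root

def corollary_gaps(x):
    """LHS - RHS for Corollary 4, for x in (-1, 1) with x != 0."""
    root = math.sqrt(1 - x * x)
    return math.tan(math.asin(x)) - x / root, math.tan(math.acos(x)) - root / x

for x in (-0.8, 0.3):
    print(x, lemma_a_gaps(x), lemma_b_gaps(x), corollary_gaps(x))
```

Testing a negative value of x as well as a positive one is deliberate: it checks that the signs come out right, which is exactly the subtlety the footnotes are concerned with.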
$|\sin y| = \frac{|x|}{\sqrt{1 + x^2}}$ or $|\sin(\tan^{-1} x)| = \frac{|x|}{\sqrt{1 + x^2}}$. To get rid of these absolute value signs, we observe that:
• $x \geq 0 \iff \tan^{-1} x \in \left[0, \frac{\pi}{2}\right)$ and hence $\sin(\tan^{-1} x) \geq 0$.
• $x < 0 \iff \tan^{-1} x \in \left(-\frac{\pi}{2}, 0\right)$ and hence $\sin(\tan^{-1} x) < 0$.
The above two observations show that x always has the same sign as $\sin(\tan^{-1} x)$. Thus, we can get
rid of the absolute value signs.
152
Again, here we’ve omitted a step. Taking square roots of $1 - x^2 = \sin^2 y$, we should instead have $|\sin y| = \sqrt{1 - x^2}$
or $|\sin(\cos^{-1} x)| = \sqrt{1 - x^2}$. We next observe that $\cos^{-1} x \in [0, \pi]$ and hence $\sin(\cos^{-1} x) \geq 0$;
thus, we can get rid of the absolute value sign.
153
Again, here we’ve omitted a step. Taking square roots of $1 - x^2 = \cos^2 z$, we should instead have $|\cos z| = \sqrt{1 - x^2}$
or $|\cos(\sin^{-1} x)| = \sqrt{1 - x^2}$. We next observe that $\sin^{-1} x \in \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$ and hence $\cos(\sin^{-1} x) \geq 0$;
thus, we can get rid of the absolute value sign.
The next Fact is a result you’re supposed to have mastered in secondary school. Sadly, it
is not on List MF26, which means you’ll have to mug it.
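\[ R = \sqrt{a^2 + b^2} \quad\text{and}\quad \alpha = \tan^{-1}\frac{b}{a}. \]
\[ \text{and:}\quad \cos\alpha = \cos\left(\tan^{-1}\frac{b}{a}\right) = \frac{1}{\sqrt{1 + \left(\frac{b}{a}\right)^2}} = \frac{1}{\sqrt{\frac{a^2 + b^2}{a^2}}} = \frac{a}{\sqrt{a^2 + b^2}}. \]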
Below we first use the Addition and Subtraction Formulae, then what was just written in
red and blue above. In each case, the surd $\sqrt{a^2 + b^2}$ nicely cancels out:
(a) $R \sin(\theta + \alpha) = \sqrt{a^2 + b^2}\,(\cos\alpha \sin\theta + \sin\alpha \cos\theta) = a \sin\theta + b \cos\theta$,
(b) $R \cos(\theta + \alpha) = \sqrt{a^2 + b^2}\,(\cos\alpha \cos\theta - \sin\alpha \sin\theta) = a \cos\theta - b \sin\theta$,
(c) $R \sin(\theta - \alpha) = \sqrt{a^2 + b^2}\,(\cos\alpha \sin\theta - \sin\alpha \cos\theta) = a \sin\theta - b \cos\theta$,
(d) $R \cos(\theta - \alpha) = \sqrt{a^2 + b^2}\,(\cos\alpha \cos\theta + \sin\alpha \sin\theta) = a \cos\theta + b \sin\theta$.
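If you'd like to convince yourself numerically that identities (a) to (d) hold, here is a minimal Python sketch. It assumes a > 0 (so that cos α = a/R, as in the computation above); the function names `r_formula` and `check_identities` are made up for this illustration.

```python
import math

def r_formula(a, b):
    """Return (R, alpha) with R = sqrt(a^2 + b^2) and alpha = atan(b / a).

    Assumption for this sketch: a > 0.
    """
    if a <= 0:
        raise ValueError("this sketch assumes a > 0")
    return math.hypot(a, b), math.atan(b / a)

def check_identities(a, b, theta):
    """Return (left side, right side) pairs for identities (a)-(d) at angle theta (radians)."""
    R, alpha = r_formula(a, b)
    s, c = math.sin(theta), math.cos(theta)
    return [
        (R * math.sin(theta + alpha), a * s + b * c),  # (a)
        (R * math.cos(theta + alpha), a * c - b * s),  # (b)
        (R * math.sin(theta - alpha), a * s - b * c),  # (c)
        (R * math.cos(theta - alpha), a * c + b * s),  # (d)
    ]

for left, right in check_identities(3, 4, 0.7):
    print(round(left, 10), round(right, 10))
```

Each printed pair should agree up to floating-point rounding.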
On the next page are another two (inverse) trigonometric identities that will be used in
Part III (Vectors):
Proof. For a formal proof, see Exercise 122. Here follows an informal proof-by-picture:
Two right triangles with hypotenuse and base of lengths 1 and x are drawn below.
1. The red angle equals cos−1 x.
2. By symmetry, the green angle is equal to the red angle.
3. The blue angle is the supplement of
[Figure: two right triangles, each with hypotenuse of length 1; the blue angle (3) is labelled cos⁻¹(−x).]
Proof. For a formal proof, see Exercise 123. Here is an informal proof-by-picture:
Two right triangles with hypotenuse and base of lengths 1 and x are drawn below.
1. The red angle equals $\cos^{-1} x$.
2. The blue angle is complementary to the red angle and is also equal to $\sin^{-1} x$ (because “Opp” is of length x).
Hence, $\sin^{-1} x = \pi/2 - \cos^{-1} x$.
Rearranging, $\cos^{-1} x + \sin^{-1} x = \pi/2$.
[Figure: right triangle with hypotenuse 1 and a side of length x; the red angle (1) is labelled cos⁻¹ x and the blue angle (2) is labelled sin⁻¹ x.]
Exercise 122. This Exercise guides you through a proof of Fact 34. Recall (p. 19.5)
that cosine reflected in the x-axis is cosine translated rightwards by π. That is:
− cos θ = cos (θ − π) .
Exercise 123. This Exercise guides you through a proof of Fact 35. Recall (p. 19.5)
that cosine translated rightwards by π/2 is sine. That is:
\[ \sin\theta = \cos\left(\theta - \frac{\pi}{2}\right). \]
Proposition 5. Let △ABC have angles A, B, and C, and sides a, b, and c. Then:
(a) The area of △ABC is: $\frac{1}{2} ab \sin C$.
Proof. Let D be the point on AC that is the base of the perpendicular from B. Then sin C =
∣BD∣ /a and cos C = ∣CD∣ /a. Thus, ∣BD∣ = a sin C, ∣CD∣ = a cos C, and ∣AD∣ = b − a cos C.
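[Figure: △ABC with sides a, b, c; D is the foot of the perpendicular from B to AC, with |BD| = a sin C, |AD| = b − a cos C, and |CD| = a cos C.]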
(a) △ABC has base b and height a sin C. Hence, its area is 0.5ab sin C.
(b) By symmetry, the triangle has area 0.5ab sin C = 0.5bc sin A = 0.5ac sin B.
Divide by 0.5abc to get: sin A/a = sin B/b = sin C/c.
(c) Consider the triangle ABD. It has hypotenuse of length c and legs of lengths a sin C
and b − a cos C. Now use the Pythagorean Theorem and the identity sin2 C + cos2 C = 1:
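The computation that (c) points to can be written out as follows. This is a sketch that uses only the leg lengths stated above (a sin C and b − a cos C) and the identity sin² C + cos² C = 1:

```latex
\begin{align*}
c^2 &= (a \sin C)^2 + (b - a \cos C)^2 \\
    &= a^2 \sin^2 C + b^2 - 2ab \cos C + a^2 \cos^2 C \\
    &= a^2 \left(\sin^2 C + \cos^2 C\right) + b^2 - 2ab \cos C \\
    &= a^2 + b^2 - 2ab \cos C.
\end{align*}
```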
Corollary 5. The length of any one side of a triangle is always less than the sum of the
lengths of the other two sides.
Proof. Consider a triangle with sides of lengths a, b, c > 0, where C > 0 is the angle opposite
the side of length c. By the Law of Cosines:
\begin{align*}
c^2 &= a^2 + b^2 - 2ab \cos C \\
    &= a^2 + b^2 - 2ab + 2ab - 2ab \cos C \\
    &= (a - b)^2 + 2ab\,(1 - \cos C) \\
    &> (a - b)^2,
\end{align*}
where the last inequality follows because a, b > 0 and cos C < 1.
Since c > 0, the inequality $c^2 > (a - b)^2$ implies $c > |a - b| \ge a - b$, or a < b + c. This proves that the length
The following result, known as the Triangle Inequality, is secretly the same as the above
result (hence the name):
The constant and identity functions are, of course, special instances of polynomial
functions.
Definition 71. An identity function is any nice function f defined by f(x) = x.¹⁵⁴
Definition 72. A constant function is any nice function f defined by f (x) = c, where
c ∈ R.
A special case of a constant function is a zero function:
Definition 74. A power function is any nice function f defined by $f(x) = x^k$, where k
is any real number.
We shall not formally define what an algebraic function is. Instead, we shall merely note
in passing that the set of algebraic functions includes all polynomial and power functions
(but also more besides).
Definition 75. The functions sin, cos, tan, cosec, sec, and cot are called trigonometric
(or circular) functions.¹⁵⁵
Definition 76. The functions sin−1 , cos−1 , and tan−1 are called inverse trigonometric (or
circular) functions.
All of the above functions are elementary functions. Also, any arithmetic combination
(Ch. 13) or composition (Ch. 15) of two elementary functions is an elementary function.
Formally:
154
The identity mapping is the mapping x ↦ x. And so, an identity function may also be defined as any
nice function with the identity mapping.
155
On p. 18 of your H2 Maths syllabus, these six functions are simply called the circular functions.
However, the term trigonometric functions is probably more common.
Definition 77. An elementary function is:
Nearly every function you’ll ever encounter in H2 Maths is elementary. Through arithmetic
combinations and compositions, we can build functions that “look” ever more
complicated but are nonetheless elementary:
\[ |x| = \begin{cases} x & \text{for } x \ge 0, \\ -x & \text{for } x < 0. \end{cases} \]
The absolute value function doesn’t seem to fall under our above Definition of elementary
functions. But observe we can rewrite the mapping rule of ∣⋅∣ more simply as:
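\[ |x| = \sqrt{x^2}. \]
(If you are puzzled by the above equation, recall that for any real $y \ge 0$, we have $\sqrt{y} \ge 0$; see
Remark 15.)
Now define the functions $f\colon \mathbb{R} \to \mathbb{R}$ by $f(x) = x^2$ and $g\colon [0, \infty) \to \mathbb{R}$ by $g(x) = \sqrt{x}$. Both f and g are
elementary. Moreover, |⋅| is the composition of g and f, because $g(f(x)) = \sqrt{x^2} = |x|$:
\[ |\cdot| = g \circ f, \]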
Remark 43. Note that there is no single standard definition of the term elementary
function. The above definition is merely this book’s.156
This term is nonetheless introduced because it is a convenient one for referring to nearly
all functions that maths students at this level will encounter.
156
And ProofWiki’s.
21. Polynomial Division
Ch. 2 reviewed division. We’ll now look at polynomial division.
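\[ \underbrace{2x + 1}_{p(x)} = \underbrace{2}_{q(x)} \cdot \underbrace{x}_{d(x)} + \underbrace{1}_{r(x)} \quad\text{or}\quad \frac{\overbrace{2x + 1}^{p(x)}}{\underbrace{x}_{d(x)}} = \overbrace{2}^{q(x)} + \frac{\overbrace{1}^{r(x)}}{\underbrace{x}_{d(x)}}. \]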
The four polynomials labelled above have the same four names as before:
In the above example, it was kinda obvious that the quotient had to be q (x) = 2. In the
next example, it’s a little less obvious and we’ll have to use long division:
Example 353. Consider $(x^2 + 3) \div (x - 1)$. The dividend is $p(x) = x^2 + 3$ and the divisor
is $d(x) = x - 1$. This time, it’s not so obvious what the quotient q(x) should be. But it
turns out that just like with (simple) division, here long division can help us.
In (simple) long division, going from right to left, the columns were 1s, 10s, 100s, etc.
Here in polynomial long division, going from right to left, they’re the constant $x^0$ term,
the linear $x^1$ term, the squared $x^2$ term, etc.
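\[
\begin{array}{r|rrr|l}
\text{Terms:} & x^2 & x^1 & x^0 & \text{Explanation} \\ \hline
      &     & x   & +1 & \\ \hline
x - 1 & x^2 & +0x & +3 & \\
      & x^2 & -x  &    & x \cdot (x - 1) = x^2 - x \\ \hline
      &     & x   & +3 & (x^2 + 3) - (x^2 - x) = x + 3 \\
      &     & x   & -1 & 1 \cdot (x - 1) = x - 1 \\ \hline
      &     &     & 4  & (x + 3) - (x - 1) = 4
\end{array}
\]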
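\[ \underbrace{x^2 + 3}_{p(x)} = \underbrace{(x - 1)}_{d(x)} \cdot \underbrace{(x + 1)}_{q(x)} + \underbrace{4}_{r(x)} \quad\text{or}\quad \frac{\overbrace{x^2 + 3}^{p(x)}}{\underbrace{x - 1}_{d(x)}} = \overbrace{x + 1}^{q(x)} + \frac{\overbrace{4}^{r(x)}}{\underbrace{x - 1}_{d(x)}}. \]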
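\[
\begin{array}{r|rrr|l}
\text{Terms:} & x^2 & x^1 & x^0 & \text{Explanation} \\ \hline
       &      & \tfrac{3}{2}x  & +\tfrac{11}{4} & \\ \hline
2x - 3 & 3x^2 & +x             & -4             & \\
       & 3x^2 & -\tfrac{9}{2}x &                & \tfrac{3}{2}x \cdot (2x - 3) = 3x^2 - \tfrac{9}{2}x \\ \hline
       &      & \tfrac{11}{2}x & -4             & (3x^2 + x - 4) - (3x^2 - \tfrac{9}{2}x) = \tfrac{11}{2}x - 4 \\
       &      & \tfrac{11}{2}x & -\tfrac{33}{4} & \tfrac{11}{4} \cdot (2x - 3) = \tfrac{11}{2}x - \tfrac{33}{4} \\ \hline
       &      &                & \tfrac{17}{4}  & (\tfrac{11}{2}x - 4) - (\tfrac{11}{2}x - \tfrac{33}{4}) = \tfrac{17}{4}
\end{array}
\]
\[ q(x) = \frac{3}{2}x + \frac{11}{4} \quad\text{and}\quad r(x) = \frac{17}{4}. \]
\[ \text{Or:}\quad \frac{\overbrace{3x^2 + x - 4}^{p(x)}}{\underbrace{2x - 3}_{d(x)}} = \underbrace{\frac{3}{2}x + \frac{11}{4}}_{q(x)} + \frac{\overbrace{17/4}^{r(x)}}{\underbrace{2x - 3}_{d(x)}}. \]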
\[
\begin{array}{r|rrrr|l}
\text{Terms:} & x^3 & x^2 & x^1 & x^0 & \text{Explanation} \\ \hline
             &      &       & 2x  & +2 & \\ \hline
2x^2 - x - 1 & 4x^3 & +2x^2 & +0x & +1 & \\
             & 4x^3 & -2x^2 & -2x &    & 2x \cdot (2x^2 - x - 1) = 4x^3 - 2x^2 - 2x \\ \hline
             &      & 4x^2  & +2x & +1 & (4x^3 + 2x^2 + 1) - (4x^3 - 2x^2 - 2x) = 4x^2 + 2x + 1 \\
             &      & 4x^2  & -2x & -2 & 2 \cdot (2x^2 - x - 1) = 4x^2 - 2x - 2 \\ \hline
             &      &       & 4x  & +3 & (4x^2 + 2x + 1) - (4x^2 - 2x - 2) = 4x + 3
\end{array}
\]
\[ \text{Or:}\quad \frac{\overbrace{4x^3 + 2x^2 + 1}^{p(x)}}{\underbrace{2x^2 - x - 1}_{d(x)}} = \underbrace{2x + 2}_{q(x)} + \frac{\overbrace{4x + 3}^{r(x)}}{\underbrace{2x^2 - x - 1}_{d(x)}}. \]
Exercise 124. For each expression, do the long division and identify the dividend, divisor,
quotient, and remainder. (Answer on p. 1431.)
(a) $\dfrac{16x + 3}{5x - 2}$. (b) $\dfrac{4x^2 - 3x + 1}{x + 5}$. (c) $\dfrac{x^2 + x + 3}{-x^2 - 2x + 1}$.
The above examples and exercises suggest the following theorem and definition:
Theorem 4. (Euclidean Division Theorem for Polynomials.) Let p (x) and d (x)
be P - and D-degree polynomials in x with D < P . Then there exists a unique polynomial
q (x) of degree P − D such that r (x) = p (x) − d (x)q (x) has degree less than D.
Definition 79. Given polynomials p (x) and d (x) and the expression p (x) ÷ d (x), we
call p (x) the dividend, d (x) the divisor, the unique polynomial q (x) given in the above
theorem the quotient, and r (x) = p (x) − d (x)q(x) the remainder.
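The long-division procedure used in the examples above can also be sketched in Python. This is only an illustration, not part of the syllabus: the function name `poly_divmod` and the convention of listing coefficients from the highest power down to the constant are choices made here. Exact fractions are used so that quotients like 3/2 x + 11/4 come out exactly.

```python
from fractions import Fraction

def poly_divmod(p, d):
    """Divide polynomial p by polynomial d.

    Coefficients are listed from the highest power down to the constant,
    e.g. x^2 + 3 is [1, 0, 3]. Returns (q, r) with p = d*q + r and
    deg r < deg d.
    """
    p = [Fraction(c) for c in p]
    d = [Fraction(c) for c in d]
    if not d or d[0] == 0:
        raise ValueError("divisor must have a non-zero leading coefficient")
    q = []
    r = p[:]                     # running remainder
    while len(r) >= len(d):
        factor = r[0] / d[0]     # next coefficient of the quotient
        q.append(factor)
        # Subtract factor * d, lined up with the leading term of r.
        for i, dc in enumerate(d):
            r[i] -= factor * dc
        r.pop(0)                 # the leading term has been cancelled
    return q, r

# (x^2 + 3) / (x - 1): quotient x + 1, remainder 4
print(poly_divmod([1, 0, 3], [1, -1]))
# (4x^3 + 2x^2 + 1) / (2x^2 - x - 1): quotient 2x + 2, remainder 4x + 3
print(poly_divmod([4, 2, 0, 1], [2, -1, -1]))
```

Each pass through the loop corresponds to one "multiply and subtract" row pair in the long-division tables.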
Definition 80. Let p (x) and d (x) be polynomials. If there exists a polynomial q (x)
such that p (x) = d (x) q (x), then we say that d (x) is a factor for p (x) or divides p (x).
p (x) = (x + 1) (x + 3) .
We now state and prove the Remainder Theorem and its corollary¹⁵⁷ the Factor Theorem.
The Factor Theorem is especially useful for factorising polynomials.
Proof. (a) By Definition 79, when we divide p(x) by x − a, the quotient q(x) and remainder
r(x) are given by:
\[ p(x) = (x - a)\,q(x) + r(x). \]
Note that r (x) is a polynomial whose degree is less than 1 and which is thus a constant.
And so, let us simply write r in place of r (x).
Now simply plug in x = a to get:
p (a) = (a − a) q (a) + r = r.
157
In mathematics, a corollary is a statement that follows readily from another.
Some examples and exercises to illustrate the Remainder Theorem (RT):
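\[
\begin{array}{r|rrr}
\text{Terms:} & \text{Squared} & \text{Linear} & \text{Constant} \\ \hline
      &     & x   & -2 \\ \hline
x - 3 & x^2 & -5x & +1 \\
      & x^2 & -3x &    \\ \hline
      &     & -2x & +1 \\
      &     & -2x & +6 \\ \hline
      &     &     & -5.
\end{array}
\]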
Example 358. Consider $p(x) = 17x^5 - 5x^4 + x^2 + 1$ (a quintic polynomial). By the RT,
p(x) divided by x − 1 leaves a remainder of $p(1) = 17 \cdot 1^5 - 5 \cdot 1^4 + 1^2 + 1 = 14$.
We could’ve figured this out through long division (I didn’t bother but you can try this
as an exercise), but clearly the RT is a lot quicker.
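If you'd rather let a computer do the checking, here is a small Python sketch of the RT. The two helper names (`horner`, `synthetic_division`) are made up for this illustration; "synthetic division" is a standard shortcut for dividing by x − a and is not something this textbook has introduced.

```python
def horner(coeffs, a):
    """Evaluate a polynomial at x = a; coefficients run from highest power to constant."""
    value = 0
    for c in coeffs:
        value = value * a + c
    return value

def synthetic_division(coeffs, a):
    """Divide by (x - a). Returns (quotient coefficients, remainder)."""
    row = []
    carry = 0
    for c in coeffs:
        carry = carry * a + c
        row.append(carry)
    return row[:-1], row[-1]

# Example 358: p(x) = 17x^5 - 5x^4 + x^2 + 1 divided by x - 1
p = [17, -5, 0, 1, 0, 1]
print(horner(p, 1))                 # value of p(1)
print(synthetic_division(p, 1)[1])  # remainder on division by x - 1
```

The two printed numbers agree, which is exactly what the RT says.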
Historically, the RT rarely featured on the A-Level exams ... Which means, of course, that
it made a sudden appearance in 2017 just to screw students over.158 So yea, it’s another
thing you’ll want to remember. An exercise to help with that:
For H2 Maths, the Remainder Theorem will have little use (except as a means for screwing
students over). Instead, its corollary — the Factor Theorem (FT) — will be more useful
for factorising polynomials.
The FT tells us that p (a) = 0 if and only if x − a is a factor for p (x). And so, by guess-and-
checking numbers a for which p (a) = 0, we can factorise p (x). Let’s call this the Factor
Theorem guess-and-check method (FTGACM). Examples:
158
See Exercise 446(a) (9758 N2017/I/5).
Example 359. Factorise $p(x) = x^2 - 3x + 2$ (a quadratic polynomial).
Let’s try the FTGACM by plugging in the number 1:
$p(1) = 1^2 - 3 \cdot 1 + 2 = 0$. ✓
Wah! So “lucky”! Success on the very first try! By the FT, x − 1 is a factor for p (x).
Since p (x) is quadratic, its other factor must be a linear or 1st-degree polynomial, i.e. of
the form ax + b. To find this other factor, we could continue trying the FTGACM. But
we won’t do that.
Instead, we’ll divide x2 − 3x + 2 by x − 1. In fact, here let’s learn another method for
dividing polynomials. Write:
By comparing the coefficients on the squared and constant terms, we see that a = 1 and
b = −2. Thus, the other factor must be ax + b = x − 2. We have:
\[ p(x) = x^2 - 3x + 2 = (x - 1)(x - 2). \]
P.S. It was actually unnecessary to write the second equality (the one labelled 2) above, because just from the first equality (labelled 1) alone, you can
easily tell from the coefficients on the squared and constant terms that a = 1 and b = −2.
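For reference, the comparison of coefficients used in this example can be written out as below. This is a reconstruction from the surrounding text (the values a = 1 and b = −2 are the ones stated above), with the two equalities labelled 1 and 2:

```latex
\begin{align*}
x^2 - 3x + 2 &\overset{1}{=} (x - 1)(ax + b) \\
             &\overset{2}{=} ax^2 + (b - a)x - b.
\end{align*}
% Squared terms:  a = 1.
% Constant terms: -b = 2, so b = -2.
% Check on the linear terms: b - a = -2 - 1 = -3, as required.
```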
$p(1) = 3 \cdot 1^2 + 5 \cdot 1 + 2 = 10 \ne 0$. ✗
Aiyah, sian. Doesn’t work — by the FT, x − 1 is not a factor for p (x).
By the way, here’s a towkay time-saving tip: We don’t need to compute the exact value
of p(1) to see that since every term is positive, p(1) is clearly positive and non-zero.
Let’s keep trying the FTGACM. This time, we try −1:
From the coefficients on the squared and constant terms, we see that a = 3 and b = 2.
Thus, the other factor must be ax + b = 3x + 2. We have:
\[ x^2 - 3x + 2 = \left(x - \frac{3 - \sqrt{1}}{2}\right)\left(x - \frac{3 + \sqrt{1}}{2}\right) = (x - 1)(x - 2). \]
Example 363. $4x^2 - 12x + 9$ has discriminant $b^2 - 4ac = (-12)^2 - 4(4)(9) = 0$. Thus:
\[ 4x^2 - 12x + 9 = 4\left(x + \frac{-12}{2 \times 4}\right)^2 = 4\left(x - \frac{3}{2}\right)^2 = (2x - 3)^2. \]
Example 364. $3x^2 - 2x + 1$ has discriminant $b^2 - 4ac = (-2)^2 - 4(3)(1) < 0$. Thus, $3x^2 - 2x + 1$
cannot be factorised.
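The three cases illustrated above (positive, zero, and negative discriminant) can be sketched in Python as follows. This is an illustration only: the function name `factorise_quadratic` and its return format are assumptions made here, and the coefficients are assumed to be integers with a ≠ 0.

```python
from fractions import Fraction
from math import isqrt

def factorise_quadratic(a, b, c):
    """Classify a*x^2 + b*x + c (integers, a != 0) by its discriminant.

    Returns (discriminant, roots). roots is None if the discriminant is
    negative (no real factorisation), a pair of exact Fractions if the
    discriminant is a perfect square, and a pair of floats otherwise.
    With roots (r1, r2), the factorisation is a*(x - r1)*(x - r2).
    """
    disc = b * b - 4 * a * c
    if disc < 0:
        return disc, None
    root = isqrt(disc)
    if root * root == disc:
        return disc, (Fraction(-b - root, 2 * a), Fraction(-b + root, 2 * a))
    s = disc ** 0.5
    return disc, ((-b - s) / (2 * a), (-b + s) / (2 * a))

print(factorise_quadratic(1, -3, 2))    # x^2 - 3x + 2
print(factorise_quadratic(4, -12, 9))   # Example 363
print(factorise_quadratic(3, -2, 1))    # Example 364
```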
If your examiners are nice, a, b, c, and d should all be integers. We already know that
ac = 3 and bd = 2. So, let’s try something like a = 3, c = 1, b = 1, and d = 2:
$(3x + 1)(x + 2) = 3x^2 + 7x + 2$. ✗
If your examiners are nice, a, b, c, and d should all be integers. We already know that
ac = 6 and bd = −35. So, let’s try something like a = 3, c = 2, b = 7, and d = −5:
Aiyah, sian. Still doesn’t work. Neh’mine. Try again by switching the signs:
With quadratic (i.e. degree-2) polynomials, the quadratic formula or SSGACM will usually
be quicker than the FTGACM.
But for polynomials of degree 3 or higher, we do not know of any formula159 and the
SSGACM may be hopeless. And so it is really only with higher-degree polynomials that
the FTGACM comes in handy:
159
There are actually formulae for factorising cubic and quartic polynomials, but we haven’t learnt these.
Example 367. Factorise $p(x) = 15x^3 - 17x^2 - 22x + 24$.
If we want to try the SSGACM, we’d write:
We observe that ace = 15 and bdf = 24. We could try out different numbers, but there
are just way too many possibilities and we’d probably take way too long.
Better then to try the FTGACM. To do so, we plug in the number 1:
$p(1) = 15 \cdot 1^3 - 17 \cdot 1^2 - 22 \cdot 1 + 24 = 0$. ✓
Wah! So “lucky”! Success on the very first try! By the FT, x − 1 is a factor for p (x).
Since p (x) is cubic, p (x) divided by x − 1 gives us a quadratic polynomial ax2 + bx + c,
which we can find by writing:
The coefficients on the cubed and constant terms are a = 15 and −c = 24 (or c = −24).
The coefficients on the squared term are −a + b = −17; and so, b = −2. Thus:
Let’s now factorise $15x^2 - 2x - 24$. Here, let’s just use the quadratic formula. We have
$b^2 - 4ac = (-2)^2 - 4(15)(-24) = 4 + 1440 = 1444 > 0$. Moreover, $\sqrt{1444} = 38$. Thus:
\[ 15x^2 - 2x - 24 = 15\left(x - \frac{2 - 38}{30}\right)\left(x - \frac{2 + 38}{30}\right) = 15\left(x + \frac{6}{5}\right)\left(x - \frac{4}{3}\right). \]
Altogether, we have:
\[ 15x^3 - 17x^2 - 22x + 24 = 15(x - 1)\left(x + \frac{6}{5}\right)\left(x - \frac{4}{3}\right) = (x - 1)(5x + 6)(3x - 4). \]
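The guess-and-check in this example can be automated. The sketch below tries every candidate p/q where p divides the constant term and q divides the leading coefficient. Note that this candidate list comes from the standard rational root theorem, which goes a little beyond the guessing described in the text; the function names are made up for this illustration, and the polynomial is assumed to have integer coefficients and a non-zero constant term.

```python
from fractions import Fraction

def evaluate(coeffs, x):
    """Evaluate a polynomial (coefficients from highest power to constant) at x."""
    value = Fraction(0)
    for c in coeffs:
        value = value * x + c
    return value

def divisors(n):
    """Positive divisors of a non-zero integer n."""
    n = abs(n)
    return [k for k in range(1, n + 1) if n % k == 0]

def rational_roots(coeffs):
    """All rational a with p(a) = 0, for integer coefficients and non-zero constant term."""
    lead, const = coeffs[0], coeffs[-1]
    candidates = {
        Fraction(sign * p, q)
        for p in divisors(const)
        for q in divisors(lead)
        for sign in (1, -1)
    }
    return sorted(r for r in candidates if evaluate(coeffs, r) == 0)

# Example 367: 15x^3 - 17x^2 - 22x + 24
print(rational_roots([15, -17, -22, 24]))
```

By the FT, each root a found this way gives a factor x − a; the three roots here match the factors (5x + 6), (x − 1), and (3x − 4) found above.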
But even with higher-degree polynomials, it may sometimes be quicker to use the SSGACM
(than the FTGACM), especially if the coefficients aren’t too big or are equal to zero:
Since ace = 1 and bdf = −6, why not try a = c = e = 1 and b = 2, d = −3, and f = 1:
$(x + 2)(x - 3)(x + 1) = (x^2 - x - 6)(x + 1) = x^3 - x^2 - 6x + x^2 - x - 6 = x^3 - 7x - 6$. ✓
As we’ve seen, factorising polynomials takes wisdom, intuition, and some luck. You’ll have
to learn to judge which tool will get you the answer most quickly.
Example 369. Let f be a continuous function. Suppose we have most of f ’s graph, but
are missing the interval (1, 3). Say we know that f (1) = −2 and f (3) = 2. What then can
we say about the missing portion of the graph?
[Figure: graph of f with the portion over the interval (1, 3) missing; the points (1, −2) and (3, 2) are marked.]
Since f is continuous, it must be that we can draw its entire graph without lifting our
pencil. In particular, we can connect the dots (1, −2) and (3, 2) without lifting our
pencil. But obviously, the only way to do so is to have our pencil “go through” every
value between −2 and 2. Hence, f must take on every value between −2 and 2.
And yup, that’s all the IVT says — if f is continuous on the interval [a, b], then f must
“hit” every value between f (a) and f (b) in the interval (a, b). A bit more formally:
$p(1) = 2 \cdot 1^2 + 9 \cdot 1 - 5 = 6 \ne 0$. ✗
Aiyah, sian. Doesn’t work — by the FT, x − 1 is not a factor for p (x).
We could continue trying the FTGACM. But here let’s first enlist the help of the IVT.
Observation: p (0) = −5. What good is that observation?
Well, since p (0) < 0 < p (1), the IVT says there must be some 0 < c < 1 such that p (c) = 0.
Cool!
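The IVT idea used here (a sign change on an interval traps a root inside it) is also the basis of the bisection method, which is not part of this textbook but makes a nice illustration. Below is a minimal Python sketch; the function name `bisect_root` and the tolerance values are choices made here, and f is assumed to be continuous on the interval.

```python
def bisect_root(f, lo, hi, tol=1e-12, max_iter=200):
    """Approximate a root of f in [lo, hi].

    Assumes f is continuous and that f(lo) and f(hi) have opposite signs,
    so that the IVT guarantees a root in between.
    """
    f_lo, f_hi = f(lo), f(hi)
    if f_lo == 0:
        return lo
    if f_hi == 0:
        return hi
    if (f_lo < 0) == (f_hi < 0):
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if f_mid == 0 or (hi - lo) / 2 < tol:
            return mid
        if (f_mid < 0) == (f_lo < 0):
            lo, f_lo = mid, f_mid   # the sign change is in [mid, hi]
        else:
            hi = mid                # the sign change is in [lo, mid]
    return (lo + hi) / 2

def p(x):
    return 2 * x**2 + 9 * x - 5

print(bisect_root(p, 0, 1))
```

For this particular p, the very first midpoint tried is 0.5.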
So let’s continue trying the FTGACM by plugging in the number 0.5:
The coefficients on the squared and constant terms are 2a = 2 and −b = −5. And so, a = 1
and b = 5. Altogether, we have:
For the A-Levels, you will routinely have to factorise quadratic polynomials. You may
sometimes also have to factorise cubic polynomials.
It’s unusual that they ask you to factorise polynomials of a higher order. And if they do,
the friendly folks at the MOE will usually be nice enough to give you a little help.160
In the next example, we factorise a quartic polynomial using what we’ve learnt. It’s long
and tedious, but conceptually not any harder than what we’ve already done:
160
See e.g. Exercise 552(b) — 9740 N2010/II/1.
Example 371. Factorise $p(x) = 6x^4 + 13x^3 - 29x^2 - 52x + 20$ (a quartic polynomial).
We’ll start by trying the FTGACM. We plug in the number 1:
$p(1) = 6 \cdot 1^4 + 13 \cdot 1^3 - 29 \cdot 1^2 - 52 \cdot 1 + 20 < 0$. ✗
(Again, we can tell p(1) < 0 even without computing its exact value.)
Aiyah, sian. Doesn’t work — by the FT, x − 1 is not a factor for p (x).
But now, observe that p (0) = 20 > 0 > p(1). And so, the IVT says there must be some
value 0 < q < 1 such that p (q) = 0. So, let’s stick with the FTGACM, but now try 1/2:
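\[ p\left(\frac{1}{2}\right) = 6 \cdot \left(\frac{1}{2}\right)^4 + 13 \cdot \left(\frac{1}{2}\right)^3 - 29 \cdot \left(\frac{1}{2}\right)^2 - 52 \cdot \left(\frac{1}{2}\right) + 20 = \frac{6}{16} + \frac{13}{8} - \frac{29}{4} - \frac{52}{2} + 20 < 0. \] ✗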
(Again, we can tell p (1/2) is negative, even without computing its exact value.)
Aiyah, sian. Still doesn’t work — by the FT, x − 1/2 is not a factor for p (x).
But again, since p (0) = 20 > 0 > p (1/2), the IVT says that there must be some value
0 < r < 1/2 such that p (r) = 0. So, let’s stick with the FTGACM, but now try 1/3:
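\begin{align*}
p\left(\frac{1}{3}\right) &= 6 \cdot \left(\frac{1}{3}\right)^4 + 13 \cdot \left(\frac{1}{3}\right)^3 - 29 \cdot \left(\frac{1}{3}\right)^2 - 52 \cdot \left(\frac{1}{3}\right) + 20 \\
&= \frac{6}{81} + \frac{13}{27} - \frac{29}{9} - \frac{52}{3} + 20 = \frac{2}{27} + \frac{13}{27} - \frac{29}{9} + \frac{8}{3} = \frac{15}{27} - \frac{5}{9} = 0.
\end{align*}
✓
Yay, works! By the FT, x − 1/3 is a factor for $6x^4 + 13x^3 - 29x^2 - 52x + 20$.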
If x − 1/3 is a factor, so too is 3 (x − 1/3) = 3x − 1. So write:
The coefficients on the 4th-degree and constant terms are 3a = 6 and −d = 20. And so,
a = 2 and d = −20. Next, to find b and c, examine the coefficients on the cubed and linear
terms, which are 3b − a = 13 and 3d − c = −52. And so, b = 5 and c = −8. Thus:
We must now factorise $2x^3 + 5x^2 - 8x - 20$. Once again, let’s try the FTGACM.
By the way, here’s an additional trick to help you factorise polynomials. When trying
the FTGACM, you should try numbers that are factors of the constant term. So in this
case, the constant term is −20 = −2 × 2 × 5. So why not we try plugging in 2:
$2 \cdot 2^3 + 5 \cdot 2^2 - 8 \cdot 2 - 20 = 16 + 20 - 16 - 20 = 0$. ✓
The coefficients on the cubed and constant terms are e = 2 and −2g = −20. And so, g = 10.
To find f, look at the coefficients on the squared terms, which are f − 2e = 5. And so,
f = 9. Thus, $ex^2 + fx + g = 2x^2 + 9x + 10$.
To factorise $2x^2 + 9x + 10$, we use the SSGACM. Let’s try:
Wah! So lucky! Success on the very first try! And now, at long last, we’re done:
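As a sanity check, we can multiply the factors back together in Python. The factors 3x − 1 and x − 2 are the ones found above; the split 2x² + 9x + 10 = (2x + 5)(x + 2) is worked out here for this illustration (it is easy to verify by expanding), and the helper `poly_mul` is a made-up name.

```python
from functools import reduce

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (highest power first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pc in enumerate(p):
        for j, qc in enumerate(q):
            out[i + j] += pc * qc
    return out

# (3x - 1), (x - 2), (2x + 5), (x + 2)
factors = [[3, -1], [1, -2], [2, 5], [1, 2]]
product = reduce(poly_mul, factors)
print(product)  # coefficients of the expanded product, highest power first
```

The printed coefficients should be those of 6x⁴ + 13x³ − 29x² − 52x + 20.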
All of our examples have so far involved an nth-degree polynomial that can be fully factorised
into n linear (or degree-1) factors. But this need not always be the case:
x4 − 1 = (x2 + 1) (x + 1) (x − 1) .
161
Actually, with complex numbers, further factorisation is possible. We can write $x^2 + 1 = (x + i)(x - i)$
and thus $x^4 - 1 = (x + i)(x - i)(x + 1)(x - 1)$. Indeed, as we’ll see later, the Fundamental Theorem of
Algebra (Theorem 11) guarantees that with the aid of complex numbers, any nth-degree polynomial
can be factorised into n linear factors.
162
Again, with complex numbers, we can actually write $x^2 + 1 = (x + i)(x - i)$ and $x^2 + 4 = (x + 2i)(x - 2i)$.
Thus, $x^4 + 5x^2 + 4 = (x + i)(x - i)(x + 2i)(x - 2i)$.
163
Again, with complex numbers, it is actually possible to factorise $x^4 + x + 1$ into four linear factors.
Exercise 126. Factorise the following polynomials. (Answer on p. 1432.)
Exercise 127. Let $p(x) = ax^4 + bx^3 - 31x^2 + 3x + 3$, where a and b are constants. You are
told that (i) p(x) divided by x − 1 leaves a remainder of 5; and (ii) p(0.5) = 0.
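\[ x^2 + y^2 = 1 \qquad \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1 \]
\[ y = \frac{1}{x} \qquad x^2 - y^2 = 1 \qquad \frac{x^2}{a^2} - \frac{y^2}{b^2} = 1 \]
\[ \frac{y^2}{b^2} - \frac{x^2}{a^2} = 1 \qquad y = \frac{bx + c}{dx + e} \qquad y = \frac{ax^2 + bx + c}{dx + e} \]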
Fun Fact
The Greek word hyperbola is closely related to the English word hyperbole, which means
an exaggeration or overstatement.
By the way, we’ve actually already studied one example of a conic section: the
graph of the quadratic equation $y = ax^2 + bx + c$, which is a type of conic section called the
parabola.
164
For why these are called conic sections, see Ch. 117.8 (Appendices).
22.1. The Ellipse $x^2 + y^2 = 1$ (The Unit Circle)
Let’s do an O-Level review of why the equation $x^2 + y^2 = 1$ describes the unit circle (i.e.
radius 1) centred on the origin.
Consider any point $A = (A_x, A_y)$ on the unit circle. We can use it to form a right triangle,
with base $A_x$, height $A_y$, and hypotenuse 1. By the Pythagorean Theorem, $A_x^2 + A_y^2 = 1^2 = 1$.
This proves that every point on the unit circle satisfies the equation $x^2 + y^2 = 1$.
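[Figure: the unit circle x² + y² = 1, with a point A = (A_x, A_y) on it forming a right triangle with base A_x, height A_y, and hypotenuse 1; (0, 1) is marked as a strict global maximum.]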
2. Then stretch $(x/a)^2 + y^2 = 1$ vertically, outwards from the x-axis, by a factor of b, to get
$(x/a)^2 + (y/b)^2 = 1$.
Thus, $(x/a)^2 + (y/b)^2 = 1$ is simply the unit circle stretched horizontally and vertically by
factors of a and b. We call this “elongated” or “imperfect” circle an ellipse. Note that this
ellipse remains centred on the origin (0, 0).
[Figure: the ellipse (x/a)² + (y/b)² = 1, with x-intercepts at −a and a, lines of symmetry x = 0 and y = 0, and (0, b) marked as a strict global maximum.]
1. Intercepts. The y-intercepts are (0, −b) and (0, b). The x-intercepts are (−a, 0) and
(a, 0).
2. Turning points. By observation, there are two turning points — (0, b) is a strict global
maximum and (0, −b) is a strict global minimum.
3. Asymptotes. By observation, there are no asymptotes.
4. Symmetry. By observation, if a ≠ b, then there are only two lines of symmetry, namely
y = 0 (the x-axis) and x = 0 (the y-axis). (Note that if a = b, then this ellipse is in fact a
circle and there are again infinitely many lines of symmetry.)
165
Read Ch. 16 if you haven’t already.
Exercise 128. Graph the equation below (a, b, c, d ∈ R and a, b ≠ 0). Label any turning
points, asymptotes, lines of symmetry, and intercepts. (Hint in footnote.)166
\[ \frac{(x + c)^2}{a^2} + \frac{(y + d)^2}{b^2} = 1. \] (Answer on p. 1434.)
The rest of this chapter will look at six examples of hyperbolae. Our first and also the
simplest example of a hyperbola is y = 1/x:
166
To find the y-intercepts, plug in x = 0. To find the x-intercepts, plug in y = 0.
22.3. The Hyperbola $y = 1/x$
All hyperbolae we’ll study will share some common features:
1. There’ll be two branches — y = 1/x has a bottom-left branch and a top-right branch.
2. There may or may not be x- and y-intercepts — y = 1/x has neither.
3. There may or may not be turning points — y = 1/x has none.
4. There’ll be two asymptotes — y = 1/x has horizontal asymptote y = 0, because as
x → −∞, y → 0− and as x → ∞, y → 0+ . Also, y = 1/x has the vertical asymptote x = 0,
because as x → 0− , y → −∞ and as x → 0+ , y → ∞.
A rectangular hyperbola is any hyperbola whose two asymptotes are perpendicular
— thus, y = 1/x is an example of a rectangular hyperbola.
5. The hyperbola’s centre is the point at which the two asymptotes intersect167 — y = 1/x
has centre (0, 0).
6. There’ll be two lines of symmetry — each (a) passes through the centre; and (b)
bisects an angle formed by the two asymptotes.
y = 1/x has two lines of symmetry: y = x and y = −x. Observe that indeed, each (a)
passes through the centre; and (b) bisects an angle formed by the two asymptotes.
[Figure: graph of y = 1/x, with horizontal asymptote y = 0, vertical asymptote x = 0, centre (0, 0), and lines of symmetry y = x and y = −x.]
167
For simplicity, this shall be this textbook’s definition of a hyperbola’s centre.
Note though that in the usual and proper study of conic sections, the centre is instead defined as the
midpoint of the line segment connecting the two foci. That the two asymptotes intersect at the centre
is then a result rather than a definition. However, in H2 Maths, there is no mention of foci and so I
thought it better to simply define the centre as the intersection point of the two asymptotes.
22.4. The Hyperbola $x^2 - y^2 = 1$
Consider the equation $x^2 - y^2 = 1$. Notice that no $x \in (-1, 1)$ satisfies this equation. Hence,
this graph contains no points for which $x \in (-1, 1)$.
[Figure: graph of x² − y² = 1, with x-intercepts at −1 and 1, oblique asymptotes y = x and y = −x, centre (0, 0), and lines of symmetry y = 0 and x = 0.]
1. There are two branches — one on the left and another on the right.
2. Intercepts. The x-intercepts are (−1, 0) and (1, 0). There are no y-intercepts.
3. There are no turning points.
4. As x → −∞, y → ±x. And as x → ∞, y → ±x. So, $x^2 - y^2 = 1$ has two oblique
asymptotes y = ±x.
Since the two asymptotes y = x and y = −x are perpendicular, this is again a rectangular
hyperbola. (In fact, we call this an “east-west” rectangular hyperbola.)
5. The hyperbola’s centre is (0, 0).
6. The two lines of symmetry are y = 0 (the x-axis) and x = 0 (the y-axis). Observe
that each (a) passes through the centre; and (b) bisects an angle formed by the two
asymptotes.
equation. Hence, this graph again contains no points for which x ∈ (−a, a).
Two transformations will get us from $x^2 - y^2 = 1$ to $(x/a)^2 - (y/b)^2 = 1$:
$(x/a)^2 - (y/b)^2 = 1$.
[Figure: graph of (x/a)² − (y/b)² = 1, with x-intercepts at −a and a, oblique asymptotes y = −(b/a)x and y = (b/a)x, centre (0, 0), and lines of symmetry y = 0 and x = 0.]
1. There are two branches — one on the left and another on the right.
2. Intercepts. The x-intercepts are (−a, 0) and (a, 0). There are no y-intercepts.
3. There are no turning points.
4. As x → −∞, y → ±bx/a. And as x → ∞, y → ±bx/a. So, $(x/a)^2 - (y/b)^2 = 1$ has two
[Figure: graph of y²/b² − x²/a² = 1, with b and −b marked on the y-axis, oblique asymptotes y = −(b/a)x and y = (b/a)x, centre (0, 0), and lines of symmetry y = 0 and x = 0.]
But to warm up, let’s first study the simpler case where a = 0. That is, let’s first study:
\[ y = \frac{bx + c}{dx + e}. \]
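[Figure: graph of y = (2x + 1)/(x + 1), with vertical asymptote x = −1, horizontal asymptote y = 2, centre (−1, 2), lines of symmetry y = −x + 1 and y = x + 3, y-intercept (0, 1), and x-intercept (−1/2, 0).]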
1. There are two branches — one on the top-left and another on the bottom-right.
2. Intercepts. Plug in x = 0 to get y = (2 ⋅ 0 + 1) / (0 + 1) = 1 — the y-intercept is (0, 1).
Plug in y = 0 to get 2x + 1 = 0 or x = −1/2 — the x-intercept is (−1/2, 0).
3. There are no turning points.
4. As x → −1− , y → ∞. And as x → −1+ , y → −∞. So, y = (2x + 1) / (x + 1) has vertical
asymptote x = −1. (Not coincidentally, this is the x-value for which x + 1 = 0.)
As x → −∞, y → 2+ . And as x → ∞, y → 2− . So, y = (2x + 1) / (x + 1) has horizontal
asymptote y = 2. (Not coincidentally, this is the quotient in the above long division.)
Since the two asymptotes y = 2 and x = −1 are perpendicular, this is again a rectangular
hyperbola.
5. The hyperbola’s centre (the point at which the two asymptotes intersect) is (−1, 2).
(The centre’s coordinates are simply given by the vertical and horizontal asymptotes.)
6. The two lines of symmetry are y = x + 3 and y = −x + 1.
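\[
\begin{array}{r|rr}
       &    & b/d \\ \hline
dx + e & bx & +c \\
       & bx & +be/d \\ \hline
       &    & c - be/d
\end{array}
\]
\[ \text{Thus:}\quad \frac{bx + c}{dx + e} = \frac{b}{d} + \frac{c - be/d}{dx + e} = \frac{b}{d} + \frac{cd - be}{d^2} \cdot \frac{1}{x + e/d}. \]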
[Figure: graph of y = (7x + 3)/(2x + 4), with vertical asymptote x = −2, lines of symmetry y = −x + 3/2 and y = x + 11/2, and x-intercept (−3/7, 0).]
1. There are two branches — one on the top-left and another on the bottom-right.
2. Intercepts. Plug in x = 0 to get y = 3/4. So, the y-intercept is (0, 3/4).
Plug in y = 0 to get 7x + 3 = 0 or x = −3/7. So, the x-intercept is (−3/7, 0).
3. There are no turning points.
4. Asymptotes. The value of x that makes the denominator 0 is −2 — hence, the
vertical asymptote is x = −2.
The quotient in the long division is 7/2 — hence, the horizontal asymptote is y = 7/2.
Since the two asymptotes x = −2 and y = 7/2 are perpendicular, this is again a rectangular
hyperbola.
5. The hyperbola’s centre (the point at which the two asymptotes intersect) is (−2, 7/2).
(These coordinates are given by the vertical and horizontal asymptotes.)
6. The two lines of symmetry may be written as y = x + α and y = −x + β and pass
through the centre (−2, 7/2). Plugging in the numbers, we find that α = 11/2 and
β = 3/2. Thus, the two lines of symmetry are y = x + 11/2 and y = −x + 3/2.
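[Figure: graph of y = (−5x + 1)/(3x + 2), with vertical asymptote x = −2/3, horizontal asymptote y = −5/3, centre (−2/3, −5/3), and lines of symmetry y = x − 1 and y = −x − 7/3.]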
1. There are two branches: one on the bottom-left and another on the top-right.
2. Intercepts. Plug in x = 0 to get y = 1/2. So, the y-intercept is (0, 1/2).
Plug in y = 0 to get −5x + 1 = 0 or x = 1/5. So, the x-intercept is (1/5, 0).
3. There are no turning points.
4. Asymptotes. The value of x that makes the denominator 0 is −2/3 — hence, the
vertical asymptote is x = −2/3.
The quotient in the long division is −5/3 — hence, the horizontal asymptote is y = −5/3.
Since the two asymptotes x = −2/3 and y = −5/3 are perpendicular, this is again a
rectangular hyperbola.
5. The hyperbola’s centre (the point at which the two asymptotes intersect) is
(−2/3, −5/3). (These coordinates are given by the vertical and horizontal asymptotes.)
6. The two lines of symmetry may be written as y = x + α and y = −x + β and pass
through the centre (−2/3, −5/3). Plugging in the numbers, we find that α = −1 and
β = −7/3. Thus, the two lines of symmetry are y = x − 1 and y = −x − 7/3.
(a) Intercepts. If e ≠ 0, then there is one y-intercept (0, c/e). (If e = 0, then there are
no y-intercepts.) And if b ≠ 0, then there is one x-intercept (−c/b, 0). (If b = 0, then
there are no x-intercepts.)
(b) There are no turning points.
(c) There is the horizontal asymptote y = b/d and the vertical asymptote x = −e/d.
(The asymptotes are perpendicular and so, this is a rectangular hyperbola.)
(d) The hyperbola’s centre is (−e/d, b/d).
(e) The two lines of symmetry are y = x + (b + e)/d and y = −x + (b − e)/d.
Proof. We proved (a), (c), and (d) above. For (b) and (e), see p. 1277 (Appendices).
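The features listed in this Fact are easy to compute mechanically. Here is a minimal Python sketch; the function name and the dictionary keys are made up for this illustration. It assumes d ≠ 0 and cd − be ≠ 0 (otherwise the long division above leaves no 1/(x + e/d) term and the graph is not a hyperbola), and it takes the two lines of symmetry to be y = x + α and y = −x + β through the centre, as in the worked examples.

```python
from fractions import Fraction

def rational_hyperbola_features(b, c, d, e):
    """Features of y = (b*x + c) / (d*x + e).

    Assumptions for this sketch: d != 0 and c*d - b*e != 0.
    """
    b, c, d, e = (Fraction(v) for v in (b, c, d, e))
    if d == 0 or c * d - b * e == 0:
        raise ValueError("need d != 0 and cd - be != 0")
    centre = (-e / d, b / d)
    return {
        "x_intercept": (-c / b, Fraction(0)) if b != 0 else None,
        "y_intercept": (Fraction(0), c / e) if e != 0 else None,
        "vertical_asymptote_x": centre[0],
        "horizontal_asymptote_y": centre[1],
        "centre": centre,
        # alpha and beta in y = x + alpha and y = -x + beta
        "symmetry_alpha_beta": ((b + e) / d, (b - e) / d),
    }

# y = (7x + 3) / (2x + 4)
for name, value in rational_hyperbola_features(7, 3, 2, 4).items():
    print(name, value)
```

You can use this to check your answers to the exercise that follows, but do make sure you can also find the intercepts and asymptotes by hand.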
For the hyperbola y = (bx + c) / (dx + e), you should know how to find (a) the x- and
y-intercepts; and (c) the horizontal and vertical asymptotes.
You’re not required to know what (d) the hyperbola’s centre is, but since this is simply
the intersection of the two asymptotes (which you already know how to find), you might
as well know it, since it’ll help you sketch better graphs.
You’re also not required to know how to find (e) the equations of the two lines of sym-
metry. But as we’ve shown in the above examples, it is not very difficult to figure out
their equations. It is certainly not very difficult for you to at least sketch them.
After you are done with Part V (Calculus), there is a small possibility that you are
required to prove that (b) this hyperbola has no turning points. (And so you may or may
not be interested in reading the proof of (b) in the Appendices.)
Exercise 129. Graph and describe the features of the following equations.
(a) $y = \dfrac{3x + 2}{x + 2}$. (Answer on p. 1436.)
(b) $y = \dfrac{x - 2}{-2x + 1}$. (Answer on p. 1437.)
(c) $y = \dfrac{-3x + 1}{2x + 3}$. (Answer on p. 1438.)
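\[ y = \frac{x^2 + 1}{x} \]
Do the long division:
\[
\begin{array}{r|rrr}
  &     & x & \\ \hline
x & x^2 &   & +1 \\
  & x^2 &   & \\ \hline
  &     &   & 1
\end{array}
\]
\[ \implies y = \frac{x^2 + 1}{x} = x + \frac{1}{x}. \]
[Figure: graph of y = (x² + 1)/x, with turning points (1, 2) and (−1, −2), vertical asymptote x = 0, oblique asymptote y = x, centre (0, 0), and lines of symmetry y = (1 + √2)x and y = (1 − √2)x.]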
1. There are two branches — one on the bottom-left and another on the top-right.
2. Intercepts. If we plug in x = 0, then y is undefined. Thus, there are no y-intercepts.
And if we plug in y = 0, then x2 + 1 = 0, an equation for which there are no (real)
solutions. Thus, there are no x-intercepts.
3. There are two turning points: (−1, −2) is a strict local maximum and (1, 2) is a strict
local minimum.
4. Asymptotes. The value of x that makes the denominator 0 is 0; hence, the vertical
asymptote is x = 0. The quotient in the long division is x; hence, the oblique
asymptote is y = x. (By the way, here for the first time, the asymptotes are not
perpendicular and so, this is a non-rectangular hyperbola.)
5. The hyperbola’s centre is (0, 0). Recall that the centre is simply the point at which
the two asymptotes intersect. In this example, the intersection of the asymptotes x = 0
and y = x is (0, 0).
6. The two lines of symmetry are y = (1 ± √2) x.
Note that if b² − 4ac < 0, then there are no x-intercepts (this was the case in the last
example). And if b² − 4ac = 0, then there is exactly one x-intercept, namely (−b/ (2a) , 0).
• Asymptotes. The value of x that makes the denominator dx + e zero is x = −e/d and
gives us the vertical asymptote x = −e/d. To find the other (oblique) asymptote, do
the long division:
The quotient gives us the oblique asymptote y = ax/d + (bd − ae) /d². (Since the
asymptotes are not perpendicular, this hyperbola is not rectangular.)
• The centre is the point at which the two asymptotes intersect. Its x-coordinate is given
by the vertical asymptote x = −e/d. For its y-coordinate, plug x = −e/d into the equation
of the oblique asymptote:
y = (a/d) (−e/d) + (bd − ae)/d² = (−ae + bd − ae)/d² = (bd − 2ae)/d².
Thus, the centre is: (−e/d, (bd − 2ae)/d²).
• You need not know how to find the equations of the lines of symmetry.
You should however know how to roughly sketch them. So, just remember that they (a)
pass through the centre; and (b) bisect the angles formed by the two asymptotes.
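The formulas above for the asymptotes and centre can be checked with a short computation. This is only a minimal sketch, assuming a and d are nonzero; the function name `hyperbola_features` is made up for illustration and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def hyperbola_features(a, b, c, d, e):
    """Asymptotes and centre of y = (a x^2 + b x + c) / (d x + e).

    Assumes a != 0 and d != 0. Returns (vertical, slope, intercept, centre),
    where the vertical asymptote is x = vertical and the oblique asymptote
    is y = slope * x + intercept.
    """
    a, b, c, d, e = (Fraction(v) for v in (a, b, c, d, e))
    vertical = -e / d                     # x-value that makes dx + e zero
    slope = a / d                         # coefficient of x in the quotient
    intercept = (b * d - a * e) / d ** 2  # constant term of the quotient
    centre = (vertical, slope * vertical + intercept)
    return vertical, slope, intercept, centre

# y = (x^2 + 3x + 1) / (x + 1)
print(hyperbola_features(1, 3, 1, 1, 1))
# y = (2x^2 + 2x + 1) / (-x + 1)
print(hyperbola_features(2, 2, 1, -1, 1))
```

The coefficient c does not affect the asymptotes or the centre; it only enters the remainder of the long division.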
y = (x² + 3x + 1)/(x + 1) = x + 2 − 1/(x + 1).
1. There are two branches — one on the left and another on the right.
2. Intercepts. Plug in x = 0 to get y = 1/1 = 1. Thus, the y-intercept is (0, 1). Plug in
y = 0 to get x² + 3x + 1 = 0; thus, the two x-intercepts are (0.5 (−3 ± √5) , 0).
3. There are no turning points.
4. Asymptotes. The value of x that makes the denominator 0 is −1 — hence, the
vertical asymptote is x = −1. The quotient in the long division is x + 2 — hence, the
oblique asymptote is y = x + 2.
5. The centre’s x-coordinate is given by the vertical asymptote x = −1. For its y-
coordinate, plug x = −1 into the oblique asymptote to get y = −1 + 2 = 1. Hence, the
centre is (−1, 1).
6. The two lines of symmetry are y = (1 ± √2) x + 2 ± √2.
[Figure: graph of y = (x² + 3x + 1)/(x + 1), showing the asymptotes x = −1 and y = x + 2, the centre (−1, 1), the y-intercept (0, 1), the x-intercepts ((−3 ± √5)/2, 0), and the lines of symmetry y = (1 ± √2) x + 2 ± √2.]
y = (2x² + 2x + 1)/(−x + 1) = −2x − 4 + 5/(−x + 1).
1. There are two branches — one on the top-left and another on the bottom-right.
2. Intercepts. Plug in x = 0 to get y = 1/1 = 1. Thus, the y-intercept is (0, 1).
Plug in y = 0 to get 2x2 + 2x + 1 = 0, an equation for which there are no (real) solutions.
Thus, there are no x-intercepts.
3. The two turning points are (1 ± 0.5√10, −6 ∓ 2√10).
4. Asymptotes. The value of x that makes the denominator 0 is 1 — hence, the vertical
asymptote is x = 1. The “quotient” in the long division is −2x − 4 — hence, the oblique
asymptote is y = −2x − 4.
5. The centre’s x-coordinate is given by the vertical asymptote x = 1. For its y-
coordinate, plug x = 1 into the oblique asymptote to get y = −2 (1) − 4 = −6. Hence,
the centre is (1, −6).
6. The two lines of symmetry are y = (−2 ± √5) x − 4 ∓ √5.
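[Figure: graph of y = (2x² + 2x + 1)/(−x + 1), showing the asymptotes x = 1 and y = −2x − 4, the centre (1, −6), the y-intercept (0, 1), the turning points (1 − √10/2, −6 + 2√10) and (1 + √10/2, −6 − 2√10), and the lines of symmetry y = (−2 + √5) x − 4 − √5 and y = (−2 − √5) x − 4 + √5.]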
For the hyperbola y = (ax² + bx + c)/(dx + e),
you now know how to find its intercepts, asymptotes, and centre.168
And after we’ve done Calculus (Part V), you’ll also be able to find the turning points.
To repeat, you do not need to know how to find the equations of the two lines of sym-
metry. However, you should at least be able to roughly sketch them.
Exercise 130. Graph the equations below. Label any intercepts, asymptotes, and centre.
Roughly indicate or sketch any turning points and lines of symmetry.
(a) y = (x² + 2x + 1)/(x − 4). (Answer on p. 1439.)
(b) y = (−x² + x − 1)/(x + 1). (Answer on p. 1440.)
(c) y = (2x² − 2x − 1)/(x + 4). (Answer on p. 1441.)
168
Fact 198 in the Appendices summarises the features of this hyperbola.
23. Simple Parametric Equations
We can sometimes describe a graph (i.e. a set of points) using an equation. We can
sometimes also describe a graph using parametric equations:
Example 381. We can describe the unit circle centred on the origin with the
equation x2 + y 2 = 1. This graph (set of points) is the set S = {(x, y) ∶ x2 + y 2 = 1}.
Recall169 that sin2 t + cos2 t = 1. And so, observe that by letting x = cos t and y = sin t, we
have x2 + y 2 = 1. We thus have a second method for writing down the set S:
S = {(x, y) ∶ x² + y² = 1} = {(x, y) ∶ x = cos t, y = sin t, t ≥ 0}.
[Figure: the unit circle, with arrows indicating the particle’s instantaneous direction of travel, and its position (x, y), velocity (vx, vy), and acceleration (ax, ay) marked at t = 0, t = 1, and t = 5π/4.]
vx = dx/dt = − sin t.
And its velocity in the y-direction is the rate of change of displacement in the y-direction
with respect to time t, i.e. the (first) derivative of y w.r.t. t:
vy = dy/dt = cos t.
Altogether, (vx , vy ) = (− sin t, cos t). And so:
• At time t = 0, the particle P has velocity (vx , vy ) = (− sin 0, cos 0) = (0, 1) — it is
moving upwards at 1 m s−1 (and not moving rightwards at all).
• At time t = 1, P has velocity (vx , vy ) = (− sin 1, cos 1) ≈ (−0.84, 0.54) — it is moving
leftwards at 0.84 m s−1 and upwards at 0.54 m s−1 .
• At time t = 5π/4, P has velocity
(vx, vy) = (− sin (5π/4) , cos (5π/4)) = (√2/2, −√2/2) ≈ (0.71, −0.71).
At time t = 5π/4, P is moving rightwards at √2/2 m s⁻¹ and downwards at √2/2 m s⁻¹.
ax = dvx/dt = d²x/dt² = − cos t.
And its acceleration in the y-direction is the rate of change of velocity in the y-direction
with respect to time t or, in other words, the second derivative of y w.r.t. t:
ay = dvy/dt = d²y/dt² = − sin t.
Altogether, (ax , ay ) = (− cos t, − sin t). And so:
• At time t = 0, the particle P has acceleration (ax , ay ) = (− cos 0, − sin 0) = (−1, 0) — it
is accelerating leftwards at 1 m s−2 (and not upwards at all).
• At time t = 1, P has acceleration (ax , ay ) = (− cos 1, − sin 1) ≈ (−0.54, −0.84) — it is
accelerating leftwards at 0.54 m s−2 and downwards at 0.84 m s−2 . Note that at t = 1,
we have vy = 0.54 > 0 but ay = −0.84 < 0 — this means P is still moving upwards, but
this upwards movement is slowing down.
• At time t = 5π/4, P has acceleration
(ax, ay) = (− cos (5π/4) , − sin (5π/4)) = (√2/2, √2/2) ≈ (0.71, 0.71).
At time t = 5π/4, P is accelerating rightwards at √2/2 m s⁻² and also upwards at √2/2 m s⁻².
Note that at t = 5π/4, we have vy = −√2/2 < 0 but ay = √2/2 > 0. This means P is still
moving downwards, but this downwards movement is slowing down.170
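The position, velocity, and acceleration of P at any instant can be tabulated with a few lines of code. This is a minimal sketch that simply evaluates the formulas derived above; the function name `circle_motion` is made up for illustration.

```python
import math

def circle_motion(t):
    """Position, velocity, and acceleration of P at time t,
    for x = cos t and y = sin t."""
    position = (math.cos(t), math.sin(t))
    velocity = (-math.sin(t), math.cos(t))       # (dx/dt, dy/dt)
    acceleration = (-math.cos(t), -math.sin(t))  # (d2x/dt2, d2y/dt2)
    return position, velocity, acceleration

for t in (0, 1, 5 * math.pi / 4):
    pos, vel, acc = circle_motion(t)
    print(f"t = {t:.2f}: position {pos}, velocity {vel}, acceleration {acc}")
```

At every instant, the acceleration is the negative of the position, which is one way to see that it always points towards the centre of the circle.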
Exercise 131. Particle Q travels on the same plane as P (from the above example). Q’s
position is described by {(x, y) ∶ x = sin t, y = cos t, t ≥ 0}. (Answer on p. 1442.)
170
We will revisit this example when we study vectors in Part III. There, we will show that P ’s direction of
movement is always tangent to the circle and its direction of acceleration is always towards the centre.
Moreover, the magnitudes of the particle’s (overall) velocity and acceleration are always constant.
Example 382. Let a, b > 0. The equation x2 /a2 + y 2 /b2 = 1 describes an ellipse centred
on the origin, with x-intercepts (±a, 0) and y-intercepts (0, ±b). This graph is the set:
U = {(x, y) ∶ x2 /a2 + y 2 /b2 = 1}.
Observe that by letting x = a cos t and y = b sin t, we have:
x²/a² + y²/b² = (a² cos² t)/a² + (b² sin² t)/b² = cos² t + sin² t = 1.
We can thus use parametric equations to rewrite the set U :
We apply the same interpretation as before: t is time (seconds) and the parametric
equations describe the position (metres from the origin O) of some particle R. Observe
that like P , R is travelling anticlockwise.
U = {(x, y) ∶ x²/a² + y²/b² = 1} = {(x, y) ∶ x = a cos t, y = b sin t, t ≥ 0}.
[Figure: the ellipse, with x-intercepts (±a, 0) and y-intercepts (0, ±b).]
(c) Above we specified that a, b > 0. How does R’s starting position and direction of
travel (clockwise or anticlockwise) change if:
We can again impose a similar interpretation: t is time (measured in seconds) and the
parametric equations describe the motion (in metres from the origin O) of a particle A.
W = {(x, y) ∶ x² − y² = 1} = {(x, y) ∶ x = sec t, y = tan t, t ≥ 0}.
[Figure: the hyperbola x² − y² = 1, with arrows indicating A’s instantaneous direction of travel. At t = 0, (x, y) = (1, 0), (vx, vy) = (0, 1), and (ax, ay) = (1, 0). During t ∈ [0, 0.5π), A is moving northeast. During t ∈ (0.5π, 1.5π), A moves upwards along the left branch.]
Exercise 133. Continuing with the above example, write down A’s acceleration in the
x- and y-directions at the instant t. (Answer on the next page.)
171
Fact 29(b).
ax = dvx/dt = d/dt (sec t tan t) = sec t tan t tan t + sec t sec² t (Product Rule)
= sec t (tan² t + sec² t) = sec t (2 sec² t − 1) ,
ay = dvy/dt = d/dt (sec² t) = 2 sec t ⋅ sec t tan t = 2 sec² t tan t (Chain Rule).
Thus: (ax, ay) = (d²x/dt², d²y/dt²) = (dvx/dt, dvy/dt) = (sec t (2 sec² t − 1) , 2 sec² t tan t).
Note that at t = 0.5π, 1.5π, 2.5π, . . . , both sec t and tan t are undefined. And so, we’ll
say that at these instants in time, the particle A’s position, velocity, and acceleration are
simply undefined.
Observe that interestingly, vy = sec2 t > 0 for all t (for which sec t is well-defined). Hence,
the particle A is always moving upwards (except during the aforementioned instants in
time when its velocity is undefined).
So, A starts at the midpoint of the right branch of the hyperbola, is moving upwards at
1 m s−1 , and is accelerating rightwards at 1 m s−2 .
During t ∈ [0, π/2), the particle A is moving northeast. As t → π/2, it “flies off” towards
the “top-right infinity” (∞, ∞) and:
x, y, vx , vy , ax , ay → ∞.
An instant after t = π/2, A magically reappears “near” “bottom-left infinity” (−∞, −∞).
During t ∈ (0.5π, 1.5π), the particle travels upwards along the left branch of the hyperbola.
And again, as t → 1.5π, it “flies off” towards “top-left infinity” (−∞, ∞) and we have:
x, vx , ax → −∞ and y, vy , ay → ∞.
(i) t ∈ [0, π/2). (ii) t ∈ (π/2, 3π/2). (iii) t ∈ (3π/2, 5π/2).
(e) Marked below are the positions of the particle B at six different instants in time
t = 0, 1, 2, 3, 4, and 5. However, we do not know which position corresponds to which
instant in time. Without using a calculator, match each of the six positions to
the corresponding instant in time. (Hint: 0.5π ≈ 1.57, π ≈ 3.14, 1.5π ≈ 4.71, and
2π ≈ 6.28.)
[Figure: the marked positions of B (labels include Ba, Bc, Bd, Be, and Bf), with arrows indicating the instantaneous direction of travel.]
• First rewrite y = t − 1 as t = y + 1.
[Figure: path of P on the x-y plane. P starts at (0, −1) at t = 0, moves leftwards during t ∈ [0, 2), reaches (−4, 1) at t = 2, and moves rightwards after t = 2, passing through (1, 1 + √5) at t = 2 + √5. P always moves upwards at 1 m s⁻¹. P does not travel along the grey portion of the curve.]
vx = dx/dt = 2t − 4, vy = dy/dt = 1, ax = dvx/dt = d²x/dt² = 2, ay = dvy/dt = d²y/dt² = 0.
U = {(x, y) ∶ ((x + 4)/2)² + ((y − 1)/3)² = 1} .
172
Fact 29(a).
[Figure: the ellipse centred on (−4, 1). Q starts at (−2, 1) at t = 0, where (vx, vy) = (0, 3) and (ax, ay) = (−2, 0).]
Here are Q’s velocity and acceleration, decomposed into the x- and y-directions:
vx = dx/dt = −2 sin t, vy = dy/dt = 3 cos t, ax = dvx/dt = d²x/dt² = −2 cos t, ay = dvy/dt = d²y/dt² = −3 sin t.
At t = 0, Q’s starting position and velocity are:
(x, y) = (2 cos 0 − 4, 3 sin 0 + 1) = (−2, 1) and (vx , vy ) = (−2 sin 0, 3 cos 0) = (0, 3).
So, it starts at the rightmost point of the ellipse and is moving upwards at 3 m s−1 . Thus,
its direction of travel is anticlockwise around the ellipse. Every 2π s, Q completes one
full revolution around the ellipse.
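As a numerical sanity check on Q’s motion, the sketch below evaluates the position and velocity formulas used in this example (x = 2 cos t − 4 and y = 3 sin t + 1) and confirms that each computed position lies on the ellipse. The function names are made up for illustration.

```python
import math

def q_position(t):
    """Position of Q at time t."""
    return (2 * math.cos(t) - 4, 3 * math.sin(t) + 1)

def q_velocity(t):
    """Velocity of Q at time t, i.e. (dx/dt, dy/dt)."""
    return (-2 * math.sin(t), 3 * math.cos(t))

def on_ellipse(x, y, tol=1e-9):
    """True if (x, y) satisfies ((x + 4)/2)^2 + ((y - 1)/3)^2 = 1, up to rounding."""
    return abs(((x + 4) / 2) ** 2 + ((y - 1) / 3) ** 2 - 1) < tol

for t in (0, 1, 2.5, 2 * math.pi):
    x, y = q_position(t)
    print(t, (x, y), q_velocity(t), on_ellipse(x, y))
```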
Exercise 136. The sets A, B, and C below describe the positions (metres) of particles
A, B, and C at time t (seconds), relative to the origin. For each set:
(i) Rewrite the set so that the parameter t is eliminated.
(ii) Sketch the graph.
(iii) Describe the particle’s position and velocity as time progresses.
ax = bx. ✓
2 × 3 > 1 × 3 or 6 > 3.
2 × 0 = 1 × 0 or 0 = 0.
The above seems obvious. But a common mistake is to multiply an inequality by some
unknown constant x and expect it to be preserved:
Example 387. Let x ∈ R. Beng reasons, “We know that 8 > 5. Therefore 8x > 5x.”
Beng’s reasoning is wrong.
(a) If x > 0, then yea, he happens to be correct and 8x > 5x. ✓
(b) If x = 0, then 8x = 5x = 0 and he’s wrong. ✗
(c) If x < 0, then 8x < 5x and he’s again wrong. ✗
In general:
Proof. Omitted.173
173
These results may seem “obvious”, but to properly prove them, we need to first define the notion of
The above result also holds if (i) the inequalities are reversed; or (ii) the strict inequalities
(i.e. > and <) are replaced with weak ones (i.e. ≥ and ≤).
In general, we may have to break our analysis into two (or three) cases, depending on
whether x is positive, negative, or zero. Example:
a/x > b/x. (1)
(a) If x > 0, then yea, he happens to be correct — we can indeed multiply > by x to
1
order.
If x = 0, then a/x and b/x would be undefined and (1) could not possibly hold.174
Altogether then, −x2 + 3x − 1 > 0 is true between those two roots. That is, the given
inequality has solution set:
((3 − √5)/2, (3 + √5)/2).
[Figure: graph of y = −x² + 3x − 1. x ∈ ((3 − √5)/2, (3 + √5)/2) solves −x² + 3x − 1 > 0.]
Altogether then, x2 + 3x + 1 > 0 “outside” those two roots. That is, the given inequality
has solution set:
R ∖ [(−3 − √5)/2, (−3 + √5)/2].
[Figure: graph of y = x² + 3x + 1, with x-intercepts at x = (−3 − √5)/2 and x = (−3 + √5)/2.]
Altogether then, there are no values of x for which −x² + 2x − 1 > 0 is true. Equivalently,
the inequality has solution set ∅.
[Figure: graph of y = −x² + 2x − 1. x ∈ ∅ (there are no values of x for which −x² + 2x − 1 > 0).]
Note that 1 does not belong to the solution set of this inequality. This is because at x = 1,
we have −x2 + 2x − 1 = −12 + 2 ⋅ 1 − 1 = 0. And so, at x = 1, it is not true that −x2 + 2x − 1 > 0.
Altogether then, x2 + 2x + 1 > 0 for all values of x except −1. Equivalently, the inequality
has solution set R ∖ {−1}.
[Figure: graph of y = x² + 2x + 1, which touches the x-axis at x = −1. x ∈ R ∖ {−1} solves x² + 2x + 1 > 0.]
Note that −1 does not belong to the solution set of this inequality. This is because at x = −1,
we have x² + 2x + 1 = (−1)² + 2 ⋅ (−1) + 1 = 0. And so, at x = −1, it is not true that x² + 2x + 1 > 0.
[Figure: graph of y = −x² + x − 1. x ∈ ∅ (there are no values of x for which −x² + x − 1 > 0).]
[Figure: graph of y = x² + x + 1. x ∈ R solves x² + x + 1 > 0.]
Fact 39. Let a, b, c be constants and x ∈ R. Let r1 and r2 be the two roots given by the
quadratic formula. That is, let:
r1 = (−b − √(b² − 4ac))/(2a) and r2 = (−b + √(b² − 4ac))/(2a).
Then here are the solution sets for ax2 + bx + c > 0, in the six different cases:
[Figure: six quadratics. Case 1 (a > 0, ∪-shaped quadratic) and Case 2 (a < 0, ∩-shaped quadratic), each drawn for b² − 4ac < 0, b² − 4ac = 0, and b² − 4ac > 0.]
175
Note that r2 < r1 because a < 0.
24.3. (ax + b)/(cx + d) > 0
Let N/D be a fraction where N is the numerator and D is the denominator. Then:
N/D > 0 ⇐⇒ N and D have the same signs (i.e. N, D > 0 OR N, D < 0).
Again, the mistake here is to multiply both sides by the unknown quantity x and expect
the inequality to be preserved. But it may not be, because x may be negative.
Here’s the correct solution. First, rewrite the given inequality into SF:
1/x < 1 (1) ⇐⇒ 1 − 1/x > 0 ⇐⇒ (x − 1)/x = N/D > 0. (2)
Observe that D = 0 ⇐⇒ x = 0 and N = 0 ⇐⇒ x = 1.
Let us call the points at which either the denominator or numerator equals zero the zero
points. So here, the zero points are 0 and 1.
We now consider the sign of N /D, in each of the three possible cases:
(a) If x < 0, then N < 0 and D < 0, so that N /D > 0.
(b) If 0 < x < 1, then N < 0 and D > 0, so that N /D < 0.
(c) If x > 1, then N > 0 and D > 0, so that N /D > 0.
The above observations may be compactly summarised in the following diagram.
+ − +
0 1
For convenience (and lack of a better name), let’s call the above the sign diagram.
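A sign diagram can also be produced mechanically by testing one point in each interval between the zero points. The sketch below is illustrative only: the names `sign_of_fraction` and `sign_diagram` are made up, and the method assumes N and D are continuous (for example, polynomials), so that the sign of N/D cannot change between consecutive zero points.

```python
def sign_of_fraction(numerator, denominator, x):
    """Sign of N(x)/D(x) as '+', '-', '0', or 'undefined'."""
    n, d = numerator(x), denominator(x)
    if d == 0:
        return "undefined"
    if n == 0:
        return "0"
    return "+" if (n > 0) == (d > 0) else "-"

def sign_diagram(numerator, denominator, zero_points):
    """Signs of N/D on each interval cut out by the zero points, left to right."""
    pts = sorted(zero_points)
    # One test point to the left of all zero points, one between each
    # consecutive pair, and one to the right of all zero points.
    test_points = [pts[0] - 1]
    test_points += [(p + q) / 2 for p, q in zip(pts, pts[1:])]
    test_points += [pts[-1] + 1]
    return [sign_of_fraction(numerator, denominator, x) for x in test_points]

# (x - 1)/x, with zero points 0 and 1
print(sign_diagram(lambda x: x - 1, lambda x: x, [0, 1]))  # ['+', '-', '+']
```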
[Figure: graphs of y = 1/x and y = 1.]
Example 400. Solve (5x + 4)/(−2x + 1) > 0.
We note that the numerator and denominator equal zero at −4/5 and 1/2. And so, we
have the sign diagram below, because:
(a) If x < −4/5, then N < 0 and D > 0, so that N /D < 0.
(b) If −4/5 < x < 1/2, then N > 0 and D > 0, so that N /D > 0.
(c) If x > 1/2, then N > 0 and D < 0, so that N /D < 0.
− + −
−4/5 1/2
Hence, −4/5 < x < 1/2.
Equivalently, the solution set is: (−4/5, 1/2).
[Figure: graph of y = (5x + 4)/(−2x + 1), with −4/5 and 1/2 marked on the x-axis.]
+ − +
−3 −2/3
Equivalently, the solution set is: (−∞, −3) ∪ (−2/3, ∞).
[Figure: graph of y = (x + 3)/(3x + 2), with −3 and −2/3 marked on the x-axis.]
+ − +
−2 1/4
Equivalently, the solution set is: (−∞, −2) ∪ (1/4, ∞).
[Figure: graph of y = (4x − 1)/(x + 2), with −2 and 1/4 marked on the x-axis.]
(a) (x − 1)/(−4) > 0. (b) (−1)/(−4) > 0. (c) 1/(−4) > 0. (d) (2x + 1)/(3x + 2) > 0. (e) (−3x − 18)/(9x − 14) > 0.
(−18x + 5)/(−5x + 1) > 0 ⇐⇒ (−18x + 5, −5x + 1 > 0 (1) OR −18x + 5, −5x + 1 < 0 (2)).
Observe that:
• (1) −18x + 5, −5x + 1 > 0 ⇐⇒ (x < 5/18 AND x < 1/5) ⇐⇒ x < 1/5.
• (2) −18x + 5, −5x + 1 < 0 ⇐⇒ (x > 5/18 AND x > 1/5) ⇐⇒ x > 5/18.
Altogether then, (3x − 2)/(−5x + 1) < 3 ⇐⇒ (−18x + 5)/(−5x + 1) > 0 ⇐⇒ x ∈ R ∖ [1/5, 5/18].
[Figure: graphs of y = (3x − 2)/(−5x + 1) and y = 3, with 1/5 and 5/18 marked on the x-axis.]
(a) (2x + 3)/(−x + 7) < 9. (b) (−4x + 2)/(x + 1) > 13.
Beng reasons, “Divide (1) by x to get e^x ≥ e. Which is true for all x ≥ 1. Hence, the solution
Again, Beng’s mistake is to divide (1) by x and assume the inequality will be preserved.
By the way, when given any inequality, I recommend first rewriting it in the following
Standardised Form (SF), where STUFF is on the LHS of the inequality and 0 is on
the RHS, like this:
(The symbol ⋛ stands for any inequality, i.e. any of >, ≥, <, or ≤.)
Strictly speaking, it isn’t necessary to rewrite inequalities into SF. But if you make it a
habit to always do so, you’ll be less likely to make a careless mistake.
So, here let’s rewrite (1) into SF, then factorise:
In the last step, we divided by the positive constant e. This is equivalent to multiplying
by the positive constant 1/e and hence by Fact 38 preserves the inequality.
Let a = x (ex−1 − 1). We observe that:
a=0 ⇐⇒ x = 0 OR x = 1.
Example 405. ∣x∣ < 5 ⇐⇒ −5 < x < 5. The solution set is (−5, 5).
Example 407. ∣x∣ > 7 ⇐⇒ (x > 7 OR x < −7). The solution set is (−∞, −7) ∪ (7, ∞).
Example 408. ∣x∣ ≥ 1 ⇐⇒ (x ≥ 1 OR x ≤ −1). The solution set is (−∞, −1] ∪ [1, ∞).
Equivalently, we may say that the solution set for (1) is (−4, 6).
Solution: ∣x + 4∣ ≤ 3 (2) ⇐⇒ −3 ≤ x + 4 ≤ 3 ⇐⇒ −7 ≤ x ≤ −1.
Equivalently, we may say that the solution set for (2) is [−7, −1].
Equivalently, we may say that the solution set for (3) is (−∞, −5) ∪ (9, ∞).
Solution: ∣x + 1∣ ≥ 1 (4) ⇐⇒ x + 1 ≥ 1 OR x + 1 ≤ −1 ⇐⇒ x ≥ 0 OR x ≤ −2 ⇐⇒ x ∈ R ∖ (−2, 0).
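[Number line: ∣x − a∣ < b holds for x in the open interval (a − b, a + b).]
[Number line: ∣x − a∣ ≤ b holds for x in the closed interval [a − b, a + b].]
(c) ∣x − a∣ > b means that the distance between x and a is more than b.
[Number line: ∣x − a∣ > b holds for x < a − b or x > a + b.]
[Number line: ∣x − a∣ ≥ b holds for x ≤ a − b or x ≥ a + b.]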
We will first solve (1) with the aid of graphs. To do so, first sketch the graphs of y = ∣x − 4∣
and y = 2x. To sketch the former, simply recall from Ch. 16.6 that the graph of y = ∣f (x)∣
is the same as that of f , but with any portion below the x-axis reflected in the x-axis.
[Figure: graphs of y = 2x and y = ∣x − 4∣, intersecting at the point P , where x = 4/3.]
“Clearly”, (1) holds ⇐⇒ x is at or to the right of the intersection point P . Our goal then is to
find P .
From the graph, we see that at P , we have x − 4 < 0 and thus ∣x − 4∣ = − (x − 4) = 4 − x.
And so, P is given by:
4 − x = 2x or x = 4/3.
Hence, x ≥ 4/3. Or equivalently, the solution set for (1) is [4/3, ∞).
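A claimed solution set can be spot-checked by testing many values of x. The sketch below does this for ∣x − 4∣ ≤ 2x and the set [4/3, ∞); it is only a check on a finite grid of test values (exact fractions between −10 and 10, an arbitrary choice), not a proof, and the function names are made up.

```python
from fractions import Fraction

def satisfies_inequality(x):
    """True if |x - 4| <= 2x."""
    return abs(x - 4) <= 2 * x

def in_claimed_set(x):
    """True if x lies in the claimed solution set [4/3, infinity)."""
    return x >= Fraction(4, 3)

# Test values -10, -10 + 1/12, ..., 10 (the grid includes the endpoint 4/3).
grid = [Fraction(k, 12) for k in range(-120, 121)]
mismatches = [x for x in grid if satisfies_inequality(x) != in_claimed_set(x)]
print(mismatches)  # [] means the inequality and the claimed set agree on the grid
```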
We now solve (1) again but this time a little more rigorously and without the aid of graphs.
If (a) x < 4, then:
∣x − 4∣ ≤ 2x (1) ⇐⇒ 4 − x ≤ 2x ⇐⇒ 4 ≤ 3x ⇐⇒ 4/3 ≤ x.
If (b) x ≥ 4, then:
∣x − 4∣ ≤ 2x (1) ⇐⇒ x − 4 ≤ 2x ⇐⇒ −4 ≤ x,
[Figure: graphs of y = ∣2x − 4∣ and y = x, intersecting at P (where x = 4/3) and Q (where x = 4).]
At P , we have 2x − 4 < 0 and thus ∣2x − 4∣ = − (2x − 4) = 4 − 2x. And so, P is given by:
4 − 2x = x or x = 4/3.
At Q, we have 2x − 4 > 0 and thus ∣2x − 4∣ = 2x − 4. And so, Q is given by:
2x − 4 = x or x = 4.
Conclusion: x < 4/3 or x > 4 solves the inequality. In other words, the solution set is:
(−∞, 4/3) ∪ (4, ∞) = R ∖ [4/3, 4].
[Figure: graphs of y = 2x + 2 and y = ∣3x − 4∣, intersecting at P (where x = 2/5) and Q (where x = 6).]
At P , we have 3x − 4 < 0 and thus ∣3x − 4∣ = − (3x − 4) = 4 − 3x. And so, P is given by:
4 − 3x = 2x + 2 or x = 2/5.
At Q, we have 3x − 4 > 0 and thus ∣3x − 4∣ = 3x − 4. And so, Q is given by:
3x − 4 = 2x + 2 or x = 6.
Conclusion: x ≤ 2/5 or x ≥ 6 solves the inequality. In other words, the solution set is:
(−∞, 2/5] ∪ [6, ∞) = R ∖ (2/5, 6).
[Figure: graphs of y = ∣2x − 1∣ and y = ∣x + 1∣, intersecting at P (where x = 0) and Q (where x = 2).]
At P , we have x + 1 > 0 and 2x − 1 < 0. Thus, ∣x + 1∣ = x + 1 and ∣2x − 1∣ = 1 − 2x. And so,
P is given by:
x + 1 = 1 − 2x or x = 0.
x + 1 = 2x − 1 or x = 2.
Conclusion: x ∈ [0, 2] solves the inequality. In other words, the solution set is [0, 2].
(a) ∣x − 4∣ ≤ 71.
(b) ∣5 − x∣ > 13.
(c) ∣−3x + 2∣ − 4 ≥ x − 1.
(d) ∣x + 6∣ > 2 ∣2x − 1∣.
Example 417. ∣6/(−2)∣ ≠ ∣6∣/(−2) because ∣6/(−2)∣ = ∣−3∣ = 3 but ∣6∣/(−2) = 6/(−2) = −3.
Example 418. ∣(−6)/2∣ ≠ (−6)/∣2∣ because ∣(−6)/2∣ = ∣−3∣ = 3 but (−6)/∣2∣ = (−6)/2 = −3.
Example 419. ∣(−6)/2∣ ≠ (−6)/2 because ∣(−6)/2∣ = ∣−3∣ = 3 but (−6)/2 = −3.
Fact 42. If a, b ∈ R with b ≠ 0, then ∣a/b∣ = ∣a∣/∣b∣.
If the above proof doesn’t convince you, hopefully the following examples do:
Example 420. ∣6/2∣ = ∣6∣/∣2∣ because ∣6∣/∣2∣ = 6/2 = 3 and ∣6/2∣ = ∣3∣ = 3.
Example 421. ∣(−6)/2∣ = ∣−6∣/∣2∣ because ∣−6∣/∣2∣ = 6/2 = 3 and ∣(−6)/2∣ = ∣−3∣ = 3.
Example 422. ∣6/(−2)∣ = ∣6∣/∣−2∣ because ∣6∣/∣−2∣ = 6/2 = 3 and ∣6/(−2)∣ = ∣−3∣ = 3.
Example 423. ∣(−6)/(−2)∣ = ∣−6∣/∣−2∣ because ∣−6∣/∣−2∣ = 6/2 = 3 and ∣(−6)/(−2)∣ = ∣3∣ = 3.
Proof. There are four possible cases, depending on the signs of a and b. We will show that
the conclusion holds in each case:
1. If a, b ≥ 0, then ∣ab∣ = ab = ∣a∣ ∣b∣.
2. If a, b < 0, then ∣ab∣ = ab = ∣a∣ ∣b∣.
3. If a ≥ 0 and b < 0, then ∣ab∣ = −ab = ∣a∣ ∣b∣.
4. If a < 0 and b ≥ 0, then ∣ab∣ = −ab = ∣a∣ ∣b∣.
Fact 44. Consider the inequality: (ax² + bx + c)/(dx² + ex + f ) > 0. (1)
(a) If ax² + bx + c is always positive, then (1) is equivalent to dx² + ex + f > 0.
(b) If dx² + ex + f is always positive, then (1) is equivalent to ax² + bx + c > 0.
Proof. (a) Divide both sides by ax2 + bx + c to get 1/ (dx2 + ex + f ) > 0 or dx2 + ex + f > 0.
(b) Multiply both sides by dx2 + ex + f to get ax2 + bx + c > 0.
By the way, recall (Ch. 9) that a quadratic expression is always positive if and only if
its coefficient on x² is positive (so that the graph is ∪-shaped) AND the discriminant is
negative (so that the graph is everywhere above the x-axis).
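The test for whether a quadratic ax² + bx + c is always positive (positive coefficient on x² and negative discriminant) is a one-liner to code. The function name below is made up for illustration, and the three quadratics tested are the ones appearing in the examples that follow.

```python
def is_always_positive(a, b, c):
    """True if a*x^2 + b*x + c > 0 for every real x (assumes a != 0)."""
    discriminant = b * b - 4 * a * c
    return a > 0 and discriminant < 0

print(is_always_positive(1, 1, 1))    # x^2 + x + 1: True
print(is_always_positive(2, -1, 1))   # 2x^2 - x + 1: True
print(is_always_positive(3, -2, -5))  # 3x^2 - 2x - 5: False (discriminant is 64)
```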
Example 424. Consider (x² + x + 1)/(3x² − 2x − 5) > 0.
The numerator is always positive, because the coefficient on x² is positive and the
discriminant 1² − 4 (1) (1) = −3 is negative. And so by Fact 44(a), the given inequality is
simply equivalent to 3x² − 2x − 5 > 0. This is a ∪-shaped quadratic with discriminant
(−2)² − 4 (3) (−5) = 64. Hence, 3x² − 2x − 5 > 0 “outside” the two roots, which are:
x = (2 ± √64)/(2 ⋅ 3) = −6/6, 10/6 = −1, 5/3.
Hence, the given inequality’s solution set is R ∖ [−1, 5/3] or (−∞, −1) ∪ (5/3, ∞).
Example 425. Consider (−x² + 7x + 1)/(2x² − x + 1) > 0.
The denominator is always positive, because the coefficient on x² is positive and the
discriminant (−1)² − 4 (2) (1) = −7 is negative. And so by Fact 44(b), the given inequality
Example 426. Solve (x² + 3x + 2)/(2x² − 7x + 6) > 0.
First consider the numerator N = x² + 3x + 2. Observe that x² + 3x + 2 = (x + 1) (x + 2).
And so y = x² + 3x + 2, which is a ∪-shaped quadratic, intersects the x-axis at x = −2, −1.
Thus, N > 0 if x ∈ R ∖ [−2, −1], while N < 0 if x ∈ (−2, −1).
Next consider the denominator D = 2x2 −7x+6. Observe that 2x2 −7x+6 = (2x − 3) (x − 2).
And so y = 2x2 − 7x + 6, which is a ∪-shaped quadratic, intersects the x-axis at x = 1.5, 2.
Thus, D > 0 if x ∈ R ∖ [1.5, 2], while D < 0 if x ∈ (1.5, 2).
The given inequality is true if N, D > 0 OR N, D < 0. We have:
R ∖ ([−2, −1] ∪ [1.5, 2]) or (−∞, −2) ∪ (−1, 1.5) ∪ (2, ∞).
Example 427. Solve (−x² + 5x − 4)/(3x² − 2x − 5) > 0.
First consider the numerator N = −x2 +5x−4. Observe that −x2 +5x−4 = − (x − 1) (x − 4).
And so y = −x2 + 5x − 4, which is a ∩-shaped quadratic, intersects the x-axis at x = 1, 4.
Thus, N > 0 if x ∈ (1, 4), while N < 0 if x ∈ R ∖ [1, 4].
Next consider the denominator D = 3x2 −2x−5. Observe that 3x2 −2x−5 = (3x − 5) (x + 1).
And so y = 3x2 − 2x − 5, which is a ∪-shaped quadratic, intersects the x-axis at x = −1, 5/3.
Thus, D > 0 if x ∈ R ∖ [−1, 5/3], while D < 0 if x ∈ (−1, 5/3).
The given inequality is true if N, D > 0 OR N, D < 0. We have:
N, D > 0 ⇐⇒ x ∈ (1, 4) AND x ∈ R ∖ [−1, 5/3] ⇐⇒ x ∈ (1, 4) ∖ [−1, 5/3] = (5/3, 4).
N, D < 0 ⇐⇒ x ∈ R ∖ [1, 4] AND x ∈ (−1, 5/3) ⇐⇒ x ∈ (−1, 5/3) ∖ [1, 4] = (−1, 1).
Exercise 141. Solve each inequality. (Answers on pp. 1453, 1454, and 1455.)
(a) (x² + 2x + 1)/(x² − 3x + 2) > 0. (b) (x² − 1)/(x² − 4) > 0. (c) (x² − 3x − 18)/(−x² + 9x − 14) > 0.
Recall176 that zero is another word for root. So what TI84’s zero function will do here is
find the roots of the given equation (i.e. the values of x for which y = 0). Those of you
accustomed to newfangled inventions like the world wide web and the wireless telephone
will probably be expecting that the TI84 simply and immediately tells you what all the
roots are. But alas, the TI84 is an ancient device, which means there’s plenty more work
you must do to find the three roots.
To find a root, you must first specify a “Left Bound” and a “Right Bound” for x. The
TI84 will then check to see if there are any values of x for which y = 0 between those
two bounds.
9. Using the ⟨ and ⟩ arrow keys, move the blinking cursor until it is where you want
your first “Left Bound” to be. For me, I have placed it a little to the left of where I
believe the leftmost horizontal intercept to be.
10. Press ENTER and you will have just entered your first “Left Bound”.
TI84 now prompts you with the question: “Right Bound?”.
11. So now just repeat. Using the ⟨ and ⟩ arrow keys, move the blinking cursor until it
is where you want your first “Right Bound” to be. For me, I have placed it a little to
the right of where I believe the leftmost horizontal intercept is.
12. Again press ENTER and you will have just entered your first “Right Bound”.
TI84 now asks you: “Guess?” This is just asking if you want to proceed and get TI84 to
work out where the horizontal intercept is. So go ahead and:
13. Press ENTER . TI84 now informs you that there is a “Zero” at “x = −1”, “y = 0” and
places the blinking cursor at that point. So, x = −1 is the first root we’ve found.
To find the other two roots, “simply” repeat steps 7 through 13 — two more times. You
should find that the other two roots are x = 0 and x = 1. Altogether, the three roots are
x = −1, 0, 1. Based on these and what the graph looks like, we conclude:
176
Remark 21.
Example 429. Solve x > e + ln x.
As usual, first rewrite the inequality into SF: x − e − ln x > 0.
1. Graph y = x − e − ln x on your TI84 (precise instructions omitted).
We see that there’s clearly an x-intercept at around x ∈ (4, 5). (Note that by default,
each of the little tick marks shown on your TI84 marks 1 unit.)
2. Zoom in (precise instructions omitted).
Now we see that there’s probably also an x-intercept near the origin. But unfortu-
nately, now we can no longer see the other x-intercept. To fix this:
3. Press WINDOW to bring up the WINDOW menu.
We will adjust Xmin and Xmax:
4. Press 0 . We have adjusted Xmin to 0. Next:
5. Press ENTER 5 . We have adjusted Xmax to 5.
6. Now press GRAPH . We can now see the portion of the graph between x = 0 and
x = 5.
7. To find the two roots, “simply” go through the steps described in the previous example,
twice (precise instructions omitted). You should find that the two roots are x ≈
0.0708, 4.139.
Based on these roots and what the graph looks like, we conclude that the inequality’s
solution set is (0, 0.0708 . . . ) ∪ (4.139 . . . , ∞) = R+ ∖ [0.0708 . . . , 4.139 . . . ].
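The calculator’s “Left Bound” and “Right Bound” search can be imitated with the bisection method, which repeatedly halves an interval on which the function changes sign. This is only a sketch of that general idea (it is not a description of how the TI84 actually works internally), and the bounds below are read off the graph by eye.

```python
import math

def f(x):
    """Left-hand side of the inequality in SF: x - e - ln x."""
    return x - math.e - math.log(x)

def bisect(func, left, right, tol=1e-10):
    """Find a root of func between left and right by bisection.

    Assumes func is continuous and changes sign between the two bounds.
    """
    if func(left) * func(right) > 0:
        raise ValueError("no sign change between the bounds")
    while right - left > tol:
        mid = (left + right) / 2
        if func(left) * func(mid) <= 0:
            right = mid   # the sign change is in the left half
        else:
            left = mid    # the sign change is in the right half
    return (left + right) / 2

small_root = bisect(f, 0.01, 1)  # root near the origin
large_root = bisect(f, 4, 5)     # root between 4 and 5
print(small_root, large_root)
```

Since f is positive just to the right of 0 and for large x, and negative between the two roots, the inequality holds to the left of the smaller root and to the right of the larger one.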
(a) x³ − x² + x − 1 > e^x. (b) √x > cos x. (c) 1/(1 − x²) > x³ + sin x.
Exercise 143. When Apu was 40 years old, Beng was twice as old as Caleb. Today, Apu
is twice as old as Beng and Caleb is 28 years old. What are the ages of Apu and Beng
today? (Assume that a person’s age is always an integer and fixed between January 1st
and December 31st each year.) (Answer on p. 1457.)
Exercise 144. Planes A and B leave the same point at 12 p.m. Plane A travels northeast
at a constant speed of 100 km h−1 . Plane B travels south at a constant speed of 200 km h−1 .
[Figure: Plane A travels northeast at 100 km h⁻¹. Plane B travels south at 200 km h⁻¹.]
At 3 p.m., both planes make an instant turn and start flying directly towards each other
at the same speed. At what time will the two planes collide? (Answer on p. 1458.)
Example 430. Consider this system of equations (which is simply a set of two equations):
You already learnt to solve this system of equations in secondary school. Simply plug =
2
−x + 3 = x + 1 ⇐⇒ 2 = 2x ⇐⇒ x = 1.
3
Thus, this system of equations has one solution: (1, 2) and its solution set is {(1, 2)}.
[Figure: graphs of y = −x + 3 and y = x + 1, intersecting at (1, 2).]
To make it extra clear that the ordered pair’s first co-ordinate is x and second is y, we
can also write, “This system of equations has one solution: (x, y) = (1, 2).”
177
In Ch. 8, we also studied inequalities involving one variable. Fortunately, for H2 Maths, we will not be
studying systems of inequalities (i.e. sets of inequalities involving more than one variable). We will
instead only be studying systems of equations.
Example 431. Consider this system of equations:
y = x²/2 − 3/2 (1) and y = x (2) (x, y ∈ R).
Again, from secondary school, you already know how to solve this system of equations.
Simply plug (2) into (1) to get (3), then do some simple algebra:
x = x²/2 − 3/2 (3) ⇐⇒ 0 = x² − 2x − 3 = (x − 3) (x + 1).
Thus, this system of equations has two solutions: (3, 3) and (−1, −1) and its solution
set is {(3, 3) , (−1, −1)}.
[Figure: graphs of y = x²/2 − 3/2 and y = x, intersecting at (3, 3) and (−1, −1).]
To be extra clear, we can also write, “This system of equations has two solutions:
(x, y) = (3, 3) , (−1, −1).”
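The same substitution can be carried out in code: solve the quadratic x² − 2x − 3 = 0 with the quadratic formula, then use y = x to recover each ordered pair. This is a minimal sketch with a made-up helper name, assuming real solutions only.

```python
import math

def real_roots(a, b, c):
    """Real roots of a*x^2 + b*x + c = 0 (a != 0), in increasing order."""
    discriminant = b * b - 4 * a * c
    if discriminant < 0:
        return []
    root = math.sqrt(discriminant)
    # A set removes the duplicate when the discriminant is zero.
    return sorted({(-b - root) / (2 * a), (-b + root) / (2 * a)})

# Substituting y = x into y = x^2/2 - 3/2 gives x^2 - 2x - 3 = 0.
solutions = [(x, x) for x in real_roots(1, -2, -3)]
print(solutions)  # [(-1.0, -1.0), (3.0, 3.0)]
```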
y = ln x (1) lies below y = x (2) everywhere. Hence, no ordered pair (x, y) satisfies the above
system of equations. In other words, there is no solution and the solution set is ∅.
[Figure: graphs of y = x and y = ln x.]
y = x² (1) lies above y = −1 (2) everywhere. Hence, no ordered pair (x, y) satisfies the above
system of equations. In other words, there is no solution and the solution set is ∅.178
[Figure: graphs of y = x² and y = −1.]
178
But as we’ll learn in Part IV, if this system of equations is rewritten so that x, y ∈ C, then it actually
has two (complex) solutions — (−i, −1) and (i, −1) — and the solution set {(−i, −1) , (i, −1)}.
A system of equations can have infinitely many solutions.
Observe that these two equations are really the same. And so, the above system of equa-
tions is satisfied by every point (x, y) for which y = x. Its solution set is {(x, y) ∶ y = x}.
[Figure: the line y = x, which is also the line 2y = 2x.]
The above system of equations is satisfied by every point (x, y) for which x = 2kπ and
y = 1, where k is any integer. Its solution set is {(x, y) ∶ x = 2kπ, y = 1, k ∈ Z}.
[Figure: graph of y = cos x, with the point (0, 1) marked.]
Example 436. The following system of equations is a set of three equations, involving
three variables:
Again, you already learnt to solve this system of equations in secondary school. Simply
plug (1) and (2) into (3) to get x = 3x − 2 + 7 − y = 3x + 5 − y ⇐⇒ y = 2x + 5. (4)
Now plug x = 7 into (1) to get y = 19. Then plug y = 19 into (2) to get z = −12.
We conclude that this system of equations has one solution: (7, 19, −12). Its solution
set is {(7, 19, −12)}.
To be extra clear that the ordered triple’s first co-ordinate is x, second is y, and third is
z, we can also write, “This system of equations has one solution: (x, y, z) = (7, 19, −12).”
By the way, (7, 19, −12) is our first example of an ordered triple. This is exactly
analogous to an ordered pair, except that now there are three coordinates. As you can
imagine, we also have ordered quadruples, ordered quintuples, and more generally
ordered n-tuples.179
When we study vectors in Part III, we’ll learn that it’s actually possible to graph the
above system of equations. (Spoiler: Our graph will be in 3-dimensional space.)
Example 437. Here is a system of two equations that involves three variables:
$\overset{2}{=}$ immediately tells us that any solution must have $z = 0$. Plugging $\overset{2}{=}$ into $\overset{1}{=}$, we then also
In the context of a system of equations, here are the formal definitions of a solution and
the solution set:
Definition 81. Given a system of equations involving n variables, we call any ordered
n-tuple that satisfies the system of equations a solution. And we call the set of all such
ordered n-tuples its solution set.
179
See Definition 218 in the Appendices for the formal definition of an n-tuple.
Here’s a very typical example from the A-Level exams:
\begin{align*}
1 &= a \cdot 0^2 + b \cdot 0 + c \overset{1}{=} c, \\
3 &= a \cdot 2^2 + b \cdot 2 + c \overset{2}{=} 4a + 2b + c, \\
5 &= a \cdot 4^2 + b \cdot 4 + c \overset{3}{=} 16a + 4b + c.
\end{align*}
Hence, a = 0, b = 1, and c = 1. The above system of equations has the solution (a, b, c) =
(0, 1, 1) and the solution set {(0, 1, 1)}.
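As a cross-check on the arithmetic above, here is a minimal Python sketch that solves the three equations by Gauss-Jordan elimination with exact fractions. The helper name `solve_linear_system` is hypothetical (it is not from this textbook or the syllabus), and the sketch assumes the system has exactly one solution.

```python
from fractions import Fraction

def solve_linear_system(rows):
    """Solve a square linear system by Gauss-Jordan elimination.

    Each row is [coefficients..., right-hand side].
    Assumes (without checking) that the solution is unique.
    """
    m = [[Fraction(v) for v in row] for row in rows]
    n = len(m)
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so that the pivot entry becomes 1.
        m[col] = [v / m[col][col] for v in m[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [row[-1] for row in m]

# Unknowns are (a, b, c); each row encodes one of the three equations.
system = [
    [0, 0, 1, 1],    # c = 1
    [4, 2, 1, 3],    # 4a + 2b + c = 3
    [16, 4, 1, 5],   # 16a + 4b + c = 5
]
a, b, c = solve_linear_system(system)
print(a, b, c)
```

Exact fractions are used instead of floating-point numbers so that the comparison with (0, 1, 1) is not affected by rounding.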
Remark 44. Note that historically, the A-Level exams have never mentioned the concept
of a solution set.
180
Indeed, I foolishly missed this second x-intercept in earlier versions of this textbook!
25.1. O-Level Review: Partial Fractions
Example 440. Consider the expression $\frac{1}{x^2 + 3x + 2}$.
We can rewrite or decompose this expression into partial fractions. First, observe
that $x^2 + 3x + 2 = (x + 1)(x + 2)$. Next, write:
\[
\frac{1}{x^2 + 3x + 2} = \frac{1}{(x + 1)(x + 2)} = \frac{A}{x + 1} + \frac{B}{x + 2} = \frac{A(x + 2) + B(x + 1)}{(x + 1)(x + 2)} = \frac{(A + B)x + 2A + B}{(x + 1)(x + 2)}.
\]
Comparing coefficients on the linear and constant terms, we have A+B = 0 and 2A+B = 1.
Hence, A = 1 and B = −1. Altogether then, we can decompose our original expression
into the following partial fractions:
\[
\frac{1}{x^2 + 3x + 2} = \frac{1}{x + 1} - \frac{1}{x + 2}.
\]
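Example 441. Consider $\frac{5x - 3}{x^2 + 3x + 2}$. Again, $x^2 + 3x + 2 = (x + 1)(x + 2)$ and so write:
\[
\frac{5x - 3}{x^2 + 3x + 2} = \frac{A}{x + 1} + \frac{B}{x + 2} = \frac{A(x + 2) + B(x + 1)}{(x + 1)(x + 2)} = \frac{(A + B)x + 2A + B}{(x + 1)(x + 2)}.
\]
Comparing coefficients on the linear and constant terms, we have $A + B = 5$ and $2A + B = -3$.
Hence, $A = -8$ and $B = 13$. Altogether then, we have:
\[
\frac{5x - 3}{x^2 + 3x + 2} = -\frac{8}{x + 1} + \frac{13}{x + 2}.
\]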
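Example 442. Consider $\frac{9x - 5}{-x^2 + 5x - 6}$. Since $-x^2 + 5x - 6 = (3 - x)(x - 2)$, write:
\[
\frac{9x - 5}{-x^2 + 5x - 6} = \frac{A}{3 - x} + \frac{B}{x - 2} = \frac{A(x - 2) + B(3 - x)}{(3 - x)(x - 2)} = \frac{(A - B)x - 2A + 3B}{(3 - x)(x - 2)}.
\]
Comparing coefficients on the linear and constant terms, we have $A - B = 9$ and $-2A + 3B = -5$.
Hence, $A = 22$ and $B = 13$. Altogether then, we have:
\[
\frac{9x - 5}{-x^2 + 5x - 6} = \frac{22}{-(x - 3)} + \frac{13}{x - 2} = \frac{22}{3 - x} + \frac{13}{x - 2}.
\]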
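Example 443. Consider $\frac{x^2 + x + 1}{(x + 1)(x - 1)^2}$. Write:
\begin{align*}
\frac{x^2 + x + 1}{(x + 1)(x - 1)^2} &= \frac{A}{x + 1} + \frac{B}{x - 1} + \frac{C}{(x - 1)^2} \\
&= \frac{A(x - 1)^2 + B(x + 1)(x - 1) + C(x + 1)}{(x + 1)(x - 1)^2} \\
&= \frac{A(x^2 - 2x + 1) + B(x^2 - 1) + C(x + 1)}{(x + 1)(x - 1)^2} \\
&= \frac{(A + B)x^2 + (-2A + C)x + A - B + C}{(x + 1)(x - 1)^2}.
\end{align*}
Comparing coefficients, we have $A + B = 1$, $-2A + C = 1$, and $A - B + C = 1$. Hence, $A = 1/4$, $B = 3/4$, and $C = 3/2$. Altogether then, we have:
\[
\frac{x^2 + x + 1}{(x + 1)(x - 1)^2} = \frac{1}{4(x + 1)} + \frac{3}{4(x - 1)} + \frac{3}{2(x - 1)^2}.
\]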
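\[
\frac{3x^2 - x + 1}{(4x - 1)(x + 2)^2} = \frac{A}{4x - 1} + \frac{B}{x + 2} + \frac{C}{(x + 2)^2} = \frac{A(x + 2)^2 + B(4x - 1)(x + 2) + C(4x - 1)}{(4x - 1)(x + 2)^2}.
\]
And now we can also get $A = 5/27$ and $C = -5/3$. Altogether then, we have:
\[
\frac{3x^2 - x + 1}{(4x - 1)(x + 2)^2} = \frac{5}{27(4x - 1)} + \frac{19}{27(x + 2)} - \frac{5}{3(x + 2)^2}.
\]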
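Example 445. Consider $\frac{2x^2 + x + 1}{(5x - 1)(x^2 + 1)}$. Write:
\[
\frac{2x^2 + x + 1}{(5x - 1)(x^2 + 1)} = \frac{A}{5x - 1} + \frac{Bx + C}{x^2 + 1} = \frac{(A + 5B)x^2 + (5C - B)x + A - C}{(5x - 1)(x^2 + 1)}.
\]
Comparing coefficients, we have $A + 5B \overset{1}{=} 2$, $5C - B \overset{2}{=} 1$, and $A - C \overset{3}{=} 1$.
From $\overset{1}{=}$, we have $A \overset{4}{=} 2 - 5B$. From $\overset{2}{=}$, we have $C \overset{5}{=} (B + 1)/5$. Plug $\overset{4}{=}$ and $\overset{5}{=}$ into $\overset{3}{=}$ to get
$2 - 5B - (B + 1)/5 = 1$ or $B = 2/13$.
And now we can also get $A = 16/13$ and $C = 3/13$. Altogether then, we have:
\[
\frac{2x^2 + x + 1}{(5x - 1)(x^2 + 1)} = \frac{16}{13(5x - 1)} + \frac{2x + 3}{13(x^2 + 1)}.
\]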
From $\overset{1}{=}$, we have $A \overset{4}{=} 4 - 2B$. From $\overset{2}{=}$, we have $C \overset{5}{=} -(7B + 3)/2$. Plug $\overset{4}{=}$ and $\overset{5}{=}$ into $\overset{3}{=}$ to
Towkay Tip
As you can tell, partial fractions aren’t difficult — just a bunch of tedious algebra. So
the important thing is to go slowly and be really careful. Check and double-check that
you’ve got everything exactly correct at each step of the way. This will save you time
and marks, as compared to trying to do the algebra quickly and making a mistake.
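In the spirit of checking and double-checking, one way to catch algebra slips is to compare the original expression with its claimed decomposition at a few values of x. The Python sketch below does this with exact fractions for Examples 440, 443, and 445. The helper `agree` and the choice of test points are assumptions made for illustration, and agreement at a handful of points is a sanity check rather than a proof.

```python
from fractions import Fraction

def agree(original, decomposition, test_points):
    """Return True if both expressions give the same value at every test point."""
    return all(original(x) == decomposition(x) for x in test_points)

# Test points chosen to avoid the zeros of every denominator used below.
points = [Fraction(0), Fraction(5), Fraction(7), Fraction(-4), Fraction(1, 3)]

# Example 440: 1/(x^2 + 3x + 2) = 1/(x + 1) - 1/(x + 2).
ok_440 = agree(
    lambda x: 1 / (x**2 + 3 * x + 2),
    lambda x: 1 / (x + 1) - 1 / (x + 2),
    points,
)

# Example 443: (x^2 + x + 1)/((x + 1)(x - 1)^2).
ok_443 = agree(
    lambda x: (x**2 + x + 1) / ((x + 1) * (x - 1) ** 2),
    lambda x: Fraction(1, 4) / (x + 1)
    + Fraction(3, 4) / (x - 1)
    + Fraction(3, 2) / (x - 1) ** 2,
    points,
)

# Example 445: (2x^2 + x + 1)/((5x - 1)(x^2 + 1)).
ok_445 = agree(
    lambda x: (2 * x**2 + x + 1) / ((5 * x - 1) * (x**2 + 1)),
    lambda x: 16 / (13 * (5 * x - 1)) + (2 * x + 3) / (13 * (x**2 + 1)),
    points,
)

print(ok_440, ok_443, ok_445)
```

If any of the three flags came out `False`, that would point to a slip somewhere in the corresponding decomposition.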
Exercise 148. Decompose each of the following into partial fractions. (Hint: You may
need to factorise the denominators — read Ch. 21.1 if you haven’t already.)
(a) $\frac{8}{x^2 + x - 6}$. (b) $\frac{17x - 5}{3x^2 - 8x - 3}$. (c) $\frac{2x^2 - x + 7}{x^3 - x^2 - x + 1}$.
(d) $\frac{-3x^2 + 5}{x^3 - 2x^2 + 4x - 8}$. (Answers on pp. 1461–1462.)
26.1. Squaring
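Example 447. We are asked to solve $x \overset{a}{=} \sqrt{2 - x}$ $(x \le 2)$.
Observe that $x \overset{b}{=} 1$ does indeed solve $\overset{a}{=}$: $1 = \sqrt{2 - 1} = \sqrt{1} = 1$. ✓
But $x \overset{c}{=} -2$ does not: $-2 \ne \sqrt{2 - (-2)} = \sqrt{4} = 2$. ✗
So, the above steps are wrong because they produce the extraneous solution $x \overset{c}{=} -2$.
\begin{align*}
& x \overset{a}{=} \sqrt{2 - x} \\
\overset{1}{\implies}\ & x^2 = 2 - x \\
\overset{2}{\iff}\ & (x - 1)(x + 2) = 0 \\
\overset{3}{\iff}\ & x \overset{b}{=} 1 \text{ or } x \overset{c}{=} -2.
\end{align*}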
We now see clearly how Step 1 differs from Steps 2 and 3. Step 1 is a “⟹” statement, while Steps 2 and 3 are “⟺” statements. Or in plainer English, the squaring
operation in Step 1 is an irreversible operation.
It is generally true that: $a = b \implies a^2 = b^2$.
However, the converse is false. That is: $a^2 = b^2 \nRightarrow a = b$.
For example, $(-1)^2 = 1^2$, but $-1 \ne 1$. So, squaring is an example of an irreversible operation.
If our chain of reasoning contains only ⟺’s, then all is well. However, if it contains
even one ⟹ (i.e. an irreversible step), then extraneous solutions may arise and we
must be careful to check for them.
Note the emphasis on the word may. Extraneous solutions may arise but might not.
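The check for extraneous solutions can be sketched in a few lines of Python. This is only an illustration for Example 447: the function name `solves_original` is made up here, and `math.isclose` is used because square roots are computed in floating point.

```python
import math

def solves_original(x):
    """Check whether x satisfies the original equation x = sqrt(2 - x)."""
    if 2 - x < 0:
        # The square root is undefined for x > 2, so x cannot be a solution.
        return False
    return math.isclose(x, math.sqrt(2 - x))

# Candidates are the roots of x^2 = 2 - x, i.e. of the squared equation.
candidates = [1, -2]
genuine = [x for x in candidates if solves_original(x)]
extraneous = [x for x in candidates if not solves_original(x)]
print("genuine:", genuine)
print("extraneous:", extraneous)
```

The point of the sketch is that every candidate is substituted back into the original equation, not into the squared one.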
The operation of squaring is merely one example of when extraneous solutions may be
introduced. Two other examples of operations that may do likewise are those of multiply-
ing by zero and removing logarithms:
Example 448. We are asked181 to solve: $\frac{x^2 - 3x}{x^2 - 1} + 2 + \frac{1}{x - 1} \overset{a}{=} 0$. We try these steps:
1. Multiply by $x^2 - 1$: $x^2 - 3x + 2(x^2 - 1) + (x + 1) = 0$.
2. Rearrange and factorise: $3x^2 - 2x - 1 = (x - 1)(3x + 1) = 0$.
3. Conclude: $x \overset{b}{=} 1$ or $x \overset{c}{=} -1/3$.
So, $x \overset{b}{=} 1$ is an extraneous solution and the above steps are wrong. Where lies the error?
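Again, to clearly detect the error, let us write out our chain of reasoning more explicitly
with the aid of the logical relations ⟹ and ⟺:
\begin{align*}
& \frac{x^2 - 3x}{x^2 - 1} + 2 + \frac{1}{x - 1} \overset{a}{=} 0 \\
\overset{1}{\implies}\ & x^2 - 3x + 2(x^2 - 1) + (x + 1) = 0 \\
\overset{2}{\iff}\ & 3x^2 - 2x - 1 = (x - 1)(3x + 1) = 0 \\
\overset{3}{\iff}\ & x \overset{b}{=} 1 \text{ or } x \overset{c}{=} -1/3.
\end{align*}
$y = z \implies 0 \cdot y = 0 \cdot z$, but $0 \cdot y = 0 \cdot z \nRightarrow y = z$.
For example: $1 = 1 \implies 0 \cdot 1 = 0 \cdot 1$, but $0 \cdot 2 = 0 \cdot 3 \nRightarrow 2 = 3$.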
So, our above chain of reasoning yields the following (true) implication:
\[
\frac{x^2 - 3x}{x^2 - 1} + 2 + \frac{1}{x - 1} \overset{a}{=} 0 \implies x \overset{b}{=} 1 \text{ or } x \overset{c}{=} -1/3.
\]
We must then check, case-by-case, whether each of our final solutions actually solves the
original equation. In this example, we find that $x \overset{c}{=} -1/3$ solves $\overset{a}{=}$, while $x \overset{b}{=} 1$ does not.
In contrast, $x \overset{b}{=} 1$ does not solve $\overset{a}{=}$ because $x - 1 = 0$, so that some terms in $\overset{a}{=}$ are undefined.
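The same case-by-case check can be sketched in Python for Example 448. The names `lhs` and `solves_a` are hypothetical. Exact fractions are used so that a division by zero (that is, an undefined term) is raised as an error rather than hidden by rounding.

```python
from fractions import Fraction

def lhs(x):
    """Left-hand side of the original equation: (x^2 - 3x)/(x^2 - 1) + 2 + 1/(x - 1)."""
    return (x**2 - 3 * x) / (x**2 - 1) + 2 + 1 / (x - 1)

def solves_a(x):
    """Return True only if the left-hand side is defined at x and equals 0."""
    try:
        return lhs(x) == 0
    except ZeroDivisionError:
        # Some term is undefined at this x, so x cannot be a solution.
        return False

candidates = [Fraction(1), Fraction(-1, 3)]
results = {x: solves_a(x) for x in candidates}
for x, ok in results.items():
    print(x, "solves the original equation" if ok else "is extraneous")
```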
Example 449. We are asked to solve $\log x + \log(x + 1) \overset{a}{=} \log(2x + 2)$. We try these steps:
1. Use a Logarithm Law: $\log x + \log(x + 1) = \log(x^2 + x) = \log(2x + 2)$.
2. Remove logs: $x^2 + x = 2x + 2$.
3. Rearrange and factorise: $x^2 - x - 2 = (x + 1)(x - 2) = 0$.
4. Conclude: $x \overset{b}{=} -1$ or $x \overset{c}{=} 2$.
So, $x \overset{b}{=} -1$ is an extraneous solution and the above steps are wrong. Where lies the error?
Again, to clearly detect the error, let us write out our chain of reasoning more explicitly:
\begin{align*}
\overset{2}{\implies}\ & x^2 + x = 2x + 2 \\
\overset{3}{\iff}\ & x^2 - x - 2 = (x + 1)(x - 2) = 0 \\
\overset{4}{\iff}\ & x \overset{b}{=} -1 \text{ or } x \overset{c}{=} 2.
\end{align*}
$\log a = \log b \implies a = b$, but $a = b \overset{5}{\nRightarrow} \log a = \log b$.
Here in this brief chapter, we have merely examined three examples of operations by which
extraneous solutions may be introduced, namely squaring, multiplying by zero, and
removing logarithms. These are not exhaustive and you will likely encounter more of
such operations as your maths education progresses.
The important thing is to remember how and why extraneous solutions arise. In particular,
you should remember and understand the informal theorem given on p. 372.
to solve $\overset{1}{=}$ with the steps below. Identify any errors in these steps and give the correct
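Exercise 150. (Tricky.)185 Suppose we are given that $x \in \mathbb{R}$ and $x^2\sqrt{x} + x^2 + \sqrt{x} + 1 \overset{1}{=} 0$.
\begin{align*}
\overset{1}{\iff}\ & (x^2 + 1)(\sqrt{x} + 1) = 0 \\
\overset{2}{\iff}\ & x^2 + 1 = 0 \text{ or } \sqrt{x} + 1 = 0 \\
\overset{3}{\iff}\ & \text{N.A. or } \sqrt{x} = -1 \\
\overset{4}{\iff}\ & x \overset{2}{=} 1.
\end{align*}
Verify that $x \overset{2}{=} 1$ does not solve $\overset{1}{=}$. Then identify any errors in the above chain of reasoning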
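$x^2 + x + 1 \overset{1}{=} 0$.
$\overset{1}{\iff}$ Rearrange: $x^2 \overset{2}{=} -x - 1$.
So, we can divide $\overset{2}{=}$ by $x$ to get: $x \overset{3}{=} -1 - \frac{1}{x}$.
$\overset{3}{\iff}$ Now plug $\overset{3}{=}$ into $\overset{1}{=}$ to get: $x^2 + \left(-1 - \frac{1}{x}\right) + 1 \overset{4}{=} 0$.
$\overset{4}{\iff}$ $x \overset{5}{=} 1$.
Verify that $x \overset{5}{=} 1$ solves $\overset{4}{=}$ but not $\overset{1}{=}$. Then identify any errors in the above chain of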
184
Stolen from Sullivan (Precalculus, 10e, 2017, p. 519), hat tip to .
185
Adapted from .
186
Stolen from .
Part II.
Sequences and Series
Letting $f(n)$ denote the $n$th term, the Fibonacci sequence may be defined by:
\[
f(n) = \begin{cases} 1 & \text{for } n = 1, 2, \\ f(n - 2) + f(n - 1) & \text{for } n \ge 3. \end{cases}
\]
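The recursive definition above translates almost word for word into code. A minimal Python sketch follows. The caching decorator is an implementation convenience (an assumption of this sketch, not part of the definition) that avoids recomputing earlier terms.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def f(n):
    """Return the nth Fibonacci number, with f(1) = f(2) = 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    if n in (1, 2):
        return 1
    return f(n - 2) + f(n - 1)

first_ten = [f(n) for n in range(1, 11)]
print(first_ten)
```

Without the cache, the two recursive calls would repeat a great deal of work for large n, although the values returned would be the same.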
Remark 45. For clarity, it is wise and indeed customary to enclose a sequence in paren-
theses.188 And so, even though your A-Level exams do not, I will insist on doing so.
187
In H2 Maths, we’ll look only at sequences of real numbers. But in general, the objects in a sequence
need not be real numbers and could be any objects whatsoever.
188
Note though that some writers prefer using braces {} or angle brackets ⟨⟩.
The above sequences were infinite. But of course, sequences can also be finite:
Example 453. The finite sequence of the first six Fibonacci numbers is:
(1, 1, 2, 3, 5, 8) .
Example 454. The finite sequence of the first seven square numbers is:
(1, 4, 9, 16, 25, 36, 49) .
Example 455. The finite sequence of the first four triangular numbers is:
(1, 3, 6, 10) .
Remark 46. We’ll generally be more interested in infinite sequences than finite sequences.
And so, when we simply say sequence, it may be assumed that we’re talking about an
infinite sequence. And when we want to talk about a finite sequence, we’ll clearly and
explicitly include the word finite.
Definition 82. A finite sequence of length k is a function with domain {1, 2, . . . , k}.
Remark 47. For H2 Maths, the objects in a sequence will always be real numbers. But
in general, they can be anything whatsoever.189
Example 456. Formally, the Fibonacci sequence is the function $f : \mathbb{Z}^+ \to \mathbb{R}$ defined by:
\[
f(n) = \begin{cases} 1 & \text{for } n = 1, 2, \\ f(n - 2) + f(n - 1) & \text{for } n \ge 3. \end{cases}
\]
Example 459. Formally, the finite sequence of the first six Fibonacci numbers is the
function $f_6 : \{1, 2, 3, 4, 5, 6\} \to \mathbb{R}$ defined by:
\[
f_6(n) = \begin{cases} 1 & \text{for } n = 1, 2, \\ f_6(n - 2) + f_6(n - 1) & \text{for } n = 3, 4, 5, 6. \end{cases}
\]
Example 460. Formally, the finite sequence of the first seven square numbers is the
function $s_7 : \{1, 2, 3, 4, 5, 6, 7\} \to \mathbb{R}$ defined by $s_7(n) = n^2$.
Example 461. Formally, the finite sequence of the first four triangular numbers is the
function $t_4 : \{1, 2, 3, 4\} \to \mathbb{R}$ defined by $t_4(n) = 1 + 2 + \dots + n$.
Remark 48. As repeatedly emphasised, the letter x we often use with functions is merely
a dummy or placeholder variable that can be replaced by any other.
Indeed, in the context of sequences, we’ll often prefer using n rather than x as our dummy
variable.
More examples:
189
In other words, for H2 Maths, the codomain of a sequence (which is a function) will always be R. But
in general, it can be any set whatsoever.
Example 462. The function e ∶ Z+ → R with mapping rule e(n) = 2n is the sequence of
(positive) even numbers (2, 4, 6, 8, 10, 12, . . . ).
Example 463. The function $g : \mathbb{Z}^+ \to \mathbb{R}$ with mapping rule $g(n) = 2n^2 - 3n + 3$ is the
following sequence: (2, 5, 12, 23, 38, 57, 80, 107, 138, 173, . . . ).
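Since a sequence is simply a function on the positive integers, Examples 462 and 463 can be mirrored directly in code. In this Python sketch, the helper `first_terms` is a hypothetical name introduced only for illustration.

```python
def e(n):
    """Mapping rule of Example 462: the nth positive even number."""
    return 2 * n

def g(n):
    """Mapping rule of Example 463."""
    return 2 * n**2 - 3 * n + 3

def first_terms(sequence, count):
    """List the terms sequence(1), sequence(2), ..., sequence(count)."""
    return [sequence(n) for n in range(1, count + 1)]

print(first_terms(e, 6))
print(first_terms(g, 10))
```

Only finitely many terms can ever be listed, which is why the sequences in the text end with an ellipsis.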
Recall that a function need not “follow any formula” or “make any sense”. The same is
true of sequences (since sequences are simply functions):
Although h does not seem to “make any sense”, it is a (perfectly) well-defined function.
Indeed, the function h is also the following finite sequence of length 4:
Although this sequence does not seem to “make any sense”, it is a (perfectly) well-defined
finite sequence of length 4, simply because it is a function with domain {1, 2, 3, 4}.
Example 465. The function j ∶ {1, 2, 3} → {↑, ↓, →, ←, Punch, Kick} is defined by:
Although j does not seem to “make any sense”, it is a (perfectly) well-defined function.
Indeed, the function j is also the following finite sequence of length 3:
(↓, →, Punch).
Although this sequence does not seem to “make any sense”, it is a (perfectly) well-defined
finite sequence of length 3, simply because it is a function with domain {1, 2, 3}.
Example 466. Let s1 = 1, s2 = 4, s3 = 9, s4 = 16, and s5 = 25. Then we can denote the
finite sequence of the first five square numbers by:
Again, s and n are merely dummy or placeholder variables. And here, n in particular may
also be called an index variable, because it indexes or indicates which term in the
sequence we’re referring to.
We could replace s or n with any other symbol, like ☺ or ⋆. And so, we could rewrite the
above example as:
Of course, it’s a bit strange to use symbols like ☺ or ⋆. The point here is simply to illustrate
that once again, these are mere symbols that can be replaced by any other. We’ll usually
stick to using boring symbols like letters from the Latin alphabet.
Next, let $(a_1, a_2, \dots)$ be an (infinite) sequence. For convenience, this sequence can also be
denoted as any of the following:
\[
(a_n)_{n=1}^{\infty} \quad\text{or}\quad (a_n)_{n=1,2,\dots} \quad\text{or}\quad (a_n)_{n \in \mathbb{Z}^+} \quad\text{or}\quad (a_n)_{n \in \{1,2,\dots\}} \quad\text{or}\quad (a_n).
\]
Example 468. Let $t_1 = 1$, $t_2 = 3$, $t_3 = 6$, $t_4 = 10$, $t_5 = 15$, etc. Then we can denote the
(infinite) sequence of triangular numbers by:
\[
(t_1, t_2, \dots) = (t_n)_{n=1}^{\infty} = (t_n)_{n=1,2,\dots} = (t_n)_{n \in \mathbb{Z}^+} = (t_n)_{n \in \{1,2,\dots\}} = (t_n).
\]
Example 470. Let (an ) = (1, 1, 2, 3, 5, 8, 13, 21, 34, . . . ) be the Fibonacci sequence. Let
(bn ) = (2, 4, 6, 8, 10, 12, 14, 16, 18, . . . ) be the sequence of even numbers. Let k = 10. Then:
(kbn ) = (20, 40, 60, 80, 100, 120, 140, 160, 180, . . . ).
Exercise 153. Let (cn ) be the sequence of negative odd numbers, (dn ) the sequence of
cube numbers, and k = 2. Write out the first five terms of (a) (cn ); and (b) (dn ). Then
write out the first five terms of each of the following. (Answer on p. 1464.)
Definition 84. Given a finite sequence $(a_n)_{n \in \{1,2,\dots,k\}}$, its series is the expression
$a_1 + a_2 + a_3 + \dots + a_k$.
And the sum of this series is the number that equals the above expression.
Example 471. Consider the finite sequence of the first five square numbers:
(1, 4, 9, 16, 25). Its series is the expression 1 + 4 + 9 + 16 + 25, while the sum of this
series is the number 55.
Example 472. Consider the finite sequence of the first six even numbers:
(2, 4, 6, 8, 10, 12). Its series is the expression 2 + 4 + 6 + 8 + 10 + 12, while the sum of
this series is the number 42.
It may seem strange and unnecessary to distinguish between a series and its sum. Aren’t
they exactly the same thing?
It turns out that expressions like a1 + a2 + a3 + ⋅ ⋅ ⋅ + ak play an important role in maths. And
so, we want to reserve a special name for the expression itself, in order to distinguish it
from the number that is the sum of the series.
Example 473. Given the sequence (1, 3, 5, 7), we might be specifically interested in the
expression 1 + 3 + 5 + 7, rather than just the number 16.
It is thus convenient to have separate names for them — we call the expression 1+3+5+7
the series and the number 16 the sum (of the series).
Definition 85. Given an (infinite) sequence (an ), its series is the expression
a1 + a2 + a3 + . . .
As we saw on the previous page, every finite series has a well-defined sum — simply add
up all the numbers!
With an infinite series, things get a little trickier. It may sometimes be that an infinite
series diverges and its limit does not exist.
Example 474. Let (1, 1, 1, 1, 1, . . . ) be the (infinite) sequence that consists solely of 1s.
Its series is the expression 1 + 1 + 1 + 1 + 1 + . . .
“Clearly”, this expression is not equal to any number. And so formally, we say that this
series diverges and that its limit does not exist.
Also, observe that this expression “grows ever larger”. As shorthand, we write:190
1 + 1 + 1 + 1 + 1 + ⋅ ⋅ ⋅ = ∞.
Remark 49. As was the case with sequences, we’ll generally be more interested in infinite
series than finite series. And so, when we simply say series, it may safely be assumed that
we’re talking about an infinite series. And when we want to talk about a finite series,
we’ll clearly and explicitly include the word finite.
Example 475. Let (2, 4, 6, 8, 10, . . . ) be the sequence of (positive) even numbers.
Its series is the expression 2 + 4 + 6 + 8 + 10 + . . .
“Clearly”, this expression is not equal to any number. And so formally, we say that this
series diverges and that its limit does not exist.
Also, observe that this expression “grows ever larger”. As shorthand, we write:
2 + 4 + 6 + 8 + 10 + ⋅ ⋅ ⋅ = ∞.
190
Pedantic point: this “equation” is not really an equation. Instead, it is merely shorthand for the
following statement:
“Grows ever larger” is, in turn, a vague and informal phrase that we clarify only in Ch. 118.1 of the
Appendices.
We just looked at two examples of series that diverge. We now look at examples of series
that converge to some limit:
0 + 0 + 0 + 0 + 0 + ⋅ ⋅ ⋅ = 0.
Note that what was called the sum (in the previous context of finite series) is now called
the limit (in the current context of infinite series).
Example 477. Consider the sequence $\left(\frac{1}{2}, \frac{1}{4}, \frac{1}{8}, \frac{1}{16}, \frac{1}{32}, \dots\right)$.
Its series is the expression $\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \frac{1}{32} + \dots$
As we’ll soon learn, it turns out that:
\[
\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \frac{1}{32} + \dots = 1.
\]
That is, this series converges to 1. (And we call 1 the limit of this series.)
Here we should remark that whenever we are dealing with infinite series, we must be very
careful. Here the = sign in the above equation is not the usual one. Instead, the above
equation is merely shorthand for “the expression $\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \frac{1}{32} + \dots$ converges
to the number 1”. In H2 Maths, we shall simply count on your intuitive and imprecise
understanding of what the phrase converges to means, but you should know that it does
actually have a clear and precise meaning (on this, see Ch. 118.1 in the Appendices).
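To get an informal feel for what “converges to 1” means here, we can compute a few partial sums exactly. The Python sketch below is an illustration only (the name `partial_sum` is made up), and computing finitely many partial sums is of course not a proof of convergence.

```python
from fractions import Fraction

def partial_sum(k):
    """Exact value of 1/2 + 1/4 + ... + 1/2^k."""
    return sum(Fraction(1, 2**n) for n in range(1, k + 1))

# How far each partial sum falls short of 1.
gaps = [1 - partial_sum(k) for k in (1, 5, 10, 20)]
for k, gap in zip((1, 5, 10, 20), gaps):
    print(k, partial_sum(k), gap)
```

Each partial sum falls short of 1 by exactly the last term added, so the shortfall halves every time one more term is included.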
Example 478. Consider the sequence of reciprocals of squares $\left(\frac{1}{1^2}, \frac{1}{2^2}, \frac{1}{3^2}, \dots\right)$.
Its series is the expression $\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \dots = 1 + \frac{1}{4} + \frac{1}{9} + \dots$
It’s not at all obvious, but it turns out that:
\[
\frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \dots = 1 + \frac{1}{4} + \frac{1}{9} + \dots = \frac{\pi^2}{6}.
\]
That is, this series converges to $\pi^2/6$. (And we call $\pi^2/6$ the limit of this series.)
The problem of finding the sum of this series is among the more famous problems in the
history of mathematics and is known as the Basel Problem. We will revisit this problem
in Ch. XXX.
1 − 1 + 1 − 1 + 1 − 1 + ...
Does this series converge or diverge? Or equivalently, does its limit exist?
Remarkably, we can “prove” that this series is equal to 0, 1, and 1/2.
• To “prove” that it equals 0, pair off the terms like so:
\[
1 - 1 + 1 - 1 + 1 - 1 + \dots = \underbrace{(1 - 1)}_{0} + \underbrace{(1 - 1)}_{0} + \underbrace{(1 - 1)}_{0} + \dots = 0 + 0 + 0 + \dots = 0.
\]
• To “prove” that it equals 1, pair off the terms after the first, like so:
1 − S = 1 − (1 − 1 + 1 − 1 + 1 − 1 + . . . ) = 1 − 1 + 1 − 1 + 1 − 1 + ⋅ ⋅ ⋅ = S.
191
But if you’re interested, see the brief discussion in Ch. 118.1 (Appendices).
192
The Italian priest-mathematician Luigi Guido Grandi (1671–1742) thought he had proved that the sum
of this series equalled both 0 and 1/2, and thus that “God could create the world out of nothing” (source).
193
We prove this in Example 1224 in the Appendices.
Example 480. Consider the following series:
\[
\frac{1}{2} - \frac{2}{3} + \frac{3}{5} - \frac{4}{7} + \frac{5}{11} - \frac{6}{13} + \frac{7}{17} - \frac{8}{19} + \frac{9}{23} - \frac{10}{29} + \dots
\]
The terms are fractions, with the numerators being the (positive) integers and the denom-
inators being the prime numbers. This series looks “simple” enough. But remarkably,
mathematicians still do not know whether it converges or diverges!194
194
This problem is listed as equation (8) on this Wolfram Mathworld page and as Problem E7 in Richard
Guy’s Unsolved Problems in Number Theory (3e, p. 316), where it is attributed to Paul Erdős.
29. Summation Notation ∑
Example 481. $\sum_{n=1}^{5} n^2 = 1^2 + 2^2 + 3^2 + 4^2 + 5^2 = 1 + 4 + 9 + 16 + 25$.
• The variable n is the dummy, placeholder, or index variable. (We can replace it with
any other symbol without changing the meaning of the above sentence.)
• The integer below ∑ is the starting point. So here, 1 tells us to start counting (the
index variable n) from n = 1.
• The integer above ∑ is the stopping point. So here, 5 tells us to stop at n = 5.
• The expression to the right of ∑ describes the nth term to be added up. So here, the
nth term to be added up is $n^2$.
Altogether then, $\sum_{n=1}^{5} n^2$ tells us to add up the terms $1^2$, $2^2$, $3^2$, $4^2$, and $5^2$.
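Summation notation corresponds closely to a loop that adds up terms. As a rough sketch in Python (the helper `summation` is a hypothetical name, not a standard library function), the nth term, starting point, and stopping point become the three arguments:

```python
def summation(term, start, stop):
    """Add up term(n) for n = start, start + 1, ..., stop (inclusive)."""
    return sum(term(n) for n in range(start, stop + 1))

# Example 481: the sum of n^2 from n = 1 to n = 5, i.e. 1 + 4 + 9 + 16 + 25.
total = summation(lambda n: n**2, 1, 5)
print(total)

# The index variable is a dummy: renaming it changes nothing.
same_total = summation(lambda i: i**2, 1, 5)
print(same_total == total)
```

Note that Python's `range` excludes its upper end, which is why the sketch passes `stop + 1` in order to include the stopping point.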
Example 482. $\sum_{n=1}^{3} (2n + 3) = (2 \cdot 1 + 3) + (2 \cdot 2 + 3) + (2 \cdot 3 + 3) = 5 + 7 + 9 = 21$.
Example 483. $\sum_{n=1}^{4} n = 1 + 2 + 3 + 4 = 10$.
Example 484. $\sum_{n=1}^{6} 2n = 2 \cdot 1 + 2 \cdot 2 + 2 \cdot 3 + 2 \cdot 4 + 2 \cdot 5 + 2 \cdot 6 = 2 + 4 + 6 + 8 + 10 + 12 = 42$.
Example 485. $\sum_{n=1}^{7} 2^n = 2^1 + 2^2 + 2^3 + 2^4 + 2^5 + 2^6 + 2^7 = 2 + 4 + 8 + 16 + 32 + 64 + 128 = 254$.
Example 486. $\sum_{n=1}^{5} 1 = 1 + 1 + 1 + 1 + 1 = 5$. Here each term to be added up is simply the
constant 1. And so, $\sum_{n=1}^{5} 1$ is simply the sum of five 1s.
Example 487. $\sum_{n=1}^{3} (10 - 2n) = (10 - 2 \cdot 1) + (10 - 2 \cdot 2) + (10 - 2 \cdot 3) = 8 + 6 + 4$.
195
The symbol σ is the lower-case Greek letter sigma.
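Example 488. $\sum_{n=1}^{4} \frac{1}{n} = \frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{4}$.
Example 489. $\sum_{n=1}^{4} \frac{1}{(n + 1)^2} = \frac{1}{(1 + 1)^2} + \frac{1}{(2 + 1)^2} + \frac{1}{(3 + 1)^2} + \frac{1}{(4 + 1)^2} = \frac{1}{4} + \frac{1}{9} + \frac{1}{16} + \frac{1}{25}$.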
It’s nice to have 1 as the starting point and that’s what we’ll usually do. But there’s no
reason why the starting point must always be 1. Examples:
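\[
\sum_{n=-2}^{3} \frac{n}{4} = \frac{-2}{4} + \frac{-1}{4} + \frac{0}{4} + \frac{1}{4} + \frac{2}{4} + \frac{3}{4} = -\frac{1}{2} - \frac{1}{4} + 0 + \frac{1}{4} + \frac{1}{2} + \frac{3}{4} = \frac{3}{4}.
\]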
Exercise 155. Redo the above exercise, but now with starting point 0. (Answer on p.
1465.)
Exercise 156. Find the sum of each series. (Observe that here the dummy or index
variables are not the usual n. Instead, they are i, ⋆, and x.) (Answer on p. 1465.)
4 17 33
(a) ∑ (2 − i) . (b) ∑ (4 ⋆ +5). (c) ∑ (x − 3).
i
Example 494. In the following series, the “n = 1” below the ∑ symbol indicates that it
has starting point 1.
\[
\sum_{n=1}^{\infty} \frac{1}{2^n} = \frac{1}{2^1} + \frac{1}{2^2} + \frac{1}{2^3} + \dots = \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \dots
\]
The ∞ above the ∑ symbol indicates that there is no stopping point. This is thus an
infinite series.
As already mentioned and as we’ll soon learn, this series converges to 1. We may write:
\[
\sum_{n=1}^{\infty} \frac{1}{2^n} = 1.
\]
If the context makes it crystal clear what the starting and stopping points are, we will
sometimes be lazy/sloppy and omit them. And so here, we may also write:
\[
\sum_{n=1}^{\infty} \frac{1}{2^n} = \sum \frac{1}{2^n} = 1.
\]
Example 495. $\sum_{n=1}^{\infty} n = \sum n = 1 + 2 + 3 + \dots$
“Clearly”, this series diverges. We write: $\sum_{n=1}^{\infty} n = \sum n = \infty$.
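Example 496. $\sum_{n=1}^{\infty} \frac{1}{n^2} = \sum \frac{1}{n^2} = \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \dots = 1 + \frac{1}{4} + \frac{1}{9} + \dots$
As mentioned in Example 478, this series converges to $\pi^2/6$: $\sum_{n=1}^{\infty} \frac{1}{n^2} = \sum \frac{1}{n^2} = \frac{\pi^2}{6}$.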
Example 497. $\sum_{n=1}^{\infty} nx^2 = \sum nx^2 = 1x^2 + 2x^2 + 3x^2 + 4x^2 + \dots$
This infinite series has starting point 1 and each term to be added up is $nx^2$.
Thus, $\sum_{n=1}^{\infty} nx^2$ is the sum of infinitely many terms, namely $x^2$, $2x^2$, $3x^2$, $4x^2$, ...
By the way, this series diverges for all $x \ne 0$. That is, for all $x \ne 0$, we have:
\[
\sum_{n=1}^{\infty} nx^2 = \sum nx^2 = \infty.
\]
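\[
\sum_{n=1}^{\infty} \frac{1}{n} = \sum \frac{1}{n} = \frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \dots
\]
We have:
\[
\sum_{n=1}^{1} \frac{1}{n} = \frac{1}{1} = 1, \qquad \sum_{n=1}^{2} \frac{1}{n} = \frac{1}{1} + \frac{1}{2} = 1.5, \qquad \sum_{n=1}^{3} \frac{1}{n} = \frac{1}{1} + \frac{1}{2} + \frac{1}{3} \approx 1.83,
\]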
Does the harmonic series converge or diverge? From the above, it’s not obvious.
It turns out that it diverges. Here’s a heuristic (i.e. not 100% rigorous) proof:
First, consider the series $1 + \frac{1}{2} + \frac{1}{2} + \frac{1}{2} + \dots$; “clearly”, this series diverges.
As we show below, this series is “smaller than” the harmonic series.196
Thus, the harmonic series must “clearly” also diverge:
\begin{align*}
1 + \frac{1}{2} + \frac{1}{2} + \frac{1}{2} + \frac{1}{2} + \dots
&= \frac{1}{1} + \frac{1}{2} + \underbrace{\frac{1}{4} + \frac{1}{4}}_{1/2} + \underbrace{\frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8}}_{1/2} + \underbrace{\frac{1}{16} + \frac{1}{16} + \dots + \frac{1}{16}}_{\text{eight terms, } 1/2} + \dots \\
&< \frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8} + \frac{1}{9} + \frac{1}{10} + \dots + \frac{1}{16} + \dots \\
&= \sum_{n=1}^{\infty} \frac{1}{n}.
\end{align*}
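The grouping argument above implies that the first $2^m$ terms of the harmonic series add up to at least $1 + m/2$. The Python sketch below checks this numerically for small m. The function name `harmonic` is an assumption of this sketch, and floating-point sums are only approximate.

```python
import math

def harmonic(k):
    """Approximate value of 1/1 + 1/2 + ... + 1/k."""
    return math.fsum(1 / n for n in range(1, k + 1))

# Compare the partial sum of the first 2^m terms with the lower bound 1 + m/2.
for m in range(0, 11):
    k = 2**m
    print(m, k, round(harmonic(k), 4), 1 + m / 2)
```

The partial sums grow very slowly, which is why the first few values give no obvious hint that the series diverges.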
196
Again, here we must be careful to define what we mean by one infinite series being “smaller than”
another.
As before, it’s nice to have 1 as the starting point, but this need not always be so:
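Example 499. $\sum_{n=0}^{\infty} \frac{1}{2^n} = \frac{1}{2^0} + \frac{1}{2^1} + \frac{1}{2^2} + \frac{1}{2^3} + \dots = 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \dots$
By the way, since $\sum_{n=1}^{\infty} 1/2^n = 1$, it “clearly” follows that:
\[
\sum_{n=0}^{\infty} \frac{1}{2^n} = 1 + \sum_{n=1}^{\infty} \frac{1}{2^n} = 1 + 1 = 2.
\]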
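Example 500. $\sum_{n=-2}^{\infty} \frac{1}{n + 5} = \frac{1}{-2 + 5} + \frac{1}{-1 + 5} + \frac{1}{0 + 5} + \frac{1}{1 + 5} + \dots = \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \dots$
By the way, since the harmonic series diverges (i.e. $\sum_{n=1}^{\infty} 1/n = \infty$), we have:
\[
\sum_{n=-2}^{\infty} \frac{1}{n + 5} = \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \dots = -\frac{1}{1} - \frac{1}{2} + \sum_{n=1}^{\infty} \frac{1}{n} = \infty.
\]
That is, since the harmonic series diverges, this series also diverges.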
Exercise 158. Redo the above exercise, but now with starting point 0. (Answer on p.
1465.)
Definition 86. An arithmetic sequence (or progression) is a sequence where the difference
between any two consecutive terms is constant. This constant difference, denoted d, is
called the common difference.
Example 501. Below are six arithmetic sequences. Those on the left are finite while
those on the right are infinite.
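$(a_n)_{n=1}^{5} = (1, 3, 5, 7, 9)$, $\qquad (a_n) = (1, 3, 5, 7, 9, \dots)$,
$(b_n)_{n=1}^{7} = (4, 7, 10, 13, 16, 19, 22)$, $\qquad (b_n) = (4, 7, 10, 13, 16, 19, 22, \dots)$,
$(c_n)_{n=1}^{3} = (0, \pi, 2\pi)$, $\qquad (c_n) = (0, \pi, 2\pi, \dots)$.
In each sequence, the difference between any two consecutive terms is a constant.
In $(a_n)_{n=1}^{5}$ and $(a_n)$, the common difference is $d = 2$.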
Definition 87. Given an arithmetic sequence, its series is called an arithmetic series.
Example 502. In the above example, we gave six arithmetic sequences. Here are the six
corresponding arithmetic series:
1 + 3 + 5 + 7 + 9, 1 + 3 + 5 + 7 + 9 + ...,
4 + 7 + 10 + 13 + 16 + 19 + 22, 4 + 7 + 10 + 13 + 16 + 19 + 22 + . . . ,
0 + π + 2π, 0 + π + 2π + . . .
Fact 45. Let $(a_n)_{n=1}^{k}$ be a finite arithmetic sequence with common difference $d = a_2 - a_1$. Then:
(a) The $n$th term is: $a_n = a_1 + (n - 1)d$.
(b) The number of terms is: $k = \frac{a_k - a_1}{d} + 1$.
(c) $\sum_{n=1}^{k} a_n = \sum_{n=1}^{k} [a_1 + (n - 1)d]$.
Proof. (a) The common difference between any two consecutive terms is d. Since $a_n$ is
(n − 1) terms “after” $a_1$, we must have $a_n = a_1 + (n - 1)d$.
(b) By (a), $a_k = a_1 + (k - 1)d$. Rearranging, we have: $k = \frac{a_k - a_1}{d} + 1$.
(c) This is an immediate consequence of (a).
Example 503. You’ve probably heard the apocryphal story197 about an eight-year-old
Gauss adding up the numbers from 1 to 100 in an instant. The trick is to pair the first
number with the last, the second with the second last, etc., then multiply. Like this:
\begin{align*}
1 + 2 + 3 + 4 + \dots + 100 &= \overbrace{\underbrace{(1 + 100)}_{101} + \underbrace{(2 + 99)}_{101} + \underbrace{(3 + 98)}_{101} + \dots + \underbrace{(50 + 51)}_{101}}^{50 \text{ pairs}} \\
&= 101 \times 50 = 5050.
\end{align*}
In general, there is a simple formula for the sum of a finite arithmetic series:
\[
(\text{First Term} + \text{Last Term}) \times \frac{\text{Number of Terms}}{2}.
\]
A bit more formally:
\[
\sum_{n=1}^{k} a_n = (a_1 + a_k)\frac{k}{2}.
\]
Thus:
\[
\sum_{n=1}^{k} a_n = (a_1 + a_k)\frac{k}{2}. \quad ✓
\]
For the remainder of this proof (the case where k is odd), which is slightly trickier/messier,
see p. 1281 (Appendices).
197
The American Scientist writer Brian Hayes has investigated the provenance of this story. According to
him, the earliest known printing of this story was in “a 1906 pamphlet authored by Franz Mathé”.
Example 504. Consider the arithmetic sequence $(a_n)_{n=1}^{k} = (7, 17, 27, 37, \dots, 837)$.
The first and last terms are $a_1 = 7$ and $a_k = 837$. The common difference is 10.
So, by Fact 45(b), the total number of terms is $k = (837 - 7)/10 + 1 = 83 + 1 = 84$.
And now by Fact 46:
\[
\sum_{n=1}^{k} a_n = (a_1 + a_k)\frac{k}{2} = (7 + 837)\frac{84}{2} = 35\,448.
\]
Example 505. Consider the arithmetic sequence $(b_n)_{n=1}^{k} = (1, 5, 9, 13, 17, \dots, 393)$.
The first and last terms are $b_1 = 1$ and $b_k = 393$. The common difference is 4.
So, by Fact 45(b), the total number of terms is $k = (393 - 1)/4 + 1 = 98 + 1 = 99$.
And now by Fact 46:
\[
\sum_{n=1}^{k} b_n = (b_1 + b_k)\frac{k}{2} = (1 + 393)\frac{99}{2} = 19\,503.
\]
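The two examples above follow the same two-step recipe: count the terms, then multiply. A minimal Python sketch, with hypothetical helper names `count_terms` and `arithmetic_sum`, and assuming integer terms with a non-zero common difference that divides the gap between the last and first terms:

```python
def count_terms(first, last, d):
    """Number of terms: k = (a_k - a_1)/d + 1."""
    return (last - first) // d + 1

def arithmetic_sum(first, last, d):
    """Sum of a finite arithmetic series: (a_1 + a_k) * k / 2."""
    k = count_terms(first, last, d)
    # (first + last) * k is always even for integer terms, so // is exact here.
    return (first + last) * k // 2

print(arithmetic_sum(7, 837, 10))   # the series in Example 504
print(arithmetic_sum(1, 393, 4))    # the series in Example 505
print(arithmetic_sum(1, 100, 1))    # the numbers from 1 to 100

# Brute-force comparison: add the terms of Example 504 one by one.
print(sum(range(7, 838, 10)))
```

The brute-force line is there only as an independent check on the formula.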
Exercise 159. Rewrite each series in summation notation, then compute its sum.
(a) 2 + 7 + 12 + 17 + 22 + 27 + 32 + ⋅ ⋅ ⋅ + 997.
(b) 3 + 20 + 37 + 54 + 71 + ⋅ ⋅ ⋅ + 1 703.
(c) 81 + 89 + 97 + 105 + 113 + ⋅ ⋅ ⋅ + 8 081. (Answer on p. 1466.)
Fact 47. Other than the zero series, every (infinite) arithmetic series diverges.
Definition 88. A geometric sequence (or progression) is a sequence where the ratio
between any two consecutive terms is constant. This constant ratio, denoted r, is called
the common ratio.
Example 506. Below are six geometric sequences. Those on the left are finite while
those on the right are infinite.
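$(a_n)_{n=1}^{5} = (1, 2, 4, 8, 16)$, $\qquad (a_n) = (1, 2, 4, 8, 16, \dots)$,
$(b_n)_{n=1}^{7} = \left(1, \frac{1}{2}, \frac{1}{4}, \frac{1}{8}, \frac{1}{16}, \frac{1}{32}, \frac{1}{64}\right)$, $\qquad (b_n) = \left(1, \frac{1}{2}, \frac{1}{4}, \frac{1}{8}, \frac{1}{16}, \frac{1}{32}, \frac{1}{64}, \dots\right)$,
$(c_n)_{n=1}^{3} = (7, 7\pi, 7\pi^2)$, $\qquad (c_n) = (7, 7\pi, 7\pi^2, \dots)$.
In each sequence, the ratio between any two consecutive terms is a constant.
In $(a_n)_{n=1}^{5}$ and $(a_n)$, the common ratio is $r = 2$.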
Definition 89. Given a geometric sequence, its series is called a geometric series.
Example 507. In the above example, we gave six geometric sequences. Here are the six
corresponding geometric series:
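$1 + 2 + 4 + 8 + 16$, $\qquad 1 + 2 + 4 + 8 + 16 + \dots$,
$1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \frac{1}{32} + \frac{1}{64}$, $\qquad 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \frac{1}{32} + \frac{1}{64} + \dots$,
$7 + 7\pi + 7\pi^2$, $\qquad 7 + 7\pi + 7\pi^2 + \dots$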
Fact 48. Let $(a_n)_{n=1}^{k}$ be a finite geometric sequence with common ratio $r = a_2/a_1$. Then:
(a) The $n$th term is: $a_n = a_1 r^{n-1}$.
(b) The number of terms is: $k = \log_r (a_k/a_1) + 1$.
(c) $\sum_{n=1}^{k} a_n = \sum_{n=1}^{k} a_1 r^{n-1}$.
Proof. (a) The common ratio between any two consecutive terms is r. Since $a_n$ is (n − 1)
terms “after” $a_1$, we must have $a_n = a_1 r^{n-1}$.
(b) By (a), $a_k = a_1 r^{k-1}$, or $a_k/a_1 = r^{k-1}$, or $\log_r (a_k/a_1) = k - 1$, or $k = \log_r (a_k/a_1) + 1$.
(c) This is an immediate consequence of (a).
\[
S = a\,\frac{1 - r^k}{1 - r}.
\]
And thus:
\[
\lim_{k \to \infty} S_k = \frac{a_1}{1 - r}.
\]
Remark 50. By the way, the mass cancellation trick used in the above proof is called the
method of differences (which is the topic of Ch. 33).
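\[
1 + 2 + 2^2 + 2^3 + \dots + 2^{20} = 1 \cdot \frac{1 - 2^{21}}{1 - 2} = \frac{2^{21} - 1}{1} = 2\,097\,152 - 1 = 2\,097\,151.
\]
Example 509. Consider the geometric series $1 + \frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \dots + \frac{1}{2^{20}}$.
The first term is 1, the common ratio is 1/2, and there are 21 terms. Hence:
\[
1 + \frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \dots + \frac{1}{2^{20}} = 1 \cdot \frac{1 - (1/2)^{21}}{1 - 1/2} = 2 - \frac{1}{2^{20}}.
\]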
Corollary 6. Let $(a_n)_{n=1}^{k}$ be a finite geometric sequence with $r = a_2/a_1$. Then:
\[
\sum_{n=1}^{k} a_n = \frac{a_1 - r a_k}{1 - r}.
\]
\[
1 + 2 + 4 + 8 + 16 + \dots + 1\,024 = \frac{1 - 2 \cdot 1\,024}{1 - 2} = \frac{1 - 2\,048}{-1} = 2\,047.
\]
\[
4 + 12 + 36 + 108 + \dots + 8\,748 = \frac{4 - 3 \cdot 8\,748}{1 - 3} = \frac{4 - 26\,244}{-2} = 13\,120.
\]
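Both forms of the finite geometric sum can be checked with exact arithmetic. In the Python sketch below, `geometric_sum` uses the first term, common ratio, and number of terms, while `geometric_sum_from_last` uses the first and last terms as in Corollary 6. Both names are hypothetical and both functions assume r ≠ 1.

```python
from fractions import Fraction

def geometric_sum(a1, r, k):
    """Sum of k terms: a1 * (1 - r^k) / (1 - r), assuming r != 1."""
    r = Fraction(r)
    return a1 * (1 - r**k) / (1 - r)

def geometric_sum_from_last(a1, r, ak):
    """Corollary 6: (a1 - r * ak) / (1 - r), assuming r != 1."""
    r = Fraction(r)
    return (a1 - r * ak) / (1 - r)

print(geometric_sum(1, 2, 21))                 # 1 + 2 + ... + 2^20
print(geometric_sum_from_last(1, 2, 1024))     # 1 + 2 + ... + 1024
print(geometric_sum_from_last(4, 3, 8748))     # 4 + 12 + ... + 8748
print(geometric_sum(1, Fraction(1, 2), 21))    # 1 + 1/2 + ... + 1/2^20
```

The two functions agree whenever `ak` really is the kth term, since then r times `ak` equals a1 times r to the power k.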
Exercise 160. Rewrite each series in summation notation, then compute its sum.
(a) 7 + 14 + 28 + 56 + 112 + 224 + 448 + 896.
(b) 20 + 10 + 5 + 5/2 + 5/4 + 5/8.
(c) 1 + 1/3 + 1/9 + 1/27 + 1/81 + 1/243. (Answers on p. 1466.)
\[
a_1 + a_1 r + a_1 r^2 + a_1 r^3 + \dots = \frac{a_1}{1 - r}.
\]
And thus:
\[
\lim_{k \to \infty} S_k = \frac{a_1}{1 - r}.
\]
If $|r| \ge 1$, then the limit does not exist:
Exercise 161. Rewrite each series in summation notation, then compute its sum.
(a) 6 + 9/2 + 27/8 + . . .
(b) 20 + 10 + 5 + . . .
(c) 1 + 1/3 + 1/9 + . . . (Answers on p. 1466.)
198
Our proof here hand-waves a little because it implicitly assumes certain “obvious” results about limits
that aren’t mentioned in the main text (but are proved only in Ch. XXX of the Appendices).
32. Rules of Summation Notation
Proof. (a) $\sum_{n=1}^{k} c = \underbrace{c + c + \dots + c}_{k \text{ times}} = ck$.
(b) $\sum_{n=1}^{k} (c a_n) = c a_1 + c a_2 + \dots + c a_k = c (a_1 + a_2 + \dots + a_k) = c \sum_{n=1}^{k} a_n$.
(c) $\sum_{n=1}^{k} (a_n + b_n) = (a_1 + b_1) + (a_2 + b_2) + \dots + (a_k + b_k) = (a_1 + a_2 + \dots + a_k) + (b_1 + b_2 + \dots + b_k) = \sum_{n=1}^{k} a_n + \sum_{n=1}^{k} b_n$.
(d) $\sum_{n=1}^{k} (a_n - b_n) = (a_1 - b_1) + (a_2 - b_2) + \dots + (a_k - b_k) = (a_1 + a_2 + \dots + a_k) - (b_1 + b_2 + \dots + b_k) = \sum_{n=1}^{k} a_n - \sum_{n=1}^{k} b_n$.
(e) $\sum_{n=1}^{k+l} a_n = a_1 + a_2 + \dots + a_k + a_{k+1} + a_{k+2} + \dots + a_{k+l} = (a_1 + a_2 + \dots + a_k) + (a_{k+1} + a_{k+2} + \dots + a_{k+l}) = \sum_{n=1}^{k} a_n + \sum_{n=k+1}^{k+l} a_n$.
(f) $\sum_{n=k+1}^{k+l} a_n = a_{k+1} + a_{k+2} + a_{k+3} + \dots + a_{k+l} = \sum_{n=1}^{l} a_{k+n}$.
199
More of such identities available here: .
Exercise 162. Evaluate each of the following (x ∈ R). (Answer on p. 1467.)
Exercise 163. Let $x \neq 1$ and $S_k = \sum_{n=1}^{k} n x^{n-1}$. Prove the following.
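Example 512. Consider $\frac{1}{1 \times 2} + \frac{1}{2 \times 3} + \frac{1}{3 \times 4} + \dots + \frac{1}{1\,000 \times 1\,001}$.
It seems like a tall task to evaluate the sum of this series. But using partial fractions,
it’s actually ezy-wheezy. First, rewrite this series in summation notation:
$$\frac{1}{1 \times 2} + \frac{1}{2 \times 3} + \frac{1}{3 \times 4} + \dots + \frac{1}{1\,000 \times 1\,001} = \sum_{n=1}^{1\,000} \frac{1}{n (n + 1)}.$$
Next, take the nth term and do the partial fractions decomposition:
$$\frac{1}{n (n + 1)} = \frac{A}{n} + \frac{B}{n + 1} = \frac{A (n + 1) + Bn}{n (n + 1)} = \frac{(A + B) n + A}{n (n + 1)}.$$
Comparing coefficients, A = 1 and A + B = 0, so that B = −1.
Hence: $$\frac{1}{n (n + 1)} = \frac{1}{n} - \frac{1}{n + 1}.$$
And: $$\sum_{n=1}^{1\,000} \frac{1}{n (n + 1)} = \frac{1}{1 \times 2} + \frac{1}{2 \times 3} + \frac{1}{3 \times 4} + \dots + \frac{1}{1\,000 \times 1\,001}$$
$$= \frac{1}{1} - \frac{1}{2} + \frac{1}{2} - \frac{1}{3} + \frac{1}{3} - \frac{1}{4} + \frac{1}{4} - \dots - \frac{1}{1\,000} + \frac{1}{1\,000} - \frac{1}{1\,001}$$
$$= 1 - \frac{1}{1\,001} = \frac{1\,000}{1\,001}.$$
Observe that in the second line, every term with denominator 2 through 1 000 is happily
cancelled out. Your syllabus calls this process of mass cancellation the method of
differences. (Some other writers instead call this telescoping.)200
More generally, we have:
$$\sum_{n=1}^{k} \frac{1}{n (n + 1)} = \frac{1}{1 \times 2} + \frac{1}{2 \times 3} + \frac{1}{3 \times 4} + \dots + \frac{1}{k (k + 1)} = 1 - \frac{1}{k + 1}.$$
We can thus easily show that the sum of the corresponding infinite series converges:
$$\frac{1}{1 \times 2} + \frac{1}{2 \times 3} + \frac{1}{3 \times 4} + \dots = \lim_{k\to\infty} \sum_{n=1}^{k} \frac{1}{n (n + 1)} = \lim_{k\to\infty} \left(1 - \frac{1}{k + 1}\right) = 1.$$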
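If you'd like to convince yourself that the method of differences really gives the right answer here, a brute-force check is easy. The sketch below adds up the terms one by one using exact fractions; the function name `partial_sum` is just an illustrative choice.

```python
from fractions import Fraction


def partial_sum(k):
    """Add up 1/(n(n+1)) for n = 1, 2, ..., k, term by term, with exact fractions."""
    return sum(Fraction(1, n * (n + 1)) for n in range(1, k + 1))


# The brute-force sum of the first 1 000 terms agrees with 1 - 1/1001.
print(partial_sum(1000) == Fraction(1000, 1001))

# The general formula 1 - 1/(k + 1) holds for every k we try.
print(all(partial_sum(k) == 1 - Fraction(1, k + 1) for k in range(1, 200)))
```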
200
ProofWiki says that this “arises from the obvious physical analogy with the folding up of a telescope”.
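Example 513. Consider $\frac{1}{1 \times 2 \times 3} + \frac{1}{2 \times 3 \times 4} + \frac{1}{3 \times 4 \times 5} + \dots + \frac{1}{1\,000 \times 1\,001 \times 1\,002}$.
First rewrite this series in summation notation: $\sum_{n=1}^{1\,000} \frac{1}{n (n + 1) (n + 2)}$.
Next, take the nth term and do the partial fractions decomposition:
$$\frac{1}{n (n + 1) (n + 2)} = \frac{A}{n} + \frac{B}{n + 1} + \frac{C}{n + 2} = \frac{A (n + 1) (n + 2) + Bn (n + 2) + Cn (n + 1)}{n (n + 1) (n + 2)}$$
$$= \frac{(A + B + C) n^2 + (3A + 2B + C) n + 2A}{n (n + 1) (n + 2)}.$$
Comparing coefficients, 2A = 1, 3A + 2B + C = 0, and A + B + C = 0, so that A = 0.5, B = −1, and C = 0.5.
Altogether then: $$\frac{1}{n (n + 1) (n + 2)} = \frac{0.5}{n} - \frac{1}{n + 1} + \frac{0.5}{n + 2}.$$
And so: $$\frac{1}{1 \times 2 \times 3} + \frac{1}{2 \times 3 \times 4} + \frac{1}{3 \times 4 \times 5} + \dots + \frac{1}{1\,000 \times 1\,001 \times 1\,002}$$
$$= \frac{0.5}{1} - \frac{1}{2} + \frac{0.5}{3} + \frac{0.5}{2} - \frac{1}{3} + \frac{0.5}{4} + \frac{0.5}{3} - \frac{1}{4} + \frac{0.5}{5} + \dots + \frac{0.5}{1\,000} - \frac{1}{1\,001} + \frac{0.5}{1\,002}.$$
Observe that the terms with denominator 3 cancel out nicely. And the same will happen
to all terms with denominator 3 through 1 000.
We will then be left only with terms that have denominators 1, 2, 1 001, and 1 002:
$$\sum_{n=1}^{1\,000} \frac{1}{n (n + 1) (n + 2)} = \frac{0.5}{1} - \frac{1}{2} + \frac{0.5}{2} + \frac{0.5}{1\,001} - \frac{1}{1\,001} + \frac{0.5}{1\,002}$$
$$= \frac{1}{4} - \frac{0.5}{1\,001} + \frac{0.5}{1\,002} = \frac{1}{4} - \frac{1}{2 \cdot 1\,001 \cdot 1\,002} = 0.249\dots$$
More generally, we have:
$$\sum_{n=1}^{k} \frac{1}{n (n + 1) (n + 2)} = \frac{0.5}{1} - \frac{1}{2} + \frac{0.5}{2} - \frac{0.5}{k + 1} + \frac{0.5}{k + 2} = \frac{1}{4} - \frac{1}{2 (k + 1)} + \frac{1}{2 (k + 2)}.$$
And hence, the sum of the corresponding infinite series converges to 1/4:
$$\frac{1}{1 \times 2 \times 3} + \frac{1}{2 \times 3 \times 4} + \dots = \lim_{k\to\infty} \sum_{n=1}^{k} \frac{1}{n (n + 1) (n + 2)} = \lim_{k\to\infty} \left(\frac{1}{4} - \frac{1}{2 (k + 1)} + \frac{1}{2 (k + 2)}\right) = \frac{1}{4}.$$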
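The closed form for k terms can again be checked against a brute-force sum. This is a minimal sketch with made-up function names (`triple_partial_sum`, `triple_closed_form`); it uses exact fractions, so the comparison is exact rather than approximate.

```python
from fractions import Fraction


def triple_partial_sum(k):
    """Add up 1/(n(n+1)(n+2)) for n = 1, ..., k, term by term."""
    return sum(Fraction(1, n * (n + 1) * (n + 2)) for n in range(1, k + 1))


def triple_closed_form(k):
    """The closed form 1/4 - 1/(2(k+1)) + 1/(2(k+2)) from the method of differences."""
    return Fraction(1, 4) - Fraction(1, 2 * (k + 1)) + Fraction(1, 2 * (k + 2))


# Agreement for k = 1 000, and for a range of smaller k.
print(triple_partial_sum(1000) == triple_closed_form(1000))
print(all(triple_partial_sum(k) == triple_closed_form(k) for k in range(1, 200)))

# The partial sums creep up towards 1/4 without ever reaching it.
print(float(triple_partial_sum(1000)))
```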
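Example 514. Consider $\frac{1}{\sqrt{1} + \sqrt{2}} + \frac{1}{\sqrt{2} + \sqrt{3}} + \frac{1}{\sqrt{3} + \sqrt{4}} + \dots + \frac{1}{\sqrt{9\,999} + \sqrt{10\,000}}$.
Again, first rewrite this series in summation notation:
$$\frac{1}{\sqrt{1} + \sqrt{2}} + \frac{1}{\sqrt{2} + \sqrt{3}} + \frac{1}{\sqrt{3} + \sqrt{4}} + \dots + \frac{1}{\sqrt{9\,999} + \sqrt{10\,000}} = \sum_{n=1}^{9\,999} \frac{1}{\sqrt{n} + \sqrt{n + 1}}.$$
Next, rationalise the denominator of the nth term:
$$\frac{1}{\sqrt{n} + \sqrt{n + 1}} = \frac{\sqrt{n + 1} - \sqrt{n}}{\left(\sqrt{n + 1} + \sqrt{n}\right)\left(\sqrt{n + 1} - \sqrt{n}\right)} = \sqrt{n + 1} - \sqrt{n}.$$
Hence, the sum is $(\sqrt{2} - \sqrt{1}) + (\sqrt{3} - \sqrt{2}) + \dots + (\sqrt{10\,000} - \sqrt{9\,999})$. The intermediate terms nicely cancel out, so that we’re left with:
$$\sum_{n=1}^{9\,999} \frac{1}{\sqrt{n} + \sqrt{n + 1}} = \sqrt{10\,000} - \sqrt{1} = 100 - 1 = 99.$$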
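Example 515. Consider $\frac{1}{1} + \frac{1}{1 + 2} + \frac{1}{1 + 2 + 3} + \dots + \frac{1}{1 + 2 + 3 + \dots + 1\,000}$.
Again, first rewrite the series in summation notation:
$$\frac{1}{1} + \frac{1}{1 + 2} + \frac{1}{1 + 2 + 3} + \dots + \frac{1}{1 + 2 + 3 + \dots + 1\,000} = \sum_{n=1}^{1\,000} \frac{1}{1 + \dots + n}.$$
Next, use the formula for the sum of an arithmetic series to rewrite the nth term:
$$1 + \dots + n = \frac{n (n + 1)}{2} \implies \frac{1}{1 + \dots + n} = \frac{2}{n (n + 1)}.$$
By Example 512, $\sum_{n=1}^{k} \frac{1}{n (n + 1)} = 1 - \frac{1}{k + 1}$. So, the sum of the first k terms here is $2 - \frac{2}{k + 1}$, and the corresponding infinite series converges:
$$\frac{1}{1} + \frac{1}{1 + 2} + \frac{1}{1 + 2 + 3} + \dots = \lim_{k\to\infty} \sum_{n=1}^{k} \frac{1}{1 + \dots + n} = \lim_{k\to\infty} \left(2 - \frac{2}{k + 1}\right) = 2.$$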
Example 516. Consider $\sum_{n=1}^{k} n^2 = 1^2 + 2^2 + 3^2 + \dots + k^2$.
We will prove that: $$\sum_{n=1}^{k} n^2 = \frac{k (k + 1) (2k + 1)}{6}.$$
First, observe that: $$(n + 1)^3 - n^3 = 3n^2 + 3n + 1.$$
Hence: $$\sum_{n=1}^{k} \left[(n + 1)^3 - n^3\right] = \sum_{n=1}^{k} \left(3n^2 + 3n + 1\right) = 3 \sum_{n=1}^{k} n^2 + 3 \sum_{n=1}^{k} n + \sum_{n=1}^{k} 1 \overset{1}{=} 3 \sum_{n=1}^{k} n^2 + 3\,\frac{k (k + 1)}{2} + k.$$
But by the method of differences, the same sum is also:
$$\sum_{n=1}^{k} \left[(n + 1)^3 - n^3\right] = 2^3 - 1^3 + 3^3 - 2^3 + 4^3 - 3^3 + \dots + (k + 1)^3 - k^3 \overset{2}{=} (k + 1)^3 - 1^3 = k^3 + 3k^2 + 3k.$$
Putting $\overset{1}{=}$ and $\overset{2}{=}$ together, we get: $$3 \sum_{n=1}^{k} n^2 + 3\,\frac{k (k + 1)}{2} + k = k^3 + 3k^2 + 3k.$$
Rearranging: $$\sum_{n=1}^{k} n^2 = \frac{k^3 + 3k^2 + 3k - 3k (k + 1)/2 - k}{3} = \frac{2k^3 + 3k^2 + k}{6} = \frac{k (k + 1) (2k + 1)}{6}.$$
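Here is a short numerical check of both ingredients of the above proof: the telescoping sum of cubes and the final formula for the sum of squares. It is only a sketch (the function names are illustrative), and of course checking finitely many values of k is not a proof.

```python
def sum_of_squares_formula(k):
    """The closed form k(k+1)(2k+1)/6, computed with integer arithmetic."""
    return k * (k + 1) * (2 * k + 1) // 6


def telescoping_cubes(k):
    """Add up (n+1)^3 - n^3 for n = 1, ..., k, term by term."""
    return sum((n + 1) ** 3 - n**3 for n in range(1, k + 1))


for k in range(1, 301):
    # Method of differences: everything cancels except (k+1)^3 - 1^3.
    assert telescoping_cubes(k) == (k + 1) ** 3 - 1
    # The formula agrees with adding up the squares directly.
    assert sum(n * n for n in range(1, k + 1)) == sum_of_squares_formula(k)

print(sum_of_squares_formula(100))
```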
Exercise 164. Rewrite each series in summation notation and find its sum. Next, write
down its sum in the case where the series has k terms instead. Finally, determine if the
corresponding infinite series converges. If it does, find its limit. (Answer on p. 1468.)
(a) $\frac{1}{3} + \frac{1}{8} + \frac{1}{15} + \frac{1}{24} + \frac{1}{35} + \dots + \frac{1}{999\,999}$. (Hint in footnote.)201
(b) $\lg \frac{1}{2} + \lg \frac{2}{3} + \lg \frac{3}{4} + \dots + \lg \frac{999}{1\,000}$. (lg is the base-10 log.)
(c) $\frac{1}{2\sqrt{1} + 1\sqrt{2}} + \frac{1}{3\sqrt{2} + 2\sqrt{3}} + \dots + \frac{1}{100\sqrt{99} + 99\sqrt{100}}$. (Hint in footnote.)202
(d) $1^3 + 2^3 + 3^3 + \dots + 100^3$. (Hint in footnote.)203
201
Hint: Think about the square numbers.
202
Do the surd rationalisation. Then persevere with the algebra and things will work out nicely.
203
Consider $(n + 1)^4 - n^4$ and mimic the last example (be warned that the algebra will be more painful).
[Figure: the points A = (−1, −3), B = (2, 1), C = (−1, 0), D = (2, 4), E = (4, 0), and F = (0.5, 2) in the xy-plane.]
Let C = (−1, 0), D = (2, 4), E = (4, 0), and F = (0.5, 2) be points. Then:
The vector $\overrightarrow{CD} = (3, 4)$ “carries” us 3 units east and 4 units north from the tail C to the head D.
The vector $\overrightarrow{CE} = (5, 0)$ “carries” us 5 units east and 0 units north from the tail C to the head E.
The vector $\overrightarrow{CF} = (1.5, 2)$ “carries” us 1.5 units east and 2 units north from the tail C to the head F.
Definition 90. Given the points $A = (a_1, a_2)$ and $B = (b_1, b_2)$, the vector from A to B, denoted $\overrightarrow{AB}$, is the ordered pair of real numbers:
$$\overrightarrow{AB} = (b_1 - a_1, b_2 - a_2).$$
(Later on when we look at three-dimensional (3D) space, vectors will instead be ordered
triples of real numbers.)
A vector is often contrasted with a scalar, which is simply any real number:
magnitude.
Example 518. You may recall from physics that velocity is a vector quantity, while
speed is a scalar quantity. In particular, speed is the magnitude of velocity.
We’ll have more to say about this in Ch. 39.
Definition 92. Given the vector (p, q), its magnitude (or length), denoted ∣(p, q)∣, is the number:
$$|(p, q)| = \sqrt{p^2 + q^2}.$$
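Example 519. The magnitude or length of the vector $\overrightarrow{AB} = (3, 4)$ is:
$$\left|\overrightarrow{AB}\right| = |(3, 4)| = \sqrt{3^2 + 4^2} = 5.$$
[Figure: the points A = (−1, −3), B = (2, 1), C = (−1, 0), D = (2, 4), E = (4, 0), and F = (0.5, 2), with the vector from A to B drawn as the hypotenuse of a right-angled triangle whose other two sides have lengths 3 and 4.]
Similarly, the magnitudes of $\overrightarrow{CD} = (3, 4)$, $\overrightarrow{CE} = (5, 0)$, and $\overrightarrow{CF} = (1.5, 2)$ are:
$$\left|\overrightarrow{CD}\right| = |(3, 4)| = \sqrt{3^2 + 4^2} = 5,$$
$$\left|\overrightarrow{CE}\right| = |(5, 0)| = \sqrt{5^2 + 0^2} = 5,$$
$$\left|\overrightarrow{CF}\right| = |(1.5, 2)| = \sqrt{1.5^2 + 2^2} = 2.5.$$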
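The two definitions used so far (the vector from one point to another, and a vector's magnitude) translate directly into code. The sketch below represents points and vectors as plain pairs of numbers; the function names `vector_between` and `magnitude` are illustrative choices, not standard names.

```python
import math


def vector_between(tail, head):
    """The vector from the point `tail` to the point `head`: (b1 - a1, b2 - a2)."""
    return (head[0] - tail[0], head[1] - tail[1])


def magnitude(v):
    """The magnitude (length) of the vector v = (p, q): sqrt(p^2 + q^2)."""
    return math.hypot(v[0], v[1])


A, B, C, D, E, F = (-1, -3), (2, 1), (-1, 0), (2, 4), (4, 0), (0.5, 2)

for name, tail, head in [("AB", A, B), ("CD", C, D), ("CE", C, E), ("CF", C, F)]:
    v = vector_between(tail, head)
    print(name, v, magnitude(v))
```

Note that the code makes no distinction between a point and a vector (both are just pairs), so it is up to us to keep track of which is which.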
Remark 51. In more general contexts, norm is another synonym for magnitude (or
length). But we shan’t use this term in this textbook.
(p, q) = (r, s) ⇐⇒ p = r, q = s.
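Example 521. The vectors $\overrightarrow{CD} = (3, 4)$ and $\overrightarrow{CF} = (1.5, 2)$ point in the same direction.
However, $\overrightarrow{CD} \neq \overrightarrow{CF}$ because they have different lengths: $|\overrightarrow{CD}| = 5$, while $|\overrightarrow{CF}| = 2.5$.
More formally, $\overrightarrow{CD} \neq \overrightarrow{CF}$ because (3, 4) ≠ (1.5, 2).
Example 522. Each of $\overrightarrow{AB} = (3, 4)$, $\overrightarrow{CD} = (3, 4)$, and $\overrightarrow{CE} = (5, 0)$ has length 5. However, $\overrightarrow{CE}$ “obviously” points in a different direction from $\overrightarrow{AB}$ and $\overrightarrow{CD}$.
Thus: $\overrightarrow{CE} \neq \overrightarrow{AB}$ and $\overrightarrow{CE} \neq \overrightarrow{CD}$.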
204
Below, Definition 103 will formally define what it means for two vectors to “point in the same direction”.
34.3. A Vector and a Point Are Different Things
Here’s another important point. Although a vector and a point can each be described by
an ordered pair of real numbers, they are entirely different mathematical objects.
To repeat, a vector is a two-dimensional object endowed with the properties of length
and direction. In contrast, a point is a zero-dimensional object, with neither length
nor direction. Example:
Example 523. Let A = (−1, −3), B = (2, 1), and G = (3, 4) be points.
Then $\overrightarrow{AB} = (3, 4)$ is the vector that carries us 3 units east and 4 units north from A to
B. It is a two-dimensional object endowed with the properties of length and direction.
In contrast, the point G = (3, 4), although also described by an ordered pair of real
numbers, is a zero-dimensional object, with neither length nor direction.
[Figure: the points A = (−1, −3), B = (2, 1), and G = (3, 4), with the vector from A to B.]
$\overrightarrow{AB} = (3, 4)$ is a vector, while G = (3, 4) is a point. They are completely different mathematical objects. Do not confuse them.
So far in this textbook, we’ve used the notation (p, q) to mean three entirely different
things. The notation (p, q) can mean:
(a) The set of real numbers between p and q (excluding p and q);
(b) The point with x-coordinate p and y-coordinate q; or
(c) The vector that “carries” us p units east and q units north.
As argued in Remark 19, in theory, this could be very confusing. But in practice, it isn’t.
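$$(p, q) = \begin{pmatrix} p \\ q \end{pmatrix}.$$
Example 524. The vectors $\overrightarrow{AB} = (3, 4)$, $\overrightarrow{CD} = (3, 4)$, $\overrightarrow{CE} = (5, 0)$, $\overrightarrow{CF} = (1.5, 2)$ may
also be written:
$$\overrightarrow{AB} = \begin{pmatrix} 3 \\ 4 \end{pmatrix}, \quad \overrightarrow{CD} = \begin{pmatrix} 3 \\ 4 \end{pmatrix}, \quad \overrightarrow{CE} = \begin{pmatrix} 5 \\ 0 \end{pmatrix}, \quad \overrightarrow{CF} = \begin{pmatrix} 1.5 \\ 2 \end{pmatrix}.$$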
As we’ll see shortly, we’ll be doing a lot of addition and multiplication with vectors. And
so, this “vertical” notation for vectors is very useful, because it literally helps us see better.
But in print, I’ll often prefer using the (a, b) notation, simply because it takes up less space.
Second, we can denote a vector by a single, lower-case, bold-font letter:
(p, q) = u.
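Example 525. The vectors $\overrightarrow{AB} = (3, 4)$, $\overrightarrow{CD} = (3, 4)$, $\overrightarrow{CE} = (5, 0)$, $\overrightarrow{CF} = (1.5, 2)$ may
also be written:
$$\overrightarrow{AB} = \mathbf{a}, \quad \overrightarrow{CD} = \mathbf{b}, \quad \overrightarrow{CE} = \mathbf{c}, \quad \overrightarrow{CF} = \mathbf{d}.$$
We’ll often use the bold-font letter notation in print. However, it’s hard to hand-write in
bold font, so you can write $\vec{u}$ and $\vec{v}$ in place of $\mathbf{u}$ and $\mathbf{v}$.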
Our first exercises:
Exercise 165. Let A = (−1, −3), B = (2, 1), and G = (3, 4) be points.
Consider the five vectors $\overrightarrow{AG}$, $\overrightarrow{BA}$, $\overrightarrow{BG}$, $\overrightarrow{GA}$, $\overrightarrow{GB}$. Write down each in three different
ways. What is each vector’s tail, head, and length? How many units does each vector
carry us in the x- and y-directions? (Answer on p. 1471.)
Exercise 166. Provide a counterexample to show that the following is not always true:
This is just so you know. We will not be using this bracket notation for vectors in this
textbook, nor do your H2 Maths syllabus and exams.
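$$\overrightarrow{OA} = \mathbf{a} = (a_1, a_2) = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix}.$$
Example 526. The points A = (−1, −3), B = (2, 1), and C = (−1, 0) have position vectors:
$$\overrightarrow{OA} = \mathbf{a} = (-1, -3) = \begin{pmatrix} -1 \\ -3 \end{pmatrix}, \quad \overrightarrow{OB} = \mathbf{b} = (2, 1) = \begin{pmatrix} 2 \\ 1 \end{pmatrix}, \quad \text{and} \quad \overrightarrow{OC} = \mathbf{c} = (-1, 0) = \begin{pmatrix} -1 \\ 0 \end{pmatrix}.$$
Again, take care not to confuse a point with its position vector. Although A and $\overrightarrow{OA}$
may both be denoted by (−1, −3), they are different mathematical objects: the former
is a point while the latter is a vector.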
Definition 94. The zero vector, denoted 0, is the origin’s position vector.
And so, the zero vector is: $\mathbf{0} = \overrightarrow{OO} = (0, 0) = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$.
• Once again, do not confuse the point O = (0, 0) with its position vector 0.
• Given any point P, the vector that carries us from P to P is the vector that carries us precisely
nowhere. Hence, $\overrightarrow{PP} = \mathbf{0}$.
The following result says that every vector has non-negative length; and moreover, the only
vector with length 0 is the zero vector:
205
For a proof of this result in the general n-dimensional case, see p. 1287 in the Appendices.
34.7. Displacement Vectors
Definition 95. Suppose a moving object starts at point A and ends at point B. Then
we call $\overrightarrow{AB}$ its displacement vector.
So, if a moving object starts at $A = (a_1, a_2)$ and ends at $B = (b_1, b_2)$, then regardless of the
path taken by the object, we say that its displacement vector is:
$$\overrightarrow{AB} = (b_1 - a_1, b_2 - a_2).$$
Example 529. Let A = (1, 2) and B = (5, 0) be points. Then the sum A+B is undefined.
It makes no sense to talk about the sum of two points.
Example 530. Consider Athens and Berlin, two points or locations. The sum Athens +
Berlin is undefined. It makes no sense to talk about the sum of two points or locations.
And so, given the points $A = (a_1, a_2)$ and $B = (b_1, b_2)$, their difference is:
$$B - A = \overrightarrow{AB} = (b_1 - a_1, b_2 - a_2).$$
Example 531. Let A = (1, 2) and B = (5, 0). Then B − A is the vector $\overrightarrow{AB}$:
$$B - A = \overrightarrow{AB} = (5, 0) - (1, 2) = (4, -2).$$
Similarly, the difference A − B is defined to be the vector $\overrightarrow{BA}$:
$$A - B = \overrightarrow{BA} = (1, 2) - (5, 0) = (-4, 2).$$
Example 532. The vector “Berlin − Athens” is the journey from Athens to Berlin:
That is, the journey from Athens to Berlin carries us 500 km west and 900 km north.206
Similarly, the vector “Athens − Berlin” is the reverse journey from Berlin to Athens:
Definition 97. Given a point A = (a1 , a2 ) and a vector v = (v1 , v2 ), their sum A + v is
the following point:
A + v = (a1 + v1 , a2 + v2 ) .
Equivalently, if the vector v’s tail is at the point A, then its head is at the point A + v.
Example 533. Let A = (1, 2) and v = (4, 4). Then their sum is the point (5, 6):
Starting from Athens, travelling 500 km west and 900 km north brings us to Berlin.
Definition 98. Given a point B = (b1 , b2 ) and a vector v = (v1 , v2 ), their difference B − v
is the following point:
B − v = (b1 − v1 , b2 − v2 ) .
Equivalently, if the vector v’s head is at the point B, then its tail is at the point B − v.
Example 535. Let A = (1, 2) and v = (4, 4). Then their difference is the following point:
If we end up in Berlin after travelling 500 km west and 900 km north, then we must have
started in Athens.
Definition 99. Let u = (u1 , u2 ) and v = (v1 , v2 ) be vectors. Then their sum, denoted
u + v, is the vector u + v = (u1 + v1 , u2 + v2 ).
Place v’s tail at u’s head. Then u + v is the vector from u’s tail to v’s head:
[Figure: the vectors u, v, and u + v, with v’s tail placed at u’s head.]
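Example 537. The sum of u = (−1, 3) and v = (4, 4) is the vector (3, 7):
$$\mathbf{u} + \mathbf{v} = \begin{pmatrix} -1 \\ 3 \end{pmatrix} + \begin{pmatrix} 4 \\ 4 \end{pmatrix} = \begin{pmatrix} 3 \\ 7 \end{pmatrix}.$$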
Informally, the vector −u is the vector u flipped in the opposite direction. Formally:
Definition 100. The additive inverse of the vector u = (u1 , u2 ) is the vector:
−u = (−u1 , −u2 ) .
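$$-\mathbf{u} = -\begin{pmatrix} -1 \\ 3 \end{pmatrix} = \begin{pmatrix} 1 \\ -3 \end{pmatrix} \quad \text{and} \quad -\mathbf{v} = -\begin{pmatrix} 4 \\ 4 \end{pmatrix} = \begin{pmatrix} -4 \\ -4 \end{pmatrix}.$$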
Definition 101. Given two vectors u and v, the difference u − v is the sum of u and the
additive inverse of v. That is:
u − v = u + (−v).
u − v = (u1 − v1 , u2 − v2 ) .
u − v = u + (−v) = (u1 − v1 , u2 − v2 ) .
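[Figure: flipping v gives −v; placing the tail of −v at the head of u then gives u − v.]
Or equivalently, place the heads of u and v at the same point. Then u − v is the vector
from the tail of u to the tail of v:
[Figure: the vectors u, v, and u − v, with the heads of u and v at the same point.]
Example 539. Let u = (−1, 3) and v = (4, 4). Then the difference u − v is defined to be
the vector (−5, −1):
$$\mathbf{u} - \mathbf{v} = \begin{pmatrix} -1 \\ 3 \end{pmatrix} - \begin{pmatrix} 4 \\ 4 \end{pmatrix} = \begin{pmatrix} -5 \\ -1 \end{pmatrix}.$$
$$\mathbf{v} - \mathbf{u} = \begin{pmatrix} 4 \\ 4 \end{pmatrix} - \begin{pmatrix} -1 \\ 3 \end{pmatrix} = \begin{pmatrix} 5 \\ 1 \end{pmatrix}.$$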
More generally:
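Fact 56. If A, B, and C are points, then $\overrightarrow{AB} - \overrightarrow{AC} = \overrightarrow{CB}$ and $\overrightarrow{AB} + \overrightarrow{BC} = \overrightarrow{AC}$.
Proof. Let $A = (a_1, a_2)$, $B = (b_1, b_2)$, and $C = (c_1, c_2)$ be points. Then by Definition 96:
$$\overrightarrow{AB} = (b_1 - a_1, b_2 - a_2), \quad \overrightarrow{AC} = (c_1 - a_1, c_2 - a_2), \quad \text{and} \quad \overrightarrow{CB} = (b_1 - c_1, b_2 - c_2).$$
And now by Fact 54,
$$\overrightarrow{AB} - \overrightarrow{AC} = (b_1 - a_1, b_2 - a_2) - (c_1 - a_1, c_2 - a_2) = (b_1 - c_1, b_2 - c_2) = \overrightarrow{CB}. \quad \checkmark$$
Observing that $-\overrightarrow{CB} = \overrightarrow{BC}$ and rearranging, we also have $\overrightarrow{AB} + \overrightarrow{BC} = \overrightarrow{AC}$. ✓
$$\overrightarrow{AB} = \overrightarrow{OB} - \overrightarrow{OA} = \begin{pmatrix} 3 \\ -1 \end{pmatrix} - \begin{pmatrix} -1 \\ 2 \end{pmatrix} = \begin{pmatrix} 4 \\ -3 \end{pmatrix}.$$
$$\overrightarrow{CD} = \overrightarrow{OD} - \overrightarrow{OC} = \begin{pmatrix} 3 \\ -2 \end{pmatrix} - \begin{pmatrix} -1 \\ 1 \end{pmatrix} = \begin{pmatrix} 4 \\ -3 \end{pmatrix}.$$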
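Exercise 168. Express each of the following vectors more simply: (Answer on p. 1471)
$$\overrightarrow{AC} + \overrightarrow{CB}, \quad \overrightarrow{DC} + \overrightarrow{CA}, \quad \overrightarrow{BD} + \overrightarrow{DA},$$
$$\overrightarrow{AD} - \overrightarrow{CD}, \quad -\overrightarrow{DC} - \overrightarrow{BD}, \quad \overrightarrow{BD} + \overrightarrow{DB}.$$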
Definition 102. Given the vector v = (v1 , v2 ) and the scalar c ∈ R, the vector cv is:
cv = (cv1 , cv2 ) .
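If c > 0, then the vector cv is simply the vector that points in the same direction as v, but has c times
the length. (If c < 0, then cv points in the exact opposite direction and has ∣c∣ times the length.)
[Figure: the vectors v and cv.]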
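Exercise 170. Let A = (1, −3), B = (2, 0), and C = (5, −1). (Answer on p. 1472.)
(a) Write down $\overrightarrow{AB}$, $\overrightarrow{AC}$, $\overrightarrow{BC}$, $2\overrightarrow{AB}$, $3\overrightarrow{AC}$, and $4\overrightarrow{BC}$.
(b) Verify that $|2\overrightarrow{AB}| = 2|\overrightarrow{AB}|$, $|3\overrightarrow{AC}| = 3|\overrightarrow{AC}|$, and $|4\overrightarrow{BC}| = 4|\overrightarrow{BC}|$.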
Definition 103. Two non-zero vectors u and v are said to point in:
(a) The same direction if u = kv for some k > 0;
(b) Exact opposite directions if u = kv for some k < 0; and
(c) Different directions if u ≠ kv for any k.
Example 542. Let a = (2, 0), b = (1, 0), c = (−3, 0), and d = (1, 1).
The vectors a and b point in the same direction because a = 2b.
The vectors a and c point in the exact opposite directions because c = −1.5a.
The vectors a and d point in different directions because a ≠ kd for any k.
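Definition 103 can be turned into a small classifier. The sketch below is one possible way to do it and rests on an assumption not stated in the text: for non-zero 2D vectors, u = kv for some k exactly when u1 v2 − u2 v1 = 0, in which case the sign of k is the sign of u1 v1 + u2 v2. The function name `direction_relation` is made up, and the test is exact only for integer (or other exact) inputs.

```python
def direction_relation(u, v):
    """Classify two non-zero 2D vectors as pointing in the 'same', 'opposite',
    or 'different' directions, in the sense of Definition 103."""
    if not any(u) or not any(v):
        raise ValueError("Definition 103 applies only to non-zero vectors")
    # Assumed test: u = kv for some k exactly when u1*v2 - u2*v1 = 0.
    if u[0] * v[1] - u[1] * v[0] != 0:
        return "different"
    # Here u = kv with k != 0, and k has the same sign as u1*v1 + u2*v2.
    return "same" if u[0] * v[0] + u[1] * v[1] > 0 else "opposite"


a, b, c, d = (2, 0), (1, 0), (-3, 0), (1, 1)
print(direction_relation(a, b))
print(direction_relation(a, c))
print(direction_relation(a, d))
```

The zero vector is rejected outright, in line with the special case noted for 0 = (0, 0).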
Exercise 171. Continuing with the above example, explain if b points in the same, exact
opposite, or different direction from each of c and d. (Answer on p. 1472.)
Remark 53. Note the special case of the zero vector 0 = (0, 0) — it does not point in the
same, exact opposite, or different direction as any other vector.
Definition 104. Two non-zero vectors u and v are parallel if u = kv for some k and
non-parallel otherwise.
Example 543. The vectors a = (2, 0), b = (1, 0), c = (−3, 0) are parallel. And so as
shorthand, we may write a ∥ b, a ∥ c, and b ∥ c.
The vector d = (1, 1) is not parallel to a, b, or c. Equivalently, d = (1, 1) points in a
different direction from a, b, and c. And so as shorthand, we may write d ∦ a, d ∦ b,
and d ∦ c.
Remark 54. Again, note the special case of the zero vector 0 = (0, 0) — it is neither
parallel nor non-parallel to any other vector.
207
Just so you know, some writers call two vectors that point in exact opposite directions anti-parallel.
In contrast, we call them parallel. We will not use the term anti-parallel in this textbook.
34.13. Unit Vectors
Example 545. The vectors (1, 1) and (−1, −1) are not unit vectors:
$$|(1, 1)| = \sqrt{1^2 + 1^2} = \sqrt{2} \neq 1, \quad \checkmark$$
$$|(-1, -1)| = \sqrt{(-1)^2 + (-1)^2} = \sqrt{2} \neq 1. \quad \checkmark$$
Given a vector v, its unit vector, denoted v̂, is the vector that points in the same direction,
but has length 1. Formally:
Definition 106. Given a non-zero vector v, its unit vector (or the unit vector in its
direction) is:
$$\hat{\mathbf{v}} = \frac{1}{|\mathbf{v}|} \mathbf{v}.$$
It is easy to verify that thus defined, any vector’s unit vector has length 1:
Fact 58. Given any non-zero vector, its unit vector has length 1.
Proof. By Fact 57, $|\hat{\mathbf{v}}| = \left|\frac{1}{|\mathbf{v}|} \mathbf{v}\right| = \left|\frac{1}{|\mathbf{v}|}\right| |\mathbf{v}| = \frac{1}{|\mathbf{v}|} |\mathbf{v}| = 1$.
Fact 59. Let v be a vector with unit vector v̂. If c ∈ R, then the vector cv̂ has length ∣c∣.
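$$\overrightarrow{AB}, \quad \overrightarrow{AC}, \quad \overrightarrow{BC}, \quad 2\overrightarrow{AB}, \quad 3\overrightarrow{AC}, \quad 4\overrightarrow{BC}.$$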
Remark 55. Note that some writers also call û the normalised vector of u, but we shall
not do so.
[Figure: the standard basis vectors i = (1, 0) and j = (0, 1).]
The standard basis vectors i = (1, 0) and j = (0, 1) are simply the unit vectors that point
in the directions of the positive x- and y-axes. Formally:
Definition 107. The standard basis vectors (in 2D space) are i = (1, 0) and j = (0, 1).
It turns out that any vector can be written as the linear combination (i.e. weighted
sum) of i’s and j’s:
Example 546. Let A = (−2, −1), B = (1, 2), and C = (5, −2) be points. Their position
vectors can be written as linear combinations (i.e. weighted sums) of i’s and j’s:
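[Figure: the points A, B, and C, together with their position vectors.]
$$\overrightarrow{OA} = \begin{pmatrix} -2 \\ -1 \end{pmatrix} = -2\mathbf{i} - \mathbf{j} = -2\begin{pmatrix} 1 \\ 0 \end{pmatrix} - \begin{pmatrix} 0 \\ 1 \end{pmatrix},$$
$$\overrightarrow{OB} = \begin{pmatrix} 1 \\ 2 \end{pmatrix} = \mathbf{i} + 2\mathbf{j} = \begin{pmatrix} 1 \\ 0 \end{pmatrix} + 2\begin{pmatrix} 0 \\ 1 \end{pmatrix},$$
$$\overrightarrow{OC} = \begin{pmatrix} 5 \\ -2 \end{pmatrix} = 5\mathbf{i} - 2\mathbf{j} = 5\begin{pmatrix} 1 \\ 0 \end{pmatrix} - 2\begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$
Fact 61. Let a, b, and c be vectors. If a ∦ b, then there are α, β ∈ R such that: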
c = αa + βb.
Proof. The next page gives a heuristic proof. For a formal proof, see p. 1288 (Appendices).
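Example 547. Consider the vectors a = (1, 2) and b = (3, 4). Since a ∦ b, by Fact 61,
any vector can be expressed as the linear combination of a and b.
Consider for example the vector u = (2, 2). We will find α, β ∈ R such that u = αa + βb.
To do so, first write:
$$\begin{pmatrix} 2 \\ 2 \end{pmatrix} = \alpha \begin{pmatrix} 1 \\ 2 \end{pmatrix} + \beta \begin{pmatrix} 3 \\ 4 \end{pmatrix}.$$
Write out the above vector equation as the following two cartesian equations:
$$2 \overset{1}{=} 1\alpha + 3\beta \quad \text{and} \quad 2 \overset{2}{=} 2\alpha + 4\beta.$$
Now solve this system of (two) equations: $\overset{2}{=}$ minus 2 × $\overset{1}{=}$ yields −2β = −2 or β = 1, so that
α = −1. Thus:
$$\mathbf{u} = -\mathbf{a} + \mathbf{b}.$$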
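The elimination carried out by hand above can also be automated. The sketch below solves the two cartesian equations using Cramer's rule, which is a swapped-in technique (the text solves the system by elimination instead); the function name `linear_combination` is made up for illustration.

```python
from fractions import Fraction


def linear_combination(a, b, c):
    """Return (alpha, beta) such that c = alpha*a + beta*b, for 2D vectors a, b, c.

    Uses Cramer's rule on the pair of cartesian equations. The vectors a and b
    must not be parallel (and must be non-zero), or the determinant vanishes.
    """
    a1, a2 = Fraction(a[0]), Fraction(a[1])
    b1, b2 = Fraction(b[0]), Fraction(b[1])
    c1, c2 = Fraction(c[0]), Fraction(c[1])
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("a and b are parallel (or zero), so Fact 61 does not apply")
    alpha = (c1 * b2 - c2 * b1) / det
    beta = (a1 * c2 - a2 * c1) / det
    return alpha, beta


alpha, beta = linear_combination((1, 2), (3, 4), (2, 2))
print(alpha, beta)
# Check: alpha*a + beta*b should give back (2, 2).
print((alpha * 1 + beta * 3, alpha * 2 + beta * 4) == (2, 2))
```

When a and b are parallel, as in the examples where Fact 61 does not apply, the function raises an error instead of returning an answer.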
Exercise 174. Let a = (1, 2) and b = (3, 4). Express each of the vectors v = (3, 2) and
w = (−1, 0) as the linear combination of a and b. (Answer on p. 1472.)
Exercise 175. Explain why any vector can be written as a linear combination of the
vectors a = (1, 3) and b = (7, 5). Then express each of the vectors i = (1, 0), j = (0, 1),
and d = (1, 1) as the linear combination of a and b. (Answer on p. 1473.)
Remark 56. For a somewhat recent application of Fact 61 in the A-Level exams, see
N2013/I/6(i) (Exercise 521).
Example 548. Consider the vectors a = (1, 1) and b = (2, 2). Since a ∥ b, Fact 61 does
not apply.
For example, we cannot express v = (1, 2) as the linear combination of a and b.
(Indeed, this can only be done for vectors that are themselves parallel to a and b.)
Example 549. The vectors c = (3, 1) and d = (−3, −1) point in exact opposite directions.
Since c ∥ d, Fact 61 does not apply.
For example, we cannot express v = (1, 2) as the linear combination of c and d.
(Indeed, this can only be done for vectors that are themselves parallel to c and d.)
[Figure: the vectors a and b, and the vector v = αa + βb drawn as αa followed by βb.]
As the above figure suggests, “obviously”, we can always find real numbers α and β so that
the head of αa and the tail of βb coincide. In other words, there are real numbers α and
β such that v = αa + βb.
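Theorem 7. Let A and B be points with position vectors a and b. Let P be the point
that divides the line segment AB in the ratio λ ∶ µ. Then P’s position vector is:
$$\mathbf{p} = \frac{\mu \mathbf{a} + \lambda \mathbf{b}}{\lambda + \mu}.$$
[Figure: the point P dividing the line segment AB in the ratio λ ∶ µ.]
Proof. By Fact 55, $\overrightarrow{AP} \overset{1}{=} \mathbf{p} - \mathbf{a}$ and $\overrightarrow{AB} = \mathbf{b} - \mathbf{a}$.
Now observe that $\overrightarrow{AP}$ points in the same direction as $\overrightarrow{AB}$, but has λ/(λ + µ) times the
length. Thus:
$$\overrightarrow{AP} \overset{2}{=} \frac{\lambda}{\lambda + \mu} \overrightarrow{AB} = \frac{\lambda}{\lambda + \mu} (\mathbf{b} - \mathbf{a}).$$
Rearranging $\overset{1}{=}$ and plugging in $\overset{2}{=}$:
$$\mathbf{p} = \mathbf{a} + \overrightarrow{AP} = \mathbf{a} + \frac{\lambda}{\lambda + \mu} (\mathbf{b} - \mathbf{a}) = \frac{(\lambda + \mu) \mathbf{a} + \lambda (\mathbf{b} - \mathbf{a})}{\lambda + \mu} = \frac{\mu \mathbf{a} + \lambda \mathbf{b}}{\lambda + \mu}.$$
By the way, you do not need to mug the Ratio Theorem because the following is printed
on p. 4 of List MF26:
The point dividing AB in the ratio λ ∶ µ has position vector $\frac{\mu \mathbf{a} + \lambda \mathbf{b}}{\lambda + \mu}$.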
Example 551. Let C = (8, 3) and D = (2, −6) be points. Let Q be the point that divides
the line segment CD in the ratio 3 ∶ 7. By the Ratio Theorem:
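$$\mathbf{q} = \frac{7\mathbf{c} + 3\mathbf{d}}{3 + 7} = \frac{7}{10} \begin{pmatrix} 8 \\ 3 \end{pmatrix} + \frac{3}{10} \begin{pmatrix} 2 \\ -6 \end{pmatrix} = \frac{1}{10} \begin{pmatrix} 62 \\ 3 \end{pmatrix} \quad \text{and hence} \quad Q = (6.2, 0.3).$$
[Figure: the points C = (8, 3), Q = (6.2, 0.3), and D = (2, −6).]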
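The Ratio Theorem is a one-line computation, so it is easy to check answers with a short script. The following is a sketch only: the function name `divide_segment` is an illustrative choice, and exact fractions are used so that the output can be compared directly with hand calculations.

```python
from fractions import Fraction


def divide_segment(a, b, lam, mu):
    """Position vector (mu*a + lam*b)/(lam + mu) of the point dividing AB in the ratio lam : mu."""
    lam, mu = Fraction(lam), Fraction(mu)
    return tuple(
        (mu * Fraction(ai) + lam * Fraction(bi)) / (lam + mu)
        for ai, bi in zip(a, b)
    )


# C = (8, 3) and D = (2, -6), divided in the ratio 3 : 7.
q = divide_segment((8, 3), (2, -6), 3, 7)
print(q)
print([float(coordinate) for coordinate in q])
```

As a further check, dividing a segment in the ratio 1 ∶ 1 should give its midpoint.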
Exercise 176. Let A = (1, 2), B = (3, 4), C = (1, 4), D = (2, 3), E = (−1, 2), F = (3, −4)
be points. Find the points P , Q, and R which divide the line segments AB, CD, and
EF in the ratios 5 ∶ 6, 5 ∶ 1, and 2 ∶ 3, respectively. (Answer on p. 1473.)
l = {(x, y) ∶ ax + by + c = 0} ,
ax + by + c = 0.
In this chapter, we’ll learn a second method for describing lines, namely vector equations.
We’ll start by introducing the concept of a line’s direction vector:
208
Ch. 6.8.
209
Some writers also call this a scalar equation, but we shan’t do so.
As the above example suggests, direction vectors are not unique. If v is a direction vector
of a line, then so too is any vector that’s parallel to v.
But no other vector is a direction vector of the line. That is, if u ∦ v, then u is not a
direction vector of the line. (And so, although the direction vector v isn’t unique, we can
say that it is unique up to non-zero scalar multiplication.)
Altogether then, if a line has direction vector v, then its direction vectors are exactly those
that are parallel to v. Formally:
Fact 62. Let u and v be vectors. Suppose v is a line’s direction vector. Then:
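$$\overrightarrow{AB} = \begin{pmatrix} 1.5 \\ 3 \end{pmatrix} \quad \text{and} \quad \overrightarrow{CD} = \begin{pmatrix} 2 \\ 4 \end{pmatrix}.$$
We can easily verify that $\overrightarrow{CD} \parallel \overrightarrow{AB}$:
$$\overrightarrow{CD} = \begin{pmatrix} 2 \\ 4 \end{pmatrix} = \frac{4}{3} \begin{pmatrix} 1.5 \\ 3 \end{pmatrix} = \frac{4}{3} \overrightarrow{AB}.$$
The following are parallel to $\overrightarrow{AB}$ and $\overrightarrow{CD}$ and are thus also direction vectors of l:
$$-5 \begin{pmatrix} 2 \\ 4 \end{pmatrix} = \begin{pmatrix} -10 \\ -20 \end{pmatrix}, \quad \pi \begin{pmatrix} 2 \\ 4 \end{pmatrix} = \begin{pmatrix} 2\pi \\ 4\pi \end{pmatrix}, \quad \text{and} \quad 17 \begin{pmatrix} 2 \\ 4 \end{pmatrix} = \begin{pmatrix} 34 \\ 68 \end{pmatrix}.$$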
Proof. Let D = (p, q) be any point on the line. Since D is on the line, it satisfies the line’s
cartesian equation. That is:
ap + bq + c = 0.
Now consider the point E = (p + b, q − a). We now show that E also satisfies the line’s
cartesian equation and is thus also on the line:
a (p + b) + b (q − a) + c = ap + ab + bq − ab + c = ap + bq + c = 0. ✓
Since D and E are both points on the line, by Definition 108, the line has direction vector:
$$\overrightarrow{DE} = E - D = (p + b, q - a) - (p, q) = (b, -a).$$
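[Figure: the line 5x − 2y + 3 = 0 (or y = 2.5x + 1.5) with direction vectors (−2, −5) and (1, 2.5); the line 3y − 1 = 0 with direction vector (3, 0); and the line −2x + 1 = 0 with direction vector (0, 2).]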
Next, the line described by 3y − 1 = 0 or y = 1/3 has direction vector (3, 0).
And the line described by −2x + 1 = 0 or x = 0.5 has direction vector (0, 2).
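To see the direction vector (b, −a) in action, we can start at a point on a line and repeatedly step along (b, −a), checking that we never leave the line. This is a sketch under stated assumptions: the function names are made up, and the starting point (1, 4) on the line 5x − 2y + 3 = 0 is our own choice (it does not appear in the text).

```python
def direction_vector(a, b):
    """A direction vector of the line ax + by + c = 0, namely (b, -a)."""
    return (b, -a)


def is_on_line(a, b, c, point):
    """Does the point (x, y) satisfy ax + by + c = 0?"""
    x, y = point
    return a * x + b * y + c == 0


a, b, c = 5, -2, 3
start = (1, 4)  # assumed starting point: 5*1 - 2*4 + 3 = 0
v = direction_vector(a, b)

# Step along v in both directions; every point reached should still be on the line.
points = [(start[0] + t * v[0], start[1] + t * v[1]) for t in range(-3, 4)]
print(v)
print(all(is_on_line(a, b, c, p) for p in points))
```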
Proof. Suppose the line is described by ax + by + c = 0. Then by Fact 63, the line has
direction vector (b, −a).
(a) If the line is horizontal, then by Fact 13, a = 0. And so, the line has direction vector
(b, 0). Since (1, 0) ∥ (b, 0), by Fact 62, the line also has direction vector (1, 0).
(b) Similarly, if the line is vertical, then by Fact 13, b = 0. And so, the line has direction
vector (0, −a). Since (0, 1) ∥ (0, −a), by Fact 62, the line also has direction vector (0, 1).
(c) The line’s gradient is −a/b = m. But (b, −a) ∥ (1, −a/b). And so by Fact 62, the line
also has direction vector (1, m).
Example 555. The horizontal line y = −1 has direction vector (1, 0).
The vertical line x = 2 has direction vector (0, 1).
The oblique line y = x + 1 has gradient 1 and thus direction vector (1, 1).
[Figure: the lines y = −1, x = 2, and y = x + 1, with direction vectors (1, 0), (0, 1), and (1, 1), respectively.]
Exercise 177. For each line, write down a direction vector. (Answer on p. 1474.)
Example 556. Consider the line l = {(x, y) ∶ 3x − y + 2 = 0}. It contains exactly those
points (x, y) that satisfy the cartesian equation 3x − y + 2 = 0 or y = 3x + 2. More simply,
we may say that this cartesian equation describes l.
It turns out that we can also describe l using a vector equation.
To do so, first observe that l contains the point P = (0, 2). Also, it has gradient 3 and
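thus direction vector v = (1, 3). Since l is a straight line, it must also contain the points:
$$(-1, -1) = (0, 2) - 1(1, 3), \quad (0, 2) = (0, 2) + 0(1, 3), \quad \text{and} \quad (1, 5) = (0, 2) + 1(1, 3).$$
Indeed, l contains exactly those points R that can be expressed as P + λv = (0, 2) + λ(1, 3)
for some real number λ. That is:
$$R \overset{1}{=} (0, 2) + \lambda (1, 3) \quad (\lambda \in \mathbb{R}).$$
Equivalently, l contains exactly those points R whose position vector r may be expressed
as p + λv = (0, 2) + λ(1, 3) for some real number λ. That is:
$$\mathbf{r} \overset{2}{=} (0, 2) + \lambda (1, 3) \quad (\lambda \in \mathbb{R}).$$
Again, we may more simply say that the vector equation $\overset{2}{=}$ describes l.
(By the way, $\overset{1}{=}$ and $\overset{2}{=}$ are subtly different; more on this in Ch. 35.3.)
[Figure: the line l, which may be described by 3x − y + 2 = 0 or by r = (0, 2) + λ(1, 3) (λ ∈ R). It passes through (−1, −1) (λ = −1), (0, 2) (λ = 0), and (1, 5) (λ = 1), and has direction vector (1, 3).]
$$l = \left\{ (x, y) : \mathbf{r} = \begin{pmatrix} 0 \\ 2 \end{pmatrix} + \lambda \begin{pmatrix} 100 \\ 300 \end{pmatrix} \ (\lambda \in \mathbb{R}) \right\}$$
$$= \left\{ (x, y) : \mathbf{r} = \begin{pmatrix} -1 \\ -1 \end{pmatrix} + \lambda \begin{pmatrix} -100 \\ -300 \end{pmatrix} \ (\lambda \in \mathbb{R}) \right\}$$
$$= \left\{ (x, y) : \mathbf{r} = \begin{pmatrix} 1 \\ 5 \end{pmatrix} + \lambda \begin{pmatrix} 1.5 \\ 4.5 \end{pmatrix} \ (\lambda \in \mathbb{R}) \right\}.$$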
Definition 109. A line is any set of points that can be written as:
$$\left\{ R : \overrightarrow{OR} = \mathbf{p} + \lambda \mathbf{v} \ (\lambda \in \mathbb{R}) \right\},$$
The above Definition says that a line contains exactly those points R whose position vector
$\overrightarrow{OR} = \mathbf{r}$ may be expressed as:
$$\overrightarrow{OR} = \mathbf{r} \overset{2}{=} \mathbf{p} + \lambda \mathbf{v} = (p_1, p_2) + \lambda (v_1, v_2)$$
for some real number λ.
Equivalently, a line contains exactly those points R that may be expressed as:
$$R \overset{1}{=} (p_1, p_2) + \lambda (v_1, v_2)$$
for some real number λ.
To repeat, here are what the vectors p and v and the number λ mean:
• p = (p1 , p2 ) is the position vector of some point on the line;
• v = (v1 , v2 ) is a direction vector of the line; and
• The parameter λ takes on every value in R; each distinct value produces a distinct
point on the line.
Note that Definition 109 is perfectly consistent with our earlier definition of a line (Definition
39). The difference is that Definition 39 “works” only in 2D space. In contrast, Definition
109 is more general — it “works” in 2D space and, as we’ll see, also in 3D space.210
210
Indeed, it also “works” in any n-dimensional space.
And so, to write down a line’s vector equation, we need simply find any point on the line
and any direction vector of the line. More examples to illustrate how this works:
• $\overset{1}{=}$ says that l contains exactly those points R that may be written as (0, 1) + λ(1, −1),
[Figure: the line l, with direction vector (1, −1), passing through (−1, 2) (λ = −1) and (0, 1) (λ = 0).]
[Figure: the line l, with the labels (1, 0), x = −1, and R = (0, 1) + λ(0, 1).]
Exercise 179. In Definition 109 (of a line), we impose the restriction that a line’s direction
vector v must be non-zero. By considering what the line becomes if v is the zero
vector, explain why we impose this restriction. (Answer on p. 1474.)
Remark 57. Here we repeat our earlier warning. A line {(x, y) ∶ ax + by + c = 0} is a set of
points. But for the sake of convenience, we often simply say that the line may be described
by the cartesian equation ax + by + c = 0. And if we’re especially lazy or sloppy, we
might even say that the line is the equation ax + by + c = 0 (even though strictly speaking,
this is wrong because a line is not an equation — it is a set).
Here likewise, a line {R = (x, y) ∶ r = p + λv, λ ∈ R} is a set of points. But for the sake
of convenience, we will often simply say that the line may be described by the vector
equation r = p + λv (λ ∈ R). And if we’re especially lazy or sloppy, we might even say
that the line is the equation r = p + λv (even though again, strictly speaking, this is
wrong because a line is not an equation; it is a set).
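$$\underbrace{R}_{\text{Point}} \overset{1}{=} \underbrace{P}_{\text{Point}} + \underbrace{\lambda \mathbf{v}}_{\text{Vector}} \quad (\lambda \in \mathbb{R}).$$
Or: $$\underbrace{\mathbf{r}}_{\text{Vector}} \overset{2}{=} \underbrace{\mathbf{p}}_{\text{Vector}} + \underbrace{\lambda \mathbf{v}}_{\text{Vector}} \quad (\lambda \in \mathbb{R}).$$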
Here are three pedantic points that can serve as a useful test of your understanding:
Pedantic Point #1. $\overset{1}{=}$ is consistent with what we learnt earlier (in Ch. 34.9): Point + Vector = Point.
So, both vector equations $\overset{1}{=}$ and $\overset{2}{=}$ are perfectly correct ways to describe the exact same
line.
The difference is that $\overset{1}{=}$ does so “more directly” than does $\overset{2}{=}$. Because, to repeat:
• $\overset{1}{=}$ says that l contains exactly those points R that may be written as P + λv, for some
real number λ.
• $\overset{2}{=}$ says that l contains exactly those points R whose position vector $\mathbf{r} = \overrightarrow{OR}$ may be
expressed as p + λv, for some real number λ.
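Pedantic Point #2. What would be wrong and unacceptable is the following:
$$\underbrace{R}_{\text{Point}} \overset{3}{=} \underbrace{\mathbf{p}}_{\text{Vector}} + \underbrace{\lambda \mathbf{v}}_{\text{Vector}} \quad (\lambda \in \mathbb{R}), \quad ✗$$
As we learnt earlier (Ch. 34.9), Vector + Vector = Vector. But the LHS of $\overset{3}{=}$ is a Point
while its RHS is a Vector.
$$\underbrace{\mathbf{r}}_{\text{Vector}} \overset{4}{=} \underbrace{P}_{\text{Point}} + \underbrace{\lambda \mathbf{v}}_{\text{Vector}} \quad (\lambda \in \mathbb{R}). \quad ✗$$
As we also learnt earlier, Point + Vector = Point. But the LHS of $\overset{4}{=}$ is a Vector while its
RHS is a Point.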
Pedantic Point #3. A line is a set of points and not a set of vectors. So, take care to
note that the line l contains the points R = (x, y) and P = (p1 , p2 ) — it does not contain
the vectors r = (x, y) and p = (p1 , p2 ).
r = p + λv (λ ∈ R).
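Or equivalently: $$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} p_1 \\ p_2 \end{pmatrix} + \lambda \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} \quad (\lambda \in \mathbb{R}).$$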
Then given any point (x, y) on this line, there must be some real number λ such that:
We say that the line may be described by the above pair of cartesian equations.
Hm ... but aren’t we supposed to be able to describe a line with just one cartesian equation?
Well, if we’d like, we can do some easy algebra to eliminate the parameter λ:
x=1+λ⋅1 y = 2 + λ ⋅ 1.
1 2
and x
y−x=1 or y = x + 1.
x=0+λ⋅4 y = 0 + λ ⋅ 5.
1 2
and x
5 1
= minus × = yields:
2
4
5 5
y− x=0 or y = x.
4 4
\[ x = 3 + \lambda \cdot 0 = 3 \quad (1) \qquad \text{and} \qquad y = 1 + \lambda \cdot 2. \quad (2) \]
of doing any algebra, we’ll simply discard (2). The above pair of cartesian equations then
\[ x = 3. \quad (1) \]
\[ x = -1 + \lambda \cdot (-1) \quad (1) \qquad \text{and} \qquad y = 2 + \lambda \cdot 0 = 2. \quad (2) \]
of doing any algebra, we’ll simply discard (1). The above pair of cartesian equations then
\[ y = 2. \quad (2) \]
Exercise 180. Each of the following vector equations describes a line. Rewrite each into
cartesian equation form. (Answer on p. 1474.)
\[ \frac{x - p_1}{v_1} = \frac{y - p_2}{v_2} \quad \text{or, rearranging:} \quad y = \frac{v_2}{v_1} x + p_2 - \frac{v_2}{v_1} p_1. \]
(b) If v1 = 0, then l is vertical and can be described by x = p1 .
(c) If v2 = 0, then l is horizontal and can be described by y = p2 .
Proof. First, write:
\[ x = p_1 + \lambda v_1 \quad (1) \qquad \text{and} \qquad y = p_2 + \lambda v_2. \quad (2) \]
Then v1 × (2) minus v2 × (1) yields:
\[ v_1 y - v_2 x = v_1 p_2 + \lambda v_1 v_2 - v_2 p_1 - \lambda v_1 v_2 = v_1 p_2 - v_2 p_1. \]
Or:
\[ v_2 (x - p_1) = v_1 (y - p_2). \quad (3) \]
(a) If v1 , v2 ≠ 0, then (3) divided by v1 v2 yields $\frac{x - p_1}{v_1} = \frac{y - p_2}{v_2}$.
(b) If v1 = 0, then (3) becomes x = p1 .
Armed with Fact 64, we now revisit the last four examples.
\[ \frac{x - 1}{1} = \frac{y - 2}{1} \quad \text{or, rearranging:} \quad y = x + 1. \]
\[ \frac{x - 0}{4} = \frac{y - 0}{5} \quad \text{or, rearranging:} \quad y = \frac{5}{4} x. \]
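The elimination of λ can be sketched in code. This is only an illustrative sketch: the function name `cartesian_coefficients` and the tuple representation of points and vectors are assumptions of ours, not notation from this textbook. It returns coefficients (a, b, c) such that the line r = p + λv is described by ax + by + c = 0, using the relation v2 (x − p1) = v1 (y − p2).

```python
def cartesian_coefficients(p, v):
    """Return (a, b, c) such that a*x + b*y + c = 0 describes the line r = p + lam*v.

    p is a point (p1, p2) on the line and v is a non-zero direction vector (v1, v2).
    Derived from v2*(x - p1) = v1*(y - p2).
    """
    p1, p2 = p
    v1, v2 = v
    if v1 == 0 and v2 == 0:
        raise ValueError("the direction vector must be non-zero")
    return (v2, -v1, v1 * p2 - v2 * p1)


# The four lines discussed above.
print(cartesian_coefficients((1, 2), (1, 1)))    # x - y + 1 = 0, i.e. y = x + 1
print(cartesian_coefficients((0, 0), (4, 5)))    # 5x - 4y = 0, i.e. y = (5/4)x
print(cartesian_coefficients((3, 1), (0, 2)))    # 2x - 6 = 0, i.e. x = 3
print(cartesian_coefficients((-1, 2), (-1, 0)))  # y - 2 = 0, i.e. y = 2
```

Returning (a, b, c) rather than a slope avoids a special case for vertical lines, where v1 = 0.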
Definition 110. Given vectors u = (u1 , u2 ) and v = (v1 , v2 ), their scalar product, denoted
u ⋅ v, is the number:
u ⋅ v = u1 v1 + u2 v2 .
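Example 568. Let $\mathbf{u} = \begin{pmatrix} 5 \\ -3 \end{pmatrix}$, $\mathbf{v} = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$, $\mathbf{w} = \begin{pmatrix} -4 \\ 0 \end{pmatrix}$, and $\mathbf{x} = \begin{pmatrix} 8 \\ 7 \end{pmatrix}$. Then:
\[ \mathbf{u} \cdot \mathbf{v} = \begin{pmatrix} 5 \\ -3 \end{pmatrix} \cdot \begin{pmatrix} 2 \\ 1 \end{pmatrix} = 5 \cdot 2 + (-3) \cdot 1 = 10 - 3 = 7, \]
\[ \mathbf{u} \cdot \mathbf{w} = \begin{pmatrix} 5 \\ -3 \end{pmatrix} \cdot \begin{pmatrix} -4 \\ 0 \end{pmatrix} = 5 \cdot (-4) + (-3) \cdot 0 = -20 + 0 = -20, \]
\[ \mathbf{u} \cdot \mathbf{x} = \begin{pmatrix} 5 \\ -3 \end{pmatrix} \cdot \begin{pmatrix} 8 \\ 7 \end{pmatrix} = 5 \cdot 8 + (-3) \cdot 7 = 40 - 21 = 19, \]
\[ \mathbf{v} \cdot \mathbf{w} = \begin{pmatrix} 2 \\ 1 \end{pmatrix} \cdot \begin{pmatrix} -4 \\ 0 \end{pmatrix} = 2 \cdot (-4) + 1 \cdot 0 = -8 + 0 = -8. \]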
The scalar product is itself simply a scalar (i.e. a real number). Hence the name.
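For readers who like to check arithmetic by computer, here is a minimal sketch of Definition 110. The function name `scalar_product` and the use of Python tuples for vectors are our own assumptions.

```python
def scalar_product(u, v):
    """Scalar product of two 2D vectors: u1*v1 + u2*v2 (Definition 110)."""
    return u[0] * v[0] + u[1] * v[1]


u, v, w, x = (5, -3), (2, 1), (-4, 0), (8, 7)

print(scalar_product(u, v))  # 7
print(scalar_product(u, w))  # -20
print(scalar_product(u, x))  # 19
print(scalar_product(v, w))  # -8
```

Each result is a plain number and not a tuple, which mirrors the point that the scalar product of two vectors is a scalar.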
Remark 58. The scalar product is also called the dot product or the inner product.
But it appears that your A-Level exams and syllabus do not use these terms. And so
neither shall we. We will stick strictly to the term scalar product.
Right now, the scalar product may seem like a totally random and useless thing, but as
we’ll soon learn, it is plenty useful. Let us first learn about a few of its properties.
It turns out that the scalar product is likewise commutative and distributive:
The fact that the scalar product is both commutative and distributive is a simple con-
sequence of the fact that multiplication is itself commutative and distributive.211
Example 571. Continue to let u = (5, −3), v = (2, 1), w = (−4, 0), and x = (8, 7).
The scalar product is commutative:
v ⋅ u = u ⋅ v = 7, w ⋅ u = u ⋅ w = −20, x ⋅ u = u ⋅ x = 19.
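\[ (\mathbf{u} + \mathbf{v}) \cdot \mathbf{w} = \begin{pmatrix} 5 + 2 \\ -3 + 1 \end{pmatrix} \cdot \begin{pmatrix} -4 \\ 0 \end{pmatrix} = -28 + 0 = -28 = -20 - 8 = \mathbf{u} \cdot \mathbf{w} + \mathbf{v} \cdot \mathbf{w}. \]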
(ca) ⋅ b = c (a ⋅ b).
Exercise 182. Let v = (2, 1), w = (−4, 0), and x = (8, 7). Above we already computed
v ⋅ w = −8. Now also compute the following: (Answer on p. 1475.)
211
The latter is, in turn, a fact we will simply take for granted in this textbook.
212
The proof covers only the two-dimensional case. For a more general proof, see p. 1289 (Appendices).
213
This proof covers only the two-dimensional case. For a more general proof, see p. 1289 (Appendices).
36.1. A Vector’s Scalar Product with Itself
It turns out that a vector’s length is the square root of the scalar product with itself:
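Fact 67. Let v be a vector. Then $|\mathbf{v}| = \sqrt{\mathbf{v} \cdot \mathbf{v}}$ and $|\mathbf{v}|^2 = \mathbf{v} \cdot \mathbf{v}$.
Proof. By Definition 92, $|\mathbf{v}| = \sqrt{v_1^2 + v_2^2}$. By Definition 110, $\mathbf{v} \cdot \mathbf{v} = v_1 v_1 + v_2 v_2 = v_1^2 + v_2^2$.
Hence, $|\mathbf{v}| = \sqrt{\mathbf{v} \cdot \mathbf{v}}$ and $|\mathbf{v}|^2 = \mathbf{v} \cdot \mathbf{v}$.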
Exercise 183. Let u = (5, −3), v = (2, 1), w = (−4, 0), and x = (8, 7).
The lengths of each vector are:
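\[ |\mathbf{u}| = \sqrt{5^2 + (-3)^2} = \sqrt{34}, \]
\[ |\mathbf{v}| = \sqrt{2^2 + 1^2} = \sqrt{5}, \]
\[ |\mathbf{w}| = \sqrt{(-4)^2 + 0^2} = 4, \]
\[ |\mathbf{x}| = \sqrt{8^2 + 7^2} = \sqrt{113}. \]
And the square roots of the scalar product of each vector with itself are:
\[ \sqrt{\mathbf{u} \cdot \mathbf{u}} = \sqrt{(5, -3) \cdot (5, -3)} = \sqrt{25 + 9} = \sqrt{34}, \]
\[ \sqrt{\mathbf{v} \cdot \mathbf{v}} = \sqrt{(2, 1) \cdot (2, 1)} = \sqrt{4 + 1} = \sqrt{5}, \]
\[ \sqrt{\mathbf{w} \cdot \mathbf{w}} = \sqrt{(-4, 0) \cdot (-4, 0)} = \sqrt{16 + 0} = 4, \]
\[ \sqrt{\mathbf{x} \cdot \mathbf{x}} = \sqrt{(8, 7) \cdot (8, 7)} = \sqrt{64 + 49} = \sqrt{113}. \]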
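The same check can be done numerically. The sketch below is illustrative only; the helper names `length` and `scalar_product` are our own. It computes each length directly from the components and compares it with the square root of the vector's scalar product with itself.

```python
import math


def scalar_product(u, v):
    """Scalar product of two 2D vectors given as tuples."""
    return u[0] * v[0] + u[1] * v[1]


def length(v):
    """Length of a 2D vector, computed directly from its components."""
    return math.hypot(v[0], v[1])


for vec in [(5, -3), (2, 1), (-4, 0), (8, 7)]:
    direct = length(vec)
    via_product = math.sqrt(scalar_product(vec, vec))
    # The two values agree up to floating-point rounding.
    print(vec, direct, via_product, math.isclose(direct, via_product))
```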
Exercise 184. You are given the vectors a = (−2, 3), b = (7, 1), and c = (5, −4). Verify
that the length of each vector is equal to the square root of each vector’s scalar product
with itself. (Answer on p. 1475.)
We now give our formal Definition of the angle between two vectors. Be warned that
it comes seemingly outta nowhere. But don’t worry, Exercise 185 (next page) will help you
understand where this Definition comes from.
Definition 111. The angle between two non-zero vectors u and v is the number:
\[ \cos^{-1} \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}| \, |\mathbf{v}|}. \]
Recall214 that the range of cos−1 is [0, π]. And so, by the above Definition, the angle
between two vectors is indeed always between 0 and π.
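Definition 111 translates almost directly into code. The following is a minimal sketch under our own assumptions: the name `angle_between` is hypothetical, vectors are tuples, and the ratio is clamped to [−1, 1] purely to guard against floating-point rounding (the clamp is not part of the Definition).

```python
import math


def angle_between(u, v):
    """Angle (in radians, between 0 and pi) between two non-zero 2D vectors."""
    norm_u = math.hypot(u[0], u[1])
    norm_v = math.hypot(v[0], v[1])
    if norm_u == 0 or norm_v == 0:
        raise ValueError("the angle is defined only for non-zero vectors")
    ratio = (u[0] * v[0] + u[1] * v[1]) / (norm_u * norm_v)
    # Rounding can push the ratio slightly outside [-1, 1]; clamp before acos.
    ratio = max(-1.0, min(1.0, ratio))
    return math.acos(ratio)


print(angle_between((1, 0), (1, 1)))    # approximately pi/4
print(angle_between((3, -1), (1, 3)))   # pi/2, since the scalar product is 0
print(angle_between((1, -2), (-2, 4)))  # approximately pi
```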
214
P. 275.
Exercise 185. Let u and v be vectors and θ be the angle between them.
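[Figure: a triangle with two sides given by the vectors u and v, and the angle θ between them.]
This Exercise will help you understand why we define $\theta = \cos^{-1} \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}| \, |\mathbf{v}|}$.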
(a) Write down the vector that corresponds to the third side of the above triangle.
(b) Write down the lengths of the triangle’s three sides in terms of u and v.
(c) The Law of Cosines (Proposition 5) states that if a triangle has sides of lengths a, b,
and c and has angle C opposite the side of length c, then:
c2 = a2 + b2 − 2ab cos C.
Fact 68. If u and v are two non-zero vectors and θ is the angle between them, then:
Example 572. Let θ be the angle between the vectors i = (1, 0) and u = (1, 1). Since i
points east, while u points north-east, we know from primary school trigonometry that
θ = π/4.
Let’s verify that this is consistent with Definition 111:
\[ = \cos^{-1} \left( \frac{-3 - 8}{\sqrt{13} \sqrt{17}} \right) = \cos^{-1} \left( \frac{-11}{\sqrt{221}} \right) \approx 2.404. \]
[Figure: the vector w = (−1, −4).]
By the way, here’s a possible concern. We’ve defined the angle between two vectors as:
\[ \cos^{-1} \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}| \, |\mathbf{v}|}. \]
But recall215 that the domain of arccosine is [−1, 1]. So, how can we be sure that the above
expression is always well-defined? In other words, how can we be sure that:
\[ -1 \leq \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}| \, |\mathbf{v}|} \leq 1? \]
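(b) $\frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}| \, |\mathbf{v}|} \in (0, 1) \iff \theta \in \left( 0, \frac{\pi}{2} \right)$.
(c) $\frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}| \, |\mathbf{v}|} = 0 \iff \theta = \frac{\pi}{2}$.
(d) $\frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}| \, |\mathbf{v}|} \in (-1, 0) \iff \theta \in \left( \frac{\pi}{2}, \pi \right)$.
(e) $\frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}| \, |\mathbf{v}|} = -1 \iff \theta = \pi$.
And thus:
(i) $\mathbf{u} \cdot \mathbf{v} > 0 \iff \theta \in \left[ 0, \frac{\pi}{2} \right)$.
(ii) $\mathbf{u} \cdot \mathbf{v} = 0 \iff \theta = \frac{\pi}{2}$.
(iii) $\mathbf{u} \cdot \mathbf{v} < 0 \iff \theta \in \left( \frac{\pi}{2}, \pi \right]$.
[Figure: graph of y = cos−1 x for x from −1 to 1, with π/2 and π marked on the y-axis.]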
Definition 112. Two non-zero vectors u and v are perpendicular (or normal or ortho-
gonal) if u ⋅ v = 0 and non-perpendicular if u ⋅ v ≠ 0.
Remark 59. Again, note the special case of the zero vector 0 = (0, 0) — it is neither
perpendicular nor non-perpendicular to any other vector.
Proof. “Obviously”, (c) and (d) simply follow from (a) and (b). For the proof of (a) and
(b), see p. 1290 in the Appendices.
More examples:
Example 575. Let θ be the angle between the vectors u = (1, −3) and v = (−2, 4). Then:
So, u and v are neither perpendicular nor parallel; instead, they point in different direc-
tions. Moreover, the angle between them is obtuse.
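[Figure: the vectors u = (1, −3) and v = (−2, 4), with θ ≈ 3.000.]
\[ \theta = \cos^{-1} \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}| \, |\mathbf{v}|} = \cos^{-1} \frac{-2 - 8}{\sqrt{1^2 + (-2)^2} \sqrt{(-2)^2 + 4^2}} = \cos^{-1} \frac{-10}{\sqrt{5} \sqrt{20}} = \cos^{-1} (-1) = \pi. \]
[Figure: the vector u = (1, −2), with θ = π.]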
Example 577. Let θ be the angle between the vectors u = (3, −1) and v = (1, 3). Then:
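\[ \theta = \cos^{-1} \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}| \, |\mathbf{v}|} = \cos^{-1} \frac{3 - 3}{\sqrt{3^2 + (-1)^2} \sqrt{1^2 + 3^2}} = \cos^{-1} 0 = \frac{\pi}{2}. \]
[Figure: the vector u = (3, −1), with θ = π/2.]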
Example 578. Let θ be the angle between the vectors u = (1, −2) and v = (2, −4). Then:
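\[ \theta = \cos^{-1} \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}| \, |\mathbf{v}|} = \cos^{-1} \frac{2 + 8}{\sqrt{1^2 + (-2)^2} \sqrt{2^2 + (-4)^2}} = \cos^{-1} \frac{10}{\sqrt{5} \sqrt{20}} = \cos^{-1} 1 = 0. \]
[Figure: the vector v = (2, −4).]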
Example 579. Even without doing any precise calculations or drawing any graphs, we
can quickly see that:
• The angle between the vectors (817, −2) and (39, −55) is acute, because “clearly”:
√ √
79300 ⋅ 47 + (−470) ⋅ 793 = 0.
• If k < 0, then the angle between (67, k) and (−485, 32) is obtuse, because “clearly”:
Exercise 186. In each of the following, find the angle between u and v. Is it zero, acute,
right, obtuse, or straight? Are u and v perpendicular or parallel? Do they point in the
same, exact opposite, or different directions? (Answers on p. 1476.)
(a) u = (2, 0) and v = (0, 17). (b) u = (5, 0) and v = (−3, 0).
√
(c) u = (1, 0) and v = (1, 3). (d) u = (2, −3) and v = (1, 2).
Exercise 187. Use Facts 65 and 67 to show that $|\mathbf{u} + \mathbf{v}|^2 = |\mathbf{u}|^2 + 2 \mathbf{u} \cdot \mathbf{v} + |\mathbf{v}|^2$. Then use
Remember the Triangle Inequality (Fact 36)? Here it is again, but this time in the
language of vectors:
Fact 72. (Triangle Inequality.) If u and v are vectors, then ∣u + v∣ ≤ ∣u∣ + ∣v∣.
\[ |\mathbf{u} + \mathbf{v}|^2 = |\mathbf{u}|^2 + 2 \mathbf{u} \cdot \mathbf{v} + |\mathbf{v}|^2. \]
To prove Fact 72, first apply Cauchy’s Inequality (Fact 69) to the above equation; then
complete the square and take square roots. (Answer on p. 1477.)
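Definition 113. The x- and y-direction cosines of the vector v = (v1 , v2 ) are the numbers:
\[ \frac{v_1}{|\mathbf{v}|} \quad \text{and} \quad \frac{v_2}{|\mathbf{v}|}. \]
Recall that the unit vector of v is:
\[ \hat{\mathbf{v}} = \left( \frac{v_1}{|\mathbf{v}|}, \frac{v_2}{|\mathbf{v}|} \right). \]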
And so, equivalently, v’s x- and y-direction cosines are the x- and y-coordinates of its unit
vector.
We now explain why the x- and y-direction cosines are so named. Place the tail of v =
(v1 , v2 ) at the origin. Let α be the angle between v and the positive x-axis. Similarly, let
β be the angle between v and the positive y-axis.217
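[Figure: the vector v with its tail at the origin, its unit vector v̂ = (a, b), and the angles α and β that v makes with the positive x- and y-axes.]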
Let v̂ = (a, b) be v’s unit vector. It has length 1 and forms the hypotenuse of two right
triangles. From the lower-right triangle, we have a = cos α.
Similarly, from the upper-left triangle, we have b = cos β.
This explains why v’s unit vector’s x- and y-coordinates are also its x- and y-direction
cosines.
217
More formally, α is the angle between v and i = (1, 0), while β is the angle between v and j = (0, 1).
We can state and prove what was just said a bit more formally:
Fact 73. Let v = (v1 , v2 ) be a non-zero vector. Let α and β be the angles between v and
each of i and j. Then:
\[ \cos \alpha = \frac{v_1}{|\mathbf{v}|} \quad \text{and} \quad \cos \beta = \frac{v_2}{|\mathbf{v}|}. \]
Example 580. Consider the vector v = (3, 2). Its x- and y-direction cosines are:
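\[ \frac{v_1}{|\mathbf{v}|} = \frac{3}{\sqrt{3^2 + 2^2}} = \frac{3}{\sqrt{13}} \quad \text{and} \quad \frac{v_2}{|\mathbf{v}|} = \frac{2}{\sqrt{3^2 + 2^2}} = \frac{2}{\sqrt{13}}. \]
Which means, of course, that its unit vector is $\hat{\mathbf{v}} = \left( \frac{3}{\sqrt{13}}, \frac{2}{\sqrt{13}} \right)$.
[Figure: the vector v = (3, 2) and its unit vector v̂, whose x- and y-coordinates are 3/√13 and 2/√13, with the angles α and β.]
Let α and β be the angles it makes with the positive x- and y-axes. Then:
\[ \alpha = \cos^{-1} \frac{3}{\sqrt{13}} \approx 0.588 \quad \text{and} \quad \beta = \cos^{-1} \frac{2}{\sqrt{13}} \approx 0.983. \]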
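The computation in this example can be sketched as follows. The function name `direction_cosines` is our own choice, and vectors are represented as tuples; treat this as an illustration and not as a prescribed method.

```python
import math


def direction_cosines(v):
    """Return the x- and y-direction cosines (v1/|v|, v2/|v|) of a non-zero 2D vector.

    These are also the coordinates of the unit vector of v.
    """
    norm = math.hypot(v[0], v[1])
    if norm == 0:
        raise ValueError("the zero vector has no direction cosines")
    return (v[0] / norm, v[1] / norm)


cos_alpha, cos_beta = direction_cosines((3, 2))
alpha = math.acos(cos_alpha)  # angle with the positive x-axis
beta = math.acos(cos_beta)    # angle with the positive y-axis
print(round(alpha, 3), round(beta, 3))
```

Because the two direction cosines are the coordinates of a unit vector, their squares always sum to 1, which gives a quick sanity check on any hand computation.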
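\[ \frac{v_1}{|\mathbf{v}|} = \frac{-2}{\sqrt{(-2)^2 + (-1)^2}} = \frac{-2}{\sqrt{5}} \quad \text{and} \quad \frac{v_2}{|\mathbf{v}|} = \frac{-1}{\sqrt{(-2)^2 + (-1)^2}} = \frac{-1}{\sqrt{5}}. \]
Which means, of course, that its unit vector is $\hat{\mathbf{v}} = \left( \frac{-2}{\sqrt{5}}, \frac{-1}{\sqrt{5}} \right)$.
[Figure: the vector v = (−2, −1) and its unit vector v̂, whose x- and y-coordinates are −2/√5 and −1/√5, with the angles α and β.]
Let α and β be the angles it makes with the positive x- and y-axes. Then:
\[ \alpha = \cos^{-1} \frac{-2}{\sqrt{5}} \approx 2.678 \quad \text{and} \quad \beta = \cos^{-1} \frac{-1}{\sqrt{5}} \approx 2.034. \]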
Exercise 189. Find each vector’s x- and y-direction cosines. Then write down its unit
vector. (Answer on p. 1477.)
We now work towards a formal definition of the angle between two lines.
The above Example suggests the following “Definition” for the angle between two lines.
Given two lines l1 and l2 , pick for each any direction vectors u and v. Then define:
This “Definition” works well in the above Example, but only because the angle between u
and v happens to be acute.
Unfortunately and as the next example illustrates, this “Definition” doesn’t work so well if
the angle between the two chosen direction vectors is instead obtuse:
In this case, the angle between the two lines, α, is actually the supplement of the angle
between the chosen two direction vectors, β. That is:
α = π − β.
The following Definition of the non-obtuse angle between two vectors will prove
convenient:
Definition 114. Let α denote the non-obtuse angle between the vectors u and v. Let β
be the angle between u and v. Then:
\[ \alpha = \begin{cases} \beta & \text{if } \beta \text{ is not obtuse,} \\ \pi - \beta & \text{if } \beta \text{ is obtuse.} \end{cases} \]
Example 584. The angle between the vectors c and d is β, which is obtuse. And so,
the non-obtuse angle between them is α = π − β.
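[Figure: the vectors c and d, with the obtuse angle β between them and its supplement α; and the vectors e and f, with the acute angle γ between them.]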
The angle between the vectors e and f is γ, which is acute. And so, the non-obtuse angle
between them is also γ.
We are now ready to write down our formal Definition of the angle between two lines:
Example 585. Given the lines l1 and l2 , we pick the direction vectors u and v.
The angle between u and v is α, which is acute. And so, the non-obtuse angle between
u and v is also α. Thus, by Definition 115, the angle between the two lines is α.
[Figure: the lines l1 , l2 , l3 , and l4 with direction vectors u, v, w, and x, and the angles α, β, and γ.]
We just wrote down the Definition of the angle between two lines. We now work towards
Corollary 8, which will give us our “formula” for the angle between two lines.
Recall that by Definition 111, the angle between u and v is: $\cos^{-1} \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}| \, |\mathbf{v}|}$.
It turns out that we can get the non-obtuse angle between u and v simply by slapping ∣⋅∣
(the absolute value function) onto the numerator:
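Fact 74. The non-obtuse angle between two non-zero vectors u and v is:
\[ \cos^{-1} \frac{|\mathbf{u} \cdot \mathbf{v}|}{|\mathbf{u}| \, |\mathbf{v}|}. \]
Proof. Let θ be the angle between u and v. Suppose 0 ≤ θ ≤ π/2. Then u ⋅ v ≥ 0. And so, by Definition 114, the non-obtuse angle between u and v is:
\[ \cos^{-1} \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}| \, |\mathbf{v}|} = \cos^{-1} \frac{|\mathbf{u} \cdot \mathbf{v}|}{|\mathbf{u}| \, |\mathbf{v}|}. \quad ✓ \]
Suppose instead θ > π/2. Then u ⋅ v < 0. And so, by Definition 114 and the trigonometric identity π − cos−1 x = cos−1 (−x) (Fact 34), the non-obtuse angle between u and v is:
\[ \pi - \cos^{-1} \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}| \, |\mathbf{v}|} = \cos^{-1} \frac{-\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}| \, |\mathbf{v}|} = \cos^{-1} \frac{|\mathbf{u} \cdot \mathbf{v}|}{|\mathbf{u}| \, |\mathbf{v}|}. \quad ✓ \]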
Corollary 8. The angle between two lines with direction vectors u and v is:
\[ \cos^{-1} \frac{|\mathbf{u} \cdot \mathbf{v}|}{|\mathbf{u}| \, |\mathbf{v}|}. \]
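Corollary 8 can be turned into a short routine. This is a sketch under our own assumptions (the name `angle_between_lines` is hypothetical, and each line is represented only by a direction vector given as a tuple). The absolute value in the numerator is what guarantees a non-obtuse result, whichever direction vectors are picked.

```python
import math


def angle_between_lines(u, v):
    """Non-obtuse angle (in radians) between two lines with direction vectors u and v."""
    norm_u = math.hypot(u[0], u[1])
    norm_v = math.hypot(v[0], v[1])
    if norm_u == 0 or norm_v == 0:
        raise ValueError("direction vectors must be non-zero")
    ratio = abs(u[0] * v[0] + u[1] * v[1]) / (norm_u * norm_v)
    # Guard against rounding pushing the ratio marginally above 1.
    return math.acos(min(1.0, ratio))


print(round(angle_between_lines((-2, 3), (3, 1)), 3))  # 1.305
print(angle_between_lines((-1, 2), (6, 3)))            # pi/2: the lines are perpendicular
# Reversing one direction vector leaves the angle between the lines unchanged.
print(round(angle_between_lines((2, -3), (3, 1)), 3))  # 1.305
```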
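\[ = \cos^{-1} \frac{|5|}{\sqrt{5} \sqrt{10}} = \cos^{-1} \frac{1}{\sqrt{2}} = \frac{\pi}{4}. \]
[Figure: the lines l1 and l2 with direction vectors v1 = (−2, 3) and v2 = (3, 1); the angle between the two lines is approximately 1.305.]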
Corollary 9. Suppose θ is the angle between two lines. (a) If θ = 0, then the two lines
are parallel. And (b) if θ = π/2, then they are perpendicular.
[Figure: the lines l1 , l2 , l3 , and l4 , with direction vectors v1 = (3, 3), v3 = (−1, 2), and v4 = (6, 3); an angle of π/2 is marked.]
Definition 117. A line and a vector are (a) parallel if the line has a direction vector
that’s parallel to the given vector; and (b) perpendicular if the line has a direction vector
that’s perpendicular to the given vector.
Here are two “obvious” Facts you may recall from primary school:
Fact 76. If two lines (in 2D space) are distinct and non-parallel, then they must share
exactly one intersection point.
Example 589. The lines l1 and l2 are identical. And indeed, they are also parallel.
The lines l1 and l3 are distinct and parallel. And indeed they do not intersect.
The lines l1 and l4 are distinct and non-parallel. And indeed they share exactly one intersection point.
[Figure: the lines l1 = l2 , l3 , and l4 .]
Exercise 190. Find the angle between each given pair of lines. State if they are parallel
or perpendicular. (Answer on p. 1478.)
(a) r = (−1, 2) + λ (−1, 1) and r = (0, 0) + λ (2, −3) (λ ∈ R).
(b) r = (−1, 2) + λ (1, 5) and r = (0, 0) + λ (8, 1) (λ ∈ R).
(c) r = (−1, 2) + λ (2, 6) and r = (0, 0) + λ (3, 2) (λ ∈ R).
Remark 60. Fact 76 applies only to 2D space. As we’ll learn later, in 3D space, two lines
can be distinct, non-parallel, and yet do not intersect. (We call such lines skew lines.)
In contrast, Fact 75 applies more generally to higher dimensions, including in 3D space.
The particle’s position vector s is a function of time t. For brevity of notation, we will
often be lazy/sloppy and omit “(t)”. That is, we will often instead simply write:
At time t, the particle’s x- and y-coordinates are sx = cos t and sy = sin t, respectively. In
other words, at time t, the particle is cos t m east and sin t m north of the origin.
As time t progresses from 0 to 2π seconds, we trace out, anti-clockwise, the unit circle:
[Figure: the unit circle, traced anti-clockwise. Arrows indicate the instantaneous direction of travel. The direction of a and F is also shown.]
At t = 0: s = (sx , sy ) = (1 m, 0 m), v = (vx , vy ) = (0 m s−1 , 1 m s−1 ), v = 1 m s−1 .
At t = 1: s = (sx , sy ) ≈ (0.54 m, 0.84 m), v = (vx , vy ) ≈ (−0.84 m s−1 , 0.54 m s−1 ), v = 1 m s−1 .
At t = 5π/4: $\mathbf{s} = (s_x, s_y) = \left( -\frac{\sqrt{2}}{2} \text{ m}, -\frac{\sqrt{2}}{2} \text{ m} \right)$, $\mathbf{v} = (v_x, v_y) = \left( \frac{\sqrt{2}}{2} \text{ m s}^{-1}, -\frac{\sqrt{2}}{2} \text{ m s}^{-1} \right)$, v = 1 m s−1 .
So, at time t, the particle is travelling eastwards at vx = − sin t m s−1 (or equivalently,
westwards at sin t m s−1 ) and northwards at vy = cos t m s−1 .
The magnitude of the particle’s velocity vector is denoted v and is called its speed:
\[ v = |\mathbf{v}| = \sqrt{v_x^2 + v_y^2}. \]
Aha! So, interestingly, the particle travels at the constant speed of 1 m s−1 . That is, at
every instant in time t, it is moving 1 m s−1 in its direction of travel.
We now prove that the particle always moves in a direction tangent to the circle.
In other words, its direction of travel is always perpendicular to its position vector.
To do so, we need simply prove that v ⋅ s = 0 for all t:
218
Fact 29(a).
Similarly, the particle’s acceleration vector is defined as the first derivative of the
velocity vector (or, equivalently, the second derivative of the position vector):
So, at time t, the particle is accelerating eastwards at ax = − cos t m s−2 and northwards
at ay = − sin t m s−2 . Or equivalently, it is accelerating westwards at cos t m s−2 and southwards at sin t m s−2 . (Note that m s−2 is an abbreviation for metre per second per second.)
The magnitude of the particle’s acceleration vector is denoted a:
\[ a = |\mathbf{a}| = \sqrt{a_x^2 + a_y^2} = \sqrt{(-\cos t)^2 + (-\sin t)^2} = \sqrt{1} = 1. \]
Aha! So, interestingly, the particle accelerates at the constant rate of 1 m s−2 . That is, at
every instant in time t, it is accelerating 1 m s−2 in its direction of acceleration.
(Note that for velocity, we gave its magnitude the special name of speed. But in contrast,
the magnitude of acceleration has no special name. We simply call it the magnitude of
acceleration.)
Above we proved that the particle’s direction of movement is always tangent to the
circle. Here we can similarly prove that its direction of acceleration is always towards
the centre of the circle. (Or equivalently, the acceleration vector points in the exact opposite direction to the position vector.)
To prove this, we need simply observe that the acceleration vector a = (− cos t, − sin t)
and the position vector s = (cos t, sin t) point in exact opposite directions.219
Suppose the particle’s mass is m = 1 kg. Recall from physics Newton’s Second Law:220
\[ \underbrace{\mathbf{F}}_{\text{Vector}} = \underbrace{m}_{\text{Scalar}} \, \underbrace{\mathbf{a}}_{\text{Vector}}. \]
Note that mass is a scalar quantity. Hence, force, being a product of a scalar and a
vector, is itself a vector quantity.
The force vector points in the same direction as the acceleration vector (i.e. towards the
centre of the circle). Moreover, it has constant magnitude:
where N is for newton (the SI unit for force and which is equal to kg m s−2 ).
Physicists call such a force (which results in circular movement) a centripetal force.
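The claims in this example (constant speed, velocity perpendicular to position, acceleration pointing towards the centre) can be spot-checked numerically. The sketch below is an illustration only: the function names are ours, units are omitted, and checking a handful of values of t is of course not a proof.

```python
import math


def position(t):
    """Position vector s = (cos t, sin t)."""
    return (math.cos(t), math.sin(t))


def velocity(t):
    """Velocity vector v = (-sin t, cos t), the derivative of the position."""
    return (-math.sin(t), math.cos(t))


def acceleration(t):
    """Acceleration vector a = (-cos t, -sin t), the derivative of the velocity."""
    return (-math.cos(t), -math.sin(t))


def scalar_product(u, v):
    return u[0] * v[0] + u[1] * v[1]


for t in (0.0, 1.0, 5 * math.pi / 4):
    s, v, a = position(t), velocity(t), acceleration(t)
    speed = math.hypot(v[0], v[1])
    # speed should be 1, v . s should be 0, and a should have the components of -s
    print(round(t, 3), round(speed, 6), round(scalar_product(v, s), 6), a, s)
```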
219
Note though that it would be wrong to write a = −s. This is because acceleration is measured in m s−2 ,
while position is measured in m.
220
See e.g. Exercise 109.
40. The Projection and Rejection Vectors
Let a and b be vectors.
The projection of a on b, denoted projb a, is the vector that is:221
• Parallel to b; and
• Perpendicular to a − projb a (the vector depicted in blue).
[Figure: the vectors a and b, the angle θ between them, projb a, and a − projb a (in blue).]
First, since projb a ∥ b, by Definition 104, we must have projb a = k b̂ for some k ≠ 0.
Next, the length of projb a is ∣k∣. But what is k?
We observe that in the above figure, a right triangle is formed. This right triangle’s hypo-
tenuse corresponds to the vector a, while its base corresponds to the vector projb a. Let θ
be the angle between these two vectors.
Then by our right-triangle definition of cosine, we have:
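\[ \cos \theta = \frac{\text{“Adjacent”}}{\text{“Hypotenuse”}} = \frac{|\operatorname{proj}_{\mathbf{b}} \mathbf{a}|}{|\mathbf{a}|} = \frac{|k|}{|\mathbf{a}|}. \quad (1) \]
But by Definition 111, we also have:
\[ \cos \theta = \frac{\mathbf{a} \cdot \mathbf{b}}{|\mathbf{a}| \, |\mathbf{b}|} = \frac{\mathbf{a} \cdot \hat{\mathbf{b}}}{|\mathbf{a}|}. \quad (2) \]
Equating (1) and (2):
\[ \frac{|k|}{|\mathbf{a}|} = \frac{\mathbf{a} \cdot \hat{\mathbf{b}}}{|\mathbf{a}|} \quad \text{or} \quad |k| = \mathbf{a} \cdot \hat{\mathbf{b}}. \]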
The above discussion motivates the following definition of the projection vector:
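Definition 118. Let a and b be vectors. Then the projection of a on b, denoted projb a, is the following vector:
\[ \operatorname{proj}_{\mathbf{b}} \mathbf{a} = (\mathbf{a} \cdot \hat{\mathbf{b}}) \, \hat{\mathbf{b}} \quad \left( \text{or equivalently, } \operatorname{proj}_{\mathbf{b}} \mathbf{a} = \frac{\mathbf{a} \cdot \mathbf{b}}{|\mathbf{b}|^2} \, \mathbf{b} \right). \]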
For convenience, let’s call the “blue vector” the rejection vector and denote it rejb a:
Definition 119. Let a and b be non-zero vectors. Then the rejection of a on b, denoted
rejb a, is the following vector:
rejb a = a − projb a.
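Definitions 118 and 119 can be sketched in a few lines of code. The function names `projection` and `rejection` are our own, vectors are tuples, and we use the second form of the projection formula, (a ⋅ b / ∣b∣²) b, because it avoids computing a square root.

```python
def projection(a, b):
    """Projection of a on b: ((a . b) / |b|^2) b, for a non-zero vector b."""
    b_dot_b = b[0] * b[0] + b[1] * b[1]
    if b_dot_b == 0:
        raise ValueError("b must be a non-zero vector")
    k = (a[0] * b[0] + a[1] * b[1]) / b_dot_b
    return (k * b[0], k * b[1])


def rejection(a, b):
    """Rejection of a on b: a minus the projection of a on b."""
    p = projection(a, b)
    return (a[0] - p[0], a[1] - p[1])


a, b = (3, 2), (1, 1)
p, r = projection(a, b), rejection(a, b)
print(p)  # (2.5, 2.5)
print(r)  # (0.5, -0.5)
# The rejection is perpendicular to b: their scalar product is 0.
print(r[0] * b[0] + r[1] * b[1])
```

By construction the projection and the rejection add back up to a, which is a convenient check when working by hand.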
221
These two properties are to hold so long as projb a and a − projb a are non-zero.
The following result is “obvious” from our above two Definitions:
Here’s what the above result says geometrically. Let θ be the angle between a and b. Then:
(a) If θ is acute, then projb a points in the same direction as b.
(b) If θ is obtuse, then projb a points in the exact opposite direction as b.
(c) If θ is right (i.e. if a ⊥ b), then projb a = 0 and rejb a = a.
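[Figure: three panels. (a) θ is acute: projb a points in the same direction as b, and rejb a = a − projb a. (b) θ is obtuse: projb a points in the exact opposite direction to b. (c) θ = π/2: projb a = 0 and rejb a = a.]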
Above we already argued that the length of projb a must be ∣a ⋅ b̂∣. We now formally prove
that this is so:
∣projb a∣ = ∣a ⋅ b̂∣ .
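\[ |\operatorname{proj}_{\mathbf{v}} \mathbf{u}| = |\mathbf{u} \cdot \hat{\mathbf{v}}| = \left| \frac{(3, 2) \cdot (1, 1)}{|(1, 1)|} \right| = \frac{3 + 2}{\sqrt{2}} = \frac{5}{\sqrt{2}}. \]
[Figure: the vectors u = (3, 2) and v = (1, 1).]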
Now consider w = (−2, −2), which points in the exact opposite direction from v and has twice the length.
It turns out that the projection of u on w, projw u, is identical to projv u.
[Figure: the vector u = (3, 2) and projw u.]
As the above example suggests, if v and w are parallel, then the projections of any vector
u on v and w are identical. Formally:
projv u = projw u.
[Figure: the vectors a = (−6, 1) and b = (2, 0), and projb a.]
Now consider c = (3, 0), which points in the same direction as b, but is half again as long.
By Fact 79, the projection of a on c is the same as that of a on b. That is:
projc a = projb a.
[Figure: the vectors a = (−6, 1) and c = (3, 0), and projc a.]
projv u = projw u.
Hence, instead of computing ∣projv u∣, we can simply compute ∣projw u∣:
Exercise 191. Find the lengths of the projections of: (Answer on p. 1478.)
(a) (1, 0) on (33, 33); and (b) (33, 33) on (1, 0).
Definition 120. Two or more points are collinear if some line contains all of them.
“Obviously”, any two points must be collinear. Indeed, given any two points, there is a
unique line that contains both of them:
Fact 80. Suppose A and B are distinct points. Then the unique line that contains both A and B is described by:
\[ \mathbf{r} = \overrightarrow{OA} + \lambda \overrightarrow{AB} \quad (\lambda \in \mathbb{R}). \]
Proof. First, plug in λ = 0 and λ = 1 to verify that the given line contains A and B.
Next, this line is unique because any line that contains both A and B must have direction vector $\overrightarrow{AB}$ and must thus be described by:
\[ \mathbf{r} = \overrightarrow{OA} + \lambda \overrightarrow{AB} \quad (\lambda \in \mathbb{R}). \]
In contrast, three distinct points can be collinear but will not generally be:
[Figure: three collinear points A, B, and C; and three non-collinear points D, E, and F.]
Example 596. Let A = (1, 2), B = (4, 5), and C = (7, 8) be points.
To check if they are collinear:
1. First write down the unique line that contains both A and B:
\[ \mathbf{r} = \overrightarrow{OA} + \lambda \overrightarrow{AB} = (1, 2) + \lambda (3, 3) \quad (\lambda \in \mathbb{R}). \]
As you can verify, λ̂ = 2 solves the above vector equation (or system of two equations).
Hence, our line also contains C. Thus, A, B, and C are collinear.
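The check just described (write down the line through A and B, then see whether some λ puts C on it) can be sketched as below. The function name `is_collinear` and the tolerance parameter `tol` are assumptions of ours; the tolerance is there only because computers do arithmetic with rounding.

```python
def is_collinear(a, b, c, tol=1e-12):
    """Check whether C lies on the line r = OA + lam * AB through distinct points A and B."""
    ab = (b[0] - a[0], b[1] - a[1])
    ac = (c[0] - a[0], c[1] - a[1])
    if ab == (0, 0):
        raise ValueError("A and B must be distinct points")
    # Solve for lam using a non-zero coordinate of AB ...
    lam = ac[0] / ab[0] if ab[0] != 0 else ac[1] / ab[1]
    # ... then check that this same lam works for both coordinates.
    return abs(ac[0] - lam * ab[0]) <= tol and abs(ac[1] - lam * ab[1]) <= tol


print(is_collinear((1, 2), (4, 5), (7, 8)))  # True (lam = 2)
print(is_collinear((1, 0), (0, 1), (0, 0)))  # False (no single lam works)
```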
Example 597. Let D = (1, 0), E = (0, 1), and F = (0, 0) be points. To check if they are
collinear:
From (1), we have λ̂ = 1. But this contradicts (2). This contradiction means that there is no
Exercise 192. In each of the following, three points A, B, and C are given. Determine
if they are collinear.
Definition 121. Let u = (u1 , u2 ) and v = (v1 , v2 ) be vectors. Their vector product,
denoted u × v, is the number:
u × v = u1 v2 − u2 v1 .
Remark 61. The vector product is also called the cross product. But your A-Level
exams and syllabus do not seem to use this term and so neither shall we. We will stick
strictly to the term vector product.
Example 598. Let u = (5, −3), v = (2, 1), w = (−4, 0), and x = (8, 7). Then:
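\[ \mathbf{u} \times \mathbf{v} = \begin{pmatrix} 5 \\ -3 \end{pmatrix} \times \begin{pmatrix} 2 \\ 1 \end{pmatrix} = 5 \cdot 1 - (-3) \cdot 2 = 5 + 6 = 11, \]
\[ \mathbf{u} \times \mathbf{w} = \begin{pmatrix} 5 \\ -3 \end{pmatrix} \times \begin{pmatrix} -4 \\ 0 \end{pmatrix} = 5 \cdot 0 - (-3) \cdot (-4) = 0 - 12 = -12, \]
\[ \mathbf{u} \times \mathbf{x} = \begin{pmatrix} 5 \\ -3 \end{pmatrix} \times \begin{pmatrix} 8 \\ 7 \end{pmatrix} = 5 \cdot 7 - (-3) \cdot 8 = 35 + 24 = 59, \]
\[ \mathbf{v} \times \mathbf{w} = \begin{pmatrix} 2 \\ 1 \end{pmatrix} \times \begin{pmatrix} -4 \\ 0 \end{pmatrix} = 2 \cdot 0 - 1 \cdot (-4) = 0 + 4 = 4. \]
\[ \mathbf{u} \times (\mathbf{v} + \mathbf{w}) = \begin{pmatrix} 5 \\ -3 \end{pmatrix} \times \begin{pmatrix} 2 - 4 \\ 1 + 0 \end{pmatrix} = 5 \cdot 1 - (-3) \cdot (-2) = -1 = 11 + (-12) = \mathbf{u} \times \mathbf{v} + \mathbf{u} \times \mathbf{w}. \]
\[ (\mathbf{u} + \mathbf{v}) \times \mathbf{w} = \begin{pmatrix} 5 + 2 \\ -3 + 1 \end{pmatrix} \times \begin{pmatrix} -4 \\ 0 \end{pmatrix} = 7 \cdot 0 - (-2) \cdot (-4) = -8 = -12 + 4 = \mathbf{u} \times \mathbf{w} + \mathbf{v} \times \mathbf{w}. \]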
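The computations above are easy to reproduce in code. This is a minimal sketch of the two-dimensional vector product of Definition 121; the function name `vector_product` and the tuple representation are our own assumptions.

```python
def vector_product(u, v):
    """2D vector product u1*v2 - u2*v1 (Definition 121). The result is a number."""
    return u[0] * v[1] - u[1] * v[0]


u, v, w, x = (5, -3), (2, 1), (-4, 0), (8, 7)

print(vector_product(u, v))  # 11
print(vector_product(u, w))  # -12
print(vector_product(u, x))  # 59
print(vector_product(v, w))  # 4

# Swapping the two vectors flips the sign of the result.
print(vector_product(v, u))  # -11
```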
a × b = −b × a.
Example 600. Let u = (5, −3), v = (2, 1), and w = (−4, 0). We already showed that:
u × v = 11 and u × w = −12.
v × v = (2, 1) × (2, 1) = 2 ⋅ 1 − 1 ⋅ 2 = 2 − 2 = 0.
In summary:
a × b = a × (ca) = c (a × a) = c ⋅ 0 = 0.
Example 602. Let a = (1, 2) and b = (−2, −4). Since a ∥ b, by Corollary 10, a × b = 0.
The converse of Corollary 10 is also true, but is harder to prove:
Example 603. Let a = (3, −1) and b = (2, k), where k is some unknown constant. We
are now told that a × b = 0. What is k?
Well, by Fact 82, a × b = 0 implies that a ∥ b. That is, b is a multiple of a.
Hence, 2/3 = k/ (−1). And so, k = −2/3.
a×b=0 ⇐⇒ a ∥ b.
(ca) × b = c (a × b).
Exercise 193. Let a = (1, −2), b = (3, 0), and c = (4, 1). Compute a × b, a × c, b × c,
b × a, c × a, c × b, and a × (b + c). (Answer on p. 1480.)
Fact 84. Let θ be the angle between the vectors a and b. Then:
Exercise 194. Let θ be the angle between the vectors a = (a1 , a2 ) and b = (b1 , b2 ).
(a) Express ∣a∣, ∣b∣, ∣a × b∣, and cos θ in terms of a1 , a2 , b1 , and b2 . (You do not need to
expand the squared terms.)
(b) Since θ ∈ [0, π], what can you say about the sign of sin θ? (That is, is sin θ positive,
negative, non-positive, or non-negative?)
(c) Now use a trigonometric identity to express sin θ in terms of cos θ. (Hint: You should
find that there are two possibilities. Use what you found in (b) to explain why you
can discard one of these possibilities.)
(d) Plug the expression you wrote down for cos θ in (a) into what you found in (c).
(e) Prove223 that $(a_1^2 + a_2^2)(b_1^2 + b_2^2) - (a_1 b_1 + a_2 b_2)^2 = (a_1 b_2 - a_2 b_1)^2$. (Hint: Simply expand
Corollary 12. If θ ∈ [0, π/2] is the angle between the vectors u and v, then:
\[ \theta = \sin^{-1} \frac{|\mathbf{u} \times \mathbf{v}|}{|\mathbf{u}| \, |\mathbf{v}|}. \]
222
Fact 68.
223
By the way, this is simply an instance of Lagrange’s Identity.
42.2. The Length of the Rejection Vector
The vector product will mostly be useful only when we look at 3D space. Nonetheless, even
in 2D space, it has the following use:
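\[ |\operatorname{rej}_{\mathbf{b}} \mathbf{a}| = |\mathbf{a} \times \hat{\mathbf{b}}|. \]
[Figure: the vectors a and b, the angle θ between them, projb a, and rejb a = a − projb a.]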
As we’ll see next, Fact 85 will help us compute the distance between a point and the line.
224
For a proof that makes no mention of the sine function, see p. 1296 in the Appendices.
43. The Foot of the Perpendicular From a Point to a Line
Definition 122. Let A be a point that isn’t on the line l. The foot of the perpendicular from A to l is the point B on l such that $\overrightarrow{AB} \perp l$.
[Figure: the point A, the line l, and the foot B of the perpendicular from A to l.]
Example 604. Let A = (1, 2) be a point and l be the line described by:
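\[ \mathbf{r} = \overrightarrow{OP} + \lambda \mathbf{v} = (0, 1) + \lambda (9, 1) \quad (\lambda \in \mathbb{R}). \]
Compute $\overrightarrow{PA} = (1, 2) - (0, 1) = (1, 1)$ and:
\[ \operatorname{proj}_{\mathbf{v}} \overrightarrow{PA} = \operatorname{proj}_{(9, 1)} (1, 1) = \frac{(1, 1) \cdot (9, 1)}{9^2 + 1^2} \begin{pmatrix} 9 \\ 1 \end{pmatrix} = \frac{9 + 1}{82} (9, 1) = \frac{5}{41} (9, 1). \]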
Exercise 195. Find the feet of the perpendiculars from the points A = (−1, 0) and
Ð→
B = (3, 2) to the line described by r = OP + λv = (2, −3) + λ (5, 1) (λ ∈ R). (Answer on p.
1481.)
225
Hence justifying the use of the definite article in Definition 122.
Exercise 196. This Exercise guides you through a proof of Fact 86. Let l be the line described by $\mathbf{r} = \overrightarrow{OP} + \lambda \mathbf{v}$ (λ ∈ R), A be a point that isn’t on l, and $B = P + \operatorname{proj}_{\mathbf{v}} \overrightarrow{PA}$.
To prove that B is a foot of the perpendicular from A to l, we must prove that (a) B is on l; and (b) $\overrightarrow{AB} \perp l$:
(a) To prove that B is on l, follow these two steps:
(i) Explain why $\operatorname{proj}_{\mathbf{v}} \overrightarrow{PA}$ can be written as a scalar multiple of v. That is, explain why $\operatorname{proj}_{\mathbf{v}} \overrightarrow{PA} = \lambda \mathbf{v}$ for some λ ∈ R.
(ii) Hence explain why B satisfies l’s vector equation and is thus on l.
(b) Next, to prove that $\overrightarrow{AB} \perp l$, follow these steps:
(i) Show that $\overrightarrow{AB} = -\operatorname{rej}_{\mathbf{v}} \overrightarrow{PA}$. (Hint: Definition 119.)
(ii) Explain why $\operatorname{rej}_{\mathbf{v}} \overrightarrow{PA} \perp \mathbf{v}$.
(iii) Hence explain why $\overrightarrow{AB} \perp l$.
(c) To prove that B is the unique foot of the perpendicular from A to l, let C ≠ B be a point on l. We will show that $\overrightarrow{AC} \not\perp l$ and hence that C cannot also be a foot of the perpendicular from A to l:
(i) First explain why $\overrightarrow{AB} \cdot \overrightarrow{BC} = 0$.
(ii) Now prove that $\overrightarrow{AC} \not\perp \overrightarrow{BC}$ and hence that $\overrightarrow{AC} \not\perp l$. (Answer on p. 1481.)
Definition 123. Let A be a point and l be a line. Suppose B is the point on l that’s closest to A. Then the distance between A and l is $|\overrightarrow{AB}|$.
Remark 62. In the trivial case where A is on l, the point on l that’s closest to A is A itself. And so, the distance between A and l is $|\overrightarrow{AA}| = |\mathbf{0}| = 0$ (as we’d expect).
Fact 87. If B is the foot of the perpendicular from a point A to a line l, then B is also
the point on l that’s closest to A.
Proof. Let C ≠ B be any other point on l. Observe that ABC forms a right triangle with hypotenuse AC.
By the Pythagorean Theorem, the leg AB must be shorter than the hypotenuse AC. That is, ∣AB∣ < ∣AC∣. We’ve just shown that B is closer to A than any other point on l. Thus, B is the point on l that’s closest to A.
[Figure: the point A, the line l, the foot B of the perpendicular, and another point C on l.]
From the above Definition and Fact, the following Corollary is immediate:
Corollary 13. Suppose l is a line, A is a point, and B is the foot of the perpendicular from A to l. Then the distance between A and l is $|\overrightarrow{AB}|$.
Example 605. As in Example 604, let A = (1, 2) be a point and l be the line described by $\mathbf{r} = \overrightarrow{OP} + \lambda \mathbf{v} = (0, 1) + \lambda (9, 1)$ (λ ∈ R). Previously, using Method 1 (Formula Method), we already found that the foot of the perpendicular from A to l is $B = \frac{1}{41} (45, 46)$.
Now:
\[ \overrightarrow{AB} = B - A = \frac{1}{41} (45, 46) - (1, 2) = \frac{1}{41} (4, -36) = \frac{4}{41} (1, -9). \]
And so, by Corollary 13, the distance between A and l is $|\overrightarrow{AB}| = \frac{4}{41} \sqrt{82}$.
We will next introduce two more methods for finding B and hence the distance between
A and l. In each method, we begin by letting B = (0, 1) + λ̃ (9, 1) be the foot of the
perpendicular from A to l, where λ̃ is some unknown to be found.
(Example continues on the next page ...)
Since B is the foot of the perpendicular, we have $\overrightarrow{AB} \perp l$ or $\overrightarrow{AB} \perp \mathbf{v}$ or $\overrightarrow{AB} \cdot (9, 1) = 0$:
\[ 0 = \overrightarrow{AB} \cdot (9, 1) = (9 \tilde{\lambda} - 1, \tilde{\lambda} - 1) \cdot (9, 1) = 9 (9 \tilde{\lambda} - 1) + (\tilde{\lambda} - 1) = 82 \tilde{\lambda} - 10. \]
Rearranging, λ̃ = 10/82 = 5/41 and so $B = (0, 1) + \tilde{\lambda} (9, 1) = (0, 1) + \frac{5}{41} (9, 1) = \frac{1}{41} (45, 46)$.
Lovely: this is the same as what we found in Method 1. And now, if we’d like, we can calculate $|\overrightarrow{AB}|$ (the distance between A and l) as we did before.
Method 3 (Calculus Method). Let R be a generic point on l. Then $\overrightarrow{AR} = (9 \lambda - 1, \lambda - 1)$ and the distance between A and R is:
\[ |\overrightarrow{AR}| = \sqrt{(9 \lambda - 1)^2 + (\lambda - 1)^2} = \sqrt{82 \lambda^2 - 20 \lambda + 2}. \]
Since B is the point on l that’s closest to A, the value of λ that minimises $|\overrightarrow{AR}|$ must be λ̃. Our goal then is to determine when $|\overrightarrow{AR}| = \sqrt{82 \lambda^2 - 20 \lambda + 2}$ is minimised or, equivalently, when $82 \lambda^2 - 20 \lambda + 2$ is minimised.
To do so, we’ll use calculus. We’ll learn more about how this calculus thing works in Part V, but for now we’ll rely on what you may (or may not) remember from secondary school. First differentiate $82 \lambda^2 - 20 \lambda + 2$ with respect to λ:
\[ \frac{d}{d\lambda} (82 \lambda^2 - 20 \lambda + 2) = 164 \lambda - 20. \]
And so, by the First Order Condition (FOC), we have:
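\[ (164 \lambda - 20) \big|_{\lambda = \tilde{\lambda}} = 0 \quad \text{or} \quad 164 \tilde{\lambda} - 20 = 0 \quad \text{or} \quad \tilde{\lambda} = \frac{20}{164} = \frac{5}{41}. \]
Lovely: this is the same as what we found in Method 2. And now, if we’d like, we can find B and $|\overrightarrow{AB}|$ as we did before.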
By the way, instead of explicitly using calculus, an alternative is to use what we learnt
earlier in Part I. Recall226 that quadratic expressions are minimised at “−b/2a”. Hence:
\[ \tilde{\lambda} = \text{“} -\frac{b}{2a} \text{”} = -\frac{-20}{2 \cdot 82} = \frac{5}{41}. \]
This alternative is probably quicker (provided of course you can recall the “−b/2a” thing).
226
Fact 20 (whose proof, by the way, actually uses calculus).
The following is a “formula” for the distance between a point and a line.
Corollary 14. Suppose A is a point, l is the line described by $\mathbf{r} = \overrightarrow{OP} + \lambda \mathbf{v}$ (λ ∈ R), and d is the distance between A and l. Then:
\[ d = |\overrightarrow{PA} \times \hat{\mathbf{v}}|. \]
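Example 606. We continue with the point A = (1, 2) and the line l described by $\mathbf{r} = \overrightarrow{OP} + \lambda \mathbf{v} = (0, 1) + \lambda (9, 1)$ (λ ∈ R).
By Corollary 14, the distance between A and l is:
\[ |\overrightarrow{PA} \times \hat{\mathbf{v}}| = \left| (1, 1) \times \frac{(9, 1)}{\sqrt{9^2 + 1^2}} \right| = \left| \frac{1 \cdot 1 - 1 \cdot 9}{\sqrt{82}} \right| = \frac{8}{\sqrt{82}} = 4 \sqrt{\frac{2}{41}}. \]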
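The Formula Method for the foot of the perpendicular ($B = P + \operatorname{proj}_{\mathbf{v}} \overrightarrow{PA}$) and the distance formula of Corollary 14 can be sketched together. The function names below are hypothetical, and points and vectors are plain tuples; this is an illustration of the two formulas, not a replacement for the working shown in the examples.

```python
import math


def foot_of_perpendicular(a, p, v):
    """Foot of the perpendicular from the point a to the line r = p + lam*v.

    Uses B = P + proj_v(PA), where proj_v(PA) = ((PA . v) / |v|^2) v.
    """
    pa = (a[0] - p[0], a[1] - p[1])
    v_dot_v = v[0] * v[0] + v[1] * v[1]
    if v_dot_v == 0:
        raise ValueError("the direction vector must be non-zero")
    k = (pa[0] * v[0] + pa[1] * v[1]) / v_dot_v
    return (p[0] + k * v[0], p[1] + k * v[1])


def distance_point_to_line(a, p, v):
    """Distance between the point a and the line r = p + lam*v: |PA x v| / |v|."""
    pa = (a[0] - p[0], a[1] - p[1])
    norm_v = math.hypot(v[0], v[1])
    if norm_v == 0:
        raise ValueError("the direction vector must be non-zero")
    return abs(pa[0] * v[1] - pa[1] * v[0]) / norm_v


# A = (1, 2) and the line r = (0, 1) + lam*(9, 1)
b = foot_of_perpendicular((1, 2), (0, 1), (9, 1))
d = distance_point_to_line((1, 2), (0, 1), (9, 1))
print(b)  # approximately (45/41, 46/41)
print(d)  # approximately 4*sqrt(82)/41
```

As a cross-check, the distance returned by the second function should equal the length of the vector from A to the foot B returned by the first.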
(a) Prove Corollary 14 in the trivial case where the point A is on the line l.
In the rest of this exercise (or proof), we will suppose that A is not on l. Let B be the
foot of the perpendicular from A to l.
(b) What is the relationship between d and $\overrightarrow{AB}$?
(c) Express $\overrightarrow{AB}$ in terms of $\operatorname{rej}_{\mathbf{v}} \overrightarrow{PA}$. (Hint: Refer to the figure on p. 481.)
(d) Now use a result from the previous chapter to complete the proof of Corollary 14.
And now a brand new example where we illustrate all three methods:
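Example 607. Let A = (−1, 0) be a point, l be the line described by $r = \overrightarrow{OP} + \lambda v = (3, 2) + \lambda(5, 1)$ ($\lambda \in \mathbb{R}$), and B be the foot of the perpendicular from A to l.
Method 1 (Formula Method). First compute $\overrightarrow{PA} = (-1, 0) - (3, 2) = (-4, -2)$ and:
$$\mathrm{proj}_v \overrightarrow{PA} = \frac{(-4, -2)\cdot(5, 1)}{5^2 + 1^2}(5, 1) = \frac{-22}{26}(5, 1) = \frac{-11}{13}(5, 1).$$
So: $$B = P + \mathrm{proj}_v \overrightarrow{PA} = (3, 2) + \frac{-11}{13}(5, 1) = \frac{1}{13}(-16, 15).$$
And: $$\left|\overrightarrow{AB}\right| = \left|\overrightarrow{PA} \times \hat{v}\right| = \left|(-4, -2) \times \frac{(5, 1)}{\sqrt{5^2 + 1^2}}\right| = \left|\frac{-4 - (-10)}{\sqrt{26}}\right| = \frac{6}{\sqrt{26}} = \frac{3\sqrt{26}}{13}.$$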
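Since $\overrightarrow{AB} \perp l$, we have $\overrightarrow{AB} \perp v$ or:
$$0 = \overrightarrow{AB}\cdot(5, 1) = (5\tilde\lambda + 4, \tilde\lambda + 2)\cdot(5, 1) = 5(5\tilde\lambda + 4) + (\tilde\lambda + 2) = 26\tilde\lambda + 22.$$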
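Differentiate: $$\frac{d}{d\lambda}\left(26\lambda^2 + 44\lambda + 20\right) = 52\lambda + 44.$$
FOC: $$\left.(52\lambda + 44)\right|_{\lambda = \tilde\lambda} = 0 \quad\text{or}\quad \tilde\lambda = -\frac{44}{52} = -\frac{11}{13}.$$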
Alternatively, we could simply have used “−b/2a”: $\tilde\lambda = -\frac{44}{2\cdot 26} = -\frac{44}{52} = -\frac{11}{13}$.
And now, we can find B and $|\overrightarrow{AB}|$ as we did in Method 2.
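To double-check the numbers in this example, here is a short Python sketch using exact fractions. It is only an illustration under the assumption that points and vectors are stored as tuples of integers; the variable names are ours. It finds the parameter value via (PA ⋅ v)/∣v∣², then B, and then the squared distance ∣AB∣².

```python
from fractions import Fraction

A, P, v = (-1, 0), (3, 2), (5, 1)

PA = (A[0] - P[0], A[1] - P[1])                      # (-4, -2)
lam = Fraction(PA[0] * v[0] + PA[1] * v[1],          # (PA . v) / |v|^2
               v[0] ** 2 + v[1] ** 2)
B = (P[0] + lam * v[0], P[1] + lam * v[1])           # foot of the perpendicular
AB = (B[0] - A[0], B[1] - A[1])
dist_sq = AB[0] ** 2 + AB[1] ** 2                    # |AB|^2, kept exact

print(lam)       # -11/13
print(B)         # (Fraction(-16, 13), Fraction(15, 13))
print(dist_sq)   # 18/13, which is (3*sqrt(26)/13)^2
```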
Exercise 198. In each of the following, a point A and line l are given. Let B be the foot
of the perpendicular from A to l. Find B and also the distance between A and l.
The point A The line l Answer on p.
(a) (7, 3) r = (8, 3) + λ (9, 3) 1482.
(b) (8, 0) Contains the points (4, 4) and (6, 11) 1483.
(c) (8, 5) r = (8, 4) + λ (5, 6) 1484.
227
I’ve examined dozens of 3D graphing software and all things considered (user-friendliness, accessibility,
features, etc.), this is the best 3D graphing web app I’ve found so far. Please let me know if you know
of any other better software/app. (I was gonna use GeoGebra, but it had too many critical flaws.)
228
Note that you’ll be routed through TinyURL.com first. The reason is that the CalcPlot3D links are
often thousands of characters long and were confusing my computer.
44. Three-Dimensional (3D) Space
In 2D space, we had ordered pairs. In 3D space, we’ll instead have ordered triples:229
Definition 124. Given an ordered triple (a, b, c), we call a its first or x-coordinate, b its
second or y-coordinate, and c its third or z-coordinate.
Example 608. The ordered triple (Cow, Chicken, Dog) has x-coordinate Cow, y-
coordinate Chicken, and z-coordinate Dog. As with ordered pairs, the order of the
coordinates matters. So for example:
Example 609. The ordered triple (2, 5, −π) has x-coordinate 2, y-coordinate 5, and
z-coordinate −π. Again, order matters, so that for example:
In 2D space, a point was simply any ordered pair of real numbers. Now in 3D space:
Example 610. The ordered triple (Cow, Chicken, Dog) is not a point because at least
one of its coordinates is not a real number. (Indeed, all three aren’t.)
Example 611. The ordered triple (2, 5, −π) is a point because all three of its coordinates
are real numbers.
229
For the formal definition of an ordered triple (and n-tuple), see Definition 218 (Appendices).
In 2D space (the cartesian plane), we could depict points (ordered pairs of real numbers) by drawing on a piece of paper. The x-axis went right and the y-axis up.
[Figure: the point A = (2, 1) in 2D space, and the generic point A = (a1, a2, a3) in 3D space.]
We say that this coordinate system follows the right-hand rule. To see why, have the
palm of your right hand face you. Fold your ring and pinky fingers. Have your thumb point
right, your index finger up, and your middle finger towards your face. Then these three
fingers correspond to the x-, y-, and z-axes. (Try it!)
(If instead the z-axis “goes into the paper away from your face”, then our coordinate system
would instead follow the left-hand rule. Can you explain why?)
In 2D space, the origin was the point O = (0, 0) (Definition 35) and was where the x- and
y-axes intersected. And the generic point A = (a1 , a2 ) was a1 units to the right and a2 units
above the origin.
Analogously, in 3D space:
In 3D space, the origin is where the x-, y-, and z-axes intersect. And relative to the origin,
the generic point A = (a1 , a2 , a3 ) is a1 units right, a2 units up, and a3 units “out (towards
your face)”.
Definition 127. The graph of an equation (or system of equations) is the set of points
(x, y, z) for which the equation (or system of equations) is true.
Shortly, we’ll learn about the equations (and systems of equations) used to describe planes
(and lines). For now, here are two quick examples:
q = {(x, y, z) ∶ x ∈ R, y ∈ R, z ∈ R, x + y + z = 1}.
In words, q is the set of ordered triples (x, y, z) such that x, y, and z are real numbers
that satisfy x + y + z = 1.
As with ordered pairs, we will generally be looking only at ordered triples of real numbers,
i.e. points. And so, we shall be a little lazy/sloppy and not bother mentioning that x, y,
and z are real numbers. That is, we’ll usually more simply write:
q = {(x, y, z) ∶ x + y + z = 1} .
x=y and y = z.
It turns out that this system of (two) equations describes a line l in 3D space. (We’ll
learn more about this in Ch. 48.) The line l contains exactly those points that can be
written as (λ, λ, λ), for some real number λ.
So, for example, it contains the points (1, 1, 1), O = (0, 0, 0), and (−1, −1, −1).
[Figure: the line l, passing through the point (1, 1, 1).]
l = {(x, y, z) ∶ x = y = z}.
In words, l is the set of ordered triples (x, y, z) such that x, y, and z are real numbers
that satisfy x = y = z.
As noted above, we can also write:
l = {(λ, λ, λ) ∶ λ ∈ R} = {λ (1, 1, 1) ∶ λ ∈ R} .
In words, l is the set of points that can be written as (λ, λ, λ) or λ (1, 1, 1) for some real
number λ.
Definition 128. Given the points A = (a1, a2, a3) and B = (b1, b2, b3), the vector from A
to B is $\overrightarrow{AB} = (b_1 - a_1, b_2 - a_2, b_3 - a_3)$.
Example 614. The vector from the point A = (1, 5, 0) to the point B = (−2, 6, 3) is:
$$\overrightarrow{AB} = (-3, 1, 3) = \begin{pmatrix} -3 \\ 1 \\ 3 \end{pmatrix} = u.$$
Observe that there are, again, at least four ways to denote a single vector.
[Figure: the vector $\overrightarrow{AB} = (-3, 1, 3)$ from A = (1, 5, 0) to B = (−2, 6, 3).]
We may again contrast vectors with scalars: vectors are two-dimensional objects, while
scalars are one-dimensional.
Definition 93. Given a point A, its position vector is the vector $\overrightarrow{OA}$.
And so, the point A = (a1, a2, a3) has position vector $\overrightarrow{OA} = a = (a_1, a_2, a_3)$.
Example 615. The point A = (1, 5, 0) has position vector $\overrightarrow{OA} = a = (1, 5, 0)$.
Once again, do not confuse a point (a zero-dimensional object) with a vector (a two-dimensional object).
Definition 94. The zero vector, denoted 0, is the origin’s position vector.
And so, the zero vector (in 3D space) is $0 = \overrightarrow{OO} = (0, 0, 0)$.
Definition 95. Suppose a moving object starts at point A and ends at point B. Then
we call $\overrightarrow{AB}$ its displacement vector.
And so, if a moving object starts at A = (a1, a2, a3) and ends at B = (b1, b2, b3), then its
displacement vector is $\overrightarrow{AB} = (b_1 - a_1, b_2 - a_2, b_3 - a_3)$.
Definition 129. Given the vector u = (u1, u2, u3), its magnitude or length, denoted ∣u∣,
is the number:
$$|u| = \sqrt{u_1^2 + u_2^2 + u_3^2}.$$
Example 616. If u = (1, 2, 3), then the length of u is $|u| = \sqrt{1^2 + 2^2 + 3^2} = \sqrt{14}$.
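The length formula translates directly into code. Here is a minimal Python sketch, assuming a vector is stored as a tuple of numbers; the function name `length` is ours.

```python
import math

def length(u):
    """Length of a vector: the square root of the sum of its squared coordinates."""
    return math.sqrt(sum(c * c for c in u))

print(length((1, 2, 3)))   # sqrt(14), roughly 3.742
print(length((0, 0, 0)))   # 0.0, the zero vector has zero length
```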
Now, observe that the line segment OA is the hypotenuse of the right triangle OBA.
Moreover, ∣BA∣ = a1 . And so, again by the Pythagorean Theorem:
$$|OA| = \sqrt{|BA|^2 + |OB|^2} = \sqrt{a_1^2 + \left(\sqrt{a_2^2 + a_3^2}\right)^2} = \sqrt{a_1^2 + a_2^2 + a_3^2}.$$
This completes our explanation of why the above Definition makes sense.
As before, the length of every vector must be non-negative. Moreover, a vector has zero
length if and only if it is the zero vector:
Exercise 200. Let A = (2, 5, 8) and B = (0, 1, 1) be points. What is the length of the
vector from A to B? (Answer on p. 1485.)
Definition 96. Given two points A and B, the difference B − A is the vector $\overrightarrow{AB}$.
And so, given the points A = (a1, a2, a3) and B = (b1, b2, b3), their difference is:
$$B - A = \overrightarrow{AB} = (b_1 - a_1, b_2 - a_2, b_3 - a_3).$$
Example 617. Given the points A = (1, 5, 0) and B = (−2, 6, 3), their difference is the
vector $B - A = \overrightarrow{AB} = (-2 - 1, 6 - 5, 3 - 0) = (-3, 1, 3)$.
[Figure: the difference $B - A = \overrightarrow{AB} = (-3, 1, 3)$ between the points A = (1, 5, 0) and B = (−2, 6, 3).]
A + v = (a1 + v1 , a2 + v2 , a3 + v3 ).
Example 618. Given the point A = (1, 5, 0) and the vector v = (−3, 1, 3), their sum is
the point A + v = (1 − 3, 5 + 1, 0 + 3) = (−2, 6, 3).
[Figure: the sum A + v = (−2, 6, 3) of the point A = (1, 5, 0) and the vector v = (−3, 1, 3).]
B − v = (b1 − v1 , b2 − v2 , b3 − v3 ) .
Example 619. Given the point B = (−2, 6, 3) and the vector v = (−3, 1, 3), their difference
is the point B − v = (−2 − (−3) , 6 − 1, 3 − 3) = (1, 5, 0).
[Figure: the difference B − v = (1, 5, 0) between the point B = (−2, 6, 3) and the vector v = (−3, 1, 3).]
Exercise 201. Let A = (1, 2, 3), B = (−1, 0, 7), and C = (5, −2, 3) be points. What are
(a) A + B; (b) A − B; (c) A + (B + C); and (d) A + (B − C)? (Answer on p. 1485.)
Definition 132. Let u = (u1 , u2 , u3 ) and v = (v1 , v2 , v3 ) be vectors. Then their sum,
denoted u + v, is the vector u + v = (u1 + v1 , u2 + v2 , u3 + v3 ).
Example 620. Given the vectors u = (1, 2, 3) and v = (−1, 0, 1), their sum is the vector
u + v = (1 − 1, 2 + 0, 3 + 1) = (0, 2, 4).
[Figure: the vectors u = (1, 2, 3) and v = (−1, 0, 1).]
Example 621. The additive inverse of u = (1, 2, 3) is the vector −u = (−1, −2, −3).
[Figure: the vector u = (1, 2, 3).]
u − v = u + (−v).
u − v = (u1 − v1 , u2 − v2 , u3 − v3 ) .
u − v = u + (−v) = (u1 − v1 , u2 − v2 , u3 − v3 ) .
Example 622. Given the vectors u = (1, 2, 3) and v = (−1, 0, 1), their difference is the
vector u − v = (1 − (−1) , 2 − 0, 3 − 1) = (2, 2, 2).
[Figure: the vectors u = (1, 2, 3) and v = (−1, 0, 1), and their difference u − v = (2, 2, 2).]
Exercise 202. Let u = (1, 2, 3), v = (−1, 0, 7), and w = (5, −2, 3) be vectors. What are
(a) u + v; (b) u − v; (c) u + (v + w); and (d) u + (v − w)? (Answer on p. 1485.)
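[Figure: the points A = (1, 5, 0) and B = (−2, 6, 3), their position vectors $\overrightarrow{OA} = (1, 5, 0)$ and $\overrightarrow{OB} = (-2, 6, 3)$, and $\overrightarrow{OB} - \overrightarrow{OA} = \overrightarrow{AB} = (-3, 1, 3)$.]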
Example 624. Let A = (1, 5, 0), B = (−2, 6, 3), and C = (4, −2, 1) be points.
Then $\overrightarrow{AB} = B - A = (-2, 6, 3) - (1, 5, 0) = (-3, 1, 3)$, $\overrightarrow{AC} = C - A = (4, -2, 1) - (1, 5, 0) = (3, -7, 1)$, and $\overrightarrow{BC} = C - B = (4, -2, 1) - (-2, 6, 3) = (6, -8, -2)$.
And indeed, $\overrightarrow{AB} - \overrightarrow{AC} = (-3, 1, 3) - (3, -7, 1) = (-6, 8, 2) = -\overrightarrow{BC} = \overrightarrow{CB}$. ✓
Also, $\overrightarrow{AB} + \overrightarrow{BC} = (-3, 1, 3) + (6, -8, -2) = (3, -7, 1) = \overrightarrow{AC}$. ✓
[Figure: the points A = (1, 5, 0), B = (−2, 6, 3), and C = (4, −2, 1), with the vectors $\overrightarrow{AB} = (-3, 1, 3)$, $\overrightarrow{BC} = (6, -8, -2)$, and $\overrightarrow{AC} = (3, -7, 1)$.]
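The arithmetic in Example 624 can be replayed in a few lines of Python. This is only a sketch, assuming points and vectors are both stored as 3-tuples; the helper names `add` and `sub` are ours.

```python
def add(p, q):
    """Coordinate-wise sum of two tuples."""
    return tuple(pi + qi for pi, qi in zip(p, q))

def sub(p, q):
    """Coordinate-wise difference p - q of two tuples."""
    return tuple(pi - qi for pi, qi in zip(p, q))

A, B, C = (1, 5, 0), (-2, 6, 3), (4, -2, 1)
AB, AC, BC, CB = sub(B, A), sub(C, A), sub(C, B), sub(B, C)

print(AB, AC, BC)            # (-3, 1, 3) (3, -7, 1) (6, -8, -2)
print(sub(AB, AC) == CB)     # True
print(add(AB, BC) == AC)     # True
```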
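Exercise 203. Let A = (5, −1, 0), B = (3, 6, −5), and C = (2, 2, 3) be points. Find $\overrightarrow{AB}$, $\overrightarrow{AC}$, and $\overrightarrow{BC}$; and show that $\overrightarrow{AB} - \overrightarrow{AC} = \overrightarrow{CB}$ and $\overrightarrow{AB} + \overrightarrow{BC} = \overrightarrow{AC}$. (Answer on p. 1485.)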
Example 625. Let v = (−1, 0, 1) be a vector. Then 2v = (−2, 0, 2) and −3v = (3, 0, −3).
[Figure: the vectors v = (−1, 0, 1) and 2v = (−2, 0, 2), where $|v| = \sqrt{(-1)^2 + 0^2 + 1^2} = \sqrt{2}$.]
Now:
Definition 104. Two non-zero vectors u and v are parallel if u = kv for some k and
non-parallel otherwise.
[Figure: the vectors u = (1, 0, 1), v = (3, 0, 3), w = (−2, 0, −2), and x = (5, 1, 0).]
u ∥ v, w and u ∦ x.
Remark 63. Again, note the special case of the zero vector 0 = (0, 0, 0). It does not point
in the same, exact opposite, or different direction as any other vector. Also, it is neither
parallel nor non-parallel to any other vector.
Exercise 204. Continue to let u = (1, 0, 1), v = (3, 0, 3), w = (−2, 0, −2), and x = (5, 1, 0).
State if each of the following pairs of vectors point in the same, exact opposite, or different
directions; and also if they are parallel. (Answer on p. 1485.)
Definition 106. Given a non-zero vector v, its unit vector (or the unit vector in its
direction) is:
$$\hat{v} = \frac{1}{|v|}v.$$
Fact 58. Given any non-zero vector, its unit vector has length 1.
Fact 59. Let v be a vector with unit vector v̂. If c ∈ R, then the vector cv̂ has length ∣c∣.
Exercise 205. Find the length and unit vector of each vector. (Answer on p. 1486.)
Analogously, in 3D space, the (three) standard basis vectors are the three unit vectors
that point in the directions of the positive x-, y-, and z-axes:
[Figure: the standard basis vectors i = (1, 0, 0), j = (0, 1, 0), and k = (0, 0, 1).]
Not surprisingly, every vector can be written as the linear combination of i, j, and k:231
Exercise 206. Write each of the vectors v = (9, 0, −1) and w = (−7, 3, 5) as a linear
combination of the standard basis vectors. (Answer on p. 1486.)
230
Ch. 34.14.
231
You may also recall (Fact 61) that in 2D space, every vector can be written as the linear combination
of two non-parallel vectors. It turns out that there is an analogous result in 3D space.
For this result, we first define what it means for three (or more) vectors to be linearly independent:
Definition 136. Three (or more) non-zero vectors are linearly independent if the first vector cannot
be written as a linear combination of the other two vectors.
We then have the following Fact (proof omitted). This Fact is definitely out of the H2 Maths syllabus
and isn’t something you need worry about.
Fact 89. Every vector can be written as the linear combination of three linearly independent vectors.
Theorem 7. Let A and B be points with position vectors a and b. Let P be the point
that divides the line segment AB in the ratio λ ∶ µ. Then P’s position vector is:
$$p = \frac{\mu a + \lambda b}{\lambda + \mu}.$$
[Figure: the point P on the line segment AB, with position vector $p = \frac{\mu a + \lambda b}{\lambda + \mu}$.]
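Here is a small Python sketch of the ratio formula, using exact fractions. The function name `divide_segment` and the sample points are our own illustration (they are not taken from the exercises), and the sketch assumes integer coordinates and integer ratio terms.

```python
from fractions import Fraction

def divide_segment(a, b, lam, mu):
    """Position vector of the point dividing AB in the ratio lam : mu,
    computed as (mu*a + lam*b) / (lam + mu)."""
    return tuple(Fraction(mu * ai + lam * bi, lam + mu) for ai, bi in zip(a, b))

# Divide the segment from (0, 0, 0) to (5, 10, 15) in the ratio 2 : 3.
p = divide_segment((0, 0, 0), (5, 10, 15), 2, 3)
print(p == (2, 4, 6))   # True: P is two-fifths of the way from A to B
```

With the ratio 1 ∶ 1, the same function returns the midpoint of AB.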
Exercise 207. Let A = (1, 2, 3) and B = (4, 5, 6) be points. Find the point that divides
the line segment AB in the ratio 2 ∶ 3. (Answer on p. 1486.)
u ⋅ v = u1 v1 + u2 v2 .
Definition 137. Let u = (u1 , u2 , u3 ) and v = (v1 , v2 , v3 ) be vectors. Their scalar product
u ⋅ v is the number:
u ⋅ v = u1 v1 + u2 v2 + u3 v3 .
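Example 628. Let u = (5, −3, 1), v = (2, 1, −2), and w = (0, −4, 3). Then:
$$u\cdot v = \begin{pmatrix} 5 \\ -3 \\ 1 \end{pmatrix}\cdot\begin{pmatrix} 2 \\ 1 \\ -2 \end{pmatrix} = 10 - 3 - 2 = 5.$$
$$u\cdot w = \begin{pmatrix} 5 \\ -3 \\ 1 \end{pmatrix}\cdot\begin{pmatrix} 0 \\ -4 \\ 3 \end{pmatrix} = 0 + 12 + 3 = 15.$$
$$v\cdot w = \begin{pmatrix} 2 \\ 1 \\ -2 \end{pmatrix}\cdot\begin{pmatrix} 0 \\ -4 \\ 3 \end{pmatrix} = 0 - 4 - 6 = -10.$$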
Recall232 that in 2D space, the scalar product was both commutative and distributive
over addition. The same remains true of the scalar product in 3D space:
232
Fact 65.
233
Our proof here covers only the 3D case. For a more general proof, see p. 1289 (Appendices).
Example 629. Continue to let u = (5, −3, 1), v = (2, 1, −2), and w = (0, −4, 3).
To illustrate commutativity, we can easily verify that:
v ⋅ u = u ⋅ v = 5 and w ⋅ u = u ⋅ w = 15.
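$$u\cdot(v + w) = \begin{pmatrix} 5 \\ -3 \\ 1 \end{pmatrix}\cdot\begin{pmatrix} 2 + 0 \\ 1 - 4 \\ -2 + 3 \end{pmatrix} = 10 + 9 + 1 = 20.$$
$$(u + v)\cdot w = \begin{pmatrix} 5 + 2 \\ -3 + 1 \\ 1 - 2 \end{pmatrix}\cdot\begin{pmatrix} 0 \\ -4 \\ 3 \end{pmatrix} = 0 + 8 - 3 = 5.$$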
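The scalar products in the last two examples, along with commutativity and distributivity, can be checked with a short Python sketch. It assumes vectors are stored as tuples; the helper names `dot` and `add` are ours.

```python
def dot(u, v):
    """Scalar product: the sum of the products of corresponding coordinates."""
    return sum(ui * vi for ui, vi in zip(u, v))

def add(u, v):
    """Coordinate-wise sum of two vectors."""
    return tuple(ui + vi for ui, vi in zip(u, v))

u, v, w = (5, -3, 1), (2, 1, -2), (0, -4, 3)

print(dot(u, v), dot(u, w), dot(v, w))            # 5 15 -10
print(dot(v, u) == dot(u, v))                     # commutativity: True
print(dot(u, add(v, w)), dot(u, v) + dot(u, w))   # distributivity: 20 20
print(dot(add(u, v), w), dot(u, w) + dot(v, w))   # distributivity: 5 5
```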
And again, a vector’s length is the square root of its scalar product with itself:
Fact 67. Let v be a vector. Then $|v| = \sqrt{v\cdot v}$ and $|v|^2 = v\cdot v$.
Proof. By Definition 129, $|v| = \sqrt{v_1^2 + v_2^2 + v_3^2}$. By Definition 137, $v\cdot v = v_1 v_1 + v_2 v_2 + v_3 v_3 = v_1^2 + v_2^2 + v_3^2$. Hence, $|v| = \sqrt{v\cdot v}$ and $|v|^2 = v\cdot v$.
Exercise 211. Compute (1, 2, 3)⋅(4, 5, 6) and (−2, 4, −6)⋅(1, −2, 3). (Answer on p. 1487.)
Example 630. As shown in the figure below, a, b, c, and d are vectors. The angle
between a and b is α, while that between c and d is β.
[Figure: the vectors a, b, c, and d, with the angle α between a and b and the angle β between c and d.]
Definition 111. The angle between two non-zero vectors u and v is the number:
$$\cos^{-1}\frac{u\cdot v}{|u||v|}.$$
Fact 68. If u and v are two non-zero vectors and θ is the angle between them, then $u\cdot v = |u||v|\cos\theta$.
[Figure: the vectors (1, 3, −2) and (0, 2, 1), with an angle of 1.072 between them.]
Example 632. The angle between (8, 5, 0) and (−2, −3, 5) is:
[Figure: the vectors (8, 5, 0) and (−2, −3, 5), with an angle of 2.133 between them.]
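The angles labelled in the two figures above can be reproduced from Definition 111. Below is a minimal Python sketch, assuming vectors are stored as tuples; the function name `angle` is ours.

```python
import math

def angle(u, v):
    """Angle (in radians) between two non-zero vectors, per Definition 111."""
    dot = sum(ui * vi for ui, vi in zip(u, v))
    len_u = math.sqrt(sum(c * c for c in u))
    len_v = math.sqrt(sum(c * c for c in v))
    return math.acos(dot / (len_u * len_v))

print(round(angle((1, 3, -2), (0, 2, 1)), 3))    # 1.072
print(round(angle((8, 5, 0), (-2, -3, 5)), 3))   # 2.133
```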
The following results and Definition are reproduced verbatim from before:
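Let θ be the angle between the non-zero vectors u and v. Then:
(a) $\dfrac{u\cdot v}{|u||v|} = 1 \iff \theta = 0$.
(b) $\dfrac{u\cdot v}{|u||v|} \in (0, 1) \iff \theta \in \left(0, \dfrac{\pi}{2}\right)$.
(c) $\dfrac{u\cdot v}{|u||v|} = 0 \iff \theta = \dfrac{\pi}{2}$.
(d) $\dfrac{u\cdot v}{|u||v|} \in (-1, 0) \iff \theta \in \left(\dfrac{\pi}{2}, \pi\right)$.
(e) $\dfrac{u\cdot v}{|u||v|} = -1 \iff \theta = \pi$.
And thus:
(i) $u\cdot v > 0 \iff \theta \in \left[0, \dfrac{\pi}{2}\right)$.
(ii) $u\cdot v = 0 \iff \theta = \dfrac{\pi}{2}$.
(iii) $u\cdot v < 0 \iff \theta \in \left(\dfrac{\pi}{2}, \pi\right]$.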
Definition 112. Two non-zero vectors u and v are perpendicular (or normal or ortho-
gonal) if u ⋅ v = 0 and non-perpendicular if u ⋅ v ≠ 0.
Fact 72. (Triangle Inequality.) If u and v are vectors, then ∣u + v∣ ≤ ∣u∣ + ∣v∣.
Exercise 212. Find the angle between each pair of vectors. Also, state whether each
pair of vectors is parallel or perpendicular. (Answer on p. 1487.)
(a) a = (1, 2, 3) and b = (4, 5, 6) (b) u = (−2, 4, −6) and v = (1, −2, 3)
Definition 138. The x-, y-, and z-direction cosines of the vector v = (v1, v2, v3) are:
$$\frac{v_1}{|v|}, \quad \frac{v_2}{|v|}, \quad\text{and}\quad \frac{v_3}{|v|}.$$
Again, the direction cosines are so named because each direction cosine is equal to the
cosine of the angle the given vector makes with each (positive) axis:
Fact 91. Let v = (v1 , v2 , v3 ) be a non-zero vector. Let α, β, and γ be the angles between
v and each of i, j, and k. Then:
Example 633. Let v = (2, 3, 2). Compute: $|v| = \sqrt{2^2 + 3^2 + 2^2} = \sqrt{17}$.
So v’s x-, y-, and z-direction cosines are $2/\sqrt{17}$, $3/\sqrt{17}$, and $2/\sqrt{17}$.
And the angles it makes with the positive x-, y-, and z-axes are:
$$\cos^{-1}\frac{2}{\sqrt{17}} \approx 1.064, \quad \cos^{-1}\frac{3}{\sqrt{17}} \approx 0.756, \quad\text{and}\quad \cos^{-1}\frac{2}{\sqrt{17}} \approx 1.064.$$
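The direction cosines and the corresponding angles can be computed as follows. This is a minimal Python sketch, assuming a non-zero vector stored as a tuple; the function name `direction_cosines` is ours.

```python
import math

def direction_cosines(v):
    """Direction cosines of a non-zero vector: each coordinate divided by |v|."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

cosines = direction_cosines((2, 3, 2))
angles = [math.acos(c) for c in cosines]   # angles with the positive x-, y-, z-axes

print([round(a, 3) for a in angles])       # [1.064, 0.756, 1.064]
```

Since the direction cosines are the coordinates of the unit vector, their squares always sum to 1.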
Exercise 213. For each of the following vectors, write down its unit vector and x-, y-,
and z-direction cosines. Then compute also the angles each makes with the positive x-,
y-, and z-axes. (Answer on p. 1487.)
Definition 118. Let a and b be vectors. Then the projection of a on b, denoted $\mathrm{proj}_b a$,
is the following vector:
$$\mathrm{proj}_b a = (a\cdot\hat{b})\hat{b} \quad\left(\text{or equivalently, } \mathrm{proj}_b a = \frac{a\cdot b}{|b|^2}b\right).$$
Definition 119. Let a and b be non-zero vectors. Then the rejection of a on b, denoted
rejb a, is the following vector:
rejb a = a − projb a.
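[Figure: the vectors a and b, the angle θ between them, $\mathrm{proj}_b a$, and $\mathrm{rej}_b a = a - \mathrm{proj}_b a$.]
$\mathrm{proj}_b a \parallel b$ and $\mathrm{rej}_b a \perp \mathrm{proj}_b a, b$.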
The 2D figure above is simply reproduced from before. Here’s a figure depicting the pro-
jection and rejection vectors in 3D:
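[Figure: in 3D space, the vectors a and b, the angle θ between them, $\mathrm{proj}_b a$, and $\mathrm{rej}_b a = a - \mathrm{proj}_b a$.]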
234
Again, these two properties must hold provided projb a and rejb a are both non-zero.
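Example 634. Let a = (5, −2, 3) and b = (0, 1, 2). Then $\hat{b} = (0, 1, 2)/\sqrt{5}$ and:
$$\mathrm{proj}_b a = (a\cdot\hat{b})\hat{b} = \frac{(5, -2, 3)\cdot(0, 1, 2)}{5}\begin{pmatrix} 0 \\ 1 \\ 2 \end{pmatrix} = \frac{0 - 2 + 6}{5}\begin{pmatrix} 0 \\ 1 \\ 2 \end{pmatrix} = \frac{4}{5}\begin{pmatrix} 0 \\ 1 \\ 2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0.8 \\ 1.6 \end{pmatrix},$$
$$\mathrm{rej}_b a = a - \mathrm{proj}_b a = (5, -2, 3) - (0, 0.8, 1.6) = (5, -2.8, 1.4).$$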
[Figure: the vectors a = (5, −2, 3) and b = (0, 1, 2), the angle θ between them, $\mathrm{proj}_b a$, and $\mathrm{rej}_b a = a - \mathrm{proj}_b a$.]
We can easily verify that projb a = kb for some k and hence that projb a ∥ b:
We can also verify that rejb a ⋅ b = 0 and hence that rejb a ⊥ projb a, b:
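Both verifications can be carried out exactly in a few lines of Python. The sketch below is ours (the function names `proj` and `rej` simply mirror the notation above) and assumes integer coordinates, so that the coefficient (a ⋅ b)/∣b∣² can be held as an exact fraction.

```python
from fractions import Fraction

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def proj(a, b):
    """Projection of a on b, computed as ((a . b) / |b|^2) b."""
    k = Fraction(dot(a, b), dot(b, b))   # requires integer coordinates
    return tuple(k * bi for bi in b)

def rej(a, b):
    """Rejection of a on b: a minus its projection on b."""
    return tuple(ai - pi for ai, pi in zip(a, proj(a, b)))

a, b = (5, -2, 3), (0, 1, 2)
print(proj(a, b) == (0, Fraction(4, 5), Fraction(8, 5)))   # True: proj = 0.8 b
print(rej(a, b) == (5, Fraction(-14, 5), Fraction(7, 5)))  # True: (5, -2.8, 1.4)
print(dot(rej(a, b), b))                                   # 0, so rej is perpendicular to b
```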
∣projb a∣ = ∣a ⋅ b̂∣ .
Example 635. Continue to let a = (5, −2, 3) and b = (0, 1, 2). We already found:
$$\hat{b} = (0, 1, 2)/\sqrt{5} \quad\text{and}\quad \mathrm{proj}_b a = (0, 0.8, 1.6).$$
Now: $|\mathrm{proj}_b a| = |0.8(0, 1, 2)| = 0.8\sqrt{0^2 + 1^2 + 2^2} = 0.8\sqrt{5}$.
Also: $|a\cdot\hat{b}| = \left|(5, -2, 3)\cdot(0, 1, 2)/\sqrt{5}\right| = 4/\sqrt{5} = 4\sqrt{5}/5 = 0.8\sqrt{5}$.
As before, the sign of a ⋅ b tells us whether projb a points in the same or exact opposite
direction as b:
Exercise 214. Continuing with the above example, find projb a, rejb a, projc a, and rejc a.
Then verify that rejb a ⊥ b and rejc a ⊥ c. (Answer on p. 1488.)
projv u = projw u.
Example 637. Let u = (2, 5, −1), v = (1, −2, 1), and w = (−2, 4, −2). Since v ∥ w, by the
above Fact, it should be that projv u = projw u, as we now verify:
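$$\mathrm{proj}_v u = (u\cdot\hat{v})\hat{v} = \frac{(2, 5, -1)\cdot(1, -2, 1)}{1^2 + (-2)^2 + 1^2}\begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix} = \frac{2 - 10 - 1}{6}\begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix} = -\frac{3}{2}\begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix}.$$
$$\mathrm{proj}_w u = (u\cdot\hat{w})\hat{w} = \frac{(2, 5, -1)\cdot(-2, 4, -2)}{(-2)^2 + 4^2 + (-2)^2}\begin{pmatrix} -2 \\ 4 \\ -2 \end{pmatrix} = \frac{-4 + 20 + 2}{24}\begin{pmatrix} -2 \\ 4 \\ -2 \end{pmatrix} = -\frac{3}{2}\begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix}.$$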
[Figure: the vectors u = (2, 5, −1), v = (1, −2, 1), and w = (−2, 4, −2), with $\mathrm{proj}_v u = \mathrm{proj}_w u$ and $\mathrm{rej}_v u = \mathrm{rej}_w u$.]
Exercise 215. Given a = (1, 2, 3) and b = (4, 5, 6), find projb a, rejb a, ∣projb a∣, and
∣rejb a∣. Verify that projb a ∥ b and rejb a ⊥ b. Does projb a point in the same or exact
opposite direction as b? (Answer on p. 1488.)
Definition 109. A line is any set of points that can be written as:
$$\left\{R : \overrightarrow{OR} = p + \lambda v \ (\lambda \in \mathbb{R})\right\}.$$
As before, the above Definition says that a line contains exactly those points R whose
position vector $\overrightarrow{OR} = r$ may be expressed as:
$$\overrightarrow{OR} = r = p + \lambda v = \begin{pmatrix} p_1 \\ p_2 \\ p_3 \end{pmatrix} + \lambda\begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} \quad\text{for some real number } \lambda.$$
Equivalently, a line contains exactly those points R that may be expressed as:
$$R = \begin{pmatrix} p_1 \\ p_2 \\ p_3 \end{pmatrix} + \lambda\begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} \quad\text{for some real number } \lambda.$$
As before, here are what the vectors p and v and the number λ mean:
• p = (p1 , p2 , p3 ) is the position vector of some point on the line;
• v = (v1 , v2 , v3 ) is a direction vector of the line; and
• The parameter λ takes on every value in R; each distinct value produces a distinct
point on the line.
Definition 108. Given any two distinct points A and B on a line, we call the vector
$\overrightarrow{AB}$ a direction vector of the line.
Fact 62. Let u and v be vectors. Suppose v is a line’s direction vector. Then:
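$$r = \overrightarrow{OP} + \lambda v = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} + \lambda\begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} \quad (\lambda \in \mathbb{R}).$$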
The line l contains the point P = (1, 2, 3) and has direction vector v = (0, 1, 1).
As the parameter λ varies, we get different points of l. So for example, when λ takes on
the values 0, 1, and −1, we get the following three position vectors (and thus points):
[Figure: the line l, with direction vector (0, 1, 1), through the points (1, 2, 3) (λ = 0), (1, 3, 4) (λ = 1), and (1, 1, 2) (λ = −1).]
Note that the direction vector v = (0, 1, 1) has x-coordinate 0. Informally, one implication
of this is that the line doesn’t “move” in the direction of the x-axis.
A little more formally, the line is perpendicular to the x-axis. Indeed, we can easily verify
that v is perpendicular to the first standard basis vector i = (1, 0, 0):
$$v\cdot i = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}\cdot\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = 0\cdot 1 + 1\cdot 0 + 1\cdot 0 = 0.$$
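$$r = \overrightarrow{OP} + \lambda v = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} + \lambda\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \quad (\lambda \in \mathbb{R}).$$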
The line l contains the point P = (0, 0, 0) and has direction vector v = (1, 0, 0).
As the parameter λ varies, we get different points of l. So for example, when λ takes on
the values 0, 1, and −1, we get the following three position vectors (and thus points):
[Figure: the line l, with direction vector (1, 0, 0), through the points (0, 0, 0) (λ = 0), (1, 0, 0) (λ = 1), and (−1, 0, 0) (λ = −1).]
Note that the direction vector v = (1, 0, 0) has y- and z-coordinates 0. Again, this means
that the line is perpendicular to the y- and z-axes — we can easily verify that v is
perpendicular to both j = (0, 1, 0) and k = (0, 0, 1).
Indeed, this line actually coincides with the x-axis — it passes through the origin (0, 0, 0)
and its direction vector is parallel to i.
Then for each point (x, y, z) on l, there is some real number λ such that:
Example 640. Let l be the line described by the following vector equation:
$$r \overset{\star}{=} \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} + \lambda\begin{pmatrix} 4 \\ 5 \\ 6 \end{pmatrix} \quad (\lambda \in \mathbb{R}).$$
That is, let: $l = \{R : r = (1, 2, 3) + \lambda(4, 5, 6)\}$.
In words, l is the set of points R whose position vector can be written as (1, 2, 3) + λ(4, 5, 6),
for some real number λ.
Write out $\overset{\star}{=}$ as the following three cartesian equations:
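$$r = \begin{pmatrix} -2 \\ 5 \\ 0 \end{pmatrix} + \lambda\begin{pmatrix} 1 \\ 5 \\ -2 \end{pmatrix} \quad (\lambda \in \mathbb{R}).$$
$$x \overset{1}{=} -2 + \lambda, \quad y \overset{2}{=} 5 + 5\lambda, \quad z \overset{3}{=} 0 - 2\lambda.$$
$$\lambda \overset{1}{=} \frac{x + 2}{1}, \quad \lambda \overset{2}{=} \frac{y - 5}{5}, \quad \lambda \overset{3}{=} \frac{z}{-2}.$$
Thus, l0 may also be described by the following two cartesian equations:
$$\frac{x + 2}{1} = \frac{y - 5}{5} = \frac{z}{-2}.$$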
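Example 642. Let l1 be the line described by the following vector equation:
$$r = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} + \lambda\begin{pmatrix} 2 \\ 3 \\ 5 \end{pmatrix} \quad (\lambda \in \mathbb{R}).$$
$$\lambda \overset{1}{=} \frac{x}{2}, \quad \lambda \overset{2}{=} \frac{y}{3}, \quad \lambda \overset{3}{=} \frac{z}{5}.$$
Thus, l1 may also be described by the following two cartesian equations:
$$\frac{x}{2} = \frac{y}{3} = \frac{z}{5}.$$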
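$$\lambda \overset{2}{=} \frac{y - 2}{5} \quad\text{and}\quad \lambda \overset{3}{=} \frac{z - 3}{6}.$$
Altogether then, l2 may also be described by the following two cartesian equations:
$$x = 1 \quad\text{and}\quad \frac{y - 2}{5} = \frac{z - 3}{6}.$$
$$\lambda \overset{1}{=} \frac{x - 1}{4} \quad\text{and}\quad \lambda \overset{3}{=} \frac{z - 3}{6}.$$
Thus, l3 may also be described by the following two cartesian equations:
$$y = 2 \quad\text{and}\quad \frac{x - 1}{4} = \frac{z - 3}{6}.$$
$$\lambda \overset{1}{=} \frac{x - 1}{4} \quad\text{and}\quad \lambda \overset{2}{=} \frac{y - 2}{5}.$$
Thus, l4 may also be described by the following two cartesian equations:
$$z = 3 \quad\text{and}\quad \frac{x - 1}{4} = \frac{y - 2}{5}.$$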
$$x \overset{1}{=} 1 + 0\lambda = 1, \quad y \overset{2}{=} 2 + 0\lambda = 2, \quad z \overset{3}{=} 3 + 6\lambda.$$
Observe that the direction vector (0, 0, 6) has x- and y-coordinates 0. And so, l5 must be
perpendicular to both the x- and y-axes.
Indeed, the x- and y-coordinates of every point of l5 are fixed as $x \overset{1}{=} 1$ and $y \overset{2}{=} 2$.
On the other hand, z is free to vary along with λ. Unlike in any of our previous examples,
there is no restriction on what z can be. And so we call z the free variable.
And so here in this example, there is actually no algebra to be done. We simply discard
$\overset{3}{=}$ and say that l5 may be described by the following two cartesian equations:
$$x \overset{1}{=} 1 \quad\text{and}\quad y \overset{2}{=} 2.$$
$$x \overset{1}{=} 1 + 0\lambda = 1, \quad y \overset{2}{=} 2 + 5\lambda, \quad z \overset{3}{=} 3 + 0\lambda = 3.$$
Observe that the direction vector (0, 5, 0) has x- and z-coordinates 0. And so, l6 must be
perpendicular to both the x- and z-axes.
Indeed, the x- and z-coordinates of every point of l6 are fixed as $x \overset{1}{=} 1$ and $z \overset{3}{=} 3$.
In this example, the free variable is y. We simply discard $\overset{2}{=}$ and say that l6 may be
described by the following two cartesian equations:
$$x \overset{1}{=} 1 \quad\text{and}\quad z \overset{3}{=} 3.$$
$$x \overset{1}{=} 1 + 4\lambda, \quad y \overset{2}{=} 2 + 0\lambda = 2, \quad z \overset{3}{=} 3 + 0\lambda = 3.$$
Observe that the direction vector (4, 0, 0) has y- and z-coordinates 0. And so, l7 must be
perpendicular to both the y- and z-axes.
Indeed, the y- and z-coordinates of every point of l7 are fixed as $y \overset{2}{=} 2$ and $z \overset{3}{=} 3$.
In this example, the free variable is x. We simply discard $\overset{1}{=}$ and say that l7 may be
described by the following two cartesian equations:
$$y \overset{2}{=} 2 \quad\text{and}\quad z \overset{3}{=} 3.$$
$$\frac{x - p_1}{v_1} = \frac{y - p_2}{v_2} = \frac{z - p_3}{v_3}.$$
(2) If v1 = 0 and v2, v3 ≠ 0, then l is perpendicular to the x-axis and can be described by:
$$x = p_1 \quad\text{and}\quad \frac{y - p_2}{v_2} = \frac{z - p_3}{v_3}.$$
(3) If v2 = 0 and v1, v3 ≠ 0, then l is perpendicular to the y-axis and can be described by:
$$y = p_2 \quad\text{and}\quad \frac{x - p_1}{v_1} = \frac{z - p_3}{v_3}.$$
(4) If v3 = 0 and v1, v2 ≠ 0, then l is perpendicular to the z-axis and can be described by:
$$z = p_3 \quad\text{and}\quad \frac{x - p_1}{v_1} = \frac{y - p_2}{v_2}.$$
(5) If v1 , v2 = 0, then l is perpendicular to the x- and y-axes and can be described by:
x = p1 and y = p2 .
(6) If v1 , v3 = 0, then l is perpendicular to the x- and z-axes and can be described by:
x = p1 and z = p3 .
(7) If v2 , v3 = 0, then l is perpendicular to the y- and z-axes and can be described by:
y = p2 and y = p2 .
Exercise 216. Each vector equation below describes a line. Rewrite each into cartesian
form. Also, state if each line is perpendicular to any axes. (Answer on p. 1489.)
Example 649. Let l be the line described by the following cartesian equations:
$$\frac{3x - 9}{6} = \frac{2y - 8}{2} = \frac{z - 1}{3}.$$
First, rewrite the above cartesian equations so that the coefficients on x, y, and z are all
1. This is easily done by dividing the numerator and denominator of each fraction by the
variable’s coefficient:
$$\frac{x - 3}{2} = \frac{y - 4}{1} = \frac{z - 1}{3}.$$
Reading off, the line l contains the point (3, 4, 1) and has direction vector (2, 1, 3). So, it
can also be described by the following vector equation:
Example 650. Let l1 be the line described by $\dfrac{-x + 7}{-5} = \dfrac{0.5y + 1}{0.3} = z - 2$.
Rewrite the above as: $\dfrac{x - 7}{5} = \dfrac{y + 2}{0.6} = \dfrac{z - 2}{1}$.
Reading off, l1 contains the point (7, −2, 2) and has direction vector (5, 0.6, 1). So, it can
also be described by:
Example 651. Let l2 be the line described by $\dfrac{5x}{2} = \dfrac{y - 12}{6} = \dfrac{3z - 15}{9}$.
Rewrite the above as: $\dfrac{x - 0}{0.4} = \dfrac{y - 12}{6} = \dfrac{z - 5}{3}$.
Reading off, l2 contains the point (0, 12, 5) and has direction vector (0.4, 6, 3). So, it can
also be described by:
$$x \overset{1}{=} 17 \quad\text{and}\quad \frac{5y - 12}{100} \overset{2}{=} \frac{3z - 15}{9}.$$
Rewrite $\overset{2}{=}$ as: $\dfrac{y - 2.4}{20} = \dfrac{z - 5}{3}$.
Reading off, l3 contains the point (17, 2.4, 5) and has direction vector (0, 20, 3). So, it
can also be described by:
$$y \overset{1}{=} -2 \quad\text{and}\quad \frac{-x}{3} \overset{2}{=} \frac{z + 10}{-5}.$$
Rewrite $\overset{2}{=}$ as: $\dfrac{x}{-3} = \dfrac{z - (-10)}{-5}$.
Reading off, l4 contains the point (0, −2, −10) and has direction vector (−3, 0, −5). So, it
can also be described by:
$$4z \overset{1}{=} 3 \quad\text{and}\quad \frac{7x - 6}{35} \overset{2}{=} \frac{2y + 10}{18}.$$
Rewrite $\overset{2}{=}$ as: $\dfrac{x - 6/7}{5} = \dfrac{y - (-5)}{9}$.
Reading off, l5 contains the point (6/7, −5, 3/4) and has direction vector (5, 9, 0). So, it can
also be described by:
Example 657. Let l8 be the line described by y = −11 and −4z = 52.
Every point on l8 has y- and z-coordinates −11 and −13. Hence, the direction vector
must have 0 as its y- and z-coordinates. (Equivalently, this line must be perpendicular
to both the y- and z-axes.)
The free variable is x. Altogether then, l8 contains exactly those points (x, −11, −13), for
all real numbers x. For example, it contains the points (0, −11, −13) and $(\sqrt{2}, -11, -13)$.
Hence, for any non-zero k, (k, 0, 0) is a direction vector of l8 .
For simplicity, we pick (1, 0, 0) as our direction vector and describe l8 by:
Exercise 217. Each pair of cartesian equations below describes a line. Rewrite each into
vector form. State if each is perpendicular to any axes. (Answer on p. 1489.)
(a) $\dfrac{7x - 2}{5} = \dfrac{0.3y - 5}{7} = \dfrac{8z}{7}$.
(b) $2x = 3y = 5z$.
(c) $17x - 4 = \dfrac{3y - 1}{2} = 3z$.
(d) $3y = 11$ and $\dfrac{x - 3}{2} = \dfrac{5z - 2}{7}$.
(e) $\dfrac{x}{5} = 13$ and $2z = 1$.
(f) $13x + 5 = 0$ and $y = 5z - 2$.
Definition 116. Two lines are (a) parallel if they have parallel direction vectors; and
(b) perpendicular if they have perpendicular direction vectors.
Since (1, 2, 3) ∥ (−2, −4, −6), by the above Definition, the two lines are parallel.
[Figure: the parallel lines r = (1, 0, 1) + λ(1, 2, 3) (λ ∈ R) and r = (5, 0, 9) + µ(−2, −4, −6) (µ ∈ R).]
r = (5, −1, 4) + λ (8, 2, −1) and r = (3, 1, 6) + µ (1, −2, 4) (λ, µ ∈ R).
We have (8, 2, −1) ⋅ (1, −2, 4) = 8 − 4 − 4 = 0. So, (8, 2, −1) ⊥ (1, −2, 4) and by the above
Definition, the two lines are perpendicular.
Since (1, 0, 0) ∥/ (1, 1, 0) and (1, 0, 0) ⊥/ (1, 1, 0), the two lines are neither parallel nor
perpendicular.
Since (1, 2, 3) ∥ (−2, −4, −6), by the above Definition, the two lines are parallel.
Observe that the point (3, 6, 9) is on both lines (plug in λ = 0 and µ = −1.5).
Since the two lines are parallel and do intersect, by Fact 75(b), they cannot be distinct.
Equivalently, they must be identical.
We will next learn how to determine whether two lines in 3D space intersect and if they
do, how to find their intersection point.
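$$r = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} + \lambda\begin{pmatrix} 1 \\ -1 \\ -2 \end{pmatrix} \quad\text{and}\quad r = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + \mu\begin{pmatrix} 2 \\ 0 \\ -1 \end{pmatrix} \quad (\lambda, \mu \in \mathbb{R}).$$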
These two lines are not parallel and hence distinct. And so, by Fact 75(c), they share at
most one intersection point.
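Suppose they intersect. Then there must be real numbers $\hat\lambda$ and $\hat\mu$ such that:
$$\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} + \hat\lambda\begin{pmatrix} 1 \\ -1 \\ -2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + \hat\mu\begin{pmatrix} 2 \\ 0 \\ -1 \end{pmatrix} \quad\text{or}\quad \begin{aligned} \hat\lambda &\overset{1}{=} 1 + 2\hat\mu, \\ -\hat\lambda &\overset{2}{=} 1, \\ -2\hat\lambda &\overset{3}{=} 1 - \hat\mu. \end{aligned}$$
From $\overset{2}{=}$, $\hat\lambda = -1$. Plug this into $\overset{1}{=}$ to get $\hat\mu = -1$. You can verify that these values of $\hat\lambda$ and $\hat\mu$ also satisfy $\overset{3}{=}$. Hence, the two lines intersect.
To find their intersection point, plug $\hat\lambda = -1$ or $\hat\mu = -1$ into either line’s vector equation:
$$\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} + \underbrace{\hat\lambda}_{-1}\begin{pmatrix} 1 \\ -1 \\ -2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + \underbrace{\hat\mu}_{-1}\begin{pmatrix} 2 \\ 0 \\ -1 \end{pmatrix} = \begin{pmatrix} -1 \\ 1 \\ 2 \end{pmatrix}.$$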
[Figure: the lines r = (1, 1, 1) + µ(2, 0, −1) (µ ∈ R) and r = (0, 0, 0) + λ(1, −1, −2) (λ ∈ R), which intersect at (−1, 1, 2).]
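$$r = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} + \lambda\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \quad\text{and}\quad r = \begin{pmatrix} 0 \\ 1 \\ 2 \end{pmatrix} + \mu\begin{pmatrix} 2 \\ 1 \\ 4 \end{pmatrix} \quad (\lambda, \mu \in \mathbb{R}).$$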
These two lines are not parallel and hence distinct. And so, by Fact 75(c), they share at
most one intersection point.
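Suppose they intersect. Then there must be real numbers $\hat\lambda$ and $\hat\mu$ such that:
$$\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} + \hat\lambda\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \\ 2 \end{pmatrix} + \hat\mu\begin{pmatrix} 2 \\ 1 \\ 4 \end{pmatrix} \quad\text{or}\quad \begin{aligned} 1 + \hat\lambda &\overset{1}{=} 2\hat\mu, \\ 2 + \hat\lambda &\overset{2}{=} 1 + \hat\mu, \\ 3 + \hat\lambda &\overset{3}{=} 2 + 4\hat\mu. \end{aligned}$$
$\overset{1}{=}$ minus $\overset{2}{=}$ yields $-1 = \hat\mu - 1$ or $\hat\mu = 0$. Plug this into $\overset{1}{=}$ to get $\hat\lambda = -1$. You can verify that
these values of $\hat\lambda$ and $\hat\mu$ also satisfy $\overset{3}{=}$. Hence, the two lines intersect.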
To find their intersection point, plug λ̂ = −1 or µ̂ = 0 into either line’s vector equation:
[Figure: the lines r = (0, 1, 2) + µ(2, 1, 4) (µ ∈ R) and r = (1, 2, 3) + λ(1, 1, 1) (λ ∈ R), which intersect at (0, 1, 2).]
Definition 139. Two lines are said to be skew if they are not parallel and do not intersect.
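$$r = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} + \lambda\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} \quad\text{and}\quad r = \begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix} + \mu\begin{pmatrix} 4 \\ 5 \\ 6 \end{pmatrix} \quad (\lambda, \mu \in \mathbb{R}).$$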
These two lines are not parallel and hence distinct. And so by Fact 75(c), they share at
most one intersection point.
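To check if they intersect, suppose there are real numbers $\hat\lambda$ and $\hat\mu$ such that:
$$\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} + \hat\lambda\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} \overset{\star}{=} \begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix} + \hat\mu\begin{pmatrix} 4 \\ 5 \\ 6 \end{pmatrix} \quad\text{or}\quad \begin{aligned} \hat\lambda &\overset{1}{=} 1 + 4\hat\mu, \\ 2\hat\lambda &\overset{2}{=} 1 + 5\hat\mu, \\ 3\hat\lambda &\overset{3}{=} 2 + 6\hat\mu. \end{aligned}$$
Now, 2× $\overset{1}{=}$ minus $\overset{2}{=}$ yields $0 = 1 + 3\hat\mu$ or $\hat\mu = -1/3$. Plug this back into $\overset{1}{=}$ to get $\hat\lambda = -1/3$.
But these values do not satisfy $\overset{3}{=}$: the left-hand side is $3\hat\lambda = -1$, while the right-hand side is $2 + 6\hat\mu = 0$.
This contradiction means that there are no real numbers $\hat\lambda$ and $\hat\mu$ such that $\overset{\star}{=}$ holds. In
other words, the two lines do not intersect. And since they are not parallel either, by
the above Definition, they are skew.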
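The elimination above can be replayed step by step with exact fractions. This Python sketch just mirrors the hand computation for these two particular lines; the variable names are ours.

```python
from fractions import Fraction

# Equations: lam = 1 + 4*mu (1), 2*lam = 1 + 5*mu (2), 3*lam = 2 + 6*mu (3).
mu = Fraction(-1, 3)                 # from 2*(1) - (2): 0 = 1 + 3*mu
lam = 1 + 4 * mu                     # back-substitute into (1)

print(lam, mu)                       # -1/3 -1/3
print(2 * lam == 1 + 5 * mu)         # True: (2) is satisfied
lhs3, rhs3 = 3 * lam, 2 + 6 * mu
print(lhs3, rhs3)                    # -1 0: (3) fails, so the lines do not intersect
```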
[Figure: the skew lines r = (1, 1, 2) + µ(4, 5, 6) (µ ∈ R) and r = (0, 0, 0) + λ(1, 2, 3) (λ ∈ R).]
235
Fact 76.
Example 666. Suppose two lines are described by:
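$$r = \begin{pmatrix} 1 \\ 3 \\ 3 \end{pmatrix} + \lambda\begin{pmatrix} 1 \\ -1 \\ -2 \end{pmatrix} \quad\text{and}\quad r = \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} + \mu\begin{pmatrix} 2 \\ 1 \\ 3 \end{pmatrix} \quad (\lambda, \mu \in \mathbb{R}).$$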
These two lines are not parallel and hence distinct. And so by Fact 75(c), they share at
most one intersection point.
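To check if they intersect, suppose there are real numbers $\hat\lambda$ and $\hat\mu$ such that:
$$\begin{pmatrix} 1 \\ 3 \\ 3 \end{pmatrix} + \hat\lambda\begin{pmatrix} 1 \\ -1 \\ -2 \end{pmatrix} \overset{\star}{=} \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} + \hat\mu\begin{pmatrix} 2 \\ 1 \\ 3 \end{pmatrix} \quad\text{or}\quad \begin{aligned} 1 + \hat\lambda &\overset{1}{=} 1 + 2\hat\mu, \\ 3 - \hat\lambda &\overset{2}{=} \hat\mu, \\ 3 - 2\hat\lambda &\overset{3}{=} 1 + 3\hat\mu. \end{aligned}$$
Now, $\overset{1}{=}$ minus 2× $\overset{2}{=}$ yields $3\hat\lambda - 5 = 1$ or $\hat\lambda = 2$. Plug this back into $\overset{1}{=}$ to get $\hat\mu = 1$.
But these values do not satisfy $\overset{3}{=}$: the left-hand side is $3 - 2\hat\lambda = -1$, while the right-hand side is $1 + 3\hat\mu = 4$.
And so again, the two lines do not intersect. And since they are not parallel either,
they are skew.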
⎛1⎞ ⎛ 1 ⎞
⎛1⎞ ⎛2 ⎞
r=⎜ ⎟ ⎜
⎜ 3 ⎟ + λ ⎜ −1
⎟ (λ ∈ R)
⎟
r=⎜ ⎟ ⎜
⎜ 0 ⎟ + µ⎜ 1
⎟ (µ ∈ R)
⎟ ⎝3⎠ ⎝ −2 ⎠
⎝1⎠ ⎝3 ⎠
y
x
β =π−α
α
z
Our formal definition of the angle between two lines is reproduced from before:
Definition 115. Given two lines, pick for each any direction vector. We call the non-
obtuse angle between these two vectors the angle between the two lines.
Corollary 8. The angle between two lines with direction vectors u and v is:
\[
\cos^{-1} \frac{|\mathbf{u} \cdot \mathbf{v}|}{|\mathbf{u}|\,|\mathbf{v}|}.
\]
Corollary 9. Suppose θ is the angle between two lines. (a) If θ = 0, then the two lines
are parallel. And (b) if θ = π/2, then they are perpendicular.
Observe that these two lines intersect at (1, 0, 1). (And so, they are not skew.)
Again, the angle between these two lines is given by Corollary 8:
This angle is neither zero nor right. And so by Corollary 9, the two lines are neither
parallel nor perpendicular.
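Corollary 8 is easy to evaluate numerically. The sketch below is our own illustration (the name `angle_between_lines` is not from this textbook) and applies it to the direction vectors (1, 0, 1) and (5, 1, 2) of the two lines in this example. The `min` call only guards against floating-point round-off pushing the ratio slightly above 1.

```python
import math

def angle_between_lines(u, v):
    """Non-obtuse angle (in radians) between lines with direction vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Taking |u . v| makes the result non-obtuse, as in Definition 115.
    return math.acos(min(1.0, abs(dot) / (norm_u * norm_v)))

theta = angle_between_lines((1, 0, 1), (5, 1, 2))
print(round(theta, 3))  # compare with the angle marked in the figure
```

By Corollary 9, a result of 0 would indicate parallel lines and a result of π/2 perpendicular lines.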
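[Figure: the lines r = (0, 0, 0) + λ(1, 0, 1) (λ ∈ R) and r = (6, 1, 3) + µ(5, 1, 2) (µ ∈ R), which intersect at (1, 0, 1) at an angle of 0.442.]
\[
\mathbf{r} = \begin{pmatrix} 1 \\ 2 \\ 2 \end{pmatrix} + \lambda \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}
\quad \text{and} \quad
\mathbf{r} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} + \mu \begin{pmatrix} 0 \\ 3 \\ -2 \end{pmatrix}
\quad (\lambda, \mu \in \mathbb{R}).
\]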
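If they intersect, then there are real numbers λ̂ and µ̂ such that:
\[
\begin{pmatrix} 1 \\ 2 \\ 2 \end{pmatrix} + \hat{\lambda} \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} \overset{\star}{=} \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} + \hat{\mu} \begin{pmatrix} 0 \\ 3 \\ -2 \end{pmatrix}
\quad \text{or} \quad
\begin{aligned}
1 + \hat{\lambda} &\overset{1}{=} 0, \\
2 + 2\hat{\lambda} &\overset{2}{=} 3\hat{\mu}, \\
2 + \hat{\lambda} &\overset{3}{=} -2\hat{\mu}.
\end{aligned}
\]
From $\overset{1}{=}$, λ̂ = −1. Plug this into $\overset{2}{=}$ to get µ̂ = 0. But now, these values of λ̂ and µ̂ contradict $\overset{3}{=}$. Hence, the two lines do not intersect.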
Even though the two lines do not intersect, we will still find it useful to talk about the
angle between them. This we can compute as usual:
This angle is neither zero nor right. And so by Corollary 9, the two lines are neither
parallel nor perpendicular.
Since the two lines do not intersect and are not parallel, they are skew.
[Figure: the lines r = (0, 0, 0) + µ(0, 3, −2) (µ ∈ R) and r = (1, 2, 2) + λ(1, 2, 1) (λ ∈ R), with the angle 1.101 marked.]
The two lines do not intersect. Nonetheless, we can always translate one of the two lines
so that they intersect. In the above figure, we’ve translated the black line so that it
intersects the red line at the origin.
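If they intersect, then there are real numbers λ̂ and µ̂ such that:
\[
\begin{pmatrix} 0 \\ 1 \\ 2 \end{pmatrix} + \hat{\lambda} \begin{pmatrix} 9 \\ 1 \\ 3 \end{pmatrix} \overset{\star}{=} \begin{pmatrix} 4 \\ 5 \\ 6 \end{pmatrix} + \hat{\mu} \begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix}
\quad \text{or} \quad
\begin{aligned}
9\hat{\lambda} &\overset{1}{=} 4 + 3\hat{\mu}, \\
1 + \hat{\lambda} &\overset{2}{=} 5 + 2\hat{\mu}, \\
2 + 3\hat{\lambda} &\overset{3}{=} 6 + \hat{\mu}.
\end{aligned}
\]
$\overset{1}{=}$ minus 3× $\overset{3}{=}$ yields −6 = −14, which is a contradiction. Hence, the two lines do not intersect.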
This angle is neither zero nor right. And so by Corollary 9, the two lines are neither
parallel nor perpendicular. Since they do not intersect either, they are skew.
[Figure: the lines r = (4, 5, 6) + µ(3, 2, 1) (µ ∈ R) and r = (0, 1, 2) + λ(9, 1, 3) (λ ∈ R), with the angle 0.459 marked at the point (0, 1, 2).]
The two lines do not intersect. Nonetheless, we can always translate one of the two
lines so that they intersect. In the above figure, we’ve translated the red line so that it
intersects the black line at the point (0, 1, 2).
Exercise 218. Each of (a)–(d) gives a pair of lines in vector form. Find any inter-
section points and the angle between the two lines. State if the two lines are parallel,
perpendicular, identical, or skew. (Answer on p. 1490.)
(a) r= (0, 1, 1) +λ (1, −1, 1) and r= (1, 3, 3) +µ (0, 0, 2).
(b) r= (−1, 2, 3) +λ (0, 1, 0) and r= (0, 0, 0) +µ (8, −3, 5).
(c) r= (7, 3, 4) +λ (8, 3, 4) and r= (9, 3, 7) +µ (3, −4, −3).
(d) r= (0, 0, 1) +λ (1, 2, 1) and r= (1, 0, 0) +µ (−3, −6, −3).
Definition 120. Two or more points are collinear if some line contains all of them.
Fact 80 is reproduced from before and says that any two points are always collinear:
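Fact 80. Suppose A and B are distinct points. Then the unique line that contains both A and B is described by:
\[
\mathbf{r} = \overrightarrow{OA} + \lambda \overrightarrow{AB} \quad (\lambda \in \mathbb{R}).
\]
[Figure: the points A and B, the vector $\overrightarrow{AB}$, and the line r = $\overrightarrow{OA}$ + λ$\overrightarrow{AB}$ (λ ∈ R).]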
And as before, three distinct points can be collinear but will not generally be:
[Figure: A, B, and C are collinear and lie on the line r = a + λ$\overrightarrow{AB}$ (λ ∈ R). A second set of three points D, E, and F is also shown.]
We’ll use the exact same procedure to check whether three points are collinear:
1. First use Fact 80 to write down the unique line that contains two of the three points.
2. Then check whether this line also contains the third point.
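Two examples:
\[
\mathbf{r} = \overrightarrow{OA} + \lambda \overrightarrow{AB} = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} + \lambda \begin{pmatrix} 3 \\ 3 \\ 3 \end{pmatrix} \quad (\lambda \in \mathbb{R}).
\]
\[
C = \begin{pmatrix} 7 \\ 8 \\ 9 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} + \hat{\lambda} \begin{pmatrix} 3 \\ 3 \\ 3 \end{pmatrix}
\quad \text{or} \quad
\begin{aligned}
7 &\overset{1}{=} 1 + 3\hat{\lambda}, \\
8 &\overset{2}{=} 2 + 3\hat{\lambda}, \\
9 &\overset{3}{=} 3 + 3\hat{\lambda}.
\end{aligned}
\]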
As you can verify, λ̂ = 2 solves the above vector equation (or system of three equations).
Thus, our line also contains C.
We conclude that A, B, and C are collinear.
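[Figure: the collinear points A = (1, 2, 3), B = (4, 5, 6), and C = (7, 8, 9) on the line r = (1, 2, 3) + λ(3, 3, 3) (λ ∈ R).]
\[
\mathbf{r} = \overrightarrow{OD} + \lambda \overrightarrow{DE} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + \lambda \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} \quad (\lambda \in \mathbb{R}).
\]
\[
F = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + \hat{\lambda} \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix}
\quad \text{or} \quad
\begin{aligned}
0 &\overset{1}{=} 1 - 1\hat{\lambda}, \\
0 &\overset{2}{=} 0 + 1\hat{\lambda}, \\
1 &\overset{3}{=} 0 + 0\hat{\lambda}.
\end{aligned}
\]
From $\overset{1}{=}$, we have λ̂ = 1. But this contradicts $\overset{2}{=}$. This contradiction means that there is no solution to the above vector equation (or system of three equations). Thus, the line we wrote down above does not contain F.
We conclude that D, E, and F are not collinear.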
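The two-step procedure above (write down the line through two of the points, then check whether it contains the third) can be sketched as follows. The name `are_collinear` is our own, and the sketch assumes the first two points are distinct and have integer coordinates.

```python
from fractions import Fraction

def are_collinear(a, b, c):
    """Check whether c lies on the line through the distinct points a and b."""
    ab = tuple(bi - ai for ai, bi in zip(a, b))
    # Step 1: the line is r = a + lam * ab. Solve one coordinate equation for lam.
    i = next(k for k in range(3) if ab[k] != 0)
    lam = Fraction(c[i] - a[i], ab[i])
    # Step 2: c is on the line exactly when this lam satisfies all three equations.
    return all(a[k] + lam * ab[k] == c[k] for k in range(3))

print(are_collinear((1, 2, 3), (4, 5, 6), (7, 8, 9)))
print(are_collinear((1, 0, 0), (0, 1, 0), (0, 0, 1)))
```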
[Figure: the points D = (1, 0, 0) and E = (0, 1, 0) on the line r = (1, 0, 0) + λ(−1, 1, 0) (λ ∈ R), and the point F = (0, 0, 1), which is not on this line.]
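\[
\begin{aligned}
(a_1, a_2, a_3) \cdot (c_1, c_2, c_3) &= a_1 c_1 + a_2 c_2 + a_3 c_3 \overset{1}{=} 0, \\
(b_1, b_2, b_3) \cdot (c_1, c_2, c_3) &= b_1 c_1 + b_2 c_2 + b_3 c_3 \overset{2}{=} 0.
\end{aligned}
\]
Our goal is to find c that solves $\overset{1}{=}$ and $\overset{2}{=}$. Observe that b3 × $\overset{1}{=}$ minus a3 × $\overset{2}{=}$ yields (the two a3b3c3 terms cancel):
\[
\begin{aligned}
0 &= a_1 b_3 c_1 + a_2 b_3 c_2 + a_3 b_3 c_3 - a_3 b_1 c_1 - a_3 b_2 c_2 - a_3 b_3 c_3 \\
&\overset{3}{=} c_2 \underbrace{(a_2 b_3 - a_3 b_2)}_{c_1} - c_1 \underbrace{(a_3 b_1 - a_1 b_3)}_{c_2}.
\end{aligned}
\]
As the underbraces indicate, $\overset{3}{=}$ holds if we pick c1 = a2b3 − a3b2 and c2 = a3b1 − a1b3. Plugging these into $\overset{1}{=}$ (the two a1a2b3 terms cancel):
\[
0 = a_1 (a_2 b_3 - a_3 b_2) + a_2 (a_3 b_1 - a_1 b_3) + a_3 c_3 = -a_1 a_3 b_2 + a_2 a_3 b_1 + a_3 c_3
\quad \text{or} \quad c_3 = a_1 b_2 - a_2 b_1.
\]
Hence, a vector that solves $\overset{1}{=}$ and $\overset{2}{=}$ (i.e. is perpendicular to both a and b) is:
\[
\mathbf{c} = \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \begin{pmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{pmatrix}.
\]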
We will simply use the above as our Definition of the vector product:236
Definition 140. Let a = (a1 , a2 , a3 ) and b = (b1 , b2 , b3 ) be vectors. Then their vector product, denoted a × b, is the following vector:
\[
\mathbf{a} \times \mathbf{b} = \begin{pmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{pmatrix}.
\]
By the way, no need to mug Definition 140, because it’s already on List MF26 (p. 4).
236
Pedagogical note: In earlier versions of this textbook (i.e. before the revisions of 2018), I started with
the geometric definition of the vector product. I have now decided to go the other way round — that
is, I now take the more standard and formalistic approach of defining the vector product analytically.
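Example 676. The vector product of a = (1, 2, 3) and b = (4, 5, 6) is:
\[
\mathbf{a} \times \mathbf{b} = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} \times \begin{pmatrix} 4 \\ 5 \\ 6 \end{pmatrix} = \begin{pmatrix} 2 \cdot 6 - 3 \cdot 5 \\ 3 \cdot 4 - 1 \cdot 6 \\ 1 \cdot 5 - 2 \cdot 4 \end{pmatrix} = \begin{pmatrix} -3 \\ 6 \\ -3 \end{pmatrix}.
\]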
From our above discussion, we already know that a × b ⊥ a, b. Indeed, this was the
geometric property that motivated our definition of the vector product. Nonetheless, as
an exercise, let’s go ahead and verify that (a × b) ⋅ a = 0 and (a × b) ⋅ b = 0:
[Figure: the vectors a = (1, 2, 3), b = (4, 5, 6), and a × b = (−3, 6, −3).]
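As a quick numerical check of Definition 140 and of the perpendicularity claim for a = (1, 2, 3) and b = (4, 5, 6), here is a minimal sketch. The helper names `cross` and `dot` are our own.

```python
def cross(a, b):
    """Vector product of two 3D vectors, following Definition 140 term by term."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def dot(a, b):
    """Scalar product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

a, b = (1, 2, 3), (4, 5, 6)
n = cross(a, b)
# Both scalar products should be zero, since a x b is perpendicular to a and to b.
print(n, dot(n, a), dot(n, b))
```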
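Example 677. The vector product of u = (1, 0, −1) and v = (3, −1, 0) is:
\[
\mathbf{u} \times \mathbf{v} = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} \times \begin{pmatrix} 3 \\ -1 \\ 0 \end{pmatrix} = \begin{pmatrix} 0 \cdot 0 - (-1) \cdot (-1) \\ -1 \cdot 3 - 1 \cdot 0 \\ 1 \cdot (-1) - 0 \cdot 3 \end{pmatrix} = \begin{pmatrix} -1 \\ -3 \\ -1 \end{pmatrix}.
\]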
Exercise 220. Let u = (0, 1, 2), v = (3, 4, 5), w = (−1, −2, −3), and x = (1, 0, 5).
(a) Find u × v and verify that u × v ⊥ u, v.
(b) Find w × x and verify that w × x ⊥ w, x. (Answer on p. 1492.)
(c) Let a = (a1 , a2 , a3 ) and b = (b1 , b2 , b3 ). Prove that (a × b) ⋅ a = 0 and (a × b) ⋅ b = 0.
All of our results about the vector product in 2D space continue to hold in 3D space and
are now reproduced. First, it remains true that the vector product is distributive and
anti-commutative. Moreover, the vector product of a vector with itself is zero:
(ca) × b = c (a × b).
Also, from Fact 81(c), we again have the following result. The proof is exactly the same as
before and is simply reproduced:
a×b=0 ⇐⇒ a ∥ b.
Example 678. Let s = (1, 2, 3) and t = (2, 4, 6) be vectors. Since s ∥ t, by Corollary 11,
we must have s × t = 0. We can easily verify that this is so:
Example 679. The vector product of c = (−1, 3, −5) and d = (2, −4, 6) is:
\[
\mathbf{c} \times \mathbf{d} = \begin{pmatrix} -1 \\ 3 \\ -5 \end{pmatrix} \times \begin{pmatrix} 2 \\ -4 \\ 6 \end{pmatrix} = \begin{pmatrix} -2 \\ -4 \\ -2 \end{pmatrix}.
\]
c∥a×b ⇐⇒ c ⊥ a, b.
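In contrast, −a × b, the other vector that’s perpendicular to a and b, satisfies the left-hand rule. (Can you explain why?)
\[
\mathbf{a} \times \mathbf{b} = \begin{pmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{pmatrix}
\quad \text{and} \quad
-\mathbf{a} \times \mathbf{b} = \begin{pmatrix} a_3 b_2 - a_2 b_3 \\ a_1 b_3 - a_3 b_1 \\ a_2 b_1 - a_1 b_2 \end{pmatrix}.
\]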
Why is it that one of these two arbitrary-looking vectors satisfies the right-hand rule,
while the other satisfies the left-hand rule? That this is so is not at all obvious and is
beyond the scope of this textbook.237
Fun Fact
Why do we use the right-hand rule rather than the left-hand rule? One possible
explanation might be that the right-handed majority is, as usual, being tyrannical.
But more likely, this is simply an arbitrary convention, not unlike how most of the
world drives on the right, while a minority drives on the left.238
Indeed, according to one writer:
Until 1965, the Soviet Union used the left-hand rule, logically reasoning that
the left-hand rule is more convenient because a right-handed person can
simultaneously write while performing cross products.
237
The short answer is that (i) we earlier adopted the convention that our coordinate system obeys the
right-hand rule; and (ii) (a, b, a × b) is positively oriented with respect to (i, j, k) (what exactly
positively oriented means is the bit that’s beyond the scope of this textbook). Had we instead adopted
the convention that our coordinate system obeys the left-hand rule, then as currently defined, our vector
product a × b would also obey the left-hand rule.
238
This left-driving minority includes Japan, the UK, and former British colonies like Singapore and
Malaysia.
49.2. The Length of the Vector Product
As before, the vector product a × b has length ∣a∣ ∣b∣ sin θ. Formally:
Fact 84. Let θ be the angle between the vectors a and b. Then:
∣a × b∣ = ∣a∣ ∣b∣ sin θ.
Exercise 222. Let θ be the angle between the vectors a = (a1 , a2 , a3 ) and b = (b1 , b2 , b3 ).
(a) Express ∣a∣, ∣b∣, ∣a × b∣, and cos θ in terms of a1 , a2 , a3 , b1 , b2 , and b3 . (You need not
expand the squared terms.)
(b) Since θ ∈ [0, π], what can you say about the sign of sin θ? (That is, is sin θ positive,
negative, non-positive, or non-negative?)
(c) Now use a trigonometric identity to express sin θ in terms of cos θ. (Hint: You should
find that there are two possibilities. Use what you found in (b) to explain why you can
discard one of these possibilities.)
(d) Plug the expression you wrote down for cos θ in (a) into what you found in (c).
(e) Prove the following algebraic identity.239 (Hint: Fully expand each of LHS and RHS.
Then conclude that LHS = RHS.)
(f) Use (a) and (d) to express ∣a∣ ∣b∣ sin θ in terms of a1 , a2 , a3 , b1 , b2 , and b3 . Then use
(e) to prove that:
239
By the way, this is again simply an instance of Lagrange’s Identity.
49.3. The Length of the Rejection Vector
In 2D space, the vector product was a scalar (real number). In contrast, in 3D space, it is
a vector (hence the name).
Nonetheless and perhaps surprisingly, Fact 85 — which says the rejection vector’s length
is given by the vector product — remains true and is now reproduced:
∣rejb a∣ = ∣a × b̂∣ .
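Example 680. The points A = (1, 5, −2), B = (2, 3, 1), and C = (2, 7, −1) form a right triangle. Compute: $\overrightarrow{AB}$ = (1, −2, 3), $\overrightarrow{AC}$ = (1, 2, 1), and $\overrightarrow{BC}$ = (0, 4, −2).
The lengths of the line segments AB and AC are simply:
\[
|\overrightarrow{AB}| = \sqrt{1^2 + (-2)^2 + 3^2} = \sqrt{14}
\quad \text{and} \quad
|\overrightarrow{AC}| = \sqrt{1^2 + 2^2 + 1^2} = \sqrt{6}.
\]
Since $\overrightarrow{AB}$ ⊥ $\overrightarrow{AC}$, the vector $\overrightarrow{AC}$ is also the rejection of $\overrightarrow{BC}$ on $\overrightarrow{AB}$. So by Fact 85, we can compute ∣$\overrightarrow{AC}$∣ a second way:
\[
|\overrightarrow{AC}| = \left|\overrightarrow{BC} \times \widehat{\overrightarrow{AB}}\right| = \frac{|\overrightarrow{BC} \times \overrightarrow{AB}|}{|\overrightarrow{AB}|} = \frac{|(8, -2, -4)|}{|(1, -2, 3)|} = \sqrt{\frac{8^2 + (-2)^2 + (-4)^2}{1^2 + (-2)^2 + 3^2}} = \sqrt{\frac{84}{14}} = \sqrt{6}.
\]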
Definition 122. Let A be a point that isn’t on the line l. The foot of the perpendicular from A to l is the point B on l such that $\overrightarrow{AB}$ ⊥ l.
[Figure: a point A, the line l, and the foot B of the perpendicular from A to l.]
Again, we can make use of the projection vector to find the foot of the perpendicular:
Fact 86. Suppose l is the line described by r = $\overrightarrow{OP}$ + λv (λ ∈ R) and A is a point that isn’t on l. Then the unique foot of the perpendicular from A to l is the following point:
\[
P + \operatorname{proj}_{\mathbf{v}} \overrightarrow{PA}.
\]
Again, the distance between a point and a line is the minimum distance between them:
Definition 123. Let A be a point and l be a line. Suppose B is the point on l that’s closest to A. Then the distance between A and l is ∣$\overrightarrow{AB}$∣.
Fact 87. If B is the foot of the perpendicular from a point A to a line l, then B is also
the point on l that’s closest to A.
Corollary 13. Suppose l is a line, A is a point, and B is the foot of the perpendicular from A to l. Then the distance between A and l is ∣$\overrightarrow{AB}$∣.
Again, we can use what we learnt about the rejection vector to find the distance between a point and a line:
Corollary 14. Suppose A is a point, l is the line described by r = $\overrightarrow{OP}$ + λv (λ ∈ R), and d is the distance between A and l. Then:
\[
d = |\overrightarrow{PA} \times \hat{\mathbf{v}}|.
\]
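We’ll use the exact same three methods as before to find the foot of the perpendicular and the distance between a point and a line (in 3D space). Here are two examples:
\[
\mathbf{r} = \overrightarrow{OP} + \lambda \mathbf{v} = (0, 1, 2) + \lambda (9, 1, 3) \quad (\lambda \in \mathbb{R}).
\]
Method 1 (Formula Method). First, $\overrightarrow{PA}$ = (1, 2, 3) − (0, 1, 2) = (1, 1, 1). So:
\[
\overrightarrow{PB} = \operatorname{proj}_{\mathbf{v}} \overrightarrow{PA} = \operatorname{proj}_{(9,1,3)} (1, 1, 1) = \frac{(1, 1, 1) \cdot (9, 1, 3)}{9^2 + 1^2 + 3^2} (9, 1, 3) = \frac{9 + 1 + 3}{91} (9, 1, 3) = \frac{13}{91} (9, 1, 3) = \frac{1}{7} (9, 1, 3).
\]
And so by Fact 86, the foot of the perpendicular from A to l is:
\[
B = P + \operatorname{proj}_{\mathbf{v}} \overrightarrow{PA} = (0, 1, 2) + \frac{1}{7} (9, 1, 3) = \frac{1}{7} (9, 8, 17).
\]
[Figure: the line through P = (0, 1, 2) with unit direction vector v̂, the point A = (1, 2, 3), the foot of the perpendicular B, and the distance 2√(2/7) between A and B.]
By Corollary 14, the distance between A and l is:
\[
|\overrightarrow{BA}| = |\overrightarrow{PA} \times \hat{\mathbf{v}}| = \left| (1, 1, 1) \times \frac{(9, 1, 3)}{\sqrt{9^2 + 1^2 + 3^2}} \right| = \left| \frac{(1 \cdot 3 - 1 \cdot 1,\ 1 \cdot 9 - 1 \cdot 3,\ 1 \cdot 1 - 1 \cdot 9)}{\sqrt{91}} \right|
= \left| \frac{(2, 6, -8)}{\sqrt{91}} \right| = 2 \left| \frac{(1, 3, -4)}{\sqrt{91}} \right| = 2 \sqrt{\frac{1^2 + 3^2 + (-4)^2}{91}} = 2 \sqrt{\frac{26}{91}} = 2 \sqrt{\frac{2}{7}}.
\]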
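Method 2 (Perpendicular Method). Let B = (0, 1, 2) + λ̃ (9, 1, 3), so that $\overrightarrow{AB}$ = B − A = (9λ̃ − 1, λ̃ − 1, 3λ̃ − 1).
Since $\overrightarrow{AB}$ ⊥ l, we have $\overrightarrow{AB}$ ⊥ v or:
\[
0 = \overrightarrow{AB} \cdot \begin{pmatrix} 9 \\ 1 \\ 3 \end{pmatrix} = \begin{pmatrix} 9\tilde{\lambda} - 1 \\ \tilde{\lambda} - 1 \\ 3\tilde{\lambda} - 1 \end{pmatrix} \cdot \begin{pmatrix} 9 \\ 1 \\ 3 \end{pmatrix} = 9 (9\tilde{\lambda} - 1) + (\tilde{\lambda} - 1) + 3 (3\tilde{\lambda} - 1) = 91\tilde{\lambda} - 13.
\]
So λ̃ = 13/91 = 1/7 and B = (0, 1, 2) + (1/7)(9, 1, 3) = (1/7)(9, 8, 17). Hence:
\[
\overrightarrow{AB} = B - A = \frac{1}{7} (9, 8, 17) - (1, 2, 3) = \frac{1}{7} (2, -6, -4) = \frac{2}{7} (1, -3, -2).
\]
Thus, the distance between A and l is:
\[
|\overrightarrow{AB}| = \left| \frac{2}{7} (1, -3, -2) \right| = \frac{2}{7} \sqrt{1^2 + (-3)^2 + (-2)^2} = \frac{2}{7} \sqrt{14} = 2 \sqrt{\frac{2}{7}}.
\]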
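Method 3 (or the Calculus Method). Let R be a generic point on l, so that $\overrightarrow{AR}$ = (9λ − 1, λ − 1, 3λ − 1) and the distance between A and R is:
\[
|\overrightarrow{AR}| = \sqrt{(9\lambda - 1)^2 + (\lambda - 1)^2 + (3\lambda - 1)^2} = \sqrt{91\lambda^2 - 26\lambda + 3}.
\]
Differentiate the expression under the square root with respect to λ:
\[
\frac{d}{d\lambda} (91\lambda^2 - 26\lambda + 3) = 182\lambda - 26.
\]
Then by the First Order Condition (FOC), we have:
\[
(182\lambda - 26) \big|_{\lambda = \tilde{\lambda}} = 0 \quad \text{or} \quad \tilde{\lambda} = \frac{26}{182} = \frac{1}{7}.
\]
Happily, this is the same as what we found in Method 2. And now, as before, we can find B and ∣$\overrightarrow{AB}$∣. Alternatively, we could simply have found λ̃ by using “−b/2a”:
\[
\tilde{\lambda} = \text{“}{-b/2a}\text{”} = -\frac{-26}{2 \cdot 91} = \frac{1}{7}.
\]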
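Method 1 translates directly into a few lines of Python. In this sketch the names `foot_of_perpendicular` and `distance_point_to_line` are our own, and integer coordinates are assumed so that `Fraction` keeps the foot exact. The distance is returned as a floating-point number.

```python
from fractions import Fraction
import math

def dot(a, b):
    """Scalar product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

def foot_of_perpendicular(a, p, v):
    """Foot of the perpendicular from the point a to the line r = p + lam * v (Fact 86)."""
    pa = tuple(ai - pi for ai, pi in zip(a, p))
    lam = Fraction(dot(pa, v), dot(v, v))  # proj_v(PA) = lam * v
    return tuple(pi + lam * vi for pi, vi in zip(p, v))

def distance_point_to_line(a, p, v):
    """Distance from a to the line, as the length of AB where B is the foot."""
    b = foot_of_perpendicular(a, p, v)
    ab = tuple(bi - ai for ai, bi in zip(a, b))
    return math.sqrt(dot(ab, ab))

print(foot_of_perpendicular((1, 2, 3), (0, 1, 2), (9, 1, 3)))
print(distance_point_to_line((1, 2, 3), (0, 1, 2), (9, 1, 3)))
```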
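\[
\mathbf{r} = \overrightarrow{OP} + \lambda \mathbf{v} = (3, 2, 1) + \lambda (5, 1, 2) \quad (\lambda \in \mathbb{R}).
\]
Method 1 (Formula Method). First, $\overrightarrow{PA}$ = (−1, 0, 1) − (3, 2, 1) = (−4, −2, 0). So:
\[
\overrightarrow{PB} = \operatorname{proj}_{\mathbf{v}} \overrightarrow{PA} = \frac{(-4, -2, 0) \cdot (5, 1, 2)}{5^2 + 1^2 + 2^2} (5, 1, 2) = -\frac{22}{30} (5, 1, 2) = -\frac{11}{15} (5, 1, 2),
\]
and by Fact 86, the foot of the perpendicular from A to l is B = (3, 2, 1) − (11/15)(5, 1, 2) = (1/15)(−10, 19, −7).
[Figure: the line through P = (3, 2, 1) with unit direction vector v̂, the point A = (−1, 0, 1), the foot of the perpendicular B, and the distance √(58/15) between A and B.]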
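Method 2 (Perpendicular Method). Let B = (3, 2, 1) + λ̃ (5, 1, 2). Write down $\overrightarrow{AB}$:
\[
\overrightarrow{AB} = B - A = (3, 2, 1) + \tilde{\lambda} (5, 1, 2) - (-1, 0, 1) = (5\tilde{\lambda} + 4,\ \tilde{\lambda} + 2,\ 2\tilde{\lambda}).
\]
Since $\overrightarrow{AB}$ ⊥ l, we have $\overrightarrow{AB}$ ⊥ v or:
\[
0 = \overrightarrow{AB} \cdot (5, 1, 2) = 5 (5\tilde{\lambda} + 4) + (\tilde{\lambda} + 2) + 2 (2\tilde{\lambda}) = 30\tilde{\lambda} + 22.
\]
So λ̃ = −11/15 and B = (3, 2, 1) − (11/15)(5, 1, 2) = (1/15)(−10, 19, −7). Hence:
\[
\overrightarrow{AB} = B - A = \frac{1}{15} (-10, 19, -7) - (-1, 0, 1) = \frac{1}{15} (5, 19, -22).
\]
Thus, the distance between A and l is:
\[
|\overrightarrow{AB}| = \left| \frac{1}{15} (5, 19, -22) \right| = \frac{1}{15} \sqrt{5^2 + 19^2 + (-22)^2} = \frac{\sqrt{870}}{15} = \sqrt{\frac{58}{15}}.
\]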
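Method 3 (or the Calculus Method). Let R be a generic point on l, so that $\overrightarrow{AR}$ = (5λ + 4, λ + 2, 2λ) and the distance between A and R is:
\[
|\overrightarrow{AR}| = \sqrt{(5\lambda + 4)^2 + (\lambda + 2)^2 + (2\lambda)^2} = \sqrt{30\lambda^2 + 44\lambda + 20}.
\]
Differentiate the expression under the square root with respect to λ:
\[
\frac{d}{d\lambda} (30\lambda^2 + 44\lambda + 20) = 60\lambda + 44.
\]
Then by the First Order Condition (FOC), we have:
\[
(60\lambda + 44) \big|_{\lambda = \tilde{\lambda}} = 0 \quad \text{or} \quad \tilde{\lambda} = -\frac{44}{60} = -\frac{11}{15}.
\]
Happily, this is the same as what we found in Method 2. And now, as before, we can find B and ∣$\overrightarrow{AB}$∣. Alternatively, we could simply have found λ̃ by using “−b/2a”:
\[
\tilde{\lambda} = \text{“}{-b/2a}\text{”} = -\frac{44}{2 \cdot 30} = -\frac{11}{15}.
\]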
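The “−b/2a” shortcut in Method 3 can also be sketched in code. Writing $\overrightarrow{AR}$ = (P − A) + λv, the squared distance is the quadratic (v ⋅ v)λ² + 2((P − A) ⋅ v)λ + (P − A) ⋅ (P − A). The function names below are our own, and integer coordinates are assumed.

```python
from fractions import Fraction

def squared_distance_quadratic(a, p, v):
    """Coefficients (qa, qb, qc) of |AR|^2 = qa*lam^2 + qb*lam + qc for R = p + lam*v."""
    w = tuple(pi - ai for ai, pi in zip(a, p))  # the vector from A to P
    qa = sum(vi * vi for vi in v)
    qb = 2 * sum(wi * vi for wi, vi in zip(w, v))
    qc = sum(wi * wi for wi in w)
    return qa, qb, qc

def minimising_lambda(a, p, v):
    """Value of lam at the vertex of the quadratic, i.e. -b/2a."""
    qa, qb, _ = squared_distance_quadratic(a, p, v)
    return Fraction(-qb, 2 * qa)

print(squared_distance_quadratic((-1, 0, 1), (3, 2, 1), (5, 1, 2)))
print(minimising_lambda((-1, 0, 1), (3, 2, 1), (5, 1, 2)))
```

Because the coefficient of λ² is v ⋅ v > 0, the vertex is a minimum, which is why the First Order Condition alone suffices here.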
Exercise 223. For each of the following, use all three methods you just learnt to find
the foot of the perpendicular from A to l; and the distance between A and l.
The point A The line l Answer on p.
(a) (7, 3, 4) r = (8, 3, 4) + λ (9, 3, 7) 1495.
(b) (8, 0, 2) Contains the points (4, 4, 3) and (6, 11, 5) 1496.
(c) (8, 5, 9) r = (8, 4, 5) + λ (5, 6, 0) 1497.
[Figure: the plane q described by x = 1, which contains the points A = (1, 0, 0), B = (1, 1, 1), and C = (1, 3, 1).]
q = {(x, y, z) ∶ x = 1} .
In words, q is the set containing exactly those points (x, y, z) whose x-coordinate is 1.
You should take a moment to convince yourself that the plane q, which is the set of points
whose x-coordinates are 1, does indeed form a “flat 2D surface”.
x = 1, x = 3, and x = 5.
Later on, we will learn what it means for two planes to be parallel and how to calculate
the distance between two planes. But for now, we merely assert that “obviously”:
• The three planes are parallel.
• The distance between the first and second planes is 2.
• The distance between the second and third planes is also 2.
Example 685. Consider the plane q described by the cartesian equation y = 2x.
It is the set of points (x, y, z) that satisfies the equation y = 2x. Formally:
q = {(x, y, z) ∶ y = 2x}.
[Figure: the plane q, showing the lines y = 2x, z = 0 and y = 2x, z = 3, together with the labelled points B = (−1, −2, 0), C = (1, 0, 3), and D = (−1, 2, 0).]
240
Here’s a proof of this assertion. Consider the line y = 2x, z = k. Let P be any point on the line. Observe
that P obviously satisfies the plane’s equation y = 2x. Thus, P ∈ q. We have just shown that any
arbitrary point P on the line is also on q. Therefore, q contains the line.
Example 686. Consider the plane q described by the cartesian equation x + y = z.
It is the set of points (x, y, z) that satisfies the equation x + y = z. Formally:
q = {(x, y, z) ∶ x + y = z}.
[Figure: the plane q, with the origin O and the labelled points A = (1, 2, 3), B = (−1, 1, 0), and D = (−1, 2, 0).]
As the above examples suggest, it turns out that in general, any plane q is simply the
graph of the following cartesian equation:
ax + by + cz = d,
q = {(x, y, z) ∶ ax + by + cz = d} .
In the coming chapters, we will explain why a plane may be described by the above cartesian
equation. We will also learn what the vector (a, b, c) and the number d mean geometrically.
x = 4.
[Figure: the plane q: x = 4.]
Now suppose we also impose the constraint y = 5. That is, we take the plane q, but keep
only those points on q that satisfy the equation y = 5 (and “throw away” all other points).
This gives us the line l. We say that the line l is described by two equations:
x=4 and y = 5.
x = 4, y = 5, and z = 6.
x = 4.
[Figure: the line l: x = 4, and the point P = (4, 5) described by x = 4, y = 5.]
Now suppose we also impose the constraint y = 5. That is, we take the line l, but keep
only those points on l that satisfy the equation y = 5 (and “throw away” all other points).
This gives us the point P = (4, 5). We say that the point P is described by two equations:
x=4 and y = 5.
Example 689. Let q be the plane that contains the points A = (1, 0, 0), B = (0, 1, 0), and C = (0, 0, 1). Informally, a plane is a “flat surface”.
And since it is a “flat surface”, there must be some vector that is perpendicular to it. We will call any such vector a normal vector of the plane.241
To find a normal vector of q, all we need do is pick any two vectors on q and compute their vector product.
Let’s pick, say, $\overrightarrow{AB}$ = (−1, 1, 0) and $\overrightarrow{AC}$ = (−1, 0, 1). Their vector product (which we’ll also denote n) is:
\[
\overrightarrow{AB} \times \overrightarrow{AC} = \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} \times \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \mathbf{n}.
\]
[Figure: the plane q through A, B, and C, with the vectors $\overrightarrow{AB}$ = (−1, 1, 0) and $\overrightarrow{AC}$ = (−1, 0, 1) and the normal vector n = (1, 1, 1).]
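As we learnt in Ch. 49, the vector product n = (1, 1, 1) must be perpendicular to both $\overrightarrow{AB}$ and $\overrightarrow{AC}$. It turns out that n is also perpendicular to every vector on q (we’ll formally state and prove this as Fact 101 below). And so, we call n a normal vector of q.
Now, let R denote a generic point on q. Then the vector $\overrightarrow{AR}$ is on q. Which means:
\[
\overrightarrow{AR} \perp \mathbf{n} = (1, 1, 1), \quad \text{or equivalently,} \quad \overrightarrow{AR} \cdot (1, 1, 1) = 0.
\]
Since $\overrightarrow{AR}$ = $\overrightarrow{OR}$ − $\overrightarrow{OA}$, this is the same as $\overrightarrow{OR}$ ⋅ (1, 1, 1) = $\overrightarrow{OA}$ ⋅ (1, 1, 1) = 1.
In words, q is the set containing exactly those points R that satisfy $\overrightarrow{OR}$ ⋅ (1, 1, 1) = 1.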
241
We’ll formally define what a normal vector of a plane is in Definition 143.
If we let r denote the position vector of the generic point R, then here is another vector
equation that also describes q:
r ⋅ (1, 1, 1) = 1.
q = {R ∶ r ⋅ (1, 1, 1) = 1}.
In words, q is the set containing exactly those points R that satisfy r ⋅ (1, 1, 1) = 1.
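The recipe in this example (take two vectors on the plane, compute their vector product to get n, then compute d = $\overrightarrow{OA}$ ⋅ n) can be sketched as follows. The name `plane_through` is our own, and the sketch assumes the three points are not collinear.

```python
def cross(a, b):
    """Vector product of two 3D vectors (Definition 140)."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def dot(a, b):
    """Scalar product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

def plane_through(a, b, c):
    """Return (n, d) such that the plane through a, b, c is described by r . n = d."""
    ab = tuple(bi - ai for ai, bi in zip(a, b))
    ac = tuple(ci - ai for ai, ci in zip(a, c))
    n = cross(ab, ac)  # a normal vector of the plane
    return n, dot(a, n)

print(plane_through((1, 0, 0), (0, 1, 0), (0, 0, 1)))
```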
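Definition 141. A plane is any set of points that can be written as:
\[
\{R : \overrightarrow{OR} \cdot \mathbf{n} = d\} \quad \text{or} \quad \{R : \mathbf{r} \cdot \mathbf{n} = d\},
\]
In words, a plane is the set containing exactly those points R that satisfy:
\[
\overrightarrow{OR} \cdot \mathbf{n} = d \quad \text{or} \quad \mathbf{r} \cdot \mathbf{n} = d.
\]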
A little less formally, we will simply say that a plane is described by either of the above
vector equations.
By the way, in the above example, we spoke of vectors being on a plane. It’s probably a good idea to formally and precisely define what this means:
Definition 142. A vector v is on a plane q if there are points S, T ∈ q such that v = $\overrightarrow{ST}$.
And if v is on q, then for the sake of convenience, we will sometimes be sloppy and say that q contains v.242
242
I say that this is sloppy because strictly speaking, it is wrong to say that a plane q contains a vector v.
A plane contains points and not vectors. Nonetheless, for the sake of convenience, we will often simply
(and incorrectly) say that a plane contains a vector.
Example 690. Consider the plane q = {R ∶ $\overrightarrow{OR}$ ⋅ (−3, 0, 2) = −5}.
It contains the point A = (1, 0, −1) because A satisfies the plane’s vector equation:
$\overrightarrow{OA}$ ⋅ (−3, 0, 2) = (1, 0, −1) ⋅ (−3, 0, 2) = −3 + 0 − 2 = −5. ✓
[Figure: the plane q with normal vector (−3, 0, 2) and the points A = (1, 0, −1), B = (3, 1, 2), and C = (9, 1, 1).]
Since q contains the points A and B, the vector $\overrightarrow{AB}$ is on q. (If we were being sloppy, we’d instead say that q contains the vector $\overrightarrow{AB}$.)
Now, are the vectors $\overrightarrow{AC}$ and $\overrightarrow{BC}$ on q? As you may have guessed, the answer is, “No, they are not.” But to prove this, we’ll have to wait until Fact 100 below.243
Exercise 225. Consider the plane q = {R ∶ $\overrightarrow{OR}$ ⋅ (−5, 7, 3) = −1}. Does q contain the points A = (5, −3, 1), B = (1, −2, 6), and C = (−2, 2, −3)? (Answer on p. 1499.)
243
Here is an incorrect “proof”: “C is not on q. Therefore, the vectors $\overrightarrow{AC}$ and $\overrightarrow{BC}$ are not on q.” This proof is incorrect because in order to prove that, say, $\overrightarrow{AC}$ is not on q, we need to prove that given any two points P and Q on q, $\overrightarrow{AC}$ ≠ $\overrightarrow{PQ}$. The mere observation that C is not on q does not suffice.
Our first result about planes is simple and intuitively “obvious”:
Fact 95. If a plane contains two distinct points, then it also contains the line through
those two points.
[Figure: the plane q.]
Example 691. In the last example, we verified that the plane q = {R ∶ $\overrightarrow{OR}$ ⋅ (−3, 0, 2) = −5} contains the points A = (1, 0, −1) and B = (3, 1, 2). By the above Fact then, q also contains the line AB. That is, any point on the line AB is also on the plane q.
Example 692. The plane q is described by r ⋅ (4, 1, 5) = 0 and the line l is described by
\[
\mathbf{r} = \overrightarrow{OA} + \lambda \mathbf{v} = (-1, -1, 1) + \lambda (5, 0, -4) \quad (\lambda \in \mathbb{R}).
\]
It turns out that the plane q contains the line l. Here are two ways to show this:
Method 1. Observe that l contains the points A = (−1, −1, 1) and B = A + v = (−1, −1, 1) + (5, 0, −4) = (4, −1, −3).
But the plane q also contains A and B (as you should be able to verify). And so, by the above Fact, q contains the line AB, which is also the line l.
Method 2. Let R = (−1, −1, 1) + λ (5, 0, −4) be a generic point on the line l. We show that R satisfies q’s vector equation:
\[
\begin{aligned}
\overrightarrow{OR} \cdot (4, 1, 5) &= [(-1, -1, 1) + \lambda (5, 0, -4)] \cdot (4, 1, 5) \\
&= 4 (-1 + 5\lambda) + (-1) + 5 (1 - 4\lambda) \\
&= -4 + 20\lambda - 1 + 5 - 20\lambda = 0.
\end{aligned}
\]
[Figure: the plane q and the line l.]
We’ve just shown that q contains any point R on l. That is, q contains l.
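Method 1 is mechanical enough to sketch in code: pick two distinct points on the line and test each against the plane’s vector equation. The names `on_plane` and `plane_contains_line` are our own, and the sketch assumes the direction vector v is non-zero.

```python
def dot(a, b):
    """Scalar product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

def on_plane(point, n, d):
    """Does the point satisfy the plane's vector equation r . n = d?"""
    return dot(point, n) == d

def plane_contains_line(n, d, a, v):
    """Does the plane r . n = d contain the line r = a + lam * v?

    Uses Fact 95: it suffices that two distinct points of the line are on the plane.
    """
    b = tuple(ai + vi for ai, vi in zip(a, v))  # a second point on the line
    return on_plane(a, n, d) and on_plane(b, n, d)

print(plane_contains_line((4, 1, 5), 0, (-1, -1, 1), (5, 0, -4)))
```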
Exercise 226. Suppose the plane q is described by r ⋅ (4, −3, 2) = −10, while the line l is
described by r = (7, 3, 1) + λ (3, 6, −2). Determine if q contains l. (Answer on p. 1499.)
More simply, instead of saying, “n is a normal vector of the plane q,” we’ll also say, “n is
normal to q”. And as shorthand, we’ll write n ⊥ q.
Not surprisingly, the vector n used in Definition 141 of the plane is a normal vector:
Fact 96. If q = {R ∶ $\overrightarrow{OR}$ ⋅ n = d} is a plane, then n ⊥ q.
The vectors m = 2n = (2, 2, 2), u = −1.5n = (−1.5, −1.5, −1.5), and v = √5 n = (√5, √5, √5) are parallel to n = (1, 1, 1). And so, they are “obviously” also normal vectors of q.
[Figure: the plane q with the normal vectors n, m, u, and v.]
As the above example suggests, if n is a normal vector of the plane q, then “obviously”, so
too is any vector m that’s parallel to n. Formally:
m ∥ n Ô⇒ m ⊥ q.
m⊥q Ô⇒ m ∥ n.
Fact 97 says that a normal vector n of a plane q is not unique — any vector parallel to
n is also a normal vector of q. Theorem 9 then says the converse: only vectors that are
parallel to n are normal vectors of q. Putting these two results together, we have:
m⊥q ⇐⇒ m ∥ n.
In other words, if n is a plane’s normal vector, then that plane’s normal vectors are exactly
those which are parallel to n.
Conversely ( Ô⇒ of Corollary 15), every normal vector of q must be parallel to (1, 1, 1).
And so for example, the vectors a = (1, 2, 3), b = (0, −1, 2), c = (1, 1, 0.9), and 0 are not
parallel to (1, 1, 1) and are thus not normal to q.
The vector d = (1, −1, 0) is on the plane q and, as depicted, d ⊥ n, u, v, but d ⊥/ a, b.
Exercise 227. The plane q is described by r ⋅ (1, −1, 1) = −2. Determine if a = (2, −2, 2), b = (2, 2, −2), and c = (−√2, √2, −√2) are normal vectors of q. (Answer on p. 1499.)
\[
\mathbf{r} \cdot \overbrace{\mathbf{m}}^{k\mathbf{n}} = kd.
\]
Example 696. The plane q described by r ⋅ (1, 2, 3) = 4 can also be described by:
r ⋅ (2, 4, 6) = 8, r ⋅ (−1, −2, −3) = −4, or r ⋅ (√5, 2√5, 3√5) = 4√5.
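\[
\mathbf{n} = \overrightarrow{AB} \times \overrightarrow{AC} = \begin{pmatrix} -21 \\ 13 \\ 61 \end{pmatrix}.
\]
[Figure: the plane q, with the point C = (3, −5, 4) and the vector $\overrightarrow{AC}$ = (−6, −5, −1).]
Compute d = $\overrightarrow{OA}$ ⋅ n = (9, 0, 5) ⋅ (−21, 13, 61) = −189 + 0 + 305 = 116.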
Thus, q may be described by r ⋅ (−21, 13, 61) = 116.
Another normal vector of q is 2(−21, 13, 61) = (−42, 26, 122). And so, q may also be
described by r ⋅ (−42, 26, 122) = 2 ⋅ 116 = 232.
Exercise 228. The plane q contains the points A = (1, −1, 2), B = (−2, 3, 0), and C =
(0, −1, 1). (Answer on p. 1499.)
(a) Find a normal vector of q.
(b) Hence write down a vector equation that describes the plane q.
(c) Write down another normal vector of q.
(d) Hence write down another vector equation that describes the plane q.
Fact 99. Let q be a plane with normal vector n. Suppose v is a vector. Then:
v⊥n Ô⇒ v is on q.
Putting Definition 143 and Fact 99 together, a plane’s vectors are exactly those that are
perpendicular to its normal vector:
Corollary 16. Let v be a vector and q be a plane with normal vector n. Then:
v⊥n ⇐⇒ v is on q.
Example 698. Let q be the plane described by r ⋅ (5, 1, 6) = −3. As you should be able
to verify, it contains the points A = (−1, 2, 0), B = (−2, 1, 1), and C = (3, 0, −3).
Since A, B, and C are on q, so too are the vectors $\overrightarrow{AB}$ = (−1, −1, 1) and $\overrightarrow{AC}$ = (4, −2, −3).
Hence, both $\overrightarrow{AB}$ and $\overrightarrow{AC}$ should be perpendicular to the normal vector n = (5, 1, 6). Let’s verify that this is so:
$\overrightarrow{AB}$ ⋅ n = (−1, −1, 1) ⋅ (5, 1, 6) = −5 − 1 + 6 = 0. ✓
$\overrightarrow{AC}$ ⋅ n = (4, −2, −3) ⋅ (5, 1, 6) = 20 − 2 − 18 = 0. ✓
So yup, u is on q.245
Let’s now consider the vector v = (−7, 2, 3). Is it on q? Again, simply check if v ⊥ n.
So nope, v is not on q.
Exercise 229. Let q be the plane described by r ⋅ (8, −2, 1) = 5. Are the vectors a =
(3, 7, −5), b = (1, 6, 4), and c = (3, 10, 1) on q? (Answer on p. 1499.)
245
For the doubtful reader, let D = A + u = (−1, 2, 0) + (−3, 3, 2) = (−4, 5, 2). We can verify that D ∈ q. And now, since A, D ∈ q, by Definition 142, the vector $\overrightarrow{AD}$ = u is on q.
Suppose the plane q contains the point P . Then q contains exactly those points R for which the vector $\overrightarrow{PR}$ is on q. Formally:
Example 699. The plane q = {R ∶ OR ⋅ (0, −2, 3) = 1}
contains the point A = (0, 1, 1). The point B is such
Ð→
that AB = (1, 3, 2).
Here are two methods for showing that B ∈ q. B
y
Ð→
Method 1. First find B = A + AB = (0, 1, 1) +
(1, 3, 2) = (1, 4, 3). Then show that B satisfies the Ð→
AB
plane’s vector equation:
Ð→
OB ⋅ n = (1, 4, 3) ⋅ (0, −2, 3) = 0 − 8 + 9 = 1.
A
3
Ð→ z n
Method 2. Simply check if AB ⊥ n:
x
Ð→
AB ⋅ n = (1, 3, 2) ⋅ (0, −2, 3) = 0 − 6 + 6 = 0. 3
Ð→
Yup, AB ⊥ n. And so by Fact 100, B ∈ q.
Exercise 230. The plane q is described by r ⋅ (7, −1, 3) = 19 and A = (1, 4, −1) is a point. The point B is such that $\overrightarrow{AB}$ = (7, 3, −2). Is the point B on q? (Answer on p. 1499.)
Corollary 18. Suppose a and b are non-parallel vectors on the plane q. Then
c⊥a×b ⇐⇒ c is on q.
Example 700. The plane q contains the points A = (0, 0, 1), B = (4, 2, 0), and C =
(−5, 0, 4).
Aisha claims that v = (6, −7, 10) is normal to q. Let’s check if she’s correct:
First, write down two non-parallel vectors on q. Two obvious candidates are:
$\overrightarrow{AB}$ = (4, 2, −1) and $\overrightarrow{AC}$ = (−5, 0, 3).
Then check if v ⊥ $\overrightarrow{AB}$, $\overrightarrow{AC}$:
v ⋅ $\overrightarrow{AB}$ = (6, −7, 10) ⋅ (4, 2, −1) = 24 − 14 − 10 = 0, ✓
v ⋅ $\overrightarrow{AC}$ = (6, −7, 10) ⋅ (−5, 0, 3) = −30 + 0 + 30 = 0. ✓
Since $\overrightarrow{AB}$ ∦ $\overrightarrow{AC}$ and v ⊥ $\overrightarrow{AB}$, $\overrightarrow{AC}$, by Corollary 17, v ⊥ q and Aisha is correct.
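Aisha’s claim can be checked mechanically. The sketch below (the name `is_normal_to_plane` is our own) tests a candidate vector against two vectors on the plane. It refuses to answer when those two vectors are parallel, because the check is then inconclusive.

```python
def cross(a, b):
    """Vector product of two 3D vectors (Definition 140)."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def dot(a, b):
    """Scalar product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

def is_normal_to_plane(v, a, b):
    """Is the non-zero vector v normal to the plane on which the vectors a and b lie?"""
    if cross(a, b) == (0, 0, 0):
        raise ValueError("a and b must be non-parallel")
    # v must be non-zero and perpendicular to both a and b.
    return any(v) and dot(v, a) == 0 and dot(v, b) == 0

print(is_normal_to_plane((6, -7, 10), (4, 2, -1), (-5, 0, 3)))
```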
Exercise 231. The vectors a = (1, −1, 1) and b = (−2, 2, −2) are on the plane q. Is
n = (0, 1, 2) a normal vector of q? What about m = (1, 3, 2)? (Answer on p. 1499.)
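\[
\mathbf{r} \cdot \mathbf{n} = \mathbf{r} \cdot (1, 2, 3) \overset{v}{=} 4.
\]
The plane q contains those points R = (x, y, z) whose position vector satisfies $\overset{v}{=}$.
\[
x + 2y + 3z \overset{c}{=} 4.
\]
\[
\mathbf{r} \cdot \mathbf{n} = (x, y, z) \cdot (a, b, c) \overset{v}{=} d.
\]
\[
ax + by + cz \overset{c}{=} d.
\]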
Formally:
Example 702. The plane described by r ⋅ (5, 0, −1) = 3 may also be described by:
5x − z = 3.
Example 703. The plane described by r ⋅ (−1, 7, 2) = 0 may also be described by:
−x + 7y + 2z = 0.
r ⋅ (5, 6, 7) = 8.
r ⋅ (0, 1, 0) = 5.
Fact 103. The plane described by r ⋅ (a, b, c) = d contains the origin if and only if d = 0.
Proof. The origin is the point (x, y, z) = (0, 0, 0) and satisfies the equation r ⋅ (a, b, c) = d or
ax + by + cz = d if and only if d = 0. Thus, the plane described by r ⋅ (a, b, c) = d contains the
origin if and only if d = 0.
ax + by + cz = 0 or r ⋅ (a, b, c) = 0.
Even if we don’t know what a, b, and c are, we know that q contains the origin.
ax + by + cz = 8 or r ⋅ (a, b, c) = 8.
Even if we don’t know what a, b, and c are, we know that q does not contain the origin.
Exercise 232. Each of the following is a plane given in vector form. Rewrite each in
cartesian form and state if each contains the origin. (Answer on p. 1500.)
(a) r ⋅ (1, 2, 3) = 17. (b) r ⋅ (−1, 0, −2) = 0. (c) r ⋅ (0, −2, 5) = −3.
Exercise 233. Each of the following is a plane given in cartesian form. Rewrite each in
vector form and state if each contains the origin. (Answer on p. 1500.)
Example 708. Consider the plane described in vector or cartesian form by:
r ⋅ (1, 2, 3) = 4 or x + 2y + 3z = 4.
Given a plane’s cartesian equation, we can easily use trial-and-error to find points on that
plane: Simply try out values of x, y, and z that satisfy the cartesian equation. (Tip: As
always, zero is our friend.)
So for example, the following points are on the given plane (as you should verify yourself):
In contrast, the following points are not on the given plane, because they do not satisfy
x + 2y + 3z = 4 (as you should verify yourself):
Example 709. Consider the plane described in vector or cartesian form by:
r ⋅ (3, 1, 1) = −4 or 3x + y + z = −4.
Example 710. Consider the plane described in vector or cartesian form by:
r ⋅ (−5, 1, 0) = 1 or −5x + y = 1.
Exercise 234. Below are given three planes in vector form. First rewrite each plane in
cartesian form. Then find three points that are on each plane and another three points
that are not. (Answer on p. 1500.)
r ⋅ (1, 2, 3) = 4 or x + 2y + 3z = 4.
Its normal vector is: n = (a, b, c) = (1, 2, 3).
Recall (Corollary 16) that a vector is on q if and only if it is perpendicular to n. So,
to find vectors on q, we need simply find vectors that are perpendicular to n — that is,
vectors whose scalar product with n is zero.
We will now construct one such vector u = (u1 , u2 , u3 ). That is, we’ll pick values of u1 ,
u2 , and u3 so that u ⋅ n = 0.
As always, zero is our friend. Let’s start by picking u3 = 0, so that:
u = (u1 , u2 , 0).
It is equally easy to show that a vector is not on q. For example, e = (3, 2, 1) and
f = (1, −1, 1) are not on q because e ⋅ n ≠ 0 and f ⋅ n ≠ 0 (as you can verify).
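As a rough sketch, the "zero is our friend" construction can be automated: for a normal vector n = (a, b, c), set one coordinate to zero and swap the other two with a sign change. The function name `easy_perpendiculars` is made up for this illustration.

```python
def dot(u, v):
    """Scalar product of two vectors."""
    return sum(x * y for x, y in zip(u, v))

def easy_perpendiculars(n):
    """Vectors (b, -a, 0), (c, 0, -a), (0, c, -b) and their negatives."""
    a, b, c = n
    base = [(b, -a, 0), (c, 0, -a), (0, c, -b)]
    return base + [tuple(-x for x in v) for v in base]

n = (3, 1, 1)
vectors = easy_perpendiculars(n)
print(vectors)
print(all(dot(v, n) == 0 for v in vectors))  # every one is perpendicular to n
```

Note that if two of a, b, and c are zero, some of these six vectors collapse to the zero vector, so they are not always six useful directions.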
(b, −a, 0), (c, 0, −a), (0, c, −b), (−b, a, 0), (−c, 0, a), and (0, −c, b).
(b, −a, 0) = (1, −3, 0), (c, 0, −a) = (1, 0, −3), (0, c, −b) = (0, 1, −1),
(−b, a, 0) = (−1, 3, 0), (−c, 0, a) = (−1, 0, 3), (0, −c, b) = (0, −1, 1).
We can also easily find other vectors on q. For example, d = (−1, 1, 2) is on q because:
d ⋅ n = (−1, 1, 2) ⋅ (3, 1, 1) = −3 + 1 + 2 = 0.
(b, −a, 0) = (1, 5, 0), (c, 0, −a) = (0, 0, 5), (0, c, −b) = (0, 0, −1),
(−b, a, 0) = (−1, −5, 0), (−c, 0, a) = (0, 0, −5), (0, −c, b) = (0, 0, 1).
Actually, here we can make another useful and important observation. Notice that n’s
z-coordinate is 0. And so, if a vector u = (u1 , u2 , u3 ) is perpendicular to n (and is hence
on q), then so too is the vector (u1 , u2 , λ) for any value of λ.
So for example, since (1, 5, 0) is on q, so too are the following vectors:
(1, 5, 0), (1, 5, 1), (1, 5, −√2), (1, 5, 999), (1, 5, π), etc.
Also, for any value of λ , the vector (0, 0, λ) must be perpendicular to n. In particular,
the standard basis vector k = (0, 0, 1) is perpendicular to n and is thus also on q.
Exercise 235. Find three non-parallel vectors on each plane: (a) r ⋅ (1, −2, 3) = 0; (b)
r ⋅ (5, 3, 1) = −2; (c) r ⋅ (1, 0, 4) = 5; (d) r ⋅ (0, 7, 0) = 32. (Answer on p. 1500.)
c = λa + µb.
It turns out that the same is true of vectors on a plane in 3D space. That is, in 3D space,
a vector is on a plane if and only if it can be written as a LC of two non-parallel vectors on
that plane. Or equivalently, the vectors on a plane q are exactly those that can be written
as a LC of any two non-parallel vectors on q. Formally:
Remark 65. Take care to note that Theorem 10 is an if and only if ( ⇐⇒ ) statement
which says two things. First, ⟸ says:
Proof. First note that since a and b are non-parallel vectors on q, by Fact 101, a × b ⊥ q.
We first prove ⟸. Suppose there exist λ, µ ∈ R such that c = λa + µb. We show that
a × b ⊥ c, so that by Fact 99, c is also a vector on the plane:
As we’ll see in Ch. 54.1, Theorem 10 will allow us to describe planes in parametric form.
But first, let’s better acquaint ourselves with Theorem 10 with some examples:
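\[ a = \lambda u + \mu v \quad\text{or}\quad \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} = \lambda \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix} + \mu \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} \quad\text{or}\quad \begin{aligned} 1 &= \lambda + \mu, && (1) \\ 2 &= -\lambda, && (2) \\ 3 &= -\mu. && (3) \end{aligned} \]
(1) plus (2) yields 3 = µ, which contradicts (3). So, there are no numbers λ and µ such that a = λu + µv.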
Example 716. The plane q described by x − y = 5 has normal vector n = (1, −1, 0).
The vectors u = (0, 0, 1) and v = (1, 1, 0) are perpendicular to n and are thus on q.
Moreover, u ∦ v. And so by Theorem 10, the vectors on q are exactly those that can be
written as a LC of u and v.
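A minimal sketch of how one might test whether a given vector is a LC of u and v: solve two of the three coordinate equations by Cramer's rule, then check the remaining one. The function name `lc_coefficients` is hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def lc_coefficients(w, u, v):
    """Return (lam, mu) with w = lam*u + mu*v, or None if no such pair exists."""
    # Find two coordinates where u and v give an invertible 2x2 system.
    for i, j in [(0, 1), (0, 2), (1, 2)]:
        det = u[i] * v[j] - u[j] * v[i]
        if det != 0:
            lam = Fraction(w[i] * v[j] - w[j] * v[i], det)
            mu = Fraction(u[i] * w[j] - u[j] * w[i], det)
            break
    else:
        return None  # u and v are parallel (or one of them is zero)
    # All three coordinate equations must hold, not just the two we solved.
    ok = all(lam * u[k] + mu * v[k] == w[k] for k in range(3))
    return (lam, mu) if ok else None

u, v = (0, 0, 1), (1, 1, 0)
print(lc_coefficients((1, 1, 1), u, v))  # coefficients exist: (1, 1, 1) is on q
print(lc_coefficients((0, 1, 0), u, v))  # None: (0, 1, 0) is not a LC of u and v
```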
For example, the vector w = (1, 1, 1) is perpendicular to n and is on q. And so by Theorem
10, we should be able to write w as a LC of u and v, as indeed we can:
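\[ a = \lambda u + \mu v \quad\text{or}\quad \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} = \lambda \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} + \mu \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} \quad\text{or}\quad \begin{aligned} 0 &= \mu, && (1) \\ 1 &= \mu, && (2) \\ 0 &= \lambda. && (3) \end{aligned} \]
Clearly, (1) immediately contradicts (2). So, there are no numbers λ and µ such that a = λu + µv.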
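\[ a = \lambda u + \mu v \quad\text{or}\quad \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} = \lambda \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + \mu \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} \quad\text{or}\quad \begin{aligned} 0 &= \lambda, && (1) \\ 1 &= \mu, && (2) \\ 1 &= 0. && (3) \end{aligned} \]
Clearly, (3) is an immediate self-contradiction. So, there are no numbers λ and µ such that a = λu + µv.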
Next, suppose a and b are non-parallel vectors on q. Then Theorem 10 says that a vector
v is on the plane q if and only if it is a LC of a and b. In formal notation:
\[ R \in q \iff \text{There exist } \lambda, \mu \in \mathbb{R} \text{ such that } \overrightarrow{PR} = \lambda a + \mu b. \tag{3} \]
In words: A point R is on the plane q if and only if the vector $\overrightarrow{PR}$ is a LC of a and b.
Or equivalently: The plane q contains exactly those points R for which the vector $\overrightarrow{PR}$ is a
LC of a and b. That is:
\[ q = \{ R : \overrightarrow{PR} = \lambda a + \mu b \ (\lambda, \mu \in \mathbb{R}) \}. \tag{4} \]
We previously learnt how to describe planes in vector and cartesian forms. (4) now gives
us a third way to describe planes and is called the parametric form of a plane.
Let’s write down or summarise the above discussion as a formal result:
Fact 105. Let q be a plane, P be a point, and a and b be non-parallel vectors. Suppose
P , a, and b are on q. Then:
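\[ R \in q \iff \text{There exist } \lambda, \mu \in \mathbb{R} \text{ such that } \overrightarrow{PR} = \lambda a + \mu b. \]
Or equivalently:
\[ q = \{ R : \overrightarrow{PR} = \lambda a + \mu b \ (\lambda, \mu \in \mathbb{R}) \}. \tag{4} \]
Observe that $\overrightarrow{PR} = \overrightarrow{OR} - \overrightarrow{OP}$. And so:
\[ \overrightarrow{PR} = \lambda a + \mu b \iff \overrightarrow{OR} = \overrightarrow{OP} + \lambda a + \mu b. \]
\[ \overbrace{\overrightarrow{OR}}^{\text{Vec.}} = \overbrace{\overrightarrow{OP}}^{\text{Vec.}} + \lambda a + \mu b \quad (\lambda, \mu \in \mathbb{R}). \tag{6} \]
Or equivalently:
\[ \overbrace{R}^{\text{Pt}} = \overbrace{P}^{\text{Pt}} + \lambda a + \mu b \quad (\lambda, \mu \in \mathbb{R}). \tag{7} \]
(7) gives us a nice, informal geometric interpretation. A point R is on the plane q if: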
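\[ \overrightarrow{OR} \overset{(6)}{=} \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + \lambda \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} + \mu \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 - \lambda - \mu \\ \lambda \\ \mu \end{pmatrix} \quad (\lambda, \mu \in \mathbb{R}). \]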
As the parameters λ and µ vary, we get different points on the plane q. For example,
(λ, µ) = (5, −1) produces the point:
Starting from the point P , we can reach the point (5, −1, 3) by taking 1 “step” in the
direction opposite to b, then 5 “steps” in the direction of a.
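The "steps" picture translates directly into code. A small sketch, assuming P = (0, 0, 3), a = (1, 0, 0), and b = (0, 1, 0) as in this example; the helper name `plane_point` is invented here.

```python
def plane_point(P, a, b, lam, mu):
    """Point R = P + lam*a + mu*b on the plane through P containing a and b."""
    return tuple(p + lam * x + mu * y for p, x, y in zip(P, a, b))

P, a, b = (0, 0, 3), (1, 0, 0), (0, 1, 0)
print(plane_point(P, a, b, 5, -1))  # (5, -1, 3): 5 steps along a, 1 step against b
print(plane_point(P, a, b, 0, 1))   # (0, 1, 3): 1 step along b
```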
[Figure: the plane q, showing P = (0, 0, 3), the steps 5a = 5 (1, 0, 0) and −b = − (0, 1, 0), and the point (5, −1, 3).]
Starting from the point P , we can reach the point (0, 1, 3) by taking 1 “step” in the
direction of b.
As the parameters λ and µ vary, we get different points on the plane q. For example,
(λ, µ) = (1, 2) produces the point:
Starting from the point P , we can reach the point (6, 1, −2) by taking 1 “step” in the
direction of a, then 2 “steps” in the direction of b.
[Figure: the plane q, with labels P = (1, 0, 0), a = (3, 1, 0), 2b = 2 (5, 0, −1), and the point (6, 1, −2).]
Starting from the point A, we can reach the point (5, −1, −3) by taking 1 “step” in the
direction opposite to a and 3 “steps” in the direction of b.
Exercise 237. Below are three planes given in vector form. Rewrite each into both
cartesian and parametric forms. (Answer on p. 1502.)
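\[ r = \begin{pmatrix} 7 \\ 3 \\ 4 \end{pmatrix} + \lambda \begin{pmatrix} 8 \\ 3 \\ 4 \end{pmatrix} + \mu \begin{pmatrix} 9 \\ 3 \\ 7 \end{pmatrix} = \begin{pmatrix} 7 + 8\lambda + 9\mu \\ 3 + 3\lambda + 3\mu \\ 4 + 4\lambda + 7\mu \end{pmatrix} \quad (\lambda, \mu \in \mathbb{R}). \]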
This plane contains the vectors (8, 3, 4) and (9, 3, 7). And so, a normal vector of this
plane is the vector product of these two vectors:
(8, 3, 4) × (9, 3, 7) = (9, −20, −3).
The plane contains the point (7, 3, 4). Since (7, 3, 4) ⋅ (9, −20, −3) = 63 − 60 − 12 = −9, the
plane may be described by:
r ⋅ (9, −20, −3) = −9 or 9x − 20y − 3z = −9.
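\[ r = \begin{pmatrix} 17 + 3\lambda - 2\mu \\ 2\mu - 2 \\ 5\lambda \end{pmatrix} \quad (\lambda, \mu \in \mathbb{R}). \]
\[ r = \begin{pmatrix} 17 \\ -2 \\ 0 \end{pmatrix} + \lambda \begin{pmatrix} 3 \\ 0 \\ 5 \end{pmatrix} + \mu \begin{pmatrix} -2 \\ 2 \\ 0 \end{pmatrix} \quad (\lambda, \mu \in \mathbb{R}). \]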
This plane contains the vectors (3, 0, 5) and (−2, 2, 0). And so, a normal vector is:
(3, 0, 5) × (−2, 2, 0) = (−10, −10, 6).
Observe that (−10, −10, 6) ∥ (5, 5, −3). Hence, by Fact 97, (5, 5, −3) is also a normal
vector of the plane. (It’s nice to “simplify” the normal vector as much as possible, as this
will usually make our subsequent calculations slightly easier.)
The plane contains the point (17, −2, 0). Since (17, −2, 0) ⋅ (5, 5, −3) = 85 − 10 + 0 = 75,
the plane may be described by:
r ⋅ (5, 5, −3) = 75 or 5x + 5y − 3z = 75.
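\[ r = \begin{pmatrix} \lambda - \mu - 2 \\ 14 + 5\lambda + 3\mu \\ 5 + \mu + 7\lambda \end{pmatrix} \quad (\lambda, \mu \in \mathbb{R}). \]
\[ r = \begin{pmatrix} -2 \\ 14 \\ 5 \end{pmatrix} + \lambda \begin{pmatrix} 1 \\ 5 \\ 7 \end{pmatrix} + \mu \begin{pmatrix} -1 \\ 3 \\ 1 \end{pmatrix} \quad (\lambda, \mu \in \mathbb{R}). \]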
This plane contains the vectors (1, 5, 7) and (−1, 3, 1). And so, a normal vector is:
(1, 5, 7) × (−1, 3, 1) = (−16, −8, 8).
Observe that (−16, −8, 8) ∥ (2, 1, −1). So, (2, 1, −1) is also a normal vector.
The plane contains the point (−2, 14, 5). Since (−2, 14, 5) ⋅ (2, 1, −1) = −4 + 14 − 5 = 5, the
plane may be described by:
r ⋅ (2, 1, −1) = 5 or 2x + y − z = 5.
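The conversion from parametric form to vector form used in these examples might be sketched as follows. The names `cross`, `dot`, and `vector_form` are chosen for this illustration, integer components are assumed, and the sign convention (first non-zero component of n made positive) is an arbitrary choice made here, since any non-zero multiple of n describes the same plane.

```python
from math import gcd

def cross(u, v):
    """Vector product u x v of two 3D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar product of two vectors."""
    return sum(x * y for x, y in zip(u, v))

def vector_form(P, a, b):
    """Return (n, d) so that the plane r = P + lam*a + mu*b is r . n = d."""
    n = cross(a, b)
    g = gcd(gcd(abs(n[0]), abs(n[1])), abs(n[2]))
    if g == 0:
        raise ValueError("a and b are parallel, so they do not determine a plane")
    # Simplify n; flip the sign so that its first non-zero component is positive.
    if next(x for x in n if x != 0) < 0:
        g = -g
    n = tuple(x // g for x in n)
    return n, dot(P, n)

print(vector_form((-2, 14, 5), (1, 5, 7), (-1, 3, 1)))  # ((2, 1, -1), 5)
print(vector_form((17, -2, 0), (3, 0, 5), (-2, 2, 0)))  # ((5, 5, -3), 75)
```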
Exercise 238. Below are three planes given in parametric form. Rewrite each into both
vector and cartesian form. (Answer on p. 1502.)
(a) r = (1, 2, 3) + λ (4, 5, 6) + µ (7, 8, 9) (λ, µ ∈ R).
(b) r = (λ − µ, 4λ + 5, 0) (λ, µ ∈ R).
(c) r = (1 + µ, 1 + λ, λ + µ) (λ, µ ∈ R).
Example 724. A plane contains the point (1, 2, 3) and has normal vector (1, 1, 0).
Compute (1, 2, 3) ⋅ (1, 1, 0) = 1 + 2 + 0 = 3. Thus, this plane may be described in vector or
cartesian form by:
r ⋅ (1, 1, 0) = 3 or x + y = 3.
This plane also contains the non-parallel vectors (1, −1, 0) and (0, 0, 1). Thus, it may be
described in parametric form by:
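\[ r = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} + \lambda \begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix} + \mu \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 + \lambda \\ 2 - \lambda \\ 3 + \mu \end{pmatrix} \quad (\lambda, \mu \in \mathbb{R}). \]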
Example 725. A plane contains the point (0, 0, 1) and has normal vector (2, −1, 1).
Compute (0, 0, 1) ⋅ (2, −1, 1) = 0 + 0 + 1 = 1. Thus, this plane may be described in vector
or cartesian form by:
r ⋅ (2, −1, 1) = 1 or 2x − y + z = 1.
This plane also contains the non-parallel vectors (1, 2, 0) and (0, 1, 1). Thus, it may be
described in parametric form by:
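\[ r = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} + \lambda \begin{pmatrix} 1 \\ 2 \\ 0 \end{pmatrix} + \mu \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} \lambda \\ 2\lambda + \mu \\ 1 + \mu \end{pmatrix} \quad (\lambda, \mu \in \mathbb{R}). \]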
Second, two examples where we’re given one point and two vectors (that aren’t parallel):
246
This is an assertion that we formally prove only in Ch. 119.11 (Appendices).
Example 726. A plane contains the point (1, 2, 3) and the non-parallel vectors (5, 4, 3)
and (1, −1, 2). It may thus be described by the following parametric equation:
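\[ r = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} + \lambda \begin{pmatrix} 5 \\ 4 \\ 3 \end{pmatrix} + \mu \begin{pmatrix} 1 \\ -1 \\ 2 \end{pmatrix} = \begin{pmatrix} 1 + 5\lambda + \mu \\ 2 + 4\lambda - \mu \\ 3 + 3\lambda + 2\mu \end{pmatrix} \quad (\lambda, \mu \in \mathbb{R}). \]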
This plane has normal vector (5, 4, 3) × (1, −1, 2) = (11, −7, −9).
Compute (1, 2, 3) ⋅ (11, −7, −9) = 11 − 14 − 27 = −30. Thus, this plane may be described in
vector or cartesian form by:
r ⋅ (11, −7, −9) = −30 or 11x − 7y − 9z = −30.
Example 727. A plane contains the point (5, 0, 1) and the non-parallel vectors (1, 1, 8)
and (1, 0, 1). It may thus be described by the following parametric equation:
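\[ r = \begin{pmatrix} 5 \\ 0 \\ 1 \end{pmatrix} + \lambda \begin{pmatrix} 1 \\ 1 \\ 8 \end{pmatrix} + \mu \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 5 + \lambda + \mu \\ \lambda \\ 1 + 8\lambda + \mu \end{pmatrix} \quad (\lambda, \mu \in \mathbb{R}). \]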
Third, two examples where we’re given two points and a vector (that isn’t parallel to
the vector between the two points):
Example 728. A plane contains the points (0, 0, 3) and (1, 4, 5), and the vector (3, 2, 1).
The vector between the two points is (1, 4, 5) − (0, 0, 3) = (1, 4, 2) and isn’t parallel to the
vector (3, 2, 1). Thus, this plane may be described by the following parametric equation:
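\[ r = \begin{pmatrix} 0 \\ 0 \\ 3 \end{pmatrix} + \lambda \begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix} + \mu \begin{pmatrix} 1 \\ 4 \\ 2 \end{pmatrix} = \begin{pmatrix} 3\lambda + \mu \\ 2\lambda + 4\mu \\ 3 + \lambda + 2\mu \end{pmatrix} \quad (\lambda, \mu \in \mathbb{R}). \]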
This plane has normal vector (3, 2, 1) × (1, 4, 2) = (0, −5, 10).
It thus also has normal vector (0, −1, 2).
Compute (0, 0, 3) ⋅ (0, −1, 2) = 0 + 0 + 6 = 6. Thus, this plane may be described in vector
or cartesian form by:
r ⋅ (0, −1, 2) = 6 or −y + 2z = 6.
Example 729. A plane contains the points (8, −2, 0) and (3, 6, 9), and the vector (0, 1, 1).
The vector between the two points is (3, 6, 9) − (8, −2, 0) = (−5, 8, 9) and isn’t parallel to
the vector (0, 1, 1). Thus, this plane may be described in parametric form as:
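\[ r = \begin{pmatrix} 8 \\ -2 \\ 0 \end{pmatrix} + \lambda \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} + \mu \begin{pmatrix} -5 \\ 8 \\ 9 \end{pmatrix} = \begin{pmatrix} 8 - 5\mu \\ -2 + \lambda + 8\mu \\ \lambda + 9\mu \end{pmatrix} \quad (\lambda, \mu \in \mathbb{R}). \]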
This plane has normal vector (0, 1, 1) × (−5, 8, 9) = (1, −5, 5).
Compute (8, −2, 0)⋅(1, −5, 5) = 8+10+0 = 18. Thus, this plane may be described in vector
and cartesian forms by r ⋅ (1, −5, 5) = 18 and x − 5y + 5z = 18.
Fourth and lastly, two examples where we’re given three points (that aren’t collinear):
Example 730. A plane contains the points (1, 2, 3), (4, 5, 8), and (2, 3, 5).
The vector between the first two points is (4, 5, 8) − (1, 2, 3) = (3, 3, 5), while that between
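the first and last is (2, 3, 5) − (1, 2, 3) = (1, 1, 2). Since (3, 3, 5) ∦ (1, 1, 2), this plane may
be described in parametric form as:
\[ r = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} + \lambda \begin{pmatrix} 3 \\ 3 \\ 5 \end{pmatrix} + \mu \begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix} = \begin{pmatrix} 1 + 3\lambda + \mu \\ 2 + 3\lambda + \mu \\ 3 + 5\lambda + 2\mu \end{pmatrix} \quad (\lambda, \mu \in \mathbb{R}). \]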
This plane has normal vector (3, 3, 5) × (1, 1, 2) = (1, −1, 0).
Compute (1, 2, 3) ⋅ (1, −1, 0) = 1 − 2 + 0 = −1. Thus, this plane may be described in vector
and cartesian forms by r ⋅ (1, −1, 0) = −1 and x − y = −1.
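The three-point recipe (two difference vectors, their vector product, then one scalar product) can be sketched as below. The function name `plane_through` and the dictionary layout of its result are illustrative choices, not something defined in this textbook.

```python
def sub(p, q):
    """Vector from the point q to the point p."""
    return tuple(x - y for x, y in zip(p, q))

def cross(u, v):
    """Vector product u x v of two 3D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def plane_through(A, B, C):
    """Describe the plane through three non-collinear points A, B, and C."""
    u, v = sub(B, A), sub(C, A)
    n = cross(u, v)
    if n == (0, 0, 0):
        raise ValueError("the three points are collinear")
    d = sum(x * y for x, y in zip(A, n))
    # Parametric form: r = A + lam*u + mu*v. Vector form: r . n = d.
    return {"point": A, "vectors": (u, v), "normal": n, "d": d}

print(plane_through((1, 2, 3), (4, 5, 8), (2, 3, 5)))
```

For the three points of Example 730, this gives the vectors (3, 3, 5) and (1, 1, 2), the normal vector (1, −1, 0), and d = −1.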
Example 731. A plane contains the points (1, 0, 0), (0, 1, 0), and (0, 0, 1).
The vector between the first two points is (0, 1, 0)−(1, 0, 0) = (−1, 1, 0), while that between
the first and last is (0, 0, 1) − (1, 0, 0) = (−1, 0, 1). Since (−1, 1, 0) ∦ (−1, 0, 1), this plane
may be described in parametric form as:
\[ r = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + \lambda \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} + \mu \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 - \lambda - \mu \\ \lambda \\ \mu \end{pmatrix} \quad (\lambda, \mu \in \mathbb{R}). \]
Exercise 239. In each of the following, three points are given. Describe the plane that
contains all three points in vector, cartesian, and parametric form.
(a) (7, 3, 4), (8, 3, 4), and (9, 3, 7). (c) (8, 5, 9), (8, 4, 5), an
(b) (8, 0, 2), (4, 4, 3), and (2, 7, 2). (Answe
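\[ A = \frac{\pi}{2} - \theta. \]
Recall (Fact 74) that θ, the non-obtuse angle between v and n, is given by:
\[ \theta = \cos^{-1} \frac{|v \cdot n|}{|v| \, |n|}. \]
Thus:
\[ A = \frac{\pi}{2} - \theta = \frac{\pi}{2} - \cos^{-1} \frac{|v \cdot n|}{|v| \, |n|}. \]
Recall also that for all x ∈ [−1, 1]:
\[ \sin^{-1} x + \cos^{-1} x = \frac{\pi}{2}. \]
And so, we have:
\[ A = \frac{\pi}{2} - \cos^{-1} \frac{|v \cdot n|}{|v| \, |n|} = \sin^{-1} \frac{|v \cdot n|}{|v| \, |n|}. \]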
Definition 144. Suppose a line has direction vector v and a plane has normal vector n.
Then the angle between the line and the plane is the following number:
\[ \sin^{-1} \frac{|v \cdot n|}{|v| \, |n|}. \]
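As an illustrative sketch, the formula in Definition 144 is a one-liner with Python's `math` module. The function name `angle_line_plane` is made up here, and the `min(1.0, ...)` clamp is a precaution added against floating-point rounding pushing the ratio slightly above 1.

```python
from math import asin, sqrt, pi

def angle_line_plane(v, n):
    """Angle (in radians) between a line with direction v and a plane with normal n."""
    dot = abs(sum(x * y for x, y in zip(v, n)))
    norm_v = sqrt(sum(x * x for x in v))
    norm_n = sqrt(sum(x * x for x in n))
    return asin(min(1.0, dot / (norm_v * norm_n)))

print(round(angle_line_plane((9, 1, 3), (1, 1, 1)), 3))  # 0.906
print(angle_line_plane((1, 0, 1), (0, 1, 0)))            # 0.0, so line and plane are parallel
```

The result is in radians and always lies between 0 and π/2.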
Observe that to compute the angle between a line and a plane, all we need are a direction
vector of the line and a normal vector of the plane. Three examples:
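\[ \sin^{-1} \frac{|(9, 1, 3) \cdot (1, 1, 1)|}{|(9, 1, 3)| \, |(1, 1, 1)|} = \sin^{-1} \frac{|9 + 1 + 3|}{\sqrt{9^2 + 1^2 + 3^2} \sqrt{1^2 + 1^2 + 1^2}} = \sin^{-1} \frac{|13|}{\sqrt{91} \sqrt{3}} \approx 0.906. \]
[Figure: the line and the plane with normal vector n = (1, 1, 1), meeting at an angle of about 0.906.]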
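\[ = \sin^{-1} \frac{|-1|}{\sqrt{2} \sqrt{2}} = \sin^{-1} \frac{1}{2} = \frac{\pi}{6}. \]
[Figure: the line and the plane with normal vector n = (−1, −1, 0), meeting at an angle of π/6.]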
Example 734. The angle between a line with direction vector (1, 0, 1) and a plane with
normal vector (0, 1, 0) is:
Exercise 240. Find the angle between the given line and plane. (Answer on p. 1504.)
(a) Line: r = (−1, 2, 3) + λ (−1, 1, 0) (λ ∈ R).
Plane: r ⋅ (3, 4, 5) = 0.
(c) Line: Contains the points (−1, 2, 3) and (0, 11, 11).
Plane: Contains the points (1.5, 0, 0) and (0, 0, 1.5) and the vector (4, −1, 0).
Definition 145. Let θ be the angle between a line and a plane. The line and plane are
said to be (a) parallel if θ = 0; and (b) perpendicular if θ = π/2.
As usual, if a line l and a plane q are parallel, then as shorthand we’ll write l ∥ q. And if
they’re perpendicular, we’ll write l ⊥ q.
“Obviously”, a line (e.g. l1) is parallel to a plane if and only if the line’s direction vector is
perpendicular to the plane’s normal vector.
Similarly, a line (e.g. l2) is perpendicular to a plane if and only if the line’s direction vector is
parallel to the plane’s normal vector.
[Figure: the plane with normal vector n, a parallel line l1, and a perpendicular line l2.]
Let’s state and prove these obviosities formally:
Fact 106. Suppose the line l has direction vector v and the plane q has normal vector n.
Then (a) l ∥ q ⇐⇒ v ⊥ n; and (b) l ⊥ q ⇐⇒ v ∥ n.
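Proof. Let θ be the angle between l and q. By Definitions 145, 144, and 112, and Fact 71:
\[ \text{(a)} \quad l \parallel q \iff \theta = 0 \iff \sin^{-1} \frac{|v \cdot n|}{|v| \, |n|} = 0 \iff v \cdot n = 0 \iff v \perp n. \]
\[ \text{(b)} \quad l \perp q \iff \theta = \frac{\pi}{2} \iff \sin^{-1} \frac{|v \cdot n|}{|v| \, |n|} = \frac{\pi}{2} \iff \frac{|v \cdot n|}{|v| \, |n|} = 1 \iff v \parallel n. \]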
It will be nice if we can speak of a vector and a plane being parallel or perpendicular:
Definition 146. Let q be a plane with normal vector n and v be a vector. Then q and
v are said to be (a) parallel if v ⊥ n; and (b) perpendicular if v ∥ n.
Fact 107. Given a line and a plane, there are three possibilities. The line and plane are:
(a) Parallel and do not intersect at all.
(b) Parallel and the line lies entirely on the plane.
(c) Non-parallel and intersect at exactly one point.
Corollary 20. If a line and plane are parallel, then they intersect if and only if the line
lies completely on the plane.
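The three possibilities in Fact 107 can be told apart mechanically: plug a generic point of the line into the plane's vector equation and see whether the parameter can be solved for. A hedged sketch follows; the function name `line_plane` and its string return values are invented for illustration, and the right-hand sides used in the two parallel cases below are hypothetical values, not taken from the examples.

```python
from fractions import Fraction

def dot(u, v):
    """Scalar product of two vectors."""
    return sum(x * y for x, y in zip(u, v))

def line_plane(p, v, n, d):
    """Classify the line r = p + lam*v against the plane r . n = d (Fact 107)."""
    vn, pn = dot(v, n), dot(p, n)
    if vn == 0:  # direction vector perpendicular to the normal: line is parallel to plane
        return "line lies on plane" if pn == d else "parallel, no intersection"
    # Otherwise (p + lam*v) . n = d has exactly one solution for lam.
    lam = Fraction(d - pn, vn)
    return tuple(x + lam * y for x, y in zip(p, v))

# Line through (3, 5, 5) with direction (9, 1, 3); plane r . (1, 1, 1) = 3.
print(line_plane((3, 5, 5), (9, 1, 3), (1, 1, 1), 3))
# Same line against planes with normal (1, 0, -3); d = 0 and d = -12 are hypothetical.
print(line_plane((3, 5, 5), (9, 1, 3), (1, 0, -3), 0))
print(line_plane((3, 5, 5), (9, 1, 3), (1, 0, -3), -12))
```

The first call returns the single intersection point as exact fractions, namely (−51/13, 55/13, 35/13).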
Example 736. The line l and the plane q are described by:
To find this intersection point, simply plug in a generic point of l into the equation for q:
\[ [(3, 5, 5) + \hat{\lambda} (9, 1, 3)] \cdot (1, 1, 1) = 3 \iff 13 + 13\hat{\lambda} = 3 \iff \hat{\lambda} = -\frac{10}{13}. \]
Hence, l and q intersect at:
\[ (3, 5, 5) + \hat{\lambda} (9, 1, 3) = (3, 5, 5) - \frac{10}{13} (9, 1, 3) = \frac{1}{13} (-51, 55, 35). \]
Example 737. The line l and the plane q are described by:
Here’s a trick to find out which of these two possibilities holds. Observe that the line l
contains the point (3, 5, 5).²⁴⁷ So, simply check if this point is also on the plane q:
Since this point does not satisfy q’s vector equation, it is not on q. So, it cannot be that
l lies entirely on q. That leaves only one possibility: l and q do not intersect at all.
247
To see this, simply plug λ = 0 into the line’s vector equation.
Example 738. The line l and the plane q are described by:
Again, l has direction vector v = (9, 1, 3), while q has normal vector n = (1, 0, −3). And
so again, l ∥ q.
And again, two possibilities: either l lies entirely on q or they don’t intersect at all.
To find out which it is, we use the same trick as before. Observe that the point (3, 5, 3)
is on l. Let’s check if this point is also on the plane q:
Yup, this time it is. Since l and q are parallel and share an intersection point, it must be
that l lies entirely on q.
[Figure: the line l lying on the plane q.]
Exercise 241. In each of the following, a line l and a plane q are given. For each,
determine which of the three possibilities given in Fact 107 holds. And if the line and
plane intersect, find their intersection point. (Answer on p. 1504.)
(a) l: r = (4, 5, 6) + λ (2, 3, 5) (λ ∈ R).
q: r ⋅ (−10, 0, 4) = −26.
Definition 147. The angle between two planes is the non-obtuse angle between their
normal vectors.
By Fact 74, the non-obtuse angle between u and v is:
\[ \cos^{-1} \frac{|u \cdot v|}{|u| \, |v|}. \]
And so, we have the following “formula” for the angle between two planes:
Fact 108. The angle between two planes with normal vectors u and v is:
\[ \cos^{-1} \frac{|u \cdot v|}{|u| \, |v|}. \]
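A short sketch of Fact 108 in Python, for checking answers numerically. The function name `angle_between_planes` is hypothetical, and the clamp to 1.0 is an added safeguard against rounding error when the normal vectors are parallel.

```python
from math import acos, sqrt, pi

def angle_between_planes(u, v):
    """Angle (in radians) between two planes with normal vectors u and v."""
    dot = abs(sum(x * y for x, y in zip(u, v)))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(x * x for x in v))
    return acos(min(1.0, dot / (norm_u * norm_v)))

# Planes r . (1, 1, 1) = 12 and r . (-1, -1, 0) = -1.
print(round(angle_between_planes((1, 1, 1), (-1, -1, 0)), 3))  # 0.615
```

Only the normal vectors matter here; the right-hand sides of the two vector equations play no role in the angle.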
Example 740. Two planes are described by r ⋅ (2, 1, 3) = 26 and r ⋅ (−3, 0, 5) = −25. The
angle between them is the (non-obtuse) angle between their normal vectors:
Example 741. Two planes are described by r ⋅ (1, 1, 1) = 12 and r ⋅ (−1, −1, 0) = −1. The
angle between them is the (non-obtuse) angle between their normal vectors:
\[ \theta = \cos^{-1} \frac{|(1, 1, 1) \cdot (-1, -1, 0)|}{|(1, 1, 1)| \, |(-1, -1, 0)|} = \cos^{-1} \frac{|-2|}{\sqrt{3} \sqrt{2}} = \cos^{-1} \sqrt{\frac{2}{3}} \approx 0.615. \]
Exercise 242. Find the angle between the given planes. (Answers on p. 1505.)
(a) r ⋅ (−1, −2, −3) = 1 and r ⋅ (3, 4, 5) = 2.
(b) One plane contains the vectors (1, −1, 0) and (3, 5, −1). The other contains the vectors
(0, 1, 0) and (10, 2, 3).
(c) One plane contains the points (1, 1, 0), (3, 0, 0), and (0, 0, 1). The other contains the
points (1, −1, 0), (1, 0, −1), and (0, 3, 1).
Definition 148. Let θ be the angle between two planes. We say that the two planes are:
If the planes q and r are parallel, then as shorthand, we’ll write q ∥ r. And if they’re
perpendicular, we’ll write q ⊥ r.
“Obviously”, two planes are parallel (or perpendicular) if and only if their normal vectors
are parallel (or perpendicular):
Fact 109. Let q and r be planes with normal vectors u and v. Then:
Fact 110. If two planes are parallel, then they are either identical or do not intersect.
Example 743. The planes q1 and q2 are described by r⋅(3, −3, −1) = 1 and r⋅(−6, 6, 2) = 5.
They are parallel because their normal vectors are parallel: (3, −3, −1) ∥ (−6, 6, 2).
Now pick any point on q1, for example (0, 0, −1). Check if this point is on q2:
(0, 0, −1) ⋅ (−6, 6, 2) = −2 ≠ 5.
It isn’t. Since the two planes are not identical, by Fact 110, they do not intersect at all.
Fact 111. If two planes are not parallel, then they must intersect.
Example 745. The planes q1 and q2 aren’t parallel. So by Fact 111, they must intersect.
[Figure: the planes q1 and q2 and their intersection line.]
In fact, and as should be intuitively “obvious”, q1 and q2 must intersect along a line.
As the above example suggests, two non-parallel planes must intersect along a line.
Now, what do we know about this intersection line? Well, “obviously”, its direction vector
is parallel to both planes and is thus also perpendicular to both planes’ normal vectors.
Let n and m be two planes’ normal vectors. By Fact 94, the only vectors perpendicular to
both n and m are those parallel to n × m.
Hence, their intersection line must have direction vector n × m. Altogether then:
Fact 112. Suppose two non-parallel planes have normal vectors n and m. Then their
intersection is a line with direction vector n × m.
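Fact 112 gives the direction vector; a point on the line can then be found by setting one coordinate to zero and solving the remaining 2×2 system. The sketch below picks a coordinate in which the direction vector is non-zero, which guarantees that the 2×2 system has a unique solution. The function name `intersection_line` is invented for this illustration.

```python
from fractions import Fraction

def cross(u, v):
    """Vector product u x v of two 3D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def intersection_line(n1, d1, n2, d2):
    """Return (point, direction) for the line where r.n1 = d1 meets r.n2 = d2."""
    direction = cross(n1, n2)
    if direction == (0, 0, 0):
        raise ValueError("the planes are parallel")
    # Set to zero a coordinate k in which the direction vector is non-zero.
    k = next(idx for idx in range(3) if direction[idx] != 0)
    i, j = [idx for idx in range(3) if idx != k]
    # Solve the remaining two equations for coordinates i and j by Cramer's rule.
    det = n1[i] * n2[j] - n1[j] * n2[i]
    point = [Fraction(0)] * 3
    point[i] = Fraction(d1 * n2[j] - d2 * n1[j], det)
    point[j] = Fraction(n1[i] * d2 - n2[i] * d1, det)
    return tuple(point), direction

print(intersection_line((-3, 7, 1), 2, (1, 2, 1), 0))
print(intersection_line((0, 4, 5), 0, (3, 4, 5), 1))
```

In the second call the direction vector has x-component 0, so plugging in x = 0 would not work; the sketch sets y = 0 instead and finds the point (1/3, 0, 0).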
Clearly, n1 ∦ n2 . And so by Fact 112, q1 and q2 must intersect along a line with direction
vector n1 × n2 = (−1, 2, −3) × (5, −6, 7) = (−4, −8, −4) or (1, 2, 1).
Recall that to fully describe a line, we need a direction vector and a point. We already
have a direction vector. Let us now find some point P that is on the intersection line.
To do so, first write out the two planes’ cartesian equations:
−x + 2y − 3z = 4 and 5x − 6y + 7z = 0.
The solutions to the above system of (two) equations give us the two planes’ intersection
points. Note that with three variables and two equations, this system of equations has
infinitely many solutions and hence infinitely many intersection points. And of
course, the set of all these intersection points is the intersection line.
Here’s a simple trick to find any one such intersection point. As always, zero is our
friend. So, let’s look for an intersection point whose x-coordinate is zero. In other words,
let’s simply plug x = 0 into the above equations to get:
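\[ 2y - 3z = 4 \quad (1) \qquad\text{and}\qquad -6y + 7z = 0. \quad (2) \]
And now, we can easily solve this system of (two) equations (with two variables): (2) plus
3 × (1) yields −2z = 12, so that z = −6. Then from (2), y = 7z/6 = −7.
Altogether then, an intersection point shared by q1 and q2 is P = (0, −7, −6). And thus,
their intersection line may be described by:
r = (0, −7, −6) + λ (1, 2, 1) (λ ∈ R).
[Figure: the plane q1, the normal vector n2 = (5, −6, 7), and the point P = (0, −7, −6) on the intersection line.]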
Corollary 21. Given two planes, there are exactly three possibilities. They are:
(a) Identical and thus also parallel;
(b) Parallel and do not intersect at all; or
(c) Non-parallel and intersect along a line.
And hence, two distinct planes intersect if and only if they are not parallel.
Proof. If two distinct planes are parallel, then by Fact 110, they do not intersect at all.
And if they aren’t parallel, then by Fact 112, they intersect along a line.
Clearly, n1 ∦ n2 . And so by Fact 112, they must intersect along a line with direction
vector (−3, 7, 1) × (1, 2, 1) = (5, 4, −13).
To find an intersection point, write out the two planes’ cartesian equations:
−3x + 7y + z = 2 and x + 2y + z = 0.
Again, there’ll be infinitely many intersection points. To find one, use the same trick as
before. Plug x = 0 into the above equations to get:
\[ 7y + z = 2 \quad (1) \qquad\text{and}\qquad 2y + z = 0. \quad (2) \]
[Figure: the plane q2, the normal vectors n1 = (−3, 7, 1) and n2 = (1, 2, 1), and the point P = (0, 0.4, −0.8).]
Clearly, n1 ∦ n2 . And so by Fact 112, they must intersect along a line with direction
vector (0, 4, 5) × (3, 4, 5) = (0, 15, −12) or (0, 5, −4).
To find an intersection point, write out the two planes’ cartesian equations:
4y + 5z = 0 and 3x + 4y + 5z = 1.
Again, there’ll be infinitely many intersection points. It turns out that this time, our
“plug in x = 0 trick” won’t work so nicely. Let’s try it anyway and see what happens:
\[ 4y + 5z = 0 \quad (1) \qquad\text{and}\qquad 4y + 5z = 1, \quad (2) \]
[Figure: the plane q2, the normal vectors n1 = (0, 4, 5) and n2 = (3, 4, 5), and the point P = (1/3, 0, 0).]
Exercise 243. In each of the following, a pair of planes q1 and q2 is given. For each
pair, determine which of the three possibilities in Corollary 21 holds. If the two planes
intersect, describe the set of intersection points. (Answer on p. 1505.)
Earlier, we learned that the foot of the perpendicular from a point to a line is unique. It is
analogously true that the foot of the perpendicular from a point to a plane is unique:²⁴⁹
Fact 113. There is at most one foot of the perpendicular from a point to a plane.
Exercise 244. This Exercise²⁵⁰ guides you through a proof of Fact 113. Let B be a foot
of the perpendicular from a point A to a plane q. Let C ≠ B be a point on q. We’ll show
that $\overrightarrow{AC} \not\perp q$ and hence that C cannot also be a foot of the perpendicular from A to q.
(a) Explain why $\overrightarrow{AB} \cdot \overrightarrow{BC} = 0$.
(b) Now prove that $\overrightarrow{AC} \not\perp \overrightarrow{BC}$ and hence that $\overrightarrow{AC} \not\perp q$. (Answer on p. 1507.)
As before, we define the distance between a point and a plane to be the minimum distance
between them:
Definition 150. Let A be a point and q be a plane. Suppose B is the point on q that’s
closest to A. Then the distance between A and q is $|\overrightarrow{AB}|$.
Remark 66. In the trivial case where A is on q, the point on q that’s closest to A is A
itself. And so, by Definition 150, the distance between A and q is $|\overrightarrow{AA}| = |0| = 0$. Which
is just what we’d expect.
248
Chs. 43 and 50.
249
Hence justifying the use of the definite article the in Definition 149.
250
It is also very similar to Exercise 196(c).
Fact 114 is the analogue of Fact 87 and is again intuitively “obvious”:
[Figure: the plane q and the point A; the foot of the perpendicular B is also the closest point.]
Fact 114. If B is the foot of the perpendicular from a point A to a plane q, then B is
also the point on q that’s closest to A.
Corollary 22. If B is the foot of the perpendicular from a point A to a plane q, then the
distance between A and q is $|\overrightarrow{AB}|$.
251
Note that this proof is actually exactly identical to that of Fact 87.
252
This is exactly analogous to Method 2 (Perpendicular Method) in Chs. 43 and 50.
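Example 750. Let A = (−1, 0, 1) be a point, q be the plane described by r ⋅ n = r ⋅ (0, 2, 5) = 1,
and B be the foot of the perpendicular from A to q.
[Figure (not to scale): the plane q with normal vector n = (0, 2, 5), the point A = (−1, 0, 1), and the foot B, with $\overrightarrow{AB} = -\frac{4}{29} n$.]
Perpendicular Method. By Definition 149, $\overrightarrow{AB} \parallel n$. So, there exists k ≠ 0 such that:
\[ B = A + kn = (-1, 0, 1) + k(0, 2, 5). \]
Since B ∈ q, we have:
\[ \overrightarrow{OB} \cdot (0, 2, 5) = 1 \quad\text{or}\quad [(-1, 0, 1) + k(0, 2, 5)] \cdot (0, 2, 5) = 1 \quad\text{or}\quad 5 + 29k = 1. \]
So, k = −4/29. Hence:
\[ B = A + kn = A - \frac{4}{29} n = (-1, 0, 1) - \frac{4}{29} (0, 2, 5) = \left( -1, -\frac{8}{29}, \frac{9}{29} \right). \]
\[ |\overrightarrow{AB}| = |kn| = |k| \, |n| = \left| -\frac{4}{29} \right| |(0, 2, 5)| = \frac{4}{29} |(0, 2, 5)| = \frac{4}{29} \sqrt{29} = \frac{4}{\sqrt{29}}. \]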
Exercise 245. For each of the following, let B be the foot of the perpendicular from A
to q. Use the Perpendicular Method to find B and the distance between A and q.
The point A The plane q
(a) (7, 3, 4) r⋅ (9, 3, 7) = 109.
(b) (8, 0, 2) r⋅ (2, 7, 2) = 42.
(c) (8, 5, 9) r⋅ (5, 6, 0) = 64. (Answer on p. 1507.)
Next up, we’ll learn a second method, called the Formula Method, for finding the foot
of the perpendicular from a point to a plane and the distance between a point and a plane:
Fact 115. Suppose q is a plane described by r ⋅ n = d, A is a point that isn’t on the plane
q, and B is the foot of the perpendicular from A to q.
Let:
\[ k = \frac{d - \overrightarrow{OA} \cdot n}{|n|^2}. \]
Then: (a) B = A + kn; and (b) $|\overrightarrow{AB}| = |k| \, |n|$.
Observe that if n = (a, b, c) and A = (x, y, z), then in the above result, we also have:
\[ k = \frac{d - (ax + by + cz)}{a^2 + b^2 + c^2}. \]
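For checking answers, the Formula Method might be sketched as follows. The function name `foot_and_distance` is made up for this illustration; k and B are kept as exact fractions, while the distance is returned as a floating-point number.

```python
from fractions import Fraction
from math import sqrt

def foot_and_distance(A, n, d):
    """Return (k, B, distance) for the point A and the plane r . n = d (Fact 115)."""
    n_squared = sum(x * x for x in n)            # |n|^2 = a^2 + b^2 + c^2
    k = Fraction(d - sum(a * x for a, x in zip(A, n)), n_squared)
    B = tuple(a + k * x for a, x in zip(A, n))   # B = A + kn
    distance = abs(k) * sqrt(n_squared)          # |AB| = |k| |n|
    return k, B, distance

k, B, dist = foot_and_distance((-1, 0, 1), (0, 2, 5), 1)
print(k)     # -4/29
print(B)     # the point (-1, -8/29, 9/29), shown as Fractions
print(dist)  # approximately 4 / sqrt(29)
```

If A happens to lie on the plane, the same computation simply gives k = 0, B = A, and a distance of 0.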
We now redo the last two examples, but now using Fact 115:
Example 751. Let A = (1, 2, 3) be a point, q be the plane described by r⋅n = r⋅(1, 1, 1) = 3,
and B be the foot of the perpendicular from A to q.
Formula Method. First compute $|n| = \sqrt{1^2 + 1^2 + 1^2} = \sqrt{3}$. Then compute:
\[ k = \frac{d - \overrightarrow{OA} \cdot n}{|n|^2} = \frac{3 - (1, 2, 3) \cdot (1, 1, 1)}{3} = \frac{3 - (1 + 2 + 3)}{3} = \frac{-3}{3} = -1. \]
Happily, these are the same as what we found with Method 2 earlier.
Remark 67. I recommend sticking with and using the Perpendicular Method rather
than the Formula Method. Two reasons for this: the Perpendicular Method (a) is
easier to remember; and (b) helps you understand what’s going on.
In contrast, with the Formula Method, one is liable to simply and mindlessly plug in
formulae without understanding what is going on. Disaster then strikes if one is unable
to recall these formulae.
And now by Fact 115, we have:
\[ B = A + kn = (-1, 0, 1) - \frac{4}{29} (0, 2, 5) = \left( -1, -\frac{8}{29}, \frac{9}{29} \right). \]
And:
\[ |\overrightarrow{AB}| = |k| \, |n| = \frac{4}{29} \cdot \sqrt{29} = \frac{4}{\sqrt{29}}. \]
Happily, these are the same as what we found with Method 2 earlier.
Exercise 247. Redo Ex. 245 using the Formula Method. (Answer on p. 1508.)
The next result is not one that students could reasonably have been expected to know.
Which means, of course, that it made a sudden appearance in 2017 (Exercise 512).
Corollary 23. Suppose a plane is described by r ⋅ n = d. Then the distance between the
plane and the origin is:
\[ \frac{|d|}{|n|}. \]
Hence, if n is a unit vector, then the distance between the plane and the origin is ∣d∣.
Exercise 248. Prove Corollary 23. (Hint: Use Fact 115). (Answer on p. 1509.)
\[ \frac{|d|}{|n|} = \frac{3}{\sqrt{3}} = \sqrt{3}. \]
\[ \frac{|d|}{|n|} = \frac{1}{\sqrt{29}}. \]
\[ B = A - \frac{4}{13} n = (-1, 2, 5) - \frac{4}{13} (0, 5, 1) = \frac{1}{13} (-13, 6, 61). \]
And the distance between A and q is:
\[ |\overrightarrow{AB}| = |k| \, |n| = \frac{4}{13} \cdot \sqrt{26} = \frac{4\sqrt{2}}{\sqrt{13}}. \]
By Corollary 23, the distance between the origin and the given plane is:
\[ \frac{|d|}{|n|} = \frac{|7|}{\sqrt{26}} = \frac{7}{\sqrt{26}}. \]
Formula Method. First compute $|n| = \sqrt{0^2 + 5^2 + 1^2} = \sqrt{26}$. Then compute:
\[ k = \frac{d - \overrightarrow{OA} \cdot n}{|n|^2} = \frac{7 - (-1, 2, 5) \cdot (0, 5, 1)}{26} = \frac{7 - 15}{26} = -\frac{4}{13}. \]
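So, k = 32/14 = 16/7 and:
\[ B = A + \frac{16}{7} n = (0, 0, 0) + \frac{16}{7} (1, 2, 3) = \frac{16}{7} (1, 2, 3). \]
The distance between A and q is:
\[ |\overrightarrow{AB}| = |k| \, |n| = \frac{16}{7} \cdot \sqrt{14} = \frac{16\sqrt{2}}{\sqrt{7}}. \]
[Figure: the plane q with normal vector n = (1, 2, 3), the point A = (0, 0, 0), and the foot B, with $\overrightarrow{AB} = \frac{16}{7} n$.]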
Formula Method. First compute $|n| = \sqrt{1^2 + 2^2 + 3^2} = \sqrt{14}$. Then compute:
\[ k = \frac{d - \overrightarrow{OA} \cdot n}{|n|^2} = \frac{32 - (0, 0, 0) \cdot (1, 2, 3)}{14} = \frac{32 - 0}{14} = \frac{16}{7}. \]
And now by Fact 115:
\[ B = A + kn = (0, 0, 0) + \frac{16}{7} (1, 2, 3) = \frac{16}{7} (1, 2, 3). \]
And as before, we can compute $|\overrightarrow{AB}| = |k| \, |n| = 16\sqrt{2/7}$.
Of course, since A = O, this number is also the distance between the origin and q.
Exercise 249. Let S = (−1, 0, 7) and T = (3, 2, 1) be points and q be the plane described
by r ⋅ (5, −3, 1) = 0. Use both methods you’ve learnt in this chapter to find (a) the feet of
the perpendiculars from S and T to q. Then find (b) the distances from q to the points
S and T ; and also (c) the distance between q and the origin. (Answer on p. 1509.)
Remark 68. It turns out that there is also a third method, called the Calculus Method,
for finding the foot of the perpendicular and the distance between a point and a plane.
This is very similar to and not much more difficult than what we did earlier in Chs. 43 and 50
with a point and a line.
However, because this involves multivariate calculus and is not on your syllabus, I have
decided to relegate this discussion to Ch. 119.15 (Appendices).
Definition 151. Two or more points are coplanar if some plane contains all of them.
“Obviously”, given any line, there is a plane that contains this line.
Recall²⁵³ that any two points are collinear. And so “obviously”, they must also be coplanar.
Recall²⁵⁴ that three non-collinear points uniquely determine a plane. Thus, any three points
must also be coplanar.
In contrast, four points need not be coplanar. Given four distinct points, we’ll use these
steps to check if they’re coplanar:
1. Write down the plane that contains three of the points.
2. Then check whether this plane also contains the fourth point.
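The two steps above can be sketched in Python. This illustration builds a normal vector of the plane through A, B, and C, then checks whether the vector from A to D is perpendicular to it, which is equivalent to checking that D satisfies the plane's vector equation. The function name `coplanar` is hypothetical.

```python
def sub(p, q):
    """Vector from the point q to the point p."""
    return tuple(x - y for x, y in zip(p, q))

def cross(u, v):
    """Vector product u x v of two 3D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def coplanar(A, B, C, D):
    """Return True if the four points lie on a common plane."""
    # Step 1: a normal vector of the plane through A, B, and C.
    n = cross(sub(B, A), sub(C, A))
    # Step 2: D is on that plane if and only if AD is perpendicular to n.
    return sum(x * y for x, y in zip(n, sub(D, A))) == 0

print(coplanar((1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, -1)))      # True
print(coplanar((2, 3, 5), (8, -1, 0), (0, 1, 0), (-3, -2, -1)))   # False
```

If A, B, and C happen to be collinear, then n is the zero vector and the function returns True for any D, which agrees with the fact that four points are coplanar whenever three of them are collinear.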
Example 757. Let A = (1, 0, 0), B = (0, 1, 0), C = (0, 0, 1), and D = (1, 1, −1) be points.
To check if they are coplanar, we’ll write down the plane q that contains A, B, and C.
We’ll then check if D ∈ q.
The non-parallel vectors $\overrightarrow{AB} = (-1, 1, 0)$ and $\overrightarrow{AC} = (-1, 0, 1)$ are on q.
Method 1 (Vector Form). q has normal vector $\overrightarrow{AB} \times \overrightarrow{AC} = (1, 1, 1)$.
Compute $\overrightarrow{OA} \cdot (1, 1, 1) = 1 + 0 + 0 = 1$. Hence, q may be described by r ⋅ (1, 1, 1) = 1.
We now verify that D ∈ q and hence that the four points are coplanar:
\[ \overrightarrow{OD} \cdot (1, 1, 1) = (1, 1, -1) \cdot (1, 1, 1) = 1 + 1 - 1 = 1. \quad ✓ \]
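\[ r = \overrightarrow{OA} + \lambda \overrightarrow{AB} + \mu \overrightarrow{AC} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + \lambda \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} + \mu \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 - \lambda - \mu \\ \lambda \\ \mu \end{pmatrix} \quad (\lambda, \mu \in \mathbb{R}). \]
\[ \begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix} = \begin{pmatrix} 1 - \lambda - \mu \\ \lambda \\ \mu \end{pmatrix} \quad\text{or}\quad \begin{aligned} 1 &= 1 - \lambda - \mu, && (1) \\ 1 &= \lambda, && (2) \\ -1 &= \mu. && (3) \end{aligned} \]
From (2) and (3), λ = 1 and µ = −1. These values of λ and µ satisfy (1). Hence, D ∈ q and the
four points are coplanar.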
253 Fact 80.
254 Ch. 55.
Example 758. Let A = (2, 3, 5), B = (8, −1, 0), C = (0, 1, 0), and D = (−3, −2, −1) be
points. To check if they are coplanar, we’ll write down the plane q that contains A, B,
and C. We’ll then check if D ∈ q.
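The non-parallel vectors $\overrightarrow{AB} = (6, -4, -5)$ and $\overrightarrow{BC} = (-8, 2, 0)$ are on q.
Method 1. q has normal vector $\overrightarrow{AB} \times \overrightarrow{BC} = (10, 40, -20)$ or $(1, 4, -2)$.
Compute $\overrightarrow{OC} \cdot (1, 4, -2) = 0 + 4 + 0 = 4$. Hence, q may be described by $\mathbf{r} \cdot (1, 4, -2) = 4$.
We now show that D ∉ q and hence that the four points are not coplanar:
$$\overrightarrow{OD} \cdot (1, 4, -2) = (-3, -2, -1) \cdot (1, 4, -2) = -3 - 8 + 2 = -9 \neq 4.$$ ✗
Method 2. q may also be described in parametric form by:
$$\mathbf{r} = \overrightarrow{OC} + \lambda \overrightarrow{AB} + \mu \overrightarrow{BC} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + \lambda \begin{pmatrix} 6 \\ -4 \\ -5 \end{pmatrix} + \mu \begin{pmatrix} -8 \\ 2 \\ 0 \end{pmatrix} = \begin{pmatrix} 6\lambda - 8\mu \\ 1 - 4\lambda + 2\mu \\ -5\lambda \end{pmatrix} \quad (\lambda, \mu \in \mathbb{R}).$$
The point D is on q if and only if some λ and µ satisfy:
$$\begin{pmatrix} -3 \\ -2 \\ -1 \end{pmatrix} = \begin{pmatrix} 6\lambda - 8\mu \\ 1 - 4\lambda + 2\mu \\ -5\lambda \end{pmatrix} \quad \text{or} \quad -3 \overset{1}{=} 6\lambda - 8\mu, \quad -2 \overset{2}{=} 1 - 4\lambda + 2\mu, \quad -1 \overset{3}{=} -5\lambda.$$
From $\overset{3}{=}$, λ = 0.2. And so from $\overset{1}{=}$, µ = 21/40. These values of λ and µ contradict $\overset{2}{=}$. Hence,
D ∉ q and the four points are not coplanar.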
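The two-step check used in the last two examples (find a normal vector to the plane through three of the points, then test the fourth point) can be sketched in a few lines of Python. This is only an illustrative sketch: the helper names `sub`, `cross`, `dot`, and `are_coplanar` are my own and not part of this textbook, and the points are assumed to have integer (or otherwise exact) coordinates so that the comparison with 0 is exact.

```python
def sub(p, q):
    """Return the vector from q to p, i.e. p - q."""
    return tuple(a - b for a, b in zip(p, q))

def cross(u, v):
    """Cross product of two 3D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar (dot) product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

def are_coplanar(a, b, c, d):
    """True if some plane contains the four points a, b, c, d.

    n = AB x AC is normal to the plane through a, b, c.
    The point d is on that plane exactly when AD . n = 0.
    If a, b, c are collinear, then n is the zero vector and the
    function returns True, which agrees with Fact 116.
    """
    n = cross(sub(b, a), sub(c, a))
    return dot(n, sub(d, a)) == 0

# The points of Examples 757, 758, and 759.
print(are_coplanar((1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, -1)))
print(are_coplanar((2, 3, 5), (8, -1, 0), (0, 1, 0), (-3, -2, -1)))
print(are_coplanar((1, 2, 3), (4, 5, 6), (10, 11, 12), (9, 1, 7)))
```

With floating-point coordinates, one would compare the dot product against a small tolerance instead of testing for exact equality with 0.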
Fact 116. If three of four points are collinear, then the four points are coplanar.
Proof. Given four points A, B, C, and D, suppose A, B, and C are collinear. Let q be the
plane that contains A, B, and D. By Fact 95, q contains the line AB and hence also the
point C. Thus, q contains all four points.
Example 759. Let A = (1, 2, 3), B = (4, 5, 6), C = (10, 11, 12), and D = (9, 1, 7) be points.
The line AB is described by r = (1, 2, 3) + λ (3, 3, 3) (λ ∈ R). By picking λ = 3, we see that
AB also contains the point C. Hence, A, B, and C are collinear.
And so by Fact 116, A, B, C, and D are coplanar. (Indeed, given any point E, it is
similarly true that the four points A, B, C, and E must be coplanar.)
Note though that the converse of Fact 116 is false (see Exercise 250).
Exercise 251. In each of the following, determine if the four points given are coplanar.
If they are, write down the plane that contains all four points. (Answer on p. 1510.)
(a) A = (0, 1, 5), B = (−3, −1, 1), C = (2, 7, 5), and D = (6, 6, 1).
(b) A = (−1, 3, −5), B = (0, 0, 0), C = (6, 1, −2), and D = (4, 7, −12).
(c) A = (0, 1, 2), B = (1, 2, 3), C = (2, 3, 4), and D = (19, 0, −5).
Definition 152. Two or more lines are coplanar if some plane contains all of them.
Example 760. The lines l1 and l2 are coplanar because the plane q1 contains both of
them. Similarly, l2 and l3 are coplanar because the plane q2 contains both of them.
[Figure: the lines l1, l2, and l3 and the planes q1 and q2.]
In contrast, l1 and l3 are not coplanar because no plane contains both of them.
“Obviously”:
Fact 117. If two lines are identical, then they are also parallel and coplanar.
Proof. Let the two (identical) lines be described by $\mathbf{r} = \overrightarrow{OP} + \lambda \mathbf{u}$ (λ ∈ R).
They have parallel direction vectors. And so by Definition 116, they are parallel.
Let v be any vector that is not parallel to u. Then the plane
$\mathbf{r} = \overrightarrow{OP} + \lambda \mathbf{u} + \mu \mathbf{v}$ (λ, µ ∈ R) contains both lines (to verify this, simply let µ = 0).
The following Fact says that in the case of two distinct lines, there are three possibilities:
Fact 118. Suppose l1 and l2 are distinct lines described by $\mathbf{r} = \overrightarrow{OP} + \lambda \mathbf{u}$ and $\mathbf{r} = \overrightarrow{OQ} + \lambda \mathbf{v}$
(λ ∈ R). Then the three possibilities are that l1 and l2 are:
(a) Parallel and do not intersect; moreover, the unique plane that contains l1 and l2 is
described by $\mathbf{r} = \overrightarrow{OP} + \lambda \mathbf{u} + \mu \overrightarrow{PQ}$ (λ, µ ∈ R).
(b) Non-parallel and share exactly one intersection point; moreover, the unique plane
that contains l1 and l2 is described by $\mathbf{r} = \overrightarrow{OP} + \lambda \mathbf{u} + \mu \mathbf{v}$ (λ, µ ∈ R).
(c) Skew (i.e. neither parallel nor intersect) and are not coplanar.
Corollary 24. Two lines are coplanar if and only if they are not skew.
Proof. ( Ô⇒ ) Suppose two lines are coplanar. Either they are parallel or not. If they are
parallel, then by Definition 139, they are not skew. And if they are not parallel, then by
Fact 118(b), they intersect and again by Definition 139 are not skew.
( ⇐Ô ) Fact 117 already proved that two identical lines are coplanar. So suppose two
distinct lines are not skew. Then by Definition 139, they either intersect or are parallel. If
they intersect, then by Fact 118(b), they are coplanar. And if they are parallel, then by
Fact 118(a), they are again coplanar.
Remark 69. In this textbook, we define two lines to be skew if they are not parallel and
do not intersect (Definition 139). Corollary 24 then follows as a result.
However, some writers take the opposite route — they first define two lines to be skew
if they are not coplanar. That is, they use Corollary 24 as their definition of skew lines.
They then prove that our Definition 139 follows as a result.
Since (3, 6, 9) ∥ (1, 2, 3), the two lines are parallel. And so by Fact
118(a), they are coplanar and do not intersect.
Compute (4, 5, 6) − (8, 1, 1) = (−4, 4, 5). By Fact 118(a), the (unique)
plane that contains both lines is:
Since (0, 1, 0) ∦ (1, 0, 0), the two lines are not parallel. And so, there
are the two possibilities given by Fact 118(b) and (c). To check if they
intersect, write:
$$\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} + \hat{\lambda} \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 4 \\ 17 \\ 0 \end{pmatrix} + \hat{\mu} \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} \quad \text{or} \quad 0 \overset{1}{=} 4 + \hat{\mu}, \quad \hat{\lambda} \overset{2}{=} 17, \quad 0 \overset{3}{=} 0.$$
Solving, we have λ̂ = 17 and µ̂ = −4. Thus, the two lines intersect at the point (0, 17, 0).
And so by Fact 118, the two lines are also coplanar and the (unique) plane that contains
them can be described by r = (0, 0, 0) + λ (0, 1, 0) + µ (1, 0, 0) (λ, µ ∈ R).
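Since (9, 1, 3) ∦ (3, 2, 1), the two lines are not parallel. And so again,
there are the two possibilities given by Fact 118(b) and (c). To check if
they intersect, write:
$$\begin{pmatrix} 0 \\ 1 \\ 2 \end{pmatrix} + \hat{\lambda} \begin{pmatrix} 9 \\ 1 \\ 3 \end{pmatrix} = \begin{pmatrix} 4 \\ 5 \\ 6 \end{pmatrix} + \hat{\mu} \begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix} \quad \text{or} \quad 9\hat{\lambda} \overset{1}{=} 4 + 3\hat{\mu}, \quad 1 + \hat{\lambda} \overset{2}{=} 5 + 2\hat{\mu}, \quad 2 + 3\hat{\lambda} \overset{3}{=} 6 + \hat{\mu}.$$
$\overset{1}{=}$ minus 3 × $\overset{3}{=}$ yields −6 = −14, a contradiction. So, the two lines do not intersect. And
so by Fact 118, they are skew and not coplanar.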
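The case analysis of Fact 118 can also be mechanised. The sketch below is not the method used in the examples above (which solve for λ̂ and µ̂ directly). Instead, it swaps in an equivalent test based on cross and dot products: two non-parallel lines are coplanar exactly when the vector PQ is perpendicular to u × v. The function name `classify_lines` and its helpers are hypothetical, and exact (e.g. integer) coordinates are assumed.

```python
def sub(p, q):
    """Return p - q for two 3D points or vectors."""
    return tuple(a - b for a, b in zip(p, q))

def cross(u, v):
    """Cross product of two 3D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Scalar (dot) product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

ZERO = (0, 0, 0)

def classify_lines(p, u, q, v):
    """Classify the lines r = p + lambda*u and r = q + mu*v.

    Returns "identical", "parallel", "intersecting", or "skew".
    """
    pq = sub(q, p)
    if cross(u, v) == ZERO:          # direction vectors are parallel
        # The lines coincide exactly when PQ is also parallel to u.
        return "identical" if cross(pq, u) == ZERO else "parallel"
    # Non-parallel lines are coplanar (and so intersect) exactly
    # when PQ is perpendicular to the common normal u x v.
    return "intersecting" if dot(cross(u, v), pq) == 0 else "skew"

# Illustrative inputs only (the pairing of points and directions is assumed).
print(classify_lines((8, 1, 1), (3, 6, 9), (4, 5, 6), (1, 2, 3)))
print(classify_lines((0, 0, 0), (0, 1, 0), (4, 17, 0), (1, 0, 0)))
print(classify_lines((0, 0, 0), (1, 0, 0), (0, 0, 1), (0, 1, 0)))
```

By Corollary 24, the two lines are coplanar whenever the result is anything other than "skew".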
Exercise 252. Determine whether the lines in each pair are parallel, are coplanar, or intersect. Also,
find any intersection points and any plane that contains both lines. (Answer on p. 1511.)
     l1                              l2
(a)  r = (8, 1, 5) + λ (3, 2, 1)     r = (1, 2, 3) + λ (5, 6, 7)     (λ ∈ R)
(b)  r = (0, 0, 6) + λ (3, 9, 0)     r = (1, 1, 1) + λ (1, 3, 0)     (λ ∈ R)
(c)  r = (6, 5, 5) + λ (1, 0, 1)     r = (9, 3, 6) + λ (0, 1, 1)     (λ ∈ R)
(d)  r = (−1, 3, 8) + λ (−5, 0, 1)   r = (9, 3, 6) + λ (10, 0, 2)    (λ ∈ R)
Definition 153. The imaginary unit, denoted i, is the number that satisfies $i^2 = -1$.
Or equivalently: $i = \sqrt{-1}$. And so, the solutions to the equation $x^2 = -1$ are $x = \pm i$.
In this textbook, we will blithely and naïvely assume that the “usual” rules of arithmetic
also apply to the complex numbers.
Any non-zero, real multiple of the imaginary unit is called a purely imaginary number:
The sum of any real number (including zero) and a purely imaginary number is called an
imaginary number:
Any imaginary number that isn’t purely imaginary is an “impure” imaginary number:
Example 768. The following numbers are complex, imaginary, and “impure” imaginary:
√
3 − 2i, −5 + 13i, 2+ 5i, −13 − i.
Example 769. The following numbers are complex, imaginary, and purely imaginary:
√
2i, − 5i, (47 − π) i.
Example 770. The number i is complex, imaginary, purely imaginary, and the imaginary
unit.
However, i is neither real nor “impure” imaginary.
Example 771. The following numbers are both complex and real:
Remark 70. Be aware that some other writers define purely imaginary numbers to include
the number 0. This is not the practice of this textbook.255 In this textbook, the number
0 is not purely imaginary because 0 = 0 + 0i, but by Definition 154, a purely imaginary
number a + ib must have b ≠ 0.
R denotes the set of real numbers. Similarly, C denotes the set of complex numbers:
Every real number is a complex number, but not every complex number is a real number.
Or equivalently, the set of real numbers is a proper subset of the set of complex numbers:
Fact 119. R ⊂ C.
255 This alternative practice used by other writers is sometimes convenient as one learns more about complex
numbers. However, because this textbook doesn’t go far, it is simpler and more natural to simply
continue calling 0 a real number, as we’ve done all our lives, and not now also call it a
purely imaginary number.
We now reproduce our taxonomy of numbers from p. 42, but also flesh it out a little with
what we’ve just learnt (adding three boxes at the bottom):
Complex C (a + ib, a, b ∈ R)
    Real R (a + ib, b = 0)
        Rationals Q
            Integers Z
            Non-integers
        Irrationals
    Imaginary (a + ib, b ≠ 0)
To fully appreciate why this is so is beyond the scope of the A-Levels.256 But for now, here
is a very simple example just to illustrate the point.
Exercise 253. Fill in the rest of the following table. (Answer on p. 1512.)
√ √
Is this number ... 9 − 2i 3i 0 4 4 + 2i i 3
Complex? ✓
Real? ✗
Imaginary? ✓
Purely imaginary? ✗
“Impure” imaginary? ✓
The imaginary unit? ✗
256
But if you’re interested, see (especially ajotaxte’s answer).
60.1. The Real and Imaginary Parts of Complex Numbers
Definition 159. Given a complex number z = a + ib, its real part is Re z = a and its
imaginary part is Im z = b.
Example 773. Let z = 3 + 2i. Then Re z = Re (3 + 2i) = 3 and Im z = Im (3 + 2i) = 2.
Example 775. Let ω = 19i. Then Re ω = Re (19i) = 0 and Im ω = Im (19i) = 19.
Remark 71. Your A-Level examiners like using the symbols z, w, and ω (lower-case Greek
letter omega) to denote complex numbers, and so that’s what we’ll try to do too.
“Obviously”, two complex numbers z and w are equal if and only if:
Exercise 254. The complex numbers a, b, c, and d are given below. Exactly two are
identical — find them. (Answer on p. 1512.)
√ √
1 2 3 1 3
a= √ − b = √ − √ i, c = sin − sin i, d= − cos (− ) i.
π π π
i,
2 2 2 2 3 3 2 4
Remark 72. Your A-Level examiners seem to follow the convention of writing a + ib or
x + iy rather than a + bi or x + yi. That is, the imaginary unit i is written before a variable
like b or y. And so that’s also what we’ll do too. (The logic seems to be that constants
come before variables and the imaginary unit i is a constant.)
Note though that we will still write 1 + 2i or −3 − 4i rather than 1 + i2 or −3 − i4. That is,
any constants like 1 or 4 will still be written before the imaginary unit i.
Definition 160. Given a complex number z = a + ib, its real part is Rez = a and its
imaginary part is Imz = b.
Exercise 255. Rewrite each number in ordered pair notation. (Answer on p. 1512.)
(a) $z = 33(1 + ei)$. (b) $w = (237 + \pi) - (\sqrt{2} - 3)i$. (c) Re ω = p, Im ω = q.
z + w = −2 + 4i and z − w = −2 − 2i.
z + w = 9 + 4i and z − w = 5 − 6i.
In general:
257 In this textbook, we’ve been neither clear nor explicit about what these rules are. We have simply
assumed that everyone, including you the student, “knows” what they are.
61.2. Multiplication
Here are the powers of i:
$$i^1 = i, \quad i^2 = i \times i = -1, \quad i^3 = i \times i^2 = -i, \quad i^4 = i \times i^3 = 1,$$
$$i^5 = i \times i^4 = i, \quad i^6 = i \times i^5 = -1, \quad i^7 = i \times i^6 = -i, \quad i^8 = i \times i^7 = 1,$$
etc.
Observe that $i^4 = 1$. And so, the cycle repeats after every fourth power.
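Because the cycle has length four, the value of any power of i depends only on the remainder of the exponent when divided by 4. Here is a minimal Python sketch of this observation. The function name `power_of_i` is my own invention, and Python writes the imaginary unit as `1j`.

```python
def power_of_i(n):
    """Return i**n for a non-negative integer n, using the 4-cycle.

    Remainder 0 -> 1, remainder 1 -> i, remainder 2 -> -1, remainder 3 -> -i.
    """
    cycle = (1, 1j, -1, -1j)
    return cycle[n % 4]

# Reproduce the table of powers above.
for n in range(1, 9):
    print(n, power_of_i(n))
```

For instance, since 2019 leaves remainder 3 when divided by 4, `power_of_i(2019)` gives the same value as i³.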
The “usual” rules of multiplication hold:
Google does the basic arithmetic of complex numbers as well as Wolfram Alpha, but
much more quickly. So here in Part IV, whenever you see the logo, click on it and
you’ll be brought to the relevant computation done by Google.
$$zw = (2 - i)(-1 + i) = -2 + 2i + i - i^2 = -1 + 3i.$$
In general:
Definition 161. Given the complex number z = a + ib, its (complex) conjugate is $z^* = a - ib$.
Also, z = a + ib and $z^* = a - ib$ are called a (complex) conjugate pair.
Example 788. The conjugate of $w = -5 - (17 + \sqrt{2})i$ is $w^* = -5 + (17 + \sqrt{2})i$; and w and
$w^*$ are called a conjugate pair.
$$(a + b)(a - b) \overset{1}{=} a^2 - b^2.$$
Recall also that when a denominator contained a surd, we could often use $\overset{1}{=}$ to rationalise
the denominator.
We can often use $\overset{2}{=}$ to realise (“make real”) a denominator that contains a complex number:
258 Ch. 5.5.
Example 796. Let z = 1 + i. Consider the reciprocal of z:
$$\frac{1}{z} = \frac{1}{1 + i}.$$
In general, it is easier to deal with “simpler” denominators. We might thus like to rid
the above denominator of any complex numbers.
Here’s how we can do so by using the conjugate $z^*$. We simply multiply by $z^*/z^* = 1$:
$$\frac{1}{z} = \frac{1}{z}\frac{z^*}{z^*} = \frac{1}{1 + i} \cdot \frac{1 - i}{1 - i} = \frac{1 - i}{1^2 + 1^2} = \frac{1 - i}{2} = \frac{1}{2} - \frac{1}{2}i.$$
(a) $zz^* = a^2 + b^2 = (a^2 + b^2, 0)$; and (b) $\dfrac{1}{z} = \dfrac{z^*}{|z|^2} = \dfrac{1}{a^2 + b^2}(a, -b)$.
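$$\frac{1}{z} = \frac{z^*}{zz^*} = \frac{z^*}{3^2 + 5^2} = \frac{-3 - 5i}{34} = -\frac{3}{34} - \frac{5}{34}i = \frac{1}{34}(-3, -5).$$
$$\frac{1}{w} = \frac{w^*}{ww^*} = \frac{w^*}{1^2 + 1^2} = \frac{1 + i}{2} = \frac{1}{2} + \frac{1}{2}i = \frac{1}{2}(1, 1).$$
$$\frac{1}{\omega} = \frac{1}{1^2 + 1^2}\omega^* = \frac{1 - i}{2} = \frac{1}{2} - \frac{1}{2}i = \frac{1}{2}(1, -1).$$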
Exercise 260. Write down each number’s conjugate and reciprocal in the form a + ib.
(a) z = −5 + 2i. (b) z = 3 − i. (c) z = 1 + 2i. (Answer on p. 1513.)
Remark 73. Just so you know, most writers denote the conjugate of z by $\bar{z}$ (z with an overbar). However,
your A-Level examiners use $z^*$ and so that’s what we’ll do too.
Proof. $\dfrac{z}{w} = \dfrac{z}{w}\dfrac{w^*}{w^*} = \dfrac{zw^*}{c^2 + d^2}$.
$$\frac{z}{w} = \frac{3 + i}{1 - i} = \frac{zw^*}{1^2 + 1^2} = \frac{(3 + i)(1 + i)}{2} = \frac{2 + 4i}{2} = 1 + 2i.$$
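The "multiply by the conjugate" recipe can be written out explicitly. The following Python sketch represents a complex number a + ib by the ordered pair (a, b), as in our ordered pair notation, and uses exact fractions so that no rounding occurs. The function name `divide` is hypothetical, and integer real and imaginary parts are assumed.

```python
from fractions import Fraction

def divide(z, w):
    """Return z / w as an ordered pair (real part, imaginary part).

    With z = a + ib and w = c + id (w non-zero), we compute
    z w* = (ac + bd) + i(bc - ad) and divide by w w* = c^2 + d^2.
    """
    a, b = z
    c, d = w
    denominator = c * c + d * d        # w w*, a positive real number
    return (Fraction(a * c + b * d, denominator),
            Fraction(b * c - a * d, denominator))

print(divide((3, 1), (1, -1)))   # (3 + i) / (1 - i)
print(divide((1, 0), (1, 1)))    # the reciprocal of 1 + i
```

Python’s built-in complex type gives the same answers up to floating-point rounding, e.g. `(3 + 1j) / (1 - 1j)`.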
Exercise 262. For each, find z/w in the form a + ib. (Answer on p. 1513.)
√
(a) z = 1 + 3i, w = −i. (b) z = 2 − 3i, w = 1 + i. (c) z = 2 − πi, w
(d) z = 11 + 2i, w = i. (e) z = −3, w = 2 + i. (d) z = 7 − 2i, w
Fact 126. Every quadratic equation has two complex roots given by $\overset{1}{=}$.
Example 807. The quadratic equation $x^2 - 3x + 2 = 0$ has positive discriminant: $b^2 - 4ac =
(-3)^2 - 4(1)(2) = 1 > 0$. Thus, both of its complex roots are real:
$$x \overset{1}{=} \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{3 \pm \sqrt{1}}{2} = 1, 2.$$
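The quadratic formula works unchanged when the discriminant is negative, provided the square root is allowed to be complex. Below is a small Python sketch using the standard library’s `cmath.sqrt`, which returns a complex square root. The function name `quadratic_roots` is my own, and the coefficients are assumed to be real with a ≠ 0.

```python
import cmath

def quadratic_roots(a, b, c):
    """Return the two complex roots of a x^2 + b x + c = 0 (a non-zero)."""
    root_of_discriminant = cmath.sqrt(b * b - 4 * a * c)
    return ((-b + root_of_discriminant) / (2 * a),
            (-b - root_of_discriminant) / (2 * a))

print(quadratic_roots(1, -3, 2))   # positive discriminant: two real roots
print(quadratic_roots(1, 0, 1))    # x^2 + 1 = 0: roots i and -i
print(quadratic_roots(1, 2, 4))    # negative discriminant: a conjugate pair
```

Note that the roots are always returned as Python complex numbers, even when their imaginary parts are zero.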
Proof. Omitted. See e.g. Schilling, Lankham, & Nachtergaele (2016, Ch. 3).
Example 809. The 2nd-degree polynomial (or quadratic) equation $x^2 - 1 = 0$ has two
roots, namely 1 and −1.
Example 810. The 2nd-degree polynomial (or quadratic) equation $x^2 + 1 = 0$ has two
roots, namely i and −i.
Example 811. By the FTA, the 3rd-degree polynomial equation (or cubic equation)
$x^3 - 8 = 0$ has three roots. Let’s find them using what we learnt in Ch. 21.1.
Observe that $2^3 - 8 = 0$. So one root is 2 and x − 2 is a factor of $x^3 - 8$.
To find the other two factors, write:
$$x^3 - 8 = (x - 2)(x^2 + 2x + 4).$$
We can then further factorise $x^2 + 2x + 4$ using the usual quadratic formula:
$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{-2 \pm \sqrt{2^2 - 4(1)(4)}}{2 \cdot 1} = -1 \pm \sqrt{3}i.$$
Altogether then, the three roots of the 3rd-degree polynomial equation $x^3 - 8 = 0$ are:
$$2, \quad -1 + \sqrt{3}i, \quad -1 - \sqrt{3}i.$$
And here is the cubic polynomial $x^3 - 8$ factorised into its three linear factors:
$$x^3 - 8 = (x - 2)(x + 1 - \sqrt{3}i)(x + 1 + \sqrt{3}i).$$
Exercise 264. Verify that $-1 \pm \sqrt{3}i$ solve $x^3 - 8 = 0$. (Answer on p. 1514.)
$$x^2 - 2x + 1 = (x - 1)^2 = 0$$
$$x^3 - 6x^2 + 12x - 8 = (x - 2)^3 = 0$$
The FTA can be useful even if we have no idea how to solve an equation.
Example 814. We may have no idea how to solve the 17th-degree polynomial equation
$x^{17} + 3x^4 - 2x + 1 = 0$.
Nonetheless, the FTA gives us a useful piece of information, namely that this equation
must have 17 roots or solutions (though some may possibly be repeated).
Exercise 266. You’re given that 1 solves both of the equations given below. Find the
other roots of each equation. (Answer on p. 1514.)
(a) $x^3 + x^2 - 2 = 0$. (b) $x^4 - x^2 - 2x + 2 = 0$.
The above examples suggest that if z = p + iq solves the quadratic equation $ax^2 + bx + c = 0$,
then so too does its conjugate $z^* = p - iq$. It turns out that this is generally true of any
polynomial equation, provided the coefficients are real:
Example 818. If given that both i and 0.5i solve $4x^4 + 5x^2 + 1 = 0$, then by Theorem 12,
we know that their conjugates −i and −0.5i also solve the same equation.
Example 819. If −1 + 2i is a root of $x^3 - 3x^2 - 5x - 25 = 0$, then what are the other two?
Well, by Theorem 12, we know that −1 − 2i must be another root.
So, both x − (−1 + 2i) = x + 1 − 2i and x − (−1 − 2i) = x + 1 + 2i are factors of $x^3 - 3x^2 - 5x - 25$.
Compute:
$$(x + 1 - 2i)(x + 1 + 2i) = (x + 1)^2 - (2i)^2 = x^2 + 2x + 5.$$
Dividing, we find that $x^3 - 3x^2 - 5x - 25 = (x^2 + 2x + 5)(x - 5)$, so that the third root is 5.
Altogether then, the three roots of the cubic equation $x^3 - 3x^2 - 5x - 25 = 0$ are:
−1 + 2i, −1 − 2i, 5.
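As a quick numerical sanity check of the last example, we can plug each claimed root back into the cubic. The sketch below is illustrative only. The function name `p` is my own label for the polynomial, and Python’s built-in complex type (with `1j` for the imaginary unit) does the arithmetic.

```python
def p(x):
    """The cubic polynomial x^3 - 3x^2 - 5x - 25 (real coefficients)."""
    return x ** 3 - 3 * x ** 2 - 5 * x - 25

root = -1 + 2j
for candidate in (root, root.conjugate(), 5):
    print(candidate, p(candidate))

# Because the coefficients are real, conjugating the input conjugates the
# output. So if p(z) = 0, then p(z*) = 0 too, as Theorem 12 asserts.
z = 2 + 7j   # an arbitrary test value, not a root
print(p(z.conjugate()) == p(z).conjugate())
```

The final line illustrates why real coefficients matter: the identity it checks fails in general once a coefficient is non-real.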
Compute: $(x - i)(x + i) = x^2 - i^2 = x^2 + 1$.
$$x^2 + x - 6 = (x + 3)(x - 2).$$
i, −i, −3, 2.
By the way, the condition that all coefficients c0 , c1 , . . . , cn in the polynomial equation are
real is important. If this condition is violated, then the Theorem’s conclusion may not hold:
Observe that not all of the coefficients in $\overset{1}{=}$ are real. And so, Theorem 12’s conclusion
need not hold.
Exercise 267. You’re given that 2 − 3i solves both of the equations below. Find the
other roots of each equation. (Answer on p. 1515.)
(a) $x^4 - 6x^3 + 18x^2 - 14x - 39 = 0$. (b) $-2x^4 + 21x^3 - 93x^2 + 229x - 195 = 0$.
$$x^2 + px + q = 0 \quad (p, q \in \mathbb{R}).$$
z = (a, b) .
And so, complex numbers can also be depicted geometrically as points on the plane. This
time, the real axis is the horizontal or x-axis, while the imaginary axis is the vertical or
y-axis. We call this plane the complex plane or Argand diagram.
Exercise 269. Depict the complex numbers 2, −1, 2i, 1 + 2i, and −1 − 3i on a single
Argand diagram. (Answer on p. 1515.)
Remark 74. Both of the following mathematical objects can be depicted on a plane.
• The set of complex numbers or the complex plane C = {a + ib ∶ a, b ∈ R}; and
• The set of ordered pairs of real numbers or the cartesian plane
{(x, y) ∶ x, y ∈ R}.
However, you should be aware that the complex plane and the cartesian plane are different
mathematical objects. Don’t worry, you need merely be aware that they are different; you
needn’t know what exactly the differences are.
z = a + ib or z = (a, b).
In this chapter, we’ll learn to write down a complex number in polar form. (And in the
next chapter, we’ll learn how to do so in exponential form.)
To write down a complex number z = a + ib = (a, b) in cartesian form, we need two pieces of
information:
To write down a complex number z in polar form, we likewise need two pieces of information:
Informally, the modulus of a complex number is the magnitude or length of its position
vector. Formally:
Definition 162. Given a complex number z = a + ib, its modulus, denoted ∣z∣, is the
following number:
$$|z| = \sqrt{a^2 + b^2}.$$
z = 3 + 2i = (3, 2),
w = −2i = (0, −2),
ω = −2 − 2i = (−2, −2).
Then:
$$|z| = \sqrt{3^2 + 2^2} = \sqrt{13},$$
$$|w| = \sqrt{0^2 + (-2)^2} = 2,$$
$$|\omega| = \sqrt{(-2)^2 + (-2)^2} = 2\sqrt{2}.$$
[Figure: Argand diagram showing z = 3 + 2i with ∣z∣ = √13, w = −2i with ∣w∣ = 2, and ω = −2 − 2i with ∣ω∣ = 2√2.]
Exercise 270. Compute the moduli of the following numbers: (Answer on p. 1516.)
260 Also known as standard or rectangular form.
64.1. The Argument: An Informal Introduction
Informally, a complex number’s argument is the angle that number’s position vector makes
with the positive x-axis:
Example 825. Let z = 3 + 2i = (3, 2), w = −2 = (−2, 0), and ω = −2 − 2i = (−2, −2).
Then: $\arg z = \tan^{-1}\frac{2}{3} \approx 0.588$, $\arg w = \pi$, and $\arg \omega = -\frac{3\pi}{4}$.
[Figure: Argand diagram showing z = 3 + 2i with ∣z∣ = √13 and arg z ≈ 0.588; w = −2 with ∣w∣ = 2 and arg w = π; and ω = −2 − 2i with ∣ω∣ = 2√2 and arg ω = −3π/4.]
Notice that the angles that give us arg z and arg w are measured anti-clockwise from the
positive x-axis. In contrast, the angle that gives us arg ω is measured clockwise from
the positive x-axis.
We will adopt the following informal rule:
• If the complex number a is on or above the x-axis (i.e. Im a ≥ 0), then the angle that
gives us arg a is measured anti-clockwise from the positive x-axis.
• But if a is strictly below the x-axis (i.e. Im a < 0), then the angle that gives us arg a
is measured clockwise from the positive x-axis.
And thus:
• If a is on or above the x-axis, then the angle measured anti-clockwise from the
positive x-axis must be in the interval [0, π].
• And if a is below the x-axis, then the angle measured clockwise from the positive
x-axis must be in the interval (−π, 0).
Altogether then, for any non-zero complex number a, we always have:
Or equivalently, the range or set of principal values of the argument function is:
Remark 75. The argument of the complex number 0, i.e. arg 0, is undefined. For all other
z, we have arg z ∈ (−π, π].
Exercise 271. Find the arguments of 2, −1, 2i, 1 + 2i, and −1 − 3i.(Answer on p. 1516.)
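Recall (Definition 111) that θ, the angle between z and the unit vector i = (1, 0), is given by:
$$\theta = \cos^{-1}\frac{z \cdot \mathbf{i}}{|z||\mathbf{i}|} = \cos^{-1}\frac{(a, b) \cdot (1, 0)}{|(a, b)||(1, 0)|} = \cos^{-1}\frac{a + 0}{\sqrt{a^2 + b^2} \cdot 1} = \cos^{-1}\frac{a}{\sqrt{a^2 + b^2}} = \cos^{-1}\frac{a}{|z|}.$$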
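Definition 163. Given a non-zero complex number z = a + ib, its argument, denoted
arg z, is the following number:
$$\arg z = \begin{cases} \cos^{-1}\dfrac{a}{\sqrt{a^2 + b^2}}, & \text{if } b \geq 0, \\[3mm] -\cos^{-1}\dfrac{a}{\sqrt{a^2 + b^2}}, & \text{if } b < 0. \end{cases}$$
Or equivalently:
$$\arg z = \begin{cases} \cos^{-1}\dfrac{\operatorname{Re} z}{|z|}, & \text{if } \operatorname{Im} z \geq 0, \\[3mm] -\cos^{-1}\dfrac{\operatorname{Re} z}{|z|}, & \text{if } \operatorname{Im} z < 0. \end{cases}$$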
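Definition 163 translates almost word for word into code. The sketch below is one possible rendering in Python. The function names `modulus` and `argument` are my own, and the clamping step is an extra precaution of mine against floating-point rounding (it is not part of the definition).

```python
import math

def modulus(a, b):
    """Modulus of a + ib, as in Definition 162."""
    return math.sqrt(a * a + b * b)

def argument(a, b):
    """Argument of the non-zero complex number a + ib, per Definition 163.

    Returns a value in the interval (-pi, pi].
    """
    if a == 0 and b == 0:
        raise ValueError("arg 0 is undefined")
    ratio = a / modulus(a, b)
    # Guard against rounding pushing the ratio just outside [-1, 1].
    ratio = max(-1.0, min(1.0, ratio))
    theta = math.acos(ratio)          # arccosine, in [0, pi]
    return theta if b >= 0 else -theta

print(argument(3, 2))     # z = 3 + 2i
print(argument(-2, 0))    # w = -2
print(argument(-2, -2))   # omega = -2 - 2i
```

Python’s standard library offers `cmath.phase`, which should agree with `argument` for these inputs, so it can serve as an independent check.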
Take note that there is a negative sign before arccosine for ω (because ω is below the x-axis).
In contrast, there isn’t for either z or w (because they are on or above the x-axis).
The following result is immediate from what we’ve learnt about the Argand diagram and
Definition 163:
Exercise 273. Find each number’s argument, but this time using Definition 163. (Check
that your answers are the same as before.) (Answer on p. 1516.)
Remark 76. We’ve now learnt two methods for computing the argument of a
complex number. The first method, covered in the previous subchapter, may be called the
“look at the graph and use arctangent” method. The second method, covered in this subchapter,
simply uses Definition 163. I personally do not find the second method difficult to remember and so
going forward in this textbook, that’s what I’ll be using. But you should use whichever
you think is easier for you.
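We have: $\cos\theta = \dfrac{A}{H} = \dfrac{a}{r}$ and $\sin\theta = \dfrac{O}{H} = \dfrac{b}{r}$.
Rearranging: a = r cos θ and b = r sin θ.
[Figure: the point z = a + ib = (a, b), at distance r from the origin and at height b.]
Thus, we may also write z in polar form as: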
Fact 128. Let z be a non-zero complex number with ∣z∣ = r and arg z = θ. Then:
z = r (cos θ + i sin θ) .
Proof. We already proved this result above, but only in the case where both a and b are
positive. For a complete proof, see p. 1319 in the Appendices.
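To see Fact 128 at work numerically, we can convert a complex number to its modulus and argument and then rebuild it from r (cos θ + i sin θ). This is a minimal sketch: `to_polar` and `from_polar` are hypothetical helper names, while `cmath.polar` is the Python standard library function that returns the pair (modulus, argument).

```python
import cmath
import math

def to_polar(z):
    """Return (r, theta) with r = |z| and theta = arg z."""
    return cmath.polar(z)

def from_polar(r, theta):
    """Rebuild the complex number r (cos theta + i sin theta)."""
    return r * (math.cos(theta) + 1j * math.sin(theta))

z = 3 + 2j
r, theta = to_polar(z)
print(r, theta)              # modulus and argument of z
print(from_polar(r, theta))  # should reproduce z up to rounding
```

The rebuilt number may differ from z in the last decimal place or so, because cos and sin are evaluated in floating-point arithmetic.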
Exercise 274. Rewrite each complex number in polar form. (Hint: We already computed
their moduli and arguments in earlier exercises.) (Answer on p. 1516.)
Richard Feynman called the above “the most remarkable formula in mathematics”.261
Now, plug θ = π into Euler’s Formula to get:
Euler’s Identity is one of the most extraordinary and beautiful equations in all of
mathematics. It links together five of the most fundamental mathematical constants:
e, i, π, 1, and 0.
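A computer can only check such identities approximately, but it is still reassuring to watch the numbers agree. The short Python sketch below evaluates e^(iπ) + 1 and also compares e^(iθ) with cos θ + i sin θ for one arbitrarily chosen θ. The function name `euler_gap` and the choice θ = 0.7 are mine.

```python
import cmath
import math

def euler_gap(theta):
    """Return |e^(i theta) - (cos theta + i sin theta)|, ideally 0."""
    left = cmath.exp(1j * theta)
    right = math.cos(theta) + 1j * math.sin(theta)
    return abs(left - right)

# Euler's Identity: e^(i pi) + 1 should be 0, up to rounding error.
print(cmath.exp(1j * math.pi) + 1)

# Euler's Formula at an arbitrary angle.
print(euler_gap(0.7))
```

The first printed value is not exactly 0 because π itself is stored only approximately; it is, however, tiny.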
Fun Fact
Leonhard Euler (1707–83) was a stud. There are so many mathematical results and
objects named after him that there is even a Wikipedia entry listing the things named
after him!
This can sometimes result in confusion. For example, what we call Euler’s Formula is
called Euler’s Identity by others and vice versa.
As another example, Euler’s number e = 2.718 . . . is different from Euler’s constant
γ = 0.577 . . .
Even if we count only Euler’s output after he turned blind at around age 60, his output
was of such quality and quantity that has been matched by few other mathematicians in
history.
261 The Feynman Lectures on Physics (1964, p. 22–10).
65.1. Complex Numbers in Exponential Form
Let z be a non-zero complex number with r = ∣z∣ and θ = arg z. Then by Fact 128:
$$z \overset{1}{=} r(\cos\theta + i\sin\theta).$$
By Euler’s Formula, $\cos\theta + i\sin\theta \overset{2}{=} e^{i\theta}$. Now plug $\overset{2}{=}$ into $\overset{1}{=}$ and we will have written z down
in exponential form:
$$z = re^{i\theta}.$$
Fact 129. Let z be a non-zero complex number. Suppose r = ∣z∣ and θ = arg z. Then:
$$z = re^{i\theta}.$$
Exercise 275. Rewrite each complex number in exponential form. (Hint: We already computed
their moduli and arguments in earlier exercises.) (Answer on p. 1516.)
Remark 77. Your A-Level examiners do not seem to use the term exponential form. What
we call exponential form is simply called polar form by them. (And what we call polar
form is also called polar form by them.)
(a) $|zw| = |z||w|$; and (b) $\arg(zw) = \arg z + \arg w + 2k\pi$,
where in (b):
$$k = \begin{cases} -1, & \text{if } \arg z + \arg w > \pi, \\ 0, & \text{if } \arg z + \arg w \in (-\pi, \pi], \\ 1, & \text{if } \arg z + \arg w \leq -\pi. \end{cases}$$
Proof. For (a), see Exercise 277. For (b), see p. 1321 (Appendices).
The additional term 2kπ in Fact 130(b) is to ensure that arg (zw) ∈ (−π, π], as is required
by the definition of the argument. A few examples will make this clear:
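The role of the correction term 2kπ is easy to see in code. Here is a minimal Python sketch of Fact 130. The function name `product_mod_arg` is hypothetical, `cmath.phase` is the standard library function that returns the principal argument, and the sample values of z and w are my own.

```python
import cmath
import math

def product_mod_arg(z, w):
    """Return (|zw|, arg(zw)) using Fact 130, for non-zero z and w."""
    modulus = abs(z) * abs(w)                   # Fact 130(a)
    total = cmath.phase(z) + cmath.phase(w)     # arg z + arg w
    if total > math.pi:
        k = -1
    elif total <= -math.pi:
        k = 1
    else:
        k = 0
    return modulus, total + 2 * k * math.pi     # Fact 130(b)

# k = -1: both arguments are 3*pi/4, so their sum exceeds pi.
print(product_mod_arg(-1 + 1j, -1 + 1j))
# k = 0: the sum of the arguments is already in (-pi, pi].
print(product_mod_arg(2j, 1 + 2j))
# k = 1: both arguments are negative and their sum is below -pi.
print(product_mod_arg(-1 - 1j, -1 - 2j))
```

In each case the second component should agree, up to rounding, with `cmath.phase(z * w)` computed directly from the product.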
To write zw down in cartesian form, we can use ∣zw∣ and arg (zw) to compute:
$$\operatorname{Re}(zw) \approx \sqrt{290}\cos 0.869 \approx 10.994; \quad \text{and} \quad \operatorname{Im}(zw) \approx \sqrt{290}\sin 0.869 \approx 13.005.$$
Of course, since the real and imaginary parts of both z and w are all integers, so too
must be the real and imaginary parts of zw. And so we have in fact Re (zw) = 11 and
Im (zw) = 13. Thus, zw = 11 + 13i.
Alternatively, we can do the usual multiplication, which yields us the exact value of zw:
To write zw down in cartesian form, we can use ∣zw∣ and arg (zw) to compute:
$$\operatorname{Re}(zw) \approx \sqrt{65}\sqrt{82}\cos(-2.733) \approx -67; \quad \text{and} \quad \operatorname{Im}(zw) \approx \sqrt{65}\sqrt{82}\sin(-2.733) \approx -29.$$
Alternatively, we can do the usual multiplication, which yields us the exact value of zw:
To write zw down in cartesian form, we can use ∣zw∣ and arg (zw) to compute:
$$\operatorname{Re}(zw) \approx 5\sqrt{2}\cos 2.356 \approx -5; \quad \text{and} \quad \operatorname{Im}(zw) \approx 5\sqrt{2}\sin 2.356 \approx 5.$$
Thus: zw = −5 + 5i.
Alternatively, we can do the usual multiplication, which yields us the exact value of zw:
Corollary 26. Suppose z is a complex number and a > 0. Then arg (az) = arg z.
Proof. By Fact 130, arg (az) = arg a + arg z + 2kπ = arg z + 2kπ = arg z, where we choose
k = 0 because arg z ∈ (−π, π].
Example 836. Let z = 7 − 9i, so that 5z = 35 − 45i. Then arg z = arg (5z) ≈ −0.910.
Proof. By Fact 130(b), arg (−z) = arg (−1 ⋅ z) = arg (−1) + arg z + 2kπ = π + arg z + 2kπ.
(a) If arg z > 0, then k = −1 and thus: arg (−z) = π + arg z − 2π = arg z − π.
1
Example 837. Let z = 7 − 9i, so that −z = −7 + 9i. Since arg z ≈ −0.910 ≤ 0, by Corollary
27, we have arg (−z) = arg z + π ≈ −0.910 + π ≈ 2.232.
Example 838. Let a = 1, b = 1 + i, c = i, and $d = 1 - \sqrt{3}i$. Then:
$$\arg a = \arg 1 = 0 \leq 0,$$
$$\arg b = \arg(1 + i) = \frac{\pi}{4} > 0,$$
$$\arg c = \arg i = \frac{\pi}{2} > 0,$$
$$\arg d = \arg(1 - \sqrt{3}i) = \frac{-\pi}{3} \leq 0.$$
[Figure: Argand diagram showing a = 1, b = 1 + i, c = i, −a = −1, and −d = −1 + √3i.]
And so by Corollary 27:
Example 839. Let z = 7 − 9i, so that −5z = −35 + 45i. Since arg z ≈ −0.910 ≤ 0, by Corollary
28, we have arg (−5z) = arg z + π ≈ −0.910 + π ≈ 2.232.
Example 840. Let a = 1, b = 1 + i, c = i, and $d = 1 - \sqrt{3}i$. Then:
Exercise 276. For each, find ∣zw∣, arg (zw), ∣−2zw∣, and arg (−2zw). Then express both
zw and −2zw in polar, exponential, and cartesian forms. (Answer on p. 1517.)
(a) z = 1, w = −3. (b) z = 2i, w = 1 + 2i. (c) z = −1 − 3i, w
(d) z = −2 + 5i, w = i. (e) z = −1 − i, w = −1 − 2i. (f) z = −5 − 3i, w
Exercise 277. This Exercise guides you through a proof of Fact 130(a). Let r = ∣z∣,
θ = arg z, s = ∣w∣, and φ = arg w. (Answer on p. 1518.)
(a) $\left|\dfrac{1}{w}\right| = \dfrac{1}{|w|}$.
(b) $\arg\dfrac{1}{w} = -\arg w$.
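By Fact 131, (a) $\left|\dfrac{1}{z}\right| = \dfrac{1}{\sqrt{29}}$ and (b) $\arg\dfrac{1}{z} = -\arg z \approx 0.381$. Also, (a) $\left|\dfrac{1}{w}\right| = \dfrac{1}{\sqrt{10}}$ and
(b) $\arg\dfrac{1}{w} = -\arg w \approx -1.249$. Thus:
$$\frac{1}{z} \approx \frac{1}{\sqrt{29}}(\cos 0.381 + i\sin 0.381) \approx \frac{1}{\sqrt{29}}e^{0.381i},$$
$$\frac{1}{w} \approx \frac{1}{\sqrt{10}}\left(\cos(-1.249) + i\sin(-1.249)\right) \approx \frac{1}{\sqrt{10}}e^{-1.249i}.$$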
As stated, Fact 131(b) does not hold in the special case where w < 0 (that is, where w is a
negative real number). In this special case, we instead simply have:
$$\arg\frac{1}{w} = \arg w = \pi.$$
Example 842. Let ω = −5, so that 1/ω = −1/5. Then arg ω = π and arg (1/ω) = π. So
Fact 131(b) does not hold; that is, arg (1/ω) ≠ − arg ω. Instead, we have:
$$\arg\frac{1}{\omega} = \arg\omega = \pi.$$
Exercise 278. Find the moduli and arguments of each number and its reciprocal. Then
write down the latter in exponential, polar, and cartesian form. (Answer on p. 1518.)
(a) z = 1. (b) w = 2i. (c) z = −17. (d) w = −8i.
(e) z = −2 + 5i. (f) w = −1 − i. (g) z = 1 − 3i. (h) w = 3 + 4i.
(a) $\left|\dfrac{z}{w}\right| = \dfrac{|z|}{|w|}$; and (b) $\arg\dfrac{z}{w} = \arg z - \arg w + 2k\pi$,
where in (b):
$$k = \begin{cases} -1, & \text{if } \arg z - \arg w > \pi, \\ 0, & \text{if } \arg z - \arg w \in (-\pi, \pi], \\ 1, & \text{if } \arg z - \arg w \leq -\pi. \end{cases}$$
Proof. For (a), see Exercise 280. For (b), see p. 1323 (Appendices).
Thus: $\dfrac{z}{w} = -0.1 - 1.7i$.
Alternatively, we can simply do the usual division, which yields us the exact value of z/w:
$$\frac{z}{w} = \frac{5 - 2i}{1 + 3i} = \frac{5 - 2i}{1 + 3i} \cdot \frac{1 - 3i}{1 - 3i} = \frac{5 - 15i - 2i - 6}{1^2 + 3^2} = \frac{-1 - 17i}{10} = -0.1 - 1.7i.$$
Thus: $\dfrac{z}{w} \approx 0.719 + 0.525i$.
Alternatively, we can simply do the usual division, which yields us the exact value of z/w:
$$\frac{z}{w} = \frac{-4 + 7i}{1 + 9i} = \frac{-4 + 7i}{1 + 9i} \cdot \frac{1 - 9i}{1 - 9i} = \frac{-4 + 36i + 7i + 63}{1^2 + 9^2} = \frac{59 + 43i}{82} = \frac{59}{82} + \frac{43}{82}i.$$
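$$\operatorname{Re}\frac{z}{w} \approx \frac{1}{\sqrt{2}}\cos(-1.429) \approx 0.100; \quad \text{and} \quad \operatorname{Im}\frac{z}{w} \approx \frac{1}{\sqrt{2}}\sin(-1.429) \approx -0.700.$$
Thus: $\dfrac{z}{w} = 0.1 - 0.7i$.
Alternatively, we can simply do the usual division, which yields us the exact value of z/w:
$$\frac{z}{w} = \frac{-2 - i}{1 - 3i} = \frac{-2 - i}{1 - 3i} \cdot \frac{1 + 3i}{1 + 3i} = \frac{-2 - 6i - i + 3}{1^2 + 3^2} = \frac{1 - 7i}{10} = 0.1 - 0.7i.$$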
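The two routes taken in the examples above (modulus and argument via Fact 132, versus the usual division) can be compared side by side in a few lines of Python. In this sketch, `quotient_mod_arg` is a hypothetical function name, while `cmath.phase` and `cmath.rect` are standard library functions that compute the principal argument and convert from polar to cartesian form, respectively.

```python
import cmath
import math

def quotient_mod_arg(z, w):
    """Return (|z/w|, arg(z/w)) using Fact 132, for non-zero z and w."""
    modulus = abs(z) / abs(w)                        # Fact 132(a)
    difference = cmath.phase(z) - cmath.phase(w)     # arg z - arg w
    if difference > math.pi:
        k = -1
    elif difference <= -math.pi:
        k = 1
    else:
        k = 0
    return modulus, difference + 2 * k * math.pi     # Fact 132(b)

for z, w in [(5 - 2j, 1 + 3j), (-4 + 7j, 1 + 9j), (-2 - 1j, 1 - 3j)]:
    r, theta = quotient_mod_arg(z, w)
    via_polar = cmath.rect(r, theta)   # r (cos theta + i sin theta)
    print(via_polar, z / w)            # the two routes should agree
```

Any small discrepancy between the two printed values in a row is due to floating-point rounding only.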
Exercise 279. For each, find ∣z/w∣ and arg (z/w). Then express z/w in polar, exponen-
tial, and cartesian forms. (Answer on p. 1519.)
(a) z = 1, w = −3. (b) z = 2i, w = 1 + 2i. (c) z = −1 − 3i, w
(d) z = −2 + 5i, w = i. (e) z = −1 − i, w = −1 − 2i. (f) z = −5 − 3i, w
Exercise 280. Use Facts 130 and 131 to prove Fact 132(a). (Answer on p. 1519.)
My mathematical tutors had never shown me any reason to suppose the Cal-
culus anything but a tissue of fallacies.
Zudem ist es ein Irrtum zu glauben, daß die Strenge in der Beweisführung
die Feindin der Einfachheit wäre. An zahlreichen Beispielen finden wir im
Gegenteil bestätigt, daß die strenge Methode auch zugleich die einfachere
und leichter faßliche ist.
Besides it is an error to believe that rigor in the proof is the enemy of sim-
plicity. On the contrary we find it confirmed by numerous examples that the
rigorous method is at the same time the simpler and the more easily compre-
hended.
Remark 78. To the unconvinced Type 1 Pragmatist thinking of skipping this chapter:
Think again. In recent years, your A-Level examiners have seen fit to screw students over
with curveball, totally-out-of-the-syllabus questions264 involving limits. See especially
Exercise 483(c) (N2017/I/9). So yea, this chapter’s probably worth a quick read.
The limit of f at a is L.
What does this mean? Here are two informal definitions or interpretations:265
262 Ignoring the Central Limit Theorem, the word limit appears on your syllabus only once (on p. 9),
almost in passing, and solely in relation to the definite integral.
263 To keep things simple, we discuss only functional limits (and not sequential limits).
264 We already discussed this “phenomenon” in my Preface/Rant — see p. xli.
265 We will formally define the above statement only in the Appendices (see Definition 252).
Example 846. Consider the function f ∶ R → R defined by $f(x) = x^2 - 1$.
Figure to be
inserted here.
Or: For all values of x that are “close” but not equal to 2,
f (x) is “close” (or possibly even equal) to 3.
The statement the limit of f at a is L can be written more formally (and concisely) as:
$$\lim_{x \to a} f(x) = L.$$
The condition “not equal to” is subtle and requires emphasis. When considering the limit
of a function g at a, we do not care about g (a), the value of the function at a. We only
care about the values of x that are “close” to a. Example:
Example 848. Define the function g ∶ R → R by:
$$g(x) = \begin{cases} x^2 - 1, & \text{for } x \neq 2, \\ 0, & \text{for } x = 2. \end{cases}$$
The function g is very similar to the function f , except that now there is a “hole” in
the curve, with g (2) = 0. As we’ll learn later, this is an example of a removable
discontinuity.
Figure to be
inserted here.
$$\lim_{x \to 2} g(x) = 3.$$
And so, the following four (equivalent) statements are again true:
1. The limit of g at 2 is 3.
2. $\lim_{x \to 2} g(x) = 3$.
3. As x approaches 2, g (x) approaches 3.
4. As x → 2, g (x) → 3.
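To get a feel for why the value of g at 2 is irrelevant to the limit, we can tabulate g at inputs that creep towards 2 from both sides. This is only a numerical illustration, not a proof. The Python function `g` below mirrors the definition in Example 848, and the step sizes are an arbitrary choice of mine.

```python
def g(x):
    """The function of Example 848: x^2 - 1, except that g(2) = 0."""
    return 0 if x == 2 else x ** 2 - 1

# Values of x that are "close" but not equal to 2, from both sides.
for step in (0.1, 0.01, 0.001, 0.0001):
    print(2 - step, g(2 - step), 2 + step, g(2 + step))

# The value AT 2 plays no role in the limit.
print(g(2))
```

The tabulated values of g (x) crowd around 3, even though g (2) itself is 0.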
Actually, the condition “not equal to” goes even further. When we consider the limit of
a function h at a point a, we don’t even care if h (a) is undefined! Example:
Figure to be
inserted here.
$$\lim_{x \to 2} h(x) = 3.$$
And so, the following four (equivalent) statements are again true:
1. The limit of h at 2 is 3.
2. $\lim_{x \to 2} h(x) = 3$.
3. As x approaches 2, h (x) approaches 3.
4. As x → 2, h (x) → 3.
Figure to be
inserted here.
A281.
A282.
To better understand limits, we next look at examples where the limit does not exist:
[Figure: graph of j, which takes the values 1 and −1.]
$$\lim_{x \to 0} j(x).$$
This limit cannot be −1, because for values of x that are “close” to but more than 0, we
have j (x) = 1. Hence:
$\lim_{x \to 0} j(x) \neq -1$, or equivalently: For some values of x that are “close” but
not equal to 0, j (x) is not “close” to −1.
Similarly, this limit cannot be 1, because for values of x that are “close” to but less than
0, we have j (x) = −1. Hence:
$\lim_{x \to 0} j(x) \neq 1$, or equivalently: For some values of x that are “close” but
not equal to 0, j (x) is not “close” to 1.
More generally, the limit cannot be any real number L. For any real number L, we have:
$\lim_{x \to 0} j(x) \neq L$, or equivalently: For some values of x that are “close” but
not equal to 0, j (x) is not “close” to L.
Since there is no real number that equals $\lim_{x \to 0} j(x)$, we simply say that $\lim_{x \to 0} j(x)$ does not
exist.
As we’ll learn later, x = 0 is an example of a jump discontinuity.
Your H2 Maths syllabus does not mention the concepts of a left-hand limit and a
right-hand limit. But they are simple and can aid your understanding. In the above
example:
• The left-hand limit of j at 0 is −1; and
• The right-hand limit of j at 0 is 1.
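The two one-sided limits can be sketched numerically. The Python snippet below assumes that j equals −1 to the left of 0 and 1 to the right of 0, as described in the example; the value assigned at 0 is a hypothetical placeholder, since it plays no role in any limit at 0.

```python
def j(x):
    # Assumed rule: -1 to the left of 0 and 1 to the right of 0.
    if x < 0:
        return -1
    if x > 0:
        return 1
    return 0  # hypothetical value at 0; irrelevant to the limits at 0

# Inputs approaching 0 from the left and from the right.
left_values = [j(-10**-n) for n in range(1, 8)]
right_values = [j(10**-n) for n in range(1, 8)]

print("approaching 0 from the left: ", left_values)
print("approaching 0 from the right:", right_values)
```

The left-hand values stay at −1 and the right-hand values stay at 1, so no single number can serve as the limit of j at 0.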
In formal notation, we’d write:
Given our above informal definitions of the limit, it is not difficult to write down the
following informal definitions of left- and right-hand limits.
We say that the left-hand limit of f at a is L if:
Not surprisingly, the limit of f at a is L ⇐⇒ the left- and right-hand limits of f at a are
also L:
Proof. It turns out that this result is an immediate consequence of our definitions of the
limit (Definition 252), the left-hand limit (Definition 253), and the right-hand limit
(Definition 254). However, these formal definitions are given only in the Appendices.
However, they are not equal. And so, by Fact 133, the limit of j at 0 does not exist.
Figure to be
inserted here.
Then the limit of k at 0 does not exist. This is because there is no real number L such
that:
$$\lim_{x \to 0} k(x) = \infty.$$
At this point you may feel confused. We have two seemingly-contradictory statements.
• “The limit of k at 0 does not exist.” (Or: “$\lim_{x \to 0} k(x)$ does not exist.”)
• “The limit of k at 0 is infinity.” (Or: “$\lim_{x \to 0} k(x) = \infty$.”)
But strangely enough, both of the above statements are true. How can this be?
The key here is to recall what we emphasised on p. 42 — ∞ is not a number. Instead,
it is merely a symbol that is sometimes convenient for helping us say specific things.
We have written:
statement:
Importantly, $\overset{1}{=}$ does not say that $\lim_{x \to 0} k(x)$ is equal or identical to some object called ∞.
Indeed, in writing $\overset{1}{=}$, we do not even commit to the existence of an object called ∞.
Figure to be
inserted here.
Again, the limit of m at 0 does not exist. This is because there is no real number L
such that:
It is likewise true that the left- and right-hand limits of m at 0 do not exist. This is
because there is no real number L such that:
For all values of x that are “close” but less than 0, m (x) is “close” (or possibly even equal) to L;
or
For all values of x that are “close” but greater than 0, m (x) is “close” (or possibly even equal) to L.
Figure to be
inserted here.
Observe that ln is undefined “near” −1. And so, the limit of ln at −1, or $\lim_{x \to -1} \ln x$, does not exist.
One key motivation for having the concept of limits is that it can help us understand
how a function behaves “near” a point.
And so, if a function is undefined “near” a point, then there is nothing to understand. In
which case, we shall simply say that the limit does not exist at that point.
So, here for example, the function ln is undefined “near” −1. Thus, we shall simply say that $\lim_{x \to -1} \ln x$ does not exist.
Indeed, if a is any negative number, then the limit $\lim_{x \to a} \ln x$ likewise does not exist. The reason is that ln is undefined “near” any negative number a.
By the way, what is the limit of ln at 0? Following our previous examples, we observe
that the right-hand limit of ln at 0 is −∞ and we may write:
$$\lim_{x \to 0^+} \ln x \overset{1}{=} -\infty$$
$$\lim_{x \to 0} \ln x = -\infty.$$
Informally, the reason is that for all values of x that are in the domain of ln, as x
“approaches” 0, it is indeed the case that ln x “grows” without bound from below.
$$f(x) = \begin{cases} \sin \dfrac{1}{x} & \text{for } x \neq 0, \\ 0 & \text{for } x = 0. \end{cases}$$
This is a very strange function indeed. Like sin, f takes on values between −1 and 1.
But as x gets “closer” to 0, f (x) fluctuates ever more rapidly between −1 and 1. Indeed,
when we’re very “close” to 0, it’s impossible to accurately depict the graph of f .
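The rapid fluctuation near 0 can be sketched numerically. The Python snippet below assumes the piecewise rule for f given above and evaluates it along two sequences of inputs that both shrink towards 0; the particular sequences are chosen for illustration so that sin (1/x) lands on 1 along one of them and on −1 along the other.

```python
import math

def f(x):
    # sin(1/x) away from 0, and 0 at 0 itself.
    return math.sin(1 / x) if x != 0 else 0.0

# Along these inputs, 1/x = pi/2 + 2*pi*n, so sin(1/x) is 1 (up to rounding).
peaks = [1 / (math.pi / 2 + 2 * math.pi * n) for n in range(1, 6)]
# Along these inputs, 1/x = 3*pi/2 + 2*pi*n, so sin(1/x) is -1 (up to rounding).
troughs = [1 / (3 * math.pi / 2 + 2 * math.pi * n) for n in range(1, 6)]

for p, t in zip(peaks, troughs):
    print(f"f({p:.5f}) = {f(p):+.6f}    f({t:.5f}) = {f(t):+.6f}")
```

Both sequences of inputs approach 0, yet the outputs stay near 1 along one and near −1 along the other, so f (x) does not settle near any single number.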
[Figure: graph of f, fluctuating between −1 and 1 near 0.]
Observe that for all values of x that are “close to” but not equal to 0, there is no number
L that f (x) is “close to”. Instead, when x is “close to” 0, f (x) takes on every value in
[−1, 1] “infinitely often”! And so, “near” 0, there is no number L that f (x) can be said
to be “close to”. In other words:
Theorem 14. (Rules for Limits) Suppose $k, L, M \in \mathbb{R}$, $\lim_{x \to a} f(x) = L$, and $\lim_{x \to a} g(x) = M$. Then:
(a) $\lim_{x \to a} [k f(x)] \overset{F}{=} kL$ (Constant Factor Rule)
(b) $\lim_{x \to a} [f(x) \pm g(x)] \overset{\pm}{=} L \pm M$ (Sum and Difference Rules)
(c) $\lim_{x \to a} [f(x) g(x)] \overset{\times}{=} LM$ (Product Rule)
(d) $\lim_{x \to a} \dfrac{1}{g(x)} \overset{R}{=} \dfrac{1}{M}$ (provided M ≠ 0) (Reciprocal Rule)
(e) $\lim_{x \to a} \dfrac{f(x)}{g(x)} \overset{\div}{=} \dfrac{L}{M}$ (provided M ≠ 0) (Quotient Rule)
(f) $\lim_{x \to a} k \overset{C}{=} k$ (Constant Rule)
(g) $\lim_{x \to a} x^k \overset{P}{=} a^k$ (Power Rule)
Remark 79. As stated, limits are not on your H2 Maths syllabus. And so, a fortiori,
neither are the above Rules for Limits.
Nonetheless, since the above Rules are so simple, “obvious”, and easy to remember, it is
probably not much of a cognitive burden on you to include them here. We will, moreover,
find the above Rules useful when learning how to compute derivatives shortly.
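The Rules for Limits can be spot-checked numerically. The sketch below is only an illustration, not a proof: the functions f (x) = x² and g (x) = x + 1, the point a = 3, and the constant k = 5 are all hypothetical choices, for which L = 9 and M = 4.

```python
def f(x):
    return x**2      # hypothetical choice; its limit at 3 is L = 9

def g(x):
    return x + 1     # hypothetical choice; its limit at 3 is M = 4

a, L, M, k = 3.0, 9.0, 4.0, 5.0
x = a + 1e-8         # a point "close" to, but not equal to, a

# Each entry pairs a value computed near a with the value the Rule predicts.
checks = {
    "constant factor": (k * f(x), k * L),
    "sum": (f(x) + g(x), L + M),
    "difference": (f(x) - g(x), L - M),
    "product": (f(x) * g(x), L * M),
    "reciprocal": (1 / g(x), 1 / M),
    "quotient": (f(x) / g(x), L / M),
}
for name, (near, predicted) in checks.items():
    print(f"{name:16s} near a: {near:.6f}    rule predicts: {predicted:.6f}")
```

In each row the value computed near a agrees with the prediction to several decimal places.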
267
These Rules are most definitely not on your H2 Maths syllabus. But since they are so simple and
“obvious”, there’s probably no pedagogical harm in listing them without proof.
68. Continuity, Revisited
In Ch. 11, we already briefly discussed the concept of continuity. Recall that informally, a
continuous function is one whose entire graph can be drawn without lifting your pencil.
We now revisit the concept of continuity, now that we have an intuitive grasp of the idea of
limits. In particular, we can now use limits to write down a formal definition of continuity:
Definition 164. Let f be a nice function268 with domain D. Let a ∈ D. We say that f
is continuous at a if either of the following Conditions holds:
1. $\lim_{x \to a} f(x) = f(a)$; or
2. a is an isolated point of D.
The above Definition gives two Conditions under which f is said to be continuous at a.
Condition 1 is the important one that we’ll focus on. In words, it says that the limit of f
at a equals the value of f at a.
Condition 2 is less important. You can think of it as an annoying technicality that we’ll
briefly discuss only in Ch. 68.6 (optional).
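Condition 1 can be sketched as a rough numerical test. The Python snippet below reuses the functions f and g from Examples 846 and 848; the helper name looks_continuous_at, the step size, and the tolerance are assumptions made here for illustration, and the check is a heuristic that can be fooled, not a substitute for the definition.

```python
def f(x):
    return x**2 - 1

def g(x):
    # Same as f, except for a "hole" at 2 where the value is 0.
    return 0 if x == 2 else x**2 - 1

def looks_continuous_at(func, a, step=1e-7, tol=1e-5):
    # Heuristic: values just left and right of a should be close to func(a).
    return (abs(func(a - step) - func(a)) < tol
            and abs(func(a + step) - func(a)) < tol)

print("f at 2:", looks_continuous_at(f, 2))
print("g at 2:", looks_continuous_at(g, 2))
```

For f, the nearby values agree with f (2) = 3. For g, the nearby values are near 3 while g (2) = 0, so Condition 1 fails at 2.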
We’ll illustrate continuity using examples from the previous chapter:
Figure to be
inserted here.
Figure to be
inserted here.
The function g is almost identical to the function f from the previous example.
It is again the case that g is continuous at 1, 0, and −1, because:
Definition 166. Let f be a nice function and a be a non-isolated point in its domain.
We say that f is discontinuous at a (or has a discontinuity at a) if:
270
These are formally defined in Definition 262 (Appendices).
In our last example, the function g had a removable discontinuity at 2. Informally, a
removable discontinuity is simply where we have a “little hole” at a point — patch that
“little hole” and the function becomes continuous at that point.
Two more examples of functions with a removable discontinuity:
[Figure: graph of j, with the values 1 and −1 marked on the vertical axis.]
Any other discontinuity — that is, any discontinuity that isn’t a removable or a jump
discontinuity — is called an essential (or infinite) discontinuity. (So this is really
$$f(x) = \begin{cases} \sin \dfrac{1}{x} & \text{for } x \neq 0, \\ 0 & \text{for } x = 0. \end{cases}$$
Thus, f is discontinuous at 0.
It turns out that this discontinuity at 0 is neither a removable discontinuity nor a jump
discontinuity. And so, we call it an essential (or infinite) discontinuity.
[Figure: graph of f.]
It turns out that again, f is continuous everywhere except at a single point (namely 0).
That is, f is continuous on (−∞, 0) ∪ (0, ∞), but discontinuous at 0. And so, f is not a
continuous function.
Using the formal definitions of the three types of discontinuities (given in the Appendices),
we can prove that none of these discontinuities is a removable or a jump discontinuity.
Hence, each is an essential (or infinite) discontinuity. Thus, we may say that d is
essentially (or infinitely) discontinuous everywhere. Or equivalently, d has an essential
(or infinite) discontinuity at every point a ∈ R.
Figure to be
inserted here.
Indeed, perhaps surprisingly, h is continuous at every point in its domain (we prove this
formally on p. XXX in the Appendices). Thus, h is a continuous function!
Observe that this example proves that our informal definition of continuity (“we can draw
the entire graph without lifting our pencil”) is actually not quite correct!271 Although h
is continuous, we are unable to draw its entire graph without lifting our pencil.
Example 876. As noted in the previous chapter, $\lim_{x \to -1} \ln x$ does not exist or is undefined.
Figure to be
inserted here.
However, as also noted, the natural logarithm function ln has domain R+ . So, −1 is not
in its domain. Equivalently, ln is not defined at −1.
Thus, ln is neither continuous nor discontinuous at −1.
Indeed, perhaps surprisingly, ln is continuous at every point in its domain (we prove this
formally on p. XXX in the Appendices). Thus, ln is a continuous function!
271
To be precise, the condition given our informal definition is sufficient but not necessary for continuity.
Example 877. XXX
$$\lim_{x \to 0} \sin x^2 \overset{\star}{=} \sin \left( \lim_{x \to 0} x^2 \right) = \sin 0 = 0.$$
It turns out that the above is correct. However, the step taken at $\overset{\star}{=}$ requires justification.
Why is it that we can simply “move” the limit in?
One of the (many) nice things about continuity is the following result. When taking
the limit of a composite function, we can “move” the limit in if the “outer” function is
continuous. This justifies $\overset{\star}{=}$ in the above example: since sin is continuous, we can “move”
$\lim_{x \to 0}$ in.
Of course, in H2 Maths, most functions we’ll encounter are continuous, so that the above
result usually applies.
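The idea of "moving" the limit in can be sketched numerically for the composite function x ↦ sin x². The Python snippet below is an illustration only; the sample inputs are arbitrary.

```python
import math

# As x approaches 0, the "inner" value x^2 approaches 0, and because sin is
# continuous, sin(x^2) approaches sin(0) = 0 as well.
for x in (0.1, 0.01, 0.001):
    inner = x**2
    print(f"x = {x:<6} x^2 = {inner:.8f}    sin(x^2) = {math.sin(inner):.8f}")

print("sin of the inner limit:", math.sin(0.0))
```

The last column shrinks towards sin 0 = 0, in line with the displayed computation.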
Intuition would also suggest that continuity is preserved272 under the four basic arithmetic
operations and scalar multiplication:
Theorem 16. Suppose the nice functions f and g are continuous at a. Then (a) f ±g and
(b) f ⋅ g are also continuous at a. (c) If moreover g (a) ≠ 0, then f /g is also continuous
at a. (d) If c ∈ R, then cf is also continuous at a.
272
Or closed.
Using the above four results, we can quite easily prove that all polynomial functions
are continuous:
Proof. Let g and h be functions with the same domain as f . Let g and h have mapping
rules g (x) = x and h (x) = c0 .
By Fact 136, g is continuous.
By Theorem 15, g 2 = g ○ g is also continuous. That is, the nice function that has the same
domain as f and the mapping rule x ↦ x2 is also continuous.
Similarly, for any n = 1, 2, 3, . . . , the repeated application of Theorem 15 shows that g n =
g ○ g ○ ⋅ ⋅ ⋅ ○ g is also continuous.
Now, observe that the mapping rule of f may also be rewritten as:
Fact 138. The sine and cosine functions are continuous.
Corollary 29. The functions sin−1 , cos−1 , and exp are continuous.
Corollary 30. The functions tan, cosec, sec, cot, and tan−1 are continuous.
In summary:
Copy-pasting Definition 77 of an elementary function, the above Theorem says that each
of the following functions is continuous:
Exercise 287. Prove Corollary 30. (You should carefully specify any definitions and
results used at each step of the way.) (Answer on p. 1521.)
In an ideal universe, Condition 2 would be unnecessary because it would already have been
implied by Condition 1 of Definition 164, namely $\lim_{x \to a} f(x) = f(a)$. Unfortunately, if a is an
isolated point of D, then by our formal definition of limits (see Ch. 121.2), the limit of
f at a does not exist, so that $\lim_{x \to a} f(x) = f(a)$ is necessarily false. Hence the annoying,
additional need for Condition 2.
273
For the formal definition, see Definition 250 (Appendices).
274
Here are two. First, in general, we want to be able to say that the restriction of a continuous function
to a subset of its domain is also continuous. So, say f ∶ X → Y is continuous. Let S ⊆ X be a set of
isolated points in X. Now consider the function g ∶ S → Y defined by x ↦ f (x). We want to say that g
is also continuous.
Second, under more general definitions of continuity (e.g. the topological one where the pre-image of
every open set is also open), a function is continuous at its isolated points. So here in our more specialised
setting, we want to also include in our definition of continuity any isolated points.
69. The Derivative
Differentiation, or the problem of finding the derivative, is the problem of finding the
gradient of a curve.
Graphed below is some function f ∶ R → R.
Consider the point A = (a, f (a)). Let l be the tangent line to the graph of f at A.
Now, how might we find the gradient of the line l? Unsure of how to proceed, we try a
series of approximations.
1. We first pick some point B = (b, f (b)) on f .
Consider the line AB. We have:
$$\text{AB's gradient} = \frac{\text{Rise}}{\text{Run}} = \frac{f(b) - f(a)}{b - a}.$$
Clearly, our actual tangent line l is steeper than the line AB. Nonetheless, AB’s gradient
serves as our first crude estimate of l’s gradient.
Now, can we improve on this crude estimate? Sure. Simply:
2. Pick some point C = (c, f (c)) that’s also on f but which is closer to A than B.
Consider the line AC. It is a little steeper than AB but still not as steep as l.
We have:
$$\text{AC's gradient} = \frac{\text{Rise}}{\text{Run}} = \frac{f(c) - f(a)}{c - a}.$$
The gradient of AC now serves as our second and slightly-improved estimate of l’s gradient.
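[Figure: graph of f with the tangent line l at A and the line AC. The points A, C, and B lie on f at x = a, c, and b, with heights f (a), f (c), and f (b).]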
We can keep repeating the above procedure to get ever-improved estimates of l’s gradient:
• Pick a point D = (d, f (d)) that’s on f but which is closer to A than C. The gradient of
the line AD serves as our third and slightly-improved estimate of l’s gradient.
• Pick a point E = (e, f (e)) that’s on f but which is closer to A than D. The gradient of
the line AE serves as our fourth and slightly-improved estimate of l’s gradient.
• Etc.
The above suggests that the gradient of the line l at the point A can be written as:
$$\lim_{x \to a} \frac{f(x) - f(a)}{x - a}.$$
We are thus motivated to write down the following formal definition of the derivative.275
If this limit exists (i.e. is equal to a real number), then we say that f is differentiable at
a, call this limit the derivative of f at a, and denote it by $f'(a)$, $\frac{df}{dx}(a)$, $\left.\frac{df}{dx}\right|_{x=a}$, or $\dot{f}(a)$.
If this limit doesn’t exist, then we say that f is not differentiable at a.
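The procedure of picking points B, C, D, E ever closer to A can be sketched numerically. The Python snippet below is an illustration under assumptions: the curve f (x) = √x, the point a = 1, and the sample x-values are hypothetical choices (the text does not specify f), and secant_gradient is a helper name introduced here.

```python
import math

def f(x):
    return math.sqrt(x)   # hypothetical curve, chosen only for illustration

def secant_gradient(func, a, x):
    # Rise over run of the line through (a, func(a)) and (x, func(x)).
    return (func(x) - func(a)) / (x - a)

a = 1.0
sample_points = (4.0, 2.25, 1.21, 1.0201, 1.000001)   # ever closer to a
gradients = [secant_gradient(f, a, x) for x in sample_points]

for x, m in zip(sample_points, gradients):
    print(f"second point at x = {x:<9} secant gradient = {m:.6f}")
```

For this particular curve, each secant line is a little steeper than the last, and the gradients settle towards 0.5, the gradient of the tangent line at x = 1.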
Remark 80. The lines AB, AC, AD, and AE in our above discussion are sometimes called
secant lines. We may thus consider the tangent line l to be the limit of these secant
lines.
This is just so you know — the term secant lines does not appear on your H2 Maths
syllabus or exams.
Using the Rules for Limits (Theorem 14), we can compute derivatives:
275
Technical/pedagogical note: Strictly speaking, it is unnecessary to assume that the function’s domain
is an interval. What matters is that the point a is a limit point of D (see Definition 251). However, by
imposing this (strong) assumption, every point in D is a limit point, thus allowing us to avoid having
to speak of limit points in the main text. Besides, most (all?) functions encountered in H2 Maths are
defined on intervals (or unions thereof). So all things considered, this assumption is mostly harmless.
Example 889. Define f ∶ R → R by f (x) = ∣x∣.
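Using the above Rules, we can prove that the derivative of f at 2 is 1, i.e. that f ′ (2) = 1:
$$\begin{aligned}
\lim_{x \to 2} \frac{f(x) - f(2)}{x - 2} &= \lim_{x \to 2} \frac{|x| - |2|}{x - 2} \\
&= \lim_{x \to 2} \frac{x - 2}{x - 2} && \text{(For all } x \text{ “near” 2, } x \geq 0 \text{ and hence } |x| = x\text{)} \\
&\overset{F}{=} 1 && \text{(Constant Factor Rule).}
\end{aligned}$$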
Figure to be
inserted here.
We can similarly show that f is likewise differentiable at any a > 0. In particular, we can
prove that the derivative of f at any a > 0 is also 1 — in other words, we can prove that
f ′ (a) = 1 for any a > 0:
$\overset{F}{=} 1$ (Constant Factor Rule).
We will continue with this example below. But first, Exercise 288.
Figure to be
inserted here.
Using the Rules for Limits, we can prove that the derivative of g at 2 is 4, i.e. g ′ (2) = 4:
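$$\begin{aligned}
\lim_{x \to 2} \frac{g(x) - g(2)}{x - 2} &= \lim_{x \to 2} \frac{x^2 - 2^2}{x - 2} && \text{(Simply plug in)} \\
&= \lim_{x \to 2} \frac{(x - 2)(x + 2)}{x - 2} \\
&= \lim_{x \to 2} (x + 2) && \text{(Note that } x \neq 2\text{)} \\
&\overset{\pm}{=} \lim_{x \to 2} x + \lim_{x \to 2} 2 && \text{(Sum and Difference Rules)} \\
&\overset{P,C}{=} 2 + 2 = 4 && \text{(Power and Constant Rules).}
\end{aligned}$$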
Or more informally:
“Near” a (or, when we “zoom in” to a), f “looks” like a straight line.
Example 892. The graph of sin doesn’t “look” like a straight line anywhere.
Figure to be
inserted here.
However, if we pick any point, say x = 0, and zoom in, then the graph does “look”
increasingly like a straight line. And indeed, sin is differentiable at 0.
Example 893. Consider the absolute value function. No matter how far we zoom in at
the point 0, it never looks like a straight line.
Figure to be
inserted here.
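Why zooming in at 0 never helps for the absolute value function can be sketched numerically. The Python snippet below computes the difference quotient of x ↦ ∣x∣ at 0 for inputs on each side of 0; the helper name quotient and the sample inputs are choices made here for illustration.

```python
def quotient(x):
    # Difference quotient of the absolute value function at 0.
    return (abs(x) - abs(0)) / (x - 0)

# Inputs approaching 0 from the left and from the right.
left_quotients = [quotient(-10**-n) for n in range(1, 7)]
right_quotients = [quotient(10**-n) for n in range(1, 7)]

print("from the left: ", left_quotients)
print("from the right:", right_quotients)
```

The quotients stay at −1 on the left and at 1 on the right, however close to 0 we get, so the limit that would define the derivative at 0 does not exist.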
Remark 81. Note that instead of approximately linear, some writers say locally linear.276
$$\text{“} f'(a) \approx \frac{f(x) - f(a)}{x - a} \text{”.}$$
Then rearranging, we have:
Figure to be
inserted here.
However, the converse is false — that is, a function may be continuous without also being
differentiable.
[Figure: graphs of f, g, and h, with the points A and B marked.]
So, given any point a ∈ S, f ′ (a) gives us the gradient of the tangent line to f at that point.
Let us stress, emphasise, and repeat: the derivative f ′ is itself also a function. In partic-
ular, it is the function whose:
• Domain is the set of points at which the derivative of f exists; and
• Mapping rule is $a \mapsto \lim_{x \to a} \dfrac{f(x) - f(a)}{x - a}$.
Exercise 290. XXX (Answer on p. 681.)
A290.
Suppose that a ∈ Domain f . Then the following limit (if it exists) is simply a number and
is called the derivative of f at a:
$$\lim_{x \to a} \frac{f(x) - f(a)}{x - a}.$$
There are, again, at least three different ways to denote the derivative of f at a:
$$\begin{aligned}
\lim_{x \to a} \frac{f(x) - f(a)}{x - a} &= f'(a) && \text{(Lagrange’s notation)} \\
&= \left. \frac{df}{dx} \right|_{x=a} = \frac{df}{dx}(a) && \text{(Leibniz’s notation)} \\
&= \dot{f}(a). && \text{(Newton’s notation)}
\end{aligned}$$
Of course, Newton’s notation is very similar to Lagrange’s: instead of the prime symbol ′
to the right of the name of the function f , Newton uses a dot over f .
Remark 82. The notation of Lagrange and Leibniz are widely used. Newton’s is not.
Indeed, Newton’s notation does not appear in any of your recent years’ A-Level exams
and we shall not use it in this textbook.
Nonetheless, Newton’s notation is sometimes used in physics (especially when the in-
dependent variable is time). Moreover, it appears on p. 18 of your syllabus. So, it’s
probably worth knowing about.
One convenience of Leibniz’s notation is that it allows us to interpret $\frac{d}{dx}$ as the differentiation
operator or function. That is, $\frac{d}{dx}$ is itself a function that maps a function (e.g.
f ) to another (e.g. f ′ ).277
277
Operator and function are synonyms. However, if a function maps functions to other functions, then
we tend to call this function an operator. Here for example, $\frac{d}{dx}$ maps a function f to another function
f ′ and so we call it an operator.
Example 899. The statement “$\frac{d}{dx} x^2 = 2x$” is simply shorthand for:
Or: The function or operator $\frac{d}{dx}$ maps the
function with mapping rule $x \mapsto x^2$ to
the function with mapping rule $x \mapsto 2x$.
Example 900. The statement “$\frac{d}{dx} f = g$” is simply shorthand for:
Or: The function or operator $\frac{d}{dx}$ maps the
function named f to the function named g.
Example 901. The statement “$\frac{d}{dx} f \cdot g = f' \cdot g + f \cdot g'$” (this is the Product Rule) is simply
shorthand for:
Or: The function or operator $\frac{d}{dx}$ maps the
function with mapping rule $x \mapsto (f \cdot g)(x)$ to
the function with mapping rule $x \mapsto f'(x) \cdot g(x) + f(x) \cdot g'(x)$.
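The view of differentiation as an operator that takes in one function and hands back another can be sketched in code. The Python snippet below is a numerical illustration only: the name d_dx, the central-difference approximation, and the step size are assumptions introduced here, not how the derivative is defined in this textbook.

```python
def d_dx(func, step=1e-6):
    """Take a function and return a NEW function that approximates its derivative."""
    def derivative(x):
        # Central-difference approximation to the gradient of func at x.
        return (func(x + step) - func(x - step)) / (2 * step)
    return derivative

def square(x):
    return x**2

# d_dx maps the function x -> x^2 to (an approximation of) the function x -> 2x.
square_prime = d_dx(square)

for x in (-1.0, 0.5, 3.0):
    print(f"x = {x:<5} approximate derivative = {square_prime(x):.6f}    2x = {2 * x:.6f}")
```

Note that d_dx never returns a number; its input and its output are both functions, which is what makes it an operator.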
Isaac Newton (1643–1727) was one of the greatest physicists ever and also one of the
greatest mathematicians ever. It is not surprising then that one writer ranked him the
second-most influential person in history (and the only one among the top six who was
a non-religious figure).278
Gottfried Wilhelm von Leibniz279 (1646–1716) was likewise a first-rate genius and a polymath.
Indeed, he is sometimes called “the last man to know everything”, the
rationale being that:
Since his time the growth of knowledge has resulted in, and indeed neces-
sitated, specialization. The horizon for the individual is now restricted,
for few can hope to attain proficiency in more than one subject.
Newton and Leibniz are often dubbed the “inventors” of the calculus. Indeed, their
dispute over who “invented” calculus is perhaps history’s most famous academic dispute.
(Even history’s greatest geniuses are not above some petty bickering.)280 But as has been
well said by the historian of mathematics Carl B. Boyer (1949):
278
Michael Hart, in The 100: A Ranking of the Most Influential Persons in History (1978, 1992). In case
you’re wondering, Muhammad was ranked first and Jesus third. Full rankings plus summary here and
book here.
279
Sometimes spelt Leibnitz.
280
Jason Socrates Bardi gives a popular account of this dispute in The Calculus Wars: Newton, Leibniz,
and the Greatest Mathematical Clash of All Time (2007).
69.5. Proving Several Rules of Differentiation
In Ch. 18.5, we gave several (informal) Rules of Differentiation.281 We now reproduce
these verbatim:
And so, in this and the next two subchapters, we shall formally and properly restate and
prove several of the above Rules of Differentiation. These proofs are not on the syllabus,
so the Type 1 pragmatist can choose to skip them.
We start with the simplest, the Constant Rule, which says that the derivative of a
constant function is a zero function:
$$f'(x) \overset{C}{=} 0. \qquad \text{(Constant Rule)}$$
where the last step uses the Constant Rule for Limits (see Theorem 14).
We’ve just shown that for any a ∈ D, the derivative of f at a is 0. Thus, the derivative of
f is the function f ′ ∶ D → R defined by f ′ (x) = 0.
By the way, the above result’s converse is also true. That is, a function whose derivative
is a zero function is itself a constant function:
We next state and prove the Constant Factor Rule, which says that the derivative
of a constant multiple of a function is the scalar multiple of that function’s
derivative:
$$(cf)'(x) \overset{F}{=} c f'(x). \qquad \text{(Constant Factor Rule)}$$
where $\overset{1}{=}$ and $\overset{2}{=}$ use the Product and Constant Rules for Limits (see Theorem 14), while $\overset{3}{=}$
The Sum and Difference Rules say that the derivative of the sum (or difference)
of two functions is the sum (or difference) of their derivatives:
$$(f \pm g)'(x) \overset{\pm}{=} f'(x) \pm g'(x). \qquad \text{(Sum and Difference Rules)}$$
$$f'(x) \overset{P}{=} c x^{c-1}. \qquad \text{(Power Rule)}$$
A complete proof of the Power Rule is beyond the scope of H2 Maths and this textbook.
Nonetheless and very excitingly, we will now learn to prove the Power Rule in the special
case where the exponent c is a non-negative integer.
Of course, we’ve already proven the Power Rule in the simplest case where the exponent is
0 — this is simply the Constant Rule. And so, let us start with the next simplest case —
the case where the exponent is 1 — and work our way up:
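$$\begin{aligned}
\lim_{x \to a} \frac{g(x) - g(a)}{x - a} &= \lim_{x \to a} \frac{x^2 - a^2}{x - a} && \text{(Simply plug in)} \\
&= \lim_{x \to a} \frac{(x - a)(x + a)}{x - a} \\
&= \lim_{x \to a} (x + a) && \text{(Note that } x \neq a\text{)} \\
&\overset{\pm}{=} \lim_{x \to a} x + \lim_{x \to a} a && \text{(Sum and Difference Rules)} \\
&\overset{P,C}{=} a + a = 2a && \text{(Power and Constant Rules).}
\end{aligned}$$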
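$$\begin{aligned}
\lim_{x \to a} \frac{h(x) - h(a)}{x - a} &= \lim_{x \to a} \frac{x^3 - a^3}{x - a} && \text{(Simply plug in)} \\
&\overset{1}{=} \lim_{x \to a} \frac{(x - a)(x^2 + ax + a^2)}{x - a} \\
&= \lim_{x \to a} (x^2 + ax + a^2) && \text{(Note that } x \neq a\text{)} \\
&\overset{\pm,F}{=} \lim_{x \to a} x^2 + a \lim_{x \to a} x + \lim_{x \to a} a^2 && \text{(Sum, Difference, and Constant Factor Rules)} \\
&\overset{P,C}{=} a^2 + a \cdot a + a^2 = 3a^2 && \text{(Power and Constant Rules).}
\end{aligned}$$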
What we’ve just shown is that for any a ∈ D, the derivative of h at a is 3a2 .
Thus, the derivative of h is the function h′ ∶ D → R defined by h′ (x) = 3x2 .
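The result h′ (x) = 3x² can be spot-checked numerically. The Python snippet below is only a sanity check, not a proof; it assumes h (x) = x³ as in the derivation above, and the helper name difference_quotient, the step size, and the sample points are choices made here for illustration.

```python
def h(x):
    return x**3

def difference_quotient(func, a, step=1e-6):
    # (func(x) - func(a)) / (x - a) evaluated at the nearby point x = a + step.
    return (func(a + step) - func(a)) / step

for a in (-2.0, 0.5, 3.0):
    approx = difference_quotient(h, a)
    print(f"a = {a:<5} difference quotient = {approx:.5f}    3a^2 = {3 * a**2:.5f}")
```

At each sample point, the difference quotient agrees with 3a² to several decimal places.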
Using the following information, we can go on finding the derivatives of higher integer
powers:
Exercise 291. Find the derivative of the function $i : \mathbb{R} \to \mathbb{R}$ defined by $i(x) = x^4$. (Answer
on p. 1523.)
Exercise 292. Let c be a positive integer and define $f : \mathbb{R} \to \mathbb{R}$ by $f(x) = x^c$. Find the
derivative of f (you will thus have proven the Power Rule in the special case where the
exponent is a positive integer). (Answer on p. 1523.)
Remark 83. As stated above, a complete proof of the Power Rule is beyond the scope of
H2 Maths and this textbook. In Exercise 292, we merely proved the Power Rule in the
special case where c is a positive integer.
For a somewhat more general proof of the Power Rule, see p. 1337 (Appendices).
Exercise 293. What’s wrong with the following “proof” that 1 = 0? (Answer on p.
1524.)
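Theorem 20. (The Product and Quotient Rules) Let D be an interval. Suppose
f, g ∶ D → R are differentiable functions. Then:
(a) The derivative of the function f ⋅ g is the function $(f \cdot g)' : D \to \mathbb{R}$ defined by:
$$(f \cdot g)'(x) \overset{\times}{=} f(x) g'(x) + f'(x) g(x). \qquad \text{(Product Rule)}$$
Or equivalently and more succinctly:
$$(f \cdot g)' \overset{\times}{=} f \cdot g' + f' \cdot g. \qquad \text{(Product Rule)}$$
(b) The derivative of the function f /g is the function $(f/g)' : D \setminus \{x : g(x) = 0\} \to \mathbb{R}$
defined by:
$$\left( \frac{f}{g} \right)'(x) \overset{\div}{=} \frac{g(x) f'(x) - f(x) g'(x)}{[g(x)]^2}. \qquad \text{(Quotient Rule)}$$
Or equivalently and more succinctly:
$$\left( \frac{f}{g} \right)' \overset{\div}{=} \frac{g f' - f g'}{g^2}. \qquad \text{(Quotient Rule)}$$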
Remark 84. Note the exclusion from the domain of $(f/g)'$ of any points at which g (x) = 0.
If you cannot remember or do not understand why this is necessary, go back and review
Ch. 13.
We reproduce from the following common mnemonic for the Quotient Rule:
Fun Fact
He quickly realised though that this was wrong and arrived at the correct Product Rule.282
282
For more about this story, see Google Books and .
Proof. (a) (Product Rule) For any a ∈ D, we have:
$$\begin{aligned}
(f \cdot g)'(a) &= \lim_{x \to a} \frac{(f \cdot g)(x) - (f \cdot g)(a)}{x - a} \\
&= \lim_{x \to a} \frac{f(x) g(x) - f(a) g(a)}{x - a} \\
&= \lim_{x \to a} \frac{f(x) g(x) - f(x) g(a) + f(x) g(a) - f(a) g(a)}{x - a} && \text{(Plus Zero Trick)} \\
&= \lim_{x \to a} \left[ f(x) \frac{g(x) - g(a)}{x - a} \right] + \lim_{x \to a} \left[ \frac{f(x) - f(a)}{x - a} g(a) \right] \\
&= \lim_{x \to a} f(x) \lim_{x \to a} \frac{g(x) - g(a)}{x - a} + g(a) \lim_{x \to a} \frac{f(x) - f(a)}{x - a} \\
&= f(a) g'(a) + f'(a) g(a).
\end{aligned}$$
We’ve just shown that for any a ∈ Domain (f ⋅ g), the derivative of f ⋅ g at a is $f(a) g'(a) + f'(a) g(a)$.
Thus, the derivative of f ⋅ g is the function $(f \cdot g)' : D \to \mathbb{R}$ defined by
$(f \cdot g)'(x) = f(x) g'(x) + f'(x) g(x)$.
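The Product Rule can be spot-checked numerically. The Python snippet below is a sanity check under assumptions, not part of the proof: the functions f (x) = x² and g (x) = x³ are hypothetical choices, their derivatives 2x and 3x² are taken from the Power Rule examples earlier in this chapter, and numerical_derivative is a helper name introduced here.

```python
def f(x):
    return x**2

def f_prime(x):
    return 2 * x          # derivative of x^2

def g(x):
    return x**3

def g_prime(x):
    return 3 * x**2       # derivative of x^3

def product(x):
    return f(x) * g(x)

def numerical_derivative(func, a, step=1e-6):
    # Central-difference approximation to the derivative of func at a.
    return (func(a + step) - func(a - step)) / (2 * step)

a = 1.5
left_side = numerical_derivative(product, a)            # approximates (f.g)'(a)
right_side = f(a) * g_prime(a) + f_prime(a) * g(a)      # f(a) g'(a) + f'(a) g(a)
print(f"numerical (f.g)'(a) = {left_side:.6f}    product rule gives = {right_side:.6f}")
```

The two numbers agree to several decimal places, as the Product Rule predicts.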
(b) (Quotient Rule) For any a ∈ D ∖ {x ∶ g (x) = 0}, we have:
(b) Since f and g are differentiable at a, by _____ (result), they are also _____ at
a. And so, by the definition of _____, we have:
(c) Since g is _____ at a and g (a) ≠ 0, by _____ (result), the reciprocal function
1/g is also _____ at a. And so, again by the definition of _____, we have:
$$\lim_{x \to a} \frac{1}{g(x)} = ?$$
(e) Now return to expression ⋆ and plug in the equations $\overset{1}{=}$ through $\overset{6}{=}$ to get:
= ?
(f) Complete the proof by writing down the usual last two sentences.
Theorem 21. (Chain Rule) Let a ∈ R and f and g be nice functions. Suppose g ′ (a)
and f ′ (g (a)) exist. Then:
Unfortunately, the above “proof” contains two fatal flaws. First, in $\overset{1}{=}$, there is the possibility
that g (x) = g (a) for some values of x that are “near” a; if so, then $\overset{1}{=}$ commits the cardinal
sin of (possibly) dividing by zero. Second, the step $\lim_{x \to a} \dfrac{f(g(x)) - f(g(a))}{g(x) - g(a)} \overset{\star}{=} f'(g(a))$
requires additional justification.
Nonetheless, these two flaws may be regarded as mere blemishes or technicalities that can
be easily addressed — if you’re interested in the gory details, see p. 1338 (Appendices).
Though flawed, the above “proof” should give you an idea of why the Chain Rule “works”.
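The Chain Rule can also be spot-checked numerically. The Python snippet below is a sanity check under assumptions: the functions f (u) = u³ and g (x) = x² are hypothetical choices, their derivatives 3u² and 2x come from the Power Rule examples earlier in this chapter, and numerical_derivative is a helper name introduced here.

```python
def f(u):
    return u**3

def f_prime(u):
    return 3 * u**2       # derivative of u^3

def g(x):
    return x**2

def g_prime(x):
    return 2 * x          # derivative of x^2

def composite(x):
    return f(g(x))        # here, (x^2)^3

def numerical_derivative(func, a, step=1e-6):
    # Central-difference approximation to the derivative of func at a.
    return (func(a + step) - func(a - step)) / (2 * step)

a = 1.2
left_side = numerical_derivative(composite, a)   # approximates the derivative of f(g(x)) at a
right_side = f_prime(g(a)) * g_prime(a)          # f'(g(a)) times g'(a)
print(f"numerical derivative = {left_side:.6f}    chain rule gives = {right_side:.6f}")
```

The two numbers agree to several decimal places, which is consistent with the Chain Rule even though it proves nothing.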
Figure to be
inserted here.
Nonetheless, the above claim is “almost” true. Recall that an elementary function is:
It turns out that the only bad apple is the power function x ↦ xc and even then only
in the special case where c < 1. In such cases, the derivative at certain points may have
denominator 0 and thus be undefined. And so, we have the following informal theorem
telling us that with this small exception, all elementary functions are differentiable:
All elementary functions are differentiable, except possibly those involving the power func-
tion x ↦ xc where c < 1.
$$\lim_{x \to a} \frac{\Delta y}{\Delta x} = \lim_{x \to a} \frac{y(x) - y(a)}{x - a}.$$
We then introduce the following (familiar) piece of notation:
$$\frac{dy}{dx} = \lim_{x \to a} \frac{\Delta y}{\Delta x}.$$
It must be stressed, emphasised, and repeated that $\frac{dy}{dx}$ is a single expression. Do not think
of it as a fraction with numerator dy and denominator dx.
However, to better understand where the above notation comes from, let us note that
Leibniz held a view that was contrary to the modern and standard one. In particular, to
Leibniz:
• dx denoted an “infinitesimal change in x”;
• dy denoted the corresponding “infinitesimal change in y”; and
• $\frac{dy}{dx}$ was really a fraction with numerator dy and denominator dx.
Unfortunately, Leibniz’s notion of “infinitesimals” (or Newton’s of “fluxions”) was rather
vague, imprecise, and non-rigorous. An “infinitesimal” was smaller than any quantity
and yet not zero. The most famous (and most poetic) critique is probably George Berkeley’s
(1734):
what are these same evanescent Increments? They are neither finite Quant-
ities nor Quantities infinitely small, nor yet nothing. May we not call them
the Ghosts of departed Quantities?
So, in the 19th century, mathematicians embarked on a project to put calculus on a firmer
footing. In particular, they sought to rid mathematics of those ill-defined “infinitesimals”.
Eventually, they came up with the formal notion of limits. Thereafter, Leibniz’s “infinites-
imals” or Newton’s “fluxions” were banished from maths.283
In Ch. 67, you learnt a little about the idea of limits. Under our modern notion of limits:
283
But would later be resurrected by Abraham Robinson in the 1960s with his non-standard analysis.
It is wrong to think of $\frac{dy}{dx}$ as a fraction with numerator dy and denominator dx.
So simply put, Leibniz was wrong to think of the derivative as a fraction.284 And
you should be very careful not to think of the derivative as a fraction, even though it looks
very much like one.
Instead, $\frac{dy}{dx}$ denotes a function; in particular, it is the derivative of the function y.285
Indeed, the operator $\frac{d}{dx}$ is itself a function! It maps a function, for example y, to another
function denoted $\frac{dy}{dx}$ and which we call the derivative of y. Thus, $\frac{d}{dx}$ is itself a function
whose domain and codomain are both sets of functions!
Now, if Leibniz was wrong to think of the derivative as a fraction, then why are we still
using his notation? The main reason is that it is highly intuitive.
Leibniz’s notation reminds us that calculus is the study of continuous changes. For
example, it also allows us to quickly grasp the intuition behind such results as the Chain
Rule, which we stated informally as:
$$\frac{dz}{dx} = \frac{dz}{dy} \times \frac{dy}{dx}.$$
It is tempting to naïvely interpret the expressions in the above equation as fractions, naïvely
apply simple algebra, naïvely cancel out the dy’s, so that the equation looks correct by
primary-school algebra:
$$\text{“} \frac{dz}{dx} = \frac{dz}{dy} \times \frac{dy}{dx} \text{.”}$$
But the correct informal interpretation (easily seen when written in Leibniz’s notation) is
this:
Another example is the Inverse Function Theorem, which may informally be stated as:
$$\frac{dy}{dx} = \frac{1}{\;\dfrac{dx}{dy}\;}.$$
Again, the naïve interpretation would be that $\frac{dy}{dx}$ and $\frac{dx}{dy}$ are fractions, so that the above
equation again looks correct by primary-school algebra.
284
The traditional historiographical view is that Leibniz and Newton were less than completely rigorous
and sometimes committed logical fallacies. Note however that not everyone agrees. For example, Katz
and Sherry (2012) argue that, “Leibniz’s system for differential calculus was free of logical fallacies.”
285
Actually, the variable x is superfluous! Indeed, Euler simply wrote Dy to denote the derivative of the
function y. Happily, this fourth (!) piece of notation for the derivative does not appear in your A-Level
syllabus or exams and so we shall say no more about it.
But again, the correct informal interpretation (easily seen when written in Leibniz’s notation) is this:
$$\left(\text{The change in } y \text{ due to a small unit change in } x\right) = \left(\text{The change in } x \text{ due to a small unit change in } y\right)^{-1}$$
Stack Exchange has numerous discussions on why we use Leibniz’s notation even though it
is arguably “wrong”.286 I recommend reading the top answer here:
Fun Fact
Here is what Alan Turing wrote about the Leibniz notation in his recently-discovered
wartime notebooks:
The Leibniz notation $\frac{dy}{dx}$ I find extremely difficult to understand in spite of it having been the one I understood best once! It certainly implies that some
$y = x^2 + 3x$
286
“Wrong” is in scare quotes here because, of course, notation can’t be “wrong” any more than any
convention (such as driving on the left side of the road) can be “wrong”.
287
Discussion of what Turing might have meant by this remark: .
70. Some Techniques of Differentiation
Example 906. Let x be the mass of Milo powder in a cup of water and y be the volume
of water in the cup.
Suppose that adding another 1 g of Milo to the cup of water increases the volume of water by $2 \text{ cm}^3$. Then we may write:
$$\frac{dy}{dx} = 2 \text{ cm}^3\,\text{g}^{-1}.$$
By the IFT (or common sense), we also have:
$$\frac{dx}{dy} = \frac{1}{2} \text{ g}\,\text{cm}^{-3}.$$
That is, if we had instead wanted to increase the volume of water by $1 \text{ cm}^3$, we should instead have added $\frac{1}{2}$ g of Milo.
Method #2 (quicker method using the IFT). $\frac{dy}{dx} = \cos x \implies \frac{dx}{dy} = \frac{1}{\cos x}$.
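If you'd like to see the IFT at work numerically, here is a minimal Python sketch. It assumes (as the equation above suggests) that y = sin x, so that the inverse is x = sin⁻¹ y. The sample point x0 = 0.5 and the helper central_difference are our own illustrative choices, and a symmetric difference quotient only approximates a derivative.

```python
import math

def central_difference(func, point, step=1e-6):
    """Approximate the derivative of func at point by a symmetric difference quotient."""
    return (func(point + step) - func(point - step)) / (2 * step)

x0 = 0.5                 # hypothetical sample point in (-pi/2, pi/2)
y0 = math.sin(x0)        # the corresponding value of y

dy_dx = central_difference(math.sin, x0)    # should be close to cos(x0)
dx_dy = central_difference(math.asin, y0)   # derivative of the inverse function at y0

print(dy_dx, math.cos(x0))
print(dx_dy, 1 / math.cos(x0))
```

The two numbers on each printed line should agree to several decimal places, which is what the IFT leads us to expect.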
288
For a formal statement of the IFT, see Theorem 34 (Appendices).
Exercise 295. Suppose $x^2 y + \sin x = 0$. Find $\frac{dy}{dx}$. Hence write down $\frac{dx}{dy}$. (You may leave your answers expressed in terms of $x$ and $y$.) (Answer on p. 1528.)
Example 908. Consider the equation $x^2 + y^2 \overset{1}{=} 1$. What is $y'$, $\frac{dy}{dx}$, or $\dot{y}$?
Method 1. First express $y$ in terms of $x$:
$$y \overset{2}{=} \pm\sqrt{1 - x^2}.$$
(Note that strictly/pedantically speaking, here we actually have two functions. One is $y_1 : [-1, 1] \to \mathbb{R}$ defined by $y_1(x) = \sqrt{1 - x^2}$. The other is $y_2 : [-1, 1] \to \mathbb{R}$ defined by $y_2(x) = -\sqrt{1 - x^2}$.)
Now apply the $\frac{d}{dx}$ operator to $\overset{2}{=}$:
$$\frac{dy}{dx} = \frac{d}{dx}\left(\pm\sqrt{1 - x^2}\right) = \pm\frac{-2x}{2\sqrt{1 - x^2}} = \pm\frac{-x}{\sqrt{1 - x^2}} = \frac{\mp x}{\sqrt{1 - x^2}}.$$
(Strictly/pedantically speaking, we have two derivatives. One is $y_1' : (-1, 1) \to \mathbb{R}$ defined by $y_1'(x) = -x/\sqrt{1 - x^2}$. The other is $y_2' : (-1, 1) \to \mathbb{R}$ defined by $y_2'(x) = x/\sqrt{1 - x^2}$. Note that the functions $y_1$ and $y_2$ are not differentiable because each fails to be differentiable at the points −1 and 1.)
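Method 2 (Implicit Differentiation). Directly apply the $\frac{d}{dx}$ operator to $\overset{1}{=}$:
$$\frac{d}{dx}\left(x^2 + y^2\right) = \frac{d}{dx} 1 \implies 2x + 2y\frac{dy}{dx} = 0 \iff \frac{dy}{dx} \overset{3}{=} -\frac{x}{y} \quad (\text{for } y \neq 0).$$
Alternatively, we can plug $\overset{2}{=}$ into $\overset{3}{=}$ to get the same answer as in Method 1:
$$\frac{dy}{dx} = -\frac{x}{\pm\sqrt{1 - x^2}} = \frac{\mp x}{\sqrt{1 - x^2}}.$$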
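As a sanity check on the two methods, here is a short Python sketch that compares the implicit-differentiation formula −x/y against a numerical slope on each branch of the circle. The sample point x0 = 0.6 and all function names are our own illustrative choices, and the difference quotient is only an approximation.

```python
import math

def upper(x):
    """Upper semicircle: y = +sqrt(1 - x^2)."""
    return math.sqrt(1 - x**2)

def lower(x):
    """Lower semicircle: y = -sqrt(1 - x^2)."""
    return -math.sqrt(1 - x**2)

def implicit_slope(x, y):
    """dy/dx = -x / y, as obtained by implicit differentiation (needs y != 0)."""
    return -x / y

def central_difference(func, point, step=1e-6):
    """Symmetric difference quotient approximating func'(point)."""
    return (func(point + step) - func(point - step)) / (2 * step)

x0 = 0.6   # hypothetical sample point in (-1, 1)
for branch in (upper, lower):
    y0 = branch(x0)
    print(branch.__name__, central_difference(branch, x0), implicit_slope(x0, y0))
```

On the upper branch both numbers should be close to −0.75, and on the lower branch close to 0.75, matching the ∓ in the formula.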
In the above example, Method 2 (implicit differentiation) was not obviously superior
to Method 1. However, it is sometimes difficult (or impossible) to express y in terms of
x. In such cases, implicit differentiation is the clear winner:
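$$\frac{d}{dx}\left(x^2\sqrt{y} + \frac{y}{\cos x}\right) = \frac{d}{dx} 1 \implies 2x\sqrt{y} + x^2 \frac{1}{2\sqrt{y}}\frac{dy}{dx} + \frac{\cos x \frac{dy}{dx} - y(-\sin x)}{\cos^2 x} = 0.$$
Now plug in $x = 0$:
$$2 \cdot 0 \cdot \sqrt{y} + 0^2 \cdot \frac{1}{2\sqrt{y}}\left.\frac{dy}{dx}\right|_{x=0} + \frac{\cos 0 \left.\frac{dy}{dx}\right|_{x=0} - y(-\sin 0)}{\cos^2 0} = 0 \iff \left.\frac{dy}{dx}\right|_{x=0} = 0.$$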
In this textbook, we will simply take for granted that implicit differentiation “works”,
without explaining why.289
289
But if you’re interested, here’s a quick and incomplete explanation: Suppose we have an equation
involving x and y, but have no idea how to explicitly express y in terms of x. Then the Implicit
Function Theorem says that even if we have no idea how, it is actually possible to express y as
a function of x and moreover, we can speak of the derivative of y (with respect to the variable x).
Unfortunately, the Implicit Function Theorem requires a little knowledge of multivariate calculus and
partial derivatives and so we shall omit any discussion of it from this textbook altogether. But if you’re
interested, see Wikipedia or Krantz & Parks (1993, The Implicit Function Theorem: History, Theory,
and Applications).
Happily, the next result appears on List MF26, so no need to mug:
Fact 144. (a) $\frac{d}{dx}\sec x = \sec x \tan x$. (b) $\frac{d}{dx}\sin^{-1} x = \frac{1}{\sqrt{1 - x^2}}$.
(c) $\frac{d}{dx}\cos^{-1} x = \frac{-1}{\sqrt{1 - x^2}}$. (d) $\frac{d}{dx}\tan^{-1} x = \frac{1}{1 + x^2}$.
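Proof. You are asked to prove (a), (c), and (d) in Exercise 296. Here we prove only (b).
(b) Let $y = \sin^{-1} x \in \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$. Then $x \overset{1}{=} \sin y$. Apply $\frac{d}{dx}$ to $\overset{1}{=}$ to get $1 \overset{2}{=} \cos y \frac{dy}{dx}$.
From the identity $\sin^2 y + \cos^2 y = 1$, we have $\cos y \overset{3}{=} \pm\sqrt{1 - x^2}$. But since $y \in \left[-\frac{\pi}{2}, \frac{\pi}{2}\right]$, we know that $\cos y \geq 0$. And so in $\overset{3}{=}$, we may simply discard the negative value to get $\cos y \overset{4}{=} \sqrt{1 - x^2}$.
Plugging $\overset{4}{=}$ into $\overset{2}{=}$:
$$1 = \sqrt{1 - x^2}\,\frac{dy}{dx} \quad\text{or}\quad \frac{dy}{dx} = \frac{1}{\sqrt{1 - x^2}}.$$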
Exercise 296. Prove Fact 144(a), (c), and (d). (Answer on p. 1527.)
$$\frac{dy}{dx} = \frac{dy}{dt} \div \frac{dx}{dt}.$$
“Proof”. Use (a) the Chain Rule and (b) the IFT:
$$\frac{dy}{dx} \overset{a}{=} \frac{dy}{dt}\,\frac{dt}{dx} \overset{b}{=} \frac{dy}{dt} \div \frac{dx}{dt}.$$
Example 910. Let $x = t^5 + t$ and $y = t^6 - t$. Find $\left.\frac{dy}{dx}\right|_{t=0}$.
$$\frac{dy}{dx} = \frac{dy}{dt} \div \frac{dx}{dt} = \frac{6t^5 - 1}{5t^4 + 1}.$$
So $\left.\frac{dy}{dx}\right|_{t=0} = -1$. It would be much more difficult (perhaps even impossible) if instead we first tried to express $y$ in terms of $x$, then compute $\frac{dy}{dx}$.
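To see the parametric rule dy/dx = dy/dt ÷ dx/dt in action on a computer, here is a minimal Python sketch. The curve x = t³ + t, y = t² is a hypothetical one of our own choosing (it is not from the examples in this chapter), as are the function names and the point t0 = 1. The sketch compares the rule against the slope of a very short chord of the curve.

```python
def x_of(t):
    return t**3 + t      # hypothetical curve, chosen only for illustration

def y_of(t):
    return t**2

def dx_dt(t):
    return 3 * t**2 + 1  # derivative of x_of, computed by hand

def dy_dt(t):
    return 2 * t         # derivative of y_of, computed by hand

def parametric_slope(t):
    """dy/dx computed as (dy/dt) / (dx/dt); requires dx/dt != 0."""
    return dy_dt(t) / dx_dt(t)

t0, step = 1.0, 1e-6
rise = y_of(t0 + step) - y_of(t0 - step)   # change in y across a tiny chord
run = x_of(t0 + step) - x_of(t0 - step)    # change in x across the same chord
print(parametric_slope(t0), rise / run)
```

Both printed numbers should be close to 0.5, without our ever having expressed y in terms of x.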
Exercise 297. Let $x = \cos t + t^2$ and $y = e^t - t^3$. Find $\frac{dy}{dx}$. (Answer on p. 1528.)
Figure to be
inserted here.
The second derivative of f is the function f ′′ ∶ R → R defined by f ′′ (x) = 6x. The second
derivative of f at −1 is the number f ′′ (−1) = 6 (−1) = −6.
Observe that f ′′ is itself:
If this limit exists (i.e. is equal to a real number), then we say that f is twice differentiable at a, call this limit the second derivative of f at a, and denote it by $f''(a)$, $\frac{d^2 f}{dx^2}(a)$, $\left.\frac{d^2 f}{dx^2}\right|_{x=a}$, or $\ddot{f}(a)$.
If this limit doesn’t exist, then we say that f is not twice differentiable at a.
If f is twice differentiable at every point in a set S, then we say that f is twice differentiable
on S.
Let T be the set of points on which f is twice differentiable. Then the second derivative of f is the real-valued function denoted $f''$, $\frac{d^2 f}{dx^2}$, or $\ddot{f}$, with domain T and mapping rule:
$$x \mapsto \lim_{x \to a} \frac{f'(x) - f'(a)}{x - a}.$$
If f is twice differentiable on its domain (or equivalently, if T = D), then we simply call
f a twice-differentiable function.
For comparison, we now reproduce Definitions 167 and 168 (concerning the first derivative). As you can tell, mutatis mutandis, they are nearly identical to the above Definition (which concerns the second derivative):
If this limit exists (i.e. is equal to a real number), then we say that f is differentiable at a, call this limit the derivative of f at a, and denote it by $f'(a)$, $\frac{df}{dx}(a)$, $\left.\frac{df}{dx}\right|_{x=a}$, or $\dot{f}(a)$.
If this limit doesn’t exist, then we say that f is not differentiable at a.
Under Leibniz’s notation, the differentiation operator is denoted $\frac{d}{dx}$, while the repeated application of this operator is denoted $\frac{d^2}{dx^2}$. Thus, the second derivative of f is denoted:291
$$\frac{d^2 f}{dx^2} \quad\text{and not}\quad \frac{df^2}{dx^2}.$$
291
Given our notation for the second derivative, the expression $\frac{df^2}{dx^2}$ is confusing and should be avoided. However, if used, it would denote the derivative of the composite function $f^2 = f \circ f$ with respect to the variable $x^2$.
Example 912. Define $g : \mathbb{R} \to \mathbb{R}$ by $g(x) = x^3 + 2x$.
The (first) derivative of g may be denoted $g'$, $\frac{dg}{dx}$, or $\dot{g}$. It has $\mathbb{R}$ as both its domain and codomain, and is defined by:
$$g'(x) = \frac{dg}{dx}(x) = \dot{g}(x) = 3x^2 + 2.$$
Plugging in −1, we find that the (first) derivative of g at −1 is the number:
$$g'(-1) = 5 \quad\text{or}\quad \frac{dg}{dx}(-1) = 5 \quad\text{or}\quad \left.\frac{dg}{dx}\right|_{x=-1} = 5 \quad\text{or}\quad \dot{g}(-1) = 5.$$
We can also read any of the above four statements aloud as “the (first) derivative of g evaluated at −1”.
Figure to be
inserted here.
The second derivative of g may be denoted $g''$, $\frac{d^2 g}{dx^2}$, or $\ddot{g}$. It has $\mathbb{R}$ as both its domain and codomain, and is defined by:
$$g''(x) = \frac{d^2 g}{dx^2}(x) = \ddot{g}(x) = 6x.$$
Plugging in −1, we find that the second derivative of g at −1 is the number:
$$g''(-1) = -6 \quad\text{or}\quad \frac{d^2 g}{dx^2}(-1) = -6 \quad\text{or}\quad \left.\frac{d^2 g}{dx^2}\right|_{x=-1} = -6 \quad\text{or}\quad \ddot{g}(-1) = -6.$$
We can also read any of the above four statements aloud as “the second derivative of g evaluated at −1”.
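Since the second derivative is simply "the derivative of the derivative", we can mimic this on a computer by applying a numerical differentiation step twice. Below is a minimal Python sketch for the function g of this example. The helper derivative and the step size are our own assumptions, and a symmetric difference quotient is only a numerical stand-in for the limit in the definition.

```python
def g(x):
    return x**3 + 2 * x

def derivative(func, step=1e-4):
    """Return a function that approximates func' by a symmetric difference quotient."""
    def approx(a):
        return (func(a + step) - func(a - step)) / (2 * step)
    return approx

g_prime = derivative(g)                # approximates x -> 3x^2 + 2
g_double_prime = derivative(g_prime)   # approximates x -> 6x

print(g_prime(-1))          # close to 5
print(g_double_prime(-1))   # close to -6
```

Note how g_double_prime is built by feeding g_prime back into the same operator, just as d²/dx² denotes the repeated application of d/dx.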
A function that is twice differentiable must also be differentiable. However, the converse
need not be true. That is, a function that is differentiable need not also be twice differen-
tiable.
$$h'(4) = 3 \quad\text{or}\quad \frac{dh}{dx}(4) = 3 \quad\text{or}\quad \left.\frac{dh}{dx}\right|_{x=4} = 3 \quad\text{or}\quad \dot{h}(4) = 3.$$
Figure to be
inserted here.
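The second derivative of h may be denoted $h''$, $\frac{d^2 h}{dx^2}$, or $\ddot{h}$. It has domain $\mathbb{R}^+$, codomain $\mathbb{R}$, and is defined by:
$$h''(x) = \frac{d^2 h}{dx^2}(x) = \ddot{h}(x) = \frac{3}{4}x^{-1/2}.$$
For example, the second derivative of h at 4 is:
$$h''(4) = \frac{3}{8} \quad\text{or}\quad \frac{d^2 h}{dx^2}(4) = \frac{3}{8} \quad\text{or}\quad \left.\frac{d^2 h}{dx^2}\right|_{x=4} = \frac{3}{8} \quad\text{or}\quad \ddot{h}(4) = \frac{3}{8}.$$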
(a) x
(b) x
(c) x
Exercise 299. Explain whether each of the following statements is true or false. (As
usual, one way to show that a statement is false is to provide a counterexample.)(Answer
on p. 1526.)
As you can tell, with higher derivatives, Lagrange’s prime notation and Newton’s dot
notation will start getting cumbersome. And so, in general, for the nth derivative (with
n ≥ 4), instead of writing n primes or dots, we’ll write:
The function $(f)^4 : \mathbb{R} \to \mathbb{R}$ is the function f raised to the power of four and is defined by:
(Note that since a trillion is $10^{12}$, the number $f^4(2)$ is approximately 2.417 trillion trillion.)
Remark 85. The notation $(f)^n$ to mean a function raised to the nth power is not commonly used. Indeed, it does not appear on your H2 Maths syllabus or exams. We will try to avoid using it, but as we’ll see later, it sometimes comes in handy.
If this limit exists (i.e. is equal to a real number), then we say that f is n-times differentiable at a, call this limit the nth derivative of f at a, and denote it by $f^{(n)}(a)$, $\frac{d^n f}{dx^n}(a)$, $\left.\frac{d^n f}{dx^n}\right|_{x=a}$, or $\overset{n}{\dot{f}}(a)$.
If this limit doesn’t exist, then we say that f is not n-times differentiable at a.
If f is n-times differentiable at every point in a set S, then we say that f is n-times
differentiable on S.
Let T be the set of points on which f is n-times differentiable. Then the nth derivative of f is the real-valued function denoted $f^{(n)}$, $\frac{d^n f}{dx^n}$, or $\overset{n}{\dot{f}}$, with domain T and mapping rule:
$$x \mapsto \lim_{x \to a} \frac{f^{(n-1)}(x) - f^{(n-1)}(a)}{x - a}.$$
Exercise 301. In Lagrange’s notation, why do we denote the nth derivative of f with
parentheses? That is, why do we denote the nth derivative of f by f (n) rather than more
simply f n ? (Answer on p. 1526.)
Definition 171. Suppose that for every positive integer n, the function f is n-times-
differentiable at a. Then we say that f is smooth (or infinitely differentiable) at a.
We say that a function is smooth on a set S if it is smooth at every point in S.
A smooth function is one that’s smooth on its domain.
Most functions you’ll encounter in the A-Levels are smooth. This includes, for example, all
polynomial functions:
$$f'(x) = 5x^4, \quad f''(x) = 20x^3, \quad f'''(x) = 60x^2, \quad f^{(4)}(x) = 120x, \quad f^{(5)}(x) = 120.$$
The function i is smooth, with its sixth and higher-order derivatives all having domain
R, codomain R, and mapping rule x ↦ 0.
As the above examples suggest, for any non-negative integer n, any nth-degree polynomial
function is smooth. Moreover, for any k ≥ n + 1, its kth derivative has the mapping x ↦ 0.
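The claim about polynomials can be watched happening with a few lines of Python. The sketch below stores a polynomial as a list of coefficients (our own representation, with hypothetical names) and differentiates x⁵ repeatedly; it is an illustration of the pattern, not a proof.

```python
def differentiate(coefficients):
    """Differentiate a polynomial given as [c0, c1, c2, ...], where ck multiplies x**k."""
    result = [k * c for k, c in enumerate(coefficients)][1:]
    return result if result else [0]   # the derivative of a constant is the zero polynomial

poly = [0, 0, 0, 0, 0, 1]   # represents x**5
derivatives = []
for order in range(1, 8):
    poly = differentiate(poly)
    derivatives.append(poly)
    print(order, poly)
```

The first five lines printed correspond to 5x⁴, 20x³, 60x², 120x, and 120, and from the sixth derivative onwards we get the zero polynomial, in line with k ≥ n + 1 for n = 5.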
The exponential function is also smooth:
Example 919. Let j be the exponential function. Then for all x ∈ R, we have:
The function j is smooth, with its every derivative being the exponential function.
All elementary functions are smooth, except possibly those involving the power function
x ↦ xc .
Exercise 304. Find all the derivatives of each function. Which are smooth?(Answer on
p. 715.)
(a) x
(b) x
A304.
Definition 53. Let f be a nice function. Given a set of points S ⊆ Domainf , we say
that f is:
The problem of finding a derivative is the problem of finding a curve’s gradient. And so,
not surprisingly, the derivative is intimately related to whether a function is increasing
or decreasing; we have what is sometimes known as the Increasing/Decreasing Test
(IDT):
Figure to be
inserted here.
Note also that at x = 0, we have f ′ (x) = f ′ (0) = 2 ⋅ 0 = 0, so that f is both increasing and
decreasing at 0. (However, f is neither strictly increasing nor strictly decreasing at 0.)
Take care to note that in (b) and (d) of the IDT, we cannot replace the ⟹’s with ⟺’s.
Figure to be
inserted here.
(a) x
(b) x
We also reproduce from Ch. 18.8 the following definitions of stationary and turning
points:
f ′ (a) = 0.
Because the above equation appears so often, it is sometimes given the special name of
the First Order Condition (FOC). (It could equally well be called the Stationary
Point Condition, but for some reason that name hasn’t caught on.)
292
For the formal definitions, see Definition 224 (Appendices).
Example 926. Consider again the function h ∶ R → R defined by x ↦ 6x5 − 15x4 − 10x3 +
30x2 . (Graph reproduced below for convenience.)
x = ±1 are maximum points. However, they are not global maximum points. Indeed, h has no global maximum point because $\lim_{x \to \infty} h(x) = \infty$ (“as x increases without bound, h(x) also increases without bound”). In other words, there is no x such that h(x) ≥ h(a) for all a ∈ R.
Similarly, x = 0, 2 are minimum points. However, they are not global minimum points. Indeed, h has no global minimum point because $\lim_{x \to -\infty} h(x) = -\infty$ (“as x decreases without bound, h(x) also decreases without bound”). In other words, there is no x such that h(x) ≤ h(a) for all a ∈ R.
[Figure: graph of h, with maximum points at x = ±1 and minimum points at x = 0, 2.]
We next restrict the domain of h in two ways to create two new functions i and j:
[Figure: graphs of the functions i (left) and j (right), with their maximum, minimum, global maximum, and global minimum points marked.]
Also graphed above (right) is the function j ∶ [−1.2, 2.2] → R defined by x ↦ 6x5 − 15x4 −
10x3 + 30x2 .
Again, there are three maximum points in total, namely ±1, 2.2. However, only −1 is a global maximum point of j because only j(−1) ≥ j(x) for all x ∈ [−1.2, 2.2]. Of course, it is also a strict global maximum point because j(−1) > j(x) for all x ∈ [−1.2, 2.2] other than −1.
And again, there are three minimum points in total, namely −1.2, 0, 2. However, only 2 is a global minimum point of j because only j(2) ≤ j(x) for all x ∈ [−1.2, 2.2]. Of course, it is also a strict global minimum point because j(2) < j(x) for all x ∈ [−1.2, 2.2] other than 2.
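For a function like j on a closed interval, one way to hunt for the global maximum and minimum is to compare the function's values at the stationary points and at the two endpoints. Here is a minimal Python sketch of that idea for j. The stationary points are found by hand from the factorisation h′(x) = 30x(x + 1)(x − 1)(x − 2), which is our own working, and the variable names are hypothetical.

```python
def h(x):
    return 6 * x**5 - 15 * x**4 - 10 * x**3 + 30 * x**2

def h_prime(x):
    return 30 * x**4 - 60 * x**3 - 30 * x**2 + 60 * x

left, right = -1.2, 2.2              # the domain of j is [left, right]
stationary = [-1.0, 0.0, 1.0, 2.0]   # roots of h'(x) = 30x(x + 1)(x - 1)(x - 2)

# Candidate points: both endpoints, plus every stationary point inside the interval.
candidates = [left, right] + [x for x in stationary if left < x < right]
values = {x: h(x) for x in candidates}

global_max = max(values, key=values.get)
global_min = min(values, key=values.get)
print(values)
print("global max at", global_max, "global min at", global_min)
```

The output should single out −1 (where j equals 19) as the global maximum point and 2 (where j equals −8) as the global minimum point, agreeing with the discussion above.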
Exercise 307. (Answer on p. 1530.) For each of the following functions, write down,
if any of these exist, the (i) maximum points, (ii) minimum points, (iii) strict maximum
points, (iv) strict minimum points, (v) global maximum points, (vi) global minimum
points, (vii) strict global maximum points, (viii) strict global minimum points; and also
all the corresponding values of the function at these points.
(a) f ∶ R → R defined by x ↦ 100.
(b) g ∶ R → R defined by x ↦ x2 .
(c) h ∶ [1, 2] → R defined by x ↦ x2 .
Type A B C D E
Max ✓ ✓
Min ✓ ✓
Strict Max ✓ ✓
Strict Min ✓ ✓
Global Max ✓
Global Min ✓
Strict Global Max ✓
Strict Global Min ✓
Stationary ✓ ✓ ✓
Turning ✓ ✓
Exercise 308. Is each of the following statements true or false? To show that a statement is false, simply give a counterexample from the above example. If it is true, explain why. (Answer on p. 1531.)
(a) Every maximum point or minimum point is a stationary point.
(b) Every maximum point or minimum point is a turning point.
(c) Every stationary point is a maximum point or minimum point.
(d) Every turning point is a maximum point or minimum point.
(e) Every turning point is a stationary point.
(f) Every stationary point is a turning point.
Example 929. The set S = [0, 1] has two boundary points, namely 0 and 1.
Every other point in S is an interior point. So for example, the points 0.2, 0.5, and 0.785
are interior points of the set S.
Given a set S, its boundary BS is the set of its boundary points. And its interior IS is
the set of its interior points.
Example 930. Continuing with the set S = [0, 1], its boundary is BS = {0, 1} and its
interior is IS = (0, 1).
Figure to be
inserted here.
Of course, the union of any set’s boundary and interior is equal to the set. So here, we have:
S = BS ∪ IS.
Figure to be
inserted here.
293
For the formal definition, see Definition 263 (Appendices).
Example 932. Consider the set U = {(x, y) ∶ x2 + y 2 ≤ 1}.
Its boundary is the set BU = {(x, y) ∶ x2 + y 2 = 1}.
Figure to be
inserted here.
Example 933. Consider the set $V = \left[0, \frac{1}{2}\right) \cup \left(\frac{1}{2}, 1\right] = [0, 1] \setminus \left\{\frac{1}{2}\right\}$.
Its boundary points are 0 and 1. Thus, its boundary is the set BV = {0, 1}.
Figure to be
inserted here.
Every other point is an interior point. Thus, V’s interior is the set $I_V = \left(0, \frac{1}{2}\right) \cup \left(\frac{1}{2}, 1\right) = (0, 1) \setminus \left\{\frac{1}{2}\right\}$.
Remark 87. Interior and boundary points are not on your H2 Maths syllabus. How-
ever, as I hope the above examples have shown, they are very simple concepts. They
are thus well worth knowing because they will give you a better and more correct under-
standing of the material that follows, in particular the Interior Extremum Theorem
and the various First and Second Derivative Tests.
The Flawed Procedure will often work. However (and as we’ll illustrate below with several
examples), it sometimes fails.
To better understand how, why, and when the Flawed Procedure works, we now formally
introduce the result that justifies it. This result is called the Interior Extremum The-
orem (IET) and says that every interior extremum at which the derivative exists
is a stationary point.
In the sentence before the formal statement of the IET, there are in italics two subtle but important technicalities that are overlooked by the FSSPFE and which we will discuss shortly.
Here’s a quick and simple example to illustrate the IET:
$$f'(1) = -2(1 - 1) = 0. \quad ✓$$
In secondary school, you will have gone through the intuition for why the IET works. We
now briskly go through it again:
In order for 1 to be a maximum point of f , it must be that “just” to its left, f is increasing;
and “just” to its right, f is decreasing. In other words, “just” to the left of 1, f ′ (x) ≥ 0.
And “just” to the right of 1, f ′ (x) ≤ 0. Moreover, at the maximum point, f must be
both increasing and decreasing. Thus, f ′ (1) = 0 — the gradient of f at the maximum
point must be 0.
Figure to be
inserted here.
Remark 88. The IET is also called Fermat’s Theorem. But a bit like Euler, Fermat was
a stud whose name is attached to many results and theorems (including most famously
Fermat’s Last Theorem). So, to avoid confusion, we’ll call it the IET instead of Fermat’s
Theorem.
Exercise 309. Refer to the above Example. Explain the intuition for why g ′ (−1) = 0.
(Answer on p. 1531.)
Exercise 310. True or false: “Let f ∶ D → R be a differentiable function. If c is a maximum or minimum point AND in the interior of D, then c is a turning point.” (Answer on p. 1531.)
A B C D E
Global maximum ✓
Strict global maximum ✓
Local maximum ✓ ✓
Strict local maximum ✓ ✓
Global minimum ✓
Strict global minimum ✓
Local minimum ✓ ✓
Strict local minimum ✓ ✓
Turning point ✓ ✓
Note the point D is not a local maximum because there are points “nearby” (in particular,
to the right) that are higher than D.
Similarly, it is not a local minimum because there are points “nearby” (in particular, to
the left) that are lower than D.
Altogether then, f has four extrema, namely A, B, C, and E.
Let’s see if the FSSPFE correctly identifies these four extrema:
According to the FSSPFE, here’s what we’d do.
First, find the derivative’s mapping rule:
Next, find any stationary points, in what we call the First Order Condition:
$$f'(x) = 0 \iff x = -\frac{3}{5}, -1, 0.$$
Now conclude that f’s extrema (i.e. maximum or minimum points) are at $-\frac{3}{5}$, $-1$, and $0$.
So, the FSSPFE correctly identifies the points $B = (-1, f(-1))$ and $C = \left(-\frac{3}{5}, f\left(-\frac{3}{5}\right)\right)$ as extrema. However, it makes two mistakes.
Figure to be
inserted here.
Below are the three ways by which the CPFE rectifies the FSSPFE. The first two concern
the “two subtle but important technicalities” we mentioned earlier:
1. A boundary point may be an extremum but not a stationary point. Hence, it may be
overlooked by the FSSPFE.
2. Similarly, a point at which the derivative does not exist may be an extremum but will, by
definition, not be a stationary point. Hence, it may also be overlooked by the FSSPFE.
3. The FSSPFE suggests or incorrectly assumes that every stationary point is an extremum.
But this is false. Loosely, the IET says that every extremum is a stationary point, but
not the converse. And so, Step 4 of the CPFE demands that you investigate whether
each stationary point you found is actually an extremum.
Remark 89. The above statement of the FDTE is informal only in that the phrases
“immediate left” and “immediate right” haven’t been precisely defined. The FDTE is
formally stated (and proven) as Proposition 20 in the Appendices.
Remark 90. One is tempted to assume that the converses of each of (a)–(d) in the First Derivative Test are true. Unfortunately, they are not!
For example, one is tempted to assume that if c is a local maximum of f, then f′ is non-negative on c’s “immediate left” and non-positive on c’s “immediate right”. This however is false. For a counterexample, see Example 1233 (Appendices).
Proof. For the proofs of (a) and (b), see p. 1347 (Appendices).
Below we will give a partial proof of (c).
By providing two examples, we now prove that, as asserted by (c) of the SDTE, if f ′′ (a) = 0,
then a could be a maximum or a minimum point:
$$f'(0) = \left. 4x^3 \right|_{x=0} = 4 \cdot 0^3 = 0,$$
$$g'(0) = \left. -4x^3 \right|_{x=0} = -4 \cdot 0^3 = 0,$$
Figure to be
inserted here.
We can also easily verify that the second derivative of each of f and g at 0 is zero:
$$f''(0) = \left. 12x^2 \right|_{x=0} = 12 \cdot 0^2 = 0,$$
$$g''(0) = \left. -12x^2 \right|_{x=0} = -12 \cdot 0^2 = 0,$$
However, as is evident from the graph, 0 is a minimum point of f and a maximum point
of g.
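To make the point concrete, here is a small Python sketch. It assumes (as the derivatives above suggest) that f(x) = x⁴ and g(x) = −x⁴. Because the second derivative is zero at 0 for both, we fall back on comparing the sign of the first derivative slightly to the left and slightly to the right of 0. The offset 0.001 and the function names are our own choices, and checking one point on each side is only a rough stand-in for the "immediate left" and "immediate right".

```python
def classify_by_first_derivative(first_derivative, point, offset=1e-3):
    """Classify a stationary point by the sign of the derivative just left and right of it."""
    left = first_derivative(point - offset)
    right = first_derivative(point + offset)
    if left < 0 < right:
        return "strict local minimum"
    if left > 0 > right:
        return "strict local maximum"
    return "inconclusive"

def f_prime(x):
    return 4 * x**3      # derivative of f(x) = x**4

def g_prime(x):
    return -4 * x**3     # derivative of g(x) = -x**4

def f_second(x):
    return 12 * x**2

def g_second(x):
    return -12 * x**2

print(f_second(0), classify_by_first_derivative(f_prime, 0))
print(g_second(0), classify_by_first_derivative(g_prime, 0))
```

Both second derivatives print as zero, yet the sign check reports a minimum for f and a maximum for g.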
We are not done proving (c) of the SDTE because it remains to be proven that if f′′(a) = 0, then a could be an inflexion point or “something else altogether”. In Ch. 73.4 (after we’ve introduced the concept of inflexion points), we will furnish two such examples.
Note that similar to the FDTE, the converses of (a) and (b) in the SDTE are false. That
is, given a strict maximum (or minimum) a of a twice-differentiable function f , it need not
be that f ′′ (a) < 0 (or f ′′ (a) > 0). Instead, it could be that f ′′ (a) = 0. Example:
Figure to be
inserted here.
Then 0 is a strict minimum of f . However, it is false that f ′′ (0) > 0. Instead, we have:
Figure to be
inserted here.
We say that f is concave on $\mathbb{R}_0^- = (-\infty, 0]$, but convex on $\mathbb{R}_0^+ = [0, \infty)$.
Here are our informal definitions of concavity and convexity:294
• f is concave on an interval if its slope is decreasing on that interval.
• f is convex on an interval if its slope is increasing on that interval.
Also:
• f is strictly concave on an interval if its slope is strictly decreasing on that interval.
• f is strictly convex on an interval if its slope is strictly increasing on that interval.
Here is another characterisation of concavity and convexity:
• f is concave on $\mathbb{R}_0^-$ because if we pick any two points in that interval, say A and B, then no point of the line segment AB is above the graph of f.
• f is convex on $\mathbb{R}_0^+$ because if we pick any two points in that interval, say C and D, then no point of the line segment CD is below the graph of f.
294
For the formal Definitions, see Ch. 121.11 (Appendices).
Example 947. Graphed below are the function g ∶ R → R defined by $g(x) = -x^2$ and the exponential function.
Figure to be
inserted here.
Observe that exp is convex on its entire domain R, while g is concave on its entire domain
R.
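The line-segment characterisation of concavity and convexity lends itself to a quick numerical spot-check. The Python sketch below samples pairs of points and checks whether the chord between them ever rises above (or dips below) the graph. The grid of sample points, the tolerance, and the function names are all our own assumptions, and passing such a check on a finite sample is evidence only, not a proof.

```python
import math

def chord_never_above(func, points, weights, tol=1e-12):
    """True if, for every sampled pair and weight, the chord lies on or below the graph."""
    for a in points:
        for b in points:
            for w in weights:
                x = w * a + (1 - w) * b                    # a point between a and b
                chord = w * func(a) + (1 - w) * func(b)    # height of the chord at x
                if chord > func(x) + tol:
                    return False
    return True

def chord_never_below(func, points, weights, tol=1e-12):
    """True if, for every sampled pair and weight, the chord lies on or above the graph."""
    return chord_never_above(lambda x: -func(x), points, weights, tol)

def g(x):
    return -x**2

points = [i / 2 for i in range(-6, 7)]     # sample points from -3 to 3
weights = [j / 10 for j in range(11)]      # positions along each chord

print(chord_never_above(g, points, weights))         # consistent with g being concave
print(chord_never_below(math.exp, points, weights))  # consistent with exp being convex
```

On this sample, g passes the "never above" check and exp passes the "never below" check, as the graphs suggest.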
Two mnemonics (for distinguishing between concave and convex):
In secondary school, we learnt that informally, a linear function is one whose graph is a
straight line.
It turns out that more formally, we can characterise the property of linearity as follows:295
Figure to be
inserted here.
The following result is immediate from our informal definitions of concavity and convexity:
295
For a formal definition of linearity, see Definition 265 (Appendices).
Proposition 8. (First Derivative Test for Concavity [FDTC]) Let D be an interval
and f ∶ D → R be a differentiable function. Then:
(a) f ′ is decreasing (on D) ⇐⇒ f is concave (on D).
(b) “ strictly decreasing ⇐⇒ “ strictly concave.
(c) “ increasing ⇐⇒ “ convex.
(d) “ strictly increasing ⇐⇒ “ strictly convex.
The following result is immediate from the FDTC with the IDT:
Once again, note the one-way ⟹’s in the SDTC (these are simply inherited from the IDT). For example, the converse of (b) is false. Given a twice-differentiable and strictly concave function f, it need not be that f′′(x) < 0 for all x.
Let a < b and f ∶ (a, b) → R be a continuous function. We call c ∈ (a, b) an inflexion point
of f if either of the following statements is true:
(a) f is strictly concave on c’s “immediate left” and strictly convex on c’s “immediate
right”.
(b) f is strictly convex on c’s “immediate left” and strictly concave on c’s “immediate
right”.
Remark 92. The above Definition is informal only in that the phrases “immediate left” and “immediate right” have not been precisely defined. For the formal definition of an inflexion point, see Definition 266 (Appendices).
Remark 93. It’s usually spelt inflection rather than inflexion.296 But the latter spelling
is what appears on your A-Level syllabus and so that’s what we’ll do too.
296
According to Google Ngram, inflection is the more common spelling (even when we restrict attention
to British English). It seems that inflexion is, like connexion, an archaic spelling.
Example 954. Consider the function f ∶ R → R defined by f (x) = x3 .
Observe that f is strictly concave on R− = (−∞, 0) and strictly convex on R+ = (0, ∞).
Hence, 0 is an inflexion point of f .
Figure to be
inserted here.
One simple test for inflexion points is the Tangent Line Test (TLT). If c is an inflexion
point of f , then it must pass the TLT:297
The tangent line is strictly above (or below) f on the “immediate left” of c
and strictly below (or above) f on the “immediate right” of c.
We can easily verify that the inflexion point 0 passes the TLT:
• On the “immediate left” of 0, the tangent line at 0 is strictly above f; and
• On the “immediate right” of 0, it is strictly below f.
Note though that the converse is false! Any inflexion point must pass the TLT, but not
every point that passes the TLT must be an inflexion point! See Remark 152 (Appen-
dices).
296
For the formal definition, see Definition 266 (Appendices).
297
For a formal statement of the TLT, see Fact 224 (Appendices).
Example 955. Define $g : \mathbb{R} \to \mathbb{R}$ by:
$$g(x) = \begin{cases} x^2 & \text{for } x \leq 0, \\ 0 & \text{for } x > 0. \end{cases}$$
Observe that g is continuous, strictly convex on R− , and concave on R+ . (Of course, g is
in fact linear on R+ , so that it is both concave and convex on R+ .)
Figure to be
inserted here.
Under our above Definition of an inflexion point, we do not consider 0 an inflexion point of the function g.
However, and confusingly, some other writers do!
Figure to be
inserted here.
Under our above Definition of an inflexion point, we do not consider 0 an inflexion point of the function h.
However, and confusingly, some other writers do!
Definition 172. An inflexion point that is also a stationary point is called a stationary
point of inflexion; otherwise, it is called a non-stationary point of inflexion.
Your H2 Maths syllabus explicitly excludes non-stationary points of inflexion. That is,
happily enough, all points of inflexion you’ll encounter will also be stationary, i.e. where
the first derivative equals zero.
Nonetheless, one is tempted to assume that “every inflexion point must also be a stationary
point”. This is false:
Figure to be
inserted here.
Remark 94. Again, the above statement of the FDTI is informal only in that we haven’t
precisely defined the term “near”. For a formal statement (and proof) of the FDTI, see
Fact 223 in the Appendices.
Remark 95. Unfortunately, the converse of the FDTI is false. That is, it may be that f ′
is strictly positive (or negative) at all points “near” c, but c is not an inflexion point.
For such a counterexample, see Example 1234 in the Appendices.
$$f'(0) = \left. 3x^2 \right|_{x=0} = 3 \cdot 0^2 = 0.$$
Figure to be
inserted here.
We can also easily verify that the second derivative of f at 0 is zero:
We have f ′ (0) = 0 and f ′′ (0) = 0. On the other hand, as verified in previous subchapters,
0 is an inflexion point of f .
$$g''(x) = \begin{cases} 20x^3 \sin\frac{1}{x} - 8x^2 \cos\frac{1}{x} - x \sin\frac{1}{x}, & \text{for } x \neq 0, \\ 0, & \text{for } x = 0. \end{cases}$$
Figure to be
inserted here.
It is a little disappointing that we have (c) of the SDTE. But happily, we do have the
following partial converse, which says that if c is an inflexion point of a twice-differentiable
function f , then f ′′ (c) = 0:
Fact 146. (Second Derivative Test for Inflexion Points [SDTI]) Let a < b. Suppose
f ∶ (a, b) → R is twice differentiable and has inflexion point c ∈ (a, b). Then:
(a) c is a strict extremum of the first derivative f ′ ; and
(b) f ′′ (c) = 0.
[Figure: Venn diagram of All Points, showing the sets of Inflexion Points, Stationary Points, Maximum Points, and Minimum Points, with the regions labelled by letters starting from a.]
Remark 96. The above Venn diagram is for reference only. It would be foolish to try
to mug it. Instead, it is much easier to simply remember what (strict) maximum and
minimum points, stationary points, inflexion points, and turning points are.
298
It turns out that in general, there is the awkward possibility of an inflexion point being an extremum (see
Example 1235). Fortunately, we can eliminate this awkward possibility by imposing the requirement
that our function is twice differentiable (see 225).
Exercise 315. Which of types a through l are turning points? (Answer on p. 747.)
Exercise 316. Below is the graph of a twice-differentiable function f, with X points marked. What type (a through l, see above) is each point? (Answer on p. 747.)
Figure to be
inserted here.
A315. By Definition 63, a turning point is a stationary point and a strict extremum.
Therefore, points of types f and h are turning points.
A316.
Figure to be
inserted here.
Fact 15. The line that contains the point (p, q) and has gradient m is:
y − q = m (x − p) .
Example 968. We unload sand onto a flat surface at a steady rate of $0.01\ \mathrm{m^3\,s^{-1}}$. Assume
the unloaded sand always forms a perfect cone whose height and base diameter are always
equal.
Let’s find the rate at which the base area of the cone is increasing, at the instant t = 20 s.
First, recall that a cone with base radius r and height h has volume
$$V = \frac{1}{3}\pi r^2 h.$$
Since the base diameter equals the height (or h = 2r), we can rewrite this as
$$V = \frac{2}{3}\pi r^3.$$
Now differentiate the above equation with respect to t, to get
$$\frac{dV}{dt} = 2\pi r^2 \frac{dr}{dt}.$$
Let $A = \pi r^2$ be the base area. The rate at which the base area is increasing is
$$\frac{dA}{dt} = 2\pi r \frac{dr}{dt} = \frac{dV}{dt} \div r.$$
The volume of the sand is always increasing at a rate $0.01\ \mathrm{m^3\,s^{-1}}$. That is:
$$\frac{dV}{dt} = 0.01\ \mathrm{m^3\,s^{-1}}.$$
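The remaining arithmetic can be sketched numerically. The following is only a sketch: it assumes (this is not stated in the example) that there is no sand on the surface at t = 0, so that the volume at time t is 0.01t. The names `RATE`, `T`, `radius`, and `dA_dt` are illustrative.

```python
import math

RATE = 0.01  # dV/dt in m^3 per second (from the example)
T = 20.0     # instant of interest, in seconds

# Assumption (not stated in the example): no sand at t = 0,
# so the volume at time t is simply RATE * t.
volume = RATE * T

# Invert V = (2/3) * pi * r^3 to get the base radius.
radius = (3 * volume / (2 * math.pi)) ** (1 / 3)

# dA/dt = (dV/dt) / r, as derived in the example.
dA_dt = RATE / radius

print(f"r = {radius:.4f} m, dA/dt = {dA_dt:.4f} m^2/s")
```

Under that assumption, the base area is growing at roughly 0.022 m² per second at t = 20 s.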
d2 A 9 h4 A2 − (π − h3 )
12 6 2
= .
dh2 4 A3
(f) Consider the numerator of $\frac{d^2A}{dh^2}$. Replace $A^2$ with the expression for A that you found
in (c). Now fully expand this numerator. Observe that it is a quadratic and prove that
it is always positive.
(g) Hence conclude that the stationary point we found is indeed the global minimum.
Example 969. Define f ∶ [0, 2] → R by x ↦ x − sin (0.5πx). We can easily find the
minimum point of f analytically:
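$$\frac{df}{dx} = 1 - \frac{\pi}{2}\cos\left(\frac{\pi}{2}x\right) = 0 \iff \cos\left(\frac{\pi}{2}x\right) = \frac{2}{\pi} \iff x = \frac{2}{\pi}\cos^{-1}\frac{2}{\pi} \approx 0.560664181.$$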
But as an exercise, let’s find it using our TI84.
Note that in the question given, the domain is actually [0, 2], but we didn’t bother
telling the calculator this. So the calculator just went ahead and graphed the equation
y = x − sin(0.5πx) for all possible real values of x and y.
No big deal, all we need to do is to zoom in to the region where 0 ≤ x ≤ 2.
5. Press the ZOOM button to bring up a menu of ZOOM options.
6. Press 2 to select the Zoom In option. Using the ⟨ and ⟩ arrow keys, move the cursor
to where X = 1.0638298, Y = 0. Now press ENTER and the TI will zoom in a little,
centred on the point X = 1.0638298, Y = 0.
4. Press the blue 2ND button and then CALC (which corresponds to the TRACE
button). This brings up the CALCULATE menu.
5. Press 3 to select the “minimum” option. This brings you back to the graph, with a
cursor flashing. Also, the TI84 prompts you with the question: “Left Bound?”
TI84’s MINIMUM function works by you first choosing a “Left Bound” and a “Right
Bound” for x. TI84 will then look for the minimum point within your chosen bounds.
6. Using the ⟨ and ⟩ arrow keys, move the blinking cursor until it is where you want
your first “Left Bound” to be. For me, I have placed it a little to the left of where I
believe the minimum point to be.
7. Press ENTER and you will have just entered your first “Left Bound”.
TI84 now prompts you with the question: “Right Bound?”.
8. So now just repeat. Using the ⟨ and ⟩ arrow keys, move the blinking cursor until it
is where you want your first “Right Bound” to be. For me, I have placed it a little to
the right of where I believe the minimum point to be.
9. Again press ENTER and you will have just entered your first “Right Bound”.
TI84 now asks you: “Guess?” This is just asking if you want to proceed and get TI84 to
work out where the minimum point is. So go ahead and:
10. Press ENTER . TI84 now informs you that there is a “Minimum” at “X = .56066485”,
“Y = −.2105137” and places the cursor at precisely that point. This is our desired
minimum point.
(Notice there’s a slight error, because the TI84 uses slightly-imprecise numerical methods.
Analytically, we found that the minimum point was x ≈ 0.560664181, while the TI84
claims it is “X = .56066485”.)
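For comparison, here is a minimal Python sketch of the same idea: it evaluates the analytic answer and then searches numerically between a left bound and a right bound. The golden-section search used here is my own choice of numerical method and is not necessarily what the TI84 does internally; the function names are illustrative.

```python
import math

def f(x):
    return x - math.sin(0.5 * math.pi * x)

# Closed form from the derivative condition cos(pi*x/2) = 2/pi.
x_exact = (2 / math.pi) * math.acos(2 / math.pi)

def golden_section_min(func, left, right, tol=1e-10):
    """Shrink [left, right] around the minimum of a unimodal function."""
    inv_phi = (math.sqrt(5) - 1) / 2
    c = right - inv_phi * (right - left)
    d = left + inv_phi * (right - left)
    while right - left > tol:
        if func(c) < func(d):
            right = d   # the minimum lies in [left, d]
        else:
            left = c    # the minimum lies in [c, right]
        c = right - inv_phi * (right - left)
        d = left + inv_phi * (right - left)
    return (left + right) / 2

x_numeric = golden_section_min(f, 0.0, 2.0)
print(x_exact, x_numeric, f(x_exact))
```

Like the calculator, the numerical search agrees with the analytic answer only to a limited number of digits, because f is very flat near its minimum.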
Notice that strangely enough, the graph seems to be empty for the region where x < 0. But
clearly there are values for which x < 0: for example, t = −1.1 ⟹ (x, y) ≈ (−2.71, 2.87).
So why isn’t the TI84 graphing this?
[Screenshots: after Step 9, after Step 10, after Step 11, and after Step 12.]
Actually, the last few steps were really not necessary, if all we wanted was to find $\left.\frac{dy}{dx}\right|_{t=1}$,
as we do now:
7. Press the blue 2ND button and then CALC (which corresponds to the TRACE
button). This brings up the CALCULATE menu, which once again looks a little
different under the current parametric setting.
8. Press 2 to select the “dy/dx” option. This brings you back to the graph.
Nothing seems to be happening. But now, simply ...
9. Press 1 and now the bottom left of the screen changes to display “T = 1”.
10. Hit ENTER . What you’ve just done is to ask the calculator to calculate $\frac{dy}{dx}$ at the
point where t = 1. The calculator tells you that “dy/dx = .83333528”.
Again, there’s a slight error: the exact correct answer is $\frac{dy}{dx} = \frac{5}{6} = 0.8333\ldots$, so again
the TI84 is a tiny bit off.
Quick examples:
$$c_0 + c_1 x + c_2 x^2 + \dots + c_n x^n.$$
You can easily imagine what an “infinite-degree polynomial” is. Except we don’t call it
that. Instead, we call it a power series:
We also call:
• Each $c_n x^n$ the nth-degree term or the nth term;
• Each $c_n$ the coefficient on $x^n$ (or the nth-degree coefficient, or the nth coefficient); and
• $c_0$ the constant term or, more simply, the constant.
299
See Definition 31.
Example 972. The expression $1 + 2x + 3x^2 + 4x^3 + 5x^4 + 6x^5 + \dots$ is a power series.
For any non-negative integer n, this power series’s nth coefficient is 1 + n. And so, using
summation notation, we may also write this power series as:
$$1 + 2x + 3x^2 + 4x^3 + 5x^4 + 6x^5 + \dots = \sum_{n=0}^{\infty} (1 + n)\, x^n$$
Observe that a power series is, by definition, an infinite series (see Ch. 28.1).
As emphasised in Part II (Sequences and Series), we must be very careful when dealing
with infinite series. The = sign in the above equation is not the usual one; instead, it
means converges to, which has a very clear, precise, and technical meaning that you
are not required to know for H2 Maths.300
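To get a rough feel for what “converges to” means, here is a small Python sketch that adds up more and more terms of the power series from Example 972 at x = 0.5. The closed form 1/(1 − x)² used for comparison is a standard fact for |x| < 1 that is not derived in this chapter; the function name `partial_sum` is illustrative.

```python
def partial_sum(x, terms):
    """Sum of (1 + n) * x**n for n = 0, 1, ..., terms - 1."""
    return sum((1 + n) * x**n for n in range(terms))

x = 0.5
sums = [partial_sum(x, k) for k in (5, 10, 20, 40)]

# Known closed form for |x| < 1 (stated here without proof).
limit = 1 / (1 - x) ** 2

for k, s in zip((5, 10, 20, 40), sums):
    print(k, s, limit - s)
```

The gap between the partial sums and the limit shrinks as more terms are included, which is the informal idea behind convergence.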
Example 973. In the power series below, the nth coefficient is $(-1)^n$.
$$\sum_{n=0}^{\infty} (-1)^n x^n = 1 - x + x^2 - x^3 + x^4 - x^5 + \dots$$
Remark 97. Definition 173 mostly parallels Definition 31. The only exception is that we
do not call the following a power equation:
$$\sum_{i=0}^{\infty} c_i x^i = c_0 + c_1 x + c_2 x^2 + \dots = 0.$$
The reason is that the term power equation is rarely used in mathematics and when
it is, it’s usually for rather different purposes — example. So, in this textbook, we will
never use the term power equation.
300
But see Ch. 118.1 (Appendices) if you’re interested.
76.2. Analytic Functions
A function is analytic if it can be represented by a power series. A bit more formally:
$$f(x) = \sum_{n=0}^{\infty} c_n x^n = c_0 + c_1 x + c_2 x^2 + \dots \quad \text{for every } x \in D.$$
Then we call f an analytic function and say that f can be represented by the power series
$\sum_{n=0}^{\infty} c_n x^n$.
And so, we say that f is analytic and can be represented by the following power
series:
$$\sum_{n=0}^{\infty} x^n = 1 + x + x^2 + x^3 + \dots$$
Since f can be represented by the above power series, it would have been exactly equivalent
if we had defined $f \colon (-1, 1) \to \mathbb{R}$ not by $\overset{1}{=}$, but instead by:
$$f(x) = 1 + x + x^2 + x^3 + \dots$$
And so, we say that g is analytic and can be represented by the following power
series:
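$$\sum_{n=0}^{\infty} (-2)^n x^n = 1 - 2x + 4x^2 - 8x^3 + \dots$$
Since g can be represented by the above power series, it would have been exactly equivalent
if we had defined $g \colon \left(-\tfrac{1}{2}, \tfrac{1}{2}\right) \to \mathbb{R}$ not by $\overset{1}{=}$, but instead by:
$$g(x) = 1 - 2x + 4x^2 - 8x^3 + \dots$$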
Most functions we’ll encounter in H2 Maths are analytic (at least when we restrict their
domain suitably).
Analytic functions are “well-behaved” in that they possess certain properties that make
them particularly easy to deal with. For example, analytic functions are smooth (i.e.
infinitely differentiable).301
A rare example of a function that’s commonly encountered in H2 Maths but which is not
analytic is the absolute value function ∣⋅∣. But even so, ∣⋅∣ fails to be analytic only on open
intervals containing 0 and is analytic if we restrict its domain to any other open interval.
301
Every analytic function is smooth. But the converse is not true — there are smooth functions that are
not analytic (for one, see Example 986).
76.3. Introducing the Maclaurin Series
In this subchapter, we’ll simply learn to mechanically compute something called the Mac-
laurin coefficients and Maclaurin series (expansion) without understanding what
they do.
In the next subchapter, we’ll then learn that certain functions can be represented by their
Maclaurin series and that this is tremendously convenient.
$$f'(x) = \frac{1}{(1-x)^2}, \quad f''(x) = \frac{2}{(1-x)^3}, \quad f'''(x) = \frac{3 \cdot 2}{(1-x)^4}, \quad f^{(4)}(x) = \frac{4 \cdot 3 \cdot 2}{(1-x)^5}, \quad \dots, \quad f^{(n)}(x) = \frac{n!}{(1-x)^{n+1}}.$$
$$f(0) = \frac{1}{1-0} = 1 = 0!, \quad f'(0) = \frac{1}{(1-0)^2} = 1 = 1!, \quad f''(0) = \frac{2}{(1-0)^3} = 2 = 2!, \quad f^{(3)}(0) = \frac{3 \cdot 2}{(1-0)^4} = 3!, \quad \dots, \quad f^{(n)}(0) = \frac{n!}{(1-0)^{n+1}} = n!.$$
By the way, note that $f^{(0)} = f$. That is, we define the zeroth derivative of a function to
be the function itself.
Step 3. For each non-negative integer n = 0, 1, 2, . . . , define the nth Maclaurin coeffi-
cient for f to be the following number:
$$m_n = \frac{f^{(n)}(0)}{n!}.$$
And so, the 0th, 1st, 2nd, 3rd, and nth Maclaurin coefficients for f are:
Now, note that all we’ve done is to write down an infinite series M (x) called the Maclaurin
series (and which happens also to be a power series). We have not actually shown that
this series M (x) converges to or is “equal” to any number.
Remark 98. This textbook302 shall treat the terms Maclaurin series and Maclaurin
series expansion as synonyms. That is, the word expansion is optional — and indeed,
we will usually drop it.
Step 3. So, for each n = 0, 1, 2, . . . , the nth Maclaurin coefficient for sin is:
$$m_n = \frac{\sin^{(n)}(0)}{n!} = \begin{cases} 0/n! = 0 & \text{for } n = 0, 4, 8, \dots, \\ 1/n! & \text{for } n = 1, 5, 9, \dots, \\ 0/n! = 0 & \text{for } n = 2, 6, 10, \dots, \\ -1/n! & \text{for } n = 3, 7, 11, \dots \end{cases}$$
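The four-case pattern above can be sketched in a few lines of Python. This is only an illustration: the function names are my own, and the comparison with `math.sin` is a numerical sanity check rather than a proof of anything.

```python
import math

# The derivatives of sin at 0 cycle through sin 0, cos 0, -sin 0, -cos 0.
CYCLE = [0, 1, 0, -1]

def maclaurin_coefficient_sin(n):
    """nth Maclaurin coefficient of sin: sin^(n)(0) / n!."""
    return CYCLE[n % 4] / math.factorial(n)

def maclaurin_partial_sum_sin(x, degree):
    """Sum of m_n * x**n for n = 0, 1, ..., degree."""
    return sum(maclaurin_coefficient_sin(n) * x**n for n in range(degree + 1))

print([maclaurin_coefficient_sin(n) for n in range(6)])
print(maclaurin_partial_sum_sin(1.0, 15), math.sin(1.0))
```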
Formal definitions:
302
And from what I observe, also on your A-Level exams.
Definition 175. If the function f is n-times differentiable at 0, then the nth Maclaurin
coefficient for f is denoted mn and is defined to be the following number:
$$m_n = \frac{f^{(n)}(0)}{n!} \quad \text{for } n = 0, 1, 2, \dots$$
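$$M(x) = \sum_{n=0}^{\infty} m_n x^n = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!} x^n = f(0) + f'(0)\,x + \frac{f''(0)}{2!} x^2 + \frac{f^{(3)}(0)}{3!} x^3 + \dots + \frac{f^{(n)}(0)}{n!} x^n + \dots$$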
Remark 99. No need to mug the above definition because $\overset{1}{=}$ appears on List MF26.
Step 3. So, for each n = 0, 1, 2, . . . , the nth Maclaurin coefficient for exp is:
$$m_n = \frac{\exp^{(n)}(0)}{n!} = \frac{1}{n!}.$$
Step 4. Thus, the Maclaurin series for exp is:
$$M(x) = \sum_{n=0}^{\infty} m_n x^n = \sum_{n=0}^{\infty} \frac{\exp^{(n)}(0)}{n!} x^n = \frac{1}{0!} + \frac{1}{1!} x + \frac{1}{2!} x^2 + \frac{1}{3!} x^3 + \dots = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots$$
Exercise 320. Find the Maclaurin series for each function. (Answer on p. 1536.)
(a) $f \colon (-1, 1) \to \mathbb{R}$ defined by $f(x) = (1 + x)^k$, where k is any real number.
(b) cos
(c) g ∶ (−1, 1] → R defined by g (x) = ln (1 + x).
Remark 100. Happily, your H2 Maths syllabus (p. 9) explicitly excludes “derivation of
the general term of the series”. I take this to mean that they promise never to ask you
to derive the general nth term of a Maclaurin series. (But of course, who knows if they’ll
actually keep this promise.)
Remark 101. Your H2 Maths syllabus and exams make no mention of the Taylor series.
But just so you know, the Maclaurin series is simply a special case of the Taylor series.
Specifically, the Maclaurin series for f is the Taylor series for f about 0.
Note that in the main text of Part II, we did not formally define what it means for an
infinite series to converge. Instead, as stated on p. 387, in H2 Maths, we will simply
count on your rough and intuitive understanding of what convergence means. Two quick
examples to illustrate convergence (and its antonym divergence):
Example 979. The series $\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \frac{1}{32} + \dots$ converges to 1. Or equivalently:
$$\frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \frac{1}{32} + \dots = 1.$$
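The intuitive meaning of this convergence can be illustrated with exact fractions. In the sketch below (the function name is illustrative), each partial sum falls short of 1 by exactly the last term added, so the shortfall halves each time.

```python
from fractions import Fraction

def geometric_partial_sum(k):
    """Exact value of 1/2 + 1/4 + ... + 1/2**k."""
    return sum(Fraction(1, 2**n) for n in range(1, k + 1))

for k in (1, 2, 5, 10, 20):
    s = geometric_partial_sum(k)
    print(k, s, 1 - s)
```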
Definition 176. Let f be a function that is smooth at 0 and M be its Maclaurin series.
Suppose that for every x ∈ Domainf , we have:
M (x) = f (x).
It turns out that very happily, most functions encountered in H2 Maths can be represented
by their Maclaurin series (at least when the domain is suitably restricted):
303
By everywhere, we mean at every point in the domain of f .
Example 981. Define $f \colon (-1, 1) \to \mathbb{R}$ by:
$$f(x) \overset{1}{=} \frac{1}{1-x}.$$
In the previous subchapter, we found that the Maclaurin series for f is:
$$M(x) = 1 + x + x^2 + \dots$$
It is possible (but beyond the scope of H2 Maths) to prove that M (x) converges to f (x)
for all x ∈ Domainf = (−1, 1). Or equivalently, that:
And so, by Definition 176, we say that f can be represented by its Maclaurin series.
Thus, it would’ve been exactly equivalent if we had defined the function f not by $\overset{1}{=}$, but
instead by its Maclaurin series expansion. That is, it would’ve been exactly equivalent if
the first sentence of this example had been replaced with the following sentence:
And so, by Definition 176, we say that sin can be represented by its Maclaurin series.
In Ch. 19, we defined sin by means of the right-triangle and unit-circle definitions. These
definitions are, however, considered somewhat informal. Since sin may be represented by
its Maclaurin series, why don’t we simply use that as our formal definition? Here then is this
textbook’s official formal definition of sin:
Remark 102. Note that in this textbook, we have not justified why the above definition is
valid. That is, we have not justified why for all x ∈ R, the expression $x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \dots$
converges to some real number. We will simply take for granted that the above definition
“works”.
We will also take for granted that the above definition is in agreement with our informal
right-triangle and unit-circle definitions of sine.
These same remarks apply to the definition of the cosine function below.
We can similarly work our way towards this textbook’s formal definition of cos:
And so, by Definition 176, we say that cos can be represented by its Maclaurin series.
In Ch. 19, we defined cos by means of the right-triangle and unit-circle definitions. These
definitions are, however, considered somewhat informal. Since cos may be represented by its Maclaurin
series, why don’t we simply use that as our formal definition? Here then is this textbook’s
official formal definition of cos:
Remark 103. Definitions 177 and 178 just given are called the power series definitions
of sine and cosine and shall be this textbook’s official formal definitions of these two
functions.
Note though that there are other ways to formally define sine and cosine. One is to use
the following exponential definitions:
And so, by Definition 176, we say that exp can be represented by its Maclaurin series.
In Ch. 17, Definition 59 formally defined the exponential function exp to be the inverse
of the natural logarithm function ln.304 But since exp can be represented by its Maclaurin
series, we also have the following alternative definition of exp:
Remark 104. Definition 179 is JSYK. Definition 59 remains this textbook’s official formal
definition of the exponential function.
In Ch. 17, we gave two results about Euler’s number e. We can now prove both of them.
The first is especially easy:
Theorem 1. $e = \frac{1}{0!} + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \dots$
$$e = \exp 1 = \frac{1}{0!} + \frac{1}{1!} + \frac{1^2}{2!} + \frac{1^3}{3!} + \dots = 1 + 1 + \frac{1}{2!} + \frac{1}{3!} + \dots$$
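As a quick numerical illustration of Theorem 1 (not a proof), the sketch below adds up the first few terms of the series and compares the result with Python’s built-in value of e. The function name is illustrative.

```python
import math

def e_partial_sum(terms):
    """Sum of 1/n! for n = 0, 1, ..., terms - 1."""
    return sum(1 / math.factorial(n) for n in range(terms))

for terms in (2, 5, 10, 20):
    s = e_partial_sum(terms)
    print(terms, s, math.e - s)
```

Because the factorials grow so quickly, only a handful of terms are needed before the partial sum agrees with e to many decimal places.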
The second result about e is a little harder to prove:
304
And the natural logarithm function $\ln \colon \mathbb{R}^+ \to \mathbb{R}$ was, in turn, defined by $\ln x = \int_1^x \frac{1}{t}\,dt$. We’ll have
more to say about this in Ch. XXX.
Theorem 2. $e = \lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^n$.
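Again purely as an illustration of Theorem 2, the following sketch evaluates the expression for increasingly large n. The function name `compound` is my own.

```python
import math

def compound(n):
    """Value of (1 + 1/n)**n."""
    return (1 + 1 / n) ** n

for n in (1, 10, 100, 1000, 10**6):
    print(n, compound(n), math.e - compound(n))
```

Compared with the factorial series in Theorem 1, this limit approaches e quite slowly.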
Observe that f and g have different domains but are otherwise identical.
The function f is smooth at 0 and, as found earlier, its Maclaurin series Mf is given by:
$$M_f(x) = \sum_{n=0}^{\infty} m_n x^n = 1 + x + x^2 + \dots$$
Observe that g is also smooth at 0. And so, we can go through the exact same four steps
to find the Maclaurin series Mg for g. Not surprisingly, we will find that Mg is exactly
the same as Mf . That is, Mg is given by:
$$M_g(x) = \sum_{n=0}^{\infty} m_n x^n = 1 + x + x^2 + \dots$$
However, it is no longer true that Mg (x) converges to g (x) everywhere. That is, it is no
longer true that Mg (x) = g (x) for every x ∈ Domaing = (−3, 1).
Take for example −2 ∈ Domaing = (−3, 1). We have:
$$M(-2) = \sum_{n=0}^{\infty} m_n (-2)^n = 1 + (-2) + (-2)^2 + \dots, \quad \text{while} \quad g(-2) = \frac{1}{1 - (-2)} = \frac{1}{3}.$$
Clearly, M (−2) ≠ g (−2). Indeed, M (−2) does not even converge. Hence, g cannot be
represented by its Maclaurin series.
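The failure to converge at x = −2 is easy to see numerically. The sketch below (with an illustrative helper name) lists the running partial sums of 1 + x + x² + ... at x = −2 and, for contrast, at x = 0.5.

```python
def partial_sums(x, count):
    """Running partial sums of 1 + x + x**2 + ..., first `count` of them."""
    total, power, out = 0, 1, []
    for _ in range(count):
        total += power
        out.append(total)
        power *= x
    return out

sums_at_minus_2 = partial_sums(-2, 8)
sums_at_half = partial_sums(0.5, 60)

print(sums_at_minus_2)   # swings ever more wildly; compare g(-2) = 1/3
print(sums_at_half[-1])  # settles near 1 / (1 - 0.5) = 2
```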
In the above counterexample, g could not be represented by its Maclaurin series because
there were values of x ∈ Domaing for which Mg (x) did not converge. One might thus
wonder if the following “result” is true:
“Suppose a function has a Maclaurin series that converges everywhere.
Then this function may be represented by its Maclaurin series.”
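$$h'(x) = \left(\exp\frac{-1}{x^2}\right)\frac{2}{x^3} \quad \text{and}$$
$$h''(x) = \left(\exp\frac{-1}{x^2}\right)\frac{4}{x^6} - 3\left(\exp\frac{-1}{x^2}\right)\frac{2}{x^4} = \left(\exp\frac{-1}{x^2}\right)\frac{4 - 6x^2}{x^6}.$$
$$h^{(0)}(0) = 0, \quad h'(0) = 0, \quad h''(0) = 0, \quad \dots, \quad h^{(n)}(0) = 0 \quad \text{for every } n \in \mathbb{Z}_0^+.$$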
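Since every derivative of h at 0 is zero, every Maclaurin coefficient of h is zero too, so its Maclaurin series is identically zero. The sketch below contrasts that with h itself. It assumes h(x) = exp(−1/x²) for x ≠ 0 together with h(0) = 0, which is the usual convention for this example and is consistent with the derivatives shown; the function names are illustrative.

```python
import math

def h(x):
    # Assumption: h(0) is defined to be 0 (the usual convention for this example).
    return 0.0 if x == 0 else math.exp(-1 / x**2)

def maclaurin_series_h(x):
    # Every Maclaurin coefficient of h is 0, so the series sums to 0 for every x.
    return 0.0

for x in (0.0, 0.25, 0.5, 1.0):
    print(x, h(x), maclaurin_series_h(x))
```

The series converges everywhere (to 0), yet it equals h only at x = 0.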
Our earlier examples suggest that most functions we’ll encounter in H2 Maths can be
represented by their Maclaurin series.
In contrast, the two examples we’ve just looked at show that not every function can be
represented by its Maclaurin series.
At this point, a natural question to ask is this:
Theorem 23. Every analytic function whose domain includes 0 can be represented by its
Maclaurin series.
n=0
f (0) = m0 ,
f ′ (0)= m1 ,
f ′′ (0)= 2m2 ,
f (3) (0) = 3!m3 ,
f (4) (0) = 4!m4 ,
⋮
f (n) (0) = n!mn .
Having found that $M(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots$, we can then assert that:
$$\exp x = M(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots \quad \text{for all } x \in \mathbb{R}.$$
Having found that $M(x) = 1 - 2x + 3x^2 - 4x^3 + \dots$, we can then assert that:
$$\frac{1}{(1 + x)^2} = M(x) = 1 - 2x + 3x^2 - 4x^3 + \dots \quad \text{for all } x \in (-1, 1).$$
In H2 Maths and also in this textbook, we will not even attempt to answer the above
question. Instead, we will simply and blithely assume that most functions we encounter
are analytic (at least when the domain is suitably restricted). Which means that for most
functions, we can simply and blithely apply Theorem 23 and thus assume that they can
indeed be represented by their Maclaurin series.
This is wonderful, but you should be aware that this also means there are important holes
in your understanding of how and when the Maclaurin series works. These holes will be
patched as you progress beyond H2 Maths.
Your H2 Maths syllabus (p. 9) and exams306 call the five specific Maclaurin series listed
above the standard series. In previous subchapters, we already learnt how to derive all
five of these standard series.
Your H2 Maths syllabus (p. 9) includes:
• range of values of x for which a standard series converges.
In List MF26 (see above), these ranges of values are given on the right (in parentheses).
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots$$
$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \dots$$
In contrast, for the first and fifth standard series, we have restrictions.
The first series for $(1 + x)^n$ has the restriction “∣x∣ < 1”. This means that the corresponding
Maclaurin series M (x) converges only for x ∈ (−1, 1). That is, for every x ∈ (−1, 1), we
have:
$$(1 + x)^n \overset{1}{=} 1 + nx + \frac{n(n-1)}{2!} x^2 + \frac{n(n-1)(n-2)}{3!} x^3 + \dots$$
In contrast, $\overset{1}{=}$ is false for any x ∉ (−1, 1). For example, suppose x = 2 and n = 1.5. Then the
LHS of $\overset{1}{=}$ is:
306
See Exercise 564 (N2017/I/1).
$$(1 + x)^n = (1 + 2)^{1.5} = 3^{1.5} \approx 5.196.$$
So, to repeat, for any x ∉ (−1, 1), the corresponding Maclaurin series will not converge and
$\overset{1}{=}$ is false.
In H2 Maths and this textbook, we shall not explain why the Maclaurin series for $(1 + x)^n$
converges only on (−1, 1). Instead, this is simply something you must “know” by rote.
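Although the reason is not explained here, the behaviour itself is easy to observe. The following sketch (the function name is illustrative) builds the partial sums of the series for n = 1.5, once at x = 0.5 (inside the interval) and once at x = 2 (outside it).

```python
def binomial_series_partial_sums(n, x, count):
    """Running partial sums of 1 + n*x + n(n-1)/2! * x**2 + ..."""
    term, total, out = 1.0, 0.0, []
    for k in range(count):
        total += term
        out.append(total)
        # Ratio of consecutive terms: (n - k) / (k + 1) * x.
        term *= (n - k) / (k + 1) * x
    return out

inside = binomial_series_partial_sums(1.5, 0.5, 60)
outside = binomial_series_partial_sums(1.5, 2.0, 30)

print(inside[-1], 1.5 ** 1.5)   # the partial sums settle at (1 + 0.5)**1.5
print(outside[-3:], 3 ** 1.5)   # the partial sums keep swinging further apart
```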
Remark 105. The term binomial series (which was on the old 9740 syllabus but is no
longer on the current 9758 syllabus) is simply the Maclaurin series for $(1 + x)^n$.
Similarly, the fifth series for ln (1 + x) has the restriction “−1 < x ≤ 1”. This means that
the corresponding Maclaurin series M (x) converges only for x ∈ (−1, 1]. That is, for every
x ∈ (−1, 1], we have:
$$\ln(1 + x) \overset{3}{=} x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \frac{x^5}{5} - \frac{x^6}{6} + \dots$$
In contrast, $\overset{3}{=}$ is false for any x ∉ (−1, 1]. For example, suppose x = 2. Then the LHS of $\overset{3}{=}$
is:
$$\ln(1 + x) = \ln(1 + 2) = \ln 3 \approx 1.099.$$
while the RHS is:
$$x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \frac{x^5}{5} - \frac{x^6}{6} + \dots = 2 - \frac{2^2}{2} + \frac{2^3}{3} - \frac{2^4}{4} + \frac{2^5}{5} - \frac{2^6}{6} + \dots$$
So, to repeat, for any x ∉ (−1, 1], the corresponding Maclaurin series will not converge and
$\overset{3}{=}$ is false.
Again, in H2 Maths and this textbook, we shall not explain why the Maclaurin series for
ln (1 + x) converges only on (−1, 1]. Instead, this is simply something you must “know” by
rote.
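The three regimes (well inside the interval, at the endpoint x = 1, and outside at x = 2) can be compared with a short sketch. The function name is illustrative, and the outputs are numerical illustrations only.

```python
import math

def ln_series_partial_sum(x, terms):
    """Sum of (-1)**(n+1) * x**n / n for n = 1, ..., terms."""
    return sum((-1) ** (n + 1) * x**n / n for n in range(1, terms + 1))

print(ln_series_partial_sum(0.5, 60), math.log(1.5))    # fast convergence
print(ln_series_partial_sum(1.0, 1000), math.log(2.0))  # slow convergence at x = 1
print(ln_series_partial_sum(2.0, 10), ln_series_partial_sum(2.0, 20))  # no convergence
```

At x = 1 the series does converge to ln 2, but so slowly that even a thousand terms give only about three correct decimal places.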
Definition 180. Let f be a function that is n-times differentiable at 0. Then the nth
Maclaurin polynomial for f is:
$$M_n(x) = \sum_{i=0}^{n} m_i x^i = f(0) + f'(0)\,x + \frac{f''(0)}{2!} x^2 + \frac{f^{(3)}(0)}{3!} x^3 + \dots + \frac{f^{(n)}(0)}{n!} x^n.$$
Not surprisingly, if a function can be represented by its Maclaurin series, then it can also be
approximated by its Maclaurin polynomials. This is one useful application of the Maclaurin
series.
$$g(x) = \ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \dots$$
By the above definition, the 0th, 1st, 2nd, 3rd, 4th, and 5th Maclaurin polynomials
for g are the following polynomials:
$$M_0(x) = 0,$$
$$M_1(x) = x,$$
$$M_2(x) = x - \frac{x^2}{2},$$
$$M_3(x) = x - \frac{x^2}{2} + \frac{x^3}{3},$$
$$M_4(x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4},$$
$$M_5(x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \frac{x^5}{5}.$$
Figure to be
inserted here.
Observe that these first six Maclaurin polynomials for g serve as ever-improved approx-
imations of the function g.
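Until the figure is available, the improvement can be seen numerically at a single point. The sketch below evaluates the six polynomials at x = 0.5 (my own choice of test point) and records how far each is from ln 1.5.

```python
import math

def maclaurin_poly_ln1p(x, degree):
    """M_degree(x) for ln(1 + x): sum of (-1)**(i+1) * x**i / i for i = 1..degree."""
    return sum((-1) ** (i + 1) * x**i / i for i in range(1, degree + 1))

x = 0.5
errors = [abs(maclaurin_poly_ln1p(x, k) - math.log(1 + x)) for k in range(6)]

for k, err in enumerate(errors):
    print(k, maclaurin_poly_ln1p(x, k), err)
```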
Example 990. Consider the sine function sin ∶ R → R.
We already showed that sin can be represented by its Maclaurin series. That is, for every
x ∈ Domain sin = R, we have:
$$\sin x = M(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots$$
The 0th, 1st, 2nd, 3rd, 4th, and 5th Maclaurin polynomials for sin are:
$$M_0(x) = 0,$$
$$M_1(x) = x,$$
$$M_2(x) = x,$$
$$M_3(x) = x - \frac{x^3}{3!},$$
$$M_4(x) = x - \frac{x^3}{3!},$$
$$M_5(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!}.$$
Figure to be
inserted here.
Observe that if x is small (close to zero), then $\frac{x^3}{3!}$ is small and $\frac{x^5}{5!}$ is even smaller. Indeed,
each Maclaurin coefficient grows smaller.
And so, if x is small, even low-order Maclaurin polynomials will serve as “good” approximations
of sine.
Indeed, if x is very small (i.e. very close to zero), then we may simply assert that:
$$\sin x \overset{1}{\approx} x.$$
In Exercises 321(c) and 329, we’ll also learn of the small-angle approximation for
cosine and tangent.
By the way, one might reasonably think that a higher-degree Maclaurin polynomial is
always a better approximation than a lower-degree Maclaurin polynomial. Unfortunately,
this is not generally true, especially if x is far from zero.
As an example, let x = 10, so that sin x = sin 10 ≈ −0.544.
Evaluating the 5th Maclaurin polynomial at 10, we have:
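The comparison at x = 10 can be sketched in Python as follows. The function name is illustrative; the point is simply to compare how far the 1st, 3rd, and 5th Maclaurin polynomials are from sin 10.

```python
import math

def maclaurin_poly_sin(x, degree):
    """M_degree(x) for sin: only odd powers contribute, with alternating signs."""
    total = 0.0
    for n in range(1, degree + 1, 2):
        total += (-1) ** (n // 2) * x**n / math.factorial(n)
    return total

x = 10.0
for degree in (1, 3, 5):
    value = maclaurin_poly_sin(x, degree)
    print(degree, value, abs(value - math.sin(x)))
```

Here the 5th Maclaurin polynomial gives roughly 676.67, which is a far worse approximation of sin 10 ≈ −0.544 than the 1st Maclaurin polynomial’s value of 10.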
Exercise 321. For each function, sketch its graph; then find and sketch (on the same
graph) its 0th, 1st, 2nd, and 3rd Maclaurin polynomials. (Answer on p. 1538.)
(a) exp
(b) $f \colon (-1, 1) \to \mathbb{R}$ defined by $f(x) = (1 + x)^n$, where n is any real number.
(c) cos
The small-angle approximation for cosine is given by the 2nd Maclaurin polynomial.
Write it down.
Remark 106. The term Maclaurin polynomial is not used in your H2 Maths syllabus or
exams. However, it is sufficiently convenient that I have decided nonetheless to introduce
it in this textbook.
We aren’t sure if such a step is legal, but suppose we try differentiating the above Maclaurin
series term by term. Then we’d get the following expression:
$$1 + 2x + 3x^2 + \dots = \sum_{n=0}^{\infty} n x^{n-1}.$$
The above example and exercise suggest that if f can be represented by a power series
M , then f is differentiable and, moreover, the derivative f ′ can be obtained by simply
differentiating M term by term. It turns out that happily enough, this is true! Formally:
$$f(x) = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + \dots = \sum_{n=0}^{\infty} c_n x^n \quad \text{for every } x \in D.$$
We now illustrate the above Theorem by proving that the derivatives of sin and cos are,
Proof. For every x ∈ R, we have $\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots$
By the above Theorem then, sin is differentiable and its derivative $\sin' \colon \mathbb{R} \to \mathbb{R}$ is defined
by:
$$\sin' x = 1 - \frac{3x^2}{3!} + \frac{5x^4}{5!} - \dots = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \dots$$
Observing that this last expression is simply the power series expansion of cos, we conclude
that the derivative of sin is cos.
Proof. For every x ∈ R, we have $\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \dots$
By the above Theorem then, cos is differentiable and its derivative $\cos' \colon \mathbb{R} \to \mathbb{R}$ is defined
by:
$$\cos' x = 0 - \frac{2x}{2!} + \frac{4x^3}{4!} - \frac{6x^5}{6!} + \dots = -x + \frac{x^3}{3!} - \frac{x^5}{5!} + \dots$$
Observing that this last expression is simply the power series expansion of − sin, we conclude
that the derivative of cos is − sin.
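The two term-by-term differentiations above can be checked mechanically on the first few coefficients. In this sketch a power series is stored as a list of exact coefficients `[c0, c1, c2, ...]`; the helper names are my own.

```python
from fractions import Fraction
from math import factorial

def sin_coefficients(count):
    cycle = [0, 1, 0, -1]
    return [Fraction(cycle[n % 4], factorial(n)) for n in range(count)]

def cos_coefficients(count):
    cycle = [1, 0, -1, 0]
    return [Fraction(cycle[n % 4], factorial(n)) for n in range(count)]

def differentiate(coeffs):
    """Term-by-term derivative: c_n x^n becomes n c_n x^(n-1)."""
    return [n * c for n, c in enumerate(coeffs)][1:]

print(differentiate(sin_coefficients(9)) == cos_coefficients(8))
print(differentiate(cos_coefficients(9)) == [-c for c in sin_coefficients(8)])
```

This only confirms the pattern for finitely many coefficients; the Theorem is what justifies the step for the whole series.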
We’ve just shown that sin and cos are differentiable. By Theorem 19 then, they are also
continuous. We have thus proven the following result that was stated long ago in Ch. 68.5
and now reproduced:
Fact 138. The sine and cosine functions are continuous.
Recall that a power series is simply an “infinite polynomial”. And so, given two power
series, we can also (naïvely) multiply them together as if they were finite polynomials:
Let us now simply and naïvely multiply these two power series together as if they were
finite polynomials. To do so, write:
Happily and as already noted, for H2 Maths, you’ll only ever be asked to find the “first
few terms”.
So here, let us find only c0 , c1 , c2 , and c3 . “Clearly”, we have:
c0 = 1 × 1 = 1,
c1 = 2 × 1 = 2,
c2 = 1 × (−2) + 3 × 1 = 1,
c3 = 2 × (−2) + 4 × 1 = 0.
Thus:
We call this last expression obtained the Cauchy product307 of the two power series A
and B.
Exercise 324. Let $A = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots$, $B = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \dots$, and $C = 1 + x + x^2 + x^3 + \dots$
Write down the Cauchy products AB, AC, and BC, up to and including the $x^3$ term.
(Answer on p. 783.)
A324.
307
For the formal definition of the Cauchy product, see Definition 267 (Appendices).
The following result says that we can simply multiply two analytic functions together in
the “obvious” fashion:
Suppose f and g are functions that can be represented by the power series A and B. Then
f ⋅ g can also be represented by the Cauchy product C = AB.
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots = 0 + 1x + 0x^2 - \frac{1}{6} x^3 + 0x^4 + \frac{1}{120} x^5 - \dots \quad \text{for all } x \in \mathbb{R},$$
$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \dots = 1 + 0x - \frac{1}{2} x^2 + 0x^3 + \frac{1}{24} x^4 - 0x^5 - \dots \quad \text{for all } x \in \mathbb{R}.$$
In Exercise 324, we already found that the Cauchy product of the above two power series
is:
$$\left(x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots\right)\left(1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \dots\right) = x - \frac{2}{3} x^3 + \dots$$
And so, by Theorem 37, $x - \frac{2}{3} x^3 + \dots$ is the Maclaurin series representation of f . That is:
$$f(x) = \sin x \cos x = x - \frac{2}{3} x^3 + \dots \quad \text{for all } x \in \mathbb{R}.$$
Another method for finding the representation of f is by doing what we did in earlier
subchapters. You are asked to do so in Exercise 325.
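The Cauchy product is mechanical enough to sketch in code. Below, each power series is stored as a list of exact coefficients up to the x³ term, and the nth coefficient of the product is the sum of $a_k b_{n-k}$ over k = 0, ..., n. The function and variable names are illustrative.

```python
from fractions import Fraction

def cauchy_product(a, b):
    """Coefficients c_0, ..., c_(N-1) of the product, where N = min(len(a), len(b))."""
    size = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(size)]

# Coefficients of sin and cos up to and including the x^3 term.
sin_c = [Fraction(0), Fraction(1), Fraction(0), Fraction(-1, 6)]
cos_c = [Fraction(1), Fraction(0), Fraction(-1, 2), Fraction(0)]

product = cauchy_product(sin_c, cos_c)
print(product)
```

The result has coefficients 0, 1, 0, and −2/3, in agreement with $x - \frac{2}{3}x^3 + \dots$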
308
This result is formally stated as Theorem 37 (Appendices).
76.12. The Composition of Two Analytic Functions
The following informal result says that the composition of two analytic functions is analytic.
Moreover, the composition can be obtained in the “obvious” fashion — that is, by simply
“plugging” one power series into the other.
Let f and g be functions for which the composite function f ○ g is well-defined. Suppose
f and g can be represented by the power series:
$$\sum_{n=0}^{\infty} a_n x^n \quad \text{and} \quad \sum_{n=0}^{\infty} b_n x^n.$$
309
This result is formally stated as Theorem 38 (Appendices).
Example 996. Define $f \colon (-1, 1) \to \mathbb{R}$ by $f(y) = \frac{1}{1+y}$ and $g \colon \left(-\frac{1}{2}, \frac{1}{2}\right) \to \mathbb{R}$ by $g(x) = 2x$.
Observe that Range g = (−1, 1) ⊆ Domain f so that the composite function $fg \colon \left(-\frac{1}{2}, \frac{1}{2}\right) \to \mathbb{R}$
is well-defined, with:
$$(fg)(x) = f(g(x)) = \frac{1}{1 + 2x}.$$
Assuming f and g are both analytic, by Theorem 38, so too is f g.
The power (and also Maclaurin) series representation of f is:
$$f(y) = \frac{1}{1+y} \overset{1}{=} 1 - y + y^2 - y^3 + \dots \quad \text{for } y \in (-1, 1).$$
Similar to the previous subchapter, we will use two methods to find the power (and also
Maclaurin) series representation of f g.
Method 1. By Theorem 38, simply plug y = g (x) into $\overset{1}{=}$:
$$(fg)(x) = f(g(x)) = \frac{1}{1 + g(x)} = \frac{1}{1 + 2x} = 1 - 2x + (2x)^2 - (2x)^3 + \dots = 1 - 2x + 4x^2 - 8x^3 + \dots$$
for $x \in \text{Domain } g = \left(-\frac{1}{2}, \frac{1}{2}\right)$.
$$(fg)'(x) = \frac{-2}{(1 + 2x)^2},$$
$$(fg)''(x) = \frac{8}{(1 + 2x)^3},$$
$$(fg)'''(x) = \frac{-48}{(1 + 2x)^4}.$$
Evaluate each of $fg$, $(fg)'$, $(fg)''$, and $(fg)'''$ at 0:
$$(fg)(0) = \frac{1}{1 + 2 \cdot 0} = 1,$$
$$(fg)'(0) = \frac{-2}{(1 + 2 \cdot 0)^2} = -2,$$
$$(fg)''(0) = \frac{8}{(1 + 2 \cdot 0)^3} = 8,$$
$$(fg)'''(0) = \frac{-48}{(1 + 2 \cdot 0)^4} = -48.$$
Observe that if we define h ∶ R → R by h (x) = x2 , then i = exp ○h. That is, i may be
written as the composition of the exponential function and h.
Assuming exp and h are both analytic, by Theorem 38, so too is their composition
i = exp ○h.
The power (and also Maclaurin) series expansion of exp is:
$$\exp y \overset{1}{=} 1 + y + \frac{y^2}{2!} + \frac{y^3}{3!} + \dots \quad \text{for } y \in \mathbb{R}.$$
2! 3!
As before, we can use two methods to find the power (and also Maclaurin) series expansion
of i.
Method 1 (Theorem 38). Simply plug y = h (x) into $\overset{1}{=}$:
$$i(x) = (\exp \circ h)(x) = \exp(h(x)) = e^{x^2} = 1 + x^2 + \frac{(x^2)^2}{2!} + \frac{(x^2)^3}{3!} + \dots = 1 + x^2 + \frac{x^4}{2!} + \frac{x^6}{3!} + \dots$$
for x ∈ R.
$$i(0) = e^{0^2} = 1,$$
$$i'(0) = 2 \cdot 0 \cdot i(0) = 0,$$
$$i''(0) = 2i(0) + 2 \cdot 0 \cdot i'(0) = 2 + 0 = 2,$$
$$i'''(0) = 4i'(0) + 2 \cdot 0 \cdot i''(0) = 4 \cdot 0 + 0 = 0,$$
$$i^{(4)}(0) = 6i''(0) + 2 \cdot 0 \cdot i^{(3)}(0) = 6 \cdot 2 + 0 = 12.$$
$$i(x) = e^{x^2} = \frac{1}{0!} + \frac{0}{1!} x + \frac{2}{2!} x^2 + \frac{0}{3!} x^3 + \frac{12}{4!} x^4 + \dots = 1 + x^2 + \frac{x^4}{2!} + \dots \quad \text{for } x \in \mathbb{R}.$$
We can even “plug” one Maclaurin series into another. To illustrate, here is a conceptually-
simple (if tedious) example:
Assuming f and g are both analytic, by Theorem 38, so too is their composition h.
The power (and also Maclaurin) series expansion of f is:
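$$f(y) = \frac{1}{1+y} \overset{1}{=} 1 - y + y^2 - y^3 + \dots \quad \text{for all } y \in (-1, 1).$$
$$g(x) = \frac{1}{2} \sin x \overset{2}{=} \frac{1}{2}\left(x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots\right) \quad \text{for all } x \in \mathbb{R}.$$
$$h(x) = \frac{1}{1 + \frac{1}{2}\sin x} = 1 - \left[\frac{1}{2}\left(x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots\right)\right] + \left[\frac{1}{2}\left(x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots\right)\right]^2 - \left[\frac{1}{2}\left(x - \frac{x^3}{3!} + \frac{x^5}{5!} - \dots\right)\right]^3 + \dots$$
for x ∈ R.
The RHS expression looks like a complete nightmare. But if we’re merely asked to write
out the Maclaurin series of h up to and including the $x^3$ term, then things aren’t too
bad. Examine the RHS expression one term at a time and simply discard anything that’s
above degree 3. If we do so, we get:
$$h(x) = \frac{1}{1 + \frac{1}{2}\sin x} = 1 - \left(\frac{x}{2} - \frac{x^3}{12}\right) + \left(\frac{x^2}{4}\right) - \left(\frac{x^3}{8}\right) + \dots = 1 - \frac{x}{2} + \frac{x^2}{4} - \frac{x^3}{24} + \dots \quad \text{for } x \in \mathbb{R}.$$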
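The “discard anything above degree 3” bookkeeping is tedious by hand but simple to automate. The sketch below plugs one truncated power series into another using exact fractions. It relies on the inner series having zero constant term (true here, since sin 0 = 0), so that its kth power contains no terms below degree k. The helper names are my own.

```python
from fractions import Fraction

DEGREE = 3

def multiply(a, b):
    """Product of two truncated power series, keeping terms up to x^DEGREE."""
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(DEGREE + 1)]

def compose(outer, inner):
    """Plug `inner` (zero constant term assumed) into `outer`, truncated at x^DEGREE."""
    result = [Fraction(0)] * (DEGREE + 1)
    power = [Fraction(1)] + [Fraction(0)] * DEGREE   # inner ** 0
    for coeff in outer:
        result = [r + coeff * p for r, p in zip(result, power)]
        power = multiply(power, inner)
    return result

f_coeffs = [Fraction((-1) ** n) for n in range(DEGREE + 1)]               # 1 - y + y^2 - y^3
g_coeffs = [Fraction(0), Fraction(1, 2), Fraction(0), Fraction(-1, 12)]   # (1/2) sin x

h_coeffs = compose(f_coeffs, g_coeffs)
print(h_coeffs)
```

The output coefficients are 1, −1/2, 1/4, and −1/24, matching the expansion of h obtained by hand.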
$$h'(x) = \frac{-1}{\left(1 + \frac{1}{2}\sin x\right)^2}\left(\frac{1}{2}\cos x\right) = -\frac{1}{2}[h(x)]^2 \cos x,$$
$$h''(x) = -\frac{1}{2}\left\{2h(x)h'(x)\cos x - [h(x)]^2 \sin x\right\} = h(x)\left[\frac{1}{2}h(x)\sin x - h'(x)\cos x\right],$$
$$h'''(x) = h'(x)\left[\frac{1}{2}h(x)\sin x - h'(x)\cos x\right] + h(x)\left[\frac{1}{2}h'(x)\sin x + \frac{1}{2}h(x)\cos x - h''(x)\cos x + h'(x)\sin x\right].$$
$$h(0) = \frac{1}{1 + \frac{1}{2}\sin 0} = 1,$$
Exercise 328. The function f is defined by f (x) = sin [ln (1 + x)]. Using both methods
you’ve learnt (i.e. Theorems 38 and 23), write down the Maclaurin series representation
of f , up to and including the x3 term. (As always, don’t forget to state the range of
values for which the Maclaurin series converges.) (Answer on p. 1540.)
Example 999. In a typical A-Level exam question, you might be asked to find the
Maclaurin series expansion of the secant function sec up to and including the x4 term.
To do so, first write down the first four derivatives of sec:
Observe that sec 0 = 1 and tan 0 = 0. So, we can easily evaluate each of $\sec$, $\sec'$, $\sec''$,
$\sec'''$, and $\sec^{(4)}$ at 0:
sec 0 = 1,
sec′ 0 = 0,
sec′′ 0 = 0 + 1 = 1,
sec′′′ 0 = 0 + 0 + 0 = 0,
sec(4) 0 = 0 + 0 + 2 + 0 + 0 + 3 = 5,
Thus, the Maclaurin series expansion of sec, up to and including the x4 term, is:
$$\sec x \overset{1}{=} \frac{1}{0!} + \frac{0}{1!} x + \frac{1}{2!} x^2 + \frac{0}{3!} x^3 + \frac{5}{4!} x^4 + \dots = 1 + \frac{1}{2} x^2 + \frac{5}{24} x^4 + \dots$$
Remark 107. The method of repeated differentiation is nice, but does have one important
drawback: it does not tell us about the range of values on which the computed
Maclaurin series converges.
For example, it turns out that the Maclaurin series of sec converges only on $\left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$.
That is, $\overset{1}{=}$ holds only for $x \in \left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$ and not for any other x. This is an important
piece of information that we are unable to find using only the method of repeated
differentiation.
Exercise 329. Find the Maclaurin series expansion of the tangent function tan, up to
and including the x5 term. The small-angle approximation for tan is given by the
2nd Maclaurin polynomial — write it down. (Answer on p. 790.)
A329.
$$x[f(x)]^2 + e \overset{0}{=} e^{f(x)}.$$
Using repeated implicit differentiation, we can find the first few terms of the Maclaurin
series of f .
Differentiating once with respect to x, we have:
2f (x) f ′ (x) + 2f (x) f ′ (x) + 2x {[f ′ (x)] + f (x) f ′′ (x)} = ef (x) [f ′ (x)] + ef (x) f ′′ (x).
2 2 2
Plug x = 0 into =:
0
4 1
1 + 0 = 1 = e1 f ′ (0) f ′ (0) = .
1
or
e
1 1 4 2 1 1 2 1 ′′ 1 3
2 ⋅ 1 ⋅ + 2 ⋅ 1 ⋅ + 0 = = e ⋅ ( ) + e f (0) = + ef ′′ (0) or f ′′ (0) = .
e e e e e e2
Thus, the Maclaurin series expansion of f , up to and including the x2 term, is:
f (0) f ′ (0) f ′′ (0) 2 3
f (x) = + x+ x + ⋅ ⋅ ⋅ = 1 + x + 2 x2 + . . .
0! 1! 2! 2e
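The coefficients found above can be checked numerically. The sketch below solves $x y^2 + e = e^y$ for $y = f(x)$ with Newton's method, starting from $y = 1$, and compares the result with $1 + \frac{1}{e}x + \frac{3}{2e^2}x^2$. The names `f_implicit` and `f_series` are hypothetical, and Newton's method is our own choice of solver, not something the text prescribes.

```python
import math


def f_implicit(x: float, y_start: float = 1.0, steps: int = 50) -> float:
    """Solve x*y^2 + e = exp(y) for y near 1 using Newton's method."""
    y = y_start
    for _ in range(steps):
        residual = x * y * y + math.e - math.exp(y)
        slope = 2 * x * y - math.exp(y)  # derivative of the residual w.r.t. y
        y -= residual / slope
    return y


def f_series(x: float) -> float:
    """Maclaurin polynomial of f, up to and including the x^2 term."""
    return 1 + x / math.e + 3 * x**2 / (2 * math.e**2)


for x in (0.0, 0.01, 0.1):
    print(f"x = {x}: solved f(x) = {f_implicit(x):.8f}, series = {f_series(x):.8f}")
```

The two columns agree closely near 0 and drift apart as $x$ grows, consistent with the neglected $x^3$ and higher terms.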
Remark 108. Again, the method of repeated implicit differentiation is nice but fails
to tell us about the range of values on which the computed Maclaurin series converges.
In the above example, this method allowed us to find the Maclaurin series expansion of
$f$, but did not tell us the range of values on which this series converges.
Comparing the new 9758 syllabus (first examined 2017) with the old 9740 syllabus (last
examined 2017), we have mostly subtractions and rarely any additions. One of the rare
additions is this subchapter’s topic. My suspicion is therefore that it will soon show up.
(Note that it didn’t appear on the 2017 9758 A-Level exams.)
Although very tedious, there is conceptually nothing difficult about repeated implicit
differentiation — it’s just a whole bunch of differentiation and algebra. So, just make
sure you go slowly and carefully. Ensure that everything is correct at each step of the
way.
Using repeated implicit differentiation, we can find the first few terms of the Maclaurin
series of g.
Differentiating once with respect to x, we have:
Observe that the expression $[g(0)]^2 + g(0) + 1$ is a quadratic polynomial in $g(0)$ and has
Below we will go through Euler’s remarkable “solution” of the Basel Problem. But first, a
very brief history:
The Basel Problem was first posed in 1650 by Pietro Mengoli (1626–86),312 but only became
more widely known in 1689 when Jacob Bernoulli (1655–1705) published one of his Treatises
on Infinite Series. The Basel Problem is named after the city of publication (and also
Bernoulli’s residence).313 Bernoulli wrote:
... when the numbers are pure squares, as in the series $\frac{1}{1} + \frac{1}{4} + \frac{1}{9} + \frac{1}{16} + \frac{1}{25}$
&c., it is more difficult than one would have expected, which is noteworthy.
If someone should succeed in finding what till now withstood our efforts and
communicate it to us, we would be much obliged to them.314, 315
311
We already briefly discussed this in Example 478.
312
Mengoli wrote:
Ab huius fractionum dispositionis contemplatione faliciter expeditus, ad aliam pro-
grediebar dispositionem, in qua singula unitates numeris quadratis denominantur.
Hac speculatio fructus quidem laboris rependit, nondum tamen effecta est solvendo,
sed ingenij ditioris postulat adminiculum, ut pracisam dispositionis, quam mihi-
metipsi proposui, summam valeat reportare.
An English translation of the above paragraph is credited to Emanuele Delucchi and found in Loya
(2018):
Having concluded with satisfaction my consideration of those arrangements of frac-
tions, I shall move on to those other arrangements that have the unit as numerator,
and square numbers as denominators. The work devoted to this consideration has
bore some fruit — the question itself still awaiting solution — but it [the work] re-
quires the support of a richer mind, in order to lead to the evaluation of the precise
sum of the arrangement [of fractions] that I have set myself as a task.
313
Basel was also where the illustrious Bernoulli family resided. The Bernoullis left their mark everywhere.
Just to give a few examples, in physics, we have Bernoulli’s Principle, named after Daniel
Bernoulli (1700–82). In economics, Daniel is usually credited with coming up with the concept of
expected utility. L’Hôpital’s Rule should really be Johann Bernoulli’s (1667–1748) Rule. As we’ll learn
later, in probability, we have Bernoulli random variables, also named after Jacob.
315
This English translation was taken from Lagarias (2013, p. 13), who in turn credits Jordan Bell. The
original Latin passage can be found in Bernoulli’s posthumously published Ars Conjectandi (1713, p.
254):
quando sunt puri Quadrati, ut in serie $\frac{1}{1} + \frac{1}{4} + \frac{1}{9} + \frac{1}{16} + \frac{1}{25}$ &c. difficilior est,
quam quis expectaverit, summae pervestigatio, quam tamen finitam esse, ex altera,
qua manifesto minor est, colligimus: Si quis inveniat nobisque communicet, quod
The Basel Problem was first solved by Leonhard Euler (1707–83) in 1734, albeit somewhat
heuristically.316 By heuristically, we mean that Euler’s solution would not meet modern
standards of rigour. Nonetheless, we will now go through his solution, because it is delight-
fully simple and illustrates one use of the Maclaurin series.
Now consider for example the quadratic polynomial $1 - 4x + 3x^2$. It has constant term 1
and roots $r_1 = 1/3$ and $r_2 = 1$. And so, we can write:
$$1 - 4x + 3x^2 = (1 - 3x)(1 - x) = \left(1 - \frac{x}{1/3}\right)\left(1 - \frac{x}{1}\right) = \left(1 - \frac{x}{r_1}\right)\left(1 - \frac{x}{r_2}\right).$$
It turns out that in general, if $p(x)$ is an $n$th-degree polynomial with constant term 1 and
roots $r_1, r_2, \dots, r_n$, then:
$$p(x) \overset{2}{=} \left(1 - \frac{x}{r_1}\right)\left(1 - \frac{x}{r_2}\right)\dots\left(1 - \frac{x}{r_n}\right).$$
see that $\frac{\sin x}{x}$ may be written as an “infinite polynomial” (or more correctly, a power series)
with constant term 1. And so here, Euler made an audacious leap of logic. He supposed
that $\frac{\sin x}{x}$ could, like any finite polynomial with constant term 1, be written like $\overset{2}{=}$. If so,
then since $\frac{\sin x}{x}$ has roots $\pm\pi, \pm 2\pi, \pm 3\pi, \dots$, we’d have:
$$\frac{\sin x}{x} \overset{3}{=} \left(1 - \frac{x}{\pi}\right)\left(1 - \frac{x}{-\pi}\right)\left(1 - \frac{x}{2\pi}\right)\left(1 - \frac{x}{-2\pi}\right)\left(1 - \frac{x}{3\pi}\right)\left(1 - \frac{x}{-3\pi}\right)\dots$$
industriam nostram elusit hactenus, magnas de nobis gratias feret.
316
Euler formally presented this result in 1735 (in St. Petersburg) and in 1740 published it as “De Summis
Serierum Reciprocarum” (PDF). The latter has been translated into English by Jordan Bell (2005,
PDF).
317
By the Factor Theorem:
Thus, $p(x) = c\left(1 - \frac{x}{r_1}\right)\left(1 - \frac{x}{r_2}\right)\dots\left(1 - \frac{x}{r_n}\right)$ for some constant $c$. Since $p(x)$ has constant term 1, it
must be that $c = 1$ and hence $\overset{2}{=}$.
bit of algebra:
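$$\frac{\sin x}{x} \overset{3}{=} \left(1 - \frac{x}{\pi}\right)\left(1 - \frac{x}{-\pi}\right)\left(1 - \frac{x}{2\pi}\right)\left(1 - \frac{x}{-2\pi}\right)\left(1 - \frac{x}{3\pi}\right)\left(1 - \frac{x}{-3\pi}\right)\dots$$
$$= \left(1 - \frac{x^2}{\pi^2}\right)\left(1 - \frac{x^2}{4\pi^2}\right)\left(1 - \frac{x^2}{9\pi^2}\right)\dots$$
$$\overset{4}{=} 1 + \left(-\frac{1}{\pi^2} - \frac{1}{4\pi^2} - \frac{1}{9\pi^2} - \dots\right)x^2 + \dots$$
(To get the last step, observe that $-\frac{x^2}{\pi^2}\cdot 1\cdot 1\cdot 1\cdots = -\frac{1}{\pi^2}x^2$, $-\frac{x^2}{4\pi^2}\cdot 1\cdot 1\cdot 1\cdots = -\frac{1}{4\pi^2}x^2$, etc.)
And now, compare $\overset{1}{=}$ and $\overset{4}{=}$. In particular, compare the coefficients on $x^2$:
$$-\frac{1}{6} = -\frac{1}{\pi^2} - \frac{1}{4\pi^2} - \frac{1}{9\pi^2} - \dots$$
Rearranging:
$$1 + \frac{1}{4} + \frac{1}{9} + \dots = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \dots = \frac{\pi^2}{6}.$$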
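Both Euler's product and his final answer are easy to test numerically. The sketch below (function names are our own) multiplies the first few thousand factors $1 - \frac{x^2}{k^2\pi^2}$ and compares the result with $\frac{\sin x}{x}$, and then sums the first $N$ reciprocal squares and compares with $\frac{\pi^2}{6}$. This is evidence, not a proof.

```python
import math


def euler_product(x: float, n_factors: int) -> float:
    """Product of (1 - x^2 / (k*pi)^2) for k = 1, ..., n_factors."""
    product = 1.0
    for k in range(1, n_factors + 1):
        product *= 1 - x**2 / (k * math.pi) ** 2
    return product


def basel_partial_sum(n_terms: int) -> float:
    """Sum of 1/k^2 for k = 1, ..., n_terms."""
    return sum(1 / k**2 for k in range(1, n_terms + 1))


x = 1.0
print("sin(x)/x        =", math.sin(x) / x)
print("truncated product =", euler_product(x, 10_000))

print("pi^2 / 6        =", math.pi**2 / 6)
for n in (10, 1_000, 100_000):
    print(f"sum of first {n} terms =", basel_partial_sum(n))
```

The partial sums approach $\frac{\pi^2}{6}$ quite slowly: the gap after $N$ terms is roughly $1/N$.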
Remark 109. The only defect in the above “solution” is that, as discussed above, $\overset{3}{=}$ has
318
The Weierstrass Factorization Theorem. According to Turner (2013), this was first proven in 1876.
76.16. The Riemann Hypothesis (fun, optional)
For a complex number $s$ whose real part is greater than 1 (i.e. $\operatorname{Re} s > 1$), the Riemann zeta
function, denoted $\zeta$, is defined by:
$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}.$$
And so for example:
$$\zeta(2) = \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \dots$$
Observe then that the Basel Problem is simply the problem of finding $\zeta(2)$. By comparing
the coefficients on $x^2$ in $\overset{1}{=}$ and $\overset{4}{=}$ in the previous subchapter, we found that:
$$\zeta(2) = \frac{\pi^2}{6}.$$
It turns out that by similarly comparing the coefficients on $x^4$ in $\overset{1}{=}$ and $\overset{4}{=}$, we can, with

If $s$ is not a negative even integer and $\zeta(s) = 0$, then $\operatorname{Re} s = \frac{1}{2}$.
In September 2018, the renowned 89-year-old British mathematician Michael Atiyah claimed
to have solved the Riemann Hypothesis. There was much initial scepticism.320 Time will
tell if he’s actually correct.
As of late 2018, only one of the seven Millennium Prize Problems — the Poincaré Con-
jecture — has been officially solved. It was solved in 2003 by the Russian mathematician
Grigori Perelman (b. 1966). Perelman was officially awarded the US$1M prize in 2010, but
rejected it, stating:
I’m not interested in money or fame. I don’t want to be on display like
an animal in a zoo. I’m not a hero of mathematics. I’m not even that
successful; that is why I don’t want to have everybody looking at me.321
320
See e.g. this Science Magazine story.
321
See e.g. this 2010 BBC story.
So far in Part V, we’ve been looking at differential calculus.
In the remainder of Part V, we’ll look instead at integral cal-
culus.
Example 1002. Graphed below is the function $f : [0, 9] \to \mathbb{R}$ defined by $f(x) = \sqrt{x} + 1$.
Figure to be
inserted here.
The definite integral of $f$ from 0 to 1 is the number equal to the red area and may be
denoted:
$$\int_0^1 f(x)\,dx \quad\text{or}\quad \int_0^1 f\,dx \quad\text{or}\quad \int_0^1 f.$$
The definite integral of $f$ from 2 to 4 is the number equal to the blue area and may be
denoted:
$$\int_2^4 f(x)\,dx \quad\text{or}\quad \int_2^4 f\,dx \quad\text{or}\quad \int_2^4 f.$$
How can we compute the red or blue areas? Right now, we have no idea. But in the next
subchapter, we’ll revisit this question.
Figure to be
inserted here.
The definite integral of $g$ from 0 to 1 is the number equal to the red area and may be
denoted:
$$\int_0^1 g(x)\,dx \quad\text{or}\quad \int_0^1 g\,dx \quad\text{or}\quad \int_0^1 g.$$
The definite integral of $g$ from 2 to 4 is the number equal to the blue area and may be
denoted:
$$\int_2^4 g(x)\,dx \quad\text{or}\quad \int_2^4 g\,dx \quad\text{or}\quad \int_2^4 g.$$
Thanks to primary-school geometry, in this example, we know how to compute the red
and blue areas:
$$\int_0^1 g(x)\,dx = \int_0^1 g\,dx = \int_0^1 g = \frac{\text{Base} \times \text{Height}}{2} = \frac{1 \times 1}{2} = \frac{1}{2}.$$
$$\int_2^4 g(x)\,dx = \int_2^4 g\,dx = \int_2^4 g = \frac{\text{Base} \times \text{Height}}{2} = \frac{2 \times 2}{2} = 2.$$
You are probably most familiar with the following piece of notation:
$$\int_a^b f(x)\,dx.$$
We call:
• The symbol ∫ the integral sign (it is simply an elongated S);
• The numbers a and b the lower and upper limits of integration;
• The function f to be integrated the integrand; and
• The symbol dx the differential of the variable x — it tells us that the independent
variable is denoted x.
Notice though that, as usual, x is merely a dummy variable that can be replaced with
any other symbol. When describing the definite integral of f from a to b, what matters are
the function f and the lower and upper limits a and b. The symbol we use to denote the
independent variable doesn’t really matter — it is customarily x but could be any other
symbol like y, t, u, or even ☺.
And so, the “(x)” and even “dx” are somewhat superfluous. We could equally well denote
the definite integral $\int_a^b f(x)\,dx$ as:
$$\int_a^b f\,dx \quad\text{or}\quad \int_a^b f.$$
Nonetheless, as we’ll see later when we’re dealing with more than one variable, the “(x)”
and “dx” can help us avoid confusion.
As we saw earlier (Ch. 69.4), in differential calculus, there are (at least) three commonly-
used types of notation:322
$$\int_a^b f(x)\,dx \quad\text{or}\quad \int_a^b f\,dx \quad\text{or}\quad \int_a^b f.$$
Fun Fact
We will not give a formal definition of the definite integral in the main text of this
textbook.323 Nonetheless, just to provide a little clarity and precision, here’s an informal
definition anyway:
Let a, b ∈ R with a < b and f ∶ [a, b] → R be a continuous function. Then the definite
integral of f from a to b is denoted:
$$\int_a^b f(x)\,dx \quad\text{or}\quad \int_a^b f\,dx \quad\text{or}\quad \int_a^b f;$$
and is the area bounded by f , the x-axis, and the vertical lines x = a and x = b.
As already mentioned, we call the symbol ∫ the integral sign; the numbers a and b the
lower and upper limits of integration; the function f to be integrated the integrand; and
the symbol dx the differential of the variable x.
322
In n. 285, we also mentioned a fourth type of notation due to Euler.
323
But if you’re interested, see Ch. 121.13 (Appendices).
The above definition is considered informal because we haven’t formally defined what the
“area” bounded by a curve and three straight lines is, or how we can compute it. In the
next subchapter, we will make a sketch of how this might be done.
By the way, in the above definition, we define the definite integral of f from a to b only in
the case where a < b. We will find it convenient to also define the definite integral in those
cases where (a) the two limits are equal; and (b) the upper limit is smaller than the lower
limit.
$$\int_c^c f = 0.$$
(b) The definite integral of $f$ from $b$ to $a$ is denoted $\int_b^a f$ and is defined to be the additive
inverse of $\int_a^b f$:
$$\int_b^a f = -\int_a^b f.$$
And so, there is, a priori,326 no relationship whatsoever between differentiation and integ-
ration. There is, a priori, no reason to believe that the gradient of a curve has anything to
do with the area under that same curve.
That there is a relationship is established only with the two Fundamental Theorems of
Calculus (FTCs), which tell us that, very surprisingly:
This, it must be stressed, is a very surprising result. There is no reason to have expected
that the gradient of a curve is somehow related to the area under the curve — much less
that these two operations are inverses of each other.
We will now work our way towards the first Fundamental Theorem of Calculus (FTC1).
Don’t worry, we’ll omit most technical details. The goal here is merely to provide you
with some intuition and hence a better understanding of why the FTCs work and why
differentiation and integration turn out to be inverses.
324
Indeed, on your 4047 A Maths syllabus (and again on your H2 Maths syllabus), the very first mention
of integration states, “integration as the reverse of differentiation”.
325
According to Abbott (2015, pp. 215–6):
Historically, the concept of integration was defined as the inverse process of differ-
entiation. ... A very interesting shift in emphasis occurred around 1850 in the work
of Cauchy, and soon after in the work of Bernhard Riemann. The idea was to com-
pletely divorce integration from the derivative and instead use the notion of “area
under the curve” as a starting point for building a rigorous definition of the integral.
The latter, modern approach is the one this textbook shall follow, not least for pedagogical reasons.
See also this MathEducators.SE discussion.
326
A priori is just a fancy Latin phrase for beforehand.
77.2. A Sketch of How We Can Find the Area under a Curve
Example 1004. Define327 $f : [0, 9] \to \mathbb{R}$ by $f(x) = \sqrt{x} + 1$.
Figure to be
inserted here.
Consider the definite integral of f from 0 to 4. This quantity corresponds to the green
area and may be denoted:
$$\int_0^4 f(x)\,dx \quad\text{or}\quad \int_0^4 f\,dx \quad\text{or}\quad \int_0^4 f.$$
Consider the rectangle with base 4 and height $f(0) = 1$. If we denote its area by $L_1$, then we have:
$$L_1 = 4 \times f(0) = 4.$$
Next, consider the rectangle with base 4 and height $f(4) = \sqrt{4} + 1 = 3$. If we denote its
area by $U_1$, then we have:
$$U_1 = 4 \times f(4) = 12.$$
Evidently, the green area is somewhere between these two quantities. That is:
$$L_1 \le \int_0^4 f \le U_1 \quad\text{or}\quad 4 \le \int_0^4 f \le 12.$$
In other words, $L_1 = 4$ and $U_1 = 12$ serve as lower and upper bounds for what the green
area can be.
Can we do better than this? Sure. One obvious possibility is to use more rectangles.
Construct two rectangles, each with base 2, but one with height $f(0) = 1$ and the other
with height $f(2) = \sqrt{2} + 1$. Let us call the total area of these two rectangles the lower
sum and denote it by $L_2$. Then we have:
$$L_2 = 2 \times [f(0) + f(2)] = 4 + 2\sqrt{2} \approx 6.828.$$
Next, construct two rectangles, each with base 2, but one with height $f(2) = \sqrt{2} + 1$ and
the other with height $f(4) = 3$. Let us call the total area of these two rectangles the
upper sum and denote it by $U_2$. Then we have:
$$U_2 = 2 \times [f(2) + f(4)] = 8 + 2\sqrt{2} \approx 10.828.$$
Exercise 331. Continuing with the above example, let each of $L_4$ and $U_4$ be the total
area of four rectangles, where $L_4$ and $U_4$ serve as lower and upper bounds of $\int_0^4 f$. Find
$L_4$ and $U_4$. Are they improvements over $L_2$ and $U_2$?
Repeat all of the above, but now for $L_8$ and $U_8$. (Answer on p. 807.)
A331. xxx
Figure to be
inserted here.
Sketched in the above example and exercise is the main idea underlying integration:
More precisely, integration (or the procedure of computing the area under a curve) uses
these four steps:
1. We first divide the area under the curve into n thin rectangles, with each rectangle lying
entirely below the curve. We call the sum of these rectangles’ areas the lower sum Ln .
2. We again divide the area under the curve into n thin rectangles, but this time each
rectangle lies entirely above the curve. We call the sum of these rectangles’ areas the
upper sum Un .
3. Observe that for every $n$, we have:
$$L_n \le \text{Area} = \int_0^4 f \le U_n.$$
4. By letting n → ∞, we have:
Of course, to properly, rigorously, and precisely define integration (or the procedure of
computing the area under a curve), there are some technical details that need to be filled
in. For example, how exactly are the lower and upper sums Ln and Un defined?
But for H2 Maths, we needn’t worry about these technical details328 and the above ex-
planation will more than suffice. For a recent A-Level exam question that requests an
explanation of integration, see Exercise 575 (N2015-I-3).
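The four steps above can be sketched in a few lines of code. The sketch assumes the function $f(x) = \sqrt{x} + 1$ on $[0, 4]$ from the example, and it relies on $f$ being increasing there, so that the left endpoint of each strip gives the rectangle below the curve and the right endpoint gives the rectangle above it. The helper name `lower_upper_sums` is our own.

```python
import math


def f(x: float) -> float:
    return math.sqrt(x) + 1


def lower_upper_sums(func, a: float, b: float, n: int):
    """Return (L_n, U_n) for a function that is increasing on [a, b].

    Each of the n rectangles has base (b - a) / n. For an increasing
    function, the left endpoint gives the lower rectangle's height and
    the right endpoint gives the upper rectangle's height.
    """
    width = (b - a) / n
    lower = sum(func(a + i * width) for i in range(n)) * width
    upper = sum(func(a + (i + 1) * width) for i in range(n)) * width
    return lower, upper


for n in (1, 2, 4, 8, 1000):
    lower, upper = lower_upper_sums(f, 0.0, 4.0, n)
    print(f"n = {n:4d}: L_n = {lower:.4f}, U_n = {upper:.4f}, gap = {upper - lower:.4f}")
```

As $n$ grows, the gap $U_n - L_n$ shrinks towards 0, so the two sums squeeze the area between them.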
necessary.
328
But see Ch. 121.13 in the Appendices if you’re interested.
77.3. Some Basic Rules of Integration
Theorem 25. Let a, b, c, d, e ∈ R with a < c < b. Suppose f, g ∶ [a, b] → R are continuous
functions. Then:
a a a
a a c
a a
a a
Proof. For the formal proofs of (a)–(?), see p. 1356 in the Appendices. Here we give some
informal proofs:
(a) Consider the area under the graph obtained by taking the sum (or difference) of f and
g. This area must be equal to the sum (or difference) of the areas under the graphs of f
and g.
Figure to be
inserted here.
(b) The area under the graph of $f$ from $a$ to $c$ is $\int_a^c f$. The area from $c$ to $b$ is $\int_c^b f$. And
so “obviously”, the area from $a$ to $b$, or $\int_a^b f$, is the sum of those first two quantities.
Figure to be
inserted here.
(c) Stretch $f$ outwards from the x-axis by a factor $d$. The area under the graph thus obtained
must be $d$ times the area under the graph of $f$.
(d) “Clearly”, $\int_a^b c$ is simply the area of a rectangle with base $b - a$ and height $c$. So
$\int_a^b c = (b - a)c$.
Figure to be
inserted here.
(e) If f is everywhere on or above g, then the area under f must be no less than that under
g.
Figure to be
inserted here.
(f) The numbers $c$ and $d$ serve as lower and upper bounds for $f$ on the relevant interval
$(a, b)$. And so “obviously”, $\int_a^b f$, the area under the graph of $f$ from $a$ to $b$, is bounded
from below and above by the rectangles with base $b - a$ and heights $c$ and $d$.
Figure to be
inserted here.
It is actually not difficult to formally prove (f) and you are asked to do so in Exercise
XXX.
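The informal arguments (a) to (d) can also be spot-checked numerically. The sketch below approximates each definite integral with many thin midpoint rectangles (our own choice of approximation) and confirms that both sides of each rule agree up to rounding. The functions `f` and `g` and the numbers used are arbitrary examples, not taken from the text.

```python
def midpoint_integral(func, a: float, b: float, n: int = 20_000) -> float:
    """Approximate the definite integral of func from a to b with n midpoint rectangles."""
    width = (b - a) / n
    return sum(func(a + (i + 0.5) * width) for i in range(n)) * width


def f(x: float) -> float:
    return x**2


def g(x: float) -> float:
    return 3 * x + 1


a, c, b = 0.0, 0.5, 2.0  # a < c < b
d = 5.0                  # constant factor in rule (c)
k = 7.0                  # constant integrand in rule (d)

# (a) integral of f + g versus sum of the two integrals
sum_gap = midpoint_integral(lambda x: f(x) + g(x), a, b) - (
    midpoint_integral(f, a, b) + midpoint_integral(g, a, b))
# (b) integral from a to b versus the two pieces a..c and c..b
split_gap = midpoint_integral(f, a, b) - (
    midpoint_integral(f, a, c) + midpoint_integral(f, c, b))
# (c) integral of d*f versus d times the integral of f
factor_gap = midpoint_integral(lambda x: d * f(x), a, b) - d * midpoint_integral(f, a, b)
# (d) integral of a constant versus (b - a) times that constant
constant_gap = midpoint_integral(lambda x: k, a, b) - (b - a) * k

print(sum_gap, split_gap, factor_gap, constant_gap)
```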
Exercise 333. Let $f : [a, b] \to \mathbb{R}$ be a continuous function. Suppose $f \ge 0$ on $[a, b]$. Prove
that $\int_a^b f \ge 0$. (Hint: Define $F : [a, b] \to \mathbb{R}$ by $F(x) = 0$.)
(Answer on p. 810.)
a a
(b − a) e.
a a a a
A333. Following the hint, we define $F : [a, b] \to \mathbb{R}$ by $F(x) = 0$. By the Constant Rule,
$\int_a^b F = 0$. Since $f \ge F = 0$ on $[a, b]$, by Comparison Rule I, $\int_a^b f \ge \int_a^b F = 0$.
However, we have not actually explained how we can solve the following problem:
Instead of tackling the above problem directly, we will now, somewhat strangely, take an
indirect approach. We will instead try to answer the following question:
Now, at this point, we have no idea how to find a definite integral. And so, the above
question seems akin to asking someone who has no idea where Singapore is to locate the
Istana.
Nonetheless and somewhat surprisingly, it turns out that the seemingly-indirect question
- is easier to answer than △ and will enable us to find definite integrals.
Indeed, the answer to - is precisely the First Fundamental Theorem of Calculus (FTC1)!
It is this:
We will now try to work towards understanding why ,, which is an informal statement of
the FTC1, might be true.
We begin by noting that - and , are a little imprecise when they speak of “the derivative
of a definite integral”. We defined a definite integral to be a number. But we know that
only functions can have derivatives and so it makes no sense to speak of the derivative of
a number. So let us now define a function based on definite integrals:
Definition 182. Given the continuous function $f : [a, c] \to \mathbb{R}$ and any $b \in [a, c]$, we define
a new function $g : [b, c] \to \mathbb{R}$, called the definite integral of $f$ from $b$, by:
$$g(x) = \int_b^x f(t)\,dt.$$
In words, the function $g$ takes each $x \in [b, c]$ and maps it to the number that is equal to
the area under $f$, bounded by the x-axis and the vertical lines at $b$ and $x$.
−2
Figure to be
inserted here.
Similarly, the definite integral of f from 0 is the function h ∶ [0, 5] → R defined by:
Figure to be
inserted here.
Remark 110. As usual, take care to note that t in the above Definition is simply a dummy
variable that can be replaced by any other symbol.
Also, the equation in the above Definition could also have been written more simply as:
$$g(x) = \int_a^x f\,dt, \quad\text{or}\quad g(x) = \int_a^x f.$$
Figure to be
inserted here.
Let $g$ be the definite integral of $f$ from 0. That is, define $g : [0, 9] \to \mathbb{R}$ by:
$$g(x) = \int_0^x f.$$
We now ask:
To answer this question, let us return to one of the possible intuitive interpretations of
the derivative:
Observe that the thin [COLOR XXX?] area is g (x) − g (4) and is bounded by the above
two areas. That is:
Figure to be
inserted here.
$$g(x) = \int_a^x f.$$
Then $g' = f$.
Remark 111. The FTC1 establishes that differentiation and integration are inverse
operations. Again, we must stress, emphasise, and repeat that this is a genuinely
surprising result and should not be taken for granted. In particular, we should not
assume that integration is by definition the inverse of differentiation. Instead, we should
be acutely aware that this is a surprising finding established only by the FTC1.
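To see the FTC1 at work numerically, we can build the area function $g(x) = \int_0^x f$ by brute force and then estimate its gradient. The sketch below assumes $f(x) = \sqrt{x} + 1$ (the function used in the earlier examples), approximates the area with midpoint rectangles, and estimates $g'(4)$ with a symmetric difference quotient. All names and step sizes are our own choices.

```python
import math


def f(x: float) -> float:
    return math.sqrt(x) + 1


def area_from_zero(x: float, n: int = 20_000) -> float:
    """Approximate g(x), the area under f between 0 and x, with n midpoint rectangles."""
    width = x / n
    return sum(f((i + 0.5) * width) for i in range(n)) * width


def derivative(func, x: float, h: float = 1e-3) -> float:
    """Symmetric difference quotient, an estimate of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)


print("estimated g'(4) =", derivative(area_from_zero, 4.0))
print("f(4)            =", f(4.0))
```

The two printed numbers should agree to several decimal places: the rate at which the area grows at $x = 4$ is the height of the curve there.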
Definition 182.
329
This step is formally justified by the Order Limit Theorem (Appendices).
Example 1007. A car is moving. Below we graph its velocity $v$ (m s$^{-1}$) as a function
of time $t$ (s).
Figure to be
inserted here.
Recall that the distance d (m) travelled by the car is the area under the graph.
• For example, after 5 s, the distance travelled by the car is $\int_0^5 v\,dt$.
• And after 8 s, the distance travelled by the car is $\int_0^8 v\,dt$.
• In general, after $x$ s, the distance travelled by the car is $\int_0^x v\,dt$.
That is: d′ = v.
The derivative of the area under a function’s graph is the function itself.
So far, we haven’t actually computed the area under any curve. We shall do so in Ch. 79,
where the Second Fundamental Theorem of Calculus (FTC2) is introduced.
F = ∫ f.
F =∫ f ⇐⇒ F′ = f.
We understood that $\overset{1}{=}$ was simply shorthand for the following, more long-winded
statement:
$$\int 2x\,dx \overset{2}{=} x^2,$$
with the understanding that $\overset{2}{=}$ is simply shorthand for the following, more long-winded
statement:
Remark 112. In this textbook, we will treat the terms antiderivative, primitive, and
indefinite integral as synonyms.330 We will also treat the terms antidifferentiation
and indefinite integration as synonyms.
330
Some writers choose to maintain a very slight and subtle distinction between these three terms. See for
Example 1009. XXX
Remark 113. Again, let us stress, emphasise, and repeat that a priori, there is no
relationship whatsoever between the definite integral $\int_a^b f$ and the antiderivative or
indefinite integral $\int f$.
The symbol $\int_a^b f$ denotes the area under the graph of $f$, between $a$ and $b$. In contrast,
the symbol $\int f$ denotes any function whose derivative happens to be the function $f$.
It is only through the two FTCs that we establish that, surprisingly enough:
• There is a relationship between integration (or definite integration) and
antidifferentiation (or indefinite integration),
• And moreover, the two turn out to be the “same thing”.
One reason why students are often confused into thinking that there is some obvious,
definitional relationship between the definite and indefinite integrals is that they have
almost identical names and notation. And so, to reduce this source of confusion, I will
often prefer to use the terms antiderivative and antidifferentiation instead of indef-
inite integral and indefinite integration. This helps to constantly remind students
that antidifferentiation (or indefinite integration) is, by definition, the inverse of
differentiation and, a priori, has nothing to do with integration.
example Hagen von Eitzen’s answer at . My view is that even if such a distinction were useful, it is
so subtle as to be more confusing than clarifying (especially at this introductory level). And so, this
textbook shall simply treat these three terms as synonyms.
78.1. The Antiderivative Is Not Unique ...
F′ = f, G′ = f , H′ = f .
F = ∫ f, G = ∫ f, H = ∫ f.
The above example shows that the antiderivative is not unique. In general:
Proof. For all x ∈ D, we have F ′ (x) = f (x) and hence also G′ (x) = F ′ (x)+C ′ = F ′ (x)+0 =
f (x). We have just shown that the derivative of G is f and thus that G is also an
antiderivative of f .
Fact 151. Let D be an interval. Suppose the function f ∶ D → R has the antiderivative F .
If G ∶ D → R is also an antiderivative of f , then G may be defined by G (x) = F (x) + C,
where C is some real number.
F′ = f, or equivalently F = ∫ f.
On p. 682, we carefully and pedantically explained how, precisely, we should use Leibniz’s
d
differentiation notation, in particular the symbol .
dx
We now do likewise for Leibniz’s antidifferentiation notation, in particular the symbol ∫ .
Example 1016. (Constant Rule) $\int 5\,dx = 5x + C$, where as usual, $C$ denotes the constant of integration (COI).
Example 1017. (Power Rule) $\int x^{17}\,dx = \frac{1}{18}x^{18} + C$.
Example 1018. (Power Rule) $\int x^{-4}\,dx = -\frac{1}{3}x^{-3} + C = -\frac{1}{3}\cdot\frac{1}{x^3} + C$.
Example 1019. (Reciprocal Rule) $\int \frac{1}{x}\,dx = \ln|x| + C$.
Example 1024. (Difference Rule) $\int \left(\frac{1}{x} - \cos x\right)dx = \ln|x| - \sin x + C$.
Example 1025. (Constant Factor Rule) $\int 5x^{-4}\,dx = -\frac{5}{3}\cdot\frac{1}{x^3} + C$.
Example 1026. (LPC Rule) $\int \cos(2x + 3)\,dx = \frac{1}{2}\sin(2x + 3) + C$.
Example 1027. (LPC Rule) $\int \exp(2x + 3)\,dx = \frac{1}{2}\exp(2x + 3) + C$.
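Each claimed antiderivative above can be checked by differentiating it and comparing with the integrand. The sketch below does this numerically at a few sample points, using a symmetric difference quotient as a stand-in for the derivative; the constant of integration is taken to be 0, since it vanishes on differentiation anyway. The helper names and sample points are our own.

```python
import math


def derivative(func, x: float, h: float = 1e-5) -> float:
    """Symmetric difference quotient, an estimate of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)


# Each entry: label -> (integrand, claimed antiderivative with C = 0).
examples = {
    "1016": (lambda x: 5.0, lambda x: 5 * x),
    "1017": (lambda x: x**17, lambda x: x**18 / 18),
    "1018": (lambda x: x**-4, lambda x: -1 / (3 * x**3)),
    "1019": (lambda x: 1 / x, lambda x: math.log(abs(x))),
    "1024": (lambda x: 1 / x - math.cos(x), lambda x: math.log(abs(x)) - math.sin(x)),
    "1025": (lambda x: 5 * x**-4, lambda x: -5 / (3 * x**3)),
    "1026": (lambda x: math.cos(2 * x + 3), lambda x: math.sin(2 * x + 3) / 2),
    "1027": (lambda x: math.exp(2 * x + 3), lambda x: math.exp(2 * x + 3) / 2),
}


def max_mismatch(integrand, antiderivative, points=(-1.5, -0.5, 0.5, 1.5)) -> float:
    """Largest gap between the antiderivative's estimated derivative and the integrand."""
    return max(abs(derivative(antiderivative, x) - integrand(x)) for x in points)


for label, (integrand, antiderivative) in examples.items():
    print(f"Example {label}: max mismatch = {max_mismatch(integrand, antiderivative):.2e}")
```

Note that the sample points include negative values of $x$, where the absolute value sign in $\ln|x|$ matters.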
(c) $\int \frac{1}{x}\,dx = \ln|x| + C$ ($x \ne 0$), (Reciprocal Rule)
(d) $\int e^x\,dx = e^x + C$, (Exponential Rule)
(h) $\int kf = k\int f$, (Constant Factor Rule)
(i) $\int f(ax + b)\,dx = \frac{1}{a}\left(\int f\right)(ax + b)$,
where, in each case, C denotes the constant of integration.
Remark 114. For lack of a better name, I shall call the last Rule the Linear Polynomial
Composition (LPC) Rule.
When written out formally, it looks complicated. But as illustrated by the last three
examples, it’s jolly simple and you will already have seen plenty of it in secondary school.
Remark 115. Just to be perfectly clear, let us stress, emphasise, and repeat what we
already said in the last subchapter.
Take for example the Constant Rule, which states:
∫ k dx = kx + C.
The above equation is simply shorthand for the following precise but long-winded state-
ment:
Suppose a function has mapping rule x ↦ k (where k ∈ R).
Then this function’s antiderivatives are exactly those functions
whose mapping rule is x ↦ kx + C.
Proof. Here we will prove (or rather verify) only the (c) Reciprocal Rule. (You are asked
to verify the remaining Rules in Exercise 336.)
In general, to verify that $\int f(x)\,dx = F(x) + C$, it suffices to verify that $\frac{d}{dx}(F(x) + C) = f(x)$.
(c) So, to verify that $\int \frac{1}{x}\,dx = \ln|x| + C$ ($x \ne 0$), it suffices to verify that $\frac{d}{dx}(\ln|x| + C) = x^{-1}$
(for $x \ne 0$).
We have:
$$\ln|x| + C = \begin{cases} \ln x + C & \text{for } x > 0, \\ \ln(-x) + C & \text{for } x < 0. \end{cases}$$
Thus:
$$\frac{d}{dx}(\ln|x| + C) = \begin{cases} \dfrac{1}{x} & \text{for } x > 0, \\[2mm] \dfrac{-1}{-x} = \dfrac{1}{x} & \text{for } x < 0. \end{cases}$$
That’s all there is to verifying that $\frac{d}{dx}(\ln|x| + C) = x^{-1}$ and hence also that $\int \frac{1}{x}\,dx = \ln|x| + C$ (for $x \ne 0$)!
Remark 116. In the Reciprocal Rule, there is, annoyingly enough, an absolute value
sign. Take care to always include it. And no, it is not OK to simply drop it. For why
this isn’t OK, see the Remark following the answer to Exercise 337(c) or this discussion:
.
Remark 117. By Theorem 28(a), we have $\int 1\,dx \overset{\star}{=} x + C$.
We will often write $\int 1\,dx$ more simply as $\int dx$. Thus, $\overset{\star}{=}$ may also be rewritten as:
$$\int dx = x + C.$$
Exercise 338. Compare the Constant Factor Rule in Theorem 27 (Rules of Antidiffer-
entiation) and the Constant Factor Rule in Theorem 25 (Rules of Integration). Aren’t
these exactly the same thing? If not, explain why. (Answer on p. 1543.)
(a) $\int (ax + b)\,dx$.
(e) $\int \frac{1}{ax + b}\,dx$.
(f) $\int \left[a\sin(bx + c) + d\right]dx$.
$$\int_a^b f = g(b) - g(a)$$
where g is any antiderivative of f .
That is, when asked to find the definite integral of f from a to b (i.e. the area under the
graph of f between a and b), we need merely follow these steps:
1. Find any antiderivative g of f .
2. Plug the lower and upper limits of the definite integral into g.
3. The difference g (b) − g (a) is our desired area.
Below we will formally state and prove the FTC2. But first, some examples to illustrate
how it works:
Example 1029. Define $f : \mathbb{R} \to \mathbb{R}$ by $f(x) = x^2$. Suppose we are told to find $\int_0^3 f$,
that is, the area under the graph of $f$, between 0 and 3.
Figure to be
inserted here.
An antiderivative of $f$ is the function $g$ defined by $g(x) = \frac{1}{3}x^3$.
Thus:
$$\int_0^3 f = g(3) - g(0) = 9 - 0 = 9.$$
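We can compare this answer against a brute-force estimate of the area. The sketch below takes $f(x) = x^2$ and the antiderivative $g(x) = x^3/3$, and sets $g(3) - g(0)$ beside a sum of many thin midpoint rectangles. The rectangle method and the name `midpoint_sum` are our own choices for illustration.

```python
def f(x: float) -> float:
    return x**2


def g(x: float) -> float:
    """An antiderivative of f (with constant of integration 0)."""
    return x**3 / 3


def midpoint_sum(func, a: float, b: float, n: int = 10_000) -> float:
    """Approximate the area under func between a and b with n midpoint rectangles."""
    width = (b - a) / n
    return sum(func(a + (i + 0.5) * width) for i in range(n)) * width


print("g(3) - g(0)       =", g(3) - g(0))
print("rectangle estimate =", midpoint_sum(f, 0.0, 3.0))
```

Any other antiderivative, such as $x^3/3 + 7$, gives the same difference, because the constant cancels in $g(3) - g(0)$.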
$$\int_a^b f = g(b) - g(a).$$
a a a a
(a) xxx
(b) xxx
A340(a) xxx
(b) xxx
(We put “same thing” in scare quotes because this is a rather imprecise assertion that is
made precise only by the formal statements of the FTCs.)
80.1. Factorisation
Example 1032. Find $\int \frac{1}{x^2 + 2x + 1}\,dx$ (for $x \ne -1$).
Looks tricky. But observe that $x^2 + 2x + 1 = (x + 1)^2$. And so, using also the Power and
LPC Rules:
$$\int \frac{1}{x^2 + 2x + 1}\,dx = \int \frac{1}{(x + 1)^2}\,dx = -\frac{1}{x + 1} + C.$$
Example 1033. Find $\int \frac{1}{x^3 + 3x^2 + 3x + 1}\,dx$ (for $x \ne -1$).
We observe that $x^3 + 3x^2 + 3x + 1 = (x + 1)^3$. And so:
$$\int \frac{1}{x^3 + 3x^2 + 3x + 1}\,dx = \int \frac{1}{(x + 1)^3}\,dx = -\frac{1}{2}\cdot\frac{1}{(x + 1)^2} + C.$$
In the last subchapter, we learnt to find the following antiderivative in those cases where
$b^2 - 4ac = 0$ and $ax^2 + bx + c$ is thus a perfect square:
$$\int \frac{1}{ax^2 + bx + c}\,dx.$$
We now learn to find the above antiderivative, but in those cases where $b^2 - 4ac > 0$ so that
$ax^2 + bx + c$ is still factorisable but no longer a perfect square. So, this is really just
more factorisation, but this time we’ll also make use of partial fractions.
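Example 1034. Find $\int \frac{1}{x^2 - 1}\,dx$ (for $x \ne \pm 1$).
Here partial fractions (see Ch. 25.1) will come in handy. Observing that $x^2 - 1 = (x + 1)(x - 1)$, we write:
$$\frac{1}{x^2 - 1} = \frac{A}{x + 1} + \frac{B}{x - 1} = \frac{A(x - 1) + B(x + 1)}{(x + 1)(x - 1)} = \frac{(A + B)x - A + B}{x^2 - 1}.$$
Comparing coefficients, $A + B = 0$ and $-A + B = 1$, so that $A = -1/2$ and $B = 1/2$. Thus:
$$\begin{aligned}
\int \frac{1}{x^2 - 1}\,dx &= \int \left(\frac{-1/2}{x + 1} + \frac{1/2}{x - 1}\right)dx \\
&= \int \frac{-1/2}{x + 1}\,dx + \int \frac{1/2}{x - 1}\,dx && \text{(Sum Rule)} \\
&= -\frac{1}{2}\int \frac{1}{x + 1}\,dx + \frac{1}{2}\int \frac{1}{x - 1}\,dx && \text{(Constant Factor Rule)} \\
&\overset{\star}{=} -\frac{1}{2}\ln|x + 1| + \frac{1}{2}\ln|x - 1| + C && \text{(Reciprocal and LPC Rules)} \\
&= \frac{1}{2}\left(\ln|x - 1| - \ln|x + 1|\right) + C \\
&= \frac{1}{2}\ln\frac{|x - 1|}{|x + 1|} + C && \text{(Law of Logarithm)} \\
&= \frac{1}{2}\ln\left|\frac{x - 1}{x + 1}\right| + C. && \text{(Fact 42)}
\end{aligned}$$
Note: We could’ve just left our answer at $\overset{\star}{=}$; the last three steps are nice but aren’t
necessary.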
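$$\begin{aligned}
\int \frac{1}{x^2 + x - 6}\,dx &= \int \left(\frac{-1/5}{x + 3} + \frac{1/5}{x - 2}\right)dx \\
&= \int \frac{-1/5}{x + 3}\,dx + \int \frac{1/5}{x - 2}\,dx && \text{(Sum Rule)} \\
&= -\frac{1}{5}\int \frac{1}{x + 3}\,dx + \frac{1}{5}\int \frac{1}{x - 2}\,dx && \text{(Constant Factor Rule)} \\
&\overset{\star}{=} -\frac{1}{5}\ln|x + 3| + \frac{1}{5}\ln|x - 2| + C && \text{(Reciprocal and LPC Rules)} \\
&= \frac{1}{5}\left(\ln|x - 2| - \ln|x + 3|\right) + C \\
&= \frac{1}{5}\ln\frac{|x - 2|}{|x + 3|} + C && \text{(Law of Logarithm)} \\
&= \frac{1}{5}\ln\left|\frac{x - 2}{x + 3}\right| + C. && \text{(Fact 42)}
\end{aligned}$$
Again, $\overset{\star}{=}$ would’ve sufficed as our answer.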
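The partial-fractions step in the last two examples follows one pattern: for distinct roots $r_1$ and $r_2$, we have $\frac{1}{(x - r_1)(x - r_2)} = \frac{A}{x - r_1} + \frac{B}{x - r_2}$ with $A = \frac{1}{r_1 - r_2}$ and $B = \frac{1}{r_2 - r_1}$. The sketch below computes $A$ and $B$ exactly with Python's `fractions` module and spot-checks the identity at a sample point. The function name `partial_fractions` is our own.

```python
from fractions import Fraction


def partial_fractions(r1, r2):
    """Return (A, B) with 1/((x - r1)(x - r2)) = A/(x - r1) + B/(x - r2).

    Assumes r1 != r2. Obtained by multiplying through by (x - r1)(x - r2)
    and substituting x = r1, then x = r2.
    """
    r1, r2 = Fraction(r1), Fraction(r2)
    if r1 == r2:
        raise ValueError("the two roots must be distinct")
    return 1 / (r1 - r2), 1 / (r2 - r1)


# x^2 - 1 = (x + 1)(x - 1) has roots -1 and 1.
print(partial_fractions(-1, 1))
# x^2 + x - 6 = (x + 3)(x - 2) has roots -3 and 2.
print(partial_fractions(-3, 2))

# Spot-check the identity for x^2 + x - 6 at x = 7.
A, B = partial_fractions(-3, 2)
x = Fraction(7)
assert 1 / (x**2 + x - 6) == A / (x + 3) + B / (x - 2)
```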
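Example 1036. Find $\int \frac{x}{x^2 + 2x + 1}\,dx$ (for $x \ne -1$).
First, observe that $x^2 + 2x + 1 = (x + 1)^2$. Then write:
$$\begin{aligned}
\int \frac{x}{x^2 + 2x + 1}\,dx &= \int \frac{x}{(x + 1)^2}\,dx \\
&= \int \frac{x + 1 - 1}{(x + 1)^2}\,dx && \text{(Plus Zero Trick)} \\
&= \int \left(\frac{x + 1}{(x + 1)^2} - \frac{1}{(x + 1)^2}\right)dx \\
&= \int \left(\frac{1}{x + 1} - \frac{1}{(x + 1)^2}\right)dx \\
&= \int \frac{1}{x + 1}\,dx - \int \frac{1}{(x + 1)^2}\,dx && \text{(Difference Rule)} \\
&= \ln|x + 1| + \frac{1}{x + 1} + C. && \text{(Reciprocal, Power, and LPC Rules)}
\end{aligned}$$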
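$$\begin{aligned}
\int \frac{x}{x^3 - 3x^2 + 3x - 1}\,dx &= \int \frac{x}{(x - 1)^3}\,dx \\
&= \int \frac{x - 1 + 1}{(x - 1)^3}\,dx && \text{(Plus Zero Trick)} \\
&= \int \left(\frac{x - 1}{(x - 1)^3} + \frac{1}{(x - 1)^3}\right)dx \\
&= \int \left(\frac{1}{(x - 1)^2} + \frac{1}{(x - 1)^3}\right)dx \\
&= \int \frac{1}{(x - 1)^2}\,dx + \int \frac{1}{(x - 1)^3}\,dx && \text{(Sum Rule)} \\
&= -\frac{1}{x - 1} - \frac{1}{2}\cdot\frac{1}{(x - 1)^2} + C && \text{(Power and LPC Rules)} \\
&= -\frac{2x - 1}{2(x - 1)^2} + C.
\end{aligned}$$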
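(b) $\int \frac{1}{\sqrt{a^2 - x^2}}\,dx = \sin^{-1}\frac{x}{|a|} + C$, for $|x| < |a|$,
(c) $\int \frac{1}{x^2 - a^2}\,dx = \frac{1}{2a}\ln\left|\frac{x - a}{x + a}\right| + C$, for $x \ne \pm a$,
(d) $\int \frac{1}{a^2 - x^2}\,dx = \frac{1}{2a}\ln\left|\frac{a + x}{a - x}\right| + C$, for $x \ne \pm a$,
(e) $\int \tan x\,dx = \ln|\sec x| + C$, for $x$ not an odd multiple of $\frac{\pi}{2}$,
(f) $\int \cot x\,dx = \ln|\sin x| + C$, for $x$ not a multiple of $\pi$,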
Remark 118. Our versions of (b)–(h) are slightly more general than those given in List
MF26.
$$\int \underbrace{\cos}_{f'}\big(\underbrace{x^2}_{g}\big)\cdot\underbrace{2x}_{g'}\,dx = \underbrace{\sin}_{f}\big(\underbrace{x^2}_{g}\big) + C.$$
Remark 119. As we already saw when doing Exercise 342, (d) is really just (c) with a
negative sign stuck in front.
Proof. Here we will prove (or rather verify) only (a) and (b). We already verified (c) and
(d) in Exercise 342. You are asked to verify (e)–(h) in Exercise 344.
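(a) By Fact 144, $\frac{d}{dx}\tan^{-1}x = \frac{1}{x^2+1}$. And so we have:
\[
\frac{d}{dx}\left[\frac{1}{a}\tan^{-1}\frac{x}{a} + C\right] = \frac{1}{a}\cdot\frac{1}{\left(\frac{x}{a}\right)^2+1}\cdot\frac{1}{a} = \frac{1}{x^2+a^2}.
\]
(b) By Fact 144, $\frac{d}{dx}\sin^{-1}x = \frac{1}{\sqrt{1-x^2}}$ (for $|x| < 1$). Hence, for $\left|\frac{x}{|a|}\right| < 1$ or $|x| < |a|$, we have: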
Exercise 344. Verify Proposition 9(e)–(h). (Hint: In each, you will have to examine
two cases, similar to (b) above.) (Answers on p. 1547)
In Chs. 80.1 and 80.2, we already learnt to find the above antiderivative, in those cases
where b2 − 4ac ≥ 0.
But if b2 −4ac < 0, then those earlier techniques will not work. Instead, we will have to learn
a new technique. This is to complete the square so that we can make use of Proposition
9(a):
\[
\int \frac{1}{x^2+a^2}\,dx \overset{\star}{=} \frac{1}{a}\tan^{-1}\frac{x}{a} + C.
\]
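Example 1038. Find $\int \frac{1}{x^2+x+1}\,dx$.
Observe that: $x^2 + x + 1 = \left(x+\frac{1}{2}\right)^2 + \frac{3}{4}$.
Hence: $\int \frac{1}{x^2+x+1}\,dx = \int \frac{1}{\left(x+\frac{1}{2}\right)^2+\frac{3}{4}}\,dx$.
Now, let $x + 1/2$ and $\sqrt{3/4}$ take the places of “x” and “a” in $\overset{\star}{=}$. Then:
\[
\begin{aligned}
\int \frac{1}{x^2+x+1}\,dx &= \int \frac{1}{\left(x+\frac{1}{2}\right)^2+\frac{3}{4}}\,dx \\
&\overset{\star}{=} \frac{1}{\sqrt{3/4}}\tan^{-1}\frac{x+\frac{1}{2}}{\sqrt{3/4}} + C \\
&= \frac{2}{\sqrt{3}}\tan^{-1}\frac{2x+1}{\sqrt{3}} + C.
\end{aligned}
\]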
For how to complete the square, see Ch. ??. In general, we have:
\[
ax^2 + bx + c = a\left(x+\frac{b}{2a}\right)^2 + c - \frac{b^2}{4a}.
\]
But rather than trying to memorise the above formula, it’s probably easier to understand
and thus easily “see” how you can complete the square in each case.
Let’s first rewrite the integrand so that the leading coefficient is 1 and stick any constants
in front:
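\[
\int \frac{1}{2x^2+3x+5}\,dx = \frac{1}{2}\int \frac{1}{x^2+1.5x+2.5}\,dx.
\]
Complete the square: $x^2 + 1.5x + 2.5 = \left(x+\frac{3}{4}\right)^2 + \frac{31}{16}$.
Let $x + 3/4$ and $\sqrt{31/16}$ take the places of “x” and “a” in $\overset{\star}{=}$. Then:
\[
\begin{aligned}
\int \frac{1}{2x^2+3x+5}\,dx &= \frac{1}{2}\int \frac{1}{\left(x+\frac{3}{4}\right)^2+\frac{31}{16}}\,dx \\
&\overset{\star}{=} \frac{1}{2}\cdot\frac{1}{\sqrt{31/16}}\tan^{-1}\frac{x+3/4}{\sqrt{31/16}} + C \\
&= \frac{2}{\sqrt{31}}\tan^{-1}\frac{4x+3}{\sqrt{31}} + C.
\end{aligned}
\]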
Remark 120. In Ch. 81, we will learn another technique for finding the antiderivative
covered in this subchapter.
In this subchapter and also Chs. 80.1 and 80.2, we’ve learnt to find the antiderivative
of the reciprocal of any quadratic polynomial. We summarise these results in Fact
152, which looks intimidating, but is really just what we’ve been doing, except with actual
numbers in place of a, b, and c.
Fact 152. Suppose $a, b, c \in \mathbb{R}$ with $a \neq 0$ and $d = \sqrt{|b^2-4ac|}$. Then:
\[
\int \frac{1}{ax^2+bx+c}\,dx =
\begin{cases}
\dfrac{1}{d}\ln\left|\dfrac{x+\frac{b-d}{2a}}{x+\frac{b+d}{2a}}\right| + C & \text{for } b^2-4ac > 0, \\[2ex]
-\dfrac{1}{a\left(x+\frac{b}{2a}\right)} + C & \text{for } b^2-4ac = 0, \\[2ex]
\dfrac{2}{d}\tan^{-1}\dfrac{2ax+b}{d} + C & \text{for } b^2-4ac < 0.
\end{cases}
\]
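The three cases for the antiderivative of the reciprocal of a quadratic can be checked numerically. The sketch below is an illustration under stated assumptions: the function names `antiderivative` and `slope`, the sample coefficients, and the test point x = 3 are chosen here, and the repeated-root case is written as −1/(a(x + b/2a)). It compares a central-difference slope of each formula against 1/(ax² + bx + c).

```python
import math

def antiderivative(a, b, c, x):
    """One antiderivative of 1/(a x^2 + b x + c) at x (constant C taken as 0)."""
    disc = b * b - 4 * a * c
    d = math.sqrt(abs(disc))
    if disc > 0:
        # Two distinct real roots: logarithm of a ratio
        return math.log(abs((x + (b - d) / (2 * a)) / (x + (b + d) / (2 * a)))) / d
    if disc == 0:
        # Repeated root: a x^2 + b x + c = a (x + b/2a)^2
        return -1 / (a * (x + b / (2 * a)))
    # No real roots: complete the square and use the arctangent rule
    return 2 / d * math.atan((2 * a * x + b) / d)

def slope(a, b, c, x, h=1e-5):
    """Central-difference estimate of the derivative of the antiderivative."""
    return (antiderivative(a, b, c, x + h) - antiderivative(a, b, c, x - h)) / (2 * h)

for coeffs in [(1, 0, -1), (1, 2, 1), (2, 3, 5)]:
    a, b, c = coeffs
    x = 3.0
    print(coeffs, slope(a, b, c, x), 1 / (a * x * x + b * x + c))
```

For each triple the two printed values should agree closely, which is what it means for the formula to be an antiderivative.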
A345.
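Example 1040. Find $\int \frac{1}{\sqrt{-x^2+x+1}}\,dx$.
Complete the square: $-x^2 + x + 1 = \frac{5}{4} - \left(x-\frac{1}{2}\right)^2 = \left(\frac{\sqrt{5}}{2}\right)^2 - \left(x-\frac{1}{2}\right)^2$.
\[
\begin{aligned}
\int \frac{1}{\sqrt{-x^2+x+1}}\,dx &= \int \frac{1}{\sqrt{\left(\frac{\sqrt{5}}{2}\right)^2 - \left(x-\frac{1}{2}\right)^2}}\,dx \\
&= \sin^{-1}\frac{x-\frac{1}{2}}{\left|\sqrt{5}/2\right|} + C \\
&= \sin^{-1}\frac{2x-1}{\sqrt{5}} + C.
\end{aligned}
\]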
331
Note that here we secretly use the LPC Rule.
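Example 1041. Find $\int \frac{1}{\sqrt{-2x^2+3x+5}}\,dx$.
Complete the square:
\[
-2x^2 + 3x + 5 = 2\left(-x^2 + \frac{3}{2}x + \frac{5}{2}\right) = 2\left[\frac{49}{16} - \left(x-\frac{3}{4}\right)^2\right] = 2\left[\left(\frac{7}{4}\right)^2 - \left(x-\frac{3}{4}\right)^2\right].
\]
By letting $7/4$ and $x - 3/4$ take the places of “a” and “x” in Proposition 9(b), we have:
\[
\begin{aligned}
\int \frac{1}{\sqrt{-2x^2+3x+5}}\,dx &= \int \frac{1}{\sqrt{2}\sqrt{\left(\frac{7}{4}\right)^2 - \left(x-\frac{3}{4}\right)^2}}\,dx \\
&= \frac{1}{\sqrt{2}}\sin^{-1}\frac{x-3/4}{|7/4|} + C \\
&= \frac{1}{\sqrt{2}}\sin^{-1}\frac{4x-3}{7} + C.
\end{aligned}
\]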
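\[
= \frac{1}{\sqrt{3}}\sin^{-1}\frac{x-1/6}{\sqrt{73}/6} + C = \frac{1}{\sqrt{3}}\sin^{-1}\frac{6x-1}{\sqrt{73}} + C.
\]
\[
= \frac{1}{\sqrt{7}}\sin^{-1}\frac{x+1/14}{\sqrt{57}/14} + C = \frac{1}{\sqrt{7}}\sin^{-1}\frac{14x+1}{\sqrt{57}} + C.
\]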
\[
\int \frac{1}{\sqrt{ax^2+bx+c}}\,dx = \frac{1}{\sqrt{|a|}}\sin^{-1}\frac{-2ax-b}{\sqrt{b^2-4ac}} + C.
\]
You can verify that the above general formula “works” for the above examples and exercises.
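One way to carry out that verification is numerically. The sketch below is an illustration only: the helper names, the sample coefficients (both with a < 0, as in the examples above), and the interval [0, 1] are assumptions made here. It compares a Simpson's-rule estimate of the definite integral with the difference of the general formula at the endpoints.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def arcsin_formula(a, b, c, x):
    """(1/sqrt|a|) * asin((-2ax - b)/sqrt(b^2 - 4ac)), constant C taken as 0."""
    return math.asin((-2 * a * x - b) / math.sqrt(b * b - 4 * a * c)) / math.sqrt(abs(a))

def check(a, b, c, lo=0.0, hi=1.0):
    """Return (numeric integral, formula difference) over the assumed interval [lo, hi]."""
    numeric = simpson(lambda x: 1 / math.sqrt(a * x * x + b * x + c), lo, hi)
    exact = arcsin_formula(a, b, c, hi) - arcsin_formula(a, b, c, lo)
    return numeric, exact

print(check(-2, 3, 5))   # coefficients of the second example
print(check(-1, 1, 1))   # coefficients of the first example
```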
332
But see Fact 227 (Appendices) if you’re interested.
80.7. Using Trigonometric Identities
The following Rules of Antidifferentiation are explicitly listed on your H2 Maths syllabus,
but sadly do not appear on List MF26. Which means you’ll have to know how to derive
them.
(b) $\int \cos^2 x\,dx = \frac{1}{2}x + \frac{\sin 2x}{4} + C$,
(c) $\int \tan^2 x\,dx = \tan x - x + C$,
If $x, m, n \in \mathbb{R}$, $m + n \neq 0$, and $m - n \neq 0$, then:
(d) $\int \sin mx \cos nx\,dx = -\frac{1}{2}\left[\frac{\cos(m-n)x}{m-n} + \frac{\cos(m+n)x}{m+n}\right] + C$,
(e) $\int \sin mx \sin nx\,dx = \frac{1}{2}\left[\frac{\sin(m-n)x}{m-n} - \frac{\sin(m+n)x}{m+n}\right] + C$,
(f) $\int \cos mx \cos nx\,dx = \frac{1}{2}\left[\frac{\sin(m-n)x}{m-n} + \frac{\sin(m+n)x}{m+n}\right] + C$.
(In each, C is, as usual, the constant of integration.)
Proof. (a) Use the identity cos 2x = 1 − 2 sin2 x (see Exam Tip below):
\[
\int \sin^2 x\,dx \overset{1}{=} \int \frac{1-\cos 2x}{2}\,dx = \frac{1}{2}x - \frac{\sin 2x}{4} + C.
\]
You are asked to prove (b)–(f) in Exercise 347.
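Before attempting the proofs, it can be reassuring to test one of the rules numerically. The following sketch checks rule (d) for the assumed sample values m = 3, n = 1 on the assumed interval [0, 1]; the helper names are illustrative only.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

m, n_ = 3, 1  # assumed sample values with m + n != 0 and m - n != 0

def integrand(x):
    return math.sin(m * x) * math.cos(n_ * x)

def rule_d(x):
    # -(1/2) [cos((m-n)x)/(m-n) + cos((m+n)x)/(m+n)], constant C taken as 0
    return -0.5 * (math.cos((m - n_) * x) / (m - n_) + math.cos((m + n_) * x) / (m + n_))

numeric = simpson(integrand, 0.0, 1.0)
exact = rule_d(1.0) - rule_d(0.0)
print(numeric, exact)
```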
Whenever you see a question with trigonometric functions, put MF26 (p. 3) next to you!
Rearranging:
\[
\int u \cdot v' = u \cdot v - \int u' \cdot v.
\]
This last equation is our Integration by Parts (IBP) formula. For future reference, let’s jot
it down as a formal result:
\[
\int u \cdot v' = u \cdot v - \int u' \cdot v.
\]
\[
\int \underbrace{x}_{u}\,\underbrace{e^x}_{v'} = \underbrace{x}_{u}\,\underbrace{e^x}_{v} - \int \underbrace{1}_{u'}\,\underbrace{e^x}_{v} = xe^x - e^x + C = e^x(x-1) + C.
\]
It turns out that to choose v′, we want to use the mnemonic and rule of thumb dETAIL. That
is, choose the derivative v′ in this order:
The above rule of thumb (usually) works because it is easiest to find an antiderivative of
an Exponential function and hardest to find one of a Logarithmic function.334
More examples:
333
Following common practice, I use the functions u and v (rather than say f and g) when discussing IBP.
334
Kasube (1983) first gave it as LIATE. By reversing the letters and adding the letter d in front, we get
the actual English word dETAIL, which is probably easier to remember.
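Example 1044. Find $\int \sin x \cos x$.
An easy way to find this antiderivative is to recall that $\sin x \cos x = \frac{1}{2}\sin 2x$ and hence:
\[
\int \sin x \cos x = \frac{1}{2}\int \sin 2x \overset{1}{=} -\frac{1}{4}\cos 2x + C.
\]
But here as an exercise, let’s try using IBP to find this antiderivative.
This time the dETAIL rule of thumb doesn’t help us because we have two trigonometric
functions. So let’s just choose $v' = \cos x$, so that $v = \sin x$ and:
\[
\int \underbrace{\sin x}_{u}\,\underbrace{\cos x}_{v'} = \underbrace{\sin x}_{u}\,\underbrace{\sin x}_{v} - \int \underbrace{\cos x}_{u'}\,\underbrace{\sin x}_{v}.
\]
Rearranging, we have:
\[
2\int \sin x \cos x = \sin^2 x + \hat{C} \quad\text{or}\quad \int \sin x \cos x \overset{2}{=} \frac{1}{2}\sin^2 x + \bar{C}.
\]
Can you explain why $\overset{1}{=}$ and $\overset{2}{=}$ are consistent with each other?335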
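\[
\int \underbrace{x^2}_{u}\,\underbrace{e^x}_{v'} \overset{1}{=} \underbrace{x^2}_{u}\,\underbrace{e^x}_{v} - \int \underbrace{2x}_{u'}\,\underbrace{e^x}_{v}.
\]
At this point, we would apply IBP a second time. But we already did this in an earlier
example and found that $\int xe^x \overset{2}{=} e^x(x-1)$. So let’s just plug $\overset{2}{=}$ into $\overset{1}{=}$:
\[
\int x^2 e^x = x^2 e^x - 2e^x(x-1) + C = e^x\left(x^2 - 2x + 2\right) + C.
\]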
Sometimes we can use IBP together with the Times One Trick:
Example 1046. To find $\int \ln x\,dx$, the Times One Trick and IBP work wonders:
\[
\int \ln x\,dx = \int \ln x \cdot 1\,dx = (\ln x)\,x - \int \frac{1}{x}\,x\,dx = x\ln x - \int 1\,dx = x\ln x - x + C.
\]
335
Recall that $\cos 2x = \cos^2 x - \sin^2 x = 1 - 2\sin^2 x$. Hence, $-\frac{1}{4}\cos 2x = \frac{1}{2}\sin^2 x - \frac{1}{4}$. The additional term
“$-\frac{1}{4}$” is not a problem once we also recall that indefinite integrals can differ by up to a constant.
Remark 123. Unfortunately, I have not found any good mnemonic for the IBP formula
(Theorem 30). But perhaps this is all for the best, since this may occasionally force you
to derive it from the Product Rule (and hence understand where it comes from):
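\[
\begin{aligned}
(u \cdot v)' &= u' \cdot v + u \cdot v' \\
u \cdot v &= \int u' \cdot v + \int u \cdot v' \\
\int u \cdot v' &\overset{1}{=} u \cdot v - \int u' \cdot v.
\end{aligned}
\]
In Ch. 81.7, we will show that the IBP formula $\overset{1}{=}$ can also be rewritten as:
\[
\int u\,dv = u \cdot v - \int v\,du.
\]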
A mnemonic that does work well with this alternative formula is “ultraviolet voodoo”.
Exercise 349. Starting with $\int \frac{1}{x}\,dx$, Aisha arrives at the conclusion that 0 = 1:
“$\int \frac{1}{x}\,dx = \int \frac{1}{x} \cdot 1\,dx = \frac{1}{x} \cdot x - \int \frac{-1}{x^2}\,x\,dx = 1 + \int \frac{1}{x}\,dx.$
“Now subtract $\int \frac{1}{x}\,dx$ from both sides to get 0 = 1.”
“Now subtract $\int_1^2 \frac{1}{x}\,dx$ from both sides to get 0 = 1.”
\[
\int x^3 e^x\,dx \overset{1}{=} x^3 e^x - \int 3x^2 e^x\,dx.
\]
get:
\[
\int x^3 e^x\,dx = x^3 e^x - 3e^x\left(x^2 - 2x + 2\right) + C = e^x\left(x^3 - 3x^2 + 6x - 6\right) + C.
\]
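The repeated use of IBP on integrands of the form xⁿeˣ follows a pattern that can be sketched as a short recursion. The function name `exp_poly` and the coefficient-list representation below are illustrative choices made here, not notation from the text.

```python
def exp_poly(n):
    """Coefficients (constant term first) of p_n, where
    the antiderivative of x^n e^x is e^x * p_n(x) + C."""
    coeffs = [1]  # p_0(x) = 1, since the antiderivative of e^x is e^x
    for k in range(1, n + 1):
        # One IBP step: int x^k e^x = x^k e^x - k * int x^(k-1) e^x,
        # so p_k(x) = x^k - k * p_(k-1)(x).
        coeffs = [-k * c for c in coeffs] + [1]
    return coeffs

print(exp_poly(1))  # x - 1
print(exp_poly(2))  # x^2 - 2x + 2
print(exp_poly(3))  # x^3 - 3x^2 + 6x - 6
```

Each list reads from the constant term upwards, so `exp_poly(3)` corresponds to −6 + 6x − 3x² + x³.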
Now apply IBP a second time, this time choosing cos x as our new “v ′ ”:
Recall that indefinite integrals are unique, but only up to a constant of integration. And
so, more generally, given any continuous function u, we may write:
Aisha’s error is in the second sentence — with indefinite integrals, the ∫ f (x) dx on the
LHS may differ from the ∫ f (x) dx on the RHS by a constant and so we cannot simply
cancel them out.
A350. This time, the second sentence is correct. Any definite integral, such as $\int_1^2 \frac{1}{x}\,dx$,
is simply a number. It is therefore perfectly legitimate to cancel out $\int_1^2 \frac{1}{x}\,dx$ from both
sides of an equation.
This time, the error lies in the last step of the first sentence. In particular, the following
equation is false:
\[
\text{“}\left[1 + \int \frac{1}{x}\,dx\right]_1^2 = 1 + \int_1^2 \frac{1}{x}\,dx.\text{”}
\]
First, observe that: $\cot x = \frac{\cos x}{\sin x}$ and $\frac{d}{dx}\sin x = \cos x$.
\[
u \overset{1}{=} \sin x.
\]
From $\overset{1}{=}$, we also have: $\frac{du}{dx} \overset{2}{=} \cos x$.
\[
\int \cot x\,dx = \int \frac{\cos x}{\sin x}\,dx \overset{1}{=} \int \frac{\cos x}{u}\,dx \overset{2}{=} \int \frac{1}{u}\frac{du}{dx}\,dx.
\]
So far, so normal. But we’ll now do something strange: namely, take the last expression
and simply “cancel out the dx’s”:
\[
\int \frac{1}{u}\frac{du}{dx}\,dx \overset{s}{=} \int \frac{1}{u}\,du.
\]
We call this step where we “cancel out the dx’s” an application of the Substitution Rule
and will denote it by $\overset{s}{=}$.
And now:
\[
\int \frac{1}{u}\,du = \ln|u| + C \overset{1}{=} \ln|\sin x| + C,
\]
where we must always remember to plug back the initial substitution $u \overset{1}{=} \sin x$ to get rid
of u.
At $\overset{s}{=}$, we apply the Substitution Rule and simply “cancel out the dx’s”. And as always,
the final step is to plug back the initial substitution $u \overset{1}{=} x^2$ to get rid of u.
At $\overset{s}{=}$, we apply the Substitution Rule and simply “cancel out the dx’s”. And as always,
the final step is to plug back the initial substitution $u \overset{1}{=} x^2 + x$ to get rid of u.
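Another method is to observe that $\frac{d}{dx}\left(x^3 + 2x^2\right) = 3x^2 + 4x$, which suggests the substitution
$u \overset{1}{=} x^3 + 2x^2$. With this substitution, we have $\frac{du}{dx} \overset{2}{=} 3x^2 + 4x$ and:
\[
\begin{aligned}
\int \left(x^3 + 2x^2\right)\left(3x^2 + 4x\right) dx &\overset{1}{=} \int u\left(3x^2 + 4x\right) dx \overset{2}{=} \int u\,\frac{du}{dx}\,dx \\
&\overset{s}{=} \int u\,du = \frac{1}{2}u^2 + C \overset{1}{=} \frac{1}{2}\left(x^3 + 2x^2\right)^2 + C.
\end{aligned}
\]
At $\overset{s}{=}$, we apply the Substitution Rule and simply “cancel out the dx’s”. And as always,
the final step is to plug back the initial substitution $u \overset{1}{=} x^3 + 2x^2$ to get rid of u.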
Informally, the Substitution Rule says that we can simply “cancel out the dx’s”:
\[
\frac{du}{dx}\,dx \;\text{“=”}\; du. \qquad (☺)
\]
Here we must repeat our earlier warning that $\frac{du}{dx}$ is not a fraction and we should not think
of the dx’s as numbers. So, (☺) is best thought of as being merely a convenient and informal
mnemonic for the Substitution Rule. Formally, we are not doing anything like cancelling
out the dx’s.
As it turns out, the Substitution Rule is simply the inverse of the Chain Rule. To
see why, let us first state and prove the following result, which is simply the Chain Rule
inverted. We will then explain why this result gives us the Substitution Rule:
\[
\int \left[(f' \circ g) \cdot g'\right] = f \circ g + C.
\]
(Substitution Rule)
Next, by definition $f = \int f'$. And so, we can rewrite RHS of $\overset{\star}{=}$ as:
Putting $\overset{1}{=}$ and $\overset{2}{=}$ together, we can rewrite $\overset{\star}{=}$ as:
\[
\underbrace{\int f'(g(x))\,\frac{dg}{dx}\,dx}_{\int f'(g(x))\,dg} \overset{\circ}{=} \left.\int f'(t)\,dt\,\right|_{t=g(x)}.
\]
This last equation $\overset{\circ}{=}$ is recognisable as the Substitution Rule. By (informally) “cancelling
the dx’s”, the LHS of $\overset{\circ}{=}$ (informally) becomes336 $\int f'(g(x))\,dg$. This last expression seems
to say:
336
This is informal because we have not at any point in this textbook defined what an expression like
$\int f'(g(x))\,dg$ might mean.
The above equation says that the functions (f ○ g)⋅g ′ and f have the same antiderivatives,
which is clearly false!
This observation was already made by David Gale in his 1994 article “Teaching Integra-
tion by Substitution”. I can do no better than reproduce his remarks:
Of course the equation is false. The expression ∫ f (x) dx stands for
antiderivative, as in a table of integrals, and the variable, be it x, t, u
or anything else is a dummy. Clearly the antiderivatives on the left and
right above are not equal. What the books mean, no doubt, is that if you
substitute g (x) for u after taking the antiderivative on the right you get
the antiderivative on the left. I expect some readers will say I am being
pedantic or that there is no need to be so rigorous at the freshman level,
but I think this kind of lapse is symptomatic of a rather strange set of
standards and perhaps it sheds light on why none of the books proves the
inverse substitution theorem. It is because none of them formulates it.
Without changing it too much, here are two corrected versions of $\overset{1}{=}$:
Or alternatively, if F = ∫ f , then:
337
Some textbooks that make this error are: Stewart (Single Variable Calculus, 2011, p. 331); Thomas
and Finney (Calculus and Analytic Geometry, 1998, p. 294) — see also Hass, Heil, and Weir (Thomas’
Calculus, 2018, p. 291). Also, ProofWiki (retrieved 2018-10-06-1058).
Some textbooks that do not make this error are: Apostol (Calculus: Volume I, 1967, p. 212); Larson
and Edwards (Calculus of a Single Variable, 2016, p. 296).
81.1. ∫ [(f′ ∘ g) ⋅ g′] = f ∘ g + C
We now repeat our earlier derivation of the Substitution Rule. By the Chain Rule:
\[
(f \circ g)' = (f' \circ g) \cdot g'.
\]
And so equivalently:
\[
\int \left[(f' \circ g) \cdot g'\right] \overset{\star}{=} f \circ g + C.
\]
As we saw earlier, by manipulating $\overset{\star}{=}$ a little, we can make it look more like the Substitution
Rule.
Now, observe that by recognising that an integrand is of the form (f ′ ○ g) ⋅ g ′ , we can
immediately write down its antiderivative and so skip the step of making any substitution.
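\[
(\underbrace{x^2}_{g})' = \underbrace{2x}_{g'}.
\]
So:
\[
\int \underbrace{\cos}_{f'}\big(\underbrace{x^2}_{g}\big)\,\underbrace{2x}_{g'}\,dx = \underbrace{\sin}_{f}\,\underbrace{x^2}_{g} + C.
\]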
What we’ve done here is secretly equivalent to what we did earlier, but just much quicker.
Example 1052. In Example 1048, we found ∫ sin(x2 + x)(2x + 1) dx by using the sub-
stitution u = x2 + x.
But we can actually skip this substitution altogether. To do so, observe that:
\[
(\underbrace{x^2 + x}_{g})' = \underbrace{2x + 1}_{g'}.
\]
So:
\[
\int \underbrace{\sin}_{f'}\big(\underbrace{x^2 + x}_{g}\big)\,\underbrace{(2x + 1)}_{g'}\,dx = \underbrace{-\cos}_{f}\big(\underbrace{x^2 + x}_{g}\big) + C.
\]
Again, what we’ve done here is secretly equivalent to what we did earlier, but just much
quicker.
where as usual, the last step is to plug back the original substitution $u \overset{1}{=} \sin x$ to get rid
of u.
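Example 1054. Find $\int \frac{e^{\sqrt{x}}}{\sqrt{x}}\,dx$.
Observing that $\frac{d}{dx}\sqrt{x} = \frac{1}{2\sqrt{x}}$, we try the substitution $u \overset{1}{=} \sqrt{x}$, so that $\frac{du}{dx} = \frac{1}{2\sqrt{x}}$ or
$\frac{1}{\sqrt{x}} \overset{2}{=} 2\frac{du}{dx}$ and:
\[
\int \frac{e^{\sqrt{x}}}{\sqrt{x}}\,dx \overset{1}{=} \int \frac{e^u}{\sqrt{x}}\,dx \overset{2}{=} 2\int e^u\,\frac{du}{dx}\,dx \overset{s}{=} 2\int e^u\,du = 2e^u + C \overset{1}{=} 2e^{\sqrt{x}} + C,
\]
where as usual, the last step is to plug back the original substitution $u \overset{1}{=} \sqrt{x}$ to get rid
of u.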
In general:
\[
\int f' \exp f = \exp f + C.
\]
Proof. By the Chain Rule: $\frac{d}{dx}(\exp f + C) = (\exp f)\,f'$.
And so, by recognising that an integrand is of the form f ′ exp f , we can simply skip the
substitution altogether and thus solve the problem more quickly. We now revisit the above
two examples:
Example 1057. Earlier in Example 1047, we found $\int \cot x\,dx$ (for x not an integer multiple of π).
And hence:
\[
\int \cot x\,dx = \int \frac{\cos x}{\sin x}\,dx = \int \frac{\sin' x}{\sin x}\,dx = \ln|\sin x| + C.
\]
Again, what we’ve done here is secretly equivalent to what we did earlier, but just much
quicker.
In general:
\[
\int \frac{f'}{f} = \ln|f| + C \quad (\text{for } f \neq 0).
\]
Proof. By the Chain Rule: $\frac{d}{dx}(\ln|f| + C) = \frac{f'}{f}$.
And so, by recognising that an integrand is of the form $f'/f$, we can simply skip the
substitution altogether and thus solve the problem more quickly.
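This substitution gives us $\frac{du}{dx} \overset{2}{=} 2x + \cos x$ and:
\[
\begin{aligned}
\int \frac{2x + \cos x}{x^2 + \sin x}\,dx &\overset{1}{=} \int \frac{2x + \cos x}{u}\,dx \overset{2}{=} \int \frac{1}{u}\frac{du}{dx}\,dx \overset{s}{=} \int \frac{1}{u}\,du \\
&= \ln|u| + C \overset{1}{=} \ln\left|x^2 + \sin x\right| + C,
\end{aligned}
\]
where as usual, the last step is to plug back the original substitution $u \overset{1}{=} x^2 + \sin x$ to
get rid of u.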
Let us now redo the problem without explicitly making the substitution. Observe that
the derivative of the integrand’s denominator is its numerator:
\[
\left(x^2 + \sin x\right)' = 2x + \cos x.
\]
And hence:
\[
\int \frac{2x + \cos x}{x^2 + \sin x}\,dx = \int \frac{\left(x^2 + \sin x\right)'}{x^2 + \sin x}\,dx = \ln\left|x^2 + \sin x\right| + C.
\]
In this example, the second solution is secretly equivalent to the first, but just much
quicker.
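A354(a) Since $\frac{d}{dx}\left(x^3 - \cos x\right) = 3x^2 + \sin x$, we have:
\[
\int \frac{3x^2 + \sin x}{x^3 - \cos x}\,dx = \ln\left|x^3 - \cos x\right| + C.
\]
(b) Since $\frac{d}{dx}\left(\sin^2 x + 1\right) = 2\sin x\cos x = \sin 2x$, we have:
\[
\int \frac{\sin 2x}{\sin^2 x + 1}\,dx = \ln\left|\sin^2 x + 1\right| + C = \ln\left(\sin^2 x + 1\right) + C.
\]
(d) Since $\frac{d}{dx}\left(x^3 + x^2 + x + 1\right) = 3x^2 + 2x + 1$, we have:
\[
\int \frac{3x^2 + 2x + 1}{x^3 + x^2 + x + 1}\,dx = \ln\left|x^3 + x^2 + x + 1\right| + C.
\]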
By the Chain Rule, $\frac{d}{dx}(f)^2 = 2f \cdot f'$. Hence:
\[
\int f \cdot f' = \frac{1}{2}(f)^2 + C.
\]
If we recognise that $\left(x^3 + 2x^2\right)' = 3x^2 + 4x$ and hence that the integrand is of the form
$f \cdot f'$, then we can skip the substitution altogether and simply write:
\[
\int \left(x^3 + 2x^2\right)\left(3x^2 + 4x\right) dx = \int \left(x^3 + 2x^2\right)\left(x^3 + 2x^2\right)' dx = \frac{1}{2}\left(x^3 + 2x^2\right)^2 + C.
\]
Similarly, by the Chain Rule, $\frac{d}{dx}(f)^3 = 3(f)^2 \cdot f'$. Hence:
\[
\int (f)^2 \cdot f' = \frac{1}{3}(f)^3 + C.
\]
Example 1061. Find $\int \left(x^3 + 2x^2\right)^2\left(3x^2 + 4x\right) dx$. Again, one method is to expand
the integrand then antidifferentiate term by term. Another is to use the substitution
$u = x^3 + 2x^2$.
The best method of all is to simply recognise that $\left(x^3 + 2x^2\right)' = 3x^2 + 4x$ and hence that
the integrand is of the form $(f)^2 \cdot f'$. We can then skip the substitution altogether and
simply write:
\[
\int \left(x^3 + 2x^2\right)^2\left(3x^2 + 4x\right) dx = \int \left(x^3 + 2x^2\right)^2\left(x^3 + 2x^2\right)' dx = \frac{1}{3}\left(x^3 + 2x^2\right)^3 + C.
\]
\[
\int (f)^n f' = \frac{1}{n+1}(f)^{n+1} + C.
\]
One method is to fully expand the integrand to get a 152nd-degree polynomial, then
integrate this polynomial term-by-term. This is doable, but absurdly tedious.
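Another is to use the substitution $u \overset{1}{=} x^3 + 5x^2 - 3x + 2$, as we do now. This substitution is suggested by observing that:
\[
\frac{d}{dx}\left(x^3 + 5x^2 - 3x + 2\right) = 3x^2 + 10x - 3.
\]
Given $\overset{1}{=}$, we also have:
\[
\frac{du}{dx} \overset{2}{=} 3x^2 + 10x - 3.
\]
And now:
\[
\begin{aligned}
&\overset{2}{=} \int u^{50}\,\frac{du}{dx}\,dx \\
&\overset{s}{=} \int u^{50}\,du \\
&= \frac{1}{51}u^{51} + C \\
&\overset{1}{=} \frac{1}{51}\left(x^3 + 5x^2 - 3x + 2\right)^{51} + C,
\end{aligned}
\]
where as usual, the last step is to plug back the original substitution $u \overset{1}{=} x^3 + 5x^2 - 3x + 2$
to get rid of u.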
Example 1063. Find $\int \frac{x}{x^2 + x + 1}\,dx$.
This time, we have: $\frac{d}{dx}\left(x^2 + x + 1\right) = 2x + 1$.
And so, it’s not obvious how we can apply the ln trick.
But what we can do is to rewrite the integral:
\[
\int \frac{x}{x^2+x+1}\,dx = \frac{1}{2}\int \frac{2x}{x^2+x+1}\,dx = \frac{1}{2}\int \frac{2x+1-1}{x^2+x+1}\,dx = \frac{1}{2}\left(\int \frac{2x+1}{x^2+x+1}\,dx - \int \frac{1}{x^2+x+1}\,dx\right).
\]
(Note that in this last step, we can use parentheses instead of the absolute value sign
because x2 + x + 1 > 0 for all x ∈ R.)
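For the second term, we can use what we learnt in Ch. 80.5:
\[
\int \frac{1}{x^2+x+1}\,dx = \int \frac{1}{\left(x+\frac{1}{2}\right)^2 + \frac{3}{4}}\,dx = \int \frac{1}{\left(x+\frac{1}{2}\right)^2 + \left(\frac{\sqrt{3}}{2}\right)^2}\,dx = \frac{2}{\sqrt{3}}\tan^{-1}\frac{2x+1}{\sqrt{3}} + \bar{C}.
\]
Altogether then:
\[
\int \frac{x}{x^2+x+1}\,dx = \frac{1}{2}\left(\int \frac{2x+1}{x^2+x+1}\,dx - \int \frac{1}{x^2+x+1}\,dx\right) = \frac{1}{2}\ln\left(x^2+x+1\right) - \frac{1}{\sqrt{3}}\tan^{-1}\frac{2x+1}{\sqrt{3}} + C.
\]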
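The final answer, ½ ln(x² + x + 1) − (1/√3) tan⁻¹((2x + 1)/√3) + C, combines a logarithm and an arctangent, so it is worth a numerical check. The sketch below is illustrative: the helper names and the interval [0, 2] are assumptions made here.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def integrand(x):
    return x / (x * x + x + 1)

def antiderivative(x):
    # (1/2) ln(x^2 + x + 1) - (1/sqrt 3) atan((2x + 1)/sqrt 3), with C = 0
    root3 = math.sqrt(3)
    return 0.5 * math.log(x * x + x + 1) - math.atan((2 * x + 1) / root3) / root3

numeric = simpson(integrand, 0.0, 2.0)
exact = antiderivative(2.0) - antiderivative(0.0)
print(numeric, exact)
```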
That is, you may not be given the substitution to make when faced with integrands of the
form $f'(x)\,[f(x)]^n$ or $f'(x)\,e^{f(x)}$. But this is no problem, since we’ve already thoroughly
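\[
\begin{aligned}
\sqrt{1-x^2} &= \sqrt{1-\sin^2 u} \\
&= \sqrt{\cos^2 u} && (\because \sin^2 u + \cos^2 u = 1) \\
&= |\cos u| && (\because \sqrt{y^2} = |y|)
\end{aligned}
\]
By the Inverse Function Theorem, we have $\frac{dx}{du}\frac{du}{dx} \overset{\star}{=} 1$. We will use this in conjunction
with the Times One Trick:
\[
\begin{aligned}
\int \sqrt{1-x^2}\,dx &= \int \sqrt{1-\sin^2 u}\,dx \\
&\overset{\star}{=} \int \cos u\,\frac{dx}{du}\frac{du}{dx}\,dx && \text{(Times One Trick)} \\
&= \int \cos^2 u\,\frac{du}{dx}\,dx \\
&\overset{s}{=} \int \cos^2 u\,du \\
&= \int \frac{\cos 2u + 1}{2}\,du && \text{(Double Angle Formula)} \\
&= \frac{\sin 2u}{4} + \frac{u}{2} + C \\
&= \frac{\sin u\cos u}{2} + \frac{u}{2} + C \\
&= \frac{\sin u\sqrt{1-\sin^2 u}}{2} + \frac{u}{2} + C \\
&= \frac{x\sqrt{1-x^2}}{2} + \frac{\sin^{-1}x}{2} + C.
\end{aligned}
\]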
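The result ½x√(1 − x²) + ½ sin⁻¹x + C can be checked against a numerical integral. This is a sketch under stated assumptions: the helper names and the interval [0, 0.5] (kept away from x = 1, where the integrand has a vertical tangent) are choices made here for illustration.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def integrand(x):
    return math.sqrt(1 - x * x)

def antiderivative(x):
    # x sqrt(1 - x^2)/2 + asin(x)/2, with the constant C taken as 0
    return x * math.sqrt(1 - x * x) / 2 + math.asin(x) / 2

numeric = simpson(integrand, 0.0, 0.5)
exact = antiderivative(0.5) - antiderivative(0.0)
print(numeric, exact)
```

Geometrically, the value is the area under the unit circle between x = 0 and x = 0.5, namely a triangle of area √3/8 plus a circular sector of area π/12.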
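As usual, we have: $\frac{du}{dx} \overset{2}{=} \sec^2 x$.
\[
\begin{aligned}
\int \frac{1}{1 + 3\cos^2 x}\,dx &= \int \frac{\sec^2 x}{\sec^2 x + 3}\,dx && \text{(Multiply N and D by } \sec^2 x) \\
&\overset{2}{=} \int \frac{1}{\sec^2 x + 3}\,\frac{du}{dx}\,dx \\
&\overset{s}{=} \int \frac{1}{\sec^2 x + 3}\,du \\
&= \int \frac{1}{\tan^2 x + 4}\,du && (\because \tan^2 x + 1 = \sec^2 x) \\
&\overset{1}{=} \int \frac{1}{u^2 + 4}\,du \\
&= \frac{1}{4}\int \frac{1}{\left(\frac{u}{2}\right)^2 + 1}\,du \\
&= \frac{1}{4}\left(2\tan^{-1}\frac{u}{2}\right) + C \\
&= \frac{1}{2}\tan^{-1}\frac{u}{2} + C \\
&\overset{1}{=} \frac{1}{2}\tan^{-1}\frac{\tan x}{2} + C.
\end{aligned}
\]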
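Here is a small numerical check of the answer ½ tan⁻¹((tan x)/2) + C. It is an illustrative sketch: the helper names and the interval [0, 1] (which stays inside (−π/2, π/2), where tan x is continuous) are assumptions made here.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def integrand(x):
    return 1 / (1 + 3 * math.cos(x) ** 2)

def antiderivative(x):
    # (1/2) atan(tan(x)/2), valid on an interval where tan x is continuous
    return 0.5 * math.atan(math.tan(x) / 2)

numeric = simpson(integrand, 0.0, 1.0)
exact = antiderivative(1.0) - antiderivative(0.0)
print(numeric, exact)
```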
\[
f(x) = \frac{1}{x^2\sqrt{x^2-1}}.
\]
(a) Show that $\int f\,dx \overset{1}{=} \sin\left(\sec^{-1}x\right) + C_1$ by using the substitution $x = \sec u$, where
$u \in \left(0, \frac{\pi}{2}\right)$. (Hint 1: The chosen values of u ensure that $\tan u > 0$.)
(b) Now show that $\int f\,dx \overset{2}{=} \frac{\sqrt{x^2-1}}{x} + C_2$ by using the substitution $u = 1 - \frac{1}{x^2} \in (0, 1)$.
(c) Use (a) and (b) to prove that we have:
\[
\sin\left(\sec^{-1}x\right) \overset{3}{=} \frac{\sqrt{x^2-1}}{x} \quad (\text{at least for } x > 1).
\]
(Hint 2: How are two antiderivatives of the same function related? Hint 3: Plug in x = 2.)
(a) Show that $\int g\,dx \overset{1}{=} \frac{1}{\cos\left(\tan^{-1}x\right)} + \cos\left(\tan^{-1}x\right) + C_1$ by using the substitution $x = \tan u$, where $u \in \left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$. (Hint: These values of u ensure that $\sec u > 0$.)
(b) Now show that $\int g\,dx \overset{2}{=} \frac{1}{\sqrt{x^2+1}} + \sqrt{x^2+1} + C_2$ by using the substitution $u = x^2 + 1$.
(c) Prove that if $\frac{1}{a} + a = \frac{1}{b} + b$ ($a, b \neq 0$), then either $a = b$ or $a = \frac{1}{b}$.
(d) Use (a), (b), and (c) to prove that we have:
\[
\cos\left(\tan^{-1}x\right) \overset{3}{=} \frac{1}{\sqrt{x^2+1}} \quad (\text{for all } x).
\]
And thus:
\[
\int \exp(\sin x)\cos x\,dx = \int \underbrace{\exp}_{f'}\big(\underbrace{\sin x}_{g}\big)\,\underbrace{\cos x}_{g'}\,dx = \underbrace{\exp}_{f}\big(\underbrace{\sin x}_{g}\big) + C.
\]
This procedure is secretly equivalent to what we did in Example ??, but just much
quicker.
And hence:
\[
\int 2f(x)\,f'(x)\,dx = [f(x)]^2 + C, \qquad \int -\frac{f'(x)}{[f(x)]^2}\,dx = \frac{1}{f(x)} + C,
\]
\[
\int e^{f(x)}\,f'(x)\,dx = e^{f(x)} + C.
\]
\[
\int \sec x\,dx \overset{2}{=} \frac{1}{2}\ln\frac{1+\sin x}{1-\sin x} + C.
\]
(b) Show that $\sec x = \frac{\cos x}{1-\sin^2 x}$.
(c) Show that $\frac{1}{1-\sin^2 x} = \frac{A}{1+\sin x} + \frac{B}{1-\sin x}$, where A and B are constants to be found.
(d) By plugging in (a) and (b), then considering the derivatives of the denominators,
prove $\overset{2}{=}$.
(e) By considering $2\ln|\sec x + \tan x|$ or otherwise, show that $\overset{1}{=}$ and $\overset{2}{=}$ are equivalent.
(f) Prove that $\tan\left(\theta + \frac{\pi}{4}\right) = \sec 2\theta + \tan 2\theta$. (This may be hard; see hints in footnote.)339
(g) Hence conclude that:
\[
\int \sec x\,dx \overset{3}{=} \ln\left|\tan\left(\frac{x}{2} + \frac{\pi}{4}\right)\right| + C.
\]
(Answer on p. 867.)
According to Rickey and Tuchinsky (1980), historically, the problem of solving the integral
∫ sec x dx was “one of the outstanding open problems of the mid-seventeenth century”
and was closely related to the construction of the Mercator map projection. In 1645,
Isaac Barrow (Newton’s teacher) gave the first “intelligible” solution.
A358(a) $\int \sec x\,dx = \int \sec x\,\frac{\sec x + \tan x}{\sec x + \tan x}\,dx = \int \frac{\sec^2 x + \sec x\tan x}{\sec x + \tan x}\,dx$.
Observe that $\frac{d}{dx}(\sec x + \tan x) = \sec^2 x + \sec x\tan x$. Thus:
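339
Use the following steps:
$\tan(A+B) \overset{4}{=} \frac{\tan A + \tan B}{1 - \tan A\tan B}$, $\tan\frac{\pi}{4} \overset{5}{=} 1$, $\tan\theta \overset{6}{=} \frac{\sin\theta}{\cos\theta}$, multiply by $\frac{\cos\theta + \sin\theta}{\cos\theta + \sin\theta}$, $\sin^2\theta + \cos^2\theta \overset{7}{=} 1$,
$2\sin\theta\cos\theta \overset{8}{=} \sin 2\theta$, $\cos^2\theta - \sin^2\theta \overset{9}{=} \cos 2\theta$.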
(b) $\sec x = \frac{1}{\cos x} = \frac{\cos x}{\cos^2 x} = \frac{\cos x}{1-\sin^2 x}$.
(c) $\frac{1}{1-\sin^2 x} = \frac{1}{(1+\sin x)(1-\sin x)} = \frac{A}{1+\sin x} + \frac{B}{1-\sin x} = \frac{A + B + (B-A)\sin x}{(1+\sin x)(1-\sin x)}$.
\[
\overset{\star}{=} \frac{1}{2}\left[\ln(1+\sin x) - \ln(1-\sin x)\right] + C = \frac{1}{2}\ln\frac{1+\sin x}{1-\sin x} + C.
\]
Note that since x is not an odd multiple of π/2, we have $-1 < \sin x < 1$ so that $1 + \sin x > 0$
and $1 - \sin x > 0$. Hence, at $\overset{\star}{=}$, we can use parentheses instead of the absolute value sign.
(e)
Exercise 359. Recall that the formula for Integration by Parts is:
\[
\int u \cdot v'\,dx = u \cdot v - \int u' \cdot v\,dx.
\]
Using the Substitution Rule, show that the above formula may also be written as:
A359. We have:
\[
\int u \cdot v'\,dx = \int u \cdot \frac{dv}{dx}\,dx = \int u\,dv \quad\text{and}\quad \int u' \cdot v\,dx = \int \frac{du}{dx} \cdot v\,dx = \int v\,du.
\]
Hence:
\[
\int u \cdot v'\,dx = u \cdot v - \int u' \cdot v\,dx \iff \int u\,dv = u \cdot v - \int v\,du.
\]
The above alternative formula for IBP is really just the exact same thing but under a
different guise. We revisit our first two examples from Ch. 80.8:
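\[
\int \underbrace{x}_{u}\,\underbrace{e^x\,dx}_{dv} = \underbrace{x}_{u}\,\underbrace{e^x}_{v} - \int \underbrace{e^x}_{v}\,\underbrace{1\,dx}_{du} = xe^x - e^x + C.
\]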
Remark 126. As mentioned in Remark 123, if you’re comfortable with or even prefer the
above alternative formula for IBP, then a mnemonic that goes well with it is “ultraviolet
voodoo”.
Example 1070. To find $\int \tan^{-1}x$, we can use the (1) Times One Trick; (2) IBP; and
the (3) Substitution Rule:
\[
\int \tan^{-1}x \overset{1}{=} \int 1 \cdot \tan^{-1}x \overset{2}{=} x\tan^{-1}x - \int x \cdot \frac{1}{1+x^2} \overset{3}{=} x\tan^{-1}x - \frac{1}{2}\ln\left(1+x^2\right) + C.
\]
Note that in the last step, we can use parentheses instead of the absolute value sign
because 1 + x2 > 0 for all x ∈ R.
(a) ∫ sin−1 x
(b) ∫ cos−1 x
A360. As in the above example, we’ll use the (1) Times One Trick; (2) IBP; and the (3)
Substitution Rule:
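(a) $\int \sin^{-1}x \overset{1}{=} \int 1 \cdot \sin^{-1}x\,dx \overset{2}{=} x\sin^{-1}x - \int x \cdot \frac{1}{\sqrt{1-x^2}} \overset{3}{=} x\sin^{-1}x + \sqrt{1-x^2} + C.$
(b) $\int \cos^{-1}x \overset{1}{=} \int 1 \cdot \cos^{-1}x\,dx \overset{2}{=} x\cos^{-1}x - \int x \cdot \frac{-1}{\sqrt{1-x^2}} \overset{3}{=} x\cos^{-1}x - \sqrt{1-x^2} + C.$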
Suppose f is an invertible function and we’ve already figured out what $\int f^{-1}$ is. Then we
have the following lovely formula340 for $\int f$:
\[
\int f(x) = xf(x) - \left.\int f^{-1}(y)\,\right|_{y=f(x)}.
\]
Proof. We’ll use the (1) Times One Trick; (2) IBP; and the (3) Substitution Rule again.
In addition, we’ll use (4) $x = f^{-1}(f(x))$:
\[
\int f(x) \overset{1}{=} \int 1 \cdot f(x) \overset{2}{=} xf(x) - \int xf'(x) \overset{4}{=} xf(x) - \int \left[f^{-1}(f(x)) \cdot f'(x)\right] \overset{3}{=} xf(x) - \left.\int f^{-1}(y)\,\right|_{y=f(x)}.
\]
Reassuringly, this is the same as what we found earlier in Example 1046. (Of course, in
Example 1046, we were secretly using a few of the tricks that we also just used in the
proof of Proposition 12.)
340
Somewhat surprisingly, according to Wikipedia, this formula was first published only in 1905.
Example 1072. We know that ∫ tan x = ln ∣sec x∣ + C.
And so by Proposition 12, we have:
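Note that we can remove the absolute value sign because $\tan^{-1}x \in \left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$ and thus
$\sec\left(\tan^{-1}x\right) > 0$.
\[
\int \tan^{-1}x \overset{2}{=} x\tan^{-1}x - \frac{1}{2}\ln\left(1+x^2\right) + \hat{C}.
\]
\[
\sec\left(\tan^{-1}x\right) = \frac{1}{\cos\left(\tan^{-1}x\right)} = \sqrt{1+x^2}.
\]
\[
\frac{1}{2}\ln\left(1+x^2\right) = \ln\sqrt{1+x^2} = \ln\left[\sec\left(\tan^{-1}x\right)\right].
\]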
As mentioned in Example 266, the function k is invertible. That is, the function k −1 ∶
R → R exists. However, it is impossible to write down k −1 (x) as an algebraic expression.
Nonetheless, with the aid of Proposition 12, it is actually possible to write down the
antiderivative of k −1 in terms of k −1 :
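\[
\int k^{-1}(x) = xk^{-1}(x) - \left.\int k(y)\,\right|_{y=k^{-1}(x)} + C = xk^{-1}(x) - \left[\frac{1}{6}y^6 + \frac{1}{2}y^2\right]_{y=k^{-1}(x)} + C = xk^{-1}(x) - \frac{1}{6}\left[k^{-1}(x)\right]^6 - \frac{1}{2}\left[k^{-1}(x)\right]^2 + C.
\]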
Exercise 361. Use Proposition 12 to find (a) ∫ sin−1 ; and (b) ∫ cos−1 . Explain, using
Lemma 1, why your answers here are consistent with those for Exercise 360. (Answer on
p. 872.)
Observe that the integrand a1 + a2 + a3 is the sum of three terms. And so by the Sum Rule,
our integral may be written as the sum of three integrals:
\[
\int a_1 + a_2 + a_3\,dx = \int a_1\,dx + \int a_2\,dx + \int a_3\,dx.
\]
More generally:
\[
\int a_1 + a_2 + \dots + a_n\,dx = \int a_1\,dx + \int a_2\,dx + \dots + \int a_n\,dx.
\]
\[
\int a_1 + a_2 + a_3 + \dots\,dx = \int a_1\,dx + \int a_2\,dx + \int a_3\,dx + \dots?
\]
It turns out that, “No, we cannot always do so.” See Exercise 363 for a counterexample.
It turns out that interchanging $\sum_{i=1}^{\infty}$ and $\int$ is valid only under certain technical conditions.341
Nonetheless, in H2 Maths, we shall simply and blithely assume that these conditions are
usually (or even always) met, so that we can interchange $\sum_{i=1}^{\infty}$ and $\int$ pretty much whenever
we like (even when we don’t actually know if this is valid).342
For two more examples, see Exercises 583(iii) (N2014/I/8) and 599(ii)(a) (N2011/I/4).
Example 1076. Find the area bounded by the curve y = x2 and the lines y = 1 and y = 2.
It’s always helpful to make a quick sketch:
Figure to be
inserted here.
Example 1077. Find the area bounded by the curve $y \overset{1}{=} x^2$ and the line $y \overset{2}{=} x + 1$.
Figure to be
inserted here.
Our sketch suggests that we find the intersection points of the line and the curve. To do
so, combine $\overset{1}{=}$ and $\overset{2}{=}$:
\[
x^2 - x - 1 = 0.
\]
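\[
\begin{aligned}
&= \left[\frac{\left(1+\sqrt{5}\right)^2}{2^3} + \frac{1+\sqrt{5}}{2} - \frac{\left(1+\sqrt{5}\right)^3}{3 \cdot 2^3}\right] - \left[\frac{\left(1-\sqrt{5}\right)^2}{2^3} + \frac{1-\sqrt{5}}{2} - \frac{\left(1-\sqrt{5}\right)^3}{3 \cdot 2^3}\right] \\
&= \left[\frac{6+2\sqrt{5}}{8} + \frac{1+\sqrt{5}}{2} - \frac{16+8\sqrt{5}}{24}\right] - \left[\frac{6-2\sqrt{5}}{8} + \frac{1-\sqrt{5}}{2} - \frac{16-8\sqrt{5}}{24}\right] \\
&= \left[\frac{3+\sqrt{5}}{4} + \frac{1+\sqrt{5}}{2} - \frac{2+\sqrt{5}}{3}\right] - \left[\frac{3-\sqrt{5}}{4} + \frac{1-\sqrt{5}}{2} - \frac{2-\sqrt{5}}{3}\right] = \frac{7+5\sqrt{5}}{12} - \frac{7-5\sqrt{5}}{12} = \frac{5\sqrt{5}}{6}.
\end{aligned}
\]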
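The exact value 5√5/6 ≈ 1.8634 can be confirmed numerically. The sketch below integrates the gap between the line y = x + 1 and the curve y = x² between the two intersection points (the roots of x² − x − 1 = 0); the helper name `simpson` and the number of subintervals are assumptions made for illustration.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Intersection points of y = x^2 and y = x + 1
left = (1 - math.sqrt(5)) / 2
right = (1 + math.sqrt(5)) / 2

# On this interval the line lies above the curve, so integrate (x + 1) - x^2
area = simpson(lambda x: (x + 1) - x * x, left, right)
print(area, 5 * math.sqrt(5) / 6)
```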
Exercise 365. Find the exact area bounded by the curve y = sin x and the line y = 0.5,
for x ∈ (0, π/2). (Answer on p. 1556.)
Figure to be
inserted here.
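Our sketch suggests that we find the intersection points of the line and the curve. To do
so, combine $\overset{1}{=}$ and $\overset{2}{=}$:
\[
2x^2 - 2x - 2 = 0.
\]
\[
= 2\left[x - \frac{x^3}{3} + \frac{x^2}{2}\right]_{0.5\left(1-\sqrt{5}\right)}^{0.5\left(1+\sqrt{5}\right)} = \frac{5\sqrt{5}}{3}.
\]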
Exercise 366. Find the exact area bounded by the curves y = 2 − x² and y = x² + 1. (Answer
on p. 1556.)
Figure to be
inserted here.
As stated earlier, the definite integral gives us the signed area. So if the curve is under
the x-axis (as is the case here), then the computed area will be negative:
\[
\int_{-2}^{2} x^2 - 4\,dx = \left[\frac{x^3}{3} - 4x\right]_{-2}^{2} = \left(\frac{8}{3} - 8\right) - \left(\frac{-8}{3} + 8\right) = \frac{-32}{3}.
\]
But of course, an area is simply a magnitude, so we’ll take the absolute value and conclude
that the requested area is $\frac{32}{3}$.
Exercise 367. Find the exact area bounded by the curve y = x⁴ − 16 and the x-axis. (Answer on p.
1557.)
Example 1081. Consider the line y = 1. Rotate it about the x-axis to form an (infinite)
3D cylinder. Now consider the finite portion of the cylinder between x = 1 and x = 2. By
a primary school formula, its volume is Base Area × Height = π ⋅ 1² × (2 − 1) = π.
Figure to be
inserted here.
We can also compute this same volume using integration. The intuition is that we’re
adding up infinitely many infinitely thin circle-shaped slices, laid on their sides, from
x = 1 to x = 2 (left to right). The face of each of these circles has area πy². In this
particular example, y is constant (simply 1). Thus, the total volume is
\[
\int_1^2 \pi y^2\,dx = \int_1^2 \pi\,dx = [\pi x]_1^2 = \pi.
\]
Figure to be
inserted here.
We can also compute this same volume using integration. Again, the intuition is that
we’re adding up infinitely many infinitely thin circle shaped slices, from x = 0 to x = 2.
Again, the face of each of these circles has area πy². In this particular example, y = 3x.
Thus, the total volume is
\[
\int_0^2 \pi y^2\,dx = \int_0^2 \pi(3x)^2\,dx = 9\pi\left[\frac{x^3}{3}\right]_0^2 = 24\pi.
\]
Now consider instead the finite portion of the cone between x = 3 and x = 5. This looks
like a pedestal tilted sideways (not illustrated). We can easily compute its volume using
integration:
\[
\int_3^5 \pi y^2\,dx = \int_3^5 \pi(3x)^2\,dx = 9\pi\left[\frac{x^3}{3}\right]_3^5 = 294\pi.
\]
Computing its volume using geometric formulae is possible, if slightly more tedious. The
volume of the finite portion of the cone between x = 0 and x = 3 is $V_1 = \frac{1}{3}\pi r^2 h = \frac{1}{3}\pi\,9^2 \times 3 = 81\pi$. The
volume of the finite portion of the cone between x = 0 and x = 5 is $V_2 = \frac{1}{3}\pi r^2 h = \frac{1}{3}\pi\,15^2 \times 5 = 375\pi$.
Hence, the desired volume is $V = V_2 - V_1 = 375\pi - 81\pi = 294\pi$.
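The idea of summing thin circular slices translates directly into a numerical integral. The sketch below is illustrative: the helper names `simpson` and `volume_about_x_axis` are assumptions made here. It recomputes the two cone volumes for y = 3x and compares them with 24π and 294π.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def volume_about_x_axis(radius, a, b):
    """Add up slices of face area pi * radius(x)^2 from x = a to x = b."""
    return simpson(lambda x: math.pi * radius(x) ** 2, a, b)

def cone_radius(x):
    return 3 * x  # the line y = 3x

v_small = volume_about_x_axis(cone_radius, 0.0, 2.0)
v_pedestal = volume_about_x_axis(cone_radius, 3.0, 5.0)
print(v_small / math.pi, v_pedestal / math.pi)  # compare with 24 and 294
```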
Example 1083. Consider the curve y = x². Find its volume of rotation about the y-axis,
from y = 0 to y = 5.
Figure to be
inserted here.
In this case, there are no familiar geometric formulae we can apply. So we really just have
to compute this volume using integration. Again, the intuition is that we’re adding
up infinitely many infinitely thin circle-shaped slices, but this time these circle-shaped
slices are stacked from bottom to top, from y = 0 to y = 5. The face of each of these
circles has area πx², where in this particular example, x² = y. Thus, the total volume is
\[
\int_0^5 \pi x^2\,dy = \int_0^5 \pi y\,dy = \pi\left[\frac{y^2}{2}\right]_0^5 = 12.5\pi.
\]
Exercise 369. Compute the volume of rotation of y = sin x about the x-axis from x = 0
to x = π. (Answer on p. 1557.)
Figure to be
inserted here.
\[
\ln x = \int_1^x \frac{1}{t}\,dt.
\]
Remark 127. In the above equation, students are often confused by the presence of the
two variables x and t. Both are dummy variables that could be replaced by any other
symbol. So for example, the following three equations are exactly equivalent:
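\[
\ln x = \int_1^x \frac{1}{t}\,dt, \qquad \ln \star = \int_1^{\star} \frac{1}{\circ}\,d\circ, \qquad \ln \text{☺} = \int_1^{\text{☺}} \frac{1}{\blacksquare}\,d\blacksquare.
\]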
However, we must be careful not to mix up x and t. The dummy variable x is used to
tell us about the mapping rule of ln, while the dummy variable t is used to tell us about
the definite integral.
With the above definition and the FTC1, the following result is immediate:
Fact 157. The derivative of ln ∶ R+ → R is the function f ∶ R+ → R defined by $f(x) = \frac{1}{x}$.
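To make the integral definition of the natural logarithm concrete, one can approximate the integral of 1/t from 1 to x numerically and compare it with a library logarithm. This is only a sketch: the names `simpson` and `ln_by_integral` and the sample inputs are assumptions made here for illustration.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def ln_by_integral(x):
    """Approximate ln x as the definite integral of 1/t from t = 1 to t = x (x > 0)."""
    return simpson(lambda t: 1 / t, 1.0, x)

for x in (0.5, 1.0, 2.0, 10.0):
    print(x, ln_by_integral(x), math.log(x))
```

Note that for 0 < x < 1 the upper limit is below the lower limit, so the integral (and hence ln x) comes out negative, and for x = 1 the integral is over an interval of zero width, giving ln 1 = 0.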
With the above definition, it is not difficult to prove some basic properties of the natural
logarithm function:
(a) $\ln 1 = 0$.
(b) $\ln(xy) = \ln x + \ln y$.
(c) $\ln\frac{1}{x} = -\ln x$.
(d) $\ln\frac{x}{y} = \ln x - \ln y$.
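Proof. (a) Simply plug x = 1 into Definition 184, then apply Definition 181(a):
\[
\ln 1 = \int_1^1 \frac{1}{x}\,dx = 0.
\]
(b) We will play a little trick. Differentiate both sides with respect to x to get:
\[
\frac{d}{dx}\ln(xy) \overset{1}{=} \frac{1}{xy}\left(y + x\frac{dy}{dx}\right) = \frac{1}{x} + \frac{1}{y}\frac{dy}{dx} \quad\text{and}\quad \frac{d}{dx}(\ln x + \ln y) \overset{2}{=} \frac{1}{x} + \frac{1}{y}\frac{dy}{dx}.
\]
Or equivalently:
\[
\int \left(\frac{1}{x} + \frac{1}{y}\frac{dy}{dx}\right) \overset{1}{=} \ln(xy) \quad\text{and}\quad \int \left(\frac{1}{x} + \frac{1}{y}\frac{dy}{dx}\right) \overset{2}{=} \ln x + \ln y.
\]
\[
\ln(xy) \overset{3}{=} \ln x + \ln y + C.
\]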
Exercise 370. Prove Fact 158(c) (use the same trick as in the proof of Fact 158(b)).
(Answer on p. 1558.)
$$x = \log_b n \iff b^x = n.$$
Now that we have formally defined the natural logarithm, let us now give a more formal
definition of logarithms that replaces the above definition:
Definition 185. Let $b, n > 0$ with $b \neq 1$. Then the base $b$ logarithm of $n$ is denoted $\log_b n$ and is defined to be the following number:
$$\log_b n = \frac{\ln n}{\ln b}.$$
In Ch. 5.6, we gave informal proofs of the Laws of Logarithms. With the above Definitions
and results, we can now prove these rigorously and indeed more easily:
Proof. Below, $\overset{\star}{=}$ indicates the use of Definition 185.
(a) $\log_b 1 \overset{\star}{=} \frac{\ln 1}{\ln b} = 0$.
(b) $\log_b b \overset{\star}{=} \frac{\ln b}{\ln b} = 1$.
(c) $\log_b b^x \overset{\star}{=} \frac{\ln b^x}{\ln b} = \frac{x \ln b}{\ln b} = x$. (The middle step uses Fact 159.)
(d) $\log_b a \overset{\star}{=} \frac{\ln a}{\ln b}$. Rearranging: $(\log_b a) \ln b = \ln a$.
By Fact 159: $\ln b^{\log_b a} = \ln a$.
Now apply exp: $\exp\left(\ln b^{\log_b a}\right) \overset{1}{=} \exp(\ln a)$. Since exp is, by definition, the inverse of ln, $\overset{1}{=}$ becomes $b^{\log_b a} = a$.
(e)–(j) See Exercise 371.
(e)–(j) See Exercise 371.
Definition 59. The exponential function, denoted exp, is the inverse of the natural
logarithm function.
(a) exp 0 = 1.
(b) exp 1 = e.
(c) exp (x + y) = (exp x) (exp y).
(d) $\exp(-x) = \frac{1}{\exp x}$.
(e) $\exp(x - y) = \frac{\exp x}{\exp y}$.
(d) $[\exp(-x)](\exp x) \overset{(c)}{=} \exp(-x + x) = \exp 0 \overset{(a)}{=} 1$. Rearranging (note that $\exp x > 0$ for all $x \in \mathbb{R}$), we have:
$$\exp(-x) = \frac{1}{\exp x}.$$
We can now also easily prove that the derivative of the exponential function is itself :
Proof. By the Chain Rule, $[\ln(\exp x)]' \overset{1}{=} \frac{1}{\exp x} \exp' x$.
But observing that $\ln(\exp x) = x$, we also have $[\ln(\exp x)]' \overset{2}{=} 1$.
Putting $\overset{1}{=}$ and $\overset{2}{=}$ together and rearranging yields the result.
The exponential function (and constant multiples thereof) are the only functions
that are their own derivative. Formally:
Example 1085. Use your TI84 to find the approximate area bounded by the curve
$y = e^{\sin x}$ and the horizontal axis, between $x = 1$ and $x = 2$.
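$$\frac{dy}{dx} \overset{1}{=} f(x) \iff y \overset{2}{=} \int f(x) \, dx.$$
Going from right to left ($\overset{2}{=}$ to $\overset{1}{=}$), we apply the differentiation operator $\frac{d}{dx}$.
Going from left to right ($\overset{1}{=}$ to $\overset{2}{=}$), we apply the antidifferentiation operator $\int \, dx$.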
Example 1086. Solve the differential equation $\frac{dy}{dx} \overset{1}{=} x^2$.
Apply the antidifferentiation operator $\int \, dx$:
$$y \overset{2}{=} \int x^2 \, dx = \frac{x^3}{3} + C.$$
We call $\overset{2}{=}$ the general solution to the given differential equation $\overset{1}{=}$. It is general because the constant of integration $C$ is free to vary, so that there are many possible solutions for $y$.
Now, suppose we are told also that:
$$(x, y) \overset{3}{=} (0, 1).$$
This additional piece of information $\overset{3}{=}$ is often called an initial condition. Here’s why.
It might be that $y$ is the number of bats in a cave and $x$ is time. Then the initial condition tells us that at time $x = 0$ (i.e. “initially”), the number of bats in the cave is $y = 1$. And over time, the number of bats in the cave changes according to the differential equation $\overset{1}{=}$.
Plugging the initial condition $\overset{3}{=}$ into the general solution $\overset{2}{=}$:
$$1 = \frac{0^3}{3} + C = C.$$
Hence, $C = 1$. And so:
$$y \overset{4}{=} \frac{x^3}{3} + 1.$$
We call $\overset{4}{=}$ the particular solution to the differential equation $\frac{dy}{dx} \overset{1}{=} x^2$ with initial condition $(x, y) \overset{3}{=} (0, 1)$.
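A particular solution can always be checked by substituting it back. The sketch below does this numerically for $y = \frac{x^3}{3} + 1$: it confirms that $y = 1$ at $x = 0$ and that a finite-difference estimate of $\frac{dy}{dx}$ matches $x^2$ at a few sample points. The helper `derivative`, the step size, and the sample points are illustrative assumptions.

```python
def y(x):
    """Candidate particular solution y = x^3/3 + 1."""
    return x**3 / 3 + 1

def derivative(f, x, h=1e-5):
    """Central-difference estimate of f'(x) (hypothetical helper)."""
    return (f(x + h) - f(x - h)) / (2 * h)

sample_points = (-2.0, -0.5, 0.0, 1.0, 3.0)
initial_ok = (y(0) == 1)                                   # initial condition
ode_ok = all(abs(derivative(y, x) - x**2) < 1e-6 for x in sample_points)
print(initial_ok, ode_ok)
```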
3
For future reference, here is the formal result that justifies the procedure used in the above
examples:
Fact 162. The general solution to the differential equation $\frac{dy}{dx} = f(x)$ is:
$$y = \int f(x) \, dx.$$
Exercise 372. Find the general solution of $\frac{dy}{dx} = e^x \sin x$. Find also the particular solution, if given also the initial condition $x = 0 \implies y = 1$. (Answer on p. 1559.)
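Rearrange: $\frac{dx}{dy} = \frac{1}{y^2}$.
Apply $\int \, dy$:
$$x = \int \frac{1}{y^2} \, dy \overset{2}{=} -\frac{1}{y} + C \quad\text{or}\quad y \overset{3}{=} \frac{1}{C - x}.$$
We are now also given the initial condition $(x, y) \overset{4}{=} (0, 1)$. Plugging $\overset{4}{=}$ into $\overset{3}{=}$ (or $\overset{2}{=}$):
$$1 = \frac{1}{C - 0} \iff C = 1.$$
Thus, the particular solution to $\frac{dy}{dx} \overset{1}{=} y^2$ ($y \neq 0$) with initial condition $(x, y) \overset{4}{=} (0, 1)$ is:
$$y \overset{5}{=} \frac{1}{1 - x}.$$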
$$x = -\ln(\operatorname{cosec} y + \cot y) + C$$
That is, for each given value of $x$, there are infinitely many possible values of $y$ (one for each integer $m$).
But now suppose we have the initial condition $x = 3 \implies y = \frac{\pi}{2}$. In this case, we have
$$3 = -\ln\left|\operatorname{cosec}\frac{\pi}{2} + \cot\frac{\pi}{2}\right| + C = -\ln|1| + C = C,$$
so that $C = 3$. We may write $y = 2\left(\cot^{-1} e^{3-x} + 2m\pi\right)$. Moreover, plugging in the same
values for $x$ and $y$, we see that
We now justify why the procedure used in the above examples works:
So, suppose we’re given the following differential equation:
$$\frac{dy}{dx} = f(y) \quad\text{with } f(y) \neq 0.$$
By the Inverse Function Theorem:
$$\frac{dx}{dy} = \frac{1}{\frac{dy}{dx}} = \frac{1}{f(y)}.$$
Fact 163. The general solution to the differential equation $\frac{dy}{dx} = f(y)$ with $f(y) \neq 0$ is:
$$x = \int \frac{1}{f(y)} \, dy.$$
Exercise 373. Find the general solution of $\frac{dy}{dx} = y^2 + 1$. Find also the particular solution, given also the initial condition $x = 0 \implies y = 1$. (Answer on p. 1559.)
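$$\int \frac{d^2y}{dx^2} \, dx = \int f(x) \, dx \quad\text{or}\quad \frac{dy}{dx} = \int f(x) \, dx.$$
$$\int \frac{dy}{dx} \, dx = \int \left( \int f(x) \, dx \right) dx \quad\text{or}\quad y \overset{2}{=} \int \left( \int f(x) \, dx \right) dx.$$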
In summary:
Fact 164. The general solution to the differential equation $\frac{d^2y}{dx^2} = f(x)$ is:
$$y = \int \left( \int f(x) \, dx \right) dx.$$
343
Strictly speaking, the parentheses around the inner integral are not necessary.
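Example 1090. Solve $\frac{d^2y}{dx^2} \overset{1}{=} x^2$.
Apply the antidifferentiation operator $\int \, dx$ once:
$$\int \frac{d^2y}{dx^2} \, dx = \frac{dy}{dx} = \int x^2 \, dx = \frac{x^3}{3} + C_1.$$
Apply it a second time:
$$\int \frac{dy}{dx} \, dx = y \overset{2}{=} \int \left( \frac{x^3}{3} + C_1 \right) dx = \frac{x^4}{12} + C_1 x + C_2.$$
We are now also given the initial conditions $(x, y) \overset{3}{=} (0, 1)$ and $(x, y) \overset{4}{=} (1, 2)$. Plug $\overset{3}{=}$ and $\overset{4}{=}$ into $\overset{2}{=}$ to get:
$$1 = 0 + 0 + C_2 \quad\text{and}\quad 2 = \frac{1}{12} + C_1 + C_2.$$
Solving the above system of equations, we have $C_2 = 1$ and $C_1 = 11/12$.
Thus, the particular solution to the differential equation $\frac{d^2y}{dx^2} \overset{1}{=} x^2$ with initial conditions $(x, y) \overset{3}{=} (0, 1)$ and $(x, y) \overset{4}{=} (1, 2)$ is:
$$y = \frac{x^4}{12} + \frac{11}{12} x + 1.$$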
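As a sanity check on the two constants, the following sketch substitutes $y = \frac{x^4}{12} + \frac{11}{12}x + 1$ back into the problem: it tests the two given points and compares a finite-difference estimate of the second derivative against $x^2$. The helper `second_derivative`, the step size, and the tolerances are assumptions chosen for illustration.

```python
def y(x):
    """Candidate particular solution y = x^4/12 + (11/12) x + 1."""
    return x**4 / 12 + 11 * x / 12 + 1

def second_derivative(f, x, h=1e-3):
    """Central-difference estimate of f''(x) (hypothetical helper)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

conditions_ok = abs(y(0) - 1) < 1e-12 and abs(y(1) - 2) < 1e-12
ode_ok = all(abs(second_derivative(y, x) - x**2) < 1e-5
             for x in (-1.0, 0.0, 0.5, 2.0))
print(conditions_ok, ode_ok)
```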
Example 1091. Solve $\frac{d^2y}{dx^2} = \sin x$.
First, $\frac{dy}{dx} = \int \sin x \, dx = -\cos x + C_1$. Next, $y = \int (-\cos x + C_1) \, dx = -\sin x + C_1 x + C_2$. This is
the general solution to the given differential equation.
If given the additional pieces of information that $x = 0 \implies y = 1$ and $x = \pi \implies y = 2$,
then we have
$$1 = -\sin 0 + 0 \cdot C_1 + C_2 \implies C_2 = 1,$$
$$2 = -\sin \pi + \pi C_1 + 1 \implies C_1 = \frac{1}{\pi}.$$
Hence $y = -\sin x + \frac{1}{\pi} x + 1$ is the particular solution.
Exercise 374. Find the general solution of $\frac{d^2y}{dx^2} = e^x \sin x$. Find also the particular
solution, given also that $x = 0 \implies y = 1$. (Answer on p. 1560.)
Example 1092. A plate of bacteria grows at a rate that is inversely proportional to the
number of bacteria. Express the number of bacteria as a function of time.
Let $x$ be the number of bacteria. Let $t$ be time. We are given that $x$ grows at a rate inversely proportional to $x$ itself. In other words, $\frac{dx}{dt} = \frac{k}{x}$, for some constant $k \in \mathbb{R}$. Rearranging, we have $\frac{dt}{dx} = \frac{x}{k}$. Thus,
$$t = \int \frac{x}{k} \, dx = \frac{x^2}{2k} + C.$$
Further rearranging, we have $x = \pm\sqrt{2k(t - C)}$, where of course the negative root may be rejected. Hence, $x = \sqrt{2k(t - C)}$.
Suppose we are also given that $t = 0 \implies x = 1$ and $t = 1 \implies x = 2$. Then we have
$$1 \overset{a}{=} \sqrt{2k(-C)} \quad\text{and}\quad 2 \overset{b}{=} \sqrt{2k(1 - C)}.$$
From $\overset{a}{=}$, we have $C = -1/(2k)$. Plug this into $\overset{b}{=}$ and we have $4 = 2k(1 + 1/(2k)) = 2k + 1$ or $k = 3/2$. Hence, $x = \sqrt{3t + 1}$.
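Substituting the constants back into the solution gives $x = \sqrt{3t + 1}$ (this closed form is worked out here rather than quoted from the text). The sketch below checks it numerically: the two given data points hold, and the product $x \cdot \frac{dx}{dt}$ stays constant over time, which is exactly what "growth rate inversely proportional to $x$" means. The helper `derivative` and the sample times are illustrative assumptions.

```python
import math

def x(t):
    """Number of bacteria at time t, assuming x = sqrt(3t + 1)."""
    return math.sqrt(3 * t + 1)

def derivative(f, t, h=1e-6):
    """Central-difference estimate of f'(t) (hypothetical helper)."""
    return (f(t + h) - f(t - h)) / (2 * h)

# If dx/dt = k / x, then x * dx/dt should equal the same constant k at every t.
products = [x(t) * derivative(x, t) for t in (0.0, 0.5, 1.0, 2.0, 5.0)]
print(x(0), x(1), products)
```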
Let $v_s$ be the velocity at which the ball hits the surface of the Earth.
(d) (i) Show that the LHS of the above equation is equal to $GM\left(-\frac{1}{R} + \frac{1}{R + x}\right)$.
(ii) Show that the RHS of the above equation is equal to $-\frac{v_s^2}{2}$. (Hint 1: Use integration by substitution. Hint 2: What is $\frac{dr}{dt}$?)
(iii) Hence show that $v_s = -\sqrt{2GM\left(\frac{1}{R} - \frac{1}{R + x}\right)}$. Again, explain why $v_s$ is negative.
Suppose instead that the small ball is initially at rest on the surface of the earth. It is
then propelled upwards at a velocity V .
Probability and Statistics takes up 60 (out of 200) points and hence 30% of your A-Level
exam.
— Willy Wonka (in Charlie and the Great Glass Elevator, 1972).
Example 1093. For lunch today, I can either go to the food court or the hawker centre.
At the food court, I have 2 choices: ramen or briyani. At the hawker centre, I have 3
choices: bak chor mee, nasi lemak, or kway teow.
Altogether then, I have 2 + 3 = 5 choices of what to eat for lunch today.
(Just so you know, the AP is sometimes also called the Second Principle of Counting
or the Rule of Sum or the Disjunctive Rule.)
Of course, the AP generalises to cases where there are more than just 2 “areas”. It may
seem a little silly, but just to illustrate, let’s use the AP to tackle the CAT problem:
344
See section 122.1 in the Appendices (optional) for a more precise statement of the AP.
Example 1094. Problem: How many permutations are there of the letters in the word
CAT?
We can divide the possibilities into three cases:
Case #1. First letter is an A. Then the next two letters are either CT or TC — 2
possibilities.
Case #2. First letter is a C. Then the next two letters are either AT or TA — 2
possibilities.
Case #3. First letter is a T. Then the next two letters are either AC or CA — 2
possibilities.
Altogether then, by the AP, there are 2 + 2 + 2 = 6 possibilities. That is, there are 6
possible permutations of the letters in CAT. These are illustrated in the tree diagram
below.
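The case-by-case count above can be reproduced with a short sketch that lists every arrangement of CAT and groups them by first letter, one group per case. The use of `itertools.permutations` and the variable names are illustrative choices, not the textbook's method.

```python
from itertools import permutations

word = "CAT"

# Group every arrangement of the letters by its first letter (one group per case).
cases = {}
for p in permutations(word):
    cases.setdefault(p[0], []).append("".join(p))

for first in sorted(cases):
    print("First letter", first, "->", cases[first])

# Addition Principle: the cases do not overlap, so the total is the sum of the case sizes.
total = sum(len(arrangements) for arrangements in cases.values())
print(total)
```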
Exercise 377. How many permutations are there of the letters in the word DEED?
Illustrate your answer with a tree diagram similar to that given in the CAT example
above. (Answer on p. 1563.)
Example 1095. For lunch today, I can either have prata or horfun. For dinner tonight,
I can have McDonald’s, KFC, or Pizza Hut.
Enumeration shows that I have a total of 6 possible choices for my two meals today:
Alternatively, we can use the Multiplication Principle (MP). I have 2 choices for lunch
and 3 choices for dinner. Hence, for my two meals today, I have in total 2 × 3 = 6 possible
choices.
345
See section 122.1 in the Appendices (optional) for a more precise statement of the MP.
Example 1096. For breakfast tomorrow, I can have shark’s fin or bird’s nest (2 choices).
For lunch, I can have black pepper crab or curry fishhead (2 choices). For dinner, I can
have an apple, a banana, or a carrot (3 choices). By the MP, for tomorrow’s meals, I have
a total of 2 × 2 × 3 = 12 possible choices. We can enumerate these (I’ll use abbreviations):
(SF, BPC, A), (SF, BPC, B), (SF, BPC, C), (SF, CF, A),
(SF, CF, B), (SF, CF, C), (BN, BPC, A), (BN, BPC, B),
(BN, BPC, C), (BN, CF, A), (BN, CF, B), (BN, CF, C).
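The same enumeration can be generated mechanically. In this sketch, `itertools.product` forms every combination of one breakfast, one lunch, and one dinner; the list names are illustrative and the abbreviations are those used above.

```python
from itertools import product

breakfast = ["SF", "BN"]       # shark's fin, bird's nest
lunch = ["BPC", "CF"]          # black pepper crab, curry fishhead
dinner = ["A", "B", "C"]       # apple, banana, carrot

# Multiplication Principle: one independent choice per meal.
menus = list(product(breakfast, lunch, dinner))
for menu in menus:
    print(menu)
print(len(menus), len(breakfast) * len(lunch) * len(dinner))
```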
Example 1097. Problem: How many four-letter words can be formed using the letters
in the 26-letter alphabet?
Let’s rephrase this problem so that it is clearly in the framework of the MP. We have 4
blank spaces to be filled:
_ _ _ _.
1 2 3 4
These 4 blank spaces correspond to 4 decisions to be made. Decision #1: What letter
to put in the first blank space? Decision #2: What letter to put in the second blank
space? Decision #3: What letter to put in the third blank space? Decision #4: What
letter to put in the fourth blank space?
How many choices have we for each decision?
For Decision #1, we can put A, B, C, ..., or Z. So we have 26 choices for Decision #1.
For Decision #2, we can again put A, B, C, ..., or Z. So we again have 26 choices for
Decision #2.
We likewise have 26 choices for Decision #3 and also 26 choices for Decision #4.
Altogether then, by the MP, there are $26 \times 26 \times 26 \times 26 = 26^4 = 456{,}976$ ways to make our
four decisions.
Solution: There are $26^4 = 456{,}976$ possible four-letter words that can be formed using the
26-letter alphabet.
_ _.
1 2
These 2 blank spaces correspond to 2 decisions to be made. Decision #1: What number
to put in the first blank space? Decision #2: What letter to put in the second blank
space?
Again we ask: How many choices have we for each decision?
For Decision #1, we can put 1, 2, 3, ..., or 18. So we have 18 choices for Decision #1.
For Decision #2, we can put A, B, C, D, E, or F. So we have 6 choices for Decision #2.
Altogether then, by the MP, there are 18 × 6 = 108 ways to make our two decisions. In
other words, there are 108 possible outcomes from rolling these two dice.
(If necessary, it is tedious but not difficult to enumerate them: 1A, 1B, 1C, 1D, 1E, 1F,
2A, 2B, ..., 17E, 17F, 18A, 18B, 18C, 18D, 18E, and 18F.)
Exercise 378. A club has a shortlist of 3 men for president, 5 animals for vice-president,
and 10 women for club mascot. How many possible ways are there to choose the president,
the vice-president, and the mascot? (Answer on p. 1564.)
Exercise 379. (Answer on p. 1564.) The highly-stimulating game of 4D consists of
selecting a four-digit number, between 0000 and 9999 (so there are 10,000 possible numbers).
Your mother tells you to go to the nearest gambling den (also known as a Singapore Pools
outlet) to buy any three numbers, subject to these two conditions:
• The four digits in each number are distinct.
• Each four-digit number is distinct.
How many possible ways are there to fulfil your mother’s request?
Example 1099. For lunch today, I can either go to the food court or the hawker centre.
At the food court, I have 4 choices of cuisine: Chinese, Indian, Malay, and Western. At
the hawker centre, I have 3 choices of cuisine: Chinese, Malay, and Thai.
There are 2 choices of cuisine that are common to both the food court and the hawker
centre (Chinese and Malay).
And so by the Inclusion-Exclusion Principle (IEP), I have in total 4 + 3 − 2 = 5 choices of
cuisine. The Venn diagram below illustrates.
Why do we subtract 2? If we simply added the 4 choices available at the food court
to the 3 available at the hawker centre, then we’d double-count the Chinese and Malay
cuisines, which are available at both the food court and the hawker centre. And so we
must subtract the 2 cuisines that are at both locations.
Example 1100. Problem: How many integers between 1 and 20 are divisible by 2 or 5?
There are 10 integers divisible by 2, namely 2, 4, 6, 8, 10, 12, 14, 16, 18, and 20.
There are 4 integers divisible by 5, namely 5, 10, 15, and 20.
There are 2 integers divisible by BOTH 2 and 5, namely 10 and 20.
Hence, by the IEP, there are 10 + 4 − 2 = 12 integers that are divisible by either 2 or 5.
(These are namely 2, 4, 5, 6, 8, 10, 12, 14, 15, 16, 18, and 20.)
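The count can be confirmed with Python sets, where the union plays the role of "divisible by 2 or 5" and the intersection is the double-counted part. The variable names below are illustrative.

```python
numbers = range(1, 21)
div2 = {n for n in numbers if n % 2 == 0}
div5 = {n for n in numbers if n % 5 == 0}

both = div2 & div5             # counted twice if we simply add
either = div2 | div5           # what we actually want to count

# Inclusion-Exclusion Principle: |A or B| = |A| + |B| - |A and B|
iep_count = len(div2) + len(div5) - len(both)
print(sorted(either), iep_count)
```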
346
See section 122.1 in the Appendices (optional) for a more precise statement of the IEP.
The Inclusion-Exclusion Principle (IEP). I have to choose a destination, out of two
possible areas. At area #1, there are p possible destinations to choose from. At area #2,
there are q possible destinations to choose from. Areas #1 and #2 overlap — they have
r destinations in common.
The IEP simply states that I have, in total, p + q − r different choices.
Exercise 380. (Answer on p. 1565.) The food court has 4 types of cuisine: Chinese, Indonesian,
Korean, and Western. The hawker centre has 3: Chinese, Malay, and Western.
A restaurant has 3: Chinese, Japanese, or Malay.
In total, how many different types of cuisine are there? Illustrate your answer with a
Venn diagram.
Example 1101. The food court has 4 types of cuisine: Chinese, Malay, Indian, and
Other.
I’m at the food court but don’t feel like eating Malay or Chinese. So by the Complements
Principle (CP), I have 4 − 2 = 2 possible choices of cuisine (Indian and Other).
Exercise 381. There are 10 Southeast Asian countries, of which 3 (Brunei, Indonesia,
and the Philippines) are not on the mainland. How many mainland Southeast Asian
countries are there that a European tourist can visit? (Answer on p. 1565.)
347
See section 122.1 in the Appendices (optional) for a more precise statement of the CP.
89. How to Count: Permutations
In this chapter, we’ll use the MP to generate several more methods of counting.
But first, some notation you should find familiar from secondary school:
Example 1103. Problem: How many permutations (or arrangements) are there of the
three letters in the word CAT?
Let’s rephrase this problem in the framework of the MP. Consider three blank spaces:
_ _ _.
1 2 3
These 3 blank spaces correspond to 3 decisions to be made. Decision #1: What letter to
put in the first blank space? Decision #2: What letter to put in the second blank space?
Decision #3: What letter to put in the third blank space?
Again we ask: How many choices have we for each decision?
For Decision #1, we can put C, A, or T. So we have 3 choices for Decision #1.
Having already used up a letter in Decision #1, we are left with two letters. So we have
2 choices for Decision #2.
Having already used up a letter in Decision #1 and another in Decision #2, we are left
with just one letter. So we have only 1 choice for Decision #3.
Altogether then, by the MP, there are 3 × 2 × 1 = 3! = 6 possible ways of making our
decisions. This is also the number of ways there are to arrange the three letters in the
word CAT.
_ _ _ _ _ _ _ _ _ _ _ _ _.
1 2 3 4 5 6 7 8 9 10 11 12 13
These 13 blank spaces correspond to 13 decisions to be made. Decision #1: What letter
to put in the first blank space? Decision #2: What letter to put in the second blank
space? ... Decision #13: What letter to put in the 13th blank space?
Again we ask: How many choices have we for each decision?
First an important note: In the word UNPREDICTABLY, no letter is repeated. (Indeed,
UNPREDICTABLY is the longest “common” English word without any repeated letters.)
For Decision #1, we can put U, N, P, R, E, D, I, C, T, A, B, L, or Y. So we have 13
choices for Decision #1.
For Decision #2, having already used up a letter in Decision #1, we are left with 12
letters. So we have 12 choices for Decision #2.
For Decision #3, having already used up a letter in Decision #1 and another letter in
Decision #2, we are left with 11 letters. So we have 11 choices for Decision #3.
⋮
For Decision #13, having already used up a letter in Decision #1, another in Decision
#2, another in Decision #3, ..., and another in Decision #12, we are left with one letter.
So we have 1 choice for Decision #13.
Altogether then, by the MP, there are $13 \times 12 \times \cdots \times 2 \times 1 = 13! = 6{,}227{,}020{,}800$ possible
ways of making our decisions. This is also the number of ways there are to arrange the
13 letters in the word UNPREDICTABLY.
_ _ _ . . . _.
1 2 3 n
For space #1, we have n possible choices. For space #2, we have n − 1 possible choices
(because one object was already placed in space #1). ... And finally for space #n, we have
only 1 object left and thus only 1 choice. By the MP then, there are n × (n − 1) × ⋅ ⋅ ⋅ × 1 = n!
possible ways of filling in these n spaces with the n distinct objects.
Example 1105. The word COWDUNG has seven distinct letters. Hence, there are
7! = 5040 permutations of the letters in the word COWDUNG.
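For short words, the $n!$ rule can be checked by brute force. The sketch below counts the arrangements produced by `itertools.permutations` and compares the count with `math.factorial`; the helper name `count_permutations` is an illustrative assumption. (Listing all 13! arrangements of UNPREDICTABLY would take far too long, so only the formula is evaluated for it.)

```python
from itertools import permutations
from math import factorial

def count_permutations(word):
    """Count arrangements of a word by listing them all (letters assumed distinct)."""
    return sum(1 for _ in permutations(word))

for word in ("CAT", "COWDUNG"):
    print(word, count_permutations(word), factorial(len(word)))

print("UNPREDICTABLY", factorial(13))
```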
348
This is informal because, amongst other omissions, we haven’t yet given a precise definition of the term
permutation.
89.1. Permutations with Repeated Elements
In the previous section, we saw that there are 3! permutations of the three letters in the
word CAT and 13! permutations of the 13 letters in the word UNPREDICTABLY. We
made an important note: In each of these words, there was no repeated letter.
We now consider permutations of a set where some elements are repeated.
Example 1106. How many permutations are there of the three letters in the word SEE?
A naïve application of the MP would suggest that the answer is 3! = 6. This is wrong.
Enumeration shows that there are only 3 possible permutations:
To see why a naïve application of the MP fails, set up the problem in the framework of
the MP. Consider 3 blank spaces:
_ _ _.
1 2 3
These 3 blank spaces correspond to 3 decisions to be made. Decision #1: What letter
to put in the first blank space? Decision #2: What letter to put in the second blank
space? Decision #3: What letter to put in the third blank space?
Again we ask: How many choices have we for each decision?
For Decision #1, we can put E or S. So we have 2 choices for Decision #1.
But now the number of choices available for Decision #2 depends on what we chose
for Decision #1! (If we chose E in Decision #1, then we again have 2 choices for
Decision #2. But if instead we chose S in Decision #1, then we now have only 1 choice
for Decision #2.) This violates the implicit but important assumption in the MP that
the number of choices available in one decision is independent of the choice made in the
other decision. Hence, the MP does not directly apply.
The reason SEE has only 3 possible permutations (instead of 3! = 6) is that it contains a
repeated element, namely E. But why would this make any difference?
To understand why, let’s rename the second E as Ê, so that the word SEE is now transformed
into a new word SEÊ. From the three letters of this new word, we’d again have
3! = 6 possible permutations:
Hence, when we do not distinguish between the two E’s, there are only half as many
possible permutations.
Example 1107. How many permutations are there of the four letters in the word SASS?
The answer is 4!/3! = 4. Let’s see why.
If we distinguish between the three S’s, perhaps by calling them S, Ŝ, and S̄, then we’d
have 4! = 24 possible permutations of the letters in the word SAŜS̄.
But amongst the three S’s themselves, we have 3! = 6 possible permutations: SŜS̄, SS̄Ŝ,
ŜSS̄, S̄SŜ, ŜS̄S, and S̄ŜS. So distinguishing between the three S’s increases by 6-fold the
number of possible permutations. Working backwards, the word SASS thus has one-sixth
as many permutations as SAŜS̄. That is, SASS has 4!/3! = 4 possible permutations.
The figure below illustrates how the 4 possible permutations of SASS correspond to the
24 possible permutations of SAŜS̄.
x ⋅ 2! ⋅ 2! = 4!
y ⋅ 2! ⋅ 5! = 8!
Fact 166. Consider $n$ objects, only $k$ of which are distinct. Let $r_1, r_2, \ldots,$ and $r_k$ be the
numbers of times the 1st, 2nd, . . . , and $k$th distinct objects appear. (So $r_1 + r_2 + \cdots + r_k = n$.)
Then the number of possible ways to permute these $n$ objects is
$$\frac{n!}{r_1! \, r_2! \cdots r_k!}.$$
More examples:
Example 1110. How many permutations are there of the six letters in the word BANANA?
We have three distinct letters: B, A, and N. The letter B appears 1 time. The letter
A appears 3 times. The letter N appears 2 times. Hence, by the above Fact, the number
of possible permutations of these 6 letters is
$$\frac{6!}{1! \, 3! \, 2!} = 60.$$
Of course, 1! is simply equal to 1. So for the denominator, we shall usually not bother to
write out any 1!. So we will normally instead write that the number of permutations of
BANANA is:
$$\frac{6!}{3! \, 2!} = 60.$$
Example 1111. How many permutations are there of the 11 letters in the word MISSISSIPPI?
We have four distinct letters: M, I, S, and P. The letter M appears 1 time. The letter
I appears 4 times. The letter S appears 4 times. The letter P appears 2 times. Hence,
by the above Fact, the number of possible permutations of these 11 letters is
$$\frac{11!}{4! \, 4! \, 2!} = 34{,}650.$$
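The formula in Fact 166 is easy to evaluate by machine. The sketch below computes $\frac{n!}{r_1! \, r_2! \cdots r_k!}$ from the letter counts of a word and, for the shorter words, cross-checks the result by listing all arrangements and discarding duplicates. The function names are illustrative assumptions; the brute-force check is skipped for MISSISSIPPI because listing 11! arrangements is slow.

```python
from collections import Counter
from itertools import permutations
from math import factorial

def multiset_permutations(word):
    """n! divided by r! for each letter that appears r times (Fact 166)."""
    result = factorial(len(word))
    for r in Counter(word).values():
        result //= factorial(r)
    return result

def brute_force(word):
    """List every arrangement, then keep only the distinct ones."""
    return len(set(permutations(word)))

for word in ("SEE", "SASS", "BANANA"):
    print(word, multiset_permutations(word), brute_force(word))

print("MISSISSIPPI", multiset_permutations("MISSISSIPPI"))
```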
Exercise 383. There are 3 identical white tiles and 4 identical black tiles. How many
ways are there of arranging these 7 tiles in a row? (Answer on p. 1566.)
Example 1112. There are 3! = 6 (linear) permutations of CAT. That is, there are 3! = 6
possible ways to fill them into these 3 linearly-arranged spaces:
___
1 2 3
In contrast, there are only 2! = 2 circular permutations of CAT. That is, there are only
2! = 2 possible ways to fill them into these 3 circularly-arranged spaces:
The three seemingly-different arrangements above are considered to be the same circular
permutation. This is because any arrangement is simply a rotation of another. Take the
left red arrangement, rotate it clockwise by one-third of a circle to get the middle green
arrangement. Repeat the rotation to get the right blue arrangement.
The second and only other circular arrangement of CAT is shown below. Again, these
three seemingly-different arrangements are considered to be the same circular permutation.
This is because any arrangement is simply a rotation of another. Take the left
black arrangement, rotate it clockwise by one-third of a circle to get the middle pink
arrangement. Repeat the rotation to get the right orange arrangement.
Note importantly, that the arrangement (or three arrangements) below cannot be rotated
to get the arrangement (or three arrangements) above. Hence, the arrangement below is
indeed distinct from the arrangement above.
It turns out that in general, if we have n distinct objects, there are (n − 1)! ways to
arrange them in a circle. So here there are only (3 − 1)! = 2! = 2 ways to arrange CAT in
a circle.
In general:
Proof. Given n distinct objects, any 1 circular permutation can be rotated n times to obtain
n distinct (linear) permutations. Hence, there are n times as many (linear) permutations
as there are circular permutations.
But we already know that there are n! (linear) permutations of n distinct objects. Hence,
there are n!/n = (n − 1)! circular permutations of n distinct objects.
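The rotation argument can be mimicked directly. In the sketch below, every linear arrangement is mapped to one representative of its rotation class (here the alphabetically smallest rotation, an arbitrary choice), and the number of distinct representatives is the number of circular permutations. The function name is an illustrative assumption, and the objects are assumed distinct.

```python
from itertools import permutations
from math import factorial

def circular_permutations(items):
    """Count arrangements of distinct items in a circle, treating rotations as equal."""
    representatives = set()
    for p in permutations(items):
        rotations = [p[i:] + p[:i] for i in range(len(p))]
        representatives.add(min(rotations))    # one canonical member per rotation class
    return len(representatives)

for items in ("CAT", "ABCDE"):
    n = len(items)
    print(items, circular_permutations(items), factorial(n - 1))
```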
Exercise 384. How many ways are there to seat 10 people in a circle? (Answer on p.
1566.)
Note that if there are repeated objects, then the problem is considerably more difficult. See
Ch. 122.2 in the Appendices for a brief discussion.
Example 1113. Using the 26-letter alphabet, how many 3-letter words can we form that
have no repeated letters? This, of course, is simply the problem of filling in these 3 empty
spaces using 26 distinct elements. For space #1, we have 26 possible choices. For space
#2, we have 25. And for space #3, we have 24.
___
1 2 3
By the MP then, the number of ways to fill the three spaces is 26 × 25 × 24. This is also
the number of three-letter words with no repeated letters.
Problems like the above example crop up often enough to motivate a new piece of notation:
Definition 187. Let n, k be positive integers with n ≥ k. Then P (n, k), read aloud as n
permute k, is defined by
$$P(n, k) = \frac{n!}{(n - k)!}.$$
P (n, k) answers the following question: “Given n distinct objects and k spaces (where
k ≤ n), how many ways are there to fill the k spaces?”
Just so you know, $P(n, k)$ is also variously denoted $nPk$, $P^n_k$, ${}_nP_k$, etc., but we’ll stick solely
with $P(n, k)$ in this textbook.
Example 1113 (continued from above). The number of 3-letter words without repeated
letters is simply $P(26, 3) = 26!/23! = 26 \times 25 \times 24$.
Example 1114. Problem: Using the 22-letter Phoenician alphabet, how many 4-letter
words can we form that have no repeated letters?
This, of course, is simply the problem of filling in these 4 empty spaces using 22 distinct
elements. So the answer is $P(22, 4) = 22!/18! = 22 \times 21 \times 20 \times 19$ words.
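The definition $P(n, k) = \frac{n!}{(n-k)!}$ translates into one line of code. The sketch below defines it (the function name `P` simply mirrors the notation), compares it with Python's built-in `math.perm` (available from Python 3.8), and, for the 26-letter example, also counts the ordered selections directly. Note that $P(22, 4)$ is the product of the four consecutive integers $22 \times 21 \times 20 \times 19$.

```python
from itertools import permutations
from math import factorial, perm

def P(n, k):
    """Number of ways to fill k ordered spaces from n distinct objects."""
    return factorial(n) // factorial(n - k)

# 3-letter words with no repeated letters, from a 26-letter alphabet.
brute_26_3 = sum(1 for _ in permutations(range(26), 3))
print(P(26, 3), perm(26, 3), brute_26_3, 26 * 25 * 24)

# 4-letter words with no repeated letters, from a 22-letter alphabet.
print(P(22, 4), perm(22, 4), 22 * 21 * 20 * 19)
```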
Exercise 385. Out of a committee of 11 members, how many ways are there to choose
a president and a vice-president? (Answer on p. 1566.)
Example 1115. At a dance party, there are 7 heterosexual married couples (and thus
14 people in total). Problem #1. How many ways are there of arranging them in a
line, with the restriction that every person is next to his or her partner?
Think of there as being 7 units (each unit being a couple). There are 7! ways to arrange
these 7 units in a line. Within each unit, there are 2 possible arrangements. Hence, in
total, there are $7! \times 2^7$ possible arrangements.
Problem #2. Repeat the above problem, but now for a circle, rather than a line.
There are 6! ways to arrange the 7 units in a circle. Within each unit, there are 2 possible
arrangements. Hence, in total, there are $6! \times 2^7$ possible arrangements.
Problem #3. How many ways are there of arranging them in a circle, with the restriction
that every man is to the right of his wife?
There are 6! ways to arrange the 7 units in a circle. Within each unit, there is only 1
possible arrangement. Hence, in total, there are 6! possible arrangements.
Example 1116. (I assume you’re familiar with the standard 52-card deck.)
_ _ _.
1 2 3
For space #2, having picked a card of suit X for space #1, we must pick a card from some
other suit Y. And so there are only 39 possible choices (we have three suits available —
that’s 3 × 13 = 39).
For space #3, having picked a card of suit Y for space #2, we must pick a card from
some other suit Z. Note that suit Z can be the same as suit X. And so there are 38
possible choices (we have three suits available, less the card used for space #1 — that’s
3 × 13 − 1 = 38).
Altogether then, there are 52 × 39 × 38 possible arrangements.
Problem #2. Repeat the above problem, but now for a circle, rather than a line.
One subtle thing is that, in addition to space #1 being of a different suit from space #2
and space #2 being of a different suit from space #3, we must also have that space #3
is of a different suit from space #1. Thus, there are 52 × 39 × 26 possible ways to fill in
these three spaces, if they were in a line.
Since they are instead in a circle, there are 52 × 39 × 26 ÷ 3 possible ways to arrange three
cards in a circle, with the condition that no two cards of the same suit are next to each
other.
Exercise 386. (Answer on p. 1566.) There are 4 brothers and 3 sisters. In how many
ways can they be arranged ...
(a) in a line, without any 2 brothers being next to each other?
(b) in a line, without any 2 sisters being next to each other?
(c) in a circle, without any 2 brothers being next to each other?
(d) in a circle, without any 2 sisters being next to each other?
__
1 2
Example 1118. How many ways are there of choosing 5 cards out of a standard 52-card
deck?
_____
1 2 3 4 5
First, how many ways are there to fill 5 spaces using 52 distinct objects (where order
matters)? Answer: $P(52, 5) = 52 \times 51 \times 50 \times 49 \times 48 = 311{,}875{,}200$.
And so if we don’t care about order, we must adjust this number by dividing by 5! to get
$P(52, 5)/5! = 2{,}598{,}960$. So the answer is that to choose 5 cards out of a 52-card deck,
there are 2,598,960 ways.
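The "divide by $k!$" adjustment can be checked with a few lines of code. The sketch below computes $P(n, k)/k!$ (the helper name `choose` is an illustrative assumption), compares it with Python's built-in `math.comb` (Python 3.8 or later), and verifies a small case by listing the unordered selections with `itertools.combinations`.

```python
from itertools import combinations
from math import comb, factorial, perm

def choose(n, k):
    """Ordered fillings P(n, k), divided by the k! orderings of each selection."""
    return perm(n, k) // factorial(k)

# 5 cards out of a 52-card deck.
print(perm(52, 5), choose(52, 5), comb(52, 5))

# A small case we can list in full: choosing 2 objects out of 6.
small = list(combinations(range(6), 2))
print(len(small), choose(6, 2))
```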
The above examples suggest that, in general, to choose k out of n given distinct objects,
there are P (n, k)/k! possible ways. This motivates the following definition:
It turns out that $C(n, k)$ appears so often in maths that it has many alternative notations.
One of the most common is $\binom{n}{k}$.
“n choose k” also has several names, such as the combination, the combinatorial
number, and even the binomial coefficient. Shortly, we’ll see why the name binomial
coefficient makes sense.
Exercise 387 gives an alternate expression for C(n, k) which you’ll often find very useful.
Exercise 387. (Answer on p. 1568.) Show that:
$$C(n, k) = \frac{n \times (n - 1) \times (n - 2) \times \cdots \times (n - k + 1)}{k!}.$$
Exercise 388. Compute C(4, 2), C(6, 4), and C(7, 3). (Answer on p. 1568.)
Exercise 389. We wish to form a basketball team, consisting of 1 centre, 2 forwards,
and 2 guards. We have available 3 centres, 7 forwards, and 5 guards. How many ways
are there of forming a team? (Answer on p. 1568.)
Proof. Choosing k out of n objects is the same as choosing which n − k out of n objects to
ignore.
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
1 7 21 35 35 21 7 1
⋮
It turns out that beautifully enough, each term is equal to the sum of the two terms above
it. The next exercise asks you to verify several instances of this:
Exercise 390. Verify the following: (a) C(1, 0)+C(1, 1) = C(2, 1); (b) C(4, 2)+C(4, 3) =
C(5, 3); (c) C(17, 2) + C(17, 3) = C(18, 3). (Answer on p. 1568.)
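The "sum of the two terms above" rule is enough to build the whole triangle. The sketch below generates rows using only that rule and then compares every entry with `math.comb`; the function name `pascal_rows` is an illustrative assumption.

```python
from math import comb

def pascal_rows(n_rows):
    """Build the first n_rows rows of the triangle using only the sum rule."""
    rows = [[1]]
    for _ in range(n_rows - 1):
        prev = rows[-1]
        # Each inner entry is the sum of the two entries above it.
        inner = [prev[i] + prev[i + 1] for i in range(len(prev) - 1)]
        rows.append([1] + inner + [1])
    return rows

rows = pascal_rows(8)
for row in rows:
    print(row)

# Entry k of row n should equal C(n, k).
matches = all(rows[n][k] == comb(n, k) for n in range(8) for k in range(n + 1))
print(matches)
```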
Poincaré’s quote is especially true in combinatorics. In this section, we’ll learn why C (n, k)
can be called the combination and also the binomial coefficient.
Verify for yourself that the following equations are true:
$(1 + x)^0 = 1$,
$(1 + x)^1 = 1 + x$,
$(1 + x)^2 = 1 + 2x + x^2$,
$(1 + x)^3 = 1 + 3x + 3x^2 + x^3$,
$(1 + x)^4 = 1 + 4x + 6x^2 + 4x^3 + x^4$,
$(1 + x)^5 = 1 + 5x + 10x^2 + 10x^3 + 5x^4 + x^5$,
$(1 + x)^6 = 1 + 6x + 15x^2 + 20x^3 + 15x^4 + 6x^5 + x^6$,
$(1 + x)^7 = 1 + 7x + 21x^2 + 35x^3 + 35x^4 + 21x^5 + 7x^6 + x^7$.
⋮
Each of the expressions on the RHS is called a binomial series. Each can also be called
the binomial expansion of $(1 + x)^n$.
Notice anything interesting? No? Try this exercise:
It turns out that somewhat surprisingly, the coefficients of the binomial expansions of
$(1 + x)^n$ are simply $\binom{n}{0}, \binom{n}{1}, \ldots, \binom{n}{n}$. As an additional exercise, you should verify for
yourself that this is also true for $n = 0$ through $n = 6$.
There are several ways to explain why the combinatorial numbers also happen to be the
binomial coefficients. Here we’ll give only the combinatorial explanation:
$$(1 + x)^2 = (1 + x)(1 + x) = 1 \cdot 1 + 1 \cdot x + x \cdot 1 + x \cdot x.$$
For $1 \cdot x$, we “chose” 1 from the first $(1 + x)$ and $x$ from the second $(1 + x)$.
For $x \cdot 1$, we “chose” $x$ from the first $(1 + x)$ and 1 from the second $(1 + x)$.
So from the two $(1 + x)$’s in the product, there are $C(2, 1) = 2$ ways to choose 1 of the $x$’s.
Altogether then, the coefficient on x^0 is C(2, 0) (“choose 0 of the x’s”), that on x^1 is C(2, 1)
(“choose 1 of the x’s”), and that on x^2 is C(2, 2) (“choose 2 of the x’s”). That is:
(1 + x)^2 = C(2, 0) x^0 + C(2, 1) x^1 + C(2, 2) x^2 = 1 + 2x + x^2.
Exercise 392. (Answer on p. 1569.) Mimicking what was just done above, explain why
There’s a nice combinatorial interpretation of the above fact (Poincaré’s quote at work
again).
Consider the set S = {A, B}. S has 2^2 = 4 subsets: ∅ = {}, {A}, {B}, and S = {A, B}.
Now consider the set T = {A, B, C}. T has 2^3 = 8 subsets: ∅ = {}, {A}, {B}, {C}, {A, B},
{A, C}, {B, C}, and T = {A, B, C}.
In general, if a set has n elements, how many subsets does it have? We can couch this in
the framework of the Multiplication Principle — this is really a sequence of n decisions of
whether or not to include each element in the subset. There are 2 choices for each decision.
Thus, there are 2^n choices altogether. In other words, using a set of n elements, we can
form 2^n subsets.
But of course, this must in turn be equal to the sum of the following:
• C (n, 0) ways to form subsets with 0 elements;
• C (n, 1) ways to form subsets with 1 element;
• C (n, 2) ways to form subsets with 2 elements;
...
• C (n, n) ways to form subsets with n elements.
Thus, C(n, 0) + C(n, 1) + C(n, 2) + ⋯ + C(n, n) = 2^n.
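The two ways of counting subsets (all at once as 2^n, or size by size as C(n, 0) + C(n, 1) + ⋯ + C(n, n)) can be compared directly by listing the subsets. A small sketch, with the helper name `subsets_by_size` being our own:

```python
from itertools import combinations
from math import comb

def subsets_by_size(elements):
    """Return {size: list of all subsets of that size}."""
    n = len(elements)
    return {k: list(combinations(elements, k)) for k in range(n + 1)}

groups = subsets_by_size(["A", "B", "C"])
counts = [len(groups[k]) for k in sorted(groups)]
print(counts)                                   # [1, 3, 3, 1]
print(sum(counts), 2 ** 3)                      # 8 8
print(counts == [comb(3, k) for k in range(4)])
```

Changing the list to one with n elements should give counts that add up to 2^n.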
Exercise 394. Using what you’ve learnt, write down (3 + x)^4. (Answer on p. 1570.)
Exercise 395. (Answer on p. 1570.) (a) The Tan family has 4 sons and the Wong
family has 3 daughters. Using the sons and daughters from these two families, how many
ways are there of forming 2 heterosexual couples?
(b) The Lee family has 6 sons and the Ho family has 9 daughters. Using the sons and
daughters from these two families, how many ways are there of forming 5 heterosexual
couples?
Example 1120. We want to know how much material to purchase, in order to build a
fence around a field. We might go through these steps:
1. Formulate a mathematical model: Our field is the shape of a rectangle, with length
100 m and breadth 50 m.
2. Analyse: The rectangle has perimeter 100 + 50 + 100 + 50 = 300 m.
3. Apply the results of our analysis: We need to buy enough material to build a
300-metre long fence.
Secretly, we’ve always been using mathematical modelling; we just haven’t always been
terribly explicit about it. The foregoing discussion was placed here because, with probability
and statistical models, we want to be especially clear that we are doing mathematical
modelling.
349
An experiment is often instead called a probability triple or probability space or (probability)
measure space.
350
Previously, in the only ordered triples we encountered, the three terms were always simply real numbers.
Here however, the first two terms are sets and the third is a function. Nonetheless, this is all the same
an ordered triple, albeit a more complicated one.
Example 1121. We model a coin-flip with the experiment E = (S, Σ, P). What are the
sample space S, the event space Σ, and the probability function P?
1. S = {H, T }.
The mathematical modeller has no freedom over the domain Σ and codomain R of the
probability function. However, she does have freedom to choose the mapping rule she
deems most appropriate. Hence, the act of choosing the mapping rule belongs to Step
#1 (Formulation) in the process of mathematical modelling.
So here, if told that heads and tails are “equally likely” (or that the coin is “unbiased”
or “fair”), the mathematical modeller would naturally choose to assign probability 0.5 to
each of the events {H} and {T }.
John, who chooses S = {H, T, X} as his sample space, might instead assign probability
1/6000 to the event {X} and probability 5999/12000 to each of the events {H} and {T }.
Remark 128. It is correct and proper to write P ({H}) = P ({T }) = 0.5. It is incorrect
and improper to write P (H) = P (T ) = 0.5. This is because the function P is of events
(sets of outcomes) and NOT of outcomes themselves.
Nonetheless, we will often allow ourselves to be sloppy and write the “incorrect and
improper” P (H) = P (T ) = 0.5. This is because the notation P ({H}) = P ({T }) = 0.5 can
get rather messy. But you should always remember, even as you write P (H) = P (T ) = 0.5,
that this is technically incorrect.
2. Event space:
Σ = {∅, {1} , {2} , . . . , {6} , {1, 2} , {1, 3} , . . . , {5, 6} , {1, 2, 3} , {1, 2, 4} , . . . , {4, 5, 6} , . . . . . . , S}
There are 6 possible outcomes and thus 2^6 = 64 possible events. The event space, given
above, is simply the set of all possible events.
If the real-world outcome of the die roll is 3, then the interpretation (in terms of our
model) is that the following 32 events occur: {3}, {1, 3}, {2, 3}, . . . , {1, 2, 3}, {1, 3, 4},
. . . , S = {1, 2, 3, 4, 5, 6}. (These are simply the events that contain the outcome 3.)
Similarly, if the real-world outcome of the die roll is 5, then the interpretation is that 32
events occur. You should be able to list all 32 of these events on your own.
3. Probability function P ∶ Σ → R.
If the die is “unbiased” or “fair”, then it makes sense to assign
P({1}) = P({2}) = P({3}) = P({4}) = P({5}) = P({6}) = 1/6.
What about the other 58 events? It makes sense to assign, for example, P({1, 3, 5, 6}) = 4/6.
In general, the mapping rule of the probability function can be fully specified as: For any
event A ∈ Σ,
P(A) = ∣A∣/∣S∣ = ∣A∣/6.
In words, given any event A, its probability P(A) is simply the number of elements it
contains, divided by 6.
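The die-roll model is small enough to build in full on a computer. The sketch below is one possible encoding (events as Python `frozenset`s, exact fractions for probabilities); the variable names are our own.

```python
from fractions import Fraction
from itertools import combinations

S = frozenset(range(1, 7))            # sample space {1, 2, ..., 6}

# Event space: every subset of S, from the empty set up to S itself
events = [frozenset(c)
          for k in range(len(S) + 1)
          for c in combinations(sorted(S), k)]

def P(A):
    """Probability function for a fair die: number of outcomes in A, divided by 6."""
    return Fraction(len(A), len(S))

print(len(events))                          # 64 events in total
print(sum(1 for A in events if 3 in A))     # 32 events occur when the roll is 3
print(P(frozenset({1, 3, 5, 6})))           # 2/3, i.e. 4/6
```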
• S, the sample space, is simply any set (interpreted as the set of possible outcomes in
a real-world scenario involving chance).
• Σ, the event space, is the set of possible events.
• P, the probability function, has domain Σ, codomain R, and must satisfy the three
Kolmogorov axioms (to be discussed below in Definition 190).
For the probability function P, the mathematical modeller is free to choose the mapping
rule she deems most appropriate. The only restriction is that P satisfies three axioms,
called the Kolmogorov Axioms, to be discussed in the next section.
Exercise 396. (Answers on pp. 1571, 1572, and 1573.) Consider each of the following
real-world scenarios.
(a) You pick, at random, a card from a standard 52-card deck.
(b) You flip two fair coins.
(c) You roll two fair dice.
Model each of the above real-world scenarios as an experiment, by following steps (i) -
(iii):
(iv) In each scenario, explain briefly how John, another scientist, might justify choosing
a different sample space, event space, and probability function.
Example 1123. Euclid’s parallel axiom says that “Two non-parallel lines in the plane
eventually intersect”. Historically, this axiom was accepted as a “self-evident truth”,
without need for justification or proof.
However, in the 19th century, mathematicians discovered “non-Euclidean geometries”, in
which the parallel axiom did not hold. These turned out to have significant implications
for maths, philosophy, and physics.
The above example illustrates that an axiom is not an eternal and immutable truth. Instead,
it is merely a statement that some mathematicians tentatively accept as being true. Having
listed a bunch of axioms, mathematicians then study their implications.
In probability theory, we impose three axioms on the probability function. These can be
thought of as restrictions on what the probability function looks like. Informally:
1. Probabilities can’t be negative.
2. The probability that some outcome occurs is 1.
3. The probability that one of two disjoint events occurs is the sum of their individual
probabilities.
Formally:
Definition 190. We say that a function P satisfies the three Kolmogorov axioms if:
1. Non-Negativity Axiom. For any event E ⊆ S, we have P(E) ≥ 0.
2. Normalisation Axiom. P(S) = 1.
3. Additivity Axiom.352 Given any two disjoint events E1 , E2 ⊆ S, we have
P (E1 ∪ E2 ) = P (E1 ) + P (E2 ).
In case you’ve forgotten, two sets are disjoint if they have no elements in common.
You may recognise that the Complements and the Inclusion-Exclusion properties are ana-
logous to the CP and IEP from counting.
Venn diagrams are helpful for illustrating probabilities. Those below help to illustrate
four of the above five properties.
Example 1124. Flip three fair coins. Model this as an experiment E = (S, Σ, P), where
Exercise 398. Roll two dice. Given that the sum of the two dice rolls is 8, what is the
probability that we rolled at least one even number? (Answer on p. 1575.)
Definition 192. The conditional probability fallacy (CPF) is the mistaken belief that
P (A∣B) = P (B∣A)
is always true.
Fact 172. (a) If P(A) < P(B), then P (A∣B) < P (B∣A).
(b) If P(A) > P(B), then P (A∣B) > P (B∣A).
(c) If P(A) = P(B), then P (A∣B) = P (B∣A).
Proof. By definition, P(A∣B) = P(A ∩ B)/P(B) and P(B∣A) = P(B ∩ A)/P(A).
Thus, P(A∣B) = [P(A)/P(B)] ⋅ P(B∣A). And so,
P(A) < P(B) Ô⇒ P (A∣B) < P (B∣A) ,
P(A) > P(B) Ô⇒ P (A∣B) > P (B∣A) ,
P(A) = P(B) Ô⇒ P (A∣B) = P (B∣A) .
The CPF is also known as the confusion of the inverse or the inverse fallacy. In
different contexts, it is also known variously as the base-rate fallacy, false-positive
fallacy, or prosecutor’s fallacy.
Example 1127. Sally buys a 4D ticket every week. One day, she wins the first prize.
To her astonishment, she wins the first prize again the following week.
Her jealous cousin Ah Kow makes a police report, based on the following reasoning:
“Without cheating, the probability that Sally wins the first prize two weeks in a row is
1 in 100 million. Given that she did win first prize two weeks in a row, the probability
that she didn’t cheat must likewise be 1 in 100 million. In other words, there is almost
no chance that Sally didn’t cheat.”
Let’s rephrase Ah Kow’s reasoning more formally. Let A and B be the events “Sally
wins the first prize two weeks in a row” and “Sally didn’t cheat”, respectively. We
know that P (A∣B) = 0.00000001. By the CPF, we have P (A∣B) = P (B∣A). Hence,
P (B∣A) = 0.00000001. Equivalently, there is probability 0.99999999 that Sally cheated.
Formally, this reasoning is flawed because P(B) is probably much larger than P (A).
Thus, P (B∣A) is probably much larger than P (A∣B).
Informally, the reasoning is flawed because:
• Cheating in 4D is extremely rare (and difficult), so it is extremely unlikely that Sally
cheated in the first place.
• Besides cheating, there are many other alternative explanations for why there exists
an individual who won first prize two weeks in a row.
One important alternative explanation is that so many individuals buy 4D tickets regularly
that there will invariably be someone as lucky as Sally. Suppose that only 100,000
Singaporeans (less than 2% of Singapore’s population) buy one 4D number every week.
Then we’d expect that about once every 20 years, one of these 100,000 Singaporeans
will have the fortune of winning the first prize on consecutive weeks. Rare, but hardly
impossible.
The test result returns positive (i.e. it says that the randomly-chosen person has small-
pox). What is the probability that this person actually has smallpox?
In words, it is easy to confuse “the probability of a positive test result conditional on
having smallpox” with “the probability of having smallpox conditional on a positive
test result”. Formally, this is the CPF. One starts with P (+∣S) = 0.99 and confusedly
concludes that P (S∣+) = 0.99 — this person almost certainly has smallpox.
In fact, as we now show, despite testing positive, the person is very unlikely to have
smallpox. The correct answer is P(S∣+) ≈ 1/10,000! In the steps below, each $\overset{*}{=}$ simply
uses the definition of conditional probability (Definition 191):
$$P(S \mid +) \overset{*}{=} \frac{P(S)\,P(+ \mid S)}{P(S)\,P(+ \mid S) + P(S^C)\,P(+ \mid S^C)} = \frac{\frac{1}{1000000} \cdot 0.99}{\frac{1}{1000000} \cdot 0.99 + \frac{999999}{1000000} \cdot 0.01} = 0.00009899029 \approx \frac{1}{10{,}000}.$$
This example illustrates how far astray the CPF can lead one.
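The calculation is easy to redo with exact fractions. In the sketch below, the three inputs are the figures that appear in the working above (prevalence 1/1,000,000, and probabilities 0.99 and 0.01 of a positive result with and without smallpox); the variable names are our own.

```python
from fractions import Fraction

prior = Fraction(1, 1_000_000)          # P(S): probability of having smallpox
p_pos_given_s = Fraction(99, 100)       # P(+ | S)
p_pos_given_not_s = Fraction(1, 100)    # P(+ | S^C)

# P(S | +) = P(S) P(+|S) / [ P(S) P(+|S) + P(S^C) P(+|S^C) ]
numerator = prior * p_pos_given_s
posterior = numerator / (numerator + (1 - prior) * p_pos_given_not_s)

print(posterior)            # exact fraction
print(float(posterior))     # roughly 1 in 10,000
```

Try replacing the prior with 1/100: the same test then gives a much larger P(S∣+), which shows how strongly the answer depends on how rare the disease is.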
It turns out that not only laypersons and court prosecutors commit the CPF. As we’ll see
later, even academic researchers also often commit the CPF, when it comes to interpreting
the results of a null hypothesis significance test (Chapter 107).
Example 1130. Consider all the families in the world that have two children, of whom
at least one is a boy. Randomly pick one of these families. What is the probability that
both children in this family are boys?
Think about it (set aside this book) before reading the answer below.
We already know that one child is a boy. So intuition might suggest that “obviously”,
Intuition would be wrong. Intuition goes astray by failing to recognise that there are three equally likely
ways that a family with two children can have at least one boy: BB, BG, or GB. The answer is in fact
1/3:
$$\frac{P(BB)}{P(BB) + P(BG) + P(GB)} = \frac{\frac{1}{4}}{\frac{1}{4} + \frac{1}{4} + \frac{1}{4}} = \frac{1}{3}.$$
In 2010, the following variant of the above Martin Gardner problem was presented.
BT B | Boy born on Tuesday | Boy (born on any day) | P(BT B) = 1/14 ⋅ 1/2 = 7/196
BT G | Boy born on Tuesday | Girl | P(BT G) = 1/14 ⋅ 1/2 = 7/196
BN BT | Boy not born on Tuesday | Boy born on Tuesday | P(BN BT) = 6/14 ⋅ 1/14 = 6/196
GBT | Girl | Boy born on Tuesday | P(GBT) = 1/2 ⋅ 1/14 = 7/196
Altogether then, amongst two-child families with at least one boy born on a Tuesday, the
proportion that have two boys is
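$$\frac{P(\text{BT B}) + P(\text{BN BT})}{P(\text{BT B}) + P(\text{BT G}) + P(\text{BN BT}) + P(\text{GBT})} = \frac{\frac{7}{196} + \frac{6}{196}}{\frac{7}{196} + \frac{7}{196} + \frac{6}{196} + \frac{7}{196}} = \frac{\frac{13}{196}}{\frac{27}{196}} = \frac{13}{27}.$$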
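If the answer 13/27 seems hard to believe, it can be checked by brute force. The sketch below lists all 14 × 14 = 196 equally likely two-child families, under the same assumptions as the table above (each child is equally likely to be a boy or a girl and equally likely to be born on any day of the week, independently). The day label `TUESDAY = 1` is arbitrary.

```python
from fractions import Fraction
from itertools import product

TUESDAY = 1                                    # arbitrary label for one of 7 weekdays
children = list(product("BG", range(7)))       # 14 equally likely (sex, weekday) types
families = list(product(children, repeat=2))   # 196 equally likely (child 1, child 2) pairs

def is_tuesday_boy(child):
    return child == ("B", TUESDAY)

# Condition: at least one child is a boy born on a Tuesday
qualifying = [f for f in families if any(is_tuesday_boy(c) for c in f)]
two_boys = [f for f in qualifying if all(sex == "B" for sex, _ in f)]

print(len(qualifying), len(two_boys))             # 27 13
print(Fraction(len(two_boys), len(qualifying)))   # 13/27
```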
P(A ∩ B) = P(A)P(B).
Fact 173. Suppose P(B) ≠ 0. Then A, B are independent events ⇐⇒ P(A∣B) = P(A).
desired.
Example 1133. Flip a fair coin and roll a fair die. This can be modelled by an experi-
ment, where
More broadly, we can even say that the coin flip and die roll are independent. Informally,
this means that the outcome of the coin flip has no influence on the outcome of the die
roll, and vice versa.
The idea of independence is a little tricky to illustrate on a Venn diagram. I’ll try anyway.
We observe that P(A) = 0.2 = P(A∣B). And so by Fact 173, we conclude that the events
A and B are independent.
Example 1135. The event “coin-flip #1 is heads” and the event “coin-flip #2 is heads”
are probably independent.
Example 1136. The event “die-roll #1 is 3” and the event “die-roll #2 is 6” are probably
independent.
Here are two examples where the assumption of independence is not plausible:
Example 1137. The event “Google’s share price rises today” is probably not independent
of the event “Apple’s share price rises today”.
Example 1138. The event “it rains in Singapore today” is probably not independent of
the event “it rains in Kuala Lumpur today”.
Example 1139. The “expert” witness claimed that in an affluent, non-smoking family
such as Sally Clark’s, the probability of an infant suddenly dying with no explanation
was 1/8543. Hence, he concluded, the probability of two sudden infant deaths in the
same family was (1/8543)^2 or approximately 1 in 73 million.
P(A ∩ B) = P(A)P(B),
P(B ∩ C) = P(B)P(C),
P(A ∩ C) = P(A)P(C).
A, B, C are independent if in addition to the above three conditions being true, it is also
true that
P(A ∩ B ∩ C) = P(A)P(B)P(C).
It is tempting to believe that pairwise independence implies independence. That is, if the
first three conditions listed above are true, then so is the fourth. Alas, this is false, as the
next exercise demonstrates:
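One standard illustration (not necessarily the one used in the exercise) is to flip two fair coins and let A be “the first flip is heads”, B be “the second flip is heads”, and C be “the two flips are the same”. The sketch below checks all four conditions by enumeration; the helper names are our own.

```python
from fractions import Fraction
from itertools import product

S = list(product("HT", repeat=2))        # 4 equally likely outcomes

def P(event):
    """Probability of an event, given as a true/false test on outcomes."""
    return Fraction(sum(1 for s in S if event(s)), len(S))

A = lambda s: s[0] == "H"                # first flip is heads
B = lambda s: s[1] == "H"                # second flip is heads
C = lambda s: s[0] == s[1]               # the two flips are the same

def both(e, f):
    return lambda s: e(s) and f(s)

pairwise = all(P(both(e, f)) == P(e) * P(f) for e, f in [(A, B), (B, C), (A, C)])
triple = P(lambda s: A(s) and B(s) and C(s)) == P(A) * P(B) * P(C)

print(pairwise)   # True: the three pairwise conditions hold
print(triple)     # False: P(A ∩ B ∩ C) = 1/4, but P(A)P(B)P(C) = 1/8
```

Knowing any one of A, B, C tells you nothing about any other one, yet knowing two of them determines the third.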
Example 1140. Say you pick Box #2. The host then opens an empty Box #1. You’re
now given a choice: Stay (with Box #2) or switch (to Box #3). Which do you choose?
Take as long as you need to think about this problem, before turning to the
next page for the answer.
Yes; you should switch. The first door has a 1/3 chance
of winning, but the second door has a 2/3 chance.
Not switching wins you the minister’s salary only in Case A (1/3 probability).
Switching wins you the minister’s salary in Cases B and C (2/3 probability).
353
Marilyn vos Savant was, briefly, on the Guinness Book of Records as the person with the world’s highest
IQ, until Guinness retired this category because IQ tests were considered to be too unreliable.
Even with the above explanations, some of you may remain unconvinced. Don’t worry, you
are not alone. After Marilyn’s initial response, 10,000 readers sent in letters telling her she
was wrong. Some were from Professors of Mathematics and PhDs. A few examples:354
Unfortunately for the above letter writers, Marilyn was correct and they were wrong.
The best way to convince the sceptical is through simulations — try this Google spreadsheet.
Or if you don’t trust computers, do an actual experiment:
Class Activity
Form pairs. One person is the gameshow host and the other is the contestant. The host
decides where the prize is (Box #1, #2, or #3). The contestant then picks a box. The
host then tells the contestant which one of the other two boxes is empty. The contestant
then decides whether to stay or switch.
Repeat as many times as you have time for. Record the proportion of times that the
contestant should have switched. You should find that this proportion is about 2/3.
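The class activity can also be run many thousands of times on a computer. The following is a rough simulation sketch, assuming the host always opens an empty box that the contestant did not pick (choosing at random when two such boxes are available). The function names, the number of trials, and the seed are all arbitrary choices of ours.

```python
import random

def play(switch, rng):
    """Play one round; return True if the contestant ends up with the prize."""
    prize = rng.randrange(3)
    pick = rng.randrange(3)
    # the host opens an empty box that is not the contestant's pick
    opened = rng.choice([b for b in range(3) if b != pick and b != prize])
    if switch:
        # move to the one box that is neither picked nor opened
        pick = next(b for b in range(3) if b != pick and b != opened)
    return pick == prize

def win_rate(switch, trials=100_000, seed=0):
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials

print(win_rate(switch=False))   # should be close to 1/3
print(win_rate(switch=True))    # should be close to 2/3
```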
354
You can read more of these letters at her website.
94.2. The Birthday Problem
Example 1141. (The birthday problem.) What is the smallest number n of people
in a room, such that it is more likely than not, that at least 2 people in the room share
the same birthday? 355
Fix person #1’s birthday. Then
• The probability that person #2’s birthday is different (from person #1) is 364/365.
• The probability that person #3’s birthday is different (from persons #1 and #2) is
363/365.
• The probability that person #4’s birthday is different (from persons #1, #2, and #3)
is 362/365.
• ... ...
• The probability that person #n’s birthday is different (from persons #1 through #n−1)
is (366 − n)/365.
Altogether, the probability that no 2 persons share the same birthday is
$$\frac{364}{365} \times \frac{363}{365} \times \frac{362}{365} \times \cdots \times \frac{366 - n}{365}.$$
Hence, the probability that at least 2 persons share the same birthday is
$$1 - \frac{364}{365} \times \frac{363}{365} \times \frac{362}{365} \times \cdots \times \frac{366 - n}{365}.$$
The smallest integer n for which the above probability is at least 0.5 is 23. That is,
perhaps surprisingly, with just 23 people, it is more likely than not that at least 2 persons
share a birthday.
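The claim that n = 23 is the smallest such number is tedious to check by hand but quick by computer. A minimal sketch, assuming (as the example does) 365 equally likely birthdays; the function name is our own:

```python
def p_shared_birthday(n):
    """Probability that at least 2 of n people share a birthday."""
    p_all_different = 1.0
    for i in range(n):
        # i = 0 gives 365/365 (person #1), i = 1 gives 364/365, and so on
        p_all_different *= (365 - i) / 365
    return 1 - p_all_different

n = 1
while p_shared_birthday(n) < 0.5:
    n += 1

print(n)                              # 23
print(p_shared_birthday(22))          # just below 0.5
print(p_shared_birthday(23))          # just above 0.5
```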
Example 1142. Model a fair coin-flip with the usual experiment E = (S, Σ, P), where
• S = {H, T }.
• Σ = {∅, {H} , {T } , S}.
• P ∶ Σ → R is defined by P (∅) = 0, P ({H}) = P ({T}) = 0.5, and P(S) = 1.
Let X ∶ S → R be the random variable that indicates whether the coin-flip is heads.
That is, the observed value of X is X(H) = 1 if the outcome is heads and X(T ) = 0 if
the outcome is tails.
Formally:
Remember: A random variable X is a function that can take on many possible real
number values. Each such value x = X(s) is called an observed value of X.
The notation “X ≥ k”, “X > k”, “X ≤ k”, “X < k”, “a ≤ X ≤ b”, etc. are similarly defined.
Example 1142 (continued from above). X(H) = 1 and X(T ) = 0. So we can write:
Now let’s try some other arbitrary number like 13.71. Notice there is no outcome s such
that X(s) = 13.71. Thus:
Y = 15.5 denotes the event {s ∈ S ∶ Y(s) = 15.5} = {H, T}, and P(Y = 15.5) = 1.
Example 1143. Flip two fair coins. Model this with the usual experiment, where S =
{HH, HT, T H, T T }.
Let X ∶ S → R indicate whether the two coin flips are the same and Y ∶ S → R count the
number of heads. That is,
Y (HH) = 2, Y (HT ) = 1, Y (T H) = 1, Y (T T ) = 0.
And
P(Y = 0) = 0.25, P(Y = 1) = 0.5, P(Y = 2) = 0.25, and P(Y = k) = 0, for any k ≠ 0, 1, 2.
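Since a random variable is just a function on the sample space, it can be written as an ordinary function in code. The sketch below encodes X (taken to be 1 if the two flips are the same and 0 otherwise) and Y for the two-coin experiment, then tabulates P(X = k) and P(Y = k) by counting outcomes. The helper name `pmf` is our own.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

S = ["".join(flips) for flips in product("HT", repeat=2)]   # HH, HT, TH, TT

def X(s):
    return 1 if s[0] == s[1] else 0     # indicates whether the two flips are the same

def Y(s):
    return s.count("H")                 # counts the number of heads

def pmf(rv):
    """Map each value k to P(rv = k), all four outcomes being equally likely."""
    counts = Counter(rv(s) for s in S)
    return {k: Fraction(c, len(S)) for k, c in counts.items()}

for k, p in sorted(pmf(Y).items()):
    print(f"P(Y = {k}) = {p}")
for k, p in sorted(pmf(X).items()):
    print(f"P(X = {k}) = {p}")
```

Any value k that does not appear in the table has probability 0, because no outcome is mapped to it.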
Another example:
S = {A♠, K♠, . . . , 2♠, A♥, K♥, . . . , 2♥, A♦, K♦, . . . , 2♦, A♣, K♣, . . . , 2♣} .
X ∶ S → R is the High Card Point count (used in the game of bridge). I.e.,
Thus, P(X = 0) = 36/52, P(X = 1) = 4/52, P(X = 2) = 4/52, P(X = 3) = 4/52, P(X = 4) = 4/52,
and P(X = k) = 0 for any k ≠ 0, 1, 2, 3, 4.
Y ∶ S → R indicates whether the picked card is a spade (♠). I.e.,
Thus, P(Y = 0) = 39/52, P(Y = 1) = 13/52, and P(Y = k) = 0 for any k ≠ 0, 1.
X([dice image]) = 7 and X([dice image]) = 5.
The table below says that P (X = 2) = 1/36, because there is only one way the event X = 2
can occur. And P (X = 3) = 2/36, because there are two ways the event X = 3 can occur.
You are asked to complete the table in the next exercise.
Exercise 404. (Continuation of the above example.) (Answer on p. 1577.) (a) Complete
the above table.
Consider the event E, described in words as “the sum of the two dice is at least 10”.
(b) Write down the event E in terms of X.
(c) Calculate P(E).
Example 1145 (continued from above). Continue with the same roll-two-fair-dice
example, with X again being the random variable that is the sum of the two dice.
We had
X([dice image]) = 7 and X([dice image]) = 5.
Y([dice image]) = 10 and Y([dice image]) = 4.
Remember: random variables are simply functions. And thus, we can manipulate random
variables just like we manipulate any functions.
So for example, consider the function X + Y ∶ S → R. It is also a random variable. We
have
(X + Y)([dice image]) = 17 and (X + Y)([dice image]) = 9.
(XY)([dice image]) = 70 and (XY)([dice image]) = 20.
(4X − 5Y)([dice image]) = −22 and (4X − 5Y)([dice image]) = 0.
Exercise 406. (Answer on p. 1578.) Model a fair die-roll with the usual experiment
E = (S, Σ, P). Define the function X ∶ S → R by the mapping rule X(1) = 1, X(2) = 2,
X(3) = 3, X(4) = 4, X(5) = 5, and X(6) = 6.
Is X a random variable on E? Why or why not?
If X is indeed a random variable on E, then write down also P(X = k), for all possible k.
Exercise 407. For each of the following real-world scenarios, write down, in precise
mathematical notation (i) the experiment E = (S, Σ, P); (ii) what the random variable
X is; and (iii) P(X = k), for all possible k. (Answers on pp. 1578 and 1579.)
(a) Flip 4 (fair) coins. Let the random variable X be a count of the number of heads.
(b) Roll 3 (fair) dice. Let the random variable X be the sum of the three dice. (Tedious.)
Example 1146. Flip two fair coins. Model this with the usual experiment where S =
{HH, HT, T H, T T }.
Let X ∶ S → R indicate whether the two coin flips were the same and Y ∶ S → R count
the number of heads. That is,
Then X = 0, Y = 0 is the event that the two coin flips were not the same AND the number
of heads was 0. By observation, this event is the empty set. Thus, P (X = 0, Y = 0) =
P (∅) = 0.
X = 1, Y = 0 is the event that the two coin flips were the same AND the number of heads
was 0. By observation, this event is {T T }. Thus, P (X = 1, Y = 0) = P ({T T }) = 0.25.
Exercise: Verify for yourself that
P (X = 0, Y = 1) = 0.5, P (X = 1, Y = 1) = 0,
P (X = 0, Y = 2) = 0, P (X = 1, Y = 2) = 0.25.
Formally:
Let’s restate the above definition more explicitly. Suppose X can take on values x1 , x2 , . . . , xn
and Y can take on values y1 , y2 , . . . , ym . Then to say that X and Y are independent is to
say that all of the following n × m pairs of events are independent
X = x 1 , Y = y1 , X = x 1 , Y = y2 , ... X = x 1 , Y = ym ,
X = x 2 , Y = y1 , X = x 2 , Y = y2 , ... X = x 2 , Y = ym ,
⋮ ⋮ ... ⋮
X = x n , Y = y1 , X = x n , Y = y2 , ... X = x n , Y = ym .
(a, b) | P(A = a, B = b) | P(A = a)P(B = b)
a = 0, b = 0 | P({TT}) = 0.25 | P({TH, TT}) P({HT, TT}) = 0.5 × 0.5 ✓
a = 1, b = 0 | P({HT}) = 0.25 | P({HH, HT}) P({HT, TT}) = 0.5 × 0.5 ✓
a = 0, b = 1 | P({TH}) = 0.25 | P({TH, TT}) P({HH, TH}) = 0.5 × 0.5 ✓
a = 1, b = 1 | P({HH}) = 0.25 | P({HH, HT}) P({HH, TH}) = 0.5 × 0.5 ✓
Exercise 408. Flip two fair coins. Let X ∶ S → R indicate whether the two coin flips
were the same and Y ∶ S → R count the number of heads. Are X and Y independent
random variables? (Answer on p. 1581.)
Earlier we warned against blithely assuming that any two events are independent. Here we
can repeat this warning: Unless explicitly told (or you have a good reason), do not assume
that two random variables are independent.
The assumption of independence is a strong one. There are many scenarios where it is
plausible. For example, the flips of two coins are probably independent. The rolls of two
dice are probably independent.
There are, however, also many scenarios where it is not plausible. Today’s changes in
the share prices of Google and Apple are probably not independent. Today’s rainfall in
Singapore and in Kuala Lumpur are probably not independent.
Nonetheless, the assumption of independence is frequently — and incorrectly — made even
when it is implausible. The reason is that the maths is easy if we assume independence —
we can simply multiply probabilities together. Unfortunately, incorrectly assuming inde-
pendence has sometimes had tragic consequences, as we saw in the Sally Clark case.
That is, a random variable is discrete if it takes on finitely many possible values.
We can now formally define the expected value of a discrete random variable:
$$E[X] = \sum_{k \in \mathrm{Range}(X)} P(X = k) \cdot k.$$
We call E [X] the expected value (or mean) of X. We often write µX = E [X] or even
µ = E [X] (if it is clear from the context that we’re talking about the mean of X).
Example 1148. Let X be the outcome of a fair die roll. The range of X is Range(X) =
{1, 2, 3, 4, 5, 6}. So
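$$\begin{aligned}
E[X] &= \sum_{k \in \mathrm{Range}(X)} P(X = k) \cdot k \\
&= P(X = 1) \cdot 1 + P(X = 2) \cdot 2 + P(X = 3) \cdot 3 + P(X = 4) \cdot 4 + P(X = 5) \cdot 5 + P(X = 6) \cdot 6 \\
&= \frac{1}{6} \cdot 1 + \frac{1}{6} \cdot 2 + \frac{1}{6} \cdot 3 + \frac{1}{6} \cdot 4 + \frac{1}{6} \cdot 5 + \frac{1}{6} \cdot 6 = 3.5.
\end{aligned}$$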
356
The correct definition is this: A random variable is discrete if its range is finite or countably-infinite.
I avoid giving this correct definition because this would require explaining what “countably-infinite”
means.
Example 1149. Let Y be the sum of two fair die-rolls.
The range of Y is Range(Y ) = {2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}. In Exercise 404, we worked
out that P (Y = 2) = 1/36, P (Y = 3) = 2/36, etc. Thus:
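$$\begin{aligned}
E[Y] &= \sum_{k \in \mathrm{Range}(Y)} P(Y = k) \cdot k \\
&= P(Y = 2) \cdot 2 + P(Y = 3) \cdot 3 + P(Y = 4) \cdot 4 + P(Y = 5) \cdot 5 + \cdots + P(Y = 12) \cdot 12 \\
&= \frac{1}{36} \cdot 2 + \frac{2}{36} \cdot 3 + \frac{3}{36} \cdot 4 + \frac{4}{36} \cdot 5 + \frac{5}{36} \cdot 6 + \frac{6}{36} \cdot 7 + \frac{5}{36} \cdot 8 + \frac{4}{36} \cdot 9 + \frac{3}{36} \cdot 10 + \frac{2}{36} \cdot 11 + \frac{1}{36} \cdot 12 \\
&= \frac{2 + 6 + 12 + 20 + 30 + 42 + 40 + 36 + 30 + 22 + 12}{36} = \frac{252}{36} = 7.
\end{aligned}$$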
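The same expected value can be computed by listing all 36 equally likely outcomes of two fair dice, building P(Y = k) from them, and then applying the definition. A minimal sketch, with variable names of our own choosing:

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))   # 36 equally likely (die 1, die 2) pairs
p_each = Fraction(1, len(outcomes))

# Build P(Y = k), where Y is the sum of the two dice
pmf = {}
for a, b in outcomes:
    pmf[a + b] = pmf.get(a + b, 0) + p_each

# E[Y] is the sum over k of P(Y = k) * k
expected = sum(prob * k for k, prob in pmf.items())

print(pmf[2], pmf[3], pmf[7])   # 1/36 1/18 1/6
print(expected)                 # 7
```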
Example 1150. Flip two fair coins and roll two fair dice. Let X be the number of
heads and Y be the number of sixes.
Problem: What is E[X + Y ]?
As it turns out, it is generally true that E[X + Y ] = E [X] + E [Y ] (as we’ll see in the next
section). So if we knew this, then the problem would be very easy:
$$E[X + Y] = E[X] + E[Y] = 1 + \frac{1}{3} = \frac{4}{3}.$$
But as an exercise, let’s pretend we don’t know that E[X + Y ] = E [X] + E [Y ]. We thus
have to work out E[X + Y ] the hard way:
First, note that Range(X + Y ) = {0, 1, 2, 3, 4}. P (X + Y = 0) is the probability of 0 heads
and 0 sixes. And P (X + Y = 1) is the probability of 1 head and 0 sixes OR 0 heads and
1 six. We can compute:
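$$P(X + Y = 0) = \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{5}{6} \cdot \frac{5}{6} = \frac{25}{144},$$
$$P(X + Y = 1) = \binom{2}{1} \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{5}{6} \cdot \frac{5}{6} + \frac{1}{2} \cdot \frac{1}{2} \cdot \binom{2}{1} \frac{5}{6} \cdot \frac{1}{6} = \frac{50}{144} + \frac{10}{144} = \frac{60}{144}.$$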
You are asked to complete the rest of this problem in the exercise below.
Exercise 409. Complete the above example by following these steps: (a) Compute
P (X + Y = 2). (b) Compute P (X + Y = 3). (c) Compute P (X + Y = 4). (d) Now com-
pute E[X + Y ]. (Answer on p. 1581.)
Example 1151. Let 5 be a constant random variable on some experiment E = (S, Σ, P).
That is, 5 ∶ S → R is the function defined by s ↦ 5. (Note that the symbol 5 does double
duty by denoting both a function and a real number.) Then not surprisingly,
Function Number
↓ ↓
E [5] = 5 .
That is, on average, we expect the random variable 5 to take on the value 5.
Fact 174. If the constant random variable c maps every outcome to the number c, then
E[c] = c.
(Source: Singapore Pools, “Rules for the 4-D Game”, Version 1.11, 17/11/15. PDF.)
Example 1153. The differentiation operator d/dx is an example of a linear transformation,
because it satisfies both additivity and homogeneity of degree 1:
$$\frac{d}{dx}\left(f(x) + g(x)\right) = \frac{d}{dx} f(x) + \frac{d}{dx} g(x) \quad \text{and} \quad \frac{d}{dx}\left(k f(x)\right) = k \frac{d}{dx} f(x).$$
(x + y)^2 = x^2 + y^2 or (kx)^2 = kx^2.
Proposition 15. The expectation operator E is linear. That is, if X and Y are random
variables and c is a constant, then
(a) Additivity: E[X + Y ] = E [X] + E [Y ],
(b) Homogeneity of degree 1: E[cX] = cE [X].
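Linearity does not require X and Y to be independent. As a quick numerical check (an illustration, not a proof), the sketch below uses two random variables on the two-coin experiment that are clearly related to each other: X indicates whether the two flips are the same and Y counts the heads. The helper `E` and the constant 5 are our own choices.

```python
from fractions import Fraction

S = ["HH", "HT", "TH", "TT"]            # two fair coin flips, equally likely

def E(rv):
    """Expected value of a random variable (a function on S)."""
    return sum(Fraction(1, len(S)) * rv(s) for s in S)

X = lambda s: 1 if s[0] == s[1] else 0  # 1 if the two flips are the same
Y = lambda s: s.count("H")              # number of heads

# (a) Additivity
print(E(lambda s: X(s) + Y(s)), E(X) + E(Y))     # 3/2 3/2

# (b) Homogeneity of degree 1, with c = 5
print(E(lambda s: 5 * X(s)), 5 * E(X))           # 5/2 5/2
```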
Example 1156. I stake $100 on each of two different 4D numbers for Saturday’s drawing
(“big” game). (So that’s $200 total.)
Let X and Y be my winnings (excluding my original stake) from the first and second
numbers (respectively). Now, X and Y are certainly not independent because for ex-
ample, if my first number wins first prize, then my second number cannot possibly also
win first prize.
Nonetheless, despite X and Y not being independent, the linearity of the expectation
operator tells us that
Example 1157. Consider a random variable X that is equally likely to take on one of 5
possible values: 0, 1, 2, 3, 4. Its mean is
$$\mu_X = \sum P(X = k) \cdot k = \frac{1}{5} \cdot 0 + \frac{1}{5} \cdot 1 + \frac{1}{5} \cdot 2 + \frac{1}{5} \cdot 3 + \frac{1}{5} \cdot 4 = 2.$$
Now consider another random variable Y that is equally likely to take on one of 5 possible
values: −8, −3, 2, 7, 12. Coincidentally, its mean is the same:
$$\mu_Y = \sum P(Y = k) \cdot k = \frac{1}{5} \cdot (-8) + \frac{1}{5} \cdot (-3) + \frac{1}{5} \cdot 2 + \frac{1}{5} \cdot 7 + \frac{1}{5} \cdot 12 = 2.$$
The random variables X and Y share the same mean. However, there is an obvious
difference: Y is “more spread out”.
What, precisely, do we mean when we say that one random variable is “more spread out”
than another?
Our goal in this section is to invent a measure of “spread-outness”. We’ll call this the
variance and denote the variance of any random variable X by Var [X].
It’s not at all obvious how the variance should be defined. One possibility is to define the
variance as the weighted average of the deviations from the mean.
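Example 1157 (continued from above). (Our first proposed definition of variance.)
For X, the weighted average of the deviations from the mean is
$$\begin{aligned}
V[X] &= \sum P(X = k) \cdot (k - \mu) \\
&= \frac{1}{5} \cdot (0 - \mu) + \frac{1}{5} \cdot (1 - \mu) + \frac{1}{5} \cdot (2 - \mu) + \frac{1}{5} \cdot (3 - \mu) + \frac{1}{5} \cdot (4 - \mu) \\
&= \frac{1}{5} \cdot (0 - 2) + \frac{1}{5} \cdot (1 - 2) + \frac{1}{5} \cdot (2 - 2) + \frac{1}{5} \cdot (3 - 2) + \frac{1}{5} \cdot (4 - 2) \\
&= -\frac{2}{5} - \frac{1}{5} + 0 + \frac{1}{5} + \frac{2}{5} = 0.
\end{aligned}$$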
Hmm. This works out to be 0. Is that just a weird coincidence? Let’s try the same for
Y:
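$$\begin{aligned}
V[Y] &= \sum P(Y = k) \cdot (k - \mu) \\
&= \frac{1}{5} \cdot (-8 - \mu) + \frac{1}{5} \cdot (-3 - \mu) + \frac{1}{5} \cdot (2 - \mu) + \frac{1}{5} \cdot (7 - \mu) + \frac{1}{5} \cdot (12 - \mu) \\
&= \frac{1}{5} \cdot (-8 - 2) + \frac{1}{5} \cdot (-3 - 2) + \frac{1}{5} \cdot (2 - 2) + \frac{1}{5} \cdot (7 - 2) + \frac{1}{5} \cdot (12 - 2) \\
&= -2 - 1 + 0 + 1 + 2 = 0.
\end{aligned}$$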
So our first proposed definition of the variance — the weighted average of the deviations
from the mean — is always equal to 0. Intuitively, the reason is that the negative deviations
(corresponding to those values below the mean) exactly cancel out the positive deviations
(corresponding to those values above the mean).
This proposed definition is thus quite useless. We cannot use it to say things like Y is
“more spread out” than X.
This suggests a second approach: define the variance to be the weighted average of the
absolute deviations from the mean.
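Example 1157 (continued from above). (Our second proposed definition of variance.)
For X, the weighted average of the absolute deviations from the mean is
$$\begin{aligned}
V[X] &= \sum P(X = k) \cdot |k - \mu| \\
&= \frac{1}{5} \cdot |0 - \mu| + \frac{1}{5} \cdot |1 - \mu| + \frac{1}{5} \cdot |2 - \mu| + \frac{1}{5} \cdot |3 - \mu| + \frac{1}{5} \cdot |4 - \mu| \\
&= \frac{1}{5} \cdot |0 - 2| + \frac{1}{5} \cdot |1 - 2| + \frac{1}{5} \cdot |2 - 2| + \frac{1}{5} \cdot |3 - 2| + \frac{1}{5} \cdot |4 - 2| \\
&= \frac{2}{5} + \frac{1}{5} + 0 + \frac{1}{5} + \frac{2}{5} = \frac{6}{5}.
\end{aligned}$$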
And now let’s work out the same for Y :
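$$\begin{aligned}
V[Y] &= \sum P(Y = k) \cdot |k - \mu| \\
&= \frac{1}{5} \cdot |-8 - \mu| + \frac{1}{5} \cdot |-3 - \mu| + \frac{1}{5} \cdot |2 - \mu| + \frac{1}{5} \cdot |7 - \mu| + \frac{1}{5} \cdot |12 - \mu| \\
&= \frac{1}{5} \cdot |-8 - 2| + \frac{1}{5} \cdot |-3 - 2| + \frac{1}{5} \cdot |2 - 2| + \frac{1}{5} \cdot |7 - 2| + \frac{1}{5} \cdot |12 - 2| \\
&= 2 + 1 + 0 + 1 + 2 = 6.
\end{aligned}$$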
Wonderful! So we can now use this second proposed definition of the variance to say
things like “Y is more spread out than X”.
This second proposed definition seems perfectly satisfactory. Yet for some bizarre reason,
we won’t use it! Instead, we’ll define the variance to be the weighted average of the
squared deviations from the mean.
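$$\begin{aligned}
V[X] &= \sum_k P(X=k)\cdot(k-\mu)^2 \\
&= \tfrac{1}{5}(0-\mu)^2 + \tfrac{1}{5}(1-\mu)^2 + \tfrac{1}{5}(2-\mu)^2 + \tfrac{1}{5}(3-\mu)^2 + \tfrac{1}{5}(4-\mu)^2 \\
&= \tfrac{1}{5}(0-2)^2 + \tfrac{1}{5}(1-2)^2 + \tfrac{1}{5}(2-2)^2 + \tfrac{1}{5}(3-2)^2 + \tfrac{1}{5}(4-2)^2 \\
&= \tfrac{4}{5} + \tfrac{1}{5} + 0 + \tfrac{1}{5} + \tfrac{4}{5} = 2.
\end{aligned}$$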
And now let’s work out the same for Y :
$$\begin{aligned}
V[Y] &= \sum_k P(Y=k)\cdot(k-\mu)^2 \\
&= \tfrac{1}{5}(-8-\mu)^2 + \tfrac{1}{5}(-3-\mu)^2 + \tfrac{1}{5}(2-\mu)^2 + \tfrac{1}{5}(7-\mu)^2 + \tfrac{1}{5}(12-\mu)^2 \\
&= \tfrac{1}{5}(-8-2)^2 + \tfrac{1}{5}(-3-2)^2 + \tfrac{1}{5}(2-2)^2 + \tfrac{1}{5}(7-2)^2 + \tfrac{1}{5}(12-2)^2 \\
&= 20 + 5 + 0 + 5 + 20 = 50.
\end{aligned}$$
Formally,
Definition 201. Let µ = E [X]. Then the variance operator is denoted Var and is the
function that maps each random variable X to a real number c, given by the mapping
rule
$$V[X] = E\left[(X-\mu)^2\right].$$
We call Var [X] the variance of X. This is often also instead written as $\sigma_X^2$ or even more
simply as $\sigma^2$ (if it is clear from the context that we’re talking about the variance of X).
So to calculate the variance, we do this: Consider all the possible values that X can take.
Take the difference between these values and the mean of X. Square them. Then take the
probability-weighted average of these squared numbers.
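The recipe above can be checked numerically. The following is a minimal sketch in Python (the helper names `mean`, `variance`, and `mean_abs_dev` are illustrative, not part of the text), applied to the two five-valued random variables X and Y from the running example, using exact fractions so that no rounding creeps in:

```python
from fractions import Fraction

def mean(dist):
    """Probability-weighted average of a {value: probability} dictionary."""
    return sum(p * k for k, p in dist.items())

def variance(dist):
    """Probability-weighted average of the squared deviations from the mean."""
    mu = mean(dist)
    return sum(p * (k - mu) ** 2 for k, p in dist.items())

def mean_abs_dev(dist):
    """Probability-weighted average of the absolute deviations from the mean."""
    mu = mean(dist)
    return sum(p * abs(k - mu) for k, p in dist.items())

fifth = Fraction(1, 5)
X = {k: fifth for k in (0, 1, 2, 3, 4)}
Y = {k: fifth for k in (-8, -3, 2, 7, 12)}

print(variance(X), variance(Y))          # 2 50
print(mean_abs_dev(X), mean_abs_dev(Y))  # 6/5 6
```

Both measures agree that Y is more spread out than X; only the squared version is what the text calls the variance.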
More examples:
$$= \frac{1}{6}\left(2.5^2 + 1.5^2 + 0.5^2 + 0.5^2 + 1.5^2 + 2.5^2\right) = \frac{35}{12} \approx 2.92.$$
So the variance of the die roll is 35/12 ≈ 2.92. This means that the expected squared
deviation of X from its mean µ = 3.5 is 35/12 ≈ 2.92.
Example 1159. Roll two fair dice. Let the random variable Y be the sum of the two
dice. We already know from Example 1149 that µ = 7. So, using also our findings from
Exercise 404,
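$$\begin{aligned}
V[Y] &= E\left[(Y-\mu)^2\right] = E\left[(Y-7)^2\right] \\
&= \tfrac{1}{36}\cdot 5^2 + \tfrac{2}{36}\cdot 4^2 + \tfrac{3}{36}\cdot 3^2 + \tfrac{4}{36}\cdot 2^2 + \tfrac{5}{36}\cdot 1^2 + \tfrac{6}{36}\cdot 0^2 + \tfrac{5}{36}\cdot 1^2 \\
&\quad + \tfrac{4}{36}\cdot 2^2 + \tfrac{3}{36}\cdot 3^2 + \tfrac{2}{36}\cdot 4^2 + \tfrac{1}{36}\cdot 5^2 \\
&= \frac{2\,(25 + 32 + 27 + 16 + 5)}{36} = \frac{210}{36} = \frac{70}{12} \approx 5.83.
\end{aligned}$$
So the variance of the sum of two dice is 70/12 ≈ 5.83. This means that on average, the
square of the deviation of Y from its mean µ = 7 is 70/12 ≈ 5.83.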
As the above examples suggest, calculating the variance can be tedious. Fortunately, there
is a shortcut:
Proof. Using the definition of variance, the linearity of the expectation operator (Proposi-
tion 15), and the fact that µ is a constant, we have
$$V[X] = E\left[(X-\mu)^2\right] = E\left[X^2 - 2\mu X + \mu^2\right] = E[X^2] + \mu^2 - 2\mu E[X] = E[X^2] + \mu^2 - 2\mu\cdot\mu = E[X^2] - \mu^2.$$
Example 1159 (continued from above). Let the random variable Y be the sum of
two rolled dice. We already know from Example 1149 that µ = 7. So, using also our
findings from Exercise 404,
$$\begin{aligned}
E[Y^2] &= P(Y=2)\cdot 2^2 + P(Y=3)\cdot 3^2 + \dots + P(Y=12)\cdot 12^2 \\
&= \tfrac{1}{36}\cdot 2^2 + \tfrac{2}{36}\cdot 3^2 + \tfrac{3}{36}\cdot 4^2 + \dots + \tfrac{1}{36}\cdot 12^2
\end{aligned}$$
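As a cross-check of the shortcut V[X] = E[X²] − µ², here is a small Python sketch (variable names are illustrative) that builds the distribution of the sum of two fair dice by enumerating all 36 outcomes and then computes the variance both from the definition and from the shortcut:

```python
from fractions import Fraction
from itertools import product

# Distribution of the sum of two fair dice: each of the 36 ordered outcomes has probability 1/36.
pmf = {}
for a, b in product(range(1, 7), repeat=2):
    pmf[a + b] = pmf.get(a + b, 0) + Fraction(1, 36)

mu = sum(p * y for y, p in pmf.items())                        # E[Y]
e_y2 = sum(p * y**2 for y, p in pmf.items())                   # E[Y^2]
var_definition = sum(p * (y - mu)**2 for y, p in pmf.items())  # E[(Y - mu)^2]
var_shortcut = e_y2 - mu**2                                    # E[Y^2] - mu^2

print(mu, var_definition, var_shortcut)  # 7 35/6 35/6
```

The value 35/6 is the same number as 70/12 ≈ 5.83 obtained earlier from the definition.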
Exercise 411. Let the random variable Z be the sum of three rolled dice. Find Var [Z].
(Answer on p. 1583.)
Fact 176. Let c be a constant random variable (i.e. it maps every outcome to the real
number c). Then
V[c] = 0.
Example 1160. There are 100 dumbbells in a gym, of which 30 have weight 5 kg and
the remaining 70 have weight 10 kg. Let X be the weight of a randomly-chosen dumbbell.
Then the mean of X is
To get a measure of “spread” that uses the original unit of measure, we simply take the
square root of the variance. This is called the standard deviation as a measure of spread.
Definition 202. Let X be a random variable and Var [X] be its variance. Then the
standard deviation of X is defined as
$$SD[X] = \sqrt{V[X]}.$$
The variance of a random variable X is often denoted $\sigma_X^2$ or even more simply as $\sigma^2$ (if it
is clear from the context that we’re talking about the variance of X).
Correspondingly, the standard deviation of X is often denoted $\sigma_X$ or σ.
Example 988 (continued from above). We calculated the variance of X to be
Var [X] = σ² = 5.25 kg².
Hence, the standard deviation of X is simply σ = √5.25 ≈ 2.29 kg.
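The dumbbell figures can be reproduced with a few lines of Python. This is only a sketch (the dictionary `weights` and the other names are illustrative); it uses the shortcut V[X] = E[X²] − µ² and then takes the square root to return to kilograms:

```python
from fractions import Fraction
from math import sqrt

# 30 of the 100 dumbbells weigh 5 kg; the remaining 70 weigh 10 kg.
weights = {5: Fraction(30, 100), 10: Fraction(70, 100)}

mu = sum(p * w for w, p in weights.items())       # mean, in kg
e_x2 = sum(p * w**2 for w, p in weights.items())  # E[X^2], in kg^2
var = e_x2 - mu**2                                # variance, in kg^2
sd = sqrt(var)                                    # standard deviation, in kg

print(float(mu), float(var), round(sd, 2))  # 8.5 5.25 2.29
```

Note the units: the variance is in kg², whereas the standard deviation is back in kg.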
Exercise 412. There are 100 rulers in a bookstore, of which 35 have length 20 cm and
the remaining 65 have length 30 cm. Let Y be the length of a randomly-chosen
ruler. Find the mean, variance, and standard deviation of Y . (Be sure to include
the units of measurement.) (Answer on p. 1583.)
With the above, it becomes much easier than before to find the variance of the sum of 2
dice, 3 dice, or indeed n dice.
Exercise 413. The weight of a fish in a pond is a random variable with mean µ kg and
variance σ 2 kg2 . (Include the units of measurement in your answers.) (Answer on p.
1583.)
(a) If two fish are caught and the weights of these fish are independent of each other,
what are the mean and variance of the total weight of the two fish?
(b) If one fish is caught and an exact clone is made of it, what are the mean and variance
of the total weight of the fish and its clone?
(c) If two fish are caught and the weights of these fish are not independent of each other,
what are the mean and variance of the total weight of the two fish?
Why do we prefer using squared (rather than absolute) deviations as our definition of
variance? The conventional view is that the squared deviations definition is superior to
the absolute deviations definition (but see Gorard (2005) and Taleb (2014) for dissenting
views). Here are some reasons for believing the squared deviations definition to be superior:
• The maths works out more nicely. For example:
– The algebra is easier when dealing with squares than with absolute values.
– Differentiation is easier (observe that x² is differentiable everywhere, but ∣x∣ is not differentiable at 0).
– Variances are additive: If X and Y are independent, then Var [X + Y ] = Var [X] +
Var [Y ]. In contrast, if we use the definition Var [X] = E [∣X − µ∣], then variances are
no longer additive.
• Tradition (inertia).
– A century or two ago, some Europeans preferred using squared to absolute deviations.
And so we’re stuck with using this.
357
This is easily proven: E [X − µ] = E [X] − E [µ] = µ − µ = 0.
99. The Coin-Flips Problem (Fun, Optional)
Here’s another example of a probability problem that can be stated very simply, yet has
counter-intuitive results.
Example 1162. Keep flipping a fair coin until you get a sequence of HH (two heads in
a row). Let X be the number of flips taken.
Now, keep flipping a fair coin until you get a sequence of HT . Let Y be the number of
flips taken.
Which is larger µX = E [X] or µY = E [Y ]?
Intuition might suggest that “obviously”, µX = µY . Intuition would be wrong. It turns
out that, surprisingly enough, µX = 6 and µY = 4!
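A quick simulation makes the claim easier to believe. The sketch below is not a proof; the function name `flips_until`, the seed, and the number of trials are arbitrary choices made for illustration. It estimates µX and µY by repeating each experiment many times:

```python
import random

def flips_until(pattern, rng):
    """Flip a fair coin until the most recent flips spell `pattern`; return the number of flips."""
    recent = ""
    count = 0
    while not recent.endswith(pattern):
        # Keep only as many recent flips as the pattern is long.
        recent = (recent + rng.choice("HT"))[-len(pattern):]
        count += 1
    return count

rng = random.Random(0)  # fixed seed so the run is repeatable
trials = 100_000
mean_hh = sum(flips_until("HH", rng) for _ in range(trials)) / trials
mean_ht = sum(flips_until("HT", rng) for _ in range(trials)) / trials

print(mean_hh, mean_ht)  # estimates of E[X] and E[Y]; these should be close to 6 and 4
```

One informal way to see the asymmetry: while waiting for HT, once an H has appeared a further H does no harm, whereas while waiting for HH, a T after an H sends you back to the start.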
Example 1163. Now suppose we flip a fair coin 10, 001 times. This gives us a sequence
of 10, 000 pairs of consecutive coin-flips.
For example, if the 10, 001 coin-flips are HHTHT . . . , then the first four pairs of consec-
utive coin-flips are HH, HT, TH, and HT .
Let A be the proportion of the 10, 000 pairs of consecutive coin-flips that are HH. Let B be the
proportion of the 10, 000 pairs of consecutive coin-flips that are HT .
Which is larger µA = E [A] or µB = E [B]?
In the previous example, we saw that it took, on average, 6 flips before getting HH and
4 flips before getting HT . So “obviously”, we’d expect a smaller proportion to be HH’s.
That is, µA < µB .
Sadly, we would again be wrong! It turns out that µA = µB = 1/4! This Google spreadsheet
simulates 10, 001 coin-flips and calculates A and B.
If you’re interested, the results given in the above two examples are formally proven in Fact
229 in the Appendices.
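The experiment in the second example can also be simulated directly. This is a minimal sketch in Python (the seed and variable names are arbitrary): it flips 10,001 fair coins, forms the 10,000 overlapping pairs, and reports the proportions A and B:

```python
import random

rng = random.Random(1)  # fixed seed so the run is repeatable
n_flips = 10_001
flips = [rng.choice("HT") for _ in range(n_flips)]

# The i-th pair consists of flips i and i + 1, so 10,001 flips give 10,000 overlapping pairs.
pairs = [flips[i] + flips[i + 1] for i in range(n_flips - 1)]

A = pairs.count("HH") / len(pairs)
B = pairs.count("HT") / len(pairs)
print(len(pairs), A, B)  # both proportions should be close to 1/4
```

Any single run will give slightly different values of A and B; it is their expected values that are both exactly 1/4.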
Example 1164. Flip a coin. We can model this with a Bernoulli trial with probability
of success (heads) 0.5:
• Sample space S = {T, H},
• Event space Σ = {∅, {T }, {H}, S},
• Probability function P({T }) = 0.5 and P({H}) = 0.5.
Formally:
Note that we can denote the two elements of the sample space with any symbols. We could
use 0 — standing for failure — and 1 — standing for success. Or we could use T and H,
as was done in the example above.
Example 1165. On any given day, our refrigerator at home has probability 0.001 of
breaking down. We can model this with a Bernoulli trial with probability of success
0.001:
• Sample space S = {0, 1},
• Event space Σ = {∅, {0}, {1}, S},
• Probability function P({0}) = 0.999 and P({1}) = 0.001.
Fact 177. A Bernoulli random variable T with probability of success p has mean p and
variance p(1 − p).
Proof. E [T ] = P (T = 0) ⋅ 0 + P (T = 1) ⋅ 1 = (1 − p) ⋅ 0 + p ⋅ 1 = p.
For the variance, first compute
$$E[T^2] = P(T=0)\cdot 0^2 + P(T=1)\cdot 1^2 = (1-p)\cdot 0 + p\cdot 1^2 = p.$$
Hence, $V[T] = E[T^2] - (E[T])^2 = p - p^2 = p(1-p)$.
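$$\begin{aligned}
P(X=0) &= \binom{3}{0}\left(\tfrac{1}{2}\right)^0\left(\tfrac{1}{2}\right)^3 = \tfrac{1}{8}, &
P(X=1) &= \binom{3}{1}\left(\tfrac{1}{2}\right)^1\left(\tfrac{1}{2}\right)^2 = \tfrac{3}{8}, \\
P(X=2) &= \binom{3}{2}\left(\tfrac{1}{2}\right)^2\left(\tfrac{1}{2}\right)^1 = \tfrac{3}{8}, &
P(X=3) &= \binom{3}{3}\left(\tfrac{1}{2}\right)^3\left(\tfrac{1}{2}\right)^0 = \tfrac{1}{8}.
\end{aligned}$$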
Formally:
X = T1 + T2 + ⋅ ⋅ ⋅ + Tn .
$$\begin{aligned}
P(Y=0) &= \binom{2}{0}\,0.9^0\,0.1^2 = 0.01, \\
P(Y=1) &= \binom{2}{1}\,0.9^1\,0.1^1 = 0.18, \\
P(Y=2) &= \binom{2}{2}\,0.9^2\,0.1^0 = 0.81.
\end{aligned}$$
In words, the probability that both fail is 0.01, the probability that exactly one passes is
0.18, and the probability that both pass is 0.81.
$$P(X=k) = \binom{n}{k}\,p^k\,(1-p)^{n-k}.$$
In summary:
$$P(X=k) = \binom{n}{k}\,p^k\,(1-p)^{n-k}.$$
Example 1169. Let X be the number of heads when 10 fair coins are flipped.
Then X ∼ B(10, 0.5). And the probability that exactly 8 coins are heads is:
$$P(X=8) = \binom{10}{8}\,0.5^8\,0.5^2 = \frac{45}{1024}.$$
$$= \binom{20}{18}\,0.9^{18}\,0.1^{2} + \binom{20}{19}\,0.9^{19}\,0.1^{1} + \binom{20}{20}\,0.9^{20}\,0.1^{0} \approx 0.677.$$
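The binomial probabilities above are easy to reproduce. The sketch below defines a hypothetical helper `binom_pmf` (the name is illustrative) straight from the formula P(X = k) = C(n, k) p^k (1 − p)^(n − k), then recomputes the 10-coin figure exactly and the three-term sum for n = 20, p = 0.9 in floating point:

```python
from fractions import Fraction
from math import comb

def binom_pmf(n, p, k):
    """P(X = k) for X ~ B(n, p): C(n, k) * p^k * (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Exactly 8 heads in 10 fair coin-flips (exact arithmetic with fractions).
p_eight_heads = binom_pmf(10, Fraction(1, 2), 8)

# The three-term sum shown above: k = 18, 19, 20 with n = 20 and p = 0.9.
p_at_least_18 = sum(binom_pmf(20, 0.9, k) for k in (18, 19, 20))

print(p_eight_heads)            # 45/1024
print(round(p_at_least_18, 3))  # 0.677
```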
Example 1171. Problem: Three machines each have, independently, probability 0.3 of
failure. What is the expected number of failures? What is the variance of the number of
failures?
Solution: Let Z ∼ B(3, 0.3) be the number of failures. Then
$$\begin{aligned}
E[Z] &= P(Z=1)\cdot 1 + P(Z=2)\cdot 2 + P(Z=3)\cdot 3 \\
&= \binom{3}{1}0.3^1\,0.7^2\cdot 1 + \binom{3}{2}0.3^2\,0.7^1\cdot 2 + \binom{3}{3}0.3^3\,0.7^0\cdot 3 \\
&= 0.441 + 0.378 + 0.081 = 0.9.
\end{aligned}$$
Now,
$$\begin{aligned}
E[Z^2] &= P(Z=1)\cdot 1^2 + P(Z=2)\cdot 2^2 + P(Z=3)\cdot 3^2 \\
&= \binom{3}{1}0.3^1\,0.7^2\cdot 1^2 + \binom{3}{2}0.3^2\,0.7^1\cdot 2^2 + \binom{3}{3}0.3^3\,0.7^0\cdot 3^2 \\
&= 0.441 + 0.756 + 0.243 = 1.44.
\end{aligned}$$
Hence, $V[Z] = E[Z^2] - (E[Z])^2 = 1.44 - 0.9^2 = 0.63$.
It turns out though that there is a much quicker formula for finding the mean and variance
of any binomial random variable.
(You can verify that this formula works for the last example: n = 3, p = 0.3, and thus
E [Z] = np = 0.9.)
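The same check can be done by machine for both the mean and the variance. The sketch below (names are illustrative) computes E[Z] and V[Z] for Z ∼ B(3, 0.3) by brute force from the probability mass function and compares them with np and np(1 − p), the standard closed forms for a binomial random variable:

```python
from fractions import Fraction
from math import comb

n, p = 3, Fraction(3, 10)

# Probability mass function of Z ~ B(3, 0.3), in exact arithmetic.
pmf = {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

mean = sum(k * q for k, q in pmf.items())     # E[Z] by brute force
e_z2 = sum(k**2 * q for k, q in pmf.items())  # E[Z^2]
var = e_z2 - mean**2                          # V[Z] = E[Z^2] - (E[Z])^2

print(mean, n * p)           # 9/10 9/10
print(var, n * p * (1 - p))  # 63/100 63/100
```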
358
But strangely enough, zero probability is not the same thing as impossible. For example, we’d
say that
• There is zero probability, but it is not impossible that X ∼ U [0, 1] takes on the value 0.37.
• There is zero probability and it is impossible that X ∼ U [0, 1] takes on the value 1.2.
(Actually, rather than use the word “impossible”, mathematicians prefer saying “almost never”, which
has a precise definition.)
P (0.3 ≤ X ≤ 0.7) = 0.7 − 0.3 = 0.4.
Similarly, the probability that X takes on values between 0.16 and 0.35 is simply 0.35−0.16 =
0.19. That is,
The above observations suggest that it may be useful to define a new concept, called the
cumulative distribution function.
FX (k) = P (X ≤ k) .
It turns out that every random variable can be uniquely defined by giving its
CDF. For example, the continuous uniform random variable is formally defined thus:
Definition 206. X is the continuous uniform random variable on [0, 1] if its CDF FX ∶
R → R is defined by
$$F_X(k) = \begin{cases} 0, & \text{if } k < 0, \\ k, & \text{if } k \in [0, 1], \\ 1, & \text{if } k > 1. \end{cases}$$
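The piecewise CDF translates directly into code. In the sketch below, `uniform_cdf` and `prob_between` are illustrative names; the second function uses the fact that, for a continuous random variable, P(a ≤ X ≤ b) = FX(b) − FX(a):

```python
def uniform_cdf(k):
    """CDF of the continuous uniform random variable on [0, 1]."""
    if k < 0:
        return 0.0
    if k > 1:
        return 1.0
    return float(k)

def prob_between(a, b):
    """P(a <= X <= b) for X ~ U[0, 1], computed as F_X(b) - F_X(a)."""
    return uniform_cdf(b) - uniform_cdf(a)

print(round(prob_between(0.3, 0.7), 2))    # 0.4
print(round(prob_between(0.16, 0.35), 2))  # 0.19
```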
Armed with the concept of the CDF, the formal definition of a continuous random variable
can be simply stated:
359
Or countably-infinite.
102.3. Important Digression: P (X ≤ k) = P (X < k)
For any continuous random variable X, we have
P (X ≤ k) = P (X < k) .
That is, whether an inequality is strict makes no difference. The reason is that by the third
Kolmogorov axiom (additivity),
Thus, for continuous random variables, it doesn’t matter whether inequalities are strict or
weak.
P (0.2 ≤ X ≤ 0.5) = P (0.2 < X ≤ 0.5) = P (0.2 ≤ X < 0.5) = P (0.2 < X < 0.5) .
Definition 208. Let X be a random variable whose CDF FX is differentiable. Then the
probability density function (PDF) of X is the function fX ∶ R → R defined by
$$f_X(k) = \frac{d}{dk}F_X(k).$$
The PDF has an intuitive interpretation. The area under the PDF between points a and
b is equal to P (a ≤ X ≤ b). This, of course, is simply a consequence of the Fundamental
Theorems of Calculus:
$$\int_a^b f_X(k)\,dk = \int_a^b \frac{d}{dk}F_X(k)\,dk \overset{\text{FTC}}{=} F_X(b) - F_X(a) = P(X \le b) - P(X \le a) = P(a \le X \le b).$$
The PDF of X ∼ U[0, 1] (graphed below) is simply the function fX ∶ R → R defined by
For any a ≤ b, the area under the PDF between a and b is precisely P (a ≤ X ≤ b). For
example, there is probability 0.25 (red area) that X takes on values between 0.5 and 0.75.
There is probability 0.1 (blue area) that X takes on values between 0.2 and 0.3.
Exercise 415. The continuous uniform random variable Y ∼ U[3, 5] is equally likely to
take on values between 3 and 5, inclusive. (a) Write down its CDF FY . (b) Write down
and graph its PDF fY . (c) Compute, and also illustrate on your graph, the quantities
P (3.1 ≤ Y ≤ 4.6) and P (4.8 ≤ Y ≤ 4.9). (Answer on p. 1585.)
360
Note that although every random variable has a CDF, not every random variable has a PDF. In
particular, if the random variable’s CDF is not differentiable, then by our definition here, the random
variable does not have a PDF.
103. The Normal Distribution
The standard normal (or Gaussian) random variable (SNRV) is very important. In
fact, it is so important that we usually reserve the letter Z for it, and the Greek letters φ
and Φ (lower- and upper-case phi) for its PDF and CDF.
The following three statements are entirely equivalent:
1. Z is a SNRV.
2. Z is a random variable with the standard normal distribution.
3. Z ∼ N (0, 1).
Here’s the formal definition:
Definition 209. Z is called a standard normal random variable (SNRV) if its PDF
φ ∶ R → R is defined by:
$$\phi(a) = \frac{1}{\sqrt{2\pi}}\,e^{-0.5a^2}.$$
For the A-Levels, you need not remember this complicated-looking PDF. Nor need you
understand where it comes from.
The normal PDF is often also referred to as the bell curve, due to its resemblance to a
bell (kinda).
As with the continuous uniform, for any a ≤ b, the area under the normal PDF between
a and b gives us precisely P (a ≤ Z ≤ b). For example, there is probability Φ(0.75) − Φ(0.5) ≈ 0.0819 (red area)
that Z takes on values between 0.5 and 0.75. There is probability Φ(0.3) − Φ(0.2) ≈ 0.0386 (blue area) that Z
takes on values between 0.2 and 0.3.
$$\Phi(k) = \int_{-\infty}^{k} \phi(a)\,da = \int_{-\infty}^{k} \frac{1}{\sqrt{2\pi}}\,e^{-0.5a^2}\,da.$$
Unfortunately, this last integral has no simpler expression (mathematicians would say that
it has no “closed-form expression”). Instead, as we’ll soon see, we have to use the so-called
Z-tables (or a graphing calculator) to look up values of Φ(k).
The next fact summarises the properties of the normal distribution. Some of these proper-
ties are illustrated in the figure that follows.
Fact 180. Let Z ∼ N(0, 1) and φ and Φ be its PDF and CDF.
1. Φ(∞) = 1. (As with any random variable, the area under the entire PDF is 1.)
2. φ (a) > 0, for all a ∈ R. (The PDF is positive everywhere. This has a surprising
implication: however large a is, there is always some non-zero probability that Z ≥ a.)
3. E [Z] = 0. (The mean of Z is 0.)
4. The PDF φ reaches a global maximum at the mean 0. (In fact, we can go ahead and
compute $\phi(0) = \frac{1}{\sqrt{2\pi}} \approx 0.399$.)
5. Var [Z] = 1. (The variance of Z is 1.)
6. P (Z ≤ a) = P (Z < a). (We’ve already discussed this earlier. It makes no difference
whether the inequality is strict. This is because P(Z = a) = 0.)
7. The PDF φ is symmetric about the mean. This has several implications:
(a) P (Z ≥ a) = P (Z ≤ −a) = Φ(−a).
(b) Since P (Z ≥ a) = 1 − P (Z ≤ a) = 1 − Φ (a), it follows that Φ(−a) = 1 − Φ (a) or,
equivalently, Φ (a) = 1 − Φ(−a).
(c) Φ (0) = 1 − Φ (0) = 0.5.
8. P (−1 ≤ Z ≤ 1) = Φ (1) − Φ (−1) ≈ 0.6827. (There is probability 0.6827 that Z takes on
values within 1 standard deviation of the mean.)
9. P (−2 ≤ Z ≤ 2) = Φ (2) − Φ (−2) ≈ 0.9545. (There is probability 0.9545 that Z takes on
values within 2 standard deviations of the mean.)
10. P (−3 ≤ Z ≤ 3) = Φ (3) − Φ (−3) ≈ 0.9973. (There is probability 0.9973 that Z takes on
values within 3 standard deviations of the mean.)
11. The PDF φ has two points of inflexion, namely at ±1. (The points of inflexion are one
standard deviation away from the mean.)
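Several of these properties can be checked numerically. Although Φ has no closed-form expression, it can be written in terms of the error function as Φ(k) = ½(1 + erf(k/√2)), and Python's standard library provides `math.erf`. The following sketch (the function name `Phi` is illustrative) reproduces properties 7 to 10:

```python
from math import erf, sqrt

def Phi(k):
    """CDF of the standard normal random variable, via the error function."""
    return 0.5 * (1 + erf(k / sqrt(2)))

print(Phi(0))  # 0.5, as in property 7(c)

# Symmetry, property 7(b): Phi(-a) = 1 - Phi(a).
print(abs(Phi(-1.3) - (1 - Phi(1.3))) < 1e-12)  # True

# Properties 8 to 10: probability of falling within 1, 2, 3 standard deviations of the mean.
for c in (1, 2, 3):
    print(c, round(Phi(c) - Phi(-c), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```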
[Figures: four panels labelled “After Step 1.”, “After Step 2.”, “After Step 3.”, and “After Step 4.”; horizontal axis from −4 to 4.]
Exercise 416. Using both the Z-tables and your graphing calculator, find the following:
(a) P (Z ≥ 1.8). (b) P (−0.351 < Z < 1.2). (Answer on p. 1586.)
z 0 1 2 3 4 5 6 7 8 9 ADD: 1 2 3 4 5 6 7 8 9
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359 4 8 12 16 20 24 28 32 36
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753 4 8 12 16 20 24 28 32 36
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141 4 8 12 15 19 23 27 31 35
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517 4 7 11 15 19 22 26 30 34
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879 4 7 11 14 18 22 25 29 32
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224 3 7 10 14 17 20 24 27 31
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549 3 7 10 13 16 19 23 26 29
0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852 3 6 9 12 15 18 21 24 27
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133 3 5 8 11 14 16 19 22 25
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389 3 5 8 10 13 15 18 20 23
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621 2 5 7 9 12 14 16 19 21
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830 2 4 6 8 10 12 14 16 18
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015 2 4 6 7 9 11 13 15 17
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177 2 3 5 6 8 10 11 13 14
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319 1 3 4 6 7 8 10 11 13
1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441 1 2 4 5 6 7 8 10 11
1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545 1 2 3 4 5 6 7 8 9
1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633 1 2 3 4 4 5 6 7 8
1.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706 1 1 2 3 4 4 5 6 6
1.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767 1 1 2 2 3 4 4 5 5
2.0 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817 0 1 1 2 2 3 3 4 4
2.1 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857 0 1 1 2 2 2 3 3 4
2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890 0 1 1 1 2 2 2 3 3
2.3 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916 0 1 1 1 1 2 2 2 2
2.4 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936 0 0 1 1 1 1 1 2 2
2.5 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952 0 0 0 1 1 1 1 1 1
2.6 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964 0 0 0 0 1 1 1 1 1
2.7 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974 0 0 0 0 0 1 1 1 1
2.8 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981 0 0 0 0 0 0 0 1 1
2.9 0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986 0 0 0 0 0 0 0 0 0
It turns out that σZ + µ is a normal random variable with mean µ and variance σ 2 :
Definition 210. X is called a normal random variable with mean µ and variance σ 2 if
its PDF fX ∶ R → R is defined by:
$$f_X(a) = \frac{1}{\sigma\sqrt{2\pi}}\,e^{-0.5\left(\frac{a-\mu}{\sigma}\right)^2}.$$
Once again, for the A-Levels, you need not remember this complicated-looking PDF. Nor
need you understand where it comes from.
The following three statements are entirely equivalent:
1. X is a normal random variable with mean µ and variance σ 2 .
2. X is a random variable with normal distribution of mean µ and variance σ 2 .
3. X ∼ N (µ, σ 2 ).
Exercise 417. Let X ∼ N(µ, σ 2 ). Verify that if µ = 0 and σ 2 = 1, then for all a ∈ R, we
have fX (a) = φ (a). What can you conclude? (Answer on p. 1587.)
Thus, we can easily transform any normal random variable into the SNRV:
Corollary 33. If X ∼ N (µ, σ²), then (X − µ)/σ = Z ∼ N(0, 1). Equivalently, X = σZ + µ.
(Just to be clear, two random variables are identical if their CDFs are identical.)
Exercise 418. Using Fact 181, prove that if X ∼ N (µ, σ²), then (X − µ)/σ = Z ∼ N(0, 1).
(Answer on p. 1587.)
The above corollary gives us an alternative method for computing probabilities associated
with normal random variables. In general, if X ∼ N (µ, σ 2 ), then
$$P(X \le c) = P(\sigma Z + \mu \le c) = P\left(Z \le \frac{c-\mu}{\sigma}\right) = \Phi\left(\frac{c-\mu}{\sigma}\right).$$
Fact 182. Let X ∼ N (µ, σ 2 ) and let fX and FX be the PDF and CDF of X.
1. FX (∞) = 1. (The area under the entire PDF is 1. This, of course, is true of any random
variable.)
2. fX (a) > 0, for all a ∈ R. (The PDF is positive everywhere. This has the surprising
implication that no matter how large a is, there is always some non-zero probability
that X ≥ a.)
3. E [X] = µ. (The mean of X is µ.)
4. The PDF fX reaches a global maximum at the mean µ. (In fact, we can go ahead and
compute $f_X(\mu) = \frac{1}{\sigma\sqrt{2\pi}} \approx \frac{0.399}{\sigma}$.)
5. Var [X] = σ². (The variance of X is σ².)
6. P (X ≤ a) = P (X < a). (We’ve already discussed this earlier. It makes no difference
whether the inequality is strict. This is because P(X = a) = 0.)
7. The PDF fX is symmetric about the mean. This has several implications:
(a) P (X ≥ µ + a) = P (X ≤ µ − a) = FX (µ − a).
(b) Since P (X ≥ µ + a) = 1 − P (X ≤ µ + a) = 1 − FX (µ + a), it follows that FX (µ − a) =
1 − FX (µ + a) or, equivalently, FX (µ + a) = 1 − FX (µ − a).
(c) FX (µ) = 1 − FX (µ) = 0.5.
8. P (µ − σ ≤ X ≤ µ + σ) = Φ (1) − Φ (−1) ≈ 0.6827. (There is probability 0.6827 that X
takes on values within 1 standard deviation of the mean.)
9. P (µ − 2σ ≤ X ≤ µ + 2σ) = Φ (2) − Φ (−2) ≈ 0.9545. (There is probability 0.9545 that X
takes on values within 2 standard deviations of the mean.)
10. P (µ − 3σ ≤ X ≤ µ + 3σ) = Φ (3) − Φ (−3) ≈ 0.9973. (There is probability 0.9973 that X
takes on values within 3 standard deviations of the mean.)
11. The PDF fX has two points of inflexion, namely at µ ± σ. (The points of inflexion are
one standard deviation away from the mean.)
Exercise 419. Prove all of the properties listed in Fact 182. (Hint: Use Corollary 33 to
convert X into the SNRV. Then simply apply Fact 180.) (Answer on p. 1588.)
Previously, we didn’t bother telling the TI84 our mean µ and standard deviation σ.
And so by default, if we pressed ENTER at this point, the TI84 simply assumed that
we wanted the SNRV Z ∼ N(0, 1). Now we’ll tell the TI84 what µ and σ are:
5. First enter the mean µ = −1. Press , (-) 1 .
6. Now enter the standard deviation σ = √0.1 (and not the variance). Press , √ 0
. 1 ) . Finally, press ENTER . The TI84 says that P (G < 2) ≈ 1.
Finding P (H < 2), P (I < 2), P (−1 < G < 1), P (−1 < H < 1), and P (−1 < I < 1) is similar:
P (H < 2) and P (I < 2) P (−1 < G < 1) P (−1 < H < 1) P (−1 < I < 1)
Since I has mean µ = 2, we should have exactly P (I < 2) = 0.5. So here the TI84 has
actually made a small error in reporting instead that P (I < 2) ≈ 0.5000000005.
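$$\begin{aligned}
P(G<2) &= P\left(Z < \frac{2-\mu_G}{\sigma_G} = \frac{2-(-1)}{\sqrt{0.1}} \approx 9.4868\right) = \Phi(9.4868) \approx 1, \\
P(H<2) &= P\left(Z < \frac{2-\mu_H}{\sigma_H} = \frac{2-1}{\sqrt{2}} \approx 0.7071\right) = \Phi(0.7071) \approx 0.7601, \\
P(I<2) &= P\left(Z < \frac{2-\mu_I}{\sigma_I} = \frac{2-2}{\sqrt{3}} = 0\right) = \Phi(0) = 0.5, \\
P(-1<G<1) &= P\left(0 = \frac{-1-(-1)}{\sqrt{0.1}} < Z < \frac{1-(-1)}{\sqrt{0.1}} \approx 6.3246\right) \\
&= \Phi(6.3246) - \Phi(0) \approx 1 - \Phi(0) = 0.5, \\
P(-1<H<1) &= P\left(-1.4142 \approx \frac{-1-1}{\sqrt{2}} < Z < \frac{1-1}{\sqrt{2}} = 0\right) \\
&= \Phi(0) - \Phi(-1.4142) \approx 0.5 - [1 - \Phi(1.4142)] \\
&= \Phi(1.4142) - 0.5 \approx 0.9213 - 0.5 = 0.4213, \\
P(-1<I<1) &= P\left(-1.7321 \approx \frac{-1-2}{\sqrt{3}} < Z < \frac{1-2}{\sqrt{3}} \approx -0.5774\right) \\
&= \Phi(-0.5774) - \Phi(-1.7321) = 1 - \Phi(0.5774) - [1 - \Phi(1.7321)] \\
&= \Phi(1.7321) - \Phi(0.5774) \approx 0.9584 - 0.7182 = 0.2402.
\end{aligned}$$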
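The standardisation used in these calculations can be sketched in Python. The helper names `Phi` and `normal_cdf` are illustrative, and the parameters G ∼ N(−1, 0.1), H ∼ N(1, 2), I ∼ N(2, 3) are read off from the means and standard deviations appearing in the working above. Small differences in the fourth decimal place compared with Z-table values are to be expected, since table look-ups involve rounding:

```python
from math import erf, sqrt

def Phi(z):
    """CDF of the standard normal random variable."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def normal_cdf(c, mu, var):
    """P(X <= c) for X ~ N(mu, var), via P(Z <= (c - mu) / sigma)."""
    return Phi((c - mu) / sqrt(var))

# (mean, variance) of each random variable, as read off from the working above.
params = {"G": (-1, 0.1), "H": (1, 2), "I": (2, 3)}

less_than_2 = {name: normal_cdf(2, mu, var) for name, (mu, var) in params.items()}
between = {name: normal_cdf(1, mu, var) - normal_cdf(-1, mu, var)
           for name, (mu, var) in params.items()}

for name in params:
    print(name, round(less_than_2[name], 4), round(between[name], 4))
```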
Exercise 420. Let X ∼ N(2.14, 5) and Y ∼ N(−0.33, 2). Using both the Z-tables
and your graphing calculator, find the following: (a) P (X ≥ 1) and P (Y ≥ 1). (b)
P (−2 ≤ X ≤ −1.5) and P (−2 ≤ Y ≤ −1.5). (Answer on p. 1589.)
Theorem 31. If X and Y are independent normal random variables, then X + Y is also
a normal random variable. Moreover, X − Y is also a normal random variable.
Proof. Omitted.
Examples:
(a) Let X1 ∼ N (200, 50) and X2 ∼ N (200, 50) be the weight of the first and second sumo
wrestler. Then X1 + X2 ∼ N (400, 100). Thus,
$$P(X_1 + X_2 > 405) = P\left(Z > \frac{405-400}{\sqrt{100}}\right) = P(Z > 0.5) = 1 - \Phi(0.5) \approx 1 - 0.6915 = 0.3085.$$
(b) Our goal is to find p = P (X1 > 1.1X2 ) + P (X2 > 1.1X1 ). This is the probability that
the first sumo wrestler is more than 10% heavier than the second, plus the probability
that the second is more than 10% heavier than the first. Of course, by symmetry, these
two probabilities are equal. Thus, p = 2 × P (X1 > 1.1X2 ). Now,
But $X_1 - 1.1X_2 \sim N\left(200 - 1.1\cdot 200,\ 50 + 1.1^2\cdot 50\right) = N(-20, 110.5)$. Thus,
$$P(X_1 > 1.1X_2) = P(X_1 - 1.1X_2 > 0) = P\left(Z > \frac{0-(-20)}{\sqrt{110.5}}\right)$$
(b) P (X > 9Y ) = P (X − 9Y > 0). But X − 9Y ∼ N (1 − 9 × 0.1, 0.4 + 9² × 0.1) = N (0.1, 8.5).
Thus, P (X − 9Y > 0) ≈ 0.5137 (calculator).
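The rule being used in these examples is that for independent normal random variables, aX + bY is again normal, with mean aµX + bµY and variance a²σX² + b²σY². Here is a short Python sketch of two of the calculations above (names are illustrative; the parameters X ∼ N(1, 0.4) and Y ∼ N(0.1, 0.1) are inferred from the expression N(1 − 9 × 0.1, 0.4 + 9² × 0.1)):

```python
from math import erf, sqrt

def Phi(z):
    """CDF of the standard normal random variable."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Sumo wrestlers: X1 + X2 with X1, X2 ~ N(200, 50) independent.
mean_sum = 200 + 200
var_sum = 50 + 50  # variances add for independent random variables
p_sum_over_405 = 1 - Phi((405 - mean_sum) / sqrt(var_sum))

# X - 9Y with X ~ N(1, 0.4) and Y ~ N(0.1, 0.1) independent (inferred parameters).
mean_diff = 1 - 9 * 0.1
var_diff = 0.4 + 9**2 * 0.1  # the coefficient -9 is squared, so the variances still add
p_diff_positive = 1 - Phi((0 - mean_diff) / sqrt(var_diff))

print(round(p_sum_over_405, 4))   # 0.3085
print(round(p_diff_positive, 4))  # 0.5137
```

Note that the variance of X − 9Y is a sum, not a difference: subtracting a random variable still adds spread.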
Exercise 421. (Answer on p. 1590.) Water and electricity usage are billed, respectively,
at $2 per 1, 000 litres (l) and $0.30 per kilowatt-hour (kWh). Assume that each month,
the amount of water used by Ahmad (and his family) at their HDB flat is normally
distributed with mean 25, 000 l and variance 64, 000, 000 l2 . Similarly, the amount of
electricity they use is normally distributed with mean 200 kWh and variance 10, 000
kWh2 .
Assume that monthly water usage and electricity usage are independent.
(a) Find the probability that their total water and electricity utility bill in any given
month exceeds $100.
(b) Find the probability that their total water and electricity utility bill in any given year
exceeds $1, 000.
Suppose instead that electricity usage is billed at $x per kWh.
(c) Then what is the maximum value of x, in order for the probability that the total
utility bill in a given month exceeds $100 to be 0.1 or less?
Proof. The proof is a little advanced and thus entirely omitted from this book.
What does it mean for one random variable to “converge in distribution” to another? This
is a little beyond the scope of the A-Levels, but informally, this means that as n → ∞,
the random variable $\sum_{i=1}^{n} X_i$ becomes “ever more” like the random variable with distribution
$N(n\mu, n\sigma^2)$.
How large is “large enough”? The most common rule-of-thumb is that n ≥ 30 is “large
enough”, so that’s what we’ll use in this book, even though this is somewhat arbitrary.
Indeed, if the original distribution from which the random variables are drawn is not
“nice enough”, then n ≥ 30 may not be “large enough”. (Informally, a distribution is “nice
enough” if it is, among other things, fairly symmetric, fairly unimodal, and not too
skewed.)
You can safely assume that all distributions you’ll ever encounter in the A-Levels are “nice
enough”, so that the n ≥ 30 rule-of-thumb works. But whenever you use the CLT normal
approximation, you should clearly state that you are assuming the distribution is “nice
enough”.
P(X ≥ 360) ≈ P(Y ≥ 360) and P(X > 360) ≈ P(Y > 360).
Note however that X is a discrete random variable, so that P(X ≥ 360) ≠ P(X > 360).
More specifically,
In contrast, Y is a continuous random variable, so that P(Y ≥ 360) = P(Y > 360). Hence,
if we simply use the approximations P(X ≥ 360) ≈ P(Y ≥ 360) and P(X > 360) ≈ P(Y >
360), then implicitly we’d be saying that P(X = 360) = 0, which is blatantly false.
To correct for this, we perform the so-called continuity correction. This says that we’ll
instead use the approximations
P(X ≥ 360) ≈ P(Y ≥ 359.5) and P(X > 360) ≈ P(Y ≥ 360.5).
Thus, P(X ≥ 360) ≈ P(Y ≥ 359.5) ≈ 0.2890 (calculator) and P(X > 360) ≈ P(Y ≥ 360.5) ≈
0.2693.
Note that if the random variable to be approximated is itself continuous, then there is no
need to perform the continuity correction. This is illustrated in Exercise 423 below.
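To see what the continuity correction buys us, here is a small Python sketch on a hypothetical example that is not from the text: X ∼ B(100, 0.5), approximated by Y ∼ N(50, 25), and the probability P(X ≥ 55). The exact binomial probability is compared with the normal approximation with and without the correction:

```python
from math import comb, erf, sqrt

def Phi(z):
    """CDF of the standard normal random variable."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Hypothetical illustration: X ~ B(100, 0.5), approximated by Y ~ N(np, np(1 - p)).
n, p, c = 100, 0.5, 55
mu, sd = n * p, sqrt(n * p * (1 - p))

# Exact value of P(X >= 55), summing the binomial probabilities.
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c, n + 1))

uncorrected = 1 - Phi((c - mu) / sd)      # P(Y >= 55)
corrected = 1 - Phi((c - 0.5 - mu) / sd)  # P(Y >= 54.5), with the continuity correction

print(round(exact, 4), round(uncorrected, 4), round(corrected, 4))
```

In this instance the corrected approximation lands much closer to the exact value than the uncorrected one does.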
Exercise 422. Let X be the random variable that is the sum of 30 rolls of a fair die.
Find P(100 ≤ X ≤ 110). (Answer on p. 1591.)
Exercise 423. The weight of each Coco-Pop is independently- and identically-distributed
with mean 0.1 g and variance 0.004 g2 . A box of Coco-Pops has exactly 5, 000 Coco-Pops.
It is labelled as having a net weight of 500 g. Find the probability that the actual
net weight of the Coco-Pops in this box is less than or equal to 499 g. (Answer on p.
1591.)
Example 1184. Below is a histogram of the heights of the 4,060 NBA players who ever
played in an NBA game (through the end of the 2016 season). (Heights are reported in
feet and a whole number of inches, where 1 in = 2.54 cm and 1 ft = 12 in, so that 1 ft =
30.48 cm.) The histogram has 28 bins and (arguably) looks normal (bell-shaped).
The width of each bin is 1 inch. For example, the red bin says 410 players have had
reported heights of 6 ft 7 in (approx. 200 cm). The pink (leftmost) bin is barely visible
and says only 1 player has had a reported height of 5 ft 3 in (approx. 160 cm). The
blue (rightmost) bin is also barely visible and says that only 2 players have had reported
heights of 7 ft 7 in (approx. 231 cm). The average or mean height is approx. 6 ft 6 in
(approx. 198 cm).
361
Data: Excel spreadsheet. Source: Basketball-Reference.com (retrieved June 15th, 2016). Caveats: (1)
For some reason, out of the 4060 players in that database at the time of retrieval, there was exactly
one player (George Karl) whose height was not listed. lists George Karl’s height as 6 ft
2 in, so that is what I have used for his height. (2) By NBA, I actually mean the BAA (1946-1949),
the NBA (1949-present), and the ABA (1967-1976), combined. (3) As is well-known among basketball
fans, the listed heights of NBA players are not accurate and can sometimes be off by as much as 2 to 3
inches (5 to 7.5 cm). (See this recent Wall Street Journal article.)
Manute Bol (approx 231 cm) and Muggsy Bogues (approx 160
cm) were briefly on the same team. (YouTube highlights.)
What made the book especially controversial were its claims that intelligence was largely
heritable and that black Americans had lower intelligence than whites. The figure above
is taken from p. 279 of the book. It suggests that
• Black IQ is normally distributed, with a mean of around 80.
• White IQ is normally distributed, with a mean of around 105.
(Source: YouTube.)
Question:
We will try to answer this question, but only after we’ve illustrated how the Central Limit
Theorem works.
Example 1187. Flip many fair coins. Model each with the Bernoulli random variables
T1 , T2 , T3 , . . . , each with probability of success (heads) 0.5.
Let Xn = T1 + T2 + ⋅ ⋅ ⋅ + Tn ∼ B(n, 0.5).
Below are the histograms of the distributions of X1 , X2 , . . . , and X6 . X1 has probability
0.5 of taking on each of the values 0 and 1. X2 has probability 0.5 of taking on the value
of 1; and probability of 0.25 of taking on each of the values 0 and 2. Etc.
(... Example continued from the previous page ...)
On this and the next page are the histograms of the distributions of X7 , X8 , X9 , X10 , X20 ,
X30 , X40 , X50 , and X100 . Observe that as n grows, the shape of the probability distribution
of Xn looks ever more bell-shaped. This is exactly what the CLT says.
Page 657,
Page 657, Table
Table of
of Contents
Contents www.EconsPhDTutor.com
1033, Contents www.EconsPhDTutor.com
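The probabilities quoted for the fair-coin example can be checked with a few lines of Python. This is only a minimal sketch: the helper name `binomial_pmf` is our own (hypothetical) choice, and it simply evaluates the binomial formula P(Xn = k) = C(n, k) p^k (1 − p)^(n − k).

```python
from math import comb

def binomial_pmf(n, p):
    """Return [P(X = 0), P(X = 1), ..., P(X = n)] for X ~ B(n, p)."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

# Distributions of X1 and X2 for fair coins (p = 0.5).
x1 = binomial_pmf(1, 0.5)
x2 = binomial_pmf(2, 0.5)
print(x1)
print(x2)

# A crude text histogram of X20: one '#' per 0.5% of probability.
x20 = binomial_pmf(20, 0.5)
for k, prob in enumerate(x20):
    print(f"{k:2d} {'#' * round(200 * prob)}")
```

Changing 20 to a larger n in the last part gives a rough feel for how the bell shape emerges.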
2. Add them up to get another random variable S.
3. The probability distribution of S will look normal.
362
I should say nearly any distribution. For the classical CLT to apply, the variance must be finite.
What makes the CLT particularly amazing is that it works with ANY distribution.
To illustrate, next up is an example where the original distribution is highly skewed and
does not look at all bell-curved. Nonetheless, the CLT still works out nicely.
Example 1188. Flip many biased coins, each with probability 0.9 of heads. Model each with the Bernoulli random variables Y1, Y2, Y3, . . . , each with probability of success (heads) 0.9.
Let Sn = Y1 + Y2 + ⋅ ⋅ ⋅ + Yn be the number of heads in the first n coin-flips. (By the way, Sn ∼ B(n, 0.9).)
On this and the next page are the histograms of the distributions of S1, S2, . . . , and S10.
S1 has probability 0.1 of taking on the value 0 and 0.9 of taking on the value 1. S2 has probability 0.01 of taking on the value 0, 0.18 of taking on the value 1, and 0.81 of taking on the value 2. S3 has probability 0.001 of taking on the value 0, 0.027 of taking on the value 1, 0.243 of taking on the value 2, and 0.729 of taking on the value 3. Etc.
It certainly does not look like the distribution of Sn is becoming increasingly bell-curved. Well, let’s see.
Below are the histograms of the distributions of S20, S30, S40, S50, and S100. Remarkably enough, as n grows, the shape of the probability distribution of Sn looks ever more bell-shaped. As promised by the CLT.
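One rough way to put a number on "looks ever more bell-shaped" is to measure the largest gap between the cumulative distribution of Sn ∼ B(n, 0.9) and that of the normal distribution with the same mean and variance. The sketch below does this; the function names and the choice of this particular gap measure are our own assumptions, not something taken from the text.

```python
from math import comb, erf, sqrt

def binomial_cdf_values(n, p):
    """Return [P(S <= 0), ..., P(S <= n)] for S ~ B(n, p)."""
    total, out = 0.0, []
    for k in range(n + 1):
        total += comb(n, k) * p**k * (1 - p)**(n - k)
        out.append(total)
    return out

def normal_cdf(x, mean, sd):
    """CDF of N(mean, sd^2) evaluated at x."""
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))

def max_gap(n, p):
    """Largest gap between the CDF of B(n, p) and the CDF of N(np, np(1-p))."""
    mean, sd = n * p, sqrt(n * p * (1 - p))
    cdf = binomial_cdf_values(n, p)
    gap = 0.0
    for k in range(n + 1):
        normal = normal_cdf(k, mean, sd)
        below = cdf[k - 1] if k > 0 else 0.0  # binomial CDF just below k
        gap = max(gap, abs(cdf[k] - normal), abs(below - normal))
    return gap

for n in (1, 10, 20, 50, 100):
    print(n, round(max_gap(n, 0.9), 4))
```

The printed gaps shrink as n grows, which is the numerical counterpart of the histograms becoming more bell-shaped.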
104.3. Why Are So Many Things Normally Distributed?
We now return to the question posed earlier:
Examples to illustrate:
Example 1189. Assume that human height is entirely determined by 1000 independent
genes (assume all human beings have these 1000 genes).
Assume that each of these 1000 genes is associated with an independent random variable X1, X2, . . . , X1000, each identically distributed with mean µX and variance σX². Assume also that human height is simply equal to the sum of these random variables. That is, a human being’s height is simply given by H = X1 + X2 + ⋅ ⋅ ⋅ + X1000.
Then the CLT says that since n = 1000 is “large enough”, H will be approximately normally distributed, with mean 1000µX and variance 1000σX². Amongst the world’s 7.4 billion people, there will be some very short people and some very tall people, but most people will be near the mean height 1000µX.
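The toy model above can be simulated. In the sketch below we assume, purely for illustration, that each gene effect is Uniform(0, 1) (so µX = 0.5 and σX² = 1/12) and that we simulate 2000 people; neither assumption comes from the text. If the CLT approximation is good, roughly 95% of simulated heights should fall within two standard deviations of 1000µX.

```python
import random
from statistics import mean, pstdev

random.seed(0)  # fixed seed so that the run is reproducible

N_GENES = 1000   # as in the example
N_PEOPLE = 2000  # hypothetical number of simulated people

def simulate_height():
    """Height = sum of 1000 independent Uniform(0, 1) gene effects (an assumption)."""
    return sum(random.random() for _ in range(N_GENES))

heights = [simulate_height() for _ in range(N_PEOPLE)]

# Theory: mean = 1000 * 0.5 and variance = 1000 * (1/12).
theory_mean = N_GENES * 0.5
theory_sd = (N_GENES / 12) ** 0.5

# Fraction of simulated heights within two standard deviations of the mean.
within_2sd = sum(abs(h - theory_mean) <= 2 * theory_sd for h in heights) / N_PEOPLE
print(round(mean(heights), 2), round(pstdev(heights), 2), within_2sd)
```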
363
Source: Basketball-Reference.com. Caveats: (1) The data were retrieved on June 15th, 2016, so the
points scored are between 1946 and that date. (2) By NBA, I actually mean the BAA (1946-1949), the
NBA (1949-present), and the ABA (1967-1976), combined. Data: Excel spreadsheet.
Example 1191. Below is a histogram of the total points scored by each of 4,060 NBA players. Clearly, the total points scored by each player is not normally distributed.
The histogram has 20 bins of equal width. The leftmost bin says that 2,615 players scored 0 to 1,919 points. The rightmost bin says that only 2 players scored 36,468 to 38,387 points.
The grand total number of points ever scored in the NBA is 11,565,923. Of these, 8,424,242 (or 72.8%) were scored by the top 20% of players (812). So it appears that the 80-20 Rule is a reasonably good description of the distribution of total points scored by players!
In contrast, the normal distribution is obviously not a good description.
It’s fairly obvious to anyone who bothers graphing the data that “points scored in the
NBA” is not normally-distributed. There are however instances where this is less obvious.
One is thus more likely to mistakenly assume a normal distribution. A famous and tragic
example of this is given by the financial markets.
Let qi be the % change in closing value on day i, as compared to day i − 1. For example, on June 14th, 2016, the DJIA closed at 17,674.82. On June 15th, 2016, it closed at 17,640.17, 34.65 points lower than the previous day’s close. Thus,
$$q_{20160615} = \frac{-34.65}{17{,}674.82} \approx -0.20\%.$$
It seems reasonable to say that q is normally-distributed (at least if we ignore the leftmost
and rightmost bins).
(Example continues on the next page ...)
Probability: Given a known model, what can we say about the data we’ll observe?
Statistics: Given observed data, what can we say about the model?
Example 1194. Ann and Bob are two infinitely intelligent persons. Ann believes that
the probability of rain tomorrow is 0.2 and Bob believes that it is 0.6.
• Objectivist view: There is some single, “correct” probability p of rain tomorrow.
Perhaps no one (except some Supreme Being up above) will ever know what exactly p
is. But in any case, we can say that exactly one of the following must be true:
1. Ann is correct (and Bob is wrong);
2. Bob is correct (and Ann is wrong); or
3. Both Ann and Bob are wrong.
• Subjectivist view: A probability is not some objective, rational thing that exists
outside the mind of any human being. There is no “correct” probability. Instead, a
probability is merely
Thus, Ann and Bob can legitimately disagree about the probability of rain tomorrow,
without either being wrong. After all, the numbers 0.2 and 0.6 are merely their personal,
subjective degrees of belief in the likelihood of rain tomorrow.
Bruno de Finetti (1906–1985) was perhaps the most famous and extreme subjectivist ever.
In the preface to his Theory of Probability (1970),364 he wrote:
My thesis, paradoxically, and a little provocatively, but nonetheless genuinely, is simply this:
364
Originally published in 1970 in Italian as Teoria delle probabilità. The link is to a recent 2017 English
edition.
Example 1195. Judge Ann says the murder suspect is probably innocent. Judge Bob
says the suspect is probably guilty.
Objectivist interpretation: Ann and Bob cannot both be correct. The suspect is
either innocent (with probability 1) or guilty (with probability 1).
In fact, we can go even further and say that both Ann and Bob are talking nonsense. It is
nonsensical to say things like the suspect is “probably” innocent (or “probably” guilty),
because the suspect either is innocent or not.
Subjectivist interpretation: Ann and Bob are perfectly well-entitled to their beliefs.
Moreover, it is perfectly meaningful to say things like the suspect is “probably” innocent
(or “probably” guilty). Ann and Bob do not know for sure whether the suspect is innocent
or guilty. They are thereby perfectly well-entitled to speak probabilistically about the
innocence or guilt of the suspect.
Example 1196. We flip a coin 100 times and get 100 heads.
Given these observed data (100 heads out of 100 flips), what can we say (what statistical
inference can we make) about whether or not the coin is fair?
Subjectivist answer: The coin is probably not fair. (This is perhaps the answer that
most laypersons would give.)
Objectivist answer: The coin either is fair (with probability 1) or isn’t fair (with
probability 1). Subjectivist statements like the coin is “probably” not fair are nonsensical.
Most untrained laypersons are innately subjectivist. Yet in this book (and also for the
A-Levels), you’ll be trained to think like strict objectivists.
Note though that it is not the case that one school of thought is correct and the other
wrong. Both the objectivist and subjectivist schools of thought have merit. The growing
consensus amongst statisticians is to take the best of both worlds.
Nonetheless, in this textbook, we learn only the objectivist interpretation. Not because it
is necessarily superior, but rather because
1. The maths is easier.
2. Tradition: For most of the 20th century, the objectivist interpretation was favoured.
106.1. Population
Definition 211. A population is any ordered set (i.e. vector) of objects we’re interested
in.
A population can be finite or infinite. But to keep things simple, we’ll look at examples
where it is finite.
Example 1197. The two candidates for the 2016 Bukit Batok SMC By-Election are Dr.
Chee Soon Juan and PAP Guy. It is the night of the election and voting has just closed.
Our objects-of-interest are the 23,570 valid ballots cast. (A ballot is simply a piece of paper on which a vote is recorded. The words ballot and vote are often used interchangeably.)
Arrange the ballots in any arbitrary order. Let v1 = 1 if the first ballot is in favour of Dr. Chee and v1 = 0 otherwise. Similarly and more generally, for any i = 2, 3, . . . , 23570, let vi = 1 if the ith ballot is in favour of Dr. Chee and vi = 0 otherwise.
Our population here is simply the ordered set P = (v1 , v2 , . . . , v23570 ). So in this example,
the population is simply an ordered set of 1s and 0s.
Example 1048 (continued from above). Suppose that of the 23,570 votes, 9,142 were for Dr. Chee and the remaining against. So the vector (v1, v2, . . . , v23570) contains 9,142 1s and 14,428 0s.
Then the population mean is
$$\mu = \frac{v_1 + v_2 + \dots + v_n}{n} = \frac{9142 \times 1 + 14428 \times 0}{23570} = \frac{9142}{23570} \approx 0.3879.$$
In this particular example, the population values are binary (either 0 or 1). And so
we have a nice alternative interpretation: the population mean is also the population
proportion. In this case, it is the proportion of the population who voted for Dr. Chee.
So here the proportion of votes for Dr. Chee is about 0.3879.
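The arithmetic for this population can be reproduced exactly with Python's `fractions` module. This is only a sketch. In particular, the variance line assumes that the population variance is defined as the average squared deviation from µ (denominator n); that definition is our assumption here.

```python
from fractions import Fraction

votes_for, votes_against = 9142, 14428
n = votes_for + votes_against  # 23570 ballots in total

# Population mean: the average of 9142 ones and 14428 zeros.
mu = Fraction(votes_for, n)

# Population variance, ASSUMING the definition (1/n) * sum of (v_i - mu)^2.
sigma2 = (votes_for * (1 - mu) ** 2 + votes_against * (0 - mu) ** 2) / n

print(float(mu))      # population mean (also the population proportion)
print(float(sigma2))  # for 0/1 data this equals mu * (1 - mu)
```

For a population of 0s and 1s, the variance under this definition always simplifies to µ(1 − µ).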
The population variance is
365
In the case of an infinite population, the definitions of µ and σ² must be adjusted slightly, but the intuition is the same.
106.3. Parameter
Informally, a parameter is some number we’re interested in and which may be calculated
based on the population.
Example 1048 (continued from above). A parameter we might be interested in
is the population mean µ — this is also the proportion of votes in favour of Dr. Chee.
(Another parameter we might be interested in is the population variance σ 2 , but let’s
ignore that for now.)
Voting has just closed. In a few hours’ time (after the vote-counting is done), we will
know what exactly µ is. But right now, we still don’t know what µ is.
Suppose we are impatient and want to know right away what µ might be. In other
words, suppose we want to get an estimate of the true value of µ. What are some
possible methods of getting a quick estimate of µ?
One possibility is to observe a random sample of 100 votes and count the proportion of
these 100 votes that are in favour of Dr. Chee. So for example, say we do this and observe
that 39 out of the 100 votes are for Dr. Chee. That is, we find that the observed sample
mean (which in this context can also be called the observed sample proportion) is
0.39. Then we might conclude:
Based on this observed random sample of 100 votes, we estimate that µ is 0.39.
The layperson might be content with this. But the statistician digs a little deeper and
asks questions such as:
• How do we know if this estimate is “good”?
• What are the criteria to determine whether an estimate is “good”?
We’ll now try to address, if only to a limited extent, these questions. But to do so, we
must first precisely define terms like sample and estimate.
366
Formally, we’d define the population distribution as a function. Indeed, some writers define the population itself as the distribution function.
106.5. A Random Sample
Informally, to observe a random sample of size n, we follow this procedure: Imagine the 23,570 ballots are in a single big bag.
1. Randomly pull out one ballot. Record the vote (either we write x1 = 1, if the vote was
for Dr. Chee, or we write x1 = 0, if it wasn’t).
2. Put this ballot back in (this second step is why we call it sampling with replacement).
3. Repeat the above n times in total, so as to record down the values of x1 , x2 , . . . , xn .
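The three steps above can be sketched in Python. The population of 9,142 ones and 14,428 zeros is taken from the by-election example; the function name `observe_random_sample` and the fixed seed are our own (hypothetical) choices.

```python
import random

random.seed(1)  # fixed seed, only so that the run is reproducible

# The population from the example: 9142 ones (for Dr. Chee) and 14428 zeros.
population = [1] * 9142 + [0] * 14428

def observe_random_sample(population, n):
    """Draw n ballots with replacement and return the recorded values."""
    # random.choice never removes the ballot, so each draw is "put back".
    return [random.choice(population) for _ in range(n)]

sample = observe_random_sample(population, 100)
sample_proportion = sum(sample) / len(sample)
print(sample_proportion)
```

Each run with a different seed gives a different observed random sample, and hence a different observed sample proportion.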
We call (x1 , x2 , . . . , xn ) an observed random sample of size n. Note that this is an
ordered set (or vector) of numbers. Formally:
Definition 213. Let P be a population. Then the random vector (i.e. ordered set of
random variables) (X1 , X2 , . . . , Xn ) is a random sample of size n from the population P
if
An example to illustrate:
In this textbook, we’ll be very careful to distinguish between a random sample (which is
a vector of random variables) and an observed random sample (which is a vector of real
numbers).
This may be contrary to the practice of your teachers or indeed even the A-Level exams.
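Definition 214. Let (X1, X2, . . . , Xn) be a random sample of size n. Then the corresponding sample mean X̄ and the sample variance S² are the random variables defined by:
$$\bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n},$$
$$S^2 = \frac{(X_1 - \bar{X})^2 + (X_2 - \bar{X})^2 + \dots + (X_n - \bar{X})^2}{n-1} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}.$$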
(The List of Formulae (MF26) will contain the observed sample variance.)
Note that strangely enough, the denominator of S 2 is n − 1, rather than n as one might
expect. As we’ll see later, there is a good reason for this.
By the way, there are two other formulae for calculating the sample variance:
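Fact 183. Let S = (X1, X2, . . . , Xn) be a random sample of size n. Let X̄ be the sample mean and S² be the sample variance. Let a ∈ R be a constant. Then
$$\text{(a)}\quad S^2 = \frac{\sum_{i=1}^{n} X_i^2 - \dfrac{\left[\sum_{i=1}^{n} X_i\right]^2}{n}}{n-1} \qquad\text{and}\qquad \text{(b)}\quad S^2 = \frac{\sum_{i=1}^{n} (X_i - a)^2 - \dfrac{\left[\sum_{i=1}^{n} (X_i - a)\right]^2}{n}}{n-1}.$$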
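$$s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + (x_3 - \bar{x})^2}{n-1} = \frac{\left(1 - \frac{1}{3}\right)^2 + \left(0 - \frac{1}{3}\right)^2 + \left(0 - \frac{1}{3}\right)^2}{3-1} = \frac{1}{3}.$$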
Let (X1 , X2 , X3 , X4 , X5 ) be a random sample of size 5. The corresponding sample mean
X̄ and sample variance S 2 are these random variables:
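$$s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + (x_3 - \bar{x})^2 + (x_4 - \bar{x})^2 + (x_5 - \bar{x})^2}{n-1} = \frac{\left(0 - \frac{2}{5}\right)^2 + \left(1 - \frac{2}{5}\right)^2 + \left(0 - \frac{2}{5}\right)^2 + \left(0 - \frac{2}{5}\right)^2 + \left(1 - \frac{2}{5}\right)^2}{5-1} = 0.3.$$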
Example 1199. Suppose we wish to find the average height µ (in cm) of an adult male.
As a practical matter, it would be quite difficult to locate and record the height of every
adult male in the world. So instead, what we might do is to randomly pick 4 adult males
and record their heights. This gives us a random sample (H1 , H2 , H3 , H4 ) of heights. The
corresponding sample mean is the random variable H̄ = (H1 + H2 + H3 + H4 ) /4. H̄ shall
serve as our estimator for µ.
Suppose our observed random sample is (h1 , h2 , h3 , h4 ) = (178, 165, 182, 175).
Then the corresponding observed sample mean is
$$\bar{h} = \frac{h_1 + h_2 + h_3 + h_4}{n} = \frac{178 + 165 + 182 + 175}{4} = 175.$$
Thus, h̄ = 175 serves as an estimate (or “guess”) of the true average male height µ.
Again, are the estimator H̄ and estimate h̄ = 175 “good” or “reliable”? How much
should we trust them? These are questions that we’ll address in the next section.
i=1
2
σ .
(a) Suppose our observed random sample is such that
$$\sum_{i=1}^{8} x_i = 1{,}320 \quad\text{and}\quad \sum_{i=1}^{8} x_i^2 = 218{,}360.$$
Then the observed sample mean x̄ and the observed sample variance s² are
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{1320}{8} = 165,$$
$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \dfrac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n-1} = \frac{218360 - \dfrac{1320^2}{8}}{7} = 80.$$
And our estimates for µ and σ² are, respectively, 165 cm and 80 cm².
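(b) Suppose instead our observed random sample is such that
$$\sum_{i=1}^{8} (x_i - 160) = 72 \quad\text{and}\quad \sum_{i=1}^{8} (x_i - 160)^2 = 1{,}560.$$
Then the observed sample mean x̄ and the observed sample variance s² are
$$\bar{x} = 160 + \frac{\sum_{i=1}^{n} (x_i - 160)}{n} = 160 + \frac{72}{8} = 169,$$
$$s^2 = \frac{\sum_{i=1}^{n} (x_i - 160)^2 - \dfrac{\left[\sum_{i=1}^{n} (x_i - 160)\right]^2}{n}}{n-1} = \frac{1560 - \dfrac{72^2}{8}}{7} \approx 130.3.$$
And our estimates for µ and σ² are, respectively, 169 cm and 130.3 cm².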
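Both parts of this example use the shortcut formulae of Fact 183, so they can be checked with one small helper. The following is a sketch; the function name and its argument names are our own (hypothetical), with `a` playing the role of the constant a in Fact 183(b).

```python
def mean_and_variance_from_sums(n, total, total_sq, a=0.0):
    """Observed sample mean and variance from sum(x_i - a) and sum((x_i - a)^2)."""
    x_bar = a + total / n
    s2 = (total_sq - total**2 / n) / (n - 1)
    return x_bar, s2

# Part (a): sums of x_i and x_i^2 (so a = 0).
mean_a, var_a = mean_and_variance_from_sums(8, 1320, 218360)

# Part (b): sums of (x_i - 160) and (x_i - 160)^2.
mean_b, var_b = mean_and_variance_from_sums(8, 72, 1560, a=160)

print(mean_a, var_a)
print(mean_b, round(var_b, 1))
```

Note that shifting every observation by the constant a changes the mean by a but leaves the variance untouched, which is why the same variance formula works in both parts.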
Exercise 425. (Answer on p. 1592.) Let X be the random variable that is the weight (in kg) of an American. Suppose we are interested in estimating the true population mean µ and variance σ² of X. We get an observed random sample of size 10: (x1, x2, . . . , x10).
(a) Suppose you are told that $\sum_{i=1}^{10} x_i = 1{,}885$ and $\sum_{i=1}^{10} x_i^2 = 378{,}265$. Find the observed sample mean x̄ and observed sample variance s².
(b) Suppose you are instead told that $\sum_{i=1}^{10} (x_i - 50) = 1{,}885$ and $\sum_{i=1}^{10} (x_i - 50)^2 = 378{,}265$. Find the observed sample mean x̄ and observed sample variance s².
Definition 215. Let X be a random variable and θ ∈ R be a parameter (i.e. just some
real number). We say that X is an unbiased estimator for θ if
E [X] = θ.
The next proposition says that the sample mean X̄ is an unbiased estimator for the
population mean µ; and the sample variance S 2 is an unbiased estimator for the
population variance σ 2 .
Proposition 17. Let (X1, X2, . . . , Xn) be a random sample of size n drawn from a distribution with population mean µ and population variance σ². Let X̄ be the sample mean and S² be the sample variance. Then
(a) E [X̄] = µ. And
(b) E [S²] = σ².
Proof. You are asked to prove (a) in Exercise 427. For the proof of (b), see p. 1379 in the
Appendices (optional).
Proposition 17(b) is the reason why, strangely enough, we define the sample variance with n − 1 in the denominator:
$$S^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}.$$
As defined, S 2 is an unbiased estimator for the population variance σ 2 . This, then, is the
reason why we define it like this.
Some writers call S 2 the unbiased sample variance, but we shall not bother doing so. We’ll
simply call S 2 the sample variance.
Sample i   x1   x2   x3   x̄i
1          1    0    1    2/3
2          0    0    0    0
3          0    1    0    1/3
4          1    0    0    1/3
5          0    1    1    2/3
6          1    0    0    1/3
7          0    0    0    0
8          0    0    0    0
9          0    0    1    1/3
10         1    1    0    2/3
Note that every estimate x̄i is wrong. Indeed, since the sample mean X̄i can only take
on values 0, 1/3, 2/3, or 1, the estimates can never possibly be equal to the true µ = 0.39.
Nonetheless, what the above proposition says informally is that on average, the estimate
gets it correct. Formally, E [X̄] = µ = 0.39.
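Because a sample of size 3 from a 0/1 population has only 8 possible outcomes, the claim "on average, the estimate gets it correct" can be checked exactly by enumeration rather than by simulation. The sketch below assumes independent draws with P(1) = 0.39, as in the text, and uses exact fractions. The same loop also checks that the sample variance is unbiased for σ² = µ(1 − µ).

```python
from fractions import Fraction
from itertools import product

p = Fraction(39, 100)  # true proportion mu used in the text
n = 3                  # sample size, as in the table

expected_mean = Fraction(0)
expected_var = Fraction(0)
for sample in product([0, 1], repeat=n):
    # Probability of this particular observed sample (independent draws).
    prob = Fraction(1)
    for x in sample:
        prob *= p if x == 1 else 1 - p
    x_bar = Fraction(sum(sample), n)
    s2 = sum((x - x_bar) ** 2 for x in sample) / (n - 1)
    expected_mean += prob * x_bar
    expected_var += prob * s2

print(expected_mean)  # equals mu
print(expected_var)   # equals sigma^2 = mu * (1 - mu)
```

No single observed sample mean equals 0.39, yet the probability-weighted average of all of them does.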
For a demonstration that you can play around with, try this Google spreadsheet.
Exercise 428. Suppose we flip a coin 10 times. The first 7 flips are heads and the next
3 are tails. Let 1 denote heads and 0 denote tails. (Answer on p. 1593.)
(a) Write down, in formal notation, our observed random sample, the observed sample
mean, and observed sample variance.
(b) Are these observed sample mean and variance unbiased estimates for the true population mean and variance?
(c) Can we conclude that this is a biased coin (i.e. the true population mean is not 0.5)?
Fact 184. Var [X̄] = σ²/n.
Exercise 429. Prove Fact 184. (Hint: Note that X̄ = (1/n)(X1 + X2 + ⋅ ⋅ ⋅ + Xn) and X1, X2, . . . , Xn are independent.) (Answer on p. 1593.)
Exercise 430. For each of the following terms, give a formal definition and an intuitive
explanation. (State whether each term is a random variable or a real number.) For
simplicity, you may assume that the finite population is given by P = (x1 , x2 , . . . , xk ).
(Answer on p. 1594.)
(a) The population mean.
(b) The population variance.
(c) The sample mean.
(d) The sample variance.
(e) The mean of the sample mean.
(f) The variance of the sample mean.
(g) The mean of the sample variance.
(h) The observed sample mean.
(i) The observed sample variance.
Proof. Corollary 34 tells us that the sum of normal random variables is itself a normal
random variable. So X1 + X2 + ⋅ ⋅ ⋅ + Xn is a normal random variable.
Fact 181 tells us that a linear transformation of a normal random variable is itself a normal
random variable. So X̄n = (X1 + X2 + ⋅ ⋅ ⋅ + Xn ) /n is a normal random variable.
In the previous sections, we already showed that X̄n has mean µ and variance σ 2 /n.
Altogether then, X̄n ∼ N(µ, σ²/n).
Now, suppose instead X1 , X2 , . . . , Xn are not normally-distributed. Surprisingly, a similar
result still holds, thanks to the CLT. Informally, draw X1 , X2 , . . . , Xn from any distribution.
Then thanks to the CLT, it will still be the case that — provided n is “large enough” —
X̄n is (approximately) normally-distributed. Formally:
Proof. The CLT says that if n is “large enough”, then X1 +X2 +⋅ ⋅ ⋅+Xn is well-approximated
by the normal distribution N (nµ, nσ 2 ).
And so it follows from Fact 181 (a linear transformation of a normal random variable is
itself a normal random variable) that X̄ = (X1 + X2 + ⋅ ⋅ ⋅ + Xn ) /n is well-approximated by
the normal distribution N(µ, σ²/n).
In the next chapter, we’ll make greater use of the two results given in this section.
Example 1201. Suppose we’re interested in the average height of a Singaporean. The
only way to know this for sure is to survey every single Singaporean. This, however, is
not practical.
Instead, we have only the resources to survey 100 individuals. We decide to go to a
basketball court and measure the heights of 100 people there. We thereby gather an
observed sample of size 100: (x1 , x2 , . . . , x100 ). We find that the average individual’s
height is x̄ = ∑ xi /100 = 179 cm.
Is x̄ = 179 cm an unbiased estimate of the average Singaporean’s height? Intuitively, we
know that the answer is obviously no.
The reason is that our observed sample of size 100 was non-random. We picked a basketball court, where the individuals are overwhelmingly (i) male; and (ii) taller than average.
Our estimate x̄ = 179 cm is thus probably biased upwards.
Example 1202. Suppose we’re interested in what the average Singaporean family spends
on food each month. The only way to know this for sure is to survey every single family
in Singapore. This, however, is not practical.
Instead, we have only the resources to survey 100 families. We decide to go to Sixth
Avenue and randomly ask 100 families living there what they reckon they spend on food
each month. We thereby gather an observed sample of size 100: (x1 , x2 , . . . , x100 ). We
find that the average family spends x̄ = ∑ xi /100 = $2,700 on food each month.
Is x̄ = $2,700 an unbiased estimate of the average monthly spending on food by a Singaporean family? Intuitively, we know that the answer is obviously no.
The reason is that our observed sample of size 100 was non-random. We picked an unusually affluent neighbourhood. Our estimate x̄ = $2,700 is thus probably biased upwards.
p = P (T ≤ t = 1∣H0 ) .
Now, remember that T is a random variable. In fact, it’s a binomial random variable.
Assuming H0 to be true, we have T ∼ B (n, θ) = B (5, 0.6). Thus,
$$p = P(T \le 1 \mid H_0) = P(T = 0 \mid H_0) + P(T = 1 \mid H_0) = \binom{5}{0} 0.6^0 \, 0.4^5 + \binom{5}{1} 0.6^1 \, 0.4^4 = 0.08704.$$
This says that if H0 were true, then the probability of observing a test statistic as extreme
as the one we actually observed is only 0.08704. We might interpret this relatively small
p-value as casting doubt on or providing evidence against H0 .
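The one-tailed p-value just computed is a sum of binomial probabilities, so it is easy to reproduce. A minimal sketch, with helper names of our own choosing:

```python
from math import comb

def binomial_pmf(k, n, theta):
    """P(T = k) for T ~ B(n, theta)."""
    return comb(n, k) * theta**k * (1 - theta)**(n - k)

def lower_tail_p_value(t, n, theta):
    """P(T <= t | H0), where T ~ B(n, theta) under H0."""
    return sum(binomial_pmf(k, n, theta) for k in range(t + 1))

# Equipment example: n = 5, theta = 0.6 under H0, observed t = 1.
p_value = lower_tail_p_value(1, 5, 0.6)
print(round(p_value, 5))
```

For a one-tailed test in the other direction (HA to the right of H0), one would instead sum the probabilities for k from t up to n.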
1. Null hypothesis H0 (e.g. “this equipment has probability 0.6 of breaking down”).
2. Alternative hypothesis HA (e.g. “this equipment has probability less than 0.6 of
breaking down”). The test is either one-tailed or two-tailed, depending on HA .
3. A random sample of size n: (X1 , X2 , . . . , Xn ).
4. A test statistic T (which simply maps each observed random sample to a real number).
5. The p-value of the observed sample. This is the probability that — assuming H0 were
true — T takes on values that are at least “as extreme as” the actual observed test
statistic t.
6. The significance level α. This is a pre-selected threshold, usually chosen to be some
small value. The conventional significance levels are α = 0.1, α = 0.05, or α = 0.01.
We then conclude qualitatively that:
• A small p-value casts doubt on or provides evidence against H0 .
• A large p-value fails to cast doubt on or provide evidence against H0 .
In particular, if p < α, then we say that we reject H0 at the significance level α. And
if p ≥ α, then we say that we fail to reject H0 at the significance level α.
Note importantly that to reject H0 (at some significance level α) does NOT mean that H0
is false and HA is true. Similarly, failure to reject H0 does NOT mean that H0 is true and
HA is false. More on this below.
Another example of NHST, now slightly more formally and carefully presented.
H0 ∶ µ = 0.3,
HA ∶ µ > 0.3.
T = X1 + X2 + ⋅ ⋅ ⋅ + X100 .
Suppose that in our observed random sample (x1 , x2 , . . . , x100 ), we find that 39 are in
favour of Dr. Chee. Our observed test statistic is thus t = 39.
We now ask: What is the probability that — assuming H0 were true — T takes on
values that are at least “as extreme as” the actual observed test statistic t? That is, what
is the p-value of the observed sample?
Now, assuming H0 were true, T is a binomial random variable with parameters 100 and
0.3. That is, T ∼ B (n, p) = B (100, 0.3). So:
Exercise 431. We flip a coin 20 times and get 17 heads. Test, at the 5% significance
level, whether the coin is biased towards heads. (Answer on p. 1595.)
H0 ∶ µ = 0.3,
HA ∶ µ > 0.3.
This was a one-tailed test because the alternative hypothesis HA was that µ was to the
right of 0.3.
If instead we changed the alternative hypothesis to:
H0 ∶ µ = 0.3,
HA ∶ µ ≠ 0.3.
Then this would be called a two-tailed test, because the alternative hypothesis HA is that
µ is either to the left or to the right of 0.3.
We now repeat the examples done in the previous section, but with HA tweaked so that we
instead have two-tailed tests. The difference is that the p-value is calculated differently.
367
By the way, the more common convention is to say “one-tailed” and “two-tailed” tests, rather than
“one-tail” and “two-tail” tests, as is the norm in Singapore (similar to those “Close for break” signs you
sometimes see). But after some consultation with my grammatical experts, I have been told that both
are equally correct.
Example 1203 (equipment breakdown).
Everything is as before, except that we now change the alternative hypothesis:
H0 ∶ θ = 0.6,
HA ∶ θ ≠ 0.6.
Say we observe the same random sample as before: (x1 , x2 , x3 , x4 , x5 ) = (0, 0, 0, 1, 0).
Again our test statistic is the sample number of failures T = X1 + X2 + X3 + X4 + X5 . And
so again our observed test statistic is t = x1 + x2 + x3 + x4 + x5 = 0 + 0 + 0 + 1 + 0 = 1.
The difference now is how the p-value (of the observed sample) is calculated. In words,
the p-value gives the likelihood that our test statistic is “at least as extreme as” that
actually observed — assuming H0 were true.
Previously, under a one-tailed test, we interpreted “our test statistic is at least as extreme
as that actually observed” to mean the event T ≤ t = 1.
Now that we’re doing a two-tailed test, we’ll instead interpret the same phrase to mean
both the event T ≤ t = 1 and the event that T is as far away on the other side of
E [T ∣H0 ] = 3. The second event is, specifically, T ≥ 5. Altogether then, the p-value is
given by
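$$p = P(T \le 1 \text{ or } T \ge 5 \mid H_0) = P(T \le 1 \mid H_0) + P(T = 5 \mid H_0) = 0.08704 + 0.6^5 = 0.08704 + 0.07776 = 0.1648.$$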
Since p = 0.1648 ≥ α = 0.1, we say that we fail to reject H0 at the α = 0.1 significance
level.
Observe that previously, under the one-tailed test, we could reject H0 at the α = 0.1
significance level, because there p = 0.08704. Now, in contrast, under the two-tailed test,
we fail to reject H0 at the same significance level.
In general, all else equal, the p-value for an observed random sample is greater under a
two-tailed test than under a one-tailed test. Thus, under a two-tailed test, we are less
likely to reject H0 .
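The two-tailed calculation can be sketched in the same way. The function below follows the convention used in the example above: it adds up the probabilities of all values of T that are at least as far from E[T | H0] as the observed t, on either side. The function names are hypothetical, and the small tolerance `1e-9` is our own guard against floating-point rounding.

```python
from math import comb

def binomial_pmf(k, n, theta):
    """P(T = k) for T ~ B(n, theta)."""
    return comb(n, k) * theta**k * (1 - theta)**(n - k)

def two_tailed_p_value(t, n, theta):
    """P(|T - E[T]| >= |t - E[T]| | H0), where T ~ B(n, theta) under H0."""
    centre = n * theta
    distance = abs(t - centre)
    return sum(binomial_pmf(k, n, theta)
               for k in range(n + 1)
               if abs(k - centre) >= distance - 1e-9)

# Equipment example: the events T <= 1 and T >= 5 are both counted.
p_equipment = two_tailed_p_value(1, 5, 0.6)
print(round(p_equipment, 4))

# By-election example: the events T >= 39 and T <= 21 are both counted.
p_votes = two_tailed_p_value(39, 100, 0.3)
print(p_votes)
```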
H0 ∶µ = 0.3,
HA ∶µ ≠ 0.3.
Say we observe the same random sample as before: (x1 , x2 , . . . , x100 ), in which 39 votes
were in favour of Dr. Chee. So again our observed test statistic is t = x1 +x2 +⋅ ⋅ ⋅+x100 = 39.
The difference now is how the p-value (of the observed sample) is calculated. In words,
the p-value gives the likelihood that our test statistic is “at least as extreme as” that
actually observed — assuming H0 were true.
Previously, under a one-tailed test, we interpreted “our test statistic is at least as extreme
as that actually observed” to mean the event T ≥ t = 39.
Now that we’re doing a two-tailed test, we’ll instead interpret the same phrase to mean
both the event T ≥ t = 39 and the event that T is as far away on the other side of
E [T ∣H0 ] = 30. The second event is, specifically, T ≤ 21. Altogether then, the p-value is
given by
Exercise 432. We flip a coin 20 times and get 17 heads. Test, at the 5% significance
level, whether the coin is biased. (Answer on p. 1595.)
p = P (D∣H0 ) ,
where D stands for the observed data and H0 stands for the null hypothesis. The p-value answers the following question: Assuming H0 were true, what’s the probability that we’d get data “at least as extreme” as those actually observed (D)?
Say we get a p-value of 0.03. We should then say simply that
• The small p-value casts doubt on or provides evidence against H0 .
• If the pre-selected significance level was α = 0.05, then we may say that we reject H0
at the 5% significance level.
However, instead of merely saying the above, some researchers may instead conclude that:
Do you see the error here? The researcher has gone from the finding that p = P (D∣H0 ) = 0.03
to the conclusion that P (H0 ∣D) = 0.03. This is precisely the Conditional Probability Fallacy
(CPF), which we discussed at length in subsection 92.1.
The error is the same as leaping from “A lottery ticket buyer who doesn’t cheat has a small
probability q of winning” to “Jane bought a lottery ticket and won. Therefore, there is only
probability q that she didn’t cheat.”
The p-value is NOT the probability that H0 is true.368 Instead, it is the probability that
— assuming H0 were true — we would have gotten data “at least as extreme” as those
actually observed. This is an important difference. But it is also a subtle one, which is why
even researchers get confused.
368
Indeed, under the objectivist view, such a statement is nonsensical anyway, because H0 is either true
or not true; it makes no sense to talk probabilistically about whether H0 is true.
107.3. Common Misinterpretations of the Margin of Error
(Optional)
The sampling error or margin of error is often misinterpreted by laypersons (and
journalists).
Example 1204. On the night of the 2016 Bukit Batok SMC By-Election, the Elections
Department announced369 that based on a sample count of 900 ballots,
• Dr. Chee had won 39% of the votes.
• These sample counts have a confidence level of 95%, with a ±4% margin of error.
What does the above gobbledygook mean? Let µ be the true proportion of votes won by
Dr. Chee. Let X̄ be the sample proportion and x̄ be the observed sample proportion.
It’s clear enough what the 39% means — they randomly counted 900 ballots and found
(after accounting for any spoilt votes) that x̄ = 39% were in favour of Dr. Chee.
What’s less clear is what the 95% confidence level and ±4% margin of error mean.
Here are three possible interpretations of what is meant. Only one is correct.
1. “With probability 0.95, µ ∈ (x̄ − 0.04, x̄ + 0.04) = (0.35, 0.43).”
2. “With probability 0.95, X̄ ∈ (x̄ − 0.04, x̄ + 0.04) = (0.35, 0.43).”
Equivalently, suppose we repeatedly observe many random samples of size 900. Then we
should find that in 0.95 of these observed random samples, the observed sample mean is
between 0.35 and 0.43.
3. “With probability 0.95, X̄ ∈ (µ − 0.04, µ + 0.04).”
We have no idea what µ is. All we can say is that with probability 0.95, the sample mean
X̄ of votes for Dr. Chee is between µ − 0.04 and µ + 0.04.
Equivalently, suppose we repeatedly observe many random samples of size 900. Then we
should find that in 0.95 of these observed random samples, the observed sample mean is
between µ − 0.04 and µ + 0.04.
Take a moment to understand what each of the above interpretations say. Then decide
which you think is the correct interpretation, before turning to the next page.
(Example continues on the next page ...)
See section 122.8 in the Appendices for a discussion of where the Elections Department’s
±4% margin of error comes from.
Example 1205. On the night of the 2016 Bukit Batok SMC By-Election, a website
called Mothership.sg wrote:
“Based on the sample count of 100 votes,372 it was revealed at 9.26pm that the SDP
Sec-Gen received 39 percent of votes. In other words, Chee would score 35 per cent in
the worst case scenario and 43 per cent in the best case scenario.”
This is the most absurd misinterpretation of the margin of error I have ever seen.373
Let’s see what the correct worst- and best-case scenarios are.
Suppose that in the observed random sample of 900 votes, exactly 39% or 0.39×900 = 351
were votes for Dr. Chee and the remaining 549 were for PAP Guy. Then:
• Worst-case scenario: The observed random sample of 900 votes happened to contain
exactly all of the votes in favour of Dr. Chee. That is, Dr. Chee won only 351 votes
and PAP Guy won the remaining 23,570 − 351 = 23,219 votes. So the correct worst-case
scenario is that Dr. Chee won ≈ 1.5% of the votes.
• Best-case scenario: The observed random sample of 900 votes happened to contain
exactly all of the votes in favour of PAP Guy. That is, PAP Guy won only 549 votes
and Dr. Chee won the remaining 23,570 − 549 = 23,021 votes. So the correct best-case
scenario is that Dr. Chee won ≈ 97.7% of the votes.
These worst- and best-case scenarios are admittedly unlikely. Nonetheless, they are
possible scenarios all the same. The journalist’s purported worst- and best-case scenarios
are completely wrong.
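The arithmetic behind these two scenarios can be checked with a short Python sketch. The variable names are illustrative only; the figures (23,570 votes in total, 351 sampled votes for Dr. Chee, 549 for his opponent) are those used in the example.

```python
# Figures taken from the example above.
total_votes = 23_570
sample_size = 900
sample_for_chee = 351
sample_for_opponent = sample_size - sample_for_chee  # 549

# Worst case for Dr. Chee: the sample happens to contain all of his votes.
worst_share = sample_for_chee / total_votes

# Best case for Dr. Chee: the sample happens to contain all of his opponent's votes.
best_share = (total_votes - sample_for_opponent) / total_votes

print(f"worst case: {worst_share:.1%}")
print(f"best case:  {best_share:.1%}")
```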
372
By the way, even this basic fact was wrong. The sample count was not 100 votes. Instead, it was 900
votes, consisting of 100 votes from each of 9 polling stations.
Moreover, the Mothership.sg journalist failed to report the confidence level of 95%, either because he
didn’t know what it meant or because he didn’t think it important. But it is important. It is pointless
to inform the reader about the margin of error without also specifying the confidence level.
373
You can find several misinterpretations of the margin of error collected in this academic paper. None is
as absurdly bad as the error committed here.
107.4. Critical Region and Critical Value
Informally, the critical region is the set of values of the observed test statistic t for which
we would reject the null hypothesis. The critical region is thus sometimes also called the
rejection region.
And the critical value(s) is (are) the exact value(s) of the observed test statistic t at
which we are just able to reject the null hypothesis.
Example 1048. (Dr. Chee election.) Say that as before, we have a one-tailed test
where the two competing hypotheses are:
H0 ∶ µ = 0.3,
HA ∶ µ > 0.3.
H0 ∶ µ = 0.3,
HA ∶ µ ≠ 0.3.
The significance level is again α = 0.05. Again, the observed random sample of 100 votes
contains 39 in favour of Dr. Chee, so that our observed test statistic is t = 39.
We calculated that the corresponding p-value is 0.06281 and so we failed to reject H0 at
the α = 0.05 significance level.
We calculate that if t = 40, then the corresponding p-value is ≈ 0.03745 (you should verify
this for yourself). Thus, the critical values are 20 and 40, because these are the values of
t at which we are just able to reject H0 .
The critical region is the set {0, 1, . . . , 20, 40, 41, . . . , 100}. These are the values at which
we’d be able to reject H0 at the α = 0.05 significance level.
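As a rough check of the numbers in this example, the two-tailed p-values and the critical region can be computed directly from the binomial distribution. This is only a sketch: it assumes T ∼ B(100, 0.3) under H0, takes “at least as extreme” to mean at least as far from E[T∣H0] = 30 on either side, and uses illustrative function names.

```python
from math import comb

N, P0 = 100, 0.3   # sample size and proportion under H0
CENTRE = 30        # E[T | H0] = N * P0
ALPHA = 0.05       # significance level

def pmf(k):
    """P(T = k) when T ~ B(N, P0)."""
    return comb(N, k) * P0**k * (1 - P0)**(N - k)

def two_tailed_p(t):
    """P(T at least as far from CENTRE as t, on either side), under H0."""
    d = abs(t - CENTRE)
    if d == 0:
        return 1.0  # every outcome is at least this extreme
    upper = sum(pmf(k) for k in range(CENTRE + d, N + 1))
    lower = sum(pmf(k) for k in range(0, CENTRE - d + 1))
    return upper + lower

# Values of t at which we would reject H0 at the ALPHA significance level.
critical_region = [t for t in range(N + 1) if two_tailed_p(t) <= ALPHA]

print(two_tailed_p(39), two_tailed_p(40))
print(critical_region[:3], "...", critical_region[-3:])
```

The list `critical_region` can be compared against the set given in the example, and its two innermost members against the stated critical values.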
Exercise 433. (Answer on p. 1596.) We flip a coin 20 times. What are the critical
region and critical value(s) in
(a) A test, at the 5% significance level, of whether the coin is biased towards heads.
(b) A test, at the 5% significance level, of whether the coin is biased.
Example 1206. The weight (in mg) of a grain of sand is X ∼ N (µ, 9). Our unknown
parameter of interest is the true population mean µ (i.e. the true average weight of a
grain of sand). Our “guess” is that µ = 5. We thus write down two competing hypotheses:
H0 ∶ µ = 5,
HA ∶ µ ≠ 5.
\[ = P\left(Z \geq \frac{7.5 - 5}{\sqrt{9/4}}\right) + P\left(Z \leq \frac{2.5 - 5}{\sqrt{9/4}}\right) \approx 0.04779 + 0.04779 = 0.09558. \]
Thus, we reject H0 at the α = 0.1 significance level. However, we would fail to reject H0
at the α = 0.05 significance level.
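The two-tailed p-value above can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the sample size n = 4 and observed sample mean x̄ = 7.5 are read off the displayed calculation, and the standard normal CDF is built from `math.erfc` rather than from any statistics library.

```python
from math import erfc, sqrt

def normal_cdf(z):
    """Standard normal CDF, written in terms of the complementary error function."""
    return 0.5 * erfc(-z / sqrt(2))

mu0 = 5        # mean under H0
sigma2 = 9     # known population variance
n = 4          # sample size (read off the calculation above)
x_bar = 7.5    # observed sample mean (read off the calculation above)

z = (x_bar - mu0) / sqrt(sigma2 / n)
p_value = 2 * (1 - normal_cdf(abs(z)))   # two-tailed test

print(round(z, 4), round(p_value, 4))
print("reject at 10%:", p_value < 0.10, "| reject at 5%:", p_value < 0.05)
```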
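Sample size n    Distribution    σ²         Test
Any              Normal          Known      Z-test: $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$.
Large            Any             Known      Z-test: $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$.
Large            Any             Unknown    Z-test: $\frac{\bar{X} - \mu}{s/\sqrt{n}} \sim N(0, 1)$.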
Exercise 434. The Singapore daily high temperature (in °C) can be modelled by
X ∼ N (µ, 8). Our unknown parameter of interest is the true population mean µ (i.e.
the true average daily high temperature). Your friend guesses that µ = 34. You gather
the following data on daily high temperatures, of 10 randomly-chosen days in 2015:
(35, 35, 31, 32, 33, 34, 31, 34, 35, 34). Test your friend’s hypothesis, at the α = 0.05 signific-
ance level. (Be sure to write down your null and alternative hypotheses.) (Answer on p.
1597.)
Example 1207. The weight (in mg) of a grain of sand is X ∼ (µ, 9). (This says simply
that X is distributed with mean µ and variance 9.) Our unknown parameter of interest
is the true population mean µ (i.e. the true average weight of a grain of sand). Again,
we “guess” that µ = 5. Again, we write down:
H0 ∶ µ = 5,
HA ∶ µ ≠ 5.
= P (Z ≥ 2) + P (Z ≤ −2) ≈ 0.0455.
Exercise 435. The Singapore daily high temperature (in °C) can be modelled by X ∼
(µ, 8). Our unknown parameter of interest is the true population mean µ (i.e. the true
average daily high temperature). Your friend guesses that µ = 34. You gather the data
on daily high temperatures, of 100 randomly-chosen days in 2015 and find the observed
sample average temperature to be 33.4 °C. Test your friend’s hypothesis, at the α = 0.05
significance level. (Be sure to write down your null and alternative hypotheses. Also,
clearly state where you use the CLT.) (Answer on p. 1597.)
Example 1208. The weight (in mg) of a grain of sand is X ∼ (µ, σ 2 ). (This says simply
that X is distributed with mean µ and variance σ 2 .) Our unknown parameter of interest
is the true population mean µ (i.e. the true average weight of a grain of sand). Again,
we “guess” that µ = 5. Again, we write down
H0 ∶ µ = 5,
HA ∶ µ ≠ 5.
Exercise 436. The Singapore daily high temperature (in °C) can be modelled by X ∼
(µ, σ 2 ). Our unknown parameter of interest is the true population mean µ (i.e. the true
average daily high temperature). Your friend guesses that µ = 34. You gather the data
on daily high temperatures, of 100 randomly-chosen days in 2015. Your observed sample
mean temperature is 33.4 °C and your observed sample variance is 11.2 °C². Test your
friend’s hypothesis, at the α = 0.05 significance level. (Be sure to write down your null
and alternative hypotheses. Also, clearly state where you use the CLT.) (Answer on p.
1598.)
Example 1209. We flip a coin 100 times. We get 100 heads. What can we say about
the coin?
This is an open-ended question, to which there can be many different answers. Here’s
the answer we’re taught to give for H2 Maths:
Let µ be the probability that a coin-flip is heads. We formulate a pair of competing
hypotheses:
H0 ∶ µ = 0.5,
HA ∶ µ ≠ 0.5.
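Our test statistic T is the number of heads (out of 100 coin-flips). Our observed test
statistic t is 100. The corresponding p-value (note that this is a two-tailed test) is
$p = P(T \geq 100 \mid H_0) + P(T \leq 0 \mid H_0) = 2 \times 0.5^{100} \approx 1.6 \times 10^{-30}$.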
Exercise 437. (Answer on p. 1598.) We observe the weights (in kg) of a random sample
of 50 Singaporeans: (x1 , x2 , . . . , x50 ). We observe that $\sum x_i/50 = 68$ and $\sum x_i^2/50 = 5000$.
A friend claims that the average American is heavier than the average Singaporean. It is
known that the average American weighs 75 kg. Is your friend correct? If you make any
assumptions or approximations, make clear exactly where you do so. (Hint: Use Fact
183(a)).
Example 1210. We measure the heights and weights of 10 adult male Singaporeans.
Their heights (in cm) and weights (in kg) are given in this table:
i 1 2 3 4 5 6 7 8 9 10
hi (cm) 182 165 173 155 178 174 169 160 150 190
wi (kg) 81 70 71 53 72 75 69 60 44 80
We call (hi , wi ) observation i. So for example, observation 5 is (178, 72) and observation
9 is (150, 44).
We can plot a scatter diagram of these 10 persons’ weights (vertical axis) against their
heights (horizontal).
[Figure: scatter diagram of weight (kg, vertical axis) against height (cm, horizontal axis), with a black dotted line of best fit.]
The black dotted line is called a line of best fit. Shortly (section 108.4), we’ll learn
how to construct this line of best fit.
The more closely the data points in the above scatter diagram lie to a straight line, the
more strongly linearly-correlated are weight and height. So here with these particular
data, the linear correlation between weight and height seems strong. In the next section,
we’ll learn about the product moment correlation coefficient, which is a way to
precisely quantify the degree to which two sets of data are linearly-correlated.
Because the line of best fit is upward-sloping, we can also say that the linear correlation
is positive.
i 1 2 3 4 ... 361
ti (°C) 27.3 29.5 31.1 32 30.2
pi (mm) 0 0.2 0 0 12.4
[Figure: scatter diagram of rainfall (mm, vertical axis) against temperature (degrees Celsius, horizontal axis), with a black dotted line of best fit.]
Again, the black dotted line is a line of best fit. The data points do not seem close to
this line. Thus, it seems that the linear correlation between temperature and rainfall is
weak.
The line of best fit is downward-sloping and so we say that the linear correlation is
negative.
Exercise 438. (Answer on p. 1599.) The table below shows the prices charged (p) and
the number of haircuts (q) given by 5 different barbers, during June 2016.
Draw a scatter diagram with price on the horizontal axis. Plot also what you think looks
like a line of best fit.
i 1 2 3 4 5
pi ($) 8 9 4 10 8
qi 300 250 1000 400 400
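Definition 216. Let (x1 , x2 , . . . , xn ) and (y1 , y2 , . . . , yn ) be two ordered sets of real numbers.
The product moment correlation coefficient (PMCC) is the following real number:
\[ r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}, \]
where $\bar{x}$ and $\bar{y}$ are the means of the two sets.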
8. r is merely a measure of linear correlation and nothing else. Two variables may be very
closely related but not linearly-correlated. For example, data generated by the quadratic
model $y_i = x_i^2$ may have a very low r.
i 1 2 3 4 5 6 7 8 9 10
hi (cm) 182 165 173 155 178 174 169 160 150 190
wi (kg) 81 70 71 53 72 75 69 60 44 80
[Figure: scatter diagram of weight (kg) against height (cm).]
\[ \bar{h} = \frac{182 + 165 + 173 + 155 + 178 + 174 + 169 + 160 + 150 + 190}{10} = 169.6, \]
\[ \bar{w} = \frac{81 + 70 + 71 + 53 + 72 + 75 + 69 + 60 + 44 + 80}{10} = 67.5, \]
\[ \sum_{i=1}^{n} (h_i - \bar{h})(w_i - \bar{w}) = (182 - \bar{h})(81 - \bar{w}) + \dots + (190 - \bar{h})(80 - \bar{w}) = 1237, \]
\[ \sqrt{\sum_{i=1}^{n} (h_i - \bar{h})^2} = \sqrt{(182 - 169.6)^2 + \dots + (190 - 169.6)^2} \approx 37.180640, \]
\[ \sqrt{\sum_{i=1}^{n} (w_i - \bar{w})^2} = \sqrt{(81 - 67.5)^2 + \dots + (80 - 67.5)^2} \approx 35.418922, \]
As expected, r > 0 (the linear correlation is positive or, equivalently, the line of best fit
is upward-sloping). Moreover, r is close to 1 (the linear correlation is very strong).
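The PMCC for the height and weight data can also be computed directly. The following Python sketch uses the ten observations from the table; the function name `pmcc` is illustrative, and the formula is the usual ratio of the sum of cross-deviations to the product of the two root sums of squared deviations.

```python
from math import sqrt

heights = [182, 165, 173, 155, 178, 174, 169, 160, 150, 190]  # cm
weights = [81, 70, 71, 53, 72, 75, 69, 60, 44, 80]            # kg

def pmcc(xs, ys):
    """Product moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / (sqrt(s_xx) * sqrt(s_yy))

r = pmcc(heights, weights)
print(round(r, 4))
```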
i 1 2 3 4 ... 361
ti (°C) 27.3 29.5 31.1 32 30.2
pi (mm) 0 0.2 0 0 12.4
[Figure: scatter diagram of rainfall (mm) against temperature (degrees Celsius).]
\[ \bar{t} = \frac{27.3 + 29.5 + 31.1 + 32 + \dots + 30.2}{361} \approx 31.5, \qquad \bar{p} = \frac{0 + 0.2 + 0 + 0 + \dots + 12.4}{361} \approx 5.0. \]
≈ −0.1623.
As expected, r < 0 (the linear correlation is negative or, equivalently, the line of best fit
is downward-sloping). Moreover, r is fairly close to 0 (the linear correlation is weak).
i 1 2 3 4 5
pi ($) 8 9 4 10 8
qi 300 250 1000 400 400
[Figure: chart comparing spending on science (in $ billions) with the number of hanging suicides.]
The PMCC is r ≈ 0.99789126. So the two sets of data are almost perfectly linearly-
correlated. But of course, this doesn’t mean that spending on science causes suicides
or that suicides cause spending on science. More likely, the correlation is simply spurious.
A comic from xkcd:
Example 425 (continued from above). We suspect that the heights and weights of
adult male Singaporeans are linearly-correlated. We thus write down this linear model:
w = a + bh.
Recall the quote: “All models are wrong, but some are useful.” The model w = a + bh is
unlikely to be exactly correct. But hopefully it will be useful.
We treat a and b as unknown parameters (do you expect b to be positive or negative?).
Our goal is to try to get estimates for a and b, from an observed random sample of height
and weight data.
We recycle the data from earlier. These, along with the scatter diagram, are reproduced
for convenience.
i 1 2 3 4 5 6 7 8 9 10
hi (cm) 182 165 173 155 178 174 169 160 150 190
wi (kg) 81 70 71 53 72 75 69 60 44 80
[Figure: scatter diagram of weight (kg) against height (cm), with three candidate lines of best fit.]
The basic idea of linear regression is this: Find the line that “best fits” the given data.
Drawn in the figure above are three plausible candidates for the “line of best fit”. But
there can only be one line of best fit. Which is it?
At the end of the day, we’ll choose the black dotted line as “the” line of best fit. But why?
This will be answered in the next section.
p = a + bt.
Again, our goal is to get estimates for the unknown parameters a and b (do you expect
b to be positive or negative?).
We gather the following data (recycled from before):
i 1 2 3 4 ... 361
ti (°C) 27.3 29.5 31.1 32 30.2
pi (mm) 0 0.2 0 0 12.4
[Figure: scatter diagram of rainfall (mm) against temperature (degrees Celsius), with several candidate lines of best fit.]
Again, drawn in the figure above are several plausible candidates for the “line of best fit”.
It turns out that the black dotted line will be “the” line of best fit.
[Figure: scatter diagram of weight (kg) against height (cm), with a candidate line of best fit.]
i 1 2 3 4 5 6 7 8 9 10
hi (cm) 182 165 173 155 178 174 169 160 150 190
wi (kg) 81 70 71 53 72 75 69 60 44 80
ŵi (kg) 65 65 65 65 65 65 65 65 65 65
ûi = wi − ŵi (kg) 16 5 6 −12 7 10 4 −5 −21 15
The second last row of the above table gives, for each person with height hi , the cor-
responding predicted weight ŵi (as per our candidate line of best fit). The residual ûi
(last row) is then defined as the vertical distance between the data point and the weight
predicted by the candidate line of best fit.
The SSR is $\sum_{i=1}^{10} \hat{u}_i^2 = 16^2 + 5^2 + 6^2 + (-12)^2 + 7^2 + 10^2 + 4^2 + (-5)^2 + (-21)^2 + 15^2 = 1317$.
Can we do better than this? That is, can we find another candidate line of best fit whose
SSR is smaller than 1317?
Fact 187. Let (x1 , x2 , . . . , xn ) and (y1 , y2 , . . . , yn ) be two ordered sets of data. The OLS
regression line of y on x is y − ȳ = b̂ (x − x̄), where
(ii) \[ \hat{b} = \frac{\sum x_i y_i - n\bar{x}\bar{y}}{\sum x_i^2 - n\bar{x}^2}. \]
Moreover, the regression line can also be written in the form y = â + b̂x, where b̂ is as
given above and â = ȳ − b̂x̄.
Proof. We want to find â and b̂ such that the line y = â + b̂x has the smallest SSR possible.
The residual ûi is defined as the vertical distance between (xi , yi ) and the line y = â + b̂x.
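That is, $\hat{u}_i = y_i - (\hat{a} + \hat{b}x_i)$.
Thus, the SSR is $\sum \hat{u}_i^2 = \sum \left[y_i - (\hat{a} + \hat{b}x_i)\right]^2$.
We wish to minimise the SSR, by choosing appropriate values of â and b̂. This involves the
following pair of first order conditions:375
\[ \frac{\partial}{\partial \hat{a}} \sum \hat{u}_i^2 = 0, \qquad \frac{\partial}{\partial \hat{b}} \sum \hat{u}_i^2 = 0. \]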
The remainder of the proof simply involves taking derivatives and doing the algebra, and
is continued on p. 1384 in the Appendices.
Remark 129. Whenever we simply say regression line or line of best fit, it may safely
be assumed that we are talking about the OLS regression line.
375
There’s a bit of hand-waving here.
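Example 1210 (height and weight example revisited). We already calculated
$\bar{h} = 169.6$, $\bar{w} = 67.5$, $\sum_{i=1}^{n} (h_i - \bar{h})(w_i - \bar{w}) = 1237$ and $\sum_{i=1}^{n} (h_i - \bar{h})^2 \approx 37.180640^2 \approx 1382.4$, so that $\hat{b} \approx 1237/1382.4 \approx 0.8948$.
Thus, the regression line is w − 67.5 = 0.8948 (h − 169.6) or w = â + b̂h = −84.26 + 0.8948h.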
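As a cross-check, the slope and intercept can be computed from the raw data using formula (ii) of Fact 187, together with â = ȳ − b̂x̄. This is a minimal Python sketch; the variable names are illustrative.

```python
heights = [182, 165, 173, 155, 178, 174, 169, 160, 150, 190]  # cm
weights = [81, 70, 71, 53, 72, 75, 69, 60, 44, 80]            # kg

n = len(heights)
h_bar = sum(heights) / n
w_bar = sum(weights) / n

# Formula (ii): b = (sum of x*y - n*x_bar*y_bar) / (sum of x^2 - n*x_bar^2)
sum_hw = sum(h * w for h, w in zip(heights, weights))
sum_hh = sum(h * h for h in heights)
b_hat = (sum_hw - n * h_bar * w_bar) / (sum_hh - n * h_bar ** 2)

# Intercept: a = y_bar - b * x_bar
a_hat = w_bar - b_hat * h_bar

print(f"w = {a_hat:.2f} + {b_hat:.4f} h")
```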
[Figure: scatter diagram of weight (kg) against height (cm), with the OLS regression line.]
i 1 2 3 4 5 6 7 8 9 10
hi (cm) 182 165 173 155 178 174 169 160 150 190
wi (kg) 81 70 71 53 72 75 69 60 44 80
ŵi (kg) 78.6 63.4 70.5 54.4 75.0 71.4 67.0 58.9 50.0 85.8
ûi = wi − ŵi (kg) 2.4 6.6 0.5 −1.4 −3.0 3.6 2.0 1.1 −6.0 −5.8
The SSR for the actual line of best fit is $\sum_{i=1}^{10} \hat{u}_i^2 = 2.4^2 + \dots + (-5.8)^2 \approx 147.6$. This is much
better than the SSR of 1317 that we found for the previous candidate line of best fit,
which was simply a horizontal line.
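The comparison of the two SSRs can be reproduced with the sketch below. It assumes the earlier candidate line was the horizontal line ŵ = 65 (as in the earlier table of predicted weights) and recomputes the OLS coefficients from the data rather than using the rounded values; the helper name `ssr` is illustrative.

```python
heights = [182, 165, 173, 155, 178, 174, 169, 160, 150, 190]  # cm
weights = [81, 70, 71, 53, 72, 75, 69, 60, 44, 80]            # kg

def ssr(a, b):
    """Sum of squared residuals of the line w = a + b*h on the data above."""
    return sum((w - (a + b * h)) ** 2 for h, w in zip(heights, weights))

# Candidate line from before: the horizontal line w = 65.
ssr_flat = ssr(65, 0)

# OLS line: slope = S_hw / S_hh, intercept = w_bar - slope * h_bar.
n = len(heights)
h_bar = sum(heights) / n
w_bar = sum(weights) / n
s_hw = sum((h - h_bar) * (w - w_bar) for h, w in zip(heights, weights))
s_hh = sum((h - h_bar) ** 2 for h in heights)
b_hat = s_hw / s_hh
a_hat = w_bar - b_hat * h_bar
ssr_ols = ssr(a_hat, b_hat)

print(ssr_flat, round(ssr_ols, 1))
```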
i 1 2 3 4 5
pi ($) 8 9 4 10 8
qi 300 250 1000 400 400
q̂i
ûi = qi − q̂i
Example 1212. We’ll find the PMCC and the regression line for these data:
i 1 2 3 4 5
xi 1 7 3 11 8
yi 14 5 6 4 4
4. Press ENTER once. And press ENTER a second time. The TI84 now says “DONE”,
telling you that the Diagnostic option has been turned on.
The above steps need only be performed once. Unless of course you’ve just reset your
calculator (as is required before each exam). In which case you have to go through the
above steps again.
After Step 9. After Step 10. After Step 11. After Step 12.
Exercise 441. Using your TI84, find the PMCC between q and p, and also find the
regression line of q on p (see data below). Verify that your answer for this exercise is the
same as those in the last two exercises. (Answer on p. 1601.)
i 1 2 3 4 5
pi ($) 8 9 4 10 8
qi 300 250 1000 400 400
Given any value of x, we call the corresponding ŷ = b̂ (x − x̄) + ȳ the fitted value or the
predicted value. One use of the regression line is that it can help us predict (or “guess”)
the value of y, even for x for which we have no data.
Example 1210 (height and weight example revisited). Say we want to guess
the weight of an adult male Singaporean who is 185 cm tall. Using our regression line,
we predict that his weight is ŵh=185 = 0.8948 × 185 − 84.26 ≈ 81.3 kg. This is called
interpolation, because we are predicting the weight of a person whose height is between
two of our observations.
Say instead we want to guess the weight of an adult male Singaporean who is 210 cm tall.
Using our regression line, we predict that his weight is ŵh=210 = 0.8948×210−84.26 ≈ 103.6
kg. This is called extrapolation, because we are predicting the weight of a person whose
height is beyond our rightmost observation.
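The two predictions can be sketched in Python as follows. The coefficients are the rounded ones from the regression line above; the function name and the interpolation check (is the height within the observed range of 150 cm to 190 cm?) are illustrative.

```python
A_HAT, B_HAT = -84.26, 0.8948   # rounded regression coefficients from the example
H_MIN, H_MAX = 150, 190         # smallest and largest observed heights (cm)

def predict_weight(height_cm):
    """Fitted value from the regression line w = a + b*h."""
    return A_HAT + B_HAT * height_cm

for h in (185, 210):
    kind = "interpolation" if H_MIN <= h <= H_MAX else "extrapolation"
    print(f"h = {h} cm: predicted weight {predict_weight(h):.1f} kg ({kind})")
```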
i 1 2 3 4 5 6 7 8 9 10
hi (cm) 182 165 173 155 178 174 169 160 150 190 185 210
wi (kg) 81 70 71 53 72 75 69 60 44 80 - -
ŵi (kg) 78.6 63.4 70.5 54.4 75.0 71.4 67.0 58.9 50.0 85.8 81.3 103.6
[Figure: scatter diagram of the data against height (cm), with the horizontal axis extended to 215 cm.]
This, though, is not a very satisfying explanation for why extrapolation is “less reliable”
than interpolation. It merely leads to another question: “Why should a prediction be more
reliable if done between two known observations, than if done to the right of the right-most
observation (or to the left of the left-most observation)?”
We won’t give an adequate answer to this latter question. Instead, we’ll simply give a
bunch of examples to illustrate the dangers of extrapolation:
Example 1213. A man on a diet weighs 115 kg in Week #1. Here’s a chart of his weight
loss.
The OLS line of best fit suggests that he has been losing about 0.5 kg a week.
He forgot to record his weight on Week #6. By interpolation, we “predict” that his
weight that week was 112.5 kg. This is probably a reliable guess.
By extrapolation, we predict that his weight on Week #201 will be 15 kg. This guess is
obviously absurd. It requires that he keeps losing 0.5 kg a week for nearly 4 years.
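The two guesses in this example follow from the same straight line. Here is a small Python sketch, assuming the fitted line passes through 115 kg in Week #1 with a slope of −0.5 kg per week (both figures are approximations taken from the example, and the function name is illustrative):

```python
def predicted_weight(week, start_kg=115.0, loss_per_week=0.5):
    """Weight predicted by the straight line through (1, start_kg)."""
    return start_kg - loss_per_week * (week - 1)

print(predicted_weight(6))    # interpolation: within the recorded weeks
print(predicted_weight(201))  # extrapolation: far beyond the recorded weeks
```

The line itself does not know where the data end, which is why it happily returns an absurd value for Week #201.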
The OLS line of best fit suggests that he has been growing by about 1 cm a month.
He forgot to record his height in Month #6. By interpolation, we “predict” that his
height that month was 165 cm. This is probably a reliable guess.
By extrapolation, we predict that his height in Month #101 will be 260 cm. This guess is
obviously absurd. It requires that he keep growing by 1 cm a month for 8-plus years.
Example 1215. Russell’s Chicken (Problems of Philosophy, 1912, Google Books link):
The man who has fed the chicken every day throughout its life at last wrings its neck
instead, showing that more refined views as to the uniformity of nature would have been
useful to the chicken. ... The mere fact that something has happened a certain number
of times causes animals and men to expect that it will happen again. Thus our instincts
certainly cause us to believe the sun will rise to-morrow, but we may be in no better a
position than the chicken which unexpectedly has its neck wrung.
\[ F_0 = 2^{2^0} + 1 = 3, \]
\[ F_1 = 2^{2^1} + 1 = 5, \]
\[ F_2 = 2^{2^2} + 1 = 17, \]
\[ F_3 = 2^{2^3} + 1 = 257, \]
\[ F_4 = 2^{2^4} + 1 = 65537. \]
Remarkably, the first five Fermat numbers are all prime. This observation led Fermat to
conjecture (guess) in the 17th century that all Fermat numbers are prime. This was an
act of extrapolation.
Unfortunately, Fermat’s act of extrapolation was wrong. About a century later, Euler
showed that $F_5 = 2^{2^5} + 1 = 4294967297 = 641 \times 6700417$ is composite (not prime).
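These claims about the first six Fermat numbers are easy to check by brute force. The sketch below uses plain trial division, which is fine for numbers of this size but hopeless for larger Fermat numbers; the function names are illustrative.

```python
def fermat(n):
    """The n-th Fermat number, 2^(2^n) + 1."""
    return 2 ** (2 ** n) + 1

def smallest_factor(m):
    """Smallest factor of m greater than 1 (equals m itself when m is prime)."""
    d = 2
    while d * d <= m:
        if m % d == 0:
            return d
        d += 1
    return m

for n in range(6):
    f = fermat(n)
    d = smallest_factor(f)
    verdict = "prime" if d == f else f"composite = {d} x {f // d}"
    print(f"F{n} = {f}: {verdict}")
```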
Today, the Fermat numbers F5 , F6 , . . . , F32 are all known to be composite. Indeed,
it was shown in 1964 that F32 is composite. Over half a century later, it is not yet
known if $F_{33} = 2^{2^{33}} + 1$ is prime or composite. F33 is an unimaginably huge number, with
After his third day at school, Ah Beng decides he’ll skip at least the next few Chinese
classes, because he thinks he knows how to write the Chinese characters for the numbers 4
and above. 4 simply consists of four horizontal strokes; 5 simply consists of five horizontal
strokes; etc. Unfortunately, Ah Beng’s act of extrapolation is wrong.
The characters for the numbers 4 through 10 look instead like this:
4 5 6 7 8 9 10
Example 1218. Moore’s Law. In 1965, Gordon Moore observed that the number of
components that could be crammed onto each integrated circuit doubled every year. He
predicted that this rate of progress would continue at least through 1975.
In 1975, he adjusted his prediction to a more modest rate of doubling every two years.
Thus far, this latter prediction has held up remarkably well. The following from Nature:
[Figure from Nature, “Moore’s lore”: For the past five decades, the number of transistors per microprocessor chip, a rough measure of processing power, has doubled about every two years, in step with Moore’s law (top). Chips also increased their ‘clock speed’, or rate of executing instructions, until 2004, when speeds were capped to limit heat. As computers increase in power and shrink in size, a new class of machines has emerged roughly every ten years (bottom). The top panel plots transistors per chip and clock speeds (MHz) against year, from 1960 to 2016.]
Unfortunately, as stated in the same Nature article, it “has become increasingly obvious
to everyone involved” that “Moore’s law ... is nearing its end”.
Example 1219. Augustine’s Law. In 1983, Norman Augustine observed that the cost
of a tactical aircraft grows four-fold every ten years. (Google Books.)
This is considerably quicker than the rate at which the annual US defense budget and
US Gross National Product (GNP) grows. Extrapolating, he concluded:
• In 2054, the entire annual US defense budget will be spent on a single aircraft.
• Early in the 22nd century, the entire US GNP will be spent on a single aircraft.
Exercise 442. Using the data below, “predict” how many haircuts were sold in June
2016 by (a) a barber who charged $7 per haircut; and (b) a barber who charged $200 per
haircut. Which prediction is an act of interpolation and which is an act of extrapolation?
Which prediction do you think is more reliable? (Answer on p. 1601.)
i 1 2 3 4 5
pi ($) 8 9 4 10 8
qi 300 250 1000 400 400
Example 1220. Quadratic. Consider the following data. There is a very strong, but
not perfect degree of linear correlation between x and y (r ≈ 0.950). The observations are
very close to, but are not exactly on the OLS line of best fit.
The degree of linear correlation between z and y is near perfect (r ≈ 0.995). The obser-
vations also lie closer to the line of best fit than before.
The degree of linear correlation between z and y is much stronger (r ≈ 0.899). The
observations also lie closer to the line of best fit.
The degree of linear correlation between z and y is much stronger (r ≈ 0.978). The
observations also lie closer to the line of best fit.
i 1 2 3 4 5
xi 1 2 3 4 5
yi 10.59 10.54 27.30 33.84 56.6
(a) Plot the above data in a scatter diagram and find the PMCC.
(b) Apply an appropriate transformation to x. Plot the transformed data in a scatter
diagram and find the PMCC.
— J.M. Hammersley377
It’s much more interesting to live not knowing than to have answers which
might be wrong.
The A-Level examiners378 want you to say, mindlessly and formulaically, that
Regurgitating the above sentence will earn you your full mark. But in fact, without the
“all else equal” clause, it is nonsense. And since it is almost never true that “all else is
equal”, it is almost always nonsense.
In every introductory course or text on statistics, one is told that the PMCC is merely
a relatively-unimportant consideration, in deciding between models. Yet somehow, the
A-Level examiners seem to consider the PMCC an all-important consideration.
Here’s a quick example to illustrate.
Example 1223. (From the 2015 exam — see Exercise 668 below.) In an experiment the
following information was gathered about air pressure P , measured in inches of mercury,
at different heights above sea level h, measured in feet.
h 2000 5000 10000 15000 20000 25000 30000 35000 40000 45000
P 27.8 24.9 20.6 16.9 13.8 11.1 8.89 7.04 5.52 4.28
The exam first asks us to find the PMCCs between (a) h and P ; (b) ln h and P ; and (c) √h
and P . The answers are (a) ra ≈ −0.980731; (b) rb ≈ −0.974800; and (c) rc ≈ −0.998638.
The A-Level exam then says, “Using the most appropriate case ..., find the equation
which best models air pressure at different heights.” The “correct” answer is that (c)
P = a + b√h is the “most appropriate” model, simply because the PMCC there is the
largest in magnitude.
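The three PMCCs quoted above can be recomputed from the table. The following Python sketch is illustrative only: `pmcc` is a hand-written helper, and the three cases correspond to correlating P with h, with ln h and with √h.

```python
from math import log, sqrt

h = [2000, 5000, 10000, 15000, 20000, 25000, 30000, 35000, 40000, 45000]  # feet
P = [27.8, 24.9, 20.6, 16.9, 13.8, 11.1, 8.89, 7.04, 5.52, 4.28]          # inches of mercury

def pmcc(xs, ys):
    """Product moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

r_a = pmcc(h, P)                          # (a) h and P
r_b = pmcc([log(x) for x in h], P)        # (b) ln h and P
r_c = pmcc([sqrt(x) for x in h], P)       # (c) sqrt(h) and P

print(round(r_a, 6), round(r_b, 6), round(r_c, 6))
```

All three values are negative, so "largest" can only sensibly mean largest in absolute value.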
(Example continues on the next page ...)
378
See 9740 N2015/II/10(iii), N2014/II/8(b)(ii), N2012/II/8(v), N2011/II/8(iii), N2010/II/10(iii), and
N2008/II/8(i). These are given in this textbook as Exercises 668, 674, 689, 695, 705, and 717.
(... Example continued from the previous page.)
But this is utter nonsense. One does not conclude that one model is “more appropri-
ate” than another simply because its PMCC is 0.018 larger. Small measurement errors
or plain bad luck could easily explain these tiny differences in PMCCs.
Moreover, even if one model has r = 0.9 and another has r = 0.4, it does not automatically
follow that the first model is “more appropriate” than the second. In deciding which
statistical model to use, there are very many considerations, of which the PMCC is a
relatively-unimportant one.
In my view, the correct answer should have been this:
Sadly, in the Singapore education system, what I consider to be the correct answer would
not have gotten you any marks. Instead, one is taught that there must always be one
single, simplistic, formulaic, definitive, “correct” answer. This is a convenient substitute
for thinking.
As it turns out, the “most correct” linear model — based on the actual barometric formula
(see subsection 122.10 in the Appendices) — is actually the following:
\[ \ln P = a + b \ln\left(1 + \frac{L}{T} h\right). \]
The constants L = −0.0065 kelvin per metre (K m⁻¹) and T = 288.15 kelvin (K) are,
respectively, the standard temperature lapse rate (up to 11, 000 m above sea level) and
the standard temperature (at sea level).
The PMCC for the above model is rd ≈ 0.999998, which is “better” than the cases ex-
amined above. (See this Google spreadsheet for the data and calculations.)
But again, the PMCC is merely one relatively-unimportant consideration. Our
conclusion that this last model is superior to the model P = a + b√h is based not on the
fact that rd is 0.001 larger than rc .
Instead, we are confident in this model because it was derived from physical theories. In
contrast, the model P = a + b√h (or indeed any of the other models suggested above)
is completely arbitrary and has no theoretical justification. Hence, even if the model
P = a + b√h had a PMCC of 1, we’d still prefer this last model.
Answers for Part VI (Probability and Statistics) 2016 and 2017 questions
will be written “soon”.
For more practice, try the TYS questions for H1 Maths (in my H1 Maths
Textbook). They’re very similar!
This part lists all the questions from the 2006–2017 A-Level exams, sorted into the six
different parts and in reverse chronological order.
In the older exams, they had the habit of not distinctly numbering different parts within
the same question as parts (i), (ii), etc. So I have sometimes taken the liberty of adding or
modifying such numbers.
379
Happily, the present 9758 syllabus (first examined in 2017) is considerably lighter than the previous
9740 syllabus (last examined in 2017), which was in turn lighter than the previous 9233 syllabus (last
examined in 2008). Thus, many past-year questions printed here are no longer in the current 9758
syllabus and you can skip them. Answers have been provided anyway and you’re perfectly welcome to
try them.
The following appears on the cover page of each of your 9758 A-Level exam papers.
READ THESE INSTRUCTIONS FIRST
Write your Centre number, index number and name on the work you hand in.
Write in dark blue or black pen on both sides of the paper.
You may use an HB pencil for any diagrams or graphs.
Do not use staples, paper clips, glue or correction fluid.
At the end of the examination, fasten all your work securely together.
The number of marks is given in brackets [ ] at the end of each question or part question.
(iii) Describe a pair of transformations which transforms the graph of C on to the graph of $y = \frac{1}{x}$. [2]
$x = \frac{3}{t}$, $y = 2t$.
(i) The line $y = 2x$ cuts C at the points A and B. Find the exact length of AB. [3]
(ii) The tangent at the point $P\left(\frac{3}{p}, 2p\right)$ on C meets the x-axis at D and the y-axis at E. The point F is the midpoint of DE. Find a cartesian equation of the curve traced by F as p varies. [5]
Exercise 448. (9758 N2017/II/3.) (Answer on p. 1605.)
(a) The curve $y = f(x)$ cuts the axes at $(a, 0)$ and $(0, b)$. It is given that $f^{-1}(x)$ exists. State, if it is possible to do so, the coordinates of the points where the following curves cut the axes.
(i) $y = f(2x)$.
(ii) $y = f(x - 1)$.
(iii) $y = f(2x - 1)$.
(iv) $y = f^{-1}(x)$. [4]
(b) The function g is defined by
$g : x \mapsto 1 - \frac{1}{1 - x}$, where $x \in \mathbb{R}$, $x \ne a$.
(i) State the value of a and explain why this value has to be excluded from the domain of g. [2]
(ii) Find $g^2(x)$ and $g^{-1}(x)$, giving your answers in simplified form. [4]
(iii) Find the values of b such that $g^2(b) = g^{-1}(b)$. [2]
$\frac{4x^2 + 4x - 14}{x - 4} < (x + 3)$. [3]
$ff(x) = x$. Explain why this value of x satisfies the equation $f(x) = f^{-1}(x)$. [5]
(b) The function g, with domain the set of non-negative integers, is given by
$$g(n) = \begin{cases} 1 & \text{for } n = 0, \\ 2 + g\left(\frac{1}{2}n\right) & \text{for } n \text{ even}, \\ 1 + g(n - 1) & \text{for } n \text{ odd}. \end{cases}$$
$y = \frac{a}{x^2} + bx + c$,
where a, b and c are constants. It is given that C passes through the points with coordinates
(1.6, −2.4) and (−0.7, 3.6), and that the gradient of C is 2 at the point where x = 1.
(i) Find the values of a, b and c, giving your answers correct to 3 decimal places. [4]
(ii) Find the x-coordinate of the point where C crosses the x-axis, giving your answer
correct to 3 decimal places. [2]
(iii) One asymptote of C is the line with equation x = 0. Write down the equation of the
other asymptote of C. [1]
(i) Sketch the curve with equation $y = \left|\frac{x+1}{1-x}\right|$, stating the equations of the asymptotes. On the same diagram, sketch the line with equation $y = x + 2$. [3]
(iv) Solve the inequality $\left|\frac{x+1}{1-x}\right| < x + 2$. [3]
Exercise 454. (9740 N2015/I/5.) (Answer on p. 1609.)
(i) State a sequence of transformations that will transform the curve with equation $y = x^2$ on to the curve with equation $y = 0.25(x - 3)^2$. [2]
$f : x \to \frac{1}{1 - x^2}$, $x \in \mathbb{R}$, $x > 1$.
(i) Show that f has an inverse. [2]
(ii) Find $f^{-1}(x)$ and state the domain of $f^{-1}$. [3]
(b) The function g is defined by
$g : x \to \frac{2 + x}{1 - x^2}$, $x \in \mathbb{R}$, $x \ne \pm 1$.
Find algebraically the range of g, giving your answer in terms of $\sqrt{3}$ as simply as possible. [5]
$y = \frac{1}{1 - x}$, $x \in \mathbb{R}$, $x \ne 1$, $x \ne 0$.
(i) Show that $f^2(x) = f^{-1}(x)$. [4]
(ii) Find $f^3(x)$ in simplified form. [1]
[Diagram: axes x and y with origin O and labelled points A, B, C and D.]
(i) Sketch the curve $y^2 = f(x)$, stating, in terms of a, b, c and d, the coordinates of any turning points and of the points where the curve crosses the x-axis. [4]
(ii) What can be said about the tangents to the curve $y^2 = f(x)$ at the points where it crosses the x-axis? [1]
$y = \frac{x^2 + x + 1}{x - 1}$, $x \in \mathbb{R}$, $x \ne 1$.
Without using a calculator, find the set of values that y can take. [5]
380
The question writers may have overlooked the fact that if p = 0, then P = (0, 0) and the tangent at P
is vertical, so that D could be any point on the y-axis.
Exercise 460. (9740 N2013/I/3.) (Answer on p. 1613.)
(i) Sketch the curve with equation
$y = \frac{x + 1}{2x - 1}$,
stating the equations of any asymptotes and the coordinates of the points where the curve crosses the axes. [4]
(ii) Solve the inequality
$\frac{x + 1}{2x - 1} < 1$. [1]
$f : x \mapsto \frac{2 + x}{1 - x}$, $x \in \mathbb{R}$, $x \ne 1$,
$g : x \mapsto 1 - 2x$, $x \in \mathbb{R}$.
(i) Explain why the composite function $fg$ does not exist. [2]
(ii) Find an expression for $gf(x)$ and hence, or otherwise, find $(gf)^{-1}(5)$. [4]
Group | Under 16 years | Between 16 and 65 years | Over 65 years | Total cost
A     | 9              | 6                       | 4             | $162.03
B     | 7              | 5                       | 3             | $128.36
C     | 10             | 4                       | 5             | $158.50
Write down and solve equations to find the cost of a ticket for each of the age categories. [4]
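The table above amounts to three linear equations in three unknown ticket prices. As a rough sketch of how one might check a hand solution (the helper name `solve_linear` is made up for this illustration, and exact fractions are used only to avoid rounding cents), the system can be solved by Gauss-Jordan elimination:

```python
from fractions import Fraction

def solve_linear(rows):
    """Gauss-Jordan elimination on an augmented matrix of Fractions.

    Hypothetical helper for this illustration; assumes a unique solution.
    """
    m = [row[:] for row in rows]
    n = len(m)
    for col in range(n):
        # Find a row with a non-zero entry in this column and swap it up.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so that the pivot entry becomes 1.
        m[col] = [v / m[col][col] for v in m[col]]
        # Clear this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return [row[-1] for row in m]

# Columns: under 16, between 16 and 65, over 65, total cost (from the table).
groups = [
    [Fraction(9), Fraction(6), Fraction(4), Fraction("162.03")],
    [Fraction(7), Fraction(5), Fraction(3), Fraction("128.36")],
    [Fraction(10), Fraction(4), Fraction(5), Fraction("158.50")],
]

prices = solve_linear(groups)
for label, price in zip(["Under 16", "16 to 65", "Over 65"], prices):
    print(label, float(price))
```

In the exam itself a graphing calculator would do this step; the sketch is only meant to show what "write down and solve equations" involves.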
Exercise 463. (9740 N2012/I/7.) (Answer on p. 1614.)
A function f is said to be self-inverse if $f(x) = f^{-1}(x)$ for all x in the domain of f. The function g is defined by
$g : x \mapsto \frac{x + k}{x - 1}$, $x \in \mathbb{R}$, $x \ne 1$,
where k is a constant, $k \ne -1$.
(i) Show that g is self-inverse. [2]
(ii) Given that $k > 0$, sketch the curve $y = g(x)$, stating the equations of any asymptotes and the coordinates of any points where the curve crosses the x- and y-axes. [3]
(iii) State the equation of one line of symmetry of the curve in part (ii), and describe fully a sequence of transformations which would transform the curve $y = \frac{1}{x}$ onto this curve. [4]
$\frac{x^2 + x + 1}{x^2 + x - 2} < 0$. [4]
$f : x \mapsto \ln(2x + 1) + 3$, $x \in \mathbb{R}$, $x > -\frac{1}{2}$.
(i) Find $f^{-1}(x)$ and write down the domain and range of $f^{-1}$. [4]
(ii) Sketch on the same diagram the graphs of $y = f(x)$ and $y = f^{-1}(x)$, giving the equations of any asymptotes and the exact coordinates of any points where the curves cross the x- and y-axes. [4]
(iii) Explain why the x-coordinates of the points of intersection of the curves in part (ii) satisfy the equation $\ln(2x + 1) = x - 3$, and find the values of these x-coordinates, correct to 4 significant figures. [3]
of the points where the graph crosses the x- and y-axes. [3]
Remark 131. In the above question’s first sentence, the second step is a little ambiguous. Say we transform the black circle on the right “by a stretch with scale factor 0.5 parallel to the y-axis”. Then do we get (a) the red ellipse; or (b) the blue ellipse? Perhaps this was clear in the mind of whoever wrote this question, but it isn’t to me and probably to others too. In my answer, I shall assume that (a) the stretch is outwards from the y-axis.
$f : x \mapsto \frac{1}{x^2 - 1}$, for $x \in \mathbb{R}$, $x \ne -1$, $x \ne 1$.
(i) Sketch the graph of $y = f(x)$. [1]
(ii) If the domain of f is further restricted to $x \ge k$, state with a reason the least value of k for which the function $f^{-1}$ exists. [2]
$g : x \mapsto \frac{1}{x - 3}$, for $x \in \mathbb{R}$, $x \ne 2$, $x \ne 3$, $x \ne 4$.
(iii) Show that $fg(x) = \frac{(x - 3)^2}{(4 - x)(x - 2)}$. [2]
(iv) Solve the inequality $fg(x) > 0$. [3]
(v) Find the range of $fg$. [3]
$f : x \mapsto \frac{ax}{bx - a}$ for $x \in \mathbb{R}$, $x \ne \frac{a}{b}$,
where a and b are non-zero constants.
(i) Find $f^{-1}(x)$. Hence or otherwise find $f^2(x)$ and state the range of $f^2$. [5]
(ii) The function g is defined by $g : x \mapsto \frac{1}{x}$ for all real non-zero x. State whether the composite function $fg$ exists, justifying your answer. [2]
(ii) Solve the equation $f^{-1}(x) = x$. [3]
(i) $y = \frac{x}{x^2 - 1}$, stating the equations of the asymptotes, [4]
(ii) $y^2 = \frac{x}{x^2 - 1}$, making clear the form of the curve at the origin. [3]
(iii) Show that the x-coordinates of the points of intersection of the curves $y = \frac{x}{x^2 - 1}$ and $y = e^x$ satisfy the equation $x^2 = 1 + xe^{-x}$. [1]
(iv) Use the iterative formula $x_{n+1} = \sqrt{1 + x_n e^{-x_n}}$, together with a suitable initial value $x_1$, to find the positive root of this equation correct to 2 decimal places. [2]
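The iterative formula in part (iv) is a fixed-point iteration: feed each output back in as the next input until successive terms stop changing. Here is a minimal sketch, assuming the starting value $x_1 = 1$ (my choice, not the question's) and a made-up function name `iterate_root`:

```python
import math

def iterate_root(x1=1.0, tol=1e-10, max_steps=200):
    """Repeat x -> sqrt(1 + x * exp(-x)) until two successive terms agree.

    x1 = 1.0 is an assumed starting value; the question leaves it open.
    """
    x = x1
    for _ in range(max_steps):
        x_next = math.sqrt(1.0 + x * math.exp(-x))
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("iteration did not settle within max_steps")

root = iterate_root()
# The limit should satisfy x^2 = 1 + x e^{-x}, so this gap should be tiny.
gap = root ** 2 - (1.0 + root * math.exp(-root))
print(round(root, 2), gap)
```

On a calculator the same thing is done by typing the starting value and then repeatedly applying the formula to `Ans`.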
$\frac{2x^2 - x - 19}{x^2 + 3x + 2} > 1$. [4]
Exercise 477. (9740 N2007/I/2.) (Answer on p. 1629.)
Functions f and g are defined by
$f : x \mapsto \frac{1}{x - 3}$ for $x \in \mathbb{R}$, $x \ne 3$,
$g : x \mapsto x^2$ for $x \in \mathbb{R}$.
(i) Only one of the composite functions $fg$ and $gf$ exists. Give a definition (including the domain) of the composite that exists, and explain why the other composite does not exist. [3]
(ii) Find $f^{-1}(x)$ and state the domain of $f^{-1}$. [3]
Assuming that, for each variety of fruit, the price per kilogram paid by each of the friends
is the same, calculate the total amount that Lee Lian paid. [6]
Exercise 480. (9233 N2007/II/4.) (Answer on p. 1631.)
The function f is defined by
$f : x \mapsto \frac{4x + 1}{x - 3}$, $x \in \mathbb{R}$, $x \ne 3$.
(i) State the equations of the two asymptotes of the graph of $y = f(x)$. [2]
(ii) Sketch the graph of $y = f(x)$, showing its asymptotes and stating the coordinates of the points of intersection with the axes. [3]
(iii) Find an expression for $f^{-1}(x)$ and state the domain of $f^{-1}$. [3]
$f : x \mapsto 5x + 3$, $x > 0$,
$g : x \mapsto \frac{3}{x}$, $x > 0$.
(i) Find, in a similar form, $fg$, $g^2$ and $g^{35}$. [3]
[Note: $g^2$ denotes $gg$.]
(ii) Express h in terms of one or both f and g, where
$\frac{x - 9}{x^2 - 9} \le 1$. [5]
(i) Find an expression for un in terms of A, B and n. Simplify your answer. [3]
(ii) It is also given that the tenth term is 48 and the seventeenth term is 90. Find A
and B. [2]
(b) Show that $r^2(r + 1)^2 - (r - 1)^2 r^2 = kr^3$, where k is a constant to be determined. Use this result to find a simplified expression for $\sum_{r=1}^{n} r^3$. [4]
(c) D’Alembert’s ratio test states that a series of the form $\sum_{r=0}^{\infty} a_r$ converges when $\lim_{n \to \infty} \left|\frac{a_{n+1}}{a_n}\right| < 1$, and diverges when $\lim_{n \to \infty} \left|\frac{a_{n+1}}{a_n}\right| > 1$. When $\lim_{n \to \infty} \left|\frac{a_{n+1}}{a_n}\right| = 1$, the test is inconclusive. Using the test, explain why the series $\sum_{r=0}^{\infty} \frac{x^r}{r!}$ converges for all real values of x and state
$\sum_{r=1}^{n} r(r^2 + 1) = \frac{1}{4}n(n + 1)(n^2 + n + 2)$. [5]
$1 \times 3 \times 6 + 2 \times 4 \times 7 + 3 \times 5 \times 8 + \cdots + n(n + 2)(n + 5) = \frac{1}{12}n(n + 1)(3n^2 + 31n + 74)$. [6]
(b) (i) Show that $\frac{2}{4r^2 + 8r + 3}$ can be expressed as $\frac{A}{2r + 1} + \frac{B}{2r + 3}$, where A and B are constants to be determined. [1]
The sum $\sum_{r=1}^{n} \frac{2}{4r^2 + 8r + 3}$ is denoted by $S_n$.
$p_n = \frac{7 - 4n}{3}$. [5]
(ii) Find $\sum_{r=1}^{n} p_r$. [3]
$S_n = 1 - \frac{1}{(n + 1)!}$.
(i) Give a reason why the series $\sum u_r$ converges, and write down the value of the sum to infinity. [2]
(ii) Find a formula for $u_n$ in simplified form. [2]
(i) In Version 1 of the exercise (above), the distances between adjacent points are all 4 m.
(a) Find the distance run by an athlete who completes the first 10 stages of Version
1 of the exercise. [2]
(b) Write down an expression for the distance run by an athlete who completes n
stages of Version 1. Hence find the least number of stages that the athlete needs
to complete to run at least 5 km. [4]
(ii) In Version 2 of the exercise (below), the distances between the points are such that $OA_1 = 4$ m, $A_1A_2 = 4$ m, $A_2A_3 = 8$ m and $A_nA_{n+1} = 2A_{n-1}A_n$. Write down an expression for the distance run by an athlete who completes n stages of Version 2. Hence find the distance from O, and the direction of travel, of the athlete after he has run exactly 10 km using Version 2. [5]
Remark 132. The wording of (iii) is a little ambiguous. Is the desired answer (a) the
maximum number of pieces one can cut off before the total length cut off is greater
than 380 cm? Or is it (b) the minimum number of pieces one can cut off in order for
the total length cut off to be greater than 380 cm? (Of course, the latter is simply one
more than the former.)
In my answer, I shall assume (b).
$\sum_{r=1}^{n} r(2r^2 + 1) = \frac{1}{2}n(n + 1)(n^2 + n + 1)$. [5]
(ii) It is given that $f(r) = 2r^3 + 3r^2 + r + 24$. Show that $f(r) - f(r - 1) = ar^2$, for a constant a to be determined. Hence find a formula for $\sum_{r=1}^{n} r^2$, fully factorizing your answer. [5]
(iii) Find $\sum_{r=1}^{n} f(r)$. (You should not simplify your answer.) [3]
$u_1 = 2$ and $u_{n+1} = \frac{3u_n - 1}{6}$ for $n \ge 1$.
(i) Find the exact values of $u_2$ and $u_3$. [2]
(ii) It is given that $u_n \to l$ as $n \to \infty$. Showing your working, find the exact value of l. [2]
(iii) For this value of l, use the method of mathematical induction to prove that
$u_n = \frac{14}{3}\left(\frac{1}{2}\right)^n + l$. [4]
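Before attempting the induction in part (iii), it can help to check the claimed closed form against the recurrence for the first few terms. The sketch below does so with exact fractions. The value `limit = -1/3` is my own working (it solves $l = \frac{3l - 1}{6}$) and the function names are made up for this illustration:

```python
from fractions import Fraction

def u_recursive(n):
    """Compute u_n from u_1 = 2 and u_{n+1} = (3 u_n - 1) / 6."""
    u = Fraction(2)
    for _ in range(n - 1):
        u = (3 * u - 1) / 6
    return u

# Assumed limit: setting u_{n+1} = u_n = l gives 6l = 3l - 1, so l = -1/3.
limit = Fraction(-1, 3)

def u_closed(n):
    """Claimed closed form (14/3)(1/2)^n + l from part (iii)."""
    return Fraction(14, 3) * Fraction(1, 2) ** n + limit

agree = all(u_recursive(n) == u_closed(n) for n in range(1, 21))
print(u_recursive(2), u_recursive(3), agree)
```

Agreement for twenty terms is of course not a proof; it only suggests that the induction is worth attempting with this value of l.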
Exercise 494. (9740 N2012/II/4.) (Answer on p. 1641.)
On 1 January 2001 Mrs A put $100 into a bank account, and on the first day of each
subsequent month she put in $10 more than in the previous month. Thus on 1 February
she put $110 into the account and on 1 March she put $120 into the account, and so on.
The account pays no interest.
(i) On what date did the value of Mrs A’s account first become greater than $5000? [5]
On 1 January 2001 Mr B put $100 into a savings account, and on the first day of each
subsequent month he put another $100 into the account. The interest rate was 0.5% per
month, so that on the last day of each month the amount in the account on that day was
increased by 0.5%.
(ii) Use the formula for the sum of a geometric progression to find an expression for the
value of Mr B’s account on the last day of the nth month (where January 2001 was
the 1st month, February 2001 was the 2nd month, and so on). Hence find in which
month the value of Mr B’s account first became greater than $5000. [5]
(iii) Mr B wanted the value of his account to be $5000 on 2 December 2003. What interest
rate per month, applied from January 2001, would achieve this? [3]
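Parts (i) and (ii) above can be cross-checked by brute force: simulate the account month by month and stop when the balance first passes $5000. This is only a sketch under my reading of the question (deposit on the first day, interest on the last day of the same month), and the names `first_month_above` and `calendar` are made up for this illustration:

```python
def first_month_above(target, deposit_for_month, monthly_rate=0.0):
    """Return the first month number whose closing balance exceeds target.

    Month 1 is January 2001. The deposit is made first, then interest
    (if any) is applied at the end of that month.
    """
    balance = 0.0
    month = 0
    while balance <= target:
        month += 1
        balance += deposit_for_month(month)
        balance *= 1.0 + monthly_rate
    return month, balance

def calendar(month):
    """Convert a month number (1 = January 2001) to (year, month of year)."""
    return 2001 + (month - 1) // 12, (month - 1) % 12 + 1

# Mrs A: $100, then $10 more each month, no interest.
month_a, balance_a = first_month_above(5000, lambda m: 100 + 10 * (m - 1))

# Mr B: $100 every month, 0.5% interest at the end of each month.
month_b, balance_b = first_month_above(5000, lambda m: 100, monthly_rate=0.005)

print(month_a, calendar(month_a), balance_a)
print(month_b, calendar(month_b), round(balance_b, 2))
```

The exam expects the arithmetic and geometric progression formulae instead; the simulation is just a way to confirm the month found algebraically.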
$\sin\left(r + \frac{1}{2}\right)\theta - \sin\left(r - \frac{1}{2}\right)\theta = 2\cos r\theta \sin\frac{1}{2}\theta$. [2]
Remark 133. Now assume also that θ is not an even integer multiple of π.381
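(ii) Hence find a formula for $\sum_{r=1}^{n} \cos r\theta$ in terms of $\sin\left(n + \frac{1}{2}\right)\theta$ and $\sin\frac{1}{2}\theta$. [3]
(iii) Prove by the method of mathematical induction that
$\sum_{r=1}^{n} \sin r\theta = \frac{\cos\frac{1}{2}\theta - \cos\left(n + \frac{1}{2}\right)\theta}{2\sin\frac{1}{2}\theta}$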
381
Otherwise some of the formulae that follow have 0 as denominators and are thus undefined.
Exercise 496. (9740 N2011/I/9.) (Answer on p. 1643.)
(i) A company is drilling for oil. Using machine A, the depth drilled on the first day is
256 metres. On each subsequent day, the depth drilled is 7 metres less than on the
previous day. Drilling continues daily up to and including the day when a depth of
less than 10 metres is drilled. What depth is drilled on the 10th day, and what is the
total depth when drilling is completed? [6]
(ii) Using machine B, the depth drilled on the first day is also 256 metres. On each subsequent day, the depth drilled is $\frac{8}{9}$ of the depth drilled on the previous day. How many days does it take for the depth drilled to exceed 99% of the theoretical maximum
$S_n = n(2n + c)$,
where c is a constant.
(i) Find $u_n$ in terms of c and n. [3]
(ii) Find a recurrence relation of the form $u_{n+1} = f(u_n)$. [2]
$u_n = n(2n + 1)$,
for $n \ge 1$. The sum of the first n terms is denoted by $S_n$. Use the method of mathematical induction to show that
$S_n = \frac{1}{6}n(n + 1)(4n + 5)$
for all positive integers n. [5]
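A quick numerical check of a formula like this one is a useful habit before writing out the induction. The sketch below adds the terms directly and compares with the claimed closed form for the first 200 values of n (the function names are made up for this illustration):

```python
def partial_sum(n):
    """Add u_r = r(2r + 1) term by term for r = 1, ..., n."""
    return sum(r * (2 * r + 1) for r in range(1, n + 1))

def six_times_claimed_sum(n):
    """Six times the claimed closed form, kept as an integer: n(n + 1)(4n + 5)."""
    return n * (n + 1) * (4 * n + 5)

formula_holds = all(
    6 * partial_sum(n) == six_times_claimed_sum(n) for n in range(1, 201)
)
print(formula_holds)
```

Comparing six times the sum avoids any division, so the check uses integers only.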
[Diagram: graph of $y = e^x - 3x$, with origin O and the roots α and β marked on the x-axis.]
The diagram shows the graph of $y = e^x - 3x$. The two roots of the equation $e^x - 3x = 0$ are denoted by α and β, where α < β.
(i) Find the values of α and β, each correct to 3 decimal places. [2]
A sequence of real numbers $x_1, x_2, x_3, \ldots$ satisfies the recurrence relation
$x_{n+1} = \frac{1}{3}e^{x_n}$, for $n \ge 1$.
(ii) Prove algebraically that, if the sequence converges, then it converges to either α or β. [2]
(iii) Use a calculator to determine the behaviour of the sequence for each of the cases $x_1 = 0$, $x_1 = 1$, $x_1 = 2$. [3]
(iv) By considering $x_{n+1} - x_n$, prove that
$x_{n+1} < x_n$ if $\alpha < x_n < \beta$,
$x_{n+1} > x_n$ if $x_n < \alpha$ or $x_n > \beta$. [2]
(v) State briefly how the results in part (iv) relate to the behaviours determined in (iii). [2]
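Parts (i) and (iii) above are calculator work, and the same experiments can be sketched in a few lines. The bisection helper, the cut-off `blow_up = 50.0` and the function names are all assumptions made for this illustration; the brackets [0, 1] and [1, 2] are chosen because $e^x - 3x$ changes sign on each.

```python
import math

def h(x):
    """The function whose roots are alpha and beta: e^x - 3x."""
    return math.exp(x) - 3 * x

def bisect(f, lo, hi, tol=1e-12):
    """Halve a sign-changing bracket [lo, hi] until it is narrower than tol."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2

alpha = bisect(h, 0.0, 1.0)  # h(0) > 0 and h(1) < 0
beta = bisect(h, 1.0, 2.0)   # h(1) < 0 and h(2) > 0

def run_sequence(x1, steps=60, blow_up=50.0):
    """Iterate x -> e^x / 3; report infinity once the terms clearly run away."""
    x = x1
    for _ in range(steps):
        x = math.exp(x) / 3
        if x > blow_up:
            return math.inf
    return x

print(round(alpha, 3), round(beta, 3))
for start in (0.0, 1.0, 2.0):
    print(start, run_sequence(start))
```

The early exit in `run_sequence` matters: without it, a runaway sequence would soon make `math.exp` raise an overflow error.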
$u_{n+1} = u_n - \frac{2n + 1}{n^2(n + 1)^2}$, for all $n \ge 1$.
(i) Use the method of mathematical induction to prove that $u_n = \frac{1}{n^2}$. [4]
(ii) Hence find $\sum_{n=1}^{N} \frac{2n + 1}{n^2(n + 1)^2}$. [2]
(iii) Give a reason why the series in part (ii) is convergent and state the sum to infinity. [2]
(iv) Use your answer to part (ii) to find $\sum_{n=2}^{N} \frac{2n - 1}{n^2(n + 1)^2}$. [2]
$\sum_{r=1}^{n} \sin rx = \frac{\cos\frac{1}{2}x - \cos\left(n + \frac{1}{2}\right)x}{2\sin\frac{1}{2}x}$, where $0 < x < 2\pi$. [6]
Exercise 509. (9233 N2007/II/1.) (Answer on p. 1650.)
Find $\sum_{r=1}^{2n} 3^{r+2}$. [3]
(i) Interpret geometrically the vector equation r = a + tb, where a and b are constant
vectors and t is a parameter. [2]
(ii) Interpret geometrically the vector equation r ⋅ n = d, where n is a constant unit vector
and d is a constant scalar, stating what d represents. [3]
(iii) Given that b ⋅ n ≠ 0, solve the equations r = a + tb and r ⋅ n = d to find r in terms of
a, b, n and d. Interpret the solution geometrically. [3]
Remark 134. This question should have clearly stated if this was meant to be in the
context of two- or three-dimensional space.382 I shall assume the latter.
382
This is because in the context of two-dimensional space, the vector equation r ⋅ n = d describes a line.
Exercise 515. (9740 N2016/I/11.) (Answer on p. 1654.)
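The plane p has equation $\mathbf{r} = \begin{pmatrix} 1 \\ -3 \\ 2 \end{pmatrix} + \lambda \begin{pmatrix} 1 \\ 2 \\ 0 \end{pmatrix} + \mu \begin{pmatrix} a \\ 4 \\ -2 \end{pmatrix}$, and the line l has equation $\mathbf{r} = \begin{pmatrix} a - 1 \\ a \\ a + 1 \end{pmatrix} + t \begin{pmatrix} -2 \\ 1 \\ 2 \end{pmatrix}$, where a is a constant and λ, µ and t are parameters.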
(i) Given that a × b = 0, what can be deduced about the vectors a and b? [2]
(ii) Find a unit vector n such that n × (i + 2j − 2k) = 0. [2]
(iii) Find the cosine of the acute angle between i + 2j − 2k and the z-axis. [1]
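[Diagram: points O, B and M, with the vectors a, b and c labelled.]
The origin O and the points A, B and C lie in the same plane, where $\overrightarrow{OA} = \mathbf{a}$, $\overrightarrow{OB} = \mathbf{b}$ and $\overrightarrow{OC} = \mathbf{c}$ (see diagram).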
(i) Explain why c can be expressed as c = λa + µb, for constants λ and µ. [1]
The point N is on AC such that AN ∶ N C = 3 ∶ 4.
(ii) Write down the position vector of N in terms of a and c. [1]
(iii) It is given that the area of triangle ON C is equal to the area of triangle OM C, where
M is the mid-point of OB. By finding the areas of these triangles in terms of a and
b, find λ in terms of µ in the case where λ and µ are both positive. [5]
(i) Find a vector equation of the line through the points A and B with position vectors
7i + 8j + 9k and −i − 8j + k respectively. [3]
(ii) The perpendicular to this line from the point C with position vector i + 8j + 3k meets
the line at the point N . Find the position vector of N and the ratio AN ∶ N B. [5]
(iii) Find a cartesian equation of the line which is a reflection of the line AC in the line
AB. [4]
Exercise 525. (9740 N2011/I/7.) (Answer on p. 1659.)
[Diagram: labelled points O, A, B, P, Q and M.]
Referred to the origin O, the points A and B are such that $\overrightarrow{OA} = \mathbf{a}$ and $\overrightarrow{OB} = \mathbf{b}$. The point P on OA is such that OP ∶ PA = 1 ∶ 2, and the point Q on OB is such that OQ ∶ QB = 3 ∶ 2. The mid-point of PQ is M (see diagram).
(i) Find $\overrightarrow{OM}$ in terms of a and b and show that the area of triangle OMP can be written as $k\left|\mathbf{a} \times \mathbf{b}\right|$, where k is a constant to be found. [6]
(ii) The vectors a and b are now given by a = 2pi − 6pj + 3pk and b = i + j − 2k, where p is
a positive constant. Given that a is a unit vector,
(a) find the exact value of p, [2]
(b) give a geometrical interpretation of ∣a ⋅ b∣, [1]
(c) evaluate a × b. [2]
$\frac{x - 10}{-3} = \frac{y + 1}{6} = \frac{z + 3}{9}$ and $x - 2y - 3z = 0$.
(i) Show that l is perpendicular to p. [2]
(ii) Find the coordinates of the point of intersection of l and p. [4]
(iii) Show that the point A with coordinates (−2, 23, 33) lies on l. Find the coordinates of
the point B which is the mirror image of A in p. [3]
(iv) Find the area of triangle OAB, where O is the origin, giving your answer to the nearest
whole number. [3]
2x − 5y + 3z = 3,
3x + 2y − 5z = −5,
5x + λy + 17z = µ,
respectively, where λ and µ are constants. When λ = −20.9 and µ = 16.6, find the coordinates of the point at which these planes meet. [2]
The planes p1 and p2 intersect in a line l.
(i) Find a vector equation of l. [4]
(ii) Given that all three planes meet in the line l, find λ and µ. [3]
(iii) Given instead that the three planes have no points in common, what can be said about
the values of λ and µ? [2]
(iv) Find the cartesian equation of the plane which contains l and the point (1, −1, 3). [4]
Exercise 533. (9233 N2008/I/11.) (Answer on p. 1662.)
The cartesian equations of two lines are
$\omega^4 + p\omega^3 + 39\omega^2 + q\omega + 58 = 0$,
The complex numbers z1 and z2 , where ∣z1 ∣ < ∣z2 ∣, correspond to the points of intersection
of these loci.
(i) Draw an Argand diagram to show both loci, and mark the points represented by
z1 and z2 . [2]
(ii) Find the two values of z which represent points on ∣z − 3 − i∣ = 1 such that ∣z − z1 ∣ =
∣z − z2 ∣. [4]
(b) (i) The complex number $2 - 2i$ is denoted by w. By writing w in polar form $re^{i\theta}$, where $r > 0$ and $-\pi < \theta \le \pi$, find exactly all the cube roots of w in polar form. [3]
(ii) Find the smallest positive whole number value of n such that $\arg(w^* w^n) = \frac{1}{2}\pi$. [3]
Exercise 542. (9740 N2015/I/9.) (Answer on p. 1665.)
(a) The complex number w is such that $w = a + ib$, where a and b are non-zero real numbers. The complex conjugate of w is denoted by $w^*$. Given that $\frac{w^2}{w^*}$ is purely imaginary, find the possible values of w in terms of a. [5]
$z^7 - (1 + i) = 0$,
giving the roots in the form $re^{i\alpha}$, where $r > 0$ and $-\pi < \alpha \le \pi$. [5]
(ii) Show the roots on an Argand diagram. [2]
(iii) The roots represented by $z_1$ and $z_2$ are such that $0 < \arg z_1 < \arg z_2 < \frac{\pi}{2}$. Explain why the locus of all points z such that $|z - z_1| = |z - z_2|$ passes through the origin. Draw this locus on your Argand diagram and find its exact cartesian equation. [5]
(a) The complex number w has modulus r and argument θ, where $0 < \theta < \frac{\pi}{2}$, and $w^*$ denotes the conjugate of w. State the modulus and argument of p, where $p = \frac{w}{w^*}$. [2]
Given that $p^5$ is real and positive, find the possible values of θ. [2]
(b) The complex number z satisfies the relations ∣z∣ ≤ 6 and ∣z∣ = ∣z − 8 − 6i∣.
(i) Illustrate both of these relations on a single Argand diagram. [3]
(ii) Find the greatest and least possible values of arg z, giving your answers in radians
correct to 3 decimal places. [4]
(a) Sketch, on an Argand diagram, the locus of points representing the complex number z such that $|z + 2 - 3i| = \sqrt{13}$. [3]
(b) The complex number w is such that $ww^* + 2w = 3 + 4i$, where $w^*$ is the complex conjugate of w. Find w in the form $a + ib$, where a and b are real. [4]
−π < θ ≤ π. [4]
(iii) Hence, or otherwise, express $z^6 + 64$ as the product of three quadratic factors with real coefficients, giving each factor in non-trigonometrical form. [3]
(i) The equation $az^4 + bz^3 + cz^2 + dz + e = 0$ has a root $z = ki$, where k is real and non-zero. Given that the coefficients a, b, c, d and e are real, show that $ad^2 + b^2e = bcd$. [5]
(ii) Verify that this condition is satisfied for the equation $z^4 + 3z^3 + 13z^2 + 27z + 36 = 0$ and hence find two roots of this equation which are of the form $z = ki$, where k is real. [3]
For an object falling vertically through the atmosphere, the rate of change of velocity is
less than that for an object falling in a vacuum. The new rate of change of v is modelled
as the difference between the value of c found in part (i)(b) and an amount proportional
to the velocity v, with a constant of proportionality k.
(ii) Given that in this case the initial velocity is zero, find v in terms of t and k. [5]
For an object falling through the atmosphere, the ‘terminal velocity’ is the value approached
by the velocity after a long time.
(iii) A falling object has initial velocity zero and terminal velocity 40 m s$^{-1}$. Find how long it takes the object to reach 90% of its terminal velocity. [4]
Exercise 568. (9758 N2017/II/4.) (Answer on p. 1687.)
(a) A flat novelty plate for serving food on is made in the shape of the region enclosed by
the curve y = x2 − 6x + 5 and the line 2y = x − 1. Find the area of the plate. [4]
(b) A curved container has a flat circular top. The shape of the container is formed by rotating the part of the curve $x = \frac{\sqrt{y}}{a - y^2}$, where a is a constant greater than 1, between the points $(0, 0)$ and $\left(\frac{1}{a - 1}, 1\right)$ through 2π radians about the y-axis.
(i) Find the volume of the container, giving your answer as a single fraction in terms of a and π. [4]
(ii) Another curved container with a flat circular top is formed in the same way from the curve $x = \frac{\sqrt{y}}{b - y^2}$ and the points $(0, 0)$ and $\left(\frac{1}{b - 1}, 1\right)$. It has a volume that is four times as great as the container in part (i). Find an expression for b in terms of a. [3]
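For part (a) of the exercise above, the area of the plate is the integral of (line minus curve) between the two intersection points. The sketch below checks this with exact fractions. The intersection values $x = 1$ and $x = \frac{11}{2}$ and the antiderivative are my own working, not taken from the text, and the code asserts the intersections rather than assuming them:

```python
from fractions import Fraction

def curve(x):
    """The curve y = x^2 - 6x + 5."""
    return x * x - 6 * x + 5

def line(x):
    """The line 2y = x - 1, rewritten as y = (x - 1)/2."""
    return (x - 1) / 2

def gap_antiderivative(x):
    """Antiderivative of line(x) - curve(x) = -x^2 + (13/2)x - 11/2."""
    return -x ** 3 / 3 + Fraction(13, 4) * x ** 2 - Fraction(11, 2) * x

# Assumed intersection points, from solving 2x^2 - 13x + 11 = 0.
left, right = Fraction(1), Fraction(11, 2)
assert curve(left) == line(left) and curve(right) == line(right)

# The line lies above the curve between the intersections (checked at x = 3).
assert line(Fraction(3)) > curve(Fraction(3))

area = gap_antiderivative(right) - gap_antiderivative(left)
print(area, float(area))
```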
(i) Use your calculator to find the gradient of the curve $y = 2^{\cos x}$ at the points where $x = 0$ and $x = \frac{1}{2}\pi$. [2]
(ii) Find the equations of the tangents to this curve at the points where $x = 0$ and $x = \frac{1}{2}\pi$ and find the coordinates of the point where these tangents meet. [3]
the exact volume of the solid obtained, simplifying your answer. [5]
(i) Sketch the graph of D. Give in exact form the coordinates of the points where D
meets the x-axis, and also give in exact form the coordinates of the maximum point
on the curve. [4]
(ii) Find, in terms of a, the area under D for 0 ≤ t ≤ a, where a is a positive constant less
than 2π. [3]
The normal to D at the point where $t = \frac{1}{2}\pi$ cuts the x-axis at E and the y-axis at F.
(iii) Find the exact area of triangle OEF, where O is the origin. [4]
and c and use these values to find the coefficient of $x^4$ in the expansion of $ax(1 + bx)^c$,
(iii) Show that the substitution $y = \sin u$ transforms the integral in (ii) to $\pi \int_a^b u^2 \cos u \, du$, for limits a and b to be determined. Hence find the exact volume. [6]
$x = \sin^3\theta$, $y = 3\sin^2\theta\cos\theta$, for $0 \le \theta \le \frac{1}{2}\pi$.
(i) Show that $\frac{dy}{dx} = 2\cot\theta - \tan\theta$. [3]
(ii) Show that C has a turning point when $\tan\theta = \sqrt{k}$, where k is an integer to be determined. Find, in non-trigonometric form, the exact coordinates of the turning point and explain why it is a maximum. [6]
(iii) Show that the area of the region bounded by C and the x-axis is given by
$\int_0^{\frac{1}{2}\pi} 9\sin^4\theta\cos^2\theta \, d\theta$.
Use your calculator to find the area, giving your answer correct to 3 decimal places. [3]
The line with equation $y = ax$, where a is a positive constant, meets C at the origin and at the point P.
(iv) Show that $\tan\theta = \frac{3}{a}$ at P. Find the exact value of a such that the line passes through the maximum point of C. [3]
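The calculator step in part (iii) is a numerical integration. As an illustration of what the calculator is doing, here is a sketch using composite Simpson's rule; the helper `simpson` and the choice of 1000 strips are assumptions for this example, not the method the question requires:

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with an even number n of strips."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior points alternate between weight 4 (odd i) and 2 (even i).
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def integrand(theta):
    """9 sin^4(theta) cos^2(theta), the integrand from part (iii)."""
    return 9 * math.sin(theta) ** 4 * math.cos(theta) ** 2

area = simpson(integrand, 0.0, math.pi / 2)
print(round(area, 3))
```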
[Diagram: axes with origin O, α marked on the x-axis, −7 marked on the y-axis, and the point (β, −7) labelled.]
(i) Find the value of α, giving your answer correct to 3 decimal places, and find the exact value of β. [2]
(ii) Evaluate $\int_{\beta}^{\alpha} f(x) \, dx$, giving your answer correct to 3 decimal places. [2]
(iii) Find, in terms of $\sqrt{3}$, the area of the finite region bounded by the curve and the line, for $x \ge 0$. [3]
(iv) Show that $f(x) = f(-x)$. What can be said about the six roots of the equation $f(x) = 0$? [4]
Remark 135. For (ii), replace binomial expansion (no longer on the 9758 syllabus) with
Maclaurin expansion.
(iii) Hence, or otherwise, find the first four non-zero terms of the Maclaurin series for $\sin^{-1}\frac{x}{3}$. Give the coefficients as exact fractions in their simplest form. [4]
Exercise 584. (9740 N2014/I/10.) (Answer on p. 1700.)
The mass, x grams, of a certain substance present in a chemical reaction at time t minutes
satisfies the differential equation
$\frac{dx}{dt} = k(1 + x - x^2)$,
where $0 \le x \le \frac{1}{2}$ and k is a constant. It is given that $x = \frac{1}{2}$ and $\frac{dx}{dt} = -\frac{1}{4}$ when $t = 0$.
(i) Show that $k = -\frac{1}{5}$. [1]
(ii) By first expressing $1 + x - x^2$ in completed square form, find t in terms of x. [5]
(iii) Hence find
(a) the exact time taken for the mass of the substance present in the chemical reaction
to become half of its initial value, [1]
(b) the time taken for there to be none of the substance present in the chemical
reaction, giving your answer correct to 3 decimal places. [1]
(iv) Express the solution of the differential equation in the form x = f (t) and sketch the
part of the curve with this equation which is relevant in this context. [5]
Give your answer in the form a ln b + c tan−1 d, where a, b, c and d are rational numbers to
be determined. [9]
and that f (x + 3a) = f (x) for all real values of x, where a is a real constant.
(i) Sketch the graph of y = f (x) for −4a ≤ x ≤ 6a. [3]
(ii) Use the substitution $x = a\sin\theta$ to find the exact value of $\int_{\frac{1}{2}a}^{\frac{\sqrt{3}}{2}a} f(x) \, dx$ in terms of a and π. [5]
$\frac{dz}{dx} = 3 - 2z$ (A)
$\frac{dy}{dx} = z$ (B)
(i) Given that $z < \frac{3}{2}$, solve equation (A) to find z in terms of x. [4]
(ii) Hence find y in terms of x. [2]
(iii) Use the result in part (ii) to show that
$\frac{d^2y}{dx^2} = a\frac{dy}{dx} + b$,
$x = 3t^2$, $y = 2t^3$.
(i) Find the equation of the tangent to C at the point with parameter t. [3]
(ii) Points P and Q on C have parameters p and q respectively. The tangent at P meets the tangent at Q at the point R. Show that the x-coordinate of R is $p^2 + pq + q^2$, and find the y-coordinate of R in terms of p and q. Given that $pq = -1$, show that R lies on the curve with equation $x = y^2 + 1$. [5]
[Diagram: the curve C, with the y-axis labelled.]
A curve L has equation $x = y^2 + 1$. The diagram shows the parts of C and L for which $y \ge 0$. The curves C and L touch at the point M.
(iii) Show that $4t^6 - 3t^2 + 1 = 0$ at M. Hence, or otherwise, find the exact coordinates of M. [3]
(iv) Find the exact value of the area of the shaded region bounded by C and L for which $y \ge 0$. [6]
[Diagrams: Fig. 1, Fig. 2 and Fig. 3, with lengths x and a and the points B and C labelled.]
(i) Show that the volume V of the prism is given by $V = \frac{1}{4}x\sqrt{3}\left(a - 2x\sqrt{3}\right)^2$. [3]
(ii) Use differentiation to find, in terms of a, the maximum value of V, proving that it is a maximum. [6]
Remark 136. For (ii), assume also that a is a fixed constant. Otherwise, V has no
maximum because we can simply let both a and x grow without bound.
[Diagram: labelled points A and C, a length 1, and the angles θ and $\frac{3}{4}\pi$.]
(i) Show that $AC = \frac{1}{\cos\theta - \sin\theta}$. [4]
(ii) Given that θ is a sufficiently small angle, show that
$AC \approx 1 + a\theta + b\theta^2$,
$x - y = (x + y)^2$.
(i) It is given that the volume of the model is a fixed value k cm$^3$, and the external surface area is a minimum. Use differentiation to find the values of r and h in terms of k. Simplify your answers. [7]
(ii) It is given instead that the volume of the model is 200 cm$^3$ and its external surface area is 180 cm$^2$. Show that there are two possible values of r. Given also that $r < h$, find the value of r and the value of h. [5]
$x = \theta - \sin\theta$, $y = 1 - \cos\theta$,
where $0 \le \theta \le 2\pi$.
(i) Show that $\frac{dy}{dx} = \cot\frac{1}{2}\theta$ and find the gradient of C at the point where $\theta = \pi$. What can be said about the tangents to C as $\theta \to 0$ and $\theta \to 2\pi$? [5]
(ii) Sketch C, showing clearly the features of the curve at the points where $\theta = 0$, π and 2π. [3]
(iii) Without using a calculator, find the exact area of the region bounded by C and the x-axis. [5]
(iv) A point P on C has parameter p, where $0 < p < \frac{1}{2}\pi$. Show that the normal to C at P crosses the x-axis at the point with coordinates $(p, 0)$. [3]
Exercise 597. (9740 N2012/II/1.) (Answer on p. 1710.)
(a) Find the general solution of the differential equation $\frac{d^2y}{dx^2} = 16 - 9x^2$, giving your answer in the form $y = f(x)$. [3]
(b) Given that u and t are related by $\frac{du}{dt} = 16 - 9u^2$, and that $u = 1$ when $t = 0$, find t in terms of u, simplifying your answer. [5]
(i) Use the first three non-zero terms of the Maclaurin series for $\cos x$ to find the Maclaurin series for $g(x)$, where $g(x) = \cos^6 x$, up to and including the term in $x^4$. [3]
(ii) (a) Use your answer to part (i) to give an approximation for $\int_0^a g(x) \, dx$ in terms of a, and evaluate this approximation in the case where $a = \frac{\pi}{4}$. [3]
1
$\frac{dv}{dt} = 10 - 0.1v^2$.
(iii) Find t in terms of v. Hence find the exact time the stone takes to reach a speed of 5
metres per second. [5]
(a) Find the speed of the stone after 1 second. [3]
(b) What happens to the speed of the stone for large values of t? [2]
(i) Show that the volume V cubic metres of the box is given by $V = 2n^2x - 6nx^2 + 4x^3$. [3]
(ii) Without using a calculator, find in surd form the value of x that gives a stationary
value of V , and explain why there is only one answer. [6]
(b) The region bounded by the curve $y = \frac{4x}{x^2 + 1}$, the axis and the lines $x = 0$ and $x = 1$ is rotated through 2π radians about the x-axis. Use the substitution $x = \tan\theta$ to show that the volume of the solid obtained is given by $16\pi \int_0^{\pi/4} \sin^2\theta \, d\theta$, and evaluate this integral exactly. [6]
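A substitution like the one in part (b) can be sanity-checked numerically: the volume integral in x and the transformed integral in θ should agree. The sketch below compares the two using Simpson's rule (the helper `simpson` and the strip count are assumptions for this illustration, and this is a check rather than the exact evaluation the question asks for):

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b]; n must be even."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Volume of revolution about the x-axis: pi * integral of y^2 from 0 to 1.
volume_in_x = math.pi * simpson(lambda x: (4 * x / (x * x + 1)) ** 2, 0.0, 1.0)

# The same volume after substituting x = tan(theta).
volume_in_theta = 16 * math.pi * simpson(lambda t: math.sin(t) ** 2, 0.0, math.pi / 4)

print(volume_in_x, volume_in_theta)
```

If the two printed values differed noticeably, that would point to a slip in the substitution (most often a forgotten $\sec^2\theta$ factor or unchanged limits).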
[Diagram: curve with α, β and γ marked on the x-axis, together with −1, O and 1.]
(i) Find the values of β and γ, giving your answers correct to 3 decimal places. [2]
(ii) Find the area of the region bounded by the curve and the x-axis between $x = \beta$ and $x = \gamma$. [2]
(iii) Use a non-calculator method to find the area of the region bounded by the curve and the line. [4]
(iv) Find the set of values of k for which the equation $x^3 - 3x + 1 = k$ has three real distinct roots. [2]
[Diagram: a box with dimensions x, 3x and y, and a lid with dimensions x, 3x and ky.]
(i) Use differentiation to find, in terms of k, the value of x which gives a minimum total
external surface area of the box and the lid. [6]
Remark 137. My interpretation: “Total external surface area” refers to the total when the
box and lid are kept separate, as depicted. Also, each of the box’s and the lid’s external
surface areas consists of five rectangles. Further, I assume that k is constant.
(ii) Find also the ratio of the height to the width, $\frac{y}{x}$, in this case, simplifying your answer. [2]
(iii) Find the values between which $\frac{y}{x}$ must lie. [2]
(iv) Find the value of k for which the box has square ends. [2]
(ii) The tangent at P meets the line y = x at the point A and the line y = −x at the point
B. Show that the area of triangle OAB is independent of p, where O is the origin.[4]
(iii) Find a cartesian equation of C. Sketch C, giving the coordinates of any points where
C crosses the x- and y-axes and the equations of any asymptotes. [4]
(i) Given that $f(x) = e^{\cos x}$, find f(0), f′(0) and f′′(0). Hence write down the first two
non-zero terms in the Maclaurin series for f(x). Give the coefficients in terms of e. [5]
(ii) Given that the first two non-zero terms in the Maclaurin series for f(x) are equal
to the first two non-zero terms in the series expansion of $\frac{1}{a + bx^2}$, where a and b are
constants, find a and b in terms of e. [4]
0
region between the curve and the positive x-axis. [4]
(iv) Find the exact value of $\int_{-2}^{2} |f(x)|\,dx$. [2]
(v) Find the volume of revolution when the region bounded by the curve, the lines x = 0,
x = 1 and the x-axis is rotated completely about the x-axis. Give your answer correct
to 3 significant figures. [2]
$x = t^2 + 4t$, $y = t^3 + t^2$.
(i) Sketch the curve for −2 ≤ t ≤ 1. [1]
(ii) The other scientist suggests that n and t are related by the differential equation
$\frac{dn}{dt} = 3 - 0.02n$. Find n in terms of t, given again that n = 100 when t = 0. Explain in
simple terms what will eventually happen to the population using this model. [7]
O 1 2
$\frac{dy}{dx} = \frac{3x}{x^2 + 1}$. [2]
(ii) Find the particular solution of the differential equation for which y = 2 when x = 0.[1]
(iii) What can you say about the gradient of every solution curve as x → ±∞? [1]
(iv) Sketch, on a single diagram, the graph of the solution found in part (ii), together with
2 other members of the family of solution curves. [3]
Exercise 619. (9740 N2008/I/5.) (Answer on p. 1722.)
(i) Find the exact value of $\int_0^{1/\sqrt{3}} \frac{1}{1 + 9x^2}\,dx$. [3]
(ii) Find, in terms of n and e, $\int_1^e x^n \ln x\,dx$, where n ≠ −1. [4]
(b) Given that $f(x) = \tan\left(2x + \frac{\pi}{4}\right)$, find f(0), f′(0) and f′′(0). Hence find the first 3
terms in the Maclaurin series of f(x). [5]
0.5
O 0.5 1 x
−0.5
(i) Write down an integral that gives the area of R, and evaluate this integral numerically.
[3]
(ii) The part of R above the x-axis is rotated through 2π radians about the x-axis. By
using the substitution u = 1 − x, or otherwise, find the exact value of the volume
obtained. [3]
(iii) Find the exact x-coordinate of the maximum point of C. [3]
$\frac{\cos 2x}{\sqrt{1 + x^2}} \approx a + bx^2$. [4]
$\int_0^1 x e^{-2x}\,dx = \frac{1}{4} - \frac{3}{4}e^{-2}$. [5]
(ii) Hence find the solution of the differential equation $xy\frac{dy}{dx} = x^2 + y^2$ for which y = 6
when x = 2. [5]
$x = \cos^3 t$, $y = \sin^3 t$, for $0 < t < \frac{1}{4}\pi$.
(i) Show that the equation of the normal to the curve at the point $P(\cos^3 t, \sin^3 t)$ is
(i) Use mathematical induction to prove the statement for all positive integers n. [6]
(ii) By considering the expression obtained by integrating each term on the left hand side,
prove the statement without using mathematical induction. [6]
$\ln(1 + x) - \frac{2x}{x + 2}$
is never negative. [5]
(ii) Hence show that $\ln(1 + x) \ge \frac{2x}{x + 2}$ when x ≥ 0. [3]
Remark 139. In (i), it is claimed that “$\ln(1 + x) - \frac{2x}{x + 2}$” is a function. But it is not. It
is simply an expression.
As repeatedly stressed in Ch. 10, to specify a function, we must state its domain, codomain,
and mapping rule. It turns out that the failure to do so here has important
consequences (see footnote in answer).
$4\frac{dI}{dt} = 2 - 3I$.
(i) Find I in terms of t, given that I = 2 when t = 0. [6]
(ii) State what happens to the current in this circuit for large values of t. [1]
$x = \cos^2 t$, $y = \sin^3 t$, for $0 \le t \le \frac{1}{2}\pi$.
(i) Sketch the curve. [2]
(ii) The tangent to the curve at the point $(\cos^2\theta, \sin^3\theta)$, where $0 < \theta < \frac{1}{2}\pi$, meets the x-
and y-axes at Q and R respectively. The origin is denoted by O. Show that the area
of △OQR is
$\frac{1}{12}\sin\theta\,(3\cos^2\theta + 2\sin^2\theta)^2$.
[6]
(iii) Show that the area under the curve for 0 ≤ t ≤ 0.5π is $2\int_0^{\frac{1}{2}\pi} \cos t \sin^4 t\,dt$, and use the
substitution sin t = u to find this area. [5]
powers of x, where $|x| < \frac{4}{3}$. Give your answer as a fraction in its lowest terms. [3]
Exercise 639. (9233 N2007/I/3.) (Answer on p. 1731.)
The region bounded by the curve $y = \frac{1}{\sqrt{1 + 4x^2}}$, the x-axis and the lines $x = \frac{1}{2}$ and $x = \frac{1}{2}\sqrt{3}$
is rotated through 4 right angles about the x-axis to form a solid of revolution of volume
V. Find the exact value of V, giving your answer in the form $k\pi^2$. [5]
∫ √ dt
1 − t2
Remark 140. For (i), let us specify also that $u \in \left(-\frac{\pi}{2}, \frac{\pi}{2}\right)$ so that (a) $t = \sin u \in (-1, 1)$;
(b) the integrand is well defined for all t; and (c) $\cos u \in (0, 1]$.
(i) $\frac{d^3y}{dx^3} = 2\,\frac{d^2y}{dx^2}\,\frac{dy}{dx}$, [3]
(ii) the value of $\frac{d^4y}{dx^4}$ when x = 0 is 2. [4]
(iii) Write down the Maclaurin series for ln (sec x) up to and including the term in $x^4$. [2]
(iv) By substituting $x = \frac{1}{4}\pi$, show that $\ln 2 \approx \frac{\pi^2}{16} + \frac{\pi^4}{1536}$. [3]
$\frac{dy}{dx} = \frac{x^2 + y^2}{2xy}$. [4]
$\frac{dy}{dx} = -\frac{2xy}{x^2 + y^2}$.
(ii) By substituting y = vx, where v is a function of x, show that, for the second family of
curves,
$x\frac{dv}{dx} = -\frac{3v + v^3}{1 + v^2}$. [4]
(iii) Hence show that the second family of curves is given by
$3x^2y + y^3 = C$,
384
In my opinion, there are two reasons why this simplifying assumption should have been included. First,
if x or y equals zero, then $\frac{dy}{dx}$ is undefined. Second, if x or y is negative, then there is a little additional
hassle we’ll have to deal with in (iii), in particular with regard to ln ∣⋅∣. My suspicion is that on this
particular exam, these issues were simply glossed over.
Exercise 645. (9233 N2006/I/7.) (Answer on p. 1733.)
A hollow cone of semi-vertical angle 45° is held with its axis vertical and vertex downwards
(see diagram). At the beginning of an experiment, it is filled with 390 cm³ of liquid.
The liquid runs out through a small hole at the vertex at a constant rate of 2 cm³ s⁻¹.
Find the rate at which the depth of the liquid is decreasing 3 minutes after the start
of the experiment. [6]
$3x^2 + xy + y^2 = 33$
Remark 143. For (ii), assume also that θ ∈ [0, π/2) ∪ [π, 3π/2), so that tan θ ≥ 0.
O x
R
P
(i) Prove that the gradient of QR is $-\frac{1}{qr}$. [2]
(ii) Given that the line through P perpendicular to QR meets the curve at $V\left(cv, \frac{c}{v}\right)$, find
v in terms of p, q and r. [2]
(iii) Find the gradient of the normal at P. [3]
(iv) The normal at P meets the curve again at $S\left(cs, \frac{c}{s}\right)$. Show that $s = -\frac{1}{p^3}$. [2]
(v) Given that angle QPR is 90°, prove that QR is parallel to the normal at P. [3]
(i) Given that $z = \frac{x}{(x^2 + 32)^{1/2}}$, show that $\frac{dz}{dx} = \frac{32}{(x^2 + 32)^{3/2}}$. [3]
(ii) Find the exact value of the area of the region bounded by the curve $y = \frac{1}{(x^2 + 32)^{3/2}}$,
In order to protect them from rusting, the spheres are given a coating which increases the
mass of each sphere by 10%.
(ii) Find the probability that the mass of a coated sphere is between 21.5 and 22.45 grams.
State the distribution you use and its parameters. [3]
(iii) The masses of the metal bars are normally distributed such that 60% of them have a
mass greater than 12.2 grams and 25% of them have a mass less than 12 grams. Find
the mean and standard deviation of the masses of metal bars. [4]
(iv) The probability that the total mass of a component, consisting of two randomly chosen
coated spheres and one randomly chosen bar, is more than k grams is 0.75. Find k,
stating the parameters of any distribution you use. [4]
Exercise 657. (9740 N2016/II/5.) (Answer on p. 1736.)
In a game of chance, a player has to spin a fair spinner. The spinner has 7 sections and an
arrow which has an equal chance of coming to rest over any of the 7 sections. The spinner
has 1 section labelled R, 2 sections labelled B and 4 sections labelled Y (see diagram).
[Diagram: the spinner, with one section labelled R, two labelled B and four labelled Y.]
The player then has to throw one of three fair six-sided dice, coloured red, blue or yellow.
If the spinner comes to rest over R the red die is thrown, if the spinner comes to rest over
B the blue die is thrown and if the spinner comes to rest over Y the yellow die is thrown.
The yellow die has one face with ∗ on it, the blue die has two faces with ∗ on it and the
red die has three faces with ∗ on it. The player wins the game if the die thrown comes to
rest with a face showing ∗ uppermost.
(i) Find the probability that a player wins a game. [2]
(ii) Given that a player wins a game, find the probability that the spinner came to rest
over B. [1]
(iii) Find the probability that a player wins 3 consecutive games, each time throwing a die
of a different colour. [2]
(i) The directors wish to survey a sample of 100 of the employees. This sample is to be
a stratified sample, based on department and gender.
(a) How many males should be in the sample? [1]
(b) How many females from the Development department should be in the sample?
[1]
The Managing Director knows that, some years ago, the mean age of employees was 37
years. He believes that the mean age of employees now is less than 37 years.
(ii) State why the stratified sample from part (i) should not be used for a hypothesis test
of the Managing Director’s belief. [1]
The Company Secretary obtains a suitable sample of 80 employees in order to carry out a
hypothesis test of the Managing Director’s belief that the mean age of the employees now
is less than 37 years. You are given that the population variance of the ages is 140 years².
(iii) Write down appropriate hypotheses to test the Managing Director’s belief. You are
given that the result of the test, using a 5% significance level, is that the Managing
Director’s belief should be accepted. Determine the set of possible values of the mean
age of the sample of employees. [4]
(iv) You are given instead that the mean age of the sample of employees is 35.2 years, and
that the result of a test at the α% significance level is that the Managing Director’s
belief should not be accepted. Find the set of possible values of α. [3]
(i) Plot a scatter diagram on graph paper for these values, labelling the axes, using a
scale of 2 cm to represent 10% efficiency on the y-axis and an appropriate scale for the
x-axis. On your diagram, circle the point that Xian has copied wrongly. [2]
For parts (ii), (iii) and (iv) of this question you should exclude the point for which Xian
has copied the efficiency value wrongly.
(ii) Explain from your scatter diagram why the relationship between x and y should not
be modelled by an equation of the form y = ax + b. [1]
(iii) Suppose that the relationship between x and y is modelled by an equation of the form
$y = \frac{c}{x} + d$, where c and d are constants. State with a reason whether each of c and d
is positive or negative. [2]
(iv) Find the product moment correlation coefficient and the constants c and d for the
model in part (iii). [3]
(v) Use the model $y = \frac{c}{x} + d$, with the values of c and d found in part (iv), to estimate the
efficiency value (y) that Xian has copied wrongly. Give two reasons why you would
expect this estimate to be reliable. [3]
h 2000 5000 10 000 15 000 20 000 25 000 30 000 35 000 40 000 45 000
P 27.8 24.9 20.6 16.9 13.8 11.1 8.89 7.04 5.52 4.28
(i) Draw a scatter diagram for these values, labelling the axes. [1]
(ii) Find, correct to 4 decimal places, the product moment correlation coefficient between
(a) h and P ,
(b) ln h and P ,
(c) $\sqrt{h}$ and P. [3]
(iii) Using the most appropriate case from part (ii), find the equation which best models
air pressure at different heights. [3]
(iv) Given that 1 metre = 3.28 feet, re-write your equation from part (iii) so that it can
be used to estimate the air pressure when the height is given in metres. [2]
m 11 20 28 36 40 47 58 62 68 75
P 112 800 102 600 76 500 72 000 72 000 69 000 65 800 57 000 50 600 47 600
It is thought that the price after m months can be modelled by one of the formulae
P = am + b, P = c ln m + d,
Set 1 + + + + × × × ◯ ◯ ⋆
Set 2 + + + × ◯ ◯ ◯ ◯ ⋆ ⋆
Set 3 + + × × × × ◯ ◯ ◯ ⋆
For example, if a + symbol is chosen from set 1, a ◯ symbol is chosen from set 2 and a ⋆
symbol is chosen from set 3, the display would be +◯⋆.
(i) Find the probability that, on one turn,
(a) ⋆ ⋆ ⋆ is displayed, [1]
(b) at least one ⋆ symbol is displayed, [2]
(c) two × symbols and one + symbol are displayed in any order. [3]
(ii) Given that exactly one of the symbols displayed is ⋆, find the probability that the
other two symbols are + and ◯. [4]
Exercise 677. (9740 N2014/II/11.) (Answer on p. 1742.)
An art dealer sells both original paintings and prints. (Prints are copies of paintings.) It
is to be assumed that his sales of originals per week can be modelled by the distribution
Po(2) and his sales of prints per week can be modelled by the independent distribution
Po(11).
(i) Find the probability that, in a randomly chosen week,
(a) the art dealer sells more than 8 prints, [2]
(b) the art dealer sells a total of fewer than 15 prints and originals combined. [2]
(ii) The probability that the art dealer sells fewer than 3 originals in a period of n weeks
is less than 0.01. Express this information as an inequality in n, and hence find the
smallest possible integer value of n. [5]
(iii) Using a suitable approximation, which should be stated, find the probability that the
art dealer sells more than 550 prints in a year (52 weeks). [3]
(iv) Give two reasons in context why the assumptions made at the start of this question
may not be valid. [2]
Exercise 678. (9740 N2013/II/5.) (Answer on p. 1743.)
A large multi-national company has 100 000 employees based in several different countries.
To celebrate the 90th anniversary of the founding of the company, the Chief Executive
wishes to invite a representative sample of 90 employees to a party, to be held at the
company’s Headquarters in Singapore.
(i) Explain how random sampling could be carried out to choose the 90 employees. Ex-
plain briefly why this may not provide the representative sample that the Chief Ex-
ecutive wants. [2]
(ii) Name a more appropriate sampling method, and explain how it can be carried out to
provide the representative sample that the Chief Executive wants. [2]
(i) State, in context, two assumptions needed for F to be well modelled by a binomial
distribution. [2]
Assume now that F has a binomial distribution.
(ii) Given that n = 20, find P (F = 1). [1]
(iii) Given instead that n = 60, use a suitable approximation to find the probability that
F is at least 5. State the parameter(s) of the distribution that you use. [3]
Exercise 681. (9740 N2013/II/8.) (Answer on p. 1743.) For events A and B it is given
that P(A) = 0.7, P (B∣A′ ) = 0.8 and P (A∣B ′ ) = 0.88. Find
(i) P (B ∩ A′ ), [1]
(ii) P (A′ ∩ B ′ ), [2]
(iii) P (A ∩ B). [3]
(i) Calculate unbiased estimates of the population mean and variance. [2]
The manufacturer claims that this model of car will travel 13.8 km per litre on average. It
is given that the distances travelled per litre for cars of this model are normally distributed.
(ii) Stating a necessary assumption, carry out a t-test of the magazine editor’s belief at
the 5% significance level. [5]
(ii) Draw the scatter diagram for these values, labelling the axes. [1]
(iii) Explain which of the three cases in part (i) is the most appropriate for modelling
these values, and calculate the product moment correlation coefficient for this case.[2]
(iv) It is required to estimate the distance travelled at a speed of 110 km h⁻¹. Use the case
that you identified in part (iii) to find the equation of a suitable regression line, and
use your equation to find the required estimate. [3]
Week x 1 2 3 4 5 6
Percentage mark y 38 63 67 75 71 82
L 91 92 93
r −0.929 944 −0.929 918
(iv) Calculate the value of r for L = 91, giving your answer correct to 6 decimal places.[1]
(v) Use the table and your answer to part (iv) to suggest with a reason which of 91, 92
or 93 is the most appropriate value for L. [1]
(vi) Using the value for L, calculate the values of a and b, and use them to predict the
week in which Amy will obtain her first mark of at least 90%. [4]
(vii) Give an interpretation, in context, of the value of L. [1]
(iv) For an unknown value of p it is given that P (A = 15) = 0.06864 correct to 5 decimal
places. Show that p satisfies an equation of the form p(1−p) = k, where k is a constant
to be determined. Hence find the value of p to a suitable degree of accuracy, given
that p < 0.5. [5]
(i) Sketch a scatter diagram that might be expected for the case when x and y are related
approximately by $y = a + bx^2$, where a is positive and b is negative. Your diagram should
include 5 points, approximately equally spaced with respect to x, and with all x- and
y-values positive. [1]
The table gives the values of seven observations of bivariate data, x and y.
(ii) Calculate the value of the product moment correlation coefficient, and explain why
its value does not necessarily mean that the best model for the relationship between
x and y is y = c + dx. [2]
(iii) Explain how to use the values obtained by calculating product moment correlation
coefficients to decide, for this data, whether $y = a + bx^2$ or y = c + dx is the better
model. [1]
(iv) It is desired to use the data in the table to estimate the value of y for which x = 3.2.
Find the equation of the least-squares regression line of y on $x^2$. Use your equation
to calculate the desired estimate. [3]
385
This question is terribly vague. A trivial but perfectly correct answer would be P (A′ ∩ B ∩ C) ≥ 0, but
I suspect that any smart aleck who wrote this didn’t get the mark.
Exercise 705. (9740 N2010/II/10.) (Answer on p. 1753.)
A car is placed in a wind tunnel and the drag force F for different wind speeds v, in
appropriate units, is recorded. The results are shown in the table.
v 0 4 8 12 16 20 24 28 32 36
F 0 2.5 5.1 8.8 11.2 13.6 17.6 22.0 27.8 33.9
(i) Draw the scatter diagram for these values, labelling the axes clearly. [2]
It is thought that the drag force F can be modelled by one of the formulae
$F = a + bv$ or $F = c + dv^2$
386
I have changed the wording of this sentence slightly.
Exercise 707. (9740 N2009/II/5.) (Answer on p. 1755.)
A cinema manager wishes to take a survey of opinions of cinema-goers. Describe how a
quota sample of size 100 might be obtained, and state one disadvantage of quota sampling.
[3]
$\sum x = 86.4$, $\sum x^2 = 835.92$.
(i) Calculate unbiased estimates of the mean and variance of X. [2]
The mean mass of sugar in a packet is claimed to be 10 grams. The company directors
want to know whether the sample indicates that this claim is incorrect.
(ii) Stating a necessary assumption, carry out a t-test at the 5% significance level. Explain
why the Central Limit Theorem does not apply in this context. [7]
(iii) Suppose now that the population variance of X is known, and that the assumption
made in part (ii) is still valid. What change would there be in carrying out the test?
[1]
$\sum x = 1026.0$, $\sum x^2 = 77\,265.90$.
Test, at the 5% significance level, whether the mean mass of calcium in a bottle has changed.
[6]
Exercise 716. (9740 N2008/II/7.) (Answer on p. 1758.)
A computer game simulates a tennis match between two players, A and B. The match
consists of at most three sets. Each set is won by either A or B, and the match is won by
the first player to win two sets.
The simulation uses the following rules.
• The probability that A wins the first set is 0.6.
• For each set after the first, the conditional probability that A wins that set, given that
A won the preceding set, is 0.7.
• For each set after the first, the conditional probability that B wins that set, given that
B won the preceding set, is 0.8.
Calculate the probability that
(i) A wins the second set, [2]
(ii) A wins the match, [3]
(iii) B won the first set, given that A wins the match. [3]
(i) Calculate the product moment correlation coefficient between x and t, and explain
whether your answer suggests that a linear model is appropriate. [3]
(ii) Draw a scatter diagram for the data. [1]
One of the values t appears to be incorrect.
(iii) Indicate the corresponding point on your diagram by labelling it P , and explain why
the scatter diagram for the remaining points may be consistent with a model of the
form t = a + b ln x. [2]
(iv) Omitting P , calculate least square estimates of a and b for the model t = a + b ln x.[2]
(v) Estimate the value of t at the value of x corresponding to P . [1]
(vi) Comment on the use of the model in part (iv) in predicting the value of t when x = 8.0.
[1]
(i) Give a real-life example of a situation in which quota sampling could be used. Explain
why quota sampling would be appropriate in this situation, and describe briefly any
disadvantage that quota sampling has. [4]
(ii) Explain briefly whether it would be possible to use stratified sampling in the situation
you have described in part (i). [1]
$\sum x = 4626$, $\sum x^2 = 147\,691$.
(i) Find unbiased estimates of the population mean and variance. [2]
(ii) Test, at the 5% significance level, whether the population mean time for a student to
complete the project exceeds 30 hours. [4]
(iii) State giving a valid reason, whether any assumptions about the population are needed
in order for the test to be valid. [1]
It is given that the value of the product moment correlation coefficient for this data is
−0.912, correct to 3 decimal places. The scatter diagram for the data is shown below.
[Scatter diagram omitted: one axis runs from 0 to 100 and the other from 0 to 350; an axis label reads t (minutes).]
[Figure: points labelled A, B, E, F, G, H, I and J.]
(i) How many different triangles can be drawn which have the point A as one of the
vertices? [1]
(ii) How many different triangles in total can be drawn? [4]
(i) A random sample of size 100 is taken from a population with mean 30 and standard
deviation 5. Find an approximate value for the probability that the sample mean lies
between 29.2 and 30.8. [6]
(ii) Giving a reason, state whether it is necessary to make any assumptions about the
distribution of the population. [1]
387
Assume also that each of the latter 5 balls is different from each of the first 3.
Exercise 743. (9233 N2006/II/28.) (Answer on p. 1766.)
Observations are made of the speeds of cars on a particular stretch of road during daylight
hours. It is found that, on average, 1 in 80 cars is travelling at a speed exceeding 125 km h⁻¹,
and 1 in 10 is travelling at a speed less than 40 km h⁻¹.
(i) Assuming a normal distribution, find the mean and the standard deviation of this
distribution. [4]
(ii) A random sample of 10 cars is to be taken. Find the probability that at least 7 will
be travelling at a speed in excess of 40 km h⁻¹. [3]
(iii) A random sample of 100 cars is to be taken. Using a suitable approximation, find the
probability that at most 8 cars will be travelling at a speed less than 40 km h⁻¹. [3]
388
See Preface/Rant — p. xli.
Paper I: Pure Mathematics [100]
      Q     A     Part     Topics                                                                Points
 1    1158  1685  Calc.    Maclaurin                                                             4 = 4
 2    1121  1603  F&G      graphs, absolute value                                                2 + 4 = 6
 3    1158  1685  Calc.    differentiation, stationary points, turning points, maximum, minimum  4 + 3 = 7
 4    1121  1604  F&G      conic sections, differentiation, asymptotes, transformations          3 + 3 + 2 = 8
 5    1121  1604  F&G      factorisation, Remainder Theorem, differentiation, quadratic          4 + 3 + 3 = 10
 6    1144  1652  Vectors  vector equations, lines, planes                                       2 + 3 + 3 = 8
 7    1158  1686  Calc.    integration, trigonometry                                             3 + 5 = 8
 8    1152  1664  Complex  quadratic, factorisation                                              3 + 4 + 3 = 10
 9    1134  1634  S&S      summation, limits, Maclaurin                                          3 + 2 + 4 + 4 = 13
10    1144  1652  Vectors  vector equations, lines, scalar product, quadratic                    4 + 4 + 5 = 13
11    1158  1686  Calc.    differential equations                                                1 + 3 + 5 + 4 = 13
Paper II, Section A: Pure Mathematics [40]
 1    1121  1605  F&G      parametric, differentiation, equations, points                        3 + 5 = 8
 2    1134  1634  S&S      arithmetic progression, geometric progression                         2 + 4 + 3 = 9
 3    1122  1605  F&G      inverse, composite functions, transformations, conic sections         4 + 2 + 4 + 2 = 12
 4    1159  1687  Calc.    graph, quadratic, integration, volume                                 4 + 4 + 3 = 11
Section B: Probability and Statistics [60]
 5    1187  1736  P&S                                                                            3 + 2 + 2 = 7
 6    1187  1736  P&S                                                                            2 + 3 + 4 = 9
 7    1187  1736  P&S                                                                            1 + 2 + 5 + 2 = 10
 8    1188  1736  P&S                                                                            3 + 2 + 3 + 2 = 10
 9    1189  1736  P&S                                                                            2 + 1 + 1 + 2 + 1 + 3 + 1 + 1 = 12
10    1189  1736  P&S                                                                            1 + 3 + 4 + 4 = 12
$x^2 + (k - 4)x - (k - 7) > 0$
A D
2x
B C
The diagram shows a sign in the shape of a rectangle ABCD with two semicircles, one
attached to AB and one attached to CD. The length of AB is 2x cm and the total perimeter
of the sign is 10 cm.
(i) Show that the area of the sign is x (10 − πx) cm². [3]
The area of the sign is to be as large as possible.
(ii) Use a non-calculator method to find the maximum value of this area, giving your answer
in terms of π. Justify that this is the maximum value. [4]
Employee A B C D E F G H
x 14 16 11 24 36 28 22 40
y 4.9 5.5 5.2 6.5 9.7 7.5 6.2 9.8
Sue has been employed by the company for 2 years and she earns 190 dollars per week.
(iv) Use the equation of your regression line to calculate an estimate of the weekly earnings
for employees on the production line who have been with the company for 2 years.[1]
Sue concludes that she should be earning more.
(v) Give two reasons why her conclusion might not be justified. [2]
$\sum y = 70.4$, $\sum y^2 = 49.42$.
(iii) Find unbiased estimates for the population mean and variance using this second
sample. [3]
(iv) Using this second sample, test at the 5% significance level, whether there is evidence
that the population mean volume of juice differs from 0.6 litres. [4]
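[Venn diagram: three overlapping circles labelled Marketing, Economics and Finance. The region counts, read from top to bottom, are 20, 16, 11, x (the centre), 12, 12 and 16.]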
Marketing, Economics and Finance are three subjects offered at a business college. The
numbers of students studying different combinations of these subjects are shown in the
above Venn diagram. Every student studies at least one of these subjects. The number
who study all three subjects is x. One of the students is chosen at random.
• M is the event that the student studies Marketing.
• E is the event that the student studies Economics.
• F is the event that the student studies Finance.
(i) Write down expressions for P (M ) and P (E) in terms of x. [2]
(ii) Given that events M and E are independent, find the value of x. [3]
(iii) Find P (M ∪ F ′ ). [1]
(iv) Explain, in the context of this question, what is meant by P (F ∣M ), and find its value.
[3]
Three students are chosen at random, without replacement.
(v) Find the probability that each studies exactly two of these three subjects. [3]
Exercise 755. (8865 N2017/12.) (Answer on p. 1244.)
There are bus and train services between the towns of Ayton and Beeton. The journey
times, in minutes, by bus and by train have independent normal distributions. The means
and standard deviations of these distributions are shown in the following table.
(i) Find the probability that a randomly chosen bus journey takes less than 48 minutes.
[1]
(ii) Find the probability that two randomly chosen bus journeys each take more than 48
minutes. [2]
(iii) The probability that the total time for two randomly chosen bus journeys is more
than 96 minutes is denoted by p. Without calculating its value, explain why p will be
greater than your answer to part (ii). [1]
Lan lives in Ayton and works in Beeton. Three days a week he travels from home to work
by bus and two days a week he travels from home to work by train.
(iv) Find the probability that for 3 randomly chosen bus journeys and 2 randomly chosen
train journeys, Lan’s total journey time is more than 210 minutes. [4]
Journeys are charged by the time taken. For bus journeys the charge is $0.12 per minute
and for train journeys the charge is $0.15 per minute.
Let B represent the cost of one journey from Ayton to Beeton by bus.
Let T represent the cost of one journey from Ayton to Beeton by train.
(v) Find P (3B − 2T < 3) and explain, in the context of this question, what your answer
represents. [5]
(iv) Find $\int_0^k \left(e^{-x} - x^2\right)dx$, where k > 0. Give your answer in terms of k. [3]
2x
D
The diagram shows a V-shape which is formed by removing the equilateral triangle DEF,
in which DE = y cm, from an equilateral triangle ABC, in which AB = 2x cm. The points
E and F are on BC such that BE = FC. The area of the V-shape ABEDFCA is $2\sqrt{3}$ cm².
(i) Show that $4x^2 - y^2 = 8$. [3]
(ii) Given that the perimeter of ABEDF CA is 10 cm, find the values of x and y. [6]
(i) Give a sketch of the scatter diagram for the data, as shown on your calculator. [2]
(ii) Find the product moment correlation coefficient. [1]
(iii) Find the equation of the regression line of y on x in the form y = mx + c, giving the
values of m and c correct to 3 significant figures. [1]
(iv) Calculate an estimate of the time taken to run 1000 metres by an athlete who swims
1000 metres in 16.9 minutes. State two reasons why you would expect this to be a
reliable estimate. [3]
The time taken by a new member of the club to swim 1000 metres and run 1000 metres
are 18.4 minutes and 2.6 minutes respectively.
(v) Calculate the new product moment correlation coefficient when the times for the new
member are included. [1]
(vi) State, with a reason, which of your answers to parts (ii) and (v) is more likely to
represent the correlation between swimming and running times for all members of the
club. [1]
(i) Find the probability that the mass of an individual biscuit is less than 19 grams. [2]
(ii) Find the probability that the total mass of a box containing 12 biscuits is more than
248 grams. State the mean and variance of the distribution that you use. [4]
The cost of producing biscuits is 0.6 cents per gram and the cost of producing empty boxes
is 0.2 cents per gram.
(iii) Find the probability that the total cost of producing a box containing 12 biscuits is
between 142 cents and 149 cents. State the mean and variance of the distribution that
you use. [5]
The main text above has not always been complete, precise, or rigorous. In these appendices,
I go some way towards filling in these gaps. In particular, I give formal definitions,
statements of claims, and proofs of claims.
Where there is a trade-off between generality of a result and the simplicity of its proof, I
usually favour the latter.
116.1. Logic
Proof. To prove that two statements are equivalent, we must show that in every possible
case, it is impossible that one is true while the other is false.
We will examine the four possible cases, depending on whether P and Q are true or false:
Case 1. Both P and Q are true. Then:
• P AND Q is true; and so, NOT- (P AND Q) is false.
• Both NOT-P and NOT-Q are false; and so, (NOT-P OR NOT-Q) is false.
Case 2. P is true while Q is false. Then:
• P AND Q is false; and so, NOT- (P AND Q) is true.
• NOT-Q is true; and so, (NOT-P OR NOT-Q) is true.
Case 3. P is false while Q is true. Then:
• P AND Q is false; and so, NOT- (P AND Q) is true.
• NOT-P is true; and so, (NOT-P OR NOT-Q) is true.
Case 4. Both P and Q are false. Then:
• P AND Q is false; and so, NOT- (P AND Q) is true.
• NOT-P is true; and so, (NOT-P OR NOT-Q) is true.
We can use a truth table to present the above case-by-case analysis more tidily and clearly
(“1” = true and “0” = false). In the truth table below, we present the same four cases in
the same order:
From the truth table, we can quickly tell that across all four possible cases, NOT-(P AND Q)
and (NOT-P OR NOT-Q) always have the same truth values. Thus, the two statements
are equivalent.
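The four-case check above can also be run mechanically. The following is a minimal Python sketch (not part of the original text); the helper names truth_rows, not_and and not_or_not are made up for illustration. It lists Cases 1 to 4 in the same order and compares the two statements row by row.

```python
from itertools import product


def truth_rows(lhs, rhs):
    """List (P, Q, lhs(P, Q), rhs(P, Q)) for Cases 1 to 4, in that order."""
    return [(p, q, lhs(p, q), rhs(p, q))
            for p, q in product([True, False], repeat=2)]


def not_and(p, q):
    # NOT-(P AND Q)
    return not (p and q)


def not_or_not(p, q):
    # (NOT-P OR NOT-Q)
    return (not p) or (not q)


rows = truth_rows(not_and, not_or_not)
for p, q, left, right in rows:
    # Print each case using 1 for true and 0 for false.
    print(int(p), int(q), int(left), int(right))

# The statements are equivalent if they agree in every one of the four cases.
equivalent = all(left == right for _, _, left, right in rows)
print("equivalent:", equivalent)
```

Swapping in two other functions of P and Q tests any other claimed equivalence the same way.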
Proof. The truth table below shows that NOT-(P OR Q) and (NOT-P AND NOT-Q)
always have the same truth values and are thus equivalent.
P   Q   P ⟹ Q   Q ⟹ P   (P ⟹ Q) AND (Q ⟹ P)   P ⟺ Q
1   1     1       1               1                1
1   0     0       1               0                0
0   1     1       0               0                0
0   0     1       1               1                1
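The table above can be reproduced with a short Python sketch. It assumes the usual reading of P ⟹ Q as (NOT-P OR Q); the helper name implies is made up for illustration.

```python
from itertools import product


def implies(p, q):
    # Material implication: false only when p is true and q is false.
    return (not p) or q


table = []
for p, q in product([True, False], repeat=2):
    both = implies(p, q) and implies(q, p)   # (P => Q) AND (Q => P)
    iff = (p == q)                           # P <=> Q
    table.append((int(p), int(q), int(implies(p, q)),
                  int(implies(q, p)), int(both), int(iff)))

for row in table:
    print(*row)
```

The last two entries of every row agree, which is what the table is meant to show.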
0 = {} = ∅.
1 = {0} = {{}} = {∅} .
2 = {0, 1} = {{} , {{}}} = {∅, {∅}} .
3 = {0, 1, 2} = {{} , {{}} , {{} , {{}}}} = {∅, {∅} , {∅, {∅}}} .
⋮
As another example, perhaps surprisingly the function is also defined to be a set. We’ll
see this shortly in Ch. 117.3 below.
We first give a proof sketch to expose the main ideas of the proof.
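To prove ⟸, consider for example $x = 8.344\,\overline{571\,93} = 8.344\,571\,935\,719\,357\,193\ldots$, where the digits 57 193 recur. Now consider $99\,999\,000x = 10^8 x - 10^3 x$. We have:
$$99\,999\,000x = 834\,457\,193.571\,93\ldots - 8\,344.571\,93\ldots = 834\,448\,849.$$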
We’ve just shown that x = 834 448 849/99 999 000 — x is the ratio of two integers and is
thus rational.
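The same subtraction trick works for any recurring decimal. Here is a minimal sketch using Python's `fractions` module; the function name `recurring_to_fraction` and its string-based interface are assumptions made for illustration, and the sketch only handles non-negative numbers:

```python
from fractions import Fraction

def recurring_to_fraction(integer_part, non_recurring, recurring):
    """Convert e.g. 8.344(57193) to a Fraction; digit blocks are passed as strings."""
    m = len(non_recurring)  # number of digits before the recurring block
    k = len(recurring)      # length of the recurring block
    # 10^(m+k) x and 10^m x have the same fractional part,
    # so their difference is the integer computed below.
    big = int(str(integer_part) + non_recurring + recurring)
    small = int(str(integer_part) + non_recurring)
    return Fraction(big - small, 10 ** (m + k) - 10 ** m)

x = recurring_to_fraction(8, "344", "57193")
print(x == Fraction(834448849, 99999000))
```

Note that `Fraction` reduces the ratio to lowest terms automatically, so the printed comparison checks equality of values rather than of numerators and denominators.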
To prove ⟹, consider for example $9/7 = 1.\overline{285\,714} = 1.285\,714\,285\,714\ldots$
Long division (the remainder at each step is highlighted in blue):
Line 1 1. 2 8 5 7 1 4
2 7 9 Explanation
3 7 1×7=7
4 2 0 9−7=2
5 1 4 2 × 7 = 14
6 6 0 20 − 14 = 6
7 5 6 8 × 7 = 56
8 4 0 60 − 56 = 4
9 3 5 5 × 7 = 35
10 5 0 40 − 35 = 5
11 4 9 7 × 7 = 49
12 1 0 50 − 49 = 1
13 7 1×7=7
14 3 0 10 − 7 = 3
15 2 8 4 × 7 = 28
16 2 30 − 28 = 2
At line 16, the division isn’t complete. But observe that the remainder at line 16 is the
same as in line 4 — namely, 2. And so clearly, the process will simply repeat. Lines 16
through 28 of the long division will look exactly the same as lines 4 through 16. Lines 28
through 40 will again look the same. Etc. The digits 285 714 will thus recur.
The key insight here is that there are only finitely many possible remainder values (here, where we divide by 7, these are 0, 1, . . . , 6). And thus, the remainder must eventually repeat (or hit zero). This completes the proof sketch.
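The proof sketch translates directly into a procedure: carry out the long division, record each remainder, and stop as soon as a remainder repeats or hits zero. The following is a minimal sketch for positive integers a and b (the function name `decimal_expansion` is made up for illustration):

```python
def decimal_expansion(a, b):
    """Long-divide a by b; return (integer part, non-recurring digits, recurring digits)."""
    q0, r = divmod(a, b)
    digits = []
    seen = {}  # maps each remainder to the position of the digit it produces
    while r != 0 and r not in seen:
        seen[r] = len(digits)
        q, r = divmod(10 * r, b)  # one step of the long division
        digits.append(str(q))
    if r == 0:
        # The division terminated, so no digits recur.
        return q0, "".join(digits), ""
    start = seen[r]  # the digits from this position onwards recur
    return q0, "".join(digits[:start]), "".join(digits[start:])

print(decimal_expansion(9, 7))
```

Since the remainder is always one of 0, 1, . . . , b − 1, the loop must stop after at most b steps.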
On the next page is a “proper” proof of the above Fact:
Observe that $A = 10^{m+k}x - 10^m x = (10^{m+k} - 10^m)x$ is an integer. And clearly, $10^{m+k} - 10^m$ is also an integer. Thus, x can be expressed as the ratio of two integers:
$$x = \frac{A}{10^{m+k} - 10^m}.$$
To prove ⟹, let a and b be integers. We'll prove that the digits in a/b must eventually recur.
Let $q_0$ be the largest integer such that $a = q_0 b + r_0$ and $r_0$ is a non-negative integer (so that when we divide a by b, $q_0$ is the quotient and $r_0$ is the remainder).
For each i = 1, 2, 3, . . . , let $q_i$ be the largest integer such that $10r_{i-1} = q_i b + r_i$ and $r_i$ is a non-negative integer (so that at each step of the long division, $q_i$ is the quotient and $r_i$ is the remainder).
And so we can write a/b as follows, with the recurring digits $q_{s+1} q_{s+2} \ldots q_t$:
We can actually reuse our earlier proofs (of De Morgan’s Laws from logic). But here as an
exercise, let’s prove these two laws using the set theory notation we’ve just learnt.
Fact 9. (P ∩ Q) ′ = P ′ ∪ Q′ .
x ∈ (P ∩ Q) ′
⇐⇒ x∉P ∩Q
⇐⇒ x ∉ P OR x ∉ Q
⇐⇒ x ∈ P ′ OR x ∈ Q′
⇐⇒ x ∈ P ′ ∪ Q′ .
Fact 10. (P ∪ Q) ′ = P ′ ∩ Q′ .
x ∈ (P ∪ Q) ′
⇐⇒ x∉P ∪Q
⇐⇒ x ∉ P AND x ∉ Q
⇐⇒ x ∈ P ′ AND x ∈ Q′
⇐⇒ x ∈ P ′ ∩ Q′ .
116.3. Division
x = dq + r and 0 ≤ r < d.
117.1. Graphs
As mentioned on p. 1253, every mathematical object can be defined in terms of sets. The
ordered pair is directly defined as a set.
There’s actually more than one way we can define an ordered pair.389 What’s important
is that our definition correctly captures the idea that (x, y) = (a, b) ⇐⇒ x = a and y = b.
This our above definition does:
Proof. ⟸ is trivial: If x = a AND y = b, then (x, y) = {{x} , {x, y}} = {{a} , {a, b}} = (a, b).
For ⟹, we'll prove the contrapositive: If x ≠ a OR y ≠ b, then (x, y) ≠ (a, b).
Suppose x ≠ a. Then {x} ≠ {a}. Moreover, {x} ≠ {a, b} (otherwise a = b = x, contradicting x ≠ a). So {x} is an element of (x, y) but not of (a, b) = {{a} , {a, b}}. Hence (x, y) ≠ (a, b).
Now suppose x = a and y ≠ b. Then (x, y) = {{x} , {x, y}} = {{a} , {a, y}}.
Case 1. If a ≠ b, then {a, b} equals neither {a} nor {a, y} (because b ≠ a and b ≠ y). So (x, y) does not contain {a, b} and therefore (x, y) ≠ (a, b).
Case 2. If a = b, then (a, b) = {{a}} and y ≠ a. So {a, y} is an element of (x, y) but not of (a, b), and therefore (x, y) ≠ (a, b).
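The defining property of the ordered pair can also be checked by brute force on a small set of values. The sketch below models sets with Python's `frozenset`; the function names `kpair` and `pairs_agree` are made up for illustration, and the check is of course no substitute for the proof:

```python
from itertools import product

def kpair(x, y):
    """Kuratowski ordered pair (x, y) = {{x}, {x, y}}, built from frozensets."""
    return frozenset({frozenset({x}), frozenset({x, y})})

def pairs_agree(values):
    """Check that (x, y) = (a, b) exactly when x = a and y = b, for all choices from values."""
    return all(
        (kpair(x, y) == kpair(a, b)) == (x == a and y == b)
        for x, y, a, b in product(values, repeat=4)
    )

print(pairs_agree(range(4)))
print(kpair(1, 1) == frozenset({frozenset({1})}))  # (1, 1) collapses to {{1}}
```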
We then define the ordered triple (x, y, z) to be the ordered pair ((x, y) , z). Similarly,
we define the ordered quadruple (x1 , x2 , x3 , x4 ) to be the ordered pair ((x1 , x2 , x3 ) , x4 ).
Etc. In general, here is how the ordered n-tuple is defined:
Definition 218. Let n ≥ 3. The ordered n-tuple (x1 , x2 , . . . , xn ) is defined recursively by:
Remark 144. Many writers simply call it a tuple instead of an ordered n-tuple.
389
The above definition is by Kuratowski (1921, p. 171) and is today the one that’s usually used.
Fact 189. Two ordered n-tuples (x1 , x2 , . . . , xn ) and (y1 , y2 , . . . , yn ) are identical if and
only if xi = yi for all i = 1, 2, . . . , n.
In the rest of this subchapter, everything will be in the context of the cartesian plane
R2 = {(x, y) ∶ x, y ∈ R}.
Definition 220. The distance between the points (x1 , y1 ) and (x2 , y2 ) is:
$$\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}.$$
Definition 221. If the distance between two points A and B is shorter than that between A and C, we say that B is closer to A than C.
Fact 190. There is exactly one point on the line ax + by + c = 0 that is closest to (p, q).
Proof. If b = 0, then every point in the line is of the form (−c/a, y). Clearly then, the unique
closest point is simply (−c/a, q).
So suppose b ≠ 0. Then y = (−a/b) x − c/b.
Let d denote the distance between an arbitrary point (x, y) on the line and the point (p, q).
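$$d = \sqrt{(x - p)^2 + (y - q)^2} = \sqrt{(x - p)^2 + \left(-\frac{a}{b}x - \frac{c}{b} - q\right)^2}$$
$$= \sqrt{x^2 + p^2 - 2px + \frac{a^2}{b^2}x^2 + \frac{c^2}{b^2} + q^2 + 2\frac{ac}{b^2}x + 2\frac{aq}{b}x + 2\frac{cq}{b}}$$
$$= \sqrt{\left(1 + \frac{a^2}{b^2}\right)x^2 + 2\left(\frac{ac}{b^2} + \frac{aq}{b} - p\right)x + p^2 + \frac{c^2}{b^2} + q^2 + 2\frac{cq}{b}}.$$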
The quadratic expression inside the surd has a positive coefficient on $x^2$, which means it has a strict global and thus unique minimum (see Ch. 9). Thus, d itself has a unique minimum: there is exactly one point on the line that is closest to (p, q).
390
By the way, though embedded in n-dimensional space, a point is itself a zero-dimensional object.
Corollary 35. The point on the line ax + by + c = 0 that is closest to the point (p, q) is:
$$\left(p - a\frac{ap + bq + c}{a^2 + b^2},\; q - b\frac{ap + bq + c}{a^2 + b^2}\right).$$
Proof. If b = 0, then as stated in the previous proof, the closest point is (−c/a, q), which the
reader can verify is indeed equal to the point claimed by the present corollary.
So, suppose b ≠ 0. We know from Ch. 9 that if m > 0, then $mx^2 + nx + o$ achieves a strict global minimum at x = −n/2m. And so, continuing with our previous proof, the value of x that minimises d is:
$$x = -\frac{2\left(\frac{ac}{b^2} + \frac{aq}{b} - p\right)}{2\left(1 + \frac{a^2}{b^2}\right)} = \frac{b^2 p - ac - abq}{a^2 + b^2} = \frac{a^2 p + b^2 p - ac - abq - a^2 p}{a^2 + b^2} = p - a\frac{ap + bq + c}{a^2 + b^2}.$$
And the corresponding value of y is:
$$y = -\frac{a}{b}x - \frac{c}{b} = -\frac{ax + c}{b} = -\frac{a\frac{b^2 p - ac - abq}{a^2 + b^2} + c}{b} = -\frac{a(b^2 p - ac - abq) + a^2 c + b^2 c}{b(a^2 + b^2)}$$
$$= -\frac{a(b^2 p - abq) + b^2 c}{b(a^2 + b^2)} = \frac{a^2 q - abp - bc}{a^2 + b^2} = \frac{a^2 q + b^2 q - b^2 q - abp - bc}{a^2 + b^2} = q - b\frac{ap + bq + c}{a^2 + b^2}.$$
Definition 123. Let A be a point and l be a line. Suppose B is the point on l that's closest to A. Then the distance between A and l is $|\overrightarrow{AB}|$.
Corollary 36. The distance between a point (p, q) and a line ax + by + c = 0 is:
$$\frac{|ap + bq + c|}{\sqrt{a^2 + b^2}}.$$
Proof. By Corollary 35, the point on the line that is closest to (p, q) is:
$$\left(p - a\frac{ap + bq + c}{a^2 + b^2},\; q - b\frac{ap + bq + c}{a^2 + b^2}\right).$$
By Definition 220, the distance between this point and (p, q) is:
$$d = \sqrt{\left[p - \left(p - a\frac{ap + bq + c}{a^2 + b^2}\right)\right]^2 + \left[q - \left(q - b\frac{ap + bq + c}{a^2 + b^2}\right)\right]^2}$$
$$= \sqrt{\left(a\frac{ap + bq + c}{a^2 + b^2}\right)^2 + \left(b\frac{ap + bq + c}{a^2 + b^2}\right)^2} = \left|\frac{ap + bq + c}{a^2 + b^2}\right|\sqrt{a^2 + b^2} = \frac{|ap + bq + c|}{\sqrt{a^2 + b^2}}.$$
In Ch. 119.14, an Appendix in the Appendices for Part III (Vectors), we’ll prove the above
two Corollaries again using different methods.
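As a quick numerical illustration of Corollaries 35 and 36, here is a minimal sketch that computes the closest point and the squared distance exactly with `fractions.Fraction`. The function names are made up for illustration, and the sketch assumes integer (or rational) inputs with a and b not both zero:

```python
from fractions import Fraction

def closest_point(a, b, c, p, q):
    """Point on ax + by + c = 0 closest to (p, q), as in Corollary 35."""
    k = Fraction(a * p + b * q + c, a * a + b * b)
    return (p - a * k, q - b * k)

def distance_squared(a, b, c, p, q):
    """Square of the distance in Corollary 36 (kept squared to stay exact)."""
    return Fraction((a * p + b * q + c) ** 2, a * a + b * b)

# Example: the line 3x + 4y - 10 = 0 and the point (0, 0).
x, y = closest_point(3, 4, -10, 0, 0)
print(x, y)                               # the closest point
print(3 * x + 4 * y - 10 == 0)            # it lies on the line
print(distance_squared(3, 4, -10, 0, 0))  # the squared distance
```

Working with the squared distance avoids square roots, so every comparison in the example is exact.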
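Proof. Suppose the lines ax + by + c = 0 and dx + ey + f = 0 both contain the distinct points (p, q) and (r, s). Then:
$$ap + bq + c \overset{1}{=} 0, \quad ar + bs + c \overset{2}{=} 0, \quad dp + eq + f \overset{3}{=} 0, \quad\text{and}\quad dr + es + f \overset{4}{=} 0.$$
Subtracting $\overset{2}{=}$ from $\overset{1}{=}$ and $\overset{4}{=}$ from $\overset{3}{=}$, we get:
$$a(p - r) \overset{5}{=} b(s - q) \quad\text{and}\quad d(p - r) \overset{6}{=} e(s - q).$$
If p = r, then given that (p, q) ≠ (r, s), we have s ≠ q. Since p − r = 0 and s − q ≠ 0, $\overset{5}{=}$ and $\overset{6}{=}$ imply that b = e = 0. That is, the coefficient on y for each of our two lines is zero. Hence our two lines are simply x = p and x = r. But of course, p = r and so the two lines are identical.
So suppose instead that p ≠ r. Then $\overset{5}{=}$ and $\overset{6}{=}$ may be rewritten as:
$$a \overset{7}{=} b\frac{s - q}{p - r} \quad\text{and}\quad d \overset{8}{=} e\frac{s - q}{p - r}.$$
Since at least one of a or b must be non-zero, $\overset{7}{=}$ implies that b must be non-zero (if b = 0, then a = 0 too). Similarly, since at least one of d or e must be non-zero, $\overset{8}{=}$ implies that e must be non-zero. Plugging $\overset{7}{=}$ into $\overset{1}{=}$ and dividing by b, and plugging $\overset{8}{=}$ into $\overset{3}{=}$ and dividing by e, we get:
$$\frac{s - q}{p - r}p + q + \frac{c}{b} \overset{9}{=} 0 \quad\text{and}\quad \frac{s - q}{p - r}p + q + \frac{f}{e} \overset{10}{=} 0.$$
Comparing $\overset{9}{=}$ and $\overset{10}{=}$, we have $\frac{c}{b} \overset{11}{=} \frac{f}{e}$.
We now show that a point (t, u) is in the line ax + by + c = 0 if and only if it is also in the line dx + ey + f = 0. We will thus have shown that the two lines are identical:
$$at + bu + c = 0 \overset{7}{\iff} b\frac{s - q}{p - r}t + bu + c = 0 \iff b\left(\frac{s - q}{p - r}t + u + \frac{c}{b}\right) = 0 \iff \frac{s - q}{p - r}t + u + \frac{c}{b} = 0$$
$$\overset{11}{\iff} e\left(\frac{s - q}{p - r}t + u + \frac{f}{e}\right) = 0 \overset{8}{\iff} dt + eu + f = 0.$$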
Fact 192. Given two distinct points (p, q) and (r, s), the unique line that contains both
points is (q − s) x + (r − p) y + ps − qr = 0.391
Proof. We need merely plug in and verify that the given line contains (p, q) and (r, s):
(q − s) p + (r − p) q + ps − qr = 0. ✓    (q − s) r + (r − p) s + ps − qr = 0. ✓
Our previous fact then states that this is the unique line that contains both points.
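The formula in Fact 192 is easy to try out on concrete points. The following minimal sketch (with made-up function names `line_through` and `on_line`) builds the coefficients of the line and checks that both points satisfy the equation, including in the vertical case:

```python
def line_through(p, q, r, s):
    """Coefficients (a, b, c) of the line (q - s)x + (r - p)y + (ps - qr) = 0."""
    return (q - s, r - p, p * s - q * r)

def on_line(line, x, y):
    """Check whether (x, y) satisfies ax + by + c = 0."""
    a, b, c = line
    return a * x + b * y + c == 0

line = line_through(1, 2, 3, 8)
print(line)
print(on_line(line, 1, 2), on_line(line, 3, 8))

# A vertical example, where r = p: the line through (1, 2) and (1, 5) is x = 1.
vertical = line_through(1, 2, 1, 5)
print(vertical, on_line(vertical, 1, 100))
```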
391 Note that if r ≠ p (the line isn't vertical), then this line may be written as: $y = \frac{s - q}{r - p}x + \frac{qr - ps}{r - p}$. And if r = p (the line is vertical), then it may be written as: $x = \frac{qr - ps}{q - s}$.
Fact 14. The line containing the distinct points (a1 , b1 ) and (a2 , b2 ) is:
(a2 − a1 ) (y − b1 ) = (b2 − b1 ) (x − a1 ) .
Proof. By Fact 192, the unique line that contains both points is $(b_1 - b_2)x + (a_2 - a_1)y + a_1 b_2 - b_1 a_2 = 0$. Rearranging, $(a_2 - a_1)(y - b_1) = (b_2 - b_1)(x - a_1)$.
Definition 222. Let A = (p, q) and B = (r, s) be distinct points with p < r or q < s. Then
1. The line AB is the graph of the equation (q − s) x + (r − p) y + ps − qr = 0;
2. The line segment AB is the graph of the equation (q − s) x + (r − p) y + ps − qr = 0 with
the constraint x ∈ [p, r]; and
3. The ray AB is the graph of the equation (q − s) x + (r − p) y + ps − qr = 0 with the
constraint x ≥ p if p < r and the constraint y ≥ q if q < s.
Fact 193. Let A be a point that is not on the line l. Let B be the point on l that is closest
to A. Then l is perpendicular to the line m which contains both A and B.
Proof. Let A = (p, q), B = (r, s), and l be given by ax + by + c = 0. Then by Corollary 35,
$$B = (r, s) = \left(p - a\frac{ap + bq + c}{a^2 + b^2},\; q - b\frac{ap + bq + c}{a^2 + b^2}\right).$$
If a = 0, then l is horizontal and m is vertical, so that the two lines are perpendicular.
If instead b = 0, then l is vertical and m is horizontal, so that again, the two lines are perpendicular.
So suppose a ≠ 0 ≠ b. Then the gradient of the line m is:
$$\left(-b\frac{ap + bq + c}{a^2 + b^2}\right) \div \left(-a\frac{ap + bq + c}{a^2 + b^2}\right) = \frac{b}{a}.$$
This is the negative reciprocal of the gradient −a/b of the line l, so that the two lines are perpendicular.
Proof. Let x = (p, q) and y = (r, s), so that by Definition 44, z = (2r − p, 2s − q).
(a) The distance between z and y is:
$$\sqrt{[(2r - p) - r]^2 + [(2s - q) - s]^2} = \sqrt{(r - p)^2 + (s - q)^2} = \sqrt{(p - r)^2 + (q - s)^2},$$
(q − s) (2r − p) + (r − p) (2s − q) + ps − qr = 0. ✓
Fact 195. The reflection of the point (p, q) in the line ax + by + c = 0 is the point
$$\left(p - 2a\frac{ap + bq + c}{a^2 + b^2},\; q - 2b\frac{ap + bq + c}{a^2 + b^2}\right).$$
Proof. By Corollary 35, the point on the line ax + by + c = 0 that is closest to (p, q) is:
$$\left(p - a\frac{ap + bq + c}{a^2 + b^2},\; q - b\frac{ap + bq + c}{a^2 + b^2}\right).$$
Corollary 1. The reflection of the point (p, q) in the line y = x is the point (q, p).
Corollary 2. The reflection of the point (p, q) in the line y = −x is the point (−q, −p).
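Both corollaries can be spot-checked numerically. The sketch below assumes the line is written in the form ax + by + c = 0 used in Corollary 35, so that y = x becomes x − y = 0 and y = −x becomes x + y = 0; the function name `reflect` is made up for illustration:

```python
from fractions import Fraction

def reflect(a, b, c, p, q):
    """Reflection of (p, q) in the line ax + by + c = 0 (a, b not both zero)."""
    k = Fraction(a * p + b * q + c, a * a + b * b)
    # Move twice as far as the closest point on the line.
    return (p - 2 * a * k, q - 2 * b * k)

print(reflect(1, -1, 0, 3, 7))  # reflection of (3, 7) in y = x
print(reflect(1, 1, 0, 3, 7))   # reflection of (3, 7) in y = -x
```

Reflecting twice in the same line should return the original point, which gives another simple sanity check.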
Proof. (a) Let (x1 , y1 ) and (x2 , y2 ) be points on the line with x1 ≠ x2 . Since both are points
on the line, we have:
ax1 + by1 + c = 0 and ax2 + by2 + c = 0.
First, suppose a = 0. Then by1 + c = 0 and by2 + c = 0, so that y1 = y2 = −c/b. We have just
shown that any two arbitrary points on the line have the same y-coordinate. And so by
Definition 40, the line is horizontal.
Next, suppose instead that a ≠ 0. If b = 0, then (x1 , y1 + 1) is also a point on the line, so
that the line is not horizontal (because it contains two points with different y-coordinates).
And if b ≠ 0, then:
$$y_1 = -\frac{ax_1 + c}{b} \quad\text{and}\quad y_2 = -\frac{ax_2 + c}{b}.$$
But since x1 ≠ x2 and a ≠ 0, it follows that y1 ≠ y2 , so that again the line is not horizontal
(again because it contains two points with different y-coordinates).
(b) The proof of (b) is very similar and thus omitted.
Definition 224. Let P = (a, b) be a point in the graph G. We say that P is:
1. A global maximum (point) of G if for all (x, y) ∈ G, we have b ≥ y.
2. The strict global maximum (point) of G if for all (x, y) ∈ G with (x, y) ≠ P, we have b > y.
3. A local maximum (point) of G if there exists ε > 0 such that:
Note: There can be at most one strict global maximum and at most one strict global
minimum — hence the use of the definite article the. In contrast, there can be more than
one of each of the other types of extreme points — hence the use of the indefinite article a.
392
For the formal definition, see Definition 249 in the Appendices for Calculus.
117.2. The Quadratic Equation
(b) If $b^2 - 4ac = 0$, then there is one x-intercept (i.e. one real root), where the graph just touches the x-axis:
$$x = -\frac{b}{2a}.$$
We can factorise the quadratic polynomial:
$$ax^2 + bx + c = a\left(x + \frac{b}{2a}\right)^2.$$
(c) If $b^2 - 4ac < 0$, then there are no x-intercepts (i.e. no real roots). There is also no way to factorise the quadratic polynomial $ax^2 + bx + c$ (unless we use complex numbers).
3. There is one line of symmetry, which is vertical:
$$x = -\frac{b}{2a}.$$
4. There is one turning point:
$$\left(-\frac{b}{2a},\; -\frac{b^2}{4a} + c\right).$$
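$$(x, y) \in G \iff \left(-\frac{b}{a} - x,\; y\right) \in G.$$
And to do so, we write:
$$a\left(-\frac{b}{a} - x\right)^2 + b\left(-\frac{b}{a} - x\right) + c = a\left(\frac{b^2}{a^2} + x^2 + \frac{2b}{a}x\right) + b\left(-\frac{b}{a} - x\right) + c$$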
(4) and (5) Differentiate the quadratic equation y = ax2 + bx + c with respect to x to get:
y ′ (x) = 2ax + b.
Definition 225. Given two sets A and B, their cartesian product, denoted A × B, is:
A × B = {(x, y) ∶ x ∈ A, y ∈ B} .
Observe that thus defined, a function is simply a set of points. And so, a function and
what we called its graph in the main text above are really one and the same thing.393
Fact 196. Suppose the graph of f has x-intercept (p, 0), y-intercept (0, q), line of symmetry αx + by + c = 0, turning point (r, s), and asymptote α̂x + b̂y + ĉ = 0. Then:
(a) The graph of y = f (x) + a has y-intercept (0, q + a), line of symmetry αx + b (y − a) + c = 0, turning point (r, s + a), and asymptote α̂x + b̂ (y − a) + ĉ = 0.
(b) The graph of y = f (x + a) has x-intercept (p − a, 0), line of symmetry α (x + a) + by + c = 0, turning point (r − a, s), and asymptote α̂ (x + a) + b̂y + ĉ = 0.
Proof. We prove only (a); the proof of (b) is very similar and is thus omitted.
(a) Intercept. If q = f (0), then q + a = f (0) + a. In other words, if (0, q) satisfies y = f (x),
then (0, q + a) satisfies y = f (x) + a. Thus, (0, q + a) is a y-intercept for the graph of
y = f (x) + a.
Line of symmetry. Let (x1 , y1 ) be any point in the graph of y = f (x) + a. Its reflection
in the line αx + b (y − a) + c = 0 is:
$$S = (S_x, S_y) = \left(x_1 - 2\alpha\frac{\alpha x_1 + by_1 + c - ba}{\alpha^2 + b^2},\; y_1 - 2b\frac{\alpha x_1 + by_1 + c - ba}{\alpha^2 + b^2}\right).$$
Our goal is to show that S satisfies y = f (x) + a and is in the graph of y = f (x) + a. We
will thus have shown that αx + b (y − a) + c = 0 is a line of symmetry for y = f (x) + a.
Now, note that (x1 , y1 − a) is in the graph of y = f (x). So too is its reflection in the line
αx + by + c = 0, which is:
$$T = (T_x, T_y) = \left(x_1 - 2\alpha\frac{\alpha x_1 + b(y_1 - a) + c}{\alpha^2 + b^2},\; y_1 - a - 2b\frac{\alpha x_1 + b(y_1 - a) + c}{\alpha^2 + b^2}\right).$$
393
As mentioned in footnote 98.
Turning point. To be written.
Asymptote. To be written.
Proof. Suppose f is not invertible. Then by the definition of an invertible function (Definition 57), there are $x_1, x_2 \in \text{Domain} f$ such that $x_1 \ne x_2$ and $f(x_1) = f(x_2)$. Let $y = f(x_1) = f(x_2)$. Our definition of an inverse function (Definition 56) now fails to clearly specify whether $f^{-1}$ maps y to $x_1$ and/or $x_2$. In other words, $f^{-1}$ is not well-defined.
Now suppose f is invertible. Then $f^{-1}$ does indeed map every element in its domain to exactly one element in its codomain, as per the mapping rule given in Definition 56:
$$f(x) = y \implies f^{-1}(y) = x.$$
Fact 22. Let f be an invertible function and f −1 be its inverse. Then f and f −1 are
reflections of each other in the line y = x.
Proof. (By contradiction.) Suppose f is invertible, but neither strictly increasing nor
strictly decreasing on D.
Since f is neither strictly increasing nor strictly decreasing, there exist394 x1 , x2 , x3 ∈ D
with x1 < x2 < x3 such that f (x2 ) ≤ min {f (x1 ) , f (x3 )} or f (x2 ) ≥ max {f (x1 ) , f (x3 )}.
Actually, since f is invertible, these last two weak inequalities may be replaced by strict
ones.
If f (x2 ) < min {f (x1 ) , f (x3 )} = a, then pick any y ∈ (f (x2 ) , a). By the Intermediate
Value Theorem (Theorem 6), there exist x4 ∈ (x1 , x2 ) and x5 ∈ (x2 , x3 ) such that f (x4 ) = y
and f (x5 ) = y, so that f is not invertible and we have our desired contradiction.
The case where f (x2 ) > max {f (x1 ) , f (x3 )} = b is similarly handled.
Proof. Suppose f is strictly increasing. Pick any distinct $y_1, y_2 \in \text{Range} f = \text{Domain} f^{-1}$, with $y_1 < y_2$. Since f is strictly increasing, there exist distinct $x_1, x_2 \in \text{Domain} f$ with $x_1 < x_2$ such that $f(x_1) = y_1$ and $f(x_2) = y_2$. Thus, $f^{-1}(y_1) = x_1 < f^{-1}(y_2) = x_2$ and $f^{-1}$ is also strictly increasing.
394 We should actually have mentioned that in the case where D is empty or contains a single point, then the Proposition is “obviously” or vacuously true. So in our proof, we shall assume that D contains more than one point.
The case where f is strictly decreasing is similar and thus omitted.
Proof. Suppose for contradiction that (a, b) ∈ f, f −1 with a ≠ b. That is, suppose there
exists a shared intersection point (a, b) that is not on the line y = x. Then by Fact 22,
Fact 27. Let a, b > 0 and c, d ∈ R. Let f be a nice function. Then to get the graph of
y = af (bx + c) + d, follow these steps:
1. Translate leftwards by c units, to get y = f (x + c).
2. Compress horizontally (inwards towards y-axis) by a factor of b, to get y = f (bx + c).
3. Stretch vertically (outwards from x-axis) by a factor of a, to get y = af (bx + c).
4. Translate upwards by d units, to get y = af (bx + c) + d.
Proof. In each case, we need merely verify that Definition 227 is met:
(1) Observe that q = f (p) ⇐⇒ q = f (p − c + c). Hence, the point (p, q) is in the graph
of f ⇐⇒ (p − c, q) is in the graph of y = f (x + c). And so, by Definition 227(1)(d), the
graph of y = f (x + c) is the graph of f translated leftwards by c units.
(2) Observe that q = f (p + c) ⇐⇒ q = f (bp/b + c). Hence, the point (p, q) is in the
graph of y = f (x + c) ⇐⇒ (p/b, q) is in the graph of y = f (bx + c). And so, by Definition
227(3)(b), the graph of y = f (bx + c) is the graph of y = f (x + c) compressed by a factor of
b, inwards towards the y-axis.
(3) Observe that q = f (bp + c) ⇐⇒ aq = af (bp + c). Hence, the point (p, q) is in the
graph of y = f (bx + c) ⇐⇒ (p, aq) is in the graph of y = af (bx + c). And so, by Definition
227(2)(a), the graph of y = af (bx + c) is the graph of y = f (bx + c) stretched by a factor of
a, outwards from the x-axis.
(4) Observe that q = af (bp + c) ⇐⇒ q + d = af (bp + c) + d. Hence, the point (p, q) is in the graph of y = af (bx + c) ⇐⇒ (p, q + d) is in the graph of y = af (bx + c) + d. And so, by Definition 227(1)(a), the graph of y = af (bx + c) + d is the graph of y = af (bx + c) translated upwards by d units.
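The four steps can be traced on individual points. The sketch below applies them in order to points (p, f(p)) on the graph of a sample function and checks that each image satisfies y = af(bx + c) + d. The function name `transform_point`, the sample function f, and the sample constants are all made up for illustration:

```python
from fractions import Fraction

def transform_point(p, q, a, b, c, d):
    """Image of the point (p, q) under the four steps of Fact 27, in order."""
    x, y = Fraction(p) - c, Fraction(q)  # 1. translate leftwards by c units
    x = x / b                            # 2. compress horizontally by a factor of b
    y = a * y                            # 3. stretch vertically by a factor of a
    y = y + d                            # 4. translate upwards by d units
    return x, y

def f(t):
    """A sample function, chosen arbitrarily for this check."""
    return t ** 3 - t

a, b, c, d = 2, 4, 1, 3
checks = []
for p in range(-3, 4):
    x, y = transform_point(p, f(p), a, b, c, d)
    # The image should lie on the graph of y = a f(bx + c) + d.
    checks.append(y == a * f(b * x + c) + d)
print(all(checks))
```

Exact fractions are used so that the equality test is not affected by rounding.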
Definition 228. Let A = (p, q) and B = (r, s) be distinct points on the circle (x − a) +
2
Here are this textbook’s official, formal definitions of sine and cosine, reproduced from
Ch. 76.4.
With “some” work, it is possible to prove that under Definitions 177 and 178, all results
stated in the main text continue to hold.
Fact 197. Let x ∈ [−1, 1]. Then $\sin(\cos^{-1} x) = \sqrt{1 - x^2}$.
Theorem 4. (Euclidean Division Theorem for Polynomials.) Let p (x) and d (x)
be P - and D-degree polynomials in x with D < P . Then there exists a unique polynomial
q (x) of degree P − D such that r (x) = p (x) − d (x)q (x) has degree less than D.
Proof. Let $p(x) = \sum_{i=0}^{P} p_i x^i$ and $d(x) = \sum_{i=0}^{D} d_i x^i$. Now construct $q(x) = \sum_{i=0}^{P-D} q_i x^i$, with:
$$q_{P-D} = \frac{p_P}{d_D},$$
$$q_{P-D-1} = \frac{p_{P-1} - q_{P-D}d_{D-1}}{d_D} = \frac{p_{P-1} - \frac{p_P}{d_D}d_{D-1}}{d_D},$$
$$q_{P-D-2} = \frac{p_{P-2} - (q_{P-D-1}d_{D-1} + q_{P-D}d_{D-2})}{d_D},$$
$$\vdots$$
$$q_0 = \frac{p_D - (q_1 d_{D-1} + \cdots + q_D d_0)}{d_D}.$$
By construction, we have:
$$d(x)q(x) = \sum_{j=0}^{P}\left(\sum_{k=0}^{j} q_k d_{j-k}\right)x^j = p_P x^P + p_{P-1}x^{P-1} + \cdots + p_D x^D + \sum_{j=0}^{D-1}\left(\sum_{k=0}^{j} q_k d_{j-k}\right)x^j.$$
By construction, p (x) = d (x) q (x) + r (x), q (x) is of degree P − D, while r (x) is of degree
less than D. This completes the proof of existence.
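The construction in the existence proof is polynomial long division: pick each coefficient of q(x) so that the current leading term cancels. Here is a minimal sketch under the assumption that polynomials are given as coefficient lists running from the constant term upwards, with a non-zero leading coefficient for d(x); the function name `poly_divmod` is made up for illustration:

```python
from fractions import Fraction

def poly_divmod(p, d):
    """Divide p(x) by d(x); return (quotient, remainder) as coefficient lists."""
    r = [Fraction(coef) for coef in p]  # running remainder, starts as p(x)
    D = len(d) - 1                      # degree of d(x)
    q = [Fraction(0)] * max(len(p) - len(d) + 1, 0)
    for i in range(len(q) - 1, -1, -1):
        q[i] = r[i + D] / d[D]          # choose q_i to cancel the current leading term
        for j, dj in enumerate(d):
            r[i + j] -= q[i] * dj       # subtract q_i x^i d(x) from the remainder
    return q, r[:D]

# Example: (x^3 - 2x^2 - 4) divided by (x - 3).
quotient, remainder = poly_divmod([-4, 0, -2, 1], [-3, 1])
print(quotient, remainder)
```

In the example, the quotient list encodes x² + x + 3 and the remainder list encodes the constant 5.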
To prove uniqueness, let $s(x) = \sum_{i=0}^{P-D} s_i x^i$ be a (P − D)-degree polynomial such that p (x) − s (x) d (x) is a polynomial of degree below D. If $s_{P-D} \ne q_{P-D}$, then p (x) − q (x) d (x)
[Figure: a double cone with its axis and generator, cut by planes to give the circle, ellipse, parabola, and hyperbola.]
395
Not made by me and narrated by a female Indian robot, but awesome nonetheless.
The ellipse and the parabola are formed from only one half of the double cone. In contrast,
the hyperbola is formed from both halves — it thus has two branches.396
We shall not do so, but it is possible to prove that in general, a conic section is the graph
of the equation
Ax2 + Bxy + Cy 2 + Dx + Ey + F = 0.
Proof. Recall (Corollary 1) that the reflection of the point (p, q) in the line y = x is the
point (q, p). But:
$$p = \frac{k}{q} \iff q = \frac{k}{p}.$$
$$p = \frac{k}{q} \iff -q = \frac{k}{-p}.$$
396
There are also four types of degenerate conic sections. In each case, the plane cuts through (or
contains) the vertex. We have:
1. A point if the plane is less steep than the generator (this is the degenerate ellipse).
2. A single straight line if the plane is exactly as steep as the generator (this is the degenerate
parabola).
3. A pair of intersecting lines if the plane is steeper than the generator (this is the degenerate
hyperbola).
Now, suppose the generator, which is usually oblique, is now instead parallel to the axis. Then we get
a degenerate cone that is the cylinder. Now, any plane that is perpendicular to the cylinder’s base
produces:
4. A pair of vertical lines. (This is considered a degenerate parabola, because the plane is exactly as
steep as the generator).
Fact 37. Let b, c, d, e ∈ R with d ≠ 0 and cd − be ≠ 0. Consider the graph of
$$y = \frac{bx + c}{dx + e}.$$
(a) Intercepts. If e ≠ 0, then there is one y-intercept (0, c/e). (If e = 0, then there are
no y-intercepts.) And if b ≠ 0, then there is one x-intercept (−c/b, 0). (If b = 0, then
there are no x-intercepts.)
(b) There are no turning points.
(c) There is the horizontal asymptote y = b/d and the vertical asymptote x = −e/d.
(The asymptotes are perpendicular and so, this is a rectangular hyperbola.)
(d) The hyperbola’s centre is (−e/d, b/d).
(e) The two lines of symmetry are y = x + (b + e) /d and y = −x + (b − e) /d.
Proof. We already proved (a), (c), and (d) in the main text. We now prove (b) and (e).
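$$\text{(b)}\quad \frac{\mathrm{d}y}{\mathrm{d}x} = \frac{\mathrm{d}}{\mathrm{d}x}\left(\frac{b}{d} + \frac{cd - be}{d^2}\cdot\frac{1}{x + e/d}\right) = \frac{\mathrm{d}}{\mathrm{d}x}\frac{b}{d} + \frac{\mathrm{d}}{\mathrm{d}x}\left(\frac{cd - be}{d^2}\cdot\frac{1}{x + e/d}\right) = \frac{cd - be}{d^2}\cdot\frac{-1}{(x + e/d)^2}.$$
By assumption, cd − be ≠ 0. Thus, dy/dx ≠ 0. And hence, by Definition ??, this graph has no turning points.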
(e) By Lemma 2, the following graph is symmetric in y = x and y = −x:
$$y = \frac{cd - be}{d^2}\cdot\frac{1}{x}.$$
Now shift this graph leftwards by e/d units to get the graph of:
$$y = \frac{cd - be}{d^2}\cdot\frac{1}{x + e/d},$$
which, by Fact 27, has lines of symmetry y = x + e/d and y = − (x + e/d).
Now shift this last graph upwards by b/d units to get the graph of:
$$y = \frac{b}{d} + \frac{(cd - be)/d^2}{x + e/d} = \frac{bx + c}{dx + e},$$
which, by Fact 27, has the claimed lines of symmetry:
$$y = x + \frac{b + e}{d} \quad\text{and}\quad y = -x + \frac{b - e}{d}.$$
It remains to be shown that these are the only two lines of symmetry. If there were a third
distinct line of symmetry, then there would be more than two asymptotes. But this is not
the case. Thus, there can be at most two distinct lines of symmetry.
If ad > 0, then the turning point on the left is a strict local maximum and the one on
the right is a strict local minimum. And if ad < 0, then the one on the left is a strict
local minimum and the one on the right is a strict local maximum.
(c) There are two asymptotes, one oblique and one vertical:
$$y = \frac{a}{d}x + \frac{bd - ae}{d^2} \quad\text{and}\quad x = -\frac{e}{d}.$$
(Note that since the asymptotes are not perpendicular, this is not a rectangular hyperbola.)
(d) The hyperbola's centre is $\left(-\frac{e}{d},\; \frac{bd - 2ae}{d^2}\right)$.
(e) The two lines of symmetry are:
$$y = \frac{a \pm \sqrt{a^2 + d^2}}{d}x + \frac{bd - ae \pm e\sqrt{a^2 + d^2}}{d^2}.$$
So, if $(ae^2 + cd^2 - bde)/a < 0$, then there are no stationary points.
If $(ae^2 + cd^2 - bde)/a = 0$, then dy/dx = 0 at x = −e/d. But there is no point in $y = (ax^2 + bx + c)/(dx + e)$ at which x = −e/d. And so here, there is no stationary point.
If $(ae^2 + cd^2 - bde)/a > 0$, then there are two stationary points, given by the values of x at which dy/dx = 0. Plugging these values of x into $y = (ax^2 + bx + c)/(dx + e)$ and doing the algebra (omitted), we can find the y-values and thus conclude the two stationary points are:
$$P, Q = \left(\frac{-e \pm \sqrt{(ae^2 + cd^2 - bde)/a}}{d},\; \frac{bd - 2ae \pm 2\sqrt{a(ae^2 + cd^2 - bde)}}{d^2}\right),$$
397
This painful, brute-force method is not exactly the “proper” or “usual” way to find the lines of symmetry
(for which see my “forthcoming” H2 Further Mathematics Textbook), but does avoid having to use other
facts about conic sections that we haven’t discussed.
117.9. Inequalities
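$$\sum_{n=1}^{k} a_n = \frac{k}{2}(a_1 + a_k).$$
Hence:
$$\frac{a_1 + a_k}{2} = \frac{a_1 + a_1 + (k - 1)d}{2} = a_1 + \frac{k - 1}{2}d = a_{(k+1)/2}.$$
Altogether:
$$\sum_{n=1}^{k} a_n = \frac{k - 1}{2}(a_1 + a_k) + a_{(k+1)/2} = \frac{k - 1}{2}(a_1 + a_k) + \frac{a_1 + a_k}{2} = \frac{k}{2}(a_1 + a_k).$$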
Corollary 37. Let $(a_n)_{n=1}^{k}$ be a finite arithmetic sequence with $d = a_2 - a_1$. Then:
$$\sum_{n=1}^{k} a_n = ka_1 + \frac{k(k - 1)}{2}d.$$
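Both sum formulas are easy to compare against a direct term-by-term sum. The following minimal sketch does so for integer a₁ and d; the function name `arithmetic_sums` is made up for illustration:

```python
def arithmetic_sums(a1, d, k):
    """Return the sum of the first k terms computed in three ways."""
    terms = [a1 + n * d for n in range(k)]
    direct = sum(terms)
    # (k/2)(a_1 + a_k); for integers, k(a_1 + a_k) is always even.
    by_ends = k * (terms[0] + terms[-1]) // 2
    # k a_1 + (k(k - 1)/2) d, as in Corollary 37.
    by_formula = k * a1 + k * (k - 1) // 2 * d
    return direct, by_ends, by_formula

print(arithmetic_sums(3, 4, 7))
```

All three entries of the returned tuple should agree for every choice of integers a₁ and d and every positive integer k.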
Definition 229. Let (an ) be a (real and infinite) sequence. Let L ∈ R. Suppose that for
all ε > 0, there exists N such that for all n ≥ N , we have:
∣L − an ∣ < ε .
Then we say that the sequence (an ) is convergent and that its limit exists. Moreover, we
say the sequence converges to L and call L its limit.
That the sequence (an ) converges to L may be written as:
$$a_n \to L \quad\text{or}\quad \lim_{n \to \infty} a_n = L.$$
We have just shown that (sn ) diverges. Hence, Grandi’s series diverges.
Fact 199. (The Reverse Triangle Inequality.) Let x, y ∈ R. Then ∣x − y∣ ≥ ∣∣x∣ − ∣y∣∣.
Most proofs of the Reverse Triangle Inequality make use of the Triangle Inequality, but
Proof. The given inequality $|x - y| \ge \big||x| - |y|\big|$ is equivalent to $(x - y)^2 \ge (|x| - |y|)^2$ or:
$$x^2 + y^2 - 2xy \ge |x|^2 + |y|^2 - 2|x||y| = x^2 + y^2 - 2|x||y| \quad\text{or}\quad -2xy \ge -2|x||y| \quad\text{or}\quad xy \le |x||y|.$$
Fact 47. Other than the zero series, every (infinite) arithmetic series diverges.
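Proof. Let $(a_n)$ be a non-zero arithmetic series with $d = a_2 - a_1$ and $s_n = \sum_{i=1}^{n} a_i$. We have:
$$s_{k+1} = (\text{First term} + \text{Last term}) \times \frac{\text{Number of terms}}{2} = (a_1 + a_{k+1})\frac{k + 1}{2} = [a_1 + (a_1 + kd)]\frac{k + 1}{2} = a_1(k + 1) + \frac{dk(k + 1)}{2}.$$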
Case 1. Suppose d = 0.
Since the given arithmetic series is non-zero, we must have a1 ≠ 0. Pick ε = ∣a1 ∣ /2 and let
L ∈ R. Suppose that for some k, we have ∣L − sk ∣ < ε. Then:
$$|L - s_{k+1}| = |L - s_k - a_1| \ge \Big|\underbrace{|L - s_k|}_{<\varepsilon = |a_1|/2} - |a_1|\Big| > \frac{|a_1|}{2} = \varepsilon,$$
[A second display of the same form, for the case d ≠ 0 and with ε = |d|/2, is not recoverable here.]
Definition 233. The point (0, 0, . . . , 0) in Rn is called the origin and is denoted O.
Definition 235. Given the vector u = (u1 , u2 , . . . , un ) ∈ Rn , its length (or norm or mag-
nitude), denoted ∣u∣, is the number:
$$|u| = \sqrt{u_1^2 + u_2^2 + \cdots + u_n^2}.$$
Definition 237. Given the points A = (a1 , a2 , . . . , an ) and B = (b1 , b2 , . . . , bn ), the difference B − A is defined as the vector $\overrightarrow{AB}$. That is:
$$B - A = \overrightarrow{AB} = (b_1 - a_1, b_2 - a_2, \ldots, b_n - a_n).$$
A + v = (a1 + v1 , a2 + v2 , . . . , an + vn ) .
B − v = (b1 − v1 , b2 − v2 , . . . , bn − vn ) .
399
Note that technically, the set Rn that contains the points A and B is different from the set Rn that contains the vector $\overrightarrow{AB}$. The former is a Euclidean space, while the latter is a vector space. This though is somewhat beyond the scope of H2 Maths and so let's not worry any further about this.
Definition 240. Given the vectors u = (u1 , u2 , . . . , un ) and v = (v1 , v2 , . . . , vn ), their sum,
denoted u + v, is the vector:
u + v = (u1 + v1 , u2 + v2 , . . . , un + vn ) .
Definition 241. Given the vector u = (u1 , u2 , . . . , un ), its additive inverse, denoted −u, is the vector:
−u = (−u1 , −u2 , . . . , −un ) .
Definition 242. Given the vectors u = (u1 , u2 , . . . , un ) and v = (v1 , v2 , . . . , vn ), the difference u − v is defined as the sum of u and −v.
u − v = (u1 − v1 , u2 − v2 , . . . , un − vn ) .
Proof. By Definition 241, −v = (−v1 , −v2 , . . . , −vn ). And now by Definition 242,
u − v = u + (−v) = (u1 − v1 , u2 − v2 , . . . , un − vn ) .
Fact 201. Let A, B, and C be points. Then $\overrightarrow{AB} - \overrightarrow{AC} = \overrightarrow{CB}$.
Definition 243. Given the non-zero vector u = (u1 , u2 , . . . , un ) ∈ Rn , the unit vector in
its direction (or its unit vector) is denoted û and is defined by:
$$\hat{u} = \frac{1}{|u|}u.$$
So, given the vector u, its unit vector û is simply the vector that points in the same direction
but has length 1.
Fact 202. Let a = (a1 , a2 , . . . , an ) be a vector and c ∈ R. Then ∣ca∣ = ∣c∣ ∣a∣.
$$= \sqrt{\sum_{i=1}^{n}\left(c^2 a_i^2\right)} = \sqrt{c^2\sum_{i=1}^{n} a_i^2} = |c|\sqrt{\sum_{i=1}^{n} a_i^2} = |c||a|.$$
Proof. (a) Suppose â = b̂. Then a/ ∣a∣ = b/ ∣b∣ or a = (∣a∣ / ∣b∣) b. Since a = kb for some
k > 0, by Definition 103, they point in the same direction.
Now suppose instead that a and b point in the same direction. Then by Definition 103,
there exists k > 0 such that a = kb. Thus:
$$\hat{a} = \frac{1}{|a|}a = \frac{1}{|kb|}kb = \frac{1}{|k||b|}kb = \frac{1}{|b|}b = \hat{b}.$$
(b) is similar and thus omitted. (c) and (d) follow from (a) and (b).
Fact 61. Let a, b, and c be vectors. If a ∥/ b, then there are α, β ∈ R such that:
c = αa + βb.
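Then:
$$\alpha a + \beta b = \frac{b_2 c_1 - b_1 c_2}{a_1 b_2 - a_2 b_1}\begin{pmatrix} a_1 \\ a_2 \end{pmatrix} + \frac{a_1 c_2 - a_2 c_1}{a_1 b_2 - a_2 b_1}\begin{pmatrix} b_1 \\ b_2 \end{pmatrix}$$
$$= \begin{pmatrix} \dfrac{a_1 b_2 c_1 - a_1 b_1 c_2 + a_1 b_1 c_2 - a_2 b_1 c_1}{a_1 b_2 - a_2 b_1} \\[2ex] \dfrac{a_2 b_2 c_1 - a_2 b_1 c_2 + a_1 b_2 c_2 - a_2 b_2 c_1}{a_1 b_2 - a_2 b_1} \end{pmatrix} = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = c.$$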
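The choice of α and β in this proof can be tried out on concrete two-dimensional vectors. The sketch below uses the coefficients $\alpha = (b_2 c_1 - b_1 c_2)/(a_1 b_2 - a_2 b_1)$ and $\beta = (a_1 c_2 - a_2 c_1)/(a_1 b_2 - a_2 b_1)$ that appear in the computation above; the function name `decompose` is made up for illustration, and integer components are assumed:

```python
from fractions import Fraction

def decompose(a, b, c):
    """Return (alpha, beta) with c = alpha*a + beta*b, for non-parallel 2D vectors a, b."""
    det = a[0] * b[1] - a[1] * b[0]
    if det == 0:
        raise ValueError("a and b are parallel (or one of them is the zero vector)")
    alpha = Fraction(b[1] * c[0] - b[0] * c[1], det)
    beta = Fraction(a[0] * c[1] - a[1] * c[0], det)
    return alpha, beta

a, b, c = (1, 2), (3, -1), (5, 3)
alpha, beta = decompose(a, b, c)
print(alpha, beta)
# Rebuild c from the two coefficients as a check.
print((alpha * a[0] + beta * b[0], alpha * a[1] + beta * b[1]) == c)
```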
Fact 62. Let u and v be vectors. Suppose v is a line’s direction vector. Then:
(ca) ⋅ b = c (a ⋅ b).
Proof. By Def. 235, $|v| = \sqrt{\sum v_i^2}$. By Def. 244, $v \cdot v = \sum v_i v_i = \sum v_i^2$. Thus, $|v| = \sqrt{v \cdot v}$.
Rearranging, we have $\dfrac{(u \cdot v)^2}{|u|^2 |v|^2} \le 1$. Then take square roots to get: $-1 \le \dfrac{u \cdot v}{|u||v|} \le 1$.
Proof. We first prove ⟸ of (a). If u and v point in the same direction, then by Definition 103, there exists k > 0 such that u = kv and so:
$$\frac{u \cdot v}{|u||v|} = \frac{(kv) \cdot v}{|kv||v|} = \frac{k(v \cdot v)}{|k||v||v|} = \frac{k|v||v|}{k|v||v|} = 1. \quad ✓$$
The proof of ⟸ of (b) is very similar. If u and v point in the exact opposite directions, then by Definition 103, there exists k < 0 such that u = kv and so:
$$\frac{u \cdot v}{|u||v|} = \frac{(kv) \cdot v}{|kv||v|} = \frac{k(v \cdot v)}{|k||v||v|} = \frac{k|v||v|}{-k|v||v|} = -1. \quad ✓$$
We have just shown that if u ∥/ v, then u ⋅ v ≠ ± ∣u∣ ∣v∣. And so, by the contrapositive, if u ⋅ v = ± ∣u∣ ∣v∣, then u ∥ v.
Now, if u ⋅ v ≠ ∣u∣ ∣v∣, then by ⟸ of (a), u and v do not point in the same direction. Thus, if u ⋅ v = − ∣u∣ ∣v∣, then u and v must point in the exact opposite directions. ✓
Similarly, if u ⋅ v ≠ − ∣u∣ ∣v∣, then by ⟸ of (b), u and v do not point in the exact opposite directions. Thus, if u ⋅ v = ∣u∣ ∣v∣, then u and v must point in the same direction. ✓
Definition 245. The standard basis vector in the ith direction (or ith standard basis
vector), denoted ei , is the vector whose ith coordinate is 1 and other coordinates are 0.
Definition 246. The ith-direction cosine of the vector v = (v1 , v2 , . . . , vn ) is the number:
$$\frac{v_i}{|v|}.$$
Fact 203. Suppose θ is the angle between a vector v = (v1 , v2 , . . . , vn ) and ei . Then:
$$\cos\theta = \frac{v_i}{|v|}.$$
Proof. By Definition 111:
$$\cos\theta = \frac{v \cdot e_i}{|v||e_i|} = \frac{v_i \cdot 1 + \sum_{j \ne i} v_j \cdot 0}{|v| \cdot 1} = \frac{v_i}{|v|}.$$
The following Fact says that given two lines, we can choose any direction vector for each
and the calculated angle between the two chosen direction vectors will be fixed:
Fact 204. If one line has direction vectors $u_1$ and $v_1$, while another has $u_2$ and $v_2$, then:
$$\frac{|u_1 \cdot u_2|}{|u_1| |u_2|} = \frac{|v_1 \cdot v_2|}{|v_1| |v_2|}.$$
Proof. There exist non-zero real numbers λ and µ such that $u_1 = \lambda v_1$ and $u_2 = \mu v_2$. Thus:
$$\frac{|u_1 \cdot u_2|}{|u_1| |u_2|} = \frac{|(\lambda v_1) \cdot (\mu v_2)|}{|\lambda v_1| |\mu v_2|} = \frac{|\lambda| |\mu| |v_1 \cdot v_2|}{|\lambda| |\mu| |v_1| |v_2|} = \frac{|v_1 \cdot v_2|}{|v_1| |v_2|}.$$
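Fact 204 can be checked numerically. The sketch below assumes the angle between two lines is computed as $\cos^{-1}\left(|u \cdot v| / (|u||v|)\right)$; the helper names `dot`, `norm`, and `angle_between_lines` are hypothetical and chosen only for this illustration.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


def angle_between_lines(u, v):
    """Angle between two lines with direction vectors u and v."""
    c = abs(dot(u, v)) / (norm(u) * norm(v))
    # Clamp to guard against rounding slightly above 1.
    return math.acos(min(1.0, c))


# Two choices of direction vectors for the same pair of lines.
theta_1 = angle_between_lines((1, 0, 0), (1, 1, 0))
theta_2 = angle_between_lines((-3, 0, 0), (5, 5, 0))
```

Rescaling either direction vector by a non-zero number (even a negative one) leaves the computed angle unchanged, as the Fact claims.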
Corollary 9. Suppose θ is the angle between two lines. (a) If θ = 0, then the two lines
are parallel. And (b) if θ = π/2, then they are perpendicular.
Proof. Suppose the two lines are l1 and l2 , with direction vectors u and v.
(a) If θ = 0, then by Corollary 8:
$$\cos^{-1}\frac{|u \cdot v|}{|u| |v|} = 0 \quad\text{or}\quad \frac{|u \cdot v|}{|u| |v|} = \cos 0 = 1 \quad\text{or}\quad u \cdot v = \pm |u| |v|.$$
And so by Fact 71, u ∥ v.
(b) If θ = π/2, then by Corollary 8:
$$\cos^{-1}\frac{|u \cdot v|}{|u| |v|} = \frac{\pi}{2} \quad\text{or}\quad \frac{|u \cdot v|}{|u| |v|} = \cos\frac{\pi}{2} = 0 \quad\text{or}\quad u \cdot v = 0.$$
And so by Definition 112, u ⊥ v.
Proof. (a) If two lines are identical, then they also share a direction vector, so that by
Definition 116, they are parallel.
(b) Suppose two lines are parallel. Then by Definition 116 and Fact 62, they share some
direction vector u.
Suppose also that they intersect at some point S. Then both lines can be described by $r = \overrightarrow{OS} + \lambda u$ and are identical.
Thus, if two parallel lines are distinct, then they cannot intersect.
(c) We already showed that two distinct and parallel lines do not intersect. We now show
that two distinct and non-parallel lines share at most one intersection point.
Suppose for contradiction that two distinct lines share two distinct intersection points P and Q. Then $\overrightarrow{PQ}$ is a direction vector of both lines. Thus, both lines can be described by $r = \overrightarrow{OP} + \lambda \overrightarrow{PQ}$ and must thus be identical.
Fact 76. If two lines (in 2D space) are distinct and non-parallel, then they must share
exactly one intersection point.
$$\frac{x - p_1}{v_1} = \frac{y - p_2}{v_2} = \frac{z - p_3}{v_3}.$$
(2) If $v_1 = 0$ and $v_2, v_3 \ne 0$, then l is perpendicular to the x-axis and can be described by:
$$x = p_1 \quad\text{and}\quad \frac{y - p_2}{v_2} = \frac{z - p_3}{v_3}.$$
(3) If $v_2 = 0$ and $v_1, v_3 \ne 0$, then l is perpendicular to the y-axis and can be described by:
$$y = p_2 \quad\text{and}\quad \frac{x - p_1}{v_1} = \frac{z - p_3}{v_3}.$$
(4) If $v_3 = 0$ and $v_1, v_2 \ne 0$, then l is perpendicular to the z-axis and can be described by:
$$z = p_3 \quad\text{and}\quad \frac{x - p_1}{v_1} = \frac{y - p_2}{v_2}.$$
(5) If $v_1, v_2 = 0$, then l is perpendicular to the x- and y-axes and can be described by:
$$x = p_1 \quad\text{and}\quad y = p_2.$$
(6) If $v_1, v_3 = 0$, then l is perpendicular to the x- and z-axes and can be described by:
$$x = p_1 \quad\text{and}\quad z = p_3.$$
(7) If $v_2, v_3 = 0$, then l is perpendicular to the y- and z-axes and can be described by:
$$y = p_2 \quad\text{and}\quad z = p_3.$$
$$v_2 (z - p_3) \overset{5}{=} v_3 (y - p_2) \quad\text{and}\quad v_1 (z - p_3) \overset{6}{=} v_3 (x - p_1).$$
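$$\frac{x - p_1}{v_1} = \frac{y - p_2}{v_2} \quad\text{and}\quad \frac{z - p_3}{v_3} = \frac{y - p_2}{v_2}.$$
$$\frac{z - p_3}{v_3} = \frac{y - p_2}{v_2}.$$
Since (0, v2 , v3 ) ⋅ i = 0, l is perpendicular to the x-axis.
$$\frac{z - p_3}{v_3} = \frac{x - p_1}{v_1}.$$
Since (v1 , 0, v3 ) ⋅ j = 0, l is perpendicular to the y-axis.
$$\frac{x - p_1}{v_1} = \frac{y - p_2}{v_2}.$$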
Since (v1 , v2 , 0) ⋅ k = 0, l is perpendicular to the z-axis.
x = p1 and y = p2 .
Since (0, 0, v3 ) ⋅ i = 0 and (0, 0, v3 ) ⋅ j = 0, l is perpendicular to both the x- and y-axes.
x = p1 and z = p3 .
Since (0, v2 , 0) ⋅ i = 0 and (0, v2 , 0) ⋅ k = 0, l is perpendicular to both the x- and z-axes.
y = p2 and z = p3 .
Since (v1 , 0, 0) ⋅ j = 0 and (v1 , 0, 0) ⋅ k = 0, l is perpendicular to both the y- and z-axes.
Fact 205. If u and v are non-zero vectors, then (u ⋅ v̂) v̂ is the unique vector such that:
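(b) By Definition 112, $u - (u \cdot \hat{v}) \hat{v} \perp v \iff [u - (u \cdot \hat{v}) \hat{v}] \cdot v = 0$. But this last equation is true, as we now verify:
$$\begin{aligned} [u - (u \cdot \hat{v}) \hat{v}] \cdot v &= u \cdot v - (u \cdot \hat{v})\, \hat{v} \cdot v = u \cdot v - \left(u \cdot \frac{v}{|v|}\right) \left(\frac{v}{|v|} \cdot v\right) \\ &= u \cdot v - \frac{1}{|v|^2} (u \cdot v) (v \cdot v) = u \cdot v - \frac{1}{|v|^2} (u \cdot v) |v|^2 = 0. \quad \checkmark \end{aligned}$$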
We now show that (u ⋅ v̂) v̂ is the unique vector that satisfies (a) and (b).
Suppose w is also a vector that satisfies (a) and (b). That is:
(a) w ∥ v; and (b) u − w ⊥ v.
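Since w ∥ v, there exists λ ≠ 0 such that w = λv. And since u − w ⊥ v, we have:
$$(u - \lambda v) \cdot v = 0 \quad\text{or}\quad u \cdot v - \lambda v \cdot v = 0 \quad\text{or}\quad u \cdot v = \lambda v \cdot v \quad\text{or}\quad \lambda = \frac{u \cdot v}{v \cdot v}.$$
Altogether then: $w = \lambda v = \dfrac{u \cdot v}{v \cdot v} v = (u \cdot \hat{v}) \hat{v}$.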
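The two defining properties of the projection can be checked on a small example. The following is a minimal Python sketch; the names `dot` and `project` are assumptions made for illustration only.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def project(u, v):
    """Projection of u on a non-zero vector v, computed as ((u . v) / (v . v)) v."""
    scale = dot(u, v) / dot(v, v)
    return tuple(scale * x for x in v)


u, v = (3.0, 4.0, 0.0), (2.0, 0.0, 0.0)
w = project(u, v)                                # (a) w is parallel to v
rejection = tuple(a - b for a, b in zip(u, w))   # (b) u - w is perpendicular to v
```

Here `w` is a scalar multiple of `v`, and the dot product of `rejection` with `v` is zero, matching properties (a) and (b) in the uniqueness argument above.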
Fact 206. (Lagrange’s Identity.) (a) $(a_1^2 + a_2^2)(b_1^2 + b_2^2) - (a_1 b_1 + a_2 b_2)^2 = (a_1 b_2 - a_2 b_1)^2$.
∣rejb a∣ = ∣a × b̂∣ .
We will prove this claim twice, once in the 2D case and again in the 3D case. In each case,
we will use Lagrange’s Identity (LI).
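2D case. Let $a = (a_1, a_2)$ and $b = (b_1, b_2)$. Then:
$$\begin{aligned} |\mathrm{rej}_b\, a| &= \sqrt{|a|^2 - |\mathrm{proj}_b\, a|^2} = \sqrt{a_1^2 + a_2^2 - \frac{(a_1 b_1 + a_2 b_2)^2}{b_1^2 + b_2^2}} \\ &= \sqrt{\frac{(a_1^2 + a_2^2)(b_1^2 + b_2^2) - (a_1 b_1 + a_2 b_2)^2}{b_1^2 + b_2^2}} \\ &\overset{LI}{=} \sqrt{\frac{(a_1 b_2 - a_2 b_1)^2}{b_1^2 + b_2^2}} = \frac{|a_1 b_2 - a_2 b_1|}{\sqrt{b_1^2 + b_2^2}} = |a \times \hat{b}|. \end{aligned}$$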
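The 2D computation above can be spot-checked numerically. This Python sketch evaluates both sides of the claim for one pair of vectors; the function names are hypothetical, and the 2D "cross product" is taken to be the scalar $a_1 b_2 - a_2 b_1$, as in the derivation.

```python
import math


def rejection_length_2d(a, b):
    """|rej_b a| computed as sqrt(|a|^2 - |proj_b a|^2)."""
    a1, a2 = a
    b1, b2 = b
    proj_sq = (a1 * b1 + a2 * b2) ** 2 / (b1 ** 2 + b2 ** 2)
    return math.sqrt(max(0.0, a1 ** 2 + a2 ** 2 - proj_sq))


def cross_with_unit_2d(a, b):
    """|a x b-hat| computed as |a1 b2 - a2 b1| / |b|."""
    a1, a2 = a
    b1, b2 = b
    return abs(a1 * b2 - a2 * b1) / math.hypot(b1, b2)


a, b = (3, 4), (1, 1)
lhs = rejection_length_2d(a, b)
rhs = cross_with_unit_2d(a, b)
# Lagrange's Identity itself, checked with exact integer arithmetic.
lagrange_holds = (3**2 + 4**2) * (1**2 + 1**2) - (3*1 + 4*1) ** 2 == (3*1 - 4*1) ** 2
```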
$$\overset{LI}{=} \frac{\sqrt{(a_2 b_3 - a_3 b_2)^2 + (a_3 b_1 - a_1 b_3)^2 + (a_1 b_2 - a_2 b_1)^2}}{\sqrt{b_1^2 + b_2^2 + b_3^2}} = \frac{|a \times b|}{|b|} = |a \times \hat{b}|.$$
Proof. Let θ be the angle between a and b. If a × b = 0, then by Fact 84, ∣a × b∣ = ∣a∣ ∣b∣ sin θ = 0. Since ∣a∣ ≠ 0 and ∣b∣ ≠ 0, we have sin θ = 0 and thus a ∥ b.
c∥a×b ⇐⇒ c ⊥ a, b.
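$$c_1 a_1 + c_2 a_2 + c_3 a_3 \overset{1}{=} 0 \quad\text{and}\quad c_1 b_1 + c_2 b_2 + c_3 b_3 \overset{2}{=} 0.$$
$$c_1 z \overset{4}{=} c_3 x \quad\text{and}\quad c_1 y \overset{5}{=} c_2 x.$$
$$d_2 z \overset{6}{=} d_3 y, \quad d_1 z \overset{7}{=} d_3 x, \quad\text{and}\quad d_1 y \overset{8}{=} d_2 x.$$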
We will break down the remainder of the proof into four cases, depending on whether any
of x, y, and z are zero. We will show that wherever no contradiction arises, c can be written
as a non-zero scalar multiple of d, so that c ∥ d.
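$$c = \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ c_3 \end{pmatrix} = \frac{c_3}{d_3} \begin{pmatrix} 0 \\ 0 \\ d_3 \end{pmatrix} = \frac{c_3}{d_3} \begin{pmatrix} d_1 \\ d_2 \\ d_3 \end{pmatrix} = \frac{c_3}{d_3} d.$$
$$c = \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \begin{pmatrix} 0 \\ c_2 \\ c_3 \end{pmatrix} = \frac{c_3}{d_3} \begin{pmatrix} 0 \\ d_2 \\ d_3 \end{pmatrix} = \frac{c_3}{d_3} \begin{pmatrix} d_1 \\ d_2 \\ d_3 \end{pmatrix} = \frac{c_3}{d_3} d.$$
Thus:
$$\frac{c_1}{d_1} = \frac{c_2}{d_2} = \frac{c_3}{d_3}.$$
And now, c can be written as a non-zero scalar multiple of d:
$$c = \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \frac{c_3}{d_3} \begin{pmatrix} d_1 \\ d_2 \\ d_3 \end{pmatrix} = \frac{c_3}{d_3} d.$$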
Definition 141. A plane is any set of points that can be written as:
$$\{R : \overrightarrow{OR} \cdot n = d\} \quad\text{or}\quad \{R : r \cdot n = d\},$$
Fact 95. If a plane contains two distinct points, then it also contains the line through
those two points.
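Proof. Suppose the points are A and B, so that the line AB may be described by $r = \overrightarrow{OA} + \lambda \overrightarrow{AB}$ ($\lambda \in \mathbb{R}$).
Suppose the plane q can be described by $r \cdot n = d$.
We will prove that any point on the line AB is also on the plane q. (We will thus have shown that q contains the line AB.) To do so, we need merely verify that the generic point $r = \overrightarrow{OA} + \lambda \overrightarrow{AB}$ on the line AB satisfies q’s vector equation:
$$\begin{aligned} r \cdot n &= (\overrightarrow{OA} + \lambda \overrightarrow{AB}) \cdot n = [\overrightarrow{OA} + \lambda (\overrightarrow{OB} - \overrightarrow{OA})] \cdot n \\ &= \overrightarrow{OA} \cdot n + \lambda \overrightarrow{OB} \cdot n - \lambda \overrightarrow{OA} \cdot n = d + \lambda d - \lambda d = d. \quad \checkmark \end{aligned}$$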
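Fact 96. If $q = \{R : \overrightarrow{OR} \cdot n = d\}$ is a plane, then n ⊥ q.
Proof. Let v be any vector on q. Then there are points A, B ∈ q such that $v = \overrightarrow{AB}$. Since A, B ∈ q, we have $\overrightarrow{OA} \cdot n = d$ and $\overrightarrow{OB} \cdot n = d$.
And thus: $n \cdot v = n \cdot \overrightarrow{AB} = n \cdot (\overrightarrow{OB} - \overrightarrow{OA}) = n \cdot \overrightarrow{OB} - n \cdot \overrightarrow{OA} = d - d = 0$.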
m ∥ n Ô⇒ m ⊥ q.
m ⋅ v = (kn) ⋅ v = k (n ⋅ v) = k ⋅ 0 = 0.
Thus, by Definition 143, m ⊥ q.
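Proof. Let n be a normal vector of q. Since $\overrightarrow{OR} = \overrightarrow{OP} + \overrightarrow{PR}$, we have:
$$\overrightarrow{OR} \cdot n = (\overrightarrow{OP} + \overrightarrow{PR}) \cdot n = \overrightarrow{OP} \cdot n + \overrightarrow{PR} \cdot n. \quad ☺$$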
How to go back and forth between a plane’s vector and cartesian forms:
Proof. By Definition 244, $(x_1, x_2, \dots, x_k) \cdot (n_1, n_2, \dots, n_k) = \sum_{i=1}^{k} n_i x_i$.
Remark 145. Note that Definition 141 actually serves as the general definition of the (k − 1)-dimensional hyperplane in $\mathbb{R}^k$. In general, the (k − 1)-dimensional hyperplane in $\mathbb{R}^k$ has cartesian equation $\sum_{i=1}^{k} n_i x_i = d$, where $n = (n_1, n_2, \dots, n_k)$.
And so the hyperplane in $\mathbb{R}^3$ is the “flat” two-dimensional plane with cartesian equation ax + by + cz = d, while the hyperplane in $\mathbb{R}^2$ is the one-dimensional line with cartesian equation ax + by = c.
Definition 247. A two-dimensional plane in $\mathbb{R}^n$ is any set that can be written as:
$$\{R : \overrightarrow{OR} = \overrightarrow{OA} + \lambda u + \mu v \ (\lambda, \mu \in \mathbb{R})\},$$
Thus, P ∈ s.
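(b) Now suppose P ∈ s, i.e. $\overrightarrow{OP} \cdot n = d$. Rearranging, $\overrightarrow{OP} \cdot n = \overrightarrow{OA} \cdot n$ or $\overrightarrow{AP} \cdot n = 0$. And so by Corollary 18, $\overrightarrow{AP}$ is on q. By Theorem 10 then, $\overrightarrow{AP}$ can be written as the linear combination of u and v. That is, there exist real numbers α and β such that $\overrightarrow{AP} = \alpha u + \beta v$.
Rearranging, $\overrightarrow{OP} = \overrightarrow{OA} + \alpha u + \beta v$. Thus, P ∈ q.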
Fact 99. Let q be a plane with normal vector n. Suppose v is a vector. Then:
v⊥n Ô⇒ v is on q.
Since P, Q ∈ q and $v = \overrightarrow{PQ}$, v is a vector on the plane.
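Fact 210. Let m be a vector and q be a plane. Suppose m ⊥ q. Then there exists e ∈ R such that $\overrightarrow{OR} \cdot m = e$ for all R ∈ q.
Proof. Let A, R ∈ q and $e = \overrightarrow{OA} \cdot m$. On the one hand, $\overrightarrow{AR} \cdot m = 0$ (because m ⊥ q). On the other, $\overrightarrow{AR} \cdot m = (\overrightarrow{OR} - \overrightarrow{OA}) \cdot m = \overrightarrow{OR} \cdot m - \overrightarrow{OA} \cdot m$. Thus, $\overrightarrow{OR} \cdot m = \overrightarrow{OA} \cdot m = e$.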
m⊥q Ô⇒ m ∥ n.
Proof. Let $q = \{R : \overrightarrow{OR} \cdot n = d\}$, $n = (n_1, n_2, \dots, n_k)$, and $m = (m_1, m_2, \dots, m_k)$. Suppose m ⊥ q. By Fact 210, there exists some e ∈ R such that for all R ∈ q, $\overrightarrow{OR} \cdot m \overset{1}{=} e$.
Lemma 3. ni = 0 ⇐⇒ mi = 0.
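Since $S_w \in q$, we have $\overrightarrow{OS_w} \cdot m \overset{1}{=} e$ or:
$$\overrightarrow{OS_w} \cdot m = w m_i + \frac{d}{n_j} m_j + \sum_{l \notin \{i,j\}} 0 \cdot m_l = w m_i + \frac{d}{n_j} m_j + 0 = w m_i + \frac{d}{n_j} m_j \overset{2}{=} e.$$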
Lemma 4. ni ≠ 0 Ô⇒ mi d/ni = e.
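Proof of Lemma 4. Suppose $n_i \ne 0$. Let Q be the point whose ith coordinate is $d/n_i$ and other coordinates are 0. Then Q ∈ q because:
$$\overrightarrow{OQ} \cdot n = \frac{d}{n_i} n_i + \sum_{l \ne i} 0 \cdot n_l = d + 0 = d.$$
Since Q ∈ q, $\overrightarrow{OQ} \cdot m \overset{1}{=} e$ or:
$$\overrightarrow{OQ} \cdot m = \frac{d}{n_i} m_i + \sum_{l \ne i} 0 \cdot m_l = \frac{d}{n_i} m_i + 0 = \frac{d}{n_i} m_i = e.$$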
We’ve completed our proofs of Lemmata 3 and 4. On the next page, we continue with our
proof of Theorem 9.
400
Note that this is really just a restatement of a fundamental result from linear algebra. The proof is
rather long, but uses only material and language we’ve already covered in this textbook.
We will show that whether (a) d ≠ 0; or (b) d = 0, we have m ∥ n.
(a) Suppose d ≠ 0. By Lemma 3, if ni = 0, then mi = 0, so that mi = ni (e/d).
And by Lemma 4, if ni ≠ 0, then mi = ni (e/d).
Hence, for all i ∈ {1, 2, . . . , k}, $m_i = n_i (e/d)$. Thus, we may write $m \overset{4}{=} \dfrac{e}{d} n$.
Since m ≠ 0, it must be that e ≠ 0. So, $\overset{4}{=}$ shows that m ∥ n.
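Since T ∈ q, we have $\overrightarrow{OT} \cdot m \overset{1}{=} e \overset{6}{=} 0$ or:
$$\overrightarrow{OT} \cdot m = n_j m_s + (-n_s) m_j + \sum_{l \notin \{i,j\}} 0 \cdot m_l = n_j m_s - n_s m_j + 0 \overset{8}{=} 0.$$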
Hence, we may write: $m \overset{10}{=} \dfrac{m_j}{n_j} n$.
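Proof. The given plane contains A, because $\overrightarrow{OA} \cdot n = \overrightarrow{OA} \cdot n$.
Let v be a vector on the plane. Then there exist points P and Q on the plane such that $v = \overrightarrow{PQ} = \overrightarrow{OQ} - \overrightarrow{OP}$. Now, $v \cdot n = (\overrightarrow{OQ} - \overrightarrow{OP}) \cdot n = \overrightarrow{OQ} \cdot n - \overrightarrow{OP} \cdot n = \overrightarrow{OA} \cdot n - \overrightarrow{OA} \cdot n = 0$.
Thus, v ⊥ n. We’ve just shown that the given plane has normal vector n.
We now prove uniqueness. Suppose the plane $\{R : \overrightarrow{OR} \cdot m = d\}$ contains A and has normal vector n. Then by Theorem 9, m = kn for some k ≠ 0. Thus, $d = \overrightarrow{OA} \cdot m = \overrightarrow{OA} \cdot (kn) = k \overrightarrow{OA} \cdot n$.
And now: $\{R : \overrightarrow{OR} \cdot m = d\} = \{R : \overrightarrow{OR} \cdot kn = k \overrightarrow{OA} \cdot n\} = \{R : \overrightarrow{OR} \cdot n = \overrightarrow{OA} \cdot n\}$.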
How to go back and forth between a (hyper)plane’s cartesian and parametric forms:
Then S = T .
The results given in this subchapter were general. In contrast, the results in the next
subchapter will apply only to planes in R3 .
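$$a_1 n_1 + a_2 n_2 + a_3 n_3 \overset{1}{=} 0 \quad\text{and}\quad b_1 n_1 + b_2 n_2 + b_3 n_3 \overset{2}{=} 0.$$
$$a_3 \overset{1}{=} -\frac{a_1 n_1 + a_2 n_2}{n_3}, \quad b_3 \overset{2}{=} -\frac{b_1 n_1 + b_2 n_2}{n_3}, \quad\text{and}\quad c_3 \overset{3}{=} -\frac{c_1 n_1 + c_2 n_2}{n_3}.$$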
We will use Lemmata 5 and 6 to prove Theorem 10:
Lemma 5. (a) a1 and a2 are not both zero. (b) b1 and b2 are not both zero.
401
Note that again, this long proof is just a fundamental result from linear algebra (applied to the 3D
case), but written using only material and language we’ve introduced in this textbook.
Lemma 6. a1 b2 − a2 b1 ≠ 0.
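$$\begin{aligned} \lambda a_3 + \mu b_3 &\overset{1,2}{=} -\lambda \frac{a_1 n_1 + a_2 n_2}{n_3} - \mu \frac{b_1 n_1 + b_2 n_2}{n_3} \\ &= -\frac{(\lambda a_1 + \mu b_1) n_1 + (\lambda a_2 + \mu b_2) n_2}{n_3} \overset{8,9}{=} -\frac{c_1 n_1 + c_2 n_2}{n_3} \overset{10}{=} c_3. \end{aligned}$$
It thus suffices to show that $\overset{8}{=}$ and $\overset{9}{=}$ hold. And we now do so, in each of two cases:
(i) Suppose $a_1 \ne 0$. Then $\overset{8}{=}$ holds: $\lambda a_1 + \mu b_1 \overset{7}{=} c_1 - \mu b_1 + \mu b_1 \overset{8}{=} c_1$. And so too does $\overset{9}{=}$:
$$\lambda a_2 + \mu b_2 \overset{7}{=} \frac{a_2}{a_1} (c_1 - \mu b_1) + \mu b_2 = \frac{a_2 c_1}{a_1} + \mu \frac{a_1 b_2 - a_2 b_1}{a_1} \overset{6}{=} \frac{a_2 c_1}{a_1} + \frac{a_1 c_2 - a_2 c_1}{a_1} \overset{9}{=} c_2.$$
(ii) Suppose $a_1 = 0$. Then $\overset{9}{=}$ holds: $\lambda a_2 + \mu b_2 \overset{7}{=} c_2 - \mu b_2 + \mu b_2 \overset{9}{=} c_2$. And so too does $\overset{8}{=}$:
$$\lambda a_1 + \mu b_1 = \frac{a_1}{a_2} (c_2 - \mu b_2) + \mu b_1 = \frac{a_1 c_2}{a_2} + \mu \frac{a_2 b_1 - a_1 b_2}{a_2} = \frac{a_1 c_2}{a_2} + \frac{a_2 c_1 - a_1 c_2}{a_2} \overset{8}{=} c_1.$$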
Fact 214. Let A, B, and C be points that are not collinear. Then the plane that contains all three points is:
$$\{R : R = A + \lambda \overrightarrow{AB} + \mu \overrightarrow{AC} \ (\lambda, \mu \in \mathbb{R})\}.$$
Proof. Since A, B, and C are not collinear, $\overrightarrow{AB} \nparallel \overrightarrow{AC}$. Now apply Corollary 19.
Fact 107. Given a line and a plane, there are three possibilities. The line and plane are:
(a) Parallel and do not intersect at all.
(b) Parallel and the line lies entirely on the plane.
(c) Non-parallel and intersect at exactly one point.
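$$r \overset{1}{=} p + \lambda v \quad\text{and}\quad r \cdot n \overset{2}{=} d.$$
$$(p + \lambda v) \cdot n = d \quad\text{or}\quad p \cdot n + \lambda v \cdot n = d \quad\text{or}\quad \lambda v \cdot n \overset{3}{=} d - p \cdot n.$$
Thus, the intersection points of l and q correspond to the values of λ for which $\overset{3}{=}$ holds.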
(a) If p ⋅ n ≠ d, then l and q do not intersect at any value of λ. So, l and q do not intersect.
(b) If p ⋅ n = d, then l and q intersect at all values of λ. So, l lies completely on q.
(c) Now suppose instead that l ∦ q. Then by Fact 106, v ⋅ n ≠ 0.
And so, we can rearrange $\overset{3}{=}$ to get: $\lambda = \dfrac{d - p \cdot n}{v \cdot n}$.
This shows that there is only one value of λ at which the line and plane intersect. And this unique intersection point is given by:
$$p + \lambda v = p + \frac{d - p \cdot n}{v \cdot n} v.$$
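The three cases of Fact 107 translate directly into a short procedure. The sketch below is one possible Python rendering; the function name and the string labels it returns are assumptions made for this example, and the exact comparisons with 0 presume exact (for instance integer) inputs.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def intersect_line_plane(p, v, n, d):
    """Intersect the line r = p + lambda v with the plane r . n = d.

    Returns ("none", None), ("line", None), or ("point", point).
    """
    vn = dot(v, n)
    gap = d - dot(p, n)
    if vn == 0:
        # Line parallel to plane: either it lies on the plane or misses it entirely.
        return ("line", None) if gap == 0 else ("none", None)
    lam = gap / vn  # the unique solution of lambda * (v . n) = d - p . n
    return ("point", tuple(pi + lam * vi for pi, vi in zip(p, v)))


case_a = intersect_line_plane((0, 0, 0), (1, 0, 0), (0, 0, 1), 2)
case_b = intersect_line_plane((0, 0, 2), (1, 0, 0), (0, 0, 1), 2)
case_c = intersect_line_plane((0, 0, 0), (1, 1, 1), (0, 0, 1), 2)
```

With floating-point data, a small tolerance would be a safer test than `== 0`.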
Fact 109. Let q and r be planes with normal vectors u and v. Then:
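Proof. Let θ be the angle between q and r. By Definition 148 and Facts 108 and 71:
(a) $q \parallel r \iff \theta = \cos^{-1}\dfrac{|u \cdot v|}{|u| |v|} = 0 \iff \dfrac{|u \cdot v|}{|u| |v|} = \cos 0 = 1 \iff u \parallel v$.
(b) $q \perp r \iff \theta = \cos^{-1}\dfrac{|u \cdot v|}{|u| |v|} = \dfrac{\pi}{2} \iff \dfrac{|u \cdot v|}{|u| |v|} = \cos\dfrac{\pi}{2} = 0 \iff u \cdot v = 0 \iff u \perp v$.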
Fact 110. If two planes are parallel, then they are either identical or do not intersect.
Proof. Suppose two planes are parallel. Then they share some normal vector n.
Suppose they are described by $\overrightarrow{OR} \cdot n = d_1$ and $\overrightarrow{OR} \cdot n = d_2$. If $d_1 = d_2$, then they are identical.
So suppose $d_1 \ne d_2$. If the point P is on the first plane, then $\overrightarrow{OP} \cdot n = d_1 \ne d_2$, so that P is not on the second plane. Thus, the two planes do not intersect.
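Proof. Let the two planes be described by $\overrightarrow{OR} \cdot n = d$ and $\overrightarrow{OR} \cdot m = e$, where $n = (n_1, n_2, \dots, n_k)$, $m = (m_1, m_2, \dots, m_k)$, and n ∦ m. By Lemma 7, there exist i and j such that $n_i m_j - n_j m_i \ne 0$. And since $n_i m_j - n_j m_i \ne 0$, at least one of $n_i$ or $n_j$ must be non-zero.
Suppose without loss of generality that $n_i \ne 0$. Let $P = (p_1, p_2, \dots, p_k)$ be the point with:
$$p_j = \frac{e n_i - d m_i}{n_i m_j - n_j m_i}, \quad p_i = \frac{d - p_j n_j}{n_i}, \quad\text{and}\quad p_l = 0 \text{ for all } l \notin \{i, j\}.$$
$$\begin{aligned} \overrightarrow{OP} \cdot n &= \sum p_l n_l = p_i n_i + p_j n_j + \sum_{l \notin \{i,j\}} p_l n_l \\ &= \frac{d - p_j n_j}{n_i} n_i + p_j n_j + 0 = d - p_j n_j + p_j n_j = d, \quad \checkmark \end{aligned}$$
$$\begin{aligned} \overrightarrow{OP} \cdot m &= \sum p_l m_l = p_i m_i + p_j m_j + \sum_{l \notin \{i,j\}} p_l m_l = \frac{d - p_j n_j}{n_i} m_i + p_j m_j + 0 \\ &= \frac{d m_i + p_j (n_i m_j - n_j m_i)}{n_i} = \frac{d m_i + e n_i - d m_i}{n_i} = e. \quad \checkmark \end{aligned}$$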
Fact 112. Suppose two non-parallel planes have normal vectors n and m. Then their
intersection is a line with direction vector n × m.
Proof. Here in the Appendices we’ll actually go a little further by fully specifying the line
along which the two planes intersect.
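Let the two planes q1 and q2 be described by $\overrightarrow{OR} \cdot n = d$ and $\overrightarrow{OR} \cdot m = e$. Let P be the point constructed in the proof of Fact 111.
Then, we claim, q1 and q2 intersect at the line described by:
$$r = \overrightarrow{OP} + \lambda\, n \times m \quad (\lambda \in \mathbb{R}).$$
To prove this claim, we first verify that q1 and q2 contain the above line. To do so, plug the generic point of the above line into each plane’s vector equation:
$$(\overrightarrow{OP} + \lambda\, n \times m) \cdot n = \overrightarrow{OP} \cdot n + (\lambda\, n \times m) \cdot n = d + 0 = d, \quad \checkmark$$
$$(\overrightarrow{OP} + \lambda\, n \times m) \cdot m = \overrightarrow{OP} \cdot m + (\lambda\, n \times m) \cdot m = e + 0 = e. \quad \checkmark$$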
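The construction of the point P in the proof of Fact 111, together with the direction vector n × m from Fact 112, can be sketched in Python for the 3D case. The function name `plane_intersection` is hypothetical, and exact rational arithmetic (via `fractions.Fraction`) is an assumption made so that the checks below hold exactly for integer inputs.

```python
from fractions import Fraction


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def plane_intersection(n, d, m, e):
    """Return (P, direction) for the line where r . n = d meets r . m = e in 3D."""
    for i in range(3):
        for j in range(3):
            det = n[i] * m[j] - n[j] * m[i]
            if n[i] != 0 and det != 0:
                # Only coordinates i and j of P are non-zero, as in the proof.
                p = [Fraction(0)] * 3
                p[j] = Fraction(e * n[i] - d * m[i]) / det
                p[i] = (d - p[j] * n[j]) / n[i]
                return tuple(p), cross(n, m)
    raise ValueError("the planes are parallel")


# Planes x + y + z = 6 and x - y = 0.
n, d, m, e = (1, 1, 1), 6, (1, -1, 0), 0
P, direction = plane_intersection(n, d, m, e)
```

The returned point satisfies both plane equations, and the direction vector is perpendicular to both normal vectors.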
Corollary 35. The point on the line ax + by + c = 0 that is closest to the point (p, q) is:
$$\left(p - a \frac{ap + bq + c}{a^2 + b^2},\ q - b \frac{ap + bq + c}{a^2 + b^2}\right).$$
Proof. Replace n, d, and A in Fact 115 with (a, b), −c, and (p, q) to get:
$$k = \frac{d - \overrightarrow{OA} \cdot n}{|n|^2} = \frac{-c - (p, q) \cdot (a, b)}{a^2 + b^2} = \frac{-c - ap - bq}{a^2 + b^2} = -\frac{ap + bq + c}{a^2 + b^2}.$$
So, by Facts 115 and 114, the point on the given line that’s closest to the given point is:
$$B = A + kn = (p, q) - \frac{ap + bq + c}{a^2 + b^2} (a, b) = \left(p - a \frac{ap + bq + c}{a^2 + b^2},\ q - b \frac{ap + bq + c}{a^2 + b^2}\right).$$
Corollary 36. The distance between a point (p, q) and a line ax + by + c = 0 is:
$$\frac{|ap + bq + c|}{\sqrt{a^2 + b^2}}.$$
Proof. Continue with the above proof and apply Fact 115:
$$|\overrightarrow{AB}| = |k| |n| = \left|-\frac{ap + bq + c}{a^2 + b^2}\right| \sqrt{a^2 + b^2} = \frac{|ap + bq + c|}{\sqrt{a^2 + b^2}}.$$
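Corollaries 35 and 36 are easy to evaluate directly. Here is a small Python sketch; the function names are assumptions chosen for this example.

```python
import math


def closest_point_on_line(a, b, c, p, q):
    """Point on the line ax + by + c = 0 closest to (p, q), as in Corollary 35."""
    t = (a * p + b * q + c) / (a * a + b * b)
    return (p - a * t, q - b * t)


def distance_point_to_line(a, b, c, p, q):
    """Distance between (p, q) and the line ax + by + c = 0, as in Corollary 36."""
    return abs(a * p + b * q + c) / math.hypot(a, b)


# Line 3x + 4y - 25 = 0 and the origin.
foot = closest_point_on_line(3, 4, -25, 0, 0)
dist = distance_point_to_line(3, 4, -25, 0, 0)
```

In this example the closest point lies on the line, and its distance from the origin agrees with the value given by the distance formula.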
The values of λ and µ that minimise $|\overrightarrow{AR}|$ correspond to the point B. Our goal then is to find these values. We’ll do so using calculus. This will be very similar to what we did in Chs. 43 and 50, the difference being that we’ll take two derivatives, one w.r.t. λ and one w.r.t. µ. Moreover, these derivatives are a little different in that they are partial derivatives.
When taking a partial derivative with respect to one variable, we treat any other variable as a constant. So:
$$4\lambda + 2 - 2\mu \,\Big|_{\lambda = \tilde{\lambda}, \mu = \tilde{\mu}} \overset{1}{=} 0 \quad\text{and}\quad 4\mu - 4 - 2\lambda \,\Big|_{\lambda = \tilde{\lambda}, \mu = \tilde{\mu}} \overset{2}{=} 0.$$
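$$B = (0, 0, 3) + \tilde{\lambda} (1, -1, 0) + \tilde{\mu} (0, 1, -1) = (0, 0, 3) + 0 (1, -1, 0) + 1 (0, 1, -1) = (0, 1, 2).$$
And: $|\overrightarrow{AB}| = \sqrt{2\tilde{\lambda}^2 + 2\tilde{\mu}^2 + 2\tilde{\lambda} - 4\tilde{\mu} - 2\tilde{\mu}\tilde{\lambda} + 5} = \sqrt{0 + 2 + 0 - 4 - 0 + 5} = \sqrt{3}$.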
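The two first-order conditions form a pair of linear equations in λ̃ and µ̃, which can be solved mechanically. The Python sketch below uses Cramer's rule; the helper `solve_2x2` is a hypothetical name introduced only for this illustration.

```python
def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a11*x + a12*y = b1 and a21*x + a22*y = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("the system has no unique solution")
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det


# The conditions 4λ + 2 − 2µ = 0 and 4µ − 4 − 2λ = 0, rearranged as
# 4λ − 2µ = −2 and −2λ + 4µ = 4.
lam, mu = solve_2x2(4, -2, -2, 4, -2, 4)

# The point B and the squared distance |AB|^2 at the minimising values.
B = tuple(p + lam * u + mu * v
          for p, u, v in zip((0, 0, 3), (1, -1, 0), (0, 1, -1)))
squared_distance = 2 * lam**2 + 2 * mu**2 + 2 * lam - 4 * mu - 2 * mu * lam + 5
```

The solution agrees with the values λ̃ = 0 and µ̃ = 1 used above.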
$$= \sqrt{29\lambda^2 + 30\mu^2 + 9\lambda + 11\mu + 58\lambda\mu + 9/4}.$$
Happily, these results are the same as what we found in Ch. 58.
$$= \sqrt{26\lambda^2 + 27\mu^2 + 28\lambda + 30\mu + 52\lambda\mu + 11}.$$
Happily, these results are the same as what we found in Ch. 58.
$$= \sqrt{5\lambda^2 + 10\mu^2 - 32\lambda + 12\lambda\mu + 256}.$$
Happily, these results are the same as what we found in Ch. 58.
Fact 118. Suppose l1 and l2 are distinct lines described by $r = \overrightarrow{OP} + \lambda u$ and $r = \overrightarrow{OQ} + \lambda v$ ($\lambda \in \mathbb{R}$). Then the three possibilities are that l1 and l2 are:
(a) Parallel and do not intersect; moreover, the unique plane that contains l1 and l2 is described by $r = \overrightarrow{OP} + \lambda u + \mu \overrightarrow{PQ}$ ($\lambda, \mu \in \mathbb{R}$).
(b) Non-parallel and share exactly one intersection point; moreover, the unique plane that contains l1 and l2 is described by $r = \overrightarrow{OP} + \lambda u + \mu v$ ($\lambda, \mu \in \mathbb{R}$).
(c) Skew (i.e. neither parallel nor intersecting) and are not coplanar.
In the remainder of this proof, we’ll suppose instead that l1 ∥/ l2 . That is, u ∥/ v. Then by
Corollary 19, qb is the unique plane that contains P , u, and v. Thus, qb is the only possible
plane that contains both l1 and l2 .
By plugging in µ = 0 into qb ’s parametric equation, we see that qb contains l1 .
By Fact 75, l1 ∥/ l2 implies that l1 and l2 share at most one intersection point.
(b) If l1 and l2 share an intersection point S, then $\overrightarrow{PS}$ is a direction vector of l1 , so that by Fact 62, $\overrightarrow{PS} \parallel v$. Thus, qb can also be described by:
$$r \overset{3}{=} \overrightarrow{OP} + \lambda u + \mu \overrightarrow{PS} \quad (\lambda, \mu \in \mathbb{R}).$$
By plugging in µ = 1 into $\overset{3}{=}$, we see that this plane also contains l2 . And so indeed, qb is
Fact 215. Let l1 and l2 be the lines described by $r = \overrightarrow{OP} + \lambda u$ and $r = \overrightarrow{OQ} + \lambda v$ ($\lambda \in \mathbb{R}$). Then: l1 and l2 are skew $\iff \overrightarrow{PQ} \cdot (u \times v) \ne 0$.
Proof. (⇐) If $\overrightarrow{PQ} \cdot (u \times v) \ne 0$, then $u \times v \ne 0$, so that by Corollary 11, u ∦ v and l1 ∦ l2 .
Suppose for contradiction that l1 and l2 intersect at some point S. Then there are numbers α and β such that:
$$S = P + \alpha u = Q + \beta v \quad\text{or}\quad \overrightarrow{PQ} = \alpha u - \beta v.$$
And so: $\overrightarrow{PQ} \cdot (u \times v) = \alpha u \cdot (u \times v) - \beta v \cdot (u \times v) = 0 - 0 = 0$.
But this contradicts $\overrightarrow{PQ} \cdot (u \times v) \ne 0$. So, l1 and l2 do not intersect.
Since l1 and l2 are non-parallel and do not intersect, they are skew.
(⇒) Now suppose $\overrightarrow{PQ} \cdot (u \times v) = 0$.
If $\overrightarrow{PQ} = 0$, then P = Q, so that l1 and l2 intersect and are not skew. And if u × v = 0, then by Corollary 11, l1 and l2 are parallel and are again not skew.
So, suppose $\overrightarrow{PQ}, u \times v \ne 0$. Then by Corollary 11, u ∦ v.
Also, $\overrightarrow{PQ} \perp (u \times v)$. By Corollary 18 then, $\overrightarrow{PQ}$ lies on the same plane as u and v.
And so by Theorem 10, there exist α and β such that:
$$\overrightarrow{PQ} = \alpha u + \beta v.$$
Clearly, q contains l1 (to see this, set µ = 0). It also contains the point Q, because:
$$\overrightarrow{OQ} = \overrightarrow{OP} + \overrightarrow{PQ} = \overrightarrow{OP} + \alpha u + \beta v.$$
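Fact 215 gives a one-line test for skewness. A minimal Python sketch follows; the function name `are_skew` is an assumption, and the exact comparison with 0 presumes exact (for instance integer) coordinates.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def are_skew(P, u, Q, v):
    """True if the lines r = OP + lambda u and r = OQ + lambda v are skew."""
    PQ = tuple(q - p for p, q in zip(P, Q))
    return dot(PQ, cross(u, v)) != 0


skew = are_skew((0, 0, 0), (1, 0, 0), (0, 0, 1), (0, 1, 0))
intersecting = are_skew((0, 0, 0), (1, 0, 0), (0, 0, 0), (0, 1, 0))
parallel = are_skew((0, 0, 0), (1, 0, 0), (0, 0, 1), (2, 0, 0))
```

Only the first pair of lines is skew; the second pair meets at the origin and the third pair is parallel.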
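Fact 217. Suppose a, b ∈ R with b ≠ 0. Then the two square roots of a + bi are:
$$\pm \frac{\sqrt{2}}{2} \left( \sqrt{\sqrt{a^2 + b^2} + a} + i \frac{b}{|b|} \sqrt{\sqrt{a^2 + b^2} - a} \right).$$
Proof.
$$\begin{aligned} &\left[ \pm \frac{\sqrt{2}}{2} \left( \sqrt{\sqrt{a^2 + b^2} + a} + i \frac{b}{|b|} \sqrt{\sqrt{a^2 + b^2} - a} \right) \right]^2 \\ &= \frac{1}{2} \left[ \sqrt{a^2 + b^2} + a - \left( \sqrt{a^2 + b^2} - a \right) + 2i \frac{b}{|b|} \sqrt{\left( \sqrt{a^2 + b^2} + a \right) \left( \sqrt{a^2 + b^2} - a \right)} \right] \\ &= \frac{1}{2} \left( 2a + 2i \frac{b}{|b|} \sqrt{a^2 + b^2 - a^2} \right) = a + i \frac{b}{|b|} |b| = a + ib. \end{aligned}$$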
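The formula in Fact 217 can be evaluated and checked by squaring the result. The Python sketch below does this; the function name `square_roots` is an assumption made for this example.

```python
import math


def square_roots(a, b):
    """The two square roots of a + bi (with b != 0), using the formula of Fact 217."""
    if b == 0:
        raise ValueError("this formula assumes b != 0")
    r = math.hypot(a, b)                      # sqrt(a^2 + b^2)
    re = math.sqrt(2) / 2 * math.sqrt(r + a)
    im = math.sqrt(2) / 2 * (b / abs(b)) * math.sqrt(r - a)
    root = complex(re, im)
    return root, -root


root_1, root_2 = square_roots(3, 4)     # roots of 3 + 4i
root_3, root_4 = square_roots(-3, -4)   # roots of -3 - 4i
```

Squaring each returned value recovers the original complex number, up to floating-point rounding.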
Lemma 8. If p, q ∈ C, then (a) $(p + q)^* = p^* + q^*$ and (b) $(pq)^* = p^* q^*$.
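Taking conjugates of both sides of $\overset{1}{=}$, we have: $\left[ \sum_{k=0}^{n} c_k (a + ib)^k \right]^* \overset{2}{=} 0^* = 0$.
We now apply Lemma 8(a) and (b) to show that $\left[ \sum_{k=0}^{n} c_k (a + ib)^k \right]^* \overset{3}{=} \sum_{k=0}^{n} c_k (a - ib)^k$:
$$\begin{aligned} \left[ \sum_{k=0}^{n} c_k (a + ib)^k \right]^* &\overset{(a)}{=} \sum_{k=0}^{n} \left[ c_k (a + ib)^k \right]^* \overset{(b)}{=} \sum_{k=0}^{n} c_k^* \left[ (a + ib)^k \right]^* \\ &= \sum_{k=0}^{n} c_k \left[ (a + ib)^k \right]^* \overset{(b)}{=} \sum_{k=0}^{n} c_k \left[ (a + ib)^* \right]^k = \sum_{k=0}^{n} c_k (a - ib)^k. \end{aligned}$$
Together, $\overset{2}{=}$ and $\overset{3}{=}$ show that $\sum_{k=0}^{n} c_k (a - ib)^k = 0$. Hence, a − ib also solves $\sum_{k=0}^{n} c_k x^k = 0$.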
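The key step of the argument, that conjugating the input of a polynomial with real coefficients conjugates the output, can be illustrated numerically. In this Python sketch, `poly_eval` is a hypothetical helper and the polynomial $x^2 - 2x + 5$ is an example chosen here, not one taken from the text.

```python
def poly_eval(coeffs, x):
    """Evaluate c_0 + c_1 x + ... + c_n x^n by Horner's method (coeffs = [c_0, ..., c_n])."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result


coeffs = [5, -2, 1]            # real coefficients of x^2 - 2x + 5
z = 1 + 2j                     # one root
value_at_root = poly_eval(coeffs, z)
value_at_conjugate = poly_eval(coeffs, z.conjugate())

# For any complex w, evaluating at the conjugate gives the conjugate of the value.
w = 2 + 3j
conjugation_commutes = poly_eval(coeffs, w.conjugate()) == poly_eval(coeffs, w).conjugate()
```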
Fact 128. Let z be a non-zero complex number with ∣z∣ = r and arg z = θ. Then:
z = r (cos θ + i sin θ) .
$$r = \sqrt{r^2 \cos^2\theta + b^2} \iff r^2 = r^2 \cos^2\theta + b^2 \iff b^2 = r^2 \sin^2\theta \iff b \overset{4}{=} \pm r \sin\theta.$$
Observe that b ≥ 0 ⇐⇒ sin θ ≥ 0 and b < 0 ⇐⇒ sin θ < 0. That is, sin θ has the same sign as b. Hence, we can discard the negative value in $\overset{4}{=}$ to get $b \overset{5}{=} r \sin\theta$.
But by Theorem XXX, the only functions whose derivatives are zero are constant functions.
Thus, $e^{-i\theta} (\cos\theta + i \sin\theta) = C$ for some constant C.
To find what C is, plug in θ = 0 to get: $C = e^{-0} (\cos 0 + i \sin 0) = 1 \cdot (1 + 0) = 1$.
Thus, $e^{-i\theta} (\cos\theta + i \sin\theta) = 1$. Rearranging, $e^{i\theta} = \cos\theta + i \sin\theta$.
(a) If k is even, then $\cos(x - k\pi) = \cos x \underbrace{\cos(k\pi)}_{1} + \sin x \underbrace{\sin(k\pi)}_{0} = \cos x$.
(b) If k is odd, then $\cos[(k + 1)\pi - x] = \underbrace{\cos[(k + 1)\pi]}_{1} \cos x + \underbrace{\sin[(k + 1)\pi]}_{0} \sin x = \cos x$.
402
We’re actually cheating a little with this proof here, because we haven’t explained how the derivatives
of complex-valued functions work. We simply assume that they work “fairly similarly”.
Fact 130. Let z and w be non-zero complex numbers. Then:
(a) $|zw| = |z| |w|$; and (b) $\arg(zw) = \arg z + \arg w + 2k\pi$,
$$\text{where in (b):} \quad k = \begin{cases} -1, & \text{if } \arg z + \arg w > \pi, \\ 0, & \text{if } \arg z + \arg w \in (-\pi, \pi], \\ 1, & \text{if } \arg z + \arg w \le -\pi. \end{cases}$$
We’ve just shown that $\arg(zw) = \arg z + \arg w + 2k\pi$, with:
$$k = \begin{cases} -1, & \text{if } \arg z + \arg w > \pi, \\ 0, & \text{if } \arg z + \arg w \in (-\pi, \pi], \\ 1, & \text{if } \arg z + \arg w \le -\pi. \end{cases}$$
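The rule for choosing k can be compared against a direct computation of the principal argument. The Python sketch below uses `cmath.phase`, which returns values in [−π, π]; the function name `arg_product` is an assumption, and inputs whose arguments sum to exactly ±π are avoided because floating-point signed zeros make that boundary delicate.

```python
import cmath
import math


def arg_product(z, w):
    """arg(zw) computed as arg z + arg w + 2k*pi, with k chosen as in Fact 130(b)."""
    total = cmath.phase(z) + cmath.phase(w)
    if total > math.pi:
        k = -1
    elif total <= -math.pi:
        k = 1
    else:
        k = 0
    return total + 2 * k * math.pi


z1 = w1 = cmath.rect(1, 3 * math.pi / 4)     # sum of arguments exceeds pi, so k = -1
z2 = w2 = cmath.rect(1, -3 * math.pi / 4)    # sum is below -pi, so k = 1
z3, w3 = cmath.rect(1, 0.5), cmath.rect(2, 1.0)   # sum lies in (-pi, pi], so k = 0
```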
Fact 131. Suppose w is a non-zero complex number. Then:
(a) $\left| \dfrac{1}{w} \right| = \dfrac{1}{|w|}$.
(b) $\arg \dfrac{1}{w} = -\arg w$.
(b) By Definition 163:
$$\arg w = \begin{cases} \cos^{-1} \dfrac{a}{\sqrt{a^2 + b^2}}, & \text{if } b \ge 0, \\[3mm] -\cos^{-1} \dfrac{a}{\sqrt{a^2 + b^2}}, & \text{if } b < 0. \end{cases}$$
And:
$$\arg \frac{1}{w} = \begin{cases} \cos^{-1} \dfrac{a / (a^2 + b^2)}{1 / \sqrt{a^2 + b^2}} = \cos^{-1} \dfrac{a}{\sqrt{a^2 + b^2}}, & \text{if } \dfrac{-b}{a^2 + b^2} \ge 0 \text{ or } b \le 0, \\[3mm] -\cos^{-1} \dfrac{a / (a^2 + b^2)}{1 / \sqrt{a^2 + b^2}} = -\cos^{-1} \dfrac{a}{\sqrt{a^2 + b^2}}, & \text{if } \dfrac{-b}{a^2 + b^2} < 0 \text{ or } b > 0. \end{cases}$$
Thus, if b < 0, then: $\arg \dfrac{1}{w} = \cos^{-1} \dfrac{a}{\sqrt{a^2 + b^2}} = -\arg w$. ✓
And if b > 0, then: $\arg \dfrac{1}{w} = -\cos^{-1} \dfrac{a}{\sqrt{a^2 + b^2}} = -\arg w$. ✓
If b = 0, a > 0, then $\arg w = \arg a = 0$ and $\arg \dfrac{1}{w} = \arg \dfrac{1}{a} = 0$ so that indeed: $\arg \dfrac{1}{w} = -\arg w$. ✓
And in the exceptional case where b = 0, a < 0, we have $\arg \dfrac{1}{w} = \pi = \arg w$.
Fact 132. Let z and w be non-zero complex numbers. Then:
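(a) $\left| \dfrac{z}{w} \right| = \dfrac{|z|}{|w|}$; and (b) $\arg \dfrac{z}{w} = \arg z - \arg w + 2k\pi$,
$$\text{where in (b):} \quad k = \begin{cases} -1, & \text{if } \arg z - \arg w > \pi, \\ 0, & \text{if } \arg z - \arg w \in (-\pi, \pi], \\ 1, & \text{if } \arg z - \arg w \le -\pi. \end{cases}$$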
Now suppose instead that w is a negative real number. Then arg w = π.
And now by Corollary 28:
$$\arg \frac{z}{w} = \arg(-z) = \begin{cases} \arg z - \pi = \underbrace{\arg z - \arg w}_{\in (-\pi, \pi]} + 2 \underbrace{k}_{0} \pi, & \text{if } \arg z > 0, \\[3mm] \arg z + \pi = \underbrace{\arg z - \arg w}_{\le -\pi} + 2 \underbrace{k}_{1} \pi, & \text{if } \arg z \le 0. \end{cases} \quad \checkmark$$
Fact 218. If a < b, then there exists c ∈ R such that a < c < b.
Second, given any real number, we can always find a bigger natural number:
Fact 219. (Archimedean Property) If x ∈ R, then there exists n ∈ N such that n > x.
Figure to be
inserted here.
Definition 248. Let x ∈ R and ε > 0. Then the ε-neighbourhood of x, denoted Nε (x), is:
Nε (x) = (x − ε, x + ε).
Remark 146. The Nε (x) notation is fairly standard, though other writers may use the letters B or V instead of N.
Unfortunately, there is no standard notation for deleted neighbourhoods, which is why
here in these Appendices I’ve come up with Nε (x), which is not at all standard.403
We can also speak more generally of the ε-neighbourhood of a point in any n-dimensional
space. For example, in two-dimensional space, we have the following definition:
403
Velleman (2016, Calculus: A Rigorous First Course) uses the equally non-standard notation x → a≠ .
Definition 249. Let a, b ∈ R and ε > 0. Then the ε-neighbourhood of the point P = (a, b),
denoted Nε (P ), is:
$$N_\varepsilon(P) = \left\{ (x, y) : \sqrt{(x - a)^2 + (y - b)^2} < \varepsilon \right\}.$$
Nε (P ) = Nε (P ) ∖ {P }.
Informally, an isolated point in a set S is one that isn’t “close” to any other point in S.
Formally:
Informally, x is a limit point of a set S if we can always find another point in S that’s “arbitrarily” close to x. Formally:
Definition 251. Let S be a set of real numbers. We call x a limit point of S if for every
ε > 0, the ε-neighbourhood of x intersects S at some point other than x — or equivalently:
Nε (x) ∩ S ≠ ∅.
Note importantly that a limit point x of S may or may not be in the set S.
Remark 147. Some writers treat the terms limit point, cluster point, and accumula-
tion point as synonyms. But unfortunately and very confusingly, other writers assign
different meanings to these three terms. Fortunately, in these appendices, we will only
mention limit points. We will never again mention cluster or accumulation points.
The above definition is called the ε-δ definition and is usually credited to Bernard Bolzano
(1781–1848) and Augustin-Louis Cauchy (1789–1857).
Take note of the subtle but important requirement that a be a limit point of D. If a is not a limit point of D, then $\lim_{x \to a} f(x)$ is simply undefined (or does not exist).
The following result says that if it exists, the limit must be unique:
Fact 220. Let f be a nice function. If $\lim_{x \to a} f(x) = L_1$ and $\lim_{x \to a} f(x) = L_2$, then $L_1 = L_2$.
Proof. Suppose for contradiction that L1 ≠ L2 . Then pick ε = ∣L1 − L2 ∣ /2. Observe that
Nε (L1 ) ∩ Nε (L2 ) = ∅.
Let D = Domainf . By Definition 252, there exist δ1 , δ2 > 0 such that x ∈ D ∩ Nmin{δ1 ,δ2 } (a)
implies f (x) ∈ Nε (L1 ) AND f (x) ∈ Nε (L2 ). But since Nε (L1 ) ∩ Nε (L2 ) = ∅, we have a
contradiction.
Definition 253. Let f be a nice function with domain D. Let a be a limit point of D.
We say that the left-hand limit of f at a is L ∈ R and write $\lim_{x \to a^-} f(x) = L$ if:
Definition 254. Let f be a nice function with domain D. Let a be a limit point of D.
We say that the right-hand limit of f at a is L ∈ R and write $\lim_{x \to a^+} f(x) = L$ if:
(a) We say that the left-hand limit of f at a is ∞ and write $\lim_{x \to a^-} f(x) = \infty$ if:
(b) We say that the right-hand limit of f at a is ∞ and write $\lim_{x \to a^+} f(x) = \infty$ if:
(c) We say that the limit of f at a is ∞ and write $\lim_{x \to a} f(x) = \infty$ if:
(d) We say that the left-hand limit of f at a is −∞ and write $\lim_{x \to a^-} f(x) = -\infty$ if:
(e) We say that the right-hand limit of f at a is −∞ and write $\lim_{x \to a^+} f(x) = -\infty$ if:
(f) We say that the limit of f at a is −∞ and write $\lim_{x \to a} f(x) = -\infty$ if:
(a) We say that the limit of f as x approaches ∞ is L ∈ R and write $\lim_{x \to \infty} f(x) = L$ if:
(b) We say that the limit of f as x approaches −∞ is L ∈ R and write $\lim_{x \to -\infty} f(x) = L$ if:
(a) We say that the limit of f as x approaches ∞ is ax + b and write $\lim_{x \to \infty} f(x) = ax + b$ if:
(b) We say that the limit of f as x approaches −∞ is ax + b and write $\lim_{x \to -\infty} f(x) = ax + b$ if:
Definition 261. Given a set S ⊆ R, the largest and smallest numbers in S (if they exist)
are denoted by max S and min S.
Example 1231. Let U = (0, 1). Then neither max U nor min U exists because the set U
has no largest or smallest number.
Theorem 14. (Rules for Limits) Suppose k, L, M ∈ R, $\lim_{x \to a} f(x) = L$, and $\lim_{x \to a} g(x) = M$. Then:
(a) $\lim_{x \to a} [k f(x)] \overset{F}{=} kL$ (Constant Factor Rule)
(b) $\lim_{x \to a} [f(x) \pm g(x)] \overset{\pm}{=} L \pm M$ (Sum and Difference Rules)
(c) $\lim_{x \to a} [f(x) g(x)] \overset{\times}{=} LM$ (Product Rule)
(d) $\lim_{x \to a} \dfrac{1}{g(x)} \overset{R}{=} \dfrac{1}{M}$ (provided M ≠ 0) (Reciprocal Rule)
(e) $\lim_{x \to a} \dfrac{f(x)}{g(x)} \overset{\div}{=} \dfrac{L}{M}$ (provided M ≠ 0) (Quotient Rule)
(f) $\lim_{x \to a} k \overset{C}{=} k$ (Constant Rule)
(g) $\lim_{x \to a} x^k \overset{P}{=} a^k$ (Power Rule)
Proof. Fix ε > 0. For each Rule, we shall show that there exists some δ such that if
x ∈ D ∩ Nδ (a), then the value of the given function at x is less than ε away from the
purported limit.
First, note that the statements lim f (x) = L and lim g (x) = M say the following:
x→a x→a
☺ For every εf > 0, there exists δf > 0 such that x ∈ D∩ Nδf (a) implies
☀ For every εg > 0, there exists δg > 0 such that x ∈ D∩ Nδg (a) implies
(b) Pick εf = ε/2 and εg = ε/2, and let δf and δg be as given by ☺ and ☀. Pick δ = min {δf , δg }.
Suppose x ∈ D ∩ Nδ (a). Then ∣f (x) − L∣ < εf , ∣g (x) − M ∣ < εg and thus:
∣f (x) g (x) − LM ∣ = ∣f (x) g (x) − f (x) M + f (x) M − LM ∣ ≤ ∣f (x) g (x) − f (x) M ∣ + ∣f (x) M − L
T
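(d) Let N be the smallest integer such that $N > \varepsilon / |M|$ and $N > \varepsilon^2$.
Pick $\varepsilon_g = \dfrac{\varepsilon^3}{N (N - \varepsilon^2)} > 0$. Observe that (we’ll use this below):
$$\frac{\varepsilon_g N^2}{\varepsilon^2 + \varepsilon_g \varepsilon N} = \frac{\dfrac{\varepsilon^3}{N (N - \varepsilon^2)} N^2}{\varepsilon^2 + \dfrac{\varepsilon^3}{N (N - \varepsilon^2)} \varepsilon N} = \frac{\varepsilon N}{N - \varepsilon^2 + \varepsilon^2} = \varepsilon.$$
$$\left| \frac{1}{g(x)} - \frac{1}{M} \right| = \left| \frac{M - g(x)}{g(x) M} \right| = \frac{|g(x) - M|}{|g(x)| |M|} < \frac{\varepsilon_g}{(\varepsilon / N + \varepsilon_g)\, \varepsilon / N} = \frac{\varepsilon_g N^2}{\varepsilon^2 + \varepsilon_g \varepsilon N} = \varepsilon.$$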
(g) Let a > 0. Pick $\delta = \min\left\{ \dfrac{\varepsilon}{2 \exp k}, \dfrac{a}{2} \right\}$. Suppose x ∈ D ∩ Nδ (a). Note that since δ ≤ a/2, we have x > 0. And now:
$$|x^k - a^k| \overset{1}{=} |\exp(k \ln x) - \exp(k \ln a)| \overset{2}{=} |(\exp k) [\exp(\ln x) - \exp(\ln a)]| \overset{3}{=} |(\exp k)(x - a)| < \varepsilon.$$
Above, $\overset{1}{=}$ uses Definition 271, while $\overset{2}{=}$ and $\overset{3}{=}$ use Fact 160(c) and Definition 59. This
completes the proof of the Power Rule in the case where a > 0.
It remains to be proven that the Power Rule holds in the case where a ≤ 0. Unfortunately,
such a proof must be omitted altogether from this textbook, for reasons that were already
discussed in Remark 156.
Definition 262. Suppose the function f has a discontinuity at a. Then we call this
discontinuity:
(a) A removable discontinuity if $\lim_{x \to a} f(x)$ exists;
(b) A jump discontinuity if both $\lim_{x \to a^-} f(x)$ and $\lim_{x \to a^+} f(x)$ exist but $\lim_{x \to a^-} f(x) \ne \lim_{x \to a^+} f(x)$; and
(c) An essential (or infinite) discontinuity if it isn’t a removable or a jump discontinuity.
Proof. Let ε > 0 and a ∈ Domainf . Pick any δ > 0. Then for all x ∈ Nδ (a), we have:
Proof. Let ε > 0 and a ∈ Domainf . Pick δ = ε. Then for all x ∈ Nδ (a), we have:
Proof. Let ε > 0. Since f is continuous at b = g (a), there exists δ̂ > 0 such that for every y ∈ Domainf ∩ Nδ̂ (g (a)), we have ∣f (y) − f (b)∣ < ε.
Since g is continuous at a, there exists δ > 0 such that for every x ∈ Domaing ∩ Nδ (a), we have ∣g (x) − g (a)∣ < δ̂.
Hence, for every x ∈ Domaing ∩ Nδ (a), we have ∣(f ∘ g) (x) − (f ∘ g) (a)∣ < ε.
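Fact 221. Any nice function with the mapping $x \mapsto \dfrac{1}{x}$ is continuous.
Proof. Let D ⊆ R ∖ {0}, f ∶ D → R be defined by $f(x) = \dfrac{1}{x}$, $\varepsilon \in \left(0, \dfrac{1}{a}\right)$, and a ∈ D.
And:
$$\begin{aligned} |f(x) - f(a)| &= \left| \frac{1}{x} - \frac{1}{a} \right| = \left| \frac{a - x}{ax} \right| = \left| \frac{1}{a} \right| \left| \frac{1}{x} \right| |x - a| \\ &< \frac{1}{a} \cdot \frac{1 + a\varepsilon}{a} |x - a| < \frac{1}{a} \cdot \frac{1 + a\varepsilon}{a} \delta = \frac{1}{a} \cdot \frac{1 + a\varepsilon}{a} \cdot \frac{a^2 \varepsilon}{1 + a\varepsilon} = \varepsilon. \end{aligned}$$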
The case where a < 0 is similarly handled.
Theorem 16. Suppose the nice functions f and g are continuous at a. Then (a) f ±g and
(b) f ⋅ g are also continuous at a. (c) If moreover g (a) ≠ 0, then f /g is also continuous
at a. (d) If c ∈ R, then cf is also continuous at a.
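(b) Let: $q = \dfrac{\varepsilon}{2 (|f(a)| + 1)}$ and $p = \dfrac{\varepsilon}{2 [q + |g(a)|]}$.
By the continuity of f and g at a, there exists δ2 > 0 such that for every x ∈ Nδ2 (a) ∩ Domainf ∩ Domaing, we have: f (x) ∈ Np (f (a)), g (x) ∈ Nq (g (a)), so that:
$$\begin{aligned} |f(x) g(x) - f(a) g(a)| &= |f(x) g(x) - f(a) g(x) + f(a) g(x) - f(a) g(a)| \\ &\overset{\triangle}{\le} |f(x) g(x) - f(a) g(x)| + |f(a) g(x) - f(a) g(a)| \\ &= |f(x) - f(a)| |g(x)| + |f(a)| |g(x) - g(a)| \\ &< p \left[ q + |g(a)| \right] + |f(a)|\, q = \frac{\varepsilon}{2} + |f(a)| \frac{\varepsilon}{2 (|f(a)| + 1)} < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \end{aligned}$$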
Remark 148. The usual, short proof of (b) relies on other hard-fought results about
sequential limits. However, we have not discussed sequential limits at all in this textbook
and so we cannot use those results.
Proposition 18. If a monotonic function has an interval as its range, then it is continuous.
404
If no such a exists, then D is empty and f is continuous.
Case 4. Suppose there do not exist b or c such that f (b) ∈ (f (a) , f (a) + ε) or f (c) ∈
(f (a) − ε, f (a)).
Then since Rangef is an interval, it must be that Rangef consists of the single point f (a).
So, f is a constant function and is continuous (Fact 135).
Since f is continuous at b, there exists δ̂ > 0 such that for every y ∈ Domainf ∩ Nδ̂ (b), we
have f (y) ∈ Nε (c).
Since lim g (x) = b, there exists δ > 0 such that for every x ∈ Nδ (a), we have g (x) ∈ Nδ̂ (b).
x→a
Hence, for every x ∈ Nδ (a), we have g (x) ∈ Nδ̂ (b) and thus also f (g (x)) ∈ Nε (c). We have
just proven that:
Remark 149. Fact 134 is also true if a is replaced by ±∞. (The proof is very similar.)
Using the Rules for Limits (Theorem 14), we show that $\lim_{x \to a} f(x) - f(a) = 0$:
$$\lim_{x \to a} f(x) - f(a) \overset{C}{=} \lim_{x \to a} f(x) - \lim_{x \to a} f(a) \overset{\pm}{=} \lim_{x \to a} \left[f(x) - f(a)\right] \overset{1}{=} \lim_{x \to a} \left[\frac{f(x) - f(a)}{x - a}(x - a)\right]$$
$$\overset{\times}{=} \lim_{x \to a} \frac{f(x) - f(a)}{x - a} \cdot \lim_{x \to a} (x - a) = f'(a) \cdot 0 = 0.$$
Hence, $\lim_{x \to a} f(x) = f(a)$. By Definition 164 (of continuity) then, f is continuous at a.
Remark 150. Newton’s Linear Approximation gives us Newton’s Method (or the
Newton-Raphson Method) for finding the roots of a function. (Newton’s Method was
formerly on the old 9233 syllabus but was removed when the 9740 syllabus was introduced
in 2007.)
Proof.
$f'(a) = L$
⇐⇒ $\lim_{x \to a} \dfrac{f(x) - f(a)}{x - a} = L$
⇐⇒ For every ε > 0, there exists δ > 0 such that x ∈ Nδ (a) ∩ Domainf implies $\left|\dfrac{f(x) - f(a)}{x - a} - L\right| < \varepsilon$
⇐⇒ $\left|f(x) - \left[f(a) + L(x - a)\right]\right| < |x - a|\,\varepsilon$.
In the main text (p. 687), we proved the Power Rule only in the special case where the
exponent c is a non-negative integer. We now prove it also in the case where the base x is
positive and c is any real exponent.
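$$f'(x) \overset{P}{=} c x^{c-1}. \qquad \text{(Power Rule)}$$
Proof. Below, $\overset{\star}{=}$ indicates the use of Definition 271, which is the general definition of
exponentiation given in Ch. 121.17.
For x > 0, we have:
$$f'(x) = (x^c)' \overset{\star}{=} \left[\exp(c \ln x)\right]' \overset{C}{=} \left[\exp(c \ln x)\right](c \ln x)' \overset{F}{=} c \exp(c \ln x)(\ln x)' = c \exp(c \ln x)\frac{1}{x} \overset{\star}{=} c x^c \frac{1}{x} = c x^{c-1}.$$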
We’ve just proven that the Power Rule of Differentiation holds in the case where x > 0.
It remains to be proven that the Power Rule also holds in the case where x ≤ 0.
Unfortunately, this proof shall be omitted altogether from this textbook, for reasons that
were already discussed in Remark 156.
To see why the above step requires additional justification, let b = g (a). We were given
that f ′ (g (a)) = f ′ (b) exists. By Definition 168 (of the derivative), this merely means that:
$$f'(g(a)) = f'(b) = \lim_{y \to b} \frac{f(y) - f(b)}{y - b} = \lim_{y \to g(a)} \frac{f(y) - f(g(a))}{y - g(a)}.$$
We now need to justify that this last expression is in fact equal to the LHS of $\overset{\star}{=}$, i.e. that:
$$\lim_{y \to g(a)} \frac{f(y) - f(g(a))}{y - g(a)} = \lim_{x \to a} \frac{f(g(x)) - f(g(a))}{g(x) - g(a)}.$$
Theorem 21. (Chain Rule) Let a ∈ R and f and g be nice functions. Suppose g ′ (a)
and f ′ (g (a)) exist. Then:
Proof. Let D be the set of points for which the composite function f g is well-defined. Let
b = g (a).
Define h ∶ D → R by:
$$h(x) \overset{1}{=} \begin{cases} \dfrac{f(g(x)) - f(b)}{g(x) - b} & \text{for } g(x) \neq b, \\[2ex] f'(b) & \text{for } g(x) = b. \end{cases}$$
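So:
$$h(x)\,\frac{g(x) - b}{x - a} = \begin{cases} \dfrac{f(g(x)) - f(b)}{x - a} & \text{for } g(x) \neq b, \\[2ex] f'(g(a))\,\dfrac{g(x) - b}{x - a} = 0 = \dfrac{f(g(x)) - f(b)}{x - a} & \text{for } g(x) = b. \end{cases}$$
Hence:
$$h(x)\,\frac{g(x) - b}{x - a} \overset{2}{=} \frac{f(g(x)) - f(b)}{x - a}.$$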
$$\left|\frac{f(y) - f(b)}{y - b} - f'(b)\right| \overset{4}{<} \varepsilon$$
By the continuity of g, there exists δ > 0 such that for every x ∈ D ∩ Nδ (a), we have
g (x) ∈ D ∩ Nλ (g (a)) and thus also:
$$\left|h(x) - f'(b)\right| = \begin{cases} \left|\dfrac{f(g(x)) - f(b)}{g(x) - b} - f'(b)\right| \overset{4}{<} \varepsilon & \text{for } g(x) \neq b, \\[2ex] \left|f'(b) - f'(b)\right| = 0 & \text{for } g(x) = b. \end{cases}$$
We have just proven =, because we have just shown that for any ε > 0, we can find δ > 0
3
It’s not at all obvious why the above result corresponds to our Parametric Differentiation
Rule. To see why, let y and x take the places of f and g. Then we have:
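$$f'(g) = \frac{(f g)'}{g'},$$
where $f'(g)$, $(f g)'$, and $g'$ correspond to $\frac{dy}{dx}$, $\frac{dy}{dt}$, and $\frac{dx}{dt}$, so that $\frac{dy}{dx} = \frac{dy}{dt} \div \frac{dx}{dt}$.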
(As usual, t is a dummy variable that can be replaced by any other symbol.)
Here is a weak version of the Inverse Function Theorem (IFT). It is weak in the sense
that it employs strong assumptions, so that the result is nearly immediate from Corollary
38.
Figure to be
inserted here.
Pick any two points A and B on f . Let m be the gradient of the line AB.
It is intuitively plausible that there exists some point C (on f ) between A and B such that
the gradient of the tangent line at C equals m. This result is known as the Mean Value
Theorem:
Theorem 35. (Mean Value Theorem) Let a < b. Suppose f is continuous on [a, b]
and differentiable on (a, b). Then there exists c ∈ (a, b) such that:
$$f'(c) = \frac{f(b) - f(a)}{b - a}.$$
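As a numerical illustration of the theorem, here is a minimal Python sketch. The example f(x) = x³ on [0, 2] and the helper names `f`, `f_prime`, and `mvt_point` are assumptions chosen for illustration and do not come from the text. The sketch uses bisection to locate a point c at which the gradient of the tangent line equals the gradient of the chord.

```python
def f(x):
    # Hypothetical example function (not from the text).
    return x ** 3


def f_prime(x):
    # Derivative of the example function.
    return 3 * x ** 2


def mvt_point(a, b, tol=1e-12):
    """Find c in (a, b) with f'(c) = (f(b) - f(a)) / (b - a) by bisection.

    Assumes f'(x) - m changes sign exactly once on [a, b], which holds for
    this example because f' is increasing on [0, 2].
    """
    m = (f(b) - f(a)) / (b - a)  # gradient of the chord AB
    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f_prime(lo) - m) * (f_prime(mid) - m) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


c = mvt_point(0.0, 2.0)
print(c, f_prime(c))
```

For this example the chord has gradient 4, so the point found should satisfy 3c² = 4.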
Proof. (⟹)(a) Let c, d ∈ (a, b) with d > c. By the Mean Value Theorem, there exists
e ∈ (c, d) such that $f'(e) = \dfrac{f(d) - f(c)}{d - c}$. Since f ′ (e) ≥ 0 and d > c, we have f (d) ≥ f (c).
The proof of (b) is identical, except that we replace the two ≥s with >s.
The proofs of (c) and (d) are similar to (a) and (b) and thus omitted.
(⟸)(a) Suppose f is increasing405 on (a, b). Then for any distinct c, d ∈ (a, b), we have:
Hence: $\dfrac{f(d) - f(c)}{d - c} \overset{1}{\ge} 0$.
405
Note that this includes the possibility that f is strictly increasing.
Now, let e ∈ (a, b) and consider the derivative of f at e, which exists because f is
differentiable on (a, b):
$$f'(e) = \lim_{x \to e} \frac{f(x) - f(e)}{x - e}.$$
We now show that f ′ (e) ≥ 0. Suppose for contradiction that f ′ (e) = L < 0. Pick δ = ∣L∣ /2.
Then by definition of the limit, there exists some ε ∈ (0, min {e − a, b − e}) such that for
every x ∈ Nε (e), we have:
$$\frac{f(x) - f(e)}{x - e} \in N_\delta(L) \subseteq \mathbb{R}^-.$$
Which implies that: $\dfrac{f(x) - f(e)}{x - e} < 0$, contradicting $\overset{1}{\ge}$.
The proof of (c) is similar and thus omitted.
By the way, with the MVT, we can easily prove that a function whose derivative is a
zero function is itself a constant function:
Proof. Pick any a, b ∈ D with a < b. Then by the MVT, for some x ∈ (a, b), we have:
$$f'(x) = \frac{f(b) - f(a)}{b - a}.$$
But since f ′ (x) = 0 for all x ∈ D, we must have f (b) = f (a). Since a and b were arbitrarily
chosen points in D, f is constant on D.
Theorem 36. (Extreme Value Theorem) If f is continuous on [a, b], then there exist
m, M ∈ [a, b] such that for every x ∈ [a, b], we have:
f (m) ≤ f (x) ≤ f (M ) .
Proof. (a) We will prove the contrapositive. Suppose there exists b ∈ (a − ε, a) such that
f (a) < f (b). Let δ = [f (b) − f (a)] /2. By continuity, there exists ε̂ ∈ (0, ε) such that for
all x ∈ (a − ε̂, a), we have f (x) ∈ Nδ (f (a)). Observe that since f (b) ∉ Nδ (f (a)), we must
have b ≤ a − ε̂. Observe that f (b) > f (x) for all x ∈ (a − ε̂, a). So, f is not increasing on
(a − ε, a).
(b) If f is strictly increasing on (a − ε, a), then by (a), $f(a) \overset{1}{\ge} f(x)$ for all x ∈ (a − ε, a).
Suppose for contradiction that there exists b ∈ (a − ε, a) such that f (a) = f (b). Since f is
strictly increasing on (a − ε, a), for any c ∈ (b, a), we have f (c) > f (b) = f (a), contradicting $\overset{1}{\ge}$.
The proofs of (c) and (d) are similar to (a) and (b) and thus omitted.
The following result is nearly immediate from our definitions of the terms (strict) local
maximum, (strict) local minimum, (strictly) increasing, and (strictly) decreasing:
It is tempting to assume that the converses of (a)–(d) in the above Lemma are also true.
Unfortunately, they are not:
Example 1232. Define f ∶ R → R by:
$$f(x) = \begin{cases} 2 & \text{for } x = 0, \\ 1 & \text{for } x \in \mathbb{Q} \setminus \{0\}, \\ 0 & \text{for } x \notin \mathbb{Q}. \end{cases}$$
Figure to be
inserted here.
Observe that 0 is a strict local (and also global) maximum of f . However, f is not
strictly increasing for any ε-neighbourhood to the left of 0; and similarly, f is not strictly
decreasing for any ε-neighbourhood to the right of 0.
Thus, the converse of (a) of the above Fact is false.
Indeed, even if we assume continuity or even differentiability, the converses of the above
Lemma remain false:
Observe that for any ε > 0, we can find some (a, b) ⊆ (0, ε) such that g ′ (x) < 0 for all
x ∈ (a, b) and thus that g is (strictly) decreasing on (a, b) ⊆ R+ .
Figure to be
inserted here.
In the main text, we gave an informal statement of the First Derivative Test (FDT). We
now give a formal statement thereof:
406
This example was stolen from Gelbaum and Olmsted, Counterexamples in Analysis (1964, p. 36).
Proposition 20. (First Derivative Test for Extrema) Let a < b, f ∶ (a, b) → R be
differentiable, and c ∈ (a, b).
(a) If there exists ε > 0 such that f ′ is non-negative on (c − ε, c) and non-positive on
(c, c + ε), then c is a local maximum (of f ).
(b) If there exists ε > 0 such that f ′ is positive on (c − ε, c) and negative on (c, c + ε),
then c is a strict local maximum (of f ).
(c) If there exists ε > 0 such that f ′ is non-positive on (c − ε, c) and non-negative on
(c, c + ε), then c is a local minimum (of f ).
(d) If there exists ε > 0 such that f ′ is negative on (c − ε, c) and positive on (c, c + ε),
then c is a strict local minimum (of f ).
Proof. (a) By Fact 145, f is increasing on (c − ε, c) and decreasing on (c, c + ε). By Lemma
11 then, c is a local maximum of f .
The proofs of (b), (c), and (d) are similar and thus omitted.
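The four cases of the First Derivative Test can be sketched in Python as below. This is only a heuristic: it samples f ′ at finitely many points on each side of c, so it can suggest, but never prove, that the sign conditions hold. The function name `classify_by_fdt` and the parameters `eps` and `samples` are assumptions introduced for this illustration.

```python
def classify_by_fdt(f_prime, c, eps=1e-3, samples=50):
    """Apply the First Derivative Test at c using sampled values of f'.

    Samples f' on (c - eps, c) and (c, c + eps). Finite sampling is a
    heuristic check of the sign conditions, not a proof.
    """
    step = eps / (samples + 1)
    left = [f_prime(c - k * step) for k in range(1, samples + 1)]
    right = [f_prime(c + k * step) for k in range(1, samples + 1)]
    if all(v > 0 for v in left) and all(v < 0 for v in right):
        return "strict local maximum"
    if all(v < 0 for v in left) and all(v > 0 for v in right):
        return "strict local minimum"
    if all(v >= 0 for v in left) and all(v <= 0 for v in right):
        return "local maximum"
    if all(v <= 0 for v in left) and all(v >= 0 for v in right):
        return "local minimum"
    return "inconclusive"


# f(x) = 1 - x^2 has derivative -2x, f(x) = x^2 has derivative 2x,
# and f(x) = x^3 has derivative 3x^2 (no sign change at 0).
print(classify_by_fdt(lambda x: -2 * x, 0.0))
print(classify_by_fdt(lambda x: 2 * x, 0.0))
print(classify_by_fdt(lambda x: 3 * x ** 2, 0.0))
```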
Remark 151. As illustrated in Example 1233, the converses of (a)–(d) are false.
We say that f is convex if for every x1 , x2 ∈ D and every α ∈ [0, 1], we have:
Definition 265. Let D be an interval and f ∶ D → R. We say that f is linear if for every
x1 , x2 ∈ D, we have:
It is immediate from the above definitions that “linearity ⇐⇒ concavity and convexity”:
The following Lemma characterises concavity and can serve as an alternative definition
thereof:
Also, (a) remains true if we replace concave with strictly concave and $\overset{1}{\ge}$ with >.
Also, (b) remains true if we replace convex with strictly convex and $\overset{2}{\le}$ with <.
Proof. (a) Pick any distinct $x_1, x_2, x_3 \in D$. Let $\alpha = \dfrac{x_3 - x_2}{x_3 - x_1}$.
Observe that α ∈ [0, 1]. Moreover:
$$\alpha x_1 + (1 - \alpha) x_3 = \frac{x_3 - x_2}{x_3 - x_1} x_1 + \frac{x_2 - x_1}{x_3 - x_1} x_3 = x_2.$$
$$f(x_2) \ge \frac{x_3 - x_2}{x_3 - x_1} f(x_1) + \frac{x_2 - x_1}{x_3 - x_1} f(x_3)$$
$$\iff (x_3 - x_1) f(x_2) \ge (x_3 - x_2) f(x_1) + (x_2 - x_1) f(x_3)$$
f is concave on D
Definition 266. Let a < b and f ∶ (a, b) → R be a continuous function. We call c ∈ (a, b)
an inflexion point of f if there exists ε > 0 such that either statement (a) or (b) is true:
(a) f is strictly concave on (c − ε, c) and strictly convex on (c, c + ε).
(b) f is strictly convex on (c − ε, c) and strictly concave on (c, c + ε).
Fact 223. (The First Derivative Test for Inflexion Points [FDTI]) Let a < b,
f ∶ (a, b) → R be a differentiable function whose derivative is continuous, and c ∈ (a, b) be
a stationary point of f . If c is also an inflexion point of f , then there exists ε > 0 such
that f ′ is either strictly positive or strictly negative on (c − ε, c) ∪ (c, c + ε).
Fact 146. (Second Derivative Test for Inflexion Points [SDTI]) Let a < b. Suppose
f ∶ (a, b) → R is twice differentiable and has inflexion point c ∈ (a, b). Then:
(a) c is a strict extremum of the first derivative f ′ ; and
(b) f ′′ (c) = 0.
Proof. (a) By our definition of an inflexion point, f is either (i) strictly concave on (c − ε, c)
and strictly convex on (c, c + ε); or (ii) strictly convex on (c − ε, c) and strictly concave on
(c, c + ε).
If (i), then by Proposition 8, f ′ is strictly decreasing on (c − ε, c) and strictly increasing on
(c, c + ε). By Lemma 11 then, c is a strict local minimum of f ′ .
If (ii), then by Proposition 8, f ′ is strictly increasing on (c − ε, c) and strictly decreasing
on (c, c + ε). By Lemma 11 then, c is a strict local maximum of f ′ .
(b) was already proven in the main text.
Fact 224. (Tangent Line Test) Let a < b, f ∶ (a, b) → R be differentiable, and c ∈ (a, b).
If c is an inflexion point of f , then there exists ε > 0 such that for every x1 ∈ (c − ε, c)
and x2 ∈ (c, c + ε), one of the following statements is true:
(a) $f'(c) \ge \dfrac{f(x_1) - f(c)}{x_1 - c}$ and $f'(c) \le \dfrac{f(x_2) - f(c)}{x_2 - c}$.
(b) $f'(c) \le \dfrac{f(x_1) - f(c)}{x_1 - c}$ and $f'(c) \ge \dfrac{f(x_2) - f(c)}{x_2 - c}$.
407
This example was stolen from Kouba (1995), “Can We Use the First Derivative to Determine Inflection
Points?”
Proof. Following Definition 266, suppose there exists ε > 0 such that f is strictly concave
on (c − ε, c), but convex on (c, c + ε). (The other cases where the words strictly, concave,
and convex are permuted are similar and thus omitted.)
Then by Proposition 8, f ′ (x) is strictly decreasing on (c − ε, c) and f ′ (x) is increasing on
(c, c + ε). And so, it is true that for every x1 ∈ (c − ε, c) and x2 ∈ (c, c + ε), we have:
(b) $f'(c) < \dfrac{f(x_1) - f(c)}{x_1 - c}$ and $f'(c) \ge \dfrac{f(x_2) - f(c)}{x_2 - c}$.
Remark 152. Note that the converse of the TLT is false. That is, every inflexion point
satisfies the TLT, but not every point that satisfies the TLT is an inflexion point. In
other words, the TLT is necessary but not sufficient.
It turns out that this remains true even if the given function is smooth! See the answer
given by zhw at .
Figure to be
inserted here.
Proof. Suppose for contradiction that c is also an extremum of f . Then by the IET,
If c ∈ (a, b) is an inflexion point of f , then by Fact 146, c is a strict extremum of the first
derivative f ′ .
Suppose c is a strict local maximum of f ′ (the case where c is instead a strict local minimum
is similar and thus omitted). Then there exists some ε > 0 such that $f'(x) \overset{1}{<} f'(c) = 0$ for
all x ∈ (c − ε, c) ∪ (c, c + ε). Fact 145 then implies that f is strictly decreasing on (c − ε, c)
and (c, c + ε). And so, by the continuity of f and Lemma 11, c is a strict minimum of f on
(c − ε, c] and a strict maximum on [c, c + ε). Thus, c cannot be an extremum of f — this
is our desired contradiction.
Definition 267. Given the two series $A = \sum_{n=0}^{\infty} a_n$ and $B = \sum_{n=0}^{\infty} b_n$, their Cauchy product is
the series:
$$C = AB = \sum_{n=0}^{\infty} c_n, \quad \text{where } c_n = \sum_{i=0}^{n} a_i b_{n-i}.$$
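To make the definition concrete, the following Python sketch computes the first few terms c_n of a Cauchy product from finite lists of terms. The function name `cauchy_product_terms` and the geometric-series example (x = 0.5, truncated at 60 terms) are assumptions chosen for illustration. The product of the geometric series with itself should have terms c_n = (n + 1)xⁿ and sum 1/(1 − x)².

```python
def cauchy_product_terms(a, b):
    """Return c_0, ..., c_{N-1} with c_n = sum_{i=0}^{n} a_i * b_{n-i}.

    N is the shorter of the two input lengths, so every c_n returned uses
    only terms that are actually available.
    """
    n_terms = min(len(a), len(b))
    return [sum(a[i] * b[n - i] for i in range(n + 1)) for n in range(n_terms)]


x = 0.5
N = 60
a = [x ** n for n in range(N)]      # geometric series, which sums to 1 / (1 - x)
c = cauchy_product_terms(a, a)      # expected terms: c_n = (n + 1) * x**n
product_sum = sum(c)                # approximates 1 / (1 - x)**2
print(c[:4], product_sum)
```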
Definition 268. The series $\sum_{n=0}^{\infty} a_n$ is absolutely convergent if $\sum_{n=0}^{\infty} |a_n|$ converges.
Definition 269. The series $\sum_{n=0}^{\infty} a_n$ is conditionally convergent if it is convergent but not
absolutely convergent.
Theorem 37. Let $C = \sum_{n=0}^{\infty} c_n$ be the Cauchy product of $A = \sum_{n=0}^{\infty} a_n$ and $B = \sum_{n=0}^{\infty} b_n$. Suppose
A and B are both convergent, with at least one being absolutely convergent. Then C
converges.
Proof. Omitted — see e.g. Rudin (1976, p. 74, Theorem 3.50) or Apostol (1974, p. 204,
Theorem 8.46).
Remark 153. The version of Theorem 37 given in the main text (p. 784) is slightly
incorrect because it omits the additional condition that at least one of the two series is
absolutely convergent.
If A and B are both merely conditionally convergent, then C may not converge.
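$$f(x) = \sum_{n=0}^{\infty} a_n x^n \text{ for all } x \in (-c, c) \quad \text{and} \quad g(x) = \sum_{n=0}^{\infty} b_n x^n \text{ for all } x \in (-d, d).$$
Then for any z ∈ (−d, d) such that $\sum_{n=0}^{\infty} b_n z^n$ converges absolutely to some number in (−c, c),
we have:
$$(f \circ g)(z) = \sum_{n=0}^{\infty} a_n \left(\sum_{m=0}^{\infty} b_m z^m\right)^n.$$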
Proof. Omitted — see e.g. Apostol (1974, p. 238, Theorem 9.25: “Substitution Theorem”)
or Lang (1999, p. 66, Theorem 3.10: “Composition of Series”).
Next,408 for each i, let $l_i$ and $u_i$ be the minimum and maximum values of f in $P_i$.
Then the lower and upper n-sums of f from a to b are denoted $L_n$ and $U_n$ and are defined
by:
$$L_n = \frac{b - a}{n} \sum_{i=1}^{n} l_i \quad \text{and} \quad U_n = \frac{b - a}{n} \sum_{i=1}^{n} u_i.$$
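The lower and upper n-sums can be sketched in Python for the special case of an increasing function, where the minimum and maximum on each subinterval are attained at its left and right endpoints. The function name `lower_upper_sums_increasing` and the example f(x) = x² on [0, 1] are assumptions made for illustration; a general continuous f would need a proper search for the minimum and maximum on each subinterval.

```python
def lower_upper_sums_increasing(f, a, b, n):
    """Lower and upper n-sums of an increasing function f on [a, b].

    Because f is assumed increasing, l_i is f at the left endpoint of the
    i-th subinterval and u_i is f at the right endpoint.
    """
    width = (b - a) / n
    lower = width * sum(f(a + (i - 1) * width) for i in range(1, n + 1))
    upper = width * sum(f(a + i * width) for i in range(1, n + 1))
    return lower, upper


L_n, U_n = lower_upper_sums_increasing(lambda x: x * x, 0.0, 1.0, 1000)
print(L_n, U_n)  # the two sums bracket the integral of x^2 on [0, 1], which is 1/3
```

As n grows, the gap between the two sums shrinks (here it equals (b − a)[f(b) − f(a)]/n), which is the mechanism behind L = U for continuous functions.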
The lower and upper integrals of f from a to b are denoted L and U and are defined by:
a
definite integral of f from a to b by:
$$\int_a^b f = L = U.$$
The following result says that a continuous function on a closed (and bounded) interval is
Riemann-integrable.
408
The existence of li and ui is given by the Extreme Value Theorem (Theorem 36).
Theorem 40. (Order Limit Theorem) Let D be an interval, f ∶ D → R be a continuous
function, a be an interior point of D, and p ∈ R.
(a) If f (x) ≤ p for all x < a, then f (a) ≤ p.
(b) If f (x) < p for all x > a, then f (a) ≤ p.
(c) If f (x) ≥ p for all x < a, then f (a) ≥ p.
(d) If f (x) > p for all x > a, then f (a) ≥ p.
Proof. (a) Suppose f (a) > p. Let $\delta = \dfrac{f(a) - p}{2}$.
Then by the continuity of f , there exists ε > 0 such that for every x ∈ (a − ε, a + ε), we
have f (x) ∈ (f (a) − δ, f (a) + δ) and hence f (x) > p. This contradicts our assumption that
f (x) ≤ p for all x < a.
Given (a), (b), which has the same conclusion but uses a stronger assumption, is a fortiori
true.
The proofs of (c), and (d) are similar to those of (a) and (b) and thus omitted.
Theorem 25. Let a, b, c, d, e ∈ R with a < c < b. Suppose f, g ∶ [a, b] → R are continuous
functions. Then:
a a a
a a c
a a
a a
Since L = U , we have:
∫a d = ∫a h = L = U = d.
b b
(e) xxx
(f) xxx
$$g(x) = \int_a^x f.$$
Then g ′ = f .
$$\frac{g(d) - g(c)}{d - c} = \frac{\int_a^d f - \int_a^c f}{d - c} = \frac{\int_c^d f}{d - c},$$
where the last step uses the Adjacent Intervals Rule (Theorem 25).
Let y = f (c) (note that this is a constant). Let $S = \dfrac{g(d) - g(c)}{d - c} - y$.
Observe that:
d − c ∫c
We have just shown that for any δ > 0, there exists ε > 0 such that for all d ∈ (c − ε, c + ε),
we have:
S ∈ Nδ (0) .
Thus: $\lim_{d \to c} S = 0$.
That is: $\lim_{d \to c} \left[\dfrac{g(d) - g(c)}{d - c} - y\right] = 0$.
But: $\lim_{d \to c} \left[\dfrac{g(d) - g(c)}{d - c} - y\right] = \lim_{d \to c} \dfrac{g(d) - g(c)}{d - c} - \lim_{d \to c} y = g'(c) - y = g'(c) - f(c)$.
Hence, g ′ (c) = f (c). Since c was chosen arbitrarily from [a, b], we have g ′ = f .
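The conclusion g ′ = f can be checked numerically. The Python sketch below is an illustration under stated assumptions: it takes f = cos and a = 0 (neither is from the text), approximates g with a midpoint rule, and approximates g ′ (c) with a central difference. The helper names `integral`, `g`, and `derivative` are hypothetical.

```python
import math


def integral(f, a, x, n=2000):
    """Approximate the integral of f from a to x with the midpoint rule."""
    h = (x - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def g(x):
    # g(x) is the integral of cos from 0 to x, so g should be close to sin.
    return integral(math.cos, 0.0, x)


def derivative(fn, c, h=1e-4):
    """Central-difference approximation of fn'(c)."""
    return (fn(c + h) - fn(c - h)) / (2 * h)


c = 1.0
approx = derivative(g, c)
print(approx, math.cos(c))  # the two values should nearly agree
```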
Fact 152. Suppose a, b, c ∈ R with a ≠ 0 and $d = \sqrt{|b^2 - 4ac|}$. Then:
$$\int \frac{1}{ax^2 + bx + c}\,dx = \begin{cases} \dfrac{1}{d} \ln \left|\dfrac{x + \frac{b - d}{2a}}{x + \frac{b + d}{2a}}\right| + C & \text{for } b^2 - 4ac > 0, \\[3ex] -\dfrac{1}{a\left(x + \frac{b}{2a}\right)} + C & \text{for } b^2 - 4ac = 0, \\[3ex] \dfrac{2}{d} \tan^{-1} \dfrac{2ax + b}{d} + C & \text{for } b^2 - 4ac < 0. \end{cases}$$
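One way to sanity-check the three cases of Fact 152 is to differentiate each antiderivative numerically and compare with the integrand. The Python sketch below does this with the formulas written out explicitly: (1/d) ln|(x + (b − d)/2a)/(x + (b + d)/2a)| for a positive discriminant, −1/[a(x + b/2a)] for a zero discriminant, and (2/d) tan⁻¹[(2ax + b)/d] for a negative discriminant. The helper names `antiderivative` and `check`, and the sample coefficients, are assumptions for illustration.

```python
import math


def antiderivative(a, b, c):
    """Return F with F'(x) = 1 / (a x^2 + b x + c), taking the constant C = 0."""
    disc = b * b - 4 * a * c
    d = math.sqrt(abs(disc))
    if disc > 0:
        return lambda x: math.log(
            abs((x + (b - d) / (2 * a)) / (x + (b + d) / (2 * a)))
        ) / d
    if disc == 0:
        return lambda x: -1.0 / (a * (x + b / (2 * a)))
    return lambda x: (2.0 / d) * math.atan((2 * a * x + b) / d)


def check(a, b, c, x, h=1e-6):
    """Absolute gap between a central-difference F'(x) and the integrand at x."""
    F = antiderivative(a, b, c)
    numeric = (F(x + h) - F(x - h)) / (2 * h)
    exact = 1.0 / (a * x * x + b * x + c)
    return abs(numeric - exact)


for coeffs in [(1, 3, 2), (1, 2, 1), (1, 0, 1), (-1, 0, 4), (-2, 1, -3)]:
    print(coeffs, check(*coeffs, x=0.5))
```

The sample coefficients cover all three signs of the discriminant and both signs of a; the evaluation point x = 0.5 is chosen away from the roots of each quadratic.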
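$$\frac{1}{ax^2 + bx + c} = \frac{1}{a\left(x + \frac{b + d}{2a}\right)\left(x + \frac{b - d}{2a}\right)} \overset{1}{=} \frac{1}{a}\left(\frac{A}{x + \frac{b + d}{2a}} + \frac{B}{x + \frac{b - d}{2a}}\right) = \frac{1}{a} \cdot \frac{(A + B)x + A\frac{b - d}{2a} + B\frac{b + d}{2a}}{\left(x + \frac{b + d}{2a}\right)\left(x + \frac{b - d}{2a}\right)}.$$
Comparing coefficients of x gives A + B = 0. Comparing constant terms:
$$A\frac{b - d}{2a} + B\frac{b + d}{2a} = \frac{1}{2a}\left[(A + B)b + (B - A)d\right] \overset{2}{=} \frac{d}{2a}(B - A) \overset{3}{=} 1.$$
$$2B = \frac{2a}{d} \quad \text{or} \quad B = \frac{a}{d} \quad \text{and} \quad A = \frac{-a}{d}.$$
Thus:
$$\int \frac{1}{ax^2 + bx + c}\,dx = \frac{1}{a}\left(\int \frac{A}{x + \frac{b + d}{2a}}\,dx + \int \frac{B}{x + \frac{b - d}{2a}}\,dx\right) = \int \frac{-\frac{1}{d}}{x + \frac{b + d}{2a}}\,dx + \int \frac{\frac{1}{d}}{x + \frac{b - d}{2a}}\,dx$$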
$$\int \frac{1}{\sqrt{ax^2 + bx + c}}\,dx = \frac{1}{\sqrt{|a|}} \sin^{-1} \frac{-2ax - b}{\sqrt{b^2 - 4ac}} + C.$$
Remark 154. Note that if a < 0 and ax2 + bx + c > 0, then we must have b2 − 4ac > 0.
Also, it is possible to verify that $\dfrac{-2ax - b}{\sqrt{b^2 - 4ac}} \in [-1, 1] = \text{Domain} \sin^{-1}$.
1 −1 x + 2a 1 −1 x + 2a 1 −1 −2ax − b
b b
= √ sin + = √ + = √ √ + C.
3
√ C sin √ C sin
∣a∣ ∣a∣ ∣a∣ 2 − 4ac
∣ b2∣a∣ ∣
2 −4ac b2 −4ac
2∣a∣
b
At =, we can remove the absolute value sign because the expression inside is positive.
3
Proof. Let $u = \tan^{-1} \dfrac{x}{\sqrt{a}} \in \left(-\dfrac{\pi}{2}, \dfrac{\pi}{2}\right)$. So, $\sqrt{a} \tan u = x$ and $\sqrt{a} \sec^2 u \overset{1}{=} \dfrac{dx}{du}$. And now:
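$$\int \frac{1}{\sqrt{ax^2 + bx + c}}\,dx = \frac{1}{\sqrt{a}} \ln \left|\sqrt{x^2 + \frac{b}{a}x + \frac{c}{a}} + x + \frac{b}{2a}\right| + C.$$
$$ax^2 + bx + c \overset{1}{=} a\left[\left(x + \frac{b}{2a}\right)^2 + \frac{4ac - b^2}{4a^2}\right].$$
So:
$$\int \frac{1}{\sqrt{ax^2 + bx + c}}\,dx = \frac{1}{\sqrt{a}} \int \frac{1}{\sqrt{\left(x + \frac{b}{2a}\right)^2 + \frac{4ac - b^2}{4a^2}}}\,dx = \frac{1}{\sqrt{a}} \ln \left|\sqrt{\left(x + \frac{b}{2a}\right)^2 + \frac{4ac - b^2}{4a^2}} + x + \frac{b}{2a}\right| + C$$
$$= \frac{1}{\sqrt{a}} \ln \left|\sqrt{x^2 + \frac{b}{a}x + \frac{c}{a}} + x + \frac{b}{2a}\right| + C.$$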
Remark 155. We could also have simply verified the above results (i.e. simply show that
the derivative of the RHS is equal to the integrand on the LHS), but that would’ve been
less enlightening.
Fact 159.
Proof. By our general definition of exponents (Definition 271), $x^n = \exp(n \ln x)$. So:
$$\ln x^n = \ln\left[\exp(n \ln x)\right] \overset{1}{=} n \ln x,$$
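$5^{1} = 5,$
$5^{1.7} = 5^{\frac{17}{10}} = \left(\sqrt[10]{5}\right)^{17} \approx (1.174\,619\ldots)^{17} \approx 15.425\ldots$
$5^{1.73} = 5^{\frac{173}{100}} = \left(\sqrt[100]{5}\right)^{173} \approx (1.016\,224\,6\ldots)^{173} \approx 16.188\ldots$
$5^{1.732} = 5^{\frac{1\,732}{1\,000}} = \left(\sqrt[1\,000]{5}\right)^{1\,732} \approx (1.001\,610\,73\ldots)^{1\,732} \approx 16.241\ldots$
$5^{1.732\,0} = 5^{\frac{17\,320}{10\,000}} = \left(\sqrt[10\,000]{5}\right)^{17\,320} \approx (1.000\,160\,957\ldots)^{17\,320} \approx 16.241\ldots$
$5^{1.732\,05} = 5^{\frac{173\,205}{100\,000}} = \left(\sqrt[100\,000]{5}\right)^{173\,205} \approx (1.000\,016\,094\,5\ldots)^{173\,205} \approx 16.242\ldots$
$5^{1.732\,050} = 5^{\frac{1\,732\,050}{1\,000\,000}} = \left(\sqrt[1\,000\,000]{5}\right)^{1\,732\,050} \approx (1.000\,001\,609\,44\ldots)^{1\,732\,050} \approx 16.242\ldots$
$5^{1.732\,050\,8} = 5^{\frac{17\,320\,508}{10\,000\,000}} = \left(\sqrt[10\,000\,000]{5}\right)^{17\,320\,508} \approx (1.000\,000\,160\,944\ldots)^{17\,320\,508} \approx 16.242\ldots$
And so informally, we might say that $5^{\sqrt{3}} \approx 16.242\,4\ldots$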
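The approximations above can be reproduced with a short Python sketch. It truncates √3 to 0, 1, ..., 7 decimal places and raises 5 to each truncated exponent using floating-point arithmetic. The function name `truncations_of_sqrt3` is an assumption for illustration, and the values printed are only as accurate as double-precision floats allow.

```python
import math


def truncations_of_sqrt3(max_digits):
    """Yield sqrt(3) truncated to 0, 1, ..., max_digits decimal places."""
    for k in range(max_digits + 1):
        scale = 10 ** k
        yield math.floor(math.sqrt(3) * scale) / scale


exponents = list(truncations_of_sqrt3(7))
approximations = [5 ** r for r in exponents]
for r, value in zip(exponents, approximations):
    print(r, value)
```

The printed values increase towards 5 raised to the power √3, which is the informal idea of defining an irrational power as a limit of rational powers.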
A little more formally, we might say that $5^{\sqrt{3}}$ is the limit of the following sequence:
Definition 271. Let b > 0 and x ∈ R. Then b raised to the power of x, denoted $b^x$, is
defined as the following number:
$$b^x = \exp(x \ln b).$$
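A minimal Python sketch of this definition follows. The function name `power` is an assumption for illustration; it simply evaluates exp(x ln b) with the standard `math` module and rejects b ≤ 0, which this definition does not cover.

```python
import math


def power(b, x):
    """b raised to the power x for b > 0, computed as exp(x * ln b)."""
    if b <= 0:
        raise ValueError("this definition requires b > 0")
    return math.exp(x * math.log(b))


print(power(2, 10))        # close to 1024
print(power(5, 0.5))       # close to the square root of 5
print(power(5, math.sqrt(3)))
```

Up to floating-point rounding, the results agree with the familiar integer and rational powers, and the usual laws such as $b^x b^y = b^{x+y}$ can be spot-checked the same way.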
Remark 156. The above definition is fairly general, but still fails to cover the case where
the base is negative, i.e. b < 0. To cover also that case, we’d have to learn a bit more
about complex numbers and in particular the complex natural logarithm function ln
(whose domain is every complex number except 0).
It turns out that with the complex natural logarithm function, we can define $b^x$ when
b < 0 exactly as we did above:
$$b^x = \exp(x \ln b).$$
This though is of course quite a bit beyond H2 Maths and so we shall discuss this no
further.
Observe that in the special case where b = 0, the above general definition of exponentiation
coincides with Definition 26, which covered only the special case where b ∈ Z.
The following Proposition shows that for b > 0 and any x ∈ R, Definitions 271 and 185 imply
and therefore supersede the four definitions of exponents and logarithms given in Ch. 5.4:
Proposition 21. If b > 0 and x ∈ R, then Definition 271 implies Definitions (a) 26; (b)
27; (c) 28; and (d) 30.
409
Which were in turn formally defined as Definitions 184 and 59 in the main text.
Proof. Below, $\overset{\star}{=}$ and $\overset{\circ}{=}$ indicate the use of Definition 271 and the fact that exp and ln are
each other’s inverse.
(a) We will show that in each of three possible cases, Definition 271 implies Definition 26.
Case 1. If x = 0, then by Fact 160(a):
$$b^0 \overset{\star}{=} \exp 0 = 1.$$
$$b^x \overset{\star}{=} \exp(x \ln b) = \exp(\underbrace{\ln b + \ln b + \dots + \ln b}_{x \text{ times}}) = \underbrace{\exp(\ln b) \exp(\ln b) \dots \exp(\ln b)}_{x \text{ times}} \overset{\circ}{=} \underbrace{b \cdot b \cdot \dots \cdot b}_{x \text{ times}}.$$
$$b^x \overset{\star}{=} \exp(x \ln b) = \exp(\underbrace{-\ln b - \ln b - \dots - \ln b}_{|x| \text{ times}}) = \exp\Big(\underbrace{\ln \tfrac{1}{b} + \ln \tfrac{1}{b} + \dots + \ln \tfrac{1}{b}}_{|x| \text{ times}}\Big) = \underbrace{\exp\left(\ln \tfrac{1}{b}\right) \exp\left(\ln \tfrac{1}{b}\right) \dots \exp\left(\ln \tfrac{1}{b}\right)}_{|x| \text{ times}}$$
$$\left(b^{\frac{1}{x}}\right)^x \overset{\star}{=} \left[\exp\left(\tfrac{1}{x} \ln b\right)\right]^x \overset{\star}{=} \exp\left\{x \ln\left[\exp\left(\tfrac{1}{x} \ln b\right)\right]\right\} \overset{\circ}{=} \exp\left[x\left(\tfrac{1}{x} \ln b\right)\right] = \exp(\ln b) \overset{\circ}{=} b.$$
(c) is immediate from (a) and (b).
(d) $b^x = n \overset{\star}{\iff} \exp(x \ln b) = n \iff \ln\left[\exp(x \ln b)\right] = \ln n \overset{\circ}{\iff} x \ln b = \ln n \iff x = \dfrac{\ln n}{\ln b} = \log_b n$.
(The last step uses Definition 185.)
In the main text, we proved the following Laws of Exponents only in the special and simple
case where the exponents x and y are positive integers. We will now prove them more
generally:
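Proof. Below, $\overset{\star}{=}$ indicates the use of Definition 271.
(a) $b^x b^y \overset{\star}{=} \exp(x \ln b) \exp(y \ln b) = \exp(x \ln b + y \ln b) = \exp\left[(x + y) \ln b\right] \overset{\star}{=} b^{x+y}$.
(b) $b^{-x} \overset{\star}{=} \exp(-x \ln b) \overset{1}{=} \exp(-\ln b^x) \overset{2}{=} \exp\left(\ln \dfrac{1}{b^x}\right) \overset{\circ}{=} \dfrac{1}{b^x}$, where $\overset{1}{=}$ and $\overset{2}{=}$ use Facts 159 and
158(c).
(c) $b^{x-y} \overset{\star}{=} \exp\left[(x - y) \ln b\right] = \exp(x \ln b - y \ln b) \overset{3}{=} \dfrac{\exp(x \ln b)}{\exp(y \ln b)} \overset{\star}{=} \dfrac{b^x}{b^y}$, where $\overset{3}{=}$ uses Fact 160(e).
(e) $(ab)^x \overset{\star}{=} \exp\left[x \ln(ab)\right] \overset{5}{=} \exp\left[x(\ln a + \ln b)\right] = \exp(x \ln a + x \ln b) \overset{6}{=} \left[\exp(x \ln a)\right]\left[\exp(x \ln b)\right] \overset{\star}{=} a^x b^x$.
$$e^x = \exp x.$$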
Theorem 41. (AP.) If A and B are disjoint, finite sets, then ∣A ∪ B∣ = ∣A∣ + ∣B∣.
A ∪ B = {a1 , a2 , . . . , ap , b1 , b2 , . . . , bq } .
Corollary 39. If $A_1, A_2, \dots, A_n$ are disjoint, finite sets, then $\left|\bigcup_{i=1}^{n} A_i\right| = \sum_{i=1}^{n} |A_i|$.
Theorem 42. (MP.) If A and B are finite sets, then ∣A × B∣ = ∣A∣ × ∣B∣.
Theorem 44. (CP.) If A and B are finite sets and A ⊆ B, then ∣B ∖ A∣ = ∣B∣ − ∣A∣.
Proof. By the corollary to the AP, $\left|\bigcup_{i=1}^{n} B_i\right| = \sum_{i=1}^{n} |B_i|$. The result then follows by the CP.
We also know that m distinct objects have m! (linear) permutations and (m − 1)! circular
permutations.
A reasonable conjecture might thus be that the number of circular permutations of the
above n objects is
$$\frac{(n - 1)!}{r_1!\, r_2! \dots r_k!}.$$
The above conjecture sometimes “works” — e.g. SEE has 3!/2! = 3 (linear) permutations
and SEE indeed also has (3 − 1)!/2! = 1 circular permutation. However and unfortunately,
this conjecture is, in general, incorrect. Here are two counter-examples.
Example 1236. There is 3!/3! = 1 (linear) permutation of the three letters AAA.
If the above conjecture were true, then there ought to be (3 − 1)!/3! = 2!/3! = 1/3 circular
permutations of AAA. But this is not even an integer, so obviously it cannot be the
number of circular permutations of AAA. In fact, there is also exactly 1 circular permutation
of AAA.
Example 1237. There are 6!/ (3!3!) = 20 (linear) permutations of the six letters
AAABBB.
If the above conjecture were true, then there ought to be (6 − 1)!/ (3!3!) = 10/3 circular
permutations of AAABBB. But this is not even an integer, so obviously it cannot be
the number of circular permutations of AAABBB. In fact, there are exactly 4 circular
permutations of AAABBB.
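The counts in these two examples can be confirmed by brute force. The Python sketch below treats two arrangements as the same circular permutation if one is a rotation of the other (reflections are counted as different), and counts equivalence classes by picking the lexicographically smallest rotation as a canonical form. The function name `count_circular_permutations` is an assumption for illustration, and the approach is only practical for short words.

```python
from itertools import permutations


def count_circular_permutations(letters):
    """Count arrangements of `letters` around a circle, up to rotation."""
    canonical_forms = set()
    for p in set(permutations(letters)):
        n = len(p)
        # Canonical representative: the lexicographically smallest rotation.
        canonical_forms.add(min(p[i:] + p[:i] for i in range(n)))
    return len(canonical_forms)


for word in ["SEE", "AAA", "AAABBB", "ABC"]:
    print(word, count_circular_permutations(word))
```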
A general solution (i.e. formula) is possible but is a bit too advanced for A-Levels.410
410
See e.g. this Handbook on Combinatorics.
122.3. Probability
Proposition 14 (p. 945 above). Let S be the sample space, Σ be the corresponding
event space, and A, B be events. If the probability function P ∶ Σ → R satisfies the
Kolmogorov Axioms, then P also satisfies the following properties:
1. Complements. $P(A) = 1 - P(A^c)$.
2. Probability of Empty Event is Zero. P(∅) = 0.
3. Monotonicity. If B ⊆ A, then P(B) ≤ P(A).
4. Probabilities Are At Most One. P(A) ≤ 1.
5. Inclusion-Exclusion. P(A ∪ B) = P(A) + P(B) − P(A ∩ B).
Proposition 15 (p. 981). The expectation operator E is linear. That is, if X and Y
are random variables and c is a constant, then
(a) Additivity: E[X + Y ] = E [X] + E [Y ],
(b) Homogeneity of degree 1: E[cX] = cE [X].
Proof. This proposition applies even for non-discrete random variables. But we’ll prove
this proposition only for the case where the random variable is discrete.
We’ll use the linearity of the expectation operator. We prove (b) first.
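(a) $$E[X + Y] = \sum_{k \in \text{Range}(X)} \sum_{l \in \text{Range}(Y)} P(X = k, Y = l) \cdot (k + l)$$
$$= \sum_{k \in \text{Range}(X)} k \sum_{l \in \text{Range}(Y)} P(X = k, Y = l) + \sum_{l \in \text{Range}(Y)} l \sum_{k \in \text{Range}(X)} P(X = k, Y = l)$$
$$= \sum_{k \in \text{Range}(X)} k P(X = k) + \sum_{l \in \text{Range}(Y)} l P(Y = l)$$
$$= E[X] + E[Y].$$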
Proof. We use Fact 175 and the linearity of the expectation operator.
$$V[X + Y] = E\left[(X + Y)^2\right] - \left(E[X + Y]\right)^2 = E\left[X^2 + Y^2 + 2XY\right] - \left(E[X] + E[Y]\right)^2$$
Lemma 13. If X and Y are independent random variables, then E [XY ] = E [X] E [Y ].
Proof. We prove this Lemma only for the case where X and Y are both discrete.
$$E[XY] = \sum_k \sum_l P(X = k, Y = l) \cdot kl$$
$$= \sum_k \sum_l P(X = k) P(Y = l) \cdot kl \quad \text{(independence)}$$
$$= \sum_k \left(P(X = k)\, k \sum_l P(Y = l) \cdot l\right) = \sum_k \left(P(X = k)\, k\, E[Y]\right)$$
$$= E[Y] \sum_k P(X = k)\, k = E[Y] E[X].$$
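Both the additivity of E and the product rule for independent random variables can be checked exactly on a small example. The Python sketch below uses two independent fair dice, which is an assumption chosen for illustration, and exact fractions to avoid rounding. It also shows that the product rule can fail without independence by taking Y = X.

```python
from fractions import Fraction
from itertools import product

# Distribution of one fair die: each face 1..6 has probability 1/6.
die = {k: Fraction(1, 6) for k in range(1, 7)}

# Joint distribution of two independent dice X and Y.
joint = {(k, l): die[k] * die[l] for k, l in product(die, die)}

e_x = sum(p * k for k, p in die.items())                   # E[X] = E[Y]
e_sum = sum(p * (k + l) for (k, l), p in joint.items())    # E[X + Y]
e_prod = sum(p * k * l for (k, l), p in joint.items())     # E[XY], independent case
e_square = sum(p * k * k for k, p in die.items())          # E[X * X], dependent case Y = X

print(e_x, e_sum, e_prod, e_square)
```

Here E[X + Y] equals E[X] + E[Y] and E[XY] equals E[X]E[Y], whereas E[X · X] differs from E[X]², since X is not independent of itself.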
(Explanation: If the next flip is H, then we’ve completed HH and this took us only 1 more
flip. If instead the next flip is T , then we start all over again; we’ve already taken 1 flip
and are expected to take another p flips.) Similarly, observe that
(Explanation: If the next flip is H, then we expect to take, in addition, another q flips. If
instead the next flip is T , then we start all over again; we’ve already taken 1 flip and are
expected to take another p flips.)
Hence, p = 6 = µX . The reasoning used above is illustrated by the probability tree below.
Let’s now find µY . Again, let
(Explanation: If the next flip is T , then we’ve completed HT and this took us only 1 more
flip. If instead the next flip is H, then we’ve already taken 1 flip and are expected to take
another s flips.)
So s = 2. Similarly, observe that
(Explanation: If the next flip is H, then we’ve already taken 1 flip and are expected to
take another s flips. If the next flip is T , then we’ve already taken 1 flip and are expected
to take another r flips.)
So r = 4 = µY .
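The two answers, 6 expected flips for HH and 4 for HT, can be checked by simulation. The Python sketch below is an illustration only: the function names `flips_until` and `average_flips`, the number of trials, and the seed are assumptions, and the averages it prints are estimates subject to sampling error rather than exact values.

```python
import random


def flips_until(pattern, rng):
    """Flip a fair coin until `pattern` (e.g. "HH") appears; return the flip count."""
    history = ""
    while not history.endswith(pattern):
        history += rng.choice("HT")
    return len(history)


def average_flips(pattern, trials=100_000, seed=0):
    """Estimate the expected number of flips needed to see `pattern`."""
    rng = random.Random(seed)
    return sum(flips_until(pattern, rng) for _ in range(trials)) / trials


print(average_flips("HH"))  # should be close to 6
print(average_flips("HT"))  # should be close to 4
```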
(b) Let Si be the random variable that indicates whether the ith pair of consecutive coin-
flips is HH. That is, Si = 1 if so and Si = 0 if not. Then
$$A = \frac{S_1 + S_2 + \dots + S_n}{n}.$$
And so, $E[A] = E\left[\dfrac{S_1 + S_2 + \dots + S_n}{n}\right] = \dfrac{1}{n} \sum_{i=1}^{n} E[S_i]$.
Fact 230. $\displaystyle\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}$.
Fact 180 (p. 1008). Let Z ∼ N(0, 1) and φ and Φ be its PDF and CDF.
1. Φ(∞) = 1. (As with any random variable, the area under the entire PDF is 1.)
2. φ (a) > 0, for all a ∈ R. (The PDF is positive everywhere. This has a surprising
implication: however large a is, there is always some non-zero probability that Z ≥ a.)
3. E [Z] = 0. (The mean of Z is 0.)
4. The PDF φ reaches a global maximum at the mean 0. (In fact, we can go ahead and
compute $\varphi(0) = 1/\sqrt{2\pi} \approx 0.399$.)
5. Var [Z] = 1. (The variance of Z is 1.)
6. P (Z ≤ a) = P (Z < a). (We’ve already discussed this earlier. It makes no difference
whether the inequality is strict. This is because P(Z = a) = 0.)
7. The PDF φ is symmetric about the mean. This has several implications:
(a) P (Z ≥ a) = P (Z ≤ −a) = Φ(−a).
(b) Since P (Z ≥ a) = 1 − P (Z ≤ a) = 1 − Φ (a), it follows that Φ(−a) = 1 − Φ (a) or,
equivalently, Φ (a) = 1 − Φ(−a).
(c) Φ (0) = 1 − Φ (0) = 0.5.
8. P (−1 ≤ Z ≤ 1) = Φ (1) − Φ (−1) ≈ 0.6827. (There is probability 0.6827 that Z takes on
values within 1 standard deviation of the mean.)
9. P (−2 ≤ Z ≤ 2) = Φ (2) − Φ (−2) ≈ 0.9545. (There is probability 0.9545 that Z takes on
values within 2 standard deviations of the mean.)
10. P (−3 ≤ Z ≤ 3) = Φ (3) − Φ (−3) ≈ 0.9973. (There is probability 0.9973 that Z takes on
values within 3 standard deviations of the mean.)
11. The PDF φ has two points of inflexion, namely at ±1. (The points of inflexion are one
standard deviation away from the mean.)
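Several of the numerical claims in this list can be reproduced with Python's standard library. The sketch below writes the standard normal PDF directly and expresses the CDF through the error function `math.erf`, using the identity Φ(a) = ½[1 + erf(a/√2)]. The function names `phi` and `Phi` are chosen for illustration.

```python
import math


def phi(a):
    """PDF of the standard normal distribution."""
    return math.exp(-0.5 * a * a) / math.sqrt(2 * math.pi)


def Phi(a):
    """CDF of the standard normal distribution, via the error function."""
    return 0.5 * (1 + math.erf(a / math.sqrt(2)))


print(phi(0))                 # about 0.399
print(Phi(0))                 # 0.5
for k in (1, 2, 3):
    print(k, Phi(k) - Phi(-k))  # about 0.6827, 0.9545, 0.9973
print(Phi(-0.7), 1 - Phi(0.7))  # symmetry: the two values should agree
```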
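Proof. 1. Let $u = x/\sqrt{2}$. We have $u^2 = 0.5x^2$ and $du/dx = 1/\sqrt{2}$. And using Fact 230:
$$\Phi(\infty) = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-0.5x^2}\,dx = \frac{1}{\sqrt{2\pi}} \int_{x=-\infty}^{x=\infty} e^{-0.5x^2} \sqrt{2}\,\frac{du}{dx}\,dx = \frac{1}{\sqrt{\pi}} \int_{u=-\infty}^{u=\infty} e^{-u^2}\,du = \frac{1}{\sqrt{\pi}} \sqrt{\pi} = 1.$$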
φ is continuous, increasing for a < 0 and decreasing for a > 0. Thus, φ reaches a global
maximum at 0. By plugging in a = 0, we can compute this global maximum value to be
$\varphi(0) = 1/\sqrt{2\pi} \approx 0.399$.
6. By the Additivity Axiom, P (Z ≤ a) = P (Z < a or Z = a) = P (Z < a) + P (Z = a) = P (Z < a) +
0 = P (Z < a), as desired.
7. Clearly, $\varphi(a) = e^{-0.5a^2}/\sqrt{2\pi} = e^{-0.5(-a)^2}/\sqrt{2\pi} = \varphi(-a)$ for all a ∈ R. Thus, φ is symmetric
about the vertical axis x = 0, which is also the mean.
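7(a). Using the substitution u = −x, we have du/dx = −1 and
$$P(Z \ge a) = \int_{x=a}^{x=\infty} \frac{e^{-0.5x^2}}{\sqrt{2\pi}}\,dx = \int_{u=-a}^{u=-\infty} \frac{-e^{-0.5u^2}}{\sqrt{2\pi}}\,du = \int_{u=-\infty}^{u=-a} \frac{e^{-0.5u^2}}{\sqrt{2\pi}}\,du = P(Z \le -a) = \Phi(-a).$$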
Hence, ±1 are the only two points of inflexion since φ changes concavity only here.
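Case #2. If a < 0, then $F_Y(c) = \dots = P(aX \le c - b) = P\left(X \ge \dfrac{c - b}{a}\right) = 1 - F_X\left(\dfrac{c - b}{a}\right)$.
Now differentiate:
$$f_Y(c) = \frac{d}{dc} F_Y(c) = \frac{d}{dc}\left[1 - F_X\left(\frac{c - b}{a}\right)\right] = -\frac{1}{a} f_X\left(\frac{c - b}{a}\right) = \frac{1}{|a|} f_X\left(\frac{c - b}{a}\right).$$
$$f_{aX+b}(c) = \frac{1}{|a|} f_X\left(\frac{c - b}{a}\right) = \frac{1}{|a|} \cdot \frac{1}{\sigma\sqrt{2\pi}}\, e^{-0.5\left(\frac{\frac{c - b}{a} - \mu}{\sigma}\right)^2} = \frac{1}{|a|\,\sigma\sqrt{2\pi}}\, e^{-0.5\left[\frac{c - (a\mu + b)}{a\sigma}\right]^2}.$$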
But this lattermost expression is indeed the PDF of the random variable with distribution
N (aµ + b, a2 σ 2 ).
Fact 183 (p. 1054). Let S = (X1 , X2 , . . . , Xn ) be a random sample of size n. Let X̄ be
the sample mean and S 2 be the sample variance. Let a ∈ R be a constant. Then
Proof. This proof may look intimidating but it’s really just a bunch of tedious algebra. (I’ve
also tried to go slow with the algebra, so more steps are explicitly listed than is typical in
a proof.)
(a) Start from the definition of the sample variance and do the algebra:
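Here every ∑ runs over i = 1, …, n.
\[
\begin{aligned}
S^2 &= \frac{\sum (X_i - \bar{X})^2}{n-1}
= \frac{\sum (X_i^2 + \bar{X}^2 - 2\bar{X}X_i)}{n-1}
= \frac{\sum X_i^2 + \sum \bar{X}^2 - \sum (2\bar{X}X_i)}{n-1} \\
&= \frac{\sum X_i^2 + n\bar{X}^2 - 2\bar{X}\sum X_i}{n-1}
= \frac{\sum X_i^2 + n\bar{X}^2 - 2\bar{X}(n\bar{X})}{n-1}
= \frac{\sum X_i^2 - n\bar{X}^2}{n-1} \\
&= \frac{\sum X_i^2 - n\left[\frac{\sum X_i}{n}\right]^2}{n-1}
= \frac{\sum X_i^2 - \frac{\left[\sum X_i\right]^2}{n}}{n-1}.
\end{aligned}
\]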
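The identity in (a) can be sanity-checked on a small data set. The sketch below uses exact fractions so that no rounding is involved; the data set and the function names are invented for illustration.

```python
from fractions import Fraction

def sample_variance_definition(xs):
    """S^2 computed from the definition: sum of squared deviations over n - 1."""
    n = len(xs)
    mean = Fraction(sum(xs), n)
    return sum((Fraction(x) - mean) ** 2 for x in xs) / (n - 1)

def sample_variance_shortcut(xs):
    """S^2 computed from the formula in (a): [sum(x^2) - (sum x)^2 / n] / (n - 1)."""
    n = len(xs)
    sum_of_squares = sum(Fraction(x) ** 2 for x in xs)
    square_of_sum = Fraction(sum(xs)) ** 2
    return (sum_of_squares - square_of_sum / n) / (n - 1)

data = [2, 4, 4, 5, 7, 9]  # hypothetical sample
print(sample_variance_definition(data), sample_variance_shortcut(data))
```

Both functions return the same exact fraction, as the algebra says they must.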
(b) Start from the formula found in (a) and do the algebra:
Here every ∑ runs over i = 1, …, n.
\[
\begin{aligned}
S^2 &= \frac{\sum X_i^2 - \frac{[\sum X_i]^2}{n}}{n-1}
= \frac{\sum (X_i - a + a)^2 - \frac{[\sum (X_i - a + a)]^2}{n}}{n-1} \\
&= \frac{\sum \left[(X_i - a)^2 + a^2 + 2(X_i - a)a\right] - \frac{[\sum (X_i - a) + \sum a]^2}{n}}{n-1} \\
&= \frac{\sum (X_i - a)^2 + \sum a^2 + 2a\sum (X_i - a) - \frac{[\sum (X_i - a)]^2 + (\sum a)^2 + 2\sum (X_i - a)\sum a}{n}}{n-1} \\
&= \frac{\sum (X_i - a)^2 + na^2 + 2a\sum (X_i - a) - \frac{[\sum (X_i - a)]^2 + (na)^2 + 2na\sum (X_i - a)}{n}}{n-1} \\
&= \frac{\sum (X_i - a)^2 - \frac{[\sum (X_i - a)]^2}{n}}{n-1}.
\end{aligned}
\]
Rearranging:
We’ve just shown that (Xi − X̄) is a biased estimator for σ 2 . And in turn, S 2 is not:
2
⎢ n−1 ⎥ ⎢ ⎥
⎣ ⎦ ⎣ ⎦
n n n
Definition 272. The random variable Tν with Student’s t-distribution with ν degrees of
freedom has PDF f ∶ R → R given by mapping rule
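\[
f(t) = \frac{\int_0^\infty x^{\frac{\nu+1}{2}-1} e^{-x}\,dx}{\sqrt{\nu\pi}\,\int_0^\infty x^{\frac{\nu}{2}-1} e^{-x}\,dx}\left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}}.
\]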
Thus, k = 29. Now, 29/900 ≈ 3.2%. Thus, at a 95% confidence level, the margin of error
is ±3.2%. This is the “true” margin of error, assuming we know µ. But this assumption
defeats the point of sampling — we don’t know µ, which is why we’re doing sampling in
the first place!
What we want instead is the margin of error in the case where µ is unknown.
Case #2: Without perfect hindsight: µ unknown.
With µ unknown, a conservative interpretation would be to find the smallest k such that
for all µ, P (900µ − k ≤ X ≤ 900µ + k) ≥ 0.95.
411
This is slightly different from what actually happened: (1) The actual random sampling was most likely
without replacement (which would change the maths slightly). (2) 100 votes were taken from each of 9
different polling stations (which would also change the maths slightly).
Observe that Var [X] = 900µ(1 − µ) is maximised at µ = 0.5. Thus, it is plausible412 that if
k satisfies
Our problem thus boils down to finding the smallest k such that for X ∼ B (900, 0.5) implies
We conclude that the smallest such k is 29. Now, 29/900 ≈ 3.2%. So the margin of error
may be given as ±3.2%. This is the same as what was calculated above, which is not
surprising, since 9142/23570 ≈ 0.388 is close to 0.5.
The reader will, of course, wonder why the Elections Department stated that the margin of
error was ±4%, rather than ±3.2% as I calculated here. I am not sure myself. My guess is
that they probably don’t bother going through all the above calculations afresh each time.
Instead, each time they report a sample count, they simply read off the margin of error
from a table that looks something like this:
(By the way, note that it is common to use the CLT approximation when calculating the
margin of error. I have not done so here. Instead, I’ve stuck with using the original, exact
binomial distribution.)
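The search for the smallest k can be reproduced with the exact binomial distribution. The sketch below covers only the worst case µ = 0.5 discussed above, with n = 900; the function names are made up for illustration, and integer arithmetic is used so that no approximation enters.

```python
from math import comb

def coverage(n: int, k: int) -> float:
    """P(n/2 - k <= X <= n/2 + k) for X ~ B(n, 0.5), with n even, computed exactly."""
    centre = n // 2
    favourable = sum(comb(n, x) for x in range(centre - k, centre + k + 1))
    return favourable / 2 ** n

def smallest_k(n: int, level: float) -> int:
    """Smallest k whose central interval has probability at least `level`."""
    k = 0
    while coverage(n, k) < level:
        k += 1
    return k

k = smallest_k(900, 0.95)
print(k, k / 900)
```

Dividing the resulting k by 900 gives the margin of error as a proportion of the sample.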
412
Proving this would need a little work though.
122.9. Correlation and Linear Regression
Proof. Let u = (x1 − x̄, x2 − x̄, . . . , xn − x̄) and v = (y1 − ȳ, y2 − ȳ, . . . , yn − ȳ) be n-dimensional
vectors. Then
But from what we learnt about vectors,413 if θ is the angle between two vectors, then:
cos θ = (u ⋅ v)/(∣u∣ ∣v∣).
413
Of course, in this textbook, we’ve only shown that this is true for two- and three-dimensional vectors.
But let’s just wave our hands and say that this is also true for higher-dimensional vectors.
Fact 187. Let (x1 , x2 , . . . , xn ) and (y1 , y2 , . . . , yn ) be two ordered sets of data. The OLS
regression line of y on x is y − ȳ = b̂ (x − x̄), where
(ii) b̂ = (∑ xᵢyᵢ − n x̄ ȳ)/(∑ xᵢ² − n x̄²).
Moreover, the regression line can also be written in the form y = â + b̂x, where b̂ is as given
above and â = ȳ − b̂x̄.
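Formula (ii) and the intercept â = ȳ − b̂x̄ translate directly into a few lines of code. The sketch below is only an illustration: the function name `ols_line` and the data are invented, and the data are chosen to lie exactly on y = 1 + 2x so that the answer is easy to check by hand.

```python
def ols_line(xs, ys):
    """Return (a_hat, b_hat) for the OLS regression line of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope from formula (ii): (sum x_i y_i - n x_bar y_bar) / (sum x_i^2 - n x_bar^2)
    b_hat = (sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar) / (
        sum(x * x for x in xs) - n * x_bar ** 2
    )
    a_hat = y_bar - b_hat * x_bar
    return a_hat, b_hat

xs = [1, 2, 3, 4]   # hypothetical data
ys = [3, 5, 7, 9]
print(ols_line(xs, ys))
```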
Proof. (Continued from the proof begun on p. 1098.) Remember that the data (x1 , x2 , . . . , xn )
and (y1 , y2 , . . . , yn ) are given. Thus, we can treat all the xi s and yi s as constants. We have:
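\[
P = P_b\left[\frac{T_{M,b}}{T_{M,b} + L_{M,b}(h - h_b)}\right]^{\frac{g_0' M}{R^* L_{M,b}}},
\]
\[
\begin{aligned}
P &= P_b\left[\frac{T_{M,b}}{T_{M,b} + L_{M,b}(h - h_b)}\right]^{\frac{g_0' M}{R^* L_{M,b}}}
= P_b\left[\frac{T_{M,b} + L_{M,b}(h - h_b)}{T_{M,b}}\right]^{-\frac{g_0' M}{R^* L_{M,b}}}
= P_b\left[1 + \frac{L_{M,b}}{T_{M,b}}(h - h_b)\right]^{-\frac{g_0' M}{R^* L_{M,b}}}, \\
\ln P &= \ln P_b - \frac{g_0' M}{R^* L_{M,b}} \ln\left[1 + \frac{L_{M,b}}{T_{M,b}}(h - h_b)\right].
\end{aligned}
\]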
Now, for heights up to 11 000 m above sea level, hb is simply the height at sea level. That
ln P = a + b ln(1 + (L/T) h).
For heights up to 11 000 m above sea level, L = −0.0065 kelvin per metre is the temperature
lapse rate (the rate at which the temperature changes as we go up in altitude; see p. 3, Table
4) and T = 288.15 kelvin is the standard sea-level temperature (also precisely equal to 15
°C; see p. 4).
We have: 8 057 ÷ 39 = 206 + 23/39 = 206 23/39.
A5. NOT-E: “It’s not raining.” NOT-F : “The grass is not wet.” NOT-G: “I’m not
sleeping.” NOT-H: “My eyes are not shut.”
A6(a) If x = 0.5, then O is true while N is false. Thus, we say that O and N are not
equivalent and write O ⇎ N.
(b) If x = −3, then γ is true while α is false. Thus, we say that γ and α are not equivalent
and write γ ⇎ α.
A7(a) NOT- (B AND C) is true. There are two ways to see this:
• Since B AND C is false, its negation NOT- (B AND C) must be true.
• By Fact 1, NOT- (B AND C) is equivalent to NOT-B OR NOT-C: “Germany is not in
Asia OR 1 + 1 ≠ 2”. Which is true because “Germany is not in Asia” (NOT-B) is true.
(b) NOT- (A AND D) is true. There are two ways to see this:
• Since A AND D is false, its negation NOT- (A AND D) must be true.
• By Fact 1, NOT- (A AND D) is equivalent to NOT-A OR NOT-D: “Germany is not in
Europe OR 1 + 1 ≠ 3”. Which is true because “1 + 1 ≠ 3” (NOT-D) is true.
(c) NOT- (B AND D) is true. There are two ways to see this:
• Since B AND D is false, its negation NOT- (B AND D) must be true.
• By Fact 1, NOT- (B AND D) is equivalent to NOT-B OR NOT-D: “Germany is not in
Asia OR 1 + 1 ≠ 3”. Which is true because “Germany is not in Asia” (NOT-B) is true.
Indeed, “1 + 1 ≠ 3” (NOT-D) is also true.
“If x > 1, then x > 0.” Or: “That x > 1 implies that x > 0.”
A11(a) The converse is “If the Nazis won World War II (WW2), then Tin Pei Ling (TPL)
is a genius.” True because the hypothesis is false.
(b) The converse is “If the Allies won WW2, then TPL is a genius.” False because the
hypothesis is true AND the conclusion is false.
(c) The converse is “If I am the king of the world, then π is rational.” True because the
hypothesis is false.
(d) The converse is “If Lee Hsien Loong is Lee Kuan Yew’s son, then π is rational.” False
because the hypothesis is true AND the conclusion is false.
1. “P ⟹ Q.”
2. “P .”
3. “Therefore, P .”
P AND NOT-Q.
K AND NOT-L.
By the way, (a) is the converse of the given statement, while (b) is the inverse = “Negate
both”.
A18. O ⟹ N is false (counterexample: x = 0.5) and so by Fact 7, N ⇎ O.
A19. No two of these three statements are equivalent. X ⇎ Y because John may
have a blue NRIC and thus not be a Singapore citizen. X ⇎ Z, because John may be
a newborn Singapore citizen who hasn’t yet obtained his pink NRIC. Y ⇎ Z because
John may have a blue but not a pink NRIC.
A21   UA                             UN                            PA                             PN
(a)   “All donzers are kiki.”        “No donzer is kiki.”          “Some donzer is kiki.”         “Some donzer is not kiki.”
(b)   “All donzers cause cancer.”    “No donzer causes cancer.”    “Some donzer causes cancer.”   “Some donzer does not cause cancer.”
(c)   “All bachelors are married.”   “No bachelor is married.”     “Some bachelor is married.”    “Some bachelor is not married.”
(d)   “All bachelors smoke.”         “No bachelor smokes.”         “Some bachelor smokes.”        “Some bachelor does not smoke.”
A24. “No person is LeBron James.” This is a UN with the subject “person” and the
predicate “LeBron James”.
The negation of this UN is the PA “Some person is LeBron James”. Which is true, since
there is a person who is LeBron James — namely LeBron James himself . Since the negation
is true, the commentator’s original statement is false.
Where we place the word not is crucial. The commentator wanted to negate the state-
ment “Everybody is LeBron James”, but placed the word not in the wrong position. He
incorrectly said, “Everybody is not LeBron James,” but should instead have said, “Not
everybody is LeBron James”. This latter statement is true and is equivalent to the PN
“Some person is not LeBron James”.
Moral of the story: Don’t say “All S are NOT-P ” when you mean “Not all S are P ”.
The statement would be true if we changed its premise: “If x ≥ 0, then √(x²) = x.” ✓
It would also be true if we changed its conclusion: “If x ∈ R, then √(x²) = ∣x∣.” ✓
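\[
\begin{aligned}
\text{A61(a)}\quad \frac{5^{4x}\cdot 25^{1-x}}{5^{2x+1} + 3\cdot 25^{x} + 17\cdot 5^{2x}}
&= \frac{5^{4x}\cdot 5^{2(1-x)}}{5^{2x+1} + 3\cdot 5^{2x} + 17\cdot 5^{2x}}
= \frac{5^{2+2x}}{5^{2x+1} + 3\cdot 5^{2x} + 17\cdot 5^{2x}} \\
&= \frac{5^{2+2x}}{5^{2x}(5^{1} + 3 + 17)} = \frac{5^{2+2x}}{5^{2x}\cdot 25} = \frac{5^{2+2x}}{5^{2x+2}} = 1.
\end{aligned}
\]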
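\[
= \sqrt{2}\,\frac{8^{x}(8^{2} - 34)}{8^{x}\sqrt{8}} = \sqrt{2}\,\frac{8^{2} - 34}{\sqrt{8}} = \sqrt{2}\,\frac{64 - 34}{\sqrt{8}}
= \sqrt{2}\,\frac{30}{\sqrt{8}} = \sqrt{2}\,\frac{30}{2\sqrt{2}} = \sqrt{2}\,\frac{15}{\sqrt{2}} = 15.
\]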
Let b = 2, x = 1, y = 2. Then b^(x^y) = 2^(1²) = 2¹ = 2, but b^(xy) = 2^(1⋅2) = 2² = 4. Hence, b^(x^y) ≠ b^(xy).
(b) (b^x)^y = b^(xy) is true and was already proven in Proposition 1(d) above.
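\[
\begin{aligned}
\text{A63.}\quad \frac{1}{\frac{x}{y} \pm \sqrt{\frac{x^2}{y^2} + 1}}
&= \frac{1}{\frac{x}{y} \pm \sqrt{\frac{x^2}{y^2} + 1}} \cdot \frac{\frac{x}{y} \mp \sqrt{\frac{x^2}{y^2} + 1}}{\frac{x}{y} \mp \sqrt{\frac{x^2}{y^2} + 1}}
= \frac{\frac{x}{y} \mp \sqrt{\frac{x^2}{y^2} + 1}}{\left(\frac{x}{y}\right)^2 - \left(\frac{x^2}{y^2} + 1\right)} \\
&= \frac{\frac{x}{y} \mp \sqrt{\frac{x^2}{y^2} + 1}}{-1}
= -\frac{x}{y} \pm \sqrt{\frac{x^2}{y^2} + 1}.
\end{aligned}
\]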
Observe that at the last step, the −1 in the denominator flips the ∓ into a ±.
A65(a) log₂ 32 + log₃ (1/27) = 5 − log₃ 27 = 5 − 3 = 2.
(b) log₃ 45 − log₉ 25 = log₃ 5 + log₃ 9 − (log₃ 25)/(log₃ 9) = log₃ 5 + 2 − (2 log₃ 5)/2 = 2.
(c) First, log₁₆ 768 = log₁₆ (256 × 3) = log₁₆ 256 + log₁₆ 3 = 2 + log₁₆ 3 = 2 + (1/4) log₂ 3.
Next, log₂ ⁴√3 = log₂ 3^(1/4) = (1/4) log₂ 3.
Hence, log₁₆ 768 − log₂ ⁴√3 = 2 + (1/4) log₂ 3 − (1/4) log₂ 3 = 2.
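All three answers to A65 equal 2, which is easy to confirm in floating point. This is only a numerical check, not a substitute for the laws of logarithms used above; the variable names are arbitrary.

```python
import math

# (a) log2(32) + log3(1/27)
answer_a = math.log2(32) + math.log(1 / 27, 3)
# (b) log3(45) - log9(25)
answer_b = math.log(45, 3) - math.log(25, 9)
# (c) log16(768) - log2(fourth root of 3)
answer_c = math.log(768, 16) - math.log2(3 ** 0.25)

print(answer_a, answer_b, answer_c)  # each should be 2 up to rounding error
```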
(b) The graph of the equation y = 3x + 2 is the set {(x, y) ∶ y = 3x + 2}.
(c) The graph of the equation y = 2x² + 1 is the set {(x, y) ∶ y = 2x² + 1}.
(e) y = x + 1 for x < 0, and y = x − 1 for x ≥ 0.
[Figures: graphs of (b), (c) and (e).]
Both ◊1 and ◊2 say that y = −π/2 is a vertical asymptote for the graph of y = tan x.
A73. The line y = 0 (the x-axis) is a horizontal asymptote for the graph of y = 1/x:
As x approaches −∞, y approaches 0 from below: lim_{x→−∞} y = lim_{x→−∞} 1/x = 0⁻.
And as x approaches ∞, y approaches 0 from above: lim_{x→∞} y = lim_{x→∞} 1/x = 0⁺.
[Figure: graph of y = 1/x, with vertical asymptote x = 0 and horizontal asymptote y = 0.]
The line x = 0 (the y-axis) is a vertical asymptote for the graph of y = 1/x:
As x approaches 0 from the left, y approaches −∞: lim_{x→0⁻} y = lim_{x→0⁻} 1/x = −∞.
And as x approaches 0 from the right, y approaches ∞: lim_{x→0⁺} y = lim_{x→0⁺} 1/x = ∞.
A74. Refer to graphs and table below. For (c), for each k ∈ Z, let Ek = (2kπ, 1) and
Fk = ((2k + 1) π, 1) — note that there are infinitely many points Ek and Fk .
(a) y = x2 + 1. y (b) y = x2 + 1, y
−1 ≤ x ≤ −1
B = (1, 2) D = (1, 2)
A = (0, 1) C = (0, 1) x
x
(c) y = cos x. y
F1 = (3π, 1)
F−1 = (−π, 1) F0 = (π, 1)
A B C D Ek Fk G H I
GMax 3 3 3 3 (d) y = cos x, y
SGMax 3 −1 ≤ x ≤ 1
H = (0, 1)
LMax 3 3 3 3
SLMax 3 3 3 3
GMin 3 3 3 3 3
SGMin 3 3 I = (1, cos 1)
LMin 3 3 3 3 3 G = (−1, cos (−1))
SLMin 3 3 3 3 3 x
Turning 3 3 3 3 3
(2 × (−2) − 8, 2 × 4 − 5) = (−12, 3) .
x = −1
y = x2 + 2x + 2
                    (a)              (b)                   (c)
x-intercepts        None             (−1/2, 0), (1, 0)     (−2, 0)
Line of symmetry    x = −1/4         x = 1/4               x = −2
Turning point       (−1/4, 7/8)      (1/4, 9/8)            (−2, 0)
The turning point in each of (a) and (c) is also the strict global minimum; and in (b), it
is also the strict global maximum.
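The entries for the three quadratics (a) y = 2x² + x + 1, (b) y = −2x² + x + 1 and (c) y = x² + 4x + 4 can be recomputed from the coefficients. In the sketch below, the function name `quadratic_summary` is invented; the line of symmetry is x = −b/(2a), the turning point lies on it, and the x-intercepts come from the quadratic formula.

```python
from fractions import Fraction
import math

def quadratic_summary(a, b, c):
    """Return (line of symmetry x-value, turning point, sorted x-intercepts) of y = ax^2 + bx + c."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    axis = -b / (2 * a)
    vertex = (axis, a * axis ** 2 + b * axis + c)
    disc = b * b - 4 * a * c
    if disc < 0:
        roots = []                      # no real x-intercepts
    else:
        r = math.sqrt(disc)
        roots = sorted({(-float(b) - r) / float(2 * a), (-float(b) + r) / float(2 * a)})
    return axis, vertex, roots

for coeffs in [(2, 1, 1), (-2, 1, 1), (1, 4, 4)]:
    print(coeffs, quadratic_summary(*coeffs))
```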
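[Figure: graphs of (a) y = 2x² + x + 1, with turning point (−1/4, 7/8), y-intercept (0, 1) and line of symmetry x = −1/4; (b) y = −2x² + x + 1, with turning point (1/4, 9/8), x-intercepts −1/2 and 1, and line of symmetry x = 1/4; (c) y = x² + 4x + 4, with turning point (−2, 0) and y-intercept 4.]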
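In general, given any two finite sets S and T , we can construct [n (T )]^n(S) possible
functions using S as the domain and T as the codomain (each of the n (S) elements of the
domain may be mapped to any one of the n (T ) elements of the codomain).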
A88(a) Yes, a is well-defined because it maps every element in the domain to (ex-
actly) one element in the codomain — we have a (Cow) = Produces milk, a (Chicken) =
Produces eggs, and a (Dog) = Guards the home.
(b) No, b isn’t well-defined, because it isn’t clear what b (Dog) is.
(c) No, c isn’t well-defined, because it isn’t clear what each state’s “most splendid” city is.
(d) No, d isn’t well-defined. China has more than one city with over 10M people, while
Iceland has none. So, China would be mapped to more than one element in the codomain,
while Iceland would be mapped to none — in either case, we’d violate the requirement that
a function map every element in the domain to (exactly) one element in the codomain.
414
As noted earlier and as we’ll learn later, √−1 is not a real but an imaginary number.
415
As discussed in Ch. 6, 1 ÷ 0 is not a real number. Indeed, it is not even a number; it is undefined.
A90. Change the domain of n to R₀⁺, the set of non-negative reals. We then have the
function n ∶ R₀⁺ → R that is (well-)defined by n (x) = √x.416
Change the domain of o to R ∖ {0}, the set of all reals except zero. We then have the
function o ∶ R ∖ {0} → R that is (well-)defined by o (x) = 1/x.417
A91(a) Range (a) = R₀⁺. (b) Range (b) = {0, 1, 4, 9, 16, 25, 36, 49, . . . }.
(c) Range (c) = {0, 1, 4, 9, 16, 25, 36, 49, . . . }. (d) Range (d) = Z.
(e) Range(e) = Z. (f) Range(f ) = {100, 200}. (g) Range(g) = {100}.
d has a range that’s equal to its codomain. So does f . None of the other functions do.
A92. Only (b) “Range(f ) ⊆ Codomain(f )” must be true.
416
Note that the new domain R+0 is indeed the largest subset of R such that the function n is well-defined.
The addition of any negative number to this new domain would render the function ill-defined.
417
Note that the new domain R∖{0} is indeed the largest subset of R such that the function o is well-defined.
The addition of zero to this new domain would render the function ill-defined.
124.7. Ch. 13 Answers (Arithmetic Combinations of Functions)
Hence, define:
x₂² > x₁² ⟺ x₂² − 1 > x₁² − 1 ⟺ b (x₂) > b (x₁) ⟹ b (x₂) ≠ b (x₁) .
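[Figure: graphs of f and f⁻¹, reflections of each other in the line y = x, with the points (1, 2) and (2, 1) marked.]
(b) Range(g) = (0, 2]. So, define g⁻¹ ∶ (0, 2] → (0, 1] by g⁻¹(x) = x/2.
[Figure: graph of g⁻¹, with the line y = x and the points (1, 2) and (2, 1) marked.]
[Figure: graphs of h and h⁻¹, with the line y = x and the point (1, 1) marked.]
(d) Range(i) = (0, 1]. So, define i⁻¹ ∶ (0, 1] → (0, 1] by i⁻¹(x) = √x.
[Figure: graphs of i and i⁻¹, with the line y = x and the point (1, 1) marked.]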
(b) Let x₁, x₂ ∈ (1, ∞) with x₂ > x₁. Then x₂ − 1 > x₁ − 1 > 0 ⟹ (x₂ − 1)² > (x₁ − 1)²
g⁻¹(g(x)) = x ⟺ g⁻¹(1/(x − 1)²) = x
⟺ g⁻¹(y) = x (Let y = g(x) = 1/(x − 1)².)
⟺ g⁻¹(y) = 1 + 1/√y (Do the algebra: x = 1 ± 1/√y.)
Note that in the last step, we discard 1 − 1/√y, because the codomain of g⁻¹ is (1, ∞).
Thus, the inverse function is g⁻¹ ∶ R⁺ → (1, ∞) defined by g⁻¹(y) = 1 + 1/√y.
(c) Let x₃, x₄ ∈ (−∞, 1) with x₄ < x₃. Then x₄ − 1 < x₃ − 1 < 0 ⟹ (x₄ − 1)² > (x₃ − 1)²
h⁻¹(h(x)) = x ⟺ h⁻¹(1/(x − 1)²) = x
⟺ h⁻¹(y) = x (Let y = h(x) = 1/(x − 1)².)
⟺ h⁻¹(y) = 1 − 1/√y (Do the algebra: x = 1 ± 1/√y.)
Note that in the last step, we discard 1 + 1/√y, because the codomain of h⁻¹ is (−∞, 1).
Thus, the inverse function is h⁻¹ ∶ R⁺ → (−∞, 1) defined by h⁻¹(y) = 1 − 1/√y.
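A quick round-trip check of the two inverses found in (b) and (c). The sketch assumes, as the working above indicates, that g and h share the rule x ↦ 1/(x − 1)² but have domains (1, ∞) and (−∞, 1) respectively; the sample points are arbitrary.

```python
import math

def g(x):
    """g(x) = 1/(x - 1)^2, intended for x in (1, infinity)."""
    return 1 / (x - 1) ** 2

def g_inv(y):
    """Inverse of g: takes the root 1 + 1/sqrt(y), which lies in (1, infinity)."""
    return 1 + 1 / math.sqrt(y)

def h(x):
    """h(x) = 1/(x - 1)^2, intended for x in (-infinity, 1)."""
    return 1 / (x - 1) ** 2

def h_inv(y):
    """Inverse of h: takes the root 1 - 1/sqrt(y), which lies in (-infinity, 1)."""
    return 1 - 1 / math.sqrt(y)

print(g_inv(g(3)), h_inv(h(-1)))
```

Swapping the two inverses fails the round trip, which is exactly why the other root is discarded in each case.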
g⁴(x) = g(3/4 − x/8) = 1 − (3/4 − x/8)/2 = 5/8 + x/16.
(ii) Define g⁵ ∶ R → R by g⁵(x) = g(5/8 + x/16) = 1 − (5/8 + x/16)/2 = 11/16 − x/32.
(iii) g⁵(1) = 21/32 and g⁵(3) = 19/32.
g(x) = 1/1 − x/2,   g²(x) = 1/2 + x/4,   g³(x) = 3/4 − x/8,
g⁴(x) = 5/8 + x/16,   g⁵(x) = 11/16 − x/32,   g⁶(x) = 21/32 + x/64.
We observe that the numerators of the constant terms (the red numbers) are the Jacobsthal
numbers 1, 1, 3, 5, 11, 21; the denominators of the constant terms (the blue numbers) are 2ⁿ⁻¹;
the denominators of the x-terms (the green numbers) are 2ⁿ; and the sign between the two
terms alternates between − and +.
Thus, we guess that g n ∶ R → R is defined by:
\[
g^n(x) = \frac{2^n - (-1)^n}{3\cdot 2^{n-1}} + (-1)^n\frac{x}{2^n}
= \frac{2}{3} - \frac{(-1)^n}{3\cdot 2^{n-1}} + (-1)^n\frac{x}{2^n}.
\]
As n → ∞, the second and third terms tend towards zero. Hence, for any x ∈ R, we have:
lim_{n→∞} gⁿ(x) = 2/3.
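The guessed formula for gⁿ can be tested against direct iteration of g(x) = 1 − x/2. The sketch below uses exact fractions; the function names are made up, and agreement for a handful of n and x is of course evidence rather than proof.

```python
from fractions import Fraction

def g(x):
    """The function being iterated: g(x) = 1 - x/2."""
    return 1 - Fraction(x) / 2

def g_iter(n, x):
    """g composed with itself n times, evaluated at x."""
    value = Fraction(x)
    for _ in range(n):
        value = g(value)
    return value

def g_closed(n, x):
    """Guessed closed form: (2^n - (-1)^n) / (3 * 2^(n-1)) + (-1)^n * x / 2^n, for n >= 1."""
    constant = Fraction(2 ** n - (-1) ** n, 3 * 2 ** (n - 1))
    return constant + Fraction((-1) ** n, 2 ** n) * Fraction(x)

print(g_iter(5, 1), g_iter(5, 3), float(g_iter(60, 100)))
```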
We have fg(1) = e^(1² + 1) = e² and fg(2) = e^(2² + 1) = e⁵.
[Figure: graphs of y = 2f(x) + 1 and y = −2f(x) − 1.]
(b) We already graphed y = 2f(x) + 1 in the above example. Simply reflect that in the
y-axis to get the graph of y = 2f(−x) + 1.
[Figure: graphs of y = 2f(x) + 1 and y = 2f(−x) + 1.]
[Figure: graphs of f, y = 2f(x + 1) and y = −2f(x + 1).]
(d) We already graphed y = 2f(x + 1) in the above example. Simply reflect that in the
y-axis to get the graph of y = 2f(−x + 1).
[Figure: graphs of f, y = 2f(x + 1) and y = 2f(−x + 1).]
[Figure: graphs of y = f(2x) + 1 and y = −f(2x) + 1.]
(f) We already graphed y = f(2x) + 1 in the above example. Reflect that in the y-axis to
get y = f(−2x) + 1.
[Figure: graphs of y = f(2x) + 1 and y = f(−2x) + 1.]
[Figure: graphs of f, y = f(2x + 1) and y = −f(2x + 1).]
(h) We already graphed y = f(2x + 1) in the above example. Reflect that in the y-axis to
get y = f(−2x + 1).
[Figure: graphs of f, y = f(2x + 1) and y = f(−2x + 1).]
[Figure: graphs of f, y = f(2x), y = 2f(2x) and y = ∣2f(2x)∣.]
[Figure: graphs of f, y = f(x − 1), y = f(∣x − 1∣) and y = f(∣x − 1∣) + 2.]
1. y = 1/x
2. y = 1/(x − 2)
3. y = 1/(5x − 2)
4. y = −1/(5x − 2)
5. y = 3 − 1/(5x − 2)
(c) By the Quotient Rule: dy/dx = (x ⋅ d(sin x)/dx − sin x ⋅ dx/dx)/x² = (x cos x − sin x)/x² = cos x/x − sin x/x².
(d) By the Quotient Rule: dy/dx = (cos x ⋅ d(sin x)/dx − sin x ⋅ d(cos x)/dx)/cos² x = (cos² x + sin² x)/cos² x = 1/cos² x.
(e) By the Quotient Rule: dy/dx = (z ⋅ d(1)/dx − 1 ⋅ dz/dx)/z² = −(dz/dx)/z². By the way, this is called the Reciprocal Rule.
(f) By (e): dy/dx = −(d(sin x)/dx)/sin² x = −cos x/sin² x.
(g) By (e): dy/dx = −(d(cos x)/dx)/cos² x = −(−sin x)/cos² x = sin x/cos² x.
(h) By (e): dy/dx = −(d(tan x)/dx)/tan² x = −(1/cos² x)/(sin² x/cos² x) = −1/sin² x.
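Each derivative above can be compared with a central-difference estimate. The functions being differentiated are inferred from the working, namely (c) sin x/x, (d) sin x/cos x, (f) 1/sin x, (g) 1/cos x and (h) 1/tan x, so treat those as assumptions; the test point x = 0.7 and the step size are arbitrary.

```python
import math

def numerical_derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Each entry: label -> (function, claimed derivative)
checks = {
    "(c) sin x / x": (lambda x: math.sin(x) / x,
                      lambda x: math.cos(x) / x - math.sin(x) / x ** 2),
    "(d) sin x / cos x": (lambda x: math.sin(x) / math.cos(x),
                          lambda x: 1 / math.cos(x) ** 2),
    "(f) 1 / sin x": (lambda x: 1 / math.sin(x),
                      lambda x: -math.cos(x) / math.sin(x) ** 2),
    "(g) 1 / cos x": (lambda x: 1 / math.cos(x),
                      lambda x: math.sin(x) / math.cos(x) ** 2),
    "(h) 1 / tan x": (lambda x: 1 / math.tan(x),
                      lambda x: -1 / math.sin(x) ** 2),
}

x0 = 0.7
errors = {label: abs(numerical_derivative(f, x0) - df(x0)) for label, (f, df) in checks.items()}
print(errors)
```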
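\[
\frac{dy}{dx} = \frac{d}{dx}\sin\frac{x}{z}
= \frac{d\sin\frac{x}{z}}{d\,\frac{x}{z}}\cdot\frac{d\,\frac{x}{z}}{dx}
= \cos\frac{x}{z}\cdot\frac{d\,\frac{x}{z}}{dx}
= \cos\frac{x}{z}\cdot\frac{z\,\frac{dx}{dx} - x\,\frac{dz}{dx}}{z^2},
\]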
where in the last step we simply plugged in z = 1 + [x − ln (x + 1)]². (We could do a little
Observe that: z∣x=0 = 1 + [0 − ln (0 + 1)]² = 1.
Thus: dy/dx∣x=0 = cos(0/1) ⋅ (1 − 0 ⋅ 0)/1² = 1.
A109(a) Newton’s Second Law of Motion is F = d(mv)/dt. (In words, force is equal to the
rate of change of momentum.)
(b) By definition: a = dv/dt.
If mass is constant (i.e. mass is not changing over time), then dm/dt = 0.
Altogether, by the Product Rule: F = d(mv)/dt = v dm/dt + m dv/dt = 0 + ma = ma.
A110(a) d/dx ln (exp x) = (1/exp x) ⋅ (d/dx exp x).
(b) Since exp is defined to be the inverse of ln, we have ln (exp x) = x. Thus, d/dx ln (exp x) =
d/dx x = 1.
(c) Putting our answers in (a) and (b) together, we have:
(1/exp x) ⋅ (d/dx exp x) = 1 or d/dx exp x = exp x.
[Figure: graph with the points A to H marked.]
              A   B   C   D   E   F   G   H
Stationary    ✗   ✗   ✓   ✗   ✓   ✗   ✓   ✗
Turning       ✗   ✗   ✗   ✗   ✓   ✗   ✗   ✗
[Figure: geometric construction for cos(A − B), with hypotenuse 1, angles A, B and A − B, and segments labelled sin B, cos B, sin A sin B, cos A cos B and sin A cos B.]
Thus: sin²(A/2) = (1 − cos A)/2 and cos²(A/2) = (1 + cos A)/2.
Taking square roots, we have:
sin(A/2) = ±√((1 − cos A)/2) and cos(A/2) = ±√((1 + cos A)/2).
Here we must be a little careful with the signs. By the mnemonic ASTC, we know that
sin A/2 is positive if A/2 is in Quadrant I or II, but negative otherwise. Thus:
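\[
\sin\frac{A}{2} =
\begin{cases}
\sqrt{\dfrac{1 - \cos A}{2}}, & \text{for } \dfrac{A}{2} \text{ in Quadrant I or II,} \\[2ex]
-\sqrt{\dfrac{1 - \cos A}{2}}, & \text{for } \dfrac{A}{2} \text{ in Quadrant III or IV.}
\end{cases}
\]
And cos(A/2) is positive if A/2 is in Quadrant I or IV, but negative otherwise. Thus:
\[
\cos\frac{A}{2} =
\begin{cases}
\sqrt{\dfrac{1 + \cos A}{2}}, & \text{for } \dfrac{A}{2} \text{ in Quadrant I or IV,} \\[2ex]
-\sqrt{\dfrac{1 + \cos A}{2}}, & \text{for } \dfrac{A}{2} \text{ in Quadrant II or III.}
\end{cases}
\]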
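P = (P + Q)/2 + (P − Q)/2 and Q = (P + Q)/2 − (P − Q)/2.
Now simply apply the Addition and Subtraction Formulae:
\[
\begin{aligned}
\sin P &= \sin\left(\frac{P+Q}{2} + \frac{P-Q}{2}\right) = \sin\frac{P+Q}{2}\cos\frac{P-Q}{2} + \cos\frac{P+Q}{2}\sin\frac{P-Q}{2}, \\
\sin Q &= \sin\left(\frac{P+Q}{2} - \frac{P-Q}{2}\right) = \sin\frac{P+Q}{2}\cos\frac{P-Q}{2} - \cos\frac{P+Q}{2}\sin\frac{P-Q}{2}, \\
\cos P &= \cos\left(\frac{P+Q}{2} + \frac{P-Q}{2}\right) = \cos\frac{P+Q}{2}\cos\frac{P-Q}{2} - \sin\frac{P+Q}{2}\sin\frac{P-Q}{2}, \\
\cos Q &= \cos\left(\frac{P+Q}{2} - \frac{P-Q}{2}\right) = \cos\frac{P+Q}{2}\cos\frac{P-Q}{2} + \sin\frac{P+Q}{2}\sin\frac{P-Q}{2}.
\end{aligned}
\]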
You can easily verify that the four S2P or P2S Formulae now follow.
A120. Let 2x = (P + Q)/2 and 5x = (P − Q)/2.
Then: P = (P + Q)/2 + (P − Q)/2 = 2x + 5x = 7x;
And: Q = (P + Q)/2 − (P − Q)/2 = 2x − 5x = −3x.
A123(a) Recall (p. 19.5) that cosine is symmetric in the y-axis. That is, for all x,
cos x = cos (−x). Thus,
Rearranging, we have: sin⁻¹ x + cos⁻¹ x = π/2.
Terms:                  x¹       x⁰
                                 3.2
5x − 2 )               16x      +3
                       16x      −6.4
                                 9.4

Terms:                  x²       x¹       x⁰
                                 4x       −23
x + 5 )                4x²      −3x      +1
                       4x²      +20x
                                −23x      +1
                                −23x      −115
                                           116
(4x² − 3x + 1)/(x + 5) = 4x − 23 + 116/(x + 5).
(c) In the expression (x² + x + 3)/(−x² − 2x + 1), the dividend is x² + x + 3 and the divisor is −x² − 2x + 1.
Long division:
Terms:                  x²       x¹       x⁰
                                          −1
−x² − 2x + 1 )         x²       +x       +3
                       x²       +2x      −1
                                −x       +4
(x² + x + 3)/(−x² − 2x + 1) = −1 + (−x + 4)/(−x² − 2x + 1).
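The three long divisions above can be replayed mechanically. The following is a minimal sketch of polynomial long division on coefficient lists (highest power first); the function name `poly_divmod` is invented, and exact fractions are used so that 3.2 and 9.4 appear as 16/5 and 47/5.

```python
from fractions import Fraction

def poly_divmod(dividend, divisor):
    """Divide polynomials given as coefficient lists, highest degree first.

    Returns (quotient, remainder), both as lists of Fractions.
    """
    num = [Fraction(c) for c in dividend]
    den = [Fraction(c) for c in divisor]
    quotient = []
    while len(num) >= len(den):
        factor = num[0] / den[0]          # next term of the quotient
        quotient.append(factor)
        for i, d in enumerate(den):       # subtract factor * divisor
            num[i] -= factor * d
        num.pop(0)                        # the leading term is now zero
    return quotient, num

print(poly_divmod([16, 3], [5, -2]))          # (16x + 3) / (5x - 2)
print(poly_divmod([4, -3, 1], [1, 5]))        # (4x^2 - 3x + 1) / (x + 5)
print(poly_divmod([1, 1, 3], [-1, -2, 1]))    # (x^2 + x + 3) / (-x^2 - 2x + 1)
```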
A125(a) (2x³ + 7x² − 3x + 5) ÷ (x − 3) leaves 2 ⋅ 3³ + 7 ⋅ 3² − 3 ⋅ 3 + 5 = 113.
(b) (−2x⁴ + 3x² − 7x − 1) ÷ (x + 2) leaves −2 ⋅ (−2)⁴ + 3 ⋅ (−2)² − 7 ⋅ (−2) − 1 = −7.
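By the Remainder Theorem, each remainder in A125 is just the polynomial evaluated at a point. A short sketch using Horner's method (the function name `horner` is made up) confirms both values.

```python
def horner(coeffs, x):
    """Evaluate a polynomial with coefficients listed from the highest power down."""
    result = 0
    for c in coeffs:
        result = result * x + c
    return result

# (a) 2x^3 + 7x^2 - 3x + 5 divided by x - 3: evaluate at x = 3
remainder_a = horner([2, 7, -3, 5], 3)
# (b) -2x^4 + 3x^2 - 7x - 1 divided by x + 2: evaluate at x = -2
remainder_b = horner([-2, 0, 3, -7, -1], -2)
print(remainder_a, remainder_b)
```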
(2x − 3) (x + 1) = 2x² − x − 3. ✗
(b) I’ll use the quadratic formula. We have b² − 4ac = (−19)² − 4(7)(−6) = 361 + 168 = 529 > 0
and √529 = 23. Thus:
7x² − 19x − 6 = 7 (x − (19 − 23)/14) (x − (19 + 23)/14) = 7 (x + 2/7) (x − 3) = (7x + 2) (x − 3) .
(c) I’ll use the SSGACM: (2x + 1) (3x − 1) = 6x² + x − 1. ✓ Yay! Done!
(d) I’ll start by using the FTGACM. Since the constant term is −14 = −2 × 7, let’s try
plugging in 2:
p(2) = 2 ⋅ 2³ − 2² − 17 ⋅ 2 − 14 < 0. ✗
Aiyah, sian. Doesn’t work: by the FT, x − 2 is not a factor of 2x³ − x² − 17x − 14.
Let’s instead try −2:
p(−2) = 2 ⋅ (−2)³ − (−2)² − 17 ⋅ (−2) − 14 = 0. ✓
So by the FT, x + 2 is a factor and we can write 2x³ − x² − 17x − 14 = (x + 2) (ax² + bx + c).
The coefficients on the cubed and constant terms are a = 2 and 2c = −14. And so, c = −7.
To find b, look at the coefficients on the squared term, which are 2a + b = −1 and so b = −5.
Thus, ax² + bx + c = 2x² − 5x − 7.
To factorise 2x² − 5x − 7, I’ll use the SSGACM:
(2x − 7) (x + 1) = 2x² − 5x − 7. ✓
By (ii): 0 = p(1/2) = a/16 + b/8 − 31/4 + 3/2 + 3
= a/16 + b/8 − 13/4 = a/16 + (30 − a)/8 − 13/4
= (60 − a)/16 − 13/4.
Multiplying by 16: 0 = 60 − a − 52 = 8 − a.
(b) Observe that p (0) > 0. Given also (iii) p (−1/3) < 0, the IVT says there must be some
−1/3 < c < 0 such that p (c) = 0.
So, let’s try the FTGACM, by plugging in −1/4:
8 (−1/4)⁴ + 22 (−1/4)³ − 31 (−1/4)² + 3 (−1/4) + 3 = 0. ✓
Yay, works! By the FT, x + 1/4 or 4 (x + 1/4) = 4x + 1 is a factor of p (x).
From (ii), we also already knew that x − 1/2 or 2 (x − 1/2) = 2x − 1 is a factor of p (x).
So write: p (x) = 8x⁴ + 22x³ − 31x² + 3x + 3
= (2x − 1) (4x + 1) (dx² + ex + f )
= (8x² − 2x − 1) (dx² + ex + f ) .
The coefficients on the 4th-degree and constant terms are 8d = 8 and −f = 3. And so, d = 1
and f = −3. To find e, look at the coefficients on the linear term, which are −2f − e = 3.
And so, e = −2f − 3 = 3. Thus, dx² + ex + f = x² + 3x − 3.
To factorise this last quadratic polynomial, we observe that b² − 4ac = 3² − 4(1)(−3) = 21 > 0.
And so, by the quadratic formula, we have:
x² + 3x − 3 = (x − (−3 − √21)/2) (x − (−3 + √21)/2) = (x + (3 + √21)/2) (x + (3 − √21)/2).
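The complete factorisation says that p(x) = 8x⁴ + 22x³ − 31x² + 3x + 3 has the four roots 1/2, −1/4 and (−3 ± √21)/2. A numerical spot-check (floating point, so the values are only approximately zero):

```python
import math

def p(x):
    """p(x) = 8x^4 + 22x^3 - 31x^2 + 3x + 3."""
    return 8 * x ** 4 + 22 * x ** 3 - 31 * x ** 2 + 3 * x + 3

roots = [0.5, -0.25, (-3 - math.sqrt(21)) / 2, (-3 + math.sqrt(21)) / 2]
for r in roots:
    print(r, p(r))   # each value of p should be 0 up to rounding error
```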
A128. Translate x²/a² + y²/b² = 1 leftwards by c units to get (x + c)²/a² + y²/b² = 1. Then
[Figure: the ellipse (x + c)²/a² + (y + d)²/b² = 1, with lines of symmetry x = −c and y = −d; (−c, b − d) is a strict global maximum and (−c, −b − d) is a strict global minimum; the x-intercepts are at x = −c ± a√(b² − d²)/b and the y-intercepts are at y = −d ± b√(a² − c²)/a.]
So, (x + c)²/a² + (y + d)²/b² = 1 is the exact same ellipse as x²/a² + y²/b² = 1, but now centred
on (−c, −d). Being the original ellipse
translated c units leftwards and d units downwards, (x + c)²/a² + (y + d)²/b² = 1 again has
two turning points: the strict global maximum (−c, b − d) and the strict global minimum
(−c, −b − d).
By observation, there are no asymptotes.
By observation, there are two lines of symmetry y = −d and x = −c.
To find the y-intercepts, plug in x = 0:
(0 + c)²/a² + (y + d)²/b² = 1 ⟺ (y + d)²/b² = 1 − c²/a² = (a² − c²)/a²
⟺ (y + d)/b = ±√(a² − c²)/a ⟺ y = −d ± b√(a² − c²)/a.
So, if ∣a∣ > ∣c∣, then the y-intercepts are (0, −d − b√(a² − c²)/a) and (0, −d + b√(a² − c²)/a).
If ∣a∣ = ∣c∣, then the (only) y-intercept is (0, −d). (Either the leftmost or rightmost point of
the ellipse just touches the y-axis.)
And if ∣a∣ < ∣c∣, then there are no y-intercepts (the ellipse doesn’t touch the y-axis).
Similarly, to find the x-intercepts, plug in y = 0:
(x + c)²/a² + (0 + d)²/b² = 1 ⟺ (x + c)²/a² = 1 − d²/b² = (b² − d²)/b²
⟺ (x + c)/a = ±√(b² − d²)/b ⟺ x = −c ± a√(b² − d²)/b.
So, if ∣b∣ > ∣d∣, then the x-intercepts are (−c − a√(b² − d²)/b, 0) and (−c + a√(b² − d²)/b, 0).
If ∣b∣ = ∣d∣, then the (only) x-intercept is (−c, 0). (Either the topmost or bottommost point
of the ellipse just touches the x-axis.)
And if ∣b∣ < ∣d∣, then there are no x-intercepts (the ellipse doesn’t touch the x-axis).
[Figure: graph of y = (3x + 2)/(x + 2), with x-intercept (−2/3, 0) and vertical asymptote x = −2.]
x−2
y=
−2x + 1
Line of symmetry Line of symmetry
y = −x y =x−1
(2, 0) x
Horizontal asymptote
Centre
1
1 1
( ,− ) y=−
(0, −2) 2 2 2
Vertical asymptote
1
x=
2
[Figure: graph of y = (−3x + 1)/(2x + 3), with x-intercept (1/3, 0), y-intercept (0, 1/3), centre (−3/2, −3/2), horizontal asymptote y = −3/2, vertical asymptote x = −3/2, and lines of symmetry y = −x − 3 and y = x.]
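[Figure: graph of y = (x² + 2x + 1)/(x − 4), with vertical asymptote x = 4, oblique asymptote y = x + 6, centre (4, 10), turning points (−1, 0) and (9, 20), y-intercept (0, −1/4), and lines of symmetry y = (1 − √2) x + 6 + 4√2 and y = (1 + √2) x + 6 − 4√2. In this example, the x-intercept (−1, 0) coincides with the maximum turning point.]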
418
To find the turning points, write: dy/dx = d/dx (x + 6 + 25/(x − 4)) = 1 − 25 (x − 4)⁻² = 0 ⟺ (x − 4)² = 25.
So x = −1, 9. The corresponding y-values are −1 + 6 + 25/ (−1 − 4) = 0 and 9 + 6 + 25/ (9 − 4) = 20. Thus,
the two turning points are (−1, 0) and (9, 20). (If necessary, we can also show that these are respectively
the strict local maximum and minimum.)
A130(b) Do the long division: (−x² + x − 1)/(x + 1) = −x + 2 − 3/(x + 1).
Intercepts. Plug in x = 0 to get y = −1/1 = −1. Thus, the y-intercept is (0, −1). Plug in
y = 0 to get −x2 + x − 1 = 0, an equation for which there are no (real) solutions. Thus, there
are no x-intercepts.
Asymptotes. The vertical asymptote x = −1 is given by the value of x for which x + 1 = 0.
The oblique asymptote y = −x + 2 is given by the quotient in the long division.
The centre’s x-coordinate is given by the vertical asymptote x = −1. For its y-coordinate,
plug x = −1 into the oblique asymptote to get y = − (−1)+2 = 3. Hence, the centre is (−1, 3).
You should be able to sketch the two lines of symmetry and the two turning points.419
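[Figure: graph of y = (−x² + x − 1)/(x + 1), with vertical asymptote x = −1, oblique asymptote y = −x + 2, centre (−1, 3), y-intercept (0, −1), turning points (−1 + √3, 3 − 2√3) and (−1 − √3, 3 + 2√3), and lines of symmetry y = (−1 − √2) x + 2 − √2 and y = (−1 + √2) x + 2 + √2, both of which pass through the centre.]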
419
To find the turning points, write: dy/dx = d/dx (−x + 2 − 3/(x + 1)) = −1 + 3 (x + 1)⁻² = 0 ⟺ (x + 1)² = 3.
So x = −1 ± √3. The corresponding y-values are 1 ∓ √3 + 2 − 3/ (−1 ± √3 + 1) = 3 ∓ 2√3. Thus, the two
turning points are (−1 ± √3, 3 ∓ 2√3). (If necessary, we can also show that these are respectively the
strict local maximum and minimum.)
A130(c) Do the long division: (2x² − 2x − 1)/(x + 4) = 2x − 10 + 39/(x + 4).
Intercepts. Plug in x = 0 to get y = −1/4. Thus, the y-intercept is (0, −1/4). Plug
in y = 0 to get 2x² − 2x − 1 = 0, an equation for which there are two (real) solutions:
x = (2 ± √12) /4 = (1 ± √3) /2. Thus, there are two x-intercepts: ((1 ± √3) /2, 0).
Asymptotes. The vertical asymptote x = −4 is given by the value of x for which x + 4 = 0.
The oblique asymptote y = 2x − 10 is given by the quotient in the long division.
The centre’s x-coordinate is given by the vertical asymptote x = −4. For its y-coordinate,
plug x = −4 into the oblique asymptote to get y = 2 (−4) − 10 = −18. Hence, the centre is
(−4, −18).
You should be able to sketch the two lines of symmetry and the two turning points.420
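[Figure: graph of y = (2x² − 2x − 1)/(x + 4), with vertical asymptote x = −4, oblique asymptote y = 2x − 10, centre (−4, −18), x-intercepts ((1 − √3)/2, 0) and ((1 + √3)/2, 0), y-intercept (0, −1/4), turning points (−4 + √(39/2), −18 + 2√78) and (−4 − √(39/2), −18 − 2√78), and lines of symmetry y = (2 + √5) x − 10 + 4√5 and y = (2 − √5) x − 10 − 4√5.]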
420
To find the turning points, write: dy/dx = d/dx (2x − 10 + 39/(x + 4)) = 2 − 39 (x + 4)⁻² = 0 ⟺ (x + 4)² = 39/2.
So x = −4 ± √(39/2). The corresponding y-values are 2 (−4 ± √(39/2)) − 10 + 39/ (−4 ± √(39/2) + 4) = −18 ±
2√78. Thus, the two turning points are (−4 ± √(39/2), −18 ± 2√78). (If necessary, we can also show that
these are respectively the strict local minimum and maximum.)
124.16. Ch. 23 Answers (Simple Parametric Equations)
A131(a) As stated in the above example, at time t = 0, particle P is at (x, y) = (cos 0, sin 0) =
(1, 0). In contrast, particle Q is at (x, y) = (sin 0, cos 0) = (0, 1).
(b) At t = 0, Q is at (x, y) = (sin 0, cos 0) = (0, 1). A little after t = 0, x = sin t will have
grown a little while y = cos t will have shrunk a little. That is, the particle Q will have
moved a little to the right and a little to the south. Thus, Q travels clockwise.
(c) Every 2π s, each particle travels one full circle. Therefore, at t = 664π, each particle
will be at its starting point. And π s later, each particle will have travelled an additional
half-circle. Thus, at t = 665π, particle P will be at (x, y) = (cos π, sin π) = (−1, 0) (at the
left of the circle), while particle Q will be at (x, y) = (sin π, cos π) = (0, −1) (at the bottom
of the circle).
(d) The two particles are at the exact same position whenever (cos t, sin t) = (sin t, cos t).
Thus, they are at the exact same position whenever cos t = sin t. By inspecting the graphs
of cos and sin (see e.g. p. 265), we see that this occurs at t = (k + 1/4) π, for every k ∈ Z+0 .
(e) For the particle Q, we have:
vx = dx/dt = cos t, vy = dy/dt = − sin t, ax = dvx/dt = d²x/dt² = − sin t, ay = dvy/dt = d²y/dt² = − cos t.
The particle R starts at (a, 0) and travels anticlockwise. At t = π/4, R has completed
one-eighth of the full revolution and is now at the top-right of the ellipse; it is travelling
leftwards at a√2/2 m s⁻¹ and upwards at b√2/2 m s⁻¹; and it is accelerating leftwards at
a√2/2 m s⁻² and downwards at b√2/2 m s⁻².
At t = π/2, R has completed one-quarter of the full revolution and is now at the top of the
ellipse; it is travelling leftwards at a m s⁻¹ (and not upwards at all); and it is accelerating
downwards at b m s⁻² (and not rightwards at all).
At t = 2π, R has completed one full revolution and is back at its starting position; it is
travelling upwards at b m s−1 (and not rightwards at all); and it is accelerating leftwards at
a m s−2 (and not upwards at all).
[Figure: the ellipse U = {(x, y) ∶ x²/a² + y²/b² = 1} = {(x, y) ∶ x = a cos t, y = b sin t, t ≥ 0}. Arrows indicate instantaneous direction of travel.]
At t = π/4: (x, y) = (a√2/2, b√2/2), (vx , vy ) = (−a√2/2, b√2/2), (ax , ay ) = (−a√2/2, −b√2/2).
At t = π/2: (x, y) = (0, b), (vx , vy ) = (−a, 0), (ax , ay ) = (0, −b).
At t = 2π: (x, y) = (a, 0), (vx , vy ) = (0, b), (ax , ay ) = (−a, 0).
(c) At t = 0, (x, y) = (a cos 0, b sin 0) = (a, 0) and (vx , vy ) = (−a sin 0, b cos 0) = (0, b). Hence:
(i) If a, b < 0, R starts at the left of the ellipse. It also starts by moving downwards and is
thus moving anticlockwise.
(ii) If a > 0, b < 0, R starts at the right of the ellipse. It also starts by moving downwards
and is thus moving clockwise.
(iii) If a < 0, b > 0, R starts at the left of the ellipse. It also starts by moving upwards and
is thus moving clockwise.
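The three cases can be summarised by one quantity. For x = a cos t, y = b sin t, the expression x·vy − y·vx equals ab at every t, and its sign tells us the direction of travel: positive means anticlockwise, negative means clockwise. This cross-product test is a standard device rather than the method used in the answer above, and the function names below are invented.

```python
import math

def state(a, b, t):
    """Position and velocity of R at time t, for x = a cos t, y = b sin t."""
    position = (a * math.cos(t), b * math.sin(t))
    velocity = (-a * math.sin(t), b * math.cos(t))
    return position, velocity

def direction(a, b, t=0.0):
    """Direction of travel about the origin, from the sign of x*vy - y*vx (which equals a*b)."""
    (x, y), (vx, vy) = state(a, b, t)
    cross = x * vy - y * vx
    return "anticlockwise" if cross > 0 else "clockwise"

for a, b in [(2, 1), (-2, -1), (2, -1), (-2, 1)]:
    print(a, b, direction(a, b))
```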
A134(a) An instant after t = 1.5π, the particle magically reappears “near” “bottom-right
infinity” (∞, −∞).
(b) During t ∈ (1.5π, 2.5π), the particle moves upwards along the right branch of the
hyperbola. At t = 2π, it is back to its starting position (1, 0). And as t → 2.5π, it “flies off”
towards “top-right infinity” (∞, ∞).
Position Ba Bb Bc Bd Be Bf
Time t 5 0 1 2 3 4
Since 2, 3 ∈ (0.5π, π) ≈ (1.57, 3.14), at t = 2 and t = 3, B must be on the left portion of the
bottom branch. Since B is “near” “bottom-left” infinity an instant after t = 0.5π and t = 2
is earlier than t = 3, it must be that t = 2 corresponds to Bd and t = 3 corresponds to Be .
Since 4 ∈ (π, 1.5π) ≈ (3.14, 4.71), at t = 4, B must be on the right portion of the bottom
branch — hence, Bf .
Since 5 ∈ (1.5π, 2π) ≈ (4.71, 6.28), at t = 5, B must be on the left portion of the top branch
— hence, Ba .
A = {(x, y) ∶ y = ln (x + 2) , y ≥ 0} .
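[Figure: graph of the set $A$, with the starting point marked.]
At $t = 0$: $(x, y) = (-1, 0)$ and $(v_x, v_y) = (1, 1)$.
At $t = e - 1 \approx 1.72$: $(x, y) = (e - 2, 1) \approx (0.72, 1)$ and $(v_x, v_y) = \left(1, \frac{1}{e - 1 + 1}\right) \approx (1, 0.37)$.
(iii) As usual, compute: $v_x = \frac{dx}{dt} = 1$ and $v_y = \frac{dy}{dt} = \frac{1}{t + 1}$.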
At t = 0, A starts at the position (x, y) = (0 − 1, ln (0 + 1)) = (−1, 0), and is moving
rightwards at 1 m s−1 and upwards at 1/ (0 + 1) = 1 m s−1 .
A’s rightwards velocity stays fixed at 1 m s−1 , while its upwards velocity decreases towards
zero. As time progresses, A travels steadily towards “top-right infinity”.
Noting that $t \ge 0 \iff t + 1 \ge 1 \iff x = \frac{1}{t+1} \in (0, 1]$, we can rewrite the set as:
$$B = \left\{(x, y) : y = \left(\frac{1}{x} - 1\right)^2 + 1,\ x \in (0, 1]\right\}.$$
(ii) Using your graphing calculator, we see that the complete graph of $y = (1/x - 1)^2 + 1$ has
two branches.
However, we have the constraint x ∈ (0, 1]. And so, particle B travels only along the black
graph and does not travel along the grey portion.
[Figure: graph of $B = \left\{(x, y) : x = \frac{1}{t+1},\ y = t^2 + 1,\ t \ge 0\right\} = \left\{(x, y) : y = \left(\frac{1}{x} - 1\right)^2 + 1,\ x \in (0, 1]\right\}$, with the starting point marked. B does not travel along the grey portions.]
At $t = 0$: $(x, y) = \left(\frac{1}{0+1},\ 0^2 + 1\right) = (1, 1)$ and $(v_x, v_y) = \left(-\frac{1}{(0+1)^2},\ 2 \cdot 0\right) = (-1, 0)$.
At $t = 3$: $(x, y) = \left(\frac{1}{3+1},\ 3^2 + 1\right) = \left(\frac{1}{4}, 10\right)$ and $(v_x, v_y) = \left(-\frac{1}{(3+1)^2},\ 2 \cdot 3\right) = \left(-\frac{1}{16}, 6\right)$.
(iii) As usual, compute: $v_x = \frac{dx}{dt} = -\frac{1}{(t+1)^2}$ and $v_y = \frac{dy}{dt} = 2t$.
At $t = 0$, B starts at the position $(x, y) = (1/(0 + 1),\ 0^2 + 1) = (1, 1)$, and is moving leftwards
at $1/(0 + 1)^2 = 1$ m s$^{-1}$ and upwards at $2 \cdot 0 = 0$ m s$^{-1}$. That is, B is initially not moving
in the y-direction.
As time progresses, B moves leftwards and upwards. Its leftwards velocity decreases towards
zero, while its upwards velocity increases towards infinity.
$$y = 3\left[1 - \left(\frac{x+1}{2}\right)^2\right] = 3\left[1 - \frac{x^2 + 2x + 1}{4}\right] = -\frac{3}{4}x^2 - \frac{3}{2}x + \frac{9}{4}.$$
Noting that $t \ge 0 \implies \sin t \in [-1, 1] \iff x = 2\sin t - 1 \in [-3, 1]$, we may rewrite the set
as: $C = \{(x, y) : y = -0.75x^2 - 1.5x + 2.25,\ x \in [-3, 1]\}$.
(ii) The graph of y = −0.75x2 − 1.5x + 2.25 is simply a ∩-shaped quadratic, with turning
point at x = −1 and roots x = −3, 1. But note the constraint x ∈ [−3, 1] — the particle C
travels only along the black graph and not along the grey portion.
[Figure: graph of $C$, with its position marked at $t = 0$, $t = \pi/2$, $t = 3\pi/2$, and $t = 2\pi$. C does not travel along the grey portions.]
At $t = 0$: $(x, y) = (-1, 3)$ and $(v_x, v_y) = (2, 0)$.
(iii) As usual, compute: $v_x = \frac{dx}{dt} = 2\cos t$ and $v_y = \frac{dy}{dt} = -6\cos t \sin t$.
At t = 0, C starts at (2 sin 0 − 1, 3 cos2 0) = (−1, 3) (the maximum point of the parabola)
and has velocity (vx , vy ) = (2 cos 0, −6 sin 0 cos 0) = (2, 0). That is, it is moving rightwards
at 2 m s−1 (and not moving in the y-direction).
During t ∈ (0, 0.5π), it moves rightwards along the parabola. At t = 0.5π, it is at the
rightmost point of the black graph and (vx , vy ) = (0, 0).
It then does a U-turn — during t ∈ (0.5π, 1.5π), it moves leftwards along the parabola.
At t = π, it is again at the maximum point of the parabola. And at t = 1.5π, it is at the
leftmost point of the constrained parabola and again (vx , vy ) = (0, 0).
It then does a U-turn — during t ∈ (1.5π, 2π), it moves rightwards along the parabola. And
at t = 2π, it is again at the maximum point of the parabola.
The particle has completed one period and will during t ∈ [2π, 4π] repeat exactly the same
movement made during t ∈ [0, 2π]. And so on.
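The description of C's motion above can be checked numerically. The sketch below assumes the parametrisation used in this answer, x = 2 sin t − 1 and y = 3 cos² t; the function names `position` and `velocity` are illustrative only.

```python
import math

def position(t):
    """Position of particle C: x = 2 sin t - 1, y = 3 cos^2 t."""
    return (2 * math.sin(t) - 1, 3 * math.cos(t) ** 2)

def velocity(t):
    """Velocity of C, obtained by differentiating each coordinate in t."""
    return (2 * math.cos(t), -6 * math.cos(t) * math.sin(t))

# Tabulate the key instants discussed in the text.
for label, t in [("0", 0.0), ("pi/2", math.pi / 2), ("pi", math.pi),
                 ("3pi/2", 3 * math.pi / 2), ("2pi", 2 * math.pi)]:
    (x, y), (vx, vy) = position(t), velocity(t)
    print(f"t = {label}: (x, y) = ({x:.3f}, {y:.3f}), (vx, vy) = ({vx:.3f}, {vy:.3f})")
```

The printed rows should show C at the turning point at t = 0, π, 2π and at the two endpoints of the constrained parabola at t = π/2 and 3π/2, where both velocity components vanish (up to rounding).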
124.17. Ch. 24 Answers (Solving Inequalities)
A140(a) ∣x − 4∣ ≤ 71 ⇐⇒ −71 ≤ x − 4 ≤ 71 ⇐⇒ −67 ≤ x ≤ 75. The solution set is [−67, 75].
(b) ∣5 − x∣ > 13 ⇐⇒ (5 − x > 13 OR 5 − x < −13) ⇐⇒ (−8 > x OR 18 < x). The solution
set is (−∞, −8) ∪ (18, ∞) or R ∖ [−8, 18].
(c) Sketch y = ∣−3x + 2∣ − 4 and y = x − 1. Observe ∣−3x + 2∣ − 4 ≥ x − 1 ⇐⇒ x is to the left
or right of the two intersection points P and Q. P is given by: −3x + 2 − 4 = x − 1 or −1 = 4x
or x = −1/4. Q is given by: 3x − 2 − 4 = x − 1 or 2x = 5 or x = 5/2. Thus, the solution set is
(−∞, −1/4] ∪ [5/2, ∞) or R ∖ (−1/4, 5/2).
[Figure: graphs of $y = |-3x + 2| - 4$ and $y = x - 1$, intersecting at $x = -1/4$ and $x = 5/2$.]
(d) Sketch y = ∣x + 6∣ and y = 2 ∣2x − 1∣. Observe that ∣x + 6∣ > 2 ∣2x − 1∣ ⇐⇒ x is between
the two intersection points P and Q. P is given by: −x − 6 = 2 (2x − 1) or −4 = 5x or x = −4/5.
Q is given by: x + 6 = 2 (2x − 1) or 8 = 3x or x = 8/3. Thus, the solution set is (−4/5, 8/3).
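[Figure: graphs of $y = 2|2x - 1|$ and $y = |x + 6|$, intersecting at $x = -4/5$ and $x = 8/3$.]
[Sign diagram for $\frac{2x + 1}{3x + 2}$: $+$ for $x < -2/3$, $-$ for $-2/3 < x < -1/2$, $+$ for $x > -1/2$.]
[Figure: graph of $y = \frac{2x + 1}{3x + 2}$, with $x = -2/3$ and $x = -1/2$ marked.]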
(e) The numerator and denominator equal zero at −6 and 14/9. Draw the sign diagram:
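[Sign diagram: $-$ for $x < -6$, $+$ for $-6 < x < 14/9$, $-$ for $x > 14/9$.]
[Figure: graph of $y = \frac{-3x - 18}{9x - 14}$, with $x = -6$ and $x = 14/9$ marked.]
Altogether then,
$$\frac{2x + 3}{-x + 7} < 9 \iff x \in \mathbb{R} \setminus \left[\frac{60}{11}, 7\right].$$
[Figure: graphs of $y = \frac{2x + 3}{-x + 7}$ and $y = 9$, with $x = 60/11$ and $x = 7$ marked.]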
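(b) $\frac{-4x + 2}{x + 1} > 13 \iff \frac{-4x + 2 - 13x - 13}{x + 1} > 0 \iff \frac{-17x - 11}{x + 1} > 0 \iff -17x - 11,\ x + 1 \overset{1}{>} 0$ OR $-17x - 11,\ x + 1 \overset{2}{<} 0$.
$-17x - 11,\ x + 1 \overset{1}{>} 0 \iff (x < -11/17 \text{ AND } x > -1) \iff x \in (-1, -11/17)$.
Altogether then,
$$\frac{-4x + 2}{x + 1} > 13 \iff x \in \left(-1, -\frac{11}{17}\right).$$
[Figure: graphs of $y = \frac{-4x + 2}{x + 1}$ and $y = 13$, with $x = -1$ and $x = -11/17$ marked.]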
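[Figure: graph of $y = -3x^2 + x - 5$. $x \in \emptyset$ (there are no values of $x$ for which $-3x^2 + x - 5 > 0$).]
(b) Since $a > 0$, this is a ∪-shaped quadratic. Since $b^2 - 4ac = (-2)^2 - 4(1)(-1) = 8 > 0$, the
[Figure: graph of $y = x^2 - 2x - 1$, with $x$-intercepts $1 - \sqrt{2}$ and $1 + \sqrt{2}$.]
[Figure: graph of $y = \frac{x^2 + 2x + 1}{x^2 - 3x + 2}$, with $x = -1$, $1$, and $2$ marked.]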
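Thus, the inequality’s solution set is $(\mathbb{R} \setminus [-2, 2]) \cup (-1, 1)$ or $(-\infty, -2) \cup (-1, 1) \cup (2, \infty)$.
[Figure: graph of $y = \frac{x^2 - 1}{x^2 - 4}$, with $x = -2$, $-1$, $1$, and $2$ marked.]
[Figure: graph of $y = \frac{x^2 - 3x - 18}{-x^2 + 9x - 14}$, with $x = -3$, $2$, $6$, and $7$ marked. $(-3, 2) \cup (6, 7)$ solves $\frac{x^2 - 3x - 18}{-x^2 + 9x - 14} > 0$.]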
(b) Rewrite the inequality as $\sqrt{x} - \cos x > 0$, then graph $y = \sqrt{x} - \cos x$ on your TI84. It
looks like there’s at least one x-intercept near the origin, maybe more. So let’s zoom in.
Alright, it looks like there’s only one x-intercept. We can as usual find it using the ZERO
function: $x \approx 0.642$.
Altogether then, based on this one root and what the graph looks like, we conclude:
$$\sqrt{x} > \cos x \iff \sqrt{x} - \cos x > 0 \iff x \gtrapprox 0.642.$$
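If no graphing calculator is at hand, the same root can be approximated in a few lines. The sketch below uses plain bisection on [0, 1], which is a substitute technique and not how the TI84's ZERO function necessarily works; the helper name `bisect` and the bracketing interval are assumptions made for this illustration.

```python
import math

def f(x):
    """The function whose x-intercept we want: sqrt(x) - cos(x)."""
    return math.sqrt(x) - math.cos(x)

def bisect(func, lo, hi, tol=1e-10):
    """Bisection: assumes func(lo) and func(hi) have opposite signs."""
    assert func(lo) * func(hi) < 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # Keep the half-interval on which the sign change occurs.
        if func(lo) * func(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

root = bisect(f, 0.0, 1.0)
print(round(root, 3))
```

Since f(0) = −1 < 0 and f(1) = 1 − cos 1 > 0, the interval [0, 1] brackets the root, and the printed value should agree with the calculator's 0.642 to three decimal places.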
(c) Rewrite the inequality as 1/ (1 − x2 )−x3 −sin x > 0, then graph y = 1/ (1 − x2 )−x3 −sin x
on your TI84. It looks like there’s only one x-intercept near x = −1.
So let’s simply find it using the ZERO function — x ≈ −1.179.
Altogether then, based on this one root and what the graph looks like, we conclude:
$$\frac{1}{1 - x^2} > x^3 + \sin x \iff \frac{1}{1 - x^2} - x^3 - \sin x > 0 \iff x \in (-\infty, -1.179\ldots) \cup (-1, 1).$$
$A - k \overset{1}{=} 40$ and $B - k \overset{2}{=} 2(C - k)$.
$A \overset{3}{=} 2B$ and $C \overset{4}{=} 28$.
From =, we have
5
k = 2B − 40.
7
Thus, $B = 32$, so Beng is 32 years old today. And from $\overset{3}{=}$, Apu is 64 years old today.
[Figure: triangle with sides 300 km and 600 km, an angle marked $3\pi/4$, and third side ≈ 442 km.]
By the Law of Cosines (Proposition 5), the third side of a triangle is given by:
√
2
c2 = a2 + b2 − 2ab cos C = 90000 + 360000 − 2(300)(600) × (− ) ≈ 195442.
2
Hence, $c \approx \sqrt{195442} \approx 442$.
Thus, at 3 p.m., the distance between the two planes is about 442 km. And from 3 p.m.
onwards, they’re travelling directly towards each other — Plane A at 100 km h−1 and Plane
B at 200 km h−1 . Thus, the distance between them is shrinking at a rate of 300 km h−1 .
Hence, to collide, it will take about another $\frac{442}{300}$ h ≈ 1 h 28 min.
Hence, they will collide at about 4.28 p.m.
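$$2 \overset{1}{=} a \cdot 1^2 + b \cdot 1 + c = a + b + c,$$
$$5 \overset{2}{=} a \cdot 3^2 + b \cdot 3 + c = 9a + 3b + c,$$
$$9 \overset{3}{=} a \cdot 6^2 + b \cdot 6 + c = 36a + 6b + c.$$
Now plug $\overset{4}{=}$ and $\overset{5}{=}$ into $\overset{3}{=}$ to get: $36a + 6(1.5 - 4a) + 3a + 0.5 = 9 \iff 15a = -0.5$.
Hence, the (unique) solution is $(a, b, c) = \left(-\frac{1}{30}, \frac{49}{30}, \frac{2}{5}\right)$.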
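As a cross-check on the algebra, the three equations can be solved exactly with rational arithmetic. This is a minimal sketch using Gauss-Jordan elimination (not the substitution route taken in the answer); the helper name `solve3` is hypothetical, and the rows encode the points (1, 2), (3, 5), and (6, 9) read off from the three equations.

```python
from fractions import Fraction

def solve3(rows):
    """Gauss-Jordan elimination on an n x (n+1) augmented matrix, in exact arithmetic."""
    m = [[Fraction(v) for v in row] for row in rows]
    n = len(m)
    for col in range(n):
        # Find a row with a non-zero entry in this column and swap it into place.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        # Scale the pivot row so that the pivot equals 1.
        m[col] = [v / m[col][col] for v in m[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[-1] for row in m]

# Each row encodes a*x^2 + b*x + c = y for one of the three points.
a, b, c = solve3([[1, 1, 1, 2], [9, 3, 1, 5], [36, 6, 1, 9]])
print(a, b, c)
```

Working with `Fraction` avoids floating-point rounding, so the result can be compared directly with the fractions found by hand.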
A146. Recall that the strict global minimum point of a quadratic equation occurs where
$x = -b/2a$ and thus where:
$$y = a\left(-\frac{b}{2a}\right)^2 + b\left(-\frac{b}{2a}\right) + c = \frac{b^2}{4a} - \frac{b^2}{2a} + c = c - \frac{b^2}{4a}.$$
The above, along with the fact that (−1, 2) solves the given equation, yields the following
system of equations:
$$2 \overset{1}{=} a \cdot (-1)^2 + b \cdot (-1) + c = a - b + c, \quad 0 \overset{2}{=} -\frac{b}{2a}, \quad \text{and} \quad 0 \overset{3}{=} c - \frac{b^2}{4a}.$$
then graph this equation on our GC. It looks like there are no x-intercepts. We thus
conclude that this system of equations has no solutions and its solution set is ∅.
(b) Again, use Method 2 and rewrite the two equations as:
$$y = \frac{1}{1 - x^2} - x^3 - \sin x,$$
then graph this equation on our GC. It looks like there is only one x-intercept. Using the
“zero” function, we find that it’s at x ≈ −1.179.
Plug this value of x back into either of the original equations to get: y ≈ −2.563. We
conclude that the (unique) solution is ∼ (−1.179, −2.563).
7. Now as usual, use the left and right arrow keys to move the cursor to approximately where
we think the intersection point is. In my case, I’ve moved it to (x, y) ≈ (0.745, 0.678).
8. Now press ENTER . The TI84 annoyingly now asks, “Guess?” So press ENTER
once more to learn that the intersection point is (x, y) ≈ (0.739, 0.674).
By repeating Steps 4–8 (and making the necessary changes), you should be able to find
that the other intersection point is (x, y) ≈ (−0.739, −0.674). (Alternatively, you can save
yourself some time by observing the symmetries here and immediately infer that this is the
other intersection point.)
We conclude that this system of equations has two solutions: (±0.739 . . . , ±0.673 . . . ).
$$\frac{8}{x^2 + x - 6} = \frac{8}{5(x - 2)} - \frac{8}{5(x + 3)}.$$
Comparing coefficients, $A + 3B \overset{1}{=} 17$ and $-3A + B \overset{2}{=} -5$. $3 \times \overset{1}{=}$ plus $\overset{2}{=}$ yields $10B = 46$. So,
$p(1) = 1 - 1 - 1 + 1 = 0$. ✓
$$x^3 - x^2 - x + 1 = (x - 1)(ax^2 + bx + c).$$
$$\frac{2x^2 - x + 7}{x^3 - x^2 - x + 1} = \frac{A}{x + 1} + \frac{B}{x - 1} + \frac{C}{(x - 1)^2} = \frac{A(x - 1)^2 + B(x + 1)(x - 1) + C(x + 1)}{(x + 1)(x - 1)^2} = \frac{(A + B)x^2 + (-2A + C)x + A - B + C}{(x + 1)(x - 1)^2}.$$
$$\frac{2x^2 - x + 7}{x^3 - x^2 - x + 1} = \frac{5}{2(x + 1)} - \frac{1}{2(x - 1)} + \frac{4}{(x - 1)^2}.$$
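A partial fractions decomposition is easy to sanity-check by evaluating both sides at a few values of x. The sketch below does this in exact arithmetic for the decomposition just found; the sample points are arbitrary integers chosen to avoid x = ±1, where the expressions are undefined, and the names `lhs` and `rhs` are illustrative.

```python
from fractions import Fraction

def lhs(x):
    """Original rational function (2x^2 - x + 7) / (x^3 - x^2 - x + 1)."""
    return Fraction(2 * x * x - x + 7, x**3 - x**2 - x + 1)

def rhs(x):
    """Claimed decomposition 5/(2(x+1)) - 1/(2(x-1)) + 4/(x-1)^2."""
    return (Fraction(5, 2 * (x + 1)) - Fraction(1, 2 * (x - 1))
            + Fraction(4, (x - 1) ** 2))

samples = [2, 3, 5, -2, -3, 10]
print(all(lhs(x) == rhs(x) for x in samples))
```

Two rational functions of this degree that agree at this many points must be identical, so a `True` here is strong evidence that the coefficients are right.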
$p(1) = 1 - 2 + 4 - 8 < 0$. ✗
Aiyah, sian. Doesn’t work: by the FT, $x - 1$ is not a factor of $p(x)$. Now try 2 instead:
$p(2) = 2^3 - 2 \cdot 2^2 + 4 \cdot 2 - 8 = 0$. ✓
$$x^3 - 2x^2 + 4x - 8 = (x - 2)(ax^2 + bx + c).$$
$$= \frac{A(x^2 + 4) + (Bx + C)(x - 2)}{(x - 2)(x^2 + 4)} = \frac{(A + B)x^2 + (C - 2B)x + 4A - 2C}{(x - 2)(x^2 + 4)}$$
$2 \times \overset{1}{=}$ plus $\overset{2}{=}$ yields $2A + C \overset{4}{=} -6$. Now $2 \times \overset{4}{=}$ plus $\overset{3}{=}$ yields $8A = -7$ and thus $A = -7/8$. We can
$$\frac{-3x^2 + 5}{x^3 - 2x^2 + 4x - 8} = -\frac{7}{8(x - 2)} - \frac{17(x + 2)}{8(x^2 + 4)}.$$
We can easily verify that x = 0 and x = π/2 do indeed satisfy =, while x = π and x = π/2 do
2 4 1 3 4
√ √
12 1 + 12 + 1 + 1 = 1 + 1 + 1 + 1 = 4 ≠ 0,
The error is in Step 4, where $\overset{4}{\iff}$ is incorrectly given as an ⇐⇒ or “is equivalent to” or “if
and only if” step. It is not. It is a squaring operation and is thus, as usual, an irreversible
step that should’ve been written as $\overset{4}{\implies}$. It is in this step that extraneous solutions may
1
A151. Plug x = 1 into =: 12 + (−1 − ) + 1 = 1 − 2 + 1 = 0.
5 4 4
3
1
Plug x = 1 into =: 12 + 1 + 1 = 3 ≠ 0.
5 1
7
existence of some (real) solution to =, where none may exist (and indeed none does) and
1
A155(a) $\sum_{n=0}^{6} (n + 1)! = 1 + 2 + 6 + 24 + 120 + 720 + 5\,040 = 5\,913$.
(b) $\sum_{n=0}^{7} (3n + 2) = 2 + 5 + 8 + 11 + 14 + 17 + 20 + 23 = 100$.
(c) $\sum_{n=0}^{6} \left(\frac{n}{2} + \frac{1}{2}\right) = \frac{1}{2} + 1 + \frac{3}{2} + 2 + \frac{5}{2} + 3 + \frac{7}{2} = 14$.
(d) $\sum_{n=0}^{5} (8 - n) = 8 + 7 + 6 + 5 + 4 + 3 = 33$.
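Finite sums like these translate directly into code, which gives a quick way to check the arithmetic. In this sketch, `range(0, 7)` plays the role of n = 0, 1, ..., 6 (the upper bound of `range` is exclusive); the variable names are illustrative.

```python
from fractions import Fraction
from math import factorial

# (a) sum of (n + 1)! for n = 0..6
a = sum(factorial(n + 1) for n in range(0, 7))
# (b) sum of 3n + 2 for n = 0..7
b = sum(3 * n + 2 for n in range(0, 8))
# (c) sum of n/2 + 1/2 for n = 0..6, kept exact with Fraction
c = sum(Fraction(n, 2) + Fraction(1, 2) for n in range(0, 7))
# (d) sum of 8 - n for n = 0..5
d = sum(8 - n for n in range(0, 6))

print(a, b, c, d)
```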
A156(a) $\sum_{i=1}^{4} (2 - i)^i = (2 - 1)^1 + (2 - 2)^2 + (2 - 3)^3 + (2 - 4)^4 = 1^1 + 0^2 + (-1)^3 + (-2)^4 = 1 - 1 + 16 = 16$.
(b) $\sum_{\star=16}^{17} (4\star + 5) = (4 \cdot 16 + 5) + (4 \cdot 17 + 5) = 69 + 73 = 142$.
(c) $\sum_{x=31}^{33} (x - 3) = (31 - 3) + (32 - 3) + (33 - 3) = 28 + 29 + 30 = 87$.
A157(a) $\sum n! = \sum_{n=1}^{\infty} n! = 1 + 2 + 6 + 24 + 120 + 720 + 5\,040 + \dots$
(b) $\sum (3n - 1) = \sum_{n=1}^{\infty} (3n - 1) = 2 + 5 + 8 + 11 + 14 + 17 + 20 + 23 + \dots$
(c) $\sum \frac{n}{2} = \sum_{n=1}^{\infty} \frac{n}{2} = \frac{1}{2} + 1 + \frac{3}{2} + 2 + \frac{5}{2} + 3 + \frac{7}{2} + \dots$
(d) $\sum (9 - n) = \sum_{n=1}^{\infty} (9 - n) = 8 + 7 + 6 + 5 + 4 + 3 + \dots$
A158(a) $\sum_{n=0}^{\infty} (n + 1)! = 1 + 2 + 6 + 24 + 120 + 720 + 5\,040 + \dots$
(b) $\sum_{n=0}^{\infty} (3n + 2) = 2 + 5 + 8 + 11 + 14 + 17 + 20 + 23 + \dots$
(c) $\sum_{n=0}^{\infty} \left(\frac{n}{2} + \frac{1}{2}\right) = \frac{1}{2} + 1 + \frac{3}{2} + 2 + \frac{5}{2} + 3 + \frac{7}{2} + \dots$
(d) $\sum_{n=0}^{\infty} (8 - n) = 8 + 7 + 6 + 5 + 4 + 3 + \dots$
$$\frac{k}{2}(a_1 + a_k) = \frac{200}{2}(2 + 997) = 99\,900.$$
(b) $b_1 = 3$, $b_k = 1\,703$, and $d = 17$. So, $k = (1\,703 - 3)/17 + 1 = 101$ terms. Thus, the sum is:
$$\frac{k}{2}(b_1 + b_k) = \frac{101}{2}(3 + 1\,703) = 86\,153.$$
(c) $c_1 = 81$, $c_k = 8\,081$, and $d = 8$. So, $k = (8\,081 - 81)/8 + 1 = 1\,001$ terms. Thus, the sum is:
$$\frac{k}{2}(c_1 + c_k) = \frac{1\,001}{2}(81 + 8\,081) = 4\,085\,081.$$
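The two-step recipe used above (count the terms, then apply k(first + last)/2) can be written as a short function. This is a sketch only; the name `arithmetic_series` is hypothetical, and for (c) it uses the common difference 8 that appears in the term-count calculation (8 081 − 81)/8 + 1.

```python
def arithmetic_series(first, last, d):
    """Return (number of terms, sum) of first, first + d, ..., last."""
    steps, remainder = divmod(last - first, d)
    if remainder != 0:
        raise ValueError("last is not reachable from first in steps of d")
    k = steps + 1
    # k * (first + last) is always even for an integer arithmetic series.
    return k, k * (first + last) // 2

print(arithmetic_series(3, 1703, 17))
print(arithmetic_series(81, 8081, 8))
```

A brute-force comparison such as `sum(range(81, 8082, 8))` gives an independent check on the closed form.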
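$$\frac{a_1 - ra_k}{1 - r} = \frac{7 - 2 \cdot 896}{1 - 2} = 1\,785.$$
(b) $b_1 = 20$, $b_k = 5/8$, and $r = 1/2$. By Corollary 6, the sum of this series is:
$$\frac{b_1 - rb_k}{1 - r} = \frac{20 - \frac{1}{2} \cdot \frac{5}{8}}{1 - \frac{1}{2}} = 2\left(20 - \frac{1}{2} \cdot \frac{5}{8}\right) = 40 - \frac{5}{8} = 39\tfrac{3}{8}.$$
(c) $c_1 = 1$, $c_k = 1/243$, and $r = 1/3$. By Corollary 6, the sum of this series is:
$$\frac{c_1 - rc_k}{1 - r} = \frac{1 - \frac{1}{3} \cdot \frac{1}{243}}{1 - \frac{1}{3}} = \frac{3}{2}\left(1 - \frac{1}{729}\right) = \frac{364}{243}.$$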
A161(a) a1 = 6 and r = 3/4. Thus, the sum of this series is 6/ (1 − 3/4) = 24.
(b) b1 = 20 and r = 1/2. Thus, the sum of this series is 20/ (1 − 1/2) = 40.
(c) c1 = 1 and r = 1/3. Thus, the sum of this series is 1/ (1 − 1/3) = 3/2.
Thus: $S_5 = \frac{1 - 6x^5 + 5x^6}{(1 - x)^2}$.
Thus: $S_k = \frac{1 - (k + 1)x^k + kx^{k+1}}{(1 - x)^2}$.
Hence, the $n$th term is: $\frac{1}{(n + 1)^2 - 1}$.
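And: $\frac{1}{3} + \frac{1}{8} + \frac{1}{15} + \frac{1}{24} + \frac{1}{35} + \dots + \frac{1}{999\,999} = \sum_{n=1}^{999} \frac{1}{(n + 1)^2 - 1}$.
Take the $n$th term, factorise its denominator, then do the partial fractions decomposition:
$$\frac{1}{(n + 1)^2 - 1} = \frac{1}{(n + 1 - 1)(n + 1 + 1)} = \frac{1}{n(n + 2)} = \frac{A}{n} + \frac{B}{n + 2} = \frac{A(n + 2) + Bn}{n(n + 2)} = \frac{(A + B)n + 2A}{n(n + 2)}.$$
$$\frac{1}{3} + \frac{1}{8} + \frac{1}{15} + \frac{1}{24} + \frac{1}{35} + \dots + \frac{1}{999\,999} = \frac{1}{2}\left(\frac{1}{1} - \frac{1}{3} + \frac{1}{2} - \frac{1}{4} + \frac{1}{3} - \frac{1}{5} + \frac{1}{4} - \frac{1}{6} + \dots + \frac{1}{997} - \frac{1}{999} + \frac{1}{998} - \frac{1}{1\,000} + \frac{1}{999} - \frac{1}{1\,001}\right).$$
Observe that all the terms with denominators 3 through 999 will be cancelled out.
Thus: $\sum_{n=1}^{999} \frac{1}{(n + 1)^2 - 1} = \frac{1}{2}\left(\frac{1}{1} + \frac{1}{2} - \frac{1}{1\,000} - \frac{1}{1\,001}\right) = \frac{3}{4} - \frac{1}{2\,000} - \frac{1}{2\,002} = 0.749\ldots$
More generally: $\sum_{n=1}^{k} \frac{1}{(n + 1)^2 - 1} = \frac{3}{4} - \frac{1}{2(k + 1)} - \frac{1}{2(k + 2)}$.
Hence: $\frac{1}{3} + \frac{1}{8} + \frac{1}{15} + \frac{1}{24} + \frac{1}{35} + \dots = \lim_{k \to \infty} \sum_{n=1}^{k} \frac{1}{(n + 1)^2 - 1} = \lim_{k \to \infty} \left(\frac{3}{4} - \frac{1}{2(k + 1)} - \frac{1}{2(k + 2)}\right) = \frac{3}{4}$.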
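The closed form for the partial sums can be compared against a direct term-by-term summation. The sketch below does so in exact arithmetic for a few values of k; the function names `partial_sum` and `closed_form` are illustrative only.

```python
from fractions import Fraction

def partial_sum(k):
    """Add up 1 / ((n + 1)^2 - 1) for n = 1..k, term by term."""
    return sum(Fraction(1, (n + 1) ** 2 - 1) for n in range(1, k + 1))

def closed_form(k):
    """Telescoped formula 3/4 - 1/(2(k+1)) - 1/(2(k+2))."""
    return Fraction(3, 4) - Fraction(1, 2 * (k + 1)) - Fraction(1, 2 * (k + 2))

print(all(partial_sum(k) == closed_form(k) for k in (1, 2, 10, 999)))
print(float(closed_form(999)))
```

As k grows, the two subtracted fractions shrink towards zero, which is the numerical face of the limit 3/4.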
More generally: $\sum_{n=1}^{k} \lg \frac{n}{n + 1} = \lg 1 - \lg (k + 1) = -\lg (k + 1)$.
Hence: $\lg \frac{1}{2} + \lg \frac{2}{3} + \lg \frac{3}{4} + \dots = \lim_{k \to \infty} \sum_{n=1}^{k} \lg \frac{n}{n + 1} = \lim_{k \to \infty} (-\lg (k + 1)) = -\infty$.
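(c) The $n$th term is $\frac{1}{(n + 1)\sqrt{n} + n\sqrt{n + 1}}$. Rationalise the surds:
$$\frac{1}{(n + 1)\sqrt{n} + n\sqrt{n + 1}} \cdot \frac{(n + 1)\sqrt{n} - n\sqrt{n + 1}}{(n + 1)\sqrt{n} - n\sqrt{n + 1}} = \frac{(n + 1)\sqrt{n} - n\sqrt{n + 1}}{(n + 1)^2 n - n^2 (n + 1)}$$
$$= \frac{(n + 1)\sqrt{n} - n\sqrt{n + 1}}{n^3 + 2n^2 + n - (n^3 + n^2)} = \frac{(n + 1)\sqrt{n} - n\sqrt{n + 1}}{n^2 + n} = \frac{\sqrt{n}}{n} - \frac{\sqrt{n + 1}}{n + 1}.$$
Thus:
$$\sum_{n=1}^{99} \frac{1}{(n + 1)\sqrt{n} + n\sqrt{n + 1}} = \frac{1}{2\sqrt{1} + 1\sqrt{2}} + \frac{1}{3\sqrt{2} + 2\sqrt{3}} + \dots + \frac{1}{100\sqrt{99} + 99\sqrt{100}}$$
$$= \frac{\sqrt{1}}{1} - \frac{\sqrt{2}}{2} + \frac{\sqrt{2}}{2} - \frac{\sqrt{3}}{3} + \dots + \frac{\sqrt{99}}{99} - \frac{\sqrt{100}}{100} = \frac{\sqrt{1}}{1} - \frac{\sqrt{100}}{100} = 1 - \frac{10}{100} = 0.9.$$
More generally: $\sum_{n=1}^{k} \frac{1}{(n + 1)\sqrt{n} + n\sqrt{n + 1}} = 1 - \frac{\sqrt{k + 1}}{k + 1} = 1 - \frac{1}{\sqrt{k + 1}}$.
Hence: $\lim_{k \to \infty} \sum_{n=1}^{k} \frac{1}{(n + 1)\sqrt{n} + n\sqrt{n + 1}} = \lim_{k \to \infty} \left(1 - \frac{1}{\sqrt{k + 1}}\right) = 1$.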
$$\sum_{n=1}^{k} \left[(n + 1)^4 - n^4\right] = 2^4 - 1^4 + 3^4 - 2^4 + 4^4 - 3^4 + \dots + (k + 1)^4 - k^4 = (k + 1)^4 - 1^4 = k^4 + 4k^3 + 6k^2 + 4k.$$
$$4\sum_{n=1}^{k} n^3 + k(k + 1)(2k + 3) + k = k^4 + 4k^3 + 6k^2 + 4k$$
$$\iff \sum_{n=1}^{k} n^3 = \frac{k^4 + 4k^3 + 6k^2 + 4k - (2k^3 + 5k^2 + 4k)}{4} = \frac{k^4 + 2k^3 + k^2}{4} = \frac{k^2 (k + 1)^2}{4}.$$
$$1^3 + 2^3 + 3^3 + \dots = \lim_{k \to \infty} \sum_{n=1}^{k} n^3 = \lim_{k \to \infty} \frac{k^2 (k + 1)^2}{4} = \infty.$$
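The formula for the sum of the first k cubes is easy to test by brute force. A minimal sketch, with illustrative function names:

```python
def sum_of_cubes(k):
    """Add 1^3 + 2^3 + ... + k^3 directly."""
    return sum(n ** 3 for n in range(1, k + 1))

def closed_form(k):
    """k^2 (k + 1)^2 / 4, which is always an integer."""
    return k * k * (k + 1) ** 2 // 4

print(all(sum_of_cubes(k) == closed_form(k) for k in range(1, 200)))
```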
Ð→ ⎛ −3 ⎞
= (−3, −4) = = b 5 3 “ W
⎝ −4 ⎠
BA B A
Ð→ ⎛1⎞ √
= (1, 3) = = c 10 1 “ E
⎝3⎠
BG B G
Ð→ ⎛ −4 ⎞ √
= (−4, −7) = = d 65 4 “ W
⎝ −7 ⎠
GA G A
Ð→ ⎛ −1 ⎞ √
= (−1, −3) = = e 10 1 “ W
⎝ −3 ⎠
GB G B
A166. Here is one possible counterexample (yours may be different but still correct).
Let $u = (1, 0)$ and $v = (0, 1)$. Then $|u| = |(1, 0)| = \sqrt{1^2 + 0^2} = 1$ and $|v| = |(0, 1)| = \sqrt{0^2 + 1^2} = 1$,
so that $|u| + |v| = 1 + 1 = 2$. However, $u + v = (1, 1)$, so that $|u + v| = |(1, 1)| = \sqrt{1^2 + 1^2} = \sqrt{2}$.
Hence, $|u + v| \ne |u| + |v|$.
(As we’ll learn later, the correct assertion is this: ∣u + v∣ ≤ ∣u∣ + ∣v∣, with ∣u + v∣ = ∣u∣ + ∣v∣ if
and only if u and v point in the same direction.)
A171. The vectors b and c point in the exact opposite directions because c = −3b.
The vectors b and d point in different directions because b ≠ kd for any k.
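A173. The unit vectors of $\overrightarrow{AB} = (1, 3)$, $\overrightarrow{AC} = (4, 2)$, and $\overrightarrow{BC} = (3, -1)$ are, respectively:
$$\widehat{\overrightarrow{AB}} = \frac{(1, 3)}{\sqrt{1^2 + 3^2}} = \frac{1}{\sqrt{10}}(1, 3) = \left(\frac{1}{\sqrt{10}}, \frac{3}{\sqrt{10}}\right),$$
$$\widehat{\overrightarrow{AC}} = \frac{(4, 2)}{\sqrt{4^2 + 2^2}} = \frac{1}{\sqrt{20}}(4, 2) = \left(\frac{2}{\sqrt{5}}, \frac{1}{\sqrt{5}}\right),$$
$$\widehat{\overrightarrow{BC}} = \frac{(3, -1)}{\sqrt{3^2 + (-1)^2}} = \frac{1}{\sqrt{10}}(3, -1) = \left(\frac{3}{\sqrt{10}}, \frac{-1}{\sqrt{10}}\right).$$
Of course, the above are also the unit vectors of $2\overrightarrow{AB}$, $3\overrightarrow{AC}$, and $4\overrightarrow{BC}$, respectively.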
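$$w = \begin{pmatrix} -1 \\ 0 \end{pmatrix} = 2\begin{pmatrix} 1 \\ 2 \end{pmatrix} - \begin{pmatrix} 3 \\ 4 \end{pmatrix} = 2a - b.$$
$$d = \begin{pmatrix} 1 \\ 1 \end{pmatrix} = \frac{1}{8}\begin{pmatrix} 1 \\ 3 \end{pmatrix} + \frac{1}{8}\begin{pmatrix} 7 \\ 5 \end{pmatrix}.$$
$$p = \frac{6a + 5b}{5 + 6} = \frac{6}{11}\begin{pmatrix} 1 \\ 2 \end{pmatrix} + \frac{5}{11}\begin{pmatrix} 3 \\ 4 \end{pmatrix} = \frac{1}{11}\begin{pmatrix} 21 \\ 32 \end{pmatrix},$$
$$q = \frac{a + 5b}{5 + 1} = \frac{1}{6}\begin{pmatrix} 1 \\ 4 \end{pmatrix} + \frac{5}{6}\begin{pmatrix} 2 \\ 3 \end{pmatrix} = \frac{1}{6}\begin{pmatrix} 11 \\ 19 \end{pmatrix},$$
$$r = \frac{3a + 2b}{2 + 3} = \frac{3}{5}\begin{pmatrix} -1 \\ 2 \end{pmatrix} + \frac{2}{5}\begin{pmatrix} 3 \\ -4 \end{pmatrix} = \frac{1}{5}\begin{pmatrix} 3 \\ -2 \end{pmatrix}.$$
Thus: $P = \left(\frac{21}{11}, \frac{32}{11}\right)$, $Q = \left(\frac{11}{6}, \frac{19}{6}\right)$, and $R = \left(\frac{3}{5}, -\frac{2}{5}\right)$.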
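The three computations above all have the form (µa + λb)/(λ + µ), so they can be checked with one small function. This is a sketch with a hypothetical name, `divide`, and it takes the ratio as λ : µ in that order, which is an assumption about how the original questions were phrased.

```python
from fractions import Fraction

def divide(a, b, lam, mu):
    """Point (mu*a + lam*b) / (lam + mu), computed coordinate by coordinate."""
    return tuple(Fraction(mu * ai + lam * bi, lam + mu) for ai, bi in zip(a, b))

p = divide((1, 2), (3, 4), 5, 6)    # (6a + 5b) / (5 + 6)
q = divide((1, 4), (2, 3), 5, 1)    # (a + 5b) / (5 + 1)
r = divide((-1, 2), (3, -4), 2, 3)  # (3a + 2b) / (2 + 3)
print(p, q, r)
```

Because the function zips over coordinates, it works unchanged for points in three dimensions as well.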
A178(a) The line described by −5x + y + 1 = 0 contains the point (0, −1) and has direction
vector (1, 5). Thus, it can also be described by r = (0, −1) + λ (1, 5) (λ ∈ R). λ = −1, λ = 0,
and λ = 1 produce the points (−1, −6), (0, −1), and (1, 4).
(b) The line described by x − 2y − 1 = 0 contains the point (1, 0) and has direction vector
(2, 1). Thus, it can also be described by r = (1, 0) + λ (2, 1) (λ ∈ R). λ = −1, λ = 0, and λ = 1
produce the points (−1, −1), (1, 0), and (3, 1).
(c) The line described by y − 4 = 0 contains the point (0, 4) and has direction vector (1, 0).
Thus, it can also be described by r = (0, 4) + λ (1, 0) (λ ∈ R). λ = −1, λ = 0, and λ = 1
produce the points (−1, 4), (0, 4), and (1, 4).
(d) The line described by x − 4 = 0 contains the point (4, 0) and has direction vector (0, 1).
Thus, it can also be described by r = (4, 0) + λ (0, 1) (λ ∈ R). λ = −1, λ = 0, and λ = 1
produce the points (4, −1), (4, 0), and (4, 1).
A179. If v = 0, then the “line” would not be a line, but the single point P :
{R ∶ r = p + λv (λ ∈ R)} = {R ∶ r = p} = {P } .
And so, we impose the restriction that v ≠ 0 to rule out the above trivial (or degenerate)
case.
Then $7 \times \overset{2}{=}$ minus $8 \times \overset{1}{=}$ yields: $7y - 8x = 2$ or $y = \frac{8}{7}x + \frac{2}{7}$.
(c) Write out $x \overset{1}{=} 3\lambda$ and $y \overset{2}{=} -3$.
This is a horizontal line. We can discard $\overset{1}{=}$ and be left with the single equation $y = -3$.
This is a vertical line. We can discard $\overset{2}{=}$ and be left with the single equation $x = 1$.
A181(a) $\frac{x - (-1)}{1} = \frac{y - 3}{-2}$ or $y = -2x + 1$.
(b) $\frac{x - 5}{7} = \frac{y - 6}{8}$ or $y = \frac{8}{7}x + \frac{2}{7}$.
(c) y = −3.
(d) x = 1.
126.3. Ch. 36 Answers (The Scalar Product)
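A182. $v \cdot x = \begin{pmatrix} 2 \\ 1 \end{pmatrix} \cdot \begin{pmatrix} 8 \\ 7 \end{pmatrix} = 16 + 7 = 23$,
$w \cdot x = \begin{pmatrix} -4 \\ 0 \end{pmatrix} \cdot \begin{pmatrix} 8 \\ 7 \end{pmatrix} = -32 + 0 = -32$,
A184. $|a| = |(-2, 3)| = \sqrt{(-2)^2 + 3^2} = \sqrt{(-2, 3) \cdot (-2, 3)} = \sqrt{a \cdot a}$.
$|b| = |(7, 1)| = \sqrt{7^2 + 1^2} = \sqrt{(7, 1) \cdot (7, 1)} = \sqrt{b \cdot b}$.
$|c| = |(5, -4)| = \sqrt{5^2 + (-4)^2} = \sqrt{(5, -4) \cdot (5, -4)} = \sqrt{c \cdot c}$.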
θ is acute. u and v are neither parallel nor perpendicular and point in different directions.
(d) The angle between u = (2, −3) and v = (1, 2) is:
θ is obtuse. u and v are neither parallel nor perpendicular and point in different directions.
A187. $|u + v|^2 = (u + v) \cdot (u + v)$ (Fact 67)
$= u \cdot u + u \cdot v + v \cdot u + v \cdot v$ (Distributivity)
$= u \cdot u + 2u \cdot v + v \cdot v$ (Commutativity)
$= |u|^2 + 2u \cdot v + |v|^2$ (Fact 67 again)
$= |u|^2 + 0 + |v|^2$ ($u \cdot v = 0$ because $u \perp v$)
$= |u|^2 + |v|^2$.
A188. $|u + v|^2 = |u|^2 + 2u \cdot v + |v|^2$
$\le |u|^2 + 2|u||v| + |v|^2$ (Cauchy’s Inequality)
$= (|u| + |v|)^2$. (Complete the square)
We’ve just shown that $|u + v|^2 \le (|u| + |v|)^2$. And so, taking square roots, we also have
$|u + v| \le |u| + |v|$.
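Vector                (1, 3)                                    (4, 2)                                                    (−1, 2)
x-direction cosine    $\frac{1}{\sqrt{1^2 + 3^2}} = \frac{1}{\sqrt{10}}$    $\frac{4}{\sqrt{4^2 + 2^2}} = \frac{4}{\sqrt{20}} = \frac{2}{\sqrt{5}}$    $\frac{-1}{\sqrt{(-1)^2 + 2^2}} = \frac{-1}{\sqrt{5}}$
y-direction cosine    $\frac{3}{\sqrt{1^2 + 3^2}} = \frac{3}{\sqrt{10}}$    $\frac{2}{\sqrt{4^2 + 2^2}} = \frac{2}{\sqrt{20}} = \frac{1}{\sqrt{5}}$    $\frac{2}{\sqrt{(-1)^2 + 2^2}} = \frac{2}{\sqrt{5}}$
Unit vector           $\left(\frac{1}{\sqrt{10}}, \frac{3}{\sqrt{10}}\right)$    $\left(\frac{2}{\sqrt{5}}, \frac{1}{\sqrt{5}}\right)$    $\left(\frac{-1}{\sqrt{5}}, \frac{2}{\sqrt{5}}\right)$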
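$$\left|\mathrm{proj}_{(33,33)} (1, 0)\right| = \left|\mathrm{proj}_{(1,1)} (1, 0)\right| = \left|\frac{(1, 0) \cdot (1, 1)}{|(1, 1)|}\right| = \frac{1 \cdot 1 + 0 \cdot 1}{\sqrt{2}} = \frac{1}{\sqrt{2}} = \frac{\sqrt{2}}{2}.$$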
(c) We just showed that: ∣proj(33,33) (1, 0)∣ ≠ ∣proj(1,0) (33, 33)∣.
Therefore, the given statement is false. In general, the projection of a on b is not the same
as the projection of b on a.
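$$C = \begin{pmatrix} 0 \\ -1 \end{pmatrix} = \begin{pmatrix} 3 \\ 1 \end{pmatrix} + \hat{\lambda}\begin{pmatrix} -2 \\ 5 \end{pmatrix} \quad \text{or} \quad 0 \overset{1}{=} 3 - 2\hat{\lambda}, \quad -1 \overset{2}{=} 1 + 5\hat{\lambda}.$$
From $\overset{1}{=}$, we have $\hat{\lambda} = 1.5$. But this contradicts $\overset{2}{=}$. So, there is no solution to the above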
vector equation (or system of two equations), meaning our line does not contain C. Thus,
A, B, and C are not collinear.
(b) The unique line that contains both A = (1, 2) and B = (0, 0) is:
$$r = \overrightarrow{OA} + \lambda\overrightarrow{AB} = (1, 2) + \lambda(-1, -2) \quad (\lambda \in \mathbb{R}).$$
Observe that λ̂ = −2 solves the above vector equation (or system of two equations). Hence,
our line does indeed contain the point C (it corresponds to λ̂ = −2). Thus, A, B, and C are
collinear.
A194(a) $|a| = \sqrt{a_1^2 + a_2^2}$, $|b| = \sqrt{b_1^2 + b_2^2}$, $|a \times b| = |a_1 b_2 - a_2 b_1|$, and:
$$\cos \theta = \frac{a \cdot b}{|a||b|} = \frac{a_1 b_1 + a_2 b_2}{\sqrt{a_1^2 + a_2^2}\sqrt{b_1^2 + b_2^2}}.$$
But by (b), $\sin \theta \ge 0$ and so we can discard the negative value. Altogether then:
$$\sin \theta = \sqrt{1 - \cos^2 \theta}.$$
(d) $$\sin \theta = \sqrt{1 - \cos^2 \theta} = \sqrt{1 - \frac{(a_1 b_1 + a_2 b_2)^2}{(a_1^2 + a_2^2)(b_1^2 + b_2^2)}}.$$
(f) $$|a||b| \sin \theta = \sqrt{a_1^2 + a_2^2}\sqrt{b_1^2 + b_2^2}\sqrt{1 - \frac{(a_1 b_1 + a_2 b_2)^2}{(a_1^2 + a_2^2)(b_1^2 + b_2^2)}} = \sqrt{(a_1^2 + a_2^2)(b_1^2 + b_2^2) - (a_1 b_1 + a_2 b_2)^2}$$
$$\overset{(e)}{=} \sqrt{(a_1 b_2 - a_2 b_1)^2} = |a_1 b_2 - a_2 b_1| = |a \times b|.$$
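$$\mathrm{proj}_v \overrightarrow{PB} = \mathrm{proj}_{(5,1)} (1, 5) = \frac{(1, 5) \cdot (5, 1)}{5^2 + 1^2}\begin{pmatrix} 5 \\ 1 \end{pmatrix} = \frac{5 + 5}{26}\begin{pmatrix} 5 \\ 1 \end{pmatrix} = \frac{5}{13}\begin{pmatrix} 5 \\ 1 \end{pmatrix}.$$
By Fact 86 then, the feet of the perpendiculars from A and B to the line are:
$$P + \mathrm{proj}_v \overrightarrow{PA} = \begin{pmatrix} 2 \\ -3 \end{pmatrix} - \frac{6}{13}\begin{pmatrix} 5 \\ 1 \end{pmatrix} = \frac{1}{13}\begin{pmatrix} -4 \\ -45 \end{pmatrix},$$
$$P + \mathrm{proj}_v \overrightarrow{PB} = \begin{pmatrix} 2 \\ -3 \end{pmatrix} + \frac{5}{13}\begin{pmatrix} 5 \\ 1 \end{pmatrix} = \frac{1}{13}\begin{pmatrix} 51 \\ -34 \end{pmatrix} = \frac{17}{13}\begin{pmatrix} 3 \\ -2 \end{pmatrix}.$$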
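A196(a)(i) By definition, $\mathrm{proj}_v \overrightarrow{PA} = (\overrightarrow{PA} \cdot \hat{v})\hat{v} = [(\overrightarrow{PA} \cdot v)/|v|^2]\,v$. We’ve just shown
that we can write $\mathrm{proj}_v \overrightarrow{PA} = \lambda v$, with $\lambda = (\overrightarrow{PA} \cdot v)/|v|^2$.
(ii) We have $B = P + \mathrm{proj}_v \overrightarrow{PA} = P + \lambda v$ or equivalently $\overrightarrow{OB} = \overrightarrow{OP} + \lambda v$. We’ve just shown
that B satisfies l’s vector equation and thus that B is on l.
(b)(i) $\overrightarrow{AB} = B - A = P + \mathrm{proj}_v \overrightarrow{PA} - A = \overrightarrow{AP} + \mathrm{proj}_v \overrightarrow{PA} = -(\overrightarrow{PA} - \mathrm{proj}_v \overrightarrow{PA}) = -\mathrm{rej}_v \overrightarrow{PA}$.
(ii) Recall that in general, $\mathrm{rej}_b a \perp b$. Hence, $\mathrm{rej}_v \overrightarrow{PA} \perp v$.
(iii) Since $\mathrm{rej}_v \overrightarrow{PA} \perp v$, we also have $\overrightarrow{AB} = -(\mathrm{rej}_v \overrightarrow{PA}) \perp v$ and hence $\overrightarrow{AB} \perp l$.
(c)(i) $\overrightarrow{BC}$ is a direction vector of l, so that by definition of the foot of the perpendicular,
we must have $\overrightarrow{AB} \perp \overrightarrow{BC}$ or $\overrightarrow{AB} \cdot \overrightarrow{BC} = 0$.
(ii) We will prove that $\overrightarrow{AC} \cdot \overrightarrow{BC} \ne 0$:
$$\overrightarrow{AC} \cdot \overrightarrow{BC} = (\overrightarrow{AB} + \overrightarrow{BC}) \cdot \overrightarrow{BC} = \overrightarrow{AB} \cdot \overrightarrow{BC} + \overrightarrow{BC} \cdot \overrightarrow{BC} = 0 + |\overrightarrow{BC}|^2 = |\overrightarrow{BC}|^2 > 0.$$
We’ve just shown that $\overrightarrow{AC} \not\perp \overrightarrow{BC}$. And hence, $\overrightarrow{AC} \not\perp l$.
A197(a) If A is on l, then as noted in Remark 62, $d = 0$. Moreover, $\overrightarrow{PA} \parallel v$, so that by
Corollary 11, $\overrightarrow{PA} \times \hat{v} = 0$ and hence $|\overrightarrow{PA} \times \hat{v}| = 0$. Thus, we indeed have $d = |\overrightarrow{PA} \times \hat{v}|$.
(b) By Corollary 13, $d = |\overrightarrow{AB}|$. (c) $\overrightarrow{AB} = -\mathrm{rej}_v \overrightarrow{PA}$.
(d) Putting together (b) and (c), we have $d = |\overrightarrow{AB}| = |-\mathrm{rej}_v \overrightarrow{PA}| = |\mathrm{rej}_v \overrightarrow{PA}|$.
But by Fact 85, $|\mathrm{rej}_v \overrightarrow{PA}| = |\overrightarrow{PA} \times \hat{v}|$. Thus, $d = |\overrightarrow{PA} \times \hat{v}|$.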
A198(a) Let P = (8, 3) and v = (9, 3).
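Method 1 (Formula Method). First compute $\overrightarrow{PA} = (7, 3) - (8, 3) = (-1, 0)$ and:
So: $B = P + \mathrm{proj}_v \overrightarrow{PA} = (8, 3) - 0.1\,(9, 3) = (7.1, 2.7)$.
And the distance between A and l is:
$$|\overrightarrow{AB}| = |\overrightarrow{PA} \times \hat{v}| = \left|(-1, 0) \times \frac{(9, 3)}{\sqrt{9^2 + 3^2}}\right| = \left|\frac{(-1) \cdot 3 - 0 \cdot 9}{\sqrt{90}}\right| = \frac{3}{\sqrt{90}} = \frac{1}{\sqrt{10}} = \frac{\sqrt{10}}{10}.$$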
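The Formula Method is mechanical enough to script. The sketch below computes the foot of the perpendicular as P plus the projection of PA onto v, and the distance as |PA × v| / |v| (using the two-dimensional cross product a₁b₂ − a₂b₁). The function name `foot_and_distance` is hypothetical, and the inputs are P = (8, 3), v = (9, 3), A = (7, 3) from this answer.

```python
from fractions import Fraction
import math

def foot_and_distance(p, v, a):
    """Foot of the perpendicular from a to the line r = p + lambda*v, and the distance."""
    pa = (a[0] - p[0], a[1] - p[1])
    # Coefficient of v in the projection of PA onto v: (PA . v) / |v|^2.
    lam = Fraction(pa[0] * v[0] + pa[1] * v[1], v[0] ** 2 + v[1] ** 2)
    foot = (p[0] + lam * v[0], p[1] + lam * v[1])
    # 2D cross product PA x v, divided by |v| to use the unit vector.
    cross = pa[0] * v[1] - pa[1] * v[0]
    distance = abs(cross) / math.hypot(v[0], v[1])
    return foot, distance

foot, dist = foot_and_distance((8, 3), (9, 3), (7, 3))
print(foot, dist)
```

The foot is returned as exact fractions, so it can be compared directly with (7.1, 2.7); the distance is a float to be compared with √10/10.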
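So: $B = P + \mathrm{proj}_v \overrightarrow{PA} = (4, 4) - \frac{20}{53}(2, 7) = \frac{1}{53}(172, 72)$.
And the distance between A and l is:
$$|\overrightarrow{AB}| = |\overrightarrow{PA} \times \hat{v}| = \left|(4, -4) \times \frac{(2, 7)}{\sqrt{2^2 + 7^2}}\right| = \left|\frac{4 \cdot 7 - (-4) \cdot 2}{\sqrt{53}}\right| = \frac{36}{\sqrt{53}}.$$
$$\mathrm{proj}_v \overrightarrow{PA} = \mathrm{proj}_{(5,6)} (0, 1) = \frac{(0, 1) \cdot (5, 6)}{5^2 + 6^2}\begin{pmatrix} 5 \\ 6 \end{pmatrix} = \frac{0 + 6}{61}\begin{pmatrix} 5 \\ 6 \end{pmatrix} = \frac{6}{61}\begin{pmatrix} 5 \\ 6 \end{pmatrix}.$$
So: $B = P + \mathrm{proj}_v \overrightarrow{PA} = (8, 4) + \frac{6}{61}(5, 6) = \frac{1}{61}(518, 280)$.
And the distance between A and l is:
$$|\overrightarrow{PA} \times \hat{v}| = \left|(0, 1) \times \frac{(5, 6)}{\sqrt{5^2 + 6^2}}\right| = \left|\frac{0 - 5}{\sqrt{61}}\right| = \frac{5}{\sqrt{61}}.$$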
A201(a) A + B is undefined.
(b) $A - B$ is the vector $\overrightarrow{BA} = (1 - (-1),\ 2 - 0,\ 3 - 7) = (2, 2, -4)$.
(c) $B + C$ is undefined. Hence, $A + (B + C)$ is also undefined.
(d) $B - C$ is the vector $\overrightarrow{CB} = (-1 - 5,\ 0 - (-2),\ 7 - 3) = (-6, 2, 4)$. And so, $A + (B - C)$ is a
well-defined vector, namely $A + (B - C) = A + \overrightarrow{CB} = (1, 2, 3) + (-6, 2, 4) = (-5, 4, 7)$.
A202(a) u + v = (1, 2, 3) + (−1, 0, 7) = (0, 2, 10).
(b) u − v = (1, 2, 3) − (−1, 0, 7) = (2, 2, −4).
(c) u + (v + w) = u + v + w = (0, 2, 10) + (5, −2, 3) = (5, 0, 13).
(d) u + (v − w) = u + v − w = (0, 2, 10) − (5, −2, 3) = (−5, 4, 7).
A203. $\overrightarrow{AB} = B - A = (3, 6, -5) - (5, -1, 0) = (-2, 7, -5)$.
$\overrightarrow{AC} = C - A = (2, 2, 3) - (5, -1, 0) = (-3, 3, 3)$.
$\overrightarrow{BC} = C - B = (2, 2, 3) - (3, 6, -5) = (-1, -4, 8)$.
$\overrightarrow{AB} - \overrightarrow{AC} = (-2, 7, -5) - (-3, 3, 3) = (1, 4, -8) = -\overrightarrow{BC} = \overrightarrow{CB}$. ✓
$\overrightarrow{AB} + \overrightarrow{BC} = (-2, 7, -5) + (-1, -4, 8) = (-3, 3, 3) = \overrightarrow{AC}$. ✓
A204(a) v and w point in the exact opposite directions because v = −1.5w. They are thus
also parallel — v ∥ w.
(b) v and x point in different directions because x ≠ kv for all k ∈ R. They are thus also
non-parallel: v ∥/ x.
(c) w and x point in different directions because x ≠ kw for all k ∈ R. They are thus also
non-parallel: w ∥/ x.
(d) By our definitions (which covered only non-zero vectors), the zero vector 0 does not
point in the same, exact opposite, or different direction as any other vector (including, in
particular, u).
Also, it is neither parallel nor non-parallel to any other vector (including, in particular, u).
$$p = \frac{\mu a + \lambda b}{\lambda + \mu} = \frac{3(1, 2, 3) + 2(4, 5, 6)}{2 + 3} = \frac{1}{5}\begin{pmatrix} 11 \\ 16 \\ 21 \end{pmatrix}.$$
Hence, the point is $P = \frac{1}{5}(11, 16, 21)$.
A208(a) Informally, a vector is an “arrow” with two properties: direction and length.
(b) A point and a vector are entirely different objects and should not be confused.
Nonetheless, each can be described by an ordered triple of real numbers.
(c) Let A = (a1 , a2 , a3 ) be a point and a = (a1 , a2 , a3 ) be a vector. We say that a is A’s
position vector.
(d) The vector a = (a1 , a2 , a3 ) carries us from the origin to the point A = (a1 , a2 , a3 ).
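A209. $\overrightarrow{OA} = A - O = (a_1, a_2, a_3) = a = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} = a_1 i + a_2 j + a_3 k = \vec{a}$.
A210. $A + B$ is undefined.
$A + \overrightarrow{OB}$ is the point $(a_1 + b_1, a_2 + b_2, a_3 + b_3)$.
$\overrightarrow{OA} + \overrightarrow{OB}$ is the vector $(a_1 + b_1, a_2 + b_2, a_3 + b_3)$.
$\overrightarrow{OA} - \overrightarrow{OB}$ is the vector $\overrightarrow{BA} = (a_1 - b_1, a_2 - b_2, a_3 - b_3)$.
$\overrightarrow{OA} - \overrightarrow{BA}$ is the vector $\overrightarrow{OB} = (b_1, b_2, b_3)$.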
$$\cos^{-1} \frac{(1, 2, 3) \cdot (4, 5, 6)}{|(1, 2, 3)|\,|(4, 5, 6)|} = \cos^{-1} \frac{32}{\sqrt{14}\sqrt{77}} \approx 0.226.$$
Thus, the vectors (1, 2, 3) and (4, 5, 6) are neither parallel nor perpendicular. Instead, they
point in different directions.
(b) The angle between the vectors (−2, 4, −6) and (1, −2, 3) is:
Thus, the vectors (−2, 4, −6) and (1, −2, 3) are parallel (and point in exact opposite
directions). Actually, we could also have arrived at this conclusion by observing that u = −2v.
A213(a) The vector $(1, 3, -2)$ has length $\sqrt{1^2 + 3^2 + (-2)^2} = \sqrt{14}$.
Hence, its unit vector is: $(1, 3, -2)/\sqrt{14}$.
Its x-, y-, and z-direction cosines are $1/\sqrt{14}$, $3/\sqrt{14}$, and $-2/\sqrt{14}$.
And the angles it makes with the positive x-, y-, and z-axes are:
$$\cos^{-1} \frac{1}{\sqrt{14}} \approx 1.300, \quad \cos^{-1} \frac{3}{\sqrt{14}} \approx 0.641, \quad \text{and} \quad \cos^{-1} \frac{-2}{\sqrt{14}} \approx 2.135.$$
(b) The vector $(4, 2, -3)$ has length $\sqrt{4^2 + 2^2 + (-3)^2} = \sqrt{29}$.
Hence, its unit vector is: $(4, 2, -3)/\sqrt{29}$.
Its x-, y-, and z-direction cosines are $4/\sqrt{29}$, $2/\sqrt{29}$, and $-3/\sqrt{29}$.
And the angles it makes with the positive x-, y-, and z-axes are:
$$\cos^{-1} \frac{4}{\sqrt{29}} \approx 0.734, \quad \cos^{-1} \frac{2}{\sqrt{29}} \approx 1.190, \quad \text{and} \quad \cos^{-1} \frac{-3}{\sqrt{29}} \approx 2.162.$$
(c) The vector $(-1, 2, -4)$ has length $\sqrt{(-1)^2 + 2^2 + (-4)^2} = \sqrt{21}$.
Hence, its unit vector is: $(-1, 2, -4)/\sqrt{21}$.
Its x-, y-, and z-direction cosines are $-1/\sqrt{21}$, $2/\sqrt{21}$, and $-4/\sqrt{21}$.
And the angles it makes with the positive x-, y-, and z-axes are:
$$\cos^{-1} \frac{-1}{\sqrt{21}} \approx 1.791, \quad \cos^{-1} \frac{2}{\sqrt{21}} \approx 1.119, \quad \text{and} \quad \cos^{-1} \frac{-4}{\sqrt{21}} \approx 2.632.$$
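Angles like these are easy to mistype, so it can help to recompute them. The sketch below divides each component by the vector's length to get the direction cosines, then applies the inverse cosine; the function name `direction_angles` is illustrative, and angles are in radians.

```python
import math

def direction_angles(v):
    """Angles (in radians) that v makes with the positive coordinate axes."""
    length = math.sqrt(sum(c * c for c in v))
    # Each direction cosine is a component divided by the length.
    return [math.acos(c / length) for c in v]

for v in [(1, 3, -2), (4, 2, -3), (-1, 2, -4)]:
    print(v, [round(angle, 3) for angle in direction_angles(v)])
```

A useful sanity check is that the squares of the three direction cosines always add up to 1.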
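$$\mathrm{proj}_c a = (a \cdot \hat{c})\hat{c} = \frac{(2, 5, -1) \cdot (-2, -2, 1)}{(-2)^2 + (-2)^2 + 1^2}\begin{pmatrix} -2 \\ -2 \\ 1 \end{pmatrix} = \frac{-4 - 10 - 1}{9}\begin{pmatrix} -2 \\ -2 \\ 1 \end{pmatrix} = -\frac{15}{9}\begin{pmatrix} -2 \\ -2 \\ 1 \end{pmatrix} = \frac{5}{3}\begin{pmatrix} 2 \\ 2 \\ -1 \end{pmatrix}.$$
$$\mathrm{rej}_c a = a - \mathrm{proj}_c a = (2, 5, -1) - \frac{5}{3}(2, 2, -1) = \frac{1}{3}(-4, 5, 2).$$
$$(\mathrm{rej}_b a) \cdot b = \frac{1}{6}(11, 29, -10) \cdot (1, 1, 4) = \frac{1}{6}(11 + 29 - 40) = 0. \ ✓$$
$$(\mathrm{rej}_c a) \cdot c = \frac{1}{3}(-4, 5, 2) \cdot (-2, -2, 1) = \frac{1}{3}(8 - 10 + 2) = 0. \ ✓$$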
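The projection and rejection computations can be reproduced exactly with rational arithmetic. In this sketch, `dot`, `proj`, and `rej` are illustrative helper names; a = (2, 5, −1), b = (1, 1, 4), and c = (−2, −2, 1) are the vectors appearing in the calculations above.

```python
from fractions import Fraction

def dot(u, v):
    """Scalar product of two vectors of equal length."""
    return sum(x * y for x, y in zip(u, v))

def proj(a, b):
    """Projection of a on b: ((a . b) / (b . b)) b."""
    scale = Fraction(dot(a, b), dot(b, b))
    return tuple(scale * x for x in b)

def rej(a, b):
    """Rejection of a on b: a minus its projection on b."""
    return tuple(x - y for x, y in zip(a, proj(a, b)))

a, b, c = (2, 5, -1), (1, 1, 4), (-2, -2, 1)
print(proj(a, c), rej(a, c))
print(dot(rej(a, b), b), dot(rej(a, c), c))
```

The last line should print two zeros, confirming that each rejection is perpendicular to the vector it was taken against.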
A217(a) Rewrite the given equations: $\frac{x - 2/7}{5/7} = \frac{y - 50/3}{70/3} = \frac{z}{7/8}$.
So, this line may be described by r = (2/7, 50/3, 0) + λ (5/7, 70/3, 7/8) (λ ∈ R).
It is not perpendicular to any of the axes.
(b) Rewrite the given equations: $\frac{x}{1/2} = \frac{y}{1/3} = \frac{z}{1/5}$.
So, this line may be described by r = (0, 0, 0) + λ (1/2, 1/3, 1/5) (λ ∈ R).
It is not perpendicular to any of the axes.
(c) Rewrite the given equations: $\frac{x - 4/17}{1/17} = \frac{y - 1/3}{2/3} = \frac{z}{1/3}$.
So, this line may be described by r = (4/17, 1/3, 0) + λ (1/17, 2/3, 1/3) (λ ∈ R).
It is not perpendicular to any of the axes.
(d) Rewrite the given equations: $y = \frac{11}{3}$ and $\frac{x - 3}{2} = \frac{z - 2/5}{7/5}$.
So, this line may be described by r = (3, 11/3, 2/5) + λ (2, 0, 7/5) (λ ∈ R).
It is perpendicular to the y-axis.
(e) The free variable is y. So, this line may be described by:
$\hat{\lambda} \overset{1}{=} 1$, $1 - \hat{\lambda} \overset{2}{=} 3$, and $1 + \hat{\lambda} \overset{3}{=} 3 + 2\hat{\mu}$.
$\overset{1}{=}$ immediately contradicts $\overset{2}{=}$. Hence, the two lines do not intersect and are not identical.
The two lines are neither parallel nor perpendicular. And since they do not intersect
either, they are skew.
(b) If the two lines intersect, then there are real numbers $\hat{\lambda}$ and $\hat{\mu}$ such that:
$\overset{1}{=}$ immediately contradicts $\overset{3}{=}$. Hence, the two lines do not intersect and are not identical.
The two lines are neither parallel nor perpendicular. And since they do not intersect
either, they are skew.
(c) If the two lines intersect, then there are real numbers $\hat{\lambda}$ and $\hat{\mu}$ such that:
$\overset{1}{=}$ plus $\overset{3}{=}$ yields $11 + 12\hat{\lambda} = 16$ or $\hat{\lambda} = 5/12$. And now from $\overset{3}{=}$, we have $\hat{\mu} = 4/9$. But these values
of $\hat{\lambda}$ and $\hat{\mu}$ contradict $\overset{2}{=}$. Hence, the two lines do not intersect and are not identical.
The two lines are perpendicular (and hence not parallel). And since they do not intersect
either, they are skew.
(d) The two lines have parallel direction vectors because (−3, −6, −3) = −3 (1, 2, 1). Hence,
the two lines are parallel (and thus neither perpendicular nor skew). Since they are parallel,
the angle between them is zero.
The first line does not contain the point (1, 0, 0) — the only point on the first line with
x-coordinate 1 is (1, 2, 3) (plug in λ = 1). Hence, the two lines are not identical. Since the
two lines are parallel and distinct, they do not intersect at all.
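$$r = \overrightarrow{OA} + \lambda\overrightarrow{AB} = \begin{pmatrix} 3 \\ 1 \\ 2 \end{pmatrix} + \lambda\begin{pmatrix} -2 \\ 5 \\ 3 \end{pmatrix} \quad (\lambda \in \mathbb{R}).$$
If the above line also contains C, then there exists $\hat{\lambda}$ such that:
$$C = \begin{pmatrix} 0 \\ -1 \\ 0 \end{pmatrix} = \begin{pmatrix} 3 \\ 1 \\ 2 \end{pmatrix} + \hat{\lambda}\begin{pmatrix} -2 \\ 5 \\ 3 \end{pmatrix}, \quad \text{or} \quad 0 \overset{1}{=} 3 - 2\hat{\lambda}, \quad -1 \overset{2}{=} 1 + 5\hat{\lambda}, \quad 0 \overset{3}{=} 2 + 3\hat{\lambda}.$$
From $\overset{1}{=}$, we have $\hat{\lambda} = 1.5$. But this contradicts $\overset{2}{=}$. This contradiction means that there is no
$$r = \begin{pmatrix} 1 \\ 2 \\ 4 \end{pmatrix} + \lambda\begin{pmatrix} -1 \\ -2 \\ -3 \end{pmatrix} \quad (\lambda \in \mathbb{R}).$$
If the above line also contains C, then there exists $\hat{\lambda}$ such that:
$$C = \begin{pmatrix} 3 \\ 6 \\ 10 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 4 \end{pmatrix} + \hat{\lambda}\begin{pmatrix} -1 \\ -2 \\ -3 \end{pmatrix}, \quad \text{or} \quad 3 \overset{1}{=} 1 - 1\hat{\lambda}, \quad 6 \overset{2}{=} 2 - 2\hat{\lambda}, \quad 10 \overset{3}{=} 4 - 3\hat{\lambda}.$$
As you can verify, $\hat{\lambda} = -2$ solves the above vector equation (or system of three equations).
Thus, our line contains C. We conclude that A, B, and C are collinear.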
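$$u \times v = \begin{pmatrix} 0 \\ 1 \\ 2 \end{pmatrix} \times \begin{pmatrix} 3 \\ 4 \\ 5 \end{pmatrix} = \begin{pmatrix} 1 \times 5 - 2 \times 4 \\ 2 \times 3 - 0 \times 5 \\ 0 \times 4 - 1 \times 3 \end{pmatrix} = \begin{pmatrix} -3 \\ 6 \\ -3 \end{pmatrix}.$$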
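The component formula for the vector product is short enough to code directly, which also makes it easy to check perpendicularity and anti-commutativity on concrete vectors. A minimal sketch, with illustrative function names:

```python
def cross(a, b):
    """Vector product (a2*b3 - a3*b2, a3*b1 - a1*b3, a1*b2 - a2*b1)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Scalar product, used here to check perpendicularity."""
    return sum(x * y for x, y in zip(a, b))

u, v = (0, 1, 2), (3, 4, 5)
w = cross(u, v)
print(w, dot(w, u), dot(w, v))
print(cross(v, u))
```

The two scalar products should be zero (the vector product is perpendicular to both factors), and swapping the order of the factors should flip every sign.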
(c) $$(a \times b) \cdot a = \begin{pmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{pmatrix} \cdot \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}$$
$$= (a_2 b_3 - a_3 b_2) a_1 + (a_3 b_1 - a_1 b_3) a_2 + (a_1 b_2 - a_2 b_1) a_3$$
$$= a_1 a_2 b_3 - a_1 b_2 a_3 + b_1 a_2 a_3 - a_1 a_2 b_3 + a_1 b_2 a_3 - b_1 a_2 a_3 = 0.$$
$$(a \times b) \cdot b = \begin{pmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{pmatrix} \cdot \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}$$
$$= (a_2 b_3 - a_3 b_2) b_1 + (a_3 b_1 - a_1 b_3) b_2 + (a_1 b_2 - a_2 b_1) b_3$$
$$= b_1 a_2 b_3 - b_1 b_2 a_3 + b_1 b_2 a_3 - a_1 b_2 b_3 + a_1 b_2 b_3 - b_1 a_2 b_3 = 0.$$
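$$= \begin{pmatrix} a_2 b_3 - a_3 b_2 + a_2 c_3 - a_3 c_2 \\ a_3 b_1 - a_1 b_3 + a_3 c_1 - a_1 c_3 \\ a_1 b_2 - a_2 b_1 + a_1 c_2 - a_2 c_1 \end{pmatrix} = \begin{pmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{pmatrix} + \begin{pmatrix} a_2 c_3 - a_3 c_2 \\ a_3 c_1 - a_1 c_3 \\ a_1 c_2 - a_2 c_1 \end{pmatrix} = a \times b + a \times c.$$
(b) In Example 676, we already showed that (1, 2, 3) × (4, 5, 6) = (−3, 6, −3).
We now have: $$\begin{pmatrix} 4 \\ 5 \\ 6 \end{pmatrix} \times \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} = \begin{pmatrix} 3 \cdot 5 - 2 \cdot 6 \\ 1 \cdot 6 - 3 \cdot 4 \\ 2 \cdot 4 - 1 \cdot 5 \end{pmatrix} = \begin{pmatrix} 3 \\ -6 \\ 3 \end{pmatrix}.$$
So that indeed: $$\begin{pmatrix} 4 \\ 5 \\ 6 \end{pmatrix} \times \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} = -\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} \times \begin{pmatrix} 4 \\ 5 \\ 6 \end{pmatrix}.$$
$$a \times b = \begin{pmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{pmatrix} \quad \text{and} \quad b \times a = \begin{pmatrix} b_2 a_3 - b_3 a_2 \\ b_3 a_1 - b_1 a_3 \\ b_1 a_2 - b_2 a_1 \end{pmatrix}.$$
$$(da) \times b = \begin{pmatrix} da_2 b_3 - da_3 b_2 \\ da_3 b_1 - da_1 b_3 \\ da_1 b_2 - da_2 b_1 \end{pmatrix} = d\begin{pmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{pmatrix} = d\,(a \times b),$$
$$a \times (db) = \begin{pmatrix} a_2 db_3 - a_3 db_2 \\ a_3 db_1 - a_1 db_3 \\ a_1 db_2 - a_2 db_1 \end{pmatrix} = d\begin{pmatrix} a_2 b_3 - a_3 b_2 \\ a_3 b_1 - a_1 b_3 \\ a_1 b_2 - a_2 b_1 \end{pmatrix} = d\,(a \times b).$$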
\[ |a \times b| \overset{\star}{=} \sqrt{(a_2b_3 - a_3b_2)^2 + (a_3b_1 - a_1b_3)^2 + (a_1b_2 - a_2b_1)^2}, \]
\[ \cos\theta = \frac{a \cdot b}{|a||b|} = \frac{a_1b_1 + a_2b_2 + a_3b_3}{\sqrt{a_1^2 + a_2^2 + a_3^2}\,\sqrt{b_1^2 + b_2^2 + b_3^2}}. \]
(b) Since θ ∈ [0, π], it must be that sin θ is non-negative, i.e. sin θ ≥ 0.
(c) The trigonometric identity is sin²θ + cos²θ = 1 (Fact 29). Rearranging, we have sin θ = ±√(1 − cos²θ). Since sin θ ≥ 0, we can discard the negative value. Altogether then, we have:
\[ \sin\theta = \sqrt{1 - \cos^2\theta}. \]
(d)
\[ \sin\theta = \sqrt{1 - \cos^2\theta} = \sqrt{1 - \left(\frac{a \cdot b}{|a||b|}\right)^2} = \sqrt{1 - \frac{(a_1b_1 + a_2b_2 + a_3b_3)^2}{(a_1^2 + a_2^2 + a_3^2)(b_1^2 + b_2^2 + b_3^2)}}. \]
(e) As per the hint, fully expand each of LHS and RHS:
\[ (a_1^2 + a_2^2 + a_3^2)(b_1^2 + b_2^2 + b_3^2) - (a_1b_1 + a_2b_2 + a_3b_3)^2 \]
\[ = a_1^2b_1^2 + a_1^2b_2^2 + a_1^2b_3^2 + a_2^2b_1^2 + a_2^2b_2^2 + a_2^2b_3^2 + a_3^2b_1^2 + a_3^2b_2^2 + a_3^2b_3^2 - (a_1^2b_1^2 + a_2^2b_2^2 + a_3^2b_3^2 + 2a_1a_2b_1b_2 + 2a_1a_3b_1b_3 + 2a_2a_3b_2b_3) \]
\[ = a_1^2b_2^2 + a_1^2b_3^2 + a_2^2b_1^2 + a_2^2b_3^2 + a_3^2b_1^2 + a_3^2b_2^2 - (2a_1a_2b_1b_2 + 2a_1a_3b_1b_3 + 2a_2a_3b_2b_3). \]
\[ (a_2b_3 - a_3b_2)^2 + (a_3b_1 - a_1b_3)^2 + (a_1b_2 - a_2b_1)^2 \]
\[ = a_2^2b_3^2 + a_3^2b_2^2 - 2a_2a_3b_2b_3 + a_1^2b_3^2 + a_3^2b_1^2 - 2a_1a_3b_1b_3 + a_2^2b_1^2 + a_1^2b_2^2 - 2a_1a_2b_1b_2. \]
(f)
\[ |a||b|\sin\theta = \sqrt{(a_1^2 + a_2^2 + a_3^2)(b_1^2 + b_2^2 + b_3^2)}\,\sqrt{1 - \frac{(a_1b_1 + a_2b_2 + a_3b_3)^2}{(a_1^2 + a_2^2 + a_3^2)(b_1^2 + b_2^2 + b_3^2)}} \]
\[ = \sqrt{(a_1^2 + a_2^2 + a_3^2)(b_1^2 + b_2^2 + b_3^2) - (a_1b_1 + a_2b_2 + a_3b_3)^2} \]
\[ \overset{(e)}{=} \sqrt{(a_2b_3 - a_3b_2)^2 + (a_3b_1 - a_1b_3)^2 + (a_1b_2 - a_2b_1)^2} \overset{\star}{=} |a \times b|. \]
Rearranging, λ̃ = −9/139.
And now:
\[ B = (8, 3, 4) - \frac{9}{139}(9, 3, 7) = \frac{1}{139}(1\,031, 390, 493). \]
Lovely: this is the same as what we found in Method 1. And now, the distance between A and l is:
\[ |\overrightarrow{AB}| = \sqrt{(9\tilde{\lambda} + 1)^2 + (3\tilde{\lambda})^2 + (7\tilde{\lambda})^2} = \sqrt{139\tilde{\lambda}^2 + 18\tilde{\lambda} + 1} = \sqrt{58/139}. \]
Method 3 (or the Calculus Method). Let R be a generic point on l, so that \(\overrightarrow{AR}\) = (9λ + 1, 3λ, 7λ) and the distance between A and R is:
\[ |\overrightarrow{AR}| = \sqrt{139\lambda^2 + 18\lambda + 1}. \]
\[ \frac{d}{d\lambda}(139\lambda^2 + 18\lambda + 1) = 278\lambda + 18. \]
\[ \text{FOC: } (278\lambda + 18)\big|_{\lambda = \tilde{\lambda}} = 0 \quad\text{or}\quad \tilde{\lambda} = -\frac{18}{278} = -\frac{9}{139}. \]
Lovely: this is also what we found in Method 2. We can now find B and \(|\overrightarrow{AB}|\) (omitted).
Alternatively, we could simply have used “−b/2a”: λ̃ = −18/ (2 ⋅ 139) = −9/139.
A223(b) Let P = (4, 4, 3) and v = (6, 11, 5) − (4, 4, 3) = (2, 7, 2). Let B be the foot of the perpendicular from A to l.
Method 1 (Formula Method). First, \(\overrightarrow{PA}\) = (8, 0, 2) − (4, 4, 3) = (4, −4, −1). So:
\[ \overrightarrow{PB} = \operatorname{proj}_v \overrightarrow{PA} = \operatorname{proj}_{(2,7,2)}(4, -4, -1) = -\frac{22}{57}(2, 7, 2). \]
By Fact 86:
\[ B = P + \operatorname{proj}_v \overrightarrow{PA} = (4, 4, 3) - \frac{22}{57}(2, 7, 2) = \frac{1}{57}(184, 74, 127). \]
Rearranging, λ̃ = −22/57.
And now:
\[ B = (4, 4, 3) - \frac{22}{57}(2, 7, 2) = \frac{1}{57}(184, 74, 127). \]
Lovely: this is also what we found in Method 1. And now, the distance between A and l is:
\[ |\overrightarrow{AB}| = \sqrt{(2\tilde{\lambda} - 4)^2 + (7\tilde{\lambda} + 4)^2 + (2\tilde{\lambda} + 1)^2} = \sqrt{57\tilde{\lambda}^2 + 44\tilde{\lambda} + 33} = \sqrt{\frac{1397}{57}}. \]
Method 3 (or the Calculus Method). Let R be a generic point on l, so that \(\overrightarrow{AR}\) = (2λ − 4, 7λ + 4, 2λ + 1) and the distance between A and R is:
\[ |\overrightarrow{AR}| = \sqrt{57\lambda^2 + 44\lambda + 33}. \]
\[ \frac{d}{d\lambda}(57\lambda^2 + 44\lambda + 33) = 114\lambda + 44. \]
\[ \text{FOC: } (114\lambda + 44)\big|_{\lambda = \tilde{\lambda}} = 0 \quad\text{or}\quad \tilde{\lambda} = -\frac{44}{114} = -\frac{22}{57}. \]
Lovely: this is also what we found in Method 2. We can now find B and \(|\overrightarrow{AB}|\) (omitted).
Alternatively, we could simply have used “−b/2a”: λ̃ = −44/ (2 ⋅ 57) = −22/57.
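A223(c) Let P = (8, 4, 5) and v = (5, 6, 0). Let B be the foot of the perpendicular from A to l.
Method 1 (Formula Method). First, \(\overrightarrow{PA}\) = (8, 5, 9) − (8, 4, 5) = (0, 1, 4). So:
\[ \overrightarrow{PB} = \operatorname{proj}_v \overrightarrow{PA} = \operatorname{proj}_{(5,6,0)}(0, 1, 4) = \frac{6}{61}(5, 6, 0). \]
By Fact 86:
\[ B = P + \operatorname{proj}_v \overrightarrow{PA} = (8, 4, 5) + \frac{6}{61}(5, 6, 0) = \frac{1}{61}(518, 280, 305). \]
Since \(\overrightarrow{AB}\) ⊥ l, we have \(\overrightarrow{AB}\) ⊥ v or:
\[ 0 = \overrightarrow{AB} \cdot (5, 6, 0) = (5\tilde{\lambda}, 6\tilde{\lambda} - 1, -4) \cdot (5, 6, 0) = 5(5\tilde{\lambda}) + 6(6\tilde{\lambda} - 1) + 0(-4) = 61\tilde{\lambda} - 6. \]
Rearranging, λ̃ = 6/61.
And now:
\[ B = (8, 4, 5) + \frac{6}{61}(5, 6, 0) = \frac{1}{61}(518, 280, 305). \]
Lovely: this is also what we found in Method 1. And now, the distance between A and l is:
\[ |\overrightarrow{AB}| = \sqrt{(5\tilde{\lambda})^2 + (6\tilde{\lambda} - 1)^2 + (-4)^2} = \sqrt{61\tilde{\lambda}^2 - 12\tilde{\lambda} + 17} = \sqrt{\frac{1\,001}{61}}. \]
Method 3 (or the Calculus Method). Let R be a generic point on l, so that \(\overrightarrow{AR}\) = (5λ, 6λ − 1, −4) and the distance between A and R is:
\[ |\overrightarrow{AR}| = \sqrt{61\lambda^2 - 12\lambda + 17}. \]
\[ \frac{d}{d\lambda}(61\lambda^2 - 12\lambda + 17) = 122\lambda - 12. \]
\[ \text{FOC: } (122\lambda - 12)\big|_{\lambda = \tilde{\lambda}} = 0 \quad\text{or}\quad \tilde{\lambda} = \frac{12}{122} = \frac{6}{61}. \]
Lovely: this is also what we found in Method 2. We can now find B and \(|\overrightarrow{AB}|\) (omitted).
Alternatively, we could simply have used “−b/2a”: λ̃ = − (−12) / (2 ⋅ 61) = 6/61.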
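The Formula Method used in A223 can be sketched in a few lines of exact arithmetic. This is only an illustration: the function name `foot_of_perpendicular` is hypothetical, and the inputs are the values of A223(c).

```python
from fractions import Fraction

def foot_of_perpendicular(P, v, A):
    """Foot B of the perpendicular from A to the line r = P + t*v.

    Returns (t, B), where t = (PA . v) / (v . v) is the projection coefficient.
    """
    PA = [a - p for a, p in zip(A, P)]
    t = Fraction(sum(x * y for x, y in zip(PA, v)), sum(x * x for x in v))
    B = tuple(p + t * x for p, x in zip(P, v))
    return t, B

A = (8, 5, 9)
t, B = foot_of_perpendicular((8, 4, 5), (5, 6, 0), A)
dist_sq = sum((b - a) ** 2 for a, b in zip(A, B))
print(t)        # 6/61
print(B)        # the point (518/61, 280/61, 5) as Fractions
print(dist_sq)  # 1001/61, so the distance is sqrt(1001/61)
```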
y=2
z=3
A226. No, l contains the point (7, 3, 1), which isn’t on q because (7, 3, 1) ⋅ (4, −3, 2) ≠ −10.
A227. a = (2, −2, 2) and c = (−√2, √2, −√2) are parallel to (1, −1, 1), while b = (2, 2, −2) is not. Hence, a and c are normal to q, but not b.
A228(a) \(\overrightarrow{AB}\) = B − A = (−2, 3, 0) − (1, −1, 2) = (−3, 4, −2) and \(\overrightarrow{AC}\) = C − A = (0, −1, 1) − (1, −1, 2) = (−1, 0, −1). And so, a normal vector of q is:
\[ n = \overrightarrow{AB} \times \overrightarrow{AC} = (-3, 4, -2) \times (-1, 0, -1) = (-4, -1, 4). \]
(b) n ⋅ \(\overrightarrow{OA}\) = (−4, −1, 4) ⋅ (1, −1, 2) = −4 + 1 + 8 = 5. So, q may be described by r ⋅ (−4, −1, 4) = 5.
(c) Another normal vector of q is m = 2n = 2 (−4, −1, 4) = (−8, −2, 8).
(d) Hence, the plane q may also be described by r ⋅ (−8, −2, 8) = 10.
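The steps of A228 (two vectors on the plane, their vector product, then the constant d) can be sketched as follows. The helper `plane_through` is a hypothetical name used only for this illustration.

```python
def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def plane_through(A, B, C):
    """Return (n, d) such that the plane through A, B, C is r . n = d."""
    AB = tuple(b - a for a, b in zip(A, B))
    AC = tuple(c - a for a, c in zip(A, C))
    n = cross(AB, AC)
    return n, dot(n, A)

n, d = plane_through((1, -1, 2), (-2, 3, 0), (0, -1, 1))
print(n, d)   # (-4, -1, 4) 5
# Any non-zero multiple of n with the same multiple of d describes the same plane.
print(tuple(2 * x for x in n), 2 * d)   # (-8, -2, 8) 10
```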
A229. Only b is perpendicular to q’s normal vector (see below). Hence, only b is on q.
A230. Method 1. First find B = A + \(\overrightarrow{AB}\) = (1, 4, −1) + (7, 3, −2) = (8, 7, −3). Then show that B does not satisfy the plane’s vector equation and is thus not on q:
\[ \overrightarrow{OB} \cdot (7, -1, 3) = (8, 7, -3) \cdot (7, -1, 3) = 56 - 7 - 9 = 40 \neq 19. \]
Method 2. Simply check if \(\overrightarrow{AB}\) ⊥ (7, −1, 3):
\[ \overrightarrow{AB} \cdot (7, -1, 3) = (7, 3, -2) \cdot (7, -1, 3) = 49 - 3 - 6 = 40 \neq 0. \]
So no, \(\overrightarrow{AB}\) ⊥/ (7, −1, 3). Hence, by Fact 100, B ∉ q.
A233(a) Rewrite x + 5 = 17y + z as x − 17y − z = −5. Reading off, the plane may also be
described by r ⋅ (1, −17, −1) = −5 and doesn’t contain the origin.
(b) Rewrite y + 1 = 0 as 0x + y + 0z = −1. Reading off, the plane may also be described by
r ⋅ (0, 1, 0) = −1 and doesn’t contain the origin.
(c) Rewrite x + z = y − 2 as x − y + z = −2. Reading off, the plane may also be described by
r ⋅ (1, −1, 1) = −2 and doesn’t contain the origin.
A234(a) The plane described by r ⋅ (0, 0, 1) = 32 or z = 32 contains the points (0, 0, 32),
(1, 0, 32), and (0, 1, 32). It does not contain the points (1, 0, 0), (0, 1, 0), or (0, 0, 1).
(b) The plane described by r ⋅ (5, 3, 1) = −2 or 5x + 3y + z = −2 contains the points (−1, 1, 0),
(0, 0, −2), and (0, −1, 1). It does not contain the points (1, 0, 0), (0, 1, 0), or (0, 0, 1).
(c) The plane described by r ⋅ (1, −2, 3) = 0 or x − 2y + 3z = 0 contains the points (0, 0, 0),
(2, 1, 0), and (−3, 0, 1). It does not contain the points (1, 0, 0), (0, 1, 0), or (0, 0, 1).
A235(a) (2, 1, 0), (3, 0, −1), and (0, 3, 2). (c) (4, 0, −1), (4, 1, −1), and (4, 2, −1).
(b) (3, −5, 0), (1, 0, −5), and (0, 1, −3). (d) (1, 0, 0), (0, 0, 1), and (1, 0, 1)
(iv) Since (1, 1, 1) ⋅ (1, 2, 3) ≠ 0, we have (1, 1, 1) ⊥/ (1, 2, 3) and so (1, 1, 1) is not on the
plane. Hence, by Theorem 10, it cannot be written as a LC of u and v.
(b)(i) x − z = 0.
(ii) Since u ⋅ (1, 0, −1) = (0, 1, 0) ⋅ (1, 0, −1) = 0 and v ⋅ (1, 0, −1) = (1, 0, 1) ⋅ (1, 0, −1) = 0, both
u and v are on the plane. Also, u ∥/ v because they aren’t scalar multiples of each other.
(iii) Since w ⋅ (1, 0, −1) = (1, −1, 1) ⋅ (1, 0, −1) = 0, the vector w is on the plane.
The vectors u and v are on the same plane and u ∥/ v. Hence, by Theorem 10, we should
be able to write w as the LC of u and v, as indeed we can:
(iv) Since (1, 1, 1)⋅(1, 0, −1) = 0, we have (1, 1, 1) ⊥ (1, 0, −1) and so (1, 1, 1) is on the plane.
And hence, by Theorem 10, it can be written as a LC of u and v:
(c)(i) 9x + y + z = −5.
(ii) Since u ⋅ (9, 1, 1) = (0, 1, −1) ⋅ (9, 1, 1) = 0 and v ⋅ (9, 1, 1) = (−1, 9, 0) ⋅ (9, 1, 1) = 0, both
u and v are on the plane. Also, u ∥/ v because they aren’t scalar multiples of each other.
(iii) Since w ⋅ (9, 1, 1) = (−1, 10, −1) ⋅ (9, 1, 1) = 0, the vector w is on the plane.
The vectors u and v are on the same plane and u ∥/ v. Hence, by Theorem 10, we should
be able to write w as the LC of u and v, as indeed we can:
(iv) Since (1, 1, 1) ⋅ (9, 1, 1) ≠ 0, we have (1, 1, 1) ⊥/ (9, 1, 1) and so (1, 1, 1) is not on the
plane. Hence, by Theorem 10, it cannot be written as a LC of u and v.
A238(a) This plane contains the vectors (4, 5, 6) and (7, 8, 9). And so, a normal vector is:
Since (−3, 6, −3) ∥ (1, −2, 1), a normal vector of the plane is (1, −2, 1).
The plane contains the point (1, 2, 3). Since (1, 2, 3) ⋅ (1, −2, 1) = 1 − 4 + 3 = 0, the plane may
be described in vector and cartesian forms by r ⋅ (1, −2, 1) = 0 and x − 2y + z = 0.
(b) First, rewrite the given parametric equation as:
Thus, this plane contains the vectors (1, 4, 0) and (−1, 0, 0). And so, a normal vector is:
Since (0, 0, 4) ∥ (0, 0, 1), a normal vector of the plane is (0, 0, 1).
The plane contains the point (0, 5, 0). Since (0, 5, 0) ⋅ (0, 0, 1) = 0 + 0 + 0 = 0, the plane may
be described in vector and cartesian forms by r ⋅ (0, 0, 1) = 0 and z = 0.
(c) First, rewrite the given parametric equation as:
Thus, this plane contains the vectors (0, 1, 1) and (1, 0, 1). And so, a normal vector is:
The plane contains the point (1, 1, 0). Since (1, 1, 0) ⋅ (1, 1, −1) = 1 + 1 + 0 = 2, the plane may
be described in vector and cartesian forms by r ⋅ (1, 1, −1) = 2 and x + y − z = 2.
A239(a) The plane that contains the points (7, 3, 4), (8, 3, 4), and (9, 3, 7) also contains
the vectors (8, 3, 4) − (7, 3, 4) = (1, 0, 0) and (9, 3, 7) − (7, 3, 4) = (2, 0, 3).
Since (1, 0, 0) ∥/ (2, 0, 3), the plane may be described in parametric form by:
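\[ r = \begin{pmatrix} 7 \\ 3 \\ 4 \end{pmatrix} + \lambda\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + \mu\begin{pmatrix} 2 \\ 0 \\ 3 \end{pmatrix} = \begin{pmatrix} 7 + \lambda + 2\mu \\ 3 \\ 4 + 3\mu \end{pmatrix} \quad (\lambda, \mu \in \mathbb{R}). \]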
r ⋅ (0, 1, 0) = 3 or y = 3.
(b) The plane that contains the points (8, 0, 2), (4, 4, 3), and (2, 7, 2) also contains the
vectors (4, 4, 3) − (8, 0, 2) = (−4, 4, 1) and (2, 7, 2) − (8, 0, 2) = (−6, 7, 0).
Since (−4, 4, 1) ∥/ (−6, 7, 0), the plane may be described in parametric form by:
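\[ r = \begin{pmatrix} 8 \\ 0 \\ 2 \end{pmatrix} + \lambda\begin{pmatrix} -4 \\ 4 \\ 1 \end{pmatrix} + \mu\begin{pmatrix} -6 \\ 7 \\ 0 \end{pmatrix} = \begin{pmatrix} 8 - 4\lambda - 6\mu \\ 0 + 4\lambda + 7\mu \\ 2 + \lambda \end{pmatrix} \quad (\lambda, \mu \in \mathbb{R}). \]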
This plane has normal vector (−4, 4, 1) × (−6, 7, 0) = (−7, −6, −4).
It thus also has normal vector (7, 6, 4). Compute (8, 0, 2) ⋅ (7, 6, 4) = 56 + 0 + 8 = 64. So, this
plane may be described in vector or cartesian form by:
r ⋅ (7, 6, 4) = 64 or 7x + 6y + 4z = 64.
(c) The plane that contains the points (8, 5, 9), (8, 4, 5), and (5, 6, 0) also contains the
vectors (8, 5, 9) − (8, 4, 5) = (0, 1, 4) and (8, 5, 9) − (5, 6, 0) = (3, −1, 9).
Since (0, 1, 4) ∥/ (3, −1, 9), the plane may be described in parametric form by:
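\[ r = \begin{pmatrix} 5 \\ 6 \\ 0 \end{pmatrix} + \lambda\begin{pmatrix} 0 \\ 1 \\ 4 \end{pmatrix} + \mu\begin{pmatrix} 3 \\ -1 \\ 9 \end{pmatrix} = \begin{pmatrix} 5 + 3\mu \\ 6 + \lambda - \mu \\ 4\lambda + 9\mu \end{pmatrix} \quad (\lambda, \mu \in \mathbb{R}). \]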
This plane has normal vector (0, 1, 4) × (3, −1, 9) = (13, 12, −3).
Compute (5, 6, 0) ⋅ (13, 12, −3) = 65 + 72 + 0 = 137. So, this plane may be described in vector
or cartesian form by:
(b) The given line has direction vector (−1, 4, 9)−(−1, 2, 3) = (0, 2, 6) or (0, 1, 3). The given
plane has normal vector (−3, 1, 0) × (0, 5, −3) = (−3, −9, −15) or (1, 3, 5).
Hence, the angle between the line and the plane is:
(c) The given line has direction vector (0, 11, 11)−(−1, 2, 3) = (1, 9, 8). The given plane also
contains the vector (1.5, 0, 0) − (0, 0, 1.5) = (1.5, 0, −1.5) or (1, 0, −1); hence, it has normal
vector (4, −1, 0) × (1, 0, −1) = (1, 4, 1).
Hence, the angle between the line and the plane is:
A241(a) The line l has direction vector v = (2, 3, 5) and the plane q has normal vector
n = (−10, 0, 4). Now, v ⋅ n = (2, 3, 5) ⋅ (−10, 0, 4) = −20 + 0 + 20 = 0. So, v ⊥ n and thus l ∥ q.
The point (4, 5, 6) is on l but not on q because (4, 5, 6)⋅(−10, 0, 4) = −40+0+24 = −16 ≠ −26.
Hence, l and q do not intersect at all.
(b) The line l has direction vector v = (5, 5, 6) − (3, 2, 1) = (2, 3, 5).
The plane q has normal vector n = (2, 0, 5) × (2, 1, 5) = (−5, 0, 2).
Since v ⋅ n = (2, 3, 5) ⋅ (−5, 0, 2) = −10 + 0 + 10 = 0, we have v ⊥ n and thus l ∥ q.
Compute (3, 0, 1) ⋅ (−5, 0, 2) = −15 + 0 + 2 = −13. So, q has vector equation r ⋅ (−5, 0, 2) = −13.
The point (3, 2, 1) is on l and is also on q because (3, 2, 1) ⋅ (−5, 0, 2) = −15 + 0 + 2 = −13.
Hence, the line lies entirely on the plane.
(c) The line l has direction vector v = (6, 8, 11) − (4, 5, 6) = (2, 3, 5).
The plane q contains the vector (2, 1, −2) − (2, 0, −2) = (0, 1, 0) and thus has normal vector
n = (0, 1, 0) × (3, 0, 10) = (10, 0, −3).
Since v ⋅ n = (2, 3, 5) ⋅ (10, 0, −3) = 20 + 0 − 15 = 5 ≠ 0, we have v ⊥/ n and thus l ∥/ q. Hence,
l and q share exactly one intersection point.
Compute (2, 0, −2) ⋅ (10, 0, −3) = 20 + 0 + 6 = 26. So, q has vector equation r ⋅ (10, 0, −3) = 26.
To find the intersection point, plug a generic point of l into q’s vector equation:
Thus, l and q intersect at: (4, 5, 6) + λ̂(2, 3, 5) = (4, 5, 6) + 0.8(2, 3, 5) = (5.6, 7.4, 10).
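The procedure of A241 (test v ⋅ n first, then plug a generic point of l into q’s vector equation) can be sketched as below. The function name `line_plane_intersection` is an assumption made for illustration; the inputs are those of A241(c) and A241(a).

```python
from fractions import Fraction

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def line_plane_intersection(P, v, n, d):
    """Intersect the line r = P + t*v with the plane r . n = d.

    Returns (t, point), or None when v . n = 0 (line parallel to the plane,
    so there is either no intersection or the line lies in the plane).
    """
    vn = dot(v, n)
    if vn == 0:
        return None
    t = Fraction(d - dot(P, n), vn)
    return t, tuple(p + t * x for p, x in zip(P, v))

t, point = line_plane_intersection((4, 5, 6), (2, 3, 5), (10, 0, -3), 26)
print(t)       # 4/5, i.e. 0.8
print(point)   # the point (28/5, 37/5, 10), i.e. (5.6, 7.4, 10)
print(line_plane_intersection((4, 5, 6), (2, 3, 5), (-10, 0, 4), -26))   # None
```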
Ch. 57 Answers (The Angle Between Two Planes)
A242(a) The angle between planes with normal vectors (−1, −2, −3) and (3, 4, 5) is:
(b) The first plane has normal vector (1, −1, 0) × (3, 5, −1) = (1, 1, 8).
The second plane has normal vector (0, 1, 0) × (10, 2, 3) = (3, 0, −10).
And so, the angle between them is:
(c) The first plane contains vectors (3, 0, 0) − (1, 1, 0) = (2, −1, 0) and (3, 0, 0) − (0, 0, 1) =
(3, 0, −1). And so, it has normal vector (2, −1, 0) × (3, 0, −1) = (1, 2, 3).
The second plane contains vectors (1, 0, −1) − (1, −1, 0) = (0, 1, −1) and (0, 3, 1) − (1, −1, 0) =
(−1, 4, 1). And so, it has normal vector (0, 1, −1) × (−1, 4, 1) = (5, 1, 1).
Thus, the angle between the two planes is:
A243(a) Since (4, 9, 3) ∥/ (1, 1, 2), q1 and q2 are not parallel and intersect along a line
with direction vector (4, 9, 3) × (1, 1, 2) = (15, −5, −5) or (−3, 1, 1).
To find an intersection point, plug x = 0 into their cartesian equations to get 9y + 3z = 61 (1) and y + 2z = 19 (2). (1) minus 9×(2) yields −15z = −110 or z = 22/3. And so, y = 13/3. Thus, their intersection line has vector equation r = (0, 13/3, 22/3) + λ (−3, 1, 1) (λ ∈ R).
(b) q1 has normal vector (1, −1, 0) × (1, −1, 1) = (−1, −1, 0) or (1, 1, 0).
q2 has normal vector (6, −1, 0) × (8, 0, −1) = (1, 6, 8).
Since (1, 1, 0) ∥/ (1, 6, 8), q1 and q2 are not parallel and intersect along a line with direction
vector (1, 1, 0) × (1, 6, 8) = (8, −8, 5).
Compute (1, 3, −2) ⋅ (1, 1, 0) = 1 + 3 + 0 = 4 and (2, 3, 5) ⋅ (1, 6, 8) = 2 + 18 + 40 = 60. Hence, q1
and q2 have cartesian equations x + y = 4 and x + 6y + 8z = 60.
To find an intersection point, plug in x = 0 to get y = 4 (1) and 6y + 8z = 60 (2). So, y = 4 and z = 4.5. Thus, their intersection line has vector equation r = (0, 4, 4.5) + λ(8, −8, 5) (λ ∈ R).
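One way to sanity-check an intersection line such as the one in A243(b) is to confirm that its direction is the vector product of the two normals and that several of its points satisfy both cartesian equations. A minimal sketch, with all helper names chosen only for illustration:

```python
from fractions import Fraction

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

n1, d1 = (1, 1, 0), 4      # q1: x + y = 4
n2, d2 = (1, 6, 8), 60     # q2: x + 6y + 8z = 60

direction = cross(n1, n2)
anchor = (0, 4, Fraction(9, 2))   # the point (0, 4, 4.5)

def point_at(t):
    return tuple(a + t * v for a, v in zip(anchor, direction))

print(direction)   # (8, -8, 5)
print(all(dot(point_at(t), n1) == d1 and dot(point_at(t), n2) == d2
          for t in range(-3, 4)))   # True
```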
A243(c) q1 contains the vectors (7, 7, 0) − (1, 1, 6) = (6, 6, −6) or (1, 1, −1) and (5, 3, 3) −
(1, 1, 6) = (4, 2, −3). And so, it has normal vector (1, 1, −1) × (4, 2, −3) = (−1, −1, −2) or
(1, 1, 2).
q2 contains the vectors (7, 3, 1) − (5, 5, 1) = (2, −2, 0) or (1, −1, 0) and (5, 5, 1) − (3, 5, 2) =
(2, 0, −1). And so, it has normal vector (1, −1, 0) × (2, 0, −1) = (1, 1, 2).
Clearly, q1 and q2 are parallel.
To check if they intersect at all, we’ll pick any point on q1 — say (7, 7, 0) — and check
if it’s on q2 . Compute (5, 5, 1) ⋅ (1, 1, 2) = 5 + 5 + 2 = 12 — hence, q2 has vector equation
r ⋅ (1, 1, 2) = 12. Since (7, 7, 0) ⋅ (1, 1, 2) = 7 + 7 + 0 = 14 ≠ 12, the point (7, 7, 0) is not on q2 .
Thus, q1 and q2 do not intersect at all.
(d) q1 contains the vectors (5, 3, 2)−(1, 5, 3) = (4, −2, −1) and (10, 0, 1)−(5, 3, 2) = (5, −3, −1).
And so, it has normal vector (4, −2, −1) × (5, −3, −1) = (−1, −1, −2) or (1, 1, 2).
q2 contains the vectors (8, 8, −2)−(5, −1, 4) = (3, 9, −6) or (1, 3, −2) and (8, 8, −2)−(3, 5, 2) =
(5, 3, −4). And so, it has normal vector (1, 3, −2) × (5, 3, −4) = (−6, −6, −12) or (1, 1, 2).
Clearly, the two planes are parallel.
To check if they intersect at all, we’ll pick any point on q1 — say (10, 0, 1) — and check
if it’s on q2 . Compute (3, 5, 2) ⋅ (1, 1, 2) = 3 + 5 + 4 = 12 — hence, q2 has vector equation
r ⋅ (1, 1, 2) = 12. Since (10, 0, 1) ⋅ (1, 1, 2) = 10 + 0 + 2 = 12, the point (10, 0, 1) is on the second
plane. Since q1 and q2 are parallel and share at least one intersection point, they must be
identical.
(e) Since (7, 1, 1) ∥/ (1, 1, 2), q1 and q2 are not parallel and intersect along a line with
direction vector (7, 1, 1) × (1, 1, 2) = (1, −13, 6).
To find an intersection point, plug x = 0 into their cartesian equations to get y + z = 42 (1) and y + 2z = 6 (2). (2) minus (1) yields z = −36. And so, y = 78. Thus, their intersection line has vector equation r = (0, 78, −36) + λ (1, −13, 6) (λ ∈ R).
(f) Since (0, 1, 3) ∥/ (−1, 1, 3), q1 and q2 are not parallel and intersect along a line with
direction vector (0, 1, 3) × (−1, 1, 3) = (0, −3, 1).
Observe that if here we try plugging x = 0 into their cartesian equations, then we get
y + 3z = 0 and y + 3z = 2, which are contradictory. This contradiction tells us that the two
1 2
Thus, their intersection line has vector equation r = (−2, 0, 0) + λ (0, −3, 1) (λ ∈ R).
Ð→ Ð→ Ð→
We’ve just shown that AC ⊥/ BC. And hence, AC ⊥/ q.
So, k = 9/139 and:
\[ B = A + kn = (7, 3, 4) + \frac{9}{139}(9, 3, 7) = \frac{1}{139}(1\,054, 444, 619). \]
And the distance between A and q is:
\[ |\overrightarrow{AB}| = |kn| = |k||n| = \left|\frac{9}{139}\right| |(9, 3, 7)| = \frac{9}{139}\sqrt{139} = \frac{9}{\sqrt{139}}. \]
So, k = 22/57 and:
\[ B = A + kn = (8, 0, 2) + \frac{22}{57}(2, 7, 2) = \frac{1}{57}(500, 154, 158). \]
And the distance between A and q is:
\[ |\overrightarrow{AB}| = |kn| = |k||n| = \left|\frac{22}{57}\right| |(2, 7, 2)| = \frac{22}{57}\sqrt{57} = \frac{22}{\sqrt{57}}. \]
So, k = −6/61 and:
\[ B = A + kn = (8, 5, 9) - \frac{6}{61}(5, 6, 0) = \frac{1}{61}(458, 269, 549). \]
And the distance between A and q is:
\[ |\overrightarrow{AB}| = |kn| = |k||n| = \left|-\frac{6}{61}\right| |(5, 6, 0)| = \frac{6}{61}\sqrt{61} = \frac{6}{\sqrt{61}}. \]
Rearranging:
\[ k \overset{3}{=} \frac{d - \overrightarrow{OA} \cdot n}{n \cdot n} = \frac{d - \overrightarrow{OA} \cdot n}{|n|^2}. \]
A247(a) First compute |n| = √(9² + 3² + 7²) = √139. Then compute:
\[ k = \frac{d - \overrightarrow{OA} \cdot n}{|n|^2} = \frac{109 - (7, 3, 4) \cdot (9, 3, 7)}{139} = \frac{109 - (63 + 9 + 28)}{139} = \frac{9}{139}. \]
And so by Fact 115:
\[ B = A + kn = (7, 3, 4) + \frac{9}{139}(9, 3, 7) = \frac{1}{139}(1\,054, 444, 619). \]
And as before:
\[ |\overrightarrow{AB}| = |k||n| = \frac{9}{139} \cdot \sqrt{139} = \frac{9}{\sqrt{139}}. \]
(b) First compute |n| = √(2² + 7² + 2²) = √57. Then compute:
\[ k = \frac{d - \overrightarrow{OA} \cdot n}{|n|^2} = \frac{42 - (8, 0, 2) \cdot (2, 7, 2)}{57} = \frac{42 - (16 + 0 + 4)}{57} = \frac{22}{57}. \]
And so by Fact 115:
\[ B = A + kn = (8, 0, 2) + \frac{22}{57}(2, 7, 2) = \frac{2}{57}(250, 77, 79). \]
And as before:
\[ |\overrightarrow{AB}| = |k||n| = \frac{22}{57} \cdot \sqrt{57} = \frac{22}{\sqrt{57}}. \]
(c) First compute |n| = √(5² + 6² + 0²) = √61. Then compute:
\[ k = \frac{d - \overrightarrow{OA} \cdot n}{|n|^2} = \frac{64 - (8, 5, 9) \cdot (5, 6, 0)}{61} = \frac{64 - (40 + 30 + 0)}{61} = -\frac{6}{61}. \]
And so by Fact 115:
\[ B = A + kn = (8, 5, 9) - \frac{6}{61}(5, 6, 0) = \frac{1}{61}(458, 269, 549). \]
And as before:
\[ |\overrightarrow{AB}| = |k||n| = \frac{6}{61} \cdot \sqrt{61} = \frac{6}{\sqrt{61}}. \]
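The formula k = (d − OA ⋅ n)/|n|² used in A247 translates directly into code. The following sketch uses exact fractions; `foot_on_plane` is a hypothetical helper name, and the inputs are those of A247(a) and A247(c).

```python
from fractions import Fraction

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def foot_on_plane(A, n, d):
    """Foot B of the perpendicular from A to the plane r . n = d.

    Returns (k, B) with B = A + k*n; the squared distance is k**2 * (n . n).
    """
    k = Fraction(d - dot(A, n), dot(n, n))
    B = tuple(a + k * x for a, x in zip(A, n))
    return k, B

n = (9, 3, 7)
k, B = foot_on_plane((7, 3, 4), n, 109)
print(k)                  # 9/139
print(B)                  # the point (1054/139, 444/139, 619/139) as Fractions
print(k * k * dot(n, n))  # 81/139, so the distance is 9/sqrt(139)
```

Because B is constructed to lie on the plane, `dot(B, n)` comes out equal to d, which gives a cheap self-check.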
And hence, if ∣n∣ = 1, then the distance between the plane and the origin is simply ∣d∣.
A249(a) Perpendicular Method. Let A and B be the feet of the perpendiculars from
S and T to q. Write: A = S + kn and B = T + ln.
Since A, B ∈ q, we have:
\[ \overrightarrow{OA} \cdot (5, -3, 1) = 0 \quad\text{or}\quad [(-1, 0, 7) + k(5, -3, 1)] \cdot (5, -3, 1) = 0 \quad\text{or}\quad 2 + 35k = 0, \]
\[ \overrightarrow{OB} \cdot (5, -3, 1) = 0 \quad\text{or}\quad [(3, 2, 1) + l(5, -3, 1)] \cdot (5, -3, 1) = 0 \quad\text{or}\quad 10 + 35l = 0. \]
\[ A = S + kn = (-1, 0, 7) - \frac{2}{35}(5, -3, 1) = \frac{1}{35}(-45, 6, 243), \]
\[ B = T + ln = (3, 2, 1) - \frac{2}{7}(5, -3, 1) = \frac{1}{7}(11, 20, 5). \]
Formula Method. First compute |n| = √(5² + (−3)² + 1²) = √35. Then compute:
\[ k = \frac{d - \overrightarrow{OS} \cdot n}{|n|^2} = \frac{0 - (-1, 0, 7) \cdot (5, -3, 1)}{35} = \frac{0 - (-5 + 0 + 7)}{35} = -\frac{2}{35}, \]
\[ l = \frac{d - \overrightarrow{OT} \cdot n}{|n|^2} = \frac{0 - (3, 2, 1) \cdot (5, -3, 1)}{35} = \frac{0 - (15 - 6 + 1)}{35} = -\frac{2}{7}. \]
\[ S + kn = (-1, 0, 7) - \frac{2}{35}(5, -3, 1) = \frac{1}{35}(-45, 6, 243), \]
\[ T + ln = (3, 2, 1) - \frac{2}{7}(5, -3, 1) = \frac{1}{7}(11, 20, 5). \]
(b) The distances between q and the points S and T are, respectively:
\[ |k||n| = \frac{2}{35} \cdot \sqrt{35} = \frac{2}{\sqrt{35}}, \]
\[ |l||n| = \frac{2}{7} \cdot \sqrt{35} = \frac{2\sqrt{5}}{\sqrt{7}}. \]
(c) Since the origin is on q, the distance between the origin and q is 0.
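\[ r = \begin{pmatrix} 0 \\ 1 \\ 5 \end{pmatrix} + \lambda\begin{pmatrix} 3 \\ 2 \\ 4 \end{pmatrix} + \mu\begin{pmatrix} 5 \\ 8 \\ 4 \end{pmatrix} = \begin{pmatrix} 3\lambda + 5\mu \\ 1 + 2\lambda + 8\mu \\ 5 + 4\lambda + 4\mu \end{pmatrix} \quad (\lambda, \mu \in \mathbb{R}). \]
Now check if D ∈ q:
\[ \begin{pmatrix} 6 \\ 6 \\ 1 \end{pmatrix} = \begin{pmatrix} 3\lambda + 5\mu \\ 1 + 2\lambda + 8\mu \\ 5 + 4\lambda + 4\mu \end{pmatrix} \quad\text{or}\quad \begin{aligned} 6 &= 3\lambda + 5\mu, & (1) \\ 6 &= 1 + 2\lambda + 8\mu, & (2) \\ 1 &= 5 + 4\lambda + 4\mu. & (3) \end{aligned} \]
which contradicts (4). So, D ∉ q and the four points are not coplanar.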
(b) Let q be the plane that contains A, B, and C. The non-parallel vectors \(\overrightarrow{AB}\) = (1, −3, 5) and \(\overrightarrow{BC}\) = (6, 1, −2) are on q.
Method 1 (Vector Form). q has normal vector \(\overrightarrow{AB} \times \overrightarrow{BC}\) = (1, −3, 5) × (6, 1, −2) = (1, 32, 19).
Since q contains the origin, it may be described by r ⋅ (1, 32, 19) = 0.
Now check if D ∈ q: \(\overrightarrow{OD}\) ⋅ (1, 32, 19) = (4, 7, −12) ⋅ (1, 32, 19) = 4 + 224 − 228 = 0. Yup, it is.
So D ∈ q and the four points are coplanar.
Method 2 (Parametric Form). The plane q may be described in parametric form as:
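\[ r = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} + \lambda\begin{pmatrix} 1 \\ -3 \\ 5 \end{pmatrix} + \mu\begin{pmatrix} 6 \\ 1 \\ -2 \end{pmatrix} = \begin{pmatrix} \lambda + 6\mu \\ -3\lambda + \mu \\ 5\lambda - 2\mu \end{pmatrix} \quad (\lambda, \mu \in \mathbb{R}). \]
Now check if D ∈ q:
\[ \begin{pmatrix} 4 \\ 7 \\ -12 \end{pmatrix} = \begin{pmatrix} \lambda + 6\mu \\ -3\lambda + \mu \\ 5\lambda - 2\mu \end{pmatrix} \quad\text{or}\quad \begin{aligned} 4 &= \lambda + 6\mu, & (1) \\ 7 &= -3\lambda + \mu, & (2) \\ -12 &= 5\lambda - 2\mu. & (3) \end{aligned} \]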
A252(a) Since (3, 2, 1) ∥/ (5, 6, 7), the two lines are not parallel. Now write:
⎜ 1 ⎟ + λ̂ ⎜ 2 ⎟ = ⎜ 2 ⎟ + µ̂ ⎜ 6 ⎟ 1 + 2λ̂ = 2 + 6µ̂,
⎜ ⎟ ⎜ ⎟ ⎜ ⎟ ⎜ ⎟
2
or
⎝5⎠ ⎝1 ⎠ ⎝3⎠ ⎝7 ⎠ 5 + λ̂ = 3 + 7µ̂.
3
2×(2) minus ((1) + (3)) yields −11 = 0, a contradiction. So, the two lines do not intersect.
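\[ \begin{pmatrix} 6 \\ 5 \\ 5 \end{pmatrix} + \hat{\lambda}\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 9 \\ 3 \\ 6 \end{pmatrix} + \hat{\mu}\begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} \quad\text{or}\quad \begin{aligned} 6 + \hat{\lambda} &= 9, & (1) \\ 5 &= 3 + \hat{\mu}, & (2) \\ 5 + \hat{\lambda} &= 6 + \hat{\mu}. & (3) \end{aligned} \]
From (1), λ̂ = 3. From (2), µ̂ = 2. These values of λ̂ and µ̂ satisfy (3). Plugging these back in,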
A254. Rationalise any denominators with surds and write out the sine or cosine values:
√ √ √ √ √ √ √ √
2 2 3 2 2 3 2 3 2
a= − i, b= − i, c= − i, d= − i.
2 2 2 2 2 2 2 2
Comparing the real and imaginary parts, we see that only c = d.
A255(a) z = (33, 33e). (b) w = (237 + π, 3 − √2). (c) ω = (p, q).
\[ z^3 = (-5 + 2i)^2(-5 + 2i) = (21 - 20i)(-5 + 2i) = -105 + 42i + 100i + 40 = -65 + 142i. \]
(c)
\[ zw = (1 + 2i)(3 - \sqrt{2}i) = 3 - \sqrt{2}i + 6i - 2\sqrt{2}i^2 = 3 + 2\sqrt{2} + (6 - \sqrt{2})i. \]
Two complex numbers are equal if and only if their real and imaginary parts are equal. So:
\[ 2a + 3b + 5 \overset{1}{=} 0 \quad\text{and}\quad 11a + 4b + 3 \overset{2}{=} 0. \]
\[ \text{So:}\quad a = \frac{11}{25} = 0.44 \quad\text{and}\quad b = -\frac{5.88}{3} = -1.96. \]
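A260(a) z∗ = −5 − 2i. So:
\[ \frac{1}{z} = \frac{1}{5^2 + 2^2}z^* = \frac{1}{29}(-5 - 2i) = \frac{1}{29}(-5, -2) = -\frac{5}{29} - \frac{2}{29}i. \]
(b) w∗ = 3 + i. So:
\[ \frac{1}{w} = \frac{1}{3^2 + 1^2}w^* = \frac{1}{10}(3 + i) = \frac{1}{10}(3, 1) = 0.3 + 0.1i. \]
(c) ω∗ = 1 − 2i. So:
\[ \frac{1}{\omega} = \frac{1}{1^2 + 2^2}\omega^* = \frac{1}{5}(1 - 2i) = \frac{1}{5}(1, -2) = 0.2 - 0.4i. \]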
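A262(a)
\[ \frac{z}{w} = \frac{zw^*}{0^2 + 1^2} = \frac{(1 + 3i)i}{1} = -3 + i. \]
(b)
\[ \frac{z}{w} = \frac{zw^*}{1^2 + 1^2} = \frac{(2 - 3i)(1 - i)}{2} = \frac{2 - 2i - 3i - 3}{2} = \frac{-1 - 5i}{2} = -0.5 - 2.5i. \]
(c)
\[ \frac{z}{w} = \frac{zw^*}{3^2 + (-\sqrt{2})^2} = \frac{(\sqrt{2} - \pi i)(3 + \sqrt{2}i)}{11} = \frac{3\sqrt{2} + 2i - 3i\pi + \sqrt{2}\pi}{11} = \frac{3 + \pi}{11}\sqrt{2} + \frac{2 - 3\pi}{11}i. \]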
A264.
\[ (-1 \pm \sqrt{3}i)^3 = (-1)^3 \pm 3(-1)^2\sqrt{3}i + 3(-1)(\sqrt{3}i)^2 \mp 3\sqrt{3}i = -1 \pm 3\sqrt{3}i + 9 \mp 3\sqrt{3}i = 8. \]
We can now find the other two roots using the usual quadratic formula:
\[ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{4 \pm \sqrt{4^2 - 4(1)(16)}}{2 \cdot 1} = 2 \pm \sqrt{4 - 16} = 2 \pm \sqrt{12}i = 2 \pm 2\sqrt{3}i. \]
Thus, the three roots of x³ + 64 = 0 are −4, 2 + 2√3 i, and 2 − 2√3 i.
We can now find the other two roots using the usual quadratic formula:
\[ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{-2 \pm \sqrt{(-2)^2 - 4(1)(2)}}{2 \cdot 1} = -1 \pm \sqrt{1 - 2} = -1 \pm i. \]
Thus, the three roots of x³ + x² − 2 = 0 are 1 and −1 ± i.
But in (a), we already worked out the three roots of x3 + x2 − 2 = 0. Thus, the four roots
of x4 − x2 − 2x + 2 = 0 are 1, 1 (repeated), and −1 ± i.
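The roots stated above can be checked by direct substitution using Python’s built-in complex numbers. This is a numerical sketch only; the names `p`, `q`, and `r` are labels chosen here for the three polynomials.

```python
import math

def p(x):
    return x**3 + 64                 # roots: -4 and 2 +/- 2*sqrt(3)*i

def q(x):
    return x**3 + x**2 - 2           # roots: 1 and -1 +/- i

def r(x):
    return x**4 - x**2 - 2*x + 2     # equals (x - 1) * q(x)

roots_p = [-4, complex(2, 2 * math.sqrt(3)), complex(2, -2 * math.sqrt(3))]
roots_q = [1, complex(-1, 1), complex(-1, -1)]

# Floating-point rounding means we test for "very close to zero".
print(all(abs(p(x)) < 1e-9 for x in roots_p))   # True
print(all(abs(q(x)) < 1e-9 for x in roots_q))   # True
print(all(abs(r(x)) < 1e-9 for x in roots_q))   # True
```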
A267. Both equations have real coefficients and so the Complex Conjugate Root Theorem
applies. That is, since 2 − 3i solves both equations, so too does 2 + 3i.
(a) x4 − 6x3 + 18x2 − 14x − 39 = (x2 − 4x + 13) (ax2 + bx + c) = ax4 + (b − 4a) x3 +?x2 +?x + 13c.
We can now find the other two roots using the usual quadratic formula:
\[ x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{-13 \pm \sqrt{13^2 - 4(-2)(-15)}}{2(-2)} = \frac{13 \mp \sqrt{49}}{4} = \frac{13 \mp 7}{4} = 1.5, 5. \]
x2 + px + q = [x − (1 − i)] [x − (1 + i)] = (x − 1) 2 − i2 = x2 − 2x + 2.
Hence, p = −2 and q = 2.
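[Figure: Argand diagram showing the points 2 = (2, 0), −1 = (−1, 0), 2i = (0, 2), 1 + 2i = (1, 2), and −1 − 3i = (−1, −3), with the angle θ and the argument −1.893 of −1 − 3i marked.]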
A271. Refer to figure on the previous page. We have arg 2 = 0, arg (−1) = π, arg (2i) = π/2
and arg (1 + 2i) = tan−1 (2/1) ≈ 1.107. For −1 − 3i, observe that θ = tan−1 (3/1). Thus,
arg (−1 − 3i) = θ − π = tan−1 (3/1) − π ≈ −1.893.
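These arguments can be cross-checked with `cmath.phase` from Python’s standard library, which returns the principal argument in (−π, π]. A short sketch:

```python
import cmath

# The five numbers from A271; phase() gives the principal argument in radians.
for z in (2, -1, 2j, 1 + 2j, -1 - 3j):
    print(z, round(cmath.phase(z), 3))
```

For −1 − 3i this agrees with the hand computation θ − π ≈ −1.893, since `phase` handles the third-quadrant adjustment automatically.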
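A272. [Figure: Argand diagram showing w = −3 + 2i and z = 2 − i.]
\[ |w| = \sqrt{(-3)^2 + 2^2} = \sqrt{13}, \qquad \arg w = \pi - \tan^{-1}\frac{2}{3} \approx 2.554, \]
\[ |z| = \sqrt{2^2 + (-1)^2} = \sqrt{5}, \qquad \arg z = -\tan^{-1}\frac{1}{2} \approx -0.464. \]
A273.
\[ \arg 2 = \cos^{-1}\frac{2}{2} = 0, \quad \arg(-1) = \cos^{-1}\frac{-1}{1} = \pi, \quad \arg(2i) = \cos^{-1}\frac{0}{2} = \pi/2, \]
\[ \arg(1 + 2i) = \cos^{-1}\frac{1}{\sqrt{5}} \approx 1.107, \quad \arg(-1 - 3i) = -\cos^{-1}\frac{-1}{\sqrt{10}} \approx -1.893, \]
\[ \arg z = \arg(2 - i) = -\cos^{-1}\frac{2}{\sqrt{5}} \approx -0.464, \quad \arg w = \arg(-3 + 2i) = \cos^{-1}\frac{-3}{\sqrt{13}} \approx 2.554. \]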
A274. Using the moduli and arguments found in the above answers, we have:
\[ 2 = 2(\cos 0 + i\sin 0), \quad -1 = 1(\cos\pi + i\sin\pi), \quad 2i = 2\left(\cos\frac{\pi}{2} + i\sin\frac{\pi}{2}\right), \]
\[ 1 + 2i \approx \sqrt{5}(\cos 1.107 + i\sin 1.107), \quad -1 - 3i \approx \sqrt{10}(\cos(-1.893) + i\sin(-1.893)), \]
\[ 2 - i \approx \sqrt{5}(\cos(-0.464) + i\sin(-0.464)), \quad -3 + 2i \approx \sqrt{13}(\cos 2.554 + i\sin 2.554). \]
Thus:
\[ \frac{1}{z} = 1e^{0i} = \cos 0 + i\sin 0 = 1. \]
(b) ∣w∣ = 2, arg w = π/2. So, ∣1/w∣ = 1/ ∣w∣ = 1/2, arg (1/w) = − arg w = −π/2. Thus:
\[ \frac{1}{w} = \frac{1}{2}e^{-i\pi/2} = \frac{1}{2}\left(\cos\frac{-\pi}{2} + i\sin\frac{-\pi}{2}\right) = -\frac{1}{2}i. \]
(c) ∣z∣ = 17, arg z = π. So, ∣1/z∣ = 1/ ∣z∣ = 1/17. Note importantly that z < 0, so that Fact 131(b) does not apply here. We have, simply, arg (1/z) = arg z = π. And:
\[ \frac{1}{z} = \frac{1}{17}e^{i\pi} = \frac{1}{17}(\cos\pi + i\sin\pi) = -\frac{1}{17}. \]
(d) ∣w∣ = 8, arg w = −π/2. So, ∣1/w∣ = 1/ ∣w∣ = 1/8, arg (1/w) = − arg w = π/2. Thus:
\[ \frac{1}{w} = \frac{1}{8}e^{i\pi/2} = \frac{1}{8}\left(\cos\frac{\pi}{2} + i\sin\frac{\pi}{2}\right) = \frac{1}{8}i. \]
(e) ∣z∣ = √29, arg z = cos⁻¹ (−2/√29) ≈ 1.951. So, ∣1/z∣ = 1/ ∣z∣ = 1/√29, arg (1/z) ≈ −1.951. Thus:
\[ \frac{1}{z} \approx \frac{1}{\sqrt{29}}e^{-1.951i} \approx \frac{1}{\sqrt{29}}(\cos(-1.951) + i\sin(-1.951)) \approx -0.069 - 0.172i. \]
(f) ∣w∣ = √2, arg w = − cos⁻¹ (−1/√2) = −3π/4. So, ∣1/w∣ = 1/ ∣w∣ = 1/√2, arg (1/w) = 3π/4. Thus:
\[ \frac{1}{w} = \frac{1}{\sqrt{2}}e^{3i\pi/4} = \frac{1}{\sqrt{2}}\left(\cos\frac{3\pi}{4} + i\sin\frac{3\pi}{4}\right) = -\frac{1}{2} + \frac{1}{2}i. \]
(g) ∣z∣ = √10, arg z = − cos⁻¹ (1/√10) ≈ −1.249. So, ∣1/z∣ = 1/ ∣z∣ = 1/√10, arg (1/z) ≈ 1.249. Thus:
\[ \frac{1}{z} \approx \frac{1}{\sqrt{10}}e^{1.249i} \approx \frac{1}{\sqrt{10}}(\cos 1.249 + i\sin 1.249) \approx 0.1 + 0.3i. \]
(h) ∣w∣ = 5, arg w = cos⁻¹ (3/5) ≈ 0.927. So, ∣1/w∣ = 1/ ∣w∣ = 1/5, arg (1/w) = − arg w ≈ −0.927. Thus:
\[ \frac{1}{w} \approx \frac{1}{5}e^{-0.927i} \approx \frac{1}{5}(\cos(-0.927) + i\sin(-0.927)) \approx 0.12 - 0.16i. \]
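The rule used throughout these answers (the reciprocal has modulus 1/∣z∣ and argument −arg z) can be sketched with `cmath.polar` and `cmath.rect`. The helper name `reciprocal_polar` is hypothetical, and the sample inputs are the numbers implied by the moduli and arguments in (c), (e), (g), and (h).

```python
import cmath

def reciprocal_polar(z):
    """Compute 1/z by inverting the modulus and negating the argument."""
    r, theta = cmath.polar(z)
    return cmath.rect(1 / r, -theta)

for z in (-17, -2 + 5j, 1 - 3j, 3 + 4j):
    # Each row shows z, the polar-form reciprocal, and direct division.
    print(z, reciprocal_polar(z), 1 / z)
```

For a negative real such as −17, the computed argument −π describes the same direction as the principal value π, which is why the text treats that case separately.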
A279(a) ∣z∣ = ∣1∣ = 1, arg z = 0, ∣w∣ = ∣−3∣ = 3, and arg w = π. Hence, ∣z/w∣ = ∣z∣ / ∣w∣ = 1/3 and arg (z/w) = arg z − arg w + 2kπ = 0 − π + 2π = π. Thus, z/w = (1/3) (cos π + i sin π) = (1/3) e^{iπ} = −1/3.
(b) ∣z∣ = ∣2i∣ = 2, arg z = π/2, ∣w∣ = ∣1 + 2i∣ = √5, and arg w = cos⁻¹ (1/√5) ≈ 1.107. Hence, ∣z/w∣ = ∣z∣ / ∣w∣ = 2/√5 and arg (z/w) = arg z − arg w + 2kπ = π/2 − 1.107 + 0 ≈ 0.464. Thus, z/w ≈ (2/√5) (cos 0.464 + i sin 0.464) ≈ (2/√5) e^{0.464i} ≈ 0.8 + 0.4i.
\[ \frac{2i}{1 + 2i} = \frac{2i}{1 + 2i} \cdot \frac{1 - 2i}{1 - 2i} = \frac{2i + 4}{1^2 + 2^2} = 0.8 + 0.4i. \]
√ √
(c) ∣z∣ = ∣−1 − 3i∣ = 10, arg z = − cos−1 (−1/ 10) ≈ −1.893, ∣w∣ = ∣3 + 4i∣ = 5, and arg w =
√ √ √
cos (3/5) ≈ 0.927. Hence, ∣z/w∣ = ∣z∣ / ∣w∣ = 10/5 = 2/5 = 0.4 and arg (z/w) = arg z −
−1
A280.
\[ \left|\frac{z}{w}\right| = \left|z\,\frac{1}{w}\right| \overset{1}{=} |z|\left|\frac{1}{w}\right| \overset{2}{=} |z|\frac{1}{|w|} = \frac{|z|}{|w|}, \]
where (1) and (2) use Facts 130 and 131.
128. Part V Answers (Calculus)
\[ \tan = \frac{\sin}{\cos}, \quad \operatorname{cosec} = \frac{1}{\sin}, \quad \sec = \frac{1}{\cos}, \quad \cot = \frac{\cos}{\sin}. \]
Since each of tan, cosec, sec, and cot is defined as the quotient of two continuous
functions, by Theorem 16, each is continuous.
By Definition 69, tan−1 is defined as the inverse of the continuous function tan, which is
defined on the interval R. And so by Theorem 17, tan−1 is also continuous.
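\[ = \lim_{x \to -3}\frac{-x - 3}{x + 3} \quad \text{(For all } x \text{ “near” } -3,\ x < 0 \text{ and hence } |x| = -x) \]
\[ \overset{F}{=} -1 \quad \text{(Constant Factor Rule).} \]
(b) f is differentiable at any a < 0, because the derivative of f at a exists and is equal to −1:
\[ \lim_{x \to a}\frac{f(x) - f(a)}{x - a} = \lim_{x \to a}\frac{|x| - |a|}{x - a} \quad \text{(Simply plug in)} \]
\[ = \lim_{x \to a}\frac{-x + a}{x - a} \quad \text{(For all } x \text{ “near” } a < 0,\ x < 0 \text{ and hence } |x| = -x) \]
\[ \overset{F}{=} -1 \quad \text{(Constant Factor Rule).} \]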
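\[ = \lim_{x \to -3}\frac{[x - (-3)](x - 3)}{x - (-3)} \]
\[ \overset{P,C}{=} -3 + -3 = -6 \quad \text{(Power and Constant Rules).} \]
\[ = \lim_{x \to 0}\frac{(x - 0)(x + 0)}{x - 0} \]
\[ = \lim_{x \to 0}(x + 0) \quad \text{(Note that } x \neq 0) \]
\[ \overset{\pm}{=} \lim_{x \to 0}x + \lim_{x \to 0}0 \quad \text{(Sum and Difference Rules)} \]
\[ \overset{P,C}{=} 0 + 0 = 0 \quad \text{(Power and Constant Rules).} \]
\[ \lim_{x \to a}\frac{g(x) - g(a)}{x - a} = \lim_{x \to a}\frac{x^2 - a^2}{x - a} \quad \text{(Simply plug in)} \]
\[ = \lim_{x \to a}\frac{(x - a)(x + a)}{x - a} \]
\[ = \lim_{x \to a}(x + a) \quad \text{(Note that } x \neq a) \]
\[ \overset{\pm}{=} \lim_{x \to a}x + \lim_{x \to a}a \quad \text{(Sum and Difference Rules)} \]
\[ \overset{P,C}{=} a + a = 2a \quad \text{(Power and Constant Rules).} \]
\[ \lim_{x \to a}\frac{i(x) - i(a)}{x - a} = \lim_{x \to a}\frac{x^4 - a^4}{x - a} \quad \text{(Simply plug in)} \]
\[ \overset{\star}{=} \lim_{x \to a}\frac{(x - a)(x^3 + ax^2 + a^2x + a^3)}{x - a} \]
\[ = \lim_{x \to a}(x^3 + ax^2 + a^2x + a^3) \quad \text{(Note that } x \neq a) \]
\[ \overset{\pm,F}{=} \lim_{x \to a}x^3 + a\lim_{x \to a}x^2 + a^2\lim_{x \to a}x + \lim_{x \to a}a^3 \quad \text{(Sum, Difference, and Constant Factor Rules)} \]
\[ \overset{P,C}{=} a^3 + a \cdot a^2 + a^2 \cdot a + a^3 = 4a^3 \quad \text{(Power and Constant Rules).} \]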
What we’ve just shown is that for any a ∈ R, the derivative of i at a is 4a3 .
Thus, the derivative of i is the function i′ ∶ R → R defined by i′ (x) = 4x3 .
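The limit above can be illustrated numerically: as x approaches a, the difference quotient (i(x) − i(a))/(x − a) approaches 4a³. This is a rough sketch rather than a proof, and the helper name `difference_quotient` is an assumption made here.

```python
def i(x):
    return x**4

def difference_quotient(f, a, h):
    """(f(x) - f(a)) / (x - a) evaluated at x = a + h."""
    return (f(a + h) - f(a)) / h

a = 2.0
for h in (1e-1, 1e-3, 1e-6):
    # The quotient should settle near 4 * a**3 = 32 as h shrinks.
    print(h, difference_quotient(i, a, h))
```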
A292. For any a ∈ R, we have:
We’ve just shown that for any a ∈ R, the derivative of f at a is cac−1 . Thus, the derivative
of f is the function f ′ ∶ R → R defined by f ′ (x) = cxc−1 .
A293. The mistake is in Step 4.
This is a common mistake made by students. To find the derivative of f at 2, we must
first find the derivative of f then plug in 2. The mistake here is to plug in 2 first, getting
f (2) = −2, then differentiating the constant −2, which of course yields 0.
A294(a) Observe that 1/g (a) is a constant. And so, by the Constant Rule for Limits (Theorem 14), we have:
\[ \lim_{x \to a}\frac{1}{g(a)} = \frac{1}{g(a)}. \]
(b) Since f and g are differentiable at a, by Theorem 19 (result), they are also continuous
at a. And so, by the definition of continuity, we have:
(e) Now return to expression ⋆ and plug in the equations (1) through (6) to get:
Thus, the derivative of f /g is the function (f /g)′ ∶ D ∖ {x ∶ g (x) = 0} → R defined by:
dg dg
g ′ (x) = (x) = ġ (x) = 4x3 − 3x2 + 2x − 1 and g ′ (1) = (1) = ġ (
dx dx
d2 g d2 g
g (x)
′′
= (x) = g̈ (x) = 12x2 − 6x + 2 and g (1) =
′′
(1) = g̈ (
dx2 dx2
d3 g ... d3 g ...
g (x)
′′′
= (x) = g (x) = 24x − 6 and g (1) =
′′′
(1) = g
dx3 dx3
d4 g 4 d4 g 4
g (4)
(x) = (x) = ġ (x) = 24 and g (4)
(1) = (1) = ġ (
dx4 dx4
For any n ≥ 5:
dn g dn g
g (n) (x) = (x) = (x) = 0 g (n) (1) = (1) = (
n n
ġ and ġ
dxn dxn
A303.
\[ 1 = -\sqrt{1 - x^2}\,\frac{dy}{dx} \quad\text{or}\quad \frac{dy}{dx} = \frac{-1}{\sqrt{1 - x^2}}. \]
(d) Let y = tan⁻¹ x. Then x = tan y (1). Apply d/dx to (1) to get 1 = sec² y (dy/dx) (2).
From the identity sec² y = 1 + tan² y, we have sec² y = 1 + x² (3). Plug (3) into (2) to get:
\[ 1 = (1 + x^2)\frac{dy}{dx} \quad\text{or}\quad \frac{dy}{dx} = \frac{1}{1 + x^2}. \]
A297. Given x = cos t + t² and y = eᵗ − t³, we may compute:
\[ \frac{dx}{dt} = -\sin t + 2t \quad\text{and}\quad \frac{dy}{dt} = e^t - 3t^2. \]
\[ \text{So}\quad \frac{dy}{dx} = \frac{e^t - 3t^2}{-\sin t + 2t} \quad (\text{for } -\sin t + 2t \neq 0). \]
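A numerical cross-check of A297: the closed-form dy/dx should agree with the ratio of small changes in y and x along the curve. This is only a sketch; the function names and the sample value t = 1 are arbitrary choices, not from the text.

```python
import math

def x(t):
    return math.cos(t) + t**2

def y(t):
    return math.exp(t) - t**3

def dydx(t):
    """Closed form (dy/dt) / (dx/dt), valid where -sin t + 2t != 0."""
    return (math.exp(t) - 3 * t**2) / (-math.sin(t) + 2 * t)

def dydx_numeric(t, h=1e-5):
    """Ratio of central differences of y and x over a small step in t."""
    return (y(t + h) - y(t - h)) / (x(t + h) - x(t - h))

print(dydx(1.0), dydx_numeric(1.0))   # the two values should nearly coincide
```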
The 2nd derivative of g is the function with domain and codomain both R and mapping rule x ↦ 12x² − 6x + 2. It may be denoted g′′ or d²g/dx² or g̈. Evaluated at 1, we have:
\[ g''(1) = \left.\frac{d^2g}{dx^2}\right|_{x=1} = \ddot{g}(1) = 8. \]
The 3rd derivative of g is the function with domain and codomain both R and mapping rule x ↦ 24x − 6. It may be denoted g⁽³⁾ or d³g/dx³ or g with three dots above. Evaluated at 1, we have g⁽³⁾ (1) =
The 4th derivative of g is the function with domain and codomain both R and mapping rule x ↦ 24. It may be denoted g⁽⁴⁾ or d⁴g/dx⁴ or g with four dots above. Evaluated at 1, we have:
\[ g^{(4)}(1) = \left.\frac{d^4g}{dx^4}\right|_{x=1} = 24. \]
For n ≥ 5, the nth derivative of g is the function with domain and codomain both R and mapping rule x ↦ 0. It may be denoted g⁽ⁿ⁾ or dⁿg/dxⁿ or g with n dots above. Evaluated at 1, we have:
\[ g^{(n)}(1) = \left.\frac{d^ng}{dx^n}\right|_{x=1} = 0. \]
A308. (a) It is false that every maximum point or minimum point is a stationary point
— see Points A and E in Example 722.
(b) It is also false that every maximum point or minimum point is a turning point — again,
see Points A and E in Example 722.
(c) It is false that every stationary point is a maximum point or minimum point — see
Point D in Example 722.
(d) By Definition ??, it is true that every turning point is a maximum point or minimum
point.
(e) By Definition ??, it is true that every turning point is a stationary point.
(f) It is false that every stationary point is a turning point — again, see Point D in Example
722.
A310. “If c is a maximum or minimum point AND in the interior of D, then c is a turning
point” — true! By the IET, c is a stationary point. Since c is also either a maximum or a
minimum point, by Definition ??, c is also a turning point.
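(c) The total external surface area of the cone (including the base) is
$$A = \pi r l = \pi\sqrt{\frac{3}{\pi h}}\sqrt{\frac{3}{\pi h} + h^2} = \pi\sqrt{\frac{9}{\pi^2 h^2} + \frac{3h}{\pi}} = \sqrt{\frac{9}{h^2} + 3\pi h}.$$
(d) Compute $\frac{dA}{dh}$:
$$\frac{dA}{dh} = \frac{-\frac{18}{h^3} + 3\pi}{2\sqrt{\frac{9}{h^2} + 3\pi h}} = \frac{3\left(\pi - \frac{6}{h^3}\right)}{2\sqrt{\frac{9}{h^2} + 3\pi h}} = \frac{3\left(\pi - \frac{6}{h^3}\right)}{2A}.$$
So $\frac{dA}{dh} = 0 \iff h = \left(\frac{6}{\pi}\right)^{1/3}$.
$$\frac{d^2A}{dh^2} = \frac{3}{2}\cdot\frac{\frac{18}{h^4}A - \left(\pi - \frac{6}{h^3}\right)\frac{dA}{dh}}{A^2} = \frac{3}{2}\cdot\frac{\frac{18}{h^4}A - \left(\pi - \frac{6}{h^3}\right)\frac{3\left(\pi - \frac{6}{h^3}\right)}{2A}}{A^2} = \frac{9}{4}\cdot\frac{\frac{12}{h^4}A^2 - \left(\pi - \frac{6}{h^3}\right)^2}{A^3}$$
(f) $\frac{12}{h^4}A^2 - \left(\pi - \frac{6}{h^3}\right)^2 = \frac{12}{h^4}\left(\frac{9}{h^2} + 3\pi h\right) - \left(\pi - \frac{6}{h^3}\right)^2$
(g) The numerator of our expression for $\frac{d^2A}{dh^2}$ is always positive. So $\frac{d^2A}{dh^2}$ is always positive. That is, $\frac{dA}{dh}$ is always strictly increasing. So the stationary point we found in (d) must also be the global minimum point.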
$$\begin{aligned}
f'(x) &= k(1 + x)^{k-1},\\
f''(x) &= k(k-1)(1 + x)^{k-2},\\
f^{(3)}(x) &= k(k-1)(k-2)(1 + x)^{k-3},\\
f^{(4)}(x) &= k(k-1)(k-2)(k-3)(1 + x)^{k-4},\\
&\;\;\vdots\\
f^{(n)}(x) &= k(k-1)\dots(k-n+1)(1 + x)^{k-n}.
\end{aligned}$$
$$\begin{aligned}
f(0) &= (1 + 0)^k = 1,\\
f'(0) &= k(1 + 0)^{k-1} = k,\\
f''(0) &= k(k-1)(1 + 0)^{k-2} = k(k-1),\\
&\;\;\vdots\\
f^{(n)}(0) &= k(k-1)\dots(k-n+1)(1 + 0)^{k-n} = k(k-1)\dots(k-n+1) = \frac{k!}{(k-n)!}.
\end{aligned}$$
The nth Maclaurin coefficient generated by f is:
$$c_n = \frac{f^{(n)}(0)}{n!} = \frac{k!/(k-n)!}{n!} = \frac{k!}{(k-n)!\,n!}.$$
So, for each $n \in \mathbb{Z}_0^+$, the nth Maclaurin coefficient generated by cos is:
$$c_n = \frac{\cos^{(n)}(0)}{n!} = \begin{cases}
1/n! & \text{for } n = 0, 4, 8, \dots,\\
0/n! = 0 & \text{for } n = 1, 5, 9, \dots,\\
-1/n! & \text{for } n = 2, 6, 10, \dots,\\
0/n! = 0 & \text{for } n = 3, 7, 11, \dots
\end{cases}$$
$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \dots$$
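(c) First, the derivatives of g are:
$$\begin{aligned}
g'(x) &= \frac{1}{1 + x},\\
g''(x) &= \frac{-1}{(1 + x)^2},\\
g^{(3)}(x) &= \frac{2 \cdot 1}{(1 + x)^3},\\
g^{(4)}(x) &= \frac{-3 \cdot 2 \cdot 1}{(1 + x)^4},\\
&\;\;\vdots\\
g^{(n)}(x) &= \frac{(-1)^{n-1}(n-1)!}{(1 + x)^n}.
\end{aligned}$$
Since $c_0 = g(0) = \ln 1 = 0$, the series effectively starts at n = 1:
$$\sum_{n=0}^{\infty} c_n x^n = \sum_{n=0}^{\infty} \frac{g^{(n)}(0)}{n!}x^n = \sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n}x^n = \frac{x}{1} - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \dots$$
$$g(x) = \ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \dots$$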
A321(a) From List MF26, we know that $\exp x = M(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \dots$ Thus, the 0th, 1st, 2nd, and 3rd Maclaurin series polynomials generated by exp are simply:
$$M_0(x) = 1, \quad M_1(x) = 1 + x, \quad M_2(x) = 1 + x + \frac{x^2}{2!}, \quad\text{and}\quad M_3(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!}.$$
Figure to be
inserted here.
(b) From List MF26, we know that $f(x) = (1 + x)^k = M(x) = 1 + kx + \frac{k(k-1)}{2!}x^2 + \frac{k(k-1)(k-2)}{3!}x^3 + \dots$
Thus, the 0th, 1st, 2nd, and 3rd Maclaurin series polynomials generated by f are simply:
Figure to be
inserted here.
(c) From List MF26, we know that $\cos x = M(x) = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \dots = 1 - \frac{x^2}{2} + \frac{x^4}{4!} - \dots$ Thus, the 0th, 1st, 2nd, and 3rd Maclaurin series polynomials generated by cos are simply:
$$M_0(x) = 1, \quad M_1(x) = 1, \quad M_2(x) = 1 - \frac{x^2}{2}, \quad\text{and}\quad M_3(x) = 1 - \frac{x^2}{2}.$$
Figure to be
inserted here.
For small x, $\cos x \approx M_2(x) = 1 - \frac{x^2}{2}$. We call this the small-angle approximation for cosine.
A322(a) The derivative of f is the function $f' : (-1, 1) \to \mathbb{R}$ defined by:
$$f'(x) = \frac{1}{(1 - x)^2}.$$
(b) $$f'(x) = \frac{1}{(1 - x)^2} = 1 + 2x + 3x^2 + \dots = \sum_{n=0}^{\infty} n x^{n-1}.$$
(c) It would appear that we can indeed find the derivative of f simply by differentiating
its Maclaurin series term by term!
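The observation in (c) can be checked numerically. The sketch below assumes, as the answer suggests, that f(x) = 1/(1 − x) on (−1, 1), so that its Maclaurin series is 1 + x + x² + ...; the function names are invented for this illustration. It sums the term-by-term derivative and compares it with 1/(1 − x)².

```python
def derivative_series_partial_sum(x, terms=200):
    """Partial sum of n * x**(n-1) for n = 1..terms, i.e. the
    term-by-term derivative of 1 + x + x^2 + ... (assumed series of f)."""
    return sum(n * x ** (n - 1) for n in range(1, terms + 1))

def closed_form_derivative(x):
    """f'(x) = 1 / (1 - x)^2, as found in part (a)."""
    return 1 / (1 - x) ** 2

for x in (-0.3, 0.1, 0.5):
    print(x, derivative_series_partial_sum(x), closed_form_derivative(x))
```

The agreement degrades as |x| approaches 1, where many more terms are needed.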
A325. By the double angle formula for sine, we have:
$$f(x) = \sin x \cos x = \frac{1}{2}\sin 2x.$$
Hence: f ′ (x) = cos 2x, f ′′ (x) = −2 sin 2x, and f (3) (x) = −4 cos 2x.
And so: f (0) = 0, f ′ (0) = 1, f ′′ (0) = 0, and f (3) (0) = −4.
Thus: m0 = 0, m1 = 1, m2 = 0, m3 = −4/3! = −2/3 and:
$$f(x) = \sin x \cos x = 0 + 1x + 0x^2 - \frac{2}{3}x^3 + \dots = x - \frac{2}{3}x^3 + \dots$$
A328. Define g ∶ R → R by g (y) = sin y and h ∶ (−1, 1] → R by h (x) = ln (1 + x).
Then the composite function f = gh ∶ (−1, 1] → R is indeed defined by f (x) = sin [ln (1 + x)].
The power (and also Maclaurin) series representation of g is:
$$g(y) = \sin y \overset{1}{=} y - \frac{y^3}{3!} + \frac{y^5}{5!} - \dots \quad\text{for } y \in \mathbb{R}.$$
The power (and also Maclaurin) series representation of h is:
$$h(x) = \ln(1 + x) \overset{2}{=} x - \frac{x^2}{2} + \frac{x^3}{3} - \dots \quad\text{for } x \in (-1, 1].$$
Since we’re only asked to write down the expansion up to and including the $x^3$ term, we have:
$$f(x) = \sin[\ln(1 + x)] = \left(x - \frac{x^2}{2} + \frac{x^3}{3} - \dots\right) - \frac{1}{3!}\left(x^3\right) + \dots = x - \frac{x^2}{2} + \frac{x^3}{6} - \dots \quad\text{for } x \in (-1, 1].$$
$$f^{(3)}(x) = -\frac{\cos[\ln(1 + x)]}{(1 + x)^3} + 2\frac{\sin[\ln(1 + x)]}{(1 + x)^3} + \frac{\sin[\ln(1 + x)]}{(1 + x)^3} + 2\frac{\cos[\ln(1 + x)]}{(1 + x)^3}.$$
$$\begin{aligned}
f(0) &= 0,\\
f'(0) &= 1,\\
f''(0) &= 0 - 1 = -1,\\
f^{(3)}(0) &= -1 + 0 + 0 + 2 = 1.
\end{aligned}$$
$$f(x) = \sin[\ln(1 + x)] = \frac{0}{0!} + \frac{1}{1!}x + \frac{-1}{2!}x^2 + \frac{1}{3!}x^3 + \dots = x - \frac{x^2}{2} + \frac{x^3}{6} - \dots \quad\text{for } x \in (-1, 1].$$
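As a rough numerical check of the expansion above, the following sketch (function names are illustrative only) compares sin[ln(1 + x)] with the cubic x − x²/2 + x³/6 for small x. The gap should shrink quickly as x approaches 0.

```python
import math

def f(x):
    """The composite function f(x) = sin(ln(1 + x)), for x > -1."""
    return math.sin(math.log(1 + x))

def maclaurin_cubic(x):
    """Expansion up to and including the x^3 term: x - x^2/2 + x^3/6."""
    return x - x ** 2 / 2 + x ** 3 / 6

for x in (0.2, 0.1, 0.05):
    print(x, f(x), maclaurin_cubic(x), abs(f(x) - maclaurin_cubic(x)))
```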
128.9. Ch. 77 Answers (Integration)
A335(a) The derivative of F is the function F ′ ∶ R → R with mapping rule F ′ (x) = 4 sin 4x.
Thus, F ′ = f or equivalently F = ∫ f — that is, F is indeed an antiderivative for f .
The derivative of G is the function G′ ∶ R → R with mapping rule
So, F and G do not contradict our assertion that “antiderivatives are unique up to a
constant”, because they do indeed differ by only a constant.
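A336. Using the various Rules of Differentiation, we have:
(a) $\frac{d}{dx}(kx + C) = k$.
(b) $\frac{d}{dx}\left(\frac{x^{k+1}}{k+1} + C\right) = (k+1)\frac{x^k}{k+1} = x^k$.
(d) $\frac{d}{dx}(e^x + C) = e^x$.
(e) $\frac{d}{dx}(-\cos x + C) = \sin x$.
(f) $\frac{d}{dx}(\sin x + C) = \cos x$.
(g) $\frac{d}{dx}f \pm \frac{d}{dx}g = \frac{d}{dx}(f \pm g)$.
(h) $\frac{d}{dx}(kf) = k\frac{d}{dx}f$.
(i) $\frac{d}{dx}\left[\frac{1}{a}\left(\int f\right)(ax + b)\right] = \frac{1}{a}\frac{d}{dx}\left[\left(\int f\right)(ax + b)\right] \overset{C}{=} \frac{1}{a}f(ax + b)\cdot a = f(ax + b)$.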
A337(b) Suppose a function has mapping rule $x \mapsto x^k$ (where k ≠ −1). Then this function’s antiderivatives are exactly those functions whose mapping rule is $x \mapsto \frac{x^{k+1}}{k+1} + C$.
(c) Suppose a function has mapping rule $x \mapsto \frac{1}{x}$. Then this function’s antiderivatives are exactly those functions whose mapping rule is $x \mapsto \ln|x| + C$.
Remark 157. Observe that the function with mapping rule $x \mapsto \frac{1}{x}$ could have a domain as large as $\mathbb{R} \setminus \{0\}$. In which case, we want to say that its antiderivative also has the same domain. But this wouldn’t be the case if we simply say that the antiderivative has mapping rule $x \mapsto \ln x + C$, because in H2 Maths, ln isn’t defined for negative values.
This, along with the fact that $\frac{d}{dx}(\ln|x| + C) = \frac{1}{x}$, are the reasons why, in general, it is not OK to simply drop the absolute value sign here.
Note though that if the function with mapping rule $x \mapsto \frac{1}{x}$ has domain $D \subseteq \mathbb{R}^+$, then it is OK to drop the absolute value sign when writing down its antiderivative.
(d) Suppose a function has mapping rule x ↦ ex . Then this function’s antiderivatives are
exactly those functions whose mapping rule is x ↦ ex + C.
(e) Suppose a function has mapping rule x ↦ sin x. Then this function’s antiderivatives
are exactly those functions whose mapping rule is x ↦ − cos x + C.
(f) Suppose a function has mapping rule x ↦ cos x. Then this function’s antiderivatives
are exactly those functions whose mapping rule is x ↦ sin x + C.
(g) Suppose a function has mapping rule x ↦ (f ± g) (x). Then this function’s antiderivatives are exactly those functions whose mapping rule is x ↦ ∫ f (x) ± ∫ g (x).
(h) Suppose a function has mapping rule x ↦ kf (x). Then this function’s antiderivatives
are exactly those functions whose mapping rule is x ↦ k ∫ f (x).
A338. Once again, this goes back to our repeated warning about the distinction between
antidifferentiation and integration.
As you were asked to write down in the previous exercise, the Constant Factor Rule in
Theorem 27 (Rules of Antidifferentiation) says that given a function f with antiderivative
∫ f , a function with mapping rule x ↦ kf (x) will have antiderivative k ∫ f .
In contrast, the Constant Factor Rule in Theorem 25 (Rules of Integration) says that if
the area under the graph of f between a and b is S, then the area under the graph of kf
between a and b is kS.
A priori, these two Constant Factor Rules have absolutely no relationship with each other.
One is concerned with antiderivatives, while the other is concerned with finding the area
under a curve. That they are in fact related is established only by the two FTCs.
Of course, these same remarks also apply to the Sum and Difference Rules listed in each of
these same two Theorems.
A339. In each of the following, C denotes the constant of integration.
(a) $\int ax + b \, dx = \frac{1}{2}ax^2 + bx + C$.
(b) $\int ax^2 + bx + c \, dx = \frac{1}{3}ax^3 + \frac{1}{2}bx^2 + cx + C$.
(c) $\int ax^3 + bx^2 + cx + d \, dx = \frac{1}{4}ax^4 + \frac{1}{3}bx^3 + \frac{1}{2}cx^2 + dx + C$.
(d) $\int (ax + b)^c \, dx = \frac{1}{a(c+1)}(ax + b)^{c+1} + C$.
(e) $\int \frac{1}{ax + b} \, dx = \frac{1}{a}\ln|ax + b| + C$.
(f) $\int a\sin(bx + c) + d \, dx = -\frac{a}{b}\cos(bx + c) + dx + C$.
(g) $\int a\exp(bx + c) + d \, dx = \frac{a}{b}\exp(bx + c) + dx + C$.
(h) $\int a\cos bx + c + \frac{1}{dx} \, dx = \frac{a}{b}\sin bx + cx + \frac{1}{d}\ln|x| + C$.
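$$\int \frac{1}{5x^2 - 2x - 3} \, dx = -\frac{5}{8}\int \frac{1}{5x + 3} \, dx + \frac{1}{8}\int \frac{1}{x - 1} \, dx = -\frac{5}{8}\cdot\frac{1}{5}\ln|5x + 3| + \frac{1}{8}\ln|x - 1| + C = \frac{1}{8}\ln\left|\frac{x - 1}{5x + 3}\right| + C.$$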
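Comparing coefficients, we have A + B = 0 and (B − A) a = 1. Solving, we have $A = -\frac{1}{2a}$ and $B = \frac{1}{2a}$. Thus:
$$\begin{aligned}
\int \frac{1}{x^2 - a^2} \, dx &= \int \frac{-\frac{1}{2a}}{x + a} + \frac{\frac{1}{2a}}{x - a} \, dx\\
&= \int \frac{-\frac{1}{2a}}{x + a} \, dx + \int \frac{\frac{1}{2a}}{x - a} \, dx && \text{(Sum Rule)}\\
&= -\frac{1}{2a}\int \frac{1}{x + a} \, dx + \frac{1}{2a}\int \frac{1}{x - a} \, dx && \text{(Constant Rule)}\\
&= -\frac{1}{2a}\ln|x + a| + \frac{1}{2a}\ln|x - a| + C && \text{(Reciprocal Rule)}\\
&= \frac{1}{2a}\left(\ln|x - a| - \ln|x + a|\right) + C\\
&= \frac{1}{2a}\ln\frac{|x - a|}{|x + a|} + C && \text{(Law of Logarithm)}\\
&= \frac{1}{2a}\ln\left|\frac{x - a}{x + a}\right| + C, && \text{(Fact 42)}
\end{aligned}$$
where C is, as usual, our COI (and could’ve been denoted by any other symbol).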
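A343(a)
$$\begin{aligned}
\int \frac{7x + 2}{4x^2 - 4x + 1} \, dx &= \int \frac{7x + 2}{(2x - 1)^2} \, dx = \int \frac{7}{2}\cdot\frac{2x - 1}{(2x - 1)^2} + \frac{11/2}{(2x - 1)^2} \, dx\\
&= \frac{7}{2}\int \frac{1}{2x - 1} \, dx + \frac{11}{2}\int \frac{1}{(2x - 1)^2} \, dx\\
&= \frac{7}{2}\cdot\frac{1}{2}\ln|2x - 1| + \frac{11}{2}\left(-\frac{1}{2}\right)\frac{1}{2x - 1} + C = \frac{7}{4}\ln|2x - 1| + \frac{11}{4 - 8x} + C.
\end{aligned}$$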
So:
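$$\begin{aligned}
\int \frac{7x + 2}{x^2 + x - 6} \, dx &= \int \frac{7x + 2}{(x + 3)(x - 2)} \, dx = \int 7\frac{x + 3}{(x + 3)(x - 2)} - \frac{19}{(x + 3)(x - 2)} \, dx\\
&= 7\int \frac{1}{x - 2} \, dx - 19\int \frac{1}{(x + 3)(x - 2)} \, dx\\
&\overset{1}{=} 7\ln|x - 2| - \frac{19}{5}\ln\left|\frac{x - 2}{x + 3}\right| + C = \frac{1}{5}\left(16\ln|x - 2| + 19\ln|x + 3|\right) + C.
\end{aligned}$$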
Note that the last step is nice but not necessary (you should however be perfectly capable
of doing this bit of algebra).
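(c) Below, $\overset{2}{=}$ uses our answer from Exercise 342(a).
$$\begin{aligned}
\int \frac{7x + 2}{5x^2 - 2x - 3} \, dx &= \int \frac{7x + 2}{(5x + 3)(x - 1)} \, dx = \int 7\frac{x - 1}{(5x + 3)(x - 1)} + \frac{9}{(5x + 3)(x - 1)} \, dx\\
&= 7\int \frac{1}{5x + 3} \, dx + 9\int \frac{1}{(5x + 3)(x - 1)} \, dx\\
&\overset{2}{=} \frac{7}{5}\ln|5x + 3| + \frac{9}{8}\ln\left|\frac{x - 1}{5x + 3}\right| + C = \frac{11}{40}\ln|5x + 3| + \frac{9}{8}\ln|x - 1| + C.
\end{aligned}$$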
(h)
$$\frac{d}{dx}\left(\ln|\sec x + \tan x| + C\right) = \begin{cases}
\dfrac{\sec x\tan x + \sec^2 x}{\sec x + \tan x} = \dfrac{\sec x(\sec x + \tan x)}{\sec x + \tan x} = \sec x & \text{for } \sec x + \tan x > 0,\\[2ex]
\dfrac{-\sec x\tan x - \sec^2 x}{-(\sec x + \tan x)} = \dfrac{\sec x(\sec x + \tan x)}{\sec x + \tan x} = \sec x & \text{for } \sec x + \tan x < 0.
\end{cases}$$
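(d) We have the identity: $\sin P + \sin Q \overset{1}{=} 2\sin\frac{P + Q}{2}\cos\frac{P - Q}{2}$.
So let: $mx = \frac{P + Q}{2}$ and $nx = \frac{P - Q}{2}$.
Which means: $P \overset{2}{=} (m + n)x$ and $Q \overset{3}{=} (m - n)x$.
$$\begin{aligned}
\int \sin mx\cos nx \, dx &= \int \frac{1}{2}\left[\sin(m + n)x + \sin(m - n)x\right] dx\\
&= -\frac{1}{2}\left[\frac{\cos(m + n)x}{m + n} + \frac{\cos(m - n)x}{m - n}\right] + C.
\end{aligned}$$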
(e) We have the identity: $\cos P - \cos Q \overset{1}{=} -2\sin\frac{P + Q}{2}\sin\frac{P - Q}{2}$.
So let: $mx = \frac{P + Q}{2}$ and $nx = \frac{P - Q}{2}$.
Which means: $P \overset{2}{=} (m + n)x$ and $Q \overset{3}{=} (m - n)x$.
$$\begin{aligned}
\int \sin mx\sin nx \, dx &= \int -\frac{1}{2}\left[\cos(m + n)x - \cos(m - n)x\right] dx\\
&= \frac{1}{2}\left[\frac{\sin(m - n)x}{m - n} - \frac{\sin(m + n)x}{m + n}\right] + C.
\end{aligned}$$
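(f) We have the identity: $\cos P + \cos Q \overset{1}{=} 2\cos\frac{P + Q}{2}\cos\frac{P - Q}{2}$.
So let: $mx = \frac{P + Q}{2}$ and $nx = \frac{P - Q}{2}$.
Which means: $P \overset{2}{=} (m + n)x$ and $Q \overset{3}{=} (m - n)x$.
$$\begin{aligned}
\int \cos mx\cos nx \, dx &= \int \frac{1}{2}\left[\cos(m + n)x + \cos(m - n)x\right] dx\\
&= \frac{1}{2}\left[\frac{\sin(m - n)x}{m - n} + \frac{\sin(m + n)x}{m + n}\right] + C.
\end{aligned}$$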
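The three antiderivatives in (d), (e) and (f) can be spot-checked by differentiating them numerically and comparing with the original integrands. This is only a sketch: the function names and the sample values m = 3, n = 1, x = 0.7 are arbitrary choices for illustration, and the formulas assume m ≠ ±n.

```python
import math

def F_sin_cos(x, m, n):
    """Antiderivative of sin(mx)cos(nx) from (d), with C = 0."""
    return -0.5 * (math.cos((m + n) * x) / (m + n) + math.cos((m - n) * x) / (m - n))

def F_sin_sin(x, m, n):
    """Antiderivative of sin(mx)sin(nx) from (e), with C = 0."""
    return 0.5 * (math.sin((m - n) * x) / (m - n) - math.sin((m + n) * x) / (m + n))

def F_cos_cos(x, m, n):
    """Antiderivative of cos(mx)cos(nx) from (f), with C = 0."""
    return 0.5 * (math.sin((m - n) * x) / (m - n) + math.sin((m + n) * x) / (m + n))

def slope(F, x, m, n, h=1e-6):
    """Symmetric difference quotient of F with respect to x."""
    return (F(x + h, m, n) - F(x - h, m, n)) / (2 * h)

m, n, x = 3.0, 1.0, 0.7
checks = {
    "sin mx cos nx": (slope(F_sin_cos, x, m, n), math.sin(m * x) * math.cos(n * x)),
    "sin mx sin nx": (slope(F_sin_sin, x, m, n), math.sin(m * x) * math.sin(n * x)),
    "cos mx cos nx": (slope(F_cos_cos, x, m, n), math.cos(m * x) * math.cos(n * x)),
}
for name, (numeric, exact) in checks.items():
    print(name, numeric, exact)
```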
A357. Since $\cot x = \frac{\cos x}{\sin x}$ and $\sin'(x) = \cos x$, we have:
$$\int \cot x \, dx = \int \frac{\cos x}{\sin x} \, dx = \int \frac{\sin'(x)}{\sin x} \, dx \overset{\star}{=} \ln|\sin x| + C.$$
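A355(a) From the given substitution $x \overset{1}{=} \sec u$, we have $\frac{dx}{du} \overset{4}{=} \sec u\tan u$.
Now plug in $\overset{1}{=}$:
$$\begin{aligned}
\int \frac{1}{x^2\sqrt{x^2 - 1}} \, dx &\overset{1}{=} \int \frac{1}{\sec^2 u\,|\tan u|} \, dx\\
&= \int \frac{1}{\sec^2 u\tan u} \, dx && \text{(By hint)}\\
&\overset{4}{=} \int \frac{1}{\sec u\,\frac{dx}{du}} \, dx\\
&= \int \frac{1}{\sec u}\frac{du}{dx} \, dx\\
&\overset{s}{=} \int \frac{1}{\sec u} \, du\\
&= \int \cos u \, du\\
&= \sin u + C_1\\
&\overset{1}{=} \sin\left(\sec^{-1} x\right) + C_1.
\end{aligned}$$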
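(b) From the given substitution $u \overset{2}{=} 1 - \frac{1}{x^2}$, we have $\frac{du}{dx} \overset{5}{=} \frac{2}{x^3}$ and also:
$$\sqrt{u} = \sqrt{1 - \frac{1}{x^2}} = \sqrt{\frac{x^2 - 1}{x^2}} \overset{6}{=} \frac{\sqrt{x^2 - 1}}{x}.$$
Now:
$$\begin{aligned}
\int \frac{1}{x^2\sqrt{x^2 - 1}} \, dx &= \int \frac{x}{2\sqrt{x^2 - 1}}\cdot\frac{2}{x^3} \, dx\\
&\overset{5}{=} \int \frac{x}{2\sqrt{x^2 - 1}}\frac{du}{dx} \, dx\\
&\overset{s}{=} \int \frac{x}{2\sqrt{x^2 - 1}} \, du\\
&\overset{6}{=} \int \frac{1}{2\sqrt{u}} \, du\\
&= \sqrt{u} + C_2\\
&\overset{6}{=} \frac{\sqrt{x^2 - 1}}{x} + C_2.
\end{aligned}$$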
(c) Answering Hint 2, two antiderivatives of the same function differ by at most a constant.
Hence:
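$$\sin\left(\sec^{-1} x\right) \overset{7}{=} \frac{\sqrt{x^2 - 1}}{x} + D \quad\text{for some } D \in \mathbb{R}.$$
Following Hint 3, plug x = 2 into $\overset{7}{=}$ to get:
$$\text{LHS} = \sin\left(\sec^{-1} 2\right) = \sin\frac{\pi}{3} = \frac{\sqrt{3}}{2},$$
$$\text{RHS} = \frac{\sqrt{2^2 - 1}}{2} + D = \frac{\sqrt{3}}{2} + D.$$
Hence D = 0, and so:
$$\sin\left(\sec^{-1} x\right) = \frac{\sqrt{x^2 - 1}}{x}.$$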
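$u \overset{4}{=} \tan^{-1} x$ and $\frac{dx}{du} \overset{5}{=} \sec^2 u$.
$$\begin{aligned}
\int \frac{x^3}{(x^2 + 1)^{3/2}} \, dx &\overset{1}{=} \int \frac{\tan^3 u}{(\tan^2 u + 1)^{3/2}} \, dx\\
&= \int \frac{\tan^3 u}{(\sec^2 u)^{3/2}} \, dx\\
&= \int \frac{\tan^3 u}{\sec^3 u} \, dx\\
&= \int \frac{\tan^3 u}{\sec u\,\frac{dx}{du}} \, dx\\
&\overset{5}{=} \int \frac{\tan^3 u}{\sec u}\frac{du}{dx} \, dx\\
&\overset{s}{=} \int \frac{\tan^3 u}{\sec u} \, du\\
&= \int \frac{\sin^2 u}{\cos^2 u}\sin u \, du\\
&= \int \frac{1 - \cos^2 u}{\cos^2 u}\sin u \, du\\
&= \int \frac{\sin u}{\cos^2 u} - \sin u \, du\\
&= \frac{1}{\cos u} + \cos u + C_1\\
&\overset{4}{=} \frac{1}{\cos\left(\tan^{-1} x\right)} + \cos\left(\tan^{-1} x\right) + C_1.
\end{aligned}$$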
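(b) From the given substitution $u \overset{2}{=} x^2 + 1$, we have $\frac{du}{dx} \overset{6}{=} 2x$.
$$\begin{aligned}
\int \frac{x^3}{(x^2 + 1)^{3/2}} \, dx &\overset{2}{=} \int \frac{x^3}{u^{3/2}} \, dx\\
&= \int \frac{x^2}{2u^{3/2}}\,2x \, dx\\
&\overset{6}{=} \int \frac{x^2}{2u^{3/2}}\frac{du}{dx} \, dx\\
&\overset{s}{=} \int \frac{x^2}{2u^{3/2}} \, du\\
&\overset{2}{=} \frac{1}{2}\int \frac{u - 1}{u^{3/2}} \, du\\
&= \frac{1}{2}\int \frac{1}{u^{1/2}} - \frac{1}{u^{3/2}} \, du\\
&= \frac{1}{2}\left(2u^{1/2} + 2u^{-1/2}\right) + C_2\\
&= \sqrt{u} + \frac{1}{\sqrt{u}} + C_2\\
&= \sqrt{x^2 + 1} + \frac{1}{\sqrt{x^2 + 1}} + C_2.
\end{aligned}$$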
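(c) $\frac{1}{a} + a = \frac{1}{b} + b \overset{\times ab}{\implies} b + a^2 b = a + ab^2 \iff ba^2 - (1 + b^2)a + b = 0$. By the quadratic formula:
$$\begin{aligned}
a &= \frac{1 + b^2 \pm \sqrt{(1 + b^2)^2 - 4b^2}}{2b}\\
&= \frac{1 + b^2 \pm \sqrt{b^4 + 2b^2 + 1 - 4b^2}}{2b}\\
&= \frac{1 + b^2 \pm \sqrt{b^4 - 2b^2 + 1}}{2b}\\
&= \frac{1 + b^2 \pm (b^2 - 1)}{2b} = b, \frac{1}{b}.
\end{aligned}$$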
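(d) The two antiderivatives found in (a) and (b) differ by at most a constant:
$$\frac{1}{\cos\left(\tan^{-1} x\right)} + \cos\left(\tan^{-1} x\right) \overset{7}{=} \sqrt{x^2 + 1} + \frac{1}{\sqrt{x^2 + 1}} + D \quad\text{for some } D \in \mathbb{R}.$$
Plugging x = 0 into $\overset{7}{=}$:
$$\text{LHS} = \frac{1}{\cos\left(\tan^{-1} 0\right)} + \cos\left(\tan^{-1} 0\right) = 1 + 1 = 2,$$
$$\text{RHS} = \sqrt{0^2 + 1} + \frac{1}{\sqrt{0^2 + 1}} + D = 1 + 1 + D = 2 + D.$$
Hence D = 0. By (c) then:
$$\text{Either}\quad \cos\left(\tan^{-1} x\right) \overset{8}{=} \sqrt{x^2 + 1} \quad\text{or}\quad \cos\left(\tan^{-1} x\right) \overset{9}{=} \frac{1}{\sqrt{x^2 + 1}}.$$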
A??. We do not know which the area function is, amongst the infinitely many indefinite
integrals of f . We merely know that the area function is one of them. Hence, we use the
indefinite article an, rather than the definite article the.
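Method #2. $y = x^3 \iff x = y^{1/3}$. So $A = \int_{y=1}^{y=2} x \, dy = \int_1^2 y^{1/3} \, dy = \left[\frac{3}{4}y^{4/3}\right]_1^2 = \frac{3}{4}\left(2^{4/3} - 1\right)$.
Figure omitted (axes x and y; lines y = 1 and y = 2; labels A, B, C, D).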
3− .
π
3
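A366. By the quadratic formula, the two curves intersect at $\pm\sqrt{2}/2$. So
$$A = \int_{-\sqrt{2}/2}^{\sqrt{2}/2} 2 - x^2 - (x^2 + 1) \, dx = \left[x - \frac{2x^3}{3}\right]_{-\sqrt{2}/2}^{\sqrt{2}/2} = \left[\frac{\sqrt{2}}{2} - \frac{2\sqrt{2}}{12}\right] - \left[-\frac{\sqrt{2}}{2} + \frac{2\sqrt{2}}{12}\right] = \frac{2\sqrt{2}}{3}.$$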
$$\int_{-2}^{2} x^4 - 16 \, dx = \left[\frac{x^5}{5} - 16x\right]_{-2}^{2} = \left(\frac{32}{5} - 32\right) - \left(\frac{-32}{5} + 32\right) = -\frac{256}{5}.$$
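Definite integrals such as the two just computed can be double-checked with a simple numerical rule. The sketch below uses composite Simpson's rule (a standard technique, not the method of this textbook); the function name `simpson` and the choice of 1000 subintervals are illustrative assumptions.

```python
import math

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n subintervals (n must be even)."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Odd-indexed interior points get weight 4, even-indexed get weight 2.
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Area between the two curves in A366: integrand 2 - x^2 - (x^2 + 1).
r = math.sqrt(2) / 2
area_a366 = simpson(lambda x: 2 - x ** 2 - (x ** 2 + 1), -r, r)

# The integral of x^4 - 16 from -2 to 2.
quartic = simpson(lambda x: x ** 4 - 16, -2, 2)

print(area_a366, 2 * math.sqrt(2) / 3)
print(quartic, -256 / 5)
```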
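A368. (Again, it helps to graph this on your calculator.) Note that $y = 1 \iff t = \sqrt[3]{2}$;
$$\int_{y=1}^{y=2} x \, dy = \int_{t=\sqrt[3]{2}}^{t=\sqrt[3]{3}} \left(t^2 + 2t\right) 3t^2 \, dt = \left[\frac{3t^5}{5} + \frac{6t^4}{4}\right]_{\sqrt[3]{2}}^{\sqrt[3]{3}}$$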
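$$\frac{d}{dx}\ln\frac{1}{x} = x\left(\frac{-1}{x^2}\right) \overset{1}{=} -\frac{1}{x} \quad\text{and}\quad \frac{d}{dx}(-\ln x) \overset{2}{=} -\frac{1}{x}.$$
Or equivalently:
$$\int\left(-\frac{1}{x}\right) \overset{1}{=} \ln\frac{1}{x} \quad\text{and}\quad \int\left(-\frac{1}{x}\right) \overset{2}{=} -\ln x.$$
By Corollary 32 then, $\ln\frac{1}{x}$ and $-\ln x$ differ by at most a constant:
$$\ln\frac{1}{x} \overset{3}{=} -\ln x + C.$$
Plugging in x = 1, we have $\ln 1 = 0 \overset{3}{=} -\ln 1 + C = C$. Hence, C = 0. And thus, $\ln\frac{1}{x} = -\ln x$.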
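A371(e) $c\log_b a \overset{\star}{=} \frac{c\ln a}{\ln b} = \frac{\ln a^c}{\ln b} \overset{\star}{=} \log_b a^c$. (The middle step uses Fact 159.)
(f) $\log_b\frac{1}{a} \overset{\star}{=} \frac{\ln(1/a)}{\ln b} = -\frac{\ln a}{\ln b} \overset{\star}{=} -\log_b a$. (The middle step uses Fact 158(c).)
(g) $\log_b(ac) \overset{\star}{=} \frac{\ln(ac)}{\ln b} = \frac{\ln a + \ln c}{\ln b} \overset{\star}{=} \log_b a + \log_b c$. (The middle step uses Fact 158(b).)
(h) is immediate from (f) and (g). But if we’d like, we can also prove this without using (f) and (g):
$$\log_b\frac{a}{c} \overset{\star}{=} \frac{\ln(a/c)}{\ln b} = \frac{\ln a - \ln c}{\ln b} \overset{\star}{=} \log_b a - \log_b c.$$
(The middle step uses Fact 158(d).)
(i) $\log_c a \overset{\star}{=} \frac{\ln a}{\ln c}$ and $\log_c b \overset{\star}{=} \frac{\ln b}{\ln c}$. So:
$$\frac{\log_c a}{\log_c b} = \frac{\ln a/\ln c}{\ln b/\ln c} = \frac{\ln a}{\ln b} \overset{\star}{=} \log_b a.$$
(j) $\log_{a^b} c \overset{\star}{=} \frac{\ln c}{\ln a^b} = \frac{\ln c}{b\ln a} \overset{\star}{=} \frac{1}{b}\log_a c$. (The middle step uses Fact 159.)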
Integrating by parts twice (each time with $v' = e^x$, and with $u = \sin x$ and then $u = \cos x$):
$$y = \int e^x\sin x \, dx = e^x\sin x - \int \cos x\,e^x \, dx = e^x\sin x - \Big(e^x\cos x + \underbrace{\int \sin x\,e^x \, dx}_{y}\Big)$$
A373. Rearranging, $\frac{dx}{dy} = \frac{1}{y^2 + 1}$. So the general solution is $x = \int \frac{1}{y^2 + 1} \, dy = \tan^{-1} y + C$ (Proposition ??). Rearranging, the general solution is y = tan (x + D) (where D = −C).
Given also the initial condition x = 0 ⟹ y = 1, we have C = −π/4. So the particular solution is $x = \tan^{-1} y - \pi/4$.
$$\begin{aligned}
\frac{dy}{dx} &\overset{1}{=} \int e^x\sin x \, dx = e^x\sin x - \int \cos x\,e^x \, dx = e^x\sin x - \Big(e^x\cos x + \underbrace{\int \sin x\,e^x \, dx}_{dy/dx}\Big)\\
\iff \frac{dy}{dx} &= \frac{e^x}{2}(\sin x - \cos x) + C_1\\
\implies y &= \int \frac{dy}{dx} \, dx = \int \frac{e^x}{2}(\sin x - \cos x) + C_1 \, dx\\
&= \frac{1}{2}\left[\frac{e^x}{2}(\sin x - \cos x) + C_2 - \int e^x\cos x \, dx + C_1 x\right]\\
&\overset{2}{=} \frac{1}{2}\left[\frac{e^x}{2}(\sin x - \cos x) + C_2 - \frac{e^x}{2}(\sin x + \cos x) + C_3 + C_1 x\right]\\
&= \frac{1}{2}\left(-e^x\cos x + C_1 x + C_4\right),
\end{aligned}$$
where $\overset{2}{=}$ used $\overset{1}{=}$. The general solution is $y = \frac{1}{2}\left(-e^x\cos x + C_1 x + C_4\right)$.
(b) (i) Newton’s Second Law of Motion is that $F = \frac{d}{dt}(mv)$.
(ii) By the Product Rule, $F = \frac{d}{dt}(mv) = v\frac{dm}{dt} + m\frac{dv}{dt}$. Assuming that m is constant, we have $\frac{dm}{dt} = 0$ and hence $F = m\frac{dv}{dt}$.
(c) Taking the Earth to be immobile, the force of gravitation is the rate of change of momentum of the small ball. That is,
$$F = \frac{GMm}{r^2} = -m\frac{dv}{dt}.$$
The ball drops towards the surface of the Earth at an increasing speed. By assumption, downwards is the negative direction. Hence the negative sign.
Cancelling out the m’s yields $\frac{GM}{r^2} = -\frac{dv}{dt}$.
(d) (i) $\int_{R+x}^{R} \frac{Gm_1}{r^2} \, dr = Gm_1\left[-\frac{1}{r}\right]_{R+x}^{R} = Gm_1\left(-\frac{1}{R} + \frac{1}{R + x}\right)$.
dv v2 s vs2
v
∫r=R+x dt dr = ∫r=R+x dt dv =∫ v dv = [ ] = − .
r=R r=R dr v=vs
(ii)
v=0 2 0 2
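(iii) $Gm_1\left(-\frac{1}{R} + \frac{1}{R + x}\right) = -\frac{v_s^2}{2} \iff v_s = \pm\sqrt{2Gm_1\left(\frac{1}{R} - \frac{1}{R + x}\right)}$.
By assumption, downwards is the negative direction. So for (d) (iii), we must have $v_s = -\sqrt{2Gm_1\left(\frac{1}{R} - \frac{1}{R + x}\right)}$.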
(e) This is simply the same process as before, but in reverse. The ball will keep moving upwards, but the force of gravitation will keep pulling it down, reducing its velocity at a rate given by the equation $\frac{Gm_1}{r^2} = -\frac{dv}{dt}$. Eventually, the velocity of the ball will hit 0 and then start going negative (i.e. the ball will start falling down towards the Earth).
Hence, if x is the maximum height attained by the ball, we have $V = \sqrt{2GM\left(\frac{1}{R} - \frac{1}{R + x}\right)}$.
(f) In order for the ball to never fall back down to earth, it must be that the ball keeps going upwards and never reaches any maximum height. That is, x → ∞. Thus,
$$v_e = \lim_{x \to \infty} V = \lim_{x \to \infty}\sqrt{2GM\left(\frac{1}{R} - \frac{1}{R + x}\right)} = \sqrt{\frac{2GM}{R}}.$$
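To get a feel for the size of the escape velocity, one can plug in numbers. The sketch below is an illustration only: the values used for G, the Earth's mass M and the Earth's radius R are approximate figures assumed for this example (they are not given in the text), and the function names are made up.

```python
import math

# Approximate physical constants (assumed values, SI units).
G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_EARTH = 5.972e24   # mass of the Earth, kg
R_EARTH = 6.371e6    # radius of the Earth, m

def launch_speed_for_height(x, G=G, M=M_EARTH, R=R_EARTH):
    """Launch speed V needed to reach maximum height x, from part (e)."""
    return math.sqrt(2 * G * M * (1 / R - 1 / (R + x)))

def escape_velocity(G=G, M=M_EARTH, R=R_EARTH):
    """Limit of V as x grows without bound, from part (f): sqrt(2GM/R)."""
    return math.sqrt(2 * G * M / R)

for height in (1e5, 1e7, 1e9, 1e12):
    print(height, launch_speed_for_height(height))
print("escape velocity:", escape_velocity())
```

With these assumed constants the limit comes out at a little over 11 km/s, and the printed launch speeds creep up towards it as the target height grows.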
A379. We must choose three 4D numbers. Choosing the first 4D number involves four
decisions — what to put as the first, second, third and fourth digits, with the condition
that no digit is repeated.
____
1 2 3 4
Thus, by the MP, there are 10 × 9 × 8 × 7 = 5040 ways to choose the first 4D number.
If we ignored the fact that we already chose the first 4D number, then there’d similarly be
5040 ways to choose the second 4D number (given the condition that this second 4D number
does not have any repeated digits). However, there is an additional condition — namely,
the second 4D number cannot be the same as the first. Thus, there are 5040 − 1 = 5039
ways to choose the second 4D number.
By similar reasoning, we see that there are 5040 − 2 = 5038 ways to choose the third 4D
number.
Altogether then, by the MP, there are 5040 × 5039 × 5038 = 127,947,869,280 ways to choose
the three 4D numbers.
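The counts in this answer are small enough to confirm by brute force. The sketch below (variable names are illustrative) enumerates every 4-digit string with no repeated digit and then counts ordered choices of three distinct such numbers.

```python
import math
from itertools import permutations

# Every ordered choice of 4 distinct digits gives one 4D number without repeats.
four_d_numbers = ["".join(p) for p in permutations("0123456789", 4)]
first_choice_count = len(four_d_numbers)

# Ordered selection of three different 4D numbers: 5040 * 5039 * 5038.
three_choices_count = math.perm(first_choice_count, 3)

print(first_choice_count, three_choices_count)
```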
A384. 9!.
(b) First consider the problem of permuting the seven letters in BBBBSSS, without any
two S’s next to each other. We’ll use the AP.
1. B in position #1.
(a) B in position #2. Then the only way to fill the remaining five positions is SBSBS.
Total: 1 possible arrangement.
(b) S in position #2. Then we must have B in position #3.
i. B in position #4. Then the only way to fill the remaining three positions is
SBS. Total: 1 possible arrangement.
ii. S in position #4. Then we must have B in position #5. And there are two
ways to fill the remaining two positions: either BS or SB. Total: 2 possible
arrangements.
(c) We saw that there was only 1 possible (linear) permutation of BBBBSSS that satisfied
the restriction, namely BSBSBSB.
If we now arrange the siblings in a circle, there will necessarily be two brothers next to each
other.
We thus conclude: There are 0 possible ways to arrange the siblings in a circle so that no
two brothers are next to each other.
(d) In part (b), we found 10 possible (linear) permutations of BBBBSSS that satisfied
the restriction.
Of these, 3 have sisters at the two ends: SBSBBBS, SBBSBBS, and SBBBSBS. If
arranged in a circle, these 3 arrangements would involve two sisters next to each other. So
we must deduct these 3 arrangements.
We are left with 7 possible arrangements: BBSBSBS, SBBSBSB, BSBBSBS,
SBSBBSB, BSBSBBS, SBSBSBB, and BSBSBSB. But of course, these are simply
one and the same fixed circular permutation! (This is consistent with Fact 167, which tells
us to simply divide by 7.)
And now again, we must now take into account the fact that the brothers are distinct and
the sisters are distinct. We conclude that there are in total 1 × 4!3! = 144 possible ways to
arrange the siblings in a circle, so that no two sisters are next to each other.
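The counts quoted in this answer (10 linear patterns with no two S's adjacent, 1 linear pattern with no two B's adjacent, and 144 circular arrangements with no two sisters adjacent) can be confirmed by enumeration. The sketch below is only a check; the labels `B1`, ..., `S3` and the helper `no_adjacent` are invented for the illustration, and circular arrangements are counted by fixing one sibling's seat.

```python
from itertools import permutations

def no_adjacent(seq, letter, circular=False):
    """True if no two neighbouring entries of seq both equal `letter`."""
    pairs = list(zip(seq, seq[1:]))
    if circular:
        pairs.append((seq[-1], seq[0]))  # last seat is next to the first
    return all(not (a == letter and b == letter) for a, b in pairs)

# Distinct linear patterns of BBBBSSS.
patterns = set(permutations("BBBBSSS"))
linear_no_ss = [p for p in patterns if no_adjacent(p, "S")]
linear_no_bb = [p for p in patterns if no_adjacent(p, "B")]

# Circular arrangements of 7 distinct siblings: fix B1 in seat 0 so that
# rotations of the same arrangement are not counted more than once.
people = ["B1", "B2", "B3", "B4", "S1", "S2", "S3"]
circular_no_ss = sum(
    1
    for rest in permutations(people[1:])
    if no_adjacent([p[0] for p in (people[0],) + rest], "S", circular=True)
)

print(len(linear_no_ss), len(linear_no_bb), circular_no_ss)
```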
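$$\begin{aligned}
\binom{n}{k} &= \frac{n!}{k!(n - k)!}\\
&= \frac{n \times (n - 1) \times \dots \times (n - k + 1) \times (n - k) \times (n - k - 1) \times \dots \times 1}{k!\,(n - k) \times (n - k - 1) \times \dots \times 1}\\
&= \frac{n \times (n - 1) \times (n - 2) \times \dots \times (n - k + 1)}{k!} \quad\text{(mass cancellation).}
\end{aligned}$$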
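A388.
$$C(4, 2) = \frac{4!}{2!(4 - 2)!} = \frac{4!}{2!2!} = \frac{4 \times 3}{2 \times 1} = 6,$$
$$C(6, 4) = \frac{6!}{4!(6 - 4)!} = \frac{6!}{4!2!} = \frac{6 \times 5}{2 \times 1} = 15,$$
$$C(7, 3) = \frac{7!}{3!(7 - 3)!} = \frac{7!}{3!4!} = \frac{7 \times 6 \times 5}{3 \times 2 \times 1} = 35.$$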
A389. $\binom{3}{1}\binom{7}{2}\binom{5}{2} = 630$.
$$= 17 \times 8 + 17 \times 8 \times 5 = 17 \times 8 \times 6 = \frac{18 \times 17 \times 16}{3 \times 2 \times 1}.$$
$$(1 + x)^3 = (1 + x)(1 + x)(1 + x) = \underbrace{1 \cdot 1 \cdot 1}_{0\ x\text{'s}} + \underbrace{1 \cdot 1 \cdot x + 1 \cdot x \cdot 1 + x \cdot 1 \cdot 1}_{1\ x} + \underbrace{1 \cdot x \cdot x + x \cdot 1 \cdot x + x \cdot x \cdot 1}_{2\ x\text{'s}} + \underbrace{x \cdot x \cdot x}_{3\ x\text{'s}}.$$
Consider the 8 terms on the right. There is C(3, 0) = 1 way to choose 0 of the x’s. Hence, the coefficient on $x^0$ is C(3, 0); this corresponds to the term 1 ⋅ 1 ⋅ 1 above.
There are C(3, 1) = 3 ways to choose 1 of the x’s. Hence, the coefficient on $x^1$ is C(3, 1); this corresponds to the terms 1 ⋅ 1 ⋅ x, 1 ⋅ x ⋅ 1, and x ⋅ 1 ⋅ 1 above.
There are C(3, 2) = 3 ways to choose 2 of the x’s. Hence, the coefficient on $x^2$ is C(3, 2); this corresponds to the terms 1 ⋅ x ⋅ x, x ⋅ 1 ⋅ x, and x ⋅ x ⋅ 1 above.
There is C(3, 3) = 1 way to choose 3 of the x’s. Hence, the coefficient on $x^3$ is C(3, 3); this corresponds to the term x ⋅ x ⋅ x above.
Altogether then,
A393. $2^7 = 128$.
A395. (a) There are $\binom{4}{2} = 6$ ways of choosing the two Tan sons and $\binom{3}{2} = 3$ ways of choosing the two Wong daughters.
Having chosen these sons and daughters, there are only 2! = 2 × 1 possible ways of matching them up. This is because for the first chosen Tan son, we have 2 possible choices of brides for him. And then for the second chosen Tan son, there is only 1 possible choice of bride left for him.
Altogether then, there are $\binom{4}{2}\binom{3}{2} \cdot 2 = 36$ ways of forming the two couples.
(b) There are $\binom{6}{5} = 6$ ways of choosing the five Lee sons and $\binom{9}{5} = 126$ ways of choosing the five Ho daughters.
Having chosen these sons and daughters, there are 5! = 5 × 4 × 3 × 2 × 1 possible ways of matching them up. This is because for the first chosen Lee son, we have 5 possible choices of brides for him. And then for the second chosen Lee son, there are 4 possible choices of brides left for him. Etc.
Altogether then, there are $\binom{6}{5}\binom{9}{5} \cdot 5! = 6 \cdot 126 \cdot 5! = 90{,}720$ ways of forming the five couples.
S = {A«, K«, Q«, . . . , 2«, Aª, Kª, Qª, . . . , 2ª, A©, K©, Q©, . . . , 2©, A¨, K¨, Q¨, . . . , 2¨} .
(a) (ii) Since there are 52 possible outcomes, there are $2^{52}$ possible events. Hence, the event space contains $2^{52}$ elements. It is too tedious to write this out explicitly.
(a) (iii) As always, P has domain Σ and codomain R. We have P({3©}) = P({5♣}) = 1/52 and P({3©, 5♣}) = 2/52. In general, given any event A ∈ Σ, we have
$$P(A) = \frac{|A|}{|S|} = \frac{|A|}{52}.$$
In words, given any event A, its probability P(A) is simply the number of elements it
contains, divided by 52. So for example, P ({3©, 5♣, A«}) = 3/52, as we would expect.
(a) (iv) John might argue that since packs of poker cards usually come with Jokers, there
is the possibility that we mistakenly included one or more Jokers in our deck of cards. He
might thus argue that to cover this possibility, we should set our sample space to be
S = {A«, K«, , . . . , 2«, Aª, Kª, . . . , 2ª, A©, K©, . . . , 2©, A¨, K¨, . . . , 2¨, Joker} .
(b) (ii) Since there are 4 possible outcomes, there are $2^4 = 16$ possible events. Hence, the event space contains 16 elements. It is not too tedious to write these out explicitly:
Σ = {∅, {HH}, {HT}, {TH}, {TT}, {HH, HT}, {HH, TH}, {HH, TT},
{HT, TH}, {HT, TT}, {TH, TT}, {HH, HT, TH},
{HH, HT, TT}, {HH, TH, TT}, {HT, TH, TT}, S}.
(b) (iii) As always, P has domain Σ and codomain R. We have P({HH}) = P({HT}) = 1/4 and P({HH, HT, TH}) = 3/4. In general, given any event A ∈ Σ, we have
$$P(A) = \frac{|A|}{|S|} = \frac{|A|}{4}.$$
In words, given any event A, its probability P(A) is simply the number of elements it
contains, divided by 4. So for example, P ({T H, T T }) = 2/4, as we would expect.
(b) (iv) John might, as before, argue that there is the possibility that a coin lands on its
edge. He might thus argue that the sample space should be
(c) (ii) Since there are 36 possible outcomes, there are $2^{36}$ possible events. Hence, the event space contains $2^{36}$ elements.
(c) (iii) As always, P has domain Σ and codomain R. Any event containing exactly one outcome has probability 1/36, and any event containing exactly two outcomes has probability 2/36. In general, given any event A ∈ Σ, we have
$$P(A) = \frac{|A|}{|S|} = \frac{|A|}{36}.$$
In words, given any event A, its probability P(A) is simply the number of elements it contains, divided by 36. So for example, an event containing exactly four outcomes has probability 4/36, as we would expect.
(c) (iv) John might argue that there is the possibility that a die lands on a vertex. He
might thus argue that the sample space contains $7^2 = 49$ outcomes and should be
⎧
⎪ ⎫
⎪
⎪ ⎪
S=⎨ ⎬.
V V V
⎪ ⎪
, ,..., , , ,..., , , ,...,
⎪
⎩ V V V ⎪
⎭
The event space would be appropriately adjusted to contain $2^{49}$ elements.
The mapping rule of the probability function would be appropriately adjusted. For example,
if John believes that any given die roll has probability 1/1000000 of landing on a vertex,
⎧ ⎫ ⎧ ⎫
⎛⎪
⎪V ⎪ ⎪⎞ 1 ⎛⎪
⎪ ⎪ ⎪⎞ 999999 2
then we might assign P ⎨ ⎬ = , P ⎨ ⎬ =( ) , etc.
⎝⎪
⎪ ⎪
⎪ ⎠ 10000002 ⎝⎪
⎪ ⎪
⎪ ⎠ 1000000
⎩ V ⎭ ⎩ ⎭
(b) The events A, Ac ∩ B, and Ac ∩ B c ∩ C are mutually exclusive. Moreover, their union
is A ∪ B ∪ C. Hence, by the Additivity Axiom (applied twice),
There is reason to believe that P (Blood stain is not John Brown’s) is much greater than P (DNA match) and thus that P (Blood stain is not John Brown’s∣DNA match) is much greater than P (DNA match∣Blood stain is not John Brown’s).
One important factor is that if the DNA database is large, then invariably we’d expect to find, purely by coincidence, a DNA match to the blood stain at the murder scene. As of May 2016, the US National DNA Index contains the DNA profiles of over 12.3 million individuals. And so, even if it were true that there is only probability $\frac{1}{10{,}000{,}000}$ that two random individuals have a DNA match, we’d expect to find a match, simply by combing through the entire US National DNA Index!
The error here is similar to the lottery example, where we conclude (erroneously) that a
lottery winner must have cheated, simply because it was so unlikely that she won.
A402. No, the journalist is incorrectly assuming that the probability of one family
member making the NBA is independent of another family member making the NBA. But
such an assumption is almost certainly false.
The same excellent genes that made Rick Barry a great basketball player, probably also
helped his three sons. Not to mention that having an NBA player as your father probably
helps a lot too.
The two events “family member #1 in NBA” and “family member #2 in NBA” are probably
not independent. So we cannot simply multiply probabilities together.
(c) $P(E) = P(X \ge 10) = P(X = 10) + P(X = 11) + P(X = 12) = \frac{3}{36} + \frac{2}{36} + \frac{1}{36} = \frac{6}{36} = \frac{1}{6}$.
⎛ ⎞ ⎛ ⎞
= 3 and Q = 3.
⎝ ⎠ ⎝ ⎠
Q
⎛ ⎞ ⎛ ⎞
(P Q) = 15 and (P Q) = 12.
⎝ ⎠ ⎝ ⎠
HT T H, T HT H, T T HH, HT T T, T HT T, T T HT, T T T H, T T T T }.
The event space Σ is the set of all possible subsets of S and contains $2^{16}$ elements.
The probability function P ∶ Σ → R is defined by P(A) = ∣A∣/16, for any event A ∈ Σ.
HT T T, T HT T, T T HT, T T T H ↦ 1,
HHT T, HT HT, T HHT, HT T H, T HT H, T T HH ↦ 2,
HHHT, HHT H, HT HH, T HHH ↦ 3,
T T T T ↦ 0, HHHH ↦ 4.
The event space Σ is the set of all possible subsets of S and contains $2^{216}$ elements.
The probability function P ∶ Σ → R is defined by P(A) = ∣A∣/216, for any event A ∈ Σ.
(b) (ii) The range of X is {3, 4, 5, . . . , 18}. We now count the number of ways there are for
the three dice to reach a sum of 3, to reach a sum of 4, etc. This will enable us to write
down the mapping rule of the function X ∶ S → R.
To get a sum of 3, the three dice must be (1, 1, 1) or permutations thereof. There is thus $\frac{3!}{3!} = 1$ possibility.
To get a sum of 4, the three dice must be (1, 1, 2), or permutations thereof. There are thus $\frac{3!}{2!} = 3$ possibilities.
To get a sum of 5, the three dice must be (1, 1, 3), (1, 2, 2), or permutations thereof. There are thus $\frac{3!}{2!} + \frac{3!}{2!} = 6$ possibilities.
To get a sum of 6, the three dice must be (1, 1, 4), (1, 2, 3), (2, 2, 2), or permutations thereof. There are $\frac{3!}{2!} + 3! + \frac{3!}{3!} = 10$ such possibilities.
To get a sum of 7, the three dice must be (1, 1, 5), (1, 2, 4), (1, 3, 3), (2, 2, 3), or permutations thereof. There are $\frac{3!}{2!} + 3! + \frac{3!}{2!} + \frac{3!}{2!} = 15$ such possibilities.
To get a sum of 8, the three dice must be (1, 1, 6), (1, 2, 5), (1, 3, 4), (2, 2, 4), (2, 3, 3), or permutations thereof. There are $\frac{3!}{2!} + 3! + 3! + \frac{3!}{2!} + \frac{3!}{2!} = 21$ such possibilities.
To get a sum of 9, the three dice must be (1, 2, 6), (1, 3, 5), (1, 4, 4), (2, 2, 5), (2, 3, 4), (3, 3, 3), or permutations thereof. There are $3! + 3! + \frac{3!}{2!} + \frac{3!}{2!} + 3! + \frac{3!}{3!} = 25$ such possibilities.
To get a sum of 10, the three dice must be (1, 3, 6), (1, 4, 5), (2, 2, 6), (2, 3, 5), (2, 4, 4), (3, 3, 4), or permutations thereof. There are $3! + 3! + \frac{3!}{2!} + 3! + \frac{3!}{2!} + \frac{3!}{2!} = 27$ such possibilities.
By symmetry, there are also 27 ways to get a sum of 11; also 25 ways to get a sum of 12, etc.
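The counts above (1, 3, 6, 10, 15, 21, 25, 27 and their mirror images) can be confirmed by listing all 216 outcomes. The following sketch is just such a brute-force tally; the variable name `ways` is illustrative.

```python
from collections import Counter
from itertools import product

# Tally the sum of every ordered roll of three six-sided dice.
ways = Counter(sum(roll) for roll in product(range(1, 7), repeat=3))

for total in range(3, 19):
    print(total, ways[total], f"{ways[total]}/216")
```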
So X ∶ S → R is defined by
X (1, 1, 1) = 3,
X (1, 1, 2) = X (1, 2, 1) = X (2, 1, 1) = 4,
X (1, 1, 3) = X (1, 3, 1) = X (3, 1, 1) = X (1, 2, 2) = X (2, 1, 2) = X (2, 2, 1) = 5,
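(b)(iii)
$$P(X = 3) = \tfrac{1}{216}, \quad P(X = 4) = \tfrac{3}{216}, \quad P(X = 5) = \tfrac{6}{216},$$
$$P(X = 6) = \tfrac{10}{216}, \quad P(X = 7) = \tfrac{15}{216}, \quad P(X = 8) = \tfrac{21}{216},$$
$$P(X = 9) = \tfrac{25}{216}, \quad P(X = 10) = \tfrac{27}{216}, \quad P(X = 11) = \tfrac{27}{216},$$
$$P(X = 12) = \tfrac{25}{216}, \quad P(X = 13) = \tfrac{21}{216}, \quad P(X = 14) = \tfrac{15}{216},$$
$$P(X = 15) = \tfrac{10}{216}, \quad P(X = 16) = \tfrac{6}{216}, \quad P(X = 17) = \tfrac{3}{216},$$
$$P(X = 18) = \tfrac{1}{216}, \quad P(X = k) = 0,$$
for any k ∉ {3, 4, 5, . . . , 18}.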
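$$P(X + Y = 2) = \frac{1}{2}\cdot\frac{1}{2}\cdot\frac{5}{6}\cdot\frac{5}{6} + \binom{2}{1}\frac{1}{2}\cdot\frac{1}{2}\binom{2}{1}\frac{5}{6}\cdot\frac{1}{6} + \frac{1}{2}\cdot\frac{1}{2}\cdot\frac{1}{6}\cdot\frac{1}{6} = \frac{25}{144} + \frac{20}{144} + \frac{1}{144} = \frac{46}{144}.$$
(b) P (X + Y = 3) is simply the probability of 2 heads and 1 six OR 1 head and 2 sixes. So
$$P(X + Y = 3) = \frac{1}{2}\cdot\frac{1}{2}\cdot\binom{2}{1}\frac{5}{6}\cdot\frac{1}{6} + \binom{2}{1}\frac{1}{2}\cdot\frac{1}{2}\cdot\frac{1}{6}\cdot\frac{1}{6} = \frac{10}{144} + \frac{2}{144} = \frac{12}{144}.$$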
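(d)
$$\begin{aligned}
E[X + Y] &= \sum_{k \in \text{Range}(X + Y)} P(X + Y = k) \cdot k\\
&= P(X + Y = 0) \cdot 0 + P(X + Y = 1) \cdot 1 + P(X + Y = 2) \cdot 2 + P(X + Y = 3) \cdot 3 + P(X + Y = 4) \cdot 4\\
&= \frac{25}{144} \cdot 0 + \frac{60}{144} \cdot 1 + \frac{46}{144} \cdot 2 + \frac{12}{144} \cdot 3 + \frac{1}{144} \cdot 4 = \frac{60 + 92 + 36 + 4}{144} = \frac{192}{144} = \frac{4}{3}.
\end{aligned}$$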
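The distribution of X + Y and its expectation can be recomputed exactly with rational arithmetic. The sketch below assumes, as the working above suggests, that X counts heads in two fair coin flips and Y counts sixes in two fair die rolls, all independent; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# 1 marks a "success" (a head, or a six); 0 marks a failure.
coin = {1: Fraction(1, 2), 0: Fraction(1, 2)}
die = {1: Fraction(1, 6), 0: Fraction(5, 6)}

dist = {}
for c1, c2, d1, d2 in product(coin, coin, die, die):
    k = c1 + c2 + d1 + d2                         # value of X + Y
    p = coin[c1] * coin[c2] * die[d1] * die[d2]   # independence assumed
    dist[k] = dist.get(k, 0) + p

expectation = sum(k * p for k, p in dist.items())

for k in sorted(dist):
    print(k, dist[k])
print("E[X + Y] =", expectation)
```

Fractions are printed in lowest terms, so 46/144 appears as 23/72 and 12/144 as 1/12.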
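$$P(X = 250) = P(X = 60) = \frac{10}{10000}, \qquad P(X = 0) = \frac{9977}{10000},$$
$$P(Y = 3000) = P(Y = 2000) = P(Y = 800) = \frac{1}{10000}, \qquad P(Y = 0) = \frac{9997}{10000}.$$
$$\begin{aligned}
E[Y] &= \sum_{k \in \text{Range}(Y)} P(Y = k) \cdot k\\
&= \frac{1}{10000} \cdot 3000 + \frac{1}{10000} \cdot 2000 + \frac{1}{10000} \cdot 800 + \frac{9997}{10000} \cdot 0 = 0.3 + 0.2 + 0.08 + 0 = 0.58.
\end{aligned}$$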
(d) For every $1 staked, the “big” game is expected to lose you $0.341 and the “small”
game is expected to lose you $0.42. Thus, the “big” game is expected to lose you less
money.
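The comparison in (d) can be reproduced with a short sketch. The "small" prize table is read off the probabilities above. The top three "big" prizes (2000, 1000, 490) do not appear in this excerpt and are my assumption, chosen because they are consistent with the stated expected loss of $0.341:

```python
from fractions import Fraction

# Prize -> how many of the 10,000 possible numbers win it, for a $1 stake.
# ASSUMPTION: the prizes 2000, 1000 and 490 in the "big" game are not shown
# in this excerpt; they are hypothetical values consistent with part (d).
big = {2000: 1, 1000: 1, 490: 1, 250: 10, 60: 10}
small = {3000: 1, 2000: 1, 800: 1}

def expected_payout(prizes):
    """Expected winnings per $1 staked, as an exact fraction."""
    return sum(Fraction(count, 10000) * prize for prize, count in prizes.items())

loss_big = 1 - expected_payout(big)      # expected loss per $1, "big" game
loss_small = 1 - expected_payout(small)  # expected loss per $1, "small" game
print(float(loss_big), float(loss_small))
```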
A412. E [Y ] = 35/100 × 20 cm + 65/100 × 30 cm = 26.5 cm.
V [Y ] = 35/100 × (20 cm − 26.5 cm)² + 65/100 × (30 cm − 26.5 cm)² = 22.75 cm².
SD [Y ] = √V [Y ] ≈ 4.77 cm.
P (X ≥ 2) = 1 − P (X ≤ 1) = 1 − P (X = 0) − P (X = 1)
= 1 − C(20, 0) 0.01⁰ 0.99²⁰ − C(20, 1) 0.01¹ 0.99¹⁹
≈ 0.0169.
P (Y ≥ 2) = 1 − P (Y ≤ 1) = 1 − P (Y = 0) − P (Y = 1)
= 1 − C(35, 0) 0.005⁰ 0.995³⁵ − C(35, 1) 0.005¹ 0.995³⁴
≈ 0.0133.
P (X ≥ 2) P (Y ≥ 2) ≈ 0.00022.
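These binomial tail probabilities are easy to verify numerically. A minimal sketch (the function name at_least_two is my own, for illustration only):

```python
from math import comb

def at_least_two(n, p):
    """P(at least 2 successes) for a binomial random variable B(n, p)."""
    p0 = comb(n, 0) * p**0 * (1 - p)**n        # P(exactly 0 successes)
    p1 = comb(n, 1) * p**1 * (1 - p)**(n - 1)  # P(exactly 1 success)
    return 1 - p0 - p1

px = at_least_two(20, 0.01)
py = at_least_two(35, 0.005)
print(round(px, 4), round(py, 4), round(px * py, 5))
```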
(c) P (3.1 ≤ Y ≤ 4.6) = 0.75 is in blue and P (4.8 ≤ Y ≤ 4.9) = 0.05 is in red.
[Figure: two plots with shaded areas; each horizontal axis runs from −4 to 4.]
f (x) = 1/(σ√(2π)) ⋅ exp(−(x − µ)²/(2σ²)) = 1/(1 ⋅ √(2π)) ⋅ exp(−(x − 0)²/(2 ⋅ 1²)) = 1/√(2π) ⋅ exp(−x²/2).
We’ve just shown that the PDF of X ∼ N(µ, σ²), when µ = 0 and σ² = 1, is the same as the PDF
of the SNRV Z ∼ N(0, 1). Hence, the SNRV is indeed simply a normal random variable
with mean µ = 0 and variance σ² = 1.
A418. First observe that (X − µ)/σ = (1/σ) X + (−µ/σ). Now simply use Fact 181, with a = 1/σ and
b = −µ/σ:
(X − µ)/σ = (1/σ) X + (−µ/σ) ∼ N(µ/σ + (−µ/σ), (1/σ²) σ²) = N (0, 1) .
(a) P (X ≥ 1) = P (Z ≥ (1 − 2.14)/√5) ≈ P (Z ≥ −0.5098)
P (Y ≥ 1) = P (Z ≥ (1 − (−0.33))/√2) ≈ P (Z ≥ 0.9405)
(b) Let B1 ∼ N (110, 1156), B2 ∼ N (110, 1156), . . . , B12 ∼ N (110, 1156) be the bills in each
of the 12 months.
Then the total bill in a year is T = B1 +B2 +⋅ ⋅ ⋅+B12 ∼ N (12 × 110, 12 × 1156) = N (1320, 13872).
Thus, P (T > 1000) ≈ 0.9967 (calculator).
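If you don't have your calculator at hand, the same probability can be sketched with Python's standard library. This assumes, as in the answer above, that the twelve monthly bills are independent:

```python
from math import sqrt
from statistics import NormalDist

# T = B1 + ... + B12, with each Bi ~ N(110, 1156) and the Bi independent,
# so T ~ N(12 * 110, 12 * 1156). NormalDist takes the standard deviation,
# not the variance, hence the square root.
total_bill = NormalDist(mu=12 * 110, sigma=sqrt(12 * 1156))
p_over_1000 = 1 - total_bill.cdf(1000)
print(round(p_over_1000, 4))
```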
Our goal is to find the value of x for which P (B > 100) = 0.1. We have
P (B > 100) = 1 − Φ ((50 − 200x)/√(256 + 10000x²)) = 0.1.
From the Z-tables,
Φ ((50 − 200x)/√(256 + 10000x²)) = 0.9 ⇐⇒ (50 − 200x)/√(256 + 10000x²) ≈ 1.2815.
One can rearrange, do the algebra (square both sides), and use the quadratic formula.
Alternatively, one can simply use one’s graphing calculator to find that x ≈ 0.084. We
conclude that x can be at most approximately 0.084 if the probability
that the total utility bill in a given month exceeds $100 is to be 0.1 or less.
A423. Let X be the random variable that is the sum of the weights of the 5,000 Coco-Pops.
The CLT says that since n = 5000 ≥ 30 is large enough and the distribution is “nice
enough” (we are assuming this), X can be approximated by the normal random variable
Y ∼ N (5000 × 0.1, 5000 × 0.004) = N (500, 20). Thus, P (X ≤ 499) ≈ P (Y ≤ 499) ≈ 0.4115
(calculator).
x̄ = (∑ xi)/n = 1885/10 = 188.5,
s² = [∑ xi² − (∑ xi)²/n]/(n − 1) = (378,265 − 1885²/10)/9 ≈ 2550.
A426. (a) Assume that the weights of the five Singaporeans sampled are independently
and identically distributed. Then unbiased estimates for the population mean µ and variance
σ² of the weights of Singaporeans are, respectively, the observed sample mean x̄ and
observed sample variance s²:
x̄ = (∑ xi)/n = (32 + 88 + 67 + 75 + 56)/5 = 63.6,
s² = (∑ xi² − nx̄²)/(n − 1) = (32² + 88² + 67² + 75² + 56² − 5 × 63.6²)/4 = 448.3.
(b) We don’t know! And unless we literally gather and weigh every single Singaporean, we
will never know what exactly the average weight of a Singaporean is.
All we’ve found in part (a) is an estimate (63.6 kg) for the average weight of a Singaporean.
We know that on average, the estimator we use “gets it right”.
However, it could well be that we’re unlucky (and got 5 unusually heavy or unusually light
persons) and the estimate of 63.6 kg is thus way off.
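The two estimates in part (a) can be reproduced with a short sketch using Python's standard library. Note that statistics.variance divides by n − 1, which is exactly the unbiased sample variance used in this answer:

```python
from statistics import mean, variance

weights = [32, 88, 67, 75, 56]  # the five observed weights (kg) from part (a)

x_bar = mean(weights)   # observed sample mean
s2 = variance(weights)  # observed sample variance (divides by n - 1)
print(x_bar, s2)
```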
We have just shown that E [X̄] = µ. In other words, we’ve just shown that X̄ is an unbiased
estimator for µ.
A428. (a) The observed random sample is (x1 , x2 , . . . , x10 ) = (1, 1, 1, 1, 1, 1, 1, 0, 0, 0). The
observed sample mean and observed sample variance are
x̄ = (x1 + x2 + ⋅ ⋅ ⋅ + x10)/n = 0.7,
s² = [(x1 − x̄)² + (x2 − x̄)² + ⋅ ⋅ ⋅ + (x10 − x̄)²]/(n − 1) = 2.1/9 = 0.2333 . . .
(b) Yes, the observed sample mean x̄ = 0.7 is an unbiased estimate for the true population
mean µ (i.e. the true proportion of coin flips that are heads).
And yes, the observed sample variance s² = 0.2333 . . . (recurring) is an unbiased estimate for the true
population variance σ².
(c) No, this is merely one observed random sample, from which we generated a single
estimate (“guess”) — namely x̄ = 0.7 — of the true population mean µ.
All we know is that the sample mean X̄ is an unbiased estimator for the true population
mean µ. That is, the average estimate generated by X̄ will equal µ.
However, any particular estimate x̄ may or may not be equal to µ. Indeed, if we’re unlucky,
our particular estimate may be very far from the true µ.
A429. Var [X̄] = Var [(1/n) (X1 + X2 + ⋅ ⋅ ⋅ + Xn )] = (1/n²) Var [X1 + X2 + ⋅ ⋅ ⋅ + Xn ] = (1/n²) (Var [X1 ] + Var [X2 ] + ⋅ ⋅ ⋅ + Var [Xn ])
= (1/n²) (nσ²) = σ²/n.
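To see the result Var [X̄] = σ²/n in action, here is a small exact check. The choice of population (one roll of a fair die) and of sample size n = 3 is purely my own illustration and is not part of the exercise:

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)  # illustrative population: one roll of a fair die
n = 3                # illustrative sample size

def var(values):
    """Variance of a list of equally likely values, as an exact fraction."""
    values = list(values)
    m = Fraction(sum(values), len(values))
    return sum((v - m) ** 2 for v in values) / len(values)

sigma2 = var(faces)  # population variance of a single roll
# Every ordered sample of size n is equally likely, so the variance of the
# sample mean is the variance of this list of 6**n sample means.
sample_means = [Fraction(sum(s), n) for s in product(faces, repeat=n)]
var_of_mean = var(sample_means)
print(sigma2, var_of_mean)
```

Because every calculation is done in exact fractions, the equality with σ²/n holds exactly rather than approximately.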
(d) The sample variance S² is a random variable defined by S² = [∑ (Xi − X̄)²] / (n − 1), where the sum runs over i = 1, . . . , n.
It measures the dispersion across the values in a random sample.
(e) The mean of the sample mean, also called the expected value of the sample mean, is the
number E [X̄]. The interpretation is that if we have infinitely many observed samples
of size n, calculate the observed sample mean for each, then E [X̄] is equal to the average
across the observed sample means. It can be shown that E [X̄] = µ and hence that the
sample mean X̄ is an unbiased estimator for the population mean µ.
(f) The variance of the sample mean is the number Var [X̄]. The interpretation is that if
we have infinitely many observed random samples of size n, calculate the observed sample
mean for each, then Var [X̄] measures the dispersion across the observed sample means.
(g) The mean of the sample variance, also called the expected value of the sample variance,
is the number E [S 2 ]. The interpretation is that if we have infinitely many observed
random samples of size n, calculate the observed sample variance for each, then E [S 2 ] is
equal to the average across the observed sample variances. It can be shown that E [S 2 ] = σ 2
and hence that the sample variance S 2 is an unbiased estimator for the population variance
σ2 .
(h) Given an observed random sample, e.g. (x1 , x2 , x3 ) = (1, 1, 0), we can calculate the
corresponding observed sample mean as
x1 + x2 + x3 1 + 1 + 0 2
x̄ = = = .
3 3 3
The observed sample mean is the average of all values in an observed random sample.
(i) Given an observed random sample, e.g. (x1 , x2 , x3 ) = (1, 1, 0), we can calculate the
corresponding observed sample variance as
Our random sample is 20 coin-flips: (X1 , X2 , . . . , X20 ), where Xi takes on the value 1 if
the ith coin-flip is heads and 0 otherwise.
Our test statistic is the number of heads: T = X1 + X2 + ⋅ ⋅ ⋅ + X20 .
In our observed random sample (x1 , x2 , . . . , x20 ), there are 17 heads. So the observed
test statistic is t = 17.
Assuming H0 were true, we’d have T ∼ B (20, 0.5). Thus, the p-value is
A432. Let µ be the true long-run proportion of coin-flips that are heads. The null and
alternative hypotheses are
Our random sample is 20 coin-flips: (X1 , X2 , . . . , X20 ), where Xi takes on the value 1 if
the ith coin-flip is heads and 0 otherwise.
Our test statistic is the number of heads: T = X1 + X2 + ⋅ ⋅ ⋅ + X20 .
In our observed random sample (x1 , x2 , . . . , x20 ), there are 17 heads. So the observed
test statistic is t = 17.
Assuming H0 were true, we’d have T ∼ B (20, 0.5). Thus, the p-value is
Thus, the critical value is 15 (this is the value of t at which we are just able to reject H0 at
the α = 0.05 significance level).
And the critical region is {15, 16, . . . , 20} (this is the set of values of t at which we’d be able
to reject H0 at the α = 0.05 significance level).
(b) The competing hypotheses are H0 ∶ µ = 0.5, HA ∶ µ ≠ 0.5.
The test statistic T is the number of heads (out of the 20 coin-flips).
For t = 14, the corresponding p-value is
H0 ∶ µ = 34,
HA ∶ µ ≠ 34.
= P (Z ≤ (33.4 − 34)/√(9/10)) + P (Z ≥ (34.6 − 34)/√(9/10)) ≈ 0.5271.
The large p-value does not cast doubt on or provide evidence against H0 . We fail to reject
H0 at the α = 0.05 significance level.
H0 ∶ µ = 34,
HA ∶ µ ≠ 34.
The small p-value casts doubt on or provides evidence against H0 . We reject H0 at the
α = 0.05 significance level.
H0 ∶ µ = 34,
HA ∶ µ ≠ 34.
The observed sample mean is x̄ = 33.4. And the observed sample variance is s2 = 11.2.
The corresponding p-value is
The fairly small p-value casts some doubt on or provides some evidence against H0 . But
we fail to reject H0 at the α = 0.05 significance level.
A437. The observed sample mean is x̄ = 68 and the observed sample variance (use Fact
183(a)) is
s² = [∑ xi² − (∑ xi)²/n]/(n − 1) = [50 × 5000 − (68 × 50)²/50]/49 ≈ 383.7.
Let µ be the true average weight of a Singaporean. The competing hypotheses are H0 ∶ µ =
75 and HA ∶ µ < 75.
(This is a one-tailed test, because your friend’s claim is that the average American is heavier
than the average Singaporean. If the claim were instead that the average American’s weight
is different from the average Singaporean’s, then we’d have a two-tailed test.)
Since the sample size n = 50 is “large enough”, we can appeal to the CLT. The p-value is
p = P (X̄ ≤ 68 ∣ H0 ) ≈ P (Z ≤ (68 − 75)/√(383.7/50)) ≈ 0.0058 (where the first approximation uses the CLT).
The small p-value casts doubt on or provides evidence against H0 . We can reject H0 at any
conventional significance level (α = 0.1, α = 0.05, or α = 0.01).
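The steps of this test can be sketched in a few lines. This is only a check on the arithmetic, under the same CLT approximation used above; the variable names are mine:

```python
from math import sqrt
from statistics import NormalDist

n = 50
sum_x = 68 * n       # sum of the observations, from x-bar = 68
sum_x2 = 50 * 5000   # sum of the squared observations, as used in the answer

# Observed sample variance (the formula applied in the answer above).
s2 = (sum_x2 - sum_x**2 / n) / (n - 1)

# One-tailed test of H0: mu = 75 against HA: mu < 75, via the CLT.
z = (68 - 75) / sqrt(s2 / n)
p_value = NormalDist().cdf(z)
print(round(s2, 1), round(p_value, 4))
```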
[Figure: scatter plot of q against p ($).]
√(∑ (pi − p̄)²) = √((8 − p̄)² + (9 − p̄)² + (4 − p̄)² + (10 − p̄)² + (8 − p̄)²)
= √((8 − 7.8)² + (9 − 7.8)² + (4 − 7.8)² + (10 − 7.8)² + (8 − 7.8)²) = √20.8 ≈ 4.56070170,
√(∑ (qi − q̄)²) = √((300 − q̄)² + (250 − q̄)² + ⋅ ⋅ ⋅ + (400 − q̄)²)
= √((300 − 470)² + (250 − 470)² + ⋅ ⋅ ⋅ + (400 − 470)²) = √368000 ≈ 606.63003552.
b̂ = [∑ (pi − p̄) (qi − q̄)] / [∑ (pi − p̄)²] = −2480/20.8 ≈ −119.2
(b) i                  1      2      3      4      5
    pi ($)             8      9      4     10      8
    qi               300    250   1000    400    400
    q̂i               446    327    923    208    446
    ûi = qi − q̂i    −146    −77     77    192    −46
(c) [Figure: plot of q against p ($).]
(d) The SSR is ∑ ûi² ≈ 72308. (This uses the unrounded residuals. Squaring the rounded residuals from the table instead gives (−146)² + (−77)² + 77² + 192² + (−46)² = 72154.)
[Calculator screenshots: after Steps 9, 10, 11, and 12.]
The TI84 tells us that r = −0.8963881445 and the regression line is y = ax + b = −119.2307692x +
1400. This is indeed consistent with the answers from the previous exercises.
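For readers without a TI84, the same quantities can be computed from the definitions used in the previous exercises. A minimal sketch in plain Python (the variable names are mine):

```python
from math import sqrt

p = [8, 9, 4, 10, 8]              # prices ($)
q = [300, 250, 1000, 400, 400]    # quantities sold

n = len(p)
p_bar = sum(p) / n
q_bar = sum(q) / n

s_pp = sum((pi - p_bar) ** 2 for pi in p)
s_qq = sum((qi - q_bar) ** 2 for qi in q)
s_pq = sum((pi - p_bar) * (qi - q_bar) for pi, qi in zip(p, q))

b_hat = s_pq / s_pp               # OLS slope
a_hat = q_bar - b_hat * p_bar     # OLS intercept
r = s_pq / sqrt(s_pp * s_qq)      # correlation coefficient

# Residuals and sum of squared residuals, without intermediate rounding.
residuals = [qi - (a_hat + b_hat * pi) for pi, qi in zip(p, q)]
ssr = sum(u ** 2 for u in residuals)
print(a_hat, b_hat, r, ssr)
```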
A442. In the previous exercises, we already calculated that the OLS line of best fit is
q = 1400 − 119.2p. Thus,
(a) By interpolation, a barber who charged $7 per haircut sold 1400 − 119.2 × 7 ≈ 566
haircuts.
(b) By extrapolation, a barber who charged $200 per haircut sold 1400−119.2×200 = −22440
haircuts. This is plainly absurd.
The second prediction is obviously absurd and thus obviously less reliable than the first.
(b) r ≈ 0.984.
[Figure: graphs of y = b ∣x − a∣ and y = 1/(x − a), with x = a and x = a + 1/√b marked on the x-axis.]
(ii) We look for the two graphs’ intersection points. First, suppose x > a. Then:
1/(x − a) = b ∣x − a∣ = b (x − a) ⇐⇒ b (x − a)² = 1 ⇐⇒ x − a = ±1/√b ⇐⇒ x = a ± 1/√b.
Since x > a and b > 0, it cannot be that x = a − 1/√b. We conclude that if x > a, then the
two graphs intersect at the point where x = a + 1/√b.
Next, suppose x < a. Then:
1/(x − a) = b ∣x − a∣ ⇐⇒ 1/(x − a) = −b (x − a) ⇐⇒ b (x − a)² = −1. (2)
Since b > 0 and (x − a)² > 0, (2) does not hold and the two graphs do not intersect.
Altogether, the two graphs intersect at only one point, namely where x = a + 1/√b.
From the graph, we observe that the given inequality is true to the right of this intersection
point and to the left of the vertical asymptote x = a (shaded red region above). Altogether
then, the inequality’s solution set is:
(−∞, a) ∪ (a + 1/√b, ∞) = R ∖ [a, a + 1/√b] .
Equivalently: x < a or x > a + 1/√b.
A445 (9758 N2017/I/4). By long division, y = 4 + 1/(x + 2).
(i) Compute the gradient of C: dy/dx = −(x + 2)⁻². So, dy/dx < 0 for all x except −2.
But since there’s no point on C for which x = −2, we conclude that the gradient of C is
negative at all points.
(ii) The horizontal and vertical asymptotes are y = 4 and x = −2.
(iii) Start with the graph of y = 4 + 1/(x + 2). Translate rightwards by 2 units to get y = 4 + 1/x.
Then translate downwards by 4 units to get y = 1/x.
A446 (9758 N2017/I/5)(a) Let f (x) = x³ + ax² + bx + c. By the Remainder Theorem,
f (1) = 1 + a + b + c = 8, (1)
f (2) = 8 + 4a + 2b + c = 12, (2)
f (3) = 27 + 9a + 3b + c = 25. (3)
This is a system of three equations with three unknowns. Either solve using your graphing
calculator or “manually”, as we now do:
(2) minus (1) yields 4 = 7 + 3a + b or b = −3 (a + 1). (4)
(b) Given the curve of f (x) = x³ − 1.5x² + 1.5x + 7, its gradient (derivative) is f ′(x) =
3x² − 3x + 1.5.
f ′(x) is a quadratic polynomial in x with positive coefficient on x² and negative discriminant
(−3)² − 4 (3) (1.5) = 9 − 18 = −9. Hence, it is ∪-shaped and doesn’t touch the horizontal axis.
⇐⇒ x = [3 ± √((−3)² − 4(3)(−0.5))]/(2 ⋅ 3) = (3 ± √15)/6 ≈ 1.15, −0.145.
A447 (9758 N2017/II/1)(i) Plug x = 3/t and y = 2t into y = 2x to get 2t = 6/t or t = ±√3.
And so the two points are A = (√3, 2√3) and B = (−√3, −2√3). We have:
∣AB∣ = √([√3 − (−√3)]² + [2√3 − (−2√3)]²) = √(4 ⋅ 3 + 16 ⋅ 3) = √60 = 2√15.
(ii) dy/dx = dy/dt ÷ dx/dt = 2 ÷ (−3/t²) = −2t²/3.
And so the equation of the tangent line at P (3/p, 2p) is:
y − 2p = −(2p²/3) (x − 3/p) .
At D, y = 0 and so: 0 − 2p = −(2p²/3) (x − 3/p) ⇐⇒ 1/p = (1/3) (x − 3/p) ⇐⇒ x = 3/p + 3/p = 6/p. That is,
D = (6/p, 0).
At E, x = 0 and so: y − 2p = −(2p²/3) (0 − 3/p) ⇐⇒ y = 2p + 2p² (1/p) = 2p + 2p = 4p. That is,
E = (0, 4p).
The midpoint of D = (6/p, 0) and E = (0, 4p) is F = (3/p, 2p).
Write x = 3/p and y = 2p. Then p = 3/x and so the desired cartesian equation is y = 6/x.
A448 (9758 N2017/II/3)(a)(i) The graph of y = f (2x) is the graph of y = f (x) compressed
inwards towards the y-axis by a factor of 2. Hence, its x-intercept is (a/2, 0), while
its y-intercept (0, b) remains unchanged.
(ii) The graph of y = f (x − 1) is the graph of y = f (x) translated rightwards by 1 unit.
Hence, its x-intercept is (a + 1, 0). We cannot say anything about the y-intercept.
(iii) The graph of y = f (2x − 1) = f (2 (x − 0.5)) is the graph of y = f (2x) translated rightwards by 0.5 units.
Hence, its x-intercept is (a/2 + 0.5, 0). We cannot say anything about the y-intercept.
(iv) The graph of y = f −1 (x) is the graph of y = f (x) reflected in the line y = x. Hence,
its x-intercept is (b, 0) and its y-intercept is (0, a).
(b)(i) a = 1 is excluded because g (1) would be undefined.
(ii) g²(x) = 1 − 1/(1 − (1 − 1/(1 − x))) = 1 − 1/(1/(1 − x)) = 1 − (1 − x) = x.
Recall that g⁻¹ (g (x)) = x. Since g (g (x)) = x, we have g⁻¹ (x) = g (x) = 1 − 1/(1 − x).
(iii) g²(b) = g⁻¹ (b) ⇐⇒ b = 1 − 1/(1 − b) ⇐⇒ b − 1 = −1/(1 − b) ⇐⇒ (1 − b)² = 1 ⇐⇒ 1 − b = ±1
⇐⇒ b = 0, 2.
A449 (9740 N2016/I/1). (4x² + 4x − 14)/(x − 4) − (x + 3) = [4x² + 4x − 14 − (x² − x − 12)]/(x − 4)
= (3x² + 5x − 2)/(x − 4) = (3x − 1) (x + 2)/(x − 4) = f.
The given inequality is equivalent to f < 0. The numerator or denominator of f
equals zero at x = 1/3, x = −2, and x = 4. Sign diagram:
f is negative for x < −2, positive for −2 < x < 1/3, negative for 1/3 < x < 4, and positive for x > 4.
k (x − l)⁴ + m.
The graph of y = x⁴ has turning point (0, 0). Following the above steps, this turning point
corresponds to (1) the point (0 + l, 0) = (l, 0) on the graph of y = (x − l)⁴; (2) the point
(l, k ⋅ 0) = (l, 0) on the graph of y = k (x − l)⁴; and (3) the point (l, 0 + m) = (l, m) on the
graph of y = k (x − l)⁴ + m.
We are given that the turning point corresponds to the point (a, b) on the graph of
y = f (x) = k (x − l)⁴ + m. Hence, we have a = l and b = m.
Plug (0, c) into y = f (x) = k (x − l)⁴ + m to get c = kl⁴ + m = ka⁴ + b. Thus, k = (c − b) /a⁴.
[Figure: graphs of y = x⁴, y = (x − l)⁴, y = k (x − l)⁴, and y = f (x) = k (x − l)⁴ + m, with the points (0, 0), (0, c), and (a, b) = (l, m) marked; and the graph of y = 1/f (x), with the points (0, 1/c) and (a, 1/b) marked.]
y = 1/f (x) has y-intercept (0, 1/c) and turning point (a, 1/b).
A451 (9740 N2016/I/10)(a)(i) Let y = 1 + √x. Do the algebra: (y − 1)² = x. Thus,
f⁻¹ (x) = (x − 1)².
Remark 158. N2016-I-10(a)(ii) was a difficult question that few students could have prop-
erly and rigorously solved under exam conditions. As my answer below suggests, any
proper and rigorous answer to this question is necessarily somewhat lengthy.
We do not have access to the “official” answers, but I suspect that the “official” answer
to this question was hand-wavy, sloppy, incomplete, and non-rigorous. Which seems to
be how our educational system works — hurl at you a difficult question, then expect you
to foggily sleepwalk through and “answer” it without really understanding what’s going
on.
(ii) First sentence of the question.
f f (x) = x (1)
⇐⇒ 1 + √(1 + √x) = x (2)
⇐⇒ √(1 + √x) = x − 1
⟹ 1 + √x = (x − 1)²
⇐⇒ √x = (x − 1)² − 1
⟹ x = [(x − 1)² − 1]² = [(x − 1 − 1) (x − 1 + 1)]²
= [(x − 2) x]² = (x² − 4x + 4) x² = x⁴ − 4x³ + 4x²
⇐⇒ x⁴ − 4x³ + 4x² − x = 0. (3)
(A brief word about the two ⟹’s: It is generally true that a = b ⟹ a² = b². However,
as you will recall, the converse is false. That is, it is not generally true that a² = b² implies
a = b. A simple counterexample is a = 1 and b = −1. Hence, these two ⟹’s cannot be
replaced by ⇐⇒’s.)
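To see concretely why the one-way implications matter, here is a small numerical sketch. One can check by expansion that the quartic x⁴ − 4x³ + 4x² − x factorises as x (x − 1) (x² − 3x + 1), so its roots are 0, 1, and (3 ± √5)/2. (This factorisation is my own aside and is not taken from the working above.) The sketch tests which of these candidates actually satisfy f f (x) = x, where f (x) = 1 + √x:

```python
from math import isclose, sqrt

def f(x):
    return 1 + sqrt(x)

def quartic(x):
    return x**4 - 4 * x**3 + 4 * x**2 - x

# Candidate solutions: the four roots of the quartic obtained by squaring twice.
candidates = [0.0, 1.0, (3 - sqrt(5)) / 2, (3 + sqrt(5)) / 2]

# Keep only the candidates that satisfy the ORIGINAL equation f(f(x)) = x.
genuine = [x for x in candidates if isclose(f(f(x)), x, abs_tol=1e-9)]
print(genuine)
```

Squaring can only add solutions, never remove them, so every genuine solution must appear among the candidates; the extra candidates are the extraneous roots introduced by the two squaring steps.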
Observe that if f f (x) = x, then x ≠ 0 because f f (0) = 2 ≠ 0. So, if f f (x) = x, then we can
divide (3) by x ≠ 0 to get:
x³ − 4x² + 4x − 1 = 0. (4)
Call the left-hand side of (4) g (x).
Second sentence of the question. We now solve the cubic equation (4), that is, g (x) = 0. Observe
that g (1) = 0; hence, by the Factor Theorem, we know that x − 1 is a factor of g (x).
So write:
26).
Third sentence of the question. Observe that:
a/1.6² + 1.6b + c = −2.4, (1)
a/(−0.7)² − 0.7b + c = 3.6, (2)
dy/dx evaluated at x = 1: −2a/1³ + b = 2. (3)
So a ≈ −3.593, b ≈ −5.187, c ≈ 7.303 (calculator).
(ii) a/x² + bx + c = 0 ⟹ x ≈ −0.589 (calculator).
(iii) As x → ±∞, y → bx + c. Hence, the other asymptote is y = bx + c or y ≈ −5.187x + 7.303.
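The values of a, b, and c quoted above come from a calculator. As an alternative, the three linear equations in a, b, and c can be solved with Cramer's rule. The following is a minimal sketch of that technique (the helper names det3 and replace_column are my own):

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def replace_column(m, j, col):
    """Copy of matrix m with column j replaced by the list col."""
    return [[col[r] if k == j else m[r][k] for k in range(3)] for r in range(3)]

# Coefficients of (a, b, c) in the three equations:
#   a/1.6**2    + 1.6 b + c = -2.4
#   a/(-0.7)**2 - 0.7 b + c =  3.6
#   -2 a        +     b     =  2
A = [
    [1 / 1.6**2, 1.6, 1.0],
    [1 / (-0.7) ** 2, -0.7, 1.0],
    [-2.0, 1.0, 0.0],
]
rhs = [-2.4, 3.6, 2.0]

d = det3(A)
a, b, c = (det3(replace_column(A, j, rhs)) / d for j in range(3))
print(round(a, 3), round(b, 3), round(c, 3))
```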
A453 (9740 N2015/I/2)(i) One method for “doing” this question is to simply
graph the equations on your calculator and copy. But here as an exercise, I’ll do it without
a calculator:
First draw the graph of y = (x + 1)/(1 − x) (below left). Write (x + 1)/(1 − x) = −1 + 2/(1 − x). This is a rectangular
hyperbola with two distinct branches.
• Intercepts. The graph of y = (x + 1)/(1 − x) crosses the vertical axis at (0, 1) and the horizontal
axis at (−1, 0).
• Asymptotes. Observe that as x → 1, (x + 1)/(1 − x) → ±∞, so that the graph of y = (x + 1)/(1 − x) has
vertical asymptote x = 1. And as x → ±∞, (x + 1) / (1 − x) → −1, so that the graph of
y = (x + 1)/(1 − x) has horizontal asymptote y = −1.
• The centre is thus (1, −1).
• The two lines of symmetry run through the centre and bisect the angles formed by
the asymptotes.
Figure to be
inserted here.
Now take those parts of the graph of y = (x + 1)/(1 − x) below the x-axis and reflect them in the
x-axis to get the graph of y = ∣(x + 1)/(1 − x)∣ (above right). As instructed, we’ve also graphed
y = x + 2.
(ii) Let P , Q, and R (marked above) be the three points at which the two graphs intersect.
Then the given inequality holds if and only if x is between P and Q or to the right of R.
One method for “doing” this question is to simply use your calculator to find the three
intersection points. But here as an exercise, I’ll do it without a calculator:
First, to find the x-coordinate of the intersection point Q, suppose x ∈ [−1, 1). Then:
∣(x + 1)/(1 − x)∣ = x + 2 ⇐⇒ (x + 1)/(1 − x) = x + 2 ⇐⇒ x + 1 = −x² − x + 2 ⇐⇒ x² + 2x − 1 = 0
⇐⇒ x = [−2 ± √(2² − 4 (1) (−1))]/(2 (1)) = −1 ± √2.
Note that −1 − √2 ∉ [−1, 1), while −1 + √2 ∈ [−1, 1). Thus, the intersection point Q has
x-coordinate −1 + √2.
Next, to find the x-coordinates of the intersection points P and R, suppose x ∉ [−1, 1].
Then:
∣(x + 1)/(1 − x)∣ = x + 2 ⇐⇒ −(x + 1)/(1 − x) = x + 2 ⇐⇒ x + 1 = x² + x − 2 ⇐⇒ x² − 3 = 0 ⇐⇒ x = ±√3.
Thus, the x-coordinates of the intersection points P and R are −√3 and √3.
Altogether then, the given inequality holds if and only if x ∈ (−√3, −1 + √2) ∪ (√3, ∞).
A454 (9740 N2015/I/5)(i) First, move the graph of y = x² rightwards by 3 units to get
the graph of y = (x − 3)².
Then stretch it vertically, outwards from the x-axis by a factor of 0.25 (or equivalently,
compress it vertically, inwards towards the x-axis by a factor of 4) to get the graph of
y = 0.25 (x − 3)².
(iii) The graph of y = 1 + f (0.5x) is that of f stretched horizontally outwards from the
y-axis by a factor of 2, then shifted up by 1 unit.
Again, the easy way is by graphing calculator, but again as an exercise, let’s also do
it without a calculator.
Written out explicitly, we have:
y = 1 + f (0.5x) = 2 for 0 ≤ x ≤ 2; 1 + 0.25 (0.5x − 3)² for 2 < x ≤ 6; and 1 otherwise.
Method #1: Mechanically do the algebra. First replace each x with 0.5x, then add 1:
f (x) = 1 for 0 ≤ x ≤ 1; 0.25 (x − 3)² for 1 < x ≤ 3; and 0 otherwise
⟹ f (0.5x) = 1 for 0 ≤ 0.5x ≤ 1; 0.25 (0.5x − 3)² for 1 < 0.5x ≤ 3; and 0 otherwise
⟹ f (0.5x) = 1 for 0 ≤ x ≤ 2; 0.25 (0.5x − 3)² for 2 < x ≤ 6; and 0 otherwise
⟹ 1 + f (0.5x) = 2 for 0 ≤ x ≤ 2; 1 + 0.25 (0.5x − 3)² for 2 < x ≤ 6; and 1 otherwise.
A455 (9740 N2015/II/3)(a)(i) Rangef = (−∞, 0). Pick any element y ∈ Rangef and do
the algebra:
y = 1/(1 − x²) ⇐⇒ 1 − x² = 1/y ⇐⇒ x² = 1 − 1/y ⇐⇒ x = ±√(1 − 1/y).
(Note that the first ⇐⇒ is permissible because y ≠ 0.)
Since every x ∈ Domainf is positive, only the positive square root applies.
This last equation shows that every y ∈ Rangef corresponds to a unique x ∈ Domainf .
Equivalently, f is invertible.
(ii) The inverse function is f⁻¹ ∶ (−∞, 0) → (1, ∞), where (−∞, 0) = Rangef and (1, ∞) = Domainf , with the mapping rule f⁻¹ (x) =
√(1 − 1/x).
(b) Let y ∈ Rangeg. Then there exists some x ∈ R ∖ {−1, 1} such that g (x) = y or:
Observe that (1) is a quadratic equation in the variable x. Since x ∈ R, (1) holds for some
1² − 4y (2 − y) ≥ 0 or 4y² − 8y + 1 ≥ 0. (2)
To solve (2), observe that its LHS is a quadratic expression in y. It has positive coefficient
y ∈ (−∞, 1 − 0.5√3] ∪ [1 + 0.5√3, ∞). (3)
What we’ve just shown is that there exists x ∈ R ∖ {−1, 1} such that g (x) = y if and only if
(3) holds. Thus:
Rangeg.
To show that f²(x) = f⁻¹ (x), we need merely show that f²(y) = x ⇐⇒ f (x) = y. To
that end, write:
f²(y) = x ⇐⇒ 1 − 1/y = x ⇐⇒ f (x) = f (1 − 1/y) = 1/(1 − (1 − 1/y)) = 1/(1/y) = y.
[Figure: graphs of y = f (x) and y² = f (x), with the origin O and the points A = (−a, 0), B = (b, 0), C = (c, 0), (0, √d), and (0, −√d) marked.]
(ii) The tangents to the curve y² = f (x) at the points where it crosses the x-axis are vertical.
A458 (9740 N2014/II/1)(i) dy/dx = dy/dt ÷ dx/dt = 6 ÷ 6t = 1/t = 0.4. So, t = 2.5.
(ii) The tangent line at (3p², 6p) has equation y − 6p = (dy/dx) (x − 3p²) = (1/p) (x − 3p²). Where
this line meets the y-axis, we have y − 6p = (1/p) (0 − 3p²) = −3p or y = 3p. So D = (0, 3p) and
the mid-point of the line segment P D is:
((3p² + 0)/2, (6p + 3p)/2) = (1.5p², 4.5p).
Write x = 1.5p² and y = 4.5p. Observe that 1.5p² = 1.5 (4.5p/4.5)² = (4.5p)²/13.5. So the desired cartesian equation is:
x = y²/13.5.
Observe that (1) is a quadratic equation in the variable x. It holds if and only if its discriminant
is non-negative:
(1 − y)² − 4 (1) (y + 1) = y² − 6y − 3 ≥ 0. (2)
if and only if y is “between” the two roots. By the quadratic formula, the two roots are:
y = [6 ± √((−6)² − 4 (1) (−3))]/2 = 3 ± √(9 + 3) = 3 ± 2√3.
Altogether then, (1) holds ⇐⇒ (2) holds ⇐⇒ y ∈ (−∞, 3 − 2√3] ∪ [3 + 2√3, ∞).
Thus, this is also the set of possible values that y can take.
A460 (9740 N2013/I/3)(i) y = (x + 1)/(2x − 1) = 0.5 + 1.5/(2x − 1).
To find the y-intercept, set x = 0 to get:
y = 0.5 + 1.5/(2 ⋅ 0 − 1) = −1.
Hence, the y-intercept is (0, −1).
To find the x-intercept, set y = 0 to get:
0 = (x + 1)/(2x − 1) or x = −1.
Hence, the x-intercept is (−1, 0).
Observe that as x → 0.5, we have y → ±∞. Hence, x = 0.5 is a vertical asymptote.
And as x → ±∞, y → 0.5. Hence, y = 0.5 is a horizontal asymptote.
[Figure: graph of y = (x + 1) / (2x − 1), showing the vertical asymptote x = 0.5, the horizontal asymptote y = 0.5, the centre (0.5, 0.5), the lines of symmetry y = x and y = 1 − x, the horizontal intercept (−1, 0), and the vertical intercept (0, −1).]
(ii)
A462 (9740 N2012/I/1). Let x, y, and z be, respectively, the costs of the under-16,
16-65, and over-65 tickets. Then we have the following system of equations:
One method for solving the above system of equations is to use your graphing calculator.
But here as a masochistic exercise, let’s do it by hand:
(1) minus (2) yields: 2x + y + z = $33.67. (4)
h (g (x)) = [g (x) + k]/[g (x) − 1] = [(x + k)/(x − 1) + k]/[(x + k)/(x − 1) − 1] = [x + k + k (x − 1)]/[x + k − (x − 1)] = x (1 + k)/(k + 1) = x.
[Figure: graph of y = (x + k) / (x − 1), showing its asymptotes, its centre (1, 1), its lines of symmetry, the horizontal intercept (−k, 0), and the vertical intercept (0, −k).]
[Figure: graphs of y = f (x) and y = ∣f (x)∣.]
Observe though that the quadratic polynomial x² + 3x + 4 has negative discriminant. Hence,
the only real solution to f (x) = 4 is 2.
(iii) The given equation is equivalent to f (x + 3) = 4.
In (ii), we showed that f (x) = 4 has solution x = 2.
Hence, the given equation has solution x + 3 = 2 or x = −1.
(iv) Where f (x) < 0, reflect the graph in the x-axis. Where f (x) ≥ 0, keep it unchanged.
(v) ∣f (x)∣ = 4 ⇐⇒ ∣x³ + x² − 2x − 4∣ = 4. (1)
We already found that this cubic equation has only one real root, namely 2. We can verify
that x = 2 satisfies the inequality labelled (2).
0 = x³ + x² − 2x = x (x² + x − 2) = x (x + 2) (x − 1) .
This second cubic equation has three real roots, namely 0, −2, and 1. We can verify that
these values of x satisfy the inequality labelled (3).
Altogether, the equation ∣f (x)∣ = 4 has four real roots: −2, 0, 1, and 2.
A465 (9740 N2011/I/1). The numerator N = x² + x + 1 is a quadratic polynomial with
positive coefficient on x² and negative discriminant. So, N > 0 for all x.
Thus, the given inequality holds if and only if the denominator is negative.
The denominator D = x² + x − 2 is a quadratic polynomial with positive coefficient on x²
and roots given by:
x = [−1 ± √(1² − 4 (1) (−2))]/(2 (1)) = −2, 1.
Hence, D < 0 (and the given inequality holds) if and only if x ∈ (−2, 1).
A466 (9740 N2011/I/2)(i) The given information forms the following system of equa-
tions:
One method for solving the above system of equations is to use your graphing calculator.
But here just for fun, let’s do it by hand:
(2) minus (1) yields 2.16a + 3.6b = −1.3 or 108a + 180b = −65. (4)
y = ln (2 ⋅ 0 + 1) + 3 = ln 1 + 3 = 3.
[Figure: graphs of y = f (x) and y = f⁻¹ (x). For f: vertical asymptote x = −0.5, vertical intercept (0, 3), and horizontal intercept (0.5 [e⁻³ − 1], 0). For f⁻¹: horizontal intercept (3, 0).]
The graphs of f and f⁻¹ are reflections of each other in the line y = x. Hence, the graph of
f⁻¹ has x-intercept (3, 0), y-intercept (0, 0.5 (e⁻³ − 1)), and horizontal asymptote y = −0.5.
(iii) By Fact 23, any point at which f intersects the line y = x is also a point at which f
intersects f⁻¹.
The points where f intersects the line y = x are the points where f (x) = x or equivalently
ln (2x + 1) + 3 = x. (1)
And thus, f and f⁻¹ also intersect at the points where (1) holds. Solve (1)
by using your graphing calculator; we find that the two solutions are x ≈ −0.4847, 5.482.
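In place of a graphing calculator, the equation ln (2x + 1) + 3 = x can be solved numerically by bisection. In this sketch the bracketing intervals (−0.49, −0.4) and (5, 6) are my own choices, picked because g (x) = ln (2x + 1) + 3 − x changes sign on each of them:

```python
from math import log

def g(x):
    # g(x) = 0 exactly when ln(2x + 1) + 3 = x.
    return log(2 * x + 1) + 3 - x

def bisect(func, lo, hi, tol=1e-10):
    """Find a root of func in [lo, hi], assuming func changes sign there."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = func(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid  # root lies in the upper half
        else:
            hi = mid               # root lies in the lower half
    return (lo + hi) / 2

root1 = bisect(g, -0.49, -0.4)
root2 = bisect(g, 5.0, 6.0)
print(round(root1, 4), round(root2, 3))
```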
But as was already discussed in Ch. 14.3, the above statement is false.
A468 (9740 N2010/I/5)(i) Translating y = x³ rightwards by 2 units yields y = (x − 2)³.
curve.
To find the new curve’s y-intercept, plug in x = 0 to get y = (2 ⋅ 0 − 2)³ − 6 = −14. So, the
x = 0.5 ∛6 + 1. So, the x-intercept is (0.5 ∛6 + 1, 0).
(ii) Being the reflection of f in the line y = x, the graph of f⁻¹ has x-intercept (−14, 0) and
y-intercept (0, 0.5 ∛6 + 1).
[Figure: graphs of y = f (x) = (2x − 2)³ − 6 and y = f⁻¹ (x). For f: horizontal intercept (0.5 ∛6 + 1, 0) and vertical intercept (0, −14). For f⁻¹: horizontal intercept (−14, 0) and vertical intercept (0, 0.5 ∛6 + 1).]
A469 (9740 N2010/II/4)(i) Graph on your calculator and copy. The asymptotes are
x = ±1.
[Figure: graph of y = f (x), showing the vertical asymptotes x = −1 and x = 1, the horizontal asymptote y = 0, and the vertical intercept (0, −1).]
(ii) Observe that f is symmetric in the y-axis. So, by restricting Domainf to R+0 , the new
function produced would be invertible. Hence, the smallest k for which f −1 exists is k = 0.
(iii) f g (x) = f (1/(x − 3)) = 1/[(1/(x − 3))² − 1] = (x − 3)²/[1 − (x − 3)²]
= (x − 3)²/([1 − (x − 3)] [1 + (x − 3)]) = (x − 3)²/[(4 − x) (x − 2)].
slightly incorrect.
[Figure: graph of y = f g (x). The point (3, 0) is not part of the graph of y = f g (x).]
Note well that while your graphing calculator has graphed y for all x ∈ R ∖ {2, 4}, the
domain of f g is R ∖ {2, 3, 4}.
Hence, the point (3, 0) is not in the graph of f g. Since there is no other x for which
f g (x) = 0, we conclude that 0 ∉ Range (f g). So, (1) is incorrect. Instead, we have:
And so: 1/[(4 − x) (x − 2)] ∈ (−∞, 0) ∪ (1, ∞).
And thus: −1 + 1/[(4 − x) (x − 2)] ∈ (−∞, −1) ∪ (0, ∞) = Range (f g).
A470 (9740 N2009/I/1)(i) Write un = an2 + bn + c. The information given yields the
following system of equations:
a ⋅ 1² + b ⋅ 1 + c = 10, (1)
a ⋅ 2² + b ⋅ 2 + c = 6, (2)
a ⋅ 3² + b ⋅ 3 + c = 5. (3)
You can solve this using your calculator. But here as an exercise, let’s do it by hand:
(1) minus (2) yields −3a − b = 4. (4)
(1) is a quadratic inequality with positive coefficient on n². Thus, (1) holds if and only if n is
(ii) Plug C1’s equation y = (x − 2)/(x + 2) into C2’s to get x²/6 + [(x − 2)/(x + 2)]²/3 = 1.
Multiply by 6 (x + 2)² to get: x² (x + 2)² + 2 (x − 2)² = 6 (x + 2)².
Rearranging, 2 (x − 2)² = 6 (x + 2)² − x² (x + 2)² = (x + 2)² (6 − x²), as desired.
[Figure: graph of y = (x − 2) / (x + 1), with its asymptotes, lines of symmetry, centre, and intercepts marked.]
[Figure: graphs of y = (3x − 7) / (2x + 1) and y² = (3x − 7) / (2x + 1), with their horizontal asymptotes, the horizontal intercept (7/3, 0) for both graphs, and the vertical intercept (0, −7).]
[Figure: graphs of y = x / (x² − 1) and y² = x / (x² − 1), with vertical asymptotes x = ±1, horizontal asymptote y = 0 for both graphs, and horizontal and vertical intercepts at (0, 0) for both graphs.]
Take care to note that Domainf = (4, ∞) and Rangef = (1, ∞). In particular, the graph of
f does not include the point (4, 1).
[Figure: graphs of y = f (x) and y = f⁻¹ (x), together with the line y = x. The point (4, 1) is not part of the graph of y = f (x).]
(ii) Write y = (x − 4)² + 1. Do the algebra: y − 1 = (x − 4)² ⇐⇒ ±√(y − 1) = x − 4 ⇐⇒
x = 4 ± √(y − 1). Since Domainf = (4, ∞), we have x > 4 and so we can discard the negative
value. Thus, f⁻¹ (x) = 4 + √(x − 1).
x = [9 ± √((−9)² − 4 (1) (17))]/(2 (1)) = (9 ± √13)/2.
Since Domainf = (4, ∞), we may discard any values that are smaller than or equal to 4.
This leaves us with (9 + √13) /2 as a solution to f (x) = f⁻¹ (x). [Footnote 421]
[Footnote 421] The answer here may have sufficed on these particular A-Level exams where the writers seem to have
made a mistake (see remark). However, this is not in fact a complete answer. We have merely found
one solution to f (x) = f⁻¹ (x). But there may or may not be still other solutions.
Remark 161. The writers of (iv) may have made a mistake. It seems that they believed
the following statement to be true:
But as was already discussed in Ch. 14.3, the above statement is false.
A476 (9740 N2007/I/1). (2x² − x − 19)/(x² + 3x + 2) − 1 = [2x² − x − 19 − (x² + 3x + 2)]/(x² + 3x + 2) = (x² − 4x − 21)/(x² + 3x + 2).
And so (2x² − x − 19)/(x² + 3x + 2) > 1 ⇐⇒ (2x² − x − 19)/(x² + 3x + 2) − 1 > 1 − 1 = 0 ⇐⇒ (x² − 4x − 21)/(x² + 3x + 2) > 0.
The numerator N = x² − 4x − 21 has positive coefficient on x² and roots −3 and 7.
The denominator D = x² + 3x + 2 has positive coefficient on x² and roots −1 and −2.
Hence, N /D > 0
⇐⇒ “N > 0 AND D > 0” OR “N < 0 AND D < 0”
⇐⇒ “‘x < −3 OR x > 7’ AND ‘x < −2 OR x > −1’” OR “−3 < x < 7 AND −2 < x < −1”
Altogether then, the inequality holds if and only if x ∈ (−∞, −3) ∪ (−2, −1) ∪ (7, ∞).
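A quick way to gain confidence in a solution set like this one is to test it against the original inequality on a grid of points. The grid below (steps of 0.01 from −10 to 10) is my own arbitrary choice, and such a check can expose a wrong answer but does not prove a right one:

```python
def lhs(x):
    return (2 * x**2 - x - 19) / (x**2 + 3 * x + 2)

def in_solution_set(x):
    # The claimed solution set: (-inf, -3) U (-2, -1) U (7, inf).
    return x < -3 or -2 < x < -1 or x > 7

xs = [k / 100 for k in range(-1000, 1001)]

# Skip x = -2 and x = -1, where the denominator is zero.
mismatches = [
    x for x in xs
    if x not in (-2.0, -1.0) and (lhs(x) > 1) != in_solution_set(x)
]
print(mismatches)
```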
A477 (9740 N2007/I/2)(i) Since Rangeg = R+0 ⊈ Domainf = R ∖ {3}, f g does not exist.
Since Rangef = R ∖ {0} ⊆ Domaing = R, gf exists. We have Domain (gf ) = Domainf =
R ∖ {3} and:
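(Footnote 421, continued.) To confirm that this is the unique solution, let us directly solve \(f(x) = f^{-1}(x) \iff (x - 4)^2 + 1 = 4 + \sqrt{x - 1}\) (1). Rearranging and squaring gives \(x^4 - 16x^3 + 90x^2 - 209x + 170 = 0\) (2). Every solution to (1) is also a solution to (2). The converse, though, is not true. So, what we'll do is find all the solutions to (2), then check which of them also solve (1).
Above we have already found that two roots of (2) are \((9 \pm \sqrt{13})/2\). (We also concluded that one of these, \((9 - \sqrt{13})/2\), is not in \(\operatorname{Domain} f\).) So, for some \(a\) and \(b\):
\[x^4 - 16x^3 + 90x^2 - 209x + 170 = (x^2 + ax + b)\left[x - \frac{9 + \sqrt{13}}{2}\right]\left[x - \frac{9 - \sqrt{13}}{2}\right] = (x^2 + ax + b)(x^2 - 9x + 17) = x^4 + (a - 9)x^3 + ?x^2 + ?x + 17b.\]
Comparing coefficients, \(a = -7\) and \(b = 10\), so the other two roots of (2) are \(2\) and \(5\). But \(2 \notin \operatorname{Domain} f\), while \(f(5) = 2 \neq 6 = f^{-1}(5)\). So neither solves \(f(x) = f^{-1}(x)\).
Altogether then, \((9 + \sqrt{13})/2\) is indeed the unique solution to \(f(x) = f^{-1}(x)\).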
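\[gf(x) = g(f(x)) = g\left(\frac{1}{x - 3}\right) = \frac{1}{(x - 3)^2}.\]
(ii) Write \(y = f(x) = \frac{1}{x - 3}\) and do the algebra to get \(\frac{1}{y} = x - 3\) or \(\frac{1}{y} + 3 = x\). (Note that \(y \neq 0\) because \(0 \notin \operatorname{Range} f\).) Hence, \(\operatorname{Domain} f^{-1} = \operatorname{Range} f = \mathbb{R} \setminus \{0\}\) and:
\[f^{-1}(x) = \frac{1}{x} + 3.\]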
A478. (9740 N2007/I/5) Write \(y = \frac{2x + 7}{x + 2} = 2 + \frac{3}{x + 2}\).
Starting with y = 1/x:
1. Translate leftwards by 2 units to get y = 1/ (x + 2).
2. Stretch vertically outwards from the x-axis by a factor of 3 to get y = 3/ (x + 2).
3. Translate upwards by 2 units to get y = 2 + 3/ (x + 2).
This is a rectangular hyperbola.
For the y-intercept, plug in \(x = 0\) to get \(y = 2 + \frac{3}{0 + 2} = 3.5\). So, the only y-intercept is \((0, 3.5)\).
For the x-intercept, plug in \(y = 0\) to get \(0 = 2 + \frac{3}{x + 2}\) or \(x = -3.5\). So, the only x-intercept is \((-3.5, 0)\).
As x → −2, y → ±∞. So, x = −2 is a vertical asymptote.
As x → ±∞, y → 2. So, y = 2 is a horizontal asymptote.
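[Figure: graph of \(y = (2x + 7)/(x + 2)\), showing the horizontal asymptote \(y = 2\), the vertical asymptote \(x = -2\), the line of symmetry \(y = x + 4\), the vertical intercept \((0, 3.5)\), and the horizontal intercept \((-3.5, 0)\).]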
A479 (9740 N2007/II/1). Let p, m, and l be the prices (in dollars per kilogram) of,
respectively, the pineapples, mangoes, and lychees. Then the given table yields the following
system of equations:
\[1.15p + 0.6m + 0.55l = 8.28 \quad (1), \qquad 1.2p + 0.45m + 0.3l = 6.84 \quad (2), \qquad 2.15p + 0.9m + 0.65l = 13.05 \quad (3).\]
We can either solve the above system by calculator or by hand. We’ll do the latter:
\(2 \times (2)\) minus \((3)\) yields \(0.25p - 0.05l = 0.63\) or \(25p - 5l = 63\) (4).
1.3p + 0.25m + 0.5l = 1.3 ⋅ 3.5 + 0.25 ⋅ 2.6 + 0.5 ⋅ 4.9 = 4.55 + 0.65 + 2.45 = 7.65 dollars.
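The text notes that the system can also be solved by calculator. As a hedged illustration of that route, the sketch below solves equations (1) to (3) with Gauss-Jordan elimination in exact rational arithmetic and then evaluates the final cost. The variable names mirror the prices \(p\), \(m\), \(l\) above; the elimination routine is a generic textbook method, not the author's working.

```python
from fractions import Fraction as F

# Augmented matrix for the three given equations, unknowns ordered (p, m, l).
rows = [
    [F("1.15"), F("0.6"), F("0.55"), F("8.28")],
    [F("1.2"), F("0.45"), F("0.3"), F("6.84")],
    [F("2.15"), F("0.9"), F("0.65"), F("13.05")],
]

# Gauss-Jordan elimination with exact fractions (no rounding error).
size = len(rows)
for i in range(size):
    pivot = next(k for k in range(i, size) if rows[k][i] != 0)
    rows[i], rows[pivot] = rows[pivot], rows[i]
    rows[i] = [value / rows[i][i] for value in rows[i]]
    for k in range(size):
        if k != i:
            factor = rows[k][i]
            rows[k] = [x - factor * y for x, y in zip(rows[k], rows[i])]

p, m, l = (row[3] for row in rows)
cost = F("1.3") * p + F("0.25") * m + F("0.5") * l
print(float(p), float(m), float(l), float(cost))
```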
A480. (9233 N2007/II/4)(i) Write \(y = \frac{4x + 1}{x - 3} = 4 + \frac{13}{x - 3}\).
As x → 3, y → ±∞. So, x = 3 is a vertical asymptote.
As x → ±∞, y → 4. So, y = 4 is a horizontal asymptote.
(ii) For the y-intercept, plug in \(x = 0\) to get \(y = 4 + \frac{13}{0 - 3} = -1/3\). So, the only y-intercept is \((0, -1/3)\).
For the x-intercept, plug in \(y = 0\) to get \(0 = 4 + \frac{13}{x - 3}\) or \(x = -1/4\). So, the only x-intercept is \((-1/4, 0)\).
[Figure: graph of \(y = (4x + 1)/(x - 3)\), showing the centre \((3, 4)\), the horizontal asymptote \(y = 4\), the vertical asymptote \(x = 3\), the lines of symmetry \(y = x + 1\) and \(y = -x + 7\), the horizontal intercept \((-1/4, 0)\), and the vertical intercept \((0, -1/3)\).]
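(iii) \(\operatorname{Domain} f^{-1} = \operatorname{Range} f = \mathbb{R} \setminus \{4\}\). Write \(y = f(x) = 4 + \frac{13}{x - 3}\). Then do the algebra:
\[y - 4 = \frac{13}{x - 3} \iff \frac{13}{y - 4} = x - 3 \iff 3 + \frac{13}{y - 4} = x.\]
(Note that the second step is permitted because \(y \neq 4\).) So, \(f^{-1}(x) = 3 + \frac{13}{x - 4}\).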
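A481 (9233 N2006/I/3)(i) Since \(\operatorname{Range} g = \mathbb{R}^+ \subseteq \operatorname{Domain} f = \mathbb{R}^+\), \(fg\) exists and has mapping rule:
\[fg(x) = f(g(x)) = f\left(\frac{3}{x}\right) = 5 \cdot \frac{3}{x} + 3 = \frac{15}{x} + 3.\]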
\[\frac{N}{D} \geq 0 \qquad (1)\]
⇐⇒ “N ≥ 0 AND D ≥ 0 AND x ≠ ±3” OR “N ≤ 0 AND D ≤ 0 AN
Altogether then, the inequality holds if and only if x ∈ (−∞, −3) ∪ [0, 1] ∪ (3, ∞).
2An − A + B.
(ii) \(u_{10} = 2A \cdot 10 - A + B = 19A + B = 48\) (1) and \(u_{17} = 2A \cdot 17 - A + B = 33A + B = 90\) (2).
\(r^2[2(2r)] = 4r^3\). So \(k = 4\).
\[\begin{aligned}
\sum_{r=1}^{n} 4r^3 &= \sum_{r=1}^{n} \left[r^2(r + 1)^2 - (r - 1)^2 r^2\right] \\
&= 1^2 \cdot 2^2 - 0^2 \cdot 1^2 + 2^2 \cdot 3^2 - 1^2 \cdot 2^2 + \dots + n^2(n + 1)^2 - (n - 1)^2 n^2 \\
&= -0^2 \cdot 1^2 + n^2(n + 1)^2 = n^2(n + 1)^2.
\end{aligned}\]
A484 (9758 N2017/II/2). Let the arithmetic progression be \((a_i)\) and the geometric progression be \((g_i)\). Let \(d = a_2 - a_1\) be the common difference in the arithmetic progression.
(i) \(a_1 = 3\) and \(a_{13} = 3 + 12d\). So, \((3 + 3 + 12d) \times 13/2 = 156\), so \(d = (2 \times 156/13 - 6)/12 = 1.5\).
(ii) The common ratio \(r\) cannot be equal to 1, because if so, \(\sum_{i=1}^{13} g_i = 13g_1 = 13 \times 3 = 39 \neq 156\).
The sum of the first 13 terms is \(3(1 - r^{13})/(1 - r) = 156 \iff 3 - 3r^{13} = 156 - 156r \iff r^{13} - 52r + 51 = 0\).
Use your graphing calculator to find that besides 1, the other two possible roots to this last
equation are r ≈ −1.451, 1.210.
These are thus also the two possible values of r.
(iii) We know that \(g_n = 3r^{n-1}\) and \(a_n = 3 + 1.5(n - 1)\).
We are told that \(r \approx 1.210\). We are told also that \(g_n > 100a_n\).
Thus: \(3 \cdot 1.210^{n-1} > 100[3 + 1.5(n - 1)] = 150 + 150n\). Graph \(3 \cdot 1.210^{x-1} - 150x - 150\) in your graphing calculator. You should find that there is a positive x-intercept. To the left of this x-intercept, the graph is below the x-axis and to the right, it is above.
This x-intercept is given by x ≈ 41.149. Thus, the smallest value of n for which the inequality
holds is 42.
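For readers without a graphing calculator to hand, the following Python sketch reproduces both calculator steps numerically: it locates the two roots of \(r^{13} - 52r + 51 = 0\) other than \(r = 1\) by bisection, then searches for the smallest \(n\) with \(g_n > 100a_n\). The bracketing intervals and the helper names are assumptions chosen for this illustration.

```python
def h(r):
    """Left-hand side of r^13 - 52r + 51 = 0."""
    return r**13 - 52 * r + 51

def bisect(f, lo, hi, iterations=100):
    """Bisection on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if (f(lo) > 0) == (f(mid) > 0):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

r_positive = bisect(h, 1.1, 1.5)    # the positive root other than r = 1
r_negative = bisect(h, -2.0, -1.0)  # the negative root

# Smallest n with g_n = 3 r^(n-1) exceeding 100 a_n = 100 [3 + 1.5 (n - 1)].
n = 1
while 3 * r_positive ** (n - 1) <= 100 * (3 + 1.5 * (n - 1)):
    n += 1
print(round(r_positive, 3), round(r_negative, 3), n)
```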
A486 (9740 N2016/I/6)(i) Let \(P(k)\) be the proposition: \(\sum_{r=1}^{k} r(r^2 + 1) = \frac{1}{4}k(k + 1)(k^2 + k + 2)\).
We first verify that \(P(1)\) is true:
\[\sum_{r=1}^{1} r(r^2 + 1) = 1(1^2 + 1) = 2 = \frac{1}{4} \cdot 1 \cdot 2 \cdot 4 = \frac{1}{4} \cdot 1(1 + 1)(1^2 + 1 + 2). \quad \checkmark\]
Now let \(k\) be any positive integer. Suppose \(P(k)\) is true. Below we show that \(P(k + 1)\) is also true and hence, by the principle of mathematical induction, that the given proposition is also true:
\[\begin{aligned}
\sum_{r=1}^{k+1} r(r^2 + 1) &= \sum_{r=1}^{k} r(r^2 + 1) + (k + 1)\left[(k + 1)^2 + 1\right] \\
&= \frac{1}{4}k(k + 1)(k^2 + k + 2) + (k + 1)(k^2 + 2k + 2) \\
&= (k + 1)\left[\frac{1}{4}k(k^2 + k + 2) + (k^2 + 2k + 2)\right] \\
&= (k + 1)\left(\frac{1}{4}k^3 + \frac{5}{4}k^2 + 2.5k + 2\right) \\
&= \frac{1}{4}(k + 1)(k^3 + 5k^2 + 10k + 8) \\
&= \frac{1}{4}(k + 1)(k + 2)(k^2 + 3k + 4) \\
&= \frac{1}{4}(k + 1)(k + 2)\left[(k + 1)^2 + (k + 1) + 2\right]. \quad \checkmark
\end{aligned}\]
(ii) \(u_1 = u_0 + 1^3 + 1 = 2 + 1 + 1 = 4\).
\(u_2 = u_1 + 2^3 + 2 = 4 + 8 + 2 = 14\).
\(u_3 = u_2 + 3^3 + 3 = 14 + 27 + 3 = 44\).
(iii) Through telescoping, we have \(\sum_{r=1}^{n}(u_r - u_{r-1}) = u_n - u_0 = u_n - 2\) (1).
Using (i), we also have:
\[\sum_{r=1}^{n}(u_r - u_{r-1}) = \sum_{r=1}^{n}(r^3 + r) = \sum_{r=1}^{n} r(r^2 + 1) = \frac{1}{4}n(n + 1)(n^2 + n + 2).\]
Plugging this last equation into (1), we have \(u_n = \frac{1}{4}n(n + 1)(n^2 + n + 2) + 2\).
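A485 (9740 N2016/I/4). We are given that:
\[a + 3d = br^4 \quad (1), \qquad a + 8d = br^7 \quad (2), \qquad a + 11d = br^{14} \quad (3).\]
\[\frac{b}{1 - r} - \frac{b(1 - r^n)}{1 - r} = \frac{b}{1 - r}\left[1 - (1 - r^n)\right] = \frac{br^n}{1 - r} \approx \frac{0.74^n b}{0.26}.\]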
A487 (9740 N2015/I/8). First, note that in seconds, the required time interval is
[5 400, 6 300].
(i) The time (in seconds) taken by A to complete the 50 laps is:
\[(\text{First term} + \text{Last term}) \times \frac{\text{Number of terms}}{2} = (T + T + 49 \times 2) \times \frac{50}{2} = 50T + 49 \times 50 = 50T + 2450.\]
So, we need \(50T + 2450 \in [5400, 6300]\) or \(50T \in [2950, 3850]\) or \(T \in [59, 77]\).
(ii) The time (in seconds) taken by B to complete the 50 laps is:
\[t + 1.02t + \dots + 1.02^{49}t = t \cdot \frac{1.02^{50} - 1}{1.02 - 1} = 50t(1.02^{50} - 1).\]
So, we need \(50t(1.02^{50} - 1) \in [5400, 6300]\) or \(3192.267 \leq 50t \leq 3724.311\) or \(63.845 \leq t \leq 74.486\).
(iii) T = 59 and t ≈ 63.845. So the times taken to complete the 50th lap by A and B are:
\[\sum_{r=1}^{k} r(r + 2)(r + 5) = \frac{1}{12}k(k + 1)(3k^2 + 31k + 74).\]
\[\begin{aligned}
\sum_{r=1}^{j+1} r(r + 2)(r + 5) &= \sum_{r=1}^{j} r(r + 2)(r + 5) + (j + 1)(j + 3)(j + 6) \\
&\overset{P(j)}{=} \frac{1}{12}j(j + 1)(3j^2 + 31j + 74) + (j + 1)(j + 3)(j + 6) \\
&= \frac{j + 1}{12}(3j^3 + 31j^2 + 74j) + (j + 1)(j^2 + 9j + 18) \\
&= \frac{j + 1}{12}(3j^3 + 31j^2 + 74j + 12j^2 + 108j + 216) \\
&= \frac{j + 1}{12}(3j^3 + 43j^2 + 182j + 216) \\
&= \frac{j + 1}{12}(j + 2)\left[3(j + 1)^2 + 31(j + 1) + 74\right]. \quad \checkmark
\end{aligned}\]
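(b)(i) Write:
\[\frac{A}{2r + 1} + \frac{B}{2r + 3} = \frac{(2r + 3)A + (2r + 1)B}{(2r + 1)(2r + 3)} = \frac{(2A + 2B)r + 3A + B}{4r^2 + 8r + 3}.\]
So \(2A + 2B = 0\) (1) and \(3A + B = 2\) (2). Solving, \(A = 1\) and \(B = -1\), so that:
\[\frac{2}{4r^2 + 8r + 3} = \frac{1}{2r + 1} - \frac{1}{2r + 3}.\]
(ii)
\[\begin{aligned}
\sum_{r=1}^{n} \frac{2}{4r^2 + 8r + 3} &= \sum_{r=1}^{n}\left(\frac{1}{2r + 1} - \frac{1}{2r + 3}\right) \\
&= \left(\frac{1}{3} - \frac{1}{5}\right) + \left(\frac{1}{5} - \frac{1}{7}\right) + \left(\frac{1}{7} - \frac{1}{9}\right) + \dots + \left(\frac{1}{2n + 1} - \frac{1}{2n + 3}\right) \\
&= \frac{1}{3} - \frac{1}{2n + 3}.
\end{aligned}\]
(iii) The sum to infinity is \(1/3\). Hence, the difference between \(S_n\) and the sum to infinity is \(\frac{1}{2n + 3}\). Now:
\[\frac{1}{2n + 3} \leq 10^{-3} \iff 1000 \leq 2n + 3 \iff n \geq 498.5.\]
So the smallest such \(n\) is 499.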
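The telescoping result and the threshold in (iii) can be checked mechanically. The sketch below is only an illustration: it compares the partial sums with the closed form \(\frac{1}{3} - \frac{1}{2n+3}\) using exact fractions, then searches for the smallest \(n\) whose gap from the sum to infinity is at most \(10^{-3}\). The function name `partial_sum` is a label chosen here.

```python
from fractions import Fraction

def partial_sum(n):
    """S_n = sum of 2 / (4r^2 + 8r + 3) for r = 1..n, computed exactly."""
    return sum(Fraction(2, 4 * r * r + 8 * r + 3) for r in range(1, n + 1))

# Compare with the closed form 1/3 - 1/(2n + 3) for the first few n.
closed_form_ok = all(
    partial_sum(n) == Fraction(1, 3) - Fraction(1, 2 * n + 3)
    for n in range(1, 50)
)

# Smallest n whose gap 1/(2n + 3) from the sum to infinity is at most 10^-3.
n = 1
while Fraction(1, 2 * n + 3) > Fraction(1, 1000):
    n += 1
print(closed_form_ok, n)
```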
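A489 (9740 N2014/I/6)(i) Let \(P(k)\) be the following proposition:
\[p_k = \frac{1}{3}(7 - 4^k).\]
We show that \(P(1)\) is true:
\[p_1 = \frac{1}{3}(7 - 4) = \frac{1}{3}(7 - 4^1). \quad \checkmark\]
We next show that for all \(j \in \mathbb{N}\), if \(P(j)\) is true, then \(P(j + 1)\) is also true:
\[\begin{aligned}
p_{j+1} &= 4p_j - 7 \\
&\overset{P(j)}{=} \frac{4}{3}(7 - 4^j) - 7 \\
&= \frac{1}{3}(7 - 4^{j+1}). \quad \checkmark
\end{aligned}\]
(ii)
\[\begin{aligned}
\sum_{r=1}^{n} p_r &= \sum_{r=1}^{n} \frac{1}{3}(7 - 4^r) = \frac{1}{3}\sum_{r=1}^{n}(7 - 4^r) = \frac{1}{3}\left(\sum_{r=1}^{n} 7 - \sum_{r=1}^{n} 4^r\right) \\
&= \frac{1}{3}\left(7n - 4 \cdot \frac{1 - 4^n}{1 - 4}\right) = \frac{1}{3}\left(7n + 4 \cdot \frac{1 - 4^n}{3}\right) = \frac{4}{9} + \frac{7n}{3} - \frac{4^{n+1}}{9}.
\end{aligned}\]
(b)(i) As \(n \to \infty\), we have \((n + 1)! \to \infty\) and so \(\frac{1}{(n + 1)!} \to 0\) and thus \(S_n = 1 - \frac{1}{(n + 1)!} \to 1\).
(ii)
\[u_n = S_n - S_{n-1} = 1 - \frac{1}{(n + 1)!} - \left(1 - \frac{1}{n!}\right) = \frac{1}{n!} - \frac{1}{(n + 1)!} = \frac{n + 1}{(n + 1)!} - \frac{1}{(n + 1)!} = \frac{n}{(n + 1)!}.\]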
\[(\text{First term} + \text{Last term}) \times \frac{\text{Number of terms}}{2} = (8 + 8n) \times \frac{n}{2} = 4n + 4n^2 \text{ m}.\]
\[n = \frac{-1 \pm \sqrt{1^2 - 4(1)(-1250)}}{2(1)} = -0.5 \pm 0.5\sqrt{5001} \approx 34.859, -35.859.\]
Thus, the minimum number of stages to complete in order to have run at least 5 km is 35.
(ii) The distance run in the nth stage is \(2^{n-1} \cdot 8\) m. Thus, the distance run in the first \(n\) stages is:
\[\sum_{k=1}^{n}(2^{k-1} \cdot 8) = 8\sum_{k=1}^{n} 2^{k-1} = 8 \cdot \frac{2^n - 1}{2 - 1} = 2^{n+3} - 8 \text{ m}.\]
Let \(j\) be the largest integer such that \(2^{j+3} - 8 < 10\,000\). Since \(2^{13} = 8192\) and \(2^{14} = 16\,384\), we have \(j + 3 = 13\) or \(j = 10\).
So, the distance run after completing exactly 10 stages is \(2^{13} - 8 = 8184\) m.
So, at the instant at which he has run exactly 10 km or 10 000 m, he has completed 1 816 m of the 11th stage. Since Stage 11 is \(2^{11-1} \cdot 8 = 8192\) m long, at this instant, he will not even have completed half of Stage 11. Thus, at this instant, he is 1 816 m away from O and running away from O.
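A491 (9740 N2013/I/7)(i) The nth piece is \(p = (2/3)^{n-1} \times 128\) cm long (1). Applying ln to (1), we get:
\[\ln p = (n - 1)\ln\frac{2}{3} + \ln 128.\]
(ii)
\[S_n = \sum_{k=1}^{n}\left(\frac{2}{3}\right)^{k-1} \times 128 = 128 \cdot \frac{1 - (2/3)^n}{1 - 2/3} = 384 - 384\left(\frac{2}{3}\right)^n.\]
Thus, as \(n \to \infty\), \(S_n = 384 - 384\left(\frac{2}{3}\right)^n \to 384\).
(iii) Let \(j\) be the smallest integer such that \(S_j > 380\). Then:
\[S_j = 384 - 384\left(\frac{2}{3}\right)^j > 380 \quad\text{or}\quad 4 > 384\left(\frac{2}{3}\right)^j \quad\text{or}\quad \left(\frac{3}{2}\right)^j > \frac{384}{4} = 96 \quad\text{or}\]
\[j\ln\frac{3}{2} > \ln 96 \quad\text{or}\quad j > \frac{\ln 96}{\ln(3/2)} \approx 11.257.\]
Thus, the minimum number of pieces one must cut off in order for the length cut off to exceed 380 cm is \(j = 12\).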
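The count of pieces can also be confirmed by simply cutting pieces one at a time. The short sketch below does this with exact fractions; the variable names are illustrative only.

```python
from fractions import Fraction

piece = Fraction(128)   # length of the 1st piece, in cm
total = Fraction(0)     # total length cut off so far
pieces = 0
while total <= 380:
    total += piece
    piece *= Fraction(2, 3)   # each piece is 2/3 as long as the one before
    pieces += 1
print(pieces, float(total))
```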
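A492 (9740 N2013/I/9)(i) Let \(P(k)\) be the following proposition:
\[\sum_{r=1}^{k} r(2r^2 + 1) = \frac{1}{2}k(k + 1)(k^2 + k + 1).\]
We next show that for all \(j \in \mathbb{N}\), if \(P(j)\) is true, then \(P(j + 1)\) is also true:
\[\begin{aligned}
\sum_{r=1}^{j+1} r(2r^2 + 1) &= \sum_{r=1}^{j} r(2r^2 + 1) + (j + 1)\left[2(j + 1)^2 + 1\right] \\
&\overset{P(j)}{=} \frac{1}{2}j(j + 1)(j^2 + j + 1) + (j + 1)(2j^2 + 4j + 3) \\
&= \frac{j + 1}{2}(j^3 + j^2 + j) + \frac{j + 1}{2}(4j^2 + 8j + 6) \\
&= \frac{j + 1}{2}(j^3 + 5j^2 + 9j + 6) \\
&= \frac{1}{2}(j + 1)(j + 2)(j^2 + aj + b) \\
&= \frac{1}{2}(j + 1)(j + 2)(j^2 + 3j + 3) \\
&= \frac{1}{2}(j + 1)(j + 2)\left[(j + 1)^2 + (j + 1) + 1\right]. \quad \checkmark
\end{aligned}\]
(ii)
\[\begin{aligned}
f(r) - f(r - 1) &= (2r^3 + 3r^2 + r + 24) - \left[2(r - 1)^3 + 3(r - 1)^2 + (r - 1) + 24\right] \\
&= (2r^3 + 3r^2 + r + 24) - \left[2(r^3 - 3r^2 + 3r - 1) + 3(r^2 - 2r + 1) + (r - 1) + 24\right] \\
&= 6r^2.
\end{aligned}\]
\[\begin{aligned}
\sum_{r=1}^{n} f(r) &= \sum_{r=1}^{n}\left[r(2r^2 + 1)\right] + 3\sum_{r=1}^{n} r^2 + \sum_{r=1}^{n} 24 \\
&= \frac{1}{2}n(n + 1)(n^2 + n + 1) + \frac{n(n + 1)(2n + 1)}{2} + 24n.
\end{aligned}\]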
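A493 (9740 N2012/I/3)(i) \(u_2 = \frac{3 \cdot 2 - 1}{6} = \frac{5}{6}\) and \(u_3 = \frac{3 \cdot 5/6 - 1}{6} = \frac{1}{4}\).
(ii) As \(n \to \infty\), \(u_{n+1} - u_n \to 0 \iff \frac{3u_n - 1}{6} - u_n = -\frac{u_n}{2} - \frac{1}{6} \to 0 \iff u_n \to -\frac{1}{3}\).
(iii) Let \(P(k)\) be the following proposition:
\[u_k = \frac{14}{3}\left(\frac{1}{2}\right)^k - \frac{1}{3}.\]
We next show that for all \(j \in \mathbb{N}\), if \(P(j)\) is true, then \(P(j + 1)\) is also true:
\[\begin{aligned}
u_{j+1} &= u_j + u_{j+1} - u_j = u_j + \left(-\frac{u_j}{2} - \frac{1}{6}\right) = \frac{u_j}{2} - \frac{1}{6} \\
&\overset{P(j)}{=} \frac{\frac{14}{3}\left(\frac{1}{2}\right)^j - \frac{1}{3}}{2} - \frac{1}{6} = \frac{14}{3}\left(\frac{1}{2}\right)^{j+1} - \frac{1}{3}. \quad \checkmark
\end{aligned}\]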
A494 (9740 N2012/II/4)(i) On the first day of the nth month, she deposits \(100 + (n - 1)10 = 10n + 90\) dollars. Hence, through the first day of the nth month, her account has:
\[(\text{First term} + \text{Last term}) \times \frac{\text{Number of terms}}{2} = (100 + 10n + 90) \times \frac{n}{2} = 5n^2 + 95n.\]
Let \(j\) be the smallest positive integer such that \(5j^2 + 95j > 5000\) or \(j^2 + 19j - 1000 > 0\). By the quadratic formula, \(j^2 + 19j - 1000 = 0\) has roots \((-19 \pm \sqrt{4361})/2 \approx -42.519, 23.519\).
Hence, \(j = 24\). Thus, her account first exceeded $5 000 on the 24th month, that is, on December 1 2002.
(ii) Let \(S_n\) be the amount in his account on the last day of the nth month, after interest has been paid.
Then \(S_1 = 1.005 \cdot 100\) and \(S_{n+1} = 1.005(S_n + 100)\).
So, in general:
\[S_n = 1.005^n \cdot 100 + 1.005^{n-1} \cdot 100 + \dots + 1.005 \cdot 100 = 1.005 \cdot 100 \cdot \frac{1.005^n - 1}{1.005 - 1} = 20\,100(1.005^n - 1).\]
Let \(j\) be the smallest positive integer such that \(S_j = 20\,100(1.005^j - 1) > 5000\) or \(201 \cdot 1.005^j > 251\) or:
\[1.005^j > \frac{251}{201} \quad\text{or}\quad j\ln 1.005 > \ln\frac{251}{201} \quad\text{or}\quad j > \frac{\ln(251/201)}{\ln 1.005} \approx 44.541.\]
Hence, \(j = 45\). Thus, his account first exceeded $5 000 in the 45th month, that is, in September 2004.
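Both month counts can be cross-checked by simulating the two accounts month by month. The sketch below is a minimal illustration under the rules stated in (i) and (ii); the helper `first_month_exceeding` and the names `month_i` and `month_ii` are made up for this example.

```python
def first_month_exceeding(target, update):
    """Return the first month n at which the balance exceeds target."""
    balance, month = 0.0, 0
    while balance <= target:
        month += 1
        balance = update(balance, month)
    return month

# Part (i): deposit 100 + 10(n - 1) dollars on the first day of month n, no interest.
month_i = first_month_exceeding(5000, lambda b, n: b + 100 + 10 * (n - 1))

# Part (ii): deposit 100 dollars, then earn 0.5% interest at the end of the month.
month_ii = first_month_exceeding(5000, lambda b, n: 1.005 * (b + 100))

print(month_i, month_ii)
```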
(iii) Let r be the interest rate. Then given r, the amount in the account on 2 December
2003 is:
Note that = is an equation we haven’t learnt to solve in H2 Maths, so you’ll need to use
1
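(ii) Rearranging (i), we have \(\cos r\theta = \frac{\sin\left(r + \frac{1}{2}\right)\theta - \sin\left(r - \frac{1}{2}\right)\theta}{2\sin\frac{1}{2}\theta}\). And so:
\[\begin{aligned}
\sum_{r=1}^{n} \cos r\theta &= \sum_{r=1}^{n} \frac{\sin\left(r + \frac{1}{2}\right)\theta - \sin\left(r - \frac{1}{2}\right)\theta}{2\sin\frac{1}{2}\theta} \\
&= \frac{1}{2\sin\frac{1}{2}\theta}\left(\sin\frac{3}{2}\theta - \sin\frac{1}{2}\theta + \sin\frac{5}{2}\theta - \sin\frac{3}{2}\theta + \dots + \sin\frac{2n + 1}{2}\theta - \sin\frac{2n - 1}{2}\theta\right) \\
&= \frac{1}{2\sin\frac{1}{2}\theta}\left(\sin\frac{2n + 1}{2}\theta - \sin\frac{1}{2}\theta\right) = \frac{\sin\left(n + \frac{1}{2}\right)\theta}{2\sin\frac{1}{2}\theta} - \frac{1}{2}.
\end{aligned}\]
Let \(P(k)\) be the following proposition:
\[\sum_{r=1}^{k} \sin r\theta = \frac{\cos\frac{1}{2}\theta - \cos\left(k + \frac{1}{2}\right)\theta}{2\sin\frac{1}{2}\theta}.\]
We show that \(P(1)\) is true:
\[\sum_{r=1}^{1} \sin r\theta = \sin\theta = \frac{2\sin\frac{1}{2}\theta\sin\theta}{2\sin\frac{1}{2}\theta} = \frac{\cos\frac{1}{2}\theta - \cos\frac{3}{2}\theta}{2\sin\frac{1}{2}\theta}, \quad \checkmark\]
where the last step uses the last formula printed under Trigonometry in List MF26, p. 3.
We next show that for all \(j \in \mathbb{N}\), if \(P(j)\) is true, then \(P(j + 1)\) is also true:
\[\begin{aligned}
\sum_{r=1}^{j+1} \sin r\theta &= \sum_{r=1}^{j} \sin r\theta + \sin(j + 1)\theta \\
&\overset{P(j)}{=} \frac{\cos\frac{1}{2}\theta - \cos\left(j + \frac{1}{2}\right)\theta}{2\sin\frac{1}{2}\theta} + \sin(j + 1)\theta \\
&= \frac{\cos\frac{1}{2}\theta - \cos\left(j + \frac{1}{2}\right)\theta + 2\sin\frac{1}{2}\theta\sin(j + 1)\theta}{2\sin\frac{1}{2}\theta} \qquad (3) \\
&= \frac{\cos\frac{1}{2}\theta - \cos\left(j + \frac{1}{2}\right)\theta + \cos\left(j + \frac{1}{2}\right)\theta - \cos\left(j + \frac{3}{2}\right)\theta}{2\sin\frac{1}{2}\theta} \qquad (4) \\
&= \frac{\cos\frac{1}{2}\theta - \cos\left(j + \frac{3}{2}\right)\theta}{2\sin\frac{1}{2}\theta}, \quad \checkmark
\end{aligned}\]
as desired. (Again, to get from (3) to (4), I used the same trigonometric identity as before.)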
A496 (9740 N2011/I/9)(i) The depth drilled on the nth day is \(256 - 7(n - 1) = 263 - 7n\) metres and the depth drilled through the nth day is:
\[D_n = (\text{First term} + \text{Last term}) \times \frac{\text{Number of terms}}{2} = (256 + 263 - 7n)\frac{n}{2} = \frac{519}{2}n - \frac{7}{2}n^2.\]
Thus, the depth drilled on the 10th day is \(263 - 7 \cdot 10 = 193\) metres.
Let \(j\) be the smallest integer such that \(263 - 7j < 10\) or \(j > 253/7 \approx 36.1\). Then \(j = 37\). So, the total depth drilled is:
\[D_j = D_{37} = \frac{519}{2} \cdot 37 - \frac{7}{2} \cdot 37^2 = 4810 \text{ metres}.\]
(ii) Through the nth day, the depth drilled is:
\[d_n = \sum_{r=1}^{n} 256\left(\frac{8}{9}\right)^{r-1} = 256 \cdot \frac{1 - (8/9)^n}{1 - 8/9} = 2304\left[1 - \left(\frac{8}{9}\right)^n\right] \text{ metres}.\]
Let \(j\) be the smallest integer such that \(d_j > 0.99 \cdot 2304\) or \(2304\left[1 - \left(\frac{8}{9}\right)^j\right] > 0.99 \cdot 2304\) or:
\[1 - \left(\frac{8}{9}\right)^j > 0.99 \quad\text{or}\quad 0.01 > \left(\frac{8}{9}\right)^j \quad\text{or}\quad -\ln 100 > j\ln\frac{8}{9} \quad\text{or}\quad j > \frac{\ln 100}{\ln(9/8)} \approx 39.1.\]
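The two drilling schedules can be simulated directly. The following sketch assumes, as the working in (i) does, that the last day of drilling is the first day on which less than 10 m is drilled; the variable names are illustrative.

```python
from fractions import Fraction

# Part (i): 256 m on day 1, then 7 m less each day. The last day is the
# first one on which less than 10 m is drilled (that day still counts).
depth_today, total, day = 256, 0, 0
while True:
    day += 1
    total += depth_today
    if depth_today < 10:
        break
    depth_today -= 7
print(day, total)

# Part (ii): smallest j with (8/9)^j < 1/100, i.e. d_j exceeds 99% of 2304 m.
j = 1
while Fraction(8, 9) ** j >= Fraction(1, 100):
    j += 1
print(j)
```

The value of `j` found in the second loop should be the first integer above the bound \(\ln 100 / \ln(9/8) \approx 39.1\).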
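\(u_{n+1} = u_n + 4\).
We next show that for all \(j \in \mathbb{N}\), if \(P(j)\) is true, then \(P(j + 1)\) is also true:
\[\begin{aligned}
\sum_{r=1}^{j+1} r(r + 2) &= \sum_{r=1}^{j} r(r + 2) + (j + 1)(j + 3) \\
&\overset{P(j)}{=} \frac{1}{6}j(j + 1)(2j + 7) + (j + 1)(j + 3) \\
&= \frac{j + 1}{6}(2j^2 + 7j) + \frac{j + 1}{6}(6j + 18) \\
&= \frac{j + 1}{6}(2j^2 + 13j + 18) \\
&= \frac{j + 1}{6}(j + 2)(2j + 9) \\
&= \frac{1}{6}(j + 1)(j + 1 + 1)\left[2(j + 1) + 7\right]. \quad \checkmark
\end{aligned}\]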
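(ii)(a) Observe that:
\[\frac{1}{r(r + 2)} = \frac{0.5}{r} - \frac{0.5}{r + 2}.\]
Hence:
\[\begin{aligned}
\sum_{r=1}^{n} \frac{1}{r(r + 2)} &= \sum_{r=1}^{n}\left(\frac{0.5}{r} - \frac{0.5}{r + 2}\right) \\
&= \frac{0.5}{1} - \frac{0.5}{3} + \frac{0.5}{2} - \frac{0.5}{4} + \frac{0.5}{3} - \frac{0.5}{5} + \dots + \frac{0.5}{n - 1} - \frac{0.5}{n + 1} + \frac{0.5}{n} - \frac{0.5}{n + 2} \\
&= \frac{0.5}{1} + \frac{0.5}{2} - \frac{0.5}{n + 1} - \frac{0.5}{n + 2} = \frac{3}{4} - \frac{1}{2(n + 1)} - \frac{1}{2(n + 2)}.
\end{aligned}\]
(b) In the formula found in (ii)(a), as \(n \to \infty\), the second and third terms vanish. Hence, as \(n \to \infty\), \(\sum_{r=1}^{n} \frac{1}{r(r + 2)} \to \frac{3}{4}\).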
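A499 (9740 N2009/I/3)(i)
\[\begin{aligned}
\frac{1}{n - 1} - \frac{2}{n} + \frac{1}{n + 1} &= \frac{n(n + 1) - 2(n - 1)(n + 1) + (n - 1)n}{(n - 1)n(n + 1)} \\
&= \frac{n^2 + n - 2(n^2 - 1) + n^2 - n}{n(n^2 - 1)} = \frac{-2(-1)}{n^3 - n} = \frac{2}{n^3 - n}.
\end{aligned}\]
So, \(A = 2\).
(ii)
\[\begin{aligned}
\sum_{r=2}^{n} \frac{1}{r^3 - r} &= \frac{1}{2}\sum_{r=2}^{n}\left(\frac{1}{r - 1} - \frac{2}{r} + \frac{1}{r + 1}\right) \\
&= \frac{1}{2}\left(\frac{1}{1} - \frac{2}{2} + \frac{1}{3} + \frac{1}{2} - \frac{2}{3} + \frac{1}{4} + \frac{1}{3} - \frac{2}{4} + \frac{1}{5} + \dots + \frac{1}{n - 2} - \frac{2}{n - 1} + \frac{1}{n} + \frac{1}{n - 1} - \frac{2}{n} + \frac{1}{n + 1}\right).
\end{aligned}\]
Observe that the terms with denominators 3 through \(n - 1\) are cancelled out. Hence:
\[\sum_{r=2}^{n} \frac{1}{r^3 - r} = \frac{1}{2}\left(\frac{1}{1} - \frac{2}{2} + \frac{1}{2} + \frac{1}{n} - \frac{2}{n} + \frac{1}{n + 1}\right) = \frac{1}{2}\left(\frac{1}{2} - \frac{1}{n} + \frac{1}{n + 1}\right) = \frac{1}{4} - \frac{1}{2n} + \frac{1}{2(n + 1)}.\]
(iii) In the formula found in (ii), as \(n \to \infty\), the second and third terms vanish. Hence, as \(n \to \infty\), \(\sum_{r=2}^{n} \frac{1}{r^3 - r} \to \frac{1}{4}\).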
A500 (9740 N2009/I/5)(i) Let \(P(k)\) be the following proposition:
\[\sum_{r=1}^{k} r^2 = \frac{1}{6}k(k + 1)(2k + 1).\]
We next show that for all \(j \in \mathbb{N}\), if \(P(j)\) is true, then \(P(j + 1)\) is also true:
\[\begin{aligned}
\sum_{r=1}^{j+1} r^2 &= \sum_{r=1}^{j} r^2 + (j + 1)^2 \overset{P(j)}{=} \frac{1}{6}j(j + 1)(2j + 1) + (j + 1)^2 \\
&= \frac{j + 1}{6}(2j^2 + j) + \frac{j + 1}{6}(6j + 6) = \frac{j + 1}{6}(2j^2 + 7j + 6) \\
&= \frac{j + 1}{6}(j + 2)(2j + 3) = \frac{1}{6}(j + 1)(j + 1 + 1)\left[2(j + 1) + 1\right]. \quad \checkmark
\end{aligned}\]
(ii)
\[\begin{aligned}
\sum_{r=n+1}^{2n} r^2 &= \sum_{r=1}^{2n} r^2 - \sum_{r=1}^{n} r^2 = \frac{1}{6} \cdot 2n(2n + 1)(2 \cdot 2n + 1) - \frac{1}{6}n(n + 1)(2n + 1) \\
&= \frac{n(2n + 1)}{6}(8n + 2) - \frac{n(2n + 1)}{6}(n + 1) = \frac{n(2n + 1)}{6}(7n + 1).
\end{aligned}\]
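A closed form like the one in (ii) is easy to spot-check by brute force. The sketch below compares the direct sum of squares from \(n + 1\) to \(2n\) with \(\frac{n(2n+1)(7n+1)}{6}\) for a range of \(n\); the function names are chosen here for illustration.

```python
def direct_sum(n):
    """Sum of r^2 for r = n + 1, ..., 2n, added up term by term."""
    return sum(r * r for r in range(n + 1, 2 * n + 1))

def closed_form(n):
    """The formula n (2n + 1) (7n + 1) / 6 derived in (ii)."""
    return n * (2 * n + 1) * (7 * n + 1) // 6

agree = all(direct_sum(n) == closed_form(n) for n in range(1, 201))
print(agree)
```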
A501 (9740 N2009/I/8)(i) Let \(r\) be the common ratio. Then the 25th bar has length \(20r^{24} = 5\) cm and so \(r = \left(\frac{1}{4}\right)^{1/24} = 0.5^{1/12}\).
In the limit, the total length of all the bars is \(\frac{20}{1 - r} \approx 356.343\) cm.
And so indeed, no matter how many bars there are, their total length cannot exceed 357 cm.
(ii) The total length is \(L = 20 \cdot \frac{1 - r^{25}}{1 - r} \approx 272.257\) cm.
The length of the 13th bar is \(20r^{12} = 20 \cdot \left(0.5^{1/12}\right)^{12} = 10\) cm.
So, d ≈ 0.491 cm and the length of the longest bar (the 1st bar) is 5 + 24d ≈ 16.781 cm.
A502 (9740 N2008/I/2). Let \(P(k)\) be the following proposition:
\[S_k = \sum_{r=1}^{k} u_r = \frac{1}{6}k(k + 1)(4k + 5).\]
We next show that for all \(j \in \mathbb{N}\), if \(P(j)\) is true, then \(P(j + 1)\) is also true:
\[\begin{aligned}
S_{j+1} &= \sum_{r=1}^{j} u_r + (j + 1)(2j + 3) \overset{P(j)}{=} \frac{1}{6}j(j + 1)(4j + 5) + (j + 1)(2j + 3) = \frac{j + 1}{6}(4j^2 + 5j) + \frac{j + 1}{6}(12j + 18) \\
&= \frac{j + 1}{6}(4j^2 + 17j + 18) = \frac{j + 1}{6}(j + 2)(4j + 9) = \frac{1}{6}(j + 1)(j + 1 + 1)\left[4(j + 1) + 5\right]. \quad \checkmark
\end{aligned}\]
A503 (9740 N2008/I/10)(i) On the first day of the nth month, she saves \(10 + 3(n - 1) = 7 + 3n\) dollars.
Thus, the total saved through the first day of the nth month is:
\[(\text{First term} + \text{Last term}) \times \frac{\text{Number of terms}}{2} = (10 + 7 + 3n)\frac{n}{2} = \frac{3}{2}n^2 + \frac{17}{2}n.\]
Let \(j\) be the smallest positive integer such that \(\frac{3}{2}j^2 + \frac{17}{2}j > 2000\) or \(3j^2 + 17j - 4000 > 0\).
By the quadratic formula, the solutions to \(3x^2 + 17x - 4000 = 0\) are:
\[x = \frac{-17 \pm \sqrt{17^2 - 4(3)(-4000)}}{2(3)} = \frac{-17 \pm \sqrt{48\,289}}{6} \approx -39.458, 33.791.\]
Thus, \(j = 34\). So, she will have saved over $2 000 on 1 October 2011.
Alternatively: In the 1st month (Jan 2009), she has saved 10 dollars. In the 2nd (Feb 2009), she has saved \(10 + (10 + 1 \times 3)\) dollars. So in the nth month, she has saved \(10 + (10 + 1 \times 3) + (10 + 2 \times 3) + \dots + [10 + (n - 1) \times 3] = [20 + (n - 1) \times 3] \times \frac{n}{2} = 8.5n + 1.5n^2\) dollars. Set \(8.5n + 1.5n^2 = 2000\) and solve:
\[n = \frac{-8.5 \pm \sqrt{8.5^2 - 4(1.5)(-2000)}}{3} = \frac{-8.5 \pm 109.874}{3}.\]
We can ignore the negative root. So \(n = \frac{-8.5 + 109.874}{3} \approx 33.791\). So it is only in the 34th month that she has saved over $2000. That’s 1 October 2011.
(ii)(a) At the end of 2 years, her original $10 has earned \(10 \times 1.02^{24} - 10 \approx 6.084\) dollars in compound interest.
(b) Just after interest has been paid on the last day of the nth month, the total in her account is:
\[10 \cdot 1.02^n + 10 \cdot 1.02^{n-1} + \dots + 10 \cdot 1.02 = 10 \cdot 1.02 \cdot \frac{1.02^n - 1}{1.02 - 1} = 510(1.02^n - 1) \text{ dollars}.\]
Hence, at the end of 2 years, just after interest has been paid on 31 December 2010, the total in her account is:
\[510(1.02^{24} - 1) \approx 310.30 \text{ dollars}.\]
(c) Let \(j\) be the smallest positive integer such that \(510(1.02^j - 1) > 2000\) or \(510 \cdot 1.02^j > 2510\) or:
\[1.02^j > \frac{251}{51} \quad\text{or}\quad j\ln 1.02 > \ln\frac{251}{51} \quad\text{or}\quad j > \ln\frac{251}{51} \div \ln 1.02 \approx 80.476.\]
Thus, it is only after \(j = 81\) complete months that her total savings first exceed $2 000.
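A504 (9233 N2008/II/2). Let \(a_n\) and \(g_n\) denote the nth terms of the arithmetic and geometric progressions. Let \(d\) and \(r\) be the corresponding common difference and ratio. We are given that:
\[a_2 + g_2 = \frac{1}{2} \quad\text{or}\quad \frac{1}{2} + d + \frac{r}{2} = \frac{1}{2} \quad\text{or}\quad d + \frac{r}{2} = 0. \qquad (1)\]
And:
\[a_3 + g_3 = \frac{1}{8} \quad\text{or}\quad \frac{1}{2} + 2d + \frac{r^2}{2} = \frac{1}{8} \quad\text{or}\quad 2d + \frac{r^2}{2} = -\frac{3}{8}. \qquad (2)\]
Plugging (1) into (2) gives \(-r + \frac{r^2}{2} = -\frac{3}{8}\) or \(4r^2 - 8r + 3 = 0\). By the quadratic formula:
\[r = \frac{8 \pm \sqrt{(-8)^2 - 4(4)(3)}}{2(4)} = 1 \pm \sqrt{1 - 3/4} = 1 \pm \frac{1}{2} = \frac{1}{2}, \frac{3}{2}.\]
Since the geometric progression converges, \(|r| < 1\) and so \(r = 1/2\). And thus, its sum to infinity is:
\[\frac{g_1}{1 - r} = \frac{1/2}{1 - 1/2} = 1.\]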
A505 (9740 N2007/I/9)(i) Using your graphing calculator, \(\alpha \approx 0.619\) and \(\beta \approx 1.512\).
(ii) Suppose \(x_n \to L\). Then \(x_{n+1} - x_n \to 0\). Or:
\[x_{n+1} - x_n = \frac{1}{3}e^{x_n} - x_n \to 0 \quad\text{or}\quad \frac{1}{3}e^{L} - L = 0.\]
Equivalently, \(L\) equals a solution to \(\frac{1}{3}e^x - x = 0\). So, \(L\) equals \(\alpha\) or \(\beta\).
Remark 162. The answer to (ii) takes for granted certain results that students were not taught (even under the old 9740 syllabus). This question should never have been asked.
(iii) If \(x_1 = 0\), then \(x_2 = \frac{1}{3}\), \(x_3 \approx 0.465\), \(x_4 \approx 0.531\), \(x_5 \approx 0.567\), \(x_6 \approx 0.588\), . . . , \(x_{15} \approx 0.619\). So the sequence converges to \(\alpha \approx 0.619\).
If \(x_1 = 1\), then \(x_2 \approx 0.906\), \(x_3 \approx 0.825\), \(x_4 \approx 0.761\), \(x_5 \approx 0.713\), \(x_6 \approx 0.680\), . . . , \(x_{17} \approx 0.619\). So the sequence converges to \(\alpha \approx 0.619\).
If \(x_1 = 2\), then \(x_2 \approx 2.463\), \(x_3 \approx 3.913\), \(x_4 \approx 16.690\), \(x_5 \approx 5\,903\,230.335\). “Clearly”, the sequence diverges.
(iv) From the graph of \(y = e^x - 3x\), we observe that if \(\alpha < x_n < \beta\), then \(e^{x_n} - 3x_n < 0\) or \(\frac{1}{3}e^{x_n} < x_n\) or \(x_{n+1} < x_n\).
Similarly, we observe that if \(x_n < \alpha\) or \(x_n > \beta\), then \(e^{x_n} - 3x_n > 0\) or \(\frac{1}{3}e^{x_n} > x_n\) or \(x_{n+1} > x_n\).
(v) If \(x_n > \beta \approx 1.512\), then (iv) tells us that \(x_{n+1} > x_n\) and therefore that the sequence diverges. We saw this with \(x_1 = 2\) in (iii).
If \(x_n \in (\alpha, \beta) \approx (0.619, 1.512)\), then (iv) tells us that \(x_{n+1} < x_n\). We saw this with \(x_1 = 1\) in (iii).
If \(x_n < \alpha \approx 0.619\), then (iv) tells us that \(x_{n+1} > x_n\). We saw this with \(x_1 = 0\) in (iii).
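The behaviour described in (iii) to (v) can be reproduced by iterating \(x_{n+1} = \frac{1}{3}e^{x_n}\) directly. The sketch below is an illustration only; the helper `iterate` is a name chosen here. For the starting value 2 only five terms are computed, because later terms are too large for floating-point arithmetic.

```python
import math

def iterate(x1, terms):
    """Return [x_1, ..., x_terms] for the recurrence x_{n+1} = exp(x_n) / 3."""
    xs = [x1]
    for _ in range(terms - 1):
        xs.append(math.exp(xs[-1]) / 3)
    return xs

from_zero = iterate(0.0, 100)   # increases towards alpha
from_one = iterate(1.0, 100)    # decreases towards alpha
from_two = iterate(2.0, 5)      # grows very quickly: the sequence diverges

print(round(from_zero[-1], 3), round(from_one[-1], 3), from_two[-1])
```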
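A506 (9740 N2007/I/10)(i) We are given that the first term of the geometric progression equals \(a\). We are also given:
\[ra = a + 3d \quad (1), \qquad r^2 a = a + 5d \quad (2).\]
Hence:
\[d = \frac{a(r - 1)}{3} = \frac{a(r^2 - 1)}{5} \quad\text{or}\quad 5r - 5 = 3r^2 - 3 \quad\text{or}\quad 3r^2 - 5r + 2 = 0.\]
So \(r = 1\) or \(r = 2/3\). But if \(r = 1\), then from (1), \(d = 0\), contradicting our assumption that \(d \neq 0\). Hence, \(r = 2/3\). Since \(|r| < 1\), the geometric series converges to:
\[\frac{a}{1 - r} = 3a.\]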
(iii) From (1), \(d = \frac{ra - a}{3} = \frac{-a/3}{3} = -\frac{a}{9}\). And now:
\[S = \frac{n}{2}\left[a + a + (n - 1)d\right] = an + \frac{n - 1}{2}nd = an\left(1 - \frac{n - 1}{18}\right) = an \cdot \frac{19 - n}{18}.\]
\(S > 4a \iff an \cdot \frac{19 - n}{18} > 4a \iff n(19 - n) > 72 \iff n^2 - 19n + 72 < 0\) (3). By the quadratic formula, \(x^2 - 19x + 72 = 0\) has solutions:
\[x = \frac{19 \pm \sqrt{(-19)^2 - 4(1)(72)}}{2(1)} = \frac{19 \pm \sqrt{73}}{2} \approx 5.228, 13.772.\]
So, (3) holds if and only if \(5.228 < n < 13.772\). Of course, \(n\) must be an integer. And so, the set of possible values of \(n\) for which (3) or \(S > 4a\) holds is \(\{6, 7, 8, 9, 10, 11, 12, 13\}\).
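The set of valid \(n\) can be double-checked by evaluating the arithmetic sum directly. In the sketch below, \(a = 1\) is an assumption made purely for illustration (the working above divides by \(a\), which presumes \(a > 0\), and any positive \(a\) gives the same set); the other values come from (ii) and (iii).

```python
from fractions import Fraction

a = Fraction(1)        # assumption: any a > 0 gives the same set of n
r = Fraction(2, 3)     # common ratio found in (ii)
d = a * (r - 1) / 3    # common difference, from ra = a + 3d

def arithmetic_sum(n):
    """Sum of the first n terms of the arithmetic progression."""
    return Fraction(n, 2) * (2 * a + (n - 1) * d)

valid_n = [n for n in range(1, 51) if arithmetic_sum(n) > 4 * a]
print(valid_n)
```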
\[u_k = \frac{1}{k^2}.\]
We show that \(P(1)\) is true:
\[u_1 = 1 = \frac{1}{1^2}. \quad \checkmark\]
We next show that for all \(j \in \mathbb{N}\), if \(P(j)\) is true, then \(P(j + 1)\) is also true:
\[\begin{aligned}
u_{j+1} &= u_j - \frac{2j + 1}{j^2(j + 1)^2} \overset{P(j)}{=} \frac{1}{j^2} - \frac{2j + 1}{j^2(j + 1)^2} = \frac{(j + 1)^2 - (2j + 1)}{j^2(j + 1)^2} \\
&= \frac{j^2}{j^2(j + 1)^2} = \frac{1}{(j + 1)^2}. \quad \checkmark
\end{aligned}\]
(ii)
\[\begin{aligned}
\sum_{n=1}^{N} \frac{2n + 1}{n^2(n + 1)^2} &= \sum_{n=1}^{N}(u_n - u_{n+1}) \\
&= u_1 - u_2 + u_2 - u_3 + \dots + u_N - u_{N+1} \\
&= u_1 - u_{N+1} = 1 - \frac{1}{(N + 1)^2}.
\end{aligned}\]
(iii) In the formula just found in (ii), as \(N \to \infty\), the second term vanishes, so that the series converges to 1.
(iv) First observe that \(\frac{2n + 1}{n^2(n + 1)^2} = \frac{2(n + 1) - 1}{(n + 1)^2(n + 1 - 1)^2}\). Thus:
\[\sum_{n=2}^{N} \frac{2n - 1}{n^2(n - 1)^2} = \sum_{n=1}^{N-1} \frac{2n + 1}{n^2(n + 1)^2} = 1 - \frac{1}{N^2}.\]
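A508 (9233 N2007/I/14). Let \(P(k)\) be the following proposition:
\[\sum_{r=1}^{k} \sin rx = \frac{\cos\frac{1}{2}x - \cos\left(k + \frac{1}{2}\right)x}{2\sin\frac{1}{2}x}.\]
We next show that for all \(j \in \mathbb{N}\), if \(P(j)\) is true, then \(P(j + 1)\) is also true:
\[\begin{aligned}
\sum_{r=1}^{j+1} \sin rx &= \sum_{r=1}^{j} \sin rx + \sin(j + 1)x \\
&\overset{P(j)}{=} \frac{\cos\frac{1}{2}x - \cos\left(j + \frac{1}{2}\right)x}{2\sin\frac{1}{2}x} + \sin(j + 1)x \\
&= \frac{\cos\frac{1}{2}x - \cos\left(j + \frac{1}{2}\right)x + 2\sin\frac{1}{2}x\sin(j + 1)x}{2\sin\frac{1}{2}x} \\
&= \frac{\cos\frac{1}{2}x - \cos\left(j + \frac{1}{2}\right)x + \cos\left(j + \frac{1}{2}\right)x - \cos\left(j + \frac{3}{2}\right)x}{2\sin\frac{1}{2}x} \\
&= \frac{\cos\frac{1}{2}x - \cos\left(j + 1 + \frac{1}{2}\right)x}{2\sin\frac{1}{2}x}. \quad \checkmark
\end{aligned}\]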
A509 (9233 N2007/II/1). \(\sum_{r=1}^{2n} 3^{r+2} = 9\sum_{r=1}^{2n} 3^r = 9 \cdot 3 \cdot \frac{3^{2n} - 1}{3 - 1} = \frac{27}{2}(3^{2n} - 1)\).
A510 (9233 N2006/I/1). The first term is \(S_1 = 6 - \frac{2}{3^{1-1}} = 4\).
The common ratio is \((S_2 - S_1) \div S_1 = S_2 \div S_1 - 1 = \left(6 - \frac{2}{3^{2-1}}\right) \div 4 - 1 = \frac{4}{3} - 1 = \frac{1}{3}\).
A511 (9233 N2006/I/11)(i) Let \(P(k)\) be the following proposition:
\[\sum_{r=1}^{k} r^3 = \frac{1}{4}k^2(k + 1)^2.\]
We next show that for all \(j \in \mathbb{N}\), if \(P(j)\) is true, then \(P(j + 1)\) is also true:
\[\begin{aligned}
\sum_{r=1}^{j+1} r^3 &= \sum_{r=1}^{j} r^3 + (j + 1)^3 \overset{P(j)}{=} \frac{1}{4}j^2(j + 1)^2 + (j + 1)^3 \\
&= \frac{(j + 1)^2}{4}j^2 + \frac{(j + 1)^2}{4}(4j + 4) = \frac{(j + 1)^2}{4}(j^2 + 4j + 4) = \frac{(j + 1)^2}{4}(j + 2)^2. \quad \checkmark
\end{aligned}\]
(ii) \(2^3 + 4^3 + \dots + (2n)^3 = \sum_{r=1}^{n}(2r)^3 = 8\sum_{r=1}^{n} r^3 = 2n^2(n + 1)^2\).
(iii)
\[\begin{aligned}
\sum_{r=1}^{n}(2r - 1)^3 &= 1^3 + 3^3 + \dots + (2n - 1)^3 = 1^3 + 2^3 + \dots + (2n)^3 - \left[2^3 + 4^3 + \dots + (2n)^3\right] \\
&= \sum_{r=1}^{2n} r^3 - \sum_{r=1}^{n}(2r)^3 = \frac{1}{4}(2n)^2(2n + 1)^2 - 2n^2(n + 1)^2
\end{aligned}\]
r = a + tb and r ⋅ n = d.
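If the two lines intersect, then there exist real numbers \(\hat\lambda\) and \(\hat\mu\) such that:
\[\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} + \hat\lambda\begin{pmatrix} 3 \\ 1 \\ -2 \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ -1 \end{pmatrix} + \hat\mu\begin{pmatrix} 4 \\ 5 \\ a + 1 \end{pmatrix}\]
or:
\[3\hat\lambda = 1 + 4\hat\mu \quad (1), \qquad \hat\lambda = 2 + 5\hat\mu \quad (2), \qquad -2\hat\lambda = -1 + (a + 1)\hat\mu \quad (3).\]
\(3 \times (2)\) minus \((1)\) yields \(0 = 5 + 11\hat\mu\) or \(\hat\mu = -5/11\). And now from (2), \(\hat\lambda = -3/11\). If these values also satisfy (3), then:
\[-2\left(\frac{-3}{11}\right) = -1 + (a + 1)\left(\frac{-5}{11}\right) \quad\text{or}\quad a = -\frac{22}{5}.\]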
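(ii) Let \(R = (0, 0, 0) + \tilde\lambda(3, 1, -2) = \tilde\lambda(3, 1, -2)\).
If \(\angle PRQ\) is right, then \(\overrightarrow{PR} \perp \overrightarrow{QR}\) or \(\overrightarrow{PR} \cdot \overrightarrow{QR} = 0\). But:
\[\overrightarrow{PR} = R - P = \tilde\lambda\begin{pmatrix} 3 \\ 1 \\ -2 \end{pmatrix} - \begin{pmatrix} 1 \\ 2 \\ -1 \end{pmatrix} = \begin{pmatrix} 3\tilde\lambda - 1 \\ \tilde\lambda - 2 \\ -2\tilde\lambda + 1 \end{pmatrix} \quad\text{and}\quad \overrightarrow{QR} = R - Q = \tilde\lambda\begin{pmatrix} 3 \\ 1 \\ -2 \end{pmatrix} - \begin{pmatrix} 5 \\ 7 \\ -3 \end{pmatrix} = \begin{pmatrix} 3\tilde\lambda - 5 \\ \tilde\lambda - 7 \\ -2\tilde\lambda + 3 \end{pmatrix}.\]
So:
\[\begin{aligned}
\overrightarrow{PR} \cdot \overrightarrow{QR} &= (3\tilde\lambda - 1)(3\tilde\lambda - 5) + (\tilde\lambda - 2)(\tilde\lambda - 7) + (-2\tilde\lambda + 1)(-2\tilde\lambda + 3) \\
&= 14\tilde\lambda^2 - 35\tilde\lambda + 22.
\end{aligned}\]
This is a quadratic expression in \(\tilde\lambda\) with discriminant \((-35)^2 - 4(14)(22) = -7 < 0\). Hence, \(\overrightarrow{PR} \cdot \overrightarrow{QR} \neq 0\) or \(\overrightarrow{PR} \not\perp \overrightarrow{QR}\) and thus \(\angle PRQ\) cannot be right.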
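(iii) Let \(R = (0, 0, 0) + \bar\lambda(3, 1, -2) = \bar\lambda(3, 1, -2)\).
\[\left|\overrightarrow{PR}\right| = \sqrt{(3\bar\lambda - 1)^2 + (\bar\lambda - 2)^2 + (-2\bar\lambda + 1)^2} = \sqrt{14\bar\lambda^2 - 14\bar\lambda + 6}.\]
\(\left|\overrightarrow{PR}\right|\) is minimised at “\(-b/2a\)” (Fact 20):
\[\bar\lambda = -\frac{-14}{2(14)} = 0.5.\]
Hence, \(R = \bar\lambda(3, 1, -2) = (1.5, 0.5, -1)\) and \(\left|\overrightarrow{PR}\right| = \sqrt{14\bar\lambda^2 - 14\bar\lambda + 6} = \sqrt{3.5 - 7 + 6} = \sqrt{2.5}\).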
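The closest point can also be found by projecting \(P\) onto the direction of the line, which gives the same parameter as the “\(-b/2a\)” argument. The sketch below is a hedged illustration using \(P = (1, 2, -1)\), \(Q = (5, 7, -3)\) and the direction vector \((3, 1, -2)\) read off from the working above; it also recomputes the discriminant of \(14\tilde\lambda^2 - 35\tilde\lambda + 22\).

```python
import math

P = (1, 2, -1)
Q = (5, 7, -3)
direction = (3, 1, -2)   # the line passes through the origin with this direction

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

# Foot of the perpendicular from P: lambda = (P . d) / (d . d).
lam = dot(P, direction) / dot(direction, direction)
R = tuple(lam * c for c in direction)
distance = math.dist(P, R)

# Discriminant of PR . QR = 14 lambda^2 - 35 lambda + 22 (negative: no right angle).
discriminant = (-35) ** 2 - 4 * 14 * 22
print(lam, R, distance, discriminant)
```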
A514 (9740 N2016/I/5)(i) Method 1. u +v = (2 + a, −1, 2 + b), u −v = (2 − a, −1, 2 − b),
and so:
Method 2. Recall that the vector product has the following three properties (Fact 81): the
vector product is distributive and anti-commutative; moreover, the vector product
of a vector with itself is the zero vector. Hence:
(u + v) × (u − v) = u × (u − v) + v × (u − v) (Distributivity)
=u×u−u×v+v×u−v×v (“)
=u×u+v×u+v×u−v×v (Anti-commutativity)
=0+v×u+v×u−0 (Self vector product is zero)
= 2v × u.
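But:
\[v \times u = \begin{pmatrix} a \\ 0 \\ b \end{pmatrix} \times \begin{pmatrix} 2 \\ -1 \\ 2 \end{pmatrix} = \begin{pmatrix} b \\ 2b - 2a \\ -a \end{pmatrix}.\]
So:
\[(u + v) \times (u - v) = \begin{pmatrix} 2b \\ 4b - 4a \\ -2a \end{pmatrix}.\]
So:
\[(u + v) \times (u - v) = \begin{pmatrix} 2b \\ 4b - 4a \\ -2a \end{pmatrix} = \begin{pmatrix} -2a \\ -8a \\ -2a \end{pmatrix}.\]
And:
\[\left|(u + v) \times (u - v)\right| = \sqrt{(-2a)^2 + (-8a)^2 + (-2a)^2} = \sqrt{72a^2} = \sqrt{72}\,|a|.\]
If \(\sqrt{72}\,|a| = 1\), then \(a = \pm 1/\sqrt{72}\).
(c) Recall that the scalar product is distributive and commutative. Hence:
\[(u + v) \cdot (u - v) = u \cdot u - u \cdot v + v \cdot u - v \cdot v = u \cdot u - v \cdot v = 0.\]
Or:
\[v \cdot v = u \cdot u \quad\text{or}\quad |v|^2 = |u|^2 \quad\text{or}\quad |v| = |u| = \sqrt{2^2 + (-1)^2 + 2^2} = \sqrt{9} = 3.\]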
A515 (9740 N2016/I/11)(i)(a) Let w = (−2, 1, 2) be l’s direction vector. We show that
w is perpendicular to the two non-parallel vectors u = (1, 2, 0) and v = (a, 4, −2) = (0, 4, −2)
on p:
⎛ 1 ⎞ ⎛1 ⎞ ⎛ 0 ⎞ ⎛ −1 ⎞ ⎛ −2 ⎞ 1 + λ̂ = −1 − 2t̂,
1
And now plugging (6) back into (4) and (5), we have \(\hat\lambda = -8/9\) and \(\hat\mu = 19/18\).
(i)(b) “Obviously”, the two planes must be parallel to p, with one “above” it and the other
“below” it.
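[Figure: the plane \(p\), containing the point \(B\), lies between an upper plane containing \(C = B + 12\hat{n}\) and a lower plane containing \(D = B - 12\hat{n}\).]
The plane \(p\) has normal vector \(n = (1, 2, 0) \times (0, 4, -2) = (-4, 2, 4)\). We have \(\hat{n} = n/|n| = (-4, 2, 4)/\sqrt{36} = \frac{1}{3}(-2, 1, 2)\).
The plane \(p\) contains the point \(B = (1, -3, 2)\).
The upper plane contains the point \(C = B + 12\hat{n} = B + 4(-2, 1, 2) = (-7, 1, 10)\).
The lower plane contains the point \(D = B - 12\hat{n} = B - 4(-2, 1, 2) = (9, -7, -6)\).
Both planes have normal vector \((-2, 1, 2)\). We have:
\[\overrightarrow{OC} \cdot (-2, 1, 2) = 14 + 1 + 20 = 35 \quad\text{and}\quad \overrightarrow{OD} \cdot (-2, 1, 2) = -18 - 7 - 12 = -37.\]
So, the two planes are \(r \cdot (-2, 1, 2) = 35\) and \(r \cdot (-2, 1, 2) = -37\).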
(ii) If a line and a plane intersect at zero or more than one points, then they are parallel
(Fact 107). And so, w ⊥ n or w ⋅ n = 0.
But n = (1, 2, 0) × (a, 4, −2) = (−4, 2, 4 − 2a).
So w ⋅ n = (−2, 1, 2) ⋅ (−4, 2, 4 − 2a) = 18 − 4a = 0 or a = 18/4 = 9/2.
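A516 (9740 N2015/I/7)(i) \(\overrightarrow{OC} = 0.6a\) and \(\overrightarrow{OD} = \frac{5}{11}b\).
(ii) \(\overrightarrow{BC} = \overrightarrow{OC} - \overrightarrow{OB} = 0.6a - b\) and so the line \(BC\) can be written as \(r = b + \lambda(0.6a - b) = 0.6\lambda a + (1 - \lambda)b\), for \(\lambda \in \mathbb{R}\), as desired.
\(\overrightarrow{AD} = \overrightarrow{OD} - \overrightarrow{OA} = \frac{5}{11}b - a\) and so the line \(AD\) can be written as \(r = a + \mu\left(\frac{5}{11}b - a\right) = (1 - \mu)a + \frac{5}{11}\mu b\), for \(\mu \in \mathbb{R}\), as desired.
(iii) Where the lines meet, we have \(0.6\lambda a + (1 - \lambda)b = (1 - \mu)a + \frac{5}{11}\mu b\). Equating the coefficients, we have \(0.6\lambda = 1 - \mu\) (1) and \(\frac{5}{11}\mu = 1 - \lambda\) (2). From (1), we have \(\mu = 1 - 0.6\lambda\). Plugging this into (2), we have \(\frac{5}{11}(1 - 0.6\lambda) = 1 - \lambda \iff 1 - 0.6\lambda = \frac{11}{5} - \frac{11}{5}\lambda \iff \frac{8}{5}\lambda = \frac{6}{5} \iff \lambda = 3/4\). And \(\mu = 0.55\). Altogether then, the position vector of \(E\) is \(0.45a + 0.25b\).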
(ii) The vector from \(P\) to a generic point on \(L\) is \((2, 5, -6) - r = (2, 5, -6) - (1, -2, -4) - (2\lambda, 3\lambda, -6\lambda) = (1 - 2\lambda, 7 - 3\lambda, -2 + 6\lambda)\). The length of this vector is
\[\sqrt{(1 - 2\lambda)^2 + (7 - 3\lambda)^2 + (-2 + 6\lambda)^2} = \sqrt{49\lambda^2 - 70\lambda + 54}.\]
Now plug (1) and (2) into the equation for \(r\) to get \(5x - 4(1.25x - 4) + \mu(0.5x - 2) = -9 \iff 0.5\mu x + 25 - 2\mu = 0\) (3) \(\iff x = \frac{4\mu - 50}{\mu}\) (4).
So if \(\mu = 3\), from (1), (2), and (4), we have:
\[x = -\frac{38}{3}, \quad y = -\frac{119}{6}, \quad z = -\frac{25}{3}.\]
(ii) From (3), if \(\mu = 0\), then we have \(25 = 0\), a contradiction. So the three planes do not intersect.
A521 (9740 N2013/I/6)(i) Every vector in the plane can be expressed as the linear
combination of any two vectors with distinct directions (see Fact 61).
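(ii) By the Ratio Theorem, \(\overrightarrow{ON} = \frac{4a + 3c}{7}\).
(iii) The area of triangle \(ONC\) is
\[\begin{aligned}
0.5\left|\overrightarrow{ON} \times \overrightarrow{OC}\right| &= 0.5\left|\frac{4a + 3c}{7} \times c\right| \\
&= \frac{1}{14}\left|(4a + 3c) \times c\right| \\
&= \frac{1}{14}\left|4a \times c + 3c \times c\right| && \text{(distributivity of vector product)} \\
&= \frac{1}{14}\left|4a \times c\right| && (v \times v = 0) \\
&= \frac{1}{14}\left|4a \times (\lambda a + \mu b)\right| \\
&= \frac{1}{14}\left|4a \times \lambda a + 4a \times \mu b\right| && \text{(distributivity of vector product)} \\
&= \frac{1}{14}\left|4a \times \mu b\right| = \frac{2\mu}{7}\left|a \times b\right|.
\end{aligned}\]
Similarly, the area of triangle \(OMC\) is
\[\begin{aligned}
0.5\left|\overrightarrow{OM} \times \overrightarrow{OC}\right| &= 0.5\left|0.5b \times c\right| \\
&= \frac{1}{4}\left|b \times c\right| \\
&= \frac{1}{4}\left|b \times (\lambda a + \mu b)\right| \\
&= \frac{1}{4}\left|b \times \lambda a + b \times \mu b\right| && \text{(distributivity of vector product)} \\
&= \frac{1}{4}\left|b \times \lambda a\right| && (v \times v = 0) \\
&= \frac{1}{4}\lambda\left|b \times a\right|.
\end{aligned}\]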
(ii) (a) Since a is a unit vector, (2p)2 + (6p)2 + (3p)2 = 1 or 49p2 = 1 or p = 1/7.
(b) ∣a ⋅ b∣ is the length of the projection vector of b on a.
(c) a × b = 1/7(2, −6, 3) × (1, 1, −2) = 1/7(9, 7, 8).
A526 (9740 N2011/I/11)(i) A normal vector to the plane is
Another normal vector to the plane is a scalar multiple of the above, namely (1, 1, 2). We
have (4, −1, −3) ⋅ (1, 1, 2) = −3. Hence, a cartesian equation of p is x + y + 2z = −3.
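(ii) From the equations for \(l_1\), we have
\[\frac{x - 1}{2} = z + 3 \iff x = 2(z + 3) + 1 = 2z + 7 \quad (1) \qquad\text{and}\qquad \frac{y - 2}{-4} = z + 3 \iff y = -4z - 10 \quad (2).\]
Plug in (1) and (2) into the equations for \(l_2\) to get
\[\frac{2z + 7 + 2}{1} \overset{3}{=} \frac{-4z - 10 - 1}{5} \overset{4}{=} \frac{z - 3}{k}.\]
From \(\overset{3}{=}\), \(5(2z + 9) = -4z - 11\) or \(z = -4\). And now from \(\overset{4}{=}\):
\[k = 5 \cdot \frac{z - 3}{-4z - 11} = 5 \cdot \frac{-7}{5} = -7.\]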
(iii) The direction vector of l1 is perpendicular to the normal vector of the plane p, as we
can verify — (2, −4, 1) ⋅ (1, 1, 2) = 0. Moreover, a point on l1 is on p, as we can verify —
(1, 2, −3) ⋅ (1, 1, 2) = −3. Altogether then, l1 is on p.
From the equations for l2 , we have y = 5x+11 and z = −7x−11. Plug these into the equation
for the plane p to get: x + (5x + 11) + 2(−7x − 11) = −3 ⇐⇒ −8x − 11 = −3 ⇐⇒ x = −1. So
y = 6 and z = −4. The intersection point is (−1, 6, −4).
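(ii)
\[(a + b) \cdot (a - b) = \left(\frac{6}{7} + 1, \frac{9}{7} + 2, \frac{18}{7} + 2\right) \cdot \left(\frac{6}{7} - 1, \frac{9}{7} - 2, \frac{18}{7} - 2\right) = -\frac{13}{49} - \frac{115}{49} + \frac{128}{49} = 0.\]
(Optional. Actually, more generally, since \((a + b) \cdot (a - b) = |a|^2 - |b|^2\), if \(|a| = |b|\), then \((a + b) \cdot (a - b) = 0\).)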
A528 (9740 N2010/I/10)(i) The line has direction vector (−3, 6, 9), which is a scalar
multiple of the plane’s normal vector (1, −2, −3). So the line is perpendicular to the plane.
(ii) From the equations of the line, we have y = −2x + 19 and z = −3x + 27. Plug these in to
the equation of the plane to get x − 2(−2x + 19) − 3(−3x + 27) = 0 ⇐⇒ 14x − 119 = 0 ⇐⇒
x = 119/14 = 8.5. And so y = 2 and z = 1.5. So the point of intersection is (8.5, 2, 1.5).
(iii) We can easily verify that the given point satisfies the equations for the line: \(\frac{-2 - 10}{-3} = 4 = \frac{23 + 1}{6} = \frac{33 + 3}{9}\). The point is therefore on the line.
The point of intersection we found in (ii) (call it X) is equidistant to both A and B.
Moreover, these three points are collinear. Thus, B = (19, −19, −30).
(iv) The area of triangle OAB is
A529 (9740 N2009/I/10)(i) The angle \(\theta\) between the two planes is given by
\[\theta = \cos^{-1}\left(\frac{(2, 1, 3) \cdot (-1, 2, 1)}{|(2, 1, 3)|\,|(-1, 2, 1)|}\right) = \cos^{-1}\left(\frac{3}{\sqrt{14}\sqrt{6}}\right) \approx 1.237.\]
(ii) The line l has direction vector (2, 1, 3) × (−1, 2, 1) = (−5, −5, 5) and thus also direction
vector (1, 1, −1).
A point (x, y, z) that lies on both planes satisfies 2x+y +3z = 1 and −x+2y +z = 2. Plugging
1 2
Altogether then, the line l has vector equation r = (0, 1, 0) + λ(1, 1, −1), for λ ∈ R.
(iii) The line l is parallel to the plane p3 , as we now verify: (1, 1, −1) ⋅ (2 − k, 1 + 2k, 3 + k) =
2 − k + 1 + 2k − 3 − k = 0. Moreover, the point (0, 1, 0), which is on the line l, is also on the
plane p3 , as we now verify: 2 × 0 + 1 + 3 × 0 − 1 + k(−0 + 2 × 0 + 0 − 2) = 0. Altogether then,
the line l lies in p3 for any constant k.
$x = \frac{\mu - 17}{22 + \lambda} = -\frac{0.4}{1.1} = -\frac{4}{11}$. So the point of intersection is $\left(-\frac{4}{11}, -\frac{4}{11}, \frac{7}{11}\right)$.
(i) The line has direction vector (2, −5, 3) × (3, 2, −5) = (19, 19, 19) and thus also direction
vector (1, 1, 1). From our work above, x = y at the intersection of the two planes. Plug in
x = 0 to find that the two planes intersect at (0, 0, 1). Altogether then, the line has vector
equation r = (0, 0, 1) + α(1, 1, 1), for α ∈ R.
(ii) Two points on the line are (0, 0, 1) and (−1, −1, 0). Plug these into the equation for
plane p3 to get 17 = µ and −5 − λ = µ, so that µ = 17 and λ = −22.
(iii) The line l must be parallel to the plane p3 , so that (1, 1, 1) ⋅ (5, λ, 17) = 0 or λ = −22.
Moreover, the point (0, 0, 1) on the line is not on the plane, so that µ ≠ 17.
(iv) Another vector that is parallel to the plane to be found is (1, −1, 3)−(0, 0, 1) = (1, −1, 2).
The plane thus has normal vector (1, 1, 1) × (1, −1, 2) = (3, −1, −2). Compute also d =
(0, 0, 1) ⋅ (3, −1, −2) = −2. Altogether then, the plane has cartesian equation 3x − y − 2z = −2.
A535 (9740 N2007/I/8)(i) The line l has vector equation r = (1, 2, 4) + λ(−3, 1, −3), for
λ ∈ R. Plugging this into the equation for the plane, we have 3(1−3λ)−(2+λ)+2(4−3λ) = 17
⇐⇒ 9 − 16λ = 17 ⇐⇒ λ = −0.5. So the point of intersection is (1, 2, 4) − 0.5(−3, 1, −3) =
(2.5, 1.5, 5.5).
(ii) The angle between l and the normal vector to p is
$$\cos^{-1}\frac{(-3, 1, -3) \cdot (3, -1, 2)}{|(-3, 1, -3)|\,|(3, -1, 2)|} = \cos^{-1}\frac{-16}{\sqrt{19}\sqrt{14}} \approx 2.946.$$
So the angle between the line and the plane is 2.946 − π/2 ≈ 1.376.
(iii) $|d - a \cdot \hat{n}| = \frac{|17 - (1, 2, 4) \cdot (3, -1, 2)|}{\sqrt{14}} = \frac{|17 - 9|}{\sqrt{14}} = \frac{8}{\sqrt{14}} = \frac{4\sqrt{14}}{7} \approx 2.138.$
A536 (9233 N2007/I/7). The foot of the perpendicular from a point A to a line is $Q + (\overrightarrow{QA} \cdot \hat{v})\hat{v}$, where Q is any point on the line and v is the line’s direction vector. Hence,
$$P = (-3, 8, 3) + \frac{(4, -5, -5) \cdot (2, 2, 3)}{\sqrt{17}} \cdot \frac{(2, 2, 3)}{\sqrt{17}} = (-3, 8, 3) - \frac{17(2, 2, 3)}{17} = (-5, 6, 0).$$
$|\overrightarrow{AP}| = |(-6, 3, 2)| = \sqrt{49} = 7.$
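The projection formula can be sketched in a few lines. The function name is hypothetical, and the coordinates A = (1, 3, −2) are an assumption inferred from Q = (−3, 8, 3) and the vector QA = (4, −5, −5) used above.

```python
from fractions import Fraction

def foot_of_perpendicular(a, q, v):
    """Foot of the perpendicular from point a to the line through q with direction v."""
    dot = lambda u, w: sum(x * y for x, y in zip(u, w))
    qa = [ai - qi for ai, qi in zip(a, q)]
    t = Fraction(dot(qa, v), dot(v, v))   # equals (QA . v_hat) / |v|
    return tuple(qi + t * vi for qi, vi in zip(q, v))

A = (1, 3, -2)                            # assumed: A = Q + QA
foot = foot_of_perpendicular(A, (-3, 8, 3), (2, 2, 3))
distance_squared = sum((ai - fi) ** 2 for ai, fi in zip(A, foot))
print(foot, distance_squared)
```

Dividing by v · v rather than normalising v first avoids square roots, so the arithmetic stays exact.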
A537 (9233 N2007/II/2)(i) $\overrightarrow{OD} = 0.75(1, -3, 4)$. So the line AD has direction vector
(3.25, 3.25, 0) and hence also direction vector (1, 1, 0). So the line AD has equation r =
(4, 1, 3) + λ(1, 1, 0), for λ ∈ R.
(ii) $\overrightarrow{OC} = 0.25(4, 1, 3)$. So the line BC has direction vector (0, 3.25, −3.25) and hence also
direction vector (0, 1, −1). So the line BC has equation r = (1, −3, 4) + µ(0, 1, −1), for µ ∈ R.
Setting the equations of the two lines equal to each other, we have 4 + λ = 1, 1 + λ = −3 + µ,
and 3 = 4 − µ, so that λ = −3 and µ = 1. And the point of intersection is (1, −2, 3).
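A538 (9233 N2006/I/14). By the Ratio Theorem, $\overrightarrow{OP} = (1 - \lambda)\overrightarrow{OA} + \lambda\overrightarrow{OB} = (1 - \lambda)(1, -2, 5) + \lambda(1, 3, 0) = (1, -2 + 5\lambda, 5 - 5\lambda)$. And $\overrightarrow{OQ} = (1 - \mu)\overrightarrow{OC} + \mu\overrightarrow{OD} = (1 - \mu)(10, 1, 2) + \mu(-2, 4, 5) = (10 - 12\mu, 1 + 3\mu, 2 + 3\mu)$.
(i) PQ has direction vector $\overrightarrow{AB} \times \overrightarrow{CD} = (0, 5, -5) \times (-12, 3, 3) = (30, 60, 60)$ and hence also
direction vector (1, 2, 2).
Moreover, $\overrightarrow{PQ} = \overrightarrow{OQ} - \overrightarrow{OP} = (9 - 12\mu, 3 + 3\mu - 5\lambda, -3 + 3\mu + 5\lambda)$, which must be a scalar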
multiple of (1, 2, 2). And so 3 + 3µ − 5λ = 2(9 − 12µ) and −3 + 3µ + 5λ = 2(9 − 12µ). Taking
1 2
z= = = =
Multiply by $\frac{1+i}{1+i}$: $\frac{1 \pm 3i}{1 - i} = \frac{1 \pm 3i}{1 - i} \cdot \frac{1 + i}{1 + i} = \frac{1 + i \pm 3i \mp 3}{1^2 + 1^2} = -1 + 2i$ or $2 - i$.
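(b)(i) $\omega^2 = (1 - i)^2 = 1 - 1 - 2i = -2i$, $\omega^3 = (1 - i)(-2i) = -2 - 2i$, $\omega^4 = (1 - i)(-2 - 2i) = -2 - 2i + 2i - 2 = -4$.
We are given that:
$$\omega^4 + p\omega^3 + 39\omega^2 + q\omega + 58 = -4 + p(-2 - 2i) + 39(-2i) + q(1 - i) + 58 = (q - 2p + 54) + i(-q - 2p - 78) = 0.$$
Both the real and the imaginary parts must equal zero: q − 2p + 54 = 0 and −q − 2p − 78 = 0. Solving, p = −6 and q = −66.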
(c)(i) Since the coefficients of the given quartic equation are all real, by the Complex
Conjugate Root Theorem (Theorem 12), ω∗ = 1 + i also solves the given equation. Thus, a
quadratic factor of the given quartic polynomial is $(\omega - 1 + i)(\omega - 1 - i) = \omega^2 + 1 - 2\omega + 1 = \omega^2 - 2\omega + 2$.
Now write:
$$\omega^4 - 6\omega^3 + 39\omega^2 - 66\omega + 58 = (\omega^2 - 2\omega + 2)(a\omega^2 + b\omega + c) = a\omega^4 + (b - 2a)\omega^3 + ?\omega^2 + ?\omega + 2c,$$
where ?’s are coefficients we didn’t bother to compute. Comparing coefficients, we have
a = 1, b = −4, and c = 29.
Thus: $\omega^4 - 6\omega^3 + 39\omega^2 - 66\omega + 58 = (\omega^2 - 2\omega + 2)(\omega^2 - 4\omega + 29)$.
A540 (9740 N2016/I/7)(a) Simply plug in −1 + 5i and verify that the equation holds:
$$(-1 + 5i)^2 + (-1 - 8i)(-1 + 5i) + (-17 + 7i) = 1 - 25 - 10i + 1 - 5i + 8i + 40 - 17 + 7i = 0.$$
Note that the Complex Conjugate Roots Theorem (Theorem ???) does not apply here
because the coefficients of the given quadratic equation are not all real.
Let a + ib be the other root. Then:
$w^2 + (-1 - 8i)w + (-17 + 7i) = (w + 1 - 5i)(w - a - ib) = w^2 + (1 - a - 5i - ib)w + ?$,
where we didn’t bother computing ? because it isn’t necessary.
Comparing coefficients, we have 1 − a = −1 or a = 2 and −5i − ib = −8i or b = 3. Hence, the
other root is 2 + 3i.
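Both roots can be checked by direct substitution using Python's built-in complex numbers. This is a small sketch; the function name `quadratic` is illustrative.

```python
def quadratic(w):
    """Left-hand side of w^2 + (-1 - 8i)w + (-17 + 7i) = 0."""
    return w * w + (-1 - 8j) * w + (-17 + 7j)

given_root = -1 + 5j
other_root = 2 + 3j

# Both values should vanish; the sum and product of the roots should match
# -(coefficient of w) and the constant term respectively.
print(quadratic(given_root), quadratic(other_root))
print(given_root + other_root, given_root * other_root)
```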
(b) Plug in 1 + ai into the given equation:
(3, 1).
arg z = tan−1 0.4 traces out a ray from the origin, with slope 0.4.
[Figure: Argand diagram showing the circle ∣z − 3 − i∣ = 1 centred at (3, 1), with the points z1 and z2 marked.]
(a)(ii) We are asked to find the two points that are on the circle and equidistant from z1
and z2 . These are the two blue points depicted in the figure above.
The line connecting these two blue points is perpendicular to the line with slope 0.4. It
thus has slope −1/0.4 = −2.5 and direction vector (2, −5), whose unit vector is $\frac{1}{\sqrt{29}}(2, -5)$.
Moreover, it passes through the point (3, 1). Thus, it may be described by the vector
equation:
$$r = (3, 1) + \frac{\lambda}{\sqrt{29}}(2, -5), \text{ for } \lambda \in \mathbb{R}.$$
The circle’s radius is 1. And so, the two blue points are given by:
$$(3, 1) + \frac{1}{\sqrt{29}}(2, -5) = \left(3 + \frac{2}{\sqrt{29}}, 1 - \frac{5}{\sqrt{29}}\right) \text{ or } (3, 1) - \frac{1}{\sqrt{29}}(2, -5) = \left(3 - \frac{2}{\sqrt{29}}, 1 + \frac{5}{\sqrt{29}}\right).$$
(b)(i) $|w| = |2 - 2i| = \sqrt{2^2 + (-2)^2} = 2\sqrt{2}$ and $\arg w = \arg(2 - 2i) = -\cos^{-1}\frac{2}{2\sqrt{2}} = -\cos^{-1}\frac{1}{\sqrt{2}} = -\frac{\pi}{4}$.
Thus, $w = 2\sqrt{2}e^{-i\pi/4}$ and the cube roots of w are $(2\sqrt{2})^{1/3}e^{-i\pi/12} = \sqrt{2}e^{-i\pi/12}$, $\sqrt{2}e^{(-\pi/12 + 2\pi/3)i} = \sqrt{2}e^{7i\pi/12}$, and $\sqrt{2}e^{(-\pi/12 - 2\pi/3)i} = \sqrt{2}e^{-3i\pi/4}$.
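The three cube roots can be reproduced numerically with the standard `cmath` module. This is a sketch of the polar-form method used above, not a general-purpose root finder.

```python
import cmath
import math

w = 2 - 2j
modulus, argument = cmath.polar(w)        # |w| and arg w in (-pi, pi]

# Cube roots: modulus^(1/3) * exp(i * (arg w + 2*pi*k) / 3) for k = -1, 0, 1.
cube_roots = [cmath.rect(modulus ** (1 / 3), (argument + 2 * math.pi * k) / 3)
              for k in (-1, 0, 1)]
arguments = [cmath.phase(z) for z in cube_roots]
print(cube_roots, arguments)
```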
(b)(ii) $\arg(w^* w^n) = \arg w^* + n \arg w + 2k\pi = \frac{\pi}{4} - \frac{n\pi}{4} + 2k\pi = \frac{\pi}{2}\left(\frac{1 - n}{2} + 4k\right)$, where k = −1, 0, 1.
So, $\frac{1 - n}{2} + 4k = 1$ and n > 0. So we must have k = 1 and thus $\frac{1 - n}{2} = -3$ or n = 7.
A542 (9740 N2015/I/9)(a)
$$\frac{w^2}{w^*} = \frac{(a + ib)^2}{a - ib} = \frac{a^2 - b^2 + 2abi}{a - ib} = \frac{a^2 - b^2 + 2abi}{a - ib} \times \frac{a + ib}{a + ib} = \frac{1}{a^2 + b^2}\left[(a^3 - 3ab^2) + i(3a^2 b - b^3)\right]$$
is purely imaginary if and only if $a^3 - 3ab^2 = 0$. But $a^3 - 3ab^2 = a(a^2 - 3b^2) = a(a - \sqrt{3}b)(a + \sqrt{3}b)$.
So either $b = \pm a/\sqrt{3}$ or a = 0 (but the latter is explicitly ruled out in the question).
Altogether, the possible values of w = a + ib are given by $b = \pm a/\sqrt{3}$, where a is any non-zero
real number.
(b)(i) $z^5 = 2^5 e^{i\pi(-0.5)} = 2^5 e^{i\pi(-0.5 + 2k)}$ for k ∈ Z. So $z = 2e^{i\pi(-0.5 + 2k)/5}$ for k = 0, ±1, ±2. So
∣z∣ = 2 and arg z = −0.9π, −0.5π, −0.1π, 0.3π, 0.7π.
where the last line uses the fact that sin(π − x) = sin x.
A543 (9740 N2014/I/5)(i) $z^2 = (1 + 2i)^2 = 1^2 + (2i)^2 + 2(1)(2i) = 1 - 4 + 4i = -3 + 4i$.
$z^3 = (-3 + 4i)(1 + 2i) = -3 - 6i + 4i + (4i)(2i) = -3 - 2i - 8 = -11 - 2i$. So:
$$\frac{1}{z^3} = \frac{1}{-11 - 2i} = \frac{1}{-11 - 2i} \times \frac{-11 + 2i}{-11 + 2i} = \frac{-11 + 2i}{11^2 - (2i)^2} = \frac{-11 + 2i}{121 + 4} = \frac{-11 + 2i}{125}.$$
(ii) Since $pz^2 + \frac{q}{z^3} = p(-3 + 4i) + q\,\frac{-11 + 2i}{125} = \left(-3p - \frac{11}{125}q\right) + i\left(4p + \frac{2}{125}q\right)$ is real, we have
$4p + \frac{2}{125}q = 0$ or q = −250p. And $pz^2 + \frac{q}{z^3} = 19p$.
[Figure: Argand diagram showing the circle {z : ∣z + 5 − i∣ = 4} with centre (−5, 1) and radius 4.]
(ii) The complex equation ∣z − 6i∣ = ∣z + 10 + 4i∣ is equivalent to the cartesian equation
$(x - 0)^2 + (y - 6)^2 = (x + 10)^2 + (y + 4)^2$ or −12y + 36 = 20x + 100 + 8y + 16 or y = −x − 4.
(ii) $\arg\left(\frac{w^n}{w^*}\right) = \arg w^n - \arg w^* + 2k\pi = n \arg w + \arg w + 2k\pi = (n + 1) \arg w + 2k\pi = (n + 1) \times (-\pi/6) + 2k\pi$. A complex number z is real if and only if arg z = 0 or arg z = π. So by
observation, the three smallest positive whole number values of n for which $\frac{w^n}{w^*}$ is real are
5, 11, and 17.
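The "by observation" step can be confirmed by brute force. In this sketch the modulus of w is taken to be 1, which is an assumption made purely for convenience: whether the quotient is real depends only on arg w = −π/6.

```python
import cmath
import math

# Assumed representative: any w with arg w = -pi/6 would do; modulus 1 is used here.
w = cmath.rect(1.0, -math.pi / 6)

# Collect the positive integers n (up to 17) for which w^n / w* has zero imaginary part.
real_n = [n for n in range(1, 18)
          if abs((w ** n / w.conjugate()).imag) < 1e-9]
print(real_n)
```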
A545 (9740 N2013/I/4)(i) $(1 + 2i)^3 = 1 + 3 \times 2i + 3 \times (2i)^2 + (2i)^3 = 1 + 6i - 12 - 8i = -11 - 2i$.
(ii) Since w = 1 + 2i is a root for $az^3 + 5z^2 + 17z + b = 0$, we have
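$$0 = a(1 + 2i)^3 + 5(1 + 2i)^2 + 17(1 + 2i) + b$$
[Figure: Argand diagram showing the locus {z = re^{iθ} : θ ∈ [0, π/2]}, with the points re^{i0} and 2re^{i(−π/3)} marked.]
(iii) $\arg\left(\frac{z^{10}}{w^2}\right) = \arg z^{10} - \arg w^2 + 2k\pi = 10 \arg z - 2 \arg w + 2k\pi = 10\theta - 2\left(-\frac{\pi}{3} + \theta\right) + 2k\pi = 8\theta + \frac{2\pi}{3} + 2k\pi = \pi$, so $\theta = \frac{\pi}{24}$ (with k = 0).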
A547 (9740 N2012/I/6)(i) $z^3 = (1 + ic)^3 = 1^3 + 3(ic) + 3(ic)^2 + (ic)^3 = 1 + 3ic - 3c^2 - ic^3 = (1 - 3c^2) + i(3c - c^3)$.
(ii) $z^3$ is real if and only if $3c - c^3 = 0$ or $c = 0, \pm\sqrt{3}$. The question already ruled out c = 0.
So $c = \pm\sqrt{3}$ and $z = 1 \pm i\sqrt{3}$.
(iii) $z = 1 - i\sqrt{3} = |z|e^{i \arg z} = 2e^{i(-\pi/3)}$. $|z^n| = 2^n > 1000$ if and only if n > 9. (The reason is
that $2^9 = 512$ and $2^{10} = 1024$.) So the smallest positive integer n is 10.
$|z^{10}| = 2^{10}$ and $\arg z^{10} = 10(-\pi/3) + 2k\pi = 2\pi/3$ (k = 2).
A548 (9740 N2012/II/2)(i) ∣z − (7 − 3i)∣ = 4 describes a circle with centre 7 − 3i and
radius 4.
[Figure: Argand diagram showing the circle {z : ∣z − (7 − 3i)∣ = 4} with centre c = (7, −3) and radius 4, the origin o, and the points a, b, and d.]
(ii)(a) a is the point on the circle’s circumference that is closest to the origin o. The line
l through the origin and the centre of the circle passes through a (see Fact ??).
The distance of the centre of the circle from the origin is $\sqrt{7^2 + 3^2} = \sqrt{58}$. The distance
of the centre of the circle to the point a is 4 (this is simply the length of the radius). Hence,
the distance of the origin to the point a is $\sqrt{58} - 4$.
(b) △abc is right. So $ab^2 + bc^2 = ca^2 = 4^2 = 16$.
But the line l has gradient $-\frac{3}{7}$ (because it runs through the origin and the point 7 − 3i) and
so $ab = \frac{3}{7}bc$. Hence, $\left(\frac{3}{7}\right)^2 bc^2 + bc^2 = 16$. Or $bc^2 = 16 \times \frac{49}{58}$. Or $bc = 4 \times \frac{7}{\sqrt{58}} = \frac{28}{\sqrt{58}}$. And
$ab = \frac{12}{\sqrt{58}}$. Hence, $a = \left(7 - \frac{28}{\sqrt{58}}, -3 + \frac{12}{\sqrt{58}}\right)$.
(iii) By observation, d is the point where ∣arg z∣ is as large as possible. There, arg z = arg(7 − 3i) − ∠cod, because d lies clockwise of the line oc.
△cod is right. So $\angle cod = \sin^{-1}\frac{4}{\sqrt{58}}$. Moreover, $\arg(7 - 3i) = \tan^{-1}\frac{-3}{7}$.
Altogether then, $\arg z = \tan^{-1}\frac{-3}{7} - \sin^{-1}\frac{4}{\sqrt{58}} \approx -0.9579$.
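The value −0.9579 can be double-checked by sampling points on the circle and taking the smallest argument. This is a rough sketch: the sample count is arbitrary, and the brute-force search is only a sanity check, not part of the intended method.

```python
import math

centre = complex(7, -3)
radius = 4

# Closed form: argument of the centre, lowered by the angle at the origin
# between the centre and the tangent point.
closed_form = math.atan2(-3, 7) - math.asin(radius / abs(centre))

# Brute force: smallest argument over many points on the circle.
# (The circle lies entirely in the half-plane x > 0, so atan2 is continuous on it.)
samples = 100_000
brute_force = min(
    math.atan2(z.imag, z.real)
    for z in (centre + radius * complex(math.cos(2 * math.pi * i / samples),
                                        math.sin(2 * math.pi * i / samples))
              for i in range(samples))
)
print(round(closed_form, 4), round(brute_force, 4))
```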
A549 (9740 N2011/I/10)(i) Let $(x + iy)^2 = x^2 - y^2 + i(2xy) = -8i$. So $x^2 - y^2 = 0$ (1) and
2xy = −8 (2). From (2), we observe that x and y must have opposite signs. From (1), x = ±y and
by our observation of the previous sentence, we must have x = −y. And now from (2), we
have $-2x^2 = -8$ or x = ±2. So the two roots are 2 − 2i and −2 + 2i.
[Figure: Argand diagram showing the two loci ∣z − z1∣ = ∣z − z2∣ and ∣w − w1∣ = ∣w − w2∣.]
(b) This is simply the line that is equidistant to w1 = (−3, 1) and w2 = (−1, −1). By observation, it has cartesian equation y = x + 2.
(iv) The two lines are parallel and do not intersect.
A550 (9740 N2011/II/1)(i) This is simply the circle with radius 3 and centre 2 + 5i,
including all the points within the circle.
[Figure: Argand diagram showing the disc {z : ∣z − (2 + 5i)∣ ≤ 3} with centre c = (2, 5) and radius 3, and the points a, b, P, P1, P2, and (6, 1).]
(ii) The points on the circle’s circumference that are closest to and furthest from the origin
o are a and b. The line l through the origin and the centre of the circle passes through both
a and b (see Fact ??).
$oc = \sqrt{2^2 + 5^2} = \sqrt{29}$ and ac = 3. Hence, $oa = \sqrt{29} - 3$. Symmetrically, $ob = \sqrt{29} + 3$. The
maximum and minimum possible values of ∣z∣ are thus $\sqrt{29} \pm 3$.
(iii) The locus of points that satisfy both ∣z − 2 − 5i∣ ≤ 3 and 0 ≤ arg z ≤ π/4 is the blue
closed segment.
By observation, ∣z − 6 − i∣ is maximised either at P1 or P2 . These points are given by
[Figure: Argand diagram showing the circle ∣z − z1∣ = 2 and the ray arg(z − z2) = π/4, with z1 = (1, √3) and z2 = (−1, −1).]
(b) This is simply the ray from the point z2 (but excluding the point z2) that makes an
angle π/4 with the horizontal.
(iv) We want to find x > 0 such that $|(x, 0) - (1, \sqrt{3})| = 2$ or $(x - 1)^2 + 3 = 4$ or $(x - 1)^2 = 1$
or x = 0, 2. So (2, 0) is where the locus ∣z − z1∣ = 2 meets the positive real axis.
A552 (9740 N2010/II/1)(i)
$$x = \frac{6 \pm \sqrt{(-6)^2 - 4(1)(34)}}{2} = 3 \pm \frac{\sqrt{-100}}{2} = 3 \pm 5i.$$
(ii) Since −2 + i is a root of x4 + 4x3 + x2 + ax + b = 0, we have
$= (x^2 + 4x + 5)(x^2 + cx + d)$
$= x^4 + (4 + c)x^3 + (5 + 4c + d)x^2 + (5c + 4d)x + 5d.$
(ii)
(iii) ∣z − z1∣ = ∣z − z2∣ is the line (blue) that is equidistant to the points $z_1 = 2^{1/14}e^{i\pi(1/28)}$ and
$z_2 = 2^{1/14}e^{i\pi(1/28 + 2/7)}$.
Explanation #1: 0 satisfies the equation ∣z − z1∣ = ∣z − z2∣ as we can easily verify: $|0 - z_1| = |0 - z_2| = 2^{1/14}$. So 0 is in the locus ∣z − z1∣ = ∣z − z2∣.
Explanation #2: The perpendicular bisector of a chord runs through the centre of the
circle. So in this case, the perpendicular bisector of the chord z1 z2 runs through the origin
(which is the centre of the circle).
A554 (9740 N2008/I/8)(i)
$$(1 + \sqrt{3}i)^2(1 + \sqrt{3}i) = (-2 + 2\sqrt{3}i)(1 + \sqrt{3}i) = -2 - 6 + (2\sqrt{3} - 2\sqrt{3})i = -8.$$
(ii) $$0 = 2z^3 + az^2 + bz + 4 = 2(-8) + a(-2 + 2\sqrt{3}i) + b(1 + \sqrt{3}i) + 4 = \underbrace{-12 - 2a + b}_{=0 \ (1)} + i\sqrt{3}\underbrace{(2a + b)}_{=0 \ (2)}.$$
Solving (1) and (2), we have a = −3 and b = 6. So:
$$2z^3 - 3z^2 + 6z + 4 = 2(z^2 - 2z + 4)(z - c) = 2\left[z^3 + (-c - 2)z^2 + (4 + 2c)z - 4c\right].$$
Comparing coefficients, we have c = −0.5, which is also the third root for the equation.
A555 (9740 N2008/II/3)(a) $|p| = \left|\frac{w}{w^*}\right| = \frac{|w|}{|w^*|} = 1$ and $\arg p = \arg\frac{w}{w^*} = \arg w - \arg w^* = \theta - (-\theta) = 2\theta$.
arg p5 = 10θ + 2kπ. The argument of a positive real number is 2mπ for some integer m.
Hence, θ = nπ/5 for integers n. Given also the restriction that θ ∈ (0, π/2), we have θ = π/5
or 2π/5.
(b) ∣z∣ ≤ 6 is a circle of radius 6 centred on the origin, including the interior of the circle.
∣z∣ = ∣z − 8 − 6i∣ is a line that is equidistant to the origin and the point (8, 6).
So the locus of z is the line segment AB.
(i) [Figure: Argand diagram showing the disc ∣z∣ ≤ 6, the line ∣z∣ = ∣z − 8 − 6i∣, the point 8 + 6i, the origin O, and the segment AB.]
(ii) Observe that arg z is maximised and minimised at A and B. arg A = ∠COX + ∠AOC,
arg B = ∠COX − ∠BOC. Moreover, $\angle COX = \arg(8 + 6i) = \tan^{-1}\frac{6}{8} = \tan^{-1}\frac{3}{4}$.
Note that △AOC is right and the length of OC is half of $|8 + 6i| = \sqrt{8^2 + 6^2} = 10$. So
OC = 5. Thus, $\angle AOC = \angle BOC = \cos^{-1}\frac{OC}{OA} = \cos^{-1}\frac{OC}{OB} = \cos^{-1}\frac{5}{6}$.
Altogether then, $\arg A = \angle COX + \angle AOC = \tan^{-1}\frac{3}{4} + \cos^{-1}\frac{5}{6} \approx 1.229$ and $\arg B = \angle COX - \angle BOC = \tan^{-1}\frac{3}{4} - \cos^{-1}\frac{5}{6} \approx 0.058$.
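The endpoints A and B can also be constructed directly as the points where the perpendicular bisector of the segment from 0 to 8 + 6i meets the circle of radius 6. This is only a sketch; the variable names are illustrative.

```python
import cmath
import math

target = 8 + 6j
midpoint = target / 2                          # the point C, at distance 5 from O
along_bisector = 1j * target / abs(target)     # unit vector perpendicular to OC
half_chord = math.sqrt(6 ** 2 - abs(midpoint) ** 2)

A = midpoint + half_chord * along_bisector     # endpoint with the larger argument
B = midpoint - half_chord * along_bisector     # endpoint with the smaller argument
arg_A, arg_B = cmath.phase(A), cmath.phase(B)
print(round(arg_A, 3), round(arg_B, 3))
```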
A556 (9233 N2008/I/9)(i) This is the circle centred on −2i with radius 2.
[Figure: Argand diagram showing the circle ∣z + 2i∣ = 2 with centre −2i and radius 2.]
A556 (9233 N2008/I/9)(iii) This is the region bounded by and including the rays
arg(z + 1 − 3i) = π/6 and arg(z + 1 − 3i) = π/3.
[Figure: Argand diagram showing the region π/6 ≤ arg(z + 1 − 3i) ≤ π/3, bounded by two rays from the point −1 + 3i at angles π/6 and π/3.]
Remark 163. Do not make the mistake of concluding that by the complex conjugate roots
theorem, 1 + i is the other root of the equation w2 = −2i. The theorem applies only for
polynomials whose coefficients are all real. It does not apply here because there is an
imaginary coefficient.
(ii) $$z = \frac{3 + 5i \pm \sqrt{(3 + 5i)^2 - 4(1)(-4)(1 - 2i)}}{2} = \frac{3 + 5i \pm \sqrt{9 - 25 + 30i + 16(1 - 2i)}}{2} = \frac{3 + 5i \pm \sqrt{-2i}}{2} = \frac{3 + 5i \pm (1 - i)}{2} = 2 + 2i, \ 1 + 3i.$$
A558 (9740 N2007/I/3)(a) This is the circle with radius $\sqrt{13}$ centred on the point
−2 + 3i.
[Figure: Argand diagram showing the circle ∣z + 2 − 3i∣ = √13 with centre −2 + 3i.]
(b) (a + ib)(a − ib) + 2(a + ib) = 3 + 4i or a2 + b2 + 2a + 2bi = 3 + 4i. Two complex numbers are
equal if and only if their real and imaginary parts are equal. So a2 + b2 + 2a = 3 and 2b = 4.
1 2
A559 (9740 N2007/I/7)(i) By the complex conjugate roots theorem, another root is
re−iθ . And so a quadratic factor of P (z) is:
(ii) $z^6 = -64 = 64e^{i\pi} = 2^6 e^{i\pi(1 + 2k)}$ for k ∈ Z. So $z = 2e^{i\pi(1 + 2k)/6}$ for k = 0, ±1, ±2, −3.
(iii) We first use (ii), then use (i):
$$z^6 + 64 = (z^2 - 2\sqrt{3}z + 4)(z^2 + 4)(z^2 + 2\sqrt{3}z + 4).$$
A560 (9233 N2007/I/9)(i) By the complex conjugate roots theorem, another root is
−ki. Altogether then,
az 4 + bz 3 + cz 2 + dz + e = a (z − ki) (z + ki) (z 2 + f z + g)
= a (z 2 + k 2 ) (z 2 + f z + g)
= a [z 4 + f z 3 + (k 2 + g)z 2 + k 2 f z + gk 2 ] .
that indeed:
$$ad^2 + b^2 e = a^3 k^4 f^2 + a^3 f^2 g k^2 = (af) \times [a(k^2 + g)] \times (ak^2 f) = b \times c \times d.$$
(ii) a = 1, b = 3, c = 13, d = 27, e = 36. So indeed $ad^2 + b^2 e = 1 \times 27^2 + 3^2 \times 36 = 1053 = 3 \times 13 \times 27 = bcd$.
From b = af above, $f = \frac{b}{a} = \frac{3}{1} = 3$. So from $d = ak^2 f$, $k = \pm\sqrt{\frac{d}{af}} = \pm\sqrt{\frac{27}{1 \times 3}} = \pm\sqrt{9} = \pm 3$. So the two desired
roots are ±3i.
A561 (9233 N2007/II/5). The locus of P is the ray from (but excluding) the point 2i
that makes an angle π/3 with the horizontal. This is one half of the line with cartesian
equation $y = x \tan\frac{\pi}{3} + 2 = \sqrt{3}x + 2$.
The locus of Q is the line that is equidistant to the points 4 and −2. It has cartesian
equation x = 1.
[Figure: Argand diagram showing the ray from (0, 2) at angle π/3 and the line Q : ∣z + 2∣ = ∣z − 4∣, with the points (−2, 0), (4, 0), (1, 2), and (1, √3 + 2) marked.]
The intersection of the two lines is $(1, \sqrt{3} + 2)$ or $1 + i(\sqrt{3} + 2)$.
$$\left[1 + i(\sqrt{3} + 2)\right]\left[1 - i(\sqrt{3} + 2)\right] = 1 + (\sqrt{3} + 2)^2 = 1 + 3 + 4 + 4\sqrt{3} = 8 + 4\sqrt{3}.$$
A562 (9233 N2006/I/5)(i) This is the circle with radius 3 centred on −4 + 4i.
[Figure: Argand diagram showing the circle ∣z + 4 − 4i∣ = 3 with centre A = (−4, 4), and the point C = (0, 1).]
(ii) Given a point (C here), the line connecting it to the centre of a circle (A here) also
passes through the point on the circumference (B here) that is closest to the given point
(see Fact ??).
The distance between A and C is $\sqrt{(-4 - 0)^2 + (4 - 1)^2} = \sqrt{4^2 + 3^2} = 5$. So the distance
between B and C is 5 − 3 = 2.
A563 (9233 N2006/I/6)(i)
$$= ax - \frac{(ax)^2}{2} + \frac{(ax)^3}{3} + 2ax^2 - a^2 x^3 + 2ax^3 + \dots = ax + \left(2a - \frac{a^2}{2}\right)x^2 + \left(2a - a^2 + \frac{a^3}{3}\right)x^3 + \dots$$
$$2a - \frac{a^2}{2} = 0 \iff \frac{a}{2}(4 - a) = 0 \iff a = 0 \text{ (discard) or } a = 4.$$
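A565 (9758 N2017/I/3)(i) Apply the $\frac{d}{dx}$ operator to the given equation:
$$2y\frac{dy}{dx} - 2\left(y + x\frac{dy}{dx}\right) + 10x = 0. \quad (1)$$
The stationary points are given by $\frac{dy}{dx} = 0$ (2). Plug (2) into (1) to get −2y + 10x = 0 or y = 5x (3). Plug (3) into the given equation to get:
$$25x^2 - 10x^2 + 5x^2 - 10 = 0 \text{ or } x^2 = \frac{1}{2} \text{ or } x = \pm\frac{1}{\sqrt{2}} = \pm\frac{\sqrt{2}}{2}.$$
Thus, the x-coordinates of the two stationary points of C are $\pm\sqrt{2}/2$.
(ii) The point in question is $P = \left(\frac{\sqrt{2}}{2}, \frac{5\sqrt{2}}{2}\right)$.
Apply the $\frac{d}{dx}$ operator to (1):
$$2\left(\left(\frac{dy}{dx}\right)^2 + y\frac{d^2 y}{dx^2}\right) - 2\left(\frac{dy}{dx} + \frac{dy}{dx} + x\frac{d^2 y}{dx^2}\right) + 10 = 0. \quad (4)$$
Plug $\frac{dy}{dx} = 0$ (2) into (4):
$$2y\frac{d^2 y}{dx^2} - 2x\frac{d^2 y}{dx^2} + 10 = 0 \text{ or } \frac{d^2 y}{dx^2} = \frac{5}{x - y} \text{ (for } x \neq y).$$
At P, we have x < y and so the second derivative is negative. By the Second Derivative
Test then, P is a maximum point.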
A566 (9758 N2017/I/7)(i) We’ll use the following identity given on List MF26 (p. 3):
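$$\cos P - \cos Q = -2 \sin\frac{P + Q}{2} \sin\frac{P - Q}{2}.$$
Set: $2mx = \frac{P + Q}{2}$ and $2nx = \frac{P - Q}{2}$. Then:
$$\int \sin 2mx \sin 2nx \, dx = \frac{1}{2}\int \cos Q - \cos P \, dx = \frac{1}{2}\int \cos 2(m - n)x - \cos 2(m + n)x \, dx = \frac{1}{2}\left[\frac{\sin 2(m - n)x}{2(m - n)} - \frac{\sin 2(m + n)x}{2(m + n)}\right] + C.$$
Hence:
$$\int_0^\pi (\sin 2mx + \sin 2nx)^2 \, dx = \int_0^\pi \frac{1 - \cos 4mx}{2} + \frac{1 - \cos 4nx}{2} + 2 \sin 2mx \sin 2nx \, dx$$
$$= \left[x - \frac{1}{8m}\sin 4mx - \frac{1}{8n}\sin 4nx + \frac{\sin 2(m - n)x}{2(m - n)} - \frac{\sin 2(m + n)x}{2(m + n)}\right]_0^\pi = \pi,$$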
where the last step exploits the fact that sin (kπ) = 0 for any k ∈ Z.
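A567 (9758 N2017/I/11)(i)(a) $\frac{dv}{dt} = c$. (b) 4 + 2.5c = 29. So c = 10 and v = 4 + 10t.
(ii) $\frac{dv}{dt} = c - kv = 10 - kv \iff \frac{dt}{dv} = \frac{1}{10 - kv}$ (1). Apply the ∫ dv operator to (1):
$$\int \frac{dt}{dv} \, dv = \int \frac{1}{10 - kv} \, dv \text{ or } t + C = -\frac{1}{k}\ln|10 - kv|.$$
Rearranging, $10 - kv = C_1 e^{-kt}$ (2). Plug the initial values (t, v) = (0, 0) into (2) to get: $10 - k \cdot 0 = C_1 e^{-k \cdot 0} = C_1$ or $C_1 = 10$.
Thus: $10 - kv = 10e^{-kt}$ or $v = \frac{10}{k}(1 - e^{-kt})$.
(iii) The terminal velocity is $v_T = \lim_{t \to \infty} v = \lim_{t \to \infty} \frac{10}{k}(1 - e^{-kt}) = \frac{10}{k} = 40$. So, k = 1/4. Now:
$$v = 0.9 v_T \iff 1 - e^{-t/4} = 0.9 \iff e^{-t/4} = 0.1 \iff t = 4 \ln 10.$$
The time taken for this object to reach 90% of its terminal velocity is 4 ln 10 s ≈ 9.210 s.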
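A few lines of Python can confirm the 90% figure. This is a sketch that simply evaluates the closed-form solution v = (10/k)(1 − e^(−kt)) with k = 1/4; the function name `v` mirrors the symbol in the text.

```python
import math

k = 10 / 40                       # from the terminal velocity 10/k = 40

def v(t):
    """Velocity at time t under the model dv/dt = 10 - k*v with v(0) = 0."""
    return (10 / k) * (1 - math.exp(-k * t))

t_90 = math.log(10) / k           # solves 1 - exp(-k*t) = 0.9
print(round(t_90, 3), v(t_90))
```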
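A568 (9758 N2017/II/4)(a) First, sketch the given curve and line:
[Figure: the line 2y = x − 1 and the curve y = x² − 6x + 5, with the x-value 11/2 marked.]
To find their intersection points, plug the line’s equation into the curve’s:
$$\frac{x - 1}{2} = x^2 - 6x + 5 \iff 2x^2 - 13x + 11 = 0 \iff x = 1, \frac{11}{2}.$$
So the requested area is:
$$\int_1^{11/2} \frac{x - 1}{2} - (x^2 - 6x + 5) \, dx = \left[-\frac{x^3}{3} + \frac{13}{4}x^2 - \frac{11}{2}x\right]_1^{11/2} = \frac{243}{16} = 15.1875.$$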
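The definite integral can be evaluated exactly with rational arithmetic. This is a minimal sketch; `antiderivative` is an illustrative name for the bracketed expression above.

```python
from fractions import Fraction

def antiderivative(x):
    """Antiderivative of (x - 1)/2 - (x^2 - 6x + 5), i.e. of -x^2 + (13/2)x - 11/2."""
    x = Fraction(x)
    return -x ** 3 / 3 + Fraction(13, 4) * x ** 2 - Fraction(11, 2) * x

area = antiderivative(Fraction(11, 2)) - antiderivative(1)
print(area, float(area))          # exact fraction and its decimal value
```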
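(b)(i) $$\pi\int_0^1 \left(\frac{\sqrt{y}}{a - y^2}\right)^2 dy = \pi\int_0^1 \frac{y}{(a - y^2)^2} \, dy = \frac{\pi}{2}\left[\frac{1}{a - y^2}\right]_0^1 = \frac{\pi}{2}\left(\frac{1}{a - 1} - \frac{1}{a}\right) = \frac{\pi}{2(a^2 - a)}.$$
(ii) We are given that the volume of the second container is four times that of the first:
$$\frac{\pi}{2(b^2 - b)} = 4 \cdot \frac{\pi}{2(a^2 - a)} \text{ or } 4(b^2 - b) = a^2 - a \text{ or } 4b^2 - 4b + a - a^2 = 0.$$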
Note that the correct solution to (b)(ii) must include both values of b. It is incorrect to
reject either value. See lengthy remark on the next page.
[Figure: the curves x = √y/(b1 − y²) and x = √y/(b2 − y²), with the points (1, 1), (1/(b1 − 1), 1), and (1/(b2 − 1), 1) marked.]
Let V be the volume generated by rotating the black area around the y-axis, i.e. the
volume found in (b)(i). Let V1 and V2 be the corresponding volumes for the red and blue
areas. (The red area includes the black area.)
It is not difficult to show that V = π/4 and V1 = π = V2 , so that indeed, each of the
volumes V1 and V2 is four times the volume V .
We can quite easily prove that this is also more generally true for any a > 1:
$$V_1 = V_2 = \int_0^1 \pi\left(\frac{\sqrt{y}}{b - y^2}\right)^2 dy = \int_0^1 \pi\left[\frac{\sqrt{y}}{\left(1 \pm \sqrt{1 - a + a^2}\right)/2 - y^2}\right]^2 dy = \dots = \frac{2\pi}{a^2 - a}.$$
422 The one-sentence reason given is: “Since a > 1, $\sqrt{1 - a + a^2} > 1 \implies b > 1$.”
A569 (9740 N2016/I/2). Here as an exercise, let’s do this without a calculator.
First, apply ln to the given equation: ln y = cos x ln 2. Then apply $\frac{d}{dx}$:
$$\frac{1}{y}\frac{dy}{dx} = -\sin x \ln 2 \text{ or } \frac{dy}{dx} = -y \ln 2 \sin x = -2^{\cos x} \ln 2 \sin x.$$
Thus: $\left.\frac{dy}{dx}\right|_{x=0} = -2^{\cos 0} \ln 2 \sin 0 = 0$.
$\left.\frac{dy}{dx}\right|_{x=\pi/2} = -2^{\cos(\pi/2)} \ln 2 \sin\frac{\pi}{2} = -2^0 \ln 2 \cdot 1 = -\ln 2 \approx -0.693$.
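$$\int \frac{dt}{dy} \, dy = \int \frac{1}{10 - 2y} \, dy = -\frac{1}{2}\int \frac{1}{y - 5} \, dy \text{ or } t + C = -\frac{1}{2}\ln(5 - y) \text{ (for } y < 5).$$
$y = 5 - 5e^{-2t}$ or $\frac{dx}{dt} = 5 - 5e^{-2t}$. (4)
Apply ∫ dt to (4) to get: $x = \int 5 - 5e^{-2t} \, dt = 5t + \frac{5}{2}e^{-2t} + C_2$. (5)
Plugging the initial values (3) into (5), we get:
$C_2 = 0 - 5 \cdot 0 - \frac{5}{2}e^{-2 \cdot 0} = -\frac{5}{2}$, and thus: $x = 5t + \frac{5}{2}e^{-2t} - \frac{5}{2}$.
$$x = 10\int t + \cos\frac{t}{2} - 1 \, dt = 10\left(\frac{t^2}{2} + 2\sin\frac{t}{2} - t\right) + C_4. \quad (7)$$
Thus, $x = 10\left(\frac{t^2}{2} + 2\sin\frac{t}{2} - t\right) = 5\left(t^2 + 4\sin\frac{t}{2} - 2t\right)$.
(iii) For model (i), $x = 5 \iff 5t + \frac{5}{2}e^{-2t} - \frac{5}{2} = 5 \iff t \approx 1.474$ (calculator).
For model (ii), $x = 5 \iff 5\left(t^2 + 4\sin\frac{t}{2} - 2t\right) = 5 \iff t \approx 1.046$ (calculator).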
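In place of a graphing calculator, the two times in (iii) can be found with a simple bisection. This is a sketch only: the bracket [0, 3] is an assumption (both expressions for x − 5 are negative at t = 0, positive at t = 3, and increasing in between), and `bisect` is a hand-written helper, not a library call.

```python
import math

def bisect(f, lo, hi, tol=1e-12):
    """Root of f on [lo, hi] by bisection; assumes f(lo) and f(hi) differ in sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2

# x(t) - 5 under each model.
model_i = lambda t: 5 * t + 2.5 * math.exp(-2 * t) - 2.5 - 5
model_ii = lambda t: 5 * (t ** 2 + 4 * math.sin(t / 2) - 2 * t) - 5

t_i = bisect(model_i, 0.0, 3.0)
t_ii = bisect(model_ii, 0.0, 3.0)
print(round(t_i, 3), round(t_ii, 3))
```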
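A572 (9740 N2016/II/1). We are given that $\frac{dV}{dt} = 0.1$. (1)
Also, $\tan\alpha = \frac{r}{h} = 0.5$. So, h = 2r (2) and $\frac{dh}{dt} = 2\frac{dr}{dt}$.
Plug (2) into $V = \frac{1}{3}\pi r^2 h$ to get $V = \frac{2\pi}{3}r^3$ or $r = \left(\frac{3V}{2\pi}\right)^{1/3}$. So, when V = 3, we have $r = \left(\frac{9}{2\pi}\right)^{1/3}$. (3)
Also:
$$\frac{dV}{dt} = \frac{d}{dt}\left(\frac{\pi}{3}r^2 h\right) = \frac{\pi}{3}\left(2r\frac{dr}{dt}h + r^2\frac{dh}{dt}\right) = \frac{\pi}{3}\left(r\frac{dh}{dt}h + r^2\frac{dh}{dt}\right) = \frac{\pi}{3}r(h + r)\frac{dh}{dt} = \pi r^2\frac{dh}{dt}. \quad (4)$$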
∫ x cos nx dx = n x sin nx − 2 ∫ x n sin nx dx
2
1 2 −1 −1
= x2 sin nx − [x cos nx − ∫ cos nx dx]
n n n n
1 2 2
= x2 sin nx + 2 x cos nx − 3 sin nx + C
n n n
1 2 2
= [(x2 − 2 ) sin nx + x cos nx] + C1 .
n n n
2π
2π 1 2 2
(ii)
∫π x cos nx dx = [(x − 2 ) sin nx + x cos nx]
2 2
n n n π
2π
1 2 2
= [ x cos nx] = 2 (2π ⋅ 1 ± π) = 2 (2 ± 1) 2 .
π
n n π n n
So, a = 6 or 2.
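(b) Given the substitution u = 9 − x², we have $\frac{du}{dx} = -2x$ (1) and $\frac{dx}{du} = \frac{1}{-2x}$ (2).
The volume of the solid is:
$$\pi\int_0^2 y^2 \, dx = \pi\int_0^2 \left(\frac{x\sqrt{x}}{9 - x^2}\right)^2 dx = \pi\int_0^2 \frac{x^3}{(9 - x^2)^2} \, dx = \pi\int_{u=9}^{u=5} \frac{x^3}{u^2} \cdot \frac{1}{-2x} \, du = -\frac{\pi}{2}\int_9^5 \frac{x^2}{u^2} \, du$$
$$= -\frac{\pi}{2}\int_9^5 \frac{9 - u}{u^2} \, du = -\frac{\pi}{2}\int_9^5 \frac{9}{u^2} - \frac{1}{u} \, du = -\frac{\pi}{2}\left[-\frac{9}{u} - \ln|u|\right]_9^5$$
$$= \frac{\pi}{2}\left(\frac{9}{5} + \ln 5 - \frac{9}{9} - \ln 9\right) = \frac{\pi}{2}\left(\frac{4}{5} + \ln\frac{5}{9}\right) \approx 0.333.$$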
A574 (9740 N2016/II/3)(i) y = 0 ⇐⇒ cos t = 1 ⇐⇒ t = 0, 2π ⇐⇒ x = −1, 2π − 1.
So, the x-intercepts are (−1, 0) and (2π − 1, 0). (Note that these correspond to where t = 0
and t = 2π.)
[Figure: the curve, with the points (x, y) = (−1, 0) at t = 0, (π + 1, 2) at t = π, and (2π − 1, 0) at t = 2π marked.]
(ii) $$\int_{t=0}^{t=a} y \, dx = \int_{t=0}^{t=a} y\frac{dx}{dt} \, dt = \int_0^a (1 - \cos t)(1 + \sin t) \, dt = \int_0^a 1 + \sin t - \cos t - \sin t \cos t \, dt$$
$$= \int_0^a 1 + \sin t - \cos t - \frac{1}{2}\sin 2t \, dt = \left[t - \cos t - \sin t + \frac{1}{4}\cos 2t\right]_0^a = a - \cos a - \sin a + \frac{1}{4}\cos 2a + \frac{3}{4}.$$
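(iii) The gradient of the normal line at $t = \frac{1}{2}\pi$ is:
$$-\left.\frac{dx}{dy}\right|_{t=\pi/2} = -\left.\frac{dx}{dt} \div \frac{dy}{dt}\right|_{t=\pi/2} = -\left.\frac{1 + \sin t}{\sin t}\right|_{t=\pi/2} = -2.$$
Hence, the normal line’s equation is:
$$y - 1 = -2\left(x - \frac{\pi}{2}\right) \text{ or } y = -2x + \pi + 1.$$
Thus, $E = \left(\frac{\pi + 1}{2}, 0\right)$, F = (0, π + 1), and the area of △OEF is:
$$\frac{1}{2} \cdot \frac{\pi + 1}{2} \cdot (\pi + 1) = \frac{(\pi + 1)^2}{4}.$$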
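(Note that we haven’t proven that (1) is true. We have merely presented an informal argument
for why it might be true. Which is all you need to know for H2 Maths.)
(ii) Define f ∶ R → R by $f(x) = \sqrt[3]{x}$. Then:
$$\frac{1}{n}\left(\frac{\sqrt[3]{1} + \sqrt[3]{2} + \dots + \sqrt[3]{n}}{\sqrt[3]{n}}\right) = \frac{1}{n}\left[\sqrt[3]{\frac{1}{n}} + \sqrt[3]{\frac{2}{n}} + \dots + \sqrt[3]{\frac{n}{n}}\right] = \frac{1}{n}\left[f\left(\frac{1}{n}\right) + f\left(\frac{2}{n}\right) + \dots + f\left(\frac{n}{n}\right)\right].$$
Hence, by (i):
$$\lim_{n \to \infty} \frac{1}{n}\left(\frac{\sqrt[3]{1} + \sqrt[3]{2} + \dots + \sqrt[3]{n}}{\sqrt[3]{n}}\right) = \int_0^1 f(x) \, dx = \int_0^1 \sqrt[3]{x} \, dx = \frac{3}{4}\left[x^{4/3}\right]_0^1 = \frac{3}{4}.$$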
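A576 (9740 N2015/I/4). The rectangle’s perimeter is 2(x + y). The semicircle’s is
2x + πx. The sum of these two perimeters is:
d = 2(x + y) + 2x + πx = (4 + π)x + 2y.
Rearranging: $y = \frac{d}{2} - \left(2 + \frac{\pi}{2}\right)x$. (1)
The rectangle’s area is xy. The semicircle’s is πx²/2. The sum of these two areas is:
$$A = xy + \frac{\pi}{2}x^2 = x\left[\frac{d}{2} - \left(2 + \frac{\pi}{2}\right)x\right] + \frac{\pi}{2}x^2 = x\left(\frac{d}{2} - 2x\right).$$
This is a downward-opening quadratic in x with roots 0 and d/4, so it is maximised at x = d/8, where:
$$A\big|_{x=d/8} = \frac{d}{8}\left(\frac{d}{2} - 2 \cdot \frac{d}{8}\right) = \frac{1}{32}d^2, \text{ where } k = \frac{1}{32}.$$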
$$\ln(1 + 2x) = (2x) - \frac{(2x)^2}{2} + \frac{(2x)^3}{3} - \dots = 2x - 2x^2 + \frac{8}{3}x^3 - \dots$$
(ii) In the second Maclaurin expansion, replace n with c and each x with bx:
$$ax(1 + bx)^c = ax\left[1 + c(bx) + \frac{c(c - 1)(bx)^2}{2!} + \frac{c(c - 1)(c - 2)(bx)^3}{3!} + \dots\right]$$
$$= ax + abcx^2 + \frac{1}{2}ab^2 c(c - 1)x^3 + \frac{1}{6}ab^3 c(c - 1)(c - 2)x^4 + \dots$$
Comparing coefficients, we have: a = 2 (1), abc = −2 (2), $\frac{1}{2}ab^2 c(c - 1) = \frac{8}{3}$ (3).
Plug (1) into (2) to get b = −1/c (4). Then plug (1) and (4) into (3) to get:
$$\frac{c - 1}{c} = \frac{8}{3} \iff 1 - \frac{1}{c} = \frac{8}{3} \iff c = \frac{-3}{5}.$$
And now from (4), we also have $b = \frac{5}{3}$.
Thus, the coefficient of $x^4$ is:
$$\frac{1}{6}ab^3 c(c - 1)(c - 2) = \frac{1}{6}(2)\left(\frac{5}{3}\right)^3\left(-\frac{3}{5}\right)\left(-\frac{8}{5}\right)\left(-\frac{13}{5}\right) = \frac{1}{3}\left(\frac{1}{3}\right)^3(-3)(-8)(-13) = -\frac{104}{27}.$$
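The coefficient comparison can be replayed with exact rational arithmetic. This is a sketch; the names `coeff_x2`, `coeff_x3`, and `coeff_x4` are illustrative labels for the coefficients of x², x³, and x⁴ in the expansion of ax(1 + bx)^c.

```python
from fractions import Fraction

a = Fraction(2)
c = Fraction(-3, 5)
b = -1 / c                                          # from abc = -2 with a = 2

coeff_x2 = a * b * c                                # should match -2
coeff_x3 = a * b ** 2 * c * (c - 1) / 2             # should match 8/3
coeff_x4 = a * b ** 3 * c * (c - 1) * (c - 2) / 6   # the requested coefficient
print(b, coeff_x2, coeff_x3, coeff_x4)
```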
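A578 (9740 N2015/I/10)(i) $A_1 + A_2 = \int_0^{\pi/2} \cos x \, dx = [\sin x]_0^{\pi/2} = 1$.
$$A_1 = \int_0^{\pi/4} \sin x \, dx + \int_{\pi/4}^{\pi/2} \cos x \, dx = [-\cos x]_0^{\pi/4} + [\sin x]_{\pi/4}^{\pi/2} = \left(-\frac{\sqrt{2}}{2} + 1\right) + \left(1 - \frac{\sqrt{2}}{2}\right) = 2 - \sqrt{2}.$$
Hence: $A_2 = (A_1 + A_2) - A_1 = 1 - (2 - \sqrt{2}) = \sqrt{2} - 1$.
And: $\frac{A_1}{A_2} = \frac{2 - \sqrt{2}}{\sqrt{2} - 1} = \frac{2 - \sqrt{2}}{\sqrt{2} - 1} \cdot \frac{\sqrt{2} + 1}{\sqrt{2} + 1} = \frac{2\sqrt{2} - 2 + 2 - \sqrt{2}}{2 - 1} = \sqrt{2}$.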
(ii) The volume of the solid is $\pi\int_0^{\sqrt{2}/2} x^2 \, dy = \pi\int_0^{\sqrt{2}/2} (\sin^{-1} y)^2 \, dy$.
(iii) From the given substitution y = sin u, we have $\frac{dy}{du} = \cos u$.
$$\pi\int_0^{\sqrt{2}/2} (\sin^{-1} y)^2 \, dy = \pi\int_0^{\sqrt{2}/2} u^2 \, dy = \pi\int_{y=0}^{y=\sqrt{2}/2} u^2 \frac{dy}{du} \, du = \pi\int_{u=0}^{u=\pi/4} u^2 \cos u \, du.$$
Plug = into ,:
3
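$$\frac{d^2 y}{dx^2} = \frac{d}{dx}(2\cot\theta - \tan\theta) = \left(-2\operatorname{cosec}^2\theta - \sec^2\theta\right)\frac{d\theta}{dx} = \frac{-2\operatorname{cosec}^2\theta - \sec^2\theta}{3\sin^2\theta\cos\theta}.$$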
Evaluated at θ = θ̂, this last expression has negative numerator and positive denominator
and is thus negative. So by the Second Derivative Test, this is a maximum turning point.
(iii) Observe that since θ ∈ [0, π/2], we have y = 3 sin2 θ cos θ ≥ 0. And so, the curve C is
entirely above the x-axis.
Moreover, x = sin3 θ is strictly increasing in θ, with the endpoints of θ matching those of
the range of values taken by x.
Thus, the requested area is simply:
$$\int_{\theta=0}^{\theta=\pi/2} y \, dx = \int_{\theta=0}^{\theta=\pi/2} 3\sin^2\theta\cos\theta \, dx = \int_{\theta=0}^{\theta=\pi/2} 3\sin^2\theta\cos\theta \, \frac{dx}{d\theta} \, d\theta.$$
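(iv) The intersection points of the line and the curve C are given by:
$$y = ax \text{ or } 3\sin^2\theta\cos\theta = a\sin^3\theta \text{ or } \frac{3}{a} = \tan\theta. \quad (1)$$
From (ii), the maximum point occurs at $\tan\hat{\theta} = \sqrt{2}$ (2). So plug (2) into (1) to get:
$$\frac{3}{a} = \sqrt{2} \text{ or } a = \frac{3}{\sqrt{2}} = \frac{3\sqrt{2}}{2}.$$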
A580 (9740 N2015/II/1)(i) The maximum height is attained when
$$\frac{dh}{dt} = 0 \text{ or } \frac{1}{10}\sqrt{16 - \frac{1}{2}h} = 0 \text{ or } h = 32.$$
(ii) Rearrange the given differential equation as: $\frac{dt}{dh} = \frac{10}{\sqrt{16 - \frac{1}{2}h}}$.
So: $t = \int \frac{10}{\sqrt{16 - \frac{1}{2}h}} \, dh = -40\sqrt{16 - \frac{1}{2}h} + C = -20\sqrt{64 - 2h} + C$.
Plugging in the given initial condition (h, t) = (0, 0), we have C = 160. Hence:
$$t = -20\sqrt{64 - 2h} + 160.$$
Thus: $t\big|_{h=16} = -20\sqrt{64 - 2 \cdot 16} + 160 = -20 \cdot 4\sqrt{2} + 160$.
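A581 (9740 N2014/I/2). Apply the $\frac{d}{dx}$ operator to the given equation:
$$2xy + x^2\frac{dy}{dx} + y^2 + x \cdot 2y\frac{dy}{dx} = 0. \quad (1)$$
Plug $\frac{dy}{dx} = -1$ into (1) to get $2xy - x^2 + y^2 - 2xy = 0$ or y = ±x. Plug this into the given equation to get:
$$\pm x^3 + x^3 + 54 = 0.$$
Taking y = −x yields the contradiction 54 = 0. Taking y = x yields:
$$2x^3 + 54 = 0 \text{ or } x = -3.$$
Thus, the unique point at which $\frac{dy}{dx} = -1$ is (−3, −3).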
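A582 (9740 N2014/I/7)(i) α ≈ 1.885 (mindless use of the calculator).
For β, we work out the “exact value”:
$$f(x) = -7 \iff x^6 - 3x^4 - 7 = -7 \iff x^4(x^2 - 3) = 0 \iff x = 0, \pm\sqrt{3}.$$
From the graph, $\beta = \sqrt{3}$.
(ii) $$\int_\beta^\alpha f(x) \, dx \approx \int_{\sqrt{3}}^{1.885} x^6 - 3x^4 - 7 \, dx = \left[\frac{x^7}{7} - \frac{3}{5}x^5 - 7x\right]_{\sqrt{3}}^{1.885} \approx -0.597.$$
(iii) $$\int_0^\beta f(x) \, dx = \int_0^{\sqrt{3}} x^6 - 3x^4 - 7 \, dx = \left[\frac{x^7}{7} - \frac{3}{5}x^5 - 7x\right]_0^{\sqrt{3}} = \frac{27}{7}\sqrt{3} - \frac{27}{5}\sqrt{3} - 7\sqrt{3}.$$
Hence, the requested area is $-\left(\frac{27}{7}\sqrt{3} - \frac{27}{5}\sqrt{3} - 7\sqrt{3}\right) - 7\sqrt{3} = \frac{54}{35}\sqrt{3}$.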
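The exact value in (iii) can be double-checked by tracking the coefficient of √3 as a fraction. This sketch assumes, as the last step above suggests, that the requested area is the region between the line y = −7 and the curve on [0, √3], so that the integrand is −7 − f(x) = 3x⁴ − x⁶.

```python
import math
from fractions import Fraction

# Antiderivative of 3x^4 - x^6 is (3/5)x^5 - (1/7)x^7.
# At x = sqrt(3): x^5 = 9*sqrt(3) and x^7 = 27*sqrt(3), so the area is
# coefficient * sqrt(3) with the coefficient computed exactly below.
coefficient = Fraction(3, 5) * 9 - Fraction(1, 7) * 27
area = float(coefficient) * math.sqrt(3)
print(coefficient, round(area, 3))
```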
Remark 165. The last question was strangely open-ended. I think the above answer
should have sufficed for the full four marks. But of course, who can divine what was on
the mind of those who wrote this question?
Here are two other things that could also have been “said”:
√
• Compute f ′ (x) = 6x5 − 12x3 = 6x3 (x2 − 2). Hence, for x ≥ 0, f ′ (x) > 0 ⇐⇒ x > 2.
Thus, the only positive root is α. Therefore, the other four roots are complex.
• We can even find these using only what we’ve learnt in H2 Maths.423 Write:
$$x^6 - 3x^4 - 7 = (x^2 - \alpha^2)(x^4 + bx^2 + c).$$
Comparing coefficients on the $x^4$ and constant terms, we have $-3 = -\alpha^2 + b$ and $-7 = -\alpha^2 c$.
And thus, $b = \alpha^2 - 3$ and $c = 7/\alpha^2$.
Now let $z = x^2$, so that $x^4 + bx^2 + c = z^2 + bz + c = 0$. By the usual quadratic formula:
$$z = \frac{-b \pm \sqrt{b^2 - 4c}}{2} = \frac{3 - \alpha^2 \pm \sqrt{\alpha^4 - 6\alpha^2 + 9 - 28/\alpha^2}}{2}.$$
So, the other four (complex) roots of f(x) = 0 are:
$$x = \pm\sqrt{z} = \pm\sqrt{\frac{3 - \alpha^2 \pm \sqrt{\alpha^4 - 6\alpha^2 + 9 - 28/\alpha^2}}{2}}.$$
423 In particular, without knowing how to solve cubic equations.
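A583 (9740 N2014/I/8)(i) List MF26, p. 4 tells us that $\int \frac{1}{\sqrt{a^2 - x^2}} \, dx = \sin^{-1}\frac{x}{a}$.
So: $\int f(x) \, dx = \int \frac{1}{\sqrt{9 - x^2}} \, dx = \sin^{-1}\frac{x}{3} + C$.
(ii) $$f(x) = \frac{1}{\sqrt{9 - x^2}} = \frac{1}{3} \cdot \frac{1}{\sqrt{1 - \left(\frac{x}{3}\right)^2}} = \frac{1}{3}\left[1 - \left(\frac{x}{3}\right)^2\right]^{-1/2}$$
$$= \frac{1}{3}\left[1 + \left(-\frac{1}{2}\right)\left[-\left(\frac{x}{3}\right)^2\right] + \frac{\left(-\frac{1}{2}\right)\left(-\frac{3}{2}\right)\left[-\left(\frac{x}{3}\right)^2\right]^2}{2!} + \frac{\left(-\frac{1}{2}\right)\left(-\frac{3}{2}\right)\left(-\frac{5}{2}\right)\left[-\left(\frac{x}{3}\right)^2\right]^3}{3!} + \dots\right]$$
$$= \frac{1}{3} + \frac{1}{54}x^2 + \frac{1}{648}x^4 + \frac{5}{34\,992}x^6 + \dots$$
$$\sin^{-1}\frac{x}{3} = \int f(x) \, dx = \int \frac{1}{3} + \frac{1}{54}x^2 + \frac{1}{648}x^4 + \frac{5}{34\,992}x^6 + \dots \, dx = C + \frac{1}{3}x + \frac{1}{162}x^3 + \frac{1}{3\,240}x^5 + \frac{5}{244\,944}x^7 + \dots$$
Since $\sin^{-1} 0 = 0$, we have C = 0 and so:
$$\sin^{-1}\frac{x}{3} = \frac{1}{3}x + \frac{1}{162}x^3 + \frac{1}{3\,240}x^5 + \frac{5}{244\,944}x^7 + \dots$$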
− = k [1 + − ( ) ] = k or k=− .
4 2 2 4 5
5 1 2
(ii) 1 + x − x = − (x − ) . So:
2
4 2
1 1 1
t =∫ dx = ∫ dx = 5 ∫ dx
k (1 + x − x2 ) − 5 [ 4 − (x − 2 ) ]
1 5 1 2
(x − 2 ) − 4
1 2 5
RRR √ R √
R − 1/2 − 5/4 RRR √ 5 + 1 − 2x
ln RRRR √ RRR + C = 5 ln √
1
=5⋅ √ + C,
x
2 5/4 RRR x − 1/2 + 5/4 RRRR 5 + 2x − 1
1
where we can remove the absolute value operator because x ∈ [0, ].
2
√
2 √ 5 + 1 − 2x
Plugging in =, we find that C = 0. So: t = 5 ln √
1
.
5 + 2x − 1
√ √
√ 5 + 1/2 √ 2 5+1
(iii)(a) t∣x=1/4 = 5 ln √ = 5 ln √ .
5 − 1/2 2 5−1
√
√ 5+1
(b) t∣x=0 = 5 ln √ ≈ 2.152.
5−1
(iv) Rearrange =:
2
√ √ √ √
√ 5 + 1 − 2x 2 5 5 1− 5
et/ 5 = √ = −1 + √ ⇐⇒ x= √ + .
2x + 5 − 1 2x + 5 − 1 et/ 5 + 1 2
x
1
(0, )
2
t
√
√ 5+1
( 5 ln √ , 0) ≈ (2.152, 0)
5−1
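A585 (9740 N2014/I/11)(i) By the Pythagorean Theorem, $h = \sqrt{4^2 - r^2} = \sqrt{16 - r^2}$.
The hemisphere’s volume is $V_h = \frac{2}{3}\pi r^3$ and the cone’s is $V_c = \frac{1}{3}\pi r^2 h = \frac{1}{3}\pi r^2\sqrt{16 - r^2}$.
So, the total volume is: $V(r) = V_h + V_c = \frac{2}{3}\pi r^3 + \frac{1}{3}\pi r^2\sqrt{16 - r^2}$.
We have r ∈ [0, 4]. Clearly, r = 0 does not produce a maximum.
Compute the derivative: $V'(r) = 2\pi r^2 + \frac{\pi}{3}\left(2r\sqrt{16 - r^2} - \frac{r^3}{\sqrt{16 - r^2}}\right)$.
Observe that V′(r) < 0 for r just below 4 (indeed, V′(r) → −∞ as r → 4). So, 4 cannot be a maximum.
Since neither boundary point 0 or 4 is a maximum, the maximum is attained in the interior.
And so, by the Interior Extremum Theorem, the maximum is a stationary point.
So, we now look for any stationary points. Observe that V′(r) = 0 is equivalent to:
$$2r + \frac{1}{3}\left(2\sqrt{16 - r^2} - \frac{r^2}{\sqrt{16 - r^2}}\right) = 0 \ (1) \text{ or } r = 0 \ (2).$$
Rearranging (1):
$$2r = \frac{1}{3}\left(\frac{r^2}{\sqrt{16 - r^2}} - 2\sqrt{16 - r^2}\right) = \frac{1}{3} \cdot \frac{r^2 - 32 + 2r^2}{\sqrt{16 - r^2}} = \frac{3r^2 - 32}{3\sqrt{16 - r^2}}.$$
Squaring both sides and rearranging, we get $45r^4 - 768r^2 + 1\,024 = 0$ (3), so that:
$$r^2 = \frac{768 \pm \sqrt{768^2 - 4(45)(1\,024)}}{2(45)} = \frac{384 \pm \sqrt{101\,376}}{45} \approx 15.608, \ 1.457.$$
So, the positive solutions to (3) are: $r_a \approx \sqrt{15.608} \approx 3.951$ and $r_b \approx \sqrt{1.457} \approx 1.207$.
(iii) V′(ra) = 0 and V′(rb) ≠ 0. Thus, only ra is a stationary point.424
So: $r_1 = r_a \approx 3.951$ and $h(r_1) = \sqrt{16 - r_1^2} \approx 0.625$.
(iv) [Figure: graph of V against r on [0, 4], with the maximum point (r1, V(r1)) ≈ (3.951, 139) marked.]
424 The extraneous solution rb arose because of the squaring step, which is only a one-way implication ⟹ (see Ch. 26).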
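The following sketch replays the computation numerically: it solves the quadratic in r² with coefficients 45, −768, and 1 024 (as they appear in the quadratic formula above) and then tests both candidates in the unsquared equation. The function name `g` is illustrative and stands for the left-hand side 2r + (1/3)(2√(16 − r²) − r²/√(16 − r²)).

```python
import math

# Quadratic in u = r^2: 45u^2 - 768u + 1024 = 0.
discriminant = math.sqrt(768 ** 2 - 4 * 45 * 1024)
u_a, u_b = (768 + discriminant) / 90, (768 - discriminant) / 90
r_a, r_b = math.sqrt(u_a), math.sqrt(u_b)

def g(r):
    """Unsquared stationarity condition; zero exactly at a genuine stationary point."""
    s = math.sqrt(16 - r * r)
    return 2 * r + (2 * s - r * r / s) / 3

print(round(r_a, 3), g(r_a))   # close to zero: genuine stationary point
print(round(r_b, 3), g(r_b))   # clearly non-zero: extraneous solution from squaring
```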
Comparing coefficients: A + 2B = 9 (1), 2C − 5B = 1 (2), 9A − 5C = −13 (3).
Solving, we have A = 3, B = 3, C = 8. Hence, the given definite integral equals:
$$\int_0^2 \frac{3}{2x - 5} + \frac{3x + 8}{x^2 + 9} \, dx = \left[\frac{3}{2}\ln|2x - 5| + \frac{3}{2}\ln(x^2 + 9) + \frac{8}{3}\tan^{-1}\frac{x}{3}\right]_0^2$$
$$= \frac{3}{2}(\ln 1 - \ln 5) + \frac{3}{2}(\ln 13 - \ln 9) + \frac{8}{3}\left(\tan^{-1}\frac{2}{3} - \tan^{-1}\frac{0}{3}\right) = \frac{3}{2}\ln\frac{13}{45} + \frac{8}{3}\tan^{-1}\frac{2}{3}.$$
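The closed form can be compared against a numerical integral. This is a sketch under one stated assumption: the original integrand is taken to be (9x² + x − 13)/((2x − 5)(x² + 9)), which is what the three coefficient equations above imply for the decomposition A/(2x − 5) + (Bx + C)/(x² + 9). The Simpson's rule helper is hand-written for illustration.

```python
import math

def integrand(x):
    """Assumed original integrand, reconstructed from the coefficient equations."""
    return (9 * x * x + x - 13) / ((2 * x - 5) * (x * x + 9))

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

numeric = simpson(integrand, 0.0, 2.0)
closed_form = 1.5 * math.log(13 / 45) + (8 / 3) * math.atan(2 / 3)
print(numeric, closed_form)
```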
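A587 (9740 N2013/I/5)(i) For $x \in [-a, a]$, we have $f(x) = \sqrt{1 - \left(\frac{x}{a}\right)^2}$ or equivalently:
$$[f(x)]^2 + \left(\frac{x}{a}\right)^2 = 1, \text{ with } f(x) \geq 0.$$
Thus, this portion of the graph is a semi-ellipse with y-intercept 1 and x-intercepts $\pm a$.
On $[-4a, -2a]$ and $[2a, 4a]$, there are two identical semi-ellipses. And on $[5a, 6a]$, there is a quarter-ellipse. For all other $x$, we have $y = 0$. Altogether then:
[Figure: graph of $f$, with x-axis ticks at $-4a$, $-2a$, $-a$, $a$, $3a$, $5a$, $6a$.]
(ii) From the given substitution $x \overset{1}{=} a\sin\theta$, we have $\frac{dx}{d\theta} \overset{2}{=} a\cos\theta$. We specify also that $\theta \in [-\pi/2, \pi/2]$, so that $\sqrt{\cos^2\theta} \overset{3}{=} \cos\theta$.
Note that both the upper and lower limits of integration are in $[-a, a]$. So:
$$\int_{a/2}^{\sqrt{3}a/2} f(x)\,dx = \int_{a/2}^{\sqrt{3}a/2} \sqrt{1 - \frac{x^2}{a^2}}\,dx = \int_{a/2}^{\sqrt{3}a/2} \sqrt{1 - \frac{x^2}{a^2}}\,\frac{dx}{d\theta}\frac{d\theta}{dx}\,dx$$
$$\overset{1,2,s}{=} \int_{\pi/6}^{\pi/3} \sqrt{1 - \sin^2\theta}\,a\cos\theta\,d\theta = \int_{\pi/6}^{\pi/3} \sqrt{\cos^2\theta}\,a\cos\theta\,d\theta \overset{3}{=} a\int_{\pi/6}^{\pi/3} \cos^2\theta\,d\theta$$
$$= a\int_{\pi/6}^{\pi/3} \frac{\cos 2\theta + 1}{2}\,d\theta = a\left[\frac{1}{4}\sin 2\theta + \frac{1}{2}\theta\right]_{\pi/6}^{\pi/3} = \frac{\pi}{12}a.$$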
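As a sanity check on (ii), the integral can be approximated numerically. The sketch below uses composite Simpson's rule with an arbitrary sample value $a = 2$ (an assumption made purely for illustration; any $a > 0$ would do).

```python
import math

def simpson(f, lo, hi, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

a = 2.0  # hypothetical sample value of a

def f(x):
    """The semi-ellipse f(x) = sqrt(1 - (x/a)^2) on [-a, a]."""
    return math.sqrt(1 - (x / a) ** 2)

numeric = simpson(f, a / 2, math.sqrt(3) * a / 2)
exact = math.pi * a / 12
print(numeric, exact)
```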
A588 (9740 N2013/I/10)(i) Note that $z < 3/2$ implies $3 - 2z > 0$. Rearrange (A):
$$\frac{dx}{dz} \overset{1}{=} \frac{1}{3-2z} \quad\text{or}\quad x = \int \frac{1}{3-2z}\,dz = -\frac{1}{2}\ln|3-2z| + C_1 = -\frac{1}{2}\ln(3-2z) + C_1.$$
Or: $3 - 2z = C_3e^{-2x}$ or $3 - C_3e^{-2x} = 2z$ or $z = C_4e^{-2x} + \frac{3}{2}$.
(ii) $\frac{dy}{dx} = z = C_4e^{-2x} + \frac{3}{2}$. So, $y = \int C_4e^{-2x} + \frac{3}{2}\,dx = C_5e^{-2x} + \frac{3}{2}x + C_6$.
(iii) $\frac{d^2y}{dx^2} = \frac{d}{dx}\frac{dy}{dx} = -2C_4e^{-2x} = a\frac{dy}{dx} + b = a\left(C_4e^{-2x} + \frac{3}{2}\right) + b$.
Comparing coefficients, $a = -2$ and $b = 3$.
$y = \frac{3}{2}x$ ($C_5 = 0$, $C_6 = 0$) and $y = \frac{3}{2}x + 1$ ($C_5 = 0$, $C_6 = 1$).
A non-linear member of this family is:
$$y = e^{-2x} + \frac{3}{2}x \quad (C_5 = 1,\ C_6 = 0),$$
which has the line $y = \frac{3}{2}x$ as its asymptote.
[Figure: graphs of $y = e^{-2x} + \frac{3}{2}x$ and $y = \frac{3}{2}x$.]
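$$px_R - p^3 = qx_R - q^3 \quad\text{or}\quad (p-q)x_R = p^3 - q^3 \quad\text{or}\quad x_R = \frac{p^3-q^3}{p-q} = p^2 + pq + q^2.$$
(Note that it's OK to divide by $p - q$ because $p \neq q$.)
The corresponding y-coordinate is $y_R = px_R - p^3 = p\left(x_R - p^2\right) = p\left(pq + q^2\right)$. Hence:
$$x_R = p^2 - 1 + q^2 = p^2 + q^2 + 2pq + 1 = (-p-q)^2 + 1 = y_R^2 + 1.$$
(iv) In the given region, $C$ and $L$ may be described by: $y = 2\left(\frac{x}{3}\right)^{1.5}$ and $y = \sqrt{x-1}$.
The area under $C$, above the x-axis, and from 0 to $M$ is:
$$\int_0^{3/2} 2\left(\frac{x}{3}\right)^{1.5}dx = \left[\frac{2}{3^{1.5}\cdot 2.5}x^{2.5}\right]_0^{3/2} = \frac{4}{5\cdot 3^{1.5}}\left(\frac{3}{2}\right)^{2.5} = \frac{3}{5\sqrt{2}} = \frac{3\sqrt{2}}{10}.$$
The area under $L$, above the x-axis, and from 1 to $M$ is:
$$\int_1^{3/2} \sqrt{x-1}\,dx = \left[\frac{2}{3}(x-1)^{1.5}\right]_1^{3/2} = \frac{2}{3}\left(\frac{1}{2}\right)^{1.5} = \frac{1}{3\sqrt{2}} = \frac{\sqrt{2}}{6}.$$
Hence, the requested area is $\dfrac{3\sqrt{2}}{10} - \dfrac{\sqrt{2}}{6} = \dfrac{2}{15}\sqrt{2}$.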
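A590 (9740 N2013/II/2). The three figures reproduced, but with D and E added:
[Figures: Fig. 1, Fig. 2, and Fig. 3, with the lengths $a$ and $x$ and the points B, C, D, and E labelled.]
(i) $|AD| = x\tan\frac{\pi}{3} = \sqrt{3}x$. Hence:
$$|DE| = |AB| - (|AD| + |BE|) = a - 2|AD| \overset{1}{=} a - 2\sqrt{3}x.$$
Note that $|DE|$ is also the length of each side of the equilateral triangle in Fig. 2. So, that triangle's area is:
$$T = \frac{1}{2}|DE|^2\sin\frac{\pi}{3} = \frac{1}{2}\left(a - 2\sqrt{3}x\right)^2\frac{\sqrt{3}}{2} = \frac{1}{4}\sqrt{3}\left(a - 2\sqrt{3}x\right)^2.$$
The prism has height $x$ and hence volume:
$$V(x) = xT = \frac{1}{4}x\sqrt{3}\left(a - 2\sqrt{3}x\right)^2.$$
(ii) From $\overset{1}{=}$, we have $x \leq a/\left(2\sqrt{3}\right) = \sqrt{3}a/6$. Hence, $x \in \left[0, \sqrt{3}a/6\right]$. Observe that the boundary points $x = 0$ and $x = \sqrt{3}a/6$ correspond to $V = 0$ and thus cannot be maximum points. So, the maximum is attained in the interior. Thus, by the Interior Extremum Theorem, the maximum is also a stationary point.
And so, let us find any stationary points of $V$.
$$V'(x) = \frac{\sqrt{3}}{4}\left[\left(a - 2\sqrt{3}x\right)^2 + 2x\left(a - 2\sqrt{3}x\right)\left(-2\sqrt{3}\right)\right]$$
$$= \frac{\sqrt{3}}{4}\left(a - 2\sqrt{3}x\right)\left(a - 2\sqrt{3}x - 4\sqrt{3}x\right) = \frac{\sqrt{3}}{4}\left(a - 2\sqrt{3}x\right)\left(a - 6\sqrt{3}x\right).$$
Now: $V'(x) = 0 \iff x \overset{2}{=} \dfrac{a}{2\sqrt{3}} = \dfrac{\sqrt{3}}{6}a$ or $x \overset{3}{=} \dfrac{a}{6\sqrt{3}} = \dfrac{\sqrt{3}}{18}a$.
We already saw that $x \overset{2}{=} \sqrt{3}a/6$ is not a maximum. So, the maximum must be:
$$x \overset{3}{=} \frac{\sqrt{3}}{18}a \quad\text{and}\quad V_{\max} = V\left(\frac{\sqrt{3}}{18}a\right) = \frac{1}{4}\cdot\frac{\sqrt{3}}{18}a\cdot\sqrt{3}\left(a - 2\sqrt{3}\cdot\frac{\sqrt{3}}{18}a\right)^2 = \frac{a}{24}\left(a - \frac{a}{3}\right)^2 = \frac{a^3}{54}.$$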
$$f'(x) = \frac{2\cos x}{1 + 2\sin x} \implies f'(0) = 2.$$
$$f''(x) = 2\left[\frac{-\sin x}{1 + 2\sin x} - \frac{2\cos^2 x}{(1 + 2\sin x)^2}\right] \implies f''(0) = 2(0 - 2) = -4.$$
$$f'''(x) = 2\left[\frac{-\cos x}{1 + 2\sin x} + \frac{2\sin x\cos x}{(1 + 2\sin x)^2} + \frac{4\cos x\sin x}{(1 + 2\sin x)^2} + \frac{4\cos^3 x\cdot 2}{(1 + 2\sin x)^3}\right]$$
(ii) $e^{ax}\sin nx = \left[1 + ax + \dfrac{(ax)^2}{2!} + \dots\right]\left[nx - \dfrac{(nx)^3}{3!} + \dots\right] = nx + anx^2 + \left(\dfrac{a^2n}{2} - \dfrac{n^3}{6}\right)x^3 + \dots$
Comparing the first two non-zero coefficients, we have $n = 2$, $an = -2$, and so $a = -1$. Hence, the third non-zero term is:
$$\left(\frac{a^2n}{2} - \frac{n^3}{6}\right)x^3 = \left[\frac{(-1)^2\cdot 2}{2} - \frac{2^3}{6}\right]x^3 = \left(1 - \frac{8}{6}\right)x^3 = -\frac{x^3}{3}.$$
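A592 (9740 N2012/I/2)(i) $\displaystyle\int \frac{x^3}{1+x^4}\,dx = \frac{1}{4}\ln\left(1 + x^4\right) + C$.
(ii) From the given substitution $u \overset{1}{=} x^2$, we have $\frac{du}{dx} \overset{2}{=} 2x$ or $x = \frac{1}{2}\frac{du}{dx}$. Now:
$$\int \frac{x}{1+x^4}\,dx \overset{2}{=} \int \frac{1}{2}\cdot\frac{1}{1+x^4}\frac{du}{dx}\,dx \overset{s,1}{=} \frac{1}{2}\int \frac{1}{1+u^2}\,du \overset{\star}{=} \frac{1}{2}\tan^{-1}u + C = \frac{1}{2}\tan^{-1}x^2 + C.$$
(Note that $\overset{\star}{=}$ is given on List MF26, p. 4.)
(iii) Just use your calculator: $\displaystyle\int_0^1 \left(\frac{x}{1+x^4}\right)^2 dx \approx 0.186$.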
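For readers without a graphing calculator to hand, the value in (iii) can be reproduced with any numerical integration routine. The following is a minimal sketch using composite Simpson's rule (the helper name `simpson` is our own, not something from the syllabus).

```python
def simpson(f, lo, hi, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def integrand(x):
    """The integrand (x / (1 + x^4))^2 from part (iii)."""
    return (x / (1 + x ** 4)) ** 2

value = simpson(integrand, 0.0, 1.0)
print(round(value, 3))
```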
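A593 (9740 N2012/I/4)(i) Observe that: $\angle ACB = \pi - (\angle ABC + \angle BAC) = \frac{\pi}{4} - \theta$.
By the Law of Sines, we have $\dfrac{AC}{\sin\angle ABC} = \dfrac{AB}{\sin\angle ACB}$, or:
$$AC = \sin\frac{3\pi}{4}\cdot\frac{1}{\sin\left(\frac{\pi}{4} - \theta\right)} = \sin\frac{3\pi}{4}\cdot\frac{1}{\sin\frac{\pi}{4}\cos\theta - \sin\theta\cos\frac{\pi}{4}} = \frac{1}{\cos\theta - \sin\theta},$$
where the last step uses $\sin\frac{3\pi}{4} = \sin\frac{\pi}{4} = \cos\frac{\pi}{4}$.
(ii) From the small-angle approximations $\cos\theta \approx 1 - \frac{1}{2}\theta^2$ and $\sin\theta \approx \theta$, we have:
$$\cos\theta - \sin\theta \approx 1 - \theta - \frac{1}{2}\theta^2.$$
Now use the Maclaurin series expansion for $(1 + x)^n$, with $x = -\theta - \frac{1}{2}\theta^2$ and $n = -1$:
$$\frac{1}{\cos\theta - \sin\theta} \approx \frac{1}{1 - \theta - \frac{1}{2}\theta^2} = 1 + nx + \frac{n(n-1)}{2}x^2 + \dots$$
$$\approx 1 + (-1)\left(-\theta - \frac{1}{2}\theta^2\right) + \frac{(-1)(-2)}{2}\left(-\theta - \frac{1}{2}\theta^2\right)^2 \approx 1 + \theta + \frac{3}{2}\theta^2.$$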
$$1 - \frac{dy}{dx} = 2(x+y)\left(1 + \frac{dy}{dx}\right) = 2(x+y) + 2(x+y)\frac{dy}{dx}.$$
Or: $1 - 2x - 2y = (2x + 2y + 1)\dfrac{dy}{dx}$.
$$\frac{dy}{dx} = \frac{1 - 2x - 2y}{2x + 2y + 1} \quad\text{or}\quad 1 + \frac{dy}{dx} = 1 + \frac{1 - 2x - 2y}{2x + 2y + 1} \overset{1}{=} \frac{2}{2x + 2y + 1}.$$
(ii) Apply $\dfrac{d}{dx}$ to $\overset{1}{=}$, then do some algebra:
$$\frac{d^2y}{dx^2} = \frac{-2}{(2x + 2y + 1)^2}\left(2 + 2\frac{dy}{dx}\right) = \frac{-4}{(2x + 2y + 1)^2}\left(1 + \frac{dy}{dx}\right) \overset{1}{=} -\left(1 + \frac{dy}{dx}\right)^3.$$
(iii) The turning point occurs where $\dfrac{dy}{dx} = 0$. But at any such point, $\dfrac{d^2y}{dx^2} < 0$.
So, by the Second Derivative Test, the turning point is a maximum.
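A595 (9740 N2012/I/10)(i) The model's volume is: $k \overset{1}{=} \pi r^2h + \frac{2}{3}\pi r^3$.
So, $h \overset{2}{=} \dfrac{k}{\pi r^2} - \dfrac{2}{3}r$. And the model's external surface area is:
$$A = \pi r^2 + 2\pi rh + 2\pi r^2 = \pi r(2h + 3r) \overset{2}{=} \pi r\left[2\left(\frac{k}{\pi r^2} - \frac{2}{3}r\right) + 3r\right] = \frac{2k}{r} + \frac{5}{3}\pi r^2.$$
Compute: $\dfrac{dA}{dr} = -\dfrac{2k}{r^2} + \dfrac{10}{3}\pi r$.
And so: $\dfrac{dA}{dr} = 0 \iff -\dfrac{2k}{r^2} + \dfrac{10}{3}\pi r = 0 \iff r = \left(\dfrac{3k}{5\pi}\right)^{1/3}$.
$$\frac{d^2A}{dr^2} = \frac{4k}{r^3} + \frac{10}{3}\pi > 0.$$
The second derivative is positive for all $r > 0$. So by the Second Derivative Test, the point we found is indeed a global minimum.
$$A = 180 = \frac{2k}{r} + \frac{5}{3}\pi r^2 = \frac{400}{r} + \frac{5}{3}\pi r^2 \quad\text{or}\quad \frac{5}{3}\pi r^3 - 180r + 400 = 0.$$
$$h \overset{2}{=} \frac{k}{\pi r^2} - \frac{2}{3}r = \frac{200}{\pi r^2} - \frac{2}{3}r \approx 4.877,\ 2.116.$$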
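The cubic in $r$ has no convenient closed-form solution, so in practice it is solved numerically. The sketch below is one possible way to do this (plain bisection, with brackets picked by inspecting signs at $r = 3$, $3.4$, and $4$); the value $k = 200$ is read off from $2k/r = 400/r$ in the working above, and the function names are illustrative only.

```python
import math

K = 200.0  # volume k, as implied by 2k/r = 400/r in the working above

def surface_gap(r):
    """(5/3)*pi*r^3 - 180r + 400, which is zero exactly when A(r) = 180."""
    return (5 * math.pi / 3) * r ** 3 - 180 * r + 400

def height(r):
    """h = k/(pi r^2) - (2/3) r."""
    return K / (math.pi * r * r) - 2 * r / 3

def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_lo < 0) == (f_mid < 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return (lo + hi) / 2

# The cubic changes sign on (3, 3.4) and again on (3.4, 4).
roots = [bisect(surface_gap, 3.0, 3.4), bisect(surface_gap, 3.4, 4.0)]
for r in roots:
    print(round(r, 3), round(height(r), 3), r < height(r))
```

The last column printed shows which of the two positive roots satisfies the constraint $r < h$.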
Given the constraint that r < h, we have:
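Note that $\dfrac{dy}{dx}$ is undefined where $\theta = 0$ or $\theta = 2\pi$. Also, $\left.\dfrac{dy}{dx}\right|_{\theta=\pi} = 0$.
As $\theta \to 0$, $\dfrac{dy}{dx} \to \infty$. And as $\theta \to 2\pi$, $\dfrac{dy}{dx} \to -\infty$.
So, as $\theta \to 0$ or $\theta \to 2\pi$, the tangents to $C$ tend towards being vertical.
(iii) Given $y \overset{1}{=} 1 - \cos\theta$, we have $\dfrac{dx}{d\theta} \overset{2}{=} 1 - \cos\theta$ and:
$$\int_0^{2\pi} y\,dx = \int_0^{2\pi} y\frac{dx}{d\theta}\frac{d\theta}{dx}\,dx \overset{s}{=} \int_{x=\theta=0}^{x=\theta=2\pi} y\frac{dx}{d\theta}\,d\theta \overset{1,2}{=} \int_0^{2\pi} (1 - \cos\theta)^2\,d\theta$$
$$= \int_0^{2\pi} 1 + \cos^2\theta - 2\cos\theta\,d\theta = \int_0^{2\pi} 1 + \frac{\cos 2\theta + 1}{2} - 2\cos\theta\,d\theta = \left[\frac{3}{2}\theta + \frac{1}{4}\sin 2\theta - 2\sin\theta\right]_0^{2\pi} = 3\pi.$$
(iv) The gradient of the normal to $C$ at $P$ is: $-\dfrac{dx}{dy} = -\tan\dfrac{p}{2}$.
Thus, the normal is: $y - (1 - \cos p) \overset{3}{=} -\tan\dfrac{p}{2}\,[x - (p - \sin p)]$.
For the x-intercept, plug $y = 0$ into $\overset{3}{=}$ to get $\cos p - 1 = -\tan\dfrac{p}{2}\,[x - (p - \sin p)]$ or:
$$1 - \cos p = \tan\frac{p}{2}\,[x - (p - \sin p)] \quad\text{or}\quad \cot\frac{p}{2} \overset{5}{=} \frac{x - (p - \sin p)}{1 - \cos p}.$$
And now by $\dfrac{\sin\theta}{1 - \cos\theta} \overset{\star}{=} \cot\dfrac{\theta}{2}$, we have $x = p$.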
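A597 (9740 N2012/II/1)(a) $\dfrac{dy}{dx} = \displaystyle\int 16 - 9x^2\,dx = 16x - 3x^3 + C_1$.
$$y = \int 16x - 3x^3 + C_1\,dx = 8x^2 - \frac{3}{4}x^4 + C_1x + C_2.$$
(b) If $u \neq \pm\dfrac{4}{3}$, then $\dfrac{dt}{du} = \dfrac{1}{16 - 9u^2}$ and:
$$t = \int \frac{1}{16 - 9u^2}\,du = \frac{1}{9}\int \frac{1}{(4/3)^2 - u^2}\,du = \frac{1}{9}\cdot\frac{1}{2\cdot(4/3)}\ln\left|\frac{4/3 + u}{4/3 - u}\right| + C = \frac{1}{24}\ln\left|\frac{4 + 3u}{4 - 3u}\right| + C.$$
Plug in the given initial condition $(t, u) = (0, 1)$ to find that $C = -\dfrac{1}{24}\ln 7$. So:
$$t = \frac{1}{24}\left(\ln\left|\frac{4 + 3u}{4 - 3u}\right| - \ln 7\right) = \frac{1}{24}\ln\left|\frac{4 + 3u}{7(4 - 3u)}\right| \quad\text{for } u \neq \pm\frac{4}{3}.$$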
A598 (9740 N2011/I/3)(i) The tangent at the point with parameter $t$ has gradient:
$$\frac{dy}{dx} = \frac{dy}{dt} \div \frac{dx}{dt} = -\frac{2}{t^2} \div (2t) = -\frac{1}{t^3}.$$
So the tangent has equation: $y - \dfrac{2}{p} = -\dfrac{1}{p^3}\left(x - p^2\right)$ or $y = -\dfrac{x}{p^3} + \dfrac{3}{p}$.
(ii) For the x-intercept $Q$, plug in $y = 0$: $0 = -x/p^3 + 3/p$ or $x = 3p^2$. For the y-intercept $R$, plug in $x = 0$: $y = 0 + 3/p$. So, $Q = \left(3p^2, 0\right)$ and $R = (0, 3/p)$.
(iii) The mid-point is $(x, y) = \left(1.5p^2, 1.5/p\right)$ and has equation $xy^2 = 1.5p^2(1.5/p)^2 = 1.5^3$.
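A599 (9740 N2011/I/4)(i) Recall that $(a + b)^6 = a^6 + 6a^5b + \dots$ So, using List MF26:
$$\cos^6 x = \Big(\underbrace{1 - \frac{x^2}{2}}_{a} + \underbrace{\frac{x^4}{24}}_{b} + \dots\Big)^6 = \left(1 - \frac{x^2}{2}\right)^6 + 6\left(1 - \frac{x^2}{2}\right)^5\frac{x^4}{24} + \dots$$
$$= 1 + 6\left(\frac{-x^2}{2}\right) + 15\left(\frac{-x^2}{2}\right)^2 + 6(1)\frac{x^4}{24} + \dots = 1 - 3x^2 + 4x^4 + \dots$$
(ii)(a) By the merry and unjustified application of Theorem ??:
$$\int_0^a \cos^6 x\,dx = \int_0^a 1 - 3x^2 + 4x^4 + \dots\,dx = \left[x - x^3 + \frac{4}{5}x^5 + \dots\right]_0^a = a - a^3 + \frac{4}{5}a^5 + \dots$$
If $a = \pi/4$, then $\displaystyle\int_0^{\pi/4} \cos^6 x\,dx \approx \frac{\pi}{4} - \left(\frac{\pi}{4}\right)^3 + \frac{4}{5}\left(\frac{\pi}{4}\right)^5 \approx 0.540$.
(b) By our calculator: $\displaystyle\int_0^{\pi/4} \cos^6 x\,dx \approx 0.475$.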
Using the first few terms of the Maclaurin series as an approximation works well if π/4 is
near 0. But it isn’t and so this approximation doesn’t work well.
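The gap between the two answers is easy to reproduce. The sketch below (an illustration only; the helper names are ours) evaluates the three-term series approximation from (ii)(a) and compares it with a direct numerical integration standing in for the calculator in (ii)(b).

```python
import math

def simpson(f, lo, hi, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def series_approx(a):
    """a - a^3 + (4/5)a^5, the integral of the truncated Maclaurin series."""
    return a - a ** 3 + 0.8 * a ** 5

a = math.pi / 4
from_series = series_approx(a)
from_quadrature = simpson(lambda x: math.cos(x) ** 6, 0.0, a)
print(round(from_series, 3), round(from_quadrature, 3))
```

Trying a smaller upper limit, say `a = 0.1`, shows the two values agreeing far more closely, in line with the remark that the approximation is only good near 0.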
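A600 (9740 N2011/I/5)(i) The graphs of $y = f(|x|)$ and $y = |f(x)|$ are identical to that of $f$, except that:
• Where $x < 0$, $y = f(|x|)$ is the reflection in the y-axis of the part of $f$ where $x > 0$.
• Where $f(x) < 0$, $y = |f(x)|$ is the reflection of $f$ in the x-axis.
[Figures: graphs of $y = f(|x|)$, with labelled points $(-2, 0)$, $(0, 2)$, $(2, 0)$, and of $y = |f(x)|$, with labelled points $(0, 2)$, $(2, 0)$; the graph of $y = f(x)$ is also shown in each.]
$$y = f(|x|) = \begin{cases} 2 - x & \text{for } x \geq 0, \\ 2 + x & \text{for } x < 0. \end{cases} \qquad y = |f(x)| = \begin{cases} 2 - x & \text{for } x \leq 2, \\ x - 2 & \text{for } x > 2. \end{cases}$$
(iii) $\displaystyle\int_{-1}^1 f(|x|)\,dx = \int_{-1}^0 2 + x\,dx + \int_0^1 2 - x\,dx = \left[2x + \frac{x^2}{2}\right]_{-1}^0 + \left[2x - \frac{x^2}{2}\right]_0^1 \overset{1}{=} 3.$
$$\int_1^a |f(x)|\,dx = \int_1^2 (2 - x)\,dx + \int_2^a (x - 2)\,dx = \left[2x - \frac{x^2}{2}\right]_1^2 + \left[\frac{x^2}{2} - 2x\right]_2^a$$
$$= \left(4 - 2 - 2 + \frac{1}{2}\right) + \left(\frac{a^2}{2} - 2a - 2 + 4\right) \overset{2}{=} \frac{a^2}{2} - 2a + \frac{5}{2}.$$
Set $\overset{1}{=}$ and $\overset{2}{=}$ to be equal: $\dfrac{a^2}{2} - 2a + \dfrac{5}{2} = 3$ or $a^2 - 4a - 1 = 0$.
By the quadratic formula: $a = \dfrac{4 \pm \sqrt{(-4)^2 - 4(1)(-1)}}{2(1)} = 2 \pm \sqrt{4 + 1} = 2 \pm \sqrt{5}$.
We can discard $a = 2 - \sqrt{5} < 2$. So, $a = 2 + \sqrt{5}$.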
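A quick check that $a = 2 + \sqrt{5}$ really equates the two integrals: both integrands are piecewise linear, so a one-step trapezium rule on each linear piece is exact. The sketch below assumes $f(x) = 2 - x$, as used in the working above; the helper name `trapezium` is our own.

```python
import math

def f(x):
    """f(x) = 2 - x, as in the working above."""
    return 2 - x

def trapezium(g, lo, hi):
    """One-step trapezium rule; exact when g is linear on [lo, hi]."""
    return (g(lo) + g(hi)) / 2 * (hi - lo)

a = 2 + math.sqrt(5)

# Integral of f(|x|) over [-1, 1], split at the kink x = 0.
left = trapezium(lambda x: f(abs(x)), -1, 0) + trapezium(lambda x: f(abs(x)), 0, 1)

# Integral of |f(x)| over [1, a], split at the kink x = 2.
right = trapezium(lambda x: abs(f(x)), 1, 2) + trapezium(lambda x: abs(f(x)), 2, a)

print(left, right)
```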
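A601 (9740 N2011/I/8)(i) $\displaystyle\int \frac{1}{100 - v^2}\,dv = \frac{1}{20}\ln\left|\frac{10 + v}{10 - v}\right| + C_1$ (see List MF26, p. 4).
(ii)(a) Rearrange the given differential equation: $\dfrac{dt}{dv} = \dfrac{1}{10 - 0.1v^2} = 10\cdot\dfrac{1}{100 - v^2}$.
So: $t = \displaystyle\int \frac{1}{10 - 0.1v^2}\,dv = \int 10\cdot\frac{1}{100 - v^2}\,dv = \frac{1}{2}\ln\left|\frac{10 + v}{10 - v}\right| + C_2$.
Plug in the given initial condition $(t, v) = (0, 0)$ to get $0 = \dfrac{1}{2}\ln\left|\dfrac{10 + 0}{10 - 0}\right| + C_2$. So, $C_2 = 0$.
Thus: $t \overset{1}{=} \dfrac{1}{2}\ln\left|\dfrac{10 + v}{10 - v}\right|$ and $t\big|_{v=5} = \dfrac{1}{2}\ln\left|\dfrac{10 + 5}{10 - 5}\right| = \dfrac{1}{2}\ln 3$.
(b) Rearrange $\overset{1}{=}$: $2t = \ln\left|\dfrac{10 + v}{10 - v}\right| \iff e^{2t} = \left|\dfrac{10 + v}{10 - v}\right|$.
For $v < 10$: $e^{2t} = \dfrac{10 + v}{10 - v} = -1 + \dfrac{20}{10 - v} \iff v(t) \overset{2}{=} 10 - \dfrac{20}{e^{2t} + 1}$.
So, if we start with $v < 10$, then for all $t$, we have $v < 10$. And we do start from $v < 10$ (indeed, we start at $v = 0$). So $\overset{2}{=}$ holds for all $t$ and $v(1) = 10 - 20/\left(e^2 + 1\right)$.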
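A602 (9740 N2011/II/2)(i) The box has length $L = 2(n - x)$, breadth $B = n - 2x$, and height $H = x$. It thus has volume:
$$V = LBH = 2x(n - x)(n - 2x) = 2n^2x - 6nx^2 + 4x^3.$$
(ii) Compute: $\dfrac{dV}{dx} = 2n^2 - 12nx + 12x^2$. So:
$$\frac{dV}{dx} = 0 \iff 2n^2 - 12nx + 12x^2 = 0 \iff n^2 - 6nx + 6x^2 = 0.$$
Or: $x = \dfrac{6n \pm \sqrt{(-6n)^2 - 4(6)\left(n^2\right)}}{2(6)} = \dfrac{3n \pm \sqrt{9n^2 - 6n^2}}{6} = \dfrac{1}{2}n \pm \dfrac{\sqrt{3}}{6}n = \dfrac{1}{2}\left(1 \pm \dfrac{\sqrt{3}}{3}\right)n$.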
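$$\int_0^n x^2e^{-2x}\,dx = \left[x^2\left(\frac{-1}{2}\right)e^{-2x}\right]_0^n - \int_0^n 2x\left(\frac{-1}{2}\right)e^{-2x}\,dx = \left[-\frac{1}{2}x^2e^{-2x}\right]_0^n + \int_0^n xe^{-2x}\,dx$$
$$= \left[-\frac{1}{4}e^{-2x}\left(2x^2 + 2x + 1\right)\right]_0^n = -\frac{1}{4}e^{-2n}\left(2n^2 + 2n + 1\right) + \frac{1}{4}.$$
(a)(ii) $\displaystyle\int_0^\infty x^2e^{-2x}\,dx = \lim_{n\to\infty}\left[-\frac{1}{4}e^{-2n}\left(2n^2 + 2n + 1\right) + \frac{1}{4}\right] = \frac{1}{4}$.
(b) From the given substitution $x \overset{1}{=} \tan\theta$, we have $\dfrac{dx}{d\theta} \overset{2}{=} \sec^2\theta$ and:
$$V = \pi\int_0^1 y^2\,dx = \pi\int_0^1 \left(\frac{4x}{x^2 + 1}\right)^2 dx \overset{1}{=} \pi\int_0^1 \frac{16\tan^2\theta}{\left(\tan^2\theta + 1\right)^2}\,dx = \pi\int_0^1 \frac{16\tan^2\theta}{\left(\sec^2\theta\right)^2}\,dx$$
$$\overset{2,s}{=} \pi\int_0^{\pi/4} \frac{16\tan^2\theta}{\sec^4\theta}\sec^2\theta\,d\theta = 16\pi\int_0^{\pi/4} \sin^2\theta\,d\theta = 8\pi\int_0^{\pi/4} 1 - \cos 2\theta\,d\theta = 8\pi\left[\theta - \frac{1}{2}\sin 2\theta\right]_0^{\pi/4} = 2\pi(\pi - 2).$$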
A604 (9740 N2010/I/2)(i) $e^x = 1 + x + \frac{1}{2}x^2 + \dots$ and $1 + \sin 2x = 1 + 2x + \dots$
Thus: $e^x(1 + \sin 2x) = \left(1 + x + \frac{1}{2}x^2 + \dots\right)(1 + 2x + \dots) = 1 + 3x + \frac{5}{2}x^2 + \dots$
(ii) $\left(1 + \dfrac{4x}{3}\right)^n = 1 + n\dfrac{4x}{3} + \dfrac{n(n-1)}{2}\left(\dfrac{4x}{3}\right)^2 + \dots = 1 + \dfrac{4n}{3}x + \dfrac{8n(n-1)}{9}x^2 + \dots$
So, $3 = \dfrac{4n}{3}$ or $n = \dfrac{9}{4}$. And $\dfrac{8n(n-1)}{9} = \dfrac{8(9/4)(5/4)}{9} = \dfrac{5}{2}$. ✓
$$2x - 2y\frac{dy}{dx} + 2y + 2x\frac{dy}{dx} = 0 \iff 2(x - y)\frac{dy}{dx} = -2(x + y) \iff \frac{dy}{dx} = \frac{x + y}{y - x}.$$
(iii) The curve and line intersect at $x \overset{\beta}{=} -\sqrt{3}$. So, the requested area is:
$$\int_{-\sqrt{3}}^0 x^3 - 3x\,dx = \left[\frac{x^4}{4} - \frac{3}{2}x^2\right]_{-\sqrt{3}}^0 = -\frac{9}{4} + \frac{3}{2}\cdot 3 = \frac{9}{4}.$$
By observation, $x^3 - 3x + 1 = k$ has three distinct real roots if and only if $k$ is strictly between the above two values. That is, $k \in (-1, 3)$.
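$$t(\theta) = 10\ln\frac{10}{20 - \theta} \quad\text{or}\quad e^{0.1t} = \frac{10}{20 - \theta} \quad\text{or}\quad \theta(t) = 20 - 10e^{-0.1t}.$$
[Figure: graph of $\theta(t)$ against $t$, with the value 10 marked on the vertical axis.]
$$\frac{dA}{dx} = -\frac{800}{x^2}(1 + k) + 12x = 0 \iff 12x^3 = 800(1 + k) \iff x = \sqrt[3]{200(1 + k)/3}.$$
Thus, $A$ is minimised at: $x = \sqrt[3]{\dfrac{200}{3}(1 + k)}$.
(iii) $k \in (0, 1] \iff 2(1 + k) \in (2, 4] \iff \dfrac{1}{2(1 + k)} \in \left[\dfrac{1}{4}, \dfrac{1}{2}\right) \iff \dfrac{3}{2(1 + k)} \in \left[\dfrac{3}{4}, \dfrac{3}{2}\right)$.
(iv) $\dfrac{y}{x} = 1 \iff \dfrac{3}{2(1 + k)} = 1 \iff k = \dfrac{1}{2}$.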
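$$\frac{dy}{dx} = \frac{dy}{dt} \div \frac{dx}{dt} = \frac{1 + t^{-2}}{1 - t^{-2}} = \frac{t^2 + 1}{t^2 - 1} \quad (\text{for } t \neq 0, \pm 1).$$
So the tangent at point $P$ has equation:
$$y - \left(p - \frac{1}{p}\right) = \frac{p^2 + 1}{p^2 - 1}\left[x - \left(p + \frac{1}{p}\right)\right] \quad\text{or}\quad \left(p^2 - 1\right)y - \frac{\left(p^2 - 1\right)^2}{p} = \left(p^2 + 1\right)\left(x - \frac{p^2 + 1}{p}\right).$$
Or: $\left(p^2 + 1\right)x - \left(p^2 - 1\right)y = \dfrac{1}{p}\left[\left(p^2 + 1\right)^2 - \left(p^2 - 1\right)^2\right] = \dfrac{1}{p}\left(2\cdot 2p^2\right) \overset{1}{=} 4p$.
(ii) For $A$, plug $y = x$ into $\overset{1}{=}$: $\left(p^2 + 1\right)x - \left(p^2 - 1\right)x = 4p$ or $x = 2p$. So, $A = (2p, 2p)$.
For $B$, plug $y = -x$ into $\overset{1}{=}$: $\left(p^2 + 1\right)x + \left(p^2 - 1\right)x = 4p$ or $x = \dfrac{2}{p}$. So, $B = \left(\dfrac{2}{p}, -\dfrac{2}{p}\right)$.
Observe that since $y = x$ and $y = -x$ are perpendicular, $OA$ and $OB$ are likewise perpendicular. Hence, the area of the triangle $OAB$ is simply given by the primary-school formula "half base times height":
$$\frac{1}{2}|OA||OB| = \frac{1}{2}\left|\sqrt{(2p)^2 + (2p)^2}\right|\left|\sqrt{\left(\frac{2}{p}\right)^2 + \left(-\frac{2}{p}\right)^2}\right| = \frac{1}{2}\sqrt{8p^2}\sqrt{\frac{8}{p^2}} = 4,$$
which is indeed independent of $p$.
(iii) Observe that $x + y = 2t$ and $x - y = \dfrac{2}{t}$. Hence, $x^2 - y^2 = (x + y)(x - y) = 4$.
This is an east-west hyperbola with x-intercepts $(\pm 2, 0)$ and asymptotes $y = \pm x$.
[Figure: graph of $C$, with x-intercepts $(2, 0)$ and $(-2, 0)$ and asymptotes $y = x$ and $y = -x$.]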
So, x = −4/3 is the only stationary point. To show that this is also a minimum turning
point, we use the Second Derivative Test:
$$\left.\frac{d^2y}{dx^2}\right|_{x=-4/3} = \left[\frac{1.5}{\sqrt{x + 2}} - \frac{1.5x + 2}{2(x + 2)^{1.5}}\right]_{x=-4/3} = \left[\frac{1.5(x + 2) - 0.75x - 1}{(x + 2)^{1.5}}\right]_{x=-4/3} > 0.$$
(ii)(a) The given equation is equivalent to $y \overset{1}{=} \pm x\sqrt{x + 2}$, which is, of course, very similar
[Figures: (ii)(b) graph of $y^2 = x^2(x + 2)$; (iii) graph of $y = f'(x) = \dfrac{1.5x + 2}{\sqrt{x + 2}}$, with vertical asymptote $x = -2$.]
So: $\dfrac{1}{4}\ln 3 = \dfrac{\pi}{6p} \iff p = \dfrac{4\pi}{6\ln 3} = \dfrac{2\pi}{3\ln 3} \approx 1.906$.
A612 (9740 N2009/I/4)(i) $f(27) + f(45) = f(3) + f(1) = 5 + 6 = 11$.
(ii) [Figure: graph of $f$, with labelled points $(-7, 6)$, $(-4, 7)$, $(0, 7)$, $(4, 7)$, and $(8, 7)$.]
(iii) $\displaystyle\int_{-4}^3 f(x)\,dx = \int_{-4}^{-2} f(x)\,dx + \int_{-2}^0 f(x)\,dx + \int_0^2 f(x)\,dx + \int_2^3 f(x)\,dx$
$$= \int_0^2 7 - x^2\,dx + \int_2^4 2x - 1\,dx + \int_0^2 7 - x^2\,dx + \int_2^3 2x - 1\,dx$$
$$= 2\left[7x - \frac{1}{3}x^3\right]_0^2 + \left[x^2 - x\right]_2^4 + \left[x^2 - x\right]_2^3 = 22\tfrac{2}{3} + 12 - 2 + 6 - 2 = 36\tfrac{2}{3}.$$
And: $g''(x) = -\dfrac{2b}{\left(a + bx^2\right)^2} + \dfrac{8b^2x^2}{\left(a + bx^2\right)^3} \implies g''(0) = -\dfrac{2b}{a^2}$.
Since the first two non-zero terms coincide, we have: $f(0) = g(0)$ or $a = 1/e$.
And: $f''(0) = g''(0)$ or $-e = -\dfrac{2b}{a^2} = -2be^2$ or $b = \dfrac{1}{2e}$.
A614 (9740 N2009/I/11)(i)
y
So, $f'(x) = 0 \iff x = \pm\sqrt{0.5}$. So, the two stationary points are $\left(\pm\sqrt{0.5}, \pm\sqrt{0.5}e^{-1/2}\right)$.
To verify that these are also turning points, compute: $f''(x) = -2xe^{-x^2}\left(1 - 2x^2\right) + e^{-x^2}(-4x)$.
And so: $f''\left(\pm\sqrt{0.5}\right) = \mp 2\sqrt{0.5}e^{-0.5}(1 - 1) \mp 4\sqrt{0.5}e^{-0.5} = \mp 2\sqrt{2}e^{-0.5}$.
The second derivative is negative at $x = \sqrt{0.5}$ and positive at $x = -\sqrt{0.5}$. Hence, by the Second Derivative Test, the first point is a maximum turning point and the second is a minimum turning point.
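(iii) From the given substitution $u \overset{1}{=} x^2$, we have $\dfrac{du}{dx} \overset{2}{=} 2x$ and:
$$\int_0^n xe^{-x^2}\,dx \overset{1,2}{=} \int_0^n \frac{1}{2}\frac{du}{dx}e^{-u}\,dx \overset{s}{=} \frac{1}{2}\int_{u=0}^{u=n^2} e^{-u}\,du = \frac{1}{2}\left[-e^{-u}\right]_0^{n^2} = \frac{1}{2}\left(1 - e^{-n^2}\right).$$
So, the requested area is: $\displaystyle\lim_{n\to\infty}\int_0^n xe^{-x^2}\,dx = \lim_{n\to\infty}\frac{1}{2}\left(1 - e^{-n^2}\right) = \frac{1}{2}$.
(iv) By symmetry: $\displaystyle\int_{-2}^2 \left|xe^{-x^2}\right|dx = 2\int_0^2 xe^{-x^2}\,dx = 2\left[\frac{1}{2}\left(1 - e^{-n^2}\right)\right]_{n=2} = 1 - e^{-4}$.
(v) Calculator: $\displaystyle\pi\int_0^1 y^2\,dx = \pi\int_0^1 \left(xe^{-x^2}\right)^2 dx \approx 0.363$.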
A615 (9740 N2009/II/1)(i) Observations:
1. x = t2 + 4t = t (t + 4). So, x is minimised at t = −2 and is strictly increasing for t ∈ [−2, 1].
2. y ′ (t) = 3t2 + 2t = t (3t + 2), so that y has turning points at t = −2/3 and t = 0.
[Figure: graph of $C$. When $t = 1$, $(x, y) = (5, 2)$. When $t = -2$, $(x, y) = (-4, -4)$.]
(ii) Compute: $\dfrac{dy}{dx} = \dfrac{dy}{dt} \div \dfrac{dx}{dt} = \dfrac{3t^2 + 2t}{2t + 4}$. So: $\left.\dfrac{dy}{dx}\right|_{t=2} = \dfrac{3\cdot 2^2 + 2\cdot 2}{2\cdot 2 + 4} = \dfrac{16}{8} = 2$.
So, $l$ has cartesian equation: $y - 12 = 2(x - 12)$ or $y \overset{1}{=} 2x - 12$.
$$t^3 + t^2 = 2\left(t^2 + 4t\right) - 12 \quad\text{or}\quad t^3 - t^2 - 8t + 12 \overset{2}{=} 0.$$
The solutions to $\overset{2}{=}$ give us the points at which the curve $C$ and the line $l$ intersect.
$$t^3 - t^2 - 8t + 12 = (t - 2)\left(t^2 + bt + c\right).$$
$$t^3 - t^2 - 8t + 12 = (t - 2)\left(t^2 + bt + c\right) = (t - 2)\left(t^2 + t - 6\right) = (t - 2)(t - 2)(t + 3).$$
So, the only other intersection point is at t = −3, which corresponds to the point:
And so, the general solution is: $n = \displaystyle\int \frac{dn}{dt}\,dt = \int 10t - 3t^2 + C_1\,dt \overset{1}{=} 5t^2 - t^3 + C_1t + C_2$.
The remainder of the answer for (i) is no longer in the current 9758 syllabus.
Plug the initial condition $(t, n) = (0, 100)$ into $\overset{1}{=}$: $100 = 5\cdot 0^2 - 0^3 + C\cdot 0 + D = D$.
[Figure: graph of a solution curve, labelled $C = 1$.]
And: $t = 50\ln\dfrac{50}{150 - n}$ or $e^{t/50} = \dfrac{50}{150 - n}$ or $n = 150 - 50e^{-t/50}$.
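A618 (9740 N2008/I/4)(i) $y = \displaystyle\int \frac{dy}{dx}\,dx = \int \frac{3x}{x^2 + 1}\,dx \overset{1}{=} \frac{3}{2}\ln\left(x^2 + 1\right) + C$.
(ii) Plug the initial condition $(x, y) = (0, 2)$ into $\overset{1}{=}$ to get: $2 = 1.5\ln 1 + C = C$.
[Figure: graphs of the solution curves for $C = 2$ and $C = 0$.]
So: $\sqrt{4 + 3\theta^2} = 2\left(1 + \dfrac{3}{4}\theta^2\right)^{1/2} = 2\left[1 + \dfrac{1}{2}\left(\dfrac{3}{4}\theta^2\right) + \dots\right] = 2 + \dfrac{3}{4}\theta^2 + \dots$
Hence, $a = 2$ and $b = 3/4$.
Hence, $f(x) = 1 + 4x + \dfrac{16x^2}{2!} + \dots = 1 + 4x + 8x^2 + \dots$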
A621 (9740 N2008/I/7). The straight parts have length x + 2y. The semicircular part
has length πx/2. So, the total time to build the wall is: 3 (x + 2y) + 9πx/2 = 180.
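Rearranging: $y = 30 - \dfrac{3}{4}\pi x - \dfrac{1}{2}x = 30 - \left(\dfrac{1}{2} + \dfrac{3}{4}\pi\right)x$.
$$A = xy + \frac{1}{2}\pi\left(\frac{x}{2}\right)^2 = 30x - \left(\frac{1}{2} + \frac{3}{4}\pi\right)x^2 + \frac{\pi}{8}x^2 = 30x - \left(\frac{1}{2} + \frac{5}{8}\pi\right)x^2.$$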
⇐⇒ x ≈ −1.962 or x ≈ 1.560.
A623 (9740 N2008/II/2)(i) The upper half of the curve $C$ has equation $y = \sqrt{x\sqrt{1 - x}}$.
And so, by symmetry, the requested area is $R = 2\displaystyle\int_0^1 \sqrt{x\sqrt{1 - x}}\,dx \approx 0.999$.
(ii) From $u \overset{1}{=} 1 - x$, we have $\dfrac{du}{dx} \overset{2}{=} -1 = \dfrac{dx}{du}$. And the requested volume is:
$$\pi\int_0^1 y^2\,dx = \pi\int_0^1 x\sqrt{1 - x}\,dx \overset{1}{=} \pi\int_0^1 (1 - u)\sqrt{u}\,dx$$
$$= \pi\int_0^1 (1 - u)\sqrt{u}\,\frac{dx}{du}\frac{du}{dx}\,dx \overset{s,2}{=} \pi\int_1^0 (1 - u)\sqrt{u}\,(-1)\,du$$
$$= \pi\int_0^1 \sqrt{u} - u^{3/2}\,du = \pi\left[\frac{2}{3}u^{3/2} - \frac{2}{5}u^{5/2}\right]_0^1 = \frac{4\pi}{15}.$$
(iii) Differentiate the given equation with respect to $x$: $2y\dfrac{dy}{dx} = \sqrt{1 - x} - \dfrac{x}{2\sqrt{1 - x}}$.
So: $\dfrac{dy}{dx} = 0 \iff \sqrt{1 - x} - \dfrac{x}{2\sqrt{1 - x}} = 0 \iff 2(1 - x) - x = 0 \iff x = 2/3$.
(There are two stationary points, both with x-coordinate 2/3. One is a maximum point of
C and the other a minimum point of C.)
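A624 (9233 N2008/I/2). By the standard Maclaurin series expansions, for small $x$:
$$\cos 2x \approx 1 - \frac{(2x)^2}{2!} = 1 - 2x^2 \quad\text{and}\quad \frac{1}{\sqrt{1 + x^2}} = \left(1 + x^2\right)^{-1/2} \approx 1 + \left(\frac{-1}{2}\right)x^2 = 1 - \frac{1}{2}x^2.$$
Thus: $\dfrac{\cos 2x}{\sqrt{1 + x^2}} \approx \left(1 - 2x^2\right)\left(1 - \dfrac{1}{2}x^2\right) \approx 1 - \dfrac{5}{2}x^2$.
$$\int_0^1 xe^{-2x}\,dx = \left[x\left(\frac{-1}{2}\right)e^{-2x}\right]_0^1 - \int_0^1 \left(\frac{-1}{2}\right)e^{-2x}\,dx = -\frac{1}{2}e^{-2} + \left[-\frac{1}{4}e^{-2x}\right]_0^1 = \frac{1}{4}\left(1 - 3e^{-2}\right).$$
A626 (9233 N2008/I/4). From the given substitution $t \overset{1}{=} \ln x$, we have $\dfrac{dt}{dx} \overset{2}{=} \dfrac{1}{x}$. So:
$$\int_e^{e^3} \frac{1}{x(\ln x)^2}\,dx \overset{2}{=} \int_e^{e^3} \frac{1}{(\ln x)^2}\frac{dt}{dx}\,dx \overset{1}{=} \int_{t=1}^{t=3} \frac{1}{t^2}\,dt = \left[-\frac{1}{t}\right]_1^3 = \frac{2}{3}.$$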
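A627 (9233 N2008/I/6)(i) $|x - a| = \begin{cases} x - a & \text{for } x \geq a, \\ a - x & \text{for } x < a. \end{cases}$
[Figure: graph of $y = |x - a|$, with x-axis ticks at $-b$, $a$, and $b$.]
(ii) By observation, this definite integral is simply the area of the two triangles created by the graph and the x-axis:
$$\frac{1}{2}[a - (-b)]^2 + \frac{1}{2}(b - a)^2 = a^2 + b^2.$$
A628 (9233 N2008/I/8). $\displaystyle\int_a^\infty \frac{1}{4 + x^2}\,dx = \left[\frac{1}{2}\tan^{-1}\frac{x}{2}\right]_a^\infty = \frac{\pi}{4} - \frac{1}{2}\tan^{-1}\frac{a}{2}$.
$$\int_{1/2}^{\sqrt{3}/2} \frac{1}{\sqrt{1 - x^2}}\,dx = \left[\sin^{-1}x\right]_{1/2}^{\sqrt{3}/2} = \frac{\pi}{3} - \frac{\pi}{6} = \frac{\pi}{6}.$$
$$\frac{\pi}{4} - \frac{1}{2}\tan^{-1}\frac{a}{2} = \frac{\pi}{6} \iff \frac{\pi}{6} = \tan^{-1}\frac{a}{2} \iff a = 2\tan\frac{\pi}{6} = \frac{2}{\sqrt{3}} = \frac{2\sqrt{3}}{3}.$$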
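A629 (9233 N2008/I/10)(i) From the given substitution $y \overset{1}{=} xz$, we have $\dfrac{dy}{dx} \overset{2}{=} z + x\dfrac{dz}{dx}$.
Now plug these into the given differential equation:
$$x^2z\left(z + x\frac{dz}{dx}\right) = x^2 + x^2z^2 \iff x^2\left(z^2 + xz\frac{dz}{dx} - z^2 - 1\right) = 0 \iff x^2\left(xz\frac{dz}{dx} - 1\right) \overset{3}{=} 0.$$
So, provided $x \neq 0$, we have $xz\dfrac{dz}{dx} \overset{4}{=} 1$.
(ii) Rearrange $\overset{4}{=}$ as $z\dfrac{dz}{dx} \overset{5}{=} \dfrac{1}{x}$. Apply the $\int dx$ operator to $\overset{5}{=}$ to get:
$$\int z\frac{dz}{dx}\,dx = \int \frac{1}{x}\,dx \quad\text{or}\quad \int z\,dz = \int \frac{1}{x}\,dx \quad\text{or}\quad z^2 \overset{6}{=} 2\ln|x| + C.$$
differential equation, if $x = 0$, then $y = 0$. We will note this in our final answer below.)
Plug $\overset{1}{=}$ into $\overset{6}{=}$ to get: $y^2 \overset{7}{=} x^2(2\ln|x| + C)$ or $y \overset{8}{=} \pm x\sqrt{2\ln|x| + C}$.
Plug in the initial condition $(x, y) = (2, 6)$ to get: $6^2 = 36 = 2^2(2\ln 2 + C)$ or $C = 9 - 2\ln 2$.
So: $y = \begin{cases} 0 & \text{for } x = 0, \\ \pm x\sqrt{2\ln|x| + 9 - 2\ln 2} & \text{for } x \neq 0. \end{cases}$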
A630 (9233 N2008/I/13)(i) The gradient of the normal to the curve is:
$$-\frac{dx}{dy} = -\frac{dx}{dt} \div \frac{dy}{dt} = \frac{-3\cos^2 t(-\sin t)}{3\sin^2 t\cos t} = \frac{\cos t}{\sin t}.$$
Thus, the equation of the normal at the given point is:
$$y - \sin^3 t = \frac{\cos t}{\sin t}\left(x - \cos^3 t\right) \quad\text{or}\quad x\cos t - y\sin t \overset{1}{=} \cos^4 t - \sin^4 t.$$
(ii) $\cos^4 t - \sin^4 t = \left(\cos^2 t + \sin^2 t\right)\left(\cos^2 t - \sin^2 t\right) = (1)(\cos 2t) \overset{2}{=} \cos 2t$.
(iii) Plug $\overset{2}{=}$ into $\overset{1}{=}$ to get $x\cos t - y\sin t \overset{3}{=} \cos 2t$. (Note: $0 < t < \pi/4 \implies \cos t \neq 0 \neq \sin t$.)
For the x-intercept, plug $y = 0$ into $\overset{3}{=}$ to get $x\cos t = \cos 2t$ or $x = \cos 2t/\cos t$.
For the y-intercept, plug $x = 0$ into $\overset{3}{=}$ to get $-y\sin t = \cos 2t$ or $y = -\cos 2t/\sin t$.
So: $A = \left(\dfrac{\cos 2t}{\cos t}, 0\right)$ and $B = \left(0, -\dfrac{\cos 2t}{\sin t}\right)$. And thus:
$$|AB| = \sqrt{\left(\frac{\cos 2t}{\cos t}\right)^2 + \left(-\frac{\cos 2t}{\sin t}\right)^2} = \cos 2t\sqrt{\frac{1}{\cos^2 t} + \frac{1}{\sin^2 t}} = \cos 2t\sqrt{\frac{\sin^2 t + \cos^2 t}{\sin^2 t\cos^2 t}}$$
$$= \cos 2t\,\frac{1}{\sin t\cos t} = \frac{\cos 2t}{\frac{1}{2}\sin 2t} = 2\cot 2t. \quad (\text{So, } k = 2.)$$
A631 (9233 N2008/I/14)(i) Let $P(k)$ be the following proposition:
$$1 + 2x + 3x^2 + \dots + kx^{k-1} = \frac{1 - (k + 1)x^k + kx^{k+1}}{(1 - x)^2}.$$
We show that $P(1)$ is true: $1 = \dfrac{1 - (1 + 1)x^1 + 1\cdot x^{1+1}}{(1 - x)^2} = \dfrac{1 - 2x + x^2}{(1 - x)^2}$. ✓
We next show that for all $j \in \mathbb{N}$, if $P(j)$ is true, then $P(j + 1)$ is also true:
$$1 + 2x + 3x^2 + \dots + jx^{j-1} + (j + 1)x^j \overset{P(j)}{=} \frac{1 - (j + 1)x^j + jx^{j+1}}{(1 - x)^2} + (j + 1)x^j$$
$$= \frac{1 + (j + 1)x^j\left[(1 - x)^2 - 1\right] + jx^{j+1}}{(1 - x)^2} = \frac{1 + (j + 1)x^j\left(x^2 - 2x\right) + jx^{j+1}}{(1 - x)^2}$$
$$= \frac{1 - 2(j + 1)x^{j+1} + jx^{j+1} + (j + 1)x^{j+2}}{(1 - x)^2} = \frac{1 - (j + 2)x^{j+1} + (j + 1)x^{j+2}}{(1 - x)^2}. \ ✓$$
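A proof by induction needs no numerical confirmation, but it can be reassuring to spot-check the closed form. The sketch below (illustrative names, exact rational arithmetic via `fractions.Fraction`) compares both sides of $P(k)$ for a few values of $k$ and $x \neq 1$.

```python
from fractions import Fraction

def lhs(k, x):
    """1 + 2x + 3x^2 + ... + k x^(k-1), summed term by term."""
    return sum(i * x ** (i - 1) for i in range(1, k + 1))

def rhs(k, x):
    """Closed form (1 - (k+1)x^k + k x^(k+1)) / (1 - x)^2, valid for x != 1."""
    return (1 - (k + 1) * x ** k + k * x ** (k + 1)) / (1 - x) ** 2

sample_points = [Fraction(1, 2), Fraction(-3), Fraction(5, 7)]
all_equal = all(
    lhs(k, x) == rhs(k, x) for k in range(1, 11) for x in sample_points
)
print(all_equal)
```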
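(ii) By definition, $\dfrac{d}{dx}\displaystyle\int 1 + 2x + 3x^2 + \dots + nx^{n-1}\,dx \overset{1}{=} 1 + 2x + 3x^2 + \dots + nx^{n-1}$. Also:
$$\frac{d}{dx}\int 1 + 2x + 3x^2 + \dots + nx^{n-1}\,dx = \frac{d}{dx}\left(x + x^2 + x^3 + \dots + x^n + C\right)$$
$$= \frac{d}{dx}\left(x\frac{1 - x^n}{1 - x} + C\right) = \frac{1 - x^n}{1 - x} + x\left[\frac{-nx^{n-1}}{1 - x} + \frac{1 - x^n}{(1 - x)^2}\right] = \frac{\left(1 - x^n - nx^n\right)(1 - x) + x - x^{n+1}}{(1 - x)^2}$$
$$f'(x) = \frac{1}{1 + x} - \frac{(x + 2)\cdot 2 - 2x\cdot 1}{(x + 2)^2} = \frac{(x + 2)^2 - 4(1 + x)}{(1 + x)(x + 2)^2} = \frac{x^2}{(1 + x)(x + 2)^2}.$$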
Assuming[425] the domain of $\ln$ is $\mathbb{R}^+$, we have $1 + x > 0$ and hence also $(x + 2)^2 > 0$. Assuming
(ii) Observe that $f(0) = 0$ or equivalently $\ln(1 + x) = \dfrac{2x}{x + 2}$ at $x = 0$.
Since $f$ is non-decreasing, for all $x \geq 0$, we must have $f(x) \geq f(0) = 0$ and thus:
$$\ln(1 + x) \geq \frac{2x}{x + 2}.$$
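A634 (9740 N2007/I/4)(i) For $I \neq 2/3$, we have $\dfrac{4}{2 - 3I} = \dfrac{dt}{dI}$.
So: $\displaystyle\int \frac{4}{2 - 3I}\,dI = \int \frac{dt}{dI}\,dI$ or $-\dfrac{4}{3}\ln|2 - 3I| \overset{1}{=} t + C$.
Plug in the initial condition $(t, I) = (0, 2)$: $-\dfrac{4}{3}\ln|2 - 3\cdot 2| = 0 + C$ or $C = -\dfrac{4}{3}\ln 4$.
So: $\dfrac{4}{3}\ln\dfrac{4}{|2 - 3I|} = t$ or $\dfrac{4}{|2 - 3I|} = e^{3t/4}$ or $|2 - 3I| = 4e^{-3t/4}$.
Hence: $I = \begin{cases} \dfrac{1}{3}\left(4e^{-3t/4} + 2\right) & \text{for } I > 2/3, \\[2mm] \dfrac{1}{3}\left(2 - 4e^{-3t/4}\right) & \text{for } I < 2/3. \end{cases}$
(ii) We start at $t = 0$ with $I = 2 > 2/3$. So, $\displaystyle\lim_{t\to\infty} I = \lim_{t\to\infty}\frac{1}{3}\left(4e^{-3t/4} + 2\right) = \frac{2}{3}$.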
425
It turns out that more generally, the domain of the natural logarithm function ln is C ∖ {0}, that is,
the set of all complex numbers excluding 0. In which case, it is perfectly possible that 1 + x < 0 and the
conclusion to which this question leads is false.
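A635 (9740 N2007/I/11)(i) [Figure: graph of the curve. When $t = \pi/2$, $(x, y) = (0, 1)$. When $t = 0$, $(x, y) = (1, 0)$.]
(ii) $\dfrac{dy}{dx} = \dfrac{dy}{dt} \div \dfrac{dx}{dt} = \dfrac{3\sin^2 t\cos t}{2\cos t(-\sin t)} = -\dfrac{3}{2}\sin t$.
$$-\sin^3\theta = -\frac{3}{2}\sin\theta\left(x_Q - \cos^2\theta\right).$$
Or: $x_Q = \dfrac{2}{3}\sin^2\theta + \cos^2\theta$.
For its y-intercept, plug $x = 0$ into $\overset{1}{=}$ to get:
$$y - \sin^3\theta = -\frac{3}{2}\sin\theta\left(-\cos^2\theta\right) \quad\text{or}\quad y_R = \frac{3}{2}\sin\theta\cos^2\theta + \sin^3\theta.$$
$OQR$ is simply a triangle with area:
$$\frac{1}{2}x_Qy_R = \frac{1}{2}\left(\frac{2}{3}\sin^2\theta + \cos^2\theta\right)\left(\frac{3}{2}\sin\theta\cos^2\theta + \sin^3\theta\right)$$
$$= \frac{1}{12}\sin\theta\left(2\sin^2\theta + 3\cos^2\theta\right)\left(3\cos^2\theta + 2\sin^2\theta\right)$$
$$= \frac{1}{12}\sin\theta\left(2\sin^2\theta + 3\cos^2\theta\right)^2 = \frac{1}{12}\sin\theta\left(2 + \cos^2\theta\right)^2.$$
$$= \int_{\pi/2}^0 \sin^3 t\cdot 2\cos t(-\sin t)\,dt \overset{2}{=} 2\int_0^{\pi/2} \cos t\sin^4 t\,dt.$$
The given substitution $u = \sin t$ yields $\dfrac{du}{dt} \overset{3}{=} \cos t$.
$$2\int_0^{\pi/2} \cos t\sin^4 t\,dt \overset{2,3}{=} 2\int_0^{\pi/2} u^4\frac{du}{dt}\,dt \overset{s}{=} 2\int_{u=0}^{u=1} u^4\,du = 2\left[\frac{1}{5}u^5\right]_0^1 = \frac{2}{5}.$$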
A636 (9740 N2007/II/3)(i) Let $f(x) = (1 + x)^n$. Then $f(0) = 1$. Also:
Hence: $(1 + x)^n = 1 + nx + \dfrac{n(n - 1)}{2!}x^2 + \dfrac{n(n - 1)(n - 2)}{3!}x^3 + \dots$
$$= 8 - 3x + \frac{3}{16}x^2 + \frac{1}{128}x^3 + 24x^2 - 9x^3 + \dots = 8 - 3x + 24\tfrac{3}{16}x^2 - 8\tfrac{127}{128}x^3 + \dots$$
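A639 (9233 N2007/I/3). $\displaystyle\pi\int_{1/2}^{\sqrt{3}/2} y^2\,dx = \pi\int_{1/2}^{\sqrt{3}/2} \frac{1}{1 + 4x^2}\,dx = \frac{\pi}{4}\int_{1/2}^{\sqrt{3}/2} \frac{1}{(1/2)^2 + x^2}\,dx$
$$= \frac{\pi}{4}\left[2\tan^{-1}2x\right]_{1/2}^{\sqrt{3}/2} = \frac{\pi}{2}\left(\tan^{-1}\sqrt{3} - \tan^{-1}1\right) = \frac{\pi^2}{24}.$$
A640 (9233 N2007/I/8)(i) From $t \overset{1}{=} \sin u$, we have $\sin^{-1}t \overset{2}{=} u$ and $\dfrac{du}{dt} \overset{3}{=} \dfrac{1}{\cos u}$. So: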
A641 (9233 N2007/I/10)(i) For x ∈ [0, 2π], the graphs of y = cos x and y = sin x intersect
at x = π/4 and x = 5π/4. With the aid of a sketch, we see that the given inequality holds
to the left of the first intersection point and to the right of the second. That is:
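$$\cos x > \sin x \iff x \in \left[0, \frac{\pi}{4}\right) \cup \left(\frac{5\pi}{4}, 2\pi\right].$$
(ii) $\displaystyle\int_0^{2\pi} |\cos x - \sin x|\,dx = \int_0^{\pi/4} \cos x - \sin x\,dx + \int_{\pi/4}^{5\pi/4} \sin x - \cos x\,dx + \int_{5\pi/4}^{2\pi} \cos x - \sin x\,dx$
$$= \sin\frac{\pi}{4} + \cos\frac{\pi}{4} - 1 + 2\left(\sin\frac{\pi}{4} + \cos\frac{\pi}{4}\right) + \left(\sin\frac{\pi}{4} + \cos\frac{\pi}{4}\right) + 1 = 8\sin\frac{\pi}{4} = 4\sqrt{2}.$$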
A642 (9233 N2007/I/11). $\dfrac{5x + 4}{(x - 5)\left(x^2 + 4\right)} = \dfrac{A}{x - 5} + \dfrac{Bx + C}{x^2 + 4}$
$$= \frac{Ax^2 + 4A + Bx^2 - 5Bx + Cx - 5C}{(x - 5)\left(x^2 + 4\right)} = \frac{(A + B)x^2 + (-5B + C)x + 4A - 5C}{(x - 5)\left(x^2 + 4\right)}.$$
Comparing coefficients: $A + B \overset{1}{=} 0$, $-5B + C \overset{2}{=} 5$, $4A - 5C \overset{3}{=} 4$.
A643 (9233 N2007/I/13)(i) $y'(x) = \dfrac{1}{\sec x}\sec x\tan x = \tan x$.
(ii) $y^{(4)}(x) = 2\left(2\sec^2 x\tan^2 x + \sec^4 x\right)$. So, $y^{(4)}(0) = 2(0 + 1) = 2$.
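$$2x - 2y\frac{dy}{dx} = A \quad\text{or}\quad 2x - 2y\frac{dy}{dx} = \frac{x^2 - y^2}{x} \quad\text{or}\quad \frac{dy}{dx} = \frac{2x - \frac{x^2 - y^2}{x}}{2y} = \frac{x^2 + y^2}{2xy}.$$
(ii) From the given substitution $y \overset{3}{=} vx$, we have $\dfrac{dy}{dx} \overset{4}{=} x\dfrac{dv}{dx} + v$. So:
$$x\frac{dv}{dx} \overset{4}{=} \frac{dy}{dx} - v = -\frac{2xy}{x^2 + y^2} - v \overset{3}{=} -\frac{2vx^2}{x^2 + v^2x^2} - v = -\left(\frac{2v}{1 + v^2} + v\right) = -\frac{3v + v^3}{1 + v^2}.$$
(iii) Rearrange (ii): $-\dfrac{1 + v^2}{3v + v^3} = \dfrac{1}{x}\dfrac{dx}{dv}$. Then apply $\int dv$:
$$-\int \frac{1 + v^2}{3v + v^3}\,dv = \int \frac{1}{x}\frac{dx}{dv}\,dv \quad\text{or}\quad -\frac{1}{3}\ln\left(3v + v^3\right) = \int \frac{1}{x}\,dx = \ln x + C_1$$
$$\iff \left(3v + v^3\right)^{-1/3} = C_2x \iff 3x^3v + x^3v^3 = C \iff 3x^2y + y^3 = C.$$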
A645 (9233 N2006/I/7). Consider the cone formed by the liquid. Let r be the radius
of this cone’s base and h be its height. Then tan 45○ = Opp/Adj = r/h = 1, so that r = h.
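So, the volume of the liquid is: $V \overset{1}{=} \dfrac{\pi}{3}r^2h = \dfrac{\pi}{3}h^3$.
We are given that: $V'(t) \overset{2}{=} -2$.
But we also have: $V'(t) \overset{3}{=} \pi h^2\dfrac{dh}{dt}$.
So: $h'(t) \overset{4}{=} \dfrac{-2}{\pi h^2}$.
So, at 3 minutes or 180 seconds after the start of the experiment, we have:
$$[h(180)]^2 \overset{5}{=} \left[\frac{3}{\pi}(390 - 2\cdot 180)\right]^{2/3} \overset{6}{=} \left(\frac{90}{\pi}\right)^{2/3}.$$
And thus: $h'(180) \overset{4}{=} \dfrac{-2}{\pi[h(180)]^2} \overset{6}{=} \dfrac{-2}{\sqrt[3]{90^2\pi}} \approx -0.068$.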
The instantaneous rate of decrease of the depth of the liquid at 3 minutes is approximately
0.068 centimetres per second.
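The arithmetic above is easy to reproduce. In the sketch below, the volume function $V(t) = 390 - 2t$ is inferred from the expression $390 - 2\cdot 180$ in the working (the lines deriving it are not shown here, so treat it as an assumption), and the function names are illustrative.

```python
import math

def volume(t):
    """Liquid volume at time t seconds; V(t) = 390 - 2t is inferred from the working."""
    return 390 - 2 * t

def depth(t):
    """Depth h from V = (pi/3) h^3, i.e. h = (3V/pi)^(1/3)."""
    return (3 * volume(t) / math.pi) ** (1 / 3)

def depth_rate(t):
    """h'(t) = -2 / (pi h^2)."""
    return -2 / (math.pi * depth(t) ** 2)

# Cross-check the formula against a central finite difference of depth(t).
eps = 1e-4
finite_difference = (depth(180 + eps) - depth(180 - eps)) / (2 * eps)
print(depth_rate(180), finite_difference)
```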
A646 (9233 N2006/I/8). Apply $\dfrac{d}{dx}$ to the given equation: $6x + y + x\dfrac{dy}{dx} + 2y\dfrac{dy}{dx} = 0$.
So: $\dfrac{dy}{dx} = 0 \iff 6x + y = 0 \iff y \overset{1}{=} -6x$.
So, the two points at which the tangent is parallel to the x-axis are $(\pm 1, \mp 6)$.
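A647 (9233 N2006/I/9)(i) $\dfrac{d}{d\theta}\sec\theta = \dfrac{d}{d\theta}\dfrac{1}{\cos\theta} = -\dfrac{-\sin\theta}{\cos^2\theta} = \dfrac{1}{\cos\theta}\cdot\dfrac{\sin\theta}{\cos\theta} = \sec\theta\tan\theta$.
(ii) From $x \overset{1}{=} \sec\theta - 1$, we have $\dfrac{dx}{d\theta} \overset{2}{=} \sec\theta\tan\theta$ and $x^2 + 2x = (x + 1)^2 - 1 \overset{3}{=} \sec^2\theta - 1$.
$$\int_{\sqrt{2}-1}^1 \frac{1}{(x + 1)\sqrt{x^2 + 2x}}\,dx \overset{1,3}{=} \int_{\sqrt{2}-1}^1 \frac{1}{\sec\theta\sqrt{\sec^2\theta - 1}}\,dx = \int_{\sqrt{2}-1}^1 \frac{1}{\sec\theta\sqrt{\tan^2\theta}}\,dx$$
$$= \int_{\sqrt{2}-1}^1 \frac{1}{\sec\theta\tan\theta}\,dx \overset{2}{=} \int_{\sqrt{2}-1}^1 \frac{d\theta}{dx}\,dx \overset{s}{=} \int_{\theta=\pi/4}^{\theta=\pi/3} 1\,d\theta = [\theta]_{\pi/4}^{\pi/3} = \frac{\pi}{12}.$$
Comparing coefficients: $A + 2C \overset{1}{=} 1$, $2B - C \overset{2}{=} 1$, $A - B \overset{3}{=} -2$.
$\overset{1}{=}$ plus $2\times\overset{2}{=}$ plus $4\times\overset{3}{=}$ yields $5A = -5$ or $A = -1$. We then find that $C = 1$ and $B = 1$.
So: $\dfrac{1 + x - 2x^2}{(2 - x)\left(1 + x^2\right)} \overset{4}{=} -\dfrac{1}{2 - x} + \dfrac{x + 1}{1 + x^2}$.
(ii) We use the first standard Maclaurin series expansion twice. First:
$$-\frac{1}{2 - x} = -\frac{1}{2}\left(1 - \frac{x}{2}\right)^{-1} = -\frac{1}{2}\left[1 + (-1)\left(-\frac{x}{2}\right) + \frac{(-1)(-2)}{2!}\left(-\frac{x}{2}\right)^2 + \dots\right] \overset{5}{=} -\frac{1}{2} - \frac{1}{4}x - \frac{1}{8}x^2 - \dots$$
Second: $\dfrac{1}{1 + x^2} = 1 + (-1)x^2 + \dots = 1 - x^2 + \dots$
So: $\dfrac{x + 1}{1 + x^2} = (x + 1)\left(1 - x^2 + \dots\right) \overset{6}{=} 1 + x - x^2 + \dots$
$$\frac{1 + x - 2x^2}{(2 - x)\left(1 + x^2\right)} = \left(-\frac{1}{2} - \frac{1}{4}x - \frac{1}{8}x^2 - \dots\right) + \left(1 + x - x^2 + \dots\right) = \frac{1}{2} + \frac{3}{4}x - \frac{9}{8}x^2 + \dots$$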
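$$\frac{y_R - y_Q}{x_R - x_Q} = \frac{c/r - c/q}{cr - cq} = \frac{\frac{q - r}{qr}}{r - q} = -\frac{1}{qr}.$$
(ii) The described line has gradient $qr$, passes through $P\left(cp, \dfrac{c}{p}\right)$, and thus has equation:
$$y - \frac{c}{p} \overset{1}{=} qr(x - cp).$$
This line passes through $V$. So, plug $(x, y) = \left(cv, \dfrac{c}{v}\right)$ into $\overset{1}{=}$ to get:
$$\frac{c}{v} - \frac{c}{p} = qr(cv - cp) \quad\text{or}\quad \frac{1}{v} - \frac{1}{p} = qr(v - p) \quad\text{or}\quad v = -\frac{1}{pqr}.$$
(iii) Observe that $xy = c^2$. Applying the $\dfrac{d}{dy}$ operator, we have $y\dfrac{dx}{dy} + x = 0$.
Rearranging, the gradient of the normal to the given curve at $t$ is: $-\dfrac{dx}{dy} = \dfrac{x}{y} = \dfrac{ct}{c/t} = t^2$.
So the gradient of the normal at $P$ is $p^2$.
(iv) The equation of the normal at $P$ is: $y - \dfrac{c}{p} \overset{2}{=} p^2(x - cp)$.
This line passes through $S$. So, plug $(x, y) = \left(cs, \dfrac{c}{s}\right)$ into $\overset{2}{=}$ to get:
$$\frac{c}{s} - \frac{c}{p} = p^2(cs - cp) \quad\text{or}\quad \frac{p - s}{sp} = p^2(s - p) \quad\text{or}\quad -\frac{1}{s} = p^3 \quad\text{or}\quad s = -\frac{1}{p^3}.$$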
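(ii) $\displaystyle\int_2^7 \frac{1}{\left(x^2 + 32\right)^{3/2}}\,dx = \frac{1}{32}\left[\frac{x}{\left(x^2 + 32\right)^{1/2}}\right]_2^7 = \frac{1}{32}\left[\frac{7}{\sqrt{81}} - \frac{2}{\sqrt{36}}\right] = \frac{1}{32}\left(\frac{7}{9} - \frac{2}{6}\right) = \frac{1}{32}\cdot\frac{4}{9} = \frac{1}{72}.$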
130.6. Ch. 114 Answers (Probability and Statistics)
A651 (9758 N2017/II/5). XXX
A652 (9758 N2017/II/6). XXX
A653 (9758 N2017/II/7). XXX
A654 (9758 N2017/II/8). XXX
A655 (9758 N2017/II/9). XXX
A656 (9758 N2017/II/10). XXX
A657 (9740 N2016/II/5). XXX
A658 (9740 N2016/II/6). XXX
A659 (9740 N2016/II/7). XXX
A660 (9740 N2016/II/8). XXX
A661 (9740 N2016/II/9). XXX
A662 (9740 N2016/II/10). XXX
A663 (9740 N2015/II/5)(i) The manager may not have all the required information to
properly implement stratified sampling. For example, he may not know what proportion
of the sampling population each age group composes.
(ii) Decide what the age groups are and how many he wishes to survey from each group.
(That is, for each age group, set a quota of respondents to be surveyed.) Then simply go
around surveying customers he sees in the supermarket, until he meets the quota for each
age group.
(iii) The manager may unconsciously gravitate towards customers that look more friendly.
He may thus not get a representative sample of his customers (many of whom look un-
friendly).
A664 (9740 N2015/II/6)(i) Let X be the number of red sweets in the packet.
(ii) $X \sim B(100, 0.25)$. Since $np = 25 > 5$ and $n(1 - p) = 75 > 5$, the normal approximation $Y \sim N(25, 18.75)$ is suitable. Hence, using also the continuity correction,
$$P(X \geq 30) = 1 - P(X < 30) \approx 1 - P(Y < 29.5) = 1 - \Phi\left(\frac{29.5 - 25}{\sqrt{18.75}}\right) \approx 1 - \Phi(1.039) \approx 1 - 0.8506 = 0.1494.$$
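To see how good the normal approximation is here, one can compare it with the exact binomial tail. The sketch below is an illustration only: it builds $\Phi$ from `math.erf` and sums the binomial probabilities directly.

```python
import math

def phi(z):
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

n, p = 100, 0.25
mean = n * p
variance = n * p * (1 - p)

# Normal approximation with continuity correction, as in the working above.
approx = 1 - phi((29.5 - mean) / math.sqrt(variance))

# Exact tail probability P(X >= 30) for X ~ B(100, 0.25).
exact = sum(math.comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(30, n + 1))

print(round(approx, 4), round(exact, 4))
```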
(iii) Let $p = P(X \geq 30) \approx 0.1494$ and $q = 1 - P(X \geq 30) \approx 0.8506$. Then the desired probability is:
$$\binom{15}{0}q^{15} + \binom{15}{1}pq^{14} + \binom{15}{2}p^2q^{13} + \binom{15}{3}p^3q^{12} \approx 0.8245.$$
A665 (9740 N2015/II/7)(i) The rate at which errors are made is independent of the
number of errors that have already been made.
The rate at which errors are made is constant throughout the newspaper.
(ii) Let $E \sim \text{Po}(6\cdot 1.3) = \text{Po}(7.8)$. Then
$$P(E > 10) = 1 - P(E \leq 10) = 1 - e^{-7.8}\left(\frac{7.8^0}{0!} + \frac{7.8^1}{1!} + \dots + \frac{7.8^{10}}{10!}\right) \approx 0.164770.$$
(iii) Let $F \sim \text{Po}(1.3n)$. We are given that $P(F < 2) < 0.05$. That is:
$$e^{-1.3n}\left(\frac{(1.3n)^0}{0!} + \frac{(1.3n)^1}{1!}\right) < 0.05 \quad\text{or}\quad e^{-1.3n}(1 + 1.3n) < 0.05.$$
Let f (n) = e−1.3n (1+1.3n). From calculator, f (1), f (2), f (3) > 0.05 and f (4) < 0.05. Hence,
the smallest possible integer value of n is 4.
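The "from calculator" step is a simple search over $n = 1, 2, 3, \dots$ A minimal sketch of that search (function names are illustrative, and the loop relies on $f$ being decreasing in $n$, which holds here):

```python
import math

def prob_fewer_than_two(n, rate=1.3):
    """P(F < 2) for F ~ Po(rate * n), i.e. e^(-1.3n) * (1 + 1.3n)."""
    lam = rate * n
    return math.exp(-lam) * (1 + lam)

def smallest_n(threshold=0.05):
    """Smallest positive integer n with P(F < 2) below the threshold."""
    n = 1
    while prob_fewer_than_two(n) >= threshold:
        n += 1
    return n

for n in range(1, 5):
    print(n, round(prob_fewer_than_two(n), 4))
print(smallest_n())
```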
A666 (9740 N2015/II/8)(i)
$$\bar{x} = \frac{\sum x_i}{n} = \frac{0.80 + 1.000 + 0.82 + 0.85 + 0.93 + 0.96 + 0.81 + 0.89}{8} = 0.8825,$$
Since, ∣t∣ < t7,0.1 = −1.415, we are unable to reject the null hypothesis at the 10% significance
level.
A667 (9740 N2015/II/9)(i) By indep., P(B∣A) = P(B) = 0.4.
= 0.45 + 0.4 + 0.3 − 0.45 ⋅ 0.4 − 0.45 ⋅ 0.3 − P(B ∩ C) + 0.1 = 0.935 − P(B ∩ C)
[Figure: scatter diagram of $y$ (vertical axis, 0 to 30) against $x$ (horizontal axis, 0 to 45,000).]
(iv) Let $x$ be the height given in metres. Then $3x = h$. Thus, the above equation may be rewritten as
$$P - 14.083 = -0.147\left(\sqrt{3x} - 140.986\right).$$
1600 − 1500
P(F > 1600) = 1 − P(F ≤ 1600) = 1 − Φ ( √ )
520
√
= 1 − Φ ( 5) ≈ 1 − Φ(2.236) ≈ 1 − 0.9873 = 0.0127.
$$P(F > E) = P(F - E > 0) = 1 - P(F - E \le 0) = 1 - \Phi\left(\frac{0 - (-100)}{\sqrt{3800}}\right) = 1 - \Phi\left(\frac{10}{\sqrt{38}}\right) \approx 1 - \Phi(1.622) \approx 1 - 0.9476 = 0.0524.$$
(iii) 0.85F + 0.9E ∼ N(0.85 ⋅ 5 ⋅ 300 + 0.9 ⋅ 8 ⋅ 200, 0.85² ⋅ 5 ⋅ 20² + 0.9² ⋅ 8 ⋅ 15²) = N(2715, 2903).
$$P(0.85F + 0.9E < 2750) = \Phi\left(\frac{2750 - 2715}{\sqrt{2903}}\right) \approx \Phi(0.650) \approx 0.7422.$$
P (⋆ + ◯) + P (⋆◯+) + P (+ ⋆ ◯) + P (◯ ⋆ +) + P (+◯⋆) + P (◯ + ⋆)
= 0.1(0.3 ⋅ 0.3 + 0.4 ⋅ 0.2) + 0.2(0.4 ⋅ 0.3 + 0.2 ⋅ 0.2) + 0.1(0.4 ⋅ 0.4 + 0.2 ⋅ 0.3)
= 0.017 + 0.032 + 0.022 = 0.071
Let f(n) = e^(−2n) (1 + 2n + 2n²). From calculator, f(1), f(2), f(3), f(4) > 0.01 and f(5) < 0.01. Hence, the smallest possible integer value of n is 5.
(iii) Let R ∼ Po(52⋅11) = Po(572). Given a large sample, we can use the normal distribution
S ∼ N(572, 572) as an approximation. Hence, using also the continuity correction,
$$P(R > 550) \approx P(S > 550.5) = 1 - P(S < 550.5) = 1 - \Phi\left(\frac{550.5 - 572}{\sqrt{572}}\right)$$
(iv) Sales may be seasonal — e.g. it may be that art collectors make most of their purchases
in the northern hemisphere’s summer months.
Since ∣t∣ < t7,0.05 = 1.895, we are unable to reject the null hypothesis at the 5% significance
level.
A683 (9740 N2013/II/10)(i) In blue is case (A), in red is case (B), and in green is
case (C).
(ii) [Figure: scatter diagram of distance y against speed x.]
(iii) As a function of speed, the distance travelled decreases at an increasing rate. So (A)
is the most appropriate.
PMCC ≈ −0.939203.
(iv) In general, the estimated regression equation is y − ȳ = b(x − x̄), where b = ∑ x̂i ŷi / ∑ x̂i².
So in this case, the estimated regression equation is
(iv) Let D be the total number of days of absence across both departments, over a 60-day
period. Then D ∼ Po(234). Since λD = 234 is large, the normal distribution is a suitable
approximation. Let E ∼ N (234, 234). Then:
A686 (9740 N2012/II/5)(i)(a) Let +, −, D, and N denote the events “positive result”,
“negative result”, “has disease”, and “no disease”. Then:
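$$\bar{x} \in \left(\mu - Z_{0.025}\frac{\sigma}{\sqrt{n}},\ \mu + Z_{0.025}\frac{\sigma}{\sqrt{n}}\right) = \left(14.0 - 1.96 \cdot \frac{3.8}{\sqrt{20}},\ 14.0 + 1.96 \cdot \frac{3.8}{\sqrt{20}}\right) \approx (12.335, 15.665).$$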
91 ⋅ 2 33 43 43
= + = = ≈ 0.17695.
3 ⋅ 455 3 ⋅ 455 3 ⋅ 91 243
A689 (9740 N2012/II/8)(i)
(ii) The trend is one of steady improvement. After a terrible performance in Week 1, Amy
resolves to work hard. Her work pays off, with her mark improving week after week.
The only deviation from trend occurs on Week 5, because Amy happened to be experi-
menting with drugs that week.
(iii) A linear model would suggest that she eventually breaks the 100% barrier, which is
quite impossible.
A quadratic model would suggest that her mark eventually starts falling and moreover at
an increasing rate, which is quite improbable, unless of course she gets hooked on drugs.
(iv) PMCC ≈ −0.929744.
(v) We are supposed to say that the most appropriate choice is wherever the magnitude of
the PMCC is the largest. Hence, L = 92 is the most appropriate.
(vi) In general, the estimated regression equation is y − ȳ = b(x − x̄), where b = ∑ x̂i ŷi / ∑ x̂i².
So in this case, the estimated regression equation is
$$P(I \ge 90) \approx P(J \ge 89.5) = 1 - \Phi\left(\frac{89.5 - 80}{\sqrt{80}}\right) \approx 1 - \Phi(1.062) \approx 1 - 0.8559 = 0.1441.$$
(v) Let P ∼ Po(3). Let Z be the number of gold coins and pottery shards found in 50 m².
Then Z ∼ Po(190). Since λ is large, the normal distribution Q ∼ N(190, 190) is a suitable
approximation for Z. Using also the continuity correction,
(vi) Let X and Y be, respectively, the numbers of gold coins and pottery shards found in 50 m². Then X ∼ Po(40) and Y ∼ Po(150). Our goal is to find P(Y ≥ 3X) = P(Y − 3X ≥ 0).
Since λX = 40 and λY = 150 are both large, the normal distributions A ∼ N(40, 40) and B ∼ N(150, 150) are suitable approximations for X and Y, respectively. And in turn, B − 3A ∼ N(150 − 3 ⋅ 40, 150 + 3² ⋅ 40) = N(30, 510) is a good approximation for Y − 3X.
Hence, using also the continuity correction,
$$P(Y - 3X \ge 0) \approx P(B - 3A \ge -0.5) = 1 - \Phi\left(\frac{-0.5 - 30}{\sqrt{510}}\right) = \Phi\left(\frac{30.5}{\sqrt{510}}\right) \approx \Phi(1.3506) \approx 0.9116.$$
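The mean and variance of B − 3A can be checked mechanically. The sketch below leans on the fact that Python's `statistics.NormalDist` supports scaling by a constant and subtracting an independent normal; the variable names mirror the answer above.

```python
from statistics import NormalDist

A = NormalDist(mu=40, sigma=40**0.5)    # approximates X ~ Po(40)
B = NormalDist(mu=150, sigma=150**0.5)  # approximates Y ~ Po(150)

# Scaling multiplies the standard deviation by 3 (so the variance by 9);
# subtracting an independent normal adds the variances.
diff = B - A * 3  # ~ N(150 - 3*40, 150 + 9*40) = N(30, 510)

# Continuity correction: P(Y - 3X >= 0) is approximated by P(B - 3A >= -0.5).
prob = 1 - diff.cdf(-0.5)

print(diff.mean, round(diff.variance, 6), round(prob, 4))
```

Note that the variance picks up the factor 3² = 9, not 3, which is the step most often botched by hand.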
A692 (9740 N2011/II/5)(i)
$$P(X < 40.0) = P\left(Z < \frac{40.0 - \mu}{\sigma}\right) = 0.05 \iff \frac{40.0 - \mu}{\sigma} \approx -1.645 \iff \mu \approx 1.645\sigma + 40.0, \tag{1}$$
$$P(X < 70.0) = P\left(Z < \frac{70.0 - \mu}{\sigma}\right) = 0.975 \iff \frac{70.0 - \mu}{\sigma} \approx 1.96 \iff \mu \approx -1.96\sigma + 70.0. \tag{2}$$
Comparing (1) and (2), we have 1.645σ + 40.0 ≈ −1.96σ + 70.0 ⇐⇒ 3.605σ ≈ 30.0 ⇐⇒ σ ≈ 8.3 and µ ≈ 53.7.
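The same pair of simultaneous equations turns up whenever two percentiles of a normal distribution are given. A minimal sketch, with the hypothetical helper `normal_from_two_quantiles`:

```python
from statistics import NormalDist


def normal_from_two_quantiles(x1, p1, x2, p2):
    """Return (mu, sigma) of the normal with P(X < x1) = p1 and P(X < x2) = p2."""
    std = NormalDist()  # standard normal
    z1, z2 = std.inv_cdf(p1), std.inv_cdf(p2)
    sigma = (x2 - x1) / (z2 - z1)  # subtract the two standardised equations
    mu = x1 - z1 * sigma           # back-substitute into the first
    return mu, sigma


mu, sigma = normal_from_two_quantiles(40.0, 0.05, 70.0, 0.975)
print(round(mu, 1), round(sigma, 1))
```

Because `inv_cdf` uses more decimal places than the tables, the results may differ from hand calculations in the last digit.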
A693 (9740 N2011/II/6)(i) Decide what the age groups will be. Decide how many
from each age group are to be interviewed (these are our quotas). Then pick, at random,
residents on the street to be interviewed, until the quota for every age group is fulfilled.
(ii) Residents who are on the street may not be a representative sample of the population.
(iii) Random sampling. Acquire a complete list of the city suburb’s population. Use a
computer program to randomly pick a sample. Interview this sample.
No, it is not realistic. First, one may not be able to acquire a complete list of the city suburb’s population. Second, one may not be able to contact every member of one’s sample.
A694 (9740 N2011/II/7)(i) #1. I do indeed make an actual attempt to contact n
different friends.
#2. The probability that one friend is contactable is independent of whether another friend
is contactable.
(ii) Assumption #1 may not hold because if say n = 100, I may run out of time before I
attempt to contact all 100 different friends.
Assumption #2 may not hold because my friends probably know each other and so they
might be watching a movie together and their handphones are switched off. This would
mean that the probability that one friend is contactable is dependent on whether another
friend is contactable.
$$\text{(iii)}\quad P(R \ge 6) = 1 - \sum_{i=0}^{5} P(R = i) = 1 - \sum_{i=0}^{5} \binom{8}{i} 0.7^i\, 0.3^{8-i} \approx 0.551774.$$
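A quick numerical check of (iii). The value 0.551774 is reproduced with n = 8 trials and p = 0.7; taking n = 8 is an assumption made here because it is the value that matches the stated answer.

```python
from math import comb


def binom_tail(n, p, k_min):
    """P(R >= k_min) for R ~ B(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


tail = binom_tail(8, 0.7, 6)  # n = 8 is assumed, see the note above
print(round(tail, 6))
```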
(iv) Since np = 28 > 5 and n(1 − p) = 12 > 5 are both large, a suitable approximation to R
is the normal distribution S ∼ N (28, 8.4). Using also the continuity correction, we have
(ii) The PMCC is ≈ −0.992317 which is very large in magnitude. But this merely means
that the correlation between x and y is very strong. It does not also imply that their true
relationship is definitely linear. Indeed in this case, it appears that the relationship is not
linear.
(iii) We are supposed to say that the larger the magnitude of the PMCC, the better the model. In this case, the PMCC of y and x² is −0.999984. And so we’re supposed to conclude that y = a + bx² is the better model.
(iv) In general, the estimated regression equation is y − ȳ = b(x − x̄), where b = ∑ x̂i ŷi / ∑ x̂i².
So in this case, the estimated regression equation is
But P(E ∩ F) = P(E)P(F∣E) = 0.6² (0.05 ⋅ 0.95 + 0.95 ⋅ 0.05) = 0.0342. Hence P(E∣F) = 0.0342/0.109272 ≈ 0.312980.
A697 (9740 N2011/II/10)(i) We are given that T ∼ N(38.0, 5.0²).
Let X be the time taken to install the component after background music is introduced.
Assume that X remains normally distributed with standard deviation 5.0 (these are questionable assumptions, but without these we cannot proceed). That is, X ∼ N(µ0, 5.0²).
The null hypothesis is H0 ∶ µ0 = 38.0 and the alternative hypothesis is HA ∶ µ0 < 38.0.
(ii) Z0.05 ≈ 1.645. So to reject the null hypothesis, we must have t̄ < µ0 − Z0.05 σ/√n = 38.0 − 1.645 ⋅ 5.0/√50 ≈ 36.8.
(iii) Since the null is not rejected with t̄ = 37.1, we must have t̄ = 37.1 > µ0 − Z0.05 σ/√n = 38.0 − 1.645 ⋅ 5.0/√n. Rearranging, n < (1.645 ⋅ 5.0/0.9)² ≈ 83.5. Thus, n ∈ {1, 2, . . . , 83}.
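The critical value in (ii) and the bound on n in (iii) can be recomputed as follows. This is a sketch only; it uses the exact normal quantile rather than the tabulated 1.645, so the later decimals may differ slightly from a hand calculation.

```python
from math import floor, sqrt
from statistics import NormalDist

mu0, sigma = 38.0, 5.0
z = NormalDist().inv_cdf(0.95)  # about 1.645, for a one-tailed 5% test

# (ii) Reject H0 when the sample mean falls below this critical value (n = 50).
critical = mu0 - z * sigma / sqrt(50)

# (iii) H0 is not rejected at a sample mean of 37.1, so 37.1 > mu0 - z*sigma/sqrt(n).
n_bound = (z * sigma / (mu0 - 37.1)) ** 2
largest_n = floor(n_bound)

print(round(critical, 1), round(n_bound, 1), largest_n)
```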
A698 (9740 N2011/II/11)(i) There are in total C(30, 10) ways to choose the committee. There are C(18, 4) × C(12, 6) ways to choose a committee with exactly 4 women. Hence, the desired probability is
$$\frac{\binom{18}{4}\binom{12}{6}}{\binom{30}{10}} = \frac{17 \cdot 48}{29 \cdot 13 \cdot 23} \approx 0.0941068.$$
(ii) The number of ways to choose a committee with exactly r women is:
$$\binom{18}{r}\binom{12}{10-r}.$$
And the number of ways to choose a committee with exactly r + 1 women is:
$$\binom{18}{r+1}\binom{12}{9-r}.$$
We are told that the first number is greater than the second, i.e.
$$\binom{18}{r}\binom{12}{10-r} > \binom{18}{r+1}\binom{12}{9-r} \iff (17-r)!\,(r+1)!\,(3+r)!\,(9-r)! > (18-r)!\,r!\,(2+r)!\,(10-r)! \quad \text{(as desired).}$$
Continuing with the algebra, we have (r + 1)(3 + r) > (18 − r)(10 − r) ⇐⇒ r² + 4r + 3 > r² − 28r + 180 ⇐⇒ 32r > 177 ⇐⇒ r > 5 + 17/32.
We have just proven that P(R = r) > P(R = r + 1) if and only if r = 6, 7, 8, 9. That is,
we have just shown that P(R = 6) > P(R = 7) > P(R = 8) > P(R = 9) > P(R = 10), but
P(R = 0) ≤ P(R = 1) ≤ P(R = 2) ≤ P(R = 3) ≤ P(R = 4) ≤ P(R = 5) ≤ P(R = 6).
We have thus shown that 6 is a most-probable-number-of-women and that 7, 8, 9, 10 are
not. We must rule out that 5 (or any smaller number) is a most-probable-number-of-women.
But clearly, 6!4! ≠ 5!5!, so that
Hence, it is indeed the case that P(R = 5) < P(R = 6). Thus, 6 is indeed the unique
most-probable-number-of-women.
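Both parts of A698 can be confirmed by tabulating the whole distribution of R, the number of women on the committee (18 women, 12 men, committee of 10). The helper name `p_women` is illustrative.

```python
from math import comb


def p_women(r, women=18, men=12, size=10):
    """P(exactly r women on a committee of `size` chosen at random)."""
    return comb(women, r) * comb(men, size - r) / comb(women + men, size)


pmf = {r: p_women(r) for r in range(11)}
mode = max(pmf, key=pmf.get)  # most probable number of women

print(round(pmf[4], 7), mode)
```

The dictionary also makes it easy to see that the probabilities rise up to r = 6 and fall afterwards, matching the inequality r > 5 + 17/32.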
A699 (9740 N2011/II/12)(i) Let X be the number of people who join the queue in a
period of 4 minutes. Then X ∼ Po(4.8) and:
$$P(X \ge 8) = 1 - P(X \le 7) = 1 - e^{-4.8} \sum_{i=0}^{7} \frac{4.8^i}{i!} \approx 0.113334.$$
(ii) Let Y be the number of people who join the queue in a period of t seconds. Then Y ∼ Po(1.2t/60) = Po(0.02t). We are told that P(Y ≤ 1) = 0.7. That is,
$$e^{-0.02t}(1 + 0.02t) = 0.7.$$
By calculator, t ≈ 54.8675.
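The condition P(Y ≤ 1) = 0.7 with Y ∼ Po(0.02t) has no closed-form solution for t, which is why a calculator is needed. One way to mimic the calculator is plain bisection; the helper `bisect` below is written for this sketch and is not a library routine.

```python
from math import exp


def poisson_at_most_one(lam):
    """P(Y <= 1) for Y ~ Po(lam)."""
    return exp(-lam) * (1 + lam)


def bisect(f, lo, hi, tol=1e-10):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid  # root lies in the upper half
        else:
            hi = mid               # root lies in the lower half
    return (lo + hi) / 2


t = bisect(lambda t: poisson_at_most_one(0.02 * t) - 0.7, 0.0, 200.0)
print(round(t, 4))
```

The bracket [0, 200] is an arbitrary choice that happens to contain the root; any interval over which the function changes sign would do.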
(iii) Let Z be the number of people who leave the queue over 15 minutes. Then Z ∼ Po(27).
Let B be the number of people who join the queue over 15 minutes. Then B ∼ Po(18).
We wish to find P(35 + B − Z ≥ 24) = P(Z − B ≤ 11).
Since λZ = 27 is large, a suitable approximation for Z is the normal distribution A ∼ N(27, 27). Since λB = 18 is large, a suitable approximation for B is the normal distribution C ∼ N(18, 18). In turn, a suitable approximation for Z − B is A − C ∼ N(9, 45). Hence, using also the continuity correction,
$$P(Z - B \le 11) \approx P(A - C \le 11.5) = \Phi\left(\frac{11.5 - 9}{\sqrt{45}}\right) \approx \Phi(0.3727) \approx 0.6453.$$
(iv) There might be certain periods of time when more planes arrive and other periods when
fewer arrive. So the rate at which people join the queue will probably not be constant.
A700 (9740 N2010/II/5)(i) Say we wish to stratify the spectators by age group. One
problem is that we may not know what proportion of the spectators belongs to each age
group. As such, it may be difficult to get a representative sample.
(ii) Order the spectators by their names, alphabetically. Choose every 100th spectator on
the list to survey.
A701 (9740 N2010/II/6)(i)
$$\bar{t} = \frac{\sum t}{n} = \frac{454.3}{11} = 41.3,$$
Since ∣T ∣ < t10,0.05 = 1.812, we are unable to reject the null hypothesis.
A702 (9740 N2010/II/7)(i) P(A ∩ B ′ ) = P(A∣B ′ )P(B ′ ) = 0.8 ⋅ 0.4 = 0.32.
(ii) P(A ∪ B) = P(B) + P(A ∩ B ′ ) = 0.92.
(iii) P(B ′ ∣A) = P(B ′ ∩ A) ÷ P(A) = 0.32 ÷ 0.7 = 16/35 ≈ 0.457142857.
(iv) P(A′ ∩ C) = P(A′ )P(C) = 0.3 ⋅ 0.5 = 0.15.
(v) P(A′ ∩ B ∩ C) ≤ 0.15.
A703 (9740 N2010/II/8)(i) The probability that the number is greater than 30000 is
the probability that the first digit is 3, 4, or 5. Answer: 3/5 = 0.6.
(ii) The first three digits are odd and there are 3! ways to arrange them. The last two are
even and there are 2! ways to arrange them. The total number of ways to arrange the five
digits is 5!. Answer: 3!2!/5! = 1/10 = 0.1.
(iii) If the first digit is 3, the last digit must be 1 or 5, and in each case, there are 3! ways to arrange the middle 3 digits.
Similarly, if the first digit is 5, the last digit must be 1 or 3, and in each case, there are 3!
ways to arrange the middle 3 digits.
If the first digit is 4, the last digit can be 1, 3, or 5, and in each case, there are 3! ways to
arrange the middle 3 digits.
Altogether then, there are 7 ⋅ 3! ways to get such a number and the desired probability is
7 ⋅ 3!/5! = 7/20 = 0.35.
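With only 5! = 120 arrangements, A703 can be checked by brute force. The sketch assumes, as the answers above suggest, that the number is formed by using each of the digits 1 to 5 exactly once, and that (iii) asks for a number that is both greater than 30000 and odd.

```python
from fractions import Fraction
from itertools import permutations

# Every 5-digit number using each of 1, 2, 3, 4, 5 exactly once (assumed setup).
numbers = ["".join(p) for p in permutations("12345")]


def prob(predicate):
    """Exact probability that a randomly chosen arrangement satisfies predicate."""
    return Fraction(sum(1 for s in numbers if predicate(s)), len(numbers))


ODD = set("135")
p_i = prob(lambda s: int(s) > 30000)
p_ii = prob(lambda s: all(d in ODD for d in s[:3]) and all(d not in ODD for d in s[3:]))
p_iii = prob(lambda s: int(s) > 30000 and s[-1] in ODD)

print(p_i, p_ii, p_iii)
```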
A704 (9740 N2010/II/9)(i) Our desired probability is P(Y > 2X) = P(Y − 2X > 0).
Now, Y − 2X ∼ N(400 − 2 ⋅ 180, 60² + 2² ⋅ 30²) = N(40, 7200). So
$$P(Y - 2X > 0) = 1 - \Phi\left(\frac{0 - 40}{\sqrt{7200}}\right) = \Phi(0.4714) \approx 0.6813.$$
$$0.12X + 0.05Y \sim N\left(0.12 \cdot 180 + 0.05 \cdot 400,\ 0.12^2 \cdot 30^2 + 0.05^2 \cdot 60^2\right) = N(41.6, 21.96)$$
$$\implies P(0.12X + 0.05Y > 45) = 1 - \Phi\left(\frac{45 - 41.6}{\sqrt{21.96}}\right) \approx 1 - \Phi(0.7255) \approx 1 - 0.7658 = 0.2342.$$
$$0.12X_1 + 0.12X_2 \sim N\left(0.12 \cdot 180 + 0.12 \cdot 180,\ 0.12^2 \cdot 30^2 + 0.12^2 \cdot 30^2\right) = N(43.2, 25.92)$$
$$P(0.12X_1 + 0.12X_2 > 45) = 1 - \Phi\left(\frac{45 - 43.2}{\sqrt{25.92}}\right) \approx 1 - \Phi(0.3536) \approx 1 - 0.6381 = 0.3619.$$
$$P(X = 8) = e^{-12}\,\frac{12^8}{8!} \approx 0.0655233.$$
(ii) Let Y be the number of calls received in a randomly chosen period of t seconds. Then Y ∼ Po(3t/60) = Po(0.05t) and P(Y = 0) = e^(−0.05t) = 0.2. So t = (ln 0.2)/(−0.05) ≈ 32.
(iii) Let Z be the number of calls received in a randomly chosen period of 12 hours.
Then Z ∼ Po(2160) and a suitable approximation therefor is the normal distribution A ∼
N (2160, 2160). Hence, using also the continuity correction,
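$$P(Z > 2200) \approx P(A \ge 2200.5) = 1 - \Phi\left(\frac{2200.5 - 2160}{\sqrt{2160}}\right) \approx 1 - \Phi(0.8714) \approx 1 - 0.8082 = 0.1918.$$
(iv) C(6, 2) ⋅ 0.1918² ⋅ 0.8082⁴ ≈ 0.2354.
$$P(B \le 10) \approx P(C \le 10.5) = \Phi\left(\frac{10.5 - 5.754}{\sqrt{4.650}}\right) \approx \Phi(2.201) \approx 0.9861.$$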
(ii) No. A linear model would imply that several centuries hence, the time taken to run a
mile would be negative, which is clearly impossible.
The scatter diagram similarly suggests that the rate of improvement is tapering off, rather
than linear.
(iii) A quadratic model would imply that the world record time taken to run a mile eventu-
ally bottoms out, then starts increasing. But by definition, it is impossible that the world
record time increases.
(iv) In general, the estimated regression equation is y − ȳ = b(x − x̄), where b = ∑ x̂i ŷi / ∑ x̂i².
So in this case, the estimated regression equation is
t(2010) ≈ e^(−0.0161280 ⋅ 2010 + 34.853071) ≈ 11.4. So the predicted world record time on 1st January 2010 is 3 m 41.4 s.
Our range of data is 1930-2000. We are extrapolating our data, which might not always
work out reliably.
A709 (9740 N2009/II/7)(i) Let E and F be the events that “a randomly chosen component is faulty” and “a randomly chosen component was supplied by A”. Then:
f′(p) = 7.5(3 + 0.02p)^(−2) (0.02) > 0. This shows that the probability that a randomly chosen
component that is faulty was supplied by A is increasing in the percentage of electronic
components bought from A. Which is not very surprising.
A710 (9740 N2009/II/8)(i) We have 8 letters total, 3 of which are repeated. Hence,
there are 8!/3! = 6720 possible permutations.
(ii) Let TD or DT be a single letter. Then we have 7 “letters” total, 3 of which are
repeated, so there are 2! × 7!/3! possible permutations that we do not want. So there are
6720 − 2! × 7!/3! = 5040 possible permutations that we do want.
(iii) The 4 consonants by themselves have 4! possible permutations. The 4 vowels by
themselves have 4! ÷ 3! = 4 possible permutations. The first letter can either be a consonant
or a vowel. Hence, there are in total 2 × 4! × 4 = 192 possible permutations.
(iv) There are only four broad possibilities: E _ _ E _ _ E _, E _ _ E _ _ _ E, E _ _ _ E _ _ E, and _ E _ _ E _ _ E. Each of these has 5! possible permutations. Hence, there are in total 4 × 5! = 480 possible permutations.
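All four counts in A710 can be verified by enumeration. The question's word is not reproduced in this answer, so the sketch assumes a word with the right letter profile (three Es, one other vowel, and four distinct consonants including T and D); ELEVATED fits and is used purely as a stand-in. Part (iv) is read here as requiring at least two letters between consecutive Es, which is also an assumption.

```python
from itertools import permutations

WORD = "ELEVATED"  # assumed stand-in: E x3, one other vowel, consonants incl. T and D
VOWELS = set("AEIOU")

arrangements = set(permutations(WORD))  # distinct arrangements only
total = len(arrangements)

# (ii) T and D are not next to each other.
t_d_apart = sum(1 for a in arrangements if abs(a.index("T") - a.index("D")) != 1)

# (iii) Consonants and vowels alternate.
alternating = sum(
    1
    for a in arrangements
    if all((a[i] in VOWELS) != (a[i + 1] in VOWELS) for i in range(len(a) - 1))
)


def es_well_spaced(a):
    """True if consecutive Es have at least two other letters between them."""
    positions = [i for i, ch in enumerate(a) if ch == "E"]
    return all(q - p >= 3 for p, q in zip(positions, positions[1:]))


# (iv) At least two letters between any two Es (assumed reading).
spaced = sum(1 for a in arrangements if es_well_spaced(a))

print(total, t_d_apart, alternating, spaced)
```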
A711 (9740 N2009/II/9)(i) M̄ ∼ N(2.5, 0.1²/n). So
$$P(\bar{M} > 2.53) = 1 - \Phi\left(\frac{2.53 - 2.5}{\sqrt{0.1^2/n}}\right) = 1 - \Phi\left(0.3\sqrt{n}\right) = 0.0668 \iff \Phi\left(0.3\sqrt{n}\right) = 0.9332 \iff 0.3\sqrt{n} = 1.5 \iff n = 25.$$
(ii) Assuming the thicknesses of the textbooks are independently distributed,
$$X = M_1 + \dots + M_{21} + S_1 + \dots + S_{24} \sim N\left(21 \cdot 2.5 + 24 \cdot 2.0,\ 21 \cdot 0.1^2 + 24 \cdot 0.08^2\right) = N(100.5, 0.3636).$$
Now,
$$P(X \le 100) = \Phi\left(\frac{100 - 100.5}{\sqrt{0.3636}}\right) \approx 1 - \Phi(0.8292) \approx 1 - 0.7964 = 0.2036.$$
(iii) Again assuming the thicknesses of the textbooks are independently distributed, our desired probability is P(S1 + S2 + S3 + S4 < 3M) = P(S1 + S2 + S3 + S4 − 3M < 0). Now,
$$S_1 + S_2 + S_3 + S_4 - 3M \sim N\left(4 \cdot 2.0 - 3 \cdot 2.5,\ 4 \cdot 0.08^2 + 3^2 \cdot 0.1^2\right) = N(0.5, 0.1156).$$
Hence,
$$P(S_1 + S_2 + S_3 + S_4 - 3M < 0) = \Phi\left(\frac{0 - 0.5}{\sqrt{0.1156}}\right) \approx 1 - \Phi(1.4706) \approx 1 - 0.9293 = 0.0707.$$
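The three parts of A711 differ in how the variances combine: a sample mean divides the variance by n, a sum of n independent copies multiplies it by n, and a multiple 3M multiplies it by 3². A sketch that keeps the three cases side by side:

```python
from math import sqrt
from statistics import NormalDist

std = NormalDist()  # standard normal

# (i) Sample mean: P(mean > 2.53) = 0.0668 means (2.53 - 2.5)/(0.1/sqrt(n)) = z.
z = std.inv_cdf(1 - 0.0668)            # about 1.5
n_needed = (z * 0.1 / (2.53 - 2.5)) ** 2

# (ii) Sum of 21 + 24 independent thicknesses: variances add up n times.
total = NormalDist(21 * 2.5 + 24 * 2.0, sqrt(21 * 0.1**2 + 24 * 0.08**2))
p_ii = total.cdf(100)

# (iii) Four independent S's minus 3 times ONE M: the 3 is squared.
diff = NormalDist(4 * 2.0 - 3 * 2.5, sqrt(4 * 0.08**2 + 3**2 * 0.1**2))
p_iii = diff.cdf(0)

print(round(n_needed), round(p_ii, 4), round(p_iii, 4))
```

Small differences from the tabulated answers in the fourth decimal place are expected, since the tables round z to fewer digits.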
$$s^2 = \frac{\sum x^2 - \left(\sum x\right)^2/n}{n - 1} = \frac{835.92 - 86.4^2/9}{8} \approx 0.81.$$
Since ∣t∣ < t8,0.025 = 2.306, we are unable to reject the null hypothesis.
The sample size is small. And so we are unable to appeal to the CLT and claim that a
normal distribution is a suitable approximate distribution for x̄.
(Author’s remark: It actually makes no sense to say that “the CLT does not apply in this
context”. The CLT certainly applies. It is merely that the normal distribution is a poor
approximation for the sample mean.)
(iii) We’d use the Z-test instead.
A713 (9740 N2009/II/11)(i) The probability that any observed car is red is independent
of whether any other observed car is red.
Each car is either strictly red or strictly not red.
$$= \binom{20}{4} 0.15^4\, 0.85^{16} + \binom{20}{5} 0.15^5\, 0.85^{15} + \binom{20}{6} 0.15^6\, 0.85^{14} + \binom{20}{7} 0.15^7\, 0.85^{13} \approx 0.346354.$$
(iii) Since np and n(1−p) are large, a suitable approximation to R is the normal distribution
X ∼ N (72, 50.4). Hence, using also the continuity correction,
$$P(R < 60) \approx P(X < 59.5) = \Phi\left(\frac{59.5 - 72}{\sqrt{50.4}}\right) \approx 1 - \Phi(1.761) \approx 1 - 0.9609 = 0.0391.$$
(iv) Since n is large and p is small, a suitable approximation to R is the Poisson distribution Y ∼ Po(4.8). Hence,
$$P(R = 3) \approx P(Y = 3) = e^{-4.8}\,\frac{4.8^3}{3!} \approx 0.152.$$
$$\text{(v)}\quad P(R = 0) + P(R = 1) = \binom{20}{0} p^0 (1 - p)^{20} + \binom{20}{1} p^1 (1 - p)^{19} = (1 - p)^{19}(1 - p + 20p) = 0.2.$$
By calculator, p ≈ 0.142432.
A714 (9740 N2008/II/5)(i) Take any ordered list of the 950 pupils. From the list, pick
every 19th student.
(ii) We might want each level to be equally well-represented. For example, we might like approximately one-sixth of the sample to be from Primary 1, another sixth from Primary 2, etc.
In which case we’d probably prefer to do a stratified sample. The method might be something like this: Pick from the aforementioned ordered list the first 108 Primary 1 students, the first 108 Primary 2 students, etc.
A715 (9740 N2008/II/6). Let the mass of calcium in a bottle (after the extreme
weather) be X ∼ N (µ0 , σ 2 ). (We have made the necessary assumption that X is normally
distributed.)
The null hypothesis is H0 ∶ µ0 = 78 and the alternative hypothesis is HA ∶ µ0 ≠ 78. Now,
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{\sum x/n - 78}{\sqrt{\left[\sum x^2 - \left(\sum x\right)^2/n\right]/(n - 1)}\,\big/\sqrt{n}} \approx -1.207.$$
Since ∣t∣ < t14,0.025 ≈ 2.145, we are unable to reject the null hypothesis.
A716 (9740 N2008/II/7)(i) Let A1 denote the event that A wins the first set. Similarly
define A2 , A3 , B1 , B2 , and B3 . P (A2 ) = P (A1 ∩ A2 ) + P (B1 ∩ A2 ) = 0.6 ⋅ 0.7 + 0.4 ⋅ 0.2 = 0.5.
(ii) P (A wins) = P (A1 ∩ A2 ) + P (A1 ∩ B2 ∩ A3 ) + P (B1 ∩ A2 ∩ A3 ) = 0.42 + 0.6 ⋅ 0.3 ⋅ 0.2 +
0.4 ⋅ 0.2 ⋅ 0.7 = 0.42 + 0.036 + 0.056 = 0.512.
(iii) P (B1 ∩ A2 ∩ A3 ) /P (A wins) = 0.056/0.512 = 0.109375.
A717 (9740 N2008/II/8)(i) PMCC ≈ 0.9695281468. This large PMCC merely suggests
that there is a strong (positive) linear relationship between x and t. However, the true
relationship between x and t could be something other than linear.
(ii)
(iii) Without P , it appears that t is increasing, but at a decreasing rate. So a log model
might be appropriate.
(iv) In general, the estimated regression equation is y − ȳ = b(x − x̄), where
$$P(Z < 80) \approx P(A < 79.5) = \Phi\left(\frac{79.5 - 90}{\sqrt{90}}\right) \approx 1 - \Phi(1.1068) \approx 1 - 0.8657 = 0.1343.$$
(iv) An organisation might buy a relatively-large number of grand pianos on any given day.
So it is not likely that the rate at which grand pianos are sold is constant throughout the
year.
A719 (9740 N2008/II/10)(i) C(3, 2) × C(4, 3) × C(5, 3) = 3 ⋅ 4 ⋅ 10 = 120.
(ii) C(9, 8) = 9.
(iii) C(5, 4) × C(7, 4) + C(5, 5) × C(7, 3) = 5 ⋅ 35 + 1 × 35 = 210.
(iv) The number of ways to have
• No diplomats from K (i.e. only diplomats from L and M) is C(9, 8);
• No diplomats from L is C(8, 8);
• No diplomats from M is 0.
The total number of ways to choose the diplomats is C(12, 8). Hence the number of ways to have at least 1 diplomat from each island is
$$\binom{12}{8} - \left[\binom{9}{8} + \binom{8}{8}\right] = 495 - (9 + 1) = 485.$$
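Since there are only C(12, 8) = 495 possible delegations, parts (i) and (iv) can be confirmed by listing them all. The island sizes (3 from K, 4 from L, 5 from M) are read off the binomial coefficients above and should be treated as an assumption about the question.

```python
from collections import Counter
from itertools import combinations

# One label per diplomat: 3 from K, 4 from L, 5 from M (assumed from the working).
diplomats = ["K"] * 3 + ["L"] * 4 + ["M"] * 5

delegations = list(combinations(range(len(diplomats)), 8))
total = len(delegations)

# (i) Exactly 2 from K, 3 from L and 3 from M.
target = Counter({"K": 2, "L": 3, "M": 3})
exactly_2_3_3 = sum(
    1 for team in delegations if Counter(diplomats[i] for i in team) == target
)

# (iv) At least one diplomat from each island.
all_islands = sum(
    1 for team in delegations if {diplomats[i] for i in team} == {"K", "L", "M"}
)

print(total, exactly_2_3_3, all_islands)
```

The subtraction in (iv) works without any inclusion-exclusion correction because no delegation of 8 can miss two islands at once.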
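$$P(X_1 + X_2 > 120) = 1 - \Phi\left(\frac{120 - 100}{\sqrt{2 \cdot 8^2}}\right) \approx 1 - \Phi(1.768) \approx 1 - 0.9615 = 0.0385.$$
$$P(X_1 > X_2 + 15) = P(X_1 - X_2 > 15) = 1 - \Phi\left(\frac{15 - 0}{\sqrt{2 \cdot 8^2}}\right) \approx 1 - \Phi(1.3258) \approx 1 - 0.9075 = 0.0925.$$
$$\text{(iii)}\quad P(Y < 74) = \Phi\left(\frac{74 - \mu}{\sigma}\right) = 0.0668 \iff \frac{74 - \mu}{\sigma} = -1.5.$$
$$P(Y > 146) = 1 - \Phi\left(\frac{146 - \mu}{\sigma}\right) = 0.0668 \iff \Phi\left(\frac{146 - \mu}{\sigma}\right) = 0.9332 \iff \frac{146 - \mu}{\sigma} = 1.5.$$
$$\frac{146 - \mu}{\sigma} - \frac{74 - \mu}{\sigma} = 1.5 - (-1.5) \iff \frac{72}{\sigma} = 3 \iff \sigma = 24 \text{ and } \mu = 110.$$
Since σ = 8a and µ = 50a + b, a = 3 and b = −40.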
A721 (9233 N2008/I/1). There are 3! ways to arrange the 3 groups of books. And within each group of books, we can permute them as usual. So there are 3!6!5!4! = 12 441 600 ways.
A722 (9233 N2008/II/23). By independence, pA∩B = pA pB . Also pA∪B = pA +pB −pA∩B =
pA + pB − pA pB . Plugging in the given numbers, we have 0.4 = 0.2 + pB − 0.2pB , so pB = 0.25.
pB pC = 0.25 ⋅ 0.4 = 0.1 = pB∩C , so that by definition, B and C are indeed independent.
A723 (9233 N2008/II/26)(i) Let X ∼ Po(3). P(X > 2) = 1 − P(X ≤ 2) = 1 − e^(−3) (1 + 3 + 9/2) = 1 − 8.5e^(−3) ≈ 1 − 0.423 = 0.577.
(ii) Let Y be the number of times the machine will break down in a period of four weeks.
Then Y ∼ Po(12).
(iii) Let Z be the number of times the machine will break down in a period of 16 weeks.
Then Z ∼ Po(48). Since λZ is large, a suitable approximation for Z is the normal distribu-
tion A ∼ N (48, 48). Hence, using also the continuity correction,
$$P(Z > 50) \approx P(A > 50.5) = 1 - \Phi\left(\frac{50.5 - 48}{\sqrt{48}}\right) \approx 1 - \Phi(0.3608) \approx 1 - 0.6409 = 0.3591.$$
A724 (9233 N2008/II/27)(i) Let the mass after the adjustment be X ∼ N (µ0 , σ 2 ). It
is necessary to assume that these masses remain normally distributed. The null hypothesis
is H0 ∶ µ0 = 32.40 and the alternative hypothesis is HA ∶ µ0 ≠ 32.40. Now,
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{32.00 - 32.40}{\sqrt{2.892/80}} \approx -2.104.$$
Since ∣t∣ > t79,0.025 ≈ 1.99, we can reject the null hypothesis.
(ii) This means that if H0 were true and we tested infinitely many size-80 samples (as done
above), we’d reject H0 in 5% of the samples.
(iii) The one-tailed p-value is ≈ 0.0193. So the least level of significance is 1.93%.
A725 (9233 N2008/II/29)(i) Let X ∼ N (50, 42 ). The probability that Mr Sim is late
on any given day is
$$P(X > 55) = 1 - \Phi\left(\frac{55 - 50}{4}\right) = 1 - \Phi(1.25) \approx 1 - 0.8944 = 0.1056.$$
Assuming that the probability that he’s late each day is independent of whether he was
late on any other day, the probability that he will be late no more than once in 5 days is
$$\binom{5}{0} 0.1056^0\, 0.8944^5 + \binom{5}{1} 0.1056^1\, 0.8944^4 \approx 0.910.$$
(ii) Let Y ∼ N(40, 5²). Our desired probability is P(X − Y − 5 < 0). Assuming the journey times of Messrs Sim and Lee are independent, X − Y − 5 ∼ N(5, 4² + 5²). Thus,
$$P(X - Y - 5 < 0) = \Phi\left(\frac{0 - 5}{\sqrt{4^2 + 5^2}}\right) \approx 1 - \Phi(0.7809) \approx 1 - 0.7826 = 0.2174.$$
(iii) Assume that the journey times of Messrs Sim and Lee each day are independent. Then
the desired probability is
A726 (9233 N2008/II/30)(i) Let M ∼ N(µ, σ²). Then
$$P(M < 86.50) = \Phi\left(\frac{86.50 - \mu}{\sigma}\right) = 0.12 \iff \frac{86.50 - \mu}{\sigma} = -1.175, \tag{1}$$
$$P(M > 92.25) = 1 - \Phi\left(\frac{92.25 - \mu}{\sigma}\right) = 0.2 \iff \Phi\left(\frac{92.25 - \mu}{\sigma}\right) = 0.8 \iff \frac{92.25 - \mu}{\sigma} = 0.842. \tag{2}$$
(2) minus (1) yields 5.75/σ = 2.017 ⇐⇒ σ ≈ 2.85. And now µ ≈ 89.85.
(ii) Let X ∼ N(µ, σ²). Then
$$P(\mu - 2 \le X \le \mu + 2) = 0.8 \implies P(X \le \mu + 2) = 0.9 \iff \Phi\left(\frac{2}{\sigma}\right) = 0.9 \iff \frac{2}{\sigma} \approx 1.281 \iff \sigma \approx 1.56.$$
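(iii) Let X̄ ∼ N(µ, σ²/n). Then
$$P(\bar{X} \ge \mu + 0.50) \le 0.1 \iff 1 - \Phi\left(\frac{0.50}{\sigma/\sqrt{n}}\right) \le 0.1 \iff 0.9 \le \Phi\left(\frac{0.50\sqrt{n}}{\sigma}\right) \iff \frac{0.50\sqrt{n}}{\sigma} \ge 1.281 \iff \frac{0.50\sqrt{n}}{\sigma} \ge \frac{2}{\sigma} \iff 0.50\sqrt{n} \ge 2 \iff n \ge 16.$$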
A727 (9740 N2007/II/5)(i) Consider a survey of whether students like a particular
teacher. A quota of 10 students is to be chosen. Take a list of the teacher’s students, sort
their names alphabetically, and pick the first 10 students on the list.
One disadvantage is that this sample of 10 students might not be representative. For
example, they might all be siblings from the same family of Angs.
$$\binom{10}{0} 0.24^0\, 0.76^{10} + \binom{10}{1} 0.24^1\, 0.76^9 + \dots + \binom{10}{4} 0.24^4\, 0.76^6 \approx 0.933.$$
(i) Let X ∼ B(1000, 0.24) be the number of people in a sample of 1000 that have gene A.
Since np = 240 > 5 and n(1 − p) = 760 > 5 are both large, a suitable approximation for X is
the normal distribution Y ∼ N (240, 182.4). Hence, using also the continuity correction,
(ii) Let Z ∼ B(1000, 0.003) be the number of people in a sample of 1000 that have gene B. Since n is large and p is small, a suitable approximation for Z is the Poisson distribution A ∼ Po(3). Hence,
$$\bar{x} = \frac{\sum x}{n} = \frac{4626}{150} = 30.84 \quad\text{and}\quad s^2 = \frac{\sum x^2 - \left(\sum x\right)^2/n}{n - 1} \approx 33.7259.$$
(ii) Let H0 ∶ µ0 = 30 and HA ∶ µ0 > 30 be the null and alternative hypotheses. Now,
$$Z = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} \approx \frac{30.84 - 30}{\sqrt{33.7259/150}} \approx 1.772.$$
(ii) Let T be the weight of a randomly chosen turkey. Then T ∼ N(10.5, 2.1²). Then 5T ∼ N(5 ⋅ 10.5, 5² ⋅ 2.1²) = N(52.5, 10.5²) and:
$$P(5T > 55) = 1 - \Phi\left(\frac{55 - 52.5}{10.5}\right) = 1 - \Phi\left(\frac{5}{21}\right) \approx 0.405904.$$
Thus, P(3C > 7) ⋅ P(5T > 55) ≈ 0.160.
$$P(3C + 5T > 62) = 1 - \Phi\left(\frac{62 - 59.1}{\sqrt{112.5}}\right) \approx 1 - \Phi(0.2734) \approx 0.392.$$
(iv) The event “chicken costs more than $7 and turkey costs more than $55” is a proper subset of the event “chicken and turkey together cost more than $62”. By the monotonicity of probability, the probability of the latter is at least that of the former (and here it is strictly greater: 0.392 versus 0.160).
A731 (9740 N2007/II/9)(i)(a) 12! (b) 6! ⋅ 26 .
(ii)(a) 11!
(ii)(b) Fix any man. Then we must have to his right: Woman, man, woman, man, etc. So
6!5!
(ii)(c) Fix any man A. Then we must have:
• To his right: “Wife A, some other man, that some other man’s wife, etc.”; OR
• To his left: “Wife A, some other man, that some other man’s wife, etc.”.
In the first scenario, we have 5! possible arrangements. Likewise in the second. Altogether
2 ⋅ 5! possible arrangements.
A732 (9740 N2007/II/10).
Figure to be
inserted here.
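$$\text{(i)}\quad P(1, 1, 1) = \frac{1}{8} \cdot \frac{1}{4} \cdot \frac{1}{2} = \frac{1}{64}.$$
$$\text{(ii)}\quad P(1, 1) + P(1, 0, 1) + P(0, 1, 1) = \frac{1}{8} \cdot \frac{1}{4} + \frac{1}{8} \cdot \frac{3}{4} \cdot \frac{1}{4} + \frac{7}{8} \cdot \frac{1}{8} \cdot \frac{1}{4} = \frac{8 + 2 \cdot 3 + 7}{256} = \frac{21}{256}.$$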
(iii) Let E and F be the events that “the third throw is successful” and “exactly two of
the three throws are successful”.
$$P(E \cap F) = P(1, 0, 1) + P(0, 1, 1) = \frac{1}{8} \cdot \frac{3}{4} \cdot \frac{1}{4} + \frac{7}{8} \cdot \frac{1}{8} \cdot \frac{1}{4} = \frac{13}{256}.$$
$$P(F) = P(E \cap F) + P(E' \cap F) = \frac{13}{256} + P(1, 1, 0) = \frac{17}{256}.$$
Thus, P(E∣F) = P(E ∩ F) ÷ P(F) = 13/17.
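The fractions in A732 can be checked exactly by enumerating all eight outcomes of three throws. The rule used below (success probability starts at 1/8, doubles after each success, and is unchanged after a failure) is inferred from the products in the working above, so treat it as an assumption about the question.

```python
from fractions import Fraction
from itertools import product


def success_prob(history):
    """P(success on the next throw) given earlier outcomes (1 = success)."""
    p = Fraction(1, 8)
    for outcome in history:
        if outcome == 1:
            p *= 2  # assumed rule: each success doubles the chance
    return p


def path_prob(path):
    """Probability of one complete sequence of outcomes."""
    prob = Fraction(1)
    for k, outcome in enumerate(path):
        p = success_prob(path[:k])
        prob *= p if outcome == 1 else 1 - p
    return prob


paths = list(product((0, 1), repeat=3))

p_all_three = path_prob((1, 1, 1))
p_at_least_two = sum(path_prob(q) for q in paths if sum(q) >= 2)
p_exactly_two = sum(path_prob(q) for q in paths if sum(q) == 2)
p_third_and_two = sum(path_prob(q) for q in paths if sum(q) == 2 and q[2] == 1)
p_conditional = p_third_and_two / p_exactly_two

print(p_all_three, p_at_least_two, p_conditional)
```

Using `Fraction` avoids rounding, so the results can be compared digit for digit with the hand-computed fractions.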
A733 (9740 N2007/II/11)(i) In general, the estimated regression equation is y − ȳ = b(x − x̄), where b = ∑ x̂i ŷi / ∑ x̂i². So in this case, the estimated regression equation is:
(ii) The distribution is “sufficiently nice” that with a sample size of 100, it is appropriate
to use the CLT.
A736 (9233 N2007/II/25)(i) P(W ∣B) = 20/52 = 5/13 ≈ 0.384615.
(ii) P(B∣W ) = 20/40 = 0.5.
(iii) P(B ∪ W) = (40 + 32)/100 = 72/100 = 0.72.
(iv) P(W)P(B) = 0.4 ⋅ 0.52 = 0.208 ≠ 0.2 = P(B ∩ W) and so W and B are not independent.
There are men who take chemistry (equivalently, P(M ∩ C) ≠ 0), so M and C are not
mutually exclusive.
A737 (9233 N2007/II/26)(i) Let X be the number of genuine call-outs in a randomly
chosen two-week period. Then X ∼ Po(4) and
$$P(X < 6) = e^{-4}\left(1 + 4 + \frac{4^2}{2!} + \frac{4^3}{3!} + \frac{4^4}{4!} + \frac{4^5}{5!}\right) \approx 0.785130.$$
(ii) Let Y be the total number of call-outs in a randomly chosen six-week period. Then
Y ∼ Po(15) and since λY is large, a suitable approximation for Y is the normal distribution
Z ∼ N (15, 15). Hence, using also the continuity correction,
(ii) 0.74L + 0.86H ∼ N(0.74 ⋅ 5 + 0.86 ⋅ 3, 0.74² ⋅ 0.1² + 0.86² ⋅ 0.05²) = N(6.28, 0.007325).
(ii) Let Y be the number of severe floods in a randomly-chosen 1000-year period. Then
Y ∼ Po(20). Since λY is large, a suitable approximation for Y is the normal distribution
Z ∼ N (20, 20). Hence, using also the continuity correction,
$$P(Y > 25) \approx P(Z > 25.5) = 1 - \Phi\left(\frac{25.5 - 20}{\sqrt{20}}\right) \approx 0.109.$$
Since ∣Z∣ > Z0.05 = 1.645, we can reject the null hypothesis.
(ii) If H0 is true and we conduct the above test on infinitely many size-80 samples, we’d
(falsely) reject H0 for 5% of the samples.
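A743 (9233 N2006/II/28)(i) Let the speed of any car (in km h⁻¹) be X ∼ N(µ, σ²). We are given that P(X > 125) = 1/80 and P(X < 40) = 1/10.
$$P(X > 125) = \frac{1}{80} \iff 1 - \Phi\left(\frac{125 - \mu}{\sigma}\right) = \frac{1}{80} \iff \Phi\left(\frac{125 - \mu}{\sigma}\right) = \frac{79}{80} \iff \frac{125 - \mu}{\sigma} \approx 2.240. \tag{1}$$
$$P(X < 40) = \frac{1}{10} \iff \Phi\left(\frac{40 - \mu}{\sigma}\right) = \frac{1}{10} \iff \frac{40 - \mu}{\sigma} \approx -1.282. \tag{2}$$
(1) minus (2) yields 85/σ ≈ 3.522 ⇐⇒ σ ≈ 24.1 and µ ≈ 70.9.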
$$\text{(ii)}\quad \binom{10}{0} 0.1^0\, 0.9^{10} + \binom{10}{1} 0.1^1\, 0.9^9 + \binom{10}{2} 0.1^2\, 0.9^8 + \binom{10}{3} 0.1^3\, 0.9^7 \approx 0.987.$$
(iii) Let Y be the number of cars out of a random sample of 100 that are travelling at speed less than 40 km h⁻¹. Then Y ∼ B(100, 0.1). Since np = 10 > 5 and n(1 − p) = 90 > 5 are both large, a suitable approximation to Y is the normal distribution Z ∼ N(10, 9). Hence, using also the continuity correction:
$$P(Y \le 8) \approx P(Z \le 8.5) = \Phi\left(\frac{8.5 - 10}{\sqrt{9}}\right) = 1 - \Phi(0.5) \approx 1 - 0.6915 = 0.3085.$$
2D two-dimensional
3D three-dimensional
SE Stack Exchange
SEAB Singapore Examinations and Assessment Board
UK United Kingdom
US United States of America
Singlish Used in This Textbook
We give the number of the page on which each Singlish expression is first used.
∵ because 3
∴ therefore 3
” ditto 3
YouTube.com/EconCow
EconsPhDTutor.com
Or email:
DrChooYanMin@gmail.com