Bayesian Statistics For Beginners A Step by Step Approach Therese M Donovan Full Chapter

Bayesian Statistics for Beginners: A
Step-By-Step Approach Therese M

Donovan
Visit to download the full and correct content document:
https://ebookmass.com/product/bayesian-statistics-for-beginners-a-step-by-step-appr
oach-therese-m-donovan/
OUP CORRECTED PROOF – FINAL, 6/5/2019, SPi
Bayesian Statistics
for Beginners
Bayesian Statistics
for Beginners
A Step-by-Step Approach
THERESE M. DONOVAN
RUTH M. MICKEY
1
3
Great Clarendon Street, Oxford, OX2 6DP,
United Kingdom
Oxford University Press is a department of the University of Oxford.
It furthers the University’s objective of excellence in research, scholarship,
and education by publishing worldwide. Oxford is a registered trade mark of
Oxford University Press in the UK and in certain other countries
© Ruth M. Mickey 2019
The moral rights of the author have been asserted
First Edition published in 2019
Impression: 1
All rights reserved. No part of this publication may be reproduced, stored in
a retrieval system, or transmitted, in any form or by any means, without the
prior permission in writing of Oxford University Press, or as expressly permitted
by law, by licence or under terms agreed with the appropriate reprographics
rights organization. Enquiries concerning reproduction outside the scope of the
above should be sent to the Rights Department, Oxford University Press, at the
address above
You must not circulate this work in any other form
and you must impose this same condition on any acquirer
Published in the United States of America by Oxford University Press
198 Madison Avenue, New York, NY 10016, United States of America
British Library Cataloguing in Publication Data
Data available
Library of Congress Control Number: 2019934655
ISBN 978–0–19–884129–6 (hbk.)
ISBN 978–0–19–884130–2 (pbk.)
DOI: 10.1093/oso/9780198841296.001.0001
Printed and bound by
CPI Group (UK) Ltd, Croydon, CR0 4YY
Links to third party websites are provided by Oxford in good faith and
for information only. Oxford disclaims any responsibility for the materials
contained in any third party website referenced in this work.
To our parents, Thomas and Earline Donovan and Ray and Jean Mickey,
for inspiring a love of learning.
To our mentors, some of whom we’ve met only by their written words,
for teaching us ways of knowing.
To Peter, Evan, and Ana—for everything.

Preface
Greetings. This book is our attempt at gaining membership to the Bayesian Conspiracy.
You may ask, “What is the Bayesian Conspiracy?” The answer is provided by Eliezer
Yudkowsky (http://yudkowsky.net/rational/bayes): “The Bayesian Conspiracy is a multi
national, interdisciplinary, and shadowy group of scientists that controls publication,
grants, tenure, and the illicit traffic in grad students. The best way to be accepted into the
Bayesian Conspiracy is to join the Campus Crusade for Bayes in high school or college, and
gradually work your way up to the inner circles. It is rumored that at the upper levels of the
Bayesian Conspiracy exist nine silent figures known only as the Bayes Council.”
Ha ha! Bayes’ Theorem, also called Bayes’ Rule, was published posthumously in 1763 in
the Philosophical Transactions of the Royal Society. In The Theory That Would Not Die,
author Sharon Bertsch McGrayne aptly describes how “Bayes’ rule cracked the enigma code,
hunted down Russian submarines, and emerged triumphant from two centuries of contro-
versy.” In short, Bayes’ Rule has vast application, and the number of papers and books that
employ it is growing exponentially.
Inspired by the knowledge that a Bayes Council actually exists, we began our journey by
enrolling in a 5-day ‘introductory’ workshop on Bayesian statistics a few years ago. On Day
1, we were introduced to a variety of Bayesian models, and, on Day 2, we sheepishly had to
inquire what Bayes’ Theorem was and what it had to do with MCMC. In other words, the
material was way over our heads. With tails between our legs, we slunk back home and
began trying to sort out the many different uses of Bayes’ Theorem.
As we read more and more about Bayes’ Theorem, we started noting our own questions as
they arose and began narrating the answers as they became more clear. The result is this
strange book, cast as a series of questions and answers between reader and author. In this
prose, we make heavy use of online resources such as the Oxford Dictionary of Statistics
(Upton and Cook, 2014), Wolfram Mathematics, and the Online Statistics Education: An
Interactive Multimedia Course for Study (Rice University, University of Houston Clear
Lake, and Tufts University). We also provide friendly links to online encyclopedias such
as Wikipedia and Encyclopedia Britannica. Although these should not be considered
definitive, original works, we have included the links to provide readers with a readily
accessible source of information and are grateful to the many authors who have contrib-
uted entries.
We are not experts in Bayesian statistics and make no claim as such. Therese Donovan is a
biologist for the U. S. Geological Survey Vermont Cooperative Fish and Wildlife Research
Unit, and Ruth Mickey is a statistician in the Department of Mathematics and Statistics at
the University of Vermont. We were raised on a healthy dose of “frequentist” and max-
imum likelihood methods but have begun only recently to explore Bayesian methods. We
have intentionally avoided controversial topics and comparisons between Bayesian and
frequentist approaches and encourage the reader to dig deeper—much deeper—than we
have here. Fortunately, a great number of experts have paved the way, and we relied heavily
on the following books while writing our own:
• N. T. Hobbs and M. B. Hooten. Bayesian Models: A Statistical Primer for Ecologists.
Princeton University Press, 2015.
viii PREFACE
• J. Kruschke. Doing Bayesian Data Analysis: A Tutorial with R, JAGS, and Stan.
Elsevier, 2015.
• A. Gelman, J. B. Carlin, H. S. Stern, and D. B. Rubin. Bayesian Data Analysis. Chapman &
Hall, 2004.
• J. V. Stone. Bayes’ Rule: a Tutorial Introduction to Bayesian Analysis. Sebtel Press, 2014.
• H. Raiffa and R. Schlaifer. Applied Statistical Decision Theory. Division of Research,
Graduate School of Business Administration, Harvard University, 1961.
• P. Goodwin and G. Wright. Decision Analysis for Management Judgment. John Wiley &
Sons, 2014.
Although we relied on these sources, any mistakes of interpretation are our own.
Our hope is that Bayesian Statistics for Beginners is a “quick read” for the uninitiated and
that, in one week or less, we could find a reader happily ensconced in a book written by one
of the experts. Our goal in writing the book was to keep Bayes’ Theorem front and center in
each chapter for a beginning audience. As a result, Bayes’ Theorem makes an appearance in
every chapter. We frequently bring back past examples and explain what we did “back
then,” allowing the reader to slowly broaden their understanding and sort out what has
been learned in order to relate it to new material. For the most part, our reviewers liked this
approach. However, if this is annoying to you, you can skim over the repeated portions.
If this book is useful to you, it is due in no small part to a team of stellar reviewers. We owe
a great deal of gratitude to George Allez, Cathleen Balantic, Barry Hall, Mevin Hooten, Peter
Jones, Clint Moore, Ben Staton, Sheila Weaver, and Robin White. Their enthusiasm,
questions, and comments have improved the narrative immensely. We offer a heartfelt
thank you to Gary Bishop, Renee Westland, Melissa Murphy, Kevin Roark, John Bell, and
Stuart Geman for providing pictures for this book.
Therese Donovan
Ruth Mickey
October 2018
Burlington, VT
Contents
SECTION 1 Basics of Probability
1 Introduction to Probability 3
2 Joint, Marginal, and Conditional Probability 11
SECTION 2 Bayes’ Theorem and Bayesian Inference
3 Bayes’ Theorem 29
4 Bayesian Inference 37
5 The Author Problem: Bayesian Inference with Two Hypotheses 48
6 The Birthday Problem: Bayesian Inference with Multiple Discrete Hypotheses 61
7 The Portrait Problem: Bayesian Inference with Joint Likelihood 73
SECTION 3 Probability Functions
8 Probability Mass Functions 87
9 Probability Density Functions 108
SECTION 4 Bayesian Conjugates
10 The White House Problem: The Beta-Binomial Conjugate 133

11 The Shark Attack Problem: The Gamma-Poisson Conjugate 150
12 The Maple Syrup Problem: The Normal-Normal Conjugate 172
SECTION 5 Markov Chain Monte Carlo
13 The Shark Attack Problem Revisited: MCMC with the Metropolis Algorithm 193
14 MCMC Diagnostic Approaches 212
15 The White House Problem Revisited: MCMC with the Metropolis–Hastings
Algorithm 224
16 The Maple Syrup Problem Revisited: MCMC with Gibbs Sampling 247
x CONTENTS
SECTION 6 Applications
17 The Survivor Problem: Simple Linear Regression with MCMC 269

18 The Survivor Problem Continued: Introduction to Bayesian Model Selection 308
19 The Lorax Problem: Introduction to Bayesian Networks 325

20 The Once-ler Problem: Introduction to Decision Trees 353
Appendices
A.1 The Beta-Binomial Conjugate Solution 369

A.2 The Gamma-Poisson Conjugate Solution 373
A.3 The Normal-Normal Conjugate Solution 379
A.4 Conjugate Solutions for Simple Linear Regression 385

A.5 The Standardization of Regression Data 395
Bibliography 399
Hyperlinks 403
Name Index 413
Subject Index 414
SECTION 1
Basics of Probability
Overview
And so we begin. This first section deals with basic concepts in probability theory, and
consists of two chapters.
• In Chapter 1, the concept of probability is introduced. Using an example, the chapter

focuses on a single characteristic and introduces basic vocabulary associated with
probability.
• Chapter 2 introduces additional terms and concepts used in the study of probability. The
chapter focuses on two characteristics observed at the same time, and introduces the
important concepts of joint probability, marginal probability, and conditional
probability.
After covering these basics, your Bayesian journey will begin.
CHAPTER 1
Introduction to Probability
In this chapter, we’ll introduce some basic terms used in the study of probability. By the end
of this chapter, you will be able to define the following:
• Sample space
• Outcome
• Discrete outcome
• Event
• Probability
• Probability distribution
• Uniform distribution
• Trial
• Empirical distribution
• Law of Large Numbers
To begin, let’s answer a few questions . . .
What is probability?
Answer: The best way to introduce probability is to discuss an example. Imagine you’re a
gambler, and you can win $1,000,000 if a single roll of a die turns up four. You get only one
roll, and the entry fee to play this game is $10,000. If you win, you’re a millionaire. If you
lose, you’re out ten grand.
Should you play?
Answer: It’s up to you!

If the roll always comes up a four, you should play! If it never comes up four, you’d be
foolish to play! Thus, it’s helpful to know something about the die. Is it fair? That is, is each
face equally likely to turn up? How likely are you to roll a four?
This type of question was considered by premier mathematicians Gerolamo Cardano
(1501–1576), Pierre de Fermat (1601–1665), Blaise Pascal (1623–1662), and others. These
brilliant minds created a branch of mathematics known as probability theory.
The rolling of a die is an example of a random process: the face that comes up is subject
to chance. In probability, our goal is to quantify a random process, such as rolling a die.
That is, we want to assign a number to it. If we roll a die, there are 6 possible outcomes
(possible results), namely, one, two, three, four, five, or six. The set of all possible outcomes
is called the sample space.
Bayesian Statistics for Beginners: A Step-by-Step Approach. Therese M. Donovan and Ruth M. Mickey,
Oxford University Press (2019). © Ruth M. Mickey 2019.
DOI: 10.1093/oso/9780198841296.001.0001
4 BAYESIAN STATISTICS FOR BEGINNERS
Let’s call the number of possible outcomes N, so N ¼ 6. These outcomes are discrete,
because each result can take on only one of these values. Formally, the word “discrete” is
defined as “individually separate and distinct.” In addition, in this example, there is a
finite number of possible outcomes, which means “there are limits or bounds.” In other
words, the number of possible outcomes is not infinite.
If we believe that each and every outcome is just as likely to result as every other outcome
(i.e., the die is fair), then the probability of rolling a four is 1/N or 1/6. We can then say, “In
6 rolls of the die, we would expect 1 roll to result in a four,” and write that as Pr(four) ¼ 1/6.
Here, the notation Pr means “Probability,” and we will use this notation throughout
this book.
How can we get a good estimate of Pr(four) for this particular die?
Answer: You collect some data.

Before you hand over the $10,000 entry, you ask the gamemaster if you could run an
“experiment” and roll the die a few times before you make the decision to play. In this
experiment, which consists of tossing a die many times, your goal is to get a rough estimate
of the probability of rolling a four compared to what you expect. To your amazement, the
gamemaster complies.
You start with one roll, which represents a single “trial” of an experiment. Suppose you roll
a three and give the gamemaster a sneer. Table 1.1 shows you rolled 1 three, and 0 for the rest.
Table 1.1
Outcome Frequency Probability

One 0 0
Two 0 0
Three 1 1
Four 0 0
Five 0 0
Six 0 0
Sum 1 1
In this table, the column called Frequency gives the number of times each outcome was
observed. The column called Probability is the frequency divided by the sum of the
frequencies over all possible outcomes, a proportion. The probability of an event of four is
the frequency of the observed number of occurrences of four (which is 0) divided by the
total throws (which is 1). We can write this as:
jnumber of foursj 0
PrðfourÞ ¼ ¼ ¼ 0: ð1:1Þ
jtotal trialsj 1
This can be read as, “The probability of a four is the number of “four” events divided by the
number of total trials.” Here, the vertical bars indicate a number rather than an absolute
value. This is a frequentist notion of probability, because we estimate Pr(four) by asking
“How frequently did we observe the outcome that interests us out of the total?”
INTRODUCTION TO PROBABILITY 5
A probability distribution that is based on raw data is called an empirical probability

distribution. Our empirical probability distribution so far looks like the one in Figure 1.1:
1.0
0.8
Probability
0.6
0.4
0.2
0.0
One Two Three Four Five Six
Outcome
Figure 1.1 Empirical probability distribution for

die outcomes, given 1 roll.
Is one roll good enough?
Answer: No.
By now, you should realize that one roll will never give us a good estimate of Pr(four).
(Sometimes, though, that’s all we have . . . we have only one Planet Earth, for example).
Next, you roll the die 9 more times, and summarize the results of the 10 total rolls (i.e., 10
trials or 10 experiments) in Table 1.2. The number of fours is 2, which allows us to estimate
Pr(four) as 2/10, or 0.20 (see Figure 1.2).
Table 1.2

One 0 0.0
Two 2 0.2
Three 5 0.5
Four 2 0.2
Five 0 0.0
Six 1 0.1
Sum 10 1.0
1.0
0.8
Probability
0.6
0.4
0.2
0.0
Outcome

die outcomes, given 10 rolls.
What can you conclude from these results? The estimate of Pr(four) ¼ 0.2 seems to
indicate that the die may be in your favor! (Remember, you expect Pr(four) ¼ 0.1667 if
the die was fair). But $10,000 is a lot of money, and you decide that you should keep test
rolling until the gamemaster shouts “Enough!” Amazingly, you are able to squeeze in 500
rolls, and you obtain the results shown in Table 1.3.
Table 1.3

One 88 0.176
Two 91 0.182
Three 94 0.188
Four 41 0.082
Five 99 0.198
Six 87 0.174
Sum 500 1.000
The plot of the frequency results in Figure 1.3 is called a frequency histogram. Notice
that frequency, not probability, is on the y-axis. We see that a four was rolled 41 times.
Notice also that the sum of the frequencies is 500. The frequency distribution is an example
of an empirical distribution: It is constructed from raw data.
100
80
Frequency
60
40
20
0
Outcome
Figure 1.3 Frequency distribution of 500 rolls.
We can now estimate Pr(four) as 41/500 ¼ 0.082. We can calculate the probability estimates
for the other outcomes as well and then plot them as the probability distribution in Figure 1.4.
1.0
0.8
Probability
0.6
0.4
0.2
0.0
Outcome

500 rolls.
According to the Law of Large Numbers in probability theory, the formula:
jnumber of observed outcomes of interestj

probability ¼ ð1:2Þ
jtotal trialsj
yields an estimate that is closer and closer to the true probability as the number of trials
increases. In other words, your estimate of Pr(four) gets closer and closer to or approaches
the true probability when you use more trials (rolls) in your calculations.
What would we expect if the die were fair?
Table 1.4 lists the six possible outcomes, and the probability of each event (1/6 ¼ 0.167).
Notice that the sum of the probabilities across the events is 1.0.
Table 1.4
Outcome Probability
One 0.167
Two 0.167
Three 0.167
Four 0.167
Five 0.167
Six 0.167
Sum 1
Figure 1.5 shows exactly the same information as Table 1.4; both are examples of a
probability distribution. On the horizontal axis, we list each of the possible outcomes. On
the vertical axis is the probability. The height of each bar provides the probability of
observing each outcome. Since each outcome has an equal chance of being rolled, the
heights of the bars are all the same and show as 0.167, which is 1/N. Note that this is not an
empirical distribution, because we did not generate it from an experiment. Rather, it was
based on the assumption that all outcomes are equally likely.
1.0
0.8
Probability
0.6
0.4
0.2
0.0
Outcome
Figure 1.5 Probability distribution for rolling a fair die.
Again, this is an example of a discrete uniform probability distribution: discrete

because there are a discrete number of separate and distinct outcomes; uniform because
each and every event has the same probability.
How would you change the table and probability distribution if the
die were loaded in favor of a four?
Answer: Your answer here!

There are many ways to do this. Suppose the probability of rolling a four is 0.4. Since all
the probabilities have to add up to 1.0, this leaves 0.6 to distribute among the remaining
five outcomes, or 0.12 for each (assuming these five are equally likely to turn up). If you roll
this die, it’s not a sure thing that you’ll end up with a four, but getting an outcome of four is
more likely than, say, getting a three (see Table 1.5).
Table 1.5
Outcome Probability
One 0.12
Two 0.12
Three 0.12
Four 0.4
Five 0.12
Six 0.12
Sum 1
The probabilities listed in the table sum to 1.0 just as before, as do the heights of the
corresponding blue bars in Figure 1.6.
1.0
0.8
Probability
0.6
0.4
0.2
0.0
Outcome
Figure 1.6 Probability distribution for rolling a

loaded die.
What would the probability distribution be for the bet?
Answer: The bet is that if you roll a four, you win $1,000,000, and if you don’t roll a
four, you lose $10,000. It would be useful to group our 6 possible outcomes into one of
two events. As the Oxford Dictionary of Statistics explains, “An event is a particular
collection of outcomes, and is a subset of the sample space.” Probabilities are assigned
to events.
Our two events are E1 ¼ {four} and E2 ¼ {one, two, three, five, six}. The brackets { }
indicate the set of outcomes that belong in each event. The first event contains one
outcome, while the second event consists of five possible outcomes. Thus, in probability
theory, outcomes can be grouped into new events at will. We started out by considering six
outcomes, and now we have collapsed those into two events.
Now we assign a probability to each event. We know that Pr(four) ¼ 0.4. What is the
probability of NOT rolling a four? That is the probability of rolling one OR two OR three OR
five OR six. Note that these events cannot occur simultaneously. Thus, we can write
Pr(four) as the SUM of the probabilities of events one, two, three, five, and six (which is
0.12 þ 0.12 þ 0.12 þ 0.12 þ 0.12 ¼ 0.6). Incidentally, the sign means “complement of.” If
A is an event, A is its complement (i.e., everything but A). This is sometimes written as Ac.
The word OR is a tip that you ADD the individual probabilities together to get your answer
as long as the events are mutually exclusive (i.e., cannot occur at the same time).
This is an example of a fundamental rule in probability theory: if two or more events are
mutually exclusive, then the probability of any occurring is the sum of the probabilities of
each occurring. Because the different outcomes of each roll (i.e., rolling a one, two, three,
five, or six) are mutually exclusive, the probability of getting any outcome other than four is
the sum of the probability of each one occurring (see Table 1.6).
Table 1.6
Event Probability
Four 0.4
Not Four 0.6
Sum 1
Note that the probabilities of these two possible events sum to 1.0. Because of that, if we
know that Pr(four) is 0.4, we can quickly compute the Pr(four) as 1 0.4 ¼ 0.6 and save a
few mental calculations.
The probability distribution looks like the one shown in Figure 1.7:
1.0
0.8
Probability
0.6
0.4
0.2
0.0
Four Not Four
Event
Figure 1.7 Probability distribution for rolling a four

or not.
Remember: The SUM of the probabilities across all the different outcomes MUST EQUAL
1! In the discrete probability distributions above, this means that the heights of the bars
summed across all discrete outcomes (i.e., all bars) totals 1.
Do you still want to play? At the end of the book, we will introduce decision trees, an
analytical framework that employs Bayes’ Theorem to aid in decision-making. But that
chapter is a long way away.
Do Bayesians think of probability as long-run averages?
Answer: We’re a few chapters away from hitting on this very important topic. You’ll see
that Bayesians think of probability in a way that allows the testing of theories and hypoth-
eses. But you have to walk before you run. What you need now is to continue learning the
basic vocabulary associated with probability theory.
What’s next?
Answer: In Chapter 2, we’ll expand our discussion of probability. See you there.
CHAPTER 2
Joint, Marginal, and Conditional

Probability
Now that you’ve had a short introduction to probability, it’s time to build on our
probability vocabulary. By the end of this chapter, you will understand the
following terms:
• Venn diagram
• Marginal probability
• Joint probability
• Independent events
• Dependent events
• Conditional probability
Let’s start with a few questions.
What is an eyeball event?
A gala that celebrates the sense of vision? Nope. The eyeball event in this chapter
refers to whether a person is right-eyed dominant or left-eyed dominant. You already
know if you are left- or right-handed, but did you know that you are also left- or right-
eyed? Here’s how to tell (http://www.wikihow.com/Determine-Your-Dominant-Eye; see
Figure 2.1):
Figure 2.1 Determining your dominant eye.
Bayesian Statistics for Beginners: A Step-by-Step Approach. Therese M. Donovan and Ruth M. Mickey,
Oxford University Press (2019). © Ruth M. Mickey 2019.
DOI: 10.1093/oso/9780198841296.001.0001
1. Stretch your arms out in front of you and create a hole with your hands by joining your
finger tips to make a triangular opening, as shown.
2. Find a small object nearby and align your hands with it so that you can see it in the
triangular hole. Make sure you are looking straight at the object through your hands—
cocking your head to either side, even slightly, can affect your results. Be sure to keep
both eyes open!
3. Slowly move your hands toward your face to draw your viewing window toward you. As
you do so, keep your head perfectly still, but keep the object lined up in the hole between
your hands. Don’t lose sight of it.
4. Draw your hands in until they touch your face—your hands should end up in front of
your dominant eye. For example, if you find that your hands end up so you are looking
through with your right eye, that eye is dominant.
The eyeball characteristic has two discrete outcomes: lefty (for left-eyed dominant people)
or righty (for right-eyed dominant people). Because there are only two outcomes, we can
call them events if we want.
Let us suppose that you ask 100 people if they are “lefties” or “righties.” In this case,
the 100 people represent our “universe” of interest, which we designate with the letter
U. The total number of elements (individuals) in U is written |U|. (Once again, the
vertical bars here simply indicate that U is a number; it doesn’t mean the absolute value
of U.)
Here, there are only two possible events: “lefty” and “righty.” Together, they make up a
set of possible outcomes. Let A be the event “left-eye dominant,” and A be the event
“right-eye dominant.” Here, the tilde means “complement of,” and here it can be inter-
preted as “everything but A.” Notice that these two events are mutually exclusive: you
cannot be both a “lefty” and a “righty.” The events are also “exhaustive” because you must
be either a lefty or righty.
Suppose that 70 of 100 people are lefties. These people are a subset of the
larger population. The number of people in event A can be written | A |, and in this
example | A | ¼ 70. Note that | A | must be less than or equal to | A |, which is 100.
Remember that we use the vertical bars here to highlight that we are talking about a
number.
Since there are only two possibilities for eye dominance type, this means that 100 70 ¼ 30
people are righties. The number of people in event A can be written |A|, and in this
example |A| ¼ 30. Note that |A| must be less than or equal to |U|.
Our universe can be summarized as shown in Table 2.1.
Table 2.1
Event Frequency
Lefty (A) | A | ¼ 70
Righty (A) | A | ¼ 30
Universe (U) | U | ¼ 100
We can illustrate this example in a diagrammatic form, as shown in Figure 2.2. Here, our
universe of 100 people is captured inside a box.
JOINT, MARGINAL, AND CONDITIONAL PROBABILITY 13
This is a Venn diagram, which shows A and A. The universe U is 100 people and is
represented by the entire box. We then allocate those 100 individuals into A and A. The blue
circle represents A; lefties stand inside this circle; righties stand outside the circle, but inside
the box. You can see that A consists of 70 elements, and A consists of 30 elements.
Lefty (A)
70
Righty (∼A) 30
Figure 2.2
Why is it called a Venn diagram?
Answer: Venn diagrams are named for John Venn (see Figure 2.3), who wrote his seminal
article in 1880.
Figure 2.3 John Venn.

According to the MacTutor History of Mathematics Archive, Venn’s son described him as
“of spare build, he was throughout his life a fine walker and mountain climber, a keen
botanist, and an excellent talker and linguist.”
What is the probability that a person in universe U is in group A?
Answer: We write this as Pr(A). Remember that Pr stands for probability, so Pr(A) means
the probability that a person is in group A and therefore is a lefty. We can determine the
probability that a person is in group A as:
jAj 70
PrðAÞ ¼ ¼ ¼ 0:7: ð2:1Þ
j U j 100
Probability is determined as the number of persons in group A out of the total. The
probability that the randomly selected person is in group A is 0.7.
What about people who are not in group A?
Answer: There are 30 of them (100 70 ¼ 30), and they are righties.
jAj 30
Prð AÞ ¼ ¼ ¼ 0:3: ð2:2Þ
jU j 100
With only two outcomes for the eyeball event, our notation focuses on the probability
of being in a given group (A ¼ lefties) and the probability of not being in the given group
(A ¼ righties).
I’m sick of eyeballs. Can we consider another characteristic?
Answer: Yes, of course. Let’s probe these same 100 people and find out other details about
their anatomy. Suppose we are curious about the presence or absence of Morton’s toe.
People with “Morton’s toe” have a large second metatarsal, longer in fact than the first
metatarsal (which is also known the big toe or hallux toe). Wikipedia articles suggest that
this is a normal variation of foot shape in humans and that less than 20% of the human
population have this condition. Now we are considering a second characteristic for our
population, namely toe type.
Let’s let B designate the event “Morton’s toe.” Let the number of people with Morton’s
toe be written as |B|. Let B designate the event “common toe.” Suppose 15 of the 100
people have Morton’s toe. This means |B| ¼ 15, and |B| ¼ 85. The data are shown in
Table 2.2, and the Venn diagram is shown in Figure 2.4.
Table 2.2
Event Frequency
Morton’s toe (B) |B| ¼ 15
Common toe (B) |B| ¼ 85
Universe (U) |U| ¼ 100
These events can be represented in a Venn diagram, where a box holds our universe of
100 people.
Note the size of this red circle is smaller than the previous example because the number of
individuals with Morton’s toe is much smaller.
Morton’s toe (B)

15
Common toe (∼B) 85
Figure 2.4
Can we look at both characteristics simultaneously?
Answer: You bet. With two characteristics, each with two outcomes, we have four possible
combinations:
1. Lefty AND Morton’s toe, which we write as A \ B.
2. Lefty AND common toe, which we write as A \B.
3. Righty AND Morton’s toe, which we write as A \ B.
4. Righty AND common toe, which we write as A \B.
The upside-down \ is the mathematical symbol for intersection. Here, you can read it as
“BOTH” or “AND.”
The number of individuals in A \ B can be written j A \ B j, where the bars indicate a
number (not absolute value). Let’s suppose we record the frequency of individuals in each
of the four combinations (see Table 2.3).
Table 2.3
Lefty (A) Righty (A) Sum

Morton’s toe (B) 0 15 15
Common toe (B) 70 15 85
Sum 70 30 100
Let’s study this table carefully.

Notice that this table has four “quadrants,” so to speak. Our actual values are stored in the
upper left quadrant, shaded dark blue. The upper right quadrant (shaded light blue) sums
the number of people with and without Morton’s toe. The lower left quadrant (shaded light
blue) sums the number of people that are lefties and righties. The lower right (white)
quadrant gives the grand total, or |U|.
Now let’s plot the results in the same Venn diagram (see Figure 2.5).
Lefty (A)
70 Morton’s toe (B)

15
Righty with common toe (∼A and ∼B): 15
Figure 2.5
The updated Venn diagram shows the blue circle with 70 lefties, the red circle with
15 people with Morton’s toe, and no overlap between the two. This means j A \ B j ¼ 0,
j A j ¼ j A \ B j ¼ 70, and j B j ¼ j A \ B j ¼ 15. By subtraction, we know the number of
individuals that are not in A OR B j A \ B j is 15 because we need to account for all 100
individuals somewhere in the diagram.
Is it possible to have Morton’s toe AND be a lefty?
Answer: For our universe, no. If you are a person with Morton’s toe, you are standing in
the red circle and cannot also be standing in the blue circle. So these two events
(Morton’s toe and lefties) are mutually exclusive because they do not occur at the
same time.
Is it possible NOT to have Morton’s toe if you are a lefty?
Answer: You bet. All 70 lefties do not have Morton’s toe. These two events are non-
mutually exclusive.
What if five lefties also have Morton’s toe?
Answer: In this case, we need to adjust the Venn diagram to show that five of the people
that are lefties also have Morton’s toe. These individuals are represented as the intersection
between the two events (see Figure 2.6). Note that the total number of individuals is still
100; we need to account for everyone!
Lefty (A)
Morton’s toe (B)

65 5 10
Righty with common toe (∼A and ∼B): 20
Figure 2.6
We’ll run with this example for the rest of the chapter.
This Venn diagram is pretty accurate (except for the size of the box overall). There are
70 people in A, 15 people in B, and 5 people in A \ B. The blue circle contains 70 elements
in total, and 5 of those elements also occur in B. The red circle contains 15 elements (so is a
lot smaller than the blue circle), and 5 of these are also in A. So 5/15 (33%) of the red circle
overlaps with the blue circle, and 5/70 (7%) of the blue circle overlaps with the red circle.
Of the four events (A, A, B, and B), which are not
mutually exclusive?
Answer:
• A and B are not mutually exclusive (a lefty can have Morton’s toe).
• A and B are not mutually exclusive (a lefty can have a common toe).
• A and B are not mutually exclusive (a righty can have Morton’s toe).
• A and B are not mutually exclusive (a righty can have a common toe).
In Venn diagrams, if two events overlap, they are not mutually exclusive.
Are any events mutually exclusive?
Answer: If we focus on each circle, A and A are mutually exclusive (a person cannot be a
lefty and righty). B and B are mutually exclusive (a person cannot have Morton’s toe and a
common toe). Apologies for the trick question!
If you were one of the lucky 100 people included in the universe,
where would you fall in this diagram?
Answer: Your answer here!

It’s handy to look at the numbers in a table format too. Table 2.4 shows the same values as
Figure 2.6.
Table 2.4

Morton’s toe (B) 5 10 15
Common toe (B) 65 20 85
Sum 70 30 100
This is an important table to study. Once again, notice that there are four quadrants or
sections in this table. In the upper left quadrant, the first two columns represent the two
possible events for eyeball dominance: lefty and righty. The first two rows represent the two
possible events for toe type: Morton’s toe and common toe.
The upper left entry indicates that 5 people are members of both A and B.
• Look for the entry that indicates that 20 people are A \B, that is, both not A and not B.
• Look for the entry that indicates that 65 people are A \B.
• Look for the entry that indicates that 10 people are A \ B.
The lower left and upper right quadrants of our table are called the margins of the table.
They are shaded light blue. Note that the total number of individuals in A (regardless of B)
is 70, and the total number of individuals in A is 30. The total number of individuals in B
(regardless of A) is 15 and the total number of individuals B is 85. Any way you slice it, the
grand total must equal 100 (the lower right quadrant).
What does this have to do with probability?
Answer: Well, if you are interested in determining the probability that an individual
belongs to any of these four groups, you could use your universe of 100 individuals to do
the calculation. Do you remember the frequentist way to calculate probability? We
learned about that in Chapter 1.
jnumber of observed outcomes of interestj

Pr ¼ : ð2:3Þ
jU j
Our total universe in this case is the 100 individuals. To get the probability that a person
selected at random would belong to a particular event, we simply divide the entire table
above by our total, which is 100, and we get the results shown in Table 2.5.
Table 2.5

Morton’s toe (B) 0.05 0.1 0.15
Common toe (B) 0.65 0.2 0.85
Sum 0.7 0.3 1
We’ve just converted the raw numbers to probabilities by dividing the frequency table by
the grand total.
Another random document with
no related content on Scribd:
418
“So art, whether it be painting or sculpture, poetry or music,
has no other object than to brush aside the utilitarian symbols,
the conventional and socially accepted generalities, in short,
everything that veils reality from us, in order to bring us face to
face with reality itself” (Laughter, p. 157). It is true that if we
read further on this page, and elsewhere in Bergson, we will be
able to see that there is for him in art and in the spiritual life a
kind of intelligence and knowledge. But it is difficult to work out
an expression or a characterisation of this intelligence and this
knowledge. “Art,” he says, “is only a more direct vision of
reality.” And again: “Realism is in the work when idealism is in
the soul, and it is only through ideality that we can resume
contact with reality” (ibid.).
419
It is only fair to Bergson to remember that he is himself aware
of the appearances of this dualism in his writings, that he
apologises as it were for them, intending the distinction to be,
not absolute, but relative. “Let us say at the outset that the
distinctions we are going to make will be too sharply drawn,
just because we wish to define in instinct what is instinctive,
and in intelligence what is intelligent, whereas all concrete
instinct is mingled with intelligence, as all real intelligence is
penetrated by instinct. Moreover [this is quite an important
expression of Bergson’s objection to the old “faculty”
psychology], neither intelligence nor instinct lends itself to rigid
definition; they are tendencies and not things. Also it must not
be forgotten that ... we are considering intelligence and instinct
as going out of life which deposits them along its course”
(Creative Evolution, p. 143).
420
He talks in the Creative Evolution of a “real time” and a “pure
duration” of a real duration that “bites” into things and leaves
on them the mark of its tooth, of a “ceaseless upspringing of
something new,” of “our progress in pure duration,” or a
“movement which creates at once the intellectuality of mind
and the materiality of things” (p. 217). I have no hesitation in
saying that all this is unthinkable to me, and that it might
indeed be criticised by Rationalism as inconsistent with our
highest and most real view of things.
421
He admits himself that “If our analysis is correct, it is
consciousness, or rather supra-consciousness that is at the
origin of life” (Creative Evolution, p. 275).
422
“Now, if the same kind of action is going on everywhere,
whether it is that which is striving to remake itself, I simply
express this probable similitude when I speak of a centre from
which worlds shoot out as rockets in a fireworks display—
provided, however, that I do not present [there is a great idea
here, a true piece of ‘Kantianism’] this centre as a thing, but as
a continuity of shooting out. God thus defined has nothing of
the already made. He is unceasing life, action, freedom.
Creation so conceived is not a mystery; we experience it in
ourselves when we act freely” (Creative Evolution, p. 262).
423
See p. 155, note 1.
424
It is somewhat difficult, and it is not necessary for our
purposes, to explain what might be meant by the “Idealism” of
Bergson—at least in the sense of a cosmology, a theory of the
“real.” It is claimed for him, and he claims for himself that he is
in a sense both an “idealist” and a “realist,” believing at once
(1) that matter is an “abstraction” (an unreality), and (2) that
there is more in matter than the qualities revealed by our
perceptions. [We must remember that he objects to the idea of
qualities in things in the old static sense. “There are no things;
there are only actions.”] What we might mean by his initial
idealism is the following: “Matter, in our view, is an aggregate
of images. And by ‘image’ we mean [Matter and Memory, the
Introduction] a certain existence which is more than that which
the idealist calls a representation, but less than that which the
realist calls a thing—an existence placed half-way between the
‘thing’ and the ‘representation.’ This conception of matter is
simply that of common sense.” ... “For common sense, then,
the object exists in itself, and, on the other hand, the object is
in itself pictorial, as we perceive it: image it is, but a self-
existing image.” Now, this very idea of a “self-existing image”
implies to me the whole idealism of philosophy, and Bergson is
not free of it. And, of course, as we have surely seen, his
“creative-evolution” philosophy is a stupendous piece of
idealism, but an idealism moreover to which the science of the
day is also inclining.
425
There is so much that is positive and valuable in his teaching,
that he is but little affected by formal criticism.
426
Cf. “We have now enumerated a few of the essential features
of human intelligence. But we have hitherto considered the
individual in isolation, without taking account of social life. In
reality man is a being who lives in society. If it be true [even]
that the human intellect aims at fabrication, we must add that,
for that as well as other purposes, it is associated with other
intellects. Now it is difficult to imagine a society whose
members do not communicate by signs,” etc. etc. (Creative
Evolution, p. 166). Indeed all readers of Bergson know that he
is constantly making use of the social factor and of “co-
operation” by way of accounting for the general advance of
mankind. It may be appropriate in this same connexion to cite
the magnificent passage towards the close of Creative
Evolution in which he rises to the very heights of the idea
[Schopenhauer and Hartmann had it before him, and also
before the socialists and the collectivists] of humanity’s being
possibly able to surmount even the greatest of the obstacles
that beset it in its onward path: “As the smallest grain of dust
[Creative Evolution, pp. 285–6] is bound up with our entire
solar system, drawn along with it in that undivided movement
of descent which is materiality itself, so all organised beings,
from the humblest to the highest, ... do but evidence a single
impulsion, the inverse of the movement of matter, and in itself
indivisible. All the living hold together, and all yield to the same
tremendous push. The animal takes its stand on the plant, man
bestrides animality, and the whole of humanity, in space and in
time, is one immense army galloping beside and before and
behind each of us in an overwhelming charge to beat down
every resistance and clear the most formidable obstacles,
perhaps even death.”
427
Cf. p. 160 and p. 262.
428
He comes in sight of some of them, as he often does of so
many things. “It is as if a vague and formless being, whom we
may call, as we will [C.E., p. 281], man or superman, had
sought to realise himself, and had succeeded only by
abandoning a part of himself on the way. The losses are
represented by the rest of the animal world, and even by the
vegetable world, at least in what these have that is positive
and above the accidents of evolution.”
429
From what has been said in this chapter about Bergson, and
from the remarks that were made in the second chapter about
Renouvier and the French Critical Philosophy, the reader may
perhaps be willing to admit that our Anglo-American
Transcendental philosophy would perhaps not have been so
abstract and so rationalistic had it devoted more attention, than
it has evidently given, to some of the more representative
French thinkers of the nineteenth century.
430
We must remember that nowhere in his writings does Bergson
claim any great originality for his many illuminative points of
view. He is at once far too much of a catholic scholar (in the
matter of the history of philosophy, say), and far too much of a
scientist (a man in living touch with the realities and the
theories of the science of the day) for this. His findings about
life and mind are the outcome of a broad study of the
considerations of science and of history and of criticism. By
way, for example, of a quotation from a scientific work upon
biology that seems to me to reveal some apparent basis in fact
(as seen by naturalists) for the “creative evolution” upon which
Bergson bases his philosophy, I append the following: “We
have gone far enough to see that the development of an
organism from an egg is a truly wonderful process. We need
but go back again and look at the marvellous simplicity of the
egg to be convinced of it. Not only do cells differentiate, but
cell-groups act together like well-drilled battalions, cleaving
apart here, fusing together there, forming protective coverings
or communicating channels, apparently creating out of nothing,
a whole set of nutritive and reproductive organs, all in orderly
and progressive sequence, producing in the end that orderly
disposed cell aggregate, that individual life unit which we know
as an earthworm. Although the forces involved are beyond our
ken, the grosser processes are evident” (Needham, General
Biology, p. 175; italics mine). Of course it is evident from his
books that Bergson does not take much account of such
difficult facts and topics as the mistakes of instinct, etc. And I
have just spoken of his optimistic avoidance of some of the
deeper problems of the moral and spiritual life of man.
431
“This amounts to saying that the theory of knowledge and
theory of life seem to us inseparable [Creative Evolution, p.
xiii.; italics Bergson’s]. A theory of life that is not accompanied
by a criticism of knowledge is obliged to accept, as they stand,
the concepts which the understanding puts at its disposal: it
can but enclose the facts, willing or not, in pre-existing frames
which it regards as ultimate. It thus obtains a symbolism which
is convenient, perhaps even necessary to positive science, but
not a direct vision of its object.”
432
I more than agree with Bergson that our whole modern
philosophy since Descartes has been unduly influenced by
physics and mathematics. And I deplore the fact that the “New
Realism” which has come upon us by way of a reaction (see p.
53) from the subjectivism of Pragmatism, should be travelling
apparently in this backward direction—away, to say the very
least, from some of the things clearly seen even by biologists
and psychologists. See p. 144.
433
As I have indicated in my Preface, I am certainly the last
person in the world to affect to disparage the importance of the
thin end of the wedge of Critical Idealism introduced into the
English-speaking world by Green and the Cairds, and their first
followers (like the writers in the old Seth-Haldane, Essays on
Philosophical Criticism). Their theory of knowledge, or
“epistemology,” was simply everything to the impoverished
condition of our philosophy at the time, but, as Bergson points
out, it still left many of us [the fault perhaps was our own, to
some extent] in the position of “taking” the scientific reading of
the world as so far true, and of thinking that we had done well
in philosophy when we simply partly “transformed” it. The really
important thing was to see with this epistemology that the
scientific reading of the world is not in any sense initial “fact”
for philosophy.
INDEX
Absolutism, 13, chap. viii.

Action, 91 n., 105, chap. iv.
Activity-Experience, 105, 109
Alexander, S., 163
Anti-Intellectualism, 73, 239
Appearance and Reality, 84
Arcesilaus, 155
Aristotle, 155
Armstrong (Prof.), 49 n.
Attention, 119
Augustine, 107
Avenarius, 41
Bain, 120
Baldwin, J. M., 156, 110 n.
Bawden (Prof.), 17, 85
Belief, 64, 65, 229 n., 251
Bergson, 72, 104, 126
Berthelot, 117
Blondel, 32, 34
Bosanquet, B., 110, 185, chap. viii.
Bourdeau, 26, 133 n., 193
Boyce-Gibson (Prof.), 154
Bradley, F. H., 74, 75, 91
Browning, R., 117
Brunschvig, 30
Bryce, James, 193
Butler, 119
Caird, E., 112

Carlyle, 125
Chesterton, W. K., 117, 156
Cohen, 48 n.
Common-sense Beliefs, 7
Common-sense Philosophy, 117
Comte, 120
Contemplation, 96
Cornford, 184
Curtis (Prof. M. M.), 22
Dawes-Hicks (Prof.), 163

De Maistre, 170
Descartes, 66, 121
Desjardins, P., 37
Dewey, J., 16, 17, 37, 62, 147, 173, 175
Du Bois Reymond, 110
Duncan (Prof.), 122
Duns Scotus, 119
Eleutheropulos, 43
Elliot, H. S. R., 66
Epicureanism, 118
Eucken, 39, 154
Ewald (Dr.), 44, 48
Flournoy, 180
Fouillée, 37 n.
Fraser, A. C., 112
Futurism, 26
Geddes, P., 123

Goethe, 195, 215
Gordon, A., 152–3
Green, T. H., 199
Gregory (Prof.), 24
Inge (Dean), 29, 31

Invention, 192
James, W., 3, 4, 24, 35, 39, 45, 50, 65, 135, 182, 192 n.
Jerusalem, W., 43
Joachim, 56
Jones, Sir H., 56
Joseph, 57
Kant, 119, 121, 247

Kant and Hegel, 183
Knox (Capt.), 15
Lalande, A., 29, 33, 164

Lankester (Sir R.), 167
Lecky, 70
Leighton (Prof.), 133
Le Roy, 31
Locke, 61, 119
Lovejoy (Prof.), 49
MacEachran (Prof.), 49 n.
Mach, 40
Mackenzie, J. S., 112
Maeterlinck, 90
Mallarmé, 214
Marett, 160
Mastermann, G. F. G., 118
M’Dougall, 104
McTaggart, J. M. E., 92
Meaning, 21, 51, 149
Mellone, 57
Merz, 157
Münsterberg, 46
Natorp, 48
Needham (Prof.), 101, 260
New Realism, 53
Nietzsche, 118, 139, 151
Ostwald, 40, 41
Pace (Prof.), 187

Paleyism, 247
Papini, 24, 135
Pascal, 119
Pater, W., 124
Peirce, 3, 22
Perry (Prof.), 53, 185
Perry, Bliss, 171, 179
Plato, 57, 61, 121, 150, 151
Pluralism, 87
Poincaré, 30
Pradines, 36 n.
Pragmatism, and American philosophy, 49, chap. vii.;
and British thought, 54;
and French thought, 28;
and German thought, 38;
and Italian thought, 23;
a democratic doctrine, 105;
its ethics, 136;
its pluralism, 162;
its sociological character, 164, 262;
its theory of knowledge, 131;
its theory of truth, 127;
its theory of reality, 135
Pratt (Prof.), 51, 127
Radical Empiricism, 85
Renan, 110
Renouvier, 29
Rey, 31
Riley, W., 26 n.
Ritzsche, 45
Royce, J., 54
Russell, B., 61, 66 n., 169
Santayana, 171, 181, 190

Schellwien, 44
Schiller, F. C. S., 12, 14, 16, 132, 133
Schinz, 192 n.
Schopenhauer, 28, 119, 151, 260
Seth, James, 14 n.
Seth-Haldane, 260
Shaw, Bernard, 124
Sidgwick, H., 56, 118, 119 n., 140
Sigwart, 42
Simmel, 44
Spencer, 41 n.
Starbuck, 28
Stoicism, 118
Stout, G. F., 55
Subjective Idealism, 259
Taylor, A. E., 57, 77, 78, 199 n., 219

Teleology, 88, 198
Tertullian, 119
Theism, 215 n.
Themistius, 155
Thompson, J. H., 144
Titchener, 157
Truth, 59, 81, 163
Tufts, 147
Tyndall, 110
Vaihinger, 39
Walker, L. J., 31
Ward, James, 30, 55, 143, 162
Wells, H. G., 123
Westermarck, 145
Windelband, 46, 150
Wollaston, 224
Printed by R. & R. Clark, Limited, Edinburgh.

Transcriber’s Notes
Inconsistent punctuation, hyphenation, and spelling were
not changed.
Simple typographical errors were corrected; unbalanced
quotation marks were remedied when the change was
obvious, and otherwise left unbalanced.
The index was not checked for proper alphabetization or
correct page references. Index references to footnotes link to
the pages containing the footnote anchors, not to the
footnotes themselves.
Some page links in the Footnotes may be erroneous, as
they actually reference other books.
*** END OF THE PROJECT GUTENBERG EBOOK PRAGMATISM
AND IDEALISM ***
Updated editions will replace the previous one—the old editions will
be renamed.
Creating the works from print editions not protected by U.S.

copyright law means that no one owns a United States copyright in
these works, so the Foundation (and you!) can copy and distribute it
in the United States without permission and without paying copyright
royalties. Special rules, set forth in the General Terms of Use part of
this license, apply to copying and distributing Project Gutenberg™
electronic works to protect the PROJECT GUTENBERG™ concept
and trademark. Project Gutenberg is a registered trademark, and
may not be used if you charge for an eBook, except by following the
terms of the trademark license, including paying royalties for use of
the Project Gutenberg trademark. If you do not charge anything for
copies of this eBook, complying with the trademark license is very
easy. You may use this eBook for nearly any purpose such as
creation of derivative works, reports, performances and research.
Project Gutenberg eBooks may be modified and printed and given
away—you may do practically ANYTHING in the United States with
eBooks not protected by U.S. copyright law. Redistribution is subject
to the trademark license, especially commercial redistribution.
START: FULL LICENSE

THE FULL PROJECT GUTENBERG LICENSE
PLEASE READ THIS BEFORE YOU DISTRIBUTE OR USE THIS WORK
To protect the Project Gutenberg™ mission of promoting the free

distribution of electronic works, by using or distributing this work (or
any other work associated in any way with the phrase “Project
Gutenberg”), you agree to comply with all the terms of the Full
Project Gutenberg™ License available with this file or online at
www.gutenberg.org/license.
Section 1. General Terms of Use and

Redistributing Project Gutenberg™
electronic works
1.A. By reading or using any part of this Project Gutenberg™
electronic work, you indicate that you have read, understand, agree
to and accept all the terms of this license and intellectual property
(trademark/copyright) agreement. If you do not agree to abide by all
the terms of this agreement, you must cease using and return or
destroy all copies of Project Gutenberg™ electronic works in your
possession. If you paid a fee for obtaining a copy of or access to a
Project Gutenberg™ electronic work and you do not agree to be
bound by the terms of this agreement, you may obtain a refund from
the person or entity to whom you paid the fee as set forth in
paragraph 1.E.8.
1.B. “Project Gutenberg” is a registered trademark. It may only be

used on or associated in any way with an electronic work by people
who agree to be bound by the terms of this agreement. There are a
few things that you can do with most Project Gutenberg™ electronic
works even without complying with the full terms of this agreement.
See paragraph 1.C below. There are a lot of things you can do with
Project Gutenberg™ electronic works if you follow the terms of this
agreement and help preserve free future access to Project
Gutenberg™ electronic works. See paragraph 1.E below.
1.C. The Project Gutenberg Literary Archive Foundation (“the
Foundation” or PGLAF), owns a compilation copyright in the
collection of Project Gutenberg™ electronic works. Nearly all the
individual works in the collection are in the public domain in the
United States. If an individual work is unprotected by copyright law in
the United States and you are located in the United States, we do
not claim a right to prevent you from copying, distributing,
performing, displaying or creating derivative works based on the
work as long as all references to Project Gutenberg are removed. Of
course, we hope that you will support the Project Gutenberg™
mission of promoting free access to electronic works by freely
sharing Project Gutenberg™ works in compliance with the terms of
this agreement for keeping the Project Gutenberg™ name
associated with the work. You can easily comply with the terms of
this agreement by keeping this work in the same format with its
attached full Project Gutenberg™ License when you share it without
charge with others.
1.D. The copyright laws of the place where you are located also
govern what you can do with this work. Copyright laws in most
countries are in a constant state of change. If you are outside the
United States, check the laws of your country in addition to the terms
of this agreement before downloading, copying, displaying,
performing, distributing or creating derivative works based on this
work or any other Project Gutenberg™ work. The Foundation makes
no representations concerning the copyright status of any work in
any country other than the United States.
1.E. Unless you have removed all references to Project Gutenberg:
1.E.1. The following sentence, with active links to, or other

immediate access to, the full Project Gutenberg™ License must
appear prominently whenever any copy of a Project Gutenberg™
work (any work on which the phrase “Project Gutenberg” appears, or
with which the phrase “Project Gutenberg” is associated) is
accessed, displayed, performed, viewed, copied or distributed:
This eBook is for the use of anyone anywhere in the United
States and most other parts of the world at no cost and with
almost no restrictions whatsoever. You may copy it, give it away
or re-use it under the terms of the Project Gutenberg License
included with this eBook or online at www.gutenberg.org. If you
are not located in the United States, you will have to check the
laws of the country where you are located before using this
eBook.
1.E.2. If an individual Project Gutenberg™ electronic work is derived

from texts not protected by U.S. copyright law (does not contain a
notice indicating that it is posted with permission of the copyright
holder), the work can be copied and distributed to anyone in the
United States without paying any fees or charges. If you are
redistributing or providing access to a work with the phrase “Project
Gutenberg” associated with or appearing on the work, you must
comply either with the requirements of paragraphs 1.E.1 through
1.E.7 or obtain permission for the use of the work and the Project
Gutenberg™ trademark as set forth in paragraphs 1.E.8 or 1.E.9.
1.E.3. If an individual Project Gutenberg™ electronic work is posted

with the permission of the copyright holder, your use and distribution
must comply with both paragraphs 1.E.1 through 1.E.7 and any
additional terms imposed by the copyright holder. Additional terms
will be linked to the Project Gutenberg™ License for all works posted
with the permission of the copyright holder found at the beginning of
this work.
1.E.4. Do not unlink or detach or remove the full Project

Gutenberg™ License terms from this work, or any files containing a
part of this work or any other work associated with Project
Gutenberg™.
1.E.5. Do not copy, display, perform, distribute or redistribute this

electronic work, or any part of this electronic work, without
prominently displaying the sentence set forth in paragraph 1.E.1 with
active links or immediate access to the full terms of the Project
Gutenberg™ License.
1.E.6. You may convert to and distribute this work in any binary,
compressed, marked up, nonproprietary or proprietary form,
including any word processing or hypertext form. However, if you
provide access to or distribute copies of a Project Gutenberg™ work
in a format other than “Plain Vanilla ASCII” or other format used in
the official version posted on the official Project Gutenberg™ website
(www.gutenberg.org), you must, at no additional cost, fee or expense
to the user, provide a copy, a means of exporting a copy, or a means
of obtaining a copy upon request, of the work in its original “Plain
Vanilla ASCII” or other form. Any alternate format must include the
full Project Gutenberg™ License as specified in paragraph 1.E.1.
1.E.7. Do not charge a fee for access to, viewing, displaying,

performing, copying or distributing any Project Gutenberg™ works
unless you comply with paragraph 1.E.8 or 1.E.9.
1.E.8. You may charge a reasonable fee for copies of or providing

access to or distributing Project Gutenberg™ electronic works
provided that:
• You pay a royalty fee of 20% of the gross profits you derive from
the use of Project Gutenberg™ works calculated using the
method you already use to calculate your applicable taxes. The
fee is owed to the owner of the Project Gutenberg™ trademark,
but he has agreed to donate royalties under this paragraph to
the Project Gutenberg Literary Archive Foundation. Royalty
payments must be paid within 60 days following each date on
which you prepare (or are legally required to prepare) your
periodic tax returns. Royalty payments should be clearly marked
as such and sent to the Project Gutenberg Literary Archive
Foundation at the address specified in Section 4, “Information
about donations to the Project Gutenberg Literary Archive
Foundation.”
• You provide a full refund of any money paid by a user who

notifies you in writing (or by e-mail) within 30 days of receipt that
s/he does not agree to the terms of the full Project Gutenberg™
License. You must require such a user to return or destroy all
copies of the works possessed in a physical medium and
discontinue all use of and all access to other copies of Project
Gutenberg™ works.
• You provide, in accordance with paragraph 1.F.3, a full refund of

any money paid for a work or a replacement copy, if a defect in
the electronic work is discovered and reported to you within 90
days of receipt of the work.
• You comply with all other terms of this agreement for free
distribution of Project Gutenberg™ works.
1.E.9. If you wish to charge a fee or distribute a Project Gutenberg™

electronic work or group of works on different terms than are set
forth in this agreement, you must obtain permission in writing from
the Project Gutenberg Literary Archive Foundation, the manager of
the Project Gutenberg™ trademark. Contact the Foundation as set
forth in Section 3 below.
1.F.
1.F.1. Project Gutenberg volunteers and employees expend

considerable effort to identify, do copyright research on, transcribe
and proofread works not protected by U.S. copyright law in creating
the Project Gutenberg™ collection. Despite these efforts, Project
Gutenberg™ electronic works, and the medium on which they may
be stored, may contain “Defects,” such as, but not limited to,
incomplete, inaccurate or corrupt data, transcription errors, a
copyright or other intellectual property infringement, a defective or
damaged disk or other medium, a computer virus, or computer
codes that damage or cannot be read by your equipment.
1.F.2. LIMITED WARRANTY, DISCLAIMER OF DAMAGES - Except

for the “Right of Replacement or Refund” described in paragraph
1.F.3, the Project Gutenberg Literary Archive Foundation, the owner
of the Project Gutenberg™ trademark, and any other party
distributing a Project Gutenberg™ electronic work under this
agreement, disclaim all liability to you for damages, costs and

Bayesian Statistics For Beginners A Step by Step Approach Therese M Donovan Full Chapter

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Bayesian Statistics For Beginners A Step by Step Approach Therese M Donovan Full Chapter

Uploaded by

Copyright:

Available Formats

Bayesian Statistics for Beginners: A

Step-By-Step Approach Therese M

To Peter, Evan, and Ana—for everything.

SECTION 1 Basics of Probability

SECTION 2 Bayes’ Theorem and Bayesian Inference

7 The Portrait Problem: Bayesian Inference with Joint Likelihood 73

SECTION 3 Probability Functions

8 Probability Mass Functions 87

9 Probability Density Functions 108

SECTION 4 Bayesian Conjugates

10 The White House Problem: The Beta-Binomial Conjugate 133

SECTION 5 Markov Chain Monte Carlo

17 The Survivor Problem: Simple Linear Regression with MCMC 269

19 The Lorax Problem: Introduction to Bayesian Networks 325

A.1 The Beta-Binomial Conjugate Solution 369

A.4 Conjugate Solutions for Simple Linear Regression 385

• In Chapter 1, the concept of probability is introduced. Using an example, the chapter

Should you play?

Answer: It’s up to you!

4 BAYESIAN STATISTICS FOR BEGINNERS

Answer: You collect some data.

Outcome Frequency Probability

A probability distribution that is based on raw data is called an empirical probability

Figure 1.1 Empirical probability distribution for

Is one roll good enough?

Outcome Frequency Probability

Figure 1.2 Empirical probability distribution for

6 BAYESIAN STATISTICS FOR BEGINNERS

Outcome Frequency Probability

Figure 1.3 Frequency distribution of 500 rolls.

Figure 1.4 Empirical probability distribution for

According to the Law of Large Numbers in probability theory, the formula:

jnumber of observed outcomes of interestj

What would we expect if the die were fair?

Figure 1.5 Probability distribution for rolling a fair die.

Again, this is an example of a discrete uniform probability distribution: discrete

8 BAYESIAN STATISTICS FOR BEGINNERS

Answer: Your answer here!

Figure 1.6 Probability distribution for rolling a

What would the probability distribution be for the bet?

Figure 1.7 Probability distribution for rolling a four

10 BAYESIAN STATISTICS FOR BEGINNERS

Do Bayesians think of probability as long-run averages?

Joint, Marginal, and Conditional

What is an eyeball event?

Figure 2.1 Determining your dominant eye.

12 BAYESIAN STATISTICS FOR BEGINNERS

JOINT, MARGINAL, AND CONDITIONAL PROBABILITY 13

Why is it called a Venn diagram?

Figure 2.3 John Venn.

14 BAYESIAN STATISTICS FOR BEGINNERS

What is the probability that a person in universe U is in group A?

What about people who are not in group A?

I’m sick of eyeballs. Can we consider another characteristic?

JOINT, MARGINAL, AND CONDITIONAL PROBABILITY 15

Morton’s toe (B)

Common toe (∼B) 85

Can we look at both characteristics simultaneously?

Lefty (A) Righty (A) Sum

Let’s study this table carefully.

16 BAYESIAN STATISTICS FOR BEGINNERS

70 Morton’s toe (B)

Righty with common toe (∼A and ∼B): 15