
# Stat 104 Section Notes 4

Problem Solutions
1) In a 7 game playoff series like the World Series, the first team to win
4 games wins the series.
a) Assuming there is equal probability that each team wins each game,
what is the probability that the series ends in 4, 5, 6, or 7 games?
Let X = number of games the World Series will go this year.
Let there be two teams in the series: the Phillies (P) and the
Red Sox (R). Then, for the series to end in four games with the
Phillies as champions, the only way this can happen is if the
Phillies win the first four games, so the sequence of game-by-game
winners would be: P P P P. In each of those four games,
the Phillies have a 0.5 chance of winning, so the
probability that the Phillies win in 4 games is (0.5)^4. The same
argument could be made for the Red Sox winning in 4 games
(and thus we just multiply (0.5)^4 by two):
$$P(X = 4) = 2(0.5)^4 = 0.125.$$
For the series to end in five games with the Phillies as
champions, there are 4 ways (game-by-game sequences) this
can happen: 1) R P P P P, 2) P R P P P, 3) P P R P P, 4) P P P R
P [Note, the Phillies have to win the 5th game]. This is
equivalent to choosing which 3 of the first 4 games the Phillies
must win, thus there are $\binom{4}{3} = 4$ ways to do this. Also, in each of
those five games, the Phillies have a 0.5 chance of winning or
losing, so the probability that the Phillies win in 5
games is 4*(0.5)^5. The same argument could be made for the
Red Sox winning in 5 games (and thus we just multiply
4*(0.5)^5 by two):
$$P(X = 5) = 2\binom{4}{3}(0.5)^5 = 0.25.$$
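Similar arguments can be made to calculate the remaining two
possibilities as such:
$$P(X = 6) = 2\binom{5}{3}(0.5)^6 = 0.3125$$
$$P(X = 7) = 2\binom{6}{3}(0.5)^7 = 0.3125$$
or
$$P(X = 7) = 1 - P(X = 4) - P(X = 5) - P(X = 6) = 1 - 0.125 - 0.25 - 0.3125 = 0.3125.$$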
Note: coming up with these calculations is difficult, and would
be too challenging for a midterm question (though you would
be expected to do them in a HW problem since you would have
more time to think).
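The counting argument in part (a) can be checked with a short Python sketch. The function name `series_length_pmf` and its parameters are illustrative choices, not part of the original notes; the code assumes games are independent and that one team wins each game with probability `p`.

```python
from math import comb

def series_length_pmf(p=0.5, wins_needed=4):
    """Return {n: P(series ends in exactly n games)} for a first-to-wins_needed series."""
    q = 1 - p
    pmf = {}
    for n in range(wins_needed, 2 * wins_needed):
        # The champion must win game n and exactly wins_needed - 1 of the first n - 1 games.
        ways = comb(n - 1, wins_needed - 1)
        losses = n - wins_needed
        # Add the two cases: either team can be the champion.
        pmf[n] = ways * (p**wins_needed * q**losses + q**wins_needed * p**losses)
    return pmf

pmf = series_length_pmf()
for games, prob in pmf.items():
    print(games, prob)
```

With `p = 0.5` this gives 0.125, 0.25, 0.3125 and 0.3125 for 4, 5, 6 and 7 games. Passing a different `p` shows how a stronger team changes the distribution.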
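b) Let X be the random variable for the number of games the World
Series will go this year. What are the mean and variance of X?
Since X is a discrete random variable, we can use the following
formulas:
Mean:
$$E(X) = \sum_x x\,P(X = x) = 4(0.125) + 5(0.25) + 6(0.3125) + 7(0.3125) = 5.8125$$
Variance:
$$Var(X) = \sum_x x^2\,P(X = x) - [E(X)]^2 = 16(0.125) + 25(0.25) + 36(0.3125) + 49(0.3125) - (5.8125)^2 = 34.8125 - 33.7852 \approx 1.0273$$
Thus, the standard deviation is:
$$SD(X) = \sqrt{1.0273} \approx 1.014$$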
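As a quick numerical check of the mean and variance formulas, here is a minimal sketch. The dictionary `pmf` holds the equal-probability values for series lengths of 4 to 7 games from part (a); the variable names are illustrative.

```python
from math import sqrt

# P(X = x) for x = 4, 5, 6, 7 games, assuming each game is a fair coin flip.
pmf = {4: 0.125, 5: 0.25, 6: 0.3125, 7: 0.3125}

mean = sum(x * p for x, p in pmf.items())
variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
sd = sqrt(variance)

print(mean, variance, sd)
```

The same three lines work for any discrete distribution stored as a value-to-probability dictionary.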

Since the World Series went to a 7-game format (in 1923), the number of
games played has been:

| Series length | Frequency |
|---|---|
| 4 games | 17 |
| 5 games | 18 |
| 6 games | 18 |
| 7 games | 33 |

c) Anything interesting seen above?

Yes, we see that there are proportionally more 4-game and 7-game
series than would be expected if the winner of each
game was based on a coin flip. If you know something about
baseball, this can be explained by the facts:
i) Some years there is a dominant team. If one team has
probability of winning of p = 0.75, then the series is more
likely to end in just 4 games (not to mention momentum).
ii) There is home-field advantage. If team A is more likely to
win at home (say with probability 0.75) and less likely to win
away (with probability 0.25), and since about half of the
games are at home and half away, this would lead to an increased
likelihood of the series going the full 7 games.

Note: this is above and beyond what would be expected in a
Stat 104 problem since it requires you to know something about baseball.
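To make the comparison in part (c) concrete, the sketch below sets the observed frequencies against the counts expected under the coin-flip model. The observed counts are the ones listed in the notes; the variable names are illustrative.

```python
# Observed number of series of each length (from the frequency table in the notes).
observed = {4: 17, 5: 18, 6: 18, 7: 33}
# Coin-flip model probabilities for each series length.
pmf = {4: 0.125, 5: 0.25, 6: 0.3125, 7: 0.3125}

total = sum(observed.values())
expected = {games: total * pmf[games] for games in observed}

for games in observed:
    print(games, observed[games], round(expected[games], 2))
```

Out of 86 series the model expects about 10.75 four-game and 26.88 seven-game series, compared with 17 and 33 observed.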

2) A jewel thief and his partner planned to steal 2 identical diamonds from a
jewelry store. They had 2 similar, but fake, diamonds prepared. The plan was
for the thief's accomplice to faint and, while the store's staff were distracted,
for the thief to grab the two real diamonds from the viewing pad and replace
them with the two fake diamonds. On the day of the robbery all was going
according to plan up to the point when the thief's accomplice fainted and the
staff rushed to her assistance. The thief grabbed the two real diamonds from
the viewing pad and put them in his pocket, but then to his horror realized
that the two fake diamonds were already in that same pocket. The thief had
to quickly grab two stones at random from the four (two real and two fake) in
his pocket and leave them on the viewing pad, after which he left the store
with his accomplice.
a) Give the sample space S that corresponds to the two stones in the
thief's pocket as he leaves the store [(D1,F1), etc.] and assign
probabilities to the elements of that sample space.
S = {(D1,D2), (D1,F1), (D1,F2), (D2,F1), (D2,F2), (F1,F2)}
Since the two stones were grabbed at random, each of the 6 outcomes has
probability 1/6.
b) Let X be the number of real diamonds the thief had in his pocket as he
left the store. Give the probability distribution of X.

| x | 0 | 1 | 2 |
|---|---|---|---|
| P(X=x) | 1/6 | 2/3 | 1/6 |

c) What is the probability that at least one of the two stones in the thief's
pocket is real?
$$P(X \ge 1) = 1 - P(X = 0) = 1 - \frac{1}{6} = \frac{5}{6}$$
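The sample space and the distribution of X can be enumerated directly. This is a small sketch that assumes, as the problem states, that every pair of stones is equally likely to remain in the pocket; the labels `D1`, `D2`, `F1`, `F2` follow the notes, and the other names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

stones = ["D1", "D2", "F1", "F2"]

# Every unordered pair of stones left in the pocket is equally likely.
sample_space = list(combinations(stones, 2))
prob_each = Fraction(1, len(sample_space))

# X = number of real diamonds (labels starting with "D") in the pocket.
dist = Counter()
for pair in sample_space:
    real = sum(stone.startswith("D") for stone in pair)
    dist[real] += prob_each

p_at_least_one = 1 - dist[0]
print(sample_space)
print(dict(dist), p_at_least_one)
```

Using `Fraction` keeps the probabilities exact, so the result reads as 1/6, 2/3, 1/6 rather than rounded decimals.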

3) Harvard College is composed of about 52% women. For all students
currently enrolled in Stat 104, 115 of the 275 students are women. Use this
information for the following problem.
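a) What is the expected number of women in a simple random sample of 275
students from Harvard College? What is the standard deviation?
Binomial Distribution: X ~ Binomial(n = 275, p = 0.52)
$$E(X) = np = 275(0.52) = 143$$
$$SD(X) = \sqrt{np(1-p)} = \sqrt{275(0.52)(0.48)} = \sqrt{68.64} \approx 8.28$$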

b) What is the probability of selecting exactly 115 women in a sample of 275
students from Harvard College?
$$P(X = 115) = \binom{275}{115}(0.52)^{115}(0.48)^{160}$$
where
$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$

c) What is the probability of selecting 115 women or fewer in a
sample of 275 students from Harvard College [use software]?
$$P(X \le 115) = \sum_{k=0}^{115} \binom{275}{k}(0.52)^{k}(0.48)^{275-k}$$

Note: this calculation, and all the probability calculations
above, can be done using an online calculator like this one (or
just do a google search for binomial distribution calculator):
http://dostat.stat.sc.edu/prototype/calculators/index.php3?dist=Binomial
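As an alternative to an online calculator, the binomial quantities in problem 3 can be computed with the Python standard library. This is a sketch under the problem's assumption that X ~ Binomial(275, 0.52); the helper names `binom_pmf` and `binom_cdf` are illustrative, not from the notes.

```python
from math import comb, sqrt

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """P(X <= k), summing the pmf from 0 through k."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

n, p = 275, 0.52
mean = n * p                          # part (a): expected number of women
sd = sqrt(n * p * (1 - p))            # part (a): standard deviation
p_exactly_115 = binom_pmf(115, n, p)  # part (b)
p_at_most_115 = binom_cdf(115, n, p)  # part (c)

print(mean, sd, p_exactly_115, p_at_most_115)
```

Both probabilities come out very small, which matches 115 being more than three standard deviations below the mean of 143.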
4) In January & February in Cambridge, MA, each day has a 30% chance of
being snowy/rainy (consider snowy/rainy to be any type of precipitation
throughout this problem). Also, precipitation on one day has no effect on
whether there is precipitation on any other day. Use this information to answer the
following questions:
a) Last week it rained 1 out of the 7 days. What is the probability of 1 or
fewer rainy days over the course of one week (7 days) in February in
Cambridge?

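X ~ Binomial(n = 7, p = 0.30)
$$P(X \le 1) = P(X = 0) + P(X = 1) = (0.70)^7 + 7(0.30)(0.70)^6 = 0.0824 + 0.2471 \approx 0.3294$$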

b) Let X be the random variable for the count of the number of days with
precipitation in January in Cambridge (treat this as a random sample of 31
days). What are the mean and standard deviation of X?
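Here n = 31 (the days in January) and p = 0.30:
$$E(X) = np = 31(0.30) = 9.3$$
$$SD(X) = \sqrt{np(1-p)} = \sqrt{31(0.30)(0.70)} = \sqrt{6.51} \approx 2.55$$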

c) There were 14 days with precipitation in the month of January this year.
Calculate the probability of there being exactly 14 rainy days in the 31 days
in January in Cambridge (a random sample of 31 days...go ahead and use
software).
$$P(X = 14) = \binom{31}{14}(0.30)^{14}(0.70)^{17}$$

d) Calculate the probability of there being 14 or more days with precipitation
in the month of January in Cambridge (a random sample of 31 days...go
ahead and use software).
$$P(X \ge 14) = 1 - P(X \le 13) = \sum_{k=14}^{31} \binom{31}{k}(0.30)^{k}(0.70)^{31-k}$$
http://www.wunderground.com/history/airport/KBOS/2011/1/1/CustomHistory.html?dayend=31&monthend=1&yearend=2011&req_city=NA&req_state=NA&req_statename=NA
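The four parts of problem 4 can be worked in software with a few lines of Python. This sketch assumes independent days with a 30% chance of precipitation each, as stated in the problem; `binom_pmf` is an illustrative helper name.

```python
from math import comb, sqrt

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

p = 0.30

# (a) One week, n = 7: P(X <= 1) = P(X = 0) + P(X = 1).
p_week_at_most_1 = sum(binom_pmf(k, 7, p) for k in range(2))

# (b) January, n = 31: mean and standard deviation.
mean_jan = 31 * p
sd_jan = sqrt(31 * p * (1 - p))

# (c) Exactly 14 days, and (d) 14 or more days, out of 31.
p_exactly_14 = binom_pmf(14, 31, p)
p_14_or_more = sum(binom_pmf(k, 31, p) for k in range(14, 32))

print(p_week_at_most_1, mean_jan, sd_jan, p_exactly_14, p_14_or_more)
```

Note that part (a) uses n = 7 while parts (b) through (d) use n = 31.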

5) You and your roommate are discussing how many classes you plan
on taking next semester (either 3, 4 or 5). Let X be the number of
classes you will decide to take and Y be the number of classes your
roommate will decide to take. The joint probability distribution for the
random variables X and Y is shown below:
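Number of classes taken (columns: you, X; rows: your roommate, Y)

| Your roommate (Y) \ You (X) | 3 | 4 | 5 | Total |
|---|---|---|---|---|
| 3 | 0.04 | 0.05 | 0.01 | 0.10 |
| 4 | 0.05 | 0.55 | 0.10 | 0.70 |
| 5 | 0.01 | 0.10 | 0.09 | 0.20 |
| Total | 0.10 | 0.70 | 0.20 | 1.00 |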

a) Confirm that this is a valid joint probability distribution.
We see that each entry (each entry is a probability of the
intersection of events for X and Y) is between 0 and 1
inclusive, and that they sum to one:
0.04 + 0.05 + 0.01 + 0.05 + 0.55 + 0.10 + 0.01 + 0.10 + 0.09 = 1
b) Find the marginal distribution of X. What are the mean and standard
deviation of X?
Essentially, this just means what is the distribution of X if we
ignore Y. Thus, we can just sum each column and find that:
P(X = 3) = 0.04 + 0.05 + 0.01 = 0.10
P(X = 4) = 0.05 + 0.55 + 0.10 = 0.70
P(X = 5) = 0.01 + 0.10 + 0.09 = 0.20
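Mean:
$$E(X) = 3(0.10) + 4(0.70) + 5(0.20) = 4.1$$
Variance:
$$Var(X) = 9(0.10) + 16(0.70) + 25(0.20) - (4.1)^2 = 17.1 - 16.81 = 0.29$$
Thus, the standard deviation is:
$$SD(X) = \sqrt{0.29} \approx 0.539$$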
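The marginal distribution and its summary statistics can be computed from the joint table as follows. This is a minimal sketch: the nested list `joint` transcribes the table from the problem, with rows for the roommate (Y) and columns for you (X), and the variable names are illustrative.

```python
from math import sqrt

values = [3, 4, 5]
# joint[i][j] = P(Y = values[i], X = values[j]); rows are Y, columns are X.
joint = [
    [0.04, 0.05, 0.01],
    [0.05, 0.55, 0.10],
    [0.01, 0.10, 0.09],
]

# Marginal of X: sum each column over all rows.
marginal_x = [sum(row[j] for row in joint) for j in range(len(values))]

mean_x = sum(x * p for x, p in zip(values, marginal_x))
var_x = sum((x - mean_x) ** 2 * p for x, p in zip(values, marginal_x))
sd_x = sqrt(var_x)

print(marginal_x, mean_x, var_x, sd_x)
```

Summing rows instead of columns gives the marginal of Y, which is identical here because the table is symmetric.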

c) Find the marginal distribution of Y. What are the mean and standard
deviation of Y?
Note, we just need to find the row sums for the marginal
distribution of Y, and they are exactly equal to that for X. So Y
has the same marginal distribution as X, and has the same
mean and standard deviation.
d) Are the number of classes you and your roommate take
independent? How do you know?

X and Y have identical marginal distributions, but this gives
us no information on whether they are related. In fact, they are not
independent. To determine independence, we would have to
check that every cell entry satisfies the equality
$$P(X = x, Y = y) = P(X = x)\,P(Y = y).$$
In this case, we know
$$P(X = 3, Y = 3) = 0.04 \ne 0.01 = (0.10)(0.10) = P(X = 3)\,P(Y = 3),$$
therefore they are dependent.

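e) What is the covariance between X and Y? What is the correlation
between # classes you and your roommate take?
By looking at the table, we see that when you take more
classes, your roommate tends to take more classes with you,
thus covariance and correlation should be positive. Here is the
calculation, using E(X) = E(Y) = 4.1 and Var(X) = Var(Y) = 0.29 from the
marginal distributions:
$$E(XY) = \sum_x \sum_y xy\,P(X = x, Y = y) = 9(0.04) + 12(0.05 + 0.05) + 15(0.01 + 0.01) + 16(0.55) + 20(0.10 + 0.10) + 25(0.09) = 16.91$$
$$Cov(X, Y) = E(XY) - E(X)E(Y) = 16.91 - (4.1)(4.1) = 0.10$$
$$Corr(X, Y) = \frac{Cov(X, Y)}{SD(X)\,SD(Y)} = \frac{0.10}{0.29} \approx 0.345$$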
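A short sketch of the covariance and correlation calculation from the joint table is given below. The list `joint` transcribes the table from the problem (rows are Y, columns are X); the variable names are illustrative.

```python
from math import sqrt

values = [3, 4, 5]
# joint[i][j] = P(Y = values[i], X = values[j]); rows are Y, columns are X.
joint = [
    [0.04, 0.05, 0.01],
    [0.05, 0.55, 0.10],
    [0.01, 0.10, 0.09],
]
cells = [(x, y, joint[i][j]) for i, y in enumerate(values) for j, x in enumerate(values)]

mean_x = sum(x * p for x, y, p in cells)
mean_y = sum(y * p for x, y, p in cells)
var_x = sum((x - mean_x) ** 2 * p for x, y, p in cells)
var_y = sum((y - mean_y) ** 2 * p for x, y, p in cells)

# Cov(X, Y) = E(XY) - E(X)E(Y); the correlation rescales by the standard deviations.
e_xy = sum(x * y * p for x, y, p in cells)
cov_xy = e_xy - mean_x * mean_y
corr_xy = cov_xy / (sqrt(var_x) * sqrt(var_y))

print(e_xy, cov_xy, corr_xy)
```

The covariance is 0.10 and the correlation is about 0.345, both positive as the table suggests.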

f) What is the conditional probability distribution of # classes your
roommate takes, given you take 5 classes?
This is saying that knowing that you are in the column
referring to X = 5, what is the probability for each of Y = 3, 4,
or 5. So we need to take the cell entries in that column and
divide by the column total to get:
$$P(Y = 3 \mid X = 5) = 0.01 / 0.20 = 0.05$$
$$P(Y = 4 \mid X = 5) = 0.10 / 0.20 = 0.50$$
$$P(Y = 5 \mid X = 5) = 0.09 / 0.20 = 0.45$$

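g) What are the expected value and variance of the # classes your
roommate takes, given you take 5 classes?
Mean:
$$E(Y \mid X = 5) = 3(0.05) + 4(0.50) + 5(0.45) = 4.4$$
Variance:
$$Var(Y \mid X = 5) = 9(0.05) + 16(0.50) + 25(0.45) - (4.4)^2 = 19.7 - 19.36 = 0.34$$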
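The conditional distribution of Y given X = 5, and its mean and variance, can be sketched as below. The list `joint` transcribes the table from the problem (rows are Y, columns are X); `given_x` and the other names are illustrative.

```python
values = [3, 4, 5]
# joint[i][j] = P(Y = values[i], X = values[j]); rows are Y, columns are X.
joint = [
    [0.04, 0.05, 0.01],
    [0.05, 0.55, 0.10],
    [0.01, 0.10, 0.09],
]

given_x = 5
col = values.index(given_x)

# Divide each entry in the X = 5 column by the column total P(X = 5).
column_total = sum(row[col] for row in joint)
cond_y = [row[col] / column_total for row in joint]

cond_mean = sum(y * p for y, p in zip(values, cond_y))
cond_var = sum((y - cond_mean) ** 2 * p for y, p in zip(values, cond_y))

print(cond_y, cond_mean, cond_var)
```

Changing `given_x` to 3 or 4 gives the other two conditional distributions.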

Again, they are most certainly dependent since the conditional
distribution of Y given X = 5 calculated in part (f) is not equal to the
marginal distribution of Y. [P(Y=3|X=5) = 0.05 ≠ 0.10 =
P(Y=3)]
Let V be the random variable for the difference in number of
classes that you take compared to your roommate (i.e. V = X - Y).
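i) What is the distribution of V? What are the mean and variance of V?
V's sample space is S_V = {-2, -1, 0, 1, 2}. The related
probabilities are:

| v | -2 | -1 | 0 | 1 | 2 |
|---|---|---|---|---|---|
| P(V=v) | 0.01 | 0.15 | 0.68 | 0.15 | 0.01 |

Mean:
$$E(V) = -2(0.01) - 1(0.15) + 0(0.68) + 1(0.15) + 2(0.01) = 0$$
Variance:
$$Var(V) = 4(0.01) + 1(0.15) + 0(0.68) + 1(0.15) + 4(0.01) - 0^2 = 0.38$$
Thus the standard deviation is
$$SD(V) = \sqrt{0.38} \approx 0.616$$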
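The distribution of V = X - Y can be built by adding up the joint probabilities of every cell with the same difference. This is a minimal sketch; `joint` transcribes the table from the problem (rows are Y, columns are X), and `dist_v` is an illustrative name.

```python
from collections import defaultdict
from math import sqrt

values = [3, 4, 5]
# joint[i][j] = P(Y = values[i], X = values[j]); rows are Y, columns are X.
joint = [
    [0.04, 0.05, 0.01],
    [0.05, 0.55, 0.10],
    [0.01, 0.10, 0.09],
]

# Accumulate P(V = v) over all (x, y) cells with x - y = v.
dist_v = defaultdict(float)
for i, y in enumerate(values):
    for j, x in enumerate(values):
        dist_v[x - y] += joint[i][j]

mean_v = sum(v * p for v, p in dist_v.items())
var_v = sum((v - mean_v) ** 2 * p for v, p in dist_v.items())
sd_v = sqrt(var_v)

for v in sorted(dist_v):
    print(v, round(dist_v[v], 2))
print(mean_v, var_v, sd_v)
```

The same loop works for any function of X and Y by replacing `x - y` with the function of interest.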

Similarly, since V is a linear combination of X and Y: