Minh Khue 2017

STATISTICS
SECTION II
Part A
Questions 1-5
Spend about 65 minutes on this part of the exam.
Percent of Section II score—75
Directions: Show all your work. Indicate clearly the methods you use, because you will be scored on the
correctness of your methods as well as on the accuracy and completeness of your results and explanations.
1. Population pyramids are a type of bar chart that show the distribution of ages of a country’s population.
The distributions of ages of men and women for two countries, A and B, for the year 2015 are shown in the
population pyramids below. The age-groups, in years, are listed in the center columns, and the proportions are
shown on the horizontal axes. Each bar represents the proportion of the age-group in the population for that sex.
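[Figure: population pyramids for Country A and Country B, 2015. Age-groups are listed in the center columns and proportions are on the horizontal axes; the handwritten marks on the bars are the student's estimated proportions.]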
Unauthorized copying or reuse of any part of this page is illegal.
GO ON TO THE NEXT PAGE.
(a) Is the proportion of the female population age 60 or older in Country A greater than the proportion of the
female population age 60 or older in Country B? Justify your answer.
The proportion of the female population age 60 or older in Country A is around
0.11 + 0.065 + 0.038 + 0.01 = 0.223.
The proportion of the female population age 60 or older in Country B is around
0.11 + 0.12 + 0.07 + 0.02 = 0.32.
=> The proportion of the female population age 60 or older in Country A is not greater than that in Country B.
(b) One of the two countries experienced an increase in the birth rate in the years 1946 to 1955 and another
increase in the birth rate about 20 years later. Based on the graphs, which country experienced the described
increases in birth rate? Justify your answer using information from the graphs.
Country B experienced the described increases in birth rate, since people who were born in 1946-1955 are 60-69 years old in 2015, and people who were born during the second increase are 40-49 in 2015.
Therefore, the proportions of the male and female population age 60-69 and 40-49 would be the highest in the graph. In B we can see that male and female in 60-69 and 40-49 are both around 0.14, the highest proportions of any age-group.
(c) For Country A, in which age-group is the median age of the male population? Justify your answer.
The median age will be at the 50th percentile of the age groups. The total proportion of the male population age 0-39 would be around 0.529. The proportion of the male population age 0-29 would be around 0.309.
Since 0.309 < 0.5 < 0.529, the median age of the male population for Country A will be in the age group of 30-39.

2. The ability to visually search, such as when reading an x-ray or interpreting a satellite image, is an important
skill in many jobs. Researchers conducted a study to investigate whether playing video games could improve a
person’s ability to visually search. Three video games were used in the study: one was a driving game, one was a
sports game, and one was a puzzle game. The participants consisted of 60 volunteers who had no experience
playing video games before the study. Each participant was randomly assigned to one of the three games so that
there were 20 participants per game.
(a) Describe an appropriate method for randomly assigning 60 participants to three groups so that each group
has 20 participants.
- Number the 60 participants from 1 to 60.
- Use a random number generator to generate 20 distinct numbers from 1 to 60; those participants go into the driving game group.
- Repeat the previous step with the number generator, ignoring numbers already used, to place 2 more groups of 20 participants into the sports game group and the puzzle game group.
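The assignment described above can be sketched in Python. This is only an illustration of one valid approach (shuffle the numbered participants, then deal them into three groups of 20); the function name `assign_groups` and the seed value are hypothetical and are not part of the exam or the student's answer.

```python
import random


def assign_groups(n_participants=60, games=("driving", "sports", "puzzle"), seed=None):
    """Randomly split participants numbered 1..n into equal-sized game groups."""
    rng = random.Random(seed)              # seed is only for reproducibility
    ids = list(range(1, n_participants + 1))
    rng.shuffle(ids)                       # random order, no number repeated
    size = n_participants // len(games)    # 20 participants per game
    return {
        game: sorted(ids[i * size:(i + 1) * size])
        for i, game in enumerate(games)
    }


groups = assign_groups(seed=2017)
for game, members in groups.items():
    print(game, len(members), members)
```

Shuffling once and slicing gives the same result as drawing 20 numbers at a time while skipping repeats, and it guarantees that every participant lands in exactly one group.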
The time to complete a visual search task was recorded for each participant before the assigned game was
played. The time to complete a visual search task was again recorded for each participant after the assigned
game was played. For each game, the mean improvement time (time before minus time after) was calculated.
(b) One researcher expressed an interest in investigating whether playing a driving game would lead to a
different mean improvement time to complete a visual search task than would playing a sports game.
Assuming the appropriate population values are approximately normally distributed, state the name of
a test with the appropriate null and alternative hypotheses that could be used to investigate the researcher’s
interest.
Two-sample t-test
$H_0: \mu_{\text{driving}} = \mu_{\text{sports}}$
$H_a: \mu_{\text{driving}} \neq \mu_{\text{sports}}$

(c) When the appropriate test was performed, no significant difference was found in mean improvement time
between the driving game and the sports game. Name one change the researchers could make in a future
study to increase the power of the test. Explain why such a change would increase the power.
The researchers could increase the $\alpha$ value, since increasing the $\alpha$ value would lower the chance of getting a Type II error ($\beta$). Hence the power of the test ($1 - \beta$) would increase.

3. In women’s tennis, a player must win 2 out of 3 sets to win a match. If a player wins the first 2 sets, she wins the
match and the third set is not played. Player V and Player M will compete in a match.
(a) Let V represent the event that Player V wins a set, and let M represent the event that Player M wins a set.
(i) List all possible sequences of events V and M by set played that will result in Player V winning the
match.
VV
VMV
MVV
(ii) List all possible sequences of events V and M by set played that will result in Player M winning the
match.
MM
MVM
VMM
Player V and Player M have competed against each other many times. Historical data show that each player is
equally likely to win the first set. If Player V wins the first set, the probability that she will win the second set
is 0.60. If Player V loses the first set, the probability that she will lose the second set is 0.70. If Player V wins
exactly one of the first two sets, the probability that she will win the third set is 0.45.
(b) What is the probability that Player V will win a match against Player M?
[Tree diagram: first set V 0.5 / M 0.5; second set V 0.6 / M 0.4 after V, and V 0.3 / M 0.7 after M; third set V 0.45 / M 0.55.]
There are three ways V can win a match:
VV = 0.5 × 0.6 = 0.3
VMV = 0.5 × 0.4 × 0.45 = 0.09
MVV = 0.5 × 0.3 × 0.45 = 0.0675
=> the probability that Player V will win is: 0.3 + 0.09 + 0.0675 = 0.4575
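As a check on the arithmetic above, the three winning paths can be multiplied out in a few lines of Python. The variable names are illustrative; the probabilities are the ones stated in the problem.

```python
# Probabilities stated in the problem.
p_first = 0.5           # each player is equally likely to win set 1
p_v2_given_v1 = 0.60    # V wins set 2 given that V won set 1
p_m2_given_m1 = 0.70    # V loses set 2 given that V lost set 1
p_v3 = 0.45             # V wins set 3 after splitting the first two sets

# Multiply along each branch of the tree that ends with V winning the match.
win_paths = {
    "VV": p_first * p_v2_given_v1,
    "VMV": p_first * (1 - p_v2_given_v1) * p_v3,
    "MVV": (1 - p_first) * (1 - p_m2_given_m1) * p_v3,
}
p_v_wins = sum(win_paths.values())

for path, prob in win_paths.items():
    print(path, round(prob, 4))
print("P(V wins match) =", round(p_v_wins, 4))
```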

(c) What is the probability that a match between Player V and Player M will consist of 3 sets given that
Player V wins the match?
There are two ways for Player V to win in a 3-set match: VMV, MVV.
The probability that Player V wins in a 3-set match given that Player V wins the match is:
(0.09 + 0.0675) / 0.4575 ≈ 0.344
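The conditional probability can be sketched as P(3 sets and V wins) / P(V wins). The snippet below reuses the three path probabilities from part (b) as plain constants; the names are illustrative.

```python
# Path probabilities for V winning the match, taken from part (b).
p_vv, p_vmv, p_mvv = 0.3, 0.09, 0.0675

p_three_sets_and_v_wins = p_vmv + p_mvv          # V wins in exactly 3 sets
p_v_wins = p_vv + p_three_sets_and_v_wins        # V wins in 2 or 3 sets

# Definition of conditional probability: P(A | B) = P(A and B) / P(B).
p_three_given_v_wins = p_three_sets_and_v_wins / p_v_wins
print(round(p_three_given_v_wins, 3))
```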
(d) What is the expected number of sets played when Player V competes in a match with Player M?
Number of sets   2 sets                  3 sets
Probability      0.3 + 0.5 × 0.7 (MM)    V wins: 0.09 + 0.0675 = 0.1575
                                         M wins: 0.5 × 0.4 × 0.55 (VMM) + 0.5 × 0.3 × 0.55 (MVM) = 0.1925
                 = 0.65                  total: 0.35
Expected number of sets played: 0.65 × 2 + 0.35 × 3 = 2.35 (sets)
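The expected value can also be checked by listing all six possible match sequences with their probabilities and weighting each by its length. This is a sketch using the probabilities given in the problem; the dictionary layout is just one convenient way to organize them.

```python
# Every possible match sequence and its probability (from the problem's values).
sequences = {
    "VV": 0.5 * 0.6,
    "MM": 0.5 * 0.7,
    "VMV": 0.5 * 0.4 * 0.45,
    "VMM": 0.5 * 0.4 * 0.55,
    "MVV": 0.5 * 0.3 * 0.45,
    "MVM": 0.5 * 0.3 * 0.55,
}

total_probability = sum(sequences.values())   # sanity check: should be 1

p_two_sets = sum(p for seq, p in sequences.items() if len(seq) == 2)
p_three_sets = sum(p for seq, p in sequences.items() if len(seq) == 3)

# E[sets] = sum over sequences of (number of sets) * (probability).
expected_sets = sum(len(seq) * p for seq, p in sequences.items())

print(round(p_two_sets, 4), round(p_three_sets, 4), round(expected_sets, 4))
```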

4. A sociologist was investigating the ages of grandparents of high school students. From a random sample of
10 high school students, the sociologist collected data on the current ages, in years, of the students’ maternal
grandparents. The data are shown in the table below.
Student               A    B    C    D    E    F    G    H    I    J    Mean   Standard Deviation
Age of grandmother    75   70   55   75   55   70   80   70   74   74   69.8   8.38
Age of grandfather    74   75   65   67   60   78   83   74   70   70   71.6   6.65
Difference            1    –5   –10  8    –5   –8   –3   –4   4    4    –1.8   5.81
Histogram bin         C    B    A    D    B    A    B    B    C    C
(a) Construct and interpret a 95 percent confidence interval for the population mean difference in age
(age of grandmother minus age of grandfather) of the maternal grandparents of high school students.
Conditions:
- The sample is randomly selected.
- 10% rule: the sample size is only 10, which is probably less than 10% of all high school students.
- Normality: no outliers or strong skewness => approximately normal.
[Histogram of the differences: frequency by bin, A: -10 to -6, B: -5 to -1, C: 0 to 4, D: 5 to 9.]
$\bar{x} \pm t^* \dfrac{s}{\sqrt{n}} = -1.8 \pm 2.262 \cdot \dfrac{5.81}{\sqrt{10}} = -1.8 \pm 4.156$
=> Confidence interval = (-5.956; 2.356)
We are 95% confident that the interval (-5.956; 2.356) will capture the true population mean difference in age (age of grandmother minus age of grandfather) of the maternal grandparents of high school students.
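The interval above can be reproduced from the table with the standard library. This is a sketch of a one-sample t interval on the paired differences; the critical value 2.262 (95% confidence, 9 degrees of freedom) is hard-coded from a t table rather than computed, and the variable names are illustrative.

```python
import math
import statistics

# Ages from the table, students A through J.
grandmother = [75, 70, 55, 75, 55, 70, 80, 70, 74, 74]
grandfather = [74, 75, 65, 67, 60, 78, 83, 74, 70, 70]

# Paired differences: grandmother minus grandfather.
diffs = [gm - gf for gm, gf in zip(grandmother, grandfather)]

n = len(diffs)
mean_diff = statistics.mean(diffs)     # sample mean of the differences
sd_diff = statistics.stdev(diffs)      # sample standard deviation (n - 1 divisor)

t_star = 2.262                         # t critical value for df = 9, taken from a table
margin = t_star * sd_diff / math.sqrt(n)
lower, upper = mean_diff - margin, mean_diff + margin

print(round(mean_diff, 2), round(sd_diff, 2))
print(round(lower, 3), round(upper, 3))
```

Small differences in the last decimal place relative to the handwritten interval come from rounding the standard deviation to 5.81 before multiplying.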
(b) One of the sociologist’s research questions was about the mean difference in age without regard to which
grandparent is older. The interval constructed in part (a) does not address such a question. Based on the
sample of high school students, give the value of the point estimate for the mean difference in age that could
be used to address the sociologist’s question.
Difference = |grandmother - grandfather|
Therefore, we have a new table:
Student      A   B   C    D   E   F   G   H   I   J
Difference   1   5   10   8   5   8   3   4   4   4
The new point estimate for the mean difference would be:
(1 + 5 + 10 + 8 + 5 + 8 + 3 + 4 + 4 + 4) / 10 = 5.2
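A short check of this point estimate: take the absolute value of each signed difference from the table and average. The list below is copied from the Difference row; the variable names are illustrative.

```python
# Signed differences (grandmother minus grandfather) from the table.
diffs = [1, -5, -10, 8, -5, -8, -3, -4, 4, 4]

# Ignore which grandparent is older by taking absolute values.
abs_diffs = [abs(d) for d in diffs]
mean_abs_diff = sum(abs_diffs) / len(abs_diffs)

print(abs_diffs, mean_abs_diff)
```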

5. An automobile manufacturer sold 30,000 new cars, one to each of 30,000 customers, in a certain year. The
manufacturer was interested in investigating the proportion of the new cars that experienced a mechanical
problem within the first 5,000 miles driven.
(a) A list of the names and addresses of all customers who bought the new cars is available. Describe a
sampling plan that could be used to obtain a simple random sample of 1,000 customers from the list.
Number all the customers in the list from 1 to 30,000. Use a random number generator to generate 1,000 distinct numbers out of the 30,000 numbers; the customers with those numbers form a sample of 1,000 customers from the list.
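One way to carry out this plan in Python is `random.sample`, which draws without replacement so no customer number repeats. This is a sketch; the function name `draw_srs` and the seed are hypothetical.

```python
import random


def draw_srs(population_size=30_000, sample_size=1_000, seed=None):
    """Return sample_size distinct customer numbers chosen from 1..population_size."""
    rng = random.Random(seed)   # seed is only for reproducibility
    # random.sample draws without replacement, so every number is distinct
    # and every set of 1,000 customers is equally likely to be chosen.
    return sorted(rng.sample(range(1, population_size + 1), sample_size))


chosen = draw_srs(seed=5)
print(len(chosen), chosen[:10])
```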
Each customer from a simple random sample of 1,000 customers who bought one of the new cars was asked
whether they experienced any mechanical problems within the first 5,000 miles driven. Forty customers from the
sample reported a problem. Of the 40 customers who reported a problem, 13 customers, or 32.5%, reported a
problem specifically with the power door locks.
(b) Explain why 0.325 should not be used to estimate the population proportion of the 30,000 new cars sold that
experienced a problem with the power door locks within the first 5,000 miles driven.
The point estimate must be: $\hat{p} = \dfrac{\text{number of customers in the sample who experienced the problem}}{\text{all customers in the sample}}$
0.325 is the proportion of customers who reported a problem with the door locks among the customers who reported a problem (13 out of 40), not of the sample size of 1000.
Hence, 0.325 can't be the point estimate for the population proportion of the 30,000 new cars that experienced a problem with the door locks within the first 5000 miles driven.

(c) Based on the results of the sample, give a point estimate of the number of new cars sold that experienced a
problem with the power door locks within the first 5,000 miles driven.
The proportion of new cars sold that experienced a problem with the power door locks in the sample is 13/1000 = 0.013.
The point estimate of the number of cars sold that experienced a problem with the door locks is:
0.013 × 30000 = 390 (cars)
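The same estimate as a short calculation: the sample proportion is taken over all 1,000 sampled customers and then scaled to the 30,000 cars sold. Variable names are illustrative.

```python
sample_size = 1_000          # customers in the simple random sample
door_lock_problems = 13      # sampled customers reporting a door-lock problem
cars_sold = 30_000           # population size

# Sample proportion uses the whole sample as the denominator, not the 40
# customers who reported some problem.
p_hat = door_lock_problems / sample_size
estimated_cars = p_hat * cars_sold

print(p_hat, round(estimated_cars))
```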

STATISTICS
SECTION II
Part B
Question 6
Spend about 25 minutes on this part of the exam.
Percent of Section II score—25
Directions: Show all your work. Indicate clearly the methods you use, because you will be scored on the
correctness of your methods as well as on the accuracy and completeness of your results and explanations.
6. Emily walks every day, and she keeps a record of the number of miles she walks each day. The histogram and
five-number summary below were created from the recorded miles for a random sample of 25 of the days
Emily walked.
Minimum   Q1    Median   Q3    Maximum
1.4       2.6   3.25     3.8   7.5
On one of the 25 days in the sample, Emily walked 7.5 miles. From the histogram, it appears that the value
7.5 might be an outlier relative to the other values. Two methods are proposed for identifying an outlier in a set
of data.
(a) One method for identifying an outlier is to use the interquartile range (IQR). An outlier is any number that is
greater than the upper quartile by at least 1.5 times the IQR or less than the lower quartile by at least
1.5 times the IQR. Does such a method identify the value of 7.5 miles as an outlier for Emily’s set of data?
Justify your answer.
IQR = 3.8 - 2.6 = 1.2
If larger than 3.8 + 1.2 × 1.5 = 5.6, the data will be an outlier.
=> 7.5 is an outlier (5.6 < 7.5)
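The 1.5 × IQR rule from the question can be sketched as a small check using the five-number summary. The helper name `is_outlier` is hypothetical.

```python
# Quartiles from the five-number summary.
q1, q3 = 2.6, 3.8

iqr = q3 - q1
upper_fence = q3 + 1.5 * iqr   # values at or above this are flagged
lower_fence = q1 - 1.5 * iqr   # values at or below this are flagged


def is_outlier(value):
    """Apply the rule as stated: at least 1.5 * IQR beyond a quartile."""
    return value >= upper_fence or value <= lower_fence


print(round(iqr, 2), round(lower_fence, 2), round(upper_fence, 2))
print(is_outlier(7.5), is_outlier(1.4))
```

Checking the minimum as well shows that only the maximum, 7.5, is flagged by this rule.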

Another method of identifying an outlier is to investigate whether there is evidence that a value might have come
from a population with a mean different from the mean of the population of the other values.
Let X and Y represent random variables. X is distributed normally with mean $\mu_X$ and standard deviation $\sigma$,
and Y is distributed normally with mean $\mu_Y$ and standard deviation $\sigma$. Consider 1 randomly selected value of Y
and n - 1 randomly selected values of X.
(b) Consider the difference $Y - \bar{X}$.
(i) In terms of $\mu_Y$ and $\mu_X$, what is the mean of the difference $Y - \bar{X}$?
$\mu_Y - \mu_X$
(ii) In terms of n and $\sigma$, what is the standard deviation of the difference $Y - \bar{X}$?
$\sqrt{\sigma^2 + \dfrac{\sigma^2}{n-1}}$
Suppose that of the n = 25 recorded values from Emily’s sample, the value of 7.5 comes from the distribution
of Y and the remaining 24 values come from the distribution of X. The summary statistics for the 24 values that
come from the distribution of X are given below.
n - 1 = 24
$\bar{x} = 3.171$
s = 0.821
(c) Use the value of the potential outlier and the summary statistics of the remaining 24 values to estimate the
mean and standard deviation of the difference $Y - \bar{X}$.
The estimated mean of $Y - \bar{X}$:
7.5 - 3.171 = 4.329
The estimated standard deviation of $Y - \bar{X}$:
$\sqrt{0.821^2 + \dfrac{0.821^2}{24}} \approx 0.8379$

Recall that a method for identifying an outlier is to investigate whether there is evidence that a value might
have come from a population with a mean different from the mean of the population of the other values. The
following hypotheses can be used for such an investigation.
$H_0: \mu_Y = \mu_X$
$H_a: \mu_Y \neq \mu_X$
(d) Calculate the value of the test statistic used for evaluating the hypotheses.
$t = \dfrac{4.329}{0.8379} \approx 5.17$
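The estimates in parts (c) and (d) can be reproduced numerically. The sketch below assumes Y and the sample mean of the 24 X values are independent, so their variances add, and it plugs in s = 0.821 as the estimate of the common standard deviation; variable names are illustrative.

```python
import math

y = 7.5          # the potential outlier (one value of Y)
x_bar = 3.171    # mean of the remaining 24 values
s = 0.821        # standard deviation of the remaining 24 values
n = 25           # total number of recorded values

# Estimated mean of Y minus the sample mean of X.
mean_diff = y - x_bar

# Var(Y - Xbar) = sigma^2 + sigma^2 / (n - 1) under independence (assumption).
sd_diff = math.sqrt(s**2 + s**2 / (n - 1))

# Test statistic: estimated difference divided by its estimated standard deviation.
t_stat = mean_diff / sd_diff

print(round(mean_diff, 3), round(sd_diff, 4), round(t_stat, 2))
```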
(e) The p-value for the hypothesis test described above is less than 0.0001. What conclusion can be made about
the population means, and what conclusion can be made about identifying 7.5 as an outlier? Justify your
answers.
Suppose $\alpha = 0.01$.
p-value < 0.0001 < $\alpha$
=> Reject $H_0$
=> There is convincing evidence that $\mu_Y \neq \mu_X$
=> 7.5 is an outlier according to the rule of the mentioned method

STOP
END OF EXAM
THE FOLLOWING INSTRUCTIONS APPLY TO THE COVERS OF THE

SECTION II BOOKLET.
• MAKE SURE YOU HAVE COMPLETED THE IDENTIFICATION

INFORMATION AS REQUESTED ON THE FRONT AND BACK
COVERS OF THE SECTION II BOOKLET.
• CHECK TO SEE THAT YOUR AP NUMBER LABEL APPEARS IN

THE BOX ON THE COVER.
• MAKE SURE YOU HAVE USED THE SAME SET OF AP

NUMBER LABELS ON ALL AP EXAMS YOU HAVE TAKEN
THIS YEAR.