1) If the largest value of a data set is doubled, which of these values will be affected and why?

a) Mean b) Median c) Standard Deviation d) IQR

The mean and standard deviation will be affected. Both of these measures are non-resistant to outliers. The other

two measures, the median and IQR, are resistant, as their values are determined by position in the ordered data.
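As a quick check of this answer, the sketch below doubles the largest value of a small made-up data set and compares the four measures before and after. The data values and the helper name `summary` are assumptions for illustration only; quartiles use the interpolating ("inclusive") method from Python's `statistics` module, which may differ slightly from a textbook's convention.

```python
import statistics

def summary(values):
    """Return the four measures from the question for a list of numbers."""
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "mean": statistics.mean(values),
        "median": med,
        "stdev": statistics.stdev(values),
        "iqr": q3 - q1,
    }

# Hypothetical data set, already sorted from smallest to largest.
data = [3, 5, 6, 8, 9, 11, 14]
# Same data with only the largest value doubled.
doubled = data[:-1] + [2 * data[-1]]

before = summary(data)
after = summary(doubled)

for name in before:
    changed = "changed" if before[name] != after[name] else "unchanged"
    print(f"{name:>6}: {before[name]:.3f} -> {after[name]:.3f} ({changed})")
```

In this example the mean and standard deviation increase, while the median and IQR stay the same because they depend only on the values in the middle of the ordered list.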

2) The following list is a set of data ordered from smallest to largest. All values are integers.

2, 12, y, y, y, 15, 18, 18, 19

Which of the following could be true?

I. The median and first quartile can be equal.

II. The mode is 18.

III. 2 is an outlier

I and III only. I: the median and Q1 are equal if y = 12. II: the mode is y, and y cannot be 18. III: 2 is an outlier if the IQR is 6,

which is the largest it could be.
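The three statements can be checked by trying every integer y that keeps the list in order (12 through 15). This is a minimal sketch, assuming the common textbook convention that Q1 and Q3 are the medians of the lower and upper halves with the overall median excluded; the function name `check` is hypothetical.

```python
import statistics

def check(y):
    """Evaluate statements I-III for one allowed value of y."""
    data = [2, 12, y, y, y, 15, 18, 18, 19]  # stays sorted for 12 <= y <= 15
    median = statistics.median(data)          # 5th of 9 values, which is y
    q1 = statistics.median(data[:4])          # lower half: 2, 12, y, y
    q3 = statistics.median(data[5:])          # upper half: 15, 18, 18, 19
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    return {
        "y": y,
        "q1_equals_median": q1 == median,
        "mode": statistics.mode(data),
        "iqr": iqr,
        "two_is_outlier": 2 < lower_fence,
    }

results = [check(y) for y in range(12, 16)]
for row in results:
    print(row)
```

Under this quartile convention, Q1 equals the median only when y = 12, the mode is always y, and 2 falls below the lower fence for every allowed y, including the case where the IQR is 6.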

3) The scores of male (M) and female (F) students on a statistics

exam are displayed in the following boxplot. The pluses indicate

the location of the means. Compare the two distributions.

The median of the females is slightly higher than the median of

the males. The range of the females is smaller than that of the

males. Based on the shape of the distributions, each is

approximately symmetric or slightly skewed left (the mean is less

than the median, and the tails are longer on the left). There are no

outliers for either distribution. 50% of females scored between

55-90 while 50% of males scored between 45-90.

4) A substitute teacher was asked to keep track of how long it took her to get to her

assigned school each morning. Here is a stem plot of the data. Would you expect the

mean to be higher or lower than the median?

Higher, because the data is skewed to the right.

5) Suppose that a normal model describes the acidity (pH) of rainwater, and that water tested after last week's

storm had a z-score of 1.8. What does the 1.8 mean in terms of the acidity of the rain?

This means that the rainwater from last week's storm had a pH 1.8 standard deviations above the mean pH of rainwater.

6) The height of male Labrador retrievers is normally distributed with a mean of 23.5 inches and a standard

deviation of 0.8 inches. Labradors must fall under a height limit in order to participate in certain dog shows. If the

maximum height is 24.5 inches for male labs, what percentage of male labs are not eligible?

About 10.56%: z = (24.5 - 23.5)/0.8 = 1.25, and P(z > 1.25) = 0.1056.
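The tail area can be reproduced with `NormalDist` from Python's standard library, as a sketch of the calculation rather than a required method; the variable names are illustrative.

```python
from statistics import NormalDist

# Heights of male Labradors: mean 23.5 inches, standard deviation 0.8 inches.
heights = NormalDist(mu=23.5, sigma=0.8)

limit = 24.5
z_limit = (limit - heights.mean) / heights.stdev  # z-score of the height limit
p_too_tall = 1 - heights.cdf(limit)               # area above the limit

print(f"z = {z_limit:.2f}, proportion not eligible = {p_too_tall:.4f}")
```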

7) A large college class is graded on a total points system. The total points earned in a semester by the students in

the class vary normally with a mean of 675 and a standard deviation of 50. Another large class in a different

department is graded on a 0 to 100 scale. The final grades in that class follow a normal model with a mean of 82

and a standard deviation of 6. Jessica earns 729 points in the first class, while Ana scores 90 in the second class.

Which student did better and why?

Ana did better because her score is (90 - 82)/6 = 1.33 standard deviations above the mean, while Jessica's is only

(729 - 675)/50 = 1.08 standard deviations above the mean.
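The comparison works because z-scores put both classes on the same scale. A minimal sketch of the arithmetic, with a hypothetical helper `z_score`:

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies above the mean."""
    return (value - mean) / sd

z_jessica = z_score(729, mean=675, sd=50)  # total-points class
z_ana = z_score(90, mean=82, sd=6)         # 0-to-100 class

print(f"Jessica: z = {z_jessica:.2f}")
print(f"Ana:     z = {z_ana:.2f}")
print("Better relative standing:", "Ana" if z_ana > z_jessica else "Jessica")
```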

8) Heights of fourth graders are normally distributed with a mean of 52 inches and a standard deviation of 3.5

inches. For a research project, you plan to measure a simple random sample of 30 fourth graders. For samples such

as yours, 10% of the samples should have an average height below what number?

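About 51.18 inches. Sample means for n = 30 have standard deviation 3.5/sqrt(30) = 0.639, and the 10th percentile has z = -1.28, so 52 - 1.28(0.639) = 51.18.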
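The same value can be obtained from the sampling distribution of the sample mean, which is normal with mean 52 and standard deviation 3.5/sqrt(30). A short sketch using the standard library (variable names are illustrative):

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 52, 3.5, 30

# Sampling distribution of the mean height for samples of 30 fourth graders.
standard_error = sigma / sqrt(n)
sample_means = NormalDist(mu=mu, sigma=standard_error)

# Height below which 10% of sample means fall.
cutoff = sample_means.inv_cdf(0.10)

print(f"standard error = {standard_error:.3f}, 10th percentile = {cutoff:.2f} inches")
```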

9) The risk of developing iron deficiency is especially high during pregnancy. Detecting such a deficiency is

complicated by the fact that some methods for determining iron status can be affected by the state of pregnancy

itself. Consider the following data on transferrin receptor concentration for a sample of women with laboratory

evidence of overt iron deficiency anemia.

a) Compute the values of the sample mean and median. Why are these values different here? Which one do you

regard as more representative of the sample, and why?

Mean = 11.6, median = 10.05. The values differ because the distribution is skewed to the right: the high values (such as 20.4)

pull the mean above the median. The median is more representative of the sample because it is resistant to outliers and skew.

b) Compute the 5 number summary along with a box plot for the

previous problem. Also compute the variance and standard deviation.

Min = 7.6, Q1 = 9.375, median = 10.05, Q3 = 12.73, max = 20.40

Lower Fence = Q1 - 1.5(IQR), Upper Fence = Q3 + 1.5(IQR)

Variance = 14.4154, standard deviation = 3.7968
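These summary values can be reproduced from the twelve data values listed in this worksheet (15.2 through 8.3). The sketch below assumes the interpolating ("inclusive") quartile method, which gives Q1 = 9.375 and Q3 = 12.725; other quartile conventions give slightly different quartiles. It also evaluates the two fence formulas.

```python
import statistics

# Transferrin receptor concentrations from the worksheet.
data = [15.2, 9.3, 7.6, 11.9, 10.4, 9.7,
        20.4, 9.4, 11.5, 16.2, 9.4, 8.3]

mean = statistics.mean(data)
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
variance = statistics.variance(data)  # sample variance (divides by n - 1)
sd = statistics.stdev(data)

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(f"mean = {mean:.2f}, median = {median:.2f}")
print(f"five-number summary: {min(data)}, {q1:.3f}, {median:.2f}, {q3:.3f}, {max(data)}")
print(f"fences: {lower_fence:.3f} to {upper_fence:.3f}, outliers: {outliers}")
print(f"variance = {variance:.4f}, sd = {sd:.4f}")
```

With these quartiles the only value outside the fences is 20.4, which is consistent with the right skew described in part (a).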

10) A researcher interested in the age at which women are having

their first child surveyed a random sample of 250 women having at

least one child and found an approximately normal distribution with a mean age of 27.3 and a standard deviation of

5.4. According to the Empirical rule, between what ages did approximately 95% of the women have their first

child?

The empirical rule states that approximately 95% of the data lies within 2 standard deviations of the mean. Therefore about 95%

of the women had their first child between the ages of 27.3 - 2(5.4) = 16.5 and 27.3 + 2(5.4) = 38.1 years.

Data for problem 9 (transferrin receptor concentration):

15.2 9.3 7.6 11.9 10.4 9.7

20.4 9.4 11.5 16.2 9.4 8.3

Free Responses

11) A machine is used to fill soda bottles in a factory. The bottles are labeled as containing 2.0 liters, but extra

room at the top of the bottle allows for a maximum of 2.25 liters of soda before the bottle overflows. The

standard deviation of the amount of soda put into the bottles by the machine is known to be 0.15 liter.

(a) Overfilling the bottles causes a mess on the assembly line, but consumers will complain if bottles contain less

than 2 liters. If the machine is set to fill the bottles with an average of 2.08 liters, what proportion of bottles will

be overfilled?

Let x = amount of soda put into the bottle, in liters. X~N(2.08, 0.15)

P(x>2.25) = P(z>(2.25 - 2.08)/0.15) = P(z>1.13) = 0.129

(b) If management requires that no more than 3% of bottles should be overfilled, the machine should be set to fill

the bottles with what mean amount?

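A z-score of 1.88 has 3% of the normal model above it, so (2.25 - mean)/0.15 = 1.88 and mean = 2.25 - 1.88(0.15) = 1.968 liters.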

(c) Complaints from consumers about underfilled bottles lead the company to set the mean amount to 2.15 liters.

In this situation, what standard deviation would allow for no more than 3% of the bottles to be overfilled?

A z-score of 1.88 divides the normal model so that 97% lies below it and 3% lies above it: P(z> 1.88) = 0.0300.

We require that only 3% have over 2.25 liters of soda, so (2.25 - 2.15)/sd = 1.88 and sd = 0.10/1.88 = 0.053 liters.
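All three parts of problem 11 use the same relationship, z = (2.25 - mean)/sd, solved for a different unknown each time. The sketch below works through them with `NormalDist` from the standard library; it uses the exact 97th-percentile z-score (about 1.881) rather than the rounded table value 1.88, so the results may differ in the last digit. Variable names are illustrative.

```python
from statistics import NormalDist

OVERFLOW = 2.25  # liters a bottle can hold before it overflows
SD = 0.15        # standard deviation of the fill amount, in liters

# (a) Proportion overfilled when the machine is set to a mean of 2.08 liters.
p_overfilled = 1 - NormalDist(mu=2.08, sigma=SD).cdf(OVERFLOW)

# z-score with 97% of the standard normal model below it and 3% above.
z_97 = NormalDist().inv_cdf(0.97)

# (b) Mean setting so that only 3% overflow: solve (2.25 - mean)/0.15 = z.
required_mean = OVERFLOW - z_97 * SD

# (c) Standard deviation needed when the mean is 2.15: solve (2.25 - 2.15)/sd = z.
required_sd = (OVERFLOW - 2.15) / z_97

print(f"(a) P(overfilled) = {p_overfilled:.3f}")
print(f"(b) mean setting  = {required_mean:.3f} liters")
print(f"(c) required sd   = {required_sd:.3f} liters")
```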

12) As a project in their physical education classes, elementary school students were asked to kick a soccer ball

into a goal from a fixed distance away. Each student was given 8 chances to kick the ball, and the number of goals

was recorded for each student. The number of goals for 200 first graders is given in the table.

Number of goals scored    0   1   2   3   4   5   6   7   8
Number of 1st graders    14  37  51  33  30  14  11   7   3

In order to compare whether older children are better at kicking goals, the exercise was repeated with 200 fourth

graders.

Number of goals scored    0   1   2   3   4   5   6   7   8
Number of 4th graders     5  11  18  24  27  34  39  28  14

a) Graph these two distributions so that the number of goals scored by the first

graders and the number of goals scored by the fourth graders can be easily

compared.

b) Based on your graphs, how do the results from the fourth graders differ

from those of the first graders? Write a few sentences to answer this question.

The fourth graders tended to score more goals. The fourth graders' mean (4.695), median (5) and mode (6) were all

larger than those of the first graders (mean = 2.835, median = 2, mode = 2). Both grades have approximately the

same variability (for fourth graders, standard deviation = 2.1 and IQR = 3; for first graders, standard deviation =

1.9 and IQR = 3). The distribution for the goals scored by the fourth graders is skewed left, while the distribution

for the first graders is skewed right.
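The summary statistics quoted in this answer can be recomputed from the two frequency tables, and a simple text histogram gives one way to compare the two grades side by side. This is a sketch only: the helper names are hypothetical, quartiles are taken as medians of the lower and upper halves, and a proper answer to part (a) would normally use back-to-back or side-by-side histograms.

```python
import statistics

goals = range(9)  # possible scores: 0 through 8
first_counts = [14, 37, 51, 33, 30, 14, 11, 7, 3]
fourth_counts = [5, 11, 18, 24, 27, 34, 39, 28, 14]

def expand(counts):
    """Turn a frequency table into a sorted list of individual scores."""
    return [g for g, c in zip(goals, counts) for _ in range(c)]

def describe(counts):
    """Center and spread of one grade's scores."""
    values = expand(counts)
    half = len(values) // 2
    q1 = statistics.median(values[:half])
    q3 = statistics.median(values[half:])
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "sd": statistics.stdev(values),
        "iqr": q3 - q1,
    }

first = describe(first_counts)
fourth = describe(fourth_counts)
print("1st graders:", first)
print("4th graders:", fourth)

# Side-by-side text histogram: one '*' per 2 students.
for g, c1, c4 in zip(goals, first_counts, fourth_counts):
    print(f"{g} | {'*' * (c1 // 2):<26} | {'*' * (c4 // 2)}")
```

In the printed histogram the first graders' bars pile up at low scores with a tail toward high scores, and the fourth graders' bars show the reverse pattern, matching the skew described above.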

