
SUMMARY MEASURES

MULTIPLE CHOICE QUESTIONS

In the following multiple-choice questions, please circle the correct answer.

1. Which of the following are the three most common measures of central location?

a. Mean, median and mode
b. Average, variance and standard deviation
c. Mode, sample mean, and sample variance
d. Mean, median, and average

2. Which of the following are considered measures of association?

a. Mean and variance
b. Variance and correlation
c. Covariance and correlation
d. Covariance and variance

3. The difference between the first and third quartiles is called the ______.

a. interquartile range
b. interdependent range
c. unimodal range
d. common occurrence range
e. none of the above

4. If a value represents the 95th percentile, this means:

a. 95% of all values are below this point
b. 95% of all values are above this point
c. 95% of the time you will observe this value
d. there is a 5% chance that this value is incorrect
e. none of the above

5. For a boxplot, the vertical line inside the box indicates the location of the ______.

a. mean
b. median
c. mode
d. minimum value
e. maximum value

6. The correlation coefficient is always:

a. between –1 and 0
b. between 0 and +1
c. between 0.0 and 100
d. between –1 and +1

7. Which of the following are the two most commonly used measures of variability?

a. Variance and mode
b. Variance and standard deviation
c. Sample mean, and sample variance
d. Mean and range

8. The mode is best described as:

a. the middle observation
b. the same as the average
c. the 50th percentile
d. the most frequently occurring value

9. The median can also be described as:

a. the middle observation when the data is arranged in ascending order
b. the second quartile
c. the 50th percentile
d. all of the above
e. none of the above

10. For a boxplot, the box itself represents ______ percent of the observations.

a. 25
b. 50
c. 75
d. 90
e. 100

Describing Data: Summary Measures

TEST QUESTIONS

QUESTIONS 11 AND 12 ARE BASED ON THE FOLLOWING INFORMATION:

A statistics professor has just given a final examination in his statistical inference course.
He is particularly interested in learning how his class of 40 students performed on this
exam. The scores are shown below.

75 79 72 75 77 71 78 83 84 71
81 82 79 71 73 89 74 75 93 74
88 83 90 82 79 62 73 88 76 76
80 76 84 84 80 68 74 76 70 91

11. What are the mean and median scores on this exam?

12. Explain why the mean and median are different.
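The computation asked for in Question 11 can be sketched in a few lines of Python using the standard `statistics` module. The variable names are illustrative and are not part of the original question set.

```python
import statistics

# Exam scores for the 40 students, copied from the table above.
scores = [
    75, 79, 72, 75, 77, 71, 78, 83, 84, 71,
    81, 82, 79, 71, 73, 89, 74, 75, 93, 74,
    88, 83, 90, 82, 79, 62, 73, 88, 76, 76,
    80, 76, 84, 84, 80, 68, 74, 76, 70, 91,
]

mean_score = statistics.mean(scores)      # arithmetic average of all 40 scores
median_score = statistics.median(scores)  # average of the 20th and 21st sorted scores

print(f"mean = {mean_score:.2f}, median = {median_score:.2f}")
# prints: mean = 78.40, median = 77.50
```

A mean somewhat above the median is what one would expect when a few high scores pull the average upward, which is one possible line of reasoning for Question 12.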

QUESTIONS 13 AND 14 ARE BASED ON THE FOLLOWING INFORMATION:

The average time, in minutes, for a sample of 40 people living in the Detroit metropolitan
area to travel to work and back home each day is shown below.

42.8 47.1 46.3 42.2 55.9 35.0 41.4 37.8 33.9 36.6
51.8 43.1 51.5 45.5 40.7 57.6 36.5 32.3 41.9 37.0
41.3 32.6 34.1 31.9 42.9 48.5 46.7 40.9 46.6 29.5
61.5 36.5 48.9 40.6 48.1 39.8 35.6 44.4 38.6 45.6

13. Find the most representative average commute time.

14. Does it appear that the distribution of average commute time is approximately
symmetric? Explain why or why not.
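One rough way to examine symmetry for Question 14 is to compare the mean with the median and to compare how far the quartiles and the extremes lie from the median. The sketch below assumes the "inclusive" quartile method of Python's `statistics.quantiles`; other quartile conventions give slightly different values. All variable names are illustrative.

```python
import statistics

# Commute times in minutes for the 40 sampled people, copied from the table above.
commute = [
    42.8, 47.1, 46.3, 42.2, 55.9, 35.0, 41.4, 37.8, 33.9, 36.6,
    51.8, 43.1, 51.5, 45.5, 40.7, 57.6, 36.5, 32.3, 41.9, 37.0,
    41.3, 32.6, 34.1, 31.9, 42.9, 48.5, 46.7, 40.9, 46.6, 29.5,
    61.5, 36.5, 48.9, 40.6, 48.1, 39.8, 35.6, 44.4, 38.6, 45.6,
]

mean_time = statistics.mean(commute)
median_time = statistics.median(commute)

# Quartiles under the assumed "inclusive" interpolation method.
q1, q2, q3 = statistics.quantiles(commute, n=4, method="inclusive")

# Distances from the median to each extreme; unequal tails hint at skewness.
lower_tail = median_time - min(commute)
upper_tail = max(commute) - median_time

print(f"mean = {mean_time:.2f}, median = {median_time:.2f}")
print(f"Q1 = {q1:.2f}, Q3 = {q3:.2f}")
print(f"lower tail = {lower_tail:.2f}, upper tail = {upper_tail:.2f}")
```

If the mean exceeds the median and the upper tail is longer than the lower tail, that would suggest mild positive skew, in which case the median may be the more representative average.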

QUESTION 15 IS BASED ON THE FOLLOWING INFORMATION:

A production manager for Bell Computers is interested in determining the variability of
the proportion of defective items in a shipment of one of the computer components used in
their product. The results below were calculated using information on the percent
defective from 50 randomly selected shipments collected over a 6-week period.

Count                50.000
Mean                  0.210
Median                0.206
Standard deviation    0.057
Minimum               0.104
Maximum               0.299
Variance              0.003
First quartile        0.153
Third quartile        0.248

15. Explain what would cause the mean to be higher than the median in this case.

QUESTIONS 16 THROUGH 18 ARE BASED ON THE FOLLOWING INFORMATION:

A manager for Marko Manufacturing, Inc. has recently been hearing some complaints
that women are being paid less than men for the same type of work in one of their
manufacturing plants. The boxplots shown below represent the annual salaries for all
salaried workers in that facility (40 men and 34 women).

[Figure: boxplots of Female_Salary and Male_Salary on a common horizontal axis running from 0 to 80000]

16. Would you conclude that there is a difference between the salaries of women and
men in this plant? Justify your answer.


17. How large must a person’s salary be to qualify as an outlier on the high
side? How many outliers are there in these data?

18. What can you say about the shape of the distributions given the boxplots above?

QUESTIONS 19 THROUGH 21 ARE BASED ON THE FOLLOWING INFORMATION:

Below you will find current annual salary data and related information for 30 employees
at Gamma Technologies, Inc. These data include each selected employee's gender (1 for
female; 0 for male), age, number of years of relevant work experience prior to
employment at Gamma, number of years of employment at Gamma, the number of years
of post-secondary education, and annual salary. The tables of correlations and
covariances are presented below.

Table of Correlations

               Gender        Age  Prior_Exp  Gamma_Exp  Education     Salary
Gender          1.000
Age            -0.111      1.000
Prior_Exp       0.054      0.800      1.000
Gamma_Exp      -0.203      0.916      0.587      1.000
Education      -0.039      0.518      0.434      0.342      1.000
Salary         -0.154      0.923      0.723      0.870      0.617      1.000

Table of Covariances (variances on the diagonal)

               Gender        Age  Prior_Exp  Gamma_Exp  Education     Salary
Gender          0.259
Age            -0.633    134.051
Prior_Exp       0.117     39.060     19.045
Gamma_Exp      -0.700     72.047     17.413     49.421
Education      -0.033      9.951      3.140      3.987      2.947
Salary       -1825.97  249702.35   73699.75  143033.29   24747.68  584640062

19. Which two variables have the strongest linear relationship with annual salary?

20. For which of the two variables, number of years of prior work experience or
number of years of post-secondary education, is the relationship with salary
stronger? Justify your answer.

21. How would you characterize the relationship between gender and annual salary?
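Questions 19 through 21 come down to comparing the magnitudes and signs of the entries in the Salary row of the Table of Correlations. A minimal sketch of that comparison follows; the dictionary and variable names are illustrative, and the values are copied from the table.

```python
# Correlations with Salary, copied from the last row of the Table of Correlations.
corr_with_salary = {
    "Gender": -0.154,
    "Age": 0.923,
    "Prior_Exp": 0.723,
    "Gamma_Exp": 0.870,
    "Education": 0.617,
}

# Rank by absolute value: the magnitude reflects the strength of the linear
# relationship, while the sign reflects its direction.
ranked = sorted(corr_with_salary.items(), key=lambda kv: abs(kv[1]), reverse=True)

for name, r in ranked:
    direction = "positive" if r > 0 else "negative"
    print(f"{name:<10} r = {r:+.3f} ({direction})")
```

Correlations are used here rather than covariances because covariances depend on the units of each variable and so cannot be compared directly across variables.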

QUESTIONS 22 THROUGH 24 ARE BASED ON THE FOLLOWING INFORMATION:

The data shown below contain family incomes (in thousands of dollars) for a set of 50
families, sampled in 1980 and 1990. Assume that these families are representative
of the entire United States.

1980  1990    1980  1990    1980  1990
  58    54      33    29      73    69
   6     2      14    10      26    22
  59    55      48    44      64    70
  71    57      20    16      59    55
  30    26      24    20      11     7
  38    34      82    78      70    66
  36    32      95    97      31    27
  33    29      12     8      92    88
  72    68      93    89     115   111
 100    96     100   102      62    58
   1     0      51    47      23    19
  27    23      22    18      34    30
  22    47      50    75      36    61
 141   166     124   149     125   150
  72    97     113   138     121   146
 165   190     118   143      88   113
  79   104      96   121

22. Find the mean, median, standard deviation, first and third quartiles, and the 95th
percentile for family incomes in both years.

ANSWER:
                     Income 1980   Income 1990
Mean                      62.820        67.120
Median                    59.000        57.500
Standard deviation        39.786        48.087
First quartile            30.250        27.500
Third quartile            92.750        97.000
95th percentile          124.550        149.55

23. The Republicans claim that the country was better off in 1990 than in 1980,
because the average income increased. Do you agree?


24. Generate a boxplot to summarize the data. What does the boxplot indicate?

QUESTIONS 25 THROUGH 26 ARE BASED ON THE FOLLOWING INFORMATION:

In an effort to provide more consistent customer service, the manager of a local fast-food
restaurant would like to know the dispersion of customer service times about their
average value for the facility’s drive-up window. The data below represents the customer
service times (in minutes) for a sample of 47 customers collected over the past week.

Count                47.000
Mean                  0.914
Median                0.822
Standard deviation    0.511
Minimum               0.095
Maximum               2.372
Variance              0.261
First quartile        0.563
Third quartile        1.180

25. Interpret the variance and standard deviation of these sample data.

26. Explain what would cause the mean to be higher than the median in this case.
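The relationship between the reported variance and standard deviation, and a quick check for unusually long service times, can be sketched from the summary table alone. The 1.5 × IQR fence used below is a common boxplot convention and is an assumption here, since the question set does not state which outlier rule applies. The variable names are illustrative.

```python
import math

# Summary statistics copied from the table above (service times in minutes).
variance = 0.261
q1, q3 = 0.563, 1.180
maximum = 2.372

# The standard deviation is the square root of the variance, so it is expressed
# in minutes, whereas the variance is in squared minutes.
std_dev = math.sqrt(variance)

# Assumed convention: a value above Q3 + 1.5 * IQR counts as a high-side outlier.
iqr = q3 - q1
upper_fence = q3 + 1.5 * iqr

print(f"standard deviation = {std_dev:.3f}")      # 0.511
print(f"IQR                = {iqr:.3f}")          # 0.617
print(f"upper fence        = {upper_fence:.4f}")  # 2.1055
print(f"maximum above fence: {maximum > upper_fence}")  # True
```

Under that assumed rule, the maximum of 2.372 minutes lies beyond the upper fence, and a few unusually long service times of this kind are one way the mean can end up above the median.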

TRUE / FALSE QUESTIONS

27. If all values in a data set are negative, the value of the standard deviation may be
either positive or negative.

28. Assume that the histogram of a data set is symmetric and bell shaped, with mean
of 72 and standard deviation of 10. Then, using the “rules of thumb”, we can say
that 95% of the data values were between 52 and 92.

29. If a histogram has a single peak and looks approximately the same to the left and
right of the peak, we should expect no difference in the values of the mean,
median, and mode.

30. The mean is a measure of central location.

31. In a negatively skewed distribution, the mean is larger than the median and the
median is larger than the mode.

32. Since the population is always larger than the sample, the population mean is
always larger than the sample mean.

33. The length of the box in the boxplot portrays the interquartile range.

34. In a positively skewed distribution, the mean is smaller than the median and the
median is smaller than the mode.

35. The interquartile range is considered the weakest measure of central location.

36. The value of the standard deviation always exceeds that of the variance.

37. The difference between the first and third quartiles is called the interquartile
range.

38. The standard deviation is measured in original units, such as dollars and pounds.

39. The median is one of the most frequently used measures of variability.

40. Both the covariance and the correlation measure the strength and direction of a
linear relationship between two numerical variables.

41. Abby has been keeping track of what she spends to rent movies. The last seven
weeks' expenditures, in dollars, were 6, 4, 8, 9, 6, 12, and 4. The mean amount
Abby spends on renting movies is $7.


42. If two data sets have the same range, the smallest and largest observations in both
sets will be the same.

43. The variance is a measure of the linear relationship between two variables.

44. Generally speaking, if two variables are unrelated, the covariance will be a
positive or negative number close to zero.

45. The correlation between two variables is a unitless quantity that is always
between –1 and +1.

46. The variance is the positive square root of the standard deviation.

47. It is possible that the data points are close to a curve and have a correlation close
to 0, because correlation is relevant only for measuring linear relationships.

48. Expressed in percentiles, the interquartile range is the difference between the 25th
and 75th percentiles.

49. A data sample has a mode of 140, a median of 130, and a mean of 120. The
distribution of the data is positively skewed.

50. A student scores 85, 75, and 80 on three exams during the semester and 90 on the
final exam. If the final is weighted double and the three others weighted equally,
the student's final average would be 80.
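The arithmetic behind items 28, 41, 47, and 50 can be checked with a short Python sketch. Two things are assumptions made for illustration: the "rules of thumb" in item 28 are read as mean ± 2 standard deviations for roughly 95% of the data, and the parabola data used for item 47 are invented, not taken from the question set.

```python
import statistics

# Item 28: empirical rule, assuming "about 95%" means mean +/- 2 standard deviations.
mean, sd = 72, 10
interval_95 = (mean - 2 * sd, mean + 2 * sd)
print(interval_95)    # (52, 92)

# Item 41: Abby's seven weekly expenditures, in dollars.
abby_mean = statistics.mean([6, 4, 8, 9, 6, 12, 4])
print(abby_mean)      # 7

# Item 50: weighted average with the final exam counted twice.
exam_scores = [85, 75, 80, 90]
weights = [1, 1, 1, 2]
weighted_avg = sum(s * w for s, w in zip(exam_scores, weights)) / sum(weights)
print(weighted_avg)   # 84.0

# Item 47: hypothetical points lying exactly on the curve y = x**2,
# placed symmetrically about x = 0.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]
x_bar, y_bar = statistics.mean(xs), statistics.mean(ys)
# Sum of cross-products: the numerator of both the covariance and the correlation.
cross_products = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
print(cross_products)  # 0
```

A zero sum of cross-products means the correlation is zero even though every point lies exactly on a curve, which illustrates why correlation only captures linear relationships.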
