Tutorial 4
Problem set
1. a) Will the sample mean always correspond to one of the observations in the sample?
b) Will exactly half of the observations in a sample fall below the mean?
c) For any set of data values, is it possible for the sample standard deviation to be larger
than the sample mean? If so, give an example.
d) Can the sample standard deviation be equal to zero? If so, give an example.
2. In their book Introduction to Linear Regression Analysis (5th edition, Wiley, 2012),
Montgomery, Peck, and Vining presented measurements of NbOCl3 concentration from a
tube-flow reactor experiment. The data, in gram-moles per liter × 10^(-3), are listed below.
Construct a histogram of these data in R. Compute the sample mean, the sample median, and
the 20th, 40th and 67th percentiles. Check the 67th percentile by hand using the formula
given in class. Which percentile does the value 1617 correspond to according to the formula
given in the lecture? Then make a boxplot of the data. Estimate the IQR from the boxplot,
and also calculate it exactly. Use R and the calculation-friendly version of the variance
formula to calculate the sample variance s^2. How large is the sample standard deviation?
Check your results using the function var in R. Check how well the range rule of thumb,
Chebyshev’s theorem for k = 1.5, and the empirical rule for the normal distribution at a
distance of 1 standard deviation apply to these data.
450, 450, 473, 507, 457, 452, 453, 1215, 1256, 1145, 1085, 1066, 1111, 1364, 1254, 1396,
1575, 1617, 1733, 2753, 3186, 3227, 3469, 1911, 2588, 2635, 2725
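The tutorial asks for R, so the following Python sketch is meant only as an independent cross-check of the numerical parts of Problem 2. It assumes the calculation-friendly variance formula is s^2 = (sum of x^2 - (sum of x)^2 / n) / (n - 1) and that the range rule of thumb is s ≈ range / 4; both are the usual textbook forms, but the versions given in class may differ. Percentiles are left out on purpose, because the result depends on the percentile formula used in the lecture. All variable names are illustrative.

```python
import statistics

# NbOCl3 concentrations copied from the problem statement
data = [450, 450, 473, 507, 457, 452, 453, 1215, 1256, 1145, 1085, 1066, 1111, 1364,
        1254, 1396, 1575, 1617, 1733, 2753, 3186, 3227, 3469, 1911, 2588, 2635, 2725]

n = len(data)
sum_x = sum(data)
sum_x2 = sum(x * x for x in data)

mean = sum_x / n
median = statistics.median(data)

# Calculation-friendly (shortcut) form of the sample variance (assumed form)
s2_shortcut = (sum_x2 - sum_x ** 2 / n) / (n - 1)
s = s2_shortcut ** 0.5

# Cross-check against the library's sample variance (divides by n - 1)
s2_library = statistics.variance(data)


def share_within(k):
    """Fraction of observations within k sample standard deviations of the mean."""
    low, high = mean - k * s, mean + k * s
    return sum(low <= x <= high for x in data) / n


# Chebyshev: at least 1 - 1/k^2 of the data lie within k standard deviations
chebyshev_bound = 1 - 1 / 1.5 ** 2

# Range rule of thumb (assumed form): s is roughly range / 4
range_rule_s = (max(data) - min(data)) / 4

print("n =", n, "mean =", mean, "median =", median)
print("s^2 (shortcut) =", s2_shortcut, "s^2 (library) =", s2_library)
print("s =", s, "range-rule estimate of s =", range_rule_s)
print("share within 1.5 s =", share_within(1.5), "Chebyshev bound =", chebyshev_bound)
print("share within 1 s =", share_within(1), "(empirical rule suggests about 0.68)")
```

Comparing `share_within(1)` with 0.68 gives a rough sense of how far the data are from bell-shaped; the histogram from the first part of the problem is the natural companion to that check.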
3. a) Assume that the distribution of IQs in the population is normal. If mental retardation
is defined as an IQ less than 70, what proportion of the population would we expect
to have this condition? (Use the empirical rule for the normal distribution to answer this.)
b) If a genius is defined as someone with an IQ greater than 145, approximately what
proportion of the population would we expect to be geniuses? (Use the empirical rule
for the normal distribution to answer this.)
c) Calculate the z-score of the least intelligent genius.
Tutorial 4 MATH 203
d) Calculate the maximum IQ that a person in the lowest 2.5% of the population by IQ
can have. (Again, the empirical rule for the normal distribution might help.)
e) Calculate the IQ of a person whose z-score is -2. (Use the formula from the slides.)
4. You are writing a research paper on grandparents who had one or more of their
grandchildren living with them. In 2000, 2.4 million grandparents were defined as caregivers
by the U.S. Census, meaning that they had primary responsibility for raising their
grandchildren below the age of 18. You discover the following information from the U.S.
Census report “Grandparents Living With Grandchildren: 2000” (C2KBR-31, October 2003):
among grandparent caregivers, 12% cared for a grandchild for less than 6 months, 11% for 6
to 11 months, 23% for at least 1 and less than 2 years, 15% for at least 2 and less than 4
years, and 39% for 4 or more years.
1) Construct a graph or chart that best displays this information on how long
grandparents care for their grandchildren.
2) Explain why the graph you selected is appropriate.
6. The number of Americans on Medicare is increasing, as expected with the aging baby boomer
population. Figure 3 shows the number of Americans on Medicare in 2000 and 2005 for
eight U.S. states (Medicare is the federal health insurance program for people aged 65
and over). Note: The numbers listed are in thousands.
a) Calculate the mean number of Americans on Medicare in these eight states for both
2000 and 2005. How would you characterize the difference in the number of Americans
on Medicare between 2000 and 2005? Does the mean adequately represent the central
tendency of the distribution of Americans on Medicare in each year for these eight
states? Why or why not?
b) Recalculate the mean for each year after removing Florida, Illinois, and New York
from the table. Is the mean now a better representation of central tendency for the
remaining five states? Explain.
c) We now want to test whether the distribution of Americans on Medicare is symmetrical
or skewed. Calculate the median and mode for each year, using all eight states.
Based on these results and the means, how would you characterize the distribution of
Americans on Medicare for each year?
d) Does the mean or median best represent the central tendency of each distribution?
Why?
e) If you found the distributions to be skewed, what might be the statistical cause?
7. U.S. households have become smaller over the years. Figure 4 contains information on the
number of people currently aged 18 years or older living in a respondent’s household. The
data on U.S. household size come from GSS 2008. Using these data, construct a histogram
to represent the distribution of household size.
a) From the appearance of the histogram, would you say the distribution is positively or
negatively skewed? Why?
b) Now calculate the median and the mean for the distribution and compare them. Do
these numbers provide further evidence to support your decision about how the
distribution is skewed?
c) Why do you think the distribution of household size is asymmetrical?
8. The mean time it takes a group of students to complete a statistics final exam is 44 minutes,
and the standard deviation is 9 minutes. Within what limits would you expect
approximately 95% of the students to complete the exam? Assume the variable is
approximately normally distributed.
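As a quick arithmetic check for Problem 8, the sketch below assumes the empirical rule is applied in its usual form, with about 95% of values lying within 2 standard deviations of the mean. The variable names are illustrative.

```python
# Values from the problem statement
mean_minutes = 44
sd_minutes = 9

# Empirical rule (assumed form): about 95% of values lie within 2 standard deviations
lower_limit = mean_minutes - 2 * sd_minutes
upper_limit = mean_minutes + 2 * sd_minutes

print("About 95% of students finish between", lower_limit, "and", upper_limit, "minutes")
```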
9. The reported high temperatures for 23 cities in the United States in October are shown.
Find the z-scores for temperatures of 80 degrees and 50 degrees.
62 72 66 79 83 61 62 85 72 64 74 71 42 38 91 66 77 90 74 63 64 68 42
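One way to check the z-scores in Problem 9 is sketched below. It assumes the z-score is computed as (x - sample mean) / sample standard deviation, with the standard deviation using the n - 1 divisor; if the slides use the population formula, the values will differ slightly. The helper `z_score` is an illustrative name.

```python
import statistics

# High temperatures copied from the problem statement
temps = [62, 72, 66, 79, 83, 61, 62, 85, 72, 64, 74, 71,
         42, 38, 91, 66, 77, 90, 74, 63, 64, 68, 42]

mean_temp = statistics.mean(temps)
sd_temp = statistics.stdev(temps)  # sample standard deviation, n - 1 divisor (assumption)


def z_score(x):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean_temp) / sd_temp


z_80 = z_score(80)
z_50 = z_score(50)

print("mean =", round(mean_temp, 2), "sd =", round(sd_temp, 2))
print("z(80) =", round(z_80, 2), "z(50) =", round(z_50, 2))
```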
10. If the average number of textbooks in professors’ offices is 16, the standard deviation is 5,
and the average age of the professors is 43, with a standard deviation of 8, which data set
is more variable?
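Because textbooks and ages are measured in different units, Problem 10 is usually approached with the coefficient of variation, CV = standard deviation / mean. The sketch below assumes that this is the intended comparison; the variable names are illustrative.

```python
# Values from the problem statement
books_mean, books_sd = 16, 5
age_mean, age_sd = 43, 8

# Coefficient of variation: standard deviation relative to the mean (unit-free)
cv_books = books_sd / books_mean
cv_age = age_sd / age_mean

more_variable = "textbooks" if cv_books > cv_age else "ages"
print("CV textbooks =", round(100 * cv_books, 1), "%")
print("CV ages =", round(100 * cv_age, 1), "%")
print("More variable relative to its mean:", more_variable)
```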
11. The average labor charge for automobile mechanics is $54 per hour. The standard deviation
is $4. Find the minimum percentage of data values that will fall within the range of $48 to
$60. Use Chebyshev’s theorem.
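The arithmetic for Problem 11 can be sketched as follows, using Chebyshev’s theorem in its standard form: at least 1 - 1/k^2 of the values lie within k standard deviations of the mean. The variable names are illustrative.

```python
# Values from the problem statement
mean_charge = 54
sd_charge = 4
lower, upper = 48, 60

# The interval is symmetric about the mean, so k is its half-width in standard deviations
k = (upper - mean_charge) / sd_charge

# Chebyshev's theorem: at least 1 - 1/k^2 of the values lie within k standard deviations
min_share = 1 - 1 / k ** 2

print("k =", k)
print("At least", round(100 * min_share, 1), "% of values lie between", lower, "and", upper)
```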
12. A student scored 76 on a general science test where the class mean and standard deviation
were 82 and 8, respectively; he also scored 53 on a psychology test where the class mean
and standard deviation were 58 and 3, respectively. In which class was his relative position
higher?
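Relative position in Problem 12 can be compared through z-scores, z = (score - class mean) / class standard deviation. The sketch below is one way to check the comparison; the names are illustrative.

```python
def z_score(score, mean, sd):
    """Standardized score: distance from the class mean in standard deviations."""
    return (score - mean) / sd


# Values from the problem statement
z_science = z_score(76, 82, 8)
z_psychology = z_score(53, 58, 3)

# The larger z-score marks the higher relative position within the class
better_class = "general science" if z_science > z_psychology else "psychology"

print("z (general science) =", round(z_science, 2))
print("z (psychology) =", round(z_psychology, 2))
print("Higher relative position:", better_class)
```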
13. On a philosophy comprehensive exam, this distribution was obtained from 25 students:
Score          Frequency
40.5 - 45.5        3
45.5 - 50.5        8
50.5 - 55.5       10
55.5 - 60.5        3
60.5 - 65.5        1
Figure 3: Size of households (people that live in a single house) in the US (2008)