MATH 203, Fall 2019

Tutorial 4
Problem set

1. Think about and answer the following questions.

a) Will the sample mean always correspond to one of the observations in the sample?
b) Will exactly half of the observations in a sample fall below the mean?
c) For any set of data values, is it possible for the sample standard deviation to be larger
than the sample mean? If so, give an example.
d) Can the sample standard deviation be equal to zero? If so, give an example.

2. In their book Introduction to Linear Regression Analysis (5th edition, Wiley, 2012),
Montgomery, Peck, and Vining presented measurements of NbOCl3 concentration from a
tube-flow reactor experiment. The data, in gram-moles per liter $\times 10^{-3}$, are
listed after the tasks below.

a) Construct a histogram of the data in R.
b) Compute the sample mean, the sample median, and the 20th, 40th, and 67th percentiles.
Check the 67th percentile by hand using the formula given in class.
c) Which percentile does the value 1617 correspond to, using the formula given in the
lecture?
d) Make a boxplot of the data. Estimate the IQR from the boxplot, then calculate it
exactly.
e) Use R and the calculation-friendly version of the variance formula to calculate the
sample variance $s^2$. How large is the sample standard deviation? Check your results
using the function var in R.
f) Check how well the range rule of thumb, Chebyshev's theorem for k = 1.5, and the
empirical rule for the normal distribution within 1 standard deviation apply to these
data.

450, 450, 473, 507, 457, 452, 453, 1215, 1256, 1145, 1085, 1066, 1111, 1364, 1254, 1396,
1575, 1617, 1733, 2753, 3186, 3227, 3469, 1911, 2588, 2635, 2725
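
One way to cross-check the numerical parts of this problem is sketched below. The tutorial asks for R; this Python version is only an illustration. It assumes that the "calculation-friendly" variance formula is the usual shortcut form s^2 = (sum of x^2 - (sum of x)^2 / n) / (n - 1), and that the range rule of thumb estimates s as range / 4. Both are assumptions about the lecture notes, and the percentile questions are left out because they depend on the formula given in class.

```python
import statistics

# NbOCl3 concentrations from problem 2 (gram-moles per liter x 10^-3)
data = [450, 450, 473, 507, 457, 452, 453, 1215, 1256, 1145, 1085, 1066,
        1111, 1364, 1254, 1396, 1575, 1617, 1733, 2753, 3186, 3227, 3469,
        1911, 2588, 2635, 2725]

n = len(data)
mean = sum(data) / n
median = statistics.median(data)

# Shortcut variance (assumed to be the lecture's "calculation-friendly" form)
sum_x = sum(data)
sum_x2 = sum(x * x for x in data)
s2_shortcut = (sum_x2 - sum_x ** 2 / n) / (n - 1)
s = s2_shortcut ** 0.5


def share_within(k):
    """Fraction of observations within k sample standard deviations of the mean."""
    return sum(1 for x in data if abs(x - mean) <= k * s) / n


# Range rule of thumb (assumed form: s is roughly range / 4)
range_rule_s = (max(data) - min(data)) / 4

# Chebyshev's theorem: at least 1 - 1/k^2 of the data lie within k SDs
chebyshev_bound = 1 - 1 / 1.5 ** 2

print("n =", n, "mean =", mean, "median =", median)
print("s^2 (shortcut) =", s2_shortcut, "library variance =", statistics.variance(data))
print("s =", s, "range-rule estimate =", range_rule_s)
print("share within 1.5 s =", share_within(1.5), "Chebyshev minimum =", chebyshev_bound)
print("share within 1 s =", share_within(1), "empirical rule suggests about 0.68")
```

Comparing `s2_shortcut` with `statistics.variance(data)` plays the same role as checking a hand calculation against var in R.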

3. IQ scores in the population have a mean of µ = 100 and a standard deviation (SD) of
σ = 15.

a) Assume that the distribution of IQs in the population is normal. If mental retardation
is defined as an IQ less than 70, what proportion of the population would we expect
to have this condition? (Use the empirical rule for the normal distribution.)
b) If a genius is defined as someone with an IQ greater than 145, approximately what
proportion of the population would we expect to be geniuses? (Use the empirical rule
for the normal distribution.)
c) Calculate the z-score of the least intelligent genius, that is, of the boundary IQ of 145.
d) Calculate the maximum IQ that a person in the lowest 2.5% of the population by IQ
can have. (Again, the empirical rule might help.)
e) Calculate the IQ of a person whose z-score is -2. (Use the formula from the slides.)
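
The conversions used in this problem can be sketched as follows. This is an illustration in Python, not the course's own material. It assumes the empirical rule in its common 68% / 95% / 99.7% form and the standard z-score formula z = (x - µ) / σ. The helper names `z_score`, `iq_from_z`, and `one_tail` are made up for this example.

```python
MU, SIGMA = 100, 15  # population mean and SD from problem 3

# Empirical rule (assumed form): approximate share within k SDs of the mean
WITHIN = {1: 0.68, 2: 0.95, 3: 0.997}


def z_score(x):
    """Number of standard deviations that x lies above the mean."""
    return (x - MU) / SIGMA


def iq_from_z(z):
    """Invert the z-score formula: x = mu + z * sigma."""
    return MU + z * SIGMA


def one_tail(k):
    """Approximate share lying more than k SDs from the mean on one side."""
    return (1 - WITHIN[k]) / 2


print("z for IQ 70:", z_score(70), "share below:", round(one_tail(2), 4))
print("z for IQ 145:", z_score(145), "share above:", round(one_tail(3), 4))
print("IQ at z = -2:", iq_from_z(-2))
```

Because a normal distribution is symmetric, the share outside k standard deviations is split evenly between the two tails, which is why `one_tail` halves the remainder.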

4. You are writing a research paper on grandparents who had one or more of their
grandchildren living with them. In 2000, 2.4 million grandparents were defined as caregivers
by the U.S. Census, meaning that they had primary responsibility for raising their
grandchildren below the age of 18. You discover the following information from the U.S. Census
Report, “Grandparents Living With Grandchildren: 2000” (C2KBR-31, October 2003): Among
grandparent caregivers, 12% cared for a grandchild for less than 6 months, 11% for 6 to 11
months, 23% for at least 1 and less than 2 years, 15% for at least 2 and less than 4 years,
and 39% for 4 or more years.

1) Construct a graph or chart that best displays this information on how long
grandparents care for their grandchildren.
2) Explain why the graph you selected is appropriate.

5. The frequency distribution in Figure 1 contains information about children’s attitudes
toward smoking one pack of cigarettes per day.

a) Find the mode.
b) Find the median.
c) Interpret the mode and the median. (State the value of each in a sentence that gives
it context and meaning with respect to the data set.)
d) Why would you not want to report the mean for this variable?

6. The number of Americans on Medicare is increasing as expected with the aging baby boomer
population. Figure 2 shows the number of Americans on Medicare in 2000 and 2005 for
eight U.S. states (Medicare is the federal health insurance program for 65-year-olds and
over). Note: The numbers listed are in thousands.

a) Calculate the mean number of Americans on Medicare in these eight states for both
2000 and 2005. How would you characterize the difference in the number of Americans
on Medicare between 2000 and 2005? Does the mean adequately represent the central
tendency of the distribution of Americans on Medicare in each year for these eight
states? Why or why not?
b) Recalculate the mean for each year after removing Florida, Illinois, and New York
from the table. Is the mean now a better representation of central tendency for the
remaining five states? Explain.
c) We now want to test whether the distribution of Americans on Medicare is symmetrical
or skewed. Calculate the median and mode for each year, using all eight states.
Based on these results and the means, how would you characterize the distribution of
Americans on Medicare for each year?
d) Does the mean or median best represent the central tendency of each distribution?
Why?
e) If you found the distributions to be skewed, what might be the statistical cause?

7. U.S. households have become smaller over the years. Figure 3 contains information on the
number of people currently aged 18 years or older living in a respondent’s household. It is
the data on U.S. household size from GSS 2008. Using these data, construct a histogram
to represent the distribution of household size.

a) From the appearance of the histogram, would you say the distribution is positively or
negatively skewed? Why?
b) Now calculate the median and the mean for the distribution and compare them. Do
these numbers provide further evidence to support your decision about how the
distribution is skewed?
c) Why do you think the distribution of household size is asymmetrical?

8. The mean time it takes a group of students to complete a statistics final exam is 44 minutes,
and the standard deviation is 9 minutes. Within what limits would you expect
approximately 95% of the students to complete the exam? Assume the variable is approximately
normally distributed.

9. The reported high temperatures of 23 cities of the United States in October are shown.
Find the z-scores for temperatures of 80 degrees and 50 degrees.

62 72 66 79 83 61 62 85 72 64 74 71 42 38 91 66 77 90 74 63 64 68 42
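
A minimal sketch of the z-score computation for these data follows, in Python for illustration. It assumes the sample standard deviation (divisor n - 1) is the one intended. If the course uses the population form, `statistics.pstdev` would take the place of `statistics.stdev`.

```python
import statistics

# October high temperatures for the 23 cities in problem 9
temps = [62, 72, 66, 79, 83, 61, 62, 85, 72, 64, 74, 71,
         42, 38, 91, 66, 77, 90, 74, 63, 64, 68, 42]

mean = statistics.mean(temps)
s = statistics.stdev(temps)  # sample SD, divisor n - 1 (an assumption)


def z_score(x):
    """Distance of x from the sample mean, in sample standard deviations."""
    return (x - mean) / s


for t in (80, 50):
    print(t, "degrees -> z =", round(z_score(t), 2))
```

A positive z-score marks a temperature above the sample mean and a negative one a temperature below it.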

10. If the average number of textbooks in professors’ offices is 16, the standard deviation is 5,
and the average age of the professors is 43, with a standard deviation of 8, which data set
is more variable?
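
Because the two variables are measured in different units (books and years), one common way to compare their spread is the coefficient of variation. The worked lines below assume that this is the comparison the problem intends.

```latex
\mathrm{CV} = \frac{s}{\bar{x}} \times 100\%
\qquad
\mathrm{CV}_{\text{textbooks}} = \frac{5}{16} \times 100\% = 31.25\%
\qquad
\mathrm{CV}_{\text{age}} = \frac{8}{43} \times 100\% \approx 18.6\%
```

Under this measure the data set with the larger coefficient of variation is the more variable one relative to its mean.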

11. The average labor charge for automobile mechanics is $54 per hour. The standard deviation
is $4. Find the minimum percentage of data values that will fall within the range of $48 to
$60. Use Chebyshev’s theorem.
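
As a sketch of the calculation, Chebyshev's theorem states that at least $1 - 1/k^2$ of the data lie within $k$ standard deviations of the mean, for any $k > 1$. Here the interval from $48 to $60 is symmetric around the mean of $54, so $k$ follows from the half-width:

```latex
k = \frac{60 - 54}{4} = \frac{54 - 48}{4} = 1.5
\qquad
1 - \frac{1}{k^2} = 1 - \frac{1}{1.5^2} = 1 - \frac{1}{2.25} \approx 0.556
```

So at least about 55.6% of the labor charges fall in that range, whatever the shape of the distribution.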

12. A student scored 76 on a general science test where the class mean and standard deviation
were 82 and 8, respectively; he also scored 53 on a psychology test where the class mean
and standard deviation were 58 and 3, respectively. In which class was his relative position
higher?
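
Relative position can be compared by converting each score to a z-score, $z = (x - \bar{x}) / s$. The two conversions for this problem work out as follows:

```latex
z_{\text{science}} = \frac{76 - 82}{8} = -0.75
\qquad
z_{\text{psychology}} = \frac{53 - 58}{3} \approx -1.67
```

Both scores are below their class means. The one with the larger (less negative) z-score sits relatively higher in its class.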

13. On a philosophy comprehensive exam, the following distribution was obtained from 25 students:

Score          Frequency
40.5 - 45.5    3
45.5 - 50.5    8
50.5 - 55.5    10
55.5 - 60.5    3
60.5 - 65.5    1

a) Construct the percentile graph.
b) Find the values that correspond to the 22nd, 78th, and 99th percentiles.
c) Find the percentiles of the values 52, 43, and 64.
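
The readings asked for in b) and c) are normally taken by eye from the percentile graph. The Python sketch below reproduces them numerically under one assumption: that the graph joins the cumulative percentages at the class boundaries with straight lines. The function names `value_at_percentile` and `percentile_of_value` are made up for this example, and graph readings will only match these figures approximately.

```python
# Class boundaries and frequencies from the table in problem 13
boundaries = [40.5, 45.5, 50.5, 55.5, 60.5, 65.5]
freqs = [3, 8, 10, 3, 1]
n = sum(freqs)

# Cumulative percentage reached at each boundary (0% at the lowest boundary)
cum_pct = [0.0]
running = 0
for f in freqs:
    running += f
    cum_pct.append(100 * running / n)


def value_at_percentile(p):
    """Score at which the straight-line percentile graph reaches p percent."""
    for i in range(len(freqs)):
        lo, hi = cum_pct[i], cum_pct[i + 1]
        if lo <= p <= hi:
            width = boundaries[i + 1] - boundaries[i]
            return boundaries[i] + (p - lo) / (hi - lo) * width
    raise ValueError("p must lie between 0 and 100")


def percentile_of_value(x):
    """Cumulative percentage on the straight-line graph at score x."""
    for i in range(len(freqs)):
        left, right = boundaries[i], boundaries[i + 1]
        if left <= x <= right:
            rise = cum_pct[i + 1] - cum_pct[i]
            return cum_pct[i] + (x - left) / (right - left) * rise
    raise ValueError("x lies outside the class boundaries")


for p in (22, 78, 99):
    print("percentile", p, "-> score", round(value_at_percentile(p), 2))
for x in (52, 43, 64):
    print("score", x, "-> percentile", round(percentile_of_value(x), 1))
```

Plotting `cum_pct` against `boundaries` gives the percentile graph itself, so the same lists serve part a).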

Figure 1: Children and smoking (n=1455)

Figure 2: Medicare in the US 2000-2005

Figure 3: Size of households (people that live in a single house) in the US (2008)