
Descriptive Statistics Quiz

Description
This quiz will cover the material from the Descriptive Statistics Unit. The problems are based on practice
problems from the textbook. Note, however, that some modifications have been made to those questions, so
please read the quiz problems carefully!

Instructions
This quiz is not timed. However, once you begin you must work it to completion. You have 2 attempts and
your final attempt will be graded.

Health Promotion

A man runs 1 mile approximately once per weekend. He records his time over an 18-week period. The
individual times and summary statistics are given in the table below.

Week   Time (min) (x_i)     Week   Time (min) (x_i)
1      12.80                10     11.57
2      12.20                11     11.73
3      12.25                12     12.67
4      12.18                13     11.92
5      11.53                14     11.67
6      12.47                15     11.80
7      12.30                16     12.33
8      12.08                17     12.55
9      11.72                18     11.83

# Data

# input the time data as a numeric vector
time.data <- c(12.8, 12.2, 12.25, 12.18, 11.53, 12.47,
    12.3, 12.08, 11.72, 11.57, 11.73, 12.67, 11.92,
    11.67, 11.8, 12.33, 12.55, 11.83)

Problem 1 (2 pts)

What is the (arithmetic) mean 1-mile running time over the 18 weeks? Round your answer to 3 decimal
places.
Answer (numeric):
The mean 1-mile running time over the 18 weeks is 12.089 minutes.
Supporting Work:

# calculate the arithmetic mean, rounding to 3
# decimal places
round(mean(time.data), 3)

## [1] 12.089
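As a cross-check, the same mean can be computed directly from its definition (sum of the observations divided by their count). This is only a sketch; the names n, total, and manual_mean are introduced here for illustration and are not part of the quiz.

```r
# running times (minutes) copied from the table above
time.data <- c(12.8, 12.2, 12.25, 12.18, 11.53, 12.47,
    12.3, 12.08, 11.72, 11.57, 11.73, 12.67, 11.92,
    11.67, 11.8, 12.33, 12.55, 11.83)

n <- length(time.data)        # number of weeks
total <- sum(time.data)       # sum of all recorded times
manual_mean <- total / n      # definition of the arithmetic mean

# should agree with round(mean(time.data), 3)
round(manual_mean, 3)
```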

Problem 2 (2 pts)

What is the standard deviation of the 1-mile running time over the 18 weeks? Round your answer to 3
decimal places.
Answer (numeric):
The standard deviation of the 1-mile running time over the 18 weeks is 0.387 minutes.
Supporting Work:
# calculate the sd, rounding to 3 decimal places
round(sd(time.data), 3)

## [1] 0.387
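R's sd() returns the sample standard deviation, which divides the sum of squared deviations by n - 1 rather than n. The sketch below reproduces the value from that definition; the names deviations and manual_sd are illustrative and not from the quiz.

```r
# running times (minutes) copied from the table above
time.data <- c(12.8, 12.2, 12.25, 12.18, 11.53, 12.47,
    12.3, 12.08, 11.72, 11.57, 11.73, 12.67, 11.92,
    11.67, 11.8, 12.33, 12.55, 11.83)

n <- length(time.data)
deviations <- time.data - mean(time.data)   # distance of each time from the mean

# sample standard deviation: square root of the
# sum of squared deviations divided by (n - 1)
manual_sd <- sqrt(sum(deviations^2) / (n - 1))

# should agree with round(sd(time.data), 3)
round(manual_sd, 3)
```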

Problem 3 (1 pt)

Suppose we construct a new variable called time_100 = 100 × time (e.g., for week 1, time_100 = 1280).
Select the correct mean/standard deviation of time_100. Answers are rounded to 1 decimal place.
A. 120.89/3.87
B. 1208.9/387.4
C. 1208.9/3.87
D. 1208.9/38.7
E. None of the above.
Answer (multiple choice):
The new variable has a mean of 1208.9 and a standard deviation of 38.7.
Supporting Work:
# adjust time data by multiplying by 100
time_100 <- time.data * 100

# calculate the arithmetic mean, rounding to 1
# decimal place
round(mean(time_100), 1)

## [1] 1208.9
# calculate the standard deviation, rounding to 1
# decimal place
round(sd(time_100), 1)

## [1] 38.7
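The result follows from a general property: multiplying every observation by a positive constant multiplies both the mean and the standard deviation by that same constant. A minimal sketch of that check, where scale_factor is a name introduced here for illustration:

```r
# running times (minutes) copied from the table above
time.data <- c(12.8, 12.2, 12.25, 12.18, 11.53, 12.47,
    12.3, 12.08, 11.72, 11.57, 11.73, 12.67, 11.92,
    11.67, 11.8, 12.33, 12.55, 11.83)

scale_factor <- 100
time_100 <- scale_factor * time.data

# the mean of the rescaled data equals the rescaled mean
all.equal(mean(time_100), scale_factor * mean(time.data))

# the sd of the rescaled data equals the rescaled sd
# (in general the sd scales by the absolute value of the constant)
all.equal(sd(time_100), scale_factor * sd(time.data))
```

Both comparisons should return TRUE, so the answer can also be found by multiplying the Problem 1 and Problem 2 results by 100 without recomputing from the raw data.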

Problem 4 (1 pt)

Suppose the man does not run for 6 months over the winter due to snow on the ground. He resumes running
once a week in the spring and records a running time of 12.97 minutes in his first week of running in the
spring.
True or False: This is an outlying value relative to the distribution of running times recorded the previous
year (as given in the table above).
Answer (true/false):
False. The running time of 12.97 minutes is not considered an outlier, as it is neither less than
10.83 (Q1 - 1.5 × IQR) nor greater than 13.23 (Q3 + 1.5 × IQR).
Supporting Work:
# sort the time data in ascending order
sort(time.data)

## [1] 11.53 11.57 11.67 11.72 11.73 11.80 11.83 11.92 12.08 12.18 12.20
## [12] 12.25 12.30 12.33 12.47 12.55 12.67 12.80
# check the bounds of the time data for outliers
# calculate quartiles
Q1 <- fivenum(time.data)[2]
Q3 <- fivenum(time.data)[4]
iqr <- IQR(time.data, type = 2)

# bound for lower outliers (anything less is an
# outlier)
Q1 - 1.5 * iqr

## [1] 10.83
# bound for upper outliers (anything more is an
# outlier)
Q3 + 1.5 * iqr

## [1] 13.23
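The comparison of the new time against the two bounds can also be made explicit in code. This is a sketch under the same quartile convention used above (fivenum() hinges); the names lower_fence, upper_fence, new_time, and is_outlier are illustrative. Other quantile definitions would shift the bounds slightly.

```r
# running times (minutes) copied from the table above
time.data <- c(12.8, 12.2, 12.25, 12.18, 11.53, 12.47,
    12.3, 12.08, 11.72, 11.57, 11.73, 12.67, 11.92,
    11.67, 11.8, 12.33, 12.55, 11.83)

# hinge-based quartiles, as in the supporting work
q <- fivenum(time.data)
Q1 <- q[2]
Q3 <- q[4]
iqr <- Q3 - Q1

# 1.5 * IQR bounds
lower_fence <- Q1 - 1.5 * iqr
upper_fence <- Q3 + 1.5 * iqr

# first spring running time from the problem statement
new_time <- 12.97

# TRUE only if the value falls outside either bound
is_outlier <- new_time < lower_fence || new_time > upper_fence
is_outlier
```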

Microbiology

A study was conducted to demonstrate that soybeans inoculated with nitrogen-fixing bacteria yield more
and grow adequately without expensive, environmentally deleterious synthetic fertilizers. The trial was
conducted under controlled conditions with uniform amounts of soil. The initial hypothesis was that inoculated
plants would outperform their uninoculated counterparts. This hypothesis is based on two facts: plants
need nitrogen to manufacture vital proteins and amino acids, and nitrogen-fixing bacteria make
more nitrogen available to plants, increasing their size and yield. There were 8 inoculated plants (I)
and 8 uninoculated plants (U). The plant yield, measured as pod weight (grams), for each plant is given in
the table below.

Sample Number   I (g)   U (g)
1               1.76    0.49
2               1.45    0.85
3               1.03    1.00
4               1.53    1.54
5               2.34    1.01
6               1.96    0.75
7               1.79    2.11
8               1.21    0.92

# Data

# input the uninoculated (U) pod weights (grams) as a
# numeric vector
u_grams <- c(0.49, 0.85, 1, 1.54, 1.01, 0.75, 2.11,
    0.92)

Problem 5 (2 pts)

Compute the standard deviation of the pod weight in ounces for the uninoculated (U) plants. Round your
answer to 3 decimal places.
Note: Use the conversion 1 ounce = 28.349 grams.
Answer (numeric):
The standard deviation of the pod weight for the uninoculated plants is 0.018 ounces.
Supporting Work:
# convert to ounces
u_oz <- u_grams/28.349

# calculate the standard deviation in ounces, rounded
# to 3 decimal places
round(sd(u_oz), 3)

## [1] 0.018
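Because a unit conversion is division by a positive constant, the order of operations should not matter: converting each weight and then taking the standard deviation gives the same value as taking the standard deviation in grams and converting the result. A brief sketch of that check, where g_per_oz and sd_oz_alt are names introduced here for illustration:

```r
# uninoculated pod weights (grams) copied from the table above
u_grams <- c(0.49, 0.85, 1, 1.54, 1.01, 0.75, 2.11,
    0.92)

g_per_oz <- 28.349                    # conversion given in the problem

u_oz <- u_grams / g_per_oz            # convert first, then summarize
sd_oz_alt <- sd(u_grams) / g_per_oz   # summarize first, then convert

# the two routes should agree up to floating-point error
all.equal(sd(u_oz), sd_oz_alt)

round(sd_oz_alt, 3)
```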

Problem 6 (2 pts)

Parallel boxplots have been created below to compare the pod weights (in ounces) of the two groups. Which
of the following statements do you know to be true based on the boxplots alone? (Select all that are
true.)

• The median pod weight of the inoculated plants is greater than the median pod weight of the
uninoculated plants. (True)
• The geometric mean of the pod weights of the inoculated plants is less than the geometric mean of
the pod weights of the uninoculated plants. (False)
• The IQR of the pod weights of the inoculated plants is about the same as the IQR of the pod weights
of the uninoculated plants. (True)
• The third quartile of the pod weights of the inoculated plants is greater than the third quartile of the
pod weights of the uninoculated plants. (True)
• The arithmetic mean of the pod weights of the inoculated and uninoculated plants is the same. (False)
• The standard deviation of the pod weights of the inoculated plants is greater than the standard
deviation of the pod weights of the uninoculated plants. (False)
• Including the outlier, the range of the pod weights of the inoculated plants is less than the range of
the pod weights of the uninoculated plants. (True)
If you are interested in the code used to generate the parallel boxplots using the raw data, see the R code
included below:
# constructing the boxplots
library(lattice)  # provides bwplot()

i_grams <- c(1.76, 1.45, 1.03, 1.53, 2.34, 1.96, 1.79,
    1.21)
u_grams <- c(0.49, 0.85, 1, 1.54, 1.01, 0.75, 2.11,
    0.92)

# one row per plant: pod weight converted to ounces
# plus a group label
mydf <- data.frame(PodWeight_oz = c(i_grams, u_grams)/28.349,
    PlantType = factor(rep(c("Inoculated", "Uninoculated"),
        each = 8)))

bwplot(PlantType ~ PodWeight_oz, data = mydf, xlab = "Pod Weight (ounces)",
    main = "Pod weight from I and U plants")

[Figure: parallel boxplots titled "Pod weight from I and U plants". Groups: Uninoculated, Inoculated. Horizontal axis: Pod Weight (ounces), with ticks from 0.02 to 0.08.]
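The statements in Problem 6 that concern medians, quartiles, IQRs, and ranges can be cross-checked against the raw data. The sketch below assumes the boxes are drawn from fivenum() hinges; box_summary is a hypothetical helper introduced here, not part of the quiz. Means and standard deviations are left out on purpose, since a boxplot does not display them.

```r
# pod weights (grams) copied from the table, converted to ounces
i_oz <- c(1.76, 1.45, 1.03, 1.53, 2.34, 1.96, 1.79,
    1.21) / 28.349
u_oz <- c(0.49, 0.85, 1, 1.54, 1.01, 0.75, 2.11,
    0.92) / 28.349

# hypothetical helper: the quantities a boxplot displays
box_summary <- function(x) {
    f <- fivenum(x)   # min, lower hinge, median, upper hinge, max
    c(median = f[3], Q1 = f[2], Q3 = f[4],
      IQR = f[4] - f[2], range = f[5] - f[1])
}

# one row per group, for a side-by-side comparison
rbind(Inoculated = box_summary(i_oz),
      Uninoculated = box_summary(u_oz))
```

Reading across the two rows shows which group has the larger median, third quartile, and range, and how close the two IQRs are.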
