Tutorial 3 – Solutions

Measures of central tendency and spread, sampling variation & standard error
1. Define what is meant by the following statistical terms.
a. Mode

The value that appears most frequently in a data set is called the mode.

b. Inter-quartile range (IQR)

The IQR is a measure of spread defined as the distance between the lower and upper
quartiles (i.e., IQR = upper quartile – lower quartile).

c. Standard deviation

Standard deviation measures the spread of the distribution around the mean. It is defined
as the square root of the variance and is measured on the same scale as the data.

d. Standard error

Standard error is the standard deviation of the sampling distribution of a statistic (here,
the sample mean). It measures how precisely a sample mean estimates the population mean.
The size of the standard error is determined by the population variance and the sample size.
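
The four definitions above can be illustrated with a minimal Python sketch. The data values and the helper name `summarise` are invented for illustration and do not come from the tutorial. Quartile conventions differ between software packages, so the IQR may differ slightly from a hand calculation.

```python
import math
import statistics


def summarise(values):
    """Return the mode, IQR, sample SD and SE of the mean for a list of numbers."""
    # statistics.quantiles with n=4 returns the three quartile cut points
    q1, _median, q3 = statistics.quantiles(values, n=4)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
    return {
        "mode": statistics.mode(values),     # most frequent value
        "iqr": q3 - q1,                      # upper quartile - lower quartile
        "sd": sd,                            # square root of the sample variance
        "se": sd / math.sqrt(len(values)),   # estimated SE of the sample mean
    }


# Hypothetical data, invented for illustration only
example = [2, 3, 3, 4, 5, 6, 7, 8, 9]
print(summarise(example))
```

Because the standard error divides the standard deviation by the square root of n, it is smaller than the standard deviation whenever n is greater than 1.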

2. Serum triglyceride. Serum triglyceride was measured from the cord blood of 9 babies. The levels
ranged from 0.11 to 0.53 mmol/L and are given in the table below.
a. Calculate the sample median serum triglyceride level. Is it different from the sample
mean level?

Position of sample median = (n + 1)/2 = 10/2 = 5, i.e., the 5th value when the 9
observations are placed in ascending order.

Sample median = 0.21 mmol/L

The distribution of serum triglyceride level may be positively skewed.
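
The median-position rule and the comparison of mean with median can be sketched in Python as follows. The values are invented for illustration and are NOT the serum triglyceride measurements. The helper name `median_position` is also hypothetical.

```python
import statistics


def median_position(n):
    """Position of the median in the ordered sample: (n + 1) / 2."""
    return (n + 1) / 2


# Hypothetical right-skewed values, invented for illustration only;
# these are NOT the serum triglyceride data from the tutorial.
values = sorted([1, 2, 2, 3, 4, 5, 6, 9, 13])

pos = median_position(len(values))   # n = 9 gives position 5
median = values[int(pos) - 1]        # 5th ordered value (valid because n is odd)
mean = statistics.mean(values)

print(f"position = {pos}, median = {median}, mean = {mean}")
# A mean noticeably larger than the median suggests positive (right) skew.
```

In this made-up sample a few large values pull the mean above the median, which is the pattern that suggests positive skew.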


3. Suppose measurements of the body mass index (BMI; kg/m2) are recorded from a sample of size
n drawn at random from the population of men aged between 40 and 69 living in Melbourne in
2004 as part of a study of the relationship between blood pressure and BMI in lean populations.

a. What happens to the size of the standard error of the sample mean as the sample size n
is increased from n = 100 to n = 10,000?

For a sample size of n = 100, the standard error equals:

SE = σ/√n = σ/√100 = σ/10 = 0.1 × σ

For a sample size of n = 10,000, SE = σ/√10,000 = σ/100 = 0.01 × σ

Thus, the standard error decreases in size as the sample size increases from 100 to 10,000.
In fact, the standard error for a sample size of 100 is ten times greater than the standard
error for a sample size of 10,000.
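
As a quick check of this arithmetic, the sketch below evaluates SE = σ/√n for both sample sizes. It assumes σ = 1, so the results read as multiples of the unknown population standard deviation. The function name `standard_error` is illustrative.

```python
import math


def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)


sigma = 1.0  # assumption: work in units of sigma, since its value is not given

se_100 = standard_error(sigma, 100)
se_10000 = standard_error(sigma, 10_000)

print(f"SE (n = 100)    = {se_100:.2f} x sigma")
print(f"SE (n = 10,000) = {se_10000:.2f} x sigma")
print(f"ratio           = {se_100 / se_10000:.1f}")
```

A 100-fold increase in sample size reduces the standard error by a factor of √100 = 10, not by a factor of 100.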

b. The sample mean may be used as an estimate of the population BMI in men aged
between 40 and 69. What are the implications of increasing the sample size n on the
accuracy of our estimate?

For very large samples the sample mean provides a very accurate estimate of the population
mean. If we take the sample mean as an estimate of the population mean value of BMI in
men aged between 40 and 69, then a larger sample size n implies decreasing variability in
the sample mean and hence a more reliable estimate of the population mean.
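
The decreasing variability of the sample mean can also be seen by simulation. The sketch below is illustrative only. It assumes a normally distributed population with a made-up mean of 25 and standard deviation of 4, which are not estimates for the Melbourne population. It repeatedly draws samples of a given size and reports the standard deviation of the resulting sample means.

```python
import random
import statistics


def spread_of_sample_means(n, replicates=1000, mu=25.0, sigma=4.0, seed=1):
    """Empirical SD of the sample mean over repeated random samples of size n.

    mu and sigma are hypothetical population values chosen for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    means = []
    for _ in range(replicates):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return statistics.stdev(means)


for n in (25, 400):
    observed = spread_of_sample_means(n)
    theoretical = 4.0 / n ** 0.5  # sigma / sqrt(n)
    print(f"n = {n:>3}: observed {observed:.3f}, theoretical {theoretical:.3f}")
```

The observed spread of the sample means should track σ/√n closely, shrinking as n grows.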
4. In a report investigating the association between diet and the risk of postmenopausal breast
cancer (Giles G et al. Int J Cancer, 2006), the mean body mass index (BMI; kg/m2) of 13,171
women was 26.7 kg/m2 and the standard deviation was 4.5 kg/m2.

a. Calculate and describe in words the standard error.

The standard error measures how precisely a sample mean estimates the population mean.
In this breast cancer study, the standard error equals SD/√n = 4.5/√13,171 ≈ 0.04 kg/m2.
This is very small, so the sample mean of 26.7 kg/m2 estimates the unknown population mean
with high precision.
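
The standard error can be checked from the reported standard deviation and sample size. This is a minimal sketch using only the figures quoted in the question. The variable names are illustrative.

```python
import math

n = 13_171   # number of women in the study
sd = 4.5     # reported standard deviation of BMI (kg/m2)

# Estimated standard error of the sample mean: SD / sqrt(n)
se = sd / math.sqrt(n)

print(f"SE = {se:.4f} kg/m2 (rounds to {se:.2f})")
```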

b. Suppose that the BMI of one of the women was entered into the database incorrectly
(incorrect value 21.1 kg/m2; correct value 20.7 kg/m2). Using the correct value in the
database, would the above published values of the mean and standard deviation change
at all?

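Changing one data point always changes the mean, and almost always changes the standard
deviation as well. In this example, however, there are 13,171 women and the correction is
only 0.4 kg/m2, so the mean shifts by about 0.4/13,171 ≈ 0.00003 kg/m2 and the standard
deviation changes by a similarly small amount. The published values, reported to one
decimal place, would almost certainly stay the same.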
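
The size of the change can be estimated from the summary statistics alone. The sketch below assumes the published mean and standard deviation are exact, although they are rounded to one decimal place, so the output only indicates the order of magnitude. The function name `corrected_summary` is hypothetical.

```python
import math


def corrected_summary(n, mean, sd, old, new):
    """Mean and sample SD after replacing one observation `old` with `new`."""
    total = n * mean                          # sum of all observations
    sum_sq = (n - 1) * sd ** 2 + n * mean ** 2  # sum of squared observations
    total_new = total - old + new
    sum_sq_new = sum_sq - old ** 2 + new ** 2
    mean_new = total_new / n
    var_new = (sum_sq_new - total_new ** 2 / n) / (n - 1)
    return mean_new, math.sqrt(var_new)


n = 13_171
mean_pub, sd_pub = 26.7, 4.5   # published values, treated as exact (assumption)
wrong, right = 21.1, 20.7      # incorrect and correct BMI for one woman

new_mean, new_sd = corrected_summary(n, mean_pub, sd_pub, wrong, right)

print(f"mean: {mean_pub} -> {new_mean:.5f}")
print(f"SD:   {sd_pub} -> {new_sd:.5f}")
```

Both changes appear only in the fifth decimal place, far below the one-decimal precision of the published figures.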

5. State which of the following are TRUE or FALSE.

a. The standard error of the sample mean:


i. measures the variability of the observations; (FALSE)
ii. is an estimate of how far the sample mean is likely to be from the population
mean; (TRUE)
iii. is greater than the estimated standard deviation. (FALSE)

b. If the size of a random sample were increased, we would expect:


i. the sample mean to decrease; (FALSE)
ii. the sample standard deviation to decrease; (FALSE)
iii. the standard error of the sample mean to decrease. (TRUE)

c. The maximum volume of air that can be breathed out in 1 second (measured using a
spirometer) is called the Forced Expiratory Volume in 1 second (denoted as FEV1). It is
necessary to estimate the mean FEV1 by drawing a sample from a large population. The
accuracy of the estimate will depend on:
i. the mean FEV1 in the population; (FALSE)
ii. the number in the population; (FALSE)
iii. the number in the sample; (TRUE)
iv. the variance of FEV1 in the population; (TRUE)
v. the way the sample is selected. (TRUE)
