
A summary of the material covered in this session is available below:

 We discuss three measures of centrality:


o Mean: the average value, measured in the same units as the
data. The sample mean is computed from all data in the sample
and is therefore sensitive to outliers; even a single extreme
outlier can shift the sample mean noticeably.
o Median: the middle data point; 50% of the data lies below the
median and 50% lies above it. The median is measured in the same
units as the data. The sample median is computed from the middle
data point or the middle two data points and is not sensitive to
one (or a few) outliers.
o Mode: the most frequently occurring data point. The mode is not
necessarily unique and is rarely used.
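The effect of a single outlier on the three measures can be sketched in Python with the standard `statistics` module. The sample values below are made up for illustration and are not from the session.

```python
import statistics

# Hypothetical sample (values invented for illustration).
data = [12, 15, 15, 16, 18, 19, 20, 21, 22]
# The same sample with one extreme value added.
with_outlier = data + [200]

mean_clean = statistics.mean(data)
median_clean = statistics.median(data)
# multimode returns every most-frequent value, since the mode need not be unique.
modes_clean = statistics.multimode(data)

mean_out = statistics.mean(with_outlier)
median_out = statistics.median(with_outlier)

print("without outlier: mean", round(mean_clean, 2), "median", median_clean, "modes", modes_clean)
print("with outlier:    mean", round(mean_out, 2), "median", median_out)
```

In this sketch the mean roughly doubles when the outlier is added, while the median moves only from 18 to 18.5.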
 We discuss the following measures of spread:
o Range: defined as maximum – minimum value and measured in
the same units as the data. The range describes the spread of the
entire sample, but it is computed from only two data points (the
maximum and the minimum), so it is very sensitive to outliers.
o Interquartile Range (IQR): defined as Q3 – Q1, the third quartile
minus the first quartile. The IQR describes the spread of the
middle 50% of the data and is measured in the same units as the
data. The IQR is computed from two values only (the two quartiles).
o Variance/Standard deviation (SD): the variance and the SD tell us
how close the data is, on average, to the mean. A small variance/SD
implies the data is concentrated around the mean. A large
variance/SD implies the data is spread out around the mean.
Variance is measured in units of the data squared, which makes it
difficult to interpret. The SD is the square root of the variance,
is measured in the same units as the data, and is preferred for
interpretation. To interpret the SD, we use the following guidelines:
 For a symmetric, bell-shaped distribution, approximately
68% of the data in the sample will lie within one standard
deviation of the mean
 For a symmetric, bell-shaped distribution, approximately
95% of the data in the sample will lie within two standard
deviations of the mean
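As a rough check of the measures of spread and the one- and two-standard-deviation guidelines above, the following Python sketch computes them on a simulated bell-shaped sample. The sample itself (normal draws with mean 50 and SD 10, fixed seed) is an assumption made for illustration. Note that `statistics.quantiles` uses one particular quartile convention; other software may give slightly different Q1 and Q3.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable
# Hypothetical bell-shaped sample: 10,000 draws from a normal distribution
# with mean 50 and standard deviation 10 (values chosen for illustration).
data = [random.gauss(50, 10) for _ in range(10_000)]

data_range = max(data) - min(data)            # maximum minus minimum
q1, q2, q3 = statistics.quantiles(data, n=4)  # the three quartile cut points
iqr = q3 - q1                                 # spread of the middle 50%
variance = statistics.variance(data)          # sample variance (divisor n - 1)
sd = statistics.stdev(data)                   # square root of the sample variance
mean = statistics.mean(data)


def fraction_within(values, centre, distance):
    """Return the fraction of values lying within `distance` of `centre`."""
    inside = sum(1 for v in values if abs(v - centre) <= distance)
    return inside / len(values)


within_1sd = fraction_within(data, mean, sd)
within_2sd = fraction_within(data, mean, 2 * sd)

print("range:", round(data_range, 2), "IQR:", round(iqr, 2))
print("variance:", round(variance, 2), "SD:", round(sd, 2))
print("within 1 SD:", within_1sd, "within 2 SD:", within_2sd)
```

For a skewed sample the two fractions would generally not match the guidelines, which is why they are stated only for symmetric, bell-shaped data.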
 For symmetric, bell-shaped data: we use the mean and standard deviation as
measures of centre and spread.
 For skewed data: we use the median and the IQR as measures of centre and
spread.
 Standard error: the standard error of the sample mean tells us how precise the
sample mean ($\bar{x}$) is as an estimate of the unknown population mean ($\mu$).
A small standard error of the sample mean implies that the sample mean is
likely to be close to the unknown population mean (i.e., our estimate is
precise). A large standard error of the sample mean implies that the sample
mean is possibly far away from the unknown population mean (i.e., our estimate
is not precise).
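 The standard error of the sample mean depends on the size of the sample
and the standard deviation. We know this from the formula for the standard
error of the sample mean: $s/\sqrt{n}$, where $s$ is the sample standard
deviation and $n$ is the sample size. A larger sample or a smaller standard
deviation therefore gives a smaller standard error.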
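The dependence of the standard error on sample size can be illustrated with a short Python sketch. The helper name `standard_error` and both samples are hypothetical, chosen only to show the formula at work.

```python
import math
import statistics


def standard_error(sample):
    """Standard error of the sample mean: s / sqrt(n)."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (divisor n - 1)
    return s / math.sqrt(n)


# Hypothetical samples: the second repeats the first four times, so it has
# a similar spread but four times as many observations.
small_sample = [4.0, 6.0, 5.0, 7.0]
large_sample = small_sample * 4

se_small = standard_error(small_sample)
se_large = standard_error(large_sample)

print("n =", len(small_sample), "standard error:", round(se_small, 3))
print("n =", len(large_sample), "standard error:", round(se_large, 3))
```

With a similar standard deviation, the larger sample yields a smaller standard error, in line with the formula.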
