
Working with one variable data
Measures of Central Tendency
In statistics, the three most commonly
used measures of central tendency are:
Mean
Median
Mode
Each measure has its own advantages and
disadvantages for a given set of data.
Mean
Most commonly referred to as the average.
To find the mean, add up all of the
numbers in your list and divide by the
number of numbers.
Works well when the data are fairly close
together.
The most commonly used measure of central tendency.
Mean
In statistics, it is important to
distinguish between the mean of a
population and the mean of a
sample of that population.
Mean - Population
The Greek letter mu, $\mu$, represents a population mean.

$$\mu = \frac{x_1 + x_2 + \cdots + x_N}{N} = \frac{\sum x}{N}$$

$\sum x$ is the sum of all values of $x$ in the population.
$N$ is the number of values in the entire population.
Mean - Sample
$\bar{x}$, read as "x-bar", represents a sample mean.

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{\sum x}{n}$$

$\sum x$ is the sum of all values of $x$ in the sample.
$n$ is the number of values in the sample.
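
As a minimal sketch of the mean formulas, the Python snippet below sums the values and divides by their count. The function name `mean` and the data are made up for illustration. The arithmetic is the same for a population and for a sample; only the interpretation of the result differs.

```python
def mean(values):
    """Sum of the values divided by the number of values."""
    if not values:
        raise ValueError("mean of an empty data set is undefined")
    return sum(values) / len(values)

# Hypothetical data: treat the full list as the population
# and its first three values as a sample drawn from it.
population = [4, 8, 9, 2, 7]
sample = population[:3]

mu = mean(population)   # population mean, N = 5
x_bar = mean(sample)    # sample mean, n = 3
print(mu, x_bar)        # 6.0 7.0
```

The sample mean (7.0) differs from the population mean (6.0) here, which is why the two are given different symbols.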
Median
 The median is the middle entry in an
ordered list. There are as many data points
above it as below it.
When there is an even number of values,
the median is the midpoint between the
two middle values.
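
The two cases (odd and even number of values) can be sketched in Python as follows. The function name `median` and both data sets are made up for illustration.

```python
def median(values):
    """Middle entry of the ordered list, or the midpoint of the two
    middle entries when the number of values is even."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

odd_count = [7, 3, 9, 1, 5]        # made-up data, 5 values
even_count = [7, 3, 9, 1, 5, 11]   # made-up data, 6 values
print(median(odd_count))    # 5
print(median(even_count))   # 6.0
```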
Mode
The mode is the most frequent value in a data set.
A data set can have no mode, one mode, or more than one mode.
Good when the value itself is the most important information (e.g. shoe size).
The only choice for categorical data.
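
A small Python sketch of finding the mode, including the "no mode" and "more than one mode" cases. The helper name `modes` and the data are hypothetical. The rule that a data set has no mode when every value occurs equally often is an assumption made for this sketch.

```python
from collections import Counter

def modes(values):
    """Return all most-frequent values; an empty list means 'no mode'."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    # Assumption: if several distinct values all occur equally often,
    # report no mode.
    if len(counts) > 1 and all(c == highest for c in counts.values()):
        return []
    return [v for v, c in counts.items() if c == highest]

shoe_sizes = [8, 9, 9, 10, 10, 11]   # made-up data with two modes
colours = ["red", "blue", "red"]     # made-up categorical data
print(modes(shoe_sizes))   # [9, 10]
print(modes(colours))      # ['red']
print(modes([1, 2, 3]))    # []
```

Because the function only counts occurrences, it works on categorical data where a mean or median cannot be computed.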


Outliers
Values distant from the majority of the
data.
The median is often a better measure of
central tendency than the mean for small
data sets that contain outliers.
For larger data sets, the effect of outliers on
the mean is less significant.
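
To illustrate the effect of an outlier, the sketch below uses Python's standard `statistics` module on made-up data: a small set, the same set with one distant value added, and a larger set with the same distant value.

```python
from statistics import mean, median

scores = [50, 52, 54, 56, 58]    # made-up data, no outliers
with_outlier = scores + [300]    # same data plus one distant value
large = [54] * 99 + [300]        # larger made-up set, same outlier

print(f"{mean(scores):.1f} {median(scores):.1f}")              # 54.0 54.0
print(f"{mean(with_outlier):.1f} {median(with_outlier):.1f}")  # 95.0 55.0
print(f"{mean(large):.2f} {median(large):.1f}")                # 56.46 54.0
```

In the small set the outlier moves the mean from 54 to 95 while the median barely changes. In the set of 100 values the same outlier moves the mean only to about 56.5.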
Choosing a Measure of Central Tendency
If the data contain outliers, use the median.
If the data are strongly skewed, use the median.
If the data are roughly symmetrical, the mean
and the median will be close, so either is
appropriate.
If the data are not numeric, use the mode.
Example
The physics exam had the following results.
71, 82, 55, 76, 66, 71, 90, 84, 90,
64, 71, 70, 83, 45, 73, 51, 68
Determine the mean, median, and mode.
Example - Mean
The physics exam had the following results.
71, 82, 55, 76, 66, 71, 90, 84, 90,
64, 71, 70, 83, 45, 73, 51, 68

$$
\begin{aligned}
\mu &= \frac{\sum x}{N} = \frac{x_1 + x_2 + \cdots + x_N}{N} \\
&= \frac{71 + 82 + 55 + 76 + 66 + 71 + 90 + 84 + 90 + 64 + 71 + 70 + 83 + 45 + 73 + 51 + 68}{17} \\
&= \frac{1210}{17} \\
&\approx 71.2
\end{aligned}
$$
Example - Median
The physics exam had the following results.
71, 82, 55, 76, 66, 71, 90, 84, 90,
64, 71, 70, 83, 45, 73, 51, 68

Order the data:

45, 51, 55, 64, 66, 68, 70, 71, 71,
71, 73, 76, 82, 83, 84, 90, 90

There are 17 values, so the median is the 9th value in the ordered list.
Therefore the median is 71.


Example - Mode
The physics exam had the following results.
71, 82, 55, 76, 66, 71, 90, 84, 90,
64, 71, 70, 83, 45, 73, 51, 68

The value 71 occurs three times, more often than any other value.
Therefore the mode is 71.
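
The three results for the physics exam data can be checked with Python's standard `statistics` module. This is only a verification sketch; the variable name `results` is chosen for illustration, and `multimode` requires Python 3.8 or later.

```python
from statistics import mean, median, multimode

results = [71, 82, 55, 76, 66, 71, 90, 84, 90,
           64, 71, 70, 83, 45, 73, 51, 68]

print(len(results))              # 17
print(sum(results))              # 1210
print(round(mean(results), 1))   # 71.2
print(median(results))           # 71
print(multimode(results))        # [71]
```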


Weighted Mean
• Sometimes, certain data within a set are more
significant than others.
• A weighted mean gives a measure of central
tendency that reflects the relative importance of the data.
• Weighted means are often used in calculations
of indices.
Weighted Mean

$$\bar{x}_w = \frac{w_1 x_1 + w_2 x_2 + \cdots + w_n x_n}{w_1 + w_2 + \cdots + w_n} = \frac{\sum_i w_i x_i}{\sum_i w_i}$$

$\sum_i w_i x_i$ is the sum of the weighted values.
$\sum_i w_i$ is the sum of the various weighting factors.
Weighted Mean - Example
The averages (means) of five Data
Management classes are 69, 72, 66, 75, and
78. If the class sizes were 26, 33, 25, 35,
and 37 respectively, determine the overall
average (mean) for the entire grade.
Weighted Mean - Example

Class   Class Mean, x_i   Weight Factor (Class Size), w_i
1       69                26
2       72                33
3       66                25
4       75                35
5       78                37
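Weighted Mean

$$
\begin{aligned}
\bar{x}_w &= \frac{\sum_i w_i x_i}{\sum_i w_i} \\
&= \frac{26 \times 69 + 33 \times 72 + 25 \times 66 + 35 \times 75 + 37 \times 78}{26 + 33 + 25 + 35 + 37} \\
&= \frac{11\,331}{156} \\
&\approx 72.6
\end{aligned}
$$

The average for the entire grade is 72.6%.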
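
The same calculation as a short Python sketch. The function name `weighted_mean` and the variable names are chosen for illustration; the class means and class sizes are those given in the example.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight

class_means = [69, 72, 66, 75, 78]   # x_i
class_sizes = [26, 33, 25, 35, 37]   # w_i

overall = weighted_mean(class_means, class_sizes)
print(round(overall, 1))                     # 72.6

# For comparison: the unweighted mean of the five class means.
print(sum(class_means) / len(class_means))   # 72.0
```

The weighted result is higher than the unweighted 72.0 because the larger classes happen to have the higher means.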
Mean for Group Data
• The mean should always be calculated using
the original data before they are grouped into
intervals.
• If the data are already summarized in a
frequency table, the mean can be approximated
by using the midpoint of each interval to stand in
for the values in that interval.
Mean for Group Data

$$\mu \approx \frac{\sum_i f_i m_i}{\sum_i f_i} \qquad \bar{x} \approx \frac{\sum_i f_i m_i}{\sum_i f_i}$$

$\sum_i f_i m_i$ is the sum of the interval midpoints times the
number of data in the interval.
$\sum_i f_i$ is the sum of all the frequencies.
Mean for Group Data - Example
The following table represents the number of hours
per day of watching TV in a sample of 500 people.

Number of hours   0-1   2-3   4-5   6-7   8-9   10-11   12-13
Frequency         64    92    141   86    71    35      11

a) What is the mean number of TV viewing hours in this group?
b) What length of time is most often spent in front of a
TV by this group?
c) What is the median number of TV viewing hours?
Interval   Midpoint (m_i)   Frequency (f_i)   Cumulative Frequency   f_i m_i
0-1        0.5              64                64                     64 × 0.5 = 32.0
2-3        2.5              92                156                    92 × 2.5 = 230.0
4-5        4.5              141               297                    141 × 4.5 = 634.5
6-7        6.5              86                383                    86 × 6.5 = 559.0
8-9        8.5              71                454                    71 × 8.5 = 603.5
10-11      10.5             35                489                    35 × 10.5 = 367.5
12-13      12.5             11                500                    11 × 12.5 = 137.5

Sum of f_i = 500          Sum of f_i m_i = 2564.0

1. Find the midpoints and cumulative frequencies for the intervals.
2. Calculate the midpoint times the frequency for each interval.
3. Determine the sum of the frequencies and the sum of the f_i m_i products.
Mean for Group Data

$$\bar{x} \approx \frac{\sum_i f_i m_i}{\sum_i f_i} = \frac{2564}{500} \approx 5.1$$

The mean number of viewing hours for this
group was approximately five hours.
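
The grouped-mean calculation for the TV viewing data, as a short Python sketch. The variable names are chosen for illustration; the intervals and frequencies come from the frequency table in the example.

```python
# (lower bound, upper bound) of each interval, in hours
intervals = [(0, 1), (2, 3), (4, 5), (6, 7), (8, 9), (10, 11), (12, 13)]
frequencies = [64, 92, 141, 86, 71, 35, 11]

# Step 1: the midpoint of each interval
midpoints = [(low + high) / 2 for low, high in intervals]

# Steps 2 and 3: midpoint times frequency, then the two sums
sum_f = sum(frequencies)
sum_fm = sum(f * m for f, m in zip(frequencies, midpoints))

grouped_mean = sum_fm / sum_f
print(sum_f)                    # 500
print(sum_fm)                   # 2564.0
print(round(grouped_mean, 1))   # 5.1
```

The result is an approximation, since every value in an interval is treated as if it sat at the interval's midpoint.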
Mean for Group Data - Example
b) What length of time is most often spent in front of a
TV by this group?
The mode is the answer to this question. From the
frequency table, the modal interval is identified by
the largest frequency (141, for the interval 4-5).

The most frequent period of time spent in front of
a TV by this group is between four and five hours.
Mean for Group Data - Example
c) What is the median number of TV viewing hours?
The median is the middlemost datum. With 500 values, the median
is the average of the 250th and 251st values. By referring to
the cumulative frequency column, we notice that the
250th and 251st values both occur in the interval 4-5.

We would then estimate the median to be 4.5
hours of viewing time.
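
Parts b) and c) can be sketched in Python from the same frequency table. The helper name `interval_of` and the variable names are hypothetical, and the median step assumes an even number of data values, as in this example (500).

```python
from itertools import accumulate

intervals = ["0-1", "2-3", "4-5", "6-7", "8-9", "10-11", "12-13"]
midpoints = [0.5, 2.5, 4.5, 6.5, 8.5, 10.5, 12.5]
frequencies = [64, 92, 141, 86, 71, 35, 11]

# b) The modal interval is the one with the largest frequency.
modal_index = frequencies.index(max(frequencies))
print(intervals[modal_index])   # 4-5

# c) Locate the two middle values using the cumulative frequencies.
cumulative = list(accumulate(frequencies))
n = cumulative[-1]              # 500 values in total

def interval_of(position):
    """Index of the interval that holds the datum at a 1-based position."""
    return next(i for i, c in enumerate(cumulative) if c >= position)

lower = interval_of(n // 2)       # 250th value
upper = interval_of(n // 2 + 1)   # 251st value
median_estimate = (midpoints[lower] + midpoints[upper]) / 2
print(median_estimate)          # 4.5
```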
Homework
Pg 133
#1,3,5,7,8,9,11
