Numerical Measures
Measures of Location or Measures of Central Tendency
Lecture Two / 2022-2023
2- The Sample Mean (𝑋̅):
Frequently we select a sample from the population in order to find something about a
specific characteristic of the population.
Lecture 2 Page -1
For grouped data, the sample mean is given as:
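\[ \bar{X} = \frac{x_1 f_1 + x_2 f_2 + x_3 f_3 + \cdots + x_K f_K}{f_1 + f_2 + f_3 + \cdots + f_K} \]
\[ \bar{X} = \frac{\sum_{i=1}^{K} x_i f_i}{\sum_{i=1}^{K} f_i} \]
\[ \bar{X} = \frac{1}{n} \sum_{i=1}^{K} x_i f_i \]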
Where:
𝑥𝑖: is the class mid-point
𝑓𝑖: is the class frequency
n: is the number of values in the sample.
Example: Find the mean for the data in the table shown.
Class    Frequency
10-14    2
15-19    3
20-24    4
25-29    3
30-34    2
35-39    1
Solution:
We have to find the class mid-points.
Class    Class true limits    Frequency    Class mid-point
10-14    9.5-14.5             2            (9.5+14.5)/2 = 12
15-19    14.5-19.5            3            17
20-24    19.5-24.5            4            22
25-29    24.5-29.5            3            27
30-34    29.5-34.5            2            32
35-39    34.5-39.5            1            37
\[ \bar{X} = \frac{1}{n} \sum_{i=1}^{K} x_i f_i \]
\[ \bar{X} = \frac{12 \times 2 + 17 \times 3 + 22 \times 4 + 27 \times 3 + 32 \times 2 + 37 \times 1}{2 + 3 + 4 + 3 + 2 + 1} = \frac{345}{15} \]
\[ \bar{X} = 23 \]
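The grouped-mean calculation above can be reproduced with a short Python sketch. The function name `grouped_mean` and the list layout are illustrative choices, not part of the lecture; the class limits and frequencies are those of the example table.

```python
# Class limits and frequencies taken from the example table.
classes = [(10, 14), (15, 19), (20, 24), (25, 29), (30, 34), (35, 39)]
frequencies = [2, 3, 4, 3, 2, 1]


def grouped_mean(classes, frequencies):
    """Mean of grouped data: sum(x_i * f_i) / sum(f_i), x_i = class mid-point."""
    # The mid-point of the stated limits equals the mid-point of the true limits.
    midpoints = [(lower + upper) / 2 for lower, upper in classes]
    n = sum(frequencies)  # number of values in the sample
    return sum(x * f for x, f in zip(midpoints, frequencies)) / n


mean = grouped_mean(classes, frequencies)
print(mean)  # 23.0
```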
As an example, the mean of (3, 8, 4) is 5. Then:
∑(𝑥𝑖 − 𝑋̅) = (3 − 5) + (8 − 5) + (4 − 5)
= −2 + 3 − 1
= 0.0
5) If (𝑋̅1, 𝑋̅2, 𝑋̅3, ..., 𝑋̅𝐵) are the means of B distributions with respective frequencies (𝑛1, 𝑛2, 𝑛3, ..., 𝑛𝐵), then the mean 𝑋̅ of the whole distribution, with total frequency 𝑁 = 𝑛1 + 𝑛2 + ... + 𝑛𝐵, is given by:
\[ \bar{X} = \frac{1}{N} \sum_{i=1}^{B} n_i \bar{X}_i \]
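As an illustration of this property, the following Python sketch combines the means of two groups. The group means and sizes are hypothetical numbers chosen for the illustration, and `combined_mean` is an illustrative name.

```python
def combined_mean(means, sizes):
    """Mean of the whole distribution: sum(n_i * mean_i) / N, with N = sum(n_i)."""
    total = sum(sizes)  # N
    return sum(n * m for n, m in zip(sizes, means)) / total


# Hypothetical data: two groups with means 70 and 80 and sizes 20 and 30.
group_means = [70.0, 80.0]
group_sizes = [20, 30]

overall = combined_mean(group_means, group_sizes)
print(overall)  # 76.0
```

The combined mean lies closer to 80 than to 70 because the second group is larger.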
The mean does have a weakness because it uses the value of every item in a sample, or
population, in its computation. If one or two of these values are either extremely large or
extremely small compared to the majority of data, the mean might not be an appropriate
average to represent the data.
The Weighted Mean: The weighted mean is a special case of the arithmetic mean. We
will refer to the weighted mean as 𝑋̅𝑤 . Any measure of importance could be used as a weight.
In general, the weighted mean of a set of numbers designated (𝑥1, 𝑥2, ..., 𝑥𝑛) with the
corresponding weights (𝑤1, 𝑤2, …., 𝑤𝑛) is computed as:
\[ \bar{X}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i} \]
Example: The Carter Construction Company pays its hourly employees $16.50, $17.50, or
$18.50 per hour. There are 26 hourly employees: 14 are paid at the $16.50 rate, 10 at the
$17.50 rate, and 2 at the $18.50 rate. What is the mean hourly rate paid to the 26 employees?
Solution:
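\[ \bar{X}_w = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i} \]
\[ \bar{X}_w = \frac{14 \times 16.50 + 10 \times 17.50 + 2 \times 18.50}{14 + 10 + 2} \]
\[ \bar{X}_w = \frac{443}{26} \approx \$17.038 \]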
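The same weighted mean can be checked with a few lines of Python. This is a minimal sketch: `weighted_mean` is an illustrative name, and the number of employees at each rate is used as the weight, as in the solution above.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w_i * x_i) / sum(w_i)."""
    return sum(w * x for w, x in zip(weights, values)) / sum(weights)


rates = [16.50, 17.50, 18.50]  # hourly rates in dollars
employees = [14, 10, 2]        # number of employees paid at each rate

mean_rate = weighted_mean(rates, employees)
print(round(mean_rate, 3))  # 17.038
```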
The Median
For data containing one or two very large or very small values the arithmetic mean may not
be representative. The center for such data can be better described by the median.
Median: The midpoint of the values after they have been ordered from the smallest to the
largest, or the largest to the smallest.
For the median, the data must be at least an ordinal level of measurement. The major
properties of the median are:
1- It is not affected by extremely large or small values. Therefore, the median is a
valuable measure of location when such values do occur.
2- It can be computed for ordinal level data or higher.
3- It is unique.
4- It can be computed for a frequency distribution with an open-ended class, provided the
median does not lie in the open-ended class.
For raw data: for n values arranged in ascending (or descending) order of magnitude, the
median is:
❖ The middle value if n is odd.
M = element[(n+1) /2]
❖ The arithmetic mean of the two middle values if n is even.
M = (element[n/2] + element[(n/2)+1]) / 2
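Example: Find the median for the data:
a. 55, 62, 75, 35, 47, 80, 50
b. 70, 56, 42, 86, 26, 23, 75, 62
Solution:
a. 35, 47, 50, 55, 62, 75, 80
n = 7 is odd, so the median is the 4th value:
M = 55
b. 23, 26, 42, 56, 62, 70, 75, 86
n = 8 is even, so the median is the mean of the 4th and 5th values:
M = (56 + 62) / 2 = 59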
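The rule for raw data (middle value for odd n, mean of the two middle values for even n) can be sketched in Python as follows. The function name `median` is an illustrative choice; the two data sets are those of the example.

```python
def median(values):
    """Median of raw data: sort, then take the middle value(s)."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd n: the single middle value, element (n + 1) / 2 in 1-based counting.
        return ordered[middle]
    # Even n: arithmetic mean of the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2


data_a = [55, 62, 75, 35, 47, 80, 50]
data_b = [70, 56, 42, 86, 26, 23, 75, 62]

print(median(data_a))  # 55
print(median(data_b))  # 59.0
```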
The Mode
Mode: The value of the observation that appears most frequently.
The major properties of the mode are:
1. It is not affected by extremely large or small values.
2. It can be computed for all levels of data (nominal, ordinal, interval, and ratio).
For raw data:
1- Some sets of data have no mode because no value appears more than the others.
Example: (19, 21, 23, 20)
3- If the set of data has more than two modes, the distribution is referred to as being
multimodal. In such cases we would probably not consider any of the modes as being
representative of the central value of the data.
For grouped data:
\[ M_o = L + \left( \frac{f_m - f_1}{2 f_m - f_1 - f_2} \right) \times h \]
Where:
L: true lower limit of the modal class.
𝑓𝑚: frequency of modal class (maximum frequency).
𝑓1: frequency of class preceding the modal class.
𝑓2: frequency of class following the modal class.
h: modal class interval.
Example: Find the mode for the data in the table shown.
Class    Frequency
10-14    2
15-19    3
20-24    4
25-29    3
30-34    2
35-39    1
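Solution:
Class    Class true limits    Frequency
10-14    9.5-14.5             2
15-19    14.5-19.5            3
20-24    19.5-24.5            4
25-29    24.5-29.5            3
30-34    29.5-34.5            2
35-39    34.5-39.5            1
The modal class is 20-24 (the highest frequency, 4), so L = 19.5, 𝑓𝑚 = 4, 𝑓1 = 3, 𝑓2 = 3 and h = 5.
\[ M_o = L + \left( \frac{f_m - f_1}{2 f_m - f_1 - f_2} \right) \times h \]
\[ M_o = 19.5 + \left( \frac{4 - 3}{(2 \times 4) - 3 - 3} \right) \times 5 \]
\[ M_o = 22 \]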
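The grouped-mode formula used in this solution can be sketched in Python. Two points are assumptions of the sketch rather than statements from the lecture: a missing neighbouring class (when the modal class is the first or last one) is treated as having frequency 0, and if several classes share the maximum frequency the first one is used. The name `grouped_mode` is illustrative.

```python
def grouped_mode(true_lower_limits, frequencies, width):
    """Mo = L + (fm - f1) / (2*fm - f1 - f2) * h for the modal class."""
    m = frequencies.index(max(frequencies))  # index of the (first) modal class
    fm = frequencies[m]
    # Assumption: a missing neighbouring class counts as frequency 0.
    f1 = frequencies[m - 1] if m > 0 else 0
    f2 = frequencies[m + 1] if m < len(frequencies) - 1 else 0
    return true_lower_limits[m] + (fm - f1) / (2 * fm - f1 - f2) * width


lower_limits = [9.5, 14.5, 19.5, 24.5, 29.5, 34.5]  # true lower class limits
frequencies = [2, 3, 4, 3, 2, 1]

mode = grouped_mode(lower_limits, frequencies, 5)
print(mode)  # 22.0
```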
Example: The following is the percent change in net income from last year to this year for
a sample of 12 construction companies in a certain state. Determine mean, median, and the
mode:
5 1 -10 -6 5 12 7 8 2 5 -1 11
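Solution:
The mean is the sum of the 12 values divided by 12:
\[ \bar{X} = \frac{39}{12} = 3.25 \]
To find the median, we have to sort the data from the smallest value to the largest value:
-10, -6, -1, 1, 2, 5, 5, 5, 7, 8, 11, 12
There are 12 values, so the median is the mean of the 6th and 7th values:
M = (5+5) / 2 = 5
The value 5 appears three times, more often than any other value, so:
Mo = 5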
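The three measures for this sample can also be checked with Python's standard `statistics` module. This is only a cross-check of the hand calculation; the variable names are illustrative.

```python
import statistics

# Percent change in net income for the 12 construction companies.
changes = [5, 1, -10, -6, 5, 12, 7, 8, 2, 5, -1, 11]

sample_mean = statistics.mean(changes)
sample_median = statistics.median(changes)
sample_mode = statistics.mode(changes)

print(sample_mean)    # 3.25
print(sample_median)  # 5.0
print(sample_mode)    # 5
```

The mean (3.25) is lower than the median and the mode (both 5) because the two large negative values, -10 and -6, pull it down.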
Measures of Dispersion
A measure of location, such as the mean or the median, only describes the center of the data.
It is valuable from that standpoint, but it does not tell us anything about the spread of the
data. A small value for a measure of dispersion indicates that the data are clustered closely,
say, around the mean, so the mean is considered representative of the data. A large value
of the measure of dispersion indicates that the mean is not reliable.
If two groups of students have the same average marks, we may like to know whether
one group consists of students of average and near-average intelligence and the other group
is made up of a large number of very bright and very dull students. For example:
Group A: 75 85 95 105 115 125
Group B: 10 20 30 70 190 280
Both A and B have a mean of 100, yet their values are spread very differently. Such variation
is variously called dispersion, spread, scatter, or variability. We will consider several measures of dispersion:
The Range
Range: The difference between the largest and the smallest values in a data set. It is the
simplest measure of dispersion.
Range = Largest value – Smallest value
Interquartile range = 𝑄3 – 𝑄1
Interdecile range = 𝐷9 – 𝐷1
Interpercentile range = 𝑃90 – 𝑃10
Range is a very useful measure in industrial engineering work, especially in statistical
quality control work.
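As a quick illustration, the range formula can be applied to the marks of Group A and Group B given earlier. The sketch below is a minimal Python check; `value_range` is an illustrative name.

```python
def value_range(values):
    """Range = largest value - smallest value."""
    return max(values) - min(values)


group_a = [75, 85, 95, 105, 115, 125]
group_b = [10, 20, 30, 70, 190, 280]

mean_a = sum(group_a) / len(group_a)
mean_b = sum(group_b) / len(group_b)

print(mean_a, value_range(group_a))  # 100.0 50
print(mean_b, value_range(group_b))  # 100.0 270
```

Both groups have the same mean, but the range of Group B is more than five times that of Group A.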
Mean Deviation
Mean deviation (MD): The arithmetic mean of the absolute values of the deviations from the arithmetic mean.
\[ MD = \frac{\sum |X - \bar{X}|}{n} \]
Where:
X: is the value of each observation.
𝑋̅: is the arithmetic mean of the values.
n: is the number of observations in the sample.
| |: indicates the absolute value
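The definition of the mean deviation for raw data can be sketched in Python as below. The data are the marks of Group A and Group B from the start of this section; `mean_deviation` is an illustrative name.

```python
def mean_deviation(values):
    """Arithmetic mean of the absolute deviations from the arithmetic mean."""
    n = len(values)
    centre = sum(values) / n  # arithmetic mean of the values
    return sum(abs(x - centre) for x in values) / n


group_a = [75, 85, 95, 105, 115, 125]
group_b = [10, 20, 30, 70, 190, 280]

md_a = mean_deviation(group_a)
md_b = mean_deviation(group_b)

print(md_a)  # 15.0
print(md_b)  # 90.0
```

On average, a mark in Group A lies 15 away from the mean of 100, while a mark in Group B lies 90 away from it.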
For grouped data:
\[ MD = \frac{\sum_{i=1}^{K} f_i \, |x_i - \bar{X}|}{\sum_{i=1}^{K} f_i} \]
Where:
𝑥𝑖: is the class mid-point
𝑓𝑖: is the class frequency
𝑋̅: is the arithmetic mean of the sample
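For grouped data, a minimal Python sketch is given below. It assumes the usual grouped form of the mean deviation, in which each absolute deviation of a class mid-point from the mean is weighted by the class frequency and the total is divided by the sum of the frequencies. The mid-points and frequencies are those of the grouped-mean example earlier in the lecture; `grouped_mean_deviation` is an illustrative name.

```python
def grouped_mean_deviation(midpoints, frequencies):
    """Assumed grouped form: sum(f_i * |x_i - mean|) / sum(f_i)."""
    n = sum(frequencies)
    mean = sum(x * f for x, f in zip(midpoints, frequencies)) / n
    return sum(f * abs(x - mean) for x, f in zip(midpoints, frequencies)) / n


midpoints = [12, 17, 22, 27, 32, 37]  # class mid-points
frequencies = [2, 3, 4, 3, 2, 1]      # class frequencies

md = grouped_mean_deviation(midpoints, frequencies)
print(round(md, 3))  # 5.867
```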
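The population variance is the arithmetic mean of the squared deviations from the population mean:
\[ \sigma^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2 \]
Where:
𝜎2: is the population variance.
𝑥𝑖: is the value of an observation in the population.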
𝜇: is the arithmetic mean of the population.
N: is the number of observations in the population.
Because the variance is expressed in the square of the units of the variable, we use
the square root of the variance, which is called the standard deviation (σ):
\[ \sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2} \]
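The population variance and standard deviation can be sketched in Python as follows. For the sake of illustration, the sketch treats the marks of Group A and Group B as complete populations, which is an assumption of the example. The function names are illustrative, and the result is cross-checked against `statistics.pstdev` from the standard library.

```python
import math
import statistics


def population_variance(values):
    """Mean of the squared deviations from the population mean."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def population_std(values):
    """Square root of the population variance."""
    return math.sqrt(population_variance(values))


# Assumption: each group is treated as a whole population.
group_a = [75, 85, 95, 105, 115, 125]
group_b = [10, 20, 30, 70, 190, 280]

sigma_a = population_std(group_a)
sigma_b = population_std(group_b)

print(round(sigma_a, 2))  # 17.08
print(round(sigma_b, 2))  # 100.66

# Cross-check against the standard library.
print(math.isclose(sigma_a, statistics.pstdev(group_a)))  # True
```

Unlike the variance, the standard deviation is in the same units as the marks themselves.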