
Measures of Central Tendency

The Population Mean:


The population mean is the sum of all the values in the population divided by the
total number of values in that population.
For ungrouped data,

$$\text{Population mean, } \mu = \frac{\sum_{i=1}^{N} X_i}{N}$$

where N = number of values in the population and $\sum X$ = sum of the X values in the population.

Example: There are 30 IT companies in the city of Rangpur. Their profits (in
lakh taka) in the year 2022-2023 are given below:
20, 22, 35, 42, 37, 42, 48, 53, 49, 65, 39, 48, 67, 18, 16, 23, 37, 35, 49, 63, 65, 55,
45, 58, 57, 69, 25, 29, 58, 65.
What is the average profit of the companies?

The Sample Mean:


For raw data/ungrouped data, the mean is the sum of all the sampled values
divided by the total number of sampled values.
The formula for the mean of a sample is:
$$\text{Sample mean, } \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

where n = sample size.
Example: From our previous example, we take a sample of the profits (in lakh taka) of 5 companies:
65, 22, 48, 55, 29.
Find the average profit of the companies from the sample data.
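The two definitions above can be checked with a few lines of Python. This is a minimal sketch that copies the 30 population values and the 5 sample values from the examples; the variable names are illustrative only.

```python
from statistics import mean

# Profits (in lakh taka) of the 30 IT companies, copied from the population example.
population = [
    20, 22, 35, 42, 37, 42, 48, 53, 49, 65, 39, 48, 67, 18, 16,
    23, 37, 35, 49, 63, 65, 55, 45, 58, 57, 69, 25, 29, 58, 65,
]
# Profits of the 5 sampled companies.
sample = [65, 22, 48, 55, 29]

population_mean = sum(population) / len(population)  # mu = (sum of X) / N
sample_mean = mean(sample)                           # x-bar = (sum of x) / n

print(f"Population mean: {population_mean:.2f} lakh taka")
print(f"Sample mean:     {sample_mean:.2f} lakh taka")
```

The sample mean will generally differ from the population mean, because it uses only part of the data.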

PROBLEMS ON MEASURES OF CENTRAL TENDENCY, LOCATION & DISPERSION Prepared by: Dr. Md. Siddikur Rahman
Arithmetic Mean for Grouped Data:

$$\text{Arithmetic mean of grouped data, } \bar{x} = \frac{\sum f_i x_i}{\sum f_i}$$

Where,
f_i = frequency of each class
x_i = mid-point of each class

Example: We organize the raw data from our previous example and present them in
the form of a frequency distribution (we treat the data as a sample):

Profits (lakh taka)   Frequency, f_i   Mid-point, x_i   f_i x_i
15-24                  5
24-33                  2
33-42                  7
42-51                  6
51-60                  5
60-79                  5
Total                 30                                 Σ f_i x_i = …
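The empty columns of the table can be filled in mechanically. The sketch below takes the class limits and frequencies exactly as printed in the table (including the last class as printed) and assumes each mid-point is the average of the two class limits.

```python
# (lower limit, upper limit, frequency) for each class, exactly as printed in the table.
classes = [(15, 24, 5), (24, 33, 2), (33, 42, 7), (42, 51, 6), (51, 60, 5), (60, 79, 5)]

sum_f = 0
sum_fx = 0.0
for lower, upper, f in classes:
    x = (lower + upper) / 2          # mid-point of the class
    sum_f += f
    sum_fx += f * x
    print(f"{lower}-{upper}: f = {f}, x = {x}, f*x = {f * x}")

grouped_mean = sum_fx / sum_f        # x-bar = sum(f*x) / sum(f)
print(f"sum f = {sum_f}, sum f*x = {sum_fx}, grouped mean = {grouped_mean:.2f}")
```

Because every value in a class is represented by the class mid-point, the grouped mean is an approximation of the mean of the raw data.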

Weighted Mean
The weighted mean is a special case of the Arithmetic Mean.
Formula:

$$\text{Weighted mean} = \bar{X}_w = \frac{W_1 X_1 + W_2 X_2 + W_3 X_3 + \cdots + W_n X_n}{W_1 + W_2 + W_3 + \cdots + W_n} = \frac{\sum W_i X_i}{\sum W_i}$$

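Example: The combined grade point averages (CGPA) of a student in 3 semesters
are 3.5, 2.8 and 3.5. The credit hours he completed in the 3 semesters are 12, 15 and 9,
respectively. Find the average CGPA.
Solution: Using the credit hours as weights, (12 × 3.5 + 15 × 2.8 + 9 × 3.5) / (12 + 15 + 9) = 115.5 / 36 ≈ 3.21.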
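The same calculation as a short Python sketch; the function name `weighted_mean` is an illustrative choice, not part of any library.

```python
def weighted_mean(values, weights):
    """Return sum(W_i * X_i) / sum(W_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

gpas = [3.5, 2.8, 3.5]        # grade point average in each semester
credit_hours = [12, 15, 9]    # weights: credit hours completed in each semester

average_cgpa = weighted_mean(gpas, credit_hours)
print(f"Weighted average CGPA: {average_cgpa:.2f}")
```

With equal weights the function reduces to the ordinary arithmetic mean.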

Geometric Mean
Definition:
The geometric mean of a set of n positive numbers is defined as the nth root of
the product of the n values. The formula is:

$$GM = \sqrt[n]{(X_1)(X_2)\cdots(X_n)}$$

• The geometric mean will always be less than or equal to (never more than)
the arithmetic mean.

Example: The profits earned by a Rangpur IT Institute on four recent projects in
four successive years were 3 percent, 2 percent, 4 percent, and 6 percent. What is
the geometric mean profit?

$$GM = \sqrt[n]{(X_1)(X_2)\cdots(X_n)} = \sqrt[4]{(3)(2)(4)(6)} = \sqrt[4]{144} = 3.46$$

So the average profit earned by RIIT is about 3.46 percent.
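A minimal Python sketch of this calculation, which also compares the result with the arithmetic mean of the same four values. The helper name `geometric_mean` is illustrative; `math.prod` requires Python 3.8 or later.

```python
import math

def geometric_mean(values):
    """Return the nth root of the product of n positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("all values must be positive")
    return math.prod(values) ** (1 / len(values))

profits = [3, 2, 4, 6]                 # percent profit in four successive years
gm = geometric_mean(profits)
am = sum(profits) / len(profits)       # arithmetic mean, for comparison

print(f"Geometric mean:  {gm:.2f} percent")
print(f"Arithmetic mean: {am:.2f} percent")
```

For these values the geometric mean is smaller than the arithmetic mean, in line with the property stated above.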
• A second application of the geometric mean is to find the average percent
increase over a period of time.
The formula is:

$$GM = \sqrt[n]{\frac{\text{Value at end of period}}{\text{Value at start of period}}} - 1$$

where n is the number of periods.

Example: RIT Institute earned $30,000 in 2012 and $50,000 in 2013. What is
the average annual rate of percentage increase during the period?
Solution: Earnings increased at a rate of ….. percent from 2012 to 2013.

Harmonic Mean
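The formula for the harmonic mean is:

$$HM = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \frac{1}{x_3} + \cdots + \frac{1}{x_n}}$$

Example: In a journey from campus to the bus terminal, the scheduled bus of
BSMRAAU covers the first 50 km at a speed of 60 km/hour, the second 50 km at a speed
of 75 km/hour, the third 50 km at a speed of 65 km/hour, and the fourth 50 km at a speed of
80 km/hour. What is the average speed of the bus throughout the journey?
Solution: Since the four legs are of equal distance, the average speed is the harmonic mean of the speeds: HM = 4 / (1/60 + 1/75 + 1/65 + 1/80) ≈ 69.10 km/h.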
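The sketch below computes the harmonic mean of the four speeds and cross-checks it against total distance divided by total time. Function and variable names are illustrative.

```python
def harmonic_mean(values):
    """Return n divided by the sum of reciprocals of the values."""
    return len(values) / sum(1 / v for v in values)

speeds = [60, 75, 65, 80]   # km/hour on each leg
leg_km = 50                 # every leg covers the same distance

hm = harmonic_mean(speeds)

# Cross-check: average speed = total distance / total travel time.
total_time = sum(leg_km / v for v in speeds)
average_speed = leg_km * len(speeds) / total_time

print(f"Harmonic mean of speeds: {hm:.2f} km/h")
print(f"Distance / time:         {average_speed:.2f} km/h")
```

The two figures agree because each leg has the same length; with unequal distances a distance-weighted harmonic mean would be needed instead.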
THE MEDIAN:
Median: The median is the middle-most value of a set of ranked (ordered)
observations.

Median for ungrouped data:


When n is an odd number,
Order the values in ascending or descending order. Then

$$\text{Median} = \left(\frac{n+1}{2}\right)\text{th item in the ordered series.}$$

Example: Find the median of the following values: 11, 9, 13, 4, and 7.
Solution: First we arrange the data in ascending order as follows: 4, 7, 9, 11, and 13.

$$\text{Median} = \left(\frac{n+1}{2}\right)\text{th item} = \left(\frac{5+1}{2}\right)\text{th item} = 3\text{rd item}$$

In the series, the third item is 9. So the median value is 9.

When n is an even number,
Order the values in ascending or descending order. Then

$$\text{Median} = \frac{\left(\frac{n}{2}\right)\text{th item} + \left(\frac{n}{2}+1\right)\text{th item}}{2} \text{ in the ordered series.}$$

Example: Find the median of the following values: 11, 9, 13, 4, 7, and 15.
Solution: First, we arrange the data in ascending order as follows: 4, 7, 9, 11,
13, and 15.

$$\text{Median} = \frac{\left(\frac{n}{2}\right)\text{th item} + \left(\frac{n}{2}+1\right)\text{th item}}{2} = \frac{3\text{rd obs.} + 4\text{th obs.}}{2} = \frac{9 + 11}{2} = 10$$

So the median value is 10.
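Both rules (odd and even n) can be sketched in one small function. The name `median_ungrouped` is illustrative; the result is compared with `statistics.median` from the standard library.

```python
from statistics import median

def median_ungrouped(values):
    """Median by position: the (n+1)/2 th item for odd n, the average of the
    (n/2)th and (n/2 + 1)th items for even n (positions are 1-based)."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        return ordered[(n + 1) // 2 - 1]
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2

odd_case = [11, 9, 13, 4, 7]
even_case = [11, 9, 13, 4, 7, 15]

print(median_ungrouped(odd_case), median(odd_case))
print(median_ungrouped(even_case), median(even_case))
```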

MEDIAN FOR GROUPED DATA

The formula is:

$$Me = l + \frac{h}{f}\left(\frac{N}{2} - c\right)$$

Where,
l is the lower limit of the median class (the class interval in which the median lies)
h is the width of the median class
N is the total number of frequencies
f is the frequency of the median class
c is the cumulative frequency preceding the median class.

Example:
The following data represent the distribution of marks obtained by the students
of BSMRAAU in an assignment in a Statistics course:

Class interval   No. of     Cumulative frequency
of marks         students   from top   from bottom
<10              10          10        132
10-15            25          35        122
15-20            48          83         97
20-25            21         104         49
25-30            16         120         28
30+              12         132         12

1. Find the median mark obtained by the students.
2. How many students got 15 or more marks?
Solution:
1) In the given distribution, N = 132 and h = 5, so N/2 = 66.
The first class for which c.f. ≥ 66 is 15-20, where the lower limit of the class is l = 15,
the frequency of the class is f = 48, and the c.f. preceding the class is c = 35, so

$$Me = l + \frac{h}{f}\left(\frac{N}{2} - c\right) = 15 + \frac{5}{48}(66 - 35) = 15 + 3.23 = 18.23$$

2) The c.f. calculated from the bottom shows that 97 students got 15 or more
marks.
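The steps of the solution can be sketched in Python as follows. The function name and the use of `None` for the lower limit of the open-ended first class are illustrative assumptions; the class width h = 5 is taken from the solution.

```python
def grouped_median(lower_limits, freqs, width):
    """Me = l + (h / f) * (N/2 - c), using the first class whose c.f. reaches N/2."""
    half = sum(freqs) / 2
    cum = 0  # cumulative frequency of the classes before the current one (c)
    for lower, f in zip(lower_limits, freqs):
        if cum + f >= half:
            if lower is None:
                raise ValueError("median falls in an open-ended class")
            return lower + (width / f) * (half - cum)
        cum += f
    raise ValueError("empty frequency table")

# Marks table: classes <10, 10-15, 15-20, 20-25, 25-30, 30+
lower_limits = [None, 10, 15, 20, 25, 30]
freqs = [10, 25, 48, 21, 16, 12]

me = grouped_median(lower_limits, freqs, width=5)
at_least_15 = sum(freqs[2:])   # classes 15-20 and above

print(f"Median mark: {me:.2f}")
print(f"Students with 15 or more marks: {at_least_15}")
```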

Mode
The mode is the value that occurs most frequently. Suppose the ages of a group of
people are 22, 26, 27, 27, 31, 35, and 35. Both the ages 27 and 35
are modes. This grouping of ages is referred to as bimodal (having two modes).

Mode for grouped data:


For data grouped into a frequency distribution, the mode can be approximated by
the midpoint of the class with the largest class frequency (the modal class).
A finer estimate interpolates within the modal class:

$$\text{Mode} = l + \frac{h(f_1 - f_0)}{2f_1 - f_0 - f_2}$$

Where, l = lower limit of the modal class
h = width of the modal class
f_1 = frequency of the modal class
f_0 = frequency of the class preceding the modal class
f_2 = frequency of the class following the modal class

Example:
The following data represent the distribution of the part-time job students
of BSMRAAU according to their daily salary (in taka):

Class interval of salary (in taka)   No. of students, f_i   c.f.
<600                                  10                     10
600-700                               40                     50
700-800                               65                    115
800-900                              250                    365
900-1000                             175                    540
1000-1100                             82                    622
1100-1200                             50                    672

1) Find the salary, on average, of the major group of students (the group with the maximum frequency).
2) How many students have a salary of less than 1000.00 taka?

Solution:

1) The major group consists of the 250 students whose salary lies in the class 800-900.
Their salary, on average, is given by the mode, where l = 800, h = 100, f_1 = 250, f_0 = 65 and f_2 = 175:

$$\text{Mode} = l + \frac{h(f_1 - f_0)}{2f_1 - f_0 - f_2} = 800 + \frac{100(250 - 65)}{2 \times 250 - 65 - 175} = 871.15 \text{ taka.}$$

2) From the c.f. it is observed that the salary of 540 students is less than 1000.00 taka.
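A minimal Python sketch of the same solution. The function name, the use of `None` for the open-ended first class, and the convention of treating a missing neighbouring class as frequency 0 are assumptions made for illustration.

```python
def grouped_mode(lower_limits, freqs, width):
    """Mode = l + h * (f1 - f0) / (2*f1 - f0 - f2) for the class with the largest frequency."""
    k = freqs.index(max(freqs))                       # position of the modal class
    f1 = freqs[k]
    f0 = freqs[k - 1] if k > 0 else 0                 # class before the modal class
    f2 = freqs[k + 1] if k + 1 < len(freqs) else 0    # class after the modal class
    return lower_limits[k] + width * (f1 - f0) / (2 * f1 - f0 - f2)

# Salary table: classes <600, 600-700, ..., 1100-1200
lower_limits = [None, 600, 700, 800, 900, 1000, 1100]
freqs = [10, 40, 65, 250, 175, 82, 50]

mode = grouped_mode(lower_limits, freqs, width=100)
below_1000 = sum(freqs[:5])    # classes up to 900-1000

print(f"Mode: {mode:.2f} taka")
print(f"Students earning less than 1000 taka: {below_1000}")
```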

Measures of location
Some measures are based on their position in a series of observations, but
they are not necessarily central values; hence they are referred to as
measures of location.
• Quartiles
• Percentiles
• Deciles.
The Quartiles
There are three quartiles in a data set, usually denoted by Q1, Q2 and Q3, which
divide the whole distribution into four equal parts. The second quartile is identical
with the median. The first quartile, Q1, is the value at or below which one-fourth
(25%) of all observations in the set fall; the third quartile, Q3, is the value at or
below which three-fourths (75%) of the observations lie.

For ungrouped data, a quartile, as does the median, either assumes the value of
one of the items or falls between two values. If n is divisible by 4, the first quartile
(Q1) has the value half-way between the (n/4)th and (n/4 + 1)th observations. If n is
not exactly divisible by 4 (i.e. n/4 is not an integer), the first quartile is the value
of the observation at the next higher integer position. To find the third quartile, Q3, we replace n/4 by 3n/4.

Formula
Q_i = value of the i(n+1)/4 th observation, if n is odd
Q_i = value of ½ ((in/4)th + (in/4 + 1)th) observation, if n is even

For grouped data,

$$Q_i = l + \frac{h}{f}\left(\frac{iN}{4} - c\right)$$

The quartile class is the one for which c.f. ≥ iN/4; i = 1, 2, 3


The quartiles can be represented by a box-and-whisker plot. For i = 2, the formula gives the second quartile (the median):

$$Q_2 = l + \frac{h}{f}\left(\frac{2N}{4} - c\right)$$

A value of 34.67 for Q1 implies that 25 percent of the workers are below age
34.67. Similarly, a value of 43.87 for Q3 implies that 75 percent of the workers in the company are below 43.87
years of age and only 25 percent of them are above this age.

Decile
Formula:
D_i = value of the i(n+1)/10 th observation, if n is odd
D_i = value of ½ ((in/10)th + (in/10 + 1)th) observation, if n is even

For grouped data,

$$D_i = l + \frac{h}{f}\left(\frac{iN}{10} - c\right)$$

The decile class is the one for which c.f. ≥ iN/10; i = 1, 2, …, 9

Suppose the 4th decile is 37.17, i.e., approximately 40 percent of the workers are under
the age of 37.17 years.

Percentile
P_i = value of the i(n+1)/100 th observation, if n is odd
P_i = value of ½ ((in/100)th + (in/100 + 1)th) observation, if n is even

For grouped data,

$$P_i = l + \frac{h}{f}\left(\frac{iN}{100} - c\right)$$

The percentile class is the one for which c.f. ≥ iN/100; i = 1, 2, …, 99

Suppose the 30th percentile is 35.5, i.e., 30 percent of the workers were under the
age of 35.5 years.
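The grouped-data formulas for quartiles, deciles and percentiles share one form and differ only in the divisor (4, 10 or 100). The sketch below captures this with a single illustrative function, `grouped_quantile`. Since the workers' age table behind the interpretations above is not reproduced here, the sketch reuses the marks table from the grouped median example (class width 5, open-ended first class marked with `None`).

```python
def grouped_quantile(lower_limits, freqs, width, i, parts):
    """Return l + (h / f) * (i*N/parts - c), using the first class whose
    cumulative frequency reaches i*N/parts (parts = 4, 10 or 100)."""
    target = i * sum(freqs) / parts
    cum = 0  # cumulative frequency of the classes before the current one (c)
    for lower, f in zip(lower_limits, freqs):
        if cum + f >= target:
            if lower is None:
                raise ValueError("quantile falls in an open-ended class")
            return lower + (width / f) * (target - cum)
        cum += f
    raise ValueError("empty frequency table")

# Marks table: classes <10, 10-15, 15-20, 20-25, 25-30, 30+
lower_limits = [None, 10, 15, 20, 25, 30]
freqs = [10, 25, 48, 21, 16, 12]

q1 = grouped_quantile(lower_limits, freqs, 5, i=1, parts=4)
q2 = grouped_quantile(lower_limits, freqs, 5, i=2, parts=4)
q3 = grouped_quantile(lower_limits, freqs, 5, i=3, parts=4)
d4 = grouped_quantile(lower_limits, freqs, 5, i=4, parts=10)
p30 = grouped_quantile(lower_limits, freqs, 5, i=30, parts=100)

print(f"Q1 = {q1:.2f}, Q2 = {q2:.2f}, Q3 = {q3:.2f}")
print(f"D4 = {d4:.2f}, P30 = {p30:.2f}")
```

Q2 from this function equals the grouped median of the same table, as the text states.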

Measures of Dispersion

1. The absolute measure of Dispersion


It is a measure that provides information on the average deviation or scatteredness
of observations where the measure depends on the unit of the variable under
study. The measures are absolute in that they are expressed in the same statistical
unit in which the original data are presented, such as the dollar, taka, meter,
kilogram, etc.
• Range
• Semi-interquartile range or quartile deviation (QD)
• Mean Deviation
• Variance
• Standard Deviation

2. Relative Measures of Dispersion


A relative measure provides information on the average deviation or
scatteredness of observations in a form that does not depend on the unit of the
variable under study. When two or more datasets are expressed in different units,
the absolute measures are not comparable, in which case it is necessary
to consider other measures that express the absolute deviation in
relative form. The relative measures are usually expressed in the form of
coefficients and are pure numbers, independent of the unit of measurement.
• Coefficient of Range
• Coefficient of Semi-interquartile range or quartile deviation (QD)
• Coefficient of Mean Deviation
• Coefficient of variation

Range: The range is the difference between the largest and smallest observations.

$$\text{Range} = X_{(n)} - X_{(1)} = \text{largest value} - \text{smallest value}$$

For example, let us consider the total annual rainfall (in mm) recorded at some
meteorological stations in Bangladesh, where the rainfall data are as follows:
3863, 3914, 4672, 4139, 4435, 4245, 3216, 2518, 3368, 4388, 2312, 1819, 2200, 2858,
2548, 1490, 1994, 3217, 2852, 2601, 2391, 1636, 1540, 2365, 3139.
Here, n = 25, and the range of rainfall is R = X_(n) - X_(1) = 4672 - 1490 = 3182 mm.

$$\text{Coefficient of Range} = \frac{X_{(n)} - X_{(1)}}{X_{(n)} + X_{(1)}}$$
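As a quick check, the range and the coefficient of range of the rainfall data can be computed directly. This is a minimal sketch with illustrative variable names.

```python
# Total annual rainfall (mm) at the 25 stations, copied from the example.
rainfall = [
    3863, 3914, 4672, 4139, 4435, 4245, 3216, 2518, 3368, 4388,
    2312, 1819, 2200, 2858, 2548, 1490, 1994, 3217, 2852, 2601,
    2391, 1636, 1540, 2365, 3139,
]

largest = max(rainfall)     # X_(n)
smallest = min(rainfall)    # X_(1)

value_range = largest - smallest
coefficient_of_range = (largest - smallest) / (largest + smallest)

print(f"n = {len(rainfall)}, range = {value_range} mm")
print(f"Coefficient of range = {coefficient_of_range:.4f}")
```

The range carries the unit of the data (mm), while the coefficient of range is a pure number.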

Suppose the marks obtained in a quiz by BSMRAAU students in a Statistics
course are:

Class interval   Frequency
5-10             18
10-15            20
15-20            26

Range = X_(n) - X_(1) = largest value - smallest value = 20 - 5 = 15

$$\text{Coefficient of Range} = \frac{X_{(n)} - X_{(1)}}{X_{(n)} + X_{(1)}} = \frac{20 - 5}{20 + 5} = \frac{15}{25} = 0.6, \text{ i.e. } 60 \text{ percent}$$

Mean deviation
The mean deviation is computed as the arithmetic mean of the absolute values of the
deviations from a typical value of the distribution.
• $MD(\text{Mean}) = \frac{\sum |x_i - \bar{x}|}{n}$ (mean deviation from the mean)
• $MD(\text{Median}) = \frac{\sum |x_i - Me|}{n}$ (mean deviation from the median)
• $MD(\text{Mode}) = \frac{\sum |x_i - Mo|}{n}$ (mean deviation from the mode)
For a frequency distribution with K classes,
• $MD(\text{Mean}) = \frac{1}{N}\sum_{i=1}^{K} f_i\,|x_i - \bar{x}|$ (mean deviation from the mean)
• $MD(\text{Median}) = \frac{1}{N}\sum_{i=1}^{K} f_i\,|x_i - Me|$ (mean deviation from the median)
• $MD(\text{Mode}) = \frac{1}{N}\sum_{i=1}^{K} f_i\,|x_i - Mo|$ (mean deviation from the mode)

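The percentage variation relative to the average (for example, in the average amount of minimum
temperature) is found by the coefficient of mean deviation, where the deviation can be
measured from the mean, the median or the mode.
• $\text{Coefficient of } MD(\text{Mean}) = \frac{MD(\text{mean})}{\text{mean}} \times 100\%$
• $\text{Coefficient of } MD(\text{Median}) = \frac{MD(\text{median})}{\text{median}} \times 100\%$
• $\text{Coefficient of } MD(\text{Mode}) = \frac{MD(\text{mode})}{\text{mode}} \times 100\%$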
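A short Python sketch of the ungrouped formulas. No data set accompanies this part of the notes, so the sketch reuses the five values from the odd-n median example (11, 9, 13, 4, 7) purely for illustration; the function name is likewise illustrative.

```python
from statistics import mean, median

def mean_deviation(values, center):
    """Arithmetic mean of the absolute deviations from the chosen center."""
    return sum(abs(x - center) for x in values) / len(values)

values = [11, 9, 13, 4, 7]   # illustrative data reused from the median example

x_bar = mean(values)
me = median(values)

md_mean = mean_deviation(values, x_bar)
md_median = mean_deviation(values, me)

coef_md_mean = md_mean / x_bar * 100      # coefficient of MD (mean), in percent
coef_md_median = md_median / me * 100     # coefficient of MD (median), in percent

print(f"MD(mean) = {md_mean:.2f}, coefficient = {coef_md_mean:.2f}%")
print(f"MD(median) = {md_median:.2f}, coefficient = {coef_md_median:.2f}%")
```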

Variance and Standard Deviation


Variance and standard deviation are also based on the deviations from the mean.
Variance: The arithmetic mean of the squared deviations from the mean.
When population data are used, the variance is denoted by σ² and the standard
deviation by σ.
When sample data are used, the variance is denoted by s² and the standard
deviation by s.
The formula for the population variance is:

$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$$
An equivalent form of the population variance is more convenient for calculation:

$$\sigma^2 = \frac{\sum x_i^2 - \frac{(\sum x_i)^2}{N}}{N}$$

The formula for the sample variance is:

$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n-1}$$

An equivalent form of the sample variance is more convenient for calculation:

$$s^2 = \frac{\sum x_i^2 - \frac{(\sum x_i)^2}{n}}{n-1}$$

Since our goal is to find an average of squared deviations from the mean, one
would expect division by n. So why is the sample variance found by dividing by (n-1)?
If we were to take a very large number of samples, each of size n, from the
population and compute the sample variance of each, then the average of all these sample
variances would be the population variance, σ². For now, we rely on
mathematical statisticians who have shown that, if the population variance is
unknown, a sample variance is a better estimator of the population variance if the
denominator in the sample variance is (n-1) rather than n.
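This claim can be illustrated without random sampling by listing every possible sample. The sketch below assumes a small illustrative population of five values and enumerates all ordered samples of size 2 drawn with replacement; it then averages the sample variances computed with divisor (n-1) and with divisor n. `statistics.variance` uses (n-1) and `statistics.pvariance` uses n.

```python
from itertools import product
from statistics import mean, pvariance, variance

population = [12, 20, 16, 18, 19]    # small illustrative population (assumed)
sigma2 = pvariance(population)       # population variance, divisor N

# Every ordered sample of size 2 drawn with replacement (5 * 5 = 25 samples).
samples = list(product(population, repeat=2))

avg_div_n_minus_1 = mean(variance(s) for s in samples)   # divisor n - 1
avg_div_n = mean(pvariance(s) for s in samples)          # divisor n

print(f"Population variance:               {sigma2}")
print(f"Average sample variance (n - 1):   {avg_div_n_minus_1}")
print(f"Average sample variance (n):       {avg_div_n}")
```

Under this sampling scheme the (n-1) version averages out to the population variance, while the n version falls short of it.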

Computing variance for a frequency distribution:

The formula for the population variance is:

$$\sigma^2 = \frac{\sum f_i (x_i - \mu)^2}{N}$$

The formula for the sample variance is:

$$s^2 = \frac{\sum f_i (x_i - \bar{x})^2}{n-1}$$
Standard Deviation – The standard deviation is the square root of the variance.
To compute the variance requires squaring the distances, which then changes the
unit of measurement to square units. The standard deviation, which is the square
root of variance, restores the data to their original measurement unit. If the
original measurements were in feet, the variance would be in feet squared, but the
standard deviation would be in feet. The standard deviation measures the average
spread around the mean.

The formula for the population standard deviation is:

$$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$$

The formula for the sample standard deviation is:

$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}}$$
Example:
A professor teaches two large sections of introductory statistics and
randomly selects a sample of test scores from both sections. Find the range
and standard deviation for each sample.
Section 1: 50 60 70 80 90
Section 2: 72 68 70 74 66
Solution: Although the average grade for both sections is 70, we notice that the
grades in section 2 are closer to the mean, 70, than are the grades in section 1. And
just as we would expect, the range of section 1, which is 40, is larger than the range of
section 2, which is 8.
Similarly, we would expect the standard deviation for section 1 to be greater than
the standard deviation for section 2.
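The expectation can be confirmed numerically. The sketch below treats both sets of scores as samples, so it uses `statistics.stdev` (divisor n-1); variable names are illustrative.

```python
from statistics import mean, stdev

section_1 = [50, 60, 70, 80, 90]
section_2 = [72, 68, 70, 74, 66]

for name, scores in (("Section 1", section_1), ("Section 2", section_2)):
    score_range = max(scores) - min(scores)
    s = stdev(scores)   # sample standard deviation, divisor n - 1
    print(f"{name}: mean = {mean(scores)}, range = {score_range}, s = {s:.2f}")

s1 = stdev(section_1)
s2 = stdev(section_2)
```

Both sections have the same mean, but the sample standard deviation of section 1 is several times that of section 2.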
Problem: The hourly wages of part-time graduates of BSMRAAU at an aviation
company are: $12, $20, $16, $18, $19.
Compute the variance and standard deviation (sample and population).

Solution:
Hourly wage (X)   (X - X̄)   (X - X̄)²
$12                -5         25
 20                 3          9
 16                -1          1
 18                 1          1
 19                 2          4
Total 85            0         40

Mean: X̄ = 85/5 = 17.
Sample: s² = 40/(5 - 1) = 10, so s = √10 ≈ 3.16.
Population: σ² = 40/5 = 8, so σ = √8 ≈ 2.83.

Coefficient of variation
The coefficient of variation (CV) is one of the important measures of dispersion
that attempts to measure the variability in data relative to the mean.
If the standard deviations in sales for large and small stores selling similar goods
are compared, the standard deviation for large stores will almost always be
greater. A simple explanation is that a large store could be modeled as a number of
small stores. Comparing variation using the standard deviation would be
misleading. The coefficient of variation overcomes this problem by adjusting for
the scale of units in the population.

COEFFICIENT OF VARIATION: The ratio of the standard deviation to the
arithmetic mean, expressed as a percent.
In terms of a formula for a sample:

$$CV = \frac{s}{\bar{x}} \times 100$$

[When to use (CV)]


• The data are in different units (such as dollars and days absent).

• The data are in the same units, but the means are far apart (such as the
incomes of top executives and the incomes of the unskilled employees).

Example:
The combined grade point averages (CGPA) in different semesters of eight
students from two sections in a statistics course at BSMRAAU are:

Section   CGPA in semesters
A         2.5  2.5  3.0  3.5  3.5  4.0  3.5  3.5
B         2.5  3.0  4.0  4.0  4.0  2.0  2.5  4.0

Which section of students would you consider better throughout the courses
of study?
Solution: For section A,

$$CV = \frac{s}{\bar{x}} \times 100 = 15.38\%$$

For section B,

$$CV = \frac{s}{\bar{x}} \times 100 = 24.31\%$$

It is observed that the average CGPA of students in both sections is the same, while the
C.V. of A is less than the C.V. of B. This implies that the students from section
A performed better than the students from section B throughout the courses of study: the
performance of A is more homogeneous across all semesters.
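The two coefficients can be recomputed as follows. One caveat, stated as an observation rather than the author's stated method: the printed figures are reproduced when the standard deviation is computed with divisor N (`statistics.pstdev`), so the sketch uses that; with unrounded intermediate values the result for section B differs from the printed figure in the second decimal place.

```python
from statistics import mean, pstdev

section_a = [2.5, 2.5, 3.0, 3.5, 3.5, 4.0, 3.5, 3.5]
section_b = [2.5, 3.0, 4.0, 4.0, 4.0, 2.0, 2.5, 4.0]

def coefficient_of_variation(values):
    """Standard deviation (divisor N, an assumption) over the mean, in percent."""
    return pstdev(values) / mean(values) * 100

cv_a = coefficient_of_variation(section_a)
cv_b = coefficient_of_variation(section_b)

print(f"Section A: mean = {mean(section_a)}, CV = {cv_a:.2f}%")
print(f"Section B: mean = {mean(section_b)}, CV = {cv_b:.2f}%")
```

Swapping `pstdev` for `stdev` (divisor n-1) raises both coefficients but leaves the ordering, and therefore the conclusion, unchanged.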

Interpretations on arbitrary data

Mean: On average, the hourly wage of part-time graduates at aviation
companies is about TK. 2226.056 (thousands).
Median: 50% of the hourly wages of part-time graduates at aviation companies
are less than about TK. 2110.65 (thousands), and 50% of the hourly wages of part-time
graduates at aviation companies are more than TK. 2110.65 (thousands).
Mode: The most frequently occurring hourly wage of part-time graduates at aviation
companies is about TK. 2090.3 (thousands).
Standard Deviation: On average, the actual hourly wages of part-time
graduates at aviation companies differ from the mean hourly
wage, which is about TK. 2226.056 thousand, by TK. 466.7437 thousand.
Variance: The average squared deviation of the hourly wages of part-time graduates at
aviation companies from the mean is about TK. 217849.7 thousand (in squared units).
Range: The range of the hourly wages for part-time graduates at aviation
companies is about TK. 2059.4 (thousands), where the highest hourly wage is
TK. 3450.3 (thousands) and the lowest hourly wage is TK. 1390.9 (thousands).
