Hs h22 Topic1 Mean
HS 522
Topic 1, Mean
Numerical Measures
• Many numerical measures are used to
describe the data set. We will pick up what is
most important for us.
• Central tendency: which measurement
represents the data set?
• Dispersion/variability: how are the data
spread around the representative
measure?
Why Important?
• What is the average infant mortality rate in
India?
• Which states show higher variability?
• Higher variability means lower stability.
• Similarly, in the case of income distribution,
higher variability means higher inequality.
09/08/2023
Distribution-1 (Symmetric)
Most data points are around the red dot.
Frequencies fall on both sides of red dot.
An Example
x f
1 1
2 2
3 3
4 8
5 3
6 2
7 1
An Example
X freq
1 1
2 2
3 1
4 3
5 2
6 30
7 1
An Example
X freq
1 1
2 5
3 2
4 0
5 1
6 7
7 1
Discussion
• For distribution 1, a center is probably well
defined.
• Not so for distributions 2 and 3.
• As there are different notions of center, there
are different notions of measure of central
tendency.
• Those are called mean, median and mode.
• We take up the issue of mean first.
Arithmetic Mean
• The arithmetic mean, or sometimes simply the mean, is
computed by adding the data points and dividing by the total
number of observations.
• In symbols, for a sample of n observations,
$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
Example
• Data: Y = 0, 2, 3, 4
• Assuming this to be a sample, the sample
mean is $\bar{Y} = \frac{\sum_{i} Y_i}{n} = \frac{9}{4} = 2.25$
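• Given $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$ with $\bar{x} = 8$ and $n = 15$, what is $\sum_{i=1}^{n} x_i$?
$\sum_{i=1}^{n} x_i = n \cdot \bar{x} = 15 \times 8 = 120$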
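The two calculations above can be checked with a few lines of Python. This is only a minimal sketch; the function name `arithmetic_mean` is an illustrative choice and not part of the course material.

```python
def arithmetic_mean(values):
    """Return the sum of the observations divided by their count."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


# Sample from the example: Y = 0, 2, 3, 4
y = [0, 2, 3, 4]
y_bar = arithmetic_mean(y)

# Going the other way: if the mean is 8 and n = 15,
# the sum of the observations must be n * mean.
n, x_bar = 15, 8
total_from_mean = n * x_bar

print(y_bar, total_from_mean)
```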
Mean As Representation
• Suppose we have “n” observations.
• If we add the mean to itself n times, the result
equals the sum of the data set.
• If we erase every number and replace it with the
arithmetic mean (AM), the sum of the data set
is preserved.
Mean As Centre
• Subtract mean from each observation and
sum up.
Y          Y - (mean)
0          -2.25
2          -0.25
3           0.75
4           1.75
Sum = 9    Sum = ?
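The question in the last row of the table can be explored numerically. The sketch below, with variable names chosen only for illustration, subtracts the mean from each observation and adds up the deviations. For this data set the deviations cancel exactly; with other data, floating-point rounding can leave a tiny non-zero remainder.

```python
# Data from the table: Y = 0, 2, 3, 4
y = [0, 2, 3, 4]
y_bar = sum(y) / len(y)

# Deviation of each observation from the mean
deviations = [value - y_bar for value in y]

# Negative and positive deviations balance around the mean
sum_of_deviations = sum(deviations)

print(deviations, sum_of_deviations)
```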
Example
• The scores relate to two groups within a class,
say Group X for English speakers (4 in number)
and Y for non-English speakers (5 in number)
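Group         Scores           Sum
X (nx = 4)    1, 2, 1, 3       ΣX = 7
Y (ny = 5)    2, 4, 6, 6, 2    ΣY = 20
• Here, $\bar{x} = 1.75$ and $\bar{y} = 4$.
• Thus the mean of the whole class is $\bar{z} = \frac{27}{9} = 3$.
• But notice that $\frac{4}{9} \times 1.75 + \frac{5}{9} \times 4 = 3$.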
In Words
Suppose there are two groups (1 and 2) in a
class/sample. Then
Mean of the class =
(fraction of the class in group 1) * (mean of
group 1) + (fraction of the class in group 2) *
(mean of group 2).
• Can be extended beyond two groups.
Another Example
• In a class of 100, 40 students are English and 60 are
Spanish. The teacher gave an IQ test. The
English students' mean is 30 and the Spanish
students' mean is 50. What is the mean of the
class?
Solution
• Total number= (60+40)=100.
• English fraction = 40/100= 2/5
• Spanish fraction = 60/100 = 3/5
• Thus, class mean = (2/5)*30 + (3/5)*50 = ?
• The fractions 3/5, 2/5, etc. are called the weights
attached to each group.
• A mean computed this way is called a weighted mean.
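The weighted-mean rule can be sketched in Python for any number of groups. The function name `weighted_mean` and its argument names are assumptions made for this illustration. Each group mean is multiplied by that group's share of the observations, exactly as in the solution above.

```python
def weighted_mean(group_sizes, group_means):
    """Combine group means, weighting each by its fraction of all observations."""
    if len(group_sizes) != len(group_means):
        raise ValueError("need one size per group mean")
    total = sum(group_sizes)
    if total <= 0:
        raise ValueError("the groups must contain at least one observation")
    return sum((n / total) * m for n, m in zip(group_sizes, group_means))


# Class of 100: 40 English students (mean 30) and 60 Spanish students (mean 50)
class_mean = weighted_mean([40, 60], [30, 50])

# Earlier example: group X (4 scores, mean 1.75) and group Y (5 scores, mean 4)
scores_mean = weighted_mean([4, 5], [1.75, 4])

print(round(class_mean, 6), round(scores_mean, 6))
```

Because the function accepts lists, a third or fourth group only adds one more entry to each list.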
Notation
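$\bar{z} = \frac{\sum_i z_i}{n_z} = \frac{\sum_i x_i + \sum_i y_i}{n_x + n_y}$
$= \frac{n_x \bar{x} + n_y \bar{y}}{n_x + n_y} = \frac{n_x}{n_x + n_y}\,\bar{x} + \frac{n_y}{n_x + n_y}\,\bar{y}$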
Xi    fi    fi*Xi
1     3     3
2     2     4
3     2     6
4     3     12
7     2     14
8     1     8
Σfi = 13 = n    Σfi*Xi = 47
$\bar{x} = \frac{\sum_{i} f_i X_i}{n} = \frac{47}{13} \approx 3.62$
$\bar{x} = \frac{1+1+1+2+2+3+3+4+4+4+7+7+8}{13}$
$= \frac{1 \times 3 + 2 \times 2 + 3 \times 2 + 4 \times 3 + 7 \times 2 + 8 \times 1}{13}$
$= \frac{1 f_1 + 2 f_2 + 3 f_3 + 4 f_4 + 7 f_5 + 8 f_6}{13}$
Another Example
• X=(4,4,6,6,6,8,10,10)
Xi fi fi *Xi
4 2 8
6 3 18
8 1 8
10 2 20
Σfi= 8 ΣfiXi= 54
$\bar{x} = \frac{\sum_{i} f_i X_i}{n} = \frac{54}{8} = 6.75$
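The frequency-table calculation can be sketched as follows. The function name `mean_from_frequencies` is hypothetical. It multiplies each distinct value by its frequency, adds the products, and divides by the total frequency, then compares the result with the mean of the raw list.

```python
def mean_from_frequencies(values, frequencies):
    """Mean of a data set given as distinct values and their frequencies."""
    if len(values) != len(frequencies):
        raise ValueError("need one frequency per value")
    n = sum(frequencies)                      # total number of observations
    if n <= 0:
        raise ValueError("total frequency must be positive")
    weighted_total = sum(f * x for x, f in zip(values, frequencies))
    return weighted_total / n


# Table from the example: Xi = 4, 6, 8, 10 with fi = 2, 3, 1, 2
x_bar = mean_from_frequencies([4, 6, 8, 10], [2, 3, 1, 2])

# The same data written out in full
raw = [4, 4, 6, 6, 6, 8, 10, 10]
raw_mean = sum(raw) / len(raw)

print(x_bar, raw_mean)
```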
• AM cannot be computed.
Geometric Mean
• Definition.
• Computation
• The rationale.
• Importance in Social Sciences.
GM: Example-1
• What is the GM of 1, 3, 9?
• Here, x1*x2*x3 = 1*3*9 = 27 (n = 3).
• So we want a number G such that
G^3 = 27
G = 3
Ans: The GM of 1, 3, and 9 is 3.
GM: Example-2
• What is the GM of 2, 8?
• Here, x1*x2 = 2*8 = 16 (n = 2).
• So we want a number G such that
G^2 = 16
G = 4
Ans: The GM of 2 and 8 is 4.
GM: Example-3
• What is the GM of 2, 4, 8?
• Here, x1*x2*x3 = 2*4*8 = 64 (n = 3).
• So we want a number G such that
G^3 = 64
G = 4
Ans: The GM of 2, 4, and 8 is 4.
Problem
• If you do not have a calculator, computing the
GM is difficult, even for simple cases.
• What is the GM of 1, 2, 4, 8?
• Here, x1*x2*x3*x4 = 1*2*4*8 = 64 (n = 4).
• So we want a number G such that
G^4 = 64
The answer is not simple to find! (2.83 approx.)
Ans: The GM of 1, 2, 4, and 8 is approximately 2.83.
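In all four examples the GM is the n-th root of the product of the n values, which a computer can evaluate directly. The following is a minimal sketch under that reading; the function name `geometric_mean` is chosen for illustration, and it assumes Python 3.8 or later for `math.prod`.

```python
import math


def geometric_mean(values):
    """n-th root of the product of n strictly positive numbers."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("GM needs a non-empty data set of positive numbers")
    return math.prod(values) ** (1 / len(values))


gm_example_1 = geometric_mean([1, 3, 9])      # cube root of 27
gm_example_2 = geometric_mean([2, 8])         # square root of 16
gm_example_3 = geometric_mean([2, 4, 8])      # cube root of 64
gm_problem = geometric_mean([1, 2, 4, 8])     # fourth root of 64

print(round(gm_example_1, 2), round(gm_example_2, 2),
      round(gm_example_3, 2), round(gm_problem, 2))
```

The results are rounded before printing because fractional powers are computed in floating point and may differ from the exact root in the last digits.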
AM as Representative Data
• AM represents the data set in the sense that
if we replace every value of a data set by the
AM, then the sum of the data set is
preserved.
Example: (1,3,3,9). The sum of the data is 16.
The AM is 4.
If we have the data set (4,4,4,4), the sum is
preserved.
GM as Representative Data
• GM represents the data set in the sense that
if we replace every value of a data set by the
GM, then the product of the data set is
preserved.
Example: (1,3,3,9). The product of the data is
81. The GM is 3 (3^4 = 81).
If we have the data set (3,3,3,3), the product is
preserved.
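Both representation properties can be verified side by side on the data set (1,3,3,9). This is an illustrative sketch with variable names of our own choosing. `math.isclose` is used for the comparisons because roots are computed in floating point.

```python
import math

data = [1, 3, 3, 9]
n = len(data)

am = sum(data) / n                 # arithmetic mean
gm = math.prod(data) ** (1 / n)    # geometric mean: fourth root of the product

# Replace every value by the AM: the sum is unchanged
sum_preserved = math.isclose(sum([am] * n), sum(data))

# Replace every value by the GM: the product is unchanged
product_preserved = math.isclose(math.prod([gm] * n), math.prod(data))

print(am, round(gm, 6), sum_preserved, product_preserved)
```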
Example
• I have deposited 100 rupees in a bank. The bank
offers an interest rate of 1% in the first year, 3% in
the second year, and 8% in the third year. What is
the average interest rate offered by the bank?
• Value of money after 1st year = 100(1+.01) = 101
• Value of money after 2nd year = 101(1+.03) = 104.03
• Value of money at the end of the 3rd year:
104.03(1+.08) = 112.3524 (the bank will actually give
112.35).
• (Highly unrealistic, but let us keep the value.)
Question: AM
• Which interest rate, applied uniformly over the 3
years, will give us 112.3524? (That is the mean
interest rate quoted by the bank.)
• Use AM.
• Over the three years, the money has grown by
the factors 1.01, 1.03 and 1.08, respectively.
• AM = 1.04 → 4% interest rate each year.
• After 3 years, we will get
100(1+.04)(1+.04)(1+.04) = 112.4864.
• Error = 112.4864 - 112.3524 = .134
Question: GM
• Use GM.
• Over the three years, the money has grown by
the factors 1.01, 1.03 and 1.08, respectively.
• GM of the factors: G^3 = 1.01*1.03*1.08 =
1.1235 → G = 1.0396 (use a calculator) (interest rate:
3.96%).
• Over three years, we have
100*(1+.0396)*(1+.0396)*(1+.0396) = 112.3567
• Error = 112.3567 - 112.3524 = .0043
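The comparison between AM and GM for the bank example can be reproduced in a few lines. The sketch below uses illustrative variable names and assumes Python 3.8 or later for `math.prod`. It keeps the GM unrounded, in which case the GM-based value matches the true final value up to floating-point precision; the small error of .0043 on the slide comes from rounding G to 1.0396.

```python
import math

principal = 100
growth_factors = [1.01, 1.03, 1.08]   # 1%, 3% and 8% interest
n = len(growth_factors)

# Actual value after three years of compounding
final_value = principal * math.prod(growth_factors)

# Uniform yearly factor suggested by each kind of mean
am_factor = sum(growth_factors) / n
gm_factor = math.prod(growth_factors) ** (1 / n)

# Value after three years if the uniform factor is applied every year
am_value = principal * am_factor ** n
gm_value = principal * gm_factor ** n

print(round(final_value, 4), round(am_value, 4), round(gm_value, 4))
print(round(am_factor, 4), round(gm_factor, 4))
```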
Conclusion
• GM is appropriate if the data involve the
calculation of mean growth rates.
• Inflation, GDP, population: the list is nearly
endless (especially in Development Studies).
A Fictional Country
• Three index scores: health H = .5, income Y = .5, education E = .5. AM = GM = .5
• Now suppose E and H both fall to .4, while the country
raises its Y score to .7 (think about relaxing child
labor laws, which takes a toll on health and
education but boosts GDP).
• AM = .5 (as before)
• GM = .48 (approx.)
• GM would be even lower if some of these indicators
came close to zero, even if one boosts Y to 1.
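The fictional-country comparison can be reproduced numerically. In this sketch the helper names `am` and `gm` are chosen for illustration, and the `near_zero` scores (.01, 1, .01) are hypothetical values picked only to show the last bullet; they are not from the slides.

```python
import math


def am(values):
    """Arithmetic mean."""
    return sum(values) / len(values)


def gm(values):
    """Geometric mean of strictly positive values."""
    return math.prod(values) ** (1 / len(values))


# Scores in the order H, Y, E
before = [0.5, 0.5, 0.5]
after = [0.4, 0.7, 0.4]          # H and E fall to .4, Y rises to .7
near_zero = [0.01, 1.0, 0.01]    # hypothetical: H and E near zero, Y boosted to 1

for label, scores in [("before", before), ("after", after), ("near zero", near_zero)]:
    print(label, round(am(scores), 2), round(gm(scores), 2))
```

The AM stays the same when a gain in Y offsets losses in H and E, while the GM falls, and it falls much further when any single score approaches zero.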
Moral
• If one uses GM, then it is hard to substitute
and compensate between indicators,
compared to AM.
• Therefore, GM is used for calculation of HDI.
• There is more to HDI calculation than this.
• However, GM also suffers from the problem of
extremely large values (outliers).
• If any observation is 0 or negative, GM cannot
be calculated.
Reference
• GW, Chapter 3