6 InequalityMeasures

Inequality measures
STATISTICS AND PROGRAMMING
Laura Anderlucci
University of Bologna
The point is to measure how uniformly resources are distributed.
The focus is on measuring the variability of a special kind of quantitative variable, the so-called transferable variables.
Examples of transferable variables are income, wealth, etc. for which
the values held by the statistical units under study can be altered (e.g.
by government redistribution policies). In particular, the values can be
exchanged between the units.
In economics terminology, a variable is said to be more or less concentrated depending on whether the share of its total held by a given proportion of units is greater or smaller.
TWO EXTREME SCENARIOS
The distribution has two possible extremes: one is the most equal and the other is the least equal.
• EQUIDISTRIBUTION = Each fraction of the sample carries the corresponding fraction of the variable. All persons have the same income, which is equal to the mean income.
• MAX CONCENTRATION = One person holds the variable's entire total; everybody else has nothing.
Gini Coefficient
Given the series of observations $x_1, x_2, \dots, x_N$ on a transferable variable $X$ and the corresponding ordered series $x_{(1)}, x_{(2)}, \dots, x_{(N)}$, denote by
$$A_i = x_{(1)} + x_{(2)} + \dots + x_{(i)}$$
the total amount of variable $X$ observed on the first $i$ units, arranged in non-decreasing order of their $X$ value.
Obviously, $A_N = N \cdot \mu$ is the total over all observations, referred to as the variable's total. It is the amount of the variable carried by the whole sample of $N$ units. So, the last person completes the variable's total.
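The cumulative totals can be sketched in a few lines of Python. This is only an illustration: the function name `cumulative_totals` is made up here, and the data are the five hypothetical household incomes used in the worked example later in these notes, entered unsorted on purpose.

```python
from itertools import accumulate

def cumulative_totals(values):
    """Return A_1, ..., A_N: running sums of the values sorted in non-decreasing order."""
    ordered = sorted(values)
    return list(accumulate(ordered))

# Hypothetical incomes, deliberately given out of order.
incomes = [9.5, 3.5, 12, 7, 6]
A = cumulative_totals(incomes)
mean = sum(incomes) / len(incomes)

print(A)                            # cumulative totals A_1, ..., A_N
print(A[-1], len(incomes) * mean)   # A_N and N * mu coincide
```

The last element of `A` is the variable's total, which matches N times the mean.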
Define the two ratios, with $i = 1, 2, \dots, N$:
* $P_i = \frac{i}{N}$, i.e. the fraction of units, with respect to the total $N$, for which $X \le x_{(i)}$,
* $Q_i = \frac{A_i}{A_N}$, i.e. the fraction of the variable's total pertaining to such units.
In the case of
• perfect equality: $A_i = i \cdot \mu$ and then $Q_i = (i \cdot \mu)/(N \cdot \mu) = P_i$,
• perfect inequality: $Q_1 = Q_2 = \dots = Q_{N-1} = 0$ and $Q_N = 1$.
Hence, the more the $Q_i$ differ from the $P_i$, $i = 1, 2, \dots, N-1$, the greater the inequality in our data.
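As a rough illustration of the two ratios and the two extreme cases, the sketch below computes the pairs (P_i, Q_i) for a perfectly equal and a maximally concentrated series. The helper name `cumulative_shares` and the two toy data sets are assumptions made for this example only.

```python
from itertools import accumulate

def cumulative_shares(values):
    """Return the lists (P_1..P_N) and (Q_1..Q_N) for a series of non-negative values."""
    ordered = sorted(values)
    n = len(ordered)
    totals = list(accumulate(ordered))   # A_1, ..., A_N
    grand_total = totals[-1]             # A_N, the variable's total
    p = [(i + 1) / n for i in range(n)]  # P_i = i / N
    q = [a / grand_total for a in totals]  # Q_i = A_i / A_N
    return p, q

# Perfect equality: everyone holds the mean, so Q_i = P_i.
p_eq, q_eq = cumulative_shares([10, 10, 10, 10])

# Perfect inequality: one unit holds everything, so Q_1 = ... = Q_{N-1} = 0 and Q_N = 1.
p_max, q_max = cumulative_shares([0, 0, 0, 40])

print(list(zip(p_eq, q_eq)))
print(list(zip(p_max, q_max)))
```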
Let $X$ be a transferable variable, and let $x_1, x_2, \dots, x_N$ be a series of observations on $X$. A measure of inequality of these data is given by the Gini coefficient:
$$G = \frac{\sum_{i=1}^{N-1} (P_i - Q_i)}{\sum_{i=1}^{N-1} P_i}.$$
The following relationship holds:
$$\sum_{i=1}^{N-1} (P_i - Q_i) \le \sum_{i=1}^{N-1} P_i.$$
The expression on the left-hand side is a measure of inequality, which is 0 in the case of perfect equality and equal to $\sum_{i=1}^{N-1} P_i$ in the case of perfect inequality.
⇒ $G$ ranges from 0 to 1; it is a relative measure of variability, suitable for comparing two or more series of observations.
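A minimal sketch of the ratio of the two sums, assuming non-negative data with a positive total and at least two observations. The function name `gini` and the input checks are choices made for this illustration, not part of the slides.

```python
def gini(values):
    """Gini coefficient: sum of (P_i - Q_i) over sum of P_i, for i = 1, ..., N-1."""
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    if n < 2 or total <= 0:
        raise ValueError("need at least two observations and a positive total")

    running = 0.0    # A_i, the cumulative total
    sum_p = 0.0      # denominator: sum of P_i
    sum_diff = 0.0   # numerator: sum of (P_i - Q_i)
    for i, x in enumerate(ordered[:-1], start=1):   # i runs from 1 to N-1
        running += x
        p_i = i / n
        q_i = running / total
        sum_p += p_i
        sum_diff += p_i - q_i
    return sum_diff / sum_p

print(gini([10, 10, 10, 10]))  # perfect equality
print(gini([0, 0, 0, 40]))     # perfect inequality
```

The two calls reproduce the extremes of the range: 0 when every Q_i equals P_i, and 1 when all Q_i up to N−1 are zero.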
CONSIDERATIONS
• The Gini coefficient measures how far the distribution is from the equidistribution scenario, taking into account the distance between $P_i$ and $Q_i$.
• These differences are cumulated. The larger the differences, the larger the inequality.
• If every difference is 0, equidistribution is reached.
MEANING OF THE EXTREMES
• G = 1: numerator and denominator are equal, because $Q_i = 0$ for all $i = 1, \dots, N-1$.
• G = 0: the numerator is equal to 0, as $P_i = Q_i$ for all $i$.
Example - Gini Coefficient
Suppose the monthly incomes of five households are (hypothetical data):
3.5, 6, 7, 9.5, 12.
i       x(i)    Ai      Qi                 Pi           Pi − Qi
1        3.5     3.5     3.5/38 = 0.09     1/5 = 0.2    0.11
2        6.0     9.5     9.5/38 = 0.25     2/5 = 0.4    0.15
3        7.0    16.5    16.5/38 = 0.43     3/5 = 0.6    0.17
4        9.5    26.0    26.0/38 = 0.68     4/5 = 0.8    0.12
5       12.0    38.0    38.0/38 = 1.00     5/5 = 1.0    -
Total   38.0    -       -                  -            0.55
The Gini coefficient is:
$$G = \frac{\sum_{i=1}^{4} (P_i - Q_i)}{\sum_{i=1}^{4} P_i} = \frac{0.55}{2} = 0.275$$
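The worked example can be checked with a short script. The sketch below is an illustration under one assumption worth flagging: the slide rounds each Q_i to two decimals before taking differences, so the script computes both that rounded version and the unrounded one. All variable names are made up for the example.

```python
incomes = [3.5, 6, 7, 9.5, 12]   # hypothetical household incomes from the example
n = len(incomes)
total = sum(incomes)

p, q = [], []
running = 0.0
for i, x in enumerate(sorted(incomes), start=1):
    running += x            # A_i
    p.append(i / n)         # P_i
    q.append(running / total)  # Q_i

sum_p = sum(p[:-1])         # sum of P_i for i = 1, ..., N-1

# Unrounded calculation.
g_exact = sum(pi - qi for pi, qi in zip(p[:-1], q[:-1])) / sum_p

# Same calculation with Q_i rounded to two decimals, as in the table.
g_rounded = sum(pi - round(qi, 2) for pi, qi in zip(p[:-1], q[:-1])) / sum_p

print(f"G with rounded Q_i: {g_rounded:.3f}")
print(f"G without rounding: {g_exact:.4f}")
```

With rounded shares the result agrees with the 0.275 shown above; without intermediate rounding the value is slightly lower, about 0.27.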
Gini coefficient - Properties
• It ranges from 0 to 1 (0 = “perfect equality”, 1 = “perfect inequality”);
• It decreases if we add the same positive number to each observation;
• It does not change if we multiply all observations by a positive number;
• It satisfies the so-called transfer principle: given two observations $x_{(i)}$ and $x_{(j)}$ with $x_{(i)} < x_{(j)}$, if we decrease $x_{(i)}$ by an amount $\tau$ and increase $x_{(j)}$ by the same amount (maintaining the ranking order of observations), $G$ increases. Vice versa, a transfer of $\tau$ from $x_{(j)}$ to $x_{(i)}$ decreases inequality;
• EQUIDISTRIBUTION: $Q_i = P_i$ for all $i$;
• MAX CONCENTRATION: $x_{(N)} = N \cdot \mu$.
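These properties can be checked numerically. The sketch below is a spot check on one hypothetical data set, not a proof; the `gini` helper and the shift, scale and transfer amounts are arbitrary choices made for the illustration.

```python
from itertools import accumulate

def gini(values):
    """Gini coefficient as the ratio of sum(P_i - Q_i) to sum(P_i), i = 1, ..., N-1."""
    ordered = sorted(values)
    n, total = len(ordered), sum(ordered)
    cum = list(accumulate(ordered))
    sum_p = sum(i / n for i in range(1, n))
    sum_q = sum(a / total for a in cum[:-1])
    return (sum_p - sum_q) / sum_p

base = [3.5, 6, 7, 9.5, 12]

shifted = [x + 5 for x in base]              # same positive number added to everyone
scaled = [x * 3 for x in base]               # all observations multiplied by 3
poor_to_rich = [3.5, 6 - 1, 7, 9.5 + 1, 12]  # tau = 1 moved from x(2) to x(4), ranking kept
rich_to_poor = [3.5, 6 + 0.5, 7, 9.5 - 0.5, 12]  # tau = 0.5 moved from x(4) to x(2)

print(gini(base), gini(shifted))        # adding a constant lowers G
print(gini(base), gini(scaled))         # rescaling leaves G unchanged
print(gini(base), gini(poor_to_rich))   # transfer towards the richer unit raises G
print(gini(base), gini(rich_to_poor))   # transfer towards the poorer unit lowers G
```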
Geometric interpretation of the Gini coefficient
Given a series of observations x1, x2, . . . , xN , we now draw a graph
which depicts the level of concentration: the Lorenz curve.
1. We put $P_i$ on the horizontal axis and $Q_i$ on the vertical axis, then we plot the points $(0, 0), (P_1, Q_1), (P_2, Q_2), \dots, (P_N, Q_N)$. The curve is then obtained by connecting each pair of adjacent points with a line segment.
2. Then, the line connecting the points (0, 0) and (1, 1) is drawn (known as the line of perfect equality).
3. Finally, a piecewise line ABC is drawn, connecting the points
$$A = (0, 0), \quad B = \left(\frac{N-1}{N},\, 0\right), \quad C = (1, 1),$$
known as the curve of perfect inequality.
Denote by:
• $S$ the area between the Lorenz curve and the line of perfect equality;
• $\max S$ the area between the line of perfect equality and the curve of perfect inequality.
The triangle ABC represents the possible distributions: every Lorenz curve lies inside it.
Then the Gini coefficient can be expressed as follows:
$$G = \frac{S}{\max S}, \quad \text{where } \max S = \frac{N-1}{2N}.$$
The quantity $\max S$ is the area of the triangle ABC, having height 1 and base $(N-1)/N$.
Notice that writing $G = S/\max S$ reveals that the greater the distance between the Lorenz curve and the line of perfect equality, the greater the inequality in the data.
When $N$ is sufficiently large, the area of perfect inequality, $\max S$, is approximately 1/2:
$$\lim_{N \to \infty} \frac{N-1}{2N} = \frac{1}{2}.$$
HOW TO CALCULATE THE SHADED AREA
TRIANGLE ABC-SUM OF TRAPEZOIDS

Calculate the area of trapezoids
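One possible way to carry out the trapezoid calculation is sketched below. It assumes the area S is obtained as 1/2 (the area under the line of perfect equality) minus the area under the Lorenz curve, where the latter is the sum of the trapezoids between consecutive Lorenz points. The function names are illustrative only.

```python
from itertools import accumulate

def lorenz_points(values):
    """Return the Lorenz curve vertices (0, 0), (P_1, Q_1), ..., (P_N, Q_N)."""
    ordered = sorted(values)
    n, total = len(ordered), sum(ordered)
    cum = list(accumulate(ordered))
    return [(0.0, 0.0)] + [((i + 1) / n, a / total) for i, a in enumerate(cum)]

def gini_from_area(values):
    """Gini coefficient as S / max S, with the area under the curve from trapezoids."""
    pts = lorenz_points(values)
    n = len(values)
    # Each trapezoid has width (x1 - x0) and parallel sides y0 and y1.
    under_curve = sum((x1 - x0) * (y0 + y1) / 2
                      for (x0, y0), (x1, y1) in zip(pts, pts[1:]))
    s = 0.5 - under_curve      # area between the equality line and the Lorenz curve
    max_s = (n - 1) / (2 * n)  # area of the triangle ABC
    return s / max_s

g_area = gini_from_area([3.5, 6, 7, 9.5, 12])
print(f"{g_area:.4f}")
```

For the five hypothetical incomes of the earlier example this gives the same value as the ratio of sums computed without intermediate rounding, which is the point of the geometric interpretation.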
Income inequality
The Globalinc website allows the visualisation of the Global Income Distribution from 1980 to 2014.
For example:
Italy 1980
xi       Ai       Qi      Pi    Pi − Qi
 2104     2104    0.030   0.1   0.070
 3095     5199    0.075   0.2   0.125
 3946     9145    0.132   0.3   0.168
 4741    13886    0.200   0.4   0.200
 5536    19422    0.280   0.5   0.220
 6392    25814    0.372   0.6   0.228
 7390    33204    0.478   0.7   0.222
 8696    41900    0.603   0.8   0.197
10775    52675    0.759   0.9   0.141
16757    69432    1.000   1.0   0.000

Italy 2014
xi       Ai       Qi      Pi    Pi − Qi
 2434     2434    0.022   0.1   0.078
 4568     7002    0.062   0.2   0.138
 6172    13174    0.117   0.3   0.183
 7552    20726    0.185   0.4   0.215
 8864    29590    0.264   0.5   0.236
10226    39816    0.355   0.6   0.245
11783    51599    0.460   0.7   0.240
13803    65402    0.583   0.8   0.217
17077    82479    0.736   0.9   0.164
29658   112137    1.000   1.0   0.000

$$G_{1980} = \frac{\sum_{i=1}^{9} (P_i - Q_i)}{\sum_{i=1}^{9} P_i} = \frac{1.571}{4.5} = 0.349, \qquad G_{2014} = \frac{\sum_{i=1}^{9} (P_i - Q_i)}{\sum_{i=1}^{9} P_i} = \frac{1.716}{4.5} = 0.381$$
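The two coefficients can be reproduced from the x_i columns alone. The sketch below treats the ten tabulated values per year as the observations, which is an assumption about how the table was built; the `gini` helper is illustrative.

```python
from itertools import accumulate

# x_i columns from the two tables (Italy, 1980 and 2014).
italy_1980 = [2104, 3095, 3946, 4741, 5536, 6392, 7390, 8696, 10775, 16757]
italy_2014 = [2434, 4568, 6172, 7552, 8864, 10226, 11783, 13803, 17077, 29658]

def gini(values):
    """Gini coefficient as the ratio of sum(P_i - Q_i) to sum(P_i), i = 1, ..., N-1."""
    ordered = sorted(values)
    n, total = len(ordered), sum(ordered)
    cum = list(accumulate(ordered))           # A_1, ..., A_N
    sum_p = sum(i / n for i in range(1, n))   # 4.5 when N = 10
    sum_q = sum(a / total for a in cum[:-1])
    return (sum_p - sum_q) / sum_p

g_1980 = gini(italy_1980)
g_2014 = gini(italy_2014)
print(f"G 1980 = {g_1980:.3f}")   # G 1980 = 0.349
print(f"G 2014 = {g_2014:.3f}")   # G 2014 = 0.381
```

On these figures the coefficient is higher in 2014 than in 1980, in line with the values reported above.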
Examples: Global Income Distribution
It allows for some comparison:
[Figure: Gini coefficient, period 1980−2014. Countries: Brazil, China, Italy, Russian Federation and USA. Horizontal axis: Year (1980 to 2010); vertical axis: Gini (0.3 to 0.6).]
Data source: https://tinyco.re/9553483
Heterogeneity
The variability measures used for quantitative variables cannot be applied to nominal (categorical) variables. So the concept of heterogeneity is introduced to assess the variability of a qualitative data distribution.
Let’s consider the following percentage distributions about religion:
                  Distribution 1    Distribution 2
Xi                pi                pi
Catholicism       25%               0%
No religion       25%               0%
Islam             25%               0%
Other religion    25%               100%
Total             100%              100%
Which one is the most homogeneous?
Answer: the second one.
For the frequency distribution of a nominal variable, various indices can be introduced that measure the degree of diversity between the observations.
HOMOGENEITY
• Consider a frequency distribution of a nominal variable. We say that
the frequency distribution is perfectly homogeneous when all
observations fall into a single category.
HETEROGENEITY
• On the contrary, we say that the frequency distribution is
perfectly heterogeneous when all categories have the
same frequency.
Measures of heterogeneity
The following indices are measures of heterogeneity:
$$e_1 = 1 - \sum_{i=1}^{k} f_i^2, \qquad 0 \le e_1 \le \frac{k-1}{k}$$
$$e_2 = -\sum_{i=1}^{k} f_i \ln(f_i), \qquad 0 \le e_2 \le \ln(k)$$
where $f_i$, $i = 1, 2, \dots, k$, are the relative frequencies.
* $e_1$ attains its minimum, 0, when the frequency distribution is perfectly homogeneous; it attains its maximum, $(k-1)/k$, when the distribution is perfectly heterogeneous.
* Similarly, $e_2$ attains its minimum, 0, when the frequency distribution is perfectly homogeneous; it attains its maximum, $\ln(k)$, when the frequency distribution is perfectly heterogeneous.
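The two indices and their extreme values can be sketched as follows. The function names are made up for this illustration, and the code assumes the usual convention that a category with zero frequency contributes nothing to e2 (0 · ln 0 is taken as 0).

```python
import math

def e1(freqs):
    """Heterogeneity index e1 = 1 - sum of squared relative frequencies."""
    return 1 - sum(f * f for f in freqs)

def e2(freqs):
    """Heterogeneity index e2 = -sum of f * ln(f); zero frequencies are skipped."""
    return -sum(f * math.log(f) for f in freqs if f > 0)

k = 4
perfectly_heterogeneous = [1 / k] * k      # all categories equally frequent
perfectly_homogeneous = [0, 0, 0, 1]       # all observations in one category

print(e1(perfectly_heterogeneous), (k - 1) / k)   # maximum of e1
print(e2(perfectly_heterogeneous), math.log(k))   # maximum of e2
print(e1(perfectly_homogeneous), e2(perfectly_homogeneous))  # both minima
```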
OBSERVATIONS ON HETEROGENEITY
• E1 and E2 can be used interchangeably
• E1 is always less than 1
• E2 can be greater than 1 (its maximum, ln(k), exceeds 1 when k ≥ 3)
• By dividing E1 and E2 by their theoretical maximum, a RELATIVE MEASURE OF HETEROGENEITY IS OBTAINED.
• This relative measure is known as the NORMALIZED VERSION
NORMALIZED VERSIONS
• Normalized versions of E1 and E2 range between 0 and 1
• Normalized versions are better suited for comparisons and for assessing how heterogeneous a distribution is with respect to the maximum heterogeneity technically possible
ASSESS HETEROGENEITY
In order to assess heterogeneity:

• REPORT RANGE
• COMPUTE NORMALIZED RANGE
Example - Heterogeneity
Let's consider the percentage distribution of the French population by religion (source: Eurobarometer, 2018). Evaluate the heterogeneity.
Xi                pi
Catholicism       50%
No religion       38%
Islam              5%
Other religion     7%
Total            100%
$$e_1 = 1 - (0.50^2 + 0.38^2 + 0.05^2 + 0.07^2) = 0.60$$
$$e_2 = -(0.50 \ln 0.50 + 0.38 \ln 0.38 + 0.05 \ln 0.05 + 0.07 \ln 0.07) = 1.050$$
Given max $e_1 = 0.75$ and max $e_2 = \ln(4) = 1.386$, their normalized values are 0.8 and 0.758, respectively.
⇒ Quite high heterogeneity.
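The figures in this example can be reproduced with a few lines of Python. This is a sketch for checking the arithmetic; the variable names are illustrative and the frequencies are the four percentages from the table.

```python
import math

# Relative frequencies: Catholicism, No religion, Islam, Other religion.
freqs = [0.50, 0.38, 0.05, 0.07]
k = len(freqs)

e1 = 1 - sum(f ** 2 for f in freqs)
e2 = -sum(f * math.log(f) for f in freqs)

# Normalized versions: each index divided by its theoretical maximum.
e1_norm = e1 / ((k - 1) / k)
e2_norm = e2 / math.log(k)

print(f"e1 = {e1:.2f}, normalized = {e1_norm:.1f}")
print(f"e2 = {e2:.3f}, normalized = {e2_norm:.3f}")
```

Both normalized values are well above one half, which is what supports the "quite high heterogeneity" reading.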