
Inequality measures

STATISTICS AND PROGRAMMING

Laura Anderlucci

University of Bologna
Inequality measures

The aim is to measure how uniformly resources are distributed.

The focus is on measuring the variability of data concerning a special kind of quantitative variable, the so-called transferable variables.
Examples of transferable variables are income, wealth, etc. for which
the values held by the statistical units under study can be altered (e.g.
by government redistribution policies). In particular, the values can be
exchanged between the units.

In economics terminology, a variable is said to be more or less concentrated depending on whether the amount of the variable held by a given proportion of units is greater or smaller.

TWO EXTREME SCENARIOS

The distribution has two possible extremes: one is the most equal and the other is the least equal.

• EQUIDISTRIBUTION = Each fraction of the sample carries a corresponding fraction of the variable. All persons have the same income, which is equal to the mean income.
• MAX CONCENTRATION = Everybody has nothing except one person, who holds the variable's total.
Gini Coefficient

Given the series of observations $x_1, x_2, \ldots, x_N$ on a transferable variable $X$ and the corresponding ordered series $x_{(1)}, x_{(2)}, \ldots, x_{(N)}$, denote by

$$A_i = x_{(1)} + x_{(2)} + \ldots + x_{(i)}$$

the total amount of variable $X$ observed on the first $i$ units, arranged in non-decreasing order of their $X$ value.

Obviously, $A_N = N \cdot \mu$ is the total over all observations, referred to as the variable's total. It is the amount of the variable carried by the whole sample of $N$ units.
So the cumulative amount at the last unit, $A_N$, equals the variable's total.

Define the two ratios, for $i = 1, 2, \ldots, N$:

* $P_i = \frac{i}{N}$, i.e. the fraction of units, with respect to the total $N$, for which $X \le x_{(i)}$,
* $Q_i = \frac{A_i}{A_N}$, i.e. the fraction of the variable's total pertaining to such units.

In the case of

• perfect equality: $A_i = i \cdot \mu$ and then $Q_i = (i \cdot \mu)/(N \cdot \mu) = P_i$,
• perfect inequality: $Q_1 = Q_2 = \ldots = Q_{N-1} = 0$ and $Q_N = 1$.

Hence, the more the $Q_i$ differ from the $P_i$, $i = 1, 2, \ldots, N-1$, the greater the inequality in our data.
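The two ratios can be computed directly from their definitions. The following is a minimal Python sketch; the function name `cumulative_shares` and the two toy data sets are illustrative assumptions, not part of the original slides.

```python
from itertools import accumulate

def cumulative_shares(values):
    """Return (P, Q) with P_i = i/N and Q_i = A_i/A_N for i = 1..N."""
    ordered = sorted(values)            # x_(1) <= ... <= x_(N)
    n = len(ordered)
    cum = list(accumulate(ordered))     # A_1, ..., A_N
    total = cum[-1]                     # A_N = N * mu (assumed positive)
    p = [i / n for i in range(1, n + 1)]
    q = [a / total for a in cum]
    return p, q

# Perfect equality (hypothetical data): every unit holds the mean, so Q_i = P_i.
p_eq, q_eq = cumulative_shares([10, 10, 10, 10])

# Maximum concentration (hypothetical data): one unit holds the total,
# so Q_1 = ... = Q_{N-1} = 0 and Q_N = 1.
p_max, q_max = cumulative_shares([0, 0, 0, 40])

print(p_eq, q_eq)
print(p_max, q_max)
```

In the first case the two lists coincide; in the second, every cumulative share except the last is zero.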


Let $X$ be a transferable variable, and let $x_1, x_2, \ldots, x_N$ be a series of observations on $X$. A measure of inequality of these data is given by the Gini coefficient:

$$G = \frac{\sum_{i=1}^{N-1} (P_i - Q_i)}{\sum_{i=1}^{N-1} P_i}.$$

The following relationship holds:

$$\sum_{i=1}^{N-1} (P_i - Q_i) \le \sum_{i=1}^{N-1} P_i.$$

The expression on the left-hand side is a measure of inequality, which is 0 in the case of perfect equality and equal to $\sum_{i=1}^{N-1} P_i$ in the case of perfect inequality.

⇒ $G$ ranges between 0 and 1; it is a relative measure of variability, suitable for comparing two or more series of observations.
CONSIDERATIONS

• The Gini coefficient measures how far the distribution is from the equidistribution scenario, taking into account the distance between $P_i$ and $Q_i$.
• These differences are cumulated. The larger the differences, the larger the inequality.
• If every difference is 0, EQUIDISTRIBUTION IS REACHED.
MEANING OF THE EXTREMES
• G = 1: numerator and denominator are equal, because all $Q_i$ with $i < N$ are zero.
• G = 0: the numerator is equal to 0, as $P_i = Q_i$.
Example - Gini Coefficient

Suppose the monthly income of five households (hypothetical data): 3.5, 6, 7, 9.5, 12.

i      x(i)   Ai     Qi                Pi          Pi − Qi
1      3.5    3.5    3.5/38 = 0.09     1/5 = 0.2   0.11
2      6.0    9.5    9.5/38 = 0.25     2/5 = 0.4   0.15
3      7.0    16.5   16.5/38 = 0.43    3/5 = 0.6   0.17
4      9.5    26.0   26.0/38 = 0.68    4/5 = 0.8   0.12
5      12.0   38.0   38.0/38 = 1.00    5/5 = 1.0   -
Total  38.0   -      -                 -           0.55

The Gini coefficient is:

$$G = \frac{\sum_{i=1}^{4} (P_i - Q_i)}{\sum_{i=1}^{4} P_i} = \frac{0.55}{2} = 0.275$$
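The calculation for the five households can be reproduced in a few lines. This is a minimal sketch, assuming at least two observations and a positive total; the function name `gini` is an illustrative choice. Working without intermediate rounding gives approximately 0.270, slightly below the 0.275 obtained from differences rounded to two decimals.

```python
def gini(values):
    """Gini coefficient: sum_{i<N} (P_i - Q_i) divided by sum_{i<N} P_i."""
    ordered = sorted(values)            # x_(1) <= ... <= x_(N)
    n = len(ordered)
    total = sum(ordered)                # A_N, the variable's total
    running = 0.0                       # A_i, the cumulative amount
    numerator = 0.0
    denominator = 0.0
    # The last unit is skipped because P_N - Q_N = 0 by construction.
    for i, x in enumerate(ordered[:-1], start=1):
        running += x
        p_i = i / n
        q_i = running / total
        numerator += p_i - q_i
        denominator += p_i
    return numerator / denominator

incomes = [3.5, 6, 7, 9.5, 12]          # data from the example
g = gini(incomes)
print(round(g, 4))
```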
Gini coefficient - Properties

• It ranges from 0 to 1 (0 = “perfect equality”, 1 = “perfect inequality”);
• It decreases if we add the same positive number to each observation;
• It does not change if we multiply all observations by a positive number;
• It satisfies the so-called transfer principle: given two observations $x_{(i)}$ and $x_{(j)}$, with $x_{(i)} < x_{(j)}$, if we decrease $x_{(i)}$ by an amount $\tau$ and increase $x_{(j)}$ by the same amount (maintaining the ranking order of observations), G increases. Vice versa, transferring $\tau$ from $x_{(j)}$ to $x_{(i)}$ decreases inequality.
• EQUIDISTRIBUTION: $Q_i = P_i$
• MAX CONCENTRATION: $x_{(N)} = N \cdot \mu$
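The properties listed above can be checked numerically. The sketch below reuses the five hypothetical household incomes; the shift of 10, the scale factor of 3 and the transfer of 1 unit are arbitrary illustrative choices, and `gini` is a helper defined here for the purpose.

```python
def gini(values):
    """Gini coefficient from the cumulative shares P_i and Q_i."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    cum, num = 0.0, 0.0
    for i, x in enumerate(xs[:-1], start=1):
        cum += x
        num += i / n - cum / total
    return num / ((n - 1) / 2)          # sum_{i=1}^{N-1} i/N = (N-1)/2

base = [3.5, 6, 7, 9.5, 12]

shifted = [x + 10 for x in base]        # same positive amount added to everyone
scaled = [x * 3 for x in base]          # all observations multiplied by 3

# Transfer of 1 unit from the poorest to the richest unit (ranking unchanged).
regressive = [2.5, 6, 7, 9.5, 13]
# Transfer of 1 unit from the richest to the poorest unit (ranking unchanged).
progressive = [4.5, 6, 7, 9.5, 11]

for name, data in [("base", base), ("shifted", shifted), ("scaled", scaled),
                   ("regressive", regressive), ("progressive", progressive)]:
    print(f"{name:12s} G = {gini(data):.4f}")
```

Adding a constant lowers G, rescaling leaves it unchanged, and the two transfers move G in opposite directions, as the transfer principle states.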

Geometric interpretation of the Gini coefficient

Given a series of observations $x_1, x_2, \ldots, x_N$, we now draw a graph which depicts the level of concentration: the Lorenz curve.

1. We put $P_i$ on the horizontal axis and $Q_i$ on the vertical axis, then we plot the points $(0, 0), (P_1, Q_1), (P_2, Q_2), \ldots, (P_N, Q_N)$. The curve is then obtained by connecting each pair of adjacent points with a line segment.
2. Then, the line connecting the points (0, 0) and (1, 1) is drawn (known as the line of perfect equality).
3. Finally, a piecewise line ABC is drawn, connecting the points

$$A = (0, 0), \quad B = \left(\frac{N-1}{N},\, 0\right), \quad C = (1, 1),$$

known as the curve of perfect inequality.
Denote by:
• S the area between the Lorenz curve and the line of perfect equality;
• max S the area between the line of perfect equality and the curve of perfect inequality.
• The triangle ABC is the region in which every possible Lorenz curve lies.
Then the Gini coefficient can be expressed as follows:

$$G = \frac{S}{\max S},$$

where $\max S = (N-1)/(2N)$.

The quantity max S is the area of the triangle ABC, having height 1 and base $(N-1)/N$.
Notice that writing $G = S/\max S$ reveals that the greater the distance between the Lorenz curve and the line of perfect equality, the greater the inequality in the data.

When N is sufficiently large, max S (the area under perfect inequality) is approximately 1/2:

$$\lim_{N \to \infty} \frac{N-1}{2N} = \frac{1}{2}.$$
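The value of max S and its limit can be written out step by step. The derivation below is a short sketch that uses only the triangle ABC and the definition $P_i = i/N$, together with the identity $\sum_{i=1}^{N-1} i = N(N-1)/2$. It also shows that max S equals the denominator of G scaled by $1/N$.

```latex
\begin{align*}
\max S &= \frac{1}{2}\cdot\underbrace{\frac{N-1}{N}}_{\text{base}}\cdot\underbrace{1}_{\text{height}}
        = \frac{N-1}{2N},\\[4pt]
\frac{1}{N}\sum_{i=1}^{N-1} P_i
       &= \frac{1}{N}\sum_{i=1}^{N-1}\frac{i}{N}
        = \frac{1}{N^{2}}\cdot\frac{N(N-1)}{2}
        = \frac{N-1}{2N} = \max S,\\[4pt]
\lim_{N\to\infty}\max S
       &= \lim_{N\to\infty}\left(\frac{1}{2}-\frac{1}{2N}\right) = \frac{1}{2}.
\end{align*}
```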

HOW TO CALCULATE THE SHADED AREA

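The shaded area S is the area under the line of perfect equality (a triangle of area 1/2) minus the area under the Lorenz curve, which is a sum of trapezoids.

Calculate the area of each trapezoid: between $P_{i-1}$ and $P_i$ it has width $1/N$ and parallel sides $Q_{i-1}$ and $Q_i$, with $Q_0 = 0$.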
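The area calculation can be sketched as follows. This is an illustrative implementation, not the slides' own: the function name `gini_from_areas` is an assumption, and the shaded area is taken as 1/2 (the area under the line of perfect equality) minus the trapezoids under the Lorenz curve.

```python
def gini_from_areas(values):
    """Return (S, max S, G) using trapezoids under the Lorenz curve."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    # Lorenz curve ordinates Q_0 = 0, Q_1, ..., Q_N.
    q = [0.0]
    for x in xs:
        q.append(q[-1] + x / total)
    # N trapezoids, each of width 1/N with parallel sides Q_{i-1} and Q_i.
    area_under_lorenz = sum((q[i - 1] + q[i]) / 2 * (1 / n)
                            for i in range(1, n + 1))
    s = 0.5 - area_under_lorenz         # between equality line and Lorenz curve
    max_s = (n - 1) / (2 * n)           # area of the triangle ABC
    return s, max_s, s / max_s

s, max_s, g_area = gini_from_areas([3.5, 6, 7, 9.5, 12])
print(round(s, 4), round(max_s, 4), round(g_area, 4))
```

For the five household incomes this gives the same G as the ratio of sums, since S equals the sum of the differences $P_i - Q_i$ divided by N.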
Income inequality

The Globalinc website allows the visualisation of the Global Income Distribution from 1980 up to 2014.
For example:

Italy 1980

xi      Ai      Qi      Pi    Pi − Qi
2104    2104    0.030   0.1   0.070
3095    5199    0.075   0.2   0.125
3946    9145    0.132   0.3   0.168
4741    13886   0.200   0.4   0.200
5536    19422   0.280   0.5   0.220
6392    25814   0.372   0.6   0.228
7390    33204   0.478   0.7   0.222
8696    41900   0.603   0.8   0.197
10775   52675   0.759   0.9   0.141
16757   69432   1.000   1.0   0.000

Italy 2014

xi      Ai       Qi      Pi    Pi − Qi
2434    2434     0.022   0.1   0.078
4568    7002     0.062   0.2   0.138
6172    13174    0.117   0.3   0.183
7552    20726    0.185   0.4   0.215
8864    29590    0.264   0.5   0.236
10226   39816    0.355   0.6   0.245
11783   51599    0.460   0.7   0.240
13803   65402    0.583   0.8   0.217
17077   82479    0.736   0.9   0.164
29658   112137   1.000   1.0   0.000

$$G_{1980} = \frac{\sum_{i=1}^{9} (P_i - Q_i)}{\sum_{i=1}^{9} P_i} = \frac{1.57}{4.5} = 0.349 \qquad G_{2014} = \frac{\sum_{i=1}^{9} (P_i - Q_i)}{\sum_{i=1}^{9} P_i} = \frac{1.71}{4.5} = 0.381$$
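The two coefficients can be recomputed from the $x_i$ columns alone. The sketch below assumes the ten $x_i$ values of each table are used as ten observations, exactly as the tables do; the names `gini`, `italy_1980` and `italy_2014` are illustrative.

```python
def gini(values):
    """Gini coefficient: sum_{i<N} (P_i - Q_i) divided by sum_{i<N} P_i."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    cum = 0.0
    num = 0.0
    den = 0.0
    for i, x in enumerate(xs[:-1], start=1):
        cum += x                        # A_i
        num += i / n - cum / total      # P_i - Q_i
        den += i / n                    # P_i
    return num / den

# x_i columns copied from the two tables.
italy_1980 = [2104, 3095, 3946, 4741, 5536, 6392, 7390, 8696, 10775, 16757]
italy_2014 = [2434, 4568, 6172, 7552, 8864, 10226, 11783, 13803, 17077, 29658]

g_1980 = gini(italy_1980)
g_2014 = gini(italy_2014)
print(f"G1980 = {g_1980:.3f}, G2014 = {g_2014:.3f}")
# prints: G1980 = 0.349, G2014 = 0.381
```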
Examples: Global Income Distribution

It allows for some comparison:

[Figure: Gini coefficient, period 1980-2014. Countries: Brazil, China, Italy, Russian Federation and USA. Horizontal axis: Year (1980 to 2010); vertical axis: Gini (0.3 to 0.6).]

Data source: https://tinyco.re/9553483
Heterogeneity

Nominal and categorical variables do not admit the variability measures used for quantitative variables. So the concept of heterogeneity is introduced to assess the variability of a qualitative data distribution.

Let’s consider the following two percentage distributions about religion:

Xi               pi (distribution 1)   pi (distribution 2)
Catholicism      25%                   0%
No religion      25%                   0%
Islam            25%                   0%
Other religion   25%                   100%
Total            100%                  100%

Which one is the most homogeneous?

ANSWER: the second one.


For the frequency distribution of a nominal variable, various indexes can be introduced that measure the degree of diversity between the observations.
HOMOGENEITY
• Consider a frequency distribution of a nominal variable. We say that
the frequency distribution is perfectly homogeneous when all
observations fall into a single category.
HETEROGENEITY
• On the contrary, we say that the frequency distribution is
perfectly heterogeneous when all categories have the
same frequency.
Measures of heterogeneity

The following indices are measures of heterogeneity:

$$e_1 = 1 - \sum_{i=1}^{k} f_i^2, \qquad 0 \le e_1 \le \frac{k-1}{k}$$

$$e_2 = -\sum_{i=1}^{k} f_i \ln(f_i), \qquad 0 \le e_2 \le \ln(k)$$

where $f_i$, $i = 1, 2, \ldots, k$, are the relative frequencies.

* $e_1$ attains its minimum, 0, when the frequency distribution is perfectly homogeneous; it attains its maximum, $(k-1)/k$, when the distribution is perfectly heterogeneous.
* Similarly, $e_2$ attains its minimum, 0, when the frequency distribution is perfectly homogeneous; it attains its maximum, $\ln(k)$, when the frequency distribution is perfectly heterogeneous.
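Both indices are short to compute. The following is a minimal sketch; the function names `e1` and `e2` mirror the symbols above, and treating $0 \cdot \ln(0)$ as 0 is an assumption made here (the usual convention) so that empty categories do not break the logarithm.

```python
import math

def e1(freqs):
    """e1 = 1 - sum of squared relative frequencies."""
    return 1 - sum(f ** 2 for f in freqs)

def e2(freqs):
    """e2 = - sum of f * ln(f); terms with f = 0 are skipped (assumed 0*ln 0 = 0)."""
    return -sum(f * math.log(f) for f in freqs if f > 0)

k = 4
homogeneous = [0.0, 0.0, 0.0, 1.0]          # all observations in one category
heterogeneous = [0.25, 0.25, 0.25, 0.25]    # all categories equally frequent

print(e1(homogeneous), e2(homogeneous))     # both at their minimum
print(e1(heterogeneous), (k - 1) / k)       # e1 at its maximum (k - 1)/k
print(e2(heterogeneous), math.log(k))       # e2 at its maximum ln(k)
```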
OBSERVATIONS ON HETEROGENEITY

• E1 and E2 can be used as alternatives
• E1 is always less than 1
• E2 can be greater than 1
• By dividing E1 and E2 by their theoretical maximum, a RELATIVE MEASURE OF HETEROGENEITY IS OBTAINED.
• This relative measure is known as the NORMALIZED VERSION
NORMALIZED VERSIONS
• Normalized versions of E1 and E2 range between 0 and 1
• Normalized versions are more apt for comparisons and for assessing how heterogeneous a distribution is with respect to the maximum heterogeneity technically possible
ASSESS HETEROGENEITY

In order to assess heterogeneity:


• REPORT RANGE
• COMPUTE NORMALIZED RANGE
Example - Heterogeneity

Let’s consider the percentage distribution of the French population by religion (source: Eurobarometer, 2018). Evaluate the heterogeneity.

Xi               pi
Catholicism      50%
No religion      38%
Islam            5%
Other religion   7%
Total            100%

$$e_1 = 1 - (0.50^2 + 0.38^2 + 0.05^2 + 0.07^2) = 0.60$$

$$e_2 = -(0.50 \ln 0.50 + 0.38 \ln 0.38 + 0.05 \ln 0.05 + 0.07 \ln 0.07) = 1.050$$

Given max $e_1$ = 0.75 and max $e_2$ = ln(4) = 1.386, their normalized values are 0.8 and 0.758, respectively.

⇒ Quite high heterogeneity.
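The figures in this example can be verified with a short script. It is a sketch only: the variable names are illustrative, and the percentages are those of the table above written as relative frequencies.

```python
import math

# Relative frequencies: Catholicism, No religion, Islam, Other religion.
freqs = [0.50, 0.38, 0.05, 0.07]
k = len(freqs)

e1_value = 1 - sum(f ** 2 for f in freqs)
e2_value = -sum(f * math.log(f) for f in freqs)

# Normalized versions: divide by the theoretical maxima (k - 1)/k and ln(k).
e1_norm = e1_value / ((k - 1) / k)
e2_norm = e2_value / math.log(k)

print(f"e1 = {e1_value:.2f}, normalized = {e1_norm:.1f}")
print(f"e2 = {e2_value:.3f}, normalized = {e2_norm:.3f}")
```

Both normalized values are well above one half, which is consistent with the conclusion of quite high heterogeneity.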
