Comparisons by ratios Comparisons by differences

Istat allows these structural changes to be monitored through the
dissemination of the following indicators:
Birth rate: the ratio between the number of live births in a year and
the average population of the same year (or person-years), multiplied
by 1,000.
Total fertility rate (average number of children per woman): the sum of
the age-specific fertility rates, calculated by relating, for each
fertile age (between 15 and 49 years), the number of live births to the
average annual female population of that age.
Marriage rate: the ratio between the number of marriages celebrated in
the year and the average resident population, multiplied by 1,000.
Mortality rate: the ratio between the number of deaths in the year and
the average resident population, multiplied by 1,000.
Divorce rate: the ratio between the number of divorces in the year and
the average resident population, multiplied by 1,000.

http://dati.istat.it/Index.aspx?DataSetCode=DCIS_FECONDITA1
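
The crude rates listed above all share the same structure: events in a year divided by the average population, times 1,000. A minimal sketch follows; the function name `crude_rate`, the figures, and the choice of averaging the start-of-year and end-of-year populations are illustrative assumptions, not Istat data or Istat's exact procedure.

```python
def crude_rate(events, avg_population, per=1000):
    """Events observed in the year over the average population, times `per`."""
    if avg_population <= 0:
        raise ValueError("average population must be positive")
    return events / avg_population * per


# Hypothetical figures for one year (illustrative only, not Istat data).
pop_start, pop_end = 1_000_000, 1_010_000
# Assumption: average population = mean of the start and end populations.
avg_pop = (pop_start + pop_end) / 2
births, deaths = 9_045, 10_050

birth_rate = crude_rate(births, avg_pop)
death_rate = crude_rate(deaths, avg_pop)
print(f"birth rate: {birth_rate:.1f} per 1,000")
print(f"mortality rate: {death_rate:.1f} per 1,000")
```

The numerator counts events over the whole year while the denominator is a population size, which is why an average population is used.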
Derivation ratios result from relating a movement collective to a state
collective (Leti, Cerbara 2009, pp. 57-58).

Movement collectives are identifiable only with reference to a time
interval: it makes no sense to talk about deaths, births, graduates,
immigrants, etc. without specifying the period in which the event is
observed.

State collectives refer to an instant in time: when we talk about
population, university enrollments, companies in an area, etc., we
always refer to the size of the collective recorded at a given moment.
Derivation ratios are also used in areas other than demographics.
Transition rate from upper secondary school to university: the ratio of
those who enrolled in university, among those who graduated from upper
secondary school in a given year, to those who graduated in the same
year, multiplied by 100.
Graduation rate (by cohort): the ratio of graduates from a cohort to
those enrolled in the cohort, multiplied by 100.
Crime rate: the ratio of the number of reported crimes to the
population, multiplied by 100.
One of the main problems in defining derivation ratios is defining the
collective in the denominator of the ratio.

In fact, while the numerator collective (frequencies or quantities)
always refers to an interval of time (e.g. born in one year, graduated
in one year), the denominator collective refers to an instant in time
(e.g. the beginning of the period, the end of the period, or an average
value).
In addition, the definition of the collectives is not always
unambiguous, and it is more problematic for the denominator (e.g. the
graduation rate).
Average ratios
They are calculated as the ratio of a quantity and/or a frequency
referring to one or more collectives.

The ratio between numerator and denominator must be meaningful.
Average value of pensions in Italy

$$\frac{\text{Value of pensions paid}}{\text{Number of pensions paid}} \tag{10}$$

Average value of protested bills of exchange (VMCP)

$$\frac{\text{Value of protested bills}}{\text{Number of protested bills of exchange}} \tag{11}$$

Average number of people per household (NMPA)

$$\frac{\text{Population}}{\text{Number of homes}} \tag{12}$$
Comparisons by differences

Absolute and relative differences

Absolute differences: $(x_1 - x_2)$

When $x_1$ and $x_2$ indicate the amounts of a phenomenon at two
different times, $t_1$ and $t_2$ respectively, the difference
$|x_2 - x_1|$ is called
absolute increase if $x_2 > x_1$
absolute decrease if $x_2 < x_1$
Absolute differences are little used in practice because absolute
variations say little unless they are related to the order of magnitude
of the measurements.

A salary increase of 300 euros per month has a different weight
depending on whether the initial salary is 1,200 euros or 6,000 euros.
To account for the order of magnitude of the measurements, relative or
percentage differences (or variations) are used.

Relative differences make measurements dimensionless (independent of
the unit of measure in which they are expressed).

An increase in the unit price (per gram) of gold cannot be compared
with an increase in the unit price (per tonne) of steel, whereas their
relative changes over a period of time are comparable (e.g. 15% for
gold and 22% for steel).

Relative differences are obtained by dividing the absolute difference
by the first or the second term; multiplying the relative differences
by 100, we obtain the percentage differences (or variations).

$$\frac{x_2 - x_1}{x_1} \cdot 100 \qquad \frac{x_2 - x_1}{x_2} \cdot 100 \tag{13}$$

The choice between the two expressions depends on which of the two
terms, $x_1$ or $x_2$, is considered more representative for the
comparison.
In some cases it may make sense to relate the numerator to the
half-sum of the two terms:

$$\frac{x_2 - x_1}{\frac{1}{2}(x_1 + x_2)} \cdot 100$$

e.g. to compare the population density (inhabitants per km²) in two
countries.
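
The three percentage differences (relative to the first term, the second term, or the half-sum) differ only in the denominator. The sketch below illustrates this with the salary example from the text; the function name `pct_difference` and the `base` labels are illustrative choices, and the values 100 and 120 in the last call are made up.

```python
def pct_difference(x1, x2, base="first"):
    """Percentage difference (x2 - x1) / denominator * 100.

    base="first"    -> denominator x1
    base="second"   -> denominator x2
    base="half_sum" -> denominator (x1 + x2) / 2
    """
    if base == "first":
        denom = x1
    elif base == "second":
        denom = x2
    elif base == "half_sum":
        denom = (x1 + x2) / 2
    else:
        raise ValueError(f"unknown base: {base!r}")
    if denom == 0:
        raise ZeroDivisionError("the chosen reference term is zero")
    return (x2 - x1) / denom * 100


# The same 300-euro raise weighs differently on different salaries.
low_salary = pct_difference(1200, 1500)    # about 25 percent
high_salary = pct_difference(6000, 6300)   # about 5 percent
print(round(low_salary, 1), round(high_salary, 1))

# Symmetric comparison of two made-up values using the half-sum.
symmetric = pct_difference(100, 120, base="half_sum")
print(round(symmetric, 2))
```

The half-sum version treats the two terms symmetrically: swapping them changes only the sign of the result.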
Given the time series of students enrolled in Italy between 2009 and
2015, we calculate the percentage changes relative to 2009.

Table: Percentage changes in enrollment trends ($x_1$ = 2009)

Academic   Enrolled (absolute frequencies)   $\frac{(x_1 - x_2)}{x_1} \cdot 100$
year       ITALY      SARDINIA               ITALY    SARDINIA
2009       294724     5569
2010       288286     5413                   -2.2
2011       279025     5319                   -5.6
2012       253848     5372                   -16.1
2013       252457     4925                   -16.7
2014       255294     5117                   -15.4
2015       260755     5270                   -13.0
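
As a check on the table, the sketch below recomputes the ITALY percentage changes from the absolute frequencies with both expressions in (13). The printed column appears to be reproduced when the difference from 2009 is divided by the later year's value; dividing by the 2009 value gives somewhat different figures from 2011 onward. The variable names are illustrative.

```python
# Enrolled students in Italy, copied from the table above.
italy = {2009: 294724, 2010: 288286, 2011: 279025, 2012: 253848,
         2013: 252457, 2014: 255294, 2015: 260755}

base = italy[2009]

# Difference from 2009 divided by the 2009 value.
vs_2009 = {year: round((value - base) / base * 100, 1)
           for year, value in italy.items() if year != 2009}

# Difference from 2009 divided by the value of the later year.
vs_later_year = {year: round((value - base) / value * 100, 1)
                 for year, value in italy.items() if year != 2009}

for year in sorted(vs_2009):
    print(year, vs_2009[year], vs_later_year[year])
```

Which denominator is appropriate depends, as noted for (13), on which term is taken as the reference for the comparison.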
Inequality in qualitative variables Inequality in transferable quantitative characters Graphical representation of the concentration index

Part II

The comparison between collectives through measures of inequality

Topics:
Inequality in transferable quantitative characters

Inequality indices measure the diversity between the values taken by
the statistical units of a distribution. In general, they take their
minimum value when all units of the distribution have the same value
or category (so they are equal to each other).

They take their maximum value when the maximum diversity between units
is observed.

As the diversity between the units of the distribution increases, the
value of the inequality indices increases.

They are widely used to compare the characteristics of distributions.

These indices are often used to construct indicators of economic and
social phenomena.

Examples
If we examine the distribution of university students of two
universities with respect to the region of origin and we observe
that the students of the first university all come from the same
region, while the students of the second university come from
different Italian regions, we can say that in the second university
there is more inequality of students with respect to geographical
origin than in the first...
Information on the origin of students enrolled in a university can
be used as an indicator of the university's ability to attract
students from other regions (attractiveness indicator). The
indicator will assume a value equal to 0 when all those enrolled
in the university come from a single region, and a maximum
value when they come in equal measure from all regions.
Similarly, if we compare two states (A and B) with respect to the
income of their citizens: if in state A all citizens have the same
income, while in state B the distribution of income is U-shaped, we
will say that in state B there is more inequality in the distribution
of income than in state A.

In general, inequality is measured with indices whose structure depends
on the type of variable.
For qualitative variables, we measure inequality with indices of
heterogeneity.
For quantitative variables, we measure inequality with indices of
variability (e.g. standard deviation, interquartile range, etc.).

For quantitative variables whose character is transferable (i.e. it can
be transferred from one unit to another), we measure inequality through
concentration. Income is a transferable character.

Inequality in qualitative variables: indices of heterogeneity

Variability of a nominal variable

In case (1) all statistical units share the same category.
In case (2) the statistical units are evenly distributed among the
categories.

Heterogeneity indices

The objective is to obtain an index that
is minimum in situation (1),
grows as we move away from situation (1) and closer to situation (2),
is maximum in situation (2).

The indices of heterogeneity and of homogeneity (which we will not
cover) respond to this need.

Example: votes for a political party by social class (low, lower-middle,
middle, upper-middle, high)
Minimum heterogeneity: all voters belong to the same social class.

Maximum heterogeneity: the voters are equally distributed among all
social classes.
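
The slides do not give a formula for a heterogeneity index. As one possible illustration, the sketch below uses the Gini heterogeneity index, 1 minus the sum of squared relative frequencies, divided by its maximum (k - 1)/k for k categories. This index is an assumption brought in for the example, and the voter lists are made up; it behaves as required, being 0 when everyone is in one class and 1 when the classes are equally represented.

```python
from collections import Counter


def normalized_heterogeneity(labels, categories):
    """Gini heterogeneity 1 - sum(f_j^2), rescaled to the range [0, 1]."""
    n = len(labels)
    k = len(categories)
    if n == 0 or k < 2:
        raise ValueError("need at least one unit and two categories")
    counts = Counter(labels)
    freqs = [counts.get(c, 0) / n for c in categories]
    raw = 1 - sum(f * f for f in freqs)
    return raw / ((k - 1) / k)


classes = ["low", "lower-middle", "middle", "upper-middle", "high"]

one_class = ["middle"] * 10          # minimum heterogeneity
evenly_spread = classes * 2          # maximum heterogeneity
mixed = ["low"] * 6 + ["middle"] * 3 + ["high"] * 1

h_min = normalized_heterogeneity(one_class, classes)
h_max = normalized_heterogeneity(evenly_spread, classes)
h_mid = normalized_heterogeneity(mixed, classes)
print(round(h_min, 3), round(h_mid, 3), round(h_max, 3))
```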

Inequality in transferable quantitative characters: the concentration index

Background

In statistics we also classify characters according to whether they are
proper to the statistical unit (weight, height, age)
transferable from one statistical unit to other statistical units
(income, population of a municipality, labor force or value added of an
enterprise)
A character that a statistical unit can transfer (in whole or in part)
to another unit is called transferable.
Definition:
A quantitative variable can be defined as transferable if it can pass
(materially or even only ideally) from one possessor to another. In
this sense height is not transferable, while income is.

Concepts and definitions

How is the wealth of a community distributed among its members? When is
it unequal?

How does wealth vary across social groups, across countries, and over
time?
How is wealth inequality measured? What information should a synthetic
index of inequality provide?

Several preliminary and methodological issues arise in measuring
inequality:

Income is a variable by which the economic well-being of individuals is
measured.

However, other variables could be used to measure economic well-being
(e.g. consumption and wealth).

The reference unit for assessing well-being can be the individual or
the family.
For example, if the unit is the family, the comparison must be made
homogeneous between households of different sizes.

A synthetic inequality index associates each possible income
distribution with a number that measures the degree of concentration
(Baldini, Toso 2004, p. 31).

An index designed in this way ensures that, given two income
distributions, it is possible to determine whether the first is more
unequal than the second (or vice versa) or whether the inequality is
the same.

The index is almost always made to vary between 0 and 1 (or between 0
and 100) by dividing its value by the maximum observable value (and
possibly multiplying it by 100).

Let's take income as an example of a transferable character.

Suppose we observe the following distributions of monthly income for 5
households:
(a): 1.7  3.6  1.2  2.9  5.1
(b): 2.9  2.9  2.9  2.9  2.9
(c): 0    0    0    14.5 0

All three distributions have mean 2.9 and the sum of incomes is 14.5.

In distribution (b) inequality is minimum; in distribution (c)
inequality is maximum.
Distribution (a) describes a situation intermediate between (b) and (c).

Definitions and notation

Let us consider n statistical units ordered by non-decreasing amount of
the character that each of them possesses.
Given distribution (a) of the monthly net incomes of 5 households (in
thousands of euros)

1.7; 3.6; 1.2; 2.9; 5.1

the ordered distribution is

1.2; 1.7; 2.9; 3.6; 5.1

Let $x_i$ ($x_i \ge 0$) be the amount of the character possessed by the
i-th unit, so that $x_1 \le x_2 \le \dots \le x_n$.
In distribution (a): $x_1 = 1.2$, $x_2 = 1.7$, ..., $x_5 = 5.1$

We denote by $A_i = x_1 + x_2 + \dots + x_i$ the total amount of the
character possessed by the i poorest units (ordered in non-decreasing
sense);

$i = 1, \dots, n$ is the rank of the unit.

In distribution (a), with $i = 1, 2, 3, 4, 5$:
$A_1 = 1.2$
$A_2 = 1.2 + 1.7 = 2.9$
$A_3 = 1.2 + 1.7 + 2.9 = 5.8$
...
$A_5 = 1.2 + 1.7 + 2.9 + 3.6 + 5.1 = 14.5$
The ratio

$$Q_i = \frac{A_i}{A_n}$$

is the fraction of the total amount of the character that is owned by
the i poorest units:
$Q_1 = 1.2/14.5 = 0.08$, $Q_2 = 2.9/14.5 = 0.20$, $Q_3 = 5.8/14.5 = 0.40$

The ratio

$$P_i = \frac{i}{n}$$

indicates the fraction of the i poorest units out of the total n units:
$P_1 = 1/5$, $P_2 = 2/5$, $P_3 = 3/5$, $P_4 = 4/5$, $P_5 = 5/5$

Rank  x_i   A_i    P_i
1     1.2   1.2    0.20
2     1.7   2.9    0.40
3     2.9   5.8    0.60
4     3.6   9.4    0.80
5     5.1   14.5   1.00

The poorest 40% of units own 20% of the total income; the poorest 60%
of units own 40% of the total income.

There is no inequality when $P_i = Q_i$ for $i = 1, \dots, n$.
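
The quantities A_i, Q_i and P_i defined above can be computed in a few lines. This is a minimal sketch for distribution (a); the function name `cumulative_shares` is an illustrative choice.

```python
from itertools import accumulate


def cumulative_shares(values):
    """Return the sorted values with cumulative amounts A, shares Q and P."""
    xs = sorted(values)                       # non-decreasing order
    n = len(xs)
    A = list(accumulate(xs))                  # A_i = x_1 + ... + x_i
    total = A[-1]                             # A_n
    P = [(i + 1) / n for i in range(n)]       # P_i = i / n
    Q = [a / total for a in A]                # Q_i = A_i / A_n
    return xs, A, P, Q


incomes_a = [1.7, 3.6, 1.2, 2.9, 5.1]         # distribution (a)
xs, A, P, Q = cumulative_shares(incomes_a)

print("rank   x_i    A_i   P_i   Q_i")
for rank, (x, a, p, q) in enumerate(zip(xs, A, P, Q), start=1):
    print(f"{rank:>4}  {x:4.1f}  {a:5.1f}  {p:.2f}  {q:.2f}")
```

Sorting first matters: Q_i is the share held by the i poorest units only if the values are in non-decreasing order.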
For $P_i$ and $Q_i$ we have that:

as $P_i$ grows, $Q_i$ grows;
it always holds that $Q_i \le P_i$.

We observe that $P_i = Q_i$ only in the following situations:

1 When $i = n$ ($P_n = Q_n = 1$): in distribution (a), $Q_5 = P_5$
2 For each i, in the case of equidistribution of the character among
  the units of the collective, $x_1 = x_2 = \dots = x_n$: see
  distribution (b)

In every other case we have $Q_i < P_i$ for $i = 1, \dots, n-1$.

Rank  x_i   A_i    Q_i    P_i
1     1.2   1.2    0.08   0.20
2     1.7   2.9    0.20   0.40
3     2.9   5.8    0.40   0.60
4     3.6   9.4    0.65   0.80
5     5.1   14.5   1.00   1.00

The inequality index is a function of the differences $(P_i - Q_i)$.

We said that concentration studies the way in which the total amount
$A_n$ is distributed among the n units.

As with other statistical concepts, to understand concentration it is
useful to start from the extreme situations: minimum and maximum.

Minimum concentration (or equidistribution) = NULL VARIABILITY = the
statistical units all possess equal quantities:
$x_1 = x_2 = \dots = x_n$
In that case, for each observation,

$$P_i = Q_i$$

and, therefore, for every i,

$$P_i - Q_i = 0$$

Minimum concentration

Table: Income distribution (b)

Rank  x_i   A_i    Q_i    P_i
1     2.9   2.9    0.20   0.20
2     2.9   5.8    0.40   0.40
3     2.9   8.7    0.60   0.60
4     2.9   11.6   0.80   0.80
5     2.9   14.5   1.00   1.00

$$\sum_{i} (P_i - Q_i) = 0$$

Maximum concentration = MAXIMUM VARIABILITY = one unit possesses the
total; the other $n-1$ units possess a null amount of the character:

$$x_1 = x_2 = \dots = x_{n-1} = 0, \qquad x_n = A_n$$

and

$$P_i = \frac{i}{n}, \qquad Q_1 = Q_2 = \dots = Q_{n-1} = 0, \qquad Q_n = 1$$

We have, therefore, for $i \neq n$

$$P_i - Q_i = P_i = \frac{i}{n}$$

and for $i = n$

$$P_i - Q_i = 0$$

Maximum concentration

Table: Income distribution (c)

Rank  x_i    A_i    Q_i    P_i
1     0      0      0.00   0.20
2     0      0      0.00   0.40
3     0      0      0.00   0.60
4     0      0      0.00   0.80
5     14.5   14.5   1.00   1.00

$$\sum_{i=1}^{n-1} (P_i - Q_i) = \sum_{i=1}^{n-1} P_i$$

The more concentrated the character, the greater the differences
$(P_i - Q_i)$.
The maximum value is $\sum_{i=1}^{n-1} P_i$.

Any reasonable indicator of concentration will need to:

be MINIMUM when the concentration is minimum,
be MAXIMUM when the concentration is maximum,
GROW when the concentration increases.

That is, the measure must take higher values as we pass from a
situation of equidistribution to a situation in which a few units
possess a large part of the total.

The Gini Concentration Ratio

The more the character is concentrated, the greater the differences
$P_i - Q_i$.
The concentration measure should take into account all the differences
$P_i - Q_i$ except the last one (which is always equal to 0).
The simplest formula that can be used is

$$G = \sum_{i=1}^{n-1} (P_i - Q_i)$$

which is minimum, and equal to 0, if
$P_1 - Q_1 = 0, P_2 - Q_2 = 0, \dots, P_{n-1} - Q_{n-1} = 0$,
and is maximum if $Q_1 = Q_2 = \dots = Q_{n-1} = 0$, in which case it
equals $\sum_{i=1}^{n-1} P_i$.
To get a relative index, 0 being the minimum, we divide by the maximum:

$$g = \frac{\sum_{i=1}^{n-1} (P_i - Q_i)}{\sum_{i=1}^{n-1} P_i} = 1 - \frac{\sum_{i=1}^{n-1} Q_i}{\sum_{i=1}^{n-1} P_i}$$

In relation to distribution (a)

Rank  x_i   A_i    Q_i    P_i    (P_i - Q_i)
1     1.2   1.2    0.08   0.20   0.12
2     1.7   2.9    0.20   0.40   0.20
3     2.9   5.8    0.40   0.60   0.20
4     3.6   9.4    0.65   0.80   0.15
5     5.1   14.5   1.00   1.00

$$g = \frac{\sum_{i=1}^{n-1} (P_i - Q_i)}{\sum_{i=1}^{n-1} P_i} = \frac{0.67}{2.00} = 0.335$$
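
The Gini concentration ratio g can be computed directly from its definition as the sum of the differences P_i - Q_i over the first n-1 units, divided by the sum of the P_i. The sketch below applies it to distributions (a), (b) and (c); the function name `gini_ratio` is an illustrative choice. Working with unrounded Q_i gives about 0.3345 for (a), which agrees with the 0.335 obtained from the rounded table values.

```python
from itertools import accumulate


def gini_ratio(values):
    """Relative concentration index g = sum(P_i - Q_i) / sum(P_i), i < n."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n < 2 or total <= 0:
        raise ValueError("need at least two units and a positive total")
    A = list(accumulate(xs))
    P = [(i + 1) / n for i in range(n - 1)]    # P_1 .. P_{n-1}
    Q = [A[i] / total for i in range(n - 1)]   # Q_1 .. Q_{n-1}
    return sum(p - q for p, q in zip(P, Q)) / sum(P)


g_a = gini_ratio([1.7, 3.6, 1.2, 2.9, 5.1])   # intermediate case
g_b = gini_ratio([2.9] * 5)                   # equidistribution
g_c = gini_ratio([0, 0, 0, 14.5, 0])          # one unit holds everything
print(round(g_a, 4), round(g_b, 4), round(g_c, 4))
```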

Maximum concentration

Table: Income distribution (c)

Rank  x_i    A_i    Q_i    P_i
1     0      0      0.00   0.20
2     0      0      0.00   0.40
3     0      0      0.00   0.60
4     0      0      0.00   0.80
5     14.5   14.5   1.00   1.00

$$\sum_{i=1}^{4} P_i = 0.20 + 0.40 + 0.60 + 0.80 = 2.0$$

$$g = \frac{\sum_{i=1}^{n-1} (P_i - Q_i)}{\sum_{i=1}^{n-1} P_i} = \frac{2.0}{2.0} = 1$$

Minimum concentration

Table: Income distribution (b)

Rank  x_i   A_i    Q_i    P_i
1     2.9   2.9    0.20   0.20
2     2.9   5.8    0.40   0.40
3     2.9   8.7    0.60   0.60
4     2.9   11.6   0.80   0.80
5     2.9   14.5   1.00   1.00

$$g = \frac{\sum_{i=1}^{n-1} (P_i - Q_i)}{\sum_{i=1}^{n-1} P_i} = \frac{0}{2.0} = 0$$

Graphical representation of concentration

The Lorenz polyline

In a system of orthogonal axes we plot the values of $P_i$ on the
abscissa and the values of $Q_i$ on the ordinate.
The pairs $(P_i, Q_i)$ make it possible to draw a graph called the
Lorenz curve.
If there is equidistribution we have $P_i = Q_i$ and, therefore, the
points $(P_i, Q_i)$ lie on the bisector of the first quadrant.
If there is no equidistribution, all points $(P_i, Q_i)$, except for
the last one ($(P_n, Q_n) = (1, 1)$), lie below the bisector of the
first quadrant.
As i varies, the points $(P_i, Q_i)$ form the concentration curve, or
Lorenz polyline.

The Lorenz polyline for the distribution (a) of incomes

[Figure: Lorenz curve for distribution (a). Horizontal axis: P_i = i/n;
vertical axis: Q_i = A_i/A_n. The concentration surface S lies between
the equidistribution line and the concentration polyline.]

The Lorenz curve, in addition to representing concentration
graphically, also allows us to measure it.
A measure of concentration is the ratio

$$\frac{\text{Area } S}{\text{Area of the triangle of maximum concentration}}$$

Null concentration → Lorenz curve = equidistribution line → Area S = 0
Maximum concentration → Lorenz curve = segment of maximum
concentration → Area of concentration = area of the maximum
concentration triangle
Intermediate concentration → the more pronounced the concavity, the
greater the area S, that is, the higher the concentration.

Considering these aspects, Gini proposed another measure of
concentration.
The surface S is obtained as the difference between
the area of the triangle OBC (of unit base and unit height), equal to
$\frac{1}{2}$,
and the sum of
the area of the triangle with vertex O, base $P_1$ and height $Q_1$,

$$\frac{1}{2} P_1 Q_1$$

and the areas of the $n-1$ trapezoids with bases $Q_i$ and $Q_{i+1}$ and
height $P_{i+1} - P_i$,

$$\frac{1}{2} \sum_{i=1}^{n-1} (P_{i+1} - P_i)(Q_{i+1} + Q_i)$$

Therefore, the concentration surface has area

$$S = \frac{1}{2} - \frac{1}{2} P_1 Q_1 - \frac{1}{2} \sum_{i=1}^{n-1} (P_{i+1} - P_i)(Q_{i+1} + Q_i)$$

Since $(P_0, Q_0) = (0, 0)$ we can write

$$S = \frac{1}{2} - \frac{1}{2} \sum_{i=0}^{n-1} (P_{i+1} - P_i)(Q_{i+1} + Q_i)$$

If the character is concentrated in a single unit, the $n-1$ poorest
units have an amount of the character equal to 0, and

$$S = \frac{1}{2} - \frac{1}{2} \cdot \frac{1}{n} = \frac{1}{2} \cdot \frac{n-1}{n}$$

However, since for large n the two triangles OB'C and OBC are
practically equal, Gini proposed taking $\frac{1}{2}$ as the maximum
value.
So, dividing the area S by its maximum $\frac{1}{2}$:

$$R = \frac{\frac{1}{2} - \frac{1}{2} \sum_{i=0}^{n-1} (P_{i+1} - P_i)(Q_{i+1} + Q_i)}{\frac{1}{2}}$$

$$R = 1 - \sum_{i=0}^{n-1} (P_{i+1} - P_i)(Q_{i+1} + Q_i)$$
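
The formula for R is a trapezoid-rule computation of the area under the Lorenz polyline. The sketch below implements it for the three income distributions used earlier; the function name `gini_R` is an illustrative choice. With one unit holding everything and n = 5 the result is (n - 1)/n = 0.8 rather than 1, which reflects the choice of 1/2 as the maximum area.

```python
def gini_R(values):
    """R = 1 - sum_{i=0}^{n-1} (P[i+1] - P[i]) * (Q[i+1] + Q[i])."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one unit and a positive total")
    # Vertices of the Lorenz polyline, starting from (P_0, Q_0) = (0, 0).
    P, Q, running = [0.0], [0.0], 0.0
    for i, x in enumerate(xs, start=1):
        running += x
        P.append(i / n)
        Q.append(running / total)
    return 1 - sum((P[i + 1] - P[i]) * (Q[i + 1] + Q[i]) for i in range(n))


R_a = gini_R([1.7, 3.6, 1.2, 2.9, 5.1])
R_b = gini_R([2.9] * 5)
R_c = gini_R([0, 0, 0, 14.5, 0])
print(round(R_a, 4), round(R_b, 4), round(R_c, 4))
```

For distribution (a) the value is about 0.2676, which equals the ratio g of about 0.3345 multiplied by (n - 1)/n = 0.8.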
Below we list the properties that, according to the literature, an
index measuring inequality in the distribution of income should
possess, and we check which of them the Gini index possesses.

A measure of inequality in income distribution should possess the
following properties (Baldini and Toso, 2004, pp. 53-55):
1 Anonymity: the inequality index must be insensitive to permutations
of the vector of incomes. Given the three income distributions
(5, 10, 3), (10, 3, 5), (5, 10, 3), the value of the inequality index
associated with them, which we denote by I, must not change. Therefore,
if the property is satisfied,

I(5, 10, 3) = I(10, 3, 5) = I(5, 10, 3)

2 Independence from the mean: if all incomes are multiplied by a
constant, the value of the index I does not change. Given the two
distributions (5, 10, 3) and (10, 20, 6), the index of inequality
calculated on both satisfies I(5, 10, 3) = I(10, 20, 6).

This property ensures that the index depends on relative differences
between incomes and not on absolute differences. In fact, multiplying
by a constant changes the absolute differences between the terms but
leaves their relative differences unchanged.

3 Population independence: if each income is replicated k times, the
inequality of the new distribution is equal to that of the starting
one. For example, comparing the two distributions
(5, 5, 10, 10, 3, 3) and (5, 10, 3), if the inequality index satisfies
the property it is true that
I(5, 5, 10, 10, 3, 3) = I(5, 10, 3)
4 Principle of transfer: if there is a transfer from a richer unit to a
poorer one that does not change the ranking of the subjects, the
inequality index decreases.
Variants of this principle have been proposed to make the index more
sensitive to income transfers when they benefit the poorest part of the
distribution (Foster and Shorrocks, 1987).
4.1 The inequality index satisfies the principle of descending transfer
if its value is reduced by a transfer to a poorer person, and this
reduction is greater the lower the income of the person receiving the
transfer.

Let us take a practical example to clarify the principle of descending
transfer.
Consider the income distribution a: (20, 300, 900), and imagine that a
transfer (of the same total amount) of income from a richer unit to a
poorer unit (which does not change the order of the units) can occur in
two ways:

distribution b: (70, 250, 900)

distribution c: (20, 350, 850)
The descending transfer principle is satisfied by the index I if
calculating the inequality index for the three distributions results
in

$$I(a) - I(b) > I(a) - I(c)$$

The decrease in inequality with respect to distribution (a) is greater
in distribution (b) than in distribution (c).

5 Decomposability between groups: suppose that the population can be
divided into G groups (for example by geographic area, or by
qualification). The index is decomposable by groups if it can be
calculated as a weighted sum of the values that the index takes in each
subgroup ($I_g$) plus a term that measures inequality between groups:

$$I = \sum_{g} w_g I_g + I_B = I_{within} + I_{between}$$

$I_{within}$ measures inequality within groups, while $I_{between}$
measures only the distances between the average incomes of the groups.
If an index satisfies this property, then when the inequality within a
group decreases we can be sure that the value of the index decreases.

The Gini concentration ratio is used around the world to measure
inequality in income distribution.

It is the most popular summary measure of inequality.

The Gini index satisfies the properties of anonymity [1], independence
from the mean [2], independence from the population [3] and the
transfer principle [4].
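
Properties [1] to [4] can be checked numerically on the example distributions given with each property. The sketch below does so for the area-based Gini ratio R; the function name `gini_R` and the transferred amount of 1 in the last check are illustrative assumptions.

```python
import math


def gini_R(incomes):
    """Area-based Gini ratio R computed on the Lorenz polyline."""
    xs = sorted(incomes)
    n, total = len(xs), sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one unit and a positive total")
    area_terms, q_prev, running = 0.0, 0.0, 0.0
    for x in xs:
        running += x
        q = running / total
        area_terms += (1 / n) * (q + q_prev)   # (P[i+1]-P[i]) * (Q[i+1]+Q[i])
        q_prev = q
    return 1 - area_terms


base = gini_R([5, 10, 3])

# [1] Anonymity: permuting the incomes leaves the index unchanged.
anonymity = gini_R([10, 3, 5]) == base
# [2] Independence from the mean: doubling every income changes nothing.
scale = math.isclose(gini_R([10, 20, 6]), base)
# [3] Population independence: replicating each income twice changes nothing.
replication = math.isclose(gini_R([5, 5, 10, 10, 3, 3]), base)
# [4] Transfer principle: moving 1 from the richest (10) to the poorest (3)
#     keeps the ranking and lowers the index.
transfer = gini_R([5, 9, 4]) < base

print(anonymity, scale, replication, transfer)
```

The checks use R because replicating the population leaves the Lorenz polyline, and hence R, unchanged.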

The index does not satisfy the principle of descending transfer [4.1].

In fact, with the Gini index a redistribution of income generally
reduces inequality more the further apart the two individuals are in
ordinal terms (difference of rank in the ranking).

The sensitivity of the index to a redistribution does not depend on the
income levels of the two individuals, but on the difference between
their ranks.
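
Reusing distributions a: (20, 300, 900), b: (70, 250, 900) and c: (20, 350, 850) from the descending-transfer example, the sketch below illustrates why the Gini index fails property [4.1]. Both transfers move 50 between units one rank apart, so the two reductions coincide. The function name `gini_R` is an illustrative choice.

```python
def gini_R(incomes):
    """Area-based Gini ratio R computed on the Lorenz polyline."""
    xs = sorted(incomes)
    n, total = len(xs), sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one unit and a positive total")
    area_terms, q_prev, running = 0.0, 0.0, 0.0
    for x in xs:
        running += x
        q = running / total
        area_terms += (1 / n) * (q + q_prev)
        q_prev = q
    return 1 - area_terms


a = [20, 300, 900]
b = [70, 250, 900]   # 50 moved from the middle unit to the poorest
c = [20, 350, 850]   # 50 moved from the richest unit to the middle one

drop_b = gini_R(a) - gini_R(b)
drop_c = gini_R(a) - gini_R(c)
print(round(drop_b, 4), round(drop_c, 4))
```

An index satisfying descending transfer would give a strictly larger reduction for b, where the recipient is poorer.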
Let us take an example to clarify the concept: a transfer of income
from the individual in 100th position to the individual in the poorest
position of all, the first, produces the same reduction in the index as
a transfer of equal amount from the individual in 800th position to the
one in 701st position.

The Gini index also does not satisfy the property of exact
decomposability between population groups [5].

The World Bank makes data on inequality in income distribution
available at the following link
http://wdi.worldbank.org/tables
Click on: Poverty and shared prosperity - 1.3 Distribution of income or
consumption
http://wdi.worldbank.org/table/1.3
https://www.theguardian.com/inequality/datablog/2017/apr/26/inequality-index-where-are-the-worlds-most-unequal-countries

The methodology and limitations of the Gini index as an indicator of
inequality are discussed in the following document
http://databank.worldbank.org/data/Views/Metadata/MetadataWidget.aspx?Name=GINI%20index%20(World%20Bank%20estimate)&Code=SI.POV.GINI&Type=S&ReqType=Metadata&ddlSelectedValue=SAU&ReportID=43276&ReportType=Table

From the World Bank website . . .

Gini index measures the extent to which the distribution of income (or, in
some cases, consumption expenditure) among individuals or households
within an economy deviates from a perfectly equal distribution. A Lorenz
curve plots the cumulative percentages of total income received against the
cumulative number of recipients, starting with the poorest individual or
household. The Gini index measures the area between the Lorenz curve and
a hypothetical line of absolute equality, expressed as a percentage of the
maximum area under the line. Thus a Gini index of 0 represents perfect
equality, while an index of 100 implies perfect inequality.

OECD Income Distribution Database (IDD): Gini, poverty, income, Methods
and Concepts

To benchmark and monitor income inequality and poverty across
countries, the OECD relies on a dedicated statistical database: the
OECD Income Distribution Database, which offers data on levels and
trends in Gini coefficients before and after taxes and transfers,
average and median household disposable incomes, relative poverty rates
and poverty gaps, before and after taxes and transfers, etc. Due to the
increasing importance of income inequality and poverty issues in policy
discussion, the database is now annually updated.

http://www.oecd.org/social/income-distribution-database.htm
To download key indicators click on

http://www.oecd.org/social/soc/IDD-Key-Indicators.pdf
To download the tables of data in Excel click on

http://www.oecd.org/social/OECD2016-Inequality-Update-Figures.xlsx
Composite indicators Transformation of variables

Part I

Problems and methodologies of synthesis of social indicators

References:
1 Delvecchio (1995) Measurement Scales and Social Indicators. Chap. 5, pp. 117-141, pp. 158-160
Alternative text in English: Nardo M., Saisana M., Tarantola S., Hoffman A., Giovannini E. (2005), Handbook on Constructing Composite Indicators: Methodology and User Guide, OECD 2005, pp. 1-49.
2 Leti G., Cerbara L. (2009). Elements of Descriptive Statistics. Chap. 10, The averages of distributions according to a character, pp. 185-189
3 BES2015 Report. The Composite Indicators (pp. 49-54)

Composite indicators

Simple or composite?

When a phenomenon is complex by nature, consisting of several dimensions (multidimensional), it needs to be represented in its various aspects.

In cases where several indicators contribute to the representation of the phenomenon, an overall synthesis is often used (the composite indicator), obtained as a combination of the elementary indicators.

The synthesis of elementary indicators into a composite indicator is also sometimes done for concepts that have a one-dimensional structure (think of income).

This is the case when several indicators are available that can be used to define the phenomenon, and the ranking of the units depends on which of the available indicators is selected to monitor the phenomenon.
Income: Employment income? Property income?

Simple or composite?

It is a matter of assessing whether there is a single indicator of the phenomenon that can identify its changes (possibly in space or time).
If two indicators lead to similar rankings of the units with respect to the intensity of the phenomenon, they are measuring the same aspect and are interchangeable.
If the two indicators lead to different rankings, it is more appropriate that both contribute to the formation of the composite indicator.
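
One possible way to check whether two indicators rank the units similarly is a rank correlation. The sketch below uses Spearman's coefficient, a technique not named in the slides and chosen here only as an illustration; the function names and the data are hypothetical, and tied values are assumed not to occur.

```python
def ranks(values):
    """Rank positions (1 = lowest value); assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, i in enumerate(order, start=1):
        result[i] = position
    return result


def spearman(x, y):
    """Spearman rank correlation for two untied series of equal length."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


indicator_a = [10, 20, 30, 40, 50]   # hypothetical values for 5 units
indicator_b = [12, 18, 33, 41, 60]   # orders the units in the same way
indicator_c = [50, 40, 30, 20, 10]   # orders the units in reverse

same = spearman(indicator_a, indicator_b)
opposite = spearman(indicator_a, indicator_c)
```

A value close to 1 suggests the two indicators are interchangeable for ranking purposes; values far from 1 suggest that each carries information the other does not.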

Example: Measuring the economic well-being of a nation through per capita income

Per capita income is generally indicative of the well-being of a country. High incomes are usually associated with high life expectancy at birth, an efficient health and education system, high technological development, etc.
Is per capita income equally representative of a country's well-being in countries with high inequality in income distribution?
In 2004 Italy and Finland had the same per capita income (PPP$), but the Gini concentration index was 0.36 for Italy and a much lower 0.26 for Finland.

Simple or composite?

If a different distribution of income leads to significant changes in the distribution of welfare, comparing the welfare of the two countries using per capita income alone will lead to distorted results.
Example:
In 2004, a nation with a high per capita income, such as the United States, had one of the highest inequality indices in the high-income bracket. At the same time, life expectancy at birth in the US is on average lower than in countries in the same income bracket.

The appropriateness of synthesizing elementary indicators into composite indicators has long been debated, and positions in the literature are diverse and depend on the purpose of the synthesis.

In some situations it is preferable to represent a complex phenomenon through the list of elementary indicators, without carrying out a final synthesis.

This choice is motivated by the fact that the synthesis of several indicators into a composite index could conceal significant changes that have taken place in specific aspects, and therefore slow down intervention.

In general, the choice of synthesizing the elementary indicators into a composite indicator is also linked to the research objectives.

If indicators are used for programming and/or monitoring purposes (providing early warning signals) on the status of a phenomenon, it is advisable to read the elementary indicators together.

If the indicators are used for descriptive-comparative purposes, then the synthesis of the elementary indicators into a composite index (indicator of the phenomenon as a whole) makes it possible to better evaluate the general conditions of a complex phenomenon and facilitates its comparison in time and space.

Synthesis of elementary indicators

The synthesis of the elementary indicators therefore implies:

Conceptual considerations about whether or not to aggregate individual components

Considerations mainly of a technical nature concerning the most suitable statistical methods for handling the data:

First, transform the elementary indicators so that they all point in the same direction with respect to the phenomenon under study (e.g. if we measure the quality of life in Italian provinces, the % of university graduates is a positive indicator, while the number of crimes per 1000 inhabitants is a negative indicator).
Then, make the elementary indicators independent of the unit of measurement (dimensionless) so that they are comparable.

We denote by $x_{ij}$ the value observed for the i-th unit on the j-th indicator, where j = 1, ..., m indexes the m elementary indicators and i = 1, ..., n indexes the statistical units considered (e.g. households, regions, provinces, states, etc.).

We represent the data through a matrix X with n rows and m columns:

$$
X = \begin{pmatrix}
x_{11} & x_{12} & \cdots & x_{1j} & \cdots & x_{1m} \\
x_{21} & x_{22} & \cdots & x_{2j} & \cdots & x_{2m} \\
\vdots & \vdots &        & \vdots &        & \vdots \\
x_{i1} & x_{i2} & \cdots & x_{ij} & \cdots & x_{im} \\
\vdots & \vdots &        & \vdots &        & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{nj} & \cdots & x_{nm}
\end{pmatrix}
$$
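
Each statistical unit (i = 1, ..., n) is associated with the composite indicator $s_i$, obtained from the synthesis of the m elementary indicators:

$$
\begin{array}{cccccl}
x_{11} & x_{12} & \cdots & x_{1j} & \cdots & s_1 = \ldots \\
x_{21} & x_{22} & \cdots & x_{2j} & \cdots & s_2 = \ldots \\
\vdots & \vdots &        & \vdots &        & \vdots \\
x_{i1} & x_{i2} & \cdots & x_{ij} & \cdots & s_i = \ldots \\
\vdots & \vdots &        & \vdots &        & \vdots \\
x_{n1} & x_{n2} & \cdots & x_{nj} & \cdots & s_n = \ldots
\end{array}
$$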

Synthesis of elementary indicators

The composite indicator for the i-th unit is obtained from the synthesis of the values observed on several simple indicators

$$ x_{i1}, x_{i2}, \ldots, x_{im} $$

The synthesis is carried out through a function, for example the sum, an average (simple arithmetic, geometric, harmonic, etc.), the median, etc.

$$ s_i = f(x_{i1}, \ldots, x_{im}) $$

If we use the simple (unweighted) arithmetic average of the m indicators we will have

$$ s_i = \frac{1}{m} \sum_{j=1}^{m} x_{ij} $$

We will then obtain n values of the composite indicator, $s = (s_1, \ldots, s_n)$.
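
To make the role of the synthesis function f concrete, the sketch below applies four of the functions mentioned above to the values of one unit, using Python's standard `statistics` module (Python 3.8 or later). The three values are hypothetical and are assumed to be already comparable.

```python
from statistics import mean, geometric_mean, harmonic_mean, median

# Hypothetical values x_i1, x_i2, x_i3 observed for a single unit i
row = [1.0, 2.0, 4.0]

s_arithmetic = mean(row)            # (1 + 2 + 4) / 3
s_geometric = geometric_mean(row)   # cube root of 1 * 2 * 4
s_harmonic = harmonic_mean(row)     # 3 / (1/1 + 1/2 + 1/4)
s_median = median(row)              # middle value
```

The four results differ (the harmonic mean is the lowest, the arithmetic mean the highest), which is why the choice of f is part of the definition of the composite indicator.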

If the indicator variables x1, ..., xm have different metrics, we need to make them dimensionless before aggregating them through a synthesis function.
A transformation g(x) that makes it dimensionless is applied to each indicator x1, ..., xm.

For each statistical unit, the transformation g(x) is applied to all the elementary indicators that contribute to the definition of the phenomenon:

$$ x_{i1}, \ldots, x_{im} \;\longrightarrow\; g(x_{i1}), \ldots, g(x_{im}) $$

g(x) is a transformation of the original data made with the aim of obtaining measurements that all have the same direction and the same unit of measurement.

Synthesis of elementary indicators

Example:
Suppose we measure the socio-economic status of households through two indicators
socio-economic status = f(economic status, social status)
The two variables are not directly measurable, so we use two indicators x1 and x2
economic status: annual household income (in thousands of euros) → x1
social status: education of the parent with the highest level of education (number of years taken to obtain the qualification) → x2

Suppose:
for the x1 indicator we observe values between 5400 and 154000 euros, with an average of 36000 euros
for x2 we observe values between 5 and 25 years
We want to synthesize the values with a composite indicator, using the arithmetic mean as the synthesis function

Before proceeding with the synthesis we need to transform x1 and x2 so that they are directly comparable (e.g. so that they take values between 0 and 100, or can be interpreted as z-scores, i.e. the distance of each observation from the mean in terms of standard deviations, etc.)

If we transform the values of the simple indicators x1 and x2 into z-scores, the function g(x) is

$$ z = g(x) = \frac{x - \bar{x}}{\sigma} $$

The z-scores z1 and z2 are dimensionless measurements that can be used for the synthesis

Transformation of variables

Identification of a procedure for the transformation of elementary indicators

Elementary indicators measured in different units of measurement must be made aggregable

The main transformation methods can be divided into:

1 Methods based on the ordinal approach
2 Methods based on the cardinal approach
3 Methods for the transformation of ordered qualitative variables (indirect quantification methods)

1 Ordinal approach
Sorting of the values of the elementary indicators (Ranking)
Values of elementary indicators transformed into percentiles (or rank percentiles)

2 Cardinal approach
Values of elementary indicators transformed into index numbers
Values of elementary indicators relativized with the range of variation (Re-scaling)
Values of elementary indicators transformed into z-scores (Standardization)
Values of elementary indicators transformed into percentages



We examine these transformations taking as a reference, for purely didactic purposes, the battery of indicators for measuring the standard of living in the 20 Italian regions.
Data 1989-1990 (Delvecchio 1995, p. 98-99)

Example: Indicators for measuring living standards (Delvecchio, 1995, p. 98)

The value of indicator X1 for Piedmont is identified by column B and row 2, that is, by cell B2

Negative Indicators

The first thing we will do is orient all indicators in the same direction (positive direction), in such a way that higher indicator values correspond to better situations.

We will mark with an asterisk (*) the indicators whose direction (polarity) has been changed.

For example, for the first elementary indicator X1 = 'infant mortality rate within 1 year of birth (per 1,000)', the positive indicator X1* is given by the complement to 1000 of the indicator X1

$$ X_1^{*} = 1000 - X_1 $$

For Piedmont

$$ x_{11}^{*} = 1000 - 8.2 = 991.8 $$

If, as in the case of the second indicator, the reference unit is 10,000, we change the direction with the following operation

$$ X_2^{*} = 10000 - X_2 $$

In the next table the direction of all negative indicators has been changed

For indicator X5 = occupancy rate (no. of occupants of a dwelling / no. of rooms), the positive indicator was obtained through the following transformation

$$ X_5^{*} = \frac{1}{X_5} $$

As the value of the variable $X_5$ increases, $X_5^{*}$ decreases
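
The two ways of reversing polarity described above can be sketched in a few lines of Python. The function names are chosen for this example only, the Piedmont value 8.2 comes from the text, and the occupancy value 0.8 is hypothetical.

```python
def complement(value, base):
    """Reverse the polarity of a rate expressed per `base` units: X* = base - X."""
    return base - value


def reciprocal(value):
    """Reverse the polarity of a ratio: X* = 1 / X."""
    return 1 / value


# Infant mortality rate (per 1,000) for Piedmont, from the text
x1_star = complement(8.2, 1000)

# Hypothetical occupancy rate (occupants per room)
x5_star = reciprocal(0.8)
```

In both cases a larger original value gives a smaller transformed value, so that higher values of the transformed indicator correspond to better situations.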

[Figure: plot of X*=1/X, with X_5* on the vertical axis and X_5 on the horizontal axis (from 0.50 to 1.00)]

For Piedmont X5* = 1/Indicators!F2, because the values of the X5 indicator are found in column F of the Excel sheet 'Indicators'

Ranking
For each elementary indicator the statistical units are sorted in ascending order and each unit is assigned a value equal to the order number (or rank) that the unit occupies in the ranking

$$ g(x) = \mathrm{rank}(x) $$

If the indicator is negative and its direction has not been changed, the units are sorted in descending order.
When several units have the same value, they are assigned the average rank.

Example: If we order the regions according to per capita income and the third, fourth and fifth units have the same value (the same income), each of them will be given the average rank of 4:

$$ \frac{3 + 4 + 5}{3} = 4 $$

The range of variation of the new variables is $1 \le g(x) \le n$. The method is not affected by the presence of anomalous observations.
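
The ranking rule with average ranks for ties can be sketched as follows. The function name `average_ranks` and the six income values are assumptions made for this illustration; the third, fourth and fifth values are tied, as in the example above.

```python
def average_ranks(values, ascending=True):
    """Rank each value (1 = first in the ordering); ties get the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=not ascending)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Find the end of the block of units tied with the one at `pos`
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Mean of the positions pos+1 .. end+1 occupied by the tied block
        avg = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


incomes = [10, 20, 30, 30, 30, 40]       # hypothetical per capita incomes
r = average_ranks(incomes)               # ascending order
r_desc = average_ranks([10, 20, 30], ascending=False)
```

Passing `ascending=False` corresponds to the case of a negative indicator whose direction has not been changed.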

Elementary indicators with positive direction: average rank
We denote by $g_{ij}$ the rank assigned to the i-th statistical unit with respect to the j-th indicator.
We use the Excel function RANK.AVG(num, ref, order) (RANGO.MEDIA in the Italian version of Excel), which requires 3 pieces of information to be specified. (You will see these in detail in the lab.)

RANK.AVG(num, ref, order)

num: value to which the rank must be assigned (sheet 'Indicators+', cell B2)
ref: set of values that the variable takes on (sheet 'Indicators+', cells B2 to B21)
order: 0 descending, 1 ascending

Where do I find the RANK.AVG (RANGO.MEDIA) function?

Sum of ranks

The synthesis into a composite indicator of the ranks observed on the individual elementary indicators can be done through the sum of the ranks.

Denoting by $g_{ij}$ the rank assigned to the i-th unit with respect to the j-th indicator, the synthetic indicator is

$$ s_i = \sum_{j=1}^{m} g_{ij} $$

with $m \le s_i \le m \cdot n$: the sum of the ranks varies between min = m and max = m n.

The index of 57.5 for Sicily (minimum value observed) was obtained by summing the ranks that this region occupies in the 16 rankings of the 16 indicators:

2 + 1 + 4 + 5 + ... + 6 + 1 + 3 + 7 + 6 = 57.5

The highest value is observed for Friuli V. G. (s = 238)



Sometimes it is preferred to relativize the indicator between 0 and 1 using the following transformation

$$ s_i^{*} = \frac{s_i - \min(s)}{\max(s) - \min(s)} $$

or between 0 and 100

$$ s_i^{*} = \frac{s_i - \min(s)}{\max(s) - \min(s)} \times 100 $$

Neither composite indicator ($s$ or $s^{*}$) takes into account the value assumed by the statistical units on each of the elementary indicators that contributed to its determination, but only their relative position

With regard to $s^{*}$, note that it can be calculated using the theoretical minimum and maximum, that is min = m and max = n m.

The theoretical minimum corresponds to the value of s that would be observed if the same region occupied the lowest rank (rank = 1), the worst result, in all the rankings for the 16 indicators.

The theoretical maximum corresponds to the value of s that would be observed if the same region occupied the highest rank (rank = 20), the best result, in all the rankings for the 16 indicators.

For the example of the quality of life in the Italian regions, the theoretical minimum and maximum are:
min = 16 and max = 16 x 20 = 320

In some circumstances, the minimum and maximum observed in the composite indicator values are used.

Minimum and maximum observed in the distribution: min = 57.5 and max = 256.0
We will denote this second indicator by $s^{**}$ to distinguish it from $s^{*}$.

In the latter case, the relative indicator takes values between 0 and 1 (or between 0 and 100 if multiplied by 100).

$$ s_i^{**} = \frac{s_i - \min(s)}{\max(s) - \min(s)} $$
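
The sum of ranks and its two relativized versions can be sketched as follows, using either the theoretical bounds (m and m n) or the observed minimum and maximum. The function names and the small rank table (n = 3 units, m = 2 indicators) are hypothetical; only Sicily's value of 57.5 and the bounds 16 and 320 come from the example in the text.

```python
def sum_of_ranks(rank_rows):
    """s_i = sum over the m indicators of the ranks g_ij of unit i."""
    return [sum(row) for row in rank_rows]


def relativize(s, lower, upper):
    """Map each s_i to (s_i - lower) / (upper - lower)."""
    return [(v - lower) / (upper - lower) for v in s]


# Hypothetical ranks: one row per unit, one column per indicator
rank_rows = [[1, 1], [2, 3], [3, 2]]
n, m = 3, 2

s = sum_of_ranks(rank_rows)
s_theoretical = relativize(s, m, m * n)          # theoretical bounds m and m*n
s_observed = relativize(s, min(s), max(s))       # observed bounds

# Sicily in the text: s = 57.5, theoretical bounds 16 and 16 * 20 = 320
sicily = relativize([57.5], 16, 320)[0]
```

With observed bounds the lowest unit always scores 0 and the highest 1; with theoretical bounds those extremes are reached only by a unit that is last (or first) on every indicator.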

Transformation of values into rank percentiles of the distribution

Each statistical unit is assigned, for each elementary indicator, a score on a scale of 0 to 100 based on the percentile it occupies in the distribution
Example:
units from the 90th to the 100th percentile score 100
units from the 80th to the 90th percentile score 90
...
units from the 20th to the 30th percentile score 30

To remember:
1 All information about the levels is lost
2 The transformation is not suitable for temporal comparisons.

Values of elementary indicators transformed into rank percentiles

To transform the values of the elementary indicators into rank percentiles ($x_{ij}$ must be transformed into percentiles $p_{ij}$), two Excel functions (which you will see in the lab) can be used.

In Excel we will apply a function that returns the rank of a value in a dataset as a percentage (extremes excluded) of the dataset itself. Taking the example of an aptitude test, the function expresses the standing of a score relative to all other scores on the same test.
PERCENTRANK.EXC(array, x, [significance]) (ESC.PERCENT.RANGO in the Italian version of Excel)

array: all the values of the elementary indicator

x: value for which you want to calculate the rank

significance: optional (number of significant digits of the result)

=PERCENTRANK.EXC('indicators+'!B$2:B$22, 'indicators+'!B2)*100

array: the values of variable X1 are found in the Excel sheet 'Indicators+', in column B, rows 2 to 22 (B2:B22)

x: we want to extract the percentile rank of the first observation ($x_{11}$, which is in cell B2)

The transformed value is multiplied by 100 to get $p_{11}$
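
As a rough Python counterpart of the percentile-rank step, the sketch below computes the rank of a value as (number of values below it + 1) / (n + 1), which is the "extremes excluded" rule as understood here. It assumes that x is one of the values in the dataset; the spreadsheet function's interpolation for other values and its rounding to a given number of significant digits are not reproduced. The function name and the four scores are hypothetical.

```python
def percent_rank_exc(data, x):
    """Percentile rank of x within data, extremes excluded.

    Assumes x is one of the values in `data`.
    """
    below = sum(1 for v in data if v < x)
    return (below + 1) / (len(data) + 1)


scores = [1, 2, 3, 4]                       # hypothetical indicator values
p = percent_rank_exc(scores, 2) * 100       # multiplied by 100, as in the text
p_low = percent_rank_exc(scores, 1) * 100
p_high = percent_rank_exc(scores, 4) * 100
```

Because the denominator is n + 1, neither the lowest nor the highest value ever reaches 0 or 100.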



ROUND(num, num_digits) (ARROTONDA in the Italian version of Excel)
num: number to be rounded.
num_digits: number of digits to which the num argument is rounded.
If num_digits is greater than 0 (zero), num is rounded to the specified number of decimal places.
If num_digits is equal to 0, num is rounded to the nearest integer.
If num_digits is less than 0, num is rounded to the left of the decimal point.

Synthesis

If we denote by $p_{ij}$ the (rank) percentile assigned to the i-th statistical unit with respect to the j-th indicator, the synthetic indicator is

$$ s_i = \sum_{j=1}^{m} p_{ij} $$

with $0 \le s_i \le 100\,m$: the sum of the (rank) percentiles varies between min = 0 and max = m x 100

Again, we can rescale the composite indicator to the range 0-1

$$ s_i^{*} = \frac{s_i - \min(s)}{\max(s) - \min(s)} $$

or to the range 0-100

$$ s_i^{*} = \frac{s_i - \min(s)}{\max(s) - \min(s)} \times 100 $$

Transformation into index numbers

The elementary indicators can be transformed into fixed-base index numbers.

In this way they are freed from the unit of measurement.

For each elementary indicator we calculate the ratio between the observed value and its average. Each value $x_{ij}$ is transformed into

$$ I_{ij} = \frac{x_{ij}}{\bar{x}_j}, \quad i = 1, \ldots, n, \quad j = 1, \ldots, m $$

where $\bar{x}_j$ ($\mu_j$) is the mean of the elementary indicator $x_j$. Therefore

$$ g(x) = \frac{x}{\bar{x}} $$

In the example relating to the quality of life in Italian regions, n = 20 and m = 16

The series of fixed-base index numbers, constructed with respect to the mean value of the distribution, makes it possible to evaluate the positions of the different statistical units relative to that mean value.

We transform all the elementary indicators into index numbers and then use the simple arithmetic mean as the synthesis function

$$ s_i = \frac{1}{m} \sum_{j=1}^{m} I_{ij}, \quad i = 1, \ldots, n $$

First of all it is necessary to calculate the average of each elementary indicator (in the lab you will use the arithmetic average function AVERAGE(B2:B22) available in Excel)

The values $x_{ij}$ are then transformed into index numbers $I_{ij}$

For each unit the arithmetic mean (row average) of the index numbers is calculated

The region with the lowest value is Basilicata (0.78), while the region with the highest value is Liguria (1.17).

By transforming the elementary indicators into index numbers we free them from the unit of measurement, but not from the differences in variability.

Therefore, when taking the simple arithmetic average of the index numbers, those with greater variability weigh more on the final result.

In order to free indicators from both the unit of measurement and their variability, the following transformations are mainly used: Re-scaling (relativization with respect to the range of variation) and Standardization
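
The index-number transformation and the effect of unequal variability can be seen in a small sketch. The function name and the two columns of values (3 units, 2 indicators) are hypothetical; the second indicator is deliberately much more variable than the first.

```python
def index_numbers(column):
    """I_ij = x_ij / mean of indicator j (fixed base = mean of the column)."""
    avg = sum(column) / len(column)
    return [v / avg for v in column]


# Hypothetical indicators observed on 3 units
x1 = [10.0, 20.0, 30.0]        # mean 20
x2 = [100.0, 400.0, 100.0]     # mean 200, more variable once divided by its mean

i1 = index_numbers(x1)
i2 = index_numbers(x2)

# Composite indicator: simple arithmetic mean of the index numbers of each unit
s = [(a + b) / 2 for a, b in zip(i1, i2)]
```

The second unit ends up first in the composite ranking mainly because of its value of 2.0 on the more variable indicator, which illustrates why index numbers with greater variability weigh more on the result.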

In cases where individual values are the result of a ratio, it may be appropriate to synthesize the elementary indicators through the geometric mean.

Later we will see the main differences between the averages (summary measures) that we can use to obtain a composite indicator: arithmetic, geometric and harmonic average.

Re-scaling

The elementary indicator is rescaled so that it ranges between 0 (lowest value) and 1 (highest value). The generic value $x_{ij}$ is transformed into $R_{ij}$

$$ R_{ij} = \frac{x_{ij} - \min(x_j)}{\max(x_j) - \min(x_j)} $$

The range of variation of all transformed elementary indicators is $0 \le g(x) \le 1$, or $0 \le g(x) \le 100$ if the transformed indicator is multiplied by 100.

The re-scaling transformation frees the indicators from both the unit of measurement and their variability, since the deviation from the minimum value is relativized with respect to a measure of variability, i.e. the range of variation $\max(x_j) - \min(x_j)$

This transformation is widely used in economics and is robust enough to make comparisons in space.

A limitation of the transformation is that it is affected by the values of the minimum and maximum (it is necessary to check beforehand whether there are anomalous observations).

The minimum and the maximum must be fixed if we want to use this transformation to make comparisons over time (we will see an example in the case of the HDI indicator).

In the lab you will see how to apply the function in Excel

The value of R1 for Piedmont was rescaled using the following formula
=('indicators+'!B2-MIN('indicators+'!B$2:B$21))/(MAX('indicators+'!B$2:B$21)-MIN('indicators+'!B$2:B$21))

For Piedmont, the value of the first indicator $R_{11}$ is equal to

$$ R_{11} = \frac{x_{11} - \min(x_1)}{\max(x_1) - \min(x_1)} = \frac{991.8 - 988.8}{995.4 - 988.8} = \frac{3.0}{6.6} \approx 0.45 $$

The overall synthetic indicator for each region is obtained as the arithmetic (row) average of the rescaled values

$$ s_i = \frac{1}{m} \sum_{j=1}^{m} R_{ij}, \quad i = 1, \ldots, n $$

The change of direction of the negative indicators can also be performed directly in the re-scaling by setting:

min(x) = worst value of the indicator (11.2)

max(x) = best value of the indicator (4.6)

For the infant mortality rate in Piedmont, $x_{11} = 8.2$:

$$ R_{11} = \frac{8.2 - 11.2}{4.6 - 11.2} \times 100 = 45.5 $$
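
The re-scaling of a single value, including the variant in which the worst and best values take the place of the minimum and maximum, can be checked with a short sketch. The function name `rescale_value` is an assumption for this example; the numbers are those given in the text for Piedmont.

```python
def rescale_value(x, lower, upper, scale=1.0):
    """R = (x - lower) / (upper - lower) * scale."""
    return (x - lower) / (upper - lower) * scale


# Positive version of indicator 1 (X1* = 1000 - X1), Piedmont
r11 = rescale_value(991.8, 988.8, 995.4)

# Original infant mortality rate, with direction changed inside the re-scaling:
# `lower` is the worst value (11.2) and `upper` the best value (4.6)
r11_direct = rescale_value(8.2, 11.2, 4.6, scale=100)
```

Both routes give the same position for Piedmont (about 0.45 on the 0-1 scale, 45.5 on the 0-100 scale), since 988.8 = 1000 - 11.2 and 995.4 = 1000 - 4.6.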

Standardization
Each value of the elementary indicator ($x_{ij}$) is transformed into a standardized deviation (i.e. z-score)

$$ z_{ij} = \frac{x_{ij} - \bar{x}_j}{\sigma_j} $$

where

$$ \sigma_j = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2} $$

Therefore

$$ g(x) = \frac{x - \bar{x}}{\sigma} $$

It measures how far the individual observations $x_{ij}$ are from the mean of variable j in terms of standard deviations.
Range of variation of the new variables: $-\infty < g(x) < +\infty$
The indicators are transformed onto a common scale with mean 0 and variance 1
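
A minimal sketch of the standardization, using the population standard deviation (division by n, as in the formula above) from Python's `statistics` module. The function name and the eight values are hypothetical.

```python
from statistics import mean, pstdev


def z_scores(column):
    """z_ij = (x_ij - mean_j) / sigma_j, with sigma_j computed dividing by n."""
    avg = mean(column)
    sd = pstdev(column)        # population standard deviation
    return [(v - avg) / sd for v in column]


x = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]   # mean 5, sigma 2
z = z_scores(x)
```

The transformed values have mean 0 and variance 1 whatever the original unit of measurement, which is what makes indicators with different metrics comparable.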

This transformation assigns greater weight to indicators with extreme values.

This is desirable if you want to highlight particularly positive or negative performance.

This distortion can be corrected by assigning less weight to the indicators that present anomalous values.

Each value $x_{ij}$ is transformed into $z_{ij}$

$$ z_{ij} = \frac{x_{ij} - \bar{x}_j}{\sigma_j} $$

For Piedmont

$$ z_{11} = \frac{991.8 - 991.63}{\sigma_1} = 0.0939 $$

The z-score transformation is the most widely used in statistics.

Synthesis methodologies based on multivariate analysis often require prior transformation of the variables into z-scores.

In the presence of negative elementary indicators, the change of direction can be done directly by applying the transformation

$$ g(x) = -z = -\frac{x - \bar{x}}{\sigma} $$

The synthetic indicator is obtained as the arithmetic mean of the transformed values

$$ s_i = \frac{1}{m} \sum_{j=1}^{m} z_{ij}, \quad i = 1, \ldots, n $$

Sicily appears to have the lowest level of quality of life (s = -1.30), followed by Calabria (s = -1.08). The regions with the best values are Liguria (s = 0.88) and Emilia Romagna (s = 0.86)

Standardization is a procedure that allows comparison between variables that have different means and variances. It satisfies two important properties:

It frees the elementary indicators from the unit of measurement

It homogenizes the variances of the different elementary indicators



Transformation of values into percentages

The transformation of the values of each indicator into percentage values is another option that can be adopted to make the indicators dimensionless.

Each value of a simple indicator is expressed in percentage terms with respect to the sum of the values assumed by the indicator in the n statistical units

$$ P_{ij} = \frac{x_{ij}}{\sum_{i=1}^{n} x_{ij}} \times 100 $$

For each elementary indicator we need to calculate the sum (of the values in the same column)

The indicator shows the percentage of the total amount of the character possessed by the i-th unit.

$$ P_{ij} = \frac{x_{ij}}{\sum_{i=1}^{n} x_{ij}} \times 100 $$

For Piedmont:

$$ P_{11} = \frac{991.8}{19832.6} \times 100 \approx 5.00 $$

The synthetic indicator is obtained as the arithmetic mean of the transformed values

$$ s_i = \frac{1}{m} \sum_{j=1}^{m} P_{ij}, \quad i = 1, \ldots, n $$

which takes values between 0 and 100

Indirect Quantification Methods

The methods we will present are applied to obtain a summary index of an ordinal qualitative variable (whose categories can be ordered).
We take as an example the distribution of respondents to a question in the survey on income and living conditions of households (EU-SILC), in which the head of the household is asked about the economic situation and any difficulty in making ends meet.

In this example, the response categories for the question on how households manage to make ends meet are:
easily and very easily; with some difficulty and quite easily; with difficulty; with great difficulty

Indirect Quantification Methods

European Union Statistics on Income and Living Conditions (EU-SILC)

Households by ability to make ends meet - Italy 2017 (%)

Category                                  %
easily and very easily                    3.2
with some difficulty and quite easily    69.4
with difficulty                          19.5
with great difficulty                     7.9

A synthetic indicator of the difficulty of households in making ends meet can be constructed by considering, relative to the k categories of the variable:
1) the percentage of responses (pk) that indicate situations of difficulty: 19.5 + 7.9 = 27.4%.
2) numerical values qk = (0.00, 0.33, 0.66, 1.00) assigned to the levels of the variable, and calculating their weighted average (with relative frequencies or percentages as weights).

The resulting indicator Q takes values between 0 and 100 (or between 0 and 1 if relative frequencies or proportions are used)

Category                                 pk (%)   qk (weight)   pk x qk
easily and very easily                     3.2       0.00       3.2 x 0.00
with some difficulty and quite easily     69.4       0.33       69.4 x 0.33
with difficulty                           19.5       0.66       19.5 x 0.66
with great difficulty                      7.9       1.00       7.9 x 1.00
Total                                    100.0                  43.67



With method 1 we conclude that 27.4% of households have difficulty making ends meet.

With method 2 we find that, on a scale from 0 to 100, the difficulty indicator takes the value 43.67
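
The two results can be reproduced with a few lines of Python, using the percentages and the scores given in the text (the variable names are chosen for this example):

```python
# Percentages per category, from "easily and very easily" to "with great difficulty"
p = [3.2, 69.4, 19.5, 7.9]
# Scores assigned to the four ordered categories
q = [0.00, 0.33, 0.66, 1.00]

# Method 1: share of households in the two categories signalling difficulty
share_in_difficulty = p[2] + p[3]

# Method 2: average of the scores weighted by the percentages
Q = sum(pk * qk for pk, qk in zip(p, q))
```

Method 2 depends on the scores chosen for the categories: equally spaced scores are a convention, not a property of the data.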

We will now look at a third method that uses the concept of dissimilarity between the distributions of two variables.

Indirect Quantification Methods

The situation of minimum difficulty occurs when 100% of the observations are in the first category (easily and very easily).

The situation of maximum difficulty occurs when 100% of the observations are in the last category (with great difficulty).

The objective is to assess how close the observed distribution of respondents is to the minimum-difficulty distribution. The greater the distance of the observed distribution from that of minimum difficulty, the greater the dissimilarity between the two distributions.

We want to construct an index that is equal to 0 when the two distributions coincide, and equal to 1 (or 100) when the distance between the two distributions is maximum.
Let's look at these concepts through examples.

The method based on the dissimilarity index

We compute the relative dissimilarity index for ordinal variables between the observed distribution of relative frequencies $f_k$ (or percentages $p_k$) and the distribution of minimum difficulty.
The index is based on a comparison of the two distributions (the observed one and the minimum-difficulty one) in terms of their cumulative relative frequencies $F_k$ (or cumulative percentages $P_k$).
Let us denote by $F_k$ the cumulative relative frequency for the k-th category

$$ F_k = \sum_{h=1}^{k} f_h = f_1 + \ldots + f_k $$

or by $P_k$ the cumulative percentage frequency for the k-th category

$$ P_k = \sum_{h=1}^{k} p_h = p_1 + \ldots + p_k $$

We define the distribution of the observed frequencies and that of minimum difficulty (which we will call the reference distribution)

Households by ability to make ends meet: observed and reference (minimum difficulty) distributions

Category                                 Observed   Reference   Cumulative observed   Cumulative reference
                                         (%)        (%)         (F^o x 100)           (F^r x 100)
easily and very easily                     3.2       100.0         3.2                  100.0
with some difficulty and quite easily     69.4         0.0        72.6                  100.0
with difficulty                           19.5         0.0        92.1                  100.0
with great difficulty                      7.9         0.0       100.0                  100.0
Total                                    100.0       100.0

The dissimilarity index between the two distributions is based on a comparison of the cumulative frequencies of the observed distribution and the reference distribution:
$$z' = \frac{1}{K-1} \sum_{k=1}^{K-1} \left| F^o_k - F^r_k \right|$$
(Leti, 1983)
where
K = the number of categories
$F^o_k$ = cumulative frequency of category k in the observed distribution
$F^r_k$ = cumulative frequency of category k in the reference distribution

z' = 0: minimum dissimilarity, when all differences are equal to 0.
z' = 1: maximum dissimilarity, when all observed frequencies fall into the category signaling maximum difficulty.
When cumulative percentage frequencies $P_k$ are used, the index varies between 0 and 100.
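The index can be sketched in a few lines of Python. This is an illustrative implementation under the assumption that both distributions are given as percentages over the same ordered categories, easiest first; the function name `dissimilarity_index` is hypothetical. The two calls check the bounds stated above.

```python
from itertools import accumulate


def dissimilarity_index(observed, reference):
    """Mean absolute gap between the two cumulative distributions,
    taken over the first K-1 categories."""
    if len(observed) != len(reference) or len(observed) < 2:
        raise ValueError("both distributions need the same K >= 2 categories")
    cum_obs = list(accumulate(observed))
    cum_ref = list(accumulate(reference))
    k = len(observed)
    # The K-th cumulative values both equal the total, so only K-1 terms are summed
    gaps = [abs(o - r) for o, r in zip(cum_obs[:-1], cum_ref[:-1])]
    return sum(gaps) / (k - 1)


min_difficulty = [100.0, 0.0, 0.0, 0.0]  # everyone in the first category
max_difficulty = [0.0, 0.0, 0.0, 100.0]  # everyone in the last category

print(dissimilarity_index(min_difficulty, min_difficulty))  # 0.0
print(dissimilarity_index(max_difficulty, min_difficulty))  # 100.0
```

Dividing by K - 1 is what keeps the index between 0 and 100: in the worst case each of the K - 1 gaps equals 100.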

Example of index application:

$P^o$ = observed cumulative percentage frequencies (households by how easily they make ends meet)
$P^r$ = reference cumulative percentage frequencies (theoretical situation of minimum difficulty)

Category                                   P^o      P^r     |P^o - P^r|
easily and very easily                      3.2    100.0       96.8
with some difficulty and quite easily      72.6    100.0       27.4
with difficulty                            92.1    100.0        7.9
with great difficulty                     100.0    100.0        0.0
total                                                         132.1

K - 1 = 3, so z' = 132.1 / 3 = 44.03

Derive the value of the hardship index z' for households with a different number of members (territory: Italy; year: 2017).

Perceived economic situation by number of household members (%)

Members         with great    with          with some difficulty    easily and     z'
                difficulty    difficulty    and quite easily        very easily
one                 7.9          22.1              67.0                 3.0
two                 5.9          16.3              74.3                 3.5
three               8.3          17.1              71.3                 3.3
four                9.4          20.6              67.0                 3.0
five or more       13.5          26.2              58.1                 2.2
total               7.9          19.5              69.5                 3.2
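One possible way to work through this exercise is sketched below in Python. The helper name `hardship_index` and the dictionary layout are assumptions made for illustration; the figures are the ones in the table. The reference is taken to be the minimum-difficulty distribution, whose cumulative percentage is 100 in every category.

```python
from itertools import accumulate


def hardship_index(percentages_easiest_first):
    """z' against the minimum-difficulty reference (100% in the first category)."""
    k = len(percentages_easiest_first)
    cum_obs = list(accumulate(percentages_easiest_first))
    # Reference cumulative percentage is 100 everywhere; the last term is dropped
    return sum(abs(p - 100.0) for p in cum_obs[:-1]) / (k - 1)


# Rows in the column order of the table: great difficulty, difficulty,
# some difficulty and quite easily, easily and very easily
table = {
    "one": [7.9, 22.1, 67.0, 3.0],
    "two": [5.9, 16.3, 74.3, 3.5],
    "three": [8.3, 17.1, 71.3, 3.3],
    "four": [9.4, 20.6, 67.0, 3.0],
    "five or more": [13.5, 26.2, 58.1, 2.2],
    "total": [7.9, 19.5, 69.5, 3.2],
}

# Reverse each row so that the easiest category comes first before cumulating
z_by_size = {size: hardship_index(row[::-1]) for size, row in table.items()}
for size, z in z_by_size.items():
    print(f"{size:>12}: z' = {z:.2f}")
```

Each row is reversed because the table lists the hardest category first, while the reference distribution places all households in the easiest one. The "total" row sums to 100.1 because of rounding in the published figures, so its result differs slightly from the worked example.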

An interesting application of the method can be found at the following link:
Sara Casacci and Adriano Pareto, The construction of subjective indicators using dissimilarity indices: an application to the survey of aspects of daily life, National Institute of Statistics, https://www.slideshare.net/slideistat/s-casacci-a-pareto
Comparison of rankings


Comparison of rankings: Spearman's rho index

How do we measure the association between rankings obtained with two different indicators?

For example, we may be interested in evaluating how the ranking of Italian regions by quality of life varies when we change the elementary indicators that define a dimension of quality of life.

Alternatively, we may be interested in assessing whether the ranking changes significantly as we vary the transformation (or aggregation) functions we adopt.

A widely used cograduation (rank correlation) index for assessing the association between rankings is Spearman's index $\rho_s$.

We denote by $d_i$ the difference between the ranks of the i-th unit and by n the total number of statistical units:
$$\rho_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n (n^2 - 1)}$$

$-1 \le \rho_s \le 1$

$\rho_s = 1$ when units have the same rank in both rankings

$\rho_s = -1$ if the ranks are in perfect discordance

$\rho_s = 0$ if the two rankings show no association
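The two extreme cases can be checked with a short Python sketch of Spearman's formula, 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)). The function name `spearman_rho` and the five-unit rankings are made up for illustration, and the rankings are assumed to contain no ties.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman's index computed from two lists of ranks without ties."""
    n = len(ranks_a)
    if n != len(ranks_b) or n < 2:
        raise ValueError("need two rankings of the same n >= 2 units")
    # Sum of squared rank differences d_i^2
    d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


ranking = [1, 2, 3, 4, 5]
inverted = [5, 4, 3, 2, 1]

print(spearman_rho(ranking, ranking))   # 1.0  (identical rankings)
print(spearman_rho(ranking, inverted))  # -1.0 (perfect discordance)
```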



Measures of cograduation between rankings

Spearman's index
$$\rho_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n (n^2 - 1)}$$
Let's take the assessment of language and math skills for 9 countries as an example. Tied scores receive the average of the ranks they occupy.

LANGUAGE   MATHEMATICS   rank lang.   rank math.    d_i    d_i^2
  200          198           1.5          3         -1.5    2.25
  200          195           1.5          5         -3.5   12.25
  196          198           4            3          1      1
  196          198           4            3          1      1
  196          200           4            1          3      9
  189          190           6            6          0      0
  186          188           7.5          7          0.5    0.25
  186          185           7.5          8.5       -1      1
  183          185           9            8.5        0.5    0.25
  sum                                                      27.00

$$\rho_s = 1 - \frac{6 \cdot 27}{9 (81 - 1)} = 0.775 \qquad (2)$$
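The worked example can be reproduced with the Python sketch below. The helper `average_ranks` is a hypothetical name; it assumes rank 1 goes to the highest score and that tied scores share the mean of the positions they occupy, which matches the ranks in the table.

```python
def average_ranks(scores):
    """Rank scores in descending order (1 = highest); ties share the mean position."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next score equals the current one
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        mean_rank = (pos + 1 + end + 1) / 2
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


language = [200, 200, 196, 196, 196, 189, 186, 186, 183]
mathematics = [198, 195, 198, 198, 200, 190, 188, 185, 185]

rank_lang = average_ranks(language)
rank_math = average_ranks(mathematics)
d2 = sum((a - b) ** 2 for a, b in zip(rank_lang, rank_math))
n = len(language)
rho = 1 - 6 * d2 / (n * (n ** 2 - 1))

print(rank_lang)      # [1.5, 1.5, 4.0, 4.0, 4.0, 6.0, 7.5, 7.5, 9.0]
print(rank_math)      # [3.0, 5.0, 3.0, 3.0, 1.0, 6.0, 7.0, 8.5, 8.5]
print(d2)             # 27.0
print(round(rho, 3))  # 0.775
```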

When the number of ties is high, their effect is not negligible and should be taken into account in the calculation of the index. In that case a modified version of the Spearman index is used that accounts for the ties in both rankings (the formula is rather complicated to remember).
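The slides do not give the modified formula. One standard tie-corrected form, shown here only as an assumption about what is meant, replaces the no-tie sum of squares (n^3 - n)/12 of each ranking with a value reduced by (t^3 - t)/12 for every group of t tied units. The function name `tie_corrected_rho` is hypothetical; the ranks are those of the language and mathematics example.

```python
from collections import Counter
from math import sqrt


def tie_corrected_rho(ranks_a, ranks_b):
    """Spearman's index with a correction for ties in both rankings."""
    n = len(ranks_a)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))

    def rank_sum_of_squares(ranks):
        # (n^3 - n)/12 holds without ties; each group of t tied units
        # lowers it by (t^3 - t)/12 (untied units contribute 0)
        ties = sum(t ** 3 - t for t in Counter(ranks).values())
        return (n ** 3 - n - ties) / 12

    sa = rank_sum_of_squares(ranks_a)
    sb = rank_sum_of_squares(ranks_b)
    # Undefined (division by zero) if all units are tied in one ranking
    return (sa + sb - d2) / (2 * sqrt(sa * sb))


rank_lang = [1.5, 1.5, 4, 4, 4, 6, 7.5, 7.5, 9]
rank_math = [3, 5, 3, 3, 1, 6, 7, 8.5, 8.5]
print(round(tie_corrected_rho(rank_lang, rank_math), 3))
```

With no ties the correction terms vanish and the expression reduces to the usual formula, so the two versions agree whenever every rank is distinct.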

Another way to proceed is to evaluate the linear association between two quantitative variables with Pearson's linear correlation coefficient.
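As a rough illustration, Pearson's coefficient can be computed directly on the original scores rather than on their ranks. The sketch below uses the textbook definition (covariance divided by the product of the standard deviations); the function name `pearson_r` is an illustrative choice, and the data are the language and mathematics scores of the 9 countries from the Spearman example.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's linear correlation coefficient between two numeric lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two variables observed on the same n >= 2 units")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and of squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


language = [200, 200, 196, 196, 196, 189, 186, 186, 183]
mathematics = [198, 195, 198, 198, 200, 190, 188, 185, 185]
print(round(pearson_r(language, mathematics), 3))
```

Unlike Spearman's index, this measure uses the size of the score differences and not only their order, so the two coefficients need not coincide on the same data.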

Part I

Synthesis methods: aggregation and weighting