Professional Documents
Culture Documents
net/publication/258222305
CITATIONS READS
34 2,695
2 authors, including:
Marcello D’Agostino
University of Milan
66 PUBLICATIONS 835 CITATIONS
SEE PROFILE
Some of the authors of this publication are also working on these related projects:
All content following this page was uploaded by Marcello D’Agostino on 29 November 2016.
This is the pre-print authors’ version of the paper: M. D’Agostino, V. Dardanoni. What’s
so special about Euclidean distance? A characterization with applications to mobility and
spatial voting. Social Choice and Welfare 33(2009):211–233.
1. Introduction
The problem of measuring the distance1 between real-valued vectors arises in many
areas of scientific research. In particular, variants of the familiar Euclidean distance
play a prominent role in many important application contexts not only in economics
and political science, but in such diverse fields as statistics, decision theory, psychology,
Date: 8 October 2008.
We would like to thank an Editor and expecially two very perspective referees for useful comments
which helped us to improve the paper substantially.
1Our use of the expression “distance”, throughout the paper, is informal. In several application
contexts a variety of functions which do not fully satisfy the standard textbook definition have been
taken into consideration and are regarded as intuitively measuring some sort of distance. So, in this
paper we are not committing to any specific definition, let alone to the standard one. By “distance
function” we shall refer to any continuous function of two real-valued vectors (of the same finite size)
which can be intuitively considered as measuring their distance, even if in some cases, such a function
may not satisfy some of the properties of the standard definition.
1
2 MARCELLO D’AGOSTINO AND VALENTINO DARDANONI
DNA sequencing, image recognition, weather forecasting and so on.2 But what’s so
special about Euclidean distance? How can we judge the appropriateness of adopting
this conventional distance in some specific application context? In what contexts are
we forced to use it (or some monotonic transformation of it) as the appropriate distance
function?
In this paper we provide an application-oriented characterization of a class of func-
tions monotonically related to the Euclidean distance in terms of five general properties
which are intuitively (and perhaps empirically) testable. We show that Euclidean dis-
tance is (up to a monotonic transformation) the only function that satisfies them all.
We also show that, by replacing some of these five properties with suitable variants,
one obtains similar characterizations of an averaged version of the Euclidean distance
which we call Averaged Euclidean Distance, and of closely related distance functions
that we call Generalized Euclidean Distance and Averaged Generalized Euclidean Dis-
tance:
pPn
dn (x, y) = i=1 (xi − yi )2 (ED)
q P
1 n
dn (x, y) = n i=1 (xi − yi )2 (AED)
pPn
dn (x, y) = i=1 (g(xi ) − g(yi ))2 (GED)
q P
1 n
dn (x, y) = n i=1 (g(xi ) − g(yi ))2 (AGED)
where g is a continuous and increasing function. Our results may then be helpful in
testing — intuitively or empirically — whether one or the other of these functions fits
a specific application and should, therefore, be preferred to alternative ones.
In this paper we shall speak of a distance function over X n to mean simply a con-
tinuous function dn : X n × X n → R+ for some suitable interval X ⊆ R. Since we
are interested in evaluating the distance between any two vectors of equal size n, for
arbitrary n, our aim is in fact that of investigating infinite classes of distance func-
tionsSover X n , seeking for some uniform characterization of all their elements. Let
X= ∞ n n
n=1 X × X . Then, by a distance over X we mean a infinite collection {dn } of
distance functions over X n .
2To mention just a few applications in these fields, Euclidean distance (or some monotonic transfor-
mation such as the mean squared error) is often used as a loss function in statistics and decision theory;
in psychology, related measures are often employed to evaluate the amount of “similarity” between
two objects, each of which is decomposed into a fixed number of components, and (dis)similarity
is then modelled as a suitable metric in the resulting feature space ([SJ99]); in biology, there are
algorithms for searching genetic databases for biologically significant similarities in DNA sequences
using Euclidean and similar distances ([WBD97] and [TJaLA01]); similarity of images (used, among
other things, in security applications) is typically established by appropriate variations of Euclidean
distance [WZF05]); accuracy of weather forecasting is traditionally established by the so-called Brier
score ([Bri50]), which is (squared) Euclidean distance between a probability assignment to a sequence
of binary events and the observed outcomes (1 if the event occurs, and 0 if it does not occur).
WHAT’S SO SPECIAL ABOUT EUCLIDEAN DISTANCE 3
6 σ12 (y) 6
1
r
0.8 r y0
0.7
r y
0.4 ry
0.2
r 0.2 rx
x
- -
0.4 0.7 1 0.4 0.9
(a) Order-sensitive (b) Value-sensitive
Our characterization hinges upon two fundamental properties that a distance func-
tion between two real-valued vectors may satisfy. These properties deal with two basic
ways in which the distance between vectors x and y may intuitively increase/decrease,
namely: (a) by altering the distance in one coordinate, leaving everything else un-
changed; (b) by “swapping” the values of two coordinates in one of the two vectors,
leaving everything else unchanged.
More precisely,
• we say that a distance function dn : X n × X n → R+ (for some interval X ⊆ R)
is value-sensitive if, for any x, y, y0 ∈ X n such that d1 (xj , yj ) < d1 (xj , yj0 ) and
yi = yi0 for i 6= j, we have dn (x, y) < dn (x, y0 );
• let σij (u) denote the vector obtained from u by “swapping” the values of ui
and uj ; a distance function dn is called order-sensitive if, for all x, y ∈ X n ,
whenever (xi − xj )(yi − yj ) > 0 we have dn (x, y) < dn (x, σij (y)).
While value-sensitivity requires that the distance between two vectors should be a
monotonic function of the distance between the values of their corresponding coordi-
nates, order-sensitivity requires that it should depend on the order association between
the two vectors. Suppose that, for some i and j, there is a positive order association
between the corresponding coordinates, that is (xi − xj )(yi − yj ) > 0. In this case a
swap between yi and yj (or between xi and xj ) is order-reversing, since it turns the
positive association into a negative one.3 Order-sensitivity requires that such swaps
always increase the distance between the vectors under consideration.
Clearly, not all commonly used distance functions are order-sensitive. In Figure 1(a)
on p. 3, it can be immediately verified that the so-called city-block distance dn (u, v) =
Σni=1 |ui − vi | declares x as equally close to y and σ12 (y). On the other hand, typical
order-sensitive functions, such as the Spearman and Kendall coefficients as well as any
3Such order-reversing swaps are discussed in mathematical statistics [Tch80], economics [ET80],
and mobility measurement [Atk83, Dar93].
4 MARCELLO D’AGOSTINO AND VALENTINO DARDANONI
other function based on ranks, are not value-sensitive and, for instance, would declare
x equally close to y and y0 in Figure 1(b).
Although value-sensitivity and order-sensitivity rely on different intuitions about
how the distance between two vectors may change, they are by no means incompatible
properties. Each member of the commonly used Minkowsky class of distance functions
1/p
dn (u, v) = Σni=1 |ui − vi |p (also known as the power metric) is both value and order
sensitive whenever p > 1.
In their well-known [TK70], Tverski and Krantz discuss axiomatic characterizations
of various generalizations of the power metric from the point of view of measurement
theory. In a different context, motivated by mobility measurement, Fields and Ok
([FO96]) and Mitra and Ok ([MO98]) also provide related characterizations of the
Minkowski class. However, since there are infinitely many elements in the Minkowsky
class, such characterizations leave open the choice of the specific metric which is ap-
propriate in each given context, as well as the characterizing properties of the chosen
metric.4 One aim of this paper is to provide a sharper analysis, by identifying a set of
properties that allow us to single out Euclidean distance, or one of its variants, from the
wide class of separable distance functions.5 A crucial step in this analysis is the recog-
nition that, unlike any other Minkowsky metric, Euclidean distance is order-sensitive
in a way which is monotonically related to the distance between the swapped values.
This distinctive property plays a crucial role in a variety of applications and can be
identified as what makes Euclidean distance so “special”.
In Section 2 we present three properties (Properties 3, 4 and 5) that refine, via nat-
ural monotonicity considerations, the general definitions of value-sensitive and order-
sensitive distance functions informally introduced above. We call them “monotonicity
properties”. We also introduce two invariance properties, Permutation Invariance and
Extension Invariance (Properties 1 and 2). We also consider alternative versions of
Properties 2 and 3 which may turn out to be more adequate for some applications. We
then show, in Section 3, that Euclidean Distance is (up to a monotonic transforma-
tion) the only distance function which satisfies the monotonicity properties, and the
invariance properties (Theorem 1.1). We also show that Averaged Euclidean distance
is (still up to a monotonic transformation) the only distance function that satisfies
the monotonicity properties, Permutation Invariance and an alternative to Extension
Invariance, called Replication invariance (Theorem 1.2). Finally, we show that by
replacing Property 3 with a suitable alternative version one obtains a similar char-
acterization of Generalized Euclidean Distance and Averaged Generalized Euclidean
Distance (Theorem 2.1 and 2.2).
4Thus,their analyses do not, and cannot, assign any special role to the property that we have
called “order-sensitivity”, for the good reason that the Minkowski class includes, as a special case
when p = 1, the city-block distance which is not order-sensitive.
5
A similar sharper analysis, but leading to a characterization of the city-block distance, is provided
by Fields and Ok in [FO96].
WHAT’S SO SPECIAL ABOUT EUCLIDEAN DISTANCE 5
2. Properties
As mentioned in the Introduction, we shall speak of a distance function over X n
to mean simply a continuous function dn S : X n × X n → R+ for some suitable interval
X ⊆ R. Moreover, a distance over X = ∞ n n
n=1 X × X will be an infinite collection
{dn } of distance functions over X n .
As for the interval X, we shall restrict our attention to three typical cases: (i)
X = [0, 1] (ii) X = R+ ; (iii) X = R. Furthermore, by X+ we mean the nonnegative
part of X, so that X+ is equal to X in cases (i) and (ii), while X+ = R+ in case (iii).
Notice that in all cases, given any two x, y ∈ X, their absolute difference |x − y| ∈ X+ ;
this restriction will simplify the analysis and the notation used in this paper.
In what follows we shall use the light-face letters a, b, c, d, etc. to denote arbitrary
real numbers and the boldface letters x, y, w, etc. to denote arbitrary vectors. The
vector whose only element is the real number a will be denoted simply by “a”. We
shall write [x, y] for the concatenation of the two vectors x and y.8
6
For surveys on social mobility measurement see Maasoumi [Maa98] and Fields and Ok [FO99a].
For an introduction to the spatial theory of voting, see Hinich and Munger [HM70]; for a general more
advanced treatment of voting theory, Austeen-Smith and Banks [ASB99].
7
The linear regression model, for instance, is without doubts the workhorse of theoretical and
applied econometrics. Given an n-sized vector y (“regressand”) and k n-sized vectors x1 , . . . , xk
(“regressors”) which are collected into an n × k design matrix X, the linear regression model deals
with how to find the point in the linear space spanned by the columns of X which is closest to y.
Thus, the problem is to find a k-sized vector β (“regression coefficient”) which minimizes the distance
between y and Xβ. The most common method for solving this problem is of course the OLS, which
implies that Euclidean distance is the chosen distance concept.
8Since, conventionally, vectors in X n are columns, by [x, y] we formally mean [xT , yT ]T .
6 MARCELLO D’AGOSTINO AND VALENTINO DARDANONI
2.1. Invariance Properties. The first property of this group requires each dn to be
invariant under uniform permutations of both its arguments:
Property 1 (Permutation Invariance). For all x, y ∈ X n and all n ∈ N, dn (x, y) =
dn (π(x), π(y)) for every permutation π.
The second property relates the distance between two n-dimensional vectors to the
distance between higher-dimensional “conservative extensions”. It requires that, if
both vectors are extended by concatenating each of them with the same vector, their
distance is left unaltered.
Property 2 (Extension Invariance). For all m, n ∈ N, all x, y ∈ X m and all z ∈ X n ,
dm (x, y) = dm+n ([x, z], [y, z]).
This Property may appear intuitively sound in some application contexts but not in
others, for instance when we are interested in a notion of averaged distance. For an
alternative property which is appropriate in such contexts see Section 2.3 below.
2.2. Monotonicity Properties. The first two properties of this group articulate the
value-sensitivity property informally discussed in Section 1 and refine it via monotonic-
ity considerations. They are both rather natural assumptions to make, and similar
properties arise in several characterization results, including those mentioned in the
introduction, in a variety of fields.
Property 3 (One-dimensional value-sensitivity). For all a, b ∈ X, d1 (a, b) = G(|a−b|)
for some continuous and strictly increasing function G : X+ → R+ such that G(0) = 0.
This property makes one-dimensional distance depend monotonically on the absolute
difference between the values of their single coordinate (G(0) = 0 is a harmless normal-
ization requirement) and leaves entirely open the problem of how to measure higher-
dimensional distance. Property 3 is closely related to the one called “intradimensional
subtractivity” in [TK70]. In fact, the latter corresponds to Property 3∗ presented in
Section 2.3.
The second property requires that the distance between two vectors is monotonically
consistent with the distance between their subvectors:
Property 4 (Subvector Consistency). For all k, j ∈ N and whenever x, y, x0 , y0 ∈ X k ,
u, v, u0 , v0 ∈ X j ,
dk (x, y) > dk (x0 , y0 ) and dj (u, v) = dj (u0 , v0 ) =⇒
=⇒ dk+j ([x, u], [y, v]) > dk+j ([x0 , u0 ], [y0 , v0 ]).
A similar axiom is commonly used in the literature on income inequality [Sho88],
poverty [FS91] and mobility measurement [FO99b], and implies (as shown in the proof
of Theorems 1 and 2 provided in the Appendix, see also [FS91]) the fundamental
independence assumption which plays a crucial role in the theory of additive conjoint
measurement ([Deb60]). Tverski and Krantz show how to derive the latter from three
more primitive axioms (see Theorem 1 in [TK70]).
WHAT’S SO SPECIAL ABOUT EUCLIDEAN DISTANCE 7
The third property of this group is best understood in connection with the order-
sensitivity property discussed in Section 1. It can be interpreted as requiring that the
increase in the distance between two vectors caused by any “order-reversing swap”
depends monotonically on the distance, in each vector, between the components that
are involved in the swap:9
Property 5 (Monotonic Order-Sensitivity). For all n ∈ N such that n ≥ 4, for all
x, y ∈ X n and all i, j, k, m ∈ {1, . . . , n}, if
• (xi − xj )(yi − yj ) > 0
• (xk − xm )(yk − ym ) > 0
• d1 (xi , xj ) = d1 (xk , xm ) and d1 (yi , yj ) ≤ d1 (yk , ym ),
then
dn (x, σij (y)) ≤ dn (x, σmk (y)).
In other words, the effect of an order-reversing swap in the y’s depends monotonically
on the distance between the swapped y’s, provided that the distance between the
corresponding x’s is the same.10 For instance, suppose x = [0, 0.2, 0.4, 0.6] and y =
[0, 0.3, 0.5, 1]. The distance between x and σ12 (y) = [0.3, 0, 0.5, 1] is smaller than the
distance between x and σ34 (y) = [0, 0.3, 1, 0.5], since the distance between the swapped
y’s in σ12 (y) is smaller than the distance between the swapped y’s in σ34 (y) (while the
distance between the corresponding x’s is the same). As will be shown in the next
section, this property provides a sharp means to single out Euclidean distance (or
some suitable variant of its) from the much wider class of separable distance functions.
Observe that, strictly speaking, this Property does not imply, by itself, that the distance
function be order-sensitive in the sense explained in the Introduction, that is, it does
not imply that any order-reversing swap be distance-increasing. However, it does so in
conjunction with the other properties.11
(1) satisfies Properties 1, 2, 3, 4, 5 and linear homogeneity if and only if for all
n ∈ N and all x, y ∈ X n :
v
u n
uX
dn (x, y) = β · t (xi − yi )2 ,
i=1
Hence, H must be such that H(α2 · t) = α · H(t)√for all constants α ∈ X and all
t ∈ dom(H). But this is possible only if H(u) = β · u for some constant β > 0 (since
H is strictly increasing). The argument for Corollary 1.2 is identical, once the distance
function is evaluated in accordance with Theorem 1.2.
interpreted as a distance function between fathers’ and sons’ incomes. So, in this ap-
plication context the data space is Rn+ × Rn+ , where each vector represents the economic
status of individuals in a society, and the intended intuitive distance d is some distance
function that complies with intuitive mobility comparisons.
In the context of intergenerational mobility measurement, Property 1 seems unexcep-
tionable. Property 4 has been assumed in the axiomatic study of mobility measurement
of Fields and Ok [FO99b] (where it is known as “subgroup consistency”); alternatively,
the stronger “decomposability” property is often employed (see e.g. [Cow85], [FO96],
[MO98]), which implies Property 4. On the other hand, many indices such as Sper-
man’s rank correlation or Pearson’s correlation coefficient, which are commonly used
to measure intergenerational mobility, fail to satisfy Property 4, and, while subgroup
consistency axioms are also widely used in the inequality and poverty measurement
literature, they are not without their critics, see e.g. the discussion in Foster and Sen
[FS97].
Property 2 is quite inappropriate if one believes that social mobility in a society
should be measured in per-capita terms. So, Property 2 can be dropped and replaced
by the Replication Invariance property (Property 2∗ ) which led us to Theorem 1.2.14
In this case, one would consider the per-capita Euclidean distance as a good candidate
for the sound distance function in the social mobility context. Let us now focus on the
remaining properties, namely 3 and 5.
If the domain of the mobility index are dollar-incomes, as often assumed, it may
be unreasonable to postulate that a father-son movement, say, between $100 and $110
has the same level of mobility than a movement between $1000 and $1010, as implied
by Property 3. If one interprets a social mobility index as measuring the distance
between the economic status of two generations, this kind of objection would lead to
rejecting the identification of economic status with dollar-income, and suggests replac-
ing Property 3 with an analogous property which is not subject to this criticism. As a
replacement of Property 3 consider then Property 3∗ , where g is interpreted as an ap-
propriate function measuring “economic status” whose form is application-dependent
(but see below for a brief discussion). Note that this property implies the symme-
try of upward and downward mobility. As remarked by Fields and Ok [FO99b], this
symmetry is unexceptionable if one does not distinguish between “good” and “bad”
movements of income.15 As for Property 5, observe that once Property 3 has been
replaced by Property 3∗ , its interpretation inherits the consideration of economic sta-
tus which is incorporated into the latter. As a consequence, the distance between the
14On the other hand, the Replication Invariance Property too may be criticized in mobility ap-
plications. For example, one may legitimately perceive less mobility in a society with two families
with incomes, say, (1, 2) and (2, 1) than in a society which replicates these two families a million
times. If this view is strongly held, our theorem implies that averaged euclidean distance is not a
good candidate for mobility measurement.
15While such a symmetric treatment seems appropriate from a descriptive viewpoint, some may
argue that it is unfortunate from a normative one, since upward mobility may appear to be more
desirable than downward mobility.
12 MARCELLO D’AGOSTINO AND VALENTINO DARDANONI
swapped values is measured taking into account the effect of the status function g. In
particular, observe that, if g is a concave function the distance between two swapped
values in the application of Property 5 will be smaller for incomes of high range even
when the difference between their absolute values is the same. Having clarified this,
Property 5 seems a natural monotonic extension of the order-reversing swap property
much discussed in the mobility literature.16 Theorem 2.2 implies that if these proper-
ties are deemed compelling in this application context, AGED may be a good choice
for mobility measurement. Applications of this theorem depend on the choice of an
appropriate status function g. Some may argue that a good choice would be, for ex-
ample, to take g as the log function, so that an income movement from $1000 to $2000
would have the same mobility effect as a movement from $100 to $200. In general, the
most appropriate choice of g depends on the application context and must be justified
independently. Property 3∗ (and, as a consequence, Property 5), can therefore be seen
as a general constraint on this choice.17
Comparing the characterizing properties of AGED with those of the “city-block”and
Minkowsky distance functions applied to dollar-incomes, or those of the “city-block”
function applied to log-dollar incomes, proposed in the seminal and much used char-
acterizations of mobility indices ([FO96], [MO98] and [FO99b]), the researcher may
agree with Property 5 and therefore decide on using AGED (with judicious choice of
the transformation function g).
16This
property appears to be uncontroversial if one holds the view of social mobility as an indicator
of (non-directional) movements in income, in which case it is natural to require that a mobility measure
is order-sensitive as defined in this paper. In the context of intergenerational income mobility, order
reversing swaps always decrease the positive association between fathers and sons incomes in a society,
and have been discussed, among others, by Atkinson [Atk83] and Dardanoni [Dar93]. On the other
hand, it has been noticed (see e.g. Van de Gaer, Schokkaert and Martinez [VdGSM01]) that such
uniform treatment of order reversing swaps may conflict with the interpretation of social mobility
as an indicator of equality of opportunity. The relationship beteween our properties and the equal
opportunity interpretation of mobility requires a lengthy and detailed discussion which cannot be
addressed here.
17Linking the concept of economic status to individual welfare, by interpreting g as an empirically
revealed utility function, could help this process.
WHAT’S SO SPECIAL ABOUT EUCLIDEAN DISTANCE 13
that is, the individual will vote for y whenever y is “closer” to her ideal point x than
z. For illustration, in the two-dimensional case the ordering can be represented by the
utility function Ux (y) = −(x1 − y1 )2 − (x2 − y2 )2 , and the indifference curves are circles
centered at (x1 , x2 ). Theorem 1.1 shows that using Euclidean preferences in a given
context is equivalent to deeming its characterizing properties to comply with intuitive
judgments in that particular context. Here an intuitive distance function measures the
“closeness” between two political platforms.
In multidimensional spatial voting, Property 3 is indeed uncontroversial since it
can be seen as the Blackian and Downsian starting point for any multidimensional
extension. Similarly, Properties 4 and 2 seem natural properties to impose in this
context. However, Properties 1 and 5 appear to be intuitively sound only in situations
where different issues are regarded as equally important from the voter’s viewpoint.
Hence, while Theorem 1.1 justifies the use of Euclidean distance in all such restricted
situations, it can be argued that the restriction is quite unrealistic, since different issues
are often given different saliency.
When issues are assumed to have different saliency, the usual approach consists in
“weighting” different issues by means of a real number expressing the relative im-
portance assigned to them by an individual voter. Given a set of positive weights
w1 , . . . , wn , weighted
P Euclidean preferences can then be represented by a utility func-
tion Ux,w (y) = − ni=1 wi (xi − yi )2 . It is easy to see that in the bidimensional case
indifference curves are ellipses centered at the ideal point x, parallel to the horizon-
tal/vertical axis depending on whether w1 is lower/greater than w2 . It is clear that,
given an ideal point x, it is easy to construct examples where, say, y is preferred to z
by weighted Euclidean preferences but not by simple ones.
The intuition underlying the use of a weighted distance function can be expressed
more precisely as follows. Let V be the set of all possible voters. If issues have different
saliency for different voters, the distance function changes from voter to voter, and so
we can denote by dvn the distance function of voter v. Among all possible voters, there
are some, that we call the canonical voters, for whom all the different issues have the
same saliency. Let V0 be the subset of V consisting of the canonical voters. Using a
weighted distance function (whatever it may be) means that the distance of a candidate
y from the ideal candidate x for a given voter v is equal to the distance between a(y)
and a(x) for a canonical voter u, where a is a standardization function that multiplies
every coordinate of the vector to which it applies for the “weight” assigned to the
corresponding issue.18 In this way the preferences of each non-canonical voter v can
18Such weights can be determined empirically by observing the indifference curves of each voter or
group of voters.
14 MARCELLO D’AGOSTINO AND VALENTINO DARDANONI
5. Conclusions
As should emerge from our discussion, the results presented in Section 3 may be
useful in understanding:
• whether some suitable monotonic transformation of Euclidean Distance, in one
of the variants investigated in this paper, naturally fits a given application
context, by checking whether its characterizing properties are satisfied in it
(case study 1);
• when the characterization results cannot be directly applied, how the original
application context can be reduced to a “canonical” one in which the properties
of some of the characterized distance functions are all satisfied (case study 2).
Our analysis suggests that some variant of the Euclidean distance is likely to be ap-
propriate in many contexts requiring a distance function which is both monotonically
value-sensitive (as made precise by Properties 3/3∗ and 4) and monotonically order-
sensitive (as made precise by Property 5).
6. Appendix
6.1. Proof of Theorems 1 and 2. To prove the theorems, we first need the following
Lemma 1. A distance {dn } over X satisfies Properties 1, 2, 3, 4 if and only if there
exists a continuous and strictly increasing function f : X+ → R+ , with f (0) = 0, such
that for all n ∈ N and all x, y ∈ X n ,
n
X
dn (x, y) = F f (|xi − yi |)
i=1
(1) dk+j ([x, u], [y, v]) ≤ dk+j ([x0 , u0 ], [y0 , v0 ]) ⇐⇒ dk (x, y) ≤ dk (x0 , y0 ),
whenever dj (u, v) = dj (u0 , v0 ),
for all k, j, all x, y, x0 , y0 ∈ X k and all u, v, u0 , v0 ∈ X j .
Proof. It can be easily verified that Property 4 is implied by (1). To see the converse,
notice that, under Property 1, (1) is equivalent to:
(2) dk+j ([x, u], [y, v]) ≤ dk+j ([x0 , u], [y0 , v]) ⇐⇒ dk (x, y) ≤ dk (x0 , y0 ),
for all k, j, all x, y, x0 , y0 ∈ X k and all u, v ∈ X j . We then show that, under Property 1,
Property 4 implies (2), and therefore also (1).
16 MARCELLO D’AGOSTINO AND VALENTINO DARDANONI
Suppose, first, that dk+j ([x, u], [y, v]) ≤ dk+j ([x0 , u], [y0 , v]); then by Property 4 and
contrapposition, we have that dk (x, y) ≤ dk (x0 , y0 ). On the other hand, suppose that
(i ) dk (x, y) ≤ dk (x0 , y0 ) and (ii ) dk+j ([x, u], [y, v]) > dk+j ([x0 , u], [y0 , v]).
Now, if dk (x, y) < dk (x0 , y0 ), by Property 4, dk+j ([x, u], [y, v]) < dk+j ([x0 , u], [y0 , v]),
against the assumption (ii). If dk (x, y) = dk (x0 , y0 ), it follows from (ii), by Prop-
erty 4 again, that d2k+j ([x, u, x0 ], [y, v, y0 ]) > d2k+j ([x0 , u, x], [y0 , v, y]) against Prop-
erty 1. Hence, if dk (x, y) ≤ dk (x0 , y0 ) it must hold true that dk+j ([x, u], [y, v]) ≤
dk+j ([x0 , u], [y0 , v]).
Let us now assume that Properties 1, 2, 3, 4 hold true and recall that X can be either
of the three intervals considered in Section 2, namely R, R+ or [0, 1].
Step 2. For all n ≥ 3 and all x, y, w, z ∈ X n , there exists a continuous and strictly
increasing function fn : X+ → R+ , with f (0) = 0, such that
n
X
(3) dn (x, y) = Fn fn (|xi − yi |)
i=1
We finally show that, without loss of generality, we can assume that the functions fn
and Fn in (3) are such that fn (0) = 0 and Fn (0) = 0. In fact, whenever fn (0) = c 6= 0,
let hn : X+ → R and Hn : R → R be defined as hn (t) = fn (t)−c and Hn (t) = Fn (t−nc).
Then, it is immediately verified that
n n
X X
(8) Fn fn (|xi − yi |) = Hn hn (|xi − yi |) .
i=1 i=1
Moreover, notice that Properties 2 and 3 imply that, for all u ∈ fn (X+ ),
(9) Fn (u) = G(fn−1 (u)).
To see this, just notice that, by Property 2, for all n ≥ 3 and for all x ∈ X n−1 ,
n−1
X
d1 (a, b) = dn ([a, x], [b, x]) = Fn fn (|a − b|) + fn (|xi − xi |) .
i=1
Under the assumption that fn (0) = 0 and by Property 3, we have that for all n ≥ 3,
(10) d1 (a, b) = dn ([a, x], [b, x]) = Fn [fn (|a − b|)] = G(|a − b|).
Hence, taking u = fn (|a − b|), we obtain G(fn−1 (u)) = Fn (u). Now, to show that
Fn (0) = 0, observe that fn (0) = 0 and, by Property 3, G(0) = 0, so that Fn (0) =
G(0) = 0.
Step 3. For all m, n ≥ 3, fn (t) = αm,n · fm (t), for some constant αm,n ≥ 0 depending
on m and n.
Proof. Property 2 implies immediately that for every x, y ∈ X, all n ≥ 3 and all
x ∈ X n−2 , d2 ([x, y], [0, 0]) = dn ([x, y, x], [0, 0, x]). Given Step 2, and recalling that
for every n ≥ 3, fn (0) = 0, it follows that for every n, m ≥ 3 and every x, y ∈ X+ ,
Fn (fn (x) + fn (y)) = F m (fm (x) + fm (y)), and therefore, since Fn is strictly increasing,
−1
fn (x) + fn (y) = Fn Fm (fm (x) + fm (y)) . Now, let u = fm (x) and v = fm (y); then
(considering that fm is strictly increasing):
−1 −1
(11) fn (fm (u)) + fn (fm (v)) = Fn−1 (Fm (u + v)).
Recall that, by (9) above, for all n ≥ 3 and all u ∈ fn (X+ ), Fn (u) = G(fn−1 (u));
then, for all n ≥ 3, fn−1 (u) = G−1 (Fn (u)) and, therefore, fn (t) = Fn−1 (G(t)) for all
t ∈ X+ . Observe also that, by (10), Fn (fn (X+ )) = G(X+ ) for all n ≥ 3, and so
ran(Fm ◦ fm ) ⊆ ran(Fn ) for all m, n ≥ 3. Hence, for all u ∈ fm (X+ ),
−1
fn (fm (u)) = Fn−1 G G−1 (Fm (u)) = Fn−1 (Fm (u)),
18 MARCELLO D’AGOSTINO AND VALENTINO DARDANONI
it follows from (17),S(18) and (19) that Fn+1 (u)S= Fn (u) for all u ∈ dom(Fn0 ).
0 0
Therefore, F = ∞ 0
i=3 Fi is a function from
∞ 0
i=3 dom(Fi ) = R+ in R+P. Then F is
0 n
continuous
and strictly increasing, since all the Fi are, and dn (x, y) = F i=1 f (|xi −
yi |) for all n ≥ 3. To conclude the step then observe that, when x and y have size
k < 3, by Property 2 and recalling that f (0) = 0, it follows that for any z ∈ X m :
k k+m
X X
(20) dk (x, y) = dk+m ([x, z], [y, z]) = F f (|xi − yi |) + f (|zj − zj |) =
i=1 j=k+1
k
X
=F f (|xi − yi |) ,
i=1
Pn
and so the identity dn (x, y) = F i=1 f (|xi − yi |) holds for all n ∈ N. Finally,
F (0) = Fn (αn · 0) = Fn (0) and, by Step 2, Fn (0) = 0, so F (0) = 0.
6.1.2. Proof of Theorem 1. Given the above lemma, to prove the “if” direction of The-
orem 1.1 it is sufficient to verify that any strictly increasing function of the Euclidean
distance satisfies Property 5, that is, to verify that, whenever (xi − xj )(yi − yj ) > 0,
(xk − xm )(yk − ym ) > 0, d1 (xi , xj ) = d1 (xk , xm ) and d1 (yi , yj ) ≤ d1 (yk , ym ), we have
(21) (xi − yj )2 + (xj − yi )2 + (xk − yk )2 + (xm − ym )2 ≤
≤ (xi − yi )2 + (xj − yj )2 + (xk − ym )2 + (xm − yk )2 .
After simplification, this equation becomes (xm − xk )(ym − yk ) ≥ (xj − xi )(yj − yi ),
and the result follows. To prove the “only if” direction, we need to show that the
function f in Lemma 1 must be quadratic. Let a, b, c ∈ X be arbitrary non negative
real numbers with a ≥ c. Let also x, y ∈ X 4 be such that
x = [a, (a + c), a, (a + c)], y = [0, c, a, (a + c)]
so that
σ12 (y) = [c, 0, a, (a + c)], σ34 (y) = [0, c, (a + c), a].
Let σ12 (y) = w and σ34 (y) = z. Since (x1 − x2 )(y1 − y2 ) > 0, (x3 − x4 )(y3 − y4 ) >
0, d1 (x1 , x2 ) = d1 (x3 , x4 ) and d1 (y1 , y2 ) = d1 (y3 , y4 ), by Property 5 it follows that
d4 (x, w) = d4 (x, z); hence, by Lemma 1, we have that
4
X 4
X
F f (|xi − wi |) = F f (|xi − zi |)
i=1 i=1
20 MARCELLO D’AGOSTINO AND VALENTINO DARDANONI
P4 P4
and therefore, since F is strictly increasing, i=1 f (|xi − wi |) = i=1 f (|xi − zi |).
P4
Hence, subtracting i=1 f (|xi − yi |) from both sides, we obtain
4
X 4
X
f (|xi − wi |) − f (|xi − yi |) = f (a + c) + f (a − c) − 2f (a)
i=1 i=1
and
4
X 4
X
f (|xi − zi |) − f (|xi − yi |) = 2f (c) − 2f (0).
i=1 i=1
argument. Now, by Property 3∗ , di (xi − yi ) = G(|g(xi ) − g(yy )|), with G and g strictly
increasing. Hence dn (x, y) = Hn (|g(x1 ) − g(y1 )|, . . . , |g(xn ) − g(yn )|) for some strictly
increasing Hn .
Now, let d0n be defined as follows:
d0n (u, v) = dn (g−1 (u), g−1 (v)) = Hn (|u1 − v1 |, . . . , |un − vn |),
where g−1 (z) stands for the vector [g −1 (z1 ), . . . , g −1 (zn )]. Clearly, d0n satisfies Proper-
ties 1, 2 and 4 whenever dn does. Moreover, it is not difficult to verify that, whenever
dn satisfies Properties 3∗ and 5, d0n satisfies Properties 3 and 5. Therefore, whenever
dn satisfies all the properties of Theorem 2.1, d0n satisfies all the properties of Theo-
rem 1.1, and we can apply this theorem to establish that d0n must be (some monotonic
transformation of) the Euclidean Distance. Thus, writing g(z) for [g(z1 ), . . . , g(zn )],
we have
X n
0
dn (x, y) = dn (g(x), g(y)) = H (g(xi ) − g(yi ))2
i=1
for some strictly increasing H such that H(0) = 0. Similarly, whenever dn satisfies all
the properties of the Theorem 2.2, d0n satisfies all the properties of Theorem 1.2, and we
can apply this theorem to establish that d0n must be (some monotonic
Pn transformation
1
of) the Averaged Euclidean Distance, that is dn (x, y) = H n i=1 (g(xi ) − g(yi ))2 for
some strictly increasing H such that H(0) = 0.
References
[Acz66] J. Aczel. Lectures on functional equations and their applications. Accademic Press, New
York, 1966.
[ASB99] D. Austen-Smith and J.S Banks. Positive Political Theory I: Collective Preference. The
University of Michigan Press, 1999.
[Atk83] A.B. Atkinson. The measurement of economic mobility. In A.B. Atkinson, editor, Social
Justice and Public Policy. Wheatsheaf Books Ltd., London, 1983.
[Bri50] G.W. Brier. Verfication of forecasts expressed in terms of probability. Monthly Weather
Review, 78:1–3, 1950.
[Cow85] F. Cowell. Measures of distributional change: An axiomatic approach. Review of Eco-
nomic Studies, 52:35–51, 1985.
[Dar93] V. Dardanoni. Measuring social mobility. Journal of Economic Theory, 61:372–94, 1993.
[Deb60] G. Debreu. Topological methods in cardinal utility. In S. Karlin K. Arrow and P. Suppes,
editors, Mathematical methods in the social sciences. Stanford University Press, 1960.
[EMR88] J.M. Enelow, N.R. Mendell, and S. Ramesh. A comparison of two distance metrics
through regression diagnostics of a model of relative candidate evaluation. The Jour-
nal of Politics, 50:1057–1071, 1988.
[ET80] L.G. Epstein and S.M. Tanny. Increasing generalized correlation: A definition and some
economic consequences. Canadian Journal of Economics, 13:16–34, 1980.
[FO96] G.S. Fields and E. Ok. The meaning and measurement of income mobility. Journal of
Economic Theory, 71:349–77, 1996.
[FO99a] G.S. Fields and E. Ok. The measurement of income mobility. In J. Silber, editor, Handbook
of Income Inequality Measurement. Kluwer Academic Publishers, Dordrecht, 1999.
[FO99b] G.S. Fields and E. Ok. Measuring movements of incomes. Economica, 66:455–71, 1999.
22 MARCELLO D’AGOSTINO AND VALENTINO DARDANONI
[FS91] J.E. Foster and A.F. Shorrocks. Subgroup consistent poverty indices. Econometrica,
59:687–709, 1991.
[FS97] J.E. Foster and A. Sen. On economic inequality after a quarter century. In A. Sen, editor,
On Economic Inequality. Clarendon Press, Oxford, 1997.
[Gor68] W.M. Gorman. The structure of utility functions. Review of Economic Studies, 35:367–
390, 1968.
[HM70] M.J. Hinich and M.C. Munger. Analytical Politics. Cambridge University Press, Cam-
bridge, UK, 1970.
[Maa98] E. Maasoumi. On mobility. In D. Giles and A. Ullah, editors, The Handbook of Economic
Statistics. Marcel Dekker, New York, 1998.
[MO98] T. Mitra and E. Ok. The measurement of income mobility: A partial ordering approach.
Economic Theory, 12:77–102, 1998.
[Sho88] A.F. Shorrocks. Aggregation issues in inequality measurement. In W. Eichhorn, editor,
Measurement in Economics. Springer Verlag, New York, 1988.
[SJ99] S. Santini and R. Jain. Similarity measures. IEEE Transactions on Pattern Analysis and
Machine Intelligence, 21(9):871–883, 1999.
[SS02] U. Segal and J. Sobel. Min, max, and sum. Journal of Economic Theory, 25:126–150,
2002.
[Tch80] A.H.T. Tchen. Inequalities for distributions with given marginals. Annals of Probability,
8:814–27, 1980.
[TJaLA01] Wu T-J. and Hsieh Y-C. andLi L-A. Statistical measures of dna sequence dissimilarity
under markov chain models of base composition. Biometrics, 57:441–448, 2001.
[TK70] Amos Tverski and David H. Krantz. The dimensional representation and the metric
structure of similarity data. Journal of Mathematical Psychology, 7:572–596, 1970.
[VdGSM01] D. Van de Gaer, E. Schokkaert, and M. Martinez. Three meanings of intergenerational
mobility. Economica, 68:519–537, 2001.
[WBD97] Tiee-Jian Wu, John P. Burke, and Daniel B. Davison. A measure of dna sequence dissimi-
larity based on mahalanobis distance between frequencies of words. Biometrics, 53:1431–
1439, 1997.
[WZF05] Liwei Wang, Yan Zhang, and Jufu Feng. On the euclidean distance of images. IEEE
Transactions on Pattern Analysis and Machine Intelligence, 27:1334–1339, 2005.