What's so special about Euclidean distance? A characterization with applications to mobility and spatial voting

Article  in  Social Choice and Welfare · August 2009


DOI: 10.1007/s00355-008-0353-5

Marcello D’Agostino
University of Milan
WHAT’S SO SPECIAL ABOUT EUCLIDEAN DISTANCE?
A CHARACTERIZATION WITH APPLICATIONS TO
MOBILITY AND SPATIAL VOTING

MARCELLO D’AGOSTINO AND VALENTINO DARDANONI

Abstract. In this paper we provide an application-oriented characterization of a
class of distance functions monotonically related to the Euclidean distance in terms
of some general properties of distance functions between real-valued vectors. Our
analysis hinges upon two fundamental properties of distance functions that we call
“value-sensitivity” and “order-sensitivity”. We show how these two general prop-
erties, combined with natural monotonicity considerations, lead to characterization
results that single out several versions of Euclidean distance from the wide class of
separable distance functions. We then discuss and motivate our results in two differ-
ent and apparently unrelated application areas — mobility measurement and spatial
voting theory — and propose our characterization as a test for deciding whether
Euclidean distance (or some suitable variant) should be used in your favourite appli-
cation context.
JEL Classification Numbers: C0, C53, D63, D72.
Keywords: Euclidean Distance, Mobility Measurement, Spatial Voting.

This is the pre-print authors’ version of the paper: M. D’Agostino, V. Dardanoni. What’s
so special about Euclidean distance? A characterization with applications to mobility and
spatial voting. Social Choice and Welfare 33(2009):211–233.

1. Introduction
The problem of measuring the distance¹ between real-valued vectors arises in many
areas of scientific research. In particular, variants of the familiar Euclidean distance
play a prominent role in many important application contexts, not only in economics
and political science, but also in such diverse fields as statistics, decision theory, psychology,
DNA sequencing, image recognition, weather forecasting and so on.² But what’s so
special about Euclidean distance? How can we judge the appropriateness of adopting
this conventional distance in some specific application context? In what contexts are
we forced to use it (or some monotonic transformation of it) as the appropriate distance
function?

Date: 8 October 2008.
We would like to thank an Editor and especially two very perceptive referees for useful comments
which helped us to improve the paper substantially.
¹ Our use of the expression “distance”, throughout the paper, is informal. In several application
contexts a variety of functions which do not fully satisfy the standard textbook definition have been
taken into consideration and are regarded as intuitively measuring some sort of distance. So, in this
paper we are not committing to any specific definition, let alone to the standard one. By “distance
function” we shall refer to any continuous function of two real-valued vectors (of the same finite size)
which can be intuitively considered as measuring their distance, even if in some cases such a function
may not satisfy some of the properties of the standard definition.

In this paper we provide an application-oriented characterization of a class of func-
tions monotonically related to the Euclidean distance in terms of five general properties
which are intuitively (and perhaps empirically) testable. We show that Euclidean dis-
tance is (up to a monotonic transformation) the only function that satisfies them all.
We also show that, by replacing some of these five properties with suitable variants,
one obtains similar characterizations of an averaged version of the Euclidean distance
which we call Averaged Euclidean Distance, and of closely related distance functions
that we call Generalized Euclidean Distance and Averaged Generalized Euclidean Dis-
tance:
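\[ d_n(\mathbf{x}, \mathbf{y}) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \tag{ED} \]
\[ d_n(\mathbf{x}, \mathbf{y}) = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - y_i)^2} \tag{AED} \]
\[ d_n(\mathbf{x}, \mathbf{y}) = \sqrt{\sum_{i=1}^{n} (g(x_i) - g(y_i))^2} \tag{GED} \]
\[ d_n(\mathbf{x}, \mathbf{y}) = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (g(x_i) - g(y_i))^2} \tag{AGED} \]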

where g is a continuous and increasing function. Our results may then be helpful in
testing — intuitively or empirically — whether one or the other of these functions fits
a specific application and should, therefore, be preferred to alternative ones.
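
As a concrete reading of (ED), (AED), (GED) and (AGED), the following minimal Python sketch evaluates the four functions on plain lists of floats. The function names `ed`, `aed`, `ged`, `aged`, the helper `_check`, and the sample vectors are illustrative assumptions, not notation from the paper; the choice of `math.log` for g is only one possible continuous increasing function.

```python
import math


def _check(x, y):
    # Hypothetical helper: both vectors must have the same positive size n.
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("vectors must be non-empty and of equal size")


def ed(x, y):
    """Euclidean Distance (ED): square root of the sum of squared differences."""
    _check(x, y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def aed(x, y):
    """Averaged Euclidean Distance (AED): as ED, but the sum is divided by n."""
    _check(x, y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def ged(x, y, g):
    """Generalized Euclidean Distance (GED): ED applied to g-transformed values."""
    return ed([g(a) for a in x], [g(b) for b in y])


def aged(x, y, g):
    """Averaged Generalized Euclidean Distance (AGED)."""
    return aed([g(a) for a in x], [g(b) for b in y])


# Illustrative vectors (assumed, not from the paper); g = log needs positive values.
x = [1.0, 2.0, 4.0, 8.0]
y = [2.0, 2.0, 8.0, 4.0]
print(ed(x, y), aed(x, y), ged(x, y, math.log), aged(x, y, math.log))
```

With g taken as the identity, `ged` and `aged` reduce to `ed` and `aed`, which matches the way (GED) and (AGED) generalize (ED) and (AED).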
In this paper we shall speak of a distance function over \(X^n\) to mean simply a continuous
function \(d_n : X^n \times X^n \to \mathbb{R}_+\) for some suitable interval \(X \subseteq \mathbb{R}\). Since we
are interested in evaluating the distance between any two vectors of equal size n, for
arbitrary n, our aim is in fact that of investigating infinite classes of distance functions
over \(X^n\), seeking some uniform characterization of all their elements. Let
\(\mathcal{X} = \bigcup_{n=1}^{\infty} X^n \times X^n\). Then, by a distance over \(\mathcal{X}\) we mean an infinite collection \(\{d_n\}\) of
distance functions over \(X^n\).
² To mention just a few applications in these fields, Euclidean distance (or some monotonic transformation
such as the mean squared error) is often used as a loss function in statistics and decision theory;
in psychology, related measures are often employed to evaluate the amount of “similarity” between
two objects, each of which is decomposed into a fixed number of components, and (dis)similarity
is then modelled as a suitable metric in the resulting feature space ([SJ99]); in biology, there are
algorithms for searching genetic databases for biologically significant similarities in DNA sequences
using Euclidean and similar distances ([WBD97] and [TJaLA01]); similarity of images (used, among
other things, in security applications) is typically established by appropriate variations of Euclidean
distance ([WZF05]); accuracy of weather forecasting is traditionally established by the so-called Brier
score ([Bri50]), which is (squared) Euclidean distance between a probability assignment to a sequence
of binary events and the observed outcomes (1 if the event occurs, and 0 if it does not occur).

[Figure 1. Order-sensitive and value-sensitive functions. Panel (a), Order-sensitive, plots x, y and σ12(y); panel (b), Value-sensitive, plots x, y and y′.]

Our characterization hinges upon two fundamental properties that a distance func-
tion between two real-valued vectors may satisfy. These properties deal with two basic
ways in which the distance between vectors x and y may intuitively increase/decrease,
namely: (a) by altering the distance in one coordinate, leaving everything else un-
changed; (b) by “swapping” the values of two coordinates in one of the two vectors,
leaving everything else unchanged.
More precisely,
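• we say that a distance function \(d_n : X^n \times X^n \to \mathbb{R}_+\) (for some interval \(X \subseteq \mathbb{R}\))
is value-sensitive if, for any \(\mathbf{x}, \mathbf{y}, \mathbf{y}' \in X^n\) such that \(d_1(x_j, y_j) < d_1(x_j, y'_j)\) and
\(y_i = y'_i\) for \(i \neq j\), we have \(d_n(\mathbf{x}, \mathbf{y}) < d_n(\mathbf{x}, \mathbf{y}')\);
• let \(\sigma_{ij}(\mathbf{u})\) denote the vector obtained from u by “swapping” the values of \(u_i\)
and \(u_j\); a distance function \(d_n\) is called order-sensitive if, for all \(\mathbf{x}, \mathbf{y} \in X^n\),
whenever \((x_i - x_j)(y_i - y_j) > 0\) we have \(d_n(\mathbf{x}, \mathbf{y}) < d_n(\mathbf{x}, \sigma_{ij}(\mathbf{y}))\).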
While value-sensitivity requires that the distance between two vectors should be a
monotonic function of the distance between the values of their corresponding coordi-
nates, order-sensitivity requires that it should depend on the order association between
the two vectors. Suppose that, for some i and j, there is a positive order association
between the corresponding coordinates, that is (xi − xj )(yi − yj ) > 0. In this case a
swap between yi and yj (or between xi and xj ) is order-reversing, since it turns the
positive association into a negative one.³ Order-sensitivity requires that such swaps
always increase the distance between the vectors under consideration.
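
The swap operation and the positive-association condition can be sketched in a few lines of Python. This is an illustration only: the names `swap`, `positively_associated` and `euclidean` are assumptions, indices are 0-based (so `swap(y, 0, 1)` plays the role of σ12), and the two-dimensional vectors are made up for the example.

```python
import math


def swap(u, i, j):
    """Return a copy of u with entries i and j exchanged (sigma_ij, 0-based indices)."""
    v = list(u)
    v[i], v[j] = v[j], v[i]
    return v


def positively_associated(x, y, i, j):
    """True when coordinates i and j are ordered the same way in x and in y."""
    return (x[i] - x[j]) * (y[i] - y[j]) > 0


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


# Assumed example vectors: both are increasing, so the association is positive.
x = [0.2, 0.5]
y = [0.4, 0.9]

before = euclidean(x, y)             # distance before the order-reversing swap
after = euclidean(x, swap(y, 0, 1))  # distance after swapping y_1 and y_2

print(positively_associated(x, y, 0, 1), before < after)
```

For this pair the swap reverses the association and the Euclidean distance goes up, which is the behaviour order-sensitivity asks for.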
Clearly, not all commonly used distance functions are order-sensitive. In Figure 1(a),
it can be immediately verified that the so-called city-block distance
\(d_n(\mathbf{u}, \mathbf{v}) = \sum_{i=1}^{n} |u_i - v_i|\) declares x as equally close to y and σ12(y). On the other hand, typical
order-sensitive functions, such as the Spearman and Kendall coefficients as well as any
other function based on ranks, are not value-sensitive and, for instance, would declare
x equally close to y and y′ in Figure 1(b).
³ Such order-reversing swaps are discussed in mathematical statistics [Tch80], economics [ET80],
and mobility measurement [Atk83, Dar93].
Although value-sensitivity and order-sensitivity rely on different intuitions about
how the distance between two vectors may change, they are by no means incompatible
properties. Each member of the commonly used Minkowski class of distance functions
\(d_n(\mathbf{u}, \mathbf{v}) = \left( \sum_{i=1}^{n} |u_i - v_i|^p \right)^{1/p}\) (also known as the power metric) is both value- and order-
sensitive whenever p > 1.
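
To see the contrast between p = 1 and p > 1 numerically, here is a small Python sketch. The vectors are hypothetical (they are not read off Figure 1), and `minkowski` is an illustrative helper name. For this particular pair the city-block distance is unchanged by the order-reversing swap, while the p = 2 and p = 3 members increase.

```python
import math


def minkowski(u, v, p):
    """Power metric: (sum_i |u_i - v_i|^p)^(1/p)."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)


# Assumed example: both coordinates of x lie below both coordinates of y.
x = [0.2, 0.4]
y = [0.4, 0.7]
y_swapped = [0.7, 0.4]  # sigma_12(y): an order-reversing swap

for p in (1, 2, 3):
    d_before = minkowski(x, y, p)
    d_after = minkowski(x, y_swapped, p)
    unchanged = math.isclose(d_before, d_after)
    print(f"p={p}: before={d_before:.4f} after={d_after:.4f} unchanged={unchanged}")
```

The comparison uses `math.isclose` rather than `==` because the absolute differences are computed in floating point.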
In their well-known [TK70], Tversky and Krantz discuss axiomatic characterizations
of various generalizations of the power metric from the point of view of measurement
theory. In a different context, motivated by mobility measurement, Fields and Ok
([FO96]) and Mitra and Ok ([MO98]) also provide related characterizations of the
Minkowski class. However, since there are infinitely many elements in the Minkowski
class, such characterizations leave open the choice of the specific metric which is
appropriate in each given context, as well as the characterizing properties of the chosen
metric.⁴ One aim of this paper is to provide a sharper analysis, by identifying a set of
properties that allow us to single out Euclidean distance, or one of its variants, from the
wide class of separable distance functions.⁵ A crucial step in this analysis is the
recognition that, unlike any other Minkowski metric, Euclidean distance is order-sensitive
in a way which is monotonically related to the distance between the swapped values.
This distinctive property plays a crucial role in a variety of applications and can be
identified as what makes Euclidean distance so “special”.
In Section 2 we present three properties (Properties 3, 4 and 5) that refine, via nat-
ural monotonicity considerations, the general definitions of value-sensitive and order-
sensitive distance functions informally introduced above. We call them “monotonicity
properties”. We also introduce two invariance properties, Permutation Invariance and
Extension Invariance (Properties 1 and 2). We also consider alternative versions of
Properties 2 and 3 which may turn out to be more adequate for some applications. We
then show, in Section 3, that Euclidean Distance is (up to a monotonic transforma-
tion) the only distance function which satisfies the monotonicity properties, and the
invariance properties (Theorem 1.1). We also show that Averaged Euclidean distance
is (still up to a monotonic transformation) the only distance function that satisfies
the monotonicity properties, Permutation Invariance and an alternative to Extension
Invariance, called Replication invariance (Theorem 1.2). Finally, we show that by
replacing Property 3 with a suitable alternative version one obtains a similar char-
acterization of Generalized Euclidean Distance and Averaged Generalized Euclidean
Distance (Theorems 2.1 and 2.2).

⁴ Thus, their analyses do not, and cannot, assign any special role to the property that we have
called “order-sensitivity”, for the good reason that the Minkowski class includes, as a special case
when p = 1, the city-block distance which is not order-sensitive.
⁵ A similar sharper analysis, but leading to a characterization of the city-block distance, is provided
by Fields and Ok in [FO96].

As a vehicle to appreciate the applicative potential of our results, we shall discuss, in
Section 4, two case-studies from very different areas: social mobility measurement and
spatial voting theory.⁶ In case-study 1, we make a critical analysis of the characterizing
properties of Euclidean Distance in the context of mobility measurement, which leads
to the rejection of two of these properties on intuitive grounds. Our analysis suggests
that Averaged Generalized Euclidean Distance may be more suitable for this kind of
application. In case-study 2, the intuitive rejection of two characterizing properties of
Euclidean Distance in the context of spatial voting theory is addressed via a different
approach. This consists in showing how the intuitively judged distance between can-
didates, which is prima facie incompatible with a Euclidean metric, can be expressed
in terms of a restricted “canonical” model for which all the characterizing properties
of Euclidean Distance are satisfied.
Potential applications of the characterization results presented in this paper, on the
other hand, are by no means restricted to the ones discussed in these two case-studies,
but extend to many other interesting problems that can be naturally interpreted in
terms of choosing a suitable distance function between real-valued vectors.⁷

2. Properties
As mentioned in the Introduction, we shall speak of a distance function over \(X^n\)
to mean simply a continuous function \(d_n : X^n \times X^n \to \mathbb{R}_+\) for some suitable interval
\(X \subseteq \mathbb{R}\). Moreover, a distance over \(\mathcal{X} = \bigcup_{n=1}^{\infty} X^n \times X^n\) will be an infinite collection
\(\{d_n\}\) of distance functions over \(X^n\).
As for the interval X, we shall restrict our attention to three typical cases: (i)
X = [0, 1]; (ii) X = R+; (iii) X = R. Furthermore, by X+ we mean the nonnegative
part of X, so that X+ is equal to X in cases (i) and (ii), while X+ = R+ in case (iii).
Notice that in all cases, given any two x, y ∈ X, their absolute difference |x − y| ∈ X+ ;
this restriction will simplify the analysis and the notation used in this paper.
In what follows we shall use the light-face letters a, b, c, d, etc. to denote arbitrary
real numbers and the boldface letters x, y, w, etc. to denote arbitrary vectors. The
vector whose only element is the real number a will be denoted simply by “a”. We
shall write [x, y] for the concatenation of the two vectors x and y.⁸

⁶ For surveys on social mobility measurement see Maasoumi [Maa98] and Fields and Ok [FO99a].
For an introduction to the spatial theory of voting, see Hinich and Munger [HM70]; for a general, more
advanced treatment of voting theory, see Austen-Smith and Banks [ASB99].
⁷ The linear regression model, for instance, is without doubt the workhorse of theoretical and
applied econometrics. Given an n-sized vector y (“regressand”) and k n-sized vectors \(\mathbf{x}_1, \dots, \mathbf{x}_k\)
(“regressors”) which are collected into an n × k design matrix X, the linear regression model deals
with how to find the point in the linear space spanned by the columns of X which is closest to y.
Thus, the problem is to find a k-sized vector β (“regression coefficient”) which minimizes the distance
between y and Xβ. The most common method for solving this problem is of course OLS, which
implies that Euclidean distance is the chosen distance concept.
⁸ Since, conventionally, vectors in \(X^n\) are columns, by [x, y] we formally mean \([\mathbf{x}^T, \mathbf{y}^T]^T\).

2.1. Invariance Properties. The first property of this group requires each dn to be
invariant under uniform permutations of both its arguments:
Property 1 (Permutation Invariance). For all \(n \in \mathbb{N}\) and all \(\mathbf{x}, \mathbf{y} \in X^n\),
\(d_n(\mathbf{x}, \mathbf{y}) = d_n(\pi(\mathbf{x}), \pi(\mathbf{y}))\) for every permutation π.
The second property relates the distance between two n-dimensional vectors to the
distance between higher-dimensional “conservative extensions”. It requires that, if
both vectors are extended by concatenating each of them with the same vector, their
distance is left unaltered.
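Property 2 (Extension Invariance). For all \(m, n \in \mathbb{N}\), all \(\mathbf{x}, \mathbf{y} \in X^m\) and all \(\mathbf{z} \in X^n\),
\(d_m(\mathbf{x}, \mathbf{y}) = d_{m+n}([\mathbf{x}, \mathbf{z}], [\mathbf{y}, \mathbf{z}])\).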
This Property may appear intuitively sound in some application contexts but not in
others, for instance when we are interested in a notion of averaged distance. For an
alternative property which is appropriate in such contexts see Section 2.3 below.
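
A quick numerical check can make the two invariance properties tangible. The Python sketch below is an illustration on assumed vectors, with hypothetical helper names `ed` and `aed`: it applies the same permutation to both arguments, then appends a common vector z to both, and compares distances. On this example the plain Euclidean distance passes both checks, whereas the averaged version changes when the vectors are extended.

```python
import math


def ed(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def aed(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Assumed example data in X = [0, 1].
x = [0.1, 0.5, 0.9]
y = [0.4, 0.2, 0.6]
z = [0.3, 0.7]

# Property 1: apply the same permutation (given as an index list) to both vectors.
perm = [2, 0, 1]
px = [x[k] for k in perm]
py = [y[k] for k in perm]
perm_ok = math.isclose(ed(x, y), ed(px, py))

# Property 2: concatenate the same vector z to both arguments.
ext_ok_ed = math.isclose(ed(x, y), ed(x + z, y + z))
ext_ok_aed = math.isclose(aed(x, y), aed(x + z, y + z))

print(perm_ok, ext_ok_ed, ext_ok_aed)
```

A single example cannot establish a property, of course; it can only illustrate it or, as with the averaged distance here, exhibit a violation.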
2.2. Monotonicity Properties. The first two properties of this group articulate the
value-sensitivity property informally discussed in Section 1 and refine it via monotonic-
ity considerations. They are both rather natural assumptions to make, and similar
properties arise in several characterization results, including those mentioned in the
introduction, in a variety of fields.
Property 3 (One-dimensional value-sensitivity). For all \(a, b \in X\), \(d_1(a, b) = G(|a - b|)\)
for some continuous and strictly increasing function \(G : X_+ \to \mathbb{R}_+\) such that \(G(0) = 0\).
This property makes one-dimensional distance depend monotonically on the absolute
difference between the values of their single coordinate (G(0) = 0 is a harmless normal-
ization requirement) and leaves entirely open the problem of how to measure higher-
dimensional distance. Property 3 is closely related to the one called “intradimensional
subtractivity” in [TK70]. In fact, the latter corresponds to Property 3∗ presented in
Section 2.3.
The second property requires that the distance between two vectors is monotonically
consistent with the distance between their subvectors:
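Property 4 (Subvector Consistency). For all \(k, j \in \mathbb{N}\) and whenever \(\mathbf{x}, \mathbf{y}, \mathbf{x}', \mathbf{y}' \in X^k\),
\(\mathbf{u}, \mathbf{v}, \mathbf{u}', \mathbf{v}' \in X^j\),
\[ d_k(\mathbf{x}, \mathbf{y}) > d_k(\mathbf{x}', \mathbf{y}') \text{ and } d_j(\mathbf{u}, \mathbf{v}) = d_j(\mathbf{u}', \mathbf{v}') \implies d_{k+j}([\mathbf{x}, \mathbf{u}], [\mathbf{y}, \mathbf{v}]) > d_{k+j}([\mathbf{x}', \mathbf{u}'], [\mathbf{y}', \mathbf{v}']). \]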
A similar axiom is commonly used in the literature on income inequality [Sho88],
poverty [FS91] and mobility measurement [FO99b], and implies (as shown in the proof
of Theorems 1 and 2 provided in the Appendix, see also [FS91]) the fundamental
independence assumption which plays a crucial role in the theory of additive conjoint
measurement ([Deb60]). Tversky and Krantz show how to derive the latter from three
more primitive axioms (see Theorem 1 in [TK70]).
The third property of this group is best understood in connection with the order-
sensitivity property discussed in Section 1. It can be interpreted as requiring that the
increase in the distance between two vectors caused by any “order-reversing swap”
depends monotonically on the distance, in each vector, between the components that
are involved in the swap:⁹
Property 5 (Monotonic Order-Sensitivity). For all \(n \in \mathbb{N}\) such that \(n \ge 4\), for all
\(\mathbf{x}, \mathbf{y} \in X^n\) and all \(i, j, k, m \in \{1, \dots, n\}\), if
• \((x_i - x_j)(y_i - y_j) > 0\)
• \((x_k - x_m)(y_k - y_m) > 0\)
• \(d_1(x_i, x_j) = d_1(x_k, x_m)\) and \(d_1(y_i, y_j) \le d_1(y_k, y_m)\),
then
\[ d_n(\mathbf{x}, \sigma_{ij}(\mathbf{y})) \le d_n(\mathbf{x}, \sigma_{mk}(\mathbf{y})). \]
In other words, the effect of an order-reversing swap in the y’s depends monotonically
on the distance between the swapped y’s, provided that the distance between the
corresponding x’s is the same.¹⁰ For instance, suppose x = [0, 0.2, 0.4, 0.6] and y =
[0, 0.3, 0.5, 1]. The distance between x and σ12(y) = [0.3, 0, 0.5, 1] is smaller than the
distance between x and σ34(y) = [0, 0.3, 1, 0.5], since the distance between the swapped
y’s in σ12(y) is smaller than the distance between the swapped y’s in σ34(y) (while the
distance between the corresponding x’s is the same). As will be shown in the next
section, this property provides a sharp means to single out Euclidean distance (or
some suitable variant of it) from the much wider class of separable distance functions.
Observe that, strictly speaking, this Property does not imply, by itself, that the distance
function be order-sensitive in the sense explained in the Introduction, that is, it does
not imply that any order-reversing swap be distance-increasing. However, it does so in
conjunction with the other properties.¹¹
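
The worked example above can be checked numerically. The Python sketch below uses the vectors x and y from the text and squared Euclidean distance (a monotonic transformation of ED); it also verifies that each swap raises the sum of squares by twice the product of the gap between the two x values and the gap between the two y values. The second part is an added illustration: the vector `y_alt` is a hypothetical choice, not from the paper, on which the city-block distance ranks the two swaps the other way round even though the premises of Property 5 hold. Helper names and 0-based indices are assumptions.

```python
import math


def swap(u, i, j):
    """sigma_ij with 0-based indices: copy u and exchange entries i and j."""
    v = list(u)
    v[i], v[j] = v[j], v[i]
    return v


def sq_ed(x, y):
    """Squared Euclidean distance (sum of squared coordinate differences)."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def city_block(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))


# Example from the text.
x = [0.0, 0.2, 0.4, 0.6]
y = [0.0, 0.3, 0.5, 1.0]

base = sq_ed(x, y)
d12 = sq_ed(x, swap(y, 0, 1))  # x against sigma_12(y)
d34 = sq_ed(x, swap(y, 2, 3))  # x against sigma_34(y)
print(d12 < d34)

# Increase caused by each swap equals 2 * (gap in x) * (gap in y).
print(math.isclose(d12 - base, 2 * 0.2 * 0.3), math.isclose(d34 - base, 2 * 0.2 * 0.5))

# Hypothetical vector (assumption): same x gaps, y gaps 0.3 <= 0.6.
y_alt = [0.0, 0.3, 1.0, 1.6]
c12 = city_block(x, swap(y_alt, 0, 1))
c34 = city_block(x, swap(y_alt, 2, 3))
print(c12 > c34)  # city-block orders the two swaps against Property 5 here
```

Because the increase in the sum of squares depends only on the two gaps, and not on where the swapped values sit, the Euclidean ranking of swaps follows the gaps, as Property 5 requires.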

⁹ Recall that we denote by \(\sigma_{ij}(\mathbf{y})\) the vector obtained from y by “swapping” \(y_i\) and \(y_j\), i.e. the
vector y′ such that: (i) \(y'_k = y_k\) for all \(k \neq i, j\), (ii) \(y'_i = y_j\), and (iii) \(y'_j = y_i\).
¹⁰ The latter requirement is necessary for consistency. To see this, observe that the pair of vectors
\((\mathbf{x}, \sigma_{ij}(\mathbf{y}))\) could be seen as obtained from the pair \((\sigma_{ij}(\mathbf{x}), \sigma_{ij}(\mathbf{y}))\) by an order-reversing swap in
\(\sigma_{ij}(\mathbf{x})\). Taking into consideration that Property 5, together with the other properties, implies that
every order-reversing swap is strictly distance-increasing (see footnote 11 below), it is not difficult to
show that, without this requirement, the property would lead to inconsistent judgements whenever
\(d_1(y_i, y_j) < d_1(y_k, y_m)\) and \(d_1(x_i, x_j) > d_1(x_k, x_m)\).
¹¹ Let y′ be the vector obtained from y by “swapping” \(y_i\) and \(y_j\), i.e. (i) \(y'_k = y_k\) for all \(k \neq i, j\),
(ii) \(y'_i = y_j\), and (iii) \(y'_j = y_i\), and assume \(x_j = x_i + \Delta_x\) and \(y_j = y_i + \Delta_y\), with \(\Delta_x\) and \(\Delta_y\)
having the same sign. After the swap, given Theorem 1 below, distance will increase whenever
\((x_i - y_i - \Delta_y)^2 + (x_i + \Delta_x - y_i)^2 > (x_i - y_i)^2 + (x_i + \Delta_x - y_i - \Delta_y)^2\); simplifying, the above inequality
reduces to \(2 \Delta_x \Delta_y > 0\), which holds true, since \(\Delta_x \Delta_y\) is positive.

2.3. Variants. As will be shown in Section 4, Extension Invariance (Property 2 above)
may be considered inappropriate in some application contexts, such as social mobility
comparisons, where we are interested in a notion of average distance. In this kind of
application one may want to consider, instead of Property 2, a standard “replication”
property:
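Property 2∗ (Replication Invariance). For every \(k, n \in \mathbb{N}\) and every \(\mathbf{x}, \mathbf{y} \in X^k\),
\[ d_k(\mathbf{x}, \mathbf{y}) = d_{nk}\big(\overbrace{[\mathbf{x}, \cdots, \mathbf{x}]}^{n}, \overbrace{[\mathbf{y}, \cdots, \mathbf{y}]}^{n}\big), \]
where \(\overbrace{[\mathbf{u}, \cdots, \mathbf{u}]}^{n}\) denotes the result of concatenating the vector u with itself n times.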
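
Replication Invariance can be illustrated with a short Python sketch. The two-family society with incomes (1, 2) and (2, 1) is a made-up example, the replication factor of 1000 is arbitrary, and `ed` and `aed` are hypothetical helper names. Python's list repetition `x * n` concatenates a list with itself n times, which matches the replication operation in the property.

```python
import math


def ed(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def aed(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Assumed example: two individuals who exchange positions.
x = [1.0, 2.0]
y = [2.0, 1.0]
n = 1000  # arbitrary number of replications

aed_invariant = math.isclose(aed(x, y), aed(x * n, y * n))
ed_invariant = math.isclose(ed(x, y), ed(x * n, y * n))

print(aed_invariant, ed_invariant)
```

On this example the averaged distance is unaffected by replication, while the plain Euclidean distance grows with the size of the replicated society.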

Moreover, One-dimensional Value-Sensitivity (Property 3 above) may also be rejected,
on intuitive or empirical grounds, in favour of the following more general variant:
Property 3∗ (Generalized value-sensitivity). For all \(a, b \in X\), \(d_1(a, b) = G(|g(a) - g(b)|)\)
for some continuous and strictly increasing function \(G : X_+ \to \mathbb{R}_+\) such that
\(G(0) = 0\), and some continuous and strictly increasing function \(g : X \to X\),
where the nature of the function g depends on the application. In Section 4.1, for in-
stance, we argue that this variant should be preferred in the context of social mobility
measurement, where g can be interpreted as an “economic status” function. [TK70]
discuss it in connection with psychological applications and also present (Theorem 1)
a different derivation from assumptions expressed in terms of a “betweenness” relation.
Observe also that, if Property 3 is replaced by Property 3∗, Property 5 needs to be
reinterpreted by noticing that the distance between the swapped values must be measured
taking into account the function g.¹²
3. Characterizations
In what follows, when we say that a distance {dn } over X — intended as an infinite
collection of distance functions over X n — satisfies a certain property, we mean that
every element dn of the collection satisfies that property.
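Theorem 1. A distance \(\{d_n\}\) over \(\mathcal{X}\)
(1) satisfies Properties 1, 2, 3, 4, 5, if and only if for all \(n \in \mathbb{N}\) and all \(\mathbf{x}, \mathbf{y} \in X^n\):
\[ d_n(\mathbf{x}, \mathbf{y}) = H\left( \sum_{i=1}^{n} (x_i - y_i)^2 \right), \]
for some continuous and strictly increasing \(H : \mathbb{R}_+ \to \mathbb{R}_+\) with \(H(0) = 0\);
(2) satisfies Properties 1, 2∗, 3, 4, 5, if and only if for all \(n \in \mathbb{N}\) and all \(\mathbf{x}, \mathbf{y} \in X^n\),
\[ d_n(\mathbf{x}, \mathbf{y}) = H\left( \frac{1}{n} \sum_{i=1}^{n} (x_i - y_i)^2 \right), \]
for some continuous and strictly increasing \(H : X_+ \to \mathbb{R}_+\) with \(H(0) = 0\).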
¹² In particular, when Property 5 is used in conjunction with Property 3∗, one has to recall that
the third condition of Property 5 becomes \(|g(x_i) - g(x_j)| = |g(x_k) - g(x_m)|\) and \(|g(y_i) - g(y_j)| \le
|g(y_k) - g(y_m)|\), and thus the interpretation of the property depends on g.

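Theorem 2. A distance \(\{d_n\}\) over \(\mathcal{X}\)
(1) satisfies Properties 1, 2, 3∗, 4, 5, if and only if for all \(n \in \mathbb{N}\) and all \(\mathbf{x}, \mathbf{y} \in X^n\),
\[ d_n(\mathbf{x}, \mathbf{y}) = H\left( \sum_{i=1}^{n} (g(x_i) - g(y_i))^2 \right), \]
for some continuous and strictly increasing \(H : \mathbb{R}_+ \to \mathbb{R}_+\) with \(H(0) = 0\);
(2) satisfies Properties 1, 2∗, 3∗, 4, 5, if and only if for all \(n \in \mathbb{N}\) and all \(\mathbf{x}, \mathbf{y} \in X^n\),
\[ d_n(\mathbf{x}, \mathbf{y}) = H\left( \frac{1}{n} \sum_{i=1}^{n} (g(x_i) - g(y_i))^2 \right), \]
for some continuous and strictly increasing \(H : X_+ \to \mathbb{R}_+\) with \(H(0) = 0\).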
The proof of these theorems is contained in the Appendix.
Theorems 1 and 2 can be directly applied to justify the use of one or the other version
of Euclidean distance, in all application contexts in which the relevant properties are
satisfied. Typically, an application context C is represented by some data space and
some intended intuitive distance function over it, i.e. some distance function partially
specified by a set of intuitive comparative judgments of the form “x is closer to y
than z”. The problem is determining the functional form that d must have in order to
comply with such intuitive judgments. If the characterizing properties are satisfied by
the intuitive judgments which can be obtained in the application context C – and may
sometimes be revealed experimentally – then Theorems 1 and 2 dictate, up to a mono-
tonic transformation, the functional form of the distance function: Euclidean Distance
(Theorem 1.1), Averaged Euclidean Distance (Theorem 1.2), Generalized Euclidean
Distance (Theorem 2.1) and Averaged Generalized Euclidean Distance (Theorem 2.2).
Since these results provide ordinal representations, they clearly do not allow us to
say anything about the specific monotonic transformation that should be adopted in
each particular application context; in particular, they do not allow us to discriminate
between each variant of Euclidean distance and its squared form.¹³ The choice of the
appropriate transformation will depend on further considerations which are application-
dependent. We shall end this section by showing, by way of example, how Theorem 1
can be strengthened to provide sharper characterizations of ED and AED. For this
purpose consider the following extra property, that we call linear homogeneity:

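(Linear Homogeneity). For all \(n \in \mathbb{N}\), all \(\mathbf{x}, \mathbf{y} \in X^n\) and all constants \(\alpha \in X\),
\[ d_n(\alpha \mathbf{x}, \alpha \mathbf{y}) = \alpha \cdot d_n(\mathbf{x}, \mathbf{y}). \]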

Then, we can easily obtain the following:


Corollary 1. A distance \(\{d_n\}\) over \(\mathcal{X}\)
(1) satisfies Properties 1, 2, 3, 4, 5 and linear homogeneity if and only if for all
\(n \in \mathbb{N}\) and all \(\mathbf{x}, \mathbf{y} \in X^n\):
\[ d_n(\mathbf{x}, \mathbf{y}) = \beta \cdot \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}, \]
for some constant β > 0.
(2) satisfies Properties 1, 2∗, 3, 4, 5 and linear homogeneity if and only if for all
\(n \in \mathbb{N}\) and all \(\mathbf{x}, \mathbf{y} \in X^n\),
\[ d_n(\mathbf{x}, \mathbf{y}) = \beta \cdot \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - y_i)^2}, \]
for some constant β > 0.
¹³ Some monotonic transformations do not give rise, strictly speaking, to a distance function. For
instance, squared Euclidean Distance does not satisfy the triangle inequality. However, as already
explained in the introduction, in this paper we use the expression “distance function” in a looser way.


Clearly, an exact characterization of ED and AED follows immediately from the above
corollary via an appropriate normalization.
To show Corollary 1.1, observe that, by Theorem 1.1 and linear homogeneity:
\[ d_n(\alpha \mathbf{x}, \alpha \mathbf{y}) = H\left( \alpha^2 \cdot \sum_{i=1}^{n} (x_i - y_i)^2 \right) = \alpha \cdot d_n(\mathbf{x}, \mathbf{y}) = \alpha \cdot H\left( \sum_{i=1}^{n} (x_i - y_i)^2 \right). \]
Hence, H must be such that \(H(\alpha^2 \cdot t) = \alpha \cdot H(t)\) for all constants \(\alpha \in X\) and all
\(t \in \mathrm{dom}(H)\). But this is possible only if \(H(u) = \beta \cdot \sqrt{u}\) for some constant β > 0 (since
H is strictly increasing). The argument for Corollary 1.2 is identical, once the distance
function is evaluated in accordance with Theorem 1.2.
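
The role of linear homogeneity in pinning down the square-root form can be illustrated numerically. The Python sketch below compares two members of the ordinal class from Theorem 1.1 on assumed data: the square-root choice of H (plain Euclidean distance) and the identity choice of H (squared Euclidean distance). The vectors, the scaling constant `alpha = 0.5` (taken as a nonnegative element of X = [0, 1]) and the helper names are assumptions made for the example.

```python
import math


def sq_ed(x, y):
    """H = identity: squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def ed(x, y):
    """H = square root: Euclidean distance."""
    return math.sqrt(sq_ed(x, y))


# Assumed example data.
x = [0.1, 0.4, 0.8]
y = [0.3, 0.2, 0.5]
alpha = 0.5

ax = [alpha * t for t in x]
ay = [alpha * t for t in y]

# Linear homogeneity asks for d(alpha*x, alpha*y) == alpha * d(x, y).
ed_homogeneous = math.isclose(ed(ax, ay), alpha * ed(x, y))
sq_homogeneous = math.isclose(sq_ed(ax, ay), alpha * sq_ed(x, y))

# The squared form scales with alpha**2 instead.
sq_scales_quadratically = math.isclose(sq_ed(ax, ay), alpha ** 2 * sq_ed(x, y))

print(ed_homogeneous, sq_homogeneous, sq_scales_quadratically)
```

Both functions rank pairs of vectors identically, so only an extra cardinal requirement such as linear homogeneity separates them.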

4. Applications: two case-studies


4.1. Case study 1: social mobility measurement. When discussing mobility is-
sues, a basic distinction is usually made between intergenerational and intragenera-
tional mobility. The first concept concerns the study of how the distribution of some
relevant measure of individual status changes between different generations in a given
society. Alternatively, intragenerational mobility studies how the distribution of indi-
vidual status changes among a group of individuals over a given period of their lifetime.
In general, the simplest framework to capture either of these aspects is to consider how,
in a society of n individuals, a vector x is transformed into another vector y, where
the i-th element xi denotes the value of a relevant indicator of the social and economic
status of individual i, and yi denotes its value in the next generation (intergenerational
case) or in the next time period (intragenerational case). Typical variables employed in
most mobility studies for measuring socioeconomic status are income, wage, consump-
tion, education, and occupational prestige. Focusing on income mobility, intended as a
measure of non-directional income movement (as, for example, in [FO96] and [FO99b]),
a social mobility index is then a function from \(\mathbb{R}^n_+ \times \mathbb{R}^n_+\) to \(\mathbb{R}_+\) which can be naturally
interpreted as a distance function between fathers’ and sons’ incomes. So, in this
application context the data space is \(\mathbb{R}^n_+ \times \mathbb{R}^n_+\), where each vector represents the economic
status of individuals in a society, and the intended intuitive distance d is some distance
function that complies with intuitive mobility comparisons.
In the context of intergenerational mobility measurement, Property 1 seems unexcep-
tionable. Property 4 has been assumed in the axiomatic study of mobility measurement
of Fields and Ok [FO99b] (where it is known as “subgroup consistency”); alternatively,
the stronger “decomposability” property is often employed (see e.g. [Cow85], [FO96],
[MO98]), which implies Property 4. On the other hand, many indices such as Spearman’s
rank correlation or Pearson’s correlation coefficient, which are commonly used
to measure intergenerational mobility, fail to satisfy Property 4, and, while subgroup
consistency axioms are also widely used in the inequality and poverty measurement
literature, they are not without their critics, see e.g. the discussion in Foster and Sen
[FS97].
Property 2 is quite inappropriate if one believes that social mobility in a society
should be measured in per-capita terms. So, Property 2 can be dropped and replaced
by the Replication Invariance property (Property 2∗) which led us to Theorem 1.2.¹⁴
In this case, one would consider the per-capita Euclidean distance as a good candidate
for the sound distance function in the social mobility context. Let us now focus on the
remaining properties, namely 3 and 5.
¹⁴ On the other hand, the Replication Invariance Property too may be criticized in mobility
applications. For example, one may legitimately perceive less mobility in a society with two families
with incomes, say, (1, 2) and (2, 1) than in a society which replicates these two families a million
times. If this view is strongly held, our theorem implies that Averaged Euclidean Distance is not a
good candidate for mobility measurement.
¹⁵ While such a symmetric treatment seems appropriate from a descriptive viewpoint, some may
argue that it is unfortunate from a normative one, since upward mobility may appear to be more
desirable than downward mobility.
If the domain of the mobility index consists of dollar incomes, as often assumed, it may
be unreasonable to postulate that a father-son movement, say, between $100 and $110
has the same level of mobility as a movement between $1000 and $1010, as implied
by Property 3. If one interprets a social mobility index as measuring the distance
between the economic status of two generations, this kind of objection would lead to
rejecting the identification of economic status with dollar income, and suggests replacing
Property 3 with an analogous property which is not subject to this criticism. As a
replacement of Property 3 consider then Property 3∗, where g is interpreted as an
appropriate function measuring “economic status” whose form is application-dependent
(but see below for a brief discussion). Note that this property implies the symmetry
of upward and downward mobility. As remarked by Fields and Ok [FO99b], this
symmetry is unexceptionable if one does not distinguish between “good” and “bad”
movements of income.¹⁵ As for Property 5, observe that once Property 3 has been
replaced by Property 3∗, its interpretation inherits the consideration of economic status
which is incorporated into the latter. As a consequence, the distance between the
swapped values is measured taking into account the effect of the status function g. In
particular, observe that, if g is a concave function the distance between two swapped
values in the application of Property 5 will be smaller for incomes of high range even
when the difference between their absolute values is the same. Having clarified this,
Property 5 seems a natural monotonic extension of the order-reversing swap property
much discussed in the mobility literature.16 Theorem 2.2 implies that if these proper-
ties are deemed compelling in this application context, AGED may be a good choice
for mobility measurement. Applications of this theorem depend on the choice of an
appropriate status function g. Some may argue that a good choice would be, for ex-
ample, to take g as the log function, so that an income movement from $1000 to $2000
would have the same mobility effect as a movement from $100 to $200. In general, the
most appropriate choice of g depends on the application context and must be justified
independently. Property 3∗ (and, as a consequence, Property 5), can therefore be seen
as a general constraint on this choice.17
Comparing the characterizing properties of AGED with those of the “city-block” and
Minkowski distance functions applied to dollar-incomes, or those of the “city-block”
function applied to log-dollar incomes, proposed in the seminal and much used char-
acterizations of mobility indices ([FO96], [MO98] and [FO99b]), the researcher may
agree with Property 5 and therefore decide on using AGED (with judicious choice of
the transformation function g).
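
As a minimal illustration of the role of the status function g discussed above, the following Python sketch computes an averaged generalized Euclidean distance of the form given in the proof of Theorem 2 in the Appendix. The function name `aged`, the choice of the square root as the outer monotonic transformation, and the sample incomes are assumptions made for this example only.

```python
import math


def aged(x, y, g=math.log):
    """Averaged generalized Euclidean distance between income vectors x and y.

    Assumption: the outer monotonic transformation is the square root.
    """
    n = len(x)
    return math.sqrt(sum((g(a) - g(b)) ** 2 for a, b in zip(x, y)) / n)


# One father-son pair per society (hypothetical incomes), with g = log.
low_log = aged([100.0], [200.0])
high_log = aged([1000.0], [2000.0])


def identity(t):
    """Status function that identifies economic status with dollar income."""
    return t


# The same two movements measured on plain dollar incomes.
low_raw = aged([100.0], [200.0], g=identity)
high_raw = aged([1000.0], [2000.0], g=identity)
```

With g = log both movements receive the same distance, log 2, whereas with the identity status function the second movement is ten times as large as the first.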

4.2. Case study 2: multidimensional spatial voting. The standard univariate
spatial theory of voting assumes the existence of a policy space P which is typically an
interval X = [0, 1], R+ or R, such that different alternatives can be represented as points
in P (see e.g., [ASB99]). A multidimensional policy space can then be represented as
the Cartesian product $P^n$, such that each issue has a well-defined unit of measurement
which is shared by all voters, and the different alternatives over which voters are
assumed to vote are elements of $P^n$ (real-valued n-dimensional vectors). A voter’s
preferences are then characterized by an ideal point in $P^n$ and a distance function on
$P^n$.

Footnote 16. This property appears to be uncontroversial if one holds the view of social mobility as an indicator
of (non-directional) movements in income, in which case it is natural to require that a mobility measure
be order-sensitive as defined in this paper. In the context of intergenerational income mobility, order-reversing
swaps always decrease the positive association between fathers’ and sons’ incomes in a society,
and have been discussed, among others, by Atkinson [Atk83] and Dardanoni [Dar93]. On the other
hand, it has been noticed (see e.g. Van de Gaer, Schokkaert and Martinez [VdGSM01]) that such
uniform treatment of order-reversing swaps may conflict with the interpretation of social mobility
as an indicator of equality of opportunity. The relationship between our properties and the equal
opportunity interpretation of mobility requires a lengthy and detailed discussion which cannot be
undertaken here.
Footnote 17. Linking the concept of economic status to individual welfare, by interpreting g as an empirically
revealed utility function, could help this process.
Given an ideal point $x \in P^n$, the Euclidean distance induces a preference ordering
$\succeq_x$ on $P^n$ such that
\[ y \succeq_x z \iff -\sum_{i=1}^{n} (x_i - y_i)^2 \ge -\sum_{i=1}^{n} (x_i - z_i)^2, \]
that is, the individual will vote for y whenever y is “closer” to her ideal point x than
z. For illustration, in the two-dimensional case the ordering can be represented by the
utility function $U_x(y) = -(x_1 - y_1)^2 - (x_2 - y_2)^2$, and the indifference curves are circles
centered at $(x_1, x_2)$. Theorem 1.1 shows that using Euclidean preferences in a given
context is equivalent to deeming its characterizing properties to comply with intuitive
judgments in that particular context. Here an intuitive distance function measures the
“closeness” between two political platforms.
In multidimensional spatial voting, Property 3 is indeed uncontroversial since it
can be seen as the Blackian and Downsian starting point for any multidimensional
extension. Similarly, Properties 4 and 2 seem natural properties to impose in this
context. However, Properties 1 and 5 appear to be intuitively sound only in situations
where different issues are regarded as equally important from the voter’s viewpoint.
Hence, while Theorem 1.1 justifies the use of Euclidean distance in all such restricted
situations, it can be argued that the restriction is quite unrealistic, since different issues
are often given different saliency.
When issues are assumed to have different saliency, the usual approach consists in
“weighting” different issues by means of a real number expressing the relative
importance assigned to them by an individual voter. Given a set of positive weights
$w_1, \dots, w_n$, weighted Euclidean preferences can then be represented by a utility
function $U_{x,w}(y) = -\sum_{i=1}^{n} w_i (x_i - y_i)^2$. It is easy to see that in the bidimensional case
indifference curves are ellipses centered at the ideal point x, with major axis parallel to the
horizontal axis when $w_1$ is lower than $w_2$ and to the vertical axis when $w_1$ is greater than $w_2$.
Given an ideal point x, it is easy to construct examples where, say, y is preferred to z
by weighted Euclidean preferences but not by simple ones.
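
The claim that weighted and simple Euclidean preferences can rank two platforms differently can be checked on a small example. The Python sketch below is only illustrative: the ideal point, the two platforms and the weights are hypothetical values, and `utility` is a name introduced here for the function $U_{x,w}$.

```python
def utility(x, y, w=None):
    """Weighted Euclidean utility of platform y for a voter with ideal point x.

    With w omitted, all weights equal 1 and the simple Euclidean case results.
    """
    if w is None:
        w = [1.0] * len(x)
    return -sum(wi * (xi - yi) ** 2 for wi, xi, yi in zip(w, x, y))


ideal = [0.0, 0.0]      # hypothetical ideal point x
y = [3.0, 0.0]          # far from x on the first issue only
z = [0.0, 2.0]          # closer to x, but on the second issue
weights = [0.1, 1.0]    # the first issue has low saliency for this voter

simple_prefers_y = utility(ideal, y) > utility(ideal, z)
weighted_prefers_y = utility(ideal, y, weights) > utility(ideal, z, weights)
```

Here the simple utilities are -9 for y and -4 for z, so z is preferred; with the weights above they become -0.9 and -4, and the ranking is reversed.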
The intuition underlying the use of a weighted distance function can be expressed
more precisely as follows. Let V be the set of all possible voters. If issues have different
saliency for different voters, the distance function changes from voter to voter, and so
we can denote by $d^v_n$ the distance function of voter v. Among all possible voters, there
are some, which we call the canonical voters, for whom all the different issues have the
same saliency. Let $V_0$ be the subset of V consisting of the canonical voters. Using a
weighted distance function (whatever it may be) means that the distance of a candidate
y from the ideal candidate x for a given voter v is equal to the distance between a(y)
and a(x) for a canonical voter u, where a is a standardization function that multiplies
every coordinate of the vector to which it applies by the “weight” assigned to the
corresponding issue (see footnote 18). In this way the preferences of each non-canonical voter v can
be represented in terms of the preferences of a canonical voter u. So, by connecting
the preferences of non-canonical voters to those of a canonical voter, one may obtain
a precise mathematical expression which represents the former in terms of the latter.
The advantage of this approach is that the distance function of canonical voters may be
determined by means of some known characterization. To summarize, the assumption
underlying the use of weighted distance functions is the following:
Condition 1. For all $v \in V$ and all $u \in V_0$, there are $\alpha_{v,1}, \dots, \alpha_{v,n} \in \mathbb{R}_+$ such that
$d^v_n(x, y) = d^u_n(a_v(x), a_v(y))$, where $a_v(z) = [\alpha_{v,1} z_1, \dots, \alpha_{v,n} z_n]$.

Footnote 18. Such weights can be determined empirically by observing the indifference curves of each voter or
group of voters.

Now, this assumption leaves open how the distance function of a canonical voter
should be appropriately determined. Why should weighted Euclidean distance be ap-
plied and not, for instance, weighted city-block distance? Weighting, as explained
before, is just a way of standardizing the elements of different policy spaces into ele-
ments of a reference policy space (that of a canonical voter) and tells us nothing about
the appropriate distance function for this reference space. Now, in the restricted do-
main of canonical voters we have already argued that the distance function satisfies
Property 1 and Property 5. Moreover, Properties 3, 2 and 4 are intuitively sound for all
$d^v_n$, no matter whether v is canonical or not. Hence, we can assume that all properties
of Theorem 1.1 are satisfied by $d^u_n$, for any canonical voter u:
Condition 2. The distance function $d^u_n$ of any canonical voter u satisfies Properties
1, 2, 3, 4 and 5.
Clearly, Conditions 1 and 2 hold true if and only if, for every $v \in V$,
\[ d^v_n(x, y) = K\Big( \sum_{i=1}^{n} w_i (x_i - y_i)^2 \Big), \]
with $w_i = \alpha_{v,i}^2$ and K a strictly increasing function. For, by Theorem 1.1, $d^u_n$ must be
monotonically related to the Euclidean distance for every canonical voter u, and so:
\[ d^v_n(x, y) = d^u_n\big(a_v(x), a_v(y)\big) = K\Big( \sum_{i=1}^{n} (\alpha_{v,i} x_i - \alpha_{v,i} y_i)^2 \Big). \]
When the distance function also satisfies linear homogeneity and an appropriate
normalization condition (see Corollary 1), K is the square root function. Thus, $d^v_n$ must
be equal to the weighted Euclidean distance, with the weight for issue i given by $\alpha_{v,i}^2$.
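
The reduction expressed by Conditions 1 and 2 can be verified numerically. The following Python sketch assumes that K is the square root function, so that the canonical voter uses plain Euclidean distance; the scaling factors and the two platforms are hypothetical, and all function names are introduced here for illustration.

```python
import math


def euclidean(x, y):
    """Distance function of a canonical voter (K taken to be the square root)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def standardize(alpha, z):
    """The map a_v: multiply each coordinate by the voter's scaling factor."""
    return [a * zi for a, zi in zip(alpha, z)]


def voter_distance(alpha, x, y):
    """Condition 1: distance of voter v as canonical distance after standardization."""
    return euclidean(standardize(alpha, x), standardize(alpha, y))


def weighted_euclidean(w, x, y):
    """Weighted Euclidean distance with issue weights w."""
    return math.sqrt(sum(wi * (a - b) ** 2 for wi, a, b in zip(w, x, y)))


alpha = [2.0, 0.5, 1.0]               # hypothetical scaling factors of voter v
x = [0.2, 0.9, 0.4]                   # hypothetical ideal point
y = [0.5, 0.1, 0.7]                   # hypothetical platform
weights = [a ** 2 for a in alpha]     # w_i equals the square of alpha_{v,i}

lhs = voter_distance(alpha, x, y)
rhs = weighted_euclidean(weights, x, y)
```

The two values agree up to floating-point error, which reflects the identity $(\alpha_{v,i} x_i - \alpha_{v,i} y_i)^2 = \alpha_{v,i}^2 (x_i - y_i)^2$ used in the text.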
We conclude this section by stressing that this result may be helpful in guiding
empirical research. For example, [EMR88] consider weighted Euclidean distance and
weighted city-block distance to determine which is empirically closer to voting behavior.
By means of our analysis, empirical testing can be performed separately on each of the
characterizing properties of Euclidean distance for a canonical voter. If these properties
are accepted by canonical voters and Conditions 1 and 2 are considered sound, then
weighted Euclidean distance should be preferred to weighted city-block distance for all
voters. On the other hand, a direct characterization of weighted Euclidean distance falls
outside the scope of this paper but would be an interesting topic for future research.
5. Conclusions
As should emerge from our discussion, the results presented in Section 3 may be
useful in understanding:
• whether some suitable monotonic transformation of Euclidean Distance, in one
of the variants investigated in this paper, naturally fits a given application
context, by checking whether its characterizing properties are satisfied in it
(case study 1);
• when the characterization results cannot be directly applied, how the original
application context can be reduced to a “canonical” one in which the properties
of some of the characterized distance functions are all satisfied (case study 2).
Our analysis suggests that some variant of the Euclidean distance is likely to be ap-
propriate in many contexts requiring a distance function which is both monotonically
value-sensitive (as made precise by Properties 3/3∗ and 4) and monotonically order-
sensitive (as made precise by Property 5).

6. Appendix
6.1. Proof of Theorems 1 and 2. To prove the theorems, we first need the following
Lemma 1. A distance $\{d_n\}$ over X satisfies Properties 1, 2, 3, 4 if and only if there
exists a continuous and strictly increasing function $f : X_+ \to \mathbb{R}_+$, with $f(0) = 0$, such
that for all $n \in \mathbb{N}$ and all $x, y \in X^n$,
\[ d_n(x, y) = F\Big( \sum_{i=1}^{n} f(|x_i - y_i|) \Big) \]
for some continuous and strictly increasing function $F : \mathbb{R}_+ \to \mathbb{R}_+$ with $F(0) = 0$.
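
To make the separable form in Lemma 1 concrete, the Python sketch below evaluates it for three familiar choices of the pair (f, F). The helper name `separable_distance` and the sample vectors are assumptions made for this illustration; the three resulting distances are the city-block, Euclidean and Minkowski (order 3) distances.

```python
import math


def separable_distance(x, y, f, F):
    """Evaluate F(sum_i f(|x_i - y_i|)) for given increasing functions f and F."""
    return F(sum(f(abs(a - b)) for a, b in zip(x, y)))


x = [1.0, 5.0, 2.0]   # hypothetical vectors; coordinate gaps are 3, 4 and 0
y = [4.0, 1.0, 2.0]

# f(t) = t, F(s) = s gives the city-block distance.
city_block = separable_distance(x, y, f=lambda t: t, F=lambda s: s)

# f(t) = t^2, F(s) = sqrt(s) gives the Euclidean distance.
euclid = separable_distance(x, y, f=lambda t: t ** 2, F=math.sqrt)

# f(t) = t^3, F(s) = s^(1/3) gives the Minkowski distance of order 3.
minkowski3 = separable_distance(x, y, f=lambda t: t ** 3, F=lambda s: s ** (1 / 3))
```

All three choices have f continuous, strictly increasing and vanishing at 0, so they fit the form of the lemma; the proof of Theorem 1 shows that adding Property 5 forces f to be quadratic.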


6.1.1. Proof of Lemma 1. The “if” direction is left to the reader. For the “only if”
direction, we split the proof into four steps.
Step 1. Under Property 1, Property 4 is equivalent to the following:
\[ d_{k+j}([x, u], [y, v]) \le d_{k+j}([x', u'], [y', v']) \iff d_k(x, y) \le d_k(x', y'), \quad \text{whenever } d_j(u, v) = d_j(u', v'), \tag{1} \]
for all k, j, all $x, y, x', y' \in X^k$ and all $u, v, u', v' \in X^j$.
Proof. It can be easily verified that Property 4 is implied by (1). To see the converse,
notice that, under Property 1, (1) is equivalent to:
\[ d_{k+j}([x, u], [y, v]) \le d_{k+j}([x', u], [y', v]) \iff d_k(x, y) \le d_k(x', y'), \tag{2} \]
for all k, j, all $x, y, x', y' \in X^k$ and all $u, v \in X^j$. We then show that, under Property 1,
Property 4 implies (2), and therefore also (1).
Suppose, first, that $d_{k+j}([x, u], [y, v]) \le d_{k+j}([x', u], [y', v])$; then by Property 4 and
contraposition, we have that $d_k(x, y) \le d_k(x', y')$. On the other hand, suppose that
(i) $d_k(x, y) \le d_k(x', y')$ and (ii) $d_{k+j}([x, u], [y, v]) > d_{k+j}([x', u], [y', v])$.
Now, if $d_k(x, y) < d_k(x', y')$, by Property 4, $d_{k+j}([x, u], [y, v]) < d_{k+j}([x', u], [y', v])$,
against the assumption (ii). If $d_k(x, y) = d_k(x', y')$, it follows from (ii), by Property 4
again, that $d_{2k+j}([x, u, x'], [y, v, y']) > d_{2k+j}([x', u, x], [y', v, y])$, against Property 1.
Hence, if $d_k(x, y) \le d_k(x', y')$ it must hold true that
$d_{k+j}([x, u], [y, v]) \le d_{k+j}([x', u], [y', v])$. □
Let us now assume that Properties 1, 2, 3, 4 hold true and recall that X can be any
of the three intervals considered in Section 2, namely R, R+ or [0, 1].
Step 2. For all $n \ge 3$ and all $x, y, w, z \in X^n$, there exists a continuous and strictly
increasing function $f_n : X_+ \to \mathbb{R}_+$, with $f_n(0) = 0$, such that
\[ d_n(x, y) = F_n\Big( \sum_{i=1}^{n} f_n(|x_i - y_i|) \Big) \tag{3} \]
for some continuous and strictly increasing function $F_n : U_n \to \mathbb{R}_+$, with $F_n(0) = 0$,
where $U_n = \{u_1 + \dots + u_n \mid u_i \in f_n(X_+),\ i = 1, \dots, n\}$.
Proof. Observe that Property 3 implies
\[ d_1(a, b) = d_1(|a - b|, 0). \tag{4} \]
It follows from (1) that, for all $n \in \mathbb{N}$ and all $x, y, w, z \in X^n$,
\[ d_1(x_i, y_i) = d_1(w_i, z_i) \text{ for all } i = 1, \dots, n \implies d_n(x, y) = d_n(w, z). \tag{5} \]
Therefore, given (4) above, we have that for all n and all $x, y \in X^n$,
\[ d_n(x, y) = d_n([|x_1 - y_1|, \dots, |x_n - y_n|], [0, \dots, 0]). \tag{6} \]
Let $M_n : X_+^n \to \mathbb{R}$ be defined as follows:
\[ M_n(z) = d_n([z_1, \dots, z_n], [0, \dots, 0]). \]
Now, since $d_n$ is assumed to be continuous, $M_n$ must also be continuous. Moreover, it
follows from (1) and Property 3, via Property 1, that $M_n$ is strictly increasing in each
argument $z_i$. From (1) it also follows that, for all $u, u' \in X_+^h$ and $v, v' \in X_+^k$, with
$h + k = n$,
\[ M_n([u, v]) \ge M_n([u', v]) \implies M_n([u, v']) \ge M_n([u', v']). \]
So, taking into account Property 1, one can apply standard separability results ([Deb60],
see also [Gor68]) to show that, for all $n \ge 3$, the function $M_n$ must be separable (see footnote 19),
that is, there must exist continuous and strictly increasing functions $F_n : U_n \to \mathbb{R}$ and
$f_n : X_+ \to \mathbb{R}$ (where $U_n$ is defined as above) such that $M_n(z) = F_n\big( \sum_{i=1}^{n} f_n(z_i) \big)$.
Thus, by (6) and taking $z_i = |x_i - y_i|$, we have that for all $n \ge 3$:
\[ d_n(x, y) = F_n\Big( \sum_{i=1}^{n} f_n(|x_i - y_i|) \Big). \tag{7} \]

Footnote 19. See [SS02] for a recent thorough discussion on separable preferences.

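We finally show that, without loss of generality, we can assume that the functions $f_n$
and $F_n$ in (3) are such that $f_n(0) = 0$ and $F_n(0) = 0$. In fact, whenever $f_n(0) = c \ne 0$,
let $h_n : X_+ \to \mathbb{R}$ and $H_n : \mathbb{R} \to \mathbb{R}$ be defined as $h_n(t) = f_n(t) - c$ and $H_n(t) = F_n(t + nc)$.
Then, since $\sum_{i=1}^{n} h_n(|x_i - y_i|) = \sum_{i=1}^{n} f_n(|x_i - y_i|) - nc$, it is immediately verified that
\[ F_n\Big( \sum_{i=1}^{n} f_n(|x_i - y_i|) \Big) = H_n\Big( \sum_{i=1}^{n} h_n(|x_i - y_i|) \Big). \tag{8} \]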

Moreover, notice that Properties 2 and 3 imply that, for all $u \in f_n(X_+)$,
\[ F_n(u) = G(f_n^{-1}(u)). \tag{9} \]
To see this, just notice that, by Property 2, for all $n \ge 3$ and for all $x \in X^{n-1}$,
\[ d_1(a, b) = d_n([a, x], [b, x]) = F_n\Big( f_n(|a - b|) + \sum_{i=1}^{n-1} f_n(|x_i - x_i|) \Big). \]
Under the assumption that $f_n(0) = 0$ and by Property 3, we have that for all $n \ge 3$,
\[ d_1(a, b) = d_n([a, x], [b, x]) = F_n[f_n(|a - b|)] = G(|a - b|). \tag{10} \]
Hence, taking $u = f_n(|a - b|)$, we obtain $G(f_n^{-1}(u)) = F_n(u)$. Now, to show that
$F_n(0) = 0$, observe that $f_n(0) = 0$ and, by Property 3, $G(0) = 0$, so that $F_n(0) =
G(0) = 0$. □
Step 3. For all $m, n \ge 3$, $f_n(t) = \alpha_{m,n} \cdot f_m(t)$, for some constant $\alpha_{m,n} \ge 0$ depending
on m and n.
Proof. Property 2 implies immediately that for every $x, y \in X$, all $n \ge 3$ and all
$\mathbf{x} \in X^{n-2}$, $d_2([x, y], [0, 0]) = d_n([x, y, \mathbf{x}], [0, 0, \mathbf{x}])$. Given Step 2, and recalling that
for every $n \ge 3$, $f_n(0) = 0$, it follows that for every $n, m \ge 3$ and every $x, y \in X_+$,
$F_n(f_n(x) + f_n(y)) = F_m(f_m(x) + f_m(y))$, and therefore, since $F_n$ is strictly increasing,
$f_n(x) + f_n(y) = F_n^{-1}\big(F_m(f_m(x) + f_m(y))\big)$. Now, let $u = f_m(x)$ and $v = f_m(y)$; then
(considering that $f_m$ is strictly increasing):
\[ f_n(f_m^{-1}(u)) + f_n(f_m^{-1}(v)) = F_n^{-1}(F_m(u + v)). \tag{11} \]
Recall that, by (9) above, for all $n \ge 3$ and all $u \in f_n(X_+)$, $F_n(u) = G(f_n^{-1}(u))$;
then, for all $n \ge 3$, $f_n^{-1}(u) = G^{-1}(F_n(u))$ and, therefore, $f_n(t) = F_n^{-1}(G(t))$ for all
$t \in X_+$. Observe also that, by (10), $F_n(f_n(X_+)) = G(X_+)$ for all $n \ge 3$, and so
$\mathrm{ran}(F_m \circ f_m) \subseteq \mathrm{ran}(F_n)$ for all $m, n \ge 3$. Hence, for all $u \in f_m(X_+)$,
\[ f_n(f_m^{-1}(u)) = F_n^{-1}\big( G\big( G^{-1}(F_m(u)) \big) \big) = F_n^{-1}(F_m(u)), \]
and so, from (11):
\[ F_n^{-1}(F_m(u)) + F_n^{-1}(F_m(v)) = F_n^{-1}(F_m(u + v)). \tag{12} \]
Let $h(t) = F_n^{-1}(F_m(t))$. Then, (12) can be rewritten as:
\[ h(u) + h(v) = h(u + v). \tag{13} \]
But (13) implies that, for all $t \in f_m(X_+)$,
\[ h(t) = F_n^{-1}(F_m(t)) = \alpha_{m,n} \cdot t \tag{14} \]
for some constant $\alpha_{m,n} \ge 0$ depending on m and n (see footnote 20). Moreover, since $F_m$ and $F_n$ are
strictly increasing, so are $F_n^{-1}$ and $F_n^{-1} \circ F_m$; therefore $\alpha_{m,n} > 0$.
Now, by Property 2 again, it follows that for every $x \in X_+$, all $m, n \ge 3$ with $m \le n$,
all $u \in X^{m-1}$ and all $v \in X^{n-1}$,
\[ d_m([x, u], [0, u]) = d_n([x, v], [0, v]). \tag{15} \]
Therefore, by (3), for every $x \in X_+$ and all $m, n \ge 3$ with $m \le n$, $F_m(f_m(x)) =
F_n(f_n(x))$, and so, by (14),
\[ f_n(x) = F_n^{-1}(F_m(f_m(x))) = \alpha_{m,n} \cdot f_m(x). \tag{16} \]
□

We now show that Property 2 allows us to replace the functions Fn and fn in (3) with
suitable functions F and f which are independent of n, and to extend (3) to the cases
where n < 3.
Step 4. There exists a continuous and strictly increasing function $f : X_+ \to \mathbb{R}_+$, with
$f(0) = 0$, such that for all n and all $x, y \in X^n$,
\[ d_n(x, y) = F\Big( \sum_{i=1}^{n} f(|x_i - y_i|) \Big) \]
for some continuous and strictly increasing function $F : \mathbb{R}_+ \to \mathbb{R}_+$ with $F(0) = 0$.


Proof. Let f be equal to $f_3$ and let $\alpha_n$ be the appropriate constant satisfying (14) for
m = 3 and a given $n \ge 3$. Let also $F'_n(t) = F_n(\alpha_n \cdot t)$. Then, it follows from Step 2
and (16) that for all $n \ge 3$ and all $x, y \in X^n$:
\[ d_n(x, y) = F_n\Big( \sum_{i=1}^{n} f_n(|x_i - y_i|) \Big) = F_n\Big( \alpha_n \sum_{i=1}^{n} f(|x_i - y_i|) \Big) = F'_n\Big( \sum_{i=1}^{n} f(|x_i - y_i|) \Big), \tag{17} \]
where $F'_n(t)$ is continuous and strictly increasing. Moreover, by Property 2,


\[ d_n(x, y) = d_{n+1}([x, z], [y, z]) \tag{18} \]
for all $z \in X$, and since
\[ d_{n+1}([x, z], [y, z]) = F_{n+1}\Big( \sum_{i=1}^{n} f_{n+1}(|x_i - y_i|) + f_{n+1}(0) \Big) = F_{n+1}\Big( \sum_{i=1}^{n} f_{n+1}(|x_i - y_i|) \Big) = F_{n+1}\Big( \alpha_{n+1} \sum_{i=1}^{n} f(|x_i - y_i|) \Big) = F'_{n+1}\Big( \sum_{i=1}^{n} f(|x_i - y_i|) \Big), \tag{19} \]
it follows from (17), (18) and (19) that $F'_{n+1}(u) = F'_n(u)$ for all $u \in \mathrm{dom}(F'_n)$.
Therefore, $F = \bigcup_{i=3}^{\infty} F'_i$ is a function from $\bigcup_{i=3}^{\infty} \mathrm{dom}(F'_i) = \mathbb{R}_+$ into $\mathbb{R}_+$. Then F is
continuous and strictly increasing, since all the $F'_i$ are, and
$d_n(x, y) = F\big( \sum_{i=1}^{n} f(|x_i - y_i|) \big)$ for all $n \ge 3$. To conclude the step then observe that, when x and y have size
$k < 3$, by Property 2 and recalling that $f(0) = 0$, it follows that for any $z \in X^m$:
\[ d_k(x, y) = d_{k+m}([x, z], [y, z]) = F\Big( \sum_{i=1}^{k} f(|x_i - y_i|) + \sum_{j=k+1}^{k+m} f(|z_j - z_j|) \Big) = F\Big( \sum_{i=1}^{k} f(|x_i - y_i|) \Big), \tag{20} \]
and so the identity $d_n(x, y) = F\big( \sum_{i=1}^{n} f(|x_i - y_i|) \big)$ holds for all $n \in \mathbb{N}$. Finally,
$F(0) = F_n(\alpha_n \cdot 0) = F_n(0)$ and, by Step 2, $F_n(0) = 0$, so $F(0) = 0$. □

Footnote 20. (13) is a Cauchy equation of the first kind whose solution is given, for example, in Aczel [Acz66].

6.1.2. Proof of Theorem 1. Given the above lemma, to prove the “if” direction of
Theorem 1.1 it is sufficient to verify that any strictly increasing function of the Euclidean
distance satisfies Property 5, that is, to verify that, whenever $(x_i - x_j)(y_i - y_j) > 0$,
$(x_k - x_m)(y_k - y_m) > 0$, $d_1(x_i, x_j) = d_1(x_k, x_m)$ and $d_1(y_i, y_j) \le d_1(y_k, y_m)$, we have
\[ (x_i - y_j)^2 + (x_j - y_i)^2 + (x_k - y_k)^2 + (x_m - y_m)^2 \le (x_i - y_i)^2 + (x_j - y_j)^2 + (x_k - y_m)^2 + (x_m - y_k)^2. \tag{21} \]
After simplification, this inequality becomes $(x_m - x_k)(y_m - y_k) \ge (x_j - x_i)(y_j - y_i)$,
and the result follows. To prove the “only if” direction, we need to show that the
function f in Lemma 1 must be quadratic. Let $a, b, c \in X$ be arbitrary non-negative
real numbers with $a \ge c$. Let also $x, y \in X^4$ be such that
\[ x = [a, (a + c), a, (a + c)], \qquad y = [0, c, a, (a + c)] \]
so that
\[ \sigma_{12}(y) = [c, 0, a, (a + c)], \qquad \sigma_{34}(y) = [0, c, (a + c), a]. \]
Let $\sigma_{12}(y) = w$ and $\sigma_{34}(y) = z$. Since $(x_1 - x_2)(y_1 - y_2) > 0$, $(x_3 - x_4)(y_3 - y_4) > 0$,
$d_1(x_1, x_2) = d_1(x_3, x_4)$ and $d_1(y_1, y_2) = d_1(y_3, y_4)$, by Property 5 it follows that
$d_4(x, w) = d_4(x, z)$; hence, by Lemma 1, we have that
\[ F\Big( \sum_{i=1}^{4} f(|x_i - w_i|) \Big) = F\Big( \sum_{i=1}^{4} f(|x_i - z_i|) \Big) \]
and therefore, since F is strictly increasing, $\sum_{i=1}^{4} f(|x_i - w_i|) = \sum_{i=1}^{4} f(|x_i - z_i|)$.
Hence, subtracting $\sum_{i=1}^{4} f(|x_i - y_i|)$ from both sides, we obtain
\[ \sum_{i=1}^{4} f(|x_i - w_i|) - \sum_{i=1}^{4} f(|x_i - y_i|) = f(a + c) + f(a - c) - 2f(a) \]
and
\[ \sum_{i=1}^{4} f(|x_i - z_i|) - \sum_{i=1}^{4} f(|x_i - y_i|) = 2f(c) - 2f(0). \]
Therefore, recalling that $f(0) = 0$, we must have for all $a \ge c \ge 0$:
\[ f(a + c) - f(a) = f(a) - f(a - c) + 2f(c). \]
This functional equation has a unique solution $f(t) = \alpha t^2$ for some constant $\alpha > 0$.
Hence, from Lemma 1, it follows that $d_n(x, y) = F\big( \alpha \cdot \sum_{i=1}^{n} (x_i - y_i)^2 \big)$. Since
$F(\alpha \cdot u)$ is strictly increasing in u and $F(\alpha \cdot 0) = F(0) = 0$, this concludes the proof of
the theorem.
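
The functional equation obtained in the proof above can be explored numerically. The Python sketch below is only a sanity check on a few hypothetical sample points, not a substitute for the argument: it confirms that a quadratic f satisfies the equation, while the linear and cubic candidates (which are also continuous, strictly increasing and zero at the origin) do not. The helper name `residual` and the coefficient 3 are assumptions of this example.

```python
import math


def residual(f, a, c):
    """Left-hand side minus right-hand side of f(a+c) - f(a) = f(a) - f(a-c) + 2 f(c)."""
    return (f(a + c) - f(a)) - (f(a) - f(a - c) + 2 * f(c))


def satisfies(f, pairs):
    """True if the residual vanishes (up to rounding) at every sample pair (a, c)."""
    return all(math.isclose(residual(f, a, c), 0.0, abs_tol=1e-9) for a, c in pairs)


# Hypothetical sample points with a >= c > 0.
pairs = [(1.0, 0.5), (2.0, 2.0), (3.5, 1.25)]

quadratic_ok = satisfies(lambda t: 3.0 * t ** 2, pairs)   # f(t) = alpha * t^2
linear_ok = satisfies(lambda t: t, pairs)                  # city-block choice of f
cubic_ok = satisfies(lambda t: t ** 3, pairs)              # Minkowski order-3 choice
```

For the linear candidate the residual equals $-2c$, and for the cubic one it equals $2c^2(3a - c)$, so both fail whenever $a \ge c > 0$.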
To prove Theorem 1.2, just replace Lemma 1 with the following:
Lemma 2. A distance $\{d_n\}$ over X satisfies Properties 1, 2∗, 3, 4 if and only if there
exists a continuous and strictly increasing function $f : X_+ \to \mathbb{R}_+$, with $f(0) = 0$,
such that for all $n \in \mathbb{N}$ and all $x, y \in X^n$, $d_n(x, y) = F\big( \frac{1}{n} \sum_{i=1}^{n} f(|x_i - y_i|) \big)$ for some
continuous and strictly increasing function $F : f(X_+) \to \mathbb{R}_+$ with $F(0) = 0$.
Its proof is identical to that of Lemma 1 up to Step 2, and thereafter continues as follows.
Let $g_n = n f_n$, so that
\[ d_n(x, y) = M_n(|x_1 - y_1|, \dots, |x_n - y_n|) = F_n\Big( \sum_{i=1}^{n} \frac{1}{n} g_n(|x_i - y_i|) \Big). \tag{22} \]
Then, following the derivation of equation (21) from equation (14) in Foster and
Shorrocks [FS91], we have that Property 2∗ allows us to choose the functions $F_n$
and $g_n$ to be independent of n (i.e. such that for all $m, n \in \mathbb{N}$, $F_m = F_n = F$ and
$g_m = g_n = f$, for some fixed F and f) and to extend (22) to the case of $n < 3$. Thus
$d_n(x, y) = F\big( \frac{1}{n} \sum_{i=1}^{n} f(|x_i - y_i|) \big)$ holds for all $n \in \mathbb{N}$. Then, a proof of the second part
of the theorem can be obtained by using Lemma 2 instead of Lemma 1 and adapting
the previous proof accordingly.
6.1.3. Proof of Theorem 2. A proof of Theorem 2 could be obtained using an argument
based on separability properties, similar to the one used in the proof of Theorem 1.
Here we suggest how to obtain a simpler proof which uses Theorem 1 as a lemma.
First, (1) implies that for all $n \in \mathbb{N}$ and all $x, y, w, z \in X^n$,
\[ d_1(x_i, y_i) = d_1(w_i, z_i) \text{ for all } i = 1, \dots, n \implies d_n(x, y) = d_n(w, z). \]
This means that $d_n(x, y) = F_n(d_1(x_1, y_1), \dots, d_1(x_n, y_n))$ for some function $F_n$;
moreover, it also follows from (1) and Property 1 that $F_n$ is strictly increasing in each
argument. Now, by Property 3∗, $d_1(x_i, y_i) = G(|g(x_i) - g(y_i)|)$, with G and g strictly
increasing. Hence $d_n(x, y) = H_n(|g(x_1) - g(y_1)|, \dots, |g(x_n) - g(y_n)|)$ for some strictly
increasing $H_n$.
Now, let $d'_n$ be defined as follows:
\[ d'_n(u, v) = d_n(g^{-1}(u), g^{-1}(v)) = H_n(|u_1 - v_1|, \dots, |u_n - v_n|), \]
where $g^{-1}(z)$ stands for the vector $[g^{-1}(z_1), \dots, g^{-1}(z_n)]$. Clearly, $d'_n$ satisfies
Properties 1, 2 and 4 whenever $d_n$ does. Moreover, it is not difficult to verify that, whenever
$d_n$ satisfies Properties 3∗ and 5, $d'_n$ satisfies Properties 3 and 5. Therefore, whenever
$d_n$ satisfies all the properties of Theorem 2.1, $d'_n$ satisfies all the properties of
Theorem 1.1, and we can apply this theorem to establish that $d'_n$ must be (some monotonic
transformation of) the Euclidean Distance. Thus, writing g(z) for $[g(z_1), \dots, g(z_n)]$,
we have
\[ d_n(x, y) = d'_n(g(x), g(y)) = H\Big( \sum_{i=1}^{n} (g(x_i) - g(y_i))^2 \Big) \]
for some strictly increasing H such that H(0) = 0. Similarly, whenever $d_n$ satisfies all
the properties of Theorem 2.2, $d'_n$ satisfies all the properties of Theorem 1.2, and we
can apply this theorem to establish that $d'_n$ must be (some monotonic transformation
of) the Averaged Euclidean Distance, so that $d_n(x, y) = H\big( \frac{1}{n} \sum_{i=1}^{n} (g(x_i) - g(y_i))^2 \big)$ for
some strictly increasing H such that H(0) = 0.
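
The change of variables used in this proof can be illustrated with a short Python sketch. It assumes, purely for the example, that g is the natural logarithm (so that its inverse is the exponential) and that H is the square root; the function names and the sample vectors are hypothetical.

```python
import math


def d(x, y, g=math.log):
    """Generalized Euclidean distance on the original scale (H taken as the square root)."""
    return math.sqrt(sum((g(a) - g(b)) ** 2 for a, b in zip(x, y)))


def d_prime(u, v, g_inv=math.exp):
    """Transformed distance d'(u, v) = d(g^{-1}(u), g^{-1}(v)), applied coordinatewise."""
    return d([g_inv(a) for a in u], [g_inv(b) for b in v])


def euclidean(u, v):
    """Plain Euclidean distance on the status scale."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


u = [0.0, 1.5, -0.3]   # hypothetical status values
v = [0.7, 1.0, 0.4]

transformed = d_prime(u, v)
plain = euclidean(u, v)
```

On the status scale the transformed distance coincides, up to rounding, with the plain Euclidean distance, which is the step that lets Theorem 1 be applied to $d'_n$.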

References
[Acz66] J. Aczel. Lectures on functional equations and their applications. Academic Press, New
York, 1966.
[ASB99] D. Austen-Smith and J.S. Banks. Positive Political Theory I: Collective Preference. The
University of Michigan Press, 1999.
[Atk83] A.B. Atkinson. The measurement of economic mobility. In A.B. Atkinson, editor, Social
Justice and Public Policy. Wheatsheaf Books Ltd., London, 1983.
[Bri50] G.W. Brier. Verification of forecasts expressed in terms of probability. Monthly Weather
Review, 78:1–3, 1950.
[Cow85] F. Cowell. Measures of distributional change: An axiomatic approach. Review of Eco-
nomic Studies, 52:35–51, 1985.
[Dar93] V. Dardanoni. Measuring social mobility. Journal of Economic Theory, 61:372–94, 1993.
[Deb60] G. Debreu. Topological methods in cardinal utility. In S. Karlin K. Arrow and P. Suppes,
editors, Mathematical methods in the social sciences. Stanford University Press, 1960.
[EMR88] J.M. Enelow, N.R. Mendell, and S. Ramesh. A comparison of two distance metrics
through regression diagnostics of a model of relative candidate evaluation. The Jour-
nal of Politics, 50:1057–1071, 1988.
[ET80] L.G. Epstein and S.M. Tanny. Increasing generalized correlation: A definition and some
economic consequences. Canadian Journal of Economics, 13:16–34, 1980.
[FO96] G.S. Fields and E. Ok. The meaning and measurement of income mobility. Journal of
Economic Theory, 71:349–77, 1996.
[FO99a] G.S. Fields and E. Ok. The measurement of income mobility. In J. Silber, editor, Handbook
of Income Inequality Measurement. Kluwer Academic Publishers, Dordrecht, 1999.
[FO99b] G.S. Fields and E. Ok. Measuring movements of incomes. Economica, 66:455–71, 1999.
[FS91] J.E. Foster and A.F. Shorrocks. Subgroup consistent poverty indices. Econometrica,
59:687–709, 1991.
[FS97] J.E. Foster and A. Sen. On economic inequality after a quarter century. In A. Sen, editor,
On Economic Inequality. Clarendon Press, Oxford, 1997.
[Gor68] W.M. Gorman. The structure of utility functions. Review of Economic Studies, 35:367–
390, 1968.
[HM70] M.J. Hinich and M.C. Munger. Analytical Politics. Cambridge University Press, Cam-
bridge, UK, 1970.
[Maa98] E. Maasoumi. On mobility. In D. Giles and A. Ullah, editors, The Handbook of Economic
Statistics. Marcel Dekker, New York, 1998.
[MO98] T. Mitra and E. Ok. The measurement of income mobility: A partial ordering approach.
Economic Theory, 12:77–102, 1998.
[Sho88] A.F. Shorrocks. Aggregation issues in inequality measurement. In W. Eichhorn, editor,
Measurement in Economics. Springer Verlag, New York, 1988.
[SJ99] S. Santini and R. Jain. Similarity measures. IEEE Transactions on Pattern Analysis and
Machine Intelligence, 21(9):871–883, 1999.
[SS02] U. Segal and J. Sobel. Min, max, and sum. Journal of Economic Theory, 25:126–150,
2002.
[Tch80] A.H.T. Tchen. Inequalities for distributions with given marginals. Annals of Probability,
8:814–27, 1980.
[TJaLA01] T.-J. Wu, Y.-C. Hsieh, and L.-A. Li. Statistical measures of DNA sequence dissimilarity
under Markov chain models of base composition. Biometrics, 57:441–448, 2001.
[TK70] Amos Tversky and David H. Krantz. The dimensional representation and the metric
structure of similarity data. Journal of Mathematical Psychology, 7:572–596, 1970.
[VdGSM01] D. Van de Gaer, E. Schokkaert, and M. Martinez. Three meanings of intergenerational
mobility. Economica, 68:519–537, 2001.
[WBD97] Tiee-Jian Wu, John P. Burke, and Daniel B. Davison. A measure of DNA sequence
dissimilarity based on Mahalanobis distance between frequencies of words. Biometrics,
53:1431–1439, 1997.
[WZF05] Liwei Wang, Yan Zhang, and Jufu Feng. On the Euclidean distance of images. IEEE
Transactions on Pattern Analysis and Machine Intelligence, 27:1334–1339, 2005.

Dipartimento di Scienze Umane, Università di Ferrara


E-mail address: dgm@unife.it

Dipartimento di Scienze Economiche, Aziendali e Finanziarie, Università di Palermo


E-mail address: vdardano@unipa.it
