
ARTICLE IN PRESS

Journal of Mathematical Psychology


www.elsevier.com/locate/jmp

Some probabilistic models of best, worst, and best–worst choices


A.A.J. Marley^{a,b,*}, J.J. Louviere^{c}
^a Department of Psychology, University of Victoria, Victoria, BC, Canada V8W 3P5
^b Department of Economics, University of Groningen, The Netherlands
^c School of Marketing, University of Technology, Sydney, NSW Australia
Received 26 October 2004; received in revised form 15 May 2005

Abstract

Over the past decade or so, a choice design in which a person is asked to select both the best and the worst option in an available
set of options has been gaining favor over more traditional designs, such as where the person is asked, for instance, to: select the best
option; select the worst option; rank the options; or rate the options. In this paper, we develop theoretical results for three
overlapping classes of probabilistic models for best, worst, and best–worst choices, with the models in each class proposing specific
ways in which such choices might be related. The models in these three classes are called random ranking and random utility, joint
and sequential, and ratio scale. We include some models that belong to more than one class, with the best known being the
maximum-difference (maxdiff) model, summarize estimation issues related to the models, and formulate a number of open
theoretical problems.
© 2005 Elsevier Inc. All rights reserved.

Keywords: Best–worst choice; Preference; Choice; Probabilistic modeling; Discrete choice

1. Introduction

Finn and Louviere (1992) proposed a discrete choice task in which a person is asked to select both the best and the worst option in an available (sub)set of options. They presented an analytical model for data from that task, and considered its evaluation using an experimental design ($2^j$ fractional factorial) that ensures that each option, and each pair of distinct options, is presented equally often across the selected subsets of size $j$. Since the publication of that paper, interest in, and use of, such best–worst choice tasks has been increasing, with two recent empirical applications receiving "best paper" awards (Cohen, 2003; Cohen and Neira, 2003).

In principle, best–worst tasks have a number of advantages over traditional discrete choice tasks: (1) a single pair of best–worst choices contains a great deal of information about the person's ranking of options (e.g., if there are 3 items in a set, one obtains the entire ranking of that set; if there are 4 items in a set, one obtains information on the implied best option in 9 of the 11 possible non-empty, non-singleton subsets;^1 and if there are 5 items in a set, one obtains information on the implied best option in 18 of the 26 possible non-empty, non-singleton subsets); (2) best–worst tasks take advantage of a person's propensity to identify and respond more consistently to extreme options; and (3) best–worst tasks seem to be easy for people. Despite increasing use of the approach, the underlying models have not been axiomatized, leaving practitioners

* Corresponding author. Department of Psychology, University of Victoria, P.O. Box 3050 STN CSC, Victoria, BC, Canada V8W 3P5. Fax: +1 250 721 8929.
E-mail addresses: ajmarley@uvic.ca (A.A.J. Marley), jordan.louviere@uts.edu.au (J.J. Louviere).

^1 Let $\{a, b, c, d\}$ be the set of options and suppose that we know that $a$ is selected as best and $d$ as worst. If we now check each (sub)set of size 2, 3, 4 of $\{a, b, c, d\}$ in turn, then we see that the subsets $\{b, c\}$ and $\{b, c, d\}$ are the only ones where the best element is not determined either by the information that $a$ is best or by the information that $d$ is worst.
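The counts quoted in the Introduction and in footnote 1 (all subsets for 3 items, 9 of 11 for 4 items, 18 of 26 for 5 items) can be reproduced by direct enumeration. The following is a minimal sketch; the function name `count_determined` and the convention that the first listed option is the overall best and the last is the overall worst are illustrative assumptions, not part of the paper.

```python
from itertools import combinations

def count_determined(options):
    """Count subsets (size >= 2) whose best element is implied by knowing
    only the overall best (first option) and overall worst (last option)."""
    best, worst = options[0], options[-1]
    total = determined = 0
    for size in range(2, len(options) + 1):
        for subset in combinations(options, size):
            total += 1
            if best in subset:
                # The overall best option is also best in this subset.
                determined += 1
            elif size == 2 and worst in subset:
                # In a pair containing the overall worst, the other option is best.
                determined += 1
    return determined, total

print(count_determined("abc"))    # (4, 4)
print(count_determined("abcd"))   # (9, 11)
print(count_determined("abcde"))  # (18, 26)
```

For four options the two undetermined subsets are exactly $\{b, c\}$ and $\{b, c, d\}$, as stated in footnote 1.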

0022-2496/$ - see front matter © 2005 Elsevier Inc. All rights reserved.
doi:10.1016/j.jmp.2005.05.003

without clear guidelines on appropriate experimental designs, data analyses, and interpretation of results.

This paper develops theoretical results for three overlapping classes of probabilistic models for best, worst, and best–worst choices, with the models in each class proposing specific relationships between such choices. The model classes are called random ranking and random utility, joint and sequential, and ratio scale. We consider models that belong simultaneously to one or more of these classes, with the best known being the maximum-difference (maxdiff) model, which is introduced later in this section, and we formulate a number of open theoretical problems.

We now illustrate the framework, and the basic models, through the maximum-difference (maxdiff) model of best–worst choice. To do so we require some basic notation. Let $T$ with $|T| \geq 2$ denote the finite set of potentially available choice options, and for any subset $X \subseteq T$, with $|X| \geq 2$, let $B_X(x)$ denote the probability that the alternative $x$ is chosen as best in $X$, $W_X(y)$ the probability that the alternative $y$ is chosen as worst in $X$, and $BW_X(x, y)$ the probability that, jointly, the alternative $x$ is chosen as best in $X$ and the alternative $y \neq x$ is chosen as worst in $X$. Thus

$$0 \leq B_X(x),\; W_X(y),\; BW_X(x, y) \leq 1$$

and

$$\sum_{x \in X} B_X(x) = \sum_{y \in X} W_X(y) = \sum_{\substack{x, y \in X \\ x \neq y}} BW_X(x, y) = 1.$$

We assume throughout the paper that for each $x \in T$, $B_{\{x\}}(x) = W_{\{x\}}(x) = 1$.

For motivational purposes only, we assume now that best and worst choices are in some sense more basic than simultaneous best–worst choices, and develop a model of the latter based on the former. So suppose that when asked to choose the best and the worst option in a (finite) set $X$, the person simultaneously, but independently, chooses the best, respectively, the worst, option in $X$. If the resulting options are distinct, these are reported as the best–worst pair of options in $X$; otherwise the person re-samples. As we show later in detail, such a process gives rise to the following representation for the best–worst choice probabilities in terms of the best and the worst choice probabilities:^2 for $x, y \in X$, $x \neq y$,

$$BW_X(x, y) = \frac{B_X(x)\, W_X(y)}{\sum_{\substack{r, s \in X \\ r \neq s}} B_X(r)\, W_X(s)}. \tag{1}$$

Now, suppose that the multinomial logit (MNL or Luce's choice) model holds separately for the best and the worst choice probabilities, i.e., there exist ratio scales $b$ and $w$ such that for $x, y \in X$,

$$B_X(x) = \frac{b(x)}{\sum_{r \in X} b(r)}, \qquad W_X(y) = \frac{w(y)}{\sum_{s \in X} w(s)}. \tag{2}$$

Then direct substitution of (2) in (1) yields that for $x \neq y$,

$$BW_X(x, y) = \frac{b(x)\, w(y)}{\sum_{\substack{r, s \in X \\ r \neq s}} b(r)\, w(s)}. \tag{3}$$

Notice that this combined set of representations has three interesting properties: the best–worst choice probabilities are represented in (1) directly in terms of the best and worst choice probabilities, and are represented in (3) in terms of the ratio scale values that determine the best and the worst choice probabilities, with the same functional form in both representations. Thus, this aggregation method is plausible, and interesting, in that it works both at the level of the choice probabilities and at the level of the scale values. We are immediately led to the first theoretical question: are there other (ratio scale) models that satisfy this type of aggregation property and, if so, how large is this class? The detailed formulation of this problem requires extensive further notation, and its solution the use of complex functional equation techniques. The conjectured final result is that the class of solutions is large, but in an interesting sense the individual solutions do not differ greatly from the above model.

It is important to note that the above example was for choices from a given fixed set $X$. Often probabilistic models of choice (or, briefly, probabilistic choice models) are assumed to be consistent over all the subsets of a finite master set $T$. If that is assumed in the present context, and we consider the binary choice probabilities $B_{\{x,y\}}(x)$ and $W_{\{x,y\}}(y)$, then it is reasonable to assume that, and to test whether,

$$B_{\{x,y\}}(x) = W_{\{x,y\}}(y),$$

which with (2) gives that^3 for each $z \in T$,

$$b(z) = \frac{1}{w(z)}, \tag{4}$$

and so in particular any scale transform $\alpha$ of $b$ is linked to the scale transform $1/\alpha$ of $w$. In the following work, we consider separately the cases where $b$ and $w$ are independent ratio scales, and where they are subject to common scale transforms.

The paper has the following structure. Each of the three main sections develops one of three classes of models, gives examples of that class, and presents solved, and open, theoretical problems about that class. Section 2 introduces the terminology and basic conditions, Section 3 presents random ranking and random utility models, Section 4 joint and sequential choice

^2 We discuss later the theoretical and empirical reasonableness of this representation when $|X| = 2$.

^3 (2) gives actually that $b(z) = c / w(z)$ for some constant $c > 0$, but no generality is lost by assuming that $c = 1$.
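The aggregation property of the maxdiff model, namely that (1) and (3) give the same best–worst probabilities, can be checked numerically. The sketch below uses exact rational arithmetic; the scale values in `b` are hypothetical, `w` is set by (4), and the helper names are illustrative only.

```python
from fractions import Fraction
from itertools import permutations

def mnl(scale, X):
    """Equation (2): Luce (MNL) choice probabilities on X for a ratio scale."""
    total = sum(scale[z] for z in X)
    return {z: scale[z] / total for z in X}

def best_worst_from_margins(B, W):
    """Equation (1): independent best and worst choices, re-sampled until distinct."""
    norm = sum(B[r] * W[s] for r, s in permutations(B, 2))
    return {(x, y): B[x] * W[y] / norm for x, y in permutations(B, 2)}

def best_worst_from_scales(b, w, X):
    """Equation (3): the same functional form written with the scale values."""
    norm = sum(b[r] * w[s] for r, s in permutations(X, 2))
    return {(x, y): b[x] * w[y] / norm for x, y in permutations(X, 2)}

# Hypothetical scale values (an assumption made for illustration).
b = {"a": Fraction(4), "b": Fraction(2), "c": Fraction(1)}
w = {z: 1 / b[z] for z in b}          # equation (4) with c = 1

X = ["a", "b", "c"]
bw_from_1 = best_worst_from_margins(mnl(b, X), mnl(w, X))
bw_from_3 = best_worst_from_scales(b, w, X)

print(bw_from_1 == bw_from_3)          # True
print(bw_from_3[("a", "c")])           # 16/37
# Binary consistency that motivates (4): B_{a,b}(a) equals W_{a,b}(b).
print(mnl(b, ["a", "b"])["a"] == mnl(w, ["a", "b"])["b"])  # True
```

Replacing `w` by `{z: c / b[z] ...}` for any positive constant `c` leaves all three printed results unchanged, which is the content of footnote 3.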

models, and Section 5 ratio scale models. Section 6 summarizes the results and restates the main open theoretical problems.

2. General terminology and conditions

Given a finite master set $T$, and a particular set $X$, $X \subseteq T$, we refer to the set $\{B_X(x),\ x \in X\}$, $\{W_X(y),\ y \in X\}$, $\{BW_X(x, y),\ x, y \in X,\ x \neq y\}$, respectively, as a set of best, worst, best–worst choice probabilities (on $X$). We have a complete set of best, worst, best–worst choice probabilities, respectively, (on a master set $T$) when we have a set of best, worst, best–worst choice probabilities on each $X$, $X \subseteq T$. Unless stated otherwise, we assume that we have a complete set of best, worst, and best–worst choice probabilities on a finite master set $T$ with $|T| \geq 2$.

One can, of course, study distinct models for each of the three types of choice paradigm, that is, for best, worst, and best–worst. However, this creates a very large number of model types. For instance, as indicated above, in the paper we discuss three classes of models: random ranking and random utility, joint and sequential choice, and ratio scale. If the model for each type of choice paradigm (best, worst, best–worst) is assumed to belong independently to each of these classes, then we have already 27 different patterns of assumptions, and, in fact, there will be even more possible patterns because of model subtypes within each model class. Partly because of this proliferation of models, but also because it makes conceptual, and one would hope, empirical, sense, we will assume normally that the three types of choices satisfy a common type of model. We explore also certain natural consistency conditions for linking the best, worst, and best–worst choices on single or multiple sets. The following is one such condition:

Definition 1. A set of best, worst, and best–worst choice probabilities on a set $X$ has consistent margins iff for each $x, y \in X$,

$$B_X(x) = \sum_{z \in X - \{x\}} BW_X(x, z)$$

and

$$W_X(y) = \sum_{z \in X - \{y\}} BW_X(z, y).$$

Perhaps surprisingly, we present plausible models that do not satisfy the above condition, and others that do. Also, note that the best and worst choice probabilities will not in general determine a unique set of best–worst choice probabilities, but if we have consistent margins, the converse is true. Even if we do not have consistent margins, best–worst choices likely focus the person's attention on the task in a way that separate best and worst choices do not.

Random ranking and random utility models are the most commonly studied probabilistic choice models, and frequently models in the other classes have alternative descriptions as models in this class. Thus, we include the general terminology related to random ranking and random utility models in the present section.

First, we need the idea of a ranking of a set $X$ from its best (most preferred) to its worst (least preferred) element, and similarly of a ranking of $X$ from its worst (least preferred) to its best (most preferred) element. We refer to these as best to worst, and worst to best, rankings, respectively. Such rankings may be empirical, i.e., resulting from a person's judgments, but of at least equal importance in this paper is the role such rankings play in the development of probabilistic choice models.

For any set $X$, $X \subseteq T$, let $R(X)$ denote the set of rank orders of $X$. Then, with $|X| = n$, and a given rank order $\rho = \rho_1 \rho_2 \ldots \rho_{n-1} \rho_n$ of $X$, let $B_X(\rho)$ denote the probability that $\rho$ occurs as a best to worst rank order, and $W_X(\rho)$ the probability that $\rho$ occurs as a worst to best rank order; thus, in the former case $\rho_1$ is the best element in the rank order, in the latter case it is the worst element.^4 The assumption that $B_X(\rho)$ and $W_X(\rho)$ are probabilities and that a ranking occurs at each choice opportunity is summarized by: for each $\rho \in R(X)$,

$$0 \leq B_X(\rho),\; W_X(\rho) \leq 1$$

and

$$\sum_{\rho \in R(X)} B_X(\rho) = 1 = \sum_{\rho \in R(X)} W_X(\rho).$$

A set of best and worst ranking probabilities on a set $X$ is any such set $\{B_X(\rho), W_X(\rho) : \rho \in R(X)\}$. A complete set of best and worst ranking probabilities on a set $T$ is any set $\{B_X(\rho), W_X(\rho) : \rho \in R(X),\ X \subseteq T\}$.

When we have a complete set of rankings, we can ask what the relation is between the probabilities of the rankings on the master set $T$ and on a given subset $X \subseteq T$. To state possible relations, consider rankings $\rho \in R(X)$ and $\sigma \in R(T)$, let $\sigma|X = \rho$ mean that the ranking $\sigma$ on $T$ agrees with the ranking $\rho$ on $X$, and let

$$R_T(\rho, X) = \{\sigma \mid \sigma \in R(T) \text{ with } \sigma|X = \rho\}.$$

Then a pair of plausible, though not necessary, assumptions is that for each $\rho \in R(X)$,

$$B_X(\rho) = \sum_{\sigma \in R_T(\rho, X)} B_T(\sigma)$$

^4 A more precise notation would be $B_{R(X)}(\rho)$ and $W_{R(X)}(\rho)$, rather than $B_X(\rho)$ and $W_X(\rho)$. However, the simpler notation should not lead to any confusion.

and

$$W_X(\rho) = \sum_{\sigma \in R_T(\rho, X)} W_T(\sigma).$$

3. Random ranking and random utility models

We are naturally interested in relations between these various choice and ranking probabilities. Since the focus of this paper is best–worst choice, we concentrate on how ranking probabilities might determine such best–worst choice probabilities, though we do consider one result in the other direction. There is extensive related work on how best choice probabilities determine, or are determined by, ranking probabilities (see Falmagne, 1978; Fishburn, 1994, 2002).

For each $z \in T$ and rank order $\sigma \in R(T)$, let $\sigma^{-1}(z)$ denote the rank position of $z$ in $\sigma$, so, for instance, $\sigma^{-1}(z) = 1$ means that $z$ is in the first position in the rank order. Now let

$$R_T(x, X - \{x\}) = \{\sigma \mid \sigma \in R(T),\ \sigma^{-1}(x) < \sigma^{-1}(z),\ z \in X - \{x\}\},$$

$$R_T(X - \{y\}, y) = \{\sigma \mid \sigma \in R(T),\ \sigma^{-1}(z) < \sigma^{-1}(y),\ z \in X - \{y\}\},$$

$$R_T(x, X - \{x, y\}, y) = \{\sigma \mid \sigma \in R(T),\ \sigma^{-1}(x) < \sigma^{-1}(z) < \sigma^{-1}(y),\ z \in X - \{x, y\}\}.$$

For instance, $R_T(x, X - \{x, y\}, y)$ is the set of rank orders of $T$ for which, relative to only the elements of $X$, $x$ is first (best), and $y$ is last (worst). A first observation is that a set of best–worst choice probabilities on a single finite set $X$ is always consistent with such a set of rank orders on that set. To see this, consider the equalities that must be satisfied for this result to hold: there is a probability measure $p_X$ on the set of rank orders of $X$ such that for $x, y \in X$, $x \neq y$,

$$BW_X(x, y) = \sum_{\sigma \in R_X(x, X - \{x, y\}, y)} p_X(\sigma). \tag{5}$$

Note that if $|X| = 2$, then for any $x, y \in X$, $x \neq y$, we can set $p_X(xy) = BW_X(x, y)$, which determines the rank order probabilities $p_X(\rho)$. For any $X$ with $|X| > 2$, each rank order $\rho \in R(X)$ belongs to exactly one of the sets $R_X(x, X - \{x, y\}, y)$, and so, for each $x, y \in X$, $x \neq y$, we can arbitrarily set the probability of one of the rank orders in the set $R_X(x, X - \{x, y\}, y)$ equal to $BW_X(x, y)$, and all the others to zero, leading to (5).

The following two definitions are natural extensions to best–worst choices of a standard definition.

Definition 2. A complete set of best choice probabilities on a finite set $T$ satisfies a best random ranking model provided there is a probability measure $b_T$ on the set of rank orders of $T$, such that for each $x \in X \subseteq T$,

$$B_X(x) = \sum_{\sigma \in R_T(x, X - \{x\})} b_T(\sigma).$$

A complete set of worst choice probabilities on a finite set $T$ satisfies a worst random ranking model provided there is a probability measure $w_T$ on the set of rank orders of $T$, such that for each $y \in X \subseteq T$,

$$W_X(y) = \sum_{\sigma \in R_T(X - \{y\}, y)} w_T(\sigma).$$

A complete set of best–worst choice probabilities on a finite set $T$ satisfies a best–worst random ranking model provided there is a probability measure $p_T$ on the set of rank orders of $T$, such that for each $x, y \in X \subseteq T$, $x \neq y$,

$$BW_X(x, y) = \sum_{\sigma \in R_T(x, X - \{x, y\}, y)} p_T(\sigma).$$

Note that the above definition assumes no relation between the ranking probability measures underlying the best, worst, and best–worst choice probabilities. In line with our earlier observations, we consider next the case where they agree.

Definition 3. A complete set of best, worst, and best–worst choice probabilities on a finite set $T$ satisfies a (best, worst, and best–worst) consistent random ranking model provided there is a probability measure $q_T$ on the set of rank orders of $T$, such that for each $x, y \in X \subseteq T$,

$$B_X(x) = \sum_{\sigma \in R_T(x, X - \{x\})} q_T(\sigma), \tag{6}$$

$$W_X(y) = \sum_{\sigma \in R_T(X - \{y\}, y)} q_T(\sigma) \tag{7}$$

and

$$BW_X(x, y) = \sum_{\sigma \in R_T(x, X - \{x, y\}, y)} q_T(\sigma) \quad (x \neq y). \tag{8}$$

As we see now, the above assumptions are eminently testable. Should they fail we can turn to more general cases where the best, worst, and best–worst choice probabilities, respectively, each satisfy possibly distinct random ranking models.

We now summarize various theoretical issues associated with the definition of a consistent random ranking model, Definition 3. The first result is immediate from the relevant definitions.

Proposition 4. Consider a complete set of best, worst, and best–worst choice probabilities such that the best–worst choice probabilities satisfy a (best–worst) random ranking model, Definition 2. Then the set satisfies a consistent (best, worst, and best–worst) random ranking model, Definition 3, iff it has consistent margins, Definition 1.
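One direction of Proposition 4 can be illustrated by construction: when the best, worst, and best–worst choice probabilities on every subset are all generated from a single ranking measure as in (6)–(8), the margins of Definition 1 are consistent. The sketch below is only an illustration under assumptions; the measure `q` (weights 1 to 6 over the six rank orders of a three-element set) and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, permutations

T = "abc"
rankings = list(permutations(T))
# Hypothetical ranking measure q_T: weights 1..6 over the six rank orders, normalised.
q = {sigma: Fraction(i + 1, 21) for i, sigma in enumerate(rankings)}

def choice_probs(q, X):
    """Best, worst and best-worst probabilities on X induced by q, as in (6)-(8)."""
    B = {x: Fraction(0) for x in X}
    W = {y: Fraction(0) for y in X}
    BW = {pair: Fraction(0) for pair in permutations(X, 2)}
    for sigma, prob in q.items():
        restricted = [z for z in sigma if z in X]   # sigma restricted to X, best first
        x, y = restricted[0], restricted[-1]
        B[x] += prob
        W[y] += prob
        BW[(x, y)] += prob
    return B, W, BW

def has_consistent_margins(B, W, BW):
    """Definition 1: best and worst probabilities are the margins of BW."""
    X = list(B)
    return all(
        B[x] == sum(BW[(x, z)] for z in X if z != x)
        and W[x] == sum(BW[(z, x)] for z in X if z != x)
        for x in X
    )

subsets = [c for n in range(2, len(T) + 1) for c in combinations(T, n)]
all_consistent = all(has_consistent_margins(*choice_probs(q, X)) for X in subsets)
print(all_consistent)  # True
```

With empirical data the same `has_consistent_margins` check could be applied to estimated probabilities, although sampling error would then call for a statistical test rather than exact equality.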

Thus, if we know that a set of best–worst choice data satisfies a random ranking model, then we can test whether or not an additional set of best (or worst) choice data is consistent with the same random ranking model by comparing the relevant margins of the best–worst data with the best (or worst) choice data.

Thus, an important open theoretical question is what are necessary and sufficient conditions for a complete set of best–worst choice probabilities to satisfy a random ranking model. The following are necessary conditions:

(i) The marginal choice probabilities $\sum_{z \in X - \{x\}} BW_X(x, z)$ satisfy a best random ranking model, and the marginal choice probabilities $\sum_{z \in X - \{y\}} BW_X(z, y)$ satisfy a worst random ranking model (with common rank order probabilities).

(ii) The best–worst choice probabilities satisfy regularity: for distinct $x, y \in X \subseteq Y \subseteq T$,

$$BW_X(x, y) \geq BW_Y(x, y). \tag{9}$$

(iii) For distinct $\{x, y, z\} = X \subseteq T$,

$$BW_{\{x,y\}}(x, y) = BW_X(x, y) + BW_X(x, z) + BW_X(z, y). \tag{10}$$

When $|T| = 3$, i.e., we have only a 3 element set and its 2 element subsets, the above condition, (10), is necessary and sufficient for the best–worst choice probabilities to satisfy a random ranking model. This is easily seen by setting, for each $\rho = \rho_1 \rho_2 \rho_3 \in R(T)$, $p_T(\rho) = BW_T(\rho_1, \rho_3)$. Then the best–worst choice probabilities on $T$ are compatible with the random ranking model, and the compatibility of the best–worst choices for the 2 element subsets follows by substituting the rank order probabilities associated with the terms in the right hand side of (10) with $X = T$ in those terms.

One would hope that an approach similar to that used to solve the parallel general case for best (or worst) choice probabilities (see Falmagne, 1978; Barbera and Pattanaik, 1986; Fiorini, 2004) would give a general set of necessary and sufficient conditions in the present case, though we do not currently see how to do so. It is quite possible that the relevant conditions will have a similar structure to those in the earlier literature; for instance, the first such condition beyond regularity that occurs in the characterization of best choice probabilities yields a parallel necessary condition in the present situation, though it does not seem to have an obvious interpretation. The condition is: for a master set $T = \{x, y, z, w\}$,

$$BW_{\{x,y\}}(x, y) - BW_{\{x,y,z\}}(x, y) - BW_{\{x,y,w\}}(x, y) + BW_{\{x,y,z,w\}}(x, y) \geq 0,$$

which is easily checked to be a necessary condition by substituting in the relevant rank orders from (8) and checking that the above expression reduces to one involving only nonnegative (rank order) terms.

We have been assuming so far that the best–worst choice probabilities are generated by a random ranking model. We can consider also the possibility that the rankings are generated by a sequence of best–worst choices to yield the following representation: for $\rho \in R(X)$, $|X| = n$,

$$p_X(\rho) = \begin{cases}
BW_X(\rho_1, \rho_n), & n = 2, 3; \\
BW_X(\rho_1, \rho_n)\, BW_{X - \{\rho_1, \rho_n\}}(\rho_2, \rho_{n-1}), & n = 4, 5; \\
BW_X(\rho_1, \rho_n)\, BW_{X - \{\rho_1, \rho_n\}}(\rho_2, \rho_{n-1}) \cdots BW_{\{\rho_j, \rho_{j+1}\}}(\rho_j, \rho_{j+1}), & n = 2j,\ j \geq 3; \\
BW_X(\rho_1, \rho_n)\, BW_{X - \{\rho_1, \rho_n\}}(\rho_2, \rho_{n-1}) \cdots BW_{\{\rho_j, \rho_{j+1}, \rho_{j+2}\}}(\rho_j, \rho_{j+2}), & n = 2j + 1,\ j \geq 3.
\end{cases}$$

The distinction between the cases with an even versus an odd number of elements arises because in the latter case, with $2j + 1$ elements, the rank position of the final element is determined after $j$ best–worst choices.

We have the open problem of determining whether there are any complete sets of best–worst choice probabilities on (all the subsets of) a set $T$ that satisfy a random ranking model with the above relation between the probabilities of the rank orders in that random ranking model and the complete set of best–worst choice probabilities on the subsets $X \subseteq T$. This question parallels a solved classic one regarding Luce's choice model (Luce and Suppes, 1965).

We now introduce a general class of random utility models for best, worst, and best–worst choice probabilities, specialize it in a way that produces representations formally equivalent to the earlier random ranking framework, and then study related random utility models that do not retain this equivalence. Throughout the paper, we implicitly assume that there is a zero probability of two distinct random variables being equal, which ensures that a single option is selected at each choice opportunity.

Definition 5. A complete set of best, worst, and best–worst choice probabilities on a finite set $T$ satisfies a (best, worst, best–worst) random utility model provided there are random variables $B_z$, $W_z$, $BW_{r,s}$, $r, s, z \in T$, $r \neq s$, such that for each $x, y \in X \subseteq T$,

$$B_X(x) = \Pr\Bigl(B_x = \max_{z \in X} B_z\Bigr),$$

$$W_X(y) = \Pr\Bigl(W_y = \max_{z \in X} W_z\Bigr)$$

and

$$BW_X(x, y) = \Pr\Bigl(BW_{x,y} = \max_{\substack{r, s \in X \\ r \neq s}} BW_{r,s}\Bigr) \quad (x \neq y).$$
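The necessary conditions listed earlier in this section, regularity (9), the three-element identity (10), and the four-element inequality, can be verified exactly for any best–worst random ranking model by enumeration. The following sketch does so for one hypothetical ranking measure `p` on $T = \{x, y, z, w\}$ (weights 1 to 24, normalised); the measure and the helper `bw` are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations, permutations

T = "xyzw"
rankings = list(permutations(T))
# Hypothetical ranking measure: weight i + 1 on the i-th rank order, normalised.
total = sum(range(1, len(rankings) + 1))
p = {sigma: Fraction(i + 1, total) for i, sigma in enumerate(rankings)}

def bw(X, x, y):
    """BW_X(x, y): probability that, within X, x is ranked first and y last."""
    result = Fraction(0)
    for sigma, prob in p.items():
        restricted = [z for z in sigma if z in X]
        if restricted[0] == x and restricted[-1] == y:
            result += prob
    return result

subsets = [frozenset(c) for n in range(2, len(T) + 1) for c in combinations(T, n)]

# Condition (9): regularity for every nested pair X, Y with X a subset of Y.
regular = all(
    bw(X, x, y) >= bw(Y, x, y)
    for X in subsets for Y in subsets if X <= Y
    for x, y in permutations(X, 2)
)

# Condition (10): checked for every three-element subset and every labelling.
triple_ok = all(
    bw({x, y}, x, y) == bw(X, x, y) + bw(X, x, z) + bw(X, z, y)
    for X in subsets if len(X) == 3
    for x, y, z in permutations(X)
)

# Four-element condition for the pair (x, y).
four_term = (bw({"x", "y"}, "x", "y") - bw({"x", "y", "z"}, "x", "y")
             - bw({"x", "y", "w"}, "x", "y") + bw(set(T), "x", "y"))

print(regular, triple_ok, four_term >= 0)  # True True True
```

By inclusion and exclusion, `four_term` is the probability that $x$ precedes $y$ with neither $z$ nor $w$ ranked between them, which is why it cannot be negative under a random ranking model.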

Note that this definition does not imply any relations between the best, worst, and best–worst choice probabilities (other than that each set separately satisfies a random utility model). As we have argued earlier, such generality is unwarranted at this time, so we now look at specializations that involve assumed relations between the three sets of random variables.

One might think that we should require that, when $|X| = 2$,

$$BW_{\{x,y\}}(x, y) = B_{\{x,y\}}(x) = W_{\{x,y\}}(y)$$

or, more generally, that for some $\alpha$, $0 \leq \alpha \leq 1$,

$$BW_{\{x,y\}}(x, y) = \alpha B_{\{x,y\}}(x) + (1 - \alpha) W_{\{x,y\}}(y).$$

Neither property holds generally for random utility models that satisfy Definition 5. It is a theoretical and empirical issue to decide the appropriate resolution of this apparent inconsistency. Shafir (1993) presents data, in an accept versus reject design, that can be interpreted as showing that $B_{\{x,y\}}(x)$ may differ from $W_{\{x,y\}}(y)$.

Definition 6. A complete set of best, worst, and best–worst choice probabilities on a finite set $T$ satisfies a (best, worst, best–worst) consistent random utility model provided it satisfies a random utility model, Definition 5, with a common set of random variables $U_z$, $z \in T$, such that for $r, s \in T$, $r \neq s$,

$$B_z = -W_z = U_z$$

and

$$BW_{r,s} = U_r - U_s.$$

It is important to note that the random variable notation above is intended to mean that the best–worst choice probabilities are derived on the basis of a single common set of sample values $U_z$, $z \in T$. Combining the various assumptions, a consistent random utility model becomes

$$B_X(x) = \Pr\Bigl(U_x = \max_{z \in X} U_z\Bigr), \tag{11}$$

$$W_X(y) = \Pr\Bigl(U_y = \min_{z \in X} U_z\Bigr), \tag{12}$$

$$BW_X(x, y) = \Pr\bigl(U_x > U_z > U_y,\ z \in X - \{x, y\}\bigr) \quad (x \neq y). \tag{13}$$

We have the following equivalence:

Proposition 7. A complete set of best, worst, and best–worst choice probabilities on a finite set $T$ satisfies a consistent random utility model, Definition 6, iff it satisfies a consistent random ranking model, Definition 3.

This result follows routinely, using classic methods for showing such equivalences (Luce and Suppes, 1965; Falmagne, 1978). Note that such models have consistent margins, Definition 1, and so in particular satisfy $BW_{\{x,y\}}(x, y) = B_{\{x,y\}}(x) = W_{\{x,y\}}(y)$. This class of models has been studied extensively for best (equivalently, worst) choice probabilities, with its strengths and weaknesses quite well understood, particularly for Thurstone and Luce (MNL) models. We now present the random utility versions of these two (classes of) models, with extensions to best–worst choices.

Definition 8. A complete set of best, worst, and best–worst choice probabilities on a finite set $T$ satisfies a consistent (best, worst, and best–worst) Thurstone random utility model iff it satisfies a consistent random utility model, Definition 6, such that there exist interval scale values $v_z$, $z \in T$, and independent^5 random variables $\epsilon_z$, $z \in T$, with

$$U_z = v_z + \epsilon_z.$$

A consistent extreme value random utility model is a Thurstone random utility model where each $\epsilon_z$, $z \in T$, has the extreme value distribution.^6

Rewriting the random variables in a consistent extreme value random utility model in the form of Definition 6, we have

$$B_z = v_z + \epsilon_z,$$

$$W_z = -v_z - \epsilon_z,$$

$$BW_{r,s} = v_r - v_s + \epsilon_r - \epsilon_s \tag{14}$$

with $\epsilon_z$, $z \in T$, having extreme value distributions. It is important to note again that all the random variable notation above is intended to mean that the best–worst choice probabilities are derived on the basis of a single common set of sample values $U_z$, $z \in T$, equivalently $\epsilon_z$, $z \in T$.

It follows from standard results, based on (11), that, given a consistent extreme value random utility model, Definition 8, the best choice probabilities satisfy the Luce (MNL) choice model, with scale values $\exp v_z$ (McFadden, 1974). However, and this is important, neither the worst nor the best–worst choice probabilities given by that model will then, in general, satisfy the Luce (MNL) choice model when the scale values $v_z$, $z \in T$, are not all equal. Nonetheless, the relationship between the best and the worst choice probabilities in this model is well known, and has led to much fascinating research (Yellott, 1977, 1980, 1997). Here, we add a result about the representation of best–worst choice probabilities that follows from that earlier work,

^5 We include independence as part of the definition as we only consider that case in this paper.

^6 This means that $\Pr(\epsilon_z \leq t) = \exp(-e^{-t})$ $(-\infty < t < \infty)$.

but has not been noted previously as others have not studied best–worst choice.^7

It is known (e.g., Critchlow et al., 1991) that if a complete set of best, worst and best–worst choice probabilities on a finite set $T$ satisfies a consistent extreme value random utility model, Definition 8, then for each $Y \subseteq T$, $|Y| = m$, and each $\rho = \rho_1 \rho_2 \ldots \rho_{m-1} \rho_m \in R(Y)$,

$$\Pr(U_{\rho_1} > U_{\rho_2} > \cdots > U_{\rho_m}) = B_Y(\rho_1)\, B_{Y - \{\rho_1\}}(\rho_2) \cdots B_{\{\rho_{m-1}, \rho_m\}}(\rho_{m-1}) \tag{15}$$

and so for each $x, y \in T$, $x \neq y$,

$$BW_{\{x,y\}}(x, y) = \Pr(U_x > U_y) = B_{\{x,y\}}(x),$$

and, using (13) and (15), for $x, y \in X \subseteq T$, $|X| = n > 2$, $x \neq y$, and $\eta = \eta_2 \ldots \eta_{n-1} \in R(X - \{x, y\})$,

$$\begin{aligned}
BW_X(x, y) &= \Pr\bigl(U_x > U_z > U_y,\ z \in X - \{x, y\}\bigr) \\
&= \sum_{\eta \in R(X - \{x, y\})} B_X(x)\, B_{X - \{x\}}(\eta_2) \cdots B_{\{\eta_{n-1}, y\}}(\eta_{n-1}) \\
&= B_X(x) \sum_{\eta \in R(X - \{x, y\})} B_{X - \{x\}}(\eta_2) \cdots B_{\{\eta_{n-1}, y\}}(\eta_{n-1}) \\
&= B_X(x)\, \Pr\bigl(U_z > U_y,\ z \in X - \{x, y\}\bigr) \\
&= B_X(x)\, W_{X - \{x\}}(y),
\end{aligned}$$

i.e.,

$$BW_X(x, y) = B_X(x)\, W_{X - \{x\}}(y), \tag{16}$$

an example of a (mixed) sequential best–worst choice model (Section 4, Definition 13).

On the other hand, suppose that we have a complete set of best, worst and best–worst choice probabilities on a finite set $T$, $|T| > 2$, that satisfies a Thurstone random utility model, Definition 8, and also satisfies (16) for all $X \subseteq T$. Then, in particular, for every $u, v \in T$, $u \neq v$, $W_{\{u,v\}}(u) = B_{\{u,v\}}(v)$, and so for any $X \subseteq T$, $|X| = 3$, and $\rho = \rho_1 \rho_2 \rho_3 \in R(X)$, (16) gives

$$p(\rho_1 \rho_2 \rho_3) = \Pr(U_{\rho_1} > U_{\rho_2} > U_{\rho_3}) = BW_X(\rho_1, \rho_3) = B_X(\rho_1)\, B_{\{\rho_2, \rho_3\}}(\rho_2).$$

However, it is known that if a Thurstone random utility model^8 (for a complete set of best choice probabilities) satisfies the above "ranking postulate" (Luce, 1959) for a set $X$ with $|X| = 3$, then the complete set of best choice probabilities on $X$ satisfies the Luce choice model (Theorem 50, Luce and Suppes, 1965). Additionally, it is known that if a complete set of best choice probabilities on a finite set $T$ satisfies a Thurstone random utility model, Definition 8, and the best choice probabilities satisfy Luce's choice model on all subsets $X \subseteq T$ with $|X| \leq 3$, then the best choice probabilities satisfy Luce's choice model for all subsets $X \subseteq T$ (Theorem 5, Yellott, 1977).

Combining all of the above results we have the following proposition.

Proposition 9. Assume that a complete set of best, worst, and best–worst choice probabilities on a finite set $T$ satisfies a Thurstone random utility model. Then the following conditions are equivalent:

(i) The best choice probabilities satisfy Luce's choice model.
(ii) For all $x, y \in X \subseteq T$, $x \neq y$, $BW_X(x, y) = B_X(x)\, W_{X - \{x\}}(y)$.

Note that the argument that (ii) implies (i) only uses subsets $X \subseteq T$ with $|X| \leq 3$.

Also, given that the best choice probabilities satisfy the Luce (MNL) choice model, it follows from standard results (e.g., Yellott, 1977) that the worst choice probabilities do not satisfy a Luce (MNL) choice model, except in special cases such as when the best choice probabilities are equal for each $x \in X$. Nonetheless, for distinct $x, y \in X$ with $\eta = \eta_2 \ldots \eta_{n-1} \in R(X - \{x, y\})$ and so $\rho = x \eta y \in R(X)$, we have, by (15),

$$\begin{aligned}
W_{X - \{x\}}(y) &= \Pr\bigl(U_z > U_y,\ z \in X - \{x, y\}\bigr) \\
&= \Pr\Bigl(U_y = \min_{z \in X - \{x\}} U_z\Bigr) \\
&= \sum_{\eta \in R(X - \{x, y\})} \Pr(U_{\eta_2} > \cdots > U_{\eta_{n-1}} > U_y) \\
&= \sum_{\eta \in R(X - \{x, y\})} B_{X - \{x\}}(\eta_2) \cdots B_{\{\eta_{n-1}, y\}}(\eta_{n-1}),
\end{aligned}$$

and so the worst choice probabilities can be (relatively) easily computed from the best choice probabilities. Also, econometricians or psychometricians might attempt to fit the model through the integrals implicit in the formulae: for $X \subseteq T$,

$$B_X(x) = \Pr\Bigl(U_x = \max_{z \in X} U_z\Bigr),$$

$$W_X(y) = \Pr\Bigl(U_y = \min_{z \in X} U_z\Bigr).$$

We now summarize the details of a parallel consistent Thurstone random utility model where the roles of best and worst choice probabilities are interchanged. So consider now a consistent Thurstone random utility model, Definition 8, where $U_z = v_z - \epsilon_z$, $z \in T$, and each $\epsilon_z$ has the extreme value distribution (note the term $-\epsilon_z$, whereas in the previous example we have $+\epsilon_z$). Then,

^7 We thank the anonymous referee for observations that led us to complete the following example, and Harry Joe for discussing it with us.

^8 In fact, this part of the argument goes through under the weaker assumption of a consistent random ranking model, Definition 3.
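The factorization (16) can be checked by brute force once ranking probabilities are built from Luce best choice probabilities as in (15). The sketch below is a minimal illustration under assumptions: the scale values in `scale` (standing in for $\exp v_z$) are hypothetical and chosen rational so that the arithmetic is exact, and the function names are not from the paper.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical Luce scale values exp(v_z), rational for exact arithmetic.
scale = {"a": Fraction(5), "b": Fraction(3), "c": Fraction(2), "d": Fraction(1)}

def best(X, x):
    """Luce (MNL) best choice probability B_X(x)."""
    return scale[x] / sum(scale[z] for z in X)

def ranking_prob(rho):
    """Equation (15): probability of the best-to-worst rank order rho."""
    prob, remaining = Fraction(1), list(rho)
    while len(remaining) > 1:
        prob *= best(remaining, remaining[0])   # next element is best of what remains
        remaining = remaining[1:]
    return prob

def worst(X, y):
    """W_X(y): total probability of rank orders of X that end in y."""
    return sum(ranking_prob(r) for r in permutations(X) if r[-1] == y)

def best_worst(X, x, y):
    """BW_X(x, y) as in (13): rank orders of X with x first and y last."""
    return sum(ranking_prob(r) for r in permutations(X) if r[0] == x and r[-1] == y)

X = ["a", "b", "c", "d"]
holds = all(
    best_worst(X, x, y) == best(X, x) * worst([z for z in X if z != x], y)
    for x, y in permutations(X, 2)
)
print(holds)  # True
```

The same functions can be used to compare `worst(X, y)` with a Luce form built from reciprocal scale values, which illustrates that the worst choice probabilities of this model are in general not of Luce form.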

using (12), we have

\[ W_X(y) = \Pr\left(U_y = \min_{z \in X} U_z\right) = \Pr\left(v_y - \epsilon_y = \min_{z \in X}[v_z - \epsilon_z]\right) = \Pr\left(-v_y + \epsilon_y = \max_{z \in X}[-v_z + \epsilon_z]\right) = \frac{e^{-v_y}}{\sum_{z \in X} e^{-v_z}}, \]

where the final equality corresponds to the standard result regarding the representation of the Luce (MNL) choice model in terms of random variables $\epsilon_z$, $z \in T$, with each $\epsilon_z$ having the extreme value distribution. If we now proceed in a manner exactly paralleling the previous example, but using (11)–(13) with the worst choice probabilities satisfying the Luce (MNL) choice model and (15) stated in terms of the worst choice probabilities, we obtain the following result.

Proposition 10. Assume that a complete set of best, worst, and best–worst choice probabilities on a finite set $T$ satisfies a Thurstone random utility model. Then the following conditions are equivalent:

(i) The worst choice probabilities satisfy Luce's choice model.
(ii) For all $x, y \in X \subseteq T$, $x \ne y$, $BW_X(x,y) = W_X(y)\, B_{X-\{y\}}(x)$.

Note that the representation in (ii) is an example of a (mixed) sequential best–worst choice model (Section 4, Definition 13) where the roles of best and worst choices are interchanged relative to the first example.

Now we present a variant on the assumptions of the above examples that leads to a random utility model where all three sets of choice probabilities (best, worst, and best–worst) satisfy the Luce (MNL) model. In fact, the resulting choice probabilities are those of the example in Section 1.

Definition 11. A complete set of best, worst, and best–worst choice probabilities satisfies an inverse (best, worst, best–worst) Thurstone random utility model iff it satisfies a random utility model, Definition 5, for which there exist interval scale values $u_z, v_z$, $z \in T$, and independent random variables $\epsilon_z$, $\epsilon_{r,s}$; $r, s, z \in T$, $r \ne s$, such that

\[ B_z = u_z + \epsilon_z, \]
\[ W_z = -v_z + \epsilon_z, \]
\[ BW_{r,s} = u_r - v_s + \epsilon_{r,s}. \tag{17} \]

An inverse extreme value random utility model is an inverse Thurstone random utility model where each $\epsilon_z$ and $\epsilon_{r,s}$; $r, s, z \in T$, $r \ne s$, has the extreme value distribution.

An inverse extreme value random utility model is not a consistent random utility model, Definition 6, even when $u_z = v_z$ (see below).

If we let, for $z \in T$,

\[ b(z) = \exp(u_z), \qquad w(z) = \exp(-v_z), \]

then standard techniques show that the choice probabilities given by an inverse extreme value random utility model, Definition 11, yield the Luce (MNL) choice model for each of $B_X$, $W_X$, $BW_X$, $X \subseteq T$, with, for $x, y \in X \subseteq T$,

\[ B_X(x) = \frac{b(x)}{\sum_{z \in X} b(z)}, \]
\[ W_X(y) = \frac{w(y)}{\sum_{z \in X} w(z)}, \]
\[ BW_X(x,y) = \frac{b(x)\,w(y)}{\sum_{r,s \in X,\, r \ne s} b(r)\,w(s)} \quad (x \ne y). \tag{18} \]

We now make several observations about the similarities and differences between the two extreme value models of (14) and (17):

(i) Each random variable $BW_{r,s}$ depends on a difference of extreme value random variables, $\epsilon_r - \epsilon_s$, in (14) and on a single extreme value random variable, $\epsilon_{r,s}$, in (17). Obviously, this difference leads to differences in the predictions of each model for the best–worst choice probabilities. In particular, (14) predicts that the three types of binary choice probabilities agree, i.e., that $BW_{\{x,y\}}(x,y) = B_{\{x,y\}}(x) = W_{\{x,y\}}(y)$, whereas the general case of (17) predicts that they all differ. It is an empirical question as to which pattern of results, or neither, is correct.
(ii) The extreme value random variables $B_z$ and $W_z$ in (14), but not in (17), satisfy $B_z \overset{d}{=} -W_z$, where $\overset{d}{=}$ means equal in distribution. Thus, a consistent extreme value model, Definition 8, is a consistent random utility model, Definition 6, but an inverse extreme value random utility model, Definition 11, is not; the latter result follows from the fact that the extreme value distribution is not symmetric. If the extreme value distributed random variables $\epsilon_z$ are replaced by random variables with symmetric distributions (for instance, normals), we have $B_z \overset{d}{=} -W_z$ for both (14) and (17).

As noted above, the extreme value random variables $\epsilon_z$ in Definition 8 and Definition 11 are not symmetric. Rephrasing the above results, we have the following results for consistent Thurstone and inverse Thurstone random utility models, i.e., those where the extreme value random variables are replaced by general random variables $\epsilon_z$ with equal means and variances:

1. When $\epsilon_z$ is symmetric, the representations of the best and worst probabilities given by (14) are of the same

form as are the representations of the best and worst probabilities given by (17), and (thus) (14) and (17) agree with each other for the best and the worst choice probabilities.
2. When $\epsilon_z$ is not symmetric, the representations of the best and worst probabilities given by (14) are not of the same form, but the representations of the best and worst probabilities given by (17) are of the same form, and (thus) (14) and (17) do not agree with each other for the best and the worst choice probabilities.
3. It is an empirical question as to which pattern of results, (1) or (2) or neither, is correct.

4. Joint and sequential best–worst choice models

We now motivate models through the idea that a person, in selecting the best–worst pair of options, approaches the selection of the best and the worst option independently, and follows up on these choices, as needed, to ensure that the same option is not selected as both best and worst. We first consider models where the best and worst choices are made jointly, then models where these choices are made sequentially.

A referee asked whether such an independence assumption is reasonable and suggested that we should also consider models with some kind of correlation. The question and suggestion are eminently reasonable. We introduce the "independence" assumption mainly as a way to introduce the two models of this subsection, the first of which can be interpreted as a quasi-independent (log-linear) model (footnote 9) of a type that has been extensively studied, and successfully applied, in the analysis of other types of categorical data, including best choices (see, e.g., Agresti, 2002; Louviere et al., 2000) and best–worst choices (Cohen, 2003; Cohen and Neira, 2003; Finn and Louviere, 1992). The second model involves a slight generalization of the first model. These two models have not received detailed theoretical analysis previously and, given the length of the present paper, we leave the study of correlated versions for future work. And the models of Section 4.2 involve more subtle uses of the idea of "independence" between best and worst choices.

Footnote 9. We thank Tom Wickens for reminding us of this fact.

4.1. Simultaneous best–worst choice between options

The example of Section 1 is now generalized to a broader class of models, and then reconsidered in the light of that class.

Suppose that when a person is asked to choose the best and the worst option in a choice set $X$, then the person simultaneously, but independently, chooses a best and a worst option in $X$. If the resulting chosen options are distinct, these are reported as the best–worst pair in $X$. Otherwise, further decisions are required to select the best–worst pair. We now consider two possible processes for these later decisions, leaving other more complex possibilities for future study should such be required by data. The two cases we consider are:

1. The person chooses with equal probability amongst the possible distinct best–worst pairs. Thus, each pair $x, y \in X$, $x \ne y$, is chosen with probability $\frac{1}{|X|(|X|-1)}$.
2. The person re-samples for the best–worst pair.

We now study the details of models generated by each of these processes.

4.1.1. Case 1: Any choices after the first are equally probable

First we consider the case where, if no decision is reached at the first stage, then at the second stage the person chooses with equal probability amongst the possible distinct best–worst pairs. Then the best–worst choice probabilities are given by: for all $x, y \in X$, $x \ne y$,

\[ BW_X(x,y) = B_X(x)W_X(y) + \frac{1}{|X|(|X|-1)} \sum_{z \in X} B_X(z)W_X(z). \tag{19} \]

We have the following interesting proposition:

Proposition 12. A set of best–worst choice probabilities on a finite set $X$ that satisfies (19) has consistent margins, Definition 1, iff for every $r, s \in X$, $r \ne s$,

\[ B_X(r)W_X(r) = B_X(s)W_X(s). \tag{20} \]

Proof. From (19), we obtain

\[ \sum_{y \in X-\{x\}} BW_X(x,y) = \sum_{y \in X-\{x\}} B_X(x)W_X(y) + \sum_{y \in X-\{x\}} \frac{1}{|X|(|X|-1)} \sum_{z \in X} B_X(z)W_X(z) \]
\[ = B_X(x)[1 - W_X(x)] + \frac{1}{|X|} \sum_{z \in X} B_X(z)W_X(z) \]
\[ = B_X(x) + \left( \frac{1}{|X|} \sum_{z \in X} B_X(z)W_X(z) - B_X(x)W_X(x) \right) \tag{21} \]

and a similar argument shows that

\[ \sum_{x \in X-\{y\}} BW_X(x,y) = W_X(y) + \left( \frac{1}{|X|} \sum_{z \in X} B_X(z)W_X(z) - B_X(y)W_X(y) \right). \tag{22} \]
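The margin identities (21) and (22) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names and the scale values are hypothetical, and the best and worst probabilities are simply assumed to follow the Luce (MNL) model. It builds the Case 1 best–worst probabilities of (19) and compares their margins with the best and worst probabilities, once with $w = 1/b$ (so that the products $B_X(z)W_X(z)$ are all equal) and once with $w = b$ (so that they are not).

```python
from math import isclose


def luce(scale):
    """Luce (MNL) choice probabilities from positive ratio-scale values."""
    total = sum(scale.values())
    return {z: v / total for z, v in scale.items()}


def case1_best_worst(best, worst):
    """Best-worst probabilities of (19): independent best and worst draws,
    with a uniform choice among distinct pairs when the two draws coincide."""
    n = len(best)
    tie = sum(best[z] * worst[z] for z in best)  # P(same option drawn twice)
    return {(x, y): best[x] * worst[y] + tie / (n * (n - 1))
            for x in best for y in best if x != y}


def margins(bw, options):
    """Best and worst margins of a best-worst table (left sides of (21), (22))."""
    best_m = {x: sum(p for (a, _), p in bw.items() if a == x) for x in options}
    worst_m = {y: sum(p for (_, c), p in bw.items() if c == y) for y in options}
    return best_m, worst_m


def has_consistent_margins(best, worst):
    """True when the margins of (19) reproduce the best and worst probabilities."""
    best_m, worst_m = margins(case1_best_worst(best, worst), best)
    return all(isclose(best_m[z], best[z]) and isclose(worst_m[z], worst[z])
               for z in best)


b = {"x": 1.0, "y": 2.0, "z": 3.0}                      # hypothetical scale values
best = luce(b)
worst_inverse = luce({z: 1.0 / v for z, v in b.items()})  # w = 1/b
worst_same = luce(b)                                      # w = b

consistent_inverse = has_consistent_margins(best, worst_inverse)
consistent_same = has_consistent_margins(best, worst_same)
print(consistent_inverse, consistent_same)
```

Under these assumptions the first case satisfies the equal-products condition (20) and the second does not, which is the pattern that Proposition 12 describes.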

It then follows from (21) and (22) that the best–worst choice probabilities have consistent margins iff, for every $x \in X$,

\[ \frac{1}{|X|} \sum_{z \in X} B_X(z)W_X(z) = B_X(x)W_X(x). \tag{23} \]

Clearly, (20) is sufficient for (23) to hold. We now show, by contradiction, that (20) is necessary for (23) to hold. So assume that (23) holds for all $z \in X$, but that (20) does not hold for all $r, s \in X$, $r \ne s$. Then, using the properties of an arithmetic average, there must be two distinct elements of $X$ that we denote by $x_{\min}, x_{\max}$, with

\[ B_X(x_{\max})W_X(x_{\max}) > \frac{1}{|X|} \sum_{z \in X} B_X(z)W_X(z) > B_X(x_{\min})W_X(x_{\min}), \]

which contradicts (23). Hence (20) must hold for all $r, s \in X$, $r \ne s$. $\square$

We now illustrate Proposition 12 with the Luce (MNL) model holding for the best and worst choices. For each $x, y \in X$, we have

\[ B_X(x) = \frac{b(x)}{\sum_{z \in X} b(z)}, \qquad W_X(y) = \frac{w(y)}{\sum_{z \in X} w(z)}. \tag{24} \]

Then with the set of best–worst choice probabilities given by (19), Proposition 12 shows that this model has consistent margins iff there is a constant $c$ such that for every $z \in X$,

\[ b(z)w(z) = c, \]

i.e.,

\[ w(z) = \frac{c}{b(z)}. \tag{25} \]

Note that the resulting representation for $W_X$ in (24) will not depend on $c$. Then it follows from a routine calculation that (19) becomes: for $x, y \in X$, $x \ne y$,

\[ BW_X(x,y) = \frac{\dfrac{b(x)}{b(y)} + \dfrac{1}{|X|-1}}{\sum_{r,s \in X,\, r \ne s} \left[ \dfrac{b(r)}{b(s)} + \dfrac{1}{|X|-1} \right]}. \tag{26} \]

For reassurance, one can check that the representation (26) satisfies

\[ \sum_{z \in X-\{x\}} BW_X(x,z) = \frac{b(x)}{\sum_{z \in X} b(z)} \]

and

\[ \sum_{z \in X-\{y\}} BW_X(z,y) = \frac{1/b(y)}{\sum_{z \in X} 1/b(z)}, \]

which with (24) and (25) shows that it has consistent margins. This is quite fascinating as the best and worst choice probabilities each satisfy a Luce (MNL) choice model, and the best–worst choice probabilities do not, yet the model has consistent margins. Later, we show that the second proposed decision procedure has the 'opposite' property, namely that the best, worst and the best–worst choice probabilities each satisfy a Luce (MNL) choice model, but these sets of choice probabilities do not have consistent margins.

Since the above model has consistent margins, we have that for two element sets $X = \{x, y\}$,

\[ BW_{\{x,y\}}(x,y) = B_{\{x,y\}}(x) = W_{\{x,y\}}(y). \]

We have already discussed issues related to these equalities, and mentioned that Shafir (1993) presents data, in an accept versus reject design, that can be interpreted as showing that the second and third probabilities may be unequal.

An alternative way of writing (26) sets $u(z) = \log b(z)$, $z \in X$, giving, for $x, y \in X$, $x \ne y$,

\[ BW_X(x,y) = \frac{\exp[u(x) - u(y)] + \dfrac{1}{|X|-1}}{\sum_{r,s \in X,\, r \ne s} \left[ \exp[u(r) - u(s)] + \dfrac{1}{|X|-1} \right]}, \tag{27} \]

which is a biased form of the maximum-difference (maxdiff) best–worst choice model (see Section 4.1.2) that converges to the latter model for 'large' sets $X$.

Now we consider some estimation issues related to the representation (26), equivalently (27). An appropriate experimental design for testing this model, called a $2^j$ fractional factorial, ensures that each option, and each pair of distinct options, is presented equally often across the selected subsets of size $j$ of the master set $T$ (Finn and Louviere, 1992). We assume such a design, and thus we lose no generality by developing the results for a fixed set $X$, $X \subseteq T$, with $|X| = j$.

Suppose that we have a sample of $N$ best–worst choices from the set $X$. Denote these choices by $(x_i, y_i)$, $i = 1, \ldots, N$. These may be $N$ best–worst choices for a single individual, or one best–worst choice for each of $N$ individuals. For the samples $i = 1, \ldots, N$ and any $x, y \in X$, $x \ne y$, let

\[ \widehat{bw}_i(x,y) = \begin{cases} 1 & \text{if } x \text{ is best, } y \text{ is worst, for sample } i, \\ 0 & \text{otherwise,} \end{cases} \]

and for each $z \in X$, let

\[ \hat{b}(z) = \sum_{i=1}^{N} \sum_{y \in X-\{z\}} \widehat{bw}_i(z,y), \]
\[ \hat{w}(z) = \sum_{i=1}^{N} \sum_{x \in X-\{z\}} \widehat{bw}_i(x,z). \]

Note that the likelihood of obtaining the data $(x_i, y_i)$, $i = 1, \ldots, N$, is

\[ \prod_{i=1}^{N} \prod_{x,y \in X,\, x \ne y} [BW_X(x,y)]^{\widehat{bw}_i(x,y)} \]

and it is clear from (26), or equivalently from (27), that neither $\hat{b}(z)$ nor $\hat{w}(z)$, $z \in X$, nor both together, are sufficient statistics for this likelihood. However, we know that this model has consistent margins that satisfy the Luce (MNL) choice model. In fact, it is routine to show, with $E$ denoting expectation, that for each $x \in X$,

\[ E[\hat{b}(x)] = N \cdot B(X)\, b(x), \tag{28} \]

where

\[ B(X) = \frac{1}{\sum_{z \in X} b(z)}. \]

Thus, since $b$ is a ratio scale, the scores $\hat{b}(x)$, $x \in X$, or rather $\frac{1}{N}\hat{b}(x)$, give unbiased estimates of the scale values for $b(x)$, $x \in X$.

Similarly, with

\[ W(X) = \frac{1}{\sum_{z \in X} \dfrac{1}{b(z)}}, \]

we obtain, for each $x \in X$,

\[ E[\hat{w}(x)] = N \cdot W(X)\, \frac{1}{b(x)}, \tag{29} \]

and so the scores $\hat{w}(x)$, $x \in X$, or rather $\frac{1}{N}\hat{w}(x)$, give unbiased estimates of the scale values $1/b(x)$.

However, it would be preferable to estimate each scale value $b(x)$, $x \in X$, from some combination of the scores $\hat{b}(x)$ and $\hat{w}(x)$. Louviere, Burgess, Street, and Marley (2004) explore the properties of related estimation procedures for the model of the following Section 4.1.2, and it will be useful in the future to study whether or not it is possible to discriminate between the model just presented and that of the next section on the basis of data.

4.1.2. Case 2: Any choices after the first involve re-sampling

Now we consider the case where, if no decision is reached at the first stage, then the person re-samples for the best–worst choice pair from the set $X$ of available choice options. In order for such a resampling process to terminate, it is necessary that there are $r, s \in X$, $r \ne s$, with $B_X(r) \ne 0$ and $W_X(s) \ne 0$ (footnote 10), in which case $\sum_{r,s \in X,\, r \ne s} B_X(r)W_X(s) > 0$. The formula for the best–worst choice probabilities is then, with $x \ne y$,

\[ BW_X(x,y) = \sum_{k=0}^{\infty} \left[ \sum_{r \in X} B_X(r)W_X(r) \right]^k B_X(x)W_X(y) \]
\[ = B_X(x)W_X(y) \sum_{k=0}^{\infty} \left[ 1 - \sum_{r,s \in X,\, r \ne s} B_X(r)W_X(s) \right]^k \]
\[ = \frac{B_X(x)W_X(y)}{\sum_{r,s \in X,\, r \ne s} B_X(r)W_X(s)}. \]

Footnote 10. An equivalent condition is that there is no $r \in X$ with $B_X(r) = W_X(r) = 1$.

As discussed earlier, if the best and the worst choice probabilities each satisfy the Luce (MNL) choice model, then the above representation of the best–worst choice probabilities is a Luce (MNL) choice model representation with scale values $b(x)w(y)$; $x, y \in X$, $x \ne y$. As with Case 1, when we impose the constraint that for all $z \in X$, $b(z)w(z) = c$, i.e., $w(z) = c/b(z)$, an alternative version of this representation is given by: for $z \in X$, let $u(z) = \log b(z)$, then for $x, y \in X$, $x \ne y$,

\[ BW_X(x,y) = \frac{\exp[u(x) - u(y)]}{\sum_{r,s \in X,\, r \ne s} \exp[u(r) - u(s)]}, \tag{30} \]

a representation that is usually called maximum-difference (maxdiff).

When $X = \{x, y\}$, we have

\[ BW_{\{x,y\}}(x,y) = \frac{b(x)/b(y)}{b(x)/b(y) + b(y)/b(x)}, \]

which does not in general equal $b(x)/[b(x) + b(y)]$, which is the value of each of $B_{\{x,y\}}(x)$ and $W_{\{x,y\}}(y)$. As we have discussed already, we can replace the prediction of the maxdiff model for the case $|X| = 2$ with the representation

\[ BW_{\{x,y\}}(x,y) = \alpha B_{\{x,y\}}(x) + (1 - \alpha) W_{\{x,y\}}(y). \]

However, this is now a sequential process and there is no reason why it should not be applied for all set sizes, leading to a different model that is discussed in Section 4.2.

We have just seen that the unmodified model does not have consistent margins when $|X| = 2$, and the fact that

it does not have consistent margins in general can be seen by noting that for each $x, y \in X$,

\[ \sum_{z \in X-\{x\}} BW_X(x,z) = \frac{1}{\sum_{r,s \in X,\, r \ne s} b(r)/b(s)} \left( b(x) \sum_{z \in X} \frac{1}{b(z)} - 1 \right) \ne B_X(x) \]

and

\[ \sum_{z \in X-\{y\}} BW_X(z,y) = \frac{1}{\sum_{r,s \in X,\, r \ne s} b(r)/b(s)} \left( \frac{1}{b(y)} \sum_{z \in X} b(z) - 1 \right) \ne W_X(y). \]

Turning to estimation issues, and proceeding as for Case 1, it follows from the above formulae that neither $\hat{b}(x)$ nor $\hat{w}(x)$ nor, in fact, $\hat{b}(x) - \hat{w}(x)$, is an unbiased estimate of $b(x)$. The latter fact is of interest since, as we show now, $\hat{b}(x) - \hat{w}(x)$, $x \in X$, is a set of sufficient statistics for the scale values $b(x)$, $x \in X$.

Turning to the likelihood for the maxdiff model, and using the notation as before, we have that

\[ \prod_{i=1}^{N} \prod_{x,y \in X,\, x \ne y} [BW_X(x,y)]^{\widehat{bw}_i(x,y)} = \prod_{x,y \in X,\, x \ne y} \prod_{i=1}^{N} \left[ \frac{b(x)/b(y)}{\sum_{r,s \in X,\, r \ne s} b(r)/b(s)} \right]^{\widehat{bw}_i(x,y)} \]
\[ = \frac{1}{\left[ \sum_{r,s \in X,\, r \ne s} b(r)/b(s) \right]^N} \prod_{x,y \in X,\, x \ne y} \left( \frac{b(x)}{b(y)} \right)^{\sum_{i=1}^{N} \widehat{bw}_i(x,y)} \]
\[ = \frac{1}{\left[ \sum_{r,s \in X,\, r \ne s} b(r)/b(s) \right]^N} \prod_{x \in X} b(x)^{[\hat{b}(x) - \hat{w}(x)]} \tag{31} \]

and so the difference scores $\hat{b}(x) - \hat{w}(x)$, $x \in X$, are sufficient statistics for the scale values $b(x)$, $x \in X$. However, as pointed out above, they are biased estimates of the scale values. Again, we are exploring the properties of these sufficient statistics for data, and the extent of their bias (Louviere et al., 2004). Thus far, it appears that the biases are hard to detect even for choice sets with relatively few items, say $|X| > 3$. Also, we conjecture that the likelihood, (31), depends on the data through the difference scores $\hat{b}(x) - \hat{w}(x)$, $x \in X$, iff the maxdiff model holds; we have shown above that the "if" part of this conjecture holds.

4.2. Sequential best–worst choice processes

The following definition is based on the idea that a person, in choosing the best–worst pair from a set $X$, first decides in which order to choose the best and the worst option (best–worst order with probability $\alpha$, worst–best order with probability $1 - \alpha$), then proceeds to the actual choices. Note that the question format in the experiment will likely affect (though not necessarily completely determine) the value of $\alpha$.

Definition 13. A set of best, worst, and best–worst choice probabilities on a set $X$ satisfies a mixed sequential (best, worst, and) best–worst choice model iff there is a constant $\alpha$, $0 \le \alpha \le 1$, such that for all $x, y \in X$, $x \ne y$,

\[ BW_X(x,y) = \alpha B_X(x) W_{X-\{x\}}(y) + (1 - \alpha) W_X(y) B_{X-\{y\}}(x). \]

Marley (1968) introduced the following special case of the above class of models, though he did not interpret his model as being for best–worst choice. This model assumes that the elements of the best–worst pair are chosen sequentially, and that the probability that a particular best–worst pair is selected is unaffected by the order in which the best and worst criteria are applied.

Definition 14. A set of best, worst, and best–worst choice probabilities on a set $X$ satisfies a concordant (best, worst, and) best–worst choice model iff for all $x, y \in X$, $x \ne y$,

\[ BW_X(x,y) = B_X(x) W_{X-\{x\}}(y) = W_X(y) B_{X-\{y\}}(x). \tag{32} \]

Note that a concordant best–worst choice model has consistent margins, Definition 1, and thus so does any mixture of a concordant best–worst choice model. Such a mixture is also a mixed sequential best–worst choice model, Definition 13. It is an open question as to whether it is possible to characterize the class of mixed sequential best–worst choice models, Definition 13, that have consistent margins.

We need a technical condition, and various additional notation, to state a result from Marley (1968) that leads to a concordant best–worst choice model. First, for notational simplicity, given an arbitrary set $Y$, for distinct $x, y \in Y$, let $b(x,y) = B_{\{x,y\}}(x)$, $w(x,y) = W_{\{x,y\}}(y)$. A set of binary best and worst choice probabilities on a set $Y$ is transitive if for distinct $x, y, z \in Y$, $b(x,y) = b(y,z) = 1$ implies that $b(x,z) = 1$, and $w(x,y) = w(y,z) = 1$ implies that $w(x,z) = 1$. Now, as previously, let $R(Y)$ denote the set of rank orders of the set $Y$. When $|Y| = n$, $n \ge 2$, we can denote

$\rho = \rho_1\rho_2 \ldots \rho_{n-1}\rho_n \in R(Y)$ by $\rho = \rho_1\sigma$, where $\sigma = \rho_2\rho_3 \ldots \rho_{n-1}\rho_n \in R(Y - \{\rho_1\})$. Also, for any $\rho \in R(Y)$, let

\[ b_Y(\rho) = \prod_{1 \le i < j \le n} b(\rho_i, \rho_j), \]
\[ w_Y(\rho) = \prod_{1 \le i < j \le n} w(\rho_i, \rho_j). \tag{33} \]

Using this notation, we state a variant of Marley's (1968) result, and later in this section we give a process interpretation of his result.

Proposition 15. Let a transitive set of binary choice probabilities on a finite set $X$ be such that for all distinct $r, s \in X$, $b(r,s) = w(s,r)$. For $x, y \in Y$, $Y \subseteq X$, let

\[ B_Y(x) = \frac{\sum_{\sigma \in R(Y-\{x\})} b_Y(x\sigma)}{\sum_{\eta \in R(Y)} b_Y(\eta)}, \]
\[ W_Y(y) = \frac{\sum_{\sigma \in R(Y-\{y\})} w_Y(y\sigma)}{\sum_{\eta \in R(Y)} w_Y(\eta)}. \tag{34} \]

Then for distinct $x, y \in X$,

\[ B_X(x) W_{X-\{x\}}(y) = W_X(y) B_{X-\{y\}}(x). \]

Now define a set of best–worst choice probabilities by, for distinct $x, y \in X$,

\[ BW_X(x,y) = B_X(x) W_{X-\{x\}}(y) = W_X(y) B_{X-\{y\}}(x), \tag{35} \]

where the best and worst choice probabilities are as given in Proposition 15. Then by that proposition we have an example of a concordant best–worst choice model, Definition 14. In fact, Marley (1968) shows that, under the conditions of Proposition 15, this is the unique representation of a concordant best–worst choice model. However, note that this is really a "class" of models as it depends on the assumed representations of the binary choice probabilities.

Marley (1968) shows also that, when the binary choice probabilities are transitive and for all distinct $r, s \in X$, $b(r,s) = w(s,r)$, the above concordant best–worst choice model is the only one that is compatible with a structure of "best to worst" and "worst to best" ranking probabilities that he called a reversible ranking model (Marley, 1968, Theorem 8). However, we see next that a concordant best–worst choice model, Definition 14, is not in general compatible with a consistent random ranking model, Definition 3.

Proposition 16. Assume that a complete set of best, worst and best–worst choice probabilities on a three-element set $T$ satisfies a concordant best–worst choice model, Definition 14. Then the following are equivalent.

(i) The complete set of best, worst and best–worst choice probabilities satisfies a consistent random ranking model, Definition 3.
(ii) For each $x, y \in X \subseteq T$, $|X| \ge 2$, $x \ne y$, we have $B_X(x) = W_X(x) = \frac{1}{|X|}$ and $BW_X(x,y) = \frac{1}{|X|} \cdot \frac{1}{|X|-1}$.

Proof. Clearly, when (ii) holds, the following consistent random ranking model holds: for each $\rho = \rho_1\rho_2\rho_3 \in R(T)$, set $p(\rho) = \frac{1}{|T|} \cdot \frac{1}{|T|-1}$. So it remains to show that (i) implies (ii).

From (i), we have that for each $x, y \in X \subseteq T$, $|X| \ge 2$, $x \ne y$,

\[ BW_{\{x,y\}}(x,y) = B_{\{x,y\}}(x) = W_{\{x,y\}}(y), \]

which with (i) and the assumption that the complete set of best, worst and best–worst choice probabilities on the three-element set $T$ satisfies a concordant best–worst choice model, Definition 14, gives that for $\rho = \rho_1\rho_2\rho_3 \in R(T)$,

\[ p(\rho_1\rho_2\rho_3) = BW_T(\rho_1, \rho_3) = B_T(\rho_1) W_{\{\rho_2,\rho_3\}}(\rho_3) = B_T(\rho_1) B_{\{\rho_2,\rho_3\}}(\rho_2), \tag{36} \]

and

\[ p(\rho_1\rho_2\rho_3) = BW_T(\rho_1, \rho_3) = W_T(\rho_3) B_{\{\rho_1,\rho_2\}}(\rho_1) = W_T(\rho_3) W_{\{\rho_1,\rho_2\}}(\rho_2), \tag{37} \]

i.e.,

\[ p(\rho_1\rho_2\rho_3) = B_T(\rho_1) B_{\{\rho_2,\rho_3\}}(\rho_2) = W_T(\rho_3) W_{\{\rho_1,\rho_2\}}(\rho_2). \tag{38} \]

However, (i) with (38) implies that both the best and the worst choice probabilities satisfy Luce's choice model (Theorem 50, Luce and Suppes, 1965) and then, using (38) again, we obtain that, for each $x, y \in X \subseteq T$, $|X| \ge 2$, $x \ne y$, we have $B_X(x) = W_X(x) = \frac{1}{|X|}$ (Yellott, 1980). It then follows from the fact that the best, worst and best–worst choice probabilities on $T$ satisfy a concordant best–worst choice model, Definition 14, that $BW_X(x,y) = \frac{1}{|X|} \cdot \frac{1}{|X|-1}$. Combining these results, we have that (ii) holds. $\square$

Returning to Proposition 15, note that, using (33), the best and worst probabilities in (34) can be rewritten in the form

\[ B_X(x) = \frac{\prod_{z \in X-\{x\}} b(x,z) \sum_{\sigma \in R(X-\{x\})} b_{X-\{x\}}(\sigma)}{\sum_{r \in X} \prod_{z \in X-\{r\}} b(r,z) \sum_{\sigma \in R(X-\{r\})} b_{X-\{r\}}(\sigma)}, \]
\[ W_X(y) = \frac{\prod_{z \in X-\{y\}} w(y,z) \sum_{\sigma \in R(X-\{y\})} w_{X-\{y\}}(\sigma)}{\sum_{r \in X} \prod_{z \in X-\{r\}} w(r,z) \sum_{\sigma \in R(X-\{r\})} w_{X-\{r\}}(\sigma)}, \tag{39} \]

which can be given the following process interpretation: the person makes all possible paired comparisons according to the best paired comparison probabilities $b$. If the options end up as rank ordered by this process, then the person selects as best the (necessarily unique) option that beats every other option in these comparisons, that is, the option that is first (best) in the resulting rank order; otherwise the process starts over.
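The equality asserted in Proposition 15 can be illustrated by brute-force enumeration of rank orders. The sketch below is only an illustration under stated assumptions: the helper names are hypothetical, the binary best probabilities are assumed to follow Luce's choice model with arbitrary scale values, and the binary worst probabilities are defined through the condition $b(r,s) = w(s,r)$ of the proposition. It evaluates (33) and (34) directly and compares the two products that appear in (35).

```python
from itertools import permutations
from math import isclose, prod


def order_weight(order, pair):
    """b_Y(rho) or w_Y(rho) of (33): product of pair(rho_i, rho_j) over i < j."""
    n = len(order)
    return prod(pair(order[i], order[j])
                for i in range(n) for j in range(i + 1, n))


def head_probability(option, options, pair):
    """(34): normalized total weight of the rank orders that start with `option`."""
    orders = list(permutations(options))
    total = sum(order_weight(o, pair) for o in orders)
    head = sum(order_weight(o, pair) for o in orders if o[0] == option)
    return head / total


scale = {"x": 1.0, "y": 2.0, "z": 3.0, "w": 4.0}  # hypothetical scale values


def b_pair(r, s):
    """Binary best probability b(r, s), assumed here to be of Luce form."""
    return scale[r] / (scale[r] + scale[s])


def w_pair(r, s):
    """Binary worst probability, defined by the condition b(r, s) = w(s, r)."""
    return b_pair(s, r)


def without(options, z):
    """The set `options` with the element z removed."""
    return tuple(o for o in options if o != z)


X = tuple(scale)
pairs = [(x, y) for x in X for y in X if x != y]

# Left and right products of (35): best chosen first, or worst chosen first.
best_first = {(x, y): head_probability(x, X, b_pair)
              * head_probability(y, without(X, x), w_pair) for x, y in pairs}
worst_first = {(x, y): head_probability(y, X, w_pair)
               * head_probability(x, without(X, y), b_pair) for x, y in pairs}

concordant = all(isclose(best_first[p], worst_first[p]) for p in pairs)
print(concordant, sum(best_first.values()))
```

In this sketch both products equal the normalized weight of the rank orders that place $x$ first and $y$ last, which is why the order in which the best and worst criteria are applied does not matter.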

The process for worst choices is parallel, with the best binary choice probabilities $b$ replaced by the worst binary choice probabilities $w$.

It is interesting to compare the above process with a related process model for best choices (with a parallel interpretation for worst choices): the person makes all possible paired comparisons according to the best paired comparison probabilities $b$; if some (necessarily unique) option beats every other option in these comparisons, then that option is selected as best; otherwise the process starts over (footnote 11). Note that, in this case, in contrast to the previous process, we do not require the results of the binary choices to be consistent with a rank order, only that they are consistent with the existence of a best option. This process, with a parallel one for worst choices, gives the following representation for the best and the worst choice probabilities:

\[ B_X(x) = \frac{\prod_{z \in X-\{x\}} b(x,z)}{\sum_{r \in X} \prod_{s \in X-\{r\}} b(r,s)}, \]
\[ W_X(y) = \frac{\prod_{z \in X-\{y\}} w(y,z)}{\sum_{r \in X} \prod_{s \in X-\{r\}} w(r,s)}. \]

Footnote 11. The following is an alternate description of the process. Some element $x \in X$ is chosen at random and compared pairwise with every other element of $X$, with the best option in each comparison being determined by the paired comparison probabilities $b$. If $x$ wins every comparison, it is selected as best, otherwise the process starts over.

By comparing these equations with those of (39), it is clear that the two processes give different representations when $|X| > 3$. It is routine, though tedious, to show also that the above representation does not satisfy a concordant best–worst choice model, Definition 14, when $|X| > 3$; one simple counter-example takes the set $X = \{x, y, z, w\}$ with the binary choice probabilities satisfying Luce's choice model with $x, y, z, w$ having scale values $1, 2, 3, 4$, respectively.

These are only two of a vast array of possible best–worst choice models based on binary comparisons. There are many alternate ways to combine a sequence of (probabilistic) best and worst choices that will lead to a final best–worst pair. Also, the best (respectively, worst) choices in a mixed sequential choice process can be assumed to satisfy any "standard" discrete choice model, not necessarily one motivated by a sequence of binary comparisons. Of course, for parsimony, we would expect some links between the best and worst choice processes, such as those that arise as a result of assuming a consistent extreme value model, Definition 8. Given the vast diversity of possible models, we leave their further systematic study for the future, at which time we expect to have data with which to challenge them.

5. Ratio scale models

We now return to the maxdiff model, i.e., the example in Section 1, and use it to motivate a theoretical question concerning when a set of best, worst and best–worst choice probabilities are "of the same form" with the best–worst choice probabilities determined by some "functions" of the best and worst choice probabilities. Unfortunately, the notation required for the general formulation is complex, so we use the earlier example to illustrate the ideas, and include details in the Appendix. We concentrate on a fixed choice set $X$, though the ideas can be extended to cover all subsets $X$, $X \subseteq T$, of a master set $T$.

We begin with the restatement of the formulae of the example: for $x, y \in X$,

\[ B_X(x) = \frac{b(x)}{\sum_{r \in X} b(r)}, \qquad W_X(y) = \frac{w(y)}{\sum_{s \in X} w(s)} \tag{40} \]

and

\[ BW_X(x,y) = \frac{B_X(x) W_X(y)}{\sum_{r,s \in X,\, r \ne s} B_X(r) W_X(s)} \quad (x \ne y). \tag{41} \]

Then direct substitution of (40) in (41) yields: for $x, y \in X$, $x \ne y$,

\[ BW_X(x,y) = \frac{b(x) w(y)}{\sum_{r,s \in X,\, r \ne s} b(r) w(s)}. \tag{42} \]

Thus, combining (41) with (42), we have

\[ BW_X(x,y) = \frac{B_X(x) W_X(y)}{\sum_{r,s \in X,\, r \ne s} B_X(r) W_X(s)} = \frac{b(x) w(y)}{\sum_{r,s \in X,\, r \ne s} b(r) w(s)}, \tag{43} \]

that is, the best–worst choice probabilities are functions of the best and worst choice probabilities, as well as functions of the ratio scale values that determine those choice probabilities, and, in fact, the two functional forms have identical structure. We wish to know what other functional forms, if any, have parallel properties. The full notation required to formulate, and the techniques to solve, this question are complex (see the Appendix). Nonetheless, based on previous work on closely related aggregation problems (Aczel et al., 1997; Aczel et al., 2000), we conjecture that the representations given by (43) are the only ones that satisfy the full set of constraints, which include the important requirements that the two functional forms have identical structure and that $b$ and $w$ are independent ratio scales. If we generalize the formulation by allowing the two functional forms to have somewhat different structures, then, again based on the previous work, we conjecture

that the class of solutions is larger, but still involves relatively simple functions of the best and worst choice probabilities (viz., the best and worst ratio scales). Finally, as in the maxdiff model, we need to consider the cases, where for each $z \in X$,

\[ b(z) = \frac{1}{w(z)}. \]

Nothing similar to this case has been studied in the previous work, and it raises several complexities in generalizing the earlier results to the present situation.

6. Summary and conclusions

We derived and discussed theoretical results for a number of best, worst and best–worst choice models, with the focus on the latter. Our results include a number of interesting theoretical relationships between these types of models, which in turn suggest a variety of tests to determine which model is most consistent with choice data. For example, the maximum-difference model is of the well-known Luce (1959), equivalently Multinomial Logit (McFadden, 1974), form with ratios of scale values, and has difference scores for best versus worst choices that are sufficient statistics for the parameters of the model, with the separate best and worst scores having biases that decrease with increasing choice set size. It will be interesting and important to undertake empirical studies for different set sizes, which can be achieved by constructing the sets using balanced incomplete block designs (see, e.g., Street and Street, 1987). One can then compare theoretically correct estimates with the obtained best, worst, and best minus worst scores to learn how much the biases actually matter in real applications.

General classes of best–worst choice models have not been studied previously in a systematic way, and, as far as we know, our results constitute the first formal presentation of the properties of some of these models. The results suggest that best–worst choice tasks and the associated models are both theoretically and empirically interesting, with the potential to provide important insights into preference and choice processes. We noted a number of open problems, the most pressing of which demand the axiomatization of important best–worst choice models, such as the best–worst ranking model, the maximum-difference model, and the concordant best–worst choice model, each without reference to representations of best or worst choice probabilities. It is also important to characterize the class of ratio scale models of best, worst, and best–worst choice probabilities where all three sets of choice probabilities are "of the same form." Finally, as already indicated, there are interesting practical questions regarding the usefulness of "simple" (sufficient) statistics in analyzing and summarizing the data obtained using best–worst choice in discrete choice designs.

Acknowledgments

This research has been supported by Australian Research Council Discovery Grant DP034343632 to the University of Technology, Sydney, for Louviere, Street and Marley, and Natural Science and Engineering Research Council Discovery Grant 8124-98 to the University of Victoria for Marley. It was begun while Marley was a Visiting Researcher at Systems, Organizations and Management of the University of Groningen and supported by the Netherlands' Organization for Scientific Research for the period July 1, 2003, to June 30, 2004. We appreciate useful feedback from Tom Wickens and J. C. Falmagne on earlier versions of this work, and the detailed evaluation by two referees (one anonymous, the other Reinhard Suck) on the original version of the paper.

Appendix A. The functional equations associated with ratio scale models of best, worst, and best–worst choice probabilities

We use a slightly different notation than previously for the options and the choice probabilities, namely we let $X = \{x_1, \ldots, x_n\}$, and let $B_X(x_i)$, $W_X(x_j)$, $BW_X(x_i, x_j)$, respectively, $i \ne j$, $i, j \in \{1, \ldots, n\}$, denote the best, worst, best–worst, respectively, choice probabilities. Thus, as before, we have the constraints

\[ 0 \le B_X(x_i),\ W_X(x_j),\ BW_X(x_i, x_j) \le 1 \quad (i, j = 1, \ldots, n)\ (i \ne j) \]

and

\[ \sum_{i=1}^{n} B_X(x_i) = \sum_{j=1}^{n} W_X(x_j) = \sum_{x_i, x_j \in X,\, i \ne j} BW_X(x_i, x_j) = 1. \]

Also, $b$ and $w$ will denote ratio scales defined over possible options, and so their values are nonnegative real numbers. For mathematical simplicity we eliminate options that have scale values of zero.

We now assume that there are functions $B_i$ and $W_j$ $(i, j = 1, \ldots, n)$, $BW_{ij}$, $H_{i,j}$ and $K_{i,j}$ $(i, j = 1, \ldots, n)$ $(i \ne j)$, such that

\[ B_X(x_i) = B_i(b(x_1), \ldots, b(x_n)) \quad (i = 1, \ldots, n), \tag{44} \]
\[ W_X(x_j) = W_j(w(x_1), \ldots, w(x_n)) \quad (j = 1, \ldots, n). \tag{45} \]

We need a matrix notation in order to formulate the form of the functional equations that we need to solve. Let $\mathbb{R}_{++} = \,]0, \infty[$. For any pair of $n$-component vectors $r = (r_1, \ldots, r_n)$, $s = (s_1, \ldots, s_n)$ with $r_i, s_i \in \mathbb{R}_{++}$,

and a function $\Omega : \mathbb{R}_{++} \times \mathbb{R}_{++} \to \mathbb{R}_{++}$, define the matrix $\|\Omega(r_k, s_l)\|$ by

\[
\|\Omega(r_k, s_l)\| =
\begin{bmatrix}
0 & \Omega(r_1, s_2) & \cdots & \Omega(r_1, s_{n-1}) & \Omega(r_1, s_n) \\
\Omega(r_2, s_1) & 0 & \cdots & \Omega(r_2, s_{n-1}) & \Omega(r_2, s_n) \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
\Omega(r_{n-1}, s_1) & \Omega(r_{n-1}, s_2) & \cdots & 0 & \Omega(r_{n-1}, s_n) \\
\Omega(r_n, s_1) & \Omega(r_n, s_2) & \cdots & \Omega(r_n, s_{n-1}) & 0
\end{bmatrix}.
\]

Thus, $\|\Omega(r_k, s_l)\|$ is simply the matrix of pairs $\Omega(r_k, s_l)$, $k, l \in \{1, \ldots, n\}$, $k \neq l$, with the diagonal elements being 0.

With this notation, generalizing the example in the text, we assume that the functional relation for $BW_X(x_i, x_j)$ is: there are functions $F$ and $G$, such that for $i, j = 1, \ldots, n$, $i \neq j$,

\[
BW_X(x_i, x_j) = H_{i,j}[\|F(B_X(x_k), W_X(x_l))\|] = K_{i,j}[\|G(b(x_k), w(x_l))\|]. \tag{46}
\]

Notice that if we substitute (44) and (45) into (46), we obtain a set of functional equations involving the scale values $b$ and $w$, and the general functions $B_j$, $W_j$, $H_{i,j}$, $K_{i,j}$, $F$, and $G$. The additional constraints that follow from the fact that $b$ and $w$ are distinct ratio scales are that we have for $\mu, \nu$ in $\mathbb{R}_{++}$, $j = 1, \ldots, n$,

\[
B_j[\mu b(x_1), \ldots, \mu b(x_n)] = B_j[b(x_1), b(x_2), \ldots, b(x_n)], \tag{47}
\]

\[
W_j[\nu w(x_1), \ldots, \nu w(x_n)] = W_j[w(x_1), \ldots, w(x_n)], \tag{48}
\]

and

\[
G([\|\mu b(x_k), \nu w(x_l)\|]) = M(\mu, \nu) \cdot G([\|b(x_k), w(x_l)\|]). \tag{49}
\]

Clearly, the model that we introduced in the text, namely (40)–(43), is the special case of (46) where, for $j = 1, \ldots, n$,

\[
B_X(x_j) = B_j[b(x_1), b(x_2), \ldots, b(x_n)] = \frac{b(x_j)}{\sum_{i=1}^{n} b(x_i)},
\]

\[
W_X(x_j) = W_j[w(x_1), w(x_2), \ldots, w(x_n)] = \frac{w(x_j)}{\sum_{i=1}^{n} w(x_i)},
\]

and where $F(u, v) = uv$, $G(r, s) = rs$, and for $i, j = 1, \ldots, n$, $i \neq j$,

\[
H_{i,j}[\|F(B_X(x_k), W_X(x_l))\|]
= \frac{F(B_X(x_i), W_X(x_j))}{\sum_{k=1}^{n} \sum_{\substack{l=1 \\ l \neq k}}^{n} F(B_X(x_k), W_X(x_l))}
= \frac{B_X(x_i)\, W_X(x_j)}{\sum_{k=1}^{n} \sum_{\substack{l=1 \\ l \neq k}}^{n} B_X(x_k)\, W_X(x_l)}
\]

and

\[
K_{i,j}[\|G(b(x_k), w(x_l))\|]
= \frac{G(b(x_i), w(x_j))}{\sum_{k=1}^{n} \sum_{\substack{l=1 \\ l \neq k}}^{n} G(b(x_k), w(x_l))}
= \frac{b(x_i)\, w(x_j)}{\sum_{k=1}^{n} \sum_{\substack{l=1 \\ l \neq k}}^{n} b(x_k)\, w(x_l)}
\qquad (i, j = 1, \ldots, n)\ (i \neq j).
\]

Thus, the problem before us is, given reasonable technical assumptions, find all solutions of the set of functional equations (44)–(49).

References

Aczel, J., Maksa, G., Marley, A. A. J., & Moszner, Z. (1997). Consistent aggregation of scale families of selection probabilities. Mathematical Social Sciences, 33, 227–250.
Aczel, J., Gilanyi, G., Maksa, G., & Marley, A. A. J. (2000). Consistent aggregation of simply scalable families of choice probabilities. Mathematical Social Sciences, 39, 241–262.
Agresti, A. (2002). Categorical data analysis. Hoboken, NJ: Wiley.
Barbera, S., & Pattanaik, P. K. (1986). Falmagne and the rationalizability of stochastic choices in terms of random orderings. Econometrica, 54, 707–715.
Cohen, S. (2003). Maximum difference scaling: Improved measures of importance and preference for segmentation. Sawtooth software conference proceedings, Sawtooth Software, Inc., 530 W. Fir St., Sequim, WA (www.sawtoothsoftware.com), (pp. 61–74).

Cohen, S., & Neira, L. (2003). Measuring preference for product benefits across countries: Overcoming scale usage bias with maximum difference scaling. Paper presented at the Latin American conference of the European society for opinion and marketing research, Punta del Este, Uruguay (pp. 1–22). Reprinted in Excellence in International Research: 2004. ESOMAR, Amsterdam, Netherlands, 2004.
Critchlow, D. E., Fligner, M. A., & Verducci, J. S. (1991). Probability models on rankings. Journal of Mathematical Psychology, 35, 294–318.
Falmagne, J. C. (1978). A representation theorem for finite random scale systems. Journal of Mathematical Psychology, 18, 52–72.
Finn, A., & Louviere, J. J. (1992). Determining the appropriate response to evidence of public concern: The case of food safety. Journal of Public Policy and Marketing, 11(1), 12–25.
Fiorini, S. (2004). A short proof of a theorem of Falmagne. Journal of Mathematical Psychology, 48, 80–82.
Fishburn, P. (1994). On “choice” probabilities derived from ranking distributions. Journal of Mathematical Psychology, 38, 274–285.
Fishburn, P. C. (2002). Stochastic utility. In S. Barbera, P. J. Hammond, & C. Seidl (Eds.). Handbook of utility theory, Vol. I, pp. 275–319. Dordrecht, Boston, London: Kluwer Academic Publishers.
Louviere, J. J., Burgess, L., Street, D., & Marley, A. A. J. (2004). Modelling the choices of single individuals by combining efficient choice experiment designs with extra preference information. Working paper no. 04-003, Centre for the Study of Choice, University of Technology, Sydney.
Louviere, J. J., Hensher, D. A., & Swait, J. D. (2000). Stated choice models. Cambridge: Cambridge University Press.
Luce, R. D. (1959). Individual choice behavior. New York, NY: Wiley.
Luce, R. D., & Suppes, P. (1965). Preference, utility, and subjective probability. In R. D. Luce, R. R. Bush, & E. Galanter (Eds.). Handbook of mathematical psychology (Vol III, pp. 235–406). New York, NY: Wiley.
Marley, A. A. J. (1968). Some probabilistic models of simple choice and ranking. Journal of Mathematical Psychology, 5, 311–332.
McFadden, D. (1974). Conditional logit analysis of qualitative choice behavior. In P. Zarembka (Ed.), Econometrics (pp. 105–142). New York, NY: Academic Press.
Shafir, E. (1993). Choosing versus rejecting: Why some options are both better and worse. Memory and Cognition, 21, 546–556.
Street, A. P., & Street, D. J. (1987). Combinatorics of experimental design. New York: Oxford University Press.
Yellott, J. I., Jr. (1977). The relationship between Luce’s choice axiom, Thurstone’s theory of comparative judgment, and the double exponential distribution. Journal of Mathematical Psychology, 15, 109–144.
Yellott, J. I., Jr. (1980). Generalized Thurstone models for ranking: Equivalence and reversibility. Journal of Mathematical Psychology, 22, 48–69.
Yellott, J. I., Jr. (1997). Preference models and reversibility. In A. A. J. Marley (Ed.), Choice, decision, and measurement: Essays in honor of R. Duncan Luce (pp. 131–151). Mahwah, NJ: Lawrence Erlbaum Associates.
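
As a numerical companion to the special case stated at the end of Appendix A, the following minimal Python sketch may help readers check the ratio-scale properties by hand. It is an illustration under stated assumptions, not part of the original derivation: the scale values `b` and `w`, the multipliers `mu` and `nu`, and the function names are all hypothetical. The sketch computes the best and worst probabilities as normalized scale values, forms the best–worst probabilities as normalized off-diagonal products, and checks that the $H_{i,j}$ form (built from $B_X$ and $W_X$) and the $K_{i,j}$ form (built from $b$ and $w$) agree and are unchanged when $b$ and $w$ are rescaled separately.

```python
from itertools import permutations


def normalized(values):
    """Ratio-scale choice probabilities: v_j / sum_i v_i."""
    total = sum(values)
    return [v / total for v in values]


def best_worst_probs(u, v):
    """Normalized off-diagonal products u_i * v_j / sum_{k != l} u_k * v_l."""
    n = len(u)
    pairs = list(permutations(range(n), 2))  # all ordered pairs (i, j), i != j
    denom = sum(u[k] * v[l] for k, l in pairs)
    return {(i, j): u[i] * v[j] / denom for i, j in pairs}


# Hypothetical positive scale values for three options (illustration only).
b = [1.0, 2.0, 4.0]
w = [3.0, 1.5, 0.5]

B = normalized(b)  # best probabilities B_X(x_j)
W = normalized(w)  # worst probabilities W_X(x_j)

bw_from_scales = best_worst_probs(b, w)  # K_{i,j} form with G(r, s) = r * s
bw_from_probs = best_worst_probs(B, W)   # H_{i,j} form with F(u, v) = u * v

# Rescale b and w by separate positive constants (hypothetical values).
mu, nu = 2.5, 0.4
B_rescaled = normalized([mu * x for x in b])
W_rescaled = normalized([nu * x for x in w])
bw_rescaled = best_worst_probs([mu * x for x in b], [nu * x for x in w])

for pair in sorted(bw_from_scales):
    print(pair, round(bw_from_scales[pair], 6))
```

The two forms coincide because $B_X(x_i) W_X(x_j)$ differs from $b(x_i) w(x_j)$ only by the common factor $1 / (\sum_k b(x_k) \sum_l w(x_l))$, which cancels in the ratio; the same cancellation removes the factor $\mu\nu$ after rescaling.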
