Exploiting Rank Ordered Choice Set Data Within the Stochastic Utility Model
Richard Staelin
Duke University
American Marketing Association is collaborating with JSTOR to digitize, preserve and extend access to Journal of Marketing
Research.
This content downloaded from 152.3.153.137 on Fri, 20 Mar 2015 15:33:01 UTC
All use subject to JSTOR Terms and Conditions
RANDALL G. CHAPMAN and RICHARD STAELIN*
STOCHASTIC
UTILITY
MODEL 289
[...] the stochastic utility model of choice behavior are briefly reviewed. The principle of decomposing rank ordered choice sets into a series of statistically independent unranked choice sets is explained. Strategies and techniques for coping with "noisy" and possibly unreliable rank order information are then considered. The results of a Monte Carlo study designed to investigate the small sample properties of the parameter estimates of the conditional logit estimation procedure (the maximum likelihood estimation procedure used to develop parameter estimates of the multinomial logit model formulation of the stochastic utility model) are reported and interpreted. Particular attention is focused on the incremental contribution of using the additional information contained in the preference rank ordering rather than just employing knowledge of the chosen choice set alternative. We conclude with an empirical application and a discussion of the implications of these results for marketing researchers who use rank ordered choice set data to estimate the parameters of the stochastic utility model to draw inferences about choice behavior.

THE STOCHASTIC UTILITY MODEL

The general nature of the choice behavior being modeled with the stochastic utility model, and the nature of the data (assumed) available to the empirical marketing researcher, may be described as follows. Each consumer decision maker $i$ ($i = 1,2,\ldots,I$) has a choice set $C_i$ consisting of $J_i$ alternatives ($1 < J_i < \infty$). The choice set alternatives are assumed to be characterized by $N$ quantifiable attributes. Each decision maker is observed to choose an alternative from his or her choice set. The decision makers are assumed to be utility maximizers (i.e., rational) whose choices represent their most preferred alternatives at the time of choice. Also, because the data used typically are cross-sectional, the sample of decision makers is assumed to have homogeneous tastes and preferences in terms of the relative importance of the attributes characterizing the alternatives.

Let $x_j$ denote a vector of relevant attributes of alternative $j$, $d_i$ denote a vector of individual decision maker demographic attributes, and $y_{ij}$ denote a vector of interactive variables relating decision maker $i$ to alternative $j$. It is assumed that a utility function $U$ exists which measures the unobserved desirability or attractiveness of an alternative with attribute vector $x_j$ to a decision maker with demographic vector $d_i$ and associated decision maker/alternative vector $y_{ij}$.

(1) $U_{ij} = U(x_j, d_i, y_{ij})$

Measurement error is typical in the modeling process because the $x$, $d$, and $y$ vectors generally do not capture all of the factors influencing the choice process, the correct functional form for the model may not be specified, [...] the systematic component of the model, and an error term $\varepsilon_{ij}$ which captures the measurement errors in the modeling process. If one assumes these two components are independent and additive, the model in equation 1 may be written in the form

(2) $U_{ij} = V_{ij} + \varepsilon_{ij}$

where $V_{ij} = V(x_j, d_i, y_{ij})$. The presence of the stochastic error term in equation 2 leads to this model being described as a stochastic utility model.

Suppose that individual $i$ is observed to choose alternative $j^*$ from $C_i$. If rational choice behavior is assumed, revealed preference implies that $U_{ij^*} \ge U_{ij}$ (for $j = 1,2,\ldots,J_i$). Because the utility function is partly stochastic, the probability of this event occurring may be written as

(3) $P_{ij^*} = \mathrm{Prob}(U_{ij^*} \ge U_{ij},\ j = 1,2,\ldots,J_i) = \mathrm{Prob}(\varepsilon_{ij} - \varepsilon_{ij^*} \le V_{ij^*} - V_{ij},\ j = 1,2,\ldots,J_i)$

where $P_{ij^*}$ is the probability that decision maker $i$ chooses alternative $j^*$. Further development and simplification of equation 3 require that a joint distribution function be specified for the error terms. In principle, any joint distribution function could be used and an expression for the choice probabilities could be developed. Unfortunately, the choice of most distributions, including the usual normal distribution assumption for error terms in statistical models, necessitates the calculation of a formidable series of numerical integrations to determine explicitly the choice probabilities. However, if the stochastic error terms are assumed to be identically and independently distributed (IID) according to the double exponential distribution, such that

(4) $\mathrm{Prob}(\varepsilon_{ij} \le t) = \exp[-\exp(-t)]$,

one can show that the choice probabilities have the following form (cf. McFadden 1974).

(5) $P_{ij^*} = \dfrac{\exp(V_{ij^*})}{\sum_{j=1}^{J_i} \exp(V_{ij})}$ for $j^* = 1,2,\ldots,J_i$.

The value of the double exponential distribution assumption is that a tractable closed-form expression results for the choice probabilities. This particular parametric form of the stochastic utility model is often called the multinomial logit model because it is the multiple choice generalization of the binary logit model.

To operationalize the choice probability expression in equation 5, the functional form of the deterministic component of the stochastic utility model must be specified. A linear-in-parameters specification assumption would lead to

(6) $V_{ij} = \sum_{n=1}^{N} \theta_n Z_{ijn}$
JOURNAL OF MARKETING RESEARCH, AUGUST 1982
[...] alternative $j$ to decision maker $i$ and $Z_{ijn} = Z_{ijn}(x_j, d_i, y_{ij})$, and $\theta_n$ is the relative importance of attribute $n$ to the sample of decision makers. The $\theta$ values in equation 6 are the parameters of the stochastic utility model that must be estimated from the available sample choice set data.

One particular feature of the stochastic utility model with double exponentially distributed error terms should be noted. Because the variance of the double exponential distribution is a known fixed constant, not dependent on the estimated $\theta$ values, the magnitude of the estimated $\theta$ values directly influences the "percent of variance explained" by the systematic component of the utility function. As shown in the Appendix, the quantity $\omega^2 = \sum_{n=1}^{N} \theta_n^2$ is a measure of the "size" of the systematic component. As $\omega^2 \to \infty$, the systematic component dominates the error component and the choice probabilities approach unity for the alternatives in the choice sets with greatest utility, with all other choice probabilities approaching zero. Conversely, as $\omega^2 \to 0$, the error component in the stochastic utility model dominates the systematic component and each choice set alternative has approximately equal probability of being chosen. Other details of these scale effects are given in the Appendix and in subsequent discussion.

Values of the parameters of the multinomial logit model may be estimated by maximizing the likelihood function associated with the probabilistic choice model in equation 5. Standard software packages (cf. Manski 1974) are available to calculate the maximum likelihood estimates. Because maximum likelihood estimates are, in general, consistent and asymptotically normally distributed, approximate large sample confidence bounds on parameter estimates may be constructed and hypotheses may be tested in standard ways.

EXTENDING THE MODEL FOR RANK ORDERED DATA

The estimation of the parameters of the stochastic utility model requires the availability of the following data from a representative sample of decision makers from the population of interest: (1) the alternatives in each decision maker's choice set; (2) the actual alternative chosen (i.e., preferred) by each decision maker; and (3) the numerical value of each quantifiable attribute associated with the choice set alternatives (i.e., the $Z_{ij}$ values). The model operates on the principle of revealed preference: the alternative actually chosen by a decision maker is assumed to be preferred to all other alternatives in the decision maker's choice set. The basic parameter estimation methodology can be extended if the researcher has available (or could conveniently gather along with the other required data) a complete rank ordering of all of the alternatives in the decision makers' choice sets. To exploit the information content of a preference rank ordering of choice set alternatives, one must relate ranking behavior to choice behavior. The theoretical justification for relating ranking behavior to choice behavior, within the class of choice models of which the stochastic utility model is a member, is provided by a proof reported by Luce and Suppes (1965, p. 354-6). Although the Luce and Suppes proof is for constant utility models (i.e., choice models in which each alternative has a fixed utility value and the probability of choosing one alternative over another is a function of the distance between their utilities), both the constant and stochastic (random) utility models have an econometric specification similar to that of equation 5 (cf. Luce and Suppes 1965, p. 332-9). Consequently, the Luce and Suppes proof relating ranking behavior to choice behavior can be extended to the latter model.

The Luce and Suppes Ranking Choice Theorem states that for any rank ordered preference set which has been derived from a constant utility model,

(7) $\Pr(a,b,c,\ldots) = \Pr(a \mid C) \cdot \Pr(b,c,\ldots)$

where $\Pr(a,b,c,\ldots)$ is the probability of observing the rank order of alternative $a$ being preferred to alternative $b$ being preferred to alternative $c$, and so on, and $\Pr(a \mid C)$ is the probability of alternative $a$ being chosen from the set of alternatives $C = \{a,b,c,\ldots\}$. This Ranking Choice Theorem enables the probability of a ranking event, $\Pr(a,b,c,\ldots)$, to be decomposed into the product of two probabilities: the probability of a choice event, $\Pr(a \mid C)$, and the probability of a subranking event, $\Pr(b,c,\ldots)$. By successively applying this Ranking Choice Theorem to the subranking events, one can derive a probability expression for the ranking event which is the product of the probabilities of $J - 1$ choice events, i.e.,

(8) $\Pr(a,b,c,\ldots) = \Pr(a \mid C) \cdot \Pr(b \mid C - \{a\}) \cdot \Pr(c \mid C - \{a,b\}) \cdots$

where $C - \{a\}$ is the set of alternatives excluding alternative $a$. Equation 8 is equivalent to saying that the probability of the joint ranking event of $J$ alternatives is composed of $J - 1$ statistically independent choice events.

If one applies the Ranking Choice Theorem to the stochastic utility model, assuming that the alternative index $j$ is now interpreted as a serial preference index, it follows that:

(9) $\mathrm{Prob}(U_{i1} \ge U_{i2} \ge \cdots \ge U_{iJ_i}) = \prod_{j^*=1}^{J_i} \mathrm{Prob}(U_{ij^*} \ge U_{ij},\ \text{for } j = j^*,\ldots,J_i)$.

The left side of equation 9 is the joint probability that alternative 1 is preferred to alternative 2 which is preferred to alternative 3, and so on to alternative $J_i - 1$ which is preferred to alternative $J_i$ for decision maker $i$. The right side of equation 9 may be interpreted as the statistical definition of the independence of the events $(U_{i1} \ge U_{ij},\ j = 1,2,\ldots,J_i)$, $(U_{i2} \ge U_{ij},\ j = 2,3,\ldots,J_i)$, $\ldots$, $(U_{iJ_i} \ge U_{ij})$.

The statistical independence condition implied by equation 9 leads to the notion of an "explosion" process [...]
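The decomposition in equation 8 can be illustrated with a minimal Python sketch. It assumes that each choice event follows the multinomial logit form of equation 5; the function names and the utility values are hypothetical and are not taken from the article.

```python
import math
from itertools import permutations


def choice_probs(utilities):
    """Multinomial logit choice probabilities (equation 5) for one choice set."""
    top = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def ranking_prob(ranked_utilities):
    """Probability of a complete ranking as a product of J - 1 choice events.

    ranked_utilities holds the systematic utilities V, ordered from the most
    preferred alternative to the least preferred one (equation 8).
    """
    prob = 1.0
    for stage in range(len(ranked_utilities) - 1):
        remaining = ranked_utilities[stage:]
        # The top-ranked remaining alternative is the one "chosen" at this stage.
        prob *= choice_probs(remaining)[0]
    return prob


# Hypothetical systematic utilities for three alternatives a, b, c.
V = {"a": 1.0, "b": 0.2, "c": -0.5}
total_over_rankings = sum(
    ranking_prob([V[name] for name in order]) for order in permutations(V)
)
```

Under these assumptions the probabilities of all possible rankings sum to one, which is a quick consistency check on the decomposition.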
[...] exploded choice observations had been added to the available pool of choice sets, and the explosion process should be terminated just prior to the "elbow in the curve."

A second consideration in choosing $E$ is related to the researcher's prior beliefs and knowledge about the choice process being studied. Such beliefs and knowledge might lead the researcher to not consider values of $E$ beyond some practical upper bound.

Another factor to consider is the distributions of $r_i$ and $J_i$ values which serve to restrict the possible range of explosion depths. For example, if the mean depth of available rank order information is small (say, $\bar{r} = 2.5$), $I(5)$ probably will not be much larger than $I(4)$. Thus, the researcher might decide not to explode beyond a depth of 4 because few additional choice observations will be generated and the extra exploded choice observations beyond $E = 4$ might be of a priori questionable quality.

The strength of these heuristic approaches is that they are relatively simple to use. However, an obvious question is whether an analytical approach exists for determining the extent to which rank ordered choice sets should be exploded (decomposed). One formal approach involves grouping choice observations by depth of explosion and sequential hypothesis testing. Define the first subgroup of choice observations to consist of the $I(E)$ choice sets generated by an explosion to a depth of $E$. The second subgroup then consists of the incremental $I(E + 1) - I(E)$ choice sets generated by exploding to a depth of $E + 1$. If $I(E + 1) - I(E)$ is large enough to allow reasonable parameter estimates to be developed, the hypothesis that $\theta^{(E)} = \theta^{(E+1)}$ can be tested by a statistical testing procedure suggested by Watson and Westin (1975). This test affords an assessment of whether two data subgroups should be pooled for estimation purposes. To test the hypothesis that $\theta^{(1)} = \theta^{(2)}$, the appropriate test statistic is

$-2\{L(\theta = \hat{\theta}^{(P)}) - [L(\theta = \hat{\theta}^{(1)}) + L(\theta = \hat{\theta}^{(2)})]\}$

where $\hat{\theta}^{(P)}$ is the MLE of $\theta$ obtained by pooling the data subgroups, and $\hat{\theta}^{(1)}$ and $\hat{\theta}^{(2)}$ are the MLEs for the separate data subgroups. This test statistic will be asymptotically distributed chi square with $N$ degrees of freedom (Wald 1943). In application of the Watson and Westin pooling test, the failure to reject the null hypothesis implies that the two data subgroups come from the same underlying choice process, with the same error term structure. Consequently, these two data subgroups can be pooled for the purposes of estimation. Thus, this is an exact test of whether the assumptions underlying the explosion process hold with real data: if the null hypothesis that $\theta^{(E)} = \theta^{(E+1)}$ cannot be rejected, the available choice set data are consistent with the requisite assumptions of the constant utility model. If this null hypothesis is rejected, no information is provided as to which assumption (or assumptions) is violated by the data. This grouping and sequential testing procedure can be iterated for successive values of $E$ until either the hypothesis that the subgroup parameter vectors are equal is rejected or the quantity $I(E + 1) - I(E)$ yields too few exploded choice observations to provide meaningful parameter estimates. The explosion process would terminate when either of these conditions is encountered.

SMALL SAMPLE PROPERTIES OF THE CONDITIONAL LOGIT ESTIMATION PROCEDURE

The known properties of maximum likelihood estimators have been derived only under asymptotic conditions. To assess the value of generating additional statistically independent observations by exploding rank ordered choice sets, one must investigate the small sample properties of the multinomial logit model parameter estimates. Of particular interest to marketing researchers is the situation described by McFadden (1974) as "conditional logit estimation," in which there is one choice set per decision maker and no replications. As well as examining how the precision of the estimates is affected by the number of choice sets available for analysis (among other factors), we investigated several related issues such as unbiasedness of the estimates and computational costs.

Design of the Monte Carlo Experiments

The general procedure used to investigate the small sample properties of the conditional logit estimates was Monte Carlo experimentation on artificially generated data. Choice sets with controllable characteristics were generated and the choice process was simulated with the probabilistic choice model in equation 5. The simulated rank ordered choice sets were then exploded and the conditional logit estimation procedure was applied to the resulting unranked choice observations. Because the true model parameters were known, we were able to examine the extent to which the estimation technique recaptured the true parameter vector, i.e., the extent to which $\hat{\theta}$ approximates $\theta$.

The experimental factors hypothesized to affect the ability of the conditional logit estimation procedure to recapture the true parameter values included (1) the number of parameters to be estimated (N), (2) the number of choice sets available for analysis after explosion (NOBS), (3) the average number of alternatives in the choice sets after explosion (SIZE), (4) the relative size of the deterministic portion of the choice model in comparison with the error component as discussed in the Appendix (SCALE), and (5) the collinearity among the variables in the model (COLL). SIZE and NOBS are inversely related because both depend on the depth of the explosion chosen by the analyst.¹

¹ Because each successive explosion generates additional choice sets with one less alternative (because the most preferred alternative is removed from consideration in forming an exploded choice set), SIZE decreases as E increases. Furthermore, as NOBS increases with increases in E, the indirect effect is that SIZE and NOBS are inversely
related. This follows because both SIZE and NOBS are functions of E. Note that SIZE and NOBS are only correlated within the context of an explosion, because when E increases so does NOBS, but at the expense of slightly decreasing the average choice set size (SIZE). Ex ante (before explosion) measures of SIZE and NOBS are uncorrelated within the factorial experimental design used in this study.

The scale of the model parameters is defined as

(13) $\mathrm{SCALE} = \left(\sum_{n=1}^{N} \theta_n^2\right)^{1/2}$

Because the size of the error component was known and the variance of the Z vectors could be determined, we were able to select SCALE values to represent different levels of "explained variance" ranging from about 10% (SCALE = 0.45) to about 85% (SCALE = 3.50).²

The collinearity among the individual components of Z is captured by an index value designed to measure the overall average correlation among the variables describing the choice set alternatives. The collinearity index used in this study is

(14) $\mathrm{COLL} = \dfrac{\sum_{n=1}^{N} \lambda_n^2 - N}{N(N - 1)}$

where $\lambda_n$ is the $n$th eigenvalue associated with the generated data (where the usual convention of ordering the eigenvalues from largest to smallest is followed). COLL is bounded between zero and one. If the variables are orthogonal (i.e., completely uncorrelated), $\lambda_1 = \lambda_2 = \cdots = \lambda_N = 1$ and COLL = 0; if the data are completely correlated and have rank 1, $\lambda_1 = N$ and the other eigenvalues will be equal to zero and COLL = 1.

The experimental design was factorial with one replication per cell. The factorial design set four factors at each of the following levels.

N = 2, 4, 7, and 10,
I = 40, 100, 200, and 400,
E = 1, 2, 3, 5, and 10,
SCALE = 0.45, 0.85, 1.375, and 3.50.

Hence, a total of 320 (= 4 x 4 x 5 x 4) experiments were conducted. Within this Monte Carlo study, NOBS is defined as

(15) $\mathrm{NOBS} = \sum_{i=1}^{I} \min(E, J_i - 1)$.

Within each cell in the experimental design, SIZE and COLL were chosen stochastically, but in such a way as to reflect the usual kinds of conditions an empirical marketing researcher might encounter in real choice behavior process data. With regard to choice set size prior to explosion, each choice set was drawn from a normal distribution (whose mean was determined by a draw from a uniform distribution with a range of 2 to 10 and a standard deviation drawn from a uniform distribution with a range of 1 to 2). The actual size of any choice set (prior to explosion) was truncated so that $2 \le J_i \le 10$ for each choice set $i$. (The lower limit on choice set size is the absolute minimum number of alternatives that can constitute a choice set; the upper limit of 10 was chosen to reflect the empirical reality of bounded choice set sizes.) Collinearity was induced into the variables characterizing the choice set alternatives by a transformation procedure described by Chapman (1981).

The artificial choice set data in each cell of the experimental design were generated in the following manner. A total of $I$ choice sets were generated, each choice set containing $J_i$ alternatives (where the $J_i$ values were determined as described above). Each of the $N$ attributes of each alternative were generated by drawing independently distributed normal random variables with mean zero and variance one. Collinearity was induced into these data. The transformed collinear data were standardized so that each attribute had mean zero and variance one. COLL was then calculated.

The next series of steps involved simulating the rank order choice process. A true parameter vector was generated by drawing $N$ independent values from a uniform distribution with a range of zero to one. Positive and negative signs were attached to each of the parameters (the signs being determined with a probability of 0.5) and the resulting true parameter values were rescaled so that relation 13 was satisfied. Next, the probabilistic choice model in equation 5 was used to assign the choice probabilities to each alternative. The "chosen" alternative was assigned rank one.³ The choice probabilities of the "nonchosen" alternatives were then rescaled to sum to 1.0, and the "choice" process was repeated to determine the rank two alternative. This procedure was iterated to rank all of the alternatives in each simulated choice set. Finally, choice sets were exploded to a depth of $E$, resulting in NOBS choice sets. These exploded choice sets were input to a conditional logit estimation program (Manski 1974) to obtain the parameter estimates.

Several additional points should be noted. First, this type of experimental design involves an implicit assumption that each simulated decision maker employs the same underlying choice model. Thus, there is no heterogeneity problem to confound the estimation process. Second, reliability is not an issue in these experi- [...]

² These values of SCALE were chosen by noting that if the components of Z are standardized to have unit variance, it follows that $\mathrm{Var}(U) = \mathrm{SCALE}^2 + \mathrm{Var}(\varepsilon)$. Because $\mathrm{Var}(\varepsilon) = 1.645$ for the double exponential distribution, the usual partitioning of total variation into explained variation and unexplained variation implies that:

"proportion of explained variation" $= \dfrac{\mathrm{SCALE}^2}{\mathrm{SCALE}^2 + 1.645}$

³ By use of a random number generator, an alternative was "chosen" with probability equal to the choice probability assigned to the alternative.
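The simulation steps described above (sequential "choice" with rescaled probabilities, explosion to depth E, the NOBS count in equation 15, and the explained-variation share in footnote 2) can be sketched in Python as follows. This is only an illustrative sketch under the multinomial logit assumption of equation 5; the function names are hypothetical, and the collinearity transformation and the estimation step are not reproduced.

```python
import math
import random


def choice_probs(utilities):
    """Multinomial logit choice probabilities (equation 5)."""
    top = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def simulate_ranking(utilities, rng):
    """Rank alternatives by repeated probabilistic choice.

    Recomputing the logit probabilities over the remaining alternatives is
    equivalent to rescaling the nonchosen probabilities to sum to 1.0.
    """
    remaining = list(range(len(utilities)))
    ranking = []
    while len(remaining) > 1:
        probs = choice_probs([utilities[j] for j in remaining])
        pick = rng.choices(remaining, weights=probs, k=1)[0]
        ranking.append(pick)
        remaining.remove(pick)
    ranking.append(remaining[0])  # the last alternative takes the final rank
    return ranking


def explode(ranking, depth):
    """Split one ranking into unranked choice sets: (chosen, alternatives)."""
    n_sets = min(depth, len(ranking) - 1)
    return [(ranking[s], ranking[s:]) for s in range(n_sets)]


def nobs(set_sizes, depth):
    """Equation 15: number of choice sets available after explosion."""
    return sum(min(depth, size - 1) for size in set_sizes)


def explained_share(scale):
    """Footnote 2: SCALE^2 / (SCALE^2 + 1.645)."""
    return scale ** 2 / (scale ** 2 + 1.645)


rng = random.Random(0)  # fixed seed, hypothetical example data
ranking = simulate_ranking([0.5, -0.2, 1.0, 0.0], rng)
exploded = explode(ranking, depth=2)
```

For instance, three choice sets with 5, 3, and 2 alternatives exploded to a depth of 4 yield 4 + 2 + 1 = 7 choice observations under equation 15.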
[...] for assessing the marginal tradeoffs between the number of available choice observations after explosion and the average choice set size after explosion. For example, a researcher who has 100 choice sets each with five ranked alternatives could use E = 1 (and NOBS = 100 and SIZE = 5) or the data could be exploded to E = 4 (where NOBS = 400 and SIZE correspondingly declines to 3.5). Using equation 18, we can easily show that RELRSV for the former strategy (E = 1) is about twice that for the latter strategy (E = 4). Further, to obtain the same level of RELRSV as that when E = 4 by retaining E = 1 and increasing the average choice set sizes, we would have to increase SIZE from five to about 100. (In this experimental design, there was much more variation in NOBS than in SIZE; NOBS had a range of 40 to about 2600 whereas SIZE ranged from about 2.6 to 9.2. If SIZE had been varied much more, say to 100, it might have exerted a stronger influence on RELRSV. Still, as most empirical applications of the multinomial logit model would probably involve mean choice set sizes of fewer than 10 alternatives, this experimental framework seems realistic. Note also that, in many applications, the size of the choice sets is determined by the consumer decision makers, and is not at the control of the marketing researcher. The number of sample decision makers, however, is often at the choice modeler's control.)

Collinearity among the attributes seems to have only a negligible impact on sampling variance. This result is very encouraging because real choice data should be expected to exhibit patterns of collinearity.

As was expected, the precision of the conditional logit estimates is directly related to SCALE, a surrogate for the degree of systematic behavior exhibited by the sample of decision makers. The coefficient estimate on log SCALE implies that a doubling of SCALE would lead to about a 68% decrease in RELRSV.

A doubling of N, the number of model parameters, leads to about a 160% increase in RELRSV, ceteris paribus. Because theoretical considerations guide the determination of the number of parameters in the stochastic utility model, these results are useful to a choice modeler in assessing the appropriate sample size to obtain a desired level of precision in the resulting parameter estimates. For example, to obtain equivalent precision for parameter estimates in a four-parameter model estimated with 100 choice sets, the choice modeler would require about 260 choice sets for a more theoretically complex model with eight parameters.

The model reported in equation 18 was also estimated with relative absolute variation as the dependent variable. The OLS results were virtually identical to those reported in equation 18. Apparently, the relationship between the external factors (i.e., NOBS, SIZE, and so on) and the sampling variance of the conditional logit estimates does not depend on whether accuracy is measured in terms of a quadratic or a linear loss function.

Although the small sample properties of the conditional logit estimates are described clearly by the regression results reported in equation 18, the explanatory power of the model is rather low. The reason is partly the experimental design used in this Monte Carlo study. Because only a single replication was performed in each of the 320 cells in the factorial design, we could not account for the within-cell variation of the performance measures (the dependent variables). The source of this within-cell variation is the probabilistic choice process itself: the "chosen" alternative in each exploded choice observation is determined by a probabilistic model and a single replication of the choice process reflects but a single point estimate of the parameter values. To measure the approximate magnitude of this replication error, five replications of the simulated choice process were conducted in each of the 16 cells characterized by the factorial design with N = 4 and 7, I = 100 and 250, E = 1 and 4, and SCALE = 0.50 and 2.00. The average standard deviation of the log RELRSV values over these 16 cells is 0.499. Because the standard error of the estimate in equation 18 is 0.894, the replication error represents about 56% of the total unexplained variance in the regression model in equation 18. If means of replicated choice processes had been used as the dependent variable in equation 18, we could reasonably expect that the total unexplained variance would have been reduced by about 56%. The corresponding $R^2$ value would have been increased to about 0.85. Such resulting unexplained variance (about 15%) is much more in line with a priori expectations.

Biasedness. Because the conditional logit estimates are obtained by means of maximum likelihood estimation techniques, the property of unbiasedness is guaranteed only for large samples (asymptotically). Consequently, it would be useful to know the degree and direction of any small sample bias.

For the 320 cells in the experimental design, the mean BIAS is -0.025 with an associated standard deviation of 0.311. A two-tailed test of the hypothesis that the mean BIAS of the conditional logit estimates is zero leads to the conclusion that the unbiasedness hypothesis cannot be rejected at the conventional 5% level of significance (p = 0.15).

To test for the possible presence of conditional bias (bias depending on the external factors), BIAS was regressed against several theoretically plausible combinations of the external factors (NOBS, SIZE, COLL, SCALE, and N). None of the resulting F-statistics for the regression models tested yielded significant values.

Though it is not possible to prove unbiasedness via simulation, these Monte Carlo results may be viewed as support for an hypothesis of unbiasedness. The possibility that subsets of the parameter estimates could be biased is not precluded. That is, offsetting biases in individual parameter estimates could be present.

Computational considerations. The final performance measure of interest was computational cost. Because the explosion process yields additional choice sets for analysis, the decision of how much to explode the available [...]
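Two of the figures reported in this section can be checked with a short Python sketch. It assumes a large-sample normal approximation for the two-tailed test of mean BIAS, and it back-calculates elasticities from the reported doubling effects on the assumption that the regression is linear in logarithms. The function names are hypothetical, and the back-calculated values are illustrative rather than the coefficients of equation 18.

```python
import math


def two_tailed_p(mean, sd, n):
    """Normal-approximation test that a sample mean equals zero."""
    z = mean / (sd / math.sqrt(n))
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-tailed normal probability
    return z, p


def implied_elasticity(pct_change_when_doubled):
    """Log-log coefficient b such that doubling a factor gives 2**b - 1."""
    return math.log2(1.0 + pct_change_when_doubled)


# Mean BIAS and standard deviation over the 320 cells, as reported above.
z_stat, p_value = two_tailed_p(-0.025, 0.311, 320)

# Doubling SCALE: about a 68% decrease; doubling N: about a 160% increase.
scale_elasticity = implied_elasticity(-0.68)
n_elasticity = implied_elasticity(1.60)
```

The resulting p-value is close to the reported p = 0.15, and the implied elasticities are roughly -1.6 for SCALE and 1.4 for N.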
[...] first couple of rank ordered choice set alternatives is marginal at best (because of reliability and "noisy" data problems), survey questionnaire respondents need not be confused and antagonized by having to supply more detailed information than can be used profitably in the subsequent statistical analysis.

APPENDIX
SCALE CONSIDERATIONS AND THE STOCHASTIC UTILITY MODEL

The rank order of a set of alternatives in a choice set will be invariant under monotone transformations of the form $W = \alpha U + \beta$, for $\alpha > 0$. Hence, the most general form of the stochastic utility model would be

(A1.1) $U_{ij} = \alpha(\theta Z_{ij} + \varepsilon_{ij}) + \beta$.

The model in equations 5 and 6 follows from the normalizations $\alpha = 1$ and $\beta = 0$. Because we are concerned only with differences in the utilities of various choice set alternatives, $\beta$ will be identical for each alternative. Hence, no generality is lost by assuming that $\beta = 0$. Setting $\alpha = 1$ causes no difficulties with the $\alpha \theta Z_{ij}$ term because $\alpha$ is just a scalar, and in essence just indicates the size (or scale) of the $\theta$ values. We estimate the vector $\alpha\theta$, the relative population taste parameters. To obtain absolute $\theta$ values, an estimate of $\alpha$ somehow determined independently from the estimate of $\theta$ would be required. We only need the relative $\theta$ values to draw inferences [...] estimated vector of parameters affects the error terms and, hence, their variance.

Fixing the variance of the disturbance terms implies that the proportion of the variance of $U$ represented by the systematic component of the stochastic utility model will depend directly on the "size" of the $\theta$ values. In this way, "large" $\theta$ values will be associated with high explanatory power of the stochastic utility model because the variance of the systematic component will be large in relation to the error component. The following theorem clarifies these relationships.

Scale theorem. In the stochastic utility model $U_{ij} = \theta Z_{ij} + \varepsilon_{ij}$, where the error terms follow the double exponential distribution in equation 4, let $\theta_n = \omega \tilde{\theta}_n$, where $\sum_{n=1}^{N} \tilde{\theta}_n^2 = 1$. This scaling implies that $\omega^2 = \sum_{n=1}^{N} \theta_n^2$. The scaling factor $\omega$ may be interpreted as the "size" of the $\theta$ values. It follows that

(a) as $\omega \to \infty$, $P_{ij^0} \to 1$ for the $j^0$ alternative such that $\tilde{\theta} Z_{ij^0} = \max_j(\tilde{\theta} Z_{ij})$, and $P_{ij} \to 0$ for all other alternatives

(b) as $\omega \to 0$, $P_{ij} \to \dfrac{1}{J_i}$ for $j = 1,2,\ldots,J_i$

Proof. One version of the probabilistic choice model in equations 5 and 6, which can be derived with some simple manipulations of equation 5, is

$P_{ij^*} = \dfrac{1}{1 + \sum_{j=1,\ j \ne j^*}^{J_i} \exp\{\theta(Z_{ij} - Z_{ij^*})\}}$
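The two limits stated in the scale theorem can be illustrated numerically with a small Python sketch. It assumes the multinomial logit form of equation 5 with a linear systematic component; the unit-length direction vector, the attribute values, and the function names are hypothetical.

```python
import math


def choice_probs(utilities):
    """Multinomial logit choice probabilities (equation 5)."""
    top = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def scaled_probs(direction, z_rows, omega):
    """Choice probabilities when theta = omega * direction.

    direction is a unit-length parameter vector and z_rows holds one
    attribute vector per alternative (both hypothetical here).
    """
    utilities = [
        omega * sum(t * z for t, z in zip(direction, row)) for row in z_rows
    ]
    return choice_probs(utilities)


direction = (0.6, 0.8)  # 0.6**2 + 0.8**2 = 1
z_rows = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.5)]

near_uniform = scaled_probs(direction, z_rows, omega=0.0)
near_degenerate = scaled_probs(direction, z_rows, omega=200.0)
```

With these values the second alternative has the greatest systematic utility, so its probability approaches one as omega grows, while omega = 0 gives each of the three alternatives a probability of one third.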