Accommodating Heterogeneity and Heteroscedasticity in Intercity Travel Mode Choice Model

Formulation and Application to HoNam, South Korea,
High-Speed Rail Demand Analysis
Jang-Ho Lee, Kyung-Soo Chon, and ChangHo Park
J.-H. Lee, Department of Railway Research, Korea Transport Institute, 2311 Daehwa, Ilsan, Goyang, Gyeonggi, 411-701, South Korea. K.-S. Chon and C. Park, Department of Urban Engineering, Seoul National University, Shillim-dong, Gwanak-gu, Seoul, 151-742, South Korea.

Transportation Research Record: Journal of the Transportation Research Board, No. 1898, TRB, National Research Council, Washington, D.C., 2004, pp. 69–78.

Multinomial logit models and nested logit models are limited in that they cannot accommodate unobserved variations in travelers’ taste and they do not have flexible substitution patterns among alternatives because of the independence of irrelevant alternatives property. Taking this background into account, traffic demand analysts have recently used the mixed logit model in many studies. Unfortunately, most of the studies in the literature for joint analysis of revealed-preference (RP) and stated-preference (SP) data could not simultaneously resolve the two limitations just mentioned. The mixed logit framework is used to formulate an intercity travel mode choice model for joint RP-SP analysis that accommodates the following behavioral considerations: (a) observed and unobserved heterogeneity across individuals in response to level-of-service (LOS) factors, (b) heteroscedasticity across alternatives, and (c) scale differences between the RP and SP choice contexts. The mixed logit formulation is estimated with the maximum simulated likelihood method that employs quasi-random Halton draws. The formulation is applied to examine the travel behavior responses of users of the HoNam, South Korea, high-speed rail to changes in travel conditions. The empirical results show that there is a significant heteroscedasticity across alternatives and a significant heterogeneity in response to LOS attributes based on both observed and unobserved individual characteristics. There is an improvement in the data-fit statistics when one introduces heterogeneity and heteroscedasticity. These results highlight the need to include heterogeneity and heteroscedasticity within the context of intercity travel mode choice modeling to assist policy decision making.

Since revealed-preference (RP) and stated-preference (SP) data have their own advantages and disadvantages with respect to the estimation of behavioral parameters of interest, joint RP-SP modeling is needed in travel demand analysis (1). Also, Delphi survey results show that joint RP-SP modeling will be one of the top issues concerning transport modeling in the future (2).

With reference to modeling framework, the multinomial logit (MNL) model and the nested logit (NL) model have limitations in that they cannot accommodate unobserved variations in travelers’ taste and they do not have flexible substitution patterns among alternatives because of the independence of irrelevant alternatives (IIA) property.

Taking this background into account, traffic demand analysts have recently used the mixed logit model in many studies. Unfortunately, most of the studies in the literature for joint analysis of RP and SP data could not simultaneously resolve the two limitations just mentioned.

An intercity travel mode choice model for joint RP-SP analysis is formulated that accommodates the following behavioral considerations: (a) observed and unobserved (to an analyst) heterogeneity across individuals in response to level-of-service (LOS) factors, (b) heteroscedasticity across alternatives, and (c) scale differences between the RP and SP choice contexts.

The fundamental reasons for accommodating all three modeling issues are as follows. First, an individual’s response to LOS attributes affects his or her travel mode choice for a trip, and the response varies across individuals on the basis of observed and unobserved individual characteristics (e.g., the purpose of trip, vehicle ownership). Second, the allowance to have unequal variances of the random components in utilities with the different alternatives (a) overcomes the IIA restriction of the commonly used MNL model and (b) permits more flexibility in cross-elasticity structure than the MNL model or the NL model does. Finally, since the RP and SP choice settings are quite different from each other, there is no reason to believe that the variance of unobserved factors in the RP setting will be identical to that of unobserved factors in the SP settings. The scale difference between the RP and SP choice contexts has been recognized and accommodated for in almost all previous joint RP-SP analyses.

MODEL FORMULATION

Heterogeneity

According to Hensher and Button (2), heterogeneity effects refer to observed and unobserved differences across decision makers in the intrinsic preference for a choice alternative (preference heterogeneity) and in the sensitivity to characteristics of the choice alternatives (response heterogeneity). Unobserved heterogeneity effects can be accommodated by using the random-parameter structure.

Bhat (3) estimated an intercity travel mode choice model that accommodates variations in response to LOS measures due to both
observed and unobserved individual characteristics. The model was applied to examine the impact of improved rail service on weekday business travel in the Toronto–Montreal corridor in Canada. Bhat (4) also formulated a mixed logit model of multiday urban travel mode choice that accommodates variations in mode preference and response to LOS factors. The model was applied to examine the travel mode choice of workers in the San Francisco Bay Area in California. These two empirical results emphasize the need to accommodate observed and unobserved heterogeneity across individuals in travel mode choice modeling. However, they do not apply to joint RP-SP analysis.

Hensher and Greene (5) accommodated unobserved response heterogeneity, along with interalternative correlation, in an RP-SP study on vehicle type choices among conventional, electric, and uncompressed natural gas–liquid natural gas (UNG-LNG) vehicles in single-vehicle households. Bhat and Castelar (6) formulated and applied a unified mixed logit framework for modeling RP and SP. The model was applied to the analysis of congestion pricing in the San Francisco Bay Area with the combined alternatives of travel mode and departure time. The model accommodated unobserved heterogeneity, interalternative error structure, and state dependence. The empirical results emphasize the advantage of joint RP-SP analysis and the need to include unobserved heterogeneity, flexible error structure, and state dependence.

Heteroscedasticity

One of the basic assumptions of the MNL model is that the random components of utilities with different alternatives are independent and identically distributed (IID) with a Type I extreme value distribution. However, the IID error structure assumption leaves the MNL model saddled with the IIA property.

Bhat (7) formulated the heteroscedastic extreme value (HEV) model, which assumes that the error terms are distributed with a Type I extreme value distribution. The variances of the error term with alternatives are allowed to be different across all alternatives. The formulated model is flexible enough to allow differential cross-elasticities among alternatives. The model was applied to examine the impact of improved rail service on intercity business travel in the Toronto–Montreal corridor. Hensher et al. (1) applied the HEV model to estimate an intercity travel mode choice model from combined RP-SP data. The model was applied to examine the effect of high-speed rail (HSR) service in the Sydney–Canberra corridor in Australia.

Recently, transport applications have used a mixed logit framework to relax the IID error structure assumption. Bhat (8) accommodated unobserved correlation across both dimensions in a two-dimensional choice context. The model was applied to an analysis of travel mode and departure time choice for home-based social-recreational trips. Brownstone and Train (9) applied a mixed logit framework to model households’ choice among vehicles powered with gas, methanol, compressed natural gas (CNG), and electricity. Their specification allows two additional error components, such as nonelectric vehicle and non-CNG vehicle error components. Brownstone et al. (10) accommodated both heteroscedasticity and correlation across alternatives on the basis of the framework of a mixed logit model in joint RP-SP data analysis. Walker (11) presented and empirically demonstrated a generalized methodological framework that integrates the extensions of discrete choice, including the factor-analytic logit kernel model. This study shows that the logit-kernel formulation can accommodate random parameters and heteroscedasticity by using RP data.

Scale Difference Between RP and SP

RP and SP data each have their own advantages and disadvantages, because of which the two types of data are highly complementary, and combined estimators can be used to draw on the advantages of each type of data. Since the RP and SP choice settings are quite different from each other, the variance of unobserved components in the RP setting can be different from that of unobserved components in the SP setting (12). The scale difference between the RP and SP choice contexts has been recognized and accommodated in almost all previous joint RP-SP analyses.

Mixed Logit Model Formulation

A mixed logit framework is formulated that accommodates observed and unobserved response heterogeneity, heteroscedasticity, and scale difference between the RP and SP choice contexts. First, the observed response heterogeneity represents the differences across individual and trip characteristics. The unobserved response heterogeneity can be accommodated by the random-parameter structure. Second, the heteroscedasticity can be considered by using the factor-analytic variance-covariance structure. Finally, the scale difference between the RP and SP choice contexts can be accommodated by estimating the scale parameter of the SP choice context, whereas the RP scale parameter is normalized to 1 for identification.

In the mixed logit model framework, the random utility term is made up of two components: a probitlike random component with a multivariate distribution, and an IID Type I extreme value distributed random component. The probitlike random component captures the interdependencies among alternatives, the response heterogeneity, or both. Walker (11) showed that the mixed logit model with the factor-analytic structure is a general formulation that can be used to specify all known (additive) error structures, including heteroscedasticity, nested (cross-nested) error structures, random parameters, and the autoregressive process. Thus, the heterogeneity and heteroscedasticity are specified by using a factor-analytic structure.

In a compact vector form, the utility function at the tth choice occasion of individual n accommodating the heterogeneity and heteroscedasticity can be written as follows:

\[ U_{nt} = W_n \alpha + X_{nt} \beta_n + T_1 \nu_n + \varepsilon_{nt} \tag{1} \]

where

U_nt = (J_n × 1) vector of utilities,
W_n = (J_n × K) matrix of sociodemographic and trip characteristic variables including alternative-specific variables,
α = (K × 1) vector of unknown parameters,
X_nt = (J_n × L) matrix of LOS variables,
β_n = (L × 1) vector of unknown parameters,
T_1 = (J_n × J_n) matrix of unknown variance parameter,
ν_n = (J_n × 1) vector of IID random variables with zero mean and unit variance,
ε_nt = (J_n × 1) vector of IID Type I extreme value random variables with zero location parameter and scale parameter µ_t,
J_n = number of alternatives in choice set C_n of individual n,
L = number of LOS variables, and
K = number of sociodemographic and trip characteristic variables, including alternative-specific variables.
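As an illustration of how the terms of Equation 1 combine, the following is a minimal Python sketch for one individual and one choice occasion. It assumes a diagonal T_1 (as in Equation 3), so the third term reduces to σ_j ν_j for alternative j. The function names `gumbel_draw` and `utilities` and all numerical values are hypothetical and are not taken from the paper.

```python
import math
import random

def gumbel_draw(rng, mu=1.0):
    """Draw a Type I extreme value variate with location 0 and scale parameter mu.

    Uses the inverse CDF of F(x) = exp(-exp(-mu * x)); the variance is pi^2 / (6 mu^2).
    """
    u = rng.random()
    while u == 0.0:          # guard against log(0)
        u = rng.random()
    return -math.log(-math.log(u)) / mu

def utilities(W, alpha, X, beta_n, sigma, nu, eps):
    """Equation 1 with diagonal T_1: U_j = W_j.alpha + X_j.beta_n + sigma_j * nu_j + eps_j."""
    return [
        sum(w * a for w, a in zip(W[j], alpha))
        + sum(x * b for x, b in zip(X[j], beta_n))
        + sigma[j] * nu[j]
        + eps[j]
        for j in range(len(W))
    ]

# Hypothetical example: two alternatives, two W columns, one LOS variable.
rng = random.Random(0)
W = [[1.0, 0.0], [0.0, 1.0]]          # alternative-specific dummies (assumed)
X = [[10.0], [20.0]]                  # e.g. in-vehicle time (assumed units)
alpha = [0.5, -0.2]
beta_n = [-0.1]
sigma = [1.0, 0.5]                    # spread parameters on the diagonal of T_1
nu = [rng.gauss(0.0, 1.0) for _ in W] # zero mean, unit variance
eps = [gumbel_draw(rng) for _ in W]
U = utilities(W, alpha, X, beta_n, sigma, nu, eps)
```

Setting `nu` and `eps` to zero recovers the systematic part of the utility, which is a convenient check on the inputs.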
The second term of Equation 1 represents the response heterogeneity. Random-parameter structure can accommodate the observed heterogeneity as well as the unobserved heterogeneity:

\[ \beta_n = \pm \exp(\beta + \gamma Z_n + T_2 \upsilon_n) \tag{2} \]

where

β = (L × 1) vector of constants in LOS variables;
γ = (L × K) matrix of unknown parameters;
Z_n = (K × 1) vector of sociodemographic and trip characteristic variables for observed heterogeneity;
T_2 = (L × L) lower Cholesky matrix, such that T_2 T_2′ = Σ_β (variance-covariance matrix); and
υ_n = (L × 1) vector of IID random variables with zero mean and unit variance.

The unobserved heterogeneity model was estimated with lognormal distributions for the sign constraints. For example, the travel time and cost parameters should have a negative sign. T_2 is usually specified as diagonal except when the LOS variables are closely correlated. If there are LOS variables that are correlated with each other such as travel time and cost, T_2 is not specified as diagonal and the correlation should be estimated.

The third term of Equation 1 represents the heteroscedasticity:

\[ T_1 = \begin{pmatrix} \sigma_1 & 0 & 0 & \cdots & 0 \\ 0 & \sigma_2 & 0 & \cdots & 0 \\ 0 & 0 & \sigma_3 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \sigma_{J_n} \end{pmatrix} \tag{3} \]

where σ_i is the variance parameter of alternative i.

The heteroscedasticity model was estimated with a normal distribution. It is not necessary to constrain the minimum variance term to zero for identification because the heteroscedasticity components are simulated such that the error component is identical for all responses of the same individual.

The scale parameter of the extreme value term in Equation 1 is specified as follows:

\[ \mu_t = (\mu_{SP} - \mu_{RP}) \, \delta_t^{SP} + \mu_{RP} \tag{4} \]

where

µ_RP = scale parameter of RP data (normalized to 1 for identification),
µ_SP = scale parameter of SP data, and
δ_t^SP = dummy variable taking value 1 if the tth choice occasion of individual n is an SP choice and 0 otherwise.

Such a specification accommodates the scale difference between the RP and SP choice contexts. Specifically, the scale parameter of the RP data is normalized to 1 for identification, and the scale parameter of the SP data is estimated. If the variance of the random utility in the SP data is larger than that of the random utility in the RP data, the scale parameter of the SP data is estimated between 0 and 1. In contrast, if the variance of the random utility in the SP data is smaller than that of the random utility in the RP data, the scale parameter of the SP data is estimated over 1.

The unknown parameters in this model are µ_t, α, β, γ, and those in T_1 and T_2. The explanatory variables W_n and X_n are observed, whereas ν_n, υ_n, and ε_n are unobserved. If the factors ν_n and υ_n are known, the model corresponds to an MNL formulation. The conditional choice probability of alternative i at the tth choice occasion of individual n, given ν_n and υ_n, is as follows:

\[ L_{nt}(i \mid \nu_n, \upsilon_n) = \frac{\exp[\mu_t (W_n \alpha + X_{nit} \beta_n + T_1 \nu_n)]}{\sum_{j \in C_n} \exp[\mu_t (W_n \alpha + X_{njt} \beta_n + T_1 \nu_n)]} \tag{5} \]

The unconditional choice probability of alternative i at the tth choice occasion of individual n is

\[ P_{nt}(i) = \int_{\nu} \int_{\upsilon} L_{nt}(i \mid \nu_n, \upsilon_n) \, dF(\upsilon) \, dF(\nu) \tag{6} \]

where F() is the multivariate cumulative normal distribution.

For maximum-likelihood estimation, the probability of each sampled individual’s sequence of observed choices is needed. The probability of individual n’s observed sequence of choices is the product of the conditional choice probabilities (Equation 5) over choice occasions:

\[ L_n(i \mid \nu_n, \upsilon_n) = \prod_{t} L_{nt}(i \mid \nu_n, \upsilon_n) \tag{7} \]

The unconditional probability for the sequence of choices is

\[ P_n(i) = \int_{\nu} \int_{\upsilon} L_n(i \mid \nu_n, \upsilon_n) \, dF(\upsilon) \, dF(\nu) \tag{8} \]

MODEL ESTIMATION

The maximum simulated likelihood (MSL) estimation method is generally used to estimate the mixed logit model. The choice probability of alternative i is replaced with the unbiased smooth tractable simulator:

\[ \hat{P}_n(i \mid \theta) = \frac{1}{D} \sum_{d=1}^{D} L_n(i \mid \theta, \nu_n^d, \upsilon_n^d) \tag{9} \]

where

L_n(i | θ, ν_n^d, υ_n^d) = choice probability of alternative i given θ, ν_n^d, and υ_n^d;
θ = vector of unknown parameters;
ν_n^d = dth draw from distribution of ν;
υ_n^d = dth draw from distribution of υ; and
D = number of draws (repetitions).

This process is repeated D times for the given value of the parameter vector to be estimated, and the integral (Equation 8) is approximated by averaging the computed choice probabilities in the different draws. The simulated log-likelihood function can be written as follows:

\[ SLL(\theta) = \sum_{n=1}^{N} \ln[\hat{P}_n(i \mid \theta)] \tag{10} \]

where N is the total number of individuals in the sample.

The parameter vector is estimated as the vector values that maximize the simulated log-likelihood function. Under rather weak
regularity conditions, the MSL estimator is consistent, asymptotically efficient, and asymptotically normally distributed (13).

The MSL estimator will generally be a biased simulation of the maximum likelihood estimator because of the logarithmic transformation of the choice probabilities in the log-likelihood function. The bias of the MSL estimator decreases as the number of repetitions (D) increases (14). Brownstone and Train (9) showed the bias to be rather negligible with 250 repetitions. Since the mixed logit model here used the lognormal distributions, the empirical analysis was carried out with 500 repetitions.

Bhat (15) showed that the Halton simulation method outperforms the pseudo-random Monte Carlo simulation method for mixed logit model estimation. Bhat (16) also described a problem in which the standard Halton sequences defined by large primes can be highly correlated with each other over large portions of the sequence for simulation of high-dimensional integrals and suggested an effective solution, the scrambled Halton sequence. The standard Halton sequence is used here because relatively low-dimensional integrals are simulated.

To decrease estimation time, the gradients of the simulated log-likelihood function with respect to the parameters were analytically programmed and the Hessian (the second derivatives) was approximated with the technique of Berndt et al., which is computed as follows (14):

\[ B = \sum_{n=1}^{N} \left( \frac{\partial \ln P_n(\theta)}{\partial \theta} \right) \left( \frac{\partial \ln P_n(\theta)}{\partial \theta} \right)' \tag{11} \]

The score is evaluated per individual and computed with the simulated scores for the MSL estimation. The estimation and computations herein were carried out using Microsoft Visual C++ programming language on a personal computer.

EMPIRICAL ANALYSIS

Data Sources

The data used in the analysis were assembled by the Korean Society of Transportation in 2002 to develop travel demand models to forecast HSR demand in the HoNam corridor (Seoul–IkSan, GwangJu, MokPo corridor in South Korea) with the choice-based sampling method (17). Since the data came from the choice-based sample, the weighted exogenous sample maximum likelihood (WESML) estimator should be used (18). For the WESML the population fraction and sample fraction are needed. Since HSR is not now in operation in Korea and there is no population fraction, the WESML could not be used. The sample used consists of 581 RP and 2,078 SP data records.

The objective of the RP survey was to obtain information on the current trip such as perceived travel times and costs of available alternatives. The RP data also include sociodemographic and general trip-making characteristics of the traveler (e.g., the purpose of the trip, travel group size, origin and destination cities, choice set, children or baggage dummy, income level, number of vehicles in household, sex, and age). The LOS variables are access and egress time, access and egress cost, in-vehicle time, and travel cost (fare).

The objective of the SP survey was to obtain additional data to model the travel behavior responses of the HoNam corridor travelers to changes in HSR in-vehicle time and its cost. It was hypothesized that the other travel conditions were the same as current travel conditions in the RP survey and that the access time and cost of the HSR alternative were equal to those of the existing rail alternative. The experimental design for the SP survey generated six scenarios of the HSR alternative (two levels of fare times three levels of in-vehicle time):

• HSR fare: equivalent to 65% and 90% of the airfare and
• HSR in-vehicle time: each case of journey speed 150 km/h, 180 km/h, and 210 km/h.

The universal choice set of the RP data consists of four travel modes: car, bus, existing rail, and air. The HSR alternative is added to the universal choice set of the SP data including the four foregoing alternatives. Each traveler has his or her own choice set.

Table 1 presents the availability and choice share of each mode, and Table 2 presents the descriptive statistics for sociodemographic and trip characteristics in the joint RP-SP sample.

MNL Models

Table 3 shows the results from the three MNL models: the RP MNL, the SP MNL, and the joint RP-SP MNL.
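The simulation steps described under Model Estimation (Equations 9 and 10 with Halton draws) can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the authors' Visual C++ implementation: it assumes a single random (lognormal, negative-signed) time coefficient, a fixed cost coefficient, one choice occasion per individual, and the same base-2 Halton draws reused for every individual. All function names and parameter names are hypothetical.

```python
import math
from statistics import NormalDist

def halton(index, base):
    """Element `index` (1-based) of the standard Halton sequence for a prime base."""
    value, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        value += digit * fraction
        fraction /= base
    return value

def logit_prob(utils, chosen, mu=1.0):
    """Conditional logit probability of `chosen` given utilities and scale mu."""
    top = max(utils)                       # subtract the maximum for numerical stability
    expu = [math.exp(mu * (u - top)) for u in utils]
    return expu[chosen] / sum(expu)

def simulated_prob(times, costs, chosen, b_time, b_cost, s_time, n_draws=200):
    """Average the conditional probability over Halton-based normal draws.

    The time coefficient is -exp(b_time + s_time * draw): lognormal with a negative sign.
    """
    inv_cdf = NormalDist().inv_cdf
    total = 0.0
    for d in range(1, n_draws + 1):
        draw = inv_cdf(halton(d, 2))       # Halton points lie strictly inside (0, 1)
        beta_time = -math.exp(b_time + s_time * draw)
        utils = [beta_time * t + b_cost * c for t, c in zip(times, costs)]
        total += logit_prob(utils, chosen)
    return total / n_draws

def simulated_loglik(sample, b_time, b_cost, s_time, n_draws=200):
    """Sum of log simulated probabilities over (times, costs, chosen) records."""
    return sum(
        math.log(simulated_prob(t, c, y, b_time, b_cost, s_time, n_draws))
        for t, c, y in sample
    )

# Hypothetical two-alternative records: (travel times, costs, index of chosen alternative).
sample = [([30.0, 45.0], [5.0, 3.0], 0), ([60.0, 40.0], [4.0, 6.0], 1)]
sll = simulated_loglik(sample, b_time=-3.0, b_cost=-0.05, s_time=0.5)
```

In an actual estimation, `simulated_loglik` would be passed to a numerical optimizer, each individual would receive a separate segment of the Halton sequence, and the product over that individual's choice occasions (Equation 7) would be taken inside the average.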
TABLE 1 Availability and Choice Shares of Alternatives

              RP Sample               SP Sample               Joint Sample
Alternative   Availability  Choice    Availability  Choice    Availability  Choice
              Share         Share     Share         Share     Share         Share
Car           0.809         0.522     0.782         0.351     0.788         0.388
Bus           0.943         0.207     0.934         0.147     0.936         0.160
Rail          0.921         0.194     1.000         0.137     0.983         0.150
HSR           0.000         0.000     1.000         0.315     0.781         0.246
Air           0.341         0.077     0.355         0.050     0.352         0.056
Sample size   581                     2,078                   2,659

TABLE 2 Sociodemographic and Trip Characteristics in Joint RP-SP Sample

Average Monthly                                                Number of Vehicles
Household Income
(million Won)        Travel Group Size   Purpose of Trip       per Household        Age
Range (1)  Share     Size   Share        Purpose   Share       Number   Share       Range   Share
< 1.5      0.146     1      0.479        Business  0.366       0        0.085       20-29   0.342
1.5-2.0    0.190     2      0.243        Visit     0.391       1        0.651       30-39   0.330
2.0-2.6    0.219     3      0.097        Tour      0.093       2        0.221       40-49   0.226
2.6-3.5    0.245     4      0.111        Others    0.151       3+       0.044       50 <    0.102
3.5 <      0.201     5+     0.070
(1) < 1.5: lower 25%, 1.5-2.0: lower 25-45%, 2.0-2.6: upper 35-55%, 2.6-3.5: upper 15-35%, 3.5 <: upper 15% of the population
TABLE 3 MNL Model Estimation Results

RP MNL SP MNL Joint RP-SP MNL
Variables Parameter t-statistics Parameter t-statistics Parameter t-statistics
Alternative Specific Constants
RP Sample
Bus 0.7655 1.17 - - - -
Rail 2.0136 3.38 - - 1.9116 6.11
Air -1.1190 -1.46 - - -1.7244 -4.73
SP Sample
Bus - - - - - -
Rail - - 1.6644 6.07 1.7565 6.58
HSR - - -0.4135 -2.39 -0.4534 -5.69
Air - - -1.8166 -5.44 -1.9422 -2.66
Socio-demographic Variables
Vehicles in household
Car 0.6248 3.12 0.1483 1.86 0.1821 2.40
Rail - - -0.4294 -3.56 -0.4884 -4.36
Air 0.4344 1.47 0.2665 1.91 0.2335 7.16
Travel group size
Car 0.6699 4.16 - - - -
Bus -0.4428 -2.41 -0.6341 -7.93 -0.7289 -8.63
Rail - - -0.5493 -6.85 -0.5660 -7.22
Baggage/Children
Bus -0.5610 -1.65 -0.5292 -2.78 -0.5674 -3.21
HSR - - -0.8814 -6.55 -0.9444 -6.06
Air -2.9877 -2.41 -2.3779 -3.83 -2.7688 -4.64
Business trip
Car 1.3275 4.01 - - - -
Bus 0.7548 2.30 -0.6089 -4.13 -0.6089 -4.36
Rail - - -1.2183 -6.68 -1.3131 -7.16
Air 2.3451 4.96 1.0147 4.20 1.0458 4.68
Higher income group1
Car - - 0.9536 4.12 0.6042 3.71
Bus - - 0.4694 1.77 - -
HSR - - 1.2281 5.52 0.9568 5.53
Level of Service Variables
Access/egress time
(in 10 mins.) -0.0146 -0.55 -0.0410 -3.20 -0.0369 -3.03
Access/egress cost
(in thousand Won)
Bus and Rail -0.0871 -2.50 -0.0917 -4.67 -0.0949 -5.13
Car, HSR, and Air - - - - - -
In-vehicle time
(in 10 mins.)
Car -0.0919 -4.41 -0.0830 -10.39 -0.0871 -9.01
Bus -0.0420 -1.81 -0.0578 -6.00 -0.0556 -5.55
Rail, HSR, and Air -0.1343 -5.67 -0.1281 -11.56 -0.1349 -9.84
Travel cost (Fare)
(in thousand Won) -0.0246 -2.84 -0.0434 -12.50 -0.0432 -9.89
SP-to-RP Scale factor2 - - - - 0.9541 -0.60
1 Upper 15% income group (Over 3.5 million Won in average monthly household income).
2 The t-statistic corresponding to the scale parameter is computed with respect to 1.
(-): Data not applicable or parameter not statistically significant.
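For readers who want to relate the Table 3 coefficients to a monetary value of time, the sketch below applies the usual ratio of a time coefficient to the cost coefficient. This ratio definition is an assumption here, since the paper's own calculation is not shown in this section, and the resulting figures are a plain arithmetic illustration, not values taken from the paper's Table 6. The function name `value_of_time` is hypothetical. The inputs are the joint RP-SP MNL in-vehicle time coefficients (per 10 minutes) and the travel cost coefficient (per thousand Won).

```python
def value_of_time(beta_time_per_10min, beta_cost_per_1000won):
    """Implied value of time in Won per hour from MNL time and cost coefficients."""
    thousand_won_per_10min = beta_time_per_10min / beta_cost_per_1000won
    return thousand_won_per_10min * 1000.0 * 6.0   # 1,000 Won units, six 10-minute blocks per hour

# Joint RP-SP MNL estimates from Table 3.
in_vehicle_time = {"Car": -0.0871, "Bus": -0.0556, "Rail, HSR, and Air": -0.1349}
travel_cost = -0.0432

for mode, beta_time in in_vehicle_time.items():
    print(f"{mode}: {value_of_time(beta_time, travel_cost):,.0f} Won per hour")
```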
The coefficients of the alternative-specific constants show the differences between the RP and SP samples due to the differences in the alternative share and in the choice set between the two samples. The effects of sociodemographic variables indicate that individuals in a household with a high number of vehicles are likely to prefer to travel by car and air. The bus, HSR, and air alternatives are less attractive to those who have baggage or children because of the burden of transferring at the bus terminal, station, and airport. Although an individual traveling in a group prefers the car alternative, he or she does not prefer the bus and rail. An individual in the higher-income group (upper 15% of the population) prefers the car and HSR, and those who travel for business relatively prefer to travel by air.

Among the LOS variables, the results show the expected negative effects of travel time and cost. However, the parameters of access and egress cost in the car, HSR, and air are not statistically different from zero. A comparison of the RP MNL and the other MNL models shows the difference in the LOS variable effects. The RP MNL model does not have a statistically significant access-and-egress time parameter. This result reflects the limited variation in access-and-egress time within the RP sample as well as multicollinearity between access-and-egress time and access-and-egress cost. The other MNL models provide a statistically significant access-and-egress time parameter.

The joint RP-SP MNL model shows that there is no significant scale difference in the RP and SP contexts. From the likelihood ratio test, the hypothesis that there is no scale difference cannot be rejected. The likelihood ratio value for the test is 1.58 [= −2(−2615.73 + 2614.94)], which is smaller than the chi-squared value (= 11.35) corresponding to the three additional parameters in the unrestricted model at a 0.01 level of significance.

Mixed Logit Models

Table 4 presents the results of the three mixed logit models for the joint RP-SP analysis. The first model is the heterogeneity model, which accommodates observed and unobserved heterogeneity in response to the LOS attributes; the second is the heteroscedasticity model, which allows the unequal variances across the alternatives; and the third is the heterogeneity-and-heteroscedasticity model, which simultaneously accommodates the two modeling issues.

Table 5 provides the data-fit statistics, and Table 6 presents the monetary values of time from the various models. The monetary value of time is very high in the RP MNL model; this model cannot capture the sensitivity to cost because of the limited variation in cost within the RP sample as well as the multicollinearity between the travel time and cost. Since the parameter of travel time is distributed in the heterogeneity and the heterogeneity-and-heteroscedasticity models, the mean, median, and mode values are shown. Furthermore, the monetary values of time in the heterogeneity and the heterogeneity-and-heteroscedasticity models are classified into non-business-trip and business-trip values. The monetary values of time on a business trip are almost twice those of a nonbusiness trip.

Heterogeneity Model

A comparison of the business trip in the rail utility between the heterogeneity model and the other models reveals that the heterogeneity model has a different, positive sign. The t-statistics for the constants of the LOS variables are not reported because the only reasonable test of the constants would be against a value of negative infinity. Not all of the standard deviation parameters representing the unobserved heterogeneity are statistically significant; those for access-and-egress time and travel cost are, which indicates that the travelers’ response to these two factors varies across individuals. The parameters representing the observed heterogeneity are statistically significant. Business travelers are more sensitive to the LOS factors except the access-and-egress time, and an individual in a household with a high number of vehicles is more sensitive to the access-and-egress cost and the in-vehicle time of the car. However, the income level does not affect the observed heterogeneity. The reason could be that individuals in the sample tend to avoid revealing their own income. Also, there is no statistically significant scale difference between the RP and SP contexts.

From the likelihood ratio test, the homogeneity hypothesis can be rejected. The likelihood ratio value for the test is 98.12 [= −2(−2614.94 + 2565.88)], which is larger than the chi-squared value (= 24.72) corresponding to the 11 additional parameters in the unrestricted model at a 0.01 level of significance. The unconstrained heterogeneity model is also estimated, which has all the standard deviation parameters representing the unobserved heterogeneity and correlations between travel time and cost. The likelihood ratio value for the test is 100.88 [= −2(−2614.94 + 2564.50)], which is larger than the chi-squared value (= 36.19) corresponding to the 19 additional parameters in the unrestricted model at a 0.01 level of significance. These results confirm the need to accommodate observed and unobserved heterogeneity effects.

Heteroscedasticity Model

The alternative-specific constant for HSR has a different, negative sign compared with the results for the other models. This negative sign means that if all the travel conditions are equal across alternatives, the HSR alternative is less attractive than the car alternative. The SP-to-RP scale factor is larger than 1 but not statistically different from 1 in the heteroscedasticity model, whereas it is smaller than 1 in the joint RP-SP MNL model and the heterogeneity model. The same holds in the heterogeneity-and-heteroscedasticity model. In summary, this finding means that the variance of the random component in the SP data is decreased when the heteroscedasticity is considered.

All the Gaussian variance parameters representing the heteroscedasticity are statistically significant. Thus, the hypothesis that all the variances of different alternatives are identical can be rejected. The variance of the random term in the HSR utility is especially larger than the others since the HSR alternative is a nonexistent travel alternative. From the likelihood ratio test, the homoscedasticity hypothesis can be rejected. The likelihood ratio value for the test is 72.80 [= −2(−2614.94 + 2578.54)], which is larger than the chi-squared value (= 18.47) corresponding to the seven additional parameters in the unrestricted model at a 0.01 level of significance. This result confirms the need to accommodate the heteroscedasticity effect.

Heterogeneity-and-Heteroscedasticity Model

The parameters of a business trip by bus, rail, and air are not statistically different from zero. The effects of the LOS attributes and the sociodemographic variables are generally larger than those of the heterogeneity model. The SP-to-RP scale factor in the heterogeneity-and-heteroscedasticity model is larger than in the others. However, it is not statistically different from 1.
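The likelihood ratio tests quoted in this section can be reproduced with a short calculation. The sketch below uses only the log-likelihood values and 0.01-level chi-squared critical values stated in the text; the function names and test labels are hypothetical, and the critical values are copied from the text rather than computed.

```python
def likelihood_ratio(ll_restricted, ll_unrestricted):
    """Likelihood ratio statistic: -2 * (LL_restricted - LL_unrestricted)."""
    return -2.0 * (ll_restricted - ll_unrestricted)

# (label, restricted LL, unrestricted LL, chi-squared critical value at the 0.01 level)
TESTS = [
    ("no scale difference (joint MNL)", -2615.73, -2614.94, 11.35),
    ("homogeneity", -2614.94, -2565.88, 24.72),
    ("homogeneity vs. unconstrained heterogeneity", -2614.94, -2564.50, 36.19),
    ("homoscedasticity", -2614.94, -2578.54, 18.47),
]

def run_tests(tests=TESTS):
    """Return (label, statistic, rejected) for each restricted-model hypothesis."""
    results = []
    for label, ll_r, ll_u, critical in tests:
        stat = likelihood_ratio(ll_r, ll_u)
        results.append((label, stat, stat > critical))
    return results

for label, stat, rejected in run_tests():
    print(f"{label}: LR = {stat:.2f}, rejected at 0.01: {rejected}")
```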
TABLE 4 Mixed Logit Model Estimation Results

Heterogeneity Heteroscedasticity Heterogeneity & Heteroscedasticity
Variables Parm. t-stat. Parm. t-stat. Parm. t-stat.
Alternative Specific Constants - RP Sample
Bus 0.4671 0.91 -0.0773 -0.11 3.7916 1.48
Rail 2.2006 4.28 2.6962 2.21 7.3608 1.75
Air -1.0511 -2.07 -3.7831 -2.59 -1.6656 -1.06
Alternative Specific Constants - SP Sample
Bus 0.5232 1.09 0.8224 1.20 4.7331 1.70
Rail 2.1171 4.81 3.2463 2.76 7.8103 1.87
HSR 0.3091 0.88 -1.6260 -2.20 0.3266 0.31
Air -1.1702 -2.41 -3.5970 -2.38 -1.0867 -0.73
Socio-demographic Variables
Vehicles in household - Car 0.6252 2.75 0.5108 1.81 2.8047 1.73
Rail -0.4792 -3.77 -0.6770 -2.37 -0.9332 -1.68
Air 0.3605 1.99 0.2720 0.89 1.0770 1.47
Travel group size - Bus -0.7208 -6.89 -1.3605 -2.93 -2.1263 -1.85
Rail -0.5779 -7.08 -1.2176 -2.90 -2.0669 -1.87
Baggage/Children - Bus -0.6422 -3.09 -0.8881 -2.34 -1.2242 -1.66
HSR -0.9556 -5.48 -2.1629 -2.78 -3.0890 -1.79
Air -3.0874 -3.70 -5.9057 -2.58 -9.7686 -1.71
Business trip - Bus 0.4626 0.93 -1.1752 -2.52 -1.0268 -0.83
Rail 0.6810 1.68 -2.4055 -2.92 -0.1320 -0.12
Air 0.3300 1.05 1.7946 2.20 -0.3499 -0.38
Higher income group1 - Car 0.8407 3.81 1.0071 2.00 2.2826 1.68
HSR 1.1625 5.40 2.1810 2.74 3.9637 1.80
Level of Service Variables
Access/egress time (in min)
Constant -6.3545 - -5.1032 - -5.9783 -
Standard Deviation 2.2458 2.92 - - 1.9388 3.22
Access/egress cost of Bus and Rail (in Won)
Constant -10.7528 - -8.7462 - -9.6350 -
Business trip 2.5230 3.13 - - 1.9572 4.03
Vehicles in household 0.2297 1.53 - - 0.4688 3.12
Standard Deviation - - - - - -
In-vehicle time of Car (in min)
Constant -4.9510 - -3.9183 - -4.0638 -
Business trip 0.4682 4.37 - - 0.4630 4.03
Vehicles in household 0.1540 2.00 - - 0.2486 4.35
In-vehicle time of Bus (in min)
Constant -5.2029 - -4.2056 - -3.8894 -
Business trip 0.6843 2.18 - - 0.2648 1.13
In-vehicle time of Rail, HSR, and Air (in min)
Constant -4.3895 - -3.6382 - -3.3735 -
Business trip 0.6171 4.35 - - 0.4568 3.39
Travel cost (Fare) (in Won)
Constant -10.1686 - -9.2015 - -9.2939 -
Business trip 0.2103 3.07 - - 0.3508 5.89
Standard Deviation 0.4877 2.32 - - 0.2859 2.72
Gaussian Variance Variables - Car - - 3.7409 2.86 5.5782 1.85
Bus - - 1.0895 1.72 1.6347 1.53
Rail - - 1.7837 2.41 2.8994 1.79
HSR - - 3.7464 2.93 6.6255 1.86
Air - - 2.8004 2.52 2.9430 1.76
SP-to-RP Scale factor2 0.9636 -0.35 2.1174 0.88 2.4887 0.62
1 Upper 15% income group (over 3.5 million Won in average monthly household income).
2 The t-statistic corresponding to the scale parameter is computed with respect to 1.
(-): Data not applicable or parameter not statistically significant.
There is an improvement in data-fit statistics when one introduces heterogeneity and heteroscedasticity. The rho-bar-squared value increases from 0.248 to 0.269. From the likelihood ratio tests, the homoscedasticity hypothesis in the heterogeneity condition and the homogeneity hypothesis in the heteroscedasticity condition can be rejected at a 0.01 level of significance. These results show the need to accommodate heterogeneity and heteroscedasticity simultaneously.

\[ -2(-2565.88 + 2526.83) = 78.10 > \chi^2_{5,\,0.01} = 15.08 \]

\[ -2(-2578.54 + 2526.83) = 103.42 > \chi^2_{9,\,0.01} = 21.66 \]
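The two likelihood ratio statistics can be checked directly from the log-likelihoods and parameter counts reported in Table 5. The following is a minimal sketch: the dictionary and function names are illustrative and do not come from the paper, and the critical values are simply the ones quoted in the text rather than being recomputed.

```python
# Log-likelihood at convergence and number of parameters, copied from Table 5.
MODELS = {
    "heterogeneity": (-2565.88, 36),
    "heteroscedasticity": (-2578.54, 32),
    "heterogeneity_and_heteroscedasticity": (-2526.83, 41),
}

# Chi-squared critical values at the 0.01 level, as quoted in the text,
# keyed by degrees of freedom.
CRITICAL_0_01 = {5: 15.08, 9: 21.66}


def likelihood_ratio_test(restricted, unrestricted):
    """Return (LR statistic, degrees of freedom) for two nested models."""
    ll_r, k_r = MODELS[restricted]
    ll_u, k_u = MODELS[unrestricted]
    return -2.0 * (ll_r - ll_u), k_u - k_r


general = "heterogeneity_and_heteroscedasticity"
for name in ("heterogeneity", "heteroscedasticity"):
    stat, df = likelihood_ratio_test(name, general)
    rejected = stat > CRITICAL_0_01[df]
    print(f"{name}: LR = {stat:.2f}, df = {df}, rejected at 0.01: {rejected}")
```

The degrees of freedom are the differences in parameter counts (41 - 36 = 5 and 41 - 32 = 9), which is why the two tests use different critical values.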
TABLE 5 Data-Fit Statistics

Statistic                       RP-MNL    SP-MNL     Joint RP-SP MNL   Heterogeneity   Heteroscedasticity   Heterogeneity & Heteroscedasticity
Log-likelihood at zero          -625.40   -2887.64   -3513.04          -3513.04        -3513.04             -3513.04
Log-likelihood at convergence   -404.52   -2206.23   -2614.94          -2565.88        -2578.54             -2526.83
Number of parameters            18        23         25                36              32                   41
Rho-squared                     0.3532    0.2360     0.2556            0.2696          0.2660               0.2807
Rho-bar-squared                 0.3244    0.2280     0.2485            0.2594          0.2569               0.2691
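The goodness-of-fit indices in Table 5 can be reproduced from the log-likelihoods and parameter counts. The sketch below assumes the conventional definitions, rho-squared = 1 - LL(β)/LL(0) and rho-bar-squared = 1 - (LL(β) - K)/LL(0), where K is the number of parameters. This section does not state the formulas, so they are an assumption here, and the variable and function names are illustrative.

```python
# (log-likelihood at zero, log-likelihood at convergence, number of parameters),
# copied from Table 5.
TABLE_5 = {
    "RP-MNL": (-625.40, -404.52, 18),
    "SP-MNL": (-2887.64, -2206.23, 23),
    "Joint RP-SP MNL": (-3513.04, -2614.94, 25),
    "Heterogeneity": (-3513.04, -2565.88, 36),
    "Heteroscedasticity": (-3513.04, -2578.54, 32),
    "Heterogeneity & Heteroscedasticity": (-3513.04, -2526.83, 41),
}


def rho_squared(ll_zero, ll_conv):
    """Likelihood ratio index (assumed conventional definition)."""
    return 1.0 - ll_conv / ll_zero


def rho_bar_squared(ll_zero, ll_conv, n_params):
    """Likelihood ratio index penalized for the number of parameters."""
    return 1.0 - (ll_conv - n_params) / ll_zero


for model, (ll_zero, ll_conv, k) in TABLE_5.items():
    r2 = rho_squared(ll_zero, ll_conv)
    r2_bar = rho_bar_squared(ll_zero, ll_conv, k)
    print(f"{model}: rho-squared = {r2:.4f}, rho-bar-squared = {r2_bar:.4f}")
```

Because the adjusted index subtracts K from the log-likelihood, the 16 extra parameters of the general model relative to the joint MNL model have to be paid for by a real gain in fit before rho-bar-squared can rise.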
TABLE 6 Monetary Values of Time

Model                                Trip purpose   Statistic   Car              Bus              Rail, HSR, and Air
RP-MNL                               -              -           22,200 ($18.6)   10,200 ($8.5)    32,500 ($27.1)
SP-MNL                               -              -           11,200 ($9.3)    7,900 ($6.6)     17,200 ($14.3)
Joint RP-SP MNL                      -              -           12,100 ($10.1)   7,700 ($6.4)     18,700 ($15.6)
Heterogeneity                        Non-business   Mean        9,800 ($8.2)     7,600 ($6.4)     17,200 ($14.3)
Heterogeneity                        Non-business   Median      11,000 ($9.2)    8,600 ($7.2)     19,400 ($16.2)
Heterogeneity                        Non-business   Mode        14,000 ($11.7)   10,900 ($9.1)    24,600 ($20.5)
Heterogeneity                        Business       Mean        15,700 ($13.1)   15,400 ($14.7)   31,900 ($26.6)
Heterogeneity                        Business       Median      17,700 ($14.7)   17,000 ($14.2)   36,000 ($30.0)
Heterogeneity                        Business       Mode        22,400 ($18.7)   21,600 ($18.0)   45,600 ($38.0)
Heteroscedasticity                   -              -           11,800 ($9.9)    8,900 ($7.4)     15,600 ($13.0)
Heterogeneity & Heteroscedasticity   Non-business   Mean        10,800 ($9.0)    12,800 ($10.7)   21,500 ($17.9)
Heterogeneity & Heteroscedasticity   Non-business   Median      11,200 ($9.3)    13,300 ($11.1)   22,400 ($18.6)
Heterogeneity & Heteroscedasticity   Non-business   Mode        12,200 ($10.1)   14,500 ($12.1)   24,300 ($20.2)
Heterogeneity & Heteroscedasticity   Business       Mean        17,100 ($14.3)   16,700 ($13.9)   33,900 ($28.2)
Heterogeneity & Heteroscedasticity   Business       Median      17,800 ($14.8)   17,400 ($14.5)   35,300 ($29.4)
Heterogeneity & Heteroscedasticity   Business       Mode        19,300 ($16.1)   18,900 ($15.7)   38,300 ($31.9)

Unit: Won/hr (U.S. $1.0 is about 1,200 Won).
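The heteroscedasticity row of Table 6 can be traced back to the level-of-service constants in Table 4. The sketch below rests on an assumption that this section does not spell out: each time or cost coefficient is taken to be the negative exponential of its tabulated constant, so that the value of time is the ratio exp(time constant) / exp(fare constant), converted from Won per minute to Won per hour. The names used are illustrative only.

```python
import math

# In-vehicle time constants (per minute) and fare constant (per Won) of the
# heteroscedasticity model, copied from Table 4.
IVT_CONSTANT = {"car": -3.9183, "bus": -4.2056, "rail_hsr_air": -3.6382}
FARE_CONSTANT = -9.2015

WON_PER_USD = 1200.0  # approximate rate given in the note to Table 6


def value_of_time(ivt_constant, fare_constant):
    """Won per hour, ASSUMING each coefficient equals -exp(constant)."""
    won_per_minute = math.exp(ivt_constant) / math.exp(fare_constant)
    return 60.0 * won_per_minute


values = {mode: value_of_time(c, FARE_CONSTANT) for mode, c in IVT_CONSTANT.items()}
for mode, won_per_hour in values.items():
    usd = won_per_hour / WON_PER_USD
    print(f"{mode}: {won_per_hour:,.0f} Won/hr (about ${usd:.1f})")
```

Under this assumption the computed values agree, after rounding to the nearest 100 Won, with the 11,800, 8,900, and 15,600 Won/hr reported for the heteroscedasticity model. The models with random coefficients additionally need the mean, median, or mode of the coefficient distribution, which this sketch does not cover.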
Policy Implications

The objective of the original survey was to examine the effects of the HSR mode newly under construction in the HoNam corridor of Korea. Consequently, the focus here is on an examination of the aggregate-level direct and cross-elasticities of the changes in the LOS attributes of the HSR alternative. The aggregate elasticities provide the proportional changes in the expected market shares of each mode in response to a uniform percentage change in the LOS attribute of the HSR alternative.

Table 7 presents the aggregate direct and cross-elasticities of the four models for the joint RP-SP analysis in the Seoul–GwangJu corridor. The simulation for aggregate forecasting was carried out 1,000 times. The HSR direct elasticities indicate that a reduction in the fare is a more effective means of increasing the HSR market share than a reduction in the in-vehicle travel time.

Among the four models, the heterogeneity model shows the highest direct elasticities. The results from the heteroscedasticity model and the heterogeneity-and-heteroscedasticity model are relatively close to each other. The cross-elasticities differ among the four models. The joint RP-SP MNL model produces unrealistic cross-elasticities that are identical across alternatives. The heteroscedasticity model shows that the cross-elasticities of rail are larger than those of the other travel modes and the cross-elasticities of car and air are similar to each other. The heterogeneity-and-heteroscedasticity model shows considerably lower cross-elasticities except those for the bus alternative.

Although the heterogeneity model predicts a higher percentage decrease in the air choice probabilities in response to the reduction of HSR in-vehicle travel time, the heterogeneity-and-heteroscedasticity model predicts a relatively lower percentage decrease in the car and air choice probabilities. The same model predicts a higher percentage decrease in the bus choice probabilities in response to the reduction of HSR fare than do the other models. Findings indicate that the heterogeneity model leads to an overestimation in the choice probability of the bus alternative and an underestimation in the choice probability of the air alternative because it ignores heteroscedasticity.

TABLE 7 Aggregate Direct and Cross-Elasticities in Response to Changes in LOS Variables for HSR

LOS change                               Mode   Joint RP-SP MNL   Heterogeneity   Heteroscedasticity   Heterogeneity & Heteroscedasticity
Decrease in in-vehicle time of the HSR   HSR     0.7047            0.9692          0.4074               0.3817
Decrease in in-vehicle time of the HSR   Car    -0.6461           -0.4110         -0.4368              -0.2780
Decrease in in-vehicle time of the HSR   Bus    -0.6461           -0.5955         -0.5496              -0.5742
Decrease in in-vehicle time of the HSR   Rail   -0.6461           -0.5193         -0.6114              -0.4314
Decrease in in-vehicle time of the HSR   Air    -0.6461           -0.8160         -0.4428              -0.5465
Decrease in fare of the HSR              HSR     0.7068            1.0705          0.5313               0.4355
Decrease in fare of the HSR              Car    -0.7042           -0.3650         -0.5706              -0.3147
Decrease in fare of the HSR              Bus    -0.7042           -0.7764         -0.7150              -1.0775
Decrease in fare of the HSR              Rail   -0.7042           -0.6562         -0.7924              -0.5623
Decrease in fare of the HSR              Air    -0.7042           -0.6244         -0.5778              -0.5531

CONCLUSIONS

A mixed logit framework for joint RP-SP analysis is formulated that accommodates the following behavioral considerations: (a) observed and unobserved heterogeneity across individuals in response to LOS attributes, (b) heteroscedasticity across alternatives, and (c) scale differences between the RP and SP choice contexts. These behavioral issues are considered by using factor-analytic structures.

The mixed logit formulation here is estimated using the MSL estimation method, which employs quasi-random Halton draws. This formulation is applied to examine the travel behavior responses of HSR users in the HoNam corridor to changes in travel conditions.

Several meaningful results can be derived from the empirical analysis. First, the results confirm the advantage of joint RP-SP analysis. The parameter of access-and-egress time is not statistically significant in the RP MNL model; joint RP-SP analysis is better able to represent the trade-offs in LOS attributes. Second, the results support the need to accommodate heterogeneity across individuals and heteroscedasticity across alternatives. The heterogeneity-and-heteroscedasticity model shows significant random variations and differences in sensitivity to the LOS variables on the basis of the trip purpose and vehicle ownership. This model also shows significant differences in the variances of random components of utilities across alternatives. The likelihood ratio tests reject the more restrictive models in favor of this general model. Thus, the homogeneous response and homoscedasticity assumptions of the MNL model can be rejected. Third, accounting for the heterogeneity without the heteroscedasticity leads to an overestimation of direct elasticities. Fourth, there is an improvement in data-fit statistics when one introduces heterogeneity and heteroscedasticity. The rho-bar-squared value increases from 0.248 to 0.269. Finally, the variance of the random component in the SP data is decreased when heteroscedasticity is considered, and there is no statistically significant scale difference between the RP and SP contexts. None of the SP-to-RP scale factors in the models is statistically different from 1.

The policy implications for the market shares of the various travel modes are different among the models. The differences between the general model and the restrictive models are noticeable in the case of prediction of the market share of the HSR alternative. These results
highlight the need to accommodate heterogeneity and heteroscedasticity within the context of intercity travel mode choice modeling that assists policy decision making.

Publication of this paper sponsored by Passenger Travel Demand Forecasting Committee.
