Professional Documents
Culture Documents
T his research models the dynamics of customer relationships using typical transaction data. Our proposed
model permits not only capturing the dynamics of customer relationships, but also incorporating the effect of
the sequence of customer-firm encounters on the dynamics of customer relationships and the subsequent buying
behavior. Our approach to modeling relationship dynamics is structurally different from existing approaches.
Specifically, we construct and estimate a nonhomogeneous hidden Markov model to model the transitions
among latent relationship states and effects on buying behavior. In the proposed model, the transitions between
the states are a function of time-varying covariates such as customer-firm encounters that could have an endur-
ing impact by shifting the customer to a different (unobservable) relationship state. The proposed model enables
marketers to dynamically segment their customer base and to examine methods by which the firm can alter
long-term buying behavior. We use a hierarchical Bayes approach to capture the unobserved heterogeneity
across customers. We calibrate the model in the context of alumni relations using a longitudinal gift-giving data
set. Using the proposed model, we probabilistically classify the alumni base into three relationship states and
estimate the effect of alumni-university interactions, such as reunions, on the movement of alumni between
these states. Additionally, we demonstrate improved prediction ability on a hold-out sample.
Key words: customer relationship management; hidden Markov models; dynamic choice models; segmentation;
Bayesian analysis
History: This paper was received July 13, 2005, and was with the authors 18 months for 2 revisions; processed
by Prasad Naik.
185
Netzer, Lattin, and Srinivasan: A Hidden Markov Model of Customer Relationship Dynamics
186 Marketing Science 27(2), pp. 185–204, © 2008 INFORMS
the firm. In a review of customer lifetime value mod- find positive and significant state dependence effects
els, Jain and Singh (2002) suggest incorporating fac- across product categories even after controlling for
tors that drive consumer purchase over time and the heterogeneity; other studies find mixed results (e.g.,
stochasticity in the buying behavior in the customer Jeuland et al. 1980). In the context of CRM, Pfeifer and
relationship model. Carraway (2000) use a Markov model between the
observed purchase recency states to capture dynam-
2.2. Dynamics in Buying Behavior and ics in customer lifetime value. Morrison et al. (1982)
Hidden Markov Models modified the brand switching Markov model to clas-
Methodologically, the model we developed is more sify Merrill Lynch’s customers into “prime” and “not
similar to the literature on marketing dynamics. Many prime” states using managerial judgment. In the con-
marketing settings involve dynamics in consumer text of alumni donations, Soukup (1983) used an
behavior. These situations include both individual- ad-hoc dichotomization of past donations (donor and
level and market aggregate dynamics. The difficulty noncontributor) to define the customer’s state of
with capturing such dynamics is that in most market- donation behavior.
ing data sets the number of observations or time peri- A limitation of the observed states models is
ods observed is relatively small, and the nature and their restrictive account for buyer behavior dynamics,
structure of dynamics is often latent. To capture the whereby an ad-hoc specification of state dependence
latent structure of dynamics in a relatively parsimo- is added to an otherwise static model. A second short-
nious way, researchers developed various approaches coming of these models is that they often ignore other
that could be generally divided into discrete or con- important sources of dynamics in buying behavior,
tinuous state space structures. such as the enduring effects of marketing stimuli.
If the dynamics are assumed gradual or smooth, Indeed, for exogenous variables that are correlated
one could use a continuous state structure to capture over time, and are not controlled for, previous behav-
the dynamics. For example, time series autoregres- ior might be a determinant of current behavior simply
sive error models are used to capture the dynamics because it captures the effect of the omitted variables
in sales and the long-term effect of marketing activi- (Erdem and Sun 2001). This problem is likely to be
ties such as advertising (see Dekimpe and Hanssens more severe in the context of relationship marketing
2000 and Pauwels et al. 2004 for a review). Naik et al. because marketing actions such as loyalty programs
(1998) and Xie et al. (1997) use Kalman filtering to (Lewis 2004) and customer initiated interactions such
capture the dynamics in advertising scheduling and as service encounters (Bolton 1998) might alter the
new product introduction, respectively. In the choice customer’s relationship with the firm, and therefore
modeling literature, a smooth dynamic effect is often might have an enduring effect on the customer’s buy-
captured by a state-dependent term in the utility func- ing behavior. Finally, often the researcher or marketer
tion using an exponentially smoothed sum (Guadagni does not observe the consumer or market states that
and Little 1983, Srinivasan and Kesavan 1976) or a govern the dynamics.
simple running average (Bucklin and Lattin 1991) of To overcome the problem of unobserved states one
past purchases. could describe a set of latent states and transitions
However, the continuous state space is inadequate between these states and translate these latent states
to capture dynamics that are postulated to develop to the observed behavior through a stochastic model.
in a discrete manner such as an instantaneous regime This process can be described as an HMM. MacDon-
shift in the market conditions or consumer prefer- ald and Zucchini (1997, Chapter 4) describe several
ences (e.g., due to an inclusion or a drop of a brand applications of HMMs in areas ranging from biology,
from the consumer’s consideration set). One could geology, and climatology to finance and criminology.
model such dynamics, by allowing consumers (or The most common application of HMMs is in the
markets) to transition over time between a set of dis- area of speech recognition (Rabiner 1989, Rabiner and
crete states. Probably the simplest demonstration of Juang 1993). In econometrics, Hamilton (1989) pro-
such discrete states in the choice modeling literature posed an HMM to estimate the impact of discrete
is the state-dependent model (Heckman 1981). In this regime shifts on the growth rates of the real gross
model, the observed previous choice of the customer national product.
(captured by a lagged dependent variable) consti- Within the marketing literature, HMMs are closely
tutes the customer state in the current choice occasion. related to the family of latent class models (Kamakura
Choice modelers include state dependence in their and Russell 1989). Like most latent class models,
econometric models to capture heterogeneity across HMMs classify individuals into a set of states or
individuals as well as the serial correlation in pur- segments based on their buying behavior. However,
chases over time (McAlister et al. 1991). Using scanner unlike the latent class models, in HMMs the mem-
panel data, Keane (1997) and Erdem and Sun (2001) bership in the latent states is dynamic and follows
Netzer, Lattin, and Srinivasan: A Hidden Markov Model of Customer Relationship Dynamics
188 Marketing Science 27(2), pp. 185–204, © 2008 INFORMS
(3) The state dependent choice—the probability that alumni donations used in our empirical application,
the customer will choose the product at time t condi- we found this assumption to be both behaviorally and
tioned on her state is P Yit = 1 Sit = s = mit s 3 where empirically grounded.
Each one of the matrix elements in Equation (1)
Sit is the state of customer i at time t in a Markov
represents a probability of transition. We assume that
process with NS states, and
the customer’s propensity for transition is affected
Yit is the choice made by customer i at time t.
by his/her relationship encounters with the firm. We
3.2. The Model’s Components model the transitions between the states as a thresh-
old model, where a discrete transition occurs if the
3.2.1. The Markov Chain Transition Matrix. We propensity for transition passes a threshold level. As
model the transitions between states as a Markov pro- mentioned previously, the idea that a movement to a
cess. The transition matrix is defined as discrete level of relationship occurs when the aggre-
State at t gate measure of satisfaction or dissatisfaction from
State at t − 1 1 2 3 ··· NS − 1 NS relationship encounters passes a threshold has roots
1 qit11 qit12 qit13 ··· qit1NS−1 qit1NS in the relationship and service marketing literature
(Oliver 1997).
Qi t−1→t = 2 qit21 qit22 qit23 ··· qit1NS−1 qit2NS (1) The norm theory of Kahneman and Miller (1986)
postulates that past experiences create a norm, against
which current experiences are judged. Thus, it might
NS qitNS1 qitNS2 qitNS3 ··· qitNSNS−1 qitNSNS
be reasonable to expect that current relationship
encounters be judged relative to the status quo. If the
where qitss = P Sit = s Sit−1 = s is the conditional cumulative experience from the encounters between
probability that individual i moves from state s at the customer and the firm is highly negative (e.g.,
time t − 1 to state s at time t, and where 0 ≤ qitss ≤ 1 service failure), it is likely to shift the propensity for
∀ s s , and s qitss = 1. In applying our model in the transition below the threshold needed for a transition
context of alumni-university relationships (see §4), to a lower state. On the other hand, if the encounter
we put a structure on the general transition matrix is highly positive (e.g., an important product benefit
in Equation (1). Specifically, we define the transition learned from an advertisement campaign), it is likely
matrix as a random walk with a “sudden death,” to shift the propensity for transition above the thresh-
whereas from each state the customer/alumni could old needed for a transition to a higher state. If the
move to an adjacent state or drop immediately to relationship encounters in the previous period did not
dormancy. This assumption was made primarily for have a strong impact on the customer, the customer
model parsimony. Nevertheless, in the context of
is likely to stay in her current state.
Putting this process in mathematical terms, and
3
In what follows, we assume no information is available about assuming that the unobserved part of the propen-
the competition. We take this approach because this is generally sity for transition is independently and identically
the case in CRM transaction data sets, where the firm does not
observe transactions with the competition. Nevertheless, if infor-
distributed (IID) of the extreme value type, we can
mation about the competition exists, one could extend the model model the nonhomogeneous transition probabilities
to incorporate competitive information. following the ordered logit model (Greene 1997).
Netzer, Lattin, and Srinivasan: A Hidden Markov Model of Customer Relationship Dynamics
190 Marketing Science 27(2), pp. 185–204, © 2008 INFORMS
Specifically, the terms qitss in the transition matrix in existence or uniqueness of a stationary distribution.
Equation (1) could be written as In our empirical application, all the estimated indi-
vidual transition probabilities were strictly positive
qits1 = Prtransition from s to state 1 confirming that the transition matrices are aperiodic
exp1is − ait is and irreducible, thus guaranteeing an existence and
= (2) uniqueness of the stationary distribution.
1 + exp1is − ait is
3.2.3. The State-Dependent Choice. Given the
customer’s state, the customer choices are assumed
qitss = Prtransition from s to s
to be conditionally independent. Thus, given relation-
exps is − ait is ship state s, we model the probability of a dichoto-
=
1 + exps is − ait is mous choice following the well-known binary logit
exps − 1is − ait is model,5
− (3)
1 + exps − 1is − ait is exp˜ 0s + xit s
mit s = s = 1 NS (5)
1 + exp˜ 0s + xit s
qitsNS = Prtransition from s to NS
where
expNS − 1is − ait is
= 1−
1 + expNS − 1is − ait is
(4) ˜ 0s is the state-specific coefficient for state s,
xit is a vector of time-varying covariates associated
for s ∈ 1 NS and s ∈ 2 NS − 1, with the choice of individual i at time t, and
where s is a vector of state-specific response coefficients.
is is a vector of parameters capturing the effect The full vector of conditional choice probabilities is
of relationship encounters of individual i on mit = mit 1 mit 2 mit NS .
the propensity for transition from state s, The difference between the vectors of covariates
ait is a vector of time-varying covariates for ait to be included in the transition matrix and in
individual i between time t − 1 and time t, the vector of covariates in state-dependent choice xit
and is noteworthy. The conceptual distinction is between
s is is the s ordered logit threshold for individ- those variables that have an enduring impact on the
ual i in state s, where s ∈ 1 NS − 1. attitude of the customer toward the product or ser-
vice and those that affect only the short-term choice
Note that the marginal effects of the time-varying behavior. For example, advertising is often assumed
covariates in the transition matrix are state spe- to have a long-term impact on attitude while a price
cific, thus allowing for different impacts of the time- promotion affects only the short-term choice behavior.
varying covariates depending on the customer’s state. In the transition matrix, one should include covariates
3.2.2. The Initial State Distribution. For an that are hypothesized to have an enduring impact on
HMM with time homogeneous transition matrix, the customer’s buying behavior (e.g., advertisement,
the initial state distribution is commonly defined as service encounters, or relationship-based marketing
the stationary distribution of the transition matrix activities). On the other hand, the covariates in the
(MacDonald and Zucchini 1997). However, because state-dependent choice vector are assumed to primar-
our transition matrix is a function of time-varying ily have an immediate effect on the customer (e.g.,
covariates, we calculate the stationary distribution of price and display promotions).6 The researcher could
the transition matrix by solving the equation i = also test empirically (e.g., using fit measures) the
i , under the constraint NS
i Q
s=1 is = 1, where Qi is appropriate location for each covariate in the model.
the transition matrix with the parameter estimates fol- To ensure identification of the states, we restrict the
lowing §3.2.1 and all covariates are set to their mean choice probabilities to be nondecreasing in the rela-
value across individuals and time periods.4 Generally, tionship states. Because both the intercepts and the
the stationarity conditions above do not guarantee
5
In the context of alumni university donations, we model the state-
4
An alternative approach would be to take the stationary distribu- dependent and dichotomous choices rather than donation amounts
tion of the transition matrix with the covariates set to zero. For the because we believe that the act of choice is a stronger determinant
empirical application in §4, the two approaches yielded very simi- of relationship strength than quantity measures. Nevertheless, the
lar results. In general, one should use the stationary distribution of model could be extended to capture quantity or amount outcomes
the transition matrix with the covariates set to zero if the data set using Poisson or Tobit models.
6
is not left truncated (i.e., we observe the initial interaction between The covariate in the state-dependent choice vector could still have
the customer and the firm), and the stationary distribution of the an indirect long-term effect through the longitudinal effect of the
transition matrix at the mean of the covariates otherwise. current choice on future buying behavior.
Netzer, Lattin, and Srinivasan: A Hidden Markov Model of Customer Relationship Dynamics
Marketing Science 27(2), pp. 185–204, © 2008 INFORMS 191
response parameters are state-specific, we impose this Using our notations for the three components of the
restriction at the mean of the vector of covariates, xit . HMM in Equations (1)–(5), we can rewrite Equa-
Thus, the vector xit is mean-centered and the restric- tion (7) as
tion ˜ 01 ≤ ˜ 02 ≤ · · · ≤ ˜ 0NS is imposed by
Pi Yi1 = yi1 YiT = yiT
s
NS
˜ 0s = 01 + exp0s s = 2 NS (6)
NS NS
T
and diffuse priors on and . We sequentially draw of the alternative models in terms of their predic-
from the set of conditional posterior distributions.8 tion ability on a hold-out sample, and a survey-based
The conditional posterior distributions of i and do analysis of the behavioral dimensions underlying the
not have a closed form. Thus, we use the Metropolis- alumni-university relationship states.
Hasting algorithm to draw from these posterior dis-
tributions. To reduce the degree of autocorrelation 4.1. Application to Alumni Relations
between draws of the Metropolis-Hasting algorithm To empirically illustrate the ability of the proposed
and to improve the mixing of the MCMC we use model to capture the dynamics in customer relation-
an adaptive Metropolis adjusted Langevin algorithm ships and choice behavior, we apply the proposed
(Atchadé 2006). We found this recent approach to be HMM in the context of university-alumni relations
very useful in our HMM application. and gift-giving (i.e., donation) behavior. The objec-
It should be noted that unlike the HMM Bayesian tive of the empirical application is to show how
estimation methods that augment the latent state one can use observed alumni gift-giving data to
memberships (Djuric and Chun 2002, Kim and Nelson (1) dynamically classify the alumni into relation-
1999, Moon et al. 2007, Scott 2002), we use a Bayesian ship strength states, (2) understand the factors that
approach to estimate directly the likelihood function influence the dynamics in gift-giving behavior, (i.e.,
in Equation (8) with random-effect parameters. assess what alumni-university interactions are likely
to move alumni to a higher relationship state), and
3.5. Recovering the State Membership (3) predict future gift-giving behavior.
Distribution There are several reasons for choosing the alumni
An attractive feature of the HMM is the ability to gift-giving data as an empirical application for our
use it to probabilistically recover the individual’s state model. First, we believe that the dynamics in the
at any given time period. This measure could be gift-giving behavior, due to the strong relationship
directly derived from the likelihood function in Equa- underlying its construct, is stronger than the dynam-
tion (8). Two approaches have been suggested for ics found in typical scanner panel data. Second, this
recovering the state membership distribution: “filter- data set contains most of the components suggested
ing” and “smoothing” (see Hamilton 1989 for a dis- for a good CRM data set. Specifically, it includes four
cussion). Filtering uses only the information known out of the five elements suggested by Winer (2001):
up to time t to recover the individual’s state at time t, transaction data, customer contacts, descriptive infor-
whereas smoothing uses the full information available mation about the customers, and longitudinal data.
in the data. The filtering approach is more appealing As discussed further below, this data set is somewhat
for marketing applications, where decisions are made limited in terms of tracking exposure and response to
based only on the history of the observed behavior. marketing activities. As a result, we use other time-
The filtering probability that individual i is in state varying covariates that may affect the alumni state
s at time t conditioned on the individual’s history of such as reunion attendance. Finally, in the sagging
choices is given by economy, private and public schools face severe finan-
cial problems. Charitable contributions to universities
P Sit = s Yi1 Yi2 Yit dropped in 2002 for the first time in 15 years (New
i1 Qi 1→2 m
= i m
it s /Lit
i2 Qi t−1→ts m (9) York Times 2003). Therefore, addressing the problem
of managing the $24 billion market of U.S. alumni
where fundraising is of significant financial consequence.
Previous research on alumni gift-giving behavior
Qi t−1→ts is the sth column of the transition matrix
is limited. The few articles that have been published
Qi t−1→t , and
in this area investigated the effect on alumni gift
Lit is the likelihood of the observed sequence
giving of (1) institutional characteristics (Baade and
of choices up to time t from Equation (8).
Sundberg 1996, Harrison et al. 1995), (2) reunions
(Willemain et al. 1994), and (3) individual character-
4. Empirical Application istics such as demographics, financial aid, and partic-
In this section, we describe the empirical application ipation in college athletics (Okunade 1993, Okunade
of the proposed HMM in the context of the rela- and Berl 1997, Taylor and Martin 1995). Recently, in
tionship between alumni and their alma mater. We the marketing literature, Arnett et al. (2003) related
describe, in order, the data set, the alternative mod- alumni-university relations to the behavioral identity
els estimated, the estimation results, the comparison salience model. To our knowledge, no study has pre-
viously investigated and modeled the dynamics of
8
The complete set of conditional distributions and the iteration gift-giving behavior and the factors that can alter this
sequence of the MCMC simulation appear in Appendix A. dynamic behavior.
Netzer, Lattin, and Srinivasan: A Hidden Markov Model of Customer Relationship Dynamics
Marketing Science 27(2), pp. 185–204, © 2008 INFORMS 193
relational marketing encounters, a linear time trend, The distinction between the HMMs and the re-
representing the number of years since graduation, is cency-frequency model is noteworthy. The recency-
included in the state-dependent vector. Additionally, frequency model accounts for dynamics in an ad-hoc
the data set contains alumni characteristics such as fashion in which the structure of the effect of past
year and major of graduation, gender, marriage to a choices on the current choice is defined a priori through
university alumna/alumnus, and membership in the the recency and frequency terms. The HMM offers
alumni association. a more behaviorally structured approach for state
dependence through a Markovian transition between
4.4. Estimated Models a set of relationship states. Furthermore, although
We estimated the HMM of customer relationship dy- some structure is imposed on the transitions between
namics described in §3, using the MCMC hierarchical the states, the choice of the number of states allows
Bayes estimation described above. Additionally, we us to determine the structure of dynamics based on
estimated a restricted version of this model, with no the complexity of the dynamics of the data at hand.
heterogeneity, as well as a nondynamic benchmark
4.5. Estimation Results
model and the state dependent recency-frequency
Models 2 and 3 were estimated using the maximum
model commonly used in relationship marketing.
likelihood procedure MAXLIK in the GAUSS statis-
Model 1—This is the full HMM.10 tical software. Models 1 and 4 were estimated using
Model 2—HMM with no heterogeneity. This model a MCMC hierarchical Bayes procedure, using the
is similar to Model 1, but with common parameter Gibbs sampler and the Metropolis-Hastings algorithm
across individuals. A limitation of this model is that coded in GAUSS. In the hierarchical Bayes estima-
this model cannot distinguish between heterogeneity tion, the first 90,000 iterations were used as a “burn-
and dynamics in the gift-giving behavior. in” period, and the last 10,000 iterations were used
Model 3—Nondynamic model. This is the latent to estimate the conditional posterior distributions and
class model (Kamakura and Russell 1989) with the moments. To asses convergence of the MCMC we
same number of latent states as in Model 1. This adopt the method proposed by Gelman and Rubin
model does not allow individuals to move between (1992), which compares the within to between vari-
the states over time. ance for each parameter estimated across multiple
Model 4—Recency-frequency model. This model is chains. Across three parallel chains, the scale reduc-
frequently used in relationship management applica- tion estimate for all the parameters estimated is
tions (e.g., Bult and Wansbeek 1995). In this model, lower than 1.2, suggesting that convergence has been
the recency since the last donation and frequency achieved.
of donations up to time t enter as covariates in the 4.5.1. Selecting the Number of States. The first
model.11 In the recency-frequency model, the proba- stage in estimating the HMM is selecting the num-
bility of choice follows the binary logit formulation: ber of states. Model selection measures can be used
expxit i to choose the number of states. Due to the sensitiv-
P Yit = 1 = (10)
1 + expxit i ity of some Bayesian model selection criteria, such as
where the marginal log-likelihood and Bayes factor, to the
i is a vector of random-effect parameters, and specified priors (Rossi and Allenby 2003), we compare
xit is a vector of time-varying covariates. alternative model selection criteria. Specifically, we
The vector xit includes all the covariates used in contrast the Bayes factor with the log-marginal den-
the HMM (reunion years, reunion participation, vol- sity,12 the marginal validation log-likelihood measure
(Andrews and Currim 2003), the deviance informa-
unteering to university roles and years since grad-
tion criterion (DIC) proposed by Spiegelhalter et al.
uation) as well as recency since last donation and
(2002), and the Markov switching criterion (MSC),
donation frequency. We incorporate heterogeneity in
recently developed for HMMs’ states and variables
this model by estimating a random-effect intercept
selection by Smith et al. (2005).13 The validation log-
and the recency-frequency parameters using hierar-
likelihood is calculated following Equation (8), using
chical Bayes estimation.
12
The Schwarz Bayesian information criterion (BIC; Schwarz 1978)
10
In addition to the full HMM, we also estimated a hierarchical commonly used for model selection in classic statistical applica-
HMM in which the random-effect coefficients are a function of indi- tions, asymptotically approximates the Bayesian posterior marginal
vidual characteristics (see Table 6). density.
11 13
We do not model the monetary donation amounts to be consistent The MSC was originally developed for a stationary, aggregate
with the proposed HMM. Moreover, adding the running average data HMM. We adapted this criterion to our random-effect non-
of donation amounts as a predictor did not significantly improve stationary HMM. The details of the modified MSC appear in
the model’s fit or prediction for donation incidents. Appendix B.
Netzer, Lattin, and Srinivasan: A Hidden Markov Model of Customer Relationship Dynamics
Marketing Science 27(2), pp. 185–204, © 2008 INFORMS 195
Table 2 Choosing the Number of States Table 3 Estimation Results for the Hidden Markov Models
t t t
several time periods and therefore should be affected state, by moving them away from the “sticky” dor-
by reunion attendance. mant state and by increasing the likelihood of a tran-
It is important to note that the results presented sition to the relatively “sticky” active state. We use the
in Table 4 are the mean of the posterior distribu- posterior mean of the transition matrix parameters to
tion across individuals. Inference from our estimation gain better understanding of the long-term effect of
could and should be derived at the individual level the time-varying covariates in the transition matrix.
in a similar manner. Specifically, we calculate the effect of reunion atten-
The time-varying covariates reunion attendance and dance and volunteering over the course of 20 years
volunteering are endogenous to the alumna/alumnus, for an alumna/alumnus with an average initial state
thus one should be careful about treating these as deci- distribution at time 0 (average donation rate of 41%)
sion variables in the context of our data set. Never- and attended a reunion and/or volunteered to a uni-
theless, given the observed behavior, one could still versity role at year 1.
use these alumni-university interactions to dynam- As is evident from Figure 2, both time-varying
ically segment alumni to the relationship states. covariates have a long-term impact on the propensity
Table 4 demonstrates the value of adding time- to donate. The immediate as well as long-term effects
varying covariates such as customer initiated interac- of reunion attendance are stronger than the effects
tion or marketing interventions. Future research could of volunteering to a university role. Five years after
explore the application of our model to data sets in the reunion attendance or volunteering year, approx-
which the time-varying covariates are also decision imately 50% of the effect of these events still carries
variables. over.
The transition probabilities in Table 4 suggest that One interesting product of the transition matrices
reunion attendance might have an enduring impact in Table 4 is the stationary distribution of the tran-
on the donation behavior of alumni in the occasional sition matrix at the mean of the covariates, which is
60 Baseline
55
50
45
40
35
30
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
Years
Netzer, Lattin, and Srinivasan: A Hidden Markov Model of Customer Relationship Dynamics
Marketing Science 27(2), pp. 185–204, © 2008 INFORMS 197
also our initial state distribution. The stationary dis- Table 5 Mean Posterior Transition Following a Donation at
tribution of the alumni in the three states is 46%, 29%, Time t
and 25% in the dormant, occasional, and active states, Donation at t
respectively.
In the HMM, following each choice, the model up- t
dates the state membership distribution. This allows t −1 Dormant Occasional Active
estimating the effect of a donation at time t on the
Dormant 53% 47% 0%
probability of being in state s at time t and therefore [47%–58%] [42%–53%] [—]
on the probability of donation at time t + 1 (see Table Occasional 1% 48% 51%
5). Using Equation (11) we calculated the state mem- [1%–2%] [43%–51%] [47%–55%]
bership probability at time t following a donation at Active 0% 16% 84%
time t. [0%–0%] [14%–18%] [71%–86%]
500
Frequency
400
300
200
100
0
0 10 20 30 40 50 60 70 80 90 100
(%)
Netzer, Lattin, and Srinivasan: A Hidden Markov Model of Customer Relationship Dynamics
198 Marketing Science 27(2), pp. 185–204, © 2008 INFORMS
4.6. Predictive Ability to that of the full HMM. This is not surprising because
We use the hold-out data to assess the prediction abil- in Model 2 heterogeneity in the hold-out period is
ity of the HMM and compare it to the four bench- elicited from the history of choices in the calibration
mark models.15 The parameters estimated based on period even when the model’s parameters do not vary
the calibration period are used to predict the 1,256 across individuals. Consequently, in Model 2 one can-
(alumni) × 3 (years) = 3 768 observations of possi- not distinguish between the effect of time dynamics
ble gift giving in the validation period. We compare and cross-individual heterogeneity.
the prediction ability of the alternative models using One of the major advantages of the HMM in the
the hit rate measures16 (overall hit rates as well as context of customer relationships is the ability to dy-
hits and misses of donation and nondonation peri- namically segment the firm’s customer base. An alter-
ods), the root-mean-square prediction error (RMSPE) native, static segmentation approach is the latent class
between the predicted choice probabilities and the model (Model 3). In Table 7, the HMM segmenta-
actual choices across alumni and time periods, and tion provides better donation predictive validity and
the validation log-likelihood (see Table 7). fit relative to the latent class model. To investigate
The RMSPE of each model is compared to the further the difference between the dynamic and the
RMSPE of a random choice rule. The random choice static segmentation models, we analyzed the predic-
rule’s RMSPE was calculated based on the aggregate tive ability of the two models for alumni for whom
donation probability in the calibration period (43.0%).
the two segmentation methods coincide and alumni
The random choice rule’s RMSPE is 0.495. The pre-
for whom they diverge. Specifically, for the HMM we
diction ability of all models is significantly better than
classified each alumnus to the state with the high-
that of a random choice rule, based on the RMSPE.
est state membership probability in the last year of
The HMM predicts the hold-out choices signifi-
cantly better than the nondynamic latent class model
(z = 95; p-value < 0001) and the dynamic, observed Table 7 Fit and Predictive Ability Measures
states, recency-frequency model (z = 51; p-value <
Model
0001). The improvement in prediction ability of the
HMM relative to the alternative models is consis- Model 4
tent across all prediction measures used. The recency- Model 1 Model 2 Model 3 recency-
HMM with HMM no nondynamic frequency
frequency model seriously under-predicts donation Measure heterogeneity heterogeneity latent class model
years, which are arguably more important to predict
than nondonation years. Calibration: −2 log 25,397.1 25,565.8 27,399.8 26,045.1
marginal density/
While the fit of the HMM with heterogeneity −2 log likelihood
(Model 1) is better than that of the HMM with no Overall hit rate (%) 77.6 77.3 67.9 72.5
heterogeneity (Model 2), the prediction ability of the Donation hit rate (%) 78.8 78.0 67.2 67.7
parsimonious HMM with no heterogeneity is similar Non donation 76.5 76.6 68.5 76.9
hit rate (%)
RMSPE 0.392 0.393 0.448 0.426
15
The parameter estimates of the recency-frequency model, and the Improvement over the 20.7 20.6 9.4 14.0
latent class model can be obtained from the authors. random RMSPE (%)
16 Validation −2 log 3,583.6 3,597.0 4,484.2 4,148.6
To estimate hit rates, we predicted a donation if the predicted
likelihood
probability of donation is larger than 50%.
Netzer, Lattin, and Srinivasan: A Hidden Markov Model of Customer Relationship Dynamics
Marketing Science 27(2), pp. 185–204, © 2008 INFORMS 199
Table 8 Predictive Ability by Segmentation Method behavior through attitudinal bonds such as commit-
ment, identity salience, self-connection, and satisfac-
Latent class HMM hit
Segment allocation N hit rates (%) rates (%) tion (e.g., Arnett et al. 2003, Fournier 1998, Morgan
and Hunt 1994). To explore the underlying attitudi-
Dormant in HMM but 142 420 772 nal dimensions of the three relationship states, we
not in latent class
Dormant in both 393 824 819
complement the observed behavior data with survey-
Occasional in HMM but 123 550 629 based data. This approach is consistent with the call
not latent class of Gupta and Zeithaml (2006) for incorporating attitu-
Occasional in both 154 526 682 dinal and perceptual constructs with behavioral out-
Active in HMM but not 216 586 767 come models.
in latent class
In the years 1998, 2000, and 2002, the alumni asso-
Active in both 228 852 854
ciation conducted a survey that measured alumni
Total 1256 679 776
engagement and attitude towards the university. Over
1,600 randomly sampled alumni were surveyed. The
vast majority of these alumni were surveyed only in
the calibration sample. The latent class solution sug-
one of the three years. The questionnaire included
gested three segments that are similar to the three
questions about the relationship between alumni and
HMM states in terms of donation propensity (dor-
the university in dimensions such as satisfaction,
mant 14%, occasional 45%, active 80%). We classified
emotional connection, pride and others. Of the total
alumni into these three latent class states based on the
survey sample, 128 surveys matched our original data
highest probability rule. Table 8 compares the predic-
set of 17,000 alumni. For this subset of 128 alumni, we
tive ability of the HMM and the latent class models
calibrated the HMM (Model 1) and calculated, using
to predict the hold-out donation years of alumni that
Equation (9), the alumni probability of membership
were classified to the same state/segment by the two in each of the three relationship states in the year of
methods and alumni for which the segmentation of the survey.17 We then assigned each alumna/alumnus
the two methods departed. to the relationship state with the highest probability.
The predictive ability of the latent class model is Table 9 compares the mean responses to the relation-
similar to that of the HMM for the alumni that were ship questions across alumni in the three relationship
classified to the dormant or active segment/state by states.
both models. However, for alumni that were classified On all relationship measures, except affinity to the
by the latent class model into a different segment than graduating class, the average ratings toward the uni-
the one suggested by the HMM, the predictive ability versity are increasing in the states from dormant
of the latent class model was poor. Additionally, the to occasional to active. The last column in Table 9
HMM outpredicted the latent class model for alumni presents the ANCOVA of the different relationship
in the more transient occasional state. This analysis measures on the states’ membership with lagged
might suggest that the HMM segmentation matches donation as a covariate. Controlling for the lagged
the “true” alumni segmentation better than the static choice provides a more conservative estimate of the
latent class model. difference in the relationship measures between the
In summary, the previous sections demonstrated states over and beyond the impact of an immediate
the insights one could gain from including time- donation.
varying covariates in the states’ transitions. The This analysis revealed that even after controlling
hold-out sample analysis suggests that an additional for the effect of typical state dependence, alumni in
advantage of the relationship HMM is in improving the active and occasional states had stronger feelings
the ability to predict future choices over the static and toward the university than those in the dormant state.
observed state models. Specifically, the difference was significant in terms
of the emotional connection to the university, affin-
4.7. Behavioral and Attitudinal Dimensions ity with the graduating class, the feeling of responsi-
To broaden the applicability of the HMM to most bility toward the university, the perception of being
common CRM data sets, the model uses only valued by the university, and the likelihood of rec-
observed behavioral measures (such as choice) to ommending the university to others. Indeed, it has
elicit the relationship states. This could raise con- been suggested in the relationship marketing litera-
cerns that the relationship states in the HMM rep- ture that these dimensions of commitment, respon-
resent merely differences in the likelihood of choice sibility, and self-connection (Fournier 1998, Morgan
rather than true differences in relationship reflected
by attitude. It has been suggested that repeated trans- 17
Because the gift-giving data ended at the end of 2001, the rela-
actions could be transformed into “true” relational tionship state in 2001 was used for the survey of 2002.
Netzer, Lattin, and Srinivasan: A Hidden Markov Model of Customer Relationship Dynamics
200 Marketing Science 27(2), pp. 185–204, © 2008 INFORMS
ANCOVA
Question Scale Dormant Occasional Active P -values
How satisfied are you with your university experience? 1–5 451 475 480 0220
How strong is your feeling about the university? 1–5 431 450 460 0098
Do you feel proud of your university degree? 1–4 347 362 369 0833
Do you feel your university experience helped shape your life? 1–4 289 324 343 0069
Do you feel a strong emotional connection to the university? 1–4 272 314 322 0022∗
Do you feel a strong responsibility to help the university? 1–4 228 266 303 0015
Do you feel strong affinity with your graduating class? 1–4 194 252 226 0009
How strongly would you recommend the university 1–4 335 367 374 0042
to prospective students?
How good of a job is the university doing in 1–4 274 293 296 0223
serving your needs as an alum?
To what extent do you feel the university values its alumni? 1–3 225 241 255 0047
Do your parents/grandparents have a degree Yes/No 19% 18% 12% 0109
from this university?
Did you receive financial aid from the university as a student? Yes/No 40% 40% 39% 0557
Median lifetime donation (from the actual donation data) $100 $475 $1,382
Sample size (N) 64 29 35 128
∗
Bold numbers reflect significant differences (at the 5% level) between the three relationship states.
and Hunt 1994) are important factors of customer- and the transitory, occasional state, while the effect
firm relational behavior. of volunteering is stronger on alumni in the “sticky”
These attitudinal measures give behavioral support dormant and active states. Both of these covariates
to the projection from observed donation behavior to have a long-term impact on the donation behavior. In
latent states of relationship. Furthermore, the high rat- terms of prediction ability, we demonstrate that, for
ings for positive word-of-mouth among alumni in the our empirical application, the proposed model pre-
active and occasional states suggest that alumni in dicts future donations significantly better than the
these states do not only have a higher propensity to static latent class model and the dynamic, observed
donate, but are also more likely to be active on other states, recency and frequency model. Additionally, we
dimensions. Indeed, Arnett et al. (2003) used actual use the hold-out sample to demonstrate that HMM
donation and word-of-mouth to measure relationship dynamic segmentation provides superior segmenta-
marketing success in the context of alumni-university tion to the static latent class segmentation. More
relationships. generally, the empirical application demonstrates the
value of the proposed model for CRM marketers. The
5. General Discussion HMM enables marketers to use buyer behavior data
In this paper, we use data on alumni gift-giving be- to dynamically segment their customers into relation-
havior to estimate a hidden Markov model of relation- ship states and estimate the evolution of customer
ship dynamics. The HMM, which was estimated using relationships over time.
a hierarchical Bayes MCMC procedure to account for The proposed model extends the marketing litera-
observed and unobserved heterogeneity, offers several ture by suggesting a Markovian framework for esti-
insights into the drivers of these dynamics. mating the dynamics in customer relationships. Using
The main contribution of this research is in suggest- the nonhomogeneous HMM, one can estimate the
ing a behaviorally grounded model that helps mar- long-term impact of customer-firm interactions on
keters to infer the underlying structure of relationship customer relationships, an issue that has been largely
states. Using the model, the researcher can dynam- neglected in the customer relationship literature.
ically classify customers into the relationship states Methodologically, the proposed model extends the
and assess the dynamic effect of alternative time- observed states models, such as the state-dependence
varying covariates on the transition between the rela- and recency-frequency models by offering a latent
tionship states and consequent buying behavior. specification of state dependence, which incorporates
The empirical application to the problem of uni- the effect of time-varying covariates into the state-
versity-alumni relationships demonstrates the use of dependence structure. The ability to determine the
the model to a dynamic relationship problem. Exam- number of states based on the data at hand relaxes
ining the time-varying covariates in the transition the ad-hoc specification of dynamics in the observed
matrix, we find that the impact of reunion atten- state models. The proposed model also extends the
dance is the strongest for alumni in the dormant family of HMMs. To our knowledge, this is the first
Netzer, Lattin, and Srinivasan: A Hidden Markov Model of Customer Relationship Dynamics
Marketing Science 27(2), pp. 185–204, © 2008 INFORMS 201
nonhomogeneous HMM in the marketing literature such as service encounters, satisfaction, and action-
that investigates the impact of time-varying covari- able behavior (Bolton 1998). Future research could
ates, such as customer-firm interactions, on the tran- also investigate an HMM with multivariate outcomes
sition between the latent states. Indeed, Wedel and such as donation amounts, donation incidence and
Kamakura (2000, p. 176) point out that the issue of participation in university events, as multiple modes
nonstationarity in marketing segmentation in gen- of behavior that determine the customer relationship
eral, and specifically nonstationarity that could be state.
related to time-varying covariates, has received lim- To summarize, we believe that from the modeling
ited attention. perspective, we have provided CRM practitioners
The ability to incorporate time-varying covariates with an implementable model for evaluating cus-
in the transition matrix opens the opportunity to tomer relationships, their evolution over time, and the
investigate which marketing activities are most effec- effect of customer-firm interactions on altering this
tive in building customer-firm relationships and driv- evolution. These factors are necessary to transform a
ing actionable behavior, and to determine the optimal CRM system from an information system into a deci-
targeting of such marketing activities. Future research sion support system. This research takes a fundamen-
could investigate this issue using a data set that incor- tal step in that direction.
porates both exposure to marketing activities and
observed buying behavior.
Acknowledgments
It is worthwhile to distinguish between the pro-
The authors thank Amy Paulsen and the alumni associa-
posed nonhomogeneous HMM, which incorporates tion members for help with the data, and Michaela Dra-
time-varying covariates in the transition matrix, and ganska, Ricardo Montoya, Olivier Toubia, and participants
a nonstationary HMM (Djurić and Chun 2002), which in seminars at Carnegie Mellon University, Chicago Busi-
models the transition probabilities as a function of the ness School, Columbia University, Hebrew University at
state duration. Because our main interest is in inves- Jerusalem, Hong-Kong University of Science and Technol-
tigating the effect of customer-firm interactions on ogy, London Business School, New York University, North-
HMM dynamics, we adopt the first approach. How- western University, Stanford University, The Interdisci-
ever, future research could investigate the value of the plinary Center at Hertzelia, The Israeli Institute of Technol-
additional flexibility offered by nonstationary HMMs ogy, University of Maryland, University of Texas at Austin,
in marketing applications. University of Texas at Dallas, and University of Toronto
To broaden the applicability of the proposed model for helpful comments and suggestions. The authors are also
to most common CRM data sets, we used only grateful for the support of the Marketing Science Institute
observed buying behavior to elicit the relationship through the Alden G. Clayton Dissertation Competition.
states. Using survey data, we show that the relation-
ship states, which were estimated based on observed Appendix A. Hierarchical Bayes Estimation
Algorithm
buying behavior, are also different in terms of behav-
The parameters in our model could be divided into two
ioral dimensions of relationship, such as satisfac- groups: (1) parameters that vary across individuals (ran-
tion, emotional connection, and responsibility. Future dom-effect parameters) and (2) parameters that do not vary
research could use longitudinal survey data to esti- across individuals (fixed parameters). The set of parameters
mate the relationship states. Estimating the dynamics in each group is determined by the model’s heterogeneity
in relationship states directly from attitudinal vari- specification as described in §§3.2.4 and 3.4.
ables would be helpful in providing insight into the We denote by i the set of random-effect parameters and
mediating factors of the connection between observed by the set of fixed parameters. For the HMM described in
buying behavior and the relationship states. §3, the random-effect parameters includes: i = s i1
To increase the external validity of the model, it s iNS−1 , s ∈ 1 NS. The vector includes: =
would be constructive to investigate the application 1 2 NS 01 02 0NS 1 2 NS .
of this model in relationship marketing contexts other Observed individual characteristics, such as demograph-
than university-alumni relationships. Some possible ics, can be introduced into the model in a hierarchical
applications are other institutional gift giving; con- manner (Allenby and Ginter 1995). Thus, the vector of het-
erogeneous parameters (i ) can be written as a function of
tinuous service provider relationships, which face
observed and unobserved individual characteristics
high churn rates such as banks and telephone carri-
ers (Bolton 1998); institutional memberships (Thomas i = • Zi + i (A1)
2001); direct selling efforts; and dynamic brand
choice problems. The HMM framework could also be where Zi is a vector of individual characteristics for indi-
extended to study alternative measures of customer vidual i, is a matrix of parameters, relating the individ-
relationships. For example, one could explore the ual characteristics to the random-effect parameters (i ), and
connection between alternative relationship measures !i" ∼ N 0 .
Netzer, Lattin, and Srinivasan: A Hidden Markov Model of Customer Relationship Dynamics
202 Marketing Science 27(2), pp. 185–204, © 2008 INFORMS
where Yi , Xi , and ai are the vectors of donations, covariates where *0 and V*0 are diffused priors, and LY is the like-
in the state-dependent choice, and covariates in the transi- lihood function from Equation (8). Because (A5) does not
tion matrix, respectively, as described in §4.3. have a closed form, the M-H algorithm is used to draw from
(1) Generate i the conditional distribution of . The acceptance probability
at step k + 1 is defined by
f i Yi Xi ai Zi
Pracceptance
∝ N i Zi LYi
exp−1/2k+1 −0 V−10 k+1 −0 LY k+1
∝ −1/2 exp−1/2i − Zi −1
i − Zi LYi (A2) = min 1
exp−1/2k −0 V−10 k −0 LY k
where L(Yi ) is the likelihood function from Equation (8).
Because (A2) does not have a closed form, the Metropolis- (A6)
Hastings algorithm is used to draw from the conditional dis-
We define diffuse priors for the conditional distribution of
tribution of i . The Metropolis-Hastings algorithm proceeds
k by setting *0 to a n* × 1 vector of zeros, and v*0 = 30In* ,
as follows: Let’s define i as the accepted draw of i in iter- where n* = dim.
k+1
ation k and i as the draw in iteration k + 1. Then, the
k+1 k
sequence of draws is given by i = i +
, where
is Appendix B. Adapting the Markov Switching
2
a draw from N 0 % &, and % and & are chosen adaptively Criterion to Our Estimation Algorithm
to reduce the autocorrelation among the MCMC draws, with Recently, Smith, Naik and Tsai (2006; hereafter SNT) devel-
an acceptance rate of approximately 20%, following Atchadé oped the Markov switching criterion (MSC) for HMMs’
(2006). states and variables selection. The MSC as described by
k+1
The probability of accepting i is SNT was developed for a HMM with a stationary transi-
Pracceptance tion matrix, estimated using aggregate data and common
k+1 k+1
parameters across individuals. In contrast, our HMM is esti-
k+1
exp −1/2 i − Zi −1 i − Zi L Yi i mated using individual-level data, with some of the param-
= min k k
k
1
exp −1/2 i − Zi −1 i − Zi
L Y i i eters varying across individuals. Additionally, we allow the
transition matrix to be a function of time-varying covari-
(A3)
ates, thus relaxing the assumption of stationary transition
(2) Generate matrix commonly assumed in HMMs. Therefore, to apply
Define v = vec . the MSC as a criterion for choosing the number of states,
we need to adapt it to our specific application.
v i Z = MVN un Vn
Following Equation (15) in SNT the MSC is given by
where
Vn = Z Z ⊗ −1 −1 −1
NS
Ts Ts + ,s K
+ V0 MSC = −2 logf Y ˆ i + (B1)
un = Vn Z ⊗ −1 ∗
+ V0−1 u0
s=1 .s Ts − ,s K − 2
Z = Z1 Z2 ZN is an N × nz matrix of covariates,
= 1 2 N is an N × n" matrix which stacks i }, where
∗ = vec , −2 logf Y ˆ i is the maximized log-likelihood value,
s , W
Ts = traceW s = diagPs , and Ps = probS1 = s
V0 and u0 are prior hyperparameters,
n" = dimi , and probST = s,
nz = dimzi . ,s = NS; following SNT recommendation, where NS is
We define diffuse priors be setting u0 to a n" ·nz×1 vector the number of states,
of zeros, and V0 = 100In·nz . .s = 1; following SNT recommendation, and
(3) Generate K is the number of covariates in the state-dependent
vector.
i Z To adapt the MSC to our application we need to modify
the different components in (B1).
N
∼ IWn" f0 + N G−1 The first term in Equation (B1), −2 logf Y ˆ i , cap-
0 + i − Zi i − Zi (A4)
i=1 tures the model’s fit. Because our model is estimated using
where f0 and G0 are prior hyperparameters, f0 is the an MCMC hierarchical Bayes estimation procedure, one
degrees of freedom, and G0 is the scale matrix of the inverse needs to define the procedure used to calculate the model’s
Wishart distribution. We define diffuse priors by setting f0 = fit measure using the simulation output. We adopt the de-
n" + 5, and G0 = In" . viance measure suggested by Spiegelhalter et al. (2002) as
Netzer, Lattin, and Srinivasan: A Hidden Markov Model of Customer Relationship Dynamics
Marketing Science 27(2), pp. 185–204, © 2008 INFORMS 203
the model fit component of the deviance information cri- Böckenholt, U., R. Langeheine. 1996. Latent change in recurrent
terion (DIC). We calculate the deviance as the expectation choice data. Psychometrika 61(2) 285–302.
over the estimates at each iteration of the MCMC simula- Bolton, R. N. 1998. A dynamic model of the duration of the cus-
tion. Thus, tomer’s relationship with a continuous service provider: The
role of satisfaction. Marketing Sci. 17(1) 45–65.
J N
j Bucklin, R. E., J. M. Lattin. 1991. A two-state model of purchase
−2 logf Y ˆ i = −2 logf Yi ˆ i j (B2) incidence and brand choice. Marketing Sci. 10(1) 24–39.
j=1 i=1
Bult, J. R., T. Wansbeek. 1995. Optmal selection for direct mail. Mar-
where keting Sci. 14 (4) 378–394.
J is the number of MCMC iterations used to obtain the Dekimpe, M. G., D. M. Hanssens. 2000. Time-series models in
posterior distributions after the burn-in period, marketing: Past, present and future. Internat. J. Res. Marketing
17(2–3) 183–193.
N is the number of individuals,
j Dillon, W. R., U. Böckenholt, M. S. de Borrero, H. Bozdogan, W.
ˆ i is the vector of parameter estimates for individual i in DeSarbo, S. Gupta, W. Kamakura, A. Kumar, V. Ramaswamy,
iteration j, and M. Zenor. 1994. Issues in the estimation and application of
j
f Yi ˆ i j is the likelihood of observing a sequence of latent structure models of choice. Marketing Lett. 5(4) 323–334.
choices following Equation (8). Djurić, P. M., J.-H. Chun. 2002. An MCMC sampling approach
The term Ts in (B1) captures the effective sample size at to estimation of nonstationary hidden Markov models. IEEE
state s. Accordingly, we define Trans. Signal Processing 50(5) 1113–1123.
Du, R., W. A. Kamakura. 2006. Household lifecycles and lifestyles
Ti
N in america. J. Marketing Res. 43(February) 121–132.
Ts = P Sit = s Dwyer, R. F., P. H. Schurr, S. Oh. 1987. Developing buyer-seller
i=1 t=1
relationships. J. Marketing 51(April) 11–27.
where P Sit = s is the probability that individual i is in Erdem, T., B. Sun. 2001. Testing for choice dynamics in panel data.
state s at time T . Thus, the only difference between our J. Bus. Econom. Statist. 19(2) 142–152.
formulation of Ts and that of SNT is the summation over Fader, P. S., J. M. Lattin. 1993. Accounting for heterogeneity and
individuals in our model. nonstationarity in a cross-sectional model of consumer pur-
chase behavior. Marketing Sci. 12(3) 304–317.
Finally, we need to count the number of additional
Fader, P. S., B. G. S. Hardie, C.-Y. Huang. 2004. A dynamic change-
parameters in each state K. Unlike SNT, we include covari- point model for new product sales forecasting. Marketing Sci.
ates both in the transition matrix and in the state-dependent 23(1) 50–65.
vector. Accordingly, in our application, K is the number Flanagan, J. C. 1954. The critical incident technique. Psych. Bull.
of covariates for each state in both the transition matrix 51(4) 327–359.
and in the state-dependent vector. Note that because the Fournier, S. 1998. Consumers and their brands: Developing
relationship theory in consumer research. J. Consumer Res.
parameters that capture the effect of the covariates are com- 24(March) 343–373.
mon across individuals we can simply “count” the number Frank, R. E. 1962. Brand choice as a probability process. J. Bus.
of parameters. If one estimates a full random-effect model 35(January) 43–56.
using hierarchical Bayesian approach, we recommend using Gelman, A., D. B. Rubin. 1992. Inference from iterative simulation
the effective number of parameters, pD of Spiegelhalter using multiple sequences. Statist. Sci. 7 457–511.
et al. (2002), divided by the number of states as a measure Greene, W. H. 1997. Econometric Analysis, 3rd ed. Prentice Hall,
Englewood Cliffs, NJ.
for K.
Guadagni, P. M., J. D. C. Little. 1983. A logit model of brand choice
Calibrated on scanner data. Marketing Sci. 2(3) 203–238.
Gupta, S., V. A. Zeithaml. 2006. Customer metrics and their impact
References on financial performance. Marketing Sci. 25(6) 718–739.
Aaker, J., S. Fournier, A. S. Brasel. 2004. When good brands do bad. Hamilton, J. D. 1989. A new approach to the economic analysis of
J. Consumer Res. 31(June) 1–18. nonstationary time series and the business cycle. Econometrica
Allenby, G. M., J. L. Ginter. 1995. Using extremes to design products 57(2) 357–384.
and segment markets. J. Marketing Res. 32(November) 392–403. Harrison, W. B., S. K. Mitchell, S. P. Peterson. 1995. Alumni dona-
Allenby, G. M., R. P. Leone, L. Jen. 1999. A dynamic model of tions and colleges’ development expenditures: Does spending
purchase timing with application to direct marketing. J. Amer. matter? Amer. J. Econom. Social. 54(October) 397–413.
Statist. Assoc. 93(446) 365–374. Heckman, J. J. 1981. Heterogeneity and state dependence. S. Rosen,
Andrews, R. L., I. S. Currim. 2003. A comparison of segment reten- ed. Studies in Labor Markets. University of Chicago Press,
tion criteria for finite mixture models. J. Marketing Res. 39(May) Chicago, 91–139.
235–243. Hinde, R. A. 1979. Towards Understanding Relationships. Academic,
Arnett, D. B., S. German, S. D. Hunt. 2003. The identity salience London.
model of relationship marketing success: The case of nonprofit Hughes, J. P., P. Guttorp. 1994. A class of stochastic models for
marketing. J. Marketing 67(2) 89–105. relating synoptic atmospheric patterns to regional hydrologic
Atchadé, Y. F. 2006. An adaptive version for the metropolis adjusted phenomena. Water Resources Res. 30(5) 1535–1546.
Langevin algorithm with a truncated drift. Methodology and Jain, D., S. S. Singh. 2002. Customer lifetime value research in mar-
Comput. Appl. Probab. 8(2) 235–254. keting: A review and future direction. J. Interactive Marketing
Baade, R. A., J. O. Sundberg. 1996. What determines alumni gen- 16(2) 34–46.
erosity? Econom. Ed. Rev. 15(1) 75–81. Jeuland, A. P., F. M. Bass, G. P. Wright. 1980. A multibrand stochas-
Böckenholt, U., W. R. Dillon. 2000. Inferring latent brand depen- tic model compounding heterogeneous Erlang timing and
dencies. J. Marketing Res. 37(February) 72–87. multinomial choice processes. Oper. Res. 28(2) 255–277.
Netzer, Lattin, and Srinivasan: A Hidden Markov Model of Customer Relationship Dynamics
204 Marketing Science 27(2), pp. 185–204, © 2008 INFORMS
Kahneman, D., D. T. Miller. 1986. Norm theory: Comparing reality Poulsen, C. S. 1990. Mixed Markov and latent Markov modeling
to its alternatives. Psych. Rev. 93(2) 136–153. applied to brand choice behavior. Internat. J. Res. Marketing 7(1)
Kamakura, W. A., G. J. Russell. 1989. A probabilistic choice model 5–19.
for market segmentation and elasticity structure. J. Marketing Rabiner, L. R. 1989. A tutorial on hidden Markov models and
Res. 26(November) 379–390. selected applications in speech recognition. Proc. IEEE 77(2)
Keane, M. P. 1997. Modeling heterogeneity and state dependence 257–286.
in consumer choice behavior. J. Bus. Econom. Statist. 15(July) Rabiner, L. R., B.-H. Juang. 1993. Fundamentals of Speech Recognition.
310–327. Prentice Hall, Englewood Cliffs, NJ.
Kim, C.-J., C. R. Nelson. 1999. State-Space Models with Regime Switch- Ramaswamy, V. 1997. Evolutionary preference segmentation with
ing: Classical and Gibbs Sampling Approaches with Applications. panel survey data: An application to new products. Internat. J.
M.I.T. Press, Cambridge, MA. Res. Marketing 14(1) 57–80.
Lewis, M. 2004. The influence of loyalty programs and short-term Reinartz, W., V. Kumar. 2003. The impact of customer relationship
promotions on customer retention. J. Marketing Res. 41(August) characteristics on profitable lifetime duration. J. Marketing 67(1)
281–292. 77–99.
Libai, B., D. Narayandas, C. Humby. 2002. Toward an individ- Rossi, P. E., G. M. Allenby. 2003. Bayesian statistics and marketing.
ual customer profitability model: A segment-based approach. Marketing Sci. 22(3) 304–328.
J. Service Res. 5(1) 69–76. Rust, R. T., T. S. Chung. 2006. Marketing models of service and
Liechty, J. C., M. Wedel, R. Pieters. 2003. The representation of local relationships. Marketing Sci. 25(6) 560–580.
and global scanpaths in eye movements through Bayesian hid- Rust, R. T., K. N. Lemon, V. A. Zeithaml. 2004. Return on marketing:
den Markov models. Psychometrika 68(4) 519–541. Using customer equity to focus marketing strategy. J. Marketing
MacDonald, I. L., W. Zucchini. 1997. Hidden Markov and Other Mod- 68(1) 109–127.
els for Discrete-Valued Time Series. Chapman and Hall, London. Schmittlein, D. C., R. A. Peterson. 1994. Customer base analysis:
McAlister, L., R. Srivastava, J. Horowitz, M. Jones, W. Kamakura, An industrial purchase process application. Marketing Sci. 13(1)
J. Kulchitsky, B. Ratchford, G. Russell, F. Sultan, T. Yai, 41–67.
D. Weiss, R. Winer. 1991. Incorporating choice dynamics in Schwarz, G. 1978. Estimating the dimension of a model. Ann.
models of consumer behavior. Marketing Lett. 2(3) 241–252. Statist. 6(2) 461–464.
Montgomery, A. L., S. Li, K. Srinivasan, J. C. Liechty. 2004. Pre- Scott, L. S. 2002. Bayesian methods for hidden Markov models,
dicting online purchase conversion using web path analysis. recursive computing in the 21st century. J. Amer. Statist. Assoc.
Marketing Sci. 23(4) 579–595. 97(475) 337–351.
Moon, S., W. A. Kamakura, J. Ledolter. 2007. Estimating promo- Smith, A., P. A. Naik, C.-L. Tsai. 2006. Markov-switching model
tion response when competitive promotions are unobservable. selection using Kullback-Leibler divergence. J. Econometrics
J. Marketing Res. Forthcoming. 134(2) 553–577.
Morgan, R. M., S. D. Hunt. 1994. The commitment-trust theory of Soukup, J. D. 1983. A Markov analysis of fund-raising alternatives.
relationship marketing. J. Marketing 58(3) 20–38. J. Marketing Res. 20(August) 314–319.
Morrison, D. G., R. D. H. Chen, S. L. Karpis, K. E. A. Britney. 1982. Spiegelhalter, D. J., N. G. Best, B. P. Carlin, A. van der Linde. 2002.
Modeling retail customer behavior at merrill lynch. Marketing Bayesian measures of model complexity and fit. J. Roy. Statist.
Sci. 1(2) 123–141. Soc. Ser. B 64(3) 583–639.
Naik, P. A., M. K. Mantrala, A. G. Sawyer. 1998. Planning media Srinivasan, V., R. Kesavan. 1976. An alternative interpretation of
schedules in the presence of dynamic advertising quality. Mar- the linear learning model of brand choice. J. Consumer Res.
keting Sci. 17(3) 214–235. 3(September) 76–83.
Netzer, O. 2004. A hidden Markov model of customer relationship Storm, S. 2003. After years of cash flow, universities hit an ebb. New
dynamics. Ph.D. dissertation, Stanford University, Stanford, York Times 13(March), 25.
CA. Taylor, A. L., J. C. Martin, Jr. 1995. Characteristics of alumni
Newton, M. A., A. E. Raftery. 1994. Approximate Bayesian inference donors at a research I public university. Res. Higher Ed. 36(3)
by the weighted likelihood bootstrap. J. Roy. Statist. Soc. B 56(1) 283–302.
3–48. Thomas, J. S. 2001. The importance of linking customer acquisition
Okunade, A. A. 1993. Logistic regression and probability of to customer retention. J. Marketing Res. 39(May) 262–268.
business school alumni donations: Micro-data evidence. Ed. Wedel, M., W. A. Kamakura. 2000. Market Segmentation: Concep-
Econom. 1 243–258. tual and Methodological Foundations. Kluwer, Dordrecht, The
Okunade, A. A., R. L. Berl. 1997. Determinants of charitable giving Netherlands.
of business school alumni. Res. Higher Ed. 38(April) 201–214. Willemain, T. R., A. Goyal, M. Van Deven, I. S. Thukral. 1994.
Oliver, R. L. 1997. Satisfaction: A Behavioral Perspective on the Con- Alumni giving: The influences of reunion, class and year. Res.
sumer. McGraw-Hill, New York. Higher Ed. 35(2) 201–214.
Pauwels, K., I. Currim, M. G. Dekimpe, E. Ghysels, D. M. Hanssens, Winer, R. S. 2001. A framework for customer relationship manage-
N. Mizik, P. Naik. 2004. Modeling marketing dynamics by time ment. California Management Rev. 43(Summer) 89–105.
series econometrics. Marketing Lett. 15(4) 167–183. Xie, J. X., M. Song, M. Sirbu, Q. Wang. 1997. Kalman filter esti-
Pfeifer, P. E., R. L. Carraway. 2000. Modeling customer relationships mation of new product diffusion models. J. Marketing Res.
using Markov chains. J. Interactive Marketing 14(2) 43–55. 34(August) 378–393.