Chapter 3
LEARNING MODELS
S. Sriram and Pradeep K. Chintagunta
Abstract
Choice models typically assume that agents know with certainty the utility they would derive from
various alternatives. Such an assumption is likely to be violated in instances where (a) the agent
is new to the context or (b) the choice set has new alternatives. Learning models specify a mechanism
by which consumers resolve uncertainty regarding products or their characteristics in such
turbulent contexts. In this chapter, we provide a critical review of the extant learning literature
in marketing and economics. We also discuss some avenues for future research in this area.
Introduction
The extensive literature on choice models has been built on the premise that consumers derive
utility from the various alternatives in their choice set and choose the alternative that gives them
the greatest utility. Typically, these models assume that consumers know the various components
of this utility with certainty. This would be the case for categories in which (a) consumers make
purchases regularly and (b) there are no new product introductions and consumers have been
participating in the category over an extended period of time. However, if consumers are either
new to the category or if the category experiences several new product introductions, this assumption
is likely to be violated. In such instances, consumers would perceive some uncertainty
in the utility they would derive from the various alternatives. Consequently, this would play a
role in their choice decision. Moreover, consumers may try to resolve this uncertainty by learning
about the utility they would derive from the products through various information sources such
as actual purchase and consumption of the product, advertising messages, and word-of-mouth
interactions with other consumers. Learning models specify the mechanism by which consumers
use these various sources of information to resolve their uncertainty. A choice model that formally
incorporates learning behavior would then be able to assess the relative efficacy of the different
information sources in aiding consumer learning.
The literature that considers consumer learning can be broadly classified into two streams. The
first consists of reduced-form models that allow for the evolution of a consumer's brand preferences
and price sensitivity parameter to be a function of her past experience in the category (Heilman,
Bowman, and Wright 2000) or due to her exposure to advertising by these brands (see, for example,
Jedidi, Mela, and Gupta 1999; Sriram, Chintagunta, and Neelamegham 2006; Sriram and Kalwani
2007). These models do not, however, provide a structural representation of the learning mechanism.
One criticism of such reduced-form models is that they may not be invariant to policy changes (a concern
known as the Lucas Critique). Therefore, they may not be useful if the focus of the researcher is to
understand the implications of significant policy changes. The second stream of literature explicitly
accounts for how consumers update their beliefs regarding components of their utility about which
they are uncertain. The structural nature of such models provides some defense against the Lucas
Critique. Most of these studies assume that consumers update their beliefs in a Bayesian fashion
with the extent of updating being related to their perceived precision of the signals that aid in such
learning. In this chapter, we present a review of this latter stream of the learning literature. While
the above discussion was based on consumers learning about the utility they would derive from the
various alternatives, the literature encompasses various other contexts, such as physicians learning
about a drug and managers learning about the attractiveness of a market.
The rest of the chapter is organized as follows. First, we begin by discussing the basic structure
of learning models. We then discuss the differences between the various learning models in the
marketing and economics literatures in terms of the agent who is learning, the entity that the agent
is uncertain about, and the signals that help in learning. Subsequently, we talk about how the basic
learning model has been extended in the literature. The following section sheds some light on potential
avenues for future research, and the final section provides some concluding comments.
The Basic Structure of Learning Models
Learning models are typically characterized by the following four aspects: (a) an agent, (b) a
component of the agent's utility function that she is uncertain about (an unknown entity), (c) signals
that the agent receives about the unknown entity, and (d) a mechanism by which information
in the signal is used to resolve the uncertainty in (b). A common example in marketing involves
consumers (the agents) who are uncertain about the quality of the product they are deciding to
purchase (the unknown entity). Upon purchase and experience of the product, they receive some
information regarding its true quality (the signal). The signal is usually assumed to be noisy,
that is, a single purchase or experience usually does not resolve all the uncertainty regarding the
product. In a standard Bayesian learning model (the mechanism), the agent has some prior belief
about the quality of the product. The (noisy) signal the consumer receives from the purchase
and use of the product allows the agent to combine the prior belief with the signal in a Bayesian
fashion to update beliefs about the product's quality. Usually, researchers assume that the noisy
information that agents receive each period comes from a distribution whose mean equals the true
value of the unknown entity, that is, that the signals are unbiased. Hence, if the agent receives
signals over an extended period of time by making repeated purchases, the updated belief about
product quality will converge to its true value. Below, we present a formal discussion of the
model structure in the context of this example.
Consider a market with only one new nondurable product that can be purchased every period.
Consumers in such a market decide on whether to purchase the product or not during each period
t, t = 1, 2, . . . , T. Further, we assume that consumers can make only one purchase during each
period. Let Q be the true quality of the product. In our illustrative learning model, consumers do
not know this true quality. At period 0, all consumers start with a prior belief that the quality of
this product is normally distributed with mean $Q_0$ and variance $\sigma_0^2$, that is,
$$\text{Prior} \sim N(Q_0, \sigma_0^2). \quad (1)$$
In period 1, consumers would make their purchase decisions based on this prior belief. If
consumer i, i = 1, 2, . . . I, purchases the product, she can assess the quality of the product from
her consumption experience, $Q_{Ei1}$. If we assume that the consumer always derives the experience
of quality that is equal to the true quality ($Q_{Eit} = Q$ for all $i, t$), then this one consumption experience
is sufficient to assess the true quality of the product. However, in reality, this experienced quality
might differ from the true quality, Q, because of (a) intrinsic product variability (Roberts and Urban
1988) and/or (b) idiosyncratic consumer perceptions (Erdem and Keane 1996). Hence, researchers
typically assume that these experienced quality signals are draws from a normal distribution whose
mean equals the true quality, that is, that these are unbiased signals. Thus, we have
$$Q_{Eit} \sim N(Q, \sigma_Q^2), \quad i = 1, 2, \ldots, I, \; t = 1, 2, \ldots, T, \quad (2)$$
where $\sigma_Q^2$ captures the extent to which the signals are noisy. Thus, for learning to extend beyond
the initial purchase, we need $\sigma_Q^2 > 0$.
Subsequent to the first purchase (and consumption experience) the consumer has some more
information than the prior she started with. Consumers use this new information along with the
prior to update their beliefs about the true quality of the product in a Bayesian fashion. Specifically,
since both the prior and the signal are normally distributed, conjugacy implies that the posterior
belief at the end of period 1 would also follow a normal distribution (DeGroot 1970) with mean
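$Q_{i1}$ and variance $\sigma_{i1}^2$ such that
$$Q_{i1} = \alpha_{i1} Q_0 + \beta_{i1} Q_{Ei1} \quad \text{and} \quad (3a)$$
$$\sigma_{i1}^2 = \frac{1}{\dfrac{1}{\sigma_0^2} + \dfrac{1}{\sigma_Q^2}}, \quad (3b)$$
where
$$\alpha_{i1} = \frac{1/\sigma_0^2}{1/\sigma_0^2 + 1/\sigma_Q^2} = \frac{\sigma_Q^2}{\sigma_0^2 + \sigma_Q^2} \quad \text{and} \quad (3c)$$
$$\beta_{i1} = \frac{1/\sigma_Q^2}{1/\sigma_0^2 + 1/\sigma_Q^2} = \frac{\sigma_0^2}{\sigma_0^2 + \sigma_Q^2}. \quad (3d)$$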
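As a worked illustration of the first-period update, the block below plugs in numbers. The values $Q_0 = 0$, $\sigma_0^2 = 5$, $\sigma_Q^2 = 2$ and the realized signal $Q_{Ei1} = 4$ are chosen purely for illustration, and the weight names $\alpha_{i1}$ (on the prior mean) and $\beta_{i1}$ (on the signal) are notation assumed here.

```latex
\begin{align*}
\alpha_{i1} &= \frac{\sigma_Q^2}{\sigma_0^2 + \sigma_Q^2} = \frac{2}{5 + 2} = \frac{2}{7} \approx 0.286, \\
\beta_{i1}  &= \frac{\sigma_0^2}{\sigma_0^2 + \sigma_Q^2} = \frac{5}{5 + 2} = \frac{5}{7} \approx 0.714, \\
Q_{i1} &= \alpha_{i1} Q_0 + \beta_{i1} Q_{Ei1} = \frac{2}{7}(0) + \frac{5}{7}(4) = \frac{20}{7} \approx 2.857, \\
\sigma_{i1}^2 &= \left(\frac{1}{5} + \frac{1}{2}\right)^{-1} = \frac{10}{7} \approx 1.429 .
\end{align*}
```

Because the prior is diffuse relative to the signal in this example, a single experience moves the mean most of the way toward the signal and cuts the variance from 5 to about 1.43.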
This posterior belief at the end of period 1 acts as the prior belief at the beginning of period 2. Thus,
when the consumer makes a purchase decision in period 2, she would expect her quality experience
to come from this distribution, that is, $\tilde{Q}_{i2} \sim N(Q_{i1}, \sigma_{i1}^2)$. On the other hand, a consumer who does
not make a purchase in period 1 will use the same prior in period 2 as she did in period 1. Hence, we
can generalize equations 3a, 3b, 3c, and 3d for any time period t, t = 1, 2, . . . , T, as follows:
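$$Q_{it} = \alpha_{it} Q_{i,t-1} + \beta_{it} Q_{Eit}, \quad (3a')$$
$$\sigma_{it}^2 = \frac{1}{\dfrac{1}{\sigma_0^2} + \dfrac{\sum_{\tau=1}^{t} I_{i\tau}}{\sigma_Q^2}} = \frac{1}{\dfrac{1}{\sigma_{i,t-1}^2} + \dfrac{I_{it}}{\sigma_Q^2}}, \quad (3b')$$
$$\alpha_{it} = \frac{\sigma_Q^2}{I_{it}\,\sigma_{i,t-1}^2 + \sigma_Q^2} = \frac{1/\sigma_{i,t-1}^2}{1/\sigma_{i,t-1}^2 + I_{it}/\sigma_Q^2}, \quad (3c')$$
$$\beta_{it} = \frac{I_{it}\,\sigma_{i,t-1}^2}{I_{it}\,\sigma_{i,t-1}^2 + \sigma_Q^2} = \frac{I_{it}/\sigma_Q^2}{1/\sigma_{i,t-1}^2 + I_{it}/\sigma_Q^2}, \quad (3d')$$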
where $I_{it}$ is an indicator variable that takes on the value 1 if consumer i makes a purchase in period
t and 0 otherwise. Similarly, when the consumer makes a purchase in period t+1, she would
assume that the quality of the product, $\tilde{Q}_{i,t+1}$, comes from this posterior distribution at the end of
period t, that is, $\tilde{Q}_{i,t+1} \sim N(Q_{it}, \sigma_{it}^2)$. Equations 3a, 3b, 3c, and 3d imply that as the number of
consumption experiences increases, the consumer learns more and more about the true quality
of the product. As a result, her posterior mean would shift away from her initial prior and move
closer to the true mean quality. Similarly, as she receives more information, her posterior variance
would decrease.
In order to demonstrate how a consumer's posterior belief would evolve as she receives
these signals, we performed a simulation wherein the true quality of the product, Q, is set at
5. The consumer has a prior belief that the true quality of the product, $Q_0$, is 0 with a variance of
5 ($\sigma_0^2 = 5$). The consumer receives unbiased signals around this true quality with a signal variance
of 2 ($\sigma_Q^2 = 2$). In Figure 3.1, we plot the evolution of the posterior mean and variance as the
number of purchase occasions increases. As discussed above, the figure reveals that the consumer's
posterior belief about the true quality of the product converges to its true value as she receives
more signals. Furthermore, her uncertainty about this belief (posterior variance) falls with each
additional signal and tends to zero asymptotically.
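The simulation described above can be sketched in a few lines of Python. This is a minimal sketch rather than the authors' code: the function names, the random seed, and the choice of 20 purchase occasions (matching the horizontal axis of Figure 3.1) are assumptions, while the true quality (5), prior mean (0), prior variance (5), and signal variance (2) come from the text.

```python
import random


def bayes_update(prior_mean, prior_var, signal, signal_var):
    """One conjugate-normal update; returns (posterior mean, posterior variance)."""
    weight_on_signal = prior_var / (prior_var + signal_var)
    post_mean = (1.0 - weight_on_signal) * prior_mean + weight_on_signal * signal
    post_var = 1.0 / (1.0 / prior_var + 1.0 / signal_var)
    return post_mean, post_var


def simulate_learning(true_quality=5.0, prior_mean=0.0, prior_var=5.0,
                      signal_var=2.0, n_purchases=20, seed=0):
    """Feed n_purchases unbiased noisy signals through bayes_update."""
    rng = random.Random(seed)  # arbitrary seed; the chapter does not report one
    mean, var = prior_mean, prior_var
    path = []
    for _ in range(n_purchases):
        # Each consumption experience is a draw centered on the true quality.
        signal = rng.gauss(true_quality, signal_var ** 0.5)
        mean, var = bayes_update(mean, var, signal, signal_var)
        path.append((mean, var))
    return path


if __name__ == "__main__":
    for n, (mean, var) in enumerate(simulate_learning(), start=1):
        print(f"purchase {n:2d}: posterior mean = {mean:6.3f}, variance = {var:6.3f}")
```

The posterior variance path is deterministic (it depends only on the number of signals), whereas the posterior mean path depends on the particular signals drawn, so a different seed gives a different route to the same limit.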
This concludes our discussion of a basic mechanism by which consumers learn about the quality
of a new (to them) product. Later we will discuss other learning mechanisms. Now we turn to a
discussion of the utility function that drives purchases.
Specification of the Utility Function
For the sake of simplicity, we define the utility that the consumer derives from the product at time
t as a function of her quality perception and the price of the product at that time. As discussed
above, when a consumer makes a purchase decision at period t, she still perceives some uncer-
tainty about the quality of the product she would receive. Hence, her utility will also be a random
variable. Specifically,
$$\tilde{U}_{it} = f(\tilde{Q}_{it}, p_t; \gamma) + \varepsilon_{it}, \quad (4)$$
where $\tilde{U}_{it}$ is the utility that consumer i derives from purchasing the product at time t, $p_t$ is the
price of the product at time t, $\gamma$ is the price sensitivity parameter, and $\varepsilon_{it}$ is a consumer and time-varying
idiosyncratic term that is not observed by the researcher.¹ Since the consumer does not
know the true quality and, hence, the true utility, she makes her purchase decision based on the
expected utility,
$$E[\tilde{U}_{it}] = E[f(\tilde{Q}_{it}, p_t; \gamma) + \varepsilon_{it}] = E[f(\tilde{Q}_{it}, p_t; \gamma)] + \varepsilon_{it}, \quad (5)$$
where E[.] is the expectation operator. The expectation is taken over the prior distribution at the
beginning of that period (or the posterior distribution at the end of the previous period). Since
the error term is perfectly known to the consumer and unknown only to the researcher, it can be
taken out of the expectation operator as in equation 5. Further, if we assume that the deterministic
component of the utility from not purchasing is 0 and the error term $\varepsilon_{it}$ follows a type I extreme
value distribution, we can write out the probability that consumer i would make a purchase at
time t, $\Pr_{it}$, as
$$\Pr\nolimits_{it} = \frac{\exp\left(E[f(\tilde{Q}_{it}, p_t; \gamma)]\right)}{1 + \exp\left(E[f(\tilde{Q}_{it}, p_t; \gamma)]\right)}. \quad (6)$$
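The mapping from the deterministic part of expected utility to a purchase probability can be sketched as follows. The function names are hypothetical, and the linear form used for f in the usage example (quality belief plus a price coefficient times price) is an assumption made only so that there is something to evaluate; the logit expression itself is the one in equation 6.

```python
import math


def purchase_probability(expected_utility):
    """Binary logit of equation 6, arranged to avoid overflow for large |v|."""
    if expected_utility >= 0:
        return 1.0 / (1.0 + math.exp(-expected_utility))
    z = math.exp(expected_utility)
    return z / (1.0 + z)


def illustrative_expected_f(posterior_mean, price, price_coef):
    """Hypothetical linear f: quality belief plus price_coef * price."""
    return posterior_mean + price_coef * price


if __name__ == "__main__":
    # As the belief about quality rises, so does the purchase probability.
    for belief in (0.0, 2.0, 4.0):
        v = illustrative_expected_f(belief, price=3.0, price_coef=-1.0)
        print(f"belief = {belief:.1f}, Pr(purchase) = {purchase_probability(v):.3f}")
```

When the deterministic part of expected utility equals that of the outside option (zero), the purchase probability is exactly one half.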
The specification of the utility function in equation 4 will have implications for how the posterior
mean and variance enter the expected utility in equation 5 and hence the probability of purchase
in equation 6. For example, if f(.) is a linear function of $\tilde{Q}_{it}$ such that
$$\tilde{U}_{it} = \tilde{Q}_{it} + \gamma p_t + \varepsilon_{it}, \quad (7a)$$
then
$$E[\tilde{U}_{it}] = E[\tilde{Q}_{it}] + \gamma p_t + \varepsilon_{it} = Q_{i,t-1} + \gamma p_t + \varepsilon_{it}. \quad (7b)$$
Figure 3.1 Change in Posterior Mean and Variance with Number of Purchases
[Line chart. Horizontal axis: No. Purchases (Signals), 1 to 20. Vertical axis: 0 to 6. Series: True Quality, Posterior Mean, Posterior Variance.]
The expression for the expected utility in equation 7b implies that it depends solely on the
posterior mean from the previous period and is independent of the posterior variance. On the
other hand, concave or convex specifications of f(.) would lead to expected utilities that depend
on both the posterior mean and variance. More specifically, a concave utility function would
imply that the consumers are risk averse. Consequently, the expected utility would be negatively
influenced by the posterior variance. Likewise, a convex utility function would have the
posterior variance affecting the expected utility positively. In Table 3.1, we present the utility
function and the expected utility for three commonly used functional forms in the literature:
linear, quadratic, and constant absolute risk aversion (CARA). Note that the term $r$ in the quadratic
and CARA specifications corresponds to the level of risk aversion, with positive values
corresponding to risk-averse consumers (concave utility) and negative values corresponding to
risk-seeking consumers (convex utility).
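To see how the posterior variance enters expected utility under each functional form, the sketch below evaluates the standard closed-form expectations for a normally distributed quality belief and checks them by Monte Carlo. It is an illustration under stated assumptions: the function names and the symbol `r` for the risk parameter are hypothetical labels, the quadratic form is taken to be $Q - rQ^2$ and the CARA form $-\exp(-rQ)$, and the additive price and error terms are left out because they do not interact with the uncertainty about quality.

```python
import math
import random


def expected_linear(mean, var):
    """E[Q] for Q ~ N(mean, var): the variance drops out."""
    return mean


def expected_quadratic(mean, var, r):
    """E[Q - r*Q^2] = mean - r*(mean^2 + var)."""
    return mean - r * (mean ** 2 + var)


def expected_cara(mean, var, r):
    """E[-exp(-r*Q)] = -exp(-r*mean + 0.5*r^2*var)."""
    return -math.exp(-r * mean + 0.5 * r * r * var)


def monte_carlo(utility, mean, var, draws=100_000, seed=1):
    """Simulation check of an expectation over Q ~ N(mean, var)."""
    rng = random.Random(seed)
    sd = math.sqrt(var)
    return sum(utility(rng.gauss(mean, sd)) for _ in range(draws)) / draws


if __name__ == "__main__":
    mean, r = 3.0, 0.1
    for var in (0.5, 1.0, 2.0):
        print(f"var={var}: linear={expected_linear(mean, var):.3f}, "
              f"quadratic={expected_quadratic(mean, var, r):.3f}, "
              f"CARA={expected_cara(mean, var, r):.3f}")
```

With a positive `r`, both the quadratic and CARA expectations fall as the variance grows, which is the risk-aversion effect described above; with a negative `r` the direction reverses.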
Learning Models in Marketing and Economics
We now turn our attention to a discussion of the different applications of learning models in the
marketing and economics literature. Specifically, we break down the discussion into how these
studies differ in terms of the following three questions: (a) who is learning? (the agent), (b) what
are they learning about? (the uncertain entity), and (c) how do they learn? (the signal). In Appendix
3.1, on pages 82-83, we provide a summary of selected studies on learning. The appendix also
provides details on how these studies differ on these three dimensions.
Who Is Learning? (The Agent)
Broadly, there are three types of agents in the literature: (a) consumers who are making decisions
regarding their own consumption or purchases, (b) physicians making decisions on behalf of their
patients, and (c) managers making decisions on behalf of their firms. As regards the first group,
consumers, the literature on learning models spans several industries including consumer packaged
goods (see, for example, Ackerberg 2003; Erdem and Keane 1996; Mehta, Rajiv, and Srinivasan
2003, 2004), consumer durables such as automobiles (Roberts and Urban 1988) and computers
(Erdem, Keane, Oncu, and Strebel 2005), and services such as local telephone (Narayanan, Chintagunta,
and Miravete 2007) and wireless (Xiao, Chan, and Narasimhan 2007). The choice of the
category has typically been dictated by the nature of data required to infer consumer learning. Since
inference of learning requires us to observe consumer decisions over several time periods, most
of the work has been in the context of frequently purchased consumer packaged goods or services
such as telephone and wireless where consumers are not bound by a contract and therefore have
the option of switching between services during each period. A casual perusal of the appendix
would confirm this. The only exceptions are the studies by Roberts and Urban (1988) and Erdem
and colleagues (2005), set in the context of consumer purchases of automobiles and computers,
respectively. Clearly, if one needs to infer learning in these contexts based on how consumers
modify their purchases over time, it would require data that track the purchases of these consumers
over several purchase occasions. Given the lifetime of these categories (especially automobiles),
we may have to track purchases over several decades to arrive at a reasonably large purchase history
to infer learning. Therefore, in both cases, inference regarding learning is not based on repeat
purchases by consumers. Rather, they use data from consumer surveys to infer how consumers
learn over time. For example, Roberts and Urban investigate how car buyers would learn about a
new car model through word of mouth from current customers. In order to infer this, they collect
Table 3.1 Expected Utility for Alternative Utility Functions

| Functional Form | Utility Function | Expected Utility |
|---|---|---|
| Linear | $\tilde{U}_{it} = \tilde{Q}_{it} + \gamma p_t + \varepsilon_{it}$ | $E[\tilde{U}_{it}] = Q_{i,t-1} + \gamma p_t + \varepsilon_{it}$ |
| Quadratic | $\tilde{U}_{it} = \tilde{Q}_{it} - r\tilde{Q}_{it}^2 + \gamma p_t + \varepsilon_{it}$ | $E[\tilde{U}_{it}] = Q_{i,t-1} - rQ_{i,t-1}^2 - r\sigma_{i,t-1}^2 + \gamma p_t + \varepsilon_{it}$ |
| CARA | $\tilde{U}_{it} = -\exp(-r\tilde{Q}_{it}) + \gamma p_t + \varepsilon_{it}$ | $E[\tilde{U}_{it}] = -\exp(-rQ_{i,t-1} + 0.5\,r^2\sigma_{i,t-1}^2) + \gamma p_t + \varepsilon_{it}$ |
$$\frac{2q - 1}{1 - \delta}, \quad (8)$$
where $\delta$, $0 \le \delta < 1$, is the discount factor. Based on this expected profit, the agent would keep the
product in the market only if the probability of the profitable outcome q > 0.5. However, this
ignores the possibility that the agent might learn something about the product's profitability by
keeping it in the market. For example, consider that the agent would learn about the true profitability
of the product if she keeps it in the market for one period, that is, the signal is completely
informative. In such a scenario, the product would have an expected profit of $2q - 1$ in the current
period (same as above). However, armed with the information regarding the true profitability of
the product, the agent can make the optimal decision in future periods. Therefore, the net present
value of the profit stream when the agent considers the fact that she could learn about the true
profitability of the product would be
$$(2q - 1) + \frac{\delta}{1 - \delta}\, q. \quad (9)$$
In the above expression, the first part, $2q - 1$, captures the expected profits in the current
period, and the second part corresponds to the net present value of the expected profits that the
agent would derive after having learned about its true profitability. Under this scenario, the agent
would keep the product in the market as long as $q > \frac{1 - \delta}{2 - \delta}$. Note that $\frac{1 - \delta}{2 - \delta} < \frac{1}{2}$ if $\delta > 0$.
Therefore, when the discount factor, $\delta$, is greater than 0, the agent could make different decisions
regarding retaining the product in the market depending on whether or not she considers the role
of learning in enabling her to make more informed decisions in future periods. Specifically, the
decision criteria for keeping the product in the market would differ under the two scenarios for
values of q that lie in the range $\frac{1 - \delta}{2 - \delta} < q < \frac{1}{2}$.
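The gap between the two decision rules can be sketched numerically. The payoff structure below is an assumption chosen to be consistent with the per-period expected profit of $2q - 1$ mentioned in the text: the product earns +1 per period if profitable (probability q) and -1 otherwise, a withdrawn product earns 0, the horizon is infinite, and one period in the market reveals the truth. The function names are hypothetical.

```python
def npv_without_learning(q, delta):
    """Keep the product forever, ignoring what is learned: (2q - 1) / (1 - delta)."""
    return (2.0 * q - 1.0) / (1.0 - delta)


def npv_with_learning(q, delta):
    """Keep it one period, learn the truth, then keep it only if profitable."""
    current_period = 2.0 * q - 1.0
    # With probability q the product is profitable and earns 1 in every later period.
    future = q * delta / (1.0 - delta)
    return current_period + future


def keep_threshold_with_learning(delta):
    """Smallest q at which npv_with_learning is non-negative."""
    return (1.0 - delta) / (2.0 - delta)


if __name__ == "__main__":
    delta = 0.9
    print("threshold without learning: 0.5")
    print(f"threshold with learning:    {keep_threshold_with_learning(delta):.3f}")
    for q in (0.05, 0.3, 0.6):
        print(f"q={q}: NPV ignoring learning = {npv_without_learning(q, delta):7.2f}, "
              f"NPV with learning = {npv_with_learning(q, delta):7.2f}")
```

Under these assumptions, a product with q = 0.3 and a discount factor of 0.9 would be withdrawn by an agent who ignores learning but retained by one who values the information that a period in the market provides.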
$$Q_t = Q_{t-1} + \nu_t, \quad \nu_t \sim N(0, \sigma_\nu^2), \quad (10)$$
where $Q_t$ is the true value of the unknown entity at time t and $\nu_t$ is its stochastic variation at time
t. Since the mean of this stochastic variation is zero, the true value of the unknown entity fluctuates
around a constant mean. Note that the agent does not observe the true value of the unknown
entity, $Q_t$. However, the consumer knows the distribution of the temporal fluctuation in the true value
of the unknown entity, $\nu_t \sim N(0, \sigma_\nu^2)$. The consumer, therefore, observes a noisy measure of the unknown entity
such that
$$Q_{Eit} \sim N(Q_t, \sigma_Q^2), \quad i = 1, 2, \ldots, I, \; t = 1, 2, \ldots, T. \quad (2')$$
Thus, the only difference between equations 2 and 2′ is that in the latter, the process that
generates the signals varies over time. Therefore, when the agent receives signals that vary
over time, it could be either because of (a) noise in the signal generating process, $\sigma_Q^2$, or (b)
the temporal variation in the signal generating process, $\nu_t$. Taken together, equations 10 and 2′
represent the system of equations in a standard Kalman filter (Kalman 1960) with equation 2′
playing the role of the observation (or measurement) equation and equation 10 acting as the
system (or state) equation. Hence, the agent's updating mechanism can be readily obtained based
on derivation of the standard Kalman filter (see, for example, Hamilton 1994; Meinhold and
Singpurwalla 1983; West and Harrison 1994 for a simple exposition of the derivation). In what
follows, we present an intuitive discussion of the implications of this stochastic fluctuation. Once
again, we note that equation 2′ is similar to equation 2, with the exception that the mean of the
process that generates the signals is time varying. An implication of this temporal fluctuation
is that as the agent receives signals and updates her belief about the true value of the unknown
entity, she also needs to consider this fluctuation. As a result, the agent's posterior belief at the
end of period $t-1$ and her prior belief at the beginning of period t will not coincide.³ Since the
mean of the fluctuation is zero, the posterior mean at the end of period $t-1$ and the prior mean
at the beginning of period t would remain the same. On the other hand, this will not be the
case for the posterior and prior variances. Specifically, if the posterior variance that the agent
i perceives about the true value of the unknown entity at the end of period $t-1$ is $\sigma_{i,t-1|t-1}^2$, then her
prior variance at the beginning of period t is
$$\sigma_{it|t-1}^2 = \sigma_{i,t-1|t-1}^2 + \sigma_\nu^2. \quad (11)$$
Hence, the variance of the stochastic fluctuation,
$$Q_t = \rho\, Q_{t-1} + \nu_t, \quad \nu_t \sim N(0, \sigma_\nu^2). \quad (10')$$
Thus, if $0 < \rho < 1$ and $Q_t > 0$, then the true value of the unknown entity would decrease stochastically
over time. An implication of the temporal variation in the mean is that both the prior
mean and the prior variance at the beginning of period t would differ from their corresponding
posteriors at the end of period $t-1$. More formally,
$$Q_{it|t-1} = \rho\, Q_{i,t-1|t-1}, \quad (12)$$
$$\sigma_{it|t-1}^2 = \rho^2 \sigma_{i,t-1|t-1}^2 + \sigma_\nu^2, \quad (13)$$
where $Q_{i,t-1|t-1}$ and $\sigma_{i,t-1|t-1}^2$ are the posterior mean and variance at the end of period $t-1$ and $Q_{it|t-1}$
and $\sigma_{it|t-1}^2$ are the prior mean and variance at the beginning of period t. Furthermore, equations 3a,
3b, 3c, and 3d can be rewritten as follows:
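$$Q_{it|t} = \alpha_{it} Q_{it|t-1} + \beta_{it} Q_{Eit}, \quad (3a'')$$
$$\sigma_{it|t}^2 = \frac{1}{\dfrac{1}{\sigma_{it|t-1}^2} + \dfrac{I_{it}}{\sigma_Q^2}}, \quad (3b'')$$
$$\alpha_{it} = \frac{\sigma_Q^2}{I_{it}\,\sigma_{it|t-1}^2 + \sigma_Q^2} = \frac{1/\sigma_{it|t-1}^2}{1/\sigma_{it|t-1}^2 + I_{it}/\sigma_Q^2}, \quad (3c'')$$
$$\beta_{it} = \frac{I_{it}\,\sigma_{it|t-1}^2}{I_{it}\,\sigma_{it|t-1}^2 + \sigma_Q^2} = \frac{I_{it}/\sigma_Q^2}{1/\sigma_{it|t-1}^2 + I_{it}/\sigma_Q^2}. \quad (3d'')$$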
The above formulation can be extended to accommodate the evolution of the true value of
the unknown entity, $Q_t$, as a function of other covariates (see, for example, Akcura, Gonul, and
Petrova 2004).
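The prediction and updating steps described above can be sketched as a scalar Kalman filter. This is a minimal sketch under assumptions: the function and parameter names (`state_var` for the variance of the stochastic fluctuation, `rho` for the persistence parameter) are hypothetical, and the timing convention of applying the prediction step before every period's update, including the first, is a choice made here. Setting `rho = 1` and `state_var = 0` recovers the basic Bayesian learning model.

```python
def kalman_step(post_mean, post_var, signal, signal_var, state_var,
                rho=1.0, purchased=True):
    """One period: predict (prior for period t), then update if a signal arrives.

    Returns (posterior mean, posterior variance, weight placed on the signal).
    """
    # Prediction: last period's posterior becomes this period's prior.
    prior_mean = rho * post_mean
    prior_var = rho ** 2 * post_var + state_var
    if not purchased:
        # No signal this period, so the prior carries over unchanged.
        return prior_mean, prior_var, 0.0
    weight = prior_var / (prior_var + signal_var)
    new_mean = (1.0 - weight) * prior_mean + weight * signal
    new_var = 1.0 / (1.0 / prior_var + 1.0 / signal_var)
    return new_mean, new_var, weight


def weight_path(initial_var, signal_var, state_var, periods, rho=1.0):
    """Weight on the signal in each period when a signal arrives every period."""
    var = initial_var
    weights = []
    for _ in range(periods):
        # The weight depends only on variances, so the signal value is irrelevant here.
        _, var, weight = kalman_step(0.0, var, 0.0, signal_var, state_var, rho)
        weights.append(weight)
    return weights


if __name__ == "__main__":
    for state_var in (0.0, 0.5, 1.0):
        w = weight_path(initial_var=5.0, signal_var=2.0, state_var=state_var, periods=50)
        print(f"state_var={state_var}: weight after 1, 10, 50 signals = "
              f"{w[0]:.3f}, {w[9]:.3f}, {w[49]:.3f}")
```

With no stochastic fluctuation the weight on new signals shrinks toward zero, whereas a positive `state_var` keeps the prior variance from collapsing, so the weight settles at a strictly positive level.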
Notwithstanding the mathematical formulation of the belief updating process, the following
question arises: how is this evolution identified? For example, consider the case where
the mean of the unknown entity does not vary over time. Under such a scenario, how can one
separately identify fluctuations in the delivery mechanism of the unknown entity from the
temporal fluctuations in its true value? There are two arguments that favor this identification.
First, in the absence of the stochastic temporal variation, the weight that the agent places on
the realized values of the unknown entity, $\beta_{it}$, would steadily decrease over time and asymptotically
tend to zero. On the other hand, in the presence of a stochastic fluctuation in the true
value of the unknown entity, the weight would decrease at a slower rate and asymptote to a
value greater than zero. As in Lovett (2008), we plot the evolution of this weight for different
values of $\sigma_\nu^2/\sigma_Q^2$ in Figure 3.2. From this figure, it is evident that for the traditional Bayesian
learning model with no stochastic fluctuation in the true value ($\sigma_\nu^2/\sigma_Q^2 = 0$), the weight, $\beta_{it}$,
approaches zero. However, as the stochastic fluctuation gets more pronounced and $\sigma_\nu^2/\sigma_Q^2$
increases, the weight asymptotes away from zero. Therefore, if one had a sufficiently large
time series of observations, the extent to which the weight asymptotes away from zero can
be used to infer the variance of the stochastic fluctuation, $\sigma_\nu^2$
Appendix 3.1
choices) | Individual level | Myopic
Dixit and Chintagunta (2007) | Discount airlines | Airlines | Attractiveness of markets | Market demand | Firm level | Myopic
Narayanan and Manchanda (2007) | Pharmaceuticals (prescription) | Physician | Quality of drugs | Patient feedback, detailing | Physician level | Myopic
Xiao, Chan, and Narasimhan (2006) | Wireless | Consumer | Preference for voice and text | Consumption or usage | Individual level | Myopic
Ching (2008) | Pharmaceuticals (prescription) | Patient and physician | Quality of generics | Consumption | Aggregate | Myopic
Chintagunta, Jiang, and Jin (2008) | Pharmaceuticals (prescription) | Physician | Overall quality of drug and patient-drug match | Patient satisfaction | Individual level | Myopic