
Marketing II

Heterogeneity in Logit Models

Ricardo Montoya
Motivation
• One of the key ideas of this course is that consumers
are heterogeneous in their behavior

– Observable heterogeneity:
• Customers with higher incomes are less price sensitive
• Customers who have purchased frequently in the past
are more likely to buy again (dynamics vs.
heterogeneity)

– Unobservable heterogeneity:
• There are intrinsic differences in consumer behavior
that cannot be explained by observable covariates
(preferences)

Observable Heterogeneity
• Include covariates in utility function
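(individual $n$, alternative $j$, time $t$)
– $u_{njt} = \alpha_j + \beta^{\text{price}} \, \text{Price}_{jt} + \gamma_j \, \text{Income}_n + \varepsilon_{njt}$
• Alternatively, include covariates in preferences
(e.g., price sensitivity)
– $u_{njt} = \alpha_j + \beta_n^{\text{price}} \, \text{Price}_{jt} + \varepsilon_{njt}$
– $\beta_n^{\text{price}} = \delta_0 + \delta_1 \, \text{Income}_n$
– $u_{njt} = \alpha_j + (\delta_0 + \delta_1 \, \text{Income}_n) \, \text{Price}_{jt} + \varepsilon_{njt}$
– $u_{njt} = \alpha_j + \delta_0 \, \text{Price}_{jt} + \delta_1 \, \text{Income}_n \, \text{Price}_{jt} + \varepsilon_{njt}$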
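The second specification amounts to an income-by-price interaction term. A minimal Python sketch of that idea follows; the function names and all numeric values are hypothetical, chosen only to illustrate that a positive interaction coefficient makes higher-income customers less price sensitive.

```python
def price_coefficient(income, delta0, delta1):
    """Price sensitivity of a customer: beta_n = delta0 + delta1 * income."""
    return delta0 + delta1 * income


def systematic_utility(alpha_j, price_jt, income, delta0, delta1):
    """Utility without the error term: alpha_j + beta_n * price_jt."""
    return alpha_j + price_coefficient(income, delta0, delta1) * price_jt


# Hypothetical values, chosen only for illustration (not estimates).
delta0, delta1 = -4.0, 0.5
beta_low = price_coefficient(1.0, delta0, delta1)   # lower-income customer
beta_high = price_coefficient(3.0, delta0, delta1)  # higher-income customer
u = systematic_utility(alpha_j=2.0, price_jt=1.2, income=3.0,
                       delta0=delta0, delta1=delta1)
```

With these assumed values the higher-income customer has the price coefficient closer to zero, and the utility equals the expanded form with the separate Income × Price term.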

Unobservable Heterogeneity
Mixture Logit
• Logit:
– Decision makers choose the alternative with the largest utility.
The stochastic component of such utility is extreme value
distributed
$$P_{ni} = \frac{e^{v_{ni}(\beta)}}{\sum_j e^{v_{nj}(\beta)}}$$

• Mixture of logits:
– On top of the previous assumptions, we add that the parameters in
the utility function are randomly distributed in the population:
• Finite mixture (latent class)
• Continuous mixture (mixed logit)

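Latent class:
$$P_{ni} = \sum_m s_m \frac{e^{v_{ni}(\beta_m)}}{\sum_j e^{v_{nj}(\beta_m)}}, \qquad \sum_m s_m = 1$$

Mixed logit:
$$P_{ni} = \int \frac{e^{v_{ni}(\beta)}}{\sum_j e^{v_{nj}(\beta)}} f(\beta) \, d\beta, \qquad \int f(\beta) \, d\beta = 1$$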
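The latent class probability is a share-weighted average of class-level logit probabilities. The sketch below illustrates this in plain Python; the utilities and class shares are hypothetical inputs, not values from the course data.

```python
import math


def logit_probs(v):
    """Logit choice probabilities from a list of systematic utilities."""
    v_max = max(v)  # subtract the maximum for numerical stability
    exp_v = [math.exp(x - v_max) for x in v]
    total = sum(exp_v)
    return [e / total for e in exp_v]


def latent_class_probs(v_by_class, shares):
    """Mix the class-level logit probabilities using the class shares s_m."""
    n_alt = len(v_by_class[0])
    mixed = [0.0] * n_alt
    for v_m, s_m in zip(v_by_class, shares):
        p_m = logit_probs(v_m)
        for i in range(n_alt):
            mixed[i] += s_m * p_m[i]
    return mixed


# Hypothetical utilities v_nj(beta_m) for 3 alternatives in 2 classes.
v_by_class = [[1.0, 0.5, 0.0], [0.0, 0.5, 2.0]]
shares = [0.6, 0.4]  # hypothetical class shares, summing to 1
probs = latent_class_probs(v_by_class, shares)
```

Because each class-level vector sums to one and the shares sum to one, the mixed probabilities also sum to one.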

Finite Mixture

Latent Class

Maximum Likelihood Estimation
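• Likelihood function
– Likelihood for individual $n$ given class $m$, where $y_{njt} = 1$ if individual $n$ chooses alternative $j$ at time $t$ and $0$ otherwise:
$$L_{nm}(\beta_m) = \prod_t \prod_j P_{njt}(\beta_m)^{y_{njt}}$$
– Likelihood for individual $n$:
$$L_n(\beta, s) = \sum_m s_m L_{nm}(\beta_m)$$
– Complete log-likelihood:
$$LL(\beta, s) = \sum_n \ln L_n(\beta, s)$$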

• Need to impose that $s_m$ constitutes a probability law: $0 \le s_m \le 1$ and $\sum_m s_m = 1$

$$s_m = \frac{e^{\lambda_m}}{\sum_{m'} e^{\lambda_{m'}}}, \qquad \lambda_M = 0$$

• Maximize directly over $\beta_m$ and $\lambda_m$
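As a rough illustration, the log-likelihood with logit-transformed class shares could be coded as follows. This is a simplified sketch, not the course implementation: it assumes a single scalar attribute per alternative (so the utility is beta_m times x), and the data layout and numbers are hypothetical.

```python
import math


def class_shares(lambdas):
    """Shares s_m from M-1 free lambdas; the last class has lambda fixed at 0."""
    full = list(lambdas) + [0.0]
    l_max = max(full)
    exp_l = [math.exp(l - l_max) for l in full]
    total = sum(exp_l)
    return [e / total for e in exp_l]


def logit_probs(v):
    """Logit choice probabilities from systematic utilities."""
    v_max = max(v)
    exp_v = [math.exp(x - v_max) for x in v]
    total = sum(exp_v)
    return [e / total for e in exp_v]


def log_likelihood(betas, lambdas, data):
    """LL = sum_n ln( sum_m s_m * prod_t P_n,chosen,t(beta_m) ).

    data: one list per individual of (x, chosen) choice occasions, where
    x[j] is the attribute of alternative j and chosen is the chosen index.
    """
    shares = class_shares(lambdas)
    ll = 0.0
    for occasions in data:
        l_n = 0.0
        for beta_m, s_m in zip(betas, shares):
            l_nm = 1.0
            for x, chosen in occasions:
                l_nm *= logit_probs([beta_m * xj for xj in x])[chosen]
            l_n += s_m * l_nm
        ll += math.log(l_n)
    return ll


# Hypothetical data: two individuals, two alternatives, x = price.
data = [[([1.0, 2.0], 0), ([1.5, 1.0], 1)], [([1.0, 2.0], 1)]]
ll = log_likelihood(betas=[-1.0, -3.0], lambdas=[0.3], data=data)
```

A numerical optimizer would then search over the betas and the free lambdas; the transformation keeps the shares between 0 and 1 and summing to 1 without explicit constraints.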

Number of Segments
• Likelihood maximization is conditional on the number of
segments. How many segments?
• The maximized likelihood never decreases as segments are added,
so it cannot be used on its own to choose the number of segments
• Try with different number of segments and decide based
on:
– AIC, BIC or any other metric of goodness of fit
– Interpretability / actionability of the segments
• Customers can be assigned to each segment using Bayes' rule,
$\Pr(A \mid B) = P(B \mid A) \, P(A) / P(B)$. Let $L_m(y_n \mid x_n, \beta_m)$ be the
likelihood of observing the purchases of customer $n$ ($y_n$) if she belongs to class $m$.
Then
$$\Pr(n \in m \mid y_n) = \frac{\Pr(y_n \mid n \in m) \, \Pr(n \in m)}{\Pr(y_n)} = \frac{L_m(y_n \mid x_n, \beta_m) \, s_m}{\sum_k L_k(y_n \mid x_n, \beta_k) \, s_k}$$
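The Bayes' rule assignment can be sketched in a few lines of Python. The class likelihoods and shares below are hypothetical numbers used only to show the mechanics.

```python
def posterior_membership(class_likelihoods, shares):
    """Pr(n in m | y_n) for each class m, from L_m(y_n | x_n, beta_m) and s_m."""
    joint = [l_m * s_m for l_m, s_m in zip(class_likelihoods, shares)]
    total = sum(joint)  # Pr(y_n), the denominator of Bayes' rule
    return [j / total for j in joint]


# Hypothetical customer: her purchases are far more likely under class 1.
post = posterior_membership(class_likelihoods=[0.02, 0.001],
                            shares=[0.59, 0.41])
```

Here the posterior probability of class 1 rises above its prior share because the observed purchases fit that class better.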
Elasticities
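• If each segment is homogeneous, then the elasticity of the
market share of brand $i$ in segment $m$ with respect to its
own variable $x_i$, or to the variable $x_j$ of another brand, is given by:

$$E_{i,x_i} = \frac{\partial P_{mi}}{\partial x_i} \frac{x_i}{P_{mi}} = \beta_m (1 - P_{mi}) \, x_i$$

$$E_{i,x_j} = \frac{\partial P_{mi}}{\partial x_j} \frac{x_j}{P_{mi}} = -\beta_m P_{mj} \, x_j$$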
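A small Python sketch of these own- and cross-elasticities within one segment follows. It assumes, for simplicity, that utility is just beta_m times x (no intercepts); the coefficient and prices are hypothetical.

```python
import math


def logit_probs(v):
    """Logit shares from systematic utilities."""
    v_max = max(v)
    exp_v = [math.exp(a - v_max) for a in v]
    total = sum(exp_v)
    return [e / total for e in exp_v]


def elasticity_matrix(beta_m, x, probs):
    """E[i][j]: elasticity of the share of brand i with respect to x_j."""
    n = len(x)
    E = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                E[i][j] = beta_m * (1.0 - probs[i]) * x[i]   # own elasticity
            else:
                E[i][j] = -beta_m * probs[j] * x[j]          # cross elasticity
    return E


# Hypothetical segment: price coefficient and prices of three brands.
beta_m = -2.0
x = [1.0, 1.2, 0.9]
probs = logit_probs([beta_m * xj for xj in x])
E = elasticity_matrix(beta_m, x, probs)
```

With a negative price coefficient, the own-price elasticities are negative and the cross-price elasticities are positive. Note that the cross elasticity with respect to x_j is the same for every other brand i, a consequence of the logit form within a segment.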

Elasticities Analysis
• Having segment-level estimates, we can provide
a rich description of the market structure

• For example, we can analyze the effect of the
marketing mix of each brand on the distribution of
market shares

– Competitive Clout (CC): $CC_i = \sum_{j \ne i} E_{j,x_i}^2$

– Vulnerability (V): $V_i = \sum_{j \ne i} E_{i,x_j}^2$
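Given a matrix of elasticities, clout sums squared off-diagonal entries down a column and vulnerability sums them across a row. A minimal Python sketch, using a hypothetical 3-brand elasticity matrix:

```python
def clout_and_vulnerability(E):
    """E[i][j] is the elasticity of brand i's share with respect to x_j.

    Clout of brand i: how strongly x_i moves the other brands' shares.
    Vulnerability of brand i: how strongly the others' x_j move its share.
    """
    n = len(E)
    clout = [sum(E[j][i] ** 2 for j in range(n) if j != i) for i in range(n)]
    vulnerability = [sum(E[i][j] ** 2 for j in range(n) if j != i)
                     for i in range(n)]
    return clout, vulnerability


# Hypothetical elasticities (rows: affected brand, columns: changed variable).
E = [[-2.0, 0.5, 0.2],
     [0.8, -1.5, 0.2],
     [0.8, 0.5, -3.0]]
clout, vulnerability = clout_and_vulnerability(E)
```

In this made-up example the first brand has the largest clout, since changes in its variable produce the largest cross elasticities on its competitors.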

Brands: A, B, C, P

Latent Class vs Clustering
• Similarities
– Segments are obtained
– Unsupervised methods

• Differences
– Clustering
• Uses observable variables
• Conducted before parameter estimation
– Latent Class
• Uncovers unobserved preferences
• Conducted simultaneously with parameter estimation

Results
1 Class

              mle      se
[Yoplait]    4.475   0.186
[Dannon]     3.730   0.145
[WW]         3.087   0.145
[Price]    -37.071   2.400
[Feature]    0.487   0.120

• LL = -2658.6
• aic = 5327.1
• bic = 5356.1

2 Latent classes

              mle      se
[1,]         3.917   0.298   Class 1
[2,]         4.379   0.202   Class 1
[3,]         1.360   0.234   Class 1
[4,]       -50.468   4.365   Class 1
[5,]         1.097   0.197   Class 1
[6,]         5.861   0.392   Class 2
[7,]         2.590   0.366   Class 2
[8,]         4.691   0.349   Class 2
[9,]       -34.478   4.165   Class 2
[10,]        0.636   0.209   Class 2
[11,]        0.357   0.213   Implies 59%

• LL = -1920.4
• aic = 3862.7
• bic = 3926.5
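The "Implies 59%" annotation can be checked with the class-share transformation. The sketch below assumes that row [11,] is the share parameter λ of class 1 and that the second class has λ fixed at 0; that reading is an inference from the parameter count, not something the output labels explicitly.

```python
import math

lam_1 = 0.357  # estimate in row [11,], assumed to be lambda for class 1
lam_2 = 0.0    # reference class, fixed at 0

# s_1 = exp(lambda_1) / (exp(lambda_1) + exp(lambda_2))
s_1 = math.exp(lam_1) / (math.exp(lam_1) + math.exp(lam_2))
s_2 = 1.0 - s_1
```

Under that assumption the share of class 1 comes out at roughly 0.59, which matches the annotation.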

How Many Classes?

Classes 1 2 3 4 5 6 7
LL -2659 -1920 -1483 -1390 -1346 -1288 -1283
Npar 5 11 17 23 29 35 41
N 2430 2430 2430 2430 2430 2430 2430
AIC 5327 3863 3000 2825 2750 2646 2648
BIC 5356 3926 3098 2959 2918 2849 2885
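The AIC and BIC rows can be recomputed from LL, Npar and N using the standard definitions AIC = 2k − 2LL and BIC = k ln N − 2LL. The sketch below uses the rounded LL values from the table, so results may differ from the printed ones by about one unit.

```python
import math


def aic(ll, n_par):
    """Akaike information criterion: 2k - 2LL."""
    return 2 * n_par - 2 * ll


def bic(ll, n_par, n_obs):
    """Bayesian information criterion: k ln(N) - 2LL."""
    return n_par * math.log(n_obs) - 2 * ll


# Rounded values copied from the table: classes -> (LL, Npar).
fits = {1: (-2659, 5), 2: (-1920, 11), 3: (-1483, 17), 4: (-1390, 23),
        5: (-1346, 29), 6: (-1288, 35), 7: (-1283, 41)}
n_obs = 2430

aic_by_class = {m: aic(ll, k) for m, (ll, k) in fits.items()}
bic_by_class = {m: bic(ll, k, n_obs) for m, (ll, k) in fits.items()}
best_aic = min(aic_by_class, key=aic_by_class.get)
best_bic = min(bic_by_class, key=bic_by_class.get)
```

In this table both criteria reach their minimum at 6 classes; interpretability and actionability of the segments still need to be judged separately.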

Continuous Heterogeneity

Continuous Mixture

Mixed Logit

Interpretation
• Random Coefficients:

– The agent maximizes her underlying utility
$$u_{ni} = \beta_n x_{ni} + \varepsilon_{ni}$$
– Parameters $\beta_n$ vary in the population according to a
distribution $f(\beta_n \mid \theta)$, but each decision maker knows her
own value

– Error terms $\varepsilon_{ni}$ are extreme value distributed in the
population, but they are known by decision makers

Estimation
• For most cases, there is no closed-form expression
for the choice probability $P_{ni}$
– Example: normal distribution to describe heterogeneity in
parameters of the utility function

$$\beta \sim N(b, V_\beta), \qquad P_{ni} = \int \frac{e^{x_{ni}\beta}}{\sum_j e^{x_{nj}\beta}} \, \phi(\beta \mid b, V_\beta) \, d\beta$$
• Numerical methods:
– Simulated maximum likelihood
– Method of simulated moments
– Markov chain Monte Carlo
• Use packages in R (e.g., mlogit)
– We'll see it when we discuss Conjoint Analysis
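The idea behind simulation-based estimation is to replace the integral with an average of logit probabilities over random draws of β. The Python sketch below shows only that simulation step, for a single scalar coefficient with a normal distribution; the function name, attribute values, and distribution parameters are hypothetical, and this is not how the mlogit package is called.

```python
import math
import random


def logit_probs(v):
    """Logit choice probabilities from systematic utilities."""
    v_max = max(v)
    exp_v = [math.exp(a - v_max) for a in v]
    total = sum(exp_v)
    return [e / total for e in exp_v]


def simulated_mixed_logit_probs(x, b, sd, n_draws=5000, seed=0):
    """Approximate P_ni by averaging logit probabilities over beta ~ N(b, sd^2)."""
    rng = random.Random(seed)  # fixed seed so the simulation is reproducible
    acc = [0.0] * len(x)
    for _ in range(n_draws):
        beta = rng.gauss(b, sd)
        p = logit_probs([beta * xj for xj in x])
        for i in range(len(x)):
            acc[i] += p[i]
    return [a / n_draws for a in acc]


# Hypothetical attribute (price) of three alternatives and distribution of beta.
x = [1.0, 1.2, 0.9]
probs = simulated_mixed_logit_probs(x, b=-2.0, sd=1.0)
```

Simulated maximum likelihood plugs such simulated probabilities into the log-likelihood and maximizes over the distribution parameters (here b and the standard deviation). Setting the standard deviation to zero recovers the plain logit probabilities.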
