Regression Modelling
Marcel Goic
mgoic@uchile.cl
Mini Assignment
• In many industries, brand value is a key performance metric for
evaluating the promotional budget. Several instruments are available
in the market, including the Brand Equity Index, the Brand Asset
Valuator and the Brand Valuation Model. Recently, some of these
evaluations have been complemented with information about
touchpoints, which capture the level of activity of each brand in
different promotional vehicles. This opens the possibility of analyzing
the effectiveness of each touchpoint in increasing brand value.
Improving the Model (1)
• A naive attempt
$BV_{ikt} = \alpha + \beta \cdot TP_{ljikt} + \varepsilon_{ikt}$
– This cannot be computed! The indices are not consistent: the
right-hand side carries l and j, which do not appear on the left-hand side.
• Consistent Indices
$BV_{ikt} = \alpha + \beta \sum_{l,j} TP_{ljikt} + \varepsilon_{ikt}$
Improving the Model (2)
• The wrong answer
$BV_{ikt} = \alpha + \beta \sum_{j} A_{jikt} + \varepsilon_{ikt}$
– This model is formally correct, but it does not answer the research
question: with a single common coefficient, all touchpoints are
assumed to be equally effective.
Improving the Model (3)
• Asking too much
$BV_{ikt} = \alpha_{ikt} + \sum_{j} \beta_{jikt} A_{jikt}$
– This model has too many parameters. With 500 brands, 30
markets, 10 touchpoints and 10 years of data, we would
have 1,650,000 parameters (500 × 30 × 10 intercepts plus
500 × 30 × 10 × 10 slopes).
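The arithmetic behind that count can be checked with a short sketch. It assumes one brand-value observation per brand, market and year, and it adds a middle-ground specification with one slope per touchpoint, which is not on the slide and is included only for comparison. All variable names are illustrative.

```python
# Dimensions quoted on the slide.
n_brands, n_markets, n_touchpoints, n_years = 500, 30, 10, 10

# Assumption: one BV observation per brand-market-year cell.
n_obs = n_brands * n_markets * n_years

# "Wrong answer" model: one intercept and one slope shared by all touchpoints.
params_pooled = 1 + 1

# Illustrative middle ground (not on the slide): one slope per touchpoint.
params_per_touchpoint = 1 + n_touchpoints

# "Asking too much" model: alpha_ikt intercepts plus beta_jikt slopes.
n_intercepts = n_brands * n_markets * n_years
n_slopes = n_brands * n_markets * n_years * n_touchpoints
params_full = n_intercepts + n_slopes

print("observations:          ", n_obs)
print("pooled model:          ", params_pooled)
print("per-touchpoint slopes: ", params_per_touchpoint)
print("fully indexed model:   ", params_full)
```

Under these assumptions the fully indexed model has more parameters than observations, so it cannot be estimated without further restrictions.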
Other Ingredients
• Interactions or cross-effects:
– The joint effect of TV and Radio could be larger than the effect of TV
plus the effect of Radio.
• Functional transformations
– They help to linearize relationships between variables:
$\ln(BV_{ikt})$, $\beta_1 BV_{ikt-1} + \beta_2 (BV_{ikt-1})^2$
• Hierarchical structures
– Some parameters of the model can be described as a function of
other parameters
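As a minimal sketch of the interaction idea, consider two binary indicators (TV on/off, Radio on/off) and the model BV = b0 + b_tv·TV + b_radio·Radio + b_inter·TV·Radio. With four scenarios and four coefficients, the model is saturated and each coefficient is an exact contrast of the scenario averages. The brand-value numbers below are hypothetical and chosen only to illustrate the calculation.

```python
# Hypothetical average brand value in each (TV, Radio) scenario.
bv = {
    (0, 0): 10.0,  # no TV, no Radio
    (1, 0): 12.0,  # TV only
    (0, 1): 13.0,  # Radio only
    (1, 1): 17.0,  # TV and Radio together
}

# Coefficients of BV = b0 + b_tv*TV + b_radio*Radio + b_inter*TV*Radio.
b0 = bv[(0, 0)]
b_tv = bv[(1, 0)] - bv[(0, 0)]
b_radio = bv[(0, 1)] - bv[(0, 0)]
# Interaction: joint lift minus the sum of the two separate lifts.
b_inter = bv[(1, 1)] - bv[(1, 0)] - bv[(0, 1)] + bv[(0, 0)]


def predict(tv, radio):
    """Predicted brand value for binary TV and Radio indicators."""
    return b0 + b_tv * tv + b_radio * radio + b_inter * tv * radio


print("TV effect:", b_tv, "Radio effect:", b_radio, "interaction:", b_inter)
```

A positive `b_inter` means the joint effect exceeds the sum of the separate effects. With continuous activity levels, the same product term would be estimated by regression rather than read off cell averages.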
Preliminary Takeaways
• There are infinitely many possible models, so it is not
possible to determine which one is the best.
Orange Juice Sales
Scanner Data
• An important source of sales data comes from scanner
data, where sales are automatically recorded at the point
of purchase (this is why this type of data is also called POS
data).
• We are interested in
– Characterizing price elasticities across stores.
– Forecasting demand for different price points.
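For reference, the price elasticity in the first goal has a standard definition. The short derivation below states it and shows what it reduces to under a log-log demand specification, which is one common choice and is assumed here. The symbols η, q, p, α and β are introduced for illustration only.

```latex
\eta \;=\; \frac{\partial q}{\partial p}\cdot\frac{p}{q}
     \;=\; \frac{\partial \ln q}{\partial \ln p},
\qquad
\ln q = \alpha + \beta \ln p
\;\;\Rightarrow\;\;
\eta = \beta, \quad q(p) = e^{\alpha}\, p^{\beta}.
```

Under that assumption the slope on log price is the elasticity itself, and the last expression gives the demand forecast at any price point p.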
• In this exercise we will analyze purchases from the refrigerated
orange juice category.
Tables
• The dataset consists of two tables.
– Sales: Sales, prices and promotional activity.
– Store Demographics: Store characteristics.
Sales Data
Store  brand  week  logmove   price       deal  feat
1      1      44    9.018695  0.06046875  1     0
1      1      45    8.723231  0.06046875  0     0
1      1      46    8.253228  0.06046875  0     0
1      1      47    8.987197  0.06046875  0     0
Store Demo
STORE  AGE60  EDUC   ETHNIC  INCOME  HHLARGE
1      0.232  0.248  0.114   10.553  0.104
2      0.117  0.321  0.005   10.922  0.103
3      0.252  0.009  0.035   10.597  0.132
Tables can be connected through the store column.
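Connecting the two tables through the store column amounts to a left join of sales onto store demographics. The sketch below uses only the sample rows shown above and plain Python dictionaries. The variable names are illustrative, and a data-frame library would normally do the same job on the full dataset.

```python
# Sample rows copied from the Sales Data table.
sales = [
    {"store": 1, "brand": 1, "week": 44, "logmove": 9.018695, "price": 0.06046875, "deal": 1, "feat": 0},
    {"store": 1, "brand": 1, "week": 45, "logmove": 8.723231, "price": 0.06046875, "deal": 0, "feat": 0},
    {"store": 1, "brand": 1, "week": 46, "logmove": 8.253228, "price": 0.06046875, "deal": 0, "feat": 0},
    {"store": 1, "brand": 1, "week": 47, "logmove": 8.987197, "price": 0.06046875, "deal": 0, "feat": 0},
]

# Sample rows copied from the Store Demo table.
demo_rows = [
    {"STORE": 1, "AGE60": 0.232, "EDUC": 0.248, "ETHNIC": 0.114, "INCOME": 10.553, "HHLARGE": 0.104},
    {"STORE": 2, "AGE60": 0.117, "EDUC": 0.321, "ETHNIC": 0.005, "INCOME": 10.922, "HHLARGE": 0.103},
    {"STORE": 3, "AGE60": 0.252, "EDUC": 0.009, "ETHNIC": 0.035, "INCOME": 10.597, "HHLARGE": 0.132},
]

# Index demographics by store id so each sales row can look up its store.
demo_by_store = {row["STORE"]: row for row in demo_rows}

# Left join: keep every sales row and attach that store's demographics.
merged = []
for sale in sales:
    demo = demo_by_store.get(sale["store"], {})
    extra = {key: value for key, value in demo.items() if key != "STORE"}
    merged.append({**sale, **extra})

print(merged[0])
```

Each merged row then carries both the weekly sales variables and the characteristics of the store where the sale took place.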
Data Exploration
Simple Models
• For simplicity of exposition, we only analyze the demand for
the first product (Tropicana Premium 64 oz). To complete the
exercise, the same analysis can be repeated for each of the other
products.
Adding Fixed Effects
• We can add fixed effects.
– Given the data structure we have, we can use store fixed effects
(some stores sell more, others sell less).
– In a more general case where all products are considered at the
same time, we could use store and brand fixed effects.
• Models
– FE1: A simple linear model for comparison (identical to M1)
$q_{st} = \alpha + \beta p_{st} + \xi f_{st} + \psi d_{st} + \varepsilon_{st}$
– FE2: Add store fixed effects
$\ln(q_{st}) = \alpha_s + \beta \ln(p_{st}) + \xi f_{st} + \psi d_{st} + \varepsilon_{st}$
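A specification like FE2 can be estimated by subtracting store means from every variable (the within transformation) and running least squares on the demeaned data. The sketch below does this with NumPy on simulated data. The panel size, noise level and "true" parameter values are all hypothetical and serve only to check that the procedure recovers them; this is not the course's own implementation.

```python
import numpy as np

rng = np.random.default_rng(42)
n_stores, n_weeks = 5, 60                      # hypothetical panel size

store = np.repeat(np.arange(n_stores), n_weeks)
log_p = rng.normal(0.0, 0.3, store.size)       # ln(price), simulated
feat = rng.integers(0, 2, store.size).astype(float)
deal = rng.integers(0, 2, store.size).astype(float)

# Hypothetical parameters, used only to simulate ln(q).
alpha_s = rng.normal(9.0, 1.0, n_stores)       # store intercepts
beta, xi, psi = -2.5, 0.4, 0.2
noise = rng.normal(0.0, 0.05, store.size)
log_q = alpha_s[store] + beta * log_p + xi * feat + psi * deal + noise


def demean_by_store(values):
    """Subtract each store's mean from that store's observations."""
    means = np.bincount(store, weights=values) / np.bincount(store)
    return values - means[store]


# Demeaning removes alpha_s, so no intercept column is needed.
X = np.column_stack([demean_by_store(v) for v in (log_p, feat, deal)])
y = demean_by_store(log_q)

coef, *_ = np.linalg.lstsq(X, y, rcond=None)
beta_hat, xi_hat, psi_hat = coef
print("beta_hat:", beta_hat, "xi_hat:", xi_hat, "psi_hat:", psi_hat)
```

Demeaning within store gives the same slope estimates as including one dummy variable per store, and because both quantity and price are in logs, `beta_hat` can be read as a price elasticity.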
IN5162 – Marketing Engineering