
Business Decision Making & Data Mining

Study of Sales Promotion Effectiveness

Faculty: Prof. Nagadevara V


Teaching Asst: Ms. Nalini Guhesh

Group – 2
Arpitha Bhat (2008013) Shashidhar A (2008052)
Avnish Kshatriya (2008078) Sundeep Raj (2008112)
Nizam Mohideen (2008090) Varun Yagain (2008141)
Savita Ashwath (2008047)

Contents

1. Business Domain Details
1.1. Product Details
1.2. Promotion Details
2. Business Problem
3. Details of the Data
3.1. Source
3.2. Quality
3.3. Description of the variables
3.4. Costs of Prediction errors
4. Analysis
4.1. Assumptions on Data
4.2. Technique
4.2.1. Stage-1
4.2.2. Stage-2
4.3. Data Preparation Process
4.4. Choice of Data mining technique
5. Conclusion
5.1. Actual Results
5.2. Strategic recommendations
5.3. Further direction
6. Appendix
7. References

1. Business Domain Details
This data mining project is applied to the marketing data of an FMCG business. Such businesses
run several marketing promotions to boost sales. Given the limited resources available for
launching and running these promotions, it is of considerable interest to find the most effective
sales promotion (or the most effective combination of promotions) in order to maximize the
benefit from a limited spend.

In this study, the following promotion campaigns are analyzed:


• Shop Displays,
• Price Discounts,
• Coupons, and
• In-store Displays

We use weekly sales data covering more than 300 weeks, the promotions in force in each of these
weeks, and the sales numbers of several related products during these weeks. The data is
explained in later sections. The data can also be connected to the location of stores using the
store identifiers.

Such data can support several actionable strategies, for example:

• Finding close correlations between different products or combinations of them, hinting at a
possible advantage in
o Bundling them together
o Placing them in close display to prompt purchase
o Placing them away from each other to increase exposure to other products
• Finding the most effective combination of promotional activities.
• Finding the most valued set of features of a product, also in relationship with features of other
products, sales and seasonality.

In this project, we restrict ourselves to understanding the effectiveness of the promotion mix,
which is popularly referred to in the Retail industry as Promotion Analysis.

Aberdeen Group, a popular Promotion Analytics company, defines it as: a technique for
determining optimum promotions at the most appropriate price level that help retailers to
increase profit uplift and reduce sales cannibalization. The components of an optimization
solution include a simulation model that provides the most confident prediction of the outcome of
any promotional discount. This solution uses a scenario/results processor, a demand engine, an
activity-based promotions cost engine, and a price optimization engine. The scenario processor
helps a retailer in the optimization process by selecting the most optimum prices and promotions
quantity using the most appropriate promotion vehicle for the promotion period.

1.1. Product Details


Due to the confidentiality needs of the data owner, the authors are unable to disclose the product
details. However, it should be noted that these products belong to the Retail Industry and to the
category identified in the industry as CPG (Consumer Packaged Goods). Toothpastes, soaps,
detergents, etc. are examples of products in this category.

1.2. Promotion Details

For the same reasons of confidentiality as with the product details, the authors are unable to
disclose the promotion details either. Hence, the promotions are identified using opaque unique
identifiers. Given the industry, one should consider in-house display, discount coupons, bundled
packaging, etc. to be examples of promotions.

2. Business Problem

The scope of a data mining exercise in Promotion Analysis typically involves the following
Business Impact Study items, quoting Cequity Solutions [i]:

• Incremental sales & lift, volume impact and margin potential of promotion
• Impact of promotion across different dimensions like geography, season, customer set,
products, category etc
• The best channel for promotion delivery
• Promotion elasticity for price, delivery, & media
• Forecasts of sales for promoted products

It is known that in the Retail Industry, about 71% of the major players employ some form of
Promotion Analysis [ii].

This is unsurprising, given that research by organizations involved in trade promotions
indicates that for every billion dollars of promotion spend, upwards of $50 million is poorly
spent [iii].

Of the areas listed above, this study focuses on identifying the best individual promotion, or
combination of promotions, that gives a lift to the sales of the selected product, Product-1.

3. Details of the Data
The following sub-sections give out the details of the source data used.

3.1. Source
The data was obtained from a market analytics company (the name cannot be disclosed under the
terms of the agreement). The company gets the data from a data centre which collects data from
FMCG (called CPG in the US and EU markets) stores across the European and American markets.
The data mainly contains information about the volume and unit sales of the different FMCG/CPG
products that the stores sell, along with information on average price, the different promotions,
etc.

3.2. Quality
The data used is quality-checked (QC'ed) data. The market research company receives the raw data
from the data centre, and a team of QA staff cleans it of errors and of misleading or obviously
incorrect entries. Any such anomalies are checked and resolved through regular calls and emails
with the data centre executives. The cleaned data was used for the analysis.

3.3. Description of the variables


There are 101 fields in all. The table below describes them in further detail:

Field Name    Data Type           Use/Remark
WEEK          Set (1501..1565)    The week for which the sales data is collected.
STORE         Set                 The unique identifier of the 8 stores whose sales data is included.
FEAT[1-14]    Boolean             Denotes if a particular feature is part of the product or not.
DISP[1-14]    Boolean             Denotes if a particular promotion was running or not.
VOL[1-14]     Range               Normalized sales volume of the product.
NUPC[1-14]    Range               Normalized number of UPCs in a product (UPC = Universal Product Code, i.e. packaging/flavour/etc. variant identifier).
PRIU[1-14]    Range               Denotes price per unit for each of the products.
TREND         Range               Denotes change in Sales Data.
LBPR[1-14]    Range               Log of Base Price per Unit for each ‘n’ Product Ids.
LPI[1-14]     Range               Log of price index, where price index is the index between the base price and the promoted price. This helps in identifying the effect of change in promotion price.
Table 1: Variables in the Data
Furthermore, the following synthesized variable columns were used for our analysis:

Field Name      Data Type                              Use/Remark
CHG_VOL1_BIN    Set {Decrease, No_change, Increase}    Change in sales volumes of Product-1 during every week, as compared to the previous week.
COMP_FEA        Range [1-13]                           Sum of all features of products 2-14. This gives a hint on competition to Product-1 with all other products considered as competition.
COMP_DISP       Range [1-13]                           Sum of number of promotions for products 2-14. This gives a hint on competition to Product-1 with all other products considered as competition.
Table 2: User Computed Variables in the Data

3.4. Costs of Prediction errors

The costs of the prediction errors entirely depend on the business strategies we need to employ,
as put in the table below:

Strategy-1 (Profit Protection): Limit your promotion budget and protect your profit margin even
at the cost of a loss in sales.
    Cost of errors: In this case, the cost of a predicted INCREASE and an actual DECREASE would
    be very high, and this error has to be avoided.

Strategy-2 (Market Share Protection): Do not lose on sales even at the risk of exceeding the
promotion budget or a decrease in profit margin.
    Cost of errors: In this case, the cost of a predicted DECREASE and an actual INCREASE would
    be very high, and this error has to be avoided.

Strategy-3 (Balance Market Share & Profitability): Neither lose sales nor lose profits beyond a
certain measure.
    Cost of errors: In this case, the overall accuracy bears the most value.
Table 3: Available Business Strategies

4. Analysis

4.1. Assumptions on Data

• Null values for FEAT and DISP can be replaced with zero.
• Null values for PRIU, VOL, NUPC, LPI and LBPR are data-gathering mistakes. The mean of a
number of previous records can be used in their place.

4.2. Technique
We refine the problem statement to the following –

To detect the exact individual feature, or combination thereof, of the product to offer; the
exact individual promotion, or combination thereof, to put on display; and the strategy of
increasing or decreasing the price (as given by the change log) to employ, so as to predict an
increase in the sales volume of Product-1.

In other terms, each time the competition formulates a feature-promotion-pricing strategy, the
seller of Product-1 can formulate a weekly response using the model that we arrive at.

Only the change in sales volume relative to the previous week was considered. Other techniques
(trend over the last n weeks, historical effect by means of a moving average over the last n
weeks, and so on) were set aside, mainly to avoid additional complexity.

In order to avoid the complexity of considering the data of all the competing products
individually, we have summed up the Features and Promotions into composite values (by summing
the binary variables) and used each as a single score against that of Product-1.
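The composite scores described above can be sketched in a few lines. This is a minimal illustration, assuming each store-week record is available as a dictionary keyed by the field names from the variable tables; the helper name composite_score and the sample record are hypothetical and are not taken from the Clementine stream.

```python
# Product numbers treated as competitors of Product-1 (products 2-14).
COMPETITOR_IDS = range(2, 15)


def composite_score(record, prefix, product_ids=COMPETITOR_IDS):
    """Sum the 0/1 flags named <prefix><n>; absent or null flags count as 0."""
    return sum(int(record.get(f"{prefix}{n}") or 0) for n in product_ids)


# Hypothetical store-week record, for illustration only.
record = {"FEAT1": 1, "DISP1": 0, "FEAT2": 1, "FEAT7": 1, "DISP3": 1, "DISP10": None}

comp_fea = composite_score(record, "FEAT")   # competing features in force
comp_disp = composite_score(record, "DISP")  # competing promotions in force
```

Columns that are dropped during data preparation simply contribute nothing to the sum, which is why the sketch treats a missing key the same as a zero flag.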

4.2.1. Stage-1

In the first stage, we attempted to predict the scale of the change in sales volume by dividing
it into five bins: SIGNIFICANT_DECREASE, DECREASE, NO_CHANGE, INCREASE and
SIGNIFICANT_INCREASE. The DECREASE and INCREASE bins lie on the negative and positive sides of
0% change, within a distance of 20%, while anything beyond that in the respective direction is
considered SIGNIFICANT.
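A minimal sketch of this five-way binning follows. The function name stage1_bin is hypothetical, and two details are assumptions rather than facts from the report: a change of exactly 0% is treated as NO_CHANGE, and a change of exactly 20% in either direction is kept in the non-significant bin.

```python
def stage1_bin(pct_change, threshold=20.0):
    """Map a week-over-week percentage change to one of the five Stage-1 bins."""
    if pct_change == 0:
        return "NO_CHANGE"
    if pct_change > 0:
        # Assumption: the boundary value itself is not "significant".
        return "INCREASE" if pct_change <= threshold else "SIGNIFICANT_INCREASE"
    return "DECREASE" if pct_change >= -threshold else "SIGNIFICANT_DECREASE"
```

For example, a 12.5% rise falls into INCREASE, while a 35% drop falls into SIGNIFICANT_DECREASE.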

Drawbacks:

This attempt to predict not just the direction but also the extent of the change did not yield
significant accuracy. None of the C5.0, CART or ANN techniques yielded an accuracy beyond 51%.
We conclude that the extent of change either:
a. lacks the input fields required to make it predictable, or
b. cannot be predicted at all, given that even the same set of features, promotions and pricing
strategy cannot guarantee a repeat of the sales increase to the same extent.

4.2.2. Stage-2
As a modification, we attempted to predict only the direction of change in sales volume, thereby
using only three bin values: DECREASE, NO_CHANGE and INCREASE. After this modification to the
business problem, a respectable prediction accuracy of 72.67% was reached using the CART
technique.
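The three-way target can be derived from a weekly volume series as sketched below. This assumes the volumes of one store are already sorted by week; the function name direction_labels and the sample values are hypothetical, and NO_CHANGE is assumed to mean exactly equal volumes.

```python
def direction_labels(volumes):
    """Label each week (from the second onward) against the previous week."""
    labels = []
    for previous, current in zip(volumes, volumes[1:]):
        if current > previous:
            labels.append("INCREASE")
        elif current < previous:
            labels.append("DECREASE")
        else:
            labels.append("NO_CHANGE")
    return labels


# Hypothetical normalized VOL1 values for four consecutive weeks in one store.
weekly_vol1 = [0.42, 0.47, 0.47, 0.31]
labels = direction_labels(weekly_vol1)
```

The first week of each store has no predecessor and therefore receives no label in this sketch.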

4.3. Data Preparation Process


The original data file was in Excel format and contained 101 fields and 23,378 records. The
fields have already been described in the Variable Description section. The data cleaning and
preparation was done based on the domain understanding of the data.
Attaching a Quality node to the original data in Clementine revealed all the missing values. A
snippet of the node output is shown here; the full list is available in the project CD.

Field % Complete Valid Records Null Value Empty String White Space
DISP1 98.97 23138 240 0 0
DISP10 67.02 15669 7709 0 0
DISP11 0.16 37 0 23341 23341
DISP12 0 0 0 23378 23378
DISP13 80 18702 4676 0 0
DISP14 100 23378 0 0 0
DISP2 95.91 22421 957 0 0
DISP3 89.12 20834 2544 0 0
DISP4 96.42 22541 837 0 0
DISP5 0.3 69 0 23309 23309
DISP6 0.02 4 0 23374 23374
DISP7 86.23 20160 3218 0 0
DISP8 0 0 0 23378 23378
DISP9 0 0 0 23378 23378
Figure 1: Quality of original data – Snippet 1

Field % Complete Valid Records Null Value Empty String White Space
PRIU1 98.97 23138 240 0 0
PRIU10 67.02 15669 7709 0 0
PRIU11 0.16 37 0 23341 23341
PRIU12 0 0 0 23378 23378
PRIU13 80 18702 4676 0 0
PRIU14 100 23378 0 0 0
PRIU2 95.91 22421 957 0 0
PRIU3 89.12 20834 2544 0 0
PRIU4 96.42 22541 837 0 0
PRIU5 0.3 69 0 23309 23309
PRIU6 0.02 4 0 23374 23374
PRIU7 86.23 20160 3218 0 0
PRIU8 0 0 0 23378 23378
PRIU9 0 0 0 23378 23378
Figure 2: Quality of original data - Snippet 2

Significant observations regarding the quality:


• There are no values for the DISP12 attribute. The same holds for FEAT12, PRIU12, VOL12, etc.
• There are only 37 records that have a value for DISP11, FEAT11, etc.
• Similarly, DISP8 and DISP9 (and the related fields) have no valid records, while DISP5 and
DISP6 have a negligible number of valid records.
• As seen in Figure 2, there were also null values for the PRIU, VOL, NUPC, LPI and
LBPR fields.
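The completeness figures reported by the Quality node can be reproduced with simple counting, as in the sketch below. It is not the Clementine implementation: the function names are hypothetical, and the 1% cut-off used to call a column "negligible" is an assumption, since the report does not state the threshold that was applied.

```python
def completeness(values):
    """Return (valid_count, percent_complete) for one column.

    None, empty strings and whitespace-only strings count as missing.
    """
    valid = sum(1 for v in values if v is not None and str(v).strip() != "")
    pct = 100.0 * valid / len(values) if values else 0.0
    return valid, round(pct, 2)


def classify_column(pct_complete, negligible_below=1.0):
    """Suggest a clean-up action for a column (threshold is an assumption)."""
    if pct_complete == 0:
        return "drop column"
    if pct_complete < negligible_below:
        return "drop column and its valid rows"
    return "impute missing values"


# Synthetic column with the same counts as DISP1: 23138 valid values, 240 nulls.
disp1_like = [1] * 23138 + [None] * 240
valid, pct = completeness(disp1_like)
```

With these counts the sketch returns 23138 valid records and 98.97% completeness, matching the DISP1 row of Figure 1.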

Strategy for cleaning up the data:


• Drop the columns that have zero valid records
• For columns with negligible valid records, drop the column, as well as the rows where
actual valid records were present
• Missing values for DISP and FEAT are replaced with Zero
• Missing values for PRIU, VOL, NUPC, LPI and LBPR are replaced with the mean of
previous 1000 records
• Develop a Clementine stream to prepare the data
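The two imputation rules in this strategy can be sketched as follows. This is an illustration rather than the Clementine stream itself: the function names are hypothetical, the window is assumed to contain only originally observed values (imputed values are not fed back in), and a missing value with no earlier observation is left as None.

```python
from collections import deque


def fill_flag(value):
    """Missing DISP/FEAT flags are replaced with zero."""
    return 0 if value is None else value


def fill_with_trailing_mean(values, window=1000):
    """Replace each missing value with the mean of the previous observed values.

    At most `window` of the most recent observed values are averaged.
    """
    recent = deque(maxlen=window)
    filled = []
    for value in values:
        if value is None:
            filled.append(sum(recent) / len(recent) if recent else None)
        else:
            recent.append(value)
            filled.append(value)
    return filled


# Small window used here only to keep the example readable.
example = fill_with_trailing_mean([2.0, 4.0, None, 6.0, None], window=2)
```

In the example, the first gap is filled with the mean of 2.0 and 4.0, and the second with the mean of 4.0 and 6.0.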

Outcome of data preparation:


• All attributes for products 5, 6, 8, 9, 11 and 12 were dropped; that is, the associated
FEAT, DISP, PRIU, VOL, NUPC, LPI and LBPR attributes were removed
• 107 records were dropped because they had entries for the dropped columns
• All missing values were populated
• The output of the Clementine Quality node on the cleaned data is shown in Figure 3

Field % Complete Valid Records Null Value Empty String White Space
DISP1 100 23271 0 0 0
DISP10 100 23271 0 0 0
DISP13 100 23271 0 0 0
DISP14 100 23271 0 0 0
DISP2 100 23271 0 0 0
DISP3 100 23271 0 0 0
DISP4 100 23271 0 0 0
DISP7 100 23271 0 0 0
FEAT1 100 23271 0 0 0
FEAT10 100 23271 0 0 0
FEAT13 100 23271 0 0 0
FEAT14 100 23271 0 0 0
FEAT2 100 23271 0 0 0
FEAT3 100 23271 0 0 0
FEAT4 100 23271 0 0 0
FEAT7 100 23271 0 0 0
Figure 3: Quality of cleaned data

4.4. Choice of Data mining technique
Three data mining techniques were attempted: Classification & Regression Tree (CART), the C5.0
decision tree model, and Artificial Neural Networks (ANN). The CART technique consistently
yielded the highest overall prediction accuracy. However, the final choice of model depends
entirely on the business strategy employed (as detailed in the Costs of Prediction errors section):

Columns:
  Profitability (protect/improve) Strategy: use lowest Predicted-INCREASE, actual DECREASE
  Market Share (protect/improve) Strategy: use lowest Predicted-DECREASE, actual INCREASE
  Balanced Strategy (protect/improve both Profits & Shares): use Overall Accuracy

Technique           Profitability   Market Share   Balanced
C5.0                14.07%          54.41%         70.11%
ANN                 37.04%          83.57%         72.43%
CART                24.85%          71.16%         72.67%
Chosen Technique    C5.0            C5.0           CART

Table 4: Choice of technique based on Strategy Adopted
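The selection rule behind Table 4 can be written out explicitly. The sketch below copies the percentages from the table and picks the lowest value for the two error-based strategies and the highest value for the accuracy-based one; the dictionary layout, the strategy keys and the function name choose_technique are hypothetical.

```python
# Percentages as reported in Table 4.
TABLE_4 = {
    "C5.0": {"profitability": 14.07, "market_share": 54.41, "balanced": 70.11},
    "ANN": {"profitability": 37.04, "market_share": 83.57, "balanced": 72.43},
    "CART": {"profitability": 24.85, "market_share": 71.16, "balanced": 72.67},
}

# The first two columns are treated as error rates, the third as an accuracy.
LOWER_IS_BETTER = {"profitability": True, "market_share": True, "balanced": False}


def choose_technique(strategy):
    """Return the technique that Table 4's rule selects for a strategy."""
    pick = min if LOWER_IS_BETTER[strategy] else max
    return pick(TABLE_4, key=lambda name: TABLE_4[name][strategy])


chosen = {strategy: choose_technique(strategy) for strategy in LOWER_IS_BETTER}
```

Applied to the reported figures, this reproduces the "Chosen Technique" row: C5.0, C5.0 and CART.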

5. Conclusion
We conclude that the use of data mining results largely depends on the Business strategy chosen
to be applied.

5.1. Actual Results


The actual accuracy and coincidence matrices from each of the techniques are given below, as
obtained by analysing the data with the IBM SPSS Clementine tool.

Coincidence matrices and overall accuracy

C5.0         Decrease   Increase   Error
Decrease     2496       409        14.07917
Increase     1249       1491       54.41606
No change    39         3          7.142857

C5.0 overall accuracy
Correct      3987       70.10726
Wrong        1700       29.89274
Total        5687

CART         Decrease   Increase   Error
Decrease     2183       722        24.8537
Increase     790        1950       71.16788
No change    32         10         23.80952

CART overall accuracy
Correct      4133       72.67452
Wrong        1554       27.32548
Total        5687

ANN          Decrease   Increase   Error
Decrease     1829       1076       37.03959
Increase     450        2290       83.57664

ANN overall accuracy
Correct      4119       72.42835
Wrong        1568       27.57165

Table 5: Results of the Data Mining.
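The arithmetic behind Table 5 can be checked with a short sketch. It reads rows as actual classes and columns as predicted classes, which is consistent with how Table 4 labels the 14.07% figure; the function names are hypothetical. Only C5.0 and CART are included, because the ANN matrix as printed has no "No change" row. One point worth verifying against the original Clementine output: the Error value printed for the Increase row (54.41606 for C5.0) equals 1491/2740, the share of actual increases that were predicted as Increase, whereas the share predicted as Decrease is 1249/2740, about 45.58%.

```python
PREDICTED = ("Decrease", "Increase")  # column order in Table 5

# Counts copied from Table 5; each row is (predicted Decrease, predicted Increase).
MATRICES = {
    "C5.0": {"Decrease": (2496, 409), "Increase": (1249, 1491), "No change": (39, 3)},
    "CART": {"Decrease": (2183, 722), "Increase": (790, 1950), "No change": (32, 10)},
}


def overall_accuracy(matrix):
    """Return (correct, total, percent_correct) for one coincidence matrix."""
    total = sum(sum(row) for row in matrix.values())
    correct = sum(
        row[PREDICTED.index(actual)]
        for actual, row in matrix.items()
        if actual in PREDICTED  # "No change" is never predicted, so never correct
    )
    return correct, total, 100.0 * correct / total


def misclassification_rate(matrix, actual):
    """Percentage of records of class `actual` that received another label."""
    row = matrix[actual]
    correct = row[PREDICTED.index(actual)] if actual in PREDICTED else 0
    return 100.0 * (sum(row) - correct) / sum(row)


c50_correct, c50_total, c50_pct = overall_accuracy(MATRICES["C5.0"])
cart_correct, cart_total, cart_pct = overall_accuracy(MATRICES["CART"])
```

The overall accuracy figures (3987 of 5687 for C5.0, 4133 of 5687 for CART) agree with the table under this reading.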

5.2. Strategic recommendations
As can be noted from Table 4, the choice of model depends on the actual Business Strategy
chosen. We recommend that the result predicted by the CART model be used for adopting a
promotion-feature-pricing strategy as a response to the competition’s moves.

5.3. Further direction


We list the following limitations of our analysis so far.
Improvements in the Data Mining exercise:
a. Some of the products’ sales may not be directly correlated to that of the product of
interest. It would be of interest to narrow down those products which are actually cutting
the sales of Product-1. Such an analysis would require the model to consider the scores of
all the products individually, the accuracy of which was observed to be of limited use.
Hence, we have chosen to use a composite score. It would be advantageous to analyze the
cause of the low accuracy when sales are considered individually, and then remedy the
same.
b. We have considered the changes in sales volumes only in relation to the immediately
preceding week. This is not realistic enough when the effects of a promotion can last
more than a week. In this case, it would be an improvement to consider:
i. Trend over the past N-weeks, or
ii. Moving average over the past N-weeks.
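As a rough illustration of the second option, the direction of change could be measured against a moving average of the preceding N weeks instead of the single previous week. The sketch below is an assumption about how such a baseline might be built, not something that was run in this study; the function names and sample volumes are hypothetical.

```python
def moving_average(volumes, n):
    """Mean of the n weeks preceding each week; None until n weeks exist."""
    return [
        None if i < n else sum(volumes[i - n:i]) / n
        for i in range(len(volumes))
    ]


def direction_vs_baseline(volumes, n):
    """Label each week against the moving average of the previous n weeks."""
    labels = []
    for current, baseline in zip(volumes, moving_average(volumes, n)):
        if baseline is None:
            labels.append(None)  # not enough history yet
        elif current > baseline:
            labels.append("INCREASE")
        elif current < baseline:
            labels.append("DECREASE")
        else:
            labels.append("NO_CHANGE")
    return labels


# Hypothetical weekly volumes for one store.
baseline_labels = direction_vs_baseline([10, 20, 12, 16], n=2)
```

Here the third week (12) is compared with the mean of 10 and 20, and the fourth week (16) with the mean of 20 and 12.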

Improvement in the input data quality:


a. The data set consists of only about 23,000 records. A larger data set might reveal
further trends.

6. Appendix
The Data Cleaning stream

The mining stream for approach-1

The Data Mining Analysis Stream for approach-2

Analysis using C5.0

Gains chart for C5.0

Analysis using CART

Gains chart for CART

Analysis and Gains chart using ANN

Gains chart for ANN
7. References

[i] http://www.cequitysolutions.com/inner.php?p=9&sub=50#pa
[ii] http://www.aberdeen.com/summary/report/benchmark/RA_PromOpt_SA_3909.asp
[iii] http://www.managesmarter.com/msg/content_display/marketing/e3i1af498fbe3e69d89356d126fc11c1617
