
Market Research Analysis:

Café Sales
Agenda
• Problem Statement
• Data Summary
• Exploratory Data Analysis (EDA) & Time and Data Analysis (TDA)
• EDA and TDA Recommendations
• RFM - Menu Items Insights & Recommendations
• Market Basket Analysis
• Analyses of Associations
• Possible Combos/Lucrative Offer Suggestions
Problem Statement
• The dataset holds the café's sales details, such as items sold, their categories,
quantity and revenue, over a period of 1 year
• We analyze this data and provide relevant insights and suggestions to
help increase the revenues of the café

• Sample Data:
DateTime             Bill Number  Item Desc                  Quantity  Rate  Tax    Discount  Total   Category
2010-04-01 13:15:11  G0470115     QUA MINERAL WATER(1000ML)  1         50    11.88  0         61.88   BEVERAGE
2010-04-01 13:15:11  G0470115     MONSOON MALABAR (AULAIT)   1         100   23.75  0         123.75  BEVERAGE
2010-04-01 13:17:35  G0470116     MASALA CHAI CUTTING        1         40    9.5    0         49.5    BEVERAGE
2010-04-01 13:19:55  G0470117     QUA MINERAL WATER(1000ML)  1         50    11.88  0         61.88   BEVERAGE
2010-04-01 01:20:18  G0470283     MOROCCAN MINT TEA          1         45    10.69  0         55.69   BEVERAGE


Data Summary
• There are a total of 145776 rows of data with 10 columns in the dataset with no
null values present in any column

• The date ranges confirm that the records of the café are for the FY 2010-11

• There are 8 unique sales categories

• The same Bill number is present for multiple entries against different categories
with 69965 unique bills generated in sales during the whole FY

• There are 575 unique items that have been billed by the café during the whole FY
Data Summary (cont.)
Modifications:
• The LIQUOR & TOBACCO category, which has only 54 entries (0.03% of data), is
dropped so that the LIQUOR and TOBACCO categories can be analyzed
independently

• The Date and Time columns are merged into a single DateTime column for
singular analysis

• There are 680 rows holding duplicate values; but given the nature of data
(multiple orders under same bill number), this may be expected (Check data with
SME). Due to this, we retain the duplicates in the dataset.
Descriptive Stats
Column    Description           Type     Mean    Std     Min   Max       25%     50%     75%  Mode
Quantity  Quantity of purchase  int64    1.12    0.48    1     30        1       1       1    1
Rate      Price per item        float64  161.65  102.03  0.01  2100      95      125     225  245
Tax       Tax amount            float64  48.88   40.17   0     2731.25   22.56   32.06   72   78.4
Discount  Discount amount       float64  0.09    3.72    0     825       0       0       0    0
Total     Total Price           float64  224.75  164.55  0.01  14231.25  117.56  167.06  315  323

Column       Description                Type            # of Unique  Most Common            Count  Remarks
Bill Number  Unique bill number         object          69982        G0490530, G0518006     23
Category     Total price computed       object          8            FOOD                   56658
Item Desc    Purchase item description  object          580          NIRVANA HOOKAH SINGLE  579
Date         Date of purchase           datetime64[ns]  -            2010-12-31             826    Range: [2010-04-01] to [2011-03-31]
Time         Time of purchase           object          36200        20 (8 PM)

• Analyzing the descriptive stats of the numerical columns, we confirm that all
the data look to be valid
Exploratory Data Analysis

Column Skewness
Total 7.25
Discount 106.25
Rate 1.56
Tax 4.82
Quantity 9.98
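As a rough illustration of how skewness values like those tabulated above can be computed, here is a minimal sketch using the population (Fisher-Pearson) definition. The deck does not say which estimator or tool produced its figures, so this is an assumption; the `quantities` list is hypothetical data, not taken from the café dataset.

```python
from statistics import fmean

def skewness(values):
    """Population skewness: third central moment divided by std**3."""
    n = len(values)
    mean = fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant data has no skew
    return m3 / m2 ** 1.5

# Hypothetical quantities: mostly 1, with one large order (long right tail).
quantities = [1, 1, 1, 1, 1, 1, 2, 2, 3, 30]
symmetric = [1, 2, 3, 4, 5]

print(skewness(quantities) > 0)  # a right-tailed column gives a positive value
print(skewness(symmetric))       # symmetric data gives zero
```

A positive result corresponds to the longer right tail noted in the highlights. Sample-adjusted estimators, such as the one pandas uses in `skew()`, give slightly different values on the same data.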
Category Wise Sales Count
Category Wise Revenue Earned
Category Wise Rate-Tax-Discount
Top Sold Items in Top Categories
Top 10 Bills with Category Division
EDA Highlights
• The box plots show the presence of outliers for all the numerical data and all the outliers are present
above the upper whisker.

• The numerical data are all positively skewed with a longer right tail, and this is confirmed by the
tabulated skewness coefficient for each of the columns.

• Out of the 7 categories, the top 4 categories make up 98% of all the orders.

• The most popular category is FOOD holding 39% of the orders, followed by the BEVERAGE category at
~30% and then the TOBACCO category at ~25%. The LIQUOR category holds only 4.25% of all the
orders.

• The most popular item is NIRVANA HOOKAH SINGLE which falls under TOBACCO

• Although FOOD is the most popular category, the TOBACCO category contributes the most revenue.
This is followed by FOOD, BEVERAGE and LIQUOR.
• The top 4 categories (FOOD, BEVERAGE, TOBACCO and LIQUOR) contribute 98% of the
total revenue for the café in the FY
EDA Highlights (cont.)
• Tobacco has the highest average Rate and Tax and the lowest average discount. Wines have both a high
Rate and Tax, with a high discount offered as well.

• Merchandise has the highest average discount.

• The largest bill is a single order of Miscellaneous. The next 5 top bills have FOOD as the biggest revenue
contributors

• Merchandise category is not seen in the top 10 bills

• Wine category sales contribute heavily towards the 10th largest bill
EDA – Recommendations
• Categories such as [Miscellaneous] and [Liquor & Tobacco] should be moved into the named
categories for better analysis capability

• Hold guided wine tastings with sommeliers as a wine festival to encourage customers to understand
the wide range of wines offered by the café and thus make higher wine purchases. These can be
priced nominally and held every quarter, with some money spent on advertising the event. The
success of the event can be gauged at the end of the year to decide whether to continue it
or not.

• Merchandise purchases are low. This may be due to a smaller range of items or to
unavailability of the items. Some effort should be put into designing better merchandise.
Small art competitions with the café as the theme can be held, with the winners’ artwork being used
on the merchandise.

• Sports screening, live music can be considered if the venue allows it


TDA – Whole Year Sales and Revenue
TDA – Quarter Wise Revenue
TDA – Month Wise Revenue
TDA – Days of Week Revenue
TDA – Revenue per Hour
TDA – Odd Hours Revenue
TDA – Insights
• The number of bills and the revenue generated over the whole FY follow the same trend,
with the same peaks and dips

• Highest sales of the FY are seen on New Year's Eve (31/12/2010)

• Top sales are seen in Q4-2010 and Q1-2011

• Quarterly sales across the FY are close to one another, each quarter contributing
more than 20% of the total

• Although Q4-2010 has the highest sales, this is only due to Dec-2010 having the highest sales in the
whole FY. Oct-2010 and Nov-2010 record only average sales

• 600+ bills are seen only 3 times in the data: 3Apr2010, 21Jan2011 and 29Jan2011

• The highest sales are seen in Dec-2010 and the lowest sales in June-2010.
TDA – Insights (cont.)
• Friday, Saturday and Sunday see the highest sales with peak seen on Saturday

• Increased sales are seen post 7PM with the crowd staying beyond 12AM; the busiest hour for the
café is 8PM

• The café is open late on some days with orders seen beyond 12AM

• The café has some sales at odd timings between 2AM and 9AM. December has the most sales
during odd hours.

• 31st December sees a lot of early morning sales from 2AM to 5AM

• Dates having sales at 2AM are rare and occur around once a month. This may be attributed to
dinners continuing later into the night

• The 9AM sales are very low in number
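The weekday and hourly views behind insights like these can be reproduced by grouping revenue on parts of the timestamp. The following is a minimal sketch, not the Tableau workflow used in the deck. The first three rows reuse values from the sample data; the fourth row and the helper name `revenue_by` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# (DateTime, Total) pairs in the format of the sample data.
rows = [
    ("2010-04-01 13:15:11", 61.88),
    ("2010-04-01 13:15:11", 123.75),
    ("2010-04-01 13:17:35", 49.5),
    ("2010-04-03 20:05:00", 323.0),  # hypothetical evening order
]

def revenue_by(rows, key):
    """Sum Total per group, where key maps a datetime to a group label."""
    totals = defaultdict(float)
    for stamp, total in rows:
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        totals[key(when)] += total
    return dict(totals)

by_weekday = revenue_by(rows, lambda d: DAYS[d.weekday()])
by_hour = revenue_by(rows, lambda d: d.hour)

print(by_weekday)
print(by_hour)
```

The same helper covers month-wise or quarter-wise views by passing a different `key`, for example `lambda d: d.month`.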


TDA – Recommendations
• To encourage sales on weekdays, events like happy-hours and ladies' nights can be planned.
Karaoke events along with live music can be organized on weekdays.

• As the 9AM sales are extremely low, the opening timings of the café should be checked. In case
the café wants to enter the early morning market, breakfast offers and a larger breakfast menu
should be investigated and implemented

• Low revenue months can have month long festivals like summer fiesta and grill festival with short
term special menu items.

• Holidays can have special menu items added to generate interest and encourage families to dine
out.

• During holidays, the possibility of the café staying open late should be investigated.
Recency Frequency Monetary Model
• We quantitatively rank and group the menu items based on their recency,
frequency and monetary total of all the transactions in the FY
• This is done to identify the best items on the Menu
• RFM model is built using KNIME – the flow is shown below
RFM Model Steps
1. Read the input excel having the café sales data
2. Add the Monetary Field. Here, the Total column is the monetary field
3. Add the Recency Field. Here, we subtract each transaction date from 1Apr2011
to generate the values for the recency field
4. Aggregate the data as per the Item Description column
5. Generate the Frequency field – This is count of the Bills which have
ordered each unique item
6. Generate the RFM bins with the Auto-Binner
7. Rename the bins to Low-Medium-High for Recency, Frequency and
Monetary Bins
8. Write the output to file
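The steps above can be sketched in plain Python as follows. This is only an illustration of the idea, not the KNIME flow itself: the rows are hypothetical, recency is taken as days since an item's most recent sale, and the three-way rank-based split in `label` is an assumed stand-in for the Auto-Binner, whose settings the deck does not state.

```python
from collections import defaultdict
from datetime import date

REFERENCE = date(2011, 4, 1)  # reference date from step 3

# Hypothetical rows: (date, bill number, item, total).
rows = [
    (date(2011, 3, 31), "B1", "HOOKAH", 300.0),
    (date(2011, 3, 30), "B2", "HOOKAH", 300.0),
    (date(2011, 3, 29), "B5", "HOOKAH", 300.0),
    (date(2011, 3, 20), "B6", "SHAKE", 150.0),
    (date(2010, 12, 1), "B3", "SHAKE", 150.0),
    (date(2010, 5, 10), "B4", "MUG", 75.0),
]

def rfm_table(rows):
    """Aggregate per item: days since last sale, distinct bills, summed total."""
    last_sale, bills, money = {}, defaultdict(set), defaultdict(float)
    for day, bill, item, total in rows:
        last_sale[item] = max(last_sale.get(item, day), day)
        bills[item].add(bill)
        money[item] += total
    return {
        item: {
            "recency": (REFERENCE - last_sale[item]).days,
            "frequency": len(bills[item]),
            "monetary": money[item],
        }
        for item in last_sale
    }

def label(values, higher_is_better=True):
    """Assumed binning: split items into three equal-sized groups by rank."""
    ranked = sorted(values, key=values.get, reverse=higher_is_better)
    names = ["High", "Medium", "Low"]
    return {item: names[min(i * 3 // len(ranked), 2)]
            for i, item in enumerate(ranked)}

table = rfm_table(rows)
# Fewer days since the last sale is better, so recency is ranked ascending.
recency_bin = label({k: v["recency"] for k, v in table.items()},
                    higher_is_better=False)
frequency_bin = label({k: v["frequency"] for k, v in table.items()})
monetary_bin = label({k: v["monetary"] for k, v in table.items()})

for item in table:
    print(item, table[item],
          recency_bin[item], frequency_bin[item], monetary_bin[item])
```

Rank-based bins break ties arbitrarily, so items with equal values can land in different bins; a quantile or fixed-width rule would behave differently.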
RFM Model Output
• The output of RFM has 575 rows with each row corresponding to the unique
Items in the menu.
• We see the RFM have been categorized as Low-Medium-High to get the best items.
• Based on the RFM, we have Gold and Silver Items.
• The Gold and Silver Level legends are seen as below.

                    Monetary
Recency  Frequency  High  Medium  Low  Total
High     High       89    0       6    95
High     Medium     4     9       53   66
High     Low        0     6       5    11
Medium   High       35    0       12   47
Medium   Medium     14    17      131  162
Medium   Low        0     40      11   51
Low      High       1     0       0    1
Low      Medium     1     8       48   57
Low      Low        0     64      21   85
Total               144   144     287  575

GOLD Items
SILVER Items
RFM Model Output – Top Items
• There are 89 items which have High-High-High RFM scores
• Sorting these for the highest Monetary, highest Frequency and lowest Recency, we
identify the top performing items
Item Description Category RECENCY FREQUENCY MONETARY
NIRVANA HOOKAH SINGLE TOBACCO 1 8553 2953044.6
SAMBUCA TOBACCO 1 4425 2291058
MINT FLAVOUR SINGLE TOBACCO 1 5817 1840476
CALCUTTA MINT TOBACCO 1 3318 1640555.4
N R G HOOKAH TOBACCO, BEVERAGE 1 2267 1201820.4
GREAT LAKES SHAKE FOOD 1 4895 842906.61
GREEN APPLE FLAVOUR SINGLE TOBACCO 1 2528 793683
JR.CHL AVALANCHE FOOD 1 3314 712825.6
SILVER APPLE SINGLE TOBACCO 1 1971 673510.2
POUTINE WITH FRIES FOOD 1 3464 579930.61
RFM Model Output – Lowest Items
• There are 64 items which have the Low-Low-Low RFM scores
• Sorting these for the lowest Monetary, lowest Frequency and highest Recency, we
identify the low performing items
Item Description Category RECENCY FREQUENCY MONETARY
MOTHERS DAY SPL FOOD 327 5 0.05
CUTTING GLASS MERCHANDISE 271 1 27
DIP BOWL MERCHANDISE 329 1 67.5
MUGS - PLAIN COLOUR MERCHANDISE 363 1 75.38
MOCAFE HOT CHOCOLATE(SF) BEVERAGE 350 1 80.44
GOLD FLAKE ULTRA LIGHTS(20) TOBACCO 347 1 87.6
DECAFFINATE COFFEE FRAPPE BEVERAGE 351 1 92.81
CLASSIC REGULAR TOBACCO 349 1 93.6
INDIA KINGS OCEAN BLUE TOBACCO 222 1 109.8
ADD GROUND MEAT MISC 222 3 111.39
RFM Model Recommendations
• Removing the low performing items from the menu can help reduce the cost of
storing these items and streamline the menu towards higher-revenue items
Item                         Recommendation  Reasoning
MOTHERS DAY SPL              Keep            Festival offer, creates goodwill and encourages return customers.
CUTTING GLASS                Remove          Low demand, ordered rarely, low monetary return
DIP BOWL                     Remove          Low demand, ordered rarely, low monetary return
MUGS - PLAIN COLOUR          Remove          Low demand, ordered rarely, low monetary return
MOCAFE HOT CHOCOLATE(SF)     Remove          Low demand, ordered rarely, low monetary return
GOLD FLAKE ULTRA LIGHTS(20)  Remove          Low demand, ordered rarely, low monetary return
DECAFFINATE COFFEE FRAPPE    Remove          Low demand, ordered rarely, low monetary return
CLASSIC REGULAR              Remove          Low demand, ordered rarely, low monetary return
INDIA KINGS OCEAN BLUE       Remove          Low demand, ordered rarely, low monetary return
ADD GROUND MEAT              Remove          Low demand, ordered rarely, low monetary return
Market Basket Analysis
• Market basket analysis identifies patterns in customer purchases,
specifically menu items that are purchased together. The outputs below are generated
at the end of the operation.
• Support: Percentage of bills that contain all of the items in the itemset. The higher
the support, the more frequently the itemset occurs. Rules with a high support are
preferred since they are likely to be applicable to a large number of future
transactions.

• Confidence: The probability that a bill that contains the items in the itemset (left)
also contains the recommended item (right). The higher the confidence, the greater
the likelihood that the item on the right-hand side will be purchased.

• Lift: The ratio of the rule's confidence to the overall share of bills that contain the
recommended item (right). A lift above 1 means the items occur together more often than
expected if they were independent; the larger the lift, the greater the link between the two products.
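To make the three measures concrete, here is a minimal sketch that computes them for a single rule. It uses the standard definition of lift as confidence divided by the consequent's overall share of bills. The `bills` list and the shortened item names are hypothetical, and the function assumes both items occur at least once.

```python
def rule_metrics(bills, antecedent, consequent):
    """Support, confidence and lift for the rule antecedent -> consequent."""
    n = len(bills)
    both = sum(1 for b in bills if antecedent in b and consequent in b)
    ante = sum(1 for b in bills if antecedent in b)
    cons = sum(1 for b in bills if consequent in b)
    support = both / n             # share of bills containing both items
    confidence = both / ante       # share of antecedent bills with the consequent
    lift = confidence / (cons / n) # confidence relative to the consequent's base rate
    return support, confidence, lift

# Hypothetical bills, each a set of item names.
bills = [
    {"PANINI", "SAMBUCA"},
    {"PANINI", "WATER"},
    {"SAMBUCA"},
    {"WATER"},
]

support, confidence, lift = rule_metrics(bills, "PANINI", "SAMBUCA")
print(support, confidence, lift)
```

In this toy data, half of the PANINI bills contain SAMBUCA and half of all bills contain SAMBUCA, so the lift is exactly 1: buying PANINI tells us nothing extra about SAMBUCA.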
Market Basket Analysis (cont.)
• Market basket analysis has been performed on the items in the menu.
We will be able to use this to give suggestions to customers and design
better menu combos
• We perform the Market Basket Analysis using KNIME
Market Basket Analysis Steps
1. Read the input excel having the café sales data

2. Aggregate (group-by) the data as per the Bill Number column

3. Split the Cells as per the Item Description Column

4. Run the Association Rule Learner on the data to generate the Support-
Confidence-Lift values.
Minimum Support = 0.004
Minimum Confidence = 0.005

5. Split the Column Collection as per Item Description

6. Write the output to file
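The steps above can be approximated outside KNIME with a short script. The sketch below is an assumption-laden stand-in for the Association Rule Learner: it only generates rules between pairs of single items and reuses the thresholds from step 4. The function name `pair_rules` and the sample rows are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations

MIN_SUPPORT = 0.004     # thresholds from step 4
MIN_CONFIDENCE = 0.005

def pair_rules(rows, min_support=MIN_SUPPORT, min_confidence=MIN_CONFIDENCE):
    """rows: iterable of (bill_number, item). Returns rules sorted by lift."""
    baskets = defaultdict(set)
    for bill, item in rows:
        baskets[bill].add(item)  # group items by bill number
    n = len(baskets)

    item_count, pair_count = Counter(), Counter()
    for items in baskets.values():
        item_count.update(items)
        # Count each ordered pair (antecedent, consequent) once per bill.
        pair_count.update(permutations(sorted(items), 2))

    rules = []
    for (ante, cons), both in pair_count.items():
        support = both / n
        confidence = both / item_count[ante]
        if support >= min_support and confidence >= min_confidence:
            lift = confidence / (item_count[cons] / n)
            rules.append((ante, cons, support, confidence, lift))
    return sorted(rules, key=lambda r: r[4], reverse=True)

# Hypothetical (bill number, item) rows.
rows = [
    ("G1", "PANINI"), ("G1", "SAMBUCA"),
    ("G2", "PANINI"), ("G2", "WATER"),
    ("G3", "SAMBUCA"),
    ("G4", "WATER"),
]

for ante, cons, support, confidence, lift in pair_rules(rows):
    print(f"{cons} <--- {ante}  support={support:.3f} "
          f"confidence={confidence:.3f} lift={lift:.3f}")
```

Each co-occurring pair yields two rules, one per direction, with the same support and lift but different confidence.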


Market Basket Analysis - Output
• Post the analysis, we get a total of 20 rules (sorted by Lift in descending order)
Rule# Support Confidence Lift Consequent implies Split Value 1
Rule 1 0.004688 0.125912 1.991728 SAMBUCA <--- B.M.T. PANINI
Rule 2 0.004688 0.074158 1.991728 B.M.T. PANINI <--- SAMBUCA
Rule 3 0.004688 0.098558 1.55903 SAMBUCA <--- QUA MINERAL WATER(1000ML)
Rule 4 0.004688 0.074158 1.55903 QUA MINERAL WATER(1000ML) <--- SAMBUCA
Rule 5 0.004888 0.183871 1.5335 NIRVANA HOOKAH SINGLE <--- RED BULL ENERGY DRINK
Rule 6 0.004888 0.040768 1.5335 RED BULL ENERGY DRINK <--- NIRVANA HOOKAH SINGLE
Rule 7 0.006332 0.12976 1.082208 NIRVANA HOOKAH SINGLE <--- POUTINE WITH FRIES
Rule 8 0.006332 0.052807 1.082208 POUTINE WITH FRIES <--- NIRVANA HOOKAH SINGLE
Rule 9 0.004545 0.071897 1.032697 GREAT LAKES SHAKE <--- SAMBUCA
Rule 10 0.004545 0.065284 1.032697 SAMBUCA <--- GREAT LAKES SHAKE
Rule 11 0.004602 0.123608 1.030905 NIRVANA HOOKAH SINGLE <--- B.M.T. PANINI
Rule 12 0.004602 0.038384 1.030905 B.M.T. PANINI <--- NIRVANA HOOKAH SINGLE
Rule 13 0.005546 0.079655 1.019215 CAPPUCCINO <--- GREAT LAKES SHAKE
Rule 14 0.005546 0.070958 1.019215 GREAT LAKES SHAKE <--- CAPPUCCINO
Rule 15 0.005474 0.045655 0.959812 QUA MINERAL WATER(1000ML) <--- NIRVANA HOOKAH SINGLE
Rule 16 0.005474 0.115084 0.959812 NIRVANA HOOKAH SINGLE <--- QUA MINERAL WATER(1000ML)
Rule 17 0.005989 0.076628 0.922446 MINT FLAVOUR SINGLE <--- CAPPUCCINO
Rule 18 0.005989 0.072092 0.922446 CAPPUCCINO <--- MINT FLAVOUR SINGLE
Rule 19 0.005131 0.073701 0.614677 NIRVANA HOOKAH SINGLE <--- GREAT LAKES SHAKE
Rule 20 0.005131 0.042794 0.614677 GREAT LAKES SHAKE <--- NIRVANA HOOKAH SINGLE
Market Basket Analysis – Output Analysis
Rule# Support Confidence Lift Consequent implies Basket Value
Rule 1 0.004688 0.125912 1.991728 SAMBUCA <--- B.M.T. PANINI
Rule 2 0.004688 0.074158 1.991728 B.M.T. PANINI <--- SAMBUCA
Rule 3 0.004688 0.098558 1.55903 SAMBUCA <--- QUA MINERAL WATER(1000ML)
Rule 1:
Support: 0.5% of bills contain both B.M.T. PANINI and SAMBUCA
Confidence: 12.6% of bills containing B.M.T. PANINI also contain SAMBUCA
Lift: There is a 99% increase in the expectation that a customer who purchased B.M.T. PANINI will also
purchase SAMBUCA
Rule 2:
Support: 0.5% of bills contain both SAMBUCA and B.M.T. PANINI
Confidence: 7.4% of bills containing SAMBUCA also contain B.M.T. PANINI
Lift: There is a 99% increase in the expectation that a customer who purchased SAMBUCA will also
purchase B.M.T. PANINI
Rule 3:
Support: 0.5% of bills contain both QUA MINERAL WATER(1000ML) and SAMBUCA
Confidence: 9.9% of bills containing QUA MINERAL WATER(1000ML) also contain SAMBUCA
Lift: There is a 56% increase in the expectation that a customer who purchased QUA MINERAL
WATER(1000ML) will also purchase SAMBUCA
Market Basket Analysis – Output Analysis (2)
Rule# Support Confidence Lift Consequent implies Basket Value
Rule 4 0.004688 0.074158 1.55903 QUA MINERAL WATER(1000ML) <--- SAMBUCA
Rule 5 0.004888 0.183871 1.5335 NIRVANA HOOKAH SINGLE <--- RED BULL ENERGY DRINK
Rule 6 0.004888 0.040768 1.5335 RED BULL ENERGY DRINK <--- NIRVANA HOOKAH SINGLE
Rule 4:
Support: 0.5% of bills contain both SAMBUCA and QUA MINERAL WATER(1000ML)
Confidence: 7.4% of bills containing SAMBUCA also contain QUA MINERAL WATER(1000ML)
Lift: There is a 56% increase in the expectation that a customer who purchased SAMBUCA will also
purchase QUA MINERAL WATER(1000ML)
Rule 5:
Support: 0.5% of bills contain both RED BULL ENERGY DRINK and NIRVANA HOOKAH SINGLE
Confidence: 18.4% of bills containing RED BULL ENERGY DRINK also contain NIRVANA HOOKAH SINGLE
Lift: There is a 53% increase in the expectation that a customer who purchased RED BULL ENERGY
DRINK will also purchase NIRVANA HOOKAH SINGLE
Rule 6:
Support: 0.5% of bills contain both NIRVANA HOOKAH SINGLE and RED BULL ENERGY DRINK
Confidence: 4.1% of bills containing NIRVANA HOOKAH SINGLE also contain RED BULL
ENERGY DRINK
Lift: There is a 53% increase in the expectation that a customer who purchased NIRVANA HOOKAH
SINGLE will also purchase RED BULL ENERGY DRINK
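The percentage readings of lift used in these interpretations follow from simple arithmetic, sketched below for Rule 1 with the values from the output table. The helper names are hypothetical; the second helper relies on the standard identity lift = confidence / support(consequent).

```python
def lift_as_percent_change(lift):
    """Lift of 1.0 means independence; report the change relative to that."""
    return (lift - 1.0) * 100.0

def implied_consequent_support(confidence, lift):
    """Invert lift = confidence / support(consequent) to recover the support."""
    return confidence / lift

# Rule 1 values from the output table: SAMBUCA <--- B.M.T. PANINI
rule1_confidence, rule1_lift = 0.125912, 1.991728

change = lift_as_percent_change(rule1_lift)
sambuca_share = implied_consequent_support(rule1_confidence, rule1_lift)

print(f"{change:.0f}% increase over the baseline")
print(f"implied share of bills with SAMBUCA: {sambuca_share:.3f}")
```

Under that identity, the Rule 1 figures imply that roughly 6.3% of all bills contain SAMBUCA, against 12.6% of the bills that contain B.M.T. PANINI.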
Combo Recommendations
The items below can be offered as combos:
1. B.M.T. PANINI and SAMBUCA

2. QUA MINERAL WATER(1000ML) and SAMBUCA

3. NIRVANA HOOKAH SINGLE and RED BULL ENERGY DRINK

4. NIRVANA HOOKAH SINGLE and POUTINE WITH FRIES

5. GREAT LAKES SHAKE and SAMBUCA


Discount Recommendations
The item combos below can be offered with discounts:
1. 5-15% off on B.M.T. PANINI if purchased with NIRVANA HOOKAH SINGLE

2. 5-15% off on GREAT LAKES SHAKE if purchased with CAPPUCCINO

3. 5-15% off on NIRVANA HOOKAH SINGLE if purchased with QUA MINERAL WATER(1000ML)

4. 5-15% off on CAPPUCCINO if purchased with MINT FLAVOUR SINGLE

5. 5-15% off on GREAT LAKES SHAKE if purchased with NIRVANA HOOKAH SINGLE

Instead of a direct discount, Buy2-Get1 can also be applied to the above
combinations
Tools Used & References
Python
• Initial data analysis and data cleanup
• File Submitted

Tableau
• Exploratory Data analysis
• Time Data analysis
• Link - MRA_Cafe_Analysis | Tableau Public

KNIME
• Recency-Frequency-Monetary (RFM) Analysis
• Market Basket Analysis
• File Submitted
Thank You
