
Making the Case for Quality

February 2017

Using Exploratory Data Analysis to Improve the Fresh Foods Ordering Process in Retail Stores
by Sivaram Pandravada and Thimmiah Gurunatha

At a Glance . . .

• The short shelf lives of fresh foods along with fluctuating consumer demand meant that a
  European retail chain's stores often had to hold clearance sales with zero or negative
  margins or write off inventory.
• Explorative data analysis and basic statistics helped the chain identify and reduce
  inefficiencies in its inventory and ordering process, minimizing the gap between quantities
  sold and quantities ordered.
• Rainbow statistical process control (SPC) charts, a variation on traditional SPC, helped
  ensure ongoing monitoring of the stock-to-sales ratio and triggered corrective action in real
  time, bringing sustainable results within three months.

With the abundance of data now available in the information era, data science offers significant
opportunities to complement quality approaches to problem solving and continuous improvement.
Unfortunately, as Roger D. Peng and Elizabeth Matsui point out in their book, The Art of Data
Science, "Data analysis is hard, and part of the problem is that few people can explain how to do
it."1 Quality professionals may find that applying the principles of data science is not always as
straightforward and formulaic as they would like.

In illustrating how explorative data analysis and basic statistics helped a grocery chain reduce
inefficiencies in its retail inventory and ordering process, this case study presents a real-world
example of how the thought processes of data scientists can contribute to quality practice. Once
the chain saw the difference that the data-based approach made in targeting waste and
inefficiency in one store, it successfully replicated its process improvements to increase
profitability throughout the organization. The project approach and lessons learned can also be
applied in other

A data science approach to assessing the problem

The short shelf lives of fresh foods along with fluctuating consumer demand had been causing a
European retail chain's stores to hold clearance sales with zero or negative margin or to write off
some inventory as shrinkage. Annually, the problem of shrinkage accounted for revenue losses of
up to 20 percent.

Although the ordering process was automated, the algorithm was better suited to articles with
longer shelf life (>90 days); hence, department managers would often override the automatic system
and place orders manually. There were thousands of stock keeping units (SKUs) and no defined
tolerance limits to manage stock and shrinkage across various stores. The organization needed an
approach for monitoring and controlling this waste.

Granular data is key to accurate and productive problem assessment. When it is not available,
significant time goes into defining metrics and capturing measurements before analysis can begin.
With the prevalence of sophisticated database management systems, the trick becomes extracting
information based on the questions that need to be answered to gain a better understanding of

For the problem of shrinkage, using data science and lean Six Sigma started with two questions:

ASQ Page 1 of 5

Question 1: What are we trying to optimize in this project?

Answer: The team seeks to decrease shrinkage in fresh food divisions and also decrease the use of
discount sales that lead to reduced margins.

Question 2: What factors have the most effect on shrinkage and discount sales?

Answer: Shrinkage and discount sales occur as a result of excess inventory in stores because of
improper ordering processes. Key factors to study include the following:

• Identify the biggest contributing categories and departments leading to high shrinkage and
  discount sales
• Quantify the demand or conduct a sales study of these articles by number of SKUs
• Quantify the supply or purchase of these articles by number of SKUs
• Review gaps in the current ordering process leading to supply versus demand mismatch

The Japanese concept of muda, which translates to waste or any activity that consumes resources
but creates no value for the customer, provided direction for the team's next task of identifying
and reducing waste to improve process efficiency. Pareto analysis helped the team understand
which categories and SKUs were contributing to 80 percent of shrinkage and discount sales and
prioritize them for improvement.

Exploratory data analysis

The team used data analytics and lean Six Sigma tools to identify and implement corrective action
that would reduce waste and improve the ordering process. Targeting the SKUs that were
contributing significantly to high shrinkage and discount sales, the team started with analysis of
the daily unit sales of one fresh food SKU with a shelf life of five days and the highest shrinkage
component within the fresh food category.

The histogram data in Figure 1 show daily sales in one store where there was maximum shrinkage
due to this fresh food SKU. One observation that immediately stands out is that the sales data
appear to be normally distributed if three data points for times when demand was an outlier are
omitted from consideration. The distribution is good news from a modeling perspective and in
terms of predictability.

In Figure 1, it is also easy to see that out of 225 days, the store sold its highest quantities on
only a small percentage of days:

• 280 sold on only 19 days
• 320 sold on only nine days
• 400 sold on only two days

Figure 2 presents a histogram for quantities ordered for this same SKU. Reviewing alongside the
data on quantities sold leads to a few notable takeaways:

• The ordering quantity did not follow a normal distribution, even though the process is within
  the control of the stores
• Out of 142 orders placed for the same time period of 225 days, approximately 80 were for
  quantities 300
• Department managers were particularly surprised by the fact that approximately 56 percent of
  the time, an order was placed for a quantity that sold only 4 percent of

A simple comparison of the sales and order quantities summarized by weekday should give any
process owner many subsequent questions to ask the department manager

Figure 1: Histogram of quantity sold and frequency of occurrence (x-axis: quantity sold)

Figure 2: Histogram of quantity ordered and frequency of occurrence (x-axis: quantity ordered)


responsible for ordering. In particular, why is there so much variation between the ordering
process and the quantity sold when the article can be ordered six days in a week and has only
five days of shelf life?

Table 1 focuses on a specific category/SKU and iterates the gaps in quantity of goods sold and
goods received/ordered.

Table 1: Comparison of quantities sold versus quantities ordered by day of the week

            Number of      Quantity   Average of   Number of   Quantity   Average of
            days ordered   received   quantity     days sold   sold       quantity
                                      received                            sold
Sunday                                             32          3,186      100
Monday      22             5,612      255          32          5,951      186
Tuesday     28             7,099      254          32          5,941      186
Wednesday   16             4,776      298          32          5,731      179
Thursday    30             7,827      261          32          5,994      187
Friday      18             5,169      287          32          6,536      204
Saturday    29             8,650      298          33          5,903      179
Total       143            39,133     274          225         39,242     174

Model building and interpretations

Model building is part of the improve
phase of a typical Six Sigma define, measure, analyze, improve, control (DMAIC) project. Having
contributed to defining and understanding the problem and analyzing opportunities for
improvement, data still have a role to play in constructing and implementing solutions for
sustained results.

For this project, a simple model might be a rule-of-thumb estimation or an empirical equation
that can predict the ordering quantity and the quantity to be maintained as minimum inventory. It
involves iterations of steps and assumptions to arrive at an equation that best fits the demand of
the store for the item. One has to make assumptions for the model and conduct backtesting based
on quantities demanded to ascertain that there are no instances of lost sales. Comparing the
quantities ordered against actual lost sales incidents plus shrinkage incidents and discount sale
incidents leads to a calculation of the benefits for the project. In the spirit of lean, a 10
percent reduction in shrinkage within three months was selected as the improvement target. The
basis of the model is to minimize the gap between the quantity sold, represented by the blue line
in Figure 3, and the quantity ordered, shown in red.

Figure 3: Cumulative distribution of quantities sold and ordered (x-axis: data, 0 to 600)
Quantity sold: mean 174.4, StDev 71.06, N 225
Quantity received: mean 273.7, StDev 90.01, N 143

With the model and goals established, interpretation and communication then become
project-specific and store-specific based on assortment demands within different geographies.

This approach is complementary to Six Sigma DMAIC, but instead of emphasizing the conventional
defects per million opportunities (DPMO), waste reduction is measured in terms of money written
off as shrinkage, a metric to which store and department managers can more easily relate. As a
result, the control charts were designed around shrinkage.

Turning data into action

In order to reduce excess ordering, the warehouse used color coding to monitor and track
inventory. The colors of tags on the shelves helped ensure that department managers would see
indications of stock levels during daily physical inspections. Red tags indicated SKUs with high
shrinkage. Blue tags signaled SKUs with high stock levels.

Ordering proceeded according to stock shelf life and quantity sold. The store manager would only
have to look at the tags on the shelf to understand the need to investigate any SKUs that might
lead to shrinkage events and require preventive action.

A rainbow SPC chart (see Tools Used in This Case Study on page 5) was


deployed for real-time identification of spikes in the stock-to-sales ratio, as represented in
Figure 4. The escalation matrix that was established required root cause analysis for any rise or
fall out of the green zone. Entry of the ratio into the red zone would need intervention of the
store manager, while entry into the orange/pink zone would prompt the intervention of the floor
manager. Entry of the ratio into the yellow zone would need the attention of the department
manager who is responsible for placing the order. Acting on the immediate feedback provided in
rainbow SPC monitoring helped reduce shrinkage by 20 percent over a period of one month.

Figure 4 presents the rainbow control chart plotting the stock-to-sales ratio based on quantity.
At various points, the ratio enters the yellow, orange, and red zones, requiring escalation by one
level up at each stage and root cause analysis by different levels of management.

The approach ensured real-time tracking of variation and brought sustainable results within
three months. The team monitored process data for the 10 top-selling articles for each department
using a simple Excel template with a rainbow SPC macro and subsequently built a dynamic
reporting and charting tool using other data visualization software. Internal benchmarking then
allowed the organization to continuously improve the ordering process across store outlets.
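The escalation matrix described above can be sketched in a few lines of Python. This is a minimal illustration only: the zone-to-owner mapping follows the text (yellow to the department manager, orange/pink to the floor manager, red to the store manager), while the function name, the data layout, and the sample readings are hypothetical and not taken from the project's Excel template.

```python
from typing import Iterable, List, Optional, Tuple

# Escalation owners as described in the text; None means no escalation is needed.
ESCALATION: dict = {
    "green": None,
    "yellow": "department manager",
    "orange": "floor manager",   # the text calls this the orange/pink zone
    "red": "store manager",
}


def escalation_events(
    daily_zones: Iterable[Tuple[str, str]]
) -> List[Tuple[str, str, str]]:
    """Return (date, zone, owner) for every day the ratio is outside the green zone.

    Each returned event is a day that, per the escalation matrix, calls for
    root cause analysis by the named owner.
    """
    events = []
    for date, zone in daily_zones:
        owner: Optional[str] = ESCALATION[zone]
        if owner is not None:
            events.append((date, zone, owner))
    return events


# Hypothetical readings, for illustration only (not data from the case study).
sample = [("2 Oct", "green"), ("5 Oct", "yellow"), ("8 Oct", "red"), ("11 Oct", "green")]
events = escalation_events(sample)
for date, zone, owner in events:
    print(f"{date}: {zone} zone, root cause analysis required, escalate to {owner}")
```

Keeping the mapping in one small table mirrors the intent of the matrix: anyone reading a zone color can see immediately who has to act.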

Figure 4: Rainbow chart for sales to inventory management

Pre-control inventory stock management
Product line/type of service: Fresh chicken
Country: Xyz
Quality characteristics: Stock + ordering to sales ratio
Department: Fresh food
Date(s): 1/10/2015 to 30/04/2016
Ordering team/source#: Fresh food
Purpose of data collection: Monitoring ordering process to improve performance
Station number/department: EBITDA Solutions/Country divisional manager
Target: 3 stock/sales
Upper specification limit: 9.5 stock/sales
Lower specification limit: -3.5 stock/sales

[Chart: (Stock + GR)/Sales plotted as percent of tolerance by date, 2 Oct-15 to 14 May-16, with
GO, CAUTION, and STOP zones. Chart annotations: "Get first and second level involved at yellow
and orange"; "Get CEO involved when red".]
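One plausible reading of the quantity plotted in Figure 4 is sketched below. The target and specification limits come from the Figure 4 header. Everything else is an assumption made for illustration: that "GR" stands for goods received, that percent of tolerance is the signed distance from target divided by the one-sided tolerance, and the band edges used to pick a zone. The case study does not state its exact formula or zone boundaries.

```python
TARGET = 3.0   # target stock/sales ratio (Figure 4 header)
USL = 9.5      # upper specification limit (Figure 4 header)
LSL = -3.5     # lower specification limit (Figure 4 header)


def stock_to_sales_ratio(stock: float, goods_received: float, sales: float) -> float:
    """(Stock + GR) / Sales, assuming GR means goods received that day."""
    if sales <= 0:
        raise ValueError("sales must be positive to compute the ratio")
    return (stock + goods_received) / sales


def percent_of_tolerance(ratio: float) -> float:
    """Signed distance from target as a percentage of the one-sided tolerance.

    Assumed definition: +100 at the upper limit, -100 at the lower limit.
    """
    if ratio >= TARGET:
        return 100.0 * (ratio - TARGET) / (USL - TARGET)
    return 100.0 * (ratio - TARGET) / (TARGET - LSL)


def zone(pct: float, green: float = 50.0, yellow: float = 75.0,
         orange: float = 100.0) -> str:
    """Pick a zone from percent of tolerance; band edges are hypothetical."""
    magnitude = abs(pct)
    if magnitude <= green:
        return "green"
    if magnitude <= yellow:
        return "yellow"
    if magnitude <= orange:
        return "orange"
    return "red"


# Hypothetical daily figures, for illustration only.
ratio = stock_to_sales_ratio(stock=400, goods_received=250, sales=100)
pct = percent_of_tolerance(ratio)
print(f"ratio={ratio:.2f}, percent of tolerance={pct:.1f}, zone={zone(pct)}")
```

Expressing the ratio as a percentage of tolerance puts SKUs with different targets on a common scale, which is one reason a single color-banded chart can serve many articles.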


The reduced waste in shrinkage and stock levels has enabled the retailer to generate higher
margins and concentrate more on availability and customer interaction.

References

1. Roger D. Peng and Elizabeth Matsui, The Art of Data Science, 2016-05-18,
artofdatascience.

For More Information

EBITDA Solutions assisted the grocery retail chain with this improvement project. Visit

Find more case studies on the use of quality tools and approaches at

About the Authors

Sivaram Pandravada is managing director, India - EBITDA Solutions.

Thimmiah Gurunatha is senior consulting partner at EBITDA Solutions, Florida, USA. Previously, he
held senior engineering roles for Xerox for 32 years. An ASQ Fellow, Gurunatha is an ASQ
Certified Reliability Engineer (CRE) and Six Sigma Black Belt (CSSBB).

TOOLS USED IN THIS CASE STUDY

This improvement opportunity presented a classic case of the need for control charts, combined
with root cause analysis and an escalation matrix (a simple, visible, reliable, and responsive
process for abridging Six Sigma define, measure, analyze, improve, and control process steps) to
help monitor day-to-day progress. Histograms and statistical process control (SPC) charts
assisted with high-level visualization of sales and ordering management to identify opportunities
to reduce waste.

Histograms are graphs used to show frequency distributions. The bars on the graph provide a
visual representation of how often each value occurs. Learn more about histograms.

Statistical process control charts show how processes change over time. Plotting data on current
performance against the process average and upper and lower control limits based on historical
data provides a look at whether the process continues to perform predictably. Learn more about
control charts.

In this project, rainbow SPC charts, a variation on traditional control charts, helped analyze
data and determine the anomalies to address. Invented by ASQ Fellow Thimmiah Gurunatha, the
rainbow SPC approach seeks to simplify control charts and provide immediate feedback whenever a
significant variation in data occurs. While monitoring continuous and attribute data for
optimization, seven colors are used to indicate escalation actions. Teams can monitor data in
real time to capture evidence of parameters causing variation rather than trying to find these
parameters after the fact. The rainbow approach uses attribute data and continuous data, as well
as life test data for Rainbow Reliability SPC.

Learn more about the rainbow SPC process in Systems Engineering Standards The State of the Art,
Integrating DFR, DFSS, and DFX in Systems Engineering Environment, by Thimmiah Gurunatha.
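The description of statistical process control charts in the tools section (current performance plotted against a process average and control limits derived from historical data) can be illustrated with a short Python sketch. It uses the conventional individuals (XmR) chart, where the limits are the historical mean plus or minus 2.66 times the average moving range. This is a generic textbook construction, not the rainbow SPC macro used in the project, and the numbers are hypothetical.

```python
from statistics import mean
from typing import List, Sequence, Tuple


def xmr_limits(history: Sequence[float]) -> Tuple[float, float, float]:
    """Return (lower limit, center line, upper limit) for an individuals chart.

    Limits are center +/- 2.66 * average moving range, the standard
    constant for moving ranges of two consecutive points.
    """
    if len(history) < 2:
        raise ValueError("need at least two historical points")
    center = mean(history)
    moving_ranges = [abs(b - a) for a, b in zip(history, history[1:])]
    spread = 2.66 * mean(moving_ranges)
    return center - spread, center, center + spread


def out_of_control(points: Sequence[float], lcl: float,
                   ucl: float) -> List[Tuple[int, float]]:
    """Return (index, value) for each current point outside the control limits."""
    return [(i, x) for i, x in enumerate(points) if x < lcl or x > ucl]


# Hypothetical stock-to-sales ratios, for illustration only.
history = [3.0, 3.2, 2.8, 3.1, 2.9, 3.0]
lcl, center, ucl = xmr_limits(history)
flagged = out_of_control([3.1, 3.9, 2.2], lcl, ucl)
print(f"LCL={lcl:.3f}, center={center:.3f}, UCL={ucl:.3f}, flagged={flagged}")
```

Control limits computed this way describe what the process has actually been doing; they are distinct from the specification limits shown in the Figure 4 header, which describe what the business wants.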
