A00181

March 25, 2015

Data Warehousing and Multi-Dimensional Data Modelling

It was January 15, 2015, and Vishal Mehta, Chief Information Officer of Acme Inc., an Indian
retailer of consumer electronics and durable goods, had just wrapped up a meeting with Steve
Barley, Vice President (Retail), and several other key regional and national heads of Acme. The
team discussed the company’s plans for the year ahead and the key decision support
infrastructure woes faced by operational units in data access, integration, analysis, usage and
interpretation. Mehta presented his ideas on how a data warehouse investment could help
address many of these decision support infrastructure challenges. The management was quite
receptive to his ideas and asked him to conduct a feasibility study, build a pilot and prepare a
detailed implementation plan.

Mehta had more than 15 years of experience successfully building and managing a number of
IT projects. However, he lacked sufficient expertise in business intelligence (BI) or data
warehouse (DW)-related projects. BI/DW projects were more challenging to execute than
normal IT projects and required organisation-wide commitment and support. Mehta, therefore,
was contemplating how to build a data warehouse solution to meet his organisation’s business
requirements.

About Acme Inc

Acme Inc. was one of the leading national retail chains in India. The company had over 150
stores spread across nine major states and 24 key cities with floor areas ranging from 15,000 to
24,000 square feet. It was a specialist retailer in the consumer electronics and durables market. A
2012 industry report had projected that the consumer electronics and durables market in India
would reach INR 520 billion by 2015 at a compound annual growth rate (CAGR) of 15%. 1 By
2013, the company had become one of the major players in the Indian market with a market
share of 18-20% and INR 33 billion (3,300 crores) in sales revenue.

Acme’s stores offered its customers merchandise in a variety of branded and non-branded
categories — computers, tablets, cameras, home appliances, kitchen appliances, gaming, and so
on. Each store stocked about 12,000 stock keeping units (SKUs) or products, 180 brands, and
nine major product categories. The company had also established an online presence, selling its
merchandise in around 300 cities and towns.

1 ASSOCHAM. (2012). Emerging trends in consumer electronics and durables industry.

Prepared by Prof. Srikumar Krishnamoorthy, Indian Institute of Management, Ahmedabad.


The case is written based on our analysis of a leading retail business in India from secondary data
sources.
Cases of the Indian Institute of Management, Ahmedabad, are prepared as a basis for classroom
discussion. They are not designed to present illustrations of either correct or incorrect handling of
administrative problems.
©2014 by the Indian Institute of Management, Ahmedabad.
This document is authorized for use only in Prof Lakshmi Shankar Iyer's Business Intelligence and Data Visualization at Christ University from Sep 2022 to Mar 2023.

Store Operations

Acme’s store operations involved a number of activities including indenting, inventory
handling, customer service, workforce allocation and management. These activities were
coordinated by a store manager and executed by sales assistants and other employees.

The store manager replenished the items in the inventory by placing orders to the regional
warehouse. The actual quantity of items to be ordered, including adjustments for lot sizes, was
automatically determined by an information system using standard inventory models (e.g.,
economic order quantity or EOQ). The store manager raised an indent when the store product
inventory reached its reorder limit. To synchronise activities across stores and a regional
warehouse, the organisation followed a policy of indenting once a week on specific days. The
store manager also raised indents on an ad-hoc basis for specific items, especially when there
was a sudden surge in sales and the item stocked out.
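
The case does not describe how Acme's information system computed order quantities beyond naming EOQ as an example. As a minimal sketch, assuming the textbook EOQ formula sqrt(2DS/H) and made-up inputs (the demand, ordering cost, holding cost, lot size and reorder limit below are all hypothetical), the calculation and the reorder check could look like this:

```python
import math


def economic_order_quantity(annual_demand, order_cost, holding_cost_per_unit):
    """Textbook EOQ: sqrt(2 * D * S / H)."""
    if min(annual_demand, order_cost, holding_cost_per_unit) <= 0:
        raise ValueError("all inputs must be positive")
    return math.sqrt(2 * annual_demand * order_cost / holding_cost_per_unit)


def round_up_to_lot(quantity, lot_size):
    """Adjust an order quantity upward to a whole number of lots."""
    return math.ceil(quantity / lot_size) * lot_size


def should_indent(on_hand, reorder_limit):
    """An indent is raised once stock reaches the reorder limit."""
    return on_hand <= reorder_limit


# Hypothetical SKU: 1,200 units/year, INR 500 per order, INR 60 per unit per year to hold
eoq = economic_order_quantity(1200, 500, 60)
order_qty = round_up_to_lot(eoq, lot_size=10)
print(round(eoq, 1), order_qty)
```

The lot-size rounding is applied after the EOQ is computed, which mirrors the "adjustments for lot sizes" mentioned above; a real system might round differently.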

Store employees handled the shipments received from the regional warehouse. These
shipments were first stored in a local storeroom. The items in the container (i.e., cartons, pallets
or bags) were then unpacked to retrieve individual product SKUs. The products were then
moved to the store locations and placed according to the store’s planogram design (or floor
plan). These planograms were generally prepared by merchandising managers, who analysed
numerous variables such as customer purchase histories, demand patterns, local demographics,
customer survey ratings, and so on. Merchandising managers often sought inputs from the local
store managers to come up with better floor plans to improve the store’s profitability. One of
the responsibilities of the store manager was to ensure that the products were stocked as per the
planogram design. Store employees restocked items at store locations by following the
planogram design. Store assistants maintained the aesthetic appeal of the items by cleaning
them and placing or orienting them correctly after a customer visit.

Each store conducted a physical verification of its inventory once every two weeks and reconciled
discrepancies between physical and recorded inventory. Discrepancies could occur for a number of
reasons, such as employee theft, shoplifting, misplaced goods, billing errors, administrative
errors, damage during in-store material handling, expired product shelf life and so on. These
discrepancies (commonly referred to as shrinkage in the industry) accounted for 1.2-1.5% of
Acme’s total sales. The company aimed to reduce this percentage to about 0.6% over the next
three years.

Customer service was another important area of store operations. This included activities like
handling in-store customer queries, assisting customers in identifying and selecting the right
products, processing customer orders, addressing customer complaints and handling product
returns. Effective handling of these activities was of utmost importance to Acme, and
particularly to each of its store managers, in providing the best in-store experience to their
customers.

When a customer placed an order, an invoice was generated at the point-of-sale (POS) terminal
and the item was packed and delivered to the customer’s premises. For specific product SKUs
(say, home appliances like refrigerators, washing machines and televisions), the retail store used
local delivery transporters to ensure the timely delivery of the items to the customer’s location.

As one can understand from the description of various store activities, there were multiple
points where data could be collected. Acme collected sales data at the POS terminals, item
receipt data for items received from the regional warehouse, physical store inventory
information, and periodic customer survey reports.

(See Exhibit 1 for a sample invoice that describes the nature of data collected at POS terminals).

Warehouse Management

The company built a network of regional warehouses to cater to the individual store
requirements in the region. The primary warehouse activities included storage management
and distribution, inventory management, liaising with raw material suppliers and
manufacturers for item procurement, and liaising with transporters for proper and on-time
delivery of shipments.

As part of its distribution activities, each warehouse shipped items to a set of stores within its
region. Typically, stores sent their requests for items or products to their regional warehouse in
the form of an indent. The regional warehouse then prepared a pick list based on the indent, did
the necessary packaging, and transported the items to the stores in delivery trucks/vans. The
warehouse coordinated with its transport service providers on the shipping of items and
tracking shipments to ensure their timely delivery. The company followed a weekly indent
policy to synchronise various warehouse activities.

Each warehouse received its supplies from multiple manufacturers on a periodic basis. The
quantity and periodicity of order placement were determined based on forecast estimates made
by the warehouse managers. Warehouse managers closely worked with manufacturers to
identify the right products for specific regions, providing customer feedback and making sales
data available to facilitate better production planning at the manufacturer’s end.

There were several areas in which the company was trying to make improvements. First, the
forecast estimates were often inaccurate, deviating as much as 30-40% for some SKUs. This
resulted in excess inventory in a few cases and stock-outs in most other cases. Second, there
were frequent complaints from retailers over delayed shipments. Third, there had been many
instances in which items had been damaged. This typically occurred during product shipment
or material handling inside the warehouse, leading to increased shrinkage. The company was
unable to clearly identify the root cause of the problem or bottleneck to take concrete actions to
improve the situation.

Product Promotions

Acme offered the latest good-quality products at reasonable rates, giving its customers high
value for money. The company also partnered with several financial service firms who offered
attractive terms (e.g., low down payment, low interest and flexible payment). In terms of
promotions, the company helped customers get the best value for money through festive offers,
product bundling, attractive discounts and psychological pricing. It also promoted its offerings
through advertisements across a variety of channels, including television, Internet, radio,
billboards, social media contests/games, and the company’s own online e-commerce portal.

The sales and marketing team at Acme was interested in building a loyalty card programme to
retain customers and improve overall profitability. The company also did not have appropriate
metrics in place to measure the effectiveness of its promotional campaigns. Its marketing
managers believed that significant improvements could be made in its promotional spend if the
proper decision support infrastructure was put in place. While basic sales data was available,
there was no way to get an integrated view of sales across multiple stores, slice and dice data,
conduct multi-dimensional data analysis or build data mining models to identify the right
customer targets.

Information Technology Systems at Acme Inc.

Historically, the company had grown through acquisitions. For example, in 2012, the company
acquired a large distribution warehouse company with operations in multiple towns and cities
in India. As a result of this inorganic growth strategy, business processes and IT systems at
Acme were highly heterogeneous. A few operating units (retail stores and regional warehouses)
used Enterprise Resource Planning (ERP) software as part of their order and inventory
management processes; however, there were multiple ERP vendor solutions in different units.
Even in units with the same ERP vendor solutions, their configurations were very different.

Each of the company’s stores and regional warehouses used a relational database to maintain
operational data generated from their business processes. The individual units used different
relational database products, and possibly different versions and/or configurations of the
database products. While there were many differences in the way the data was stored, the
underlying database design could be understood using an entity-relationship (ER) diagram.
(The ER diagram shown in Exhibit 2 gives a partial view of how different entities pertinent to
Acme’s business were related and the nature of their relationships).

The heterogeneity of IT systems at the hardware, system software, business application and
data levels made it extremely difficult for Acme managers to get timely, relevant and accurate
information for making better decisions. For example, information about the movement of
products across the Acme supply chain was available in multiple systems. However, accessing
the information from these systems in a timely manner to determine supply chain operational
efficiency improvement was a daunting task.

The Promise of a Data Warehouse

Bill Inmon, considered to be the father of data warehousing, defined a data warehouse as
follows:

“A data warehouse is a subject-oriented, integrated, time-variant and non-volatile
collection of data in support of management’s decision making process.” 2

The key terms in the above definition are:

2 Han, J., & Kamber, M. (2011). Data mining: Concepts and techniques (3rd ed.). New Delhi: Harcourt India
Private Ltd, Morgan Kaufmann Publishers.

• Subject-oriented: The data in a data warehouse was organised by subject, for example,
vendors, products, stores, partners, etc.

• Integrated: A data warehouse integrated data from multiple, heterogeneous data sources.
The differences in data definitions and inconsistencies across multiple source systems were
resolved before loading the data into the data warehouse.

• Time-variant: Unlike an operational system, a data warehouse maintained historical data. For
instance, while an operational system held the most recent customer location details (or
mobile number/address), a data warehouse could be used to maintain all changes in
customer location details (mobile numbers/addresses).

• Non-volatile: The data warehouse was a write-once, read-many type of system. That is, the
data once written to a data warehouse was rarely updated. The implication of this was that
a data warehouse system did not need to support expensive transaction processing and
recovery control procedures as in the case of operational database systems.
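
The time-variant property can be illustrated with a small sketch. It assumes a versioned-row approach (commonly called a type 2 slowly changing dimension); the class names, the customer ID and the second address are hypothetical, while the first address is taken from the sample invoice in Exhibit 1.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class CustomerAddressVersion:
    customer_id: str
    address: str
    valid_from: date
    valid_to: Optional[date] = None  # None marks the current version


class CustomerAddressHistory:
    """Keeps every address a customer has had, not just the latest one."""

    def __init__(self):
        self.versions = []

    def record_change(self, customer_id, address, effective):
        # Close the currently open version instead of overwriting it.
        for v in self.versions:
            if v.customer_id == customer_id and v.valid_to is None:
                v.valid_to = effective
        self.versions.append(CustomerAddressVersion(customer_id, address, effective))

    def address_as_of(self, customer_id, when):
        for v in self.versions:
            if (v.customer_id == customer_id and v.valid_from <= when
                    and (v.valid_to is None or when < v.valid_to)):
                return v.address
        return None


history = CustomerAddressHistory()
history.record_change("C001", "Mint street, Kolkata", date(2013, 1, 1))
history.record_change("C001", "Park Street, Kolkata", date(2014, 9, 6))  # hypothetical move
print(history.address_as_of("C001", date(2014, 1, 1)))
print(history.address_as_of("C001", date(2015, 1, 1)))
```

An operational system would typically keep only the second address; the history object can still answer what the address was on any earlier date.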

In short, a data warehouse organised and stored data in a way that was best suited to perform
analytical queries. It served as a basic foundation for building analytical applications. A number
of companies in multiple industry verticals (retail, financial services, telecom, gambling,
transportation, etc.) realised significant benefits by investing in data warehouse infrastructure.
As per Gartner analyst estimates, the overall market for data warehouse systems alone was
estimated to be about US$ 9 billion in 2013 and expected to grow at a CAGR of 7%. 3 The size of
the market, its continued growth and industry success stories clearly demonstrated the value
that organisations could derive from such infrastructural IT investments.

Data Warehousing at Acme Inc.

After the senior management meeting, Mehta organised a team led by Aditi Patel, Head of IT, to
initiate the data warehouse project. He also hired Neil Clarke, an SAP consultant specialising in
data warehouse design and development, to help them in this initiative. Patel asked Clarke to
explain the processes involved in building a successful data warehouse solution:

Patel: Good morning, Neil. How are you doing?

Clarke: Great. How is your new data warehouse initiative coming along?

Patel: We are in the very early stages of this project and would like to seek your help
on successfully driving this initiative. In this context, I have already shared
with you our organisational business process details.

We are unclear on how to approach this project. Where should we start and
how should we proceed? Can you share your thoughts on some of the best
practices in the industry?

3 Gartner Reports. The state of data warehousing in 2012, 2013 and 2014.

Clarke: Yes. I did go through the documents you had shared. Based on my review, I
prepared a data warehouse business matrix [refer to Exhibit 3] and emailed it
to you this morning.

Patel: I did glance through the warehouse business matrix document. But how do I
really interpret and use this business matrix for building a data warehouse?

Clarke: Let me explain. The business matrix contains information about your business
processes — the rows of the matrix, and their context or dimensions — the
columns. The individual cells are marked with an ‘x’ if there is a context
associated with the specific business process row. For instance, the context for
the retail product sales business process includes time, product, store,
customer, promotion and geography.

While building a data warehouse, select a few business processes from the
matrix and design a multi-dimensional data model. The model is then used to
build a data warehouse. The above process is repeated for all of the business
processes, based on your analytical requirements.
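
As an illustration of how the matrix Clarke describes might be used, the sketch below stores each business process with its set of dimensions and looks up the dimensions two processes share. The Retail Product Sales row follows Clarke's explanation above; the marks for Retail Inventory are assumed for the example, and the helper names are hypothetical.

```python
# Rows of the business matrix: business process -> dimensions marked with an "x"
BUSINESS_MATRIX = {
    # From Clarke's explanation (time is labelled "Date" in the matrix)
    "Retail Product Sales": {"Date", "Product", "Store", "Customer", "Promotion", "Geography"},
    # Assumed marks, for illustration only
    "Retail Inventory": {"Date", "Product", "Store", "Geography"},
}


def shared_dimensions(matrix, *processes):
    """Dimensions common to all the named processes."""
    return set.intersection(*(matrix[p] for p in processes))


def processes_using(matrix, dimension):
    """Business processes whose context includes the given dimension."""
    return sorted(p for p, dims in matrix.items() if dimension in dims)


print(sorted(shared_dimensions(BUSINESS_MATRIX, "Retail Product Sales", "Retail Inventory")))
print(processes_using(BUSINESS_MATRIX, "Customer"))
```

Dimensions shared by several processes are natural candidates to design once and reuse when data marts are built one business process at a time.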

Patel: Okay. Is it a kind of bottom-up, iterative process for building a data
warehouse?

Clarke: Yes. One needs to prioritise a few business processes based on the current
business demands. A data mart (a mini warehouse) is then built for each of
the selected business processes.

Patel: Great. How should we proceed after prioritising and selecting a few business
processes for implementation?

Clarke: One of the industry best practices in this area is to apply a four-step process
for designing and building a data mart/warehouse: (1) Select a business
process, (2) declare the grain, (3) choose the dimensions, and (4) identify the
facts. Let us assume that we wish to apply the four steps to one of your
business processes.

As a first step, let us choose the ‘sales order’ business process. In the second
step, we need to declare the grain of the business process. The grain is chosen
based on your analytical requirements. For example, if you wish to analyse
your sales data at the lowermost granularity, then the grain is declared as the
transaction line item. On the other hand, if your analysis doesn’t require
transactional-level data, then the grain is declared at a weekly or monthly
level.

Patel: Well, if the grain is defined at a higher level, will it not constrain my ability to
analyse the data?

Clarke: You are right. One needs to make a trade-off between the size of the data
stored in a data warehouse and one’s analytical requirements. For example, if
the business user rarely or never accesses the transactional-level data for
analysis, then storing it is an unnecessary waste of data warehouse space.
Alternatively, it is possible to create multiple data repositories with different
levels of grain to meet different analysis requirements.

Patel: Hmm. Perhaps we should declare a transaction-level grain for the last 90
days of data. For historical data beyond 90 days, we could declare a
weekly/monthly grain. This would allow the periodic archiving of the older
data to a higher-level grain.

Clarke: You are spot on. This is precisely how organisations meet their analytical
requirements and at the same time optimise warehouse storage space.
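
Patel's suggestion can be sketched as follows. This is an illustration under stated assumptions, not a description of any particular product: the function name, the row layout and the choice of ISO weeks are hypothetical, and only the first line item comes from Exhibit 1.

```python
from collections import defaultdict
from datetime import date, timedelta


def roll_up_old_sales(line_items, as_of, keep_days=90):
    """Keep recent line items as-is; summarise older ones at a weekly grain."""
    cutoff = as_of - timedelta(days=keep_days)
    recent = []
    weekly = defaultdict(lambda: {"quantity": 0, "amount": 0})
    for item in line_items:
        if item["order_date"] >= cutoff:
            recent.append(item)
        else:
            iso_year, iso_week, _ = item["order_date"].isocalendar()
            key = (iso_year, iso_week, item["store_id"], item["product_code"])
            weekly[key]["quantity"] += item["quantity"]
            weekly[key]["amount"] += item["amount"]
    return recent, dict(weekly)


line_items = [
    # From the sample invoice in Exhibit 1
    {"order_date": date(2014, 9, 6), "store_id": "S303", "product_code": "TV1001", "quantity": 1, "amount": 35000},
    # Hypothetical rows
    {"order_date": date(2014, 9, 4), "store_id": "S303", "product_code": "TV1001", "quantity": 1, "amount": 35000},
    {"order_date": date(2015, 1, 10), "store_id": "S303", "product_code": "TV1001", "quantity": 1, "amount": 35000},
]
recent, weekly = roll_up_old_sales(line_items, as_of=date(2015, 1, 15))
print(len(recent), weekly)
```

The two September sales collapse into one weekly row per store and product, so invoice-level detail is no longer available for that period; this is the trade-off Clarke describes.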

Let us move on to the third step of choosing the dimension, a fairly easy step
if you have correctly declared the grain of the business process. For example,
if the grain for the sales order process is declared as sales per line item, then
the typical dimensions are the context of the sale such as item, store, customer,
time and geography.

The final step in the dimensional design process is the identification of facts.
These are the performance measures that one wishes to monitor. For the sales
order business process, the facts are generally quantity of items sold, dollar
value of sales, gross revenue and gross margin.
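
The four steps can be made concrete with a small star schema. The sketch below uses Python's built-in sqlite3 module and assumes a line-item grain for the sales order process; table and column names are hypothetical, the schema omits the promotion dimension for brevity, and the rows are taken from the sample invoice in Exhibit 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Step 3: dimensions give the context of each sale
CREATE TABLE dim_date     (date_key INTEGER PRIMARY KEY, full_date TEXT, month TEXT, year INTEGER);
CREATE TABLE dim_product  (product_key INTEGER PRIMARY KEY, product_code TEXT, description TEXT);
CREATE TABLE dim_store    (store_key INTEGER PRIMARY KEY, store_no TEXT);
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, name TEXT, city TEXT);

-- Step 2: the grain is one row per invoice line item
-- Step 4: quantity and sales_amount are the facts
CREATE TABLE fact_sales_line (
    date_key     INTEGER REFERENCES dim_date(date_key),
    product_key  INTEGER REFERENCES dim_product(product_key),
    store_key    INTEGER REFERENCES dim_store(store_key),
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    invoice_no   TEXT,
    quantity     INTEGER,
    sales_amount INTEGER
);
""")

conn.execute("INSERT INTO dim_date VALUES (20140906, '2014-09-06', 'Sep', 2014)")
conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?, ?)",
    [(1, "TV1001", "24'' LCD TV"), (2, "W1002", "Extended warranty")],
)
conn.execute("INSERT INTO dim_store VALUES (1, 'S303')")
conn.execute("INSERT INTO dim_customer VALUES (1, 'Vaghela, Shankar', 'Kolkata')")
conn.executemany(
    "INSERT INTO fact_sales_line VALUES (?, ?, ?, ?, ?, ?, ?)",
    [(20140906, 1, 1, 1, "I10005", 1, 35000),
     (20140906, 2, 1, 1, "I10005", 1, 3000)],
)

# Sales quantity and value by product, answered by joining fact to dimension
rows = conn.execute("""
    SELECT p.product_code, SUM(f.quantity), SUM(f.sales_amount)
    FROM fact_sales_line AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.product_code
    ORDER BY p.product_code
""").fetchall()
total = sum(amount for _, _, amount in rows)
print(rows, total)
```

Because the fact table holds only keys and measures, the same query pattern works for any dimension: swap dim_product for dim_store or dim_customer to change the context of the analysis.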

Patel: This is quite interesting. Now, I get a clear sense of how the dimensional
modelling for a data mart/warehouse works. We will soon start work on
prioritising our business processes and applying this framework to design the
data mart/warehouse.

Neil, we shall connect with you again to understand the next steps. More
specifically, we wish to understand the best practices in making the right
architectural choices and building a data warehouse infrastructure.

Clarke: Sure. I’ll certainly help your organisation in this data warehouse journey. Let
us connect again next week for a deep dive on data warehouse design and
development.

Patel and her team had a series of conversations with Clarke to better understand various
aspects of data warehouse design and development. Through this process, the team gained a
good understanding of data warehouse architecture and its components.

(A sample data warehouse architecture diagram is given in Exhibit 4. One of the components in
the architecture is the online analytical processing (OLAP) engine. OLAP is an interactive data
analysis tool that allows one to view data from a multi-dimensional perspective. Exhibit 5
provides an illustrative list of operations one can perform in OLAP. Exhibit 6 depicts some of
the key architectural choices available for data warehouse implementation).
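
To give a feel for the multi-dimensional view, the sketch below implements three commonly cited OLAP operations (roll-up, slice and dice) over a tiny in-memory list of cells. It is plain Python rather than a real OLAP engine; the function names and all rows except the two Kolkata 2014-Q3 figures (which echo Exhibit 1) are hypothetical.

```python
from collections import defaultdict

# Each cell: one combination of dimension values plus a sales measure (INR)
SALES = [
    {"quarter": "2014-Q3", "city": "Kolkata", "category": "Television", "sales": 35000},
    {"quarter": "2014-Q3", "city": "Kolkata", "category": "Warranty", "sales": 3000},
    {"quarter": "2014-Q3", "city": "Mumbai", "category": "Television", "sales": 50000},
    {"quarter": "2014-Q4", "city": "Kolkata", "category": "Television", "sales": 40000},
]


def roll_up(cells, *dims):
    """Aggregate sales up to the listed dimensions, dropping the others."""
    totals = defaultdict(int)
    for cell in cells:
        totals[tuple(cell[d] for d in dims)] += cell["sales"]
    return dict(totals)


def slice_(cells, dim, value):
    """Fix one dimension to a single value."""
    return [c for c in cells if c[dim] == value]


def dice(cells, **allowed):
    """Restrict several dimensions to sets of allowed values."""
    return [c for c in cells if all(c[d] in values for d, values in allowed.items())]


print(roll_up(SALES, "quarter"))
print(len(slice_(SALES, "city", "Kolkata")))
print(sum(c["sales"] for c in dice(SALES, city={"Kolkata"}, quarter={"2014-Q3"})))
```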

The team had to decide on which architecture would be the most appropriate for Acme.

Pilot Data Mart Solution — Key Analytic Requirements

Patel and her team interviewed key stakeholders at Acme to understand their analytic
requirements. Of the many requirements that were identified, the team, in consultation with the
business units, prioritised the following set of requirements for the pilot project:

Retail Business Users

At the retail level, business users were keen to analyse sales performance along different
dimensions. The ability to conduct a detailed and interactive inventory and promotional
analysis to spot trends, patterns and correlations was also of critical importance. Some of the
specific analytical queries of interest to them included:

• Which are the top 30 SKUs by sales revenue, sales volume and profitability?
• Which products and customers have the highest/lowest gross margins?
• What are the key sales and gross margin trends by product type, store and geography?
• What is the total sales value of an SKU (e.g. Samsung 48" LED TV) this quarter? How
does it compare against the last quarter?
• Which products were on promotion but did not sell? Slice and dice the data by
geography, product category and month.
• What are my inventory turns and total days of supply? Analyse the trends by store,
product and time.
• What is the total gross margin return on investment (ROI) for an SKU (e.g. Apple iPhone
6)? Roll up the data by product category, time and store.
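
Several of the queries above rest on simple measures. The sketch below assumes commonly used definitions (inventory turns as cost of goods sold over average inventory at cost, days of supply as days in the period over turns, and gross margin ROI as gross margin over average inventory at cost); the case does not state which definitions Acme used, and all figures except the two SKU revenues from Exhibit 1 are hypothetical.

```python
def top_n_skus(revenue_by_sku, n=30):
    """SKUs ranked by sales revenue, highest first."""
    return sorted(revenue_by_sku.items(), key=lambda kv: kv[1], reverse=True)[:n]


def inventory_turns(cost_of_goods_sold, average_inventory_cost):
    """How many times the average inventory was sold over the period."""
    return cost_of_goods_sold / average_inventory_cost


def days_of_supply(turns, days_in_period=365):
    """Average number of days the inventory on hand would last."""
    return days_in_period / turns


def gross_margin_roi(gross_margin, average_inventory_cost):
    """Gross margin earned per unit of currency invested in inventory."""
    return gross_margin / average_inventory_cost


revenue = {"TV1001": 35000, "W1002": 3000, "CAM2001": 12000}  # CAM2001 is hypothetical
print(top_n_skus(revenue, n=2))

turns = inventory_turns(cost_of_goods_sold=1_200_000, average_inventory_cost=200_000)
print(turns, round(days_of_supply(turns), 1), gross_margin_roi(300_000, 200_000))
```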

Warehouse Business Users

Warehouse users wanted to know the time taken to move a product from the warehouse to the
store. The quick and efficient movement of products helped companies minimise inventory
levels and reduce holding or carrying costs. Warehouse business users wanted to have the
ability to perform workflow analysis and identify the bottlenecks in their order fulfilment
process. An illustrative list of analytical queries that were of interest to warehouse business
users included:

• What are the maximum, minimum and average times to ship an order from warehouse
to retail store? Slice and dice the data by store, product category, geography and time.

• What is the average time taken for the individual activities (picking, packaging,
shipping, transportation and delivery) in the order fulfilment pipeline? What are the
maximum and minimum processing time values for the individual order fulfilment
activities?

• What are the delivery time performance trends by month, product category and store?
Which are the key bottleneck areas?

• What is the order delivery time performance of third-party transporters? How does it
compare against the contractual service-level agreement? Analyse their performance by
time, product category and geography.
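
The warehouse queries above reduce to grouping shipments and summarising elapsed times. The following sketch assumes each shipment record carries a shipping date and a delivery date, as in the SHIPMENT table of Exhibit 2; the store and transporter identifiers, the dates and the three-day service level are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

SHIPMENTS = [
    {"store": "S303", "transporter": "T1", "shipped": date(2014, 9, 1), "delivered": date(2014, 9, 3)},
    {"store": "S303", "transporter": "T2", "shipped": date(2014, 9, 8), "delivered": date(2014, 9, 13)},
    {"store": "S101", "transporter": "T1", "shipped": date(2014, 9, 1), "delivered": date(2014, 9, 2)},
]


def delivery_days(shipment):
    """Elapsed days between shipping and delivery."""
    return (shipment["delivered"] - shipment["shipped"]).days


def delivery_stats(shipments, by):
    """Minimum, maximum and average delivery days, grouped by one dimension."""
    groups = defaultdict(list)
    for s in shipments:
        groups[s[by]].append(delivery_days(s))
    return {k: {"min": min(v), "max": max(v), "avg": mean(v)} for k, v in groups.items()}


def sla_breaches(shipments, sla_days):
    """Shipments that took longer than the agreed service level."""
    return [s for s in shipments if delivery_days(s) > sla_days]


print(delivery_stats(SHIPMENTS, by="store"))
print(delivery_stats(SHIPMENTS, by="transporter"))
print([(s["store"], s["transporter"]) for s in sla_breaches(SHIPMENTS, sla_days=3)])
```

Changing the `by` argument is the in-memory equivalent of slicing the same measure by store, transporter or any other dimension attached to the shipment.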

As an immediate next step, the team needed to apply the design principles of dimensional data
modelling and build a data mart. The team had to consider several alternative design choices
and choose the one that best addressed their business analytics requirements. Subsequently,
other components of the data warehouse (or data mart) had to be put in place to complete the
identified use cases.

Conclusion: Building a Decision Support Infrastructure

Mehta and his team were preparing a plan to meet the organisation’s analytic requirements.
While the team believed that a data warehouse was the best way to meet those requirements,
several questions lingered in their minds.

What were the potential pitfalls in integrating a warehouse system with the existing IT systems?
What were the design considerations that had to be made while meeting all of the analytic
requirements? How should they go about performing sizing/estimation for making decisions
on the hardware infrastructure needed? What were the available architectural choices and
which was the option that would best suit their needs? At an organisational level, what change
management initiatives and communication plans would be needed for a successful
implementation?

Mehta carefully contemplated the above questions. He knew the answers would be critical in
arranging a pilot study, preparing a detailed implementation plan and presenting the findings
to Barley and other key stakeholders in July 2015, just six months away.

Exhibit 1: Sales Invoice

INVOICE

Acme Inc
345 Victoria Street
Kolkata 702 207

Bill To Address:              Ship To Address:
Mr. Vaghela, Shankar          Mr. Vaghela, Shankar
Mint street, Kolkata          Mint street, Kolkata

Invoice No   I10005           Order No     O123456
Store No     S303             Order Date   06-Sep-14
Emp Id       E777             Promo code   None

Serial No.   Product Code   Product Description   Qty   Unit Price (INR)   Amount (INR)
1            TV1001         24'' LCD TV           1     35000              35000
2            W1002          Extended warranty     1     3000               3000
3
4
5

                                                        Sales Amount       38000
                                                        Sales Tax          4750
                                                        Freight            300
                                                        Discount           Nil
                                                        Total Amount       43050

Source: Created by the author.

Exhibit 2: Entity-Relationship Diagram

Sample Table Description for the E-R Diagram

Table Name          Representative Attributes

CUSTOMER            Customer ID, name, address, city, state, country, pin code
ORDER               Order ID, customer ID, store ID, salesperson ID, order date, sales tax, discount, freight, amount
STORE               Store ID, name, description, location
SALESPERSON         Salesperson ID, name, designation, age, date of birth, address
INVOICE             Invoice ID, customer ID, customer name, address, order ID, amount
LINE ITEM           Line item ID, order ID, name, description, price
ITEM                Item ID, product name, brand, category, department
PROMOTION           Promotion ID, promotion name, description, start date, end date
SUPPLIER            Supplier ID, supplier name, description, location
SHIPMENT            Shipment ID, shipping date, delivery date
CATEGORY            Category ID, category name, description
CUSTOMER PROFILE    Customer ID, demographic profile, psychographic profile

Source: Created by the author.

Exhibit 3: Data Warehouse Business Matrix

Rows: Business Process
Columns (dimensions): Transporter, Warehouse, Geography, Promotion, Customer, Supplier, Product, Store, Date

Retail Product Sales X X X X X X
Retail Inventory X X X X
Retail Product Receipts X X X X X
Retail Promotions X X X X X
Frequent Shopper X X X X
Retail Product Returns X X X X X
Warehouse Orders X X X X X X
Warehouse Inventory X X X X
Warehouse Receipts X X X X X X
Warehouse Sales Forecast X X X
Purchase Order X X X X
Source: Created by the author.

Exhibit 4: High-level Data Warehouse Architecture

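[Figure: layered architecture diagram. Top layer: OLAP, Data Mining, Ad-hoc Query, Business Applications. Middle layer: Enterprise Data Warehouse / Data Mart. Below it: Extract, Transform, Load. Bottom layer (source systems): DB, ERP, CRM, Files, ERP. Side panels: Metadata, Security.]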

Source: Created by the author.

Exhibit 5: OLAP Operations

Source: Created by the author.

Exhibit 6: Data Warehouse Architecture Alternatives

[Figure: five architecture alternatives, each shown as a stack from Source Data up to Users.]

Independent Data Mart: Source Data; ETL, ETL; DM-Finance, DM-Marketing; Users, Users
Data Mart with Linked Dimension: Source Data; ETL, ETL; DM-Finance, DM-Marketing; Users, Users
Enterprise Data Warehouse: Source Data; ETL; Data Warehouse; Users
Virtual Data Warehouse: Source Data; Virtual Data Warehouse; Users
Enterprise Data Warehouse and Data Mart: Source Data; ETL; Enterprise Data Warehouse; DM-Finance, DM-Marketing; Users, Users, Users

Source: Created by the author.
