Data Warehousing and Multi-Dimensional Data Modeling

Author: S. Krishnamoorthy
Pub. Date: 2017
Product: Sage Business Cases
DOI: https://doi.org/10.4135/9781473989030
Keywords: data warehouse, data modeling, stores, id, inventory, retailing, invoices
Disciplines: Business & Management, Information & Knowledge Management, Business IT, Operations
Management (general), Operations Management
Access Date: April 3, 2024
Publishing Company: Indian Institute of Management, Ahmedabad
City: London
Sage Sage Business Cases
© 2014

Online ISBN: 9781473989030

© 2017 Indian Institute of Management, Ahmedabad All Rights Reserved.

This case was prepared for inclusion in Sage Business Cases primarily as a basis for classroom
discussion or self-study, and is not meant to illustrate either effective or ineffective management
styles. Nothing herein shall be deemed to be an endorsement of any kind. This case is for scholarly,
educational, or personal use only within your university, and cannot be forwarded outside the
university or used for other commercial purposes.

The case studies on Sage Business Cases are designed and optimized for online learning. Please
refer to the online version of this case to fully experience any video, data embeds, spreadsheets,
slides, or other resources that may be included.

This content may only be distributed for use within INDIAN INSTITUTE OF MANAGEMENT.

2024 Sage Publications, Inc. All Rights Reserved

Abstract

Mr. Vishal Mehta, CIO at Acme Inc, just finished a meeting with Mr. Steve Barley, VP (Retail), and
several other key regional and national heads of Acme. The team discussed the plans for the next year
and the key decision support infrastructure woes faced by operational units in data access,
integration, analysis, usage and interpretation. Vishal presented his ideas on how a data warehouse
investment could help address many such decision support infrastructure challenges. The management
was quite receptive to his ideas and asked him to conduct a feasibility study, build a pilot and
prepare a detailed implementation plan.

Vishal has more than 15 years of experience successfully building and managing several IT projects.
However, he lacks sufficient expertise in Business Intelligence (BI) or Data Warehouse (DW) related
projects. BI/DW projects are more challenging to execute than normal IT projects and require
organization-wide commitment and support. Vishal is, therefore, contemplating how to build a data
warehouse solution to meet his organization's business requirements.

Case

About Acme Inc

Acme Inc is one of the leading national retail chains in India. The company has over 150 stores spread
across 9 major states and 24 key cities, with floor areas ranging from 15,000 to 24,000 sq ft. It is a
specialist retailer in the consumer electronics and durables market. The consumer electronics and
durables market in India is expected to reach INR 520 bn by 2015 at a CAGR of 15%, as per market
research estimates (Note 1). The company has a market share of 18-20% and is one of the major players
in the Indian market, with sales revenue of Rs. 3,300 Cr as of 2013.

Acme’s stores offer merchandise in a variety of branded and non-branded categories – computers, tablets,
cameras, home appliances, kitchen appliances, gaming and so on – to its customers. Each store stocks about
12,000 SKUs (stock keeping units, or products) across 180 brands and 9 major product categories. The
company has also established an online presence, selling its merchandise to around 300 cities and towns.

Store Operations

Store operations involve a number of activities including indenting, inventory handling, customer service,
workforce allocation and management and so on. These activities are coordinated by a store manager and
executed by sales assistants and other employees.

A store manager replenishes the items in the inventory by placing orders with the regional warehouse.
The actual quantity of items to be ordered, including adjustments for lot sizes, is automatically
determined by an information system using standard inventory models such as the economic order
quantity (EOQ). A store manager raises an indent once the store's inventory of a product reaches its
reorder limit. To synchronize activities across stores and a regional warehouse, the organization
follows a policy of indenting once a week on specific days. The store manager also raises indents on
an ad hoc basis for specific items, especially when there is a sudden surge in sales and the item
stocks out.
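
As a rough illustration of the kind of calculation such an information system might perform, the sketch below applies the textbook EOQ formula, sqrt(2DS/H), and then rounds the result up to a whole number of lots. The demand, ordering cost, holding cost and lot size are hypothetical values chosen for illustration; the case does not describe Acme's actual parameters or system.

```python
import math


def economic_order_quantity(annual_demand, cost_per_order, holding_cost_per_unit_year):
    """Textbook EOQ: sqrt(2 * D * S / H)."""
    return math.sqrt(2 * annual_demand * cost_per_order / holding_cost_per_unit_year)


def round_up_to_lot(quantity, lot_size):
    """Adjust an order quantity upward to a whole number of lots."""
    return math.ceil(quantity / lot_size) * lot_size


# Hypothetical SKU (not from the case): 1,200 units sold per year,
# Rs. 500 to place one order, Rs. 30 to hold one unit for a year.
eoq = economic_order_quantity(1200, 500, 30)
order_qty = round_up_to_lot(eoq, lot_size=24)  # assumed carton size of 24 units
print(round(eoq, 1), order_qty)
```

With these assumed inputs the unconstrained EOQ is 200 units, which the lot-size adjustment raises to 216 (nine cartons of 24).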

The store employees handle the shipments received from the regional warehouse. These shipments are
first stored in a local store room. The items in the containers (cartons/pallets/bags) are then
unpacked to retrieve individual product SKUs. The products are then moved to the store locations and
placed as per the planogram design (or floor plan). The planograms are generally prepared by
merchandising managers. A merchandising manager prepares a planogram by analyzing numerous variables
such as customer purchase histories, demand patterns, local demographics, customer survey ratings and
so on. Merchandising managers may seek inputs from local store managers to come up with better floor
plans so as to improve store profitability. One of the responsibilities of the store manager is to
ensure that the products are stocked as per the planogram design. The store employees restock the
items in the store locations by following the planogram design. The store assistants also maintain the
aesthetic appeal of the items by cleaning them, placing them in the right position or orientation
after a customer visit and so on.

The store conducts a physical verification of inventory once a fortnight, and discrepancies between
the physical count and the recorded inventory are reconciled. The discrepancies may occur due to
employee theft, shoplifting, misplaced goods, billing errors, administrative errors, damage during
in-store material handling, expired product shelf life and so on. These discrepancies (commonly
referred to as shrinkage in the industry) account for 1.2-1.5% of Acme’s total sales. The company aims
to reduce this percentage to about 0.6% over the next three years.
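
To put the shrinkage percentages in monetary terms, the short sketch below applies them to the Rs. 3,300 Cr sales figure quoted earlier in the case. It assumes, purely for illustration, that sales stay at the 2013 level; the case gives no sales forecast.

```python
SALES_CR = 3300  # Rs. crore, 2013 sales revenue quoted in the case


def shrinkage_value(sales_cr, shrinkage_pct):
    """Shrinkage in Rs. crore for a given percentage of sales."""
    return sales_cr * shrinkage_pct / 100


current_low = shrinkage_value(SALES_CR, 1.2)
current_high = shrinkage_value(SALES_CR, 1.5)
target = shrinkage_value(SALES_CR, 0.6)

# Potential annual saving if the 0.6% target is met (assumes flat sales).
saving_low = current_low - target
saving_high = current_high - target
print(round(current_low, 1), round(current_high, 1), round(target, 1))
print(round(saving_low, 1), round(saving_high, 1))
```

Under the flat-sales assumption, current shrinkage is roughly Rs. 39.6-49.5 Cr a year, and reaching the target would cut it to about Rs. 19.8 Cr.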

Customer service is another important area in store operations. This includes activities like handling
in-store customer queries, guiding customers in selecting the right products, processing customer
orders, addressing customer complaints and handling product returns. Effective handling of these
activities is of utmost importance to Acme in general, and to the store manager in particular, in
providing the best in-store experience to customers.

When a customer places an order, an invoice is generated at the Point-Of-Sale (POS) terminal and the
item is packed and delivered. For specific product SKUs (say, home appliances like refrigerators,
washing machines, TVs), the retail store takes care of delivering the items to the customer's
premises. The store coordinates with local delivery transporters to ensure timely delivery.

As the description of the various store activities shows, there are multiple points where data can be
collected. Acme currently collects sales data at the POS terminals; item receipt data for items
received from the regional warehouse; physical store inventory information; and periodic customer
survey reports.

A sample invoice that describes the nature of data collected at POS terminals is provided in Exhibit 1.

Warehouse Management

The company has built a network of regional warehouses to cater to the individual store requirements
in each region. The primary activities performed in a warehouse include storage management and
distribution, inventory management, liaison with raw material suppliers/manufacturers for item
procurement, and liaison with transporters for proper and on-time delivery of shipments.

As part of distribution activities, each warehouse ships items to a set of stores within its region.
Typically, the request for an item/product is received from a store in the form of an indent. The
regional warehouse then prepares a pick list based on the indent, performs the necessary packaging,
and ships items through delivery trucks/vans. The warehouse coordinates with transport service
providers for shipping items and tracking shipments to ensure their timely delivery. The company
follows a weekly indent policy to synchronize various warehouse activities.

Each warehouse receives its supplies from several manufacturers on a periodic basis. The quantity and
periodicity of order placement are determined based on forecast estimates made by the warehouse
managers. Warehouse managers work closely with manufacturers to identify the right products for
specific regions, provide customer feedback, and make sales data available (for better production
planning at the manufacturer’s end).

There are several areas where the company is trying to make improvements. First, the forecast
estimates are often inaccurate, deviating by as much as 30-40% for some SKUs. This results in excess
inventory in a few cases and stock-outs in many others. Second, there are frequent complaints from
retailers about delayed shipment of items. Third, the company experiences many cases of damaged items.
The damage normally occurs during product shipment or material handling inside the warehouse, leading
to increased shrinkage. The company has been unable to clearly identify the root causes of these
problems or the bottlenecks, and so cannot take concrete action to improve the situation.

Product Promotions

Acme offers the latest, good-quality products at reasonable rates, giving high value for money to its
customers. The company has also partnered with several financial service firms offering attractive
terms (low down payment, low interest, flexible payment). In terms of promotions, the company helps
customers get the best value through festive offers, product bundling, attractive discounts, and
psychological pricing. It also promotes its offerings through advertisements on TV, the Internet,
radio and hoardings, through social media contests/games, and through the company’s own online
e-commerce portal.

The sales and marketing team at Acme is interested in building a loyalty card program to retain
customers and improve overall profitability. The company also does not have appropriate metrics in
place to measure the effectiveness of its promotional campaigns. The marketing managers believe that
significant improvements can be made to the promotional spend if a proper decision support
infrastructure is in place. While basic sales data is available, the company lacks the ability to get
an integrated view of sales across multiple stores, perform slicing and dicing of data, conduct
multi-dimensional data analysis and build data mining models for identifying the right customer
targets.

Information Technology Systems at Acme Inc

Historically, the company has grown through acquisitions. For example, in 2012, the company acquired a
large distribution warehouse company with operations in multiple towns/cities in India. As a result of
this inorganic growth strategy, business processes and IT systems at Acme are highly heterogeneous. A
few operating units (retail stores/regional warehouses) use Enterprise Resource Planning (ERP)
software as part of their order and inventory management processes. However, different units use ERP
solutions from different vendors. Even in units that use the same vendor's ERP solution, the
configurations are very different.

Each of the company’s stores and regional warehouses uses a relational database to maintain the
operational data generated by its business processes. The individual units use different relational
database products, and possibly different versions and/or configurations of those products. While
there are multiple differences in the way the data is stored, the underlying database design can be
understood using an Entity-Relationship (ER) diagram. The ER diagram shown in Exhibit 2 gives a
partial view of how different entities pertinent to Acme’s business are related and the nature of
their relationships.

The heterogeneity of IT systems at the hardware, system software, business application and data levels
makes it extremely difficult for Acme managers to get timely, relevant and accurate information for
making better decisions. For example, information about product movements across the Acme supply chain
is available in multiple systems. However, accessing the information from these systems in a timely
manner to identify improvements in supply chain operational efficiency is a daunting task.

The Promise of Data Warehouse

Bill Inmon, considered to be the father of data warehousing, defines a data warehouse as:

“A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data
in support of management’s decision making process.”

The description of the key terms used in the above definition is as follows:

Subject-oriented: The data in a data warehouse is organized by key subjects like vendors, products, stores,
partners, etc.

Integrated: A data warehouse integrates data from multiple, heterogeneous data sources. The differences in
data definitions and inconsistencies across multiple source systems are resolved before loading the data into
the data warehouse.

Time-variant: Unlike an operational system, a data warehouse maintains historical data. For instance, while
an operational system holds the most recent customer location detail (or mobile number/address), a data
warehouse can be used to maintain all changes of customer location detail (mobile numbers/addresses).

Non-volatile: The data warehouse is a write-once, read-many type of system. That is, the data once written
to a data warehouse is rarely updated. The implication of this is that a data warehouse system does not need
to support expensive transaction processing and recovery control procedures as in the case of operational
database systems.
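
The time-variant property can be made concrete with a small sketch. Instead of overwriting a customer's location, each change closes the current version and appends a new one, so the value as of any past date can still be looked up. This is similar in spirit to what dimensional modeling texts call a type 2 slowly changing dimension. The class, field names, customer ID and cities below are hypothetical and are not taken from Acme's systems.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class CustomerAddressVersion:
    customer_id: int
    city: str
    valid_from: date
    valid_to: Optional[date]  # None marks the current version


def record_change(history: List[CustomerAddressVersion], customer_id: int,
                  new_city: str, change_date: date) -> List[CustomerAddressVersion]:
    """Close the customer's current version (if any) and append the new one."""
    updated = []
    for v in history:
        if v.customer_id == customer_id and v.valid_to is None:
            updated.append(CustomerAddressVersion(v.customer_id, v.city,
                                                  v.valid_from, change_date))
        else:
            updated.append(v)
    updated.append(CustomerAddressVersion(customer_id, new_city, change_date, None))
    return updated


def city_as_of(history: List[CustomerAddressVersion], customer_id: int,
               as_of: date) -> Optional[str]:
    """Return the city that was valid for the customer on the given date."""
    for v in history:
        if (v.customer_id == customer_id and v.valid_from <= as_of
                and (v.valid_to is None or as_of < v.valid_to)):
            return v.city
    return None


# Hypothetical customer 101 moves from Ahmedabad to Pune.
history: List[CustomerAddressVersion] = []
history = record_change(history, 101, "Ahmedabad", date(2013, 1, 1))
history = record_change(history, 101, "Pune", date(2014, 6, 1))
print(city_as_of(history, 101, date(2013, 12, 31)))  # location before the move
print(city_as_of(history, 101, date(2014, 6, 1)))    # location after the move
```

An operational system would hold only the latest row; the history list keeps both, which is what allows analysis of sales by the customer's location at the time of purchase.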

In short, a data warehouse organizes and stores data in a way that is best suited to performing
analytical queries. It serves as a basic foundation for building analytical applications. A number of
companies in multiple industry verticals (retail, financial services, telecom, gambling,
transportation, etc.) have realized significant benefits by investing in data warehouse
infrastructure. As per Gartner analyst estimates, the overall market for data warehouse systems alone
was about $9 bn as of 2013 and was expected to grow at a CAGR of 7% (Note 2). The size of the market,
its continued growth and the industry success stories clearly demonstrate the value that
organizations can derive from such infrastructural IT investments.

Data Warehousing at Acme Inc

Vishal organizes a team led by Ms. Aditi Patel to initiate the data warehouse project. He also hires
an SAP consultant, Dr. Neil Clarke, who specializes in data warehouse design and development, to help
them in this initiative. Aditi tries to understand from Neil the processes involved in building a
successful data warehouse solution.

Aditi: Good morning, Neil. How are you doing?

Neil: Great. How is your new initiative on data warehouse coming along?

Aditi: We are in the very early stages of this project and would like to seek your help in
successfully driving this initiative. In this context, I had already shared with you our
organizational business process details.

We are unclear on how to approach this project. Where should we start and how should we proceed? Can
you share your thoughts on some of the best practices in the industry?

Neil: Yes. I did go through the documents you had shared. Based on my review, I prepared a data
warehouse business matrix (refer to Exhibit 3) and emailed it to you this morning.

Aditi: Neil, I just glanced through the warehouse business matrix document. But, how do I really interpret and
use this business matrix for building a data warehouse?

Neil: Let me explain. The business matrix contains information about your business processes (rows of
the matrix) and their context or dimensions (columns). The individual cells are marked with ‘x’ if
there is a context associated with the specific business process row. For instance, the context for
the retail product sales business process includes time, product, store, customer, promotion and
geography.

While building a data warehouse, select a few business processes from the matrix and design a
multi-dimensional data model. The model is then used to build a data warehouse. The above process is
repeated for all of the business processes, based on your analytical requirements.
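
The business matrix Neil describes can be sketched as a simple mapping from each business process to the set of dimensions that give it context. Only the "retail product sales" row below comes from the conversation; the other two rows and the `warehouse` and `transporter` dimensions are hypothetical placeholders, since Exhibit 3 is not reproduced here.

```python
BUS_MATRIX = {
    # Row stated in the case.
    "retail product sales": {"time", "product", "store", "customer",
                             "promotion", "geography"},
    # Hypothetical rows, for illustration only.
    "store inventory": {"time", "product", "store"},
    "warehouse shipments": {"time", "product", "store", "warehouse",
                            "transporter"},
}


def shared_dimensions(matrix, processes):
    """Dimensions common to every listed process."""
    return set.intersection(*(matrix[p] for p in processes))


def render(matrix):
    """Render the matrix as text, marking each used dimension with 'x'."""
    dims = sorted(set().union(*matrix.values()))
    lines = ["process | " + " | ".join(dims)]
    for process, used in matrix.items():
        marks = ["x" if d in used else " " for d in dims]
        lines.append(process + " | " + " | ".join(marks))
    return "\n".join(lines)


print(render(BUS_MATRIX))
print(sorted(shared_dimensions(BUS_MATRIX, list(BUS_MATRIX))))
```

Dimensions shared by several processes (here time, product and store) are natural candidates to design once and reuse across the data marts built in each iteration.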

Aditi: Ok. Is it a kind of bottom-up, iterative process for building a data warehouse?

Neil: Yes, indeed. One needs to prioritize a few business processes based on current business
demands. A data mart (a mini warehouse) is then built for each of the selected business processes.

Aditi: Great. How should we proceed after prioritizing and selecting a few business processes for
implementation?

Neil: One of the industry best practices in this area is to apply a four-step process for designing
and building a data mart/warehouse. The four steps are: (1) Select a business process, (2) Declare the
grain, (3) Choose the dimensions, and (4) Identify the facts. Let us assume that we wish to apply the
four steps to one of your business processes.

As a first step, let us choose the ‘sales order’ business process. In the second step, we need to
declare the grain of the business process. The grain is chosen based on your analytical requirements.
For example, if you wish to analyse your sales data at the lowest level of granularity, then the grain
is declared as the transaction line item. On the other hand, if your analysis doesn’t require
transaction-level data, then the grain is declared at a weekly or monthly level.

Aditi: Well, does it not constrain my ability to analyse the data if the grain is defined at a higher
level?

Neil: You are right. One needs to make a trade-off between the size of the data stored in a data
warehouse and the analytical requirements. For example, if business users rarely or never access the
transaction-level data for analysis, then storing it is an unnecessary waste of data warehouse space.
Alternatively, it is possible to create multiple data repositories with different grains to meet
different analysis requirements.

Aditi: Hmm. Perhaps we could declare a transaction-level grain for the most recent 90 days of data.
For historical data beyond 90 days, we could declare a weekly/monthly grain. This would allow us to
periodically archive older data at a higher-level grain.

Neil: You are spot on. This is precisely how organizations meet their analytical requirements and at
the same time optimize warehouse storage space.
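
Aditi's 90-day policy can be sketched as follows: line items within the last 90 days are kept at transaction grain, while older ones are summed into one row per week, store and SKU. The record layout (`date`, `store`, `sku`, `quantity`, `sales_value`) and the sample rows are assumptions made for this illustration, not Acme's actual schema or data.

```python
from collections import defaultdict
from datetime import date, timedelta


def week_start(d):
    """Monday of the week containing d."""
    return d - timedelta(days=d.weekday())


def apply_grain_policy(line_items, today, detail_days=90):
    """Keep recent line items as-is; roll older ones up to a weekly grain."""
    cutoff = today - timedelta(days=detail_days)
    detail = []
    weekly = defaultdict(lambda: {"quantity": 0, "sales_value": 0.0})
    for item in line_items:
        if item["date"] >= cutoff:
            detail.append(item)
        else:
            key = (week_start(item["date"]), item["store"], item["sku"])
            weekly[key]["quantity"] += item["quantity"]
            weekly[key]["sales_value"] += item["sales_value"]
    return detail, dict(weekly)


# Hypothetical line items.
items = [
    {"date": date(2015, 1, 20), "store": "S1", "sku": "TV", "quantity": 1, "sales_value": 40000.0},
    {"date": date(2014, 9, 1), "store": "S1", "sku": "TV", "quantity": 2, "sales_value": 80000.0},
    {"date": date(2014, 9, 3), "store": "S1", "sku": "TV", "quantity": 1, "sales_value": 40000.0},
]
detail, weekly = apply_grain_policy(items, today=date(2015, 1, 31))
print(len(detail), weekly)
```

The two September line items collapse into a single weekly row. That saves space, but invoice-level detail for that week can no longer be recovered, which is the trade-off Neil points out.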

Let us move on to the third step, choosing the dimensions, which is a fairly easy step if you have
correctly declared the grain of the business process. For example, if the grain for the sales order
process is declared as sales per line item, then the typical dimensions are the context of the sale,
such as item, store, customer, time, and geography.

The final step in the dimensional design process is the identification of facts. These are the
performance measures that one wishes to monitor. For the sales order business process, the facts are
generally the quantity of items sold, dollar value of sales, gross revenue, and gross margin.
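
Putting the four steps together, a sales-order data mart at line-item grain might look like the minimal star schema below, built here in an in-memory SQLite database. The table names, column names and sample rows are illustrative assumptions; the customer dimension is omitted for brevity, and geography is folded into the store dimension as a city column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_item  (item_key INTEGER PRIMARY KEY, sku TEXT, category TEXT);
CREATE TABLE dim_store (store_key INTEGER PRIMARY KEY, store_name TEXT, city TEXT);
CREATE TABLE dim_date  (date_key INTEGER PRIMARY KEY, full_date TEXT, month TEXT);
-- Grain: one row per invoice line item.
CREATE TABLE fact_sales_line (
    invoice_no   TEXT,
    line_no      INTEGER,
    item_key     INTEGER REFERENCES dim_item(item_key),
    store_key    INTEGER REFERENCES dim_store(store_key),
    date_key     INTEGER REFERENCES dim_date(date_key),
    quantity     INTEGER,   -- fact
    sales_value  REAL,      -- fact
    gross_margin REAL       -- fact
);
""")

# Hypothetical sample rows.
conn.executemany("INSERT INTO dim_item VALUES (?, ?, ?)",
                 [(1, "TV-48", "Televisions"), (2, "CAM-01", "Cameras")])
conn.execute("INSERT INTO dim_store VALUES (1, 'Store A', 'Ahmedabad')")
conn.execute("INSERT INTO dim_date VALUES (20150105, '2015-01-05', '2015-01')")
conn.executemany(
    "INSERT INTO fact_sales_line VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    [("INV1", 1, 1, 1, 20150105, 1, 40000.0, 6000.0),
     ("INV1", 2, 2, 1, 20150105, 2, 30000.0, 4500.0),
     ("INV2", 1, 1, 1, 20150105, 1, 40000.0, 6000.0)])

# Facts summed by a dimension attribute (product category).
rows = conn.execute("""
    SELECT i.category, SUM(f.quantity), SUM(f.sales_value), SUM(f.gross_margin)
    FROM fact_sales_line AS f
    JOIN dim_item AS i ON i.item_key = f.item_key
    GROUP BY i.category
    ORDER BY i.category
""").fetchall()
print(rows)
```

The fact table holds only keys and numeric measures, while the descriptive attributes used for grouping and filtering live in the dimension tables.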

Aditi: This is quite interesting. Now, I get a clear sense of how the dimensional modelling for a data mart/
warehouse works. We will soon start work on prioritizing our business processes and applying this framework
to design the data mart/warehouse.

Neil, we shall connect with you again to understand the next steps. More specifically, we wish to understand
the best practices in making the right architectural choices and building a data warehouse infrastructure.

Neil: Sure. I’ll certainly help your organization in this data warehouse journey. Let us connect again next week
for a deep dive on data warehouse design and development.

Aditi and her team had a series of conversations with Neil to understand the data warehouse design and
development aspects better. The team finally understood the data warehouse architecture and its
components. A sample data warehouse architecture diagram is given in Exhibit 4. One of the components
in the architecture is the On-Line Analytical Processing (OLAP) engine. OLAP is an interactive data
analysis tool that allows one to view the data from a multi-dimensional perspective. An illustrative
list of operations one can perform in OLAP is given in Exhibit 5.
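
To give a feel for what multi-dimensional viewing means in practice, the sketch below implements three operations commonly associated with OLAP (roll-up, slice and dice) over a tiny in-memory cube. Exhibit 5 is not reproduced here, so the choice of operations is an assumption, and the cell values are made up.

```python
from collections import defaultdict

# Hypothetical cube cells: one sales figure per (month, city, category).
CELLS = [
    {"month": "2015-01", "city": "Ahmedabad", "category": "Televisions", "sales": 120.0},
    {"month": "2015-01", "city": "Pune", "category": "Televisions", "sales": 80.0},
    {"month": "2015-01", "city": "Pune", "category": "Cameras", "sales": 30.0},
    {"month": "2015-02", "city": "Ahmedabad", "category": "Cameras", "sales": 50.0},
]


def roll_up(cells, by):
    """Aggregate sales up to the listed dimensions, dropping the others."""
    totals = defaultdict(float)
    for c in cells:
        totals[tuple(c[d] for d in by)] += c["sales"]
    return dict(totals)


def slice_cube(cells, dimension, value):
    """Fix one dimension to a single value."""
    return [c for c in cells if c[dimension] == value]


def dice(cells, **allowed):
    """Keep cells whose value on each named dimension is in the allowed set."""
    return [c for c in cells if all(c[d] in vals for d, vals in allowed.items())]


by_category = roll_up(CELLS, ["category"])
january = slice_cube(CELLS, "month", "2015-01")
pune_subset = dice(CELLS, city={"Pune"}, category={"Televisions", "Cameras"})
print(by_category)
print(len(january), len(pune_subset))
```

A real OLAP engine precomputes or indexes such aggregates so that these operations stay interactive on large data volumes; the functions here only illustrate the semantics.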

Some of the key architectural choices available for data warehouse implementation are depicted in Exhibit 6.
The team is yet to decide on which architecture will be the most appropriate for Acme.

Pilot Data Mart Solution – Key Analytic Requirements

Aditi and her team interviewed key stakeholders at Acme to understand their analytic requirements.
While many requirements were identified, the team, in consultation with the business units,
prioritized the following set of requirements for the pilot project:

Retail Business Users

At the retail level, the business users are keen to analyze sales performance by different
dimensions. The ability to conduct a detailed and interactive inventory and promotional analysis to
spot trends, patterns and correlations is also of critical importance. Some of the specific
analytical queries of interest to them include:

• Which are the top 30 SKUs by sales revenue, sales volume, and profitability?
• Which products and customers have the highest/lowest gross margins?
• What are the key sales and gross margin trends by product type, store, and geography?
• What is the total sales value of an SKU (e.g. Samsung 48" LED TV) this quarter? How does it
compare against the last quarter?
• Which products were on promotion but did not sell? Slice and dice the data by geography, product
category and month.
• What are my inventory turns and total days of supply? Analyze the trends by store, product and time.
• What is the total gross margin ROI for an SKU (e.g. Apple iPhone 6)? Roll up the data by product
category, time, and store.
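
Several of the retail queries above reduce to simple aggregations and ratios once the sales and inventory facts are available. The sketch below ranks SKUs by a chosen measure and computes inventory turns, days of supply and gross margin ROI using common textbook definitions (cost of goods sold divided by average inventory at cost, days in the period divided by turns, and gross margin divided by average inventory at cost). The case does not say which definitions Acme uses, and all figures are hypothetical.

```python
from collections import defaultdict


def top_n_skus(sales_lines, measure, n=30):
    """Rank SKUs by the total of a measure, such as 'revenue'."""
    totals = defaultdict(float)
    for line in sales_lines:
        totals[line["sku"]] += line[measure]
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


def inventory_turns(cost_of_goods_sold, average_inventory_cost):
    """How many times the average inventory is sold in the period."""
    return cost_of_goods_sold / average_inventory_cost


def days_of_supply(turns, days_in_period=365):
    """Average number of days the inventory on hand would last."""
    return days_in_period / turns


def gmroi(gross_margin, average_inventory_cost):
    """Gross margin earned per unit of currency invested in inventory."""
    return gross_margin / average_inventory_cost


# Hypothetical sales lines and annual figures.
lines = [
    {"sku": "A", "revenue": 100.0},
    {"sku": "B", "revenue": 300.0},
    {"sku": "A", "revenue": 150.0},
    {"sku": "C", "revenue": 50.0},
]
top_two = top_n_skus(lines, "revenue", n=2)
turns = inventory_turns(cost_of_goods_sold=800.0, average_inventory_cost=100.0)
supply_days = days_of_supply(turns)
margin_roi = gmroi(gross_margin=250.0, average_inventory_cost=100.0)
print(top_two, turns, supply_days, margin_roi)
```

In a data mart these would be computed per store, product and time period, which is what makes the slice, dice and roll-up requests in the list possible.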

Warehouse Business Users

Warehouse users want to know the time taken to move a product from the warehouse to the store. Quick
and efficient movement of products helps the company minimize inventory levels and reduce holding or
carrying costs. The business users wish to have the ability to perform workflow analysis and determine
the bottleneck areas in their order fulfilment process. An illustrative list of analytical queries
that are of interest to the warehouse business users includes:

• What is the maximum/minimum/average time to ship an order from warehouse to retail store? Slice
and dice the data by store, product category, geography, and time.
• What is the average time taken for the individual activities (picking, packaging, shipping,
transportation, delivery) in the order fulfilment pipeline? What are the maximum and minimum
processing time values for the individual order fulfilment activities?
• What are the delivery time performance trends by month, product category, and store? Which are the
key bottleneck areas?
• What is the order delivery time performance of third-party transporters? How does it compare against
the contractual service level agreement? Analyze their performance by time, product category and
geography.
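
The warehouse queries above can be sketched the same way. Assuming each order records the hours spent in each fulfilment activity, the code below computes the minimum, maximum and average time per activity, flags the activity with the highest average as the likely bottleneck, and measures the share of orders delivered within a service level. The record layout, the sample orders and the 36-hour threshold are hypothetical.

```python
from statistics import mean

ACTIVITIES = ["picking", "packaging", "shipping", "transportation", "delivery"]


def activity_stats(orders):
    """Min, max and average hours for each fulfilment activity."""
    stats = {}
    for activity in ACTIVITIES:
        hours = [o["hours"][activity] for o in orders]
        stats[activity] = {"min": min(hours), "max": max(hours), "avg": mean(hours)}
    return stats


def bottleneck(stats):
    """Activity with the highest average processing time."""
    return max(stats, key=lambda a: stats[a]["avg"])


def sla_compliance(orders, sla_hours):
    """Fraction of orders whose total fulfilment time is within the SLA."""
    on_time = sum(1 for o in orders if sum(o["hours"].values()) <= sla_hours)
    return on_time / len(orders)


# Hypothetical orders with hours spent per activity.
orders = [
    {"order_id": "O1", "hours": {"picking": 2.0, "packaging": 1.0, "shipping": 3.0,
                                 "transportation": 20.0, "delivery": 2.0}},
    {"order_id": "O2", "hours": {"picking": 4.0, "packaging": 1.0, "shipping": 3.0,
                                 "transportation": 30.0, "delivery": 4.0}},
]
stats = activity_stats(orders)
slowest = bottleneck(stats)
compliance = sla_compliance(orders, sla_hours=36.0)  # assumed SLA threshold
print(stats["transportation"], slowest, compliance)
```

Capturing a timestamp at each step of the pipeline is what makes this kind of workflow analysis possible; with only a shipped date and a received date, the bottleneck could not be located.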

As an immediate next step, the team needs to apply the design principles of dimensional data modeling
and build a data mart. The team should consider several alternative design choices and choose the one
that best addresses their business analytics requirements. Subsequently, other components of the data
warehouse (or data mart) have to be put in place to complete the identified use cases.

Conclusion: Building a Decision Support Infrastructure

Vishal Mehta and his team are preparing a plan to meet the organization's analytic requirements. While
the team believes that a data warehouse is the best way to meet these requirements, several questions
are lingering in their minds.

What are the potential pitfalls in integrating a warehouse system with existing IT systems? What
design considerations does one need to make in order to meet all of the analytic requirements? How
should the team go about sizing and estimation in order to decide on the hardware infrastructure
needed? What are the available architectural choices, and which option best suits their needs? At an
organizational level, what change management initiatives and communication plans are needed for
successful implementation?

Vishal contemplated these questions as he set out to arrange a pilot study, prepare a detailed
implementation plan and present the findings to Mr. Steve Barley, VP (Retail), and other key
stakeholders in July 2015, just six months away.

Notes

1. ASSOCHAM Report, Emerging Trends in Consumer Electronics and Durables Industry, 2012

2. Gartner Report, The State of Data Warehousing in 2012, 2013, 2014
