
Introduction to

DW, DM, BI and BA

Prof. Ramesh Behl


rbehl@imi.edu
Course Objective
• To enhance students' theoretical understanding of various analytics
concepts.
• To understand SAP Business Warehouse as the data staging layer for Business
Analytics.
• To enhance students' efficiency in using software for extracting
information and generating insight.
• To familiarize students with data mining concepts and techniques.
• To develop the competency to assess a predicament and choose the
appropriate tool to arrive at a decision.
• To expose students to a set of predictive tools.

Evaluation
Class Participation 10%
Quiz 20%
Project 30%
End-term 40%
Total 100%

Why BA?
Marketing Manager: “I need to predict the outcomes of the latest campaign”
Supply Chain Manager: “I need to optimise the processes”
CFO: “I need to measure our financial performance”
CEO: “I need to measure our performance against our objectives”
Customer Service Manager: “I need to discover what is happening with my customers”
Why BA?
. . . predict the buying behavior and decision criteria of your
prospects weeks before your competition?
. . . gain first-mover advantage by introducing new products
and services to micro-segments that haven't been identified
by competitors?
. . . evaluate the impact of your marketing campaigns hourly
and make adjustments in real-time?
. . . improve customer experience scores that grow products
per customer, reduce attrition, and leverage the power of
customer recommendations for new business?
. . . predict likely failures of critical equipment and processes?

What is Intelligence?
Definition:

1. The ability to learn or understand or deal with new and trying situations

2. The ability to apply knowledge to manipulate one’s environment

Source: Merriam-Webster’s Online Dictionary


Intelligence in Business
• What is the current status of the business? (Yesterday – Now – Tomorrow)
– What’s going well?
– What needs improvement?

• What are the business’ strengths and weaknesses?

• Are there opportunities for innovation or competitive advantage?

• How do we improve our decision making?
Definition: Business Intelligence
• Business Intelligence covers the strategies, processes and technologies
used to derive knowledge about a company's status, potential and
prospects from heterogeneous and distributed data.

Definition Source: Institut für Business Intelligence (IBI), http://www.i-bi.de/home/index.html
Business Intelligence Turns Data into
Knowledge
Decision
Offer product B to
customer Smith

Knowledge
Product A & B have
an 80% sales correlation

Information
Customer Smith
buys product A

Data
Product A
Product B
Customer Smith
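
The step from data to decision can be sketched in a few lines. This is a minimal illustration, not the method behind the slide: the purchase records are hypothetical, and the "80% sales correlation" is read here, as an assumption, as the share of Product A buyers who also bought Product B.

```python
# Hypothetical purchase data (customer -> products bought); not from the slides.
purchases = {
    "Smith": {"A"},
    "Jones": {"A", "B"},
    "Lee": {"A", "B"},
    "Patel": {"A", "B"},
    "Kim": {"A", "B"},
}


def co_purchase_rate(data, first, second):
    """Knowledge: share of buyers of `first` who also bought `second`."""
    buyers = [items for items in data.values() if first in items]
    if not buyers:
        return 0.0
    return sum(second in items for items in buyers) / len(buyers)


def should_offer(data, customer, first, second, threshold=0.8):
    """Decision: offer `second` to a buyer of `first` who does not own it yet."""
    items = data[customer]  # information: what this customer has bought
    rate = co_purchase_rate(data, first, second)
    return first in items and second not in items and rate >= threshold


rate = co_purchase_rate(purchases, "A", "B")
offer_b_to_smith = should_offer(purchases, "Smith", "A", "B")
```

The threshold of 0.8 is an arbitrary illustrative choice; a real rule would be tuned to the business context.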
Business Intelligence Stages

Source: Brobst, S. and J. Rarey, "Five Stages of Data Warehouse Decision Support Evolution", DSSResources.COM, 01/06/2003
Getting The Right Information
• Only 36% of CIOs believe management is using the right information
to run the business.
(Gartner Research)

• Fewer than one in ten corporate executives believe they have the
necessary information when they need it to make critical business
decisions. More than half of these senior executives are concerned
that, as a result of missing information, they may be making poor
decisions, and a quarter believe that management frequently or always
gets its decisions wrong.
(Economist Intelligence Unit)
Diverse Sources of Information

[Figure: information reaches the CEO & Board and the Operations, Marketing, Sales and Finance functions from diverse sources: surveys, benchmark studies, back-of-napkin notes, discussion, ERP application (standard reports), Excel/Access, and Business Intelligence applications. Levels shown: LOB executives (strategic), mid-managers (operational), execution level (tactical).]
Too Much Information!
Getting The Right Information
Not all Information Is Of The Same Value
(value increases from Data up to Wisdom)
• Wisdom: New business strategies, opportunities
• Intelligence: Lifetime value of this customer and strategies to deploy to create loyalty
• Knowledge: What the company has purchased, what other products they may purchase
• Information: A contact associated to a company and all back orders
• Data: A contact
Business Intelligence Evolution
Evolution Phase | Business Question | Enabling Technology
Data Collection (1960s) | “What was my total revenue for the day?” | Data Processing Applications
Data Access (1980s) | “What was the sales quantity in Australia last March?” | Relational Databases (OLTP)
Data Warehousing & Decision Support (1990s) | “What was the sales quantity in Europe? Drill down to Germany” | On-Line Analytical Processing (OLAP)
Data Mining | “What is likely to happen to sales in Europe if fuel prices increase? Why?” | Advanced algorithms, increased computing power

(Alexander 2008)
Operational versus Informational Processing
Operational (OLTP) | Informational (OLAP)
Detailed | Summarised
Can be updated | Snapshot records, no updates allowed
Accurate up to the second | Timestamp on each record
Used for clerical purposes | Used by management
Built based on requirements | Built without knowing requirements
Supports small uniform transactions | Supports mixed workload
Data designed for optimal storage | Data designed for optimal access
Very current data | Mainly historical data
Data is application oriented | Data is integrated
Referential integrity is useful | Referential integrity is not useful
High availability is normal | High availability is nice to have
[Figure: key figures (Quantities, Revenues, Costs, Taxes) at the centre, surrounded by the Product, Customer, Sales, Competition and Time dimensions.]
Corporate Information Factory
The Data Warehouse and Marts
The purpose of a data warehouse is to
establish a data repository that makes
operational data accessible in a form readily
acceptable for analytical processing activities ...
A data mart is … dedicated to a functional or
regional area.

Data Warehouse Characteristics
• "A Data Warehouse is a subject-oriented, integrated, time-variant and
nonvolatile collection of data in order to support management decisions,"
Bill Inmon (1996).

– Subject-oriented
• The organization of data is guided by the view of decision makers
on specific areas of business.
– Integrated
• The Data Warehouse contains data from different internal and
external sources. High data quality, i.e. correctness and
consistency, is essential.
– Time-variant
• Data in a Data Warehouse has a time dimension, i.e. all data values
and their changes in time can be compared and analyzed along the
time axis.
– Nonvolatile
• As opposed to operational databases, data are stored persistently
in a Data Warehouse. Access is read-only; analysis does not change
the data.
Data Integration Process

Data analysis and - OLAP, MIS, cockpits, …


analytical applications - Planning, scorecard, …

- Information models
Multi-dimensional data - Aggregation

- Data storage
Data warehouse - Administration

Extraction, trans- - Selection, extraction,


formation, loading - Modification, loading

- External data sources


Source systems - Internal data sources
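
The layers can be traced with a small extract-transform-load sketch. Everything here is a hypothetical illustration: the source records, field names and the cents-to-USD conversion are assumptions chosen to keep the example short, not part of any SAP tool.

```python
from collections import defaultdict

# Hypothetical source systems (names and fields are illustrative assumptions).
internal_orders = [
    {"country": "US", "month": "2007-05", "revenue_cents": 120000},
    {"country": "DE", "month": "2007-05", "revenue_cents": 80000},
    {"country": "US", "month": "2007-06", "revenue_cents": 50000},
]
external_market_data = [
    {"country": "US", "month": "2007-05", "revenue_usd": 300.0},
]


def extract():
    """Selection and extraction: pull raw rows from each source system."""
    return internal_orders, external_market_data


def transform(internal, external):
    """Modification: bring both sources to one record layout (revenue in USD)."""
    rows = [
        {"country": r["country"], "month": r["month"],
         "revenue_usd": r["revenue_cents"] / 100}
        for r in internal
    ]
    rows.extend(dict(r) for r in external)
    return rows


def load(warehouse, rows):
    """Loading: append the harmonised rows to the warehouse store."""
    warehouse.extend(rows)
    return warehouse


def aggregate(warehouse):
    """Multi-dimensional layer: revenue per (country, month)."""
    cube = defaultdict(float)
    for row in warehouse:
        cube[(row["country"], row["month"])] += row["revenue_usd"]
    return dict(cube)


warehouse = load([], transform(*extract()))
cube = aggregate(warehouse)
```

An analytical application at the top layer would then read from `cube` rather than from the source systems.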
Why Separate Data Warehouse?
• Performance
– Operational databases are designed and tuned for known transactions
and workloads.
– Complex OLAP queries would degrade performance for operational
transactions.
– Special data organization, access and implementation methods are
needed for multidimensional views and queries.
• Function
– Missing data: Decision support requires historical data, which
operational databases do not typically maintain.
– Data consolidation: Decision support requires consolidation
(aggregation, summarization) of data from many heterogeneous
sources: operational databases, external sources.
– Data quality: Different sources typically use inconsistent data
representations, codes, and formats which have to be reconciled.
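
The data quality point can be made concrete with a short sketch of reconciling inconsistent codes and date formats. The mapping table and the list of accepted formats are hypothetical; a real project would maintain them as reference data.

```python
from datetime import datetime

# Hypothetical reference data for illustration only.
COUNTRY_CODES = {"usa": "US", "u.s.": "US", "us": "US", "germany": "DE", "de": "DE"}
DATE_FORMATS = ("%Y-%m-%d", "%d.%m.%Y", "%m/%d/%Y")


def normalise_country(raw):
    """Map the spellings used by different sources to one code."""
    key = raw.strip().lower()
    if key not in COUNTRY_CODES:
        raise ValueError(f"unknown country code: {raw!r}")
    return COUNTRY_CODES[key]


def normalise_date(raw):
    """Try each known source format in turn and return an ISO date string."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date format: {raw!r}")


clean = [
    (normalise_country(country), normalise_date(date))
    for country, date in [(" USA ", "31.05.2007"), ("de", "05/31/2007")]
]
```

Unknown values raise an error instead of being guessed, so that inconsistent records are caught before they reach the warehouse.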

The Data Warehouse and Marts Features

• Extract archived operational data
• Overcome inconsistencies between different
legacy data formats
• Integrate data throughout an enterprise,
regardless of location or format
• Incorporate additional or expert systems

Data Mining
Characteristics and Objectives
• Data mining tools extract information buried
in corporate files or archived public records
• The “miner” is often an end user
• “Striking it rich” usually involves finding
unexpected, valuable results
• Parallel processing

Data Mining Yields Five Types of
Information
• Association
• Sequences
• Classifications
• Clusters
• Forecasting

Data Mining Vs. DBMS
• DBMS - queries based on the data held, e.g. last month's sales for each
product, sales grouped by customers' age, etc.
– list of customers who lapsed their policies
• Data Mining - infers knowledge from the data held to answer queries, e.g.
– what characteristics do customers who lapsed their policies share, and
how do they differ from those who renewed their policies?
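
The contrast can be sketched as follows. The policy records are hypothetical, and the group comparison is only a crude stand-in for a real mining algorithm (such as a classifier); it is meant to show the difference in the kind of question asked.

```python
from statistics import mean

# Hypothetical policy records (illustrative only).
customers = [
    {"name": "Ann", "age": 25, "premium": 900, "lapsed": True},
    {"name": "Bob", "age": 31, "premium": 850, "lapsed": True},
    {"name": "Cy", "age": 52, "premium": 400, "lapsed": False},
    {"name": "Di", "age": 47, "premium": 450, "lapsed": False},
]

# DBMS-style query: return facts that are stored explicitly.
lapsed_names = [c["name"] for c in customers if c["lapsed"]]


def profile(rows, fields=("age", "premium")):
    """Average value of each field for a group of customers."""
    return {field: mean(r[field] for r in rows) for field in fields}


# Mining-style question: how do lapsed customers differ from renewed ones?
lapsed_profile = profile([c for c in customers if c["lapsed"]])
renewed_profile = profile([c for c in customers if not c["lapsed"]])
difference = {f: lapsed_profile[f] - renewed_profile[f] for f in lapsed_profile}
```

In this toy data the lapsed customers are younger and pay higher premiums, which is the kind of pattern a mining tool would surface without being asked for it by name.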

DM versus Statistics
• In statistical analysis you will never find what you are
not looking for
• In inferential statistics you hypothesize an inference,
then test the hypothesis
• DM allows you to discover patterns that you did not
know existed, so you can find things you did not start
out looking for
• A statistician is needed to use statistics
• A business end user can use DM
• Many of the techniques & algorithms used are
shared by both statisticians and data miners
Data Mining Examples
• Customer Grouping and Behaviour Prediction
• Market Based Analysis and Up-Selling/Cross-Selling
• Credit Card Fraud Prediction
• Credit Risk Determination
• Pharmaceutical Industry: Drug Effectiveness by Patient Type
• Employee Turnover Predictions
• Defect Analysis in Manufacturing
• University and Employee Recruitment
Closed Loop Business Intelligence
Taking Action is What BA is All About
Business Drivers and Target Groups
Analytics and BI – the LINK
Business Intelligence: WHAT happened?

Analytics:
• WHEN something happened?
• WHO will it happen to?
• WHY something happened?
• Will it happen again?
BI Vs. BA
• BI uses past and present data sets to optimize
current success, while BA uses historic data
and analyzes the present to predict the future.
• BI is about current improvement, whereas BA
is about future planning.
• BI always focuses on “What happened”,
whereas BA focuses on “Why it happened”,
“Whether it will happen again”, “Whom it will
happen to” and “When it will happen again”.

So, What Are Analytics?
[Figure: ladder of capabilities, plotted as Competitive Advantage against Degree of Intelligence, from highest to lowest]
Analytics
• Decision Optimization: What’s the best that can happen?
• Predictive Analytics: What will happen next?
• Forecasting: What if these trends continue?
• Statistical models: Why is this happening?
Reporting
• Alerts: What actions are needed?
• Query/drill down: Where exactly is the problem?
• Ad hoc reports: How many, how often, where?
• Standard reports: What happened?
Business Analytics - Inclusions
Types of Analytics
• Descriptive Analytics: Gain insight from historical
data with reporting, scorecards, clustering etc.
• Predictive analytics: Predictive modeling using
statistical and machine learning techniques
• Prescriptive analytics: Recommend decisions
using optimization, simulation etc.
• Decisive analytics: Supports human decisions
with visual analytics that the user models to reflect
reasoning.
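
The first three types can be contrasted on one small data set. This is a sketch under stated assumptions: the sales figures and stock options are hypothetical, the predictive step is a plain least-squares trend line standing in for a statistical model, and the prescriptive step is a simple rule rather than a true optimization.

```python
# Hypothetical monthly sales (illustrative numbers only).
sales = [100, 110, 120, 130]

# Descriptive: summarise what happened.
average_sales = sum(sales) / len(sales)

# Predictive: fit a least-squares trend line and extrapolate one month ahead.
n = len(sales)
xs = range(n)
x_mean = sum(xs) / n
y_mean = average_sales
slope = (
    sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, sales))
    / sum((x - x_mean) ** 2 for x in xs)
)
intercept = y_mean - slope * x_mean
forecast = intercept + slope * n

# Prescriptive: recommend the smallest stock option that covers the forecast.
# (min() raises ValueError if no option is large enough.)
stock_options = [120, 150, 180]
recommended_stock = min(option for option in stock_options if option >= forecast)
```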
Scope of Business Analytics

Customer Intelligence

Marketing Analytics

Risk Estimation

Web Analytics

Social Media Analytics


& many more !
Analytics and BIG Data – the LINK
• Data: petabytes
• Reports: terabytes
• Excel: gigabytes
• PowerPoint: megabytes
• Analytics: bytes
• One business decision based on
Analytics: PRICELESS….
What Does Multi-dimensional Mean? (1)

1388486

• A material number?
• A telephone number?
• Revenue in September 2007?
• My boss’s salary?

• A key figure without a relationship to any object makes no sense!
Multi-dimensional (2)

• San Francisco had 1,388,486 USD revenue in May 2007 by selling Mountain Bikes

Month: May
Year: 2007
Revenue in USD: 1,388,486
Group: Mountain Bike
Sales Organisation: San Francisco
Multi-dimensional (3)

Multi-dimensional means a key figure always relates to one or more objects.

• Key figure: Revenue
• Object: Values
– Month → May
– Year → 2007
– Sales Organisation → San Francisco
– Material Group → Mountain Bike
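
One way to picture this is a key figure stored under the objects it relates to. The sketch below uses the values from the slide; the type and field names are illustrative choices, not SAP terminology.

```python
from typing import NamedTuple


class Coordinates(NamedTuple):
    """The objects a key figure relates to (field names are illustrative)."""
    year: int
    month: str
    sales_organisation: str
    material_group: str


# Key figure "revenue in USD", addressed by its coordinates.
revenue_usd = {
    Coordinates(2007, "May", "San Francisco", "Mountain Bike"): 1_388_486,
}


def describe(coords, value):
    """Turn a bare number back into a meaningful statement."""
    return (
        f"{coords.sales_organisation} had {value:,} USD revenue in "
        f"{coords.month} {coords.year} by selling {coords.material_group}s"
    )


key = Coordinates(2007, "May", "San Francisco", "Mountain Bike")
sentence = describe(key, revenue_usd[key])
```

Without the coordinates, 1388486 is just a number; with them, it becomes the statement on the previous slide.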
What is OLAP?

On-line Analytical Processing is a software technology which
allows end-user driven, fast and interactive data analysis.

Revenue
1,388,486 USD
Excel PivotTable

• What is the revenue in each country in 2007?

Excel PivotTable (1)
Excel PivotTable (2)

• What is the revenue in each country per month?

Excel PivotTable (3)
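
The two PivotTable questions can also be answered with a few lines of code. This is a minimal sketch over hypothetical fact rows; it mimics what a PivotTable computes and is not how Excel implements it.

```python
from collections import defaultdict

# Hypothetical fact rows: (country, year, month, revenue in USD).
facts = [
    ("USA", 2007, "May", 1000),
    ("USA", 2007, "June", 500),
    ("Germany", 2007, "May", 700),
    ("Germany", 2006, "May", 300),
]


def pivot(rows, row_key, col_key=None, where=lambda r: True):
    """Sum revenue into a row/column grid, like a PivotTable with a filter."""
    table = defaultdict(lambda: defaultdict(int))
    for r in rows:
        if where(r):
            column = col_key(r) if col_key else "Total"
            table[row_key(r)][column] += r[3]
    return {row: dict(cols) for row, cols in table.items()}


def in_2007(r):
    return r[1] == 2007


# What is the revenue in each country in 2007?
by_country = pivot(facts, row_key=lambda r: r[0], where=in_2007)

# What is the revenue in each country per month (in 2007)?
by_country_month = pivot(facts, row_key=lambda r: r[0],
                         col_key=lambda r: r[2], where=in_2007)
```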
From Table View to a 3-Dimensional Cube
Hierarchies Allow Subcubes in a Cube

Example
• Product group → products
• Year → quarter → months
• Country → sales organisation
Cube Navigation (1)

• Slice
Cube Navigation (2)

• Rotation
Cube Navigation (3)

• Rotation, Example
– Material Group and Country visible, Year restricted to 2007
– Material Group and Year visible, all Countries aggregated
Cube Navigation (4)

• Dice
Cube Navigation (5)

• Drill-Down / Roll-Up
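
The navigation operations from the preceding slides can be sketched on a tiny cube. The cell values, dimension names and the helper functions `select` and `view` are hypothetical; they only illustrate the idea and do not reflect how an OLAP engine stores data.

```python
from collections import defaultdict

# Hypothetical cube cells: (material_group, country, year, quarter) -> revenue.
DIMS = ("material_group", "country", "year", "quarter")
cells = {
    ("Mountain Bike", "USA", 2007, "Q1"): 100,
    ("Mountain Bike", "USA", 2007, "Q2"): 150,
    ("Mountain Bike", "Germany", 2007, "Q1"): 80,
    ("Racing Bike", "USA", 2007, "Q1"): 60,
    ("Racing Bike", "Germany", 2006, "Q4"): 40,
}


def select(cube, **fixed):
    """Keep only cells whose dimension values match the given restrictions."""
    def matches(key):
        for dim, allowed in fixed.items():
            if not isinstance(allowed, (set, frozenset, list, tuple)):
                allowed = {allowed}
            if key[DIMS.index(dim)] not in allowed:
                return False
        return True
    return {k: v for k, v in cube.items() if matches(k)}


def view(cube, *visible):
    """Show the listed dimensions and aggregate over all the others."""
    idx = [DIMS.index(d) for d in visible]
    out = defaultdict(int)
    for key, value in cube.items():
        out[tuple(key[i] for i in idx)] += value
    return dict(out)


slice_2007 = select(cells, year=2007)                            # slice
dice = select(cells, country={"USA"}, quarter={"Q1", "Q2"})      # dice
rotation_a = view(slice_2007, "material_group", "country")       # before rotation
rotation_b = view(cells, "material_group", "year")               # after rotation
roll_up = view(cells, "year")                                    # roll-up
drill_down = view(cells, "year", "quarter")                      # drill-down
```

Here `rotation_a` and `rotation_b` mirror the rotation example: first Material Group and Country with Year restricted to 2007, then Material Group and Year with all Countries aggregated.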
Exercises

• OLAP navigation using Excel PivotTables

• OLAP navigation in SAP


– Business Explorer Analyzer (Excel)
– Business Explorer Web (Browser)
BI Architecture
[Figure: layers from top to bottom]
• Data analysis and analytical applications: OLAP, MIS, cockpits, …; planning, scorecard, …
• Multi-dimensional data: information models; aggregation
• Data warehouse: data storage; administration
• Extraction, transformation, loading: selection, extraction; modification, loading
• Source systems: external data sources; internal data sources
Information Systems
[Figure: pyramid of integrated information systems, with vertical integration between levels and horizontal integration within each level. From top to bottom:]
• Strategic Enterprise / Corporate Management Planning
• Analysis: Accounting and Controlling
• Disposition and Planning: Supplier Management, Production Planning, Cost Planning, …
• Administration I, Value-Oriented Processing: Warehouse, Supplier, Customer and Employee (Accounting, Invoices, Salary)
• Administration II, Amount-Oriented Processing: Purchasing, Stocks, Sales, Personnel
Source: Mertens, P., Meier, M.: Integrierte Informationsverarbeitung (2009), 1.
OLTP versus OLAP

[Figure: the same information systems pyramid. The upper levels (planning, analysis, disposition) are marked as On-line Analytical Processing; the lower levels (Administration I and II) are marked as On-line Transactional Processing.]
OLTP Versus OLAP (1)

OLTP | OLAP
Optimized to get data in | Optimized to get data out
For administration and daily business | For management and decisions
Processes a small amount of data per transaction | Processes a large amount of data per transaction
Business-critical availability | Less critical availability
Data updates online | Data updates regularly
Data overwritten | Data are time-dependent
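
The last two rows of the comparison can be illustrated with Python's built-in `sqlite3` module. The table names and values are hypothetical; the point is only that the operational table overwrites its state while the analytical table keeps time-dependent snapshots.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# OLTP-style table: one row per product, overwritten on every change.
cur.execute("CREATE TABLE stock (product TEXT PRIMARY KEY, quantity INTEGER)")
cur.execute("INSERT INTO stock VALUES ('bike', 10)")
cur.execute("UPDATE stock SET quantity = 7 WHERE product = 'bike'")  # old value is lost

# OLAP-style table: time-dependent snapshots that are appended, never updated.
cur.execute(
    "CREATE TABLE stock_history (snapshot_date TEXT, product TEXT, quantity INTEGER)"
)
cur.executemany(
    "INSERT INTO stock_history VALUES (?, ?, ?)",
    [("2007-05-01", "bike", 10), ("2007-05-02", "bike", 7)],
)

current_quantity = cur.execute(
    "SELECT quantity FROM stock WHERE product = 'bike'"
).fetchone()[0]
history = cur.execute(
    "SELECT snapshot_date, quantity FROM stock_history "
    "WHERE product = 'bike' ORDER BY snapshot_date"
).fetchall()
conn.close()
```

Only the history table can answer a question such as "what was the stock level yesterday?", which is why analytical systems keep snapshots.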
OLTP Versus OLAP (2)

It is very hard to get all-in-one!
