OLAP – On Line Analytical Processing

Session Objectives
Objectives: At the end of this session, you will be able to:
> Define On Line Analytical Processing
> Understand the need for OLAP and the applications of OLAP in BI
> Describe the various OLAP solutions and architectures
> Compare different OLAP architectures
> Identify the evaluation parameters to consider when selecting an OLAP tool


What is OLAP?

> OLAP (On Line Analytical Processing) applications are designed for online ad hoc data access and analysis.
> Data is organized into multiple dimensions.
> Provides access to analytical content such as time series and trend analysis views, and summary-level information.
> A set of functionality that facilitates multidimensional analysis.
> Offers drill-down, drill-across, and slice-and-dice capabilities.

OLAP - Fast Analysis

On Line Analytical Processing

• No piles of paper, please!
• Establish patterns
• Data-based

Fast Analysis of Shared Multidimensional Information


Need for OLAP

• How many dimensions can we think in? E.g. analysis by branch, product, agent, year: 2 or 3
• How many types of values can we handle? E.g. Sales, Profit, Cost: 1 or 2
• How many levels can we handle? E.g. the number of products we can analyze

Need for OLAP

• Many parameters affect a measure (value), e.g. sales are influenced by product, region, time, distribution channel, etc.
• Linear analysis = reports
• Many totals are at one level
• Difficult to identify the key parameters

OLAP in an Enterprise


Uses of OLAP
Departments:
• Finance
• Marketing
• Sales
• Manufacturing

Analytical Capabilities:
> Used by analysts and managers.
> Offers an aggregated view of the data, such as total revenues by customer profile, by product line, or by geographical region.

Functionality of OLAP Tools

> Provides the decision support front-end for data warehousing.
> Advanced statistical, financial, and analytical calculations.
> Appropriate tools to access data from a relational database.
> Appropriate tools to access or manage multidimensional data.

Features of OLAP Applications
OLAP analytical features:
> Multi-dimensional views of data
> Calculation-intensive capabilities
> Time intelligence

The calculation engine in OLAP tools has a wide range of built-in calculations, such as:
> Ratios
> Time calculations
> Statistics
> Ranking
> Custom formulas/algorithms
> Forecasting and modeling

Evolution of OLAP

Star Schema

> A Star Schema is a dimensional model created by mapping data entities from operational systems
> It has a central table (fact table) that links all the other tables (dimension tables) together
> Dimension: A category of related information. For example, year, month, day, and week are all part of the Time dimension.
> Measure: A numeric property that can be summed or averaged, often using precomputed aggregates.
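The fact/dimension layout described above can be sketched with an in-memory SQLite database. This is a minimal illustration only: the table names (`dim_time`, `dim_product`, `fact_sales`), the columns, and the sample rows are assumptions made up for the example and do not come from the slides.

```python
import sqlite3


def build_star_schema():
    """Create a tiny star schema: one fact table linked to two dimension tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
        -- The fact table holds the measure plus one foreign key per dimension.
        CREATE TABLE fact_sales (
            time_id INTEGER REFERENCES dim_time(time_id),
            product_id INTEGER REFERENCES dim_product(product_id),
            sales_amount REAL
        );
        INSERT INTO dim_time VALUES (1, 2001, 1), (2, 2001, 2), (3, 2002, 1);
        INSERT INTO dim_product VALUES (1, 'Milk'), (2, 'Bread');
        INSERT INTO fact_sales VALUES (1, 1, 10.0), (1, 2, 5.0), (2, 1, 7.0), (3, 2, 4.0);
        """
    )
    return conn


def sales_by_year(conn):
    """Sum the measure, grouped by an attribute of the Time dimension."""
    return conn.execute(
        """
        SELECT t.year, SUM(f.sales_amount)
        FROM fact_sales AS f
        JOIN dim_time AS t ON f.time_id = t.time_id
        GROUP BY t.year
        ORDER BY t.year
        """
    ).fetchall()


print(sales_by_year(build_star_schema()))  # [(2001, 22.0), (2002, 4.0)]
```

Every analytical query follows the same pattern: join the fact table to the dimensions that qualify it, then aggregate the measure.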

Facts and Measures
> Facts or Measures are the Key Performance Indicators of an enterprise
> Factual data about the subject area
> Numeric, summarized

Examples (from the figure): Sales Revenue, Net Profit, Gross Margin, Profitability, Cost

Dimension
Sales Revenue (Measure)
• What was sold?
• Whom was it sold to?
• When was it sold?
• Where was it sold?

> Dimensions put measures in perspective
> What, when and where qualifiers to the measures
> Dimensions could be products, customers, time, geography etc.

Star Schema


Star Schema Example


Star Schema with Sample Data


CUBE
Cube
– Multidimensional databases store information in the form of cubes.
– A cube is a collection of facts and related dimensions stored together in arrays.

Figure: Sales and HR cubes with the axes Geography, Time, and Product

Basic Terminology of a Cube

> Hierarchy: A hierarchy defines the navigation path for drilling up and drilling down. All attributes in a hierarchy belong to the same dimension.
> Levels: These are organized into one or more hierarchies, typically from a coarse-grained level (for example, Year) down to the most detailed one (for example, Day).
> Members: The individual category values (for example, 2002 or 21Jan2002).
> Measures: These are the data values that are summarized and analyzed. Examples of measures are sales figures or operational costs.
> Cells: Each cell is the intersection of one member from every dimension and stores the data for the measures.
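To make the terminology concrete, here is a small sketch using plain Python dictionaries. The level names, the extra members, and the sales figures are illustrative assumptions; only the member 21Jan2002 appears in the slides.

```python
# Hypothetical Time dimension with one hierarchy, listed from coarse to fine levels.
TIME_LEVELS = ["Year", "Month", "Day"]

# Members at the Day level, each tagged with its ancestors in the hierarchy.
DAY_MEMBERS = {
    "21Jan2002": {"Year": "2002", "Month": "Jan2002"},
    "22Jan2002": {"Year": "2002", "Month": "Jan2002"},
    "03Feb2002": {"Year": "2002", "Month": "Feb2002"},
}

# Cells: one member per dimension (day, product) mapped to the measure values.
CELLS = {
    ("21Jan2002", "p1"): {"sales": 12},
    ("22Jan2002", "p1"): {"sales": 44},
    ("03Feb2002", "p2"): {"sales": 8},
}


def drill_up(cells, level, measure="sales"):
    """Summarize a measure at a coarser level of the Time hierarchy."""
    totals = {}
    for (day, product), measures in cells.items():
        key = (DAY_MEMBERS[day][level], product)
        totals[key] = totals.get(key, 0) + measures[measure]
    return totals


print(drill_up(CELLS, "Month"))  # {('Jan2002', 'p1'): 56, ('Feb2002', 'p2'): 8}
```

Drilling down is the reverse navigation: from the Month totals back to the Day-level cells they were computed from.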


Basic Terminology of a Cube

> Dimensions consist of
– Dimension Name
– Level
– Hierarchy
– Member

Figure: the Time dimension with two levels of detail, YEAR (1999, 2000, 2001) and QUARTER (Q1, Q2, Q3, Q4)

Aggregates

• Add up amounts for day 1
• In SQL: SELECT sum(amt) FROM SALE WHERE date = 1

sale
prodId  storeId  date  amt
p1      s1       1     12
p2      s1       1     11
p1      s3       1     50
p2      s2       1     8
p1      s1       2     44
p1      s2       2     4

Result: 81

Aggregates

• Add up amounts by day
• In SQL: SELECT date, sum(amt) FROM SALE GROUP BY date

sale
prodId  storeId  date  amt
p1      s1       1     12
p2      s1       1     11
p1      s3       1     50
p2      s2       1     8
p1      s1       2     44
p1      s2       2     4

ans
date  sum
1     81
2     48
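The two aggregate queries can be tried out with Python's built-in `sqlite3` module. The sketch below loads the six sale rows from the slide into an in-memory table; the column types and the `ORDER BY` clause are assumptions added so that the output is deterministic.

```python
import sqlite3

# The sale fact table from the slide: (prodId, storeId, date, amt).
SALE_ROWS = [
    ("p1", "s1", 1, 12),
    ("p2", "s1", 1, 11),
    ("p1", "s3", 1, 50),
    ("p2", "s2", 1, 8),
    ("p1", "s1", 2, 44),
    ("p1", "s2", 2, 4),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER)")
conn.executemany("INSERT INTO sale VALUES (?, ?, ?, ?)", SALE_ROWS)

# Add up amounts for day 1.
day1_total = conn.execute("SELECT sum(amt) FROM sale WHERE date = 1").fetchone()[0]

# Add up amounts by day.
by_day = conn.execute(
    "SELECT date, sum(amt) FROM sale GROUP BY date ORDER BY date"
).fetchall()

print(day1_total)  # 81
print(by_day)      # [(1, 81), (2, 48)]
```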


Another Example

• Add up amounts by day, product
• In SQL: SELECT prodId, date, sum(amt) FROM SALE GROUP BY date, prodId

sale
prodId  storeId  date  amt
p1      s1       1     12
p2      s1       1     11
p1      s3       1     50
p2      s2       1     8
p1      s1       2     44
p1      s2       2     4

ans
prodId  date  amt
p1      1     62
p2      1     19
p1      2     48

Rollup: from the (date, product) totals up to the totals by date. Drill-down: the reverse direction.
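A minimal sketch of this grouping, again with `sqlite3`. The result table on the slide contains a prodId column, so the query here selects prodId as well; the `ORDER BY` clause is an assumption added for a stable row order. The final loop shows a rollup of the finer result back to totals by date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER)")
conn.executemany(
    "INSERT INTO sale VALUES (?, ?, ?, ?)",
    [
        ("p1", "s1", 1, 12),
        ("p2", "s1", 1, 11),
        ("p1", "s3", 1, 50),
        ("p2", "s2", 1, 8),
        ("p1", "s1", 2, 44),
        ("p1", "s2", 2, 4),
    ],
)

# Drill-down view: totals by (date, product).
by_day_product = conn.execute(
    "SELECT prodId, date, sum(amt) FROM sale "
    "GROUP BY date, prodId ORDER BY date, prodId"
).fetchall()

# Rollup: collapse the product dimension to get totals by date.
rolled_up = {}
for _prod_id, date, amt in by_day_product:
    rolled_up[date] = rolled_up.get(date, 0) + amt

print(by_day_product)  # [('p1', 1, 62), ('p2', 1, 19), ('p1', 2, 48)]
print(rolled_up)       # {1: 81, 2: 48}
```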

Aggregates

> Operators: sum, count, max, min, median and avg
> “Having” clause
> Using dimension hierarchy
– average by region (within store)
– maximum by month (within date)
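The HAVING clause and an aggregate along a dimension hierarchy can be sketched on the same sale table. The threshold of 50 and the `store` lookup table are assumptions for illustration; the store-to-region mapping (s1 in Region A; s2 and s3 in Region B) is the one used on the "Aggregation Using Hierarchies" slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER)")
conn.executemany(
    "INSERT INTO sale VALUES (?, ?, ?, ?)",
    [
        ("p1", "s1", 1, 12),
        ("p2", "s1", 1, 11),
        ("p1", "s3", 1, 50),
        ("p2", "s2", 1, 8),
        ("p1", "s1", 2, 44),
        ("p1", "s2", 2, 4),
    ],
)

# Hypothetical dimension table carrying the store -> region hierarchy.
conn.execute("CREATE TABLE store (storeId TEXT PRIMARY KEY, region TEXT)")
conn.executemany("INSERT INTO store VALUES (?, ?)", [("s1", "A"), ("s2", "B"), ("s3", "B")])

# HAVING filters groups after aggregation: keep stores with total sales >= 50.
big_stores = conn.execute(
    "SELECT storeId, sum(amt) FROM sale "
    "GROUP BY storeId HAVING sum(amt) >= 50 ORDER BY storeId"
).fetchall()

# Aggregate using the hierarchy: average sale amount by region.
avg_by_region = conn.execute(
    "SELECT st.region, avg(s.amt) FROM sale AS s "
    "JOIN store AS st ON s.storeId = st.storeId "
    "GROUP BY st.region ORDER BY st.region"
).fetchall()

print(big_stores)  # [('s1', 67), ('s3', 50)]
print(avg_by_region)
```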

The MOLAP Cube

Fact table view:

sale
prodId  storeId  amt
p1      s1       12
p2      s1       11
p1      s3       50
p2      s2       8

Multi-dimensional cube (empty cells shown as -):

      s1  s2  s3
p1    12  -   50
p2    11  8   -

dimensions = 2
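The move from the fact table view to the cube view can be sketched as a simple pivot. The function name `build_cube` and the use of `None` for empty cells are choices made for this illustration, not something the slides prescribe.

```python
# Fact table rows from the slide: (prodId, storeId, amt).
FACT_ROWS = [("p1", "s1", 12), ("p2", "s1", 11), ("p1", "s3", 50), ("p2", "s2", 8)]


def build_cube(rows):
    """Pivot fact rows into a dense 2-D array indexed by [product][store]."""
    products = sorted({prod for prod, _store, _amt in rows})
    stores = sorted({store for _prod, store, _amt in rows})
    # None marks a cell for which the fact table has no row.
    cube = [[None] * len(stores) for _ in products]
    for prod, store, amt in rows:
        cube[products.index(prod)][stores.index(store)] = amt
    return products, stores, cube


products, stores, cube = build_cube(FACT_ROWS)
print(products)  # ['p1', 'p2']
print(stores)    # ['s1', 's2', 's3']
print(cube)      # [[12, None, 50], [11, 8, None]]
```

A dense array gives direct positional access to any cell, at the cost of storing the empty cells as well, which is why cube size and sparsity matter for MOLAP.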


3-D Cube

Fact table view:

sale
prodId  storeId  date  amt
p1      s1       1     12
p2      s1       1     11
p1      s3       1     50
p2      s2       1     8
p1      s1       2     44
p1      s2       2     4

Multi-dimensional cube (empty cells shown as -):

day 1  s1  s2  s3
p1     12  -   50
p2     11  8   -

day 2  s1  s2  s3
p1     44  4   -
p2     -   -   -

dimensions = 3

Example

Figure: a cube with the axes Store (NY, SF, LA), Product (Juice, Milk, Coke, Cream, Soap, Bread), and Time (M, T, W, Th, F, S, S). Example cell: 56 units of bread sold in LA on M.

Dimensions: Time, Product, Store
Attributes: Product (upc, price, …), Store …
Hierarchies:
– Product → Brand → … (roll-up to brand)
– Day → Week → Quarter (roll-up to week)
– Store → Region → Country (roll-up to region)

Cube Aggregation: Roll-up

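Example: computing sums ...

Starting cube (empty cells shown as -):

day 1  s1  s2  s3
p1     12  -   50
p2     11  8   -

day 2  s1  s2  s3
p1     44  4   -
p2     -   -   -

Sum over days:

      s1  s2  s3
p1    56  4   50
p2    11  8   -

Sum over days and products:

      s1  s2  s3
sum   67  12  50

Sum over days and stores:

      sum
p1    110
p2    19

Sum over all dimensions: 129

Rollup moves from the detailed cube toward the summed tables; drill-down moves back toward the detailed cube.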
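The roll-up sums can be reproduced with a short sketch that stores the cube as a dictionary keyed by (product, store, day). The function name `roll_up` and the `keep` parameter are hypothetical names chosen for this example.

```python
from collections import defaultdict

DIMS = ("product", "store", "day")

# The 3-D cube from the slide, stored sparsely: only non-empty cells are listed.
CELLS = {
    ("p1", "s1", 1): 12,
    ("p2", "s1", 1): 11,
    ("p1", "s3", 1): 50,
    ("p2", "s2", 1): 8,
    ("p1", "s1", 2): 44,
    ("p1", "s2", 2): 4,
}


def roll_up(cells, keep):
    """Sum the cube over every dimension that is not listed in `keep`."""
    positions = [DIMS.index(name) for name in keep]
    totals = defaultdict(int)
    for key, amt in cells.items():
        totals[tuple(key[i] for i in positions)] += amt
    return dict(totals)


print(roll_up(CELLS, ("product", "store")))  # sums over days
print(roll_up(CELLS, ("store",)))            # {('s1',): 67, ('s3',): 50, ('s2',): 12}
print(roll_up(CELLS, ("product",)))          # {('p1',): 110, ('p2',): 19}
print(roll_up(CELLS, ()))                    # {(): 129}
```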

Aggregation Using Hierarchies

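Starting cube (empty cells shown as -):

day 1  s1  s2  s3
p1     12  -   50
p2     11  8   -

day 2  s1  s2  s3
p1     44  4   -
p2     -   -   -

Hierarchy: store → region → country

Rolled up from store to region, summed over days:

      region A  region B
p1    56        54
p2    11        8

(store s1 in Region A; stores s2, s3 in Region B)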
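A sketch of the same aggregation in Python, using the store-to-region mapping stated on the slide. The dictionary layout and the function name `roll_up_to_region` are illustrative assumptions.

```python
from collections import defaultdict

# One step of the Store hierarchy: store -> region (from the slide).
STORE_TO_REGION = {"s1": "region A", "s2": "region B", "s3": "region B"}

# Cube cells keyed by (product, store, day).
CELLS = {
    ("p1", "s1", 1): 12,
    ("p2", "s1", 1): 11,
    ("p1", "s3", 1): 50,
    ("p2", "s2", 1): 8,
    ("p1", "s1", 2): 44,
    ("p1", "s2", 2): 4,
}


def roll_up_to_region(cells):
    """Replace each store by its region and sum over days."""
    totals = defaultdict(int)
    for (product, store, _day), amt in cells.items():
        totals[(product, STORE_TO_REGION[store])] += amt
    return dict(totals)


by_region = roll_up_to_region(CELLS)
print(by_region[("p1", "region A")], by_region[("p1", "region B")])  # 56 54
print(by_region[("p2", "region A")], by_region[("p2", "region B")])  # 11 8
```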


Slicing

• In SQL: SELECT * FROM SALE WHERE date = 1

Starting cube (empty cells shown as -):

day 1  s1  s2  s3
p1     12  -   50
p2     11  8   -

day 2  s1  s2  s3
p1     44  4   -
p2     -   -   -

Slice TIME = day 1:

      s1  s2  s3
p1    12  -   50
p2    11  8   -
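Slicing fixes one member of one dimension and returns the remaining lower-dimensional cube. A minimal sketch over the same cells, with `slice_by_day` as a hypothetical helper name:

```python
# Cube cells keyed by (product, store, day).
CELLS = {
    ("p1", "s1", 1): 12,
    ("p2", "s1", 1): 11,
    ("p1", "s3", 1): 50,
    ("p2", "s2", 1): 8,
    ("p1", "s1", 2): 44,
    ("p1", "s2", 2): 4,
}


def slice_by_day(cells, day):
    """Fix the Time dimension to one member and drop it from the keys."""
    return {(product, store): amt for (product, store, d), amt in cells.items() if d == day}


print(slice_by_day(CELLS, 1))
# {('p1', 's1'): 12, ('p2', 's1'): 11, ('p1', 's3'): 50, ('p2', 's2'): 8}
```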


OLAP Solutions and Architecture

OLAP - Classification

Online Analytical Processing (OLAP) can be done on:
> Relational databases
> Multidimensional databases

OLAP products are grouped into three categories:
> Relational OLAP (ROLAP)
> Multidimensional OLAP (MOLAP)
> Hybrid OLAP (HOLAP)

MOLAP

• Multi-dimensional OLAP
• MOLAP is a technology that uses a multidimensional database, which stores data as an n-dimensional cube

Figure: a cube with the axes Brand, Geography, and Age Group

Architecture of MOLAP
Figure: MOLAP architecture
– Data Mart Server: RDBMS, Connectivity Middleware
– MOLAP Server: MDDBMS/Data Cube, MOLAP Application
– Non-live connection: used for updating the MOLAP data cube only
– Desktop Systems: MOLAP Client Tools
– Thin Clients: WWW Browser
– Network: LAN; Router/Firewall to Intranet/Internet
– Figure labels: Cube, Critical Size

Issues:
• Size of Data Cube
• Cubes deployment
• Size of Update Data Set

MOLAP Products

• Oracle's Oracle Express Server

• Cognos - Powerplay Transformer

• Essbase (Hyperion Software)

• Holos (Seagate Software)


Architecture of ROLAP

Figure: ROLAP architecture
– Data Mart Server: RDBMS, Connectivity Middleware
– ROLAP Server: ROLAP Application
– Desktop Systems: ROLAP Client Tools
– Thin Clients: WWW Browser
– Network: LAN; Router/Firewall to Intranet/Internet

Issues:
• Aggregate Awareness
• Response Time
• Network Capacity

ROLAP Products

• Brio Query Enterprise
• Business Objects
• Metacube
• DSS Server
• Information Advantage

Architecture of HOLAP

Figure: HOLAP architecture
– MOLAP Server: MDDBMS/Data Cube, MOLAP Application
– ROLAP Server: ROLAP Application
– Desktop Systems: HOLAP Client Tools
– Network: LAN; Router/Firewall

Issues:
• Cube elements
• Integration with RDBMS

HOLAP Products

• Holos (Seagate Software)
• Microsoft SQL Server OLAP Services
• Pilot Software's Pilot Decision Support Suite
• SAS

MOLAP Vs ROLAP

Comparison of Architectures

Architectural Features             MOLAP                   ROLAP
Number of dimensions               Ten or less             Unlimited
Support for large number of users  Limited support         Good
Scalability                        Poor                    Good
Complex multidimensional analysis  Easier to achieve       Difficult to achieve
Volume of data storage             Up to 50 GB             Hundreds of gigabytes and terabytes
Storage of information             Through cubes           SQL result sets
User interface & functionality     Good
Common access language             NA                      Normal SQL
Nature of data                     Stores summarized data  Stores detailed as well as summarized data

Strength and Weakness of MOLAP/ROLAP
Parameters:

Application design
– MOLAP: Essentially the definition of the dimensional model and calculation rules
– ROLAP: Uses two-dimensional tables that are stored in RDBMSs (data is stored in a star schema or snowflake schema)

Aggregation techniques
– MOLAP: Measures are precalculated and stored at each hierarchy summary level during load time
– ROLAP: Summary tables are implemented in the relational database

Multidimensional analysis
– MOLAP: Drill down, drill up, drill across, and slicing/dicing
– ROLAP: Drill down, drill up, slicing and dicing

Query performance
– MOLAP: Instant response
– ROLAP: Slower

Value added functions
– MOLAP: Supports complex functions like % change, ranking, etc.
– ROLAP: Limited value added functions

User-defined calculations
– MOLAP: Calculated from cubes
– ROLAP: Calculated (on the fly) from the database
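The contrast between precalculating aggregates at load time and calculating them on the fly can be sketched as follows. This is a toy illustration under stated assumptions, not how any particular OLAP product works: the function names, the use of "*" to mean "all members", and the sample rows (taken from the sale table earlier in the deck) are choices made for the example.

```python
from collections import defaultdict
from itertools import combinations

DIMS = ("product", "store", "day")
ROWS = [
    ("p1", "s1", 1, 12),
    ("p2", "s1", 1, 11),
    ("p1", "s3", 1, 50),
    ("p2", "s2", 1, 8),
    ("p1", "s1", 2, 44),
    ("p1", "s2", 2, 4),
]


def precompute_all(rows):
    """MOLAP-style: at load time, store a total for every combination of dimensions."""
    n = len(DIMS)
    agg = defaultdict(int)
    for *members, amt in rows:
        for k in range(n + 1):
            for kept in combinations(range(n), k):
                # "*" stands for "summed over all members of this dimension".
                key = tuple(members[i] if i in kept else "*" for i in range(n))
                agg[key] += amt
    return dict(agg)


def query_on_the_fly(rows, product="*", store="*", day="*"):
    """ROLAP-style: scan the detail rows and aggregate at query time."""
    wanted = (product, store, day)
    return sum(
        row[3]
        for row in rows
        if all(w == "*" or w == m for w, m in zip(wanted, row[:3]))
    )


cube = precompute_all(ROWS)
print(cube[("p1", "*", "*")])                # 110, a dictionary lookup
print(query_on_the_fly(ROWS, product="p1"))  # 110, a scan of the detail rows
```

The precomputed dictionary answers any total with one lookup but grows quickly with the number of dimensions and must be rebuilt when the data changes; the scan needs no extra storage but does the work on every query.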


Strength and Weakness of MOLAP/ROLAP

Parameters:

Processing overhead for large input data sets
– MOLAP: High
– ROLAP: Low

Support for frequent updates
– MOLAP: Cannot handle frequent update of cubes
– ROLAP: Suitable for frequent updates

Resource requirements
– MOLAP: High
– ROLAP: Low

Industry standard
– MOLAP: No current standards
– ROLAP: SQL standard

Access to the database through ODBC
– MOLAP: The databases have proprietary APIs and do not provide access through ODBC
– ROLAP: Provides access through ODBC

OLAP Tool Selection

Parameters to be Considered for an OLAP Tool Selection
Parameters and Features:
• Openness: Openness to standard reporting tools
• Ad hoc reporting: Ad hoc query performance and reporting capabilities
• Read-write: Multi-user read-write applications
• Integration: Integration with the organization’s enterprise-wide environment
• Cost: Cost of ownership, training, and installation
• Compatibility: Compatibility with the enterprise computing environment
• Database: Database size capacity of the product
• Scalability: Ability of the tool to scale to the required number of dimensions
• Analysis of detail data: Ability of the tool to support analysis against atomic data sets

Parameters to be Considered for an OLAP Tool Selection
Parameters and Features:
• RDBMS integration: Ability of the OLAP tool to integrate directly with relational databases and non-numeric relational data
• Run-time calculations: Ability to perform calculations at run time
• Data loading: Data loading performance of the OLAP product
• Key features: Key features offered by the tool, such as write-back, allocation calculations, sophisticated currency conversions, printed report quality, spreadsheet interface, etc.
• Integration with other systems: Integration with other related systems, such as e-mail and data warehouses
• Deployment architectures: Ability to support various deployments, such as stand-alone, high-speed client/server, intranet, extranet, and Internet

Which is Preferred ?

Features
• Calculation intensity, complexity
• Data sparsity
• Database update
• Data volatility
• Volume of data
• Development time, learning curve
• Standards, interoperability
• Query response time
• Consistency, reliability
• Data loading time
• Security
• Network impact
• Vendor stability

MOLAP
√ √

ROLAP
√ √ √ √ √

√ √

√ √ √


OLAP - Summary
> Offers fast, flexible data summarization and analysis.
> OLAP servers are a superior technology for BI applications.
> Ability to summarize data in multiple ways and view trends over time.
> OLAP servers and relational databases can work in harmony.

Session Summary

In this session, we have:
> Understood the need for OLAP and the significance of multidimensional analysis in a Data Warehouse.
> Discussed the evolution of OLAP.
> Explained the architectures, characteristics, and the merits and demerits of various OLAP solutions.

Thank you
