
What is OLAP

• Basic idea: converting data into information that
decision makers need
• Concept: analyzing data along multiple dimensions in a
structure called a data cube
OLAP Is FASMI
• Fast
• Analysis
• Shared
• Multidimensional
• Information
(Fast Analysis of Shared Multidimensional Information)
History
• In 1993, E. F. Codd came up with the term online
analytical processing (OLAP) and proposed 12
criteria to define an OLAP database
• The term OLAP seems perfect to describe databases
designed to facilitate decision making (analysis) in
an organization
Purpose of OLAP
• To derive summarized information from large-volume
databases
• To generate automated reports for human viewing
Why OLAP over a Relational Database? (I)
• Consistently fast response
• OLAP obtains a consistently fast response by
pre-storing calculated values
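A minimal sketch of the pre-storing idea, using hypothetical in-memory records rather than any particular OLAP product. Totals are computed once at load time, so a later summary query becomes a dictionary lookup instead of a scan. The names `SALES`, `build_aggregates`, and `units_sold` are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical detail records: (city, product, units sold).
SALES = [
    ("Mumbai", "Cheese", 3),
    ("Mumbai", "Swiss Rolls", 4),
    ("Pune", "Cheese", 3),
    ("Pune", "Swiss Rolls", 4),
]


def build_aggregates(rows):
    """Pre-store totals per city, per product, and overall (done once at load time)."""
    totals = defaultdict(int)
    for city, product, units in rows:
        totals[("city", city)] += units
        totals[("product", product)] += units
        totals[("all", None)] += units
    return dict(totals)


AGGREGATES = build_aggregates(SALES)


def units_sold(level, value=None):
    """Answer a summary query with a single lookup instead of rescanning SALES."""
    return AGGREGATES[(level, value)]


print(units_sold("city", "Mumbai"))  # 7
print(units_sold("all"))             # 14
```

The trade-off is that every stored total costs space and must be refreshed when the detail data changes.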
Why OLAP over a Relational Database? (II)
• Metadata-based queries
• OLAP provides analysis functions that are difficult or
impossible to express in SQL
• SQL was developed primarily for transaction
systems, not for reporting applications
Why OLAP over a Relational Database? (III)
• Spreadsheet-style formulas
• The data structure is designed with users in mind.
• Spreadsheets are key components of business
management because they are intuitive to create
Step I
1. Identify multidimensional data
• Measure attributes
(measure some value and can be aggregated)
• Dimension attributes
(define the dimensions along which measure attributes
are summarized)
(Cont.)
• Each dimension is typically expressed as a
“hierarchy”
• Hierarchy: analysts are interested in different levels of
detail within a dimension (for example, day, month, quarter, year)
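As an illustration only, a Time dimension hierarchy could be sketched as a function that maps one date to each of its levels. The level names (`day`, `month`, `quarter`, `year`) and the function `time_hierarchy` are assumptions made for this example, not part of the original material.

```python
from datetime import date


def time_hierarchy(day):
    """Return the levels of a hypothetical Time hierarchy for one date, finest first."""
    quarter = (day.month - 1) // 3 + 1
    return {
        "day": day.isoformat(),
        "month": f"{day.year}-{day.month:02d}",
        "quarter": f"{day.year}-Q{quarter}",
        "year": str(day.year),
    }


levels = time_hierarchy(date(1998, 2, 1))
print(levels["month"])    # 1998-02
print(levels["quarter"])  # 1998-Q1
```

Grouping a measure by any one of these keys gives the summary at that level of detail.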
Step II
1. Analyze multidimensional data in a cross-tabulation
• Row header: value of one attribute
• Column header: value of another attribute
• Individual cell: aggregated value for that row and column
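A cross-tabulation can be sketched in plain Python as a nested dictionary. The rows below are modeled on the fact table shown in the case study later in the deck; the helper names `cross_tab` and `render` are assumptions for this example.

```python
from collections import defaultdict

# (city, product, units) rows, modeled on the case-study fact table.
ROWS = [
    ("Mumbai", "Wheat Bread", 3),
    ("Mumbai", "Cheese", 4),
    ("Pune", "Wheat Bread", 3),
    ("Pune", "Cheese", 4),
    ("Mumbai", "Swiss Rolls", 16),
]


def cross_tab(rows):
    """Build {row_header: {column_header: summed measure}}."""
    table = defaultdict(lambda: defaultdict(int))
    for city, product, units in rows:
        table[city][product] += units
    return {city: dict(cols) for city, cols in table.items()}


def render(table):
    """Format the cross-tab as aligned text; missing cells show as 0."""
    columns = sorted({c for cols in table.values() for c in cols})
    lines = ["".ljust(8) + "".join(c.rjust(13) for c in columns)]
    for row in sorted(table):
        cells = "".join(str(table[row].get(c, 0)).rjust(13) for c in columns)
        lines.append(row.ljust(8) + cells)
    return "\n".join(lines)


print(render(cross_tab(ROWS)))
```

Here City supplies the row headers, Product the column headers, and each cell holds the summed units.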
Step III
1. Visualize the n-dimensional cube (the data cube)
• The word "cube" describes what, in the
relational world, would be the integration
of the fact table with its dimension tables
Step IV
• After you design the cube, you will use the cube's
structure to build a relational database (known as a
star schema) to house the data for the cube
Step V
• Once you load data into the relational database, and
then into the cube, you'll be able to see how
attributes, dimensions, measures, and measure
groups fit together within a cube to create a powerful
analytical tool.
Star Schema
• Cubes are easily stored in relational databases,
using a denormalized data structure called the
star schema, popularized by Ralph Kimball
• The star schema starts with a central fact table
• Each row in the central fact table contains some
combination of keys that makes it unique. These
keys identify the dimensions; each one refers to a
row in a dimension table.
Case Study
• Afco Foods & Beverages is a new company
that produces dairy, bread, and meat
products, with its production unit located in
Baroda.
• Its products are sold in the North, North West,
and Western regions of India.
• It has sales units in Mumbai, Pune,
Ahmedabad, Delhi, and Baroda.
• The president of the company wants sales
information.
Sales Information
Report: The number of units sold.

  113

Report: The number of units sold over time.

  January  February  March  April
       14        41     33     25


Sales Information
Report: The number of items sold for each product over time
(rows: Product; columns: Time).

              Jan  Feb  Mar  Apr
Wheat Bread     -    -    6   17
Cheese          6   16    6    8
Swiss Rolls     8   25   21    -
Sales Information
Report: The number of items sold in each city for each product
over time (dimensions: City, Product, Time).

                      Jan  Feb  Mar  Apr
Mumbai  Wheat Bread     -    -    3   10
        Cheese          3   16    6    -
        Swiss Rolls     4   16    6    -
Pune    Wheat Bread     -    -    3    7
        Cheese          3    -    -    8
        Swiss Rolls     4    9   15    -
Report: The number of items sold (U) and income (Rs) in each region
for each product over time.

                          Jan         Feb         Mar         Apr
                          Rs   U      Rs   U      Rs   U      Rs   U
Mumbai  Wheat Bread        -   -       -   -    7.44   3   24.80  10
        Cheese          7.95   3   42.40  16   15.90   6       -   -
        Swiss Rolls     7.32   4   29.98  16   10.98   6       -   -
Pune    Wheat Bread        -   -       -   -    7.44   3   17.36   7
        Cheese          7.95   3       -   -       -   -   21.20   8
        Swiss Rolls     7.32   4   16.47   9   27.45  15       -   -

Sales Measures & Dimensions
• Measures – Units sold, Amount
• Dimensions – Product, Time, Region
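One way to picture the split between measures and dimensions is a record type whose dimension fields are used for grouping and whose measure fields are summed. This is a sketch under assumptions: the class `SalesFact`, the function `total_by`, and the handful of sample records are illustrative, not a prescribed design.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SalesFact:
    # Dimension attributes: describe the context of the sale.
    product: str
    month: str
    city: str
    # Measure attributes: numeric values that can be aggregated.
    units: int
    rupees: float


# A small illustrative sample of sales records.
FACTS = [
    SalesFact("Cheese", "Jan", "Mumbai", 3, 7.95),
    SalesFact("Cheese", "Feb", "Mumbai", 16, 42.40),
    SalesFact("Cheese", "Jan", "Pune", 3, 7.95),
    SalesFact("Cheese", "Apr", "Pune", 8, 21.20),
]


def total_by(facts, dimension):
    """Sum both measures for each value of the chosen dimension attribute."""
    totals = defaultdict(lambda: [0, 0.0])
    for fact in facts:
        key = getattr(fact, dimension)
        totals[key][0] += fact.units
        totals[key][1] += fact.rupees
    return {key: (units, round(rupees, 2)) for key, (units, rupees) in totals.items()}


print(total_by(FACTS, "city"))
print(total_by(FACTS, "month"))
```

The same function works for any dimension name, which mirrors how a cube lets the analyst pick the attribute to summarize by.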
Sales Data Warehouse Model
Fact Table

City    Product      Month     Units  Rupees
Mumbai  Wheat Bread  January       3    7.95
Mumbai  Cheese       January       4    7.32
Pune    Wheat Bread  January       3    7.95
Pune    Cheese       January       4    7.32
Mumbai  Swiss Rolls  February     16   42.40

City_ID  Prod_ID  Month     Units  Rupees
1        589      1/1/1998      3    7.95
1        1218     1/1/1998      4    7.32
2        589      1/1/1998      3    7.95
2        1218     1/1/1998      4    7.32
1        589      2/1/1998     16   42.40
Sales Data Warehouse Model
Product Dimension Tables

Prod_ID  Product_Name     Product_Category_ID
589      Wheat Bread      1
590      White Bread      1
288      Coconut Cookies  2

Product_Category_ID  Product_Category
1                    Bread
2                    Cookies
Sales Data Warehouse Model
Region Dimension Table

City_ID  City    Region     Country
1        Mumbai  West       India
2        Pune    NorthWest  India
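The fact and dimension tables above can be loaded into an in-memory SQLite database to show how a star-schema query joins them. This is a sketch, not the deck's own implementation: the table names `sales_fact` and `city_dim`, the function `measures_by_city`, and the choice to store the Month column as ISO date strings (reading 1/1/1998 as January 1998) are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE city_dim (
        city_id INTEGER PRIMARY KEY, city TEXT, region TEXT, country TEXT
    );
    CREATE TABLE sales_fact (
        city_id INTEGER REFERENCES city_dim(city_id),
        prod_id INTEGER,
        month   TEXT,
        units   INTEGER,
        rupees  REAL
    );
""")
conn.executemany("INSERT INTO city_dim VALUES (?, ?, ?, ?)", [
    (1, "Mumbai", "West", "India"),
    (2, "Pune", "NorthWest", "India"),
])
conn.executemany("INSERT INTO sales_fact VALUES (?, ?, ?, ?, ?)", [
    (1, 589, "1998-01-01", 3, 7.95),
    (1, 1218, "1998-01-01", 4, 7.32),
    (2, 589, "1998-01-01", 3, 7.95),
    (2, 1218, "1998-01-01", 4, 7.32),
    (1, 589, "1998-02-01", 16, 42.40),
])


def measures_by_city(connection):
    """Join the fact table to the city dimension and total the measures per city."""
    query = """
        SELECT d.city, SUM(f.units), ROUND(SUM(f.rupees), 2)
        FROM sales_fact AS f
        JOIN city_dim AS d ON d.city_id = f.city_id
        GROUP BY d.city
        ORDER BY d.city
    """
    return connection.execute(query).fetchall()


print(measures_by_city(conn))
```

Replacing `d.city` with `d.region` or `d.country` in the query would summarize the same facts at a coarser level without touching the fact table.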
[Figure: star schema diagram. The Sales Fact table is linked to the Time, Product, Product Category, and Region dimensions.]
Online Analytical Processing (OLAP)
• It enables analysts, managers, and executives to gain
insight into data through fast, consistent, interactive
access to a wide variety of possible views of
information that has been transformed from raw data
to reflect the real dimensionality of the enterprise as
understood by the user.
[Figure: data warehouse cube with Product, Region, and Time dimensions.]
Slicing & Dicing
• Additional functionality that can be thought of as
viewing a slice of the data cube, particularly when
values for multiple dimensions are fixed.
• Slicing/Dicing simply consists of selecting specific
values for these attributes, which are then displayed
on top of the cross-tab
Dicing

A related operation to slicing is dicing. In the case of dicing, you
define a subcube of the original space. The data you see is that
of one cell from the cube. Dicing gives you the smallest
available slice.
The Telecomm Slice
[Figure: the Telecomm slice of a data cube. Product axis: Household, Telecomm, Video, Audio. Regions axis: Europe, Far East, India. Sales Channel axis: Retail, Direct, Special.]
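Slicing and dicing can be sketched on a toy cube that uses the axis labels from the Telecomm figure. The cell values are made up, and the cube layout (a dictionary keyed by product, region, and channel) and the function names `slice_cube` and `dice_cube` are assumptions chosen for brevity.

```python
from itertools import product

PRODUCTS = ["Household", "Telecomm", "Video", "Audio"]
REGIONS = ["Europe", "Far East", "India"]
CHANNELS = ["Retail", "Direct", "Special"]

# Hypothetical cube: every (product, region, channel) cell holds a made-up value.
CUBE = {
    cell: index + 1
    for index, cell in enumerate(product(PRODUCTS, REGIONS, CHANNELS))
}


def slice_cube(cube, product_name):
    """Fix one dimension (Product) and keep the remaining Region x Channel plane."""
    return {
        (region, channel): value
        for (prod, region, channel), value in cube.items()
        if prod == product_name
    }


def dice_cube(cube, products, regions, channels):
    """Keep only the cells whose coordinates fall in the chosen value sets."""
    return {
        cell: value
        for cell, value in cube.items()
        if cell[0] in products and cell[1] in regions and cell[2] in channels
    }


telecomm = slice_cube(CUBE, "Telecomm")
print(len(telecomm))  # 9 cells: 3 regions x 3 channels

single_cell = dice_cube(CUBE, {"Telecomm"}, {"India"}, {"Retail"})
print(single_cell)  # {('Telecomm', 'India', 'Retail'): 16}
```

Fixing a single value on every dimension, as in the last call, yields the one-cell subcube that the Dicing slide describes.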


Rollup & Drill-down
• OLAP permits users to view data at any desired level
of granularity.
• Rollup: moving from finer-granularity data to coarser
granularity
• Drill-down: the opposite of rollup, moving from coarser
granularity to finer-granularity data
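A rough sketch of both operations, reusing the city, region, and country values from the region dimension table earlier in the deck. The unit totals and the function names `roll_up` and `drill_down` are assumptions for illustration.

```python
from collections import defaultdict

# City -> (region, country), as in the deck's region dimension table.
CITY_DIM = {
    "Mumbai": ("West", "India"),
    "Pune": ("NorthWest", "India"),
}

# Fine-grained data: units sold per city (hypothetical totals).
UNITS_BY_CITY = {"Mumbai": 23, "Pune": 7}


def roll_up(units_by_city, level):
    """Aggregate city-level units to the coarser 'region' or 'country' level."""
    index = {"region": 0, "country": 1}[level]
    totals = defaultdict(int)
    for city, units in units_by_city.items():
        totals[CITY_DIM[city][index]] += units
    return dict(totals)


def drill_down(units_by_city, level, value):
    """Return the city-level rows behind one region or country total."""
    index = {"region": 0, "country": 1}[level]
    return {c: u for c, u in units_by_city.items() if CITY_DIM[c][index] == value}


print(roll_up(UNITS_BY_CITY, "region"))               # {'West': 23, 'NorthWest': 7}
print(roll_up(UNITS_BY_CITY, "country"))              # {'India': 30}
print(drill_down(UNITS_BY_CITY, "country", "India"))  # {'Mumbai': 23, 'Pune': 7}
```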
Roll-up and Drill Down
Higher level of aggregation (roll up toward the top)
• Sales Channel
• Region
• Country
• State
• Location Address
• Sales Representative
Low-level details (drill down toward the bottom)
OLAP Implementation
• Multidimensional OLAP (MOLAP)
• Relational OLAP (ROLAP)
• Hybrid OLAP (HOLAP)
MOLAP
• The database is stored in a special, usually
proprietary, structure that is optimized for
multidimensional analysis.
• +: very fast query response time because data is
mostly pre-calculated
• -: practical limit on size because of the time
taken to calculate the database and the space
required to hold the pre-calculated values
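To get a feel for why size becomes a limit, the number of cells in a fully pre-calculated cube can be estimated under a simplifying assumption: the cube is dense, has no hierarchies, and each dimension gains one extra "ALL" member for its totals. The dimension sizes below are hypothetical.

```python
from math import prod


def base_cells(cardinalities):
    """Cells at the finest level: one per combination of dimension values."""
    return prod(cardinalities)


def cells_with_aggregates(cardinalities):
    """Cells when each dimension also gets an 'ALL' member for its totals."""
    return prod(c + 1 for c in cardinalities)


# Hypothetical dimension sizes: 3 products, 2 cities, 4 months.
small = [3, 2, 4]
print(base_cells(small), cells_with_aggregates(small))    # 24 60

# Adding a fourth dimension with 100 members multiplies both counts.
larger = small + [100]
print(base_cells(larger), cells_with_aggregates(larger))  # 2400 6060
```

Because the count is a product over dimensions, each added dimension or hierarchy level multiplies both the storage and the calculation time.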
ROLAP
• The database is a standard relational database
and the database model is a multidimensional
model, often referred to as a star or snowflake
model or schema.
• +: more scalable solution
• -: performance of the queries will be largely
governed by the complexity of the SQL and the
number and size of the tables being joined in the
query
HOLAP
• A hybrid of ROLAP and MOLAP
• Can be thought of as a virtual database whereby the
higher levels of the database are implemented as
MOLAP and the lower levels of the database as
ROLAP
DOLAP
• The previous terms refer to server-based
OLAP technologies
• DOLAP stands for Desktop OLAP
• DOLAP enables users to quickly pull together small
cubes that run on their desktops or laptops
Conclusion
• OLAP is a significant improvement over query
systems
• OLAP is an interactive system that shows different
summaries of multidimensional data as the user
selects attributes in a multidimensional data
cube
