
BUSINESS INTELLIGENCE

BUSINESS ANALYTICS AND DATA REPRESENTATION

REPRESENTATION OF DATA IN DATA WAREHOUSE

• Dimensional modeling – a retrieval-based system that supports high-volume query access
- Star schema – the most commonly used and the simplest style of dimensional modeling. It contains a central fact table surrounded by and connected to several dimension tables.
Fact table: contains the numerical measures (facts) needed to perform decision analysis and query reporting, together with the keys that link each row to the dimension tables.
Dimension tables: contain the descriptive attributes used to classify and aggregate the values in the fact table.
- Snowflake schema – an extension of the star schema in which dimension tables are normalized into further related tables, so that the diagram resembles a snowflake in shape
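
The star schema described above can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and all rows (dim_product, dim_store, fact_sales and their contents) are assumptions made up for illustration; they do not come from the slides.

```python
import sqlite3

# In-memory database; every table, column, and row here is a hypothetical example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_store   (store_id INTEGER PRIMARY KEY, city TEXT, region TEXT);
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        store_id   INTEGER REFERENCES dim_store(store_id),
        quantity   INTEGER,
        amount     REAL
    );
""")
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                 [(1, "Bolt", "Hardware"), (2, "Nut", "Hardware")])
conn.executemany("INSERT INTO dim_store VALUES (?, ?, ?)",
                 [(10, "Springfield", "Central"), (11, "Rivertown", "North")])
conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
                 [(1, 10, 5, 2.50), (1, 11, 3, 1.50), (2, 10, 4, 1.00)])

# One join per dimension: the fact table supplies the numbers,
# the dimension tables supply the labels used for grouping.
rows = conn.execute("""
    SELECT s.region, p.name, SUM(f.quantity) AS total_qty
    FROM fact_sales f
    JOIN dim_product p ON p.product_id = f.product_id
    JOIN dim_store   s ON s.store_id   = f.store_id
    GROUP BY s.region, p.name
    ORDER BY s.region, p.name
""").fetchall()
print(rows)
```

The fact table holds only keys and measures, so every report is a join from the center of the star out to whichever dimensions the question mentions.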
ANALYSIS OF DATA IN DATA WAREHOUSE

OLTP (online transaction processing):

A transaction system that is primarily responsible for capturing and storing data related to day-to-day business functions.

The main focus is on the efficiency of routine tasks.

OLAP (online analytical processing):

A system designed to meet the need for information extraction by providing effective and efficient ad hoc analysis of organizational data.

The main focus is on effectiveness.


OLAP VS. OLTP

                     OLTP                         OLAP
Users                Clerk, IT professional       Knowledge worker
Function             Day-to-day operations        Decision support
DB Design            Application-oriented         Subject-oriented
Data                 Current, up-to-date,         Historical,
                     detailed, relational         multidimensional
Usage                Repetitive                   Ad hoc
Access               Read/write                   Lots of scans
Unit of Work         Short, simple transaction    Complex query
# Records Accessed   Tens                         Millions
# Users              Thousands                    Hundreds
DB Size              100MB-GB                     100GB-TB

RELATIONAL DATABASE VS. DIMENSIONAL DATABASE
• A relational database is a collection of relations or tables
• Purpose – a relational database is designed mainly for updating data.
• Using this model, you can:
1) Examine the Sales table to find that the biggest purchase made is of 5 bolts.
2) Check the Order table to find that the purchase was made by Customer Id AAA002.
3) Check the Customer table to find that Customer Id AAA002 is actually Samantha Jones.

Imagine this retrieval being done on terabytes of data!
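
The three-step lookup above can be sketched in plain Python. Only "5 bolts", "AAA002", and "Samantha Jones" come from the slide; the order ids, the other rows, and the field names are placeholders invented for this sketch.

```python
# Three normalized tables, modelled as plain Python structures.
# Only the 5-bolt purchase, AAA002, and Samantha Jones come from the text;
# every other key and row is a made-up placeholder.
sales = [
    {"order_id": "ORD-1", "product": "nut", "quantity": 2},
    {"order_id": "ORD-2", "product": "bolt", "quantity": 5},
    {"order_id": "ORD-3", "product": "washer", "quantity": 1},
]
orders = {"ORD-1": "AAA001", "ORD-2": "AAA002", "ORD-3": "AAA001"}
customers = {"AAA001": "Placeholder Customer", "AAA002": "Samantha Jones"}

# Step 1: scan the Sales table for the largest purchase.
biggest = max(sales, key=lambda row: row["quantity"])
# Step 2: look up which customer placed that order.
customer_id = orders[biggest["order_id"]]
# Step 3: resolve the customer id to a name.
customer_name = customers[customer_id]

print(biggest["quantity"], biggest["product"], customer_id, customer_name)
```

Each step touches a different table, which is why answering even a simple question requires chaining lookups (joins) in a relational design.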


RELATIONAL DATABASE VS. DIMENSIONAL DATABASE
• Purpose – a dimensional database is designed mainly for retrieving data (BI).
• Using this database model, it is possible to build reports that answer questions such as:
1. Which customer purchased the most products from a specific region?
2. What quantity of a specific product was sold on a particular date, or in a particular month, quarter, or year?
3. In which city did a specific product sell the least?



MULTIDIMENSIONALITY

• Multidimensionality
– The ability to organize, present, and analyze data by several
dimensions, such as sales by region, by product, by
salesperson, and by time (= four dimensions)
– Done during business analytics application design using OLAP technology
• OLAP is a technology for information retrieval in Business Intelligence
MULTIDIMENSIONALITY

• Perhaps the best way to approach the multidimensional model is to look at the types of queries for which it is best suited.
• Examples of common queries in BI:
- "What is the total amount of sales recorded last year per state and per product category?"
- "What is the relationship between the profit gained by shipments of fewer than 10 items and the profit gained by shipments of more than 10 items?"
- A manager wants to know the sales of a product, by unit or dollar, in a certain geographic area, by a specific salesperson, during a specific month.
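
A query such as "total sales per state and per product category" amounts to summing one measure over a chosen set of dimensions. The following is a minimal sketch of that idea; the records, the field names, and the helper total_by are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical sales records: three dimensions (state, category, salesperson)
# and one measure (amount).
records = [
    {"state": "Ohio", "category": "Hardware", "salesperson": "Ann", "amount": 120.0},
    {"state": "Ohio", "category": "Hardware", "salesperson": "Ben", "amount": 80.0},
    {"state": "Ohio", "category": "Garden", "salesperson": "Ann", "amount": 50.0},
    {"state": "Texas", "category": "Hardware", "salesperson": "Ben", "amount": 200.0},
]

def total_by(records, *dimensions):
    """Sum the amount for every combination of the requested dimensions."""
    totals = defaultdict(float)
    for record in records:
        key = tuple(record[dim] for dim in dimensions)
        totals[key] += record["amount"]
    return dict(totals)

# The same data answers questions along different dimensions.
per_state_category = total_by(records, "state", "category")
per_salesperson = total_by(records, "salesperson")
print(per_state_category)
print(per_salesperson)
```

Changing the list of dimensions changes the question without changing the data, which is the core idea behind multidimensional analysis.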
CUBE

• The main operational structure in OLAP is based on a concept called a cube.
• A cube in OLAP is a multidimensional data structure (actual or virtual) that allows fast analysis of data.
• An OLAP cube is a set of data organized in a way that facilitates queries for aggregated/collective information, or in other words, online analytical processing. It is designed to give an overview analysis of what happened. (Wikipedia)
• The design of a cube database is planned and implemented by the database developer based on users' requirements.
UNDERSTANDING CUBE (OLAP OPERATIONS)

• Slice – a subset of a multidimensional array, obtained by fixing a single value on one dimension
• Dice – a sub-cube obtained by selecting values on two or more dimensions
• Drill Down/Up – navigating among levels of data ranging from the most summarized (up) to the most detailed (down)
• Roll Up – aggregating the data along one or more dimensions, for example from city level to region level
• Pivot – changing the dimensional orientation of a report or an ad hoc query page display

• Slice and dice: to slice and dice a cube is to break a body of information down into smaller parts or to examine it from different viewpoints so that you can understand it better.
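
The slice, dice, and roll-up operations can be sketched on a tiny cube stored as a Python dictionary. This is an illustrative sketch, not how an OLAP engine stores data: the labels bolt, 2005, and Central are borrowed from the example query in this deck, while the quantities, the other labels, and the helper names (dice, slice_, roll_up) are assumptions. Pivot is not shown because it only changes how a report is laid out.

```python
from collections import defaultdict

# A tiny cube: (year, branch, product) -> units sold. All numbers are made up.
DIMS = ("year", "branch", "product")
cube = {
    (2004, "Central", "bolt"): 30,
    (2005, "Central", "bolt"): 40,
    (2005, "Central", "nut"): 25,
    (2005, "North", "bolt"): 15,
    (2005, "North", "nut"): 10,
}

def dice(cube, **allowed):
    """Dice: keep cells whose coordinates fall inside the allowed value sets."""
    return {
        key: value for key, value in cube.items()
        if all(key[DIMS.index(dim)] in values for dim, values in allowed.items())
    }

def slice_(cube, dim, value):
    """Slice: fix a single value on one dimension (keys keep all coordinates)."""
    return dice(cube, **{dim: {value}})

def roll_up(cube, *keep):
    """Roll up: sum the measure over every dimension not listed in keep."""
    totals = defaultdict(int)
    for key, value in cube.items():
        totals[tuple(key[DIMS.index(dim)] for dim in keep)] += value
    return dict(totals)

year_2005 = slice_(cube, "year", 2005)
central_bolts_2005 = dice(cube, year={2005}, branch={"Central"}, product={"bolt"})
by_branch = roll_up(year_2005, "branch")               # summarized view
by_branch_product = roll_up(year_2005, "branch", "product")  # drill down one level
print(central_bolts_2005, by_branch)
```

Keeping fewer dimensions in roll_up moves toward the summary; keeping more dimensions is the corresponding drill-down.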
CUBE DATABASE – ALLOWING FOR SLICE AND DICE

[Figure: cube diagram. Slice the year 2005, then dice down to the location and dice down to the product.]
Example query: How many bolts were sold in the year 2005 by the Central branch?

ADVANCED BUSINESS ANALYTICS
• While OLAP concentrates on reporting and queries, a more sophisticated way of analyzing data and information is needed.
• Users today want to perform statistical and mathematical analysis such as hypothesis testing and prediction.
• Such investigation cannot be done with basic OLAP and requires special tools, including data mining and predictive analysis – hence, advanced business analytics.
ADVANCED BUSINESS ANALYTICS

• A major step in managerial decision making is forecasting or estimating the results of different alternative courses of action.
• Two methods that can be used for advanced business analytics are:
- Data mining
- Predictive analysis
ADVANCED BUSINESS ANALYTICS

• Data mining
– Tools that automatically search for and extract hidden patterns in large transaction databases, e.g. the purchase pattern of a certain consumer

• Predictive analysis
– Tools that help determine the probable future outcome of an event or the likelihood of a situation occurring
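
As a rough illustration of both ideas, the sketch below counts which items are bought together (a very simple form of pattern discovery) and then uses those counts as a naive estimate of how likely a purchase is. The transactions are invented, and real data mining and predictive tools use far more elaborate methods than this frequency count.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transactions: each set is the items in one purchase.
transactions = [
    {"bolt", "nut", "washer"},
    {"bolt", "nut"},
    {"bolt", "wrench"},
    {"bolt", "nut", "wrench"},
]

# Pattern discovery: count how often each pair of items appears together.
pair_counts = Counter()
for basket in transactions:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

most_common_pair, pair_count = pair_counts.most_common(1)[0]

# Naive prediction: the share of bolt purchases that also included a nut,
# used as an estimate of how likely the next bolt buyer is to buy a nut.
bolt_baskets = sum(1 for basket in transactions if "bolt" in basket)
likelihood_nut_given_bolt = pair_counts[("bolt", "nut")] / bolt_baskets

print(most_common_pair, pair_count, likelihood_nut_given_bolt)
```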
