Professional Documents
Culture Documents
UNIT V
Introduction to Data warehouse, Data warehouse architecture, Creating and
maintaining a warehouse, Multidimensional data model, OLAP and data cubes,
Operations on cubes, pre-processing, Analysis of Data pre-processing.
Textbooks:
1. Korth, Sudarshan,Silberschatz, “Database System Concepts”, MacGraw Hill Publication, 2013.
2. Elmasari , Navathe, “Fundamentals of Database Systems”, Pearson, 2013.
3. Thomas Connolly, Carolyn Begg, “Database Systems: A Practical Approach to Design, Implementation
&
Management”, Pearson, 2013.
4. Michale Gertz, Sushil Jajodia, “Handbook of Database Security, Applications and Trends”, Springer,
2008.
Dimension tables
Time Info
...
Dimensional Modeling
Star Schema
Dimension tables
Fact table(s)
• Makes multi-dimensional database
functionality possible using a traditional RDB
Dimensional modeling is a technique for
structuring data around business concepts
ER models describe “entities” and “relationships”
Dimensional models describe “measures” and
“dimensions”
Dimension Tables
Dimension tables have the following
characteristics:
– Hold descriptive information about a
particular business perspective
– Contain relatively static data
– Are joined to a fact table through a
foreign key reference
Fact Tables
Fact tables have the following characteristics:
– Contain numeric metrics of the business
– Can contain one or more calculated measures
– Records are updated not deleted.
– May contain summarized (aggregated) data
– Often contain date-stamped data
– Key value is a composite key made up of the primary
keys of the dimensions
– Joined to dimension tables through foreign keys that
reference primary keys in the dimension tables
Star Schema (in RDBMS)
Star Schema Example
Star Schema
with Sample
Data
Example (1)
insert into products values (567, 'Colgate Gel Pump 6.4 oz.', 1, 68);
insert into products values (219, 'Diet Coke 12 oz. can', 2, 5);
insert into stores values (16, 34, '510 Main Street', '415-555-1212');
insert into stores values (17, 58, '13 Maple Avenue', '914-555-1212');
insert into sales values (567, 17, 1, to_date('1997-10-22 09:35:14', 'YYYY-MM-DD HH24:MI:SS'));
insert into sales values (219, 16, 4, to_date('1997-10-22 09:35:14', 'YYYY-MM-DD HH24:MI:SS'));
insert into sales values (219, 17, 1, to_date('1997-10-22 09:35:17', 'YYYY-MM-DD HH24:MI:SS'));
• SQL extensions
• Extended family of aggregate functions
– Rank (e.g., top 10 customers), Percentile (e.g.,
top 30% of customers), Median, Mode, etc.
– Object Relational System allows addition of new
aggregate functions
• Reporting features
– Running total, Cumulative total
• Cube operator
– Group by on all subsets of a set of attributes, etc.
Other Issues
OLAP Server
other
sources
Analysis
Query
Operational Extract Reports
DBs Transform Data Data mining
Load Warehouse
Refresh Serve
t
TV
uc
od PC U.S.A
Pr
VCR
sum
Country
Canada
Mexico
sum
Browsing a Data Cube
• Visualization
• OLAP capabilities
• Interactive manipulation
Typical OLAP Operations
Roll up (drill-up): summarize data
by climbing up hierarchy or by dimension reduction
Drill down (roll down): reverse of roll-up
from higher level summary to lower level summary or
detailed data, or introducing new dimensions
Slice and dice: project and select
Pivot (rotate):
reorient the cube, visualization, 3D to series of 2D planes
Other operations
drill across: involving (across) more than one fact table
drill through: through the bottom level of the cube to its
back-end relational tables (using SQL)
Fig. Typical OLAP
Operations
OLAP: Data Cube Operations
Slicing:
Selecting the dimensions of the cube to be viewed.
Example: View “Sales volume” as a function of “Product ” by “Country
“by “Quarter”
Dicing:
Specifying the values along one or more dimensions.
Example: View “Sales volume” for “Product=PC” by “Country “by
“Quarter”
OLAP: Data Cube Operations