PART 3
records
◼ Data cleaning and data integration techniques are
applied.
◼ Ensure consistency in naming conventions, encoding
3-D cuboids: time,item,location; time,item,supplier; time,location,supplier; item,location,supplier
4-D(base) cuboid: time, item, location, supplier
Fact table keys: branch_key, location_key
Measures: units_sold, dollars_sold, avg_sales
branch: branch_key, branch_name, branch_type
location: location_key, street, city_key
city: city_key, city, state_or_province, country
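As a minimal sketch of how the keys listed above chain together, the following Python snippet follows location_key to a location row and then city_key to a city row. All row values are invented for illustration; only the table and column names come from the schema.

```python
# Hypothetical rows; only the table and column names come from the schema.
city = {
    "c1": {"city": "Vancouver", "state_or_province": "BC", "country": "Canada"},
}
location = {
    "l1": {"street": "1 Main St", "city_key": "c1"},
}
branch = {
    "b1": {"branch_name": "Downtown", "branch_type": "retail"},
}
sales_fact = [
    {"branch_key": "b1", "location_key": "l1",
     "units_sold": 3, "dollars_sold": 1200.0},
]


def country_of(fact_row):
    """Resolve a fact row to its country through two key lookups."""
    loc = location[fact_row["location_key"]]   # fact -> location
    return city[loc["city_key"]]["country"]    # location -> city


print(country_of(sales_fact[0]))
```

Because city is kept in its own table, reaching country from a fact row takes two lookups instead of one.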
<dimension_name_first_time> in cube
<cube_name_first_time>
Specification of hierarchies
◼ Schema hierarchy
day < {month < quarter; week} < year
◼ Set_grouping hierarchy
{1..10} < inexpensive
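The two kinds of hierarchy can be sketched in Python as follows. This is an illustration under stated assumptions, not the slide's own notation: the function names, the use of ISO week numbers, and the "ungrouped" fallback label are all hypothetical. Only the orderings day < {month < quarter; week} < year and {1..10} < inexpensive come from the text.

```python
from datetime import date

# Schema hierarchy: day < {month < quarter; week} < year.
# Each function maps a day to its ancestor at one level.


def month_of(d: date):
    return (d.year, d.month)


def quarter_of(d: date):
    return (d.year, (d.month - 1) // 3 + 1)


def week_of(d: date):
    iso = d.isocalendar()
    return (iso[0], iso[1])  # ISO year and ISO week number (an assumption)


def year_of(d: date):
    return d.year


# Set-grouping hierarchy: {1..10} < inexpensive.
# Only this one group is given in the text.
PRICE_GROUPS = [
    (range(1, 11), "inexpensive"),
]


def price_group(price: int) -> str:
    for values, label in PRICE_GROUPS:
        if price in values:
            return label
    return "ungrouped"  # hypothetical fallback, not from the text


d = date(2023, 1, 2)
print(month_of(d), quarter_of(d), week_of(d), year_of(d))
print(price_group(7))
```

Weeks do not nest inside months, which is why week sits on its own branch between day and year rather than under month.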
January 2, 2023 Prof. Ahmed Sultan Al-Hegami 32
A Sample Data Cube
[Figure: data cube with dimensions Date (1Qtr, 2Qtr, 3Qtr, 4Qtr, sum), Product (TV, PC, VCR, sum), and Country (U.S.A, Canada, Mexico, sum). Highlighted cell: total annual sales of TV in U.S.A.]
0-D(apex) cuboid: all
1-D cuboids: product; date; country
3-D(base) cuboid: product, date, country
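The lattice of cuboids for the three dimensions product, date, and country can be enumerated with a short Python sketch. The function name cuboids is hypothetical, and the sketch assumes no concept hierarchies on the dimensions, so each k-D cuboid is simply one choice of k dimensions to group by.

```python
from itertools import combinations


def cuboids(dimensions):
    """Return {k: list of k-dimensional group-bys} for k = 0..n."""
    return {
        k: list(combinations(dimensions, k))
        for k in range(len(dimensions) + 1)
    }


lattice = cuboids(["product", "date", "country"])
for k, groups in lattice.items():
    print(f"{k}-D:", groups)

# With n dimensions and no hierarchies there are 2**n cuboids in total.
total = sum(len(groups) for groups in lattice.values())
print("total cuboids:", total)
```

The empty tuple at level 0 corresponds to the apex cuboid (all), and the single tuple at level 3 is the base cuboid.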
◼ Visualization
◼ OLAP capabilities
◼ Interactive manipulation
Typical OLAP Operations
◼ Roll up (drill-up): summarize data
◼ by climbing up hierarchy or by dimension reduction
◼ Drill down (roll down): reverse of roll-up
◼ from higher level summary to lower level summary or
detailed data, or introducing new dimensions
◼ Slice and dice: project and select
◼ Pivot (rotate):
◼ reorient the cube, visualization, 3D to series of 2D planes
◼ Other operations
◼ drill across: involving (across) more than one fact table
◼ drill through: through the bottom level of the cube to its
back-end relational tables (using SQL)
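The operations above can be sketched on a tiny in-memory fact list. This is a minimal illustration, not how an OLAP server implements them: the fact rows and numbers are invented, and the helper names roll_up, slice_, dice, and pivot are hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, quarter, country, units_sold)
FACTS = [
    ("TV", "1Qtr", "U.S.A", 10),
    ("TV", "2Qtr", "U.S.A", 20),
    ("TV", "1Qtr", "Canada", 5),
    ("PC", "1Qtr", "U.S.A", 7),
    ("PC", "2Qtr", "Mexico", 3),
    ("VCR", "1Qtr", "Canada", 4),
]
DIMS = ("product", "quarter", "country")


def roll_up(facts, keep):
    """Roll up by dimension reduction: sum over every dimension not in keep."""
    idx = [DIMS.index(d) for d in keep]
    out = defaultdict(int)
    for row in facts:
        out[tuple(row[i] for i in idx)] += row[3]
    return dict(out)


def slice_(facts, dim, value):
    """Slice: select on a single dimension."""
    i = DIMS.index(dim)
    return [r for r in facts if r[i] == value]


def dice(facts, **allowed):
    """Dice: select on two or more dimensions at once."""
    return [
        r for r in facts
        if all(r[DIMS.index(d)] in vals for d, vals in allowed.items())
    ]


def pivot(table2d):
    """Pivot: swap the two axes of a 2-D view keyed by (row, column)."""
    return {(c, r): v for (r, c), v in table2d.items()}


by_product_country = roll_up(FACTS, ("product", "country"))
by_country = roll_up(FACTS, ("country",))          # one more roll-up step
first_quarter = slice_(FACTS, "quarter", "1Qtr")
tv_pc_usa = dice(FACTS, product={"TV", "PC"}, country={"U.S.A"})
rotated = pivot(by_product_country)
print(by_country)
```

Drill-down is the reverse direction: going from by_country back to by_product_country requires the more detailed data, which is why the base facts are kept.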
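[Figure: dimensions drawn as lines with abstraction levels marked on them. Time: ANNUALLY, QTRLY, DAILY. Product: PRODUCT ITEM, PRODUCT GROUP, PRODUCT LINE. Location: CITY, COUNTRY, REGION. Organization: SALES PERSON, DISTRICT, DIVISION. Promotion. Other labels: ORDER, TRUCK. Each circle is called a footprint.]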
Data Warehousing and OLAP Technology
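[Figure: data flows from Operational DBs and other sources through Extract, Transform, Load, Refresh (with Monitor & Integrator and Metadata) into the Data Warehouse and Data Marts, which serve the OLAP Server. Front-end outputs: Analysis, Query, Reports, Data mining.]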
materialized
Data Warehouse Development: A Recommended Approach
[Figure: Multi-Tier Data Warehouse; Distributed Data Marts; Enterprise Data Warehouse; Data Mart, Data Mart.]
◼ Business data
◼ business terms and definitions, ownership of data, charging policies
OLAP Server Architectures (readings)
threshold
◼ Avoid explosive growth of the cube
and product
◼ A join index on city maintains for each
warehouses
◼ ODBC, OLEDB, Web accessing, service facilities, reporting and
OLAP tools
◼ OLAP-based exploratory data analysis
◼ Open Database Connectivity (ODBC) is a C-language application programming interface (API) standard from Microsoft for connecting to a server, sending SQL requests, and receiving results.
◼ JDBC is the analogous standard for Java.
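The connect, send SQL, receive results pattern described above can be sketched with Python's built-in sqlite3 module. This is a stand-in chosen so the example runs without a server; it is not ODBC or JDBC. The sales table and its rows are invented for illustration.

```python
import sqlite3

# 1. Connect (an in-memory database stands in for a warehouse server).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table and rows, created only so the query has data.
cur.execute("CREATE TABLE sales (item TEXT, city TEXT, dollars_sold REAL)")
cur.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("TV", "Vancouver", 400.0),
        ("PC", "Vancouver", 900.0),
        ("TV", "Toronto", 350.0),
    ],
)

# 2. Send an SQL request.
cur.execute(
    "SELECT city, SUM(dollars_sold) FROM sales GROUP BY city ORDER BY city"
)

# 3. Receive the results.
rows = cur.fetchall()
conn.close()

for city, total in rows:
    print(city, total)
```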
An OLAM System Architecture
[Figure: layered architecture. Layer4, User Interface: mining query in, mining result out. Layer3, OLAP/OLAM: OLAM Engine and OLAP Engine behind the User GUI API. Layer2, MDDB: MDDB and Meta Data.]
◼ Summary
Summary: Data Warehouse and OLAP Technology