Professional Documents
Culture Documents
神DataWarehousing
神DataWarehousing
Data Warehousing
88423007
88423010
88423025
88423031
Overview
12
Dimensional
Modeling
15
5
33
SQL Server
46
58
Database
Archive
Data martData warehouse
Multi-dimensional
(OLAP)(EIS)
Overview
Gordon Moore
40MB 20GB
LinuxFreeBSD
Internet Intranet
SAPBaanPeopleSoft Oracle
OLTP
(OLTP)
OLTP
OLTP
OLTP
OLTP
OLTP
OLTP
(
)
Metadata(
) metadata
metadata
metadata
metadata OLTP
RDBMS
(OLAP)
1996 IDC ()
220 2.3 90
40() 50
160 400
TeraBytes(TB)
PetaByte(1PB=1024TB)
OLTP
OLTP
OLTP
OLTP
Operational Systems
(1)
(2)
dynamic
(3)
inactive order
data
warehouse model
10
(1)
(2)
(3)
(4) table
table table
key
join table
11
<1>
<2><3>
4.
query report
query
summary views query
12
13
3-1
3.1) / Operational Data Base / External Data
Base Layer
14
(information superhighway)
ExcelLotus 1-2-3FocusAccessSAS
SQL
/
DBMSs
metadataMetadata
metadata
15
middleware
table client/server
main frames
2.9) Data Staging Layer
data staging layer
(copy management)(replication management)
16
Dimensional Modeling
Dimensional modeling
end-user
ERDEntity-Relationship Modeling
relation
instances
tables
table
ERD
4-1
ERD 22 entities
tables tables
ERD
entities ERP SAP entities
entities implement
17
dimensional
natural
dimensional
18
Dimensional modeling
dimensional
relational model
dimensional model multipart key table
fact table tables dimension tables
dimension table fact table multipart
key 4-2
4-2
star joindimensional
modeling ERD 1960 General Mills
and Dartmouth University facts dimensions
19
4-1 4-2
star join star join
dimensional modeling entity-relationship
entity-relationship diagram
multiple fact table diagrams
ERD ERD Sales CallsOrder EntryShipment
Invoices Customer Payments Product Returns
diagram ERD
diagram processes
ERD dimensional modeling diagrams
ERD model
ERD additive facts
20
ERD dimensional
model
dimensional model report
writers query tools interfaces dimensional
model
dimension table bit
vector indexes dimension attributesMetadata
21
dimensional model
tables
table SQL ALTER TABLE
applications
4-2 1 4
2. dimensions dimension
fact
3. dimensional attributes
4. dimension records certain point
granularity level
modeling
report writersquery tools
dimensions
constant dimension
dimensional modeling dimensions
dimensions
facts
facts Pay-in-advance
22
implementation
fact
dimension tables dimensional model
dimensional
defend
table
23
data mart
data mart
implementation
legacy system
24
conformed dimensions
conformed dimensions customer dimension
product dimension
time dimension conformed
dimensions legacy
25
units
fact tables fact
dimensional drill-across
report
dimension table
overhead fact
fact manufacturers table manufacturers table view
conform fact
Conformed dimensions
data mart base level
fact tables dimensions
Granular fact table data power Data mart
Gracefully extended
dimensional modeling approach facts
dimension attribute dimensions
queries applications fact table drop
reload key
data mart
user data mart
data mart data mart
roll up
roll up
data mart
26
Activity-Based Profitability multiple source data mart
single-source data mart data
mart
dimensional modeling
27
Facts
Fact
Fact table
Fact
Fact
Attributes
fact attributes
Dimensions
28
Dimensions table
Fact
.
Fact
Dimension Fact
Fact
Dimension hierarchies
Primary Key Dimension Key
Star
Dimension Fact Dimension
drilling down
drilling down drilling
dimensional model dimension table attribute
attribute application
dimension table attribute
dimension table attribute
29
attribute
Roll up row header
join query high level low level bringing back
Permissible Snowflaking
browse
DB
bitmap indexes
dimension attribute
dimension table dimension
attribute browse
30
verbose ()
descriptive ()
complete ()
quality assured ()
indexed ( B-tree bitmap )
documented ( attribute )
fact table
fact table 4-3
4-3
31
1.organization name
2.post name
3.post name abbreviation
4.role
5.street name
6.street type
... ...
... ...
... ...
Degenerate Dimensions
fact table
line item fact
table dimension table
dimension
order line
item table dimension order
line-item fact table order number order
number attribute degenerate dimension
Junk Dimensions
(flag) text
attribute
32
flag and attribute flag and attribute
junk dimensions
smart keys
attribute dimension primary key join
fact table
dimensional schema
1.the
2.the
3.the
4.the
data mart
fact table grain fact table
dimensions
facts fact
Step1.
33
fact record
fact record
ATM fact record
fact record
fact record
fact record
Step 3. dimensions
fact table grain
grain
Step 4. fact
fact fact table grainfact table grain
fact fact table fact
34
DSS
DSS
DSS
SQL
SQL
Star Schema
Star Schema
Star Schema table
join path
Schema
1. Star Schema
data modeling Star Schema
Star Schema
Star Schema
35
metadata
Star Schema
2. Star Schema
Star Schema table Fact tables Dimension
tables Fact tables major tables,
Dimension tables minor tables
SQL fact table dimension
table join paths
fact table
Dimension table
fact table
schema dimension table
Star Schema
36
5-1
Sales Table
37
fact
table fact table
Fact table
fact table
5-4 fact table
schema
38
5-4
5-5
14,000 120
208
fact table Sales dimension
tableProductPeriod Market 5-5 Sales fact table
ProductPeriod Market
dimension table Sales table
39
dimension table
dimension table
dimension table
dimension table
product table 1400
Dimension table
drilling down
40
(perspective)
drill down
Drilling
down
star schema dimension table
5-6
drill-down roll-up
5-6
star schema
41
AGGREGATION
Aggregation
CPU cycle
5-7
5-7
42
5-8 fact
table 10,000,000
9,000,000
1,000,000
(sparse
aggregation)
5-8
GB
5-9
Star schema
star schema
star schema
43
star join
fact table
star schema
5-10 multi-star schema
Frequent_StayersActuals Bookings table
44
5-10
1.
45
5-11
2.
46
5-11
schema aggregation
47
SQL Server
TOYOTA OLTP
Models
Model
Tercel
Premio
Corolla
Camry
Solemio
Zace
Avalon
Lexus
Price
450000.0000
600000.0000
700000.0000
1000000.0000
750000.0000
500000.0000
1500000.0000
2000000.0000
Import
Retailers
Retailer
Region
Taipei
Chungli
Toayuan
Tainan
North
North
North
South
48
Kaoshiung
Taidong
South
South
Sales
OrderNo
Retailer
Model
SalesDate
1
2
3
4
5
6
7
8
9
10
:
:
49871
49872
49873
49874
49875
49876
49877
Taipei
Kaoshiung
Tainan
Chungli
Taidong
Taoyuan
Tainan
Chungli
Taoyuan
Taidong
:
:
Tainan
Taipei
Taipei
Taoyuan
Taipei
Kaoshiung
Taipei
Tercel
Solemio
Lexus
Premio
Tercel
Zace
Solemio
Solemio
Avalon
Premio
:
:
Avalon
Solemio
Premio
Avalon
Avalon
Premio
Zace
1999-12-25
1998-11-28
1998-11-15
1998-05-19
1999-02-28
1998-04-19
1999-02-23
1999-01-17
1998-04-02
1999-05-15
:
:
1998-07-23
1999-12-08
1999-12-26
1999-08-02
1999-03-18
1998-07-27
1999-03-06
Qty
1
2
1
2
2
1
1
1
1
1
:
:
1
1
2
2
2
2
2
Decision Support
OLTP
SQL Server
49
Dimensional Model
Star SchemaFact tableDimension table
1 Dimensional Model
(1)
Model
Time
Retailer
(2)
/
Model
Region
Retailer
Year
Month
Fact Table
Dimension tables
Dimension ModelRetailerTime
Retail_Dimension
Ret
ai
Sales_Fact
sYear
sMonth
Time_Dimension
Retailer
Model_Dimension
Mo
d
Model
Star Schema
Retail_Dimension
CREATE TABLE Retail_Dimension
( Retailer varchar(12) PRIMARY KEY,
Region varchar(12)
)
51
sYear
sMonth
INSERT Retail_Dimension
SELECT Retailer, Region FROM Retailers
Model_Dimension
CREATE TABLE Model_Dimension
( Model varchar(12) PRIMARY KEY,
Import varchar(8)
)
INSERT Model_Dimension
SELECT Model, Import FROM Models
Time_Dimension 2
CREATE TABLE Time_Dimension
( sYear int,
sMonth int,
CONSTRAINT pk_Y_M PRIMARY KEY ( sYear, sMonth )
)
INSERT Time_Dimension
SELECT DISTINCT
YEAR( SalesDate ), MONTH( SalesDate )
FROM Sales
(2) Fact Table
Sales_Fact
CREATE TABLE Sales_Fact
( sYear int,
52
sMonth int,
Retailer varchar(12),
Model varchar(12),
Quantity int,
TotalPrice money,
CONSTRAINT pk_sY_sM_R_D
PRIMARY KEY ( sYear, sMonth, Retailer, Model )
)
INSERT Sales_Fact
SELECT Retailer, s.model, YEAR( SalesDate ),MONTH( SalesDate ),
SUM( Qty ), SUM( Qty * m.price ) FROM Sales s JOIN Models m
ON s.model = m.model
GROUP BY Retailer, s.model, YEAR(SalesDate), MONTH(SalesDate)
1 Microsoft Transact-SQL
2 Time_Dimension
Fact Table Time_Dimension
53
54
TOYOTA
2000
2000
Sales_Fact
SQL
INSERT Sales_Fact
SELECT YEAR( SalesDate ), MONTH( SalesDate ),
Retailer, s.model, SUM( Qty ), SUM( Qty * m.price )
FROM Sales s JOIN Models m
ON s.model = m.model
WHERE
MONTH(SalesDate) = MONTH(DATEADD(mm,-1,GETDATE()))
AND YEAR(SalesDate) = YEAR(DATEADD(mm,-1,GETDATE()))
GROUP BY YEAR(SalesDate), MONTH(SalesDate), Retailer, s.model
55
Connection
Connection Properties
(1) Connection Name(2) Data Source(3) Server(4) Database
56
57
DTS Package
Schedule Package
58
59
60