
UNIVERSITY OF ECONOMICS HO CHI MINH CITY

FACULTY OF BUSINESS INFORMATION SYSTEMS

Course lectures
SYSTEMS INTEGRATION

LESSON 4: DATA WAREHOUSE

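Objectives

After completing this lesson, students will be able to:
- Understand the concept of a data warehouse and the characteristics of the data warehouse model
- Describe the multidimensional data integration models
- Explain data warehouse architecture
- Apply the methods of analysis and mining on a data warehouse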

References

- Paulraj Ponniah, Data Warehousing, 2001
- W. H. Inmon, Building the Data Warehouse (Third Edition), 2002

Contents

- Data warehouse concepts
- Multidimensional data model
- Data warehouse architecture

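Data warehouse concepts
A data warehouse is defined as:
- A decision-support database that is maintained separately from the organization's operational database.
- A platform that supports information processing by providing consolidated data for analysis.
- A subject-oriented, integrated, time-variant, and non-volatile collection of data that supports the management decision-making process.
Four characteristics: subject-oriented, integrated, time-variant, and non-volatile.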

Data warehouse: concepts
A data warehouse:
- Provides an integrated, overall view of the enterprise
- Makes the enterprise's current and historical information readily available for decision making
- Makes decision-support queries possible without hindering the operational systems
- Provides consistent enterprise information

Data warehouse architecture
[Figure: data warehouse architecture]

Data warehousing

The process of building and using a data warehouse

Data warehouse: subject-oriented

- Organized around major subjects, such as customers, products, and sales.
- Focuses on modeling and analyzing data for decision making.
- Provides a simple, concise view of particular subject issues in the decision-making process.

Data warehouse: subject-oriented

[Figure: operational applications compared with data warehouse subjects]

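Data warehouse: integrated
- A data warehouse is built by integrating multiple heterogeneous data sources:
  - Relational databases, flat-file databases (data encoded in special formats such as .txt or .ini), and online transaction records
- Data cleaning and data integration techniques are applied.
- Consistency is ensured in naming conventions, encoding structures, attribute measures, and so on, across the different data sources.
  - Example, hotel prices: currency, tax, whether breakfast is included
- Data is converted when it is moved into the data warehouse.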
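
The hotel-price example can be sketched in code. The following is a minimal illustration, not a prescribed method: the exchange rates, tax rate, breakfast price, and the `HotelPrice` and `to_warehouse_price` names are all hypothetical, and the chosen warehouse convention (USD, tax included, breakfast excluded) is an assumption made only for the example.

```python
from dataclasses import dataclass

# Hypothetical conversion parameters, chosen only for illustration.
RATES_TO_USD = {"USD": 1.0, "EUR": 1.10}
TAX_RATE = 0.10
BREAKFAST_USD = 8.0  # assumed flat, tax-inclusive breakfast price


@dataclass
class HotelPrice:
    """A price as reported by one source system."""
    amount: float
    currency: str
    tax_included: bool
    breakfast_included: bool


def to_warehouse_price(p: HotelPrice) -> float:
    """Convert a source price to the assumed warehouse convention:
    USD, tax included, breakfast excluded."""
    usd = p.amount * RATES_TO_USD[p.currency]
    if not p.tax_included:
        usd *= 1 + TAX_RATE       # add tax when the source omitted it
    if p.breakfast_included:
        usd -= BREAKFAST_USD      # strip breakfast so prices are comparable
    return round(usd, 2)


source_a = HotelPrice(100.0, "USD", tax_included=False, breakfast_included=False)
source_b = HotelPrice(100.0, "EUR", tax_included=True, breakfast_included=True)
converted = [to_warehouse_price(source_a), to_warehouse_price(source_b)]
```

Once both sources are expressed in the same convention, their prices can be compared or aggregated directly in the warehouse.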

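Data warehouse: time-variant
- The time horizon of a data warehouse is significantly longer than that of an operational database system.
  - Operational database: current-value data.
  - Data warehouse data: provides information from a historical perspective (for example, the past 5-10 years).
- Every key structure in the data warehouse contains a time element.
  - The key of operational data, by contrast, may or may not contain a time element.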
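
As a rough illustration of the difference, the sketch below keeps the same fact in two stores: an operational one keyed by account only, and a warehouse one whose key always includes a date. The account identifier, amounts, and function names are hypothetical.

```python
from datetime import date

# Operational store: one current value per key (no time element in the key).
operational_balance = {}

# Warehouse store: the key always includes a date, so history accumulates.
warehouse_balance = {}


def record_balance(account, day, amount):
    operational_balance[account] = amount        # overwrites the old value
    warehouse_balance[(account, day)] = amount   # adds a dated snapshot


def balance_as_of(account, day):
    """Latest warehouse snapshot on or before `day`, or None if there is none."""
    dated = [(d, v) for (a, d), v in warehouse_balance.items()
             if a == account and d <= day]
    return max(dated)[1] if dated else None


record_balance("A-1", date(2013, 1, 31), 500)
record_balance("A-1", date(2014, 1, 31), 750)
```

Only the warehouse store can answer a historical question such as `balance_as_of("A-1", date(2013, 6, 1))`; the operational store holds just the latest value.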

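April 26, 2014
Data warehouse: time-variant

Operational data:
- Time horizon: current to 60-90 days
- Records are updated
- Key structure may or may not contain a time element

Data warehouse:
- Time horizon: 5-10 years
- Key structure contains a time element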

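Data warehouse: non-volatile
- Data transferred from the operational environment is stored in a physically separate store.
- Operational updates of data do not occur in the data warehouse environment.
  - No transaction processing, recovery, or concurrency control mechanisms are needed.
  - Only two data access operations are required: loading data and accessing data.
- Source data does not change once it is in the data warehouse.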
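
The two permitted operations can be sketched as a small append-only store. This is an illustrative toy, not how a real warehouse engine is built; the class name `NonVolatileStore` and the sample rows are hypothetical.

```python
class NonVolatileStore:
    """Append-only store: rows can be loaded and read, never updated or deleted."""

    def __init__(self):
        self._rows = []

    def load(self, rows):
        """Bulk-load a batch of rows (stored as tuples so they cannot be mutated)."""
        self._rows.extend(tuple(r) for r in rows)

    def access(self, predicate=lambda row: True):
        """Read-only access: return the matching rows as a new list."""
        return [r for r in self._rows if predicate(r)]


store = NonVolatileStore()
store.load([("2014-04-01", "TV", 3), ("2014-04-02", "PC", 1)])
store.load([("2014-04-03", "TV", 2)])
tv_rows = store.access(lambda r: r[1] == "TV")
```

Because the interface exposes no update or delete, the sketch needs none of the locking or recovery logic an operational store would require.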


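Data warehouse vs. operational DBMS
- OLTP (online transaction processing)
  - The main task of a traditional relational DBMS
  - Day-to-day operations: purchasing, inventory, banking, manufacturing, payroll, registration, accounting, etc.
- OLAP (online analytical processing)
  - The main task of a data warehouse system
  - Data analysis and decision making
- Distinguishing features (OLTP <> OLAP):
  - User and system orientation: customer <> market
  - Data contents: current, detailed <> historical, consolidated
  - Database design: ER + application <> star schema + subject
  - View: current, local <> evolutionary, integrated
  - Access patterns: update <> read-only with complex queries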
OLTP <> OLAP
                     OLTP                               OLAP
Users                Clerks, IT professionals           Knowledge workers
Function             Day-to-day operations              Decision support
Database design      Application-oriented               Subject-oriented
Data                 Current, up to date, detailed,     Historical, summarized, integrated,
                     flat relational, isolated          multidimensional, consolidated
Usage                Repetitive                         Ad hoc
Access               Read/write; index or hash          Many scans
                     on primary key
Unit of work         Short, simple transaction          Complex query
# records accessed   Tens                               Millions
# users              Thousands                          Hundreds
Database size        100 MB to GB                       100 GB to TB
Metric               Transaction throughput             Query throughput, response time
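
To make the "unit of work" row concrete, here is a small sketch using Python's standard `sqlite3` module. The `orders` table and its contents are hypothetical; a single in-memory database stands in for both kinds of system only to contrast the two access patterns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, year INTEGER, amount REAL)"
)
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", [
    (1, "North", 2013, 100.0), (2, "South", 2013, 250.0),
    (3, "North", 2014, 300.0), (4, "South", 2014, 50.0),
])

# OLTP-style unit of work: a short read/write transaction on one record,
# located through the primary key.
with conn:
    conn.execute("UPDATE orders SET amount = ? WHERE order_id = ?", (120.0, 1))
one_order = conn.execute("SELECT amount FROM orders WHERE order_id = 1").fetchone()

# OLAP-style unit of work: a read-only query that scans and summarizes many records.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()
```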

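Why a separate data warehouse?

- High performance for both systems
  - DBMS tuned for OLTP: access methods, indexing, concurrency control, recovery
  - Warehouse tuned for OLAP: complex OLAP queries, multidimensional views, consolidation
- Different functions and different data:
  - Missing data: decision support requires historical data, which operational databases typically do not maintain
  - Data consolidation: decision support requires consolidation (aggregation, summarization) of data from heterogeneous sources
  - Data quality: different sources use inconsistent data representations, codes, and formats, which have to be reconciled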
Multidimensional data model

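Conceptual model of the data warehouse
- Data warehouse modeling: dimensions and measures
- Star schema: a central fact table connected to a set of dimension tables.
- Snowflake schema: a refinement of the star schema in which some dimension structures are normalized into a set of smaller dimension tables, forming a shape similar to a snowflake.
- Fact constellation schema: multiple fact tables share dimension tables. The result can be viewed as a collection of stars, so it is also called a galaxy schema or fact constellation.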
Example of a star schema

Sales Fact Table
  Keys: time_key, item_key, branch_key, location_key
  Measures: units_sold, dollars_sold, avg_sales

Dimension tables
  time: time_key, day, day_of_the_week, month, quarter, year
  item: item_key, item_name, brand, type, supplier_type
  branch: branch_key, branch_name, branch_type
  location: location_key, street, city, state_or_province, country
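
The star schema above can be tried out with Python's standard `sqlite3` module. The sketch below is a cut-down version under stated assumptions: it keeps only three of the four dimensions and a few columns from each, and all inserted rows are made-up sample data.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE time     (time_key INTEGER PRIMARY KEY, quarter TEXT, year INTEGER);
CREATE TABLE item     (item_key INTEGER PRIMARY KEY, item_name TEXT, brand TEXT);
CREATE TABLE location (location_key INTEGER PRIMARY KEY, city TEXT, country TEXT);
-- Central fact table: one foreign key per dimension plus the measures.
CREATE TABLE sales (
    time_key     INTEGER REFERENCES time(time_key),
    item_key     INTEGER REFERENCES item(item_key),
    location_key INTEGER REFERENCES location(location_key),
    units_sold   INTEGER,
    dollars_sold REAL
);
INSERT INTO time VALUES (1, 'Q1', 2014), (2, 'Q2', 2014);
INSERT INTO item VALUES (1, 'TV', 'BrandA'), (2, 'PC', 'BrandB');
INSERT INTO location VALUES (1, 'Hanoi', 'Vietnam'), (2, 'Toronto', 'Canada');
INSERT INTO sales VALUES (1, 1, 1, 5, 2500.0), (1, 2, 2, 2, 1800.0), (2, 1, 1, 3, 1500.0);
""")

# A typical star join: the fact table joined to two dimensions, then aggregated.
rows = db.execute("""
    SELECT i.item_name, t.quarter, SUM(s.dollars_sold)
    FROM sales s
    JOIN item i ON i.item_key = s.item_key
    JOIN time t ON t.time_key = s.time_key
    GROUP BY i.item_name, t.quarter
    ORDER BY i.item_name, t.quarter
""").fetchall()
```

Every dimension is exactly one join away from the fact table, which is what gives the schema its star shape.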

Example of a snowflake schema

Sales Fact Table
  Keys: time_key, item_key, branch_key, location_key
  Measures: units_sold, dollars_sold, avg_sales

Dimension tables
  time: time_key, day, day_of_the_week, month, quarter, year
  item: item_key, item_name, brand, type, supplier_key
    supplier: supplier_key, supplier_type
  branch: branch_key, branch_name, branch_type
  location: location_key, street, city_key
    city: city_key, city, state_or_province, country
Example of a fact constellation

Sales Fact Table
  Keys: time_key, item_key, branch_key, location_key
  Measures: units_sold, dollars_sold, avg_sales

Shipping Fact Table
  Keys: time_key, item_key, shipper_key, from_location, to_location
  Measures: dollars_cost, units_shipped

Dimension tables
  time: time_key, day, day_of_the_week, month, quarter, year
  item: item_key, item_name, brand, type, supplier_type
  branch: branch_key, branch_name, branch_type
  location: location_key, street, city, province_or_state, country
  shipper: shipper_key, shipper_name, location_key, shipper_type
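Measures: three categories

- Distributive: the result obtained by applying the function to n aggregate values is the same as the result obtained by applying the function to all the data without partitioning.
  - For example: count(), sum(), min(), max().
- Algebraic: the measure can be computed by an algebraic function with M arguments (where M is a bounded integer), each of which is obtained by applying a distributive aggregate function.
  - For example: avg(), min_N(), standard_deviation().
- Holistic: there is no constant bound on the storage size needed to describe a sub-aggregate.
  - For example: median(), mode(), rank().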
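
The three categories can be checked on a small partitioned data set. The numbers below are arbitrary sample values; the sketch uses only Python's standard `statistics` module.

```python
from statistics import median

partitions = [[4, 8, 15], [16, 23], [42]]   # hypothetical partitioned data
all_values = [v for part in partitions for v in part]

# Distributive: the sum of the partial sums equals the sum over all the data.
total = sum(sum(part) for part in partitions)

# Algebraic: avg is computed from two distributive aggregates, sum and count.
count = sum(len(part) for part in partitions)
average = total / count

# Holistic: combining partial medians does not, in general, give the true median.
median_of_medians = median(median(part) for part in partitions)
true_median = median(all_values)
```

Here `total` matches `sum(all_values)` and `average` needs only two stored numbers per partition, whereas `median_of_medians` and `true_median` differ, so the median has to be computed from the full data.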

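Multidimensional data

- Sales volume is a function of product, month, and region
- Dimensions: Product, Location, Time
- Hierarchical summarization paths:
  - Product: Industry > Category > Product
  - Location: Region > Country > City > Office
  - Time: Year > Quarter > Month or Week > Day

[Figure: cube with axes Product, Region, and Month]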
A sample data cube

[Figure: three-dimensional sales cube]
- Date axis: 1Qtr, 2Qtr, 3Qtr, 4Qtr, sum
- Product axis: TV, PC, VCR, sum
- Country axis: U.S.A., Canada, Mexico, sum
- Highlighted cell: total annual sales of TV in U.S.A.
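
The "sum" entries on each axis are margin cells holding totals. A minimal sketch of how such a cube could be filled is shown below; the fact records and sales figures are invented, and the dictionary-based cube is just one simple representation, not the storage format of any particular OLAP product.

```python
from collections import defaultdict
from itertools import product

ALL = "sum"  # label used for the aggregated margin on each axis

# Hypothetical fact records: (quarter, product, country, sales).
facts = [
    ("1Qtr", "TV", "U.S.A.", 10), ("2Qtr", "TV", "U.S.A.", 20),
    ("3Qtr", "TV", "U.S.A.", 30), ("4Qtr", "TV", "U.S.A.", 40),
    ("1Qtr", "PC", "Canada", 5),  ("2Qtr", "VCR", "Mexico", 7),
]

cube = defaultdict(int)
for quarter, prod, country, sales in facts:
    # Each fact contributes to its own cell and to every margin cell covering it
    # (2 x 2 x 2 = 8 cells per fact).
    for cell in product((quarter, ALL), (prod, ALL), (country, ALL)):
        cube[cell] += sales

tv_usa_annual = cube[(ALL, "TV", "U.S.A.")]   # total annual sales of TV in U.S.A.
grand_total = cube[(ALL, ALL, ALL)]
```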

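Using a data warehouse
Three kinds of data warehouse applications:
- Information processing
  - Supports querying, basic statistical analysis, and reporting using crosstabs, tables, charts, and graphs
- Analytical processing
  - Multidimensional analysis of data in the data warehouse
  - Supports the basic OLAP operations: roll-up, drill-down, pivoting
- Data mining
  - Knowledge discovery from hidden patterns
  - Supports association analysis, building and applying classification and prediction models, and presenting the mining results with visualization tools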

Data warehouse architecture

Data warehouse design: a business analysis framework

Four views regarding the design of a data warehouse:
- Top-down view
  - Allows selection of the relevant information needed for the data warehouse
- Data source view
  - Presents the information being captured, stored, and managed by the operational systems
- Data warehouse view
  - Contains the fact tables and the dimension tables
- Business query view
  - Shows the perspective of the data in the warehouse from the viewpoint of the end user

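Data warehouse design process
- Top-down approach, bottom-up approach, or a combination of both
  - Top-down: starts with overall design and planning (mature)
  - Bottom-up: starts with experiments and prototypes (rapid)
- From the software engineering point of view
  - Waterfall: structured, systematic analysis at each step before proceeding to the next
  - Spiral: rapid generation of increasingly functional systems, with short, fast cycles
- Typical data warehouse design process
  - Choose a business process to model, e.g., orders, invoices, etc.
  - Choose the grain (the atomic level of data) of the business process
  - Choose the dimensions that will apply to each fact table record
  - Choose the measures for each fact table record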

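Multi-tier architecture

[Figure: multi-tier data warehouse architecture, left to right]
- Data Sources: operational DBs, other sources
- Extract, Transform, Load, Refresh (with Monitor & Integrator)
- Data Storage: data warehouse, data marts, metadata
- OLAP Engine: OLAP server, served from the data storage tier
- Front-End Tools: analysis, query, reports, data mining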


Three-tier architecture
[Figure: three-tier data warehouse architecture]

Three data warehouse models
- Enterprise warehouse
  - Collects all of the information about subjects spanning the entire organization
- Data mart
  - A subset of enterprise-wide data that is of value to a specific group of users. Its scope is confined to specific, selected groups, for example a marketing data mart.
  - Independent <> dependent (sourced directly from the warehouse) data marts
- Virtual warehouse
  - A set of views over operational databases
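
The difference between a view-based virtual warehouse and a stored data mart can be illustrated with the standard `sqlite3` module. In this sketch the `orders` table, the view name, and the mart table name are all hypothetical, and for brevity the mart is filled straight from the same table rather than from a separate enterprise warehouse.

```python
import sqlite3

op = sqlite3.connect(":memory:")
op.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, dept TEXT, product TEXT, amount REAL);
INSERT INTO orders VALUES (1, 'marketing', 'TV', 100.0),
                          (2, 'finance',   'PC', 900.0),
                          (3, 'marketing', 'TV', 50.0);
-- Virtual warehouse: a view over operational data, with no storage of its own.
CREATE VIEW v_sales_by_product AS
    SELECT product, SUM(amount) AS total FROM orders GROUP BY product;
-- Data mart: a stored subset of the data for one user group (marketing).
CREATE TABLE mart_marketing AS
    SELECT product, amount FROM orders WHERE dept = 'marketing';
""")

# A new operational row is visible through the view immediately,
# but the stored mart stays as it was until it is refreshed.
op.execute("INSERT INTO orders VALUES (4, 'marketing', 'TV', 25.0)")
view_total = op.execute(
    "SELECT total FROM v_sales_by_product WHERE product = 'TV'"
).fetchone()[0]
mart_total = op.execute("SELECT SUM(amount) FROM mart_marketing").fetchone()[0]
```

The view always reflects the operational data but places its query load on the operational system, while the stored mart is isolated from that system and has to be refreshed.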

Multidimensional data model
- Business managers tend to think multidimensionally. For example, they tend to describe what the company does as:
  "We sell products in various markets, and we measure our performance over time."
- Data warehouse designers listen carefully to these words and add their own special emphasis:
  "We sell PRODUCTS in various MARKETS, and we measure our performance over TIME."
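Multidimensional data model (2)

Modeling the dimensions of a business

- Intuition: think of the business as a cube of data:
  - There are labels on each edge of the cube.
  - Points inside the cube are the intersections of the edges.
- For the business description above:
  - The edges are Product, Market, and Time.
  - It is easy to understand and imagine that the points inside the cube are the business performance measures for each combination of Product, Market, and Time values.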
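ONLINE ANALYTICAL PROCESSING
- An OLAP (On-Line Analytical Processing) system
  - Is a management system that allows data to be analyzed by:
    - Slicing the data along many different angles,
    - Drilling down to a more detailed level,
    - Rolling up to a more summarized level.
- The core nature of OLAP:
  - Data is taken from the data warehouse or from a data mart
  - Data is converted into a multidimensional model
  - Data is stored in a multidimensional data store.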
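
The three operations listed above can be sketched on a tiny cube held in a dictionary. The cell values, the month-to-quarter mapping, and the function names are hypothetical; real OLAP servers implement these operations very differently, so this is only meant to show what each operation returns.

```python
from collections import defaultdict

# Hypothetical cells: (product, market, month) -> sales.
cells = {
    ("TV", "North", "Jan"): 10, ("TV", "North", "Feb"): 20,
    ("TV", "South", "Apr"): 5,  ("PC", "North", "Jan"): 7,
}
MONTH_TO_QUARTER = {"Jan": "Q1", "Feb": "Q1", "Mar": "Q1", "Apr": "Q2"}


def slice_cube(cube, axis, value):
    """Slice: fix one dimension to a single value and drop that axis."""
    return {k[:axis] + k[axis + 1:]: v for k, v in cube.items() if k[axis] == value}


def roll_up_time(cube):
    """Roll-up: summarize month-level cells to the quarter level."""
    out = defaultdict(int)
    for (prod, market, month), v in cube.items():
        out[(prod, market, MONTH_TO_QUARTER[month])] += v
    return dict(out)


def drill_down_time(cube, quarter):
    """Drill-down: return the month-level cells behind one quarter."""
    return {k: v for k, v in cube.items() if MONTH_TO_QUARTER[k[2]] == quarter}


tv_slice = slice_cube(cells, 0, "TV")
by_quarter = roll_up_time(cells)
q1_detail = drill_down_time(cells, "Q1")
```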
ONLINE ANALYTICAL PROCESSING (2)
- The main object of OLAP is the cube: a multidimensional representation of detailed and summarized data.
- Recall: a cube consists of a fact table, one or more dimension tables, measures, and partitions.
- Cube: the cube is the central element of online analytical processing. It is a subset of the data from the data warehouse, organized and summarized into multidimensional structures.
- Dimension: a dimension is a way of describing the categories by which the numeric data in the cube is broken down for analysis.
- Measures: a measure of the cube is a column in the fact table. Measures identify the numeric values from the fact table that are summarized for analysis, such as price, value, or quantity sold.
- Partitions: every cube has at least one partition to hold its data; a single partition is created automatically when the cube is defined.