
Research Report

Data Warehousing, Business Intelligence, and Dimensional Modeling Primer

Group 2


1 Goals of Data Warehousing and Business Intelligence

The DW/BI system must make information easily accessible. The contents of the DW/BI system must be understandable. The data must be intuitive and obvious to business users, not just to developers. Data structures and labels should mirror business users' thought processes and vocabulary. BI tools and applications must be simple and easy to use, and they must return query results with minimal wait times. In short, the DW/BI system must be simple and fast.

The DW/BI system must present information consistently. The data in the DW/BI system must be credible. It must be carefully assembled from a variety of sources, checked for reliability, and released only when it is fit for user consumption. Consistency must also hold for DW/BI content that is used across multiple data sources. For example, two performance measures with the same name must mean the same thing. Conversely, two measures that do not mean the same thing must have different names.

The DW/BI system must adapt to change. User needs, business conditions, data, and technology are always changing. The DW/BI system must be designed to handle this inevitable change. Existing data and applications should not be changed or made obsolete when the business community asks new questions or when new data is added to the warehouse. Finally, changes should not disrupt users.

The DW/BI system must present information in a timely way. When the DW/BI system is in use, raw data may need to be converted within hours, minutes, or even seconds.

The DW/BI system must keep information secure. At a minimum, the warehouse is likely to contain information about what you are selling, which becomes harmful in the hands of people outside the organization. The DW/BI system must effectively control access to the organization's confidential information.


The DW/BI system must serve as a trustworthy foundation for decision making. The data warehouse must hold the data needed to support decisions. The most important outputs of a DW/BI system are the decisions made on the basis of the information it provides.

The DW/BI system must be accepted by the business community. A DW/BI system does not have to be the best system in theory; it has to be a simple, easy-to-use system that business users accept.

2 Introduction to Dimensional Modeling

2.1 Star Schemas and OLAP Cubes
Star schemas and OLAP cubes share a common logical design but differ in their physical implementation.
Considerations when deploying OLAP:

A star schema hosted in a relational database is a good foundation for building an OLAP cube.
OLAP cubes have performed better than relational databases, but this advantage is becoming less important as computer hardware improves.
OLAP cube data structures vary from vendor to vendor.
OLAP cubes offer more sophisticated security options than relational databases.
OLAP cubes offer a rich set of analysis capabilities.
OLAP cubes support transaction and periodic snapshot fact tables.
OLAP cubes typically support complex hierarchies of indeterminate depth.
OLAP cubes may impose detailed constraints on the structure of dimension keys.
Some OLAP products do not allow dimensional roles or aliases.

2.2 Fact Tables for Measurements

The fact table in a dimensional model stores the performance measurements generated by the events of an organization's business processes.
Measurement data is very large, so it should not be replicated. To keep the data consistent, everyone should access it from a single centralized location.
Each row in a fact table corresponds to one measurement event.
The level of detail of a row is called the grain; all rows in a fact table must be at the same grain.
The most useful facts are numeric and additive.


Do not store unnecessary text in the fact table. Unless a text value is unique for every row of the fact table, it belongs in a dimension table.
A table's depth refers to its number of rows; its width (wide or narrow) refers to its number of columns.
A fact table has two or more foreign keys that reference the primary keys of the dimension tables.
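
The points about grain and additivity can be illustrated with a minimal Python sketch. The table name, column names, and values below are invented for illustration and do not come from the report. Every row is one measurement event at the same grain (one product per sales transaction), and because the measures are numeric and additive they can be summed along any foreign key.

```python
from collections import defaultdict

# Hypothetical fact rows, all at the same grain:
# one row per product per sales transaction.
# Columns ending in _key are foreign keys to dimension tables (not shown).
sales_fact = [
    {"date_key": 20240101, "product_key": 1, "store_key": 10, "quantity": 2, "sales_amount": 20.0},
    {"date_key": 20240101, "product_key": 2, "store_key": 10, "quantity": 1, "sales_amount": 15.0},
    {"date_key": 20240102, "product_key": 1, "store_key": 11, "quantity": 3, "sales_amount": 30.0},
]


def total_by(rows, key, measure):
    """Sum an additive measure over the fact rows, grouped by one foreign key."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[measure]
    return dict(totals)


# The same additive fact rolls up cleanly along different dimensions.
sales_by_product = total_by(sales_fact, "product_key", "sales_amount")
sales_by_store = total_by(sales_fact, "store_key", "sales_amount")
```

Mixing grains in one table (for example, adding a monthly total row next to these transaction rows) would make such sums double count, which is why a single grain per fact table matters.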

2.3 Dimension Tables for Descriptive Context

Dimension tables always accompany a fact table. They describe the who, what, where, when, how, and why associated with the event stored in the fact table. Dimension tables are usually shallower than fact tables (fewer rows) but wider (more columns).
Each dimension table has a single primary key.
Use as few codes as possible and replace them with more meaningful words.
Dimension tables provide the entry points into the data, and they supply the labels and groupings used for analysis.
To decide whether a numeric data element belongs in the fact table or in a dimension table:
If its value changes often and it takes part in calculations, it belongs in the fact table.
If it is constant and is used as a constraint or a label, it belongs in a dimension table.
Normalizing dimension data is called snowflaking. Resist the habit of normalizing; instead, flatten the one-to-many relationships into a single dimension table. Because dimension tables are usually much smaller than fact tables, normalizing (snowflaking) them makes little difference to the overall size.
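
As a rough illustration of flattening instead of snowflaking, the sketch below starts from three normalized lookup structures and produces one wide dimension table. All names and values (`products`, `categories`, `departments`, and their contents) are hypothetical and chosen only for this example.

```python
# Hypothetical snowflaked (normalized) structures:
# product -> category -> department.
departments = {1: "Grocery", 2: "Household"}
categories = {
    100: {"category_name": "Snacks", "department_id": 1},
    200: {"category_name": "Cleaning", "department_id": 2},
}
products = [
    {"product_key": 1, "product_name": "Potato Chips", "category_id": 100},
    {"product_key": 2, "product_name": "Dish Soap", "category_id": 200},
]


def flatten_product_dimension(products, categories, departments):
    """Resolve the lookups once and store the descriptive text directly
    on each product row, yielding a single wide dimension table."""
    flat = []
    for product in products:
        category = categories[product["category_id"]]
        flat.append({
            "product_key": product["product_key"],  # the single primary key
            "product_name": product["product_name"],
            "category_name": category["category_name"],
            "department_name": departments[category["department_id"]],
        })
    return flat


product_dim = flatten_product_dimension(products, categories, departments)
```

The flattened rows repeat the category and department text, but they carry readable labels instead of codes, and a query needs only one join from the fact table to reach any of them.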

2.4 Facts and Dimensions Joined in a Star Schema

Star join

Dimensional schemas are simple and symmetric.
This benefits users because the schema is easy to understand and easy to navigate.
The simplicity also benefits performance.
Dimensional schemas can be extended easily to accommodate change.
You can add a new dimension table, add new facts to the fact table, or add attributes to an existing dimension table without reloading the data, and existing applications continue to run.
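
A star join can be sketched with Python's built-in `sqlite3` module. The schema, table names, and data below are assumptions made up for this example, not part of the report. The query joins the fact table to each dimension on its key and groups by dimension attributes; the last lines add an attribute to an existing dimension to show that the original query keeps working unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE date_dim (date_key INTEGER PRIMARY KEY, month_name TEXT);
CREATE TABLE product_dim (product_key INTEGER PRIMARY KEY, category_name TEXT);
CREATE TABLE sales_fact (
    date_key INTEGER REFERENCES date_dim(date_key),
    product_key INTEGER REFERENCES product_dim(product_key),
    sales_amount REAL
);
INSERT INTO date_dim VALUES (20240115, 'January'), (20240210, 'February');
INSERT INTO product_dim VALUES (1, 'Snacks'), (2, 'Cleaning');
INSERT INTO sales_fact VALUES
    (20240115, 1, 20.0), (20240115, 2, 15.0), (20240210, 1, 30.0);
""")

# Star join: the fact table sits in the middle, one join per dimension.
star_join = """
SELECT d.month_name, p.category_name, SUM(f.sales_amount) AS total_sales
FROM sales_fact AS f
JOIN date_dim AS d ON f.date_key = d.date_key
JOIN product_dim AS p ON f.product_key = p.product_key
GROUP BY d.month_name, p.category_name
ORDER BY d.month_name, p.category_name
"""
rows = conn.execute(star_join).fetchall()

# Extending the schema: add an attribute to an existing dimension table.
# No data is reloaded and the existing query still runs as before.
conn.execute("ALTER TABLE product_dim ADD COLUMN brand_name TEXT")
rows_after = conn.execute(star_join).fetchall()
```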

3 Kimball's DW/BI Architecture

3.1 ETL: Extract, Transform, and Load

Extract: read and understand the source data, and pick out the data needed for later storage and processing. At this point, the data belongs to the data warehouse.
Transform: convert the operational data produced by business applications into analytic data for managers, and optimize it for that analytic purpose.


Transformation also serves another purpose: cleansing the data (correcting misspellings, resolving domain conflicts, dealing with missing elements, or parsing values into standard formats).
Load: the final step is the physical structuring and loading of the data. Once transformed, all of the data is placed in a new store called the data warehouse.
In many cases, the ETL system is not based on relational technology; instead, it may rely on a system of flat files, which have no relational structure.
After the data has been validated against the one-to-one and many-to-one business rules, taking the final step of building a 3NF database may be unnecessary. However, there are cases in which the data arrives at the doorstep of the ETL system already in 3NF relational format. In these cases, ETL developers may be more comfortable performing the transformation and cleansing tasks on normalized structures. Creating both normalized structures for ETL and dimensional structures for presentation means that the data may be extracted, transformed, and loaded twice: once into the normalized database and once more when the dimensional model is loaded.
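
The three ETL steps can be sketched as a small flat-file pipeline in Python. Everything specific here is an assumption for illustration: the `RAW_EXTRACT` sample, the `orders_fact` table, the two date formats, and the choice to load a missing amount as NULL. A real ETL system would apply the organization's own business rules.

```python
import csv
import io
import sqlite3
from datetime import datetime

# Hypothetical flat-file extract from an operational source system.
RAW_EXTRACT = """order_id,customer,order_date,amount
1, alice ,01/02/2024,10.50
2,BOB,2024-01-03,
3,Alice,01/05/2024,7.25
"""


def extract(text):
    """Extract: read the source rows needed for later processing."""
    return list(csv.DictReader(io.StringIO(text)))


def parse_date(value):
    """Parse the two date formats assumed to occur in the source into ISO format."""
    for fmt in ("%m/%d/%Y", "%Y-%m-%d"):
        try:
            return datetime.strptime(value, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {value!r}")


def transform(rows):
    """Transform: cleanse values and put them into standard formats."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        cleaned.append((
            int(row["order_id"]),
            row["customer"].strip().title(),        # unify inconsistent spellings
            parse_date(row["order_date"].strip()),  # parse into a standard format
            float(amount) if amount else None,      # missing element: load as NULL (an assumption)
        ))
    return cleaned


def load(conn, rows):
    """Load: create the physical structure and insert the transformed rows."""
    conn.execute(
        "CREATE TABLE orders_fact "
        "(order_id INTEGER, customer TEXT, order_date TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders_fact VALUES (?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_EXTRACT)))
loaded = conn.execute("SELECT * FROM orders_fact ORDER BY order_id").fetchall()
```

Note that the staging data here lives in a flat file and plain Python lists; no normalized database is built along the way, which matches the flat-file style of ETL described above.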

3.2 Presentation Area

The presentation area is where data is organized, stored, and made available for direct querying by users, report writers, and other BI applications. It is everything the business sees and touches through its access tools and BI applications.
Data is presented, stored, and accessed in dimensional schemas, either relational star schemas or OLAP cubes.
The presentation area must contain detailed, atomic data (the most granular and simplest form of the data; atomic data is the foundation for all later transformations of the data).
It is organized around business process measurement events. This approach aligns naturally with the operational source data capture systems.
Dimensional models should correspond to the natural data capture events. All dimensional structures must be built using common, conformed dimensions. This is the basis of the enterprise data warehouse bus architecture, which is described in Chapter 4.
When dimensional models are designed with conformed dimensions, they can be readily combined and used together.
When the bus architecture is used as a framework, the data warehouse can be developed in an agile, decentralized, iterative manner.
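
To show why conformed dimensions let separate dimensional models be combined, here is a minimal sketch, assuming two hypothetical business processes (sales and a single inventory snapshot) that share one product dimension. The data and the helper names `summarize` and `combine` are invented for this example. Each fact table is summarized separately by the same dimension attribute, and the results are then merged on that attribute's labels.

```python
# Conformed product dimension shared by both business processes (hypothetical data).
product_dim = {
    1: {"category_name": "Snacks"},
    2: {"category_name": "Cleaning"},
}

# Two fact tables from different business processes, both keyed by product_key.
sales_fact = [
    {"product_key": 1, "sales_quantity": 5},
    {"product_key": 2, "sales_quantity": 3},
    {"product_key": 1, "sales_quantity": 2},
]
inventory_fact = [  # a single snapshot, so quantities can be summed across products
    {"product_key": 1, "quantity_on_hand": 40},
    {"product_key": 2, "quantity_on_hand": 25},
]


def summarize(fact_rows, measure):
    """Total one measure by the conformed category_name attribute."""
    totals = {}
    for row in fact_rows:
        label = product_dim[row["product_key"]]["category_name"]
        totals[label] = totals.get(label, 0) + row[measure]
    return totals


def combine(left, right):
    """Merge two summaries on their shared dimension labels."""
    labels = sorted(set(left) | set(right))
    return [(label, left.get(label), right.get(label)) for label in labels]


report = combine(
    summarize(sales_fact, "sales_quantity"),
    summarize(inventory_fact, "quantity_on_hand"),
)
```

The merge only works because both summaries use identical category labels from the same dimension. If each process kept its own product table with different names or codes, the rows would not line up.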

3.3 Business Intelligence Applications

BI applications are a set of techniques and tools for collecting raw data and converting it into meaningful information for business analysis.
Common BI functions include:
reporting, online analytical processing (OLAP), analytics, data mining, process mining, complex event processing, business performance management, benchmarking, text mining, predictive analytics, and prescriptive analytics.
A BI application can be as simple as an ad hoc query tool, or as complex as a sophisticated modeling or data mining application.
Ad hoc query tools, as powerful as they are, can be understood and used effectively by only a small percentage of DW/BI business users. In addition, some of the more sophisticated applications, such as forecasting or modeling tools, may upload their results back into the operational source systems, the ETL system, or the presentation area.

4 Alternative DW/BI Architectures

4.1 Independent Data Mart Architecture
Independent data marts are built before the data warehouse, and their data is taken directly from the various sources. The data marts are built and managed independently of one another, with no data exchanged between them.
In this architecture, each BI application connects to one and only one data mart, and each data mart takes its data from one or more sources through ETL (Extract, Transform, and Load).
The independent architecture gives each mart a high degree of data autonomy. It is used when people want to analyze data for their own separate purposes. This independence makes the architecture more flexible, and the marts can easily be implemented with different storage technologies, such as relational, object-oriented, or distributed stores.
It is often seen in smaller organizations that lack the resources to build a centralized data warehouse. The approach is simpler and cheaper, but its weakness is that each independent data mart integrates data in its own way, so data from different data marts is very hard to reconcile.

4.2 Hub-and-Spoke Architecture

The image of a wheel hub and spokes conveys centralization and mutual dependence through the hub. The data marts are built after the data warehouse and are dependent, meaning that they are linked through shared data. Data is stored in both summarized and detailed form.
Data marts do not take data directly from the sources; they go through the EDW (Enterprise Data Warehouse), which is responsible for normalizing the relationships in the warehouse data (a normalized relational warehouse).
The EDW is normalized to 3NF.
Users are allowed to query the data in the EDW.
BI applications can send requests directly to the EDW without going through a data mart. Through the EDW, data is filtered and integrated at a high level to serve a specific subject area in each data mart.

5 Dimensional Modeling Myths

5.1 Dimensional Models Are Only for Summary Data
This myth is a common cause of poorly designed models. It is impossible to predict all the data that users and the business will need. The less detailed the data, the harder it is for the warehouse to accommodate change. Summary data is provided alongside the detailed elements to improve query performance, but it cannot replace the detail.
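
The asymmetry between detail and summary can be shown with a short sketch. The rows and the `aggregate` helper are hypothetical and exist only for illustration. Any summary can be derived from the atomic rows, but a question that was not anticipated when the summary was built cannot be answered from the summary alone.

```python
# Hypothetical atomic sales rows (the most detailed level available).
atomic_sales = [
    {"month": "2024-01", "product": "Chips", "store": "North", "amount": 20.0},
    {"month": "2024-01", "product": "Soap", "store": "South", "amount": 15.0},
    {"month": "2024-02", "product": "Chips", "store": "South", "amount": 30.0},
]


def aggregate(rows, by):
    """Total the amount for each combination of the columns named in `by`."""
    totals = {}
    for row in rows:
        key = tuple(row[column] for column in by)
        totals[key] = totals.get(key, 0.0) + row["amount"]
    return totals


# A pre-built monthly summary answers the anticipated question quickly.
monthly_summary = aggregate(atomic_sales, by=("month",))

# An unanticipated question (sales by store) needs the atomic rows:
# the monthly summary no longer carries any store information.
sales_by_store = aggregate(atomic_sales, by=("store",))
```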

5.2 Dimensional Models Are Departmental, Not Enterprise-Wide

Boundaries should not be drawn along organizational lines. Dimensional models should be organized around business processes such as invoices, orders, and service calls. Multiple business functions often want to analyze the same metrics produced by a single business process; duplicating that data makes it more complex, and the resulting inconsistent analyses should be avoided.

5.3 Dimensional Models Are Not Scalable

In practice, dimensional models are highly scalable: fact tables can have billions of rows, and database vendors have committed to developing and optimizing the scalability of dimensional models. Conventional (normalized) models and dimensional models contain the same information and the same data relationships; their logical content is the same, and both answer a given question accurately.

5.4 Dimensional Models Are Only for Predictable Usage

Dimensional models should not be designed by focusing on predefined reports or analyses. The design should center on measurement processes. It is important to focus on the organization's stable measurement events. A corollary of this myth is that dimensional models do not respond to changing business needs. On the contrary, because of their symmetry, dimensional structures are extremely flexible and adapt well to change.

5.5 Dimensional Models Cannot Be Integrated

Dimensional models can be integrated as long as they conform to the enterprise data warehouse bus architecture. Conformed dimensions are built and maintained as centralized, persistent master data in the ETL system and then reused across dimensional models, which enables data integration and ensures semantic consistency.
Agile development: focus on delivering business value; value collaboration between the development team and business stakeholders; stress ongoing face-to-face communication, feedback, and prioritization with business stakeholders; adapt quickly to inevitably evolving requirements; and tackle development in an iterative, incremental manner.
Although appealing, agile methods often lack planning and architecture, and they come with ongoing challenges for management.

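6 Summary
This chapter presented the key goals of DW/BI systems and the fundamental concepts of dimensional modeling. The Kimball DW/BI architecture was compared with several alternative approaches. The chapter also identified some common misunderstandings about dimensional modeling that persist despite its widespread use, and it described the challenges of thinking dimensionally when modeling data.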
