
Introduction to Online Analytical Processing with SQL Server


OLAP (On-Line Analytical Processing) with SQL SERVER

Nguyễn Văn Chức, chucnv@ud.edu.vn

In data warehouse technology, OLAP is the principal technique for accessing data in the warehouse. Data in a DW is organized as multidimensional data cubes (Multi Dimensional Cube), and OLAP is used to analyze that cube data.

This article describes how to carry out OLAP on the SQL Server DBMS, version 2005 or later.

1. Explanation of terms

The terms used in this article are summarized below:

Data Warehouse (DW): A collection of subject-oriented, historical databases integrated from many data sources through extraction, consolidation, transformation, and cleansing.

Data Cube: Data in the warehouse is presented in multidimensional (Multi Dimension) form, called a cube. Each dimension describes one characteristic of the data. For example, in a sales Data Cube, the Item dimension describes the goods in detail, the time dimension describes when sales took place, the Branch dimension describes the sales agents, and so on.

To make the Data Cube clearer, the figure below illustrates a sales Data Cube, going from a data table (Spreadsheet) to cube-form data with 3 dimensions: Location (Cities), Time (Quarters), and Item (Types).

Star Schema: This is the data representation model of the DW. A star schema basically consists of a fact table (Fact Table) and dimension tables (Dimension Table). The fact table is used to track changes in the data; its structure includes foreign keys that are the primary keys of the dimension tables. Dimension tables describe the characteristics of the dimensions, such as the time dimension, the customer dimension, the item dimension, and so on.

The star schema of the sales problem is illustrated below. This is also the data used for illustration in the next part, when OLAP is performed on SQL Server. The fact table is Sales, and the 4 dimension tables are time (the time dimension), item (the item dimension), location (the location dimension), and branch (the branch dimension).

Measure: A quantity that can be computed from the attributes of the fact table. Measures are the goal of OLAP and must be defined before the analysis begins. Examples are the total sales amount of a branch, or the revenue of each item by quarter.

Hierarchies: This concept describes the ordering of levels (the level of detail of the data). For example, for the time dimension we have the levels day<week<month<quarter<year. Similarly, for the location dimension we have the levels street<city<province_or_state<country. When analyzing data we rely on this concept to summarize or drill into each category of data in the DW.
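To make the idea of summarizing along a hierarchy concrete, here is a minimal sketch in Python. It rolls a handful of made-up daily sales up to the month, quarter, and year levels of the time hierarchy. The data, the function names `time_level` and `roll_up`, and the choice to skip the week level are all assumptions made for illustration; nothing here is SQL Server's own mechanism.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily sales facts: (sale date, units sold).
daily_sales = [
    (date(2008, 1, 15), 10),
    (date(2008, 2, 3), 5),
    (date(2008, 4, 20), 7),
    (date(2008, 11, 1), 3),
    (date(2009, 1, 9), 4),
]

def time_level(d, level):
    """Map a date to its member at the given level of the time hierarchy."""
    if level == "month":
        return (d.year, d.month)
    if level == "quarter":
        return (d.year, (d.month - 1) // 3 + 1)
    if level == "year":
        return (d.year,)
    raise ValueError(f"unknown level: {level}")

def roll_up(facts, level):
    """Sum units sold per member of the chosen hierarchy level."""
    totals = defaultdict(int)
    for d, units in facts:
        totals[time_level(d, level)] += units
    return dict(totals)

by_quarter = roll_up(daily_sales, "quarter")
by_year = roll_up(daily_sales, "year")
```

Moving from `by_quarter` to `by_year` corresponds to summarizing (roll-up); going the other way, from year to quarter to month, corresponds to drilling into detail.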

2. Description of the OLAP analysis application

The problem described in this part is a sales problem with 1 fact table, Sales, and 4 dimension tables: time, item, location, and branch (see the star schema above).

Fact Table (Sales): Records the changes in the sales process. It consists of the foreign keys of the 4 dimension tables and 2 attributes: the sale amount (dollars_sold) and the quantity sold (units_sold).

The dimension tables:

Time: Stores information about the time of sale.

Location: Stores information about the location.

Branch: Stores information about the branch.

Item: Stores information about the item.

The purpose is to illustrate OLAP by analyzing the sales activity of an enterprise.
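The schema just described can be sketched as table definitions. The example below uses Python's built-in `sqlite3` module as a stand-in for SQL Server, purely so that it runs anywhere. Only `dollars_sold`, `units_sold`, the table names, and the hierarchy levels come from the article; the key columns (`time_key`, `item_key`, and so on) and the attributes `item_name`, `brand`, and `branch_name` are assumptions chosen for illustration.

```python
import sqlite3

# Column names are illustrative assumptions, except dollars_sold, units_sold,
# and the hierarchy levels named in the article.
DDL = """
CREATE TABLE time (
    time_key INTEGER PRIMARY KEY,
    day INTEGER, week INTEGER, month INTEGER, quarter INTEGER, year INTEGER
);
CREATE TABLE item (
    item_key INTEGER PRIMARY KEY,
    item_name TEXT, brand TEXT, type TEXT
);
CREATE TABLE location (
    location_key INTEGER PRIMARY KEY,
    street TEXT, city TEXT, province_or_state TEXT, country TEXT
);
CREATE TABLE branch (
    branch_key INTEGER PRIMARY KEY,
    branch_name TEXT
);
-- Fact table: one foreign key per dimension plus the two numeric attributes.
CREATE TABLE sales (
    time_key INTEGER NOT NULL REFERENCES time(time_key),
    item_key INTEGER NOT NULL REFERENCES item(item_key),
    location_key INTEGER NOT NULL REFERENCES location(location_key),
    branch_key INTEGER NOT NULL REFERENCES branch(branch_key),
    dollars_sold REAL NOT NULL,
    units_sold INTEGER NOT NULL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(DDL)

# Inspect what was created: the five tables and the columns of the fact table.
tables = sorted(r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"))
sales_columns = [r[1] for r in conn.execute("PRAGMA table_info(sales)")]
```

The fact table sits at the center of the star: every row points to exactly one row in each of the four dimension tables.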

3. Implementing OLAP in SQL Server

Note: To have the OLAP analysis tools, you must install the full Developer or Enterprise Edition of SQL Server 2005 (2008), and during installation remember to select SQL Server Database Services and Analysis Services. The tool used to perform OLAP is SQL Server Business Intelligence Development Studio (BIDS). When you install one of the editions above, BIDS is installed automatically.

Steps: Start SQL Server Management Studio, create a database named DW as follows, and enter some records into the tables for analysis.

Start SQL Server Business Intelligence Development Studio.

Create a new Analysis Services Project named OLAP_DW.

In the Solution Explorer window of the OLAP_DW project, right-click Data Source to create a connection to the data used for analysis.

Specify the connection parameters for the data warehouse named DW that was created in SQL Server Management Studio.

Name the Data Source and click Finish to complete the connection to the database.

Create a Data Source View to pull in the data tables needed for analysis. Right-click Data Source View in the Solution Explorer window and choose New Data Source View.

Specify the data source (Data Source) to use: the DW source created in the previous step.

Choose Next and select the tables needed for analysis.

Note: If you want to select the fact table and the dimension tables related to it, you only need to select the fact table, move it to the right-hand pane, and click the "Add Related Tables" button to pull in the related dimension tables automatically. When finished, the fact and dimension tables look as follows:

After creating the Data Source and the Data Source View, create the data cube for analysis by right-clicking Cube in Solution Explorer and choosing New Cube.

Choose Next and select the data source for the cube (DW); the system will automatically detect the fact and dimension tables.

The result is as follows:

Click Next to set up the time dimension. Note that time is a very important dimension in data warehouses in general and in OLAP analysis in particular. For that reason, if you do not specify a time dimension, the system will automatically create one to manage time.

Click Next to define the measures (Measure) for analysis. Recall that measures are the quantities that reflect the goal of the analysis and computation. They are operations on the computable attributes of the fact table.
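As a rough illustration of what "operations on the computable attributes of the fact table" means, the sketch below computes three simple additive measures over a few made-up fact rows: a row count (comparable to the Sales Count measure that appears later in the browser), the total of `units_sold`, and the total of `dollars_sold`. The rows and the `measures` helper are hypothetical; the wizard defines its measures inside the cube rather than in code like this.

```python
# Hypothetical fact rows; only dollars_sold and units_sold come from the article.
sales = [
    {"branch": "Danang", "dollars_sold": 120.0, "units_sold": 3},
    {"branch": "Danang", "dollars_sold": 80.0, "units_sold": 2},
    {"branch": "HCM", "dollars_sold": 200.0, "units_sold": 5},
]

def measures(rows):
    """Compute three additive measures over a set of fact rows."""
    return {
        "sales_count": len(rows),
        "units_sold": sum(r["units_sold"] for r in rows),
        "dollars_sold": sum(r["dollars_sold"] for r in rows),
    }

totals = measures(sales)
danang_only = measures([r for r in sales if r["branch"] == "Danang"])
```

Because these measures are plain counts and sums, the same function gives a meaningful answer for any subset of the fact rows, which is what lets a cube report them at every level of every dimension.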

Click Next; the system will automatically detect the hierarchies (Hierarchies) in the dimension tables.

Review the dimensions in the cube.

Name the cube (DW) and click Finish to generate it. The data cube and its dimensions are generated.

After creating the data cube for analysis, run OLAP by right-clicking the project name in Solution Explorer and choosing Deploy.

The project is deployed successfully as follows:

Once the project has been deployed, to perform OLAP analyses in support of management work, right-click the cube in Solution Explorer and choose Browse to bring up the analysis screen:

The OLAP analysis screen looks as follows:

The left panel contains the measures and dimensions defined when the cube was built.

The right panel is divided into 2 windows. The upper window is used to specify the analysis conditions; the lower window shows the resulting measures when you drag and drop measures from the left panel. Depending on the purpose of the analysis, set up the analysis expressions accordingly. For example, the setup below asks for the number of sales (Sales Count) and the total quantity (Unit Sold) of goods sold by the Danang branch.

The setup below shows how many times items of the Intel brand were sold, and in what total quantity, at the HCM branch.
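The two setups above amount to filtering the fact table through its dimensions and then computing the measures. The sketch below expresses both questions as a join query, again using Python's `sqlite3` as a stand-in for SQL Server. The column names, the sample rows, and the helper `slice_measures` are assumptions for illustration; the cube browser itself issues its own queries against Analysis Services, not this SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, trimmed-down schema and data: two dimensions and the fact table.
conn.executescript("""
CREATE TABLE branch (branch_key INTEGER PRIMARY KEY, branch_name TEXT);
CREATE TABLE item (item_key INTEGER PRIMARY KEY, item_name TEXT, brand TEXT);
CREATE TABLE sales (branch_key INTEGER, item_key INTEGER, units_sold INTEGER);
INSERT INTO branch VALUES (1, 'Danang'), (2, 'HCM');
INSERT INTO item VALUES (1, 'CPU', 'Intel'), (2, 'Mainboard', 'Intel'),
                        (3, 'RAM', 'Kingston');
INSERT INTO sales VALUES (1, 1, 4), (1, 3, 2), (2, 1, 6), (2, 2, 1), (2, 3, 5);
""")

def slice_measures(conn, branch_name, brand=None):
    """Return (sales count, total units sold) for one branch, optionally one brand."""
    sql = """
        SELECT COUNT(*), COALESCE(SUM(s.units_sold), 0)
        FROM sales AS s
        JOIN branch AS b ON b.branch_key = s.branch_key
        JOIN item AS i ON i.item_key = s.item_key
        WHERE b.branch_name = ? AND (? IS NULL OR i.brand = ?)
    """
    return conn.execute(sql, (branch_name, brand, brand)).fetchone()

danang = slice_measures(conn, "Danang")                 # all goods sold by Danang
intel_hcm = slice_measures(conn, "HCM", brand="Intel")  # Intel items sold at HCM
```

Each filter in the `WHERE` clause plays the role of a condition dropped into the upper window of the browser, and the two selected aggregates play the role of the measures dropped into the lower window.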

The OLAP design screen is easy to use and flexible: you can drag and drop dimensions and measures from the left panel to the right panel. For example, you can drag the Branch Name attribute of the Branches dimension to the right panel, and the system will show the quantity and number of sales of products by branch, as follows:

Depending on your analysis needs, you can create slices (slice) of the data along many different dimensions to produce the summaries required, quickly and conveniently. The figure below shows the quantity and number of sales of items by branch, based on a slice over the 2 dimensions Branches and Items.
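A two-dimensional summary of this kind can be sketched in a few lines of Python. The example groups made-up facts by (branch, item) and keeps a sales count and a unit total per cell. The data and the `cross_tab` helper are hypothetical and only mimic the shape of the result the browser displays.

```python
from collections import defaultdict

# Hypothetical denormalized facts: (branch name, item name, units sold).
facts = [
    ("Danang", "CPU", 4),
    ("Danang", "RAM", 2),
    ("HCM", "CPU", 6),
    ("HCM", "CPU", 1),
    ("HCM", "RAM", 5),
]

def cross_tab(rows):
    """Group facts by (branch, item) and return count and unit totals per cell."""
    cells = defaultdict(lambda: {"sales_count": 0, "units_sold": 0})
    for branch, item, units in rows:
        cell = cells[(branch, item)]
        cell["sales_count"] += 1
        cell["units_sold"] += units
    return dict(cells)

table = cross_tab(facts)

# Print one line per (branch, item) cell, sorted by branch and then item.
for (branch, item), cell in sorted(table.items()):
    print(f"{branch:8} {item:5} {cell['sales_count']:3} {cell['units_sold']:4}")
```

Adding a third element to the grouping key, for example a quarter, would extend the same summary to three dimensions.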

The tabs Dimension Usage, Calculations, KPIs, Actions, Partitions, Perspectives, and Translations are used to extend the analysis capabilities of OLAP. Besides OLAP analysis, SQL Server Business Intelligence Development Studio also provides data mining techniques such as Regression, Association, Decision Tree, Time Series, and Clustering under Mining Structure. These are powerful and convenient for building data mining models (to be presented in another article).

Please send all comments to chucnv@ud.edu.vn. Thank you!

Keywords: data warehouse, OLAP, online analytical processing, Data Mining, Business Intelligence, Management Information Systems

