
4.3 Early termination


Predicate-based sampling [47] addresses the lack of early termination
of map tasks, which occurs even when a job has already processed
enough input to produce the required output. The problem addressed is
how to produce a fixed-size sample (one that satisfies a given
predicate) of a massive data set using MapReduce. To achieve this
objective, several technical issues need to be resolved in Hadoop, and
two new concepts are therefore defined. A new type of job is
introduced, called dynamic, which is able to control its data access
dynamically. In addition, the concept of the Input Provider is
introduced into the execution model of Hadoop. The Input Provider is
supplied together with the Map and Reduce logic. Its role is to make
dynamic decisions about the data access of the job. At regular
intervals, the JobClient provides statistics to the Input Provider,
and based on this information the Input Provider can respond in three
ways:
(1) end of input: the running map tasks are allowed to complete, but
    no new map tasks are launched and the shuffle phase begins;
(2) input available: additional input needs to be accessed;
(3) input unavailable: no decision can be made at this point, and
    processing continues as normal until the Input Provider is
    consulted again.
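
The three responses can be illustrated with a minimal Python sketch. It is not Hadoop code and not the implementation from [47]: the names Response, JobStats and decide, the fields of the statistics record, and the decision rule itself are all assumptions chosen only to show how a provider might map periodic statistics to one of the three answers when collecting a fixed-size predicate sample.

```python
from dataclasses import dataclass
from enum import Enum


class Response(Enum):
    END_OF_INPUT = "end of input"
    INPUT_AVAILABLE = "input available"
    INPUT_UNAVAILABLE = "input unavailable"


@dataclass
class JobStats:
    """Hypothetical statistics reported to the provider at each interval."""
    matching_records: int   # records that satisfied the predicate so far
    running_map_tasks: int  # map tasks still in flight
    pending_splits: int     # input splits not yet handed to any map task


def decide(stats: JobStats, sample_size: int) -> Response:
    """Toy decision rule (an assumption, not the rule used in [47])."""
    if stats.matching_records >= sample_size:
        # Enough matching records: let running maps finish, start the shuffle.
        return Response.END_OF_INPUT
    if stats.pending_splits == 0:
        # Nothing left to read, so the job has to finish with what it has.
        return Response.END_OF_INPUT
    if stats.running_map_tasks > 0:
        # In-flight tasks may still deliver enough records: defer the decision.
        return Response.INPUT_UNAVAILABLE
    # Sample still too small and nothing is running: request more input.
    return Response.INPUT_AVAILABLE
```

For example, decide(JobStats(10, 0, 5), sample_size=100) asks for more input, while the same call with 100 matching records ends the input.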

EARL aims to provide early results for analytical queries in
MapReduce without processing the entire input data. EARL uses uniform
sampling and iteratively repeats the computation on larger samples
until a given accuracy level is reached, with the accuracy estimated
by means of bootstrapping. Its main contributions include the
incremental computation of early results with reliable accuracy
estimates, which is increasingly important for applications that must
process large volumes of data. Moreover, EARL uses delta maintenance
to improve the performance of re-executing jobs on larger samples. At
the technical level, EARL modifies Hadoop in three ways:
(1) reduce tasks can process input before the map tasks have finished
    processing, by means of pipelining;
(2) map tasks remain active until they are explicitly terminated;
(3) a two-way communication channel between map tasks and reduce
    tasks is established to check whether the termination condition
    is satisfied.
In addition, a modified reduce phase is used, in which incremental
processing of a job is supported. Compared with [47], EARL provides an
error estimation framework and focuses more on using truly uniform
random samples.
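
The sample-then-check loop described above can be sketched on a single machine as follows. This is an illustration of the general idea only, not EARL's implementation: the function names, the doubling schedule, the use of the mean as the query, and the relative standard error as the accuracy measure are assumptions. The sketch also recomputes each estimate from scratch, so it does not model delta maintenance or pipelining.

```python
import random
import statistics


def bootstrap_rel_error(sample, resamples, rng):
    """Relative standard error of the mean, estimated by bootstrapping."""
    means = []
    for _ in range(resamples):
        # Resample with replacement, same size as the original sample.
        resample = rng.choices(sample, k=len(sample))
        means.append(statistics.fmean(resample))
    # Assumes the mean is not zero (true for the positive data used here).
    return statistics.stdev(means) / abs(statistics.fmean(means))


def early_mean(data, target_error=0.01, start=100, resamples=200, seed=0):
    """Estimate the mean of `data` from growing uniform samples."""
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)  # any prefix of a random permutation is a uniform sample
    size = min(start, len(shuffled))
    while True:
        sample = shuffled[:size]
        estimate = statistics.fmean(sample)
        error = bootstrap_rel_error(sample, resamples, rng)
        if error <= target_error or size == len(shuffled):
            return estimate, size, error
        size = min(2 * size, len(shuffled))  # grow the sample and try again
```

Calling early_mean on a list of positive numbers returns the estimate, the sample size at which the loop stopped, and the estimated relative error at that point.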

Several distinctive techniques have been introduced to address the
lack of early termination in top-k processing, including the use of
sorted access, intelligent data placement using advanced partitioning
schemes designed for top-k queries, and the use of synopses for the
data stored in HDFS, which allow efficient identification of the
blocks that contain promising tuples. Most of these techniques can be
combined to achieve better performance.
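
As a rough illustration of how synopses can enable early termination, the Python sketch below keeps one synopsis per block (its maximum score), visits blocks in decreasing order of that maximum, and stops once no remaining block can improve the current top-k. The function name, the choice of the maximum as the synopsis, and the in-memory list-of-lists layout are assumptions made for the example; they are not taken from the techniques cited above, and in a real system the synopses would be precomputed rather than derived by scanning.

```python
import heapq


def top_k_with_synopses(blocks, k):
    """Return the k highest scores and the number of blocks actually scanned.

    blocks: list of lists of numeric scores (a stand-in for HDFS blocks).
    """
    # Synopsis per block: its maximum score. max() stands in for a stored,
    # precomputed synopsis so that the example stays self-contained.
    synopses = sorted(
        ((max(block), i) for i, block in enumerate(blocks) if block),
        reverse=True,
    )
    heap = []    # min-heap holding the current top-k scores
    scanned = 0
    for block_max, i in synopses:
        if len(heap) == k and block_max <= heap[0]:
            break  # no remaining block can beat the current k-th score
        scanned += 1
        for score in blocks[i]:
            if len(heap) < k:
                heapq.heappush(heap, score)
            elif score > heap[0]:
                heapq.heapreplace(heap, score)
    return sorted(heap, reverse=True), scanned
```

With blocks [[1, 2, 3], [90, 95, 99], [10, 20, 30], [91, 92, 93]] and k = 3, the function returns [99, 95, 93] after scanning only two of the four blocks.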

RanKloud has been proposed for top-k retrieval in the cloud. RanKloud
computes statistics (at run time) while scanning the records and uses
these statistics to compute a threshold (the lowest score among the
top-k results) for early termination. In addition, a new partitioning
method is proposed, named uSplit, which aims to repartition the data
in a utility-sensitive way, where utility refers to the likelihood
that a tuple is part of the top-k. The main difference from [38] is
that RanKloud cannot guarantee the retrieval of k results, whereas
[38] aims at retrieving the exact result.
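
The trade-off in the last sentence can be seen in a small Python sketch of threshold-based pruning. It is a simplified stand-in, not RanKloud's algorithm: the statistics here come from a random sample rather than from run-time scan statistics, and the names estimate_threshold and threshold_top_k are hypothetical. Tuples below the estimated threshold are discarded early, so an overestimated threshold yields fewer than k results.

```python
import heapq
import random


def estimate_threshold(sample, k, total):
    """Score cut-off expected to leave about k tuples out of `total`."""
    expected_in_sample = k * len(sample) / total
    rank = max(1, round(expected_in_sample))
    return heapq.nlargest(rank, sample)[-1]


def threshold_top_k(scores, k, sample_size=100, seed=0):
    """Prune by an estimated threshold, then keep at most k survivors."""
    rng = random.Random(seed)
    sample = rng.sample(scores, min(sample_size, len(scores)))
    threshold = estimate_threshold(sample, k, len(scores))
    # Early pruning: tuples below the threshold are never ranked.
    survivors = [s for s in scores if s >= threshold]
    # May contain fewer than k scores if the threshold was set too high.
    return heapq.nlargest(k, survivors), threshold
```

Whatever the threshold turns out to be, the returned list is a prefix of the exact top-k ranking; what is not guaranteed is that it has length k.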