I. Introduction:
Evolutionary Computation (EC) [35] offers many genetic operators that steer the search toward the desired solutions in the search space. In genetic programming (GP) [5], as in other areas of evolutionary computation such as genetic algorithms (GA) [36] and evolution strategies (ES), the search is carried out by evolutionary operators, chiefly crossover and mutation. Each operator has its own parameter value, which gives the probability that the operator is selected to perform an evolution step. John Koza, one of the pioneers of genetic programming, used a very low mutation rate in his work [5] and suggested that mutation should account for at most 10%. Since then, GP practitioners have for many years tended to set the mutation rate very low relative to the crossover rate when implementing a GP system. To justify this, Koza argued that mutation in practice contributes little to GP performance, partly because of the position independence of subtrees in GP and partly because of the large number of chromosome positions in GP populations.
This argument about mutation versus crossover has drawn many computer science researchers to study how the two operators relate. Many studies have examined the influence of crossover, mutation, and other genetic operators on the quality of the overall solutions produced by evolutionary systems. Shaffer et al. [37] and Hinterding [38] showed that mutation can be more useful than commonly assumed. Luke and Spector [1] showed experimentally that mutation performs almost as well as crossover, by setting up different combinations of the two operators and testing them on many classes of problems. Although the overall performance of the two operators is nearly the same, they behave differently and each has its own properties. Crossover is regarded as a constructive operator that builds new structures from low-level building blocks, which means it can only use genetic material already present in the population. Mutation, by contrast, plays a stronger role in changing and exploring structures, because it can generate new code fragments from the functions and operators. Using both operators together is therefore genuinely necessary; Luke and Spector [6] give a rough guideline that good results are obtained when the crossover rate lies between 60 and 70%. Even so, GP researchers still typically choose a very small mutation rate when setting the parameters of their experiments.
Recently, however, it has been recognized that the right fixed parameter values depend on the specific type of problem, so the approach has gradually shifted toward adaptive parameter setting, that is, selecting the parameters automatically (Adaptive Parameters - AP). Following the success of AP in other areas such as ES [10], [11] and GA [9], [8], [28], [30], several studies have applied it to GP [31], [32], [8], [4], opening a promising direction for research on AP in GP and its extensions.

II. Previous work:
Note that adaptive parameter control (AP) is divided into three levels, population level, individual level and component level, according to where the genetic operators act [1]. Population-level AP changes some aspect of the EC that applies uniformly to all members of the population, typically the crossover probability and its effect on the fitness value. Individual-level AP changes a set of attributes, each of which is attached to a specific individual in the population. Finally, component-level AP changes the parameters attached to each component of an evolving individual, which determine how that component is modified during reproduction.
Angeline's experiments show that in some environments it is not a single level of AP but the combination of AP at different levels that yields clearly superior results.
In EC, self-adaptive parameters (SAP) have proved effective on many problems. Kramer [12] carried out several experiments and showed that most studies and theory on SAP concentrate on the mutation operator. He also suggested that SAP could be a key ingredient of an evolutionary algorithm for solving optimization problems without dependence on parameter settings. Barbosa et al. [33] applied an SAP strategy to a genetic algorithm and showed that using SAP improves the quality of the solutions found.
There are also several notable studies of SAP in GP and its extensions. Angeline [34] proposed a method called Two Self-adaptive Crossover for GP. The method first adapts the values that determine where crossover takes place, and then adapts both the location and the manner in which crossover is applied to a GP tree. Angeline's results show that this method outperforms standard GP, and therefore that certain heuristics commonly used in GP may not be optimal. Fagan et al. [15] proposed a method for adapting the mutation parameter automatically and applied it to Grammatical Evolution (GE); it is called Fitness Reaction Mutation (FRM). The method adapts the mutation parameter at the population level, driven by feedback from the fitness values of the population. Fagan et al. showed that FRM brings advantages in terms of fitness and of the solutions found. FRM can also prevent premature convergence to a local optimum in some cases.
GP systems with SAP have also been applied to a number of concrete real-world problems and have proved effective. Zheng Yin et al. [14] combined GP with SAP to adapt the crossover and mutation rates during a GP run on the Option Pricing problem. Their results show that the new method performs better than standard GP with fixed crossover and mutation parameters. For classification problems, GP combined with SAP has also shown some benefit. Al-Madi and Ludwig [20] applied the idea to classification and tested it on several data sets, such as Wisconsin Diagnostic Breast Cancer (WDBC), Diabetes, Heart, Hepatitis, and others. Their results show that GP with SAP produces results much faster than standard GP while preserving the accuracy of the solution.
...................

Chapter 3

3.1. Idea
My proposed method rests on the idea that, in any given generation, a more effective genetic operator should receive a higher probability than the less effective ones. Suppose we define the success rate of an operator as the number of offspring whose fitness is better than that of their parents, divided by the total number of offspring created with that operator. If an operator has a higher success rate than the other operators in some generation, it should have a higher probability than them in the next generation. The detailed algorithm built on this idea is given in Section 3.2.

3.2. Algorithm for adapting the crossover and mutation parameters in GP
First, let p_cross(i) and p_mutate(i) be the probabilities of the crossover and mutation operators in generation i, and let usedcross(i), succross(i) and usedmutate(i), sucmutate(i) be the number of uses and the number of successes of crossover and of mutation, respectively, in generation i.
Second, we define fitness improvement for Symbolic Regression with GP, where a smaller fitness value is better, as follows:
- For a unary operator such as mutation, if the child has a smaller fitness than its parent, we count one successful mutation.
- For crossover, two parents p1, p2 produce two children c1, c2. If min(fitness(p1), fitness(p2)) > min(fitness(c1), fitness(c2)), we count one successful crossover.
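The two success criteria above can be written as small predicates. This is a minimal sketch: the function names are hypothetical, and fitness values are assumed to be plain floats where lower is better.

```python
def mutation_succeeded(parent_fitness: float, child_fitness: float) -> bool:
    """A mutation succeeds when the child has strictly lower fitness than its parent."""
    return child_fitness < parent_fitness


def crossover_succeeded(p1: float, p2: float, c1: float, c2: float) -> bool:
    """A crossover succeeds when the better child beats the better parent."""
    return min(p1, p2) > min(c1, c2)
```

Both tests are strict, so a child that merely ties its parent does not count as a success.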
Third, we define the success rate of each operator by the following formulas:
- c_rate(i) = succross(i) * succross(i) / usedcross(i)
- m_rate(i) = sucmutate(i) * sucmutate(i) / usedmutate(i)
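As a quick illustration of the rate formulas, the sketch below computes the rate from a success count and a use count. The name success_rate is hypothetical, and returning 0 for an operator that was never used is an assumption of this sketch, since the text does not cover that case.

```python
def success_rate(successes: int, uses: int) -> float:
    """Rate as defined above: successes squared, divided by the number of uses."""
    if uses == 0:
        return 0.0  # assumption: an operator that was never applied gets rate 0
    return successes * successes / uses


# Hypothetical counts for one generation: crossover used 250 times with 30 successes,
# mutation used 250 times with 10 successes.
c_rate = success_rate(30, 250)  # 900 / 250
m_rate = success_rate(10, 250)  # 100 / 250
```

Because the success count is squared, an operator with three times as many successes over the same number of uses gets nine times the rate.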

As mentioned above, in this algorithm the probability of crossover or of mutation in the next generation is proportional to its success in the current generation. The crossover and mutation probabilities for generation (i+1) are therefore computed as follows:
- p_cross(i+1) = p_dis + c_rate(i) * (1 - 2*p_dis) / (c_rate(i) + m_rate(i))
- p_mutate(i+1) = p_dis + m_rate(i) * (1 - 2*p_dis) / (c_rate(i) + m_rate(i))
where p_dis = 0.01 is a very small probability allotted equally to both operators, which ensures that neither probability can fall to zero even if the operator was very unsuccessful in the current generation. By construction the two probabilities sum to 1.
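The update rule can be checked numerically with a short sketch. The function name is hypothetical, and keeping an even 0.5/0.5 split when both rates are zero is an assumption made here to avoid dividing by zero; the text does not specify that case.

```python
P_DIS = 0.01  # floor probability given in the text


def update_probabilities(c_rate: float, m_rate: float, p_dis: float = P_DIS):
    """Return (p_cross, p_mutate) for the next generation from the current rates."""
    total = c_rate + m_rate
    if total == 0:
        return 0.5, 0.5  # assumption: no operator succeeded, fall back to an even split
    share = 1.0 - 2.0 * p_dis  # probability mass distributed in proportion to the rates
    p_cross = p_dis + c_rate * share / total
    p_mutate = p_dis + m_rate * share / total
    return p_cross, p_mutate


# With the hypothetical rates c_rate = 3.6 and m_rate = 0.4:
# p_cross = 0.01 + 3.6 * 0.98 / 4.0 = 0.892 and p_mutate = 0.01 + 0.4 * 0.98 / 4.0 = 0.108
p_cross, p_mutate = update_probabilities(3.6, 0.4)
```

An operator with no successes at all still keeps probability p_dis = 0.01, so it can recover in later generations.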
The GP algorithm that applies the adaptive crossover and mutation parameter scheme above is described in detail as follows:
GP algorithm with adaptive crossover and mutation parameters
1) Randomly initialize a population of individuals, which are programs (GP trees) built from the initial sets of Terminals and Functions.
2) Compute the fitness value of each individual.
3) Select individuals from the population by Tournament Selection.
4) Apply the crossover and mutation operators to the selected individuals to create new individuals, then copy the results into the new population.
5) Repeat steps 3 and 4 until the new population is filled with the required number of new individuals.
6) Replace the old population with the new population.
7) Update the crossover and mutation probabilities using the adaptive parameter algorithm presented above.
8) Repeat steps 2 to 7 until some stopping condition is satisfied.
9) Return the best individual in terms of fitness.
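The nine steps can be sketched as a generic loop. This is not the implementation used in the experiments: all names are hypothetical, the individuals in the demo are short lists of floats standing in for GP trees, each offspring is assumed to come from crossover with probability p_cross and from mutation otherwise, and the best individual is taken from the final population.

```python
import random

P_DIS = 0.01  # floor probability from Section 3.2


def tournament(pop, fits, k, rng):
    """Return (individual, fitness) of the fittest of k random entrants (lower is better)."""
    j = min(rng.sample(range(len(pop)), k), key=lambda idx: fits[idx])
    return pop[j], fits[j]


def adaptive_loop(init, fitness, crossover, mutate,
                  pop_size=50, generations=20, k=8, seed=0):
    rng = random.Random(seed)
    pop = [init(rng) for _ in range(pop_size)]                 # step 1
    p_cross, p_mutate = 0.5, 0.5
    history = []
    for _ in range(generations):                               # step 8
        fits = [fitness(ind) for ind in pop]                   # step 2
        used_c = suc_c = used_m = suc_m = 0
        new_pop = []
        while len(new_pop) < pop_size:                         # step 5
            a, fa = tournament(pop, fits, k, rng)              # step 3
            if rng.random() < p_cross:                         # step 4: crossover
                b, fb = tournament(pop, fits, k, rng)
                c1, c2 = crossover(a, b, rng)
                used_c += 1
                suc_c += min(fitness(c1), fitness(c2)) < min(fa, fb)
                new_pop += [c1, c2]
            else:                                              # step 4: mutation
                c = mutate(a, rng)
                used_m += 1
                suc_m += fitness(c) < fa
                new_pop.append(c)
        pop = new_pop[:pop_size]                               # step 6
        c_rate = suc_c * suc_c / used_c if used_c else 0.0     # step 7
        m_rate = suc_m * suc_m / used_m if used_m else 0.0
        total = c_rate + m_rate
        if total > 0:  # assumption: keep the old probabilities if nothing succeeded
            p_cross = P_DIS + c_rate * (1 - 2 * P_DIS) / total
            p_mutate = P_DIS + m_rate * (1 - 2 * P_DIS) / total
        history.append((p_cross, p_mutate))
    fits = [fitness(ind) for ind in pop]
    best = min(range(len(pop)), key=lambda idx: fits[idx])
    return pop[best], fits[best], history                      # step 9


# Toy stand-ins (assumptions): individuals are lists of 5 floats, fitness is the L1 norm.
def toy_init(rng):
    return [rng.uniform(-1, 1) for _ in range(5)]


def toy_fitness(ind):
    return sum(abs(x) for x in ind)


def toy_crossover(a, b, rng):
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def toy_mutate(a, rng):
    child = list(a)
    child[rng.randrange(len(child))] += rng.gauss(0, 0.1)
    return child


best, best_fit, history = adaptive_loop(toy_init, toy_fitness, toy_crossover, toy_mutate)
```

The history list records (p_cross, p_mutate) after each generation, which is the kind of data plotted in the probability distribution figures of Chapter 5.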

Chapter 4: Experimental setup

4.1. Test problem:
To test the performance of the algorithm, we use one of the standard benchmark problems in genetic programming, Symbolic Regression, with the family of functions $F_n(x) = \sum_{i=1}^{n} x^i$. For each problem we use two input data sets of s elements, s = 50 and s = 20, which are sample values in the interval [-1, 1]; a problem instance is denoted $F_n^s$. In this section we test the algorithm on two problems, $F_6^{20}$ and $F_{18}^{50}$.
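A small sketch of how the fitness cases for a problem instance could be generated is given below. It assumes the target family is the polynomial x + x^2 + ... + x^n and that the s sample points are drawn uniformly at random from [-1, 1]; the text only says they are sample values in that interval. The names target and make_cases are hypothetical.

```python
import random


def target(n: int, x: float) -> float:
    """Assumed target family: F_n(x) = x + x^2 + ... + x^n."""
    return sum(x ** i for i in range(1, n + 1))


def make_cases(n: int, s: int, seed: int = 0):
    """Return s fitness cases (x, F_n(x)) with x sampled from [-1, 1]."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(s)]  # assumption: uniform sampling
    return [(x, target(n, x)) for x in xs]


cases_f6_20 = make_cases(6, 20)    # fitness cases for the smaller instance
cases_f18_50 = make_cases(18, 50)  # fitness cases for the larger instance
```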

4.2. Program setup
To test the algorithm on the two problems above, I designed the experiments as follows:
I use the function set +, -, *, /. Among these, / is protected division: if the function is asked to divide by 0, it returns 1.
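Protected division as described above takes only a couple of lines. The name protected_div is hypothetical.

```python
def protected_div(a: float, b: float) -> float:
    """Division that returns 1 instead of failing when the divisor is 0."""
    return 1.0 if b == 0 else a / b
```

This keeps every evolved expression defined on the whole input range, so fitness evaluation never has to handle a division error.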
For the terminal set, since the target is a function of one variable, I use only the variable X, which stands for values in the interval [-1, 1].
Because the aim is to assess the quality of the algorithm that adapts the crossover and mutation parameters, I set the initial values of the two parameters equal, at 0.5, and report results averaged over 20 runs.
The detailed settings for the program are listed in the table below:



Parameter                        Value
Number of generations            51
Population size                  500
Selection method                 Tournament selection, size 8
Initial operator probabilities   Crossover = 0.5, mutation = 0.5
Function set                     +, -, *, /
Terminal set                     X
Fitness cases                    The sample values in the set of 20 (or 50) values
Raw fitness                      Sum of the absolute errors over the fitness cases
Hits                             Number of absolute errors smaller than 0.1
Number of runs                   20
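The Raw fitness and Hits entries of the table can be made concrete with a short sketch. A candidate program is represented here as a plain Python callable, and the function names are hypothetical.

```python
def raw_fitness(program, cases) -> float:
    """Sum of absolute errors of the program over all fitness cases (lower is better)."""
    return sum(abs(program(x) - y) for x, y in cases)


def hits(program, cases, tol: float = 0.1) -> int:
    """Number of fitness cases whose absolute error is smaller than tol."""
    return sum(1 for x, y in cases if abs(program(x) - y) < tol)


# Three hand-picked cases of the target x + x^2, scored against the candidate f(x) = x.
cases = [(0.0, 0.0), (0.5, 0.75), (1.0, 2.0)]


def candidate(x):
    return x


score = raw_fitness(candidate, cases)  # 0 + 0.25 + 1.0 = 1.25
n_hits = hits(candidate, cases)        # only the first case is within 0.1
```

A raw fitness of 0 means the program reproduces every fitness case exactly.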
Chapter 5: Analysis of results

After running the program 20 times and comparing my system with a standard GP system, I found that my system clearly outperforms standard GP, both in the fitness of the best individual found and in the speed of convergence to the optimal solution.
Figure 1 compares my system with standard GP in terms of best fitness (the best fitness value) in each generation. My system performs noticeably better than standard GP, and it converges to the optimum (best fitness = 0) earlier than standard GP.

Figure 1: Mean best fitness on problem $F_6^{20}$


For a much harder problem such as $F_{18}^{50}$, my system clearly finds better solutions than standard GP. The curve for my system descends much more steeply than that of standard GP and converges to a far smaller fitness value. This demonstrates the capability and the good convergence speed of the GP system with my adaptive parameter algorithm.
This shows that recomputing the crossover and mutation parameters in each generation from the operators' success yields progressively more individuals in the next generation that are better than those of the previous one. As a result, my system finds better solutions and converges considerably faster than standard GP.
[Chart: y-axis 0 to 20, x-axis generations 1 to 51, series Adaptive and Standard]
In some cases, how the probability is divided between the crossover and mutation operators in each generation is very important.

Figure 2: Mean best fitness on problem $F_{18}^{50}$


Figure 3 shows the probability distribution of the genetic operators on problem $F_6^{20}$. For this problem, the crossover probability drops sharply from generation 1 to 7, rises slowly from generation 7 to 18, and then increases monotonically until the final generation, where it reaches about 93%. In roughly 30 generations the crossover probability is larger than the mutation probability. On this problem, therefore, crossover is more effective than mutation.

Figure 3: Probability distribution on problem $F_6^{20}$

[Chart: y-axis 0 to 20, x-axis generations 1 to 51, series Adaptive and Standard]
[Chart: y-axis 0% to 100%, x-axis generations 1 to 51, series mutation and crossover]
In contrast, Figure 4 shows the probability distribution on a harder problem, $F_{18}^{50}$. In this figure there are very few generations in which the crossover probability exceeds 50%. This means that on this problem mutation works better than crossover. My system still outperforms standard GP, whose crossover and mutation probabilities are set to 0.9 and 0.1 respectively, the values conventionally regarded as good.

Figure 4: Probability distribution on problem $F_{18}^{50}$

The behaviour of my adaptive parameter system suggests that the belief that crossover is better than mutation on every problem is not justified.
[Chart: y-axis 0% to 100%, x-axis generations 1 to 51, series mutation and crossover]
