K48 Nguyen Thi Hai Yen Thesis PDF
VIETNAM NATIONAL UNIVERSITY, HANOI
UNIVERSITY OF ENGINEERING AND TECHNOLOGY
Nguyễn Thị Hải Yến
HANOI - 2007
ACKNOWLEDGMENTS
First of all, I would like to express my most sincere and profound gratitude to my supervisors, Assoc. Prof. Dr. Hà Quang Thụy and MSc. Đặng Thanh Hải, for their dedicated guidance, encouragement and help throughout the work on this thesis.
I am deeply grateful to the lecturers of the Faculty of Information Technology for the valuable knowledge they have passed on to me over the past years.
I would like to thank the members of the data mining seminar group for their enthusiastic advice while I was writing this thesis.
I wish to express my gratitude to my grandparents and parents, who have always cared for me and encouraged me at every step of my education.
My sincere thanks go to my friends and colleagues, especially the members of class K48CD, who supported, helped and encouraged me during four years of university study and during the work on this thesis.
Although I have tried to complete the thesis within the scope and ability available to me, shortcomings are unavoidable. I hope for the understanding and advice of the lecturers and of my friends.
Thank you very much!
Hanoi, 31 May 2007
Student
Nguyễn Thị Hải Yến
ABSTRACT
With today's large volumes of data, data classification plays a very important role and remains a topical problem in text data processing. A basic requirement is to improve the effectiveness of classification algorithms, that is, to raise their recall and precision. At the same time, labeled training examples are not always available, so classification algorithms that can use unlabeled examples are needed. Semi-supervised classification meets both requirements [5, 7, 8, 16, 17]. Semi-supervised classification algorithms exploit the abundant unlabeled data that occurs naturally, combined with a small amount of labeled data.
In recent years, classifiers based on the Support Vector Machine (SVM) have attracted much attention and have been widely used in recognition and classification. Published work [4, 7, 8, 11] shows that SVM classifies quite well in text classification as well as in many other applications.
In this thesis, I survey the semi-supervised SVM learning algorithm and present the SVMlin software proposed by V. Sindhwani [18]. In 2006-2007, V. Sindhwani used SVMlin to classify text from the 20Newsgroups collection and obtained good results [14, 15].
TABLE OF CONTENTS
INTRODUCTION
Chapter 1
CONCLUSION
Work accomplished in the thesis
Directions for future research
Abbreviation    Phrase
kNN             k Nearest Neighbor
SVM             Support Vector Machine
S3VM            Semi-Supervised Support Vector Machine
LIST OF FIGURES
Figure 1. The classification problem.
Figure 2. A document represented as a feature vector.
Figure 3. Framework of the text classification process.
Figure 4. Hyperplane h separates the training data into two classes, + and -, with the maximum margin. The points closest to h are the support vectors (circled).
Figure 5. The Self-training semi-supervised learning method.
Figure 6. The Co-training semi-supervised learning method.
INTRODUCTION
In recent years, the rapid development of information technology has considerably increased the volume of information exchanged over the Internet, especially through digital libraries and online news. As a result, the number of documents appearing on the Internet is growing at a dizzying rate, and information changes extremely quickly. With such a massive amount of information, a major requirement is to organize and search information and data as effectively as possible. Classification is one reasonable solution to this requirement. However, because the volume of information is so large, classifying data by hand is impossible. The solution is a computer program that classifies this information automatically.
Automatic classification, however, faces a difficulty: building a highly reliable classifier requires a large number of training samples, that is, documents that have already been assigned their class labels. Such training data is usually scarce and expensive because it demands human time and effort. A learning method is therefore needed that does not require much labeled data and that can exploit the abundant unlabeled data available today. That method is semi-supervised learning. Semi-supervised learning uses the information contained in both the unlabeled data and the labeled training set, and it is widely used because of its convenience.
This thesis therefore focuses on the classification problem using semi-supervised learning, and on applying the semi-supervised Support Vector Machine (SVM) algorithm to Web page classification.
The thesis consists of three chapters, organized as follows:
Chapter 1
Data classification usually consists of two steps: building a model (creating the classifier) and using that model to classify data.
Step 1: a model is built by analyzing data objects that have been labeled in advance. This set of samples is called the training data set. The class labels of the training data set are determined by people before the model is built, which is why this approach is also called supervised learning. In this step we must also estimate the accuracy of the model, which requires a test data set. If the accuracy is acceptable (that is, high), the model is used to determine the class labels of new data in the future. When testing the model, measures such as recall, precision and F1 are used.
1.2. Text classification
1.2.1. Problem statement
Paper-based transactions are gradually being digitized into documents stored on computers or transmitted over networks. Digital documents have many advantages: compact storage, long retention, convenient exchange (especially over the Internet) and easy editing. As a result, the number of digital documents is growing rapidly, especially on the World Wide Web. Along with the number of documents, the demand for document search is also growing. Traditionally, documents are classified by hand: a person reads each document, examines it and then assigns it to a specific class. This costs a great deal of time and effort, and because the number of documents is effectively unbounded, assigning every document to a class manually is not feasible. With such a massive number of documents, automatic text classification is an urgent need.
What, then, is text classification? Text classification (Text Categorization) is classification applied to text data, that is, assigning a document to one or more document classes by means of a classification model. This model is built from a set of documents that have been labeled in advance.
Text classification is a field that has attracted considerable attention and research in recent years.
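A document $d_i$ is represented as the vector $(w_{i1}, w_{i2}, \ldots, w_{in})$, where $w_{ij}$ is the weight of the $j$-th feature in $d_i$.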
The dimensionality of the feature space is usually large. The longer the documents and the more topics they cover, the larger the feature space.
The features are independent of one another, and combinations of features usually carry no meaning for classification.
The features are sparse: the feature vector d_i may have many components equal to 0, because many features do not occur in the document d_i (if we use the binary values 1 and 0 to indicate whether or not a feature occurs in the document being represented as a vector). However, if only this binary 0/1 representation is used, the result
[Figure 2. A document represented as a feature vector: a short news passage about hackers and computer viruses is mapped onto vocabulary terms (software, hacker, virus, speed, advanced, ..., generation, event, user, car, screen, computer, TV, beer).]
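The mapping from a tokenized document to a feature vector can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the function names `build_vocabulary` and `to_vector` and the sample tokens are hypothetical, and tokenization is assumed to have been done already.

```python
from collections import Counter

def build_vocabulary(documents):
    """Return a sorted list of all distinct terms in the tokenized documents."""
    return sorted({term for doc in documents for term in doc})

def to_vector(doc, vocabulary, binary=False):
    """Map one tokenized document onto the vocabulary.

    binary=True gives 1/0 for presence/absence of each term;
    binary=False gives the raw occurrence count of each term.
    """
    counts = Counter(doc)
    if binary:
        return [1 if counts[t] > 0 else 0 for t in vocabulary]
    return [counts[t] for t in vocabulary]

# Hypothetical, already tokenized documents.
docs = [
    ["hacker", "virus", "software", "virus"],
    ["computer", "screen", "user"],
]
vocab = build_vocabulary(docs)
tf_vec = to_vector(docs[0], vocab)                 # occurrence counts
bin_vec = to_vector(docs[0], vocab, binary=True)   # presence/absence only
```

Most components of each vector are 0 because a single document contains only a few of the vocabulary terms, which is the sparsity described above.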
... does not represent the content of the document. Representing whole Websites is an approach that currently attracts wide interest: the object of interest is the Website rather than the Web page, meaning that the object being searched for is no longer a single Web page but an entire Website [2, 9].
In traditional text processing, tasks such as representation, search and classification have usually treated Web pages as ordinary text documents and used the vector space model to represent them. The hyperlinks between Web pages provide information about how the contents of pages are related, and this information can be used to improve classification and search. This is how the strength of hyperlinks in documents is exploited. Some researchers have proposed improvements that add keywords from neighboring Web pages, or keywords that occur in the text surrounding a hyperlink.
This thesis studies the vector representation of Web pages because it is a very widely used method. Since link information is used to improve the accuracy of both search and classification of Web pages, information about neighboring Web pages needs to be added to the vector representing the page under consideration.
There are four ways to represent a Web page in the vector model [2]:
First approach
Each keyword in a Web page is stored together with its frequency of occurrence in that page. This approach ignores all information about the position of keywords in the page, the order of words in the page and the hyperlinks.
When the linked documents are independent of the class labels, this representation is the best choice. In some cases, however, it fails to exploit the regularities present in hyperlinked documents.
Second approach
Use the link information of a Web page to concatenate it with its neighboring pages into a super document. The vector contains the words that occur in the page together with all words that occur in its neighboring pages, along with their frequencies. This approach ignores the position and order of the words in the page.
Its drawback is that it dilutes the content of the page of interest. It is a good choice when a set of Web pages on the same topic needs to be represented, but since relatively few linked Web pages share a topic, this representation is rarely used.
Third approach
Use a structured vector to represent the Web page. A structured vector is logically divided into two or more parts. Each part represents one set of neighboring pages. The length of the vector is fixed, but each part represents only the words that occur in one particular set.
This approach prevents the neighboring pages of a Web page from diluting its content. If the information in the neighboring pages is useful for classifying a page, the learner can still access their full content.
Fourth approach
Build a structured vector:
1. Determine a number d, taken as the highest degree of the pages in the set.
2. Build a structured vector with d + 1 parts as follows:
a. The first part represents the document of the Web page itself.
b. The following parts, up to part d + 1, represent its neighboring documents, one document per part.
Of these four approaches, most of the representations that incorporate information about neighboring pages give better classification results than the representation that uses only word frequencies.
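The first three representations can be sketched in a few lines. This is an illustrative sketch under stated assumptions: pages are already tokenized, the names `term_frequencies`, `super_document` and `structured_vector` are hypothetical, and the structured vector uses exactly two parts (the page itself, then all neighbors pooled).

```python
from collections import Counter

def term_frequencies(page):
    """First approach: words of the page with their frequencies."""
    return Counter(page)

def super_document(page, neighbors):
    """Second approach: merge the page with all neighboring pages."""
    merged = Counter(page)
    for neighbor in neighbors:
        merged.update(neighbor)
    return merged

def structured_vector(page, neighbors, vocabulary):
    """Third approach: one part for the page, one part for its neighbors.

    The result has fixed length 2 * len(vocabulary); the neighbors'
    words never mix with the page's own words.
    """
    own = Counter(page)
    pooled = Counter()
    for neighbor in neighbors:
        pooled.update(neighbor)
    return [own[t] for t in vocabulary] + [pooled[t] for t in vocabulary]

# Hypothetical page and two neighboring pages.
page = ["svm", "text", "svm"]
neighbors = [["text", "web"], ["web", "link"]]
vocabulary = ["link", "svm", "text", "web"]

tf = term_frequencies(page)
super_doc = super_document(page, neighbors)
structured = structured_vector(page, neighbors, vocabulary)
```

In `super_doc` the count for "text" no longer shows how much comes from the page itself, which is the dilution effect; `structured` keeps the two sources in separate halves.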
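$$\text{recall} = \frac{\text{true\_positive}}{\text{true\_positive} + \text{false\_negative}} \times 100\% \tag{1.1}$$
$$\text{precision} = \frac{\text{true\_positive}}{\text{true\_positive} + \text{false\_positive}} \times 100\% \tag{1.2}$$
$$F_1(\text{recall}, \text{precision}) = \frac{2 \times \text{recall} \times \text{precision}}{\text{recall} + \text{precision}} \tag{1.3}$$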
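As a small worked example, the three measures can be computed from the counts of true positives, false positives and false negatives. This sketch uses the standard definitions (recall = TP / (TP + FN), precision = TP / (TP + FP)) and returns fractions rather than percentages; the counts are hypothetical.

```python
def recall(tp, fn):
    """Fraction of the truly positive documents that were found."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def precision(tp, fp):
    """Fraction of the documents labeled positive that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def f1(r, p):
    """Harmonic mean of recall and precision."""
    return 2 * r * p / (r + p) if (r + p) else 0.0

# Hypothetical counts: 8 true positives, 2 false positives, 8 false negatives.
r = recall(tp=8, fn=8)      # 0.5
p = precision(tp=8, fp=2)   # 0.8
f = f1(r, p)                # 0.8 / 1.3
```

Because F1 is a harmonic mean, it stays closer to the lower of the two values than an arithmetic mean would.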
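$$\mathrm{score}(x, c_j) = \sum_{d_i \in \{kNN\}} \mathrm{sim}(x, d_i)\, y(d_i, c_j) - b_j \tag{1.4}$$
where:
y(d_i, c_j) belongs to {0, 1}, with:
y = 0: document d_i does not belong to topic c_j
y = 1: document d_i belongs to topic c_j
b_j: the threshold associated with topic c_j
sim(x, d_i): the similarity between the document x to be classified and document d_i. The cosine measure can be used to compute it:
$$\mathrm{sim}(x, d_i) = \cos(x, d_i) = \frac{x \cdot d_i}{\lVert x \rVert\, \lVert d_i \rVert} \tag{1.5}$$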
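The kNN scoring rule with cosine similarity can be sketched as below. This is a minimal illustration, assuming dense vectors, a training set given as (vector, set of topics) pairs, and a per-topic threshold passed in as `threshold`; the names `cosine` and `knn_score` and the sample data are hypothetical.

```python
import math

def cosine(x, d):
    """Cosine similarity between two dense vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(x, d))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_d = math.sqrt(sum(b * b for b in d))
    if norm_x == 0 or norm_d == 0:
        return 0.0
    return dot / (norm_x * norm_d)

def knn_score(x, training, topic, k, threshold=0.0):
    """Sum the similarities of the k nearest neighbors that carry `topic`,
    then subtract the topic threshold."""
    ranked = sorted(training, key=lambda item: cosine(x, item[0]), reverse=True)
    neighbors = ranked[:k]
    total = sum(cosine(x, vec) for vec, topics in neighbors if topic in topics)
    return total - threshold

# Hypothetical training documents with their topics.
training = [
    ([1.0, 0.0], {"sports"}),
    ([0.0, 1.0], {"politics"}),
    ([1.0, 1.0], {"sports"}),
]
x = [1.0, 0.0]
sports_score = knn_score(x, training, "sports", k=2)
politics_score = knn_score(x, training, "politics", k=2)
```

The document x would be assigned to every topic whose score is positive, here only "sports".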
Chapter 2 presents the SVM learning algorithm and the semi-supervised SVM in detail.
Co-training
1. Train two classifiers: f(1) from (Xl(1), Yl) and f(2) from (Xl(2), Yl).
2. Classify Xu with f(1) and f(2) separately.
3. Add the k most confident predictions (x, f(1)(x)) of f(1) to the labeled data of f(2).
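The steps above can be sketched with two simple nearest-centroid classifiers, one per view. This is an illustrative sketch, not the thesis's method: the nearest-centroid learner, the confidence measure (difference of distances to the two centroids), the symmetric exchange from f(2) back to f(1), and the repetition over several rounds are assumptions taken from the usual description of co-training. The initial labeled set must contain both classes.

```python
import math

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def train(examples):
    """examples: list of (vector, label) with labels +1 / -1."""
    return {lab: centroid([v for v, y in examples if y == lab]) for lab in (+1, -1)}

def predict(model, v):
    """Return (label, confidence); confidence is the distance gap."""
    d_pos = math.dist(v, model[+1])
    d_neg = math.dist(v, model[-1])
    return (+1 if d_pos <= d_neg else -1), abs(d_pos - d_neg)

def co_training(labeled, unlabeled, k=1, rounds=5):
    """labeled: list of ((view1, view2), label); unlabeled: list of (view1, view2)."""
    l1 = [(x[0], y) for x, y in labeled]   # labeled data of f(1)
    l2 = [(x[1], y) for x, y in labeled]   # labeled data of f(2)
    pool = list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        f1, f2 = train(l1), train(l2)                     # step 1
        p1 = [(predict(f1, x[0]), x) for x in pool]       # step 2
        p2 = [(predict(f2, x[1]), x) for x in pool]
        best1 = sorted(p1, key=lambda p: p[0][1], reverse=True)[:k]
        best2 = sorted(p2, key=lambda p: p[0][1], reverse=True)[:k]
        l2 += [(x[1], lab) for (lab, _), x in best1]      # step 3
        l1 += [(x[0], lab) for (lab, _), x in best2]      # symmetric step (assumed)
        chosen = {id(x) for _, x in best1} | {id(x) for _, x in best2}
        pool = [x for x in pool if id(x) not in chosen]
    return train(l1), train(l2)

# Hypothetical data: each example has two one-dimensional views.
labeled = [(([0.0], [0.0]), -1), (([1.0], [1.0]), +1)]
unlabeled = [([0.1], [0.2]), ([0.9], [0.8]), ([0.2], [0.1]), ([0.8], [0.9])]
f1, f2 = co_training(labeled, unlabeled)
```

Each classifier only ever sees its own view, so the examples it receives from the other classifier carry information it could not have derived by itself.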
... training is judged to be good. Generalization performance depends on two quantities: the training error and the capacity of the learning machine. The training error is the misclassification rate on the training data set. The capacity of the learning machine is measured by the Vapnik-Chervonenkis dimension (VC dimension). The VC dimension is an important concept for a family of separating functions (that is, a set of classifiers). It is defined as the maximum number of points that the family of functions can separate completely in the object space. A good classifier is one with the lowest capacity (that is, the simplest one) that still guarantees a small training error. The SVM method is built on this idea.
$$C + \sum_{i=1}^{n} w_i x_i = 0 \tag{2.1}$$
(2.2)
(2.3)
where
sign(z) = +1 if z ≥ 0,
sign(z) = -1 if z < 0.
If f(x) = +1 then x belongs to the positive class (the domain of interest); conversely, if f(x) = -1 then x belongs to the negative class (the other domains).
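The classification rule can be illustrated with a short sketch. It assumes that the decision function has the form f(x) = sign(C + Σ w_i x_i), which is consistent with the hyperplane equation and the definition of sign given here; the weights and offset are hypothetical values, not results from the thesis.

```python
def sign(z):
    """+1 if z >= 0, otherwise -1, as defined in the text."""
    return 1 if z >= 0 else -1

def decision(w, C, x):
    """Assumed decision function: sign of the offset plus the weighted sum."""
    return sign(C + sum(wi * xi for wi, xi in zip(w, x)))

# Hypothetical weight vector and offset.
w = [2.0, -1.0]
C = -0.5

label_a = decision(w, C, [1.0, 0.0])    # -0.5 + 2.0 = 1.5, positive class
label_b = decision(w, C, [0.0, 1.0])    # -0.5 - 1.0 = -1.5, negative class
label_c = decision(w, C, [0.25, 0.0])   # exactly on the hyperplane
```

A point lying exactly on the hyperplane gives z = 0 and is assigned to the positive class under this definition of sign.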
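The SVM learning machine is a family of hyperplanes that depends on the weight vector w and the offset C. The goal of the SVM method is to estimate w and C so as to maximize the margin between the positive and negative classes. Different margin values give different families of hyperplanes, and the larger the margin, the lower the capacity of the learning machine. Maximizing the margin is therefore equivalent to finding the learning machine with the smallest capacity. Classification is optimal when the classification error is minimal.
We have to solve the following problem: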
(2.4)
(2.5)
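For orientation, the usual textbook form of the maximum-margin problem can be written in the notation of this section ($w$ for the weight vector, $C$ for the offset). The index $j$ over $m$ labeled training examples $(x_j, y_j)$ with $y_j \in \{+1, -1\}$ is an assumption introduced here; this is a sketch of the standard hard-margin formulation and may differ in detail from equations (2.4) and (2.5).

```latex
\min_{w,\,C}\ \frac{1}{2}\lVert w \rVert^{2}
\quad \text{subject to} \quad
y_j \left( C + w \cdot x_j \right) \ge 1, \qquad j = 1, \dots, m
```

Under these constraints the margin equals $2 / \lVert w \rVert$, so minimizing $\lVert w \rVert$ maximizes the margin.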
... the largest number of features (more than 10,000 dimensions), whereas the other methods use far fewer dimensions (for example, 2,000 for Naïve Bayes and 2,415 for k-Nearest Neighbors).
In his 1999 work [12], Joachims compared SVM with Naïve Bayes, k-Nearest Neighbor, Rocchio and C4.5, and in 2003 [13] he showed that SVM works very well with the properties of text mentioned earlier. The results show that SVM gives the best classification accuracy when compared with the other methods.
According to Xiaojin Zhu [15], the work of many authors (for example Kiritchenko and Matwin in 2001, Hwanjo Yu and Han in 2003, and Lewis in 2004) has shown that the SVM algorithm gives the best results for text classification.
Kiritchenko and Matwin studied and compared SVM with the Naïve Bayes technique and concluded that SVM is the best method for e-mail classification as well as for text classification.
Hwanjo Yu and Han showed that SVM performed best among the text classification methods they compared. The studies cited here agree that SVM gives the most accurate results for text classification.
Lewis studied text classification and found that the results of SVM were the best. Lewis introduced the RCV1 benchmark collection for text classification research and used it to evaluate several text classification techniques. SVM gave the best results when compared with k-nearest neighbors and Rocchio-style prototype classifiers.
The analyses of these authors show that SVM is well suited to text classification. In practice, experiments on English text classification indicate that SVM achieves high classification accuracy and outperforms the other text classification methods.
The basic question in semi-supervised learning is whether unlabeled data can be used to improve classification accuracy, compared with a classifier designed without taking unlabeled data into account.
The rest of this chapter introduces an improved form of SVM, the semi-supervised support vector machine (S3VM) [16, 17]. S3VM was proposed to take SVM one step further: whereas SVM is a supervised learning algorithm that uses labeled data, S3VM uses labeled data (the training set) together with unlabeled data (the working set).
(2.6)
... using the Internet as a training base is very suitable. On Web pages, although the accuracy is not absolute, each topic contains many specialized terms that occur with very high frequency, and exploiting the dependence of these term frequencies on the topic can give promising classification results.
With i = 1, ..., n.
If f(d) ≥ 0, the Web page belongs to class +1.
Otherwise, if f(d) < 0, the Web page belongs to class -1.
(2.6)
Applying the S3VM algorithm to Web page classification thus amounts to substituting the weight vector that represents a Web page into the hyperplane equation of S3VM, which yields the class labels of the unlabeled Web pages.
In semi-supervised classification of Web pages, therefore, the training data set consists of labeled Web pages, while the working set (the unlabeled data) consists of the Web pages that the labeled pages in the training set link to.
According to Vikas Sindhwani, when SVMlin is used for text classification on the RCV1-v2/LYRL2004 data set, with 804,414 labeled examples and 47,326 features, it takes less than two minutes to train a linear SVM on an Intel machine with a 3 GHz processor and 2 GB of RAM. Given only 1,000 labels, it can use hundreds of thousands of unlabeled examples to train a semi-supervised linear SVM in about 20 minutes. Unlabeled data is very useful for improving classification when the number of labels is small.
3.3. Installation
First, unpack the installation file with the following commands:
unzip svmlin.zip
tar xvzf svmlin.tar.gz
3.
... is described in the input file as:
2:3 5:1
1:4 2:1
2:5 3:9 4:2
1:6 4:5 5:3
The labels of the training data are stored in a separate file, called the label file. Each line of this file contains the label of the example on the corresponding line of the data file described above. A label can take the following values:
+1 (labeled example belonging to the positive class)
-1 (labeled example belonging to the negative class)
0 (unlabeled example)
The current version of the SVMlin toolkit can only be applied to binary classification problems.
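Preparing the two input files can be sketched as follows. This is a hedged illustration of the format as described in the text (one example per line as index:value pairs for the nonzero features, with 1-based indices, and one label per line in a separate file); the helper names `to_sparse_line`, `format_examples` and `format_labels` are hypothetical and are not part of SVMlin, and the small matrix is made up for the example.

```python
def to_sparse_line(row):
    """Encode one dense row as 'index:value' pairs for its nonzero entries."""
    return " ".join(f"{j}:{v}" for j, v in enumerate(row, start=1) if v != 0)

def format_examples(matrix):
    """Content of the data file: one sparse line per example."""
    return "\n".join(to_sparse_line(row) for row in matrix) + "\n"

def format_labels(labels):
    """Content of the label file: +1, -1, or 0 (unlabeled), one per line."""
    for y in labels:
        if y not in (1, -1, 0):
            raise ValueError(f"unsupported label: {y}")
    return "\n".join("+1" if y == 1 else str(y) for y in labels) + "\n"

# Hypothetical data: 4 examples with 5 features; the last two are unlabeled.
matrix = [
    [0, 3, 0, 0, 1],
    [4, 1, 0, 0, 0],
    [0, 5, 9, 2, 0],
    [6, 0, 0, 5, 3],
]
labels = [1, -1, 0, 0]

examples_text = format_examples(matrix)
labels_text = format_labels(labels)
```

The two strings would then be written to the data file and the label file; line i of the label file must describe line i of the data file.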
Training
Run the command:
svmlin [options] training_examples training_labels
where:
training_examples.weights: file containing the resulting classification model
training_examples.outputs: file containing the outputs of the classification model
Testing
Run the command:
svmlin -f training_examples.weights test_examples_filename
where:
training_examples.weights: file containing the resulting classification model
test_examples_filename: file containing the test data
Evaluation
If the labels of the test data are known in advance, the following command is used to compute the performance matrix of the classification:
svmlin -f weights_filename test_examples_filename test_labels_filename
Training data
CONCLUSION
Work accomplished in the thesis
The thesis has surveyed several aspects of the classification problem, including data classification methods, text classification and the machine learning algorithms applied to classification, with an emphasis on semi-supervised learning, which is widely used today.
On data classification, the thesis states the general problem, what is given and what is required, and presents the general data classification method so that the reader can gain a first understanding of the problem.
It presents the basics of text classification and how a document is represented in the classification problem, and from there outlines the basic text classification methods in current use.
It studies the machine learning algorithms applied to text classification, including supervised and semi-supervised learning. The focus is on semi-supervised learning: several typical semi-supervised methods are described, and on that basis the semi-supervised SVM algorithm is examined in depth.
The problem of Web page classification with the semi-supervised SVM algorithm is described concretely. The experimental part introduces an open-source software package called SVMlin, explains how to use it and reports the results obtained with it by V. Sindhwani in 2007. I downloaded the software and studied it, but owing to limited time and experience I have not yet mastered it well enough to run the experiments myself.
10. Panu Erästö (2001). Support Vector Machines: Background and Practice. Academic Dissertation for the Degree of Licentiate of Philosophy, University of Helsinki.
11. Paul Pavlidis, Ilan Wapinski, and William Stafford Noble (2004). Support vector machine classification on the web. Bioinformatics, 20(4), 586-587.
12. T. Joachims (1999). Transductive Inference for Text Classification using Support Vector Machines. International Conference on Machine Learning (ICML), 1999.
13. T. Joachims (2003). Transductive learning via spectral graph partitioning. Proceedings of the Twentieth International Conference on Machine Learning (ICML 2003), 290-297.
14. V. Sindhwani, S. S. Keerthi (2006). Large Scale Semi-supervised Linear SVMs. SIGIR 2006.
15. V. Sindhwani, S. S. Keerthi (2007). Newton Methods for Fast Solution of Semi-supervised Linear SVMs. Large Scale Kernel Machines, MIT Press, 2007.
16. Xiaojin Zhu (2005). Semi-Supervised Learning with Graphs. PhD thesis, Carnegie Mellon University, CMU-LTI-05-192, May 2005.
17. Xiaojin Zhu (2006). Semi-Supervised Learning Literature Survey. Computer Sciences TR 1530, University of Wisconsin Madison, February 22, 2006.
18. http://people.cs.uchicago.edu/~vikass/svmlin.html