Unit 2: Supervised Learning - Parametric Methods

Learning a Class from Examples

Let C be the class of "family car". The first input attribute x1 denotes the price and the second input attribute x2 denotes the engine power, so each car is represented by the input vector x = [x1, x2]^T. The class of the car is its label r: r = 1 if x is a positive example (a family car) and r = 0 if x is a negative example. Each car is represented by the ordered pair (x, r), and the training set contains N such examples:

    X = { x^t, r^t }, t = 1, ..., N

where the superscript t indexes the different examples in the set; it does not denote exponentiation.

[Figure: training set for the class of family car. Axes: x1 = price, x2 = engine power. Each data point is plotted at coordinates (x1^t, x2^t); '+' denotes a positive example of the class and '-' denotes a negative example.]

Hypothesis class. The figure also shows the hypothesis class: we assume the class of family car is a rectangle in the price - engine power space,

    (p1 ≤ price ≤ p2) AND (e1 ≤ engine power ≤ e2)

where p1 and p2 are the lower and higher limits of the price, and e1 and e2 are the lower and higher limits of the engine power. H is the hypothesis class: the set of all axis-aligned rectangles of this form. The learning algorithm finds the particular hypothesis h ∈ H, specified by a particular quadruple (p1, p2, e1, e2), that approximates C as closely as possible. Though the expert defines the hypothesis class H, the values of the parameters are not known; that is, though we choose H, we do not know which particular h ∈ H is equal, or closest, to C. Once we restrict our attention to this hypothesis class, learning the class reduces to finding the four parameters that define the best rectangle.

The aim is to find h ∈ H that is as similar as possible to C. The hypothesis h makes a prediction for an instance x:

    h(x) = 1 if h classifies x as a positive example
    h(x) = 0 if h classifies x as a negative example

In real life we do not know C(x), so we cannot evaluate how well h(x) matches C(x). What we have is the training set X, which is a small subset of the set of all possible x. The empirical error is the proportion of training instances where the predictions of h do not match the required values given in X. The error of hypothesis h given the training set X is

    E(h | X) = Σ_{t=1}^{N} 1( h(x^t) ≠ r^t )

where 1(a ≠ b) is 1 if a ≠ b and 0 if a = b.

The hypothesis class H is the set of all possible rectangles. Each quadruple (p1, p2, e1, e2) defines one hypothesis h from H, and we need to choose the best one; in other words, we need to find the values of these four parameters, given the training set, so that the rectangle includes all the positive examples and none of the negative examples. Note, however, that for future instances, especially those close to the boundary between positive and negative examples, different candidate hypotheses may make different predictions. This is the problem of generalization, that is, how well our hypothesis will correctly classify future examples that are not part of the training set.

[Figure: S is the most specific hypothesis and G is the most general hypothesis. Axes: x1 = price, x2 = engine power.]

One possibility is to choose the most specific hypothesis S: the tightest rectangle that includes all the positive examples and none of the negative examples. The most general hypothesis G is the largest rectangle that includes all the positive examples and none of the negative examples. Any h ∈ H between S and G is consistent with the training set, and all such hypotheses together make up the version space.
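The most specific hypothesis S and the empirical error above are simple to compute directly. Below is a minimal Python sketch, assuming the training set is given as a list of ((price, engine_power), label) pairs; the function names fit_tightest_rectangle and empirical_error and the toy data are illustrative, not from the notes.

```python
# Minimal sketch of the axis-aligned rectangle hypothesis for the
# "family car" example. Assumes each training example is a pair
# ((x1, x2), r) with x1 = price, x2 = engine power, r in {0, 1}.

def fit_tightest_rectangle(train):
    """Most specific hypothesis S: the tightest rectangle around the positives."""
    pos = [x for x, r in train if r == 1]
    p1 = min(x1 for x1, _ in pos); p2 = max(x1 for x1, _ in pos)
    e1 = min(x2 for _, x2 in pos); e2 = max(x2 for _, x2 in pos)
    return (p1, p2, e1, e2)

def h(x, rect):
    """h(x) = 1 if x falls inside the rectangle (p1, p2, e1, e2), else 0."""
    p1, p2, e1, e2 = rect
    x1, x2 = x
    return 1 if (p1 <= x1 <= p2) and (e1 <= x2 <= e2) else 0

def empirical_error(rect, data):
    """E(h | X): number of instances whose prediction disagrees with the label."""
    return sum(1 for x, r in data if h(x, rect) != r)

# Usage with a tiny made-up training set (price in thousands, power in hp):
train = [((20, 150), 1), ((25, 180), 1), ((60, 300), 0), ((10, 60), 0)]
S = fit_tightest_rectangle(train)
print(S, empirical_error(S, train))
```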
Vapnik-Chervonenkis (VC) Dimension

N data points can be labeled as positive and negative in 2^N different ways; therefore, 2^N different learning problems can be defined by N data points. If for any of these problems we can find a hypothesis h ∈ H that separates the positive examples from the negative ones, then we say H shatters the N points; that is, any learning problem definable by those N examples can be learned with zero error by a hypothesis drawn from H. The maximum number of points that can be shattered by H is called the Vapnik-Chervonenkis (VC) dimension of H, denoted VC(H), and it measures the capacity of H.

    VC dimension = the maximum number of instances that can be shattered by H

[Figure: an axis-aligned rectangle can shatter four points in two dimensions; for every one of the 2^4 = 16 labelings there is a rectangle separating the positives from the negatives.]

From the figure we see that an axis-aligned rectangle can shatter four points in two dimensions. Hence VC(H) = 4 when H is the hypothesis class of axis-aligned rectangles in two dimensions. This tells us that, using a rectangle as our hypothesis class, we can learn with zero error only datasets containing up to four points in arbitrary position. A learning algorithm that can only learn datasets of four points does not sound very useful; however, this is because the VC dimension is independent of the probability distribution from which the instances are drawn. In real life, instances that are close to each other usually have the same label, so even hypothesis classes with small VC dimension are applicable and useful in practice.

2.3 Probably Approximately Correct (PAC) Learning

In PAC learning, given a class C and examples drawn from some unknown but fixed probability distribution p(x), we want to find the number of examples N such that, with probability at least 1 - δ, the hypothesis h has error at most ε, for arbitrary δ ≤ 1/2 and ε > 0:

    P{ C Δ h ≤ ε } ≥ 1 - δ

where C Δ h is the region of difference between C and h, that is, the region where h makes an error.

[Figure: the difference between C and the tightest rectangle h = S is the union of four rectangular strips, one on each side of the rectangle.]

Take h = S, the tightest rectangle. The region of difference between C and h is the union of four rectangular strips. We want the probability of a positive example falling in this region (and therefore being misclassified) to be at most ε, and for this it is enough that each strip has probability at most ε/4. The probability that a single randomly drawn example misses one strip is 1 - ε/4; the probability that all N independent draws miss that strip is (1 - ε/4)^N; and the probability that all N draws miss any of the four strips is at most 4(1 - ε/4)^N. We require this to be at most δ:

    4 (1 - ε/4)^N ≤ δ

Using the inequality (1 - x) ≤ e^{-x}, it suffices that 4 e^{-εN/4} ≤ δ, which gives

    N ≥ (4/ε) ln(4/δ)

Therefore, provided that we take at least (4/ε) ln(4/δ) independent examples and use the tightest rectangle as our hypothesis h, with confidence probability at least 1 - δ a given point will be misclassified with error probability at most ε. As δ decreases, the confidence probability increases; as ε decreases, N increases; as δ decreases, N increases.
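A quick numerical check of this sample-size bound, as a minimal Python sketch; the function name pac_sample_size and the example values of ε and δ are illustrative.

```python
import math

def pac_sample_size(eps, delta):
    """Smallest integer N satisfying N >= (4/eps) * ln(4/delta),
    the PAC bound derived above for the tightest-rectangle hypothesis."""
    return math.ceil((4.0 / eps) * math.log(4.0 / delta))

# Example: error at most 10% with confidence at least 95%.
print(pac_sample_size(eps=0.1, delta=0.05))   # 176 examples
# Halving the allowed error roughly doubles the required sample size:
print(pac_sample_size(eps=0.05, delta=0.05))  # 351 examples
```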
Noise

Noise is any unwanted anomaly in the data. Due to noise, the class may be more difficult to learn, and zero error may be infeasible with a simple hypothesis class. There are several interpretations of noise:

1. There may be imprecision in recording the input attributes, which may shift the data points in the input space.
2. There may be errors in labeling the data points, which may relabel positive instances as negative and negative instances as positive. This is sometimes called teacher noise.
3. There may be additional attributes, which we have not taken into account, that affect the label of an instance. Such attributes may be hidden or latent, in that they may be unobservable. The effect of these neglected attributes is modeled as a random component and is included in "noise".

When there is noise, there is no simple boundary between the positive and negative instances, and zero misclassification error may not be possible with a simple hypothesis. A rectangle, defined by four parameters, gives a simple boundary; to separate positive from negative instances exactly in the presence of noise, one would need an arbitrarily complicated shape, for example a closed curve with a large number of control points. A rectangle can be defined by four numbers, but to define a more complicated shape one needs a more complex model with a much larger number of parameters. With such a complex model one can make a perfect fit to the training data and attain zero training error. The simple rectangle still makes more sense, for the following reasons:

1. It is a simple model to use. It is easy to check whether a point is inside or outside a rectangle, so for a future data instance we can easily check whether it is predicted to be a positive or a negative instance.
2. It is a simple model to train and has fewer parameters. It is easier to find the corner values of a rectangle than the control points of an arbitrary shape. A simple model has less variance; on the other hand, a too simple model is more rigid and may fail if the underlying class is not that simple: a simple model has more bias. Finding the optimal model corresponds to minimizing both the bias and the variance.
3. It is a simple model to explain. A rectangle simply corresponds to defining intervals on the two attributes. By learning a simple model, we can extract information from the raw data given in the training set.
4. A simple model generalizes better. If indeed there is mislabeling or noise in the input, the actual class is most likely really a simple shape like the rectangle. Given comparable empirical error, a simple (but not too simple) model generalizes better than a complex model. This is the principle known as Occam's razor, which states that simpler explanations are more plausible and any unnecessary complexity should be shaved off.

2.5 Learning Multiple Classes

[Figure: three classes in the price - engine power space: family car, sports car, and luxury sedan. Three hypotheses are induced, one per class; the shaded region is the reject region, where no class, or more than one class, is selected.]

In machine learning for classification, we would like to learn the boundary separating the instances of one class from the instances of all other classes. Thus we view a K-class classification problem as K two-class problems. The training examples belonging to class C_i are the positive instances of hypothesis h_i, and the examples of all other classes are the negative instances of h_i. So in a K-class problem, we have K hypotheses to learn such that

    h_i(x^t) = 1 if x^t ∈ C_i, and 0 if x^t ∈ C_j, j ≠ i

The total empirical error takes a sum over the predictions for all classes over all instances:

    E({h_i}_{i=1}^{K} | X) = Σ_{t=1}^{N} Σ_{i=1}^{K} 1( h_i(x^t) ≠ r_i^t )

where K is the number of classes, N is the number of instances in the training set, and r_i^t = 1 if x^t is a positive example of class C_i and 0 otherwise.

For a given x, ideally only one of h_i(x), i = 1, ..., K, is 1, and then we can choose that class. For example, if h_1(x) = 1 and h_2(x) = h_3(x) = 0, we choose class C_1. When this is not the case, we cannot choose a class: if two or more hypotheses fire, say h_1(x) = 1 and h_2(x) = 1, it is difficult to select a class; likewise, when all h_i(x) = 0, no class claims the instance and again it is difficult to select a class. This is the case of doubt, and the classifier rejects such instances.
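A minimal sketch of this one-versus-all scheme with a reject option, assuming each h_i is any 0/1 classifier (represented here simply as a Python callable); the threshold hypotheses in the usage example are purely illustrative, not the rectangles from the notes.

```python
# One-versus-all classification with a reject option.
# hs is a list of K hypotheses; each maps an instance x to 0 or 1.

def classify(x, hs):
    """Return the index i of the single hypothesis with h_i(x) = 1,
    or 'reject' when none or more than one hypothesis fires."""
    fired = [i for i, h in enumerate(hs) if h(x) == 1]
    return fired[0] if len(fired) == 1 else "reject"

# Usage with toy threshold hypotheses on a single attribute:
hs = [lambda x: 1 if x < 10 else 0,          # class 0
      lambda x: 1 if 10 <= x < 20 else 0,    # class 1
      lambda x: 1 if x >= 25 else 0]         # class 2
print(classify(5, hs))    # 0
print(classify(22, hs))   # 'reject' -- no hypothesis fires
```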
In our example of learning a family car, we used only one hypothesis and only modeled the positive examples; any instance outside the rectangle was taken to be not a family car. Sometimes we may prefer to build two hypotheses, one for the positive instances and one for the negative instances. This assumes a structure also for the negative instances, a structure that can be covered by another hypothesis. Separating family cars from sports cars is such a problem: each class has a structure of its own. The advantage is that if the input is, say, a luxury sedan, we can have both hypotheses decide negative and reject the input.

If in a dataset we expect all classes to have similar distributions, this is natural. For example, in a handwritten digit recognition dataset, we would expect all digits to have similar distributions. But in a medical diagnosis dataset, for example, where we have two classes for sick and healthy people, we may have completely different distributions for the two classes: there may be many different ways of being sick, each sick person differing in his or her own way, whereas healthy people are alike.

Regression

In classification, given an input, the output generated is Boolean; it is a yes/no answer. When the output is a numeric value, we have to make the machine learning model learn a numeric function. In machine learning, the function is not known, but we have a training set of examples drawn from it:

    X = { x^t, r^t }, t = 1, ..., N, with r^t ∈ R

If there is no noise, the task is interpolation: we would like to find the function f(x) that passes through these points so that r^t = f(x^t). In polynomial interpolation, given N points, we find the (N - 1)st degree polynomial that we can use to predict the output for any x. If x is outside the range of the x^t in the training set, this is called extrapolation; for example, in time-series prediction we have data up to the present and we want to predict the value for the future.

In regression, there is noise added to the output of the unknown function:

    r^t = f(x^t) + ε

where f(x) ∈ R is the unknown function and ε is random noise. The explanation for this noise is that there are extra hidden variables z^t that we cannot observe:

    r^t = f*(x^t, z^t)

where f* is the unknown function defined on both the observed inputs and the hidden variables. We would like to approximate the output by our model g(x). The empirical error on the training set X is

    E(g | X) = (1/N) Σ_{t=1}^{N} ( r^t - g(x^t) )^2

Because r and g(x) are numeric values, there is an ordering defined on their values and we can define a distance between them, such as the square of the difference, which gives us more information than the equal/not-equal check used in classification. The square of the difference is one error (loss) function that can be used; another is the absolute value of the difference. Our aim is to find g(x) that minimizes the empirical error. If we assume g(x) is linear,

    g(x) = w_1 x_1 + ... + w_d x_d + w_0 = Σ_{j=1}^{d} w_j x_j + w_0

For a single input attribute, the model is a line:

    g(x) = w_1 x + w_0

where w_1 and w_0 are the parameters to learn from the data. Their values should minimize

    E(g | X) = (1/N) Σ_{t=1}^{N} ( r^t - (w_1 x^t + w_0) )^2

Its minimum can be calculated by taking the partial derivatives of E with respect to w_1 and w_0, setting them equal to 0, and solving for the two unknowns:

    w_1 = ( Σ_t x^t r^t - N x̄ r̄ ) / ( Σ_t (x^t)^2 - N x̄^2 )

    w_0 = r̄ - w_1 x̄

where x̄ = Σ_t x^t / N and r̄ = Σ_t r^t / N.
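These two closed-form expressions are easy to verify numerically. Below is a minimal Python sketch (the function name fit_line and the data are illustrative), applied to noiseless samples from r = 2x + 1, where the fit should recover w_1 = 2 and w_0 = 1.

```python
# Least-squares fit of g(x) = w1 * x + w0 using the closed-form
# solution obtained from the partial derivatives of E(g | X).

def fit_line(xs, rs):
    N = len(xs)
    x_bar = sum(xs) / N
    r_bar = sum(rs) / N
    w1 = (sum(x * r for x, r in zip(xs, rs)) - N * x_bar * r_bar) / \
         (sum(x * x for x in xs) - N * x_bar ** 2)
    w0 = r_bar - w1 * x_bar
    return w1, w0

# Usage: data generated from r = 2x + 1 (no noise), so we expect (2.0, 1.0).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
rs = [2 * x + 1 for x in xs]
print(fit_line(xs, rs))
```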
If the linear model is too simple, it is too constrained and incurs a large approximation error; in such a case, the output may be taken as a higher-order function of the input, for example, quadratic:

    g(x) = w_2 x^2 + w_1 x + w_0

But a higher-order polynomial follows the individual examples closely instead of capturing the general trend. This implies that Occam's razor also applies in the case of regression, and we should be careful when fine-tuning the model complexity to match it with the complexity of the function underlying the data.

Model Selection and Generalization

With d binary inputs, there are 2^d possible distinct inputs, so the training set contains at most 2^d examples. Each of these can be labeled 0 or 1, and therefore there are 2^(2^d) possible Boolean functions of d inputs. For d = 2, the sixteen possible functions are shown below:

    x1 x2 | h1 h2 h3 h4 h5 h6 h7 h8 h9 h10 h11 h12 h13 h14 h15 h16
     0  0 |  0  0  0  0  0  0  0  0  1   1   1   1   1   1   1   1
     0  1 |  0  0  0  0  1  1  1  1  0   0   0   0   1   1   1   1
     1  0 |  0  0  1  1  0  0  1  1  0   0   1   1   0   0   1   1
     1  1 |  0  1  0  1  0  1  0  1  0   1   0   1   0   1   0   1

Each distinct training example removes half the remaining hypotheses, namely, those whose guesses are wrong. For example, if we see x1 = 0, x2 = 1 with output 0, this removes h5, h6, h7, h8, h13, h14, h15, h16. This is one way to interpret learning: we start with all possible hypotheses, and as we see more training examples, we remove those hypotheses that are not consistent with the training data. In the case of a Boolean function, to end up with a single hypothesis we need to see all 2^d training examples. But the training set we are given usually contains only a small subset of all possible instances; that is, we know what the output should be for only a small percentage of the cases, so the solution is generally not unique: after seeing N distinct example cases, there remain 2^(2^d - N) possible functions. This is an example of an ill-posed problem, where the data by itself is not sufficient to find a unique solution.
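A minimal sketch of this hypothesis-elimination view of learning, assuming d = 2 binary inputs. Each candidate hypothesis is represented by its four output bits, and each training example removes the hypotheses that disagree with it; the function name eliminate and the example labels are illustrative.

```python
from itertools import product

# All 2^(2^d) Boolean functions of d = 2 inputs, each represented as a
# dict mapping the input pair (x1, x2) to an output bit.
inputs = list(product([0, 1], repeat=2))                 # (0,0), (0,1), (1,0), (1,1)
hypotheses = [dict(zip(inputs, outs)) for outs in product([0, 1], repeat=4)]
print(len(hypotheses))                                   # 16

def eliminate(hypotheses, x, r):
    """Keep only the hypotheses consistent with the example (x, r)."""
    return [h for h in hypotheses if h[x] == r]

remaining = eliminate(hypotheses, (0, 1), 0)             # the example from the text
print(len(remaining))                                    # 8 hypotheses left
remaining = eliminate(remaining, (1, 1), 1)
print(len(remaining))                                    # 4 hypotheses left
```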
Because learning is ill-posed, and data by itself is not sufficient to find the solution, we should make some extra assumptions to obtain a unique solution with the data we have. The set of assumptions we make to have learning possible is called the inductive bias of the learning algorithm. We know that each hypothesis class has a certain capacity and can learn only certain functions. The class of functions that can be learned can be extended by using a hypothesis class with larger capacity, containing more complex hypotheses. For example, the hypothesis class that is a union of two rectangles has higher capacity, but its hypotheses are more complex. Similarly, in regression, as we increase the order of the polynomial, the capacity and complexity increase. The question now is to decide where to stop.

Thus learning is not possible without inductive bias, and the next question is how to choose the right bias. This is called model selection, which is choosing between possible hypothesis classes H. In answering this question, we should remember that the aim of machine learning is rarely to replicate the training data but to predict correctly for new cases. That is, we would like to be able to generate the right output for an input instance outside the training set, one for which the correct output is not given in the training set. How well a model trained on the training set predicts the right output for new instances is called generalization.

For best generalization, we should match the complexity of the hypothesis class H with the complexity of the function underlying the data. If H is less complex than the function, we have underfitting, for example, when trying to fit a line to data sampled from a third-order polynomial. In such a case, as we increase the complexity, the training error decreases. But if we have an H that is too complex, the data is not enough to constrain it and we may end up with a bad hypothesis h ∈ H, for example, when fitting two rectangles to data sampled from one rectangle. Or, if there is noise, an overcomplex hypothesis may learn not only the underlying function but also the noise in the data and may make a bad fit, for example, when fitting a sixth-order polynomial to noisy data sampled from a third-order polynomial. This is called overfitting. In such a case, having more training data helps, but only up to a certain point. Given a training set and H, we can find the h ∈ H that has the minimum training error, but if H is not chosen well, no matter which h ∈ H we pick, we will not have good generalization.

Triple trade-off. In all learning algorithms that are trained from sample data, there is a trade-off between three factors (Dietterich 2003):
1. the complexity of the hypothesis we fit to the data, namely, the capacity of the hypothesis class,
2. the amount of training data, and
3. the generalization error on new examples.
As the amount of training data increases, the generalization error decreases. As the complexity of the model class H increases, the generalization error decreases first and then starts to increase. The generalization error of an overcomplex H can be kept in check by increasing the amount of training data, but only up to a point: if the data is sampled from a line and we are fitting a higher-order polynomial, the fit will be constrained to lie close to the line where there is training data in the vicinity, but it may behave erratically where there is none.

We can measure the generalization ability of a hypothesis, namely, the quality of its inductive bias, if we have access to data outside the training set. We simulate this by dividing the dataset into a training set and a validation set; the validation set is used to test the generalization ability. That is, given a set of possible hypothesis classes H_i, for each one we fit the best h_i ∈ H_i on the training set. Then, assuming large enough training and validation sets, the hypothesis that is most accurate on the validation set is the best one. This process is called cross-validation. So, for example, to find the right order in polynomial regression, given a number of candidate polynomials of different orders (where the polynomials of different orders correspond to the H_i), for each order we find the coefficients on the training set, calculate their errors on the validation set, and take the one with the least validation error as the best polynomial. (A minimal sketch of this procedure is given at the end of these notes.)

To report the error, we should use a separate test set: we divide the data into training, validation, and test sets. If we choose between two hypothesis classes H_i and H_j, we use them both multiple times on a number of training and validation sets and check whether the difference between the average errors of h_i and h_j is larger than the average difference between multiple runs of h_i.

Dimensions of a Supervised Machine Learning Algorithm

Consider a sample

    X = { x^t, r^t }, t = 1, ..., N

The sample is independent and identically distributed (iid): the ordered pairs of the N instances are drawn from the same joint distribution p(x, r). Here t indexes one of the N instances, x^t is the arbitrary-dimensional input, and r^t is the associated desired output: r^t is 0/1 in two-class learning, a K-dimensional binary vector in K-class classification, and a real value in regression. The aim is to build a good and useful approximation to r^t using the model g(x^t; θ). In doing this, there are three decisions we must make:
1. the model we use, g(x | θ), ...
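Returning to the cross-validation procedure for choosing the polynomial order described above, here is a minimal sketch assuming NumPy is available; the data are made-up noisy samples from a third-order polynomial, so the selected order should typically come out near 3.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: noisy samples from a third-order polynomial.
x = rng.uniform(-2, 2, size=60)
r = x**3 - 2 * x + 1 + rng.normal(scale=0.3, size=x.shape)

# Split into training and validation sets.
idx = rng.permutation(len(x))
train, val = idx[:40], idx[40:]

def val_error(order):
    """Fit a polynomial of the given order on the training set and
    return its mean squared error on the validation set."""
    coeffs = np.polyfit(x[train], r[train], deg=order)
    pred = np.polyval(coeffs, x[val])
    return np.mean((r[val] - pred) ** 2)

# Keep the candidate order with the least validation error.
orders = range(1, 9)
best = min(orders, key=val_error)
print(best, val_error(best))
```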
