Data Mining with Neural Networks
Solving Business Problems from Application Development to Decision Support

McGraw-Hill

Printed and bound by R. R. Donnelley and Sons Company.

This book contains simple examples to provide illustration. The data used in the examples is fictitious and does not correspond to any real companies or people. The author and publisher of this book have used their best efforts in preparing this book. The author and publisher make no warranty of any kind, expressed or implied, with regard to the documentation and examples contained in this book. The author and publisher shall not be liable in any event for incidental or consequential damages in connection with, or arising out of, the furnishing of the information in this book.

The views expressed in this book are the views of the author and do not necessarily reflect the views of the IBM Corporation. The book is not sponsored in any part, or in any manner, by IBM.

To Jennifer, my wife and best friend.

Contents

Part 1. The Data Mining Process Using Neural Networks

Chapter 1. Introduction to Data Mining
  The Evolution of Information Technology
  Developing Business Applications
  Example Data Mining Applications

Chapter 2. Introduction to Neural Networks
  Neural Networks: A Data Mining Engine
  Historical Perspective
  Neural Computing: Symbol Manipulation or Pattern Recognition?
  Computer Metaphor Versus Brain Metaphor
  The Neural Processing Element

Chapter 3. Data Preparation
  Data Selection
  Data Representation
  Impact on Training Time
  Data Quantity

Chapter 4. Neural Network Models and Architectures

Chapter 5. Training and Testing Neural Networks
  Controlling the Training Process with Learning Parameters
  References

Chapter 6. Analyzing Neural Networks for Decision Support
  Rule Generation from Neural Networks

Chapter 7. Deploying Neural Network Applications
  Application Deployment
  Should the Network Stop Learning?
  Monitoring Neural Network Performance
  When Is Retraining Not Enough?
  References

Chapter 8. Intelligent Agents and Business Data Mining
  Types of Agents
  Adding Domain Knowledge through Rules

Part 2. Data Mining Application Case Studies

Chapter 9. Market Segmentation

Chapter 10. Real Estate Pricing Model
  Data Representation
  Deploying and Maintaining the Application

Chapter 11. Customer Ranking Model
  Data Representation
  Training and Testing the Neural Network
  Sensitivity Analysis
  Related Applications and Discussion

Chapter 12. Sales Forecasting
  Training and Testing the Neural Network
  Deploying the Application

Appendix A. IBM Neural Network Utility
Appendix B. Fuzzy Logic
Appendix C. Genetic Algorithms

Acknowledgments

I would like to extend my thanks to the many people who have helped me write this book. First, I would be remiss not to thank my children, Sarah and …, who gave up many opportunities to play computer games so that I could meet my schedule. My wife, Jennifer, put up with my working strange hours (stranger than usual) while I juggled my job at IBM with the writing chores. I truly could not have written this book without her help and support. I would also like to thank my manager for his understanding while I simultaneously completed the book and participated in a worldwide software design and development effort.

All of the graphics in the book were created by Gordy Hall. I would like to thank Gordy for his work and his patience as we turned my ideas into illustrations. If a picture is worth a thousand words, then Gordy saved me plenty of writing.

This book has benefited from the comments and suggestions of Jennifer Bigus, Cindy Hitchcock, Jeff Pilgrim, and several other reviewers. I thank them for the time spent reading through my drafts and for sharing their opinions with me.
I would also like to thank the colleague who followed up after our first meeting at the IBM Technical Interchange in New Orleans. The request for a "book about neural network applications" prompted me to submit my proposal and resulted in this book being written.

Over the years, many people at IBM have helped develop the Neural Network Utility products. These include Shawn Austvold, Steve Lund, Scott Peterson, Helen Fung, Matt Latcham, Cindy Hitchcock, Jeff Pilgrim, and many others. I would like to thank these people for the contributions they made, and continue to make, to the NNU technology.

Introduction

In my position at IBM, I regularly brief executives, managers, and computer professionals. I cover the fundamentals of data mining and neural networks, and I also discuss specific applications relevant to the customers' businesses. In these briefings, my goal is to give the audience a basic understanding of data mining and to spark their imaginations so they can visualize how the technology can be used in their own enterprises. When I succeed, I can see their excitement as they "ponder the possibilities." In the question-and-answer period following my presentations, I am invariably asked for a recommendation on a "good book about neural networks" so they can learn more. With few exceptions, these people do not want to know how neural networks work; they want to know how neural networks can be applied to solve business problems, in terminology they can understand, and with real-world examples to which they can relate. While there are many neural network books available today, most focus on the inner workings of the technology.
These texts approach neural networks from either a cognitive science or an engineering perspective, with corresponding emphasis either on philosophical arguments or on detailed treatment of the complex mathematics underpinning the various neural network models. Other neural network books discuss academic applications, which have little or no relation to real business problems, and are full of C or C++ source code showing nitty-gritty implementation details. None of these fits my idea of a "good book on neural networks" that is appropriate for a business-oriented audience.

This book, however, is targeted directly at executives, managers, and computer professionals by explaining data mining and neural networks from a business information systems and management perspective. It presents data mining with neural networks as a strategic business technology, with the focus on the practical, competitive advantages they offer. In addition, the book provides a general methodology for neural network data mining and application development, using a variety of realistic problems as examples. The examples are developed using a commercially available neural network data mining tool, the IBM Neural Network Utility.

Data mining, the idea of extracting valuable information from data, is not new. What is new is the wholesale computerization of business transactions and the consequential flood of business data. What is new is the inexpensive computer processing and storage technologies which allow gigabytes and terabytes of data to remain online, available for processing by client/server applications. What is new is the use of neural networks and the development of new algorithms for knowledge discovery. When combined, these new capabilities offer a lifeline to businesses drowning in a sea of their own data.

Neural networks are a computing technology whose fundamental purpose is to recognize patterns in data. Based on a computing model similar to the underlying structure of the human brain, neural networks share the brain's ability to learn or adapt in response to external inputs. When exposed to a stream of training data, neural networks can discover previously unknown relationships and learn complex nonlinear mappings in the data. Neural networks provide some fundamental, new capabilities for processing business data.
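A minimal sketch may help make this concrete: a single neural processing element forms a weighted sum of its inputs and squashes the result through a nonlinear activation function. The inputs, weights, and bias below are made-up values for illustration only, not anything from the text.

```python
import math

def neuron(inputs, weights, bias):
    """A single neural processing element: a weighted sum of the inputs
    passed through a sigmoid activation, as in classic feedforward nets."""
    activation = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-activation))  # squash to (0, 1)

# Hypothetical inputs and learned weights, for illustration only.
output = neuron([0.5, 0.2], [0.8, -0.4], bias=0.1)
```

Training a network amounts to adjusting the weights and bias, over many examples, so that outputs like this one move closer to the desired responses; that is the "learning" the brain metaphor refers to.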
However, applying these new neural network data mining techniques requires a completely different application development process from traditional programming. So even though the commercial use of neural network technology has surged in the past few years, constructing successful neural network applications is still considered a black art by many in the software development community. As I will show, this is an outdated and inaccurate perception of the state of the art of neural network application development.

This book presents a comprehensive view of all the major issues related to data mining and the practical application of neural networks to solving real-world business problems. Beginning with an introduction to data mining and neural network technologies from a business orientation, the book continues with an examination of the data mining process and ends with application examples. Appendices describe related technologies such as fuzzy logic and genetic algorithms.

The data mining process starts with data preparation issues, including data selection, cleansing, and preprocessing. Next, neural network model and architecture selection is explained, with the focus on the problems the various models can solve, not the mathematical details of how they solve them. Then the neural network training and testing process is described, followed by a discussion of the use of data mining for decision support and application development. Automated data mining through the use of intelligent agents is also presented.

The application case studies deal with common business problems. The specific examples are chosen from a broad range of industries in order to be relevant to most readers. The data mining functions of classification, clustering, modeling, and time-series forecasting are illustrated in the examples.

When you finish reading this book, you will know what data mining is, what problems neural networks can solve today, how to determine if a problem is suitable for a neural network solution,
how to set up the problem for solution, and finally how to solve it. In short, I will try to illuminate the "black art" of developing neural network applications and place it in context with other application development technologies such as object-oriented computing and incremental development and prototyping techniques.

For business executives, managers, and computer professionals, the book provides a thorough introduction to neural network technology and the issues related to its application without getting bogged down in complex mathematical details. The reader will be able to identify common business problems that are amenable to the neural network approach and will be sensitized to the issues that can affect the successful completion of such projects.

The book uses clear, nontechnical language and case studies to explore the issues involved in using neural networks to solve business problems. It can be used by application developers who want to implement solutions using commercial neural network tools, or by managers trying to understand how neural networks can be applied to their businesses. Each chapter includes a summary at the end, along with a reference list for further reading.

Part 1 spans eight chapters, beginning with introductory chapters 1 and 2, and then presents a comprehensive methodology and overview of the key issues in data mining with neural networks for decision support and application development.

Chapter 1 describes the business and information technology trends that are contributing to the requirements for data mining applications. Key developments, including the corporate data warehouse and the distributed computing models, are discussed. The major steps in the data mining process are detailed, and a data mining architecture is presented. Data mining for enhanced decision support and for application development is examined.
A catalog of example data mining applications in specific industries is described.

In Chapter 2, neural networks are introduced as a fundamentally new computing and problem-solving paradigm for approaching data mining applications. Neural computing is presented as an alternative path in the evolution of intelligent computing, a path that was dismissed by symbolic artificial intelligence. The key factors responsible for the initial rejection and the recent reemergence of neural networks are discussed. Chapter 2 also describes the paradigm shift required for problem solving with neural networks as opposed to traditional computer programming. An example of a knowledge worker is used to compare and contrast the two approaches. Next, the neural processing element and the mechanisms for adaptive behavior are discussed. Then I present the basic neural network computing functions: classification, clustering, modeling, and time-series forecasting.

Chapter 3 discusses the data preparation step, beginning with an overview of the current state of the art in database management systems. Next, I highlight the importance of data selection and representation to the neural network application development process. Data representation schemes for numeric and symbolic variables using real numbers and coded data types are covered. Data preprocessing operations, including symbolic mapping, taxonomies, and scaling or thresholding of numeric values, are described. Common techniques for data set management, including the quantity and quality of the data, are discussed.

Chapter 4 presents a discussion of the basic neural network training paradigms, including supervised, unsupervised, and reinforcement learning. Types of neural network models and their capabilities are described.
The focus is on the problems each model can solve, not on the underlying mathematics.

Chapter 5 walks the reader through a typical neural network development process. First I highlight the importance of selecting an appropriate error measure to indicate when the network training is complete. Next, I describe the most important training parameters that control the final quality of the trained neural network. The iterative neural network development process is examined, and I give a feel for the "normal" events and how to detect "abnormal" problems.

Chapter 6 discusses the analysis of neural network models created through data mining. This process of discovering what the neural network learned is required for decision support applications. The chapter presents the most common techniques for visualization of neural networks and data mining results. Rule generation from neural networks and input sensitivity analysis are also described.

Chapter 7 describes the use of trained neural networks for the deployment of operational applications. It discusses data preprocessing requirements at run time, how neural networks can be treated as simple subroutines, and how neural network prediction accuracy can be monitored. Application maintenance issues are also addressed.

Chapter 8 deals with the topic of intelligent agents and how data mining techniques and neural networks can be used to build learning and adaptive intelligent agents. As computer systems become more and more complex, users are increasingly looking to advanced software to ease their burdens. Intelligent agents can automate both user and system management tasks that require pattern recognition and domain knowledge.

Part 2 gives four detailed examples of how neural networks can be applied to solving business problems. Each application follows the data mining methodology presented in Part 1, including a discussion of how the specific example given can be generalized to solve other similar business problems. A comprehensive set of application references is provided. Each chapter in Part 2 can stand alone; no order is implied or required.

Chapter 9 combines customer database and sales transaction data and
uses neural networks to segment the customers by creating clusters based on similarities of the customer attributes. This information can then be used to target promotions at members of the groups who have the attributes in which we are interested. This chapter includes a discussion of the uses of clusters or segments in data mining applications.

Chapter 10 uses market data on properties and selling prices to build a price estimator for a real estate appraisal application. This case has multiple input variables and one output (the market price of the property). Any business that makes proposals or bids can use its past pricing experience in a similar way.

Chapter 11 mines customer profile information to rank customers. This application uses the information a business has available on current and past customers to build a neural network model that ranks them in order of "goodness" (i.e., profitability). Prospective new customers can then be targeted or selected using their expected profitability.

Chapter 12 uses an inventory and sales transaction database to build a replenishment system. In this time-series forecasting application, we deal with data that changes over time. The idea is to use past history to predict future behavior. Issues unique to forecasting are discussed in depth.

Appendix A presents an overview of the IBM Neural Network Utility products and their capabilities. The focus is on features of the product that support the data mining and neural network application methodology presented in Part 1.

Appendix B is an introduction to fuzzy sets and fuzzy logic. Often used in conjunction with neural networks, fuzzy logic, through fuzzy expert systems, provides an excellent way to add domain knowledge to data mining operations.
The final appendix describes evolutionary programming and genetic algorithms. Like neural networks, genetic algorithms are biologically inspired. They use a metaphor for the process of natural selection to perform parallel searches. Genetic algorithms are used to find optimal neural network architectures and to adjust connection weights.

The glossary provides a list of the most common terms used in data mining and neural network application development. The annotated bibliography contains a resource list of neural network reference books and business-oriented application papers and articles, with brief descriptions of their content.

Part 1. The Data Mining Process Using Neural Networks

Part 1 presents a methodology for data mining with neural networks. Structured around the major steps of data preparation, data mining, and analysis of the mining results, the eight chapters in this section highlight the issues specific to neural network algorithms. The Introduction mentioned the "black art" label often used to refer to the neural network development process. While perhaps not strictly a cookbook approach, a careful reading of this material will considerably enhance your chances of successfully training your neural network. For those familiar with traditional data analysis and model building, and for those used to object-oriented development, this process will seem comfortably familiar. The emphasis is on the key steps and practical considerations, not the theoretical issues involved.

Part 1 begins with an introduction to data mining and neural networks. Then the discussion turns to the many aspects of data preparation, the first step required for data mining, regardless of the data mining algorithm used. Of specific importance to neural networks is the representation of the data, so the common representations and data types used are discussed.
The key aspects that differentiate neural networks (training paradigm, topology, and learning algorithms) are covered in detail. I describe the training process, starting first with the definition of "success" and then describing the most important learning parameters used to control that process. After neural network training, I explore how to deploy and maintain neural network applications. Part 1 ends with a look at intelligent agent technology. A sometimes symbiotic relationship exists between intelligent agents and data mining: agents can control the mining of data, while data mining can be used to add learning capabilities to intelligent agents.

Chapter 1. Introduction to Data Mining

In this chapter, I discuss the business environment and information technology trends that have made data mining both necessary and achievable. I provide a formal definition for data mining and describe the major steps in the data mining process. Finally, I present a list of the major data mining applications that have been developed using neural network technology.

Data Mining: A Modern Business Requirement

Being a business manager or computing professional today is anything but dull. As wave after wave of new information technology hits the market and slowly gets assimilated into daily operations, the risks (and rewards) grow bigger for those who place their bets on the technology roulette wheel. Get it right, and you might gain several points of market share at your competitors' expense. Get it wrong, or do nothing, and you might have to spend years trying to recover lost ground. As the old Chinese proverb says, "May you live in interesting times." We information technology workers certainly do.

Over the past three decades, the use of computer technology has evolved from the piecemeal automation of certain business operations, such as accounting and billing, to today's integrated computing environment, with end-to-end automation of all major business processes. Not only has the computer technology changed; how that technology is viewed and how it is used in a business has changed. From the new hardware configurations using local and wide area networks for distributed client/server computing to the software emphasis on object-oriented programming,
these changes support one overriding business requirement: process more data, faster, in ever more complex ways.

In 1981, the IBM PC was introduced. Costing just $3000, it used a 16-bit Intel 8088 processor, 64 kilobytes (KB) of RAM, and a single 5.25" floppy drive. The first hard drive available was a Seagate 5.25" Winchester hard drive, which stored a whopping 5 megabytes (MB) of data. In late 1996, $3000 will buy a PC with an Intel Pentium processor, 16 MB of RAM, and a 1.2-gigabyte (GB) hard drive. In just 15 years, the amount of online storage available in a $3000 PC has increased more than 200 times. Midrange systems have grown in step; in 1996, an IBM AS/400 Advanced System installation can hold terabytes of data spread across multiple AS/400 systems.

From paper tape, to punch cards, to magnetic drums, to the relentless advances in direct access storage devices (DASD in IBM terminology, otherwise known as hard disk drives), the increases in both data storage capacities and device reliability have been phenomenal. Figure 1.1 shows the recent explosion in the amount of information stored on midrange computer systems, a supposedly dying breed, from 1990 through 1996 and projected through 1998. I have met with IBM customers who are generating gigabytes of data a day. They are literally unable to store all of that data online and have to put it onto tape for backup storage. This is like being a grain farmer with a bumper crop who has to let it rot in the field because he doesn't have storage. Just like a crop in the field, transaction information decreases in value as it ages, and the cost of planting the crop (gathering the data) has already been paid.

Increasingly, business data is seen as a valuable commodity in its own right, not just a record of processing the day's transactions. Today's operational data represents the current state of your business. When it is combined with historical business data, it can tell you where you are going and where you have been. By taking operational data and archiving it to tape, you might be protecting the data, but you are neglecting it as well. With business decisions being made at a breakneck pace, managers and executives need information which can help them make those decisions.
All that information needs to be online and accessible. The old query and reporting tools have long since lost the ability to keep up with these information needs. New client/server software that allows freeform queries has helped. But query

Figure 1.1 Growth in worldwide gigabytes of storage shipped on midrange systems, 1990 through 1998. (Source: International Data Corp.)

tools only help if you know what you are looking for. Multidimensional databases, which provide three-dimensional views of data, and online analytical processing (OLAP) tools are certainly enhancing business data analysis capabilities. However, even they don't ensure today's competitive advantage.

What about all the information buried in your customer transaction files? Maybe there's trend data that says your customers are switching to a different product, a trend you will completely miss if you don't react. Maybe there is a string of transactions that are totally out of character for a customer and signal a significant change. Maybe the crucial requirements for your next product in development are hidden in the past purchases made by customers in your larger markets. Wouldn't it be nice to have that information? After all, wasn't that part of the business case for computerizing business operations in the first place? Wouldn't it be nice to really cash in on that investment in computing technology?

If you answered yes to these questions, then you are not alone. Increasingly, people want to leverage their investments in business data, to use it as an aid in decision making, not just for operational applications. Data mining promises to do just that. More than just complex queries, data mining provides the means to discover information in raw business data. In many situations, it has become a business imperative. If you are not mining your data for all it is worth, you are guilty of underuse of one of your company's greatest assets. Buried in that data is information about your customers and the products they buy. As you will see, the old business maxim of "know your customer" is attainable today using data mining techniques.
The Evolution of Information Technology

The evolution in business computing over the past 20 years has been dramatic. It is often difficult to determine which came first: the change to the flatter business organization or the new distributed computing capabilities. While the raw processing power and storage capabilities of computers have expanded at an astonishing pace, the business community has used that advance in computer data processing in stages. In many ways, early business computing mimicked the hierarchical command and control management used by large corporations. The management information systems (MIS) organization, in its glass house (the raised floors and air-conditioned computer rooms required by the mainframes), controlled access to and processing of all corporate data.

The development of minicomputers, or "departmental computers," was somewhat an extension of the mainframe paradigm and somewhat a precursor of the future. These computers allowed groups of people working on a common task to control some of their computing environment, though usually with the guidance and blessing of the central MIS organization. Rather than wait for a new report or application to be developed by the MIS organization, a department would hire its own programmers or software engineers to solve its own computing problems.

In the 1980s, the personal computer completely changed the dynamics of business computing, though it was some time before the central MIS organization and the business management realized this. Now an individual could purchase a PC and applications software and work at his or her own desk to solve daily problems. The development of spreadsheets and word processors gave the business justification for these purchases. Over time, the environment evolved from a sprinkling of PCs or workstations in the organization to the point where nearly every knowledge worker has a PC on his or her desk.
While this evolution from centralized to distributed computing has dramatically changed how and where data processing is performed in an organization, perhaps the biggest impact has been on how business data is created and managed.

The Data Warehouse

Looking at this computing evolution from a business data perspective raises some interesting issues. In the host-centric computing model, the corporate data was stored on the central computer. This allowed the MIS organization to manage the valuable corporate information, to safeguard it from theft or damage, and to collect the new business data that was created through business transactions. Of course, one downside to this central control of data was that knowledge workers who needed to access the data often had to wait a long time for the MIS organization to respond to their needs. Writing new COBOL or RPG programs to generate reports takes time (and programmers).

As departmental systems were introduced, the workgroup could exert control over its own information. The departmental data was sometimes out of date, but this was a small price to pay for relatively easy access to the information. Conversely, if the department generated new data, then the information had to be moved up to the corporate data repository. While this caused some problems for central MIS, they were usually more than happy to get the application backlog down by passing that work to the departmental users. There was also some cost due to the distribution of data, but again, that was a price they had to pay for more responsive computing.

When large numbers of stand-alone PCs were brought into offices, a real problem began to surface in the management of key business data. Now the crack financial analyst crunching numbers on his or her PC spreadsheet had key business data that was totally out of the purview of the central MIS organization. Who would back up the data? Who would ensure its security? As the PCs were connected to the corporate computer network, some of these problems were solved. The knowledge worker could download the information from the departmental or mainframe computers,
process the data with PC-based applications, and then return the data to the corporate server. Today, the remote administration of PCs by MIS has brought this problem almost full circle, back to the days when all of the business data was under central control (well, not completely).

One problem with the proliferation of computers throughout the business is the large number of databases scattered across systems. As the databases in departmental and PC systems grew, the data was not always passed up to the centralized system. Over time, information about customers, employees, product designs, and manufacturing operations was stored in separate databases. While moving the data under centralized control is desirable, because of departmental realities, many of these databases remain where they were originally deployed. They are needed to run the business. However, if integrated new applications are ever going to be developed, this partitioning of data needs to be consolidated under one (figurative) roof. This leads us to one of the most sweeping ideas to hit the database management arena since relational databases: the data warehouse.

A data warehouse, as the name implies, is a data store for a large amount of corporate data. The data quality and integrity can be maintained by a centralized staff. Applications developers do not have to deal with a myriad of multiple, incompatible, and sometimes overlapping databases.
In short, everyone who needs access to corporate data knows where to go to find it. How we got to this point is a combination of the history of the evolution of information technology and the explosive growth in storage capabilities. In a large corporate data warehouse, we are not talking in terms of hundreds of megabytes of data (which can now be stored on a single PC) but in the hundreds of gigabytes or trillions of bytes of data (terabytes). Indeed, the idea is not so much that the data resides physically on a single computer system, but that all of the data is stored on and accessible through a network of distributed systems that presents itself as a seamless collection of corporate data.

Figure 1.2 depicts a typical configuration of a corporate data warehouse. Operational data is generated through transactions processed by applications running on PCs and servers and is then stored in operational databases. These operational databases usually hold several months of data and range from 10 to 50 gigabytes in size. At certain intervals, the operational data is moved off the transaction processing systems onto the data warehousing system. Products such as IBM's DataPropagator and Visual Warehouse can automate these data replication tasks. The warehouse might be a network of PCs and servers, a midrange computer like an IBM AS/400, an IBM mainframe, or some interoperating mix of these systems. The data warehouse might hold years of data and can swell to terabytes of data.

In addition to the aforementioned benefits for data quality and security, the data warehouse feeds decision support systems and operational applications. With computers, as with people, you can't make good decisions unless you have all of the reliable data. A good corporate data warehouse makes that data readily available. In addition, it makes possible a whole new class of computing applications, now known as data mining.

Data Mining Overview

Data mining, also referred to as knowledge discovery (Frawley, Piatetsky-Shapiro, and Matheus 1992), has become something of a buzzword in business circles. Everyone wants it, and therefore many computer hardware and software vendors claim that they have it. The only problem, of course, is that not everyone agrees on what it is.
To some, it is client/server queries; to others, it is multidimensional databases; to still others, OLAP with drill-down capabilities. Seemingly, the only point of agreement is that it has to do with database systems. In this section, I present my view of exactly what data mining is and, more important, how it is done. First, let me start with a definition: Data mining is the efficient discovery of valuable, nonobvious information from a large collection of data.

This innocuous sentence hints at some key attributes that can be used to determine what is, and is not, "data mining." The operative word in this definition deals with the "discovery" of information from data. Notice that I am not talking about complex queries where the user already has a suspicion about a relationship in the data and wants to pull all of the information together to manually check or validate a hypothesis. Nor am I talking about performing statistical tests of hypotheses using standard statistical techniques. Data mining centers on the automated discovery of new facts and relationships in data. The idea is that the raw material is the business data, and the data mining algorithm is the excavator, sifting through the vast quantities of raw data looking for the valuable nuggets of business information.

A data mining operation is "efficient" if the value of the extracted information exceeds the cost of processing the raw data. When viewed from this
Add wake ay dala niin laorhms are used to proces data inorder to fad relaiowshipe and pat- {ern they often produce volun outputs of trval, obvious informs- Une sna gh made you fee better by confirming your “unerstanding ofthe business fundamentals ih your indus, bat does Tot ad vali to your desion making rocess. Ths separaing the wheat om the chal the back-enaafthat miing ep unin algorithms (ane oo) wr wich you proxi te ‘lion dscovere trout data lings valuable" oly iit els you gain 1 eomptive advantage In your business, o aids in the decision-making ‘Aang cllcion of data” is certain a subjective quantity. Asal base nese might cosier«gabytecf datato be a large database worthy of min- ite A large corporation might have multe tabases inthe tone oF hundreds of lattes range. To some exten databases age enough for fata mining i conta enough data so uct relatos ae den ‘tom ew al that wih renin information can be extracted, "The data mining process consists of Ures major stops, as stated in gure 1. of cours, eal ars we bi pe of data The ist processing step is data reparation often refered to as "srubbing the dala." Daa is asia selected, eanee and reprocessed under te guidance and knowlege of Soman exper. Second, a data using algoridim is used to process the [reputed dat, cmprecaing and waraorming ito make hey to entity fry latent valuable nuggets of infomation The third phase 1s the data [nse page, where the dala fing output eratuata to se i at tinal domain ining ne deemvrrt and tn decermine the lave it~ portance of the Tats generated by the mining algorthns. This is where [Meatepe tusiessdessone se mate argo eel by Ueda rnin process and where operational applcatons are deployed. 
As businesses have computerized their operations, they have gradually accumulated a collection of separate and sometimes incompatible systems. Either through mergers of distinct information technology departments, or simply through the requirements of different applications, customer records, sales transactions, and design information usually exist in more than one place in the corporate information systems. This duplication must be reduced or eliminated in order to perform cost-effective data mining. This consolidation of crucial business data is now being referred to as the corporate "data warehouse." While businesses build data warehouses primarily to support flexible decision support systems, the warehouse is also the natural starting point for data mining. To use a mining metaphor, it is easier to mine in an area with good roads and bridges than in the middle of a forest or on a mountaintop, where the mining tools would have to be airlifted in. Gaining access to the raw material is part of the cost of the operation and therefore affects the efficiency (the return on investment). Having the corporate data consolidated and easily available will make some data mining operations more practical from a cost standpoint than if the data had to be retrieved from scratch.

Unfortunately, just collecting the data in one place and making it easily available is not enough. When operational data from transactions is loaded into the data warehouse, it often contains missing or inaccurate data. How good or bad the data is is a function of the amount of input checking done by the applications that generate the transactions. Unfortunately, many deployed applications are less than stellar when it comes to validating the inputs.
To overcome this problem, the operational data must go through a "cleansing" process, which takes care of missing or out-of-range values. If this cleansing step is not done before the data is loaded into the data warehouse, it will have to be performed repeatedly whenever that data is used in a data mining operation.

In most data mining applications, the relatively clean data that resides in the corporate data warehouse must still be refined and processed before it undergoes the data mining process. This preprocessing might include joining information from multiple tables and selecting specific rows or records of data, and it most certainly includes selecting which columns or fields of data to look at in the data mining step. Often two or more fields are combined to represent ratios or derived values. This data selection and manipulation process is usually performed by someone with a good deal of domain knowledge about the problem and the data related to the problem under study. Depending on the data mining algorithm involved, the data might need to be formatted in specific ways (scaled or normalized, for example) before it is processed. While viewed by some as a bothersome preliminary step (sort of like scraping away the old paint before applying a fresh coat), data preparation is crucial to a successful data mining application. Indeed, IBM Consulting and independent consultants estimate that data preparation might consume anywhere from 50 to 80 percent of the resources spent in a data mining operation.

Once the data is selected and preprocessed, the data mining algorithms perform the actual sifting process. Many techniques have been used to perform the common data mining activities of associations, clustering, classification, modeling, sequential patterns, and time-series forecasting. These techniques range from statistics to neural networks. See Table 1.1 for a list of the most common data mining functions, the corresponding data mining algorithms, and typical applications. Think of the different data mining algorithms as the drill bits of the mining operation. If the data is locked in hard rock, then we might need a diamond drill (an algorithm of a certain type). If it is in more porous rock, then we might best increase our efficiency by using less expensive tools (or algorithms).
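Two of the preparation steps described above, filling in missing values and scaling a numeric field, can be sketched in a few lines. The field name and the fill-with-the-mean policy below are illustrative assumptions, not taken from the book; real projects choose these under the guidance of a domain expert.

```python
def clean_and_scale(records, field):
    """Replace missing values with the field mean, then min-max
    scale the field to the [0, 1] range."""
    present = [r[field] for r in records if r[field] is not None]
    mean = sum(present) / len(present)
    cleaned = [mean if r[field] is None else r[field] for r in records]
    lo, hi = min(cleaned), max(cleaned)
    return [(v - lo) / (hi - lo) for v in cleaned]

sales = [{"amount": 100.0}, {"amount": None}, {"amount": 300.0}]
print(clean_and_scale(sales, "amount"))   # [0.0, 0.5, 1.0]
```

The missing value becomes the mean of the observed values (200), and all three values are then scaled into the range many neural network tools expect.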
The type of data mining function we are trying to perform, along with the quality and quantity of the data available, combine to determine which data mining algorithm should be used.

Table 1.1 Data Mining Functions, Algorithms, and Example Applications.

The third and final step is the analysis of the data mining results or output. In some cases the output is in a form that makes it very easy to discern the valuable nuggets of information from the trivial or uninteresting facts. Figure 1.4 shows the output of a data mining run using the Quest association algorithm developed by IBM Almaden Research. The relationships between items in a market basket analysis are represented in rule form. The antecedent (left-hand side) lists the items purchased, and the association with the consequent (right-hand side) item carries measures of confidence (how often the two items are purchased at the same time) and support (the percentage of records in which the association appears). With the rules presented in this way, interesting relationships are relatively easy to identify. In other cases, however, the results will have to be analyzed, either visually or through another level of processing, to classify the nuggets according to predicted value. Figure 1.5 illustrates a visualization of another market basket analysis, using a segmentation algorithm developed by the IBM UK Scientific Centre. The visualization shows the statistical profile of the customers in each major segment and how their attributes compare to the whole population of customers. Whatever data mining algorithm is used, the results will have to be presented to the user. A successful data mining application involves the transformation of raw data into a form that is more compact and understandable, and in which relationships are explicitly defined.

I've talked about data mining as the process of extracting valuable information from data. What makes information valuable to a business? Information adds value when it leads to actions or market behavior that gives a discernible competitive advantage. There are two major ways to use data mining results: one is to enhance decision support processes; the second is to use the data mining models as part of operational applications. I discuss these two major uses of data mining in the following sections.
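The confidence and support measures defined above can be computed directly from a list of market baskets. The sketch below is only an illustration of the two measures; it is not the Quest algorithm itself, which also searches efficiently for which rules are worth scoring in the first place. The baskets and items are invented.

```python
def rule_stats(baskets, antecedent, consequent):
    """Score the rule antecedent => consequent over a basket list."""
    ante = [b for b in baskets if antecedent <= b]
    both = [b for b in ante if consequent <= b]
    confidence = len(both) / len(ante)   # how often LHS buyers also buy RHS
    support = len(both) / len(baskets)   # fraction of all baskets with both
    return confidence, support

baskets = [{"diapers", "formula", "milk"},
           {"diapers", "formula"},
           {"diapers", "bread"},
           {"milk", "bread"}]
conf, supp = rule_stats(baskets, {"diapers"}, {"formula"})
print(conf, supp)   # 2 of 3 diaper baskets include formula; 2 of 4 baskets overall
```

A rule such as "diapers => formula" would be reported here with confidence of about 67 percent and support of 50 percent.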
Enhancing Decision Support

As the quantity of business data has grown, a new class of applications has emerged, called either decision support systems (DSS) or executive information systems (EIS), depending on the software vendor. Whatever it is called, the focus of these systems is to allow business decision makers to analyze and detect patterns in data, and to aid them in making strategic business decisions.

A typical use of a decision support application would be for a purchasing agent or buyer for a large retailer to generate an interactive query against sales and inventory data for a particular product or product group. Once the data is retrieved, the decision support system would allow the data to be displayed graphically in a variety of formats. Based on this transformation of the raw data, the decision maker would decide what quantity of the items should be ordered. Note that the "smart" element of this system is provided by the data analyst and her selection of which data to request. In most respects this is a query application integrated with graphical display capabilities.

In contrast, a data mining solution to this problem would be to mine a database that contains sales and inventory information on the products. Figure 1.6 shows a typical scenario for using data mining for decision support. Starting with a selection of prepared data, a neural network is used to build a sales and inventory forecasting model. The model is constructed automatically from the data using the learning capabilities of neural networks. Once this model is created, what-if scenarios can be run by the analyst to generate more accurate predictions of future sales and inventory requirements, or to determine the sensitivity of sales to changes in any of the input variables.

Figure 1.6 Data mining for decision support.

More important, when a complete model can be easily generated from the raw data using data mining algorithms, it also opens the way to complete automation of the process. This is explored in the next section.

Developing Business Applications

For the past 20 years, the standard approach to developing business applications has been to have a systems analyst determine the data that needs to be processed and identify the major steps of the business process that operates on that data.
Once characterized, the problem is broken down into subproblems, and algorithms are designed to operate on the data. This top-down approach works well for a wide variety of problems and has become the standard technique used by business programming shops worldwide.

Another, more recent approach is to analyze the problem in terms of the business objects that are involved in the process and the operations performed on or by those objects. This so-called object-oriented analysis and design, combined with object-oriented programming languages such as Smalltalk and C++ (even OO-COBOL), is fast becoming the preferred application development technique for business. The major advantages are code reuse and improved reliability, because new applications are developed using previously developed and tested objects. But although the focus is on data and its behavior rather than on problem decomposition, object-oriented programming is still just another approach to writing algorithms for digital computers.

As mentioned earlier, a third alternative exists for developing business applications. This approach is based on data mining and the use of models built during the discovery process. Figure 1.7 shows how data mining can be used for automated application development. The prepared data is used to build a neural network model of the function to be performed. Transactions are then run through the neural network, and the outputs are used to make automated business decisions. This use of data mining places different requirements on the data mining algorithms, since explanation of the results is not as important. What is important in building business applications using data mining is that the underlying processing functions, whether they be clustering, classification, modeling, or time-series forecasting, are accurate and reliable.

Figure 1.7 Data mining for application development.

Please note that applications built using a data mining approach do not necessarily have to perform the business processing function "better" than applications built through the traditional programmatical approach.
Equivalent accuracy would still yield significant benefits, because the application is generated automatically as a product of the data mining process. Moreover, there are many cases where a traditionally programmed application cannot be developed at all, since no one in the business understands how the data relates well enough to design or write an algorithm to capture those relationships. It is here that the advantages of using data mining techniques are clearly evident.

Example Data Mining Applications

So far I have talked about data mining from a technological perspective. Now let's change our view to a business perspective. How are businesses using data mining and neural networks today? What kinds of applications have been successfully deployed? What industries are leading in the adoption of this technology?

Marketing

Every business has to market its products or services, and the cost of the marketing effort must be factored into the ultimate selling price. Any technology that can improve marketing results or lower marketing costs gets a close look by businesses. In addition, customer orders, billing, and purchases were some of the first business transactions to be automated with computers, so large amounts of data are readily available for mining. These factors have combined to make marketing one of the hottest application areas for data mining. This general area is referred to as database marketing (Gessaroli 1995).

Customer relationship management is the term most used for the systematic mining of customer-related information with the goal of enhancing the revenue flow from existing customers. Information on customer demographics and purchase patterns is used to segment the customers based on their underlying similarities, whether socioeconomic or by the interests and hobbies demonstrated by the products they purchase. By determining the characteristics of each segment of customers, advertising return on investment can be enhanced, since the marketing messages are accurately reaching those customers most likely to buy. By segmenting the customer base, different products and services can be developed that are tailored to appeal to members of each specific group.
Database marketing, the application of data mining techniques to databases containing marketing information, can be used in several different aspects of the customer/business relationship. This information can be used to improve the customer retention rate by identifying customers who are likely to switch to another provider. Since it costs much more to win a new customer than to keep an existing one, this application can have a significant impact on profits. When knowledge about the customer is combined with product information, specific promotions can be offered that increase the average purchases made by the customer segment. By knowing what a particular customer is interested in, marketing costs can be lowered through more effective mailings, and customer satisfaction is improved because customers are not receiving what they perceive as "junk mail."

Direct mail response campaigns use data mining to first select a target set of customers. A test mailing is then made to a small subset of this set. Based on the size of the response and the characteristics of those who responded, a determination can be made as to who should be included in the subsequent mass mailing and whether one or more offers should be included.

Retail

In the retail sector, perhaps the biggest application is market basket analysis. This involves mining the point-of-sale transactions to find associations between products. This information is then used to determine product affinities and to suggest promotion strategies that can maximize profits. A typical approach is to collect all point-of-sale transactions in a database. The transaction database is then mined to find those products that are most strongly associated. When customers purchase baby diapers, they also tend to purchase baby formula. The retailer would not ordinarily put both diapers and formula on sale at the same time. Rather, using the knowledge that there is an association between these two products, the retailer would put one item on sale and place the other item next to it, or in close proximity to the first item. The placement of items in grocery stores is no accident. The output of a market basket analysis can also reveal associations between products that had never been suspected.
One anecdotal tale is of a convenience store chain that noticed a strong association between purchases of diapers and beer; the shopper sent out for diapers often picked up a six-pack along the way.

A related application is the use of sequential patterns to spot temporal relations between transactions or products; here the focus is on the temporal relationship between purchases. For example, when someone purchases a new suit, it is likely that he will return to purchase new dress shirts and ties. A retailer would then use this information to try to encourage the purchase of the related items on the same trip to the store, because the customer might otherwise shop elsewhere to pick up the accessories.

Finance

Data mining is finding widespread use in the finance industry (Disney 1996). Neural networks are used to detect patterns of potentially fraudulent transactions in consumer credit cards (Norton 1994). They are also used to predict interest rate and exchange rate fluctuations in currency markets. Several brokerage houses use neural networks to help in managing stock and bond portfolios (Schwartz). Neural networks have also been used for credit risk assessment and for bankruptcy prediction in commercial lending and bond rating.

In the finance industry, different classes of customers are treated differently, based on the perceived risk to the lender (Margarita and Bela 1992). Thus, classifying the amount of risk associated with a customer or with a particular transaction is extremely important. A modest improvement in the ability to detect impending bankruptcies, for example, will yield an appreciable revenue increase to the financial institution (Udo 1993).

In the commodities trading arena, a complex set of variables is used to construct trading strategies (Grudnitski and Osburn 1993). While the efficient market theory has been widely accepted for years, many brokerage houses still rely on technical traders to analyze the data and make educated guesses about the markets. One of the most important tasks is to detect trends and changes in the movement of the market as a whole or of some particular segment or stock. Neural networks' abilities to model
time series and complex nonlinear functions have prompted their use in all of these application areas.

Manufacturing

The complexity of modern manufacturing environments and the requirements for both efficiency and high quality have prompted the use of data mining in several areas. Neural networks are used in computer-aided design to match part requirements to existing parts for design reuse, in job scheduling and manufacturing control, and in the optimization of chemical processes, for example to minimize energy consumption. Neural networks are also widely used in quality control and automated inspection applications.

Job-shop scheduling is a difficult problem that deals with assigning the sequence of jobs and how work is assigned to specific machines in a manufacturing plant. There are often "hard" constraints on how jobs can be run, such as the sequence of process steps and whether a particular material can be processed by a specific machine. In addition to these hard constraints, there are "soft" constraints, such as optimizing operating efficiency by avoiding needless setup and reconfiguration of machines through scheduling similar types of products or operations on the same piece of equipment. Neural networks have been used to satisfy these constraints while generating optimized job assignments.

Manufacturing process control deals with the automated adjustment of the parameters that control the quality and quantity of the products produced by the manufacturing facility. A well-known technique called statistical process control tracks the quality of the goods produced by a manufacturer by measuring the variability and tolerances in various aspects of the finished goods. Using examples of good and bad parts, neural networks have been used to aid in the control of processes and in the detection of faults in the plant outputs.
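Statistical process control, mentioned above, commonly flags a measurement when it falls outside control limits set at the process mean plus or minus three standard deviations. A minimal sketch of that check, with invented widget-width data:

```python
from statistics import mean, stdev

def out_of_control(history, reading, k=3.0):
    """True if reading falls outside mean +/- k standard deviations
    of the historical measurements."""
    m, s = mean(history), stdev(history)
    return abs(reading - m) > k * s

widths = [5.01, 4.99, 5.02, 4.98, 5.00, 5.01, 4.99, 5.00]
print(out_of_control(widths, 5.01))  # within limits -> False
print(out_of_control(widths, 5.40))  # far outside   -> True
```

A neural network approach to the same problem learns the boundary between good and bad parts from labeled examples rather than from a fixed statistical rule, which is what lets it pick up subtler, multivariate fault patterns.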
In some chemical manufacturing processes, complex mixtures of chemicals must be heated, cooled, mixed, and transported by an automatic control system. Dangerous or abnormal operating conditions must be detected and corrected quickly, or the system could explode. Neural networks have been used to minimize the generation of waste products and to improve the properties of the materials produced, such as steel from blast furnaces.

Automated inspection is a requirement in many manufacturing environments, where high speed and high output volumes can overwhelm the ability of human inspectors to accurately spot defects in the process. Using digital images, neural networks have been used to detect flaws in rolled steel and aluminum, in printed circuit boards, and in consumer product packaging. They have also been used to classify the grades of food products and to perform many other inspection functions on automated assembly lines.

Health and medical

There are two primary uses for data mining in the health industry: the administration of patient services (billing, insurance, and so on), and the diagnosis and treatment of disease. Detecting insurance billing fraud has been a long-standing need of the industry. Another major application is to identify the most cost-effective health care providers. Many aspects of the health industry are under tight government controls, and compliance with government regulations must be maintained.

Data mining is also being used to automate the diagnosis of cervical cancer, breast cancer, and heart attacks (Sabbatini 1992). Patient data can be collected on a large population and presented to a neural network. Thus a data mining system can look at more patients in one day than a human doctor can see in a lifetime. Neural networks' abilities to sift through large bodies of data and to detect subtle patterns have proven to be effective (Harrison, Chen, and Kennedy 1988).

Energy and utility

Suppliers of electric power are subject to large swings in demand for service. A single weather front moving rapidly across a region can considerably increase the demand for power in a matter of hours. Decisions have to be made as to whether plants should be brought online or taken down for preventive maintenance.
Large consumers of electrical power, such as manufacturing plants, are often charged based on their peak energy usage, so it is in their interest to manage their consumption to minimize excessive demands for service. This shared dependency on accurate load forecasts has made the utility industry one of the major users of neural networks (Park et al. 1990).

Another application in the energy industry is the search for new gas and oil deposits. Neural networks have been successfully used to aid in the analysis of soundings taken at test drilling sites, to detect changes in the strata of rock, and to identify likely sites for mineral deposits.

Summary

The changes in the business computing environment over the past three decades have been dramatic. Advances in computer processing and storage technology provide businesses with the ability to keep hundreds of gigabytes, even terabytes, of data online. However, this is a good news, bad news story. The good news is that we can now have years of historical business data available for decision support applications. The bad news is that traditional data query and analysis methods are not capable of dealing with such volumes of data. The consequence is that users are drowning in data.

Data mining, or knowledge discovery, offers a solution to this problem. With its emphasis on the discovery of valuable information from large databases, data mining provides added value to the investment in the corporate data warehouse. The data mining process consists of three basic steps: preparation of the data, application of the mining algorithm, and analysis of the mining algorithm's output.

The benefits of data mining are evident in two major business activities: decision support and application development. In the decision support arena, data mining transforms the data to reveal hidden information in the form of facts, rules, and graphical representations of the relationships in the data. Extremely large amounts of data are compressed to reveal the inner relationships among the data elements. When used in the application development arena, data mining with neural networks provides automated construction of transaction processing systems and forecasting models.

Applications of data mining span all industries. Businesses of all types use data mining to target marketing messages to specific customer sets,
both to suit their customers' needs and to increase revenues. Retailers use data mining to find associations between products purchased at the same time and to forecast sales and the corresponding inventory requirements. The finance industry uses data mining techniques to manage risk and to detect trends in the markets. Manufacturers use neural networks in daily production scheduling, process control, and quality inspection of their products. Hospitals and insurance companies mine their data to detect excessive claims by health care providers and patients, and medical researchers use the advanced pattern recognition capabilities of neural networks to automate laboratory tests. Utilities use neural networks to forecast demand and to respond quickly to equipment outages and changes in the weather.

Any business with data about its customers, suppliers, products, and sales can benefit from data mining. When businesses are looking for that slight edge over their competition, they are willing to travel far and wide and spend millions of dollars to buy information about their markets. Often this information is sitting right there in their own data warehouse.

References

Grudnitski, G., and L. Osburn. 1993. Forecasting S&P and gold futures prices: An application of neural networks. The Journal of Futures Markets 13.

Udo, G. 1993. Neural network performance on the bankruptcy classification problem. Computers and Industrial Engineering 25.

Wong, B.K., T.A. Bodnovich, and Y. Selvi. 1995. A bibliography of neural network business applications research. Expert Systems 12.

Chapter 2. Introduction to Neural Networks

Neural Networks: A Data Mining Engine

Neural networks are one of the key technologies used for data mining. In this chapter, I explore the history of neural networks, how they compare to traditional computing approaches, and why they are a natural technology for performing data mining activities.

A Historical Perspective

The history of computing is filled with
curious twists and turns. From the first visions of Charles Babbage and his mechanical computing device, the Difference Engine, to John von Neumann and the development of the modern digital computer, many competing ideas have been examined and then rejected as political and market forces made their natural selection. Figure 2.1 shows some of the major milestones in computing, from both a computer hardware and a software view. In 1937, Alan Turing developed his theory of the Turing machine, a device that could read instructions from a paper tape and simulate any other computing machine.

Figure 2.1 Major milestones in computing: a time line.

When McCulloch and Pitts (1943) wrote their paper on the binary neuron, they were using the human brain as their computational model. John von Neumann picked up these ideas and developed them, along with others, into the computing model we know today as the stored-program, or "von Neumann," computer. It is somewhat ironic that what began as a crude model of the brain has, over time, become the accepted metaphor for how the brain itself works.

For a brief period, analog computers competed with digital computers. The early analog computers performed advanced mathematics and calculus and could be used for modeling natural phenomena. However, they could not be used for accurate mathematical computations such as business accounting and inventory management. Here the digital computer proved superior. And as more and more problems were mapped onto the digital world, digital computers became the dominant type of computer. Today, the term computer is synonymous with digital computer.

A similar story occurred with the development of intelligent computers. In the late 1950s and early 1960s, there were two major schools of thought. One school wanted to model computation on the basic architecture and key attributes of what was known about the human brain. The other school felt that intelligence could be produced in machines through the use of symbol manipulation. The two approaches were tightly coupled to the prevailing philosophical positions regarding the fundamental nature of intelligence, and they led to major debates in the intelligent computing arena.

Intelligent Computing: Symbol Manipulation or Pattern Recognition?
Why has the digital computer become the common metaphor for how the human brain works? Why is the logical, sequential processing of the von Neumann computer used as the model of the organized mind? Are computers accurate models of the logical brain? The answers to these questions depend on your definition of intelligence.

What separates humans from the lower lifeforms? For years, great thinkers have seen in the human cerebellum the unique machinery that gives us intelligence: not just intelligence enough for survival, but intelligence that allows planning and abstract thought.

When Newell and Simon proposed their physical symbol system hypothesis in 1955, the digital computer was only ten years old. They claimed that while digital computers were extremely good and fast number crunchers, they could also be excellent and fast symbol processors. All that was needed was a simple abstraction mapping symbols to numbers. The claim was that a "physical symbol system has the necessary and sufficient means for general intelligent action." Not only were they saying that symbol manipulation could lead to intelligent behavior, they stated that it was "necessary": if people exhibit intelligent behavior, then it must be because they are using formal rules to manipulate symbols. This assumption of the functional equivalence between symbol-processing computers and the human brain became the basis for most of the artificial intelligence work of the next three decades.

What these researchers chose to overlook was that the symbol-processing brain works on information that has already been processed at a subsymbolic level by the body's senses. Our hearing, sight, taste, and tactile inputs provide the human brain with a wealth of information with which to reason about the world. Some people call this subsymbolic processing feature detection. And what is "feature detection"? It is a process of pattern recognition that occurs largely at a subconscious level. People develop many context-sensitive models of what to expect as we interact with the world. Even though we might be thinking about something else entirely, our built-in novelty detectors break through our thoughts and tell us when something out of the ordinary or unexpected is happening.
For example, when driving a car, the subconscious often takes over the routine tasks, and our mind is free to wander until we "notice" something unexpected in the traffic.

Figure 2.2 illustrates the major differences in approach between the symbolic and the subsymbolic (or neural network) schools of artificial intelligence. Those espousing the symbolic view would say that knowledge must be explicitly represented by rules, and that the flow of consciousness is best described by a serial process. Those with a subsymbolic or connectionist slant would say that massively parallel and analog computation is a fundamental aspect of intelligence. This leads to the robust qualities of neural networks, in contrast to the well-known brittleness associated with rule-based systems when they operate near the edges of their domain knowledge. In some sense, the different technical approaches mirror differences in the philosophy of mind. In one, intelligence is purely a function of the higher processes, derived in a top-down manner. In the other, intelligence is an emergent phenomenon springing forth from the interactions of many simple processing units. The truth probably lies somewhere in the middle of these extremes.

Computer Metaphor Versus the Brain Metaphor

In the early 1960s, the symbol processing school recognized that digital computers were good at manipulating numbers, and that numbers could be used to represent symbols. The use of this simple abstraction from symbols to numbers meant that, without any changes to the common digital computer architecture, we had a symbol-processing computer. The symbol processing researchers demonstrated some significant early successes, such as computer programs that could do college-level calculus and mathematical theorem proving. Solving such seemingly difficult problems suggested that symbol processing was surely the way to go if we wanted to develop intelligent computer systems.

During the same time, the brain-based or connectionist researchers were trying to show that connected networks of simple processing units could demonstrate emergent intelligent behavior. Rosenblatt's Perceptron model (1962) and Widrow's Adaline (1960) were two examples of the types of neural networks being built at the time. The Perceptron was proven to converge for a certain class of problems. The Adaline was shown to be quite capable of solving certain engineering problems in an area that became known as adaptive signal processing. However, while neural network researchers were able to show some interesting results, they soon hit a brick wall of seemingly insurmountable theoretical problems. Their learning algorithms would work only for single-layer neural networks, which limited them to solving only simple, linearly separable problems. Minsky and Papert, two researchers from the symbol processing school who were quite knowledgeable about neural networks, highlighted these theoretical limitations in their influential book, Perceptrons (1969). Neural network research all but ended in the United States by the late 1960s, although some work continued in Europe.

Meanwhile, the symbol processing school of artificial intelligence proceeded full speed ahead.
Special-purpose programming languages such as Prolog and Lisp were developed for writing symbol processing programs. Indeed, specialized computers known as AI workstations were developed to provide the high-powered processing needed to simulate intelligence through symbol processing. Researchers moved on from calculus and theorem proving to problems such as image recognition, speech recognition, planning, and scheduling. Rule-based expert systems were developed to capture the problem-solving techniques used by human experts. Funding and graduate students poured into symbol-based artificial intelligence research year after year. It was not until the mid-1980s that people realized that progress was not being made at the rate that had been promised.

Symbolic AI was always "just around the corner" from crossing that magical threshold to mainstream application. All that was needed was more powerful computers, more funding, or more time. Not that the research was fruitless. New computer interface techniques, such as graphical user interfaces, were refined on the AI workstations. New programming paradigms, such as the object-oriented language Smalltalk, were invented. Expert systems went commercial and solved some difficult real-world problems. However, the deep results leading to truly intelligent machines did not seem to be coming any closer, even after decades of research. This lack of progress on some of the fundamental problems in developing intelligent software systems led researchers to reexamine the work from the 1960s on neural networks and to rediscover the work of the small group of researchers who had carried on, even after neural networks "lost" to symbolic AI.

In the mid-1980s, researchers started publishing new results and updating old results on fundamental neural network problems.
In the early 1960s, when researchers first worked on computers that could learn, both the available computer hardware and the theoretical understanding of the issues were not up to the task. In the intervening years, researchers had developed new neural network training algorithms that overcame the limitations of the early Perceptron and Adaline models. The PDP research group (where PDP means parallel distributed processing) had been working for several years and published their landmark two-volume work in 1986 (Rumelhart and McClelland). These books served to entice many young graduate students to pick up the connectionist, or neural network, cause. The PDP books popularized the backward propagation of errors algorithm, a learning algorithm that allowed multiple-layer neural networks to be constructed. Other chapters in the books covered self-organizing and competitive behavior, recurrent neural networks, and applications of neural networks to optimization and cognitive modeling. The First International Conference on Neural Networks, held in 1987, served as a watering hole for all who wanted to hear the latest research on neural networks. These theoretical advances, along with the availability of relatively cheap computing power, allowed both academic researchers and commercial application developers to explore using neural networks to solve their problems.

A variety of factors play a role in determining whether a technology becomes a commercial success or a failure. Perhaps the most critical is whether viable alternatives exist to solve the pressing problems of the day. The failure of symbolic artificial intelligence to satisfy industry requirements for robust pattern recognition and adaptive behavior opened the door for neural networks to reenter the technical stage. It also helped that a whole new generation of researchers had entered the scene who didn't "know" that neural networks wouldn't work. The advances in computing power and integrated circuits, which made placing thousands of processing elements on a single chip possible, certainly contributed to the reemergence of neural network technology.
If you have listened to talk of neural networks in the past few years, it might have been in the context of a hot new technology that is revolutionizing fields like stock portfolio management (Ruggiero 1994) and credit card fraud detection (Norton 1994), or it might have been in the context of science fiction. Which is it? Science fiction or science fact? Well, that depends on your point of view. If you are interested in using neural networks to solve practical problems, such as predicting future sales, modeling a manufacturing process, or detecting failures in machines, then neural networks are real. They work today, and they are available (see Table 2.1 for a list of commercial neural network applications). If you want to build Commander Data, the personable android on the "Star Trek: The Next Generation" show, then you are still talking fiction. Neural networks have not allowed the creation of machines with human intelligence or abilities. However, researchers in many fields, from neurophysiology and cognitive science to computer science and electrical engineering, are at work toward that end.

The availability of commercial neural network development tools has increased the number of fielded applications. Tools from vendors such as HNC Inc., the IBM Corp., NeuralWare Inc., and Ward Systems Group run on PCs, workstations, minicomputers, and mainframes. These tools provide interactive environments for developing neural networks and the means to deploy applications. See appendix A for a description of the features provided by the IBM Neural Network Utility.

In summary, we are now at a technology state where building computing tools that learn from examples is both possible and practical. It has now been almost 10 years since the reemergence of neural networks. The continuing research funding and technology development have done more than refine the state of the art; they have created a fundamentally new approach to solving problems with computers.
Much more than just an incremental step forward in computing ability, neural networks are a leap forward, providing a completely new paradigm both for formulating problems and for solving them. In the next section, I examine how we must change our fundamental problem-solving approach if we are to exploit this new technology.

Changing the Problem-Solving Paradigm

Solving problems with computers has become commonplace. It is done every day by many people. However, it is by no means natural for most people. Even those with an aptitude for computer programming must be trained in the organized, step-by-step procedures required to write a program to get a computer to solve a problem. Some people are unable to think at the level of detail necessary to specify the sequence of elementary operations needed to perform even the most basic computing functions. Others view a problem as a connect-the-dots puzzle, where the first and last points are specified, and their job is to link the dots, one by one, until the solution emerges.

The problems we use computers to solve are quite varied. They could be traditional computer applications such as accounting and payroll, or they could be optimization problems such as how to assign shipments to trucks for delivery, or how to manage an inventory replenishment strategy. In every case, the original problem must be recast in a form that can be solved on a computer, using a computer programming language. One of the basic tasks that computer systems analysts must perform is to translate business problems into computer solutions. There are a large number of methodologies for doing this. Perhaps the most common is the top-down structured approach, where a problem is broken down into subproblems. Data is identified, and processes are defined for manipulating the data. Well-known techniques are used to design programs and algorithms to solve the data processing problem. After that, it is just a simple matter of programming. It's been done a million times.
For the past 20 years, the computer science curriculum in universities has reflected this view. Classes on data structures, algorithms, systems analysis and design, and programming languages such as COBOL, Pascal, and C are all standard offerings. Students are taught all of the important information needed to define, conceptualize, and solve problems on digital computers.

In the past few years, the old top-down structured design and centralized application has given way to an approach based on objects. Instead of focusing on data and then deciding how to process it, object-oriented analysis and design focuses on defining business objects that correspond to real-world objects. Each object contains some set of attributes, which defines its current state, and a set of operations, which that object can perform or respond to. Hand in hand with this new problem analysis approach came the increasing use of object-oriented programming languages such as Smalltalk and C++ (object-oriented C), which support the constructs used in object-oriented analysis. This shift in systems analysis methodology is currently causing a corresponding change in university computer science curriculums. In many ways, the move to object orientation has brought a new paradigm for solving problems on computers. The rigid waterfall software development process, which flows from requirements, to analysis and design, to code and test, and which was the standard development methodology under the structured programming technique, is now giving way to the iterative development and rapid prototyping process more natural for the new object-oriented approach. The software development method has changed, but the fundamental computing architecture has not. What if we change the underlying machine? What if it is massively parallel, with hundreds or thousands of processors? In a similar way, we would need to develop a curriculum on parallel computing, on partitioning processing and synchronizing access to shared variables. When we change the underlying assumptions about the computer architecture, we call into question many of the basic tenets of the computer science approach and the accumulated knowledge in solving problems with computers gained over the past 30 years.
Classes on parallel architectures and programming are being offered. If we treat the transition to massive parallelism as an evolutionary step from serial computers, we think we can get there. But it is hard to teach many people how to think in parallel, especially when we have just learned how to train people to think in rigid sequential ways so they could program digital computers.

Change is hard, especially when a technology has proven useful, as serial computing has over the last 30 years. Now suppose we have a new type of computer, not just a parallel extension of the serial digital computer, but a radically different computational model and architecture. How are we going to teach people to solve problems using these computers? It won't be easy if we approach the problem as one of retraining thousands of programmers. It will be nearly impossible if we try to present a modified version of the familiar serial software development process. What we need is a radically different approach. A fundamentally different computing model requires a rethinking of the software development process, from problem definition through testing. Neural networks and neural network development are different because the model of solving problems with neural networks is quite similar to the way people naturally solve problems. A neural network learns to solve problems by being given data: examples of the problem and its solution. People do this all the time.

Knowledge Workers and Neural Networks

Suppose we just hired a new loan officer to make credit decisions for our bank (let's call her Jennifer). Jennifer will be given some hard and fast rules, but there is a large gray area where the loan decision is up to her. Either she grants the loan and we take our chances, or she doesn't grant the loan and we give up the opportunity to make money on the interest. It is going to take a while for Jennifer to learn how to do this well. Picture Jennifer at her desk on her first day on the job. Jennifer processes about 10 loans a day. At first, every application is a completely new experience. Some loans she can determine easily, based on the lack of a job, a history of bankruptcy, or other aspects.
Others she needs to weigh the information on and make a "judgment call." One person in a case history is stable, they have two kids, and the loan is for a good purpose. Jennifer weighs all of these factors and says, "Yes, let's take this business." Another case is not so clear-cut. The person has changed jobs recently, and there are some late payments in the credit history. Jennifer says, "No, let's not take this business."

Over time, Jennifer gets feedback on her performance. Each month her manager comes in and lists the customers with late payments. Jennifer learns these patterns and now can tell the early signs of problems. She continues to learn how to judge whether someone is going to pay back the loan on time or not. Of course, Jennifer gets no feedback on the ones she rejects. They might have gone across the street to Second National Bank and been given the loan she refused, and that bank made the money she gave up. But overall, Jennifer is doing a good job; the bank is earning interest on the money it has allocated to loans and has a reasonable default rate.

Let's examine what Jennifer has done. She looked at many examples of loans and learned to classify them as good or bad prospects. More than that, she can say which one is "better" than another. If Jennifer made a wrong determination on someone who later turned out to be a bad risk, she would eventually have to say, "I made a mistake." So Jennifer had to look at the applicant data and adjust her internal weighting of the significance of the various attributes.

Now let's say the bank management would like Jennifer to move on to business loans. But they still need to cover the consumer side of the business. Rather than hire and train a new person to learn the distinctions between good and bad credit risks, they would like to "clone" Jennifer (or at least her expertise). They would like to use her accumulated experience in an automated way. Bank management has heard of a new application technology called neural networks that can learn to do the same job Jennifer does. Jennifer says, "I don't see how it can learn to act the way I do." Well, first, all of the data from Jennifer's loan decisions must be collected. Each transaction has the application data and her decision and, for the ones she accepted, the profitability and currency of the account.
The neural network is presented with the same factors Jennifer used to make her decisions, along with her decision. After a brief "training" period, the neural network is ready to test. A new application comes in and is presented to the neural network. It makes a decision. Jennifer looks over the application and says, "Why, it did just what I would have done. How did it do that?"

Like many knowledge worker tasks, the expert in this example learned to weigh the various input factors and combine them to come up with an overall score or decision. Initially, some of the decisions made were wrong. In order to correct the mistakes, the expert had to adjust her internal weightings so that the next time she would not make the same mistake. This is a common situation in jobs performed by people. They might start out with some rules that can be used in the clearest cases. But the real challenge is learning how to judge the in-between cases, to recognize the subtle distinctions between successes and failures. The common metaphors are that the expert "sees the solution," or "the answer jumps out." Developing neural network applications is similar to training a new knowledge worker. We must be able to give examples of the clear-cut extreme cases, and we must be able to give sufficient data on the "gray" cases. Will the neural network always be correct? No. Does the knowledge worker or expert always make the correct decision? Obviously not. But the expert learns from experience, and so does the neural network.

Making Decisions: The Neural Processing Element

The digital computer architecture (see Figure 2.5) consists of a central processing unit (CPU) with a set of registers and an arithmetic logic unit (ALU), along with a store of addressable memory that holds both instructions and data. The digital computer is called a sequential machine because it works by reading an instruction from memory and then walking through memory (with some skips or branches here or there, as dictated by the program), reading in the required data, processing the data using the ALU, and writing the results back to the memory.
Figure 2.5 Digital computer architecture.

Using a digital computer to make a decision is a relatively straightforward process. The arithmetic logic unit, as you might expect from the name, performs mathematical operations such as addition and subtraction. It also performs basic Boolean logic functions such as testing whether two numbers are equal, or whether one is greater than another. Making these binary yes/no decisions is a fundamental part of digital computing. Because of this, mapping a high-level language computer statement such as

if Income > 100000 then LoanApproved = True
else LoanApproved = False

into elementary computer operations is easy. The values of Income and 100000 are loaded into registers from memory. The ALU tests to see if Income is greater than 100000 and, if this is True, the "LoanApproved = True" code is executed; otherwise, the "LoanApproved = False" code is executed. While this might seem extremely obvious to you, the point I want to emphasize is that the type of decision making we can do in programming languages is a function of the underlying capability of the digital CPU. The programming languages we use today were built upon and derived from the elementary computing capabilities of digital computers. This relationship colors all of our thinking about how we make decisions with computers. Figure 2.6 illustrates how applications are developed using digital computers. A set of business rules or algorithms is translated into a computer program. The input data is fed into the program, and the program processes the data and outputs its results.

Of course, coding a decision statement is not hard. Knowing what the parameter to test for should be is the hard part (should it be 100000 or 100500?). The point is that digital computers are great at making binary (yes/no) decisions, as long as you tell them exactly what to compare.

Unlike the digital computer, neural networks and neural computers are based on a model of the brain. A processing element in a neural network makes decisions in a very different way than a digital computer does.
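Expressed as a runnable fragment (a hypothetical sketch of mine, not code from the book), the crisp decision style described above collapses to a single exact comparison against a hand-picked threshold:

```python
# Hypothetical sketch of the rule-based decision discussed above:
# the digital computer reduces loan approval to one exact comparison.
def loan_approved(income: float, threshold: float = 100_000) -> bool:
    """Return True only when income strictly exceeds the threshold."""
    return income > threshold

print(loan_approved(120_000))  # True
print(loan_approved(99_999))   # False
```

The entire decision hinges on a comparison the programmer must specify exactly; as noted in the text, choosing the threshold itself is the hard part.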
Rather than loading values into registers and performing specific logical operations the way a digital computer does, a neural processing element operates much differently. A neural processing element receives inputs from other connected processing elements. These input signals or values pass through weighted connections, which either amplify or diminish the signals. Inside the neural processing element, all of these input signals are summed together to give the total input to the unit. This total input value is then passed through a mathematical function to produce an output or decision value ranging from 0 to 1. Notice that this is a real-valued (analog) output, not a digital output. If the input signal matches the connection weights exactly, then the output is close to 1. If the input signal totally mismatches the connection weights, then the output is close to 0. Varying degrees of similarity are represented by the intermediate values. Now, of course, we can force the neural processing element to make a binary (1/0) decision, but by using analog values ranging between 0.0 and 1.0 as the outputs, we are retaining more information to pass on to the next layer of parallel processing units. In a very real sense, neural networks are analog computers.

Figure 2.7 shows how a neural network would be used in a loan approval application. The inputs to the neural network are examples or case histories of the application problem and its solution. The loan application data is fed into the neural network, and a value from 0.0 to 1.0 comes out. Each neural processing element acts as a simple pattern recognition machine. It checks the input signals against its memory traces (connection weights) and produces an output signal that corresponds to the degree of match between those patterns. In typical neural networks, there are hundreds of neural processing elements whose pattern recognition and decision-making abilities are harnessed together to solve problems.

The Learning Process: Adjusting Our Biases

Suppose we present an input pattern to a neural network and it produces an output signal that is totally incorrect. What mechanism exists to change the output?
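To make the description concrete, here is a small Python sketch of my own (not code from the book, and deliberately simplified): a single processing element that sums its weighted inputs and squashes the total into a 0-to-1 value, plus, anticipating the learning discussion that follows, a routine that nudges the weights toward the input signal whenever the output is too low. The sigmoid squashing function and the 0.5 learning rate are illustrative assumptions.

```python
import math

def neuron_output(inputs, weights):
    """Sum the weighted inputs and squash the total into a 0-1 value."""
    total = sum(x * w for x, w in zip(inputs, weights))  # weighted sum
    return 1.0 / (1.0 + math.exp(-total))                # squashing function

def nudge(weights, inputs, error, rate=0.5):
    """Move each weight toward the input signal, in proportion to the error."""
    return [w + rate * error * x for w, x in zip(weights, inputs)]

inputs = [1.0, 1.0]
weights = [-1.0, -1.0]          # a badly mismatched "memory trace"
for _ in range(20):             # repeated presentations with feedback
    error = 1.0 - neuron_output(inputs, weights)  # desired output is 1.0
    weights = nudge(weights, inputs, error)
print(round(neuron_output(inputs, weights), 2))   # output has risen toward 1.0
```

Note that the output is analog throughout: the unit reports a degree of match, not a hard yes or no.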
For example, let's say that the output value is much lower than it should be. One way to increase the output of the neural processing element is to move the memory trace, or connection weights, nearer to the input signal. This would improve the degree of match and increase the output value.

This is exactly the method used to "program" or "train" neural network computers. Examples are presented to a neural network, it makes a prediction, and the connection weights are adjusted so that the output corresponds more closely to the desired output. This adjustment process is done automatically by the learning algorithm being used to train the network. By making connections stronger or weaker, reinforcing or inhibiting, the artificial neural network, like the brain, undergoes changes in response to input patterns and feedback. Figure 2.8 shows an example of this process. In step 1, the neural network outputs a value less than the desired value. Based on the difference between the desired and actual outputs, the connection weights are modified. In step 2, we see that connection weight B is smaller and weight A is larger, giving an actual output that is now slightly larger than desired. Once again the weights are adjusted. In this case, weight C is reduced, so that the output at step 3 is very close to the desired output.

Basic Neural Network Functions

It should come as no surprise that neural networks perform many of the kinds of tasks that humans do. These tasks, which are important for our survival, are those where simultaneous processing of a multitude of inputs and fast responses are required. The architecture of the human brain evolved to solve these types of problems.

Classification

Perhaps the most basic function performed by our brains is that of distinguishing between two things. We are capable of analyzing objects, using the subtlest and finest features, to assess both the similarities and the differences. We can classify animals as timid or dangerous, plants as food or poisonous, weather as pleasant or threatening. Every day, in hundreds and thousands of cases, we classify things. In the business environment we have a similar need to make classifications: is this applicant worthy of a mortgage for a new house? Should we extend a line of credit
to a growing business? Should we mail our new catalog to this set of customers or to another one? We make these decisions based on classification.

Clustering

While classification is important, it can certainly be overdone. Making too fine a distinction between things can be as serious a problem as not being able to decide at all. Because we have limited storage capacity in our brains (we still haven't figured out how to add an extender card), it is important for us to be able to cluster similar items or things together. Not only is clustering useful from an efficiency standpoint, but the ability to group like things together (called chunking by artificial intelligence practitioners) is a very important reasoning tool. It is through clustering that we can think in terms of higher abstractions, solving broader problems by getting above all of the nitty-gritty details.

The business applications of clustering are mainly in the marketing arena. By clustering customers into groups based on important or similar attributes, such as which products they buy and the demographics they share, we can analyze our markets in finer detail. This information can be used to target these groups of similar customers with products that many of them have purchased in the past, or with add-on services, which might appeal to them.

Associative memory

The human mind is an amazing storage device. A lifetime's worth of memories is stored with an indexing mechanism that would make any database administrator drool. People store information by associating things or ideas with other related memories. There seems to be a complex network of semantically related ideas stored in the brain. One of the fundamental theories of learning in the brain, called Hebbian learning after Donald Hebb (1949), holds that when two connected neurons are active at the same time, the connection between them grows stronger.
At the time, Hebb postulated that physical changes in the synapses of the neurons take place to strengthen the connection. Some of the earliest work in neural networks dealt with the creation of associative memories. Unlike classification or modeling, where we are learning some fundamental relationship between inputs and outputs, associative memory requires a mapping between any two items. Neural network models such as Binary Adaptive Memories and Hopfield networks have been shown to be limited-capacity, but working, associative memories.

Modeling or regression

People build practical, useful mental models all of the time. Seldom do they resort to writing complex sets of mathematical equations or use other formal methods. Rather, most people build models relating inputs and outputs based on the examples they have seen in their everyday lives. These models can be rather trivial, such as knowing that when there are dark clouds in the sky and the wind starts picking up, a storm is on its way. Or they can be more complex, like a stock trader who watches plots of leading economic indicators to know when to buy or sell. The ability to make accurate predictions from complex examples involving many variables is a great skill. By seeing only a few examples, people can learn to model relationships. We use our ability to interpolate between the exact examples to generalize to novel cases, to problems we have never seen before. It is this ability to generalize that is a strength of neural network technology.

Time-series forecasting and prediction

Like modeling, which is making a static one-to-one prediction from current information, time-series prediction involves looking at current information and predicting what is going to happen. However, with time-series predictions, we typically are looking at what has happened for some period back through time and predicting for some point in the future. The temporal or time element makes time-series prediction both more difficult and more rewarding. Someone who can predict the future based on what has occurred in the past can clearly have tremendous advantages over someone who cannot. People are very good at using context to help adjust their predictions of what the outcomes of certain situations will be.
For example, knowing that it is the day before a holiday, stores will be busier than normal, and a shopper would take this into account. If a bank manager were planning staffing requirements, this context would be crucial information. When neural networks are used for time-series forecasting applications, the neural networks also must be given this context information so they can factor it into their predictions.

Constraint satisfaction

Most people are very good at solving complex problems that involve multiple simultaneous constraints. For example, we might have a list of errands to run. Knowing that buying groceries should come at the end, and that we can perform three of the tasks at a single shopping center, would help us order our errands. In business, we often want to maximize conflicting goals: increase customer satisfaction, reduce costs, increase quality, and maximize profits. We could certainly increase customer satisfaction if we sold our products at a loss. However, this would be bad for our profitability. We could reduce costs by cutting our inventory down to just a few items, but this would certainly have a negative impact on customer satisfaction. Having multiple conflicting goals is a natural part of life. People deal with these kinds of affairs all of the time and think nothing of it. Digital computers, being based on Boolean logic, however, have a hard time dealing with this. In contrast, neural networks, with their weighted connections and analog computing, have proven themselves extremely adept at solving constraint satisfaction and optimization problems.

Summary

People are familiar with the computer metaphor, the idea that the human brain is nothing more than a computer made of flesh. This view is the result of both the success of the digital computer and of the symbolic school of artificial intelligence, where rule processing and symbol manipulation were equated with intelligence. However, neural networks present an alternative model based on the massive parallelism and the pattern recognition abilities of the human brain. Neural networks share much more with the architecture and our current understanding of how people learn and make decisions than with the conventional digital computing model. Neural networks learn from examples. They take complex, noisy data and make educated guesses based on what they have learned from the past.
Given the opportunity to learn from databases of historical data, neural networks are a natural technology for this type of application. More than just a new computing architecture, neural networks offer a completely different paradigm for solving problems with computers. In my example, I showed how a knowledge worker learns to do her job by working through examples and getting feedback on her performance. This approach was contrasted with how computers are conventionally used to solve these kinds of problems. The neural network learning process is quite similar to how people work and learn.

The process of teaching a neural network is to use feedback to adjust individual connections, which in turn affect the output or answer predicted. The neural processing element combines all of the inputs to it and produces an output, which is essentially a measure of the match between the input pattern and its connection weights. When hundreds of these neural processors are combined, we have the ability to solve difficult problems such as credit scoring.

Many of the basic functions performed by neural networks are mirrored by human abilities. These include making distinctions between items (classification), dividing similar things into groups (clustering), associating two or more things (associative memory), learning to predict outcomes based on examples (modeling or regression), being sensitive to the temporal element in predictions (time-series forecasting), and juggling multiple conflicting goals to arrive at a good-enough solution (constraint satisfaction). Neural networks are, at their core, a new way to recognize patterns in data. As a consequence, they have many applications to data mining and analysis. In the remainder of this book, I explore how neural networks can be applied to solve common business problems.

Chapter 3. Data Preparation

In this chapter, I explore the issues related to the first major step in the data mining process: data preparation. Modern database system architectures, data access, data cleansing, and data selection are described. The remainder of the chapter deals with data set management and the preprocessing required for data mining with neural networks.
Data: The Raw Material

It is certainly true that having data is a necessary prerequisite to doing data mining. However, just having the data is not, in itself, sufficient. There is first the question of whether we have enough data. Then there is the issue of whether we have clean, reliable data. Finally, and most important, there is the determination of whether we have the right data. Only someone with domain knowledge, someone who understands the data and what it means, can select the right data for a data mining operation. As we will see, this application of knowledge about the data is needed in several different ways in data mining, especially in the data preparation phase.

In most cases, the data used for a data mining operation has been sitting around collecting dust. The data was created as a by-product of performing common business transactions, stored in an operational database, and archived to tape for long-term storage. Many companies are creating corporate data warehouses, which keep the operational data online and available for extended periods of time. Some companies are creating marketing databases, which keep only information related to customer contacts and purchasing history online. Whatever historical data is available, it is the raw material for the data mining process.

A problem often arises when a data mining project is proposed and only part of the business process is computerized. Part of the process is automated, and part of it is manual, so only a portion of the data is available on the computer system. For example, in a credit approval process, most of the information is entered into the computer system, but the credit report and the appraiser's report are not. They are part of the credit applicant's paper folder. In this case, we have only half of the data; the rest is not in a machine-readable form that can be used directly for data mining. If the goal is to analyze or automate this decision-making process, we will first need to key or scan the paper documents into the computer system.
The cst of ths can be subtan- ire process and begin co- tal Analterative approach Sion sunnor arpstons, hit for ers pes of anton delanment ‘works well The key tat eventhough we do have the data, we ean fenernie Ie That, we do Have recut uf etl decison but we ea ‘rite sme rls that canbe used to generate tiring examples, which de [ne 60m of te desiedbehsvir WN Uns ppranch, we can offload 20% the casclods tram worker (Ue ney cae), ana ham foes on 20% ‘tthe eases (the hard cases). In the mean te we can stare cllecting the ‘decisions they make and afer Update the neural network 80H can cover "The daa used ina data mining projet might be stored ether ina at le rina cuabae, ven when tects vac store a dla, iso {en dumped aft fe fr processing by the data mining lgorthns. Tis J sometimes dove for simply, as an easy way of handing off data toa ‘onan or hed party who i atl going todo the dts ining In ‘ther eats, thi Is done to avoid the petfontance peas, which are ‘Sometime pid when erating through age relational attest. One se that avers when loge data ste hae ta he eons i ensuring hat there enough dk pace forall othe preprocessed da. nthe nex 8 tl, we bok te ln fests elatnaldtabage ater and sore ofthe performance issues, Modern Database Systems, ‘computer sytem snot considered complete today without having some feet of dette capi Although data con be sored in pln text oe ‘aay enough, any gnc data pressing activity where sins rans- am Pmpacaton 65 ‘ucsions are processed, red, and updated today use relational sabes. In the pact, Nerareicaldstabaes ar ae IRM IMS. oF network eats, ‘ch as CODASML were used to store busines data. The advent of objec frlented programsarg nguages has prompted the development fw" frm of dnahnee cyto had on the object paradian. Oe databases have become very popula and they are growing quite rapy. 
But even though much data still resides in the hierarchical and network databases on mainframe computers, and use of object databases is growing, the information technology world today relies largely on relational database technology and will for the foreseeable future. Relational databases such as IBM DB2, Oracle, and Sybase all treat data as sets of rectangular tables (relations), each consisting of rows of fixed columns. A relational database, no matter how large, can always be considered to be a large table of data containing rows and columns. The relational algebra first defined by Codd and Date specifies a set of logical operations that can be used to select rows, extract columns, and join two or more relational tables together in order to get the desired view of the data. This manipulation of relational data has been standardized in the industry through the Structured Query Language, SQL. Business application programming languages such as COBOL and RPG support SQL interfaces to relational databases. Even client-based graphical query tools end up ultimately issuing SQL statements against the database.

Accessing data in a database consists of selecting columns of data from either all records or from records containing specific values or ranges. Figure 3.1 illustrates a simple selection of rows from a relational table, where a query that asks for records with a particular name and Age > 18 retrieves just the matching rows from the table. The result of SQL operations on relational database tables is another table. This table can then be accessed either by using key fields or sequentially, by using a cursor to walk through the table. Thus, when all is said and done, the data stream read from a relational database is a single record at a time, consisting of a set of columns or fields.

As the amount of data generated by businesses every day has grown, the relational databases that store this data have also needed to grow. Individual relational databases can be staggering in size. In 1994, 2% of the world's databases ranged in size from 11 to 100 gigabytes, while 10% of Oracle databases were also in that range (Ovum 1994). Almost 30% of the DB2 databases on mainframe computers were larger than 80 gigabytes. Depending on the complexity of the layout of the tables, a query to retrieve information from a set of tables might take anywhere from seconds to minutes, or even hours or days, to complete. As business users' focus has turned to mining the business data, demands on database performance have increased to such an extent that much of the differentiation between commercial database vendors today is in the area of performance.

Parallel Databases

As the quest for better performance has continued, database vendors have turned to parallel hardware configurations in order to provide the needed transaction processing and query response times. These parallel database architectures come in two major forms: symmetric multiprocessing (SMP), or tightly coupled, systems, and shared-nothing, or loosely coupled, systems. In the following sections, we explore these two popular parallel database architectures.

SMP database architectures

Symmetric multiprocessing database systems use multiple processors sharing memory in the same computer system to process queries in parallel. All of the processors in the system can access all of the data on the hard disks, and they also share the system memory. In an SMP system, a single query is split into pieces, and these pieces are sent off to the individual processors in the system. Figure 3.2 shows a task or application running on a four-way AS/400 using the SMP version of DB2 for OS/400. Each processor can work on its part of the problem, accessing the shared disks to retrieve the data and returning its partial result. These partial results then have to be combined to return the overall result of the query to the requesting application program.

One major problem with SMP database systems is that the shared memory becomes a system bottleneck. This is because each processor, while able to work independently on its piece of the problem, still has to wait whenever the main memory is being accessed by any of the other processors in the system. SMP is usually effective for small numbers of processors, scaling up to about 16.
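The divide-and-combine pattern described for SMP queries can be mimicked in a few lines of Python. This is a toy sketch of my own (threads standing in for processors, a list of dictionaries standing in for a table), not anything from DB2 itself:

```python
from concurrent.futures import ThreadPoolExecutor

# Toy sketch of the SMP query pattern described above: one query is
# split into pieces, each worker computes a partial result over its
# slice of the data, and the partial results are combined at the end.
rows = [{"age": a} for a in range(100)]            # stand-in for a table

def partial_count(chunk):
    """One processor's piece of the query: count rows with age > 18."""
    return sum(1 for r in chunk if r["age"] > 18)

slices = [rows[i::4] for i in range(4)]            # four "processors"
with ThreadPoolExecutor(max_workers=4) as pool:
    partials = list(pool.map(partial_count, slices))
print(sum(partials))                               # combined answer: 81
```

The final summation is the step the text describes as combining partial results into the overall answer returned to the application.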
Shared-nothing database architectures

Because of the scaling problems of SMP database systems, vendors have also developed loosely coupled, or shared-nothing, database systems. In a shared-nothing architecture, each processor has its own disk and its own memory, so that memory is no longer a system bottleneck. Figure 3.3 depicts an example of a relational database in a shared-nothing configuration. The individual relational database table is split across three nodes in a three-way shared-nothing system.

Or we can decide to represent yes, no, and unknown as three distinct values. All of the representations are valid; which one is required depends on the application. The trade-off is in network size (the number of inputs) versus the amount of information packed into each input.

For symbolic data representing unrelated discrete values, we simply map the symbols onto integers from 1 to N. For example, {apples, peaches, pumpkin, pie} map to 1, 2, 3, 4. Of course, we would probably then scale this to the range 0 to 1, so that apples = 0.0, peaches = 0.3, pumpkin = 0.6, and pie = 1.0. In essence, we have the same representation options as discussed previously in the numeric discrete case, but each unique symbol must first be mapped onto a unique numeric value (see Figure 3.5). Depending on the application, we might or might not want to treat symbols with mismatched case (apple versus Apple, for example) as different symbols.

For symbolic data representing related values, such as {good, better, best}, we might first want to map the symbols onto consecutive integers and use a data representation that preserves this ranking information, such as a thermometer code or a simple linear scale. For symbolic data that is in fact continuous in nature, things are more complex; for example, we might want to be sensitive to differences of a single character in a text string.

Data Representation Impact on Training Time

The data representation you choose affects whether the neural network can learn the relationship you are trying to teach it. However, there is usually a set of possible data representations that are sufficient to train a network. In all cases, it is important to be aware of how your data representation decisions will impact both the training time for the neural network and the accuracy obtained. In general, the more explicit the data representation, the easier it will be for the neural network to learn.
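As a concrete illustration, here is a hypothetical Python fragment (mine, not the book's) showing two of the encodings discussed above: a one-of-N vector code and the single-input linear scaling. The fruit categories follow the example in the text, though this simple index-based scaling yields steps of 0, 1/3, 2/3, and 1 rather than the rounded 0.3 and 0.6 values:

```python
# Illustrative sketch of two discrete-value encodings discussed above.
def one_of_n(value, categories):
    """One-of-N: one input unit per category, a single 1.0 at the match."""
    return [1.0 if c == value else 0.0 for c in categories]

def scaled_code(value, categories):
    """Single-input alternative: map each category onto the range 0 to 1."""
    return categories.index(value) / (len(categories) - 1)

fruits = ["apples", "peaches", "pumpkin", "pie"]
print(one_of_n("peaches", fruits))     # [0.0, 1.0, 0.0, 0.0]
print(scaled_code("peaches", fruits))  # 1/3, i.e. about 0.333
```

The one-of-N form uses four network inputs for this variable; the scaled form uses one, at the cost of packing all four categories into a single signal.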
For example, taking a discrete variable and using a one-of-N vector code will typically train the fastest. However, the cost is that you are adding inputs, and therefore additional weights, to the network. Again, in general, the larger the network in terms of processing units and connection weights, the less well it will generalize and the longer it will take to train. Taking the same discrete variable and mapping it onto a single input, where each discrete value is represented by a difference of 1 in the input signal, creates a more compact representation. A smaller neural network, and one that generalizes better, is likely to result. However, it will take the neural network a lot longer to adjust its weights to detect the significant difference that indicates a completely unique value for that input variable.

Managing Training Data Sets

A very important aspect of using neural networks for data mining and application development is how to manage your raw material, the historical data. The most common approach is to randomly divide the source data into two or more data sets. One subset of the data is used to train the neural network, and another subset is used to test the accuracy of the neural network. It is important to ensure that the neural network never "sees" the test data while it is in training mode. That is, it never learns or adjusts its weights using the test data. Some people split out a third subset of data, called validation data, that is withheld even from the developer of the neural network model (not that anyone would cheat). In this three-subset scenario, the developer trains and tests the neural network model, and then a third party independently tests the network using the validation data. Figure 3.6 shows a typical split of data sets for neural network training. Multiple input files or databases are combined in the data preparation step to create the source data set. This data set is then split into a training set, a test set, and a validation set.

There are some cases when this usual method is not appropriate. One is when the data is temporal or time-series in nature. This data must be used in continuous temporal sequences in order to maintain the information it contains. Random selection from this data set would be catastrophic.
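The random division into training, test, and validation sets might be sketched like this; the 70/20/10 proportions and the fixed seed are arbitrary choices, and, as just noted, this approach must not be applied to time-series data:

```python
import random

def split_data(records, train=0.7, test=0.2, seed=42):
    # Shuffle a copy of the source data, then carve off the three subsets.
    rows = list(records)
    random.Random(seed).shuffle(rows)
    n_train = int(len(rows) * train)
    n_test = int(len(rows) * test)
    return (rows[:n_train],                      # training set
            rows[n_train:n_train + n_test],      # test set
            rows[n_train + n_test:])             # validation set (held out)
```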
In this case, it is typical to use data from an earlier time period for training and the most recent data for the testing and validation phases. Another case is when there is not sufficient data to allow random sampling to reasonably provide a representative sample of the data population. In this case, statistical techniques might be required to ensure that both the training and test data sets contain a representative sample of the data.

Figure 3.6 Splitting the source data into training, test, and validation sets.

Data Quantity

Since data is the most important ingredient in data mining, ensuring that we have enough of our raw material is crucial. In most applications, the amount of data is at a premium, and several techniques must be used to make the most of the data we have.

A rough rule of thumb with neural networks is that you need two data points for each connection in the neural network. So a back propagation network with 10 inputs, 5 hidden units, and 5 outputs would need approximately 2 x ((10 x 5) + (5 x 5) + 5 + 5) = 170 training examples to be able to train accurately. In practice, many successful neural network applications have been developed using less data than this rule of thumb suggests.

When using real data to train a neural network, it is typical to have 95% of the examples representing "normal" conditions and only a small percentage of examples for the class we really want to detect (i.e., "bad" customers or "abnormal" operating conditions). One technique to increase that percentage is to simply duplicate the training examples that contain the underrepresented class of training patterns. Another technique is to take the small number of interesting cases and jitter them by injecting small amounts of random noise into the input values, and then use these inputs as additional training cases. Another option is to create the additional training examples by ...

Data Quality: Garbage In, Garbage Out

In addition to the management of the data, a major concern in neural network data mining is the quality of the data. Most databases contain incomplete and inaccurate data. Depending on the amount of data available, you might choose to ignore any records with bad fields. However, in many cases you will have little data available, and so you will have to try to salvage the data by supplying values for missing fields.
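A minimal sketch of one such salvage step, substituting the mean of the observed values for a missing numeric field (the records are assumed to be dictionaries, with None marking a missing value):

```python
def impute_mean(records, field):
    # Compute the mean over the records where the field is present ...
    observed = [r[field] for r in records if r[field] is not None]
    mean = sum(observed) / len(observed)
    # ... and substitute it wherever the field is missing.
    for r in records:
        if r[field] is None:
            r[field] = mean
    return records
```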
The most common technique is to set the field to the mean or median value if it is numeric, or to the mode if it is a discrete variable.

As in traditional statistical analysis, outliers are a concern. A single record with a value one or two orders of magnitude larger or smaller than the rest of the data set can severely impact the performance of a neural network model. A cursory scan of the range of each variable, or a simple scatter plot, can usually identify extreme cases.

Neural network data mining, like the other techniques, is highly dependent on the quality and quantity of data. If ever there was a system where GIGO was the rule (garbage in, garbage out), neural networks is it. They are highly forgiving of noisy and incomplete data, but they are only as good as the data they are trained with. See the paper by Cortes, Jackel, and Chiang (1995) for an excellent lesson on the effect of bad data on learning.

Summary

While the goal of data mining is to extract valuable information from data, it is an undeniable fact that the quality of the resulting analysis is tied to the quantity and quality of the data being mined. Data might be generated by transaction processing programs, might be entered into databases from existing manual or paper-based processes, or might even be generated by domain experts. In any case, it is important that missing and out-of-range values are scrubbed in the data preprocessing phase. The data might be stored in flat files or in databases, or both. Often data has to be selected and combined from several sources before data mining begins.

The majority of data used in data mining resides in relational database systems. We briefly discussed the relational data model of rows or tuples and columns or fields. The Structured Query Language (SQL) is the primary interface for manipulating data in relational databases. When we get to the extremely large gigabyte and terabyte data sets, then the performance of the relational database system becomes more important.
The two primary methods for performance scaling used today are symmetrical multiprocessing (SMP) and the shared-nothing or loosely coupled systems. Both parallel architectures provide improved performance for data access, but the shared-nothing architecture is more scalable and will be required for the larger data warehouse systems.

Data selection and preprocessing are extremely important to neural network data mining. Experienced application consultants estimate that anywhere from 50% to 75% of the development time is spent working with the data before it is ever shown to a neural network. Thus, powerful data selection tools, data cleansing, and preprocessing operations are essential to effective data mining.

The basic data types used for mining are categorical data, discrete numeric data, and continuous numeric data. Symbols can be turned into discrete numeric data through mapping functions or through symbol table lookups. Numeric data, in turn, can be represented as coded data types such as one-of-N, thermometer, or regular binary codes. Continuous data can be scaled, thresholded, and discretized. Symbolic data can be mapped into different levels of abstraction by using taxonomies. Deciding which representation is best is usually a job for the domain expert who does the data preparation. Understanding the semantics of the data is crucial for selecting the appropriate data representations. These decisions have a major impact on the performance of the neural network, in terms of training time, time required to process results, and how well the network generalizes.

References

Cortes, C., L.D. Jackel, and W.-P. Chiang. 1995. Limits on learning machine accuracy imposed by data quality. Proceedings of the First International Conference on Knowledge Discovery and Data Mining. AAAI Press.

Date, C.J. An Introduction to Database Systems, Vol. 1 (6th edition). Addison-Wesley.

Chapter 4

Neural Network Models and Architectures

There are many different types of neural network models or paradigms. At every neural network conference, literally hundreds of variations will be presented. Consequently, after you have decided to use neural networks to do data mining, your next decision is "Which neural network model should I use?"
This chapter explores the most popular neural network models in terms of their learning approaches, their network topologies, and their processing functions and capabilities.

The Basic Learning Paradigms

Perhaps the most useful way to categorize the different neural network models is by the basic learning paradigm or approach they use. The three main learning paradigms are supervised, unsupervised, and reinforcement learning. Unsupervised learning is often used for clustering and segmentation in data mining and for decision support. Reinforcement learning, though used less frequently than the other methods today, has applications in optimization over time and adaptive control.

Before we begin the detailed discussion, let me relate these three learning paradigms to situations we are familiar with. Supervised learning is like trying to learn a new task with help from your mother. After every attempt you make to solve the problem, you have a very attentive teacher who gives you specific, immediate feedback on how well you did. Unsupervised learning is like being given a stack of documents, each one unmarked, and being told to organize them into a filing scheme from scratch. Reinforcement learning is the most like real life. It is like having a job. You are given a sequence of tasks requiring decisions, and at some point down the line you are given a performance appraisal. You are told whether you are doing well or not, but it is up to you to figure out which decisions were right and which were wrong.

Supervised learning

In supervised learning, after each attempt the teacher might say, "Oh, no, you did it wrong," and indicate what the answer should be. Now we have something to go on.
The learning algorithm takes the difference between the correct or desired output and the actual prediction the neural network made, and the algorithm uses that information to adjust the weights of the neural network so that next time, the prediction will be closer to the correct answer. Unlike people, who usually don't have to be shown the same problem over and over before they get the idea, neural networks are somewhat slow learners. They must be shown the examples tens, hundreds, or even thousands of times before they can accurately predict the correct answer to some complex problems.

Figure 4.1 Supervised learning: the weights are adjusted using the error (the desired minus the actual output).

Supervised learning is used when you have a database of examples that contains both the problem statements and the answers. Now you might say, "What if I already know the answer? What is the neural network doing that I can't do already?" It can learn how the inputs and outputs are related. It can learn to look at problems the same way you do and make similar decisions. It can look at and learn from hundreds or thousands of examples produced by the best performers in your organization. Moreover, it can learn without a programmer having to write rules such as "if this, then do this; else do that; and so on." In a relatively automated process, a neural network can turn a heap of data into a decision support system! Turning data into information: isn't that the promise of data mining at its most powerful and best?

Supervised learning is a useful approach for using neural networks to perform classification, function approximation or modeling, and time-series forecasting, where the task is to predict the value at some point in the future. It is especially useful in problems where training data in the form of input/output examples is available, and where it is hard to see the relationship between the inputs and the outputs. Despite the amazing things that can be learned from data using statistical and other mathematical analysis, many problems have complex relationships between multiple variables, and for them a formal mathematical function or model cannot be easily derived. Another type of problem for which supervised neural networks are ideal is the case when the problem itself changes over time. If we are trying to control a manufacturing process that changes due to variable weather or machine tool wear, then a neural network can be used to model and adapt to these changing conditions.
Unsupervised learning

Unsupervised learning is used in cases where we have lots of data, but we don't know the answer. We don't know the answer, but we do know the question. (If we don't know the question, we might as well quit right now.) The question is, How are these data related? Which items are similar or different, and in what way? In effect, we want the neural network to look at the patterns of data and cluster them so that similar patterns are put into the same cluster (see Figure 4.2). The neural network using unsupervised learning can perform this task with great precision. Of course, you have to represent the data correctly so that the neural network can discriminate between important differences in the data and unimportant ones. Once the partitioning is done, we'll need to do some analysis of the network to get a complete application (we'll talk about that later). This clustering approach is quite a useful function, as we will see.

Neural networks that learn using unsupervised methods are called self-organizing because they receive no direction on what the desired "correct" output should be. When presented with a sequence of input patterns, the output processing units self-organize by competing to recognize the pattern, with the winner adjusting its connection weights toward the input.

Figure 4.2 Unsupervised learning: the output units compete, and the winner moves toward the input pattern.

Over time, an unsupervised network evolves so that each output unit is sensitive to, and will recognize, inputs from a specific portion of the input space.

Reinforcement learning

The third major neural network training paradigm is called reinforcement learning. In reinforcement learning, we have examples of the problem or task, but we do not have the exact answers, or at least not immediately (see Figure 4.3). For example, let's say we are playing a game. We make a move, the opponent makes a move, we make a move, and so on. After 10 or 20 moves, we win or we lose. Now this is our reinforcement signal. We made a series of decisions, and only at the end do we find out whether they were right or wrong. The neural network reinforcement learning approach allows very difficult temporal (time-dependent) problems to be solved. In some respects, reinforcement learning is the most natural form of learning.
For that reason, it is also one of the hardest to use to solve problems. If exact feedback information is available, then supervised training will almost always be faster and more accurate than reinforcement learning. However, when the problem involves some time-sequential process, or when the exact feedback is not available and only a secondary signal can be obtained, then reinforcement learning is an appropriate technique to use. Researchers have shown that neural network models that use reinforcement learning are performing a kind of optimization similar to dynamic programming (Sutton 1988). This approach allows optimal strategies to be derived in economic and control applications.

Figure 4.3 Reinforcement learning: the weights are adjusted using a delayed reinforcement signal.

Neural Network Topologies

The arrangement of neural processing units and their interconnections can have a profound impact on the processing capabilities of the neural networks. In general, all neural networks have some set of processing units that receive inputs from the outside world, which we refer to, appropriately, as the "input units." Many neural networks also have one or more layers of "hidden" processing units that receive inputs only from other processing units. A layer or slab of processing units receives a vector of data or the outputs of a previous layer of units and processes them in parallel. The set of processing units that represents the final result of the neural network computation is designated the "output units." There are three major connection topologies that define how data flows between the input, hidden, and output processing units. These main categories, feedforward, limited recurrent, and fully recurrent networks, are described in detail in the next sections.

Feedforward networks

Feedforward networks are used in situations when we can bring all of the information to bear on a problem at once, and we can present it to the neural network. It is like a pop quiz, where the teacher walks in, writes a set of facts on the board, and says, "OK, tell me the answer." You must take the data, process it, and "jump to a conclusion." In this type of neural network, the data flows through the network in one direction, and the answer is based solely on the current set of inputs.
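What one processing unit in such a network computes, a weighted sum of its inputs plus a threshold, squashed through the S-shaped sigmoid activation, can be sketched as:

```python
import math

def sigmoid(x):
    # The logistic activation: maps any input into the range 0 to 1.
    return 1.0 / (1.0 + math.exp(-x))

def unit_output(inputs, weights, threshold):
    # Combine the incoming signals, modulated by the connection weights,
    # with the unit's threshold weight, then apply the activation function.
    total = sum(i * w for i, w in zip(inputs, weights)) + threshold
    return sigmoid(total)
```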
In Figure 4.4, we see a typical feedforward neural network topology. Data enters the neural network through the input units on the left. The input values are assigned to the input units as the unit activation values. The output values of the units are modulated by the connection weights, either being magnified if the connection weight is greater than 1.0, or being diminished if the connection weight is between 0.0 and 1.0. If the connection weight is negative, the signal is magnified or diminished in the opposite direction.

Figure 4.4 A feedforward neural network.

Each processing unit combines all of the input signals coming into the unit along with a threshold value. This total input signal is then passed through an activation function to determine the actual output of the processing unit, which in turn becomes the input to another layer of units in a multilayer network. The most typical activation function used in neural networks is the S-shaped or sigmoid (also called the logistic) function. This function converts an input value to an output ranging from 0 to 1.0. The effect of the threshold weights is to shift the curve right or left, thereby making the output value higher or lower, depending on the sign of the threshold weight.

As shown in Figure 4.4, the data flows from the input layer through zero, one, or more succeeding hidden layers and then to the output layer. In most networks, the units from one layer are fully connected to the units in the next layer. However, this is not a requirement of feedforward neural networks. In some cases, especially when the neural network connections and weights are constructed from a rule or predicate form, there could be fewer connection weights than in a fully connected network. There are also techniques for pruning unnecessary weights from a neural network after it is trained. In general, the fewer weights there are, the faster the network will be to train, and the better it will generalize to unseen inputs. It is important to remember that "feedforward" is a definition of connection topology and data flow. It does not imply any specific type of activation function or training paradigm.

Limited recurrent networks

Recurrent networks are used in situations where we have current information to give the network,
but the sequence of inputs is important, and we need the neural network to somehow store a record of the prior inputs and factor them in with the current data to produce an answer. In recurrent networks, information about past inputs is fed back into and mixed with the inputs through recurrent or feedback connections for the hidden or output units (see Figure 4.5).

Two major architectures for limited recurrent networks are widely used. Elman (1990) suggested allowing feedback from the hidden units to a set of additional inputs called context units. Earlier, Jordan (1986) described a network with feedback from the output units back to a set of context units. This form of recurrence is a compromise between the simplicity of a feedforward network and the complexity of a fully recurrent neural network because it still allows the popular back propagation training algorithm (described in the following) to be used.

Fully recurrent networks

Fully recurrent networks, as their name suggests, provide two-way connections between all of the processors in the neural network. A subset of the units is designated as the input processors, and they are assigned or clamped to the specified input values. The data then flows to all adjacent connected units. Figure 4.6 shows the input units feeding into both the hidden units (if any) and the output units. The activations of the hidden and output units are then recomputed until the neural network stabilizes. At this point, the output values can be read from the output layer of processing units.

Fully recurrent networks are complex, dynamical systems, and they exhibit all of the power and instability associated with the limit cycles and chaotic behavior of such systems.
Unlike feedforward network variants, which have a deterministic time to produce an output value (that is, the time for the data to flow through the network), fully recurrent networks can take an indeterminate amount of time.

In the best case, the neural network will reverberate a few times and quickly settle into a stable, minimal energy state. At this time, the output values can be read from the output units. In less optimal circumstances, the network might cycle quite a few times before it settles into an answer. In worst cases, the network will fall into a limit cycle, visiting the same set of answer states over and over without ever settling down. Another possibility is that the network will enter a chaotic pattern and never visit the same output state twice.

Figure 4.6 A fully recurrent neural network.

By placing some constraints on the connection weights, we can ensure that the network will enter a stable state: the connections between units must be symmetrical. Fully recurrent networks are used primarily for optimization problems and as associative memories. A nice attribute with optimization problems is that, depending on the time available, you can choose to take the recurrent network's current answer or wait a longer time for it to settle into a better one. This behavior is similar to the performance of people on certain tasks.

Neural Network Models

As mentioned earlier, the combination of topology, learning paradigm, and learning algorithm defines a neural network model. There is a wide selection of popular neural network models. For data mining, perhaps the back propagation network and the Kohonen feature map are the most popular. However, there are many different types of neural networks in use. Some are optimized for fast training, others for fast recall of stored memories, others for computing the best possible answer regardless of training or recall time. But the best model for a given application or data mining function depends on the data and the function required.

The discussion that follows is intended to provide an intuitive understanding of the differences between the major types of neural networks. No details of the mathematics behind these models are provided. As mentioned in the preface, there are already a large number of textbooks that describe the mathematical derivations in considerable detail. Wasserman's books provide a good introduction to neural network theory (1989, 1993). Hertz, Krogh, and Palmer (1991) give one of the most comprehensive treatments of the literature and the mathematics associated with the models. If you crave a dose of calculus, these books will not disappoint. But in keeping with the goal of writing a book for an information processing and business audience, the following discussion uses words and graphics, even when a formula might clarify the point for some.

Back propagation networks

A back propagation neural network uses a feedforward topology, supervised learning, and the (what else?) back propagation learning algorithm. This algorithm was responsible, in large part, for the reemergence of neural networks in the mid-1980s. Rumelhart, Hinton, and Williams (1986), working as part of the Parallel Distributed Processing group of neural network researchers, popularized the back propagation algorithm (which they called the generalized delta rule) with their solid mathematical derivation and simple examples of the use of the algorithm. Their simple rebuttal of Minsky and Papert's criticism of neural networks' inability to learn simple problems, such as the exclusive-OR logic function, showed that a major limitation of neural networks had been overcome.

Back propagation is a general-purpose learning algorithm. It is powerful but also expensive in terms of computational requirements for training.
A back propagation network with a single hidden layer of processing elements can model any continuous function to any degree of accuracy (given enough processing elements in the hidden layer). There are literally hundreds of variations of back propagation in the neural network literature, and all claim to be superior to "basic" back propagation in one way or another. Indeed, since back propagation is based on a relatively simple form of optimization known as gradient descent, mathematically astute observers soon proposed modifications using more powerful techniques such as conjugate gradient and Newton's methods. (See Wasserman, 1993, for a discussion of some of the many variations of back propagation.) However, "basic" back propagation is still the most widely used variant. Its two primary virtues are that it is simple and easy to understand, and it works for a wide range of problems.

The basic back propagation algorithm consists of three steps (see Figure 4.7). The input pattern is presented to the input layer of the network. These inputs are propagated through the network until they reach the output units. This forward pass produces the actual or predicted output pattern. Because back propagation is a supervised learning algorithm, the desired outputs are given as part of the training vector. The actual network outputs are subtracted from the desired outputs and an error signal is produced. This error signal is then the basis for the back propagation step, whereby the errors are passed back through the neural network by computing the contribution of each hidden processing unit and deriving the corresponding adjustment needed to produce the correct output. The connection weights are then adjusted and the neural network has just "learned" from an experience.

Figure 4.7 Back propagation training, controlled by the learn rate and error tolerance parameters.

As mentioned earlier, back propagation is a powerful and flexible tool for data mining and analysis. Suppose you want to do linear regression. A back propagation network with no hidden units can be used to build a simple linear regression model relating multiple input parameters to multiple outputs or dependent variables. This type of back propagation network actually uses an algorithm called the delta rule, first proposed by Widrow and Hoff (1960).
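The three steps, forward pass, error signal, and backward weight adjustment, can be sketched on the exclusive-OR problem mentioned earlier. This is a minimal illustration with an arbitrary learning rate, seed, and network size, not a production implementation:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_xor(epochs, n_hidden=3, lr=0.5, seed=1):
    """Basic back propagation on a tiny 2-input, 1-output network.
    Returns the sum-squared error over the four XOR training patterns."""
    rnd = random.Random(seed)
    # w_h[i][j]: weight from input i (two inputs plus a threshold) to hidden unit j
    w_h = [[rnd.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(3)]
    # w_o[j]: weight from hidden unit j to the output; the last entry is the threshold
    w_o = [rnd.uniform(-1, 1) for _ in range(n_hidden + 1)]
    data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]

    def forward(x):
        h = [sigmoid(x[0] * w_h[0][j] + x[1] * w_h[1][j] + w_h[2][j])
             for j in range(n_hidden)]
        out = sigmoid(sum(h[j] * w_o[j] for j in range(n_hidden)) + w_o[-1])
        return h, out

    for _ in range(epochs):
        for x, target in data:
            h, out = forward(x)                        # 1. forward pass
            d_out = (target - out) * out * (1 - out)   # 2. error signal
            d_h = [d_out * w_o[j] * h[j] * (1 - h[j]) for j in range(n_hidden)]
            for j in range(n_hidden):                  # 3. backward pass
                w_o[j] += lr * d_out * h[j]
                w_h[0][j] += lr * d_h[j] * x[0]
                w_h[1][j] += lr * d_h[j] * x[1]
                w_h[2][j] += lr * d_h[j]               # hidden threshold weight
            w_o[-1] += lr * d_out                      # output threshold weight
    return sum((t - forward(x)[1]) ** 2 for x, t in data)
```

After training, the error over the four patterns is lower than at the start, which is the "learning from experience" the text describes.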
Adding a single layer of hidden units turns the linear neural network into a nonlinear one, capable of performing multivariate logistic regression, but with some distinct advantages over the traditional statistical technique. Using a back propagation network to do logistic regression allows you to model multiple outputs at the same time. Confounding effects from multiple input parameters can be captured in a single back propagation network.

Back propagation neural networks can be used for classification, modeling, and time-series forecasting. For classification problems, the input attributes are mapped to the desired classification categories. The training of the neural network amounts to setting up the correct set of discriminant functions to correctly classify the inputs. For building models or function approximation, the input attributes are mapped to the function output. This could be a single output, such as a pricing model, or it could be a complex model with multiple outputs, such as trying to predict two or more functions at once.

Time-series forecasting can be accomplished with back propagation networks through a technique known as the "sliding window." Input data from a period of time is presented to the neural network, and the desired output is the function value at the next time period. Various time relationships can be modeled using this method. For example, the neural network could be trained to predict the next output in the sequence, or the output three or four steps in the future. This technique was used by Sejnowski in his NetTalk experiment, where a sliding window of text was used to predict phonemes for input to a speech synthesizer (1987).
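Building sliding-window training pairs from a series might look like this (the window and horizon sizes are arbitrary):

```python
def sliding_windows(series, window=3, horizon=1):
    # Each window of past values is an input pattern; the desired output is
    # the series value `horizon` steps beyond the window.
    pairs = []
    for i in range(len(series) - window - horizon + 1):
        pairs.append((series[i:i + window], series[i + window + horizon - 1]))
    return pairs
```

For the series 1..6 with a window of 3, this yields ([1, 2, 3], 4), ([2, 3, 4], 5), and ([3, 4, 5], 6); raising the horizon trains the network to predict several steps ahead instead of the very next value.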
Two major learning parameters are used to control the training process of a back propagation network. The learn rate is used to specify whether the neural network is going to make major adjustments after each learning trial or only minor adjustments. Momentum is used to control possible oscillations in the weights, which could be caused by alternately signed error signals. While most commercial back propagation tools provide anywhere from 1 to 10 or more parameters for you to set, these two will usually produce the most impact on the neural network training time and performance.

Kohonen feature maps

Kohonen feature maps are feedforward networks that use an unsupervised training algorithm and, through a process called self-organization, configure the output units into a topological or spatial map. Kohonen (1988) was one of the few researchers who continued working on neural networks and associative memory even after they lost their cachet as a research topic in the 1960s. His work was reevaluated during the late 1980s, and the utility of the self-organizing feature map was recognized. Kohonen has presented several enhancements to this model, including a supervised learning variant known as Learning Vector Quantization (LVQ).

A feature map neural network consists of two layers of processing units, an input layer fully connected to a competitive output layer. There are no hidden units. When an input pattern is presented to the feature map, the units in the output layer compete with each other for the right to be declared the winner. The winning output unit is typically the unit whose incoming connection weights are the closest to the input pattern (in terms of Euclidean distance). Thus the input is presented and each output unit computes its closeness or match score to the input pattern. The output unit that is deemed closest to the input pattern is declared the winner and so earns the right to have its connection weights adjusted. The connection weights are moved in the direction of the input pattern by a factor determined by a learning rate parameter.
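The competitive step just described (find the output unit whose weight vector is closest to the input in Euclidean distance, then move that unit's weights toward the input by a learning-rate factor) can be sketched as follows; for simplicity only the winner is updated here:

```python
def find_winner(pattern, weights):
    # Winner: the output unit whose weight vector is closest
    # (in squared Euclidean distance) to the input pattern.
    def dist2(w):
        return sum((p - wi) ** 2 for p, wi in zip(pattern, w))
    return min(range(len(weights)), key=lambda j: dist2(weights[j]))

def update_winner(pattern, weights, winner, learn_rate=0.3):
    # Move the winning unit's weights toward the input pattern.
    weights[winner] = [w + learn_rate * (p - w)
                       for p, w in zip(pattern, weights[winner])]
```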
This is the basic nature of competitive neural networks.

The Kohonen feature map creates a topological mapping by adjusting not only the winner's weights, but also the weights of the adjacent output units in close proximity, the neighborhood of the winner. So not only does the winner get adjusted, but the whole neighborhood of output units gets moved closer to the input pattern. Starting from randomized weight values, the output units slowly align themselves such that when an input pattern is presented, a neighborhood of units responds to the input pattern. As training progresses, the size of the neighborhood radiating out from the winning unit is decreased. Initially large numbers of output units will be updated, and later on smaller and smaller numbers are updated, until at the end of training only the winning unit is adjusted. Similarly, the learning rate will decrease as training progresses, and in some implementations the learn rate decays with the distance from the winning output unit.

Looking at the feature map from the perspective of the connection weights, the Kohonen map has performed a process called vector quantization or code book generation in the engineering literature. The connection weights represent a typical or prototype input pattern for the subset of inputs that fall into that cluster. The process of taking a set of high-dimensional data and reducing it to a set of clusters is called segmentation. The high-dimensional input space is reduced to a two-dimensional map. If the index of the winning output unit is used, it essentially partitions the input patterns into a set of categories or clusters.

From a data mining perspective, two sets of useful information are available from a trained feature map. Similar customers, products, or behaviors are automatically clustered together or segmented, so that marketing messages can be targeted at homogeneous groups. And the connection weights of each cluster define the typical attributes of an item that falls into that segment. This information lends itself to immediate use for evaluating what the clusters mean (see Figure 4.8). When combined with
When combined with ‘Soroprints risazation tools andor analysis of both the poplation and Peas — ing Free (Geshe Aetua ett perc Ostpat “— Died Oueput center nd segment statics, the makeup ofthe segments identified by the feature map canbe analyed and tamed it valuable business meyer Recurrent back propogtion ecurent backpropagation is asthe name suggests, a back propagation petwork mith feehac op recurrent eonneccors.1ypely, Ui feck Iimied tater the hidden aver units rte ouput units In either cong ‘uation, adlingSedck rom the activation of outputs from the prior pat- tem intunes hind of memory Uy Whe proces. Ths adding recurrent ‘connections toa back propagation network enhances it by to Team {emporal Sequences without fundamental changing te trang proces. Recurrent back propagation networks ln nor, perfor heir than regular back propagation networks on ine-eries prediction probles ett ein 0 nr etre et ie supervised trlnng algun. Tey are typically configured witha. fehiddeniger of uns wos actvaton functon teeth fetone raat hse finetons (ee Fre 49). Whe sia to tock ‘ropgaton in mary’ respects, rad basis function networks have several Sees yy ine aaron tr sarees cusepible to problems with nontation cate of Pekar ofthe also ee an Ra as ane ‘eimai tothe protic reara etorke in mang expects 14 The One ning Process Using New Newors (okaserman 1989), Popular Wy Menly aed Dasken (1905), rd bess Function networks lave proven tobe a useful neural network architecture "The major dilrence between radial tas function neworks and Yack peopauation netweri the behav ofthe gle den taper Rather han ‘hig the sgnoldel or S-shaped setitio futon as in back propegation, the iden uns in RUP networks we a Usussan or some other bas keel funtion Bach hidden mart ea aly tne processor that cones 8 ‘sore forthe match etween the input vector and is connection weights oF enters, In elect, ie bes UNAS are highly pein yates detetas. 
(Wasserman 1989). Popularized by Moody and Darken (1989), radial basis function networks have proven to be a useful neural network architecture.

The major difference between radial basis function networks and back propagation networks is the behavior of the single hidden layer. Rather than using the sigmoidal or S-shaped activation function as in back propagation, the hidden units in RBF networks use a Gaussian or some other basis kernel function. Each hidden unit acts as a locally tuned processor that computes a score for the match between the input vector and its connection weights or centers. In effect, the basis units are highly specialized pattern detectors. The weights connecting the basis units to the outputs are used to take linear combinations of the hidden units to produce the final classification or output.

Remember that in a back propagation network, all weights in all of the layers are adjusted at the same time. In radial basis function networks, however, the weights into the hidden layer basis units are usually set before the second layer of weights is adjusted. As the input moves away from the connection weights, the activation value falls off. This behavior leads to the use of the term "center" for the first-layer weights. These center weights can be computed using Kohonen feature maps, statistical methods such as K-means clustering, or some other means. In any case, they are then used to set the areas of sensitivity for the RBF hidden units, which then remain fixed. Once the hidden layer weights are set, a second phase of training is used to adjust the output weights. This process typically uses the standard back propagation training rule.

In its simplest form, all hidden units in the RBF network have the same width or degree of sensitivity to inputs. However, in portions of the input space where there are few patterns, it is sometimes desirable to have hidden units with a wide area of reception. Likewise, in portions of the input space that are crowded, it might be desirable to have very highly tuned processors with narrow reception fields. Computing these individual widths increases the performance of the RBF network at the expense of a more complicated training process.

Adaptive resonance theory

Adaptive resonance theory (ART) networks are a family of recurrent networks that can be used for clustering. Based on the work of researcher Stephen Grossberg (1987), the ART models are designed to be biologically plausible. Input patterns are presented to the network, and an output unit is declared a winner in a process similar to the Kohonen feature maps.
However, the feedback connections from the winning output encode the expected input pattern template (see Figure 4.10). If the actual input pattern does not match the expected connection weights to a sufficient degree, then the winning output unit is shut off, and the next closest output unit is declared the winner. This process continues until one of the output units' expectation is satisfied to within the required tolerance. If none of the output units wins, then a new output unit is committed with the initial expected pattern set to the current input pattern.

Figure 4.10 An adaptive resonance theory network, with the match controlled by the vigilance parameter.

The ART family of networks has been expanded through the addition of fuzzy logic, which allows real-valued inputs, and through the ARTMAP architecture, which allows supervised training. The ARTMAP architecture uses back-to-back ART networks, one to classify the input patterns and one to encode the matching output patterns. The MAP part of ARTMAP is a field of units (or indexes, depending on the implementation) that serves as an index between the input ART network and the output ART network. While the dynamics of the training process are quite complex, the network operation for recall is surprisingly simple. The input pattern is presented to the input ART network, which comes up with a winning output. This winning output is mapped to a corresponding output unit in the output ART network. The expected pattern is read out of the output ART network, which provides the overall output or prediction pattern.

Probabilistic neural networks

Probabilistic neural networks (PNN) feature a feedforward architecture and a supervised training algorithm similar to back propagation (Specht 1990). Instead of adjusting the input layer weights using the generalized delta rule, each training input pattern is used as the connection weights to a new hidden unit. In effect, each input pattern is incorporated into the PNN architecture. This technique is extremely fast, since only one pass through the network is required to set the input connection weights. Additional passes through the network might be used to adjust the output weights to fine-tune the network outputs.
Several researchers have recognized that creating a new hidden unit for each input pattern might be overkill. Various clustering schemes have been proposed to cut down on the number of hidden units when input patterns are close in input space and can be represented by a single hidden unit. Probabilistic neural networks offer several advantages over back propagation networks. Training is much faster, usually requiring only a single pass through the data. Given enough input data, the PNN will converge to a Bayesian (optimum) classifier. Probabilistic neural networks allow true incremental learning, where new training data can be added at any time without requiring retraining of the entire network. And because of the statistical basis for the PNN, it can give an indication of the amount of evidence on which it bases its decisions.

Other neural network models

While I have presented the major neural network models, there are many more that are used by various people for specific problems. The generalized regression neural network (GRNN) is a relatively new model that subsumes the functionality of RBF and PNN networks (Caudill 1994). Hopfield networks and Boltzmann machines are fully recurrent networks that are used for optimization and constraint satisfaction problems, which are not usually considered as data mining applications of neural networks (Hertz, Krogh, and Palmer 1991).

Key Issues in Selecting Models and Architecture

Selecting which neural network model to use for a particular application is straightforward if you use the following process. First, select the function you want to perform. This can include clustering, classification, modeling, or time-series approximation. Then look at the input data you have to train the network with. If the data is all binary, or contains real-valued inputs, that might disqualify some of the network architectures. Next you should determine how much data you have and how fast you need to train the network. This might suggest using probabilistic neural networks or radial basis function networks rather than a back propagation network. Table 4.1 can be used to aid in this selection process. Most commercial neural network tools should support at least one variant of these algorithms.
Our definition of architecture is the number of input, hidden, and output units. So in many cases you might select a back propagation model, but explore several different architectures having different numbers of hidden layers and/or hidden units.

Data type and quantity

In some cases, whether the data is all binary or contains some real numbers might help determine which neural network model to use.

Table 4.1 Neural network models and the data mining functions they perform.

The standard ART network (called ART 1) works only with binary data and is probably preferable to Kohonen maps for clustering if the data is all binary. If the input data has real values, then fuzzy ART or Kohonen maps should be used.

Training requirements: online or batch learning

In general, whenever we want online learning, training speed becomes the overriding factor in determining which neural network model to use. Back propagation and recurrent back propagation train quite slowly. ART, Kohonen maps, and radial basis function networks, however, train quite fast, usually in a few passes over the data.

Functional requirements

Based on the function required, some models can be disqualified. For example, ART and Kohonen feature maps are clustering algorithms. They cannot be used for modeling or time-series forecasting. If you need to do clustering, then back propagation could be used, but it will be much slower in training than ART or Kohonen maps.

Summary

Neural networks are differentiated along three major axes: the training paradigm, the topology, and the learning algorithm. The most widely used training paradigm is supervised training, where an input pattern and the corresponding output pattern are presented to the neural network. The difference between the desired and actual outputs is used to adjust the neural network weights. Unsupervised learning is used when we want the neural network to perform clustering or segmentation of the input data. Reinforcement learning is used in situations where the desired output is not known until later in the training sequence. These three training paradigms cover a wide range of application areas.
Neural networks are organized into layers of neural processing units. Most neural networks have a layer of input units, one or more layers of hidden units, and finally a layer of output units. Data can flow between the units in these layers in several ways. In feedforward networks, data comes in at the input units, flows through any hidden layers, and then flows out to the output units, where the answer appears. Limited recurrent networks have some feedback connections, which are used to provide recirculated information, or a memory, to the neural network. This is most useful in problems involving time-dependent patterns. Fully recurrent networks have bidirectional connections between all processing units. The complex dynamics allow fully recurrent networks to model extremely nonlinear functions and to solve optimization and constraint satisfaction problems. However, they can be unstable and might oscillate or fall into limit cycles.

The most popular type of neural network is the back propagation network. Back propagation networks are feedforward networks and use a supervised training method to adjust their weights. Kohonen feature maps, also known as self-organizing maps, are feedforward networks trained using unsupervised learning. Kohonen maps self-organize into topological maps, where inputs that lie close together in the input space are mapped onto adjacent output units in the neural network output layer. Recurrent back propagation is a hybrid network that uses limited recurrence and the standard supervised back propagation learning algorithm. Radial basis function networks are feedforward networks that are trained with supervised learning and have a single layer of hidden units that use a Gaussian basis function to compute the hidden unit activations. Adaptive resonance theory networks are recurrent networks that are trained using unsupervised learning. Probabilistic neural networks are supervised feedforward networks where new hidden units are allocated for each training pattern. There are many other neural network models that use different combinations of training paradigms, topologies, and learning algorithms.
The processing or data mining function required places definite constraints on which neural network models can be used for applications. Table 4.1 lists the major models and the functions that they can perform well, whether it is classification, clustering, modeling, or forecasting.

References

Carpenter, G., and S. Grossberg. 1988. The ART of adaptive pattern recognition by a self-organizing neural network. IEEE Computer 21(3).
Grossberg, S. 1987. Competitive learning: From interactive activation to adaptive resonance. Cognitive Science 11.
Hertz, J., A. Krogh, and R. Palmer. 1991. Introduction to the Theory of Neural Computation. Reading, MA: Addison-Wesley.
Specht, D. 1990. Probabilistic neural networks. Neural Networks 3.

Chapter 5

Training and Testing Neural Networks

In this chapter, I explore the issues related to training a neural network to perform a specific processing function, whether it is classification, clustering, modeling, or time-series forecasting. I describe the most important parameters used in the most popular neural network models and how they can be used to control the training process. I talk about the usual training process for both supervised and unsupervised networks, and I also discuss the management of the training data and how it impacts the training process.

Once the data preparation is complete and the neural network model and architecture have been selected, the next step is to train the neural network. Because of the large variety in the types of neural networks, this process can be very dependent on the exact neural model and the function you are trying to train the neural network to perform. Some networks require only one pass through the data, while others might require hundreds or thousands. Some networks have only a few parameters to control the training process, while others might present a bewildering set of parameters to adjust. So when someone asks how long it takes to train a neural network, the answer is "it depends." It depends on the neural network and its architecture, whether it has ten or hundreds of processing units, and whether there are hundreds or thousands of training patterns. And, of course, it depends on your application and what your definition of "trained" is.

In most cases, we want to train the network with a subset of the examples and then test the network performance with another small subset. This train-and-test approach is used to ensure that the neural network has learned the important aspects of the job it is being asked to do.

Most neural networks begin the training process with the connection weights initialized to small random values. The training control parameters are set, and the training data patterns are presented to the neural network, one after the other. As training progresses, the connection weights are adjusted, and we can monitor the performance of the network. In supervised training, we want to alternate between training and test data to ensure that the network is generalizing. For unsupervised training, we usually want to visualize the arrangement of the output units. Table 5.1 shows the major learning parameters used for the most popular neural network models.

Table 5.1 Major Learning Parameters

Parameter           Model              Purpose
Learn rate          All models         Controls the size of weight adjustments
Momentum            Back propagation   Smooths the effect of individual weight changes
Error tolerance     Back propagation   Specifies how close an output must be to count as correct
Number of epochs    Kohonen maps       Determines how long the network trains and how quickly it settles
Vigilance           ART                Controls how discriminating the network is when clustering

At some point, it might become clear that the neural network is not able to learn the function we are trying to teach. This is when the methodology I provide in this chapter will become most useful. Trial and error might seem natural, but it can consume a lot of time and money (O'Sullivan). A disciplined approach to iterative neural network development can be the difference between success and failure in a decision support or application development project.

Defining Success: When Is the Neural Network Trained?

Once you have selected a neural network model, chosen the data representations, and are all ready to start training, the next decision is "How do you know when the network is trained?" Depending on the type of neural network and on the function you are performing, the answer to this question will vary. If you are performing classification, then you want to monitor the classification accuracy in testing mode.
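The train-and-test idea can be sketched in a few lines of Python (illustrative helpers of my own, not the book's): hold out part of the examples, and measure classification accuracy only on the held-out subset rather than on the data used for training. The split fraction and the stand-in "network" are assumptions for the example.

```python
# Hold out the last fraction of the examples as a test subset.
def split_train_test(examples, test_fraction=0.25):
    n_test = int(len(examples) * test_fraction)
    return examples[:-n_test], examples[-n_test:]

# Fraction of held-out examples the predictor classifies correctly.
def accuracy(predict, test_set):
    correct = sum(1 for inputs, target in test_set if predict(inputs) == target)
    return correct / len(test_set)

examples = [((0,), "a"), ((1,), "b"), ((0,), "a"), ((1,), "b"),
            ((0,), "a"), ((1,), "b"), ((0,), "a"), ((1,), "a")]
train, test = split_train_test(examples)

# A trivial stand-in for a trained network's prediction function.
rule = lambda inputs: "a" if inputs[0] == 0 else "b"
print(accuracy(rule, test))   # 0.5: one of the two held-out answers matches
```

The important point is that `accuracy` is computed on `test`, which the stand-in predictor never saw, mirroring the testing-mode monitoring described above.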
When clustering data, the training process is usually determined by the number of passes, or epochs, made through the training data. If you are trying to build a model or a time-series forecaster, then you probably want to minimize the prediction error. Regardless of the function, once the acceptance criteria are met, the connection weights are "locked" so they can't be adjusted. In the following sections, we explore the acceptance criteria used for training a neural network to perform classification, clustering, modeling, and time-series forecasting.

Classification

The measure of success in a classification problem is the accuracy of the classifier, usually defined as the percentage of correct classifications. In some applications, getting an incorrect classification is worse than getting no classification at all. In these cases, a "don't know" or uncertain answer is preferable. By selecting the right data representation for the network outputs, you can obtain the behavior you desire.

For example, let's say we want to classify customers into three types: poor, good, and excellent. We use a one-of-N code to represent our output and then train the network with an error tolerance of 0.1. We create an output filter that selects the highest output unit as the winning category. Thus if the outputs are 0.9, 0.4, and 0.0, the first unit wins, and the corresponding category is poor. Note also that if the outputs are 0.9, 0.8, and 0.7, we would still classify the customer as poor, even though the network has a high prediction value for good and excellent. Even if the outputs were 0.2, 0.19, and 0.1, the output classification would be that the customer was poor. One way to avoid this problem is to put a threshold on the output units before you perform the one-of-N conversion. Say we want the output value to be at least 0.8 before we say that the unit is ON. If we put this additional filter in place, then we would add a fourth category, "unknown" or "undecided," to represent the case where none of the network output units has a value above 0.8.

A confusion matrix is a graphic visualization that indicates where the classification errors are occurring.
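The thresholded one-of-N output filter just described might look like this in Python (a sketch of my own; the 0.8 threshold and the customer categories follow the text's example):

```python
CATEGORIES = ["poor", "good", "excellent"]

def classify(outputs, threshold=0.8):
    # Select the highest output unit as the winning category...
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    # ...but reject the winner unless its activation clears the threshold.
    if outputs[best] < threshold:
        return "unknown"
    return CATEGORIES[best]

print(classify([0.9, 0.4, 0.0]))    # "poor"    -- a clear winner
print(classify([0.9, 0.8, 0.7]))    # "poor"    -- still the highest unit
print(classify([0.2, 0.19, 0.1]))   # "unknown" -- nothing clears 0.8
```

Without the threshold test, the third call would confidently answer "poor" from three nearly useless activations, which is exactly the behavior the filter is meant to prevent.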
A text version lists the possible output categories and the corresponding percentages of correct and incorrect classifications (see Figure 5.1).

Clustering

The output of a clustering network usually is open to interpretation by the user. In most cases, the training regimen is determined simply by the number of times the data is presented to the neural network, and by how fast the learning rate and the neighborhood decay. Kohonen feature maps, for example, might use a linear decay of the learning rate and a linear reduction in a square neighborhood function, or they might use an exponential decay of the learning rate and a Gaussian or circular neighborhood function. The user would specify the number of epochs, or complete passes through the training data, and the learning rates. The network would train for the specified number of epochs and then stop. Figure 5.2 shows the output activations, represented by a Hinton diagram, of a Kohonen network used to cluster some data. Note that units close to the winner also have high activations; low activations are denoted by small boxes.

Adaptive resonance network training is controlled primarily by the vigilance training parameter and by the learning rate. The higher the vigilance, the more discriminating the network will be. An adaptive resonance network is considered stable when the training data goes through two complete passes and each input pattern falls into the same output class as on the previous pass. Depending on the application, you might want to lock the ART network weights so that they will not be adjusted when the neural network is deployed. However, one of the advantages of the adaptive resonance theory model is that it can be used for online learning, where it can recognize new input patterns and allocate new output categories when necessary.
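A greatly simplified sketch of vigilance-style matching on binary patterns may help fix the idea (the real ART dynamics are far more involved; the `match` rule and values below are illustrative stand-ins of my own, not the book's algorithm):

```python
# Fraction of the input pattern's ON bits that the template shares.
def match(pattern, template):
    overlap = sum(p & t for p, t in zip(pattern, template))
    return overlap / sum(pattern)

# Present one pattern: join the first sufficiently similar category,
# otherwise commit a new output unit with this pattern as its template.
def present(pattern, templates, vigilance):
    for i, template in enumerate(templates):
        if match(pattern, template) >= vigilance:
            # Resonance: keep only the features the two have in common.
            templates[i] = [p & t for p, t in zip(pattern, template)]
            return i
    templates.append(list(pattern))
    return len(templates) - 1

templates = []
print(present([1, 1, 0, 0], templates, vigilance=0.6))  # 0: first unit committed
print(present([1, 1, 1, 0], templates, vigilance=0.6))  # 0: match 2/3 clears 0.6
print(present([0, 0, 1, 1], templates, vigilance=0.6))  # 1: no overlap, new unit
```

Raising `vigilance` to 0.9 in the second call would force a new unit instead, which is the "more discriminating" behavior described above; the sketch also shows why a too-high vigilance quickly consumes output units.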
Figure 5.1 A confusion matrix showing, for each actual category, the percentage of patterns assigned to each predicted output category.

Figure 5.2 A Hinton diagram of the output activations of a Kohonen map used to cluster data.

One point to remember is that ART networks are sensitive to the order of the training data. Thus there is no guarantee that a specific input pattern will map to the same output category on consecutive training runs if the training data set is modified in any way.

Modeling

In modeling or regression problems, the usual error measure is the root mean squared error. In these problems, we are usually trying to model some function with multiple inputs and one or more dependent output variables. The average or mean squared error (MSE) or the root mean squared error (RMS) are good measures of the prediction accuracy. When training is first started and the neural network weights have been randomized, the RMS error is usually quite high. The expected behavior is that as the neural network is trained, the RMS error will steadily fall until it reaches a stable minimum. Figure 5.3 shows the RMS error for a single training run of a back propagation network. If the prediction error does not fall, or if it seems to oscillate up and down, there is a chance that the network has fallen into a local minimum. In this case, you will want to reset (or randomize) the neural network weights and train again. If the neural network still does not converge, you might need to change some of the values of your training parameters, or revisit some of your data representation and model architecture decisions (see "Network convergence issues" later in this chapter for more discussion of what to do when your networks do not converge).

Figure 5.3 The average RMS error of a back propagation network over a single training run.

Care must be taken when using the RMS error as the only indicator of neural network performance. In some cases, the neural network learns that the best way to minimize the RMS error is to always output the mean value of the function.
This behavior occurs primarily with functions whose output is symmetrical about some value. In such cases, it is also useful to monitor the RMS error of the worst pattern. If the average RMS error for the training set is falling, but the RMS error of the worst pattern is growing larger, then it might be the case that the neural network is learning to average rather than to model the function.

Forecasting

Like the modeling application, forecasting is a prediction problem, and so the root mean squared error is used. Another good way to visualize the performance of a forecasting neural network is to use a line plot of the desired and the actual network outputs (see Figure 5.4).

Time-series forecasting is a tricky modeling problem. There might be some underlying long-term trend that is also influenced by a cyclical factor, such as the time of year (referred to as seasonality). On top of these trends there is usually a random component that causes variability and uncertainty in any prediction. The randomness can be statistically characterized by some probability distribution or could, in fact, be caused by a deterministic nonlinear process referred to as a chaotic time series. People have been using statistics to predict linear trends with random components for years. However, forecasting complex nonlinear or chaotic time series is another matter. Neural networks have shown themselves to be excellent tools for modeling complex time-series problems, especially recurrent neural networks, which are themselves nonlinear dynamic systems.
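Both the modeling and forecasting measures above reduce to simple formulas. This illustrative helper (my own sketch, not the book's code) computes the average RMS error alongside the worst-pattern error that the text recommends watching:

```python
import math

# Root mean squared error over a set of (actual, desired) output pairs.
def rms_error(actuals, desireds):
    mse = sum((a - d) ** 2 for a, d in zip(actuals, desireds)) / len(actuals)
    return math.sqrt(mse)

# Largest single-pattern error: the "worst pattern" worth monitoring.
def worst_pattern_error(actuals, desireds):
    return max(abs(a - d) for a, d in zip(actuals, desireds))

# Three near-misses plus one bad pattern: the average error looks small
# while the worst-pattern error reveals the outlier.
actual  = [0.50, 0.52, 0.48, 0.90]
desired = [0.50, 0.50, 0.50, 0.50]
print(round(rms_error(actual, desired), 3))   # roughly 0.2 on average
print(worst_pattern_error(actual, desired))   # 0.4 on the worst pattern
```

If the first number falls while the second grows across epochs, the network may be averaging the target rather than modeling it, as described above.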
Figure 5.4 A line plot of the actual network outputs and the desired (teaching) values for a forecasting network.

Controlling the Training Process with Learning Parameters

Once we have determined what network performance is required for our application, we can then start the training process. Once again, depending on the type of learning algorithm and neural network used, there are parameters that must be set in order to control the training process. In all of the neural network research papers that have been written, literally thousands of parameters have been defined. In the following sections, I discuss only the training parameters you are most likely to encounter, beginning with supervised training.

Supervised training

In supervised training, we present a pattern to the neural network; it makes a prediction, and we compare the resulting output with the desired output. Thus we have explicit information about the performance of the network. The major parameters used in supervised training have to do with how large a step we take when adjusting the connection weights in the direction of the desired output.

Learn rate. Almost all neural network models have a learning rate parameter associated with them. The learning rate is the knob you can turn to control whether you have a hyperactive student or a slow-and-steady learner. In a typical supervised training session, a pattern is presented to the network, the network makes an incorrect prediction, and the difference between the actual output and the desired output is used to adjust the connection weights. The learning rate parameter controls the magnitude of the change we make when adjusting the connection weights to reduce the prediction error for the current training pattern and desired output. That is, do we take a large step toward the correct answer (a large learning rate), or do we take a small step (a small learning rate)? While a large learning rate lets the network learn "fast," you must remember that we do not want the neural network to learn only the current pattern; we want it to learn all of the patterns. With a large learning rate, we make large changes to the weights after each pattern is presented, and we risk undoing the adjustments made for earlier patterns. Also, recall that we do not necessarily want to drive the error to zero on each individual pattern.
We want the neural network to learn the major features of the problem so that it can generalize to patterns it has never seen before.

Even with a lower learning rate, we still get to the end of the course; we simply take many small steps rather than a few large strides. One approach that often works well in practice is to start with a relatively large learning rate when you begin training, and then gradually lower the learning rate as training progresses and the error falls. The idea is that you make large corrections early on, and then fine-tune as you go along.

Momentum. Momentum is a training parameter that goes hand in hand with the learning rate. Its effect is to filter out high-frequency changes in the weight values, so that there is less chance that the neural network will start oscillating around a set of weight values. The momentum parameter causes the errors from previous training patterns to be averaged together over time and added to the current error. So if the error on a single pattern forces a large change in the direction of the neural network weights, this effect can be mitigated by averaging in the errors from the previous training patterns. This is especially true if the previous pattern errors were driving the network weights in the opposite direction. Instead of using the error information from a single training pattern (as would be the case when momentum is set to 0), the errors from the previous patterns are averaged in. The overall effect is that the weights are less likely to be driven back and forth in alternate directions.

Error tolerance. Supervised training methods provide the neural network with input/output pairs in the training data. The target or desired outputs must fall within the output range of the activation function. For example, standard back propagation networks using the logistic activation function require that the target outputs be in the range of 0.0 to 1.0. Some commercial neural network development tools use the hyperbolic tangent function, which requires outputs in the -1.0 to +1.0 range.

Most commercial tools will allow you to specify an error tolerance. This training parameter is used to control "how close is close enough." In many cases, an error tolerance of 0.1 is used.
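The interplay of these three supervised parameters can be sketched for a single weight (an illustrative delta-rule-style update of my own, not the book's algorithm; the parameter values are arbitrary):

```python
# One weight step: the learn rate scales the correction, momentum blends
# in the previous step, and errors inside the tolerance band count as zero.
def weight_step(error, prev_step, learn_rate=0.3, momentum=0.5, tolerance=0.1):
    if abs(error) <= tolerance:
        error = 0.0                      # close enough: no correction needed
    return learn_rate * error + momentum * prev_step

step1 = weight_step(error=0.5, prev_step=0.0)
step2 = weight_step(error=-0.5, prev_step=step1)   # momentum damps the reversal
step3 = weight_step(error=0.05, prev_step=step2)   # within tolerance: coasting
print(step1, step2, step3)   # 0.15, then -0.075 instead of -0.15, then -0.0375
```

Without momentum, the second step would swing the weight a full -0.15 in the opposite direction; averaging in the previous step halves the reversal, which is the oscillation-damping effect described above.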
This means that if the target value is 1.0, a network output as low as 0.9 (1.0 - 0.1 = 0.9) is within the tolerance, and the error is treated as 0.0. One of the main reasons for using an error tolerance is to avoid driving the network weights to extreme values. If you keep the output unit activation values in the range of 0.1 to 0.9 (with a tolerance of 0.1), then you are staying in the linear range of the logistic function. As you try to drive the output values up to 1.0 or down to 0.0, the net input (the sum of the input signals to the unit) must be quite large. Since the outputs of the other units will only be in the range of 0 to 1, this usually requires that the weights grow large. Once the weights grow to large values and the output of the logistic function is above the knee of the sigmoid curve, it is hard to change the output of the unit. This condition is often called "network paralysis" (Wasserman 1989).

Unsupervised learning

In unsupervised learning, the most important parameter selection is the number of outputs. This was described in the network architecture section, but it bears repeating. When a neural network is used for clustering or segmentation, the specification of the number of output units defines the granularity of the segmentation. If this is too large or too small, then the results of the segmentation will be disappointing. However, once the architecture is set, there are several learning parameters that can be used to control the segmentation process. Like supervised neural networks, most unsupervised neural networks also have a learning rate that is used to control the step size in the adjustment of the connection weights. Specific to unsupervised models are the neighborhood parameters used for Kohonen maps and the vigilance parameter used in ART networks.

Neighborhood. When a Kohonen self-organizing feature map is used to cluster data, there are two popular methods for controlling which units will have their weights changed. One is to use a square neighborhood function with a linear decrease in the learning rate. The other is to use a Gaussian neighborhood function with an exponential decay of the learning rate. While the quality of the solutions is equally good, the latter model is simpler in terms of the parameters that must be set by the user. In either case, the major decision is how quickly (or slowly) you want the neural network to settle down.
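The square-neighborhood scheme with a linearly decaying learning rate might be sketched as follows (the schedule values and the 5-by-5 grid are illustrative assumptions of my own, not prescribed by the text):

```python
# Per-epoch learning rate and neighborhood radius: both shrink linearly
# from their starting values to (nearly) zero over the training run.
def neighborhood_schedule(epochs, start_rate=0.5, start_radius=2):
    schedule = []
    for epoch in range(epochs):
        progress = epoch / epochs
        rate = start_rate * (1.0 - progress)              # linear decay
        radius = int(round(start_radius * (1.0 - progress)))
        schedule.append((rate, radius))
    return schedule

# All units whose row and column are within `radius` steps of the winner.
def units_updated(winner, radius, rows=5, cols=5):
    wr, wc = winner
    return [(r, c) for r in range(rows) for c in range(cols)
            if abs(r - wr) <= radius and abs(c - wc) <= radius]

# Early in training, a radius of 2 around a central winner on a 5-by-5
# map moves every unit on the map...
print(len(units_updated((2, 2), radius=2)))   # 25 units adjusted
# ...and by the end only the winning unit itself is adjusted.
print(len(units_updated((2, 2), radius=0)))   # 1 unit adjusted
```

A Gaussian neighborhood replaces the hard radius cutoff with a smoothly decaying weight on each unit's update, which is why it needs less explicit scheduling from the user.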
By selecting a large value for the number of epochs parameter, you are telling the neural network to take its time in making its decisions about the cluster centers. In contrast, when you select a smaller value, you are telling the neural network to make a quick decision. The quick-decision approach is a statement about how much training time you are willing to spend processing the data.

The neighborhood in a Kohonen feature map defines the area around the winning unit in which the nonwinning units' weights will also be modified. Typically, this parameter is set to a value roughly half the size of the maximum dimension of the output layer. So if the winning unit is at the center of a 5 by 5 output layer and the neighborhood is 2, then not only the winning unit, but also the 8 units one step away and the 16 units two steps away will have their weights adjusted.

The neighborhood size is important in keeping the locality of the topographic maps created by the Kohonen maps. As training progresses, the neighborhood size is decreased, so that at the end only the winning unit's weights are modified. Remember, if you are using a Gaussian neighborhood function, this is taken care of automatically.

Vigilance. When using an adaptive resonance theory (ART) network, the number of outputs selected in the architecture is a statement about the maximum number of possible categories, not necessarily about how many outputs will actually be used. Adaptive resonance networks have a vigilance parameter that controls how picky the neural network is going to be when clustering the data (Carpenter and Grossberg 1988). Look at it this way: if the vigilance is low and two patterns are similar, then they will be clustered together. That is, for clustering purposes, the two patterns fall into the same output unit or category. However, if we raise the vigilance parameter, then the neural network is more discriminating when evaluating the differences between two patterns. What would have been "close enough" with a vigilance of 0.5 might not be if the vigilance is 0.9.
In this case, the network will effectively say, "this is a totally new class of input patterns; I had better allocate a new output unit."

The control allowed by the vigilance parameter is one of the nicest features of the adaptive resonance networks. However, if the vigilance parameter is set too high, then the adaptive resonance network will allocate a new output unit for almost every input, and soon we will use up all of the output units. What do we do in this case? Well, either we can lower the vigilance parameter if the network is too picky, or we can change the network architecture and add more output units.

Adaptive resonance networks train until they reach a stable state. This is when each input pattern gets classified into the same output unit on two consecutive passes through the data. ART networks are also sensitive to the order in which patterns are presented. For a given input data set, if the order is randomized, then different clusterings could result.

The Iterative Development Process

Despite all of your careful selections, it is quite possible that the first or second time that you try to train it, the neural network will not be able to meet your acceptance criteria. When this happens, you are in a troubleshooting mode. What can be wrong, and how can you fix it?

Figure 5.5 shows the iterative nature of the neural network development process. The major steps are data selection and representation, neural network model selection, architecture specification, training parameter selection, and choosing an appropriate acceptance criterion. If any of these decisions are made in error, the neural network might not be able to learn what you are trying to teach. In the following sections, I describe the major problems you might encounter and how to remedy them.

Network convergence issues

How do you know when you are in trouble when training a neural network? The first hint is that it takes a very long time for the network to train, while you are monitoring the classification accuracy or the prediction accuracy of the neural network. If you are plotting the RMS error, you will see that it falls quickly and then stays flat, or that it oscillates up and down. Either of these two conditions might mean that the network is trapped in a local minimum, when the objective is to reach the global minimum.

There are two primary ways around this problem.
First, you can add some random noise to the neural network weights in order to try to kick the network out of the local minimum. The other option is to reset the network weights to new random values and start training all over again. Even this might not be enough to get the neural network to converge on a solution. Any of the design decisions you made might be negatively impacting the ability of the neural network to learn the function you are trying to teach.

Model selection. It is sometimes best to revisit your model selection in light of the function you are trying to perform. First, make sure that the neural network model you chose can actually perform the required function. If the model is appropriate, then you might need to consider adding hidden units or another layer of hidden units. In practice, one layer of hidden units usually suffices; two layers are required only if you have added a large number of hidden units and the network still has not converged. If you do not provide enough hidden units, the neural network will not have the computational power to learn some complex functions.

Other factors besides the neural network architecture could be at work. Maybe the data has a strong temporal or time element embedded in it. In this case, recurrent back propagation networks might perform better than regular back propagation. If the inputs are nonstationary, that is, they change slowly over time, then radial basis function networks might be a better choice.

Data representation. If a neural network does not converge to a solution, and you are sure that your model architecture is appropriate for the problem, then the next thing to reevaluate is your data representation decisions. In some cases, a key input parameter is not being scaled or coded in a manner that lets the neural network learn its importance to the function at hand. One example is a continuous variable that has a large range in the original data but is coded as a single input. Perhaps a thermometer coding with one unit for each magnitude of 10 is in order. This would change the representation of the input parameter from a single input to 5, 6, or 7 inputs, depending on the range of the values.

A more serious problem is when a key parameter is missing from the training data. In this case, the network might never be able to learn the function, and you can easily spend much time playing around with the data representation trying to get the network to converge.
Unfortunately, this is one area where experience is required to know what a normal training process feels like and what one that is doomed to failure feels like. This is also why it is important to have a domain expert involved who can provide ideas when things are not working. A domain expert might recognize that an important parameter is missing from the training data.

Model architectures. In some cases, we have done everything right, but the network just won't converge. It could be that the problem is simply too complex for the architecture you have specified. By adding additional hidden units, and even another hidden layer, you are enhancing the computational abilities of the neural network. Each new connection weight is another free variable that can be adjusted. That is why it is good practice to start with an abundant supply of hidden units when you first start working on a problem. Once you are sure that the neural network can learn the function, you can start reducing the number of hidden units until the generalization performance meets your requirements. But beware. Too much of a good thing can be bad!

If some additional hidden units are good, is adding many more better? In most cases, no! Giving the neural network more hidden units (and the associated connection weights) can actually make things too easy for the network. In some cases, the neural network will simply learn to memorize the training patterns. The neural network has optimized itself for the training set in all of its particulars and has not extracted the important relationships in the data. You could have saved yourself time and money by just using a lookup table. The whole point is to get the neural network to detect the key features in the data in order to generalize when presented with patterns it has not seen before. There is nothing worse than a fat, lazy neural network.
By keeping the hidden layers as thin as possible, you usually get the best results.

Avoiding Overtraining

When training a neural network, it is important to understand when to stop. It is natural to think that if 100 epochs is good, then 1,000 epochs will be much better. However, this intuitive notion of "more practice is better" doesn't hold with neural networks. If the same training patterns or examples are given to the neural network over and over, and the weights are adjusted to match the desired outputs, we are essentially telling the network to memorize the patterns rather than to extract the essence of the relationships. What happens is that the neural network performs extremely well on the training data. However, when it is presented with patterns it hasn't seen before, it cannot generalize and does not perform well. What is the problem? It is called overtraining.

Overtraining a neural network is similar to an athlete who practices and practices for an event on his or her home court. When the actual competition starts and he or she is faced with an unfamiliar arena and circumstances, it might be impossible for him or her to react and perform at the same level as during training.

It is important to remember that we are not trying to get the neural network to make the best predictions it can on the training data. We are trying to optimize its performance on the testing or validation data. Most commercial neural network tools provide the means to automatically switch between training and testing data. The idea is to evaluate, every so often, the network performance on the testing data while training, and to stop training when the performance on the testing data stops improving.

Automating the Process

What has been described in the preceding sections is the manual process of training a neural network. It requires some degree of skill and experience with neural networks and model building in order to be successful. The need to tweak many parameters and make somewhat arbitrary decisions concerning the neural network architecture does not seem like a repeatable process to some application developers.
Because of this, researchers have worked in a variety of ways to minimize these problems. Perhaps the first attempt was to automate the selection of the appropriate number of hidden layers and hidden units in the neural network. This has been approached in a number of ways: a priori attempts to compute the required architecture by looking at the data; building a large network and then pruning it until the smallest network that can do the job is produced; and starting with a small network and then growing it until it can perform the task appropriately.

Genetic algorithms are often used to optimize functions using parallel search methods based on the biological theory of natural selection (for a detailed discussion of genetic algorithms, see appendix C). If we view the selection of the number of hidden layers and hidden units as an optimization problem, genetic algorithms can be used to help find the optimal architecture.

The idea of pruning nodes and weights from neural networks in order to improve their generalization capabilities has been explored by several researchers (Sietsma and Dow 1988). A network with an arbitrarily large number of hidden units is created and trained to perform some processing function. Then the weights connected to a node are analyzed to see if they contribute to the accurate prediction of the output pattern. If the weights are extremely small, or if they do not impact the prediction error when they are removed, then that node and its weights are pruned, or removed, from the network. This process continues until the removal of any additional node causes a decrease in the performance on the test set.

Several researchers have also explored the opposite of the pruning approach. That is, a small neural network is created, and additional hidden nodes and weights are added incrementally. The network prediction error is monitored, and as long as performance on the test data is improving, additional hidden units are added. The cascade correlation network (Fahlman 1989) allocates a whole set of potential new network nodes.
These new nodes compete with each other, and the one that reduces the prediction error the most is added to the network. Perhaps the highest level of automation of the neural network data mining process will come with the use of intelligent agents. In chapter 8, we will explore intelligent agents and data mining in more detail.

Summary

Training a neural network is the hardest part of using neural networks for data mining. It is the equivalent step to sitting down and writing the algorithm (and coding and testing it) using a conventional programming language. This chapter presented a methodology for training and testing neural networks that, while not a strict cookbook, is certainly more structured than the "black art" label usually attached to the neural network development process. As with any project, the first task is to understand what function you are trying to perform. From this, a likely candidate can be selected from the many available neural network models.

Once a neural network model has been selected, the next step is to develop a measure of success: what level the neural network must achieve in terms of classification or modeling accuracy before we call it "trained." For classification, the appropriate measure is the percentage of correct and incorrect classifications. For modeling and time-series forecasting, it is the mean squared error or the root mean squared error. Successful clustering is more subjective and is often dependent on the completion of cluster analysis after the neural network has self-organized.

There are several training parameters that are used to control the neural network development process. The most common parameter is the learn rate, which controls the size of the adjustments made to the connection weights. Supervised training algorithms also include a momentum term, which averages the weight changes over multiple patterns and serves to minimize oscillations in the weights. The error tolerance is used in supervised training to combat a basic tendency of neural networks to become paralyzed by extremely large weights, which result from trying to drive the outputs to their extreme values. In Kohonen maps, controlling the decay of the learn rate and the rate of decrease in the neighborhood parameter are important.
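The learn rate and momentum term recapped above combine in the standard backpropagation weight update: the change applied at step t is the negative gradient scaled by the learn rate, plus a fraction (the momentum) of the previous change. A sketch, with illustrative function and parameter names:

```python
import numpy as np

def update_weights(weights, gradient, prev_delta, learn_rate=0.1, momentum=0.9):
    """One backpropagation weight update with a momentum term:
        delta(t) = -learn_rate * gradient + momentum * delta(t-1)
    The momentum term blends in the previous change, averaging the
    adjustments over successive patterns and damping oscillations."""
    delta = -learn_rate * gradient + momentum * prev_delta
    return weights + delta, delta
```

The returned `delta` is carried forward as `prev_delta` for the next pattern.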
Adaptive resonance networks use the vigilance parameter to control the degree of similarity between input patterns that are mapped to the same cluster.

Neural network training is an iterative process, very similar in principle to rapid application development or object-oriented prototyping. Several iterations are usually required before a successful training run is achieved. The principal steps in the process include data selection and data representation, which are sometimes considered to be part of the data preparation phase. Neural network model selection is next, followed by the specification of the architecture, which is the number of input, hidden, and output units. The training parameters then need to be set and the training data presented to the neural network. The appropriate error or performance measurements must be monitored to determine whether the neural network is converging to a solution or whether one of the previous steps needs to be revised.

Overtraining is a degenerate case in which a neural network is trained repeatedly on data such that it memorizes, rather than learns, the function relating inputs to outputs. When new data is presented to an overtrained network, it will produce large prediction errors because it has not learned the fundamental relationships in the training data.

Researchers and commercial neural network tool vendors have made progress in automating the neural network development process. From model selection, to selecting the appropriate number of hidden units, to removing unnecessary input variables, to choosing the best data representation, techniques are being developed to simplify things. No matter how automated things become, your thorough understanding of the neural network development process will serve you well.

References
Fahlman, S. E., and C. Lebiere. 1990. The cascade-correlation learning architecture. In Advances in Neural Information Processing Systems 2, ed. D. S. Touretzky. San Mateo, CA: Morgan Kaufmann.

Kohonen, T. 1988. Self-organization and associative memory. 2nd ed. New York: Springer-Verlag.

Sietsma, J., and R. J. F. Dow. 1988. Neural net pruning: Why and how. In Proceedings of the IEEE International Conference on Neural Networks, vol. 1, 325-333.

Chapter 6

Analyzing Neural Networks for Decision Support

When data mining is used for decision support applications, creating the neural network model is only the first part of the process. The second part, and the most important from a decision maker's perspective, is to find out what the neural network learned. In this chapter, I describe a set of postprocessing activities that are used to open up the neural network black box and transform the collection of network weights into a set of visualizations, rules, and parameter relationships that people can easily comprehend.

Discovering What the Network Learned

When using a neural network as a model for transaction processing, the most important issue is whether the weights of the neural network accurately capture the classification, model, or forecast needed for the application. If we use credit files to create a neural network loan officer, then what matters is that we maximize our profits and minimize our losses. However, in decision support applications, what is important is not that the neural network was able to learn to discriminate between good and bad credit risks, but that the network can tell us what factors are key in making that determination. In short, for decision support applications, we want to know what the neural network learned.

Unfortunately, this is one of the most difficult aspects of using neural networks. After all, what is a neural network but a collection of processing elements and connection weights?
Fortunately, however, there are techniques for ferreting out this information from a trained neural network. One approach is to use the neural network as usual, but to "probe" it with test inputs and record the outputs. This is the input sensitivity approach. Another approach is to present the input data to the neural network and then generate a set of rules that describe the logical functions performed by the neural network, based on inspections of its internal states and connection weights. A third approach is to represent the neural network visually using a graphical representation so that the wonderful pattern recognition machine known as the human brain can contribute to the process.

In general, the technique used to analyze a neural network depends on the data mining function being performed. This is necessary because the type of information the neural network has learned is qualitatively different, depending on the function it has been trained to do. For example, if you are clustering customers for a market segmentation application, the output of the neural network is the identifier of the cluster to which the customer belongs. A statistical analysis of the attributes of the customers in each segment might be warranted, along with the visualization techniques described in the following sections. Or we might want to view the connection weights flowing into each output unit (cluster) and analyze them to see what the neural network learned were the "prototypical" customers for that segment. We might then split a segment into additional clusters. This would allow us to drill down to a finer and finer level of detail as required.

In modeling and forecasting applications, the information discovered by the neural network is encoded in the connection weights.
The most obvious use of the trained neural network is to play what-if games against the model. If the neural network has learned an unknown function, even though we don't have a mathematical formula for the function, we can still learn a great deal about it by varying the input parameters and seeing what the effect is on the output. Let's say we built a model of the return on investment for a set of products. If we input the data on a set of proposed development projects, we can use the estimates in our evaluation of the business case. Or we can do a complete sensitivity analysis of the inputs to determine their relative importance to the return on investment.

Sensitivity Analysis

While there are many different types of information that might be learned during data mining with neural networks, perhaps the crucial thing to learn is which parameters are most important for a specific function. If you are modeling customer satisfaction, then it is important to know which aspect of your customer relationship has the most impact on the level of satisfaction. If you have a limited number of dollars, should you spend them on a new waiting room for your customers, or should you hire another technician so the average wait is 10 minutes less? Determining the impact or effect of an input parameter on the output is one application of sensitivity analysis.

A neural network can be used to do sensitivity analysis in a variety of ways. One approach is to treat the network as a black box. To determine the impact that a particular input variable has on the output, you hold the other inputs at some fixed value, such as their mean or median value, and vary only the input of interest while you monitor the change in the outputs. If, as you move the input from its minimum to its maximum value, nothing happens to the output, then the input variable is not very important to the function being modeled. However, if the output changes considerably, then the input is certainly important because it affects the output. The trick in performing sensitivity analysis this way is to repeat this process for each input parameter.
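The black-box procedure just described (fix every input at its mean, sweep one input from its minimum to its maximum, and watch how far the output moves) can be sketched as follows. Here `model` stands in for any trained network exposed as a callable from an input vector to a scalar output; the function name is illustrative.

```python
import numpy as np

def input_sensitivity(model, data, n_steps=20):
    """For each input column of `data`, hold all other inputs at their
    mean, sweep this input from its min to its max, and record the
    resulting swing in the model output. A larger swing means a more
    influential input."""
    means = data.mean(axis=0)
    swings = []
    for i in range(data.shape[1]):
        probe = np.tile(means, (n_steps, 1))       # all inputs fixed at mean
        probe[:, i] = np.linspace(data[:, i].min(), data[:, i].max(), n_steps)
        outputs = np.array([model(row) for row in probe])
        swings.append(outputs.max() - outputs.min())
    return np.array(swings)
```

Ranking the inputs is then just `np.argsort(-input_sensitivity(model, data))`. Note that sweeping one input at a time ignores interactions between inputs, which is the main limitation of this simple scheme.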
In this way, you have a ranking of the parameters according to their impact on the output value. For example, let's say we are modeling the price of a stock. We build a model and then perform input sensitivity analysis. When we look at each input variable, we might see that the day of the week is the most important predictor of what is going to happen to the price of the stock. We could then put this information to use in our trading.

A more automated approach to performing sensitivity analysis with backpropagation neural networks is to keep track of the error terms computed during the backpropagation step. By computing the error all the way back to the input layer, we have a measure of the degree to which each input contributes to the output error. Looking at it another way, the input with the largest error has the biggest impact on the output. By accumulating these errors over time and then normalizing them, we can compute the relative contribution of each input to the output error. In effect, we have discovered the sensitivity of the function being modeled to changes in each input.

Rule Generation from Neural Networks

A common output of data mining or knowledge discovery algorithms is the transformation of the raw data into if-then rules. Standard inductive learning techniques such as decision trees can easily be used to generate such rule sets. One of the weaknesses of this straightforward approach is that each node in the tree is a binary condition or test value: if A is greater than B, then take one branch of the tree; else take the other. However, as has been pointed out before, having to define some arbitrary point as the dividing line between two sets of items will certainly lead to sharp boundaries, but that is not necessarily the correct answer.

One of the perennial criticisms of neural networks has been that they are "black boxes," inscrutable, unable to explain their operation or how they arrive at their answers.
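A toy sketch of the rule-generation idea: treat the trained network as an oracle, label the data with its yes/no decisions, and search for the single if-then threshold test that best reproduces them. This is deliberately minimal (a one-node "tree"); a real inductive learner would build a full tree of such tests, and, as noted above, each test still imposes a sharp, somewhat arbitrary dividing line. All names here are illustrative.

```python
import numpy as np

def extract_stump_rule(model, data, feature_names):
    """Use the trained network `model` (a callable returning 0/1 or
    True/False) as an oracle, then find the single threshold rule
    'IF feature > t THEN class=1' that best agrees with its decisions.
    Returns the rule as text plus its agreement rate with the network."""
    labels = np.array([model(row) for row in data])
    best = (0.0, 0, 0.0)                 # (agreement, feature index, threshold)
    for i in range(data.shape[1]):
        for t in np.unique(data[:, i]):  # candidate thresholds from the data
            agreement = np.mean((data[:, i] > t) == labels)
            if agreement > best[0]:
                best = (agreement, i, t)
    agreement, i, t = best
    return f"IF {feature_names[i]} > {t:.2f} THEN class=1", agreement
```

The agreement rate tells you how faithfully the extracted rule mimics the network; a low rate means the network's decision surface cannot be captured by one sharp boundary.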
