Artificial Neural Networks: A Tutorial

Anil K. Jain, Michigan State University
Jianchang Mao and K.M. Mohiuddin, IBM Almaden Research Center

These massively parallel systems with large numbers of interconnected simple processors may solve a variety of challenging computational problems. This tutorial provides the background and the basics.

Numerous advances have been made in developing intelligent systems, some inspired by biological neural networks. Researchers from many scientific disciplines are designing artificial neural networks (ANNs) to solve a variety of problems in pattern recognition, prediction, optimization, associative memory, and control (see the "Challenging problems" sidebar).

Conventional approaches have been proposed for solving these problems. Although successful applications can be found in certain well-constrained environments, none is flexible enough to perform well outside its domain. ANNs provide exciting alternatives, and many applications could benefit from using them.

This article is for those readers with little or no knowledge of ANNs to help them understand the other articles in this special section of Computer. We discuss the motivation behind the development of ANNs, describe the basic biological neuron and the artificial computational model, outline network architectures and learning processes, and present some of the most commonly used ANN models. We conclude with character recognition, a successful ANN application.

WHY ARTIFICIAL NEURAL NETWORKS?

The long course of evolution has given the human brain many desirable characteristics not present in von Neumann or modern parallel computers. These include

+ massive parallelism,
+ distributed representation and computation,
+ learning ability,
+ generalization ability,
+ adaptivity,
+ inherent contextual information processing,
+ fault tolerance, and
+ low energy consumption.

It is hoped that devices based on biological neural networks will possess some of these desirable characteristics.

Modern digital computers outperform humans in the domain of numeric computation and related symbol manipulation. However, humans can effortlessly solve complex perceptual problems (like recognizing a man in a crowd from a mere glimpse of his face) at such a high speed and extent as to dwarf the world's fastest computer. Why is there such a remarkable difference in their performance? The biological neural system architecture is completely different from the von Neumann architecture (see Table 1). This difference significantly affects the type of functions each computational model can best perform.

Numerous efforts to develop "intelligent" programs based on von Neumann's centralized architecture have not resulted in general-purpose intelligent programs. Inspired by biological neural networks, ANNs are massively parallel computing systems consisting of an extremely large number of simple processors with many interconnections. ANN models attempt to use some "organizational" principles believed to be used in the human brain.

Challenging problems

Let us consider the following problems of interest to computer scientists and engineers.

Pattern classification. The task of pattern classification is to assign an input pattern (like a speech waveform or a handwritten symbol) represented by a feature vector to one of many prespecified classes (see Figure A). Well-known applications include character recognition, speech recognition, and EEG waveform classification.

A perceptron consists of a single computational unit with adjustable weights w_1, w_2, ..., w_n and a threshold θ. Given an input vector x = (x_1, x_2, ..., x_n), the unit computes the output y = g(Σ_{i=1}^{n} w_i x_i − θ), where the activation function g(v) = 1 if v > 0, and 0 otherwise. In a two-class classification problem, the perceptron assigns an input pattern to one class if y = 1, and to the other class if y = 0. The linear equation

Σ_{i=1}^{n} w_i x_i − θ = 0

defines the decision boundary (a hyperplane in the n-dimensional input space) that halves the space.
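As a concrete illustration of this decision rule, the following minimal Python sketch computes a perceptron's output for a two-class problem (the weights, threshold, and inputs are illustrative, not taken from the article):

```python
import numpy as np

def perceptron_output(x, w, theta):
    """Threshold unit: y = 1 if sum_i w_i * x_i - theta > 0, and 0 otherwise."""
    return 1 if np.dot(w, x) - theta > 0 else 0

# Illustrative weights and threshold; the hyperplane w.x - theta = 0 is the decision boundary.
w = np.array([1.0, -2.0])
theta = 0.5

print(perceptron_output(np.array([2.0, 0.5]), w, theta))  # assigned to class y = 1
print(perceptron_output(np.array([0.0, 1.0]), w, theta))  # assigned to class y = 0
```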
Rosenblatt developed a learning procedure to determine the weights and threshold in a perceptron, given a set of training patterns (see the "Perceptron learning algorithm" sidebar). Note that learning occurs only when the perceptron makes an error. Rosenblatt proved that when training patterns are drawn from two linearly separable classes, the perceptron learning procedure converges after a finite number of iterations. This is the perceptron convergence theorem. In practice, you do not know whether the patterns are linearly separable. Many variations of this learning algorithm have been proposed in the literature. Other activation functions that lead to different learning characteristics can also be used. However, a single-layer perceptron can only separate linearly separable patterns as long as a monotonic activation function is used.

The back-propagation learning algorithm (see the "Back-propagation algorithm" sidebar) is also based on the error-correction principle.

BOLTZMANN LEARNING. Boltzmann machines are symmetric recurrent networks consisting of binary units (+1 for "on" and −1 for "off"). By symmetric, we mean that the weight on the connection from unit i to unit j equals the weight on the connection from unit j to unit i (w_ij = w_ji). A subset of the neurons, called visible, interact with the environment; the rest, called hidden, do not. Each neuron is a stochastic unit that generates an output (or state) according to the Boltzmann distribution of statistical mechanics. Boltzmann machines operate in two modes: clamped, in which visible neurons are clamped onto specific states determined by the environment, and free-running, in which both visible and hidden neurons are allowed to operate freely.

Boltzmann learning is a stochastic learning rule derived from information-theoretic and thermodynamic principles. The objective of Boltzmann learning is to adjust the connection weights so that the states of visible units satisfy a particular desired probability distribution. According to the Boltzmann learning rule, the change in the connection weight w_ij is given by

Δw_ij = η(ρ̄_ij − ρ_ij),

where η is the learning rate, and ρ̄_ij and ρ_ij are the correlations between the states of units i and j when the network operates in the clamped mode and free-running mode, respectively. The values of ρ̄_ij and ρ_ij are usually estimated from Monte Carlo experiments, which are extremely slow.

Boltzmann learning can be viewed as a special case of error-correction learning in which error is measured not as the direct difference between desired and actual outputs but as the difference between the correlations among the outputs of two neurons under clamped and free-running operating conditions.

HEBBIAN RULE. The oldest learning rule is Hebb's postulate of learning.13 Hebb based it on the following observation from neurobiological experiments: If neurons on both sides of a synapse are activated synchronously and repeatedly, the synapse's strength is selectively increased. Mathematically, the Hebbian rule can be described as

w_ij(t + 1) = w_ij(t) + η y_j(t) x_i(t),

where x_i and y_j are the output values of neurons i and j, respectively, which are connected by the synapse w_ij, and η is the learning rate. Note that x_i is the input to the synapse.

An important property of this rule is that learning is done locally; that is, the change in synapse weight depends only on the activities of the two neurons connected by it. This significantly simplifies the complexity of the learning circuit in a VLSI implementation.
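A minimal Python sketch of the Hebbian update for a single linear neuron follows. The renormalization step is an added practical safeguard (the plain rule lets the weights grow without bound), and the data and learning rate are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
# Illustrative 2D Gaussian data whose direction of maximal variance is (1, 1)/sqrt(2).
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=[[3.0, 2.0], [2.0, 3.0]], size=1000)

w = rng.normal(size=2)              # initial weight vector w0
eta = 0.01                          # learning rate

for _ in range(20):                 # several passes over the training points
    for x in X:
        y = w @ x                   # output of a single linear neuron
        w += eta * y * x            # Hebbian update: w_i <- w_i + eta * y * x_i
        w /= np.linalg.norm(w)      # renormalize (the plain rule is unbounded)

print(w)  # should align, up to sign, with the largest-eigenvalue eigenvector (1, 1)/sqrt(2)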
A single neuron trained using the Hebbian rule exhibits orientation selectivity. Figure 5 demonstrates this property. The points depicted are drawn from a two-dimensional Gaussian distribution and used for training a neuron. The weight vector of the neuron is initialized to w0 as shown in the figure. As the learning proceeds, the weight vector moves progressively closer to the direction w of maximal variance in the data. In fact, w is the eigenvector of the covariance matrix of the data corresponding to the largest eigenvalue.

COMPETITIVE LEARNING RULES. Unlike Hebbian learning (in which multiple output units can be fired simultaneously), competitive-learning output units compete among themselves for activation. As a result, only one output unit is active at any given time. This phenomenon is known as winner-take-all. Competitive learning has been found to exist in biological neural networks.

Competitive learning often clusters or categorizes the input data. Similar patterns are grouped by the network and represented by a single unit. This grouping is done automatically based on data correlations.

The simplest competitive learning network consists of a single layer of output units as shown in Figure 4. Each output unit i in the network connects to all the input units (the x_j's) via weights w_ij, j = 1, 2, ..., n. Each output unit also connects to all other output units via inhibitory weights but has a self-feedback with an excitatory weight. As a result of competition, only the unit i* with the largest (or the smallest) net input becomes the winner; that is, w_i*·x ≥ w_i·x for all i, or ||w_i* − x|| ≤ ||w_i − x|| for all i. When all the weight vectors are normalized, these two inequalities are equivalent.

A simple competitive learning rule can be stated as

Δw_ij = η(x_j − w_ij) if i = i*, and 0 otherwise.

Note that only the weights of the winner unit get updated. The effect of this learning rule is to move the stored pattern in the winner unit (its weights) a little bit closer to the input pattern. Figure 6 demonstrates a geometric interpretation of competitive learning. In this example, we assume that all input vectors have been normalized to have unit length. They are depicted as black dots in Figure 6. The weight vectors of the three units are randomly initialized. Their initial and final positions on the sphere after competitive learning are marked as Xs in Figures 6a and 6b, respectively. In Figure 6b, each of the three natural groups (clusters) of patterns has been discovered by an output unit whose weight vector points to the center of gravity of the discovered group.

You can see from the competitive learning rule that the network will not stop learning (updating weights) unless the learning rate η is 0. A particular input pattern can fire different output units at different iterations during learning. This brings up the stability issue of a learning system. The system is said to be stable if no pattern in the training data changes its category after a finite number of learning iterations. One way to achieve stability is to force the learning rate to decrease gradually toward 0 as the learning process proceeds. However, this artificial freezing of learning causes another problem termed plasticity, which is the ability to adapt to new data. This is known as Grossberg's stability-plasticity dilemma in competitive learning.

Figure 6. An example of competitive learning: (a) before learning; (b) after learning.

The most well-known example of competitive learning is vector quantization for data compression. It has been widely used in speech and image processing for efficient storage, transmission, and modeling. Its goal is to represent a set or distribution of input vectors with a relatively small number of prototype vectors (weight vectors), or a codebook.
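A minimal Python sketch of the winner-take-all rule, used here to build a small codebook of prototype vectors, follows (the data, codebook size, and learning-rate schedule are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
# Three illustrative clusters of input vectors.
X = np.vstack([rng.normal(loc=c, scale=0.1, size=(100, 2))
               for c in ([0.0, 0.0], [1.0, 1.0], [0.0, 1.0])])
rng.shuffle(X)

k = 3                                        # number of output units (prototypes)
W = rng.normal(scale=0.5, size=(k, 2))       # randomly initialized weight vectors

for t, x in enumerate(X):
    eta = 0.1 * (1.0 - t / len(X))           # gradually decreasing learning rate (stability)
    i_star = int(np.argmin(np.linalg.norm(W - x, axis=1)))   # winner-take-all
    W[i_star] += eta * (x - W[i_star])       # update only the winner's weights

# Vector quantization: store or transmit only the index of the nearest prototype.
codes = np.argmin(np.linalg.norm(X[:, None, :] - W[None, :, :], axis=2), axis=1)
print(np.round(W, 2))     # learned codebook (prototype vectors)
print(codes[:10])         # codebook indices sent in place of the input vectors
```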
Once a codebook has been constructed and agreed upon by both the transmitter and the receiver, you need only transmit or store the index of the prototype corresponding to the input vector. Given an input vector, its corresponding prototype can be found by searching for the nearest prototype in the codebook.

SUMMARY. Table 2 summarizes various learning algorithms and their associated network architectures (this is not an exhaustive list). Both supervised and unsupervised learning paradigms employ learning rules based on error correction, Boltzmann, Hebbian, and competitive learning. However, each learning algorithm is designed for training a specific architecture. Therefore, when we discuss a learning algorithm, a particular network architecture association is implied. Each algorithm can perform only a few tasks well. The last column of Table 2 lists the tasks that each algorithm can perform.

Table 2. Well-known learning algorithms.

Paradigm | Learning rule | Architecture | Learning algorithm | Task
Supervised | Error-correction | Single- or multilayer perceptron | Perceptron learning algorithms, back-propagation, Adaline and Madaline | Pattern classification, function approximation, prediction, control
Supervised | Boltzmann | Recurrent | Boltzmann learning algorithm | Pattern classification
Supervised | Hebbian | Multilayer feed-forward | Linear discriminant analysis | Data analysis, pattern classification
Supervised | Competitive | Competitive | Learning vector quantization | Within-class categorization, data compression
Supervised | Competitive | ART network | ARTMap | Pattern classification, within-class categorization
Unsupervised | Error-correction | Multilayer feed-forward | Sammon's projection | Data analysis
Unsupervised | Hebbian | Feed-forward or competitive | Principal component analysis | Data analysis, data compression
Unsupervised | Hebbian | Hopfield network | Associative memory learning | Associative memory
Unsupervised | Competitive | Competitive | Vector quantization | Categorization, data compression
Unsupervised | Competitive | Kohonen's SOM | Kohonen's SOM | Categorization, data analysis
Unsupervised | Competitive | ART networks | ART1, ART2 | Categorization
Hybrid | Error-correction and competitive | RBF network | RBF learning algorithm | Pattern classification, function approximation, prediction, control

Back-propagation algorithm

1. Initialize the weights to small random values.
2. Randomly choose an input pattern x^(μ).
3. Propagate the signal forward through the network.
4. Compute δ_i^L in the output layer (o_i = y_i^L):

   δ_i^L = g′(h_i^L) [d_i^μ − y_i^L],

   where h_i^l represents the net input to the ith unit in the lth layer, and g′ is the derivative of the activation function g.
5. Compute the deltas for the preceding layers by propagating the errors backwards:

   δ_i^l = g′(h_i^l) Σ_j w_ij^(l+1) δ_j^(l+1),

   for l = (L − 1), ..., 1.
6. Update weights using

   Δw_ij^l = η δ_i^l y_j^(l−1).

7. Go to step 2 and repeat for the next pattern until the error in the output layer is below a prespecified threshold or a maximum number of iterations is reached.
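The sidebar's steps map almost line for line onto code. The following minimal Python sketch trains a two-layer sigmoid network on the exclusive-OR problem; the network size, learning rate, iteration count, and the explicit bias (threshold) terms are illustrative choices, not prescribed by the sidebar:

```python
import numpy as np

rng = np.random.default_rng(2)

def g(h):                       # sigmoid activation function
    return 1.0 / (1.0 + np.exp(-h))

def g_prime(h):                 # derivative of the sigmoid
    s = g(h)
    return s * (1.0 - s)

# Illustrative training set: XOR, which a single-layer perceptron cannot separate.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
D = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(scale=0.5, size=(2, 4)); b1 = np.zeros(4)   # input -> hidden weights, biases
W2 = rng.normal(scale=0.5, size=(4, 1)); b2 = np.zeros(1)   # hidden -> output weights, biases
eta = 0.5

for it in range(20000):
    mu = rng.integers(len(X))              # step 2: randomly choose a training pattern
    x, d = X[mu], D[mu]

    h1 = x @ W1 + b1; y1 = g(h1)           # step 3: forward propagation
    h2 = y1 @ W2 + b2; y2 = g(h2)

    delta2 = g_prime(h2) * (d - y2)        # step 4: deltas in the output layer
    delta1 = g_prime(h1) * (W2 @ delta2)   # step 5: back-propagate the errors

    W2 += eta * np.outer(y1, delta2); b2 += eta * delta2    # step 6: update weights
    W1 += eta * np.outer(x, delta1); b1 += eta * delta1     # (and bias terms)

print(np.round(g(g(X @ W1 + b1) @ W2 + b2), 2))  # should approach the XOR targets 0, 1, 1, 0
```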
Due to space limitations, we do not discuss some other algorithms, including Adaline, Madaline, linear discriminant analysis, Sammon's projection, and principal component analysis. Interested readers can consult the corresponding references (this article does not always cite the first paper proposing the particular algorithms).

MULTILAYER FEED-FORWARD NETWORKS

Figure 7 shows a typical three-layer perceptron. In general, a standard L-layer feed-forward network (we adopt the convention that the input nodes are not counted as a layer) consists of an input stage, (L − 1) hidden layers, and an output layer of units successively connected (fully or locally) in a feed-forward fashion with no connections between units in the same layer and no feedback connections between layers.

Figure 7. A typical three-layer perceptron.

Multilayer perceptron

The most popular class of multilayer feed-forward networks is multilayer perceptrons, in which each computational unit employs either the thresholding function or the sigmoid function. Multilayer perceptrons can form arbitrarily complex decision boundaries and represent any Boolean function. The development of the back-propagation learning algorithm for determining weights in a multilayer perceptron has made these networks the most popular among researchers and users of neural networks.

We denote w_ij^(l) as the weight on the connection between the ith unit in layer (l − 1) and the jth unit in layer l. Let {(x^(1), d^(1)), (x^(2), d^(2)), ..., (x^(p), d^(p))} be a set of p training patterns (input-output pairs), where x^(i) ∈ R^n is the input vector in the n-dimensional pattern space, and d^(i) ∈ [0, 1]^m, an m-dimensional hypercube. For classification purposes, m is the number of classes. The squared-error cost function most frequently used in the ANN literature is defined as

E = (1/2) Σ_{i=1}^{p} ||y^(i) − d^(i)||².   (2)

The back-propagation algorithm is a gradient-descent method to minimize the squared-error cost function in Equation 2 (see the "Back-propagation algorithm" sidebar).

A geometric interpretation (adopted and modified from Lippmann14) shown in Figure 8 can help explicate the role of hidden units (with the threshold activation function). Each unit in the first hidden layer forms a hyperplane in the pattern space; boundaries between pattern classes can be approximated by hyperplanes. A unit in the second hidden layer forms a hyperregion from the outputs of the first-layer units; a decision region is obtained by performing an AND operation on the hyperplanes. The output-layer units combine the decision regions made by the units in the second hidden layer by performing logical OR operations. Remember that this scenario is depicted only to explain the role of hidden units. Their actual behavior, after the network is trained, could differ.

Figure 8. A geometric interpretation of the role of hidden units in a two-dimensional input space. (The figure compares the decision regions formed by single-layer and multilayer networks on the exclusive-OR problem and on classes with meshed regions: a single layer forms a half plane bounded by a hyperplane, while three layers can form arbitrary regions whose complexity is limited by the number of hidden units.)

A two-layer network can form more complex decision boundaries than those shown in Figure 8. Moreover, multilayer perceptrons with sigmoid activation functions can form smooth decision boundaries rather than piecewise-linear boundaries.

Radial Basis Function network

The Radial Basis Function (RBF) network, which has two layers, is a special class of multilayer feed-forward networks. Each unit in the hidden layer employs a radial basis function, such as a Gaussian kernel, as the activation function. The radial basis function (or kernel function) is centered at the point specified by the weight vector associated with the unit. Both the positions and the widths of these kernels must be learned from training patterns. There are usually many fewer kernels in the RBF network than there are training patterns. Each output unit implements a linear combination of these radial basis functions. From the point of view of function approximation, the hidden units provide a set of functions that constitute a basis set for representing input patterns in the space spanned by the hidden units.
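To make the RBF architecture concrete, the following minimal Python sketch fits a noisy one-dimensional function with Gaussian kernels, choosing the centers by a simple clustering step and the linear output weights by least squares, along the lines of the hybrid learning strategy described below. The data, number of kernels, and width heuristic are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

# Illustrative regression problem: approximate sin(x) from noisy samples.
x = rng.uniform(0.0, 2 * np.pi, size=200)
d = np.sin(x) + 0.1 * rng.normal(size=200)

# Unsupervised step: place kernel centers with a few k-means iterations.
k = 10
centers = rng.choice(x, size=k, replace=False)
for _ in range(20):
    labels = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
    centers = np.array([x[labels == j].mean() if np.any(labels == j) else centers[j]
                        for j in range(k)])
width = (x.max() - x.min()) / k          # a simple heuristic kernel width

# Hidden-layer outputs: one Gaussian kernel per hidden unit.
Phi = np.exp(-(x[:, None] - centers[None, :]) ** 2 / (2 * width ** 2))

# Supervised step: the output unit is linear, so its weights have a
# closed-form least-squares solution (no iterative training needed).
w, *_ = np.linalg.lstsq(Phi, d, rcond=None)

x_test = np.linspace(0.0, 2 * np.pi, 5)
Phi_test = np.exp(-(x_test[:, None] - centers[None, :]) ** 2 / (2 * width ** 2))
print(np.round(Phi_test @ w, 2))         # predictions should roughly track...
print(np.round(np.sin(x_test), 2))       # ...the true sin values at the test points
```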
There are a variety of learning algorithms for the RBF network. The basic one employs a two-step learning strategy, or hybrid learning. It estimates kernel positions and kernel widths using an unsupervised clustering algorithm, followed by a supervised least mean square (LMS) algorithm to determine the connection weights between the hidden layer and the output layer. Because the output units are linear, a noniterative algorithm can be used. After this initial solution is obtained, a supervised gradient-based algorithm can be used to refine the network parameters.

This hybrid learning algorithm for training the RBF network converges much faster than the back-propagation algorithm for training multilayer perceptrons. However, for many problems, the RBF network often involves a larger number of hidden units. This implies that the runtime (after training) speed of the RBF network is often slower than the runtime speed of a multilayer perceptron. The efficiencies (error versus network size) of the RBF network and the multilayer perceptron are, however, problem-dependent. It has been shown that the RBF network has the same asymptotic approximation power as a multilayer perceptron.

Issues

There are many issues in designing feed-forward networks, including

+ how many layers are needed for a given task,
+ how many units are needed per layer,
+ how the network will perform on data not included in the training set (generalization ability), and
+ how large the training set should be for "good" generalization.

Although multilayer feed-forward networks using back-propagation have been widely employed for classification and function approximation, many design parameters still must be determined by trial and error. Existing theoretical results provide only very loose guidelines for selecting these parameters in practice.

KOHONEN'S SELF-ORGANIZING MAPS

The self-organizing map (SOM)16 has the desirable property of topology preservation, which captures an important aspect of the feature maps in the cortex of highly developed animal brains. In a topology-preserving mapping, nearby input patterns should activate nearby output units on the map. Figure 4 shows the basic network architecture of Kohonen's SOM. It basically consists of a two-dimensional array of units, each connected to all n input nodes. Let w_ij denote the n-dimensional vector associated with the unit at location (i, j) of the 2D array. Each neuron computes the Euclidean distance between the input vector x and the stored weight vector w_ij.

This SOM is a special type of competitive learning network that defines a spatial neighborhood for each output unit. The shape of the local neighborhood can be square, rectangular, or circular. The initial neighborhood size is often set to one half to two thirds of the network size and shrinks over time according to a schedule (for example, an exponentially decreasing function). During competitive learning, all the weight vectors associated with the winner and its neighboring units are updated (see the "SOM learning algorithm" sidebar).

SOM learning algorithm

1. Initialize the weights to small random numbers; set the initial learning rate and neighborhood.
2. Present a pattern x, and evaluate the network outputs.
3. Select the unit (i*, j*) with the minimum output:

   ||x − w_i*j*|| = min_{i,j} ||x − w_ij||.

4. Update all weights according to the following learning rule:

   w_ij(t + 1) = w_ij(t) + α(t)[x − w_ij(t)], if (i, j) ∈ N_i*j*(t),
   w_ij(t + 1) = w_ij(t), otherwise,

   where N_i*j*(t) is the neighborhood of the unit (i*, j*) at time t, and α(t) is the learning rate.
5. Decrease the value of α(t) and shrink the neighborhood N_i*j*(t).
6. Repeat steps 2 through 5 until the change in weight values is less than a prespecified threshold or a maximum number of iterations is reached.
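A minimal Python sketch of the sidebar's loop for a small map follows (the map size, input distribution, and decay schedules are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)

rows, cols, n = 8, 8, 3
W = rng.uniform(size=(rows, cols, n))          # weight vector w_ij for each map unit
X = rng.uniform(size=(2000, n))                # illustrative input patterns

n_iter = 2000
for t in range(n_iter):
    x = X[t]
    alpha = 0.5 * (1 - t / n_iter)             # decreasing learning rate alpha(t)
    radius = max(1, int(round((rows / 2) * (1 - t / n_iter))))  # shrinking neighborhood

    # Winner: the unit whose weight vector is closest to the input (step 3).
    dists = np.linalg.norm(W - x, axis=2)
    i_star, j_star = np.unravel_index(np.argmin(dists), dists.shape)

    # Update the winner and all units in its (square) neighborhood (step 4).
    for i in range(max(0, i_star - radius), min(rows, i_star + radius + 1)):
        for j in range(max(0, j_star - radius), min(cols, j_star + radius + 1)):
            W[i, j] += alpha * (x - W[i, j])

# Nearby map units should now hold similar weight vectors (topology preservation).
print(np.round(np.linalg.norm(W[0, 0] - W[0, 1]), 3),
      np.round(np.linalg.norm(W[0, 0] - W[7, 7]), 3))
```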
Kohonen's SOM can be used for projection of multivariate data, density approximation, and clustering. It has been successfully applied in the areas of speech recognition, image processing, robotics, and process control. The design parameters include the dimensionality of the neuron array, the number of neurons in each dimension, the shape of the neighborhood, the shrinking schedule of the neighborhood, and the learning rate.

ADAPTIVE RESONANCE THEORY MODELS

Recall that the stability-plasticity dilemma is an important issue in competitive learning. How do we learn new things (plasticity) and yet retain the stability to ensure that existing knowledge is not erased or corrupted? Carpenter and Grossberg's Adaptive Resonance Theory models (ART1, ART2, and ARTMap) were developed in an attempt to overcome this dilemma.17 The network has a sufficient supply of output units, but they are not used until deemed necessary. A unit is said to be committed (uncommitted) if it is (is not) being used. The learning algorithm updates the stored prototypes of a category only if the input vector is sufficiently similar to them. An input vector and a stored prototype are said to resonate when they are sufficiently similar. The extent of similarity is controlled by a vigilance parameter, ρ, with 0 < ρ ≤ 1.
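The ART models involve specific bottom-up and top-down weight dynamics that are beyond the scope of a short sketch; the toy Python fragment below illustrates only the role of the vigilance test: an input either resonates with, and refines, its best-matching stored prototype or commits a new output unit. The similarity measure, update rule, and data here are illustrative and are not the ART1 algorithm:

```python
import numpy as np

def vigilance_clustering(X, rho):
    """Toy illustration of ART-style commitment: an input updates the nearest
    stored prototype only if it is sufficiently similar (resonance); otherwise
    a new, previously uncommitted unit is committed to it."""
    prototypes = []
    for x in X:
        if prototypes:
            sims = [float(p @ x / (np.linalg.norm(p) * np.linalg.norm(x)))
                    for p in prototypes]
            j = int(np.argmax(sims))
            if sims[j] >= rho:                            # vigilance test passed: resonate
                prototypes[j] = 0.5 * (prototypes[j] + x)  # refine the stored prototype
                continue
        prototypes.append(x.astype(float))                # commit a new output unit
    return prototypes

rng = np.random.default_rng(5)
X = rng.uniform(size=(100, 4))
# Higher vigilance -> finer categories (more committed units); lower -> coarser.
print(len(vigilance_clustering(X, rho=0.99)), len(vigilance_clustering(X, rho=0.80)))
```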

One successful application is the network proposed by LeCun et al.20 for zip code recognition. A 16 × 16 normalized gray-level image is presented to a feed-forward network with three hidden layers. The units in the first layer are locally connected to the units in the input layer, forming a set of local feature maps. The second hidden layer is constructed in a similar way. Each unit in the second layer also combines local information coming from feature maps in the first layer.

The activation level of an output unit can be interpreted as an approximation of the a posteriori probability of the input pattern's belonging to a particular class. The output categories are ordered according to activation levels and passed to the post-processing stage. In this stage, contextual information is exploited to update the classifier's output. This could, for example, involve looking up a dictionary of admissible words, or utilizing syntactic constraints present, for example, in phone or social security numbers.

Results

ANNs work very well in the OCR application. However, there is no conclusive evidence about their superiority over conventional statistical pattern classifiers. At the First Census Optical Character Recognition System Conference held in 1992,18 more than 40 different handwritten character recognition systems were evaluated based on their performance on a common database. The top 10 performers used either some type of multilayer feed-forward network or a nearest neighbor-based classifier. ANNs tend to be superior in terms of speed and memory requirements compared to nearest neighbor methods. Unlike the nearest neighbor methods, classification speed using ANNs is independent of the size of the training set. The recognition accuracies of the top OCR systems on the NIST isolated (presegmented) character data were above 98 percent for digits, 96 percent for uppercase characters, and 87 percent for lowercase characters. (Low recognition accuracy for lowercase characters was largely due to the fact that the test data differed significantly from the training data, as well as being due to "ground-truth" errors.) One conclusion drawn from the tests is that OCR system performance on isolated characters compares well with human performance. However, humans still outperform OCR systems on unconstrained and cursive handwritten documents.

Figure 10. A sample set of characters in the NIST database.

Figure 11. Two schemes for using ANNs in an OCR system.
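As a toy illustration of the contextual post-processing described above, the following Python sketch treats per-character output activations as approximate posterior probabilities and retains only words admissible under a small lexicon (all activations, classes, and lexicon entries are invented for illustration):

```python
import numpy as np

# Illustrative per-character output activations from a trained classifier
# (rows: character positions in a word; columns: classes 'a'-'z').
classes = [chr(c) for c in range(ord('a'), ord('z') + 1)]
rng = np.random.default_rng(6)
activations = rng.uniform(0.05, 0.5, size=(3, 26))
for pos, ch in enumerate("cat"):                     # make "cat" the strongest reading
    activations[pos, classes.index(ch)] = 0.9

# Contextual post-processing: score only words admissible under a small lexicon.
lexicon = ["cat", "car", "cot", "bat"]

def word_score(word):
    # Treat activations as approximate posterior probabilities and combine them.
    return float(np.prod([activations[i, classes.index(c)] for i, c in enumerate(word)]))

best = max(lexicon, key=word_score)
print(sorted(lexicon, key=word_score, reverse=True), "->", best)
```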
DEVELOPMENTS IN ANNs HAVE STIMULATED a lot of enthusiasm and criticism. Some comparative studies are optimistic, while others offer pessimism. For many tasks, such as pattern recognition, no one approach dominates the others. The choice of the best technique should be driven by the given application's nature. We should try to understand the capacities, assumptions, and applicability of various approaches and maximally exploit their complementary advantages to develop better intelligent systems. Such an effort may lead to a synergistic approach that combines the strengths of ANNs with other technologies to achieve significantly better performance for challenging problems. As Minsky recently observed,21 the time has come to build systems out of diverse components. Individual modules are important, but we also need a good methodology for integration. It is clear that communication and cooperative work between researchers working in ANNs and other disciplines will not only avoid repetitious work but (and more important) will stimulate and benefit individual disciplines.

Acknowledgments

We thank Richard Casey (IBM Almaden); Pat Flynn (Washington State University); William Punch, Chitra Dorai, and Kalle Karu (Michigan State University); Ali Khotanzad (Southern Methodist University); and Ishwar Sethi (Wayne State University) for their many useful suggestions.

References

1. DARPA Neural Network Study, AFCEA Int'l Press, Fairfax, Va., 1988.
2. J. Hertz, A. Krogh, and R.G. Palmer, Introduction to the Theory of Neural Computation, Addison-Wesley, Reading, Mass., 1991.
3. S. Haykin, Neural Networks: A Comprehensive Foundation, Macmillan College Publishing Co., New York, 1994.
4. W.S. McCulloch and W. Pitts, "A Logical Calculus of Ideas Immanent in Nervous Activity," Bull. Mathematical Biophysics, Vol. 5, 1943, pp. 115-133.
5. F. Rosenblatt, Principles of Neurodynamics, Spartan Books, New York, 1962.
6. M. Minsky and S. Papert, Perceptrons: An Introduction to Computational Geometry, MIT Press, Cambridge, Mass., 1969.
7. J.J. Hopfield, "Neural Networks and Physical Systems with Emergent Collective Computational Abilities," Proc. Nat'l Academy of Sciences, USA 79, 1982, pp. 2,554-2,558.
8. P. Werbos, "Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences," PhD thesis, Dept. of Applied Mathematics, Harvard University, Cambridge, Mass., 1974.
9. D.E. Rumelhart and J.L. McClelland, Parallel Distributed Processing: Exploration in the Microstructure of Cognition, MIT Press, Cambridge, Mass., 1986.
10. J.A. Anderson and E. Rosenfeld, Neurocomputing: Foundations of Research, MIT Press, Cambridge, Mass., 1988.
11. S. Brunak and B. Lautrup, Neural Networks, Computers with Intuition, World Scientific, Singapore, 1990.
12. J.A. Feldman, M.A. Fanty, and N.H. Goddard, "Computing with Structured Neural Networks," Computer, Vol. 21, No. 3, Mar. 1988, pp. 91-109.
13. D.O. Hebb, The Organization of Behavior, John Wiley & Sons, New York, 1949.
14. R.P. Lippmann, "An Introduction to Computing with Neural Nets," IEEE ASSP Magazine, Vol. 4, No. 2, Apr. 1987, pp. 4-22.
15. A.K. Jain and J. Mao, "Neural Networks and Pattern Recognition," in Computational Intelligence: Imitating Life, J.M. Zurada, R.J. Marks II, and C.J. Robinson, eds., IEEE Press, Piscataway, N.J., 1994, pp. 194-212.
16. T. Kohonen, Self-Organization and Associative Memory, Third Edition, Springer-Verlag, New York, 1989.
17. G.A. Carpenter and S. Grossberg, Pattern Recognition by Self-Organizing Neural Networks, MIT Press, Cambridge, Mass., 1991.
18. "The First Census Optical Character Recognition System Conference," R.A. Wilkinson et al., eds., Tech. Report NISTIR 4912, US Dept. of Commerce, NIST, Gaithersburg, Md., 1992.
19. K. Mohiuddin and J. Mao, "A Comparative Study of Different Classifiers for Handprinted Character Recognition," in Pattern Recognition in Practice IV, E.S. Gelsema and L.N. Kanal, eds., Elsevier Science, The Netherlands, 1994, pp. 437-448.
20. Y. LeCun et al., "Back-Propagation Applied to Handwritten Zip Code Recognition," Neural Computation, Vol. 1, 1989, pp. 541-551.
21. M. Minsky, "Logical versus Analogical or Symbolic versus Connectionist or Neat versus Scruffy," AI Magazine, Vol. 12, No. 2, 1991, pp. 34-51.

Anil K. Jain is a University Distinguished Professor and the chair of the Department of Computer Science at Michigan State University.
His interests include statistical pattern recognition, exploratory pattern analysis, neural networks, Markov random fields, texture analysis, remote sensing, interpretation of range images, and 3D object recognition. Jain served as editor-in-chief of IEEE Transactions on Pattern Analysis and Machine Intelligence from 1991 to 1994, and currently serves on the editorial boards of Pattern Recognition, Pattern Recognition Letters, Journal of Mathematical Imaging, Journal of Applied Intelligence, and IEEE Transactions on Neural Networks. He has coauthored, edited, and coedited numerous books in the field. Jain is a fellow of the IEEE and a speaker in the IEEE Computer Society's Distinguished Visitors Program for the Asia-Pacific region. He is a member of the IEEE Computer Society.

Jianchang Mao is a research staff member at the IBM Almaden Research Center. His interests include pattern recognition, neural networks, document image analysis, image processing, computer vision, and parallel computing. Mao received the BS degree in physics in 1983 and the MS degree in electrical engineering in 1986 from East China Normal University in Shanghai. He received the PhD in computer science from Michigan State University in 1994. Mao is the abstracts editor of IEEE Transactions on Neural Networks. He is a member of the IEEE and the IEEE Computer Society.

K.M. Mohiuddin is the manager of the Document Image Analysis and Recognition project in the Computer Science Department at the IBM Almaden Research Center. He has led IBM projects on high-speed reconfigurable machines for industrial machine vision, parallel processing for scientific computing, and document imaging systems. His interests include document image analysis, handwriting recognition, OCR, data compression, and computer architecture. Mohiuddin received the MS and PhD degrees in electrical engineering from Stanford University in 1977 and 1982, respectively. He is an associate editor of IEEE Transactions on Pattern Analysis and Machine Intelligence. He served on Computer's editorial board from 1984 to 1989, and is a senior member of the IEEE and a member of the IEEE Computer Society.

Readers can contact Anil Jain at the Department of Computer Science, Michigan State University, A714 Wells Hall, East Lansing, MI 48824; jain@cps.msu.edu.
