
# Neural Networks Assignment

## An Introduction to Back-Propagation Neural Networks

### Introduction

This article focuses on a particular type of neural network model, known as a "feed-forward backpropagation network". This model is easy to understand and can be easily implemented as a software simulation.

### What is a Neural Network?

The area of Neural Networks probably belongs on the borderline between Artificial Intelligence and Approximation Algorithms. Think of it as a family of algorithms for "smart approximation". NNs are used (to name a few applications) as universal approximators (mapping input to output), as tools capable of learning from their environment, as tools for finding non-evident dependencies between data, and so on.

The Neural Networking algorithms (at least some of them) are modelled after the brain (not necessarily a human brain) and the way it processes information. The brain is a very efficient tool. Despite having a response time about 100,000 times slower than computer chips, it (so far) beats the computer at complex tasks such as image and sound recognition, motion control and so on. It is also about 10,000,000,000 times more efficient than a computer chip in terms of energy consumption per operation.

The brain is a multi-layer structure (think 6-7 layers of neurons, if we are talking about the human cortex) with about 10^11 neurons, a structure that works as a parallel computer capable of learning from the "feedback" it receives from the world and changing its design (think of computer hardware changing while performing its task) by growing new neural links between neurons or altering the activities of existing ones. To make the picture a bit more complete, let's also mention that a typical neuron is connected to 50-100 other neurons, and sometimes to itself, too. To put it simply, the brain is composed of interconnected neurons.

### Structure of a Neuron

Our "artificial" neuron will have inputs (all N of them) and one output. First, the neuron has a set of nodes that connect it to the inputs, the output, or other neurons; these nodes are also called synapses.
Next, it has a Linear Combiner, which is a function that takes all the inputs and produces a single value. A simple way of doing this is to add together each dInput (if you are not a programmer: the "d" prefix means "double", and we use it so that the name dInput reminds us that it holds a floating-point number) multiplied by the corresponding synaptic weight dWeight:

```cpp
for(int i = 0; i < nNumOfInputs; i++)
    dSum = dSum + dInput[i] * dWeight[i];
```

Finally, the neuron has an Activation Function. We do not know in advance what the input will be. Consider this example: the human ear can function near a working jet engine, and at the same time, if it were only ten times more sensitive, we would be able to hear a single molecule hitting the membrane in our ears! What does that mean? It means that the response to the input should not be linear. When we go from 0.01 to 0.02, the difference should be comparable with going from 100 to 200.

How do we make the response non-linear? By applying the Activation Function. It will take ANY input, from minus infinity to plus infinity, and squeeze it into the -1 to 1 or the 0 to 1 interval.

Finally, we have a threshold. What should the INTERNAL ACTIVITY of a neuron be when there is no input? Should there be some threshold input before we see any activity? Or should activity be present at some level (in which case it is called a bias rather than a threshold) when the input is zero? For simplicity, we (as well as the rest of the world) will replace the threshold with an EXTRA input, whose weight can change during the learning process and whose input value is fixed and always equal to (-1). The effect, in terms of the mathematical equations, is exactly the same, but the programmer gets a little more breathing room.

### The Multilayer Network

A single neuron by itself is not a very useful pattern recognition tool. The real power of neural networks comes when we combine neurons into multilayer structures, called... well... neural networks.

There are 3 layers in our network. (We can use more, but if we use fewer, we will have a less capable net. Making 4 layers is sometimes useful when you are looking for non-evident things, and I have never seen a problem that requires 5 layers. For 99 percent of tasks, 3 layers is the best choice.) There are N neurons in the first layer, where N equals the number of inputs. There are M neurons in the output layer, where M equals the number of outputs. For example, when you are building a network capable of predicting the stock price, you might want yesterday's hi, lo, close and volume as inputs and close as the output.

You may have any number of neurons in the inner (also called "hidden") layers. Just remember that if you have too few, the quality of the prediction will drop because the net doesn't have enough "brains". And if you have too many, the net will have a tendency to "remember" the right answers rather than predicting them.
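Putting the pieces together, the layered structure described above can be sketched as a small feed-forward pass. This is a minimal illustration under the article's conventions, with hypothetical names of my own (Layer, FeedLayer, FeedForward); each neuron folds the threshold in as an extra input fixed at -1, as described earlier:

```cpp
#include <cmath>
#include <vector>

// A layer is a matrix of weights: one row per neuron. Each row has one
// extra weight at the end for the fixed -1 bias input.
using Layer = std::vector<std::vector<double>>;

double Sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Feed one layer's input through every neuron and collect the outputs.
std::vector<double> FeedLayer(const Layer& layer,
                              const std::vector<double>& dInput)
{
    std::vector<double> dOutput;
    for (const std::vector<double>& dWeight : layer) {
        double dSum = 0.0;
        for (std::size_t i = 0; i < dInput.size(); i++)
            dSum += dInput[i] * dWeight[i];
        dSum += (-1.0) * dWeight[dInput.size()]; // the fixed extra input
        dOutput.push_back(Sigmoid(dSum));
    }
    return dOutput;
}

// The whole net: the outputs of one layer become the inputs of the next.
std::vector<double> FeedForward(const std::vector<Layer>& net,
                                std::vector<double> dInput)
{
    for (const Layer& layer : net)
        dInput = FeedLayer(layer, dInput);
    return dInput;
}
```

For instance, a "net" consisting of a single layer with one neuron, weights {5.0} and bias weight 0.0, fed the input {0.0}, produces a sum of zero and therefore an output of Sigmoid(0) = 0.5.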
A net with too many hidden neurons will then work very well on the familiar data, but will fail on data that was never presented before. Finding the compromise is more of an art than a science.

### Teaching the Neural Net

The NN receives inputs, which can be a pattern of some kind. In the case of image recognition software, for example, the inputs would be pixels from a photosensitive matrix of some kind; in the case of stock price prediction, they would be the "hi" (input 1), "low" (input 2) and so on. After a neuron in the first layer receives its input, it applies the Linear Combiner and the Activation Function to the inputs and produces the output. This output, as you can see from the picture, will become one of the inputs for the neurons in the next layer. So each layer feeds the data forward to the next layer, and so on, until the last layer is reached.

Let's use our example with the stock price. We will try to use yesterday's stock price to predict today's price, which is the same as using today's price to predict tomorrow's price. When we work with yesterday's price, we know not only the price for "day - 1", but also the price we are trying to predict, called the DESIRED OUTPUT of the neural net. When we compare the two values, we can compute the error:

```cpp
dError = dDesiredOutput - dOutput;
```

Now we can adjust this particular neuron to work better with this particular input. For example, if dError is 10% of dOutput, we can increase all the synaptic weights of the neuron by 10%. The problem with this approach is that the next input will require a different adjustment.