Neural Network Definition

Neural networks are a set of algorithms, modeled loosely after the human brain, that are
designed to recognize patterns. They interpret sensory data through a kind of machine
perception, labeling or clustering raw input. The patterns they recognize are numerical,
contained in vectors, into which all real-world data, be it images, sound, text or time series,
must be translated.

Neural networks can also extract features that are fed to other algorithms for clustering and
classification; so you can think of deep neural networks as components of larger machine-
learning applications involving algorithms for reinforcement learning, classification
and regression.

Deep learning maps inputs to outputs. It finds correlations. It is known as a “universal
approximator”, because it can learn to approximate an unknown function f(x) = y between
any input x and any output y, assuming they are related at all (by correlation or causation,
for example). In the process of learning, a neural network finds the right f, or the correct
manner of transforming x into y, whether that be f(x) = 3x + 12 or f(x) = 9x - 0.1.
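
As a rough illustration of "finding the right f", the sketch below recovers f(x) = 3x + 12 from samples using a single linear unit trained by gradient descent. The function name fit_linear, the learning rate, the step count and the sample range are all assumptions chosen for this example, not part of the original text.

```python
import numpy as np

def fit_linear(x, y, lr=0.05, steps=2000):
    """Fit y ~ w * x + b by gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        error = (w * x + b) - y              # prediction error per sample
        w -= lr * 2.0 * np.mean(error * x)   # gradient of the loss w.r.t. w
        b -= lr * 2.0 * np.mean(error)       # gradient of the loss w.r.t. b
    return w, b

x = np.linspace(-1.0, 1.0, 50)
y = 3.0 * x + 12.0          # the "unknown" function the model must recover
w, b = fit_linear(x, y)
print(round(w, 3), round(b, 3))
```

With these settings the learned w and b should land very close to 3 and 12. A real network stacks many such units with nonlinear activations, which is what lets it approximate functions that are not straight lines.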

Deep learning is the name we use for “stacked neural networks”; that is, networks
composed of several layers. The layers are made of nodes. A node is just a place where
computation happens, loosely patterned on a neuron in the human brain, which fires when
it encounters sufficient stimuli. A node combines input from the data with a set of
coefficients, or weights, that either amplify or dampen that input, thereby assigning
significance to inputs with regard to the task the algorithm is trying to learn; e.g. which
input is most helpful in classifying data without error? These input-weight products are
summed and then the sum is passed through a node’s so-called activation function, to
determine whether and to what extent that signal should progress further through the
network to affect the ultimate outcome, say, an act of classification. If the signal passes
through, the neuron has been “activated.”

Here’s a diagram of what one node might look like.


A node layer is a row of those neuron-like switches that turn on or off as the input is fed
through the net. Each layer’s output is simultaneously the subsequent layer’s input, starting
from an initial input layer receiving your data.
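
The idea that each layer's output becomes the next layer's input can be sketched as a loop over layers. This is a minimal sketch, assuming sigmoid activations throughout and a hypothetical 2-3-1 network with random illustrative weights; the name feedforward and the layer sizes are not taken from the original text.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def feedforward(x, layers):
    """Pass x through a list of (weights, biases) pairs, one pair per layer."""
    activation = x
    for weights, biases in layers:
        # Each row of `weights` holds the coefficients of one node in the layer.
        activation = sigmoid(weights @ activation + biases)
    return activation

# Hypothetical network: 2 inputs -> 3 hidden nodes -> 1 output node.
rng = np.random.default_rng(0)
layers = [
    (rng.normal(size=(3, 2)), np.zeros(3)),  # hidden layer
    (rng.normal(size=(1, 3)), np.zeros(1)),  # output layer
]
output = feedforward(np.array([2.0, 3.0]), layers)
print(output.shape)  # (1,)
```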

Pairing the model’s adjustable weights with input features is how we assign significance to
those features with regard to how the neural network classifies and clusters input.

Coding a Neuron

Time to implement a neuron! We’ll use NumPy, a popular and powerful computing library for
Python, to help us do math:

import numpy as np

def sigmoid(x):
    # Our activation function: f(x) = 1 / (1 + e^(-x))
    return 1 / (1 + np.exp(-x))

class Neuron:
    def __init__(self, weights, bias):
        self.weights = weights
        self.bias = bias

    def feedforward(self, inputs):
        # Weight inputs, add bias, then use the activation function
        total = np.dot(self.weights, inputs) + self.bias
        return sigmoid(total)

weights = np.array([0, 1])  # w1 = 0, w2 = 1
bias = 4                    # b = 4
n = Neuron(weights, bias)

x = np.array([2, 3])        # x1 = 2, x2 = 3
print(n.feedforward(x))     # sigmoid(0*2 + 1*3 + 4) = sigmoid(7), about 0.999

Perceptrons were developed in the 1950s and 1960s by the scientist Frank Rosenblatt,
inspired by earlier work by Warren McCulloch and Walter Pitts. Today, it's more common to
use other models of artificial neurons - in this book, and in much modern work on neural
networks, the main neuron model used is one called the sigmoid neuron.

A perceptron takes several binary inputs, x1, x2, …, and produces a single binary
output:

In the example shown the perceptron has three inputs, x1, x2, x3. In general it could
have more or fewer inputs. Rosenblatt proposed a simple rule to compute the output. He
introduced weights, w1, w2, …, real numbers expressing the importance of the
respective inputs to the output. The neuron's output, 0 or 1, is determined by whether
the weighted sum ∑j wj xj is less than or greater than some threshold value. Just like the
weights, the threshold is a real number which is a parameter of the neuron.

It should seem plausible that a complex network of perceptrons could make quite subtle
decisions:

In this network, the first column of perceptrons - what we'll call the first layer of
perceptrons - is making three very simple decisions, by weighing the input evidence. What
about the perceptrons in the second layer? Each of those perceptrons is making a decision
by weighing up the results from the first layer of decision-making. In this way a perceptron
in the second layer can make a decision at a more complex and more abstract level than
perceptrons in the first layer. And even more complex decisions can be made by the
perceptron in the third layer. In this way, a many-layer network of perceptrons can engage
in sophisticated decision making.
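
The threshold rule and the layering idea described above can be sketched in a few lines. This is an illustrative sketch only: the weights and thresholds are hand-picked assumptions, the function names are hypothetical, and a weighted sum exactly equal to the threshold is assumed to give output 0, since the text does not say how ties are handled. The second-layer perceptron combines two simple first-layer decisions (OR and NAND) into XOR, which no single perceptron can compute.

```python
import numpy as np

def perceptron(inputs, weights, threshold):
    """Return 1 if the weighted sum exceeds the threshold, else 0."""
    return 1 if np.dot(weights, inputs) > threshold else 0

def xor_network(x1, x2):
    """Two-layer perceptron network computing XOR (hand-picked weights)."""
    x = np.array([x1, x2])
    # First layer: two simple decisions about the raw inputs.
    h_or = perceptron(x, np.array([1, 1]), 0.5)        # fires if x1 OR x2
    h_nand = perceptron(x, np.array([-1, -1]), -1.5)   # fires unless x1 AND x2
    # Second layer: weighs the results of the first layer (fires if both fired).
    return perceptron(np.array([h_or, h_nand]), np.array([1, 1]), 1.5)

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, xor_network(a, b))
# 0 0 0
# 0 1 1
# 1 0 1
# 1 1 0
```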

Backpropagation

Backpropagation (backprop,[1] BP) is an algorithm widely used in the training of
feedforward neural networks for supervised learning; generalizations exist for other
artificial neural networks (ANNs), and for functions generally.[2] Backpropagation efficiently
computes the gradient of the loss function with respect to the weights of the network for a
single input-output example. This makes it feasible to use gradient methods for training
multi-layer networks, updating weights to minimize loss; commonly one uses gradient
descent or variants such as stochastic gradient descent. The backpropagation algorithm
works by computing the gradient of the loss function with respect to each weight by the
chain rule, iterating backwards one layer at a time from the last layer to avoid redundant
calculations of intermediate terms in the chain rule; this is an example of dynamic
programming.[3]
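
To make the layer-by-layer chain rule concrete, here is a minimal sketch of backpropagation for a network with one hidden layer. It assumes sigmoid activations, a squared-error loss, and a hypothetical 2-3-1 architecture with random weights; none of these choices come from the text above. The hidden-layer gradient reuses the output-layer term delta2 instead of recomputing it, which is the saving the paragraph describes. A finite-difference check and a single gradient-descent step are included as sanity checks.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(params, x):
    W1, b1, W2, b2 = params
    h = sigmoid(W1 @ x + b1)        # hidden activations
    y_hat = sigmoid(W2 @ h + b2)    # network output
    return h, y_hat

def loss(params, x, y):
    _, y_hat = forward(params, x)
    return 0.5 * np.sum((y_hat - y) ** 2)

def backprop(params, x, y):
    """Gradient of the loss w.r.t. each parameter for one (x, y) example."""
    W1, b1, W2, b2 = params
    h, y_hat = forward(params, x)
    # Output layer: dL/dz2 = (y_hat - y) * sigmoid'(z2)
    delta2 = (y_hat - y) * y_hat * (1.0 - y_hat)
    # Hidden layer: propagate delta2 backwards through W2, then through sigmoid'
    delta1 = (W2.T @ delta2) * h * (1.0 - h)
    return [np.outer(delta1, x), delta1, np.outer(delta2, h), delta2]

rng = np.random.default_rng(0)
params = [rng.normal(size=(3, 2)), np.zeros(3),
          rng.normal(size=(1, 3)), np.zeros(1)]
x, y = np.array([2.0, 3.0]), np.array([1.0])
grads = backprop(params, x, y)

# Sanity check: compare one analytic gradient entry with a central difference.
eps = 1e-6
plus = [p.copy() for p in params]
minus = [p.copy() for p in params]
plus[0][0, 0] += eps
minus[0][0, 0] -= eps
numeric = (loss(plus, x, y) - loss(minus, x, y)) / (2 * eps)
print(np.isclose(grads[0][0, 0], numeric, atol=1e-6))

# One gradient-descent step using the computed gradients (lr is an assumption).
lr = 0.1
new_params = [p - lr * g for p, g in zip(params, grads)]
print(loss(new_params, x, y) < loss(params, x, y))
```

Strictly speaking, only the backprop function is backpropagation; the final update line is gradient descent, which consumes the gradient that backpropagation produces.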

The term backpropagation strictly refers only to the algorithm for computing the gradient,
but it is often used loosely to refer to the entire learning algorithm, also including how the
gradient is used, such as by stochastic gradient descent.[4] Backpropagation generalizes the
gradient computation in the Delta rule, which is the single-layer version of backpropagation,
and is in turn generalized by automatic differentiation, where backpropagation is a special
case of reverse accumulation (or "reverse mode"). The term backpropagation and its
general use in neural networks were announced in Rumelhart, Hinton & Williams (1986a),
then elaborated and popularized in Rumelhart, Hinton & Williams (1986b), but the
technique was independently rediscovered many times, and had many predecessors dating
to the 1960s. A modern overview is given in Goodfellow, Bengio & Courville
(2016).
