che

© All Rights Reserved

4 views

che

© All Rights Reserved

- Word Recognition in Continuous Speech and Speaker Independent by Means of Recurrent Self-Organizing Spiking Neurons
- 02#Artificial Neural Networks Based Devnagri Numeral Recogni
- Automatic License Plate Recognition
- Sample Research Paper
- Application of Soft Computing in Electrical Engineering IJERTCONV5IS01058
- Neural Nw Too Lb
- Medical Text
- P.A. Estevez, R. Hernandez, C.A. Perez and C.M. Held- Gamma-filter self-organising neural networks for unsupervised sequence processing
- MAR_2010_SUPP
- Fábio Beckenkamp_ Ann
- ANN_GPU
- GeologicalApplications[1]
- 35-O-jayveer Singh-A Survey on Machine
- Som
- Kohonen Self Organizing Maps Shyam Guthikonda 2
- 2752-2760
- 00483421 - Neural Network Based Estimation of Power Electronic Waves
- INVE_MEM_2010_88004
- 5
- IJAIEM-2014-03-13-029

You are on page 1of 85

Submitted By:

Arti Vaidya

Anjali Anjali

Divya Durgadas

Janani Natarajan

State University of New York at Stony Brook

What is a neural network (NN)?

system loosely modeled based on the human brain. The field goes by many names, such as

connectionism, parallel distributed processing, neuro-computing, natural intelligent systems,

machine learning algorithms, and artificial neural networks.

An ANN is a network of many very simple processors ("units"), each possibly having a (small

amount of) local memory.

The units are connected by unidirectional communication channels ("connections"), which

carry numeric (as opposed to symbolic) data.

The units operate only on their local data and on the inputs they receive via the connections.

The design motivation is what distinguishes neural networks from other mathematical

techniques

A neural network is a processing device, either an algorithm, or actual hardware, whose

design was motivated by the design and functioning of human brains and components thereof.

Most neural networks have some sort of "training" rule whereby the weights of connections

are adjusted on the basis of presented patterns.

In other words, neural networks "learn" from examples, just like children learn to recognize

dogs from examples of dogs, and exhibit some structural capability for generalization.

Neural networks normally have great potential for parallelism, since the computations of the

components are independent of each other.

Introduction

Biological neural networks are much more complicated in their elementary structures

than the mathematical models we use for ANNs.

It is an inherently multiprocessor-friendly architecture and without much modification,

it goes beyond one or even two processors of the von Neumann architecture. It has

ability to account for any functional dependency. The network discovers (learns,

models) the nature of the dependency without needing to be prompted.

Neural networks are a powerful technique to solve many real world problems.

They have the ability to learn from experience in order to improve their

performance and to adapt themselves to changes in the environment. In addition

to that they are able to deal with incomplete information or noisy data and can be

very effective especially in situations where it is not possible to define the rules or

steps that lead to the solution of a problem.

They typically consist of many simple processing units, which are wired together in a

complex communication network.

Introduction

There is no central CPU following a logical sequence of rules - indeed there is no set of rules or

program. This structure then is close to the physical workings of the brain and leads to a new type of

computer that is rather good at a range of complex tasks.

In principle, NNs can compute any computable function, i.e. they can do everything a normal digital

computer can do. Especially anything that can be represented as a mapping between vector spaces

can be approximated to arbitrary precision by Neural Networks.

In practice, NNs are especially useful for mapping problems which are tolerant of some errors and

have lots of example data available, but to which hard and fast rules can not easily be applied.

In a nutshell a Neural network can be considered as a black box that is able to predict an

output pattern when it recognizes a given input pattern. Once trained, the neural network is

able to recognize similarities when presented with a new input pattern, resulting in a predicted

output pattern.

The Brain

System

nerve cells, or neurons. On average, each

neuron is connected to other neurons

through about 10 000 synapses. (The actual

figures vary greatly, depending on the local

neuroanatomy.)

Computation in the brain

The brain's network of neurons forms a massively parallel information processing system.

This contrasts with conventional computers, in which a single processor executes a single

series of instructions.

Against this, consider the time taken for each elementary operation: neurons typically operate

at a maximum rate of about 100 Hz, while a conventional CPU carries out several hundred

million machine level operations per second. Despite of being built with very slow hardware,

the brain has quite remarkable capabilities:

Its performance tends to degrade gracefully under partial damage. In contrast, most programs

and engineered systems are brittle: if you remove some arbitrary parts, very likely the whole

will cease to function.

It can learn (reorganize itself) from experience.

This means that partial recovery from damage is possible if healthy units can learn to take

over the functions previously carried out by the damaged areas.

It performs massively parallel computations extremely efficiently. For example, complex

visual perception occurs within less than 100 ms, that is, 10 processing steps!

It supports our intelligence and self-awareness. (Nobody knows yet how this occurs.)

As a discipline of Artificial Intelligence, Neural Networks attempt to bring computers a little

closer to the brain's capabilities by imitating certain aspects of information processing in the

brain, in a highly simplified way.

Neural Networks in the Brain

the largest anatomical scale, we

distinguish cortex, midbrain,

brainstem, and cerebellum. Each

of these can be hierarchically

subdivided into many regions,

and areas within each region,

either according to the anatomical

structure of the neural networks

within it, or according to the

function performed by them.

The overall pattern of projections (bundles of neural connections) between areas is extremely complex,

and only partially known. The best mapped (and largest) system in the human brain is the visual system,

where the first 10 or 11 processing stages have been identified. We distinguish feedforward projections

that go from earlier processing stages (near the sensory input) to later ones (near the motor output),

from feedback connections that go in the opposite direction.

In addition to these long-range connections, neurons also link up with many thousands of their

neighbours. In this way they form very dense, complex local networks

Neurons and Synapses

The basic computational unit in the

nervous system is the nerve cell, or neuron.

A neuron has:

Dendrites (inputs)

Cell body

Axon (output)

A neuron receives input from other neurons (typically many thousands). Inputs sum (approximately).

Once input exceeds a critical level, the neuron discharges a spike - an electrical pulse that travels from

the body, down the axon, to the next neuron(s) (or other receptors). This spiking event is also called

depolarization, and is followed by a refractory period, during which the neuron is unable to fire.

The axon endings (Output Zone) almost touch the dendrites or cell body of the next neuron.

Transmission of an electrical signal from one neuron to the next is effected by neurotransmittors,

chemicals which are released from the first neuron and which bind to receptors in the second. This link

is called a synapse. The extent to which the signal from one neuron is passed on to the next depends on

many factors, e.g. the amount of neurotransmittor available, the number and arrangement of receptors,

amount of neurotransmittor reabsorbed, etc.

Artificial Neuron Models

neurons in order to run detailed simulations of particular circuits in the brain. As

Computer Scientists, we are more interested in the general properties of neural

networks, independent of how they are actually "implemented" in the brain. This

means that we can use much simpler, abstract "neurons", which (hopefully) capture

the essence of neural computation even if they leave out much of the details of how

biological neurons work.

integrated on VLSI chips. Remember though that computers run much faster than

brains - we can therefore run fairly large networks of simple model neurons as

software simulations in reasonable time. This has obvious advantages over having

to use special "neural" computer hardware.

A Simple Artificial Neuron

The basic computational element (model neuron) is often called a node or unit. It receives input

from some other units, or perhaps from an external source. Each input has an associated weight w,

which can be modified so as to model synaptic learning. The unit computes some function f of the

weighted sum of its inputs

written neti.

Note that wij refers to the weight from unit j to unit i (not

the other way around).

simplest case, f is the identity function, and the unit's

output is just its net input. This is called a linear unit.

Applications:

Clustering:

A clustering algorithm explores the similarity between patterns and places similar patterns in a cluster. Best

known applications include data compression and data mining.

Classification/Pattern recognition:

The task of pattern recognition is to assign an input pattern (like handwritten symbol) to one of many

classes. This category includes algorithmic implementations such as associative memory.

Function approximation:

The tasks of function approximation is to find an estimate of the unknown function f() subject to noise.

Various engineering and scientific disciplines require function approximation.

Prediction/Dynamical Systems:

The task is to forecast some future values of a time-sequenced data. Prediction has a significant impact on

decision support systems. Prediction differs from Function approximation by considering time factor.

Here the system is dynamic and may produce different results for the same input data based on system state

(time).

Types of Neural Networks

Applications

-Classification

-Clustering

-Function approximation

-Prediction

Connection Type

- Static (feedforward)

- Dynamic (feedback)

Topology

- Single layer

- Multilayer

- Recurrent

- Self-organized

Learning Methods

- Supervised

- Unsupervised

The McCulloch-Pitts Model of Neuron

The early model of an artificial neuron is introduced by Warren McCulloch and Walter Pitts in

1943. The McCulloch-Pitts neural model is also known as linear threshold gate. It is a neuron

of a set of inputs I1,I2,I3Im and one output y . The linear threshold gate simply classifies

the set of inputs into two different classes. Thus the output y is binary. Such a function can be

described mathematically using these equations:

range of either (0,1) or (-1,1) and associated with

each input line, Sum is the weighted sum, and T is a

threshold constant. The function f is a linear step

function at threshold T as shown in figure

The Perceptron

In late 1950s, Frank Rosenblatt introduced a network composed of the units that were

enhanced version of McCulloch-Pitts Threshold Logic Unit (TLU) model. Rosenblatt's

model of neuron, a perceptron, was the result of merger between two concepts from the

1940s, McCulloch-Pitts model of an artificial neuron and Hebbian learning rule of

adjusting weights. In addition to the variable weight values, the perceptron model

added an extra input that represents bias. Thus, the modified equation is now as

follows:

The McCulloch-Pitts Model of Neuron

The McCulloch-Pitts model of a neuron is simple yet has substantial computing potential. It also has a precise

mathematical definition. However, this model is so simplistic that it only generates a binary output and also the

weight and threshold values are fixed. The neural computing algorithm has diverse features for various

applications . Thus, we need to obtain the neural model with more flexible computational features.

Artificial Neuron with Continuous

Characteristics

Based on the McCulloch-Pitts model described previously, the general form an artificial neuron

can be described in two stages shown in figure. In the first stage, the linear combination of inputs

is calculated. Each value of input array is associated with its weight value, which is normally

between 0 and 1. Also, the summation function often takes an extra input value Theta with

weight value of 1 to represent threshold or bias of a neuron. The summation function will be then

performed as

The sum-of-product value is then passed into the second stage to perform the activation function

which generates the output from the neuron. The activation function ``squashes" the amplitude the

output in the range of [0,1] or [-1,1] alternately. The behavior of the activation function will describe

the characteristics of an artificial neuron model.

Artificial Neuron with Continuous

Characteristics

The signals generated by actual biological neurons are the action-potential spikes, and the biological

neurons are sending the signal in patterns of spikes rather than simple absence or presence of single

spike pulse. For example, the signal could be a continuous stream of pulses with various frequencies.

With this kind of observation, we should consider a signal to be continuous with bounded range. The

linear threshold function should be ``softened".

One convenient form of such ``semi-linear" function is the logistic sigmoid function, or in short,

sigmoid function as shown in figure. As the input x tends to large positive value, the output value y

approaches to 1. Similarly, the output gets close to 0 as x goes negative. However, the output value

is neither close to 0 nor 1 near the threshold point.

This function is expressed

mathematically as follows:

describes the ``closeness" to the

threshold point by the slope. As x

approaches to

- infinity or + infinity , the slope is

zero; the slope increases as x

approaches to 0. This characteristic

often plays an important role in

learning of neural networks.

Single-Layer Network

By connecting multiple neurons, the true

computing power of the neural networks

comes, though even a single neuron can

perform substantial level of computation. The

most common structure of connecting

neurons into a network is by layers. The

simplest form of layered network is shown in

figure. The shaded nodes on the left are in the

so-called input layer. The input layer neurons

are to only pass and distribute the inputs and

perform no computation. Thus, the only true

layer of neurons is the one on the right. Each

of the inputs x1,x2,xN is connected to

every artificial neuron in the output layer

through the connection weight. Since every

value of outputs y1,y2,yN is calculated

from the same set of input values, each

output is varied based on the connection

weights. Although the presented network is

fully connected, the true biological neural

network may not have all possible

connections - the weight value of zero can be

represented as ``no connection".

Multilayer Network

To achieve higher level of computational

capabilities, a more complex structure of

neural network is required. Figure shows

the multilayer neural network which

distinguishes itself from the single-layer

network by having one or more hidden

layers. In this multilayer structure, the

input nodes pass the information to the

units in the first hidden layer, then the

outputs from the first hidden layer are

passed to the next layer, and so on.

cascading of groups of single-layer

networks. The level of complexity in

computing can be seen by the fact that

many single-layer networks are combined

into this multilayer network. The designer

of an artificial neural network should

consider how many hidden layers are

required, depending on complexity in

desired computation.

Backpropagation Networks

networks with distinct input, output, and hidden layers. The units function basically

like perceptrons, except that the transition (output) rule and the weight update

(learning) mechanism are more complex.

The figure on next page presents the architecture of backpropagation networks. There

may be any number of hidden layers, and any number of hidden units in any given

hidden layer. Input and output units can be binary {0, 1}, bi-polar {-1, +1}, or may

have real values within a specific range such as [-1, 1]. Note that units within the same

layer are not interconnected.

Backpropagation Networks

Backpropagation Networks

In feedforward activation, units of hidden layer

1 compute their activation and output values and

pass these on to the next layer, and so on until

the output units will have produced the

network's actual response to the current input.

The activation value ak of unit k is computed as

follows.

This is basically the same activation function of

linear threshold units (McCulloch and Pitts

model).

As illustrated above, xi is the input signal

coming from unit i at the other end of the

incoming connection. wki is the weight of the

connection between unit k and unit i. Unlike in

the linear threshold unit, the output of a unit in a

backpropagation network is no longer based on

a threshold. The output yk of unit k is computed

as follows:

The function f(x) is referred to as the output

function. It is a continuously increasing function

of the sigmoid type, asymptotically approaching

0 as x decreases, and asymptotically approaches

1 as x increases. At x = 0, f(x) is equal to 0.5.

Backpropagation Networks

In some implementations of the

backpropagation model, it is convenient to

have input and output values that are bi-polar.

In this case, the output function uses the

hypertangent function, which has basically

the same shape, but would be asymptotic to

1 as x decreases. This function has value 0

when x is 0.

the output units, the networks response is

compared to the desired output ydi which

accompanies the training pattern. There are

two types of error. The first error is the error

at the output layer. This can be directly

computed as follows:

hidden layers. This cannot be computed

directly since there is no available

information on the desired outputs of the

hidden layers. This is where the

retropropagation of error is called for.

Backpropagation Networks

Essentially, the error at the output layer is used to compute for the error at the hidden layer

immediately preceding the output layer. Once this is computed, this is used in turn to compute for

the error of the next hidden layer immediately preceding the last hidden layer. This is done

sequentially until the error at the very first hidden layer is computed. The retropropagation of error

is illustrated in the figure below:

Backpropagation Networks

The errors at the other end of the outgoing connections of the hidden unit h have been earlier

computed. These could be error values at the output layer or at a hidden layer. These error signals are

multiplied by their corresponding outgoing connection weights and the sum of these is taken.

Backpropagation Networks

The errors at the other end of the outgoing connections of the hidden unit h

have been earlier computed. These could be error values at the output layer

or at a hidden layer. These error signals are multiplied by their corresponding

outgoing connection weights and the sum of these is taken.

Backpropagation Networks

After computing for the error for each unit, whether

it be at a hidden unit or at an output unit, the

network then fine-tunes its connection weights

wkjt+1. The weight update rule is uniform for all

connection weights.

The learning rate a is typically a small value

between 0 and 1. It controls the size of weight

adjustments and has some bearing on the speed of

the learning process as well as on the precision by

which the network can possibly operate. f(x) also

controls the size of weight adjustments, depending

on the actual output f(x). In the case of the sigmoid

function above, its first derivative (slope) f(x) is

easily computed as follows:

We note that the change in weight is directly proportional to the error term computed for the unit at the

output end of the incoming connection. However, this weight change is controlled by the output signal

coming from the input end of the incoming connection. We can infer that very little weight change

(learning) occurs when this input signal is almost zero.

The weight change is further controlled by the term f(ak). Because this term measures the slope of the

function, and knowing the shape of the function, we can infer that there will likewise be little weight

change when the output of the unit at the other end of the connection is close to 0 or 1. Thus, learning

will take place mainly at those connections with high pre-synaptic signals and non-committed (hovering

around 0.5) post-synaptic signals.

Learning Process

One of the most important aspects of Neural Network is the learning process. The

learning process of a Neural Network can be viewed as reshaping a sheet of metal,

which represents the output (range) of the function being mapped. The training set

(domain) acts as energy required to bend the sheet of metal such that it passes through

predefined points. However, the metal, by its nature, will resist such reshaping. So the

network will attempt to find a low energy configuration (i.e. a flat/non-wrinkled shape)

that satisfies the constraints (training data).

In supervised training, both the inputs and the outputs are provided.

The network then processes the inputs and compares its resulting outputs against the

desired outputs. Errors are then calculated, causing the system to adjust the weights

which control the network. This process occurs over and over as the weights are

continually tweaked.

Summary

The following properties of nervous systems will be of particular interest in our neurally-inspired

models:

In unsupervised training, the network is provided with inputs but not with desired outputs. The

system itself must then decide what features it will use to group the input data. This is often referred

to as self-organization or adaption.

Following geometrical interpretations will demonstrate the learning process within different Neural

Models:

Parallel, distributed information processing

High degree of connectivity among basic units

Connections are modifiable based on experience

Learning is a constant process, and usually unsupervised

Learning is based only on local information

Performance degrades gracefully if some units are removed

References:

http://www.ece.utep.edu/research/webfuzzy/docs/kk-thesis/kk-thesis-html/node12.html

http://www.comp.nus.edu.sg/~pris/ArtificialNeuralNetworks/LinearThresholdUnit.html

http://www.csse.uwa.edu.au/teaching/units/233.407/lectureNotes/Lect1-UWA.pdf

Supervised and Unsupervised

Neural Networks

References

http://www.ai.rug.nl/vakinformatie/ias/slides/

3_NeuralNetworksAdaptation.pdf

http://www.users.cs.york.ac.uk/~sok/IML/iml

_nn_arch.pdf

http://ilab.usc.edu/classes/2005cs561/notes/

LearningInNeuralNetworks-CS561-3-05.pdf

Understanding Supervised and

Unsupervised Learning

B

A B

A

B A

Two possible Solutions

A B A B B B

A A

A B B A

Supervised Learning

It is based on a

Class

labeled training set.

The class of each

Class

A

piece of data in B Class

training set is

Class

B

known. A

Class labels are A Class

provided in the

training phase.

Supervised Vs Unsupervised

Task performed Task performed

Classification Clustering

Pattern NN Model :

Recognition

Self Organizing

NN model : Maps

Preceptron What groupings exist

Feed-forward NN in this data?

How is each data

What is the class of point related to the

this data point? data set as a

whole?

Unsupervised Learning

Input : set of patterns P, from n-dimensional space

S, but little/no information about their classification,

evaluation, interesting features, etc.

It must learn these by itself! : )

Tasks:

Clustering - Group patterns based on similarity

Vector Quantization - Fully divide up S into a

small set of regions (defined by codebook

vectors) that also helps cluster P.

Feature Extraction - Reduce dimensionality of S

by removing unimportant features (i.e. those that

do not help in clustering P)

Unsupervised Neural Networks

Kohonen Learning

Also defined Self Organizing Map

Learn a categorization of input space

Neurons are connected into a 1-D or 2-D lattice.

Each neuron represents a point in N-

dimensional pattern space, defined by N

weights

During training, the neurons move around to try

and fit to the data

Changing the position of one neuron in data

space influences the positions of its neighbors

via the lattice connections

Self Organizing Map Network

Structure

connected by weights

to each neuron

size of neighbourhood

changes as net learns

Aim is to map similar

inputs (sets of values)

to similar neuron

positions.

Data is clustered

because it is mapped

to the same node or

group of nodes

SOM-Algorithm

random values

2. Sampling : Draw an input sample x

and present in to network

3. Similarity Matching : The winning

neuron i is the neuron with the weight

vector that best matches the input

vector

i = argmin(j){ x wj }

SOM - Algorithm

4. Updating : Adjust the weights of the winning

neuron so that they better match the input.

Also adjust the weights of the neighbouring

neurons.

wj = . hij ( x wj)

neighbourhood function : hij

over time neigbourhood function gets smaller

of the input space and correspond

INTRUSION DETECTION

References

http://ilab.usc.edu/classes/2005cs561/notes/LearningInNeural

Networks-CS561-3-05.pdf

2004 IEEE International Conference on Systems, Man and

Cybernetics" A Hybrid Training Mechanism for Applying

Neural Networks to Web-based Applications Ko-Kang

Chu Maiga Chang Yen-Teh Hsia

Data Mining Approach for Network Intrusion Detection, Zhen

Zhang Advisor: Dr. Chung-E Wang04/24/2002,Department of

Computer Science,California State University, Sacramento

A Briefing Given to the SC2003 Education Program on

Knowledge Discovery In Databases,Nov 16 2003,NCSA

OUTLINE

What is IDS?

IDS with Data Mining

IDS and Neural Network

Hybrid Training Model

3 steps of Training

What is an IDS?

The process of monitoring and analyzing the events

occurring in a computer and/or network system in

order to detect signs of security problems

Anomaly detection: deviation from normal usage

Network based intrusion detection (NIDS) monitors

network traffic

Host based intrusion detection (HIDS) monitors a single

host

What is an IDS?

Limitations of IDS

Limitations of Misuse Detection

Signature database has to be manually revised for each new

type of discovered intrusion

They cannot detect emerging threats

Substantial latency in deployment of newly created signatures

Limitations of Anomaly Detection

False Positives alert when no attack exists. Typically, anomaly

detection is prone to a high number of false alarms due to

previously unseen legitimate behavior.

Data Overload

The amount of data for analysts to examine is growing too large.

This is the problem that data mining looks to solve.

Lack of Adaptability

System has to instantiated each new attack identified. Lack of

co-operative training and learning phase of system.

Why Data Mining can help?

Supervised learning: learn precise

models from past intrusions

Unsupervised learning: identify

suspicious activities

Maintain models on dynamic data

Data Mining - Overview

IDS with Data Mining

Neural Networks and IDS

NN is trainable -- Adaptability

Train for known pattern of attacks

Train for normal behavior

Learning and re-training in the face of a

new attack pattern.

Adjusts weights of synapses recursively

to learn new behaviors.

Hybrid Training Model

behaviors.

Training is carried out offline and used

online.

An intrusion detection model for we-

based applications.

Model follows.

Hybrid Training Model

configure

Feedback / Storing Log

Weight

DB

Decision

Module

configure

Update weights

New Training Set

LOG DB

Decision Abnormal Behavior

Detection Module -

Online

Decision

Module

Supervised

Filtering

1. Offline Training process

Real and Simulated Datasets with expected

results are put into the neural network for training,

without malicious data

IPAddress IP Date er time type

(msec)

140.245.1.55 67589 AM 09:30:56 2003/10/03 20000 N

140.245.1.44 67588 PM 08:45:44 2003/11/04 30000 AN - 001

130.158.3.66 64533 AM 07:12:08 2003/11/05 26000 AN - 010

Neural Network Model

IP Addresses are translated to 32 bit- binary and time data is

fed in milliseconds.

The expected outputs from the neural network are

Is it a normal access? (0/1)

What kinds of malicious actions is it? (000/001/010/011)

Can it be positively sure? (0/1)

Since we have input attribute values and expected outputs,

the preparation for training the neural network is done.

Back Propagation Neural Network Model

Input Layer 115 neurons

Hidden Layer 60 neurons

Output Layer 5 neurons

2. Analyze Access Behavior

Trained Neural Network model is used online to judge

whether the access is normal or abnormal.

When confusion arises between similar patterns and new

malicious behavior, store information in database. ( 0 000 0)

(Refer to model diagram)

Decision module identifies the output to be definitive or non-

definitive and stores the information for later review by

administrator.

Administrator categorizes the new patterns as normal or

abnormal behavior and re-trains the NN, through web

interface.

Web Interface for Online -

Training

3. Online - Training

network for training online.

Separate copy of the neural network is

being used for analysis

Once the training completes new

neural network replaces the old one.

(Refer to model diagram)

Design Suitability

Internet Ad Industry

Network Security

Antivirus

Using Neural Networks to

Forecast Stock Market Prices

Project report :

Ramon Lawrence

Department of Computer Science

University of Manitoba

umlawren@cs.umanitoba.ca

December 12, 1997

Contents

Abstract

Motivation behind using NN

Traditional Methods

NN for Stock Prediction

Abstract

predict market directions more

accurately, with the ability to discover

patterns in nonlinear and chaotic

systems.

Motivation behind using NN

stock market prices because they are

able to learn nonlinear mappings

between inputs and outputs.

It may be possible that NN

outperforms the traditional analysis

and other computer-based methods.

Traditional Methods

Statistics, technical analysis,

fundamental analysis, and linear

regression.

None of these techniques has proven

to be the consistently correct

prediction tool that is desired.

contd.

they are commonly used in practice and

represent a base-level standard, which

neural networks should outperform.

Also, many of these techniques are used to

preprocess raw data inputs, and their results

are fed into neural networks as input.

Technical Analysis

assumption that history repeats itself

and that future market direction can be

determined by examining past prices.

Ex. Using price, volume, and open

interest statistics, the technical analyst

uses charts to predict future stock

movements.

Fundamental Analysis

Fundamental analysis involves the in-depth analysis of a company's performance and profitability to determine its share price.

By studying the overall economic conditions, the company's competition, and other factors, it is possible to determine expected returns and the intrinsic value of shares.

This type of analysis assumes that a share's current (and future) price depends on its intrinsic value and anticipated return on investment.

contd.

The strength of fundamental analysis is its systematic approach and its ability to predict changes.

Unfortunately, it becomes harder to

formalize all this knowledge for purposes of

automation (with a neural network for

example), and interpretation of this

knowledge may be subjective.

Chaotic System

A chaotic system is a combination of a deterministic

and a random process. The deterministic process

can be characterized using regression fitting, while

the random process can be characterized by

statistical parameters of a distribution function.

Thus, using only deterministic or statistical

techniques will not fully capture the nature of a

chaotic system.

A neural network's ability to capture both deterministic and random features makes it ideal for modeling chaotic systems.

Other techniques

Many other techniques have been employed to forecast the stock market.

They range from charting programs to

sophisticated expert systems. Fuzzy

logic has also been used.

Comparison with Expert Systems

Expert systems capture knowledge from experts sequentially and formulate it into rules. In this capacity, expert systems can be used in conjunction with neural networks to predict the market.

In such a combined system, the neural

network can perform its prediction, while the

expert system could validate the prediction

based on its well-known trading rules.

contd.

The advantage of expert systems is that they can explain how

they derive their results. With neural networks, it is difficult to

analyze the importance of input data and how the network

derived its results.

However, neural networks are faster because they execute in

parallel and are more fault tolerant.

It is hard to extract information from experts and formalize it in

a way usable by expert systems.

Expert systems are only good within their domain of

knowledge and do not work well when there is missing or

incomplete information.

Neural networks handle dynamic data better and can

generalize and make

"educated guesses." Thus, neural networks are more suited to

the stock market environment than expert systems.

Application of NN to Stock Market Prediction

The application can be divided into two main areas:

Network environment and training data

Network organization

Training a NN

A neural network must be trained on some

input data.

The two major problems in implementing

this training discussed in the following

sections are:

Defining the set of inputs to be used (the learning environment)

Deciding on an algorithm to train the

network

Learning Environment

One of the most important factors in

constructing a neural network is deciding on

what the network will learn.

The goal of most of these networks is to

decide when to buy or sell securities based

on previous market indicators.

The challenge is determining which

indicators and input data will be used, and

gathering enough training data to train the

system appropriately.

contd.

The input data may be raw data on volume,

price, or daily change, but it may also

include derived data such as technical

indicators (moving average, trend-line

indicators, etc.) or fundamental indicators

(intrinsic share value, economic

environment, etc.).

The input data should allow the neural

network to generalize market behavior while

containing limited redundant data.
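As an illustration of a derived technical indicator, a simple moving average over closing prices can be computed as below (a minimal sketch; the window length and sample prices are assumptions, not values from the report):

```python
def moving_average(prices, window=5):
    """Simple moving average over a sliding window of closing prices."""
    if len(prices) < window:
        return []
    return [sum(prices[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(prices))]

# Hypothetical daily closing prices fed to the network as a derived input.
closes = [10.0, 10.5, 10.2, 10.8, 11.0, 10.9, 11.3]
print(moving_average(closes, window=5))
```

Indicators like this one smooth out day-to-day noise, which is why they are often preferred over raw prices as network inputs.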

An Example

A comprehensive example neural network system, henceforth

called the JSE-system, modeled the performance of the

Johannesburg Stock Exchange. This system had 63 indicators,

from a variety of categories, in an attempt to get an overall view

of the market environment by using raw data and derived

indicators.

The 63 input data values can be divided into the following

classes with the number of indicators in each class in

parenthesis:

fundamental(3) - volume, yield, price/earnings

technical(17) - moving averages, volume trends, etc.

JSE indices(20) - market indices for various sectors: gold,

metals, etc.

gold price/foreign exchange rates(3)

interest rates(4)

economic statistics(7) - exports, imports, etc.

contd.

The JSE-system normalized all data to the range [-1,1].

Normalizing data is a common feature in all systems as

neural networks generally use input data in the range

[0,1] or [-1,1].
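A minimal sketch of normalizing inputs to [-1,1] (linear min-max scaling is an assumption here; the JSE-system paper does not state its exact scaling method):

```python
def normalize(values, lo=-1.0, hi=1.0):
    """Linearly rescale a list of values into the range [lo, hi]."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        return [0.0 for _ in values]  # constant input: map to the midpoint
    scale = (hi - lo) / (vmax - vmin)
    return [lo + (v - vmin) * scale for v in values]

print(normalize([0, 1, 2]))  # → [-1.0, 0.0, 1.0]
```

Note that the minimum and maximum would typically be taken from the training set only, so that unseen test data is scaled consistently.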

It is interesting to note that although the final JSE-

system was trained with all 63 inputs, the analysis

showed that many of the inputs were unnecessary. The

authors used cross-validation techniques and sensitivity analysis to discard 20 input values with negligible effect on system performance.

Such pruning techniques are very important because they reduce the network size, which speeds up recall and training times. As the number of inputs to the network may be very large, pruning techniques are especially useful.

Network Training

The network is trained on input patterns so that the system minimizes its error and improves its performance. The training algorithm may vary depending on the network architecture, but the most common training algorithm used when designing financial neural networks is the back-propagation algorithm.

contd.

The most common network architecture for financial

neural networks is a multilayer feedforward network

trained using backpropagation.

Back-propagation is the process of back-propagating

errors through the system from the output layer towards

the input layer during training. Back-propagation is

necessary because hidden units have no training target

value that can be used, so they must be trained based

on errors from previous layers. The output layer is the only layer that has a target value against which to compare.

As the errors are back-propagated through the nodes,

the connection weights are changed. Training occurs

until the errors in the weights are sufficiently small to be

accepted.
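The training loop described above can be sketched for a tiny 2-2-1 feedforward network. This is a generic illustration of back-propagation, not the code of any system mentioned in the report; the XOR task, network size, learning rate, and epoch count are all assumptions:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_xor(epochs=20000, lr=0.5, seed=1):
    """Train a 2-2-1 network on XOR with plain back-propagation.

    Returns the total squared error before and after training."""
    rnd = random.Random(seed)
    w_h = [[rnd.uniform(-1, 1) for _ in range(3)] for _ in range(2)]  # hidden weights (+bias)
    w_o = [rnd.uniform(-1, 1) for _ in range(3)]                      # output weights (+bias)
    data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]

    def forward(x):
        h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_h]
        return h, sigmoid(w_o[0] * h[0] + w_o[1] * h[1] + w_o[2])

    def total_error():
        return sum((t - forward(x)[1]) ** 2 for x, t in data)

    start = total_error()
    for _ in range(epochs):
        for x, target in data:
            h, out = forward(x)
            # Output delta: the only layer with a known target value.
            d_out = (target - out) * out * (1 - out)
            # Hidden deltas: errors back-propagated from the output layer.
            d_h = [d_out * w_o[i] * h[i] * (1 - h[i]) for i in range(2)]
            for i in range(2):  # update output weights
                w_o[i] += lr * d_out * h[i]
            w_o[2] += lr * d_out
            for i in range(2):  # update hidden weights
                for j in range(2):
                    w_h[i][j] += lr * d_h[i] * x[j]
                w_h[i][2] += lr * d_h[i]
    return start, total_error()

before, after = train_xor()
print(f"total squared error before: {before:.3f}, after: {after:.3f}")
```

The key structural point is visible in the two delta computations: the output node compares against a target directly, while the hidden nodes receive their error through the output weights.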

Overtraining

The major problem in training a neural network is

deciding when to stop training. Since the ability to

generalize is fundamental for these networks to

predict future stock prices, overtraining is a serious

problem.

Overtraining occurs when the system memorizes patterns and thus loses the ability to generalize. It is an important factor in these prediction systems, as their primary use is to predict (or generalize) on input data they have never seen.
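One standard guard against overtraining, not detailed in the original report but widely used, is early stopping: monitor error on a held-out validation set and stop once it has failed to improve for a few epochs. A minimal sketch (the patience value is an assumption):

```python
def early_stop(val_errors, patience=3):
    """Return the epoch with the best validation error, scanning until the
    error has failed to improve for `patience` consecutive epochs."""
    best, best_epoch, waited = float("inf"), 0, 0
    for epoch, err in enumerate(val_errors):
        if err < best:
            best, best_epoch, waited = err, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_epoch

# Validation error falls, then rises as the network starts to memorize.
print(early_stop([0.9, 0.7, 0.5, 0.4, 0.45, 0.5, 0.6]))  # → 3
```

The weights saved at the returned epoch are the ones kept, since later epochs only improve the fit to the training set.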

Training Data

Training on large volumes of historical data

is computationally and time intensive and

may result in the network learning

undesirable information in the data set.

For example, stock market data is time-

dependent.

Sufficient data should be presented so that

the neural network can capture most of the

trends, but very old data may lead the

network to learn patterns or factors that are

no longer important or valuable.

Supplementary Learning

A variation on the back-propagation algorithm that is more computationally efficient was proposed when developing a system to predict Tokyo stock prices; its developers called it supplementary learning.

In supplementary learning, the weights are updated

based on the sum of all errors over all patterns

(batch updating).

Each output node in the system has an associated

error threshold (tolerance), and errors are only

back-propagated if they exceed this tolerance. This

procedure has the effect of only changing the units

in error, which makes the process faster and able to

handle larger amounts of data.
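The tolerance check at the heart of supplementary learning can be sketched as follows (a simplified illustration; the threshold value and the use of a simple difference as the error measure are assumptions):

```python
def filter_errors(outputs, targets, tolerance=0.1):
    """Return per-node errors, zeroing any within tolerance.

    Only errors exceeding the tolerance are back-propagated, so nodes
    already close enough to their targets are left unchanged."""
    errors = []
    for out, tgt in zip(outputs, targets):
        err = tgt - out
        errors.append(err if abs(err) > tolerance else 0.0)
    return errors

print(filter_errors([0.95, 0.40, 0.08], [1.0, 1.0, 0.0]))  # → [0.0, 0.6, 0.0]
```

In batch updating, these filtered errors would be summed over all patterns before any weight change, which is what makes the scheme faster on large data sets.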

Conclusion

Although neural networks are not perfect in their prediction, they outperform all other methods and provide hope that one day we can more fully understand dynamic, chaotic systems such as the stock market.

THANKS
