Perceptron
It models a neuron
It receives n weighted inputs (features), sums them, and compares the result against a threshold (theta) to
produce an output
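In symbols (using w1…wn for the weights and b for the bias), the perceptron computes:
s = w1*x1 + w2*x2 + … + wn*xn + b
output = 1 if s >= theta, otherwise 0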
o Sigmoid (a smooth activation function that can be used in place of the hard threshold)
So, if we have n variables (e.g. x, y, z… inputs), we need to find n+1 weight values (n weights + the bias)
These will be the coefficients in the equation of the separating line/plane/hyperplane
For example, if we have 4 inputs, the equation becomes (which is the equation of a hyperplane):
w1*x1 + w2*x2 + w3*x3 + w4*x4 + b = 0
We start with a random line to try to find the classification line
We change W to change the slope
We change the bias to change the intercept
We can control how much we should change the weight and bias through the learning rate
We train the perceptron to respond to each input vector with a corresponding target value of 0 or 1. If the data
is linearly separable (e.g. the AND function is; XOR is not), the classification line will be found in a finite
number of iterations (that is, we will find a classification line that matches the results of the training data set)
Procedure:
o Initialize the weights (zero or small random value)
o Pick a learning rate n (eta - a number between 0 and 1)
o Do the following until a stop condition is satisfied (error equals to zero or max. number of iterations):
For each training instance (vector of inputs – x’s, actual value - corresponding class)
Compute the predicted output: take the sum of the products of inputs and weights, add the
bias, and apply the threshold
Apply the learning rule:
o Error = actual – output
o Update the bias:
b = b + n * error
o For all inputs of the current vector, update the weights (remember that there’s
only one weight vector for the network):
W(i) = W(i) + error * n * x(i)
o Calculate the global error:
Global Error = Global Error + Error^2
o Observe that if the output is correct, no change is made
o A complete run through all training instances is called an epoch. Many epochs may run until the
stop condition is satisfied
o Training is complete if we finish an entire pass through all training vectors without error
o After that, if a vector P not in the training set is presented to the network, the network will tend to
exhibit generalization, responding with an output consistent with the classification learned from the
training set.
import java.text.*;

class Perceptron {
    static int MAX_ITER = 100;
    static double LEARNING_RATE = 0.1;
    static int NUM_INSTANCES = 100;
    static int theta = 0;

    public static void main(String args[]) {
        // three variables (features)
        double[] x = new double[NUM_INSTANCES];
        double[] y = new double[NUM_INSTANCES];
        double[] z = new double[NUM_INSTANCES];
        int[] actual = new int[NUM_INSTANCES];
        double[] weights = new double[4]; // 3 for the input variables and one for the bias
        double localError, globalError;
        int p, iteration, output;

        // sample training data (an assumption, not in the original notes): points are
        // labeled 1 when they lie above the plane x + y + z = 1.5, so the classes
        // are linearly separable and the perceptron is guaranteed to converge
        for (p = 0; p < NUM_INSTANCES; p++) {
            x[p] = randomNumber(0, 1);
            y[p] = randomNumber(0, 1);
            z[p] = randomNumber(0, 1);
            actual[p] = (x[p] + y[p] + z[p] >= 1.5) ? 1 : 0;
        }

        weights[0] = randomNumber(0, 1); // w1
        weights[1] = randomNumber(0, 1); // w2
        weights[2] = randomNumber(0, 1); // w3
        weights[3] = randomNumber(0, 1); // this is the bias

        iteration = 0;
        do {
            iteration++;
            globalError = 0;
            // loop through all instances (complete one epoch)
            for (p = 0; p < NUM_INSTANCES; p++) {
                // calculate predicted class
                output = calculateOutput(theta, weights, x[p], y[p], z[p]);
                // difference between actual and predicted class values
                localError = actual[p] - output;
                // update weights and bias
                weights[0] += LEARNING_RATE * localError * x[p];
                weights[1] += LEARNING_RATE * localError * y[p];
                weights[2] += LEARNING_RATE * localError * z[p];
                weights[3] += LEARNING_RATE * localError;
                // summation of squared error (error value for all instances)
                globalError += (localError * localError);
            }
        } while (globalError != 0 && iteration <= MAX_ITER); // stop condition

        System.out.println("Iterations: " + iteration);
        System.out.println("Equation: " + weights[0] + "*x + " + weights[1] + "*y + "
                + weights[2] + "*z + " + weights[3] + " = 0");
    }

    /**
     * returns a random double value within a given range
     * @param min the minimum value of the required range (int)
     * @param max the maximum value of the required range (int)
     * @return a random double value between min and max
     */
    public static double randomNumber(int min, int max) {
        DecimalFormat df = new DecimalFormat("#.####");
        double d = min + Math.random() * (max - min);
        String s = df.format(d);
        double x = Double.parseDouble(s);
        return x;
    }

    /**
     * returns either 1 or 0 using a threshold function
     * (theta is 0 in this example)
     * @param theta an integer value for the threshold
     * @param weights the array of weights
     * @param x the x input value
     * @param y the y input value
     * @param z the z input value
     * @return 1 or 0
     */
    static int calculateOutput(int theta, double weights[], double x, double y, double z) {
        double sum = x * weights[0] + y * weights[1] + z * weights[2] + weights[3];
        return (sum >= theta) ? 1 : 0;
    }
}
Example of a classification hyperplane
Equation:
-0.09807000000000016*x + 0.7617000000000004*y + -0.054989999999999956*z + -2.147100000000001=0
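Once training has produced such an equation, classifying a new vector P reduces to checking the sign of the weighted sum. A minimal sketch, using rounded weights from the example equation above (the class name `HyperplaneCheck` and the two sample points are illustrative assumptions):

```java
public class HyperplaneCheck {
    // rounded weights from the example hyperplane above
    static final double W1 = -0.09807, W2 = 0.7617, W3 = -0.05499, B = -2.1471;

    // same threshold function as calculateOutput, with theta = 0
    static int classify(double x, double y, double z) {
        double sum = W1 * x + W2 * y + W3 * z + B;
        return (sum >= 0) ? 1 : 0;
    }

    public static void main(String[] args) {
        // hypothetical test points, one on each side of the hyperplane
        System.out.println(classify(0.5, 3.0, 0.5)); // prints 1
        System.out.println(classify(0.5, 0.5, 0.5)); // prints 0
    }
}
```

This is the generalization step from the notes: the learned weights, not the training data, decide the class of unseen points.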