
Source: https://www.youtube.com/watch?v=1XkjVl-j8MM

Perceptron

 It models a neuron
 It receives n weighted inputs (features), sums them, and compares the result against a threshold (theta) to
produce an output

 It is used to classify linearly separable classes


 The perceptron can have another input known as the bias. When we use the bias, x0 equals 1.
 The goal is to find a line/plane/hyperplane that separates the classes by adjusting the weights (slope) and
bias (intercept)
 The transfer function translates the input signals into an output signal, using a threshold to produce the output.
 Two types of transfer functions are commonly used:
o Unit step

o Sigmoid
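The two transfer functions above can be sketched in Java as follows (the class and method names are illustrative, not from the notes):

```java
// Sketch of the two common transfer functions (names are ours, not from the notes)
class TransferFunctions {
    // Unit step: fires 1 when the weighted sum reaches the threshold theta, else 0
    static int unitStep(double sum, double theta) {
        return (sum >= theta) ? 1 : 0;
    }

    // Sigmoid: smooth alternative that squashes the sum into the range (0, 1)
    static double sigmoid(double sum) {
        return 1.0 / (1.0 + Math.exp(-sum));
    }

    public static void main(String[] args) {
        System.out.println(unitStep(0.5, 0.0)); // prints 1
        System.out.println(sigmoid(0.0));       // prints 0.5
    }
}
```

The unit step is what the program further down uses; the sigmoid becomes important when a differentiable output is needed (e.g. for gradient-based training).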

 So, if we have n variables (e.g. x, y, z… as inputs), we need to find n+1 weight values (n weights + the bias)
 These will be the coefficients in the equation of the separating line/plane/hyperplane
 For example, if we have 4 inputs, the equation becomes (which is the equation of a hyperplane):
w1*x1 + w2*x2 + w3*x3 + w4*x4 + b = 0
 We start with a random line to try to find the classification line
 We change W to change the slope
 We change the bias to change the intercept
 We can control how much we should change the weight and bias through the learning rate

 We train the perceptron to respond to each input vector with a corresponding target value of 0 or 1. If the data
is linearly separable, the classification line will be found in a finite number of iterations (that is, we will find a
classification line that matches the results of the training data set)
 Procedure:
o Initialize the weights (zero or small random value)
o Pick a learning rate n (eta – a number between 0 and 1)
o Do the following until a stop condition is satisfied (error equals to zero or max. number of iterations):
 For each training instance (vector of inputs – x’s, actual value - corresponding class)
 Compute the predicted output: sum of product of inputs, weights and bias and applying
the threshold
 Apply the learning rule:
o Error = actual – output
o Update the bias:
 b = b + n * error
o For all inputs of the current vector, update the weights (remember that there’s
only one weight vector for the network):
 W(i) = W(i) + error * n * x(i)
o Calculate the global error:
 Global Error = Global Error + Error²
o Observe that if the output is correct, no change is made
o A complete run through all training instances is called an epoch. Many epochs may run until the
stop condition is satisfied
o Training is complete if we finish an entire pass through all training vectors without error
o After that, if a vector P not in the training set is presented to the network, the network will tend to
generalize, responding with an output according to the classification learned from the
training set.
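The learning rule in the procedure above can be sketched as a single update step (class and variable names here are illustrative, not from the notes; the full program follows in the next section):

```java
// One application of the perceptron learning rule from the notes:
// error = actual - output, then b += eta*error and w[i] += eta*error*x[i]
class PerceptronStep {
    // Returns the updated values as {w0, ..., wn-1, b}
    static double[] update(double[] w, double b, double[] x, int actual,
                           double eta, double theta) {
        double sum = b;
        for (int i = 0; i < w.length; i++) sum += w[i] * x[i];
        int output = (sum >= theta) ? 1 : 0; // unit-step prediction
        double error = actual - output;      // 0 when the prediction is correct -> no change
        double[] result = new double[w.length + 1];
        for (int i = 0; i < w.length; i++)
            result[i] = w[i] + eta * error * x[i];
        result[w.length] = b + eta * error;
        return result;
    }

    public static void main(String[] args) {
        // Misclassified instance: sum = 0.2*1.0 - 0.4*2.0 = -0.6 < 0, so output 0, actual 1
        double[] r = update(new double[]{0.2, -0.4}, 0.0,
                            new double[]{1.0, 2.0}, 1, 0.1, 0.0);
        // New weights: 0.3 and -0.2, new bias: 0.1
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```

Note that when the prediction is already correct, the error is zero and the weights and bias are left unchanged, exactly as the procedure states.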

Perceptron Algorithm in Java


/**
* The Perceptron Algorithm
* By Dr Noureddin Sadawi
* Please watch my youtube videos on perceptron for things to make sense!
* Copyright (C) 2014
* @author Dr Noureddin Sadawi
*
* This program is free software: you can redistribute it and/or modify
* it as you wish ONLY for legal and ethical purposes
*
* I ask you only, as a professional courtesy, to cite my name, web page
* and my YouTube Channel!
*
* Code adapted from:
* https://github.com/RichardKnop/ansi-c-perceptron
*/

import java.text.*;

class Perceptron {
    static int MAX_ITER = 100;
    static double LEARNING_RATE = 0.1;
    static int NUM_INSTANCES = 100;
    static int theta = 0;

    public static void main(String args[]) {
        // three variables (features)
        double[] x = new double[NUM_INSTANCES];
        double[] y = new double[NUM_INSTANCES];
        double[] z = new double[NUM_INSTANCES];
        int[] actual = new int[NUM_INSTANCES];

        // fifty random points of class 1
        for (int i = 0; i < NUM_INSTANCES / 2; i++) {
            x[i] = randomNumber(5, 10);
            y[i] = randomNumber(4, 8);
            z[i] = randomNumber(2, 9);
            actual[i] = 1;
            System.out.println(x[i] + "\t" + y[i] + "\t" + z[i] + "\t" + actual[i]);
        }

        // fifty random points of class 0
        for (int i = NUM_INSTANCES / 2; i < NUM_INSTANCES; i++) {
            x[i] = randomNumber(-1, 3);
            y[i] = randomNumber(-4, 2);
            z[i] = randomNumber(-3, 5);
            actual[i] = 0;
            System.out.println(x[i] + "\t" + y[i] + "\t" + z[i] + "\t" + actual[i]);
        }

        double[] weights = new double[4]; // 3 for input variables and one for bias
        double localError, globalError;
        int p, iteration, output;

        weights[0] = randomNumber(0, 1); // w1
        weights[1] = randomNumber(0, 1); // w2
        weights[2] = randomNumber(0, 1); // w3
        weights[3] = randomNumber(0, 1); // this is the bias

        iteration = 0;
        do {
            iteration++;
            globalError = 0;
            // loop through all instances (complete one epoch)
            for (p = 0; p < NUM_INSTANCES; p++) {
                // calculate predicted class
                output = calculateOutput(theta, weights, x[p], y[p], z[p]);
                // difference between actual and predicted class values
                localError = actual[p] - output;
                // update weights and bias
                weights[0] += LEARNING_RATE * localError * x[p];
                weights[1] += LEARNING_RATE * localError * y[p];
                weights[2] += LEARNING_RATE * localError * z[p];
                weights[3] += LEARNING_RATE * localError;
                // summation of squared error (error value for all instances)
                globalError += (localError * localError);
            }

            /* Root Mean Squared Error */
            System.out.println("Iteration " + iteration + " : RMSE = "
                    + Math.sqrt(globalError / NUM_INSTANCES));
        } while (globalError != 0 && iteration <= MAX_ITER);

        System.out.println("\n=======\nDecision boundary equation:");
        System.out.println(weights[0] + "*x + " + weights[1] + "*y + "
                + weights[2] + "*z + " + weights[3] + " = 0");

        // generate 10 new random points and check their classes
        // notice the range of -10 to 10 means the new point could be of class 1 or 0:
        // -10 to 10 covers all the ranges used above in generating the class-1 and class-0 points
        for (int j = 0; j < 10; j++) {
            double x1 = randomNumber(-10, 10);
            double y1 = randomNumber(-10, 10);
            double z1 = randomNumber(-10, 10);

            output = calculateOutput(theta, weights, x1, y1, z1);

            System.out.println("\n=======\nNew Random Point:");
            System.out.println("x = " + x1 + ", y = " + y1 + ", z = " + z1);
            System.out.println("class = " + output);
        }
    } // end main

    /**
     * Returns a random double value within a given range.
     * @param min the minimum value of the required range (int)
     * @param max the maximum value of the required range (int)
     * @return a random double value between min and max
     */
    public static double randomNumber(int min, int max) {
        DecimalFormat df = new DecimalFormat("#.####");
        double d = min + Math.random() * (max - min);
        String s = df.format(d);
        double x = Double.parseDouble(s);
        return x;
    }

    /**
     * Returns either 1 or 0 using a threshold function (theta is 0 here).
     * @param theta an integer value for the threshold
     * @param weights the array of weights
     * @param x the x input value
     * @param y the y input value
     * @param z the z input value
     * @return 1 or 0
     */
    static int calculateOutput(int theta, double weights[], double x, double y, double z) {
        double sum = x * weights[0] + y * weights[1] + z * weights[2] + weights[3];
        return (sum >= theta) ? 1 : 0;
    }
}
Example of a classification hyperplane

 Equation:
-0.09807000000000016*x + 0.7617000000000004*y + -0.054989999999999956*z + -2.147100000000001 = 0

 Random point A (red): x = 1.4165, y = 5.0086, z = 6.2186 (class 1)

 Random point B (blue): x = 6.6346, y = -2.6293, z = 5.3033 (class 0)
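The two points can be checked against the hyperplane equation above by plugging them into the left-hand side and thresholding at zero (a small sketch using the coefficients printed above; the class name is ours):

```java
// Verify the example points against the reported decision boundary
class BoundaryCheck {
    // Classify by the sign of w1*x + w2*y + w3*z + b, using the coefficients above
    static int classify(double x, double y, double z) {
        double sum = -0.09807000000000016 * x + 0.7617000000000004 * y
                + -0.054989999999999956 * z + -2.147100000000001;
        return (sum >= 0) ? 1 : 0;
    }

    public static void main(String[] args) {
        System.out.println(classify(1.4165, 5.0086, 6.2186));  // point A -> 1
        System.out.println(classify(6.6346, -2.6293, 5.3033)); // point B -> 0
    }
}
```

For point A the sum is about +1.19, so it falls on the class-1 side; for point B it is about -5.09, placing it on the class-0 side, matching the labels reported above.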
