Chapter 1
Introduction

There are several types of neural networks, but the feed-forward neural network is one of the most prevalent. A feed-forward neural network contains one input layer, one or more hidden layers, and one output layer, as shown in Fig. 1.1. Stimuli from external sources enter at the input layer and the output signal leaves from the output layer. Feed-forward neural networks can be trained with the gradient-descent-based back-propagation algorithm [1]. Additive hidden nodes are the kind most frequently used in such networks. For an additive hidden node, the output of the i-th node in the l-th hidden layer is

    G_i^{(l)}(x) = g\big(a_i^{(l)} \cdot x + b_i^{(l)}\big)    (1.1)

with the sigmoid activation function g : \mathbb{R} \to \mathbb{R},

    g(x) = \frac{1}{1 + \exp(-x)}    (1.2)

for the nodes present in the hidden layer. Here a_i^{(l)} is the weight vector linking the (l-1)-th layer to the i-th node of the l-th layer and b_i^{(l)} is the bias of the i-th node of the l-th layer [2]. The architecture of the feed-forward network is shown in Fig. 1.1. The expression a_i^{(l)} \cdot x signifies the inner vector product of a_i^{(l)} and x.

Figure 1.1: Architecture of Feed-Forward Neural Network

However, the back-propagation algorithm requires considerable computation time. Extreme Learning Machines (ELMs) are time efficient and less complex than conventional gradient-based algorithms. ELM uses additional methods such as weight decay and early stopping to prevent issues such as local minima and improper learning rates in single-hidden-layer feed-forward neural networks [3]. ELM can be trained with non-differentiable activation functions, in contrast to gradient-based learning algorithms [3].

Chapter 2
Literature Survey

The ELM model is a supervised learning machine based on a Single-hidden Layer Feed-forward Neural network (SLFN) architecture [3], as shown in Fig. 2.1. The weights connecting to the output layer can be systematically resolved through the generalized inverse operation when the weights and biases of the hidden neurons are arbitrarily allocated [4]. This evades the necessity to tune these hidden-neuron parameters [4]. Assume that

    (x_j, t_j) \in \mathbb{R}^n \times \mathbb{R}^m, \quad j = 1, \ldots, N    (2.1)

is a set of N patterns, where x_j is an n x 1 input vector and t_j is an m x 1 output vector [4]. For an SLFN with M hidden neurons and activation function g(x), there exist \beta_i, a_i, and b_i such that

    \sum_{i=1}^{M} \beta_i \, G(a_i, b_i, x_j) = t_j, \quad j = 1, \ldots, N    (2.2)

where a_i and b_i are the learning parameters of the hidden nodes (a_i is the weight vector connecting the input nodes to the i-th hidden node and b_i is the bias of the i-th hidden node), \beta_i is the i-th output weight, and G(a_i, b_i, x_j) is the output of the i-th hidden node with respect to the input vector x_j [4]. If the hidden neuron is additive, it follows that [4]

    G(a_i, b_i, x_j) = g(a_i \cdot x_j + b_i)    (2.3)

Figure 2.1: Architecture of ELM. Adapted from [7]. The neurons present in the input layer are linked to a large number of non-linear neurons present in the hidden layer through random weights and controllable biases, b(1) to b(M) [7]. The neurons present in the output layer possess linear characteristics [7].

The links from the neurons in the hidden layer to the neurons in the output layer carry trainable weights, and the neurons in the output layer produce a sum of their inputs [7]. This sum is generated in a linear fashion [7].
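To make Equations (1.1), (1.2), and (2.3) concrete, a minimal sketch of a single additive hidden node with sigmoid activation is given below. The input dimension n = 8, the weight range [-1, 1], and the random seed are hypothetical illustration choices, not values taken from this work.

import numpy as np

def sigmoid(x):
    # Sigmoid activation of Eq. (1.2): g(x) = 1 / (1 + exp(-x))
    return 1.0 / (1.0 + np.exp(-x))

def additive_hidden_node(a_i, b_i, x):
    # Additive hidden-node output of Eqs. (1.1)/(2.3): g(a_i . x + b_i)
    return sigmoid(np.dot(a_i, x) + b_i)

rng = np.random.default_rng(0)
n = 8                            # input dimension (hypothetical)
x = rng.random(n)                # example input vector
a_i = rng.uniform(-1.0, 1.0, n)  # randomly assigned input weights
b_i = rng.uniform(-1.0, 1.0)     # randomly assigned bias
print(additive_hidden_node(a_i, b_i, x))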
Equation (2.2) can then be written as

    H\beta = T    (2.4)

where

    H = \begin{bmatrix} G(a_1, b_1, x_1) & \cdots & G(a_M, b_M, x_1) \\ \vdots & & \vdots \\ G(a_1, b_1, x_N) & \cdots & G(a_M, b_M, x_N) \end{bmatrix}    (2.5)

is known as the hidden layer output matrix (the i-th row of H being the output of the hidden layer with respect to the input vector x_i, and the j-th column being the outputs of the j-th hidden node with respect to the input vectors x_1 to x_N) [4]. Thus, the matrix of output weights \beta can be estimated as

    \beta = H^{+} T    (2.6)

where H^{+} is the Moore-Penrose generalised inverse (pseudo-inverse) of the hidden layer output matrix H [4]. The complete training procedure is summarised in Table 2.1.

Table 2.1: Training Procedure of Extreme Learning Machine

Input:   Training set (x_j, t_j) \in \mathbb{R}^n \times \mathbb{R}^m, j = 1, ..., N, and M hidden neurons with sigmoid activation function g(x)
Step 1:  Randomly assign the input weights a_i and biases b_i, i = 1, ..., M
Step 2:  Calculate the hidden layer output matrix H, whose entries are G(a_i, b_i, x_j) = g(a_i \cdot x_j + b_i)
Step 3:  Calculate the output weight matrix \beta = H^{+} T, where H^{+} is the Moore-Penrose generalised inverse (pseudo-inverse) of H and T is the matrix of targets t_j from Eq. (2.2)
Output:  Output weight matrix \beta
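As a hedged illustration of the training procedure in Table 2.1, the sketch below implements Steps 1 to 3 with NumPy, using np.linalg.pinv for the Moore-Penrose pseudo-inverse. The training data X and T, the number of hidden neurons M = 32, and the weight range [-1, 1] are hypothetical choices made only for this example.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def elm_train(X, T, M, rng=None):
    # Steps 1-3 of Table 2.1: random hidden parameters, H, and beta = H^+ T
    if rng is None:
        rng = np.random.default_rng(0)
    n = X.shape[1]
    A = rng.uniform(-1.0, 1.0, (n, M))   # Step 1: random input weights a_i (one column per hidden node)
    b = rng.uniform(-1.0, 1.0, M)        # Step 1: random biases b_i
    H = sigmoid(X @ A + b)               # Step 2: hidden layer output matrix H (N x M), Eq. (2.5)
    beta = np.linalg.pinv(H) @ T         # Step 3: beta = H^+ T, Eq. (2.6)
    return A, b, beta

def elm_predict(X, A, b, beta):
    # Linear output layer of Eq. (2.2): weighted sum of hidden-node outputs
    return sigmoid(X @ A + b) @ beta

rng = np.random.default_rng(1)
X = rng.random((100, 8))     # N = 100 hypothetical training patterns, n = 8 inputs
T = rng.random((100, 3))     # m = 3 hypothetical targets per pattern
A, b, beta = elm_train(X, T, M=32)
print(elm_predict(X, A, b, beta).shape)   # -> (100, 3)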
In spite of their efficiency, ELM neural networks are hard to implement at the hardware level because each input neuron is connected to all the neurons present in the hidden layer. This requires a considerable amount of hardware resources, which grows with the number of hidden layer neurons. In [6] and [7], the authors assume a fixed number of input-layer and hidden-layer neurons, which prevents them from using complex datasets. In order to remove these issues, a neural network based on a receptive field (RF) approach [8] is used.

The idea of using receptive fields [9] (see Fig. 2.3) for pattern recognition originates from neuroscience, where neurons frequently react to input stimuli generated from a restricted three-dimensional range of data, as shown in Fig. 2.2. Here, the Primary Afferent is a pseudo-unipolar neuron whose central end synapses on the Second Order Neuron and whose peripheral end is connected to receptors. The tissue from which the Primary Afferent receives sensory information is the Receptive Field of the Primary Afferent. Multiple Primary Afferents synapse on the Second Order Neuron (Dorsal Horn Neuron). Hence, the combined Receptive Field of the multiple Primary Afferents is the Receptive Field of the Second Order Neuron.

Figure 2.2: Receptive Field of Second Order Neuron [20]

Including this concept in the hardware implementation of pattern recognition improves its performance [9]. Receptive-field based ELM gives accuracy similar to traditional ELM algorithms [9], with the additional benefit of utilizing fewer hardware resources. Also, the positions of the receptive fields [9] can differ based on the size of the input data, with corresponding changes required in the connections between the hidden layer neurons and their receptive fields. However, implementing the RF approach at the hardware level is complex, since the location and dimensions of the receptive field are either found arbitrarily or through patterns that arbitrarily access the input data.

The input image in binary format and a shift register array are used for the hardware implementation of the RF approach, as shown in Fig. 2.4. Regions are selected from the input image by shifting the index variable across the memory array. These generated regions are the receptive fields, which are either square or rectangular in shape.

Figure 2.3: Illustration of the receptive field methodology [9]. Every single neuron present in the hidden layer obtains its inputs from an arbitrary receptive field [9], which is either square or rectangular in shape.

Fig. 2.4 shows the functioning of the RF approach on an image from the MNIST dataset [10] of size 28x28 pixels. Each pixel can be represented using an 8-bit binary value. First, the image is restructured from 28x28 pixels to 784x1 pixels. Then, the RF is shifted by 16 pixels across the restructured image; these 16 pixels correspond to the step size. Here, the size of the receptive field is 128 pixels. Fig. 2.4 depicts the restructured image on the left and the region covered by the receptive field over the image on the right for different positions of the index variable. Fig. 2.4a shows the receptive field covering index 1 to index 128 of the restructured image. The region covered by the receptive field (as shown in Fig. 2.4a) is used to generate the stimulus for the first hidden layer neuron when multiplied by the weights present in the input layer. These weights are arbitrarily generated for each neuron present in the hidden layer. In Fig. 2.4b, the starting position of the index over the restructured image is shifted by one step and its corresponding receptive field is shown. The region covered by the receptive field (as shown in Fig. 2.4b) is used to generate the stimulus for the second hidden layer neuron when multiplied by the weights present in the input layer. This process is repeated to generate the stimuli for the other neurons present in the hidden layer by multiplying with their corresponding random weights.

Figure 2.4: Illustration of RF Approach. Adapted from [5]

Fig. 2.4c shows the region covered by the receptive field after shifting the starting index by 16 steps over the restructured image. After the 49th step, the final step over the restructured image, the index comes back to its initial position.
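The indexing scheme just described can be sketched in a few lines. The sketch below assumes the RF wraps around to the start of the flattened image when the window runs past the end (suggested by the index returning to its initial position after the final step), and it uses a random stand-in image rather than actual MNIST data.

import numpy as np

RF_SIZE = 128   # receptive-field size in pixels (as described above)
STEP = 16       # step size in pixels (as described above)

def receptive_field(image_1d, neuron_index):
    # Return the 128-pixel region of the flattened image feeding one hidden neuron.
    # neuron_index = 0 corresponds to Fig. 2.4a, neuron_index = 1 to Fig. 2.4b, and so on.
    start = (neuron_index * STEP) % image_1d.size
    idx = (start + np.arange(RF_SIZE)) % image_1d.size   # wrap past the end (assumption)
    return image_1d[idx]

def hidden_stimulus(image_1d, neuron_index, weights):
    # Stimulus to a hidden neuron: its RF region multiplied by its random input weights.
    return np.dot(receptive_field(image_1d, neuron_index), weights)

rng = np.random.default_rng(2)
image = rng.integers(0, 256, size=28 * 28).astype(float)   # stand-in for a flattened MNIST image
weights = rng.uniform(-1.0, 1.0, RF_SIZE)                   # random weights for one hidden neuron
print(hidden_stimulus(image, neuron_index=1, weights=weights))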
