# MatLab Tutorial

Draft

Anthony S. Maida October 8, 2001; revised September 20, 2004; March 23, 2006

Contents

- 1 Introduction
  - 1.1 Is MatLab appropriate for your problem?
  - 1.2 Interpreted language
    - 1.2.1 Command-line shell
    - 1.2.2 Loading scripts from files
- 2 Vectorizing code
  - 2.1 Matrices
    - 2.1.1 Statement terminators and output suppression
    - 2.1.2 Some matrix operators
    - 2.1.3 Flexible matrix access
    - 2.1.4 Loading data via a script
  - 2.2 Vector operations
    - 2.2.1 Examining the workspace
  - 2.3 Applying functions to matrix elements
    - 2.3.1 Vectorizing a feedforward network for one epoch
  - 2.4 Defining functions
- 3 Plotting and visualization
  - 3.1 Simple plotting
    - 3.1.1 Application to neural networks
  - 3.2 3D Plots
  - 3.3 Surface plots
  - 3.4 Other plotting commands
- A Appendix

In MATLAB, the open-square-bracket character signals that you are entering a matrix. The process of entering a matrix can extend over more than one line and is terminated with the close-square-bracket character. Within a matrix, a semicolon terminates a row. (The function of the semicolon operator is context dependent: in a matrix, it terminates a row; at the end of a line, it suppresses output.)

    >> w = [1 2 3; 4 5 6]

MATLAB will respond by echoing the value of w. That is, it will print w and the matrix of values.

#### 2.1.1 Statement terminators and output suppression

The statement terminator is the newline character: a newline terminates a statement unless you are typing in a matrix. Terminating a line with a semicolon character suppresses the output, as shown below. (The semicolon is a statement terminator in the programming languages C, C++, and Java, but is an output suppression operator in MATLAB.)

    >> w = [1 2 3; 4 5 6];

The line continuation operator is three consecutive periods (...), which signals that you are typing a command that extends across more than one line.

If you are entering assignment statements into a file, then normally you would terminate them with a semicolon (to suppress output) unless you want the value of some variable to be printed as the file is evaluated.

#### 2.1.2 Some matrix operators

The mathematical notation for the transpose of a matrix w is $w^T$. In MATLAB, the apostrophe symbol is used instead. That is, w' yields the transpose of w. In the example below, w is multiplied by its transpose. The dimensions of w and w' are compatible (2 × 3 and 3 × 2), so the matrices can be multiplied.

    >> w*w'
    ans =
        14    32
        32    77

Of course, you can also add two compatible matrices using the "+" operator.

There are other, more subtle operators. Suppose that you want to square each element in a matrix. There is an operator, called array multiply and written ".*", that multiplies corresponding elements of two m × n matrices to yield a new m × n matrix. In this example, we use the operator to square the elements of w.

    >> w.*w
    ans =
         1     4     9
        16    25    36

#### 2.1.3 Flexible matrix access

You can access an element of the matrix w in the previous example by using an expression of the form w(i,j), where i indicates the row and j indicates the column. MATLAB is amazingly flexible in allowing you to access information in matrices.
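The element-access form w(i,j) and the operators from Section 2.1.2 can be tried together in a few lines. This is a minimal sketch rather than part of the original tutorial; w is the matrix from the text, and the names elem, gram, squares, and doubled are illustrative.

```matlab
% Sketch only: w is the matrix from the text; the other names are illustrative.
w = [1 2 3; 4 5 6];

elem    = w(2, 3);     % element in row 2, column 3
gram    = w * w';      % matrix product of w with its transpose (2 x 2)
squares = w .* w;      % array multiply: elementwise product (2 x 3)
doubled = w + w;       % sum of two compatible matrices (2 x 3)

disp(elem);
disp(gram);
disp(squares);
disp(doubled);
```

The contrast between `*` and `.*` is the point here: the first follows the rules of matrix multiplication, while the second requires both operands to have the same dimensions and works element by element.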

You can also treat the matrix w as a six-element vector by referencing it using the expression w(:). Type this into the command shell to see what happens.

You also have full access to rows and columns in the matrix. For instance, the expression w(2,:) gives you the second row of the matrix and the expression w(:,2) gives you the second column. Type these into the command shell to see what happens. You can delete the second column of the matrix, w(:,2), by typing

    >> w(:,2) = []
    w =
         1     3
         4     6

In most of the examples that follow, I will leave out the command-line prompt.

#### 2.1.4 Loading data via a script

It is often very useful to generate data in a traditional language such as C++ or Java and then visualize it using MATLAB. There is a very easy way to do this: write the output data in the form of a MATLAB script. Afterwards, simply executing the script will load the data into MATLAB. The example below illustrates this.

Suppose the variable data is a 6 × 12 array of zeros and ones, representing, say, the current state of a cellular automaton such as Conway's game of life. The array gets a set of new values on each cycle of the game. Let's assume the method printData is executed on each cycle in order to print the current state of the game.

    static void printData(int cyc) {
        // Open the assignment to frame number cyc of the MATLAB array.
        System.out.println("data(:, :, " + cyc + ") = [");
        for (int r = 0; r < data.length; r++) {
            // Print all but the last entry of the row, separated by spaces.
            for (int c = 0; c < (data[r].length-1); c++) {
                System.out.print(data[r][c]+" ");
            }
            // Print the last entry followed by the MATLAB row terminator.
            System.out.println(data[r][data[r].length-1]+";");
        }
        // Close the matrix; the semicolon suppresses MATLAB's echo.
        System.out.println("];");
    }

Possible output for the first cycle is outlined below.

    data(:, :, 1) = [
    (six rows of twelve 0/1 entries, each row ending in a semicolon)
    ];
    data(:, :, 2) = [
    ...

Notice that the Java method embeds the output within a MATLAB script in which the MATLAB variable data is a three-dimensional array. The third dimension holds the cycle number and the first two dimensions hold the state of the two-dimensional array of cellular automata for that cycle.

Suppose several cycles of this output are sent to a file named data.m. Then from within MATLAB this data can be loaded simply by typing the one-line command data, which is an instruction to evaluate the script named data.m. Once loaded into MATLAB, any number of tools can be used to manipulate, examine, and visualize the data.
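To see what a script such as data.m does when it is evaluated, the frame-by-frame assignments can be typed directly. The sketch below uses made-up 2 × 3 frames instead of the 6 × 12 frames in the text, and the name frame2 is illustrative.

```matlab
% Sketch with assumed toy frames: 2 x 3 instead of the 6 x 12 frames in the text.
clear data;                          % start without an existing variable
data(:, :, 1) = [0 1 0; 0 0 1];      % frame for cycle 1
data(:, :, 2) = [1 0 0; 0 1 0];      % frame for cycle 2; the array grows a page

disp(size(data));                    % rows, columns, number of cycles
frame2 = data(:, :, 2);              % pull one cycle back out as a 2-D matrix
disp(frame2);
```

Each assignment to `data(:, :, k)` adds one page along the third dimension, which is why a generated script can simply list the cycles one after another.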

### 2.2 Vector operations

Let us interpret the matrix w as the weight matrix for the first layer of a two-unit neural network with three input features to each unit. The matrix has two rows and each row codes the three weight values for one unit. Then if we have an input pattern vector p = [1; 0; 0] (represented as a column vector), we can compute the net input for this layer with one matrix multiplication, as shown below in the last line.

    w = [1 2 3; 4 5 6];
    p = [1; 0; 0];
    n = w*p;

A note on column vectors: vectors are a different kind of object than matrices. When we use vectors and matrices at the same time, we need to find a way to treat them uniformly. The convention is to treat an n-component vector as an n × 1 matrix. So, a vector with three components is treated as a 3 × 1 matrix. Notice that these matrices have n rows and one column. For this reason, they are called column vectors.

Each component $n_i$ of n represents the net input for one neuron in the input layer. Each $n_i$ is equivalent to the sum below, where N is the number of inputs to each unit (here N = 3).

$$n_i = \sum_{j=1}^{N} w_{i,j}\, p_j$$

If we were to compute these sums using for-loops, the interpreter would have to decode and execute six different assignment statements. In this example, the operation of matrix multiplication vectorizes a double-nested for-loop, so that only one statement is needed to calculate it.

#### 2.2.1 Examining the workspace

MATLAB has excellent run-time debugging facilities. In an interpreted language, you get run-time errors that are inconceivable in a compiled language, so excellent run-time debugging is a necessity in an interpreted language.

In the MATLAB development environment, you have direct access to inspect and edit the objects in the workspace. Depending on your platform, the workspace inspector is probably under the window menu. When you access this inspector, you will see a list of variable names and the amount of space their associated objects consume. The current workspace has three variables which reference matrices. If you click on one of these names, you will be able to inspect the array contents using a spreadsheet-like interface. You can even change the values of entries in a matrix. If you are debugging a neural network program, you can watch the weights evolve using this editor.

### 2.3 Applying functions to matrix elements

To complete the feedforward network computation, let us apply the sigmoidal activation function to each element of the net input array in the previous example.
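Before applying the activation function, it may help to confirm the claim from Section 2.2 that one matrix multiplication replaces a double-nested for-loop. The following is a minimal sketch; w and p are the values from the text, while nLoop, i, and j are illustrative names.

```matlab
% Sketch: explicit double loop versus the single statement n = w*p.
% nLoop, i, and j are illustrative names, not part of the original text.
w = [1 2 3; 4 5 6];
p = [1; 0; 0];

nLoop = zeros(size(w, 1), 1);            % one net input per unit (row of w)
for i = 1:size(w, 1)                     % loop over units
    for j = 1:size(w, 2)                 % loop over inputs to unit i
        nLoop(i) = nLoop(i) + w(i, j) * p(j);
    end
end

n = w * p;                               % vectorized equivalent
disp(isequal(n, nLoop));                 % the two results agree
```

The inner statement runs six times (two units times three inputs), which matches the six assignments mentioned in the text.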

The complete computation is shown below.

    p = [1; 0; 0];
    w = [1 2 3; 4 5 6];
    n = w*p;
    a = 1 ./ (1+exp(-4*n))

This example is the same as the previous except that one more line was added. Let us explain what that line does, working from the innermost expressions and starting with n.

First, we premultiplied the matrix n with the scalar −4. Next we applied the exponential function, exp, to the matrix. This has the effect of applying the function to each element of the matrix. Notice that whereas matrices are signaled by square brackets, function application is signaled by parentheses.

The next step was to add one to the matrix of results. Notice that adding the scalar 1 to a matrix has the effect of adding one to each element of the matrix. How does this happen? MATLAB converts the scalar 1 to a matrix of ones whose dimensions match the argument on the other side of the operator (in this case +). After this, MATLAB applies the matrix operation of addition.

Then the matrix of reciprocals is computed. The symbol "./" stands for array divide and is the division equivalent of ".*". The 1 in the numerator again gets converted to a matrix of ones whose dimensions match those on the other side of the operator.

#### 2.3.1 Vectorizing a feedforward network for one epoch

The earlier example vectorized the presentation of one pattern to the network. If the number of training patterns is small, then we can vectorize the presentation of all the training patterns to the network. For this example, assume that we are training a network to learn a two-input boolean concept such as XOR (the desired outputs below are those of XOR). We are looking at the first layer of the network. The first layer consists of two units, each with two inputs.

    inputs      = [0 0; 0 1; 1 0; 1 1]';   % transposed: one pattern per column
    desiredOuts = [0 1 1 0];
    onesVec     = [1 1 1 1];
    wts         = [ ... ];                 % 2 x 2 weight matrix
    biasWts     = [ ... ];                 % 2 x 1 bias weight matrix
    netInputs   = (wts * inputs) + (biasWts * onesVec);
    outputs     = 1 ./ (1 + exp(-4*netInputs));

Notice that wts is a 2 × 2 matrix and that inputs is a 2 × 4 matrix. Multiplying wts with inputs yields a 2 × 4 matrix. The matrix biasWts is 2 × 1 and onesVec is 1 × 4. Multiplying these together yields a 2 × 4 matrix. The matrix netInputs is therefore 2 × 4 and should be interpreted as follows. Since there are four input patterns, there are four columns. Each column of netInputs codes the net-input values of the two units for one input pattern, and the corresponding column of outputs codes their output values.

### 2.4 Defining functions

MATLAB uses the call-by-value parameter passing style. Functions can have side-effects if the variables they use are declared global both in the function body and external to the function and also have the same name.
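Call-by-value means that a function works on its own copy of each argument, so assigning to a parameter inside the function does not change the caller's variable. The same copy behaviour can be observed with plain assignment, which avoids the need for a function file. This is only a sketch: the names a and b are illustrative, and it assumes ordinary numeric arrays.

```matlab
% Sketch: value semantics for ordinary numeric arrays.
% a and b are illustrative names.
a = [1 2 3];
b = a;          % b behaves as an independent copy of a
b(1) = 99;      % modifying b leaves a untouched

disp(a);
disp(b);
```

A function parameter behaves like b here: changes made to it stay local unless the value is returned or the variable is declared global as described above.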

When you define a function in MATLAB, it should work with vectorized code. The purpose of this section is to show how to define functions that work with vectorized code.

Let's start with a simple example. We shall write a function to compute the logistic sigmoid function, as defined below.

$$\mathrm{logsig}(x) = \frac{1}{1 + e^{-x}}$$

MATLAB does not have a built-in logistic sigmoid function. Here is how to implement it.

    function f = logsig(x)
    f = 1 ./ (1 + exp(-x));

Notice that the function does not have the return statement characteristic of C, C++, or Java. The function returns when it reaches the end of its body. The return value is the value of the variable f, which was declared at the start of the function. Normally, this function is placed in a file named logsig.m. It is also customary not to indent the body of the function.

The function is also designed to work either with scalars or with arrays. That is, you should be able to issue the function invocation logsig(1.5) to apply the function to 1.5, or the invocation logsig([1 2]) to apply the function to each element of the matrix [1 2].

The next example implements a piecewise linear function and is a bit more tricky to vectorize. MATLAB does not have a built-in symmetric hard-limit function hardlims, which is defined below.

$$\mathrm{hardlims}(x) = \begin{cases} -1 & x < 0 \\ +1 & x \ge 0 \end{cases}$$

MATLAB does have the built-in function sign, which returns −1, 0, or 1. An example of its use is given below.

    >> sign([-2 0 2])
    ans =
        -1     0     1

This function is similar to hardlims with one difference: sign returns 0 when its argument is zero, whereas hardlims returns 1 when its argument is zero.

Here is how to define the hardlims function. It needs to be put in its own file called hardlims.m.

    function f = hardlims(x)
    % 1 if x >= 0, -1 otherwise
    f = 2 * (x >= 0) - 1;

In this example, I have included a comment line between the function declaration and the function body. The comment line begins with the % symbol.

For this function to work on array arguments, it is necessary to cause the system to create an array of zeros whose dimensions are the same as x. The expression (x >= 0) in the first line of the function body does this before the relational operator is applied.
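To try both definitions without creating logsig.m and hardlims.m, the same expressions can be wrapped in anonymous functions. This is a sketch under the assumption that anonymous functions are available in your MATLAB version; the handle names logsigFn and hardlimsFn are illustrative and are not part of the original text.

```matlab
% Sketch: the two functions from this section as anonymous functions,
% so they can be tried without creating logsig.m and hardlims.m.
% The handle names logsigFn and hardlimsFn are illustrative.
logsigFn   = @(x) 1 ./ (1 + exp(-x));
hardlimsFn = @(x) 2 * (x >= 0) - 1;

x = [-2 0 2];
disp(logsigFn(0));        % scalar argument
disp(logsigFn(x));        % array argument, applied elementwise
disp(hardlimsFn(x));      % gives +1 where x is zero
disp(sign(x));            % gives 0 where x is zero
```

Comparing the last two lines shows the one difference between hardlims and sign that the text describes: their value at zero.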

## 3 Plotting and visualization

The language has convenient and powerful visualization facilities. You can generate data within MATLAB, or from an external program as was illustrated in Section 2.1.4.

### 3.1 Simple plotting

The plot command plots two-dimensional graphs. First, let's create a vector y with 101 elements and then plot it.

    >> y = 0:.1:10;
    >> plot(y)

The first line creates a vector whose values range from 0 through 10 in increments of 0.1. The second line plots this vector as a function of an implicit x: the y-values are plotted as a function of their array indices, which range from 1 through 101 in increments of 1.

The plot command operates on vectors and plots a y against an x. This was implicit in the previous example and is made explicit below.

    >> y = 0:.1:10;
    >> [rows, cols] = size(y);
    >> x = rows:cols;
    >> plot(x,y)

In the above, size returns the dimensions of the matrix y. In this case, y is a 1 × 101 matrix, and the componentwise assignment statement gives rows the value 1 and cols the value 101. The next line creates a vector x with a default increment of 1. Compare this with the creation of y with an explicit increment of 0.1. Finally, the plot command explicitly plots y as a function of x.

#### 3.1.1 Application to neural networks

It is very easy to plot the sum-of-squared error (SSE) as a function of training epoch, as shown in the example below.

    for epoch=1:1000
      % . . . (training code omitted)
      SSE(epoch) = sum(error .* error);
      if((SSE(epoch)<.02)) break; end
    end
    plot(SSE);

This example illustrates the syntax of for statements and if statements. Notice that both statements terminate with an end. Also, the for statement allows a break. In this example, we train the network for 1000 epochs unless the SSE drops below 0.02, in which case we break from the for-loop. Finally, the scope of the iteration variable epoch continues beyond the end of the for statement.
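The loop skeleton above omits the training step, so it cannot be run as printed. The sketch below substitutes a synthetic error vector that halves on every epoch, purely so that the loop, the break, and the value of epoch after the loop can be observed. The variable err and its values are made up and do not come from a real network.

```matlab
% Sketch: the loop skeleton with a synthetic stand-in for the training step.
% err is made-up data that halves every epoch; it is not a real network error.
SSE = [];                              % start from an empty array
for epoch = 1:1000
    err = 0.5^epoch * ones(4, 1);      % hypothetical errors for four patterns
    SSE(epoch) = sum(err .* err);      % array grows by one element per epoch
    if SSE(epoch) < 0.02
        break;                         % stop as soon as the SSE is small enough
    end
end
disp(epoch);                           % epoch is still defined after the loop
disp(length(SSE));                     % one SSE value per completed epoch
```

With this stand-in the SSE is 0.0625 after three epochs and 0.015625 after four, so the loop exits at epoch 4 and SSE holds four values.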

The variable error is assumed to hold a vector of error values for the n training patterns in one epoch of training. If we square those error values and add them up, then we have the SSE for that particular training epoch. We save these values in the dynamically growing array SSE. Once it is computed, plotting the SSE is so easy it is mind boggling.

A note on dynamically growing arrays: in MATLAB, when storing a value in an array, if you use an array index that is larger than the number of elements in the array, the array grows so that it is large enough to handle the index. In Java, you would get an array index out-of-bounds exception. In C or C++, your program behavior would be undefined. If you are in a debugging cycle and you reload this file, then you should include a clear statement at the beginning of the file, because you need to erase the old SSE array from the system.

Of course, we should put labels on the graph axes and give it a title, as shown below.

    plot(SSE);
    xlabel('Epoch');
    ylabel('SSE');
    title('SS error for backprop');

If you want to print the value of a variable in a title, then use the more complex variant below.

    title(['SS error for ', num2str(epoch), ' epochs of bp']);

In this variant, title accepts a vector of strings. Notice that the value of the variable epoch is converted to a string.

### 3.2 3D Plots

The command plot3 allows you to plot data in three dimensions. The script below loads the data illustrated in Section 2.1.4 and then plots it in three dimensions using the plot3 command. We assume that 50 frames or cycles of data have been generated.

    data;
    hold on
    for cyc=1:50
      [x,y] = find(data(:,:,cyc));
      z = zeros(length(y),1)+cyc;
      plot3(z,x,y,'.')
      axis([1 50 1 6 1 12])
    end

The hold on command says to superimpose the data from successive plots, rather than to erase the graph for each new plot. The 3D plot is generated in a loop of 50 iterations where each iteration plots one frame of data on the graph using the command plot3.

For a given cycle of the life simulation, the command find obtains the x and y coordinates (the row and column indices) of the non-zero elements of the two-dimensional matrix data(:,:,cyc). The list of x coordinates goes into the x vector, and similarly for the y coordinates. Since x and y are vectors of the same length, we also need a z vector of the same length to give to the plot3 command. To do this, we create a vector of zeros whose length matches y and add the scalar cyc to it.

In MATLAB, adding a scalar to a vector adds the scalar to each element of the vector, so the expression zeros(length(y),1)+cyc yields a vector whose length is the same as y and whose components are all equal to cyc. Next, we issue the plot3 command. We plot the z dimension on the x axis of plot3 because we want the progress of time to be depicted on the x axis of the plot. Finally, we use the axis command to say that we want the dimensions of the x, y, and z axes to vary from 1 to 50, 1 to 6, and 1 to 12, respectively. These values match the dimensions of the plotted data.

### 3.3 Surface plots

You can use a surface plot to plot the values in a two-dimensional matrix. Let's apply this to a neural network that has one output unit but has been trained on four input patterns. If we look at that unit for one epoch of training, it will have four values, one for each of the input patterns. If we store these values in a matrix across all epochs of training, then we can plot the output values as a function of pattern and training epoch. The example below illustrates how to do this.

    for epoch=1:1000
      % . . . (training code omitted)
      history(:,epoch)=outputsL2(:);
      SSE(epoch) = sum(error .* error);
      if((SSE(epoch)<.02)) break; end
    end
    plot(SSE);
    figure;
    surf(history);
    view([45,45]);

The matrix history is incrementally updated to hold the output values of the units for the current training epoch. The command figure tells MATLAB to plot the results in a new figure rather than overwrite the results in the SSE figure. The command surf(history) creates a surface plot of the history matrix. The command view([45,45]) sets the viewing angle. You can play with these parameters to get a good viewing angle.

This surface plot is very useful because it shows you how the output units change their response to the input patterns as a function of training. It vividly displays the network's change in behavior as a result of the learning process. Of course, you will want to annotate the plot as illustrated below.

    surf(history);
    view([45,45]);
    xlabel('Epoch');
    ylabel('Pattern');
    zlabel('Activation');
    title('Activation to patterns as a function of training');

### 3.4 Other plotting commands

In the earlier examples, MATLAB chose bounds for the x and y axes automatically. You can specify these explicitly with the axis command. You can put several graphs within the same figure using the subplot command. You can plot error bars using the errorbar command in conjunction with hold.
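The commands named in this section can be combined in one short script. The following is a sketch only: the data in x, y, and e are made up for illustration, and the two-panel layout is just one possible arrangement.

```matlab
% Sketch: two panels in one figure, with explicit axis bounds and error bars.
% The data (x, y, e) are made up purely for illustration.
x = 0:10;
y = x .^ 2;
e = 5 * ones(size(x));        % hypothetical constant error-bar half-length

subplot(2, 1, 1);             % top panel of a 2-row, 1-column grid
plot(x, y);
axis([0 10 0 100]);           % explicit bounds: [xmin xmax ymin ymax]
title('Explicit axis bounds');

subplot(2, 1, 2);             % bottom panel
plot(x, y);
hold on;                      % keep the line while adding error bars
errorbar(x, y, e);
hold off;
title('Error bars');
```

Each call to subplot selects the panel that subsequent plotting commands draw into, so titles and axis settings apply to that panel only.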

## A Appendix: Matrix notation

A 2 by 3 matrix has two rows and three columns. An m × n matrix has m rows and n columns. With these conventions, the elements of an m × n matrix, A, would be written as shown below.

$$A \equiv \begin{bmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{bmatrix}$$

If m equals n, then we have a square matrix. That is, a square matrix has the same number of rows and columns. Further, a square matrix has a diagonal. This is the set of matrix locations $a_{i,i}$. A very important square matrix is the identity matrix, I, which has ones along the diagonal and zeros everywhere else, as shown below.

$$I \equiv \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}$$

### A.1 Matrix multiplication

This convention of describing matrix dimensions eases the problem of keeping track of whether two matrices may be multiplied together to produce a new matrix and what the resulting matrix dimensions will be. A matrix A can be multiplied with a matrix B if the number of columns in A is equal to the number of rows in B. Suppose that A is an m × n matrix and B is an n × o matrix. Then A can be multiplied with B, written AB. The resulting product matrix will have dimensions m × o. Matrix B cannot be multiplied with A unless o is equal to m.

We can easily define how to compute the elements of the product matrix AB. Element ij of matrix AB is computed by taking the inner product of row i in matrix A and column j in matrix B. The inner product is not defined if row i does not have the same number of elements as column j. This is why the number of columns of A must equal the number of rows of B.

Matrix multiplication is associative and distributive. Although scalar multiplication is commutative, matrix multiplication is not in general commutative.

The identity matrix has special properties with respect to matrix multiplication. Multiplying a square matrix, A, with the identity matrix or multiplying the identity matrix with A both yield the original matrix A.

$$AI = IA = A \tag{1}$$

### A.2 Definition of transpose

The notation $A^T$ refers to a matrix generated from matrix A by reversing the order of the subscripts of each of the elements. If A is an m × n matrix, then $A^T$ (read "A transpose") is an n × m matrix.

When deriving results, there are a number of useful facts about the transpose of a matrix.

1. The transpose of the identity matrix is itself: $I = I^T$.
2. The transpose of the transpose of a matrix is the original matrix: $A = (A^T)^T$.
3. The transpose of the product of two matrices is the transpose of the second matrix multiplied with the transpose of the first matrix: $(AB)^T = B^T A^T$.
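The dimension rules and transpose facts in this appendix can be checked numerically in MATLAB. The sketch below uses small integer matrices so that the comparisons are exact; the matrices A and B and the names I3 and S are illustrative.

```matlab
% Sketch: checking the appendix facts on small integer matrices.
% A, B, I3, and S are illustrative; integers keep the comparisons exact.
A = [1 2 3; 4 5 6];            % 2 x 3
B = [1 0; 2 1; 0 3];           % 3 x 2

disp(size(A * B));             % (2 x 3)(3 x 2) gives a 2 x 2 product
disp(size(B * A));             % (3 x 2)(2 x 3) gives a 3 x 3 product

I3 = eye(3);                   % 3 x 3 identity matrix
S = B * A;                     % a square 3 x 3 matrix
disp(isequal(S * I3, S) && isequal(I3 * S, S));   % AI = IA = A
disp(isequal(I3', I3));                            % I = I^T
disp(isequal((A')', A));                           % A = (A^T)^T
disp(isequal((A * B)', B' * A'));                  % (AB)^T = B^T A^T
```

Here AB and BA do not even have the same dimensions, which is one concrete way to see that matrix multiplication is not commutative in general.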