
Yu Hen Hu

5/9/16

Last (family) name: _________________________


First (given) name: _________________________
Student I.D. #: _____________________________
Department of Electrical and Computer Engineering
University of Wisconsin - Madison
ECE/CS/ME 539 Introduction to Artificial Neural Network and Fuzzy Systems

Take Home Final Examination


(From noon, Monday 5/2/2016 till noon, Monday 5/9/2016)

This is a take-home final examination. You are to submit your answers
electronically to the Moodle assignment box before the deadline. If you
prefer handwriting to typing, keep the writing neat, highlight the
answers, and scan the answer sheets for electronic submission.
You CANNOT discuss either the questions or the answers with anyone
except the instructor.
Many of the problems require programming. You are required to attach
copies of your code as part of the submission so that the grader can
run the code to verify the answers.
You cannot copy code from elsewhere EXCEPT the code posted on the
course website.
ABSOLUTELY NO EXTENSION REQUEST WILL BE GRANTED. Be on time.
Any academic misconduct will be pursued to the full extent of
University rules. You must sign below and scan this page as part of
your submission to receive credit for the final examination.

I, _____________________ (print your name) promise that I will not commit
academic plagiarism. I understand that such an offense will result in failure
of this course for both the person who copies another's answer and the person
who allows their answer to be copied.

IMPORTANT!

Question Answering Period

During his regular office hours (1-2 PM Tuesday, 2-3 PM Wednesday,
11 AM-noon Thursday), or by appointment, Prof. Hu will be available to
answer questions related to this final examination. Questions may also
be submitted via email.


Problem   Max. pts   Points
1         15
2         10
3         15
4         30
5         15
6         15
Total     100


1. (15 points) Linear Classification

Let X = [x1, x2, …, xN] be an m × N matrix consisting of N m-dimensional feature vectors
{xn; 1 ≤ n ≤ N}. Suppose there is a hyperplane H: {x; g(x) = wᵀx − c = 0} that separates
these feature vectors into two subgroups, one labeled −1 and the other labeled +1. The
projection of each column of X along the direction of w, namely wᵀX, may then be divided
into two groups, one having values greater than c and the other smaller than c. We define a
1 × N vector y = sgn(wᵀX − c·[1 1 … 1]), where [1 1 … 1] is a 1 × N row vector of ones, as
the binary label vector; sgn is a sign function such that sgn(x) = +1 if x > 0 and
sgn(x) = −1 if x < 0.
Now, a realization of X and the corresponding y vector are stored in the data file s16p1dat_1.txt.
The first m rows belong to the X matrix while the last row is the y vector. The Matlab code
to load the data and set up parameters is:
load s16p1dat_1.txt;
x=s16p1dat_1(1:end-1,:);
y=s16p1dat_1(end,:);
clear s16p1dat_1;
[m,N]=size(x);
idx1=find(y==-1); idx2=find(y==+1);
D1=x(:,idx1); D2=x(:,idx2);
N1=length(idx1); N2=length(idx2);

Find a hyperplane of the form H: {x; g(x) = wᵀx − c = 0} such that yn = sgn(g(xn)) for
1 ≤ n ≤ N, where y = [y1, y2, …, yN]. Note that this is a binary classification problem using
a linear classifier. Hence, the answer should provide w, c, and the corresponding confusion
matrix Cmat. You may use any one or more methods you have learned in this class, including:
(i) perceptron learning; (ii) linear discriminant analysis; (iii) MLP; (iv) SVM; or
(v) maximum likelihood classifier using a univariate Normal distribution.
The data set given is linearly separable using a particular value of w and c, as shown in the
figure below.

In this particular example, the training data and testing data are the same set of X and y. If
you obtain an adequate choice of w and c, Cmat should be a diagonal matrix (100% correct
classification). The Matlab code that computes the Cmat matrix for the given X, y, w, and c
is as follows. Here c is computed to be in the middle of the gap; its value may be adjusted
to maximize the probability of correct classification.
yb=w'*x;            % 1 x N
yb1=yb(y==-1);      % projections of the -1 class
yb2=yb(y==+1);      % projections of the +1 class
if mean(yb1) < mean(yb2),  % -1 class is projected to negative values
  sides=0;          % -1 side is negative projection value
  gp=min(yb2)-max(yb1);
  c=0.5*(min(yb2)+max(yb1));
else                       % +1 class is projected to negative values
  sides=1;          % +1 side is negative projection value
  gp=min(yb1)-max(yb2);
  c=0.5*(min(yb1)+max(yb2));
end
if gp < 0, disp('*** projection not separated linearly! ***'); end
yest=sign(w'*x-c*ones(1,N)+eps);   % 1 x N, estimated labels
% converting output from scalar (+1, -1) to (1, 0), (0, 1) encoding
ygrnd=double([[y' > 0] [y' < 0]]); % N x 2, ground truth
if sides==0,
  ye=double([[yest' > 0] [yest' < 0]]); % N x 2
elseif sides==1,
  ye=double([[yest' < 0] [yest' > 0]]); % N x 2
end
Cmat=ygrnd'*ye;
disp('Cmat = ')
disp(Cmat)
disp(['Prob. Classification = ' num2str(round(100*sum(diag(Cmat))/N)) '%']);
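Of the suggested methods, (i) perceptron learning is the most direct for linearly separable data. The sketch below is Python rather than the Matlab used elsewhere in this exam, and the toy data set is hypothetical; the actual data should come from s16p1dat_1.txt. The learned bias b corresponds to c = −b in the hyperplane notation g(x) = wᵀx − c.

```python
# Hypothetical illustration of perceptron learning on a toy 2-D data set;
# the exam data should be loaded from s16p1dat_1.txt instead.

def perceptron(X, y, lr=1.0, epochs=100):
    """Learn w (length m) and bias b so that sign(w.x + b) matches y in {-1, +1}."""
    m = len(X[0])
    w = [0.0] * m
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for xn, yn in zip(X, y):
            s = sum(wi * xi for wi, xi in zip(w, xn)) + b
            if yn * s <= 0:            # misclassified (or on the boundary)
                w = [wi + lr * yn * xi for wi, xi in zip(w, xn)]
                b += lr * yn
                errors += 1
        if errors == 0:                # converged: data separated
            break
    return w, b

# toy linearly separable data: class -1 near the origin, class +1 shifted
X = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (4.0, 4.0), (5.0, 4.0), (4.0, 5.0)]
y = [-1, -1, -1, +1, +1, +1]
w, b = perceptron(X, y)
pred = [1 if sum(wi * xi for wi, xi in zip(w, xn)) + b > 0 else -1 for xn in X]
```

On separable data the loop terminates with every training sample on the correct side, which is exactly the 100%-correct (diagonal Cmat) situation the problem describes.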

2. (10 points) A hypothetical SVM model has the following support vectors (SV),
corresponding target values d, and multipliers α:

i    SV               d     α
1    (-23, 12, 28)    1     0.0016
2    (-2, 3, -5)      -1    0.0615
3    (4, -4, -1)      -1    0.8703
4    (16, -8, -6)     1     0.0243
5    (16, 6, 6)       -1    0.0027
6    (-3, 8, 11)      -1    0.0082
7    (-1, 4, 11)      1     0.0124
8    (-5, 9, 8)       -1    0.0286
9    (1, -15, 4)      1     0.0085

Suppose that the polynomial kernel is used. Compute the output of this SVM model when
the input feature vector is (13, 8, 6). Recall that for the polynomial kernel,
K(x, y) = (1 + xᵀy)². Hence K(x, xi) = (1 + xᵀxi)², where xi is a support vector.
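The output of an SVM for an input x is the sign of Σᵢ αᵢ dᵢ K(x, xᵢ) + b. The Python sketch below is for illustration only; it assumes the multipliers αᵢ are those tabulated and that the bias b is 0, since no bias is given in the problem.

```python
def poly_kernel(x, y):
    """Polynomial kernel K(x, y) = (1 + x'y)^2."""
    return (1.0 + sum(a * b for a, b in zip(x, y))) ** 2

def svm_output(x, svs, d, alpha, b=0.0):
    """Decision value sum_i alpha_i * d_i * K(x, x_i) + b; the class is its sign."""
    return sum(ai * di * poly_kernel(x, xi)
               for ai, di, xi in zip(alpha, d, svs)) + b

# tiny hand-checkable case (hypothetical, not the exam data):
# one SV (1, 0) with d = 1, alpha = 0.5 gives 0.5 * (1 + 1)^2 = 2
val = svm_output((1.0, 0.0), [(1.0, 0.0)], [1], [0.5])
```

Feeding the tabulated support vectors, targets, and multipliers through `svm_output` with the input (13, 8, 6) yields the requested answer.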
3. (15 points) Clustering, k-means algorithm, SOM
Download the data files s16p3a.txt and s16p3b.txt.
(a) (5 points) Apply the DBSCAN clustering algorithm
(http://www.mathworks.com/matlabcentral/fileexchange/52905-dbscan-clustering-algorithm)
to cluster s16p3a.txt. Discuss which values of Eps and Minpts should be used
to yield two clusters. Plot the clustering results using different colors for different clusters.
(b) (5 points) Apply the SOM algorithm on s16p3a.txt using a linear array of indices of the
neurons. Report results with total number of neurons N = 10 and 20, respectively. Plot
the clustering results as illustrated in somdemo.m.
(c) (5 points) Apply the K-means algorithm to s16p3b.txt using 8 clusters. Plot the
clustering results as demonstrated in clusterdemo.m.
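Matlab's kmeans (used by clusterdemo.m) implements Lloyd's iteration: assign each point to its nearest center, then move each center to the mean of its assigned points, and repeat until nothing changes. The minimal Python sketch below is only an illustration; its deterministic farthest-point initialization is an implementation choice for reproducibility, not how Matlab seeds the centers.

```python
def dist2(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=100):
    # deterministic farthest-point initialization (an illustrative choice)
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    for _ in range(iters):
        # assignment step: each point goes to its nearest center
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[min(range(k), key=lambda j: dist2(p, centers[j]))].append(p)
        # update step: mean of each cluster (keep the old center if a cluster is empty)
        new = [tuple(sum(xs) / len(cl) for xs in zip(*cl)) if cl else centers[j]
               for j, cl in enumerate(clusters)]
        if new == centers:      # converged
            break
        centers = new
    labels = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
    return centers, labels

# two well-separated toy groups should land in two different clusters
pts = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (5, 5.1)]
centers, labels = kmeans(pts, 2)
```

For part (c) the same loop would run with k = 8 on the points loaded from s16p3b.txt.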


4. (30 points) Fuzzy Logic Controller Design

[Left figure: the 100 m × 100 m walled area, with corners (0, 0), (100, 0), and
(0, -100); the loading dock is at (50, 0) on the top wall, and the parking lot is
the shaded region with corners (5, -95) and (95, -40). Right figure: the truck
and its steering geometry.]

Referring to the left figure above, a truck (shown in the right figure) is initially parked in a
parking lot (shaded region) whose lower-left and upper-right coordinates are, respectively,
(5, -95) and (95, -40), all measured in meters. The objective is to back up this truck to the
loading dock (marked with a triangle at coordinate (50, 0)) such that the center line of the
truck is perpendicular to the loading dock wall, so that the truck faces south at that position.
The loading dock and the parking lot are surrounded by walls enclosing a 100-meter-square
area marked with thick lines. The truck cannot hit the wall during the back-up operation;
otherwise, the operation is considered a failure.
When the back-up operation begins, the truck will be driven backward at a constant speed v.
The driver, or the controller you are to design in this problem, is allowed to steer the front
wheels by an angle θ within the range (-30°, 30°) measured from the truck's heading
direction. φ (0° ≤ φ < 360°) measures the angle between the positive y axis and the heading
of the truck. For example, in the right figure above, the truck's heading direction is 90°. For
consistency, all positive angles are measured in the counterclockwise direction. The truck's
coordinate, denoted by (x, y), is defined as the center point between its rear wheels. The
distance from the center of its front wheels to (x, y) is L, and the distance from the center
of the rear end of this truck to (x, y) is ℓ. For simplicity, we ignore the width of the truck
in this problem.
Three dynamic equations describe the truck's movement:

x(n+1) = x(n) + Δt · v · cos θ(n) · cos(φ(n) + θ(n)/2)
y(n+1) = y(n) + Δt · v · cos θ(n) · sin(φ(n) + θ(n)/2)
φ(n+1) = φ(n) + Δt · v · sin θ(n) / L

x(n+1) and y(n+1) are the truck's new position Δt seconds later than x(n) and y(n),
respectively. Here angles are represented in radians rather than degrees. Note that, due to the

dependency between x(n) and y(n), only two independent state variables, x(n) and φ(n), are
sufficient to determine the truck's trajectory and its heading.
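The update equations above can be iterated directly. Below is a Python sketch (the exam expects Matlab); it assumes all angles are in radians and reads the stepped heading argument as φ(n) + θ(n)/2, so degree inputs must be converted first.

```python
import math

def truck_step(x, y, phi, theta, v=4.0, dt=2.0, L=1.5):
    """One simulation step of the truck model; all angles in radians."""
    x_new = x + dt * v * math.cos(theta) * math.cos(phi + theta / 2.0)
    y_new = y + dt * v * math.cos(theta) * math.sin(phi + theta / 2.0)
    phi_new = phi + dt * v * math.sin(theta) / L
    return x_new, y_new, phi_new

# sanity check: zero steering angle gives straight-line motion, heading unchanged
state = truck_step(0.0, 0.0, 0.0, 0.0, v=1.0, dt=1.0, L=1.0)
```

Repeating `truck_step` n times, feeding each output back in as the next input, produces the trajectory tables requested in part (a).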
(a) (5 points) Let v = 4 m/s, Δt = 2 s, L = 1.5 m, and ℓ = 0.5 m. Suppose that the initial
truck location is (x(1), y(1)) = (50, -75) and the initial heading is φ(1) = 80°. The truck
driver maneuvers the truck with the following steering angles (in degrees): {θ(n); n = 1, 2, …,
10} = {9°, 13°, 7°, 10°, 3°, 13°, 13°, 3°, 2°, 13°}. Write a program to compute the
truck's positions and headings for n = 1, 2, …, 11. Tabulate the result. Also, plot the
truck's (x, y) coordinates for n = 1 to 11. Give φ(n) in degrees.
Answer:
n     x(n)    y(n)    φ(n)
1     50      -75     80
2
3
4
5
6
7
8
9
10
11
(b) (5 points) In order to develop a fuzzy logic controller to control the steering angle at each
time step, the two inputs to the fuzzy controller, x(n) and φ(n), need to be fuzzified. The
output of the fuzzy logic controller, θ(n), must also be fuzzified. For the time being, let us
generate triangular fuzzy sets using the Matlab program fsgen.m, which can be downloaded
from the Matlab programs on the course web page. Specifically, generate
(i) 5 fuzzy sets for the input variable x(n) ∈ [0, 100], with a support containing 25 evenly
spaced points in that range (including both ends);
(ii) 7 fuzzy sets for the input variable φ(n) ∈ [0°, 360°], with a support containing 91 evenly
spaced points in that range (including both ends);
(iii) 7 fuzzy sets for the output variable θ(n) ∈ [-30°, 30°], with a support containing 61
evenly spaced points in that range (including both ends).
Plot these fuzzy sets. Furthermore, suppose that x(n) = 65 m, φ(n) = 130°, and θ(n) =
23°; compute the fuzzified membership vectors of each of them.
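For intuition, a triangular membership value can be computed as below. This Python sketch is not fsgen.m; the five set centers covering [0, 100] are illustrative assumptions, and the actual supports are the discrete point sets specified above.

```python
def tri_mf(x, a, b, c):
    """Triangular membership with support [a, c] and peak 1 at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)   # rising edge
    return (c - x) / (c - b)       # falling edge

# membership vector of x = 65 in five overlapping sets centered across [0, 100]
# (hypothetical centers, chosen only for this illustration)
centers = [0, 25, 50, 75, 100]
mu = [tri_mf(65, c - 25, c, c + 25) for c in centers]
```

For this evenly spaced partition the memberships at any point sum to 1, so x = 65 activates only the two sets centered at 50 and 75.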
(c) (10 points) Let v = 4 m/s, Δt = 0.5 s, L = 2 m, and ℓ = 0.5 m. Use the fuzzy sets defined
in part (b), a rule base defined in the file ruletrk.m (downloadable from the web), and the
center-of-area defuzzification method as implemented in defuz.m to develop a fuzzy logic
controller to back up the truck to the loading dock. A back-up operation is considered
successful if the truck's final coordinate (xf, yf) and heading φf satisfy the following
conditions: (i) |xf - 50| ≤ 2 meters, (ii) -2 ≤ yf ≤ 0, and (iii) |φf - 180°| ≤ 5°. Moreover,
as stated above, at no time during the back-up operation may the truck run over the wall.
Hence we also require 0 ≤ x ≤ 100 and -100 ≤ y ≤ 0. Let [x(0) y(0) φ(0)] = [70 -70
90°]; a successful example is given below:

[Figure: trajectory of truck backing to dock, with x final = 49.1546 meter,
y final = -0.62813 meter, and phi final = 176.4777 degree.]

(i) (6 points) Submit a printed copy of the source code of your program, and a figure like
the one above to show that your program is working.
(ii) (4 points) Download the file trkpos.txt from the course web site. Each row of this file
specifies the initial position and heading (in degrees) of the truck for one trial. Apply the
fuzzy logic controller to each of these initial conditions. Report how many trials are
successful.
(d) (10 points) Using the result you obtained in part (c, ii) as a baseline performance, try to
improve the performance of this fuzzy logic controller. You may modify the definition of
the fuzzy sets (but the discrete support for each variable cannot be changed) and/or the
rule base. Discuss briefly the changes you have made, and report how much performance
improvement is due to these changes. Also, submit your new definitions of the fuzzy
sets and the modified rule set you used. The instructor may choose to verify the results
claimed in this part when deemed necessary.
You will receive 5 points for this part if any improvement is shown. The remaining points
will be awarded roughly according to the percentage improvement. Full credit (10
points) will be given if the overall success rate is ≥ 98% after improvement.
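The center-of-area method mentioned in part (c) (implemented by defuz.m) turns the aggregated fuzzy output into one crisp steering angle: the membership-weighted mean over the discrete support. Below is a Python sketch of the idea; the five-point support is illustrative, not the 61-point support of θ(n).

```python
def coa_defuzzify(support, membership):
    """Center of area over a discrete support: membership-weighted mean."""
    total = sum(membership)
    if total == 0.0:
        raise ValueError("no rule fired; fuzzy output is empty")
    return sum(s * m for s, m in zip(support, membership)) / total

# a membership profile symmetric about 0 defuzzifies to a steering angle of 0
theta = coa_defuzzify([-30, -15, 0, 15, 30], [0.0, 0.5, 1.0, 0.5, 0.0])
```

Because the output is a weighted average, reshaping the output fuzzy sets (as allowed in part (d)) shifts the crisp steering command smoothly rather than abruptly.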
5. (15 points) Pattern Classification
Use a multi-layer perceptron neural network to perform pattern classification. The training
data set ctrain.txt and the feature vectors of the testing data set ctest.txt can be downloaded
from the course home page. The class labels of the testing set feature vectors are withheld.
Your score for this problem will be judged in part by your approach to solving the problem,
and the remaining points will be graded based on the relative classification rate on the testing
results with respect to the best results submitted.
(a) (5 points) Describe at least two different network structures you have experimented with,
and your basis for selecting the network used for testing.
(b) (10 points) Fill out the table below with testing results for each testing vector. Each
misclassified sample causes a 0.5-point deduction, up to a total 10-point deduction on this part.
Answer:


Testing vector                  Label
 0.084   0.550  -0.158
 0.243   0.503   0.632
 1.038  -0.116   0.682
 0.159   0.850   1.418
-0.167   0.987   0.652
-0.241   0.380   1.020
-0.383   0.531   0.386
 0.044   0.283   0.188
 0.803   0.680  -0.350
 0.537   0.250   0.666
 0.183   0.618  -0.609
 0.243  -0.208   1.053
 0.834   0.896  -0.287
 0.571   0.071   0.364
 0.904   0.461   0.444
 0.811   1.000  -0.470
 0.002   0.672  -0.538
 0.814   0.126   0.739
 0.517  -0.174   0.358
 0.794   0.801   1.428
-1.017   0.908   0.281
 0.084  -0.586   1.204
 0.379   0.319   0.321
 0.629  -0.847   1.216
 0.982   0.599  -0.030

6. (15 points) Mixture of Experts

Assume a 2-class pattern classification problem with class labels {0, 1} and 10 feature
vectors {x(k); 1 ≤ k ≤ 10}. The classification results of three expert classifiers, as well as the
correct class labels, are shown in the table below:

Feature no. 1 2 3 4 5 6 7 8 9 10
Labels 0 0 0 0 0 1 1 1 1 1
classifier A output 0 0 1 1 0 1 1 1 1 0
classifier B output 0 0 1 0 0 1 1 0 1 1
classifier C output 0 1 1 1 1 0 0 0 1 1
(a) (10 points) Denote the kth outputs of the classifiers by yA(k), yB(k), and yC(k),
respectively. Let {wA, wB, wC} be a set of weights such that wA + wB + wC = 1. Define an
ensemble classifier whose inputs are yA(k), yB(k), and yC(k), and whose output is

y(k) = I( wA·yA(k) + wB·yB(k) + wC·yC(k) − h )

where I(x) = 1 if x ≥ 0 and I(x) = 0 otherwise, and h is a constant threshold. Find
{wA, wB, wC} and h such that PMiss-classification is minimized over the training samples. Also,
give the corresponding confusion matrix and PMiss-classification of the ensemble classifier.
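For intuition, the thresholded weighted vote can be evaluated directly on the table above. This Python sketch uses equal weights and h = 0.5 (a plain majority vote) purely as an illustration; they are not the optimal answer the problem asks for.

```python
def ensemble(yA, yB, yC, w, h):
    """y(k) = I(wA*yA(k) + wB*yB(k) + wC*yC(k) - h), with I(s) = 1 if s >= 0."""
    return [1 if w[0] * a + w[1] * b + w[2] * c - h >= 0 else 0
            for a, b, c in zip(yA, yB, yC)]

# classifier outputs and labels copied from the table above
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
yA = [0, 0, 1, 1, 0, 1, 1, 1, 1, 0]
yB = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]
yC = [0, 1, 1, 1, 1, 0, 0, 0, 1, 1]

# illustrative equal weights (a majority vote), threshold h = 0.5
y = ensemble(yA, yB, yC, (1 / 3, 1 / 3, 1 / 3), 0.5)
errors = sum(yk != dk for yk, dk in zip(y, labels))
```

This majority vote misclassifies features 3, 4, and 8, so the optimal {wA, wB, wC} and h must achieve at most 3 errors out of 10.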
(b) (5 points) Note that the outputs of classifiers A, B, and C form a new feature vector
f(k) = [yA(k), yB(k), yC(k)].
A nonlinearly combined ensemble classifier can then be regarded as an assignment of a
class label to each f(k). Fill in the tables below so as to minimize PMiss-classification. Note
that for the same f(k), only one class label can be assigned. Give the confusion matrix of
the optimal classifier. Discuss the difficulty of filling in these tables, and explain why the
minimum value of PMiss-classification cannot be 0.
Table 1. Output of the ensemble classifier

yA(k)  0  0  0  0  1  1  1  1
yB(k)  0  0  1  1  0  0  1  1
yC(k)  0  1  0  1  0  1  0  1
y(k)

Table 2. Applying the ensemble classifier to the given training samples

Feature no.  1  2  3  4  5  6  7  8  9  10
Labels       0  0  0  0  0  1  1  1  1  1
yA(k)        0  0  1  1  0  1  1  1  1  0
yB(k)        0  0  1  0  0  1  1  0  1  1
yC(k)        0  1  1  1  1  0  0  0  1  1
y(k)