
Machine Learning Programming Exercise 3

Due 15th November, 2015. Please upload to LMS before midnight.

1 Logistic Regression One-vs-All [100 points]


You will implement one-vs-all logistic regression on the CIFAR-10 data set (details are available at
http://www.cs.toronto.edu/~kriz/cifar.html). Use data_batch_1 as the training set and test_batch as the
test set. The data set consists of 32x32 color images in 10 classes. The data matrix X has 3072 columns,
one per byte of pixel data: the first 1024 columns hold the red channel values, the next 1024 the green,
and the last 1024 the blue.
Note: After loading the data, don't forget to convert it to type double.
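
The column layout and the double conversion described above can be sketched as follows. This is only an illustration under stated assumptions: the matrix `data` here is a small synthetic stand-in for the uint8 matrix you get from data_batch_1, and the variable names are hypothetical. The conversion matters because arithmetic on uint8 values saturates at 255. According to the CIFAR-10 page, each channel is stored in row-major order, which is why the reshape is followed by a permute.

```matlab
% Synthetic stand-in for the uint8 data matrix: 2 images x 3072 bytes.
% In the assignment this would come from loading data_batch_1 instead.
data = uint8(mod(reshape(0:2*3072-1, 3072, 2)', 256));

% Convert to double before doing any arithmetic on the pixel values.
X = double(data);

% Split the columns into the three color channels.
red   = X(:, 1:1024);
green = X(:, 1025:2048);
blue  = X(:, 2049:3072);

% Rebuild the first image as a 32x32x3 array. reshape fills column by
% column, while the pixels are stored row by row, so the first two
% dimensions are swapped afterwards.
img = permute(reshape(X(1, :), 32, 32, 3), [2 1 3]);
```

After this, `img(r, c, 1)` is the red value of the pixel in row r and column c of the first image.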

i. [10 points] Randomly choose 100 of the pictures and display them using the provided code
displayData.m. You should then see the selected pictures displayed together as a single image.
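
The random selection step might be sketched as follows. The matrix `X` is a small random stand-in for the real data, and the call to displayData is left as a comment because its exact signature is an assumption here; check the provided file for the arguments it expects.

```matlab
% Random stand-in for the data matrix (500 images x 3072 columns).
m = 500;
X = rand(m, 3072);

% Pick 100 distinct row indices uniformly at random.
sel = randperm(m, 100);
subset = X(sel, :);

% Pass the chosen rows to the provided display code (signature assumed):
% displayData(subset);
```

Using `randperm(m, 100)` rather than 100 independent random indices guarantees that no picture is chosen twice.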

ii. [40 points] Write a one-vs-all function that calculates the thetas for all labels and returns them
as a matrix in which each row is the theta vector for one label. To do this, use the fmincg function
that is given to you and apply it once per label, with y set equal to that label (e.g., c in the code
below, where c is a label index), using the regularized logistic regression cost function.
For fmincg you can use the code below:

You can use your logistic regression cost function from Programming Assignment 2, but make
sure that your code is vectorized.
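
One possible shape for a vectorized regularized cost function and a one-vs-all training loop is sketched below. It is not the code supplied with the assignment. The names `lrCostFunction` and `oneVsAll`, the `MaxIter` value, and the calling convention `fmincg(f, initial_theta, options)` are all assumptions; check them against the fmincg file you were given. The sketch also assumes y holds labels 1 to num_labels. CIFAR-10 labels run from 0 to 9, so you would shift them by one first.

```matlab
function [J, grad] = lrCostFunction(theta, X, y, lambda)
  % Vectorized regularized logistic regression cost and gradient.
  % X is m-by-(n+1) with a leading column of ones; y is m-by-1 of 0/1.
  m = size(X, 1);
  h = 1 ./ (1 + exp(-X * theta));        % sigmoid of the linear scores
  reg = theta;
  reg(1) = 0;                            % the bias term is not regularized
  J = (-y' * log(h) - (1 - y)' * log(1 - h)) / m ...
      + lambda / (2 * m) * (reg' * reg);
  grad = (X' * (h - y)) / m + (lambda / m) * reg;
end

function all_theta = oneVsAll(X, y, num_labels, lambda)
  % Returns a num_labels-by-(n+1) matrix; row c is the theta for label c.
  [m, n] = size(X);
  X = [ones(m, 1), X];                   % add the intercept column
  all_theta = zeros(num_labels, n + 1);
  options = optimset('GradObj', 'on', 'MaxIter', 50);  % MaxIter chosen arbitrarily
  for c = 1:num_labels
    initial_theta = zeros(n + 1, 1);
    % (y == c) is the 0/1 target for the "label c versus the rest" problem.
    % fmincg is the optimizer supplied with the assignment (signature assumed).
    theta = fmincg(@(t) lrCostFunction(t, X, double(y == c), lambda), ...
                   initial_theta, options);
    all_theta(c, :) = theta';
  end
end
```

With theta set to all zeros every prediction is 0.5, so the cost is log(2) regardless of the data, which makes a convenient sanity check for the cost function.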

iii. [30 points] Predict the labels of the training set using the thetas you get from your one-vs-all
function. Then, by comparing your predicted labels with the real labels, calculate the
training set accuracy.

iv. [20 points] Using the test_batch data, calculate the test set accuracy.
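
The prediction and accuracy steps can be sketched as below. The function names `predictOneVsAll` and `accuracyPercent` are hypothetical, and the sketch assumes that `all_theta` has one row per label with the intercept weight in its first column, and that labels are numbered from 1. Because the sigmoid is monotonic, picking the largest linear score gives the same label as picking the largest probability.

```matlab
function p = predictOneVsAll(all_theta, X)
  % Returns an m-by-1 vector of predicted labels (1..num_labels).
  m = size(X, 1);
  scores = [ones(m, 1), X] * all_theta';  % m-by-num_labels linear scores
  [~, p] = max(scores, [], 2);            % index of the best label per row
end

function acc = accuracyPercent(p, y)
  % Percentage of predictions that match the true labels.
  acc = mean(double(p == y)) * 100;
end
```

The same two functions apply to both parts: call them with the training matrix and labels for the training set accuracy, then with the test_batch matrix and labels for the test set accuracy.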
