COMP3710

Assignment 3
Geetanshu Kamboj (T00660998)

Ans 1: Linear Regression


• Linear functions of continuous-valued inputs form a hypothesis space that has been
used for hundreds of years.
• Predicts a dependent variable (Y) based on the values of an independent
variable (X).
• The dependent variable is always continuous; the independent variable may be
continuous or discrete.
• A way to find the trends in the data.
• Produces a prediction model.
• P-value: indicates whether the fitted relationship is statistically significant.
• Correlation coefficient: measures the strength and direction of the linear
relationship between X and Y.
• The function is a straight line with input X and output Y, of the form
Y = b1X + b0 + e, where Y = dependent variable, b1 = slope, b0 = Y-intercept,
X = independent variable, e = error.
• Example: linear regression may predict how an individual's obesity is linearly
related to work-life imbalance.
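
As a minimal sketch of how b1 and b0 in the form above can be estimated, the following uses ordinary least squares for a single input variable. The function name fit_line and the data points are made up for illustration; they are not taken from the assignment.

```python
def fit_line(xs, ys):
    """Return (b1, b0) that minimise the squared error of y = b1*x + b0."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of X and Y divided by variance of X.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    # The fitted line always passes through the point of means.
    b0 = mean_y - b1 * mean_x
    return b1, b0


# Hypothetical data lying exactly on y = 2x + 1, so the error term e is zero.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
b1, b0 = fit_line(xs, ys)
print(b1, b0)
```

With noisy data the same formulas give the best-fitting line, and the leftover differences between the data and the line play the role of e.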

Logistic Regression
• Similar to linear regression, but used for classification.
• The dependent variable is categorical: 0 or 1, Yes or No, etc.
• Predicts whether something is true or false.
• Uses the sigmoid curve (S curve), which maps any real number to a value between
0 and 1.
• A threshold is set to decide the category, so the output is 0 or 1.
• Hypothesis: Z = WX + B, hθ(x) = sigmoid(Z)
• Example: logistic regression predicts whether a patient has stage 2 (0) or
stage 3 (1) cancer.
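
The hypothesis above can be sketched in a few lines. This is only an illustration for a single input: the weight w, the bias b and the threshold of 0.5 are assumed values, not parameters fitted to any data.

```python
import math


def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(x, w, b, threshold=0.5):
    """Return (category, probability) for one input value x."""
    z = w * x + b          # linear part: Z = WX + B
    p = sigmoid(z)         # h(x) = sigmoid(Z)
    return (1 if p >= threshold else 0), p


# Hypothetical parameters, chosen only for the example.
w, b = 1.5, -1.0
print(predict(2.0, w, b))  # z = 2.0, probability above 0.5, category 1
print(predict(0.0, w, b))  # z = -1.0, probability below 0.5, category 0
```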

Ans 2: Traditional Programming: In traditional programming, a person (the programmer)
builds the program manually: the rules are created and coded by hand, and nothing is
learned from data. We have the input data, and the programmer writes a program that
runs on a computer, uses that data, and produces the intended output.

Logic: Input + Program = Output.

Machine Learning: Machine learning is a subset of Artificial Intelligence that lets
software predict what the output will be without hand-written rules for the task.
In simple words, it is the science of getting computers to act without being
explicitly programmed.
Logic: Input + Output = Program
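
The contrast between the two can be sketched with a toy task. Everything here is hypothetical (the "is it hot?" task, the function names and the data): in the first function the programmer writes the rule, in the second the rule (a threshold) is derived from inputs paired with known outputs.

```python
def is_hot_rule(temp_c):
    """Traditional programming: the programmer hard-codes the rule."""
    return temp_c >= 30


def learn_threshold(temps, labels):
    """Machine learning (toy version): pick the threshold that agrees
    with the most labelled examples."""
    best_t, best_correct = None, -1
    for t in sorted(temps):
        correct = sum((x >= t) == y for x, y in zip(temps, labels))
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t


# Input + Output (made-up labelled examples) ...
temps = [18, 22, 25, 29, 31, 35]
labels = [False, False, False, False, True, True]

# ... = Program: the learned rule is "temp >= t".
t = learn_threshold(temps, labels)
print(t, 33 >= t)
```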

Ans 3:

A decision tree is a representation of a function that maps a vector of attribute
values to a single output value, called a “decision”.
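
A small sketch of this idea: each if-test below is an internal node of the tree and each return value is a leaf, so the function maps attribute values to one decision. The attributes and decisions are invented for the example.

```python
def decide(attrs):
    """Map a dictionary of attribute values to a single decision."""
    if attrs["raining"]:                 # root node tests "raining"
        return "stay in"
    if attrs["temperature"] == "hot":    # second node tests "temperature"
        return "go swimming"
    return "go walking"


print(decide({"raining": False, "temperature": "hot"}))
```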

Ans 4: K Nearest Neighbour is a simple algorithm that stores all the available cases and
classifies a new case based on a similarity measure. It classifies a data point based
on how its neighbours are classified.

The algorithm checks the k nearest neighbours of the new point and assigns it the
label that is most frequent among them.

In this case, the value of k is 5.

If we take the 5 nearest neighbours by distance, there are 3 squares and 2 triangles.

So, we assign the disc to the square class.
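
The vote described above can be sketched as follows. The coordinates are made up (the original figure is not reproduced here) and were chosen so that the 5 nearest neighbours of the query point are 3 squares and 2 triangles; Euclidean distance is assumed as the similarity measure.

```python
import math
from collections import Counter


def knn_classify(query, points, k):
    """Label the query with the most frequent class among its k nearest points."""
    by_distance = sorted(points, key=lambda p: math.dist(query, p[0]))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training points: ((x, y), label).
points = [
    ((1.0, 1.0), "square"),
    ((1.5, 0.5), "square"),
    ((0.5, 1.5), "square"),
    ((2.0, 2.0), "triangle"),
    ((2.0, 0.0), "triangle"),
    ((5.0, 5.0), "triangle"),
    ((6.0, 5.0), "triangle"),
]
query = (1.0, 0.5)  # plays the role of the disc

print(knn_classify(query, points, k=5))
```

Note that the result depends on k: with k = 7 in this made-up data set every point votes, and the 4 triangles outnumber the 3 squares.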


Ans 5: From the plot, we can see that the best 5 values of k are 20, 21, 22, 23 and 27.

a) Made a list best_k that contains the best values of k.

b) The score is the same for each of these values of k.

c) Case 1: Setting test data to 30% and training data to 70%.

Output: Accuracy is 96%.

Case 2: Setting test data to 40% and training data to 60%.

Output: Accuracy is 95%.

Case 1 performs better: its accuracy is 96%, whereas the accuracy of case 2 is 95%.
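
One possible way to build a list like best_k from part (a) is sketched below. The scores dictionary holds placeholder numbers, not the scores from this assignment, and the helper name best_k_values is an assumption; in practice the scores would come from evaluating the classifier on the test split for each k.

```python
def best_k_values(scores, n):
    """Return the n values of k with the highest score (ties go to smaller k)."""
    ranked = sorted(scores, key=lambda k: (-scores[k], k))
    return sorted(ranked[:n])


# Placeholder accuracy per k, for illustration only.
scores = {1: 0.90, 2: 0.93, 3: 0.95, 4: 0.95, 5: 0.95, 6: 0.91}

best_k = best_k_values(scores, n=3)
print(best_k)
# In this toy data all selected values share the same top score.
print({scores[k] for k in best_k})
```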
