Can we use linear regression for Classification?
Now, for a new point with weight 180, the predicted probability will be > 1.
For a few cases the predicted probability will even be < 0, so the straight-line output cannot be interpreted as a probability.

What is logistic regression?
• Logistic regression is the appropriate regression analysis to conduct when the dependent variable is dichotomous (binary).
• Like all regression analyses, logistic regression is a predictive analysis.
• Logistic regression is used to describe data and to explain the relationship between one dependent binary variable and one or more independent variables.
• The outcome can be either yes or no (2 outputs).

Applications
• Manufacturing: find the probability of failure
• Healthcare: likelihood of disease
• Finance: fraudulent or non-fraudulent transactions
• Marketing: whether users will click on an advertisement or not

Assumptions
• The dependent variable is binary or dichotomous.
• There should be no, or very little, multicollinearity between the predictor variables; in other words, the independent variables should be independent of each other, with no high correlation between them.
• The independent variables should be linearly related to the log odds.
• Logistic regression requires fairly large sample sizes.

What are the log odds?
• In very simplistic terms, log odds are an alternate way of expressing probabilities.
• To understand log odds, it is important to understand a key difference between odds and probabilities:
• odds are the ratio of something happening to something not happening,
• probability is the ratio of something happening to everything that could possibly happen.

Logistic Function
Logistic function: σ(z) = 1 / (1 + e^(-z))
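The overshoot described above can be sketched numerically. The tiny weight-vs-obesity dataset below is made up purely for illustration:

```python
import numpy as np

# Hypothetical data: weight (lbs) vs. obese (1) / not obese (0)
weights = np.array([100, 110, 120, 130, 150, 160, 170, 175])
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Ordinary least-squares line fitted directly to the 0/1 labels
slope, intercept = np.polyfit(weights, labels, 1)

# The line's prediction for weight 180 overshoots 1,
# so it cannot be read as a probability ...
print(slope * 180 + intercept > 1)  # True
# ... and for a light enough point it goes below 0
print(slope * 90 + intercept < 0)   # True
```

This is exactly the failure mode that motivates squashing the line's output through the logistic function.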
We all know the equation of the best-fit line in linear regression:

    y = β0 + β1x1 + β2x2 + … + βnxn

Let's say that instead of y we are modelling probabilities (P):

    P = β0 + β1x1 + β2x2 + … + βnxn

But there is an issue here: the value of P can exceed 1 or go below 0, and we know that the range of a probability is (0, 1). To overcome this issue we take the "odds" of P:

    odds = P / (1 - P)

Here, the odds take values in the range (0, ∞).
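As a quick numerical check of that (0, ∞) range, here is a minimal sketch; the probability values are arbitrary examples:

```python
# Odds map probabilities in (0, 1) onto the range (0, ∞)
def odds(p):
    return p / (1 - p)

for p in (0.1, 0.5, 0.9, 0.99):
    print(p, odds(p))

# p = 0.5 gives odds of exactly 1; as p approaches 1 the odds
# grow without bound, and as p approaches 0 they shrink toward 0
```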
Logistic Function
• The odds are still bounded below by 0, so to control this we take the log of the odds, which has a range of (-∞, +∞):

    log(P / (1 - P)) = β0 + β1x1 + … + βnxn

• Solving this equation for P yields the logistic (sigmoid) function:

    P = 1 / (1 + e^(-z)),  where z = β0 + β1x1 + … + βnxn

Cost Function
In linear regression, we used the Mean Squared Error (MSE) as the cost function:

    MSE = (1/n) Σ (yi - ŷi)²
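A minimal sketch of the logistic function and its inverse, the log odds (the function names here are my own):

```python
import math

def sigmoid(z):
    """Logistic function: squashes any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def log_odds(p):
    """Logit: the inverse of the sigmoid."""
    return math.log(p / (1 - p))

print(sigmoid(0))  # 0.5
# Taking the log odds of a sigmoid output recovers the linear term z,
# which is why the log odds are linear in the predictors
print(log_odds(sigmoid(2.5)))  # ≈ 2.5
```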
• In logistic regression, ŷi is a non-linear function of the parameters:

    ŷ = 1 / (1 + e^(-z))

• If we substitute this into the MSE equation above, the cost surface becomes non-convex, with many local minima, so gradient descent is not guaranteed to find the global minimum.

Cost Function
• Instead, logistic regression uses the log loss (binary cross-entropy), which penalizes each predicted probability p according to the true label y:

    penalty = -(y·log(p) + (1 - y)·log(1 - p))
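The non-convexity claim can be checked numerically on even a single training example. This sketch sweeps one weight and inspects the discrete second differences of the MSE curve; a convex curve would never have a negative one:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One training example (x = 1, y = 1); sweep the weight w
w = np.linspace(-6, 6, 121)
mse = (1 - sigmoid(w)) ** 2

# Discrete second differences approximate curvature; a convex
# function would have them all >= 0, but here the signs are mixed
curv = np.diff(mse, 2)
print(curv.min() < 0 and curv.max() > 0)  # True → non-convex
```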
• When the true output value y = 1 (positive):

    penalty = -(1·log(p) + (1 - 1)·log(1 - p)) = -log(p)

• When the true output value y = 0 (negative):

    penalty = -(0·log(p) + (1 - 0)·log(1 - p)) = -log(1 - p)
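The two penalty cases above can be folded into a single function; a minimal sketch:

```python
import math

def log_loss(y, p):
    """Per-example penalty: -(y*log(p) + (1 - y)*log(1 - p))."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

# Confident and correct → small penalty; confident and wrong → large
print(log_loss(1, 0.9))  # -log(0.9) ≈ 0.105
print(log_loss(1, 0.1))  # -log(0.1) ≈ 2.303
print(log_loss(0, 0.1))  # -log(0.9) ≈ 0.105
```

Note that the penalty grows without bound as the predicted probability approaches the wrong extreme, which is what pushes the model toward well-calibrated probabilities.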