
Logistic Regression
PhD. Msc. David C. Baldears S.
TC3007C

https://towardsdatascience.com/logistic-regression-detailed-overview-46c4da4303bc
Simple linear regression
Age and systolic blood pressure (SBP) among 33 adult women

Simple linear regression
● Relation between 2 continuous variables (SBP and age)

● Regression coefficient β1
○ Measures association between y and x
○ Amount by which y changes on average when x changes by one unit
○ Least squares method
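
For reference, the model behind these bullets (its equation is not reproduced on the slide) is:

$$y = \beta_0 + \beta_1 x + \varepsilon$$

where β1 is the slope estimated by least squares.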

Multiple linear regression
● Relation between a continuous variable and a set of continuous variables xi

● Partial regression coefficients βi
○ Amount by which y changes on average when xi changes by one unit and all the other xis remain constant
○ Measures association between xi and y adjusted for all other xi
● Example
○ SBP versus age, weight, height, etc.
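
A minimal statement of the model, assuming k predictors x1, …, xk (the slide's own equation is not shown):

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$$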

Multiple linear regression
Common names for the two sides of the model:

y (left-hand side)        x (right-hand side)
Predicted                 Predictor variables
Response variable         Explanatory variables
Outcome variable          Covariables
Dependent variable        Independent variables

Example — Data
Age and signs of coronary heart disease (CD)
What is the relationship between age and CD?

Linear Regression: Not a good fit
There is an increasing relationship between age and CD.
But we may want a simpler model, with CD predicted as 0 or 1.
We need a model for analyzing data with a binary response.

Logistic regression
● It is a regression-type machine learning algorithm deployed to solve classification (categorical) problems.
● It does this by predicting categorical outcomes, unlike linear regression, which predicts a continuous outcome.
● Problems with binary outcomes, such as Yes/No, 0/1, or True/False, are called classification problems.
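
As an illustrative sketch (not from the slides), here is scikit-learn's LogisticRegression on a made-up binary problem:

```python
# Hypothetical sketch: logistic regression on a made-up binary dataset.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy data: hours studied (x) vs. passed exam (y = 1) or not (y = 0).
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])
y = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression()
model.fit(X, y)

print(model.predict([[2.5]]))        # predicted class, e.g. [0]
print(model.predict_proba([[2.5]]))  # probabilities for the two classes
```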

Why Apply Logistic Regression?
Linear regression does not give a good fitted line for problems whose outcome takes only two values.

Being linear in nature, it fails to cover such datasets and gives lower accuracy when predicting.

We need a model for analyzing data with a binary response.

Logistic regression
Prevalence (%) of signs of CD according to age group

Comparing the Linear Probability (LP) and Logistic Regression Models

Mathematics Involved in Logistic Regression
The sigmoid function:

$$\sigma(z) = \frac{1}{1 + e^{-z}}$$

which can also be expressed in terms of the intercept and weights, z = β0 + β1x:

$$\pi(x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}$$

Can also be expressed as:

$$\pi(x) = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}$$

Probability
The sigmoid output can be expressed as a probability:

$$\pi(x) = P(Y = 1 \mid x) = \frac{e^{\beta_0 + \beta_1 x}}{1 + e^{\beta_0 + \beta_1 x}}$$

π is the probability that the event Y occurs.

1 − π is the probability that the event Y does not occur.

Probability
The odds of an event is the probability of it occurring divided by the probability of it not occurring. Thus, if π is the probability of the event:

$$\text{odds} = \frac{\pi}{1 - \pi}$$

The log-odds, or "logit", is linear in x:

$$\text{logit}(\pi) = \ln\left(\frac{\pi}{1 - \pi}\right) = \beta_0 + \beta_1 x$$

● β0 = log odds of disease in the unexposed (x = 0)
● β1 = log odds ratio associated with being exposed
● $e^{\beta_1}$ = odds ratio

Note: this is the natural log (aka "ln").


Why Log Odds?
In a study, 70.5% of men and 72.9% of women voted.

The odds of a man voting were 0.705/0.295 = 2.39, and the log odds were ln(2.39) = 0.8712.

The odds of a woman voting were 0.729/0.271 = 2.69, and the log odds were ln(2.69) = 0.9896.

Note that ln(1.0) = 0, so when the odds are 1.0 (0.5/0.5) the log odds are zero.
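
These numbers can be re-computed directly (a quick sketch; the percentages are taken from the slide):

```python
# Re-computing the voting example's odds and log odds.
import math

for group, p in [("men", 0.705), ("women", 0.729)]:
    odds = p / (1 - p)
    print(group, round(odds, 2), round(math.log(odds), 4))
# men 2.39 0.8712
# women 2.69 0.9896
```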

Why use Logarithms?
● They have 3 advantages:
1. Odds vary from 0 to ∞, whereas log odds vary from −∞ to +∞ and are centered at 0. Odds less than 1 have negative log odds and odds greater than 1 have positive log odds. This accords better with the real number line, which runs from −∞ to +∞.
2. Multiplying any two numbers together is equivalent to adding their logs. Thus logs make it possible to convert multiplicative models to additive models, a useful property for logistic regression, which is a non-linear multiplicative model when not expressed in logs.
3. A useful statistic for evaluating the fit of models is −2 log-likelihood (also known as the deviance). The model has to be expressed in logarithms for this to work.
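
A one-line numeric illustration of point 2 (the example values are mine):

```python
# Multiplying two numbers is equivalent to adding their logs.
import math

a, b = 2.0, 3.0
print(math.log(a * b), math.log(a) + math.log(b))  # both ~1.7918
```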

Logistic Regression Model
● The logistic distribution constrains the estimated probabilities to lie between 0 and 1.
● The estimated probability is:

$$p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}}$$

○ if you let β0 + β1x = 0, then p = .50
○ as β0 + β1x gets really big, p approaches 1
○ as β0 + β1x gets really small, p approaches 0
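
A quick numeric check of these three bullets (a hand-rolled sketch, not the slide's code):

```python
# The sigmoid maps beta0 + beta1*x to a probability in (0, 1).
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

print(sigmoid(0.0))    # 0.5       (beta0 + beta1*x = 0)
print(sigmoid(10.0))   # ~0.99995  (large z -> p approaches 1)
print(sigmoid(-10.0))  # ~0.00005  (small z -> p approaches 0)
```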

Cost Function of Logistic Regression
The cost for a single prediction can be defined as:

$$\text{Cost}(\pi(x), y) = \begin{cases} -\log(\pi(x)) & \text{if } y = 1 \\ -\log(1 - \pi(x)) & \text{if } y = 0 \end{cases}$$

in a nutshell:

$$J(\beta) = -\frac{1}{n}\sum_{i=1}^{n}\left[y_i \log(\pi(x_i)) + (1 - y_i)\log(1 - \pi(x_i))\right]$$

where n is the number of records that we have.
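
A direct NumPy translation of this cost (a sketch; the names are mine):

```python
# Binary cross-entropy cost for logistic regression.
import numpy as np

def cost(y, pi):
    """y: true labels (0/1); pi: predicted probabilities pi(x)."""
    n = len(y)
    return -np.sum(y * np.log(pi) + (1 - y) * np.log(1 - pi)) / n

y  = np.array([1, 0, 1, 1])
pi = np.array([0.9, 0.1, 0.8, 0.6])
print(cost(y, pi))  # ~0.24
```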

Fitting equation to the data
● Linear regression: Least squares
● Logistic regression: Maximum likelihood
● Likelihood function
○ Estimates parameters β0 and β1
○ Practically easier to work with log-likelihood
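
For reference, the log-likelihood being maximised is the standard one implied by the cost above (the slide's own formula is not reproduced here):

$$\ell(\beta_0, \beta_1) = \sum_{i=1}^{n}\left[y_i \ln \pi(x_i) + (1 - y_i)\ln\left(1 - \pi(x_i)\right)\right]$$

Maximising ℓ is equivalent to minimising the cost J(β) = −ℓ/n from the previous slide.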

Explanation of Cases of Cost Function

When Y = 1: the cost is −log(π(x)), which penalizes the prediction as it moves farther from 1.

When Y = 0: the cost is −log(1 − π(x)), which penalizes the prediction as it moves farther from 0.

[Figure: two panels plotting cost against the model prediction π(x), one for each case.]

Note: our model's prediction won't exceed 1 and won't go below 0, so that part is outside of our worries.
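
A numeric illustration of the two cases (the probe values are mine):

```python
# When y = 1, the cost -log(pi) grows as pi moves away from 1;
# when y = 0, the cost -log(1 - pi) grows as pi moves away from 0.
import math

for pi in [0.9, 0.5, 0.1]:
    print(pi, round(-math.log(pi), 3), round(-math.log(1 - pi), 3))
# 0.9 -> 0.105 (good if y = 1), 2.303 (bad if y = 0)
# 0.5 -> 0.693, 0.693
# 0.1 -> 2.303, 0.105
```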
Derivative of Sigmoid Function
Starting from the sigmoid function, simplify the expression to reduce the number of exponential operations, then apply the reciprocal rule:

$$\sigma(z) = \frac{1}{1 + e^{-z}}$$

Applying the reciprocal rule to differentiate:

$$\frac{d\sigma}{dz} = \frac{e^{-z}}{(1 + e^{-z})^2} = \frac{1}{1 + e^{-z}}\cdot\frac{e^{-z}}{1 + e^{-z}} = \sigma(z)\left(1 - \sigma(z)\right)$$
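
A finite-difference check of this identity (a sketch):

```python
# Verify sigmoid'(z) = sigmoid(z) * (1 - sigmoid(z)) numerically.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

z, h = 0.7, 1e-6
numeric  = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)  # central difference
analytic = sigmoid(z) * (1 - sigmoid(z))
print(abs(numeric - analytic) < 1e-9)  # True
```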

Gradient Descent and Cost Function Derivatives
Having the cost function

$$J(\beta) = -\frac{1}{n}\sum_{i=1}^{n}\left[y_i \log(\pi(x_i)) + (1 - y_i)\log(1 - \pi(x_i))\right]$$

for convenience, we need to find its partial derivatives ∂J/∂βj. The gradient descent formula:

$$\beta_j := \beta_j - \alpha\,\frac{\partial J}{\partial \beta_j}$$

with α as the learning rate.


Gradient Descent and Cost Function Derivatives
Split the derivative into two parts: Part 1 (call it A) for the yi log(π(xi)) term, and Part 2 (call it B) for the (1 − yi) log(1 − π(xi)) term. Using the sigmoid derivative above, adding A + B gives:

$$\frac{\partial J}{\partial \beta_j} = \frac{1}{n}\sum_{i=1}^{n}\left(\pi(x_i) - y_i\right)x_{ij}$$
Gradient Descent and Cost Function Derivatives
Hence, the update for β:

$$\beta_j := \beta_j - \frac{\alpha}{n}\sum_{i=1}^{n}\left(\pi(x_i) - y_i\right)x_{ij}$$
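
Putting the update rule to work, a minimal batch gradient-descent fit in Python (a sketch with made-up data; the function and variable names are mine):

```python
# Batch gradient descent for logistic regression:
# beta -= alpha/n * X^T (pi - y), repeated until convergence.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit(X, y, alpha=0.1, iters=5000):
    X = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    beta = np.zeros(X.shape[1])                # start coefficients at 0
    n = len(y)
    for _ in range(iters):
        pi = sigmoid(X @ beta)                 # current predicted probabilities
        beta -= alpha / n * X.T @ (pi - y)     # the update rule above
    return beta

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([0, 0, 1, 0, 1, 1])
print(fit(X, y))  # [beta0, beta1]
```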

Maximum likelihood
● Iterative computing
○ Choice of an arbitrary value for the coefficients (usually 0)
○ Computing of log-likelihood
○ Variation of coefficients’ values
○ Reiteration until maximisation (plateau)
● Results
○ Maximum Likelihood Estimates (MLE) for β0 and β1
○ Estimates of P(y) for a given value of x
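
For comparison, the same fit via a library MLE routine (a sketch using statsmodels, which iterates until the log-likelihood plateaus; the data are the same made-up values as above):

```python
# Maximum-likelihood fit via statsmodels.
import numpy as np
import statsmodels.api as sm

X = sm.add_constant(np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]))
y = np.array([0, 0, 1, 0, 1, 1])

result = sm.Logit(y, X).fit()  # prints convergence information
print(result.params)           # MLEs for beta0 and beta1
```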

Multiple logistic regression
● More than one independent variable
○ Dichotomous, ordinal, nominal, continuous …

● Interpretation of βi
○ Increase in log-odds for a one unit increase in xi with all the other xis constant
○ Measures association between xi and log-odds adjusted for all other xi
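
A sketch of reading adjusted odds ratios off a fitted multiple logistic regression (invented data; exponentiating each coefficient gives the odds ratio per one-unit increase in that xi):

```python
# exp(beta_i) is the odds ratio for a one-unit increase in x_i,
# holding the other predictors constant.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))                        # two made-up predictors
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=100) > 0).astype(int)

model = LogisticRegression().fit(X, y)
print(np.exp(model.coef_))  # adjusted odds ratios for x1 and x2
```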

Types of Logistic Regression
● Binary Logistic Regression: the dependent/target variable has two
distinct values
○ For example 0 or 1, malignant or benign, passed or failed, admitted or not admitted.
● Multinomial Logistic Regression: the dependent/target variable has
three or more possible (unordered) values.
○ For example, the use of Chest X-ray images as features that give indication about one of
the three possible outcomes (No disease, Viral Pneumonia, COVID-19).
● Ordinal Logistic Regression: the target variable is of ordinal nature. In
this type, the categories are ordered in a meaningful manner and each
category has quantitative significance.
○ For example, the grades obtained on an exam have categories that have quantitative
significance and they are ordered. Keeping it simple, the grades can be A, B, or C.
○ “very poor”, “poor”, “good”, “very good”
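
A quick sketch of the multinomial case with invented data (scikit-learn's default lbfgs solver handles three classes with a multinomial formulation):

```python
# Multinomial logistic regression over three made-up classes.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(150, 4))       # four made-up features
y = rng.integers(0, 3, size=150)    # three classes: 0, 1, 2

model = LogisticRegression(max_iter=500).fit(X, y)
print(model.predict_proba(X[:1]))   # probabilities over the 3 classes
```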
Examples
● Education sector:
○ Predicting whether a student gets admission into a university program or not, based on test
scores and various other factors.
○ On e-learning platforms, predicting whether a student will complete a course on time,
based on past activity and other statistics relevant to the problem.
● Business sector:
○ Predicting whether a credit card transaction made by a user is fraudulent or not.
● Medical sector:
○ Predicting whether a person has a disease or not, based on values obtained from test
reports or other factors in general.
○ A very innovative application of Machine Learning being used by researchers is to predict
whether a person has COVID-19 or not using Chest X-ray images.
● Other applications:
○ Email Classification – Spam or not spam
○ Sentiment Analysis – Person is sad or happy based on a text message
○ Object Detection and Classification – Classifying an image to be a cat image or a dog image
