
Department of Civil Engineering

Course: CE 235 Artificial Intelligence and Data Science


Instructor: Gopal R. Patil

Logistic Regression

1. Classification
• In linear regression, the response variable is quantitative.
• Often, we need to predict response variables that are qualitative.
• Qualitative variables are also referred to as categorical variables. Predicting a
qualitative response variable for an observation is called classification. Examples
o Will it rain today or not?
o Whether a structure will fail after an earthquake of a certain magnitude or not.
o Mode used to travel from home to office.
o Gap accepted by a driver or not.
• Logistic regression is one of the popular approaches used for classification problems.

2. Logistic regression
• We can code the qualitative responses. For example,
    Y = { 1, walk
          2, TW
          3, Car
          4, Transit }
o Note that this coding implies (TW − walk) = 1 and (Transit − Car) = 1.
o Are these equal differences meaningful? Someone else could use a different coding.
o When a response variable has only two possible outcomes (binary response),
we can use dummy variables. For example,
    Y = { 1, if a gap is accepted
          0, if a gap is rejected }

o We could fit a linear regression and predict that the gap is accepted if Ŷ > 0.5
and rejected otherwise. For a binary response, a linear model gives the same
classification even if the 0/1 coding is reversed. However, the predicted values
can be less than 0 or greater than 1, so they cannot be interpreted as
probabilities.
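As a quick illustration, fitting ordinary least squares to a small 0/1 gap-acceptance data set shows the predictions escaping the [0, 1] range. The gap sizes and outcomes below are made up for illustration:

```python
# Hypothetical gap-acceptance data: gap size in seconds vs. 1 = accepted, 0 = rejected
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

n = len(xs)
xbar = sum(xs) / n
ybar = sum(ys) / n
# Ordinary least-squares slope and intercept (closed form for one predictor)
b1 = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) \
     / sum((x - xbar) ** 2 for x in xs)
b0 = ybar - b1 * xbar

print(b0 + b1 * 12)   # prediction for a 12 s gap: greater than 1
print(b0 + b1 * 0)    # prediction for a 0 s gap: less than 0
```

The fitted line is a valid least-squares fit, but its extrapolated values cannot be read as probabilities, which motivates the logistic function below.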

• Logistic regression models the probability that Y belongs to a particular category. Let,
    p(X) = Pr(Y = 1 | X)
• We need the output to lie between 0 and 1 for all values of X (a linear model
does not guarantee this). In logistic regression, the logistic function is used,
given by the following equation for two classes:

    p(X) = e^(β0 + β1X) / (1 + e^(β0 + β1X))
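The logistic function can be sketched in a few lines of Python. The coefficient values β0 = −5 and β1 = 1 used here are illustrative placeholders, not estimates:

```python
import math

def logistic(x, beta0=-5.0, beta1=1.0):
    """p(X) = e^(b0 + b1*x) / (1 + e^(b0 + b1*x)).
    The default coefficients are illustrative placeholders."""
    z = beta0 + beta1 * x
    return math.exp(z) / (1.0 + math.exp(z))

# Unlike a straight line, the output stays strictly between 0 and 1
# and traces an S-shaped curve as x increases.
for x in (0, 2.5, 5, 7.5, 10):
    print(x, round(logistic(x), 3))
```

With these placeholder coefficients the curve crosses p = 0.5 at x = 5, where β0 + β1x = 0.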
• The logistic function always produces an S-shaped curve. The coefficients β0
and β1 are estimated using the method of Maximum Likelihood (ML). The
likelihood function is

    ℒ(β0, β1) = ∏_{i: yi = 1} p(xi) × ∏_{i′: yi′ = 0} (1 − p(xi′))

or, written out with the probabilities,

    ℒ(β0, β1) = ∏_{i: yi = 1} Pr(Y = 1 | X = xi) × ∏_{i′: yi′ = 0} (1 − Pr(Y = 1 | X = xi′))

The estimates β̂0 and β̂1 are obtained by maximizing the likelihood function. ML
is a popular approach for fitting non-linear models (in ML we choose the
parameters that make the observed data most probable).
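A minimal sketch of ML estimation on a synthetic gap-acceptance data set, where the true parameters are assumed to be β0 = −5 and β1 = 1. Statistical software uses faster methods such as Newton-Raphson, but plain gradient ascent on the log-likelihood shows the idea:

```python
import math
import random

# Synthetic gap-acceptance data (assumed): larger gaps are more likely
# to be accepted; true parameters taken as beta0 = -5, beta1 = 1.
random.seed(0)
xs = [random.uniform(0.0, 10.0) for _ in range(200)]
ys = [1 if random.random() < 1.0 / (1.0 + math.exp(-(x - 5.0))) else 0
      for x in xs]

def log_likelihood(b0, b1):
    """Log of the likelihood function from the notes."""
    ll = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        ll += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return ll

# Maximize the log-likelihood by simple gradient ascent.
b0 = b1 = 0.0
lr = 0.1
for _ in range(5000):
    g0 = g1 = 0.0
    for x, y in zip(xs, ys):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
        g0 += y - p          # d(log-likelihood)/d(b0)
        g1 += (y - p) * x    # d(log-likelihood)/d(b1)
    b0 += lr * g0 / len(xs)
    b1 += lr * g1 / len(xs)

print(b0, b1)  # estimates should lie near the assumed -5 and 1
```

The estimates are the parameter values at which the observed accept/reject pattern is most probable, which is exactly the ML criterion above.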
• Multiple logistic regression
    p(X) = e^(β0 + β1X1 + ⋯ + βP XP) / (1 + e^(β0 + β1X1 + ⋯ + βP XP))
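The multiple-predictor version is a one-line extension of the single-predictor function. The predictor names (gap size, waiting time) and the coefficient values here are assumptions for illustration, not estimates:

```python
import math

def p_multiple(x, betas):
    """Multiple logistic regression probability.
    betas = [b0, b1, ..., bP]; x = [x1, ..., xP] (illustrative names)."""
    z = betas[0] + sum(b * xi for b, xi in zip(betas[1:], x))
    return math.exp(z) / (1.0 + math.exp(z))

# E.g. probability of accepting a gap given gap size (s) and waiting
# time (s), with assumed coefficients [-4.0, 0.8, 0.02]:
p = p_multiple([6.0, 30.0], [-4.0, 0.8, 0.02])
print(p)
```

The output is again guaranteed to lie strictly between 0 and 1, whatever the predictor values.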
3. Multinomial Logistic regression
• The response variable has more than two categories. For example, the modes used
to travel are walk, Two-Wheeler (TW), car, and transit. Here, we select one
category (class) as the baseline; assume the Kth class is the baseline.
o For k = 1, 2, …, K − 1:

    Pr(Y = k | X = x) = e^(βk0 + βk1 x1 + ⋯ + βkP xP) / (1 + Σ_{l=1}^{K−1} e^(βl0 + βl1 x1 + ⋯ + βlP xP))

o And for k = K:

    Pr(Y = K | X = x) = 1 / (1 + Σ_{l=1}^{K−1} e^(βl0 + βl1 x1 + ⋯ + βlP xP))
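The two cases above can be combined into one small function. The travel modes and coefficient values below are assumptions for illustration, with transit playing the role of the baseline class K:

```python
import math

def multinomial_probs(x, B):
    """Class probabilities with the K-th class as baseline.
    B holds K-1 coefficient vectors [b_k0, b_k1, ..., b_kP]; the
    baseline class implicitly has all coefficients equal to zero.
    Coefficient values passed in below are illustrative, not estimated."""
    exps = [math.exp(b[0] + sum(bj * xj for bj, xj in zip(b[1:], x)))
            for b in B]
    denom = 1.0 + sum(exps)
    # First K-1 classes use the upper formula; the baseline gets 1/denom.
    return [e / denom for e in exps] + [1.0 / denom]

# Four travel modes (walk, TW, car, with transit as baseline) and one
# predictor, say trip distance in km (assumed):
probs = multinomial_probs([3.0], [[2.0, -1.0],    # walk
                                  [1.0, -0.2],    # TW
                                  [0.5, 0.1]])    # car
print(probs, sum(probs))  # the four probabilities sum to 1
```

By construction the K probabilities always sum to 1, so fixing the baseline coefficients at zero loses no generality.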
