Uploaded by Pinaki Ghosh

Logistic Regression

Extends idea of linear regression to situation where outcome variable is categorical

Widely used, particularly where a structured model is useful to explain (=profiling) or to predict

We focus on binary classification, i.e. Y=0 or Y=1

Why Not Linear Regression?

Technically, you can run linear regression with a 0/1 response variable and obtain an output

But the resulting model will not make sense

For instance:

Predictions will mostly not be 0 or 1

Coefficient interpretation will not make sense
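A minimal sketch of the problem, assuming a made-up toy data set and a hand-written least-squares helper (`fit_line` is a hypothetical name, not from the textbook). It fits a straight line to a 0/1 outcome and prints the fitted values, which are neither 0 nor 1 and fall outside the range 0 to 1 at both ends.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical toy data: a 0/1 outcome that switches from 0 to 1 as x grows.
xs = list(range(10))
ys = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

intercept, slope = fit_line(xs, ys)
predictions = [intercept + slope * x for x in xs]

for x, pred in zip(xs, predictions):
    print(f"x={x}  predicted={pred:.3f}")
```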

The Logit

Goal: Find a function of the predictor variables that relates them to a 0/1 outcome

Instead of Y as outcome variable (like in linear regression), we use a function of Y called the logit

Logit can be modeled as a linear function of the predictors

The logit can be mapped back to a probability, which, in turn, can be mapped to a class

Step 1: Logistic Response Function

p = probability of belonging to class 1

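Need to relate p to predictors with a function that guarantees $0 \le p \le 1$

Standard linear function (as shown below) does not:

$$p = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_q x_q$$

q = number of predictors

The Fix: use logistic response function

$$p = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_q x_q)}}$$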

Equation 8.2 in textbook
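As a rough illustration, the logistic response function p = 1 / (1 + e^-(b0 + b1 x1 + ... + bq xq)) can be written in a few lines. The function name and the coefficient values below are assumptions made for this sketch only. Whatever the predictor values, the result stays strictly between 0 and 1.

```python
import math

def logistic_response(betas, xs):
    """p = 1 / (1 + exp(-(b0 + b1*x1 + ... + bq*xq))).

    betas: [b0, b1, ..., bq]; xs: [x1, ..., xq].
    """
    if len(betas) != len(xs) + 1:
        raise ValueError("need one intercept plus one coefficient per predictor")
    linear = betas[0] + sum(b * x for b, x in zip(betas[1:], xs))
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical coefficients, chosen only for illustration.
betas = [-1.0, 0.5]
for x in (-20, -2, 2, 20):
    print(x, logistic_response(betas, [x]))
```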

Step 2: The Odds

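The odds of an event are defined as:

$$\text{Odds} = \frac{p}{1 - p} \qquad \text{(eq. 8.3)}$$

p = probability of event

Or, given the odds of an event, the probability of the event can be computed by:

$$p = \frac{\text{Odds}}{1 + \text{Odds}} \qquad \text{(eq. 8.4)}$$

We can also relate the Odds to the predictors:

$$\text{Odds} = e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_q x_q} \qquad \text{(eq. 8.5)}$$

To get this result, substitute 8.2 into 8.4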
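The substitution can be written out step by step. The shorthand $z$ for the linear combination of predictors is introduced here only to keep the lines short; it is not notation from the slides.

$$
\begin{aligned}
z &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_q x_q \\
\frac{1}{1 + e^{-z}} &= \frac{\text{Odds}}{1 + \text{Odds}} && \text{(8.2 substituted for } p \text{ in 8.4)} \\
1 + \text{Odds} &= \text{Odds}\,(1 + e^{-z}) \\
1 &= \text{Odds}\; e^{-z} \\
\text{Odds} &= e^{z}
\end{aligned}
$$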

Step 3: Take log on both sides

This gives us the logit:

log(Odds) = logit (eq. 8.6)

$$\log(\text{Odds}) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_q x_q$$

Logit, cont.

So, the logit is a linear function of predictors $x_1, x_2, \ldots, x_q$

Takes values from -infinity to +infinity

Review the relationship between logit, odds and probability
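One way to review the three scales side by side is a small set of conversion helpers. The function names are made up for this sketch. A probability of 0.5 corresponds to odds of 1 and a logit of 0; probabilities below 0.5 give negative logits and probabilities above 0.5 give positive ones.

```python
import math

def odds_from_prob(p):
    return p / (1.0 - p)

def prob_from_odds(odds):
    return odds / (1.0 + odds)

def logit_from_prob(p):
    return math.log(odds_from_prob(p))

def prob_from_logit(logit):
    return prob_from_odds(math.exp(logit))

for p in (0.1, 0.25, 0.5, 0.75, 0.9):
    print(f"p={p:.2f}  odds={odds_from_prob(p):.3f}  logit={logit_from_prob(p):+.3f}")
```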

Example

Personal Loan Offer

Outcome variable: accept bank loan (0/1)

Predictors: Demographic info, and info about their bank relationship

Data Preprocessing

Partition 60% training, 40% validation

Create 0/1 dummy variables for categorical predictors
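The two preprocessing steps might look roughly like this in plain Python. The record layout, the column names, the random seed and the choice to omit the first category as a reference level are all assumptions of this sketch, not details taken from the loan data set.

```python
import random

def partition(records, train_fraction=0.6, seed=1):
    """Shuffle a copy of the records and split it into (training, validation)."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def add_dummies(record, column, categories):
    """Replace one categorical column with 0/1 dummies, omitting the first
    category as the reference level so the dummies are not redundant."""
    out = {k: v for k, v in record.items() if k != column}
    for category in categories[1:]:
        out[f"{column}_{category}"] = 1 if record[column] == category else 0
    return out

# Hypothetical records; the column names and values are made up for illustration.
levels = ["Undergrad", "Graduate", "Advanced"]
records = [{"Income": 40 + 10 * i, "Education": levels[i % 3]} for i in range(10)]

encoded = [add_dummies(r, "Education", levels) for r in records]
training, validation = partition(encoded)
print(len(training), len(validation))
```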

Single Predictor Model

Modeling loan acceptance on income (x)

Fitted coefficients (more later): $b_0 = -6.3525$, $b_1 = 0.0392$

Seeing the Relationship
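A short sketch of the fitted single-predictor curve, assuming an intercept of -6.3525 and a positive income coefficient of 0.0392, so that the estimated probability of acceptance rises with income. The function name and the income values tried are illustrative only.

```python
import math

B0 = -6.3525  # intercept of the single-predictor model
B1 = 0.0392   # income coefficient; ASSUMED positive for this sketch

def prob_accept(income):
    """Estimated probability of accepting the loan at a given income."""
    return 1.0 / (1.0 + math.exp(-(B0 + B1 * income)))

for income in (50, 100, 150, 200, 250):
    print(f"income={income:>3}  p={prob_accept(income):.3f}")

# Income at which the logit is 0, i.e. the estimated probability is 0.5
print(-B0 / B1)
```

Under these assumptions the curve is S-shaped: nearly flat at low incomes, steepest around the income where the logit crosses zero, and flattening again as p approaches 1.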

Last Step: Classify

Model produces an estimated probability of being a 1

Convert to a classification by establishing cutoff level

If estimated prob. > cutoff, classify as 1

Ways to Determine Cutoff

0.50 is popular initial choice

Additional considerations (see Chapter 4)

Maximize classification accuracy

Maximize sensitivity (subject to min. level of specificity)

Minimize false positives (subject to max. false negative rate)

Minimize expected cost of misclassification (need to specify costs)
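The cutoff rule and the cost-based criterion can be sketched together. The probabilities, actual classes, candidate cutoffs and unit costs below are invented for illustration, and the helper names are hypothetical. The sketch classifies at a given cutoff, counts false positives and false negatives, and picks the candidate cutoff with the lowest total cost.

```python
def classify(probs, cutoff):
    """Classify as 1 when the estimated probability exceeds the cutoff."""
    return [1 if p > cutoff else 0 for p in probs]

def error_counts(actual, predicted):
    """Return (false_positives, false_negatives)."""
    fp = sum(1 for a, c in zip(actual, predicted) if a == 0 and c == 1)
    fn = sum(1 for a, c in zip(actual, predicted) if a == 1 and c == 0)
    return fp, fn

def misclassification_cost(actual, probs, cutoff, cost_fp, cost_fn):
    fp, fn = error_counts(actual, classify(probs, cutoff))
    return fp * cost_fp + fn * cost_fn

def best_cutoff(actual, probs, candidates, cost_fp, cost_fn):
    """Candidate cutoff with the lowest total cost (first one wins ties)."""
    return min(candidates,
               key=lambda c: misclassification_cost(actual, probs, c, cost_fp, cost_fn))

# Hypothetical validation records: estimated probabilities and actual classes.
probs = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]
actual = [0, 0, 1, 0, 1, 1]
candidates = [0.2, 0.35, 0.5, 0.65, 0.8]

print(classify(probs, 0.5))
print(best_cutoff(actual, probs, candidates, cost_fp=5, cost_fn=1))
print(best_cutoff(actual, probs, candidates, cost_fp=1, cost_fn=5))
```

When false positives are the costlier error the chosen cutoff moves up, and when false negatives are costlier it moves down.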

Example, Cont.

Estimates of the $\beta$'s are derived through an iterative process called maximum likelihood estimation

Let's include all 12 predictors in the model now

XLMiner's output gives coefficients for the logit, as well as odds for the individual terms
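To give a feel for what an iterative maximum likelihood fit does, here is a minimal sketch for a single predictor. It uses plain gradient ascent on the log-likelihood, which is a simple stand-in chosen for readability and is not claimed to be the algorithm XLMiner runs. The data, step size and iteration count are made-up assumptions.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(b0, b1, xs, ys):
    """Sum of log P(observed class) over all records."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def fit_by_gradient_ascent(xs, ys, step=0.01, iterations=20000):
    """Repeatedly move (b0, b1) in the direction that raises the log-likelihood."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        residuals = [y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys)]
        b0 += step * sum(residuals)
        b1 += step * sum(r * x for r, x in zip(residuals, xs))
    return b0, b1

# Hypothetical data with overlapping classes, so the estimates stay finite.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 1, 0, 1, 0, 1, 1]

b0, b1 = fit_by_gradient_ascent(xs, ys)
print(b0, b1)
print(log_likelihood(0.0, 0.0, xs, ys), log_likelihood(b0, b1, xs, ys))
```

The second printed line compares the log-likelihood at the all-zero starting point with the value at the fitted coefficients; the fitted value should be the larger of the two.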

Estimated Equation for Logit (Equation 8.10)

Equation for Odds (Equation 8.11)

Converting to Probability

$$p = \frac{\text{Odds}}{1 + \text{Odds}}$$

Interpreting Odds, Probability

For predictive classification, we typically use probability with a cutoff value

For explanatory purposes, odds have a useful interpretation:

If we increase $x_1$ by one unit, holding $x_2, x_3, \ldots, x_q$ constant, then $e^{b_1}$ is the factor by which the odds of belonging to class 1 increase
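A quick numerical check of this interpretation, using hypothetical coefficients and predictor values. It computes the odds before and after raising the first predictor by one unit, with the second held constant, and compares their ratio with the exponential of the first coefficient.

```python
import math

def odds(betas, xs):
    """Odds = exp(b0 + b1*x1 + ... + bq*xq)."""
    return math.exp(betas[0] + sum(b * x for b, x in zip(betas[1:], xs)))

# Hypothetical coefficients and predictor values, for illustration only.
betas = [-2.0, 0.3, 0.8]
before = odds(betas, [10.0, 1.0])
after = odds(betas, [11.0, 1.0])   # x1 raised by one unit, x2 held constant

print(after / before, math.exp(betas[1]))
```

The two printed numbers agree up to rounding, and the ratio does not depend on the starting values of the predictors. A negative coefficient gives a factor below 1, meaning the odds shrink.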

Loan Example:

Evaluating Classification Performance

Performance measures: Confusion matrix and % of misclassifications

More useful in this example: lift
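The numbers behind a lift chart can be sketched as follows, assuming a handful of invented validation records and hypothetical helper names. Records are sorted by estimated probability, highest first, and the running count of actual 1's is compared with the count expected from picking records at random.

```python
def cumulative_gains(actual, probs):
    """Sort records by estimated probability (highest first) and return the
    running count of actual 1's."""
    ranked = sorted(zip(probs, actual), key=lambda pair: pair[0], reverse=True)
    gains, running = [], 0
    for _, a in ranked:
        running += a
        gains.append(running)
    return gains

def baseline_gains(actual):
    """Expected running count of 1's if records were picked at random."""
    rate = sum(actual) / len(actual)
    return [rate * k for k in range(1, len(actual) + 1)]

# Hypothetical validation records: estimated probabilities and actual classes.
probs = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]
actual = [0, 0, 1, 0, 1, 1]

model = cumulative_gains(actual, probs)
base = baseline_gains(actual)
for k, (m, b) in enumerate(zip(model, base), start=1):
    print(f"top {k}: model={m}  random={b:.1f}  lift={m / b:.2f}")
```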

Multicollinearity

Problem: As in linear regression, if one predictor is a linear combination of other predictor(s), model estimation will fail

Note that in such a case, we have at least one redundant predictor

Solution: Remove extreme redundancies (by dropping predictors via variable selection, see next, or by data reduction methods such as PCA)
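A simple first screen for redundancy is to look at pairwise correlations between predictors. The sketch below uses invented columns and an arbitrary threshold of 0.95, both assumptions. Note that it only catches redundancy between pairs; a predictor that is a combination of several others would need a different check.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def redundant_pairs(columns, threshold=0.95):
    """Return pairs of column names whose |correlation| is at least threshold."""
    names = list(columns)
    return [(a, b)
            for i, a in enumerate(names) for b in names[i + 1:]
            if abs(pearson(columns[a], columns[b])) >= threshold]

# Hypothetical predictors: "Experience" is an exact linear function of "Age".
columns = {
    "Age": [25, 32, 41, 48, 56, 63],
    "Experience": [1, 8, 17, 24, 32, 39],   # Age - 24
    "Income": [60, 35, 90, 45, 120, 50],
}
print(redundant_pairs(columns))
```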

Variable Selection

This is the same issue as in linear regression

The number of correlated predictors can grow when we create derived variables such as interaction terms (e.g. Income x Family), to capture more complex relationships

Problem: Overly complex models have the danger of overfitting

Solution: Reduce variables via automated selection of variable subsets (as with linear regression)

Goodness-of-Fit

How well does model fit the data?

Popular measures:

Deviance ( = std. dev. estimate in XLMiner)

Multiple $R^2$

Goodness of fit measures are used in explanatory data analysis (profiling), not much in classification (where we look at classification accuracy with validation data)
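As a rough illustration of deviance, the sketch below assumes the usual definition for individual 0/1 outcomes, namely minus twice the log-likelihood of the observed classes under the estimated probabilities. Whether XLMiner reports exactly this quantity is not confirmed here. The data are invented.

```python
import math

def deviance(actual, probs):
    """-2 * log-likelihood of the 0/1 outcomes under the estimated probabilities."""
    log_lik = sum(math.log(p) if y == 1 else math.log(1.0 - p)
                  for y, p in zip(actual, probs))
    return -2.0 * log_lik

actual = [0, 0, 1, 0, 1, 1]
uninformative = [0.5] * 6                      # ignores the predictors
sharper = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]       # hypothetical model output

print(deviance(actual, uninformative))
print(deviance(actual, sharper))
```

Smaller deviance means the estimated probabilities sit closer to the observed outcomes.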

P-values for Predictors

Test null hypothesis that coefficient = 0

Useful for review to determine whether to include variable in model

Key in profiling tasks, but less important in predictive classification
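One common way to obtain such a p-value is a Wald test, which divides the estimated coefficient by its standard error and refers the result to the standard normal distribution. That this is the test behind the software's output is an assumption, and the coefficient and standard error values below are hypothetical.

```python
import math

def wald_p_value(coefficient, std_error):
    """Two-sided p-value for H0: coefficient = 0, using z = b / SE and the
    standard normal distribution."""
    z = coefficient / std_error
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical estimates and standard errors, for illustration only.
print(wald_p_value(0.80, 0.20))   # large |z|: strong evidence against H0
print(wald_p_value(0.10, 0.20))   # small |z|: little evidence against H0
```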

Complete Example:

Predicting Delayed Flights DC to NY

Variables

Outcome: delayed or not-delayed

Predictors:

Day of week

Departure time

Origin (DCA, IAD, BWI)

Destination (LGA, JFK, EWR)

Carrier

Weather (1 = bad weather)

Delayed Flights: Summary

Bold numbers are delayed flights on Mon - Wed

Regular numbers Thurs - Sun

Data Preprocessing

Create binary dummies for the categorical variables

Partition 60%/40% into training/validation

The Fitted Model (not all 28 variables shown)

Model Output (Validation Data)

Lift Chart

After Variable Selection (Model with 7 Predictors)

7-Predictor Model

Note that Weather is unknown at time of prediction (requires weather forecast or dropping that predictor)

Summary

Logistic regression is similar to linear regression, except that it is used with a categorical response

It can be used for explanatory tasks (=profiling) or predictive tasks (=classification)

The predictors are related to the response Y via a nonlinear function called the logit

As in linear regression, reducing predictors can be done via variable selection

Logistic regression can be generalized to more than two classes (not in XLMiner)
