
Statistics 149 Spring 2006

Copyright © 2006 by Mark E. Irwin

**Logistic Regression Model**

Let Yi be the response for the ith observation, where Yi = 1 for a success and Yi = 0 for a failure.

Then Yi ∼ Bin(1, π(xi)), where xi is the level of the predictor for observation i. An equivalent way of thinking of this is to model µ(Yi|xi) = π(xi). One possible model would be

π(xi) = β0 + β1xi


As we saw last class, this model will give invalid values for π for extreme xs. In fact (taking β1 > 0; the inequalities reverse if β1 < 0), there is a problem if

x < −β0/β1   or   x > (1 − β0)/β1

since then π(xi) = β0 + β1xi falls below 0 or above 1.

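To see the problem concretely, here is a quick numerical check (a sketch with made-up coefficients β0 = 0.2 and β1 = 0.1, not values from the text):

```python
# Linear model for a probability: pi(x) = b0 + b1*x.
# With made-up coefficients, pi(x) leaves [0, 1] for extreme x.
b0, b1 = 0.2, 0.1

def pi_linear(x):
    return b0 + b1 * x

# Valid in the middle of the range:
assert 0 <= pi_linear(3) <= 1            # pi(3) = 0.5
# Invalid outside (-b0/b1, (1 - b0)/b1) = (-2, 8):
assert pi_linear(-3) < 0                 # below 0
assert pi_linear(9) > 1                  # above 1
```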
We would still like to keep a linear type predictor. One way to do this is to apply a link function g(·) to transform the mean to a linear function, i.e.

g(π) = β0 + β1x

As mentioned last class, we want the function g(·) to transform the interval [0, 1] to (−∞, ∞). While there are many possible choices for this, there is one that matches up with what we have seen before, the logit transformation

logit(π) = log(π / (1 − π)) = log ω = η


So instead of looking at things on the probability scale, let's look at things on the log odds (η) scale. Transforming back gives

π / (1 − π) = ω = e^η = e^(β0 + β1x)

and

π = ω / (1 + ω) = e^η / (1 + e^η) = e^(β0 + β1x) / (1 + e^(β0 + β1x))

Note the standard Bernoulli distribution results hold here:

µ(Y|X) = π = e^(β0 + β1x) / (1 + e^(β0 + β1x))

and

Var(Y|X) = π(1 − π) = e^(β0 + β1x) / (1 + e^(β0 + β1x))^2

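As a quick numerical sanity check, the transformations above can be verified in a few lines (a Python sketch; the coefficients are the rounded fitted birthwt values used later in these notes):

```python
import math

def logit(p):
    return math.log(p / (1 - p))

def inv_logit(eta):
    return math.exp(eta) / (1 + math.exp(eta))

b0, b1 = 0.385, -0.051       # rounded fitted birthwt coefficients
x = 25
eta = b0 + b1 * x            # log odds
omega = math.exp(eta)        # odds
pi = inv_logit(eta)          # probability

assert abs(pi / (1 - pi) - omega) < 1e-12    # pi/(1-pi) recovers the odds
assert abs(logit(pi) - eta) < 1e-12          # logit(pi) recovers eta
assert abs(pi * (1 - pi)                     # Bernoulli variance identity
           - math.exp(eta) / (1 + math.exp(eta)) ** 2) < 1e-12
```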
While the above just assumes a single predictor, it is trivial to extend this to multiple predictors: just set

logit(π) = β0 + β1x1 + ... + βpxp = Xβ

giving

π / (1 − π) = ω = e^η = e^(β0 + β1x1 + ... + βpxp) = e^(Xβ)

and

π = ω / (1 + ω) = e^η / (1 + e^η) = e^(Xβ) / (1 + e^(Xβ))

**Example: Low Birth Weight in Infants**

Hosmer and Lemeshow (1989) look at a data set on 189 births at Baystate Medical Center, Springfield, Mass. during 1986, with the main interest being in low birth weight.

- low: birth weight less than 2.5 kg (0/1)
- age: age of mother in years
- lwt: weight of mother (lbs) at last menstrual period
- race: white/black/other
- smoke: smoking status during pregnancy
- ptl: number of previous premature labours
- ht: history of hypertension (0/1)
- ui: has uterine irritability (0/1)
- ftv: number of physician visits in first trimester
- bwt: actual birth weight (grams)

We will focus on maternal age for now.

This data set is available in R in the data frame birthwt, though you may need to give the command library(MASS) to access it.

[Figure: plot of Low Birth Weight (0/1) against Maternal Age, ages 15 to 45]

Let's fit the model

logit(π(age)) = β0 + β1 age

in R with the function glm.

> birthwt.glm <- glm(low ~ age, data=birthwt, family=binomial)
> summary(birthwt.glm)

Call:
glm(formula = low ~ age, family = binomial, data = birthwt)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-1.0402  -0.9018  -0.7754   1.4119   1.7800

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.38458    0.73212   0.525    0.599
age         -0.05115    0.03151  -1.623    0.105

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 234.67  on 188  degrees of freedom
Residual deviance: 231.91  on 187  degrees of freedom
AIC: 235.91

The fitted curves are

η̂(age) = 0.385 − 0.051 age
ω̂(age) = e^(0.385 − 0.051 age)
π̂(age) = e^(0.385 − 0.051 age) / (1 + e^(0.385 − 0.051 age))

[Figure: plot of Low Birth Weight against Maternal Age (15 to 45) with the fitted curve π̂(age) overlaid]

So in this case, it appears that as age increases, the probability/odds of having a low birth weight baby decreases.

What is the effect of changing x on η, ω, and π? Let's see what happens as x goes to x + ∆x in the single predictor case.

η(x + ∆x) = β0 + β1(x + ∆x) = β0 + β1x + β1∆x = η(x) + β1∆x

So the log odds work the same way as linear regression: changing x by one leads to a change in log odds of β1.

ω(x + ∆x) = e^(β0 + β1(x + ∆x)) = e^(β0 + β1x + β1∆x) = ω(x) × e^(β1∆x) = ω(x) × (e^β1)^∆x

So for this model, changing x has a multiplicative effect on the odds ω. Increasing x by 1 leads to multiplying the odds by e^β1. Increasing x by another 1 leads to another multiplication by e^β1.

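This additive-on-the-log-odds, multiplicative-on-the-odds behaviour is easy to confirm numerically (a Python sketch using the rounded fitted coefficients from the birthwt example):

```python
import math

b0, b1 = 0.385, -0.051        # rounded fitted coefficients

def eta(x):                   # log odds
    return b0 + b1 * x

def odds(x):
    return math.exp(eta(x))

# Log odds change additively, like linear regression:
assert abs((eta(26) - eta(25)) - b1) < 1e-12

# Odds change multiplicatively by e^b1 per unit of x, at any x:
for x in (15, 25, 40):
    assert abs(odds(x + 1) / odds(x) - math.exp(b1)) < 1e-12
```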
Another way of thinking of this is through the odds ratio

ω(x + ∆x) / ω(x) = e^(β1∆x) = (e^β1)^∆x

Note that the difference in odds depends on x (through the odds), as

ω(x + ∆x) − ω(x) = ω(x)(e^(β1∆x) − 1)

So the bigger ω(x), the bigger the absolute difference.

For π there is not a nice relationship, as π(x) has an S shape:

π(x + ∆x) = e^(β0 + β1(x + ∆x)) / (1 + e^(β0 + β1(x + ∆x))) = π(x) + something ugly

As can be seen in the following figure, the change in π(x) depends on π(x), and not in a nice way. However, the biggest changes occur when π(x) ≈ 0.5, and the size of the change decreases as π(x) approaches 0 and 1.

[Figure: two panels plotting the odds ω = e^η and the probability π = e^η / (1 + e^η) against η for η ∈ (−4, 4); a change ∆η = 1 always multiplies the odds by a factor of e ≈ 2.718, while the same ∆η = 1 changes π by 0.231 near η = 0 but by only 0.072 further out in the tails]

These formulas also imply that the sign of β1 indicates whether ω(x) and π(x) increase (β1 > 0) or decrease (β1 < 0) as x increases.

So in the example, since β̂1 = −0.051, older mothers should be less likely to have low birth weight babies (ignoring the effects of other predictors). More precisely, each additional year of age lowers the odds of a low birth weight birth by a factor of e^(−0.051) ≈ 0.95. Actually, there isn't enough evidence to declare age to be statistically significant, but the data does suggest the previous statements.

In the case of multiple predictors, you need to be a bit more careful. If there are no interaction terms in the model, e.g. nothing like

logit(π) = β0 + β1x1 + β2x2 + β12x1x2

then the previous ideas go through if you fix all but one xj and only allow that one to vary.

For example, for the model

logit(π(x1, x2)) = β0 + β1x1 + β2x2

the log odds satisfy

η(x1 + ∆x, x2) = β0 + β1(x1 + ∆x) + β2x2 = β0 + β1x1 + β1∆x + β2x2 = η(x1, x2) + β1∆x

implying that the odds satisfy

ω(x1 + ∆x, x2) = e^(β0 + β1(x1 + ∆x) + β2x2) = e^(β0 + β1x1 + β1∆x + β2x2) = ω(x1, x2) × e^(β1∆x) = ω(x1, x2) × (e^β1)^∆x

However, for the interaction model

logit(π) = β0 + β1x1 + β2x2 + β12x1x2

the log odds satisfy

η(x1 + ∆x, x2) = β0 + β1(x1 + ∆x) + β2x2 + β12(x1 + ∆x)x2
             = β0 + β1x1 + β2x2 + β12x1x2 + ∆x(β1 + β12x2)
             = η(x1, x2) + ∆x(β1 + β12x2)

implying the effect of changing x1 depends on the level of x2, as

ω(x1 + ∆x, x2) = ω(x1, x2) × e^(∆x(β1 + β12x2))

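A small numerical sketch (with made-up coefficients β0, β1, β2, β12, purely for illustration) shows how the odds ratio for x1 shifts with x2 under an interaction:

```python
import math

# Hypothetical coefficients for the interaction model
b0, b1, b2, b12 = -1.0, 0.5, 0.3, 0.2

def odds(x1, x2):
    return math.exp(b0 + b1 * x1 + b2 * x2 + b12 * x1 * x2)

def odds_ratio_x1(x1, x2, dx=1.0):
    return odds(x1 + dx, x2) / odds(x1, x2)

# The ratio matches e^(dx*(b1 + b12*x2)) and so depends on x2 ...
for x2 in (0.0, 1.0, 2.0):
    assert abs(odds_ratio_x1(5.0, x2) - math.exp(b1 + b12 * x2)) < 1e-9
# ... but not on the baseline value of x1:
assert abs(odds_ratio_x1(0.0, 1.0) - odds_ratio_x1(5.0, 1.0)) < 1e-9
```

With b12 = 0 the ratio would collapse back to the constant e^b1 from the no-interaction case.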
**Fitting the Logistic Regression Model**

One approach you might think of for fitting the model would be to do least squares on the data (xi, logit(Yi)). Unfortunately, this approach has some problems:

• If Y ∼ Bin(1, π), then logit(Y) equals −∞ or ∞.

• If X ∼ Bin(n, π) and p̂ = X/n, then

  Var(logit(p̂)) ≈ 1 / (nπ(1 − π))

  so we don't have constant variance.

While there are ways around this (weighted least squares & fudging the Yis), a better approach is maximum likelihood.

**Maximum Likelihood Estimation**

Let's assume that Y1, Y2, ..., Yn are an independent random sample from a population with a distribution described by the density (or pmf if discrete) f(Yi|θ). The parameter θ might be a vector θ = (θ1, θ2, ..., θp). Then the likelihood function is

L(θ) = ∏_{i=1}^n f(Yi|θ)

The maximum likelihood estimate (MLE) of θ is

θ̂ = arg sup_θ L(θ)

i.e. the value of θ that maximizes the likelihood function. One way of thinking of the MLE is that it's the value of the parameter that is most consistent with the data.

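As a toy illustration of this definition (made-up Bernoulli data, not the birthwt set), the likelihood of an i.i.d. Bernoulli(π) sample peaks at the sample proportion:

```python
y = [1, 0, 0, 1, 1, 0, 1, 1]        # made-up 0/1 sample

def likelihood(p):
    L = 1.0
    for yi in y:
        L *= p if yi == 1 else 1 - p
    return L

# Maximize over a fine grid of candidate values for pi
grid = [i / 1000 for i in range(1, 1000)]
p_hat = max(grid, key=likelihood)

assert p_hat == sum(y) / len(y)     # MLE = sample mean = 0.625
```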
Maximum Likelihood Estimation 17 .So for logistic regression. the likelihood function has the form n L(β ) = i=1 n yi πi (1 − πi)1−yi = i=1 n πi 1 − πi yi (1 − πi) = i=1 n yi ωi (1 − πi) = i=1 eβ0+β1xi yi 1 1 + eβ0+β1xi where logit(πi) = log ωi = β0 + β1xi.

One approach to maximizing the likelihood is via calculus, by solving the equations

∂L(θ)/∂θ1 = 0,  ∂L(θ)/∂θ2 = 0,  ...,  ∂L(θ)/∂θp = 0

with respect to the parameter θ. Note that when determining MLEs, it is usually easier to work with the log likelihood function

l(θ) = log L(θ) = ∑_{i=1}^n log f(xi|θ)

It has the same optimum since log is an increasing function, and it is easier to work with since derivatives of sums are usually much nicer than derivatives of products.

Thus we can solve the score equations

∂l(θ)/∂θ1 = ∑_{i=1}^n ∂ log f(xi|θ)/∂θ1 = 0
∂l(θ)/∂θ2 = ∑_{i=1}^n ∂ log f(xi|θ)/∂θ2 = 0
...
∂l(θ)/∂θp = ∑_{i=1}^n ∂ log f(xi|θ)/∂θp = 0

for θ instead.

For logistic regression, the log likelihood function is

l(β) = ∑_{i=1}^n (yi log πi + (1 − yi) log(1 − πi))
     = ∑_{i=1}^n (yi log ωi + log(1 − πi))
     = ∑_{i=1}^n (yi(β0 + β1xi) − log(1 + e^(β0 + β1xi)))

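The different algebraic forms of the logistic log likelihood can be checked against each other numerically (a Python sketch with made-up data and coefficients):

```python
import math

b0, b1 = 0.4, -0.05                       # illustrative coefficients
xs = [15, 20, 25, 30, 35]                 # made-up data
ys = [1, 0, 1, 0, 0]

def pi_i(x):
    e = math.exp(b0 + b1 * x)
    return e / (1 + e)

# Form 1: sum of Bernoulli log pmfs
l1 = sum(y * math.log(pi_i(x)) + (1 - y) * math.log(1 - pi_i(x))
         for x, y in zip(xs, ys))
# Form 2: in terms of the log odds
l2 = sum(y * math.log(pi_i(x) / (1 - pi_i(x))) + math.log(1 - pi_i(x))
         for x, y in zip(xs, ys))
# Form 3: directly in terms of beta
l3 = sum(y * (b0 + b1 * x) - math.log(1 + math.exp(b0 + b1 * x))
         for x, y in zip(xs, ys))

assert abs(l1 - l2) < 1e-10 and abs(l1 - l3) < 1e-10
```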
Normally there are no closed form solutions to these equations, as can be seen from the score equations

∂l(β)/∂β0 = ∑_{i=1}^n (yi − e^(β0 + β1xi) / (1 + e^(β0 + β1xi))) = 0
∂l(β)/∂β1 = ∑_{i=1}^n xi (yi − e^(β0 + β1xi) / (1 + e^(β0 + β1xi))) = 0

These equations will need to be solved by numerical methods such as Newton-Raphson or iteratively reweighted least squares. However, there are some special cases, which we will discuss later, where there are closed form solutions (β̂ is a nice function of the (xi, yi)).

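A bare-bones Newton-Raphson sketch for these two score equations, on made-up data (real fits should of course use glm, which uses iteratively reweighted least squares):

```python
import math

xs = [14, 18, 20, 23, 25, 28, 30, 33, 36, 40]   # made-up predictor values
ys = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]             # made-up 0/1 responses

def probs(b0, b1):
    return [math.exp(b0 + b1 * x) / (1 + math.exp(b0 + b1 * x)) for x in xs]

def score(b0, b1):
    p = probs(b0, b1)
    return (sum(y - pi for y, pi in zip(ys, p)),
            sum(x * (y - pi) for x, y, pi in zip(xs, ys, p)))

def newton_raphson(b0=0.0, b1=0.0, tol=1e-10, max_iter=50):
    for _ in range(max_iter):
        u0, u1 = score(b0, b1)
        w = [pi * (1 - pi) for pi in probs(b0, b1)]   # information weights
        i00 = sum(w)
        i01 = sum(wi * x for wi, x in zip(w, xs))
        i11 = sum(wi * x * x for wi, x in zip(w, xs))
        det = i00 * i11 - i01 * i01
        d0 = (i11 * u0 - i01 * u1) / det              # I^{-1} U for the 2x2 case
        d1 = (i00 * u1 - i01 * u0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

b0_hat, b1_hat = newton_raphson()
u0, u1 = score(b0_hat, b1_hat)
assert abs(u0) < 1e-6 and abs(u1) < 1e-6   # score equations satisfied
```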
**Key Properties of MLEs**

1. MLEs are nearly unbiased (they are consistent).

2. Among approximately unbiased estimators, the MLE has a variance smaller than any other estimator.

3. Var(θ̂) can be estimated. The information matrix I(θ) is a p × p matrix with entries

   Iij = −∂²l(θ) / ∂θi∂θj

   Then the inverse of the observed information matrix satisfies

   Var(θ̂) ≈ I⁻¹(θ̂)

4. Invariance property (transformations of MLEs are MLEs): if θ̂ is the MLE of θ, then g(θ̂) is the MLE of g(θ), for any "nice" function g(·). An example of where this is useful is the estimation of success probabilities in logistic regression:

   π̂(x) = e^(β̂0 + β̂1x) / (1 + e^(β̂0 + β̂1x))

   is the MLE of

   π(x) = e^(β0 + β1x) / (1 + e^(β0 + β1x))

   So the fitted curve on the earlier plot is the MLE of the probabilities of a low birth weight for ages 14 to 45.

5. For large n, the sampling distribution of an MLE is approximately normal. This implies we can get confidence intervals for θi easily.

**Least Squares Versus Maximum Likelihood in Normal Based Regression**

One question some of you may be asking is, if maximum likelihood is so nice, why was least squares used earlier in the book for regression? If Yi|Xi ∼ind N(Xiβ, σ²), i = 1, ..., n, then the least squares and the maximum likelihood estimates of β are exactly the same. The log likelihood function in this case can be written as

l(β, σ²) = −(n/2) log 2π − (n/2) log σ² − (1/(2σ²)) ∑_{i=1}^n (yi − β0 − β1xi1 − ... − βpxip)²

So maximizing this with respect to β is the same as minimizing the least squares criterion.

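The equivalence can be seen numerically: for fixed σ², the normal log likelihood is a decreasing affine function of the SSE, so whichever β lowers the SSE raises the likelihood (a Python sketch with made-up data):

```python
import math

xs = [1.0, 2.0, 3.0, 4.0]      # made-up data
ys = [1.1, 1.9, 3.2, 3.9]
sigma2 = 0.25                  # fixed sigma^2 for the comparison
n = len(xs)

def sse(b0, b1):
    return sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))

def loglik(b0, b1):
    return (-n / 2 * math.log(2 * math.pi)
            - n / 2 * math.log(sigma2)
            - sse(b0, b1) / (2 * sigma2))

good, bad = (0.1, 0.95), (2.0, -1.0)
assert sse(good[0], good[1]) < sse(bad[0], bad[1])
assert loglik(good[0], good[1]) > loglik(bad[0], bad[1])
```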
The one place where there is a slight difference is in estimating σ². The MLE is

σ̃² = SSE/n = ((n − p − 1)/n) σ̂²

where σ̂² = SSE/(n − p − 1) is the usual unbiased method of moments estimator.

**Inference on Individual βs**

As mentioned before, β̂j is approximately normally distributed with mean βj and a variance we can estimate. So we can base our inference on the result

z = (β̂j − βj) / SE(β̂j) ∼approx N(0, 1)

Thus an approximate confidence interval for βj is

β̂j ± z*_{α/2} SE(β̂j)

where z*_{α/2} is the usual normal critical value. While R doesn't give you these CIs directly, it does give you the information needed to calculate them.

Recall the coefficient table from the earlier summary(birthwt.glm) output:

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.38458    0.73212   0.525    0.599
age         -0.05115    0.03151  -1.623    0.105

In addition, you can get the β̂s and standard errors into vectors, and create confidence intervals, by the following code

> betahat <- coef(birthwt.glm)
> betahat
(Intercept)         age
 0.38458192 -0.05115294
> se.betahat <- sqrt(diag(vcov(birthwt.glm)))
> se.betahat
(Intercept)         age
 0.73212479  0.03151376
> me.betahat <- qnorm(0.975) * se.betahat
> ci.betahat <- cbind(Lower=betahat - me.betahat, Upper=betahat + me.betahat)
> ci.betahat   # 95% approximate CIs
                 Lower      Upper
(Intercept) -1.0503563 1.81952014
age         -0.1129188 0.01061290

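The same Wald intervals can be reproduced outside R from the printed estimates and standard errors (a Python sketch; statistics.NormalDist().inv_cdf plays the role of qnorm):

```python
from statistics import NormalDist

# Estimates and standard errors as printed by R
betahat = {"(Intercept)": 0.38458192, "age": -0.05115294}
se = {"(Intercept)": 0.73212479, "age": 0.03151376}

zstar = NormalDist().inv_cdf(0.975)        # same as qnorm(0.975), ~1.96
ci = {name: (betahat[name] - zstar * se[name],
             betahat[name] + zstar * se[name])
      for name in betahat}

# Agrees with R's ci.betahat to printed precision
assert abs(ci["age"][0] - (-0.1129188)) < 1e-5
assert abs(ci["age"][1] - 0.01061290) < 1e-5
assert abs(ci["(Intercept)"][0] - (-1.0503563)) < 1e-5
```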
In addition, testing the hypothesis

H0 : βj = 0   vs   HA : βj ≠ 0

is usually done by Wald's test

z = β̂j / SE(β̂j)

which is compared to the N(0, 1) distribution. This is given in the standard R output with the summary command.

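The z value and p-value printed by summary() can be reproduced directly (a Python sketch starting from the rounded estimate and standard error):

```python
from statistics import NormalDist

betahat_age, se_age = -0.05115, 0.03151    # from the summary output

z = betahat_age / se_age                   # Wald statistic
p = 2 * (1 - NormalDist().cdf(abs(z)))     # two-sided p-value

assert abs(z - (-1.623)) < 1e-3            # matches the printed z value
assert abs(p - 0.105) < 1e-3               # matches the printed Pr(>|z|)
```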
So in this case, age does not appear to be statistically significant (p = 0.105). However, remember there are a number of potential confounders ignored in this analysis, so you may not want to read too much into this.

As in regular regression, inference on the intercept usually is not interesting. In this case β0 gives information about mothers with age = 0, a situation that can't happen.
