5/17/2015

Annotated Stata Output: Multinomial Logistic Regression

Stata Annotated Output
Multinomial Logistic Regression
This page shows an example of a multinomial logistic regression analysis with footnotes explaining the output. The data were collected on 200 high school students and are scores on various tests, including science, math, reading and social studies. The outcome measure in this analysis is socioeconomic status (ses): low, medium and high. We examine what relationships exist between ses and science test scores (science), social science test scores (socst) and gender (female). Our response variable, ses, is treated as categorical under the assumption that the levels of ses have no natural ordering, and we allow Stata to choose the referent group, middle ses. The first half of this page interprets the coefficients in terms of multinomial log-odds (logits) and the second half interprets the coefficients in terms of relative risk ratios.

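use http://www.ats.ucla.edu/stat/data/hsb2, clear
mlogit ses science socst female

Iteration 0:   log likelihood = -210.58254
Iteration 1:   log likelihood = -194.75041
Iteration 2:   log likelihood = -194.03782
Iteration 3:   log likelihood = -194.03485
Iteration 4:   log likelihood = -194.03485

Multinomial logistic regression                   Number of obs   =        200
                                                  LR chi2(6)      =      33.10
                                                  Prob > chi2     =     0.0000
Log likelihood = -194.03485                       Pseudo R2       =     0.0786

------------------------------------------------------------------------------
         ses |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
low          |
     science |  -.0235647   .0209747    -1.12   0.261    -.0646744     .017545
       socst |  -.0389243   .0195165    -1.99   0.046    -.0771759   -.0006726
      female |   .8166202   .3909813     2.09   0.037      .050311    1.582929
       _cons |   1.912256   1.127256     1.70   0.090    -.2971258    4.121638
-------------+----------------------------------------------------------------
high         |
     science |    .022922   .0208718     1.10   0.272    -.0179861    .0638301
       socst |   .0430036   .0198894     2.16   0.031     .0040211     .081986
      female |   -.032862   .3500153    -0.09   0.925    -.7188793    .6531553
       _cons |  -4.057323   1.222939    -3.32   0.001     -6.45424   -1.660407
------------------------------------------------------------------------------
(ses==middle is the base outcome)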

Iteration Log^a

Iteration 0:   log likelihood = -210.58254
Iteration 1:   log likelihood = -194.75041
Iteration 2:   log likelihood = -194.03782
Iteration 3:   log likelihood = -194.03485
Iteration 4:   log likelihood = -194.03485

a. This is a listing of the log likelihoods at each iteration. Remember that multinomial logistic regression, like binary and ordered logistic regression, uses maximum likelihood estimation, which is an iterative procedure. The first iteration (called iteration 0) is the log likelihood of the "null" or "empty" model; that is, a model with no predictors. At the next iteration, the predictor(s) are included in the model. At each iteration, the log likelihood increases because the goal is to maximize the log likelihood. When the difference between successive iterations is very small, the model is said to have "converged", the iterating stops, and the results are displayed. For more information on this process for binary outcomes, see Regression Models for Categorical and Limited Dependent Variables by J. Scott Long (pages 52-61).
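The behaviour described in footnote a can be checked against the iteration log itself. The following is a minimal Python sketch, not Stata's actual algorithm: it only inspects the reported log likelihoods, and the tolerance `tol` is an assumed illustrative value rather than Stata's convergence criterion.

```python
# Log likelihoods copied from the iteration log (iterations 0 through 4).
log_likelihoods = [-210.58254, -194.75041, -194.03782, -194.03485, -194.03485]

# Change in log likelihood between successive iterations.
changes = [b - a for a, b in zip(log_likelihoods, log_likelihoods[1:])]

# Under maximum likelihood the log likelihood should not go down between steps.
never_decreases = all(change >= 0 for change in changes)

# Assumed illustrative tolerance; Stata's own stopping rule is more involved.
tol = 1e-5
converged = abs(changes[-1]) < tol

for i, change in enumerate(changes, start=1):
    print(f"Iteration {i}: change in log likelihood = {change:.5f}")
print("never decreases:", never_decreases, "| converged:", converged)
```

The changes shrink toward zero, which is what "converged" means in the footnote.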

Model Summary

Multinomial logistic regression                   Number of obs^c =        200
                                                  LR chi2(6)^d    =      33.10
                                                  Prob > chi2^e   =     0.0000
Log likelihood = -194.03485^b                     Pseudo R2^f     =     0.0786

b. Log Likelihood - This is the log likelihood of the fitted model. It is used in the Likelihood Ratio Chi-Square test of whether all predictors' regression coefficients in the model are simultaneously zero and in tests of nested models.
c. Number of obs - This is the number of observations used in the multinomial logistic regression. It may be less than the number of cases in the dataset if there are missing values for some variables in the equation. By default, Stata does a listwise deletion of incomplete cases.
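Listwise deletion, as mentioned in footnote c, can be illustrated with a small Python sketch. The records below are hypothetical and are not taken from the hsb2 data; `None` stands in for a missing value.

```python
# Hypothetical records; None marks a missing value.
rows = [
    {"ses": "low", "science": 47, "socst": 57, "female": 0},
    {"ses": "middle", "science": None, "socst": 61, "female": 1},
    {"ses": "high", "science": 63, "socst": 61, "female": 0},
    {"ses": "middle", "science": 53, "socst": None, "female": 1},
]

# Variables that appear in the model equation.
model_vars = ["ses", "science", "socst", "female"]

# Listwise deletion: keep a case only if every model variable is observed.
complete = [row for row in rows if all(row[v] is not None for v in model_vars)]

print(f"{len(complete)} of {len(rows)} observations used")
```

In this made-up example, two of the four cases would be dropped, so "Number of obs" would be smaller than the number of cases in the dataset.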

http://www.ats.ucla.edu/stat/stata/output/stata_mlogit_output.htm

d. LR chi2(6) - This is the Likelihood Ratio (LR) Chi-Square test that for both equations (low ses relative to middle ses and high ses relative to middle ses) at least one of the predictors' regression coefficients is not equal to zero. The number in the parentheses indicates the degrees of freedom of the Chi-Square distribution used to test the LR Chi-Square statistic and is defined by the number of models estimated (2) times the number of predictors in the model (3). The LR Chi-Square statistic can be calculated by -2*(L(null model) - L(fitted model)) = -2*((-210.583) - (-194.035)) = 33.096, where L(null model) is the log likelihood with just the response variable in the model (Iteration 0) and L(fitted model) is the log likelihood from the final iteration (assuming the model converged) with all the parameters.
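The arithmetic in footnote d can be reproduced in a few lines. This is a small Python sketch using the unrounded log likelihoods from the iteration log; the variable names are illustrative.

```python
ll_null = -210.58254    # Iteration 0: model with no predictors
ll_fitted = -194.03485  # final iteration: model with all predictors

# LR chi-square statistic: -2 * (L(null model) - L(fitted model)).
lr_chi2 = -2 * (ll_null - ll_fitted)

# Degrees of freedom: number of equations times number of predictors.
n_equations = 2   # low vs. middle, high vs. middle
n_predictors = 3  # science, socst, female
df = n_equations * n_predictors

print(f"LR chi2({df}) = {lr_chi2:.2f}")
```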
e. Prob > chi2 - This is the probability of getting an LR test statistic as extreme as, or more so than, the observed statistic under the null hypothesis; the null hypothesis is that all of the regression coefficients across both models are simultaneously equal to zero. In other words, this is the probability of obtaining this chi-square statistic (33.10) or one more extreme if there is in fact no effect of the predictor variables. This p-value is compared to a specified alpha level, our willingness to accept a type I error, which is typically set at 0.05 or 0.01. The small p-value from the LR test, < 0.0001, would lead us to conclude that at least one of the regression coefficients in the model is not equal to zero. The parameter of the Chi-Square distribution used to test the null hypothesis is defined by the degrees of freedom in the prior line, chi2(6).
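The p-value in footnote e is the upper-tail probability of a Chi-Square distribution with 6 degrees of freedom. The Python sketch below uses the closed-form tail expression that holds only for an even number of degrees of freedom; the helper name `chi2_upper_tail_even_df` is hypothetical, and this is not necessarily how Stata computes the value internally.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable with an even number of degrees of freedom."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    half = x / 2.0
    # Closed form: exp(-x/2) * sum over i = 0 .. df/2 - 1 of (x/2)^i / i!
    total = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * total

p_value = chi2_upper_tail_even_df(33.10, 6)
alpha = 0.05
print(f"p-value = {p_value:.6f}; reject the null at alpha = {alpha}: {p_value < alpha}")
```

Stata prints this as 0.0000 because it rounds to four decimal places.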
f. Pseudo R2 - This is McFadden's pseudo R-squared. Logistic regression does not have an equivalent to the R-squared that is found in OLS regression; however, many people have tried to come up with one. There are a wide variety of pseudo R-squared statistics. Because this statistic does not mean what R-squared means in OLS regression (the proportion of variance for the response variable explained by the predictors), we suggest interpreting this statistic with great caution.
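McFadden's pseudo R-squared is defined as one minus the ratio of the fitted model's log likelihood to the null model's log likelihood. A short Python sketch with the values from the iteration log reproduces the reported figure:

```python
ll_null = -210.58254    # log likelihood of the model with no predictors
ll_fitted = -194.03485  # log likelihood of the fitted model

# McFadden's pseudo R-squared: 1 - L(fitted model) / L(null model).
pseudo_r2 = 1 - ll_fitted / ll_null

print(f"McFadden pseudo R2 = {pseudo_r2:.4f}")
```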

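Parameter Estimates

------------------------------------------------------------------------------------
       ses^g |    Coef.^h   Std. Err.^j    z^k   P>|z|^k   [95% Conf. Interval]^l
-------------+----------------------------------------------------------------------
low          |
     science |  -.0235647    .0209747     -1.12    0.261     -.0646744     .017545
       socst |  -.0389243    .0195165     -1.99    0.046     -.0771759   -.0006726
      female |   .8166202    .3909813      2.09    0.037       .050311    1.582929
       _cons |   1.912256    1.127256      1.70    0.090     -.2971258    4.121638
-------------+----------------------------------------------------------------------
high         |
     science |    .022922    .0208718      1.10    0.272     -.0179861    .0638301
       socst |   .0430036    .0198894      2.16    0.031      .0040211     .081986
      female |   -.032862    .3500153     -0.09    0.925     -.7188793    .6531553
       _cons |  -4.057323    1.222939     -3.32    0.001      -6.45424   -1.660407
------------------------------------------------------------------------------------
(ses==middle is the base outcome)^i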
g. ses - This is the response variable in the multinomial logistic regression. Underneath ses are two replicates of the predictor variables, representing the two models that are estimated: low ses relative to middle ses and high ses relative to middle ses.
h and i. Coef. and referent group - These are the estimated multinomial logistic regression coefficients and the referent level, respectively, for the model. An important feature of the multinomial logit model is that it estimates k-1 models, where k is the number of levels of the dependent variable. In this instance, Stata, by default, set middle ses as the referent group and therefore estimated a model for low ses relative to middle ses and a model for high ses relative to middle ses. Therefore, since the parameter estimates are relative to the referent group, the standard interpretation of the multinomial logit is that for a unit change in the predictor variable, the logit of outcome m relative to the referent group is expected to change by its respective parameter estimate given the variables in the model are held constant.
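To see how the k-1 equations relate to the three outcome levels, the Python sketch below converts the two sets of coefficients into predicted probabilities for one covariate profile. The profile (a female student scoring 50 on both tests) and the function name are hypothetical choices for illustration; the coefficients are copied from the output.

```python
import math

# Coefficients copied from the output; middle ses is the base outcome.
coefs = {
    "low": {"science": -0.0235647, "socst": -0.0389243, "female": 0.8166202, "_cons": 1.912256},
    "high": {"science": 0.022922, "socst": 0.0430036, "female": -0.032862, "_cons": -4.057323},
}

def predicted_probabilities(science, socst, female):
    # Logit of each non-base outcome relative to middle ses.
    logits = {
        outcome: b["_cons"] + b["science"] * science + b["socst"] * socst + b["female"] * female
        for outcome, b in coefs.items()
    }
    # The base outcome has logit 0, so it contributes exp(0) = 1 to the denominator.
    denom = 1.0 + sum(math.exp(v) for v in logits.values())
    probs = {outcome: math.exp(v) / denom for outcome, v in logits.items()}
    probs["middle"] = 1.0 / denom
    return probs

# Hypothetical profile: a female student scoring 50 on both tests.
probs = predicted_probabilities(science=50, socst=50, female=1)
for outcome in ("low", "middle", "high"):
    print(f"P(ses = {outcome}) = {probs[outcome]:.3f}")
```

Because the base outcome's logit is fixed at zero, two equations are enough to determine all three probabilities, which always sum to one.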
low ses relative to middle ses
science - This is the multinomial logit estimate for a one unit increase in science test score for low ses relative to middle ses given the other variables in the model are held constant. If a subject were to increase their science test score by one point, the multinomial log-odds for low ses relative to middle ses would be expected to decrease by 0.024 unit while holding all other variables in the model constant.
socst - This is the multinomial logit estimate for a one unit increase in socst test score for low ses relative to middle ses given the other variables in the model are held constant. If a subject were to increase their socst test score by one point, the multinomial log-odds for low ses relative to middle ses would be expected to decrease by 0.039 unit while holding all other variables in the model constant.
female - This is the multinomial logit estimate comparing females to males for low ses relative to middle ses given the other variables in the model are held constant. The multinomial logit for females relative to males is 0.817 unit higher for being in low ses relative to middle ses given all other predictor variables in the model are held constant.
_cons - This is the multinomial logit estimate for low ses relative to middle ses when the predictor variables in the model are evaluated at zero. For males (the variable female evaluated at zero) with zero science and socst test scores, the logit for being in low ses versus middle ses is 1.912. Note that evaluating science and socst at zero is out of the range of plausible test scores; if the test scores were mean-centered, the intercept would have a natural interpretation: the log odds of being in low ses versus middle ses for a male with average science and socst test scores.
high ses relative to middle ses
science - This is the multinomial logit estimate for a one unit increase in science test score for high ses relative to middle ses given the other variables in the model are held constant. If a subject were to increase their science test score by one point, the multinomial log-odds for high ses relative to middle ses would be expected to increase by 0.023 unit while holding all other variables in the model constant.
socst - This is the multinomial logit estimate for a one unit increase in socst test score for high ses relative to middle ses given the other variables in the model are held constant. If a subject were to increase their socst test score by one point, the multinomial log-odds for high ses relative to middle ses would be expected to increase by 0.043 unit while holding all other variables in the model constant.
female - This is the multinomial logit estimate comparing females to males for high ses relative to middle ses given the other variables in the model are held constant. The multinomial logit for females relative to males is 0.033 unit lower for being in high ses relative to middle ses given all other predictor variables in the model are held constant.
_cons - This is the multinomial logit estimate for high ses relative to middle ses when the predictor variables in the model are evaluated at zero. For males (the variable female evaluated at zero) with zero science and socst test scores, the logit for being in high ses relative to middle ses is -4.057.
j. Std. Err. - These are the standard errors of the individual regression coefficients for the two respective models estimated. They are used in both the calculation of the z test statistic, superscript k, and the confidence interval of the regression coefficient, superscript l.
k. z and P>|z| - These are the test statistic and p-value, respectively, for the null hypothesis that, within a given model, an individual predictor's regression coefficient is zero given that the rest of the predictors are in the model. The test statistic z is the ratio of the Coef. to the Std. Err. of the respective predictor. The z value follows a standard normal distribution, which is used to test against a two-sided alternative hypothesis that the Coef. is not equal to zero. The probability that a particular z test statistic is as extreme as, or more so than, what has been observed under the null hypothesis is defined by P>|z|. The interpretation of the parameter estimates' significance is given here only for the first equation, low ses relative to middle ses. The interpretation for the second model, high ses relative to middle ses, follows the same pattern.

For low ses relative to middle ses, the z test statistic for the predictor science (-0.024/0.021) is -1.12 with an associated p-value of 0.261. If we set our alpha level to 0.05, we would fail to reject the null hypothesis and conclude that for low ses relative to middle ses, the regression coefficient for science has not been found to be statistically different from zero given socst and female are in the model.
For low ses relative to middle ses, the z test statistic for the predictor socst (-0.039/0.020) is -1.99 with an associated p-value of 0.046. If we again set our alpha level to 0.05, we would reject the null hypothesis and conclude that the regression coefficient for socst has been found to be statistically different from zero for low ses relative to middle ses given that science and female are in the model.
For low ses relative to middle ses, the z test statistic for the predictor female (0.817/0.391) is 2.09 with an associated p-value of 0.037. If we again set our alpha level to 0.05, we would reject the null hypothesis and conclude that the difference between males and females has been found to be statistically different from zero for low ses relative to middle ses given that science and socst are in the model.
For low ses relative to middle ses, the z test statistic for the intercept, _cons (1.912/1.127), is 1.70 with an associated p-value of 0.090. With an alpha level of 0.05, we would fail to reject the null hypothesis and conclude either a) that the multinomial logit for males (the variable female evaluated at zero) with zero science and socst test scores in low ses relative to middle ses is not found to be statistically different from zero, or b) that for males with zero science and socst test scores, we are statistically uncertain whether they are more likely to be classified as low ses or middle ses. We can make the second interpretation when we view the _cons as a specific covariate profile (males with zero science and socst test scores). Based on the direction and significance of the coefficient, the _cons tells whether the profile would have a greater propensity to fall in one of the levels of the dependent variable.
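The z statistics and p-values for the low ses equation can be recomputed from the unrounded coefficients and standard errors in the output. This is a minimal Python sketch; the function name `z_test` is illustrative, and the two-sided p-value uses the standard normal tail via `math.erfc`.

```python
import math

# (coefficient, standard error) for the low ses equation, copied from the output.
low_equation = {
    "science": (-0.0235647, 0.0209747),
    "socst": (-0.0389243, 0.0195165),
    "female": (0.8166202, 0.3909813),
    "_cons": (1.912256, 1.127256),
}

def z_test(coef, std_err):
    z = coef / std_err
    # Two-sided p-value from the standard normal: P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

results = {name: z_test(coef, se) for name, (coef, se) in low_equation.items()}
for name, (z, p) in results.items():
    print(f"{name:8s} z = {z:6.2f}   P>|z| = {p:.3f}")
```

Comparing each p-value to the chosen alpha level (0.05 above) gives the reject / fail-to-reject decisions described in the text.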
l. [95% Conf. Interval] - This is the Confidence Interval (CI) for an individual multinomial logit regression coefficient given the other predictors are in the model for outcome m relative to the referent group. For a given predictor with a level of 95% confidence, we'd say that we are 95% confident that the "true" population multinomial logit regression coefficient lies between the lower and upper limit of the interval for outcome m relative to the referent group. It is calculated as Coef. ± (z_{α/2})*(Std. Err.), where z_{α/2} is a critical value on the standard normal distribution. The CI is equivalent to the z test statistic: if the CI includes zero, we'd fail to reject the null hypothesis that a particular regression coefficient is zero given the other predictors are in the model. An advantage of a CI is that it is illustrative; it provides a range where the "true" parameter may lie.
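The interval formula in footnote l can be sketched in Python using the standard library's `statistics.NormalDist` for the critical value. The function name `wald_ci` is illustrative; the inputs are the coefficients and standard errors for female and socst in the low ses equation.

```python
from statistics import NormalDist

def wald_ci(coef, std_err, level=0.95):
    """Coef. +/- z_{alpha/2} * Std. Err. for a given confidence level."""
    alpha = 1 - level
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for a 95% interval
    half_width = z_crit * std_err
    return coef - half_width, coef + half_width

female_ci = wald_ci(0.8166202, 0.3909813)
socst_ci = wald_ci(-0.0389243, 0.0195165)

print(f"female: [{female_ci[0]:.6f}, {female_ci[1]:.6f}]")
print(f"socst:  [{socst_ci[0]:.6f}, {socst_ci[1]:.6f}]")

# An interval that excludes zero matches a z test that rejects at the same alpha.
print("socst interval excludes zero:", socst_ci[1] < 0)
```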

Relative Risk Ratio Interpretation
The following is the interpretation of the multinomial logistic regression in terms of relative risk ratios and can be obtained by mlogit, rrr after running the multinomial logit model or by specifying the rrr option when the full model is specified. This part of the interpretation applies to the output below.

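mlogit ses science socst female, rrr

Iteration 0:   log likelihood = -210.58254
Iteration 1:   log likelihood = -194.75041
Iteration 2:   log likelihood = -194.03782
Iteration 3:   log likelihood = -194.03485
Iteration 4:   log likelihood = -194.03485

Multinomial logistic regression                   Number of obs   =        200
                                                  LR chi2(6)      =      33.10
                                                  Prob > chi2     =     0.0000
Log likelihood = -194.03485                       Pseudo R2       =     0.0786

--------------------------------------------------------------------------------
         ses |      RRR^a   Std. Err.      z    P>|z|     [95% Conf. Interval]^b
-------------+------------------------------------------------------------------
low          |
     science |   .9767108   .0204862    -1.12   0.261     .9373726      1.0177
       socst |   .9618236   .0187714    -1.99   0.046      .925727    .9993276
      female |   2.262839   .8847276     2.09   0.037     1.051598    4.869199
-------------+------------------------------------------------------------------
high         |
     science |   1.023187   .0213558     1.10   0.272     .9821747    1.065911
       socst |   1.043942   .0207633     2.16   0.031     1.004029    1.085441
      female |   .9676721      .3387    -0.09   0.925     .4872981    1.921595
--------------------------------------------------------------------------------
(ses==middle is the base outcome)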
a. Relative Risk Ratio - These are the relative risk ratios for the multinomial logit model shown earlier. They can be obtained by exponentiating the multinomial logit coefficients, e^coef., or by specifying the rrr option. Recall that the multinomial logit model estimates k-1 models, where each equation is relative to the referent group. Suppose the model is written out in exponentiated form with the predictor of interest evaluated at x+δ and at x for outcome m relative to the referent group, where δ is the change in the predictor we are interested in (δ is traditionally set to one) while the other variables in the model are held constant. Each exponentiated expression is the ratio of two probabilities, the probability of outcome m over the probability of the referent group, which is the relative risk. Taking the ratio of the expression at x+δ to the expression at x leaves e^(coef.*δ), the relative risk ratio. In this sense, the exponentiated multinomial logit coefficient provides an estimate of the relative risk ratio. However, the exponentiated coefficients are commonly interpreted as odds ratios. The standard interpretation of the relative risk ratios is that for a unit change in the predictor variable, the relative risk of outcome m relative to the referent group is expected to change by a factor of the respective parameter estimate given the variables in the model are held constant.
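The claim that the exponentiated coefficient is a ratio of relative risks can be checked numerically. In the Python sketch below, the covariate profile (a male scoring 50 on both tests) and the function names are hypothetical; the coefficients are copied from the logit output. The relative risk of low versus middle ses is computed from predicted probabilities at science = 50 and science = 51, and the ratio of the two is compared with the exponentiated science coefficient.

```python
import math

# Coefficients copied from the logit output (middle ses is the base outcome).
b_low = {"science": -0.0235647, "socst": -0.0389243, "female": 0.8166202, "_cons": 1.912256}
b_high = {"science": 0.022922, "socst": 0.0430036, "female": -0.032862, "_cons": -4.057323}

def linear_predictor(b, science, socst, female):
    return b["_cons"] + b["science"] * science + b["socst"] * socst + b["female"] * female

def relative_risk_low(science, socst, female):
    """P(low ses) / P(middle ses) for one covariate profile."""
    e_low = math.exp(linear_predictor(b_low, science, socst, female))
    e_high = math.exp(linear_predictor(b_high, science, socst, female))
    denom = 1.0 + e_low + e_high
    p_low, p_middle = e_low / denom, 1.0 / denom
    return p_low / p_middle

# Hypothetical profile; delta = 1 point of science, other predictors held constant.
risk_at_x = relative_risk_low(science=50, socst=50, female=0)
risk_at_x_plus_1 = relative_risk_low(science=51, socst=50, female=0)

rrr_from_ratio = risk_at_x_plus_1 / risk_at_x
rrr_from_coef = math.exp(b_low["science"])
print(f"ratio of relative risks = {rrr_from_ratio:.7f}")
print(f"exp(coefficient)        = {rrr_from_coef:.7f}")
```

The two numbers agree regardless of the profile chosen, because everything except the science term cancels in the ratio.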
low ses relative to middle ses
science - This is the relative risk ratio for a one unit increase in science score for low ses relative to middle ses given that the other variables in the model are held constant. If a subject were to increase their science test score by one unit, the relative risk for low ses relative to middle ses would be expected to decrease by a factor of 0.977 given the other variables in the model are held constant. So, given a one unit increase in science, the relative risk of being in the low ses group rather than the middle ses group would be 0.977 times what it was before, when the other variables in the model are held constant. More generally, we can say that if a subject were to increase their science test score, they'd be expected to fall into middle ses as compared to low ses.
socst - This is the relative risk ratio for a one unit increase in socst score for low ses relative to middle ses given that the other variables in the model are held constant. If a subject were to increase their socst test score by one unit, the relative risk for low ses relative to middle ses would be expected to decrease by a factor of 0.962 given the other variables in the model are held constant.
female - This is the relative risk ratio comparing females to males for low ses relative to middle ses given that the other variables in the model are held constant. For females relative to males, the relative risk for low ses relative to middle ses would be expected to increase by a factor of 2.263 given the other variables in the model are held constant.
high ses relative to middle ses
science - This is the relative risk ratio for a one unit increase in science score for high ses relative to middle ses given that the other variables in the model are held constant. If a subject were to increase their science test score by one unit, the relative risk for high ses relative to middle ses would be expected to increase by a factor of 1.023 given the other variables in the model are held constant.
socst - This is the relative risk ratio for a one unit increase in socst score for high ses relative to middle ses given that the other variables in the model are held constant. If a subject were to increase their socst test score by one unit, the relative risk for high ses relative to middle ses would be expected to increase by a factor of 1.043 given the other variables in the model are held constant.
female - This is the relative risk ratio comparing females to males for high ses relative to middle ses given that the other variables in the model are held constant. For females relative to males, the relative risk for high ses relative to middle ses would be expected to decrease by a factor of 0.968 given the other variables in the model are held constant.
b. [95% Conf. Interval] - This is the CI for the relative risk ratio given the other predictors are in the model. For a given predictor with a level of 95% confidence, we'd say that we are 95% confident that the "true" population relative risk ratio comparing outcome m to the referent group lies between the lower and upper limit of the interval. An advantage of a CI is that it is illustrative; it provides a range where the "true" relative risk ratio may lie.
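The relative risk ratio row for female in the low ses equation can be reproduced from the logit-scale results. The Python sketch below assumes that the interval limits are the exponentiated logit-scale limits and that the standard error follows the delta method (RRR times the logit standard error); both assumptions match the printed values, but the variable names are illustrative.

```python
import math

# Logit-scale results for female in the low ses equation, copied from the first output.
coef, std_err = 0.8166202, 0.3909813
ci_lower, ci_upper = 0.050311, 1.582929

# Relative risk ratio: exponentiated coefficient.
rrr = math.exp(coef)

# Assumed delta-method standard error on the RRR scale.
rrr_std_err = rrr * std_err

# Assumed interval: exponentiate the logit-scale limits.
rrr_ci = (math.exp(ci_lower), math.exp(ci_upper))

print(f"RRR = {rrr:.6f}, Std. Err. = {rrr_std_err:.6f}")
print(f"95% CI = [{rrr_ci[0]:.6f}, {rrr_ci[1]:.6f}]")

# On the RRR scale the "no effect" value is 1, not 0.
print("interval excludes 1:", rrr_ci[0] > 1 or rrr_ci[1] < 1)
```

Note that the interval is not symmetric around the relative risk ratio, because it is built on the logit scale and then transformed.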
