Stats Midterm 1 Cheat Sheet
Cheat Sheet
Assumptions for the Regression Model
$y = \beta_0 + \beta_1 x + \varepsilon$
Choose $\beta_0$ and $\beta_1$ to minimize SSE.
SSE = sum of squares of residual errors.
Residual = observed minus expected (fitted) value.
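As a minimal sketch of what "minimize SSE" means in practice, the closed-form least-squares estimates for one predictor can be computed by hand. The function name `fit_line` and the data are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares estimates (b0, b1) and SSE for the line y = b0 + b1*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sxx and Sxy: sums of squared deviations and of cross-products.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    # Residual = observed y minus fitted y.
    residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    sse = sum(r ** 2 for r in residuals)
    return b0, b1, sse

# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
b0, b1, sse = fit_line(xs, ys)
print(b0, b1, sse)
```

Any other choice of intercept and slope gives a larger SSE on the same data.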
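1. $E[\varepsilon \mid x] = 0$. The expected value of the error given x is 0. The mean value of y at a given x value is $\beta_0 + \beta_1 x$.
2. $\mathrm{Var}[\varepsilon \mid x] = \sigma^2$. The variance should be constant and does not depend on x.
3. $\varepsilon \sim$ normal. The distribution of the errors should be normal.
4. $\varepsilon_1, \ldots, \varepsilon_n$ independent of each other. Each error observation should be independent of the others.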
Patterned residual plots violate assumption 1. Increasing spread of residuals violates assumption 2. We can check assumption 3 using a histogram of the residuals or a normal probability plot.
$t = \dfrac{\hat{\beta}_1}{SE(\hat{\beta}_1)}$
CI: $\hat{\beta}_1 \pm t_{\alpha/2} \, SE(\hat{\beta}_1)$
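The slope t statistic and confidence interval can be sketched as follows. This assumes simple linear regression, where $SE(\hat{\beta}_1) = s / \sqrt{S_{xx}}$ and $s^2 = SSE / (n - 2)$. The critical value `t_crit` is passed in as if read from a t table, and the function name and data are hypothetical.

```python
import math

def slope_inference(xs, ys, t_crit):
    """Return (b1, SE(b1), t statistic, CI) for the slope of a simple regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))      # residual standard error, n - 2 df
    se_b1 = s / math.sqrt(sxx)        # standard error of the slope
    t_stat = b1 / se_b1               # tests H0: slope = 0
    ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)
    return b1, se_b1, t_stat, ci

# Made-up data; 3.182 is the table value for 95% confidence with n - 2 = 3 df.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
b1, se_b1, t_stat, ci = slope_inference(xs, ys, t_crit=3.182)
print(b1, se_b1, t_stat, ci)
```

If the interval excludes 0, the slope is significantly different from 0 at the matching significance level.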
At any given value of x, the typical values of y are in the range $(\beta_0 + \beta_1 x) \pm 2\sigma$.
Increasing x by 1 unit is associated with increasing the mean value of y by $\beta_1$ units.
When x = 0, the mean value of y is $\beta_0$.
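In the fitted regression model, if an ad has one million plays, what is the range of typical values for the number of clicks? Plug the x value for plays into the fitted line and take the fitted value $\pm 2s$, where s is the residual standard error (the estimate of $\sigma$), not the SE of the plays coefficient. About 95% of y values with x = 10 are expected to be within that range.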
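The "typical values" rule of thumb (fitted value plus or minus two standard deviations of the error) can be sketched like this. The coefficients and the residual standard error `s` below are hypothetical numbers, not output from any real fit.

```python
def typical_range(b0, b1, s, x):
    """Approximate range holding about 95% of y values at a given x."""
    fitted = b0 + b1 * x          # mean value of y at this x
    return fitted - 2 * s, fitted + 2 * s

# Hypothetical fitted model: intercept 0.05, slope 1.99, residual SE 0.19.
low, high = typical_range(b0=0.05, b1=1.99, s=0.19, x=10)
print(low, high)
```

This range describes individual y values at that x. It is wider than a confidence interval for the mean of y at that x.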
When we interpret the estimated value of $\beta_1$, we say we are 95% confident that for each one-unit increase in x, the expected value of y will increase by an amount between the CI endpoints.
$\alpha$ = significance level. For a hypothesis test, if $H_0$ is true, then P(Type I error) = $\alpha$.
P-value = observed significance level. It tells you how strong the evidence is against $H_0$. If $H_0$ is true and we were to repeat the experiment, P(test stat at least as extreme as observed) = p-value.
Assumptions for Population Mean
SE Mean (standard error of the estimate): $SE_{\bar{x}} = \dfrac{s}{\sqrt{n}}$
For sample means, we use (n − 1) df. REMEMBER: THIS IS IN CONTRAST TO REGRESSION, where simple regression uses (n − 2) df.
$t = \dfrac{\bar{x} - \mu_0}{s / \sqrt{n}}$
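The standard error of the mean and the one-sample t statistic can be computed as in the sketch below. The sample values and the hypothesized mean `mu0` are made up for illustration.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (SE of the mean, t statistic, df) for testing H0: mean = mu0."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)      # sample SD, divides by n - 1
    se = s / math.sqrt(n)             # SE Mean = s / sqrt(n)
    t_stat = (x_bar - mu0) / se
    return se, t_stat, n - 1

# Made-up sample and hypothesized mean.
sample = [5.1, 4.9, 5.6, 5.8, 5.0, 5.4]
se, t_stat, df = one_sample_t(sample, mu0=5.0)
print(se, t_stat, df)
```

The t statistic is then compared against a t distribution with n − 1 degrees of freedom.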
For confidence intervals: if we repeat this experiment many times and build a 95% CI each time, about 95% of those intervals will contain the true population mean.
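A small simulation can make the repeated-sampling interpretation concrete. This sketch assumes normally distributed data with a known true mean, which is available only because the data are simulated. The population parameters are arbitrary, and 2.262 is the t table value for 95% confidence with 9 df.

```python
import math
import random
import statistics

def coverage(true_mean, true_sd, n, t_crit, trials, seed=0):
    """Fraction of simulated CIs for the mean that contain the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        x_bar = statistics.mean(sample)
        se = statistics.stdev(sample) / math.sqrt(n)
        # Count the interval if it captures the true mean.
        if x_bar - t_crit * se <= true_mean <= x_bar + t_crit * se:
            hits += 1
    return hits / trials

rate = coverage(true_mean=50.0, true_sd=10.0, n=10, t_crit=2.262, trials=2000)
print(rate)
```

The printed rate should land close to 0.95. The 95% describes how often the interval procedure captures the true mean, not where individual data values fall.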