
# 427513 Statistics Reference Cheatsheet


07/05/2014

STATISTICS FOR INTRODUCTORY COURSES
STATISTICS - A set of tools for collecting, organizing, presenting, and analyzing numerical facts or observations.
1. Descriptive Statistics - procedures used to organize and present data in a convenient, usable, and communicable form.
2. Inferential Statistics - procedures employed to arrive at broader generalizations or inferences from sample data to populations.
STATISTIC - A number describing a sample characteristic. Results from the manipulation of sample data according to certain specified procedures.
DATA - Characteristics or numbers that are collected by observation.
POPULATION - A complete set of actual or potential observations.
PARAMETER - A number describing a population characteristic; typically inferred from a sample statistic.
SAMPLE - A subset of the population selected according to some scheme.
RANDOM SAMPLE - A subset selected in such a way that each member of the population has an equal opportunity to be selected. Ex. lottery numbers in a fair lottery.
VARIABLE - A phenomenon that may take on different values.
MEAN - The point in a distribution of measurements about which the summed deviations are equal to zero. Average value of a sample or population.
POPULATION MEAN: $\mu = \frac{1}{N}\sum_{i=1}^{N} X_i$
SAMPLE MEAN: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
Note: The mean is very sensitive to extreme measurements that are not balanced on both sides.
WEIGHTED MEAN - Sum of a set of observations multiplied by their respective weights, divided by the sum of the weights:
$\bar{x}_w = \frac{\sum_{i=1}^{G} w_i x_i}{\sum_{i=1}^{G} w_i}$
where $w_i$ = weight, $x_i$ = observation, and $G$ = number of observation groups. Calculated from a population, a sample, or groupings in a frequency distribution. Ex. In the Frequency Distribution below, the mean is 80.3, calculated by using the frequencies for the $w_i$'s. When grouped, use class midpoints for the $x_i$'s.
MEDIAN - Observation or potential observation in a set that divides the set so that the same number of observations lie on each side of it. For an odd number of values, it is the middle value; for an even number, it is the average of the middle two. Ex. In the Frequency Distribution table below, the median is 79.5.
MODE - Observation that occurs with the greatest frequency. Ex. In the Frequency Distribution table below, the mode is 88.
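SUM OF SQUARES (SS) - Deviations from the mean, squared and summed:
Population SS: $\sum (X_i - \mu)^2$ or $\sum X_i^2 - \frac{(\sum X_i)^2}{N}$
Sample SS: $\sum (x_i - \bar{x})^2$ or $\sum x_i^2 - \frac{(\sum x_i)^2}{n}$
VARIANCE - The average of squared differences between observations and their mean.
POPULATION VARIANCE: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (X_i - \mu)^2$
SAMPLE VARIANCE: $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$
VARIANCES FOR GROUPED DATA
POPULATION: $\sigma^2 = \frac{1}{N}\sum_{i=1}^{G} f_i (m_i - \mu)^2$
SAMPLE: $s^2 = \frac{1}{n-1}\sum_{i=1}^{G} f_i (m_i - \bar{x})^2$
where $f_i$ is the frequency and $m_i$ the midpoint of class $i$.
STANDARD DEVIATION - Square root of the variance. Ex. Pop. S.D. $\sigma = \sqrt{\sigma^2}$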
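The sum-of-squares, variance, and standard deviation formulas above can be sketched in Python as follows. The four data values are an illustrative sample chosen for easy arithmetic, and the variable names are assumptions of this sketch rather than notation from the sheet.

```python
import math
import statistics

data = [43, 74, 42, 65]  # illustrative sample (assumption of this sketch)
n = len(data)
mean = sum(data) / n     # 56.0

# Sum of squares, definitional form: sum of squared deviations from the mean
ss_definitional = sum((x - mean) ** 2 for x in data)

# Sum of squares, computational form: sum(x^2) - (sum x)^2 / n
ss_computational = sum(x * x for x in data) - sum(data) ** 2 / n

# Population variance divides by N; sample variance divides by n - 1
pop_variance = ss_definitional / n
sample_variance = ss_definitional / (n - 1)

# Standard deviation is the square root of the variance
pop_sd = math.sqrt(pop_variance)
sample_sd = math.sqrt(sample_variance)

print(ss_definitional, pop_variance, sample_variance)
```

Both forms of the sum of squares give the same value; the standard library functions `statistics.pvariance` and `statistics.variance` can serve as a cross-check on the two divisors.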
BAR GRAPH - A form of graph that uses bars to indicate the frequency of occurrence of observations.
• Histogram - a form of bar graph used with interval or ratio-scaled variables.
  - Interval Scale - a quantitative scale that permits the use of arithmetic operations. The zero point in the scale is arbitrary.
  - Ratio Scale - same as interval scale except that there is a true zero point.
FREQUENCY CURVE - A form of graph representing a frequency distribution in the form of a continuous line that traces a histogram.
• Cumulative Frequency Curve - a continuous line that traces a histogram where bars in all the lower classes are stacked up in the adjacent higher class. It cannot have a negative slope.
• Normal curve - bell-shaped curve.
• Skewed curve - departs from symmetry and tails off at one end.
GROUPING OF DATA
FREQUENCY DISTRIBUTION - Shows the number of times each observation occurs when the values of a variable are arranged in order according to their magnitudes.
GROUPED FREQUENCY DISTRIBUTION - A frequency distribution in which the values of the variable have been grouped into classes.
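FREQUENCY DISTRIBUTION TABLE

| x | f | x | f | x | f | x | f |
|---|---|---|---|---|---|---|---|
| 100 | 1 | 91 | 1 | 82 | 1 | 73 | 3 |
| 99 | 1 | 90 | 2 | 81 | 2 | 72 | 2 |
| 98 | 0 | 89 | 3 | 80 | 1 | 71 | 0 |
| 97 | 0 | 88 | 7 | 79 | 2 | 70 | 4 |
| 96 | 2 | 87 | 1 | 78 | 1 | 69 | 3 |
| 95 | 0 | 86 | 0 | 77 | 3 | 68 | 1 |
| 94 | 0 | 85 | 1 | 76 | 2 | 67 | 2 |
| 93 | 1 | 84 | 5 | 75 | 4 | 66 | 1 |
| 92 | 0 | 83 | 2 | 74 | 3 | 65 | 0 |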
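As a rough check of the mean, median, and mode quoted for the Frequency Distribution table, the following Python sketch recomputes them. The frequencies in `freq` are read from the tally marks in that table and should be treated as an assumption of this sketch; the variable names are likewise illustrative.

```python
import statistics

# Score -> frequency, as read from the Frequency Distribution table (assumed reading)
freq = {
    100: 1, 99: 1, 98: 0, 97: 0, 96: 2, 95: 0, 94: 0, 93: 1, 92: 0,
    91: 1, 90: 2, 89: 3, 88: 7, 87: 1, 86: 0, 85: 1, 84: 5, 83: 2,
    82: 1, 81: 2, 80: 1, 79: 2, 78: 1, 77: 3, 76: 2, 75: 4, 74: 3,
    73: 3, 72: 2, 71: 0, 70: 4, 69: 3, 68: 1, 67: 2, 66: 1, 65: 0,
}

n = sum(freq.values())  # total number of observations

# Weighted mean: each score weighted by its frequency
weighted_mean = sum(x * f for x, f in freq.items()) / n

# Expand the table into a sorted list of raw scores to find the median
scores = sorted(x for x, f in freq.items() for _ in range(f))
median = statistics.median(scores)

# Mode: the score with the greatest frequency
mode = max(freq, key=freq.get)

print(n, round(weighted_mean, 1), median, mode)
```

With this reading of the table the sketch agrees with the values quoted in the definitions (mean 80.3, median 79.5, mode 88).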
CUMULATIVE FREQUENCY DISTRIBUTION - A distribution which shows the total frequency through the upper real limit of each class.
CUMULATIVE PERCENTAGE DISTRIBUTION - A distribution which shows the total percentage through the upper real limit of each class.
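
| CLASS | f | Cum f | % |
|---|---|---|---|
| 65-67 | 3 | 3 | 4.84 |
| 68-70 | 8 | 11 | 17.74 |
| 71-73 | 5 | 16 | 25.81 |
| 74-76 | 9 | 25 | 40.32 |
| 77-79 | 6 | 31 | 50.00 |
| 80-82 | 4 | 35 | 56.45 |
| 83-85 | 8 | 43 | 69.35 |
| 86-88 | 8 | 51 | 82.26 |
| 89-91 | 6 | 57 | 91.94 |
| 92-94 | 1 | 58 | 93.55 |
| 95-97 | 2 | 60 | 96.77 |
| 98-100 | 2 | 62 | 100.00 |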
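
The cumulative frequency and cumulative percentage columns follow mechanically from the class frequencies. A minimal Python sketch, using the class limits and frequencies from the grouped table above (variable names are illustrative assumptions):

```python
from itertools import accumulate

classes = ["65-67", "68-70", "71-73", "74-76", "77-79", "80-82",
           "83-85", "86-88", "89-91", "92-94", "95-97", "98-100"]
freqs = [3, 8, 5, 9, 6, 4, 8, 8, 6, 1, 2, 2]

total = sum(freqs)

# Cumulative frequency: running total through each class
cum_f = list(accumulate(freqs))

# Cumulative percentage: running total as a share of all observations
cum_pct = [round(100 * c / total, 2) for c in cum_f]

# Grouped mean: class midpoints stand in for the individual observations
midpoints = [(int(lo) + int(hi)) / 2 for lo, hi in (c.split("-") for c in classes)]
grouped_mean = sum(f * m for f, m in zip(freqs, midpoints)) / total

for row in zip(classes, freqs, cum_f, cum_pct):
    print(*row)
```

The midpoint-based mean comes out near 80.1, close to but not identical to the 80.3 obtained from the ungrouped scores, which is the usual cost of grouping.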
[Figure: NORMAL CURVE]
[Figure: SKEWED CURVE (LEFT)]

Probability of occurrence of Event A:
$p(A) = \frac{\text{number of outcomes favoring Event } A}{\text{total number of outcomes}}$
SAMPLE SPACE - All possible outcomes of an experiment.
TYPE OF EVENTS
• Exhaustive - two or more events are said to be exhaustive if all possible outcomes are considered. Symbolically, $p(A \text{ or } B \text{ or } \ldots) = 1$.
• Non-Exhaustive - two or more events are said to be non-exhaustive if they do not exhaust all possible outcomes.
• Mutually Exclusive - Events that cannot occur simultaneously: $p(A \text{ and } B) = 0$; and $p(A \text{ or } B) = p(A) + p(B)$. Ex. males, females.
• Non-Mutually Exclusive - Events that can occur simultaneously: $p(A \text{ or } B) = p(A) + p(B) - p(A \text{ and } B)$. Ex. males, brown eyes.
• Independent - Events whose probability is unaffected by occurrence or nonoccurrence of each other: $p(A|B) = p(A)$; $p(B|A) = p(B)$; and $p(A \text{ and } B) = p(A)\,p(B)$. Ex. gender and eye color.
• Dependent - Events whose probability changes depending upon the occurrence or non-occurrence of each other: $p(A|B)$ differs from $p(A)$; $p(B|A)$ differs from $p(B)$; and $p(A \text{ and } B) = p(A)\,p(B|A) = p(B)\,p(A|B)$. Ex. race and eye color.
JOINT PROBABILITIES - Probability that 2 or more events occur simultaneously.
MARGINAL PROBABILITIES or Unconditional Probabilities - summation of probabilities.
CONDITIONAL PROBABILITIES - Probability of A given the existence of S, written $p(A|S)$.
EXAMPLE - Given the numbers 1 to 9 as observations in a sample space:
• Events mutually exclusive and exhaustive - Example: p(all odd numbers); p(all even numbers).
• Events mutually exclusive but not exhaustive - Example: p(an even number); p(the numbers 7 and 5).
• Events neither mutually exclusive nor exhaustive - Example: p(an even number or a 2).
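The event types in the 1-to-9 example can be checked with Python sets. This is a small sketch under the assumption that all nine outcomes are equally likely; the helper names `p`, `mutually_exclusive`, and `exhaustive` are hypothetical.

```python
from fractions import Fraction

sample_space = set(range(1, 10))  # the numbers 1 to 9

def p(event):
    """Probability of an event, assuming equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

def mutually_exclusive(a, b):
    """True when the two events share no outcome."""
    return not (a & b)

def exhaustive(*events):
    """True when the events together cover the whole sample space."""
    return set().union(*events) == sample_space

odd = {1, 3, 5, 7, 9}
even = {2, 4, 6, 8}
seven_and_five = {5, 7}
two = {2}

print(mutually_exclusive(odd, even), exhaustive(odd, even))
print(mutually_exclusive(even, seven_and_five), exhaustive(even, seven_and_five))
print(mutually_exclusive(even, two), exhaustive(even, two))

# Addition rule for non-mutually exclusive events
p_even_or_two = p(even) + p(two) - p(even & two)
```

Using `Fraction` keeps the probabilities exact, so the addition rule can be compared with the directly counted probability of the union without rounding concerns.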
SAMPLING DISTRIBUTION - A theoretical probability distribution of a statistic that would result from drawing all possible samples of a given size from some population.
THE STANDARD ERROR OF THE MEAN
A theoretical standard deviation of sample means of a given sample size, drawn from some specified population.
• When based on a very large, known population, the standard error is: $\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
• When estimated from a sample drawn from a very large population, the standard error is: $s_{\bar{x}} = \frac{s}{\sqrt{n}}$
• The dispersion of sample means decreases as sample size is increased.
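To illustrate how the standard error of the mean shrinks as the sample size grows, here is a minimal Python sketch. The standard deviation of 50 and the sample sizes are illustrative assumptions, and `standard_error` is a hypothetical helper name.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: standard deviation divided by sqrt(n)."""
    return sd / math.sqrt(n)

# Quadrupling the sample size halves the standard error
for n in (25, 100, 400):
    print(n, standard_error(50, n))
```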
RANDOM VARIABLES
A mapping or function that assigns one and only one numerical value to each outcome in an experiment.
DISCRETE RANDOM VARIABLES - Involves rules or probability models for assigning or generating only distinct values (not fractional measurements).
BINOMIAL DISTRIBUTION - A model for the sum of a series of n independent trials where each trial results in a 0 (failure) or 1 (success). Ex. coin toss.
$p(s) = \binom{n}{s}\pi^{s}(1-\pi)^{n-s}$
where $p(s)$ is the probability of s successes in n trials with a constant probability $\pi$ per trial, and where $\binom{n}{s} = \frac{n!}{s!\,(n-s)!}$
Binomial mean: $\mu = n\pi$
Binomial variance: $\sigma^2 = n\pi(1-\pi)$
As n increases, the Binomial approaches the Normal distribution.
HYPERGEOMETRIC DISTRIBUTION - A model for the sum of a series of n trials where each trial results in a 0 or 1 and is drawn from a small population with N elements split between $N_1$ successes and $N_2$ failures. Then the probability of splitting the n trials between $x_1$ successes and $x_2$ failures is:
$p(x_1 \text{ and } x_2) = \dfrac{\frac{N_1!}{x_1!\,(N_1-x_1)!}\cdot\frac{N_2!}{x_2!\,(N_2-x_2)!}}{\frac{N!}{n!\,(N-n)!}}$
Hypergeometric mean: $\mu_1 = E(x_1) = \frac{nN_1}{N}$
and variance: $\sigma^2 = \left[\frac{N-n}{N-1}\right]\left[\frac{nN_1}{N}\right]\left[\frac{N_2}{N}\right]$
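The binomial and hypergeometric formulas can be sketched directly with `math.comb`. The function names and the example parameters below are assumptions made for illustration only.

```python
from math import comb

def binomial_pmf(s, n, pi):
    """Probability of s successes in n independent trials, success probability pi."""
    return comb(n, s) * pi ** s * (1 - pi) ** (n - s)

def hypergeometric_pmf(x1, n, N1, N2):
    """Probability of x1 successes (and n - x1 failures) in n draws without
    replacement from a population of N1 successes and N2 failures."""
    x2 = n - x1
    if x1 < 0 or x2 < 0:
        return 0.0
    return comb(N1, x1) * comb(N2, x2) / comb(N1 + N2, n)

# Five heads in ten tosses of a fair coin
print(binomial_pmf(5, 10, 0.5))

# One success in three draws from 4 successes and 6 failures
print(hypergeometric_pmf(1, 3, 4, 6))

# The binomial mean n*pi, recovered by summing s * p(s)
binomial_mean = sum(s * binomial_pmf(s, 10, 0.3) for s in range(11))
```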
POISSON DISTRIBUTION - A model for the number of occurrences of an event x = 0, 1, 2, ..., when the probability of occurrence is small, but the number of opportunities for the occurrence is large:
$p(x) = \frac{e^{-\lambda}\lambda^{x}}{x!}$
for x = 0, 1, 2, 3, ... and $\lambda > 0$; otherwise $p(x) = 0$.
Poisson mean and variance: $\lambda$.
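A short Python sketch of the Poisson probability function follows. The rate of 2.0 is an arbitrary illustrative value, and `poisson_pmf` is a hypothetical helper name.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """Poisson probability of exactly x occurrences with rate lam.
    Returns 0 outside the support (x < 0 or lam <= 0)."""
    if x < 0 or lam <= 0:
        return 0.0
    return exp(-lam) * lam ** x / factorial(x)

lam = 2.0  # illustrative rate (assumption)

# Probabilities over a long enough range sum to about 1,
# and the mean of the distribution is about lam
total = sum(poisson_pmf(x, lam) for x in range(60))
mean = sum(x * poisson_pmf(x, lam) for x in range(60))
print(total, mean)
```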
For continuous variables, frequencies are expressed in terms of areas under a curve.
CONTINUOUS RANDOM VARIABLES - Variable that may take on any value along an uninterrupted interval of a number line.
NORMAL DISTRIBUTION - bell curve; a distribution whose values cluster symmetrically around the mean (also median and mode).
$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^2 / 2\sigma^2}$
where
$f(x)$ = frequency at a given value
$\sigma$ = standard deviation of the distribution
$\pi$ = approximately 3.1416
$e$ = approximately 2.7183
$\mu$ = the mean of the distribution
$x$ = any score in the distribution
STANDARD NORMAL DISTRIBUTION - A normal random variable Z that has a mean of 0 and standard deviation of 1.
Z-VALUES - The number of standard deviations a specific observation lies from the mean: $z = \frac{x-\mu}{\sigma}$
LEVEL OF SIGNIFICANCE - A probability value considered rare in the sampling distribution specified under the null hypothesis where one is willing to acknowledge the operation of chance factors. Common significance levels are 1%, 5%, 10%. Alpha ($\alpha$) level = the lowest level for which the null hypothesis can be rejected. The significance level determines the critical region.
NULL HYPOTHESIS ($H_0$) - A statement that specifies hypothesized value(s) for one or more of the population parameters. [Ex. $H_0$: a coin is unbiased. That is, p = 0.5.]
ALTERNATIVE HYPOTHESIS ($H_1$) - A statement that specifies that the population parameter is some value other than the one specified under the null hypothesis. [Ex. $H_1$: a coin is biased. That is, p ≠ 0.5.]
1. NONDIRECTIONAL HYPOTHESIS - an alternative hypothesis ($H_1$) that states only that the population parameter is different from the one specified under $H_0$. Ex. $H_1$: $\mu \neq \mu_0$. Two-Tailed Probability Value is employed when the alternative hypothesis is non-directional.
2. DIRECTIONAL HYPOTHESIS - an alternative hypothesis that states the direction in which the population parameter differs from the one specified under $H_0$. Ex. $H_1$: $\mu > \mu_0$ or $H_1$: $\mu < \mu_0$. One-Tailed Probability Value is employed when the alternative hypothesis is directional.
NOTION OF INDIRECT PROOF - Strict interpretation of hypothesis testing reveals that the null hypothesis can never be proved. [Ex. If we toss a coin 200 times and tails comes up 100 times, it is no guarantee that heads will come up exactly half the time in the long run; small discrepancies might exist. A bias can exist even at a small magnitude. We can make the assertion, however, that NO BASIS EXISTS FOR REJECTING THE HYPOTHESIS THAT THE COIN IS UNBIASED. (The null hypothesis is not rejected.) When employing the 0.05 level of significance, reject the null hypothesis when a given result occurs by chance 5% of the time or less.]
TWO TYPES OF ERRORS
- Type I Error (Type $\alpha$ Error) = the rejection of $H_0$ when it is actually true. The probability of a type I error is given by $\alpha$.
- Type II Error (Type $\beta$ Error) = the acceptance of $H_0$ when it is actually false. The probability of a type II error is given by $\beta$.
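The normal density and z-value formulas in this section can be sketched in Python as below. The helper names and the example values (mean 250, standard deviation 50) are assumptions for illustration; `statistics.NormalDist` from the standard library is used only as a cross-check.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Height of the normal curve at x for mean mu and standard deviation sigma."""
    exponent = -((x - mu) ** 2) / (2 * sigma ** 2)
    return math.exp(exponent) / (sigma * math.sqrt(2 * math.pi))

def z_value(x, mu, sigma):
    """Number of standard deviations that x lies from the mean."""
    return (x - mu) / sigma

# Cross-check the hand-written density against the standard library
print(normal_pdf(260, 250, 50), NormalDist(250, 50).pdf(260))

# An observation of 300 lies one standard deviation above a mean of 250
print(z_value(300, 250, 50))
```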
CENTRAL LIMIT THEOREM (for sample mean $\bar{x}$)
If $x_1, x_2, x_3, \ldots, x_n$ is a simple random sample of n elements from a large (infinite) population, with mean $\mu$ and standard deviation $\sigma$, then the distribution of $\bar{x}$ takes on the bell-shaped distribution of a normal random variable as n increases, and the distribution of the ratio
$\frac{\bar{x}-\mu}{\sigma/\sqrt{n}}$
approaches the standard normal distribution as n goes to infinity. In practice, a normal approximation is acceptable for samples of 30 or larger.

Percentage Cumulative Distribution for selected Z values under a normal curve

| Z-value | -3 | -2 | -1 | 0 | +1 | +2 | +3 |
|---|---|---|---|---|---|---|---|
| Percentile Score | 0.13 | 2.28 | 15.87 | 50.00 | 84.13 | 97.72 | 99.87 |
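
The percentile scores for the selected Z values can be reproduced with the standard normal cumulative distribution function. A minimal sketch using `statistics.NormalDist` (available in Python 3.8+); the variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Percentage of the area under the curve lying below each Z value
percentiles = {z: round(100 * std_normal.cdf(z), 2) for z in range(-3, 4)}

for z, pct in percentiles.items():
    print(f"{z:+d}  {pct:.2f}")
```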

[Figure: Critical region for rejection of $H_0$ when $\alpha = 0.01$, two-tailed test; boundaries at z = -2.58 and z = +2.58, centered on 0]
USED WHEN THE STANDARD DEVIATION IS KNOWN: When $\sigma$ is known it is possible to describe the form of the distribution of the sample mean as a Z statistic. The sample must be drawn from a normal distribution or have a sample size (n) of at least 30.
$z = \frac{\bar{x}-\mu}{\sigma_{\bar{x}}}$
where $\mu$ = population mean (either known or hypothesized under $H_0$) and $\sigma_{\bar{x}} = \sigma/\sqrt{n}$.
• Critical Region - the portion of the area under the curve which includes those values of a statistic that lead to the rejection of the null hypothesis.
  - The most often used significance levels are 0.01, 0.05, and 0.1. For a one-tailed test using the z-statistic, these correspond to z-values of 2.33, 1.65, and 1.28 respectively. For a two-tailed test, the critical region of 0.01 is split into two equal outer areas marked by z-values of |2.58|.
Example 1. Given a population with $\mu = 250$ and $\sigma = 50$, what is the probability of drawing a sample of n = 100 values whose mean ($\bar{x}$) is at least 255? In this case, z = 1.00. Looking at Table A, the given area for z = 1.00 is 0.3413. To its right is 0.1587 (= 0.5 - 0.3413) or 15.87%.
Conclusion: there are approximately 16 chances in 100 of obtaining a sample mean of at least 255 from this population when n = 100.
Example 2. Assume we do not know the population mean. However, we suspect that the sample may have been selected from a population with $\mu = 250$ and $\sigma = 50$, but we are not sure. The hypothesis to be tested is whether the sample mean was selected from this population. Assume we obtained, from a sample (n) of 100, a sample mean of 263. Is it reasonable to assume that this sample was drawn from the suspected population?
1. $H_0$: $\mu = 250$ (that the actual mean of the population from which the sample is drawn is equal to 250). $H_1$: $\mu$ not equal to 250 (the alternative hypothesis is that it is greater than or less than 250, thus a two-tailed test).
2. The z-statistic will be used because the population $\sigma$ is known.
3. Assume the significance level ($\alpha$) to be 0.01. Looking at Table A, we find that the area beyond a z of 2.58 is approximately 0.005. To reject $H_0$ at the 0.01 level of significance, the absolute value of the obtained z must be equal to or greater than $|z_{0.01}|$ or 2.58. Here the value of z corresponding to sample mean 263 is 2.60.
CONCLUSION - Since this obtained z falls within the critical region, we may reject $H_0$ at the 0.01 level of significance.
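Examples 1 and 2 can be worked through in a few lines of Python. This is a sketch only: `z_statistic` is a hypothetical helper, and the critical value is computed with `statistics.NormalDist` rather than looked up in Table A, so it comes out as about 2.576 instead of the rounded 2.58.

```python
import math
from statistics import NormalDist

def z_statistic(sample_mean, mu, sigma, n):
    """z = (sample mean - mu) / (sigma / sqrt(n))."""
    return (sample_mean - mu) / (sigma / math.sqrt(n))

# Example 1: probability that a sample of 100 has a mean of at least 255
z1 = z_statistic(255, 250, 50, 100)
p_at_least_255 = 1 - NormalDist().cdf(z1)   # area to the right of z1

# Example 2: two-tailed test of H0: mu = 250 at the 0.01 level
z2 = z_statistic(263, 250, 50, 100)
alpha = 0.01
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # leaves alpha/2 in each tail
reject_h0 = abs(z2) >= z_crit

print(z1, round(p_at_least_255, 4))
print(z2, round(z_crit, 3), reject_h0)
```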
Table A. Normal Curve Areas (area from mean to z)

| z | .00 | .01 | .02 | .03 | .04 | .05 | .06 | .07 | .08 | .09 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.0 | .0000 | .0040 | .0080 | .0120 | .0160 | .0199 | .0239 | .0279 | .0319 | .0359 |
| 0.1 | .0398 | .0438 | .0478 | .0517 | .0557 | .0596 | .0636 | .0675 | .0714 | .0753 |
| 0.2 | .0793 | .0832 | .0871 | .0910 | .0948 | .0987 | .1026 | .1064 | .1103 | .1141 |
| 0.3 | .1179 | .1217 | .1255 | .1293 | … | … | … | … | … | … |
| 0.4 | .1554 | .1591 | .1628 | .1664 | … | … | … | … | … | … |
| … | … | … | … | … | … | … | … | … | … | … |
| 1.6 | .4452 | .4463 | .4474 | .4484 | .4495 | .4505 | .4515 | .4525 | .4535 | .4545 |
| 1.7 | .4554 | .4564 | .4573 | .4582 | .4591 | .4599 | .4608 | .4616 | .4625 | .4633 |
| 1.8 | .4641 | .4649 | .4656 | .4664 | .4671 | .4678 | .4686 | .4693 | .4699 | .4706 |
| 1.9 | .4713 | .4719 | .4726 | .4732 | .4738 | .4744 | .4750 | .4756 | .4761 | .4767 |
| 2.0 | .4772 | .4778 | .4783 | .4788 | .4793 | .4798 | .4803 | .4808 | .4812 | .4817 |
| 2.1 | .4821 | .4826 | .4830 | .4834 | .4838 | .4842 | .4846 | .4850 | .4854 | .4857 |
| 2.2 | .4861 | .4864 | .4868 | .4871 | .4875 | .4878 | .4881 | .4884 | .4887 | .4890 |
| 2.3 | .4893 | .4896 | .4898 | .4901 | .4904 | .4906 | .4909 | .4911 | .4913 | .4916 |
| 2.4 | .4918 | .4920 | .4922 | .4925 | .4927 | .4929 | .4931 | .4932 | .4934 | .4936 |
| 2.5 | .4938 | .4940 | .4941 | .4943 | .4945 | .4946 | .4948 | .4949 | .4951 | .4952 |
| 2.6 | .4953 | .4955 | .4956 | .4957 | .4959 | .4960 | .4961 | .4962 | .4963 | .4964 |
| 2.7 | .4965 | .4966 | .4967 | .4968 | .4969 | .4970 | .4971 | .4972 | .4973 | .4974 |
| 2.8 | .4974 | .4975 | .4976 | .4977 | .4977 | .4978 | .4979 | .4979 | .4980 | .4981 |
| 2.9 | .4981 | .4982 | .4982 | .4983 | .4984 | .4984 | .4985 | .4985 | .4986 | .4986 |
| 3.0 | .4987 | .4987 | .4987 | .4988 | .4988 | .4989 | .4989 | .4989 | .4990 | .4990 |
USED WHEN THE STANDARD DEVIATION IS UNKNOWN - Use of Student's t. When $\sigma$ is not known, its value is estimated from sample data.
• t-ratio - the ratio employed in the testing of hypotheses or determining the significance of a difference between means (two-sample case) involving a sample with a t-distribution. The formula is:
$t = \frac{\bar{x}-\mu}{s_{\bar{x}}}$
where $\mu$ = population mean under $H_0$ and $s_{\bar{x}} = s/\sqrt{n}$.
• Distribution - symmetrical distribution with a mean of zero and standard deviation that approaches one as degrees of freedom increases (i.e., approaches the Z distribution).
• Assumption and condition required in assuming t-distribution: Samples are drawn from a normally distributed population and $\sigma$ (population standard deviation) is unknown.
• Homogeneity of Variance - If 2 samples are being compared, the assumption in using the t-ratio is that the variances of the populations from where the samples are drawn are equal.
• Estimated $\sigma_{\bar{x}_1-\bar{x}_2}$ (that is, $s_{\bar{x}_1-\bar{x}_2}$) is based on the unbiased estimate of the population variance.
• Degrees of Freedom (df) - the number of values that are free to vary after placing certain restrictions on the data.
Example. The sample (43, 74, 42, 65) has n = 4. The sum is 224 and the mean is 56. Using these 4 numbers and determining deviations from the mean, we'll have 4 deviations, namely (-13, 18, -14, 9), which sum up to zero. Deviations from the mean is one restriction we have imposed, and the natural consequence is that the sum of these deviations should equal zero. For this to happen, we can choose any number, but our freedom to choose is limited to only 3 numbers because one is restricted by the requirement that the sum of the deviations should equal zero. We use the equality:
$(x_1-\bar{x}) + (x_2-\bar{x}) + (x_3-\bar{x}) + (x_4-\bar{x}) = 0$
So given a mean of 56, if the first 3 observations are 43, 74, and 42, the last observation has to be 65. This single restriction in this case helps us determine df. The formula is n less the number of restrictions. In this case, it is n - 1 = 4 - 1 = 3 df.
• t-Ratio is a robust test - This means that statistical inferences are likely valid despite fairly large departures from normality in the population distribution. If normality of the population distribution is in doubt, it is wise to increase the sample size.
Example. Given $\bar{x} = 108$, s = 15, and n = 26, estimate a 95% confidence interval for the population mean. Since the population variance is unknown, the t-distribution is used. The resulting interval, using a t-value of 2.060 from Table B (row 25 of the middle column), is approximately 102 to 114. Consequently, any hypothesized $\mu$ between 102 and 114 is tenable on the basis of this sample. Any hypothesized $\mu$ below 102 or above 114 would be rejected at 0.05 significance.
COMPARISON BETWEEN t AND z DISTRIBUTIONS
Although both distributions are symmetrical about a mean of zero, the t-distribution is more spread out than the normal distribution (z-distribution). Thus a much larger value of t is required to mark off the bounds of the critical region of rejection. As df increases, differences between z- and t-distributions are reduced. Table A (z) may be used instead of Table B (t) when n > 30. To use either table when n < 30, the sample must be drawn from a normal population.
UNBIASEDNESS - Property of a reliable estimator being estimated.
• Unbiased Estimate of a Parameter - an estimate that equals on the average the value of the parameter. Ex. the sample mean is an unbiased estimator of the population mean.
• Biased Estimate of a Parameter - an estimate that does not equal on the average the value of the parameter. Ex. the sample variance calculated with n is a biased estimator of the population variance; however, when calculated with n - 1 it is unbiased.
STANDARD ERROR - The standard deviation of the estimator is called the standard error. Ex. The standard error for $\bar{x}$'s is $\sigma_{\bar{x}} = \sigma/\sqrt{n}$. This has to be distinguished from the STANDARD DEVIATION OF THE SAMPLE: $s = \sqrt{\frac{\sum (x_i-\bar{x})^2}{n-1}}$
The standard error measures the variability in the $\bar{x}$'s around their expected value $E(\bar{X})$, while the standard deviation of the sample reflects the variability in the sample around the sample's mean ($\bar{x}$).
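CONFIDENCE INTERVAL - Interval within which we may consider a hypothesis tenable. Common confidence intervals are 90%, 95%, and 99%. Confidence Limits: limits defining the confidence interval.
$(1-\alpha)100\%$ confidence interval for $\mu$:
$\bar{x} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) < \mu < \bar{x} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$
where $z_{\alpha/2}$ is the value of the standard normal variable Z that puts $\alpha/2$ percent in each tail of the distribution. The confidence interval is the complement of the critical regions.
(A t-statistic may be used in place of the z-statistic when $\sigma$ is unknown and s must be used as an estimate. But note the caution in that section.)

Table B. Level of significance for one-tailed test (0.025, 0.01, 0.005) and two-tailed test (0.05, 0.02, 0.01)

| df | one-tailed 0.025 (two-tailed 0.05) | one-tailed 0.01 (two-tailed 0.02) | one-tailed 0.005 (two-tailed 0.01) |
|---|---|---|---|
| 1 | 12.706 | 31.821 | 63.657 |
| 2 | 4.303 | 6.965 | 9.925 |
| 3 | 3.182 | 4.541 | 5.841 |
| 4 | 2.776 | 3.747 | 4.604 |
| 5 | 2.571 | 3.365 | 4.032 |
| 6 | 2.447 | 3.143 | 3.707 |
| 7 | 2.365 | 2.998 | 3.499 |
| 8 | 2.306 | 2.896 | 3.355 |
| 9 | 2.262 | 2.821 | 3.250 |
| 10 | 2.228 | 2.764 | 3.169 |
| 11 | 2.201 | 2.718 | 3.106 |
| 12 | 2.179 | 2.681 | 3.055 |
| 13 | 2.160 | 2.650 | 3.012 |
| 14 | 2.145 | 2.624 | 2.977 |
| 15 | 2.131 | 2.602 | 2.947 |
| 16 | 2.120 | 2.583 | 2.921 |
| 17 | 2.110 | 2.567 | 2.898 |
| 18 | 2.101 | 2.552 | 2.878 |
| 19 | 2.093 | 2.539 | 2.861 |
| 20 | 2.086 | 2.528 | 2.845 |
| 21 | 2.080 | 2.518 | 2.831 |
| 22 | 2.074 | 2.508 | 2.819 |
| 23 | 2.069 | 2.500 | 2.807 |
| 24 | 2.064 | 2.492 | 2.797 |
| 25 | 2.060 | 2.485 | 2.787 |
| 26 | 2.056 | 2.479 | 2.779 |
| 27 | 2.052 | 2.473 | 2.771 |
| 28 | 2.048 | 2.467 | 2.763 |
| 29 | 2.045 | 2.462 | 2.756 |
| 30 | 2.042 | 2.457 | 2.750 |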
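
The confidence interval formula in this section, together with its t-based variant, can be sketched in Python. The function names are hypothetical. The z interval reuses the numbers from the z-test example (mean 263, $\sigma$ = 50, n = 100) as an assumption of this sketch, and the t interval takes its critical value 2.060 as a hard-coded lookup from Table B (df = 25) because the standard library has no t quantile function.

```python
import math
from statistics import NormalDist

def z_confidence_interval(mean, sigma, n, confidence=0.95):
    """Interval mean +/- z * sigma / sqrt(n), for a known population sigma."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # puts alpha/2 in each tail
    margin = z * sigma / math.sqrt(n)
    return mean - margin, mean + margin

def t_confidence_interval(mean, s, n, t_crit):
    """Interval mean +/- t * s / sqrt(n); t_crit must be looked up for df = n - 1."""
    margin = t_crit * s / math.sqrt(n)
    return mean - margin, mean + margin

# 99% z interval around a sample mean of 263 (sigma = 50, n = 100)
z_low, z_high = z_confidence_interval(263, 50, 100, confidence=0.99)

# 95% t interval for mean 108, s = 15, n = 26, with t = 2.060 from Table B
t_low, t_high = t_confidence_interval(108, 15, 26, 2.060)

print(round(z_low, 2), round(z_high, 2))
print(round(t_low, 2), round(t_high, 2))
```

The 99% z interval excludes 250, which matches the rejection of $H_0$: $\mu = 250$ at the 0.01 level and illustrates the interval being the complement of the critical regions. The t interval rounds to the 102 to 114 range quoted in the t example.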