
STA 201 (STATISTICS FOR PHYSICAL SCIENCES AND ENGINEERING)

LECTURE NOTE
INTRODUCTION
Definition
Statistics is the science of learning from experience, especially experience that arrives a little bit at a time. This century has seen statistical techniques become the analytic methods of choice in biomedical science, genetic studies, epidemiology, agricultural science and other areas. The word "statistics" covers both statistical data and statistical method. When it means statistical data, it refers to numerical descriptions of quantitative aspects of things. These descriptions may be in the form of counts or measurements. Thus the statistics of the students of a faculty of science include counts of the number of students, such as males and females, married and unmarried, or postgraduates and undergraduates. They may also include measurements such as height, weight and IQ (Intelligence Quotient). Statistics is broadly divided into descriptive and inferential statistics.
Population, Sample and Model
The data in medical, biomedical, nutritional or agricultural studies are generally based on individual observations. These are observations or measurements taken on the smallest sampling unit. The smallest sampling units are frequently, but not necessarily, also individuals in the biological sense.
Population: The population or universe is well defined in the science of statistics. Although the biological definition of "population" is the totality of individuals of a given species at a given time in a given area, in statistics a population always means the totality of the individual observations about which inferences are to be made. A population may refer to variables of a concrete collection of objects or creatures, such as the weights or tail lengths of all albino rats, anthropometric measurements and haemoglobin or serum protein levels of adults, and the nutrient contents of varieties of foods.
Sample: A sample is a part of the population. A large number of samples may be taken from the same population, and still not all members may be covered. Inferences drawn from the sample refer to the defined population from which the sample or samples are drawn.
For example, the drop of blood examined in the laboratory is a sample from the "population" of all blood in the body.
Model: Inductive inference is based on the assumption that the values in the population under study are scattered according to a certain pattern. This pattern is modelled by a probability distribution.
For example, "the heights of students in the Faculty of Science, Olabisi Onabanjo University, follow the normal distribution with mean 120 cm and standard deviation 10 cm" is the specification of a probability (or stochastic) model.
Fitting a probability model to the values of a population is done by specifying the probability distribution of the underlying random variable. A probability model is fitted for the following reasons:
a. It may be used to describe the population.
b. It may be used to predict some future value.
c. Usually a probability model is fitted as a first step towards choosing one among a set of several possible actions.
d. Sometimes the value of the parameter may be of independent interest.
Data Collection

Data form the bedrock on which statistical analysis relies. Data collection is an activity aimed at getting information to satisfy some decisions or objectives. The process of collecting data varies and depends upon the kind of data to be collected.

Sources of Data
Basically, there are two major sources of data: primary and secondary sources.
Primary Sources
This refers to statistical data or information which the investigator originates himself for the purpose of the enquiry at hand. Examples are censuses, surveys and experiments.
Advantages
i. It allows detailed and accurate information to be collected.
ii. It is more reliable.
iii. The method of data collection and the level of accuracy are known.
Disadvantages
i. Often time consuming.
ii. More expensive.
Secondary Sources
This refers to statistical data which are not originated by the investigator himself, but which he obtains from someone else's records or from some organization, in either published or unpublished form. Examples include publications of the Federal Office of Statistics (FOS), the Central Bank of Nigeria (CBN), the National Population Commission (NPC), the World Health Organization (WHO), etc.
Advantages
i. Not expensive.
ii. Not time consuming.
iii. Very easy to collect, especially in a computerized organization.
Disadvantages
i. The information may be misleading.
ii. It may not allow detailed and accurate information to be collected.

Methods of Data Collection
Three methods of collecting data are:
i. Postal questionnaires
ii. Personal interviews
iii. Telephone interviews
Questionnaires
A questionnaire contains a sequence of questions relevant to the data or information being sought. It is a formal set of prepared questions to be answered by the respondent. Questionnaires usually have two parts. Part one is the classification section; it contains details of the respondent such as sex, age, marital status, occupation, state of origin, etc. The second part relates to the subject matter of the enquiry.
Types of Questionnaire
a. Closed-ended questionnaire: this is a questionnaire designed in such a way that respondents are limited to stated alternatives or options, so that no further or additional explanation is permitted. It is also called a structured questionnaire.
b. Open-ended questionnaire: this is an unstructured questionnaire which leaves respondents free to make whatever reply they choose, that is, the respondents are not in any way restricted to options.

Qualities of a Good Questionnaire
1. Questions should be simple and easily understood.
2. They should be in logical sequence.
3. They should be short and unambiguous.
4. Questions should not be offensive, frightening or leading. Questions that may arouse the resentment of the respondents should be avoided.
5. Questions should not require calculations to be made.
6. Questions should admit a precise answer such as "Yes" or "No".
7. Questions that rely too much on memory should be avoided, since some people forget events too soon.
Editing

This is a way of checking the answered questionnaires to correct some of the mistakes. The returned questionnaires, filled in by the informants or by enumerators, should be scrutinized at an early stage with a view to detecting errors, omissions and inconsistencies. The work of editing requires skill and a high degree of scientific impartiality. The four types of editing are editing for consistency, uniformity, completeness and accuracy.
Coding

The responses in the edited questionnaires are then translated into numerical terms in order to facilitate analysis. This is done by setting out a list of codes for the possible responses to the questions.
Tabulation and Classification

This is the act of arranging facts and figures in the form of table(s) or lists. In order to make the data easily understandable, the first task of the statistician is to condense and simplify them in such a manner that irrelevant details are eliminated and their significant features stand out prominently. The procedure adopted for this purpose is known as the method of classification and tabulation.
Data Presentation

This is the representation of data in an appropriate form, through charts, diagrams or graphs, in order to make comparison and understanding easy. No matter how informative and well designed a statistical table is, as a medium for conveying to the reader an immediate and clear impression of its content it is inferior to a good chart, diagram or graph. The most popular charts, diagrams and graphs are pie charts, bar diagrams (bar charts and histograms) and graphs (frequency polygons and ogives).
Pie Charts

A pie chart is simply a circle divided into sectors. The circle represents the total of the data being presented, and each sector is drawn proportional to its relative size. The main advantage of a pie chart is that it is easy to understand.
Example
An investigation of the marital status of the staff of an institution reveals the following:
Marital status   No. of staff
Single           35
Married          130
Widowed          25
Divorced         10
Draw a pie chart using the above information.
Solution
The total number of staff in the institution is
35 + 130 + 25 + 10 = 200
The angle corresponding to each status is found thus:
\[ \text{Single} = \frac{35}{200} \times 360^\circ = 63^\circ \]
\[ \text{Married} = \frac{130}{200} \times 360^\circ = 234^\circ \]
\[ \text{Widowed} = \frac{25}{200} \times 360^\circ = 45^\circ \]
\[ \text{Divorced} = \frac{10}{200} \times 360^\circ = 18^\circ \]

Thus, the pie chart is:
[Figure: pie chart titled "Total Number of Staff in the Institution" with sectors Married 234° (65%), Single 63° (18%), Widowed 45° (13%) and Divorced 18° (5%).]

Observation: the chart clearly shows that the majority of the staff in the institution are married.
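The angle calculation above can be sketched in a few lines of Python. This is only an illustration: the function name `pie_angles` is hypothetical, and the counts are those of the marital-status table.

```python
# Staff counts taken from the marital-status table in the example.
counts = {"Single": 35, "Married": 130, "Widowed": 25, "Divorced": 10}

def pie_angles(counts):
    """Return the sector angle in degrees for each category."""
    total = sum(counts.values())
    # Each sector is proportional to its share of the total.
    return {label: value / total * 360 for label, value in counts.items()}

angles = pie_angles(counts)
for label, angle in angles.items():
    print(f"{label}: {angle:.0f} degrees")
```

The angles always add up to 360 degrees, which is a quick check on the arithmetic.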
Bar Charts
Bar charts may be simple, multiple or component in nature. A simple bar chart consists of a number of equally spaced rectangles.
A multiple bar chart is usually used in the comparison of two or more attributes.
A component bar chart consists of bars which are subdivided into components.
Example
Represent the data used above in a bar chart.
Solution:
[Figure: simple bar chart of the number of staff by marital status; vertical axis: No. of staff.]

Example
The sex distribution of staff in five departments of the Faculty of Science is given below.
S/No   Department             Male   Female   Total
i.     Chemical Science       25     15       40
ii.    Mathematical Science   65     30       95
iii.   Biological Science     45     40       85
iv.    Physics                35     15       50
v.     Earth Science          30     10       40
       Total                  200    110      310
Present the above information on a
i.) Multiple bar chart
ii.) Component bar chart.

Solution:
i.) Multiple bar chart
[Figure: multiple bar chart with one Male bar and one Female bar per department.]

ii.) Component bar chart
[Figure: component bar chart with one bar per department, subdivided into Male and Female.]

HISTOGRAMS
Histograms and bar charts look alike in presentation, but while the bars of a bar chart are usually not joined, those of a histogram are. Furthermore, while a bar chart attaches importance only to the heights of its bars, a histogram attaches importance to both the heights and the widths.
Example
Obtain the histogram of the data in the example above.
Solution
[Figure: histogram; vertical axis: No. of staff.]

Descriptive Statistics
Statistics is concerned with variability. The questions of interest are: how do we describe it? How do we measure it? And how do we reach sensible conclusions from the results of experiments and comparative studies?
Descriptive statistics deals with the classification of data and the drawing of histograms, diagrams and graphs (such as line graphs, bar graphs and pictograms) that correspond to the frequency distributions that result after the data have been classified. It also includes the computation of sample means, medians and modes, and the computation of ranges, mean absolute deviations and variances.
Variable, Variation and Distribution
The results of an experiment or comparative study can always be presented as a set of measurements on each of a group of units. For example, the units may be animals of a particular species, patients with a particular disease, or families living in a particular housing estate. The general term for any feature of the unit which is observed or measured is variable. Thus the weight of an animal and the presence or absence of a symptom in a patient are variables.
Variations
There are two main types of variation: variation between units and variation within units. Variation between units is universal in any scientific investigation. Variation within units is seen when observations are made over a period of time. Variation is best described by the relative frequencies with which different observed values occur.
Distribution
The variation between observations is best described by distributions. The way in which the relative frequencies of the observed values of a variable are displayed depends to some extent on the scale on which the variable takes its values. Variables can be on a qualitative scale, consisting of values like red, white or black, or the presence or absence of a disease. Qualitative variables are not capable of being described numerically. Examples are sex, religion, nationality, and colour of the eye or skin. These characteristics are called "attributes", "attributive variates" or "descriptive characteristics".
The second type of variable takes values on a quantitative scale, for which a comparison of magnitude is involved. Examples of quantitative variables are height, weight, haemoglobin level, and the calorie and nutrient content of foods.
Measures of Central Tendency
Classification and tabulation of data are helpful in reducing and understanding a large mass of data, but they are only descriptive. To be more precise, the data should be expressed in numerical terms. The need therefore arises to find a constant which is representative of a group of data. This is a measure of how the data are centrally placed, also called a measure of location. There are three measures of location: the mean, the median and the mode. The mean can be further divided into the arithmetic, geometric and harmonic means. By careful observation of data, it can be noticed that the observations tend to cluster around a central value. This is called the central tendency of the group, and the central value is known as the average.
Essentials of a Good Average
Since an "average" is to represent the statistical data and is also used for purposes of comparison, it must possess the following properties:
i. It must be rigidly defined, and not left to the mere estimation of the observer.
ii. It must be based on all the values given in the distribution.
iii. It should be easily understandable.
iv. It should be capable of being calculated with reasonable ease and rapidity.
v. It should be affected as little as possible by fluctuations of sampling.
vi. It should be such that it lends itself readily to algebraic treatment.

The Arithmetic Mean
The arithmetic mean of a series is obtained by adding the values of all observations and dividing the total by the number of observations. It is generally called simply the mean. In symbols, if x_1, x_2, …, x_n are n observed values, then the mean is given by:
\[ \bar{x} = \frac{\text{total of all individual values}}{\text{sample size}} = \frac{x_1 + x_2 + \dots + x_n}{n} = \frac{\sum x_i}{n} \]
Example: The gains in weight of 6 albino rats over a period of 5 days are 5, 6, 4, 4, 4, 7.
The arithmetic mean is
\[ \bar{x} = \frac{5 + 6 + 4 + 4 + 4 + 7}{6} = \frac{30}{6} = 5.0 \]
Mean of Grouped Data
Three methods of calculation are the long method, the assumed mean method and the coding method.
Long method
\[ \bar{x} = \frac{\sum fx}{\sum f} \]
Assumed mean method
\[ \bar{x} = A + \frac{\sum fd}{\sum f} \]
where
A = a guessed or assumed mean
d = x − A are the deviations from the assumed mean.
Coding method
\[ \bar{x} = A + \left( \frac{\sum fu}{\sum f} \right) C \]
where
A is an appropriately chosen x value
C is the common class size
u = (x − A)/C = …, -3, -2, -1, 0, 1, 2, 3, …
Example 2.2
The weights in kg of a collection of 40 students in the Faculty of Science of O.O.U. are given below:
59, 53, 66, 55, 57, 65, 48, 59, 51, 58, 52, 68, 60, 70, 71, 55, 70, 64, 54, 67, 62, 53, 49, 56, 63, 48, 57, 61, 58, 55, 50, 55, 61, 52, 54, 65, 56, 50, 62, 60
Calculate the mean using:
a. The long method
b. The assumed mean 61
c. The coding method
Solution:
Weights (kg)   f    x    fx     d = x - 61   fd     u    fu
48–52          8    50   400    -11          -88    -2   -16
53–57          12   55   660    -6           -72    -1   -12
58–62          10   60   600    -1           -10    0    0
63–67          6    65   390    4            24     1    6
68–72          4    70   280    9            36     2    8
Total          40        2330                -110        -14
a. Long method
\[ \bar{x} = \frac{\sum fx}{\sum f} = \frac{2330}{40} = 58.25 \]
b. Assumed mean 61
\[ \bar{x} = A + \frac{\sum fd}{\sum f} = 61 + \frac{-110}{40} = 61 - 2.75 = 58.25 \]
c. Coding method
\[ \bar{x} = A + \left( \frac{\sum fu}{\sum f} \right) C \]
Here C = 5. A is the value of x corresponding to u = 0; for an odd number of classes we choose u = 0 at the centre, so A = 60.
Thus,
\[ \bar{x} = 60 + \left( \frac{-14}{40} \right) 5 = 60 - 1.75 = 58.25 \]
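The three methods can be checked against each other with a short Python sketch. The function names are hypothetical; the midpoints and frequencies are those of the worked table, and A = 61 and A = 60, C = 5 are the choices made in the solution.

```python
# Class midpoints (x) and frequencies (f) from the worked table.
midpoints = [50, 55, 60, 65, 70]
freqs = [8, 12, 10, 6, 4]

def mean_long(x, f):
    """Long method: sum(f*x) / sum(f)."""
    return sum(fi * xi for xi, fi in zip(x, f)) / sum(f)

def mean_assumed(x, f, a):
    """Assumed mean method: d = x - A are deviations from A."""
    return a + sum(fi * (xi - a) for xi, fi in zip(x, f)) / sum(f)

def mean_coding(x, f, a, c):
    """Coding method: u = (x - A) / C are the coded deviations."""
    return a + c * sum(fi * (xi - a) / c for xi, fi in zip(x, f)) / sum(f)

print(mean_long(midpoints, freqs))           # 58.25
print(mean_assumed(midpoints, freqs, 61))    # 58.25
print(mean_coding(midpoints, freqs, 60, 5))  # 58.25
```

All three methods are algebraically the same calculation, so they must agree; the assumed mean and coding methods only make hand computation lighter.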
The Geometric Mean (GM):

If the observations, instead of being added, are multiplied, the geometric mean is the nth root of the product. In algebraic symbols, the geometric mean of n observations x_1, x_2, x_3, …, x_n is given by the formula:
\[ GM = \sqrt[n]{x_1 x_2 x_3 \cdots x_n} = (x_1 x_2 \cdots x_n)^{1/n} \]
For its logarithmic calculation, the relationship used is
\[ \log GM = \frac{\log x_1 + \log x_2 + \log x_3 + \dots + \log x_n}{n} = \frac{\sum \log x_i}{n} \]
= simple arithmetic mean of the logarithms of the individual values.
The anti-logarithm of this log mean is the geometric mean. The geometric mean is preferable to the arithmetic mean if the series of observations contains one or more unusually large values.
Example
The intakes of baby milk food observed in fifteen children in one day are provided below:
101   114   109   135   122
184   196   185   217   198
148   233   227   336   253
Calculate the geometric mean.
Solution:
n = 15
\[ GM = \sqrt[15]{101 \times 114 \times 109 \times \dots \times 253} = 173.7 \]
Using logarithms,
\[ \log GM = \frac{\log x_1 + \log x_2 + \log x_3 + \dots + \log x_{15}}{15} = \frac{\log 101 + \log 114 + \dots + \log 253}{15} = \frac{33.5959}{15} = 2.2397 \]
The anti-logarithm of 2.2397 is 173.7, so the geometric mean is
GM = 173.7
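The logarithmic route to the geometric mean can be sketched as follows. The function name `geometric_mean` is an illustrative choice, base-10 logarithms are used to mirror the hand calculation, and the data are the fifteen intakes listed in the example.

```python
import math

# Milk-food intakes of the fifteen children in the example.
intakes = [101, 114, 109, 135, 122,
           184, 196, 185, 217, 198,
           148, 233, 227, 336, 253]

def geometric_mean(values):
    """Antilog of the arithmetic mean of the base-10 logarithms."""
    log_mean = sum(math.log10(v) for v in values) / len(values)
    return 10 ** log_mean

gm = geometric_mean(intakes)
print(round(gm, 1))
```

Working with logarithms avoids forming the very large product of all fifteen values directly.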
Harmonic Mean

It is the reciprocal of the arithmetic mean of the reciprocals of the observations. For individual values x_1, x_2, …, x_n, the harmonic mean (HM) is
\[ HM = \frac{1}{\frac{1}{n} \sum \frac{1}{x_i}} = \frac{n}{\sum \frac{1}{x_i}} = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \dots + \frac{1}{x_n}} \]
Example
For the numerical values 1, 2, 3, 4, 5, calculate and compare the AM, GM and HM.
Solution
\[ \text{Arithmetic mean (AM)} = \bar{x} = \frac{1 + 2 + 3 + 4 + 5}{5} = \frac{15}{5} = 3.0 \]
\[ \text{Geometric mean (GM)} = (1 \times 2 \times 3 \times 4 \times 5)^{1/5} = (120)^{1/5} = 2.605 \]
With logarithms, the calculation of the GM is as follows.
\[ \log GM = \log (1 \times 2 \times 3 \times 4 \times 5)^{1/5} = \frac{1}{5} (\log 1 + \log 2 + \log 3 + \log 4 + \log 5) \]
\[ = \frac{1}{5} (0 + 0.30103 + 0.47712 + 0.60206 + 0.69897) = \frac{1}{5} (2.07918) = 0.415836 \]
GM = antilog(0.415836) = 2.60517, which rounds to 2.605.
\[ \text{Harmonic mean (HM)} = \frac{1}{\frac{1}{5} \left( \frac{1}{1} + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} \right)} = \frac{5}{2.2833} = 2.190 \]

Therefore, AM is the highest, followed by GM and then HM.
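The ordering of the three means can be confirmed with Python's standard `statistics` module (`geometric_mean` requires Python 3.8 or later). This is only a check on the values 1 to 5 used in the example.

```python
import statistics

values = [1, 2, 3, 4, 5]

am = statistics.mean(values)            # arithmetic mean
gm = statistics.geometric_mean(values)  # nth root of the product
hm = statistics.harmonic_mean(values)   # n / sum of reciprocals

print(am, round(gm, 3), round(hm, 3))
print(hm < gm < am)  # True for these values
```

For positive values that are not all equal, the harmonic mean is always the smallest and the arithmetic mean the largest.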
The Mode
This is the value that has the highest frequency in a distribution. The mode may not exist, and even when it does exist it may not be unique.
For example:
5, 2, 4, 7, 5, 3 has mode 5 (unimodal)
2, 6, 3, 4, 3, 2, 5 has two modes, 2 and 3 (bimodal)
4, 7, 2, 1, 3 has no mode
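A small sketch of how the mode of ungrouped data might be found by counting. The function name `modes` and the convention of returning an empty list when every value occurs only once are illustrative choices.

```python
from collections import Counter

def modes(values):
    """Return the most frequent value(s), or [] if no value repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # every value occurs once: no mode
    return sorted(v for v, c in counts.items() if c == top)

print(modes([5, 2, 4, 7, 5, 3]))     # [5]
print(modes([2, 6, 3, 4, 3, 2, 5]))  # [2, 3]
print(modes([4, 7, 2, 1, 3]))        # []
```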
The mode can be obtained both graphically and by calculation. For grouped data, we use the histogram to estimate the mode graphically, while by calculation we use the formula
\[ \text{Mode} = L + \left[ \frac{f_m - f_a}{2 f_m - f_a - f_b} \right] C \]
where
L = lower class boundary of the modal class
f_m = frequency of the modal class
f_a = frequency of the class above (before) the modal class
f_b = frequency of the class below (after) the modal class
C = size of the modal class interval.
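THE MEDIAN
If a set of data is arranged in order of magnitude, the middle value, which divides the set into two equal groups, is the median. Generally, for N data values,
\[ \text{Median} = \left[ \frac{N + 1}{2} \right]^{\text{th}} \text{ item} \]
For example, find the median of the following sets of data:
a. 3, 6, 2, 4, 3
b. 2, 5, 3, 4, 8, 3
Solution
a. Arrangement in order: 2, 3, 3, 4, 6
Here N = 5
\[ \text{Median} = \left[ \frac{N + 1}{2} \right]^{\text{th}} \text{ item} = \left[ \frac{5 + 1}{2} \right]^{\text{th}} \text{ item} = \text{the 3rd item} = 3 \]
b. Arrangement in order: 2, 3, 3, 4, 5, 8
Here N = 6
\[ \text{Median} = \left[ \frac{6 + 1}{2} \right]^{\text{th}} \text{ item} = 3.5^{\text{th}} \text{ item} \]
This is interpreted as
\[ \frac{\text{3rd item} + \text{4th item}}{2} = \frac{3 + 4}{2} = 3.5 \]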
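The rule for the (N + 1)/2-th item can be sketched as a short function. The name `median` is an illustrative choice; Python's `statistics.median` implements the same rule.

```python
def median(values):
    """Middle item of the ordered data; mean of the two middle items if N is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([3, 6, 2, 4, 3]))     # 3
print(median([2, 5, 3, 4, 8, 3]))  # 3.5
```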
Median of Grouped Data
The median can be obtained graphically from the cumulative frequency curve (ogive) or by calculation using the formula
\[ \text{Median} = L + \left[ \frac{\frac{N}{2} - F}{f} \right] C \]
where
L = lower class boundary of the median class
F = cumulative frequency of the class just above (before) the median class
f = frequency of the median class
C = size of the median class interval
Example: Using the weights of the 40 students given in the example above,
i. Construct the histogram and from it estimate the mode of the distribution.
ii. Calculate the mode and compare your answer with the estimated value in (i) above.
iii. Construct the cumulative frequency curve and from it estimate the median.
iv. Calculate the median and compare your results.
Solution
i. Histogram
[Figure: histogram of the weights with class boundaries 47.5, 52.5, 57.5, 62.5, 67.5 and 72.5 on the horizontal axis.]

The mode is approximately 56.

ii.
\[ \text{Mode} = L + \left[ \frac{f_m - f_a}{2 f_m - f_a - f_b} \right] C \]
The modal class is 53–57.
Hence L = 52.5, f_m = 12, f_a = 8, f_b = 10 and C = 5.
Thus
\[ \text{Mode} = 52.5 + \left[ \frac{12 - 8}{2(12) - 8 - 10} \right] 5 = 52.5 + 3.33 = 55.83 \]
Comparison: graphical value = 56
Calculated value = 55.83
These values agree approximately.
iii.
Weight (kg)   Frequency (f)   Cumulative frequency (F)
48–52         8               8
53–57         12              20
58–62         10              30
63–67         6               36
68–72         4               40
Total         40
[Figure: cumulative frequency curve (ogive) plotted against the class boundaries 47.5, 52.5, 57.5, 62.5, 67.5 and 72.5.]

Estimated median = 57.5

iv.
\[ \text{Median} = L + \left[ \frac{\frac{N}{2} - F}{f} \right] C \]
\[ \frac{N}{2} = \frac{40}{2} = 20 \]
i.e. the median is the 20th value. From the cumulative frequency table, the 20th item falls within the class 53–57. Thus the median class is 53–57; hence L = 52.5, F = 8, f = 12 and C = 5.
\[ \text{Median} = 52.5 + \left[ \frac{20 - 8}{12} \right] 5 = 57.5 \]
Comparison: both values are equal.
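The two grouped-data formulas can be written as small functions. The names `grouped_mode` and `grouped_median` are hypothetical; the arguments follow the symbols defined in the text and the values are those of the solution above.

```python
def grouped_mode(L, fm, fa, fb, C):
    """Mode = L + [(fm - fa) / (2*fm - fa - fb)] * C."""
    return L + (fm - fa) / (2 * fm - fa - fb) * C

def grouped_median(L, N, F, f, C):
    """Median = L + [(N/2 - F) / f] * C."""
    return L + (N / 2 - F) / f * C

mode = grouped_mode(L=52.5, fm=12, fa=8, fb=10, C=5)
med = grouped_median(L=52.5, N=40, F=8, f=12, C=5)
print(round(mode, 2), med)  # 55.83 57.5
```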
Measures of Variation and Dispersion
While studying the frequency distribution of a variable, it is important to know how the frequencies are clustered around or scattered away from the measure of average or central tendency. Two distributions may centre around the same point (e.g. have the same arithmetic mean) but differ in variation about that mean. The degree to which numerical data tend to spread about an average value is called the variation, dispersion, spread or variability of the data. Various measures of variation are the range, quartile deviation, mean deviation, standard deviation, variance and standard error.
i. The Range

The range is the difference between the largest and smallest items of the sample of observations. For the sample of observations 5, 6, 7, 8 and 9, the range is 9 – 5 = 4, i.e. maximum value – minimum value. It depends only on the two extreme values.
ii. Quartile Deviation
The quartile deviation or semi-interquartile range Q is given by the formula
\[ Q = \frac{1}{2} (Q_3 - Q_1) \]
where Q_1 and Q_3 are the first and third quartiles respectively. The quartile deviation is better than the range, since it is calculated from the first and third quartiles rather than from the two extreme values.

iii. Mean Deviation

The mean deviation is the arithmetic mean of the absolute values of the deviations from some average such as the mean, median or mode.
\[ \text{Mean deviation} = \frac{\sum f_i \, |x_i - \bar{x}|}{N} \quad \text{for grouped data} \]
\[ \text{Mean deviation} = \frac{\sum |x_i - \bar{x}|}{N} \quad \text{for ungrouped data} \]
where
f_i = the frequency of the ith class interval
x_i = the mid value of the ith class interval, or the ith individual value
x̄ = the arithmetic mean
N = the number of observations, or N = ∑ f_i
iv. Standard Deviation
This is the most commonly used measure of variation or dispersion. It takes into account all the values of the variable. The standard deviation is defined as the square root of the arithmetic mean of the squared deviations of the individual values from their arithmetic mean. The formula for large samples is
\[ SD^2 = \frac{1}{n} \sum (x_i - \bar{x})^2 \]
where
x_i = the ith individual value
x̄ = the arithmetic mean
n = sample size
For small samples, the formula is
\[ SD^2 = \frac{1}{n-1} \sum (x_i - \bar{x})^2 = \frac{1}{n-1} [SS - CF] \]
where
\[ SS = \text{sum of squares} = \sum x_i^2 \]
\[ CF = \text{correction factor} = \frac{(\sum x_i)^2}{n} \]
For grouped data the formula is
\[ SD^2 = \frac{1}{n-1} \sum f_i (x_i - \bar{x})^2 \]
\[ SD = \sqrt{\frac{1}{n-1} \sum f_i (x_i - \bar{x})^2} = \sqrt{\frac{1}{n-1} \left[ \sum f_i x_i^2 - \frac{(\sum f_i x_i)^2}{n} \right]} \]
where
f_i = the frequency of the ith class interval
x_i = the mid value of the ith class interval
x̄ = the arithmetic mean
n = sample size
Example
Ungrouped data, for the values 5, 6, 7, 8, 9:
\[ AM = \bar{x} = \frac{35}{5} = 7 \]
\[ SS = \sum x_i^2 = 5^2 + 6^2 + 7^2 + 8^2 + 9^2 = 255 \]
\[ CF = \frac{(\sum x_i)^2}{n} = \frac{35^2}{5} = 245 \]
\[ SD = \sqrt{\frac{1}{n-1} (SS - CF)} = \sqrt{\frac{1}{4} (255 - 245)} = 1.58 \]

Grouped data, using the frequency distribution of the weights (kg) of 30 adults below:
Class interval of weights (kg)   Mid value x_i   Frequency f_i   Cumulative frequency   f_i x_i
45–50                            47.5            2               2                      95.0
50–55                            52.5            3               5                      157.5
55–60                            57.5            6               11                     345.0
60–65                            62.5            4               15                     250.0
65–70                            67.5            6               21                     405.0
70–75                            72.5            4               25                     290.0
75–80                            77.5            5               30                     387.5
Total                                            30                                     1930.0
N = ∑ f_i = 30
\[ \sum_{i=1}^{7} f_i x_i = 1930 \]
\[ \bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{1930}{30} = 64.33 \]
\[ \sum f_i x_i^2 = 126637.50 \]
\[ \frac{(\sum f_i x_i)^2}{n} = \frac{3724900}{30} = 124163.33 \]
\[ SD = \sqrt{\frac{1}{n-1} \left[ \sum f_i x_i^2 - \frac{(\sum f_i x_i)^2}{n} \right]} = \sqrt{\frac{1}{29} [126637.50 - 124163.33]} = 9.24 \]
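The grouped standard deviation can be recomputed from the mid values and frequencies of the table. The function name `grouped_sd` is hypothetical, and the sketch uses the computational form of the formula with the divisor n − 1.

```python
import math

# Mid values and frequencies from the weight table above.
midpoints = [47.5, 52.5, 57.5, 62.5, 67.5, 72.5, 77.5]
freqs = [2, 3, 6, 4, 6, 4, 5]

def grouped_sd(x, f):
    """SD = sqrt((sum(f*x^2) - (sum(f*x))^2 / n) / (n - 1))."""
    n = sum(f)
    sum_fx = sum(fi * xi for xi, fi in zip(x, f))
    sum_fx2 = sum(fi * xi ** 2 for xi, fi in zip(x, f))
    return math.sqrt((sum_fx2 - sum_fx ** 2 / n) / (n - 1))

sd = grouped_sd(midpoints, freqs)
print(round(sd, 2))  # 9.24
```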
v. Standard Error
The standard deviation of mean values is known as the standard error. It is used to compare means with one another.
\[ \text{Standard error (SE)} = \frac{\text{standard deviation}}{\sqrt{\text{sample size}}} = \frac{SD}{\sqrt{n}} \]

vi. Coefficient of Variation

To compare the variability of two series which differ widely in their averages, or which are measured in different units, a relative measure of dispersion is used, known as the coefficient of variation or dispersion. The formula is
\[ \text{Coefficient of variation (CV)} = \frac{\text{standard deviation}}{\text{mean}} \times 100 \]
When the variability of two series is compared, the series having the greater CV is said to have more variation than the other, and the series with the lower CV is said to be more homogeneous than the other.
Example 2.6 Using the table in Example 2.5
Mean = 64.33, SD = 9.24
\[ SE = \frac{SD}{\sqrt{n}} = \frac{9.24}{\sqrt{30}} = 1.69 \]
\[ CV = 100 \times \frac{9.24}{64.33} = 14.36 \]
For the ungrouped data 5, 6, 7, 8, 9
Mean = 7, SD = 1.58
\[ CV = 100 \times \frac{1.58}{7} = 22.57 \]
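The standard error and coefficient of variation for the ungrouped values 5, 6, 7, 8, 9 can be sketched with the standard `statistics` module, whose `stdev` uses the divisor n − 1. Because the sketch carries the unrounded standard deviation through, its CV may differ in the last digit from a hand calculation that first rounds the SD to 1.58.

```python
import math
import statistics

values = [5, 6, 7, 8, 9]

mean = statistics.mean(values)
sd = statistics.stdev(values)       # sample SD, divisor n - 1
se = sd / math.sqrt(len(values))    # standard error of the mean
cv = 100 * sd / mean                # coefficient of variation, in percent

print(round(sd, 2), round(se, 2), round(cv, 2))
```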
vii. Variance

The variance is measured in the square of the units in which the variable X is measured.
The formula for the variance is:
\[ \text{Variance} = \frac{\sum (x_i - \bar{x})^2}{n} = \frac{\sum x_i^2 - n \bar{x}^2}{n} \]
A better estimate of the population variance is obtained by using the divisor (n − 1) instead of n.
\[ \text{Estimated variance} = S^2 = \frac{\sum (x_i - \bar{x})^2}{n-1} \]
\[ \text{Estimated standard deviation} = S = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}} \]
The symbols used for the characteristics of the sample and of the population are:
                     Sample   Population
Number               n        N
Mean                 x̄        μ
Variance             S²       σ²
Standard deviation   S        σ

S² is an unbiased estimate of the population variance σ² only if (n − 1) is used in the denominator of S².
Example
The body surface areas of fifteen children are given. Calculate the mean, variance, standard deviation and standard error.
Body Surface Area
196   101   184   227   253
185   217   122   336   148
114   135   233   198   109
Solution
\[ \text{Mean} = \frac{\sum x}{n} = \frac{2758}{15} = 183.9 \]
\[ \text{Variance} = S^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n-1} = \frac{58499.7}{14} = 4178.55 \]
\[ \text{Standard deviation, } SD = \sqrt{S^2} = \sqrt{4178.55} = 64.64 \]
\[ \text{Standard error} = \frac{SD}{\sqrt{n}} = \frac{64.64}{\sqrt{15}} = 16.69 \]
The standard deviation (S) is a measure of the variation or dispersion of a group of values around their arithmetic mean.
The variance (S²) is the square of the standard deviation. The standard error (SE) is a measure of the variation or dispersion of the means of a set of measurements.
REGRESSION ANALYSIS
Regression analysis is often used to predict the response variable from knowledge of the independent variables. It is also used to examine the nature of the relationship between the independent variables and the response (dependent) variable. Regression is therefore the study of relationships among variables. One purpose of regression may be to predict, or estimate, the value of one variable from the values of other variables related to it.
EXAMPLE
The table below shows the weights of male (X) and female (Y) staff of an institution.
i) Find the least squares regression line of Y on X.
ii) Find the least squares regression line of X on Y, that is, considering X as the dependent and Y as the independent variable.
X   65   63   67   64   68   62   70   66   68   67   69   71
Y   68   66   68   65   69   66   68   65   71   67   68   70
Solution:
i) The regression line of Y on X is given by
\[ Y = \alpha + \beta X \]
X     Y     X²      Y²      XY
65    68    4225    4624    4420
63    66    3969    4356    4158
67    68    4489    4624    4556
64    65    4096    4225    4160
68    69    4624    4761    4692
62    66    3844    4356    4092
70    68    4900    4624    4760
66    65    4356    4225    4290
68    71    4624    5041    4828
67    67    4489    4489    4489
69    68    4761    4624    4692
71    70    5041    4900    4970
800   811   53418   54849   54107
For Y on X,
\[ \beta = \frac{n \sum xy - \sum x \sum y}{n \sum x^2 - (\sum x)^2} = \frac{12(54107) - (800)(811)}{12(53418) - (800)^2} = 0.4764 \]
\[ \alpha = \frac{\sum y}{n} - \beta \frac{\sum x}{n} = \bar{y} - \beta \bar{x} = \frac{811}{12} - 0.4764 \times \frac{800}{12} = 35.8233 \]
The regression equation of Y on X is Y = 35.823 + 0.476X.

(ii) The regression line of X on Y is given by
\[ X = \alpha + \beta Y \]
\[ \beta = \frac{n \sum xy - \sum x \sum y}{n \sum y^2 - (\sum y)^2} = \frac{12(54107) - (800)(811)}{12(54849) - (811)^2} = 1.036 \]
\[ \alpha = \frac{\sum x}{n} - \beta \frac{\sum y}{n} = \frac{800}{12} - 1.036 \times \frac{811}{12} = -3.38 \]
The regression equation of X on Y is
X = -3.38 + 1.036Y
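Both regression lines can be reproduced with a short Python sketch of the least squares formulas. The function name `least_squares` is hypothetical and the data are the twelve (X, Y) pairs of the example; swapping the two arguments gives the regression of X on Y. Small differences from the hand calculation in the third decimal place come from rounding β before computing α.

```python
x = [65, 63, 67, 64, 68, 62, 70, 66, 68, 67, 69, 71]
y = [68, 66, 68, 65, 69, 66, 68, 65, 71, 67, 68, 70]

def least_squares(x, y):
    """Return (alpha, beta) for the fitted line y = alpha + beta * x."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    beta = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    alpha = sy / n - beta * sx / n
    return alpha, beta

a_yx, b_yx = least_squares(x, y)  # regression of Y on X
a_xy, b_xy = least_squares(y, x)  # regression of X on Y

print(round(a_yx, 2), round(b_yx, 3))
print(round(a_xy, 2), round(b_xy, 3))
```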

CORRELATION ANALYSIS
We have dealt with the problem of regression, or estimation of one variable (the dependent variable) from one or more related variables (the independent variables). We shall now consider the degree of relationship that exists between variables, which is the subject of correlation analysis.
Correlation analysis is a technique for estimating the closeness or degree of relationship between two or more variables. Correlation is the degree of association between two or more variables. The relationship may be positive, that is, an increase in one variable is accompanied by an increase in the other, or negative, when a decrease in one variable is accompanied by an increase in the other. The patterns of correlation are: perfect positive correlation when r = 1, perfect negative correlation when r = -1, positive correlation when r is above 0, negative correlation when r is below 0, and no linear correlation when r = 0.
The correlation coefficient or coefficient of correlation, denoted by r, is a measure of the strength of the linear relationship between two variables. Two types of measure of correlation are:
(i) Karl Pearson's product moment correlation coefficient (r)
(ii) Spearman's rank correlation coefficient (R)

PRODUCT MOMENT CORRELATION COEFFICIENT
Karl Pearson's product moment correlation coefficient is denoted by r and given by:
\[ r = \frac{n \sum xy - \sum x \sum y}{\sqrt{[n \sum x^2 - (\sum x)^2][n \sum y^2 - (\sum y)^2]}} \]
where −1 ≤ r ≤ 1.
It should be noted that the higher the magnitude of r, the stronger the association.

EXAMPLE
The table below gives the weight of the heart (x) and the weight of the kidneys (y) in a random sample of 12 adult males between the ages of 25 and 55 years.
Male no   Heart weight (X)   Kidney weight (Y)
1         11.50              11.25
2         9.50               11.75
3         13.00              11.75
4         15.50              12.50
5         12.50              12.50
6         11.50              12.75
7         9.00               9.50
8         11.50              10.75
9         9.25               11.00
10        9.75               9.50
11        14.25              13.00
12        10.75              12.00

Calculate the coefficient of correlation.
Solution:
\[ r = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{\left[ \sum x^2 - \frac{(\sum x)^2}{n} \right] \left[ \sum y^2 - \frac{(\sum y)^2}{n} \right]}} \]
∑x = 138.00, ∑y = 138.25, ∑xy = 1608.12
∑x² = 1632.75, ∑y² = 1607.81
\[ r = \frac{1608.12 - \frac{138.00 \times 138.25}{12}}{\sqrt{\left( 1632.75 - \frac{(138.00)^2}{12} \right) \left( 1607.81 - \frac{(138.25)^2}{12} \right)}} \]
r = 0.70 (to 2 decimal places)

There is a fairly strong positive linear relationship between heart weight and kidney weight.
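A sketch of the product moment calculation in Python. The function name `pearson_r` is hypothetical, and the twelfth heart weight is taken as 10.75, an assumption that is consistent with the stated totals ∑x = 138.00 and ∑x² = 1632.75.

```python
import math

# Heart (x) and kidney (y) weights; x[11] = 10.75 is an assumption
# consistent with the stated totals sum(x) = 138.00 and sum(x^2) = 1632.75.
x = [11.50, 9.50, 13.00, 15.50, 12.50, 11.50, 9.00, 11.50, 9.25, 9.75, 14.25, 10.75]
y = [11.25, 11.75, 11.75, 12.50, 12.50, 12.75, 9.50, 10.75, 11.00, 9.50, 13.00, 12.00]

def pearson_r(x, y):
    """Product moment correlation coefficient from raw sums."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    num = n * sxy - sx * sy
    den = math.sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
    return num / den

r = pearson_r(x, y)
print(round(r, 2))  # 0.7
```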
SPEARMAN RANK CORRELATION COEFFICIENT
When the variables do not follow a normal distribution and one wishes to assess the relationship between them, the Spearman rank correlation coefficient is used. The values of each variable are ranked according to magnitude, and the correlation between the ranks of x and y is obtained. The symbol used is R and the formula is:
\[ R = 1 - \frac{6 \sum d_i^2}{n (n^2 - 1)} \]
where d is the difference between the ranks given to the two variables in each pair and n is the number of pairs studied. The procedure was developed by Spearman, hence the name. Its value also ranges from –1 to 1.
EXAMPLE
From the table below, calculate the Spearman rank correlation between smoking and cancer.
Individual                       1    2    3    4    5    6    7    8    9    10
Grade of smoking (x)             1    2    3    4    5    6    7    8    9    10
Severity of cancer (y)           2    1    4    3    6    7    5    9    8    10
d = rank of x - rank of y        -1   1    -1   1    -1   -1   2    -1   1    0
d²                               1    1    1    1    1    1    4    1    1    0

∑d² = 1 + 1 + 1 + 1 + 1 + 1 + 4 + 1 + 1 + 0 = 12
\[ R = 1 - \frac{6 \sum d^2}{n (n^2 - 1)} = 1 - \frac{6(12)}{10(10^2 - 1)} = 1 - \frac{6(12)}{10(99)} = 1 - 0.073 = 0.927 \]
Severity of cancer and grade of smoking are positively correlated.
EXAMPLE
Calculate the rank correlation coefficient between the corresponding values of X and Y given in the table below.
X   22   24   25   16   28   19
Y   48   42   40   38   47   45
Solution
The ranking is in ascending order of magnitude.
X    Y    RX   RY   d    d²
22   48   3    6    -3   9
24   42   4    3    1    1
25   40   5    2    3    9
16   38   1    1    0    0
28   47   6    5    1    1
19   45   2    4    -2   4
                    ∑d² = 24
\[ R = 1 - \frac{6 \sum d^2}{n (n^2 - 1)} = 1 - \frac{6(24)}{6(36 - 1)} = 1 - 0.6857 = 0.3143 \approx 0.31 \]
There is a low or weak positive correlation between the two variables.
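When no values are tied, the ranking and the formula can be sketched as below. The function name `spearman_no_ties` is hypothetical, and the sketch assumes every value within a variable is distinct.

```python
def spearman_no_ties(x, y):
    """Spearman's R = 1 - 6*sum(d^2) / (n*(n^2 - 1)), assuming no tied values."""
    n = len(x)
    # Rank 1 goes to the smallest value (ascending order of magnitude).
    rank_x = {v: i + 1 for i, v in enumerate(sorted(x))}
    rank_y = {v: i + 1 for i, v in enumerate(sorted(y))}
    d2 = sum((rank_x[a] - rank_y[b]) ** 2 for a, b in zip(x, y))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

r = spearman_no_ties([22, 24, 25, 16, 28, 19], [48, 42, 40, 38, 47, 45])
print(round(r, 4))  # 0.3143
```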
TIES IN RANKS
Often, two or more values of a variable are equal. In such cases, we assign to each of the tied observations the mean of the ranks which they jointly occupy. For example, if the 5th and 6th largest values of a variable are equal, we assign to each the rank (5 + 6)/2 = 5.5, and if the fifth, sixth and seventh largest values of a variable are the same, we assign to each the rank (5 + 6 + 7)/3 = 6.
EXAMPLE
The table below shows the respective weights X and Y (in kg) of 12 fathers and their eldest sons.

Father (X)   66   64   68   65   69   63   71   67   69   68   70   72
Son (Y)      69   67   69   66   70   67   69   66   72   68   69   71
Calculate the coefficient of rank correlation and comment on the degree of correlation between the fathers' weights and those of their sons.
Solution
X    Y    RX     RY     d = RX - RY   d²
66   69   4      7.5    -3.5          12.25
64   67   2      3.5    -1.5          2.25
68   69   6.5    7.5    -1.0          1.00
65   66   3      1.5    1.5           2.25
69   70   8.5    10     -1.5          2.25
63   67   1      3.5    -2.5          6.25
71   69   11     7.5    3.5           12.25
67   66   5      1.5    3.5           12.25
69   72   8.5    12     -3.5          12.25
68   68   6.5    5      1.5           2.25
70   69   10     7.5    2.5           6.25
72   71   12     11     1.0           1.00
                        ∑d² =         72.50

\[ R = 1 - \frac{6 \sum d^2}{n (n^2 - 1)} = 1 - \frac{6(72.50)}{12(144 - 1)} = 1 - \frac{72.50}{2(143)} = 1 - 0.2535 = 0.7465 \approx 0.75 \]
Comment: There is a fairly high positive correlation between the fathers' weights and those of their eldest sons.
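The tie-handling rule can be sketched as follows. The function names `average_ranks` and `spearman` are hypothetical; tied values share the mean of the positions they jointly occupy, and the data are the fathers' and sons' weights from the example.

```python
def average_ranks(values):
    """Ascending ranks; tied values share the mean of the ranks they occupy."""
    ordered = sorted(values)
    ranks = {}
    for v in set(ordered):
        first = ordered.index(v) + 1  # first 1-based position of v
        count = ordered.count(v)      # number of tied observations
        ranks[v] = first + (count - 1) / 2
    return [ranks[v] for v in values]

def spearman(x, y):
    """Return (sum of d^2, R) using R = 1 - 6*sum(d^2) / (n*(n^2 - 1))."""
    n = len(x)
    rx, ry = average_ranks(x), average_ranks(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return d2, 1 - 6 * d2 / (n * (n ** 2 - 1))

fathers = [66, 64, 68, 65, 69, 63, 71, 67, 69, 68, 70, 72]
sons = [69, 67, 69, 66, 70, 67, 69, 66, 72, 68, 69, 71]

d2, r = spearman(fathers, sons)
print(d2, round(r, 4))  # 72.5 0.7465
```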
ELEMENTS OF PROBABILITY
Probability concepts are the foundations of statistics. An understanding of the concepts of probability helps in interpreting statistics skilfully. Probability is a term applied to events that are not certain. It is the study of random or non-deterministic experiments. Probability is defined as the ratio of the number of favourable events to the total number of events. Briefly, the interpretation of probabilities can be summarized as follows:
i. Probabilities are numbers between 0 and 1, inclusive, that reflect the chances of a particular physical event occurring.
ii. Probabilities near 1 indicate that the event involved is expected to occur.
iii. Probabilities near ½ indicate that the event is just as likely to occur as not.
The above properties are guidelines for interpreting probabilities once these numbers are available, but they do not indicate how to actually go about assigning probabilities to events. Three methods are commonly used: the classical approach, the relative frequency approach and the personal or subjective approach.

The Classical Approach
This method can be used whenever the possible outcomes of the experiment are equally likely. In this case, the probability of the occurrence of event A is given by:
\[ P[A] = \frac{n(A)}{n(S)} = \frac{\text{number of ways A can occur}}{\text{number of ways the experiment can proceed}} \]
where S is the sample space and A ⊂ S.
Its main drawback is that it is not always applicable; it requires that the possible outcomes be equally likely. Its main advantage is that, when applicable, the probability obtained is exact.
Example
What is the probability that a child born to a couple, each with genes for both brown and blue eyes, will be brown-eyed?
Solution
Since the child receives one gene from each parent, the possibilities for the child are (brown, blue), (blue, brown), (blue, blue) and (brown, brown), where the first member of each pair represents the gene received from the father. Since each parent is just as likely to contribute a gene for brown eyes as for blue eyes, all four possibilities are equally likely.
Since the gene for brown eyes is dominant, three of the four possibilities lead to a brown-eyed child. Hence, the probability that the child is brown-eyed is ¾ = 0.75.
Example
What is the probability of drawing an ace at random from a well shuffled deck of 52 playing cards?
Solution
There are 4 aces in a deck of 52 cards, that is, x = 4 and n = 52.
\[ \text{Hence, probability of an ace} = \frac{x}{n} = \frac{4}{52} = \frac{1}{13} \]
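Both classical examples amount to counting favourable outcomes in a list of equally likely outcomes, which can be sketched as below. The function name `classical_probability` and the labels used for genes, ranks and suits are illustrative choices.

```python
from fractions import Fraction
from itertools import product

def classical_probability(event, sample_space):
    """P[A] = n(A) / n(S) for equally likely outcomes."""
    favourable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favourable), len(sample_space))

# Eye colour: (gene from father, gene from mother); brown is dominant.
genes = ["brown", "blue"]
children = list(product(genes, repeat=2))
p_brown = classical_probability(lambda pair: "brown" in pair, children)

# Deck of 52 cards as (rank, suit) pairs.
ranks = ["A"] + [str(k) for k in range(2, 11)] + ["J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = list(product(ranks, suits))
p_ace = classical_probability(lambda card: card[0] == "A", deck)

print(p_brown, p_ace)  # 3/4 1/13
```

Using `Fraction` keeps the probabilities exact, which matches the point that the classical approach gives an exact answer when it applies.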

The Relative Frequency Approach
This method can be used in any situation in which the experiment can be repeated many times and the results observed. Then the approximate probability of the occurrence of event A, denoted P(A), is given by:
\[ P[A] \approx \frac{n(A)}{N} = \frac{\text{number of times event A occurred}}{\text{number of times the experiment was run}} \]
The disadvantage of this method is that the experiment cannot be a one-shot situation; it must be repeatable. The advantage of this approach is that it is usually more accurate, because it is based on actual observation rather than personal opinion.
Thus, for a large number of trials, the approximate probability obtained by using the relative frequency approach is usually quite accurate.

Example

A researcher is developing a new drug to be used in desensitizing patients to bee stings. Of 200 subjects tested, 180 showed a lessening in the severity of symptoms upon being stung after the treatment was administered. It is natural to assume, then, that the probability of this occurring in another patient receiving treatment is at least approximately
\[ \frac{180}{200} = 0.90 \]
On the basis of this study, the drug is reported to be 90% effective in lessening the reaction of sensitive patients to stings.
Example
If 1,000 tosses of a coin result in 520 heads, then the relative frequency of heads is
\[ \frac{520}{1000} = 0.52 \]
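The relative frequency calculation in the two examples above is a single division. The short sketch below makes the estimate explicit; the function name `relative_frequency` is an illustrative choice.

```python
def relative_frequency(times_event_occurred, times_experiment_run):
    """Approximate P[A] as n(A) / N from repeated runs of an experiment."""
    if times_experiment_run <= 0:
        raise ValueError("the experiment must have been run at least once")
    return times_event_occurred / times_experiment_run


p_drug_effective = relative_frequency(180, 200)   # bee-sting drug study
p_heads = relative_frequency(520, 1000)           # coin-tossing record

print(p_drug_effective, p_heads)  # 0.9 0.52
```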
The Subjective or Personal Approach
This is the probability assigned to an event based on subjective or personal experience, information and belief.
Hence, probabilities are interpreted as the strength of one's belief in the occurrence of an event.
Some Basic Definitions
Experiment: This refers to any process of observation or measurement whose result we may not be able to predict.
Outcome: This refers to a result obtained from an experiment.
Sample point: This is an outcome in the sample space.
Sample space: This refers to the collection of all possible outcomes of an experiment.
Event: This refers to any subset of a sample space.

Axioms of Probability
1. Let S denote the sample space of an experiment. Then P[S] = 1
2. P[A] ≥ 0 for every event A
3. Let A1, A2, A3, … be a sequence of mutually exclusive events. Then P[A1 ∪ A2 ∪ A3 ∪ …] = P[A1] + P[A2] + P[A3] + …
Axiom 1 states a fact that most people would regard as obvious, namely that the probability assigned to a sure, or certain, event is 1.
Axiom 2 ensures that probabilities can never be negative.
Axiom 3 is called the property of countable additivity.
Probability Laws
1. If A and A′ are complementary events in a sample space S, then
P(A′) = 1 – P(A)

Complementary events: Two events A and A′ are said to be complementary if they are mutually exclusive and together make up the whole sample space (A ∪ A′ = S), so that
P(A) + P(A′) = 1
Mutually exclusive events: Two events A and B are said to be mutually exclusive if the occurrence of one event excludes or prevents the occurrence of the other event.
2. P(∅) = 0 for any sample space S.
S and ∅ are mutually exclusive and S ∪ ∅ = S, so
P(S) = P(S ∪ ∅)
= P(S) + P(∅)
P(∅) = P(S) – P(S) = 0
3. If A and B are events in a sample space S and A ⊂ B, then P(A) ≤ P(B)
4. 0 ≤ P(A) ≤ 1 for any event A.
5. Addition rule: if A and B are any two events in a sample space S, then
P(A ∪ B) = P(A) + P(B) – P(A ∩ B)
= P(A) + P(B) when A and B are mutually exclusive, that is, when P(A ∩ B) = 0
6. Multiplication rule:
The probability that independent events occur jointly is the product of the probabilities of the individual events. If A and B are independent events, then
P(A ∩ B) = P(A) P(B)
This rule generalizes to an arbitrary number of independent events.
Example
What is the probability that a card drawn at random from a well shuffled standard pack will be either a spade or a club?
Solution
There are 13 spades and 13 clubs in a pack of n = 52 cards.
\[ P(s) = P(\text{spade}) = \frac{13}{52} = \frac{1}{4} \]
\[ P(c) = P(\text{club}) = \frac{13}{52} = \frac{1}{4} \]
The outcomes are mutually exclusive, therefore P(s or c) = P(s) + P(c).
Hence P(s or c) = ¼ + ¼ = ½
Example 3.6
Find the probability of getting three heads in three random tosses of a balanced coin.

Solution
The probability of a head on each toss is ½, and the tosses are independent.
Multiplying the three probabilities gives
½ × ½ × ½ = 1/8
7. Conditional Probability
Given a sample space S, let A be a non-empty proper subset of S, i.e. A ≠ ∅ and A ⊂ S. The probability of an event B happening given that an event A has taken place is denoted by P(B/A) and is defined as:
\[ P(B/A) = \frac{P(A \cap B)}{P(A)} \]
If A and B are any two events in a sample space S and P(A) ≠ 0, the conditional probability of B given A is:
\[ P(B/A) = \frac{P(A \cap B)}{P(A)} = \frac{P(\text{both events})}{P(\text{given event})} \]
Likewise, the conditional probability of A given B, with P(B) ≠ 0, is:
\[ P(A/B) = \frac{P(A \cap B)}{P(B)} = \frac{P(\text{both events})}{P(\text{given event})} \]
Example
It is estimated that 15% of the adult population has hypertension, but that 75% of all adults feel that personally they do not have this problem. It is also estimated that 6% of the population has hypertension but does not think that the disease is present. If an adult patient reports thinking that he or she does not have hypertension, what is the probability that the disease is, in fact, present?
Solution

Let A denote the event that the patient does not feel that the disease is present and B the event that the disease is present. We are given that P(A) = 0.75, P(B) = 0.15 and P(A ∩ B) = 0.06.
We are asked to find:
\[ P(B/A) = \frac{P(\text{both})}{P(\text{given})} = \frac{P(A \cap B)}{P(A)} = \frac{0.06}{0.75} = 0.08 \]
There is an 8% chance that a patient who expresses the opinion that she or he has no problem with hypertension does, in fact, have the disease.
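The addition rule, the multiplication rule for independent events and the conditional probability formula can each be written as a one-line function. The sketch below re-computes the spade-or-club, three-heads and hypertension examples; the function names are illustrative and not taken from the notes.

```python
from fractions import Fraction


def prob_union(p_a, p_b, p_a_and_b=0):
    """Addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b


def prob_joint_independent(*probabilities):
    """Multiplication rule for independent events: product of the P(A_i)."""
    result = 1
    for p in probabilities:
        result *= p
    return result


def prob_conditional(p_both, p_given):
    """Conditional probability P(B/A) = P(A and B) / P(A), with P(A) != 0."""
    if p_given == 0:
        raise ValueError("the conditioning event must have non-zero probability")
    return p_both / p_given


# Spade or club: mutually exclusive, so the joint term is zero.
p_spade_or_club = prob_union(Fraction(13, 52), Fraction(13, 52))
# Three heads in three independent tosses of a balanced coin.
p_three_heads = prob_joint_independent(Fraction(1, 2), Fraction(1, 2), Fraction(1, 2))
# Hypertension example: P(A and B) = 0.06, P(A) = 0.75.
p_disease_given_denial = prob_conditional(Fraction(6, 100), Fraction(75, 100))

print(p_spade_or_club, p_three_heads, float(p_disease_given_denial))  # 1/2 1/8 0.08
```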

Bayes' Theorem
This theorem was formulated by the Reverend Thomas Bayes (1761). It deals with conditional probability. Bayes' theorem is used to find P(A/B) when the available information is not directly compatible with that required in conditional probability. That is, it is used to find P[A/B] when P[A ∩ B] and P[B] are not immediately available.
Theorem:

Let A1, A2, A3, ..., An be a collection of events which partition S. Let B be an event such that P[B] ≠ 0. Then for any of the events Aj, j = 1, 2, 3, …, n
\[ P[A_j/B] = \frac{P[B/A_j]\,P[A_j]}{\sum_{i=1}^{n} P[B/A_i]\,P[A_i]} \]

Bayes' theorem is much easier to use in practical problems than to state formally.
Example
The blood type distribution in Olabisi Onabanjo University is type A, 41%; type B, 9%; type AB, 4%; and type O, 46%. It is estimated that during an investigation, 4% of inductees with type O blood were typed as having type A; 88% of those with type A blood were correctly typed; 4% with type B blood were typed as A; and 10% with type AB were typed as A. One student was wounded and brought to surgery. He was typed as having type A blood. What is the probability that this is his true blood type?

Solution
Let
A1 = he has type A blood
A2 = he has type B blood
A3 = he has type AB blood
A4 = he has type O blood
B = he is typed as type A.
We want to find P[A1/B].
We are given that
P[A1] = 0.41    P[B/A1] = 0.88
P[A2] = 0.09    P[B/A2] = 0.04
P[A3] = 0.04    P[B/A3] = 0.10
P[A4] = 0.46    P[B/A4] = 0.04
By Bayes' theorem
\[ P[A_1/B] = \frac{P[B/A_1]\,P[A_1]}{\sum_{i=1}^{4} P[B/A_i]\,P[A_i]} \]
\[ = \frac{(0.88)(0.41)}{(0.88)(0.41) + (0.04)(0.09) + (0.10)(0.04) + (0.04)(0.46)} = \frac{0.3608}{0.3868} \approx 0.93 \]
Practically speaking, this means that there is a 93% chance that the blood type is A if it has been typed as A, and there is a 7% chance that it has been mistyped as A when it is actually some other type.
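The blood-typing calculation can be reproduced in a few lines. This is a minimal sketch of Bayes' theorem for a finite partition; the function name `bayes_posterior` and the dictionary layout are illustrative assumptions.

```python
def bayes_posterior(priors, likelihoods, target):
    """Return P[A_target / B] given P[A_i] (priors) and P[B / A_i] (likelihoods).

    Both arguments are dicts keyed by the events A_i of a partition of S.
    """
    # Denominator of Bayes' theorem: sum over i of P[B / A_i] P[A_i].
    evidence = sum(priors[event] * likelihoods[event] for event in priors)
    if evidence == 0:
        raise ValueError("P[B] must be non-zero")
    return priors[target] * likelihoods[target] / evidence


# True blood type distribution, and the chance each type is typed as A.
priors = {"A": 0.41, "B": 0.09, "AB": 0.04, "O": 0.46}
typed_as_a = {"A": 0.88, "B": 0.04, "AB": 0.10, "O": 0.04}

posterior_a = bayes_posterior(priors, typed_as_a, "A")
print(round(posterior_a, 2))  # 0.93
```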
FACTORIALS
Factorial is a special multiplication operator. The factorial sign “!” indicates a special repeated multiplication which is used frequently in statistical applications.
Examples
3! = 3 × 2 × 1 = 6
4! = 4 × 3 × 2 × 1 = 24
In general, n! = n × (n−1) × (n−2) × … × 3 × 2 × 1
where n is a positive integer.
The operator “Π” is used to indicate a multiplication of a series of numbers.
The operator “Σ” is used to indicate a summation of a series of numbers.
\[ \prod_{i=1}^{5} Y_i = Y_1 \times Y_2 \times Y_3 \times Y_4 \times Y_5 \]
\[ \sum_{i=1}^{5} Y_i = Y_1 + Y_2 + Y_3 + Y_4 + Y_5 \]


PERMUTATION
If r objects are selected from a set of n objects, any particular arrangement (order) of these objects is called a permutation.
The number of permutations of r objects selected from a set of n distinct objects is
\[ {}^{n}P_{r} = \frac{n!}{(n-r)!} \]
Example
Find the number of ways of arranging the letters of the word CHEMISTRY if:
a. All the letters are to be taken at a time
b. Four of the letters are to be taken at a time
Solution
a. Required number of arrangements = n!
9! = 362880
b. Required number of arrangements = nPr
\[ {}^{9}P_{4} = \frac{9!}{(9-4)!} = \frac{9!}{5!} = \frac{362880}{120} = 3024 \]
Notes
i. 0! = 1 and nPn = n!
ii. The number of permutations of n objects of which n1 are of one kind, n2 of a second kind, …, nk of a kth kind is
\[ \frac{n!}{n_1!\, n_2! \cdots n_k!} \]
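The permutation counts above can be verified with Python's standard library (`math.perm` requires Python 3.8 or later). The helper `multiset_permutations` and the word STATISTICS are illustrative additions used only to exercise note ii; they do not appear in the notes.

```python
from collections import Counter
from math import factorial, perm


def multiset_permutations(items):
    """n! / (n1! n2! ... nk!) for n objects with repeated kinds (note ii)."""
    count = factorial(len(items))
    for repeats in Counter(items).values():
        count //= factorial(repeats)
    return count


word = "CHEMISTRY"                      # 9 distinct letters
all_letters = factorial(len(word))      # part (a): 9!
four_letters = perm(len(word), 4)       # part (b): 9P4

# Illustration of note ii: STATISTICS has S x3, T x3, I x2, A x1, C x1.
repeated_letters = multiset_permutations("STATISTICS")

print(all_letters, four_letters, repeated_letters)  # 362880 3024 50400
```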
COMBINATION
This deals with the number of ways in which r objects can be selected from a set of n objects. The number of ways in which r objects can be selected from a set of n distinct objects is denoted by \(\binom{n}{r}\) or nCr and is given by:
\[ {}^{n}C_{r} = \frac{n!}{r!\,(n-r)!} \]
Example
In how many ways can a person select three items from a list of 7 such items?
Solution
Here n = 7 and r = 3.
Number of possible selections = nCr
\[ {}^{7}C_{3} = \frac{7!}{3!\,(7-3)!} = \frac{7!}{3!\,4!} = \frac{7 \times 6 \times 5}{3 \times 2 \times 1} = 35 \]
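The combination count can be checked both with the closed formula and by listing the selections. A short sketch using `math.comb` (Python 3.8 or later) and `itertools.combinations`; the item labels are made up for illustration.

```python
from itertools import combinations
from math import comb

items = ["item1", "item2", "item3", "item4", "item5", "item6", "item7"]

# Closed formula: 7C3 = 7! / (3! 4!).
n_selections = comb(len(items), 3)

# Direct enumeration of every unordered selection of three items.
selections = list(combinations(items, 3))

print(n_selections, len(selections))  # 35 35
```

Because order does not matter in a combination, each selection appears once in the enumeration, whereas the corresponding permutation count 7P3 = 210 would count every ordering separately.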
Mathematically speaking, an event which is impossible, for example, an animal giving birth to a human child, has probability zero, and an event which is certain to occur, for example, death, has probability unity. If the live birth of octuplets to a woman is not known to have occurred in the history of a community, the statistical probability of such an event is zero in that community. But it does not mean that the event is an impossibility. No probability can be negative, nor can it exceed one. In simple terms, if malnutrition is present in 8 percent of children in a population, the probability that a randomly picked child would have that condition is 0.08. Thus, this measures the likelihood of the event and in a way is the complement of uncertainty.
Such quantification of uncertainties has proved immensely useful in effective management of health conditions, both at individual level as well as at community level. Knowing that the probability of developing coronary artery disease in senior executives is, say, 3 times higher than in clerks provides us a scientific basis to give appropriate advice or to institute an intervention at individual level, and to plan and execute preventive measures to combat the problem in the target group. If the analysis of records shows that 90 percent of a large number of patients with abdominal tuberculosis (TB) came with complaints of pain in the abdomen, vomiting and constipation of long duration, then
P(pain, vomiting, constipation / abdominal TB) = 0.90
Such probabilities, which are restricted to a specific group, are called conditional probabilities. The illustrations given above are some biological and health-specific examples of probability.
EXPERIMENTAL DESIGN
Definition of Terms
Randomization: This is the allocation of treatments to units such that the probability that a particular treatment will be allocated to a particular unit is the same for all treatments. That is, both the allocation of the experimental material and the order in which the individual trials of the experiment are to be performed are randomly determined. Statistical methods require that the observations (or errors) be independently and identically distributed random variables, and randomization makes this assumption valid. Thus, randomization removes bias and allows the application of probability concepts.
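Randomization as defined above can be sketched as shuffling a list of treatment labels and pairing them with the experimental units. The function name `randomize_treatments`, the equal-replication restriction and the fixed seed (used only so the run is reproducible) are assumptions of this illustration, not part of the notes.

```python
import random
from collections import Counter


def randomize_treatments(units, treatments, seed=None):
    """Completely randomized allocation with equal replication.

    Every unit is equally likely to receive each treatment, and each
    treatment is applied to the same number of units.
    """
    units = list(units)
    treatments = list(treatments)
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    replications = len(units) // len(treatments)
    labels = treatments * replications
    random.Random(seed).shuffle(labels)   # random order of treatment labels
    return dict(zip(units, labels))


# 15 experimental units, five treatments A-E, three replications each.
allocation = randomize_treatments(range(1, 16), "ABCDE", seed=1)
replication_counts = Counter(allocation.values())
print(replication_counts)
```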
Replication: This is a complete repetition of the basic experiment. It provides an estimate of the magnitude of the experimental error and a more precise measure of treatment effects.
Reduction of Random Variation
The third basic principle is the use of techniques of experimental design for the reduction of random variation, also called local control of variability or error control. This refers to the way in which the experimental units in a particular design are balanced, blocked and grouped. Local control is necessary to increase the efficiency of the experiment. The commonly used terms are experiment, treatment, experimental unit, experimental error, grouping, blocking, factors, balancing and precision.
Experiment: It is a means of getting an answer to the question that the experimenter has in mind. This may be to decide which of several pain-relieving drugs is most effective, or whether they are equally effective.
Similarly, the effectiveness of various types of diets on the growth status of children or albino rats can be assessed. For assessing the effectiveness of the treatments, the experiment should have one group to serve as a control.
Treatment: This means the experimental conditions which are imposed on an experimental unit in a particular experiment. In a dietary or medical experiment, the different diets or medicines are the treatments. In an agricultural experiment, the different varieties of a crop or different manures will be the treatments.
Experimental unit: An experimental unit is the material to which the treatment is applied and on which the variable under study is measured. In a feeding experiment on cows or albino rats, the whole cow or albino rat is the experimental unit.
Experimental error: We usually come across variation in the measurements made on different experimental units even when they get the same treatment. A part of this variation is systematic and can be explained, whereas the remainder is taken to be of the random type. The unexplained random part of the variation is termed the experimental error.
Grouping: This is the placement of homogeneous experimental units into different groups to which separate treatments may be assigned.
Blocking: This is the assignment of the experimental units to blocks in such a manner that the units within any particular block are as homogeneous as possible.
Factors: A factor is a possible cause of response or variation. Factors include age, sex, variety, etc. It may be observed that treatments are often different combinations of the levels of one or more factors.
Balancing: This is the assignment of the treatment combinations to the experimental units in such a way that a balanced or symmetric configuration is obtained.

One-way ANOVA is used when we wish to test the equality of k population means. The procedure is based on the assumptions that each of the k groups of observations is a random sample from a normal distribution and that the population variance σ² is constant among the groups. ANOVA models provide an appropriate estimate to facilitate comparison of several means.
The statistical model for one-way classification of ANOVA is
\[ X_{ij} = \mu + \alpha_i + e_{ij}, \qquad i = 1, 2, \ldots, k; \quad j = 1, 2, \ldots, n \]

where X_ij = (ij)th observation, that is, the observation from the jth unit receiving treatment i

μ = overall or grand mean

α_i = ith treatment effect

e_ij = random error

e_ij ~ NID(0, σ²)

Notation
          Observations                 trt total   trt mean
Trt 1     X_11   X_12   ...   X_1n     X_1.        X̄_1.
Trt 2     X_21   X_22   ...   X_2n     X_2.        X̄_2.
  ⋮         ⋮      ⋮            ⋮        ⋮           ⋮
Trt k     X_k1   X_k2   ...   X_kn     X_k.        X̄_k.

where
\[ \bar{X}_{i.} = \frac{1}{n}\sum_{j=1}^{n} X_{ij}, \qquad i = 1, 2, \ldots, k \]
\[ \bar{X}_{..} = \frac{1}{k}\sum_{i=1}^{k} \bar{X}_{i.} = \frac{\sum_{i=1}^{k}\sum_{j=1}^{n} X_{ij}}{kn} \]


Sum of Squares Identity
ANOVA is the partitioning of total variability into component parts.
Total sum of squares (TSS): The TSS is defined as the sum of the squares of the deviations from the grand mean.
\[ TSS = \sum_{i=1}^{k}\sum_{j=1}^{n} (X_{ij} - \bar{X}_{..})^2 \]
It is a measure of the dispersion of all the variates about the grand mean. Its degrees of freedom (df) = kn − 1. It can be shown that the TSS, SStotal or total variation can be partitioned into two parts:
\[ TSS = \sum_{i=1}^{k}\sum_{j=1}^{n} (X_{ij} - \bar{X}_{..})^2 = \sum_{i=1}^{k}\sum_{j=1}^{n} (X_{ij} - \bar{X}_{i.})^2 + n\sum_{i=1}^{k} (\bar{X}_{i.} - \bar{X}_{..})^2 \]
That is, TSS = WSS + BSS, where the first term on the right is the within treatment (error) sum of squares and the second term is the between treatment sum of squares.
Within sum of squares (WSS): WSS, or sum of squares due to error (residual error), is defined as the sum of squared deviations of the X_ij (original observations) from their treatment means. It represents the experimental error of the given experiment. Its degrees of freedom is k(n−1) and it is also denoted by SSE.
Between sum of squares (BSS): It is defined as the sum of squared deviations of the treatment means about the grand mean, multiplied by n. The less the samples differ from each other, the smaller the BSS or treatment sum of squares (SSTr). Its degrees of freedom is k − 1.
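For easy computation, we can use the following:
\[ TSS = \sum_{i=1}^{k}\sum_{j=1}^{n} (X_{ij} - \bar{X}_{..})^2 = \sum_{i=1}^{k}\sum_{j=1}^{n} X_{ij}^2 - \frac{T^2}{nk} \]
where
\[ T^2 = \Big[\sum_{i=1}^{k}\sum_{j=1}^{n} X_{ij}\Big]^2 \]
\[ X_{i.} = \sum_{j=1}^{n} X_{ij}, \qquad \bar{X}_{i.} = \frac{X_{i.}}{n} \]
\[ X_{..} = \sum_{i=1}^{k}\sum_{j=1}^{n} X_{ij}, \qquad \bar{X}_{..} = \frac{X_{..}}{N} \]
where N = nk = total number of observations.
\[ BSS = n\sum_{i=1}^{k} (\bar{X}_{i.} - \bar{X}_{..})^2 = \frac{1}{n}\sum_{i=1}^{k} T_{i.}^2 - \frac{T^2}{nk} \]
where T_i. = sum of observations in treatment group i.
WSS = TSS – BSS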
One-way ANOVA Table (Equal Observations)

Given the model X_ij = μ + α_i + e_ij, to test the hypothesis
H0: α1 = α2 = … = αk
H1: at least two α_i's are not equal.
Test statistic: F from the ANOVA table below
ANOVA Table
Source                SS     df        MS                    F
Between treatments    BSS    k−1       A = BSS/(k−1)         A/B
Within treatments     WSS    k(n−1)    B = WSS/[k(n−1)]
Total                 TSS    kn−1

The critical value is F_{(1−α), v1, v2}, where the degrees of freedom are v1 = k−1 and v2 = k(n−1), and α is the significance level.
Example 8.1
Given the following five treatments A, B, C, D and E with three observations each, perform an analysis of variance to test whether the treatment effects are the same or not at the 5% level of significance, and compute the coefficient of variation to determine the precision.
A   B   C   D   E
3   5   7   6   4
2   8   8   8   9
4   8   6   7   5

Solution:
Test of Hypothesis
H0: α1 = α2 = … = α5
H1: at least two α_i's are not equal.
              A   B    C    D    E
              3   5    7    6    4
              2   8    8    8    9
              4   8    6    7    5
T_i. (total)  9   21   21   21   18
X̄_i. (mean)   3   7    7    7    6

The grand total is T = 90 and the grand mean is X̄.. = 90/15 = 6.
\[ TSS = \sum\sum (X_{ij} - \bar{X}_{..})^2 = (3-6)^2 + (2-6)^2 + (4-6)^2 + \ldots + (9-6)^2 + (5-6)^2 = 62 \]
\[ BSS = n\sum (\bar{X}_{i.} - \bar{X}_{..})^2 = 3\,[(3-6)^2 + (7-6)^2 + (7-6)^2 + (7-6)^2 + (6-6)^2] \]
= 3[9 + 1 + 1 + 1 + 0] = 36
WSS = TSS – BSS
= 62 – 36 = 26
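Another Computational Formula or Method
\[ TSS = \sum_{i=1}^{k}\sum_{j=1}^{n} X_{ij}^2 - \frac{T^2}{nk}, \qquad T = \text{grand total} \]
\[ = (3^2 + 2^2 + 4^2 + \ldots + 4^2 + 9^2 + 5^2) - \frac{(90)^2}{3 \times 5} = 602 - 540 = 62 \]
\[ BSS = \frac{1}{n}\sum_{i=1}^{k} T_{i.}^2 - \frac{T^2}{nk} \]
\[ = \frac{1}{3}\,[9^2 + 21^2 + 21^2 + 21^2 + 18^2] - \frac{(90)^2}{3 \times 5} = 576 - 540 = 36 \]
WSS = TSS – BSS = 62 – 36 = 26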
ANOVA TABLE
Source of Variation    SS    df    MS            F
Between treatments     36    4     36/4 = 9      9/2.6 = 3.462
Within treatments      26    10    26/10 = 2.6
Total                  62    14

Test statistic = Fcal = 3.462
Critical value = F(1−α), 4, 10 = F0.95, 4, 10 = 3.48
Decision: Since Fcal = 3.462 < Ftab = 3.48, we accept H0 and conclude that the treatment mean effects in the five treatments are equal, that is, there is no significant difference between the treatment means in the five treatments at the 5% level.
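The computational formulas used in Example 8.1 translate directly into code. The sketch below handles only the balanced case (equal observations per treatment); the function name `one_way_anova` is illustrative. The example asks for a coefficient of variation but the worked solution does not compute one, so the code assumes the usual definition, 100 × √(within mean square) / grand mean, which is not stated in the notes. The critical value 3.48 is taken from the F table, not computed.

```python
from math import sqrt


def one_way_anova(groups):
    """One-way ANOVA for k treatments with n observations each (balanced)."""
    k = len(groups)
    n = len(groups[0])
    if any(len(g) != n for g in groups):
        raise ValueError("this sketch assumes equal observations per treatment")
    grand_total = sum(sum(g) for g in groups)             # T
    correction = grand_total ** 2 / (n * k)               # T^2 / nk
    tss = sum(x * x for g in groups for x in g) - correction
    bss = sum(sum(g) ** 2 for g in groups) / n - correction
    wss = tss - bss
    ms_between = bss / (k - 1)
    ms_within = wss / (k * (n - 1))
    grand_mean = grand_total / (n * k)
    return {
        "TSS": tss, "BSS": bss, "WSS": wss,
        "F": ms_between / ms_within,
        # Assumed definition of the coefficient of variation, in percent.
        "CV": 100 * sqrt(ms_within) / grand_mean,
    }


treatments = [[3, 2, 4], [5, 8, 8], [7, 8, 6], [6, 8, 7], [4, 9, 5]]  # A-E
anova = one_way_anova(treatments)
f_critical = 3.48  # F(0.95; 4, 10) read from tables
reject_h0 = anova["F"] > f_critical

print(round(anova["F"], 3), reject_h0)  # 3.462 False
```

Under the assumed definition, the coefficient of variation for these data comes out at roughly 27%.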

Statistical Hypothesis
The most frequent application of statistics is to test scientific hypotheses. Results of experiments and investigations are usually not clear cut and, therefore, need statistical tests to support decisions between alternative hypotheses. A statistical test examines a set of sample data and, on the basis of an expected distribution of the data, leads to a decision on whether to accept the hypothesis or to reject that hypothesis and accept an alternative one. The nature of the tests varies with the data and the hypothesis, but the same general philosophy of hypothesis testing is common to all tests. A statistical hypothesis is an assumption or statement, which may or may not be true, concerning one or more populations.
A statistical hypothesis is a statement about the parameters or form of a population. A test of a statistical hypothesis is a rule which specifies for what sample results the hypothesis is to be accepted or rejected. The hypothesis which is to be tested is generally called the null hypothesis, denoted by H0, and the hypothesis against which it is to be tested is called the alternative hypothesis, denoted by H1.
Type I and Type II Errors
A type I error has been committed if we reject the null hypothesis when it is true, and a type II error has been committed if we accept the null hypothesis when it is false.
The following table summarizes the various situations that can arise when testing H0 against H1:
               Accept H0        Accept H1
H0 is true     No error         Type I error
H1 is true     Type II error    No error

The probabilities of committing type I and type II errors are written as α and β, respectively. α is called the size (or level of significance) of the test and (1 − β) is called the power of the test; (1 − β) is the probability of rejecting the null hypothesis (H0) when it is false. The region such that we reject H0 if the sample point falls in it is called the critical region. When the primary concern of a test is to see whether the null hypothesis can be rejected, such a test is called a test of significance. In that case, the quantity α is called the level of significance at which the test is being conducted.
One and Two Tailed Tests
A test of any statistical hypothesis where the alternative is one-sided, such as:
H0: μ = μ0     or     H0: μ = μ0
H1: μ > μ0            H1: μ < μ0
is called a one-tailed test. The critical region for H1: μ > μ0 lies entirely in the right tail, while the critical region for H1: μ < μ0 lies entirely in the left tail.
A test of any statistical hypothesis where the alternative is two-sided, such as:
H0: μ = μ0
H1: μ ≠ μ0
is called a two-tailed test; values in both tails of the distribution constitute the critical region.
TEST PROCEDURE AND STEPS
The general steps involved in the use of any test of significance are:
i. Identify the type of problem and the question to be answered.
ii. State the null hypothesis (H0) and the appropriate alternative hypothesis (H1).
iii. Select the appropriate test and calculate the test criterion based on the type of test.
iv. Fix the level of significance.
v. Decide, on the basis of the value of the test criterion, whether to reject or accept the hypothesis.
vi. Draw the conclusion (or inference), on the basis of the level of significance, as to whether the difference observed is due to chance or due to some other known factors.
‘P’ Values
‘P’ values are used to assess the degree of dissimilarity between two or more sets of measurements, or between one set of measurements and a standard. The ‘P’ value is actually a probability, usually the probability of obtaining a result as extreme as or more extreme than the one observed if the dissimilarity is entirely due to variation in measurements or in subject response, that is, if it is the result of chance alone.
‘P’ values measure the strength of evidence in scientific studies by indicating the probability that a result at least as extreme as the one observed would occur by chance.
‘P’ values are derived from statistical tests that depend on the size and direction of the effect. ‘P’ values should be considered in making decisions about the usefulness of a treatment.
One popular approach is to indicate only that the ‘P’ value is smaller than 0.05 (P < 0.05) or smaller than 0.01 (P < 0.01). When the ‘P’ value is between 0.05 and 0.01, the result is usually called statistically significant; when it is less than 0.01 or 0.005, the result is taken to be very highly significant.
TESTS CONCERNING THE MEAN (FOR LARGE SAMPLES)
We will assume that the sampling distribution of the sample estimates will be approximately normal and that the variance is known. Hence, for large samples (n ≥ 30), we can use the normal probability distribution for testing a hypothesized value of the population mean.
The test statistic is
\[ Z = \frac{\bar{X} - \mu}{S.E.(\bar{X})} \]
where
X̄ is the sample mean
μ is the population mean
S.E.(X̄) is the standard error of the sample mean,
\[ S.E.(\bar{X}) = \frac{\sigma}{\sqrt{n}} \]
where
σ is the population standard deviation (usually known)
n is the sample size.
We then compare the modulus of Z, that is, |Z|, to its tabulated value at the given level of significance, usually 5% or 1%. The corresponding values of Z for both one-tailed and two-tailed tests are tabulated below:
               One-tailed   Two-tailed
5% (or 0.05)   1.64         1.96
1% (or 0.01)   2.33         2.58
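The tabulated Z values can be reproduced from the standard normal distribution. A minimal sketch using `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the function name `z_critical` is illustrative.

```python
from statistics import NormalDist


def z_critical(alpha, two_tailed=False):
    """Critical value of Z for significance level alpha.

    A two-tailed test splits alpha equally between the two tails.
    """
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


for alpha in (0.05, 0.01):
    one_tailed = round(z_critical(alpha), 2)
    two_tailed = round(z_critical(alpha, two_tailed=True), 2)
    print(alpha, one_tailed, two_tailed)
# 0.05 1.64 1.96
# 0.01 2.33 2.58
```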

Decision
i. If |Z| calculated is less than Z tabulated, then there is no reason to reject the null hypothesis H0.
ii. If |Z| calculated is more than Z tabulated, then we reject the null hypothesis H0 and accept the alternative hypothesis H1.
Example

A bottling company which bottles a soft drink claims that the liquid content is 35cl with standard deviation 0.75cl. A researcher randomly collects 50 bottles, measures their contents and gets a mean of 34.2cl. Test at the 0.01 level of significance the claim that the bottling company has been cheating its consumers.

Solution
μ = 35cl
σ = 0.75cl
n = 50
X̄ = 34.2
α = 0.01 (1%)

H0: μ = 35, that is, the company has not been cheating the consumers.
H1: μ < 35, that is, the company has been cheating the consumers.
The test statistic is
\[ Z = \frac{(\bar{X} - \mu)\sqrt{n}}{\sigma} = \frac{(34.2 - 35)\sqrt{50}}{0.75} = \frac{-0.8 \times 7.0711}{0.75} = -7.54 \]
Thus, |Z| = |−7.54| = 7.54
At the 0.01 level of significance the Z tabulated value (one-tailed) is 2.33.
Decision: The Z calculated value 7.54 is greater than the Z tabulated value 2.33. We reject H0 and accept H1.
Conclusion: There is a significant difference between the population and sample means. Hence, the bottling company has been cheating its consumers.
Example

The mean height from a random sample of size 100 is 64cm. The standard deviation is known to be 3cm. Test the statement that the mean height of the population is 67cm at the 5% level of significance.

Solution
X̄ = 64cm
σ = 3cm
μ = 67cm
n = 100
α = 0.05 level of significance

H0: μ = 67cm
H1: μ ≠ 67cm
Test statistic
\[ Z = \frac{(\bar{X} - \mu)\sqrt{n}}{\sigma} = \frac{(64 - 67)\sqrt{100}}{3} = -10 \]
Thus, |Z| = |−10| = 10
At the 0.05 level of significance the Z tabulated value (two-tailed) is 1.96.
Decision: Since |Zcal| > Ztab, we reject H0 and accept H1.
Conclusion: The data do not support the statement that the mean height of the population is 67cm.
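Both large-sample examples follow the same recipe: compute Z, look up the critical value and compare. The sketch below packages that recipe; the function name `z_test` and the `alternative` labels ("less", "greater", "two-sided") are illustrative choices, and the critical value is computed with `statistics.NormalDist` rather than read from the table.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu0, sigma, n, alpha, alternative="two-sided"):
    """Large-sample test of H0: mu = mu0 with known sigma.

    Returns (z, critical_value, reject_h0).
    """
    z = (sample_mean - mu0) * sqrt(n) / sigma
    if alternative == "two-sided":
        critical = NormalDist().inv_cdf(1 - alpha / 2)
        reject = abs(z) > critical
    elif alternative == "less":          # H1: mu < mu0, left tail
        critical = NormalDist().inv_cdf(1 - alpha)
        reject = z < -critical
    elif alternative == "greater":       # H1: mu > mu0, right tail
        critical = NormalDist().inv_cdf(1 - alpha)
        reject = z > critical
    else:
        raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")
    return z, critical, reject


# Bottling company: H1: mu < 35 at the 1% level.
z_bottles, crit_bottles, reject_bottles = z_test(34.2, 35, 0.75, 50, 0.01, "less")
# Mean height: H1: mu != 67 at the 5% level.
z_height, crit_height, reject_height = z_test(64, 67, 3, 100, 0.05, "two-sided")

print(round(z_bottles, 2), reject_bottles)  # -7.54 True
print(round(z_height, 2), reject_height)    # -10.0 True
```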
TESTS CONCERNING THE MEAN (SMALL SAMPLES)

There are situations in real-life experiments, such as testing the efficiency of a newly produced drug, where it is impracticable to get a large sample and yet tests of significance still have to be carried out. When we do not know the value of the population standard deviation and the sample size is small (n < 30), we shall assume again that the population we are sampling from has roughly the shape of a normal distribution. The test statistic is:
\[ t = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{(\bar{X} - \mu)\sqrt{n}}{S} \]
whose sampling distribution is the t distribution with n − 1 degrees of freedom. S is the sample standard deviation. As with large samples, we compare the statistic with its tabulated value at a given level of significance, and then draw our conclusions.
Example

Suppose that we want to test, on the basis of a random sample of size n = 5, whether or not the fat content of a certain kind of ice cream exceeds 12 percent. What can we conclude about the null hypothesis μ = 12 percent at the 0.01 level of significance, if the sample has mean X̄ = 12.7 percent and standard deviation S = 0.38 percent?
Solution
Hypothesis
H0: μ = 12%
H1: μ > 12%
α = 0.01
n = 5
d.f. = n – 1 = 4, so the critical value is t0.01, 4
Test statistic
\[ t = \frac{\bar{X} - \mu}{S/\sqrt{n}} = \frac{12.7 - 12}{0.38/\sqrt{5}} = \frac{0.7}{0.1699} = 4.12 \]
From the t table, t0.01, 4 = 3.747
Decision: Since tcal = 4.12 > ttab = 3.747, we reject H0.
Conclusion: Therefore, the fat content of the given kind of ice cream exceeds 12 percent.

Example
The lifetimes of electric bulbs for a random sample of 10 from a large consignment give the following data:
Item   Life in 1,000 hrs (X)   X − X̄    (X − X̄)²
1      4.2                     −0.1     0.01
2      4.0                     −0.3     0.09
3      3.9                     −0.4     0.16
4      4.1                     −0.2     0.04
5      5.2                      0.9     0.81
6      3.8                     −0.5     0.25
7      3.9                     −0.4     0.16
8      4.3                      0       0
9      4.4                      0.1     0.01
10     5.6                      1.3     1.69

Can we accept the hypothesis that the average lifetime of the bulbs is 4,000 hours at the 5% level of significance?
Solution:
Hypothesis
H0: μ = 4,000 hours
H1: μ ≠ 4,000 hours
α = 0.05 level of significance

Since n = 10, d.f. = n − 1 = 10 – 1 = 9, and the critical value is
\[ t_{\alpha/2,\,(n-1)} = t_{0.025,\,9} \]
\[ \bar{X} = \frac{\sum_{i=1}^{n} X_i}{n} = \frac{4.2 + 4.0 + \ldots + 5.6}{10} = \frac{43.4}{10} \approx 4.3 \]
(rounded to one decimal place, as used in the table above).
\[ S^2 = \frac{\sum_{i=1}^{10} (X_i - \bar{X})^2}{n-1} = \frac{(4.2 - 4.3)^2 + (4.0 - 4.3)^2 + \ldots + (4.4 - 4.3)^2 + (5.6 - 4.3)^2}{10 - 1} \]
\[ = \frac{0.01 + 0.09 + \ldots + 0.01 + 1.69}{9} = \frac{3.22}{9} = 0.358 \]
Test statistic (with lifetimes in units of 1,000 hours, so μ = 4)
\[ t = \frac{(\bar{X} - \mu)\sqrt{n}}{S} = \frac{(4.3 - 4)\sqrt{10}}{0.598} = 1.587 \]
where S = √0.358 = 0.598.
From the t table, t0.025, 9 = 2.262
Decision: Reject H0 if |tcal| > ttab.
Conclusion: Since tcal = 1.587 < ttab = 2.262, we accept H0 and conclude that the average lifetime is 4,000 hours.
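The two small-sample examples can be re-computed as follows. This is a sketch only: the function name `t_statistic` is illustrative, and the critical values 3.747 and 2.262 are typed in from a t table because the Python standard library has no t distribution. Note that the code uses the unrounded sample mean of the bulb data (4.34), whereas the worked solution rounds it to 4.3, so the statistic comes out near 1.80 instead of 1.587; the decision is the same.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(sample_mean, mu0, s, n):
    """t = (X-bar - mu0) * sqrt(n) / S, with n - 1 degrees of freedom."""
    return (sample_mean - mu0) * sqrt(n) / s


# Ice cream: H1: mu > 12, alpha = 0.01, d.f. = 4.
t_ice_cream = t_statistic(12.7, 12, 0.38, 5)
reject_ice_cream = t_ice_cream > 3.747          # t(0.01, 4) from tables

# Bulbs: H1: mu != 4 (thousand hours), alpha = 0.05, d.f. = 9.
lifetimes = [4.2, 4.0, 3.9, 4.1, 5.2, 3.8, 3.9, 4.3, 4.4, 5.6]
t_bulbs = t_statistic(mean(lifetimes), 4.0, stdev(lifetimes), len(lifetimes))
reject_bulbs = abs(t_bulbs) > 2.262             # t(0.025, 9) from tables

print(round(t_ice_cream, 2), reject_ice_cream)  # 4.12 True
print(reject_bulbs)                             # False
```

`statistics.stdev` divides by n − 1, which matches the definition of S² used in the notes.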
