
Statistics For Management

Unit 8 Estimation

Structure
8.1 Introduction
    Objectives
8.2 Reasons why estimates have to be made
8.3 Making Statistical Inference
8.4 Types of Estimates
8.5 Criteria of a Good Estimator
    8.5.1 Unbiasedness
    8.5.2 Efficiency
    8.5.3 Consistency
    8.5.4 Sufficiency
8.6 Point Estimates
8.7 Interval Estimates
8.8 Interval Estimates and Confidence Intervals
    Self Assessment Questions 1
8.9 Determining the Sample Size in Estimation
8.10 Summary
    Terminal Questions
    Answer to SAQs and TQs

8.1 Introduction
Everyone makes estimates. When you are ready to cross a street, you estimate the speed of any car that is approaching, the distance between you and that car, and your own speed. Having made these quick estimates, you decide whether to wait, walk, or run.

Learning Objectives
In this unit students will learn about:

1. Types of estimates
2. Criteria for a good estimator
3. Interval estimates and confidence intervals

8.2 Reasons why estimates have to be made
All managers must make quick estimates too. The outcome of these estimates can affect their organizations as seriously as the outcome of your decision as to whether to cross the street. Credit managers estimate whether a purchaser will eventually pay his bills.

Sikkim Manipal University


Prospective home buyers make estimates concerning the behaviour of interest rates in the mortgage market. All these people make estimates without worrying about whether they are scientific, but with the hope that the estimates bear a reasonable resemblance to the outcome. Managers use estimates because, in all but the most trivial decisions, they must make rational decisions without complete information and with a great deal of uncertainty about what the future will bring. As educated citizens and professionals, you will be able to make more useful estimates by applying the techniques described in this and subsequent chapters.

8.3 Making statistical inference

Statistical inference is based on estimation and hypothesis testing. In both estimation and hypothesis testing, we shall be making inferences about characteristics of populations from information contained in samples.

Here we try to estimate with reasonable accuracy the population proportion (the proportion of the population that possesses a given characteristic) and the population mean. To calculate the exact proportion or the exact mean would be an impossible goal. Even so, we will be able to make an estimate, and implement some controls to avoid as much of the error as possible.

8.4 Types of estimates
There are two types of estimates about a population:

1. A point estimate
2. An interval estimate

8.4.1 A point estimate is a single number that is used to estimate an unknown population parameter. A point estimate is often insufficient because it is either right or wrong, and we do not know how wrong it is. Therefore, a point estimate is much more useful if it is accompanied by an estimate of the error that might be involved.

8.4.2 An interval estimate is a range of values used to estimate a population parameter. It indicates the error in two ways: by the extent of its range and by the probability of the true population parameter lying within that range.

8.5 Criteria of a good estimator

8.5.1 Unbiasedness: This is a desirable property for a good estimator to have. The term unbiasedness refers to the fact that a sample mean is an unbiased estimator of a population mean because the mean of the sampling distribution of sample means taken from the same population is equal to the population mean itself. We can say that a statistic is an unbiased estimator if, on average, it tends to assume values that are above the population parameter being estimated as frequently and to the same extent as it tends to assume values that are below the population parameter being estimated.

8.5.2 Efficiency: Another desirable property of a good estimator is that it be efficient. Efficiency refers to the size of the standard error of the statistic. If we compare two statistics from a sample of the same size and try to decide which one is the more efficient estimator, we would pick the statistic that has the smaller standard error. Suppose we choose a sample of a given size and must decide whether to use the sample mean or the sample median to estimate the population mean. If we calculate the standard error of the sample mean and find it to be 1.05 and then calculate the standard error of the sample median and find it to be 1.6, we would say that the sample mean is a more efficient estimator of the population mean because its standard error is smaller. It makes sense that an estimator with a smaller standard error (with less variation) will have more chance of producing an estimate nearer to the population parameter under consideration.

8.5.3 Consistency: A statistic is a consistent estimator of a population parameter if, as the sample size increases, it becomes almost certain that the value of the statistic comes very close to the value of the population parameter. If an estimator is consistent, it becomes more reliable with large samples.

8.5.4 Sufficiency: An estimator is sufficient if it makes so much use of the information in the sample that no other estimator could extract from the sample additional information about the population parameter being estimated.
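The efficiency criterion can be illustrated with a small simulation. The sketch below is an illustration only: the normal population (mean 100, standard deviation 10), the sample size of 25, the number of repetitions, the seed, and the helper name `simulated_standard_errors` are all assumptions chosen for the example and do not come from this unit.

```python
import random
import statistics


def simulated_standard_errors(n=25, repetitions=2000, seed=1):
    """Approximate the standard errors of the sample mean and the sample
    median by drawing many samples from an assumed normal population."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means, medians = [], []
    for _ in range(repetitions):
        sample = [rng.gauss(100, 10) for _ in range(n)]
        means.append(statistics.mean(sample))
        medians.append(statistics.median(sample))
    # The standard deviation of a statistic across repeated samples
    # approximates the standard error of that statistic.
    return statistics.stdev(means), statistics.stdev(medians)


se_mean, se_median = simulated_standard_errors()
print(f"standard error of the sample mean:   {se_mean:.3f}")
print(f"standard error of the sample median: {se_median:.3f}")
```

Under these assumptions the sample mean shows the smaller standard error, which is what the efficiency criterion uses to prefer it over the sample median.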

8.6 Point estimates

Results of a sample of 35 boxes of bolts (bolts per box)

101  103  100  100   98   97   93
105  112   97  110  106  110   97
 97  102  107  106  100  102   99
 93   98   93  110  112   98  100
114   97   94  103  105  112   99

Consider the table above. We have taken a sample of 35 boxes of bolts from a manufacturing line and have counted the bolts per box. We can estimate the population mean, i.e. the mean number of bolts per box, by taking the mean for the 35 boxes we have sampled, i.e. by adding all the bolts and dividing by the number of boxes:

\[ \bar{x} = \frac{\sum x}{n} = \frac{3570}{35} = 102 \]


Thus, using the sample mean \(\bar{x}\) as the estimator, we have a point estimate of the population mean.

Similarly, we can use the sample variance \(s^2\) to estimate the population variance, where the sample variance \(s^2\) is given by the formula

\[ s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \]
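As a minimal sketch, the two point estimates can be computed directly from the 35 bolt counts listed in this section. The variable names are illustrative choices; the data and the two formulas are the ones given above.

```python
import statistics

# Bolts per box for the 35 sampled boxes (values from the table in this section).
bolts = [
    101, 105, 97, 93, 114, 103, 112, 102, 98, 97,
    100, 97, 107, 93, 94, 100, 110, 106, 110, 103,
    98, 106, 100, 112, 105, 97, 110, 102, 98, 112,
    93, 97, 99, 100, 99,
]

n = len(bolts)

# Point estimate of the population mean: sum of observations divided by n.
x_bar = sum(bolts) / n

# Point estimate of the population variance: squared deviations divided by n - 1.
s_squared = sum((x - x_bar) ** 2 for x in bolts) / (n - 1)

print("n =", n)
print("sample mean =", x_bar)
print("sample variance =", round(s_squared, 3))

# The standard library's sample variance uses the same n - 1 divisor.
print("statistics.variance =", round(statistics.variance(bolts), 3))
```

The standard library call is included only as a cross-check on the hand-written formula.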

8.7 Interval Estimates

The purpose of gathering samples is to learn more about a population. We can compute this information from the sample data as either point estimates or interval estimates. An interval estimate describes a range of values within which a population parameter is likely to lie.

The marketing research director needs an estimate of the average life in months of the car batteries his company manufactures. We select a random sample of 200 batteries with a mean life of 36 months. If we use the point estimate of the sample mean \(\bar{x}\) as the best estimator of the population mean \(\mu\), we would report that the mean life of the company's batteries is 36 months.

The director also asks for a statement about the uncertainty that will be likely to accompany this estimate, that is, a statement about the range within which the unknown population mean is likely to lie. To provide such a statement, we need to find the standard error of the mean.

If we select and plot a large number of sample means from a population, the distribution of these means will approximate a normal curve. Furthermore, the mean of the sample means will be the same as the population mean. Our sample size of 200 is large enough that we can apply the central limit theorem. Suppose we have already estimated the standard deviation of the population of the batteries and reported that it is 10 months. Using this standard deviation we can calculate the standard error of the mean using the formula

\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \]

We find the standard error to be

\[ \sigma_{\bar{x}} = \frac{10}{\sqrt{200}} = 0.707 \text{ months} \]

Making the interval estimate:
We can tell the director that our estimate of the life of the company's batteries is 36 months, and the standard error that accompanies this estimate is 0.707. In other words, the actual mean life for all the batteries may lie somewhere in the interval estimate of 35.293 to 36.707 months. This is helpful but insufficient information for the director. Next, we need to calculate the chance that the actual life will lie in this interval or in other intervals of different widths that we might choose, \(\pm 2\sigma_{\bar{x}}\) (2 x 0.707), \(\pm 3\sigma_{\bar{x}}\) (3 x 0.707), and so on.


The probability is 0.955 that the mean of a sample of size 200 will be within \(\pm 2\) standard errors of the population mean. Stated differently, 95.5 percent of all the sample means are within \(\pm 2\) standard errors from \(\mu\). The population mean will be located within \(\pm 2\) standard errors from the sample mean 95.5 percent of the time.

Hence from the above example we can now report to the director that the best estimate of the life of the company's batteries is 36 months, and we are 68.3 percent confident that the life lies in the interval from 35.293 to 36.707 months (\(36 \pm 1\sigma_{\bar{x}}\)). Similarly, we are 95.5 percent confident that the life falls within the interval of 34.586 to 37.414 months (\(36 \pm 2\sigma_{\bar{x}}\)), and we are 99.7 percent confident that battery life falls within the interval of 33.879 to 38.121 months (\(36 \pm 3\sigma_{\bar{x}}\)).
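The three battery-life intervals can be reproduced with a few lines of code. This is a minimal sketch using the figures from the example (sample mean 36 months, population standard deviation 10 months, sample size 200); the variable names are illustrative.

```python
import math

sample_mean = 36.0   # mean life in months of the 200 sampled batteries
sigma = 10.0         # population standard deviation in months
n = 200              # sample size

# Standard error of the mean: sigma / sqrt(n), about 0.707 months.
standard_error = sigma / math.sqrt(n)
print(f"standard error = {standard_error:.3f} months")

# Confidence levels associated with 1, 2 and 3 standard errors.
for k, level in [(1, 68.3), (2, 95.5), (3, 99.7)]:
    lower = sample_mean - k * standard_error
    upper = sample_mean + k * standard_error
    print(f"{level} percent confident: {lower:.3f} to {upper:.3f} months")
```

Rounded to three decimals, the limits agree with the intervals quoted in the text.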

8.8 Interval Estimates and confidence intervals
In using interval estimates, we are not confined to \(\pm 1\), 2 and 3 standard errors. For example, \(\pm 1.64\) standard errors includes about 90 percent of the area under the curve; it includes 0.4495 of the area on either side of the mean in a normal distribution. Similarly, \(\pm 2.58\) standard errors includes about 99 percent of the area, or 49.51 percent on each side of the mean.

The probability that we associate with an interval estimate is called the confidence level. This probability indicates how confident we are that the interval estimate will include the population parameter. A higher probability means more confidence. In estimation, the most commonly used confidence levels are 90 percent, 95 percent, and 99 percent, but we are free to apply any confidence level.

The confidence interval is the range of the estimate we are making. If we report that we are 90 percent confident that the mean of the population of incomes of people in a certain community will lie between Rs. 8,000 and Rs. 24,000, then the range Rs. 8,000 to Rs. 24,000 is our confidence interval. Often, however, we will express the confidence interval in standard errors rather than in numerical values. Thus, we will often express confidence intervals like this: \(\bar{x} \pm 1.64\sigma_{\bar{x}}\), where

\(\bar{x} + 1.64\sigma_{\bar{x}}\) = upper limit of the confidence interval
\(\bar{x} - 1.64\sigma_{\bar{x}}\) = lower limit of the confidence interval

Thus, confidence limits are the upper and lower limits of the confidence interval. In this case, \(\bar{x} + 1.64\sigma_{\bar{x}}\) is called the upper confidence limit (UCL) and \(\bar{x} - 1.64\sigma_{\bar{x}}\) is the lower confidence limit (LCL).

Calculating interval estimates of the mean from large samples


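If the sample is large relative to the population, we use the finite population multiplier to calculate the standard error. This is given from the previous unit as

\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \times \sqrt{\frac{N - n}{N - 1}} \quad \text{if } \frac{n}{N} > 0.05 \]

where \(N\) is the population size and \(n\) is the sample size.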
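A minimal sketch of this rule follows. The function name `standard_error_of_mean` and its parameters are hypothetical, introduced only for illustration; the figures in the usage lines (population of 540, sample of 60, standard deviation 1.368) are those of Self Assessment Question 4 in this unit.

```python
import math


def standard_error_of_mean(sigma, n, population_size=None):
    """Standard error of the mean, sigma / sqrt(n).

    The finite population multiplier sqrt((N - n) / (N - 1)) is applied
    only when the population size N is given and n / N exceeds 0.05.
    """
    se = sigma / math.sqrt(n)
    if population_size is not None and n / population_size > 0.05:
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return se


# Sample of 60 from a population of 540: n / N is about 0.11, so the
# multiplier applies.
se_finite = standard_error_of_mean(1.368, 60, population_size=540)
print(round(se_finite, 3))  # 0.167

# Same sample treated as coming from a very large population.
se_infinite = standard_error_of_mean(1.368, 60)
print(round(se_infinite, 3))  # 0.177
```

The multiplier always lies below 1, so it reduces the standard error when the sample is a sizeable fraction of the population.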

Calculating interval estimates of the proportion from large samples

Statisticians often use a sample to estimate a proportion of occurrences in a population. For example, the government estimates by a sampling procedure the unemployment rate, or the proportion of unemployed people, in the country's workforce.

We know that, for a binomial distribution, the mean and the standard deviation are

Mean \(\mu = np\)
Standard deviation \(\sigma = \sqrt{npq}\), where \(q = 1 - p\)

Here
n = number of trials
p = probability of success
q = probability of failure = 1 - p

Since we are taking the mean of the sample to be the mean of the population, we actually mean that

\[ \mu_{\bar{p}} = p \]

Similarly, we can modify the formula for the standard deviation of the binomial distribution, \(\sqrt{npq}\), which measures the standard deviation in the number of successes. To change the number of successes to the proportion of successes, we divide \(\sqrt{npq}\) by \(n\) and get \(\sqrt{pq/n}\). Therefore the standard error of the proportion is

\[ \sigma_{\bar{p}} = \sqrt{\frac{pq}{n}} \]

Example: In a very large organization the director wanted to find out what proportion of the employees prefer to provide their own retirement benefits in lieu of a company-sponsored plan. A simple random sample of 75 employees was taken, and it was found that 40%, i.e. 0.4, of them are interested in providing their own retirement plans. The management requests that we use this sample to find an interval about which they can be 99 percent confident that it contains the true population proportion.

Here n = 75, p = 0.4 and q = 1 - p = 1 - 0.4 = 0.6. Therefore the standard error of the proportion is

\[ \sigma_{\bar{p}} = \sqrt{\frac{pq}{n}} = \sqrt{\frac{0.4 \times 0.6}{75}} = 0.057 \]

Therefore the interval estimate for the 99% level of confidence is \(0.4 \pm 2.58(0.057)\), which gives 0.253 and 0.547.
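As a minimal sketch, the retirement-plan interval can be checked with the figures of the example (n = 75, p = 0.4, z = 2.58). The variable names are illustrative. The unrounded limits come out near 0.254 and 0.546; the small difference from 0.253 and 0.547 arises because the worked example rounds the standard error to 0.057 before multiplying.

```python
import math

n = 75        # sample size
p = 0.4       # sample proportion preferring their own retirement plan
q = 1 - p     # complementary proportion
z = 2.58      # z value used in the text for 99 percent confidence

# Standard error of the proportion: sqrt(p * q / n).
se_p = math.sqrt(p * q / n)

lower = p - z * se_p
upper = p + z * se_p

print(f"standard error of the proportion = {se_p:.3f}")
print(f"99 percent interval: {lower:.3f} to {upper:.3f}")
```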


Therefore the proportion of the total population of employees who wish to establish their own retirement plans lies between 0.253 and 0.547.

Interval estimates using the Student's t distribution

So far, the sample sizes we were examining were all larger than 30. This is not always the case. How can we handle estimates where the normal distribution is not the appropriate sampling distribution, that is, when we are estimating the population standard deviation and the sample size is 30 or less? Suppose we have data for only, let us say, 10 weeks, or sample sizes less than 30. Fortunately, another distribution exists that is appropriate in these cases. It is called the t distribution.

Early theoretical work on t distributions was done by a man named W. S. Gosset in the early 1900s. Gosset was employed by the Guinness Brewery in Dublin, Ireland, which did not permit employees to publish research findings under their own names. So Gosset adopted the pen name Student and published under that name. Consequently, the t distribution is commonly called Student's t distribution, or simply Student's distribution.

Conditions for usage:
Because it is used when the sample size is 30 or less, statisticians often associate the t distribution with small sample statistics. This is misleading because the size of the sample is only one of the conditions that lead us to use the t distribution. The second condition is that the population standard deviation must be unknown. Use of the t distribution for estimating is required whenever the sample size is 30 or less and the population standard deviation is not known. Furthermore, in using the t distribution, we assume that the population is normal or approximately normal.

Degrees of freedom
There is a different t distribution for each of the possible degrees of freedom. What are degrees of freedom? We can define them as the number of values we can choose freely. We will use degrees of freedom when we select a t distribution to estimate a population mean, and we will use n - 1 degrees of freedom, where n is the sample size. For example, if we use a sample of 20 to estimate a population mean, we will use 19 degrees of freedom in order to select the appropriate t distribution.

With two sample values, we have one degree of freedom (2 - 1 = 1), and with seven sample values, we have six degrees of freedom (7 - 1 = 6). In each of these two examples, then, we had n - 1 degrees of freedom, assuming n is the sample size. Similarly, a sample of 23 would give us 22 degrees of freedom.

Using the t distribution table


Comparison between t and z tables

The table of t distribution values differs in construction from the z table, or normal distribution table, used previously. The t table is more compact and shows areas and t values for only a few percentages (10, 5, 2, and 1 percent). Because there is a different t distribution for each number of degrees of freedom, a more complete table would be quite lengthy, although we can conceive of the need for one.

A second difference in the t table is that it does not focus on the chance that the population parameter being estimated will fall within our confidence interval. Instead, it measures the chance that the population parameter we are estimating will not be within our confidence interval (that is, that it will lie outside it). If we are making an estimate at the 90 percent confidence level, we would look in the t table under the 0.10 column (100 percent - 90 percent = 10 percent). This 0.10 chance of error is symbolized by the Greek letter alpha, \(\alpha\). We would find the appropriate t values for confidence intervals of 95 percent, 98 percent, and 99 percent under the columns headed 0.05, 0.02, and 0.01, respectively.

A third difference in using the t table is that we must specify the degrees of freedom with which we are dealing. Suppose we make an estimate at the 90 percent confidence level with a sample size of 14, which is 13 degrees of freedom. Look under the 0.10 column until you encounter the row labelled 13. Like a z value, the t value there of 1.771 shows that if we mark off plus and minus \(1.771\hat{\sigma}_{\bar{x}}\) (estimated standard errors of \(\bar{x}\)) on either side of the mean, the area under the curve between these two limits will be 90 percent, and the area outside these limits (the chance of error) will be 10 percent.

Remember that in any estimation problem in which the sample size is 30 or less, the standard deviation of the population is unknown, and the underlying population can be assumed to be normal or approximately normal, we use the t distribution.
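The steps above can be sketched in code. The t value of 1.771 (0.10 column, 13 degrees of freedom) is the one quoted in the text; the sample mean of 50 and sample standard deviation of 4 are hypothetical figures invented for illustration, as is the function name `t_interval`. The t value is passed in as read from a printed table rather than computed.

```python
import math


def t_interval(sample_mean, sample_sd, n, t_value):
    """Interval estimate x_bar +/- t * (s / sqrt(n)).

    t_value is read from a t table using n - 1 degrees of freedom and the
    column for the chosen chance of error (alpha).
    """
    estimated_se = sample_sd / math.sqrt(n)  # estimated standard error of the mean
    margin = t_value * estimated_se
    return sample_mean - margin, sample_mean + margin


n = 14
degrees_of_freedom = n - 1   # 13, the table row to use
t_value = 1.771              # from the text: 0.10 column, 13 degrees of freedom

# Hypothetical sample results, not taken from this unit.
lower, upper = t_interval(sample_mean=50.0, sample_sd=4.0, n=n, t_value=t_value)

print("degrees of freedom =", degrees_of_freedom)
print(f"90 percent interval: {lower:.2f} to {upper:.2f}")
```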

Self Assessment Questions 1

1. Pizza Hut has developed quite a business in Bangalore by delivering pizza orders promptly. It guarantees that its pizzas will be delivered in 30 minutes or less from the time the order was placed, and if the delivery is late, the pizza is free. The time that it takes to deliver each pizza order that is on time is recorded in the Pizza Time Book (PTB), and the delivery time for those pizzas that are delivered late is recorded as 30 minutes in the PTB. A sample of 12 random entries from the PTB is listed below:

   15.3 10.8 29.5 12.2 30 14.8 10.1 30 30 22.1 19.6 18.3

   a. Find the mean for the sample.
   b. From what population was this sample drawn?
   c. Can this sample be used to estimate the average time that it takes for Pizza Hut to deliver a pizza? Explain.

2. Madhu, a frugal student, wants to buy a used bike. After randomly selecting 125 wanted advertisements, he found the average price of the bike to be Rs. 3250 with a standard deviation of Rs. 615.

   a. Establish an interval estimate for the average price of the bike so that Madhu can be 68.3% certain that the population mean lies in this interval.
   b. Establish an interval estimate for the average price of the bike so that Madhu can be 95.5% certain that the population mean lies in this interval.

3. Given the following confidence levels, express the lower and upper limits of the confidence interval for these levels in terms of \(\bar{x}\) and \(\sigma_{\bar{x}}\). (Use the normal distribution tables.)

   a. 54 percent
   b. 75 percent
   c. 94 percent
   d. 98 percent

4. From a population of 540, a sample of 60 individuals is taken. From this sample the mean is found to be 6.2 and the standard deviation (SD) to be 1.368.

   a. Find the estimated standard error of the mean.
   b. Construct a 96% confidence interval for the mean.

5. For the following sample sizes and confidence levels, find the appropriate t values for constructing confidence intervals (use the t table).

   a) n = 28, 95%
   b) n = 8, 98%
   c) n = 13, 90%
   d) n = 25, 95%

8.9 Determining the Sample Size in Estimation
In all the examples we have used above, the sample size was known. Now we are trying to estimate the sample size n. If it is too small, we may fail to achieve the objective; if it is too large, we will be wasting resources. Let us examine some of the methods that are useful in determining what sample size is necessary for any specified level of precision.

Comparison of two ways of expressing the same confidence limits

     Lower confidence limit               Upper confidence limit
a.   \(\bar{x} - 500\)                    \(\bar{x} + 500\)
b.   \(\bar{x} - z\sigma_{\bar{x}}\)      \(\bar{x} + z\sigma_{\bar{x}}\)
c.   \(\bar{x} - t\sigma_{\bar{x}}\)      \(\bar{x} + t\sigma_{\bar{x}}\)


IIM wants to conduct a survey of the annual earnings of its graduates in international placements. It knows from past experience that the standard deviation of the annual earnings of its population of students is $1500. How large a sample size should be taken in order to estimate the mean annual earnings of last year's class within $500 at the 95% level of confidence?

The problem states that a variation of $500 on either side of the population mean is acceptable. That means \(z\sigma_{\bar{x}} = 500\).

At the 95% level of confidence we know from the z table that z = 1.96. Therefore \(1.96\sigma_{\bar{x}} = 500\), and that means \(\sigma_{\bar{x}} = 500/1.96 = 255\).

Now if the standard error of the mean is 255, that leads us to \(\sigma_{\bar{x}} = \sigma/\sqrt{n} = 255\). Since \(\sigma = 1500\) we can find n, that is,

\[ \frac{1500}{\sqrt{n}} = 255, \quad \text{therefore } n = \left(\frac{1500}{255}\right)^2 = 34.6 \]

This means n should be greater than 34.6, that is, at least 35, if the institute wants to estimate the mean with the precision it has specified for the survey.
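The same calculation can be written as a single formula, \(n = (z\sigma/E)^2\), rounded up to the next whole number, where \(E\) is the acceptable variation on either side of the mean. The sketch below uses the figures of the IIM example; the function name `required_sample_size` and the symbol \(E\) are introduced here for illustration only.

```python
import math


def required_sample_size(sigma, margin_of_error, z):
    """Smallest whole sample size n for which z * sigma / sqrt(n) does not
    exceed margin_of_error."""
    return math.ceil((z * sigma / margin_of_error) ** 2)


# sigma = 1500, estimate wanted within 500, z = 1.96 for 95 percent confidence.
n_required = required_sample_size(sigma=1500, margin_of_error=500, z=1.96)
print(n_required)  # 35
```

Without intermediate rounding the formula gives about 34.57 rather than 34.6, and both round up to a sample of 35.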

8.10 Summary
In this unit we have seen point estimates and interval estimates. These are the foundation for inferential statistics in estimation and hypothesis testing, which we will be discussing in the next unit. We have also seen the concept of confidence levels and how to make estimates when the sample sizes are small and when they are large. We have also worked in reverse to determine a sample size, provided we know the level of accuracy we want for the estimate. Finally, we have seen that if the sample size is 30 or less and the population standard deviation is not known, we use the Student's t distribution for estimation.

Terminal Questions

1. ICICI is determining the number of tellers available during the Friday lunch rush hour. The bank has collected data on the number of people who entered the bank during the past 3 months on Fridays from 11 am to 1 pm. Using the data below, find the point estimates of the mean and SD of the population from which the sample was drawn.

   242 275 289 306 342 385 279 245 269 305
   294 328

2. From a population known to have an SD of 1.4, a sample of 60 individuals is taken. The mean of this sample is found to be 6.2.

   a. Find the standard error of the mean.
   b. Establish an interval estimate around the sample mean using one SD of the mean.


3. On collecting a sample of 250 from a population with a known SD of 13.7, the mean is found to be 112.4.

   a. Find a 95% confidence interval for the mean.
   b. Find a 99% confidence interval for the mean.

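Answers to Self Assessment Questions

1. a) 20.225 minutes
   b) PTB
   c) No, as any time over 30 minutes is recorded as 30, and hence the sample will underestimate the delivery time.

2. \(\sigma = 615\), \(n = 125\), \(\bar{x} = 3250\), and the standard error is calculated as

   \[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{615}{\sqrt{125}} = 55.01 \]

   a) \(\bar{x} \pm 1\sigma_{\bar{x}} = 3250 \pm 55.01\), giving 3194.99 and 3305.01, to be 68.3% certain.
   b) 95.5% certain means \(\bar{x} \pm 2\sigma_{\bar{x}} = 3250 \pm 110.02\), giving a range between 3139.98 and 3360.02.

3. a. \(\bar{x} \pm 0.74\sigma_{\bar{x}}\)
   b. \(\bar{x} \pm 1.15\sigma_{\bar{x}}\)
   c. \(\bar{x} \pm 1.88\sigma_{\bar{x}}\)
   d. \(\bar{x} \pm 2.33\sigma_{\bar{x}}\)

4. a. Given: \(\sigma = 1.368\), \(N = 540\), \(n = 60\), \(\bar{x} = 6.2\). As \(n/N > 0.05\),

   \[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \times \sqrt{\frac{N - n}{N - 1}} = \frac{1.368}{\sqrt{60}} \times \sqrt{\frac{540 - 60}{540 - 1}} = 0.167 \]

   b. \(\bar{x} \pm 2.05\sigma_{\bar{x}} = 6.2 \pm 2.05(0.167)\), giving 5.86 and 6.54 as the LCL and UCL respectively.

5. a. 2.052
   b. 2.998
   c. 1.782
   d. 2.262

Answers to Terminal Questions

1. mean = 296.583 people, s = 40.751 people
2. a. 0.181
   b. 6.019, 6.381
3. a. \(112.4 \pm 1.697\)
   b. \(112.4 \pm 2.234\)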
