SCHAUM'S
outlines
Second Edition
Updated examples with the most current U.S. and world data
Two complete self examinations
New chapter on Time Series Econometrics
Perfect for pre-test review
Use with these courses: Statistics and Econometrics, Statistical Methods in Economics, Quantitative Methods in Economics, Mathematical Economics, Micro-Economics, Macro-Economics, Math for Economists, Math for Social Sciences
Theory and Problems of
STATISTICS AND
ECONOMETRICS
SECOND EDITION
DOMINICK SALVATORE, Ph.D.
Professor and Chairperson, Department of Economics, Fordham University
DERRICK REAGLE, Ph.D.
Assistant Professor of Economics, Fordham University
Schaum's Outline Series
McGraw-Hill
Copyright © 2002 by The McGraw-Hill Companies, Inc. All rights reserved. Manufactured in the United States of America. Except as permitted under the United States Copyright Act of 1976, no part of this publication may be reproduced or distributed in any form or by any means, or stored in a database or retrieval system, without the prior written permission of the publisher.
0-07-139568-7
The material in this eBook also appears in the print version of this title: 0-07-134852-2.
All trademarks are trademarks of their respective owners. Rather than put a trademark symbol after every occurrence of a trademarked name, we use names in an editorial fashion only, and to the benefit of the trademark owner, with no intention of infringement of the trademark. Where such designations appear in this book, they have been printed with initial caps.
McGraw-Hill eBooks are available at special quantity discounts to use as premiums and sales promotions, or for use in corporate training programs. For more information, please contact George Hoare, Special Sales, at george_hoare@mcgraw-hill.com or (212) 904-4069.
TERMS OF USE
This is a copyrighted work and The McGraw-Hill Companies, Inc. ("McGraw-Hill") and its licensors reserve all rights in and to the work. Use of this work is subject to these terms. Except as permitted under the Copyright Act of 1976 and the right to store and retrieve one copy of the work, you may not decompile, disassemble, reverse engineer, reproduce, modify, create derivative works based upon, transmit, distribute, disseminate, sell, publish or sublicense the work or any part of it without McGraw-Hill's prior consent. You may use the work for your own noncommercial and personal use; any other use of the work is strictly prohibited. Your right to use the work may be terminated if you fail to comply with these terms.
THE WORK IS PROVIDED "AS IS". McGRAW-HILL AND ITS LICENSORS MAKE NO GUARANTEES OR WARRANTIES AS TO THE ACCURACY, ADEQUACY OR COMPLETENESS OF OR RESULTS TO BE OBTAINED FROM USING THE WORK, INCLUDING ANY INFORMATION THAT CAN BE ACCESSED THROUGH THE WORK VIA HYPERLINK OR OTHERWISE, AND EXPRESSLY DISCLAIM ANY WARRANTY, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. McGraw-Hill and its licensors do not warrant or guarantee that the functions contained in the work will meet your requirements or that its operation will be uninterrupted or error free. Neither McGraw-Hill nor its licensors shall be liable to you or anyone else for any inaccuracy, error or omission, regardless of cause, in the work or for any damages resulting therefrom. McGraw-Hill has no responsibility for the content of any information accessed through the work. Under no circumstances shall McGraw-Hill and/or its licensors be liable for any indirect, incidental, special, punitive, consequential or similar damages that result from the use of or inability to use the work, even if any of them has been advised of the possibility of such damages. This limitation of liability shall apply to any claim or cause whatsoever whether such claim or cause arises in contract, tort or otherwise.
DOI: 10.1036/0071395687

This book presents a clear and concise introduction to statistics and econometrics. A course in statistics or econometrics is often one of the most useful but also one of the most difficult of the required courses in colleges and universities. The purpose of this book is to help overcome this difficulty by using a problem-solving approach.

Each chapter begins with a statement of theory, principles, or background information, fully illustrated with examples. This is followed by numerous theoretical and practical problems with detailed, step-by-step solutions. While primarily intended as a supplement to all current standard textbooks of statistics and/or econometrics, the book can also be used as an independent text, as well as to supplement class lectures.

The book is aimed at college students in economics, business administration, and the social sciences taking a one-semester or a one-year course in statistics and/or econometrics. It also provides a very useful source of reference for M.A. and M.B.A. students and for all those who use (or would like to use) statistics and econometrics in their work. No prior statistical background is assumed.

The book is completely self-contained in that it covers the statistics (Chaps. 1 to 5) required for econometrics (Chaps. 6 to 11). It is applied in nature, and all proofs appear in the problems section rather than in the text itself. Real-world socioeconomic and business data are used, whenever possible, to demonstrate the more advanced econometric techniques and models. Several sources of online data are used, and Web addresses are given for the student's and researcher's further use (App. 12). Topics frequently encountered in econometrics, such as multicollinearity and autocorrelation, are clearly and concisely discussed as to the problems they create, the methods to test for their presence, and possible correction techniques. In this second edition, we have expanded the computer applications to provide a general introduction to data handling, and specific programming instruction to perform all estimations in this book by computer (Chap. 12) using Microsoft Excel, Eviews, or SAS statistical packages. We have also added sections on nonparametric testing, matrix notation, binary choice models, and a chapter on time series analysis (Chap. 11), a field of econometrics which has expanded as of late. A sample statistics and econometrics examination is also included.

The methodology of this book and much of its content has been tested in undergraduate and graduate classes in statistics and econometrics at Fordham University. Students found the approach and content of the book extremely useful and made many valuable suggestions for improvement. We have also received very useful advice from Professors Mary Beth Combs, Edward Dowling, and Damodar Gujarati. The following students carefully read through the entire manuscript and made many useful comments: Luca Bonardi, Kevin Coughlin, Sean Hennessy, and James Santangelo. To all of them we are deeply grateful. We owe a great intellectual debt to our former professors of statistics and econometrics: J. S. Butler, Jack Johnston, Lawrence Klein, and Bernard Okun.

We are indebted to the Literary Executor of the late Sir Ronald A. Fisher, F.R.S., to Dr. Frank Yates, F.R.S., and the Longman Group Ltd., London, for permission to adapt and reprint Tables III and IV from their book, Statistical Tables for Biological, Agricultural and Medical Research.

In addition to Statistics and Econometrics, the Schaum's Outline Series in Economics includes Microeconomic Theory, Macroeconomic Theory, International Economics, Mathematics for Economists, and Principles of Economics.

Dominick Salvatore
Derrick Reagle
New York, 2001
CONTENTS

CHAPTER 1 Introduction
1.1 The Nature of Statistics
1.2 Statistics and Econometrics
1.3 The Methodology of Econometrics

CHAPTER 2 Descriptive Statistics
2.1 Frequency Distributions
2.2 Measures of Central Tendency
2.3 Measures of Dispersion
2.4 Shape of Frequency Distributions

CHAPTER 3 Probability and Probability Distributions
3.1 Probability of a Single Event
3.2 Probability of Multiple Events
3.3 Discrete Probability Distributions: The Binomial Distribution
3.4 The Poisson Distribution
3.5 Continuous Probability Distributions: The Normal Distribution

CHAPTER 4 Statistical Inference: Estimation
4.1 Sampling
4.2 Sampling Distribution of the Mean
4.3 Estimation Using the Normal Distribution
4.4 Confidence Intervals for the Mean Using the t Distribution

CHAPTER 5 Statistical Inference: Testing Hypotheses
5.1 Testing Hypotheses
5.2 Testing Hypotheses about the Population Mean and Proportion
5.3 Testing Hypotheses for Differences between Two Means or Proportions
5.4 Chi-Square Test of Goodness of Fit and Independence
5.5 Analysis of Variance
5.6 Nonparametric Testing

STATISTICS EXAMINATION

CHAPTER 6 Simple Regression Analysis
6.1 The Two-Variable Linear Model
6.2 The Ordinary Least-Squares Method
6.3 Tests of Significance of Parameter Estimates
6.4 Test of Goodness of Fit and Correlation
6.5 Properties of Ordinary Least-Squares Estimators

CHAPTER 7 Multiple Regression Analysis
7.1 The Three-Variable Linear Model
7.2 Tests of Significance of Parameter Estimates
7.3 The Coefficient of Multiple Determination
7.4 Test of the Overall Significance of the Regression
7.5 Partial-Correlation Coefficients
7.6 Matrix Notation

CHAPTER 8 Further Techniques and Applications in Regression Analysis
8.1 Functional Form
8.2 Dummy Variables
8.3 Distributed Lag Models
8.4 Forecasting
8.5 Binary Choice Models
8.6 Interpretation of Binary Choice Models

CHAPTER 9 Problems in Regression Analysis
9.1 Multicollinearity
9.2 Heteroscedasticity
9.3 Autocorrelation
9.4 Errors in Variables

CHAPTER 10 Simultaneous-Equations Methods
10.1 Simultaneous-Equations Models
10.2 Identification
10.3 Estimation: Indirect Least Squares
10.4 Estimation: Two-Stage Least Squares

CHAPTER 11 Time-Series Methods
11.1
11.2
11.3
11.4 Testing for Unit Root
11.5 Cointegration and Error Correction
11.6 Causality

CHAPTER 12 Computer Applications in Econometrics
12.1 Data Formats
12.2 Microsoft Excel
12.3 Eviews
12.4 SAS

ECONOMETRICS EXAMINATION

Appendix 1 Binomial Distribution
Appendix 2 Poisson Distribution
Appendix 3 Standard Normal Distribution
Appendix 4 Table of Random Numbers
Appendix 5 Student's t Distribution
Appendix 6 Chi-Square Distribution
Appendix 7 F Distribution
Appendix 8 Durbin-Watson Statistic
Appendix 9 Wilcoxon
Appendix 10 Kolmogorov-Smirnov Critical Values
Appendix 11 ADF Critical Values
Appendix 12 Data Sources on the Web

INDEX

CHAPTER 1 Introduction
1.1 THE NATURE OF STATISTICS
Statistics refers to the collection, presentation, analysis, and utilization of numerical data to make inferences and reach decisions in the face of uncertainty in economics, business, and other social and physical sciences.
Statistics is subdivided into descriptive and inferential. Descriptive statistics is concerned with summarizing and describing a body of data. Inferential statistics is the process of reaching generalizations about the whole (called the population) by examining a portion (called the sample). In order for this to be valid, the sample must be representative of the population and the probability of error also must be specified.
Descriptive statistics is discussed in detail in Chap. 2. This is followed by (the more crucial) statistical inference: Chap. 3 deals with probability, Chap. 4 with estimation, and Chap. 5 with hypothesis testing.
EXAMPLE 1. Suppose that we have data on the incomes of 1000 U.S. families. This body of data can be summarized by finding the average family income and the spread of these family incomes above and below the average. The data also can be described by constructing a table, chart, or graph of the number or proportion of families in each income class. This is descriptive statistics. If these 1000 families are representative of all U.S. families, we can then estimate and test hypotheses about the average family income in the United States as a whole. Since these conclusions are subject to error, we also would have to indicate the probability of error. This is statistical inference.
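The distinction in Example 1 can be sketched in a few lines of Python. The incomes below are synthetic (drawn from a lognormal distribution with arbitrarily chosen parameters), and the 95% interval relies on the large-sample normal approximation, which is an assumption of this sketch rather than something Example 1 specifies.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical incomes (in dollars) for 1000 families: synthetic, not real data.
incomes = [random.lognormvariate(10.8, 0.5) for _ in range(1000)]

# Descriptive statistics: summarize the sample itself.
mean_income = statistics.mean(incomes)
spread = statistics.stdev(incomes)  # sample standard deviation

# Inferential statistics: estimate the population mean and state the
# probability of error (5%), using the large-sample normal approximation.
z = 1.96  # two-sided 95% critical value of the standard normal
std_error = spread / len(incomes) ** 0.5
ci_low = mean_income - z * std_error
ci_high = mean_income + z * std_error

print(f"sample mean = {mean_income:,.0f}, sample std dev = {spread:,.0f}")
print(f"95% interval for the population mean: ({ci_low:,.0f}, {ci_high:,.0f})")
```

The first two numbers describe only the 1000 families in hand; the interval is a statement about all families, and it is meaningful only if the sample is representative.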
1.2 STATISTICS AND ECONOMETRICS
Econometrics refers to the application of economic theory, mathematics, and statistical techniques for the purpose of testing hypotheses and estimating and forecasting economic phenomena. Econometrics has become strongly identified with regression analysis. This relates a dependent variable to one or more independent or explanatory variables. Since relationships among economic variables are generally inexact, a disturbance or error term (with well-defined probabilistic properties) must be included (see Prob. 1.8).
Chapters 6 and 7 deal with regression analysis; Chap. 8 extends the basic regression model; Chap. 9 deals with methods of testing and correcting for violations in the assumptions of the basic regression model; and Chaps. 10 and 11 deal with two specific areas of econometrics, specifically simultaneous-equations and time-series methods. Thus Chaps. 1 to 5 deal with the statistics required for econometrics (Chaps. 6 to 11). Chapter 12 is concerned with using the computer to aid in the calculations involved in the previous chapters.
EXAMPLE 2. Consumption theory tells us that, in general, people increase their consumption expenditure $C$ as their disposable (after-tax) income $Y_d$ increases, but not by as much as the increase in their disposable income. This can be stated in explicit linear equation form as

$$C = b_0 + b_1 Y_d \tag{1.1}$$

where $b_0$ and $b_1$ are unknown constants called parameters. The parameter $b_1$ is the slope coefficient representing the marginal propensity to consume (MPC). Since even people with identical disposable income are likely to have somewhat different consumption expenditures, the theoretically exact and deterministic relationship represented by Eq. (1.1) must be modified to include a random disturbance or error term, $u$, making it stochastic:

$$C = b_0 + b_1 Y_d + u \tag{1.2}$$
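A minimal sketch of the difference between the deterministic and stochastic forms in Example 2 follows. The parameter values (`b0 = 100`, `b1 = 0.8`) and the normally distributed disturbance with standard deviation `sigma_u` are hypothetical choices made only for illustration.

```python
import random

random.seed(1)

# Hypothetical parameter values chosen only for illustration.
b0, b1 = 100.0, 0.8   # intercept and marginal propensity to consume
sigma_u = 25.0        # assumed standard deviation of the disturbance u


def consumption_exact(yd):
    """Deterministic relationship, Eq. (1.1)."""
    return b0 + b1 * yd


def consumption_stochastic(yd):
    """Stochastic relationship, Eq. (1.2): adds a random disturbance u."""
    u = random.gauss(0.0, sigma_u)
    return consumption_exact(yd) + u


# Three families with identical disposable income spend different amounts.
yd = 1000.0
draws = [consumption_stochastic(yd) for _ in range(3)]
print("exact C:   ", consumption_exact(yd))
print("observed C:", [round(c, 1) for c in draws])
```

Every call to `consumption_exact` with the same income returns the same value, whereas `consumption_stochastic` scatters around it, which is the behavior the error term is meant to capture.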
1.3 THE METHODOLOGY OF ECONOMETRICS
Econometric research, in general, involves the following three stages:
1. Specification of the model or maintained hypothesis in explicit stochastic equation form, together with the a priori theoretical expectations about the sign and size of the parameters of the function.
2. Collection of data on the variables of the model and estimation of the coefficients of the function with appropriate econometric techniques (presented in Chaps. 6 to 8).
3. Evaluation of the estimated coefficients of the function on the basis of economic, statistical, and econometric criteria.
EXAMPLE 3. The first stage in econometric research on consumption theory is to state the theory in explicit stochastic equation form, as in Eq. (1.2), with the expectation that $b_0 > 0$ (i.e., at $Y_d = 0$, $C > 0$ as people dissave and/or borrow) and $0 < b_1 < 1$. The second stage involves the collection of data on consumption expenditure and disposable income and estimation of Eq. (1.2). The third stage in econometric research involves (1) checking to see if the estimated value of $b_0 > 0$ and if $0 < b_1 < 1$; (2) determining if a "satisfactory" proportion of the variation in $C$ is explained by changes in $Y_d$ and if $b_0$ and $b_1$ are "statistically significant at acceptable levels" [see Prob. 1.13(c) and Sec. 5.2]; and (3) testing to see if the assumptions of the basic regression model are satisfied or, if not, how to correct for violations. If the estimated relationship does not pass these tests, the hypothesized relationship must be modified and reestimated until a satisfactory estimated consumption relationship is achieved.
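The second stage and the first two checks of the third stage in Example 3 can be sketched as follows. The data are synthetic, generated from hypothetical "true" values $b_0 = 100$ and $b_1 = 0.8$; the estimator is ordinary least squares for one explanatory variable (the method of Chap. 6). The third check, on the assumptions of the regression model, is not attempted here.

```python
import random

random.seed(2)

# Stage 2 (data): synthetic observations from hypothetical parameters
# b0 = 100 and b1 = 0.8; real work would use actual data.
n = 50
yd = [random.uniform(500, 3000) for _ in range(n)]
c = [100 + 0.8 * y + random.gauss(0, 40) for y in yd]

# Stage 2 (estimation): ordinary least squares for C = b0 + b1*Yd + u.
yd_bar = sum(yd) / n
c_bar = sum(c) / n
sxx = sum((y - yd_bar) ** 2 for y in yd)
sxy = sum((y - yd_bar) * (ci - c_bar) for y, ci in zip(yd, c))
b1_hat = sxy / sxx
b0_hat = c_bar - b1_hat * yd_bar

# Stage 3, check (1): a priori economic criteria on sign and size.
signs_ok = b0_hat > 0 and 0 < b1_hat < 1

# Stage 3, check (2): statistical criteria (R^2 and t ratios).
residuals = [ci - (b0_hat + b1_hat * y) for y, ci in zip(yd, c)]
ss_res = sum(e ** 2 for e in residuals)
ss_tot = sum((ci - c_bar) ** 2 for ci in c)
r_squared = 1 - ss_res / ss_tot
s2 = ss_res / (n - 2)                       # estimated error variance
se_b1 = (s2 / sxx) ** 0.5
se_b0 = (s2 * (1 / n + yd_bar ** 2 / sxx)) ** 0.5
t_b0, t_b1 = b0_hat / se_b0, b1_hat / se_b1

print(f"b0_hat = {b0_hat:.2f} (t = {t_b0:.1f})")
print(f"b1_hat = {b1_hat:.3f} (t = {t_b1:.1f})")
print(f"R^2 = {r_squared:.3f}, a priori signs satisfied: {signs_ok}")
```

If `signs_ok` were false or the t ratios small, the procedure described in Example 3 would call for modifying the hypothesized relationship and reestimating.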
Solved Problems
THE NATURE OF STATISTICS
1.1 What is the purpose and function of (a) The field of study of statistics? (b) Descriptive statistics? (c) Inferential statistics?
(a) Statistics is the body of procedures and techniques used to collect, present, and analyze data on which to base decisions in the face of uncertainty or incomplete information. Statistical analysis is used today in practically every profession. The economist uses it to test the efficiency of alternative production techniques; the businessperson may use it to test the product design or package that maximizes sales; the sociologist to analyze the result of a drug rehabilitation program; the industrial psychologist to examine workers' responses to plant environment; the political scientist to forecast voting patterns; the physician to test the effectiveness of a new drug; the chemist to produce cheaper fertilizers; and so on.
(b) Descriptive statistics summarizes a body of data with one or two pieces of information that characterize the whole data. It also refers to the presentation of a body of data in the form of tables, charts, graphs, and other forms of graphic display.
(c) Inferential statistics (both estimation and hypothesis testing) refers to the drawing of generalizations about the properties of the whole (called a population) from the specific or a sample drawn from the population. Inferential statistics thus involves inductive reasoning. (This is to be contrasted with deductive reasoning, which ascribes properties to the specific starting with the whole.)
1.2 (a) Are descriptive or inferential statistics more important today? (b) What is the importance of a representative sample in statistical inference? (c) Why is probability theory required?
(a) Statistics started as a purely descriptive science, but it grew into a powerful tool of decision making as its inferential branch was developed. Modern statistical analysis refers primarily to inferential or inductive statistics. However, deductive and inductive statistics are complementary. We must study how to generate samples from populations before we can learn to generalize from samples to populations.
(b) In order for statistical inference to be valid, it must be based on a sample that fully reflects the characteristics and properties of the population from which it is drawn. A representative sample is ensured by random sampling, whereby each element of the population has an equal chance of being included in the sample (see Sec. 4.1).
(c) Since the possibility of error exists in statistical inference, estimates or tests of a population property or characteristic are given together with the chance or probability of being wrong. Thus probability theory is an essential element in statistical inference.
1.3 How can the manager of a firm producing lightbulbs summarize and describe to a board meeting the results of testing the life of a sample of 100 lightbulbs produced by the firm?
Providing the (raw) data on the life of each in the sample of 100 lightbulbs produced by the firm would be very inconvenient and time-consuming for the board members to evaluate. Instead, the manager might summarize the data by indicating that the average life of the bulbs tested is 360 h and that 95% of the bulbs tested lasted between 320 and 400 h. By doing this, the manager is providing two pieces of information (the average life and the spread in the average life) that characterize the life of the 100 bulbs tested. The manager also might want to describe the data with a table or chart indicating the number or proportion of bulbs tested that lasted within each 10-h classification. Such a tabular or graphic representation of the data is also very useful for gaining a quick overview of the data. By summarizing and describing the data in the ways indicated, the manager is engaging in descriptive statistics. It should be noted that descriptive statistics can be used to summarize and describe any body of data, whether it is a sample (as above) or a population (when all the elements of the population are known and its characteristics can be calculated).
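The manager's summary in Prob. 1.3 could be produced along the following lines. The 100 lifetimes are synthetic, drawn from a normal distribution with mean 360 h and standard deviation 20 h so that they roughly resemble the figures quoted in the problem; those distributional choices are assumptions of this sketch.

```python
import random
import statistics
from collections import Counter

random.seed(3)

# Hypothetical lifetimes (hours) for 100 tested bulbs: synthetic data
# with mean 360 h and standard deviation 20 h, chosen to mimic Prob. 1.3.
lives = [random.gauss(360, 20) for _ in range(100)]

# Two summary figures: the average life and the spread around it.
mean_life = statistics.mean(lives)
std_life = statistics.stdev(lives)
share_320_400 = sum(320 <= x <= 400 for x in lives) / len(lives)

# Frequency table with 10-h classes (for example 350-360 h, 360-370 h).
classes = Counter(int(x // 10) * 10 for x in lives)

print(f"average life = {mean_life:.1f} h, std dev = {std_life:.1f} h")
print(f"share lasting 320 to 400 h = {share_320_400:.0%}")
for lower in sorted(classes):
    print(f"{lower:>4} to {lower + 10:<4} h: {classes[lower]:>3} bulbs")
```

The printed table is the tabular presentation mentioned in the problem; a histogram would be its graphic counterpart.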
1.4 (a) Why may the manager in Prob. 1.3 want to engage in statistical inference? (b) What would this involve and require?
(a) Quality control requires that the manager have a fairly good idea about the average life and the spread in the life of the lightbulbs produced by the firm. However, testing all the lightbulbs produced would destroy the entire output of the firm. Even when testing does not destroy the product, testing the entire output is usually prohibitively expensive and time-consuming. The usual procedure is to take a sample of the output and infer the properties and characteristics of the entire output (population) from the corresponding characteristics of a sample drawn from the population.
(b) Statistical inference requires first of all that the sample be representative of the population being sampled. If the firm produces lightbulbs in different plants, with more than one workshift, and with raw materials from different suppliers, these must be represented in the sample in the proportion in which they contribute to the total output of the firm. From the average life and spread in the life of the bulbs in the sample, the firm manager might estimate, with 95% probability of being correct and 5% probability of being wrong, the average life of all the lightbulbs produced by the firm to be between 320 and 400 h (see Sec. 4.3). Instead, the manager may use the sample information to test, with 95% probability of being correct and 5% probability of being wrong, that the average life of the population of all the bulbs produced by the firm is greater than 320 h (see Sec. 5.2). In estimating or testing the average for a population from sample information, the manager is engaging in statistical inference.
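Both uses of the sample described in Prob. 1.4(b), estimation and testing, can be sketched as below. The sample is synthetic (normal, mean 360 h, standard deviation 20 h), and the critical values 1.96 and 1.645 come from the large-sample normal approximation; these are assumptions of the sketch, not figures taken from the problem.

```python
import random
import statistics

random.seed(4)

# Hypothetical sample of 100 bulb lifetimes (hours): synthetic data.
sample = [random.gauss(360, 20) for _ in range(100)]
n = len(sample)
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)
std_error = s / n ** 0.5          # standard error of the sample mean

# Estimation: 95% confidence interval for the population mean life.
ci = (x_bar - 1.96 * std_error, x_bar + 1.96 * std_error)

# Testing: H0: mean life <= 320 h against H1: mean life > 320 h,
# with a 5% probability of wrongly rejecting H0 (one-sided test).
z_stat = (x_bar - 320) / std_error
reject_h0 = z_stat > 1.645

print(f"95% interval for mean life: ({ci[0]:.1f}, {ci[1]:.1f}) h")
print(f"z = {z_stat:.1f}, conclude mean life exceeds 320 h: {reject_h0}")
```

With data like these, the interval for the mean comes out much narrower than the spread of individual bulb lives, because the standard error divides the standard deviation by the square root of the sample size.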
STATISTICS AND ECONOMETRICS
1.5 What is meant by (a) Econometrics? (b) Regression analysis? (c) Disturbance or error term? (d) Simultaneous-equations models?
(a) Econometrics is the integration of economic theory, mathematics, and statistical techniques for the purpose of testing hypotheses about economic phenomena, estimating coefficients of economic relationships, and forecasting or predicting future values of economic variables or phenomena. Econometrics is subdivided into theoretical and applied econometrics. Theoretical econometrics refers to the methods for measurement of economic relationships in general. Applied econometrics examines the problems encountered and the findings in particular fields of economics, such as demand theory, production, investment, consumption, and other fields of applied economic research. In any case, econometrics is partly an art and partly a science, because often the intuition and good judgment of the econometrician play a crucial role.
(b) Regression analysis studies the causal relationship between one economic variable to be explained (the dependent variable) and one or more independent or explanatory variables. When there is only one independent or explanatory variable, we have simple regression. In the more usual case of more than one independent or explanatory variable, we have multiple regression.
(c) A (random) disturbance or error must be included in the exact relationships postulated by economic theory and mathematical economics in order to make them stochastic (i.e., in order to reflect the fact that in the real world, economic relationships among economic variables are inexact and somewhat erratic).
(d) Simultaneous-equations models refer to relationships among economic variables expressed with more than one equation and such that the economic variables in the various equations interact. Simultaneous-equations models are the most complex aspect of econometrics and are discussed in Chap. 10.
1.6 (a) What are the functions of econometrics? (b) What aspects of econometrics (and other social sciences) make it basically different from most physical sciences?
(a) Econometrics has basically three closely interrelated functions. The first is to test economic theories or hypotheses. For example, is consumption directly related to income? Is the quantity demanded of a commodity inversely related to its price? The second function of econometrics is to provide numerical estimates of the coefficients of economic relationships. These are essential in decision making. For example, a government policymaker needs to have an accurate estimate of the coefficient of the relationship between consumption and income in order to determine the stimulating (i.e., the multiplier) effect of a proposed tax reduction. A manager needs to know if a price reduction increases or reduces the total sales revenues of the firm and, if so, by how much. The third function of econometrics is the forecasting of events. This, too, is necessary in order for policymakers to take appropriate corrective action if the rate of unemployment or inflation is predicted to rise in the future.
(b) There are two basic differences between econometrics (and other social sciences) on one hand, and most physical sciences (such as physics) on the other. One is that (as pointed out earlier) relationships among economic variables are inexact and somewhat erratic. The second is that most economic phenomena occur contemporaneously, so that laboratory experiments cannot be conducted. These differences require special methods of analysis (such as the inclusion of a disturbance or error term with the exact relationships postulated by economic theory) and multivariate analysis (such as multiple regression analysis). The latter isolates the effect of each independent or explanatory variable on the dependent variable in the face of contemporaneous change in all explanatory variables.
1.7 In what way and for what purpose are (a) economic theory, (b) mathematics, and (c) statistical analysis combined to form the field of study of econometrics?
(a) Econometrics presupposes the existence of a body of economic theories or hypotheses requiring testing. If the variables suggested by economic theory do not provide a satisfactory explanation, the researcher may experiment with alternative formulations and variables suggested by previous tests or opposing theories. In this way, econometric research can lead to the acceptance, rejection, and reformulation of economic theories.
(b) Mathematics is used to express the verbal statements of economic theories in mathematical form, expressing an exact or deterministic functional relationship between the dependent and one or more independent or explanatory variables.
(c) Statistical analysis applies appropriate techniques to estimate the inexact and nonexperimental relationships among economic variables by utilizing relevant economic data and evaluating the results.
1.8 What justifies the inclusion of a disturbance or error term in regression analysis?
The inclusion of a (random) disturbance or error term (with well-defined probabilistic properties) is required in regression analysis for three important reasons. First, since the purpose of theory is to generalize and simplify, economic relationships usually include only the most important forces at work. This means that numerous other variables with slight and irregular effects are not included. The error term can be viewed as representing the net effect of this large number of small and irregular forces at work. Second, the inclusion of the error term can be justified in order to take into consideration the net effect of possible errors in measuring the dependent variable, or variable being explained. Finally, since human behavior usually differs in a random way under identical circumstances, the disturbance or error term can be used to capture this inherently random human behavior. This error term thus allows for individual random deviations from the exact and deterministic relationships postulated by economic theory and mathematical economics.
1.9 Consumer demand theory states that the quantity demanded of a commodity $D_X$ is a function of, or depends on, its price $P_X$, consumer's income $Y$, and the price of other (related) commodities, say, commodity $Z$ (i.e., $P_Z$). Assuming that consumers' tastes remain constant during the period of analysis, state the preceding theory in (a) specific or explicit linear form or equation and (b) in stochastic form. (c) Which are the coefficients to be estimated? What are they called?
(a) $$D_X = b_0 + b_1 P_X + b_2 Y + b_3 P_Z \tag{1.3}$$
(b) $$D_X = b_0 + b_1 P_X + b_2 Y + b_3 P_Z + u \tag{1.4}$$
(c) The coefficients to be estimated are $b_0$, $b_1$, $b_2$, and $b_3$. They are called parameters.
THE METHODOLOGY OF ECONOMETRICS
1.10 With reference to the consumer demand theory in Prob. 1.9, indicate (a) what the first step is in econometric research and (b) what the a priori theoretical expectations are of the sign and possible size of the parameters of the demand function given by Eq. (1.4).
(a) The first step in econometric analysis is to express the theory of consumer demand in stochastic equation form, as in Eq. (1.4), and indicate the a priori theoretical expectations about the sign and possibly the size of the parameters of the function.
(b) Consumer demand theory postulates that in Eq. (1.4), $b_1 < 0$ (indicating that price and quantity are inversely related), $b_2 > 0$ if the commodity is a normal good (indicating that consumers purchase more of the commodity at higher incomes), $b_3 > 0$ if $X$ and $Z$ are substitutes, and $b_3 < 0$ if $X$ and $Z$ are complements.
1.11 Indicate the second stage in econometric research (a) in general and (b) with reference to the demand function specified by Eq. (1.4).
(a) The second stage in econometric research involves the collection of data on the dependent variable and on each of the independent or explanatory variables of the model and utilizing these data for the empirical estimation of the parameters of the model. This is usually done with multiple regression analysis (discussed in Chap. 7).
(b) In order to estimate the demand function given by Eq. (1.4), data must be collected on (1) the quantity demanded of commodity $X$ by consumers, (2) the price of $X$, (3) consumer's incomes, and (4) the price of commodity $Z$ per unit of time (i.e., per day, month, or year) and over a number of days, months, or years. Data on $D_X$ are then regressed on data on $P_X$, $Y$, and $P_Z$, and estimates of parameters $b_0$, $b_1$, $b_2$, and $b_3$ obtained.
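A minimal sketch of the estimation step in Prob. 1.11(b), followed by the sign checks of Prob. 1.10(b), is given below. The observations are synthetic, generated from hypothetical parameter values in `b_true`, and the least-squares estimates are obtained by solving the normal equations with a small hand-written Gaussian elimination routine (`solve`), which is an illustrative helper rather than a method prescribed by the text.

```python
import random

random.seed(5)

# Hypothetical "true" parameters b0, b1, b2, b3 used to generate data.
b_true = [50.0, -2.0, 0.01, 1.5]

# Synthetic observations on P_X, Y, P_Z and the resulting D_X.
rows, d_x = [], []
for _ in range(40):
    p_x = random.uniform(5, 15)
    y = random.uniform(2000, 6000)
    p_z = random.uniform(3, 10)
    x = [1.0, p_x, y, p_z]            # leading 1.0 is the intercept column
    rows.append(x)
    d_x.append(sum(b * v for b, v in zip(b_true, x)) + random.gauss(0, 2))


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


# Normal equations (X'X) b = X'y give the least-squares estimates.
k = 4
xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
xty = [sum(r[i] * d for r, d in zip(rows, d_x)) for i in range(k)]
b_hat = solve(xtx, xty)

# A priori expectations: b1 < 0, b2 > 0 (normal good), b3 > 0 (substitutes).
signs_ok = b_hat[1] < 0 and b_hat[2] > 0 and b_hat[3] > 0
print("estimates:", [round(b, 4) for b in b_hat])
print("a priori signs satisfied:", signs_ok)
```

In practice a statistical package such as those covered in Chap. 12 would do this computation; the hand-rolled solver only keeps the sketch free of external dependencies.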
1.12 How does the type of data required to estimate the demand function specified by Eq. (1.4) differ from the type of data that would be required to estimate the consumption function for a group of families at one point in time?
In order to estimate the demand function given by Eq. (1.4), numerical values of the variables are required over a period of time. For example, if we want to estimate the demand function for coffee, we need the numerical value of the quantity of coffee demanded, say, per year, over a number of years, say, from 1960 to 1980. Similarly, we need data on the average price of coffee, consumers' income, and the price of, say, tea (a substitute for coffee) per year from 1960 to 1980. Data that give numerical values for the variables of a function from period to period are called time-series data. However, to estimate the consumption function for a group of families at one point in time, we need cross-sectional data (i.e., numerical values for the consumption expenditures and disposable incomes of each family in the group at a particular point in time, say, in 1982).
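The two data layouts contrasted in Prob. 1.12 might be represented as below. Every number and every field name is made up solely to show the shape of the data, and `varies_over` is a hypothetical helper introduced for this sketch.

```python
# Time-series data: one market (coffee) observed year after year.
# All values are invented placeholders, not actual statistics.
coffee_time_series = [
    {"year": 1960, "quantity": 15.1, "price_coffee": 0.74,
     "income": 2020, "price_tea": 1.05},
    {"year": 1961, "quantity": 15.4, "price_coffee": 0.72,
     "income": 2080, "price_tea": 1.07},
    {"year": 1962, "quantity": 15.2, "price_coffee": 0.76,
     "income": 2150, "price_tea": 1.06},
]

# Cross-sectional data: many families observed at one point in time.
family_cross_section = [
    {"family": "A", "consumption": 9200, "disposable_income": 10000},
    {"family": "B", "consumption": 12800, "disposable_income": 15000},
    {"family": "C", "consumption": 16900, "disposable_income": 21000},
]


def varies_over(records, key):
    """Return True when every record has a different value of `key`."""
    values = [r[key] for r in records]
    return len(set(values)) == len(values)


# The time index distinguishes time-series records; the unit identifier
# distinguishes cross-sectional records.
print(varies_over(coffee_time_series, "year"))
print(varies_over(family_cross_section, "family"))
```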
1.13 What is meant by (a) The third stage in econometric analysis? (b) A priori theoretical criteria? (c) Statistical criteria? (d) Econometric criteria? (e) The forecasting ability of the model?
(a) The third stage in econometric analysis involves the evaluation of the estimated model on the basis of the a priori theoretical criteria, the statistical and econometric criteria, and the forecasting ability of the model.
(b) The a priori economic criteria refer to the sign and size of the parameters of the model postulated by economic theory. If the estimated coefficients do not conform to those postulated, the model must be revised or rejected.
(c) The statistical criteria refer to (1) the proportion of variation in the dependent variable "explained" by changes in the independent or explanatory variables and (2) verification that the dispersion or spread of each estimated coefficient around the true parameter is sufficiently narrow to give us "confidence" in the estimates.
(d) The econometric criteria refer to tests that the assumptions of the basic regression model, and particularly those about the disturbance or error term, are satisfied.
[...]
1.14 [...]
(a) [...] $b_2 > 0$ (if $X$ is a normal good), and if $b_3 > 0$ (if $Z$ is a substitute for $X$), as postulated by demand theory.
(b) The statistical criteria are satisfied only if a "high" proportion of the variation in $D_X$ over time is "explained" by changes in $P_X$, $Y$, and $P_Z$, and if the dispersion of the estimated $b_1$, $b_2$, and $b_3$ around the true parameters is "sufficiently narrow." There is no generally accepted answer as to what is a "high" proportion of the variation in $D_X$ "explained" by $P_X$, $Y$, and $P_Z$. However, because of common trends in time-series data, we would expect more than 50 to 70% of the variation in the dependent variable to be explained by the independent or explanatory variables for the model to be judged satisfactory. Similarly, in order for each estimated coefficient to be "statistically significant," we would expect the dispersion of each estimated coefficient about the true parameter (measured by its standard deviation; see Sec. 2.3) to be generally less than half the estimated value of the coefficient.
(c) The econometric criteria are used to determine if the assumptions of the econometric methods used are satisfied in the estimation of the demand function of Eq. (1.4). Only if these assumptions are satisfied will the estimated coefficients have the desirable properties of unbiasedness, consistency, efficiency, and so forth (see Sec. 6.5).
(d) One way to test the forecasting ability of the demand model given by Eq. (1.4) is to use the estimated function to predict the value of $D_X$ for a period not included in the sample and to check that this predicted value is "sufficiently close" to the actual observed value of $D_X$ for that period.
1.15 Stages of econometric research
Stage 1: Economic theory; mathematical model; econometric (stochastic) model
Stage 2: Collection of appropriate data; estimation of the parameters of the model
Stage 3: Evaluation of the model on the basis of economic, statistical, and econometric criteria
Outcomes: accept theory if compatible with data (then prediction); reject theory if incompatible with data;
revise theory if incompatible with data (then confrontation of revised theory with new data)
Supplementary Problems
THE NATURE OF STATISTICS
1.16 (a) To which fields of study is statistical analysis important? (b) What are the most important functions of
descriptive statistics? (c) What is the most important function of inferential statistics?
Ans. (a) To economics, business, and other social and physical sciences (b) Summarizing and describing
a body of data (c) Drawing inferences about the characteristics of a population from the corresponding
characteristics of a sample drawn from the population
1.17 (a) Is statistical inference associated with deductive or inductive reasoning? (b) What are the conditions
required in order for statistical inference to be valid?
Ans. (a) Inductive reasoning (b) A representative sample and probability theory
STATISTICS AND ECONOMETRICS
1.18 Express in the form of an explicit linear equation the statement that the level of investment spending I is
inversely related to the rate of interest R.
Ans. $I = b_0 + b_1 R$ with $b_1$ postulated to be negative (1.5)
1.19 What is the answer to Prob. 1.18 an example of?
Ans. An economic theory expressed in (exact or deterministic) mathematical form
1.20 Express Eq. (1.5) in stochastic form.
Ans. $I = b_0 + b_1 R + u$ (1.6)
1.21 Why is a stochastic form required in econometric analysis?
Ans. Because the relationships among economic variables are inexact and somewhat erratic as opposed to
the exact and deterministic relationships postulated by economic theory and mathematical economics
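The contrast between Eqs. (1.5) and (1.6) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the parameter values, the normally distributed disturbance, and the use of ordinary least squares to recover the slope are illustrative choices that the problems above do not specify.

```python
import random

# Hypothetical parameter values, chosen only for illustration.
B0, B1 = 20.0, -2.0          # b1 < 0: investment falls as the interest rate rises
rng = random.Random(0)       # fixed seed so the sketch is reproducible

rates = [2 + 0.2 * i for i in range(50)]            # interest rate R, in percent
exact = [B0 + B1 * r for r in rates]                # deterministic form, Eq. (1.5)
stochastic = [i + rng.gauss(0, 1) for i in exact]   # stochastic form, Eq. (1.6): adds u


def ols_intercept_slope(x, y):
    """Least-squares intercept and slope for a single explanatory variable."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


b0_hat, b1_hat = ols_intercept_slope(rates, stochastic)
print(round(b0_hat, 2), round(b1_hat, 2))   # estimates scatter around 20 and -2
```

Fitting the exact series returns the assumed parameters exactly, while the stochastic series only returns estimates close to them, which is why the sign and size of the estimates have to be evaluated in the third stage.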
THE METHODOLOGY OF ECONOMETRICS
1.22 What are stages (a) one, (b) two, and (c) three in econometric research?
Ans. (a) Specification of the theory in stochastic equation form and indication of the expected signs and
possible sizes of estimated parameters (b) Collection of data on the variables of the model and estimation
of the coefficients of the function (c) Economic, statistical, and econometric evaluation of the estimated
parameters
1.23 What is the first stage of econometric analysis for the investment theory in Prob. 1.18?
Ans. Stating the theory in the form of Eq. (1.6) and predicting $b_1 < 0$
1.24 What is the second stage in econometric analysis for the investment theory in Prob. 1.18?
Ans. Collection of time-series data on I and R and estimation of Eq. (1.6)
1.25 What is the third stage of econometric analysis for the investment theory in Prob. 1.18?
Ans. Determination that the estimated coefficient $b_1 < 0$, that an "adequate" proportion of the variation
in I over time is "explained" by changes in R, that $b_1$ is "statistically significant at customary levels," and
that the econometric assumptions of the model are satisfied
Descriptive Statistics
2.1 FREQUENCY DISTRIBUTIONS
It is often useful to organize or arrange a body of data into a frequency distribution. This breaks up
the data into groups or classes and shows the number of observations in each class. The number of
classes is usually between 5 and 15. A relative frequency distribution is obtained by dividing the number
of observations in each class by the total number of observations in the data as a whole. The sum of the
relative frequencies equals 1. A histogram is a bar graph of a frequency distribution, where classes are
measured along the horizontal axis and frequencies along the vertical axis. A frequency polygon is a line
graph of a frequency distribution resulting from joining the frequency of each class plotted at the class
midpoint. A cumulative frequency distribution shows, for each class, the total number of observations in
all classes up to and including that class. When plotted, this gives a distribution curve, or ogive.
EXAMPLE 1. A student received the following grades (measured from 0 to 10) on the 10 quizzes he took during a
semester: 6, 7, 6, 8, 5, 7, 6, 9, 10, and 6. These grades can be arranged into frequency distributions as in Table 2.1
and shown graphically as in Fig. 2-1.
Table 2.1 Frequency Distributions of Grades
Grades    Absolute Frequency    Relative Frequency
5         1                     0.1
6         4                     0.4
7         2                     0.2
8         1                     0.1
9         1                     0.1
10        1                     0.1
          10                    1.0
Fig. 2-1
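The tabulation in Table 2.1 can be reproduced mechanically. The sketch below is one possible way to do it in Python using the standard-library `collections.Counter`; the variable names are illustrative and not taken from the text.

```python
from collections import Counter

grades = [6, 7, 6, 8, 5, 7, 6, 9, 10, 6]   # quiz grades from Example 1

absolute = Counter(grades)                  # grade -> number of occurrences
n = len(grades)
relative = {g: absolute[g] / n for g in sorted(absolute)}   # shares that sum to 1

for g in sorted(absolute):
    print(g, absolute[g], relative[g])
```

Each printed row corresponds to one row of the table: the grade, its absolute frequency, and its relative frequency.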
EXAMPLE 2. The cans in a sample of 20 cans of fruit contain net weights of fruit ranging from 19.3 to 20.9 oz, as
given in Table 2.2. If we want to group these data into 6 classes, we get class intervals of 0.3 oz
[(21.0 - 19.2)/6 = 0.3 oz]. The weights given in Table 2.2 can be arranged into the frequency distributions given
in Table 2.3 and shown graphically in Fig. 2-2.
Table 2.2 Net Weight in Ounces of Fruit
7 199 m2 199 m0 26 1 m4 1D 20d
201 9S MY M3 2S 199 WO He 19 198
Table 2.3 Frequency Distribution of Weights
Weight, oz    Absolute Frequency    Relative Frequency
19.2-19.4     1                     0.05
19.5-19.7     2                     0.10
19.8-20.0     8                     0.40
20.1-20.3     4                     0.20
20.4-20.6     3                     0.15
20.7-20.9     2                     0.10
              20                    1.00
Fig. 2-2 (Panel A: histogram; Panel B: relative frequency histogram; Panel C: frequency polygon; Panel D: ogive)
2.2 MEASURES OF CENTRAL TENDENCY
Central tendency refers to the location of a distribution. The most important measures of central
tendency are (1) the mean, (2) the median, and (3) the mode. We will be measuring these for
populations (i.e., the collection of all the elements that we are describing) and for samples drawn
from populations, as well as for grouped and ungrouped data.
1. The arithmetic mean, or average, of a population is represented by $\mu$ (the Greek letter mu) and,
for a sample, by $\bar{X}$ (read "X bar"). For ungrouped data, $\mu$ and $\bar{X}$ are calculated by the following
formulas:
$$\mu = \frac{\sum X}{N} \quad\text{and}\quad \bar{X} = \frac{\sum X}{n} \tag{2.1a,b}$$
where $\sum X$ refers to the sum of all the observations, while N and n refer to the number of
observations in the population and sample, respectively. For grouped data, $\mu$ and $\bar{X}$ are
calculated by
$$\mu = \frac{\sum fX}{N} \quad\text{and}\quad \bar{X} = \frac{\sum fX}{n} \tag{2.2a,b}$$
where $\sum fX$ refers to the sum of the frequency of each class f times the class midpoint X.
2. The median for ungrouped data is the value of the middle item when all the items are arranged in
either ascending or descending order in terms of values:
$$\text{Median} = \text{the } \left(\frac{N+1}{2}\right)\text{th item in the data array} \tag{2.3}$$
where N refers to the number of items in the population (n for a sample). The median for
grouped data is given by the formula
$$\text{Median} = L + \frac{n/2 - F}{f_m}\,c \tag{2.4}$$
where L = lower limit of the median class (i.e., the class that contains the middle item of
the distribution)
n = the number of observations in the data set
F = sum of the frequencies up to but not including the median class
$f_m$ = frequency of the median class
c = width of the class interval
3. The mode is the value that occurs most frequently in the data set. For grouped data, we obtain
$$\text{Mode} = L + \frac{d_1}{d_1 + d_2}\,c \tag{2.5}$$
where L = lower limit of the modal class (i.e., the class with the greatest frequency)
$d_1$ = frequency of the modal class minus the frequency of the previous class
$d_2$ = frequency of the modal class minus the frequency of the following class
c = width of the class interval
The mean is the most commonly used measure of central tendency. The mean, however, is affected
by extreme values in the data set, while the median and the mode are not. Other measures of central
tendency are the weighted mean, the geometric mean, and the harmonic mean (see Probs. 2.7 to 2.9).
EXAMPLE 3. The mean grade for the population on the 10 quizzes given in Example 1, using the formula for
ungrouped data, is
$$\mu = \frac{\sum X}{N} = \frac{6+7+6+8+5+7+6+9+10+6}{10} = \frac{70}{10} = 7 \text{ points}$$
To find the median for the ungrouped data, we first arrange the 10 grades in ascending order: 5, 6, 6, 6, 6, 7, 7, 8,
9, 10. Then we find the grade of the (N + 1)/2 or (10 + 1)/2 = 5.5th item. Thus the median is the average of the 5th
and 6th item in the array, or (6 + 7)/2 = 6.5. The mode for the ungrouped data is 6 (the value that occurs most
frequently in the data set).
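The three ungrouped measures in Example 3 can be checked with Python's standard `statistics` module. This is a minimal sketch; only the grades come from the text.

```python
import statistics

grades = [6, 7, 6, 8, 5, 7, 6, 9, 10, 6]   # quiz grades from Example 1

mean = statistics.mean(grades)       # sum of the observations divided by N
median = statistics.median(grades)   # average of the 5th and 6th ordered items
mode = statistics.mode(grades)       # most frequently occurring value

print(mean, median, mode)
```

Note that `statistics.median` sorts the data itself, so the array does not have to be arranged in ascending order beforehand.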
EXAMPLE 4. We can estimate the mean for the grouped data given in Table 2.3 with the aid of Table 2.4:
$$\bar{X} = \frac{\sum fX}{n} = \frac{401.6}{20} = 20.08 \text{ oz}$$
This calculation could be simplified by coding (see Prob. 2.6).
Table 2.4 Calculation of the Sample Mean for the Data in Table 2.3
Weight, oz    Class Midpoint X    Frequency f    fX
19.2-19.4     19.3                1              19.3
19.5-19.7     19.6                2              39.2
19.8-20.0     19.9                8              159.2
20.1-20.3     20.2                4              80.8
20.4-20.6     20.5                3              61.5
20.7-20.9     20.8                2              41.6
                                  n = 20         401.6
$$\text{Median} = L + \frac{n/2 - F}{f_m}\,c = 19.8 + \frac{20/2 - 3}{8}(0.3) = 19.8 + 0.2625 \approx 20.06 \text{ oz}$$
where L = 19.8 = lower limit of the median class (i.e., the 19.8-20.0 class, which contains the 10th and 11th
observations)
n = 20 = number of observations or items
F = 3 = sum of frequencies up to but not including the median class
$f_m$ = 8 = frequency of the median class
c = 0.3 = width of class interval
Similarly,
$$\text{Mode} = L + \frac{d_1}{d_1 + d_2}\,c = 19.8 + \frac{6}{6+4}(0.3) = 19.8 + 0.18 = 19.98 \text{ oz}$$
As noted in Prob. 2.4, the mean, median, and mode for grouped data are estimates used when only the grouped data
are available or to reduce calculations with a large ungrouped data set.
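The grouped-data formulas (2.2b), (2.4), and (2.5) can be sketched as small Python functions and applied to the weight classes of Table 2.3. The function names and the tuple layout are illustrative assumptions, not notation from the text.

```python
# Classes of Table 2.3 as (lower limit, class midpoint, frequency); width c = 0.3 oz
classes = [(19.2, 19.3, 1), (19.5, 19.6, 2), (19.8, 19.9, 8),
           (20.1, 20.2, 4), (20.4, 20.5, 3), (20.7, 20.8, 2)]
c = 0.3
n = sum(f for _, _, f in classes)

# Eq. (2.2b): sum of frequency times midpoint, divided by n
grouped_mean = sum(f * x for _, x, f in classes) / n


def grouped_median(classes, c):
    """Eq. (2.4): L + ((n/2 - F) / f_m) * c."""
    n = sum(f for _, _, f in classes)
    cumulative = 0                        # F: frequencies below the current class
    for lower, _, f in classes:
        if cumulative + f >= n / 2:       # first class that reaches the middle item
            return lower + (n / 2 - cumulative) / f * c
        cumulative += f


def grouped_mode(classes, c):
    """Eq. (2.5): L + (d1 / (d1 + d2)) * c, using the first class with the top frequency."""
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))
    before = freqs[i - 1] if i > 0 else 0
    after = freqs[i + 1] if i + 1 < len(freqs) else 0
    d1, d2 = freqs[i] - before, freqs[i] - after
    return classes[i][0] + d1 / (d1 + d2) * c


print(grouped_mean, grouped_median(classes, c), grouped_mode(classes, c))
```

The unrounded median is 20.0625 oz, which the example reports as 20.06 oz.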
2.3 MEASURES OF DISPERSION
Dispersion refers to the variability or spread in the data. The most important measures of dispersion
are (1) the average deviation, (2) the variance, and (3) the standard deviation. We will measure
these for populations and samples, as well as for grouped and ungrouped data.
1. Average deviation. The average deviation (AD), also called the mean absolute deviation (MAD),
is given by
$$AD = \frac{\sum |X - \mu|}{N} \quad\text{for populations} \tag{2.6a}$$
$$\text{and}\quad AD = \frac{\sum |X - \bar{X}|}{n} \quad\text{for samples} \tag{2.6b}$$
where the two vertical bars indicate the absolute value, or the values omitting the sign, with the
other symbols having the same meaning as in Sec. 2.2. For grouped data
$$AD = \frac{\sum f|X - \mu|}{N} \quad\text{for populations} \tag{2.7a}$$
$$\text{and}\quad AD = \frac{\sum f|X - \bar{X}|}{n} \quad\text{for samples} \tag{2.7b}$$
where f refers to the frequency of each class and X to the class midpoints.
2. Variance. The population variance $\sigma^2$ (the Greek letter sigma squared) and the sample
variance $s^2$ for ungrouped data are given by
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N} \quad\text{and}\quad s^2 = \frac{\sum (X - \bar{X})^2}{n-1} \tag{2.8a,b}$$
For grouped data
$$\sigma^2 = \frac{\sum f(X - \mu)^2}{N} \quad\text{and}\quad s^2 = \frac{\sum f(X - \bar{X})^2}{n-1} \tag{2.9a,b}$$
3. Standard deviation. The population standard deviation $\sigma$ and sample standard deviation s are
the positive square roots of their respective variances. For ungrouped data
$$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}} \quad\text{and}\quad s = \sqrt{\frac{\sum (X - \bar{X})^2}{n-1}} \tag{2.10a,b}$$
For grouped data
$$\sigma = \sqrt{\frac{\sum f(X - \mu)^2}{N}} \quad\text{and}\quad s = \sqrt{\frac{\sum f(X - \bar{X})^2}{n-1}} \tag{2.11a,b}$$
The most widely used measure of (absolute) dispersion is the standard deviation. Other
measures (besides the variance and average deviation) are the range, the interquartile range,
and the quartile deviation (see Probs. 2.11 and 2.12).
4. The coefficient of variation (V) measures relative dispersion:
$$V = \frac{\sigma}{\mu} \quad\text{for populations} \tag{2.12a}$$
$$\text{and}\quad V = \frac{s}{\bar{X}} \quad\text{for samples} \tag{2.12b}$$
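EXAMPLE 5. The average deviation, variance, standard deviation, and coefficient of variation for the ungrouped
data given in Example 1 can be found with the aid of Table 2.5 ($\mu = 7$; see Example 3):
$$AD = \frac{\sum |X - \mu|}{N} = \frac{12}{10} = 1.2 \text{ points}$$
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{22}{10} = 2.2 \text{ points squared}$$
$$\sigma = \sqrt{2.2} \approx 1.48 \text{ points}$$
$$V = \frac{\sigma}{\mu} = \frac{1.48}{7} \approx 0.21, \text{ or } 21\%$$
Table 2.5 Calculations on the Data in Example 1
Grade X    $\mu$    $X - \mu$    $|X - \mu|$    $(X - \mu)^2$
6          7        -1           1              1
7          7        0            0              0
6          7        -1           1              1
8          7        1            1              1
5          7        -2           2              4
7          7        0            0              0
6          7        -1           1              1
9          7        2            2              4
10         7        3            3              9
6          7        -1           1              1
                    $\sum (X - \mu) = 0$    $\sum |X - \mu| = 12$    $\sum (X - \mu)^2 = 22$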
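The column totals of Table 2.5 lead directly to the four measures of dispersion for the grades of Example 1. The sketch below recomputes them from the raw data with the standard `statistics` module, treating the 10 grades as a population (division by N), as the example does.

```python
import statistics

grades = [6, 7, 6, 8, 5, 7, 6, 9, 10, 6]   # quiz grades from Example 1
mu = statistics.mean(grades)

# Average deviation: mean of the absolute deviations from the mean
average_deviation = sum(abs(x - mu) for x in grades) / len(grades)
variance = statistics.pvariance(grades)     # population variance (divides by N)
std_dev = statistics.pstdev(grades)         # positive square root of the variance
coeff_variation = std_dev / mu              # relative dispersion

print(average_deviation, variance, round(std_dev, 2), round(coeff_variation, 2))
```

For a sample rather than a population, `statistics.variance` and `statistics.stdev` divide by n - 1 instead.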
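EXAMPLE 6. The average deviation, variance, standard deviation, and coefficient of variation for the frequency
distribution of weights (grouped data) given in Table 2.3 can be found with the aid of Table 2.6 ($\bar{X}$ = 20.08 oz; see
Example 4):
$$AD = \frac{\sum f|X - \bar{X}|}{n} = \frac{6.36}{20} = 0.318 \text{ oz}$$
$$s^2 = \frac{\sum f(X - \bar{X})^2}{n-1} = \frac{2.9520}{19} \approx 0.1554 \text{ oz squared}$$
$$s = \sqrt{0.1554} \approx 0.3942 \text{ oz}$$
$$V = \frac{s}{\bar{X}} = \frac{0.3942}{20.08} \approx 0.0196, \text{ or } 1.96\%$$
Note that in the formula for $s^2$ and s, n - 1 rather than n is used in the denominator (see Prob. 2.16 for the reason).
The formulas for $\sigma^2$ and $s^2$ given in this section can be simplified so as to reduce the calculations
for a large body of data (see Probs. 2.17 to 2.19 for their derivation and application).
Table 2.6 Calculations on the Data in Table 2.3
Weight, oz    Midpoint X    f    $\bar{X}$    $X - \bar{X}$    $|X - \bar{X}|$    $f|X - \bar{X}|$    $(X - \bar{X})^2$    $f(X - \bar{X})^2$
19.2-19.4     19.3          1    20.08    -0.78    0.78    0.78    0.6084    0.6084
19.5-19.7     19.6          2    20.08    -0.48    0.48    0.96    0.2304    0.4608
19.8-20.0     19.9          8    20.08    -0.18    0.18    1.44    0.0324    0.2592
20.1-20.3     20.2          4    20.08    0.12     0.12    0.48    0.0144    0.0576
20.4-20.6     20.5          3    20.08    0.42     0.42    1.26    0.1764    0.5292
20.7-20.9     20.8          2    20.08    0.72     0.72    1.44    0.5184    1.0368
              $\sum f = 20$    $\sum f|X - \bar{X}| = 6.36$    $\sum f(X - \bar{X})^2 = 2.9520$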
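The grouped-data versions of these measures, Eqs. (2.7b), (2.9b), (2.11b), and (2.12b), can be sketched for the weight classes of Table 2.3 as follows. The class midpoints and frequencies are those used in Example 4; the variable names are illustrative.

```python
import math

midpoints = [19.3, 19.6, 19.9, 20.2, 20.5, 20.8]   # class midpoints X, in oz
freqs = [1, 2, 8, 4, 3, 2]                         # class frequencies f
n = sum(freqs)

x_bar = sum(f * x for f, x in zip(freqs, midpoints)) / n

# Eq. (2.7b): frequency-weighted mean absolute deviation
average_deviation = sum(f * abs(x - x_bar) for f, x in zip(freqs, midpoints)) / n
# Eq. (2.9b): sample variance, with n - 1 in the denominator
sample_variance = sum(f * (x - x_bar) ** 2 for f, x in zip(freqs, midpoints)) / (n - 1)
sample_std = math.sqrt(sample_variance)            # Eq. (2.11b)
coeff_variation = sample_std / x_bar               # Eq. (2.12b)

print(round(average_deviation, 3), round(sample_variance, 4),
      round(sample_std, 4), round(coeff_variation, 4))
```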
2.4 SHAPE OF FREQUENCY DISTRIBUTIONS
The shape of a distribution refers to (1) its symmetry or lack of it (skewness) and (2) its peakedness
(kurtosis).
1. Skewness. A distribution has zero skewness if it is symmetrical about its mean. For a
symmetrical (unimodal) distribution, the mean, median, and mode are equal. A distribution
is positively skewed if the right tail is longer. Then, mean > median > mode. A distribution is
negatively skewed if the left tail is longer. Then, mode > median > mean (see Fig. 2-3).
Fig. 2-3 (Panel A: symmetric; Panel B: positively skewed; Panel C: negatively skewed)
Skewness can be measured by the Pearson coefficient of skewness:
$$Sk = \frac{3(\mu - \text{med})}{\sigma} \quad\text{for populations} \tag{2.13a}$$
$$\text{and}\quad Sk = \frac{3(\bar{X} - \text{med})}{s} \quad\text{for samples} \tag{2.13b}$$
Mean and variance are the first and second moments of a distribution, respectively. Skewness
can also be measured by the third moment [the numerator of Eq. (2.14a,b)] divided by the cube
of the standard deviation:
$$Sk = \frac{\sum f(X - \mu)^3 / N}{\sigma^3} \quad\text{for populations} \tag{2.14a}$$
$$\text{and}\quad Sk = \frac{\sum f(X - \bar{X})^3 / n}{s^3} \quad\text{for samples} \tag{2.14b}$$
For symmetric distributions, Sk = 0.
2. Kurtosis. A peaked curve is called leptokurtic, as opposed to a flat one (platykurtic), relative to
one that is mesokurtic (see Fig. 2-4). Kurtosis can be measured by the fourth moment [the
numerator of Eq. (2.15a,b)] divided by the standard deviation raised to the fourth power. The
kurtosis for a mesokurtic curve is 3.
Fig. 2-4 (leptokurtic, mesokurtic, and platykurtic curves)
$$\text{Kurtosis} = \frac{\sum f(X - \mu)^4 / N}{\sigma^4} \quad\text{for populations} \tag{2.15a}$$
$$\text{and}\quad \text{Kurtosis} = \frac{\sum f(X - \bar{X})^4 / n}{s^4} \quad\text{for samples} \tag{2.15b}$$
3. Joint moment. The comovement of two separate distributions can be measured by covariance:
$$\text{cov}(X, Y) = \frac{\sum (X - \mu_X)(Y - \mu_Y)}{N} = \frac{\sum XY}{N} - \mu_X \mu_Y \quad\text{for populations}$$
$$\text{cov}(X, Y) = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{n} = \frac{\sum XY}{n} - \bar{X}\bar{Y} \quad\text{for samples}$$
A positive covariance indicates that X and Y move together in relation to their means. A
negative covariance indicates that they move in opposite directions.
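EXAMPLE 7. We can find the Pearson coefficient of skewness for the grades given in Example 1 by using $\mu = 7$,
med = 6.5 (see Example 3), and $\sigma = 1.48$ (see Example 5):
$$Sk = \frac{3(\mu - \text{med})}{\sigma} = \frac{3(7 - 6.5)}{1.48} \approx 1.01$$
Similarly, by using $\bar{X}$ = 20.08 oz, med = 20.06 oz (see Example 4), and s = 0.39 oz (see Example 6), we can find the
Pearson coefficient of skewness for the frequency distribution of weights in Table 2.3 as follows:
$$Sk = \frac{3(\bar{X} - \text{med})}{s} = \frac{3(20.08 - 20.06)}{0.39} \approx 0.15 \quad\text{(see Fig. 2-2)}$$
For kurtosis, see Prob. 2.23.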
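The shape measures of Sec. 2.4 can be sketched in Python for the grades of Example 1. The helper `central_moment` is an illustrative name; it computes the mean of the k-th powers of the deviations from the mean, which is the sense in which "moment" is assumed here.

```python
import statistics

grades = [6, 7, 6, 8, 5, 7, 6, 9, 10, 6]   # quiz grades from Example 1
mu = statistics.mean(grades)
sigma = statistics.pstdev(grades)           # population standard deviation

# Pearson coefficient of skewness, Eq. (2.13a)
pearson_sk = 3 * (mu - statistics.median(grades)) / sigma


def central_moment(data, k):
    """Mean of the k-th powers of the deviations from the mean."""
    m = sum(data) / len(data)
    return sum((x - m) ** k for x in data) / len(data)


moment_sk = central_moment(grades, 3) / sigma ** 3   # third moment over sigma cubed
kurtosis = central_moment(grades, 4) / sigma ** 4    # fourth moment over sigma^4

print(round(pearson_sk, 2), round(moment_sk, 2), round(kurtosis, 2))
```

Both skewness measures are positive for these grades, consistent with the longer right tail (mean 7 above median 6.5 above mode 6), and the kurtosis falls below the mesokurtic value of 3.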
Solved Problems
FREQUENCY DISTRIBUTIONS
2.1 Table 2.7 gives the grades on a quiz for a class of 40 students. (a) Arrange these grades (raw
data set) into an array from the lowest grade to the highest grade. (b) Construct a table showing
class intervals and class midpoints and the absolute, relative, and cumulative frequencies for each
grade. (c) Present the data in the form of a histogram, relative-frequency histogram, frequency
polygon, and ogive.
Table 2.7 Grades on a Quiz for a Class of 40 Students
(a) See Table 2.8.
Table 2.8 Data Array of Grades
2    2    2    3    3    3    4    4    4    4
4    5    5    5    5    5    6    6    6    6
6    6    7    7    7    7    7    7    7    7
8    8    8    8    9    9    9    9    10   10
(b) See Table 2.9. Note that since we are dealing here with discrete data (i.e., data expressed in whole
numbers), we used the actual grades as the class midpoints.
Table 2.9 Frequency Distribution of Grades
Grade       Class Midpoint    Absolute Frequency    Relative Frequency    Cumulative Frequency
1.5-2.4     2                 3                     0.075                 3
2.5-3.4     3                 3                     0.075                 6
3.5-4.4     4                 5                     0.125                 11
4.5-5.4     5                 5                     0.125                 16
5.5-6.4     6                 6                     0.150                 22
6.5-7.4     7                 8                     0.200                 30
7.5-8.4     8                 4                     0.100                 34
8.5-9.4     9                 4                     0.100                 38
9.5-10.4    10                2                     0.050                 40
                              40                    1.000
(c) See Fig. 2-5.
Fig. 2-5 (Panel A: histogram; Panel B: relative frequency distribution; Panel C: frequency polygon)
2.2 A sample of 25 workers in a plant receive the hourly wages given in Table 2.10. (a) Arrange
these raw data into an array from the lowest to the highest wage. (b) Group the data into
classes. (c) Present the data in the form of a histogram, relative-frequency histogram, frequency
polygon, and ogive.
Table 2.10 Hourly Wages in Dollars
TAS M7e O8F 998 400 410 435 RSS ORE nme
Sad 390 426 378 39S gOS ame 41S 380 ans
388 393 40d 4a dos
(a) See Table 2.11.
300 68 7S 378 380 SRS BAS ORAS
395 398 198 3.96 400 405 ans 405 406 48
40 413 48 42S 4.26
(b) The hourly wages in Table 2.10 range from $3.55 to $4.26. This can be conveniently subdivided
into 8 equal classes of $0.10 each. That is, ($4.30 - $3.50)/8 = $0.80/8 = $0.10. Note that the
range was extended from $3.50 to $4.30 so that the lowest wage, $3.55, falls within the lowest class
and the largest wage, $4.26, falls within the largest class. It is also convenient (and needed for
plotting the frequency polygon) to find the class mark or midpoint of each class. These are
shown in Table 2.12.
Table 2.12 Frequency Distribution of Wages
Hourly Wage, $    Class Midpoint, $    Absolute Frequency    Relative Frequency    Cumulative Frequency
3.50-3.59         3.55                 1                     0.04                  1
3.60-3.69         3.65                 2                     0.08                  3
3.70-3.79         3.75                 2                     0.08                  5
3.80-3.89         3.85                 4                     0.16                  9
3.90-3.99         3.95                 5                     0.20                  14
4.00-4.09         4.05                 6                     0.24                  20
4.10-4.19         4.15                 3                     0.12                  23
4.20-4.29         4.25                 2                     0.08                  25
                                       25                    1.00
(c) See Fig. 2-6. Note that to draw the ogive, the cumulative frequencies are plotted at $3.595, 3.695,
3.795, and so on (so as to include the upper limit of each class). The values $3.595, 3.695, 3.795, etc. are
often referred to as the class boundaries or exact limits. Note that the class midpoints are obtained by
adding together the lower and upper class boundaries and dividing by 2. For example, the second class
midpoint is given by (3.595 + 3.695)/2 = 7.29/2 = 3.645, or 3.65 (see Table 2.12).
Fig. 2-6 (Panel A: histogram; Panel B: relative frequency distribution; Panel C: frequency polygon; Panel D: ogive)
MEASURES OF CENTRAL TENDENCY
2.3 Find the mean, median, and mode (a) for the grades on the quiz for the class of 40 students
given in Table 2.7 (the ungrouped data) and (b) for the grouped data of these grades given in
Table 2.9.
(a) Since we are dealing with all grades, we want the population mean:
$$\mu = \frac{\sum X}{N} = \frac{X_1 + X_2 + \cdots + X_{40}}{40} = \frac{240}{40} = 6 \text{ points}$$
That is, $\mu$ is obtained by adding together all the 40 grades given in Table 2.7 and dividing by 40 [the
three centered dots (ellipsis) were put in to avoid repeating the 40 values in Table 2.7]. The median is
given by the value of the [(N + 1)/2]th item in the data array in Table 2.8. Therefore, the median is
the value of the (40 + 1)/2 or 20.5th item, or the average of the 20th and 21st item. Since they are both
equal to 6, the median is 6. The mode is 7 (the value that occurs most frequently in the data set).
(b) We can find the population mean for the grouped data in Table 2.9 with the aid of Table 2.13:
$$\mu = \frac{\sum fX}{N} = \frac{240}{40} = 6 \text{ points}$$
This is the same mean we found for the ungrouped data. Note that the sum of the frequencies, $\sum f$,
equals the number of observations in the population, N, and $\sum X = \sum fX$. The median for the
grouped data of Table 2.13 is given by
$$\text{Median} = L + \frac{n/2 - F}{f_m}\,c = 5.5 + \frac{40/2 - 16}{6}(1) = 5.5 + 0.67 = 6.17$$
where L = 5.5 = lower limit of the median class (i.e., the 5.5-6.4 class, which contains the
20th and 21st observations)
n = 40 = number of observations
F = 16 = sum of observations up to but not including the median class
$f_m$ = 6 = frequency of the median class
c = 1 = width of class interval
The mode for the grouped data in Table 2.13 is given by
$$\text{Mode} = L + \frac{d_1}{d_1 + d_2}\,c = 6.5 + \frac{2}{2+4}(1) = 6.5 + 0.33 = 6.83$$
where L = 6.5 = lower limit of the modal class (i.e., the 6.5-7.4 class with the highest frequency of 8)
$d_1$ = 2 = frequency of the modal class, 8, minus the frequency of the previous class, 6
$d_2$ = 4 = frequency of the modal class, 8, minus the frequency of the following class, 4
c = 1 = width of the class interval
Note that while the mean calculated from the grouped data is in this case identical to the mean
calculated for the ungrouped data, the median and the mode are only (good) approximations.
Table 2.13 Calculation of the Population Mean for the Grouped Data in
Table 2.9
Grade       Class Midpoint X    Frequency f    fX
1.5-2.4     2                   3              6
2.5-3.4     3                   3              9
3.5-4.4     4                   5              20
4.5-5.4     5                   5              25
5.5-6.4     6                   6              36
6.5-7.4     7                   8              56
7.5-8.4     8                   4              32
8.5-9.4     9                   4              36
9.5-10.4    10                  2              20
                                $\sum f = N = 40$    $\sum fX = 240$
2.4 Find the mean, median, and mode (a) for the sample of hourly wages received by the 25
workers recorded in Table 2.10 (the ungrouped data) and (b) for the grouped data of these
wages given in Table 2.12.
(a)
$$\bar{X} = \frac{\sum X}{n} = \frac{\$98.65}{25} = \$3.946, \text{ or } \$3.95$$
Median = $3.95 [the value of the (n + 1)/2 = (25 + 1)/2 = 13th item in the data array in Table 2.11]
Mode = $3.95 and $4.05, since there are three of each of these wages. Thus the distribution is bimodal
(i.e., it has two modes).
(b) We can find the sample mean for the grouped data in Table 2.12 with the aid of Table 2.14:
$$\bar{X} = \frac{\sum fX}{n} = \frac{\$98.75}{25} = \$3.95$$
Note that in this case $\sum fX = 98.75 \neq \sum X = 98.65$ (found in part a) since the average of the
observations in each class is not equal to the class midpoint for all classes [as in Prob. 2.3(b)].
Thus $\bar{X}$ calculated from the grouped data is only a very good approximation for the true value of $\bar{X}$
calculated for the ungrouped data. In the real world, we often have only the grouped data, or if we
have a very large body of ungrouped data, it will save on calculations to estimate the mean by first
grouping the data.
$$\text{Median} = L + \frac{n/2 - F}{f_m}\,c = \$3.90 + \frac{25/2 - 9}{5}(\$0.10) = \$3.90 + \$0.07 = \$3.97$$
as compared with the true median of $3.95 found from the ungrouped data (see part a).
$$\text{Mode} = L + \frac{d_1}{d_1 + d_2}\,c = \$4.00 + \frac{1}{1+3}(\$0.10) = \$4.00 + \$0.025 = \$4.025, \text{ or } \$4.03$$
as compared with the true modes of $3.95 and $4.05 found from the ungrouped data (see part a).
Sometimes the mode is simply given as the midpoint of the modal class.
Table 2.14 Calculation of the Sample Mean for the Grouped Data in Table 2.12
Hourly Wage, $    Class Midpoint X, $    Frequency f    fX
3.50-3.59         3.55                   1              3.55
3.60-3.69         3.65                   2              7.30
3.70-3.79         3.75                   2              7.50
3.80-3.89         3.85                   4              15.40
3.90-3.99         3.95                   5              19.75
4.00-4.09         4.05                   6              24.30
4.10-4.19         4.15                   3              12.45
4.20-4.29         4.25                   2              8.50
                                         $\sum f = n = 25$    $\sum fX = 98.75$
2.5 Compare the advantages and disadvantages of (a) the mean, (b) the median, and (c) the
mode as measures of central tendency.
(a) The advantages of the mean are (1) it is familiar and understood by virtually everyone, (2) all the
observations in the data are taken into account, and (3) it is used in performing many other
statistical procedures and tests. The disadvantages of the mean are (1) it is affected by extreme
values, (2) it is time-consuming to compute for a large body of ungrouped data, and (3) it cannot
be calculated when the last class of grouped data is open-ended (i.e., it includes the lower limit of the
last class "and over").
(b) The advantages of the median are (1) it is not affected by extreme values, (2) it is easily understood
(i.e., half the data are smaller than the median and half are greater), and (3) it can be calculated even
when the last class is open-ended and when the data are qualitative rather than quantitative. The
disadvantages of the median are (1) it does not use much of the information available, and (2) it
requires that observations be arranged into an array, which is time-consuming for a large body of
ungrouped data.
(c) The advantages of the mode are the same as those for the median. The disadvantages of the
mode are (1) as for the median, the mode does not use much of the information available,
and (2) sometimes no value of the data is repeated more than once, so that there is no mode, while
at other times there may be many modes. In general, the mean is the most frequently used measure of
central tendency and the mode is the least used.
2.6 Find the mean for the grouped data in Table 2.12 by coding (i.e., by assigning the value of u = 0
to the 4th or 5th class and u = -1, u = -2, etc. to each lower class and u = 1, u = 2, etc. to
each larger class and then using the formula
$$\bar{X} = X_0 + \frac{\sum fu}{n}\,c \tag{2.16}$$
where $X_0$ is the midpoint of the class assigned u = 0 and c is the width of the class intervals). See
Table 2.15.
Table 2.15 Calculation of the Sample Mean by Coding for the Grouped Data in Table 2.12
Hourly Wage, $    Class Midpoint X, $    Code u    Frequency f    fu
3.50-3.59         3.55                   -3        1              -3
3.60-3.69         3.65                   -2        2              -4
3.70-3.79         3.75                   -1        2              -2
3.80-3.89         3.85                   0         4              0
3.90-3.99         3.95                   1         5              5
4.00-4.09         4.05                   2         6              12
4.10-4.19         4.15                   3         3              9
4.20-4.29         4.25                   4         2              8
                                                   $\sum f = n = 25$    $\sum fu = 25$
$$\bar{X} = X_0 + \frac{\sum fu}{n}\,c = \$3.85 + \frac{25}{25}(\$0.10) = \$3.95$$
$\bar{X}$ for the grouped data found by coding is identical to that found in Prob. 2.4(b) without coding.
Coding eliminates the problem of having to deal with possibly large and inconvenient class
midpoints; thus it may simplify the calculations.
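The coding shortcut can be checked against the direct grouped-mean formula with a few lines of Python. This sketch assumes, as in the wage classes of Table 2.12, that code 0 is assigned to the class with midpoint $3.85; the variable names are illustrative.

```python
midpoints = [3.55, 3.65, 3.75, 3.85, 3.95, 4.05, 4.15, 4.25]   # class midpoints, in $
freqs = [1, 2, 2, 4, 5, 6, 3, 2]                               # class frequencies
c = 0.10                                                       # class width, in $
origin = 3                     # index of the class assigned code 0 (midpoint 3.85)

codes = [i - origin for i in range(len(midpoints))]            # -3, -2, ..., 4
n = sum(freqs)

# Coded mean: midpoint of the code-0 class plus the mean code times the class width
coded_mean = midpoints[origin] + sum(f * u for f, u in zip(freqs, codes)) / n * c
# Direct mean from the grouped data, for comparison
direct_mean = sum(f * x for f, x in zip(freqs, midpoints)) / n

print(round(coded_mean, 2), round(direct_mean, 2))
```

Choosing a different class as the origin changes the codes but not the resulting mean, since the codes are just a linear rescaling of the midpoints.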
2.7 A firm pays a wage of $4 per hour to its 25 unskilled workers, $6 to its 15 semiskilled workers, and
$8 to its 10 skilled workers. What is the weighted average, or weighted mean, wage paid by this firm?
To find the weighted mean, or weighted average, of a population, $\mu_w$, or sample, $\bar{X}_w$, the weights, w,
have the same function as the frequency in finding the mean for the grouped data. Thus
$$\mu_w \text{ or } \bar{X}_w = \frac{\sum wX}{\sum w} \tag{2.17}$$
For this problem, the weights are the number of workers employed at each wage, and $\sum w$ equals the sum of
all the workers:
$$\mu_w = \frac{(\$4)(25) + (\$6)(15) + (\$8)(10)}{25 + 15 + 10} = \frac{\$100 + \$90 + \$80}{50} = \frac{\$270}{50} = \$5.40$$
This weighted average compares with the simple average of $6 [($4 + $6 + $8)/3 = $6] and is a better
measure of the average wages.
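As a quick check of Eq. (2.17), the weighted and simple averages for the firm's wages can be computed as follows. This is a minimal sketch with illustrative variable names.

```python
wages = [4, 6, 8]          # hourly wage, in $, for each skill group
workers = [25, 15, 10]     # weights: number of workers paid each wage

# Weighted mean: sum of weight times value, divided by the sum of the weights
weighted_mean = sum(w * x for w, x in zip(workers, wages)) / sum(workers)
simple_mean = sum(wages) / len(wages)   # ignores how many workers earn each wage

print(weighted_mean, simple_mean)
```

The weighted mean is lower than the simple mean because the largest group of workers earns the lowest wage.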
2.8 A nation faces a rate of inflation of 2% in one year, 5% in the second year, and 12.5% in the third
year. Find the geometric mean of the inflation rates (the geometric mean, $\mu_G$ or $\bar{X}_G$, of a set of n
positive numbers is the nth root of their product and is used mainly to average rates of change
and index numbers):
$$\mu_G \text{ or } \bar{X}_G = \sqrt[n]{X_1 \cdot X_2 \cdots X_n} \tag{2.18}$$
where $X_1, X_2, \ldots, X_n$ refer to the n (or N) observations.
$$\mu_G = \sqrt[3]{(2)(5)(12.5)} = \sqrt[3]{125} = 5\%$$
This compares with $\mu$ = (2 + 5 + 12.5)/3 = 19.5/3 = 6.5%. When all the numbers are equal, $\mu_G$ equals $\mu$;
otherwise $\mu_G$ is smaller than $\mu$. In practice, $\mu_G$ is calculated by logarithms:
$$\log \mu_G = \frac{\sum \log X}{N} \tag{2.19}$$
The geometric mean is used primarily in the mathematics of finance and financial management.
2.9 A commuter drives 10 mi on the highway at 60 mi/h and 10 mi on local streets at 15 mi/h. Find
the harmonic mean. The harmonic mean $\mu_H$ is used primarily to average ratios:
$$\mu_H = \frac{N}{\sum (1/X)} \tag{2.20}$$
$$\mu_H = \frac{2}{(1/60) + (1/15)} = \frac{2}{(1 + 4)/60} = \frac{120}{5} = 24 \text{ mi/h}$$
as compared with $\mu = \sum X/N$ = (60 + 15)/2 = 75/2 = 37.5 mi/h. Note that if the commuter had averaged
37.5 mi/h, it would have taken her (20 mi/37.5 mi/h) x 60 min = 32 min to drive the 20 mi. Instead she drives
10 min on the highway (10 mi at 60 mi/h) and 40 min on local streets (10 mi at 15 mi/h) for a total of 50 min,
and this is the (correct) answer we get by using $\mu_H$ = 24 mi/h. That is, (20 mi/24 mi/h) x 60 min = 50 min.
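The geometric and harmonic means of Probs. 2.8 and 2.9 are available in Python's standard `statistics` module (`geometric_mean` requires Python 3.8 or later). The sketch below reproduces both results and the commuter's travel time; the variable names are illustrative.

```python
import statistics

inflation = [2, 5, 12.5]                       # annual inflation rates, in percent
geo_mean = statistics.geometric_mean(inflation)
arith_mean = statistics.mean(inflation)

speeds = [60, 15]                              # mi/h on two legs of equal distance
harm_mean = statistics.harmonic_mean(speeds)
minutes = 20 / harm_mean * 60                  # time to cover the 20 mi at that average

print(round(geo_mean, 2), arith_mean, harm_mean, minutes)
```

The harmonic mean is the right average here only because both legs cover the same distance; with unequal distances, the legs would have to be weighted.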
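2.10 (a) For the ungrouped data in Table 2.7, find the first, second, and third quartiles and the third
decile and sixtieth percentile. (b) Do the same for the grouped data in Table 2.12. (Quartiles
divide the data into 4 parts, deciles into 10 parts, and percentiles into 100 parts.)
(a) $Q_1$ (first quartile) = 4 (the average of the 10th and 11th values in Table 2.8)
$Q_2$ (second quartile) = 6 = the value of the 20.5th item = the median
$Q_3$ (third quartile) = 7.5 = the value of the 30.5th item
$D_3$ (third decile) = 5 = the value of the 12.5th item
$P_{60}$ (sixtieth percentile) = 7 = the value of the 24.5th item
(b)
$$Q_1 = L + \frac{n/4 - F}{f}\,c = \$3.80 + \frac{25/4 - 5}{4}(\$0.10) = \$3.80 + \$0.03125 \approx \$3.83 \tag{2.21}$$
$$Q_2 = L + \frac{n/2 - F}{f}\,c = \$3.90 + \frac{25/2 - 9}{5}(\$0.10) = \$3.90 + \$0.07 = \$3.97 = \text{median} \tag{2.22}$$
$$Q_3 = L + \frac{3n/4 - F}{f}\,c = \$4.00 + \frac{18.75 - 14}{6}(\$0.10) = \$4.00 + \$0.0792 \approx \$4.08 \tag{2.23}$$
$$D_3 = L + \frac{3n/10 - F}{f}\,c = \$3.80 + \frac{7.5 - 5}{4}(\$0.10) = \$3.80 + \$0.0625 \approx \$3.86 \tag{2.24}$$
$$P_{60} = L + \frac{60n/100 - F}{f}\,c = \$4.00 + \frac{15 - 14}{6}(\$0.10) = \$4.00 + \$0.0167 \approx \$4.02 \tag{2.25}$$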
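Quartiles, deciles, and percentiles for grouped data all follow the same pattern as the grouped median, with n/2 replaced by the appropriate fraction of n. The sketch below, under that reading, uses one hypothetical helper, `grouped_quantile`, for all five values in part (b).

```python
# Classes of Table 2.12 as (lower limit in $, frequency); class width c = $0.10
classes = [(3.50, 1), (3.60, 2), (3.70, 2), (3.80, 4),
           (3.90, 5), (4.00, 6), (4.10, 3), (4.20, 2)]
c = 0.10


def grouped_quantile(classes, c, p):
    """L + ((p*n - F) / f) * c, where the class used is the first to reach item p*n."""
    n = sum(f for _, f in classes)
    target = p * n
    cumulative = 0                        # F: frequencies below the current class
    for lower, f in classes:
        if cumulative + f >= target:
            return lower + (target - cumulative) / f * c
        cumulative += f


q1 = grouped_quantile(classes, c, 0.25)
q2 = grouped_quantile(classes, c, 0.50)   # same as the grouped median
q3 = grouped_quantile(classes, c, 0.75)
d3 = grouped_quantile(classes, c, 0.30)
p60 = grouped_quantile(classes, c, 0.60)

print([round(v, 2) for v in (q1, q2, q3, d3, p60)])
```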
MEASURES OF DISPERSION
2.11 (a) Find the range for the ungrouped data in Table 2.7. (b) Find the range for the ungrouped
data in Table 2.10 and for the grouped data in Table 2.12. (c) What are the advantages and
disadvantages of the range?
(a) The range for ungrouped data is equal to the value of the largest observation minus the value of the
smallest observation in the data set. The range for the ungrouped data in Table 2.7 is from 2 to 10, or 8
points.
(b) The range for the ungrouped data in Table 2.10 is from $3.55 to $4.26, or $0.71. For grouped data, the
range extends from the lower limit of the smallest class to the upper limit of the largest class. For the
grouped data in Table 2.12, the range extends from $3.50 to $4.29.
(c) The advantages of the range are that it is easy to find and understand. Its disadvantages are that it
considers only the lowest and highest values of a distribution, it is greatly influenced by extreme values,
and it cannot be found for open-ended distributions. Because of these disadvantages, the range is of
limited usefulness (except in quality control).
2.12 Find the interquartile range and the quartile deviation (a) for the ungrouped data in Table 2.7
and (b) for the grouped data in Table 2.12.
(a) The interquartile range is equal to the difference between the third and first quartiles:
$$IR = Q_3 - Q_1 \tag{2.26}$$
For the ungrouped data in Table 2.7, IR = 7.5 - 4 = 3.5 points [utilizing the values of $Q_3$ and $Q_1$ found
in Prob. 2.10(a)]. Note that the interquartile range is not affected by extreme values because it utilizes
only the middle half of the data. It is thus better than the range, but it is not as widely used as the other
measures of dispersion. For the quartile deviation,
$$QD = \frac{Q_3 - Q_1}{2} \tag{2.27}$$
Therefore, QD = (7.5 - 4)/2 = 3.5/2 = 1.75 points. Quartile deviation measures the average range of
one-fourth of the data.
(b) IR = $Q_3 - Q_1$ = $4.08 - $3.83 = $0.25 [utilizing the values of $Q_3$ and $Q_1$ found in Prob. 2.10(b)]
$$QD = \frac{Q_3 - Q_1}{2} = \frac{\$4.08 - \$3.83}{2} = \$0.125$$
2.13 Find the average deviation for (a) the ungrouped data in Table 2.7 and (b) for the grouped
data in Table 2.9.
(a) Since $\mu = 6$ [see Prob. 2.3(a)],
$$AD = \frac{\sum |X - \mu|}{N} = \frac{72}{40} = 1.8 \text{ points}$$
Note that the average deviation takes every observation into account. It measures the average of the
absolute deviation of each observation from the mean. It takes the absolute value (indicated by the
two vertical bars) because $\sum (X - \mu) = 0$ (see Example 5).
(b) We can find the average deviation for the same grouped data with the aid of Table 2.16:
$$AD = \frac{\sum f|X - \mu|}{N} = \frac{72}{40} = 1.8 \text{ points}$$
the same as we found for the ungrouped data.
Table 2.16 Calculations for the Average Deviation for the Grouped Data in Table 2.9
Grade       Class Midpoint X    Frequency f    Mean $\mu$    $X - \mu$    $|X - \mu|$    $f|X - \mu|$
1.5-2.4     2                   3              6             -4           4              12
2.5-3.4     3                   3              6             -3           3              9
3.5-4.4     4                   5              6             -2           2              10
4.5-5.4     5                   5              6             -1           1              5
5.5-6.4     6                   6              6             0            0              0
6.5-7.4     7                   8              6             1            1              8
7.5-8.4     8                   4              6             2            2              8
8.5-9.4     9                   4              6             3            3              12
9.5-10.4    10                  2              6             4            4              8
                                $\sum f = N = 40$                                        $\sum f|X - \mu| = 72$
2.14 Find the average deviation for the grouped data in Table 2.12.
We can find the average deviation for the grouped data of hourly wages in Table 2.12 with the aid of
Table 2.17 [$\bar{X}$ = $3.95; see Prob. 2.4(b)]:
$$AD = \frac{\sum f|X - \bar{X}|}{n} = \frac{\$3.60}{25} = \$0.144$$
Note that the average deviation found for the grouped data is an estimate of the "true" average deviation
that could be found for the ungrouped data. It differs slightly from the true average deviation
because we use the estimate of the mean for the grouped data in our calculations [compare the values of $\bar{X}$
found in Prob. 2.4(a) and (b)].
Table 2.17 Calculations for the Average Deviation for the Grouped Data in Table 2.12
Hourly Wage, $    Class Midpoint X, $    Frequency f    Mean $\bar{X}$, $    $X - \bar{X}$, $    $|X - \bar{X}|$, $    $f|X - \bar{X}|$, $
3.50-3.59         3.55                   1              3.95                 -0.40               0.40                  0.40
3.60-3.69         3.65                   2              3.95                 -0.30               0.30                  0.60
3.70-3.79         3.75                   2              3.95                 -0.20               0.20                  0.40
3.80-3.89         3.85                   4              3.95                 -0.10               0.10                  0.40
3.90-3.99         3.95                   5              3.95                 0.00                0.00                  0.00
4.00-4.09         4.05                   6              3.95                 0.10                0.10                  0.60
4.10-4.19         4.15                   3              3.95                 0.20                0.20                  0.60
4.20-4.29         4.25                   2              3.95                 0.30                0.30                  0.60
                                         $\sum f = n = 25$                                                             $\sum f|X - \bar{X}| = 3.60$
2.15 Find the variance and the standard deviation for (a) the ungrouped data in Table 2.7 and
(b) the grouped data in Table 2.9. (c) What is the advantage of the standard deviation over
the variance?
(a) Since $\mu = 6$ [see Prob. 2.3(a)],
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{192}{40} = 4.8 \text{ points squared}$$
$$\sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}} = \sqrt{4.8} \approx 2.19 \text{ points}$$
(b) We can find the variance and the standard deviation for the grouped data of grades with the aid of
Table 2.18:
$$\sigma^2 = \frac{\sum f(X - \mu)^2}{N} = \frac{192}{40} = 4.8 \text{ points squared}$$
$$\text{and}\quad \sigma = \sqrt{\sigma^2} = \sqrt{4.8} \approx 2.19 \text{ points}$$
the same as we found for the ungrouped data.
Table 2.18 Calculations for the Variance and Standard Deviation for the Data in Table 2.9
Grade       Class Midpoint X    Frequency f    Mean $\mu$    $X - \mu$    $(X - \mu)^2$    $f(X - \mu)^2$
1.5-2.4     2                   3              6             -4           16               48
2.5-3.4     3                   3              6             -3           9                27
3.5-4.4     4                   5              6             -2           4                20
4.5-5.4     5                   5              6             -1           1                5
5.5-6.4     6                   6              6             0            0                0
6.5-7.4     7                   8              6             1            1                8
7.5-8.4     8                   4              6             2            4                16
8.5-9.4     9                   4              6             3            9                36
9.5-10.4    10                  2              6             4            16               32
                                $\sum f = N = 40$                                          $\sum f(X - \mu)^2 = 192$
(c) The advantage of the standard deviation over the variance is that the standard deviation is expressed in
the same units as the data rather than in "the units squared," which is how the variance is expressed.
The standard deviation is by far the most widely used measure of (absolute) dispersion.
{E10 Find the variance and the stangard deviation for the grouped data in Table £10
‘We san find the varie andthe standart deviation forthe groped data hourly wage withthe nit
of Table 2.19 [¥ = $3.95; soe Prob, 2416):
obs
aT
andcuar. DESCRIPTIVE STATISTICS 7
Table 2.19 Calculations for the Variance and Standard Deviation for the Data in Table 2.12

Hourly Wage, $   Class Midpoint X, $   Frequency f   Mean X̄, $   X − X̄, $   (X − X̄)²   f(X − X̄)²
3.50–3.59        3.55                  1             3.95        −0.40      0.16       0.16
3.60–3.69        3.65                  2             3.95        −0.30      0.09       0.18
3.70–3.79        3.75                  2             3.95        −0.20      0.04       0.08
3.80–3.89        3.85                  4             3.95        −0.10      0.01       0.04
3.90–3.99        3.95                  5             3.95         0.00      0.00       0.00
4.00–4.09        4.05                  6             3.95         0.10      0.01       0.06
4.10–4.19        4.15                  3             3.95         0.20      0.04       0.12
4.20–4.29        4.25                  2             3.95         0.30      0.09       0.18
                                       Σf = n = 25                                     Σf(X − X̄)² = 0.82
Note that in the formula for \(s^2\) and \(s\), \(n - 1\) rather than \(n\) is used in the denominator. The reason for this is that if we take many samples from a population, the average of the sample variances does not tend to equal the population variance, \(\sigma^2\), unless we use \(n - 1\) in the denominator of the formula for \(s^2\) (more will be said on this in Chap. 5). Furthermore, \(s^2\) and \(s\) for the grouped data are estimates for the true \(s^2\) and \(s\) that could be found for the ungrouped data because we use the estimate of \(\bar{X}\) from the grouped data in our calculations.
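The role of \(n - 1\) can be checked on a very small case. The sketch below is only an illustration: the three-value population is hypothetical (it is not taken from the text), and samples of size 2 are drawn with replacement so that every possible sample can be listed exactly.

```python
from fractions import Fraction
from itertools import product

# Hypothetical population, chosen only so every sample can be enumerated.
population = [2, 4, 6]
N = len(population)
mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N  # population variance

def sample_variance(sample, denominator):
    """Sum of squared deviations from the sample mean, over the given denominator."""
    xbar = Fraction(sum(sample), len(sample))
    return sum((x - xbar) ** 2 for x in sample) / denominator

n = 2
# All 9 equally likely ordered samples of size 2, drawn with replacement.
samples = list(product(population, repeat=n))

avg_with_n_minus_1 = sum(sample_variance(s, n - 1) for s in samples) / len(samples)
avg_with_n = sum(sample_variance(s, n) for s in samples) / len(samples)

print("population variance:", sigma2)
print("average s^2 using n - 1:", avg_with_n_minus_1)
print("average s^2 using n:", avg_with_n)
```

In this toy case the average of the sample variances matches the population variance only when \(n - 1\) is used; dividing by \(n\) understates it.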
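2.17 Starting with the formulas for \(\sigma^2\) and \(s^2\) given in Sec. 2.3, prove that

(a) \[ \sigma^2 = \frac{\sum X^2 - N\mu^2}{N} \quad \text{and} \quad \sigma^2 = \frac{\sum fX^2 - N\mu^2}{N} \]

(b) \[ s^2 = \frac{\sum X^2 - n\bar{X}^2}{n - 1} \quad \text{and} \quad s^2 = \frac{\sum fX^2 - n\bar{X}^2}{n - 1} \]

(a) \[ \sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{\sum (X^2 - 2\mu X + \mu^2)}{N} = \frac{\sum X^2 - 2\mu \sum X + N\mu^2}{N} = \frac{\sum X^2}{N} - 2\mu^2 + \mu^2 = \frac{\sum X^2 - N\mu^2}{N} \]

since \(\sum X = N\mu\). The formula for grouped data follows in the same way, with each term weighted by its frequency \(f\).

(b) We can get the formulas for \(s^2\) by simply replacing \(\mu\) with \(\bar{X}\) and \(N\) with \(n\) in the numerator and \(N\) with \(n - 1\) in the denominator of the formulas for \(\sigma^2\). The preceding formulas will simplify the calculations for \(\sigma^2\) and \(s^2\) for a large body of data. Coding also helps (see Prob. 2.6).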
2.18 Find the variance and the standard deviation for (a) the ungrouped data in Table 2.7 and (b) the grouped data in Table 2.9, using the simpler computational formulas in Prob. 2.17.

(a) Summing the squares of the 40 grades in Table 2.7 gives \(\sum X^2 = 1632\), so that

\[ \sigma^2 = \frac{\sum X^2 - N\mu^2}{N} = \frac{1632 - (40)(36)}{40} = \frac{1632 - 1440}{40} = \frac{192}{40} = 4.8 \text{ points squared} \]

and \(\sigma = \sqrt{4.8} \cong 2.19\) points, the same as in Prob. 2.15(a).
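(b) We can find \(\sigma^2\) and \(\sigma\) for the grouped data in Table 2.9 with the aid of Table 2.20:

\[ \sigma^2 = \frac{\sum fX^2 - N\mu^2}{N} = \frac{1632 - (40)(36)}{40} = \frac{1632 - 1440}{40} = \frac{192}{40} = 4.8 \text{ points squared} \]

\[ \sigma = \sqrt{\sigma^2} = \sqrt{4.8} \cong 2.19 \text{ points} \]

the same as in part a and Prob. 2.15.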
Table 2.20 Calculations for the Variance and Standard Deviation for the Data in Table 2.9

Grade      Class Midpoint X   Frequency f   fX    X²    fX²
1.5–2.4    2                  3              6     4     12
2.5–3.4    3                  3              9     9     27
3.5–4.4    4                  5             20    16     80
4.5–5.4    5                  5             25    25    125
5.5–6.4    6                  6             36    36    216
6.5–7.4    7                  8             56    49    392
7.5–8.4    8                  4             32    64    256
8.5–9.4    9                  4             36    81    324
9.5–10.4   10                 2             20   100    200
                              Σf = N = 40   ΣfX = 240   ΣfX² = 1632
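As a cross-check on the two routes to the grouped variance (the definitional formula of Prob. 2.15 and the computational formula of Prob. 2.18), here is a minimal Python sketch. The midpoints and frequencies are the ones read from the tables for the grades data; the variable names are illustrative only.

```python
import math

# Class midpoints and frequencies for the grades data (Table 2.9).
midpoints = [2, 3, 4, 5, 6, 7, 8, 9, 10]
freqs = [3, 3, 5, 5, 6, 8, 4, 4, 2]

N = sum(freqs)
mu = sum(f * x for f, x in zip(freqs, midpoints)) / N

# Definitional formula: sum of f(X - mu)^2, divided by N.
var_definition = sum(f * (x - mu) ** 2 for f, x in zip(freqs, midpoints)) / N

# Computational formula: (sum of fX^2 minus N mu^2), divided by N.
sum_fx2 = sum(f * x ** 2 for f, x in zip(freqs, midpoints))
var_shortcut = (sum_fx2 - N * mu ** 2) / N

sigma = math.sqrt(var_definition)
print(N, mu, var_definition, var_shortcut, round(sigma, 2))
```

Both formulas give the same variance because they are algebraically identical; the computational form simply avoids forming each deviation.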
2.19 Find the variance and the standard deviation for the grouped data in Table 2.12 using the simpler computational formula given in Prob. 2.17(b).

We can find \(s^2\) and \(s\) for the grouped data in Table 2.12 with the aid of Table 2.21:

\[ s^2 = \frac{\sum fX^2 - n\bar{X}^2}{n - 1} = \frac{390.8825 - (25)(3.95)^2}{24} = \frac{390.8825 - 390.0625}{24} = \frac{0.82}{24} \cong 0.0342 \text{ dollars squared} \]

and

\[ s = \sqrt{0.0342} \cong \$0.18 \]

the same as we found in Prob. 2.16.
Table 2.21 Calculations for the Variance and Standard Deviation for the Grouped Data in Table 2.12

Hourly Wage, $   Class Midpoint X, $   Frequency f   fX      X²        fX²
3.50–3.59        3.55                  1              3.55   12.6025   12.6025
3.60–3.69        3.65                  2              7.30   13.3225   26.6450
3.70–3.79        3.75                  2              7.50   14.0625   28.1250
3.80–3.89        3.85                  4             15.40   14.8225   59.2900
3.90–3.99        3.95                  5             19.75   15.6025   78.0125
4.00–4.09        4.05                  6             24.30   16.4025   98.4150
4.10–4.19        4.15                  3             12.45   17.2225   51.6675
4.20–4.29        4.25                  2              8.50   18.0625   36.1250
                                       Σf = n = 25   ΣfX = 98.75       ΣfX² = 390.8825
2.20 Find the coefficient of variation V for the data in (a) Table 2.7 and (b) Table 2.12. (c) What is the usefulness of the coefficient of variation?

(a) With \(\mu = 6\) and \(\sigma \cong 2.19\) (see Prob. 2.18)

\[ V = \frac{\sigma}{\mu} \cong \frac{2.19 \text{ points}}{6 \text{ points}} \cong 0.365, \text{ or } 36.5\% \]

(b) With \(\bar{X} = \$3.95\) and \(s \cong \$0.18\) (see Prob. 2.19)

\[ V = \frac{s}{\bar{X}} \cong \frac{\$0.18}{\$3.95} \cong 0.046, \text{ or } 4.6\% \]

(c) The coefficient of variation measures the relative dispersion in the data and is expressed as a pure number without any units. This is to be contrasted with standard deviation and other measures of absolute dispersion, which are expressed in the units of the problem. Thus the coefficient of variation can be used to compare the relative dispersion of two or more distributions expressed in different units, as well as when the true mean values differ. For example, we can say that the dispersion of the data in Table 2.7 is greater than that in Table 2.12. The coefficient of variation also can be used to compare the relative dispersion of the same type of data over different time periods (when \(\mu\) or \(\bar{X}\) and \(\sigma\) or \(s\) change).
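The comparison in part (c) can be sketched in a few lines of Python. The helper name `coefficient_of_variation` is illustrative; the inputs are the rounded mean and standard deviation quoted in the problem for the grades and the hourly wages.

```python
def coefficient_of_variation(std_dev, mean):
    """Relative dispersion: standard deviation as a fraction of the mean (unit-free)."""
    return std_dev / mean

v_grades = coefficient_of_variation(2.19, 6)     # grades, in points
v_wages = coefficient_of_variation(0.18, 3.95)   # hourly wages, in dollars

print(round(v_grades, 3), round(v_wages, 3))
print("grades more dispersed:", v_grades > v_wages)
```

Because the units cancel in each ratio, points and dollars can be compared directly.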
SHAPE OF FREQUENCY DISTRIBUTIONS
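2.21 Find the Pearson coefficient of skewness for the (grouped) data in (a) Table 2.9 and (b) Table 2.12.

(a) With \(\mu = 6\), med = 6.17 [see Prob. 2.3(b)], and \(\sigma \cong 2.19\) [see Prob. 2.15(b)]

\[ \text{Sk} = \frac{3(\mu - \text{med})}{\sigma} = \frac{3(6 - 6.17)}{2.19} = \frac{3(-0.17)}{2.19} \cong -0.23 \quad \text{(a pure number)} \]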
Note that the median is greater than the mean and that the distribution is slightly negatively skewed (see Fig.
2a.
(b) With \(\bar{X} = \$3.95\), med = $3.97 (see Prob. 2.4), and \(s \cong \$0.18\) (see Prob. 2.16)

\[ \text{Sk} = \frac{3(\bar{X} - \text{med})}{s} = \frac{3(3.95 - 3.97)}{0.18} = \frac{3(-0.02)}{0.18} \cong -0.33 \]
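The Pearson coefficient is a one-line calculation once the mean, median, and standard deviation are known. A minimal sketch, using the rounded values quoted in this problem (the function name is illustrative):

```python
def pearson_skewness(mean, median, std_dev):
    """Pearson coefficient of skewness: 3(mean - median) / standard deviation."""
    return 3 * (mean - median) / std_dev

sk_grades = pearson_skewness(6, 6.17, 2.19)    # grades data
sk_wages = pearson_skewness(3.95, 3.97, 0.18)  # hourly wage data

print(round(sk_grades, 2), round(sk_wages, 2))
```

Both values are negative because in each data set the median exceeds the mean.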
2.22 Using the formula for skewness based on the third moment, find the coefficient of skewness for the data in (a) Table 2.9 and (b) Table 2.12.

(a) We can find the coefficient of skewness for the data in Table 2.9 using the formula based on the third moment with the aid of Table 2.22:

\[ \text{Sk} = \frac{\sum f(X - \mu)^3}{\sigma^3} = \frac{-42}{(2.19)^3} \cong -4 \]

This indicates that this distribution is negatively skewed, but the degree of skewness is measured differently than in Prob. 2.21.
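Table 2.22 Calculations for Skewness for the Data in Table 2.9

Grade      Class Midpoint X   Frequency f   Mean μ   X − μ   (X − μ)³   f(X − μ)³
1.5–2.4    2                  3             6        −4      −64        −192
2.5–3.4    3                  3             6        −3      −27         −81
3.5–4.4    4                  5             6        −2       −8         −40
4.5–5.4    5                  5             6        −1       −1          −5
5.5–6.4    6                  6             6         0        0           0
6.5–7.4    7                  8             6         1        1           8
7.5–8.4    8                  4             6         2        8          32
8.5–9.4    9                  4             6         3       27         108
9.5–10.4   10                 2             6         4       64         128
                              Σf = N = 40                                Σf(X − μ)³ = −42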
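(b) See Table 2.23.

Note that regardless of the measure of skewness used, the …

Hourly Wage, $   Class Midpoint X, $   Frequency f   Mean X̄, $   X − X̄, $   (X − X̄)⁴   f(X − X̄)⁴
3.50–3.59        3.55                  1             3.95        −0.40      0.0256     0.0256
3.60–3.69        3.65                  2             3.95        −0.30      0.0081     0.0162
3.70–3.79        3.75                  2             3.95        −0.20      0.0016     0.0032
3.80–3.89        3.85                  4             3.95        −0.10      0.0001     0.0004
3.90–3.99        3.95                  5             3.95         0.00      0          0
4.00–4.09        4.05                  6             3.95         0.10      0.0001     0.0006
4.10–4.19        4.15                  3             3.95         0.20      0.0016     0.0048
4.20–4.29        4.25                  2             3.95         0.30      0.0081     0.0162
                                       Σf = n = 25                                     Σf(X − X̄)⁴ = 0.0670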
2.24 Find the covariance between hourly wage X and education Y, measured in years of schooling, in the data in Table 2.26.

Table 2.26 Employee Hourly Wages and Years of Schooling

Employee Number   Hourly Wage X, $   Years of Schooling Y
1                  8.50              12
2                 12.00              14
3                  9.00              10
4                 10.50              12
5                 11.00              16
6                 15.00              16
7                 25.00              18
8                 12.00              18
9                  6.50              12
10                 8.25              10
From the calculations in Table 2.27, cov(X, Y) = (103.55/10) = 10.355. When X and Y are both above or below their means, covariance is increased. When X and Y move in opposite directions relative to their means (employee 5), covariance is decreased. Since in this case cov(X, Y) > 0, X and Y move together relative to their means.
Table 2.27

Employee Number   Hourly Wage X, $   Years of Schooling Y   (X − X̄)   (Y − Ȳ)   (X − X̄)(Y − Ȳ)
1                  8.50              12                     −3.275    −1.8        5.895
2                 12.00              14                      0.225     0.2        0.045
3                  9.00              10                     −2.775    −3.8       10.545
4                 10.50              12                     −1.275    −1.8        2.295
5                 11.00              16                     −0.775     2.2       −1.705
6                 15.00              16                      3.225     2.2        7.095
7                 25.00              18                     13.225     4.2       55.545
8                 12.00              18                      0.225     4.2        0.945
9                  6.50              12                     −5.275    −1.8        9.495
10                 8.25              10                     −3.525    −3.8       13.395
                  X̄ = 11.775         Ȳ = 13.8                                   Σ(X − X̄)(Y − Ȳ) = 103.55
2.25 Compute the covariance from Table 2.26 using the alternate formula.

Computations are given in Table 2.28. \(\text{cov}(X, Y) = \sum XY / n - \bar{X}\bar{Y} = (1728.5/10) - (11.775)(13.8) = 172.85 - 162.495 = 10.355\).
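Table 2.28 Calculations for Covariance with Alternate Formula

Employee Number   Hourly Wage X, $   Years of Schooling Y   XY
1                  8.50              12                     102.00
2                 12.00              14                     168.00
3                  9.00              10                      90.00
4                 10.50              12                     126.00
5                 11.00              16                     176.00
6                 15.00              16                     240.00
7                 25.00              18                     450.00
8                 12.00              18                     216.00
9                  6.50              12                      78.00
10                 8.25              10                      82.50
                  ΣX = 117.75        ΣY = 138               ΣXY = 1728.50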
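The two covariance formulas used in Probs. 2.24 and 2.25 can be compared directly in code. This is a sketch under one stated assumption: the wage and schooling values below are the ten observations as read from Tables 2.26 and 2.27, and the variable names are illustrative.

```python
# Hourly wages (X, dollars) and years of schooling (Y) for the 10 employees,
# as read from Tables 2.26 and 2.27 (assumed reading of the source tables).
wages = [8.50, 12.00, 9.00, 10.50, 11.00, 15.00, 25.00, 12.00, 6.50, 8.25]
schooling = [12, 14, 10, 12, 16, 16, 18, 18, 12, 10]

n = len(wages)
mean_x = sum(wages) / n
mean_y = sum(schooling) / n

# Definition: average product of deviations from the means.
cov_definition = sum((x - mean_x) * (y - mean_y)
                     for x, y in zip(wages, schooling)) / n

# Alternate formula: average of XY minus the product of the means.
cov_alternate = sum(x * y for x, y in zip(wages, schooling)) / n - mean_x * mean_y

print(mean_x, mean_y, round(cov_definition, 3), round(cov_alternate, 3))
```

The two results agree up to floating-point rounding, and the positive sign reflects wages and schooling moving together relative to their means.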
Supplementary Problems
FREQUENCY DISTRIBUTIONS:
2.26 Table 2.29 gives the frequency for gasoline prices at 48 stations in a town. Present the data in the form of a histogram, a relative-frequency histogram, a frequency polygon, and an ogive.

Table 2.29 Frequency Distribution of Gasoline Prices

Price, $      Frequency
1.00–1.04      4
1.05–1.09      6
1.10–1.14     10
1.15–1.19     15
1.20–1.24      8
1.25–1.29      5
2.27 Table 2.30 gives the frequency distribution of family incomes for a sample of 100 families in a city. Graph the data into a histogram, a relative-frequency histogram, a frequency polygon, and an ogive.

Table 2.30 Frequency Distribution of Family Incomes

Family Income, $   Frequency
10,000–11,999       12
12,000–13,999       14
14,000–15,999       24
16,000–17,999       15
18,000–19,999       13
20,000–21,999        7
22,000–23,999        6
24,000–25,999        4
26,000–27,999        3
28,000–29,999        2
                   100
MEASURES OF CENTRAL TENDENCY
2.28 Find (a) the mean, (b) the median, and (c) the mode for the grouped data in Table 2.29.
Ans. (a) μ = $1.15 (b) Median = $1.16 (c) Mode = $1.17

2.29 Find (a) the mean, (b) the median, and (c) the mode for the frequency distribution of incomes in Table 2.30.
Ans. (a) X̄ = $17,000 (b) Median = $16,000 (c) Mode = $15,053

2.30 Find the mean for the grouped data in (a) Table 2.29 and (b) Table 2.30 by coding.
Ans. (a) μ = $1.15 (b) X̄ = $17,000

2.31 A firm pays 5/12 of its labor force a wage of $5, 1/3 of the labor force a wage of $6, and 1/4 a wage of $7. What is the weighted average paid by this firm?
Ans. μ_w = $5.83

2.32 For the same amount of capital invested in each of 3 years, an investor earned a rate of return of 1% during the first year, 4% during the second year, and 16% during the third. (a) Find μ_G. (b) Find μ. (c) Which is appropriate?
Ans. (a) μ_G = 4% (b) μ = 7% (c) μ_G

2.33 A plane traveled 200 mi at 600 mi/h and 100 mi at 500 mi/h. What was its average speed?
Ans. μ_H = 562.5 mi/h

2.34 A driver purchases $10 worth of gasoline at $0.90 a gallon and $10 at $1.10 a gallon. What is the average price per gallon?
Ans. μ_H = $0.99 per gallon

2.35 For the grouped data of Table 2.29, find (a) the first quartile, (b) the second quartile, (c) the third quartile, (d) the fourth decile, and (e) the seventieth percentile.
Ans. (a) Q1 = $1.11 (b) Q2 = $1.16 (c) Q3 = $1.21 (d) D4 = $1.146 (e) P70 = $1.195

2.36 For the grouped data in Table 2.30, find (a) the first quartile, (b) the third quartile, (c) the third decile, and (d) the sixtieth percentile.
Ans. (a) Q1 = $13,857 (b) Q3 = $19,538 (c) D3 = $14,333 (d) P60 = $17,333
MEASURES OF DISPERSION
2.37 What is the range of the distribution of (a) gasoline prices in Table 2.29 and (b) family incomes in Table 2.30?
Ans. (a) $0.29 (b) $10,000 to $29,999, or $20,000

2.38 Find the interquartile range and quartile deviation for the data in (a) Table 2.29 and (b) Table 2.30.
Aes (ah IRE SO Mand ON NANG (b) TR SATéand OR S938
2.39 Find the average deviation for the data in (a) Table 2.29 and (b) Table 2.30.
Ans. (a) $0.0575 (b) $3,520

2.40 Find (a) the variance and (b) the standard deviation for the frequency distribution of gasoline prices in Table 2.29.
Ans. (a) σ² ≅ 0.0049 dollars squared (b) σ ≅ $0.0698

2.41 Find (a) the variance and (b) the standard deviation for the frequency distribution of family incomes in Table 2.30.
Ans. (a) s² = 19,760,000 dollars squared (b) s ≅ $4,445.22

2.42 Using the easier computational formulas, find (a) the variance and (b) the standard deviation for the distribution of gasoline prices in Table 2.29.
Ans. (a) σ² ≅ 0.0049 dollars squared (b) σ ≅ $0.0698

2.43 Using the easier computational formulas, find (a) the variance and (b) the standard deviation for the family incomes in Table 2.30.
Ans. (a) s² = 19,760,000 dollars squared (b) s ≅ $4,445.22

2.44 Find the coefficient of variation V for (a) the data in Table 2.29 and (b) the data in Table 2.30. (c) Which data have the greater dispersion?
Ans. (a) 0.061, or 6.1% (b) 0.261, or 26.1% (c) The data of Table 2.30
SHAPE OF FREQUENCY DISTRIBUTIONS
2.45 Find the Pearson coefficient of skewness for the data in (a) Table 2.29 and (b) Table 2.30.
Ans. (a) −0.43 (b) 0.67

2.46 Find the coefficient of skewness using the formula based on the third moment for the data in (a) Table 2.29 and (b) Table 2.30.
Ans. (a) = 188 (8) 755
2.47 Find the coefficient of kurtosis for the data in (a) Table 2.29 and (b) Table 2.30.
es, (a) 177 (8) 300
2.48 For covariance, (a) in what range should the covariance for directly related data fall? (b) for inversely related data? (c) for unrelated data?
Ans. (a) cov > 0 (b) cov < 0 (c) cov = 0

Probability and Probability Distributions
3.1 PROBABILITY OF A SINGLE EVENT

If event A can occur in \(n_A\) ways out of a total of N possible and equally likely outcomes, the probability that event A will occur is given by

\[ P(A) = \frac{n_A}{N} \qquad (3.1) \]

where P(A) = probability that event A will occur
\(n_A\) = number of ways that event A can occur
N = total number of equally possible outcomes

Probability can be visualized with a Venn diagram. In Fig. 3-1, the circle represents event A, and the total area of the rectangle represents all possible outcomes.

Fig. 3-1

P(A) ranges between 0 and 1:

\[ 0 \le P(A) \le 1 \qquad (3.2) \]

If P(A) = 0, event A cannot occur. If P(A) = 1, event A will occur with certainty.

If P(A') represents the probability of nonoccurrence of event A, then

\[ P(A) + P(A') = 1 \qquad (3.3) \]
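EXAMPLE 1. A head (H) and a tail (T) are the two equally possible outcomes in tossing a balanced coin. Thus

\[ P(H) = \frac{1}{2}, \qquad P(T) = \frac{1}{2}, \qquad \text{and} \qquad P(H) + P(T) = 1 \]

EXAMPLE 2. In rolling a fair die once, there are six possible and equally likely outcomes: 1, 2, 3, 4, 5, and 6. Thus

\[ P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = \frac{1}{6} \]

The probability of not rolling a 1 is

\[ P(1') = 1 - P(1) = 1 - \frac{1}{6} = \frac{5}{6} \]

and

\[ P(1) + P(1') = \frac{1}{6} + \frac{5}{6} = 1 \]

EXAMPLE 3. A card deck has 52 cards divided into 4 suits (diamonds, hearts, clubs, and spades) with 13 cards in each suit (1, 2, 3, ..., 10, jack, queen, king). If the deck is well shuffled, each of the 52 cards is equally likely to be picked. Since there are 4 jacks, the probability of picking a jack, J, on a single pick is

\[ P(J) = \frac{n_J}{N} = \frac{4}{52} = \frac{1}{13} \]

Since there are 13 diamonds, D,

\[ P(D) = \frac{13}{52} = \frac{1}{4}, \qquad P(D') = 1 - P(D) = 1 - \frac{1}{4} = \frac{3}{4} \]

and

\[ P(D) + P(D') = \frac{1}{4} + \frac{3}{4} = 1 \]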
EXAMPLE 4. Suppose that in 100 tosses of a balanced coin, we get 53 heads and 47 tails. The relative frequency of heads is 53/100, or 0.53. This is the relative frequency or empirical probability, which is to be distinguished from the a priori or classical probability of P(H) = 0.5. As the number of tosses increases and approaches infinity in the limit, the relative frequency or empirical probability approaches the a priori or classical probability. For example, the relative frequency or empirical probability might be 0.517 for 1000 tosses, 0.508 for 10,000 tosses, and so on.
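The tendency of the relative frequency to settle near the classical probability can be illustrated by simulation. The sketch below is only an illustration: it uses Python's pseudorandom generator as a stand-in for a balanced coin, and the function name and seed are arbitrary choices, so the exact frequencies printed depend on them.

```python
import random

def relative_frequency_of_heads(tosses, seed=0):
    """Simulate `tosses` flips of a balanced coin and return the share of heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(tosses))
    return heads / tosses

for tosses in (100, 1_000, 10_000, 100_000):
    print(tosses, relative_frequency_of_heads(tosses))
```

With more tosses the printed relative frequencies typically cluster more tightly around 0.5, although any single run can still wander.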
3.2 PROBABILITY OF MULTIPLE EVENTS

1. Rule of addition for nonmutually exclusive events. Two events, A and B, are not mutually exclusive if the occurrence of A does not preclude the occurrence of B, or vice versa. Then

\[ P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B) \qquad (3.4) \]

P(A and B) is subtracted to avoid double counting. This can be seen with the Venn diagram in Fig. 3-2.

2. Rule of addition for mutually exclusive events. Two events, A and B, are mutually exclusive if the occurrence of A precludes the occurrence of B, or vice versa [P(A and B) = 0]. Then

\[ P(A \text{ or } B) = P(A) + P(B) \qquad (3.5) \]

Fig. 3-2
3. Rule of multiplication for dependent events. Two events are dependent if the occurrence of one is connected in some way with the occurrence of the other. Then the joint probability of A and B is

\[ P(A \text{ and } B) = P(A) \cdot P(B/A) \qquad (3.6) \]

This reads: "The probability that both events A and B will take place equals the probability of event A times the probability of event B, given that event A has already occurred."

P(B/A) = conditional probability of B, given that A has already occurred (3.7)

and

\[ P(A \text{ and } B) = P(B \text{ and } A) \qquad (3.8) \]
Dee rob, 5.1(6) and (a).
4. Rule of multiplication for independent events. Two events, A and B, are independent if the occurrence of A is not connected in any way to the occurrence of B [P(B/A) = P(B)]. Then

\[ P(A \text{ and } B) = P(A) \cdot P(B) \qquad (3.9) \]
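EXAMPLE 5. On a single toss of a die, we can get only one of six possible outcomes: 1, 2, 3, 4, 5, or 6. These are mutually exclusive events. If the die is fair, P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = 1/6. The probability of getting a 2 or a 3 on a single toss of the die is

\[ P(2 \text{ or } 3) = P(2) + P(3) = \frac{1}{6} + \frac{1}{6} = \frac{2}{6} = \frac{1}{3} \]

Similarly

\[ P(2 \text{ or } 3 \text{ or } 4) = P(2) + P(3) + P(4) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{3}{6} = \frac{1}{2} \]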
EXAMPLE 6. Picking at random a spade or a king on a single pick from a well-shuffled card deck does not constitute two mutually exclusive events because we could pick the king of spades. Thus

\[ P(S \text{ or } K) = P(S) + P(K) - P(S \text{ and } K) = \frac{13}{52} + \frac{4}{52} - \frac{1}{52} = \frac{16}{52} = \frac{4}{13} \]

Using set theory, the preceding statement can be rewritten in an equivalent way as

\[ P(S \cup K) = P(S) + P(K) - P(S \cap K) = \frac{13}{52} + \frac{4}{52} - \frac{1}{52} = \frac{16}{52} = \frac{4}{13} \]

where the symbol \(\cup\) (read "union") replaces or and \(\cap\) (read "intersection") replaces and.
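The rule of addition for events that are not mutually exclusive can be verified by listing the deck and counting. In this sketch the card encoding (rank, suit) and the helper `prob` are illustrative choices, not notation from the text.

```python
from fractions import Fraction
from itertools import product

suits = ["diamonds", "hearts", "clubs", "spades"]
ranks = [str(v) for v in range(1, 11)] + ["jack", "queen", "king"]
deck = set(product(ranks, suits))  # 52 equally likely (rank, suit) cards

def prob(event):
    """Classical probability: favorable outcomes over all equally likely outcomes."""
    return Fraction(len(event), len(deck))

spades = {card for card in deck if card[1] == "spades"}
kings = {card for card in deck if card[0] == "king"}

p_union_counted = prob(spades | kings)                                # count the union directly
p_union_by_rule = prob(spades) + prob(kings) - prob(spades & kings)   # rule of addition

print(p_union_counted, p_union_by_rule)
```

The subtraction removes the king of spades, which would otherwise be counted once as a spade and once as a king.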
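EXAMPLE 7. The outcomes of two successive tosses of a balanced coin are independent events. The outcome of the first toss in no way affects the outcome on the second toss. Thus

\[ P(H \text{ and } H) = P(H \cap H) = P(H) \cdot P(H) = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4} \]

Similarly,

\[ P(H \text{ and } H \text{ and } H) = P(H \cap H \cap H) = P(H) \cdot P(H) \cdot P(H) = \frac{1}{2} \cdot \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{8} \]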
EXAMPLE 8. The probability that on the first pick from a deck we get the king of diamonds is

\[ P(K_D) = \frac{1}{52} \]

If the first card picked was indeed the king of diamonds and if the first card was not replaced, the probability of getting another king on the second pick is dependent on the first pick because there are now only 3 kings and 51 cards left in the deck. The conditional probability of picking another king, given that the king of diamonds was already picked and not replaced, is

\[ P(K/K_D) = \frac{3}{51} \]

Thus the probability of picking the king of diamonds on the first pick and, without replacement, picking another king on the second pick is

\[ P(K_D \text{ and } K) = P(K_D) \cdot P(K/K_D) = \frac{1}{52} \cdot \frac{3}{51} = \frac{3}{2652} \]

or about 1 in 1000. Related to conditional probability is Bayes' theorem (see Prob. 3.17). Combinations and permutations, or "counting techniques," are reviewed starting with Prob. 3.18.
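The rule of multiplication for dependent events in Example 8 can be checked by enumerating every ordered pair of cards drawn without replacement. The card encoding and variable names below are illustrative only.

```python
from fractions import Fraction
from itertools import permutations, product

suits = ["diamonds", "hearts", "clubs", "spades"]
ranks = [str(v) for v in range(1, 11)] + ["jack", "queen", "king"]
deck = list(product(ranks, suits))

# Every ordered (first card, second card) pair when the first card is not replaced.
ordered_pairs = list(permutations(deck, 2))

favorable = [
    (first, second)
    for first, second in ordered_pairs
    if first == ("king", "diamonds") and second[0] == "king"
]

p_counted = Fraction(len(favorable), len(ordered_pairs))
p_by_rule = Fraction(1, 52) * Fraction(3, 51)  # P(K_D) times P(K / K_D)

print(len(ordered_pairs), p_counted, p_by_rule)
```

Counting gives 3 favorable pairs out of 52 × 51 equally likely ordered pairs, which is the same product of the marginal and the conditional probability.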
3.3 DISCRETE PROBABILITY DISTRIBUTIONS: THE BINOMIAL DISTRIBUTION

A random variable is a variable whose values are associated with some probability of being observed. A discrete (as opposed to continuous) random variable is one that can assume only finite and distinct values. The set of all possible values of a random variable and its associated probabilities is called a probability distribution. The sum of all probabilities equals 1 (see Example 9).

One discrete probability distribution is the binomial distribution. This is used to find the probability of X number of occurrences or successes of an event, P(X), in n trials of the same experiment when (1) there are only two possible and mutually exclusive outcomes, (2) the n trials are independent, and (3) the probability of occurrence or success, p, remains constant in each trial. Then

\[ P(X) = \frac{n!}{X!(n - X)!} \, p^X (1 - p)^{n - X} \qquad (3.10) \]

where n! (read "n factorial") = \(n \cdot (n - 1) \cdot (n - 2) \cdots 3 \cdot 2 \cdot 1\), and 0! = 1 by definition (see Prob. 3.18).

The mean of the binomial distribution is

\[ \mu = np \qquad (3.11) \]

The standard deviation is

\[ \sigma = \sqrt{np(1 - p)} \qquad (3.12) \]

If p = 1 − p = 0.5, the binomial distribution is symmetrical; if p < 0.5, it is skewed to the right; and if p > 0.5, it is skewed to the left.
EXAMPLE 9. The possible outcomes in 2 tosses of a balanced coin are TT, TH, HT, and HH. Thus

\[ P(0H) = \frac{1}{4}, \qquad P(1H) = \frac{1}{2}, \qquad \text{and} \qquad P(2H) = \frac{1}{4} \]

The number of heads is therefore a discrete random variable, and the set of all possible outcomes with their associated probabilities is a discrete probability distribution (see Table 3.1 and Fig. 3-3).

Table 3.1 Probability Distribution of Heads in Two Tosses of a Balanced Coin

Number of Heads   Possible Outcomes   Probability
0                 TT                  0.25
1                 TH, HT              0.50
2                 HH                  0.25
                                      1.00

Fig. 3-3 Probability Distribution of Heads in Two Tosses of a Balanced Coin (horizontal axis: number of heads; vertical axis: probability)
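EXAMPLE 10. Using the binomial distribution, we can find the probability of 4 heads in 6 flips of a balanced coin as follows:

\[ P(4) = \frac{6!}{4!(6 - 4)!} \left(\frac{1}{2}\right)^4 \left(\frac{1}{2}\right)^2 = \frac{6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{(4 \cdot 3 \cdot 2 \cdot 1)(2 \cdot 1)} \cdot \frac{1}{16} \cdot \frac{1}{4} = 15 \cdot \frac{1}{64} = \frac{15}{64} \cong 0.23 \]

When n and X are large numbers, long calculations can be avoided by using App. 1. The expected number of heads in 6 flips is \(\mu = np = (6)(1/2) = 3\) heads. The standard deviation of the probability distribution of 6 flips is

\[ \sigma = \sqrt{np(1 - p)} = \sqrt{(6)(1/2)(1/2)} = \sqrt{6/4} = \sqrt{1.5} \cong 1.22 \text{ heads} \]

Because p = 0.5, this probability distribution is symmetrical. If we were not dealing with a coin and the trials were not independent (as in sampling without replacement), we would have had to use the hypergeometric distribution (see Prob. 3.27).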
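The binomial calculation in Example 10 can be reproduced with the standard library instead of a table. This is a minimal sketch; `binomial_pmf` is an illustrative helper name, and `math.comb` supplies the n!/[X!(n − X)!] term.

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x) for n independent trials with constant success probability p."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 6, 0.5
p_four_heads = binomial_pmf(4, n, p)

mean = n * p                          # mu = np
std_dev = math.sqrt(n * p * (1 - p))  # sigma = sqrt(np(1 - p))

print(p_four_heads, mean, round(std_dev, 2))
print(sum(binomial_pmf(x, n, p) for x in range(n + 1)))  # probabilities over all X
```

Summing the probabilities over X = 0, ..., 6 is a quick sanity check that they form a complete distribution.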
3.4 THE POISSON DISTRIBUTION

The Poisson distribution is another discrete probability distribution. It is used to determine the probability of a designated number of successes per unit of time, when the events or successes are independent and the average number of successes per unit of time remains constant. Then

\[ P(X) = \frac{\lambda^X e^{-\lambda}}{X!} \qquad (3.13) \]

where X = designated number of successes
P(X) = probability of X number of successes
λ (Greek letter lambda) = average number of successes per unit of time
e = base of the natural logarithmic system, or 2.71828

Given the value of λ (the expected value or mean and variance of the Poisson distribution), we can find \(e^{-\lambda}\) from App. 2, substitute in Eq. (3.13), and find P(X).

EXAMPLE 11. A police department receives an average of 5 calls per hour. The probability of receiving 2 calls in a randomly selected hour is

\[ P(X) = \frac{\lambda^X e^{-\lambda}}{X!} = \frac{5^2 e^{-5}}{2!} = \frac{(25)(0.00674)}{2} \cong 0.084 \]
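The Poisson formula can also be evaluated directly rather than through a table of \(e^{-\lambda}\). The sketch below assumes, as an illustration, an average of 5 calls per hour and asks for the probability of exactly 2 calls; `poisson_pmf` is an illustrative helper name.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) when successes occur independently at an average rate lam per unit of time."""
    return lam ** x * math.exp(-lam) / math.factorial(x)

lam = 5  # assumed average number of calls per hour
p_two_calls = poisson_pmf(2, lam)

print(round(math.exp(-lam), 5), round(p_two_calls, 4))
```

Using the unrounded value of \(e^{-5}\) changes the result only in the fourth decimal place compared with a table value rounded to five decimals.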
The Poisson distribution can be used as an approximation to the binomial distribution when n is large and p or 1 − p is small [say, n ≥ 30 and np < 5 or n(1 − p) < 5].

The normal distribution approximates the binomial distribution when n ≥ 30 and np > 5 and n(1 − p) > 5, and it approximates the Poisson distribution when λ ≥ 10 (see Probs. 3.37 and 3.38). Another continuous probability distribution is the exponential distribution (see Prob. 3.39). Chebyshev's theorem, or inequality, states that regardless of the shape of a distribution, the proportion of the observations or area falling within K standard deviations of the mean is at least \(1 - (1/K^2)\), for K > 1 (see Probs. 3.40 and 3.72).
Solved Problems
PROBABILITY OF A SINGLE EVENT
3.1 (a) Distinguish among classical or a priori probability, relative frequency or empirical probability, and subjective or personalistic probability. (b) What is the disadvantage of each? (c) Why do we study probability theory?

(a) According to classical probability, the probability of an event A is given by

\[ P(A) = \frac{n_A}{N} \]

where P(A) = probability that event A will occur
\(n_A\) = number of ways event A can occur
N = total number of equally possible outcomes

By the classical approach, we can make probability statements about balanced coins, fair dice, and standard card decks a priori, or without tossing a coin, rolling a die, or drawing a card. Relative frequency or empirical probability is given by the ratio of the number of times an event occurs to the total number of actual outcomes or observations. As the number of experiments or trials (such as the tossing of a coin) increases, the relative frequency or empirical probability approaches the classical or a priori probability. Subjective or personalistic probability refers to the degree of belief of an individual that the event will occur, based on whatever evidence is available to the individual.

(b) The classical or a priori approach to probability can only be applied to games of chance (such as tossing a balanced coin, rolling a fair die, or picking cards from a standard deck of cards) where we can determine a priori, or without experimentation, the probability that an event will occur. In real-world problems of economics and business, we often cannot assign probabilities a priori and the classical approach cannot be used. The relative-frequency or empirical approach overcomes the disadvantages of the classical approach by using the relative frequencies of past occurrences as probabilities. The difficulty with the relative-frequency or empirical approach is that we get different probabilities (relative frequencies) for different numbers of trials or experiments. These probabilities stabilize, or approach a limit, as the number of trials or experiments increases. Because this may be expensive and time-consuming, people may end up using it without a "sufficient" number of trials or experiments. The disadvantage of the subjective or personalistic approach to probability is that different people faced with the same situation may come up with completely different probabilities.

(c) Most of the decisions we face in economics, business, science, and everyday life involve risks and probabilities. These probabilities are easier to understand and illustrate for games of chance because objective probabilities can easily be assigned to various events. However, the primary reason for studying probability theory is to help us make intelligent decisions in economics, business, science, and everyday life when risk and uncertainty are involved.
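3.2 What is the probability of (a) A head in one toss of a balanced coin? A tail? A head or a tail? (b) A 2 in one rolling of a fair die? Not a 2? A 2 or not a 2?

(a) \[ P(H) = \frac{1}{2}, \qquad P(T) = \frac{1}{2}, \qquad P(H) + P(T) = \frac{1}{2} + \frac{1}{2} = 1 \]

(b) Since each of the 6 sides of a fair die is equally likely to come up and a 2 is one of the possibilities,

\[ P(2) = \frac{1}{6} \]

The probability of not rolling a 2, that is, P(2'), is given by

\[ P(2') = 1 - P(2) = 1 - \frac{1}{6} = \frac{5}{6} \]

\[ P(2) + P(2') = \frac{1}{6} + \frac{5}{6} = 1 \]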
3.3 What is the probability that on a single pick from a well-shuffled standard deck of cards we get (a) a king, (b) a spade, (c) the king of spades, (d) not the king of spades, or (e) the king of spades or not the king of spades?

(a) Since there are 4 kings K in the 52 cards of the standard deck, \(P(K) = 4/52 = 1/13\).
(b) Since there are 13 spades S in the 52 cards, \(P(S) = 13/52 = 1/4\).
(c) There is only one king of spades in the deck; therefore \(P(K_S) = 1/52\).
(d) The probability of not picking the king of spades is \(P(K_S') = 1 - 1/52 = 51/52\).
(e) \(P(K_S) + P(K_S') = 1/52 + 51/52 = 52/52 = 1\), or certainty.
3.4 An urn (vase) contains 10 balls that are exactly alike except that 5 are red, 3 are blue, and 2 are green. What is the probability that, in picking up a single ball, the ball is (a) Red? (b) Blue? (c) Green? (d) Nonblue? (e) Nongreen? (f) Green or nongreen? (g) What are the odds of picking a blue ball? (h) What are the odds of not picking a blue ball?

(a) \(P(R) = n_R/N = 5/10 = 0.5\)
(b) \(P(B) = n_B/N = 3/10 = 0.3\)
(c) \(P(G) = n_G/N = 2/10 = 0.2\)
(d) \(P(B') = 1 - P(B) = 1 - 0.3 = 0.7\)
(e) \(P(G') = 1 - P(G) = 1 - 0.2 = 0.8\)
(f) \(P(G) + P(G') = 0.2 + 0.8 = 1\)
(g) The odds of picking a blue ball are given by the ratio of the number of ways of picking a blue ball to the number of ways of not picking a blue ball. Since there are 3 blue balls and 7 nonblue balls, the odds in favor of picking a blue ball are 3 to 7, or 3:7.
(h) The odds of not (against) picking a blue ball are 7 to 3, or 7:3.
3.5 Suppose that a 3 comes up 106 times in 600 tosses of a die. (a) What is the relative frequency of the 3? How does this differ from classical or a priori probability? (b) What would you expect to be the relative frequency or empirical probability if you increased the number of times the die is rolled?

(a) The relative frequency or empirical probability of the 3 is given by the ratio of the number of times 3 comes up (106) out of the total number of times the die is rolled (600). Thus the relative frequency or empirical probability of the 3 is 106/600 ≅ 0.177 in 600 rolls. According to the classical or a priori approach (and without rolling the die at all), P(3) = 1/6 ≅ 0.167. If the die is fair, we expect the 3 to come up 100 times in 600 rolls of the die as compared with the actual, observed, or empirical 106 times.

(b) If the number of times the same die is rolled is increased from 600, we expect the relative frequency or empirical probability to approach (i.e., to become less unequal with) the classical or a priori probability.
3.6 The production process results in 27 defective items for each 1000 items produced. (a) What is the relative frequency or empirical probability of a defective item? (b) How many defective items do you expect out of the 1600 items produced each day?

(a) The relative frequency or empirical probability of a defective item is 27/1000 = 0.027.

(b) By multiplying the number of items produced each day (1600) by the relative frequency or empirical probability of a defective item (0.027), we get the number of defective items we expect out of each day's output. This is (1600)(0.027) ≅ 43, to the nearest item.
PROBABILITY OF MULTIPLE EVENTS
3.7 Define and give some examples of events that are (a) mutually exclusive, (b) not mutually exclusive, (c) independent, and (d) dependent.

(a) Two or more events are mutually exclusive, or disjoint, if the occurrence of one of them precludes or prevents the occurrence of the other(s). When one event takes place, the other(s) will not. For example, in a single flip of a coin, we get either a head or a tail, but not both. Heads and tails are therefore mutually exclusive events. In a single toss of a die, we get one and only one of six possible outcomes: 1, 2, 3, 4, 5, or 6. The outcomes are therefore mutually exclusive. A card picked at random can be of only one suit: diamonds, hearts, clubs, or spades. A child is born either a boy or a girl. An item produced on an assembly line is either good or defective.

(b) Two or more events are not mutually exclusive if they may occur at the same time. The occurrence of one does not preclude the occurrence of the other(s). For example, a card picked at random from a deck of cards can be both an ace and a club. Therefore, aces and clubs are not mutually exclusive events, because we could pick the ace of clubs. Because we can have inflation and recession at the same time, inflation and recession are not mutually exclusive events.

(c) Two or more events are independent if the occurrence of one of them in no way affects the occurrence of the other(s). For example, in two successive flips of a balanced coin, the outcome of the second flip in no way depends on the outcome of the first flip. The same is true for two successive tosses of a pair of dice or picks of two cards from a deck with replacement.

(d) Two or more events are dependent if the occurrence of one of them affects the probability of the occurrence of the other(s). For example, if we pick a card from a deck and do not replace it, the probability of picking the same card on the second pick is 0. All other probabilities also are affected since there are now only 51 cards in the deck. Similarly, if the proportion of defective items is greater for the evening than for the morning shift, the probability that an item picked at random from the evening output is defective is greater than for the morning output.
3.8 Draw a Venn diagram for (a) mutually exclusive events and (b) not mutually exclusive events. (c) Are mutually exclusive events dependent or independent? Why?

(a) Figure 3-6 illustrates the Venn diagram for events A and B, which are mutually exclusive.
(b) Figure 3-7 illustrates the Venn diagram for events A and B, which are not mutually exclusive.

Fig. 3-6        Fig. 3-7

(c) Mutually exclusive events are dependent events. When one event occurs, the probability of the other occurring is 0. Thus the occurrence of the first affects (precludes) the occurrence of the other.
3.9 What is the probability of getting (a) Less than 3 on a single roll of a fair die? (b) Hearts or clubs on a single pick from a well-shuffled standard deck of cards? (c) A red or a blue ball from an urn containing 5 red balls, 3 blue balls, and 2 green balls? (d) More than 3 on a single roll of a fair die?

(a) Getting less than 3 on a single roll of a fair die means getting a 1 or a 2. These are mutually exclusive events. Applying the rule of addition for mutually exclusive events, we get

\[ P(1 \text{ or } 2) = P(1) + P(2) = \frac{1}{6} + \frac{1}{6} = \frac{2}{6} = \frac{1}{3} \]

Using set theory, P(1 or 2) can be rewritten in an equivalent way as \(P(1 \cup 2)\), where \(\cup\) is read "union" and stands for or.

(b) Getting a heart or a club on a single pick from a well-shuffled deck of cards also constitutes two mutually exclusive events. Applying the rule of addition, we get

\[ P(H \text{ or } C) = P(H \cup C) = P(H) + P(C) = \frac{13}{52} + \frac{13}{52} = \frac{26}{52} = \frac{1}{2} \]

(c) \[ P(R \text{ or } B) = P(R \cup B) = P(R) + P(B) = \frac{5}{10} + \frac{3}{10} = \frac{8}{10} = 0.8 \]

(d) \[ P(4 \text{ or } 5 \text{ or } 6) = P(4 \cup 5 \cup 6) = P(4) + P(5) + P(6) = \frac{1}{6} + \frac{1}{6} + \frac{1}{6} = \frac{3}{6} = \frac{1}{2} \]
3.10 (a) What is the probability of getting an ace or a club on a single pick from a well-shuffled standard deck of cards? (In all remaining problems, it will be implicitly assumed that coins are balanced, dice are fair, and decks of cards are standard and well shuffled and cards are picked at random without replacement.) (b) What is the function of the negative term in the rule of addition for events that are not mutually exclusive?

(a) Getting an ace or a club does not constitute two mutually exclusive events because we could get the ace of clubs. Applying the rule of addition for events that are not mutually exclusive, we get

\[ P(A \text{ or } C) = P(A) + P(C) - P(A \text{ and } C) = \frac{4}{52} + \frac{13}{52} - \frac{1}{52} = \frac{16}{52} = \frac{4}{13} \]

The preceding probability statement can be rewritten in an equivalent way using set theory as

\[ P(A \cup C) = P(A) + P(C) - P(A \cap C) \]

where \(\cap\) is read "intersection" and stands for and.

(b) The function of the negative term in the rule of addition for events that are not mutually exclusive is to avoid double counting. For example, in calculating P(A or C) in part a, the ace of clubs is counted twice, once as an ace and once as a club. Therefore, we subtract the probability of getting the ace of clubs in order to avoid this double counting. If the events are mutually exclusive, the probability that both events will occur simultaneously is 0, and no double counting is involved. This is why the rule of addition for mutually exclusive events does not contain a negative term.
3.11 What is the probability of (a) Inflation I or recession R if the probability of inflation is 0.3, the probability of recession is 0.2, and the probability of inflation and recession is 0.06? (b) Drawing an ace, a club, or a diamond on a single pick from a deck?

(a) Since the probability of inflation and recession is not 0, inflation and recession are not mutually exclusive events. Applying the rule of addition, we get

\[ P(I \text{ or } R) = P(I) + P(R) - P(I \text{ and } R) \]

or

\[ P(I \cup R) = P(I) + P(R) - P(I \cap R) \]

and

\[ P(I \text{ or } R) = P(I \cup R) = 0.3 + 0.2 - 0.06 = 0.44 \]

(b) Getting an ace, a club, or a diamond does not constitute mutually exclusive events because we could get the ace of clubs or the ace of diamonds. Applying the rule of addition for events that are not mutually exclusive, we get

\[ P(A \text{ or } C \text{ or } D) = P(A) + P(C) + P(D) - P(A \text{ and } C) - P(A \text{ and } D) = \frac{4}{52} + \frac{13}{52} + \frac{13}{52} - \frac{1}{52} - \frac{1}{52} = \frac{28}{52} = \frac{7}{13} \]
3.12 What is the probability of (a) Two 6s on 2 rolls of a die? (b) A 6 on each die in rolling 2 dice once? (c) Two blue balls in 2 successive picks with replacement from the urn in Prob. 3.4? (d) Three girls in a family with 3 children?
(a) Getting a 6 on each of 2 rolls of a die constitutes independent events. Applying the rule of multiplication for independent events, we get
$$P(6 \text{ and } 6) = P(6 \cap 6) = P(6) \cdot P(6) = \frac{1}{6} \cdot \frac{1}{6} = \frac{1}{36}$$
(b) Getting a 6 on each die in rolling 2 dice once also constitutes independent events. Therefore
$$P(6 \text{ and } 6) = P(6 \cap 6) = P(6) \cdot P(6) = \frac{1}{6} \cdot \frac{1}{6} = \frac{1}{36}$$
(c) Since we replace the first ball picked, the probability of getting a blue ball on the second pick is the same as on the first pick. The events are independent. Therefore
$$P(B \text{ and } B) = P(B \cap B) = P(B) \cdot P(B) = \frac{3}{10} \cdot \frac{3}{10} = \frac{9}{100} = 0.09$$
(d) The probability of a girl, G, on each birth constitutes independent events, each with a probability of 0.5. Therefore
$$P(G \text{ and } G \text{ and } G) = P(G \cap G \cap G) = P(G) \cdot P(G) \cdot P(G) = (0.5)(0.5)(0.5) = 0.125$$
or 1 chance in 8.
3.13 (a) List all possible outcomes in rolling 2 dice simultaneously. (b) What is the probability of getting a total of 5 in rolling 2 dice simultaneously? (c) What is the probability of getting a total of 4 or less in rolling 2 dice simultaneously? More than 4?
(a) Each die has 6 possible and equally likely outcomes, and the outcome on each die is independent. Since each of the 6 outcomes on the first die can be associated with each of the 6 outcomes on the second die, there are a total of 36 possible outcomes; that is, the sample space N is 36. (In Table 3.2, the first number refers to the outcome on the first die, and the second number refers to the second die. The dice can be distinguished by different colors.) The total of the 36 possible outcomes also can be shown by a tree (or sequential) diagram, as in Fig. 3-8.
Table 3.2 Outcomes in Rolling Two Dice Simultaneously
1,1    2,1    3,1    4,1    5,1    6,1
1,2    2,2    3,2    4,2    5,2    6,2
1,3    2,3    3,3    4,3    5,3    6,3
1,4    2,4    3,4    4,4    5,4    6,4
1,5    2,5    3,5    4,5    5,5    6,5
1,6    2,6    3,6    4,6    5,6    6,6
(b) Out of the 36 possible and equally likely outcomes, 4 of them give a total of 5. These are 1,4; 2,3; 3,2; and 4,1. Thus the probability of a total of 5 (event A) in rolling 2 dice simultaneously is given by
$$P(A) = \frac{n_A}{N} = \frac{4}{36} = \frac{1}{9}$$
(c) Rolling a total of 4 or less involves rolling a total of 2, 3, or 4. There are 6 possible and equally likely ways of rolling a total of 4 or less. These are 1,1; 1,2; 1,3; 2,1; 2,2; and 3,1. Thus, if event A is defined as rolling a total of 4 or less, $P(A) = 6/36 = 1/6$. The probability of getting a total of more than 4 equals 1 minus the probability of getting a total of 4 or less. This is $1 - 1/6 = 5/6$.
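The counting argument for two dice can be reproduced by listing the sample space. This is a small sketch under the assumption of two fair, distinguishable dice; the names `outcomes` and `prob` are illustrative only.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes (first die, second die).
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """Share of the equally likely outcomes that satisfy the event."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


p_total_5 = prob(lambda o: sum(o) == 5)
p_total_4_or_less = prob(lambda o: sum(o) <= 4)
p_total_more_than_4 = 1 - p_total_4_or_less  # complement rule

print(len(outcomes), p_total_5, p_total_4_or_less, p_total_more_than_4)
```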
3.14 What is the probability of (a) Picking a second red ball from the urn in Prob. 3.4 when a red ball was already obtained on the first pick and not replaced? (b) A red ball on the second pick when the first ball picked was not red and was not replaced? (c) A red ball on the third pick when a red and a nonred ball were obtained on the first two picks and were not replaced?
(a) Picking a second red ball from the urn when a red ball was already picked on the first pick and was not replaced is a dependent event, since there are now only 4 red balls and 5 nonred balls remaining in the urn. The conditional probability of picking a second red ball when a red ball was already obtained on the first pick and was not replaced is $P(R/R) = 4/9$.
(b) The conditional probability of obtaining a red ball on the second pick when the first ball picked was not red ($R'$) and was not replaced in the urn before the second ball is picked is $P(R/R') = 5/9$.
Fig. 3-8 Tree Diagram for Rolling Two Dice Simultaneously (branches: outcome on the first die, then outcome on the second die)
(c) Since 2 balls, one of which was red, were already picked and not replaced, there remains a total of 8 balls, of which 4 are red, in the urn. The (conditional) probability of picking another red ball is $P(R/R \text{ and } R') = P(R/R' \text{ and } R) = 4/8 = 1/2$.
3.15 What is the probability of obtaining (a) Two red balls from the urn in Prob. 3.4 in 2 picks without replacement? (b) Two aces from a deck in 2 picks without replacement? (c) The ace of clubs and a spade in that order in 2 picks from a deck without replacement? (d) A spade and the ace of clubs in that order in 2 picks from a deck without replacement? (e) Three red balls from the urn of Prob. 3.4 in 3 picks without replacement? (f) Three red balls from the same urn in 3 picks with replacement?
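(a) Applying the rule of multiplication for dependent events, we get
$$P(R \text{ and } R) = P(R \cap R) = P(R) \cdot P(R/R) = \frac{5}{10} \cdot \frac{4}{9} = \frac{20}{90} = \frac{2}{9}$$
(b) $$P(A \text{ and } A) = P(A \cap A) = P(A) \cdot P(A/A) = \frac{4}{52} \cdot \frac{3}{51} = \frac{12}{2652} = \frac{1}{221}$$
(c) $$P(A_C \text{ and } S) = P(A_C \cap S) = P(A_C) \cdot P(S/A_C) = \frac{1}{52} \cdot \frac{13}{51} = \frac{13}{2652} = \frac{1}{204}$$
(d) $$P(S \text{ and } A_C) = P(S \cap A_C) = P(S) \cdot P(A_C/S) = \frac{13}{52} \cdot \frac{1}{51} = \frac{13}{2652} = \frac{1}{204}$$
(e) $$P(R \text{ and } R \text{ and } R) = P(R \cap R \cap R) = P(R) \cdot P(R/R) \cdot P(R/R \text{ and } R) = \frac{5}{10} \cdot \frac{4}{9} \cdot \frac{3}{8} = \frac{60}{720} = \frac{1}{12}$$
(f) With replacement, picking three balls from an urn constitutes three independent events. Therefore
$$P(R \text{ and } R \text{ and } R) = P(R) \cdot P(R) \cdot P(R) = \frac{5}{10} \cdot \frac{5}{10} \cdot \frac{5}{10} = \frac{125}{1000} = \frac{1}{8}$$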
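The rule of multiplication for successive picks can be sketched as a product of conditional probabilities. The function name `prob_all_successes` is hypothetical, and the urn composition (5 red balls out of 10) is an assumption inferred from the solutions above rather than stated in this section.

```python
from fractions import Fraction


def prob_all_successes(successes, total, picks, replace=False):
    """Probability that every one of `picks` draws is a success.

    Without replacement, each success removes one favorable item and one
    item overall, so the conditional probability changes on every pick.
    """
    p = Fraction(1)
    for k in range(picks):
        if replace:
            p *= Fraction(successes, total)          # independent events
        else:
            p *= Fraction(successes - k, total - k)  # dependent events
    return p


two_red = prob_all_successes(5, 10, 2)                   # urn, no replacement
two_aces = prob_all_successes(4, 52, 2)                  # deck, no replacement
three_red = prob_all_successes(5, 10, 3)                 # urn, no replacement
three_red_replaced = prob_all_successes(5, 10, 3, True)  # urn, with replacement

# Ace of clubs first, then any spade, without replacement.
ace_of_clubs_then_spade = Fraction(1, 52) * Fraction(13, 51)

print(two_red, two_aces, three_red, three_red_replaced, ace_of_clubs_then_spade)
```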
3.16 Past experience has shown that for every 100,000 items produced in a plant by the morning shift, 200 are defective, and for every 100,000 items produced by the evening shift, 500 are defective. During a 24-h period, 1000 items are produced by the morning shift and 600 by the evening shift. What is the probability that an item picked at random from the total of 1600 items produced during the 24-h period (a) Was produced by the morning shift and is defective? (b) Was produced by the evening shift and is defective? (c) Was produced by the evening shift and is not defective? (d) Is defective, whether produced by the morning or the evening shift?
(a) The probabilities of picking an item produced by the morning shift M and evening shift E are
$$P(M) = \frac{1000}{1600} = 0.625 \quad \text{and} \quad P(E) = \frac{600}{1600} = 0.375$$
The probabilities of picking a defective item D from the morning and evening outputs separately are
$$P(D/M) = \frac{200}{100{,}000} = 0.002 \quad \text{and} \quad P(D/E) = \frac{500}{100{,}000} = 0.005$$
The probability that an item picked at random from the total of 1600 items produced during the 24-h period was produced by the morning shift and is defective is
$$P(M \text{ and } D) = P(M) \cdot P(D/M) = (0.625)(0.002) = 0.00125$$
(b) $$P(E \text{ and } D) = P(E) \cdot P(D/E) = (0.375)(0.005) = 0.001875$$
(c) $$P(E \text{ and } D') = P(E) \cdot P(D'/E) = (0.375)(0.995) = 0.373125$$
(d) The expected number of defective items from the morning shift is equal to the probability of a defective item from the morning output times the number of items produced by the morning shift; that is, (0.002)(1000) = 2. From the evening shift we expect (0.005)(600) = 3 defective items. Thus we expect 5 defective items from the 1600 items produced during the 24-h period. If there are indeed 5 defective items, the probability of picking at random any of the 5 defective items out of a total of 1600 items is 5/1600 = 1/320, or 0.003125.
3.17 (a) From the rule of multiplication for dependent events A and B, derive the formula for $P(A/B)$ in terms of $P(B/A)$, $P(A)$, and $P(B)$. This is known as Bayes' theorem and is used to revise probabilities when additional relevant information becomes available. (b) Using Bayes' theorem, find the probability that a defective item picked at random from the 24-h output of 1600 items in Prob. 3.16 was produced by the morning shift; by the evening shift.
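(a) $$P(B \text{ and } A) = P(B) \cdot P(A/B)$$
By dividing both sides by $P(B)$ and rearranging, we get
$$P(A/B) = \frac{P(B \text{ and } A)}{P(B)}$$
However, $P(B \text{ and } A) = P(A \text{ and } B)$; see Prob. 3.15(c) and (d). Therefore
$$P(A/B) = \frac{P(A \text{ and } B)}{P(B)} = \frac{P(A) \cdot P(B/A)}{P(B)} \qquad \text{Bayes' theorem}$$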
(b) Applying Bayes' theorem to the statement in Prob. 3.16, letting A signify the morning shift M and B signify defective D, and utilizing the results of Prob. 3.16, we get
$$P(M/D) = \frac{P(M) \cdot P(D/M)}{P(D)} = \frac{(0.625)(0.002)}{0.003125} = \frac{0.00125}{0.003125} = 0.4$$
That is, the probability that a defective item picked at random from the total 24-h output of 1600 items was produced by the morning shift is 40%. Similarly
$$P(E/D) = \frac{P(E) \cdot P(D/E)}{P(D)} = \frac{(0.375)(0.005)}{0.003125} = \frac{0.001875}{0.003125} = 0.6, \text{ or } 60\%$$
Bayes' theorem can be generalized, for example, to find the probability that a defective item B picked at random was produced by any of n plants ($A_i$, $i = 1, 2, \ldots, n$), as follows:
$$P(A_i/B) = \frac{P(A_i) \cdot P(B/A_i)}{\sum P(A_i) \cdot P(B/A_i)}$$
where $\sum$ refers to the summation over the n plants (the only ones producing the output). Bayes' theorem is applied in business decision theory, but is seldom used in the field of economics. (However, Bayesian econometrics is becoming increasingly important.)
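The generalized form of Bayes' theorem lends itself to a short calculation. The sketch below applies it to the two-shift data of Prob. 3.16; the function name `bayes_posterior` and the dictionary keys are illustrative assumptions, not notation from the text.

```python
from fractions import Fraction


def bayes_posterior(priors, likelihoods):
    """Return (posterior probabilities, total probability of the evidence).

    priors[k] is P(A_k); likelihoods[k] is P(B/A_k). The denominator is the
    sum of P(A_k) * P(B/A_k) over all sources k.
    """
    joint = {k: priors[k] * likelihoods[k] for k in priors}
    evidence = sum(joint.values())
    return {k: joint[k] / evidence for k in joint}, evidence


priors = {"morning": Fraction(1000, 1600), "evening": Fraction(600, 1600)}
defect_rates = {"morning": Fraction(200, 100_000), "evening": Fraction(500, 100_000)}

posterior, p_defective = bayes_posterior(priors, defect_rates)

print(float(p_defective))            # P(D)
print(float(posterior["morning"]))   # P(M/D)
print(float(posterior["evening"]))   # P(E/D)
```

Because the two shifts are the only sources of output, the posterior probabilities sum to 1.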
3.18 A club has 8 members. (a) How many different committees of 3 members each can be formed from the club? (Two committees are different even when only one member is different.) (b) How many committees of 3 members each can be formed from the club if each committee is to have a president, a treasurer, and a secretary?
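(a) We are interested here in finding the number of combinations of 8 people taken 3 at a time without concern for the order:
$$_8C_3 = \frac{8!}{3!(8-3)!} = \frac{8 \cdot 7 \cdot 6 \cdot 5!}{3 \cdot 2 \cdot 1 \cdot 5!} = 56$$
In general, the number of arrangements of n things taken X at a time without concern for the order is a combination given by
$$_nC_X = \binom{n}{X} = \frac{n!}{X!(n-X)!}$$
where $n!$ (read "n factorial") $= n \cdot (n-1) \cdot (n-2) \cdots 3 \cdot 2 \cdot 1$ and $0! = 1$ by definition.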
(b) Since each committee of 3 has to have a president, a treasurer, and a secretary, we are now interested in finding the number of permutations of 8 people taken 3 at a time, when the order is important:
$$_8P_3 = \frac{8!}{(8-3)!} = \frac{8!}{5!} = 8 \cdot 7 \cdot 6 = 336$$
In general, the number of arrangements, in a definite order, of n things taken X at a time is a permutation given by
$$_nP_X = \frac{n!}{(n-X)!}$$
Permutations and combinations (often referred to as counting techniques) are helpful in counting the number of equally likely ways event A can occur in relation to the total of all possible and equally likely outcomes. Combinations and permutations were not used in previous problems because those problems were simple enough without them.
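As a quick check of the two counting formulas, the following sketch uses `math.comb` and `math.perm` from the Python standard library (available from Python 3.8) and compares them with the factorial definitions. The club size of 8 is taken from the problem; the variable names are illustrative.

```python
from math import comb, factorial, perm

n, x = 8, 3  # 8 club members, committees of 3

# Order does not matter: combinations.
committees = comb(n, x)
# Order matters (president, treasurer, secretary): permutations.
committees_with_officers = perm(n, x)

# The same results from the factorial definitions in the text.
by_formula_c = factorial(n) // (factorial(x) * factorial(n - x))
by_formula_p = factorial(n) // factorial(n - x)

print(committees, committees_with_officers)
```

Each unordered committee corresponds to 3! = 6 ordered assignments of officers, which is why the permutation count is 6 times the combination count.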
DISCRETE PROBABILITY DISTRIBUTIONS: THE BINOMIAL DISTRIBUTION
3.19 Define what is meant by and give an example of (a) a random variable, (b) a discrete random variable, and (c) a discrete probability distribution. (d) What is the distinction between a probability distribution and a relative-frequency distribution?
(a) A random variable is a variable whose values are associated with some probability of being observed. For example, on 1 roll of a fair die, we have 6 mutually exclusive outcomes (1, 2, 3, 4, 5, or 6), each associated with a probability of occurrence of 1/6. Thus the outcome from the roll of a die is a random variable.
(b) A discrete random variable is one that can assume only finite or distinct values. For example, the outcomes from rolling a die constitute a discrete random variable because they are limited to the values 1, 2, 3, 4, 5, and 6. This is to be contrasted with continuous variables, which can assume an infinite number of values within any given interval [see Prob. 3.31(a)].
(c) A discrete probability distribution refers to the set of all possible values of a (discrete) random variable and their associated probabilities. The set of the 6 outcomes in rolling a die and their associated probabilities is an example of a discrete probability distribution. The sum of the probabilities associated with all the values that the discrete random variable can assume always equals 1.
(d) A probability distribution refers to the classical or a priori probabilities associated with all the values that a random variable can assume. Because those probabilities are assigned a priori and without any experimentation, a probability distribution is often referred to as a theoretical (relative) frequency distribution. This differs from an empirical (relative) frequency distribution, which refers to the ratio of the number of times each outcome actually occurs to the total number of actual trials or observations. For example, in actually rolling a die a number of times, we are not likely to get each outcome exactly 1/6 of the times. However, as the number of rolls increases, the empirical (relative) frequency distribution stabilizes at the (uniform) probability or theoretical relative-frequency distribution of 1/6.
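The difference between the theoretical and the empirical relative-frequency distribution can be illustrated with a simulation. This is only a sketch: the function name `empirical_frequencies`, the seed, and the numbers of rolls are arbitrary choices made for illustration.

```python
import random
from collections import Counter


def empirical_frequencies(n_rolls, seed=0):
    """Relative frequency of each face in n_rolls simulated rolls of a fair die."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = Counter(rng.randint(1, 6) for _ in range(n_rolls))
    return {face: counts[face] / n_rolls for face in range(1, 7)}


few_rolls = empirical_frequencies(60)
many_rolls = empirical_frequencies(60_000)

theoretical = 1 / 6
for face in range(1, 7):
    print(face, round(few_rolls[face], 3), round(many_rolls[face], 3), round(theoretical, 3))
```

With only 60 rolls the frequencies typically scatter noticeably around 1/6, while with 60,000 rolls they should settle close to it.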
3.20 Derive the formula for (a) the mean $\mu$ or expected value $E(X)$ and (b) the variance for a discrete probability distribution.
(a) The formula for the arithmetic mean for grouped population data [Eq. (2.2a)] is
$$\mu = \frac{\sum fX}{N}$$
where $\sum fX$ is the sum of the frequency of each class f times the class midpoint X and $N = \sum f$, which is the number of all observations or frequencies. In dealing with probability distributions, the mean $\mu$ is often referred to as the "expected value" $E(X)$. The formula for $\mu$ or $E(X)$ for a discrete probability distribution can be derived by starting with Eq. (2.2a) and letting $f = P(X)$, which is the probability of each of the possible outcomes X. Then $\sum fX = \sum XP(X)$, which is the sum of the value of each outcome times its probability of occurrence, and $N = \sum f = \sum P(X)$, which is the sum of the probabilities of each outcome, which is 1. Thus
$$E(X) = \mu = \frac{\sum XP(X)}{\sum P(X)} = \sum XP(X)$$
(b) The formula for the variance of grouped population data (from Chap. 2) is
$$\sigma^2 = \frac{\sum f(X - \mu)^2}{N}$$
Once again letting $f = P(X)$ = probability of each outcome and $N = \sum P(X) = 1$, we can get the formula for the variance of a discrete probability distribution:
$$\text{Var } X = \sigma_X^2 = \sum [X - E(X)]^2 P(X) = \sum X^2 P(X) - \left[\sum XP(X)\right]^2 = E(X^2) - \mu^2$$
3.21 Table 3.3 gives the number of job applications processed at a small employment agency during the past 100-day period. Determine the expected number of applications processed and the variance and standard deviation.
Table 3.3 Number of Job Applications Processed during the Past 100-Day Period
Number of Job Applications    Number of Days
 7                            10
 8                            10
10                            20
11                            30
12                            20
14                            10
To the extent that we believe that the experience of the past 100 days is typical, we can find the relative-frequency distribution and equate it to the probability distribution. This and the other calculations to find $E(X)$ and Var X are shown in Table 3.4.
Table 3.4 Calculations to Find the Expected Value and Variance
Number, X    Days    P(X)    XP(X)    X^2    X^2 P(X)
 7            10     0.1      0.7      49       4.9
 8            10     0.1      0.8      64       6.4
10            20     0.2      2.0     100      20.0
11            30     0.3      3.3     121      36.3
12            20     0.2      2.4     144      28.8
14            10     0.1      1.4     196      19.6
Total        100     1.0     10.6              116.0
$$E(X) = \mu = \sum XP(X) = 10.6 \text{ applications}$$
$$\text{Var } X = \sigma_X^2 = \sum X^2 P(X) - \left[\sum XP(X)\right]^2 = 116 - (10.6)^2 = 116 - 112.36 = 3.64 \text{ applications squared}$$
$$\text{SD } X = \sigma_X = \sqrt{\sigma_X^2} = \sqrt{3.64} \cong 1.91 \text{ applications}$$
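The expected value and variance formulas can be applied mechanically to a frequency table. The sketch below uses the day counts as read from the worked table for this problem (an assumption, since the printed table is partly illegible in this copy); the variable names are illustrative.

```python
from math import sqrt

# Number of applications processed -> number of days observed (100 days in all).
days = {7: 10, 8: 10, 10: 20, 11: 30, 12: 20, 14: 10}

total_days = sum(days.values())
# Relative frequencies are used as the probabilities P(X).
prob = {x: d / total_days for x, d in days.items()}

expected_value = sum(x * p for x, p in prob.items())        # sum of X * P(X)
expected_square = sum(x ** 2 * p for x, p in prob.items())  # sum of X^2 * P(X)
variance = expected_square - expected_value ** 2
std_dev = sqrt(variance)

print(round(expected_value, 2), round(variance, 2), round(std_dev, 2))
```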
3.22 (a) State the conditions required to apply the binomial distribution. (b) What is the probability of 3 heads in 5 flips of a balanced coin? (c) What is the probability of less than 3 heads in 5 flips of a balanced coin?
(a) The binomial distribution is used to find the probability of X number of occurrences or successes of an event, $P(X)$, in n trials of the same experiment when (1) there are only 2 mutually exclusive outcomes, (2) the n trials are independent, and (3) the probability of occurrence or success, p, remains constant in each trial.
(b) $$P(X) = {}_nC_X \, p^X (1-p)^{n-X} = \frac{n!}{X!(n-X)!} \, p^X (1-p)^{n-X}$$
See Eqs. (3.10) and (3.17). In some books, $1 - p$ (the probability of failure) is defined as q. Here $n = 5$, $X = 3$, $p = 1/2$, and $1 - p = 1/2$. Substituting these values into the preceding equation, we get
$$P(3) = \frac{5!}{3!(5-3)!} (1/2)^3 (1/2)^2 = \frac{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{3 \cdot 2 \cdot 1 \cdot 2 \cdot 1} (1/2)^5 = 10(1/32) = 0.3125$$
(c) $$P(X < 3) = P(0) + P(1) + P(2)$$
$$P(0) = \frac{5!}{0!5!} (1/2)^0 (1/2)^5 = \frac{1}{32} = 0.03125$$
$$P(1) = \frac{5!}{1!4!} (1/2)^1 (1/2)^4 = \frac{5}{32} = 0.15625$$
$$P(2) = \frac{5!}{2!3!} (1/2)^2 (1/2)^3 = \frac{10}{32} = 0.3125$$
Thus $P(X < 3) = P(0) + P(1) + P(2) = 0.03125 + 0.15625 + 0.3125 = 0.5$
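The binomial formula translates directly into a few lines of code. This is a minimal sketch; the function name `binomial_pmf` is an illustrative choice, and `math.comb` supplies the number of combinations.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) successes in n independent trials with success probability p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


# 3 heads in 5 flips of a balanced coin.
p_three_heads = binomial_pmf(3, 5, 0.5)

# Fewer than 3 heads: add the probabilities of 0, 1, and 2 heads.
p_less_than_three = sum(binomial_pmf(x, 5, 0.5) for x in range(3))

# The whole distribution must add up to 1.
total = sum(binomial_pmf(x, 5, 0.5) for x in range(6))

print(p_three_heads, p_less_than_three, total)
```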
3.23 (a) Suppose that the probability of parents having a child with blond hair is 1/4. If there are 6 children in the family, what is the probability that half of them will have blond hair? (b) If the probability of hitting a target on a single shot is 0.3, what is the probability that in 4 shots the target will be hit at least 3 times?
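(a) Here $n = 6$, $X = 3$, $p = 1/4$, and $1 - p = 3/4$. Substituting these values into the binomial formula, we get
$$P(3) = \frac{6!}{3!(6-3)!} (1/4)^3 (3/4)^3 = \frac{6 \cdot 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{3 \cdot 2 \cdot 1 \cdot 3 \cdot 2 \cdot 1} (1/64)(27/64) = 20(27/4096) = \frac{540}{4096} \cong 0.13$$
(b) Here $n = 4$, $X \geq 3$, $p = 0.3$, and $1 - p = 0.7$:
$$P(X \geq 3) = P(3) + P(4)$$
$$P(3) = \frac{4!}{3!(4-3)!} (0.3)^3 (0.7)^1 = (4)(0.027)(0.7) = 0.0756$$
$$P(4) = \frac{4!}{4!(4-4)!} (0.3)^4 (0.7)^0 = (0.3)^4 = 0.0081$$
Thus $P(X \geq 3) = P(3) + P(4) = 0.0756 + 0.0081 = 0.0837$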
3.24 (a) A quality inspector picks a sample of 10 tubes at random from a very large shipment of tubes known to contain 20% defective tubes. What is the probability that no more than 2 of the tubes picked are defective? (b) An inspection engineer picks a sample of 15 items at random from a manufacturing process known to produce 85% acceptable items. What is the probability that 10 of the items picked are acceptable?
(a) Here $n = 10$, $X \leq 2$, $p = 0.2$, and $1 - p = 0.8$:
$$P(X \leq 2) = P(0) + P(1) + P(2)$$
$$P(0) = \frac{10!}{0!(10-0)!} (0.2)^0 (0.8)^{10} = 0.1074 \quad \text{(looking up } n = 10, X = 0, \text{ and } p = 0.2 \text{ in App. 1)}$$
$$P(1) = 0.2684 \quad \text{(looking up } n = 10, X = 1, \text{ and } p = 0.2 \text{ in App. 1)}$$
$$P(2) = 0.3020 \quad \text{(looking up } n = 10, X = 2, \text{ and } p = 0.2 \text{ in App. 1)}$$
Thus $P(X \leq 2) = P(0) + P(1) + P(2) = 0.1074 + 0.2684 + 0.3020 = 0.6778$
(b) Here $n = 15$, $X = 10$, $p = 0.85$, and $1 - p = 0.15$. Since App. 1 only gives binomial probabilities for up to $p = 0.5$, we should transform the problem. The probability of $X = 10$ acceptable items with $p = 0.85$ equals the probability of $X = 5$ defective items with $p = 0.15$. Using $n = 15$, $X = 5$ defective, and p (of defective) $= 0.15$, we get 0.0449 (from App. 1).
3.25 (a) If 4 balanced coins are tossed simultaneously (or 1 balanced coin is tossed 4 times), compute the entire probability distribution and plot it. (b) Compute and plot the probability distribution for a sample of 5 items taken at random from a production process known to produce 30% defective items.
(a) Using $n = 4$; $X$ = 0H, 1H, 2H, 3H, or 4H; $p = 1/2$; and App. 1, we get P(0H) = 0.0625, P(1H) = 0.2500, P(2H) = 0.3750, P(3H) = 0.2500, and P(4H) = 0.0625. Thus
$$P(0H) + P(1H) + P(2H) + P(3H) + P(4H) = 0.0625 + 0.2500 + 0.3750 + 0.2500 + 0.0625 = 1$$
See Fig. 3-9. Note that $p = 0.5$ and the probability distribution in Fig. 3-9 is symmetrical.
Fig. 3-9 Probability Distribution of Heads in Tossing Four Balanced Coins (horizontal axis: number of heads; vertical axis: probability)
Fig. 3-10 Probability Distribution of Defective Items (horizontal axis: number of defective items; vertical axis: probability)
(b) Using $n = 5$; $X$ = 0, 1, 2, 3, 4, or 5 defective; and $p = 0.3$, we get P(0) = 0.1681, P(1) = 0.3602, P(2) = 0.3087, P(3) = 0.1323, P(4) = 0.0284, and P(5) = 0.0024. Therefore
$$P(0) + P(1) + P(2) + P(3) + P(4) + P(5) = 0.1681 + 0.3602 + 0.3087 + 0.1323 + 0.0284 + 0.0024 \cong 1$$
See Fig. 3-10. Note that $p < 0.5$ and the probability distribution in Fig. 3-10 is skewed to the right.
3.26 Calculate the expected value and standard deviation and determine the symmetry or asymmetry of the probability distribution of (a) Prob. 3.23(a), (b) Prob. 3.23(b), (c) Prob. 3.24(a), and (d) Prob. 3.24(b).
(a) $$E(X) = \mu = np = (6)(1/4) = 3/2 = 1.5 \text{ blond children}$$
$$\text{SD } X = \sqrt{np(1-p)} = \sqrt{(6)(1/4)(3/4)} = \sqrt{18/16} = \sqrt{1.125} \cong 1.06 \text{ blond children}$$
Because $p < 0.5$, the probability distribution of blond children is skewed to the right.
(b) $$E(X) = \mu = np = (4)(0.3) = 1.2 \text{ hits}$$
$$\text{SD } X = \sqrt{np(1-p)} = \sqrt{(4)(0.3)(0.7)} = \sqrt{0.84} \cong 0.92 \text{ hits}$$
Because $p < 0.5$, the probability distribution is skewed to the right.
(c) $$E(X) = \mu = np = (10)(0.2) = 2 \text{ defective tubes}$$
$$\text{SD } X = \sqrt{np(1-p)} = \sqrt{(10)(0.2)(0.8)} = \sqrt{1.6} \cong 1.26 \text{ defective tubes}$$
Because $p < 0.5$, the probability distribution is skewed to the right.
(d) $$E(X) = \mu = np = (15)(0.85) = 12.75 \text{ acceptable items}$$
$$\text{SD } X = \sqrt{np(1-p)} = \sqrt{(15)(0.85)(0.15)} = \sqrt{1.9125} \cong 1.38 \text{ acceptable items}$$
Because $p > 0.5$, the probability distribution is skewed to the left.
3.27 When sampling is done from a finite population without replacement, the binomial distribution cannot be used because the events are not independent. Then the hypergeometric distribution is used. This is given by
$$P_H = \frac{\binom{N - X_t}{n - X} \binom{X_t}{X}}{\binom{N}{n}} \qquad \text{hypergeometric distribution}$$
It measures the number of successes X in a sample of size n taken at random and without replacement from a population of size N, of which $X_t$ items have the characteristic denoting success. (a) Using the formula, determine the probability of picking 2 men in a sample of 6 selected at random without replacement from a group of 10 people, 5 of which are men. (b) What would the result have been if we had (incorrectly) used the binomial distribution?
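(a) Here $X = 2$ men, $n = 6$, $N = 10$, and $X_t = 5$:
$$P_H = \frac{\binom{5}{4} \binom{5}{2}}{\binom{10}{6}} = \frac{(5)(10)}{210} = \frac{50}{210} \cong 0.24$$
(b) $$P(2) = \frac{6!}{2!(6-2)!} (1/2)^2 (1/2)^4 = 15(1/64) = \frac{15}{64} \cong 0.23$$
It should be noted that when the sample is very small in relation to the population (say, less than 5% of the population), sampling without replacement has little effect on the probability of success in each trial and the binomial distribution (which is easier to use) is a good approximation for the hypergeometric distribution. This is the reason the binomial distribution was used in Prob. 3.24(a).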
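A short calculation shows how far the binomial result drifts from the hypergeometric one when the sample is a large share of the population. This is a sketch only; the function name `hypergeometric_pmf` and its argument names are illustrative, with `successes_in_pop` playing the role of the number of population items that denote success.

```python
from math import comb


def hypergeometric_pmf(x, n, population, successes_in_pop):
    """P(x successes) in a sample of n drawn without replacement."""
    failures_in_pop = population - successes_in_pop
    return (comb(successes_in_pop, x) * comb(failures_in_pop, n - x)
            / comb(population, n))


# 2 men in a sample of 6 from 10 people, 5 of whom are men.
p_hyper = hypergeometric_pmf(2, 6, 10, 5)

# Binomial answer, which (incorrectly) treats the picks as independent.
p_binom = comb(6, 2) * 0.5 ** 2 * 0.5 ** 4

print(round(p_hyper, 4), round(p_binom, 4))
```

Here the sample is 60% of the population, so the two answers differ slightly; for a sample that is a small share of a large population they would nearly coincide.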
THE POISSON DISTRIBUTION
3.28 (a) What is the difference between the binomial and the Poisson distributions? (b) Give some examples of when we can apply the Poisson distribution. (c) Give the formula for the Poisson distribution and the meaning of the various symbols. (d) Under what conditions can the Poisson distribution be used as an approximation to the binomial distribution? Why can this be useful?
(a) Whereas the binomial distribution can be used to find the probability of a designated number of successes in n trials, the Poisson distribution is used to find the probability of a designated number of successes per unit of time. The other conditions required to apply the binomial distribution also are required to apply the Poisson distribution; that is, (1) there must be only two mutually exclusive outcomes, (2) the events must be independent, and (3) the average number of successes per unit of time must remain constant.
(b) The Poisson distribution is often used in operations research in solving management problems. Some examples are the number of telephone calls to the police per hour, the number of customers arriving at a gasoline pump per hour, and the number of traffic accidents at an intersection per week.
(c) The probability of a designated number of successes per unit of time, $P(X)$, can be found by
$$P(X) = \frac{\lambda^X e^{-\lambda}}{X!}$$
where X = designated number of successes
$\lambda$ = the average number of successes over a specific time period
e = the base of the natural logarithm system, or 2.71828
Given the value of $\lambda$, we can find $e^{-\lambda}$ from App. 2, substitute it into the formula, and find $P(X)$. Note that $\lambda$ is the mean and variance of the Poisson distribution.
(d) We can use the Poisson distribution as an approximation to the binomial distribution when n, the number of trials, is large and p or $1 - p$ is small (rare events). A good rule of thumb is to use the Poisson distribution when $n \geq 30$ and $np$ or $n(1 - p) < 5$. When n is large, it can be very time-consuming to use the binomial distribution, and tables for binomial probabilities for very small values of p may not be available. If $n(1 - p) < 5$, success and failure should be redefined so that $np < 5$ to make the approximation accurate.
3.29 Past experience indicates that an average number of 6 customers per hour stop for gasoline at a gasoline pump. (a) What is the probability of 3 customers stopping in any hour? (b) What is the probability of 3 customers or less in any hour? (c) What is the expected value, or mean, and standard deviation for this distribution?
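(a) $$P(3) = \frac{\lambda^X e^{-\lambda}}{X!} = \frac{6^3 e^{-6}}{3!} = \frac{(216)(0.00248)}{6} = \frac{0.53568}{6} = 0.08928$$
(b) $$P(X \leq 3) = P(0) + P(1) + P(2) + P(3)$$
$$P(0) = \frac{6^0 e^{-6}}{0!} = e^{-6} = 0.00248$$
$$P(1) = \frac{6^1 e^{-6}}{1!} = (6)(0.00248) = 0.01488$$
$$P(2) = \frac{6^2 e^{-6}}{2!} = \frac{(36)(0.00248)}{2} = 0.04464$$
$$P(3) = 0.08928 \quad \text{(from part a)}$$
Thus $P(X \leq 3) = 0.00248 + 0.01488 + 0.04464 + 0.08928 = 0.15128$
(c) The expected value, or mean, of this Poisson distribution is $\lambda = 6$ customers, and the standard deviation is $\sqrt{\lambda} = \sqrt{6} \cong 2.45$ customers.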
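The Poisson formula can be evaluated without tables. The sketch below, with the illustrative function name `poisson_pmf`, reworks the gasoline-pump example; because it uses the unrounded value of e to the power of minus 6 instead of the tabulated 0.00248, the results differ from the hand calculation in the fourth decimal place.

```python
from math import exp, factorial, sqrt


def poisson_pmf(x, lam):
    """P(X = x) successes per unit of time when the average is lam."""
    return lam ** x * exp(-lam) / factorial(x)


lam = 6  # average number of customers per hour

p_exactly_3 = poisson_pmf(3, lam)
p_3_or_less = sum(poisson_pmf(x, lam) for x in range(4))

mean = lam            # the mean of a Poisson distribution is lambda
std_dev = sqrt(lam)   # the variance is also lambda

print(round(p_exactly_3, 5), round(p_3_or_less, 5), round(std_dev, 2))
```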
3.30 Past experience shows that 1% of the lightbulbs produced in a plant are defective. Find the probability that more than 1 bulb is defective in a random sample of 30 bulbs, using (a) the binomial distribution and (b) the Poisson distribution.
(a) Here $n = 30$, $p = 0.01$, and we are asked to find $P(X > 1)$. Using App. 1, we get
$$P(2) + P(3) + P(4) + \cdots = 0.0328 + 0.0031 + 0.0002 = 0.0361, \text{ or } 3.61\%$$
(b) Since $n = 30$ and $np = (30)(0.01) = 0.3$, we can use the Poisson approximation of the binomial distribution. Letting $\lambda = np = 0.3$, we have to find $P(X > 1) = 1 - P(X \leq 1)$, where X is the number of defective bulbs. Using Eq. (3.13), we get
$$P(1) = \frac{0.3^1 e^{-0.3}}{1!} = (0.3)(0.74082) = 0.222246$$
$$P(0) = \frac{0.3^0 e^{-0.3}}{0!} = e^{-0.3} = 0.74082$$
$$P(X \leq 1) = P(1) + P(0) = 0.222246 + 0.74082 = 0.963066$$
Thus $P(X > 1) = 1 - P(X \leq 1) = 1 - 0.963066 = 0.036934$, or 3.69%.
As n becomes larger, the approximation becomes even closer.
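The quality of the Poisson approximation can be seen by computing both tails side by side. This is a sketch for the lightbulb example only; the variable names are illustrative, and the exact binomial terms are computed directly rather than read from a table.

```python
from math import comb, exp, factorial

n, p = 30, 0.01   # sample of 30 bulbs, 1% defective
lam = n * p       # Poisson parameter used for the approximation

# Exact binomial: P(X > 1) = 1 - P(0) - P(1).
binomial_tail = 1 - sum(comb(n, x) * p ** x * (1 - p) ** (n - x) for x in (0, 1))

# Poisson approximation of the same tail.
poisson_tail = 1 - sum(lam ** x * exp(-lam) / factorial(x) for x in (0, 1))

print(round(binomial_tail, 4), round(poisson_tail, 4))
```

The two tail probabilities agree to within about one-tenth of a percentage point, in line with the rule of thumb that n is at least 30 and np is below 5.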
CONTINUOUS PROBABILITY DISTRIBUTIONS: THE NORMAL DISTRIBUTION
3.31 (a) Define what is meant by a continuous variable and give some examples. (b) Define what is meant by a continuous probability distribution. (c) Derive the formula for the expected value and variance of a continuous probability distribution.
(a) A continuous variable is one that can assume any value within any given interval. A continuous variable can be measured with any degree of accuracy simply by using smaller and smaller units of measurement. For example, if we say that a production process takes 10 h, this means anywhere between 9.5 and 10.4 h (10 h rounded to the nearest hour). If we used minutes as the unit of measurement, we could have said that the production process takes 10 h and 20 min. This means anywhere between 10 h and 19.5 min and 10 h and 20.4 min, and so on. Time is thus a continuous variable, and so are weight, distance, and temperature.
(b) A continuous probability distribution refers to the range of all possible values that a continuous random variable can assume, together with the associated probabilities. The probability distribution of a continuous random variable is often called a probability density function, or simply a probability function. It is given by a smooth curve such that the total area (probability) under the curve is 1. Since a continuous random variable can assume an infinite number of values within any given interval, the probability of a specific value is 0. However, we can measure the probability that a continuous random variable X assumes any value within a given interval (say, between $X_1$ and $X_2$) by the area under the curve within that interval:
$$P(X_1 < X < X_2) = \int_{X_1}^{X_2} f(X) \, dX$$
where $f(X)$ is the equation of the probability density function, and the integration sign, $\int$, is analogous to the summation sign $\sum$ for discrete variables. Probability tables for some of the most used continuous probability distributions are given in the appendixes, thus eliminating the need to perform the integration ourselves.
(c) The expected value, or mean, and variance for continuous probability distributions can be derived by substituting $\int$ for $\sum$ and $f(X)$ for $P(X)$ into the formulas for the expected value and variance for discrete probability distributions (see Prob. 3.20):
$$E(X) = \mu = \int X f(X) \, dX$$
$$\text{Var } X = \sigma^2 = \int [X - E(X)]^2 f(X) \, dX$$
3.32 (a) What is a normal distribution? (b) What is its usefulness? (c) What is the standard normal distribution? What is its usefulness?
(a) The normal distribution is a continuous probability function that is bell-shaped, symmetrical about the mean, and mesokurtic (defined in Sec. 2.4). As we move further away from the mean in both directions, the normal curve approaches the horizontal axis (but never quite touches it). The equation of the normal probability function is given by
$$f(X) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left[-\frac{1}{2} \left(\frac{X - \mu}{\sigma}\right)^2\right]$$
where $f(X)$ = height of the normal curve
exp = base of the natural logarithm system, 2.7183, raised to the power shown in brackets
$\pi$ = constant 3.1416
$\mu$ = mean of the distribution
$\sigma$ = standard deviation of the distribution
and the total area under the curve (from minus infinity to plus infinity) is
$$\int_{-\infty}^{\infty} f(X) \, dX = 1$$
(b) The normal distribution is the most commonly used of all probability distributions in statistical analysis. Many distributions actually found in nature and industry are normal. Some examples are the IQs (intelligence quotients), weights, and heights of a large number of people and the variations in dimensions of a large number of parts produced by a machine. The normal distribution often can be used to approximate other distributions, such as the binomial and the Poisson distributions (see Probs. 3.37 and 3.38). Distributions of sample means and proportions are often normal, regardless of the distribution of the parent population (see Sec. 4.2).
(c) The standard normal distribution is a normal distribution with $\mu = 0$ and $\sigma^2 = 1$. Any normal distribution (defined by a particular value for $\mu$ and $\sigma^2$) can be transformed into a standard normal distribution by letting $\mu = 0$ and expressing deviations from $\mu$ in standard deviation units. We often can find areas (probabilities) by converting X values into corresponding z values [that is, $z = (X - \mu)/\sigma$] and looking up these z values in App. 3.
3.33 Find the area under the standard normal curve (a) between $z = \pm 1$, $z = \pm 2$, and $z = \pm 3$; (b) from $z = 0$ to $z = 0.88$; (c) from $z = -1.60$ to $z = 2.55$; (d) to the left of $z = -1.60$; (e) to the right of $z = 2.55$; (f) to the left of $z = -1.60$ and to the right of $z = 2.55$.
(a) The area (probability) included under the standard normal curve between $z = 0$ and $z = 1$ is obtained by looking up the value of 1.0 in App. 3. This is accomplished by moving down the z column in the table to 1.0 and then across until we are below the column headed .00. The value that we get is 0.3413. This means that 34.13% of the total area (of 1 or 100%) under the curve lies between $z = 0$ and $z = 1.00$. Because of symmetry, the area between $z = 0$ and $z = -1$ is also 0.3413, or 34.13%. Therefore, the area between $z = -1$ and $z = 1$ is 68.26% (see Fig. 3-4). Similarly, the area between $z = 0$ and $z = 2$ is 0.4772, or 47.72% (by looking up $z = 2.00$ in the table), so that the area between $z = \pm 2$ is 95.44% (see Fig. 3-4). The area between $z = \pm 3$ is 99.74% (see Fig. 3-4). Note that the table only gives detailed values for z up to 2.99 because the area under the curve outside $z = \pm 3$ is negligible.
(b) The area between $z = 0$ and $z = 0.88$ is obtained by looking up 0.88 in the table. This is 0.3106.
(c) The area between $z = 0$ and $z = -1.60$ is obtained by looking up $z = 1.60$ in the table. This is 0.4452. The area between $z = 0$ and $z = 2.55$ is obtained by looking up $z = 2.55$ in the table. This is 0.4946. Thus the area under the standard normal curve from $z = -1.60$ to $z = 2.55$ equals 0.4452 plus 0.4946. This is 0.9398, or 93.98% (see Fig. 3-11). In all problems of this nature it is helpful to sketch a figure.
(d) We know that the total area under the normal curve is equal to 1. Because of symmetry, 0.5 of the area is on either side of $\mu = 0$. Since 0.4452 extends from $z = 0$ to $z = -1.60$, $0.5 - 0.4452 = 0.0548$, or 5.48%, is the area in the left tail, to the left of $z = -1.60$ (see Fig. 3-11).
(e) $0.5 - 0.4946 = 0.0054$, or 0.54%, is the area in the right tail, to the right of $z = 2.55$ (see Fig. 3-11).
(f) The area to the left of $z = -1.60$ and to the right of $z = 2.55$ is equal to $1 - 0.9398$ [see part (c)]. This is 0.0602, or 6.02% of the total.
Fig. 3-11
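The table look-ups for the standard normal curve can be cross-checked with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is a sketch only; the helper name `area_between` is illustrative, and the computed areas differ from the four-decimal table values by small rounding amounts.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1


def area_between(z1, z2):
    """Area under the standard normal curve from z1 to z2."""
    return Z.cdf(z2) - Z.cdf(z1)


within_1 = area_between(-1, 1)
within_2 = area_between(-2, 2)
within_3 = area_between(-3, 3)

zero_to_088 = area_between(0, 0.88)
middle = area_between(-1.60, 2.55)
left_tail = Z.cdf(-1.60)        # area to the left of z = -1.60
right_tail = 1 - Z.cdf(2.55)    # area to the right of z = 2.55

print(round(within_1, 4), round(within_2, 4), round(within_3, 4))
print(round(zero_to_088, 4), round(middle, 4), round(left_tail, 4), round(right_tail, 4))
```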
3.34 The lifetime of lightbulbs is known to be normally distributed with $\mu = 100$ h and $\sigma = 8$ h. What is the probability that a bulb picked at random will have a lifetime between 110 and 120 burning hours?
We are asked here to find $P(110 < X < 120)$, where X refers to time measured in hours of burning time. Given $\mu = 100$ h and $\sigma = 8$ h, and letting $X_1 = 110$ h and $X_2 = 120$ h, we get
$$z_1 = \frac{X_1 - \mu}{\sigma} = \frac{110 - 100}{8} = 1.25 \quad \text{and} \quad z_2 = \frac{X_2 - \mu}{\sigma} = \frac{120 - 100}{8} = 2.50$$
Thus we want the area (probability) between $z_1 = 1.25$ and $z_2 = 2.50$ (the shaded area in Fig. 3-12). Looking up $z_2 = 2.50$ in App. 3, we get 0.4938. This is the area from $z = 0$ to $z_2 = 2.50$. Looking up $z_1 = 1.25$, we get 0.3944. This is the area from $z = 0$ to $z_1 = 1.25$. Subtracting 0.3944 from 0.4938, we get 0.0994, or 9.94%, for the shaded area that gives $P(110 < X < 120)$.
Fig. 3-12
3.35 Assume that family incomes are normally distributed with $\mu = \$16{,}000$ and $\sigma = \$2000$. What is the probability that a family picked at random will have an income: (a) Between $15,000 and $18,000? (b) Below $15,000? (c) Above $18,000? (d) Above $20,000?
(a) We want $P(\$15{,}000 < X < \$18{,}000)$, where X is family income:
$$z_1 = \frac{X_1 - \mu}{\sigma} = \frac{\$15{,}000 - \$16{,}000}{\$2000} = -0.5 \quad \text{and} \quad z_2 = \frac{X_2 - \mu}{\sigma} = \frac{\$18{,}000 - \$16{,}000}{\$2000} = 1$$
Thus we want the area (probability) between $z_1 = -0.5$ and $z_2 = 1$ (the shaded area in Fig. 3-13). Looking up $z = 0.5$ in App. 3, we get 0.1915 for the area from $z = 0$ to $z = -0.5$. Looking up $z = 1$, we get 0.3413 for the area from $z = 0$ to $z = 1$. Thus $P(\$15{,}000 < X < \$18{,}000) = 0.1915 + 0.3413 = 0.5328$, or 53.28%.
Fig. 3-13
(b) $P(X < \$15{,}000) = 0.5 - 0.1915 = 0.3085$, or 30.85% (the unshaded area in the left tail of Fig. 3-13).
(c) $P(X > \$18{,}000) = 0.5 - 0.3413 = 0.1587$, or 15.87% (the unshaded area in the right tail of Fig. 3-13).
(d) $X = \$20{,}000$ corresponds to $z = (\$20{,}000 - \$16{,}000)/\$2000 = 2$. Therefore, $P(X > \$20{,}000) = 0.5 - 0.4772 = 0.0228$, or 2.28%.
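Converting X values to z values and looking them up can be bypassed by constructing the normal distribution directly. The sketch below uses `statistics.NormalDist` with the means and standard deviations of the income and lightbulb examples; the variable names are illustrative, and small differences from the table-based answers come from rounding in the table.

```python
from statistics import NormalDist

# Family incomes: mean $16,000, standard deviation $2,000.
income = NormalDist(mu=16_000, sigma=2_000)

p_between_15_and_18 = income.cdf(18_000) - income.cdf(15_000)
p_below_15 = income.cdf(15_000)
p_above_18 = 1 - income.cdf(18_000)
p_above_20 = 1 - income.cdf(20_000)

# Lightbulb lifetimes: mean 100 h, standard deviation 8 h.
lifetime = NormalDist(mu=100, sigma=8)
p_110_to_120 = lifetime.cdf(120) - lifetime.cdf(110)

# The z value behind any X can still be recovered explicitly.
z_for_18 = (18_000 - income.mean) / income.stdev

print(round(p_between_15_and_18, 4), round(p_below_15, 4),
      round(p_above_18, 4), round(p_above_20, 4), round(p_110_to_120, 4))
```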
3.36 The grades on the midterm examination in a large statistics section are normally distributed with a mean of 78 and a standard deviation of 8. The professor wants to give the grade of A to 10% of the students. What is the lowest grade point that can be designated an A on the midterm?
In this problem we are asked to find the point grade such that 10% of the students will have higher grades. This involves finding the grade point X such that 10% of the area under the normal curve will be to the right of X (the shaded area in Fig. 3-14). Since the total area under the curve to the right of 78 is 0.5, the unshaded area in Fig. 3-14 to the right of 78 must be 0.4. We must look into the body of App. 3 for the value closest to 0.4. This is 0.3997, which corresponds to the z value of 1.28. The X value (the grade point) that corresponds to the z value of 1.28 is obtained by substituting the known values into $z = (X - \mu)/\sigma$ and solving for X:
$$1.28 = \frac{X - 78}{8}$$
This gives $X - 78 = 10.24$. Therefore, $X = 78 + 10.24 = 88.24$, or 88 to the nearest whole number.
Fig. 3-14
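Finding the X value that cuts off a given tail area is the inverse of the usual table look-up. A minimal sketch, assuming the grade distribution stated in the problem (mean 78, standard deviation 8), uses `NormalDist.inv_cdf` from the standard library; the variable names are illustrative.

```python
from statistics import NormalDist

grades = NormalDist(mu=78, sigma=8)

# Lowest A: the grade with 90% of the area below it and 10% above it.
cutoff = grades.inv_cdf(0.90)

# The corresponding z value, for comparison with the table value of 1.28.
z_cutoff = (cutoff - grades.mean) / grades.stdev

print(round(cutoff, 2), round(z_cutoff, 2))
```

The result is slightly above the hand-calculated 88.24 because the table gives z to only two decimal places.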
3.37 Experience indicates that 30% of the people entering a store make a purchase. Using (a) the binomial distribution and (b) the normal approximation to the binomial, find the probability that out of 30 people entering the store, 10 or more will make a purchase.
ta)
(= 10) = PLO) + {TI + PI) +--+ + P(30) = 0.1416 + 0.1103 +0789 + CO + 00231
++ A10106 40,0042 + 0.0015 44.005 + 0.001
a?
16) je np = (309(0.3) <9 persons, and o = yfapit—p) = yGONOSHOT = v3 002.51 persons.
‘Since n= 30 and both ap and a(t = p) > S, we can approxmate the binomial probabelty with the
‘normal. However, the number of people ssa dscrote variable. In onder to use the normal distribution,
‘4p Must {reat the number of people as HAL Were a continNOUS NaN and Find FA. 93). Thus
2
From z= 8.20, we get 0733 (from App. 3). This means that 0.0793 of the area uoder the standard
normal curve bes from = Ota = 0.20. Therefore, P(X > 9.5) = 0.5 = 0.0793 = 0.4207 tthe normal
appiesimalions Ase becomes even large, the appresimation Lacowns eve chiser [LP we had wot
‘treated the number of people as a continuous variable, we would have found that PLN’ = 10) =O,
and the approximation wold not have been ae clace.]CHAP. 3) PROBABILITY AND PROBABILITY DISTRIBUTIONS a
338
339
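The comparison between the exact binomial sum and its normal approximation can be reproduced in a few lines. This is only a sketch with n = 30 and p = 0.3 taken from the store problem; it uses unrounded z values, so the approximations differ slightly from the table-based figures.

```python
import math
from statistics import NormalDist

n, p = 30, 0.3

# Exact binomial: P(X >= 10) = sum of C(n, k) p^k (1 - p)^(n - k) for k = 10..30
exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(10, n + 1))

# Normal approximation with the continuity correction: P(X >= 9.5)
mu = n * p
sigma = math.sqrt(n * p * (1 - p))
approx = 1.0 - NormalDist(mu, sigma).cdf(9.5)

# Same approximation without the continuity correction: P(X >= 10)
approx_no_cc = 1.0 - NormalDist(mu, sigma).cdf(10)

print(round(exact, 4), round(approx, 4), round(approx_no_cc, 4))
```

The corrected approximation lands much closer to the exact value than the uncorrected one, which is the point made in the bracketed remark.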
3.38 A production process produces 10 defective items per hour. Find the probability that 4 or less items are defective out of the output of a randomly chosen hour using (a) the Poisson distribution and (b) the normal approximation of the Poisson.
(a) Here λ = 10 and we are asked to find P(X ≤ 4), where X is the number of defective items from the output of a randomly chosen hour. The value of \( e^{-10} \) from App. 2 is 0.00005. Therefore
\[ P(0) = \frac{\lambda^0 e^{-10}}{0!} = 0.00005 \qquad P(1) = \frac{\lambda^1 e^{-10}}{1!} = \frac{10(0.00005)}{1} = 0.0005 \qquad P(2) = \frac{\lambda^2 e^{-10}}{2!} = \frac{100(0.00005)}{2} = 0.0025 \]
\[ P(3) = \frac{\lambda^3 e^{-10}}{3!} = \frac{1000(0.00005)}{6} = 0.0083335 \qquad P(4) = \frac{\lambda^4 e^{-10}}{4!} = \frac{10{,}000(0.00005)}{24} = 0.0208335 \]
\[ P(X \le 4) = P(0) + P(1) + P(2) + P(3) + P(4) = 0.00005 + 0.0005 + 0.0025 + 0.0083335 + 0.0208335 = 0.032217, \text{ or about } 3.22\% \]
(b) Treating the items as continuous [see Prob. 3.37(b)], we are asked to find P(X ≤ 4.5), where X is the number of defective items, μ = λ = 10, and \( \sigma = \sqrt{\lambda} = \sqrt{10} \cong 3.16 \). Thus
\[ z = \frac{X - \mu}{\sigma} = \frac{4.5 - 10}{3.16} = \frac{-5.5}{3.16} = -1.74 \]
For z = 1.74 in App. 3, we get 0.4591. This means that 0.5 − 0.4591 = 0.0409 of the area (probability) under the standard normal curve lies to the left of z = −1.74. Thus P(X ≤ 4.5) = 0.0409, or 4.09%. As λ becomes larger, we get a better approximation. (If we had not treated the number of defective items as a continuous variable, we would have found that P(X ≤ 4) = 0.0287.)
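As a cross-check, the cumulative Poisson probability and its normal approximation can be computed directly. This sketch assumes λ = 10 as in the problem; the function name `poisson_pmf` is illustrative. Note that it uses the unrounded value of e^(−10) (about 0.0000454), so the exact sum comes out near 0.029 and not the 0.032 obtained with the rounded table value 0.00005.

```python
import math
from statistics import NormalDist

lam = 10  # mean number of defective items per hour

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam."""
    return lam**k * math.exp(-lam) / math.factorial(k)

# Exact: P(X <= 4) = P(0) + P(1) + P(2) + P(3) + P(4)
exact = sum(poisson_pmf(k, lam) for k in range(5))

# Normal approximation with continuity correction: P(X <= 4.5)
approx = NormalDist(lam, math.sqrt(lam)).cdf(4.5)

print(round(exact, 4), round(approx, 4))
```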
3.39 If events or successes follow a Poisson distribution, we can determine the probability that the first event occurs within a designated period of time, P(T ≤ t), by the exponential probability distribution. Because we are dealing with time, the exponential is a continuous probability distribution. This is given by
\[ P(T \le t) = 1 - e^{-\lambda} \qquad (3.27) \]
where λ is the mean number of occurrences for the interval of interest and \( e^{-\lambda} \) can be obtained from App. 2. The expected value and variance are
\[ E(T) = \frac{1}{\lambda} \qquad (3.28) \]
\[ \operatorname{Var} T = \frac{1}{\lambda^2} \qquad (3.29) \]
(a) For the statement of Prob. 3.29, find the probability that starting at a random point in time, the first customer stops at the gasoline pump within a half hour. (b) What is the probability that no customer stops at the gasoline pump within a half hour? (c) What is the expected value and variance of the exponential distribution, where the continuous variable is time T?
(a) Since an average of 6 customers stop at the pump per hour, λ = average of 3 customers per half hour. The probability that the first customer will stop within the first half hour is
\[ 1 - e^{-\lambda} = 1 - e^{-3} = 1 - 0.04979 \text{ (from App. 2)} = 0.95021, \text{ or } 95.02\% \]
(b) The probability that no customer stops at the pump within a half hour is
\[ e^{-\lambda} = e^{-3} = 0.04979 \]
(c) \( E(T) = 1/\lambda = 1/6 \cong 0.17 \) h per car, and \( \operatorname{Var} T = 1/\lambda^2 = 1/36 \cong 0.03 \) h per car squared. The exponential distribution also can be used to calculate the time between two successive events.
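The three parts of this solution reduce to a few evaluations of the exponential function. A minimal sketch, assuming the rate of 6 customers per hour from the problem (the variable names are illustrative):

```python
import math

lam_per_hour = 6                  # average customers per hour
lam_half_hour = lam_per_hour / 2  # average customers per half hour

p_first_within = 1.0 - math.exp(-lam_half_hour)  # part (a): first arrival within 1/2 hour
p_none_within = math.exp(-lam_half_hour)         # part (b): no arrival within 1/2 hour
expected_wait = 1.0 / lam_per_hour               # part (c): E(T), in hours per car
variance_wait = 1.0 / lam_per_hour**2            # part (c): Var T

print(round(p_first_within, 5), round(p_none_within, 5),
      round(expected_wait, 2), round(variance_wait, 2))
```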
3.40 The mean level of schooling for a population is 8 years and the standard deviation is 1 year. What is the probability that a randomly selected individual from the population will have had between 6 and 10 years of schooling? Less than 6 years or more than 10 years?
Since we have not been told the form of the distribution, we can use Chebyshev's theorem, which applies to any distribution. With μ = 8 years and σ = 1 year, 6 years of schooling is 2 standard deviations below μ and 10 years of schooling is 2 standard deviations above μ. Using Chebyshev's theorem or inequality, we obtain
\[ P(|X - \mu| \le K\sigma) \ge 1 - \frac{1}{K^2} \qquad (3.30) \]
The probability that an individual picked at random from the population will be within 2 standard deviations from the mean is
\[ P(|X - \mu| \le 2\sigma) \ge 1 - \frac{1}{2^2} = \frac{3}{4}, \text{ or at least } 75\% \]
Therefore, the probability that the individual will have had either less than 6 or more than 10 years of schooling is at most 25%.
Supplementary Problems
PROBABILITY OF A SINGLE EVENT
3.41 What approach to probability is involved in the following statements? (a) The probability of a head in the toss of a balanced coin is 1/2. (b) The relative frequency of a head in 100 tosses of a coin is 0.53. (c) The probability of rain tomorrow is 29%.
Ans. (a) The classical or a priori approach (b) The relative frequency or empirical approach (c) The subjective or personalistic approach
3.42 What is the probability that in tossing a balanced coin we get (a) a tail, (b) a head, (c) not a tail, or (d) a tail or not a tail?
Ans. (a) P(T) = 1/2 (b) P(H) = 1/2 (c) P(T') = 1/2 (d) P(T) + P(T') = 1
3.43 What is the probability that in one roll of a fair die we get (a) a 1, (b) a 6, (c) not a 1, or (d) a 1 or not a 1?
Ans. (a) P(1) = 1/6 (b) P(6) = 1/6 (c) P(1') = 5/6 (d) P(1) + P(1') = 1
3.44 What is the probability that in a single pick from a standard deck of cards we pick (a) a club, (b) an ace, (c) the ace of clubs, (d) not a club, or (e) a club or not a club?
Ans. (a) P(C) = 13/52 = 1/4 (b) P(A) = 4/52 = 1/13 (c) P(A_C) = 1/52 (d) P(C') = 3/4 (e) P(C) + P(C') = 1
3.45 An urn contains 12 balls that are exactly alike except that 4 are blue, 3 are red, 3 are green, and 2 are white. What is the probability that by picking a single ball we pick (a) A blue ball? (b) A red ball? (c) A green ball? (d) A white ball? (e) A nonred ball? (f) A nonwhite ball? (g) A white or nonwhite ball? Also, (h) What are the odds of picking a green ball? (i) What are the odds of picking a nongreen ball?
Ans. (a) P(B) = 1/3 or 0.33 (b) P(R) = 1/4 or 0.25 (c) P(G) = 1/4 or 0.25 (d) P(W) = 1/6 or 0.167 (e) P(R') = 0.75 (f) P(W') = 0.833 (g) P(W) + P(W') = 1 (h) 3:9 (i) 9:3
3.46 Suppose that a card is picked from a well-shuffled standard deck. The card is then replaced, the deck reshuffled, and another card is picked. As this process is repeated 520 times, we obtain 136 spades. (a) What is the relative frequency or empirical probability of getting a spade? (b) What is the classical or a priori probability of getting a spade? (c) What would you expect the relative frequency or empirical probability of getting a spade to be if the process is repeated many more times?
Ans. (a) 136/520 ≅ 0.26 (b) P(S) = 1/4 (c) To approach 1/4 or 0.25
3.47 An insurance company found that from a sample of 10,000 men between the ages of 30 and 40, 87 become seriously ill during a 1-year period. (a) What is the relative frequency or empirical probability of men between 30 and 40 becoming seriously ill during a 1-year period? (b) Why is the insurance company interested in these results? (c) Suppose that the company subsequently sells health insurance to 1,387,684 men in the 30 to 40 age group. How many claims can the company expect during a 1-year period?
Ans. (a) The relative frequency or empirical probability is 87/10,000 = 0.0087. (b) The insurance company is interested in the relative frequency or empirical probability in order to determine its insurance premiums. (c) 12,073, to the nearest person
PROBABILITY OF MULTIPLE EVENTS
3.48 What types of events are the following? (a) Picking hearts or clubs on a single pick from a deck. (b) Picking diamonds or a queen on a single pick from a deck. (c) Two successive flips of a balanced coin. (d) Two successive tosses of a fair die. (e) Picking two cards from a deck with replacement. (f) Picking two cards from a deck without replacement. (g) Picking two balls from an urn without replacement.
Ans. (a) Mutually exclusive (b) Not mutually exclusive (c) Independent (d) Independent (e) Independent (f) Dependent (g) Dependent
3.49 What is the probability of getting (a) Four or more on a single toss of a fair die? (b) An ace or king on a single pick from a well-shuffled standard deck of cards? (c) A green or white ball from the urn of Prob. 3.45?
Ans. (a) 1/2 (b) 8/52 or 2/13 (c) 5/12
3.50 What is the probability of getting (a) A diamond or a queen on a single pick from a deck of cards? (b) A diamond, a queen, or a king? (c) An African-American or a woman president of the United States if the probability of an African-American president is 0.25, of a woman is 0.15, and of an African-American woman is 0.07?
Ans. (a) 16/52 or 4/13 (b) 19/52 (c) 0.33
3.51 What is the probability of (a) Two ones in 2 rolls of a die? (b) Three tails in 3 flips of a coin? (c) A total of 6 in rolling 2 dice simultaneously? (d) A total of less than 5 in rolling 2 dice simultaneously? (e) A total of 10 or more in rolling 2 dice simultaneously?
Ans. (a) 1/36 (b) 1/8 (c) 5/36 (d) 1/6 (e) 1/6
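Answers to two-dice questions such as those in Prob. 3.51 can be checked by listing all 36 equally likely outcomes and counting the favorable ones. This is only an illustrative sketch; the helper `prob` and the variable names are not from the text.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of two fair dice (or two rolls of one die)
rolls = list(product(range(1, 7), repeat=2))

def prob(event):
    """Classical probability: favorable outcomes divided by total outcomes."""
    favorable = sum(1 for r in rolls if event(r))
    return Fraction(favorable, len(rolls))

p_two_ones = prob(lambda r: r == (1, 1))
p_total_6 = prob(lambda r: sum(r) == 6)
p_less_than_5 = prob(lambda r: sum(r) < 5)
p_10_or_more = prob(lambda r: sum(r) >= 10)

print(p_two_ones, p_total_6, p_less_than_5, p_10_or_more)
```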
3.52 What is the probability of obtaining the following from a deck of cards: (a) A diamond on the second pick when the first card picked and not replaced was a diamond? (b) A diamond on the second pick when the first card picked and not replaced was not a diamond? (c) A king on the third pick when a queen and a jack were already obtained on the first and second pick and not replaced?
Ans. (a) 12/51 (b) 13/51 (c) 4/50
3.53 What is the probability of picking (a) The king of clubs and a diamond in that order in 2 picks from a deck without replacement? (b) A white ball and a green ball in that order in 2 picks without replacement from the urn of Prob. 3.45? (c) A green ball and a white ball in that order in 2 picks without replacement from the urn of Prob. 3.45? (d) A green and a white ball in any order in 2 picks without replacement from the same urn? (e) Three green balls in 3 picks without replacement from the urn?
Ans. (a) 13/2652 or 1/204 (b) 6/132 or 1/22 (c) 1/22 (d) 1/11 (e) 6/1320 or 1/220
3.54 Suppose that the probability of rain on a given day is 0.1 and the probability of my having a car accident is 0.005 on any day and 0.012 on a rainy day. (a) What rule should I use to calculate the probability that on a given day it will rain and I will have a car accident? (b) State the rule asked for in part a, letting A signify accident and R signify rain. (c) Calculate the probability asked for in part a.
Ans. (a) The rule of multiplication for dependent events (b) P(R and A) = P(R) · P(A/R) (c) 0.0012
3.55 (a) What rule or theorem should I use to calculate for the statement in Prob. 3.54 the probability that it was raining when I had a car accident? (b) State the rule or theorem applicable to part a. (c) Answer the question in part a.
Ans. (a) Bayes' theorem (b) P(R/A) = P(R) · P(A/R)/P(A) (c) 0.24
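The multiplication rule and Bayes' theorem used in Probs. 3.54 and 3.55 amount to two lines of arithmetic. The sketch below assumes the probabilities 0.1 (rain), 0.005 (accident on any day), and 0.012 (accident on a rainy day); these are read from a poorly printed problem statement and are chosen because they reproduce the stated answer of 0.24.

```python
p_rain = 0.1                    # P(R)
p_accident = 0.005              # P(A), on any day
p_accident_given_rain = 0.012   # P(A/R)

# Rule of multiplication for dependent events: P(R and A) = P(R) * P(A/R)
p_rain_and_accident = p_rain * p_accident_given_rain

# Bayes' theorem: P(R/A) = P(R) * P(A/R) / P(A)
p_rain_given_accident = p_rain_and_accident / p_accident

print(round(p_rain_and_accident, 4), round(p_rain_given_accident, 2))
```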
3.56 In how many different ways can 6 qualified individuals be assigned to (a) Three trainee positions available if the positions are identical? (b) Three trainee positions eventually if the positions differ? (c) Six trainee positions available if the positions differ?
Ans. (a) 20 (b) 120 (c) 720
DISCRETE PROBABILITY DISTRIBUTIONS: THE BINOMIAL DISTRIBUTION
3.57 The probability distribution of lunch customers at a restaurant is given in Table 3.5. Calculate (a) the expected number of lunch customers, (b) the variance, and (c) the standard deviation.
Table 3.5 Probability Distribution of Lunch Customers at a Restaurant
Number of Customers
100
10
us
120
12s
Ans. (a) 113.1 customers (b) 65.69 customers squared (c) 8.10 customers
3.58 What is the probability of (a) Getting exactly 4 heads and 2 tails in 6 tosses of a balanced coin? (b) Getting 3 sixes in 4 rolls of a fair die?
Ans. (a) 0.23 (b) 0.0154321
3.59 (a) If 20% of the students entering college drop out before receiving their diplomas, find the probability that out of 20 students picked at random from the very large number of students entering college, less than 3 drop out. (b) If 90% of the bulbs produced in a plant are acceptable, what is the probability that out of 10 bulbs picked at random from the very large output of the plant, 8 are acceptable?
Ans. (a) 0.206 (b) 0.1937
3.60 Calculate the expected value and standard deviation and determine the symmetry or asymmetry of the probability distribution of (a) Prob. 3.58(a), (b) Prob. 3.59(a), and (c) Prob. 3.59(b).
Ans. (a) E(X) = 3 heads, SD X = 1.22 heads, and the distribution is symmetrical. (b) E(X) = 4 students, SD X = 1.79 students, and the distribution is skewed to the right. (c) E(X) = 9 bulbs, SD X = 0.95 bulbs, and the distribution is skewed to the left.
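The binomial answers to Probs. 3.58 to 3.60 can be checked with the binomial formula and the expressions np and √(np(1 − p)). This is a hedged sketch; the function names `binom_pmf` and `binom_mean_sd` are illustrative, and the inputs are the values read from the problems above.

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) successes in n independent trials with success probability p."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def binom_mean_sd(n, p):
    """Expected value np and standard deviation sqrt(np(1 - p))."""
    return n * p, math.sqrt(n * p * (1 - p))

p_4_heads = binom_pmf(4, 6, 0.5)                              # Prob. 3.58(a)
p_3_sixes = binom_pmf(3, 4, 1 / 6)                            # Prob. 3.58(b)
p_lt3_dropouts = sum(binom_pmf(k, 20, 0.2) for k in range(3)) # Prob. 3.59(a)
p_8_good_bulbs = binom_pmf(8, 10, 0.9)                        # Prob. 3.59(b)

mean_students, sd_students = binom_mean_sd(20, 0.2)           # Prob. 3.60(b)

print(round(p_4_heads, 4), round(p_3_sixes, 7),
      round(p_lt3_dropouts, 4), round(p_8_good_bulbs, 4))
print(round(mean_students, 2), round(sd_students, 2))
```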
3.61
What is the probability of picking (a) Two women in a sample of $ drawn at random and without
replacemant From a group of B people, 4 of whom are womma? (8) Eight men in a eample of 1 drawn
at randoot and without replacement from poputation of L000, half of which are men,
Ams (a) Ahont C171 dosing the hypergecmesrie slisritusion) (hy Abst O39 (using the hincwnial
approximation to the hypergcometric probability)
THE POISSON DISTRIBUTION
3.62 Past experience shows that there are two traffic accidents at an intersection per week. What is the probability of: (a) Four accidents during a randomly selected week? (b) No accidents? (c) What is the expected value and standard deviation of the distribution?
Ans. (a) About 0.09 (b) About 0.14 (c) E(X) = λ = 2 accidents, and SD X = √λ = √2 ≅ 1.41 accidents
3.63 Past experience shows that 0.3% of the national labor force get seriously ill during a year. If 1000 persons are randomly selected from the national labor force: (a) What is the expected number of workers that will get sick during a year? (b) What is the probability that 5 workers will get sick during the year?
Ans. (a) 3 workers (b) About 0.1 (using the Poisson approximation to the binomial distribution)
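The Poisson answers above follow from the formula P(X) = λ^X e^(−λ)/X!. Here is a minimal sketch, assuming λ = 2 accidents per week for Prob. 3.62 and λ = np = 1000(0.003) = 3 for Prob. 3.63; the function name is illustrative.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson variable with mean lam."""
    return lam**k * math.exp(-lam) / math.factorial(k)

# Prob. 3.62: lambda = 2 accidents per week
p_four_accidents = poisson_pmf(4, 2)
p_no_accidents = poisson_pmf(0, 2)
sd_accidents = math.sqrt(2)

# Prob. 3.63: Poisson approximation to the binomial with lambda = n * p
lam_sick = 1000 * 0.003
p_five_sick = poisson_pmf(5, lam_sick)

print(round(p_four_accidents, 2), round(p_no_accidents, 2),
      round(sd_accidents, 2), round(p_five_sick, 2))
```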
CONTINUOUS PROBABILITY DISTRIBUTIONS: THE NORMAL DISTRIBUTION
3.64 Give the formula for (a) the probability that continuous variable X falls between X1 and X2, (b) the normal distribution, (c) the expected value and variance of the normal distribution, and (d) the standard normal distribution. (e) What is the mean and variance of the standard normal distribution?
Ans. (ab PLM) OSD, thea
_ 2 [Nan OA
rem TeV 30 =
4, witht the jite correction factor
instead of op =
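EXAMPLE 4. The probability that the mean \( \bar{X} \) of a random sample of 36 elements from the population in Example 3 falls between 18 and 24 units is computed as follows:
\[ z_1 = \frac{\bar{X}_1 - \mu_{\bar{X}}}{\sigma_{\bar{X}}} \quad\text{and}\quad z_2 = \frac{\bar{X}_2 - \mu_{\bar{X}}}{\sigma_{\bar{X}}} \]
with \( \bar{X}_1 = 18 \) and \( \bar{X}_2 = 24 \). Looking up \( z_1 \) and \( z_2 \) in App. 3, we get
\[ P(18 < \bar{X} < 24) = 0.3413 + 0.4772 = 0.8185, \text{ or } 81.85\% \]
See Fig. 4-2.
Fig. 4-2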
4.3 ESTIMATION USING THE NORMAL DISTRIBUTION
We can get a point or an interval estimate of a population parameter. A point estimate is a single number. Such a point estimate is unbiased if in repeated random samplings from the population, the expected or mean value of the corresponding statistic is equal to the population parameter. For example, \( \bar{X} \) is an unbiased (point) estimate of μ because \( \mu_{\bar{X}} = \mu \), where \( \mu_{\bar{X}} \) is the expected value of \( \bar{X} \). The sample standard deviation s [as defined in Eqs. (2.10b) and (2.11b)] is an unbiased estimate of σ [see Prob. 4.13(b)], and the sample proportion \( \bar{p} \) is an unbiased estimate of p (the proportion of the population with a given characteristic).
An interval estimate refers to a range of values together with the probability, or confidence level, that the interval includes the unknown population parameter. Given the population standard deviation or its estimate, and given that the population is normal or that a random sample is equal to or larger than 30, we can find the 95% confidence interval for the unknown population mean as
\[ P(\bar{X} - 1.96\sigma_{\bar{X}} < \mu < \bar{X} + 1.96\sigma_{\bar{X}}) = 0.95 \qquad (4.4) \]
This states that in repeated random sampling, we expect that 95 out of 100 intervals such as Eq. (4.4) include the unknown population mean and that our confidence interval (based on a single random sample) is one of these.
A confidence interval can be constructed similarly for the population proportion (see Example 7), where
\[ \mu_{\bar{p}} = p \quad \text{(the proportion of successes in the population)} \qquad (4.5) \]
\[ \sigma_{\bar{p}} = \sqrt{\frac{p(1-p)}{n}} \quad \text{(the standard error of the proportion)} \qquad (4.6) \]
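The two interval formulas of this section can be collected into small helper functions. The sketch below is illustrative only: the names `mean_ci` and `proportion_ci` are hypothetical, the finite correction factor is applied when n > 0.05N, and the inputs are sample figures of the kind used in this section (a sample of 144 with mean 100 and standard deviation 60 from a population of 1000, and a sample proportion of 0.4 with n = 100). Because the code does not round intermediate values, its limits differ slightly from hand calculations that round the correction factor or the standard error.

```python
import math

def mean_ci(xbar, sd, n, z=1.96, N=None):
    """Confidence interval for the mean: xbar +/- z * standard error.

    The finite correction factor is applied when N is given and n > 0.05 N.
    """
    se = sd / math.sqrt(n)
    if N is not None and n > 0.05 * N:
        se *= math.sqrt((N - n) / (N - 1))
    return xbar - z * se, xbar + z * se

def proportion_ci(pbar, n, z=1.96):
    """Confidence interval for a proportion, using pbar as an estimate of p."""
    se = math.sqrt(pbar * (1 - pbar) / n)
    return pbar - z * se, pbar + z * se

low, high = mean_ci(100, 60, 144, z=1.96, N=1000)   # 95% interval for the mean
p_low, p_high = proportion_ci(0.4, 100, z=2.58)     # 99% interval for a proportion

print(round(low, 2), round(high, 2))
print(round(p_low, 2), round(p_high, 2))
```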
EXAMPLE 5. A random sample of 144 with a mean of 100 and a standard deviation of 60 is taken from a population of 1000. The 95% confidence interval for the unknown population mean is
\[ \mu = \bar{X} \pm 1.96\sigma_{\bar{X}} \quad \text{since } n > 30 \]
\[ = \bar{X} \pm 1.96\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} \quad \text{since } n > 0.05N \]
\[ = 100 \pm 1.96\frac{60}{\sqrt{144}}\sqrt{\frac{1000-144}{1000-1}} \quad \text{using } s \text{ as an estimate of } \sigma \]
\[ = 100 \pm 1.96(5)(0.93) \]
\[ = 100 \pm 9.11 \]
Thus μ is between 90.89 and 109.11 with a 95% degree of confidence. Other frequently used confidence intervals are the 90 and 99% levels, corresponding to the z values of 1.64 and 2.58, respectively (see App. 3).
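EXAMPLE 6. A manager wishes to estimate the mean number of minutes that workers take to complete a particular manufacturing process within ±3 min and with 90% confidence. From past experience, the manager knows that the standard deviation σ is 15 min. The minimum required sample size (n > 30) is found as follows:
\[ z = \frac{\bar{X} - \mu}{\sigma_{\bar{X}}} \quad\text{or}\quad z\sigma_{\bar{X}} = \bar{X} - \mu \]
\[ 1.64\frac{\sigma}{\sqrt{n}} = \bar{X} - \mu \quad \text{assuming } n < 0.05N \]
\[ 1.64\frac{15}{\sqrt{n}} = 3 \quad \text{since the total confidence interval is } \bar{X} - \mu = \pm 3 \text{ min} \]
\[ \sqrt{n} = \frac{1.64(15)}{3} = 8.2 \]
\[ n = 67.24, \text{ or } 68 \text{ (rounded to the next higher integer)} \]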
EXAMPLE 7. A state education department finds that in a random sample of 100 persons who attended college, 40 received a college degree. To find the 99% confidence interval for the proportion of college graduates out of all the persons who attended college, we proceed as follows. First, we note that this problem involves the binomial distribution (see Sec. 3.3). Since n > 30 and both np > 5 and n(1 − p) > 5, the binomial distribution approaches the normal distribution (which is simpler to use; see Sec. 3.5). Then
\[ p = \bar{p} \pm z\sigma_{\bar{p}} = \bar{p} \pm z\sqrt{\frac{p(1-p)}{n}} \quad \text{(assuming } n < 0.05N\text{)} \]
\[ = 0.4 \pm 2.58\sqrt{\frac{(0.4)(0.6)}{100}} \quad \text{using } \bar{p} \text{ as an estimate of } p \]
\[ = 0.4 \pm 2.58(0.05) \]
\[ = 0.4 \pm 0.13 \]
Thus p is between 0.27 and 0.53 with a 99% level of confidence.
4.4 CONFIDENCE INTERVALS FOR THE MEAN USING THE t DISTRIBUTION
When the population is normally distributed but σ is not known and n < 30, we cannot use the normal distribution for determining confidence intervals for the unknown population mean, but we can use the t distribution. This is symmetrical about its zero mean but is flatter than the standard normal distribution, so that more of its area falls within the tails. While there is a single standard normal distribution, there is a different t distribution for each sample size, n. However, as n becomes larger, the t distribution approaches the standard normal distribution (see Fig. 4-3) until, when n ≥ 30, they are approximately equal.
Appendix 5 gives the values of t to the right of which we find 10, 5, 2.5, 1, and 0.5% of the total area under the curve for various degrees of freedom. Degrees of freedom (df) are defined in this case as n − 1 (or the sample size minus 1 for the single parameter μ we wish to estimate).
Fig. 4-3 (standard normal distribution and t distribution)
The 95% confidence interval for the unknown population mean when the t distribution is used is given by
\[ P\left(\bar{X} - t\frac{s}{\sqrt{n}} < \mu < \bar{X} + t\frac{s}{\sqrt{n}}\right) = 0.95 \]

The standard error of the mean \( \sigma_{\bar{X}} \) is given by the standard deviation of the parent population σ divided by the square root of the sample size, that is, \( \sigma_{\bar{X}} = \sigma/\sqrt{n} \). For finite populations of size N, a finite correction factor must be added, and \( \sigma_{\bar{X}} = (\sigma/\sqrt{n})\sqrt{(N-n)/(N-1)} \). However, if the sample size is very small in relation to the population size, \( \sqrt{(N-n)/(N-1)} \) is close to 1 and can be dropped from the formula. By convention, this is done whenever n < 0.05N. Independently of this finite correction factor, \( \sigma_{\bar{X}} \) is directly related to σ and inversely related to \( \sqrt{n} \) [see Eq. (4.2a,b)]. Thus increasing the sample size 4 times increases the accuracy of \( \bar{X} \) as an estimate of μ by cutting \( \sigma_{\bar{X}} \) in half. Note also that \( \sigma_{\bar{X}} \) is always smaller than σ. The reason for this is that the sample means, as averages of the sample observations, exhibit less variability or spread than the population values. Furthermore, the larger are the sample sizes, the more the values are averaged down with respect to the population values (see Fig. 4-1).
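The behavior of the standard error described above can be illustrated numerically. This is a sketch with hypothetical inputs (σ = 60, various n, and N = 12,000); the function name `standard_error` is illustrative, and it follows the convention of dropping the finite correction factor when n < 0.05N.

```python
import math

def standard_error(sigma, n, N=None):
    """Standard error of the mean, with the finite correction factor
    applied only when N is given and n is at least 5% of N."""
    se = sigma / math.sqrt(n)
    if N is not None and n >= 0.05 * N:
        se *= math.sqrt((N - n) / (N - 1))
    return se

se_100 = standard_error(60, 100)               # n = 100
se_400 = standard_error(60, 400)               # n = 400: four times larger sample
se_finite = standard_error(60, 900, N=12_000)  # n > 0.05 N, correction applied

print(se_100, se_400, round(se_finite, 2))
```

Quadrupling the sample size halves the standard error, and the corrected value is smaller than the uncorrected σ/√n = 2.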
4.6 For a population composed of the following 5 numbers: 1, 3, 5, 7, and 9, find (a) μ and σ, (b) the theoretical sampling distribution of the mean for the sample size of 2, and (c) \( \mu_{\bar{X}} \) and \( \sigma_{\bar{X}} \).
(a)
\[ \mu = \frac{\sum X}{N} = \frac{1 + 3 + 5 + 7 + 9}{5} = \frac{25}{5} = 5 \]
\[ \sigma = \sqrt{\frac{\sum (X - \mu)^2}{N}} = \sqrt{\frac{16 + 4 + 0 + 4 + 16}{5}} = \sqrt{\frac{40}{5}} = \sqrt{8} \cong 2.83 \]
(b) The theoretical sampling distribution of the sample mean for the sample size of 2 from the given finite population is given by the mean of all the possible different samples that can be obtained from this population. The number of combinations of 5 numbers taken 2 at a time without concern for the order is 5!/(2!3!) = 10 (see Prob. 3.18). These 10 samples are 1,3; 1,5; 1,7; 1,9; 3,5; 3,7; 3,9; 5,7; 5,9; and 7,9. The mean, \( \bar{X} \), of the preceding 10 samples is 2, 3, 4, 5, 4, 5, 6, 6, 7, 8. The theoretical sampling distribution of the mean is given in Table 4.1. Note that the variability or spread of the sample means (from 2 to 8) is less than the variability or spread of the values in the parent population (from 1 to 9), confirming the statement made at the end of Prob. 4.5(b).
(c) By applying theorem 1 (Sec. 4.2), \( \mu_{\bar{X}} = \mu = 5 \). Since the sample size of 2 is greater than 5% of the population size (that is, n > 0.05N),
\[ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{\sqrt{8}}{\sqrt{2}}\sqrt{\frac{5-2}{5-1}} = 2\sqrt{\frac{3}{4}} = \sqrt{3} \cong 1.73 \]
Table 4.1 Theoretical Sampling Distribution of the Mean
Value of the Mean    Possible Outcomes    Probability of Occurrence
2                    2                    0.1
3                    3                    0.1
4                    4, 4                 0.2
5                    5, 5                 0.2
6                    6, 6                 0.2
7                    7                    0.1
8                    8                    0.1
Total                                     1.0
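Because the population has only 5 elements, the whole theoretical sampling distribution can be enumerated by machine. The sketch below lists every sample of size 2 drawn without replacement from the population 1, 3, 5, 7, 9, tabulates the sample means, and compares the directly computed standard error with the one given by theorem 1 with the finite correction factor. Variable names are illustrative.

```python
import math
from collections import Counter
from itertools import combinations

population = [1, 3, 5, 7, 9]
samples = list(combinations(population, 2))   # the 10 possible samples of size 2
means = [sum(s) / len(s) for s in samples]    # one sample mean per sample

# Probability of each value of the sample mean (compare with Table 4.1)
distribution = {m: c / len(means) for m, c in sorted(Counter(means).items())}

# Mean and standard error computed directly from the 10 sample means
mu_xbar = sum(means) / len(means)
sigma_xbar = math.sqrt(sum((m - mu_xbar) ** 2 for m in means) / len(means))

# Theorem 1 with the finite correction factor
N, n = len(population), 2
mu = sum(population) / N
sigma = math.sqrt(sum((x - mu) ** 2 for x in population) / N)
sigma_xbar_formula = sigma / math.sqrt(n) * math.sqrt((N - n) / (N - 1))

print(distribution)
print(mu_xbar, round(sigma_xbar, 2), round(sigma_xbar_formula, 2))
```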
4.7 For the theoretical sampling distribution of the sample mean found in Prob. 4.6(b), (a) find the mean and the standard error of the mean using the formulas for the population mean and standard deviation given in Secs. 2.2 and 2.3. (b) What do the answers to part a show?
(a)
\[ \mu_{\bar{X}} = \frac{\sum \bar{X}}{N} = \frac{2 + 3 + 4 + 5 + 4 + 5 + 6 + 6 + 7 + 8}{10} = \frac{50}{10} = 5 \]
\[ \sigma_{\bar{X}} = \sqrt{\frac{\sum (\bar{X} - \mu_{\bar{X}})^2}{N}} = \sqrt{\frac{9 + 4 + 1 + 0 + 1 + 0 + 1 + 1 + 4 + 9}{10}} = \sqrt{\frac{30}{10}} = \sqrt{3} \cong 1.73 \]
(b) The answers to part a confirm the results obtained in Prob. 4.6(c) by the application of theorem 1 (Sec. 4.2), namely, that \( \mu_{\bar{X}} = \mu \) and \( \sigma_{\bar{X}} = (\sigma/\sqrt{n})\sqrt{(N-n)/(N-1)} \) for the finite population where n > 0.05N. Note that we took all the possible different samples of size 2 that we could take from our finite population of 5 numbers. Sampling from an infinite parent population (or from a finite parent population with replacement) would have required taking an infinite number of random samples of size n from the parent population (an obviously impossible task). By taking only a limited number of random samples, theorem 1 would hold only approximately (i.e., \( \mu_{\bar{X}} \approx \mu \) and \( \sigma_{\bar{X}} \approx \sigma/\sqrt{n} \)), with the approximation becoming better as the number of random samples taken is increased. In this case, the sampling distribution of the sample mean generated is referred to as the empirical sampling distribution of the mean.
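4.8 A population of 12,000 elements has a mean of 100 and a standard deviation of 60. Find the mean and standard error of the sampling distribution of the mean for sample sizes of (a) 100 and (b) 900.
(a)
\[ \mu_{\bar{X}} = \mu = 100 \qquad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{60}{\sqrt{100}} = 6 \]
(b) \( \mu_{\bar{X}} = \mu = 100 \). Since a sample of 900 is more than 5% of the population size, the finite correction factor must be used in the formula for the standard error:
\[ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{60}{\sqrt{900}}\sqrt{\frac{12{,}000 - 900}{12{,}000 - 1}} = \frac{60}{30}\sqrt{\frac{11{,}100}{11{,}999}} = 2\sqrt{0.925} = 2(0.962) = 1.92 \]
Without the correction factor, \( \sigma_{\bar{X}} \) would have been equal to 2 instead of 1.92.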
4.9 (a) What is the shape of the theoretical sampling distribution of the mean if the parent population is normal? If the parent population is not normal? (b) What is the importance of the answer to part a?
(a) If the parent population is normally distributed, the theoretical sampling distributions of the mean are also normally distributed, regardless of sample size. According to the central limit theorem, even if the parent population is not normal, the theoretical sampling distributions of the sample mean approach normality as sample size increases (i.e., as n → ∞). This approximation is sufficiently good for samples of at least 30.
(b) The central limit theorem is perhaps the most important theorem in all of statistical inference. It allows us to use sample statistics to make inferences about population parameters without knowing anything about the shape of the parent population. This will be done in this chapter and in Chap. 5.
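The central limit theorem can be illustrated with a small simulation. The sketch below is not from the text: it assumes a strongly skewed (exponential) parent population with mean 1 and standard deviation 1, draws 2000 random samples of size 30 with a fixed seed, and compares the spread of the sample means with σ/√n. The choice of parent distribution, seed, and number of samples is arbitrary.

```python
import random
import statistics

random.seed(0)  # fixed seed so that the sketch is reproducible

n, num_samples = 30, 2000

# Each entry is the mean of one random sample of size n from an
# exponential parent population (mean 1, standard deviation 1).
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)  # should be close to mu = 1
sd_of_means = statistics.pstdev(sample_means)   # should be close to sigma / sqrt(n)
theoretical_se = 1.0 / n ** 0.5

print(round(mean_of_means, 3), round(sd_of_means, 3), round(theoretical_se, 3))
```

Since only a limited number of samples is taken, this is an empirical sampling distribution, so the agreement with theorem 1 is approximate.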
4.10 (a) How can we calculate the probability that a random sample has a mean that falls within a given interval if the theoretical sampling distribution of the mean is normal or approximately normal? How is this different from the process of finding the probability that a normally distributed random variable assumes a value within a given interval? (b) Draw a normal curve in the \( \bar{X} \) and z scales and show the percentage of the area under the curve within 1, 2, and 3 standard deviation units of its mean.
(a) If the theoretical sampling distribution of the mean is normal or approximately normal, we can find the probability that a random sample has a mean \( \bar{X} \) that falls within a given interval by calculating the corresponding z values and looking them up in App. 3. This is analogous to what was done in Sec. 3.5, where the normal and the standard normal curves were introduced. The only difference is that now we are dealing with the distribution of the \( \bar{X} \)s rather than with the distribution of the Xs. In addition, before \( z = (X - \mu)/\sigma \), while now \( z = (\bar{X} - \mu_{\bar{X}})/\sigma_{\bar{X}} = (\bar{X} - \mu)/\sigma_{\bar{X}} \), since \( \mu_{\bar{X}} = \mu \).
(b) In Fig. 4-5, we have a normal curve in the \( \bar{X} \) scale and a standard normal curve in the z scale. The area under the curve within 1, 2, and 3 standard deviation units of the mean is 68.26, 95.44, and 99.74%, respectively.
Fig. 4-5
4.11 Find the probability that the mean of a random sample of 25 elements from a normally distributed population with a mean of 90 and a standard deviation of 60 is larger than 100.
Since the parent population is normally distributed, the theoretical sampling distribution of the mean is also normally distributed and \( \sigma_{\bar{X}} = \sigma/\sqrt{n} \) because n < 0.05N. For \( \bar{X} = 100 \),
\[ z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{100 - 90}{60/\sqrt{25}} = \frac{10}{12} = 0.83 \]
Looking up this value in App. 3, we get
\[ P(\bar{X} > 100) = 1 - (0.5000 + 0.2967) = 1 - 0.7967 = 0.2033, \text{ or } 20.33\% \]
See Fig. 4-6.
Fig. 4-6
4.12 A small local bank has 1450 individual savings accounts with an average balance of $3000 and a standard deviation of $1200. If the bank takes a random sample of 100 accounts, what is the probability that the average savings for these 100 accounts will be below $2800?
Since n = 100, the theoretical sampling distribution of the mean is approximately normal, but since n > 0.05N, the finite correction factor must be used to find \( \sigma_{\bar{X}} \). For \( \bar{X} = \$2800 \),
\[ z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\dfrac{\sigma}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}}} = \frac{2800 - 3000}{\dfrac{1200}{\sqrt{100}}\sqrt{\dfrac{1450 - 100}{1450 - 1}}} = \frac{-200}{120\sqrt{1350/1449}} = \frac{-200}{115.8} = -1.73 \]
Looking up z = 1.73 in App. 3, we get
\[ P(\bar{X} < \$2800) = 1 - (0.5000 + 0.4582) = 0.0418, \text{ or } 4.18\% \]
See Fig. 4-7.
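Probabilities for a sample mean, with or without the finite correction factor, can be checked with a short helper. This is a sketch, not the text's method: `prob_mean_below` is a hypothetical name, it uses `statistics.NormalDist` in place of App. 3, and it keeps full precision in z, so the results differ slightly from the two-decimal table lookups. The first call uses the bank figures above; the second assumes a sample of 25 from a normal population with mean 90 and standard deviation 60.

```python
import math
from statistics import NormalDist

def prob_mean_below(xbar, mu, sigma, n, N=None):
    """P(sample mean < xbar) when the sampling distribution is (approximately) normal.

    The finite correction factor is applied when N is given and n > 0.05 N.
    """
    se = sigma / math.sqrt(n)
    if N is not None and n > 0.05 * N:
        se *= math.sqrt((N - n) / (N - 1))
    return NormalDist(mu, se).cdf(xbar)

# Bank accounts: N = 1450, n = 100, mu = 3000, sigma = 1200
p_below_2800 = prob_mean_below(2800, 3000, 1200, 100, N=1450)

# Sample of 25 from a normal population, mean 90, standard deviation 60
p_above_100 = 1.0 - prob_mean_below(100, 90, 60, 25)

print(round(p_below_2800, 4), round(p_above_100, 4))
```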
ESTIMATION USING THE NORMAL DISTRIBUTION
4.13 What is meant by (a) A point estimate? (b) Unbiased estimator? (c) An interval estimate?
(a) Because of cost, time, and feasibility, population parameters are frequently estimated from sample statistics. A sample statistic used to estimate a population parameter is called an estimator, and a specific observed value is called an estimate. When the estimate of an unknown population parameter is given by a single number, it is called a point estimate. For example, the sample mean \( \bar{X} \) is an estimator of the population mean μ, and a single value of \( \bar{X} \) is a point estimate of μ. Similarly, the sample standard deviation s can be used as an estimator of the population standard deviation σ, and a single value of s is a point estimate of σ. The sample proportion \( \bar{p} \) can be used as an estimator for the population proportion p, and a single value of \( \bar{p} \) is a point estimate of p (i.e., the proportion of the population with a given characteristic).
(b) An estimator is unbiased if in repeated random sampling from the population the corresponding statistic from the theoretical sampling distribution is equal to the population parameter. Another way of stating this is that an estimator is unbiased if its expected value (see Probs. 3.20 and 3.31) is equal to the population parameter being estimated. For example, \( \bar{X} \), s [as defined in Eqs. (2.10b) and (2.11b)], and \( \bar{p} \) are unbiased estimators of μ, σ, and p, respectively. Other important criteria for a good estimator are discussed in Sec. 6.
(c) An interval estimate refers to the range of values used to estimate an unknown population parameter together with the probability, or confidence level, that the interval does include the unknown population parameter. This is known as a confidence interval and is usually centered around the unbiased point estimate. For example, the 95% confidence interval for μ is given by
\[ P(\bar{X} - 1.96\sigma_{\bar{X}} < \mu < \bar{X} + 1.96\sigma_{\bar{X}}) = 0.95 \]
The two numbers defining a confidence interval are called confidence limits. Because an interval estimate also expresses the degree of accuracy or confidence we have in the estimate, it is superior to a point estimate.
4.14 A random sample of 64 with a mean of 50 and a standard deviation of 20 is taken from a population of 800. (a) Find an interval estimate for the population mean such that we are 95% confident that the interval includes the population mean. (b) What does the result of part a tell us?
(a) Since n > 30, we can use the z value of 1.96 from the standard normal distribution to construct the 95% confidence interval for the unknown population mean, and we can use s as an estimate for the unknown σ:
\[ \mu = \bar{X} \pm z\sigma_{\bar{X}} = \bar{X} \pm z\frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} \quad \text{(since } n > 0.05N\text{)} \]
\[ = 50 \pm 1.96\frac{20}{\sqrt{64}}\sqrt{\frac{800 - 64}{800 - 1}} = 50 \pm 1.96(2.5)(0.96) = 50 \pm 4.70 \]
Thus μ is between the lower confidence limit of 45.3 and the upper confidence limit of 54.7 with a 95% level of confidence.
(b) The result of part a tells us that if we take from the population repeated random samples, each of size n = 64, and construct the 95% confidence interval for each of the sample means, 95% of these confidence intervals will contain the true unknown population mean. By assuming that our confidence interval (based on the single random sample that we have actually taken) is one of these 95% confidence intervals that include μ, we take the calculated risk of being wrong 5% of the time.
4.15 A random sample of 25 with a mean of 80 is taken from a population of 1000 that is normally distributed with a standard deviation of 30. Find (a) the 90%, (b) the 95%, and (c) the 99% confidence intervals for the unknown population mean. (d) What does the difference in the results to parts a, b, and c indicate?
(a)
\[ \mu = \bar{X} \pm 1.64\sigma_{\bar{X}} \quad \text{since the population is normally distributed and } \sigma \text{ is known} \]
\[ = \bar{X} \pm 1.64\frac{\sigma}{\sqrt{n}} \quad \text{since } n < 0.05N \]
\[ = 80 \pm 1.64\frac{30}{\sqrt{25}} = 80 \pm 1.64(6) = 80 \pm 9.84 \]
Thus μ is between 70.16 and 89.84 with a 90% level of confidence.
(b) μ = 80 ± 1.96(6) = 80 ± 11.76
Thus μ is between 68.24 and 91.76 with a 95% level of confidence.
(c) μ = 80 ± 2.58(6) = 80 ± 15.48
Thus μ is between 64.52 and 95.48 with a 99% level of confidence.
(d) The results of parts a, b, and c indicate that as we increase the degree of confidence required, the size of the confidence interval increases and the interval estimate becomes more vague (i.e., less precise). However, the degree of confidence associated with a very narrow confidence interval may be so low as to have little meaning. By convention, the most frequently used confidence interval is 95%, followed by 90 and 99%.
4.16 A random sample of 36 students is taken out of the 500 students from a high school taking the college entrance examination. The mean test score for the sample is 380, and the standard deviation for the entire population of 500 students is 40. Find the 95% confidence interval for the unknown population mean score.
Since n > 30, the theoretical sampling distribution of the mean is approximately normal. Also, since n > 0.05N,
\[ \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}} = \frac{40}{\sqrt{36}}\sqrt{\frac{500 - 36}{500 - 1}} = (6.67)(0.96) \cong 6.4 \]
Then
\[ \mu = \bar{X} \pm 1.96\sigma_{\bar{X}} = 380 \pm 1.96(6.4) = 380 \pm 12.54 \]
Thus μ is between 367.46 and 392.54 with a 95% level of confidence.
4.17 A researcher wishes to estimate the mean weekly wage of the several thousands of workers employed in a plant within plus or minus $20 and with a 99% degree of confidence. From past experience, the researcher knows that the weekly wages of these workers are normally distributed with a standard deviation of $40. What is the minimum sample size required?
\[ z\sigma_{\bar{X}} = \bar{X} - \mu \qquad 2.58\frac{40}{\sqrt{n}} = 20 \quad \text{(assuming } n < 0.05N\text{)} \qquad \sqrt{n} = \frac{2.58(40)}{20} = 5.16 \]
\[ n = 5.16^2 = 26.63, \text{ or } 27 \text{ (rounded to the nearest higher integer)} \]
4.18 (a) Solve Prob. 4.17 by first getting an expression for n and then substituting the values from the
problem into the expression obtained. (b) Why is the question of sample size important?
(c) What is the size of the total confidence interval in Prob. 4.17? (d) What would have to be
the sample size in Prob. 4.17 if we had not been told that the population was normally
distributed? (e) What would have happened if we had not been told the population standard
deviation?
(a) Starting with \( z\sigma/\sqrt{n} = \bar{X} - \mu \) (see Prob. 4.17), we get \( \sqrt{n} = z\sigma/(\bar{X} - \mu) \). Thus
\[ n = \left(\frac{z\sigma}{\bar{X} - \mu}\right)^2 \]
Substituting the values from Prob. 4.17, we get
\[ n = \left[\frac{2.58(40)}{20}\right]^2 \approx 26.63, \text{ or 27 (the same as in Prob. 4.17)} \]
(b) The question of sample size is important because if the sample is too small, we fail to achieve the
objectives of the analysis, and if the sample is too large, we waste resources because it is more expensive
to collect and evaluate a larger sample.
(c) The size of the total confidence interval in Prob. 4.17 is $40, or twice \( \bar{X} - \mu \). Since we are using \( \bar{X} \) as an
estimate of \( \mu \), \( \bar{X} - \mu \) is sometimes referred to as the error of the estimate. Because in Prob. 4.17 we
want the error of the estimate to be "within plus or minus $20," we get \( \bar{X} - \mu = \pm\$20 \), or a range of $40
for the total confidence interval.
(d) If we had not been told that the population was normally distributed, we would have had to increase
the sample to at least 30 in Prob. 4.17 in order to justify the use of the normal distribution.
(e) If we had not been told the value of \( \sigma \), we could not have solved the problem. (Since we were deciding
on what sample size to take in Prob. 4.17, we could not possibly have known s to use as an estimate
of \( \sigma \).) The only way we could estimate \( \sigma \) (and thus approximate n) would be if we knew the range of
wages from the highest to the lowest. Since \( \pm 3\sigma \) includes 99.7% of all the area under the normal curve,
we could have equated \( 6\sigma \) with the range of wages and thus estimated \( \sigma \) (and solved the problem).
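4.19 With reference to a binomial distribution, indicate the relationship between (a) \( \mu \) and \( \mu_{\bar{p}} \),
(b) \( p \) and \( \bar{p} \), and (c) \( \sigma \) and \( \sigma_{\bar{p}} \).
(a) \( \mu = np \) = mean number of successes in n trials, where p is the probability of success in any of the trials
(see Sec. 3.3). \( \mu_{\bar{p}} = \mu/n = p \) = the proportion of successes of the sampling distribution of the proportion.
(b) \( p \) = the proportion of successes in the population, and \( \bar{p} \) = the proportion of successes in the sample (and
an unbiased estimator of p).
(c) \( \sigma = \sqrt{np(1-p)} \) = standard deviation of the number of successes in the population, and
\[ \sigma_{\bar{p}} = \sqrt{\frac{p(1-p)}{n}} \]
and, when \( n > 0.05N \),
\[ \sigma_{\bar{p}} = \sqrt{\frac{p(1-p)}{n}}\sqrt{\frac{N-n}{N-1}} \]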
4.20 For a random sample of 100 workers in a plant employing 1200, 70 prefer providing for their own
retirement benefits over belonging to a company-sponsored plan. Find the 95% confidence
interval for the proportion of all the workers in the plant who prefer their own retirement plans.
\[ \bar{p} = \frac{70}{100} = 0.7 \]
\[ p = \bar{p} \pm z\sigma_{\bar{p}} \qquad \text{since } n > 30 \text{ and } np > 5 \text{ and } n(1-p) > 5 \]
\[ = \bar{p} \pm z\sqrt{\frac{p(1-p)}{n}}\sqrt{\frac{N-n}{N-1}} \qquad \text{since } n > 0.05N \]
\[ = 0.7 \pm 1.96\sqrt{\frac{(0.7)(0.3)}{100}}\sqrt{\frac{1200-100}{1200-1}} \qquad \text{using } \bar{p} \text{ as an estimate for } p \]
\[ \approx 0.7 \pm 1.96(0.05)(0.96) \]
\[ \approx 0.7 \pm 0.09 \]
Thus p (the proportion of all the workers in the plant who prefer their own retirement plans) is between 0.61
and 0.79 with a 95% degree of confidence.
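As a rough check on Prob. 4.20, the interval for a proportion with the finite population correction can be sketched as follows. The function name `proportion_ci` is hypothetical, and the inputs assume \( \bar{p} = 70/100 = 0.7 \), the value used in the solution.

```python
import math


def proportion_ci(successes, n, N=None, z=1.96):
    """Confidence interval for a proportion, using the sample proportion as an estimate of p.

    Applies the finite population correction when n > 0.05N.
    (Hypothetical helper, not part of the original text.)
    """
    p_bar = successes / n
    std_error = math.sqrt(p_bar * (1 - p_bar) / n)
    if N is not None and n > 0.05 * N:
        std_error *= math.sqrt((N - n) / (N - 1))
    return p_bar - z * std_error, p_bar + z * std_error


low, high = proportion_ci(70, 100, N=1200)
print(round(low, 2), round(high, 2))
```

Rounded to two decimals, this agrees with the interval of 0.61 to 0.79 given in the solution.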
4.21 A polling agency wants to estimate with a 90% level of confidence the proportion of voters who
would vote for a particular candidate within ±0.06 of the true (population) proportion of voters.
What is the minimum sample size required if other polls indicate that the proportion voting for
this candidate is 0.30?
\[ z\sqrt{\frac{p(1-p)}{n}} = \bar{p} - p \]
\[ 1.64\sqrt{\frac{(0.3)(0.7)}{n}} = 0.06 \]
\[ \frac{(1.64)^2(0.3)(0.7)}{n} = 0.0036 \qquad \text{by squaring both sides} \]
\[ n = \frac{(2.6896)(0.3)(0.7)}{0.0036} \approx 156.89, \text{ or } 157 \]
4.22 (a) Solve Prob. 4.21 by first getting an expression for n and then substituting the values from the
problem into the expression obtained. (b) How could we still have solved Prob. 4.21 if we had
not been told that the proportion voting for the candidate was 0.30?
(a) Starting with \( z\sqrt{p(1-p)/n} = \bar{p} - p \) (see Prob. 4.21), we get
\[ \frac{z^2 p(1-p)}{n} = (\bar{p} - p)^2 \quad \text{and} \quad n = \frac{z^2 p(1-p)}{(\bar{p} - p)^2} \]
Substituting the values from Prob. 4.21, we get
\[ n = \frac{(1.64)^2(0.3)(0.7)}{(0.06)^2} = \frac{(2.6896)(0.21)}{0.0036} \approx 156.89, \text{ or } 157 \]
(the same as in Prob. 4.21).
(b) If we had not been told that the proportion voting for the candidate was 0.30, we could estimate the
largest value of n to achieve the precision required, no matter what the actual value of p is. This is done
by letting p = 0.5 (so that 1 - p = 0.5 also). Since p(1 - p) appears in the numerator of the formula for
n (see part a) and this product is greatest when p and 1 - p both equal 0.5, the value of n is greatest.
Thus
\[ n = \frac{z^2 p(1-p)}{(\bar{p} - p)^2} = \frac{(1.64)^2(0.5)(0.5)}{(0.06)^2} = \frac{(2.6896)(0.25)}{0.0036} \approx 186.78, \text{ or } 187 \]
(instead of n = 157 when we were told that p = 0.30). In this and similar cases, trying to get an actual
estimate of p does not greatly reduce the size of the required sample. When p is taken to be 0.5, the
formula for n can be simplified to
\[ n = \left[\frac{z}{2(\bar{p} - p)}\right]^2 \]
Using this, we get
\[ n = \left[\frac{1.64}{2(0.06)}\right]^2 \approx 186.78, \text{ or } 187 \]
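The sample-size formula for a proportion, including the conservative choice p = 0.5, can be sketched in Python as below. The helper name `sample_size_for_proportion` is an assumption for illustration; the inputs are those of Probs. 4.21 and 4.22.

```python
import math


def sample_size_for_proportion(z, error, p=0.5):
    """Smallest n giving the required precision: n = z^2 p(1 - p) / error^2, rounded up.

    With the default p = 0.5 the result is the largest n that could be needed.
    (Hypothetical helper, not part of the original text.)
    """
    return math.ceil(z ** 2 * p * (1 - p) / error ** 2)


n_known_p = sample_size_for_proportion(1.64, 0.06, p=0.3)
n_worst_case = sample_size_for_proportion(1.64, 0.06)
print(n_known_p, n_worst_case)
```

Leaving p at its default is the safe option when no prior estimate of the proportion is available.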
CONFIDENCE INTERVALS FOR THE MEAN USING THE t DISTRIBUTION
4.23 (a) Under what conditions can we not use the normal distribution but can use the t distribution
to find confidence intervals for the unknown population mean? (b) What is the relationship
between the t distribution and the standard normal distribution? (c) What is the relationship
between the z and t statistics for the theoretical sampling distribution of the mean? (d) What is
meant by degrees of freedom?
(a) When the population is normally distributed but the population standard deviation \( \sigma \) is not known and
the sample size n is smaller than 30, we cannot use the normal distribution for determining confidence
intervals for the unknown population mean, but we can use the Student's t (or simply, the t) distribution.
(b) Like the standard normal distribution, the t distribution is bell-shaped and symmetrical about its zero
mean, but it is platykurtic (see Sec. 2.4) or flatter than the standard normal distribution so that more of
its area falls within the tails. While there is only one standard normal distribution, there is a different t
distribution for each sample size n. However, as n becomes larger, the t distribution approaches the
standard normal distribution until, when \( n \ge 30 \), they are approximately equal.
(c) \( z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}} \) and is found in App. 3.
\( t = \dfrac{\bar{X} - \mu}{s/\sqrt{n}} \) and is found in App. 5 for the degrees of freedom involved.
(d) Degrees of freedom (df) refer to the number of values we can choose freely. For example, if we deal
with a sample of 2 and we know that the sample mean for these two values is 10, we can freely assign a
value to only one of these two numbers. If one number is 8, the other number must be 12 (to get the
mean of 10). Then we say that we have n - 1 = 2 - 1 = 1 df. Similarly, if n = 10, this means that we
can freely assign a value to only 9 of the 10 values if we want to estimate the population mean, and so
we have n - 1 = 10 - 1 = 9 df.
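The idea of degrees of freedom in part d can be made concrete with a tiny Python sketch: once the sample mean is fixed, only n - 1 values are free and the last one is forced. The function name `last_value` is hypothetical.

```python
def last_value(sample_mean, free_values):
    """Return the one remaining value implied by a known mean and n - 1 freely chosen values.

    (Hypothetical helper, not part of the original text.)
    """
    n = len(free_values) + 1
    return sample_mean * n - sum(free_values)


# Sample of 2 with mean 10: choosing 8 forces the other value to be 12 (1 df)
print(last_value(10, [8]))
```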
4.24 (a) How can you find the t value for 10% of the area in each tail for 9 df? (b) In what way are t
values interpreted differently from z values? (c) Find the t value for 5, 2.5, and 0.5% of the area
within each tail for 9 df. (d) Find the t value for 5, 2.5, and 0.5% of the area within each tail for
a sample size, n, that is very large or infinite. How do these t values compare with their
corresponding z values?
(a) The t value for 10% of the area within each tail is obtained by moving down the column headed 0.10 in
App. 5 to 9 df. This gives the t value of 1.383. By symmetry, 10% of the area under the t distribution
with 9 df also lies within the left tail, to the left of t = -1.383.
(b) The t values given in App. 5 refer to the areas (probabilities) within the tail(s) of the t distribution
indicated by the degrees of freedom. However, z values given in App. 3 refer to the areas (probabilities)
under the standard normal curve from the mean to the specified z values (compare Example 4 with
Example 8).
(c) Moving down the columns headed 0.05, 0.025, and 0.005 in App. 5 to 9 df, we get t values of 1.833,
2.262, and 3.250, respectively. Because of symmetry, 5, 2.5, and 0.5% of the area within the left tail of
the t distribution for 9 df lie to the left of t = -1.833, t = -2.262, and t = -3.250, respectively.
(d) For sample sizes (and df) that are very large or infinite, \( t_{0.05} = 1.645 \), \( t_{0.025} = 1.960 \), and \( t_{0.005} = 2.576 \)
(from the last row of App. 5). These coincide with the corresponding z values in App. 3. Specifically,
\( t_{0.025} = 1.960 \) means that 2.5% of the area under the t distribution with \( \infty \) df lies within the right tail, to
the right of t = 1.96. Similarly, z = 1.96 gives (from App. 3) 0.4750 of the area under the standard
normal curve from \( \mu = 0 \) to z = 1.96. Thus, for df = n - 1 = \( \infty \), the t distribution is identical to the
standard normal curve.
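The tail areas quoted from App. 5 can be checked numerically. The sketch below integrates the t density with Simpson's rule, using only the Python standard library. The function names and the integration settings (upper limit 200, 20,000 steps) are assumptions chosen for illustration; they are adequate for moderate df such as the 9 df used here, not for very small df, where the tails are much heavier.

```python
import math
from statistics import NormalDist


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def t_right_tail(t, df, upper=200.0, steps=20000):
    """Approximate area to the right of t by Simpson's rule (steps must be even)."""
    h = (upper - t) / steps
    total = t_pdf(t, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(t + i * h, df)
    return total * h / 3


# Table values for 9 df: the right-tail areas should be close to 0.10, 0.05, 0.025, 0.005
for t_value in (1.383, 1.833, 2.262, 3.250):
    print(t_value, round(t_right_tail(t_value, 9), 4))

# The t distribution has more area in its tails than the standard normal
normal_tail = 1 - NormalDist().cdf(1.96)
print(t_right_tail(1.96, 9) > normal_tail)
```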
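4.25 A random sample of 25 with a mean of 80 and a standard deviation of 30 is taken from a
population of 1000 that is normally distributed. Find (a) the 90%, (b) the 95%, and (c) the
99% confidence intervals for the unknown population mean. (d) How do these results compare
with those in Prob. 4.15?
(a) \( t_{0.05} = 1.711 \) for 24 df
\[ \mu = \bar{X} \pm t\frac{s}{\sqrt{n}} = 80 \pm 1.711\frac{30}{\sqrt{25}} = 80 \pm 10.266 \]
Thus \( \mu \) is between 69.734 and 90.266 with a 90% level of confidence.
(b) \( t_{0.025} = 2.064 \) for 24 df
\[ \mu = \bar{X} \pm t\frac{s}{\sqrt{n}} = 80 \pm 2.064\frac{30}{\sqrt{25}} = 80 \pm 12.384 \]
Thus \( \mu \) is between 67.616 and 92.384 with a 95% level of confidence.
(c) \( t_{0.005} = 2.797 \) for 24 df
\[ \mu = \bar{X} \pm t\frac{s}{\sqrt{n}} = 80 \pm 2.797\frac{30}{\sqrt{25}} = 80 \pm 16.782 \]
Thus \( \mu \) is between 63.218 and 96.782 with a 99% degree of confidence.
(d) The 90, 95, and 99% confidence intervals, as anticipated, are larger in this problem, where the t
distribution was used, than in Prob. 4.15, where the standard normal distribution was used. However,
the differences are not great because when n = 25, the t distribution and the standard normal distribution
are fairly similar. Note that in this problem we had to use the t distribution because s was given
(and not \( \sigma \), as in Prob. 4.15).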
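The three intervals of Prob. 4.25 follow one pattern, which the short Python sketch below reproduces. The t values are copied from the solution (App. 5, 24 df) and not computed; the names `T_24_DF` and `t_interval` are hypothetical.

```python
import math

# Critical t values for 24 df, as quoted from App. 5 in the solution
T_24_DF = {0.90: 1.711, 0.95: 2.064, 0.99: 2.797}


def t_interval(x_bar, s, n, t_value):
    """Confidence interval for the mean using the sample standard deviation s."""
    margin = t_value * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


intervals = {conf: t_interval(80, 30, 25, t) for conf, t in T_24_DF.items()}
for conf, (low, high) in intervals.items():
    print(conf, round(low, 3), round(high, 3))
```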
4.26 A random sample of n = 9 lightbulbs with a mean operating life of 300 h and a standard deviation
of 45 h is picked from a large shipment of lightbulbs known to have a normally distributed
operating life. (a) Find the 90% confidence interval for the unknown mean operating life of the
entire shipment. (b) Sketch a figure for the results of part a.
(a) \( t_{0.05} = 1.860 \) for 8 df
\[ \mu = \bar{X} \pm t\frac{s}{\sqrt{n}} = 300 \pm 1.860\frac{45}{\sqrt{9}} = 300 \pm 27.9 \]
Thus \( \mu \) is approximately between 272 and 328 h with a 90% level of confidence.
(b) See Fig. 4-8.
4.27 A random sample of n = 25 with \( \bar{X} = 80 \) is taken from a population of 1000 with \( \sigma = 30 \).
Suppose that we know that the population from which the sample is taken is not normally
distributed. (a) Find the 95% confidence interval for the unknown population mean.
(b) How does this result compare with the results of Probs. 4.15(b) and 4.25(b)?
(a) Since we know that the population from which the sample is taken is not normally distributed and
n < 30, we can use neither the normal nor the t distribution. We can apply Chebyshev's theorem,
which states that regardless of the shape of the distribution, the proportion of observations (or area)
falling within K standard deviations of the mean is at least \( 1 - (1/K^2) \), for \( K \ge 1 \) (see Prob. 3.40).
Setting \( 1 - (1/K^2) = 0.95 \) and solving for K, we get
\[ \frac{1}{K^2} = 0.05 \qquad K^2 = 20 \qquad K \approx 4.47 \]
\[ \text{Then} \quad \mu = \bar{X} \pm K\frac{\sigma}{\sqrt{n}} = 80 \pm 4.47\frac{30}{\sqrt{25}} = 80 \pm 26.82 \]
Thus \( \mu \) is approximately between 53 and 107 with a 95% level of confidence.
(b) The 95% confidence interval using Chebyshev's theorem is much wider than that found when we could
use the normal distribution [Prob. 4.15(b)] or the t distribution [Prob. 4.25(b)]. For this reason,
Chebyshev's theorem is seldom used to find confidence intervals for the unknown population mean.
However, it represents the only possibility short of increasing the sample size to at least 30 (so that the
normal distribution can be used).
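A distribution-free interval based on Chebyshev's theorem can be sketched as follows. The function name `chebyshev_interval` is an assumption; K is obtained by solving \( 1 - 1/K^2 = \text{confidence} \), and the inputs are those used in the solution (n = 25, \( \sigma = 30 \)).

```python
import math


def chebyshev_interval(x_bar, sigma, n, confidence=0.95):
    """Interval for the mean from Chebyshev's theorem; no normality assumption is needed.

    (Hypothetical helper, not part of the original text.)
    """
    k = math.sqrt(1 / (1 - confidence))  # about 4.47 for 95%
    margin = k * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin


low, high = chebyshev_interval(80, 30, 25)
print(round(low, 2), round(high, 2))
```

The margin of roughly 26.8 is more than twice the 11.76 obtained with the normal distribution, which is the price of making no assumption about the shape of the population.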
4.28 Under what conditions can we construct confidence intervals for the unknown population mean
from a random sample drawn from a population using (a) The normal distribution? (b) The t
distribution? (c) Chebyshev's theorem?
(a) We can use the normal distribution (1) if the parent population is normal, \( n \ge 30 \), and \( \sigma \) or s are
known; (2) if \( n \ge 30 \) (by invoking the central-limit theorem) and using s as an estimate for \( \sigma \); or (3) if
n < 30 but \( \sigma \) is given and the population from which the random sample is taken is known to be
normally distributed.
(b) We can use the t distribution (for the given degrees of freedom) when n < 30 but \( \sigma \) is not given and the
population from which the sample is taken is known to be normally distributed.
(c) If n < 30 but the population from which the random sample is taken is not known to be normally
distributed, theoretically we should use neither the normal distribution nor the t distribution. In such
cases, either we should use Chebyshev's theorem or we should increase the size of the random sample to
\( n \ge 30 \) (so as to be able to use the normal distribution). In reality, however, the t distribution is used
even in these cases.
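The rules of Prob. 4.28 amount to a small decision procedure, sketched here in Python. The function name `choose_method` and its return labels are assumptions made for illustration; the branching follows parts a to c of the answer.

```python
def choose_method(n, population_normal, sigma_known):
    """Pick the distribution for a confidence interval on the mean.

    (Hypothetical helper, not part of the original text.)
    """
    if n >= 30:
        # Central-limit theorem; s can stand in for sigma if sigma is not given
        return "normal"
    if population_normal:
        # Small sample from a normal population
        return "normal" if sigma_known else "t"
    # Small sample, population not known to be normal
    return "chebyshev"


print(choose_method(36, population_normal=False, sigma_known=False))
print(choose_method(20, population_normal=True, sigma_known=False))
print(choose_method(20, population_normal=False, sigma_known=False))
```

In the last case the text notes that increasing the sample to at least 30 is the alternative to Chebyshev's theorem, and that in practice the t distribution is often used anyway.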
Supplementary Problems
SAMPLING
4.29 (a) What does statistical inference refer to? (b) What are the names of the descriptive characteristics of
populations and samples? (c) How can representative samples be obtained?
Ans. (a) Estimation and hypothesis testing (b) Parameters and statistics (c) By random sampling
4.30 (a) Starting from the third column and tenth row of App. 4 and reading horizontally, obtain a sample
of 5 from 99 elements. (b) Starting from the seventh column and first row of App. 4 and reading vertically,
obtain a sample of 10 from 400 elements.
Ans. (a) 31, 13, 33, 67, 68 (b) 24, 54, 290, 218, 385, 130, 24, 72, 313, 397
SAMPLING DISTRIBUTION OF THE MEAN
4.31 How can we obtain the theoretical sampling distribution of the mean from a population which is (a) Finite?
(b) Infinite?
Ans. (a) By taking all possible different samples of size n from the population and then finding the mean of
each sample (b) By (hypothetically) taking an infinite number of samples of size n from the infinite
population and then finding the mean of each sample
4.32 What is (a) the mean and (b) the standard error for a theoretical sampling distribution of the mean?
Ans. (a) \( \mu_{\bar{X}} = \mu \), where \( \mu \) is the mean of the parent population (b) \( \sigma_{\bar{X}} = \sigma/\sqrt{n} \), where \( \sigma \) is the standard
deviation of the parent population and n is the sample size; for finite populations of size N where n > 0.05N,
\( \sigma_{\bar{X}} = (\sigma/\sqrt{n})\sqrt{(N-n)/(N-1)} \)
4.33 For a population of 1000 items, \( \mu = 50 \) units and \( \sigma = 10 \) units. What are the mean and standard error of
the theoretical sampling distribution of the mean for sample sizes of (a) 25 and (b) 81?
Ans. (a) \( \mu_{\bar{X}} = 50 \) units and \( \sigma_{\bar{X}} = 2 \) units (b) \( \mu_{\bar{X}} = 50 \) units and \( \sigma_{\bar{X}} = 1.07 \) units
4.34 What is the shape of the theoretical sampling distribution of the mean for samples of (a) 10 if the parent
population is normal? (b) 50 if the parent population is not normal? (c) On what was the answer to part b
based?
Ans. (a) Normal (b) Approximately normal (c) The central-limit theorem
4.35 What is the z statistic for (a) Random variable X? (b) The theoretical sampling distribution of \( \bar{X} \)?
Ans. (a) \( z = (X - \mu)/\sigma \) (b) \( z = (\bar{X} - \mu)/\sigma_{\bar{X}} \)
4.36 What is the probability of \( \bar{X} \) lying between 49 and 50 for a random sample of 36 from a population with
\( \mu = 48 \) and \( \sigma = 12 \)?
Ans. 0.1498, or 14.98%
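The answer to Prob. 4.36 can be verified with the standard library's normal distribution. This is a sketch under the assumption that the problem's inputs are \( \mu = 48 \), \( \sigma = 12 \), and n = 36, which reproduce the stated answer; the function name `prob_mean_between` is hypothetical.

```python
import math
from statistics import NormalDist


def prob_mean_between(lo, hi, mu, sigma, n):
    """P(lo < sample mean < hi) using the sampling distribution of the mean."""
    sampling_dist = NormalDist(mu, sigma / math.sqrt(n))
    return sampling_dist.cdf(hi) - sampling_dist.cdf(lo)


p = prob_mean_between(49, 50, mu=48, sigma=12, n=36)
print(round(p, 4))
```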
4.37 What is the probability that the mean for a random sample of 144 accounts receivable drawn from a
population of 2000 accounts with a mean of $10,000 and a standard deviation of $4000 will be between
$9500 and $10,500?
Ans. 0.8813, or 88.13%
ESTIMATION USING THE NORMAL DISTRIBUTION
4.38 What are unbiased point estimators of \( \mu \), \( \sigma \), and p, respectively?
Ans. \( \bar{X} \), s [as defined in Eqs. (2.10b) and (2.11b)], and \( \bar{p} \)
4.39 Using the standard normal distribution, state for \( \mu \) (a) the 90%, (b) the 95%, and (c) the 99%
confidence intervals.
Ans. (a) \( \mu = \bar{X} \pm 1.64\sigma_{\bar{X}} \) ...
... (b) ... operating hours (c) n would have had to be increased to 30 to justify
the use of the normal distribution
4.45 For the binomial distribution, write the formula for (a) \( \mu \) and \( \sigma \), (b) \( \mu_{\bar{p}} \) and \( \sigma_{\bar{p}} \) when n < 0.05N, and
(c) \( \sigma_{\bar{p}} \) when n > 0.05N.
Ans. (a) \( \mu = np \) and \( \sigma = \sqrt{np(1-p)} \) (b) \( \mu_{\bar{p}} = p \) and \( \sigma_{\bar{p}} = \sqrt{p(1-p)/n} \)
(c) \( \sigma_{\bar{p}} = \sqrt{p(1-p)/n}\,\sqrt{(N-n)/(N-1)} \)
4.46 For a random sample of 36 graduate students in economics in a graduate economics program with 580
students, 8 students have an undergraduate degree in mathematics. Find the proportion of all graduate
students at this university with an undergraduate major in mathematics at the 90% confidence level.
Ans. 0.11 to 0.33
4.47 A manufacturer of lightbulbs wants to estimate the proportion of defective lightbulbs within ±0.1 with a
95% degree of confidence. What is the minimum sample size required if previous experience indicates that
the proportion of defective lightbulbs produced is 0.2?
Ans. 62
4.48 (a) Write down the expression for n to solve Prob. 4.47. (b) How could you still have solved Prob. 4.47 if the
manufacturer did not know that p = 0.2?
Ans. (a) \( n = z^2 p(1-p)/(\bar{p} - p)^2 \)
(b) By letting p = 0.5 and n = 97
CONFIDENCE INTERVALS FOR THE MEAN USING THE t DISTRIBUTION
4.49 Find the t value for 29 df for the following areas falling within the (right) tail of the t distribution: (a) 10%,
(b) 5%, (c) 2.5%, and (d) 0.5%.
Ans. (a) \( t_{0.10} = 1.311 \) (b) \( t_{0.05} = 1.699 \) (c) \( t_{0.025} = 2.045 \) (d) \( t_{0.005} = 2.756 \)
4.50 Find the z value for the following areas falling from the mean to the z value under the standard normal
curve: (a) 40%, (b) 45%, (c) 47.5%, and (d) 49.5%. (e) How do these z
values compare with the corresponding t values found in Prob. 4.49?
Ans. (a) z = 1.28 (b) z = 1.65 (c) z = 1.96 (d) z = 2.58 (e) Corresponding z and t values
are very similar (compare z = 1.28 to t = 1.311, z = 1.65 to t = 1.699, z = 1.96 to t = 2.045, and z = 2.58 to
t = 2.756)
4.51 A random sample of n = 16 with \( \bar{X} = 50 \) and s = 10 is taken from a very large population that is normally
distributed. (a) Find the 95% confidence interval for the unknown population mean. (b) How would the
answer have differed if \( \sigma = 10 \)?
Ans. (a) 44.67 to 55.33 (using the t distribution with 15 df) (b) 45.1 to 54.9 (using the standard normal
distribution)
4.52 On a particular test for a very large statistics class, a random sample of n = 4 students has a mean grade
\( \bar{X} = 75 \) and s = 8. The grades for the entire class are known to be normally distributed. For the unknown
population mean of the grades, find (a) the 95% confidence interval and (b) the 99% confidence interval.
Ans. (a) Approximately from 62 to 88 (b) Approximately from 5
4.53 A random sample of n = 16 with \( \bar{X} = 50 \) and s = 10 is taken from a very large population that is not
normally distributed. (a) Find the 95% confidence interval for the unknown population mean.
(b) How is the answer in part a different from those of Prob. 4.51?
Ans. (a) 38 to 61 (using Chebyshev's theorem and s as a rough estimate of \( \sigma \)) (b) The 95% confidence
interval here is much wider than those found in Prob. 4.51
4.54 Indicate which distribution to use in order to find confidence intervals for the unknown population mean
from a random sample taken from the population in the following cases: (a) n = 36 and s = 10, (b) n = 20
and s = 10 and the population is normally distributed, and (c) n = 20 and s = 10 and the population is not
normally distributed.
Ans. (a) Normal distribution (invoking the central-limit theorem and using s as an estimate of \( \sigma \)) (b) The t
distribution with 19 df (c) Chebyshev's theorem

Statistical Inference:
Testing Hypotheses
5.1 TESTING HYPOTHESES
Testing hypotheses about population characteristics (such as \( \mu \) and \( \sigma \)) is another fundamental aspect
of statistical inference and statistical analysis. In testing a hypothesis, we start by making an assumption
with regard to an unknown population characteristic. We then take a random sample from the
population, and on the basis of the corresponding sample characteristic, we either accept or reject the
hypothesis with a particular degree of confidence.
We can make two types of errors in testing a hypothesis. First, on the basis of the sample
information, we could reject a hypothesis that is in fact true. This is called a type I error. Second,
we could accept a false hypothesis and make a type II error.
We can control or determine the probability of making a type I error, \( \alpha \). However, by reducing \( \alpha \),
we will have to accept a greater probability of making a type II error, \( \beta \), unless the sample size is
increased. \( \alpha \) is called the level of significance, and \( 1 - \alpha \) is the level of confidence of the test.
EXAMPLE 1. Suppose that a firm producing lightbulbs wants to know if it can claim that its lightbulbs last 1000
burning hours, \( \mu \). To do this, the firm can take a random sample of, say, 100 bulbs and find their average lifetime
\( \bar{X} \). The smaller the difference is between \( \bar{X} \) and \( \mu \), the more likely is acceptance of the hypothesis that \( \mu = 1000 \)
burning hours at a specified level of significance, \( \alpha \). By setting \( \alpha \) at 5%, the firm accepts the calculated risk of
rejecting a true hypothesis 5% of the time. By setting \( \alpha \) at 1%, the firm would face a greater probability of accepting
a false hypothesis.
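The trade-off between the two error types can be illustrated numerically. The Python sketch below computes \( \beta \) for a two-tail test of \( \mu_0 = 1000 \) at two significance levels. The true mean of 980 h and the standard error of 8 h are purely hypothetical inputs chosen for illustration, as is the function name `type_ii_error`.

```python
from statistics import NormalDist


def type_ii_error(alpha, mu0, true_mu, std_error):
    """Probability of accepting H0: mu = mu0 (two-tail test) when the true mean is true_mu."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    lower = mu0 - z * std_error  # lower bound of the acceptance region
    upper = mu0 + z * std_error  # upper bound of the acceptance region
    sampling_dist = NormalDist(true_mu, std_error)
    return sampling_dist.cdf(upper) - sampling_dist.cdf(lower)


# Hypothetical situation: the true mean is 980 h and the standard error is 8 h
beta_at_5_percent = type_ii_error(0.05, 1000, 980, 8)
beta_at_1_percent = type_ii_error(0.01, 1000, 980, 8)
print(round(beta_at_5_percent, 3), round(beta_at_1_percent, 3))
```

With the sample size held fixed, lowering \( \alpha \) from 5% to 1% widens the acceptance region and so raises \( \beta \).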
5.2 TESTING HYPOTHESES ABOUT THE POPULATION MEAN AND PROPORTION
The formal steps in testing hypotheses about the population mean (or proportion) are as follows:
1. Assume that \( \mu \) equals some hypothetical value \( \mu_0 \). This is represented by \( H_0\colon \mu = \mu_0 \) and is
called the null hypothesis. The alternative hypotheses are then \( H_1\colon \mu \ne \mu_0 \) (read "\( \mu \) is not equal
to \( \mu_0 \)"), \( H_1\colon \mu > \mu_0 \), or \( H_1\colon \mu < \mu_0 \), depending on the problem.
2. Decide on the level of significance of the test (usually 5%, but sometimes 1%) and define the
acceptance region and rejection region for the test using the appropriate distribution.
3. Take a random sample from the population and compute \( \bar{X} \). If \( \bar{X} \) (in standard deviation units)
falls in the acceptance region, accept \( H_0 \); otherwise, reject \( H_0 \) in favor of \( H_1 \).
Copyright 2002 The McGraw-Hill Companies, Inc.
EXAMPLE 2. Suppose that the firm in Example 1 wants to test whether it can claim that the lightbulbs it produces
last 1000 burning hours. The firm takes a random sample of n = 100 of its lightbulbs and finds that the sample
mean \( \bar{X} = 980 \) h and the sample standard deviation s = 80 h. If the firm wants to conduct the test at the 5% level of
significance, it should proceed as follows. Since \( \mu \) could be equal to, larger than, or smaller than 1000, the firm
should set up the null and alternative hypotheses as
\[ H_0\colon \mu = 1000 \qquad H_1\colon \mu \ne 1000 \]
Since n > 30, the sampling distribution of the mean is approximately normal (and we can use s as an estimate of \( \sigma \)).
The acceptance region of the test at the 5% level of significance is within ±1.96 under the standard normal curve and
the rejection region is outside (see Fig. 5-1). Since the rejection region is in both tails, we have a two-tail test. The
third step is to find the z value corresponding to \( \bar{X} \):
\[ z = \frac{\bar{X} - \mu_0}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{980 - 1000}{80/\sqrt{100}} = \frac{-20}{8} = -2.5 \]
Fig. 5-1 (standard normal curve: rejection region, acceptance region, rejection region)
Since the calculated z value falls in the rejection region, the firm should reject \( H_0 \), that \( \mu = 1000 \), and accept \( H_1 \), that
\( \mu \ne 1000 \), at the 5% level of significance.
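The three steps of the two-tail test in Example 2 can be sketched in Python. This is an illustration only: the function name `two_tail_z_test` is hypothetical, and the inputs are taken as \( \bar{X} = 980 \) h, s = 80 h, and n = 100, with s used as an estimate of \( \sigma \).

```python
import math
from statistics import NormalDist


def two_tail_z_test(x_bar, mu0, s, n, alpha=0.05):
    """Return the z statistic, the critical value, and whether H0: mu = mu0 is rejected."""
    z = (x_bar - mu0) / (s / math.sqrt(n))
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    return z, z_critical, abs(z) > z_critical


z, z_critical, reject = two_tail_z_test(980, 1000, 80, 100)
print(z, round(z_critical, 2), reject)
```

Because |z| = 2.5 exceeds the critical value of about 1.96, the sketch reaches the same decision as the example: reject \( H_0 \) at the 5% level.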
EXAMPLE 3. A firm wants to know with a 95% level of confidence if it can claim that the boxes of detergent it
sells contain more than 500 g (about 1.1 lb) of detergent. From past experience the firm knows that the amount of
detergent in the boxes is normally distributed. The firm takes a random sample of n = 25 and finds that \( \bar{X} = 520 \) g
and s = 75 g. Since the firm is interested in testing if \( \mu > 500 \) g, we have
\[ H_0\colon \mu = 500 \qquad H_1\colon \mu > 500 \]
Since the population distribution is normal but n < 30 and \( \sigma \) is not known, we must use the t distribution (with
n - 1 = 24 degrees of freedom) to define the critical, or rejection, region of the test at the 5% level of significance.
This is found from App. 5 (see Sec. 4.4) and is given in Fig. 5-2. This is a right-tail test. Finally, since
\[ t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{520 - 500}{75/\sqrt{25}} = \frac{20}{15} \approx 1.33 \]
and it falls within the acceptance region, we accept \( H_0 \), that \( \mu = 500 \) g, at the 5% level of significance (or with a 95%
level of confidence).
Fig. 5-2 (t distribution with 24 df: acceptance region, rejection region in the right tail)
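The right-tail t test of Example 3 can be sketched in the same way. The critical value 1.711 is the App. 5 entry for 5% in one tail with 24 df, the same entry quoted in Prob. 4.25(a); the function name `right_tail_t_test` is hypothetical.

```python
import math


def right_tail_t_test(x_bar, mu0, s, n, t_critical):
    """Return the t statistic and whether H0: mu = mu0 is rejected in favor of mu > mu0."""
    t = (x_bar - mu0) / (s / math.sqrt(n))
    return t, t > t_critical


# Example 3 inputs: sample mean 520 g, s = 75 g, n = 25, critical t for 24 df at 5%
t, reject = right_tail_t_test(520, 500, 75, 25, t_critical=1.711)
print(round(t, 2), reject)
```

Since t is about 1.33 and lies below 1.711, the statistic falls in the acceptance region and \( H_0 \) is not rejected.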
EXAMPLE 4. In the past, 60% of the students entering a specialized college program received their degrees within
4 years. For the 1980 entering class of 38, only 15 received their degrees by 1984. To test if the 1980 class