One of the eminent Past Presidents of Shri Rajasthani Seva Sangh, who helped the Rajasthani community to preserve their culture and heritage though being away from their motherland. He had a dynamic personality. He established various businesses in the fields of Chemicals, Engineering, Plastics and Electricals, etc., which have a global presence today and contribute to the development of the Indian economy. He served in Mumbai through various organisations, like Marwadi Sammelan, Rajasthani Vidhyarthi Griha etc.
His contribution in the development of the magnificent Shri Khemisati Mandir at Jhunjhunu is an icon for any social worker. Shri Rajasthani Seva Sangh Trust under his leadership made substantial contribution in the development of temple, school and college at J. B. Nagar. Education was very close to his heart. He started a school at Jhunjhunu in his father's name and he had a vision to make Jhunjhunu an education hub having a University at Jhunjhunu. We all are very proud that we are in the process of fulfilling the dream of Shri Jagdishprasad Jhabarmal Tibrewala, the beloved Past President of Shri Rajasthani Seva Sangh, in overall development of Jhunjhunu by taking support of the members of Jhunjhunu Pragati Sangh and other organisations which are interested in the development of not only Shekhawati region, but all over Rajasthan, as in a very short span of time the University is going to spread its wings all over India, thus fulfilling his mission and completing the dream of late Shri Shriniwas Bagarka who established Shri Rajasthani Seva Sangh.
Quantitative Techniques
(For Ph.D. Course Work)
I am trying to reach you through this book. This book can be used as a self-learning material to enhance research skills and quality. This book is thoroughly based on the syllabus prescribed by the University Grants Commission. Our experts have tried their best to give you excellent study material. The terms have been explained in a lucid manner. I hope that my gentle scholars will enhance their research methodology through this sincere effort. Some solved numerical problems have also been added to express data collection and data sampling methods to design a quality research.
Sd/-
Vinod D. Tibrewala
Chancellor
Preface
publication.
The author is immensely grateful to Honourable Chancellor, JJT University, Shri Vinod Tibrewala for his constant encouragement and guidance in bringing out this work. Valuable inputs provided by Dr. N.N. Pandey, Prin. Prahladrai Dalmia Lions College of Commerce and Economics, from time to time are also gratefully acknowledged. The author is also thankful to Ms. Vanashree Valecha, Ms. Rakhee Kelaskar and Dr. Mrs. Anju Singh for their valuable help. The author would also like to place on record the untiring help extended by Dr. Balwant Singh in the expeditious printing and publication of this material.
INDEX
1. Module I
2. Module II
4. Module IV
6. Exercise
Try Yourself
Measures of Central Tendency are nothing but statistical averages. An average tells us the value about which items have a tendency to cluster. It is representative of the mass of data and is useful in comparing different distributions. Averages have a general tendency to lie at the centre, and hence they are termed 'measures of central tendency'. The requisites of a measure of central tendency include simplicity of definition, ease of computation, capability of further algebraic treatment, sampling stability and non-influence by extreme observations.
Types: The measures of central tendency or averages can be classified as (i) Algebraic averages and (ii) Positional averages.
Algebraic averages require an algebraic formula to compute, while Positional averages can be located from graphs. Algebraic averages cannot be obtained from a graph.
Amongst Mean, Median and Mode, the Mean falls in the Algebraic average category, while Median and Mode are Positional averages.
Mean is the simplest measure of central tendency and is widely used. It is used in summarizing the essential features of a series and enables data to be compared. It is easy to define and simple to understand. It is based on all the observations and hence is treated as a good representative of the distribution. It has sampling stability and also the capability of further algebraic treatment. Its limitations are that it cannot be obtained for a distribution with 'open end' class intervals and that it is unduly affected by extreme observations. Sometimes it gives absurd results, and it may be a value which is not part of the distribution. Especially in Economics and Social studies, where direct quantitative measurements are possible, the Mean is a better average than the others.
Normally Mean is of three types: Arithmetic Mean, Geometric Mean and Harmonic Mean.
Geometric Mean is defined as the nth root of the product of the values of n items.
Occasionally a frequency distribution is encountered that is skewed to the right, but if logarithms of the X values are used with the class intervals of the logs constant, the curve becomes symmetrical. In such a situation the Geometric Mean may be appropriate.
Harmonic Mean is defined as the reciprocal of the average of the reciprocals of the values of the items of a series. It has limited applications, particularly where time and rate are involved. It is used in cases like time and motion study, where time is variable and distance constant.
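The three definitions above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the two speeds are hypothetical and are not taken from the text.

```python
import math

def arithmetic_mean(values):
    # Sum of the items divided by their number.
    return sum(values) / len(values)

def geometric_mean(values):
    # n-th root of the product of n items, computed through logarithms.
    # Assumes every value is strictly positive.
    return math.exp(sum(math.log(v) for v in values) / len(values))

def harmonic_mean(values):
    # Reciprocal of the average of the reciprocals.
    return len(values) / sum(1.0 / v for v in values)

# Hypothetical data: speeds (km/h) over two stretches of equal distance.
speeds = [40.0, 60.0]
am = arithmetic_mean(speeds)
gm = geometric_mean(speeds)
hm = harmonic_mean(speeds)
print(am, round(gm, 2), hm)
```

When distance is constant and time varies, as in the time and motion case mentioned above, the harmonic mean (48 km/h here) is the average speed over the whole journey, while the arithmetic mean (50 km/h) overstates it.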
Median is the value of the middle item of the series when the series is arranged in ascending or descending order. Median is particularly useful in the context of qualitative phenomena, for example in estimating intelligence. Median is not useful where items need to be assigned relative importance and weights. It is not frequently used in sampling statistics.
There are two specific situations where the median serves as a valuable alternative to the mean. These occur when
In psychology this often occurs in learning experiments where you are measuring the number of errors or the amount of time required for an individual to solve a particular problem. Generally, for a frequency distribution with open-end class intervals the mean cannot be computed, and hence in such cases the median is preferred.
Mode is the value which occurs most frequently in the distribution. It is easy to compute and it can be used with any scale of measurement, which makes it flexible. When scores are measured on a nominal scale it is impossible to calculate either the mean or the median, so the mode is used to describe central tendency. For example, the mode can describe the typical or most representative academic major for a sample. Because the mode identifies the most typical value or case, it often produces a more sensible measure of central tendency.
Thus, comparing mean, median and mode, it is noted that the mean is the most commonly used average, taking into consideration all the observations. It can be a good representative of the distribution. The goal of central tendency is to find a single value that best represents the distribution. Besides being a good representative, the mean has the added advantage of being a good measure for purposes of inferential statistics. Specifically, whenever you take a sample from a population, the sample mean will give a good indication of the value of the population mean. Also, the mean satisfies the majority of the requisites of an ideal average, so the mean is superior amongst all. But there are certain situations where the mean cannot be computed; then the median or mode can be used.
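The comparison of the three averages can be illustrated with Python's standard `statistics` module. The scores and the list of majors below are hypothetical and chosen only to show how one extreme observation pulls the mean while the median and mode stay put.

```python
import statistics

# Hypothetical error counts from a learning experiment; 40 is an extreme value.
scores = [2, 3, 3, 4, 5, 6, 40]

mean_score = statistics.mean(scores)      # uses every observation
median_score = statistics.median(scores)  # middle item of the ordered series
mode_score = statistics.mode(scores)      # most frequent value

# On a nominal scale only the mode is defined (hypothetical majors).
majors = ["Commerce", "Science", "Commerce", "Arts"]
typical_major = statistics.mode(majors)

print(mean_score, median_score, mode_score, typical_major)
```

Here the mean (9) is larger than six of the seven scores, whereas the median (4) and mode (3) still describe the bulk of the data.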
Dispersion
An average can represent a distribution only as the best single representative. There are some situations where averages fail to compare distributions. Consider the following case.
Four candidates, Sanchit, Saurabh, Smita and Seema, score marks in three tests as follows.
Test    Sanchit   Saurabh   Smita   Seema
I         80        95       80      98
II        65        80       80      92
III       95        65       80      50
If the candidates are compared on the basis of the average, the conclusion is that all four are equal as far as scores are concerned, since the average score for everyone is 80. If studied minutely, we see that Smita is the most consistent. Sanchit comes next to Saurabh, and then Seema. So here there is a need to study the scatter of the values from the average, and it is defined as dispersion. Thus Dispersion means the scatter or spread of individual values from their average in the distribution. The measures which are used to measure dispersion are known as measures of dispersion.
Absolute measures are with respect to the given distribution and hence are expressed in the corresponding units of measurement, while Relative measures are free from any measurement units. They are pure numbers.
Following is the list of Absolute and Relative measures of dispersion with their formulas.
1. Range $= H - S$
Coeff. of Range $= \dfrac{H - S}{H + S}$
where H: highest value in the data and S: smallest value in the data.
2. Quartile Deviation or Semi Inter Quartile Range $= \dfrac{Q_3 - Q_1}{2}$
Coeff. of Q.D. $= \dfrac{Q_3 - Q_1}{Q_3 + Q_1}$
The utility of range is that it gives an idea of variability very quickly. But it is affected very greatly by sampling fluctuations. Range is mostly used as a rough measure of variability.
If open-end class intervals are given, the suitable measure is the Quartile Deviation.
The standard deviation concept is very important in further analysis. It is defined as the root mean squared deviation.
$\sigma = \sqrt{\dfrac{1}{N}\sum (X - \bar{X})^2}$
Coeff. of Variation $= \dfrac{\sigma}{\bar{X}} \times 100$
Coefficient of Variation is a very important measure used to compare different distributions on the basis of consistency, homogeneity, variability and uniformity.
A lower CV indicates more consistency, more uniformity and more stability, while a higher CV indicates more variability and more heterogeneity.
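As a rough illustration of these measures, the sketch below computes the range, coefficient of range, standard deviation and coefficient of variation for two of the candidates from the test-score example, taking Smita's scores as 80, 80, 80 and Seema's as 98, 92, 50. The function names are hypothetical.

```python
import math

def mean(values):
    return sum(values) / len(values)

def std_dev(values):
    # Root mean squared deviation from the mean (divisor N).
    m = mean(values)
    return math.sqrt(sum((x - m) ** 2 for x in values) / len(values))

def coeff_of_variation(values):
    return std_dev(values) / mean(values) * 100

def coeff_of_range(values):
    h, s = max(values), min(values)
    return (h - s) / (h + s)

smita = [80, 80, 80]
seema = [98, 92, 50]

for name, marks in (("Smita", smita), ("Seema", seema)):
    print(name, mean(marks), max(marks) - min(marks),
          round(coeff_of_range(marks), 4),
          round(std_dev(marks), 2),
          round(coeff_of_variation(marks), 2))

cv_smita = coeff_of_variation(smita)
cv_seema = coeff_of_variation(seema)
```

Both candidates have the same mean of 80, but the CV is zero for Smita and about 26.7 for Seema, which is the sense in which Smita is the more consistent of the two.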
In the above discussion the Quartiles were introduced. Let us know about Quartiles in brief.
Quartiles are the partition values which divide the distribution into four equal parts, when the data is arranged in order. There are 3 Quartiles in all, denoted by Q1, Q2 and Q3.
Q1 is known as the lower quartile and divides the distribution such that 25 % of observations have a value less than Q1 and 75 % of observations have a value above Q1. Q2 is known as the second quartile and divides the distribution such that 50 % of observations have a value less than Q2 and 50 % of observations have a value above Q2. In other words, Q2 is nothing but the Median of the distribution.
Q3 is known as the upper quartile and divides the distribution such that 75 % of observations have a value less than Q3 and 25 % of observations have a value above Q3.
The other partition values are Deciles (D1, D2, ..., D9) and Percentiles (P1, P2, ..., P99).
The interpretations and computational procedures for Deciles and Percentiles are similar to those of the Median. Deciles divide the distribution into 10 equal parts, while Percentiles divide the distribution into 100 equal parts. The fifth Decile D5 and the fiftieth Percentile P50 are the median of the distribution. The tenth Percentile is D1, the 25th Percentile is Q1, and so on.
Partition values can be used to determine limits for a desired percentage of central observations. For example, Q1 and Q3 provide limits for the central 50 % of observations, while P20 or D2 and P80 or D8 provide limits for the central 60 % of observations.
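A small sketch of how quartiles can be located in practice, using Python's `statistics.quantiles`. The data are hypothetical, and the "inclusive" interpolation rule used here is only one of several conventions; a textbook procedure for grouped data may give slightly different values.

```python
import statistics

# Hypothetical observations, already arranged in ascending order.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 returns the three cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

quartile_deviation = (q3 - q1) / 2
coeff_qd = (q3 - q1) / (q3 + q1)

print(q1, q2, q3, quartile_deviation, coeff_qd)
```

Q2 coincides with the median, and the interval from Q1 to Q3 encloses the central half of the observations.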
Correlation Coefficient and Coefficient of Determination
The measures we studied in the above sections are related to univariate study only.
In practice we come across a large number of problems involving the use of two or more than two variables. The variables are said to be correlated if a change in one variable is accompanied by a change in the others, either in the same direction or in the opposite direction. The degree of relationship between the variables under consideration is measured through correlation analysis. The measure of correlation is called the correlation coefficient.
Correlation analysis attempts to determine the "degree of relationship between the variables". Thus correlation is a statistical device which helps us in analyzing the covariation of two or more variables.
The problem of analyzing the relation between different series should be broken into three steps;
Correlation is described or classified in different ways. Some of the important ways are as
follows.
i) Positive or Negative
• The two variables under study are affected by a large number of independent causes so as to form a normal distribution.
• There is a cause and effect relationship between the forces affecting the distribution of the items in the two series.
Amongst the mathematical methods used for measuring the degree of relationship, Karl Pearson's method is the most popular. The correlation coefficient summarizes in one figure not only the degree of correlation but also its direction, positive or negative.
When r = 0 it implies no linear relationship between the variables, i.e. the variables are uncorrelated. A value of r closer to 0 indicates a weak relationship or weak association between the variables, while r closer to -1 or +1 indicates a strong association amongst the variables.
The full interpretation of r depends upon circumstances, one of which is the size of the sample.
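As an illustration, Karl Pearson's coefficient can be computed from deviations about the means in the usual way. The sketch below is a minimal version with hypothetical paired data; `pearson_r` is an illustrative name, not a library function.

```python
import math

def pearson_r(xs, ys):
    # r = sum(dx*dy) / sqrt(sum(dx^2) * sum(dy^2)), with deviations taken from the means.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired observations.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(round(r, 4), round(r * r, 4))
```

The sign of r gives the direction of the relationship and its magnitude the strength; squaring it gives the proportion of variation explained.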
The probable error of the coefficient of correlation helps to determine the reliability of the value of the coefficient.
• If the value of r is less than the probable error, there is no evidence of correlation, i.e. the value of r is not at all significant.
• If the value of r is more than six times the probable error, the coefficient of correlation is practically certain, i.e. the value of r is significant.
• The interval (r - P.E., r + P.E.) provides the limits within which the population correlation coefficient is expected to lie.
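The three rules above can be sketched in code. The passage does not state the formula for the probable error, so the sketch assumes the standard textbook expression P.E. = 0.6745 (1 - r²) / √N; the sample values r = 0.8 and N = 25 are hypothetical.

```python
import math

def probable_error(r, n):
    # Assumed standard formula: P.E. = 0.6745 * (1 - r^2) / sqrt(N).
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)

def interpret(r, n):
    pe = probable_error(r, n)
    if abs(r) < pe:
        verdict = "not significant"
    elif abs(r) > 6 * pe:
        verdict = "significant"
    else:
        verdict = "inconclusive"
    limits = (r - pe, r + pe)  # expected range for the population coefficient
    return pe, verdict, limits

pe, verdict, limits = interpret(0.8, 25)
print(round(pe, 4), verdict, tuple(round(v, 4) for v in limits))
```

The "inconclusive" label for values between one and six times the probable error is a naming choice of this sketch; the rules above only cover the two extremes.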
The measure of probable error can be properly used only when the following conditions exist.
2. The statistical measure for which the P.E. is computed must have been calculated from a sample.
3. The sample must have been selected in an unbiased manner and the individual items must be independent.
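explained variation. It is defined as the ratio of the explained variance to the total variance.
$r^2 = \dfrac{\text{Explained variance}}{\text{Total variance}} = 1 - \dfrac{\text{Unexplained variance}}{\text{Total variance}}$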
According to Tuttle, "the coefficient of correlation has been grossly overrated and is used entirely too much. Its square, the coefficient of determination, is a much more useful measure of the linear covariation of two variables. The reader should develop the habit of squaring every correlation coefficient he finds cited or stated before coming to any conclusion about the extent of the linear relationship between the two correlated variables."
2. The value r = 0.707 implies $r^2$ = 0.499849, i.e. half of the variance in Y is due to X.
3. If r = 0.6, 36% of the total variation is explained, while if r = 0.3 only 9% of the total variation is explained.
The value of $r^2$ is never negative. It cannot tell whether the relationship between the two variables is positive or negative. For that purpose r is to be computed.
STANDARD ERROR OF AN ESTIMATE.
The sampling distribution of a statistic is generated from a population distribution, known or assumed. The same population may generate an infinite number of sampling distributions for the statistic, each for a specific sample size n. A population may generate sampling distributions for two or more different statistics. The standard deviation of the sampling distribution of a statistic is known as the Standard Error of an estimate.
The S.E. can also be used to determine the limits within which the parameter values are expected to lie.
S.E. of Mean $= \dfrac{\text{Pop. Std. Devn.}}{\sqrt{n}}$
Module II
Forecasting Techniques
Multiple Correlation & Multiple Regression
Time Series Analysis.
2.11 Introduction:
Forecasting is a key element of management decision making. Since the ultimate effectiveness of any decision depends upon a sequence of events following the decision, forecasting should permit an improved choice over that which would otherwise be made.
Forecasting techniques are useful in
a) Inventory management, to estimate the usage rate for each part in order to determine procurement quantities.
b) Production planning, to forecast unit sales for each item by delivery period for a number of months in the future.
Forecasting is an integral part of the planning process. Perfectly correct forecasting is not usually possible because of the uncertainty which inevitably attaches to the future. By forecasting we only try to minimize the impact of uncertainty. Thus forecasting is only a means of attempting to reduce the uncertainty of the future and not of eliminating it.
Various techniques are available for forecasting. The choice of a method is generally dictated by data availability and/or by the urgency of the forecast. Many times forecasters are forced to use a less reliable method, as the data required for the more reliable methods are not always available. If the use of better techniques is time consuming and forecasts are urgently needed, forecasts are made on the basis of easy and less reliable techniques.
Regression technique is a tool to isolate the causal relationship between the variables. Several regression models are available to test and establish a statistically satisfactory fit between the dependent variable and a specific range of independent variables. Forecasts are made by substituting values of the independent variables in the equation and then computing the dependent variable. These methods are useful for long-term forecasting and are relatively more sophisticated and expensive to use.
Regression analysis is one of the scientific techniques used for predictions. According to M.M. Blair, "Regression Analysis is a mathematical measure of the average relationship between two or more variables in terms of the original units of the data."
Regression analysis confined to the study of only two variables at a time is termed simple regression. Regression analysis for studying more than two variables at a time is known as multiple regression.
In this analysis there are two types of variables: Dependent variables and Independent variables. The variable which influences the values or is used for prediction is called the independent variable. In regression analysis, the independent variable is also known as the regressor or explanatory variable, while the dependent variable is known as the regressed or explained variable.
The estimation is done with the help of equations known as regression equations. The regression equations are obtained in accordance with the Principle of Least Squares, which consists in minimizing the sum of the squares of the deviations between the given observed values of the variables and their corresponding estimated values given by the line of best fit.
While dealing with three or more variables we need to apply multiple correlation. We may be interested in finding the association between the yield of wheat per acre and both the amount of rainfall and the average daily temperature. We shall be trying to make estimates of the value of one of these variables based on the values of all the others. The variable whose value we are trying to estimate is termed the dependent variable, and all the variables on which our estimates are based are known as independent variables.
$R_{2.13}^2 = \dfrac{r_{12}^2 + r_{23}^2 - 2 r_{12} r_{13} r_{23}}{1 - r_{13}^2}$
And
$R_{3.12}^2 = \dfrac{r_{13}^2 + r_{23}^2 - 2 r_{12} r_{13} r_{23}}{1 - r_{12}^2}$
A coefficient of multiple correlation lies between 0 and 1, i.e. it is never negative. A value closer to 1 indicates a better linear relationship between the variables. If the coefficient of multiple correlation is 1, the relation is called perfect. A coefficient of 0 indicates no linear relationship between the variables, but a non-linear relationship between the variables cannot be ruled out.
EXAMPLE 1 : From the data relating to the yield of dry bark ($X_1$), height ($X_2$) and girth ($X_3$) for 18 cinchona plants the following correlation coefficients were obtained: $r_{12} = 0.77$, $r_{13} = 0.72$ and $r_{23} = 0.52$. Find the partial correlation coefficient $r_{12.3}$ and the multiple correlation coefficient $R_{1.23}$.
SOLUTION : We have,
$r_{12.3} = \dfrac{r_{12} - r_{13} r_{23}}{\sqrt{1 - r_{13}^2}\,\sqrt{1 - r_{23}^2}} = \dfrac{0.77 - 0.72 \times 0.52}{\sqrt{1 - (0.72)^2}\,\sqrt{1 - (0.52)^2}} = \dfrac{0.77 - 0.3744}{0.5927691} = \dfrac{0.3956}{0.5927691} = 0.667$
And,
$R_{1.23}^2 = \dfrac{r_{12}^2 + r_{13}^2 - 2 r_{12} r_{13} r_{23}}{1 - r_{23}^2} = \dfrac{(0.77)^2 + (0.72)^2 - 2 \times 0.77 \times 0.72 \times 0.52}{1 - (0.52)^2} = \dfrac{0.534724}{1 - 0.2704} = 0.733$
Hence, $R_{1.23} = 0.8562$
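The arithmetic of Example 1 can be checked with a short sketch. The function names are hypothetical; the formulas are the ones used in the solution above.

```python
import math

def partial_r12_3(r12, r13, r23):
    # r12.3 = (r12 - r13*r23) / (sqrt(1 - r13^2) * sqrt(1 - r23^2))
    return (r12 - r13 * r23) / (math.sqrt(1 - r13 ** 2) * math.sqrt(1 - r23 ** 2))

def multiple_R1_23(r12, r13, r23):
    # R1.23^2 = (r12^2 + r13^2 - 2*r12*r13*r23) / (1 - r23^2)
    r_squared = (r12 ** 2 + r13 ** 2 - 2 * r12 * r13 * r23) / (1 - r23 ** 2)
    return math.sqrt(r_squared)

# Correlations from Example 1 (cinchona plants).
r12_3 = partial_r12_3(0.77, 0.72, 0.52)
R1_23 = multiple_R1_23(0.77, 0.72, 0.52)
print(round(r12_3, 3), round(R1_23, 3))
```

The same function is a quick consistency check on quoted correlations: a partial correlation whose magnitude exceeds 1 signals that the given zero-order coefficients cannot all be correct.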
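EXAMPLE 2 : The following zero-order correlation coefficients are given: $r_{12} = 0.98$, $r_{13} = 0.44$, $r_{23} = 0.54$. Calculate the multiple correlation coefficient treating the first variable ($X_1$) as dependent and the second ($X_2$) and third ($X_3$) variables as independent.
SOLUTION :
$R_{1.23}^2 = \dfrac{r_{12}^2 + r_{13}^2 - 2 r_{12} r_{13} r_{23}}{1 - r_{23}^2} = \dfrac{(0.98)^2 + (0.44)^2 - 2 \times 0.98 \times 0.44 \times 0.54}{1 - (0.54)^2}$
Hence, $R_{1.23} = 0.986$.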
Advantages:
2. Thus again the coefficient of multiple correlation also serves as a measure of goodness of fit of the calculated plane of regression and consequently as a measure of the general degree of accuracy of estimates made by reference to the equation for the plane of regression.
Solving (i), (ii) and (iii),
$b_{13.2} = -0.623$, $b_{12.3} = 0.389$ and $a_{1.23} = 16.479$
Hence required regression equation is,
$X_1 = 16.479 + 0.389 X_2 - 0.623 X_3$
$r_{12} = 0.8$
$\sigma_1 = 10$
If the variables $X_1$, $X_2$ and $X_3$ are measured as deviations from their respective means, $a_{1.23}$ vanishes and we write $X_1 - \bar{X}_1 = x_1$, $X_2 - \bar{X}_2 = x_2$ and $X_3 - \bar{X}_3 = x_3$.
Then (1) transforms into $x_1 = b_{12.3} x_2 + b_{13.2} x_3$ (2)
Then,
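$b_{12.3} = \dfrac{r_{12} - r_{13} r_{23}}{1 - r_{23}^2} \times \dfrac{\sigma_1}{\sigma_2} = \dfrac{0.8 - 0.6 \times 0.5}{1 - (0.5)^2} \times \dfrac{10}{8}$
$b_{21.3} = \dfrac{r_{12} - r_{13} r_{23}}{1 - r_{13}^2} \times \dfrac{\sigma_2}{\sigma_1} = \dfrac{0.8 - 0.5 \times 0.6}{1 - (0.6)^2} \times \dfrac{8}{10} = \dfrac{0.5}{0.64} \times \dfrac{8}{10} = 0.625$
And
$b_{23.1} = \dfrac{r_{23} - r_{12} r_{13}}{1 - r_{13}^2} \times \dfrac{\sigma_2}{\sigma_3} = \dfrac{0.5 - 0.6 \times 0.8}{1 - (0.6)^2} \times \dfrac{8}{5}$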
Now,
: bnl=r231 x
-~ ~ :)1-CO.-l9/ J1-CO.28)2
'" 0.445
rnl= m-1'231U = 0.49-0.~1XO.28
~ ~ J1-CO.51)2 J1-CO.20)2
'" 0.42
~U=~~~
=1.7)(~)'.J1-(0.41)2 = 2.113
61.2, = 6
J
~ ~ = 1.7.Jl- (0.18)2 )1- (O.-l-l~)2
'" 2.333
Also,
''13 - f;l3 1'12 0.18- 051)( 0.-19
1'123 =
~~
.J1-(051)2 F.-I9)2
= 0.04
02,13=02
~
r-;-j.'~j , ~14. ~ ~,1-(0.49)-\1 ~
l-r~131/ -r 123 ! l-{O.O-l)2
'" 2.067
Hence, required regression line is,
$X_3 = 0.403 X_1 + 0.429 X_2$
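EXAMPLE 6 : In a trivariate distribution $\sigma_1 = 2$, $\sigma_2 = \sigma_3 = 3$; $r_{12} = 0.7$, $r_{23} = r_{31} = 0.5$. Find (i) $b_{12.3}$ and (ii) $b_{13.2}$.
SOLUTION
(i) $b_{12.3} = r_{12.3} \times \dfrac{\sigma_{1.3}}{\sigma_{2.3}}$
Now,
$r_{12.3} = \dfrac{r_{12} - r_{13} r_{23}}{\sqrt{1 - r_{13}^2}\,\sqrt{1 - r_{23}^2}} = \dfrac{0.7 - (0.5)(0.5)}{\sqrt{1 - (0.5)^2}\,\sqrt{1 - (0.5)^2}} = 0.6$
$\sigma_{1.3} = \sigma_1 \sqrt{1 - r_{13}^2} = 2\sqrt{1 - 0.25} = 1.732$
$\sigma_{2.3} = \sigma_2 \sqrt{1 - r_{23}^2} = 3\sqrt{1 - 0.25} = 2.598$
Then,
$b_{12.3} = 0.6 \times \dfrac{1.732}{2.598} = 0.4$
(ii) $b_{13.2} = r_{13.2} \times \dfrac{\sigma_{1.2}}{\sigma_{3.2}}$
$r_{13.2} = \dfrac{r_{13} - r_{12} r_{23}}{\sqrt{1 - r_{12}^2}\,\sqrt{1 - r_{23}^2}} = \dfrac{0.5 - (0.7)(0.5)}{\sqrt{1 - (0.7)^2}\,\sqrt{1 - (0.5)^2}} = 0.243$
$\sigma_{1.2} = \sigma_1 \sqrt{1 - r_{12}^2} = 2\sqrt{1 - 0.49} = 1.428$
$\sigma_{3.2} = \sigma_3 \sqrt{1 - r_{23}^2} = 3\sqrt{1 - 0.25} = 2.598$
$b_{13.2} = 0.243 \times \dfrac{1.428}{2.598} = 0.134$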
EXAMPLE 7 : The correlation coefficient between a general intelligence test and school achievement in a group of children from 8 to 14 years of age is 0.80. The correlation between the general intelligence test and age in the same group is 0.70 and the correlation between school achievement and age is 0.60. What is the correlation between general intelligence and school achievement in children of the same age?
SOLUTION : We are given the "correlation between a general intelligence test and school achievement", $r_{12} = 0.80$.
EXAMPLE 8 : An instructor of mathematics wishes to determine the relationship of grades on the final examination to grades on the quizzes given during the semester. Calling $X_1$, $X_2$, $X_3$ the grades of a student on the first quiz, second quiz and final examination respectively, he made the following computations for a total of 120 students.
(ii) Estimate the final grades of two students who scored respectively 1 and 7; 4 and 8 in the quizzes.
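EXAMPLE 9: Suppose a computer has found, for a given set of values of $X_1$, $X_2$ and $X_3$, $r_{12} = 0.91$; $r_{13} = 0.33$; $r_{23} = 0.81$.
$r_{12.3} = \dfrac{r_{12} - r_{13} r_{23}}{\sqrt{1 - r_{13}^2}\,\sqrt{1 - r_{23}^2}} = \dfrac{0.91 - 0.33 \times 0.81}{\sqrt{1 - 0.1089}\,\sqrt{1 - 0.6561}} = \dfrac{0.6427}{0.5536} = 1.161$
Since the value of $r_{12.3}$ cannot exceed one, the computations given in the question are not free from errors.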
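EXAMPLE 10: Find the regression equation of $X_3$ on $X_1$ and $X_2$ and estimate $X_3$ when $X_1 = 10$ and $X_2 = 6$ from the table:
X1     3    5    6    8   12   14
X2    16   10    7    4    3    2
X3    90   72   54   42   30   12
Writing $x_1 = X_1 - \bar{X}_1$, $x_2 = X_2 - \bar{X}_2$ and $x_3 = X_3 - \bar{X}_3$, with $\bar{X}_1 = 8$, $\bar{X}_2 = 7$ and $\bar{X}_3 = 50$:
X1     x1    x1^2    X2     x2    x2^2    X3     x3    x3^2    x1x2    x1x3    x2x3
 3     -5     25     16      9     81     90     40    1600     -45    -200     360
 5     -3      9     10      3      9     72     22     484      -9     -66      66
 6     -2      4      7      0      0     54      4      16       0      -8       0
 8      0      0      4     -3      9     42     -8      64       0       0      24
12      4     16      3     -4     16     30    -20     400     -16     -80      80
14      6     36      2     -5     25     12    -38    1444     -30    -228     190
48      0     90     42      0    140    300      0    4008    -100    -582     720
Now,
$\sigma_1 = \sqrt{\dfrac{\sum x_1^2}{N}} = \sqrt{\dfrac{90}{6}} = 3.87$, $\sigma_2 = \sqrt{\dfrac{\sum x_2^2}{N}} = \sqrt{\dfrac{140}{6}} = 4.83$
$\sigma_3 = \sqrt{\dfrac{\sum (X_3 - \bar{X}_3)^2}{N}} = \sqrt{\dfrac{4008}{6}} = \sqrt{668} = 25.85$
$r_{12} = \dfrac{\sum x_1 x_2}{\sqrt{\sum x_1^2 \sum x_2^2}} = \dfrac{-100}{\sqrt{90 \times 140}} = -0.891$
$r_{13} = \dfrac{\sum x_1 x_3}{\sqrt{\sum x_1^2 \sum x_3^2}} = \dfrac{-582}{\sqrt{90 \times 4008}} = -0.969$
$r_{23} = \dfrac{\sum x_2 x_3}{\sqrt{\sum x_2^2 \sum x_3^2}} = \dfrac{720}{\sqrt{140 \times 4008}} = 0.961$
$X_3 - 50 = \dfrac{[0.961 - (-0.969)(-0.891)]}{1 - (-0.891)^2} \cdot \dfrac{25.85}{4.83}\,(X_2 - 7) + \dfrac{[-0.969 - (0.961)(-0.891)]}{1 - (-0.891)^2} \cdot \dfrac{25.85}{3.87}\,(X_1 - 8)$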
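As a cross-check on Example 10, the two regression coefficients can also be obtained directly from the sums of squares and products of the deviations, without rounding the correlations first. This is a sketch of an equivalent route (solving the two normal equations in deviation form), not the method printed above; the variable names are hypothetical.

```python
X1 = [3, 5, 6, 8, 12, 14]
X2 = [16, 10, 7, 4, 3, 2]
X3 = [90, 72, 54, 42, 30, 12]

def mean(values):
    return sum(values) / len(values)

def deviations(values):
    m = mean(values)
    return [v - m for v in values]

x1, x2, x3 = deviations(X1), deviations(X2), deviations(X3)

s11 = sum(a * a for a in x1)             # sum of x1^2
s22 = sum(b * b for b in x2)             # sum of x2^2
s12 = sum(a * b for a, b in zip(x1, x2))
s13 = sum(a * c for a, c in zip(x1, x3))
s23 = sum(b * c for b, c in zip(x2, x3))

# Normal equations: s13 = b31_2*s11 + b32_1*s12 and s23 = b31_2*s12 + b32_1*s22
det = s11 * s22 - s12 ** 2
b31_2 = (s13 * s22 - s23 * s12) / det
b32_1 = (s23 * s11 - s13 * s12) / det

def estimate_x3(v1, v2):
    return mean(X3) + b31_2 * (v1 - mean(X1)) + b32_1 * (v2 - mean(X2))

estimate = estimate_x3(10, 6)
print(round(b31_2, 3), round(b32_1, 3), round(estimate, 2))
```

Because the printed solution rounds the correlations and standard deviations to three or four figures, its coefficients may differ from these in the second or third decimal place.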
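EXAMPLE No. 11: If $r_{12} = 0.65$, $r_{23} = 0.6$ and $r_{31} = 0.4$, calculate the value of $r_{13.2}$.
SOLUTION
$r_{13.2} = \dfrac{r_{13} - r_{12} r_{23}}{\sqrt{1 - r_{12}^2}\,\sqrt{1 - r_{23}^2}} = \dfrac{0.4 - 0.65 \times 0.6}{\sqrt{1 - 0.4225}\,\sqrt{1 - 0.36}} = \dfrac{0.01}{0.608} = 0.02$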
EXAMPLE No. 12 : The simple correlation coefficients between temperature ($X_1$), corn yield ($X_2$) and rainfall ($X_3$) are $r_{12} = 0.59$, $r_{13} = 0.46$ and $r_{23} = 0.77$. Calculate the partial correlation coefficient $r_{12.3}$ and the multiple correlation coefficient $R_{1.23}$.
SOLUTION
We have $r_{12} = 0.59$, $r_{13} = 0.46$ and $r_{23} = 0.77$.
$r_{12.3} = \dfrac{r_{12} - r_{13} r_{23}}{\sqrt{1 - r_{13}^2}\,\sqrt{1 - r_{23}^2}} = \dfrac{0.59 - 0.46 \times 0.77}{\sqrt{1 - (0.46)^2}\,\sqrt{1 - (0.77)^2}} = \dfrac{0.59 - 0.3542}{\sqrt{1 - 0.2116}\,\sqrt{1 - 0.5929}} = \dfrac{0.2358}{0.5665} = 0.4162$
Again,
2.18 Time Series Analysis Method
Time series refers to numerical data recorded at successive intervals of time in the past.
It is an arrangement of numerical data in chronological order. The time series data shows certain definitive patterns which can be meaningfully analysed for the purpose of projection. There are several ways of using time series data for forecasting purposes, such as
(i) Extrapolation of sales patterns into the future, considering current sales levels as the base and making projections on the assumption that the pattern will continue in future. Extrapolative forecasting involves determination of a trend curve appropriate for the product being forecast, whereupon projections can be made.
(ii) Time series smoothing is done to minimize the influence of extreme values in the historical data which might have been caused by random factors. The smoothing process brings out the underlying pattern in the time series data. The underlying pattern may be horizontal or may involve some fluctuations or a trend, a steady increase or a steady decrease. The techniques of simple moving averages and weighted moving averages are used for smoothing horizontal patterns. Where there is an evident trend in the time series, the least squares method is used.
(iii) The various components of time series: the trend, seasonal, cyclical and random or erratic fluctuations. The trend factor is the long-term underlying movement of the time series. It may be a steady-state trend or a growing or declining trend. The cyclical factor is the periodic ups and downs in the observations forming a cycle every few years.
The seasonal factor is the periodic pattern in the data during the course of a year. The random factor arises out of erratic events which do not occur frequently. Here the techniques of moving averages and simple linear or non-linear regression equations can be applied to isolate the above factors, and they can be cumulatively analysed to forecast future sales.
Time series may be used for both short-term and long-term forecasts, but it is more useful for short-term forecasts.
of the past behaviour, it would be possible, within certain limits, to forecast the probable future variations (or movements) of such data. Thus it helps in planning future operations.
With the help of Time Series Analysis, we can compare the actual performance with the expected performance and analyse the cause of variation.
Analysis of Time Series shows that the observed values of the variable fluctuate from time to time. The fluctuations are due to various factors (or forces) like changes in habits and tastes of people, weather conditions, etc. Under the action of these forces, the values of the variable change with time.
The object of time series analysis is to isolate and ascertain these forces (i.e., the various components).
These four types of movements are called the four components or elements of Time Series.
The changes in Time Series data are the result of the combined effect of these four components.
Y = T + S + C + I
2.212 SEMI-AVERAGE METHOD
In the Semi-Average Method, the given data is first divided into two parts (preferably equal) and an average (i.e. A.M.) for each part is found. Then these two averages are plotted on graph paper as points against the mid-points of the time intervals covered by the respective two parts. These two points are joined by a straight line. This straight line is the required trend line, and the distances of the line from the horizontal axis OX give the trend values.
Although this method is simple to apply, it may lead to poor results when used indiscriminately. It is applicable only where the trend is linear or approximately linear.
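The procedure can be sketched numerically instead of graphically. The data below are hypothetical, and dropping the middle observation when the number of periods is odd is an assumption of this sketch (the text only says the two parts should preferably be equal).

```python
def semi_average_trend(times, values):
    n = len(values)
    half = n // 2
    # With an odd number of periods the middle observation is left out (assumption).
    t1 = sum(times[:half]) / half            # mid-point of the first part
    y1 = sum(values[:half]) / half           # average of the first part
    t2 = sum(times[n - half:]) / half        # mid-point of the second part
    y2 = sum(values[n - half:]) / half       # average of the second part
    slope = (y2 - y1) / (t2 - t1)
    # Trend value at each time: height of the line joining (t1, y1) and (t2, y2).
    return [y1 + slope * (t - t1) for t in times]

# Hypothetical yearly data.
years = [1, 2, 3, 4, 5, 6]
sales = [10, 14, 12, 18, 16, 20]

trend = semi_average_trend(years, sales)
print(trend)
```

The two semi-averages here are 12 (at year 2) and 18 (at year 5), so the trend line rises by 2 units per year.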
EXAMPLE 1 : Draw a trend line by the Semi-Average Method using the following data:
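2.213 MOVING AVERAGE METHOD
For given numbers $Y_1, Y_2, Y_3, \ldots$, we define moving totals of order N by the sums
$Y_1 + Y_2 + \ldots + Y_N,\quad Y_2 + Y_3 + \ldots + Y_{N+1},\quad Y_3 + Y_4 + \ldots + Y_{N+2},\ \ldots$
and moving averages of order N by the sequence of arithmetic means
$\dfrac{Y_1 + Y_2 + \ldots + Y_N}{N},\quad \dfrac{Y_2 + Y_3 + \ldots + Y_{N+1}}{N},\quad \dfrac{Y_3 + Y_4 + \ldots + Y_{N+2}}{N},\ \ldots$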
In the Moving Average Method, a series of moving averages of a specific order is calculated. Starting from the beginning of the given series, an average for a specific number of years (for yearly data) or a time interval (called the period) is calculated and this is placed against the mid-point of the time interval. Keeping the period fixed, the process is repeated by dropping the first yearly figure of the given values and adding the figure of the next year that we had not added before. We continue with this till the end of the series is reached.
If the period of the moving average is odd, the moving totals and moving averages correspond to the given years of time. But if the period is even, a two-point moving average of the moving averages is to be found for centering them, i.e. for synchronizing the moving averages with the original data (see Example 2(ii) given below).
This method is commonly used for measuring trend. By using moving averages of appropriate orders, cyclical fluctuations, seasonal and irregular movements may be eliminated, leaving only the trend movement.
If the moving averages are strongly affected by extreme values, a weighted moving average with appropriate weights is sometimes used.
(i) This method is used to measure trend, seasonal, cyclical and irregular fluctuations.
(ii) The moving average method is easy to apply as this method does not involve any difficult calculation.
(iii) If an appropriate period is chosen (i.e., if the period of the moving average coincides with the period of the cyclical fluctuations), then these fluctuations are automatically eliminated from the data by using this method.
(iv) The choice of the period of moving averages is made by observing the oscillatory movements in the data and not by the personal judgement of the Statistician.
(v) This method is quite flexible in the sense that when a few more observations are added to the given data, the trend values already obtained will not be affected; only some more trend values will be included in the series.
Example 2.
(i) Obtain the five-year moving averages for the following series of observations:
Year                       1967   1968   1969   1970   1971   1972   1973   1974
Annual Sales (Rs. '000)     3.6    4.3    4.3    3.4    4.4    5.4    3.4    2.4
(ii) Construct also the 4-year centered moving average.
SOLUTION.
(i) TABLE 1
CALCULATIONS OF 5-YEAR MOVING AVERAGES
Year    Annual Sales    5-year moving total    5-year moving average
        (Rs. '000)      (Rs. '000)             (Rs. '000)
(1)     (2)             (3)                    (4)
1967    3.6             -                      -
1968    4.3             -                      -
1969    4.3             20.0                   4.00
1970    3.4             21.8                   4.36
1971    4.4             20.9                   4.18
1972    5.4             19.0                   3.80
1973    3.4             -                      -
1974    2.4             -                      -
Note that the first moving total 20.0 of column 3 is the sum of the first 5 values 3.6, 4.3, 4.3, 3.4, 4.4. The second moving total is 4.3 + 4.3 + 3.4 + 4.4 + 5.4 = 21.8, which can also be easily obtained by adding (5.4 - 3.6), i.e. 1.8, to the first moving total. Similarly, the 3rd moving total is 21.8 + (3.4 - 4.3) = 20.9 and so on.
NOTE: The five-year moving averages (or trend values) for the years 1969-1972 are shown in column 4. (Note that the moving averages correspond to the given years.) For the other years 1967, 1968 and 1973, 1974, moving averages cannot be determined.
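The calculation in Table 1 can be reproduced with a short sketch. The function name is hypothetical; the sales figures are those of Example 2.

```python
def moving_averages(values, period):
    # One average per window of `period` consecutive values.
    return [sum(values[i:i + period]) / period
            for i in range(len(values) - period + 1)]

sales = [3.6, 4.3, 4.3, 3.4, 4.4, 5.4, 3.4, 2.4]   # 1967 to 1974, Rs. '000

five_year = [round(v, 2) for v in moving_averages(sales, 5)]
print(five_year)   # trend values for 1969, 1970, 1971 and 1972
```

With an odd period each average sits against the middle year of its window, which is why the first value belongs to 1969 and the last to 1972.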
Second Method
TABLE 2 CALCULATIONS FOR 4-YEAR CENTERED MOVING AVERAGES.
Year    Annual Sales    4-year moving    4-year moving    2-year moving total     4-year centered
        (Rs. '000)      total            average          of col. 4 (centered)    moving average (Col. 5 ÷ 2)
(1)     (2)             (3)              (4)              (5)                     (6)
1967    3.6             ....             ....             ....                    ....
1968    4.3             ....             ....             ....                    ....
                        15.6             3.9
1969    4.3                                               8.0                     4.0
                        16.4             4.1
1970    3.4                                               8.5                     4.2
                        17.5             4.4
1971    4.4                                               8.6                     4.3
                        16.6             4.2
1972    5.4                                               8.1                     4.0
                        15.6             3.9
1973    3.4             ....             ....             ....                    ....
1974    2.4             ....             ....             ....                    ....
EXAMPLE 3.
Find the trend of the following series using a three-year weighted moving average with weights 1, 2, 1:
Year:    1  2  3  4  5   6   7
Values:  2  4  5  7  8  10  13
SOLUTION.
Year   Values   3-year weighted moving total   Weighted moving average
(1)    (2)      (3)                            (4)
1      2        ....                           ....
2      4        2×1 + 4×2 + 5×1 = 15           3.75
3      5        4×1 + 5×2 + 7×1 = 21           5.25
4      7        5×1 + 7×2 + 8×1 = 27           6.75
5      8        7×1 + 8×2 + 10×1 = 33          8.25
6      10       8×1 + 10×2 + 13×1 = 41         10.25
7      13       ....                           ....
Col. 4 = Col. 3 ÷ total weight, where total weight = 1 + 2 + 1 = 4.
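The same calculation can be sketched in Python. The function name weighted_moving_average is an assumption made for this illustration; the series and the weights 1, 2, 1 are those of Example 3.

```python
def weighted_moving_average(values, weights):
    """Weighted moving average; one result per full window of len(weights)."""
    k = len(weights)
    total_weight = sum(weights)
    result = []
    for start in range(len(values) - k + 1):
        window = values[start:start + k]
        weighted_total = sum(v * w for v, w in zip(window, weights))
        result.append(weighted_total / total_weight)
    return result

values = [2, 4, 5, 7, 8, 10, 13]
trend = weighted_moving_average(values, [1, 2, 1])
print(trend)   # one trend value for each of the years 2 to 6
```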
EXAMPLE 4.
For the following series of observations, verify that the 4-year centered moving average is
equivalent to a 5-year weighted moving average with weights 1, 2, 2, 2, 1 respectively:
Year:              1  2  3  4  5  6  7  8  9  10  11
Sales (Rs. '000):  2  6  1  5  3  7  2  6  4   8   3
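SOLUTION.
TABLE 5 CALCULATION OF 4-YEAR CENTERED MOVING AVERAGE
Year   Sales        4-year moving   2-item moving total    4-year centered moving
       (Rs. '000)   total           of col. 3 (centered)   average (Col. 4 ÷ 8)
(1)    (2)          (3)             (4)                    (5)
1      2            ....            ....                   ....
2      6            ....            ....                   ....
                    14
3      1                            29                     3.625
                    15
4      5                            31                     3.875
                    16
5      3                            33                     4.125
                    17
6      7                            35                     4.375
                    18
7      2                            37                     4.625
                    19
8      6                            39                     4.875
                    20
9      4                            41                     5.125
                    21
10     8            ....            ....                   ....
11     3            ....            ....                   ....
TABLE 6 CALCULATION OF 5-YEAR WEIGHTED MOVING AVERAGE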
From the last columns of Tables 5 and 6, we see that the 4-year centered moving average is
equivalent to a 5-year weighted moving average with weights 1, 2, 2, 2, 1 respectively.
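The equivalence claimed in Example 4 can also be checked numerically. The following is a minimal sketch; the two function names are assumptions made for this illustration, and the data are the eleven sales figures of Example 4.

```python
def centered_4_year(values):
    # 4-item moving totals fall between two years; adding two adjacent
    # totals and dividing by 8 centres the average on a year.
    totals = [sum(values[i:i + 4]) for i in range(len(values) - 3)]
    return [(totals[i] + totals[i + 1]) / 8 for i in range(len(totals) - 1)]

def weighted_5_year(values, weights=(1, 2, 2, 2, 1)):
    total_weight = sum(weights)           # 1 + 2 + 2 + 2 + 1 = 8
    return [
        sum(v * w for v, w in zip(values[i:i + 5], weights)) / total_weight
        for i in range(len(values) - 4)
    ]

sales = [2, 6, 1, 5, 3, 7, 2, 6, 4, 8, 3]   # years 1 to 11
centered = centered_4_year(sales)
weighted = weighted_5_year(sales)
print(centered == weighted)
```

Both lists hold one value for each of the years 3 to 9. They agree because adding two adjacent 4-year totals counts the three middle values twice and the two end values once.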
For a given value of X, say $X_1$, the corresponding value of Y obtained from (1) is $a + bX_1$. The
difference $E_1 = Y_1 - (a + bX_1)$, or $Y_1 - a - bX_1$, which may be positive, negative or zero, is called an error
or residual.
Similarly we obtain
$E_2 = Y_2 - a - bX_2, \ldots, E_N = Y_N - a - bX_N$.
By the Principle of Least Squares, the line of best fit is obtained when the sum of the squares
of the differences $E_i$ between the observed values $Y_i$ and the corresponding calculated values $a + bX_i$ is
minimum, i.e. when
$$\sum_{i=1}^{N} E_i^2 = \sum_{i=1}^{N} (Y_i - a - bX_i)^2$$
is minimum.
When $\sum_{i=1}^{N} E_i^2$ is minimum, we obtain the normal equations
$$\sum Y = Na + b\sum X \qquad (2)$$
$$\sum XY = a\sum X + b\sum X^2 \qquad (3)$$
Solving these two equations, a and b can be determined, and substituting these values of a and b in
(1), the required equation of the straight line trend is obtained. From this equation, we can compute the
trend values.
If we take the mid-point in time as the origin, the negative values in the first half of the series
balance out the positive values in the second half so that $\sum X = 0$. The normal equations (2) and (3) would
reduce to
$$\sum Y = Na \quad \text{and} \quad \sum XY = b\sum X^2,$$
so that
$$a = \frac{\sum Y}{N} \quad \text{and} \quad b = \frac{\sum XY}{\sum X^2}.$$
EXAMPLE 5.
Determine the equation of a straight line which best fits the following data:
Compute the trend values for all the years from 1974 to 1978.
SOLUTION. Let the equation of the straight line of best fit, with the origin at the middle year 1976 and
unit of X as 1 year, be
Y = a + bX (1)
By the Method of Least Squares, the values of a and b are given by
$a = \sum Y / N$ and $b = \sum XY / \sum X^2$ (2)
Here N = number of years = 5.
Using (2), $a = \sum Y / N = 290/5 = 58$, and $b = \sum XY / \sum X^2 = 34/10 = 3.4$.
From (1), the required equation of the best fitted straight line is Y = 58 + 3.4 X.
NOTE. Unless otherwise specified, we shall assume that the values of Y refer to mid-year values, i.e. as on
July 1. Thus in the example above, X = 0 corresponds to July 1, 1976, X = -1 to July 1, 1975, X = 1 to July 1,
1977, etc.
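A minimal Python sketch of the straight-line trend with the origin at the mid-point in time is given below. The function name fit_trend and the sales figures are made up for illustration only; they are not the data of the example above.

```python
def fit_trend(years, values):
    """Least-squares line Y = a + bX with X measured from the middle of the period."""
    n = len(values)
    mid = sum(years) / n                      # origin at the mid-point in time
    x = [year - mid for year in years]        # deviations, so sum(x) is zero
    a = sum(values) / n                       # a = sum(Y) / N
    b = sum(xi * yi for xi, yi in zip(x, values)) / sum(xi * xi for xi in x)
    trend = [a + b * xi for xi in x]          # trend value for every year
    return a, b, trend

years = [1974, 1975, 1976, 1977, 1978]
sales = [12, 15, 14, 18, 21]                  # hypothetical figures
a, b, trend = fit_trend(years, sales)
print(a, b)
print([round(t, 1) for t in trend])
```

Because the X values sum to zero, a is simply the mean of Y and b needs only the two sums that appear in the reduced normal equations.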
Module III
Parametric Tests
Theory of estimation- Point and Interval- Testing of Hypothesis. Large and small
sample Tests.
Parametric Test: t-Test, F- Test, Chi-square test, ANOVA. Probability Distribution:
Binomial, Poisson, and Normal distribution.
Sampling theory is the study of relationships between a population and samples drawn from the
population.
Sampling theory helps us to determine whether the differences between two samples are actually
due to chance variation or whether they are really significant.
(1) Parameter: It is a statistical measure based on all the units of the population. For example,
population mean, population standard deviation, proportion of defectives in the population etc.
(2) Statistic: It is also a statistical measure, but one based on the units selected in a sample. For example,
sample mean, sample standard deviation etc.
Consider the case of selecting 100 houses from the city of Mumbai to study the effect of the Internet on
children. Let us assume that Mumbai city consists of 50000 houses having an internet connection. Here
50000 is the population size and 100 is the sample size selected from these 50000.
Now take any statistical measure, say average age of user, standard deviation or variance of age of
user. If it is calculated from all 50000 users, it will be a 'Parameter', and if it is calculated from the
selected 100 houses then it will be a sample statistic or simply a statistic.
Since the units selected in two or more samples drawn from a population are not the same, the
value of a statistic varies from sample to sample. But the parameter always remains constant. This variation
in the value of a statistic is called sampling fluctuation.
A parameter has no sampling fluctuation.
A sampling distribution is a theoretical distribution that expresses the functional relation between
each of the distinct values of the sample statistic and the corresponding probability for all the different
possible samples of size n from the same population.
In other words, the frequency distribution or probability distribution of a sample statistic is
called the sampling distribution of the statistic. For such a distribution the characteristics
mean and standard deviation are very important and play an important role in the theory of
estimation.
If the mean of the sampling distribution, i.e. the expectation of the statistic, is equal to the value of the
parameter, then this is known as the unbiasedness property of the statistic.
The standard deviation of the sampling distribution is termed the Standard Error of an estimate.
It is used as a tool in tests of hypothesis. It gives an idea about the reliability and the precision of a
sample. It helps in determining the limits within which the parameters are expected to lie.
It is possible to draw valid conclusions about the population parameter from each sampling
distribution.
Types of Estimation:
Point Estimation:
A point estimate is a single value that is used to estimate the unknown parameter. E.g. the sample
mean is a point estimate of the population mean.
Interval Estimation:
In this type, instead of obtaining a single value as an estimate, a pair of values is obtained and is
used to estimate an interval or range within which the parameter lies with a certain confidence (probability).
Such an interval is known as a confidence interval and the two values are known as confidence limits. The
probability or confidence, generally stated in terms of percentages, is 95% or 99%. The higher the probability,
the higher is the confidence. Standard error plays a very important role in determining confidence limits and
hence the confidence interval.
a) Unbiasedness:
Let T denote an estimator and θ denote the parameter. T is an unbiased estimator of θ if E(T) = θ.
Thus T can be sample mean / proportion / standard deviation etc. and θ can be population mean /
population proportion / population standard deviation etc.
b) Consistency:
If V(T) → 0 as n → ∞,
i.e. as sample size increases the variance approaches zero, which shows that the spread or dispersion diminishes as
the sample size becomes large, then T is said to be a consistent estimator, i.e. as sample size increases the
difference between T and θ should become smaller and smaller.
c) Efficiency:
Efficiency is measured by variance. The estimator with the smallest variance is an efficient estimator.
d) Sufficiency:
A sufficient statistic is an estimator that utilizes all the information a sample contains about the
parameter to be estimated.
Among all the estimators, the sample mean $\bar{x}$ and sample proportion p are sufficient statistics for
the population mean μ and population proportion P.
3.2: Testing of Hypothesis
Hypothesis is one of the important aspects in any research study. The purpose of hypothesis testing
is to determine the accuracy of the hypothesis, due to the fact that data is collected through a sampling
method and not a complete enumeration method.
The accuracy of a hypothesis is evaluated by determining the statistical likelihood that the data
reveal a true difference and not random sampling error.
There are two approaches to hypothesis testing: (i) the Classical or Sampling Theory approach and
(ii) the Bayesian approach.
The Classical or Sampling Theory approach is most widely used in research applications. This approach
represents an objective view of probability in which the decision making rests totally on an analysis of
available sampling data. A hypothesis is established; it is rejected or accepted based on the sample data
collected.
Bayesian statistics are an extension of the classical approach. Here also sampling data is used for
decision making, but here research goes beyond it to consider all other available information. The
additional information consists of subjective probability estimates stated in terms of degree of belief.
These subjective estimates are based on general experience rather than on specific data collected. They are
expressed as a prior distribution that can be revised after the sample information is gathered. The revised
estimate, known as the posterior distribution, may be further revised by additional information, and so on.
Various decision rules are established, cost and other estimates can be introduced, and these elements are
used to judge decision alternatives.
In classical tests of significance two kinds of hypothesis are used: the Null hypothesis and the
alternative hypothesis.
Null Hypothesis:
It is a statement that no difference exists between the parameter and the statistic. The "No
Difference" type hypothesis is termed the Null hypothesis and denoted by $H_0$.
Alternative Hypothesis:
It is the logical opposite of the Null hypothesis. It is denoted by $H_1$ or $H_A$. The alternative
hypothesis may take several forms depending on the objective of the researcher. It may be of the "Not equal
to" or "greater than" or "less than" type. These types are used to decide whether the underlying
test is two tailed or one tailed.
If $H_1$ or $H_A$ is of the "Not equal to" type (≠), the underlying test is a two tailed or two sided or non
directional test.
$H_0$ and $H_1$ / $H_A$ are complementary to each other. If $H_0$ is rejected, it means $H_1$ is accepted and vice versa.
Based on sample results $H_0$ may be accepted or rejected. And $H_0$ may be true or false in
reality. Thus the following four situations arise.
Decision       H0 True            H0 False
Accept H0      Correct decision   Type II Error
Reject H0      Type I Error       Correct decision
The error committed in rejecting a true hypothesis is termed Type I Error. The probability of committing
Type I Error is denoted by α and known as the level of significance. The standard values of α are 5% and
1%.
α = P [Type I Error]
The error committed in accepting a false hypothesis is termed Type II Error. The probability of
committing Type II Error is denoted by β.
β = P [Type II Error]
= P [Accept H0 | H0 is false].
3.21 Statistical Testing Procedure:
It is a step by step procedure as follows:
3.22: Test of Significance:
Generally there are two classes of significance tests:
Parametric and Non-Parametric.
Parametric tests are more powerful because their data are derived from interval and ratio
measurements.
Non-Parametric tests are used to test hypotheses with nominal and ordinal data.
In the above paragraph different methods of measurement are introduced. Let us discuss the same.
A measurement scale can be defined as a set of numbers or symbols developed in a manner that
facilitates the assigning of these numbers or symbols to the units under research on the basis of
certain rules.
The design of a measurement scale depends upon the objective of the research and the
mathematical calculations that a researcher expects to perform on the data collected by using the scales.
The different types of measurement scale are as follows:
A) Nominal Scale:
This type uses numbers or letters to identify different objects. It assigns numbers to each
category for identification after segregating them into mutually exclusive and collectively
exhaustive categories.
B) Ordinal Scale:
An ordinal scale is used to arrange objects in a particular order. It can be used for ranking
brands based on their quality.
C) Interval Scale:
This is similar to an ordinal scale and is used for arranging the objects in a particular order
where the intervals between the points on the scale are equal. Any two adjacent points on the scale are
located at an equal distance.
D) Ratio Scale:
Ratio scales have a fixed zero point and equal intervals. These scales are used for
representing age, weight, height etc. For example, age can be represented on a ratio scale: the
difference between 10 years and 20 years is the same as the difference between 30 years and 40
years.
→ Nominal data is numerical in name only. It does not share any properties of the numbers which
we deal with in ordinary arithmetic. E.g. we can record marital status as 1, 2, 3, 4 depending on whether
the person is single, married, widowed or divorced. But we cannot write 4 > 2 or 3 < 4, nor 1 + 3 = 4,
4 ÷ 2 = 2 and so on. In such situations we are restricted to using the mode as the measure of central
tendency. There is no generally used measure of dispersion for nominal scales. The Chi-square test is
the most common test which can be utilized. Also, for correlation the contingency coefficient can be
worked out.
→ In those situations where we cannot do anything except set up inequalities, the data is referred to as
ordinal data. Ordinal scales only permit the ranking of items from highest to lowest. Ordinal
measures have no absolute value. The real difference between adjacent ranks may not be equal. In
this situation the median is the appropriate measure of central tendency. A percentile or quartile
measure is used for measuring dispersion. Correlations are restricted to rank correlation
coefficients. Non-parametric tests of significance can be used.
→ Interval scales can have an arbitrary zero but it is not possible to determine for them an absolute
zero or unique origin. This is the limitation of the interval scale. It does not have the capacity to
measure the complete absence of a characteristic. The Fahrenheit scale is an example of an interval
scale. An increase in temperature from 4° to 8° is the same as an increase from 30° to 34°, but we cannot say that
a temperature of 30° is 5 times as warm as 6°.
→ The interval scale provides more powerful measurement than the ordinal scale. The mean is the appropriate
measure of central tendency and the standard deviation is the most widely used measure of dispersion.
The product moment correlation technique is appropriate to study correlation, and the t-test and F-test are
the generally used tests for significance.
→ Ratio scales represent the actual amounts of variables. Generally all statistical techniques are
usable with ratio scales.
→ Selection of a measurement scale requires decisions in six key areas: i) Study objective ii) Response
form iii) Degree of preference iv) Data properties v) Number of dimensions vi) Scale construction.
If the nature of the variables permits, the researcher should use the scale that provides the most precise
description. Researchers in physical sciences have the advantage of describing variables in ratio scales, but
behavioural sciences are generally limited to describing variables in interval scale form, which is less
precise.
The scales should be reliable, valid, sensitive, generalizable and relevant.
Reliability is the degree to which a scale is error free and produces consistent results.
Validity is the ability of a scale or instrument to measure what it is intended to measure.
Sensitivity is the instrument's ability to measure the variability in responses accurately.
Relevance is the suitability of using a particular scale for measuring a variable. Thus Relevance =
Reliability × Validity.
Assumptions for parametric tests:
1. The observations must be independent, meaning the selection of any one case should not affect the chances
for any other case to be included in the sample.
2. The observations should be drawn from normally distributed populations.
3. These populations should have equal variance.
4. The measurement scales should be interval or ratio so that arithmetic operations can be used with them.
The researcher is responsible for reviewing the assumptions pertinent to the chosen test.
Performing diagnostic checks on the data allows the researcher to select the most appropriate test.
The Z-test is a large sample test. Generally if the sample size exceeds 30 it is said to be a large sample.
Otherwise a small sample distribution is used. This is because of the lack of information about the population standard
deviation.
When the sample size approaches 120, the sample standard deviation becomes a very good estimate of
the population standard deviation.
For characteristics like average and proportion, Z or t distribution based tests are the most
appropriate tests.
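3.31 Z-tests
To test $H_0: \mu = \mu_0$
$H_1: \mu \neq \mu_0$ (or $\mu < \mu_0$ or $\mu > \mu_0$)
Here μ denotes the population mean and $\mu_0$ denotes the specified value of the population mean.
$$Z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} \quad \text{or} \quad Z = \frac{\bar{X} - \mu_0}{s / \sqrt{n}}$$
The critical values of Z depend upon
(i) level of significance, 5% or 1%
(ii) two tailed or one tailed test (sign in $H_1$: ≠ or <, >)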
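As an illustration, the test of a specified mean can be sketched in Python with the standard library. The function name z_test_mean, its parameters and the numbers in the call are assumptions made for this sketch, not values from the text.

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(sample_mean, mu0, sd, n, alpha=0.05, tail="two"):
    """Z statistic, critical value and decision for H0: mu = mu0."""
    z = (sample_mean - mu0) / (sd / sqrt(n))
    std_normal = NormalDist()                 # mean 0, standard deviation 1
    if tail == "two":                         # H1: mu != mu0
        critical = std_normal.inv_cdf(1 - alpha / 2)
        reject = abs(z) > critical
    elif tail == "left":                      # H1: mu < mu0
        critical = std_normal.inv_cdf(alpha)
        reject = z < critical
    else:                                     # H1: mu > mu0
        critical = std_normal.inv_cdf(1 - alpha)
        reject = z > critical
    return z, critical, reject

# Hypothetical data: n = 64, sample mean 52, standard deviation 8, mu0 = 50.
z, critical, reject = z_test_mean(sample_mean=52, mu0=50, sd=8, n=64)
print(round(z, 2), round(critical, 2), reject)
```

The critical value changes with both the level of significance and the choice of tail, which is why the sketch takes both as arguments.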
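Available data: For two samples, their sample sizes $n_1$, $n_2$ with means $\bar{X}_1$, $\bar{X}_2$ and population
standard deviations (which may or may not be known) or sample standard deviations.
The computation of the test statistic Z is done as follows:
Case I: If both the samples are drawn from the same population with standard deviation σ
$$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sigma \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
Case II: Population standard deviation unknown. Let $s_1^2$, $s_2^2$ denote the sample variances,
$$s_1^2 = \frac{\sum (x_1 - \bar{X}_1)^2}{n_1 - 1}, \qquad s_2^2 = \frac{\sum (x_2 - \bar{X}_2)^2}{n_2 - 1}$$
Then work out
$$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
Case III: If the two samples are drawn from two different populations with standard deviations $\sigma_1$ and $\sigma_2$
$$Z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
The rest of the procedure is the same as above.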
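$H_0: P = P_0$
$H_1: P \neq P_0$ (or $P < P_0$ or $P > P_0$)
P denotes the population proportion and $P_0$ is its specified value.
$$Z = \frac{p - P_0}{\sqrt{P_0 Q_0 / n}}$$
where p = sample proportion and $Q_0 = 1 - P_0$.
a) For two samples of sizes $n_1$, $n_2$ with sample proportions $p_1$, $p_2$ (and $q_1 = 1 - p_1$, $q_2 = 1 - p_2$)
$$Z = \frac{p_1 - p_2}{\sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}}$$
b) If the proportions are similar with respect to the given attribute, the best estimate of the population proportion
is obtained as
$$P_0 = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}, \qquad q_0 = 1 - P_0$$
and
$$Z = \frac{p_1 - p_2}{\sqrt{P_0 q_0 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
In all the above four tests the probability distribution of the test statistic is the Normal distribution.
So let us discuss the Normal distribution.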
3.311 Normal Distribution
This is the most widely used probability distribution. It is applicable to a continuous random
variable.
A random variable means a real valued function defined over a sample space. For every value of the
random variable there is an associated probability.
If a random variable takes only integer values, it is known as a discrete random variable. If a
random variable assumes any value within a range it is known as a continuous random variable.
Among the most widely used probability distributions, the Normal distribution is for a continuous random
variable and the Binomial and Poisson distributions are for discrete random variables.
The probability distribution of a random variable is either a tabular form or a functional form
showing probabilities distributed over various values of the random variable such that individual probabilities
lie between 0 and 1 and the sum or total probability is 1 (unity).
For discrete probability distributions the tabular form or functional form is referred to as the probability
mass function (p.m.f.) and for continuous random variables the function is referred to as the probability density
function (p.d.f.).
For the Normal distribution, there are two parameters: mean μ and standard deviation σ (or
variance $\sigma^2$).
Let the random variable X follow the Normal distribution with parameters μ and σ (or $\sigma^2$).
Then its p.d.f. is given by
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x - \mu}{\sigma}\right)^2}, \qquad -\infty < x < \infty, \quad \sigma > 0$$
The frequency curve obtained for various values of X and f(x) is known as the Normal Curve.
For a normal distribution, if the mean is 0 and the standard deviation is 1 then that variable or variate
is known as the Standard Normal Variate (SNV). It is generally denoted by t or Z, and
$$f(z) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{z^2}{2}}, \qquad -\infty < z < \infty$$
2. For the Normal distribution mean = median = mode = μ
3. The normal curve is a bell shaped symmetric curve,
symmetric about X = μ or Z = 0.
4. The total area under the curve is unity,
i.e. $\int_{-\infty}^{\infty} f(x)\,dx = 1$.
The area under the curve between two values and the probability of X or Z lying between those two values are the same concept.
Thus $P(a \le X \le b) = \int_a^b f(x)\,dx$.
5. The normal curve is an asymptotic curve, i.e. the two tails of the curve do not touch the X axis but
approach it ever more closely.
8. Any normal variate with mean μ and standard deviation σ can be converted into the corresponding SNV
as
$$z = \frac{x - \mu}{\sigma}$$
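The p.d.f. and the conversion to a Standard Normal Variate can be sketched in Python. The function name normal_pdf and the parameters μ = 100, σ = 15 are assumptions chosen only for illustration; NormalDist comes from the Python standard library.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Density of the Normal distribution with mean mu and S.D. sigma."""
    return exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * sqrt(2 * pi))

mu, sigma = 100, 15            # illustrative parameters
x = 130
z = (x - mu) / sigma           # corresponding Standard Normal Variate

# P(70 <= X <= 130) computed directly and through the SNV, P(-2 <= Z <= 2).
dist = NormalDist(mu, sigma)
p_between = dist.cdf(130) - dist.cdf(70)
p_between_z = NormalDist().cdf(2) - NormalDist().cdf(-2)
print(z, round(p_between, 4), round(p_between_z, 4))
```

The two probabilities agree, which is the practical reason for converting any normal variate to the SNV: one table of areas serves every μ and σ.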
This is a discrete probability distribution. The random variable X under study is said to follow the
Binomial Distribution with parameters n and p under the following assumptions.
1) The trial must result in two outcomes only, success and failure, i.e. the trial must be a
Bernoulli trial.
2) Let p denote the probability of success; then 0 < p < 1 and p should remain constant in all repeated
trials.
3) Let n denote the number of times the trial is repeated. All trials are independent of each other and
n is finite.
If X is defined as the 'number of successes' in the experiment then X is said to have the Binomial
distribution with parameters n and p.
X ~ B(n, p) and the probability mass function is given by
$$P(X = r) = \binom{n}{r} p^r q^{n-r}, \qquad q = 1 - p, \quad r = 0, 1, \ldots, n$$
This is a limiting case of the Binomial distribution. If n is infinite or very large and p is very
small, then the product of n and p will be a moderate value, say λ; this λ is the parameter of the
Poisson Distribution. Thus
X ~ P(λ)
and the p.m.f. is
$$P(X = r) = \frac{e^{-\lambda} \lambda^r}{r!}, \qquad r = 0, 1, 2, \ldots, \quad \lambda > 0$$
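The two probability mass functions, and the way the Poisson distribution approximates the Binomial for large n and small p, can be sketched as follows. The function names and the values n = 1000, p = 0.002 are assumptions made for this illustration.

```python
from math import comb, exp, factorial

def binomial_pmf(r, n, p):
    """P(X = r) for X ~ B(n, p)."""
    return comb(n, r) * p**r * (1 - p)**(n - r)

def poisson_pmf(r, lam):
    """P(X = r) for X ~ P(lam)."""
    return exp(-lam) * lam**r / factorial(r)

n, p = 1000, 0.002      # large n, small p (illustrative values)
lam = n * p             # moderate product, the Poisson parameter
for r in range(5):
    print(r, round(binomial_pmf(r, n, p), 4), round(poisson_pmf(r, lam), 4))
```

The two columns printed for each r are close to each other, which illustrates why the Poisson p.m.f. is used in place of the Binomial when n is large and p is small.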
Properties of Poisson Distribution.
2) For the Poisson distribution the mode is an integer lying between λ - 1 and λ. If λ itself is an integer, the
distribution has two modes, λ - 1 and λ.
(i) Independence: The number of times an event occurs in any time interval is independent of
the number of times it occurs in any disjoint time interval.
(ii) In a very small time interval, say t to t + h where h is infinitely small, the probability that
the event occurs once is approximately λh, where λ is the average rate at which the event
occurs per unit of time.
(iii) The chance of two or more occurrences of the event in a very small interval t to t + h is
insignificant in comparison to λh, the chance of one occurrence.
(iv) The number of persons born blind per year in a large city.
3.32 Student's t-test
The degrees of freedom is a number which tells us how many of the values may be independently
or freely chosen so that the conditions are satisfied. There is a rule to set the degrees of freedom: if n is the
sample size and one parameter is specified then the degrees of freedom is n - 1. If two parameters
are specified then it is n - 2, and so on. If there are two samples of sizes $n_1$ and $n_2$ for specified means of two
populations then the degrees of freedom will be $(n_1 - 1) + (n_2 - 1) = n_1 + n_2 - 2$.
The probability distribution (p.d.f.) of the random variable following the t distribution with degrees of
freedom ν = n - 1 is as follows:
$$f(t) = K \left(1 + \frac{t^2}{\nu}\right)^{-(\nu + 1)/2}, \qquad -\infty < t < \infty$$
where K is a constant.
Properties of t-distribution.
Uses of t-test.
The t-test is used
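The procedure for tests of significance is the same as for the Z test or the general procedure. Here the critical value is
obtained for the required degrees of freedom at the specified level of significance.
The formula for the test statistic and the corresponding degrees of freedom are given below.
1) To test a specified mean
$H_0: \mu = \mu_0$
$$t = \frac{\bar{X} - \mu_0}{s / \sqrt{n}}, \qquad \text{d.f.} = n - 1$$
where s is the sample standard deviation.
2) To test the difference between two means
$$t = \frac{\bar{X}_1 - \bar{X}_2}{\text{S.E.}(\bar{X}_1 - \bar{X}_2)}$$
where
$$\text{S.E.}(\bar{X}_1 - \bar{X}_2) = s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
and
$$s^2 = \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2}$$
If observations for the two samples are available then the steps to calculate $\bar{X}_1$, $\bar{X}_2$ and S.E. are as
follows:
Let $X_{11}, X_{12}, \ldots, X_{1n_1}$ be the observations of sample I and
$X_{21}, X_{22}, \ldots, X_{2n_2}$ be the observations of sample II.
(1) Obtain $\sum X_1$, $\sum X_2$.
(2) Obtain $\bar{X}_1 = \sum X_1 / n_1$, $\bar{X}_2 = \sum X_2 / n_2$.
(3) Obtain the $X_1 - \bar{X}_1$, $(X_1 - \bar{X}_1)^2$, $X_2 - \bar{X}_2$, $(X_2 - \bar{X}_2)^2$ columns.
(4) Obtain $\sum (X_1 - \bar{X}_1)^2$, $\sum (X_2 - \bar{X}_2)^2$.
(5) Obtain $s^2 = \dfrac{\sum (X_1 - \bar{X}_1)^2 + \sum (X_2 - \bar{X}_2)^2}{n_1 + n_2 - 2}$ and hence S.E.
(6) Calculate the test statistic.
For paired observations with differences $d_i$:
$$\bar{d} = \frac{\sum d_i}{n}, \qquad s = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n - 1}}$$
and the test statistic is
$$t = \frac{\bar{d}}{s / \sqrt{n}}$$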
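The two-sample and paired t statistics can be sketched in Python as below. The function names and the small data sets are hypothetical and serve only to illustrate the steps; the critical value would still be read from a t table for the returned degrees of freedom.

```python
from math import sqrt
from statistics import mean

def pooled_t(sample1, sample2):
    """t statistic and d.f. for the difference between two means."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = mean(sample1), mean(sample2)
    ss1 = sum((x - m1) ** 2 for x in sample1)    # sum of squared deviations
    ss2 = sum((x - m2) ** 2 for x in sample2)
    s = sqrt((ss1 + ss2) / (n1 + n2 - 2))        # pooled standard deviation
    se = s * sqrt(1 / n1 + 1 / n2)
    return (m1 - m2) / se, n1 + n2 - 2

def paired_t(before, after):
    """t statistic and d.f. for paired observations (d = after - before)."""
    d = [a - b for b, a in zip(before, after)]
    n = len(d)
    d_bar = mean(d)
    s = sqrt(sum((di - d_bar) ** 2 for di in d) / (n - 1))
    return d_bar / (s / sqrt(n)), n - 1

t_two, df_two = pooled_t([10, 12, 14], [7, 9, 11])                   # made-up samples
t_pair, df_pair = paired_t([10, 12, 14, 16], [12, 13, 17, 18])       # made-up pairs
print(round(t_two, 3), df_two, round(t_pair, 3), df_pair)
```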
Module IV
Non- Parametric Tests
Non-parametric tests have fewer and less stringent assumptions. They do not specify normally
distributed populations or homogeneity of variance. Non-parametric tests are the only ones usable with
nominal data. They are the only technically correct tests to use with ordinal data, although parametric
tests are sometimes employed in this case. Non-parametric tests may also be used for interval and ratio
data. They are easy to use and understand.
Parametric tests have greater efficiency when their use is appropriate, but even in such cases non-
parametric tests often achieve an efficiency of up to 95%.
Chi-square can be used as a non-parametric statistic, which is used frequently for cross-tabulation
or contingency tables. Its applications include testing for differences between proportions in populations
and testing for independence.
Non-parametric tests are also known as Distribution-free tests.
1. Test concerning some single value for the given data (One sample sign test)
2. Test concerning no difference among any two or more sets of data (Two samples sign test, Fisher-
Irwin Test, Rank Sum Test)
4. Test of a hypothesis concerning variations in the given data (similar to ANOVA, Kruskal-
Wallis Test)
5. Tests of randomness of a sample based on the theory of runs (One sample run test)
6. Tests of hypothesis to determine if categorical data shows dependency (Chi-square test for
independence of attributes)
4.1 Sign Tests
a) One sample sign test
On the basis of a sample of size n, we replace the value of each and every item of the sample with a
(+) sign if it is greater than $\mu_0$ and with a (-) sign if it is less than $\mu_0$.
After doing this we test the null hypothesis that these + and - signs are values of a random variable
having a binomial distribution with p = 1/2.
Steps are as follows:
i.e. p, and hence q = 1 - p
(iii) Obtain
$$\text{S.E.} = \sqrt{\frac{P_0 Q_0}{n}}$$
(iv) Calculate the test statistic
$$Z = \frac{p - P_0}{\text{S.E.}}, \qquad \text{where } P_0 = \frac{1}{2}$$
AAAA BBB JJJ KKK FF OOOOO PPPP LLL MMMMMM
 1    2   3   4   5    6     7   8     9
Here in all there are 9 runs with different letters A, B, J, K, F, O, P, L, M respectively. The first run is of 4
A's, then 3 B's, up to the last, 9th, run which is of 6 M's.
The total number of runs appearing in an arrangement is always a good indication of a possible lack
of randomness.
If there are too few runs, a definite grouping, clustering or trend may be suspected.
If there are too many runs, some sort of repeated alternating pattern may be suspected. Thus it may
be possible to prove that too many or too few runs in a sample indicate something other than chance when
the items were selected.
The number of runs 'r' is a statistic with its own special sampling distribution and its own test. To
derive the mean and standard deviation the following formulae are used. Mean = $\dfrac{2 n_1 n_2}{n_1 + n_2} + 1$
Also the sampling distribution of r can be approximated by a normal distribution if either $n_1$ or $n_2$
is larger than 20.
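Counting runs and computing the mean number of runs can be sketched in Python as below. The function names and the head/tail sequence are assumptions made for this illustration. The standard deviation used in runs_z is the usual textbook formula, which is not stated in the text above, so treat it as an assumption of this sketch.

```python
from itertools import groupby
from math import sqrt

def count_runs(sequence):
    """Number of runs: maximal blocks of identical adjacent symbols."""
    return sum(1 for _ in groupby(sequence))

def runs_z(sequence):
    """Runs r, mean of r and normal-approximation Z for a two-symbol sequence."""
    symbols = sorted(set(sequence))
    if len(symbols) != 2:
        raise ValueError("this sketch needs exactly two kinds of symbol")
    n1 = sequence.count(symbols[0])
    n2 = sequence.count(symbols[1])
    r = count_runs(sequence)
    mean_r = 2 * n1 * n2 / (n1 + n2) + 1
    # Assumed (standard textbook) formula for the S.D. of r.
    sd_r = sqrt(2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
                / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    return r, mean_r, (r - mean_r) / sd_r

print(count_runs("AAAABBBJJJKKKFFOOOOOPPPPLLLMMMMMM"))   # the letter arrangement above
r, mean_r, z = runs_z("HHTHTTTHHT")                       # made-up sequence of heads and tails
print(r, mean_r, z)
```

The made-up sequence is far too short for the normal approximation to be trusted; it is used only to keep the arithmetic easy to follow by hand.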
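4.3 Test for Independence of Attributes.
Let the observations be classified according to two attributes, and let the frequencies $O_{ij}$ in the different
categories be shown in a two way table called a contingency table. We have to test whether the two
attributes are independent or not.
Under the Null Hypothesis $H_0$ that the two attributes are independent, the expected frequency for the
(i, j)th cell is calculated as
$$E_{ij} = \frac{R_i \times C_j}{N} = \frac{(\text{total of row } i) \times (\text{total of column } j)}{\text{Grand Total}}$$
Test Statistic
$$\chi^2 = \sum_{j=1}^{n} \sum_{i=1}^{m} \frac{(O_{ij} - E_{ij})^2}{E_{ij}} = \sum_{j=1}^{n} \sum_{i=1}^{m} \frac{O_{ij}^2}{E_{ij}} - N$$
where
$$\sum \sum O_{ij} = \sum \sum E_{ij} = N, \qquad E_{ij} = \frac{R_i \times C_j}{N}$$
and N is the grand total.
d.o.f. = (m - 1)(n - 1)
If the calculated value of $\chi^2$ is less than the tabulated value at the given level of significance, $H_0$ is accepted.
Otherwise $H_0$ is rejected.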
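2 × 2 Contingency Table and Yates' correction.
          A        Not A    Total
B         a        b        a + b
Not B     c        d        c + d
Total     a + c    b + d    a + b + c + d = N
Here
$$\chi^2 = \frac{N (ad - bc)^2}{(a + b)(c + d)(a + c)(b + d)}$$
and d.o.f. = 1.
With Yates' correction,
$$\chi^2_{\text{corrected}} = \frac{N \left[\, |ad - bc| - \frac{N}{2} \right]^2}{(a + b)(c + d)(a + c)(b + d)}$$
d.o.f. = 1.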
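Both the general formula and the 2 × 2 shortcut can be sketched in Python. The function names and the cell counts are hypothetical; the sketch only shows that the two routes give the same value for a 2 × 2 table.

```python
def chi_square(table):
    """Chi-square statistic and d.o.f. for an m x n table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)                                   # grand total N
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # E_ij = R_i x C_j / N
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

def chi_square_2x2(a, b, c, d, yates=False):
    """Shortcut formula for a 2 x 2 table, optionally with Yates' correction."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff -= n / 2
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

a, b, c, d = 30, 20, 10, 40                               # made-up counts
stat, dof = chi_square([[a, b], [c, d]])
print(round(stat, 3), dof, round(chi_square_2x2(a, b, c, d), 3),
      round(chi_square_2x2(a, b, c, d, yates=True), 3))
```

Yates' correction always lowers the statistic, which makes the test more conservative when cell frequencies are small.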
SPEARMAN'S RANK CORRELATION PROCEDURE
The Spearman's rank correlation coefficient is a measure of association based on the ordinal
feature of data. Among the various statistical methods based on ranks, the Spearman's rank correlation
procedure was the earliest to be developed. Also this method is simple to use and easy to apply. It has also
been shown to be as powerful as Karl Pearson's correlation coefficient when assumptions about parametric
methods are violated.
It is denoted by R and given by $R = 1 - \dfrac{6 \sum D^2}{N(N^2 - 1)}$
For large samples, $Z = R / \text{S.E.} = R\sqrt{n - 1}$, where $\text{S.E.} = 1/\sqrt{n - 1}$.
The Spearman's rank correlation coefficient may be employed as a test statistic to test a
hypothesis of no association between two populations. We assume that pairs of observations have been
randomly selected, and therefore the hypothesis of no association between the populations implies a
random assignment of ranks within each sample. Each random assignment represents a sample point
associated with the experiment, and a value of R could be calculated for each. Thus it is possible to calculate
the probability that R assumes a large positive or large negative value due solely to chance and thereby
suggests an association between populations when none exists.
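The coefficient itself takes only a few lines of Python. The function name spearman_r and the two sets of ranks are hypothetical, and the sketch assumes there are no tied ranks.

```python
def spearman_r(ranks_x, ranks_y):
    """Spearman's rank correlation R = 1 - 6*sum(D^2) / (N*(N^2 - 1)), no ties."""
    n = len(ranks_x)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

judge_1 = [1, 2, 3, 4, 5]      # made-up ranks given by a first judge
judge_2 = [2, 1, 4, 3, 5]      # made-up ranks given by a second judge
R = spearman_r(judge_1, judge_2)
print(round(R, 2))
```

Identical rankings give R = 1 and exactly reversed rankings give R = -1, which is a quick check on any implementation.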
3. As the sample size gets larger, data manipulations required for non-parametric procedures are
sometimes laborious unless appropriate computer software is available.
4. A collection of tabulated critical values for a variety of non-parametric tests under situations
dealing with small and large n is not readily available.
Solved Examples on Module III and IV
Ex.1. A population consists of four values 0, 2, 4, 6. Draw all possible samples (with replacement) of
size 2 from the population and hence find the sampling distribution of sample means.
Sol. The total number of samples will be 4² = 16. Samples of size 2 and their sample means are shown
below.
Sample No.   Sample values   Total   Sample mean
1            0,0             0       0
2            0,2             2       1
3            0,4             4       2
4            0,6             6       3
5            2,0             2       1
6            2,2             4       2
7            2,4             6       3
8            2,6             8       4
9            4,0             4       2
10           4,2             6       3
11           4,4             8       4
12           4,6             10      5
13           6,0             6       3
14           6,2             8       4
15           6,4             10      5
16           6,6             12      6
16 possible samples of size 2 with replacement, as shown above, can be drawn. Hence each of the 16
sample means occurs with probability 1/16.
The sample mean 0 occurs once and 1 occurs twice (probability 2 × 1/16 = 1/8). The sample mean 2 occurs
thrice, 3 occurs four times, 4 occurs thrice, 5 is repeated twice and 6 occurs only once. The probability
distribution of the sample mean $\bar{x}$ is given below.
Sample mean ($\bar{x}$)   0      1      2      3      4      5      6
Probability (p)           1/16   2/16   3/16   4/16   3/16   2/16   1/16
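The enumeration in Ex.1 can be reproduced with the Python standard library. The variable names are assumptions made for this sketch; exact fractions are used so that the probabilities are not blurred by rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

population = [0, 2, 4, 6]
samples = list(product(population, repeat=2))       # 4**2 = 16 ordered samples
means = [Fraction(sum(s), len(s)) for s in samples]
counts = Counter(means)

# Sampling distribution of the sample mean: value -> probability.
distribution = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}
for m, p in distribution.items():
    print(m, p)                                     # fractions print in lowest terms

mean_of_means = sum(m * p for m, p in distribution.items())
print(mean_of_means)
```

The mean of the sample means comes out equal to the population mean (0 + 2 + 4 + 6)/4 = 3, in line with the unbiasedness property of the sample mean.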
Ex.2. A population consists of the four numbers 3, 4, 2, 5. Consider all possible distinct samples
(without replacement) of size two and verify that the population mean is equal to the mean of the
sample means.
Sol. The population mean (μ) = (3 + 4 + 2 + 5)/4 = 3.5
All possible distinct samples of size two (without replacement) and the corresponding sample means are
shown in the following table:
Sample No.   Sample values   Total   Sample mean
1            3,4             7       3.5
2            3,2             5       2.5
3            3,5             8       4.0
4            4,2             6       3.0
5            4,5             9       4.5
6            2,5             7       3.5
Total        ..              ..      21.0
[Sampling used in the above table is random sampling without replacement and the no. of samples
= ⁴C₂ = 6]
Mean of sample means = 21.0/6 = 3.5
Sol.
S.E. = $\sqrt{\dfrac{PQ}{n}}$, where Q = 1 - P
     = $\dfrac{1}{9}\sqrt{P(1 - P)}$
P = 1/2 ⇒ Q = 1 - P = 1 - 1/2 = 1/2
∴ S.E. = 1/9 × 1/2 = 1/18
Limits: 1/2 ± 1/18 × 1.96
= 0.5 ± 0.1089 = 0.6089 and 0.3911
Ex.5. The financial controller for Home Electronics is concerned about rising personnel costs. Recruiting
expenses appear to be too high, and the controller suspects that an undue number of applicants are
being examined for each new position. From the recently filled positions he sampled 36, and the mean
number of applicants examined was 38, with a standard deviation of 4.5. Construct a 95% confidence interval
for the mean number of applicants screened for each new job at Home Electronics.
Sol.
$$Z = \frac{\bar{x} - \mu}{\text{S.E.}} = \frac{\bar{x} - \mu}{s / \sqrt{n}} = \pm 1.96 \;\Rightarrow\; \frac{38 - \mu}{4.5 / \sqrt{36}} = \pm 1.96 \;\Rightarrow\; \frac{38 - \mu}{0.75} = \pm 1.96$$
∴ μ = 38 ± 1.96 × 0.75 = 38 ± 1.47, i.e. the 95% confidence interval is from 36.53 to 39.47.
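The interval of Ex.5 can be checked with a short Python sketch. The function name mean_confidence_interval is an assumption made for this illustration; the inputs are the figures of the example (mean 38, standard deviation 4.5, n = 36).

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(sample_mean, sd, n, confidence=0.95):
    """Large-sample confidence interval for a population mean."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 for 95%
    margin = z * sd / sqrt(n)                            # z x standard error
    return sample_mean - margin, sample_mean + margin

low, high = mean_confidence_interval(38, 4.5, 36)
print(round(low, 2), round(high, 2))
```

Raising the confidence level to 99% widens the interval, since a larger multiple of the same standard error is used.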
Ex.6. The business manager of a large company wants to check the inventory records against the
physical inventories by a sample survey. He wants (i) to be 95% confident and (ii) to be almost sure that the
maximum sampling error should not be more than 5% above or below the true proportion of the
inaccurate records. The proportion of inaccurate records is estimated at 20% from past experience.
Determine the sample size.
Sol.
P = 20/100 = 1/5, Q = 1 - P = 4/5
S.E. of proportion = $\sqrt{\dfrac{PQ}{n}} = \sqrt{\dfrac{\frac{1}{5} \times \frac{4}{5}}{n}} = \dfrac{2}{5\sqrt{n}}$
where n is the sample size.
∴ p - P = ± 0.05
∴ Z = ± 0.05 / S.E. = 1.96 ⇒ 0.05 × $\dfrac{5\sqrt{n}}{2}$ = 1.96
[∵ For 95% confidence level Z = 1.96]
⇒ $\sqrt{n} = \dfrac{1.96 \times 2}{0.05 \times 5} = 15.68$
⇒ n = (15.68)² = 245.86 ≈ 246 Ans.
Ex.7. A company has its head office at Calcutta and a branch at Bombay. The personnel director
wanted to know if the workers at the two places would like the introduction of a new plan of work, and a
survey was conducted for this purpose. Out of a sample of 500 workers at Calcutta 62% favoured the
new plan. At Bombay out of a sample of 400 workers 41% were against the new plan. Is there any
significant difference between the two groups in their attitude towards the new plan at the 5% level?
Sol. let PI and P2 be the population proportions in CaicuUa and Bombay respectively who favour the
new plan.
= j 0.607 x 0.393 x 9
~ 2000
= ~ 0.00107
= 0.0327
Assuming that Hois true, the Null Hypothesis at 5% level of significance and conclude that there is no
significant difference between the two group in their attilUde towards the new plan.
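The test of difference of two proportions used in Ex.7 (and again in several later examples) can be sketched as below. The function name `two_proportion_z` is hypothetical; it assumes the pooled estimate of p under the null hypothesis, as in the solution above. The counts 310 and 236 are simply 62% of 500 and 59% of 400.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: P1 = P2 using the pooled proportion (illustrative)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Ex.7: 310 of 500 favour the plan at Calcutta, 236 of 400 at Bombay
z = two_proportion_z(310, 500, 236, 400)
print(round(z, 3), "significant at 5%" if abs(z) > 1.96 else "not significant at 5%")
```

Without intermediate rounding the statistic comes to about 0.92, which agrees with the hand calculation to two decimal places.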
Ex.8 If it costs a rupee to draw one member of a sample, how much would it cost, in sampling from a
universe with mean 100 and standard deviation 10, to take sufficient members to ensure that the mean
of the sample in all probability would be within 0.01% of the true value? Also find the additional cost to
double the precision.
Sol. We know that mean ± 3 standard errors covers 99.73% (or leaves out 0.27%) of the total area or
cases, which, in other words, amounts to overall coverage in all probability.
Standard error of the mean of the sample, $\sigma_{\bar{x}} = \frac{\sigma_p}{\sqrt{n}}$
where $\sigma_p$ denotes the S.D. of the population and n the number of members (or items) in the sample.
In all probability the difference between the sample mean and the population mean should not exceed
3 times the S.E., i.e., $\frac{3\sigma_p}{\sqrt{n}}$, and the given value of it is 0.01% of the mean (i.e., 0.01% of 100), i.e., 0.01.
$\therefore \frac{3\sigma_p}{\sqrt{n}} = 0.01$, or $\frac{3 \times 10}{\sqrt{n}} = 0.01$
or $30 = 0.01\sqrt{n}$, or $\sqrt{n} = 3000$
or $n = 9,000,000$
So the number of members sufficient to ensure that the mean of the sample in all probability is
within 0.01% of the true value is 9,000,000, and consequently the total cost will be Rs. 90 lakhs.
To double the precision means to halve the standard error. In order to halve the standard error, or
double the accuracy (precision), the number of members in the sample should be fourfold, i.e., it should
be 36,000,000. But in the question the additional cost is asked, which will be Rs. 36,000,000 -
Rs. 9,000,000 = Rs. 27,000,000.
Note: Precision. Precision is defined as the degree of accuracy with which the sample mean can
estimate the population mean, as revealed by the standard error of the mean.
As the standard error decreases, the precision with which the sample mean can be used to
estimate the population mean increases, i.e.,
Precision $\propto \frac{1}{SE_{\bar{x}}}$
$\Rightarrow$ If precision is doubled, the S.E. will be halved $\Rightarrow$ the sample size will become four times,
because for a given population the S.D. is fixed, and to halve the S.E. $\left[SE = \frac{\sigma}{\sqrt{n}}\right]$ the sample size n
will have to be made 4 times.
For a fixed sample size, reduction in the interval width causes greater precision, i.e., doubling the
precision means reducing the interval to half.
Ex.9 If it costs Rs. 40 to draw one unit of sample, how much would it cost in sampling from a universe
with mean 100 and standard deviation 10 to take a sufficient number to ensure that the mean of
the sample, at a 5% significance level, is within 1% of the true value? Find the extra cost to double the
precision.
Sol. At 5% level z = 1.96, and 1% of the mean is 1, so $\frac{1.96 \times 10}{\sqrt{n}} = 1 \Rightarrow n = (19.6)^2 = 384.16$.
If precision is doubled, the size of the sample = 384.16 x 4 = 1537 approx.
If the fraction in the sample size is ignored, then extra cost = (1536 - 384) x Rs. 40 = Rs. 46,080.
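The sample sizes and extra costs in Ex.8 and Ex.9 follow from one relation, z·σ/√n = E. The sketch below is illustrative only; `sample_size_for_mean` is a hypothetical helper, z = 3 is used for "in all probability" and z = 1.96 for the 5% level, as in the two solutions.

```python
import math

def sample_size_for_mean(sd, max_error, z):
    """Solve z * sd / sqrt(n) = max_error for n (illustrative helper)."""
    return (z * sd / max_error) ** 2

# Ex.8: within 0.01% of a mean of 100 (error 0.01), Rs. 1 per member
n8 = sample_size_for_mean(sd=10, max_error=0.01, z=3)
extra_cost_8 = (4 * n8 - n8) * 1          # doubling precision needs 4n members

# Ex.9: within 1% of a mean of 100 (error 1), Rs. 40 per unit
n9 = sample_size_for_mean(sd=10, max_error=1.0, z=1.96)
units = math.floor(n9)                    # fraction in the sample size ignored
extra_cost_9 = (4 * units - units) * 40

print(round(n8), round(extra_cost_8), round(n9, 2), extra_cost_9)
```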
Ex.10 1,800 persons of a certain age group were observed to have a standard deviation of 9.2 beats per
minute. Assign the limits for the standard deviation of the population, assuming the above sample of
1,800 persons came from a normally distributed universe.
Sol. Standard error of the standard deviation, $\sigma_\sigma = \frac{\sigma}{\sqrt{2n}}$
or $\sigma_\sigma = \frac{9.2}{\sqrt{2 \times 1800}} = \frac{9.2}{60} = 0.153$
As thrice the standard error covers almost the total number of cases (to be exact, 99.73% of cases), the
population standard deviation should not differ by more than ± 3 S.E., i.e., it should remain within
9.2 ± 3(0.153).
Hence the limits of the population standard deviation are 8.741 to 9.659, i.e., the parameter standard
deviation should lie between these two minimum and maximum values.
Ex.11. A sample study of 2,500 couples gives a correlation coefficient of 0.45. Estimate the limits to the
correlation in the universe.
Sol. $S.E._r = \frac{1 - r^2}{\sqrt{n}} = \frac{1 - (0.45)^2}{\sqrt{2500}} = \frac{1 - 0.2025}{50} = \frac{0.7975}{50} = 0.01595 \approx 0.016$
In all probability, the parameter coefficient of correlation should not differ by more than thrice the S.E.
from the sample correlation coefficient, as sample r ± 3 S.E. would cover 99.73% of the total population. So
the limits to the coefficient of correlation are 0.45 ± 3(0.016) = 0.45 ± 0.048.
Thus we can confidently expect that the parameter or population coefficient of correlation
should be within the limits of 0.402 and 0.498.
Note: It should be noted that $S.E._r = \frac{1 - r^2}{\sqrt{n}}$ should be used only when r is moderate, say, less
than 0.5, and n is large; otherwise the t-test of the significance of r should be used.
Ex.12 A correlation coefficient of 0.2 is obtained from a random sample of 1,600 pairs of observations.
Do you think this value of the correlation coefficient is significant?
Sol. To conclude whether the value of r = 0.2 is significant, i.e., whether the observed pairs are really
correlated, it is necessary to find out the value of r which may arise on account of chance when 1,600
pairs are observed, presuming that the observed pairs are uncorrelated.
$S.E._r = \frac{1 - r^2}{\sqrt{n}} = \frac{1}{\sqrt{1600}} = \frac{1}{40} = 0.025$ (taking r = 0, since the pairs are presumed uncorrelated)
We know that 3 S.E. covers 99.73% of cases; therefore the upper limit of r will be 3(0.025) = 0.075 on account
of sampling fluctuations. But the value of the observed r is 0.2, which is many times this value, so we can
safely conclude that the value of r, viz., 0.2, is highly significant, i.e., the observed pairs are really
correlated.
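The large-sample limits for a correlation coefficient used in Ex.11 and Ex.12 can be sketched as follows. The helper name `correlation_limits` is hypothetical; it assumes the standard error (1 - r²)/√n and the "three standard errors" rule quoted in the text.

```python
import math

def correlation_limits(r, n, k=3):
    """Return S.E. of r and the limits r -/+ k * S.E. (illustrative helper)."""
    se = (1 - r ** 2) / math.sqrt(n)
    return se, r - k * se, r + k * se

# Ex.11: r = 0.45 from 2,500 couples
se, low, high = correlation_limits(0.45, 2500)
print(round(se, 5), round(low, 3), round(high, 3))

# Ex.12: largest r expected by chance alone (r taken as 0) with 1,600 pairs
chance_limit = 3 * (1 - 0 ** 2) / math.sqrt(1600)
print(chance_limit, 0.2 > chance_limit)
```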
Ex.13 Mr. X wants to determine the average time to complete a certain job. The past records show that
the population standard deviation is 10 days. Determine the sample size so that Mr. X may be 95% confident
that the sample average remains within ± 2 days of the average.
(Critical value of Z at 95% confidence is 1.96 from the standard normal area table)
Sol. $n = \left(\frac{\sigma Z}{E}\right)^2$
where $\sigma$ = population standard deviation
E = sampling error = observed value of the mean - expected value of the mean
Here $\sigma = 10$, Z = 1.96 and E = 2, so $n = \left(\frac{10 \times 1.96}{2}\right)^2 = (9.8)^2 = 96.04$.
Ex.14 A manufacturer of ball point pens claims that a certain pen he manufactures has a mean writing
life of 400 pages with a standard deviation of 20 pages. A purchasing agent selects a sample of 100 pens
and puts them to test. The mean writing life for the sample was 390 pages. Should the purchasing agent
reject the manufacturer's claim at 5% level?
Sol. Let the null hypothesis $H_0$ be that the mean writing life of ball pens is 400 pages.
Alternative hypothesis $H_1$: the mean writing life of ball pens is not 400 pages.
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{390 - 400}{20/\sqrt{100}} = \frac{-10}{2} = -5$, i.e., $|z| = 5$
Since the calculated value is more than the tabulated value (1.96) at 5% level, the claim of the
manufacturer is rejected. The purchasing agent should reject the manufacturer's claim that the mean
writing life of the pens is 400 pages.
Ex.15 A manufacturer claimed that at least 95% of the equipment which he supplied to a factory conformed
to specifications. An examination of a sample of 200 pieces of equipment revealed that 18 were faulty. Test
his claim at a significance level of (i) 0.05, (ii) 0.01.
Sol. The Null Hypothesis is that the proportion of equipment conforming to specifications is 95%, i.e.,
$H_0: P = 0.95$. The Alternative Hypothesis is that it is less than 95%, i.e., $H_1: P < 0.95$. Now the number of
pieces of equipment found not faulty = 200 - 18 = 182.
$\therefore$ Observed proportion $p = \frac{182}{200} = 0.91$
Assuming $H_0$ to be true, expected value P = 0.95 and S.E. $= \sqrt{\frac{PQ}{n}} = \sqrt{\frac{0.95 \times 0.05}{200}} = 0.0154$
$z = \frac{0.91 - 0.95}{0.0154} = -2.60$
(i) z = -2.60 is less than -1.645; therefore at 5% level the claim is not justified.
(ii) z = -2.60 is less than -2.33; therefore at 1% level the claim is not justified.
Note: Since we are interested in checking only the lower proportion, a one-tailed test has been
considered.
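The one-tailed test of a single proportion in Ex.15 can be sketched as below. The function name `one_proportion_z` is hypothetical; the standard error is computed from the hypothesised proportion, as in the solution, and the critical values -1.645 and -2.33 are those quoted in the text.

```python
import math

def one_proportion_z(successes, n, p0):
    """z statistic for H0: P = p0, with S.E. computed under H0 (illustrative)."""
    p_hat = successes / n
    se = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / se

# Ex.15: 182 of 200 pieces conform; the claim is P = 0.95 (lower-tail test)
z = one_proportion_z(182, 200, 0.95)
reject_at_5_percent = z < -1.645
reject_at_1_percent = z < -2.33
print(round(z, 2), reject_at_5_percent, reject_at_1_percent)
```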
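Ex.16 In random samples of 600 and 1000 men from two cities, 400 and 600 men are found to be literate.
Do the data indicate (at 5% level of significance) that the populations are significantly different in the
percentage of literacy?
Sol. Null Hypothesis $H_0: P_1 = P_2$; Alternative Hypothesis $H_1: P_1 \neq P_2$
Here $n_1 = 600$, $p_1 = \frac{400}{600} = \frac{2}{3}$; $n_2 = 1000$, $p_2 = \frac{600}{1000} = \frac{6}{10}$
$p_1 - p_2 = \frac{2}{3} - \frac{6}{10} = \frac{1}{15}$
If $H_0$ is true, the best estimate of the value of p is given by
$p = \frac{400 + 600}{600 + 1000} = \frac{5}{8}$, $q = \frac{3}{8}$
S.E. $= \sqrt{\frac{5}{8} \times \frac{3}{8} \left(\frac{1}{600} + \frac{1}{1000}\right)} = 0.025$
$z = \frac{1/15}{0.025} = 2.67$
This value of z is greater than 1.96 (at 5% level), so it is significant and we conclude that the difference
between the two proportions in percentage of literacy is significant.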
Ex.17 A firm found, with the help of a sample survey in a city (size of sample 900), that 3/4 of the population
consumes things produced by it. The firm then advertised the goods in the papers and on radio. After one
year, a sample of size 1000 reveals that the proportion of consumers of the goods produced by the firm is 4/5.
Is this significant to indicate that the advertisement was effective?
Sol. The Null Hypothesis is that the proportions of consumption before and after the advertisement were
equal, $H_0: P_1 = P_2$; Alternative Hypothesis $H_1: P_1 < P_2$
Pooled $p = \frac{900 \times 0.75 + 1000 \times 0.8}{1900} = 0.776$, $q = 0.224$, S.E. $= \sqrt{pq\left(\frac{1}{900} + \frac{1}{1000}\right)} = 0.019$
$z = \frac{p_1 - p_2}{S.E.} = \frac{0.75 - 0.8}{0.019} = \frac{-0.05}{0.019} = -2.63$
Here $H_1: P_1 < P_2$ is one-sided, and for this test the critical regions are $z \le -1.645$ at 5% level and
$z \le -2.33$ at 1% level.
Now this value z < -2.33, so it is significant at 1% level. We reject $H_0$ and conclude that the proportion
of consumption increases after the advertisement, i.e., the advertisement was effective.
We can also say that |z| > 2.33, therefore the null hypothesis can be rejected.
Ex.18 In an infantile paralysis epidemic, 500 persons contracted the disease. 300 received no serum
treatment and of them 75 became paralysed. Of those who received serum treatment, 65 became paralysed.
Was the serum treatment effective?
Sol. We have the null hypothesis $H_0$ that the serum treatment is not effective, i.e., $P_1 = P_2$, and the
Alternative hypothesis $H_1: P_1 < P_2$.
The proportion of persons who became paralysed after receiving the serum
$= \frac{65}{200} = 0.325 = p_1$ (say)
The proportion of persons who became paralysed without receiving the serum
$= \frac{75}{300} = 0.25 = p_2$ (say)
$p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{65 + 75}{500} = 0.28$, $q = 1 - 0.28 = 0.72$
$S.E.(p_1 - p_2) = \sqrt{0.28 \times 0.72 \left(\frac{1}{200} + \frac{1}{300}\right)} = 0.041$
$z = \frac{p_1 - p_2}{S.E.} = \frac{0.325 - 0.25}{0.041} = 1.83$
For the one-sided alternative $P_1 < P_2$, the critical region at 5% level of significance is $z \le -1.645$. The
calculated value z = 1.83 is positive and so does not fall in this region.
Hence at 5% level of significance the null hypothesis is accepted, i.e., the data give no evidence that the
proportion of persons getting paralysed is lower with serum treatment, i.e., the serum treatment was not
effective.
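Ex.19 On a certain day, 74 trains arrived on time at Delhi station during the rush hours and 83 were
late. At New Delhi there were 65 on time and 107 late. Is there any difference in the proportions arriving on
time at the two stations?
Sol. Let the null hypothesis be that there is no difference in the proportions of trains arriving on time at
Delhi and New Delhi railway stations, i.e., $P_1 = P_2$.
$p_1 = \frac{74}{157} = 0.471$
$p_2 = \frac{65}{172} = 0.378$
S.E. of difference of proportions $= \sqrt{pq\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$, where
$p = \frac{74 + 65}{157 + 172} = \frac{139}{329} = 0.422$, $q = 1 - p = 0.578$
S.E. $= \sqrt{0.422 \times 0.578 \left(\frac{1}{157} + \frac{1}{172}\right)} = 0.0545$
$z = \frac{0.471 - 0.378}{0.0545} = 1.71$
Since the calculated value of z is less than 1.96, the null hypothesis is accepted at 5% level, i.e., there is
no significant difference in the proportions of trains arriving on time at the two stations.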
Ex.20 In a random selection of 64 of the 600 road crossings in a town, the mean number of automobile
accidents per year was found to be 4.2 and the sample S.D. was 0.8. Construct a 95% confidence interval for
the mean number of automobile accidents per crossing per year.
Sol.
S.E. of mean $= \frac{s}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}} = \frac{0.8}{\sqrt{64}} \sqrt{\frac{600 - 64}{600 - 1}} = 0.1 \times \sqrt{\frac{536}{599}} = 0.1 \times 0.946 = 0.0946$
$\bar{x} = 4.2$
For 95% confidence, the value of z = 1.96
$\therefore$ The 95% confidence interval for the mean will be 4.2 ± 1.96 x 0.0946 = 4.2 ± 0.1854, i.e., 4.0146 to
4.3854.
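Because 64 of only 600 crossings are sampled, Ex.20 applies the finite population correction to the standard error. A minimal sketch, with the hypothetical helper name `fpc_confidence_interval`, is:

```python
import math

def fpc_confidence_interval(mean, sd, n, population_size, z=1.96):
    """Interval for the mean with the finite population correction (illustrative)."""
    fpc = math.sqrt((population_size - n) / (population_size - 1))
    se = sd / math.sqrt(n) * fpc
    return mean - z * se, mean + z * se

# Ex.20: 64 of 600 crossings, mean 4.2 accidents, S.D. 0.8
lower, upper = fpc_confidence_interval(4.2, 0.8, 64, 600)
print(round(lower, 4), round(upper, 4))
```

Dropping the correction factor would widen the interval slightly, since the factor √((N - n)/(N - 1)) is always less than 1.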
Ex.21 A sample of 600 persons selected at random from a large city shows that the percentage of males
in the sample is 53. It is believed that the ratio of males to the total population in the city is 1/2. Test
whether this belief is confirmed by the observation.
Sol. We have the Null Hypothesis $H_0$ that the ratio of males to the total population in the city is 1/2 = 0.5.
Let $P_0 = \frac{1}{2} = 0.5$ and $p_1 = 53\% = 0.53$. When p = 0.5, q = 1 - 0.5 = 0.5
S.E. of p $= \sqrt{\frac{pq}{n}} = \sqrt{\frac{\frac{1}{2} \times \frac{1}{2}}{600}} = 0.0204$
$z = \frac{0.53 - 0.5}{0.0204} = 1.47$
At 5% level of significance the value of z is 1.96, which is more than 1.47. Hence at 5% level of significance
there is no significant difference between the observed value and the normal belief, and we accept our null
hypothesis. Hence the belief that the ratio of males to the total population is 1/2 is confirmed by the sample
observations.
Ex.22 In order to make a survey of buying habits, two markets A and B are chosen in two different parts
of a city.
400 women shoppers are chosen at random in market A. Their average weekly expenditure on food is
found to be Rs. 250 with a S.D. of Rs. 40. These figures are Rs. 220 and Rs. 55 respectively in market B,
where also 400 women shoppers are chosen at random. Test at 1% level of significance whether the average
weekly food expenditures of the two populations of shoppers are equal.
Null Hypothesis $H_0: \mu_1 = \mu_2$; Alternative Hypothesis $H_1: \mu_1 \neq \mu_2$
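The working for Ex.22 is not reproduced above. The following is only a sketch of the usual large-sample test for the difference of two means, assuming the standard error √(s₁²/n₁ + s₂²/n₂) and the two-tailed 1% critical value 2.58; the function name `two_mean_z` is hypothetical and the text may have used a different layout.

```python
import math

def two_mean_z(mean1, sd1, n1, mean2, sd2, n2):
    """Large-sample z for H0: mu1 = mu2 with unpooled S.E. (an assumption)."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / se

# Figures stated in Ex.22: market A (250, 40, 400), market B (220, 55, 400)
z = two_mean_z(250, 40, 400, 220, 55, 400)
significant_at_1_percent = abs(z) > 2.58
print(round(z, 2), significant_at_1_percent)
```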
Ex.23 A supplier of components to the electronics industry makes a sophisticated product which sometimes
fails immediately it is used. He controls his manufacturing process so that the proportion of faulty products is
supposed to be only 5%. Out of 400 components supplied in one batch, 26 prove to be faulty. Has the process
gone out of control to produce too many faulty components?
Sol. Let the Null Hypothesis be that the process has not gone out of control, i.e., the proportion of faulty
components = 0.05.
The Alternative hypothesis is that the process has gone out of control and the proportion of defective
components is more than 0.05, i.e., the process produces too many faulty components.
$p = \frac{26}{400} = 0.065$
$z = \frac{p - P}{S.E.(p)} = \frac{0.065 - 0.05}{\sqrt{\frac{0.05 \times 0.95}{400}}} = \frac{0.015 \times 20}{0.2179} = 1.376$
The value of z at 1% level of significance for a one-tailed test is 2.33 and at 5% level of significance is
1.65.
Since the calculated value of z is less than the tabulated value, we can reasonably expect that the
process has not gone out of control at both 5% and 1% levels of significance.
Ex.24 A margarine firm has invited 200 men and 200 women to see if they can distinguish margarine from
butter. It is found that 120 of the women, but only 108 of the men, can. Investigate whether there is any
evidence of sex difference in taste discrimination.
Sol. We have the Null Hypothesis $H_0$ that there is no evidence of sex difference in taste discrimination, i.e.,
$P_1 = P_2$.
$p_1 = \frac{120}{200} = \frac{3}{5} = 0.6$
$p_2 = \frac{108}{200} = 0.54$
$p = \frac{120 + 108}{200 + 200} = \frac{228}{400} = 0.57$
$S.E.(p_1 - p_2) = \sqrt{pq\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{0.57 \times (1 - 0.57)\left(\frac{1}{200} + \frac{1}{200}\right)} = 0.0495$
$z = \frac{0.6 - 0.54}{0.0495} = 1.21$
The value of z for a two-tailed test at 1% level of significance is 2.58 and at 5% level of significance is
1.96.
Since the calculated value is less than the tabulated value, we accept our null hypothesis. Hence there
is no evidence of sex difference in taste discrimination.
Ex.25. Random samples drawn from two places gave the following data relating to the heights of adult males:
Place A Place B
The standard error of the difference between the means of the two samples is 0.1058, so
$z = \frac{\text{Difference of means}}{\text{S.E. of means}} = \frac{68.58 - 68.50}{0.1058} = \frac{0.08}{0.1058} = 0.756$
The computed value of z being less than the table value, we cannot reject the null hypothesis, and so the
mean heights of adults in the two places do not differ significantly.
Ex.26. In a certain city 380 men out of 800 were found to be smokers. Discuss whether this information
supports the view that the majority of men in this city are non-smokers.
(Use 95% confidence level, for which the critical value of z is 1.96, as given in the standard normal area
table)
Sol. Let p denote the proportion of smokers in the city. Then from the given information, the sample
proportion
$p_1 = \frac{380}{800} = 0.475$
Let us construct a 95% confidence interval for the population proportion p.
$p_1 \pm 1.96\sqrt{\frac{p_1 q_1}{n}} = 0.475 \pm 1.96\sqrt{\frac{0.475 \times 0.525}{800}} = 0.475 \pm 0.0346$
Hence the confidence limits for p, the proportion of smokers, at 95% confidence level are
0.4404 to 0.5096.
The view that the majority of men in the city are non-smokers is equivalent to the view that a minority of men
in the city are smokers, which amounts to the situation that the proportion p of smokers should always be
less than 0.5. Since the confidence limits for p include values more than 0.5 in the present case, the given
information does not support the view that the majority of men in the city are non-smokers at 95%
confidence level.
Ex.27 A random sample of 16 values from a normal population showed a mean of 41.5 cm, and the sum of
squares of deviations from the mean equals 135 cm². Show that the assumption of a mean of 43.5 cm for the
population is not reasonable and that the 95% fiducial limits for the mean are 39.9 and 43.1 cm.
Sol. We have the null hypothesis $H_0$ that the sample has been drawn from a population whose mean is
43.5 cm.
$S = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{135}{15}} = 3$
$|t| = \frac{|\bar{X} - \mu| \sqrt{n}}{S} = \frac{|41.5 - 43.5| \times 4}{3} = 2.666$
d.o.f. = 16 - 1 = 15
The tabulated value of |t| for 15 d.o.f. at 5% level of significance is 2.13, which is less than the
calculated value of |t|.
Hence the null hypothesis that the sample has been drawn from a normal population with mean 43.5
cm is rejected.
The 95% fiducial limits for the mean are $41.5 \pm 2.13 \times \frac{3}{\sqrt{16}} = 41.5 \pm 1.6$, i.e., 39.9 and 43.1 cm.
Ex.28 You are given the gains in weight (lbs) of cows fed on two diets X and Y.
GAIN IN WEIGHT (lbs)
Diet X : 25 32 30 32 24 14 32
Diet Y : 24 34 22 30 42 31 40 30 32 35
Test, at 5% level, whether the two diets differ as regards their effect on mean increase in weight.
(Tabulated value of t for 15 degrees of freedom at 5% = 1.753)
Sol. Let us take the null hypothesis that the two diets X and Y do not differ significantly as regards their
effect on increase in weight. Applying the t-test of difference of means:
$t = \frac{\bar{x} - \bar{y}}{S} \sqrt{\frac{n_1 n_2}{n_1 + n_2}}$
$\bar{x} = \frac{\sum X}{n_1} = \frac{25 + 32 + 30 + 32 + 24 + 14 + 32}{7} = \frac{189}{7} = 27$
$\bar{y} = \frac{\sum Y}{n_2} = \frac{320}{10} = 32$
$\sum X = 189$, $\sum (X - 27) = 0$, $\sum (X - 27)^2 = 266$; $\sum Y = 320$, $\sum (Y - 32) = 0$, $\sum (Y - 32)^2 = 350$
$S = \sqrt{\frac{\sum (X - \bar{x})^2 + \sum (Y - \bar{y})^2}{n_1 + n_2 - 2}} = \sqrt{\frac{266 + 350}{7 + 10 - 2}} = 6.408$
[Common variance $= \frac{n_1 s_1^2 + n_2 s_2^2}{n_1 + n_2 - 2} = \frac{\sum (X - \bar{x})^2 + \sum (Y - \bar{y})^2}{n_1 + n_2 - 2}$]
$t = \frac{27 - 32}{6.408} \sqrt{\frac{7 \times 10}{7 + 10}} = \frac{-5 \times 2.029}{6.408} = -1.583$, so $|t| = 1.583$
[Absolute value]
For 15 d.o.f., $t_{0.05} = 1.753$. The calculated value of |t| is less than the table value. The null hypothesis is
accepted. Hence the two diets do not differ significantly with regard to their effect on mean increase in
weight.
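The pooled t-test of Ex.28 can be reproduced with the sketch below. The function name `pooled_t` is hypothetical. The data lists are partly an assumption: the seventh Diet X value (32) is taken from the working for the mean, and a Diet Y value of 40 is assumed because it is the value that makes ΣY = 320 and Σ(Y - 32)² = 350 as stated in the solution.

```python
import math

def pooled_t(x, y):
    """t for the difference of two small-sample means, pooled S.D. (illustrative)."""
    n1, n2 = len(x), len(y)
    mean_x, mean_y = sum(x) / n1, sum(y) / n2
    ss_x = sum((v - mean_x) ** 2 for v in x)
    ss_y = sum((v - mean_y) ** 2 for v in y)
    s = math.sqrt((ss_x + ss_y) / (n1 + n2 - 2))
    t = (mean_x - mean_y) / (s * math.sqrt(1 / n1 + 1 / n2))
    return t, s, n1 + n2 - 2

diet_x = [25, 32, 30, 32, 24, 14, 32]              # assumed complete list (sum 189)
diet_y = [24, 34, 22, 30, 42, 31, 40, 30, 32, 35]  # 40 is an assumed value (sum 320)
t, s, dof = pooled_t(diet_x, diet_y)
print(round(t, 3), round(s, 3), dof)
```

Note that dividing by S·√(1/n₁ + 1/n₂) is the same as multiplying by √(n₁n₂/(n₁ + n₂)), which is the form used in the solution.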
Ex.29 The following data show the cost per square foot of floor area concerning randomly selected 7
schools and 5 office blocks from those completed during the period 1984 to 1989.
Schools       : 28 31 26 27 23 38 37
Office blocks : 37 42 34 35 37
Do the data support the hypothesis that the cost per square foot for office blocks was greater
than that for schools? Test at 5% level of significance using the t-test.
Sol. Let us take the null hypothesis that the cost per square foot for office blocks was not greater than that
for schools. Applying the t-test of difference of means. $H_0: \mu_{sc} = \mu_{off}$
S.No.   X1    (X1 - mean)   (X1 - mean)²   S.No.   X2    (X2 - mean)   (X2 - mean)²
1       28    -2            4              1       37    0             0
2       31    +1            1              2       42    +5            25
3       26    -4            16             3       34    -3            9
4       27    -3            9              4       35    -2            4
5       23    -7            49             5       37    0             0
6       38    +8            64
7       37    +7            49
n1 = 7  $\sum X_1 = 210$  $\sum (X_1 - \bar{X}_1) = 0$  $\sum (X_1 - \bar{X}_1)^2 = 192$;  n2 = 5  $\sum X_2 = 185$  $\sum (X_2 - \bar{X}_2) = 0$  $\sum (X_2 - \bar{X}_2)^2 = 38$
$t = \frac{\bar{X}_1 - \bar{X}_2}{S} \sqrt{\frac{n_1 n_2}{n_1 + n_2}}$
$\bar{X}_1 = \frac{210}{7} = 30$, $\bar{X}_2 = \frac{185}{5} = 37$
$S = \sqrt{\frac{192 + 38}{7 + 5 - 2}} = 4.796$
$\therefore t = \frac{30 - 37}{4.796} \sqrt{\frac{7 \times 5}{7 + 5}} = \frac{-7 \times 1.708}{4.796} = -2.49$, so $|t| = 2.49$
For 10 degrees of freedom, the table value of t at 5% level of significance for a one-tailed
(right tail) test is $t_{0.05} = 1.812$. The calculated value of |t| is greater than the table value. The null
hypothesis is rejected. The cost per square foot for office blocks was greater than that for schools.
Ex.30 From a large population of unemployed youths, a random sample of 25 is selected and an
intelligence test given to them. From the test data, it was found that the average I.Q. is 97 with a
standard deviation of 12. Are these data consistent with the hypothesis that the unemployed youths
were selected from a population of average intelligence, that is, a population with I.Q. of 100?
Sol. We formulate the null hypothesis that the sample is selected from a population of average
intelligence of 100.
$H_0: \mu = 100$
$|t| = \frac{|\bar{x} - \mu|}{S.E.}$, where S.E. $= \frac{s}{\sqrt{n - 1}} = \frac{12}{\sqrt{24}} = 2.45$
Substituting the values, we get $|t| = \frac{|97 - 100|}{2.45} = 1.224$
For 24 degrees of freedom, the table value of t for a one-tailed test at 5% level of significance is
1.711. The computed value of |t| is less than the table value of t. Thus it falls in the acceptance region.
Hence our null hypothesis is accepted, i.e., the sample of unemployed youths was taken from a population
of average intelligence with an I.Q. of 100.
Ex.31 A certain refined edible oil is packed in tins holding 16 kg each. The filling machine can maintain
this but with a standard deviation of 0.5 kg. Samples of 25 are taken from the production line. If a
sample mean is (i) 16.35 kg, (ii) 15.85 kg, can we be 95% sure that the sample has come from a population
of 16 kg tins?
Sol. S.E. of mean $= \frac{\sigma}{\sqrt{n}} = \frac{0.5}{\sqrt{25}} = 0.1$
Limits of the population mean are 16 ± 1.96 (0.1), i.e., from 15.804 to 16.196 kg.
With 95% confidence we can say that
(i) If the sample mean is 16.35 kg, then the sample does not belong to the population of 16 kg
tins.
(ii) If the sample mean is 15.85 kg, then the sample belongs to the population of 16 kg tins.
Note: Please note the difference in the formula used for the S.E. of mean in Ex.39 and Ex.40. In Ex.39 the
S.D. of the sample is given, whereas in Ex.40 the S.D. of the population is given, though both the examples
deal with small samples.
Ex.32 A soap manufacturing company was distributing a particular brand of soap through a large
number of retail shops. Before a heavy advertisement campaign the mean sales per week per shop was
140 dozens. After the campaign, a sample of 26 shops was taken and the mean sales was found to be
147 with standard deviation 16. What conclusion do you draw on the impact of advertisement on sales?
Use 5% significance level.
Sol. S.E. of mean $= \frac{s}{\sqrt{n - 1}} = \frac{16}{\sqrt{25}} = 3.2$
Now $t = \frac{|\bar{x} - \mu|}{S.E.} = \frac{|147 - 140|}{3.2} = \frac{7}{3.2} = 2.19$
From the table, for 25 degrees of freedom $t_{0.05} = 1.708$ [for one-tailed test].
Since the computed (or calculated) value of t > $t_{0.05}$, we reject the null hypothesis, i.e., the
advertisement may be considered to have changed the average sales volume, or we can say the campaign
had an impact on sales.
Ex.33 Two salesmen A and B are working in a certain district. From a sample survey conducted by the
Head Office, the following results were obtained. State whether there is any significant difference in the
average sales between the two salesmen:
A B
No. of Sales
$H_0: \mu_1 = \mu_2$
Unbiased estimate of the common variance $= \frac{n_1 S_1^2 + n_2 S_2^2}{n_1 + n_2 - 2}$
* In Ex.32 a one-tailed test is used because under normal circumstances the sales can be expected to
increase as a result of the campaign. However, if we use a two-tailed test then $t_{0.05} = 2.06$. Even then we
conclude that the campaign was effective. But a one-tailed test is more suitable here.
Ex.34. Two types of batteries are tested for their length of life and the following data are obtained.
         No. in sample   Mean life   Variance
Type A   9               600         121
Type B   8               640         144
Is there a significant difference in the two means? The value of t for 15 degrees of freedom at 5% level
is 2.131.
Sol. The null and alternative hypotheses are:
$H_0: \mu_1 = \mu_2$, i.e., the two types of batteries are identical, i.e., statistically there is no difference between
their mean lives.
$H_1: \mu_1 \neq \mu_2$, i.e., the two types of batteries are different with regard to their mean life.
$S_1^2$, $S_2^2$ and $n_1$, $n_2$ being the respective sample variances and the corresponding sample sizes,
$S_p = \sqrt{\frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}} = \sqrt{\frac{(9 - 1) \times 121 + (8 - 1) \times 144}{9 + 8 - 2}} = 11.47$
The standard error of the difference between the two means is given by
S.E. $= S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = 11.47 \times \sqrt{\frac{1}{9} + \frac{1}{8}} = 5.57$
$t = \frac{\bar{x}_1 - \bar{x}_2}{S.E.}$, where $\bar{x}_1$ and $\bar{x}_2$ are respectively the means of the first and the second
sample.
$t = \frac{600 - 640}{5.57} = \frac{-40}{5.57} = -7.18$ $\therefore |t| = 7.18$
Degrees of freedom = 9 + 8 - 2 = 15.
Table value of t for 15 d.o.f. at 5% level of significance (two-tailed test) = 2.131
Since the computed value of |t| is more than the table value, the difference between the means is significant.
Ex.35. Ten objects are chosen at random from a large population and their weights are found to be, in gms.,
63, 63, 64, 65, 66, 69, 69, 70, 70, 71. In the light of the above data, discuss the suggestion that the mean
weight in the universe is 65 gms.
Sol.
Weight (x)   d = x - 66   d²
63           -3           9
63           -3           9
64           -2           4
65           -1           1
66           0            0
69           3            9
69           3            9
70           4            16
70           4            16
71           5            25
             $\sum d = 10$   $\sum d^2 = 98$
Mean $= \bar{x} = 66 + \frac{10}{10} = 67$
Sample S.D. $= \sqrt{\frac{\sum d^2}{n} - \left(\frac{\sum d}{n}\right)^2} = \sqrt{\frac{98}{10} - \left(\frac{10}{10}\right)^2} = \sqrt{8.8} = 2.966$
Unbiased estimate of S.E. of mean $= \frac{2.966}{\sqrt{10 - 1}} = \frac{2.966}{3} = 0.988$
$t = \frac{67 - 65}{0.988} = 2.02$
Tabulated value of t for 9 degrees of freedom at 1% level of significance in two tails is 3.25. Since the
calculated value is less than the tabulated value, we accept our null hypothesis, and the mean weight in
the universe is likely to be 65 gms.
Ex.36. Samples of two types of electric bulbs were tested for length of life and the following data were
obtained:
                    Type I   Type II
Number in sample    8        7
Mean of sample      1134     1024
S.D. of sample      35       40
Test at 5% level whether the difference in the sample means is significant. (Table values of t for
13 degrees of freedom = 2.16, for 14 degrees of freedom = 2.15 and for 15 degrees of freedom = 2.13 at
5% level for two tail areas, and 1.77, 1.76 and 1.75 respectively for one tail area.)
Sol. We will test the significance of the difference in sample means by the t-test, as in Type I and Type II
the numbers of items in the samples are 8 and 7 respectively. An unbiased estimate of the common
population standard deviation is given by
$S_p = \sqrt{\frac{n_1 S_1^2 + n_2 S_2^2}{n_1 + n_2 - 2}}$
Since it is given that $S_1 = 35$ and $S_2 = 40$, $n_1 = 8$ and $n_2 = 7$, we get
$S_p = \sqrt{\frac{8 \times 35^2 + 7 \times 40^2}{8 + 7 - 2}} = 40.2$
S.E. $= S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = 40.2 \times \sqrt{\frac{1}{8} + \frac{1}{7}} = 20.8$
$t = \frac{\bar{x}_1 - \bar{x}_2}{S.E.} = \frac{1134 - 1024}{20.8} = 5.288$
Since the computed value of t is more than the table value $t_{0.05}$ (= 2.16) for 13 degrees of
freedom, the difference is significant. Hence the null hypothesis is rejected, and therefore the two types
of electric bulbs differ significantly in their mean values.
Ex.37 Two kinds of manure were applied to sixteen one-acre plots, other conditions remaining the same. The
yields in quintals are given below:
Manure I  : 18 20 36 50 49 36 34 49 41
Manure II : 29 28 26 35 30 44 46
Is there any significant difference between the mean yields? Use 5% significance level.
Sol. We have the null hypothesis $H_0$ that the mean yields of the two kinds of manure do not differ
significantly.
Let the samples with manure I be denoted by $X_1$ and those with manure II be denoted by $X_2$.
X1    X1 - 37   (X1 - 37)²   X2    X2 - 34   (X2 - 34)²
18    -19       361          29    -5        25
20    -17       289          28    -6        36
36    -1        1            26    -8        64
50    +13       169          35    +1        1
49    +12       144          30    -4        16
36    -1        1            44    +10       100
34    -3        9            46    +12       144
49    +12       144
41    +4        16
$\bar{x}_1 = \frac{333}{9} = 37$, $\bar{x}_2 = \frac{238}{7} = 34$
$\sum (X_1 - 37)^2 = 1134$, $\sum (X_2 - 34)^2 = 386$
$S = \sqrt{\frac{1134 + 386}{9 + 7 - 2}} = 10.42$
$t = \frac{37 - 34}{10.42} \sqrt{\frac{9 \times 7}{9 + 7}} = 0.57$
Tabulated value of $t_{0.05}$ for 14 d.o.f. is 2.14.
Since the calculated value of t is much less than the tabulated value at 5% level of significance, we
accept our null hypothesis that there is no significant difference between the mean yields of the two kinds
of manure.
Ex.38. The following data pertain to two types of tube bulbs tested for their length of life:
Test whether there is a significant difference between the two means at 5% level.
Sol. The null and alternative hypotheses are: $H_0: \mu_1 = \mu_2$; $H_1: \mu_1 \neq \mu_2$
An unbiased estimate of the common population standard deviation is given by (assuming the
given standard deviations to be unbiased)
$S_p = \sqrt{\frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}}$
The standard error of the difference between the two means is given by
S.E. $= S_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
Computed t = 9.06
Table value of t at 5% level for 10 d.o.f. is 2.228. Since the calculated value is much greater than
the table value, the difference is significant and hence we reject the null hypothesis. Hence the two
means differ significantly.
* Note the difference between Ex.50 and Ex.51. In Ex.50, the formula used for the common
population S.D. is $\sqrt{\frac{n_1 S_1^2 + n_2 S_2^2}{n_1 + n_2 - 2}}$, whereas in Ex.51 it is
$\sqrt{\frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}}$. In fact the formula for the unbiased estimate of the common
population variance of two series x and y is
Ex.39 Three samples of five, four and five motor car tyres are drawn respectively from three brands A,
B and C manufactured by three machines. The lifetimes of these tyres (in '000 miles) are given below. Test
whether the average lifetimes of the three brands of tyres are equal or not.
A    B    C
45   41   44
42   40   42
43   42   38
44   43   43
42   ..   39
Sol.
Let the Null Hypothesis be $H_0$: the average lifetimes of the three brands of tyres are equal.
Let us subtract 40 from each of the given values. Then the coded data are given below.
TABLE
X1    X1²   X2    X2²   X3    X3²
5     25    1     1     4     16
2     4     0     0     2     4
3     9     2     4     -2    4
4     16    3     9     3     9
2     4     ..    ..    -1    1
16    58    6     14    6     34
$= \sum X_1$  $= \sum X_1^2$  $= \sum X_2$  $= \sum X_2^2$  $= \sum X_3$  $= \sum X_3^2$
T = 16 + 6 + 6 = 28
Correction Factor $= \frac{T^2}{N} = \frac{(28)^2}{14} = 56$
SST = Total sum of the squares $= \sum X_1^2 + \sum X_2^2 + \sum X_3^2 - \frac{T^2}{N} = 58 + 14 + 34 - 56 = 50$
SSB = Sum of the squares between the samples $= \frac{(16)^2}{5} + \frac{(6)^2}{4} + \frac{(6)^2}{5} - 56 = 11.4$
SSW = Sum of the squares within the samples = SST - SSB = 50 - 11.4 = 38.6
TABLE
Source of variation   Sum of squares   d.f.   Mean square   F
Between samples       11.4             2      5.70          5.70/3.51 = 1.624
Within samples        38.6             11     3.51
Total                 50               13     ..
The tabulated value of F for $\nu_1 = 2$ and $\nu_2 = 11$ at 5% level is 3.98. We see that the calculated value
of F, i.e., 1.624, is less than the tabulated value 3.98 at 5% level. Hence we accept the Null Hypothesis $H_0$
and conclude that the average lifetimes of the three brands of tyres are equal.
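The one-way analysis of variance in Ex.39 can be checked with the following sketch. The function name `one_way_anova` is hypothetical, and the assignment of the tabulated figures to brands A, B and C is read off the data table as an assumption. The code works on the uncoded values; subtracting 40 from every observation, as the solution does, leaves the sums of squares unchanged.

```python
def one_way_anova(groups):
    """Return (SSB, SSW, F) for a one-way classification (illustrative helper)."""
    all_values = [v for g in groups for v in g]
    n_total = len(all_values)
    correction = sum(all_values) ** 2 / n_total           # T^2 / N
    ss_total = sum(v ** 2 for v in all_values) - correction
    ss_between = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ss_within = ss_total - ss_between
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_ratio

brand_a = [45, 42, 43, 44, 42]
brand_b = [41, 40, 42, 43]
brand_c = [44, 42, 38, 43, 39]
ssb, ssw, f = one_way_anova([brand_a, brand_b, brand_c])
print(round(ssb, 1), round(ssw, 1), round(f, 3))
```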
Ex.40. The Amrit Merchandising Co. wishes to test whether its three salesmen A, B and C tend to make
sales of the same size or whether they differ in their selling ability as measured by the average size of
their sales. During the last week there have been 14 sales calls. A made 5 calls, B made 4 calls and C made
5 calls. Following are the weekly sales records (in Rs.) of the three salesmen:
Sol.
Let the null hypothesis be $H_0$: the three salesmen tend to make sales of the same size.
Let us divide each observation by the common factor 100. Then the coded data and their squares are
given in the following table:
TABLE
X1    X1²   X2    X2²   X3    X3²
3     9     6     36    7     49
4     16    3     9     3     9
3     9     3     9     4     16
5     25    4     16    6     36
0     0     ..    ..    5     25
15    59    16    70    25    135
T = 15 + 16 + 25 = 56
Correction Factor $= \frac{T^2}{N} = \frac{(56)^2}{14} = 224$
SST = Total sum of the squares $= \sum X_1^2 + \sum X_2^2 + \sum X_3^2 - \frac{T^2}{N} = 59 + 70 + 135 - 224 = 40$
SSB = Sum of the squares between the samples $= \frac{(15)^2}{5} + \frac{(16)^2}{4} + \frac{(25)^2}{5} - 224 = 10$
SSW = Sum of the squares within the samples = SST - SSB = 40 - 10 = 30
$\therefore MSB = \frac{SSB}{\nu_1} = \frac{10}{2} = 5$ and $MSW = \frac{SSW}{\nu_2} = \frac{30}{11} = 2.73$
TABLE
Source of variation   Sum of squares   d.f.   Mean square   F
Between samples       10               2      5             5/2.73 = 1.83
Within samples        30               11     2.73
Total                 40               13     ..
The tabulated value of F for $\nu_1 = 2$ and $\nu_2 = 11$ at 5% level is 3.98. We see that the calculated value
of F, i.e., 1.83, is less than the tabulated value 3.98 at 5% level. Hence we accept the Null Hypothesis $H_0$ and
conclude that the three salesmen tend to make sales of the same size.
Ex.41. An experimenter wished to study the effect of four fertilizers on the yield of a crop. He divided
the field into 24 plots and assigned each fertilizer at random to 6 plots. Part of his calculations are shown
below:
Total .. 6212 .. ..
(a) Complete the above table by filling in the missing values.
(b) Test at 5% level to see whether the fertilizers differ significantly.
Sol.
(a) d.f. for between groups = k - 1 = 4 - 1 = 3
d.f. for within group = N - k = 24 - 4 = 20
SSW = Sum of the squares within the group = SST - SSB = 6212 - 2940 = 3272
$MSW = \frac{3272}{20} = 163.6$
$\therefore F = \frac{MSB}{MSW} = \frac{980}{163.6} = 5.99$
TABLE
Source of variation   d.f.   Sum of squares   Mean square   F
Fertilizers           3      2940             980           5.99
Within groups         20     3272             163.6
Total                 23     6212
(b) We see that the calculated value of F, i.e., 5.99, is greater than the tabulated value 3.10 of F at 5% level
with d.f. $\nu_1 = 3$ and $\nu_2 = 20$. Hence we conclude that the fertilizers differ significantly.
Ex.42. A company appoints four salesmen A, B, C and D and observes their sales in three seasons:
summer, winter and monsoon. The figures (in lakhs) are given in the following table:
Season     Salesman                   Total
           A     B     C     D
Summer     36    36    21    35      128
Winter     28    29    31    32      120
Monsoon    26    28    29    29      112
Total      90    93    81    96      360
Sol.
Let the Null Hypothesis be $H_0$: There is no significant difference between salesmen or between
seasons.
The given data are first coded by subtracting 30 from each observation and then classified
according to two factors.
Season           Salesman                   Season total
                 A     B     C     D
Summer           6     6     -9    5       8
Winter           -2    -1    1     2       0
Monsoon          -4    -2    -1    -1      -8
Salesman total   0     3     -9    6       0 = T (Grand Total)
Correction Factor $= \frac{T^2}{N} = \frac{(0)^2}{12} = 0$
SSC = Sum of squares between salesmen $= \frac{0^2}{3} + \frac{3^2}{3} + \frac{(-9)^2}{3} + \frac{6^2}{3} - \frac{T^2}{N} = 0 + 3 + 27 + 12 - 0 = 42$
SSR = Sum of squares between seasons $= \frac{8^2}{4} + \frac{0^2}{4} + \frac{(-8)^2}{4} - \frac{T^2}{N} = 16 + 0 + 16 - 0 = 32$
SST = Total sum of squares $= \sum X^2 - \frac{T^2}{N}$ = {36 + 4 + 16 + 36
+ 1 + 4 + 81 + 1 + 1 + 25 + 4 + 1} - 0
= 210
Residual = SST - SSC - SSR = 210 - 42 - 32 = 136
TABLE
Source of variation   Sum of squares   d.f.   Mean square   F
Between salesmen      42               3      14            22.67/14 = 1.62
Between seasons       32               2      16            22.67/16 = 1.42
Residual              136              6      22.67
Total                 210              11     ..
The table value of F for $\nu_1 = 6$ and $\nu_2 = 3$ degrees of freedom at 5% level is 8.94. Since the
calculated value 1.62 is less than the tabulated value at 5% level, we conclude that there is no significant
difference between the salesmen.
Again, the table value of F for $\nu_1 = 6$ and $\nu_2 = 2$ degrees of freedom at 5% level is 19.33. Since the
calculated value of F is less than the tabulated value at 5% level, we conclude that there is no significant
difference between the seasons.
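The sums of squares in Ex.42 can be verified with the sketch below. The function name `two_way_sums_of_squares` is hypothetical, and the sales figures are an assumption in part: several cells (the value for C in the total row and the Monsoon figures) are read from the coded working rather than from the data table itself. The code uses the uncoded figures, since subtracting 30 from every cell does not change any sum of squares.

```python
def two_way_sums_of_squares(table):
    """Rows = seasons, columns = salesmen, one observation per cell (illustrative)."""
    r, c = len(table), len(table[0])
    n = r * c
    grand_total = sum(sum(row) for row in table)
    correction = grand_total ** 2 / n                      # T^2 / N
    ss_total = sum(v ** 2 for row in table for v in row) - correction
    ss_rows = sum(sum(row) ** 2 for row in table) / c - correction
    ss_cols = sum(sum(row[j] for row in table) ** 2 for j in range(c)) / r - correction
    ss_residual = ss_total - ss_rows - ss_cols
    return ss_rows, ss_cols, ss_residual, ss_total

sales = [
    [36, 36, 21, 35],   # Summer
    [28, 29, 31, 32],   # Winter
    [26, 28, 29, 29],   # Monsoon (assumed reading of the garbled row)
]
ss_seasons, ss_salesmen, ss_residual, ss_total = two_way_sums_of_squares(sales)
ms_residual = ss_residual / 6
# Both mean squares are smaller than the residual, so the ratios are inverted
f_salesmen = ms_residual / (ss_salesmen / 3)
f_seasons = ms_residual / (ss_seasons / 2)
print(ss_seasons, ss_salesmen, ss_residual, round(f_salesmen, 2), round(f_seasons, 2))
```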
Ex.43. Apply the technique of analysis of variance to the following data showing the yields of 3 varieties
of a crop each from 4 blocks, and test whether the mean yields of the varieties are equal or not. Also test
the equality of the block means.
Varieties   Blocks
            I     II    III   IV
A           4     8     6     8
B           5     5     7     8
C           6     7     9     5
Given $F_{.05} = 5.143$, $F_{.01} = 10.925$ for d.f. (2, 6); $F_{.05} = 19.33$ for d.f. (6, 2); $F_{.05} = 4.757$, $F_{.01} =
9.779$ for d.f. (3, 6).
Sol. Let the Null Hypothesis be $H_0$: The mean yields of the varieties are equal and the block means
are equal.
The given data are first coded by subtracting 5 from each observation and then classified according
to two factors: (i) Blocks and (ii) Varieties.
TABLE
Varieties   Blocks                    Total
            I     II    III   IV
A           -1    3     1     3      6
B           0     0     2     3      5
C           1     2     4     0      7
Total       0     5     7     6      18 = T
Correction Factor $= \frac{T^2}{N} = \frac{(18)^2}{12} = 27$
SST = Total sum of squares $= \sum X^2 - \frac{T^2}{N} = 54 - 27 = 27$
SSR = Sum of squares between varieties $= \frac{6^2}{4} + \frac{5^2}{4} + \frac{7^2}{4} - 27 = 27.5 - 27 = 0.5$
SSC = Sum of squares between blocks $= \frac{0^2}{3} + \frac{5^2}{3} + \frac{7^2}{3} + \frac{6^2}{3} - 27 = 36.67 - 27 = 9.67$
Residual = 27 - 0.5 - 9.67 = 16.83
TABLE
Source of variation   Sum of squares   d.f.   Mean square   F
Between varieties     0.5              2      0.25          2.81/0.25 = 11.24
Between blocks        9.67             3      3.22          3.22/2.81 = 1.15
Residual              16.83            6      2.81
Total                 27               11     ..
Since the calculated value of F for varieties (viz. 11.24) is less than the tabulated value 19.33 at 5% level
for (6, 2) d.f., we conclude that the mean yields of the varieties are equal.
Again, since the calculated value of F for blocks (viz. 1.15) is less than the tabulated value 4.757 at 5% level
for (3, 6) d.f., we conclude that the block means are also equal.
Ex. 44. An IQ test was administered to 5 persons before and after they were trained. The results are given below:
Candidates            I      II     III    IV     V
IQ before training    110    120    123    132    125
IQ after training     120    118    125    136    121
(t0.01(4) = 4.6)
Let the null hypothesis be H0: μ1 = μ2, i.e., there is no significant effect of the training.
The alternative hypothesis is H1: μ1 < μ2, i.e., the IQ before training is less than the IQ after training.
Candidates    IQ before    IQ after    d      d²
I             110          120         10     100
II            120          118         -2     4
III           123          125         2      4
IV            132          136         4      16
V             125          121         -4     16
Total                                  Σd = 10    Σd² = 140
$\bar d = \frac{\sum d}{n} = \frac{10}{5} = 2$
$t = \frac{\sum d}{\sqrt{\dfrac{n\sum d^2 - (\sum d)^2}{n-1}}} = \frac{10}{\sqrt{\dfrac{5 \times 140 - 100}{4}}} = 0.82$
Since the calculated value of t is less than the tabulated value with 4 d.f. at 1% level, we accept H0 at 1% level and conclude that there is no significant change in IQ after the training programme.
Ex. 45. A certain stimulus administered to each of the 12 patients resulted in the following increase of blood pressure:
Can it be concluded that the stimulus will, in general, be accompanied by an increase in blood pressure? (Given for 11 d.f., t0.01 = 2.7)
Let the null hypothesis be H0: μ1 = μ2, i.e., there is no significant difference in blood pressure before and after administering the stimulus, i.e., the stimulus is not effective.
The alternative hypothesis is H1: μ2 > μ1, i.e., the stimulus increases the blood pressure.
$\bar d = \frac{\sum d}{n} = \frac{31}{12} = 2.58, \qquad \sum d^2 = 185$
$t = \frac{\sum d}{\sqrt{\dfrac{n\sum d^2 - (\sum d)^2}{n-1}}} = \frac{31}{\sqrt{\dfrac{12 \times 185 - (31)^2}{11}}} \approx 2.90$
Since the calculated value of t is greater than the tabulated value with 11 d.f. at 1% level, we reject the null hypothesis H0 and conclude that the stimulus will, in general, be accompanied by an increase in blood pressure.
Ex. 46. The sales data of an item in six shops before and after a special promotional campaign are as under:
Shops              A     B     C     D     E     F
Before campaign    53    28    31    48    50    42
After campaign     58    29    30    55    56    45
Sol. The campaign will be a success if there is a significant increase in the average sales after the campaign. In this case we have to consider the significance on one side only, i.e., increase in sales.
Shops    Sales before (x1)    Sales after (x2)    d = x1 - x2    d²
A        53                   58                  -5             25
B        28                   29                  -1             1
C        31                   30                  1              1
D        48                   55                  -7             49
E        50                   56                  -6             36
F        42                   45                  -3             9
Total                                             Σd = -21       Σd² = 121
Null hypothesis H0: μ1 = μ2, i.e., there is no difference in the average sales before and after the campaign.
Alternative hypothesis H1: μ1 < μ2, i.e., average sales have improved after the campaign.
$\sigma = \sqrt{\frac{\sum d^2}{n} - \left(\frac{\sum d}{n}\right)^2} = \sqrt{\frac{121}{6} - (3.5)^2} = 2.8136$
$t = \frac{|\bar d|}{\sigma / \sqrt{n-1}} = \frac{3.5 \times \sqrt{5}}{2.8136} = 2.78$
The tabulated value of t for 5 d.f. at 5% level of significance (one-tailed test) is 2.015.
The computed value of t = 2.78 being more than the table value, we reject H0 and conclude that the sales campaign has been a success.
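The paired t computation of Ex. 46 can be reproduced as follows. This is a minimal sketch using only the standard library; the variable names are illustrative, and σ is the standard deviation with divisor n, so that the standard error is σ/√(n − 1), as in the worked solution.

```python
import math

before = [53, 28, 31, 48, 50, 42]
after = [58, 29, 30, 55, 56, 45]

# d = x1 - x2 (before minus after), as in the worked table
d = [b - a for b, a in zip(before, after)]
n = len(d)
d_bar = sum(d) / n

# Standard deviation of d with divisor n
sigma = math.sqrt(sum(x * x for x in d) / n - d_bar ** 2)

# Test statistic with standard error sigma / sqrt(n - 1)
t = abs(d_bar) * math.sqrt(n - 1) / sigma

print(sum(d), sum(x * x for x in d))
print(round(sigma, 3), round(t, 2))
```

The statistic is then compared with the one-tailed table value 2.015 for 5 d.f.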
Important Note:
Some authors use the letter s for the standard deviation and calculate the unbiased estimate of the S.D. by using the formula
$s = \sqrt{\frac{\sum (d - \bar d)^2}{n - 1}}$
and then
$S.E. = \frac{s}{\sqrt{n}}$
But the general formula for the S.E. for the paired t-test remains the same, i.e.,
$S.E. = \frac{\sigma}{\sqrt{n-1}} = \sqrt{\frac{\sum d^2 - n(\bar d)^2}{n(n-1)}}$
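The agreement between the two conventions can be seen in one line of algebra. The following is a sketch in which σ denotes the standard deviation of the differences with divisor n and s the one with divisor n − 1; this notation is assumed here from the worked examples.

```latex
\begin{align*}
\sigma^2 &= \frac{1}{n}\sum (d-\bar d)^2, \qquad
s^2 = \frac{1}{n-1}\sum (d-\bar d)^2
\quad\Longrightarrow\quad n\,\sigma^2 = (n-1)\,s^2, \\
\frac{s}{\sqrt{n}} &= \sqrt{\frac{\sum (d-\bar d)^2}{n(n-1)}}
 = \frac{\sigma}{\sqrt{n-1}}, \qquad
\sum (d-\bar d)^2 = \sum d^2 - n\,\bar d^{\,2}.
\end{align*}
```

So either convention leads to the same standard error and hence the same value of t.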
Ex. 47. Ten accountants were given intensive coaching and four tests were conducted in a month. The scores of tests 1 and 4 are given below:
Serial No. of Accountants:   1   2   3   4   5   6   7   8   9   10
Does the score from test 1 to test 4 show an improvement? Test at 5% level of significance. (The value of t for 9 d.f. at 5% level for a one-tailed test is 1.833 and for a two-tailed test is 2.262.)
Sol. Let us denote the score of the first test with suffix 1 and that of the fourth test with suffix 4. Then, taking the null hypothesis that there is no improvement, we can write:
H0: μ1 = μ4
H1: μ1 < μ4
Since we have matched pairs, we use the paired t-test and work out the test statistic t given by:
$t = \frac{\bar d}{S.E.}$, where $\bar d$ is the mean of d and $S.E. = \sqrt{\frac{\sum d^2 - n(\bar d)^2}{n(n-1)}}$
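Score in test 1    Score in test 4    d      d²
..                 ..                 12     144
42                 40                 -2     4
51                 61                 10     100
42                 52                 10     100
60                 68                 8      64
41                 51                 10     100
70                 64                 -6     36
55                 63                 8      64
62                 72                 10     100
38                 50                 12     144
Total                                 Σd = 72    Σd² = 856
$\bar d = \frac{\sum d}{n} = \frac{72}{10} = 7.2$
$S.E. = \sqrt{\frac{856 - 10(7.2)^2}{10(10-1)}} = 1.937$
Hence $t = \frac{\bar d}{S.E.} = \frac{7.2}{1.937} = 3.717$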
As H1 is one-sided, we shall apply a one-tailed test (in the right tail, because H1 is of the "greater than" type) for determining the rejection region at 5% level.
The observed value of t (3.717) is more than 1.833 and hence lies in the rejection region. Accordingly, we reject H0 (i.e., we accept H1) and conclude that coaching has improved the standard.
Ex. 48. A company claims that the weight of its product is 10 kg. A sample of items taken from a lot supplied by the company has shown the following weights:
10.2,9.7,10.3,10.0,9.8,9.7,9.6,9.7,9.4
Is there any statistical evidence to support the claim of the company about the weight of the
item?
dJ. 11 10 9 8
___________________
10.2 + 9.7 + 10.3 +10.0 +9.8 + 9.7 + 9.6+ 9.7 + 9.4 •• 9,8
10
If X'.Xl.... X,odenote the weight for the 10 Items in lhe sample then
,
_(1':) n- 961411 (9$)" 10 - l.0Il
- ~ -1.039 -1.04
Let the null hypothesis be that the mean weight of the item is 10 kg. The alternative hypothesis is that the mean weight ≠ 10 kg.
t • <I'IJ.)-f: ~ .\Y,S-IO)--JlO--F
- 104
'"
At 5% level of significance with 9 d.f. the tabulated value = 1.833.
Since the calculated value is less than the tabulated value, we can accept the claim of the company that the weight of the item is 10 kg.
Note: In the above example, if we consider a two-tailed test then we have to take 10% level of significance, and if we take a one-tailed test we have to consider 5% level of significance.
Ex. 49. Eight students were given a test in statistics, and after one month's coaching, they were given another test of a similar nature. The following table gives the difference in their marks in the second test over the first:
Roll Number: , , 3 , 5 6 , 8
Difference In Marks: , ., .8 ., .,
6
"
r d' -
.
rd "'..12....U = d
Let the null hypothesis be that the training is not effective, i.e., there is no significant difference in the marks of the two tests.
~ !ee-n(eJ)2 "
'"'
At 5% level of significance with 7 d.f. the tabulated value of t = 2.365 for a two-tailed test and t = 1.895 for a one-tailed test. Since the calculated value is less than the tabulated value, we can accept the null hypothesis and conclude that the training is not effective, i.e., the difference in the marks is not statistically significant at 5% level of significance.
Ex. 50. A certain drug administered to 10 patients showed the following additional hours of sleep:
Can it be concluded that the drug does produce additional hours of sleep?
d.f. 10
Σd² = 1 + 0.25 + 7.29 + 0.36 + 1.44 + 3.24 + 2.56 + 12.25 + 0.04 + 2.89 = 31.32
Let the null hypothesis be that the drug is not effective, i.e., the drug does not produce any additional hours of sleep.
$t = \frac{\bar d \sqrt{n(n-1)}}{\sqrt{\sum d^2 - n(\bar d)^2}} = \frac{0.82 \times \sqrt{90}}{\sqrt{24.596}} \approx 1.57$
Since the calculated value of t is less than t0.05 with 9 d.f., i.e., at 5% level of significance, the null hypothesis cannot be rejected. Hence we conclude that the drug does not produce any additional hours of sleep.
Ex. 51. A certain stimulant was administered to 10 patients in a hospital and their blood pressure showed the following:
Can it be concluded from the above data that the stimulant has its impact on blood pressure?
Sol. Σd = -2 - 3 + 5 + 3 + 1 + 0 - 2 + 4 - 3 + 5 = 8
$\bar d = \frac{\sum d}{n} = \frac{8}{10} = 0.8$
$t = \frac{\bar d \sqrt{n(n-1)}}{\sqrt{\sum d^2 - n(\bar d)^2}} = \frac{0.8 \times \sqrt{90}}{9.777} = 0.776$
Ex. 52. A certain diet newly introduced to each of the 12 pigs resulted in the following increases of body weight:
Can you conclude that the diet is effective in increasing the weight of the pigs? (Given t0.05 for 11 d.f. = 2.20)
Now the test statistic is
$t = \frac{\bar d}{\sigma}\sqrt{n-1} = \frac{2.75}{3.031} \times 3.3166 = 3.009$
At 5% level of significance (d.f. = 11), the tabulated value of t is 2.20 (two-tailed test) and the calculated value is greater than the tabulated value, so we reject H0 and conclude that the diet is effective. [In fact it is a question of the paired t-test.]
Note: If we consider a single-tailed test, the value of t = 1.796. Since we are interested only in the increase in weight we should prefer a one-tailed test, because a two-tailed test will simply test the change in weight.
Ex. 53. The number of car accidents per month in a metropolitan city was found as below:
20, 17, 12, 6, 7, 15, 8, 5, 16 and 14. Use the chi-square test to check whether these frequencies are in agreement with the belief that the occurrence of accidents was the same during the 10-month period. Test at 5% level of significance.
Sol. We have the null hypothesis, H0, that the occurrence of accidents was the same during the ten-month period.
Total number of accidents = 20 + 17 + 12 + 6 + 7 + 15 + 8 + 5 + 16 + 14 = 120
Expected number of accidents per month = 120/10 = 12
$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{64}{12} + \frac{25}{12} + \frac{0}{12} + \frac{36}{12} + \frac{25}{12} + \frac{9}{12} + \frac{16}{12} + \frac{49}{12} + \frac{16}{12} + \frac{4}{12} = \frac{244}{12} = 20.33$
Degrees of freedom = 10 - 1 = 9
The tabulated value of χ² for 9 degrees of freedom for a two-tailed test at 5% level of significance is 19.02. Since the calculated value is more than the tabulated value, we reject our null hypothesis and say that the given data do not support the belief that the number of accidents was the same during the 10-month period.
Ex. 54. A sample analysis of examination results of 500 students was made. It was found that 180 students had failed, 170 had secured a third class, 110 were placed in second class and 40 got a first class. Are these figures commensurate with the general examination result, which is in the ratio 4:3:2:1 for the various categories respectively? Answer at α = 0.05. (Table values of chi-square at α = 0.05 for 2 d.f., 3 d.f. and 4 d.f. are 5.99, 7.81 and 9.49 respectively.)
Sol. We have the null hypothesis H0, that the results of the examination were commensurate with the general examination result, which is in the ratio 4:3:2:1.
On this hypothesis the expected frequencies are 200, 150, 100 and 50.
$\chi^2 = \frac{(180-200)^2}{200} + \frac{(170-150)^2}{150} + \frac{(110-100)^2}{100} + \frac{(40-50)^2}{50} = 2 + 2.67 + 1 + 2 = 7.67$
As the calculated value is less than the tabulated value (7.81 for 3 d.f.), we accept our null hypothesis and say that the observed figures are quite commensurate with the general examination result.
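The goodness-of-fit calculations of Ex. 53 and Ex. 54 follow the same pattern, which can be sketched as a small helper. The function name `chi_square_gof` and its arguments are illustrative, not taken from any library; the expected frequencies are obtained by sharing the observed total in the hypothesised ratio.

```python
def chi_square_gof(observed, ratio):
    """Return (expected frequencies, chi-square) for a hypothesised ratio."""
    total = sum(observed)
    expected = [total * r / sum(ratio) for r in ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected, chi2

# Ex. 54: examination results against the ratio 4:3:2:1
expected, chi2 = chi_square_gof([180, 170, 110, 40], [4, 3, 2, 1])
print(expected, round(chi2, 2))

# Ex. 53: accidents in ten months against equal expected frequencies
accidents = [20, 17, 12, 6, 7, 15, 8, 5, 16, 14]
expected_acc, chi2_acc = chi_square_gof(accidents, [1] * 10)
print(expected_acc[0], round(chi2_acc, 2))
```

The degrees of freedom are the number of classes minus one in both cases (3 and 9 respectively).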
Ex. 55. The following table shows the distribution of goals in football matches:
No. of goals: 0 1 2 3 4 5 6 7
Sol. We have the null hypothesis H0, that the Poisson distribution can be fitted to the data.
The expected frequencies of the Poisson distribution are computed from the expression:
Expected frequency $= N\left(\frac{e^{-m} m^{x}}{x!}\right)$
Here N = 480 and m = 1.7, so that
Expected frequency $= 480 \times \frac{e^{-1.7}(1.7)^{x}}{x!}$
where x = 0, 1, 2, 3, 4, 5, 6 and 7.
Working out the successive terms of this distribution, we get the following frequencies (results expressed to the nearest whole number):
No. of goals           Observed frequency (O)    Expected frequency (E)
0                      95                        88
1                      158                       150
2                      108                       126
3                      63                        72
4                      40                        30
5, 6 and 7 (pooled)    16                        14
Since no expected frequency should be less than 5, we pooled the last three frequencies.
$\chi^2 = \frac{(95-88)^2}{88} + \frac{(158-150)^2}{150} + \frac{(108-126)^2}{126} + \frac{(63-72)^2}{72} + \frac{(40-30)^2}{30} + \frac{(16-14)^2}{14} = 8.30$
For 4 degrees of freedom at 5% level of significance, the table value of χ² = 9.488, while the calculated value of χ² = 8.30. Since the calculated value is less than the tabulated value, the difference between expected and observed frequencies is not significant and can be ignored. Hence the fit is good.
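The Poisson fit of Ex. 55 can be sketched as below, taking N = 480 and m = 1.7 from the solution. Direct evaluation of the Poisson terms may differ by a unit from the whole-number frequencies printed in the text, so the chi-square step uses the printed (pooled) figures; the variable names are illustrative.

```python
import math

N, m = 480, 1.7  # number of matches and mean goals per match, as in the text

# Poisson expected frequency for x goals: N * e^(-m) * m^x / x!
raw = [N * math.exp(-m) * m ** x / math.factorial(x) for x in range(8)]
print([round(e, 1) for e in raw])

# Classes 0, 1, 2, 3, 4 and "5 or more" (pooled), using the whole-number
# frequencies that appear in the worked chi-square expression.
observed = [95, 158, 108, 63, 40, 16]
expected = [88, 150, 126, 72, 30, 14]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# One d.f. is lost for the total and one more because m is estimated.
dof = len(observed) - 1 - 1
print(round(chi2, 2), dof)
```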
Ex. 56. In experiments on pea breeding, Mendel obtained the following frequencies of seeds: 315 round and yellow; 101 wrinkled and yellow; 108 round and green; and 32 wrinkled and green; total 556. Theory predicts that the frequencies should be in the proportions 9:3:3:1. Examine the correspondence between theory and experiment.
Sol. We have the null hypothesis, H0, that the frequencies are in the proportion 9:3:3:1.
On the basis of the hypothesis that the seeds are in the proportions 9:3:3:1, the expected frequencies of seeds of the four categories are:
$\frac{9}{16} \times 556 \approx 313, \quad \frac{3}{16} \times 556 \approx 104, \quad \frac{3}{16} \times 556 \approx 104, \quad \frac{1}{16} \times 556 \approx 35$
i.e., 313, 104, 104, 35.
Substituting the observed and expected frequencies in the expression
$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}$
we get
$\chi^2 = \frac{(315-313)^2}{313} + \frac{(101-104)^2}{104} + \frac{(108-104)^2}{104} + \frac{(32-35)^2}{35} = 0.51$
The number of degrees of freedom:
D.F. = 4 - 1 = 3
For 3 degrees of freedom at 5% level of significance, the table value of χ² = 7.815, which is much greater than the computed value of χ² = 0.51. Therefore, the difference between observed and expected frequencies is not significant and may be ignored. Hence, there is a good correspondence between theory and experiment.
Ex. 57. 50 students selected at random from 500 students enrolled in a computer crash programme were classified according to age and grade points, giving the following data:
Age (10years)
Grade Points
" .0'
uoder
21-30 Above 30 Total
Upto 5.0 , 5 , 10
,
,"
5.1 to 7.5
7.6 to 10.0 ,
5
"
" "
Test at 5% level of significance the hypothesis that age and grade points are independent. Table values of χ² (chi-square):
d.f.    4        5         6         7         8         9
χ²      9.488    11.070    12.592    14.067    15.507    16.919
Age in Years
Grade Points
" ,,'
under
21-30 Above 30 Total
Upto 5.0 , 5 , 10
5.1107.5 , ,
7.6tol0.0 , ,
5
, "
"
Total
" " " 50
Since values less than 5 are occurring in some cells of the expected frequencies, we have to amalgamate these cells with their neighbours. After amalgamation the new frequencies are:
OBSERVED
Age In Years
Upto 7.S n ,
7.6 to 10.0
15
, , , "
"
Total 15
" 15 SO
EXPECTED
                 Age in years
Grade points     20 and under    21-30    Above 30    Total
Up to 7.5        9               12       9           30
7.6 to 10.0      6               8        6           20
Total            15              20       15          50
Sol. Let us construct a 2 x 2 contingency table of observed frequencies from the given information as below: *
New Therapy
OldTherapy
" "' '"'
Total
'" '"' '"
,,,
'" '"'
[30% of 400 = 120] We will formulate the hypothesis as below:
EXPECTED FREQUENCIES
New therapy:   (100 x 140)/500 = 28      (100 x 360)/500 = 72      Total 100
Old therapy:   (400 x 140)/500 = 112     (400 x 360)/500 = 288     Total 400
Total:         140                       360                       500
* If one patient is treated by the new therapy then 4 patients are treated by the old therapy and the total number of patients admitted is 5. If 100 patients are treated by the new therapy then the patients treated by the old therapy = 400 and the total number of patients = 500.
For a contingency table of r rows and c columns, degrees of freedom = (r - 1)(c - 1).
The value of χ² will then be $\sum \frac{(O_i - E_i)^2}{E_i}$
From the table, we find the table value of χ² for 1 d.f. at 5% level of significance is 3.84. Computed χ² > table value of χ². Hence H0 is rejected. The inference is that the difference in fatality is more than could occur by chance.
Ex. 59. 100 students randomly selected from the 1000 students enrolled in an MBA program were cross-classified by age and grade point. Accordingly, the following data were compiled:
Up to 3.0 , ,
3,1 t03.S 18 18
5
"
8
"
3.6 to 4.0 11 11 17
"
Total
" " " '00
At 5% level of significance, test the hypothesis that age and grade points are independent.
On the basis of H0 we can obtain the expected frequencies as shown below:
Up to 3.0 , , , "
3.1 to 3.5 18 18
3.6 to 4.0 18
11
"
Total
18 11
"
" " " '00
$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}$
$= \frac{(6-7)^2}{7} + \frac{(9-7)^2}{7} + \frac{(5-6)^2}{6} + \frac{(18-14)^2}{14} + \frac{(14-14)^2}{14} + \frac{(8-12)^2}{12} + \frac{(11-14)^2}{14} + \frac{(12-14)^2}{14} + \frac{(17-12)^2}{12} = 6.369$
At 5% level for (c - 1)(r - 1) = 2 x 2 = 4 d.f. the table value of χ² is 9.488. The computed value of χ², being less than this, is insignificant. Accordingly, we cannot reject H0, and we conclude that age and grade points are independent.
Ex. 60. A chemical extraction plant processes sea water to collect sodium chloride and magnesium. It is known that sea water contains sodium chloride, magnesium and other elements in the ratio 62:4:34. A sample of 200 tonnes of sea water has resulted in 130 tonnes of sodium chloride and 6 tonnes of magnesium. Are these data consistent with the known composition of sea water at 5% level?
Sol. We set
H0: The data are consistent with the known composition of sea water.
H1: The data are not consistent with the known composition of sea water.
where H0 is the null hypothesis and H1 is the alternative hypothesis. As per the null hypothesis, in 200 tonnes of sea water we expect 124, 8 and 68 tonnes of sodium chloride, magnesium and other elements respectively.
Element            Observed (O)    Expected (E)    O - E    (O - E)²    (O - E)²/E
Sodium chloride    130             124             6        36          36/124 = 0.290
Magnesium          6               8               -2       4           4/8 = 0.500
Other elements     64              68              -4       16          16/68 = 0.235
Total              200             200                                  χ² = 1.025
As there are three types of elements, n = 3. The degrees of freedom = n - 1 = 3 - 1 = 2. At 5% level, for 2 degrees of freedom, the table value of chi-square is 5.991. The computed value is less than this table value. Accordingly, it can be concluded that the observed data are consistent with the known composition of sea water.
Ex. 61. A sample of 300 students of Undergraduate and 300 students of Post graduate classes of a University were asked to give their opinion towards the autonomous colleges. 190 of the Undergraduate and 210 of the Post graduate students favoured the autonomous status.
Present the above facts in the form of a frequency table and test, at 5% level, that opinions of Undergraduate and Post graduate students on autonomous status of colleges are independent. (Table value of chi-square at 5% level for 1 d.f. is 3.84.)
Sol. We set
H0: Opinions on autonomous status and level of graduation are independent.
H1: Opinions on autonomous status and level of graduation are not independent.
Let us now form a contingency table of observed frequencies in which expected frequencies are shown within brackets.
                  Favoured      Not favoured    Total
Undergraduate     190 (200)     110 (100)       300
Post graduate     210 (200)     90 (100)        300
Total             400           200             600
Expected frequencies are computed by multiplying the row total and column total and then dividing the product by the sample size.
Now $\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{100}{200} + \frac{100}{100} + \frac{100}{200} + \frac{100}{100} = 3.0$
The computed value of χ² being less than the table value, the differences between observed and expected frequencies are not large enough to reject the hypothesis of independence at the 0.05 level of significance. Hence the attributes are independent, i.e., the opinion on autonomous status does not depend on whether the student is an Undergraduate or a Post graduate.
Note: It may be recalled that a contingency table is a two-way table of frequencies corresponding to two factors of classification.
Ex. 62. Calculate the expected frequencies for the following data presuming the two attributes, viz., condition of home and condition of child, as independent:
                        Condition of home
Condition of child      Clean    Dirty
Clean                   70       50
Fairly clean            80       20
Dirty                   35       45
Use the chi-square test at 5% level to state whether the two attributes are independent.
(Table values of chi-square at 5% for 2 d.f. is 5.991, for 3 d.f. is 7.815 and for 4 d.f. is 9.488.)
Sol. An expected frequency E corresponding to each cell in the table will be given by
$E = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$
We form the table of expected frequencies with the help of the above rule and write the expected frequencies in each cell within brackets.
                        Condition of home
Condition of child      Clean                              Dirty                              Total
Clean                   70   (185 x 120)/300 = (74)        50   (115 x 120)/300 = (46)        120
Fairly clean            80   (185 x 100)/300 = (61.67)     20   (115 x 100)/300 = (38.33)     100
Dirty                   35   (185 x 80)/300 = (49.33)      45   (115 x 80)/300 = (30.67)      80
Total                   185                                115                                300
Let us set the null and the alternative hypotheses as follows:
H0: No association exists between the attributes, i.e., the two attributes are independent.
H1: An association exists between the attributes.
$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{(70-74)^2}{74} + \frac{(50-46)^2}{46} + \frac{(80-61.67)^2}{61.67} + \frac{(20-38.33)^2}{38.33} + \frac{(35-49.33)^2}{49.33} + \frac{(45-30.67)^2}{30.67} = 25.636$
The computed value of χ² being more than 5.991, it is significant, and hence H0 is rejected. Therefore, the two attributes are not independent, i.e., association exists between the attributes.
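The rule "row total × column total ÷ grand total" and the resulting chi-square for Ex. 62 can be sketched as follows. The code keeps the expected frequencies unrounded, so the result may differ in the second decimal place from the hand calculation, which rounds them to two places; the variable names are illustrative.

```python
# Observed frequencies: rows are condition of child (clean, fairly clean, dirty),
# columns are condition of home (clean, dirty).
observed = [[70, 50], [80, 20], [35, 45]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency of a cell = row total * column total / grand total
expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
dof = (len(observed) - 1) * (len(observed[0]) - 1)

print([[round(e, 2) for e in row] for row in expected])
print(round(chi2, 2), dof)
```

With 2 degrees of freedom the statistic is compared with the table value 5.991.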
Ex. 63. Out of 800 persons, 25% were literate and 300 had travelled beyond the limits of their district. 40% of the literates were among those who had not travelled. Prepare a 2 x 2 table and test at 5% level of significance whether there is any relation between travelling and literacy.
Sol. We have the null hypothesis, H0, that literacy and travelling are independent, and H1, the alternative hypothesis, that travelling and literacy are related, i.e., not independent.
OBSERVED FREQUENCIES
              Travelled    Not travelled    Total
Literate      120          80               200
Illiterate    180          420              600
Total         300          500              800
On the assumption that literacy and travelling are independent, the expected frequency table will be as under:
              Travelled    Not travelled    Total
Literate      75           125              200
Illiterate    225          375              600
Total         300          500              800
$\chi^2 = \frac{(120-75)^2}{75} + \frac{(80-125)^2}{125} + \frac{(180-225)^2}{225} + \frac{(420-375)^2}{375} = 27 + 16.2 + 9 + 5.4 = 57.6$
At 5% level for 1 d.f. the table value of chi-square is 3.841, but our computed value is much larger than this. Hence, the assumption of independence cannot be accepted. Accordingly, we conclude that there is a relation between travelling and literacy.
Ex. 64. To test the efficiency of a new drug a controlled experiment was conducted wherein 300 patients were administered the new drug and 200 other patients were not given the drug. The patients were monitored and the results were obtained as follows:
70 110 500
Use the χ² (chi-square) test for finding the effect of the drug.
OBSERVED FREQUENCIES(0)
370 110
T"" 70 500
fXPECTED FREQUENCIES(E)
T"" 370 70
'" 500
I"
,.
Expected frequency for each cell has been calculated by using the formula
COMPUTATION OF CHI-SQUARE
0
, (0 -E) (O-E)" (O-£)'IE
Since the calculated value of χ² is less than the tabulated value, the effect of the drug is not significant and we accept our null hypothesis that the drug is not effective.
Ex. 65. The table given below shows the data obtained during an epidemic of cholera:
\nOl'ulaled
" ,,,
Non Inoculated
'"
Test the effectiveness of inoculation in preventing the attack of cholera.
(Given: χ²0.05 = 3.841 for 1 d.f.; 5.991 for 2 d.f.; 7.815 for 3 d.f.)
Sol. We have the null hypothesis H0, that the inoculation is not effective in preventing the attack of cholera.
H1: Alternative hypothesis that the inoculation is effective in preventing the attack of cholera.
                  Attacked     Not attacked    Row total
Inoculated        42 (A)       232 (C)         274
Not inoculated    106 (B)      748 (D)         854
Column total      148          980             1128 = Grand total
3.
A 41 3. (-16 -36):.1 ]6= 1.000
C 232 3.
238 (232 _138)2 238 0.1500
D 36
748 741 (7-18-7-ul 74"2 == 0,0485
Total 1.8405
The calculated value of χ² is less than the tabulated value of χ² for 1 d.f. at 5% level of significance.
We can presume that the null hypothesis is valid and the inoculation is not effective in preventing the attack of cholera.
Ex. 66. A market analyst took a sample of 20 markets in a large city in an attempt to determine how much variation there is in butter prices. The 20 prices that were quoted to him for the samples of butter yielded the values x̄ = 100 and s = 9. The problem now is to find a 95% confidence interval for the standard deviation of all the market prices.
Sol. Suppose a large number of such samples of 20 prices were taken and their standard deviations computed, the sample mean being of no interest here.
From the χ² distribution table it is found that for 19 d.f. χ²0.025 = 32.85 and χ²0.975 = 8.91. With (n - 1)s² = 19 x 81 = 1539,
$\frac{1539}{32.85} \le \sigma^2 \le \frac{1539}{8.91}$
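The limits for σ follow by evaluating the two ratios and taking square roots. The sketch below assumes that the value 9 quoted in the problem is the sample standard deviation s, which is what the figure 1539 = 19 × 81 implies; the table values 32.85 and 8.91 are those quoted in the solution.

```python
import math

n, s = 20, 9                             # sample size and sample standard deviation
chi2_upper, chi2_lower = 32.85, 8.91     # table values for 19 d.f., as quoted

ss = (n - 1) * s ** 2                    # (n - 1) s^2 = 1539

# 95% confidence limits for the variance, then for the standard deviation
var_low, var_high = ss / chi2_upper, ss / chi2_lower
sd_low, sd_high = math.sqrt(var_low), math.sqrt(var_high)

print(ss)
print(round(var_low, 2), round(var_high, 2))
print(round(sd_low, 2), round(sd_high, 2))
```

The interval is not symmetric about s because the chi-square distribution is skewed.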
Ex. 67. A sample of 101 light bulbs yielded a standard deviation of 80 hours burning time, whereas long experience with the particular brand showed a standard deviation of 90 hours. Using α = 0.05, test if there is any difference in the standard deviation.
$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2} = \frac{100 \times (80)^2}{(90)^2} = 79$
Since the calculated value is less than the tabulated value, we accept the null hypothesis that there is no significant difference in the S.D.
38, 40, 45, 53, 47, 43, 55, 48, 52, 49
Can we say that the variance of the distribution of weights of all students from which the above sample of 10 students was drawn is equal to 20 square kg?
X_i     X_i - X̄ (X̄ = 47)    (X_i - X̄)²
38      -9                   81
40      -7                   49
45      -2                   4
53      6                    36
47      0                    0
43      -4                   16
55      8                    64
48      1                    1
52      5                    25
49      2                    4
ΣX_i = 470, n = 10, Σ(X_i - X̄)² = 280
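Specification of the significance level: Let us consider 5% significance level.
Statement of the decision rule: At 5% significance level for 10 - 1 = 9 d.f., χ²0.025 = 19.02.
$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2} = \frac{\sum (X_i - \bar X)^2}{\sigma_0^2} = \frac{280}{20} = 14$
Making an administrative decision: Since the calculated value is less than the tabulated value, the data indicate that the population variance may be 20 square kg.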
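The chi-square statistic for a hypothesised variance can be checked as below. This sketch takes the fifth weight as 47, which is what the total ΣX = 470 and the mean of 47 in the worked table imply; the variable names are illustrative.

```python
weights = [38, 40, 45, 53, 47, 43, 55, 48, 52, 49]   # fifth value taken as 47
sigma0_sq = 20                                       # hypothesised variance (kg^2)

n = len(weights)
mean = sum(weights) / n
sum_sq_dev = sum((x - mean) ** 2 for x in weights)   # sum of (X - mean)^2

# (n - 1) s^2 / sigma0^2 equals the sum of squared deviations over sigma0^2
chi2 = sum_sq_dev / sigma0_sq
dof = n - 1

print(mean, sum_sq_dev, chi2, dof)
```

The statistic is compared with the table value 19.02 quoted for 9 d.f.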
Ex. 69. In a survey of 200 boys, of which 75 were intelligent, 40 had skilled fathers, while 85 of the unintelligent boys had unskilled fathers. Do these figures support the hypothesis that skilled fathers have intelligent boys? Use the χ² test. The value of χ² for 1 degree of freedom at 5% level is 3.84.
Observed frequencies
                     Intelligent boys    Unintelligent boys    Total
Skilled fathers      40                  40                    80
Unskilled fathers    35                  85                    120
Total                75                  125                   200
The following table gives the expected frequencies.
Expected frequencies
                     Intelligent boys         Unintelligent boys        Total
Skilled fathers      (75 x 80)/200 = 30       (125 x 80)/200 = 50       80
Unskilled fathers    (75 x 120)/200 = 45      (125 x 120)/200 = 75      120
Total                75                       125                       200
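We set the null and the alternative hypotheses as follows:
H0: No association exists between skilled fathers and intelligent boys.
H1: Association exists between skilled fathers and intelligent boys.
$\chi^2 = \frac{(40-30)^2}{30} + \frac{(40-50)^2}{50} + \frac{(35-45)^2}{45} + \frac{(85-75)^2}{75} = \frac{100(15 + 9 + 10 + 6)}{450} = \frac{100 \times 40}{450} = 8.89$
This value is much higher than 3.84, the given table value, and is significant. Therefore, we reject the null hypothesis and consequently accept the alternative hypothesis. Thus the given data support the hypothesis that skilled fathers have intelligent boys, i.e., association exists between skilled fathers and intelligent boys.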
Ex. 70. A certain drug is claimed to be effective in curing colds. In an experiment on 164 people with colds, half of them were given the drug and half of them were given sugar pills. The patients' reactions to the treatment are recorded in the following table. Test the hypothesis that the drug is not better than sugar pills for curing colds.
Drug
" " 20
Sugar Pills
" " "
Sol. The table showing the observed frequencies.
Expected frequency corresponding to cell A $= \frac{82 \times 96}{164} = 48$
Expected frequencies
               Helped    Harmed    No effect    Total
Drug           A = 48    B = 11    C = 23       82
Sugar pills    D = 48    E = 11    F = 23       82
Total          96        22        46           164
χ² may now be calculated as follows:
-un
Now the degrees of freedom are calculated as (r - 1)(c - 1), where r = number of rows and c = number of columns.
In this problem, therefore, d.f. = (2 - 1)(3 - 1) = 2. From the table, for 2 d.f., χ²0.05 = 5.991.
Ex. 71. A Bombay film director claims that his films are liked equally by males and females. An opinion survey of a random sample of 1000 film goers revealed the following results:
           Liked    Disliked
Males      402      193
Females    245      160
d.l. " , , ,
Sol. The table of results is given below.
           Liked    Disliked    Total
Males      402      193         595
Females    245      160         405
Total      647      353         1000 = Grand total
Expected frequency of cell A $= \frac{595 \times 647}{1000} \approx 385$
Expected frequencies
           Liked    Disliked    Total
Males      385      210         595
Females    262      143         405
Total      647      353         1000 = Grand total
$\chi^2 = \frac{(402-385)^2}{385} + \frac{(193-210)^2}{210} + \frac{(245-262)^2}{262} + \frac{(160-143)^2}{143} \approx 5.25$
Degrees of freedom = (2 - 1)(2 - 1) = 1
Since the calculated value is more than the tabulated value, the claim is not supported by the data, i.e., the films of the director are not liked equally by males and females.
Ex. 72. 1600 families were selected at random in a city to test the belief that high income families usually send their children to public schools and low income families often send their children to government schools.
School,
Low
'" 506 1000
High 162
'" 600
Total '50
'" 1600
x:, ,('"'"'_-_'"L~")~'+
a_
'W
"
(506 -59!J)'
590 +
."("~_._--,,~,,'
'"
-
which is much more than the tabulated value of χ² for 1 d.f. at all levels of significance.
Hence the null hypothesis is rejected and we can say that high income families usually send their children to public schools.
TRY YOURSELF
1. Memory capacity of 9 students was tested before and after training. State at 5% level of significance whether the training was effective from the following scores.
S. No.    1     2     3     4     5     6     7     8     9
Before    10    ..    9     3     7     12    16    17    4
After     12    17    8     5     6     11    18    20    3
2. The sales data of an item in six shops before and after a special promotional campaign are as follows. Can the campaign be judged to be a success? Use 5% level of significance.
Shop      A     B     C     D     E     F
Before    53    28    31    48    50    42
After     58    29    30    55    56    45
3. Samples of sales in similar shops in two towns are taken for a new product with the following results. Is there any evidence of difference in sales in the two towns? Use 5% level of significance for testing this difference between the means of two samples.
A
" 5.3 5
B 61 4.8 7
4. A personnel manager is interested in trying to determine whether absenteeism is greater on one day of the week than on another. His records for the past years show the sample.
No. of absentees    66    57    54    48    75
5. The contingency table below summarizes the results obtained in a study conducted by a research organisation with respect to the performance of four competing brands of toothpaste among the users. Test whether the incidence of cavities is independent of the brand of toothpaste used.
Zero    9     13    17    11
1-5     63    70    85    82
>5      28    37    48    37
6. The following table gives the number of good and bad parts produced by each of three shifts in a factory. Is there any association between the shift and the quality of parts produced?
7. An investment consultancy firm finds that 87% of 150 investors in city A prefer equity investment and 65.9% of 120 investors in city B prefer equity investment against debt investment. Test whether the two cities differ in the proportion of investors preferring equity.
8. A film director claims that his films are liked equally by males and females. An opinion survey of a random sample of 1000 film goers revealed the following results:
Liked Disliked
9. 400 women shoppers are chosen at random in market A. Their average weekly expenditure on food is found to be Rs. 250 with a standard deviation of Rs. 40. These figures are Rs. 220 and Rs. 55 respectively in market B, where 600 women shoppers are chosen at random. Use 1% level of significance to test whether the average weekly food expenditures of the two populations of shoppers are equal.
10. A sample of heights of 6400 soldiers has a mean of 67.45 inches and a standard deviation of 2.56 inches, while a simple sample of heights of 1600 sailors has a mean of 68.55 inches and a S.D. of 2.52 inches. Do the data indicate that the sailors are on the average taller than the soldiers?
11. A stenographer claims that she can take dictation at the rate of 120 words per minute. Can we reject her claim on the basis of 100 trials in which she demonstrates a mean of 116 words with a standard deviation of 15 words? Use 1% l.o.s.
12. The following data give the yields on 12 plots of land in three samples, each of 4 plots, under three varieties of seeds A, B and C.
32 )j 31 30
30 24 32 26
26 27 25 30
Apply the technique of analysis of variance to test whether the difference in the average yields under the three varieties is significant or not.
13. A consumer magazine was interested in determining whether any difference existed in the average life of four different brands of transistor batteries. A random sample of 4 batteries of each brand was tested with the following results (in hours).
Brand I Brand II Brand III Brand IV
12 14 12 14
15 17 19 21
IS 12
10 19
20
" "
20
Is there any significant difference in the average life of the four brands at 5% level?
14. An experimenter wanted to study the effect of 3 fertilizers on the yield of a crop. He divided the field into 12 plots and assigned each fertilizer at random to 4 plots. Part of his calculations are shown below:
Source d.f. 55 MS F F
..
Fertilizers
I"' ..
.. 4.26
Within .. .. ..
G.rou
Total 176
(a) Complete the above table by filling the gaps shown by ..
(b) Test at 5% level to see whether the fertilizers differ significantly.
15. A manager of a mercantile firm wishes to test whether its three salesmen A, B and C tend to make sales of the same size or differ in their selling ability. During a week there have been 14 sale calls: A made 5 calls, B made 4 calls and C made 5 calls. Following are the sales data for the week of the three salesmen.
16. Three varieties A, B and C of a crop are tested in a randomized block design with four replications. The yields are given below in pounds:
1 2 3 4
A
B ,,
6 4
6
8
6
6
10
24
C 5 10 9 "
32
Test whether there are differences between varieties. Test also whether the yield of A differs significantly from that of B.
17. The following table gives the number of refrigerators sold by 4 salesmen of Kelvinator (India) Ltd. in three months, January, February and March, in the year 1978:
Month A B c D
January 50 40 48 39
February 46 48 50 45
March 39 44 40 39
18. Four different manufacturing processes were tried at three different stations and the average measurement of a quality characteristic of the product by these processes was obtained as in the following table. Perform the analysis of variance of the data and test for the difference between the processes.
Station Processes
A B C D
I 7 14 II 11
11 IS 16 I' 10
111 8 I' 10 12
19. Suppose that we are interested in establishing the yield-producing ability of four types of soyabeans A, B, C and D. We have three blocks of land X, Y and Z which may be different in fertility. Each block of land is divided into four plots in each block by a random procedure. The following results are obtained.
20. The following data give the number of units produced per day by 4 workers A, B, C, D using four different types of machines M1, M2, M3, M4.
M, M,
A
B
45
'0
42
32
'"
38
34
C 43 36 40
D 36 38 36
(a) Test whether the mean productions of the four different types of machines are equal.
(b) Test also whether the four workers differ with respect to mean productivity.
21. Set up an ANOVA table for the following per hectare yield for three varieties of wheat, each grown
on four plots:
Per Hectare Yield (in '00 kgs.)
Plot of Land    Variety of Wheat
                A1   A2   A3
1               6    5    5
2               7    5    4
3               3    3    3
4               8    7    4
Also work out the F-ratio.
22.
                    Treatment 1
                    (i)   (ii)   (iii)
Treatment 2 (i)     30    26     38
            (ii)    24    29     28
            (iii)   33    24     35
            (iv)    36    31     30
            (v)     27    35     33
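
As a way of checking hand calculations for Exercise 21, the one-way ANOVA table for the wheat data can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `one_way_anova` and the dictionary keys are our own choices, and the 5% critical value 4.26 for (2, 9) degrees of freedom is taken from the F-table rather than computed.

```python
def one_way_anova(groups):
    """Return sums of squares, degrees of freedom, mean squares and F-ratio."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(group) / len(group) for group in groups]

    # Between-groups SS: group size times squared deviation of each group mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-groups SS: squared deviations of observations from their group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "f_ratio": ms_between / ms_within,
    }

# Yields per plot (in '00 kgs.) for the three wheat varieties of Exercise 21.
yields = [[6, 7, 3, 8], [5, 5, 3, 7], [5, 4, 3, 4]]
result = one_way_anova(yields)

F_CRITICAL_5PCT = 4.26  # table value for (2, 9) d.f., read from the F-table
for name, value in result.items():
    print(f"{name}: {value:.4f}")
print("Varieties differ significantly:", result["f_ratio"] > F_CRITICAL_5PCT)
```

With these data the between-varieties sum of squares is 8 and the within-varieties sum of squares is 24, giving F = 1.5, which is below the table value, so the differences between varieties are not significant at the 5% level.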
23. A manufacturer of ball point pens claims that a certain pen he manufactures has a mean writing life
of 400 pages with a standard deviation of 20 pages. A purchasing agent selects a sample of 100 pens and
puts them to test. The mean writing life for the sample was 390 pages. Should the purchasing agent reject
the manufacturer's claim at the 5% level of significance?
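
The test in Exercise 23 is a one-sample z-test with known standard deviation. A minimal Python sketch follows; it assumes a two-tailed alternative (the exercise does not say which tail is intended), and the helper name `one_sample_z` is ours.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z(sample_mean, claimed_mean, sigma, n):
    """Return the z statistic and two-tailed p-value for a known-sigma test."""
    standard_error = sigma / sqrt(n)
    z = (sample_mean - claimed_mean) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p_value = one_sample_z(sample_mean=390, claimed_mean=400, sigma=20, n=100)
critical = NormalDist().inv_cdf(0.975)  # two-tailed 5% critical value (about 1.96)
reject = abs(z) > critical
print(f"z = {z:.2f}, critical = {critical:.2f}, reject claim: {reject}")
```

Here the standard error is 20/√100 = 2, so z = (390 − 400)/2 = −5. Since |z| exceeds 1.96, the claim would be rejected at the 5% level under this two-tailed assumption.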
24. A manufacturer claimed that at least 95% of the equipment which he supplied to a factory conformed
to specifications. An examination of a sample of 200 pieces of equipment revealed that 18% were faulty.
Test his claim at a significance level of 5%.
25. In an infantile paralysis epidemic, 500 persons contracted the disease. 300 received no serum
treatment and of them 75 became paralysed. Of those who received serum treatment, 65 became paralysed.
Was serum treatment effective?
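
One way to approach Exercise 25 is a large-sample test for the difference of two proportions. The sketch below assumes that the remaining 500 − 300 = 200 persons all received the serum (this is implied but not stated in the exercise); the function name `two_proportion_z` is ours.

```python
from math import sqrt

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / standard_error

n_serum = 500 - 300          # assumption: everyone else received serum
rate_serum = 65 / n_serum    # proportion paralysed with serum
rate_no_serum = 75 / 300     # proportion paralysed without serum

z = two_proportion_z(65, n_serum, 75, 300)
print(f"serum: {rate_serum:.3f}, no serum: {rate_no_serum:.3f}, z = {z:.2f}")
```

The paralysis rate is 0.325 with serum against 0.250 without it, and z is about 1.83, which is below the two-tailed 5% value of 1.96. On these figures the data give no evidence that the serum reduced paralysis.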
26. Out of 800 persons, 25% were literate and 300 had travelled beyond the limits of their district. 40% of
the literates were among those who had not travelled. Is there any relation between travelling and
literacy?
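
Exercise 26 reduces to a chi-square test of independence on a 2 × 2 table. The following Python sketch builds the table from the figures in the exercise and computes the statistic; the function name `chi_square_independence` is ours, and the 5% critical value 3.841 for 1 degree of freedom is read from the chi-square table, not computed.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    return chi2

total = 800
literate = total * 25 // 100                      # 200 literates
travelled = 300
literate_not_travelled = literate * 40 // 100     # 80
literate_travelled = literate - literate_not_travelled          # 120
illiterate_travelled = travelled - literate_travelled           # 180
illiterate_not_travelled = (total - literate) - illiterate_travelled  # 420

observed = [
    [literate_travelled, literate_not_travelled],
    [illiterate_travelled, illiterate_not_travelled],
]
chi2 = chi_square_independence(observed)

CHI2_CRITICAL_5PCT_1DF = 3.841  # table value for 1 d.f.
print(f"chi-square = {chi2:.1f}")
print("Travelling and literacy related:", chi2 > CHI2_CRITICAL_5PCT_1DF)
```

The expected frequencies are 75, 125, 225 and 375, each differing from the observed value by 45, so the statistic is 57.6. This is far above 3.841, indicating an association between travelling and literacy.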
27. In a survey of 200 boys, of which 75 were intelligent, 40 had skilled fathers; while 85 of the
unintelligent boys had unskilled fathers. Do these figures support the hypothesis that skilled fathers have
intelligent boys?
Table 1: Area Under Normal Curve
[Table values illegible in source.]
Table 2: Critical Values of Student's t-Distribution
[Table values illegible in source.]
Table 3: Critical Values of χ²
Probability under H0 that χ² > chi-square
[Table values illegible in source.]
Note: For degrees of freedom greater than 30, the quantity √(2χ²) − √(2 d.f. − 1) may be used as a normal variate with unit
variance, i.e., z = √(2χ²) − √(2 d.f. − 1).
Table 4(a): Critical Values of F-Distribution (at 5 per cent)
[Table values illegible in source.]
v1 = Degrees of freedom for greater variance.
v2 = Degrees of freedom for smaller variance.
Table 4(b): Critical Values of F-Distribution (at 1 per cent)
[Table values illegible in source.]
v1 = Degrees of freedom for greater variance.
v2 = Degrees of freedom for smaller variance.
Table 5: Values for Spearman's Rank Correlation (r_s) for Combined Areas in Both Tails
[Table values illegible in source.]
Table 6: [Ws − Min Ws] or [Max Wl − Wl]
[Table values illegible in source.]
* Indicates that the value at the head of this column (and those values that are larger) are not possible for the given values of s and l in this row.
Table 7: Critical Values of T in the Wilcoxon Matched Pairs Test
Level of significance for two-tailed test
[Table values illegible in source.]
Table 8: Cumulative Binomial Probabilities: P(r ≤ r | n, p)
[Table values illegible in source.]
Table 9: Selected Critical Values of S in the Kendall's Coefficient of Concordance
[Table values illegible in source.]
Table 10: Table Showing Critical Values of A-Statistic for any Given Value
of n − 1, Corresponding to Various Levels of Probability
(A is significant at a given level if it is ≤ the value shown in the table.)
[Table values illegible in source.]
References:
1. S. P. Gupta, Statistical Methods, New Delhi, Sultan Chand & Sons, 2011.
Module II - Forecasting techniques
Multiple correlation and Multiple regression. Time series analysis
Module IV - Non-parametric Test
Sign test, Rank correlation, Chi square test, Run test.
Course material prepared by
ABOUT THE AUTHOR
Ms. Swati Subhash Desai is an Associate Professor in Prahladrai Dalmia Lions College of
Commerce and Economics, Malad (W), Mumbai - 64. She has done her post graduation (M.Sc.) in Applied
Statistics from Poona University. Her areas of interest include, among others, Operations Research,
Quantitative Methods, and Research Methodology.
She has over [illegible] years of teaching experience. She has also been teaching at Post Graduate level since
[illegible] at various management institutes of repute such as Prin. L.N. Welingkar Institute of Management and
Research, Babasaheb Gawde Institute of Management, Dr. Bedekar Institute of Management, Padmashree
Vasantdada Patil Institute of Management, I.C.W.A. etc.
She is also actively associated with the Institute of Distance Education, University of Mumbai, in
her capacity as a course writer for Mathematical and Statistical Techniques at F.Y.B.Com. level,
Integrated Approaches to Operations Research at PGDORM and also as a faculty for these courses.
She is also an author of Operations Research, which has been designed as a text book for T.Y.B.M.S.
Sem VI, University of Mumbai.
She is also serving as a Guest Faculty at J.J.T. University and also at Regional Training Institute,
Mumbai, Indian Audit and Accounts Department, Government of India.
She has participated in various National Level Seminars / Symposia and presented a number of
technical papers at such seminars. Quite a few of her papers have been published in prestigious journals, e.g.
ENTIRE RESEARCH ISSN 0975-5020, titled "Role of Quantitative Techniques in Industry";
VARIORUM Multi-Disciplinary e-Research Journal ISSN 0976-9714, titled "Applications of Linear
Programming Problems and Non Linear Programming Problems".
She was nominated for Masters training program on life skills, citizenship & civics organized by
RGNIYD to represent Maharashtra.
Rs. 250/-