
ESTIMATION THEORY AND HYPOTHESIS TESTING

Sampling:- When the population is very large it is difficult to examine each and every unit of the population. Instead, we select a few units from the population, examine them, and then draw conclusions regarding the whole population on the basis of these few selected units. This is known as sampling.
Sample:- A sample is a small part of the population, selected from it in order to draw conclusions regarding the whole population.
Population:- A group of similar objects is called a population.
Finite Population:- A population consisting of a finite number of units is called a finite population.
e.g. The population of students of a college is a finite population.
Infinite Population:- A population consisting of an infinite number of units is called an infinite population.
e.g. The population of fishes in the sea is an infinite population.
Sampling fraction:- The ratio n/N is called the sampling fraction, where 'n' is the number of units in the sample and 'N' is the number of units in the population.
Complete Enumeration:- If each and every unit of the population is examined then it is called complete enumeration.
e.g. a census.
Sampling error" an# non-"ampling error":-
Sampling Error":- A sample contains only a few units of the population and on
the basis of these few units we estimate the parameters of the population. $ven
when greatest care is taken in selecting the sample% there will always be a small
difference between the estimates obtained from different samples. There shall
also be a difference between the true parametric value and the estimates
obtained from the sample. These differences observed between different sample
estimates or between the population value and sample estimate are called
sampling errors. These errors are inherent and unavoidable.
Non-Sampling Error:- "on&Sampling errors are the errors committed in the
measurement of characteristics% recording the measurements% personal bias% the
careless use of sampling techni'ues etc. These errors can be avoided if a proper
care is taken. (ow ever sampling errors can only be minimi)ed by choosing
appropriate sampling techni'ues.
Sampling methods may be classified as follows:
Probability (Random) Sampling: simple random sampling, stratified random sampling, cluster sampling, systematic sampling, multistage sampling.
Non-Probability (Non-Random) Sampling: convenience sampling, purposive sampling, quota sampling, judgement sampling.
Simple Random Sampling:- Simple random sampling is the best known and most widely used method of sampling. Simple random sampling is used when the population is homogeneous with respect to the characteristic under study. In random sampling each and every unit of the population has an equal chance of being selected in the sample.
A random sample can be selected by the following methods:-
1. Lottery method.
2. With the help of a random number table.
Lottery Method:- Suppose we want to select a random sample of 'n' units from a population consisting of 'N' units.
First we assign the numbers 1 to N to all the 'N' units of the population. We write these numbers on small chits and mix the chits in a bowl. Now we select 'n' chits one by one from the bowl. Those units whose numbered chits are selected are included in the sample.
With the help of a random number table:- We assign the numbers 1 to N to all the 'N' units of the population. Now, to select a sample of 'n' units, we select n random numbers (between 1 and N) from the random number table. Those units whose numbers are selected are included in the sample.
Simple Random Sampling with replacement and without replacement:- If at each stage of the draw the unit previously drawn is replaced before the next draw is made, the method of sampling is known as sampling with replacement. If, however, the unit previously drawn is not replaced before the next draw, the method is known as sampling without replacement.
Note: Sampling without replacement is better than sampling with replacement, since a unit of the population can be included only once in the sample; consequently the standard error of the estimate is smaller in sampling without replacement than in sampling with replacement.
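As a rough illustration (a minimal Python sketch with a made-up population, not taken from these notes), the two schemes can be imitated with the standard random module:

import random

random.seed(42)                                   # only to make the illustration reproducible

population = list(range(1, 11))                   # a made-up population of N = 10 numbered units
n = 4                                             # desired sample size

# Sampling WITHOUT replacement: a unit can enter the sample at most once.
srswor = random.sample(population, n)

# Sampling WITH replacement: the same unit may be drawn more than once.
srswr = [random.choice(population) for _ in range(n)]

print("Without replacement:", srswor)
print("With replacement:   ", srswr)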
Stratified Random Sampling:- Stratified sampling is used when the population is heterogeneous. In stratified sampling we divide the heterogeneous population consisting of 'N' units into k homogeneous groups (strata), consisting of N1, N2, ......, Nk units respectively, such that N1 + N2 + ...... + Nk = N. Now, to select a sample of n units, we select a sample of n1 units from the first stratum, a sample of n2 units from the second stratum, ......, and a sample of nk units from the kth stratum, such that n1 + n2 + ...... + nk = n. This procedure of sampling, by first dividing the heterogeneous population into homogeneous groups and then selecting a sample from each group, is called stratified sampling. If we select a random sample from each stratum, then the method of sampling is known as stratified random sampling. A common way of fixing the ni is illustrated in the sketch below.
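A minimal Python sketch of the allocation n1 + n2 + ...... + nk = n, assuming proportional allocation ni = n·Ni/N and made-up strata (the stratum names and sizes are invented):

import random

random.seed(1)

# Three made-up homogeneous strata of sizes N1 = 50, N2 = 30, N3 = 20.
strata = {
    "stratum_1": [f"A{i}" for i in range(50)],
    "stratum_2": [f"B{i}" for i in range(30)],
    "stratum_3": [f"C{i}" for i in range(20)],
}
N = sum(len(units) for units in strata.values())          # N = 100
n = 10                                                     # total sample size

sample = []
for name, units in strata.items():
    n_i = round(n * len(units) / N)                        # proportional allocation n_i = n * N_i / N
    sample.extend(random.sample(units, n_i))               # simple random sampling within the stratum

print(sample)                                              # here n1 + n2 + n3 = 5 + 3 + 2 = n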
S%"temati Sampling*& The procedure is useful when elements of the
population are already physically arranged in some order% such as an
alphabeli)ed list of people with driving licenses% list of bank customers by
account numbers. #n these case one element is chosen at random from first k
element and then every kth element is included in the sample. The value k is
called the sampling interval.
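A minimal Python sketch of systematic sampling from a made-up ordered list (the frame and sizes are invented):

import random

random.seed(7)

frame = [f"ACC{i:03d}" for i in range(1, 101)]   # an ordered frame of N = 100 account numbers
n = 10
k = len(frame) // n                              # sampling interval k = N / n = 10

start = random.randint(0, k - 1)                 # one element chosen at random from the first k
sample = frame[start::k]                         # then every k-th element thereafter

print("interval k =", k, ", random start =", start)
print(sample)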
!lu"ter Sampling*& A group of elementary units in the population is called a
cluster. When a cluster is taken as a sampling unit% the procedure of sampling is
called cluster sampling.
Multi-Stage Sampling:- Sometimes sampling is done in stages to reduce the cost of the survey. In this sampling method, the population is divided into first-stage sampling units, also called primary units. Then a random sample of first-stage units is drawn. A further division is made of the first-stage sampling units selected, and a random sample is taken from these second-stage sampling units. The process can be continued for a number of stages.
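A minimal Python sketch of two-stage sampling with invented primary units (villages) and second-stage units (households):

import random

random.seed(3)

# Made-up frame: 10 primary units, each containing 20 second-stage units.
primary_units = {f"village_{v}": [f"household_{v}_{h}" for h in range(1, 21)]
                 for v in range(1, 11)}

# Stage 1: simple random sample of 3 first-stage (primary) units.
chosen_villages = random.sample(list(primary_units), 3)

# Stage 2: simple random sample of 5 second-stage units within each chosen primary unit.
sample = []
for village in chosen_villages:
    sample.extend(random.sample(primary_units[village], 5))

print(chosen_villages)
print(sample)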
Non-Probability Sampling Techniques:-
Convenience Sampling:- In this procedure, the units to be included in the sample are selected at the convenience of the investigator rather than by any prespecified or known probabilities of selection. Convenience samples make it easy to collect data on a particular issue. However, it is not possible to evaluate how representative of the population such a sample is, and hence precautions should be taken in interpreting the results of a convenience sample when they are used to make inferences about a population.
Quota Sampling:- In quota sampling the selection of respondents lies with the investigator, although in making such a selection he/she must ensure that each respondent satisfies certain criteria which are essential for the study.
Sampling error" an# Non-Sampling Error":-
N.P.Singh (RATM) B.Statistics (015) 3
Sampling Error":- A sample contains only a few units of the population and on
the basis of these few units we estimate the parameters of the population. $ven
when greatest care is taken in selecting the sample% there will always be a small
difference between the estimates obtained from different samples. There shall
also be a difference between the true parametric value and the estimates
obtained from the sample.
These difference observed between difference sample estimates or
between the population value and sample estimate are called sampling errors.
These errors are inherent and unavoidable.
Non-Sampling Error":- "on&Sampling errors are the errors committed in the
measurement of characteristics% recording the measurements% personal bias% the
careless use of sampling techni'ues etc. These errors can be avoided if a proper
care is taken. (ow ever sampling errors can only be minimi)ed by choosing
appropriate sampling techni'ues.
Standard Error:- The standard deviation of the sampling distribution of a statistic is called the standard error of that statistic.
S.E. of mean = σ/√n
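A quick numerical check of this formula, with invented values of σ and n:

import math

sigma = 12.0                        # assumed (made-up) population standard deviation
n = 36                              # sample size

se_mean = sigma / math.sqrt(n)      # S.E. of mean = sigma / sqrt(n)
print(se_mean)                      # 12 / 6 = 2.0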
Te"ting of H%pot,e"i":- Testing of hypothesis is a rule of procedure by which
we reject of accept a null hypothesis on the basis of the sample.
Null H%pot,e"i"*& A hypothesis tested for possible rejection under the
assumption that it is true is called null hypothesis. "ull hypothesis is denoted by
(o.
Alternati(e H%pot,e"i": To every null hypothesis there exists an alternative
hypothesis which is different from null hypothesis and is accepted when the null
hypothesis is rejected on the basis of the sample. Alternative hypothesis is
denoted by (,.
Error":
Ho i" true Ho i" fal"e
N.P.Singh (RATM) B.Statistics (015) 4
!orret Dei"ion T%pe II error
T%pe I error !orret Dei"ion
Aept Ho
Re0et Ho
While testing a statistical hypothesis we may commit two types of errors:
i. We may reject a hypothesis which is actually true.
ii. We may accept a hypothesis which is actually false.
These are known as Type I and Type II errors respectively. Their probabilities are denoted by α and β respectively.
Level of Significance:- The probability of rejecting the null hypothesis when it is actually true (i.e. of committing a Type I error), fixed in advance (commonly 5% or 1%), is called the level of significance.
Power of a test:- The power of a test is the probability of rejecting the null hypothesis when it is false, i.e. the probability of making the correct decision when H0 is false.
Power = Prob.[Rejecting H0 when H0 is false]
      = Prob.[Rejecting H0 when H1 is true]
      = 1 − Prob.[Accepting H0 when H1 is true]
Power = 1 − β
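As a rough sketch (going a little beyond the notes), the power of a one-tailed large-sample z-test of H0: μ = μ0 against a specific alternative μ = μ1 > μ0 can be computed as follows; σ is assumed known and all the numbers are made up:

import math
from statistics import NormalDist

mu0, mu1 = 50.0, 53.0                    # hypothesised mean and the assumed true (alternative) mean
sigma, n = 10.0, 64                      # assumed known sigma and sample size
alpha = 0.05

se = sigma / math.sqrt(n)
z_crit = NormalDist().inv_cdf(1 - alpha)            # 1.645 for a one-tailed 5% test
cutoff = mu0 + z_crit * se                          # reject H0 when the sample mean exceeds this value

beta = NormalDist(mu=mu1, sigma=se).cdf(cutoff)     # beta = P(accept H0 | mu = mu1)
power = 1 - beta                                    # power = 1 - beta
print(f"beta = {beta:.3f}, power = {power:.3f}")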
One-tailed and Two-tailed Tests:- When the alternative hypothesis is of the form H1: μ ≠ μ0 the test is two-tailed (the critical region lies in both tails), while for H1: μ > μ0 or H1: μ < μ0 the test is one-tailed (the critical region lies in one tail only).
Hypothesis testing for population parameters with large samples:- Hypothesis testing involving large samples (n > 30) is based on the sampling distribution of the sample mean being approximately normal (by the central limit theorem).
H%pot,e"i" Te"ting for Single Population Mean:-
Te"ting t,e "ignifiane of t,e o$"er(e# mean &1-Te"t':-
-et
, ...., ,......... ,
2 1 n
x x x
be a random sample of si)e n /<=:0 from a large
population. -et sample mean ,
n
x
x

= and sample variance

− = . ) (
1
2 2
x x
n
S
Let the mean and standard deviation of the population be μ and σ respectively. Then, to test
H0: μ = μ0  against  H1: μ ≠ μ0,
when the standard deviation σ of the population is known, the Z-test statistic is given by:
z = (x̄ − μ0) / (σ/√n).
If |Z calculated| < |Z tabulated|, we accept our null hypothesis at the 5% level of significance; otherwise we reject it.
When the population standard deviation σ is not known, the standard error of x̄ is estimated by S/√(n − 1), or equivalently s/√n (where s² is the sample variance computed with divisor n − 1); here S is the standard deviation of the sample values as defined above.
Critical values for the Z-statistic:-
Level of significance    One-tailed    Two-tailed
5% (0.05)                1.645         1.96
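A minimal Python sketch of this z-test with made-up data and an assumed known population standard deviation:

import math
from statistics import mean

data = [52, 48, 51, 55, 49, 50, 53, 47, 52, 54] * 4   # made-up sample, repeated for brevity (n = 40 > 30)
mu0 = 50.0                                            # hypothesised mean under H0
sigma = 2.5                                           # assumed known population standard deviation

n = len(data)
x_bar = mean(data)
z = (x_bar - mu0) / (sigma / math.sqrt(n))            # z = (x_bar - mu0) / (sigma / sqrt(n))

z_tab = 1.96                                          # two-tailed 5% critical value from the table above
print(f"z = {z:.3f}")
print("reject H0" if abs(z) > z_tab else "accept H0")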
Te"ting t,e "ignifiane of t,e #ifferene $et.een t.o "ample mean":-
Suppose we draw two samples from two populations /or same population0. Then
to test the hypothesis% DThe difference of the means of two samples of not
significant or the two samples have been taken from the same population% we
consider the difference between two samples means and the standard error of
this difference.
Z = (x̄1 − x̄2) / √(σ1²/n1 + σ2²/n2),
where n1 and n2 are the sizes of the independent random samples drawn from the first and second populations respectively.
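A minimal Python sketch of the two-sample case with made-up data; since the samples are large, the unknown population variances are replaced by the sample variances:

import math
from statistics import mean, pstdev

x1 = [61, 64, 59, 62, 65, 60, 63, 61] * 5        # made-up sample 1, n1 = 40
x2 = [58, 60, 57, 61, 59, 58, 62, 57] * 5        # made-up sample 2, n2 = 40

n1, n2 = len(x1), len(x2)
s1, s2 = pstdev(x1), pstdev(x2)                  # sample standard deviations used in place of sigma1, sigma2

z = (mean(x1) - mean(x2)) / math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
print(f"z = {z:.3f}")                            # compare |z| with 1.96 at the 5% level (two-tailed)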
Te"t of "ignifiane: Small Sample":-
Te"ting t,e "ignifiane of t,e mean of "mall "ample":
t 3 Te"t
Student's t-test:- To test the significance of the difference between the sample mean and the population mean, i.e. to test the null hypothesis H0: μ = μ0 (a specified value), when the population is normal, the population standard deviation is not known and the sample size is less than 30, we proceed as follows:
Let x1, x2, ......, xn be a random sample of size n; then
t = (x̄ − μ0) / (S/√n),  where S² = (1/(n − 1)) Σ (xi − x̄)².
If |t calculated| > |t tabulated| (for n − 1 degrees of freedom), we reject our null hypothesis at the 5% level of significance; otherwise we accept it.
Note: Student's t-test can be applied only when n is small and the population variance is unknown.
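A minimal Python sketch of the one-sample t-test with made-up data; SciPy is assumed to be available and is used only to look up the table value:

import math
from statistics import mean, stdev          # stdev uses the divisor n - 1, matching S above
from scipy import stats                     # assumed available, used only for the table value

data = [14.2, 15.1, 14.8, 15.4, 14.6, 15.0, 14.9, 15.3, 14.7, 15.2]   # made-up small sample (n < 30)
mu0 = 15.0                                  # hypothesised population mean

n = len(data)
t = (mean(data) - mu0) / (stdev(data) / math.sqrt(n))    # t = (x_bar - mu0) / (S / sqrt(n))

t_tab = stats.t.ppf(0.975, df=n - 1)                     # two-tailed 5% table value for n - 1 d.f.
print(f"t = {t:.3f}, table value = {t_tab:.3f}")
print("reject H0" if abs(t) > t_tab else "accept H0")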
Te"ting t,e "ignifiane of t,e #ifferene $et.een t.o "ample mean:-
To test the significance between the differences of two samples means taken
from two populations with same variance /Fnknown0%
, 30 ,
2 1
< n n
test statistic% t is
defined as follows*








+








− − −
=
2
2
2
1
2
1
2 1
2 1 ) ( ) (
n
S
n
S
x x
t
µ µ
where x̄1 and S1² are the mean and variance of a sample of size n1 drawn from the first normal population with mean μ1 and variance σ1². Likewise, x̄2 and S2² are the mean and variance of a sample of size n2 drawn from the second population with mean μ2 and variance σ2².
If |t calculated| > |t tabulated|, we reject our null hypothesis at the 5% level of significance; otherwise we accept it.
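A minimal Python sketch of this two-sample test with made-up data, taking μ1 − μ2 = 0 under H0. The notes do not state the degrees of freedom, so n1 + n2 − 2 (the usual choice when the variances are assumed equal) is used for the table value, and SciPy is assumed available only for that look-up:

import math
from statistics import mean, variance       # variance uses the divisor n - 1
from scipy import stats                     # assumed available, used only for the table value

x1 = [22.1, 23.4, 21.8, 22.9, 23.1, 22.5, 21.9, 23.0]   # made-up sample from the first population
x2 = [20.9, 21.5, 22.0, 21.2, 20.8, 21.7, 21.3, 21.0]   # made-up sample from the second population
n1, n2 = len(x1), len(x2)

# t = [(x1_bar - x2_bar) - (mu1 - mu2)] / sqrt(S1^2/n1 + S2^2/n2), with mu1 - mu2 = 0 under H0
t = (mean(x1) - mean(x2)) / math.sqrt(variance(x1) / n1 + variance(x2) / n2)

df = n1 + n2 - 2
t_tab = stats.t.ppf(0.975, df=df)           # two-tailed 5% table value
print(f"t = {t:.3f}, table value = {t_tab:.3f}")
print("reject H0" if abs(t) > t_tab else "accept H0")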
Te"ting of "ignifiane of (ariane &Small Sample"':-
F-te"t or Sne#eor4" F- #i"tri$ution:-
-et there be two independent random samples of si)es
1
n
and
2
n
from two
normal populations with variances
2
1
σ and .
2
2
σ respectively. 7urther let
s1² = (1/(n1 − 1)) Σ (x1i − x̄1)²  and  s2² = (1/(n2 − 1)) Σ (x2j − x̄2)²
be the variances of the first sample and the second sample respectively. Then the F-statistic is defined as:
F = s1²/s2²,  if s1² > s2²,  with (v1, v2) degrees of freedom, where v1 = n1 − 1 and v2 = n2 − 1.
If the calculated value of F is greater than F0.05 for (v1, v2) d.f., then the ratio is considered significant at the 5% level. If the calculated value of F is less than F0.05 for (v1, v2) d.f., then the ratio is considered not significant, which means the two samples may be regarded as having been taken from populations having the same variance.
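A minimal Python sketch of the F-test with made-up data; SciPy is assumed available only for the table value:

from statistics import variance             # divisor n - 1, matching s1^2 and s2^2 above
from scipy import stats                     # assumed available, used only for the table value

x1 = [12.4, 13.1, 11.8, 12.9, 13.5, 12.2, 11.9, 13.0, 12.6, 12.8]   # made-up sample 1
x2 = [14.0, 14.2, 13.9, 14.1, 14.3, 13.8, 14.0, 14.2]               # made-up sample 2

s1_sq, s2_sq = variance(x1), variance(x2)
# Put the larger variance in the numerator so that F >= 1.
if s1_sq >= s2_sq:
    F, v1, v2 = s1_sq / s2_sq, len(x1) - 1, len(x2) - 1
else:
    F, v1, v2 = s2_sq / s1_sq, len(x2) - 1, len(x1) - 1

F_tab = stats.f.ppf(0.95, dfn=v1, dfd=v2)   # 5% table value for (v1, v2) d.f.
print(f"F = {F:.3f}, table value = {F_tab:.3f}")
print("ratio significant" if F > F_tab else "ratio not significant")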
χ²-Test (Chi-square test):-
Conditions regarding application of the Chi-square test:-
1. The total frequency should not be less than 50; otherwise the approximation to normality does not hold.
2. No expected frequency should be less than 5. If it is, some adjustments are made (e.g. pooling of adjacent classes).
3. The sum of the observed frequencies must be equal to the sum of the expected frequencies.
4. The selection of the sample should be on a random basis.
χ²-Test for comparison of theoretical and observed proportions or frequencies:-
1. Null hypothesis: there is no difference between the observed and expected frequencies (or proportions).
2. The N observed frequencies (or the total observed proportion) are classified into k cells.
3. The expected frequency for each cell is computed under the null hypothesis H0, based on the given rule.
4. The χ²-statistic (test statistic) is computed by the formula:
χ² = Σ (Oi − Ei)² / Ei  (the sum taken over the k cells, i = 1, 2, ......, k),
where O1, O2, O3, ......, Ok are the observed frequencies and E1, E2, ......, Ek are the corresponding expected frequencies.
The calculated value of χ² is compared with the table value of χ² at a given level of significance.
Conclusion:
i. If χ²(calculated) > χ²(0.05), the null hypothesis H0 is rejected at the 5% level of significance, i.e., the two sets of frequencies are dissimilar.
ii. If χ²(calculated) < χ²(0.05), the null hypothesis H0 is accepted and the difference is said to be not significant, i.e., the observed and theoretical sets of frequencies are nearly alike.
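A minimal Python sketch of this comparison with made-up frequencies (equal expected frequencies under H0); SciPy is assumed available only for the table value:

from scipy import stats                      # assumed available, used only for the table value

# Made-up example: 120 customers classified into k = 4 branches;
# under H0 every branch is equally preferred, so each expected frequency is 120 / 4 = 30.
observed = [38, 25, 29, 28]
expected = [sum(observed) / len(observed)] * len(observed)

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))   # sum of (O - E)^2 / E

df = len(observed) - 1                       # k - 1 degrees of freedom in this simple case
chi_tab = stats.chi2.ppf(0.95, df=df)        # 5% table value
print(f"chi-square = {chi_sq:.3f}, table value = {chi_tab:.3f}")
print("frequencies are dissimilar" if chi_sq > chi_tab else "observed and expected frequencies agree")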
χ²-Test of Goodness of Fit:-
(Comparison of observed frequencies and theoretical frequencies when the theoretical frequencies are based on a probability distribution.)
The procedure for the χ²-test of goodness of fit is as follows:-
1. Write the null hypothesis: there is no difference between the observed and expected frequencies (the latter based on some theoretical frequency distribution).
2. Compute χ² = Σ (Oi − ei)² / ei  (the sum taken over the k cells).
3. Compare the calculated value of χ² with the table value of χ².
If the calculated value of χ² is less than the table value, the difference is not significant; this means there is no significant difference between the two distributions, and in this situation we say the fit is good. When the calculated value of χ² is more than the table value, the difference is significant, and in this situation we say the fit is not good.
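A minimal Python sketch of a goodness-of-fit test, fitting a Poisson distribution to made-up count data; SciPy is assumed available only for the table value:

import math
from scipy import stats                                     # assumed available, used only for the table value

# Made-up data: number of telephone calls per minute observed over N = 100 minutes.
counts = {0: 25, 1: 34, 2: 24, 3: 11, 4: 6}                 # observed frequencies O_i (last cell = "4 or more")
N = sum(counts.values())
lam = sum(k * f for k, f in counts.items()) / N             # Poisson mean estimated from the data

def poisson_pmf(k):
    return math.exp(-lam) * lam ** k / math.factorial(k)

expected = {k: N * poisson_pmf(k) for k in range(4)}        # expected frequencies e_i = N * P(X = k)
expected[4] = N - sum(expected.values())                    # remaining expected frequency goes to the last cell

chi_sq = sum((counts[k] - expected[k]) ** 2 / expected[k] for k in counts)

df = len(counts) - 1 - 1                                    # k cells, minus 1, minus 1 estimated parameter
chi_tab = stats.chi2.ppf(0.95, df=df)
print(f"chi-square = {chi_sq:.2f}, table value = {chi_tab:.2f}")
print("fit is not good" if chi_sq > chi_tab else "fit is good")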
A""oiation of attri$ute":- An attribute is a 'ualitative characteristics. @ne can
only feel the presence and absence of this characteristics while observing
individual or items under consideration.
@+
An attribute is a characteristic of an individual which cannot be measured
numerically. $.g% honesty% beauty% marital status etc.
Association of two attributes:- Association of two attributes measures the degree of relationship between two phenomena whose size cannot be measured, but for which one can only determine the presence or absence of a particular attribute or quality.
Or
If a relation exists between two or more attributes, they are said to be associated.
There are three types of association of attributes:
i. Positive  ii. Negative  iii. Independent.
Order of classes:-
N : frequency of zero order.
(A), (B), (α), (β) : frequencies of the first order.
(AB), (Aβ), (αB), (αβ) : frequencies of the second order.
Class frequencies:- The attributes may be positive or negative. If the attribute is present, it is termed a positive class;
e.g.,
i. If 'A' represents 'Male' then 'α' would represent 'Female'.
ii. If 'B' represents 'Blind', then 'β' would mean 'Non-Blind'.
Two attributes can be combined. The combination of attributes is represented by grouping the letters together, such as (AB), (Aβ), (αB), (αβ).
In the language of the above example,
(AB) stands for 'Male and Blind',
(Aβ) stands for 'Male and Non-blind',
(αB) stands for 'Female and Blind',
(αβ) stands for 'Female and Non-blind'.
N, (A), (B), (AB), etc. are positive classes;
(α), (β), (αβ), etc. are negative classes;
(AB) and (αβ), (Aβ) and (αB), etc. are pairs of contrary classes;
here (Aβ) stands for 'Male and Non-blind' and
(αβ) stands for 'Female and Non-blind'.
"ine S'uare Table*