
S.Y.B.Sc.

STATISTICS : SEM 3: PAPER 2


UNIT 1 :SIMPLE RANDOM SAMPLING

Terminologies:
1) Population: The group of individuals under study is called population.
Population may be finite or infinite.
e.g. Finite populations:
i) Total number of workers in a factory.
ii) Total number of books in a library.
Infinite populations:
iii) Number of stars in the sky.
iv) The set of real numbers.
Each element of the population is called population unit.
The total number of units in the population is called population size and it is
denoted by ‘N’
2) Sample: A fraction of the population is called sample.
The number of units in the sample is called sample size and it is denoted
by ‘n’.
3) Elementary unit: The smallest component into which the population can
be divided is called an elementary unit.
4) Sampling unit: An elementary unit, or a group of elementary units, of the
population that is convenient for the purpose of sampling and on
which observations are made is called a sampling unit.
e.g. i) To know average income per family, the head of the family is a
sampling unit.
ii) To know the average yield of rice, each farm owner’s yield of rice is a
sampling unit.
5) Sampling frame: In a sampling procedure, each sampling unit is assigned
a number for the purpose of identification. The resulting list or map is called
the sampling frame. Thus a sampling frame consists of the units with their
identification numbers, and it should represent the population reasonably well.
e.g. i) List of voters
ii) List of households
iii)List of farmers
iv) Map of area
6) Parameter: A statistical constant of the population is called a parameter.
e.g. Population mean ($\bar{Y}$ or $\mu$), population variance ($\sigma^2$) etc.
7) Statistic: A statistic is a function of the sample values.
e.g. Sample mean ($\bar{x}$), sample variance ($s^2$) etc.
8) Estimator: An estimator is a statistic which is a function of sample
values. It is used to estimate population value(Parameter).
Since statistic is a random variable, an estimator is also a random variable.
The numeric value of an estimator is called as an estimate.
9) Bias: The bias of an estimator is the difference between its expected value
and the true value of the parameter being estimated.
E( t ) = θ + C
C = E( t ) - θ ; where ‘C’ is called as bias of an estimator.
10)Unbiased estimator: An estimator t is said to be unbiased estimator of θ
if E( t ) = θ
11) Mean square error: It is the expected (average) squared difference between
the estimator and the parameter. It is denoted by MSE.
If ‘t’ is a statistic and θ is a parameter, the MSE is given by:
MSE = $E[(t - \theta)^2]$
12) Standard error: The standard deviation of the sampling distribution of a
statistic is called its standard error and is denoted by S.E.
e.g. Suppose $x \sim N(\mu, \sigma^2)$
Then $\bar{x} \sim N(\mu, \sigma^2/n)$
∴ S.E.$(\bar{x}) = \sqrt{\sigma^2/n} = \sigma/\sqrt{n}$

Q What is (i) sample survey (ii) census survey?


Methods of collecting data
Data on population can be collected in two ways:
1) Census method or Complete enumeration method
2) Sample survey

1) Census method:
In this method data is collected from each and every unit of the population.
e.g. Population census of India: The population census of our country is taken at
10-year intervals. The latest census was taken in 2011. The first census was
taken in 1871-72.
Merits of census method:
i) The results will be more accurate and reliable.
ii)Intensive study is possible.
iii)The data collected may be used for various surveys and analysis.
Limitations:
i) It requires a large number of enumerators and it is a costly method.
ii) It requires more resources such as money, labour, time, energy etc.
iii) When the population is infinite census method is not advisable.
iv) If the study is of destructive type, census method is not feasible.
2) Sample survey:
In this method data is collected from the fraction of population.
e.g. i) Economic surveys, agricultural surveys etc. are conducted regularly.
ii) Need based surveys like consumer satisfaction for a particular product
are conducted when some need arises.
Limitations:
i) Sampling should be done by qualified and experienced persons; otherwise the
information obtained will be unreliable.
ii) A sample may, by chance, contain extreme values rather than a representative mix of values.
iii) Sampling errors may occur.
Q State the advantages of sample survey over census survey.

The advantages of sample survey over census method are:-

i) Sampling saves time and labour.
ii) It reduces cost in terms of money and man-hours.
iii) Sampling can give more accurate results, since the smaller workload permits
better-trained enumerators and closer supervision.
iv) It has greater scope.
v) If the population is too large or hypothetical, or if measurement destroys
the unit, sampling is the only method that can be used.

Q Distinguish between census survey and sample survey.(HW)

Q State principal steps in a sample survey.

Principal steps in a sample survey:


1) State objectives clearly: The objectives of the survey have to be clearly
defined and well understood by the person planning to conduct it.
2) Decide the population to be sampled: Based on the objectives of the
survey, decide the population from which data can be collected.
3) Decide on methods of measurement for data collection:
Data can be collected by any one of the following method-
i) Physical observations and measurements: The surveyor contacts the
respondent personally through meeting. He observes the sampling unit and
records the data.
ii) Personal interview: The surveyor with the help of questionnaire collects
the data from the respondent through personal interview.
iii) Telephone interview: The surveyor collects the data by asking the
questions over the telephone to the respondent.
iv) Mail enquiry: The well prepared questionnaire is sent to the respondents
through postal mail, email etc. The respondents are requested to fill up the
questionnaires and send it back.
v) Web based enquiry: The data is collected through internet based web
pages. The questionnaires are to be sent to the respondents through link. By
clicking on the link, the respondent’s answers are recorded.
vi) Recorded information: Data is collected from the already recorded
information.
4) Decide on type of sampling that needs to be done: Divide population in
parts. Identify sampling units.
5)Select sample relevant to cost and time: The size of the sample needs to
be specified for the given sampling plan. Also the method and plan adopted
for drawing a representative sample should also be noted.
6)Pretest questionnaire on small scale to eliminate troubles: Pilot survey
is the procedure of trying questionnaire and field methods on small scale. It
helps in assessing the suitability of questions, clarity of instructions,
performance of enumerators and the cost and time involved in the actual
survey.
7) Organize fieldwork: Provide proper training to the surveyors regarding
procedures, and plan for handling non-response, missing observations etc.
8) Summarize and analyze data: Based on the objectives of the survey, a
suitable statistical tool is chosen which can answer the relevant questions.
9)Write report on findings.
Characteristics of a good questionnaire
i)Number of questions should be minimum.
ii)Questions should be short and simple.
iii) Questions should be in logical orders, moving from easy to more difficult
questions.
iv) Questions of Yes/No type are preferable.
v) Personal questions and questions which require memory power and
calculations should be avoided.
vi) Questions should be carefully framed so as to cover the entire scope of the
survey.
vii) The wording of the questions should be proper without hurting the
feelings.
viii) As far as possible confidential information should not be sought.
ix)Physical appearance of the questionnaire should be attractive, sufficient
space should be provided for answering each question.
Q Define sampling errors and non-sampling errors.
Sampling errors and Non-sampling errors
Two types of errors can occur in a sample survey:
(1) Sampling errors
(2) Non-sampling errors.
Sampling errors: These errors occur because:
i) The sample selected for observation may not be representative of the population.
ii) Only a part of the population is studied, so it cannot supply full information about
the population. There may therefore be a difference between statistics and parameters.
Non-sampling errors: These errors occur during the planning and collection of the actual
data, e.g. human errors in problem identification, in the method or
procedure used etc.

Q Define various types of sampling methods.


Types of sampling
Sampling procedures are classified as:
(1) Probability sampling
(2) Non- Probability sampling.
Probability sampling(Random sampling):
Under this procedure selection of units from the population is made with
some known probabilities. The commonly used methods are:
i) Simple random sampling
ii)Stratified random sampling
iii)Systematic random sampling
iv) Cluster sampling
v)Two stage sampling
vi) Probability proportional to size (PPS) sampling
Non- Probability sampling:
Under this procedure discretion is used to select representative units from the
population. This method is not used in general because of bias of the
enumerator. However if the enumerator is experienced and expert,this
method may yield valuable results. The commonly used methods are:
i)Purposive sampling
ii) Convenience sampling
iii)Quota sampling.

Q Distinguish between Probability and non- probability sampling.(HW)


Q State advantages and disadvantages of Non- probability sampling.(HW)
Q Write short note on CSO. (HW)
Q Write short note on NSSO. (HW)

CSO: Central Statistical Office. It coordinates statistical activities among
the various statistical agencies at the state level.

NSSO: National Sample Survey Office. It is responsible for conducting
large-scale sample surveys in diverse fields on an all-India basis.

Simple Random Sampling

Simple Random sampling(SRS): Defn-


Simple random sampling is the sampling technique in which each unit of
the population has an equal and independent chance of being included in the
sample.
Note: i) This method is probability sampling.
ii) The selection of unit is free from any personal bias.
iii) If the selection is random & sample is sufficiently large, it will
represent the population.

There are two types of simple random sampling:


(1) Simple random sampling without replacement(SRSWOR)
(2) Simple random sampling with replacement(SRSWR)

Simple random sampling without replacement(SRSWOR)


In this method, once a unit is selected in the sample, it is not considered
again for the next selection. Thus a unit may be selected 0 or 1 times in the
sample.
e.g. Suppose there are 4 units in the population, i.e. N = 4
Population units are denoted by: Y1, Y2, Y3, Y4.
Suppose a sample of size 2 is selected using SRSWOR, i.e. n = 2.
Number of possible samples = $\binom{N}{n} = \binom{4}{2} = 6$
The possible samples are: (Y1 , Y2), (Y1 , Y3), (Y1 , Y4), (Y2 , Y3), (Y2 , Y4), (Y3 , Y4)

Simple random sampling with replacement(SRSWR)


In this method, once a unit is selected in the sample, it is considered again
for the next selection. Thus a unit may be selected more than once in the
sample.
e.g. Suppose there are 4 units in the population, i.e. N = 4
Population units are denoted by: Y1, Y2, Y3, Y4.
Suppose a sample of size 2 is selected using SRSWR, i.e. n = 2.
Number of possible samples = $N^n = 4^2 = 16$
The possible samples are: (Y1 , Y1), (Y1 , Y2), (Y1 , Y3), (Y1 , Y4),
(Y2 , Y1), (Y2 , Y2), (Y2 , Y3), (Y2 , Y4),
(Y3 , Y1), (Y3 , Y2), (Y3 , Y3), (Y3 , Y4),
(Y4 , Y1), (Y4 , Y2), (Y4 , Y3), (Y4 , Y4).

• Unless it is specifically stated (SRS) means SRSWOR.
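
The two counts above can be checked by listing the samples mechanically. The following is only an illustrative sketch in Python (the notes themselves do not use any software); the variable names are assumptions made for the example.

```python
from itertools import combinations, product

units = ["Y1", "Y2", "Y3", "Y4"]   # population labels from the example (N = 4)
n = 2                               # sample size

# SRSWOR: unordered samples with no repeated unit, C(N, n) in all
srswor_samples = list(combinations(units, n))

# SRSWR: ordered samples in which a unit may repeat, N**n in all
srswr_samples = list(product(units, repeat=n))

print(len(srswor_samples), srswor_samples)
print(len(srswr_samples), srswr_samples)
```

The first list has 6 pairs and the second has 16, matching the samples written out above.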

Methods of selection of a simple random sample


(1) Lottery method
(2) Using Random Number table
(3) Using Calculators or Computers.

Lottery method:
In this method :
Step 1: All the units of the population are numbered and are written on
separate slips of paper of same size, shape and colour.
Step 2: They are folded and mixed up in a container.
Step 3: The required number of slips are selected at random for the desired
sample size.
Step 4: If simple random sample without replacement is to be drawn then
the slips are selected one after another.
Step 5: If simple random sample with replacement is to be drawn then after
noting the number on the selected slip, the slip is replaced in the container
before next selection.

Using Random Number(R.N.) table:


A R.N. table is a series of digits (0 to 9) arranged randomly in rows and
columns. The table contains 5-digit numbers, arranged in rows and columns.

In this method:
Suppose there are ‘N’ units in the population and we wish to select a
sample of size ‘n’ using SRS. The steps are as follows:
Step 1: Use random numbers with as many digits as N has. If we have to select
a sample from a population of size N = 10, consider R.N.’s from 00 to 99 (i.e.
two-digit R.N.’s). If we have to select a sample from a population of size
N = 100, consider R.N.’s from 000 to 999 (i.e. three-digit R.N.’s).
Step 2: Select the numbers from the R.N. table. We may start at any place and
may go in any direction, such as column-wise or row-wise, in the R.N. table.
Step 3: From the selected R.N.’s, consider only the random numbers up to the
point where the maximum number of complete cycles of ‘N’ numbers is covered.
The remaining R.N.’s are rejected, so that every unit has the same chance of
selection.
Step 4: If the selected R.N. is less than or equal to N, then the unit bearing
that number is taken in the sample.
Step 5: If the selected R.N. is greater than N, then find the remainder on
dividing it by ‘N’. The remainder is treated as the R.N. and the unit
corresponding to the remainder is taken in the sample. If the remainder is 0,
then select the Nth unit in the sample.
Step 6: If the sample is to be drawn by SRSWOR and any unit is
repeated, ignore the repeated value, take the next R.N. and continue the
procedure.
Using Calculators or Computers:
In this method, we use the RND command. Each time it is used we
get a new R.N. The sample is then selected in the same way as when
using the R.N. table.
Ex: Using the random numbers given below, select a sample of size 5 from a
population of 140 units.
Random numbers: 089, 180, 992, 669, 000, 320, 012, 560, 800.
Soln: Given: N = 140, n = 5
We have 140 × 7 = 980, so only 7 complete cycles of 140 numbers fit within the
1000 three-digit numbers. Hence we consider the numbers from 001 to 980 for
sampling and reject the numbers 981 to 999 and 000.

Sr. No.   Random Number   Remainder   Sample unit
1         089             89          89
2         180             40          40
3         992             Rejected    -
4         669             109         109
5         000             Rejected    -
6         320             40          40
7         012             12          12
8         560             0           140
9         800             100         100

SRS without replacement: Select population units bearing numbers 89,
40, 109, 12, 140.
SRS with replacement: Select population units bearing numbers 89, 40,
109, 40, 12.
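
The remainder method used in this example can be sketched in Python as follows. This is a minimal illustration, not part of the original notes: the function name `select_units` and its arguments are hypothetical, and reading an all-zero random number as 10^d (so that 000 counts as 1000 and is rejected here) is an assumption chosen to agree with the worked example.

```python
def select_units(random_numbers, N, n, replacement=False):
    """Pick n unit labels (1..N) from random numbers by the remainder method."""
    digits = len(str(N))                    # random numbers have as many digits as N
    limit = (10 ** digits // N) * N         # last number of the final complete cycle, e.g. 980
    sample = []
    for rn in random_numbers:
        value = rn if rn != 0 else 10 ** digits   # assumption: read 000 as 1000
        if value > limit:
            continue                        # beyond the complete cycles: reject
        unit = value % N or N               # remainder 0 stands for the Nth unit
        if not replacement and unit in sample:
            continue                        # SRSWOR: ignore a repeated unit
        sample.append(unit)
        if len(sample) == n:
            break
    return sample

rns = [89, 180, 992, 669, 0, 320, 12, 560, 800]   # random numbers from the example
wor = select_units(rns, N=140, n=5)
wr = select_units(rns, N=140, n=5, replacement=True)
print(wor)
print(wr)
```

With the random numbers of the example, the two calls reproduce the SRSWOR and SRSWR selections listed above.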

Theorems
(I) Probability of drawing a unit
(A) SRSWOR
Show that the probability that a specified unit is selected at the kth draw is $\frac{1}{N}$.

Proof:
Let $E_k$ = the event that a specified unit is selected at the kth draw.

∴ $P(E_1)$ = P[a specified unit is selected at the 1st draw]
$P(E_1) = \frac{1}{N}$

$P(E_2)$ = P[a specified unit is selected at the 2nd draw]
= P[a specified unit is not selected at the 1st draw and it is selected at the 2nd draw]
= $\left(1 - \frac{1}{N}\right) \times \frac{1}{N-1} = \frac{N-1}{N} \times \frac{1}{N-1}$
$P(E_2) = \frac{1}{N}$

$P(E_k)$ = P[a specified unit is selected at the kth draw]
= P[a specified unit is not selected in the first (k-1) draws and it is selected at the kth draw]
= $\frac{N-1}{N} \times \frac{N-2}{N-1} \times \frac{N-3}{N-2} \times \cdots \times \frac{N-(k-1)}{N-(k-2)} \times \frac{1}{N-(k-1)}$

$P(E_k)$ = P(a specified unit is selected at the kth draw) = $\frac{1}{N}$

(B) SRSWR
Show that the probability that a specified unit is selected at the kth draw is $\frac{1}{N}$.

Proof:
In this method the population size remains the same before each selection.
∴ the probability that a specified unit is selected at any draw is $\frac{1}{N}$.

P(a specified unit is selected at the kth draw) = $\frac{1}{N}$

(II) Probability of drawing a unit in the sample


(A) SRSWOR
Show that the probability that a specified unit is selected in the sample
of size ‘n’ is $\frac{n}{N}$.

Proof:
Let $E_k$ = the event that a specified unit is selected at the kth draw; k = 1, 2, ..., n

∴ $P(E_1)$ = P[a specified unit is selected at the 1st draw]
$P(E_1) = \frac{1}{N}$

$P(E_2)$ = P[a specified unit is selected at the 2nd draw]
= P[a specified unit is not selected at the 1st draw and it is selected at the 2nd draw]
= $\left(1 - \frac{1}{N}\right) \times \frac{1}{N-1} = \frac{N-1}{N} \times \frac{1}{N-1}$
$P(E_2) = \frac{1}{N}$

$P(E_k)$ = P[a specified unit is selected at the kth draw]
= P[a specified unit is not selected in the first (k-1) draws and it is selected at the kth draw]
= $\frac{N-1}{N} \times \frac{N-2}{N-1} \times \frac{N-3}{N-2} \times \cdots \times \frac{N-(k-1)}{N-(k-2)} \times \frac{1}{N-(k-1)}$

$P(E_k)$ = P(a specified unit is selected at the kth draw) = $\frac{1}{N}$ ; k = 1, 2, ..., n

In SRSWOR a unit can be selected at most once, so the events $E_1, E_2, \ldots, E_n$ are mutually exclusive.

Required probability = P(specified unit is selected at the 1st or 2nd ... or nth draw)
= $P[E_1 \cup E_2 \cup \cdots \cup E_n]$
= $P(E_1) + P(E_2) + \cdots + P(E_n)$
= $\frac{1}{N} + \frac{1}{N} + \cdots + \frac{1}{N}$ , n times

P(a specified unit is selected in the sample of size ‘n’) = $\frac{n}{N}$
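
As a numerical check of the result n/N for SRSWOR, one can list every possible sample of a small population and count how often each unit appears. The sketch below is illustrative only; the choice N = 5, n = 2 and the variable names are assumptions, not part of the notes.

```python
from fractions import Fraction
from itertools import combinations

N, n = 5, 2
units = range(1, N + 1)
samples = list(combinations(units, n))   # all C(N, n) equally likely SRSWOR samples

# Proportion of samples that contain each unit (exact arithmetic)
inclusion = {u: Fraction(sum(u in s for s in samples), len(samples)) for u in units}
print(inclusion)
```

Every unit appears in the same proportion n/N of the samples, as the theorem states.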

(B) SRSWR
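Show that the probability that a specified unit is selected at any given draw
of a sample of size ‘n’ is $\frac{1}{N}$.

Proof:
Let $E_k$ = the event that a specified unit is selected at the kth draw; k = 1, 2, ..., n
At every draw there are N units in the population.
∴ $P(E_k) = \frac{1}{N}$ ; k = 1, 2, ..., n
Note: In SRSWR a unit can be selected more than once, so the events $E_1, E_2, \ldots, E_n$
are independent but not mutually exclusive. Hence the probability that a specified unit
appears at least once in the sample is $1 - \left(1 - \frac{1}{N}\right)^n$, while the expected
number of times it appears in the sample is $\frac{n}{N}$.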

(III) Probability of drawing a sample of size n


(A) SRSWOR
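Show that the probability of drawing a specified sample of size ‘n’ is $\frac{1}{\binom{N}{n}}$.

Proof:
Consider a specified set of n units.
Let $E_k$ = the event that one of the specified units not yet drawn is selected at the kth draw; k = 1, 2, ..., n

∴ $P(E_1)$ = P[one of the n specified units is selected at the 1st draw]
$P(E_1) = \frac{\binom{n}{1}}{\binom{N}{1}} = \frac{n}{N}$

$P(E_1 \cap E_2) = P(E_1) \times P(E_2 \mid E_1) = \frac{n}{N} \times \frac{n-1}{N-1}$

Similarly,
$P(E_1 \cap E_2 \cap \cdots \cap E_n) = \frac{n}{N} \times \frac{n-1}{N-1} \times \cdots \times \frac{1}{N-(n-1)}$
$= \frac{n!}{N \times (N-1) \times \cdots \times (N-(n-1))} \times \frac{(N-n)!}{(N-n)!}$
$= \frac{n!\,(N-n)!}{N!} = \frac{1}{\frac{N!}{n!\,(N-n)!}} = \frac{1}{\binom{N}{n}}$

P(drawing a specified sample of size ‘n’) = $\frac{1}{\binom{N}{n}}$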

(B) SRSWR
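Show that the probability of drawing a specified (ordered) sample of size ‘n’ is $\frac{1}{N^n}$.

Proof:
Let $E_k$ = the event that the specified unit is selected at the kth draw; k = 1, 2, ..., n
Here at every draw there are N units in the population.
∴ $P(E_k) = \frac{1}{N}$ ; k = 1, 2, ..., n
Since the draws are independent,
$P(E_1 \cap E_2 \cap \cdots \cap E_n) = \frac{1}{N} \times \frac{1}{N} \times \cdots \times \frac{1}{N} = \frac{1}{N^n}$

P(drawing a specified sample of size ‘n’) = $\frac{1}{N^n}$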

Simple Random Sampling for variables


Notations:
Suppose there are ‘N’ units in the population.
Let ‘Y’ be the characteristic under study.
Let Y1, Y2, ……….., YN be the population values.
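∴ $Y_i$ : the value of the characteristic for the ith population unit; i = 1, 2, ..., N
1) $Y = \sum_{i=1}^{N} Y_i$ = Population total
2) $\bar{Y} = \frac{1}{N}\sum_{i=1}^{N} Y_i$ = Population mean
3) $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (Y_i - \bar{Y})^2$ = Population variance
i.e. $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} Y_i^2 - \bar{Y}^2$
4) $S^2 = \frac{1}{N-1}\sum_{i=1}^{N} (Y_i - \bar{Y})^2$ = Population mean square
i.e. $S^2 = \frac{1}{N-1}\left[\sum_{i=1}^{N} Y_i^2 - N\bar{Y}^2\right]$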

Suppose a sample of size ‘n’ is selected using SRS.


n : Number of units in the sample.
Let $y_1, y_2, \ldots, y_n$ be the sample values.
$y_i$ : the value of the characteristic for the ith sample unit; i = 1, 2, ..., n

5) $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$ = Sample mean
6) $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2$ = Sample mean square
i.e. $s^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} y_i^2 - n\bar{y}^2\right]$

We have,
$N\sigma^2 = \sum_{i=1}^{N} (Y_i - \bar{Y})^2$ ---- [From 3)]
$(N-1)S^2 = \sum_{i=1}^{N} (Y_i - \bar{Y})^2$ ---- [From 4)]
7) ∴ $N\sigma^2 = (N-1)S^2$
Ex: Let $Y_1 = 3, Y_2 = 5, Y_3 = 6, Y_4 = 2, Y_5 = 11$, i.e. N = 5
Suppose we wish to select a sample of size 2 from this population, i.e. n = 2
There are $\binom{5}{2} = 10$ possible samples.
Suppose units 3 and 5 are selected in the sample.
∴ Sample values ≡ $(y_1, y_2)$ ≡ $(Y_3, Y_5)$
∴ Sample mean = $\bar{y} = \frac{y_1 + y_2}{2} = \frac{Y_3 + Y_5}{2}$
$\bar{y} = \frac{0 \cdot Y_1 + 0 \cdot Y_2 + 1 \cdot Y_3 + 0 \cdot Y_4 + 1 \cdot Y_5}{2}$
$\bar{y} = \frac{1}{2}\sum_{i=1}^{N} a_i Y_i$ ---- where $a_i$ = 1 or 0

In general,
$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i = \frac{1}{n}\sum_{i=1}^{N} a_i Y_i$ ---- (*)
where $a_i = 1$ if the ith unit is selected in the sample,
$a_i = 0$ if the ith unit is not selected in the sample.
Note:
1) $V(X+Y) = V(X) + V(Y) + 2\,cov(X,Y)$
2) $cov(X,Y) = E[(X - E(X))(Y - E(Y))]$
3) $cov(X,Y) = E(XY) - E(X)\,E(Y)$
4) $v(aX + b) = a^2\, v(X)$
5) $\left(\sum_{i=1}^{N} X_i\right)^2 = \sum_{i=1}^{N} X_i^2 + \sum_{i \ne j} X_i X_j = \sum_{i=1}^{N} X_i^2 + 2\sum_{i<j} X_i X_j$

Result 1:
In SRSWOR, sample mean is an unbiased estimate of the population mean.
OR
In SRSWOR, show that E(y̅) = ̅Y
Proof:
Consider a population of size N with values $Y_1, Y_2, \ldots, Y_N$.
From this population a sample of size n is drawn by the method of
SRSWOR with values $y_1, y_2, \ldots, y_n$.
Then Population mean = $\bar{Y} = \frac{1}{N}\sum_{i=1}^{N} Y_i$
Sample mean = $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$

This can be written as
$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i = \frac{1}{n}\sum_{i=1}^{N} a_i Y_i$

where
$a_i$ = 1 ; if the ith unit from the population is selected in the sample,
= 0 ; if the ith unit from the population is not selected in the sample.
Then,
$E(a_i) = \sum a_i \times P(a_i)$ = 1 × P(ith unit is selected in the sample)
+ 0 × P(ith unit is not selected in the sample)
$E(a_i) = 1 \times \frac{n}{N} + 0 \times \left(1 - \frac{n}{N}\right) = \frac{n}{N}$

Consider LHS = $E(\bar{y}) = E\left[\frac{1}{n}\sum_{i=1}^{n} y_i\right]$
$= E\left[\frac{1}{n}\sum_{i=1}^{N} a_i Y_i\right]$
$= \frac{1}{n}\sum_{i=1}^{N} E(a_i)\, Y_i$
$= \frac{1}{n}\sum_{i=1}^{N} \frac{n}{N}\, Y_i$ ---- [ $E(a_i) = \frac{n}{N}$ ]
$E(\bar{y}) = \frac{1}{N}\sum_{i=1}^{N} Y_i = \bar{Y}$ = RHS

∴ $E(\bar{y}) = \bar{Y}$ ---- (1)

∴ Sample mean $\bar{y}$ is an unbiased estimate of population mean $\bar{Y}$,
i.e. $\hat{\bar{Y}} = \bar{y}$
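
Result 1 can be verified numerically on the five-unit population used in the earlier example (3, 5, 6, 2, 11) by averaging the sample mean over all possible SRSWOR samples. The code is a small illustrative sketch; exact fractions are used so that the comparison is not affected by rounding.

```python
from fractions import Fraction
from itertools import combinations

Y = [3, 5, 6, 2, 11]          # population values from the example, N = 5
N, n = len(Y), 2

pop_mean = Fraction(sum(Y), N)

# Each of the C(N, n) samples is equally likely under SRSWOR
sample_means = [Fraction(sum(s), n) for s in combinations(Y, n)]
expected_mean = sum(sample_means) / len(sample_means)

print(pop_mean, expected_mean)
```

Both printed values are 27/5, i.e. the average of the 10 sample means equals the population mean.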
Result 2:
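In SRSWOR, the variance of the sample mean is given by:
$v(\bar{y}) = \left(\frac{N-n}{Nn}\right) S^2 = \left(\frac{1}{n} - \frac{1}{N}\right) S^2$

Proof:
$v(\bar{y}) = v\left(\frac{1}{n}\sum_{i=1}^{n} y_i\right) = v\left(\frac{1}{n}\sum_{i=1}^{N} a_i Y_i\right)$
$= \frac{1}{n^2}\, v\left(\sum_{i=1}^{N} a_i Y_i\right)$

$v(\bar{y}) = \frac{1}{n^2}\left[\sum_{i=1}^{N} Y_i^2\, v(a_i) + 2\sum_{i<j} Y_i Y_j\, cov(a_i, a_j)\right]$ ---- (2)

where
$a_i$ = 1 ; if the ith unit from the population is selected in the sample,
= 0 ; if the ith unit from the population is not selected in the sample.
Then,
$E(a_i) = \sum a_i \times P(a_i)$ = 1 × P(ith unit is selected in the sample)
+ 0 × P(ith unit is not selected in the sample)
$E(a_i) = 1 \times \frac{n}{N} + 0 \times \left(1 - \frac{n}{N}\right) = \frac{n}{N}$

$E(a_i^2) = \sum a_i^2 \times P(a_i)$ = $1^2$ × P(ith unit is selected in the sample)
+ $0^2$ × P(ith unit is not selected in the sample)
$E(a_i^2) = 1 \times \frac{n}{N} + 0 \times \left(1 - \frac{n}{N}\right) = \frac{n}{N}$

$v(a_i) = E(a_i^2) - [E(a_i)]^2 = \frac{n}{N} - \left(\frac{n}{N}\right)^2$
$v(a_i) = \frac{n}{N}\left(1 - \frac{n}{N}\right)$

$E(a_i a_j) = 1 \times P(a_i a_j = 1) + 0 \times P(a_i a_j = 0)$
$= P(a_i = 1, a_j = 1)$
$= P(a_i = 1) \times P(a_j = 1 \mid a_i = 1)$
$E(a_i a_j) = \frac{n}{N} \times \frac{n-1}{N-1} = \frac{n(n-1)}{N(N-1)}$

∴ $cov(a_i, a_j) = E(a_i a_j) - E(a_i)\,E(a_j)$
$= \frac{n(n-1)}{N(N-1)} - \left(\frac{n}{N}\right)^2$
$cov(a_i, a_j) = -\frac{n}{N} \times \frac{N-n}{N(N-1)}$

From Eqn (2),
$v(\bar{y}) = \frac{1}{n^2}\left[\sum_{i=1}^{N} \frac{n}{N}\left(1 - \frac{n}{N}\right) Y_i^2 + 2\sum_{i<j} Y_i Y_j \left\{-\frac{n}{N} \times \frac{N-n}{N(N-1)}\right\}\right]$
$= \frac{1}{n^2}\left[\frac{n}{N}\left(1 - \frac{n}{N}\right)\sum_{i=1}^{N} Y_i^2 - \frac{n}{N} \times \frac{N-n}{N(N-1)}\left\{2\sum_{i<j} Y_i Y_j\right\}\right]$
$= \frac{1}{n^2} \times \frac{n}{N} \times \frac{N-n}{N}\left[\sum_{i=1}^{N} Y_i^2 - \frac{1}{N-1}\left\{\left(\sum_{i=1}^{N} Y_i\right)^2 - \sum_{i=1}^{N} Y_i^2\right\}\right]$
---- [ $\left(\sum_{i=1}^{N} X_i\right)^2 = \sum_{i=1}^{N} X_i^2 + 2\sum_{i<j} X_i X_j$ ]
$= \frac{N-n}{nN^2}\left[\sum_{i=1}^{N} Y_i^2\left(1 + \frac{1}{N-1}\right) - \frac{1}{N-1}\left(\sum_{i=1}^{N} Y_i\right)^2\right]$
$= \frac{N-n}{nN^2} \times \frac{1}{N-1}\left[N\sum_{i=1}^{N} Y_i^2 - (N\bar{Y})^2\right]$ ---- [ $\bar{Y} = \frac{1}{N}\sum_{i=1}^{N} Y_i$ ]
$= \frac{N-n}{nN^2} \times \frac{N}{N-1}\left[\sum_{i=1}^{N} Y_i^2 - N\bar{Y}^2\right]$
$= \frac{N-n}{nN} \times \frac{1}{N-1}\left[\sum_{i=1}^{N} (Y_i - \bar{Y})^2\right]$

∴ $v(\bar{y}) = \left(\frac{N-n}{Nn}\right) S^2$
$v(\bar{y}) = \left(\frac{1}{n} - \frac{1}{N}\right) S^2$
$v(\bar{y}) = \left(1 - \frac{n}{N}\right)\frac{S^2}{n}$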

Note:
1) $f = \frac{n}{N}$ : Sampling fraction
2) (1 - f) is known as the finite population correction.
3) $v(\bar{y}) = (1 - f)\frac{S^2}{n}$
4) When the population size N is large relative to n, f is small and therefore (1 - f) → 1
∴ $v(\bar{y}) \approx \frac{S^2}{n}$
5) $v(\bar{y}) = v(\bar{y})_{WOR}$
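
The variance formula of Result 2 can be checked in the same way: compute the variance of the sample mean directly over all possible SRSWOR samples and compare it with (N-n)S²/(Nn). The following sketch again uses the example population 3, 5, 6, 2, 11; the variable names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations

Y = [3, 5, 6, 2, 11]          # population values from the example
N, n = len(Y), 2

Ybar = Fraction(sum(Y), N)
S2 = sum((y - Ybar) ** 2 for y in Y) / (N - 1)      # population mean square

# Variance of the sample mean over all equally likely SRSWOR samples
means = [Fraction(sum(s), n) for s in combinations(Y, n)]
var_by_enumeration = sum((m - Ybar) ** 2 for m in means) / len(means)

var_by_formula = Fraction(N - n, N * n) * S2         # (N - n) S^2 / (N n)

print(S2, var_by_enumeration, var_by_formula)
```

Here S² = 123/10 and both variances come out as 369/100.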

Result 3:
In SRSWOR, sample mean square is an unbiased estimate of the population
mean square.
OR
In SRSWOR, show that E(s2) = S2
Proof:
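Consider a population of size N with values $Y_1, Y_2, \ldots, Y_N$.
From this population a sample of size n is drawn by the method of
SRSWOR with values $y_1, y_2, \ldots, y_n$.
Then Population mean = $\bar{Y} = \frac{1}{N}\sum_{i=1}^{N} Y_i$
Population variance = $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (Y_i - \bar{Y})^2$
Population mean square = $S^2 = \frac{1}{N-1}\sum_{i=1}^{N} (Y_i - \bar{Y})^2$
Sample mean = $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$
Sample mean square = $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2$
Then $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i = \frac{1}{n}\sum_{i=1}^{N} a_i Y_i$

where
$a_i$ = 1 ; if the ith unit from the population is selected in the sample,
= 0 ; if the ith unit from the population is not selected in the sample.
Then,
$E(a_i) = \sum a_i \times P(a_i)$ = 1 × P(ith unit is selected in the sample)
+ 0 × P(ith unit is not selected in the sample)
$E(a_i) = 1 \times \frac{n}{N} + 0 \times \left(1 - \frac{n}{N}\right) = \frac{n}{N}$
$E(a_i^2) = \sum a_i^2 \times P(a_i)$ = $1^2$ × P(ith unit is selected in the sample)
+ $0^2$ × P(ith unit is not selected in the sample)
$E(a_i^2) = 1 \times \frac{n}{N} + 0 \times \left(1 - \frac{n}{N}\right) = \frac{n}{N}$

$E(s^2) = E\left[\frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2\right]$
$= \frac{1}{n-1}\, E\left[\sum_{i=1}^{n} y_i^2 - n\bar{y}^2\right]$
$= \frac{1}{n-1}\left[E\left(\sum_{i=1}^{n} y_i^2\right) - n\,E(\bar{y}^2)\right]$ ---- (1)

Now, $v(\bar{y}) = E(\bar{y}^2) - [E(\bar{y})]^2$
∴ $E(\bar{y}^2) = v(\bar{y}) + [E(\bar{y})]^2$
∴ $E(\bar{y}^2) = \frac{N-n}{N}\left(\frac{S^2}{n}\right) + \bar{Y}^2$ ---- (2)

Also,
$E\left[\sum_{i=1}^{n} y_i^2\right] = E\left[\sum_{i=1}^{N} a_i^2 Y_i^2\right]$
$= \sum_{i=1}^{N} E(a_i^2)\, Y_i^2$
$= \sum_{i=1}^{N} \frac{n}{N}\, Y_i^2$
$E\left[\sum_{i=1}^{n} y_i^2\right] = \frac{n}{N}\sum_{i=1}^{N} Y_i^2$ ---- (3)

Using (2) & (3) in (1), we get,
$E(s^2) = \frac{1}{n-1}\left[\frac{n}{N}\sum_{i=1}^{N} Y_i^2 - n\left\{\frac{N-n}{N}\left(\frac{S^2}{n}\right) + \bar{Y}^2\right\}\right]$
$= \frac{n}{n-1}\left[\frac{1}{N}\sum_{i=1}^{N} Y_i^2 - \frac{N-n}{N}\left(\frac{S^2}{n}\right) - \bar{Y}^2\right]$
$= \frac{n}{n-1}\left[\frac{1}{N}\sum_{i=1}^{N} Y_i^2 - \bar{Y}^2 - \frac{N-n}{N}\left(\frac{S^2}{n}\right)\right]$
$= \frac{n}{n-1}\left[\frac{1}{N}\sum_{i=1}^{N} (Y_i - \bar{Y})^2 - \frac{N-n}{N}\left(\frac{S^2}{n}\right)\right]$
$= \frac{n}{n-1}\left[\sigma^2 - \frac{N-n}{N}\left(\frac{S^2}{n}\right)\right]$
$= \frac{n}{n-1}\left[\frac{N-1}{N} S^2 - \frac{N-n}{N}\left(\frac{S^2}{n}\right)\right]$ ---- [ $N\sigma^2 = (N-1)S^2$ ]
$= \frac{n}{n-1}\left(\frac{S^2}{N}\right)\left[N - 1 - \frac{N-n}{n}\right]$
$= \frac{n}{n-1}\left(\frac{S^2}{N}\right)\left[\frac{nN - n - N + n}{n}\right]$
$= \frac{S^2}{(n-1)N} \times N(n-1)$

∴ $E(s^2) = S^2$
∴ Sample mean square is an unbiased estimate of the population mean
square,
i.e. $\hat{S}^2 = s^2$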
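
Result 3 can also be checked by enumeration. The sketch below averages the sample mean square s² over all SRSWOR samples of size 2 from the example population 3, 5, 6, 2, 11 and compares it with S². The helper name `mean_square` is an assumption made for this illustration.

```python
from fractions import Fraction
from itertools import combinations

def mean_square(values):
    """Sum of squared deviations from the mean, divided by (number of values - 1)."""
    m = Fraction(sum(values), len(values))
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

Y = [3, 5, 6, 2, 11]          # population values from the example
n = 2

S2 = mean_square(Y)                                   # population mean square
s2_values = [mean_square(s) for s in combinations(Y, n)]
expected_s2 = sum(s2_values) / len(s2_values)         # E(s^2) under SRSWOR

print(S2, expected_s2)
```

Both quantities equal 123/10, in agreement with E(s²) = S².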
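Estimation of variance
We have $v(\bar{y}) = \left(\frac{1}{n} - \frac{1}{N}\right) S^2$
and $E(s^2) = S^2$ ⟹ $\hat{S}^2 = s^2$
∴ Estimated variance: $\hat{v}(\bar{y}) = \left(\frac{1}{n} - \frac{1}{N}\right)\hat{S}^2 = \left(\frac{1}{n} - \frac{1}{N}\right) s^2$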
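Confidence interval for population mean
We have $E(\bar{y}) = \bar{Y}$ , $v(\bar{y}) = \left(\frac{N-n}{Nn}\right) S^2$
Assuming that $\bar{y}$ is (approximately) normally distributed,
$\bar{y} \sim N\left(\bar{Y}, \frac{N-n}{N}\cdot\frac{S^2}{n}\right)$
$Z = \frac{\bar{y} - \bar{Y}}{\sqrt{v(\bar{y})}} \sim N(0,1)$
∴ $P[\,|Z| \ge Z_{\alpha/2}\,] = \alpha$
∴ $P[\,|Z| \le Z_{\alpha/2}\,] = 1 - \alpha$
$P\left[-Z_{\alpha/2} \le \frac{\bar{y} - \bar{Y}}{\sqrt{v(\bar{y})}} \le Z_{\alpha/2}\right] = 1 - \alpha$
$P\left[\bar{y} - Z_{\alpha/2}\sqrt{v(\bar{y})_{WOR}} \le \bar{Y} \le \bar{y} + Z_{\alpha/2}\sqrt{v(\bar{y})_{WOR}}\right] = 1 - \alpha$

100(1-α)% confidence interval for population mean ($\bar{Y}$):
$\bar{y} \pm Z_{\alpha/2}\sqrt{v(\bar{y})_{WOR}}$
When $S^2$ is unknown, it is replaced by its estimate $s^2$, and the
100(1-α)% confidence interval for population mean ($\bar{Y}$) becomes:
$\bar{y} \pm Z_{\alpha/2}\sqrt{\hat{v}(\bar{y})_{WOR}}$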
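
As an illustration of how such an interval is computed in practice, the following Python sketch uses the estimated variance (1/n - 1/N)s² and the standard normal quantile. The function name `srswor_mean_ci` and the sample values are hypothetical, chosen only for this example; the normal approximation is assumed to be adequate.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def srswor_mean_ci(sample, N, alpha=0.05):
    """Approximate 100(1 - alpha)% confidence interval for the population mean (SRSWOR)."""
    n = len(sample)
    ybar = mean(sample)
    s2 = variance(sample)                     # sample mean square (divisor n - 1)
    v_hat = (1 / n - 1 / N) * s2              # estimated variance of the sample mean
    z = NormalDist().inv_cdf(1 - alpha / 2)   # Z_{alpha/2}, about 1.96 for alpha = 0.05
    half_width = z * sqrt(v_hat)
    return ybar - half_width, ybar + half_width

sample = [12, 15, 11, 18, 14, 16, 13, 17]     # hypothetical sample values
low, high = srswor_mean_ci(sample, N=100)
print(low, high)
```

The interval is centred at the sample mean and shrinks to a single point when n = N, because the finite population correction is then zero.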
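Estimation of population total and its variance
Estimation of population total:
We have $E(\bar{y}) = \bar{Y}$
Multiplying both sides by N,
$N \times E(\bar{y}) = N \times \bar{Y}$
$E(N\bar{y}) = N \times \frac{1}{N}\sum_{i=1}^{N} Y_i$
$E(N\bar{y}) = \sum_{i=1}^{N} Y_i = Y$

$E(N\bar{y}) = Y$
∴ $N\bar{y}$ is an unbiased estimate of the population total,
i.e. $\hat{Y} = N\bar{y}$
Estimation of variance of population total:
$v(\hat{Y}) = v(N\bar{y}) = N^2\, v(\bar{y})$
$= N^2\left(\frac{1}{n} - \frac{1}{N}\right) S^2$
$v(\hat{Y}) = \frac{N(N-n)}{n}\, S^2$
Estimated variance: $\hat{v}(\hat{Y}) = \frac{N(N-n)}{n}\,\hat{S}^2 = \frac{N(N-n)}{n}\, s^2$

Confidence interval for population total
100(1-α)% confidence interval for population total (Y):
$N\bar{y} \pm Z_{\alpha/2}\sqrt{v(\hat{Y})_{WOR}}$
When $S^2$ is unknown, it is replaced by its estimate $s^2$, and the
100(1-α)% confidence interval for population total (Y) becomes:
$N\bar{y} \pm Z_{\alpha/2}\sqrt{\hat{v}(\hat{Y})_{WOR}}$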

Result 4:
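In SRSWR,
(i) Sample mean is an unbiased estimate of the population mean.
(ii) The variance of the sample mean is given by: $v(\bar{y}) = \frac{\sigma^2}{n}$

OR
In SRSWR, show that (i) $E(\bar{y}) = \bar{Y}$ (ii) $v(\bar{y}) = \frac{\sigma^2}{n}$

Proof:
Consider a population of size N with values $Y_1, Y_2, \ldots, Y_N$.
From this population a sample of size n is drawn by the method of
SRSWR with values $y_1, y_2, \ldots, y_n$.
Then Population mean = $\bar{Y} = \frac{1}{N}\sum_{i=1}^{N} Y_i$
Population variance = $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (Y_i - \bar{Y})^2$
Population mean square = $S^2 = \frac{1}{N-1}\sum_{i=1}^{N} (Y_i - \bar{Y})^2$
Sample mean = $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$
This can be written as
$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i = \frac{1}{n}\sum_{i=1}^{N} t_i Y_i$

where $t_i$ = number of times the ith unit is selected in the sample; i = 1, 2, ..., N.
P(ith unit is selected in a single draw) = $\frac{1}{N}$
This probability is the same for every draw, and the draws are independent.
∴ $t_i \sim Bin\left(n, p = \frac{1}{N}\right)$
∴ $E(t_i) = np = \frac{n}{N}$ , $v(t_i) = npq = \frac{n}{N}\left(1 - \frac{1}{N}\right)$

The joint distribution of $t_1, t_2, \ldots, t_N$ is the multinomial distribution, given by:
$P(t_1, t_2, \ldots, t_N) = \frac{n!}{t_1!\, t_2! \cdots t_N!}\; p_1^{t_1} p_2^{t_2} \cdots p_N^{t_N}$ ; where $\sum_{i=1}^{N} t_i = n$ and $p_i = \frac{1}{N}$
$P(t_1, t_2, \ldots, t_N) = \frac{n!}{t_1!\, t_2! \cdots t_N!} \times \frac{1}{N^n}$

For the multinomial distribution we have, for i ≠ j,
$cov(t_i, t_j) = -n\,p_i\,p_j = -\frac{n}{N^2}$

(i) Consider LHS = $E(\bar{y}) = E\left[\frac{1}{n}\sum_{i=1}^{n} y_i\right]$
$= E\left[\frac{1}{n}\sum_{i=1}^{N} t_i Y_i\right]$
$= \frac{1}{n}\sum_{i=1}^{N} E(t_i)\, Y_i$
$= \frac{1}{n}\sum_{i=1}^{N} \frac{n}{N}\, Y_i$ ---- [ $E(t_i) = \frac{n}{N}$ ]
$E(\bar{y}) = \frac{1}{N}\sum_{i=1}^{N} Y_i = \bar{Y}$ = RHS

∴ $E(\bar{y}) = \bar{Y}$ ---- (1)
∴ Sample mean $\bar{y}$ is an unbiased estimate of population mean $\bar{Y}$,
i.e. $\hat{\bar{Y}} = \bar{y}$

(ii) $v(\bar{y}) = v\left(\frac{1}{n}\sum_{i=1}^{n} y_i\right)$
$= v\left(\frac{1}{n}\sum_{i=1}^{N} t_i Y_i\right)$
$v(\bar{y}) = \frac{1}{n^2}\, v\left(\sum_{i=1}^{N} t_i Y_i\right)$
$= \frac{1}{n^2}\left[\sum_{i=1}^{N} v(t_i)\, Y_i^2 + 2\sum_{i<j} Y_i Y_j\, cov(t_i, t_j)\right]$
$v(\bar{y}) = \frac{1}{n^2}\left[\sum_{i=1}^{N} \frac{n}{N}\left(1 - \frac{1}{N}\right) Y_i^2 + 2\sum_{i<j} Y_i Y_j \left(-\frac{n}{N^2}\right)\right]$
$= \frac{1}{n^2} \times \frac{n}{N}\left[\left(1 - \frac{1}{N}\right)\sum_{i=1}^{N} Y_i^2 - \frac{1}{N}\left\{2\sum_{i<j} Y_i Y_j\right\}\right]$
$= \frac{1}{n^2} \times \frac{n}{N}\left[\left(1 - \frac{1}{N}\right)\sum_{i=1}^{N} Y_i^2 - \frac{1}{N}\left\{\left(\sum_{i=1}^{N} Y_i\right)^2 - \sum_{i=1}^{N} Y_i^2\right\}\right]$
---- [ $\left(\sum_{i=1}^{N} X_i\right)^2 = \sum_{i=1}^{N} X_i^2 + 2\sum_{i<j} X_i X_j$ ]
$= \frac{1}{n} \times \frac{1}{N}\left[\left(1 - \frac{1}{N} + \frac{1}{N}\right)\sum_{i=1}^{N} Y_i^2 - \frac{1}{N}(N\bar{Y})^2\right]$
$v(\bar{y}) = \frac{1}{n} \times \frac{1}{N}\left[\sum_{i=1}^{N} Y_i^2 - N\bar{Y}^2\right]$
$= \frac{1}{n}\left[\frac{1}{N}\sum_{i=1}^{N} (Y_i - \bar{Y})^2\right]$

∴ $v(\bar{y}) = \frac{\sigma^2}{n}$ ---- (2)

∵ $N\sigma^2 = (N-1)S^2$
∴ $v(\bar{y}) = \frac{(N-1)S^2}{Nn}$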

Result 5:
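In SRSWR, sample mean square is an unbiased estimate of the population
variance.
OR
In SRSWR, show that $E(s^2) = \sigma^2$
Proof:
Consider a population of size N with values $Y_1, Y_2, \ldots, Y_N$.
From this population a sample of size n is drawn by the method of
SRSWR with values $y_1, y_2, \ldots, y_n$.
Then Population mean = $\bar{Y} = \frac{1}{N}\sum_{i=1}^{N} Y_i$
Population variance = $\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (Y_i - \bar{Y})^2$
Sample mean = $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$
Sample mean square = $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2$

$E(s^2) = E\left[\frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2\right]$
$= \frac{1}{n-1}\, E\left[\sum_{i=1}^{n} y_i^2 - n\bar{y}^2\right]$
$= \frac{1}{n-1}\left[E\left(\sum_{i=1}^{n} y_i^2\right) - n\,E(\bar{y}^2)\right]$ ---- (1)

Now, $v(\bar{y}) = E(\bar{y}^2) - [E(\bar{y})]^2$
∴ $E(\bar{y}^2) = v(\bar{y}) + [E(\bar{y})]^2$
∴ $E(\bar{y}^2) = \frac{\sigma^2}{n} + \bar{Y}^2$ ---- (2)

Also, if $t_i$ is the number of times the ith population unit is selected in the
sample, then $\sum_{i=1}^{n} y_i^2 = \sum_{i=1}^{N} t_i Y_i^2$ and $E(t_i) = \frac{n}{N}$.
$E\left[\sum_{i=1}^{n} y_i^2\right] = E\left[\sum_{i=1}^{N} t_i Y_i^2\right]$
$= \sum_{i=1}^{N} E(t_i)\, Y_i^2$
$= \sum_{i=1}^{N} \frac{n}{N}\, Y_i^2$
$E\left[\sum_{i=1}^{n} y_i^2\right] = \frac{n}{N}\sum_{i=1}^{N} Y_i^2$ ---- (3)

Using (2) & (3) in (1), we get,
$E(s^2) = \frac{1}{n-1}\left[\frac{n}{N}\sum_{i=1}^{N} Y_i^2 - n\left\{\frac{\sigma^2}{n} + \bar{Y}^2\right\}\right]$
$= \frac{n}{n-1}\left[\frac{1}{N}\sum_{i=1}^{N} Y_i^2 - \frac{\sigma^2}{n} - \bar{Y}^2\right]$
$= \frac{n}{n-1}\left[\left\{\frac{1}{N}\sum_{i=1}^{N} Y_i^2 - \bar{Y}^2\right\} - \frac{\sigma^2}{n}\right]$
$= \frac{n}{n-1}\left[\frac{1}{N}\left\{\sum_{i=1}^{N} Y_i^2 - N\bar{Y}^2\right\} - \frac{\sigma^2}{n}\right]$
$= \frac{n}{n-1}\left[\frac{1}{N}\sum_{i=1}^{N} (Y_i - \bar{Y})^2 - \frac{\sigma^2}{n}\right]$
$= \frac{n}{n-1}\left[\sigma^2 - \frac{\sigma^2}{n}\right]$
$= \frac{n}{n-1}\left[1 - \frac{1}{n}\right]\sigma^2 = \frac{n}{n-1}\left[\frac{n-1}{n}\right]\sigma^2$

∴ $E(s^2) = \sigma^2$
∴ Sample mean square is an unbiased estimate of the population
variance,
i.e. $\hat{\sigma}^2 = s^2$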
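
The SRSWR results can be checked numerically by listing all N^n equally likely ordered samples. The sketch below does this for the example population 3, 5, 6, 2, 11 with n = 2 and compares the mean of the sample means, their variance, and the mean of s² with the population mean, σ²/n and σ². The helper `mean_square` and the variable names are assumptions for this illustration.

```python
from fractions import Fraction
from itertools import product

def mean_square(values):
    """Sum of squared deviations from the mean, divided by (number of values - 1)."""
    m = Fraction(sum(values), len(values))
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)

Y = [3, 5, 6, 2, 11]          # population values from the earlier example
N, n = len(Y), 2

Ybar = Fraction(sum(Y), N)
sigma2 = sum((y - Ybar) ** 2 for y in Y) / N          # population variance

samples = list(product(Y, repeat=n))                  # N**n ordered SRSWR samples
means = [Fraction(sum(s), n) for s in samples]

E_mean = sum(means) / len(means)
V_mean = sum((m - E_mean) ** 2 for m in means) / len(means)
E_s2 = sum(mean_square(s) for s in samples) / len(samples)

print(E_mean, V_mean, sigma2 / n, E_s2, sigma2)
```

The enumeration gives E(ȳ) equal to the population mean, v(ȳ) equal to σ²/n, and E(s²) equal to σ².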
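Note:
1) $v(\bar{y}) = \frac{(N-1)S^2}{Nn} = \frac{\sigma^2}{n}$
2) Estimated variance: since $E(s^2) = \sigma^2$ in SRSWR,
$\hat{v}(\bar{y}) = \frac{\hat{\sigma}^2}{n} = \frac{s^2}{n}$
3) $v(\bar{y}) = v(\bar{y})_{WR}$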
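Confidence interval for population mean
100(1-α)% confidence interval for population mean ($\bar{Y}$):
$\bar{y} \pm Z_{\alpha/2}\sqrt{v(\bar{y})_{WR}}$
When $\sigma^2$ is unknown, it is replaced by its estimate $s^2$, and the
100(1-α)% confidence interval for population mean ($\bar{Y}$) becomes:
$\bar{y} \pm Z_{\alpha/2}\sqrt{\hat{v}(\bar{y})_{WR}}$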

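Estimation of population total and its variance

Estimation of population total:
We have $E(\bar{y}) = \bar{Y}$
Multiplying both sides by N,
$N \times E(\bar{y}) = N \times \bar{Y}$
$E(N\bar{y}) = N \times \frac{1}{N}\sum_{i=1}^{N} Y_i$
$E(N\bar{y}) = \sum_{i=1}^{N} Y_i = Y$

$E(N\bar{y}) = Y$
∴ $N\bar{y}$ is an unbiased estimate of the population total,
i.e. $\hat{Y} = N\bar{y}$
Estimation of variance of population total:
$v(\hat{Y}) = v(N\bar{y}) = N^2\, v(\bar{y})_{WR}$
$= N^2\,\frac{(N-1)S^2}{Nn} = N^2\,\frac{\sigma^2}{n}$
$v(\hat{Y}) = \frac{N(N-1)}{n}\, S^2 = N^2\,\frac{\sigma^2}{n}$
Estimated variance: $\hat{v}(\hat{Y}) = N^2\,\frac{\hat{\sigma}^2}{n} = N^2\,\frac{s^2}{n}$
Confidence interval for population total
100(1-α)% confidence interval for population total (Y):
$N\bar{y} \pm Z_{\alpha/2}\sqrt{v(\hat{Y})_{WR}}$
When $\sigma^2$ is unknown, it is replaced by its estimate $s^2$, and the
100(1-α)% confidence interval for population total (Y) becomes:
$N\bar{y} \pm Z_{\alpha/2}\sqrt{\hat{v}(\hat{Y})_{WR}}$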

Comparison between SRSWOR & SRSWR

$v(\bar{y})_{WR} = \frac{(N-1)S^2}{Nn}$ ; $v(\bar{y})_{WOR} = \left(\frac{N-n}{Nn}\right) S^2$
For n > 1,
∴ -n < -1
Adding N to both sides,
N - n < N - 1
Multiplying both sides by $\frac{S^2}{Nn}$ (which is positive),
⟹ $\left(\frac{N-n}{Nn}\right) S^2 < \frac{(N-1)S^2}{Nn}$
⟹ $v(\bar{y})_{WOR} < v(\bar{y})_{WR}$

∴ Sampling without replacement is more efficient than sampling with
replacement.
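
The inequality can be illustrated numerically. The sketch below evaluates both variance formulas for every sample size n = 1, ..., N, using S² computed from the example population 3, 5, 6, 2, 11; the function names `var_wor` and `var_wr` are hypothetical.

```python
from fractions import Fraction

def var_wor(S2, N, n):
    """Variance of the sample mean under SRSWOR: (N - n) S^2 / (N n)."""
    return Fraction(N - n, N * n) * S2

def var_wr(S2, N, n):
    """Variance of the sample mean under SRSWR: (N - 1) S^2 / (N n)."""
    return Fraction(N - 1, N * n) * S2

Y = [3, 5, 6, 2, 11]                       # population values from the earlier example
N = len(Y)
Ybar = Fraction(sum(Y), N)
S2 = sum((y - Ybar) ** 2 for y in Y) / (N - 1)

for n in range(1, N + 1):
    print(n, var_wor(S2, N, n), var_wr(S2, N, n))
```

For n = 1 the two variances coincide, for every n > 1 the SRSWOR variance is smaller, and for n = N it is zero because the whole population is observed.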
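Note: Simple random sampling for variables

                          SRSWOR                                                 SRSWR
1. Expectation            $E(\bar{y}) = \bar{Y}$                                 $E(\bar{y}) = \bar{Y}$
2. Variance               $v(\bar{y}) = \frac{N-n}{N}\cdot\frac{S^2}{n}$         $v(\bar{y}) = \frac{\sigma^2}{n}$
3. Estimated variance     $\hat{v}(\bar{y}) = \frac{N-n}{N}\cdot\frac{s^2}{n}$   $\hat{v}(\bar{y}) = \frac{s^2}{n}$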
Simple Random Sampling for Attributes
A qualitative characteristic is called as an attribute e.g Literate, Intelligence,
colour of the eyes etc. In such situation the units in the population into two
groups(classes):
(1) A group having particular characteristic.
(2) A group not having a particular characteristic.
Notations:
Suppose the population consists of N units, say $U_1, U_2, \ldots, U_N$.
Let ‘C’ be the characteristic under study.
The population is divided into two mutually exclusive and exhaustive
classes: the first class possesses the characteristic ‘C’ and the second
class does not possess the characteristic (attribute) ‘C’.
Let A be the number of units in the population possessing attribute ‘C’.
N − A is the number of units in the population not possessing attribute ‘C’.
Let $Y_i$ be the value associated with the ith unit of the population; i = 1, 2, …, N.
$Y_i = 1$ if the ith unit of the population possesses attribute C,
$Y_i = 0$ if the ith unit of the population does not possess attribute C.
Population total: $Y = \sum_{i=1}^{N} Y_i$
Y = A is the number of units in the population possessing attribute ‘C’.
∴ $Y = \sum_{i=1}^{N} Y_i = A$
Population mean:
$\bar{Y} = \frac{\sum_{i=1}^{N} Y_i}{N} = \frac{A}{N} = \frac{\text{Number of units in the population possessing attribute ‘C’}}{\text{Total number of units in the population}}$

$\bar{Y} = P$ (say) ........................................................... (1)

$\bar{Y} = P$ is the proportion of population units possessing attribute ‘C’.
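Population variance:
$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (Y_i - \bar{Y})^2$
$= \frac{\sum_{i=1}^{N} Y_i^2}{N} - \bar{Y}^2$
$= \frac{\sum_{i=1}^{N} Y_i}{N} - \bar{Y}^2$ ; [ ∵ $Y_i$ = 0 or 1, so $Y_i^2 = Y_i$ ]
$= \bar{Y} - \bar{Y}^2$
$= P - P^2$
$= P(1-P)$
$\sigma^2 = PQ$ ; [ Q = 1-P ] ------------ (2)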

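Population mean square:
$S^2 = \frac{1}{N-1}\sum_{i=1}^{N} (Y_i - \bar{Y})^2$
$= \frac{1}{N-1}\left[\sum_{i=1}^{N} Y_i^2 - N\bar{Y}^2\right]$
$= \frac{1}{N-1}\left[\sum_{i=1}^{N} Y_i - N\bar{Y}^2\right]$ ; [ ∵ $Y_i$ = 0 or 1, so $Y_i^2 = Y_i$ ]
$= \frac{1}{N-1}\left[N\bar{Y} - N\bar{Y}^2\right]$
$= \frac{N}{N-1}\,(P - P^2)$ ; [ From Eqn (1) ]
$= \frac{N}{N-1}\,P(1-P)$
$S^2 = \frac{N}{N-1}\,PQ$ ---------------------------------------- (3)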
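Results (2) and (3) can be verified on a small 0/1 population. The sketch below uses a hypothetical population of N = 10 units of which A = 4 possess the attribute; it relies only on Python's standard `statistics` module.

```python
from math import isclose
from statistics import pvariance, variance

# Hypothetical population: Y_i = 1 if unit i possesses attribute C, else 0
Y = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
N = len(Y)
P = sum(Y) / N           # population proportion A / N
Q = 1 - P

sigma2 = pvariance(Y)    # population variance, divisor N
S2 = variance(Y)         # population mean square, divisor N - 1

# Both comparisons should print True if (2) and (3) hold for this population
print(isclose(sigma2, P * Q))
print(isclose(S2, N * P * Q / (N - 1)))
```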

Suppose we select a sample of size ‘n’ from the above population.
Let ‘C’ be the characteristic under study.
Let a be the number of units in the sample possessing attribute ‘C’.
n − a is the number of units in the sample not possessing attribute ‘C’.
Let $y_i$ be the value associated with the ith unit of the sample; i = 1, 2, …, n.
$y_i = 1$ if the ith unit of the sample possesses attribute C,
$y_i = 0$ if the ith unit of the sample does not possess attribute C.
Sample total: $y = \sum_{i=1}^{n} y_i$
y = a is the number of units in the sample possessing attribute ‘C’.
∴ $y = \sum_{i=1}^{n} y_i = a$
Sample mean:
$\bar{y} = \frac{\sum_{i=1}^{n} y_i}{n} = \frac{a}{n} = \frac{\text{Number of units in the sample possessing attribute ‘C’}}{\text{Total number of units in the sample}}$

∴ $\bar{y} = p$ (say) --------------------------------------------- (4)

$\bar{y} = p$ is the proportion of sample units possessing attribute ‘C’.

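Sample mean square:
$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2$
$= \frac{1}{n-1}\left[\sum_{i=1}^{n} y_i^2 - n\bar{y}^2\right]$
$= \frac{1}{n-1}\left[\sum_{i=1}^{n} y_i - n\bar{y}^2\right]$ ; [ ∵ $y_i$ = 0 or 1, so $y_i^2 = y_i$ ]
$= \frac{1}{n-1}\left[n\bar{y} - n\bar{y}^2\right]$
$= \frac{n}{n-1}\,(p - p^2)$ ; [ From Eqn (4) ]
$s^2 = \frac{n}{n-1}\,pq$ ; [ q = 1-p ] -------(5)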

Simple random sampling without replacement(SRSWOR)


Result 6:
In SRSWOR,
(i) The sample proportion is an unbiased estimate of the population proportion.
(ii) The variance of the sample proportion is given by: $v(p) = \frac{N-n}{N-1}\cdot\frac{PQ}{n}$

OR
In SRSWOR, show that (i) $E(p) = P$ (ii) $v(p) = \frac{N-n}{N-1}\cdot\frac{PQ}{n}$
Proof:
(i) We know that in SRSWOR the sample mean is an unbiased estimator of the
population mean,
i.e. $E(\bar{y}) = \bar{Y}$ ; here $\bar{Y} = P$ and $\bar{y} = p$.
$E(p) = P$ ---------------------------------------------------------------------------------- (6)

∴ Sample proportion (p) is an unbiased estimate of the population proportion (P).

(ii) We have $v(\bar{y})_{WOR} = \frac{N-n}{N}\left(\frac{S^2}{n}\right)$ ; $S^2 = \frac{N}{N-1}\,PQ$
$v(p)_{WOR} = \frac{N-n}{Nn}\left(\frac{N\,PQ}{N-1}\right)$
$v(p)_{WOR} = \frac{N-n}{N-1}\times\frac{PQ}{n}$

If N is large, $v(p)_{WOR} \approx \frac{PQ}{n}$

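Estimation of variance
$v(\bar{y})_{WOR} = \frac{N-n}{N}\left(\frac{S^2}{n}\right)$ ; with $\bar{y} = p$ and $S^2$ estimated by $s^2 = \frac{n}{n-1}\,pq$,
$\hat{v}(p)_{WOR} = \frac{N-n}{N}\left(\frac{npq}{n(n-1)}\right)$
$\hat{v}(p)_{WOR} = \frac{N-n}{N}\times\frac{pq}{n-1}$
If N is large, $\hat{v}(p)_{WOR} \approx \frac{pq}{n-1}$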

Confidence interval for population proportion


100(1-α)% confidence interval for population proportion (P):
$p \pm Z_{\alpha/2}\,\sqrt{v(p)_{WOR}}$
where $v(p)_{WOR}$ is estimated by $\frac{N-n}{N}\times\frac{pq}{n-1}$.
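A short Python sketch of this interval is given below, assuming the normal approximation is adequate. The function name `proportion_ci_wor` and the figures (a = 30 units with the attribute in a sample of n = 100 from N = 1000) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci_wor(a, n, N, alpha=0.05):
    """Sample proportion, its estimated variance and CI under SRSWOR."""
    p = a / n                                 # sample proportion
    q = 1 - p
    var_hat = (N - n) / N * p * q / (n - 1)   # ((N-n)/N) * pq/(n-1)
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    half_width = z * sqrt(var_hat)
    return p, var_hat, (p - half_width, p + half_width)

# Hypothetical survey: 30 of 100 sampled units possess the attribute, N = 1000
p, var_hat, ci = proportion_ci_wor(a=30, n=100, N=1000)
print(p, var_hat, ci)
```

Multiplying `p` and both interval limits by N gives the corresponding estimate and interval for the total number of units possessing the attribute.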

Estimation of total number of units possessing attribute and its variance


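(i) Estimation of total number of units possessing attribute:
We have $E(N\bar{y}) = Y$ ; $\bar{y} = p$, $Y = NP = A$
$E(Np) = NP = A$
∴ $\hat{A} = Np$
(ii) Estimate of its variance:
$v(\hat{A}) = v(Np) = N^2\, v(p)$
$v(\hat{A}) = N^2\,\frac{N-n}{N-1}\times\frac{PQ}{n}$
The sample estimate of this variance is
$\hat{v}(\hat{A}) = N^2\,\hat{v}(p)_{WOR} = N^2\,\frac{N-n}{N}\times\frac{pq}{n-1}$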

Confidence interval for population total


100(1-α)% confidence interval for the population total (Y = A):
$Np \pm Z_{\alpha/2}\,\sqrt{v(\hat{A})_{WOR}}$ , where $v(\hat{A})_{WOR} = N^2\, v(p)_{WOR}$

Simple random sampling with replacement(SRSWR)


Result 7:
In SRSWR,
(i) The sample proportion is an unbiased estimate of the population proportion.
(ii) The variance of the sample proportion is given by: $v(p) = \frac{PQ}{n}$

OR
In SRSWR, show that (i) $E(p) = P$ (ii) $v(p) = \frac{PQ}{n}$
Proof:
(i) We know that in SRSWR the sample mean is an unbiased estimator of the
population mean,
i.e. $E(\bar{y}) = \bar{Y}$ ; here $\bar{Y} = P$ and $\bar{y} = p$.
$E(p) = P$ ---------------------------------------------------------------------------------- (7)

∴ Sample proportion (p) is an unbiased estimate of the population proportion (P).

(ii) We have $v(\bar{y})_{WR} = \frac{N-1}{N}\left(\frac{S^2}{n}\right)$ ; $S^2 = \frac{N}{N-1}\,PQ$
$v(p)_{WR} = \frac{N-1}{Nn}\left(\frac{N\,PQ}{N-1}\right)$

$v(p)_{WR} = \frac{PQ}{n}$

Estimation of variance
$v(\bar{y})_{WR} = \frac{\sigma^2}{n}$ ; with $\bar{y} = p$ and $\sigma^2$ estimated by $s^2 = \frac{n}{n-1}\,pq$,
$\hat{v}(p)_{WR} = \frac{npq}{n(n-1)}$

$\hat{v}(p)_{WR} = \frac{pq}{n-1}$

Confidence interval for population proportion


100(1-α)% confidence interval for population proportion (P):
$p \pm Z_{\alpha/2}\,\sqrt{v(p)_{WR}}$
where $v(p)_{WR}$ is estimated by $\frac{pq}{n-1}$.

Estimation of total number of units possessing attribute and its variance


(i) Estimation of total number of units possessing attribute:
We have $E(N\bar{y}) = Y$ ; $\bar{y} = p$, $Y = NP = A$
$E(Np) = NP = A$
∴ $\hat{A} = Np$
(ii) Estimate of its variance:
$v(\hat{A}) = v(Np) = N^2\, v(p)$
$v(\hat{A}) = N^2 \times \frac{PQ}{n}$
The sample estimate of this variance is
$\hat{v}(\hat{A}) = N^2\,\hat{v}(p)_{WR} = N^2\,\frac{pq}{n-1}$

Confidence interval for population total


100(1-α)% confidence interval for the population total (Y = A):
$Np \pm Z_{\alpha/2}\,\sqrt{v(\hat{A})_{WR}}$ , where $v(\hat{A})_{WR} = N^2\, v(p)_{WR}$

Estimation of sample size


In any sample survey, the problem is to determine the sample size so that
the population parameters may be estimated with a specified precision. The
degree of precision is determined in terms of:
(i) The level of significance (α) of the estimate, and
(ii) The confidence interval within which the estimate lies for the given level
of significance.
(A) Data for variables
Consider a population consisting of N units. Suppose we select a sample of size
n from this population.
Let $\bar{Y}$ be the population mean and $\bar{y}$ be the sample mean.
The sample mean should not differ from the population mean by more than a
specified absolute estimation error $\epsilon$ (written ∈ in these notes), which is a
small quantity. This requirement is expressed as:
$\Pr(|\bar{y} - \bar{Y}| < \epsilon) = 1 - \alpha$
∴ $\Pr(|\bar{y} - \bar{Y}| \ge \epsilon) = \alpha$

(I) Simple random sampling without replacement(SRSWOR)


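$E(\bar{y}) = \bar{Y}$ ; $v(\bar{y})_{WOR} = \frac{N-n}{N}\left(\frac{S^2}{n}\right)$
We have $\Pr[\,|\bar{y} - \bar{Y}| \ge \epsilon\,] = \alpha$ ------------------------------------------ (1)
Dividing both sides of the inequality by $\sqrt{v(\bar{y})}$,
$\Pr\left[\frac{|\bar{y} - \bar{Y}|}{\sqrt{v(\bar{y})}} \ge \frac{\epsilon}{\sqrt{v(\bar{y})}}\right] = \alpha$

Now,
If $X \sim N(\mu, \sigma^2)$ then $Z = \frac{X-\mu}{\sigma}$
Similarly $\bar{y} \sim N\left(\bar{Y}, \frac{N-n}{N}\cdot\frac{S^2}{n}\right)$, then $|Z| = \frac{|\bar{y} - \bar{Y}|}{\sqrt{v(\bar{y})}}$

∴ $\Pr\left[\,|Z| \ge \frac{\epsilon}{\sqrt{v(\bar{y})}}\right] = \alpha$ ------------------------------------------ (2)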

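Let $z_{\alpha/2}$ be the value of the standard normal variate (SNV) such that,

$\Pr[\,|Z| \ge z_{\alpha/2}\,] = \alpha$ -------------------------------------------- (3)

From Eqn (2) & (3) we get,

$\frac{\epsilon}{\sqrt{v(\bar{y})}} = z_{\alpha/2}$

$\frac{\epsilon}{\sqrt{\frac{(N-n)}{N}\cdot\frac{S^2}{n}}} = z_{\alpha/2}$

$\frac{\epsilon}{z_{\alpha/2}} = \sqrt{\frac{(N-n)}{N}\cdot\frac{S^2}{n}}$

Squaring both sides

$\frac{\epsilon^2}{z_{\alpha/2}^2} = \frac{(N-n)\,S^2}{N\,n}$

$\frac{\epsilon^2}{z_{\alpha/2}^2\, S^2} = \frac{1}{n} - \frac{1}{N}$

$\frac{\epsilon^2}{z_{\alpha/2}^2\, S^2} + \frac{1}{N} = \frac{1}{n}$

$n = \frac{1}{\frac{1}{N} + \frac{\epsilon^2}{z_{\alpha/2}^2 S^2}} = \frac{1}{\frac{1}{N} + \left(\frac{\epsilon}{z_{\alpha/2} S}\right)^2}$ ............................................................................ (4)

Multiplying numerator and denominator by $\left(\frac{z_{\alpha/2} S}{\epsilon}\right)^2$,

$n = \frac{\left(\frac{z_{\alpha/2} S}{\epsilon}\right)^2}{1 + \frac{1}{N}\left(\frac{z_{\alpha/2} S}{\epsilon}\right)^2}$

Put $\left(\frac{z_{\alpha/2} S}{\epsilon}\right)^2 = n_0$

$n = \frac{n_0}{1 + \frac{n_0}{N}}$

$n = n_0$ ; if N is large
$n = \frac{n_0}{1 + \frac{n_0}{N}}$ ; otherwise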
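The final formula can be evaluated directly once S, ε and α are specified. The following Python sketch is a minimal illustration; the function name `sample_size_mean_wor`, the input values, and the choice to round n up to the next integer are assumptions, not part of the derivation above.

```python
from math import ceil
from statistics import NormalDist

def sample_size_mean_wor(S, eps, N, alpha=0.05):
    """Sample size for estimating a mean under SRSWOR.

    Returns n0 = (z S / eps)^2 and n = n0 / (1 + n0 / N), rounded up.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    n0 = (z * S / eps) ** 2
    n = n0 / (1 + n0 / N)
    return n0, ceil(n)                        # rounding up is an assumption

# Hypothetical inputs: S = 10, allowed error eps = 2, N = 500, alpha = 0.05
n0, n = sample_size_mean_wor(S=10, eps=2, N=500)
print(round(n0, 2), n)
```

When N is very large the correction term n0/N is negligible and the function returns approximately n0, in line with the first case above.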

(II) Simple random sampling with replacement(SRSWR)


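$E(\bar{y}) = \bar{Y}$ ; $v(\bar{y})_{WR} = \frac{\sigma^2}{n}$

We have $\Pr[\,|\bar{y} - \bar{Y}| \ge \epsilon\,] = \alpha$ ---------------------------------------------- (1)

Dividing both sides of the inequality by $\sqrt{v(\bar{y})}$,
$\Pr\left[\frac{|\bar{y} - \bar{Y}|}{\sqrt{v(\bar{y})}} \ge \frac{\epsilon}{\sqrt{v(\bar{y})}}\right] = \alpha$

Now,
If $X \sim N(\mu, \sigma^2)$ then $Z = \frac{X-\mu}{\sigma}$
Similarly $\bar{y} \sim N\left(\bar{Y}, \frac{\sigma^2}{n}\right)$, then $|Z| = \frac{|\bar{y} - \bar{Y}|}{\sqrt{v(\bar{y})}}$

∴ $\Pr\left[\,|Z| \ge \frac{\epsilon}{\sqrt{v(\bar{y})}}\right] = \alpha$ ----------------------------------------- (2)

Let $z_{\alpha/2}$ be the value of the standard normal variate (SNV) such that,

$\Pr[\,|Z| \ge z_{\alpha/2}\,] = \alpha$ ------------------------------------------- (3)

From Eqn (2) & (3) we get,

$\frac{\epsilon}{\sqrt{v(\bar{y})}} = z_{\alpha/2}$

$\frac{\epsilon}{\sqrt{\frac{\sigma^2}{n}}} = z_{\alpha/2}$

$\sqrt{n} = \frac{\sigma\, z_{\alpha/2}}{\epsilon}$

Squaring both sides

$n = \left(\frac{\sigma\, z_{\alpha/2}}{\epsilon}\right)^2$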

(B) Data for attributes


Consider a population consisting of N units. Suppose we select a sample of size
n from this population.
Let P be the population proportion and p be the sample proportion.
The sample proportion should not differ from the population proportion by more
than a specified absolute estimation error $\epsilon$, which is a small
quantity. This requirement is expressed as: $\Pr(|p - P| < \epsilon) = 1 - \alpha$
∴ $\Pr(|p - P| \ge \epsilon) = \alpha$

(I) Simple random sampling without replacement(SRSWOR)


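$E(p) = P$ ; $v(p)_{WOR} = \frac{N-n}{N-1}\times\frac{PQ}{n}$

We have $\Pr[\,|p - P| \ge \epsilon\,] = \alpha$ ------------------------------------------ (1)

Dividing both sides of the inequality by $\sqrt{v(p)}$,
$\Pr\left[\frac{|p - P|}{\sqrt{v(p)}} \ge \frac{\epsilon}{\sqrt{v(p)}}\right] = \alpha$

Now,
If $X \sim N(\mu, \sigma^2)$ then $Z = \frac{X-\mu}{\sigma}$
Similarly $p \sim N\left(P, \frac{N-n}{N-1}\times\frac{PQ}{n}\right)$, then $|Z| = \frac{|p - P|}{\sqrt{v(p)}}$

∴ $\Pr\left[\,|Z| \ge \frac{\epsilon}{\sqrt{v(p)}}\right] = \alpha$ ------------------------------------------ (2)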

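Let $z_{\alpha/2}$ be the value of the standard normal variate (SNV) such that,

$\Pr[\,|Z| \ge z_{\alpha/2}\,] = \alpha$ -------------------------------------------- (3)

From Eqn (2) & (3) we get,

$\frac{\epsilon}{\sqrt{v(p)}} = z_{\alpha/2}$

$\frac{\epsilon}{\sqrt{\frac{N-n}{N-1}\times\frac{PQ}{n}}} = z_{\alpha/2}$

$\epsilon = z_{\alpha/2}\,\sqrt{\frac{N-n}{N-1}\times\frac{PQ}{n}}$

Squaring both sides

$\epsilon^2 = z_{\alpha/2}^2\,\frac{N-n}{N-1}\times\frac{PQ}{n}$

$\frac{\epsilon^2 (N-1)}{z_{\alpha/2}^2\, PQ} = \frac{N}{n} - 1$

$\frac{N}{n} = \frac{\epsilon^2 (N-1)}{z_{\alpha/2}^2\, PQ} + 1$

$\frac{n}{N} = \frac{1}{\frac{\epsilon^2 (N-1)}{z_{\alpha/2}^2\, PQ} + 1}$

$n = \frac{N}{\frac{\epsilon^2 (N-1)}{z_{\alpha/2}^2\, PQ} + 1}$ ......................................................................................................... (4)

Multiplying numerator and denominator by $\frac{z_{\alpha/2}^2\, PQ}{\epsilon^2}$,

$n = \frac{N\,\frac{z_{\alpha/2}^2\, PQ}{\epsilon^2}}{N - 1 + \frac{z_{\alpha/2}^2\, PQ}{\epsilon^2}}$

Put $\frac{z_{\alpha/2}^2\, PQ}{\epsilon^2} = n_0$

$n = \frac{N n_0}{N - 1 + n_0}$

Dividing numerator and denominator by N,

$n = \frac{n_0}{1 - \frac{1}{N} + \frac{n_0}{N}}$

$n = \frac{n_0}{1 + \frac{1}{N}(n_0 - 1)}$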
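As a hedged illustration of the last formula, the Python sketch below computes $n_0$ and the finite-population sample size for a proportion. The function name `sample_size_prop_wor`, the planning value P = 0.5, the other inputs, and the rounding up to an integer are all assumptions made for the example.

```python
from math import ceil
from statistics import NormalDist

def sample_size_prop_wor(P, eps, N, alpha=0.05):
    """Sample size for estimating a proportion under SRSWOR.

    n0 = z^2 P Q / eps^2 and n = n0 / (1 + (n0 - 1) / N), rounded up.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    Q = 1 - P
    n0 = z**2 * P * Q / eps**2
    n = n0 / (1 + (n0 - 1) / N)
    return n0, ceil(n)                        # rounding up is an assumption

# Hypothetical inputs: planning value P = 0.5, eps = 0.05, N = 2000
n0, n = sample_size_prop_wor(P=0.5, eps=0.05, N=2000)
print(round(n0, 2), n)
```

The intermediate form $n = N n_0/(N - 1 + n_0)$ gives the same value, which offers a quick check on the algebra.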

(II) Simple random sampling with replacement(SRSWR)


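$E(p) = P$ ; $v(p)_{WR} = \frac{PQ}{n}$

We have $\Pr[\,|p - P| \ge \epsilon\,] = \alpha$ -----------------------------------------------(1)
Dividing both sides of the inequality by $\sqrt{v(p)}$,
$\Pr\left[\frac{|p - P|}{\sqrt{v(p)}} \ge \frac{\epsilon}{\sqrt{v(p)}}\right] = \alpha$

Now,
If $X \sim N(\mu, \sigma^2)$ then $Z = \frac{X-\mu}{\sigma}$
Similarly $p \sim N\left(P, \frac{PQ}{n}\right)$, then $|Z| = \frac{|p - P|}{\sqrt{v(p)}}$

∴ $\Pr\left[\,|Z| \ge \frac{\epsilon}{\sqrt{v(p)}}\right] = \alpha$ ------------------------------------------ (2)

Let $z_{\alpha/2}$ be the value of the standard normal variate (SNV) such that,

$\Pr[\,|Z| \ge z_{\alpha/2}\,] = \alpha$ -------------------------------------------- (3)

From Eqn (2) & (3) we get,

$\frac{\epsilon}{\sqrt{v(p)}} = z_{\alpha/2}$

$\frac{\epsilon}{\sqrt{\frac{PQ}{n}}} = z_{\alpha/2}$

$\epsilon = z_{\alpha/2}\,\sqrt{\frac{PQ}{n}}$

Squaring both sides

$\epsilon^2 = z_{\alpha/2}^2\,\frac{PQ}{n}$

$n = \frac{z_{\alpha/2}^2\, PQ}{\epsilon^2}$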
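A minimal Python sketch of this SRSWR formula follows. The function name `sample_size_prop_wr` and the inputs are hypothetical, and rounding up to a whole unit is an added assumption. Since PQ = P(1 − P) is largest at P = 0.5, that value gives the largest (most cautious) sample size when nothing is known about P.

```python
from math import ceil
from statistics import NormalDist

def sample_size_prop_wr(P, eps, alpha=0.05):
    """Sample size for estimating a proportion under SRSWR: z^2 P Q / eps^2."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    n = z**2 * P * (1 - P) / eps**2
    return ceil(n)                            # rounding up is an assumption

# Hypothetical inputs: allowed error eps = 0.05 at alpha = 0.05
n_cautious = sample_size_prop_wr(P=0.5, eps=0.05)   # P unknown, use 0.5
n_informed = sample_size_prop_wr(P=0.2, eps=0.05)   # prior guess P = 0.2
print(n_cautious, n_informed)
```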
Advantages and Disadvantages of simple random sampling
Advantages
1) The method is simple to use.
2) SRS needs only minimal advance knowledge of the population under study.
3) It is easy to assess the sampling error in this method.
4) A simple random sample tends to be representative of the population.
5) SRS is free from personal bias and prejudice in the selection of units.

Disadvantages
1) For a given sample size, SRS usually carries larger sampling errors than
stratified sampling.
2) In SRS, selection of the sample becomes impractical if the units are widely
dispersed.
3) SRS is not suitable when the units of the population are heterogeneous.
4) SRS makes no use of available knowledge about the population.
