Professional Documents
Culture Documents
BSA
26 August 2019
BSA Sampling 1 / 50
Recap
BSA Sampling 2 / 50
Basic Ideas of Sampling
BSA Sampling 3 / 50
Basic Ideas of Sampling
BSA Sampling 3 / 50
Types of Sampling
BSA Sampling 4 / 50
Potential Causes of Bias
Convenience Sampling
Volunteer Sampling
Systematic Sampling
Non-response Bias
Response Bias
Does that mean we shouldn’t use any of these types of sampling??
BSA Sampling 5 / 50
Potential Causes of Bias
Convenience Sampling
Volunteer Sampling
Systematic Sampling
Non-response Bias
Response Bias
Does that mean we shouldn’t use any of these types of sampling??
NO, one can use, but with caution. Make sure it is not leading to
a systematic error
BSA Sampling 5 / 50
Sampling Using Excel in Presence of Sampling Frame
Generate a random number for each observation and then sort the
observations
BSA Sampling 6 / 50
What after sampling?
BSA Sampling 7 / 50
Estimating from the statistic
BSA Sampling 8 / 50
Errors in the Process of Estimation
BSA Sampling 9 / 50
Estimating Population Mean from Sample Mean
BSA Sampling 10 / 50
Estimating Population Variance from Sample Observations
If the P
sampling scheme is WITH REPLACEMENT
n
(Xi − X )2
sX2 = i=1
n−1
If the sampling
scheme Pis WITHOUT REPLACEMENT
n 2
N −1 i=1 (Xi − X )
sX2 ,WOR =
N n−1
Why is estimating population variance important?
BSA Sampling 11 / 50
Estimating Population Variance from Sample Observations
If the P
sampling scheme is WITH REPLACEMENT
n
(Xi − X )2
sX2 = i=1
n−1
If the sampling
scheme Pis WITHOUT REPLACEMENT
n 2
N − 1 i=1 (Xi − X )
sX2 ,WOR =
N n−1
Why is estimating population variance important?
To get an idea about error in estimation of sample mean
BSA Sampling 11 / 50
Standard Error in Sample Mean
BSA Sampling 12 / 50
Central Limit Theorem
BSA Sampling 13 / 50
Central Limit Theorem
BSA Sampling 13 / 50
Central Limit Theorem
Theorem
If the sample size is large, for WITH REPLACEMENT and
independent sampling, the sample mean X is approximately normal
with
1 mean = µ
σ2
2 variance = n
BSA Sampling 14 / 50
Comments about Central Limit Theorem
BSA Sampling 15 / 50
CLT when population itself is normally distributed
BSA Sampling 16 / 50
Sample Proportion
BSA Sampling 17 / 50
Sample Proportion
BSA Sampling 18 / 50
Sample Proportion
BSA Sampling 18 / 50
Sampling Proportion
BSA Sampling 19 / 50
Confidence Interval
BSA Sampling 20 / 50
Confidence Interval
BSA Sampling 21 / 50
Confidence Interval Idea
BSA Sampling 22 / 50
Confidence Interval Idea
BSA Sampling 23 / 50
Example of Confidence Interval
BSA Sampling 24 / 50
BSA Sampling, etc. continued
BSA Sampling 25 / 50
Easier way for check unbiasedness of sample proportion
BSA Sampling 26 / 50
Easier way for check unbiasedness of sample proportion
BSA Sampling 26 / 50
Easier way for check unbiasedness of sample proportion
BSA Sampling 26 / 50
Easier way for check unbiasedness of sample proportion
BSA Sampling 26 / 50
Standard Normal Distribution
BSA Sampling 27 / 50
Standard Normal Distribution
BSA Sampling 27 / 50
Standard Normal Distribution
BSA Sampling 27 / 50
Standard Normal Distribution
BSA Sampling 27 / 50
Confidence Interval
Back to Sample Mean X
1 X is a random variable
2 Under certain conditions, large sample size, etc. We use CLT
to get better idea about X
3 Using properties of normal distribution, what can be said
about
4 P X is in between µ − 2 √σn and µ + 2 √σn
BSA Sampling 28 / 50
Confidence Interval
BSA Sampling 28 / 50
Confidence Interval
BSA Sampling 28 / 50
Confidence Interval
BSA Sampling 28 / 50
Confidence Interval
BSA Sampling 28 / 50
Confidence Interval Discussions
BSA Sampling 29 / 50
Summary of results for 100(1-α)% C.I.
n σ2 C.I. Type Symmetric C.I.
zα σ zα σ
Large known Approximate X − √2 , X + √2
n n
zα s zα s
Large unknown Approximate X − √2 , X + √2
n n
Table: C.I. for population mean µ, s is sample standard deviation
BSA Sampling 30 / 50
Proof of the above results
We will skip the proof, because they are very similar in nature
Source - https://www.istockphoto.com/in/vector/set-of-children-skipping-gm974441736-265090753
BSA Sampling 31 / 50
Sample Size Determination
BSA Sampling 32 / 50
t-dist, hypothesis testing
BSA Sampling 33 / 50
Basics about Random Variable - Revisited
BSA Sampling 34 / 50
Standard Normal Distribution
A normal distribution with mean 0 and standard deviation as 1
Source - https://en.wikipedia.org/wiki/Normal_distribution#Standard_normal_distribution
BSA Sampling 35 / 50
Chi-square distribution
BSA Sampling 36 / 50
Chi-square distribution
Source - https://en.wikipedia.org/wiki/Chi-squared_distribution
BSA Sampling 37 / 50
t-distribution
If Z ∼ N(0, 1) and U ∼ χ2n are independent, then the distribution
Z
of q is called the t-distribution with n degrees of freedom
U
n
Source - https://en.wikipedia.org/wiki/File:Student_t_pdf.svg
BSA Sampling 38 / 50
F distribution
If U ∼ χ2m and V ∼ χ2n are independent, then the distribution of
U
m
V
is called the F-distribution with m and n degrees of freedom
n
and is denoted by Fm,n
Source - https://en.wikipedia.org/wiki/F-distribution
BSA Sampling 39 / 50
Application of distributions that we just saw
BSA Sampling 40 / 50
Case of normal population
SamplePnmean is denoted by X and defined as follows
Xi
X = i=1
n
If the sampling scheme is WITH REPLACEMENT, sample
variance
Pnequals
(Xi − X )2
sX2 = i=1
n−1
It can be shown that -
1 X and sX2 are independent
(n − 1)sX2
2 ∼ χ2n−1
σ2
BSA Sampling 41 / 50
Case of normal population
BSA Sampling 41 / 50
Case of normal population
BSA Sampling 41 / 50
Moral of the story - results for 100(1-α)% C.I. for Mean
zα s zα s
Any Large unknown Approximate X − √2 , X + √2
n n
zα σ zα σ
Normal Any known Exact X − √2 , X + √2
n n
tn−1, α2 s tn−1, α2 s
Normal Any unknown Exact X− √ ,X + √
n n
Table: C.I. for population mean µ, s is sample standard deviation
BSA Sampling 43 / 50
Hypothesis Testing
BSA Sampling 44 / 50
Key Terms in Hypothesis Testing
BSA Sampling 45 / 50
Key Terms in Hypothesis Testing
BSA Sampling 45 / 50
Errors, Level and Power
BSA Sampling 46 / 50
Remarks on α and β
BSA Sampling 47 / 50
Example
PRM 39 was accused of using a biased coin in the toss for cricket
match. A test was thus performed to check whether the given coin
is biased towards heads. The coin was tossed 10 times.
1 Observed result - HHTHHHHTHH
2 Null Hypothesis p = 0.5
3 Alternative Hypothesis is p > 0.5
It was decided that the null will be rejected if the number of heads
is greater than 7?
What is the level of the test?
What is the power of the test?
BSA Sampling 48 / 50
P-value
BSA Sampling 49 / 50
P-value
BSA Sampling 49 / 50
P-value
BSA Sampling 49 / 50
Thank you for your attention
BSA Sampling 50 / 50