
SAMPLING & ESTIMATION

Main Issues
 Universe/Population
 Sampling Unit
 Sampling Frame
 Sample Size
 Budgetary Constraints
 Sampling Procedure
Criteria of Design
 Minimise cost of sampling, which is made up of:
– Cost of collecting & analyzing data
– Cost of incorrect inferences
 Systematic bias & sampling error lead to incorrect inferences.
 Systematic bias – Inherent in the system
 Sampling error – Random variation, controllable by sample size
Sampling Methods
A. Non-random/Non-probability-based sampling
– Convenience/ judgmental /purposive/quota sampling
B. Random/Probability- based sampling
1. Simple random sampling
 Each element/item has an equal chance of being included in a sample
(randomness).
 Sampling with/without replacement
 Random number table, pseudo-random number generator.
2. Stratified Sampling
 Each stratum is a homogeneous group and different from other strata.
 Random selection from each stratum.
3. Systematic sampling
 Elements selected at a uniform interval.
 Selection evenly spread, less cost & time, more convenient.
 Problem in case of hidden periodicity.
4. Cluster sampling
 Little or no variation among clusters; the variation lies within each cluster.
 Clusters are selected randomly for further analysis.
 Area sampling in geographical clusters.
 Multi-stage sampling as a special case.
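
The four probability-based methods above can be sketched in a few lines of Python. This is only an illustrative sketch using the standard `random` module; the function names, the toy population and the strata/cluster labels are hypothetical and not part of the slides.

```python
import random

def simple_random_sample(population, k, replace=False, rng=random):
    """Every element has the same chance of being selected."""
    if replace:
        return rng.choices(population, k=k)   # sampling with replacement
    return rng.sample(population, k)          # sampling without replacement

def stratified_sample(strata, k_per_stratum, rng=random):
    """Random selection of k_per_stratum elements from each stratum."""
    return {name: rng.sample(members, k_per_stratum)
            for name, members in strata.items()}

def systematic_sample(population, interval, rng=random):
    """Random starting point, then every `interval`-th element."""
    start = rng.randrange(interval)
    return population[start::interval]

def cluster_sample(clusters, n_clusters, rng=random):
    """Whole clusters are chosen at random; all their members are kept."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}

# Hypothetical toy data, for illustration only.
rng = random.Random(42)
population = list(range(1, 101))
strata = {"low": population[:50], "high": population[50:]}
clusters = {f"area_{i}": population[i * 10:(i + 1) * 10] for i in range(10)}

print(simple_random_sample(population, 5, rng=rng))
print(stratified_sample(strata, 3, rng=rng))
print(systematic_sample(population, 20, rng=rng))
print(cluster_sample(clusters, 2, rng=rng))
```

If the population list has a hidden periodicity that matches `interval`, `systematic_sample` returns elements that are all alike, which is the problem noted under systematic sampling.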
ESTIMATION FROM SAMPLES
• Sampling Distribution: Distribution of a sample
statistic, usually the mean.
• Standard error ($\sigma_{\bar{x}}$): Standard deviation of the
sampling distribution.
• Mean of the sampling distribution of means ($\mu_{\bar{x}}$), taking
all possible samples exhaustively, equals the
population mean (µ), whatever the shape of the
population distribution.
• As sample size increases, standard error decreases.
Assuming Normal Population Distribution

$\sigma_{\bar{x}} = \sigma / \sqrt{n}$

σ = Population standard deviation
n = Sample size
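
A small check of these statements, as a sketch only: for a tiny hypothetical population, every possible sample of size n (drawn with replacement) is enumerated. The mean of the sample means then matches the population mean, and their standard deviation matches σ/√n, the usual standard-error relation (assumed here to be the one intended on the slide).

```python
from itertools import product
from math import isclose, sqrt
from statistics import mean, pstdev

population = [2.0, 4.0, 6.0, 8.0]   # hypothetical toy population
n = 2                               # sample size

# Every possible ordered sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))
sample_means = [mean(s) for s in samples]

mu = mean(population)        # population mean
sigma = pstdev(population)   # population standard deviation

mean_of_means = mean(sample_means)       # mean of the sampling distribution
standard_error = pstdev(sample_means)    # its standard deviation

print(mean_of_means, mu)
print(standard_error, sigma / sqrt(n))
```

Raising `n` makes `sigma / sqrt(n)` smaller, which is the "standard error decreases as sample size increases" statement in numeric form.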
Central Limit Theorem:
 Irrespective of the shape of the population distribution, the sampling
distribution of the mean approaches normal as sample size increases.
 Mean of such a sampling distribution is the population mean.
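
The theorem can be illustrated by simulation. The sketch below assumes a strongly skewed (exponential, mean 1) population, chosen purely for illustration, and compares the skewness of sample means for a small and a larger sample size. The helper names are hypothetical.

```python
import random
from statistics import fmean, pstdev

def skewness(values):
    """Mean cubed z-score; 0 for a perfectly symmetric distribution."""
    m, s = fmean(values), pstdev(values)
    return fmean(((v - m) / s) ** 3 for v in values)

def sample_means(n, repetitions, rng):
    """Means of `repetitions` samples of size n from an exponential population."""
    return [fmean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(repetitions)]

rng = random.Random(1)
means_small_n = sample_means(2, 2000, rng)    # n = 2
means_large_n = sample_means(30, 2000, rng)   # n = 30

print("n = 2 :", fmean(means_small_n), skewness(means_small_n))
print("n = 30:", fmean(means_large_n), skewness(means_large_n))
```

Both sets of sample means centre near the population mean of 1, while the skewness for n = 30 is much closer to 0, i.e. closer to the normal shape.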

Larger sample size => Smaller standard error => Higher precision of estimation,
Vs higher cost of sampling
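Point Estimate: A single value used as the estimate of a population parameter.
Interval Estimate: A range of values within which the parameter is expected to lie.
 Level of significance: α
 Confidence Level: Probability (1- α) that is associated with an interval
estimate of any population parameter.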
 Higher confidence level => Wider confidence
interval
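
The link between confidence level and interval width can be seen from the standard normal critical values. A minimal sketch using Python's standard library follows; `z_critical` is a hypothetical helper name.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value Z_{alpha/2} for confidence level (1 - alpha)."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for confidence in (0.90, 0.95, 0.99):
    z = z_critical(confidence)
    # The interval spans 2 * Z standard errors around the point estimate.
    print(f"{confidence:.0%}: Z = {z:.3f}, width = {2 * z:.3f} standard errors")
```

The critical values come out as roughly 1.645, 1.960 and 2.576, so the interval widens as the confidence level rises.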
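Estimation of mean from large sample (usually n > 30):
As sample size is large, sampling distribution of
mean is approximately normal.
1. Compute the standard error $\sigma_{\bar{x}} = \sigma/\sqrt{n}$ from either the known σ
or the estimate s (sample standard deviation).
2. Get Z value from standard normal distribution table
corresponding to confidence level (1- α).
3. The confidence interval is $\bar{x} \pm Z_{\alpha/2}\,\sigma_{\bar{x}}$.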
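
The three steps can be sketched as a short function. The function name and the input figures are hypothetical, and `NormalDist().inv_cdf` stands in for the table look-up in step 2.

```python
from math import sqrt
from statistics import NormalDist

def large_sample_ci(sample_mean, std_dev, n, confidence=0.95):
    """Two-sided CI for the mean when n is large (normal approximation)."""
    standard_error = std_dev / sqrt(n)             # step 1
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)    # step 2 (replaces the Z table)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin   # step 3

# Hypothetical figures, chosen only for illustration.
low, high = large_sample_ci(sample_mean=50.0, std_dev=8.0, n=64, confidence=0.95)
print(f"95% CI: ({low:.2f}, {high:.2f})")   # (48.04, 51.96)
```

Here `std_dev` may be either the known σ or the sample estimate s, matching step 1.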
Estimation of means from small samples (n ≤ 30):
t-distribution:
 Applicable for smaller sample sizes when σ is unknown.
 Unimodal and almost bell-shaped.
 Flatter than normal.
 The larger the sample size, the less flat the distribution and
the closer it is to normal.
 Value of t varies with the d.f., i.e. (n-1), as the distribution shape
changes.
Step 1. Compute $\bar{x}$ and the estimated standard error $s/\sqrt{n}$ as usual.
Step 2. Get t value from the t-distribution table corresponding to (n-1)
as d.f. and (1- confidence level) as the total area under the two tails of the curve.
Step 3. $\bar{x} \pm t_{\alpha/2}\, s/\sqrt{n}$ is the confidence interval/limit.
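
A minimal sketch of these steps follows. Python's standard library has no t-distribution, so the t value is passed in by hand from a t-table, as in Step 2. The function name and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def small_sample_ci(data, t_value):
    """CI for the mean as x_bar ± t * s / sqrt(n).

    t_value must be read from a t-table for d.f. = n - 1.
    """
    n = len(data)
    x_bar = mean(data)                 # step 1: sample mean
    s = stdev(data)                    # step 1: sample standard deviation (n - 1 divisor)
    margin = t_value * s / sqrt(n)     # step 3
    return x_bar - margin, x_bar + margin

# Hypothetical data: n = 5, so d.f. = 4.
data = [10.0, 12.0, 11.0, 13.0, 9.0]
T_TWO_TAIL_005_DF4 = 2.776   # t-table value, two-tail area 0.05, 4 d.f.

low, high = small_sample_ci(data, T_TWO_TAIL_005_DF4)
print(f"95% CI: ({low:.2f}, {high:.2f})")   # (9.04, 12.96)
```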
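Case                                                          Two-sided Confidence Interval (CI)

Population standard deviation, σ known                        $\bar{x} \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$

Population standard deviation, σ unknown; sample size n > 30   $\bar{x} \pm Z_{\alpha/2}\,s/\sqrt{n}$

Population standard deviation, σ unknown; sample size n ≤ 30   $\bar{x} \pm t_{\alpha/2}\,s/\sqrt{n}$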
Confidence Interval on the Variance of a Normal Distribution
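
No formula accompanies this heading, so the expression below is an assumption about what was shown. It is a commonly used two-sided interval, assuming a random sample of size n from a normal population with sample variance $s^2$:

```latex
\frac{(n-1)\,s^{2}}{\chi^{2}_{\alpha/2,\,n-1}}
\;\le\; \sigma^{2} \;\le\;
\frac{(n-1)\,s^{2}}{\chi^{2}_{1-\alpha/2,\,n-1}}
```

Here $\chi^{2}_{\alpha/2,\,n-1}$ and $\chi^{2}_{1-\alpha/2,\,n-1}$ denote the upper and lower $\alpha/2$ percentage points of the chi-square distribution with $n-1$ d.f.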

Confidence Intervals on a Population Proportion
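
No formula accompanies this heading either. One common choice is the large-sample normal-approximation interval $\hat{p} \pm Z_{\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}$. The sketch below assumes that form; the function name and the counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation CI for a population proportion (large n assumed)."""
    p_hat = successes / n                                    # point estimate
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    margin = z * sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical counts: 40 successes in a sample of 100.
low, high = proportion_ci(successes=40, n=100)
print(f"95% CI: ({low:.3f}, {high:.3f})")   # (0.304, 0.496)
```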


• Example 1: A sample of size 20 was collected
and the sample mean and standard deviation
are estimated as 9.8525 and 0.0965. Find 95%
CI for the mean.
• Example 2: The life in hours of a light bulb is
known to be approximately normally
distributed with standard deviation σ = 25 hours. A random sample of
40 bulbs has a mean life of 1014 hours.
1. Construct a 95% two-sided CI on the mean life.
2. Construct a 95% one-sided lower CI of the mean
life.
• Example 3: The following result shows the
investigation of the haemoglobin level of hockey
players (in g/dl).
15.3 16.0 14.4 16.2 16.2
14.9 15.7 14.6 15.3 17.7
16.0 15.0 15.7 16.2 14.7
14.8 14.6 15.6 14.5 15.2

a) Find the 90% two-sided CI on the mean
haemoglobin level.
15.43684211 0.83413996
b) Also construct a 90% upper CI on the mean
haemoglobin level.
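
One possible way to work the three examples is sketched below, under stated assumptions: the t values are typed in by hand from a standard t-table for 19 d.f.; the 25 hours in Example 2 is taken as the known population standard deviation; and a "lower"/"upper" one-sided CI is read as a lower/upper confidence bound on the mean. Example 3 uses the 20 listed values, so its results need not agree with the two unlabeled numbers printed with the example.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# t-table values for 19 d.f. (n = 20), entered by hand.
T_UPPER_0025_DF19 = 2.093   # upper-tail area 0.025 (95% two-sided)
T_UPPER_005_DF19 = 1.729    # upper-tail area 0.05  (90% two-sided)
T_UPPER_010_DF19 = 1.328    # upper-tail area 0.10  (90% one-sided)

# Example 1: n = 20, sigma unknown -> t interval.
n1, mean1, s1 = 20, 9.8525, 0.0965
margin1 = T_UPPER_0025_DF19 * s1 / sqrt(n1)
ex1 = (mean1 - margin1, mean1 + margin1)

# Example 2: sigma = 25 hours assumed known -> Z interval.
n2, mean2, sigma2 = 40, 1014.0, 25.0
se2 = sigma2 / sqrt(n2)
z_two_sided = NormalDist().inv_cdf(0.975)   # Z_{0.025}
z_one_sided = NormalDist().inv_cdf(0.95)    # Z_{0.05}
ex2_two_sided = (mean2 - z_two_sided * se2, mean2 + z_two_sided * se2)
ex2_lower_bound = mean2 - z_one_sided * se2

# Example 3: haemoglobin levels (g/dl), n = 20, sigma unknown -> t interval.
hb = [15.3, 16.0, 14.4, 16.2, 16.2,
      14.9, 15.7, 14.6, 15.3, 17.7,
      16.0, 15.0, 15.7, 16.2, 14.7,
      14.8, 14.6, 15.6, 14.5, 15.2]
n3, mean3, s3 = len(hb), mean(hb), stdev(hb)
se3 = s3 / sqrt(n3)
ex3_two_sided = (mean3 - T_UPPER_005_DF19 * se3, mean3 + T_UPPER_005_DF19 * se3)
ex3_upper_bound = mean3 + T_UPPER_010_DF19 * se3

print("Example 1, 95% CI:", ex1)
print("Example 2, 95% CI:", ex2_two_sided, "lower bound:", ex2_lower_bound)
print("Example 3, 90% CI:", ex3_two_sided, "upper bound:", ex3_upper_bound)
```

Under these assumptions Example 1 gives roughly (9.807, 9.898), and Example 2 gives roughly (1006.25, 1021.75) with a one-sided lower bound near 1007.50.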
