
Sampling Distribution

Population and Sample

• A population (universe) is the collection of all members of a group.
– Example: all students of IIM.

• A sample is a portion of the population selected for analysis.
– Example: 50 students of IIM.

• Census: gathering data from the entire population


Reasons for Sampling
• Sampling can save money.
• Sampling can save time.
• For given resources, sampling can broaden the
scope of the data set.
• Because the research process is sometimes
destructive, the sample can save product.
• If accessing the whole population is impossible,
sampling is the only option.
Sampling is Necessary under
Certain Conditions
• To find the acreage under rice in India by the
census method, each and every plot of land on
which rice is grown within the geographical
limits of the Indian Union has to be visited, and
the area under rice cultivation has to be
measured. The method is quite impracticable
because of the enormous task of visiting the
plots and the consequent time and money
involved.
Sampling is Necessary under
Certain Conditions
• When a new mine is discovered and it is
necessary to know the quality of contents,
only a few ounces of ore, drilled out from here
and there, are sufficient for chemical analysis.
It is impossible and unnecessary to dig out the
whole mine and then examine the quality.
Parameter and Statistic
• A parameter is a numeric quantity, usually unknown, that
describes a certain population characteristic.
– Parameters are normally represented by Greek letters. The
population mean and variance are represented by the Greek letters
$\mu$ and $\sigma^2$, respectively.
– Example: The true “average” height of adult human males.

• A statistic is a quantity, calculated from a sample of data,
used to estimate a parameter.
– Statistics are usually represented by Latin letters with other
symbols. The sample mean and variance are denoted by the
symbols $\bar{X}$ and $s^2$, respectively.
– Example: The “average” height of a random sample of 1,000 adult
human males.
Types of Sampling

• Sampling is either random or non-random.
Random and Nonrandom Sampling
• Random sampling
– Every unit of the population has the same probability of being
included in the sample.
– A chance mechanism is used in the selection process.
– Eliminates bias in the selection process.
• Nonrandom sampling
– Not every unit of the population has the same probability
of being included in the sample.
– Open to selection bias.
– Not an appropriate data collection method for most statistical
methods.
Simple Random Sampling
• Every individual or item from the population has
an equal chance of being selected
• Selection may be with replacement or without
replacement
• Samples are obtained from a table of random numbers
or a computer random number generator
• May not be a good representation of the
population’s underlying characteristics
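The selection step above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed procedure: the function name `simple_random_sample`, the `seed` argument, and the student ID numbers are assumptions introduced here.

```python
import random

def simple_random_sample(population, n, replace=False, seed=None):
    """Draw n items so that every item has an equal chance of selection."""
    rng = random.Random(seed)  # seed is only for reproducibility
    if replace:
        # With replacement: the same item may be drawn more than once.
        return [rng.choice(population) for _ in range(n)]
    # Without replacement: each item appears at most once.
    return rng.sample(population, n)

# Hypothetical population: student ID numbers 1..500.
students = list(range(1, 501))
sample = simple_random_sample(students, 50, seed=1)
print(len(sample), len(set(sample)))  # 50 50 (no repeats without replacement)
```

Running it again with a different seed gives a different sample, which is the sample-to-sample variation that sampling distributions describe.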
Systematic Sampling
• Purchase orders for the previous fiscal year are
serialized 1 to 10,000 (N = 10,000).
• A sample of fifty (n = 50) purchase orders is
needed for an audit.
• k = 10,000/50 = 200
• First sample element randomly selected from the
first 200 purchase orders. Assume the 45th
purchase order was selected.
• Subsequent sample elements: 245, 445, 645, . . .
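The purchase-order example can be reproduced with a short sketch. The function name `systematic_sample` and its `start` and `seed` arguments are assumptions made for illustration; the numbers come from the example above.

```python
import random

def systematic_sample(N, n, start=None, seed=None):
    """Return n serial numbers from 1..N, taking every k-th item."""
    k = N // n  # sampling interval
    if start is None:
        # Random start within the first k items.
        start = random.Random(seed).randint(1, k)
    return [start + i * k for i in range(n)]

# N = 10,000 purchase orders, n = 50, and the 45th order picked first.
orders = systematic_sample(10_000, 50, start=45)
print(orders[:4], orders[-1])  # [45, 245, 445, 645] 9845
```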
Stratified Sampling
• Divide population into two or more subgroups (called
strata) according to some common characteristic
• A simple random sample is selected from each
subgroup, with sample sizes proportional to strata sizes
• Samples from subgroups are combined into one
• Ensures representation of individuals across the entire
population
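A minimal sketch of proportional stratified sampling follows. The stratum names, stratum sizes, and the function name `stratified_sample` are hypothetical, chosen only to make the allocation easy to check.

```python
import random

def stratified_sample(strata, n, seed=None):
    """strata maps a stratum name to the list of its members."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Proportional allocation; rounding can make the total differ from n.
        n_h = round(n * len(members) / total)
        # Simple random sample within the stratum.
        sample[name] = rng.sample(members, n_h)
    return sample

# Hypothetical population of 1,000 people split into three age strata.
listeners = {
    "20-30": [f"A{i}" for i in range(500)],
    "30-40": [f"B{i}" for i in range(300)],
    "40-50": [f"C{i}" for i in range(200)],
}
picked = stratified_sample(listeners, 100, seed=1)
print({name: len(members) for name, members in picked.items()})
# {'20-30': 50, '30-40': 30, '40-50': 20}
```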
Stratified Random Sample: Population
of FM Radio Listeners
Stratified by Age
[Figure: three age strata (20 - 30, 30 - 40, and 40 - 50 years old). Each stratum is homogeneous (alike) within; the strata are heterogeneous (different) between.]
Cluster Sampling
• Population is divided into several “clusters,” each
representative of the population
• A simple random sample of clusters is selected
– All items in the selected clusters can be used, or items
can be chosen from a cluster using another probability
sampling technique
• Cost effective
• Less efficient (a larger sample is needed to achieve the
same level of precision)
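One-stage cluster sampling, where every item in each selected cluster is used, might look like the sketch below. The cluster names and the function name `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, m, seed=None):
    """Select m whole clusters at random and keep every item in them."""
    rng = random.Random(seed)
    # Simple random sample of cluster names.
    chosen = rng.sample(sorted(clusters), m)
    items = [item for name in chosen for item in clusters[name]]
    return chosen, items

# Hypothetical population: 5 city blocks with 4 households each.
blocks = {f"block{b}": [f"b{b}-h{h}" for h in range(4)] for b in range(5)}
chosen, households = cluster_sample(blocks, 2, seed=1)
print(len(chosen), len(households))  # 2 8
```

For two-stage sampling, the list comprehension would be replaced by a random draw within each chosen cluster.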
Nonrandom Sampling
• Convenience Sampling: sample elements are
selected for the convenience of the researcher
• Judgment Sampling: sample elements are selected
by the judgment of the researcher
• Quota Sampling: sample elements are selected until
the quota controls are satisfied
• Snowball Sampling: survey subjects are selected
based on referral from other survey respondents
Types of Survey Errors
• Coverage error or selection bias
– Exists if some groups are excluded from the frame and have no chance of
being selected

• Nonresponse error or bias


– People who do not respond may be different from those who do respond

• Sampling error
– Variation from sample to sample will always exist

• Measurement error
– Due to weaknesses in question design, respondent error, and interviewer’s
effects on the respondent
Sampling Distributions
• A sampling distribution is the distribution of all
of the possible values of a statistic for samples of a
given size selected from a population
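For a very small population the sampling distribution can be listed exhaustively. The sketch below uses a made-up population of four values and enumerates every sample of size 2 drawn without replacement; the numbers are illustrative only.

```python
from collections import Counter
from itertools import combinations

population = [2, 4, 6, 8]  # hypothetical population, mean = 5
n = 2

# Every possible sample of size n, and the mean of each one.
sample_means = [sum(s) / n for s in combinations(population, n)]

# The sampling distribution: how often each value of the mean occurs.
distribution = Counter(sample_means)
print(sorted(distribution.items()))
# [(3.0, 1), (4.0, 1), (5.0, 2), (6.0, 1), (7.0, 1)]

# The mean of all sample means equals the population mean.
print(sum(sample_means) / len(sample_means))  # 5.0
```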
Standard Error of the Mean
• Different samples of the same size from the same population
will yield different sample means
• If the standard error is small, most of the sample means will be
near the population mean.
• Thus a particular sample mean has a good chance of being close
to the population mean, and will be a good estimator of the
population mean.
• Conversely, a large standard error means that the given sample
mean will be a poor estimator of the population mean.
• Note that the standard error of the mean decreases as the
sample size increases
Sampling from Normal Populations
• If a population is normal with mean μ and
standard deviation σ, the sampling
distribution of $\bar{X}$ is also normally distributed
with
$\mu_{\bar{X}} = \mu$ and $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$
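A small simulation can check that the standard error of the mean is σ divided by the square root of n, and that it shrinks as n grows. The values μ = 50 and σ = 10, the trial count, and the function names are assumptions chosen for illustration.

```python
import math
import random

def standard_error(sigma, n):
    """Theoretical standard error of the sample mean."""
    return sigma / math.sqrt(n)

def simulate_sample_means(mu, sigma, n, trials=5_000, seed=0):
    """Draw `trials` samples of size n from a normal population."""
    rng = random.Random(seed)
    return [sum(rng.gauss(mu, sigma) for _ in range(n)) / n
            for _ in range(trials)]

def std_dev(values):
    m = sum(values) / len(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

for n in (4, 25, 100):
    means = simulate_sample_means(50, 10, n)
    # The simulated spread should be close to the theoretical value.
    print(n, standard_error(10, n), round(std_dev(means), 2))
```

The theoretical column reads 5.0, 2.0, and 1.0; the simulated column should land near those values.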
Sampling Distribution Properties

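• $\mu_{\bar{X}} = \mu$: the sampling distribution is also normal and has the same mean as the normal population distribution.
• $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$
[Figure: normal population distribution of x centered at μ, and the sampling distribution of $\bar{x}$ centered at $\mu_{\bar{x}}$]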
Sampling Distribution Properties

• As n increases, $\sigma_{\bar{x}}$ decreases.
[Figure: sampling distributions of $\bar{x}$ centered at μ for a larger sample size and a smaller sample size]
Sampling from Nonnormal Populations

• Apply the Central Limit Theorem.

• Central Limit Theorem:
If $\bar{X}$ is the mean of a random sample of size n from
a population with mean μ and standard deviation
σ, then as n increases the distribution of $\bar{X}$
approaches a normal distribution with mean $\mu_{\bar{X}} = \mu$
and standard deviation $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$.
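The theorem can be illustrated by simulation. The sketch below assumes an exponential population with mean 1 and standard deviation 1, and uses skewness as a rough measure of how far the distribution of sample means is from a symmetric normal shape. The function name and trial count are illustrative choices.

```python
import math
import random

def sample_mean_stats(n, trials=10_000, seed=0):
    """Mean, standard deviation, and skewness of simulated sample means."""
    rng = random.Random(seed)
    # Exponential population with rate 1: mean 1, standard deviation 1.
    means = [sum(rng.expovariate(1.0) for _ in range(n)) / n
             for _ in range(trials)]
    m = sum(means) / trials
    sd = math.sqrt(sum((x - m) ** 2 for x in means) / trials)
    skew = sum((x - m) ** 3 for x in means) / (trials * sd ** 3)
    return m, sd, skew

for n in (2, 5, 30):
    m, sd, skew = sample_mean_stats(n)
    # sd should be near 1/sqrt(n); skew should shrink toward 0 as n grows.
    print(n, round(m, 2), round(sd, 3), round(1 / math.sqrt(n), 3), round(skew, 2))
```

The population itself is strongly right-skewed, yet the skewness of the sample means falls steadily as n increases, while their mean stays near 1.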
Distribution of Sample Means
for Various Sample Sizes
[Figure: distributions of sample means for n = 2, n = 5, and n = 30, drawn from an exponential population and from a uniform population]
Z-value for Sampling Distribution
of the Mean
• Z-value for the sampling distribution of $\bar{X}$:
$$Z = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu}{\dfrac{\sigma}{\sqrt{n}}}$$
where:
$\bar{X}$ = sample mean
μ = population mean
σ = population standard deviation
n = sample size
Example
• Suppose a population has mean μ = 8 and
standard deviation σ = 3. Suppose a random
sample of size n = 36 is selected.

• What is the probability that the sample mean
is between 7.8 and 8.2?
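One way to work this example is to convert both limits to Z-values and take the difference of the standard normal probabilities. The helper `normal_cdf` is introduced here for illustration and is built on the standard library's `math.erf`; with n = 36 the sampling distribution of the mean is treated as approximately normal.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu, sigma, n = 8, 3, 36
se = sigma / math.sqrt(n)      # standard error = 3 / 6 = 0.5

z_low = (7.8 - mu) / se        # about -0.4
z_high = (8.2 - mu) / se       # about  0.4

prob = normal_cdf(z_high) - normal_cdf(z_low)
print(round(prob, 4))          # 0.3108
```

So roughly 31% of samples of size 36 would have a mean between 7.8 and 8.2.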
Sampling from a Finite Population
• In this case, the standard deviation of the
distribution of sample means is smaller than
when sampling from an infinite population
• The correct value of this standard deviation is
computed by applying a finite population
multiplier to the standard deviation for sampling
from an infinite population.
• If the sample size is less than 5% of the
population size, the adjustment is unnecessary.
Modified Formula

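• Finite population multiplier = $\sqrt{\dfrac{N-n}{N-1}}$

• Standard error of the mean from a finite population:
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$$
• Modified Z formula:
$$Z = \frac{\bar{X} - \mu}{\dfrac{\sigma}{\sqrt{n}} \sqrt{\dfrac{N-n}{N-1}}}$$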
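The effect of the multiplier can be seen with a quick calculation. The population size, sample size, and standard deviation below are hypothetical, and the function names are illustrative.

```python
import math

def finite_population_multiplier(N, n):
    """Square root of (N - n) / (N - 1)."""
    return math.sqrt((N - n) / (N - 1))

def standard_error_finite(sigma, N, n):
    """Standard error of the mean when sampling from a finite population."""
    return sigma / math.sqrt(n) * finite_population_multiplier(N, n)

# Hypothetical case: N = 1,000, n = 100 (10% of the population), sigma = 20.
print(round(finite_population_multiplier(1000, 100), 4))  # 0.9492
print(round(standard_error_finite(20, 1000, 100), 4))     # 1.8983

# When n is under 5% of N the multiplier is close to 1.
print(round(finite_population_multiplier(1000, 40), 4))   # 0.9803
```

Without the multiplier the standard error in the first case would be 2.0, so the correction lowers it by about 5%.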
Sampling Distribution of
Proportion
• Sometimes we choose to use the sample proportion for
analysis when the research results in countable items.
– How many people in a sample choose Pepsi as their soft
drink?
– How many people in a sample have a flexible work
schedule?
• A proportion is computed by dividing the frequency with which
a given characteristic occurs in a sample by the number of
items in the sample.
• We denote π as population proportion and p as sample
proportion.
Sampling Distribution of
Proportion
• The sample proportion is the fraction of successes in n binomial trials.
It is the number of successes, x, divided by the number of trials, n.
• Sample proportion: $p = \dfrac{x}{n}$

• From the central limit theorem, the distribution of the sample proportion is
approximately normal if nπ > 5 and n(1 − π) > 5. The standard error of the
proportion is $\sqrt{\dfrac{\pi(1-\pi)}{n}}$
and the z formula is
$$z = \frac{p - \pi}{\sqrt{\dfrac{\pi(1-\pi)}{n}}}$$
Problem
• Suppose that 25% of all Indian consumers in a given income and
lifestyle category are interested in buying a particular brand of car. A
random sample of 100 Indian consumers in the category of
interest is to be selected. What is the probability that at least
20% of those in the sample will express an interest in that
particular brand of car?
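A possible solution applies the z formula for a sample proportion with π = 0.25, n = 100, and p = 0.20. The helper `normal_cdf` is introduced here for illustration and relies on the standard library's `math.erf`.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

pi, n, p = 0.25, 100, 0.20

# Check the conditions for the normal approximation.
normal_ok = n * pi > 5 and n * (1 - pi) > 5   # 25 > 5 and 75 > 5

se = math.sqrt(pi * (1 - pi) / n)   # standard error, about 0.0433
z = (p - pi) / se                   # about -1.15
prob = 1.0 - normal_cdf(z)          # P(sample proportion >= 0.20)

print(normal_ok, round(z, 2), round(prob, 2))  # True -1.15 0.88
```

A table lookup with z rounded to -1.15 gives about 0.8749; the small difference from the computed value comes from rounding z.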
