
Sampling and Sampling Distributions

■ Aims of Sampling
■ Probability Distributions
■ Sampling Distributions
■ The Central Limit Theorem
■ Types of Samples
Aims of sampling

■ Reduces cost of research (e.g., political
polls)
■ Generalize about a larger population (e.g.,
benefits of sampling a city rather than a
neighborhood)
■ In some cases (e.g. industrial production)
analysis may be destructive, so sampling
is needed
Probability

■ Probability: what is the chance that a
given event will occur?
■ Probability is expressed in numbers
between 0 and 1. Probability = 0 means
the event never happens; probability = 1
means it always happens.
■ The probabilities of all possible events
always sum to 1.
Probability distributions: Permutations

What is the probability distribution of number
of girls in families with two children?
Num. Girls   Outcome
2            GG
1            BG
1            GB
0            BB
[Figure: Probability Distribution of Number of Girls; x-axis: 0, 1, 2]
How about family of three?
Num. Girls child #1 child #2 child #3
0 B B B
1 B B G
1 B G B
1 G B B
2 B G G
2 G B G
2 G G B
3 G G G
Probability distribution of number of girls
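
The tables above can be reproduced by listing every possible birth order and counting the girls in each. The following is a minimal sketch, assuming boys and girls are equally likely and births are independent; the function name `girls_distribution` is made up for this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def girls_distribution(n_children):
    """Return {number of girls: probability} by enumerating all B/G birth orders.

    Assumption: every birth order (e.g. BGB) is equally likely.
    """
    outcomes = list(product("BG", repeat=n_children))
    counts = Counter(order.count("G") for order in outcomes)
    return {k: Fraction(counts[k], len(outcomes)) for k in sorted(counts)}


for size in (2, 3):
    print(f"Family of {size}:")
    for girls, prob in girls_distribution(size).items():
        print(f"  {girls} girls: {prob}")
```

For a family of three, the eight rows of the table collapse into four possible counts (0, 1, 2, 3), with the middle counts more likely than the extremes.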
How about a family of 10?
As family size increases, the binomial
distribution looks more and more normal.

[Figure: binomial distribution for a family of 10; x-axis: Number of Successes, 0 to 10]
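
For a family of 10, listing all 1,024 birth orders is tedious, so the binomial formula gives the same probabilities directly. This is a rough sketch, assuming each child is a girl with probability 0.5; `binomial_pmf` is an illustrative helper name, and the text bar chart is only there to show the bell-like shape.

```python
from math import comb


def binomial_pmf(k, n, p=0.5):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n = 10  # family size
for k in range(n + 1):
    prob = binomial_pmf(k, n)
    # One '#' per percentage point, rounded, as a crude histogram.
    print(f"{k:2d} girls  {prob:.4f}  {'#' * round(prob * 100)}")
```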
Normal distribution

Same shape, if you adjusted the scales


Coin toss

■ Toss a coin 30 times
■ Tabulate results
Coin toss

■ Suppose this were 12 randomly selected
families, and heads were girls
■ If you did it enough times, the distribution
would approximate a normal distribution
■ Think of the coin tosses as samples of all
possible coin tosses
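
The coin-toss exercise can be imitated on a computer. The sketch below assumes a fair coin and repeats the 30-toss experiment 1,000 times (both the repetition count and the seed are arbitrary choices for illustration), then tabulates how often each number of heads occurs.

```python
import random
from collections import Counter


def count_heads(n_tosses, rng):
    """Toss a fair coin n_tosses times and return the number of heads."""
    return sum(rng.random() < 0.5 for _ in range(n_tosses))


rng = random.Random(0)  # fixed seed so the run is repeatable
heads_counts = [count_heads(30, rng) for _ in range(1000)]

tally = Counter(heads_counts)
for heads in sorted(tally):
    # One '#' per 5 experiments that produced this many heads.
    print(f"{heads:2d} heads  {'#' * (tally[heads] // 5)}")
```

The tally should pile up around 15 heads and thin out toward the extremes, which is the bell shape the slide describes.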
Sampling distribution

Sampling distribution of the mean - A
theoretical probability distribution of sample
means that would be obtained by drawing from
the population all possible samples of the same
size.
Central Limit Theorem

■ No matter what we are measuring, the
distribution of any measure across all
possible samples we could take
approximates a normal distribution, as
long as the number of cases in each
sample is about 30 or larger.
Central Limit Theorem

If we repeatedly drew samples from a
population and calculated the mean of a
variable or a percentage, those sample
means or percentages would be approximately
normally distributed.
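
A small simulation can make this concrete. The sketch below builds a deliberately skewed, made-up population (exponentially distributed values with a mean near 40; this is not real data), draws many samples of 30 cases, and compares the spread of the sample means with the spread of the population.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical skewed population: most values small, a few very large.
population = [rng.expovariate(1 / 40) for _ in range(100_000)]


def sample_means(pop, sample_size, n_samples, rng):
    """Draw n_samples simple random samples and return the mean of each."""
    return [statistics.fmean(rng.sample(pop, sample_size)) for _ in range(n_samples)]


means = sample_means(population, sample_size=30, n_samples=2000, rng=rng)

print("population mean:      ", round(statistics.fmean(population), 2))
print("mean of sample means: ", round(statistics.fmean(means), 2))
print("population std. dev.: ", round(statistics.pstdev(population), 2))
print("std. dev. of means:   ", round(statistics.stdev(means), 2))
```

The sample means cluster around the population mean and vary much less than individual cases do, even though the population itself is far from normal.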
Most empirical distributions are not normal:

[Figure: U.S. Income distribution 1992; percentage of people with income less than a given amount (10% to 90%) against gross annual income (in thousands of dollars)]


But the sampling distribution of mean income over
many samples is normal
[Figure: Sampling Distribution of Income, 1992 (thousands); y-axis: Number of samples]
Standard Deviation

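Measures how spread out a distribution is.

Square root of the sum of the squared
deviations of each case from the mean
over the number of cases, or

$$s = \sqrt{\frac{\sum (X - \bar{X})^2}{N}}$$

When the standard deviation is estimated from
a sample, the divisor is $n - 1$ rather than $N$.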
Example of Standard Deviation
Amount (X)   Mean (X̄)   Deviation from Mean (X - X̄)   Squared Deviation (X - X̄)²
600          435         600 - 435 = 165                27,225
350          435         350 - 435 = -85                 7,225
275          435         275 - 435 = -160               25,600
430          435         430 - 435 = -5                     25
520          435         520 - 435 = 85                  7,225
Sum                      0                              67,300

$$s = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = \sqrt{\frac{67{,}300}{4}} = \sqrt{16{,}825} = 129.71$$
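
The arithmetic in this example can be checked step by step. The sketch below uses the five amounts from the table and the sample formula with n - 1 in the denominator; the standard-library `statistics.stdev` uses the same divisor and serves as a cross-check.

```python
import math
import statistics

amounts = [600, 350, 275, 430, 520]

mean = sum(amounts) / len(amounts)                   # 435
squared_devs = [(x - mean) ** 2 for x in amounts]    # squared deviation of each case
sum_sq = sum(squared_devs)                           # 67,300
s = math.sqrt(sum_sq / (len(amounts) - 1))           # divide by n - 1, then take the root

print("mean:", mean)
print("sum of squared deviations:", sum_sq)
print("standard deviation:", round(s, 2))
print("statistics.stdev:", round(statistics.stdev(amounts), 2))
```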
Standard Deviation and Normal Distribution
[Figure: Distribution of Sample Means with 21 Samples; x-axis: Sample Means]
[Figure: Distribution of Sample Means with 96 Samples; x-axis: Sample Means]
[Figure: Distribution of Sample Means with 170 Samples; x-axis: Sample Means]
The standard deviation of the sampling
distribution is called the standard error.
The Central Limit Theorem
Standard error can be estimated from a single sample:

$$s_{\bar{X}} = \frac{s}{\sqrt{n}}$$

where
s is the sample standard deviation (i.e., the
sample-based estimate of the standard deviation of the
population), and
n is the size (number of observations) of the sample.
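
As an illustration, the standard error can be computed for the five amounts used in the standard deviation example earlier. This is a minimal sketch; `standard_error` is a helper name chosen here, and a sample of five is far too small for practical inference.

```python
import math
import statistics


def standard_error(sample):
    """Estimate the standard error of the mean as s / sqrt(n)."""
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    return s / math.sqrt(len(sample))


amounts = [600, 350, 275, 430, 520]
print("standard error:", round(standard_error(amounts), 2))
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.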
Confidence intervals
Because we know that the sampling distribution is normal, we
know that 95.45% of sample means will fall within two standard
errors of the population mean.

95% of sample means fall within 1.96 standard errors.

99% of sample means fall within 2.58 standard errors.
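
These percentages come straight from the normal curve and can be checked with Python's standard library. The sketch below also shows how a multiplier turns a sample mean and standard error into an interval; `coverage` and `confidence_interval` are illustrative names, and the mean of 435 and standard error of 58 are example inputs only.

```python
from statistics import NormalDist


def coverage(z):
    """Share of a normal distribution lying within z standard deviations of its mean."""
    return 2 * NormalDist().cdf(z) - 1


def confidence_interval(mean, std_error, z=1.96):
    """Return (lower, upper) bounds: mean plus or minus z standard errors."""
    return (mean - z * std_error, mean + z * std_error)


for z in (2.0, 1.96, 2.58):
    print(f"within {z} standard errors: {coverage(z):.4f}")

# Example inputs only (assumed values, not results from a real survey).
low, high = confidence_interval(mean=435, std_error=58)
print(f"95% interval: {low:.1f} to {high:.1f}")
```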
Sampling

■ Population - A group that includes all the
cases (individuals, objects, or groups) in
which the researcher is interested.
■ Sample - A relatively small subset from a
population.
Random Sampling
■ Simple Random Sample - A sample
designed in such a way as to ensure
that (1) every member of the population
has an equal chance of being chosen
and (2) every combination of N
members has an equal chance of being
chosen.
■ This can be done using a computer,
calculator, or a table of random
numbers.
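
On a computer, a simple random sample can be drawn with Python's `random.sample`, which selects without replacement and gives every combination of the requested size the same chance. The population of 40 numbered students and the sample size of 8 below are hypothetical.

```python
import random

population = list(range(1, 41))  # hypothetical students numbered 1 to 40

rng = random.Random(7)           # fixed seed so the draw is repeatable
sample = rng.sample(population, 8)

print(sorted(sample))
```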
Population inferences can be made
by selecting a representative sample from
the population.
Random Sampling

■ Systematic random sampling - A
method of sampling in which every Kth
member (K is a ratio obtained by dividing
the population size by the desired sample
size) in the total population is chosen for
inclusion in the sample after the first
member of the sample is selected at
random from among the first K members
of the population.
Systematic Random Sampling
Figure 11.2 Systematic Random Sampling

From a population of 40 students, let's select a systematic random sample of 8 students. Our
skip interval will be 5 (40 ÷ 8 = 5). Using a random number table, we choose a number
between 1 and 5. Let's say we choose 4. We then start with student 4 and pick every
5th student: 4, 9, 14, 19, 24, 29, 34, and 39.

Our trip to the random number table could have just as easily given us a 1 or a 5, so all
the students do have a chance to end up in our sample.
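
The procedure in this example can be sketched in a few lines. The function name `systematic_sample` and its parameters are made up for illustration, and the sketch assumes the population size divides evenly by the sample size, as 40 and 8 do.

```python
import random


def systematic_sample(population, sample_size, start=None, rng=random):
    """Pick every Kth member after a random start among the first K members.

    K is the skip interval: population size divided by sample size.
    `start` is 1-based; if omitted, it is chosen at random from 1..K.
    """
    k = len(population) // sample_size
    if start is None:
        start = rng.randint(1, k)
    return population[start - 1::k][:sample_size]


students = list(range(1, 41))  # students numbered 1 to 40

# Reproduce the example: skip interval 5, random start 4.
print(systematic_sample(students, 8, start=4))

# A fresh random start gives a different, equally valid sample.
print(systematic_sample(students, 8))
```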
Stratified Random Sampling
■ Proportionate stratified sample - The size
of the sample selected from each subgroup is
proportional to the size of that subgroup in
the entire population. (Self weighting)
■ Disproportionate stratified sample - The
size of the sample selected from each
subgroup is disproportional to the size of that
subgroup in the population. (needs weights)
Disproportionate Stratified Sample
Stratified Random Sampling

■ Stratified random sample - A method of
sampling obtained by (1) dividing the
population into subgroups based on one
or more variables central to our analysis
and (2) then drawing a simple random
sample from each of the subgroups
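
The two steps can be sketched for the proportionate case, where each subgroup contributes in proportion to its size. Everything in the example is hypothetical: the function name, the 40 students, and the split into 24 freshmen and 16 seniors. Rounding each subgroup's share can make the total differ slightly from the requested size when the proportions do not divide evenly.

```python
import random


def proportionate_stratified_sample(population, stratum_of, sample_size, rng):
    """Draw a simple random sample from each stratum, sized by the stratum's share."""
    # Step 1: divide the population into subgroups (strata).
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)

    # Step 2: simple random sample within each stratum.
    sample = []
    for members in strata.values():
        n = round(sample_size * len(members) / len(population))
        sample.extend(rng.sample(members, n))
    return sample


# Hypothetical population: 24 freshmen and 16 seniors.
students = [(i, "freshman" if i <= 24 else "senior") for i in range(1, 41)]

sample = proportionate_stratified_sample(
    students, stratum_of=lambda s: s[1], sample_size=10, rng=random.Random(1)
)
print(sample)
```

With these numbers the sample holds 6 freshmen and 4 seniors, matching the 60/40 split in the population, so no weights are needed.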
