
LESSON 7 SAMPLING AND SAMPLING DISTRIBUTIONS

I. BASIC TERMS
Sampled population is the population from which the sample is drawn.
Frame is a list of the elements from which the sample will be selected.
Sample mean provides an estimate of a population mean.
Sample proportion provides an estimate of a population proportion.

II. SELECTING A SAMPLE


Sampling from a Finite Population
1. Simple random sample of size n from a finite population of size N is a sample selected
such that each possible sample of size n has the same probability of being selected.
* Use the counting rule for combinations, N!/(n!(N−n)!), to count the number of different simple random samples of size n that can be selected.
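
As a small illustration of the counting rule, the sketch below counts and lists every possible sample of size n = 2 from a population of N = 5 labeled elements. The labels A to E and the use of Python's standard library are assumptions made only for this example.

```python
from itertools import combinations
from math import comb

population = ["A", "B", "C", "D", "E"]  # hypothetical finite population, N = 5
n = 2                                    # sample size

# Counting rule for combinations: N! / (n! (N - n)!)
num_samples = comb(len(population), n)

# Every possible sample of size n. Under simple random sampling each one
# is equally likely, with probability 1 / num_samples.
all_samples = ["".join(s) for s in combinations(population, n)]

print(num_samples)       # 10
print(all_samples)
print(1 / num_samples)   # 0.1
```

Each of the 10 samples has probability 1/10 of being the one selected, which is what "same probability of being selected" means in the definition.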

Sampling from an Infinite Population


1. Random sample of size n from an infinite population is a sample
selected such that the following conditions are satisfied:
   a. Each element selected comes from the same population.
   b. Each element is selected independently.

III. POINT ESTIMATION


Sample statistic is a characteristic of a sample (such as x̄, s, or p̄) that corresponds to a population parameter.
Point estimation is the statistical procedure of computing a sample statistic to estimate a population parameter.
The value computed from a particular sample is called the point estimate.
x̄ = point estimator of the population mean μ
s = point estimator of the population standard deviation σ
p̄ = point estimator of the population proportion p

IV. INTRODUCTION TO SAMPLING DISTRIBUTIONS


Sampling distribution is the probability distribution of a sample statistic, such as the sample mean.
The sampling distribution and its properties enable us to make probability statements about how close the sample mean is to the population mean.

Sampling Distribution of sample mean


The sampling distribution of the sample mean is the probability distribution of all possible values of the sample mean.

Expected Value of sample mean: E(x̄) = μ


E(x̄) = expected value of x̄
μ = population mean

When the expected value of a point estimator equals the population parameter, we say the point estimator is unbiased.

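Standard Deviation of x̄ (also called the standard error of the mean)


Finite Population:   σ_x̄ = √((N−n)/(N−1)) · (σ/√n)
Infinite Population: σ_x̄ = σ/√n

Use the infinite population formula σ_x̄ = σ/√n whenever:
1. The population is infinite, or
2. The population is finite and the sample size is less than or equal to 5% of the population size; that is, n/N <= .05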
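
The two standard deviation formulas can be compared numerically. The sketch below is only an illustration: the function name `standard_error_mean` and the example values (σ = 50, n = 100, N = 1,000 or 10,000) are assumptions, not part of the lesson.

```python
from math import sqrt

def standard_error_mean(sigma, n, N=None):
    """Standard deviation of the sample mean.

    N=None treats the population as infinite (sigma / sqrt(n)).
    If N is given, the finite population correction factor
    sqrt((N - n) / (N - 1)) is applied.
    """
    se = sigma / sqrt(n)
    if N is not None:
        se *= sqrt((N - n) / (N - 1))
    return se

infinite = standard_error_mean(50, 100)             # 5.0
large_share = standard_error_mean(50, 100, N=1000)  # n/N = .10, correction matters
small_share = standard_error_mean(50, 100, N=10000) # n/N = .01, correction is tiny

print(infinite, round(large_share, 4), round(small_share, 4))
```

When n/N is at most .05 the correction factor is very close to 1, which is why the simpler formula σ/√n is used in that case.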

Form of the Sampling Distribution of sample mean


The last step in identifying the characteristics of the sampling distribution of the sample mean is to determine the form or shape of the sampling distribution.
Two cases to consider:
1. The population has a normal distribution: the sampling distribution of x̄ is normal for any sample size.
2. The population does not have a normal distribution: apply the central limit theorem.
Central Limit Theorem: in selecting random samples of size n from a population, the sampling distribution of the sample mean can be approximated by a normal distribution
as the sample size becomes large.
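
The central limit theorem can be checked with a small simulation. This is a sketch under stated assumptions: the population is taken to be exponential with mean 1 and standard deviation 1 (a strongly skewed, non-normal choice made only for illustration), with n = 40 and 5,000 repeated samples.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(7)        # fixed seed so the run is repeatable
n = 40                # sample size (assumed for illustration)
num_samples = 5000    # number of repeated samples

# Skewed population: exponential with mean 1 and standard deviation 1
sample_means = [
    mean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

center = mean(sample_means)   # should be close to mu = 1
spread = stdev(sample_means)  # should be close to sigma / sqrt(n)
theory = 1.0 / sqrt(n)

# Share of sample means within one standard error of mu.
# For a roughly normal distribution this is about 0.68.
share = sum(abs(m - 1.0) <= theory for m in sample_means) / num_samples

print(round(center, 3), round(spread, 3), round(theory, 3), round(share, 3))
```

Even though individual observations are far from normal, the sample means cluster around μ with a spread near σ/√n, and roughly 68% fall within one standard error.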

Sampling Distribution of p̄, where p̄ = x/n, is the probability distribution of all possible values of the sample proportion p̄.

x = the number of elements in the sample that possess the characteristic of interest
n = sample size

Expected value of p̄: E(p̄) = p


E(p̄) = the expected value of p̄
p = the population proportion

The Standard Deviation of p̄


Finite Population:   σ_p̄ = √((N−n)/(N−1)) · √(p(1−p)/n)
Infinite Population: σ_p̄ = √(p(1−p)/n)

Form of the sampling distribution of p̄: the last step is to determine the form or shape of the sampling distribution.
The sample proportion is p̄ = x/n. For a simple random sample from a large population, x is a binomial random variable
and n is a constant, so the probability of x/n is the same as the binomial probability of x,
which means that the probability of each value of x/n is the same as the probability of x.
Two conditions:
np >= 5 and n(1-p) >= 5

The sampling distribution of p̄ can be approximated by a normal distribution whenever np >= 5 and n(1-p) >= 5.
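
A minimal sketch of the standard deviation of p̄ and the two normal approximation conditions follows. The function names and the value p = .02 in the last line are assumptions chosen for illustration; p = .40 with n = 100 matches the practice problem later in this lesson.

```python
from math import sqrt

def standard_error_proportion(p, n, N=None):
    """Standard deviation of the sample proportion.

    N=None treats the population as infinite; otherwise the finite
    population correction factor sqrt((N - n) / (N - 1)) is applied.
    """
    se = sqrt(p * (1 - p) / n)
    if N is not None:
        se *= sqrt((N - n) / (N - 1))
    return se

def normal_approximation_ok(p, n):
    """True when both np >= 5 and n(1 - p) >= 5 hold."""
    return n * p >= 5 and n * (1 - p) >= 5

print(round(standard_error_proportion(0.40, 100), 5))  # 0.04899
print(normal_approximation_ok(0.40, 100))              # True
print(normal_approximation_ok(0.02, 100))              # False
```

With p = .02 and n = 100, np = 2 is below 5, so the normal approximation should not be relied on for that case.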

V. OTHER SAMPLING METHODS

Stratified Random Sampling


In stratified random sampling, the elements in the population are first divided into groups called STRATA, such that each element in the population
belongs to one and only one stratum. Examples of strata: department, location, etc.

Cluster Sampling
In cluster sampling, the elements in the population are first divided into separate groups called CLUSTERS.
Each element of the population belongs to one and only one cluster.
One of the primary applications of cluster sampling is area sampling. Cluster sampling generally requires a larger sample size compared to simple and stratified random sampling.

Systematic Sampling
An alternative to simple random sampling for large populations. Systematic sampling involves randomly selecting one of the first k elements from the population
and then selecting every kth element that follows in the population.
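
The procedure can be sketched in a few lines. This is an illustration under assumptions: the sampling interval is written k and taken to be the population size divided by the sample size, the function name `systematic_sample` is hypothetical, and the population is just the numbers 1 to 1,000.

```python
import random

def systematic_sample(population, n, rng=random):
    """Pick a random start among the first k elements, then every kth element."""
    k = len(population) // n   # sampling interval (assumed: N // n)
    start = rng.randrange(k)   # random position among the first k elements
    return [population[start + i * k] for i in range(n)]

population = list(range(1, 1001))                   # hypothetical list of 1,000 elements
sample = systematic_sample(population, 50, random.Random(3))

print(len(sample))   # 50
print(sample[:5])    # first five selected elements, spaced k = 20 apart
```

Only the starting point is random. Every later element is determined by the interval, which is why this method is convenient when the list is long.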

Convenience Sampling
Convenience sampling is a nonprobability sampling technique. Elements are included in the sample without prespecified or known probabilities of being selected.
Example: volunteers

Judgment Sampling
Another nonprobability sampling technique, where the person most knowledgeable on the subject of the study selects elements of the
population that he or she feels are most representative of the population. It is often a relatively easy way of selecting a sample.

VI. BIG DATA AND ERRORS

Sampling Errors
The difference between the value of the sample statistic and the value of the corresponding population parameter is called SAMPLING ERROR.
It is unavoidable, and it is the risk we accept when we choose to collect a random sample rather than incur the costs associated with taking a census of the population.

Nonsampling Error


Deviations of the sample from the population that occur for reasons other than random sampling.

Implications of Big data


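When sampling, care must be taken to ensure that we minimize the introduction of nonsampling error into the data collection process. This can be done by the following:
1. Carefully define the target population before collecting sample data, and subsequently design the data collection procedure so that a probability sample is drawn from this target population.
2. Carefully design the data collection process and train the data collectors.
3. Pretest the data collection procedure to identify and correct for potential sources of nonsampling error prior to final data collection.
4. Use stratified random sampling when population-level information about an important qualitative variable is available, to ensure the sample is representative of the population for that qualitative characteristic.
5. Use systematic sampling when population-level information about an important quantitative variable is available, to ensure the sample is representative of the population for that quantitative characteristic.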
SELECTING A SAMPLE
Suppose we use Excel's RAND function to assign random numbers to the five elements: A (.7266), B (.0476), C (.2459), D (.0957), E (.9408).
List the simple random sample of size 2 that will be selected by using these random numbers.

Element   Random number
A         0.7266
B         0.0476
C         0.2459
D         0.0957
E         0.9408

Number of possible simple random samples = N!/(n!(N−n)!) = 5!/(2!·3!) = 10:
AB  AC  AD  AE
BC  BD  BE
CD  CE
DE
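
One common way to turn these random numbers into a sample is to sort the elements by their random number and take the first n. The sketch below assumes that convention (smallest random numbers are selected), which the notes do not state explicitly.

```python
# Random numbers assigned to the five elements in the problem
random_numbers = {"A": 0.7266, "B": 0.0476, "C": 0.2459, "D": 0.0957, "E": 0.9408}
n = 2  # sample size

# Assumed convention: rank elements from smallest to largest random number
ranked = sorted(random_numbers, key=random_numbers.get)

# The first n elements in that ranking form the simple random sample
sample = ranked[:n]

print(ranked)   # ['B', 'D', 'C', 'A', 'E']
print(sample)   # ['B', 'D']
```

Under this convention the selected sample is B and D, one of the 10 equally likely samples of size 2.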

POINT ESTIMATION
The following data are from a simple random sample:
5 8 10 7 10 14
a. What is the point estimate of the population mean? x̄ = 54/6 = 9
b. What is the point estimate of the population standard deviation? s = √(Σ(x − x̄)²/(n − 1)) = √(48/5) = 3.0984
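
The point estimates for this data set can be checked with Python's standard library, used here only as an illustration. Note the divisor: the point estimator s divides the sum of squared deviations by n − 1, while dividing by n gives the population formula and a smaller value.

```python
from statistics import mean, pstdev, stdev

data = [5, 8, 10, 7, 10, 14]

x_bar = mean(data)      # point estimate of the population mean
s = stdev(data)         # sample standard deviation, divisor n - 1
sigma_n = pstdev(data)  # population formula, divisor n (not the point estimator s)

print(x_bar)               # 9
print(round(s, 4))         # 3.0984
print(round(sigma_n, 4))   # 2.8284
```

In Excel terms, `stdev` corresponds to STDEV.S and `pstdev` to STDEV.P.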

A survey question for a sample of 150 individuals yielded 75 Yes responses, 55 No responses, and 20 No Opinions.
a. What is the point estimate of the proportion in the population who respond Yes? No?
p̄ = x/n    Let x denote the number of Yes responses: p̄ = 75/150 = 0.5000
           Let x denote the number of No responses:  p̄ = 55/150 = 0.3667

SAMPLING DISTRIBUTION OF x̄


A population has a mean of 200 and a standard deviation of 50. Suppose a simple random sample of size 100 is selected
and x̄ is used to estimate μ. The standard error is σ_x̄ = 50/√100 = 5.
a. What is the probability that the sample mean will be within +/-5 of the population mean? z = ±1, so P = 0.8413 − 0.1587 = 0.6827
b. What is the probability that the sample mean will be within +/-10 of the population mean? z = ±2, so P = 0.9772 − 0.0228 = 0.9545
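
These probabilities can be reproduced with a normal distribution whose standard deviation is the standard error σ/√n, not the population σ. The sketch below uses Python's `statistics.NormalDist` as a stand-in for a cumulative normal function; the helper name `prob_within` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 200, 50, 100

# Sampling distribution of the sample mean: mean mu, standard error sigma / sqrt(n) = 5
x_bar_dist = NormalDist(mu, sigma / sqrt(n))

def prob_within(dist, margin):
    """P(mean - margin <= X <= mean + margin) for a normal distribution."""
    return dist.cdf(dist.mean + margin) - dist.cdf(dist.mean - margin)

print(round(prob_within(x_bar_dist, 5), 4))    # 0.6827
print(round(prob_within(x_bar_dist, 10), 4))   # 0.9545

# Pitfall: using the population standard deviation (50) instead of the
# standard error (5) describes a single observation, not the sample mean.
print(round(prob_within(NormalDist(mu, sigma), 5), 4))   # 0.0797
```

The last line shows how much the answer changes if the standard error step is skipped.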

SAMPLING DISTRIBUTION OF p̄


A random sample of size 100 is selected from a population with p = .40.
a. What is the expected value of p̄? E(p̄) = p = 0.40
b. What is the standard error of p̄? σ_p̄ = √(.40(.60)/100) = 0.04899
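Excel functions used:
Simple random sample: =RAND()
Normal distribution (cumulative probability): =NORM.DIST(x, mean, standard_dev, TRUE)
For probabilities about x̄, pass the standard error σ/√n as standard_dev.

LESSON 7 EXERCISES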
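1. Martina Levitt, director of marketing for the messaging app Spontaversation, has been assigned the task of profiling users of this app. Assume that individuals
who have downloaded Spontaversation use the app an average of 30 times per day with a standard deviation of 6.
a. What is the sampling distribution of x̄ if a random sample of 50 individuals who have downloaded Spontaversation is used? Approximately normal with E(x̄) = 30 and σ_x̄ = 6/√50 = 0.849
b. What is the sampling distribution of x̄ if a random sample of 500,000 individuals who have downloaded Spontaversation is used? Approximately normal with E(x̄) = 30 and σ_x̄ = 6/√500,000 = 0.008485281
c. What general statement can you make about what happens to the sampling distribution of x̄ as the sample size
becomes extremely large? Does this generalization seem logical? Explain.
By the Central Limit Theorem, the sampling distribution of the sample mean can be approximated by a normal distribution when the sample is large.
As the sample size grows, the standard error σ/√n shrinks toward 0, so x̄ concentrates around the population mean.
This is logical because the larger the sample size, the more of the population the sample encompasses.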
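
The effect of sample size in this exercise can be tabulated directly. The sketch below is illustrative only; the intermediate sample size 5,000 is an added assumption to show the trend between the two sizes in the exercise.

```python
from math import sqrt

mu, sigma = 30, 6   # mean and standard deviation of daily app use

# Standard error sigma / sqrt(n) for increasing sample sizes
standard_errors = {n: sigma / sqrt(n) for n in (50, 5_000, 500_000)}

for n, se in standard_errors.items():
    print(n, round(se, 6))
```

The expected value stays at 30 for every n, while the standard error falls from about 0.849 to about 0.0085, so the distribution of x̄ narrows around the population mean.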

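2. The latest available data showed health expenditures were $8086 per person in the US, or 17.6% of GDP. Use $8086 as the population mean and suppose a survey research
firm will take a sample of 100 people to investigate the nature of their health expenditures. Assume the population standard deviation is $2500.
a. Show the sampling distribution of the mean amount of health care expenditures for a sample of 100 people. Approximately normal with E(x̄) = 8086 and σ_x̄ = 2500/√100 = 250
b. What is the probability the sample mean will be within +/-$2500 of the population mean? z = ±2500/250 = ±10, so P ≈ 1.0000
c. What is the probability the sample mean will be greater than $9000? If the sample mean were greater than $9000, would you question whether the firm followed correct sampling procedures? Why or why not? z = (9000 − 8086)/250 = 3.66, so P = 0.00013
There is a .00013 probability that the sample mean will be greater than $9000. A result this unlikely would be a reason to question the sampling procedure.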
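
A quick numerical check of this exercise, sketched with Python's `statistics.NormalDist` under the assumption that the sampling distribution of x̄ is normal with standard error 2500/√100 = 250:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 8086, 2500, 100

# Sampling distribution of the sample mean: standard error = 250
x_bar_dist = NormalDist(mu, sigma / sqrt(n))

# b. Within +/- $2500 of the population mean (z = +/- 10)
p_within = x_bar_dist.cdf(mu + 2500) - x_bar_dist.cdf(mu - 2500)

# c. Sample mean greater than $9000
p_above_9000 = 1 - x_bar_dist.cdf(9000)

print(round(p_within, 4))       # 1.0
print(round(p_above_9000, 5))   # 0.00013
```

A value near 0.6827 for part b appears only if the population standard deviation 2500 is used in place of the standard error 250.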

3. Allegiant Airlines charges a mean base fare of $89. In addition, the airline charges for making a reservation on its website, checking bags, and inflight beverages.
These additional charges average $39 per passenger. Suppose a random sample of 60 passengers is taken to determine the total cost of their flight on Allegiant Airlines.
The population standard deviation of total flight cost is known to be $40, so σ_x̄ = 40/√60 = 5.164.
a. What is the population mean cost per flight? $89 + $39 = $128
b. What is the probability the sample mean will be within $10 of the population mean cost per flight? 0.9472
c. What is the probability the sample mean will be within $5 of the population mean cost per flight? 0.6671
