I. BASIC TERMS
Sampled population is the population from which the sample is drawn.
Frame is a list of the elements from which the sample will be selected.
Sample mean provides an estimate of a population mean.
Sample proportion provides an estimate of a population proportion.
When the expected value of a point estimator equals the population parameter, we say the point estimator is unbiased.
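The unbiasedness property can be checked by brute force on a small example. The sketch below treats six hypothetical values as an entire population, enumerates every possible sample of size 2, and compares the average of the sample means with the population mean. The population values and the sample size are assumptions chosen only for illustration.

```python
from itertools import combinations
from statistics import mean

# Hypothetical population (illustrative values only).
population = [5, 8, 10, 7, 10, 14]

# Every possible sample of size 2, drawn without replacement.
sample_means = [mean(sample) for sample in combinations(population, 2)]

# Expected value of the sample mean over all equally likely samples.
expected_sample_mean = mean(sample_means)
population_mean = mean(population)

print(expected_sample_mean, population_mean)
```

Because the two printed values agree, the sample mean is an unbiased estimator of the population mean in this example.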
Sampling distribution of p̄, where p̄ = x/n, is the probability distribution of all possible values of the sample proportion.
x = the number of elements in the sample that possess the characteristic of interest
n = sample size
Form of the sampling distribution of p̄: the form, or shape, of the sampling distribution is determined as follows.
Since the sample proportion is p̄ = x/n, in a simple random sample from a large population x is a binomial random variable
and n is a constant, so the probability of x/n is the same as the binomial probability of x,
which means that the probability for each value of x/n is the same as the probability of x.
Two conditions:
np >= 5 and n(1 - p) >= 5
The sampling distribution of p̄ can be approximated by a normal distribution whenever np >= 5 and n(1 - p) >= 5.
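The two conditions are easy to check in code. The following is a minimal sketch, not part of the original notes: the function names are hypothetical, and the standard error formula sqrt(p(1 - p)/n) is the usual one for a large population, stated here as an assumption because the notes do not give it.

```python
from math import sqrt


def normal_approx_ok(n: int, p: float) -> bool:
    """Return True when both np >= 5 and n(1 - p) >= 5 hold."""
    return n * p >= 5 and n * (1 - p) >= 5


def proportion_standard_error(n: int, p: float) -> float:
    """Standard error of the sample proportion (large-population assumption)."""
    return sqrt(p * (1 - p) / n)


# Example: n = 100 and p = 0.5 satisfy both conditions.
print(normal_approx_ok(100, 0.5), proportion_standard_error(100, 0.5))
# Example: n = 20 and p = 0.1 fail the first condition (np = 2).
print(normal_approx_ok(20, 0.1))
```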
Cluster Sampling
In cluster sampling, the elements in the population are first divided into separate groups called clusters.
Each element of the population belongs to one and only one cluster.
One of the primary applications of cluster sampling is area sampling, and it requires a larger sample size compared to simple an
Systematic Sampling
An alternative to simple random sampling for large populations. Systematic sampling involves randomly selecting one of the first k elements
and then selecting every kth element that follows in the population.
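The procedure can be sketched in a few lines. This is an illustrative sketch under stated assumptions: the function name `systematic_sample` is hypothetical, and the sampling interval is taken to be the population size divided by the sample size, rounded down, which is a common convention rather than something the notes specify.

```python
import random


def systematic_sample(population, n, rng=None):
    """Pick a random start among the first k elements, then every kth element."""
    rng = rng or random.Random()
    k = len(population) // n  # assumed sampling interval
    if k < 1:
        raise ValueError("sample size exceeds population size")
    start = rng.randrange(k)  # random position within the first k elements
    return [population[start + i * k] for i in range(n)]


# Example: 50 elements from a frame of 1000, so every 20th element is taken.
frame = list(range(1000))
sample = systematic_sample(frame, 50, random.Random(0))
print(sample[:5])
```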
Convenience Sampling
Convenience sampling is a nonprobability sampling technique. Elements are included in the sample without prespecified
Examples: Volunteers
Judgment Sampling
Another nonprobability sampling technique, in which the person most knowledgeable on the subject of the study selects elements of the
population that he or she feels are most representative of the population. Often a relatively easy way of selecting a sample.
VI. BIG DATA AND ERRORS
Sampling Errors
The difference between the value of the sample statistic and the value of the corresponding population parameter.
Unavoidable; it is the risk we accept when we choose to collect a random sample rather than incur the costs associated
the sample size is less than or equal to 5% of the population size; that is n/N<=.05
om sampling.
on-sampling error into the data collection process. This can be done by the following:
ently design the data collection procedure so that a probability sample is drawn from this target population.
POINT ESTIMATION
The following data are from a simple random sample:
5 8 10 7 10 14
a. What is the point estimate of the population mean? x̄ = 54/6 = 9
b. What is the point estimate of the population standard deviation? s = √(48/5) = 3.0984
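These point estimates can be checked with Python's standard `statistics` module. The sketch below is only a verification aid. Note the divisor: `stdev` divides the sum of squared deviations by n - 1 and gives the sample standard deviation s, while `pstdev` divides by n.

```python
from statistics import mean, pstdev, stdev

data = [5, 8, 10, 7, 10, 14]

x_bar = mean(data)            # point estimate of the population mean
s = stdev(data)               # sample standard deviation, divisor n - 1
sigma_if_population = pstdev(data)  # divisor n, treats the data as a full population

print(x_bar, round(s, 4), round(sigma_if_population, 4))
```

The sum of squared deviations is 48, so the divisor n - 1 = 5 gives about 3.0984 and the divisor n = 6 gives about 2.8284.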
A survey question for a sample of 150 individuals yielded 75 Yes responses, 55 No responses, and 20 No Opinions.
a. What is the point estimate of the proportion in the population who respond Yes? No?
p̄ = x/n
Let x denote Yes: p̄ = 75/150 = 0.5000
Let x denote No: p̄ = 55/150 = 0.3667
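The same arithmetic can be sketched in code. The dictionary keys are illustrative labels, with "other" standing for the remaining 20 responses.

```python
# Response counts from the survey of 150 individuals.
counts = {"Yes": 75, "No": 55, "other": 20}

n = sum(counts.values())  # sample size

# Point estimate of each population proportion: p_bar = x / n.
p_bar = {answer: x / n for answer, x in counts.items()}

print(round(p_bar["Yes"], 4), round(p_bar["No"], 4))
```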
10 possible samples of two elements: AB, AC, AD, AE, BC, BD, BE, CD, CE, DE
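The ten pairs listed here appear to be every sample of size 2 that can be drawn without replacement from five elements labeled A to E. That reading is an assumption, since the notes do not state the exercise. Under that assumption, the list can be generated as follows.

```python
from itertools import combinations
from math import comb

elements = "ABCDE"  # assumed five-element population

# Every unordered sample of size 2, without replacement.
samples = ["".join(pair) for pair in combinations(elements, 2)]

print(comb(len(elements), 2), samples)
```

With simple random sampling, each of these samples has the same probability of being selected, 1/10.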
Cumulative normal probabilities computed with mean 8086 and standard deviation 2500:
+2500: 0.841345   -2500: 0.158655   difference: 0.682689
9000: 0.642668
Simple random sample: =RAND()
Normal distribution: =NORM.DIST(x, mean, standard deviation, TRUE)
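For readers working outside a spreadsheet, the two formulas have close counterparts in Python's standard library. The sketch below is an approximation of the spreadsheet workflow, not a definitive translation: the function names are hypothetical, and the sampling helper assumes the usual approach of attaching a random number to each element of the frame and keeping the n elements with the smallest numbers.

```python
import random
from statistics import NormalDist


def norm_dist_cumulative(x, mean, standard_deviation):
    """Cumulative normal probability P(X <= x), like NORM.DIST(..., TRUE)."""
    return NormalDist(mu=mean, sigma=standard_deviation).cdf(x)


def simple_random_sample(frame, n, seed=None):
    """Attach a random number to each element, sort, and keep the first n."""
    rng = random.Random(seed)
    keyed = [(rng.random(), item) for item in frame]
    keyed.sort(key=lambda pair: pair[0])
    return [item for _, item in keyed[:n]]


print(round(norm_dist_cumulative(1, 0, 1), 6))
print(simple_random_sample(list(range(1, 101)), 5, seed=1))
```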
LESSON 7 EXERCISES
1. Martina Levitt, director of marketing for the messaging app Spontaversation, has been assigned the task of profiling users of this app. Assume that individuals
who have downloaded Spontaversation use the app an average of 30 times per day with a standard deviation of 6.
a. What is the sampling distribution of x̄ if a random sample of 50 individuals who have downloaded Spontaversation is used? Approximately normal with E(x̄) = 30 and standard error 6/√50 = 0.849
b. What is the sampling distribution of x̄ if a random sample of 500,000 individuals who have downloaded Spontaversation is used? Approximately normal with E(x̄) = 30 and standard error 6/√500,000 = 0.008485281
c. What general statement can you make about what happens to the sampling distribution of x̄ as the sample size
becomes extremely large? Does this generalization seem logical? Explain.
By the Central Limit Theorem, the sampling distribution of the sample mean can be approximated by a normal distribution when the sample is large.
As the sample size grows, the standard error shrinks (0.849 for n = 50 versus 0.0085 for n = 500,000), so the sample mean falls ever closer to the population mean.
This is logical because a larger sample covers more of the population, so its average is more likely to reflect the population average.
2. The latest available data showed health expenditures were $8086 per person in the US or 17.6% of GDP. Use $8086 as the population mean and suppose a survey research
firm will take a sample of 100 people to investigate the nature of their health expenditures. Assume the population standard deviation is $2500.
a. Show the sampling distribution of the mean amount of health care expenditures for a sample of 100 people. Approximately normal with E(x̄) = 8086 and standard error 2500/√100 = 250
b. What is the probability the sample mean will be within +/-$2500 of the population mean? Approximately 1.0000, since +/-$2500 is z = +/-2500/250 = +/-10 standard errors. (0.682689 is the probability for z = +/-1, that is, within +/-$250.)
c. What is the probability the sample mean will be greater than $9000? Would you question whether the firm followed correct sampling procedures? Why or why not? 0.00013
There is a probability of only .00013 that the sample mean will be greater than $9000, so a sample mean that large would be very unlikely under correct sampling.
3. Allegiant Airlines charges a mean base fare of $89. In addition, the airline charges for making a reservation on its websi
These additional charges average $39 per passenger. Suppose a random sample of 60 passengers is taken to determine th
The population standard deviation of total flight cost is known to be $40.
a. What is the population mean cost per flight? $89 + $39 = $128
b. What is the probability the sample mean will be within $10 of the population mean cost per flight? 0.9472
c. What is the probability the sample mean will be within $5 of the population mean cost per flight? 0.6671
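The exercise answers all follow from the standard error of the mean, σ/√n, combined with cumulative normal probabilities. The sketch below recomputes them with Python's standard library as a check. The helper names are hypothetical, and the sampling distribution of the sample mean is assumed to be normal in each exercise.

```python
from math import sqrt
from statistics import NormalDist


def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)


def prob_within(margin, sigma, n):
    """P(|sample mean - population mean| <= margin) under a normal model."""
    z = margin / standard_error(sigma, n)
    standard_normal = NormalDist()
    return standard_normal.cdf(z) - standard_normal.cdf(-z)


def prob_greater(x, mu, sigma, n):
    """P(sample mean > x) under a normal model."""
    return 1 - NormalDist(mu, standard_error(sigma, n)).cdf(x)


# Exercise 1: standard errors for n = 50 and n = 500,000.
print(standard_error(6, 50), standard_error(6, 500_000))
# Exercise 2: standard error, then P(sample mean > 9000).
print(standard_error(2500, 100), prob_greater(9000, 8086, 2500, 100))
# Exercise 2b: a margin of 2500 is 10 standard errors when n = 100.
print(prob_within(2500, 2500, 100))
# Exercise 3: within $10 and within $5 of the population mean.
print(prob_within(10, 40, 60), prob_within(5, 40, 60))
```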