# Chapter 7: Sampling Distributions

Section 7.1 How Likely Are the Possible Values of a Statistic? The Sampling Distribution

Learning Objectives
1. Statistic vs. Parameter
2. Sampling Distributions
3. Mean and Standard Deviation of the Sampling Distribution of a Proportion
4. Standard Error
5. Sampling Distribution Example
6. Population, Data, and Sampling Distributions

Learning Objective 1: Statistic and Parameter
- A statistic is a numerical summary of sample data, such as a sample proportion or sample mean.
- A parameter is a numerical summary of a population, such as a population proportion or population mean.
- In practice, we seldom know the values of parameters.
- Parameters are estimated using sample data.
- We use statistics to estimate parameters.

Learning Objective 2: Sampling Distributions
Example:
- Prior to counting the votes, the proportion in favor of recalling Governor Gray Davis was an unknown parameter.
- An exit poll of 3160 voters reported that the sample proportion in favor of a recall was 0.54.
- If a different random sample of about 3000 voters were selected, a different sample proportion would occur.
- The sampling distribution of the sample proportion shows all possible values and the probabilities for those values.

Learning Objective 2: Sampling Distributions
- The sampling distribution of a statistic is the probability distribution that specifies probabilities for the possible values the statistic can take.
- Sampling distributions describe the variability that occurs from study to study when statistics are used to estimate population parameters.
- Sampling distributions help to predict how close a statistic falls to the parameter it estimates.

Learning Objective 3: Mean and SD of the Sampling Distribution of a Proportion
- For a random sample of size n from a population with proportion p of outcomes in a particular category, the sampling distribution of the proportion of the sample in that category has

$$\text{mean} = p, \qquad \text{standard deviation} = \sqrt{\frac{p(1-p)}{n}}$$
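The mean and standard deviation of this sampling distribution can be checked by simulation. The sketch below assumes a hypothetical true proportion p = 0.54 and samples of n = 3160 voters (numbers borrowed from the recall exit poll purely for illustration, since the real p was unknown). It draws many random samples and compares the simulated sample proportions with the two formulas. The function name, the seed, and the number of repetitions are illustrative choices, not part of the original slides.

```python
import math
import random
import statistics

def simulate_sample_proportions(p, n, reps, seed=0):
    """Draw `reps` random samples of size n and return each sample proportion."""
    rng = random.Random(seed)
    proportions = []
    for _ in range(reps):
        # Each sampled voter falls in the category with probability p.
        successes = sum(rng.random() < p for _ in range(n))
        proportions.append(successes / n)
    return proportions

# Assumed values for illustration only.
p, n = 0.54, 3160
props = simulate_sample_proportions(p, n, reps=400)

theoretical_sd = math.sqrt(p * (1 - p) / n)
print("mean of simulated proportions:", round(statistics.mean(props), 4))
print("SD of simulated proportions:  ", round(statistics.stdev(props), 4))
print("formula: mean =", p, " SD =", round(theoretical_sd, 4))
```

With a few hundred repetitions, the simulated mean and SD should land close to the formula values, though not exactly on them.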

Learning Objective 4: The Standard Error
- To distinguish the standard deviation of a sampling distribution from the standard deviation of an ordinary probability distribution, we refer to it as a standard error.

Learning Objective 5: Example: 2006 California Election
- If the population proportion supporting the reelection of Schwarzenegger was 0.50, would it have been unlikely to observe the exit-poll sample proportion of 0.565?
- Based on your answer, would you be willing to predict that Schwarzenegger would win the election?

Learning Objective 5: Example: 2006 California Election
- Given that the exit poll had 2705 people and assuming 50% support the reelection of Schwarzenegger, find the estimate of the population proportion and the standard error:

$$p = 0.5, \qquad \text{standard error} = \sqrt{\frac{0.5\,(1-0.5)}{2705}} = 0.0096$$

Learning Objective 5: Example: 2006 California Election

$$z = \frac{0.565 - 0.50}{0.0096} = 6.8$$

- The sample proportion of 0.565 is more than six standard errors from the expected value of 0.50.
- The sample proportion of 0.565 voting for reelection of Schwarzenegger would be very unlikely if the population proportion were p = 0.50 or p < 0.50.
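The arithmetic in this example can be reproduced in a few lines. This is a minimal sketch using Python's standard library. The helper name `proportion_standard_error` is an illustrative choice, and the tail probability relies on the normal approximation to the sampling distribution.

```python
import math
from statistics import NormalDist

def proportion_standard_error(p, n):
    """Standard error of a sample proportion: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)

p0 = 0.50      # population proportion assumed in the example
n = 2705       # exit-poll sample size
p_hat = 0.565  # observed sample proportion

se = proportion_standard_error(p0, n)
z = (p_hat - p0) / se
# Approximate chance of a sample proportion this large or larger if p = 0.50.
upper_tail = 1 - NormalDist().cdf(z)

print(round(se, 4))  # 0.0096
print(round(z, 1))   # 6.8
print(upper_tail)    # a very small probability
```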

Learning Objective 6: Population Distribution
- Population distribution: This is the probability distribution from which we take the sample.
- Values of its parameters are usually unknown. They’re what we’d like to learn about.

Learning Objective 6: Data Distribution
- This is the distribution of the sample data. It’s the distribution we actually see in practice.
- It’s described by statistics.
- With random sampling, the larger the sample size n, the more closely the data distribution resembles the population distribution.

Learning Objective 6: Sampling Distribution
- This is the probability distribution of a sample statistic.
- With random sampling, the sampling distribution provides probabilities for all the possible values of the statistic.
- The sampling distribution provides the key for telling us how close a sample statistic falls to the corresponding unknown parameter.
- Its standard deviation is called the standard error.

Learning Objective 6: Example
- In the 2006 U.S. Senate election in NY:
  - An exit poll of 1336 voters showed
    - 67% (895) voted for Clinton
    - 33% (441) voted for Spencer
  - When all 4.1 million votes were tallied
    - 68% voted for Clinton
    - 32% voted for Spencer
- Let X = vote outcome, with x = 1 for Clinton and x = 0 for Spencer.

Learning Objective 6: Example
- The population distribution is the 4.1 million values of the x vote variable, 32% of which are 0 and 68% of which are 1.
- The data distribution is the 1336 values of the x vote for the exit poll, 33% of which are 0 and 67% of which are 1.
- The sampling distribution of the sample proportion is approximately a normal distribution with mean $p = 0.68$ and standard error $\sigma = \sqrt{0.68(1-0.68)/1336} = 0.013$.
- Only the sampling distribution is bell-shaped; the others are discrete and concentrated at the two values 0 and 1.
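To make the contrast between the three distributions concrete, the sketch below rebuilds the data distribution from the exit-poll counts and describes the sampling distribution with the normal approximation. Variable names are illustrative, and the population proportion 0.68 is taken from the final tally quoted above.

```python
import math
from statistics import NormalDist, mean

# Data distribution: the 1336 exit-poll values of x (1 = Clinton, 0 = Spencer).
clinton_votes, spencer_votes = 895, 441
data = [1] * clinton_votes + [0] * spencer_votes
n = len(data)
p_hat = mean(data)  # sample proportion, a statistic

# Population distribution: x = 1 with proportion 0.68 (the parameter).
p = 0.68

# Sampling distribution of the sample proportion (normal approximation).
se = math.sqrt(p * (1 - p) / n)
sampling_dist = NormalDist(mu=p, sigma=se)
z = (p_hat - p) / se  # how many standard errors the poll fell from p

print(sorted(set(data)))  # the data take only the values 0 and 1
print(round(p_hat, 2))    # 0.67
print(round(se, 3))       # 0.013
print(round(z, 2))
```

The data and the population are both just collections of 0s and 1s. Only the distribution of the sample proportion across repeated polls is bell-shaped.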

Section 7.2 How Close Are Sample Means to Population Means?

Learning Objectives
1. The Sampling Distribution of the Sample Mean
2. Effect of n on the Standard Error
3. Central Limit Theorem (CLT)
4. Calculating Probabilities of Sample Means

Learning Objective 1: The Sampling Distribution of the Sample Mean
- The sample mean, $\bar{x}$, is a random variable.
- The sample mean varies from sample to sample.
- By contrast, the population mean, µ, is a single fixed number.

Learning Objective 1: The Sampling Distribution of the Sample Mean
- For a random sample of size n from a population having mean µ and standard deviation σ, the sampling distribution of the sample mean has:
  - Center described by the mean µ (the same as the mean of the population).
  - Spread described by the standard error, which equals the population standard deviation divided by the square root of the sample size:

$$\text{standard error of } \bar{x} = \frac{\sigma}{\sqrt{n}}$$

Learning Objective 1: Example 1: Pizza Sales
- Daily sales at a pizza restaurant vary from day to day.
- The sales figures fluctuate around a mean µ = \$900 with a standard deviation σ = \$300.
- What are the center and spread of the sampling distribution of the mean sales for a sample of n = 7 days?

$$\text{center} = \mu = \$900, \qquad \text{standard error} = \frac{300}{\sqrt{7}} = 113$$

Learning Objective 2: Effect of n on the Standard Error
- Knowing how to find a standard error gives us a mechanism for understanding how much variability to expect in sample statistics “just by chance.”
- The standard error of the sample mean is $\sigma/\sqrt{n}$.
- As the sample size n increases, the denominator increases, so the standard error decreases.
- With larger samples, the sample mean is more likely to fall closer to the population mean.
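A short calculation shows how quickly the standard error shrinks. The sketch below reuses σ = \$300 from the pizza example. The sample sizes beyond n = 7 are hypothetical, chosen so that each is four times the previous one.

```python
import math

def standard_error_of_mean(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

sigma = 300  # population standard deviation from the pizza example
for n in (7, 28, 112):
    print(n, round(standard_error_of_mean(sigma, n), 1))
# 7 113.4
# 28 56.7
# 112 28.3
```

Because n sits under a square root, quadrupling the sample size only halves the standard error.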

Learning Objective 3: Central Limit Theorem
- Question: How does the sampling distribution of the sample mean relate, with respect to shape, center, and spread, to the probability distribution from which the samples were taken?

Learning Objective 3: Central Limit Theorem (CLT)
- For random sampling with a large sample size n, the sampling distribution of the sample mean is approximately a normal distribution.
- This result applies no matter what the shape of the probability distribution from which the samples are taken.

Learning Objective 3: CLT: How Large a Sample?
- The sampling distribution of the sample mean takes more of a bell shape as the random sample size n increases.
- The more skewed the population distribution, the larger n must be before the shape of the sampling distribution is close to normal.
- In practice, the sampling distribution is usually close to normal when the sample size n is at least about 30.
- If the population distribution is approximately normal, then the sampling distribution is approximately normal for all sample sizes.

Learning Objective 3: CLT: Impact of Increasing n
[Figure: impact of increasing n on the sampling distribution of the sample mean]

Learning Objective 3: CLT: Making Inferences
- For large n, the sampling distribution is approximately normal even if the population distribution is not.
- This enables us to make inferences about population means regardless of the shape of the population distribution.
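The CLT can be watched in action with a small simulation. The sketch below assumes a right-skewed population (an exponential distribution with mean 1 and standard deviation 1, chosen here only as a convenient stand-in) and tracks the skewness of the simulated sample means as n grows. The helper names, seed, and repetition count are illustrative.

```python
import random
import statistics

def sample_means(n, reps, rng):
    """Means of `reps` random samples of size n from an exponential(1) population."""
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(reps)]

def skewness(values):
    """Average cubed z-score: 0 for a symmetric shape, positive for right skew."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)

rng = random.Random(1)
results = {}
for n in (2, 30, 100):
    means = sample_means(n, reps=2000, rng=rng)
    results[n] = (statistics.fmean(means), statistics.stdev(means), skewness(means))
    center, spread, skew = results[n]
    print(f"n={n:3d}  center={center:.3f}  spread={spread:.3f}  skewness={skew:.2f}")
```

The center should stay near the population mean of 1 and the spread should track 1/√n. The skewness should drift toward 0 as n increases, which is the bell shape emerging.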

Learning Objective 4: Calculating Probabilities of Sample Means
- The distribution of weights of milk bottles is normally distributed with a mean of 1.1 lbs and a standard deviation (σ) = 0.20.
- What is the probability that the mean of a random sample of 5 bottles will be greater than 0.99 lbs?
- Calculate the mean and standard error for the sampling distribution of a random sample of 5 milk bottles.
- Because the population distribution is normal, $\bar{x}$ is approximately normal even for n = 5, with mean = 1.1 and standard error = $0.2/\sqrt{5} = 0.0894$.

$$P(\bar{x} > 0.99) = \text{normcdf}(0.99,\ 1\text{E}99,\ 1.1,\ 0.20/\sqrt{5}) = 0.89$$
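The calculator call `normcdf` has a close counterpart in Python's standard library. As a sketch, the milk-bottle probability can be computed with `statistics.NormalDist`, treating the sampling distribution of the sample mean as normal. Variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 1.1, 0.20, 5
se = sigma / math.sqrt(n)  # standard error of the sample mean

# Sampling distribution of the sample mean for 5 bottles.
xbar_dist = NormalDist(mu=mu, sigma=se)

# Upper-tail probability P(xbar > 0.99) = 1 - P(xbar <= 0.99).
prob_above = 1 - xbar_dist.cdf(0.99)

print(round(se, 4))          # 0.0894
print(round(prob_above, 2))  # 0.89
```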

Learning Objective 4: Calculating Probabilities of Sample Means
- Closing prices of stocks have a right-skewed distribution with a mean (µ) of \$25 and σ = \$20.
- What is the probability that the mean of a random sample of 40 stocks will be less than \$20?
- Calculate the mean and standard error for the sampling distribution of a random sample of 40 stocks.
- By the CLT, $\bar{x}$ is approximately normal with mean = 25 and standard error = $20/\sqrt{40} = 3.1623$.

$$P(\bar{x} < 20) = \text{normcdf}(-1\text{E}99,\ 20,\ 25,\ 20/\sqrt{40}) = 0.06$$

Learning Objective 4: Calculating Probabilities of Sample Means
- An automobile insurer has found that repair claims have a mean of \$920 and a standard deviation of \$870. Suppose that the next 100 claims can be regarded as a random sample from the long-run claims process.
- What is the probability that the average of the 100 claims is larger than \$900?

$$P(\bar{x} > 900) = \text{normcdf}(900,\ 1\text{E}99,\ 920,\ 870/\sqrt{100}) = 0.59$$
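The stock and insurance-claim calculations follow the same pattern, so a small helper that mimics the calculator's four-argument `normcdf(lower, upper, mean, sd)` covers both. This is an illustrative sketch built on `statistics.NormalDist`, not the calculator's own implementation, and it uses `math.inf` in place of the calculator's 1E99.

```python
import math
from statistics import NormalDist

def normcdf(lower, upper, mean, sd):
    """P(lower < X < upper) for X ~ Normal(mean, sd); pass +/- math.inf for open ends."""
    dist = NormalDist(mean, sd)
    upper_p = 1.0 if upper == math.inf else dist.cdf(upper)
    lower_p = 0.0 if lower == -math.inf else dist.cdf(lower)
    return upper_p - lower_p

# Stocks: mean 25, sigma 20, n = 40, P(xbar < 20).
p_stocks = normcdf(-math.inf, 20, 25, 20 / math.sqrt(40))

# Claims: mean 920, sigma 870, n = 100, P(xbar > 900).
p_claims = normcdf(900, math.inf, 920, 870 / math.sqrt(100))

print(round(p_stocks, 2))  # 0.06
print(round(p_claims, 2))  # 0.59
```

In both cases the population is not assumed normal. The normal curve is justified by the CLT because n = 40 and n = 100 are reasonably large.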

Learning Objective 4: Calculating Probabilities of Sample Means
Example:
- The distribution of actual weights of 8 oz. wedges of cheddar cheese produced by a certain company is normal with mean = 8.1 oz. and standard deviation = 0.1 oz.
- Find the value x such that there is only a 10% chance that the average weight of a sample of five wedges will be above x.

$$x = \text{invnorm}(0.9,\ 8.1,\ 0.1/\sqrt{5}) = 8.16$$

Learning Objective 4: Calculating Probabilities of Sample Means
Example:
- The distribution of actual weights of 8 oz. wedges of cheddar cheese produced by a certain company is normal with mean = 8.1 oz. and standard deviation = 0.1 oz.
- Find the value x such that there is only a 5% chance that the average weight of a sample of five wedges will be below x.

$$x = \text{invnorm}(0.05,\ 8.1,\ 0.1/\sqrt{5}) = 8.03$$
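Both cheese questions run the probability calculation in reverse: given a tail area, find the cutoff. As a sketch, the calculator's `invnorm` corresponds to `NormalDist.inv_cdf` in Python's standard library. Variable names are illustrative.

```python
import math
from statistics import NormalDist

mu, sigma, n = 8.1, 0.1, 5
# Sampling distribution of the mean weight of five wedges.
xbar_dist = NormalDist(mu=mu, sigma=sigma / math.sqrt(n))

# 10% chance of being above x means 90% of the area lies below x.
x_upper = xbar_dist.inv_cdf(0.90)

# 5% chance of being below x.
x_lower = xbar_dist.inv_cdf(0.05)

print(round(x_upper, 2))  # 8.16
print(round(x_lower, 2))  # 8.03
```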

Section 7.3 How Can We Make Inferences About a Population?

Learning Objectives
1. Using the CLT to Make Inferences
2. Standard Errors in Practice
3. Sampling Distribution for a Proportion

Learning Objective 1: Using the CLT to Make Inferences
Implications of the CLT:
- For large n, the sampling distribution of $\bar{x}$ is approximately normal no matter what the shape of the underlying population distribution.
- When the sampling distribution of the sample mean is approximately normal, $\bar{x}$ falls within 2 standard errors of µ with probability close to 0.95 and almost certainly falls within 3 standard errors of µ.

Learning Objective 2: Standard Errors in Practice
In practice, standard errors are estimated.
- Standard errors have exact values depending on parameter values, e.g.,
  - $\sqrt{p(1-p)/n}$ for a sample proportion
  - $\sigma/\sqrt{n}$ for a sample mean
- In practice, these parameter values are unknown. Inference methods use standard errors that substitute sample values for the parameters in the exact formulas above.
- These estimated standard errors are the numbers we use in practice.
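The substitution described here is mechanical: replace p with the sample proportion and σ with the sample standard deviation. The sketch below illustrates both cases. The function names are illustrative, the proportion reuses the exit-poll figures from Section 7.1, and the daily sales values are made-up numbers used only to exercise the formula.

```python
import math
import statistics

def estimated_se_proportion(p_hat, n):
    """Estimated standard error: sample proportion substituted for p."""
    return math.sqrt(p_hat * (1 - p_hat) / n)

def estimated_se_mean(sample):
    """Estimated standard error: sample standard deviation substituted for sigma."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

# Exit poll: sample proportion 0.565 from 2705 voters.
se_prop = estimated_se_proportion(0.565, 2705)
print(round(se_prop, 4))  # 0.0095

# Hypothetical daily sales for one week (invented values).
daily_sales = [820, 1150, 640, 990, 1310, 870, 730]
se_mean = estimated_se_mean(daily_sales)
print(round(se_mean, 1))
```

The estimated value 0.0095 differs slightly from the 0.0096 obtained earlier under the assumption p = 0.50, because it plugs in the observed 0.565 instead.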

Learning Objective 3: Sampling Distribution for a Proportion
- The binomial probability distribution is the sampling distribution for the number of successes in n independent trials.
- In practice, the sample proportion of successes is the statistic usually reported.
- Since the sample proportion is simply the number of successes divided by the number of trials, the formulas for the mean and standard deviation of the sampling distribution of the proportion of successes are the formulas for the mean and standard deviation of the number of successes, divided by n.

Learning Objective 3: Sampling Distribution for a Proportion
- For a binomial random variable with n trials and probability p of success for each, the sampling distribution of the proportion of successes has

$$\text{mean} = p, \qquad \text{standard error} = \sqrt{\frac{p(1-p)}{n}}$$

- These values can be found by taking the mean $np$ and the standard deviation $\sqrt{np(1-p)}$ for the binomial distribution of the number of successes and dividing by n.
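The division by n can be verified exactly, without simulation, by computing the mean and standard deviation of X/n straight from the binomial probabilities. The sketch below does this for hypothetical values n = 50 and p = 0.3. The function names are illustrative.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for a binomial random variable with n trials and success probability p."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def proportion_mean_and_sd(n, p):
    """Mean and SD of the proportion X/n, computed directly from the binomial pmf."""
    probs = [binomial_pmf(k, n, p) for k in range(n + 1)]
    mean = sum((k / n) * pr for k, pr in enumerate(probs))
    var = sum((k / n - mean) ** 2 * pr for k, pr in enumerate(probs))
    return mean, math.sqrt(var)

n, p = 50, 0.3  # hypothetical values for illustration
mean, sd = proportion_mean_and_sd(n, p)

# Binomial mean np and SD sqrt(np(1-p)), each divided by n.
mean_from_formula = (n * p) / n
sd_from_formula = math.sqrt(n * p * (1 - p)) / n

print(round(mean, 6), round(mean_from_formula, 6))
print(round(sd, 6), round(sd_from_formula, 6))
```

The two routes should agree up to floating-point rounding, and the second value in each pair equals p and √(p(1−p)/n) respectively.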