
Hattie Burford

MATH 535 Sampling Techniques


Dr. Stack
Fall 2017 Exam 2

1. A researcher wants to study invasive plant species on an acre of land to determine the
proportion of plant species that are invasive. "Invasive species" means a non-native species
whose introduction has caused, or is likely to cause, economic or environmental harm or
harm to human health. An acre is any perimeter enclosing 43,560 square feet. The acre will
be divided into 1 square foot plots. The potential invasive plant species are expected to
reproduce from a starting point and extend out from that point each year.
The researcher has a list of invasive species and will sample 25 square feet in the selected
acre by counting the total number of plants and the number of invasive plant species in
each square foot plot in the sample.
Which sampling method would be best: a simple random sample, a stratified sample, or a
systematic sample? Explain.

To obtain a sample that represents the population well, a stratified sample combined with
systematic selection would give the most accurate results. Because each invasive species
starts at one point and spreads outward, the plants will not be evenly distributed across the
whole acre. With that in mind, a simple random sample would not be the best option: the
selected plots might all fall far from where the species started, showing very few invasive
plants, or all fall near where they originated, showing many. Since the acre is already
divided into 1 square foot plots, these numbered plots would be considered the strata for
the sample. Systematic sampling would then be used to choose 25 plots, which are more
likely to be spread out across the acre. Once the plots are chosen, we would count the
number of invasive plants in each square foot plot in the sample.
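
The systematic selection described above can be sketched in Python. This is only an illustration, assuming the 43,560 plots are numbered 0 to 43,559 in some fixed order; the function name `systematic_sample` and the seed are hypothetical choices, not part of the exam.

```python
import random

def systematic_sample(population_size, sample_size, seed=None):
    """Return plot indices for a 1-in-k systematic sample."""
    k = population_size // sample_size      # sampling interval
    rng = random.Random(seed)
    start = rng.randrange(k)                # random start within the first interval
    return [start + i * k for i in range(sample_size)]

# 25 plots out of 43,560 gives an interval of k = 1742
plots = systematic_sample(43_560, 25, seed=1)
```

Every selected plot is exactly k plots after the previous one, which is what spreads the sample across the acre.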

2. A researcher wants to estimate the total number of people in a community who recycle. A
telephone survey asked if the respondents recycled. The members of the community were
separated into 3 strata based on the education level of the adult respondent. Strata 1 = Did
not complete HS; Strata 2 =Completed HS and some college; Strata 3 = Bachelor degree or
higher.

           Sample size   Population size   # who recycle (from sample)
Strata 1        30             300                     8
Strata 2        50             500                    26
Strata 3        20             200                    15
Total          100            1000                    49

a. Estimate the total number of individuals who recycle.


$$N_1 = 300, \quad \hat{p}_1 = \frac{8}{30} = .2667$$
$$N_2 = 500, \quad \hat{p}_2 = \frac{26}{50} = .52$$
$$N_3 = 200, \quad \hat{p}_3 = \frac{15}{20} = .75$$
$$N = 1000$$

$$\sum N_i \hat{p}_i = (300)\left(\tfrac{8}{30}\right) + (500)(.52) + (200)(.75) = 80 + 260 + 150 = 490 \text{ total who recycle}$$
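
The stratified total can be checked with a few lines of Python. This is a minimal sketch using the table values; the variable names are illustrative, and the stratum proportions are left unrounded.

```python
# Stratum population sizes, sample sizes, and recyclers counted in each sample
N = [300, 500, 200]
n = [30, 50, 20]
recyclers = [8, 26, 15]

# Estimated total = sum over strata of N_i * p_hat_i, with p_hat_i = recyclers_i / n_i
p_hat = [a / m for a, m in zip(recyclers, n)]
total_hat = sum(N_i * p for N_i, p in zip(N, p_hat))
print(round(total_hat))  # 490
```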

b. If Var(y-bar(st)) = 25, estimate the bound on the error of estimation.

$$\hat{V}(\bar{y}_{st}) = 25$$
$$\hat{V}(N\bar{y}_{st}) = N^2 \hat{V}(\bar{y}_{st}) = 1000^2 (25) = 25{,}000{,}000$$
$$B = 2\sqrt{\hat{V}(N\bar{y}_{st})} = 2\sqrt{25{,}000{,}000} = 10{,}000$$

3. The text includes a data set titled TEMPS. The table below is a sample of size n = 15 from
that data set showing the temperatures for April and May. Additional information is
provided that may (or may not) be useful in the computations. Estimate the ratio of
average April temperature to average May temperature. Give your answer to 4 decimal
places.

Station April Temp May Temp April Temp / May Temp


Baltimore, MD 54 64 0.84375
Columbus, OH 51 61 0.836065574
Fairbanks, AK 29 47 0.617021277
Houston, TX 69 76 0.907894737
Jackson, MO 66 73 0.904109589
Juneau, AK 39 47 0.829787234
Kansas City, MO 54 64 0.84375
Marquette, MI 40 50 0.8
Memphis, TN 63 71 0.887323944
Moline, IL 51 61 0.836065574
Nashville, TN 60 69 0.869565217
Omaha, NE 52 63 0.825396825
Rapid City, SD 45 55 0.818181818
Syracuse, NY 47 57 0.824561404
Wilmington, DE 52 62 0.838709677
Sum 772 920 12.48218287


Mean 51.4667 61.3333 .8321455247
Estimate of the population ratio:

$$r = \frac{\sum y}{\sum x} = \frac{772}{920} = \frac{51.4667}{61.3333} = .8391$$

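
As a quick check, the ratio estimate can be recomputed from the table. The sketch below is illustrative (the list names are not from the text) and also prints the mean of the per-station ratios, only to show that it is a different quantity from the ratio of the averages.

```python
april = [54, 51, 29, 69, 66, 39, 54, 40, 63, 51, 60, 52, 45, 47, 52]
may = [64, 61, 47, 76, 73, 47, 64, 50, 71, 61, 69, 63, 55, 57, 62]

# Ratio estimator: ratio of the sums (equivalently, of the means)
r = sum(april) / sum(may)

# Mean of the per-station ratios, shown only for contrast
mean_of_ratios = sum(a / m for a, m in zip(april, may)) / len(april)

print(f"{r:.4f}  {mean_of_ratios:.4f}")  # 0.8391  0.8321
```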
4. Salary information for 15 employees is provided for this year and the previous year.

Employee Last Year(x) This Year(y) This – Last (y-x) (y-rx)


1 37537 40099 2562 112.898435
2 35821 38289 2468 130.858855
3 38577 41339 2762 245.043635
4 36391 38839 2448 73.669205
5 36696 39459 2763 368.76948
6 34493 36847 2354 103.504215
7 37276 39639 2363 -69.07262
8 34601 36859 2258 0.457755
9 34801 36959 2158 -112.591245
10 34782 37139 2357 87.64841
11 34785 37039 2254 -15.547325
12 36085 38139 2054 -300.365825
13 37177 39459 2282 -143.613365
14 39341 41939 2598 31.196455
15 35740 37559 1819 -512.8563
SUM = 544103 579603 35500
MEAN = 36273.53 38640.20 2366.67
ST DEVIATION = 1498.25 1652.86 252.86 214.185984
VARIANCE of (y-rx) = 45875.63585
Standard error=

Hint, calculate the standard error of the mean for the 2 years.

a. Estimate the rate of change for the 500 employees in the company and place a bound on the
error of estimation.

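$$r = \frac{\sum y}{\sum x} = \frac{579603}{544103} = 1.065245$$

$$\hat{V}(r) = \left(1 - \frac{n}{N}\right)\left(\frac{1}{\mu_x^2}\right)\left(\frac{s_r^2}{n}\right) = \left(1 - \frac{15}{500}\right)\left(\frac{1}{36273.53^2}\right)\left(\frac{45875.6358}{15}\right) = .00000225$$

$$B = 2\sqrt{\left(1 - \frac{n}{N}\right)\left(\frac{1}{\mu_x^2}\right)\left(\frac{s_r^2}{n}\right)} = 2\sqrt{.00000225} = .003$$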

We are 95% confident that the true population ratio lies within .003 of the estimate, 1.065245.
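
The ratio estimate and its bound can be reproduced from the 15 salary pairs. The sketch below is one way to do it, assuming the sample mean of last year's salaries stands in for the unknown population mean; the variable names are illustrative.

```python
from math import sqrt

last = [37537, 35821, 38577, 36391, 36696, 34493, 37276, 34601,
        34801, 34782, 34785, 36085, 37177, 39341, 35740]
this = [40099, 38289, 41339, 38839, 39459, 36847, 39639, 36859,
        36959, 37139, 37039, 38139, 39459, 41939, 37559]
N, n = 500, len(last)

r = sum(this) / sum(last)                      # estimated rate of change
resid = [y - r * x for x, y in zip(last, this)]
s_r2 = sum(e ** 2 for e in resid) / (n - 1)    # variance of the (y - r x) residuals
x_bar = sum(last) / n                          # assumption: used in place of mu_x

var_r = (1 - n / N) * (1 / x_bar ** 2) * (s_r2 / n)
bound = 2 * sqrt(var_r)
```

Here `s_r2` corresponds to the 45875.63585 figure listed under the (y-rx) column.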

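b. Estimate the total amount the company will spend on salaries for this year.

Ratio est. of total salaries = r * (mean of last year's salaries) * N = 1.065245 * 36273.53 * 500 ≈ $19,320,100

The ratio is applied to last year's mean salary because r already converts last year's salaries into this year's. With the sample mean standing in for the population mean, this equals N * (mean of this year's salaries) = 500 * 38640.20.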

5. A chip manufacturer wants to estimate the average fill for a package of chips. The data in
the table represents a 1 in 100 systematic sample for 1 day’s production. Estimate the
mean fill in ounces. Place a bound on the error of estimation. Assume the company
produces about 3600 bags of chips per day.
15.79 15.74 15.83 15.72 15.88 15.85
15.78 15.9 15.86 15.9 15.83 15.75
15.87 15.89 15.72 15.6 15.76 15.85
15.65 15.72 15.78 15.7 15.82 15.75
15.68 15.78 15.85 15.83 15.9 15.79
15.73 15.82 15.76 15.86 15.86 15.78
Sum of values listed = 568.580
Sum of values^2 = 8980.283817
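s = .07454 (s^2 = .005556)

$$\hat{\mu} = \frac{\sum y}{n} = \frac{568.580}{36} = 15.79389$$
$$\hat{V}(\hat{\mu}) = \left(1 - \frac{n}{N}\right)\left(\frac{s^2}{n}\right) = \left(1 - \frac{36}{3600}\right)\frac{.005556}{36} = .00015279$$
$$B = 2\sqrt{\hat{V}(\hat{\mu})} = 2\sqrt{.00015279} = .02472$$

Therefore, the true mean fill is estimated to lie within .02472 ounces of the sample mean, 15.79389 ounces.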
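
The mean, sample variance, and bound can be recomputed directly from the 36 listed fills. This is a minimal sketch that treats the systematic sample like a simple random sample, which is an assumption; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, variance

fills = [15.79, 15.74, 15.83, 15.72, 15.88, 15.85,
         15.78, 15.90, 15.86, 15.90, 15.83, 15.75,
         15.87, 15.89, 15.72, 15.60, 15.76, 15.85,
         15.65, 15.72, 15.78, 15.70, 15.82, 15.75,
         15.68, 15.78, 15.85, 15.83, 15.90, 15.79,
         15.73, 15.82, 15.76, 15.86, 15.86, 15.78]
N, n = 3600, len(fills)

y_bar = mean(fills)
s2 = variance(fills)                 # sample variance, divisor n - 1
var_mean = (1 - n / N) * s2 / n      # finite population correction applied
bound = 2 * sqrt(var_mean)
```

Computed this way, the sample variance is about .00556 and its square root is about .0745, so the .07454 figure matches the standard deviation rather than the variance.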

6. You do not have information about the proportion of the citizens that are in favor of
requiring hands-free cell phone use, and want to conduct a survey to estimate this
proportion with a bound on the error of estimation of B = .05. The population is 50,000.
Estimate the sample size.

$$D = \frac{B^2}{4} = \frac{.05^2}{4} = .000625$$
$$n = \frac{Npq}{(N-1)D + pq} = \frac{50000(.5)(.5)}{(50000-1)(.000625) + (.5)(.5)} = \frac{12500}{31.499375} = 396.8332705$$

Therefore, 397 people must be surveyed to estimate the proportion in favor of requiring
hands-free cell phone use to within .05.
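
The sample size formula is easy to wrap in a small function. The sketch below assumes p = .5 because no prior information about the proportion is available; the function name `sample_size_for_proportion` is hypothetical.

```python
from math import ceil

def sample_size_for_proportion(N, B, p=0.5):
    """Sample size needed to estimate a proportion to within bound B."""
    q = 1 - p
    D = B ** 2 / 4
    return N * p * q / ((N - 1) * D + p * q)

n_required = sample_size_for_proportion(N=50_000, B=0.05)
print(ceil(n_required))  # 397
```

Rounding up rather than to the nearest integer keeps the bound from exceeding B.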

7. Data for four strata are given in the table below.

            Strata 1   Strata 2   Strata 3   Strata 4
Ni             122         96        102         84
Variance        25          9         36         16

How should a sample of 100 be distributed among the 4 strata?

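$$\sum N_i\sigma_i = (122)(5) + (96)(3) + (102)(6) + (84)(4) = 1846$$

Strata 1: $$n_1 = n\left(\frac{N_1\sigma_1}{\sum N_i\sigma_i}\right) = 100\left(\frac{122 \cdot 5}{1846}\right) = 33.0444$$

Strata 2: $$n_2 = n\left(\frac{N_2\sigma_2}{\sum N_i\sigma_i}\right) = 100\left(\frac{96 \cdot 3}{1846}\right) = 15.6013$$

Strata 3: $$n_3 = n\left(\frac{N_3\sigma_3}{\sum N_i\sigma_i}\right) = 100\left(\frac{102 \cdot 6}{1846}\right) = 33.15276273$$

Strata 4: $$n_4 = n\left(\frac{N_4\sigma_4}{\sum N_i\sigma_i}\right) = 100\left(\frac{84 \cdot 4}{1846}\right) = 18.20151679$$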
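
The allocation above follows one pattern for every stratum, so it can be written as a short function. This is a sketch with an illustrative function name; it takes the variances from the table and uses their square roots as the standard deviations.

```python
from math import sqrt

def neyman_allocation(n, sizes, variances):
    """Allocate n across strata in proportion to N_i * sigma_i."""
    weights = [N_i * sqrt(v) for N_i, v in zip(sizes, variances)]
    total = sum(weights)
    return [n * w / total for w in weights]

alloc = neyman_allocation(100, [122, 96, 102, 84], [25, 9, 36, 16])
print([round(a, 2) for a in alloc])
```

Rounded to whole units, the four allocations are 33, 16, 33, and 18, which sum to 100.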

8. The following tables give information about the use of Tamoxifen (an anti-breast cancer
drug). It is often given to women who have had breast cancer and to those who are at high
risk for breast cancer. There were 13,000 women in the study.
Is Tamoxifen as effective for those who have had breast cancer as for those who have not
had cancer? Explain your answer.

Previous Breast Cancer

Breast Cancer Tamoxifen Placebo


Yes 89 175
Total 983 1251

No Previous Breast Cancer

Breast Cancer Tamoxifen Placebo


Yes 46 64
Total 5637 5129

Proportion of previous breast cancer patients who took the drug and got breast cancer:
89/983 = .09054
Proportion of previous breast cancer patients who took the placebo and got breast cancer:
175/1251 = .13989
Difference between the proportion of previous breast cancer patients who took the placebo and
got breast cancer and those who took the drug and got breast cancer:
.13989 - .09054 = .04935

Proportion of no previous breast cancer patients who took the drug and got breast cancer:
46/5637 = .00816
Proportion of no previous breast cancer patients who took the placebo and got breast cancer:
64/5129 = .012478
Difference between the proportion of no previous breast cancer patients who took the placebo and
got breast cancer and those who took the drug and got breast cancer:
.012478 - .00816 = .004318

The difference in cancer rates between placebo and drug was larger for women with previous
breast cancer (.04935) than for women with no previous breast cancer (.004318). In absolute
terms, the drug reduced the cancer rate by more in the previous breast cancer group, so the
drug is not equally effective in the two groups.
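
The two differences in proportions can be reproduced with a short script. This is a sketch only; the dictionary layout and the function name `risk_difference` are illustrative choices, and the counts come from the two tables.

```python
# (cancer cases, group size) for each arm
previous = {"tamoxifen": (89, 983), "placebo": (175, 1251)}
no_previous = {"tamoxifen": (46, 5637), "placebo": (64, 5129)}

def risk_difference(group):
    """Placebo cancer rate minus Tamoxifen cancer rate."""
    cases_t, n_t = group["tamoxifen"]
    cases_p, n_p = group["placebo"]
    return cases_p / n_p - cases_t / n_t

diff_previous = risk_difference(previous)
diff_no_previous = risk_difference(no_previous)
```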

9. Suppose information was gathered about dry land wheat production for 2009 and 2010.
The weather conditions in the spring and summer were different for this area. Information
from a sample of 10 plots gave bushels of wheat produced per acre for 2009 and 2010.

Year=2009 Year = 2010


43 52
41 45
46 48
35 45
39 42
44 51
50 54
40 50
47 54
49 51

Develop a linear regression equation to predict 2010 production from 2009 production.
Would you use this to estimate 2011 production? Explain.

r^2 = .55986
Intercept coefficient: 21.2816
Coefficient of x: .64328
Linear regression equation: production per acre for 2010 = .64328x + 21.2816, where x is production per acre for 2009

Because r^2 = .55986 does not indicate a strong linear relationship, I would not use this equation
to estimate 2011 production. Had the correlation been stronger, I would have expected it to
produce accurate results, but it was not.
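
The regression coefficients can be verified with the usual least squares sums. The sketch below uses only the standard library; the variable names are illustrative.

```python
x = [43, 41, 46, 35, 39, 44, 50, 40, 47, 49]   # 2009 bushels per acre
y = [52, 45, 48, 45, 42, 51, 54, 50, 54, 51]   # 2010 bushels per acre
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

# Sums of squares and cross products about the means
s_xx = sum((a - x_bar) ** 2 for a in x)
s_yy = sum((b - y_bar) ** 2 for b in y)
s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))

slope = s_xy / s_xx
intercept = y_bar - slope * x_bar
r_squared = s_xy ** 2 / (s_xx * s_yy)
```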

10. The local chamber of commerce is considering use of funds from a recent event to
purchase a statue of an early community leader for placement in a local park. The chamber
wants to determine the proportion of members that favor the statue purchase. They
conduct a 1 in 10 systematic sample from the membership list of the N= 500 members.
The responses were 36 in favor of the project. Estimate p, the proportion in favor of the
purchase.

Using a 1 in 10 from 500, n=50. So, out of 50 sampled, 36 were in favor.


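Therefore, p=.72 and q=.28

The variance of the estimate:

$$\hat{V}(\hat{p}_{sy}) = \left(1 - \frac{n}{N}\right)\left(\frac{\hat{p}_{sy}\hat{q}_{sy}}{n-1}\right) = \left(1 - \frac{50}{500}\right)\frac{(.72)(.28)}{49} = .0037028571$$

$$B = 2\sqrt{\hat{V}(\hat{p}_{sy})} = 2\sqrt{.0037028571} = .12170$$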

Therefore, with 95% confidence, the true proportion in favor lies within .1217 (12.17 percentage points) of the estimate, p = .72.
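
The estimate and bound for the systematic sample can be checked as follows. This is a minimal sketch with illustrative variable names; it uses the simple random sample variance formula for the systematic sample, as the worked answer does.

```python
from math import sqrt

N, k, in_favor = 500, 10, 36
n = N // k                      # a 1-in-10 sample of 500 members gives n = 50
p = in_favor / n
q = 1 - p

var_p = (1 - n / N) * p * q / (n - 1)
bound = 2 * sqrt(var_p)
print(f"{p:.2f} ± {bound:.4f}")  # 0.72 ± 0.1217
```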

11. A quality control check on automobile batteries involves weighing them. A shipment of
1000 batteries was received from a supplier for two consecutive months. The investigator
decides to stratify on months. A simple random sample of 6 batteries from each month’s
shipment is taken and summarized below.

Month A Month B
74.0 74.5
74.5 73.5
71.5 73.5
73.5 73.8
73.8 76.5
73.5 74.0
Sample Total 440.8 445.8
Sample size 6 6
Number in Shipment, N 1000 1000
Mean 73.4667 74.3
SD 1.032795559 1.140175425

Estimate the average weight of the batteries from this supplier and place a bound on the
error of estimation (ignore FPC). The manufacturing standard for this type of battery is 75
pounds. Do you think this supplier meets the average standard? Explain.

$$\bar{y}_{st} = \frac{1}{N}\left[N_1\bar{y}_1 + N_2\bar{y}_2\right]$$
$$\bar{y}_{st} = \frac{1}{2000}\left[1000(73.4667) + 1000(74.3)\right] = 73.88335$$

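$$\hat{V}(\bar{y}_{st}) = \frac{1}{N^2}\sum N_i^2\left(1 - \frac{n_i}{N_i}\right)\left(\frac{s_i^2}{n_i}\right) = \frac{1}{2000^2}\left[1000^2\left(1 - \frac{6}{1000}\right)\left(\frac{1.032795559^2}{6}\right) + 1000^2\left(1 - \frac{6}{1000}\right)\left(\frac{1.140175425^2}{6}\right)\right] = .0980194471$$

$$B = 2\sqrt{.0980194471} = .6261611521$$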
Therefore, the average weight of batteries from this supplier is 73.88335 ± .6261611521 pounds,
an interval of roughly 73.26 to 74.51 pounds. Although it is close, the whole interval lies below
75 pounds, so the supplier does not meet the average standard.
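
The stratified mean and bound can be recomputed from the raw weights. The sketch below keeps the finite population correction, as the calculation above does, even though the question allows it to be ignored; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, variance

month_a = [74.0, 74.5, 71.5, 73.5, 73.8, 73.5]
month_b = [74.5, 73.5, 73.5, 73.8, 76.5, 74.0]
strata = [(1000, month_a), (1000, month_b)]    # (N_i, sample) pairs
N = sum(N_i for N_i, _ in strata)

y_st = sum(N_i * mean(s) for N_i, s in strata) / N
var_st = sum(N_i ** 2 * (1 - len(s) / N_i) * variance(s) / len(s)
             for N_i, s in strata) / N ** 2
bound = 2 * sqrt(var_st)
upper_limit = y_st + bound     # compare against the 75 pound standard
```

Dropping the `(1 - len(s) / N_i)` factor changes the bound only slightly, because just 6 of every 1000 batteries are sampled.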

12. A corporation desires to estimate the total number of worker hours lost for a given month
because of sick leave among all employees. Because clerical, technicians, and
administrators have different sick leave rates, the researcher decides to use stratified
sampling, with each group forming a separate stratum. Data from previous years suggest
the variances shown in the accompanying table for the number of worker-hours lost per
employee in the three groups, and current data give the stratum sizes. Determine the
Neyman allocation for a sample of n=30 employees (formula 5.9).

Clerical Technicians Administrators


N 120 85 30
Sigma-squared (variance) 25 36 9
Sigma (standard deviation) 5 6 3

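$$\sum N_i\sigma_i = (120)(5) + (85)(6) + (30)(3) = 1200$$

Strata 1: $$n_1 = n\left(\frac{N_1\sigma_1}{\sum N_i\sigma_i}\right) = 30\left(\frac{120 \cdot 5}{1200}\right) = 15$$

Strata 2: $$n_2 = n\left(\frac{N_2\sigma_2}{\sum N_i\sigma_i}\right) = 30\left(\frac{85 \cdot 6}{1200}\right) = 12.75$$

Strata 3: $$n_3 = n\left(\frac{N_3\sigma_3}{\sum N_i\sigma_i}\right) = 30\left(\frac{30 \cdot 3}{1200}\right) = 2.25$$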
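
Because employees come in whole numbers, the fractional allocations have to be rounded in practice. The sketch below shows one possible way to do that, a largest-remainder rule, which is an assumption added here and not something formula 5.9 prescribes; the function name is hypothetical.

```python
from math import floor, sqrt

def neyman_allocation_rounded(n, sizes, variances):
    """Neyman allocation, plus integer counts that still sum to n."""
    weights = [N_i * sqrt(v) for N_i, v in zip(sizes, variances)]
    total = sum(weights)
    exact = [n * w / total for w in weights]
    counts = [floor(e) for e in exact]
    # Hand leftover units to the strata with the largest fractional parts
    order = sorted(range(len(exact)), key=lambda i: exact[i] - counts[i], reverse=True)
    for i in order[: n - sum(counts)]:
        counts[i] += 1
    return exact, counts

exact, counts = neyman_allocation_rounded(30, [120, 85, 30], [25, 36, 9])
```

With these inputs the integer counts come out as 15 clerical, 13 technicians, and 2 administrators.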

13. A chain department store is interested in estimating the proportion of accounts receivable
that are delinquent. The chain consists of four stores. So that the cost of sampling is
reduced, stratification is used with each store as a stratum. Because no information on
population proportions is available before sampling, proportional allocation is used, with a
total sample size of n=50. From the accompanying table estimate p the proportion of
delinquent accounts for the chain, and place a bound on the error of estimation.

                                              Store #1   Store #2   Store #3   Store #4
Number of Accounts Receivable                 N1=60      N2=37      N3=81      N4=22
Sample Size                                   n1=15      n2=9       n3=20      n4=6
Number of delinquent accounts in the sample   4          2          8          1
Proportion of delinquent accounts             .26667     .22222     .4         .16667

a. Estimate the proportion of delinquent accounts for each store.

b. Estimate the proportion of delinquent accounts using Equation (5.13).


$$\hat{p}_{st} = \frac{1}{N}\sum N_i\hat{p}_i = \frac{1}{200}\left[(60)(.26667) + (37)(.22222) + (81)(.4) + (22)(.16667)\right] = .3014454$$

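c. Estimate the variance of p-hat (stratified) using Equation (5.14).

$$\hat{V}(\hat{p}_1) = \left(\frac{N_1 - n_1}{N_1}\right)\left(\frac{\hat{p}_1\hat{q}_1}{n_1 - 1}\right) = \left(\frac{60 - 15}{60}\right)\left(\frac{(.2667)(.7333)}{14}\right) = (.75)(.013969) = .010477$$

$$\hat{V}(\hat{p}_2) = \left(\frac{N_2 - n_2}{N_2}\right)\left(\frac{\hat{p}_2\hat{q}_2}{n_2 - 1}\right) = \left(\frac{37 - 9}{37}\right)\left(\frac{(.2222)(.7778)}{8}\right) = (.757)(.0216) = .0163512$$

$$\hat{V}(\hat{p}_3) = \left(\frac{N_3 - n_3}{N_3}\right)\left(\frac{\hat{p}_3\hat{q}_3}{n_3 - 1}\right) = \left(\frac{81 - 20}{81}\right)\left(\frac{(.4)(.6)}{19}\right) = (.7531)(.01263) = .009512$$

$$\hat{V}(\hat{p}_4) = \left(\frac{N_4 - n_4}{N_4}\right)\left(\frac{\hat{p}_4\hat{q}_4}{n_4 - 1}\right) = \left(\frac{22 - 6}{22}\right)\left(\frac{(.1667)(.8333)}{5}\right) = (.72727)(.02778) = .0202$$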

$$\hat{V}(\hat{p}_{st}) = \frac{1}{N^2}\sum N_i^2\,\hat{V}(\hat{p}_i) = \frac{1}{200^2}\left[(60^2)(.010477) + (37^2)(.0163512) + (81^2)(.009512) + (22^2)(.0202)\right] = .00331$$

The bound on the error of estimation is B = 2√.00331 = .115. Therefore, the estimate of the
proportion of delinquent accounts, with a bound on the error of estimation, is .30145 ± .115
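
The whole stratified proportion calculation can be reproduced in one pass. This is a sketch with illustrative variable names; it uses unrounded store proportions, so intermediate values differ from the hand calculation in the last decimal places.

```python
from math import sqrt

N_i = [60, 37, 81, 22]          # accounts receivable per store
n_i = [15, 9, 20, 6]            # sampled accounts per store
delinquent = [4, 2, 8, 1]       # delinquent accounts found in each sample
N = sum(N_i)

p_i = [d / n for d, n in zip(delinquent, n_i)]
p_st = sum(Ni * p for Ni, p in zip(N_i, p_i)) / N

# Per-store variance with finite population correction, then the stratified combination
v_i = [(Ni - n) / Ni * p * (1 - p) / (n - 1) for Ni, n, p in zip(N_i, n_i, p_i)]
var_st = sum(Ni ** 2 * v for Ni, v in zip(N_i, v_i)) / N ** 2
bound = 2 * sqrt(var_st)
```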
