Number of seeds per plant for Lupinus polyphyllus from an experimental biological population
1. Enter the class limits, relative and cumulative frequencies of the first 3 classes in the table. 6pts Label the units of
the frequency, cumulative frequency and relative frequency columns. 3pts (9total)pts
see next pg
1 for equation
3. Find the crude value of the median using the method demonstrated in class. (Show your work.) 4pts
1 pt for 15.5
5. Describe the shape of the distribution of data using appropriate statistical jargon. 2pts
Essentially normal
6. Which of the measures of central tendency is most appropriate for this data? Explain why. Be specific. 4pts
Mean; for symmetrical data it has the least sampling error because it takes into account all the data equally
7. On what scale is the variable number of seeds per plant measured? 2pts
ratio
8. Is the variable number of seeds per plant discrete or continuous? 2pts
discrete
9. Is this the shape (see#5) that you expect for the distribution based on the type of variable? Explain. 2pts
Although it is counted data (Poisson), the mean is large enough for it to be reasonably symmetric
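Table columns: Midpoints, Limits, Frequency, Relative Frequency, Cumulative Frequency
Median: (n+1)/2 = 15.5; the 15th and 16th observations both occur in the class with midpt = 50
Mode = 50
Midpoint × frequency products: 35, 120, 270, 500, 385, 120, 65
Sum = 1495; mean = 49.83333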
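The grouped mean, crude median and mode above can be checked with a short sketch. The class midpoints (35 to 65 in steps of 5) are an assumption: they are not stated in the key, but they are the values that divide the listed products into whole-number frequencies and place the mode at 50.

```python
# Midpoints are ASSUMED (inferred from the products in the key, not stated there).
midpoints = [35, 40, 45, 50, 55, 60, 65]
# Midpoint x frequency products as listed in the key.
products = [35, 120, 270, 500, 385, 120, 65]

# Frequencies implied by the assumed midpoints.
freqs = [p // m for p, m in zip(products, midpoints)]
n = sum(freqs)
mean = sum(products) / n

# Crude median: the class that holds the (n + 1) / 2-th observation.
position = (n + 1) / 2
cumulative = 0
median_class = None
for m, f in zip(midpoints, freqs):
    cumulative += f
    if cumulative >= position:
        median_class = m
        break

# Mode: midpoint of the class with the highest frequency.
mode_class = midpoints[freqs.index(max(freqs))]

print(n, round(mean, 5), position, median_class, mode_class)
```

Under these assumed midpoints the mean, median class and modal class nearly coincide, which is consistent with the "essentially normal" description in #5.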
II. Some studies have associated polyphenols in cocoa with beneficial effects on functions such as weight control
and cognitive function. A particular brand of Dutched cocoa powder has a mean polyphenol content of 40mg/g
with a standard deviation of 11mg/g based on a sample of size 10. Answer the following pertaining to this
information. Include formulas. 30 pts
Pg 28pts
2. What probability distribution (if any) is most likely to apply to the population of this type of data? 1pt
Normal population (t sampling dist assumes normal population)
3. Calculate the degrees of freedom 1pt
n – 1 = 10-1 =9
6. Is the sample size used adequate? Explain and use a formula to calculate what you consider a good sample size. 5pts
No, the confidence limits are broad.
Assume you want d to be 5% of the mean (d = 0.05 × 40 = 2 mg/g) and use the crude approximation of t = 2:
n = (t·s/d)² = (2 × 11 / 2)² = 121
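The sample size arithmetic can be reproduced as a minimal sketch, assuming the usual n = (t·s/d)² form with the crude t = 2 from the answer above. The variable names are illustrative only.

```python
# Values from the problem statement.
mean = 40.0      # mg/g
s = 11.0         # sample standard deviation, mg/g
t_approx = 2.0   # crude approximation of t used in the key

# Desired half-width of the confidence interval: 5% of the mean.
d = 0.05 * mean

# Required sample size, n = (t * s / d)^2.
n_required = (t_approx * s / d) ** 2
print(d, n_required)
```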
7. Assume you wish to test whether the level in the sample above is significantly less than the value in Dutched
chocolate which is known to be 96mg/g. 11pts total
b. Based on the 95% confidence limits (#4) will you fail to reject the null hypothesis? Explain. 2pts
c. Sketch the sampling distribution and label the reject and fail to reject regions. Include the value of the level of
significance and the critical value in your sketch. 4pts Assume that µ1 = the sample mean and expand your
drawing on the last page to indicate the probability of a Type II error and power. Label using appropriate
symbols. 3pts
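[Sketch: sampling distribution centered at µ0 = 96, with the reject region in the left tail (α = 0.05) below t crit = -1.833 and the fail to reject region to its right.]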
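As a rough check of the one-tailed test, the sketch below computes a one-sample t statistic from the summary values given in the problem and compares it with the critical value from the key. It assumes the standard form t = (sample mean − µ0)/(s/√n); the key's own worked calculation is not shown here, so treat this as an illustration rather than the graded answer.

```python
import math

# Summary values from the problem statement.
sample_mean = 40.0   # mg/g
s = 11.0             # mg/g
n = 10
mu0 = 96.0           # hypothesized value, mg/g

# One-tailed critical value for alpha = 0.05, df = 9, taken from the key.
t_crit = -1.833

# Standard error of the mean and the t statistic.
se = s / math.sqrt(n)
t_stat = (sample_mean - mu0) / se

# Reject H0 when the statistic falls in the left-tail reject region.
reject = t_stat < t_crit
print(round(se, 3), round(t_stat, 2), reject)
```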
Pg 20pts
III. What’s wrong here? A glaring mistake in either experimental design or analysis has been made in the
following experiments. The mistake is stated in the brief description; it does not involve a factor that is simply
omitted from the description. State the error and how to correct it. 8pts
1. A study was performed to test the hypothesis that plant growth is affected by small amounts of acidity. Twenty
similar small plots of herbaceous vegetation were randomly assigned to either the control (watering with pH=7) or the
experimental treatment (watering with a mixture of nitric and sulfuric acid with pH=6.3). After one season of growth the
biomass (dry mass in kg) of each plot was measured. The mean of the control was 6.8 and the experimental 8.9.
When compared statistically they found a two tailed p of 0.06. They concluded that this reduction in pH did not
affect growth.
2. In a study of factors that may prevent heart attacks, investigators wished to determine whether lycopene (an
antioxidant found in tomatoes) increases blood vessel dilation, thus decreasing the chance of a heart attack. They
divided 60 patients with cardiovascular disease randomly into 2 groups: one took a pill with lycopene and the other a
placebo. After 2 months those that took the lycopene pill showed a statistically significant improvement in blood
vessel dilation. The investigators concluded that lycopene protects against heart attacks.
The conclusions are not justified. They should say that lycopene significantly increases blood vessel dilation in individuals
with cv disease. Note there are 3 problems: can't extrapolate from dilation to heart attacks, can't speak of non-cv patients,
and the statement may have been a bit too strong, as the significant result has not been repeated.
IV. Show the formulas (2pts), sketch (1pt) and calculate (2pts) the probability. Give the name of the distribution
you used (1pt). 12 pts total
1. Given that for a particular laboratory the mean C reactive protein (associated with inflammation) in normal
individuals is 1.5 mg/dl with a variance of .72 what is the probability of finding a normal individual with a value
between 1 and 3?
Normal: Z = (X − µ)/σ
Z1 = (1 − 1.5)/√0.72 = -0.59; Z2 = (3 − 1.5)/√0.72 = 1.77
From the table: p(Z1) = 0.2776; p(Z2) = 0.0384
P = 1 − 0.2776 − 0.0384 = 0.6840
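The table lookup can be cross-checked with Python's standard library. This is a minimal sketch using `statistics.NormalDist`; because it does not round Z to two decimals, the result differs slightly from the table-based value.

```python
import math
from statistics import NormalDist

# Values from the problem statement.
mu = 1.5          # mg/dl
variance = 0.72
sigma = math.sqrt(variance)

# Standardized limits, for comparison with the table lookup.
z1 = (1 - mu) / sigma
z2 = (3 - mu) / sigma

# Probability of a value between 1 and 3.
dist = NormalDist(mu, sigma)
prob = dist.cdf(3) - dist.cdf(1)
print(round(z1, 2), round(z2, 2), round(prob, 4))
```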
2. Given that the mean percentage of sexually active Canadians that have had at least one infection with HPV is 75%, what is
the probability of finding more than 13 that have been infected in a random sample of 15?
Binomial: p = 0.75, q = 0.25, n = 15, X > 13
p(x) = [n!/(x!(n − x)!)] p^x q^(n − x)
P(X ≥ 14) = p(14) + p(15)
[Sketch: binomial distribution over x = 0, 1, 2, 3, 4, 5, 6, …, 14, 15; left skew]
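The binomial sum can be evaluated directly. The sketch below applies the formula above with `math.comb`; the helper name `binom_pmf` is illustrative, not part of the key.

```python
from math import comb

def binom_pmf(x, n, p):
    """Binomial probability of exactly x successes in n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Values from the problem statement.
n = 15
p = 0.75

# "More than 13" in a sample of 15 means X = 14 or X = 15.
prob_more_than_13 = binom_pmf(14, n, p) + binom_pmf(15, n, p)
print(round(prob_more_than_13, 4))
```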
1. Why does the formula for the statistical variance have n-1 as its denominator? Explain in detail. 4pts
To eliminate bias, since the value over n is always an underestimate: the numerator based on the range of the sample will
always be much less than the range of the population; dividing by n-1 will make up for this. Why -1? It is the degrees of freedom,
the number of deviations that are independent of each other. Only n-1 are, because in the calculation for the variance you have
forced the sum of deviations to zero around the sample mean, not the parametric mean; thus only n-1 deviations are
independent of one another.
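The bias can be illustrated with a tiny enumeration. This is a sketch under assumed toy values: the population {1, 2, 3} and sample size 2 are hypothetical and chosen only so that every possible sample (drawn with replacement) can be listed.

```python
from itertools import product

# Hypothetical toy population, chosen only for illustration.
population = [1, 2, 3]
mu = sum(population) / len(population)
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

# Every ordered sample of size n drawn with replacement.
n = 2
samples = list(product(population, repeat=n))

def sum_of_squares(sample):
    """Sum of squared deviations around the SAMPLE mean."""
    m = sum(sample) / len(sample)
    return sum((x - m) ** 2 for x in sample)

# Average of the two candidate estimators over all possible samples.
avg_div_n = sum(sum_of_squares(s) / n for s in samples) / len(samples)
avg_div_n_minus_1 = sum(sum_of_squares(s) / (n - 1) for s in samples) / len(samples)

print(round(sigma2, 4), round(avg_div_n, 4), round(avg_div_n_minus_1, 4))
```

In this toy case dividing by n averages to half the population variance, while dividing by n-1 averages to the population variance itself.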
2. If you could use either normal or t which would be better? Why? 2pts
Normal is better; it is based on the parameter and has a lower σ and thus a narrower sampling distribution and more
power
3. Under what circumstances would you use the t distribution rather than the Normal to calculate probabilities? 1pt
When you do not have the parametric standard deviation (σ), only s. Yes, you can use the Normal for large sample sizes, but as far as
I am concerned that means at least over 500
4. Which statistic is best to estimate population dispersion? Give the name and symbol. 3pts
7. Derive the formula actually used in one sample tests to compare a sample mean with a given value from #6 using #5.
3pts
Pg 16pts
VI. Based on measures made by ornithology classes over the previous 15 years, the mean number of perching bird
species in a particular woods has been 5.9. One year after spraying with herbicide three species were found.
Does this value indicate a decrease in the number of species? 16 pts
Ho: µ≥ 5.9
H1: µ< 5.9
3. Find the p value. Indicate it in a sketch shading the appropriate area. 8pts
[Sketch: distribution over x = 0, 1, 2, 3, 4, 5, 6, …, ∞] 2pt for graph. I gave this point to those who were using normal or t.
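A possible calculation of the p value is sketched below. It assumes the species count follows a Poisson distribution with mean 5.9, which is an assumption here (the count scale from 0 to ∞ suggests it, but the key's worked answer is not shown). The one-tailed p value is then the probability of observing 3 or fewer species.

```python
import math

def poisson_pmf(k, lam):
    """Poisson probability of exactly k events when the mean is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Values from the problem statement; the Poisson model itself is ASSUMED.
lam = 5.9
observed = 3

# One-tailed p value for H1: mu < 5.9, i.e. P(X <= 3).
p_value = sum(poisson_pmf(k, lam) for k in range(observed + 1))
print(round(p_value, 4))
```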