
# Statistics IB Example Exam

1. The central limit theorem ensures that
   - A) confidence intervals will be symmetric.
   - B) we can always assume our data is normally distributed.
   - C) the sample mean will tend to be normally distributed as the sample size grows, for most distributions.
   - D) a Z statistic is always exactly normally distributed.

2. A researcher is interested in the effect of toy color on children's preference for toys. He hypothesizes that children will play longer with bright-colored toys. If $\mu_1$ is the population mean time that children play with bright-colored toys, and $\mu_2$ is the population mean time that children play with dull-colored toys, what is the proper null hypothesis for his statistical test?
   - A) $\mu_1 > \mu_2$
   - B) $-1.96 < \mu_1 - \mu_2 < 1.96$
   - C) $|\mu_1 - \mu_2| > 1.96$
   - D) $\mu_1 - \mu_2 = 0$

3. What is the logical problem with this abstract?

   An experiment was performed to determine the effect of toy color on infant play behavior. Infants were given access to toys of bright colors and dull colors to play with; the time played with each type was measured. On average, children played with bright-colored toys longer than dull-colored toys (p = 0.02). The p value of 0.02 indicates that the probability that there is no difference between play behavior for dull and bright colored toys is about 2%. We conclude that color has an effect on behavior.

   - A) Statistical significance is not practical significance.
   - B) Accepting the null hypothesis.
   - C) No logical problem.
   - D) Misinterpretation of p value.

4. What is the logical problem with this abstract?

   The Hair-Length-Height (HLH) theory predicts that people with long hair will be taller on average than people with shorter hair. We tested the theory using a sample of 25 undergraduates. Their hair length and height were measured. A Pearson correlation of r = 0.1 was found. This correlation was not significant at the $\alpha = 0.05$ level. We conclude that no true correlation exists between height and hair length, and that the HLH theory is false.

   - A) Statistical significance is not practical significance.
   - B) No logical problem.
   - C) Misinterpreting the p value.
   - D) Accepting the null hypothesis.

Use the following to answer question 5: Central Middle School has calculated a 95% confidence interval for the mean height ($\mu$) of 11-year-old boys at their school and found it to be $56 \pm 2$ inches.

5. Which of the following could be the 90% confidence interval based on the same data?
   - A) $56 \pm 3$
   - B) Without knowing the sample size, any of the above answers would be the 90% confidence interval.
   - C) $56 \pm 2$
   - D) $56 \pm 1$

6. In a test of statistical hypotheses, what does the p value tell us?
   - A) The largest level of significance at which the null hypothesis can be rejected.
   - B) The smallest level of significance at which the null hypothesis can be rejected.
   - C) If the alternative hypothesis is true.
   - D) If the null hypothesis is true.

7. A study was conducted on the reaction time of male subjects over the age of 50 to a particular light stimulus using a specific type of red light. A random sample of 45 male subjects from this age group in the population was selected and the subjects' reaction time (in seconds) determined. Based on the test data, the 99% confidence interval estimate for $\mu$ was determined to be (0.66, 0.78). From this interval we can conclude that
   - A) we have 99% confidence in this interval because in repeated sampling from the population, the method used to determine such confidence intervals will produce intervals 99% of which contain $\mu$.
   - B) this interval has probability of 0.99 of enclosing the true mean $\mu$.
   - C) we have 99% confidence that this interval contains $\mu$ because if we repeated testing over and over again, 99% of the time this interval will contain $\mu$.
   - D) this interval, in the long run, will contain the true mean, $\mu$, 99% of the time.

Use the following to answer question 8: The Survey of Study Habits and Attitudes (SSHA) is a psychological test that measures the motivation, attitude, and study habits of college students. Scores range from 0 to 200 and follow (approximately) a Normal distribution with mean 115 and standard deviation 25. You suspect that incoming freshmen at your school have a mean $\mu$ which is different from 115 because they are often excited yet anxious about entering college. To test your suspicion, you decide to test the hypotheses $H_0: \mu = 115$ versus $H_a: \mu \neq 115$. You give the SSHA to 25 incoming freshmen and find their mean to be 116.2.

8. What is the value of the test statistic?
   - A) z = 1.96
   - B) z = 1.2
   - C) z = 0.048
   - D) z = 0.24

Use the following to answer question 9: The level of calcium in the blood of healthy young adults follows a Normal distribution with mean $\mu = 10$ milligrams per deciliter and standard deviation $\sigma = 0.4$. A clinic measures the blood calcium of 100 healthy pregnant young women at their first visit for prenatal care. The mean of these 100 measurements is $\bar{x} = 9.8$. Is this evidence that the mean calcium level in the population of healthy young women is less than 10? To answer this, test the hypotheses $H_0: \mu = 10$ versus $H_a: \mu < 10$ at the 5% significance level.

9. What is the p value?
   - A) 0.6170
   - B) greater than 0.99
   - C) 0.3085
   - D) Less than 0.0002

10. What is the file-drawer problem in meta-analysis?
    - A) It may be difficult to find studies relevant to a particular topic, because research is often published in a disorganized manner.
    - B) Researchers may have published a claim that the null hypothesis is true, but they may simply have had too little power to reject the null.
    - C) Studies typically take a year to be published, meaning that the latest research may not be available right away.
    - D) Some studies do not get published, especially if they have null findings, potentially biasing published research.

Use the following to answer questions 11-12: The nicotine content in cigarettes of a certain brand is Normally distributed with standard deviation $\sigma = 0.1$ milligrams. The brand advertises that the mean nicotine content of their cigarettes is $\mu = 1.5$ milligrams, but you are suspicious and plan to investigate the advertised claim by testing the hypotheses $H_0: \mu = 1.5$ versus $H_a: \mu > 1.5$ at the 5% significance level. You will do so by measuring the nicotine content of 15 randomly selected cigarettes of this brand and computing the mean nicotine content $\bar{x}$ of your measurements.

11. What is the smallest value of $\bar{x}$ for which you will reject $H_0$?
    - A) 1.542
    - B) 1.551
    - C) 1.505
    - D) 1.458

12. If the mean nicotine content of the cigarettes is, in fact, $\mu = 1.6$, what is the power of the test?
    - A) 0.95
    - B) 0.987
    - C) 0.799
    - D) 0.971

13. To estimate $\mu$, the mean salary of full professors at American colleges and universities, you obtain the salaries of a random sample of 400 full professors. The sample mean is \$73,220 and the sample standard deviation is \$4400. What is a 99% confidence interval for the mean $\mu$?
    - A) $73{,}220 \pm 4400$
    - B) $73{,}220 \pm 28$
    - C) $73{,}220 \pm 433$
    - D) $73{,}220 \pm 569$

14. A researcher is interested in the effect of a training program for increasing statistics test scores. She selects a within-subjects design with 20 participants, and tests each participant in a pretest, and again after the training. The scores are shown in the figure below:

   The diagonal dashed line shows where scores are equal across conditions. Perform a two-tailed sign test for the appropriate hypothesis. What is the p value?
   - A) $0.025 < p < 0.05$
   - B) $p < 0.025$
   - C) $0.05 < p < 0.1$
   - D) $p > 0.1$

15. Which of the following is NOT true about the sign test?
    - A) The sign test does not require difference scores to be normally distributed.
    - B) The null hypothesis in the sign test is that the median score is 0.
    - C) The sign test may be used when the assumptions of the t test are not met.
    - D) It assumes that the raw difference scores are binomially, rather than normally, distributed.

Use the following to answer questions 16-18: You wish to compare the prices of apartments in two neighboring towns. You take a simple random sample of 12 apartments in town A and calculate the average price of these apartments. You repeat this for 15 apartments in town B. Let $\mu_1$ represent the true average price of apartments in town A and $\mu_2$ the average price in town B.

16. If we were to use the pooled t test, what would be the degrees of freedom?
    - A) 14
    - B) 25
    - C) 12
    - D) 11

17. If we were to use the unpooled t test, what would be the conservative estimate for the degrees of freedom?
    - A) 14
    - B) 25
    - C) 12
    - D) 11

18. Suppose we were to use the unpooled t test with the conservative estimate for the degrees of freedom. The t statistic for comparing the mean prices is 2.1. What can we say about the value of the p value?
    - A) $0.05 < p < 0.10$
    - B) $p > 0.10$
    - C) $0.01 < p < 0.05$
    - D) $p < 0.01$

Use the following to answer questions 19-20: Two statistics professors at two rival schools decide to use IQ scores as a measure of how smart the students at their respective schools are. IQ scores are known to be Normally distributed. The two professors will use this knowledge to their advantage. They will randomly select 10 students from their respective schools and determine the students' IQ scores by means of the standard IQ test. The two professors will use the unpooled version of the two-sample t test to determine whether the students at the two universities are equally smart. Let $\mu_1$ and $\mu_2$ represent the mean IQ scores of the students at the two universities. Let $\sigma_1$ and $\sigma_2$ be the corresponding population standard deviations. Based on the two samples of 10 students, the two professors find the following information: $\bar{x}_1 = 111$, $\bar{x}_2 = 120$, $s_1 = 7$, $s_2 = 11$.

19. The value of the test statistic is -2.18. Suppose the professors had wished to test the hypotheses $H_0: \mu_1 = \mu_2$ versus $H_a: \mu_1 < \mu_2$. What can we say about the value of the p value?
    - A) $p > 0.05$
    - B) $p < 0.01$
    - C) $0.01 < p < 0.025$
    - D) $0.025 < p < 0.05$

20. The two professors also wish to test the hypothesis that the groups are equivalent in how variable their IQ scores are. To do this, they wish to test the hypotheses $H_0: \sigma_1 = \sigma_2$ versus $H_a: \sigma_1 \neq \sigma_2$. What can we say about the value of the p value?
    - A) $0.01 < p < 0.025$
    - B) $p > 0.05$
    - C) $p < 0.01$
    - D) $0.025 < p < 0.05$
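The test statistic quoted in question 19 can be recomputed from the summary statistics given for questions 19-20. The sketch below is a minimal illustration, not part of the exam: the helper names `unpooled_t` and `variance_ratio` are hypothetical, and placing the larger sample variance in the numerator of the F ratio is a common convention that is assumed here.

```python
import math


def unpooled_t(mean1, mean2, s1, s2, n1, n2):
    """Two-sample t statistic without pooling the variances."""
    standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return (mean1 - mean2) / standard_error


def variance_ratio(s1, s2):
    """F statistic: larger sample variance divided by the smaller one."""
    larger, smaller = max(s1, s2), min(s1, s2)
    return (larger / smaller) ** 2


# Summary statistics stated in the exam for the two samples of 10 students.
t_stat = unpooled_t(111, 120, 7, 11, 10, 10)
f_stat = variance_ratio(7, 11)

# Conservative degrees of freedom: the smaller sample size minus one.
conservative_df = min(10, 10) - 1

print(round(t_stat, 2), round(f_stat, 2), conservative_df)  # -2.18 2.47 9
```

Locating these statistics in a t table (9 degrees of freedom) or an F table (9 and 9 degrees of freedom) is still required to bracket the p values.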

21. Which of the following procedures is not robust to non-Normality?
    - A) The t test for matched pairs.
    - B) The two-sample t test.
    - C) The F test for comparing two population standard deviations.
    - D) The one-sample t test.

22. Which of the following is not an assumption of the nonpooled t test procedure?
    - A) There are two samples, not one.
    - B) Populations are normal.
    - C) Observations are independent.
    - D) Populations have the same variance.

Use the following to answer questions 23-24: A simple random sample of 120 vet clinics in the Midwest reveals that the vast majority of them only treat small pets (dogs, cats, rabbits, etc.) and no large animals (cows, horses, etc.). Of the 120 clinics sampled, 88 responded that they do not treat large animals at their clinic.

23. What is the value of the standard error of $\hat{p}$?
    - A) 0.03
    - B) 0.04
    - C) 0.05
    - D) 0.02

24. What is a 90% confidence interval for p, the population proportion of vet clinics that do treat large animals?
    - A) (0.19, 0.35)
    - B) (0.67, 0.80)
    - C) (0.16, 0.37)
    - D) (0.20, 0.33)

25. A simple random sample of 60 blood donors is taken to estimate the proportion of donors with type A blood with a 95% confidence interval. In the sample, there are 10 people with type A blood. What is the margin of error for this confidence interval?
    - A) 0.079
    - B) 1.96
    - C) 0.048
    - D) 0.094

Use the following to answer questions 26-27: Suppose a psychic was tested for extrasensory perception. The psychic was presented with 200 cards face down and asked to determine which of five symbols was on each card: a star, a cross, a circle, a square, or three wavy lines. The psychic was correct in 50 cases. Let p represent the probability that the psychic correctly identifies the symbol on the card in a random trial. Assume the 200 trials can be treated as a simple random sample from the population of all guesses the psychic would make in his lifetime.

26. Based on the results of the test, what is a 95% confidence interval for p?
    - A) $0.25 \pm 0.055$
    - B) $0.25 \pm 0.060$
    - C) $0.25 \pm 0.05$
    - D) $0.25 \pm 0.004$

27. Suppose you wished to see if there were evidence that the psychic is doing better than just guessing. To do this, you test the hypotheses $H_0: p = 0.20$ versus $H_a: p > 0.20$. What is the value of the large-sample z statistic?
    - A) z = 1.77
    - B) z = 4.17
    - C) z = 0.83
    - D) z = 1.96

Use the following to answer question 28: A quality manager in a small manufacturing company wants to estimate the proportion of items produced by a very specialized process that fail to meet a customer's specification. Because it is very expensive to determine if an item produced by the process meets the specification, only a very small number of items can be tested. A random sample of 15 items was selected and in the sample 3 of them failed.

28. Based on the sample results, the plus four estimate of the true proportion of items that fail to meet the customer's specifications is
    - A) P = 0.263
    - B) P = 0.200
    - C) P = 0.333
    - D) P = 0.158

Use the following to answer questions 29-30: A consumer advocate agency is concerned about reported failures of two brands of MP3 players, which we will label Brand A and Brand B. In a random sample of 197 Brand A players, 33 units failed within 1 year of purchase. Of the 290 Brand B players, 25 units were reported to have failed within the first year following purchase. The agency is interested in the difference between the population proportions, $p_A - p_B$, for the two brands.

29. From these data, the estimate of the true difference between the proportions, $D$, and the standard error of the estimate, $SE_D$, are, respectively
    - A) $D = 0.0813$, $SE_D = 0.0299$
    - B) $D = 0.1191$, $SE_D = 0.1049$
    - C) $D = 0.0813$, $SE_D = 0.0313$
    - D) Not within 0.0005 of any of the above.

30. What is the 99% confidence interval estimate of the true difference, $p_A - p_B$?
    - A) (0.0005, 0.1621)
    - B) (-0.1893, 0.3897)
    - C) (0.0042, 0.1584)
    - D) Not within 0.0005 of any of the above.

31. In the year 2000, a report was released which estimated the heart disease rate in Great Britain to be 188 per 100,000 people, or 0.188%. In France, the heart disease rate was 57 per 100,000, or 0.057%. Estimate the relative risk of heart disease for Great Britain compared to France.
    - A) 1.19
    - B) -1.19
    - C) 0.30
    - D) 3.30

Use the following to answer question 32: Are avid readers more likely to wear glasses than those who read less frequently? Three hundred men in Ohio were selected at random and characterized as to whether they wore glasses and whether the amount of reading they did was above average, average, or below average. The results are presented in the following table, where the columns give the amount of reading:

| Glasses? | Above average | Average | Below average | Total |
|----------|---------------|---------|---------------|-------|
| No       | 26            | 78      | 70            | 174   |
| Yes      | 47            | 48      | 31            | 126   |

32. Suppose we wish to test the null hypothesis that there is no association between the amount of reading and wearing glasses. Under the null hypothesis, what is the expected number of above average readers who wear glasses?
    - A) 30.7
    - B) 47
    - C) 81.1
    - D) 27.2

Use the following to answer questions 33-35: A group of researchers is interested in the relationship between the income of a child's parents and later academic achievement. Income data was collected when the subjects were children, and then as adults the subjects were asked about their academic achievement. A professor tells 4 of his students to think about how they would like to approach studying this topic. A week later, they all meet, and the four students present their research designs. In the following questions, choose the analysis method that is best suited to the design the student has chosen.

33. The first student, Bob, has decided to divide the subjects' academic achievement into "finished high school" and "didn't finish high school," and the parents' income into "below middle class" and "middle class or above." Once the subjects are categorized in this way, what is the most appropriate analysis technique below?
    - A) One sample z test
    - B) Independent samples t test
    - C) Chi-squared test
    - D) Regression and correlation coefficients

34. Chris decided that he will use as a dependent (response) variable the number of years a subject attended school, and as an independent (explanatory) variable he would like to use the natural logarithm of the subject's parents' income. What is the most appropriate analysis technique for this design?
    - A) Independent samples t test
    - B) Confidence interval for relative risk
    - C) Regression and correlation coefficients
    - D) z test for difference between two proportions

35. Which of the following would be the most appropriate plot for assessing the relationship of interest with Chris's data?
    - A) Error bar plot
    - B) Histogram
    - C) Scatterplot
    - D) Box plot

36. The main difference between Bayesian statistics and Frequentist statistics is
    - A) Their interpretation of confidence intervals
    - B) Whether they think Bayes' theorem is valid
    - C) The definition of probability that underlies each philosophy
    - D) How they create prior distributions

Answer Key:

1. C
2. D
3. D
4. D
5. D
6. B
7. A
8. D
9. D
10. D
11. A
12. B
13. D
14. A
15. D
16. B
17. D
18. A
19. D
20. B
21. C
22. D
23. B
24. D
25. D
26. B
27. A
28. A
29. C
30. A
31. D
32. A
33. C
34. C
35. C
36. C
37. B
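The numerical entries in the key for the z-based questions can be reproduced with a short script. The following is a sketch using only the Python standard library (`statistics.NormalDist`). The variable names and the helper `prop_se` are illustrative assumptions, and exact Normal quantiles are used rather than table values, so the last digit can differ slightly from a table-based calculation.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard Normal distribution


def prop_se(p, n):
    """Standard error of a sample proportion p based on n observations."""
    return math.sqrt(p * (1 - p) / n)


# Q8: one-sample z statistic (sigma = 25, n = 25, sample mean 116.2).
z_q8 = (116.2 - 115) / (25 / math.sqrt(25))

# Q9: lower-tail p value (sigma = 0.4, n = 100, sample mean 9.8).
p_q9 = Z.cdf((9.8 - 10) / (0.4 / math.sqrt(100)))

# Q11: smallest sample mean that rejects H0 at the 5% level (upper tail).
se_q11 = 0.1 / math.sqrt(15)
cutoff_q11 = 1.5 + Z.inv_cdf(0.95) * se_q11

# Q12: power of that test when the true mean is 1.6.
power_q12 = 1 - NormalDist(1.6, se_q11).cdf(cutoff_q11)

# Q23-24: 88 of 120 clinics do not treat large animals, so 32 do.
se_q23 = prop_se(88 / 120, 120)
p_treat = 32 / 120
margin_q24 = Z.inv_cdf(0.95) * prop_se(p_treat, 120)
ci_q24 = (p_treat - margin_q24, p_treat + margin_q24)

# Q25: 95% margin of error for 10 type A donors out of 60.
margin_q25 = Z.inv_cdf(0.975) * prop_se(10 / 60, 60)

# Q26-27: psychic correct on 50 of 200 cards, chance level 0.20.
margin_q26 = Z.inv_cdf(0.975) * prop_se(50 / 200, 200)
z_q27 = (50 / 200 - 0.20) / prop_se(0.20, 200)

# Q28: plus four estimate adds 2 successes and 4 trials.
plus_four_q28 = (3 + 2) / (15 + 4)

# Q29-30: difference of two proportions and its 99% interval.
p_a, p_b = 33 / 197, 25 / 290
d_q29 = p_a - p_b
se_q29 = math.sqrt(p_a * (1 - p_a) / 197 + p_b * (1 - p_b) / 290)
margin_q30 = Z.inv_cdf(0.995) * se_q29
ci_q30 = (d_q29 - margin_q30, d_q29 + margin_q30)

# Q31: relative risk, Great Britain versus France.
rr_q31 = 188 / 57

# Q32: expected count = row total * column total / grand total.
expected_q32 = 126 * (26 + 47) / 300

results = {
    "Q8 z": z_q8, "Q9 p": p_q9, "Q11 cutoff": cutoff_q11,
    "Q12 power": power_q12, "Q23 SE": se_q23, "Q24 CI": ci_q24,
    "Q25 margin": margin_q25, "Q26 margin": margin_q26, "Q27 z": z_q27,
    "Q28 plus four": plus_four_q28, "Q29 D": d_q29, "Q29 SE": se_q29,
    "Q30 CI": ci_q30, "Q31 RR": rr_q31, "Q32 expected": expected_q32,
}
for label, value in results.items():
    print(label, value)
```

Questions whose answers are read from t or F tables (for example 18 to 20) are left out, because the standard library provides no t or F distribution.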