Math 15


11/29/2013


# Clarifying the Concepts

15-1. Distinguish nominal, ordinal, and scale data.

Answer: Nominal data are categorical in nature; they cannot be ordered in any meaningful way, and they are often thought of as simply named. Ordinal data can be ordered, but we cannot assume equal distances between adjacent points. For example, the difference between the second and third scores may not be the same as the difference between the seventh and the eighth. Scale data are measured on either the interval or ratio level; we can assume equal intervals between points along these measures.

15-3. What is the difference between the chi-square test for goodness-of-fit and the chi-square test for independence?

Answer: The chi-square test for goodness-of-fit is a nonparametric hypothesis test used with one nominal variable. The chi-square test for independence is a nonparametric test used with two nominal variables.

15-5. List two ways in which statisticians use the word independence or independent with respect to concepts introduced earlier in this book. Then describe how independence is used by statisticians with respect to chi square.

Answer: Throughout the book, we have referred to independent variables, those variables that we hypothesize to have an effect on the dependent variable. We also described how statisticians refer to observations that are independent of one another, such as a between-groups research design requiring that observations be taken from independent samples. Here, with regard to chi square, independence takes on a similar meaning. We are testing whether the effect of one variable is independent of the other; that is, whether the proportion of cases across the levels of one variable does not depend on the levels of the other variable.

15-7. How are the degrees of freedom for the chi-square hypothesis tests different from those of most other hypothesis tests?

Answer: In most previous hypothesis tests, the degrees of freedom have been based on sample size. For the chi-square hypothesis tests, however, the degrees of freedom are based on the number of categories, or cells, in which participants can be counted. For example, the degrees of freedom for the chi-square test for goodness-of-fit is the number of categories minus 1: $df_{\chi^2} = k - 1$. Here, $k$ is the symbol for the number of categories.

15-9. What information is presented in a contingency table in the chi-square test for independence?

Answer: The contingency table presents the observed frequencies for each cell in the study.
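The degrees-of-freedom rules from the answer to 15-7 can be sketched in a few lines of Python. This is only an illustration: the function names `df_goodness_of_fit` and `df_independence` are hypothetical, and the second function assumes the $(k_{row} - 1)(k_{column} - 1)$ rule that this answer key applies to the test for independence in the exercises that follow.

```python
def df_goodness_of_fit(k):
    """Degrees of freedom for a chi-square test for goodness-of-fit.

    k is the number of categories of the single nominal variable.
    """
    return k - 1


def df_independence(k_row, k_column):
    """Degrees of freedom for a chi-square test for independence.

    k_row and k_column are the numbers of levels of the two nominal
    variables (the rows and columns of the contingency table).
    """
    return (k_row - 1) * (k_column - 1)


# Three categories in a goodness-of-fit test.
print(df_goodness_of_fit(3))

# A 2 x 2 contingency table in a test for independence.
print(df_independence(2, 2))
```

Note that neither function takes the sample size as an argument, which is the contrast with earlier hypothesis tests that the answer to 15-7 draws.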

15-11. Define the symbols in the following formula: $\chi^2 = \sum \left[ \frac{(O - E)^2}{E} \right]$

Answer: This is the formula to calculate the chi-square statistic, which is the sum, for each cell, of the squared difference between each observed frequency ($O$) and its matching expected frequency ($E$), divided by the expected frequency for its cell.

15-13. Why do we sometimes convert scale data to ordinal data?

Answer: When we are concerned about meeting the assumptions of a parametric test, we can convert scale data to ordinal data and use a nonparametric test.

15-15. When do we use the Mann–Whitney U test?

Answer: We use the Mann–Whitney U test when there are two groups, a between-groups design, and an ordinal dependent variable.

15-17. Explain how the relation between ranks is the core of the Spearman rank-order correlation.

Answer: In all correlations, we assess the relative position of a score on one variable with its position on the other variable. In the case of the Spearman rank-order correlation, we examine how ranks on one variable relate to ranks on the other variable. For example, with a positive correlation, scores that rank low on one variable tend to rank low on the other, and scores that rank high on one variable tend to rank high on the other. For a negative correlation, low ranks on one variable tend to be associated with high ranks on the other.

# Calculating the Statistics

15-19. In order to compute statistics, we need to have working formulas. For each of the following, (i) identify the incorrect symbol, (ii) state what the correct symbol should be, and (iii) explain why the initial symbol was incorrect.

a. For the chi-square test for goodness-of-fit: $df_{\chi^2} = N - 1$

b. For the chi-square test for independence: $df_{\chi^2} = (k_{row} - 1) + (k_{column} - 1)$

c. through g. [formulas not preserved]

Answer:

a. (i) $N$ is incorrect. (ii) $k$ is the correct symbol. (iii) Degrees of freedom for the chi-square test for goodness-of-fit is based on the number of groups, symbolized by $k$.

b. (i) $+$ is incorrect. (ii) The multiplication symbol is the correct symbol. (iii) When obtaining the degrees of freedom for the chi-square test for independence, we multiply the degrees of freedom associated with each variable.

c. (i) $M$ is incorrect. (ii) $O$ is the correct symbol. (iii) Calculation of chi square involves calculating the difference between observed ($O$) and expected frequencies.

d. (i) $k$ is incorrect. (ii) $df$ is the correct symbol. (iii) Calculation of Cramér's V involves dividing by the degrees of freedom, not the number of groups.

e. (i) Both $k$s are incorrect. (ii) Total is the correct symbol. (iii) Calculation of the expected values is based on the total counts for the rows and the columns, not the numbers of categories.

f. (i) $\Sigma R_1^2$ is incorrect. (ii) $\Sigma R_1$ is the correct symbol. (iii) In the Mann–Whitney U test, we do not square the ranks before we sum them; we just sum the ranks.

g. (i) The $r$ is incorrect. (ii) $r_S$ is the correct symbol. (iii) This is the formula for the Spearman rank-order correlation, which requires the subscript $S$.

15-21. Use this calculation table for the chi-square test for goodness-of-fit to complete this exercise.

[calculation table not preserved]

a. Calculate degrees of freedom for this chi-square test for goodness-of-fit.

b. Perform all of the calculations to complete this table.

c. Compute the chi-square statistic.

Answer:

a. $df_{\chi^2} = k - 1 = 3 - 1 = 2$

b. [completed calculation table not preserved]

15-23. Below are some data to use in a chi-square test for independence. Calculate the degrees of freedom for this test.

[data table not preserved]

Answer: $df_{\chi^2} = (k_{row} - 1)(k_{column} - 1) = (2 - 1)(2 - 1) = 1$

15-25. Using the data presented in Exercise 15-23 and the work you did in Exercise 15-24, calculate the test statistic.

Answer: [calculation not preserved]

15-27. Convert the following scale data to ordinal or ranked data, starting with a rank of 1 for the smallest data point.

[data not preserved]

Answer: [ranked data not preserved]

15-29. Compute the Spearman correlation for the data listed in Exercise 15-27.

Answer: [calculation not preserved]

15-31. Compute the Mann–Whitney U test on the following data:

[data not preserved]

Answer:

$\Sigma R_{group1} = 1 + 2.5 + 8 + 4 + 6 + 10 = 31.5$

$\Sigma R_{group2} = 11 + 9 + 2.5 + 5 + 7 + 12 = 46.5$

The formula for the first group is: $U_1 = (n_1)(n_2) + \frac{n_1(n_1 + 1)}{2} - \Sigma R_{group1} = (6)(6) + \frac{6(6 + 1)}{2} - 31.5 = 25.5$

The formula for the second group is: $U_2 = (n_1)(n_2) + \frac{n_2(n_2 + 1)}{2} - \Sigma R_{group2} = (6)(6) + \frac{6(6 + 1)}{2} - 46.5 = 10.5$

Our test statistic would be 10.5, because it is the smaller of the two.

# Applying the Concepts

15-33. For each of the following research questions, state whether a parametric or nonparametric hypothesis test is more appropriate. Explain your answers.

a. Are women more or less likely than men to be economics majors?

b. At a small company with 15 staff and 1 top boss, do those with a college education tend to make a different amount of money from those without one?

c. At your high school, did athletes or nonathletes tend to have higher grade point averages?

d. At your high school, did athletes or nonathletes tend to have higher class ranks?

e. Compare car accidents in which the occupants were wearing seat belts with accidents in which the occupants were not wearing seat belts. Do seat belts seem to make a difference in the numbers of accidents that lead to no injuries, nonfatal injuries, and fatal injuries?

f. Compare car accidents in which the occupants were wearing seat belts with accidents in which the occupants were not wearing seat belts. Were those wearing seat belts driving at slower speeds, on average, than those not wearing seat belts?

Answer:

a. A nonparametric test would be appropriate because both of the variables are nominal: gender and major.

b. A nonparametric test is more appropriate for this question because the sample size is small and the data are unlikely to be normal; the "top boss" is likely to have a much higher income than the other employees. This outlier would lead to a nonnormal distribution.

c. A parametric test would be appropriate because the independent variable (type of student: athlete versus nonathlete) is nominal and the dependent variable (grade point average) is scale.

d. A nonparametric test would be appropriate because the independent variable (athlete versus nonathlete) is nominal and the dependent variable (class rank) is ordinal.

e. A nonparametric test would be appropriate because the research question is about the relation between two nominal variables: seat-belt wearing and degree of injuries.

f. A parametric test would be appropriate because the independent variable (seat-belt use: no seat belt versus seat belt) is nominal and the dependent variable (speed) is scale.

15-35. A New York Times article on grade inflation reported several findings related to a tendency for average grades to rise over the years and a tendency for the top-ranked institutions to give the highest average grades (Archibold, 1998). For each of the findings outlined below, state (i) the independent variable or variables, and their levels where appropriate, (ii) the dependent variable(s), and (iii) what category of research design is being used:

I: scale independent variable(s) and scale dependent variable
II: nominal independent variable(s) and scale dependent variable
III: only nominal variables

Explain your answer to part (iii).

a. In 1969, 7% of all grades were A's; in 1994, 25% of all grades were A's.

b. The average GPA for the graduating students of elite schools is 3.2, the average GPA for graduating students at selective schools (the level below elite schools) is 3.04, and the average GPA for graduating students at state colleges is 2.95.

c. At Dartmouth College, an elite university, SAT scores of incoming students have increased along with their subsequent college GPAs (perhaps an explanation for grade inflation).

Answer:

a. (i) Year. (ii) Grades received. (iii) This is a category III research design because the independent variable, year, is nominal and the dependent variable, grade (A or not), could also be considered nominal.

b. (i) Type of school. (ii) Average GPA of graduating students. (iii) This is a category II research design because the independent variable, type of school, is nominal and the dependent variable, GPA, is scale.

c. (i) SAT scores of incoming students. (ii) College GPA. (iii) This is a category I research design because both the independent variable and the dependent variable are scale.

15-37. CNN.com reported on a 2005 study that ranked the world's cities in terms of how livable they are using a range of criteria related to stability, health care, culture and environment, education, and infrastructure. Vancouver came out on top. For each of the following research questions, state which nonparametric hypothesis test is most appropriate: Spearman rank-order correlation coefficient or Mann–Whitney U test. Explain your answers.

a. Which cities tend to receive higher rankings: those north or south of the equator?

b. Are the livability rankings related to a city's economic status (assessed by rank)?

Answer:

a. The Mann–Whitney U test would be most appropriate because it is a nonparametric equivalent to the independent-samples t test. It is used when we have a nominal independent variable with two levels (here, they are north and south of the equator), a between-groups research design, and an ordinal dependent variable (here, it is the ranking of the city).

b. The Spearman rank-order correlation would be most appropriate because we are asking a question about the relation between two ordinal variables.

15-39. Across all of India, there are only 933 girls for every 1000 boys (Lloyd, 2006), evidence of a bias that leads many parents to illegally select for boys or to kill their infant girls. (Note that this translates into a proportion of girls of 0.483.) In Punjab, a region of India in which residents tend to be more educated than in other regions, there are only 798 girls for every 1000 boys. Assume that you are a researcher interested in whether sex selection is more or less prevalent in educated regions of India and that 1798 children from Punjab constitute the entire sample. (Hint: You will use the proportions from the national database for comparison.)

a. How many variables are there in this study? What are the levels of any variable you identified?

b. What hypothesis test would be used to analyze these data? Justify your answer.

c. Conduct the six steps of hypothesis testing for this example. (Note: Be sure to use the correct proportions for the expected values, not the actual numbers for the population.)

d. Report the statistics as you would in a journal article.

Answer:

a. There is one variable, gender of the children. Its levels are girls and boys.

b. A chi-square test for goodness-of-fit would be used because we have one sample, the children from Punjab.

c. Step 1: Population 1 is children with gender proportions like those that we observed in Punjab. Population 2 is children with gender proportions similar to those in India as a whole. The comparison distribution is a chi-square distribution. The hypothesis test will be a chi-square test for goodness-of-fit because we have only one nominal variable. This study meets three of the four assumptions. (1) The variable under study is nominal. (2) Each observation is independent of the others. (3) There are more than five times as many participants as there are cells (there are 1798 children in the sample and only 2 cells). (4) We do not know, however, whether this is a randomly selected sample of the more educated people, so we must generalize with caution.

Step 2: Null hypothesis: The proportions of boys and girls in Punjab are the same as those in India as a whole. Research hypothesis: The proportions of boys and girls in Punjab are different from those in India as a whole.

Step 3: The comparison distribution is a chi-square distribution that has 1 degree of freedom: $df_{\chi^2} = 2 - 1 = 1$.

Step 4: Our critical $\chi^2$, based on a p level of 0.05 and 1 degree of freedom, is 3.841.

Step 5: The expected frequencies are $(0.483)(1798) = 868.43$ girls and $(0.517)(1798) = 929.57$ boys, so $\chi^2 = \sum \left[ \frac{(O - E)^2}{E} \right] = \frac{(798 - 868.43)^2}{868.43} + \frac{(1000 - 929.57)^2}{929.57} = 5.71 + 5.34 = 11.05$.
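The Step 5 arithmetic for Exercise 15-39 can be checked with a short Python sketch. It is an illustration under stated assumptions rather than the textbook's own procedure: the function name `chi_square_goodness_of_fit` and the variable names are hypothetical, and the expected proportion of boys, 0.517, is assumed to be 1 minus the stated proportion of girls, 0.483.

```python
def chi_square_goodness_of_fit(observed, expected_proportions):
    """Return (chi-square statistic, expected frequencies).

    observed: observed counts, one per category.
    expected_proportions: proportions under the null hypothesis,
        in the same order as the observed counts.
    """
    n = sum(observed)
    expected = [p * n for p in expected_proportions]
    # Sum, over cells, of (O - E)^2 / E.
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, expected


# Punjab sample from the exercise: 798 girls and 1000 boys (N = 1798).
observed = [798, 1000]
# National proportions: 0.483 girls (stated); 0.517 boys (assumed, 1 - 0.483).
national_proportions = [0.483, 0.517]

statistic, expected = chi_square_goodness_of_fit(observed, national_proportions)

# Critical value quoted in Step 4 for p = 0.05 and 1 degree of freedom.
CRITICAL_VALUE = 3.841

print(f"chi-square = {statistic:.2f}")
print("reject the null hypothesis:", statistic > CRITICAL_VALUE)
```

The expected frequencies come from the national proportions applied to the Punjab sample size, which is what the exercise's note about using proportions rather than population counts asks for.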

Step 6: Reject the null hypothesis. Our calculated chi-square value exceeds our critical value. It appears that the proportion of girls in Punjab is less than that in the general population of India.

d. $\chi^2(1, N = 1798) = 11.05$, $p < 0.05$

15-41. In a classic prisoner's dilemma game with money for prizes, players who cooperate with each other both earn good prizes. If, however, your opposing player cooperates but you do not (the term used is defect), you receive an even bigger payout and your opponent receives nothing. If you cooperate but your opposing player defects, he or she receives that bigger payout and you receive nothing. If you both defect, you each get a small prize. Because of this, most players of such games choose to defect, knowing that if they cooperate but their partners don't, they won't win anything. The strategies of U.S. and Chinese students were compared. The researchers hypothesized that those from the market economy (United States) would cooperate less (i.e., would defect more often) than would those from the nonmarket economy (China).

[data table not preserved]

a. How many variables are there in this study? What are the levels of any variables you identified?

b. What hypothesis test would be used to analyze these data? Justify your answer.

c. Conduct the six steps of hypothesis testing for this example, using the above data.

d. Calculate the appropriate measure of effect size. According to Cohen's conventions, what size effect is this?

e. Report the statistics as you would in a journal article.

Answer:

a. There are two variables in this study. The independent variable is the country the student is from (United States, China). The dependent variable is the choice the student made (defect, cooperate).

b. A chi-square test for independence would be used because we have data on two nominal variables.

c. Step 1: Population 1 contains students like those in this sample. Population 2 contains students from a population in which country of origin and choice to defect or cooperate are independent. The comparison distribution is a chi-square distribution. The hypothesis test will be a chi-square test for independence because we have two nominal variables. This study meets three of the four assumptions. (1) The two variables are nominal, (2) every participant is in only one cell, and (3) there are more than five times as many participants as there are cells (there are 122 participants and 4 cells). (4) The students were not randomly selected, however, so we should use caution when generalizing beyond this sample.

Step 2: Null hypothesis: The proportion of Chinese students who choose to defect as opposed to cooperate is similar to the proportion for U.S. students. Research hypothesis: The proportion of Chinese students who choose to defect as opposed to cooperate is different from the proportion for U.S. students.

Step 3: The comparison distribution is a chi-square distribution that has 1 degree of freedom: $df_{\chi^2} = (k_{row} - 1)(k_{column} - 1) = (2 - 1)(2 - 1) = 1$.

Step 4: Our cutoff $\chi^2$, based on a p level of 0.05 and 1 degree of freedom, is 3.841.

Step 5: [calculations not preserved]

Step 6: Reject the null hypothesis. Our calculated chi-square value exceeds our critical value. It appears that the proportion of participants who choose to defect is higher among U.S. students than among Chinese students.

d. Cramér's $V = \sqrt{\frac{\chi^2}{(N)(df_{row/column})}} = \sqrt{\frac{9.99}{(122)(1)}} = 0.29$. According to Cohen's conventions, this is a medium effect.

e. $\chi^2(1, N = 122) = 9.99$, $p < 0.05$, Cramér's $V = 0.29$

15-43. Refer to the prisoner's dilemma example in Exercise 15-41.

a. Draw a table that includes the conditional proportions for participants from China and from the United States. (The conditional proportions are the proportions of Chinese who defect or cooperate and the proportions of Americans who defect or cooperate.)

b. Create a graph with bars showing the proportions for all four conditions.

c. Create a graph with two bars showing just the proportions for the defections for each country.

Answer:

a. The accompanying table shows the conditional proportions. [table not preserved]

b. The accompanying graph shows the conditional proportions for all four conditions. [graph not preserved]

c. The accompanying graph shows only the bar for defects. [graph not preserved]

15-45. Here are some monthly cell phone bills, in dollars, for college students:

[data not preserved]

a. Convert these data from scale to ordinal. (Don't forget to put them in order first.) What happens to an outlier when you convert these data to ordinal?

b. Roughly, what shape would the distribution of these data take? Would they likely be normally distributed? Explain why the distribution of ordinal data is never normal.

c. Why does it not matter if the ordinal variable is normally distributed? (Hint: Think about what kind of hypothesis test you would conduct.)

Answer:

a. The accompanying table shows the ordered data and corresponding ranks. [table not preserved] Prior to converting to ordinal data, the outlier, 500, was well above the next-highest observation, 200. When converted to ordinal data, the outlier is still at the top of the distribution but is no longer very different from the rest of the scores in the distribution. Now the scores of 500 and 200 are ranked 29 and 28, respectively.

b. The distribution is likely to be somewhat rectangular and not normal. The distribution of ordinal data is never normal because each score is assigned a rank, which means that each individual raw score usually has a different rank from the others. In most cases (unless there are ties), all frequencies would be 1.

c. It does not matter that the ordinal transformation is not normally distributed because we would be using nonparametric statistics to analyze the data. Nonparametric statistics do not require the assumption that the underlying distribution is normal.

15-47. Does speed in completing a test correlate with one's grade? Here are test scores for eight students in one of our statistics classes. They are arranged in order from the student who turned in the test first to the student who turned in the test last.

98 74 87 92 88 93 62 67

a. What are the two variables of interest? For each variable, state whether it's scale or ordinal.

b. Calculate the Spearman correlation coefficient for these two variables. Remember to convert any scale variables to ranks.

c. What does the coefficient tell us about the relation between these two variables?

d. Why couldn't we calculate a Pearson correlation coefficient for these data?

Answer:

a. The first variable of interest is test grade, which is a scale variable. The second variable of interest is the order in which students completed the test, which is an ordinal variable.

b. The accompanying table shows test grade converted to ranks, difference scores, and squared differences. [table not preserved] We calculate the Spearman correlation coefficient as follows. With grades ranked from highest (1) to lowest (8), the squared differences between grade rank and completion order sum to $\Sigma D^2 = 40$, so $r_S = 1 - \frac{6(\Sigma D^2)}{N(N^2 - 1)} = 1 - \frac{6(40)}{8(64 - 1)} = 1 - 0.476 = 0.52$.

c. The coefficient tells us that there is a rather large positive relation between the two variables. Students who completed the test more quickly also tended to score higher.

d. We could not have calculated a Pearson correlation coefficient because one of our variables, order in which students turned in the test, is ordinal.

15-49. Exercise 15-47 presented data to enable you to calculate the Spearman correlation coefficient that quantifies the relation between the speed of taking the test and the test grade.

a. Does this correlation coefficient suggest that students should take their tests as quickly as possible? That is, does it indicate that taking the test quickly causes a good grade? Explain your answer.

b. What third variables might be responsible for this correlation? That is, what third variables might cause both speedy test-taking and a good test grade?

Answer:

a. This correlation does not indicate that students should attempt to take their tests as quickly as possible. Correlation does not provide evidence for a particular causal relation. A number of underlying causal relations could produce this observed correlation.

b. A third variable that might cause both speedy test-taking and a good test grade is knowledge of the material.

Students with better knowledge of and more practice with the material would be able to get through the test more quickly and get a better grade.

15-51. Do red states (U.S. states whose residents tend to vote Republican) have different voter turnouts from blue states (U.S. states whose residents tend to vote Democratic)? The accompanying table shows voter turnouts (in percentages) for the 2004 presidential election for eight randomly selected red states and eight randomly selected blue states.

[data table not preserved]

a. What is the independent variable, and what are its levels? What is the dependent variable?

b. Is this a between-groups or within-groups design? Explain.

c. Conduct all six steps of hypothesis testing for a Mann–Whitney U test.

d. How would you present these statistics in a journal article?

Answer:

a. The independent variable is type of state, and its levels are red and blue. The dependent variable is the percentage of registered voters who voted.

b. This is a between-groups design because each state is either a red state or a blue state but cannot be both.

c. Step 1: We need to convert our data to an ordinal measure. The states were randomly selected, so we can assume that they are representative of their populations. Finally, there are no tied ranks.

Step 2: Null hypothesis: There is no difference between the voter turnout in red and blue states. Research hypothesis: There is a difference between the voter turnout in red and blue states.

Step 3: There are eight red and eight blue states.

Step 4: The critical value for a Mann–Whitney U test with two groups of eight, a p level of 0.05, and a two-tailed test is 15. The smaller calculated statistic needs to be less than or equal to this critical value to be considered statistically significant.

Step 5:

$\Sigma R_{red} = 5 + 7 + 9 + 10 + 11 + 14 + 15 + 16 = 87$

$\Sigma R_{blue} = 1 + 2 + 3 + 4 + 6 + 8 + 12 + 13 = 49$

$U_{red} = (8)(8) + \frac{8(8 + 1)}{2} - 87 = 13$

$U_{blue} = (8)(8) + \frac{8(8 + 1)}{2} - 49 = 51$

Step 6: The smaller calculated U, 13, is less than the critical value of 15, so we reject the null hypothesis. There is a statistically significant difference between voter turnout in red and blue states. Voter turnout tends to be higher in blue states than in red states.

d. $U = 13$, $p < 0.05$
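The Mann–Whitney U values in Exercise 15-51 can be reproduced from the rank sums alone. The sketch below is a hedged illustration: the function name `mann_whitney_u` is hypothetical, the formula is the standard rank-sum form $U = n_1 n_2 + \frac{n_1(n_1 + 1)}{2} - \Sigma R_1$, and the critical value of 15 is taken as stated in Step 4 rather than looked up independently.

```python
def mann_whitney_u(rank_sum, n_this, n_other):
    """U for one group, from that group's rank sum and both group sizes."""
    return n_this * n_other + n_this * (n_this + 1) / 2 - rank_sum


# Rank sums reported in Step 5 of Exercise 15-51 (eight states per group).
rank_sum_red = 5 + 7 + 9 + 10 + 11 + 14 + 15 + 16
rank_sum_blue = 1 + 2 + 3 + 4 + 6 + 8 + 12 + 13

u_red = mann_whitney_u(rank_sum_red, 8, 8)
u_blue = mann_whitney_u(rank_sum_blue, 8, 8)

# The test statistic is the smaller of the two U values.
u = min(u_red, u_blue)

# Critical value as stated in Step 4 of the answer.
CRITICAL_VALUE = 15

print("U_red =", u_red, "U_blue =", u_blue)
# For this test the statistic must be at or below the critical value.
print("reject the null hypothesis:", u <= CRITICAL_VALUE)
```

The same function applied to the rank sums in Exercise 15-31, `mann_whitney_u(31.5, 6, 6)` and `mann_whitney_u(46.5, 6, 6)`, gives 25.5 and 10.5, matching the test statistic of 10.5 reported there. Unlike the chi-square tests, the decision rule here runs in the opposite direction: a smaller U is more extreme.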
