
# Clarifying the Concepts

15-1. Distinguish nominal, ordinal, and scale data.

Answer: Nominal data are categorical in nature; they cannot be ordered in any meaningful way, and they are often thought of as simply named. Ordinal data can be ordered, but we cannot assume equal distances between adjacent ranks. For example, the difference between the second and third scores may not be the same as the difference between the seventh and the eighth. Scale data are measured on either the interval or ratio level; we can assume equal intervals between points along these measures.

15-3. What is the difference between the chi-square test for goodness-of-fit and the chi-square test for independence?

Answer: The chi-square test for goodness-of-fit is a nonparametric hypothesis test used with one nominal variable. The chi-square test for independence is a nonparametric test used with two nominal variables.

15-5. List two ways in which statisticians use the word independence or independent with respect to concepts introduced earlier in this book. Then describe how independence is used by statisticians with respect to chi square.

Answer: Throughout the book, we have referred to independent variables, those variables that we hypothesize to have an effect on the dependent variable. We also described how statisticians refer to observations that are independent of one another, such as a between-groups research design requiring that observations be taken from independent samples. Here, with regard to chi square, independence takes on a similar meaning. We are testing whether the effect of one variable is independent of the other, that is, whether the proportion of cases across the levels of one variable does not depend on the levels of the other variable.

15-7. How are the degrees of freedom for the chi-square hypothesis tests different from those of most other hypothesis tests?

Answer: In most previous hypothesis tests, the degrees of freedom have been based on sample size. For the chi-square hypothesis tests, however, the degrees of freedom are based on the numbers of categories, or cells, in which participants can be counted. For example, the degrees of freedom for the chi-square test for goodness-of-fit is the number of categories minus 1: $df_{\chi^2} = k - 1$. Here, $k$ is the symbol for the number of categories.

15-9. What information is presented in a contingency table in the chi-square test for independence?

Answer: The contingency table presents the observed frequencies for each cell in the study.
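The degrees-of-freedom rules described in 15-7 can be sketched in a few lines of Python. This is only an illustration: the function names are our own, not part of the text, and the example values (3 categories, a 2 × 2 table) are the ones used in the exercises of this chapter.

```python
def df_goodness_of_fit(k: int) -> int:
    """Degrees of freedom for a chi-square goodness-of-fit test with k categories."""
    if k < 2:
        raise ValueError("need at least two categories")
    return k - 1


def df_independence(k_row: int, k_column: int) -> int:
    """Degrees of freedom for a chi-square test for independence."""
    if k_row < 2 or k_column < 2:
        raise ValueError("need at least two levels per variable")
    return (k_row - 1) * (k_column - 1)


# Both depend on the number of categories, not on the sample size.
print(df_goodness_of_fit(3))   # 2
print(df_independence(2, 2))   # 1
```

Note that neither function takes the number of participants as an argument, which is the contrast with most earlier hypothesis tests that the answer to 15-7 draws.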

15-11. Define the symbols in the following formula: $\chi^2 = \Sigma\left[\frac{(O - E)^2}{E}\right]$

Answer: This is the formula to calculate the chi-square statistic, which is the sum, for each cell, of the squared difference between each observed frequency ($O$) and its matching expected frequency ($E$), divided by the expected value for its cell.

15-13. Why do we sometimes convert scale data to ordinal data?

Answer: When we are concerned about meeting the assumptions of a parametric test, we can convert scale data to ordinal data and use a nonparametric test.

15-15. When do we use the Mann–Whitney U test?

Answer: We use the Mann–Whitney U test when there are two groups, a between-groups design, and an ordinal dependent variable.

15-17. Explain how the relation between ranks is the core of the Spearman rank-order correlation.

Answer: In all correlations, we assess the relative position of a score on one variable with its position on the other variable. In the case of the Spearman rank-order correlation, we examine how ranks on one variable relate to ranks on the other variable. For example, with a positive correlation, scores that rank low on one variable tend to rank low on the other, and scores that rank high on one variable tend to rank high on the other. For a negative correlation, low ranks on one variable tend to be associated with high ranks on the other.

# Calculating the Statistics

15-19. In order to compute statistics, we need to have working formulas. For each of the following, (i) identify the incorrect symbol, (ii) state what the correct symbol should be, and (iii) explain why the initial symbol was incorrect.
a. For the chi-square test for goodness-of-fit: $df_{\chi^2} = N - 1$
b. For the chi-square test for independence: $df_{\chi^2} = (k_{row} - 1) + (k_{column} - 1)$
c. through g. [formulas missing]

Answer:
a. (i) $N$ is incorrect. (ii) $k$ is the correct symbol. (iii) Degrees of freedom for the chi-square test for goodness-of-fit is based on the number of groups, symbolized by $k$.
b. (i) $+$ is incorrect. (ii) The multiplication symbol is the correct symbol. (iii) When obtaining the degrees of freedom for the chi-square test for independence, we multiply the degrees of freedom associated with each variable.
c. (i) $M$ is incorrect. (ii) $O$ is the correct symbol. (iii) Calculation of chi square involves calculating the difference between observed ($O$) and expected frequencies.

d. (i) $k$ is incorrect. (ii) $df$ is the correct symbol. (iii) Calculation of Cramér's $V$ involves dividing by the degrees of freedom, not the number of groups.
e. (i) Both $k$s are incorrect. (ii) Total is the correct symbol. (iii) Calculation of the expected values is based on the total counts for the rows and the columns, not the numbers of categories.
f. (i) The $r$ is incorrect. (ii) $r_S$ is the correct symbol. (iii) This is the formula for the Spearman rank-order correlation, which requires the subscript $S$.
g. (i) $\Sigma R_1^2$ is incorrect. (ii) $\Sigma R_1$ is the correct symbol. (iii) In the Mann–Whitney U test, we do not square the ranks before we sum them; we just sum the ranks.

15-21. Use this calculation table for the chi-square test for goodness-of-fit to complete this exercise. [table missing]
a. Calculate degrees of freedom for this chi-square test for goodness-of-fit.
b. Perform all of the calculations to complete this table.
c. Compute the chi-square statistic.

Answer:
a. $df_{\chi^2} = k - 1 = 3 - 1 = 2$
b. and c. [worked answers missing]

15-23. Below are some data to use in a chi-square test for independence. [data table missing]
a. Calculate the degrees of freedom for this test.

Answer: $df_{\chi^2} = (k_{row} - 1)(k_{column} - 1) = (2 - 1)(2 - 1) = 1$

15-25. Using the data presented in Exercise 15-23 and the work you did in Exercise 15-24, calculate the test statistic.

Answer: [worked answer missing]

15-27. Convert the following scale data to ordinal or ranked data, starting with a rank of 1 for the smallest data point. [data missing]

Answer: [worked answer missing]

15-29. Compute the Spearman correlation for the data listed in Exercise 15-27.

Answer: [worked answer missing]

15-31. Compute the Mann–Whitney U test on the following data: [data missing]

Answer:
$\Sigma R_{group1} = 1 + 2.5 + 8 + 4 + 6 + 10 = 31.5$
$\Sigma R_{group2} = 11 + 9 + 2.5 + 5 + 7 + 12 = 46.5$

The formula for the first group is: $U_1 = (n_1)(n_2) + \frac{n_1(n_1 + 1)}{2} - \Sigma R_1 = (6)(6) + \frac{6(6 + 1)}{2} - 31.5 = 25.5$

The formula for the second group is: $U_2 = (n_1)(n_2) + \frac{n_2(n_2 + 1)}{2} - \Sigma R_2 = (6)(6) + \frac{6(6 + 1)}{2} - 46.5 = 10.5$

Our test statistic would be 10.5, because it is the smaller of the two.

# Applying the Concepts

15-33. For each of the following research questions, state whether a parametric or nonparametric hypothesis test is more appropriate. Explain your answers.
a. Are women more or less likely than men to be economics majors?
b. At a small company with 15 staff and 1 top boss, do those with a college education tend to make a different amount of money from those without one?
c. At your high school, did athletes or nonathletes tend to have higher grade point averages?
d. At your high school, did athletes or nonathletes tend to have higher class ranks?
e. Compare car accidents in which the occupants were wearing seat belts with accidents in which the occupants were not wearing seat belts. Do seat belts seem to make a difference in the numbers of accidents that lead to no injuries, nonfatal injuries, and fatal injuries?
f. Compare car accidents in which the occupants were wearing seat belts with accidents in which the occupants were not wearing seat belts. Were those wearing seat belts driving at slower speeds, on average, than those not wearing seat belts?

Answer:
a. A nonparametric test would be appropriate because both of the variables are nominal: gender and major.
b. A nonparametric test is more appropriate for this question because the sample size is small and the data are unlikely to be normal; the "top boss" is likely to have a much higher income than the other employees. This outlier would lead to a nonnormal distribution.
c. A parametric test would be appropriate because the independent variable (type of student: athlete versus nonathlete) is nominal and the dependent variable (grade point average) is scale.
d. A nonparametric test would be appropriate because the independent variable (athlete versus nonathlete) is nominal and the dependent variable (class rank) is ordinal.
e. A nonparametric test would be appropriate because the research question is about the relation between two nominal variables: seat-belt wearing and degree of injuries.
f. A parametric test would be appropriate because the independent variable (seat-belt use: no seat belt versus seat belt) is nominal and the dependent variable (speed) is scale.

15-35. A New York Times article on grade inflation reported several findings related to a tendency for average grades to rise over the years and a tendency for the top-ranked institutions to give the highest average grades (Archibold, 1998). For each of the findings outlined below, state (i) the independent variable or variables, and their levels where appropriate, (ii) the dependent variable(s), and (iii) what category of research design is being used:
I: scale independent variable(s) and scale dependent variable
II: nominal independent variable(s) and scale dependent variable
III: only nominal variables
Explain your answer to part (iii).
a. In 1969, 7% of all grades were A's; in 1994, 25% of all grades were A's.
b. The average GPA for the graduating students of elite schools is 3.2, the average GPA for graduating students at selective schools (the level below elite schools) is 3.04, and the average GPA for graduating students at state colleges is 2.95.
c. At Dartmouth College, an elite university, SAT scores of incoming students have increased along with their subsequent college GPAs (perhaps an explanation for grade inflation).

Answer:
a. (i) Year. (ii) Grades received. (iii) This is a category III research design because the independent variable, year, is nominal and the dependent variable, grade (A or not), could also be considered nominal.
b. (i) Type of school. (ii) Average GPA of graduating students. (iii) This is a category II research design because the independent variable, type of school, is nominal and the dependent variable, GPA, is scale.
c. (i) SAT scores of incoming students. (ii) College GPA. (iii) This is a category I research design because both the independent variable and the dependent variable are scale.

15-37. CNN.com reported on a 2005 study that ranked the world's cities in terms of how livable they are using a range of criteria related to stability, health care, culture and environment, education, and infrastructure. Vancouver came out on top. For each of the following research questions, state which nonparametric hypothesis test is most appropriate: Spearman rank-order correlation coefficient or Mann–Whitney U test. Explain your answers.
a. Which cities tend to receive higher rankings: those north or south of the equator?
b. Are the livability rankings related to a city's economic status (assessed by rank)?

Answer:
a. The Mann–Whitney U test would be most appropriate because it is a nonparametric equivalent to the independent-samples t test. It is used when we have a nominal independent variable with two levels (here, they are north and south of the equator), a between-groups research design, and an ordinal dependent variable (here, it is the ranking of the city).
b. The Spearman rank-order correlation would be most appropriate because we are asking a question about the relation between two ordinal variables.

15-39. Across all of India, there are only 933 girls for every 1000 boys (Lloyd, 2006), evidence of a bias that leads many parents to illegally select for boys or to kill their infant girls. (Note that this translates into a proportion of girls of 0.483.) In Punjab, a region of India in which residents tend to be more educated than in other regions, there are only 798 girls for every 1000 boys. Assume that you are a researcher interested in whether sex selection is more or less prevalent in educated regions of India and that 1798 children from Punjab constitute the entire sample. (Hint: You will use the proportions from the national database for comparison.)
a. How many variables are there in this study? What are the levels of any variable you identified?
b. What hypothesis test would be used to analyze these data? Justify your answer.
c. Conduct the six steps of hypothesis testing for this example. (Note: Be sure to use the correct proportions for the expected values, not the actual numbers for the population.)
d. Report the statistics as you would in a journal article.

Answer:
a. There is one variable, gender of the children. Its levels are girls and boys.
b. A chi-square test for goodness-of-fit would be used because we have one sample, the children from Punjab, and we are comparing proportions of children that fall within each level of gender (a nominal variable) to expectations based on national proportions.
c. Step 1: Population 1 is children with gender proportions like those that we observed in Punjab. Population 2 is children with gender proportions similar to those in India as a whole. The comparison distribution is a chi-square distribution. The hypothesis test will be a chi-square test for goodness-of-fit because we have only one nominal variable. This study meets three of the four assumptions. (1) The variable under study is nominal. (2) Each observation is independent of the others. (3) There are more than five times as many participants as there are cells (there are 1798 children in the sample and only 2 cells). (4) We do not know, however, whether this is a randomly selected sample of the more educated people, so we must generalize with caution.
Step 2: Null hypothesis: The proportions of boys and girls in Punjab are the same as those in India as a whole. Research hypothesis: The proportions of boys and girls in Punjab are different from those in India as a whole.
Step 3: The comparison distribution is a chi-square distribution that has 1 degree of freedom: $df_{\chi^2} = 2 - 1 = 1$.
Step 4: Our critical $\chi^2$, based on a p level of 0.05 and 1 degree of freedom, is 3.841.
Step 5: [calculation table missing]
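The Step 5 calculation for Exercise 15-39 can be sketched in Python as follows. This is an illustrative check, not the book's own worked table: it uses only the counts given in the exercise (798 girls and 1000 boys), the national proportion of girls quoted there (0.483), and the critical value from Step 4. The variable names are our own.

```python
# Observed counts in the Punjab sample, as given in the exercise.
observed = {"girls": 798, "boys": 1000}
n = sum(observed.values())  # 1798 children in total

# Expected counts use the national proportion of girls quoted in the exercise.
p_girls = 0.483
expected = {"girls": p_girls * n, "boys": (1 - p_girls) * n}

# Chi-square statistic: sum over cells of (O - E)^2 / E.
chi_square = sum(
    (observed[cell] - expected[cell]) ** 2 / expected[cell] for cell in observed
)

df = len(observed) - 1  # k - 1 categories
critical = 3.841        # cutoff for p = 0.05 with 1 degree of freedom (Step 4)

print(round(chi_square, 2), df, chi_square > critical)  # 11.05 1 True
```

The expected counts are about 868.4 girls and 929.6 boys, so each cell is off by roughly 70 children, which is what drives the statistic well past the cutoff.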

Step 6: Reject the null hypothesis. Our calculated chi-square value exceeds our critical value. It appears that the proportion of girls in Punjab is less than that in the general population of India.
d. $\chi^2(1, N = 1798) = 11.05$, $p < 0.05$

15-41. In a classic prisoner's dilemma game with money for prizes, players who cooperate with each other both earn good prizes. If, however, your opposing player cooperates but you do not (the term used is defect), you receive an even bigger payout and your opponent receives nothing. If you cooperate but your opposing player defects, he or she receives that bigger payout and you receive nothing. If you both defect, you each get a small prize. Because of this, most players of such games choose to defect, knowing that if they cooperate but their partners don't, they won't win anything. The strategies of U.S. and Chinese students were compared. The researchers hypothesized that those from the market economy (United States) would cooperate less (i.e., would defect more often) than would those from the nonmarket economy (China). [data table missing]
a. How many variables are there in this study? What are the levels of any variables you identified?
b. What hypothesis test would be used to analyze these data? Justify your answer.
c. Conduct the six steps of hypothesis testing for this example, using the above data.
d. Calculate the appropriate measure of effect size. According to Cohen's conventions, what size effect is this?
e. Report the statistics as you would in a journal article.

Answer:
a. There are two variables in this study. The independent variable is the country the student is from (United States, China). The dependent variable is the choice the student made (defect, cooperate).
b. A chi-square test for independence would be used because we have data on two nominal variables.
c. Step 1: Population 1 contains students like those in this sample. Population 2 contains students from a population in which country of origin and choice to defect or cooperate are independent. The comparison distribution is a chi-square distribution. The hypothesis test will be a chi-square test for independence because we have two nominal variables. This study meets three of the four assumptions. (1) The two variables are nominal; (2) every participant is in only one cell; and (3) there are more than five times as many participants as there are cells (there are 122 participants and 4 cells). (4) The students were not randomly selected, however, so we should use caution when generalizing beyond this sample.
Step 2: Null hypothesis: The proportion of Chinese students who choose to defect as opposed to cooperate is similar to the proportion for U.S. students. Research hypothesis: The proportion of Chinese students who choose to defect as opposed to cooperate is different from the proportion for U.S. students.
Step 3: The comparison distribution is a chi-square distribution that has 1 degree of freedom: $df_{\chi^2} = (k_{row} - 1)(k_{column} - 1) = (2 - 1)(2 - 1) = 1$.
Step 4: Our cutoff $\chi^2$, based on a p level of 0.05 and 1 degree of freedom, is 3.841.
Step 5: [calculation table missing]

Step 6: Reject the null hypothesis. Our calculated chi-square value exceeds our critical value. It appears that the proportion of participants who choose to defect is higher among U.S. students than among Chinese students.
d. Cramér's $V = \sqrt{\frac{\chi^2}{(N)(df_{row/column})}} = \sqrt{\frac{9.99}{(122)(1)}} = 0.29$. According to Cohen's conventions, this is a medium effect.
e. $\chi^2(1, N = 122) = 9.99$, $p < 0.05$

15-43. Refer to the prisoner's dilemma example in Exercise 15-41.
a. Draw a table that includes the conditional proportions for participants from China and from the United States. (The conditional proportions are the proportions of Chinese who defect or cooperate and the proportions of Americans who defect or cooperate.)
b. Create a graph with bars showing the proportions for all four conditions.
c. Create a graph with two bars showing just the proportions for the defections for each country.

Answer:
a. The accompanying table shows the conditional proportions. [table missing]
b. The accompanying graph shows the conditional proportions for all four conditions. [graph missing]

c. The accompanying graph shows only the bar for defects. [graph missing]

15-45. Here are some monthly cell phone bills, in dollars, for college students: [data missing]
a. Convert these data from scale to ordinal. (Don't forget to put them in order first.) What happens to an outlier when you convert these data to ordinal?
b. Roughly, what shape would the distribution of these data take? Would they likely be normally distributed? Explain why the distribution of ordinal data is never normal.

c. Why does it not matter if the ordinal variable is normally distributed? (Hint: Think about what kind of hypothesis test you would conduct.)

Answer:
a. The accompanying table shows the ordered data and corresponding ranks. [table missing] Prior to converting to ordinal data, the outlier, 500, was well above the next-highest observation, 200. Now the scores of 500 and 200 are ranked 29 and 28, respectively. When converted to ordinal data, the outlier is still at the top of the distribution but is no longer very different from the rest of the scores in the distribution.
b. The distribution is likely to be somewhat rectangular and not normal. The distribution of ordinal data is never normal because each score is assigned a rank, which means that each individual raw score usually has a different rank from the others. In most cases (unless there are ties), all frequencies would be 1.
c. It does not matter that the ordinal transformation is not normally distributed because we would be using nonparametric statistics to analyze the data. Nonparametric statistics do not require the assumption that the underlying distribution is normal.

15-47. Does speed in completing a test correlate with one's grade? Here are test scores for eight students in one of our statistics classes. They are arranged in order from the student who turned in the test first to the student who turned in the test last.
98 74 87 92 88 93 62 67
a. What are the two variables of interest? For each variable, state whether it's scale or ordinal.

b. Calculate the Spearman correlation coefficient for these two variables. Remember to convert any scale variables to ranks.
c. What does the coefficient tell us about the relation between these two variables?
d. Why couldn't we calculate a Pearson correlation coefficient for these data?

Answer:
a. The first variable of interest is test grade, which is a scale variable. The second variable of interest is the order in which students completed the test, which is an ordinal variable.
b. The accompanying table shows test grade converted to ranks, difference scores, and squared differences. [table missing] With the highest grade ranked 1, the squared differences sum to 40. We calculate the Spearman correlation coefficient as: $r_S = 1 - \frac{6(\Sigma D^2)}{N(N^2 - 1)} = 1 - \frac{6(40)}{8(64 - 1)} = 0.52$
c. The coefficient tells us that there is a rather large positive relation between the two variables. Students who completed the test more quickly also tended to score higher.
d. We could not have calculated a Pearson correlation coefficient because one of our variables, order in which students turned in the test, is ordinal.

15-49. Exercise 15-47 presented data to enable you to calculate the Spearman correlation coefficient that quantifies the relation between the speed of taking the test and the test grade.
a. Does this correlation coefficient suggest that students should take their tests as quickly as possible? That is, does it indicate that taking the test quickly causes a good grade? Explain your answer.
b. What third variables might be responsible for this correlation? That is, what third variables might cause both speedy test-taking and a good test grade?

Answer:
a. This correlation does not indicate that students should attempt to take their tests as quickly as possible. Correlation does not provide evidence for a particular causal relation. A number of underlying causal relations could produce this observed correlation.
b. A third variable that might cause both speedy test-taking and a good test grade is knowledge of the material.

Students with better knowledge of and more practice with the material would be able to get through the test more quickly and get a better grade.

15-51. Do red states (U.S. states whose residents tend to vote Republican) have different voter turnouts from blue states (U.S. states whose residents tend to vote Democratic)? The accompanying table shows voter turnouts (in percentages) for the 2004 presidential election for eight randomly selected red states and eight randomly selected blue states. [table missing]
a. What is the independent variable, and what are its levels? What is the dependent variable?
b. Is this a between-groups or within-groups design? Explain.
c. Conduct all six steps of hypothesis testing for a Mann–Whitney U test.
d. How would you present these statistics in a journal article?

Answer:
a. The independent variable is type of state, and its levels are red and blue. The dependent variable is the percentage of registered voters who voted.
b. This is a between-groups design because each state is either a red state or a blue state but cannot be both.
c. Step 1: We need to convert our data to an ordinal measure. The states were randomly selected, so we can assume that they are representative of their populations. Finally, there are no tied ranks.
Step 2: Null hypothesis: There is no difference between the voter turnout in red and blue states. Research hypothesis: There is a difference between the voter turnout in red and blue states.
Step 3: There are eight red and eight blue states.
Step 4: The critical value for a Mann–Whitney U test with two groups of eight, a p level of 0.05, and a two-tailed test is 15. The smaller calculated statistic needs to be less than or equal to this critical value to be considered statistically significant.
Step 5:

$\Sigma R_{red} = 5 + 7 + 9 + 10 + 11 + 14 + 15 + 16 = 87$
$\Sigma R_{blue} = 1 + 2 + 3 + 4 + 6 + 8 + 12 + 13 = 49$

Step 6: The smaller calculated $U$, 13, is less than the critical value of 15, so we reject the null hypothesis. There is a statistically significant difference between voter turnout in red and blue states. Voter turnout tends to be higher in blue states than in red states.
d. $U = 13$, $p < 0.05$
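The step from the two rank sums in Exercise 15-51 to the reported statistic can be checked with a short Python sketch. It assumes the usual Mann–Whitney formula, $U = (n_1)(n_2) + \frac{n(n + 1)}{2} - \Sigma R$, applied to each group in turn; the function name is our own, and the ranks and the critical value of 15 are taken as quoted in the answer.

```python
def mann_whitney_u(rank_sum: float, n_this: int, n_other: int) -> float:
    """U for one group, given that group's rank sum and both group sizes."""
    return n_this * n_other + n_this * (n_this + 1) / 2 - rank_sum


# Ranks as listed in the answer (no ties).
ranks_red = [5, 7, 9, 10, 11, 14, 15, 16]
ranks_blue = [1, 2, 3, 4, 6, 8, 12, 13]

u_red = mann_whitney_u(sum(ranks_red), len(ranks_red), len(ranks_blue))
u_blue = mann_whitney_u(sum(ranks_blue), len(ranks_blue), len(ranks_red))

# The test statistic is the smaller of the two U values.
u = min(u_red, u_blue)
critical = 15  # critical value quoted in Step 4

print(u_red, u_blue, u, u <= critical)  # 13.0 51.0 13.0 True
```

As a sanity check, the two U values always sum to $(n_1)(n_2)$, here 64, so an arithmetic slip in either rank sum shows up immediately.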