# Clarifying the Concepts

15-1. Distinguish nominal, ordinal, and scale data.

Answer: Nominal data are those that are categorical in nature; they cannot be ordered in any meaningful way, and they are often thought of as simply named. Ordinal data can be ordered, but we cannot assume even distances between points of equal separation. For example, the difference between the second and third scores may not be the same as the difference between the seventh and the eighth. Scale data are measured on either the interval or ratio level; we can assume equal intervals between points along these measures.

15-3. What is the difference between the chi-square test for goodness-of-fit and the chi-square test for independence?

Answer: The chi-square test for goodness-of-fit is a nonparametric hypothesis test used with one nominal variable. The chi-square test for independence is a nonparametric test used with two nominal variables.

15-5. List two ways in which statisticians use the word independence or independent with respect to concepts introduced earlier in this book. Then describe how independence is used by statisticians with respect to chi square.

Answer: Throughout the book, we have referred to independent variables, those variables that we hypothesize to have an effect on the dependent variable. We also described how statisticians refer to observations that are independent of one another, such as a between-groups research design requiring that observations be taken from independent samples. Here, with regard to chi square, independence takes on a similar meaning. We are testing whether the effect of one variable is independent of the other; that is, whether the proportion of cases across the levels of one variable does not depend on the levels of the other variable.

15-7. How are the degrees of freedom for the chi-square hypothesis tests different from those of most other hypothesis tests?

Answer: In most previous hypothesis tests, the degrees of freedom have been based on sample size. For the chi-square hypothesis tests, however, the degrees of freedom are based on the numbers of categories, or cells, in which participants can be counted. For example, the degrees of freedom for the chi-square test for goodness-of-fit equal the number of categories minus 1: $df_{\chi^2} = k - 1$. Here, $k$ is the symbol for the number of categories.

15-9. What information is presented in a contingency table in the chi-square test for independence?

Answer: The contingency table presents the observed frequencies for each cell in the study.
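The point of 15-7, that degrees of freedom for chi-square tests come from category counts rather than sample size, can be sketched in a few lines. The function names are hypothetical and chosen only for illustration; the second function assumes the usual product rule for a contingency table, $(k_{row} - 1)(k_{column} - 1)$.

```python
def df_goodness_of_fit(k: int) -> int:
    """Degrees of freedom for a goodness-of-fit test with k categories."""
    if k < 2:
        raise ValueError("need at least two categories")
    return k - 1


def df_independence(k_row: int, k_column: int) -> int:
    """Degrees of freedom for a test for independence on a k_row x k_column table."""
    if k_row < 2 or k_column < 2:
        raise ValueError("each variable needs at least two levels")
    return (k_row - 1) * (k_column - 1)


# Neither function takes the sample size N as an argument.
three_category_df = df_goodness_of_fit(3)
two_by_two_df = df_independence(2, 2)
print(three_category_df, two_by_two_df)
```

Note that sample size never enters either calculation: a study with 50 participants and one with 5,000 have the same degrees of freedom if they use the same categories.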

15-11. Define the symbols in the following formula: $\chi^2 = \sum \frac{(O - E)^2}{E}$

Answer: This is the formula to calculate the chi-square statistic, which is the sum, for each cell, of the squared difference between each observed frequency ($O$) and its matching expected frequency ($E$), divided by the expected value for its cell.

15-13. Why do we sometimes convert scale data to ordinal data?

Answer: When we are concerned about meeting the assumptions of a parametric test, we can convert scale data to ordinal data and use a nonparametric test.

15-15. When do we use the Mann–Whitney U test?

Answer: We use the Mann–Whitney U test when there are two groups, a between-groups design, and an ordinal dependent variable.

15-17. Explain how the relation between ranks is the core of the Spearman rank-order correlation.

Answer: In all correlations, we assess the relative position of a score on one variable with its position on the other variable. In the case of the Spearman rank-order correlation, we examine how ranks on one variable relate to ranks on the other variable. For example, with a positive correlation, scores that rank low on one variable tend to rank low on the other, and scores that rank high on one variable tend to rank high on the other. For a negative correlation, low ranks on one variable tend to be associated with high ranks on the other.

# Calculating the Statistics

15-19. In order to compute statistics, we need to have working formulas. For each of the following, (i) identify the incorrect symbol, (ii) state what the correct symbol should be, and (iii) explain why the initial symbol was incorrect.

a. For the chi-square test for goodness-of-fit: $df_{\chi^2} = N - 1$
b. For the chi-square test for independence: $df_{\chi^2} = (k_{row} - 1) + (k_{column} - 1)$
c. through g. [formulas not shown]

Answer:
a. (i) N is incorrect. (ii) k is the correct symbol. (iii) Degrees of freedom for the chi-square test for goodness-of-fit is based on the number of groups, symbolized by k.
b. (i) + is incorrect. (ii) The multiplication symbol is the correct symbol. (iii) When obtaining the degrees of freedom for the chi-square test for independence, we multiply the degrees of freedom associated with each variable.
c. (i) M is incorrect. (ii) O is the correct symbol. (iii) Calculation of chi square involves calculating the difference between observed (O) and expected frequencies.

d. (i) k is incorrect. (ii) df is the correct symbol. (iii) Calculation of Cramer’s V involves dividing by the degrees of freedom, not the number of groups.
e. (i) Both ks are incorrect. (ii) Total is the correct symbol. (iii) Calculation of the expected values is based on the total counts for the rows and the columns, not the numbers of categories.
f. (i) The r is incorrect. (ii) $r_S$ is the correct symbol. (iii) This is the formula for the Spearman rank-order correlation, which requires the subscript S.
g. (i) $\Sigma R_1^2$ is incorrect. (ii) $\Sigma R_1$ is the correct symbol. (iii) In the Mann–Whitney U test, we do not square the ranks before we sum them; we just sum the ranks.

15-21. Use this calculation table for the chi-square test for goodness-of-fit to complete this exercise. [table not shown]
a. Calculate degrees of freedom for this chi-square test for goodness-of-fit.
b. Perform all of the calculations to complete this table.
c. Compute the chi-square statistic.

Answer:
a. $df_{\chi^2} = k - 1 = 3 - 1 = 2$
b. and c. [not shown]

15-23. Below are some data to use in a chi-square test for independence. [data not shown]
a. Calculate the degrees of freedom for this test.

Answer: $df_{\chi^2} = (k_{row} - 1)(k_{column} - 1) = (2 - 1)(2 - 1) = 1$

15-25. Using the data presented in Exercise 15-23 and the work you did in Exercise 15-24, calculate the test statistic.

Answer: [not shown]

15-27. Convert the following scale data to ordinal or ranked data, starting with a rank of 1 for the smallest data point. [data not shown]

Answer: [not shown]
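The kind of conversion Exercise 15-27 asks for, scale scores to ranks with rank 1 for the smallest value, can be sketched as follows. The scores are made up for illustration and are not the exercise's data; the helper name `average_ranks` is hypothetical. The sketch assumes tied scores share the mean of the ranks they occupy, which matches the tied ranks of 2.5 that appear in the rank sums for Exercise 15-31.

```python
def average_ranks(values):
    """Return 1-based ranks (smallest value gets rank 1); ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Mean of the 1-based positions pos+1 .. end+1 occupied by the tied run.
        mean_rank = (pos + 1 + end + 1) / 2
        for j in range(pos, end + 1):
            ranks[order[j]] = mean_rank
        pos = end + 1
    return ranks


# Hypothetical scores, including one tie and one extreme value.
scores = [12, 47, 47, 30, 500]
ranks = average_ranks(scores)
print(ranks)
```

The extreme score of 500 ends up only one rank above the next-highest score, which is why ranking is often used to tame outliers.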

15-29. Compute the Spearman correlation for the data listed in Exercise 15-27.

Answer: [not shown]

15-31. Compute the Mann–Whitney U test on the following data: [data not shown]

Answer:

$\Sigma R_{group1} = 1 + 2.5 + 8 + 4 + 6 + 10 = 31.5$

$\Sigma R_{group2} = 11 + 9 + 2.5 + 5 + 7 + 12 = 46.5$

The formula for the first group is: $U_1 = (n_1)(n_2) + \frac{n_1(n_1 + 1)}{2} - \Sigma R_1 = (6)(6) + \frac{6(6 + 1)}{2} - 31.5 = 25.5$

The formula for the second group is: $U_2 = (n_1)(n_2) + \frac{n_2(n_2 + 1)}{2} - \Sigma R_2 = (6)(6) + \frac{6(6 + 1)}{2} - 46.5 = 10.5$

Our test statistic would be 10.5, because it is the smaller of the two.

# Applying the Concepts

15-33. For each of the following research questions, state whether a parametric or nonparametric hypothesis test is more appropriate. Explain your answers.
a. Are women more or less likely than men to be economics majors?
b. At a small company with 15 staff and 1 top boss, do those with a college education tend to make a different amount of money from those without one?
c. At your high school, did athletes or nonathletes tend to have higher grade point averages?
d. At your high school, did athletes or nonathletes tend to have higher class ranks?
e. Compare car accidents in which the occupants were wearing seat belts with accidents in which the occupants were not wearing seat belts. Do seat belts seem to make a difference in the numbers of accidents that lead to no injuries, nonfatal injuries, and fatal injuries?
f. Compare car accidents in which the occupants were wearing seat belts with accidents in which the occupants were not wearing seat belts. Were those wearing seat belts driving at slower speeds, on average, than those not wearing seat belts?
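The Mann–Whitney U calculation in the answer to Exercise 15-31 can be checked with a short sketch. It uses the ranks listed in that answer and assumes the standard working formulas $U_1 = (n_1)(n_2) + \frac{n_1(n_1 + 1)}{2} - \Sigma R_1$ and the matching formula for $U_2$; the function name is hypothetical.

```python
def mann_whitney_u(ranks_1, ranks_2):
    """Return (U1, U2) computed from the ranks already assigned to each group."""
    n1, n2 = len(ranks_1), len(ranks_2)
    u1 = n1 * n2 + n1 * (n1 + 1) / 2 - sum(ranks_1)
    u2 = n1 * n2 + n2 * (n2 + 1) / 2 - sum(ranks_2)
    return u1, u2


# Ranks as listed in the answer to Exercise 15-31.
ranks_group1 = [1, 2.5, 8, 4, 6, 10]
ranks_group2 = [11, 9, 2.5, 5, 7, 12]

u1, u2 = mann_whitney_u(ranks_group1, ranks_group2)
# The smaller of the two U values is the test statistic.
test_statistic = min(u1, u2)
print(u1, u2, test_statistic)
```

A handy sanity check is that $U_1 + U_2$ always equals $(n_1)(n_2)$, here 36.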

[...] appropriate: Spearman rank-order correlation coefficient or Mann–Whitney U test. Explain your answers.
a. Which cities tend to receive higher rankings: those north or south of the equator?
b. Are the livability rankings related to a city’s economic status (assessed by rank)?

Answer:
a. The Mann–Whitney U test would be most appropriate because it is a nonparametric equivalent to the independent-samples t test. It is used when we have a nominal independent variable with two levels (here, they are north and south of the equator), a between-groups research design, and an ordinal dependent variable (here, it is the ranking of the city).
b. The Spearman rank-order correlation would be most appropriate because we are asking a question about the relation between two ordinal variables.

15-39. Across all of India, there are only 933 girls for every 1000 boys (Lloyd, 2006), evidence of a bias that leads many parents to illegally select for boys or to kill their infant girls. (Note that this translates into a proportion of girls of 0.483.) In Punjab, a region of India in which residents tend to be more educated than in other regions, there are only 798 girls for every 1000 boys. Assume that you are a researcher interested in whether sex selection is more or less prevalent in educated regions of India and that 1798 children from Punjab constitute the entire sample. (Hint: You will use the proportions from the national database for comparison.)
a. How many variables are there in this study? What are the levels of any variable you identified?
b. What hypothesis test would be used to analyze these data? Justify your answer.
c. Conduct the six steps of hypothesis testing for this example. (Note: Be sure to use the correct proportions for the expected values, not the actual numbers for the population.)
d. Report the statistics as you would in a journal article.

Answer:
a. There is one variable, gender of the children. Its levels are girls and boys.
b. A chi-square test for goodness-of-fit would be used because we have one sample, the children from Punjab, and we are comparing proportions of children that fall within each level of gender (a nominal variable) to expectations based on national proportions.
c. Step 1: Population 1 is children with gender proportions like those that we observed in Punjab. Population 2 is children with gender proportions similar to those in India as a whole. The comparison distribution is a chi-square distribution. The hypothesis test will be a chi-square test for goodness-of-fit because we have only one nominal variable. This study meets three of the four assumptions. (1) The variable under study is nominal. (2) Each observation is independent of the others. (3) There are more than five times as many participants as there are cells (there are 1798 children in the sample and only 2 cells). (4) We do not know, however, whether this is a randomly selected sample of the more educated people, so we must generalize with caution.

Step 2: Null hypothesis: The proportions of boys and girls in Punjab are the same as those in India as a whole. Research hypothesis: The proportions of boys and girls in Punjab are different from those in India as a whole.

Step 3: The comparison distribution is a chi-square distribution that has 1 degree of freedom: $df_{\chi^2} = 2 - 1 = 1$.

Step 4: Our critical $\chi^2$, based on a p level of 0.05 and 1 degree of freedom, is 3.841.

Step 5: [calculation table not shown]
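The Step 5 calculation for this exercise can be sketched as follows. This is an illustration under stated assumptions: the observed counts (798 girls, 1000 boys) and the national proportion of girls (0.483) come from the exercise, the proportion of boys is taken as 1 − 0.483 = 0.517, and the function name is hypothetical.

```python
def chi_square_goodness_of_fit(observed, expected_proportions):
    """Return (chi-square statistic, expected counts) for one nominal variable."""
    n = sum(observed)
    # Expected counts come from the comparison proportions, scaled to the sample size.
    expected = [p * n for p in expected_proportions]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, expected


observed = [798, 1000]        # girls, boys in the Punjab sample (N = 1798)
proportions = [0.483, 0.517]  # national proportions; 0.517 is assumed as 1 - 0.483

statistic, expected = chi_square_goodness_of_fit(observed, proportions)
critical_value = 3.841        # p level of 0.05, 1 degree of freedom (from Step 4)
reject_null = statistic > critical_value
print(round(statistic, 2), reject_null)
```

Using proportions rather than the national counts keeps the expected frequencies on the same scale as the 1798 observed children, which is the point of the note in part c.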

Step 6: Reject the null hypothesis. Our calculated chi-square value exceeds our critical value. It appears that the proportion of girls in Punjab is less than that in the general population of India.

d. $\chi^2(1, N = 1798) = 11.05$, $p < 0.05$

15-41. In a classic prisoner’s dilemma game with money for prizes, players who cooperate with each other both earn good prizes. If, however, your opposing player cooperates but you do not (the term used is defect), you receive an even bigger payout and your opponent receives nothing. If you cooperate but your opposing player defects, he or she receives that bigger payout and you receive nothing. If you both defect, you each get a small prize. Because of this, most players of such games choose to defect, knowing that if they cooperate but their partners don’t, they won’t win anything. The strategies of U.S. and Chinese students were compared. The researchers hypothesized that those from the market economy (United States) would cooperate less (i.e., would defect more often) than would those from the nonmarket economy (China). [data table not shown]
a. How many variables are there in this study? What are the levels of any variables you identified?
b. What hypothesis test would be used to analyze these data? Justify your answer.
c. Conduct the six steps of hypothesis testing for this example, using the above data.
d. Calculate the appropriate measure of effect size. According to Cohen’s conventions, what size effect is this?
e. Report the statistics as you would in a journal article.

Answer:

a. There are two variables in this study. The independent variable is the country the student is from (United States, China). The dependent variable is the choice the student made (defect, cooperate).
b. A chi-square test for independence would be used because we have data on two nominal variables.
c. Step 1: Population 1 contains students like those in this sample. Population 2 contains students from a population in which country of origin and choice to defect or cooperate are independent. The comparison distribution is a chi-square distribution. The hypothesis test will be a chi-square test for independence because we have two nominal variables. This study meets three of the four assumptions. (1) The two variables are nominal, (2) every participant is in only one cell, and (3) there are more than five times as many participants as there are cells (there are 122 participants and 4 cells). (4) The students were not randomly selected, however, so we should use caution when generalizing beyond this sample.

Step 2: Null hypothesis: The proportion of Chinese students who choose to defect as opposed to cooperate is similar to the proportion for U.S. students. Research hypothesis: The proportion of Chinese students who choose to defect as opposed to cooperate is different from the proportion for U.S. students.

Step 3: The comparison distribution is a chi-square distribution that has 1 degree of freedom: $df_{\chi^2} = (k_{row} - 1)(k_{column} - 1) = (2 - 1)(2 - 1) = 1$.

Step 4: Our cutoff $\chi^2$, based on a p level of 0.05 and 1 degree of freedom, is 3.841.

Step 5: [calculation table not shown]
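Once the chi-square statistic from Step 5 is in hand, the effect size asked for in part d follows directly. The sketch below assumes the values reported in this exercise's answer ($\chi^2 = 9.99$, $N = 122$) and the usual definition of Cramer’s V, in which the statistic is divided by $N$ times the smaller of the row and column degrees of freedom before taking the square root. The function name is hypothetical.

```python
import math


def cramers_v(chi_square: float, n: int, df_smaller: int) -> float:
    """Cramer's V: sqrt(chi-square / (N * smaller of the row/column df))."""
    if n <= 0 or df_smaller <= 0:
        raise ValueError("n and df_smaller must be positive")
    return math.sqrt(chi_square / (n * df_smaller))


# Values reported for the prisoner's dilemma example; a 2 x 2 table gives df = 1.
v = cramers_v(chi_square=9.99, n=122, df_smaller=1)
print(round(v, 2))
```

Dividing by the degrees of freedom rather than the number of categories is the same correction made in Exercise 15-19.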

p < 0. Our calculated chi-square value exceeds our critical value. .Step 6: Reject the null hypothesis.05. χ2(1. Draw a table that includes the conditional proportions for participants from China and from the United States.29 15-43.) b. c. Create a graph with two bars showing just the proportions for the defections for each country. Answer: a. The accompanying graph shows the conditional proportions for all four conditions. Refer to the prisoner’s dilemma example in Exercise 15-41. e. this is a medium effect.99. Create a graph with bars showing the proportions for all four conditions. b. Cramer’s V = 0.S. The accompanying table shows the conditional proportions. a. N = 122) = 9. It appears that the proportion of participants who choose to defect is higher among U. students than among Chinese students. d. (The conditional proportions are the proportions of Chinese who defect or cooperate and the proportions of Americans who defect or cooperate. Cramer’s According to Cohen’s conventions.

c. The accompanying graph shows only the bar for defects. [graph not shown]

15-45. Here are some monthly cell phone bills, in dollars, for college students: [data not shown]
a. Convert these data from scale to ordinal. (Don’t forget to put them in order first.) What happens to an outlier when you convert these data to ordinal?
b. Roughly, what shape would the distribution of these data take? Would they likely be normally distributed? Explain why the distribution of ordinal data is never normal.

c. Why does it not matter if the ordinal variable is normally distributed? (Hint: Think about what kind of hypothesis test you would conduct.)

Answer:
a. The accompanying table shows the ordered data and corresponding ranks. [table not shown] Prior to converting to ordinal data, the outlier, 500, was well above the next-highest observation, 200. When converted to ordinal data, the outlier is still at the top of the distribution but is no longer very different from the rest of the scores in the distribution. Now the scores of 500 and 200 are ranked 29 and 28, respectively.
b. The distribution is likely to be somewhat rectangular and not normal. The distribution of ordinal data is never normal because each score is assigned a rank, which means that each individual raw score usually has a different rank from the others. In most cases (unless there are ties), all frequencies would be 1.
c. It does not matter that the ordinal transformation is not normally distributed because we would be using nonparametric statistics to analyze the data. Nonparametric statistics do not require the assumption that the underlying distribution is normal.

15-47. Does speed in completing a test correlate with one’s grade? Here are test scores for eight students in one of our statistics classes. They are arranged in order from the student who turned in the test first to the student who turned in the test last.

98 74 87 92 88 93 62 67

a. What are the two variables of interest? For each variable, state whether it’s scale or ordinal.