
JTMS-03 Applied Statistics with R

Spring Semester 2023

Lab 04 – Nonparametric alternatives to the t-test – Solution


February 28, 2023

In your role as Integration Officer for the City State of Bremen, you research the attitudes of Bremen's
citizens toward diversity in German society. For this purpose, you use the subsample for Bremen
(n = 79) from the 2019 Vielfaltsbarometer study of the Robert Bosch Stiftung. The study examined the
extent to which citizens accept social diversity along the dimensions of age, disability, gender roles,
ethnicity, religion and socio-economic status. Each of these diversity dimensions was assessed with
three to four items. Respondents’ answers were summarized into an index for each dimension that
ranges from 0 to 100. The data are representative of the population aged 16 and older.

Data       VielfaltHB.sav (SPSS format)
Source     Robert Bosch Stiftung [1]

Variables (only the relevant ones)
female     Respondent’s biological sex (0 = male, 1 = female)
div_eth    Acceptance of ethnic diversity (0 = complete rejection to 100 = complete acceptance)
div_rel    Acceptance of religious diversity (0 = complete rejection to 100 = complete acceptance)

Reading in the dataset

setwd("Type/your/directory/here")
library(foreign)   # provides read.spss()
diversity.hb <- read.spss("VielfaltHB.sav",
                          to.data.frame= TRUE, use.value.labels= FALSE,
                          use.missings= TRUE)
View(diversity.hb)
attach(diversity.hb)   # makes the variables accessible by name

Tasks

1. Examine to what extent men and women differ in their attitudes toward ethnic diversity. Test the
hypothesis that men are on average less tolerant of ethnic diversity than women are.

a) The hypothesis was tested in Lab 03 using a parametric test. Check whether the data meet the
assumptions of the parametric test (normal distribution and homogeneity of variance) on the
basis of descriptive statistics and visual inspection of the distribution of the dependent variable.

Regarding the assumption of normal distribution, the descriptive statistics point to pronounced negative
skewness (g1 = -1.29) and leptokurtosis (g2 = 1.95) in the distribution of the attitudes towards ethnic
diversity. Both findings indicate that the data in the total sample do not conform to a normal distribution.

library(psych)   # provides describe() and describeBy()
describe(div_eth, type= 2)
## n mean sd median min max range skew kurtosis se
## 79 75.93 21.48 77.8 0 100 100 -1.29 1.95 2.42
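
The skewness and kurtosis requested with type= 2 are, as far as the psych documentation describes them, the sample-adjusted statistics that SPSS and SAS also report. The following is a minimal sketch of that computation in base R. The helper names skew_type2() and kurt_type2() and the toy vector are hypothetical and serve only as an illustration.

```r
# Hypothetical helpers: sample-adjusted skewness (G1) and excess kurtosis (G2),
# assumed to correspond to type= 2 in psych::describe().
skew_type2 <- function(x) {
  x <- x[!is.na(x)]
  n <- length(x)
  m2 <- mean((x - mean(x))^2)   # second central moment
  m3 <- mean((x - mean(x))^3)   # third central moment
  g1 <- m3 / m2^1.5             # moment-based skewness
  g1 * sqrt(n * (n - 1)) / (n - 2)
}

kurt_type2 <- function(x) {
  x <- x[!is.na(x)]
  n <- length(x)
  m2 <- mean((x - mean(x))^2)
  m4 <- mean((x - mean(x))^4)   # fourth central moment
  g2 <- m4 / m2^2 - 3           # moment-based excess kurtosis
  ((n + 1) * g2 + 6) * (n - 1) / ((n - 2) * (n - 3))
}

toy <- c(1, 2, 3, 4, 5)         # symmetric toy data, not from the study
skew_toy <- skew_type2(toy)
kurt_toy <- kurt_type2(toy)
print(c(skew = skew_toy, kurtosis = kurt_toy))
```

Values close to 0 indicate a symmetric distribution and normal-like tails, respectively.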

The breakdown by biological sex reveals that these issues arise predominantly in the subsample of
male respondents: the distribution of acceptance of ethnic diversity in the group of men is characterized
by negative skewness (g1 = -1.30) and leptokurtosis (g2 = 1.78). In contrast, the subsample of female
respondents is characterized by borderline skewness (g1 = -1.00) and only a slight tendency towards
leptokurtosis (g2 = 0.66).

[1] https://www.bosch-stiftung.de/en/publication/cohesion-diversity-diversity-barometer-robert-bosch-stiftung

female <- factor(female, levels=c(0,1), labels=c("male","female"))

library(psych)
describeBy(div_eth, group= female, type= 2)
##
## Descriptive statistics by group
## group: male
## n mean sd median min max range skew kurtosis se
## 35 72.39 24.61 77.8 0 100 100 -1.3 1.78 4.16
## ------------------------------------------------------------
## group: female
## n mean sd median min max range skew kurtosis se
## 44 78.75 18.44 88.68 22.23 100 77.77 -1 0.66 2.78

The observed deviations from a normal distribution can be seen in the histogram below. The bars
represent the empirical distribution of the attitudes. The superimposed curve shows a theoretical
normal distribution with the same mean and standard deviation as the empirical distribution, so
the histogram reveals in which ranges of the measurement scale the distribution of acceptance of ethnic
diversity deviates from normality. Most scores are located in the upper range of the measurement scale;
the few participants who are very critical of ethnic diversity account for the pronounced negative
skew. In addition, the bars in the upper range of the measurement scale that rise above the curve
illustrate the excessive peak of the distribution.

hist(div_eth, probability= TRUE, ylim= c(0,0.03), xlim= c(0,100),
     xlab= "Acceptance of ethnic diversity", col= "cornflowerblue")
curve(dnorm(x, mean= mean(div_eth), sd= sd(div_eth)),
      add= TRUE, col= "red", lwd= 3)

Whether the data meet the assumption of homogeneity of variance can be inspected by comparing
the difference between the standard deviations in the subsamples of men (sM = 24.61) and women
(sF = 18.44) with the standard deviation in the total sample (sT = 21.48).
Since the difference between the standard deviations of the subsamples (Δs = 6.17) is smaller than half
of the standard deviation in the total sample (1/2 sT = 10.74), homogeneity of variance between the
subsamples of men and women can be assumed.
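
The comparison described above can be reproduced from the reported standard deviations. This is only a sketch of the arithmetic, and the object names are chosen for illustration.

```r
# Standard deviations as reported in the descriptive statistics above
s_male   <- 24.61
s_female <- 18.44
s_total  <- 21.48

delta_s   <- abs(s_male - s_female)   # difference between the subsample SDs
threshold <- s_total / 2              # half of the total-sample SD

# TRUE suggests that homogeneity of variance may be assumed under this heuristic
homogeneous <- delta_s < threshold
print(c(delta_s = delta_s, threshold = threshold))
print(homogeneous)
```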

Taken together, the descriptive evidence only partially supports the assumptions of the parametric test.
Although the variances in the subsamples of men and women are homogeneous, the
assumption of normality is violated. Therefore, it seems appropriate to perform the
nonparametric alternative to the independent-samples t-test.

b) Perform the appropriate nonparametric test, if the assumptions of the parametric test have been
violated. Report and interpret the findings.

The appropriate nonparametric procedure for testing the hypothesis is the Wilcoxon rank-sum test (also
known as the Mann-Whitney U test).

wilcox.test(div_eth ~ female)
##
## Wilcoxon rank sum test with continuity correction
##
## data: div_eth by female
## W = 659.5, p-value = 0.27
## alternative hypothesis: true location shift is not equal to 0

rdiv_eth <- rank(div_eth, ties.method= "average")
tapply(rdiv_eth, female, mean)
##     male   female
## 36.84286 42.51136
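
The test statistic W can be recovered from the mean ranks: R reports W as the rank sum of the first group (here: men) minus its smallest possible value, n1(n1 + 1)/2. A brief sketch using the figures reported above; the object names are illustrative.

```r
n_male      <- 35
n_female    <- 44
mean_rank_m <- 36.84286                 # mean rank of men, as reported above

rank_sum_m <- mean_rank_m * n_male      # rank sum of the male subsample
W <- rank_sum_m - n_male * (n_male + 1) / 2

# Value of W expected if both groups had the same distribution
W_expected <- n_male * n_female / 2

print(round(W, 1))
print(W_expected)
```

Because W lies below the value expected under the null hypothesis, men tend to have lower ranks than women in this sample.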

On descriptive grounds, the comparison between the mean ranks of both subsamples indicates that
acceptance of ethnic diversity is more pronounced among women (empirical mean rank = 42.5) than
among men (empirical mean rank = 36.8). This is consistent with the descriptive evidence based on
the raw scores.

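Despite the tendency in the expected direction, the result of the nonparametric test is not statistically
significant: W = 659.5, p = 0.27. Note that this p-value is two-sided, whereas the hypothesis is
directional; because the observed difference lies in the hypothesized direction, the one-sided p-value
is about half as large and still exceeds the conventional 5% level. The observed difference in
acceptance of ethnic diversity between men and women is thus too small to be considered
statistically significant.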
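
Since the hypothesis in Task 1 is directional, a one-sided version of the test may also be of interest. The sketch below uses a small made-up data set (not the Bremen data) to show the alternative argument of wilcox.test(). With the formula interface, alternative= "less" states that the first factor level (here: male) is shifted towards lower values.

```r
# Made-up scores for illustration only; not taken from VielfaltHB.sav
toy <- data.frame(
  score = c(10, 20, 30, 40, 50, 35, 45, 55, 65, 75, 85),
  sex   = factor(rep(c("male", "female"), times = c(5, 6)),
                 levels = c("male", "female"))
)

two_sided <- wilcox.test(score ~ sex, data = toy)
one_sided <- wilcox.test(score ~ sex, data = toy, alternative = "less")

# Here the observed difference lies in the hypothesized direction,
# so the one-sided p-value is half of the two-sided one
print(two_sided$p.value)
print(one_sided$p.value)
```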

c) Discuss the conclusions reached using the parametric and nonparametric test.

The hypothesis that men are less tolerant of ethnic diversity than women cannot be empirically
supported using either the parametric or the nonparametric test. Although male respondents tend to
view ethnic diversity somewhat more critically, the observed difference is not statistically significant.
The null hypothesis, which holds that acceptance of ethnic diversity is, on average, equally pronounced
among women and men, is therefore retained. The agreement between the results of the nonparametric
and parametric tests speaks for the robustness of the parametric procedure to (minor) violations of its
assumptions (in this case: normal distribution).

2. Check whether respondents accept ethnic and religious diversity to the same extent.

a) The hypothesis was tested in Lab 03 using a parametric test. Check whether the data fulfill the
central assumption of the parametric test (normal distribution) on the basis of descriptive
statistics and visual inspection of the distribution of the pertinent variables.

The dependent-samples t-test requires, among other things, that both variables forming the dependent
samples be normally distributed (strictly speaking, the assumption concerns the pairwise differences
between the two variables). In the context of Task 1, it was shown that the distribution of the
attitudes towards ethnic diversity does not fulfill this assumption. This alone is a strong argument for
performing a nonparametric test. Nevertheless, the distribution of the second variable – acceptance of
religious diversity – will be examined for normality.

describe(div_rel)
## n mean sd median min max range skew kurtosis se
## 79 50.48 22.82 55.57 0 100 100 -0.03 -0.44 2.57

The descriptive statistics do not indicate considerable deviations from normality. The distribution of
attitudes toward religious diversity exhibits neither skewness (g1 = -0.03) nor excessive kurtosis
(g2 = -0.44). However, the histogram of the distribution shows a certain tendency toward bimodality.

hist(div_rel, probability= TRUE, ylim= c(0,0.02), xlim= c(0,100),
     xlab= "Acceptance of religious diversity", col= "green")
curve(dnorm(x, mean= mean(div_rel), sd= sd(div_rel)),
      add= TRUE, col= "red", lwd= 3)

Considering that neither variable meets the assumption of normality, it is appropriate to perform a
nonparametric test.
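
For a paired comparison, it can also be informative to inspect the distribution of the pairwise differences directly. The following sketch does so with simulated scores, since the original data file is not reproduced here. The vectors eth_sim and rel_sim are hypothetical stand-ins for div_eth and div_rel, and the Shapiro-Wilk test is an optional supplement that is not part of the lab tasks.

```r
set.seed(1)                                   # makes the simulation reproducible
n <- 79
# Simulated stand-ins for div_eth and div_rel, restricted to the 0-100 scale
eth_sim <- pmin(pmax(rnorm(n, mean = 76, sd = 21), 0), 100)
rel_sim <- pmin(pmax(rnorm(n, mean = 50, sd = 23), 0), 100)

diff_sim <- eth_sim - rel_sim                 # pairwise differences

# Histogram of the differences with a superimposed normal curve
hist(diff_sim, probability = TRUE,
     xlab = "Difference (ethnic - religious)", main = "")
curve(dnorm(x, mean = mean(diff_sim), sd = sd(diff_sim)),
      add = TRUE, col = "red", lwd = 3)

# Formal check of the normality of the differences
sw <- shapiro.test(diff_sim)
print(sw$p.value)
```

With the real data, diff_sim would be replaced by div_eth - div_rel.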

b) Perform the appropriate nonparametric test, if the assumptions of the parametric test have been
violated. Report and interpret the findings.

The means of the two variables differ such that acceptance of ethnic diversity (x̄eth. = 75.93) is on
average stronger than acceptance of religious diversity (x̄rel. = 50.48). Using the appropriate
nonparametric test for two dependent samples (Wilcoxon signed-rank test), we will examine whether
the observed difference can be attributed to random variability or to systematic differences.

wilcox.test(div_eth, div_rel, paired= TRUE)
##
## Wilcoxon signed rank test with continuity correction
##
## data: div_eth and div_rel
## V = 2582, p-value = 1.267e-11
## alternative hypothesis: true location shift is not equal to 0

The test result indicates a highly significant difference between the average strength of accepting
ethnic and religious diversity (V = 2582, p < 0.001).
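
The statistic V reported by wilcox.test() for paired data is the sum of the ranks of the positive differences, where the ranks are assigned to the absolute differences. A minimal sketch with made-up paired scores (not the Bremen data):

```r
# Made-up paired scores for illustration only
x <- c(10, 20, 30, 40, 50, 60)
y <- c( 8, 25, 21, 30, 62, 45)

d <- x - y                         # pairwise differences: 2 -5 9 10 -12 15
r <- rank(abs(d))                  # ranks of the absolute differences
V_manual <- sum(r[d > 0])          # sum of the ranks of the positive differences

V_test <- unname(wilcox.test(x, y, paired = TRUE)$statistic)

print(V_manual)
print(V_test)
```

A value of V far above n(n + 1)/4, its expected value under the null hypothesis, indicates that positive differences dominate.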

c) Discuss the conclusions reached using the parametric and nonparametric test.

According to the nonparametric procedure, the null hypothesis that respondents accept ethnic and
religious diversity to the same extent must be rejected. Respondents from Bremen are significantly
more tolerant of ethnic diversity than of religious diversity. The parametric t-test led to the same
conclusion (see Lab 03). The correspondence between the two procedures underscores the
robustness of the parametric test to violations of its assumptions.
