
Running head: ANALYZING A HEALTH CARE DATA SET 1

Analyzing a Health Care Data Set

Student’s Name

Institutional Affiliation

Analyzing a Health Care Data Set

Part 1: Yoga and Stress Study Statistical Tests

1. Level of Measurement for the Outcome Variable

The outcome variable in this study is stress level (PRE_PSS and POST_PSS). The level of

measurement for this variable is the ratio scale. The Psychological Stress Score measures stress

levels from zero to 40. Zero (0) represents no perceived stress, while 40 represents the highest

perceived stress level. Therefore, the scale has a true zero, which makes it a ratio scale.

2. Pre-Evaluation Tests: Normality and Existence of Outliers

Figure 1: Box Plot

The box plot above shows that PRE_PSS has no outliers, while POST_PSS has an outlier.

Outliers are problematic for most parametric tests due to the violation of normality (Heavey,

2019).
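
The box plot rule for flagging outliers can be sketched in a few lines. The example below is a minimal illustration, assuming the common 1.5 x IQR fence; the function name iqr_outliers and the scores are hypothetical and are not the study data. Statistical packages compute quartiles in slightly different ways, so the fences may differ a little from those drawn in Figure 1.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR beyond the quartiles."""
    # "inclusive" is one of several quartile conventions (an assumption here).
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical post-test scores, NOT the study data.
post_scores = [10, 11, 12, 12, 13, 13, 14, 14, 15, 30]
print(iqr_outliers(post_scores))
```

With these made-up scores, only the value 30 falls outside the fences, which is the kind of point a box plot would draw as an outlier.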

Shapiro-Wilk and Kolmogorov-Smirnov tests of normality were conducted to assess if the two

outcome variables are normally distributed (Heavey, 2019). The hypotheses for the tests are as

follows:

Null hypothesis, H0: The data is normally distributed

Alternative hypothesis, H1: The data is not normally distributed

Table 1 shows the outcome of the two tests.


Table 1: Tests of Normality

The p-values of the Kolmogorov-Smirnov and Shapiro-Wilk statistics for PRE_PSS are both greater

than 0.05. This indicates that there is insufficient evidence to reject the null hypothesis. Thus,

PRE_PSS can be treated as normally distributed (Heavey, 2019). On the other hand, the

p-values of the Kolmogorov-Smirnov and Shapiro-Wilk statistics for POST_PSS are both less than

0.05. Thus, the null hypothesis is rejected. This implies that, at a 5% significance level, POST_PSS is

not normally distributed (Fusfield, 2013).

3. Statistical Tests

Descriptive Statistics for Age (demographic variable)

The average age of the 20 participants in the study is 39.45 years, with a standard deviation of 12.64

years. The oldest participant was 60 years, while the youngest was 18 years, as shown by the

minimum and maximum values.

Table 2: Selected Descriptive Statistics for Age

Mean     Standard deviation   Minimum   Maximum   Range
39.45    12.643               18        60        42
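
The statistics reported for age can be reproduced with a short script. This is only a sketch: the describe helper and the ages are hypothetical and are not the study data. It assumes the sample standard deviation (n - 1 in the denominator), which is what most statistical packages report by default.

```python
from statistics import mean, stdev


def describe(values):
    """Return the mean, sample SD, minimum, maximum, and range of a list."""
    return {
        "mean": mean(values),
        "sd": stdev(values),  # sample standard deviation (n - 1 denominator)
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
    }


# Hypothetical ages, NOT the study data.
ages = [18, 25, 32, 40, 47, 60]
summary = describe(ages)
print(summary)
```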

Association between gender and race

The association between two categorical variables is determined using the Chi-Square test. In

this case, race and gender are nominal categorical variables. The hypotheses are as follows:

Null hypothesis, H0: Gender and race are independent.

Alternative hypothesis, H1: Gender and race are dependent.

The Chi-Square test of independence was conducted, as shown in Table 3 below.

Table 3: Chi-Square Test

The Pearson Chi-Square statistic for the test is 24.85, and its p-value is 0.016. The p-value is less

than 0.05, so the null hypothesis is rejected. Thus, there is a significant association

between gender and race in the military study.



The Cramer’s V statistic of the test is 0.769, indicating a strong association between race and

gender. Thus, the effect size is strong. This suggests that the association between race and gender

is practically significant (Fusfield, 2013).
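
For reference, Cramer's V is derived from the Pearson Chi-Square statistic as V = sqrt(chi-square / (n x min(r - 1, c - 1))), where n is the number of observations and r and c are the numbers of rows and columns in the contingency table. The sketch below computes both quantities for a hypothetical 2 x 2 table. The counts are made up for illustration and are not the gender-by-race table from the study, and the function name is an assumption.

```python
from math import sqrt


def chi_square_and_cramers_v(table):
    """Return (chi_square, cramers_v) for a contingency table of counts.

    Assumes every row and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    k = min(len(table), len(table[0])) - 1
    return chi2, sqrt(chi2 / (n * k))


# Hypothetical counts, NOT the study data.
counts = [[8, 2],
          [2, 8]]
chi2, v = chi_square_and_cramers_v(counts)
print(round(chi2, 3), round(v, 3))
```

V ranges from 0 (no association) to 1 (perfect association), so it can be compared across tables of different sizes in a way the raw Chi-Square statistic cannot.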

4. Comparison of Pre and Post-Test Scores

The pre-test (PRE_PSS) and post-test (POST_PSS) scores are paired samples because they are

measurements on the same 20 participants collected at different points. PRE_PSS was collected before

yoga, and POST_PSS was collected after yoga. Paired samples can be compared using either

parametric or non-parametric tests. The parametric test is the paired-samples t-test, while the

non-parametric alternative is the Wilcoxon Signed Rank test. The conditions for the paired-samples

t-test include: the dependent variable must be continuous, approximately normally distributed, and

must not contain outliers (Nahm, 2016). The box plot and normality tests in question 2 above

indicated that POST_PSS has an outlier and is not normally distributed (Nahm, 2016). Therefore,

the data do not meet the conditions for the paired-samples t-test. The outlier could be

removed, but the small sample size does not allow this. Therefore, the non-parametric

alternative, the Wilcoxon Signed Rank test, is used to compare the test scores.

The Wilcoxon Signed-Rank test compares the medians of two paired samples. The hypotheses are as

follows:

Null hypothesis, H0: The median of PRE_PSS is equal to that of POST_PSS

Alternative hypothesis, H1: The medians of PRE_PSS and POST_PSS are not equal

Table 4: Wilcoxon Signed Rank Test - Ranks



Table 5: Wilcoxon Signed Rank Test – Test Statistic

Table 5 shows that the p-value of the Z-statistic is less than 0.05. Therefore, the null hypothesis

of equality of medians is rejected (Heavey, 2019). The study concludes that there is a significant

difference between the medians of PRE_PSS and POST_PSS. The sign of the Z statistic is

negative, implying that the median of POST_PSS is significantly lower than that of PRE_PSS.

Table 4 shows that there are 17 negative ranks among the 20 participants. Only one rank is

positive. This implies that POST_PSS was lower than PRE_PSS in 17 out of 20 participants

(85% of the time). The direction of the ranks agrees with the test statistic. This suggests that the

results are practically significant.
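
The mechanics of the Wilcoxon Signed Rank test can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used to produce Tables 4 and 5: zero differences are dropped, tied absolute differences receive average ranks, and the p-value comes from the normal approximation with a tie correction. The function name and the scores are hypothetical and are not the study data.

```python
from math import sqrt
from statistics import NormalDist


def wilcoxon_signed_rank(pre, post):
    """Two-sided Wilcoxon signed-rank test using the normal approximation.

    Assumes at least one pair has a non-zero difference.
    """
    # Differences are post - pre; pairs with no change are dropped.
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    n = len(diffs)

    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0
    start = 0
    while start < n:
        end = start
        while end + 1 < n and abs(diffs[order[end + 1]]) == abs(diffs[order[start]]):
            end += 1
        average_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = average_rank
        t = end - start + 1
        tie_term += t ** 3 - t  # feeds the tie correction of the variance
        start = end + 1

    w_neg = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w_pos = sum(r for r, d in zip(ranks, diffs) if d > 0)

    mean_w = n * (n + 1) / 4
    sd_w = sqrt(n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48)
    z = (min(w_pos, w_neg) - mean_w) / sd_w  # never positive by construction
    p = 2 * NormalDist().cdf(z)
    return {"n": n, "w_pos": w_pos, "w_neg": w_neg, "z": z, "p": p}


# Hypothetical scores, NOT the study data.
pre = [20, 22, 19, 25, 30, 18, 24, 21]
post = [15, 20, 20, 19, 22, 18, 17, 18]
result = wilcoxon_signed_rank(pre, post)
print(result)
```

Because this sketch bases z on the smaller rank sum, z is never positive. The direction of the change is read from the rank sums: a negative rank sum much larger than the positive one means that most post-test scores were lower than the corresponding pre-test scores.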

Part 2: Interpretive Report

Clinical Implications

This study conducted several statistical tests on the Yoga and Stress Study dataset. Normality

tests of the outcome variable indicated that POST_PSS is not normally distributed and has

outliers. This necessitated the use of non-parametric tests. The Wilcoxon Signed Rank Test was

conducted to compare pre-test and post-test stress scores. The test shows that there is a

significant difference between the medians of PRE_PSS and POST_PSS. The median of

POST_PSS is significantly lower than that of PRE_PSS. The clinical implication is that yoga is

effective in reducing stress levels among the participants. This explains why the median score

post yoga is lower than the median score before yoga. Therefore, yoga should be recommended

or adopted as a measure for reducing stress levels in the military, among other settings.

This study also finds a significant association between race and gender. It suggests that the

participation of a particular gender in the military varies by race. For instance, in some races, one

gender is either more or less likely to join the military. This shows that there is a need to

demystify gender roles in society.

Limitations of the study

The study uses non-parametric tests to compare pre- and post-test scores. When the assumptions of

parametric tests hold, non-parametric tests are less efficient and less powerful because they do not

use the distribution of the data (Nahm, 2016). Therefore, a non-parametric test is less likely to

reject a false null hypothesis than a parametric test.

The study analyzed a sample of 20 participants. This sample is small, thereby limiting the ability

to generalize the outcome (Fusfield, 2013). It is not reasonable to conclude that yoga effectively

reduces stress in the military based on a study of only 20 participants.

References

Fusfield, B. (2013). Introductory statistics. Hoboken, NJ: John Wiley.



Heavey, E. (2019). Statistics for Nursing: A Practical Approach (3rd ed.). Burlington: Jones &

Bartlett Learning.

Nahm, F. (2016). Nonparametric statistical tests for the continuous data: the basic concept and

the practical use. Korean Journal of Anesthesiology, 69(1), 8. doi: 10.4097/kjae.2016.69.1.8
