Stat 2112 Project

Anna Weber
STAT 2112.10 Project
Prof. Amini
26 November 2019
The goal of my project is to investigate the effect of several variables on stress level.
To do so, I surveyed 288 people through a Google Form, which I distributed on a variety of
social platforms, including GroupMe, Instagram, and Facebook.
My dependent variable in the following analysis is stress, which I measured on a scale of
1 to 100, with 1 being the lowest and 100 being the highest. Each data point represents how
stressed each person in the data set is on a typical weekday.
For a nominal variable, I chose to record gender, with male and female as the options.
The descriptive statistics for gender can be seen in Appendix A. My categorical variable was
preferred area of study, and the options were humanities, social sciences, and STEM. The
descriptive statistics for preferred area of study can be found in Appendix B. I had three
continuous, independent variables. I collected data on how many hours a given respondent
spends studying on a typical weekday. I also collected data that measured a survey respondent’s
religiousness and how close they are with their biological mother. Both of those were measured
on a scale of 1 to 100 (1 is low and 100 is high). The descriptive statistics for each of the
continuous variables, including stress, closeness with mother, time studying, and strength of
religious beliefs can be found in Appendix C.
For the analysis of this data, I used several tests, all performed by the Statistix computer
package. To measure the effects of gender and preferred area of study on stress level, I used a
two-factor ANOVA with interaction. The factors were gender and preferred area of study, and
the dependent variable was stress level. My null hypothesis was that there was no interaction
between gender and preferred area of study variables, and my alternative hypothesis was that
there was interaction between the aforementioned factors.
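For reference, the hypotheses above can be written against the usual two-factor model. This is the standard textbook form rather than output from Statistix, and the symbols are my own labels: i indexes gender, j indexes preferred area of study, and k indexes respondents within each combination.

```latex
% Two-factor ANOVA model with interaction (standard textbook notation)
Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}

% Interaction hypotheses
H_0 : (\alpha\beta)_{ij} = 0 \ \text{for all } i, j
\qquad
H_a : (\alpha\beta)_{ij} \neq 0 \ \text{for at least one pair } (i, j)
```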
I used a simple linear regression to measure the relationship between hours spent
studying and stress level. My null hypothesis was that there would not be a meaningful linear
relationship between hours spent studying and stress level, and my alternative hypothesis was
that there was a meaningful linear relationship between the two variables.
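To illustrate what a simple linear regression computes, here is a minimal Python sketch of the least-squares slope, intercept, and R² value. It is not the Statistix procedure I used; the function name and the five data points are hypothetical placeholders, not my survey data.

```python
def simple_linear_regression(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each contain some variation")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # With one predictor, R^2 is the squared correlation between x and y
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical values for illustration only (NOT the survey responses)
study_hours = [1, 2, 3, 4, 5]
stress = [52, 48, 61, 55, 59]

slope, intercept, r_squared = simple_linear_regression(study_hours, stress)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r_squared:.4f}")
```

Multiplying R² by 100 gives the percentage of variation in stress explained by the predictor, which is how the R² values in the results are read.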
To measure the relationship between how religious someone is, how close they are with
their mother, and stress level, I used a multiple regression. The null hypothesis in this case was
that there would be no significant relationship between the variables. The alternative hypothesis
was that there would be a significant relationship between the variables.
Because I was curious, I also performed a chi-squared test for independence between
gender and preferred area of study. This test’s null hypothesis was that there was no association
between gender and preferred area of study, and the alternative hypothesis was that there was an
association between the variables.
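The arithmetic behind a chi-squared test of independence can be sketched in a few lines of Python. This is only an illustration, not the Statistix routine I used: the function name and the counts are hypothetical, and the p-value shortcut relies on the fact that a 2 x 3 table has exactly 2 degrees of freedom.

```python
import math


def chi_squared_independence(table):
    """Return (statistic, degrees_of_freedom, expected_counts) for a
    contingency table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count under independence: row total * column total / grand total
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


# Hypothetical counts for illustration only (NOT the survey responses):
# rows = two genders, columns = humanities, social sciences, STEM
counts = [[20, 30, 50],
          [30, 30, 40]]

statistic, dof, expected = chi_squared_independence(counts)

# Only for 2 degrees of freedom does the chi-squared upper-tail
# probability reduce to the closed form exp(-x / 2).
if dof != 2:
    raise ValueError("the closed-form p-value below assumes 2 degrees of freedom")
p_value = math.exp(-statistic / 2)

print(f"chi-squared={statistic:.4f}, df={dof}, p={p_value:.4f}")
```

The `expected` table is also what one would inspect to check that no expected count is too low for the test to be appropriate.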
For all the above tests, the threshold for significance was α = 0.05. The results and
conclusion for each statistical test are below.
The first test I ran was the two-way ANOVA, measuring the effects of preferred area of
study and gender on stress. The results for this test are displayed in Appendix D. As shown, the
interaction of the area of study (A_STUDY) and gender (GENDER) variables had a p-value of
0.9861, which is not statistically significant. Therefore, we fail to reject the null hypothesis
of no interaction between the variables. Because there was no evidence of interaction,
I then ran one-way ANOVA tests on each of the main effect variables. I concluded that preferred
area of study on its own was not significant in affecting stress level (p=0.2103, Appendix E).
Gender, on its own, did significantly affect stress level (p<0.001, Appendix F). I performed post-hoc
analysis using the LSD multiple comparisons method on this test (Appendix G), which indicated
that females had significantly higher levels of stress than males. On average, females were about
16 points more stressed than males (Appendix G).
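As a rough illustration of what the one-way ANOVA compares, the sketch below computes the F statistic and the difference in group means in plain Python. It is not the Statistix output: the function name and the eight values are hypothetical, and turning F into a p-value would additionally require the F distribution, which is left out here.

```python
def one_way_anova_f(groups):
    """Return (f_statistic, df_between, df_within) for a list of groups,
    where each group is a list of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = k - 1
    df_within = n_total - k
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return f_statistic, df_between, df_within


# Hypothetical stress scores for illustration only (NOT the survey responses)
female = [70, 65, 80, 75]
male = [55, 60, 50, 63]

f_statistic, df_between, df_within = one_way_anova_f([female, male])
mean_difference = sum(female) / len(female) - sum(male) / len(male)

print(f"F({df_between}, {df_within}) = {f_statistic:.3f}")
print(f"difference in means = {mean_difference:.1f}")
```

With only two groups, the post-hoc comparison reduces to this single difference in means.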
The next test I ran was a simple linear regression to measure the association between
hours spent per day studying (STUDY) and stress level (STRESS). The results for this test are
displayed in Appendix H. The test found no statistically significant linear relationship between
these variables, so we fail to reject the aforementioned null hypothesis (p=0.5549). According to
the R² value for this test (Appendix H), 0.12% of the variation in stress can be explained by its
relationship with the time spent studying variable. This R² value is quite small, further
indicating that the relationship is not meaningful.
To measure the relationship between religiousness, closeness with mother, and stress
level, I performed a multiple regression. The results of this test are displayed in Appendix I.
According to the p-values listed in Appendix I, we fail to reject the null hypothesis, as no p-value
was below 0.05. The R² value for this regression was 0.0066, meaning that 0.66% of the variance
in stress level could be explained by its relationship with the level of religiousness (RELIGION)
and closeness with mother (MOTHER) variables. Overall, this relationship is not
statistically significant.
The final test I performed was a chi-squared test of independence analyzing the
relationship between gender and preferred area of study, as seen in Appendix J. With a p-value of
0.5210, we fail to reject the null hypothesis: my data set shows no statistically significant
association between gender and preferred area of study.
Before I started this project, I was hoping that more of my results would end up being
statistically significant. However, I am not surprised that they turned out the way they did, as my
data set was subject to self-selection bias in who responded, and my original Google Form only
went out to a select group of people. That being said, though, I did get 289 responses, and I did not have any
problems with expected values being too low or not having enough data to perform the relevant
tests. I find it very interesting that according to my analysis with the ANOVA, males and
females have statistically significant differences in stress levels, but I do not have the data at this
time to further determine the cause of that stress. I am also curious as to what data sets other than
my own say about how someone’s relationship with religion and their mother affect stress level,
but I do not have that data at this time. Overall, while the majority of my results were not
statistically significant, I am glad I was able to collect my own data and analyze it using multiple
methods.
Appendix A: Descriptive Statistics for Gender Variable (GENDER)
Appendix B: Descriptive Statistics for Preferred Area of Study Variable (A_STUDY)
Appendix C: Descriptive Statistics for Continuous Variables where:

STRESS = stress level
RELIGION = strength of religious beliefs
STUDY = hours spent studying per day
MOTHER = closeness of relationship with mother
Appendix D: Factorial Design ANOVA

Appendix E: One-Way ANOVA for Area of Study and Stress
Appendix F: One-Way ANOVA for Gender and Stress
Appendix G: LSD Post-Hoc Analysis for ANOVA with Gender and Stress
Appendix H: Simple Linear Regression for Stress Level vs. Hours Spent Studying
Appendix I: Multiple Linear Regression for Stress Level vs. Level of Religious Affiliation and
Closeness with Mother
Appendix J: Chi-Square Test of Independence for Gender and Area of Study
