Anna Weber

STAT 2112.10 Project

Prof. Amini

26 November 2019

The goal of my project is to investigate the effect of several different variables on stress

level. To do so, I surveyed 288 people through a Google Form. I

sent the Google Form out on a variety of social platforms, including GroupMe, Instagram, and

Facebook.

My dependent variable in the following analysis is stress, which I measured on a scale of

1 to 100, with 1 being the lowest and 100 being the highest. Each data point represents how

stressed each person in the data set is on a typical weekday.

For a nominal variable, I chose to record gender, with male and female as the options.

The descriptive statistics for gender can be seen in Appendix A. My other categorical variable was

preferred area of study, and the options were humanities, social sciences, and STEM. The

descriptive statistics for preferred area of study can be found in Appendix B. I had three

continuous, independent variables. I collected data on how many hours a given respondent

spends studying on a typical weekday. I also collected data that measured a survey respondent’s

religiousness and how close they are with their biological mother. Both of those were measured

on a scale of 1 to 100 (1 is low and 100 is high). The descriptive statistics for each of the

continuous variables, including stress, closeness with mother, time studying, and strength of

religious beliefs can be found in Appendix C.

For the analysis of this data, I used several tests, all performed by the Statistix computer

package. To measure the effects of gender and preferred area of study on stress level, I used a

two-factor ANOVA with interaction. The factors were gender and preferred area of study, and

the dependent variable was stress level. My null hypothesis was that there was no interaction

between gender and preferred area of study variables, and my alternative hypothesis was that

there was interaction between the aforementioned factors.

I used a simple linear regression to measure the relationship between hours spent

studying and stress level. My null hypothesis was that there would not be a meaningful linear

relationship between hours spent studying and stress level, and my alternative hypothesis was

that there was a meaningful linear relationship between the two variables.

To measure the relationship between how religious someone is, how close they are with

their mother, and stress level, I used a multiple regression. The null hypothesis in this case was

that there would be no significant relationship between the variables. The alternative hypothesis

was that there would be a significant relationship between the variables.

Because I was curious, I also performed a chi-squared test for independence between

gender and preferred area of study. This test’s null hypothesis was that there was no association

between gender and preferred area of study, and the alternative hypothesis was that there was an

association between the variables.

For all the above tests, the threshold for significance was α = 0.05. The results and

conclusion for each statistical test are below.

The first test I ran was the two-way ANOVA, measuring the effects of preferred area of

study and gender on stress. The results for this test are displayed in Appendix D. As shown, the

interaction of the area of study (A_STUDY) and gender (GENDER) variables had a p-value of

0.9861, which is not statistically significant. Therefore, we fail to reject the null hypothesis

that there was no interaction between the variables. Because the interaction was not significant,

I then ran one-way ANOVA tests on each of the main effect variables. I concluded that preferred

area of study on its own did not have a significant effect on stress level (p=0.2103, Appendix E).

Gender, on its own, did have a significant effect on stress level (p<0.001, Appendix F). I performed post-hoc

analysis using the LSD multiple comparisons method on this test (Appendix G), which indicated

that females had significantly higher levels of stress than males. On average, females were about

16 points more stressed than males (Appendix G).
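
As a rough illustration of the calculation behind a one-way ANOVA, the following Python sketch computes the F statistic from between-group and within-group sums of squares. The function name and the `female` and `male` scores are made up for illustration and are not my survey responses; the actual analysis was run in Statistix, not Python. Turning F into a p-value requires the F distribution, which this sketch leaves out.

```python
def one_way_anova_f(groups):
    """Return (f_statistic, df_between, df_within) for a list of samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    # Between-group sum of squares: group size times the squared distance
    # of each group mean from the grand mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared distance of each value from its group mean
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return f_statistic, df_between, df_within

# Hypothetical stress scores on the 1 to 100 scale (NOT the survey data)
female = [70, 65, 80, 75]
male = [55, 60, 50, 63]

f_statistic, df_between, df_within = one_way_anova_f([female, male])
print(f"F({df_between}, {df_within}) = {f_statistic:.3f}")
```

A large F means the group means differ by more than the spread within each group would suggest.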

The next test I ran was a simple linear regression to measure the association between

hours spent per day studying (STUDY) and stress level (STRESS). The results for this test are

displayed in Appendix H. The test concluded that these variables do not have a statistically

significant linear relationship, and therefore we fail to reject the aforementioned null hypothesis

(p=0.5549). According to the R² value for this test (Appendix H), 0.12% of the variation in stress

can be explained by its relationship with the time spent studying variable. This R² value is quite

small, further indicating that the relationship is not meaningful.
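
To show where the slope and the R² value in a simple linear regression come from, here is a minimal Python sketch that fits a least-squares line by hand. The function name and the `study_hours` and `stress` values are hypothetical and are not my survey data; the actual regression was run in Statistix.

```python
def simple_linear_regression(x, y):
    """Return (slope, intercept, r_squared) for the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 is the share of the variation in y explained by the fitted line
    r_squared = (sxy ** 2) / (sxx * syy)
    return slope, intercept, r_squared

# Hypothetical data (NOT the survey responses)
study_hours = [1, 2, 3, 4, 5]
stress = [52, 48, 61, 55, 59]

slope, intercept, r_squared = simple_linear_regression(study_hours, stress)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r_squared:.4f}")
```

Multiplying R² by 100 gives the percentage of variation explained, which is how an R² of 0.0012 becomes 0.12%.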

To measure the correlation between religiousness, closeness with mother, and stress

level, I performed a multiple regression. The results of this test are displayed in Appendix I.

According to the p-values listed in Appendix I, we fail to reject the null hypothesis, as no p-value

was below 0.05. The R² value for this regression was 0.0066, meaning that 0.66% of the variance

in stress level could be explained by the relationship with the level of religiousness (RELIGION)

variable and closeness with mother (MOTHER) variable. Overall, this relationship is not

statistically significant.
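
For a regression with two predictors, the coefficients and R² can be sketched in Python by solving the normal equations directly. This is only an illustration under stated assumptions: the function name and the `religion`, `mother`, and `stress` values are hypothetical, not my survey data, and the actual analysis was run in Statistix.

```python
def two_predictor_regression(x1, x2, y):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2; returns (b0, b1, b2, r_squared)."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Centered sums of squares and cross-products
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    syy = sum((c - my) ** 2 for c in y)
    # Solve the 2x2 normal equations for the two slopes
    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    # R^2 = explained sum of squares / total sum of squares
    r_squared = (b1 * s1y + b2 * s2y) / syy
    return b0, b1, b2, r_squared

# Hypothetical scores on the 1 to 100 scales (NOT the survey data)
religion = [10, 40, 60, 80, 25]
mother = [90, 70, 85, 60, 75]
stress = [65, 50, 70, 55, 60]

b0, b1, b2, r_squared = two_predictor_regression(religion, mother, stress)
print(f"b0={b0:.2f}, b1={b1:.3f}, b2={b2:.3f}, R^2={r_squared:.4f}")
```

The sketch assumes the two predictors are not perfectly collinear; otherwise `det` is zero and the slopes are undefined.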

The final test I performed was a chi-squared test of independence analyzing the

relationship between gender and preferred area of study, as seen in Appendix J. The results

indicate that, with a p-value of 0.5210, there is no evidence of an association between gender and

preferred area of study in my data set. In other words, the data do not show that a person’s gender

is related to their preferred area of study.
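
The chi-squared test of independence compares observed counts with the counts expected if the two variables were unrelated. The Python sketch below shows that calculation for a 2 × 3 table like gender by area of study. The function name and the counts are made up for illustration and are not my survey data; the actual test was run in Statistix.

```python
import math

def chi_squared_independence(table):
    """Return (statistic, degrees_of_freedom, expected_counts) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count under independence: row total * column total / grand total
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected

# Hypothetical counts (NOT the survey data): rows = gender, columns = area of study
observed = [
    [20, 30, 50],
    [30, 30, 40],
]
statistic, dof, expected = chi_squared_independence(observed)

# With exactly 2 degrees of freedom, the chi-squared upper-tail probability
# has the closed form exp(-statistic / 2); other values of dof need a
# statistics library or a table.
p_value = math.exp(-statistic / 2) if dof == 2 else None
print(f"chi2={statistic:.4f}, dof={dof}, p={p_value:.4f}")
```

The `expected` table is also what one would inspect to check the common rule of thumb that expected counts should be at least 5.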

Before I started this project, I was hoping that more of my results would end up being

statistically significant. However, I am not surprised that they turned out the way they did, as my

data set was subject to voluntary response bias and my original Google Form only went out to a select

group of people. That said, I did get 289 responses, and I did not have any

problems with expected values being too low or not having enough data to perform the relevant

tests. I find it very interesting that, according to my ANOVA analysis, males and

females have statistically significant differences in stress levels, but I do not have the data at this

time to determine the cause of that difference. I am also curious as to what data sets other than

my own say about how someone’s relationship with religion and with their mother affects stress level,

but I do not have that data at this time. Overall, while the majority of my results were not

statistically significant, I am glad I was able to collect my own data and analyze it using multiple

methods.

Appendix A: Descriptive Statistics for Gender Variable (GENDER)

Appendix B: Descriptive Statistics for Preferred Area of Study Variable (A_STUDY)

Appendix C: Descriptive Statistics for Continuous Variables where:


STRESS = stress level
RELIGION = strength of religious beliefs
STUDY = hours spent studying per day
MOTHER = closeness of relationship with mother

Appendix D: Factorial Design ANOVA



Appendix E: One-Way ANOVA for Area of Study and Stress

Appendix F: One-Way ANOVA for Gender and Stress

Appendix G: LSD Post-Hoc Analysis for ANOVA with Gender and Stress

Appendix H: Simple Linear Regression for Stress Level vs. Hours Spent Studying

Appendix I: Multiple Linear Regression for Stress Level vs. Level of Religious Affiliation and
Closeness with Mother

Appendix J: Chi-Square Test of Independence for Gender and Area of Study
