
INFO 4652/5652

Statistical Programming in R
Fall 2023
Project Report

Report Prepared By:


Chloe Smith
Student ID: 110078256
Understanding the Link between Physical Activity and Mental Health:
Statistical Analysis
Date Submitted: December 14th, 2023
Table of Contents
1. Introduction
1.1 Overall Context for the Report
1.2 Problem Definition
1.3 Project Motivation
2. Methodology
3. Data Source
4. Data Analyses
5. Exploratory Data Analysis
5.1 Univariate Analysis
5.2 Bivariate Analysis
6. Statistical Model Design
7. Key Insights/findings and Statistical Model
8. Potential real-world applications of the project
9. Limitations of Project Work
10. Conclusion
11. Works Cited

1. Introduction

1.1 Overall Context for the Report

This comprehensive report explores the relationship between physical activity and mental
health. The purpose of this report is to document both statistical data modeling and the
techniques used during the exploratory data analysis to show this relationship.

Awareness of the relationship between exercise and mental health has been increasing,
alongside wider societal recognition of mental health itself. Mental health is a multifaceted
subject that includes a person's emotional, psychological, and social well-being. It can affect a
person's daily living, relationships, and how they think, feel, and act (SAMHSA).

Statistics presented by the National Alliance on Mental Illness indicate that
approximately 43.8 million adults experience mental illness each year (The National
Institute of Mental Health). Given this scale, effectively managing mental health is pivotal in
determining how individuals handle stress, relate to others, and navigate life's choices. As this
issue becomes more prevalent, exploring how exercise connects with mental well-being is
increasingly important. If the two are correlated, exercise may be a practical way to promote
healthier lifestyles.

1.2 Problem Definition

The aim of this paper is to address the need for a comprehensive understanding of how
exercise relates to mental health outcomes. While there is growing acknowledgment that this
relationship is significant, there remains a gap in our knowledge of it. It is paramount that
individuals have access to correct information to make informed decisions about their
health. The challenge lies in addressing these gaps to inform evidence-based strategies and
recommendations that can help individuals and clarify the therapeutic potential of exercise
for mental health. Without this information, individuals struggle to make informed decisions to
improve their mental health or to understand the underlying factors contributing to their
mental well-being.

Mental health treatment is an approach aimed at addressing and alleviating an
individual's mental health issues, enabling them to lead self-fulfilling lives. Counseling is a
popular way that people seek to heal and manage their mental health challenges. Statistics show
that in “2021, around 41.7 million adults in the United States received treatment or counseling
for their mental health within the past year” (Vankar). It is clear that people are looking for
effective ways to help heal and foster a healthier life. If we can substantiate the significance that
physical activity holds for mental well-being, we would be able to integrate physical activity into
mental health care, which adds another dimension to mental health treatment.

1.3 Project Motivation

Many people may not fully understand how widespread mental health issues are
within the United States. At some point in their lifetime, almost every individual
will encounter challenges posed by a mental health disorder, either personally or within
their family. These are the hard facts relating to mental health:

● One in five American adults experienced a mental health condition in a given year.
(SAMHSA)
● One in six young people have experienced a major depressive episode. (SAMHSA)
● Suicide is a leading cause of death in the United States, ranking as the
second leading cause of death among individuals aged 10-24. In 2020, suicide resulted in
the loss of more than 45,979 American lives, nearly twice the number of lives claimed by
homicide. (SAMHSA)

These facts emphasize the significance of gaining a comprehensive understanding of
mental well-being in our society due to the prevalence of mental health issues. Recognizing the
relationship between physical activity and mental health can help individuals make informed
decisions to benefit their mental health or understand the underlying causes contributing to their
mental well-being.

2. Methodology
For this in-depth analysis, I have chosen the R programming language, which is tailored
for statistical computing and data analysis. The first step in our analysis is data preparation,
which begins with loading our data into R. Because the data we are using, analytic_2023.csv, is
a CSV file, the function we will use is read.csv(). During this step we also check details like
missing values and outliers to keep the data reliable, which ensures a smooth transition to the
next analysis stages.
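
The loading step described above can be sketched as follows. This is a minimal illustration, not the exact script used for this report: the small CSV written here contains made-up values (so the example runs without analytic_2023.csv), the FIPS column name is an assumption, and keeping FIPS as text is an illustrative choice.

```r
# Minimal sketch: load a CSV with read.csv() and drop rows with missing values.
# The tiny file written here is made-up illustration data, not analytic_2023.csv.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "FIPS,Frequent.Mental.Distress.raw.value,Physical.Inactivity.raw.value",
  "01001,0.16,0.27",
  "01003,0.15,0.24",
  "01005,,0.33",
  "01007,0.18,"
), tmp)

# Reading FIPS as character keeps the leading zeros in the county code
health <- read.csv(tmp, colClasses = c(FIPS = "character"))

# Count missing values per column before deciding how to handle them
missing_counts <- colSums(is.na(health))
print(missing_counts)

# Keep only the rows where both variables of interest are present
vars <- c("Frequent.Mental.Distress.raw.value", "Physical.Inactivity.raw.value")
health_clean <- health[complete.cases(health[, vars]), ]
print(nrow(health_clean))
```

With the real file, the writeLines() step would be dropped and the path passed to read.csv() would be that of analytic_2023.csv.
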
In the next stages, we will use three key tools within R: a ggplot2 scatter plot,
Pearson’s Product-Moment Correlation, and linear regression analysis. The scatter plot gives a
visual of how the two variables tend to “move” together and whether they may be correlated.
We select our outcome and predictor variables based on the theoretical framework, so that the
hypotheses we form from theoretical considerations are the ones tested through regression
analysis and Pearson's correlation. Next, we establish a significance level (alpha) for our
hypothesis testing. After running the linear regression and Pearson correlation functions, we
interpret the resulting p-values and correlation coefficients to understand the relationship
between the two variables.

3. Data Source
To conduct this research and understand the relationship between physical activity and
mental health, I used data from the University of Wisconsin Population Health Institute. The
University of Wisconsin Population Health Institute aims to “advance health and well-being by
developing and evaluating interventions and promoting evidence-based approaches to policy and
practice at the local, state, and national levels” (CHR&R). The University of Wisconsin
Population Health Institute's County Health Rankings & Roadmaps program is a vital data
source. It plays a role in raising awareness of the factors that impact health and works towards
improving health equity. The program makes its national rankings data and documentation
available to the public to view and use. Through this, I was able to obtain the analytic data for 2023.
The analysis focuses on two key variables within this data. The first variable,
"Frequent Mental Distress raw value," measures the “percentage of adults reporting 14 or more
days of poor mental health per month (age-adjusted)” (CHR&R). This variable provides insights
into chronic and likely severe mental health issues. The second variable, "Physical Inactivity raw
value," measures the “percentage of adults aged 18 and over reporting no leisure-time physical
activity (age-adjusted)” (CHR&R). Each data point in the analysis is the percentage of adults in
one county in the USA. Each county is uniquely identified by a FIPS (Federal Information
Processing Standards) code. By using the percentage of adults for both variables and linking it
to the FIPS code, we can examine the relationship between physical activity and mental health
across counties.

4. Data Analyses
The summary statistics presented in Table 4.1 provide valuable insights into the
distribution and characteristics of these two variables. It is important to note that each value
is the proportion of adults within the respective county in the USA (for example, 0.1571
corresponds to 15.71%). For the variable that represents frequent mental distress, values range
from 0.0830 to 0.2330, with an average (mean) of 0.1571. The interquartile range (IQR)
between the first and third quartiles shows that the middle 50% of observations fall within a
narrow range. Similarly, values span from 0.1020 to 0.4720 for the variable representing
physical inactivity, with a mean of 0.2566, and its IQR shows the variability in physical
inactivity levels. Both variables have two missing values, which require careful consideration
within the analyses. Because there are only two missing values in a total of 3,194
observations, we will remove the two rows without data in them. These statistics lay the
groundwork for a more comprehensive understanding of the relationships between frequent
mental distress and physical inactivity.

Table 4.1 Summary statistics for “Frequent Mental Distress (raw value)” and “Physical Inactivity (raw
value)”
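
A table like Table 4.1 can be produced with summary(). The sketch below is only an illustration under stated assumptions: the data frame is simulated (uniform draws over ranges similar to those reported above, with one missing value per column), so its output will not match Table 4.1.

```r
# Minimal sketch: summary statistics for two proportion variables.
set.seed(1)

# Simulated stand-in for the county data (made-up values, not CHR&R data)
health <- data.frame(
  Frequent.Mental.Distress.raw.value = c(runif(50, 0.08, 0.23), NA),
  Physical.Inactivity.raw.value      = c(runif(50, 0.10, 0.47), NA)
)

# summary() reports min, quartiles, median, mean, max, and the NA count
print(summary(health))

# IQR: the width of the middle 50% of observations (Q3 minus Q1)
fmd_iqr <- IQR(health$Frequent.Mental.Distress.raw.value, na.rm = TRUE)
print(fmd_iqr)
```
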

5. Exploratory Data Analysis

5.1 Univariate Analysis


In examining Figures 5.1.1 and 5.1.2, you can observe that the values of each variable
have a reasonably even spread. In Figure 5.1.1, the frequencies are relatively uniform, indicating a
balanced representation of different levels of frequent mental distress within the dataset.
Likewise, in Figure 5.1.2, the distribution appears relatively even, although there is a slight right
skewness. Knowing how the variables are distributed is important because it helps confirm that
the dataset is representative and that statistical tests and models are not biased towards specific
ranges or levels of the variables.

Figure 5.1.1 & 5.1.2 Histogram of the distribution of Frequent Mental Distress and Physical Inactivity
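
As a rough illustration of this kind of univariate check, the sketch below draws a histogram with base R's hist() and compares the mean with the median as a quick hint of skew. The values are simulated from a beta distribution chosen only for illustration, and the report does not state which function produced Figures 5.1.1 and 5.1.2, so hist() is an assumption here.

```r
# Minimal sketch: histogram and a quick skewness hint for one variable.
set.seed(2)

# Simulated proportions standing in for Physical Inactivity (not CHR&R values)
inactivity <- rbeta(500, 8, 22)

# Draw to a null device so the sketch runs without a display;
# remove the pdf(NULL) and dev.off() lines to see the plot interactively
pdf(NULL)
h <- hist(inactivity, breaks = 20,
          main = "Physical Inactivity", xlab = "Proportion of adults")
invisible(dev.off())

# A mean noticeably above the median is a rough sign of right skew
skew_hint <- mean(inactivity) - median(inactivity)
print(skew_hint)
```
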

5.2 Bivariate Analysis


A ggplot visualization shows how the variables move with each other. The plot in
Table 5.2.1 illustrates a coherent movement of the variables “Frequent Mental Distress” and
“Physical Inactivity” and suggests a potential positive correlation between the two. As
“Frequent Mental Distress” increases in value, there appears to be a corresponding trend of
values increasing for “Physical Inactivity.” While this visual plot provides an initial indication,
we can quantify the strength and significance of this correlation through statistical modeling.

Table 5.2.1: Relationship Between "Frequent Mental Distress" and "Physical Inactivity"
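
A scatter plot of this kind can be sketched with ggplot2 as shown below. The data are simulated, and placing Physical Inactivity on the x-axis with a fitted line added is an assumption for illustration; the original plot may have been laid out differently. The plotting code is skipped if ggplot2 is not installed.

```r
# Minimal sketch: scatter plot of two variables with a linear trend line.
set.seed(3)
n <- 200

# Simulated county-level proportions (made-up, not CHR&R data)
inactivity <- runif(n, 0.10, 0.47)
distress   <- 0.09 + 0.27 * inactivity + rnorm(n, sd = 0.015)
health <- data.frame(
  Physical.Inactivity.raw.value      = inactivity,
  Frequent.Mental.Distress.raw.value = distress
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(health, aes(x = Physical.Inactivity.raw.value,
                          y = Frequent.Mental.Distress.raw.value)) +
    geom_point(alpha = 0.4) +                                  # one point per county
    geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +  # linear trend line
    labs(x = "Physical Inactivity (raw value)",
         y = "Frequent Mental Distress (raw value)")
  # print(p) draws the plot in an interactive session
}
```
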

6. Statistical Model Design


The first step in the statistical analysis is to formulate our hypotheses to provide a clear
direction for our investigation. Two key variables are used in creating our hypotheses: the
Dependent Variable, Frequent Mental Distress raw value, and the Independent Variable, Physical
Inactivity raw value. Our null and alternative hypotheses establish a foundation for
understanding the potential relationship between these variables.

Null Hypothesis (H0): There is no statistically significant relationship between Frequent
Mental Distress and Physical Inactivity. ρ = 0

Alternative Hypothesis (H1): There is a statistically significant relationship between Frequent
Mental Distress and Physical Inactivity. ρ ≠ 0

The alternative hypothesis is two-sided: it is supported if there is a noteworthy correlation,
positive or negative, between Frequent Mental Distress and Physical Inactivity.

Our next step is to choose a significance level (alpha) for hypothesis testing. A chosen
significance level (α) of 5% (0.05) signifies our willingness to accept a 5% chance of rejecting a
true null hypothesis, known as a Type I error.
For model selection, we use a simple linear regression. This choice aligns
with the nature of our variables and addresses our research question. A simple linear regression
is appropriate when exploring the relationship between two continuous variables (our Dependent
Variable, Frequent Mental Distress raw value, and our Independent Variable, Physical Inactivity
raw value). This model describes how the dependent variable changes as the independent
variable changes.
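
The model design above can be sketched in R with lm(). This is an illustration under stated assumptions: the data are simulated with a positive slope built in, so the estimates printed here are not the report's results; only the column names and the alpha of 0.05 come from the text.

```r
# Minimal sketch: simple linear regression and a test of the slope at alpha = 0.05.
set.seed(4)
n <- 300

# Simulated proportions (made-up, not CHR&R data)
x <- runif(n, 0.10, 0.47)
y <- 0.09 + 0.27 * x + rnorm(n, sd = 0.02)
health <- data.frame(
  Physical.Inactivity.raw.value      = x,
  Frequent.Mental.Distress.raw.value = y
)

alpha <- 0.05  # significance level chosen before fitting

# Dependent variable on the left of ~, independent variable on the right
fit <- lm(Frequent.Mental.Distress.raw.value ~ Physical.Inactivity.raw.value,
          data = health)

# Coefficient table: estimate, standard error, t value, p-value
coefs <- summary(fit)$coefficients
print(coefs)

# Reject H0 (no relationship) when the slope's p-value is below alpha
slope_p <- coefs["Physical.Inactivity.raw.value", "Pr(>|t|)"]
reject_null <- slope_p < alpha
print(reject_null)
```
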

7. Key Insights/findings and Statistical Model


Starting with the regression model shown in Table 7.1, the coefficient estimates show
that the intercept (β₀) is highly significant. The predicted value of the dependent variable
(Frequent Mental Distress) when the independent variable (Physical Inactivity) is set to zero
is 0.086518.

Moving on to the coefficient for Physical Inactivity (β₁), it is also highly significant and
positive, indicating a positive association with Frequent Mental Distress. Frequent Mental
Distress is expected to increase by about 0.275 units for every one-unit increase in
Physical Inactivity. Because both variables are proportions, this means that a 10 percentage
point increase in Physical Inactivity is associated with an increase of about 2.75 percentage
points in Frequent Mental Distress.

Our model equation using the coefficients we estimated states:

Frequent.Mental.Distress.raw.value = 0.086518 + 0.275017 × Physical.Inactivity.raw.value

This model suggests that there is a statistically significant positive relationship between
physical inactivity and frequent mental distress. As the level of physical inactivity increases, the
predicted level of frequent mental distress also increases.
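
The model equation can be applied directly to see what it predicts. The sketch below uses only the coefficients reported above; the helper name predict_distress is hypothetical and introduced for illustration. At the mean Physical Inactivity of 0.2566 from Section 4, the prediction rounds to 0.1571, which matches the reported mean of Frequent Mental Distress, as expected for a least-squares line.

```r
# Minimal sketch: applying the reported model equation by hand.
b0 <- 0.086518  # intercept reported in the model equation
b1 <- 0.275017  # slope for Physical Inactivity reported in the model equation

# Hypothetical helper: predicted Frequent Mental Distress for a given inactivity
predict_distress <- function(inactivity) b0 + b1 * inactivity

# Prediction at the mean Physical Inactivity reported in Section 4
at_mean <- predict_distress(0.2566)
print(round(at_mean, 4))

# Effect of a 10 percentage point increase in Physical Inactivity
delta <- predict_distress(0.35) - predict_distress(0.25)
print(round(delta, 4))
```
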

The R² value of 0.4849 indicates that the model explains approximately 48.49% of the
variance in Frequent Mental Distress.

The residuals appear normally distributed and have constant variance based on examining
the residual plot and summary statistics.

The results of the statistical analysis in Table 7.1 show a p-value of < 2.2e-16, far
below the chosen significance level of alpha = 0.05. We therefore reject the null hypothesis
in favor of the alternative. The analysis strongly supports a statistically significant
relationship between Frequent Mental Distress and Physical Inactivity.

Pearson's product-moment correlation test in Table 7.2 provides further evidence of a
strong positive correlation between Frequent Mental Distress and Physical Inactivity. The
correlation coefficient of 0.6963467 indicates a strong positive linear relationship. The 95%
confidence interval has a lower limit of 0.681037, so even the lower bound of the plausible
range for the true correlation indicates a strong positive relationship.
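
The correlation test can be sketched with cor.test(). The data below are simulated, so the printed estimate and interval are not those of Table 7.2. The last lines check a useful identity: in a simple linear regression, R² equals the square of Pearson's r, and the reported values agree with this (0.6963467 squared rounds to 0.4849).

```r
# Minimal sketch: Pearson's correlation test and its link to R-squared.
set.seed(5)
n <- 300

# Simulated proportions (made-up, not CHR&R data)
x <- runif(n, 0.10, 0.47)
y <- 0.09 + 0.27 * x + rnorm(n, sd = 0.03)

# Two-sided test of H0: rho = 0, with a 95% confidence interval
ct <- cor.test(x, y, method = "pearson", conf.level = 0.95)
print(ct$estimate)   # sample correlation r
print(ct$conf.int)   # lower and upper limits of the 95% interval
print(ct$p.value)

# In simple linear regression, R-squared is the square of r
fit <- lm(y ~ x)
r_squared <- summary(fit)$r.squared
print(all.equal(unname(ct$estimate)^2, r_squared))

# The same check on the values reported in this section
print(round(0.6963467^2, 4))
```
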

Both the linear regression analysis (Table 7.1) and Pearson's correlation test (Table 7.2)
consistently support that there is a statistically significant positive relationship between Physical
Inactivity and Frequent Mental Distress. The evidence from both analyses strengthens the
confidence in the observed correlation.

Table 7.1 Linear regression model of Frequent Mental Distress raw value and Physical Inactivity raw
value

Table 7.2 Pearson’s Product-Moment Correlation Test between Frequent Mental Distress raw value and
Physical Inactivity raw value

8. Potential real-world applications of the project


Given the strong correlation and significant positive relationship between mental distress
and physical inactivity, you can start looking at the potential real-world applications of these
findings.
First, it is important to look into improving mental health treatment. As
stated earlier, “In 2021, around 41.7 million adults in the United States received treatment or
counseling for their mental health within the past year” (Vankar). To improve the treatment that
people are already seeking, healthcare providers can explore adding physical activity to mental
health treatment plans to enhance therapy outcomes.
Another way to apply this is by developing targeted wellness programs and educational
campaigns for public awareness. Employers and wellness programs can run initiatives that
target the mental well-being of individuals by raising awareness of the benefits of regular
physical activity while encouraging and supporting it.
Lastly, given the positive correlation, individuals can start making informed
decisions about incorporating physical activity into their daily lives to promote mental well-being.

9. Limitations of Project Work

It is essential to acknowledge and address limitations in the study, such as the sample
size. This study is limited to 3,194 observations, a relatively small sample size. This may impact
the generalizability of our findings to broader populations. It is also important to note that
while the correlation between these two variables is significant, correlation does not imply
causation. Further research that considers other potential variables is needed to establish
whether a cause-and-effect relationship exists. Lastly, there can be response
bias within the Frequent Mental Distress variable. Since this variable relies on self-reported data,
individuals may provide responses influenced by social desirability, subjective interpretation, or
personal perceptions.

10. Conclusion
In conclusion, this report aimed to analyze and understand the relationship between
physical activity and mental health, specifically focusing on the variables presented in our data,
Frequent Mental Distress and Physical Inactivity. The statistical analyses conducted in the R
programming language, linear regression analysis and Pearson's
Product-Moment Correlation, revealed significant findings.
These statistical models highlighted the strong positive correlation/relationship between
Physical Inactivity and Frequent Mental Distress. Because of this, we can think about real-world
applications, including improving mental health treatment plans, developing targeted wellness
programs, and creating overall awareness. However, it is important to acknowledge the
limitations, such as the small sample size.
In summary, this study provided insights into the connection between physical activity
and mental health, presented applications for healthcare and public awareness, and explained
why we are motivated to understand this relationship.

11. Works Cited

CHR&R. “About Us.” County Health Rankings, 2023, https://www.countyhealthrankings.org/about-us.
Accessed 10 December 2023.

CHR&R. “Frequent Mental Distress*.” County Health Rankings, 2023,
https://www.countyhealthrankings.org/explore-health-rankings/county-health-rankings-model/health-outcomes/quality-of-life/frequent-mental-distress?year=2023.
Accessed 10 December 2023.

CHR&R. “Physical Inactivity.” County Health Rankings, 2023,
https://www.countyhealthrankings.org/explore-health-rankings/county-health-rankings-model/health-factors/health-behaviors/diet-and-exercise/physical-inactivity?year=2023.
Accessed 10 December 2023.

The National Institute of Mental Health. “Mental Health Facts in America.” NAMI, 2023,
https://www.nami.org/nami/media/nami-media/infographics/generalmhfacts.pdf.
Accessed 10 December 2023.

SAMHSA. “Mental Health Myths and Facts.” SAMHSA, 24 April 2023,
https://www.samhsa.gov/mental-health/myths-and-facts. Accessed 10 December 2023.

SAMHSA. “What is Mental Health?” SAMHSA, 24 April 2023, https://www.samhsa.gov/mental-health.
Accessed 10 December 2023.

Vankar, Preeti. “Mental health treatment or therapy among American adults 2002-2021.” Statista,
29 November 2023,
https://www.statista.com/statistics/794027/mental-health-treatment-counseling-past-year-us-adults/.
Accessed 10 December 2023.
