PSY417: Research Methods and Practice

Week 12 – Introduction to Machine Learning for Psychology


Faculty of Health
Dr. Rebecca Williams
Semester 1, 2023
| Week # (Date Beginning) | Topic | Readings (textbook) | Assessments Due |
|---|---|---|---|
| Week 1 (06/03/2023) | Introduction to the course and review of univariate statistics | Chapters 1 & 2 | |
| Week 2 (13/03/2023) | Non-parametric models | Chapters 6 & 7 | |
| Week 3 (20/03/2023) | Logistic regression | Chapter 20 | Online quiz (opens 22/03/2023 at 5:00 PM ACST and closes 24/03/2023 at 5:00 PM) |
| Week 4 (27/03/2023) | Moderation and mediation | Chapter 11 | |
| Week 5 (03/04/2023) | Advanced ANOVA: repeated measures and mixed designs | Chapters 15 & 16 | |
| (10/04/2023) | Semester Break | | |
| Week 6 (17/04/2023) | ANCOVA and MANOVA | Chapters 13 & 17 | |
| Week 7 (24/04/2023) | Oral Presentations | | Oral presentation (in class, 15 minutes) |
| Week 9 (08/05/2023) | Factor analysis and principal component analysis | Chapter 18 | |
| Week 10 (15/05/2023) | Structural Equation Modelling | | |
| Week 11 (22/05/2023) | Synthesizing statistics / Reading and writing scientific reports | | |
| Week 12 (29/05/2023) | Machine learning for psychology | | Report due (1500 words, 02/06/2023 by 5:00 PM ACST) |
| (05/06/2023) | Revision Week | | |
| (12/06/2023) | Centrally organised examination period | | Final online exam (opens 14/06/2023 at 5:00 PM ACST and closes 16/06/2023 at 5:00 PM) |
Today’s Lecture Outline

• What is machine learning (ML)?

• How to interpret ML jargon

• Application of ML in psychology

What is machine learning (ML)?
“A set of methods that can automatically detect patterns in data, and then use
the patterns to predict future data.”
(Rosenbusch et al., 2021).

• Regression is traditionally used to determine the statistical significance of predictors of outcome variables.

• The key distinction is that ML can use regression to predict future data.

• ML takes statistical approaches from ‘description’ of phenomena to individualized application.
“What’s going to happen over the next decade, just as a consequence of having
more data, is that machine-learning systems are going to be able to pull out more
insights than the humans who were thinking about those data may be able to generate.”

Tom Griffiths, a professor of psychology and computer science at Princeton University.


“We argue that psychology’s near-total focus on explaining the causes of
behavior has led much of the field to be populated by research
programs that provide intricate theories of psychological mechanism,
but that have little (or unknown) ability to predict future behaviors with
any appreciable accuracy.

We propose that principles and techniques from the field of machine learning can help psychology become a more predictive science.”

(Yarkoni & Westfall, 2017).


Psychology to ML
Common research questions in psychology can easily become questions that utilize ML methods if:

1. There is a focus on prediction, and

2. There is a large enough sample size to enable accurate prediction.
ML can be broadly dichotomized into two types:

1. Supervised machine learning

2. Unsupervised machine learning

Source: Wiki Commons


Supervised ML is used for regression and
classification problems
[Figure: two panels. Left (regression): extraversion plotted against minutes spent playing online games each week. Right (classification): high extraversion (1) versus low extraversion (0) plotted against minutes spent playing online games each week.]
Factors common to both regression and
classification in ML

Supervised machine learning also has
predictor and outcome variables
Recall from Lecture 3: simple linear regression is used for predicting the value of a continuous outcome variable from a continuous predictor variable.

$Y = b_0 + b_1 X_1 + \text{error}$

• The intercept ($b_0$) is the value of extraversion when minutes = 0.

• The slope ($b_1$) is how much extraversion would change for every 1-minute change in time spent playing.

[Figure: fitted regression line of extraversion against minutes spent playing online games each week.]
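As a minimal sketch (not part of the lecture materials), the code below fits a simple linear regression of this form with scikit-learn; the data and variable names (minutes_played, extraversion) are made up for illustration.

```python
# Minimal sketch: fitting a simple linear regression with scikit-learn.
# The data and variable names are hypothetical.
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data: minutes spent playing online games each week (X)
# and extraversion scores (Y).
minutes_played = np.array([[30], [120], [300], [450], [600], [900]])
extraversion = np.array([4.1, 3.8, 3.5, 3.0, 2.7, 2.2])

model = LinearRegression().fit(minutes_played, extraversion)
print("Intercept (b0):", model.intercept_)   # extraversion when minutes = 0
print("Slope (b1):", model.coef_[0])         # change in extraversion per minute played
```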
However, the dataset is now separated into
training and testing sets
In supervised ML, the dataset is separated into a training set, used to estimate the parameter values (such as $b_0$ and $b_1$), and a testing set, on which the model is then tested for accuracy.

Typically, the training set is extracted and the parameters estimated numerous times, and the averages are used. This is called “k-fold cross-validation”.

The ‘k’ is the number of folds the data are split into; the model is trained and evaluated k times, each time holding out a different fold (it’s very similar to bootstrapping...).

[Figure: training and testing data points of extraversion against minutes spent playing online games each week.]
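The sketch below illustrates the train/test split and k-fold cross-validation with scikit-learn on simulated data; the variable names, test_size = 0.3, and k = 5 are illustrative choices, not values from the lecture.

```python
# Minimal sketch: train/test split and k-fold cross-validation with scikit-learn.
# Data and variable names are hypothetical (simulated).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split, cross_val_score

rng = np.random.default_rng(0)
minutes_played = rng.uniform(0, 900, size=(200, 1))                           # predictor
extraversion = 4.5 - 0.002 * minutes_played[:, 0] + rng.normal(0, 0.3, 200)   # outcome

# Hold out a testing set (here 30%) for the final accuracy check.
X_train, X_test, y_train, y_test = train_test_split(
    minutes_played, extraversion, test_size=0.3, random_state=0)

# k-fold cross-validation on the training set (here k = 5).
model = LinearRegression()
cv_scores = cross_val_score(model, X_train, y_train, cv=5, scoring="r2")
print("Mean cross-validated R^2:", cv_scores.mean())

# Fit on the full training set, then evaluate on the held-out testing set.
model.fit(X_train, y_train)
print("Testing-set R^2:", model.score(X_test, y_test))
```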
Think back to linear regression, where the
null hypothesis is either rejected or retained

Recall from Lecture 3: the model is evaluated to retain or reject the null hypothesis depending on whether it is significantly different from the mean line (the mean of Y).

[Figure: regression line compared with the mean line of extraversion, plotted against minutes spent playing online games each week.]
But in supervised machine learning, there is
no null hypothesis to retain or reject
• Rather, the model is evaluated in terms of how well it predicts each individual value of Y in the testing set.

• Each predicted value of Y is referred to as Y-hat ($\hat{Y}$), where $\hat{Y} = b_0 + b_1 X_1$ (a short sketch of generating $\hat{Y}$ follows below).

[Figure: predicted values of extraversion ($\hat{Y}$) along the fitted line, plotted against minutes spent playing online games each week.]
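A minimal sketch of generating $\hat{Y}$ for a testing set, again on simulated data with hypothetical variable names.

```python
# Minimal sketch: each testing-set case gets a predicted value, Y-hat.
# Data and variable names are hypothetical (simulated).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
X = rng.uniform(0, 900, size=(100, 1))                       # minutes played
y = 4.2 - 0.002 * X[:, 0] + rng.normal(0, 0.3, 100)          # extraversion

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=7)
model = LinearRegression().fit(X_train, y_train)

y_hat = model.predict(X_test)   # predicted extraversion (Y-hat) for each testing case
print(np.column_stack([y_test[:5], y_hat[:5]]))   # observed vs predicted, first five cases
```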
Some common approaches used to evaluate
the performance of the trained model

• These might be referred to as cost functions or performance metrics, and include the following (a computation sketch follows the list):
  • The coefficient of determination (R²)
  • Root mean square error (RMSE)
  • Mean square error (MSE)
  • Mean absolute error (MAE)
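Under the assumption of some hypothetical observed and predicted values, these metrics can be computed with scikit-learn as sketched below.

```python
# Minimal sketch: computing common regression performance metrics.
# y_test and y_hat are hypothetical observed and predicted outcome values.
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_test = np.array([3.9, 3.4, 2.9, 2.5, 2.1])   # observed extraversion (testing set)
y_hat = np.array([3.7, 3.5, 3.0, 2.4, 2.3])    # predicted extraversion (Y-hat)

mse = mean_squared_error(y_test, y_hat)        # mean square error
rmse = np.sqrt(mse)                            # root mean square error
mae = mean_absolute_error(y_test, y_hat)       # mean absolute error
r2 = r2_score(y_test, y_hat)                   # coefficient of determination

print(f"MSE = {mse:.3f}, RMSE = {rmse:.3f}, MAE = {mae:.3f}, R^2 = {r2:.3f}")
```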
Now just looking at ML regression...

In supervised machine learning for
regression problems, it’s not all about fitting
a line
• In supervised ML, the regression model may not be linear (although it can be).

• To capture more complex relationships between X and Y, flexible non-linear models can be implemented.

• Some common nonlinear regression models used in ML are (a random forest sketch follows the list):
  • Decision trees*
  • Neural networks
  • Random forests*

[Figure: a non-linear regression curve of extraversion against minutes spent playing online games each week.]

*May also be found in classification.
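As one illustration, the sketch below fits a random forest regressor to simulated, non-linear data; the data-generating function and settings are assumptions, not taken from the lecture or the textbook.

```python
# Minimal sketch: a non-linear regression model (random forest) in scikit-learn.
# The data are simulated and purely illustrative.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
X = rng.uniform(0, 900, size=(300, 1))                        # minutes played
y = 3 + np.sin(X[:, 0] / 150) + rng.normal(0, 0.2, 300)       # non-linear outcome

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)

forest = RandomForestRegressor(n_estimators=200, random_state=1)
forest.fit(X_train, y_train)
print("Testing-set R^2:", forest.score(X_test, y_test))
```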
One problem commonly encountered in ML
is overfitting
This is when the model fits the training data very well, but not the testing data.

It can be a problem for both linear and non-linear regression models.

There needs to be a compromise between model complexity (fitting the training set) and flexibility to generalize to new, unseen data (the testing set).

[Figure: an overfitted curve passing through every training point of extraversion against minutes spent playing online games each week.]
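A minimal sketch of what overfitting can look like in practice, using simulated data: an unconstrained decision tree fits the training set almost perfectly but typically predicts the testing set less well than a constrained one.

```python
# Minimal sketch: overfitting illustrated with decision trees on simulated data.
# Compare training-set and testing-set R^2 for a constrained vs an unconstrained tree.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(2)
X = rng.uniform(0, 900, size=(200, 1))                    # minutes played (simulated)
y = 4 - 0.002 * X[:, 0] + rng.normal(0, 0.3, 200)         # extraversion (simulated)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=2)

for depth in (2, None):  # a constrained tree vs an unconstrained (overfitting-prone) one
    tree = DecisionTreeRegressor(max_depth=depth, random_state=2).fit(X_train, y_train)
    print(f"max_depth={depth}:  train R^2={tree.score(X_train, y_train):.2f}  "
          f"test R^2={tree.score(X_test, y_test):.2f}")
```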
Different regression models are used to
address the issue of overfitting
Some regression techniques are often implemented to address the problem of overfitting. These include:

• Ridge regression
• Lasso regression
• Elastic net regression

These regression models are referred to as ‘shrinkage’ or ‘regularized’ regression (a short sketch follows below).

[Figure: a regularized regression line of extraversion against minutes spent playing online games each week.]
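A minimal sketch of the three shrinkage models in scikit-learn, assuming simulated data and illustrative (untuned) regularization strengths.

```python
# Minimal sketch: shrinkage (regularized) regression models in scikit-learn.
# Data are simulated; the alpha values are illustrative, not tuned.
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso, Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
X = rng.normal(size=(150, 5))                                    # five hypothetical predictors
y = 2.0 + 0.8 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(0, 0.5, 150)

models = {
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=0.1),
    "Elastic net": ElasticNet(alpha=0.1, l1_ratio=0.5),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name}: mean cross-validated R^2 = {scores.mean():.2f}")
```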
Now just looking at ML classification...

In supervised machine learning for
classification problems, it’s not all about
fitting a logit function
• In supervised ML, the classification model may not be logistic (although it can be).

• To capture more complex relationships between X and Y, flexible non-linear models are implemented.

• Some common nonlinear classification models used in ML are support vector machines (SVMs)* (a short sketch follows below).

[Figure: classification of high extraversion (1) versus low extraversion (0) against minutes spent playing online games each week.]

*May also be found in regression.
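A minimal sketch of an SVM classifier with a non-linear kernel, assuming simulated minutes-played data and made-up high/low extraversion labels.

```python
# Minimal sketch: a support vector machine classifier in scikit-learn.
# Simulated data: 1 = high extraversion, 0 = low extraversion (illustrative only).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(4)
minutes_played = rng.uniform(0, 900, size=(300, 1))
high_extraversion = (minutes_played[:, 0] < 400).astype(int)   # simulated labels
flip = rng.random(300) < 0.1                                   # flip some labels to add noise
high_extraversion[flip] = 1 - high_extraversion[flip]

X_train, X_test, y_train, y_test = train_test_split(
    minutes_played, high_extraversion, test_size=0.3, random_state=4)

svm = SVC(kernel="rbf")            # a non-linear (radial basis function) kernel
svm.fit(X_train, y_train)
print("Testing-set accuracy:", svm.score(X_test, y_test))
```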
Support vector machines are used for both
simple and complex classification

[Figure: examples of simple (linear) and more complex (non-linear) SVM decision boundaries. Source: analyticsvidhya.com]
Summary of machine learning
• The focus of research aims in studies using ML is prediction.

• Supervised ML is used for regression and classification problems. The broad steps are (see the end-to-end sketch below):
  • Separate the dataset into training and testing sets
  • Apply a learning algorithm to the training set to fit a model (i.e. find the parameters, such as $b_0$ and $b_1$ in linear regression)
  • Evaluate how well the model fits the testing set

• The ‘significant result’ is how well the model fits the data compared to other models (such as regular linear regression).
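A compact, hypothetical end-to-end sketch of these broad steps on simulated data; the candidate models, k = 10, and test_size = 0.3 are illustrative choices.

```python
# Minimal sketch: the broad supervised-ML steps in one short script.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score, train_test_split

rng = np.random.default_rng(5)
X = rng.normal(size=(250, 4))                                   # hypothetical predictors
y = 1.5 + X @ np.array([0.6, -0.4, 0.0, 0.2]) + rng.normal(0, 0.5, 250)

# 1. Separate the dataset into training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=5)

# 2. Apply learning algorithms to the training set (with k-fold cross-validation).
candidates = {"Linear regression": LinearRegression(), "Ridge": Ridge(alpha=1.0)}
for name, model in candidates.items():
    cv_r2 = cross_val_score(model, X_train, y_train, cv=10, scoring="r2").mean()
    # 3. Evaluate how well the fitted model predicts the testing set.
    test_r2 = model.fit(X_train, y_train).score(X_test, y_test)
    print(f"{name}: cross-validated R^2 = {cv_r2:.2f}, testing-set R^2 = {test_r2:.2f}")
```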
Today’s Lecture Outline

• What is machine learning (ML)?

• How to interpret ML jargon

• Application of ML in psychology

Introduction
• Problematic smartphone use (PSU) is overuse of a smartphone with functional
impairment.

• PSU severity is most widely studied in relation to depression, anxiety, stress and
low self-esteem.

• Recent research has shown that fear of missing out (FOMO) is associated with PSU.
• A fear of missing out on rewarding and pleasurable experiences, and a cognitive bias
related to one’s social resources.

• Another maladaptive cognitive mechanism associated with PSU is rumination
  • Frequent, negative self-referencing thoughts, typically about past events.
Study aims and hypotheses
• Understand PSU severity using established explanatory variables, but with novel statistical
methods.

• Supervised machine learning was implemented as it has been shown to outperform conventional statistics in prediction.

• Regression-based ML was used to model PSU symptom severity as a continuous variable.

Hypotheses:
H1: Depression and anxiety severity should be positively associated with PSU severity.
H2: FOMO and rumination should be positively associated with PSU severity.
H3: Machine learning procedures will produce an algorithm which can predict PSU severity.
Methods
Participants
1238 students were recruited to complete an online survey. 141 were excluded for careless responding. Among the 1097 remaining, mean age = 19.4 (±1.2) years and 18.1% were male.

Instruments
1. Demographics (age, sex)
2. Smartphone Addiction Scale-Short Version (SAS-SV): PSU severity
3. Depression Anxiety Stress Scale-21 (DASS-21)
4. Fear of missing out (FOMO) scale
5. Ruminative Response Scale (RRS)
Methods
Analysis
1. Data screened for missing/careless responses
2. Correlations between variables computed
3. Data split into training (70%) and testing sets
• k-fold cross-validation (k = 10) used
4. Data entered into 6 different ML models
• Predictors = age, sex, depression + anxiety, FOMO, rumination
• Outcome = PSU severity
5. The 6 ML models were
• Ridge regression
• Lasso regression
• Elastic net regression
• Random forest
• Support vector machine
• Extreme gradient boosting
6. ML models compared using RMSE, MAE, R², and statistical tests (a hypothetical comparison sketch follows below)
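The sketch below is not the authors' code; it is a hypothetical illustration of how such a multi-model comparison could be set up in scikit-learn with 10-fold cross-validation. Simulated data stand in for the survey measures, and extreme gradient boosting is omitted because it would typically require the separate xgboost package.

```python
# Hypothetical sketch of a multi-model comparison like the one described above.
# Simulated data stand in for the survey measures; this is not the study's code.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import ElasticNet, Lasso, Ridge
from sklearn.model_selection import KFold, cross_val_score
from sklearn.svm import SVR

rng = np.random.default_rng(6)
n = 1097
X = rng.normal(size=(n, 5))   # stand-ins for age, sex, depression+anxiety, FOMO, rumination
y = 10 + X @ np.array([0.2, 0.1, 0.3, 1.2, 0.4]) + rng.normal(0, 1.0, n)   # PSU severity

models = {
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=0.1),
    "Elastic net": ElasticNet(alpha=0.1, l1_ratio=0.5),
    "Random forest": RandomForestRegressor(n_estimators=200, random_state=6),
    "SVM": SVR(kernel="rbf"),
}

cv = KFold(n_splits=10, shuffle=True, random_state=6)   # k = 10, as in the study
for name, model in models.items():
    rmse = -cross_val_score(model, X, y, cv=cv, scoring="neg_root_mean_squared_error").mean()
    r2 = cross_val_score(model, X, y, cv=cv, scoring="r2").mean()
    print(f"{name}: RMSE = {rmse:.2f}, R^2 = {r2:.2f}")
```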
Results
ANOVAs showed sex differences in terms of:
• PSU severity (women higher scores)
• Depression, anxiety (women lower scores)
• Rumination (women lower scores)

Results

• Shrinkage regression techniques performed the best.
• FOMO was the largest predictor of PSU severity.

[Figure: model performance comparison (slide figure).]
Discussion
• Shrinkage regression models performed the best in explaining PSU severity.

• The degree of variance explained by these models is similar to recent papers implementing linear regression (without ML).

• H1 was unsupported (depression + anxiety did not contribute to PSU severity).

• H2 was partly supported: FOMO, but not rumination, conferred a relatively large contribution in modelling PSU severity.

• Future research could use supervised ML to classify people with a gaming disorder (now an ICD-11 diagnosis) based on similar predictor variables.
Congratulations on finishing PSY417!
You are now familiar with:
1. Basic statistics
2. Bias and non-parametric models
3. Linear and logistic regression
4. Moderation and mediation
5. Repeated-measures and mixed factorial ANOVA
6. ANCOVA and MANOVA
7. Factor analysis and PCA
8. Structural equation modelling
9. Systematic reviews and meta-analysis
10. Supervised machine learning
References
• Rosenbusch, H., Soldner, F., Evans, A. M., & Zeelenberg, M. (2021). Supervised machine learning methods in psychology: A practical introduction with annotated R code. Social and Personality Psychology Compass, 15(2), e12579.

• Yarkoni, T., & Westfall, J. (2017). Choosing prediction over explanation in psychology: Lessons from machine learning. Perspectives on Psychological Science, 12(6), 1100-1122.
