27/10/2021

Nicolaas Puts, PhD


Department of Forensic and Neurodevelopmental Sciences
Institute of Psychiatry, Psychology, and Neuroscience

MSc CNS: Module 3

Week 5; Topic 5
In terms of prep.
• No prior exams will be provided
• Use the practice quizzes and practice practicals
• Revision week is the end of December

• Ask questions! 
Resources
• King’s Academic Skills for Learning (KASL)
 https://keats.kcl.ac.uk/course/view.php?id=62576. 

Students can see workshops and they can book 1-2-1 study skills sessions with
PhD students that will support academic writing, scientific writing, academic reading
and statistics.
Learning outcomes
• To be familiar with the underlying assumptions of statistical tests and recognize when these assumptions
have been violated.
• To calculate relevant effect indices for comparing two samples when assumptions are violated.
• To understand how to use non-parametric testing methods to evaluate such indices in the underlying
population. 

• Hypothesis testing
• T-tests for continuous variables
• Chi-squared tests for categorical values

• Understanding their assumptions and what test to run (Non-parametric tests) when these are violated
• How to do this and identify this in SPSS
Look at your data!!
Topic 4 cheat slide
What can we conclude
This is Jack. Jack wants to know if the amount of candy children get differs between children who dress
up in a happy way and those who dress up in a scary way. What can we conclude?
                              Levene’s Test
                              F      Sig.   t      df     Sig.   Diff   Std. err
Equal variance assumed        4.16   0.06   1.45   60     0.01   1.36   0.10
Equal variance not assumed                  1.48   59.3   0.03   1.19   0.12

A. There is a significant difference in candy between a happy and scary costume (t = 1.45, df = 60, p = 0.01)
B. There is no significant difference in candy between a happy and scary costume (t = 1.45, df = 160, p = 0.06)
C. There is a significant difference in candy between a happy and scary costume (t = 1.48, df = 60, p = 0.03)
Levene’s test!
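
The reading rule behind this question can be written as a small sketch. The function name `row_to_read` and the 0.05 cut-off are assumptions made for illustration; the numbers are the ones in the table above.

```python
def row_to_read(levene_sig, alpha=0.05):
    """Pick which row of the SPSS independent-samples t-test table to report."""
    # Levene's test H0: the two groups have equal variances.
    # A non-significant result means we keep the 'equal variance assumed' row.
    if levene_sig < alpha:
        return "Equal variance not assumed"
    return "Equal variance assumed"


# Values copied from the table above: (t, df, p) for each row.
rows = {
    "Equal variance assumed": (1.45, 60, 0.01),
    "Equal variance not assumed": (1.48, 59.3, 0.03),
}

chosen = row_to_read(levene_sig=0.06)
t, df, p = rows[chosen]
print(f"{chosen}: t = {t}, df = {df}, p = {p}")
```

Levene's Sig. is 0.06 here, so the sketch selects the first row, which is the row the write-up should quote.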
Different tests for different answers

Is the proportion of male jedi higher than that of female jedi?

One sample 𝝌2 Test          Pearson 𝝌2 Test

         Jedi                        Jedi: Yes   Jedi: No
Female   30%                Female   25%         75%
Male     70%                Male     50%         50%

When assumptions do not hold, we expand the table

                 Normality assumed   Normality not assumed
Numerical data   Parametric          Non-parametric
Checking assumptions for the t-test and parameters

Assumptions
• The observations are randomly and independently drawn
• There are no outliers (within each group if appropriate)

• One sample: Symmetrical continuous data (approximately normally distributed)
• Two independent samples: Symmetrical continuous data (approximately normally distributed) within each
group
• Two paired samples: The difference is symmetrical and continuous

If data are normally distributed we can use the mean and standard deviation, which are parameters of the
normal distribution

When data are not normally distributed, we cannot use these parameters of the normal distribution (mean
and SD) to describe the data, thus we use non-parametric approaches
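
A minimal sketch of why this matters, using Python's standard `statistics` module. The scores are hypothetical and deliberately right-skewed; they are not data from the lecture.

```python
import statistics

# Hypothetical, right-skewed scores: most values are small, one is very large.
scores = [2, 3, 3, 4, 4, 5, 5, 6, 40]

mean = statistics.mean(scores)
sd = statistics.stdev(scores)
median = statistics.median(scores)
q1, _, q3 = statistics.quantiles(scores, n=4)  # quartiles

print(f"mean = {mean}, SD = {sd:.1f}")
print(f"median = {median}, IQR = {q1} to {q3}")
```

The single extreme value drags the mean above the third quartile, so the mean and SD no longer describe a typical score, while the median and interquartile range still do.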
When these assumptions are violated (one sample), we use the Wilcoxon signed-rank test

For
• Skewed continuous data
• Ordinal or discrete data
Wilcoxon Signed Rank Test
Step 1. Test whether our data are normally distributed

Step 2. If they are not, perform a Wilcoxon signed-rank test

What would be an appropriate hypothesis?
I have a sample of monsters with data on weight, preference for brains (yes/no), plasma, and
scary scores (both continuous). Assume the data are not normal. What is a correct statement?

A. H1. The median preference for brain is significantly different than 50-50
B. H1. The median scary score is significantly different than 24
C. Ha. The median scary score is not significantly different between liking
brains or not
D. Ha. The mean scary score is significantly different than 24
Example

I have a sample of monsters with data on weight, preference for brains, plasma, and
scary scores. Based on what I am showing, what can we write?

A. The mean weight of the sample was not different from 26 kg (t = 1.683, p = 0.092)
B. The mean weight of the sample was significantly different from 26 kg (z = 1.683, p = 0.092)
C. The median weight of the sample was not significantly different from 26 kg (t = 1.683, p =
0.092)
D. The median weight of the sample was not significantly different from 26 kg (z = 1.683, p =
0.092)
In a way, SPSS is really informative

A one-sample Wilcoxon signed-rank test indicated that the median was not significantly different from
26 (Z = 1.683, p = 0.092).
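
For readers who want to see what sits behind the Z value, here is a minimal sketch of the one-sample signed-rank calculation using the normal approximation with a tie correction. The function names and the weights are hypothetical, and SPSS may differ in details such as exact p-values for small samples.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions start+1..end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def signed_rank_z(data, hypothesised_median):
    """Return (W+, z) for H0: the median equals hypothesised_median."""
    # Differences from the hypothesised median; exact zeros are dropped.
    diffs = [x - hypothesised_median for x in data if x != hypothesised_median]
    n = len(diffs)
    abs_diffs = [abs(d) for d in diffs]
    ranks = average_ranks(abs_diffs)
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean_w = n * (n + 1) / 4
    var_w = n * (n + 1) * (2 * n + 1) / 24
    var_w -= sum(t**3 - t for t in Counter(abs_diffs).values()) / 48  # tie correction
    return w_plus, (w_plus - mean_w) / math.sqrt(var_w)


# Hypothetical monster weights in kg, tested against a median of 26.
weights = [27, 29, 24, 31, 30, 22, 33, 28]
w_plus, z = signed_rank_z(weights, 26)
print(f"W+ = {w_plus}, z = {z:.3f}, p = {two_sided_p(z):.3f}")
```

As a check on the write-up above, `two_sided_p(1.683)` rounds to 0.092, which matches the p-value reported next to Z = 1.683.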
Two samples – Mann Whitney U Test
I have a sample of zombies and ghosts, with
data on weight, preference for brains, plasma,
and scary scores. Here I am testing scary score
between the two groups, what would our
hypothesis be?
A. H0. The distribution of scary score is the
same between ghosts and zombies
B. H1. The distribution of scary score is
different between ghosts and zombies
C. H0. The distribution of scary score is
different between ghosts and zombies
I have a sample of zombies and ghosts, with
data on weight, preference for brains, plasma,
and scary scores. Here I am testing scary score
between ghosts and zombies.

What can we conclude?

A. The difference in distribution of scary score was
statistically significant between ghosts and
zombies (Mann Whitney U = 137.50, p = 0.005)
with lower scary scores in zombies
B. The difference in distribution of scary score was
not statistically significant between ghosts and
zombies (Mann Whitney U = 137.50, p = 0.005)
with lower scary score in ghosts
C. The difference in distribution of scary score was
statistically significant between ghosts and
zombies (W = 413.50, p = 0.005) with higher
scary score in ghosts
D. The difference in distribution of scary score was
not statistically significant between ghosts and
zombies (Test statistic = -2.81, p = 0.005) with
higher scary score in ghosts
Again, SPSS does a lot for us
If we determine in step 1 that the data are not normally distributed,
we do a non-parametric test in SPSS

The distribution of Scary Score was statistically different across groups (Mann-Whitney U = 137.5, p = 0.005), with ghosts
having a higher scary score than zombies.
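
A minimal sketch of the Mann-Whitney calculation, using the normal approximation with a tie correction. The scores and function names are hypothetical, and SPSS may report an exact p-value instead for small samples. The last lines show how the W in option C relates to the U in option A, assuming 23 monsters in the lower-ranked group; the group sizes are not stated on this slide, so that number is an assumption.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions start+1..end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def mann_whitney(group_a, group_b):
    """Return (U, z, two-sided p), where U is the smaller of the two U values."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b, n = len(group_a), len(group_b), len(pooled)
    ranks = average_ranks(pooled)
    u_a = sum(ranks[:n_a]) - n_a * (n_a + 1) / 2
    u = min(u_a, n_a * n_b - u_a)
    ties = sum(t**3 - t for t in Counter(pooled).values()) / (n * (n - 1))
    variance = n_a * n_b / 12 * ((n + 1) - ties)
    z = (u - n_a * n_b / 2) / math.sqrt(variance)
    return u, z, math.erfc(abs(z) / math.sqrt(2))


# Hypothetical scary scores, not the lecture data.
ghosts = [8, 9, 7, 10, 6]
zombies = [3, 5, 4, 2, 1]
u, z, p = mann_whitney(ghosts, zombies)
print(f"U = {u}, z = {z:.2f}, p = {p:.3f}")

# U and the rank sum W differ only by a constant: U = W - n(n + 1) / 2.
# Assuming n = 23 in the lower-ranked group, W = 413.5 corresponds to U = 137.5.
u_from_w = 413.5 - 23 * 24 / 2
```

This is why options A and C in the quiz quote different statistics with the same p-value: U and W are two ways of reporting the same ranking.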
Example from real-life!

Mcpartland et al. 2019


If data are paired, we can use the Wilcoxon matched-pair signed-rank test

The median difference between the ‘weight after’ and the ‘weight before’ was significantly different from zero
(Wilcoxon signed-rank Z = -14.88, p < 0.001). The weight decreased significantly after the programme.
In a way, SPSS is easier
Non-parametric hypotheses…

They don’t change much (but they do a little)


One sample
H0: The median equals a pre-specified value
Ha: The median is different from a pre-specified value

Are two independent samples different?
H0: The distribution of two groups is identical
Ha: The distribution of two groups is different

Are two paired samples different?
H0: The distribution of two paired groups is identical OR the median of the paired differences equals zero
Ha: The distribution of two paired groups is different OR the median of the paired differences is different from
zero
When assumptions do not hold, we expand the table

                   Assumptions hold   Assumptions do not hold
Categorical data   Parametric         Non-parametric
Checking assumptions for the 𝝌2 Tests

Assumptions
• Up to 20% of the cells can have expected counts less than 5
• The minimum expected count is larger than 1
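
The two rules can be checked by hand from the expected counts (row total × column total / grand total). The sketch below is an illustration with hypothetical tables and function names; SPSS prints the same information in a footnote under the chi-square table.

```python
def expected_counts(table):
    """Expected cell counts under independence for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def chi_square_assumptions_ok(table):
    """Apply the two rules above to the expected (not the observed) counts."""
    cells = [e for row in expected_counts(table) for e in row]
    share_below_5 = sum(e < 5 for e in cells) / len(cells)
    return share_below_5 <= 0.20 and min(cells) > 1


# Hypothetical 2 x 2 tables.
small_table = [[3, 1], [2, 6]]      # every expected count is below 5
large_table = [[30, 20], [25, 35]]  # every expected count is 25 or more

print(chi_square_assumptions_ok(small_table))
print(chi_square_assumptions_ok(large_table))
```

Note that the rules apply to expected counts, so a table can contain an observed zero and still meet them if the row and column totals are large enough.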

           Yes    No
Ghosts     100    3
Zombies    200    75

Preferred means of attack
           Teeth   Scream   Claws   Other
Ghosts     12      40       34      23
Zombies    3       25       22      26

So when these assumptions are violated, then we tick ‘exact’ and write down (exact p-value =…)

Preferred means of attack
           Teeth   Scream   Claws   Other
Ghosts     12      40       34      0
Zombies    0       25       22      26

Remember, SPSS will usually tell you when these are violated!
Is this right?

The sample proportion is not significantly different from the expected values (Chi-Square = 0.114, p = 0.735)

Say we have different groups here, click exact
Checking assumptions for the Pearson 𝝌2 Test

Assumptions
• Unpaired

And again, SPSS tells you everything you need to know!

Among women, the proportion of those who exercised before the programme was lower than those who did not
exercise before the programme (25% versus 100%, respectively). This difference was statistically significant
according to Fisher’s exact test (exact p < 0.001).
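
To show what the "exact" p-value means, here is a minimal sketch of a two-sided Fisher's exact test for a 2 x 2 table. It sums the probabilities of all tables with the same margins that are no more likely than the observed one. The counts are hypothetical and the function name is an assumption; this is not the lecture dataset.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def weight(k):
        # Number of ways to get k in the top-left cell with the margins fixed.
        return comb(row1, k) * comb(row2, col1 - k)

    possible = range(max(0, col1 - row2), min(row1, col1) + 1)
    observed = weight(a)
    # Integer weights keep the "no more likely than observed" comparison exact.
    extreme = sum(weight(k) for k in possible if weight(k) <= observed)
    return extreme / comb(n, col1)


# Hypothetical counts: 1 of 4 'yes' in one group versus 4 of 4 in the other.
p = fisher_exact_2x2(1, 3, 4, 0)
print(f"exact p = {p:.3f}")
```

With groups this small even a 25% versus 100% split is not significant, which is one reason the exact p-value should always be reported alongside the percentages.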
Understanding cross-tabs

Cells – Column (which means columns add to 100%) Cells – Row (which means row add to 100%)

18 ghosts who like brains


5 ghosts who do not like brains
15 zombies who like brains
8 zombies who do not like brains
Understanding cross-tabs

Cells – Column (which means columns add to 100%) Cells – Row (which means row add to 100%)

We never compare percentages along the direction that adds to 100%, so here we read across the ROW

We say: in the ghost row, ghosts make up a higher proportion of those who like brains (54.5%) than of those who
do not like brains (38.5%).
Understanding cross-tabs

Cells – Row (which means row add to 100%)

We never compare percentages along the direction that adds to 100%, so here we read down the COLUMN

We say: the proportion who like brains was higher among ghosts (78.3%) than among zombies (65.2%).
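
The percentages quoted on these slides can be reproduced from the four counts given earlier (18, 5, 15 and 8). The sketch below assumes the group is in the rows and brain preference is in the columns; the variable names are illustrative.

```python
# Counts from the slides: rows are groups, columns are brain preference.
counts = {
    "Ghosts": {"likes brains": 18, "does not": 5},
    "Zombies": {"likes brains": 15, "does not": 8},
}

# Row percentages: each row adds to 100%.
row_pct = {
    group: {pref: round(100 * n / sum(row.values()), 1) for pref, n in row.items()}
    for group, row in counts.items()
}

# Column percentages: each column adds to 100%.
col_totals = {
    pref: sum(row[pref] for row in counts.values())
    for pref in ("likes brains", "does not")
}
col_pct = {
    group: {pref: round(100 * n / col_totals[pref], 1) for pref, n in row.items()}
    for group, row in counts.items()
}

print(row_pct)
print(col_pct)
```

With row percentages, 78.3% of ghosts and 65.2% of zombies like brains. With column percentages, ghosts are 54.5% of those who like brains and 38.5% of those who do not.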
Checking assumptions for the Pearson 𝝌2 Test
Say we find a new dataset and test a brain (yes
or no) by group association and get the following
result. We want to know whether there is a brain by group
association. What can we infer from these data?

A. Brain preference and group are significantly
associated (Pearson Chi-Square = 0.965, df =
1, p = 0.514)
B. There are significantly more ghosts who like
brains than zombies who do (Pearson Chi-
Square = 0.965, df = 1, p = 0.326)
C. Brain preference and group are not
significantly associated (Pearson Chi-Square =
0.965, df = 1, p = 0.326)
D. Brain preference and group are not
significantly associated (Fisher’s Exact, p =
0.514)
More than two groups, we’re good with Pearson/Fisher!

Pearson, Fisher (and McNemar) can be used for more than two groups too!

Ethnicity
White Black Asian Other
Female
Male
And McNemar

Assumptions
• We need at least 25 discordant observations and paired (categorical) data.
Discordant (Yes -> No, and No -> Yes) (there must be at least 25)

Concordant (stays the same)

SPSS once again does this for you!

But make sure you write it down correctly: Value A and B are not related (McNemar, binomial, exact p = 0.1)
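
The "binomial, exact" part of that write-up refers to a test on the discordant pairs only: under H0, a change is equally likely to go in either direction. A minimal sketch, with hypothetical counts and an assumed function name:

```python
from math import comb


def mcnemar_exact_p(yes_to_no, no_to_yes):
    """Two-sided exact (binomial) McNemar p-value from the discordant pairs."""
    n = yes_to_no + no_to_yes
    k = min(yes_to_no, no_to_yes)
    # Probability of k or fewer changes in one direction when both are equally likely.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)


# Hypothetical data: 3 pairs changed Yes -> No and 9 changed No -> Yes.
p = mcnemar_exact_p(3, 9)
print(f"McNemar, binomial, exact p = {p:.3f}")
```

Concordant pairs do not enter the calculation at all, which is why the assumption is stated in terms of the number of discordant observations.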
And McNemar for more than 2 groups…

Assumptions
• We need at least 25 discordant observations and paired (categorical) data.
Discordant (Yes -> No, and No -> Yes) (there must be at least 25)

Concordant (stays the same)

SPSS once again does this for you!

But make sure you write it down correctly: Value A and B are not related (McNemar, binomial, exact p = 0.1)
What test should I run?

Question 1. I work in a zoo. I want to test whether the mean height of African animals is different
from that of Asian animals. Height is skewed.

Question 2. I want to test if a drug can treat how sensitive zombies are to brains. I measure
sensitivity before and after the drug. The difference is not normally distributed. How do I test
the effect?

Question 3. I want to test if the proportion of people doing the KEATS quiz (YES/NO) is linked
to age bracket (<23, 23-28, > 28) but there was no one over 23 last year.

Options (the same for each question)
A. Independent Sample T-test
B. McNemar-Bowker
C. One sample T-test
D. One sample Chi-Square
E. Paired-sample T-test
F. Mann-Whitney (Wilcoxon Rank Sum)
G. Fisher’s Exact
H. Wilcoxon Matched Pair Signed Rank

Answers
Question 1: Mann-Whitney (Wilcoxon Rank Sum)
Question 2: Wilcoxon Matched Pair Signed Rank
Question 3: Fisher’s Exact (2 x 3)
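
The choices in this lecture can be summarised as a lookup on three questions: what type of data, which design, and whether the assumptions hold. The sketch below is one possible summary; the function name, the category labels and the wording of the test names are assumptions made for illustration.

```python
def choose_test(data_type, design, assumptions_hold):
    """Suggest a test following the decision tables in this lecture."""
    # Each entry: (test when assumptions hold, test when they are violated).
    options = {
        ("numerical", "one sample"): ("One sample T-test", "Wilcoxon Signed Rank"),
        ("numerical", "independent"): ("Independent Sample T-test", "Mann-Whitney U"),
        ("numerical", "paired"): ("Paired-sample T-test", "Wilcoxon Matched Pair Signed Rank"),
        ("categorical", "one sample"): ("One sample Chi-Square", "One sample Chi-Square (exact)"),
        ("categorical", "independent"): ("Pearson Chi-Square", "Fisher's Exact"),
        ("categorical", "paired"): ("McNemar", "McNemar (binomial, exact)"),
    }
    parametric, fallback = options[(data_type, design)]
    return parametric if assumptions_hold else fallback


# The three questions above, expressed as lookups.
print(choose_test("numerical", "independent", assumptions_hold=False))
print(choose_test("numerical", "paired", assumptions_hold=False))
print(choose_test("categorical", "independent", assumptions_hold=False))
```

A lookup like this only encodes the rules of thumb; checking the assumptions in the data still comes first.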
Cheat sheet!
And another one!

Learning outcomes: “What was discussed”
• When to use T-Tests and when to use 𝝌2
• When to use parametric and when to use non-parametric tests
• Understand which test to use when
• Understand under what assumptions to use a test
• Understand how to identify violations and use the right test
• Understand how to test the null hypothesis and interpret SPSS (and read the fine print)
Thank you/Questions?
Contact details/for more information:
Nicolaas Puts
E1.05/IOPPN
nicolaas.puts@kcl.ac.uk

© 2022 King’s College London. All rights reserved
