Logistic regression + comparing groups

Pre-Master course Business Research Methods


Dr. Kristin Kronenberg
Agenda
• Comparing two means
• Chi-square
• Logistic regression
Why compare two means?
• Often used for experimental data
• Looking at differences

• Different entities
- Participants who received actual medication vs. those who
received a placebo
• Same or related entities
- Students' knowledge before and after this lecture
How to compare two means?
• Different entities
- Independent t-test (independent-measures t-test;
independent-means t-test)
• Same or related entities
- Paired-samples t-test (dependent t-test)
How to compare two means?
• Comparing the means of two groups amounts to predicting
an outcome from membership of one of two groups
• We can use the linear model with a dichotomous
predictor (also known as dummy variable)
- Yes or no
- Treatment or no treatment
- Lecture or not
- Cloak or no cloak
- 0 or 1
How to compare two means?
• The t-test tells us whether the difference between
means is significantly different from zero (= something is going on!)
• Best predicted value of the outcome is the group
mean (summary statistic with the least squared error)
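The least-squared-error claim above can be checked numerically. The following is a minimal sketch, assuming a small set of hypothetical scores (not the lecture data): it tries a grid of candidate predictions and confirms that the one with the smallest sum of squared errors coincides with the mean. The function name `sum_squared_error` is an illustrative choice.

```python
from statistics import mean

def sum_squared_error(scores, guess):
    """Sum of squared deviations of the scores from one predicted value."""
    return sum((score - guess) ** 2 for score in scores)

# Hypothetical scores for a single group (illustration only).
scores = [2, 4, 4, 5, 10]
group_mean = mean(scores)

# Candidate predictions 0.0, 0.1, ..., 10.0; keep the one with the least error.
candidates = [c / 10 for c in range(0, 101)]
best_guess = min(candidates, key=lambda c: sum_squared_error(scores, c))

print("group mean:", group_mean)
print("best single prediction on the grid:", best_guess)
```

Any other single value, such as the median of these scores, gives a larger sum of squared errors than the mean does.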
How to compare two means?
• You're asking yourself: do rabbits eat more carrots than
other animals?
Categorical predictors in the linear model

$Y_i = b_0 + b_1 X_{1i} + \varepsilon_i$

$\text{Carrots}_i = b_0 + b_1 \text{Rabbit}_i + \varepsilon_i$
Categorical predictors in the linear model
• Group variable = 0 (no rabbit)
• b0 = mean of baseline (no rabbit) group = intercept

$\text{Carrots}_i = b_0 + b_1 \text{Rabbit}_i$
$\bar{X}_{\text{NoRabbit}} = b_0 + b_1 \times 0$
$b_0 = \bar{X}_{\text{NoRabbit}}$
Categorical predictors in the linear model
• Group variable = 1 (Rabbit)
• 𝑏1 = difference between group means

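$\text{Carrots}_i = b_0 + b_1 \text{Rabbit}_i$
$\bar{X}_{\text{Rabbit}} = b_0 + b_1 \times 1$
$\bar{X}_{\text{Rabbit}} = b_0 + b_1$
$\bar{X}_{\text{Rabbit}} = \bar{X}_{\text{NoRabbit}} + b_1$
$b_1 = \bar{X}_{\text{Rabbit}} - \bar{X}_{\text{NoRabbit}}$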
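The derivation above can be verified by fitting the linear model to a dummy-coded predictor. This is a minimal sketch, assuming hypothetical carrot counts (not data from the lecture) and a hand-written least-squares helper named `fit_simple_regression`. It shows that the intercept equals the mean of the baseline group and the slope equals the difference between the two group means.

```python
from statistics import mean

def fit_simple_regression(x, y):
    """Ordinary least squares for y = b0 + b1 * x with one predictor."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

# Hypothetical data: 0 = no rabbit, 1 = rabbit (illustration only).
rabbit = [0, 0, 0, 0, 1, 1, 1, 1]
carrots = [1, 2, 3, 2, 5, 6, 7, 6]

b0, b1 = fit_simple_regression(rabbit, carrots)

mean_no_rabbit = mean(c for r, c in zip(rabbit, carrots) if r == 0)
mean_rabbit = mean(c for r, c in zip(rabbit, carrots) if r == 1)

print("b0:", b0, "| mean of no-rabbit group:", mean_no_rabbit)
print("b1:", b1, "| difference between group means:", mean_rabbit - mean_no_rabbit)
```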
The logic behind the t-test
• If samples come from the same population, we expect large differences
between sample means to occur very infrequently
• Under H0, we expect means from two random samples to be very similar
• We compare the difference between the sample means that we collected to
the difference between the sample means that we would expect to obtain
(in the long run) if there was no effect
• If the difference between the samples we have collected is larger than we
would expect (based on the standard error), then one of two things has
happened
- There is no effect, but sample means from our population fluctuate a lot and we
happen to have collected two samples that produce very different means
- The two samples come from different populations, which is why they have
different means; the difference reflects an actual difference between the
populations, which is evidence against H0
The logic behind the t-test
• Two samples with two means, which differ by a little or a lot
• Compare the difference between the sample means we obtained to
the difference we would expect if there was no effect ( = other
animals eat as many carrots as rabbits)
• Signal-to-noise ratio: (systematic) variance explained by the
model divided by (unsystematic) variance the model cannot
explain
• How large is the observed difference between the sample
means (relative to the standard error)?
• The larger it is (relative to the standard error), the more likely it
is that the two means differ due to different conditions
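Written as a formula, the ratio described above takes the following form. The notation is an assumption made for illustration: $\bar{X}_1$ and $\bar{X}_2$ are the two sample means, $\mu_1$ and $\mu_2$ the corresponding population means, and $SE$ the standard error of the difference.

```latex
t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{SE_{\bar{X}_1 - \bar{X}_2}}
\qquad\text{and, under } H_0\ (\mu_1 = \mu_2):\qquad
t = \frac{\bar{X}_1 - \bar{X}_2}{SE_{\bar{X}_1 - \bar{X}_2}}
```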
The logic behind the t-test

Test statistic = Model (Signal) / Error (Noise)
Independent t-test: Example
• Are invisible people mischievous?
• Experiment
- Participants placed in enclosed community full of hidden
cameras
- 12 participants with invisibility cloak
- 12 participants without invisibility cloak
• How many mischievous acts did participants perform
in a week?
What does a suitable dataset look like?
The independent t-test in SPSS

Mean difference = 3.75 - 5.00 = -1.25

Standard error of the sampling distribution of differences = 0.730

$t = \frac{-1.25}{0.730} = -1.713$

The probability of obtaining a t-value at least this extreme if $H_0$ were true is 0.101 (10.1%)
We do not reject $H_0$: there is no significant evidence that the cloak affects the
amount of mischief
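The arithmetic behind this output can be retraced from summary statistics alone. The sketch below assumes the rounded group means and standard errors reported in this lecture (M = 3.75, SE = 0.55 without a cloak; M = 5.00, SE = 0.48 with a cloak; 12 participants per group). Because the inputs are rounded, the results only approximately match the SPSS values. The degrees-of-freedom formula is the Welch-Satterthwaite approximation, which is presumably what lies behind the non-integer df in the SPSS output.

```python
import math

# Rounded summary statistics as reported in the lecture.
mean_no_cloak, se_no_cloak, n_no_cloak = 3.75, 0.55, 12
mean_cloak, se_cloak, n_cloak = 5.00, 0.48, 12

mean_difference = mean_no_cloak - mean_cloak

# Standard error of the difference between two independent means.
se_difference = math.sqrt(se_no_cloak ** 2 + se_cloak ** 2)

# Signal (mean difference) divided by noise (standard error).
t_statistic = mean_difference / se_difference

# Welch-Satterthwaite degrees of freedom (equal variances not assumed).
welch_df = (se_no_cloak ** 2 + se_cloak ** 2) ** 2 / (
    se_no_cloak ** 4 / (n_no_cloak - 1) + se_cloak ** 4 / (n_cloak - 1)
)

print(f"mean difference = {mean_difference:.2f}")
print(f"SE of difference = {se_difference:.2f}")
print(f"t = {t_statistic:.2f}, df = {welch_df:.1f}")
```

Turning t and df into a p-value requires the t-distribution, which the Python standard library does not provide; a statistics package would be needed for that step.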
Paired-samples t-test: Example
• Are invisible people mischievous?
• Experiment with 12 participants
- Participants placed in enclosed community full of hidden
cameras
- No cloak in week 1
- Invisibility cloak in week 2
• How many mischievous acts did participants perform
in weeks 1 and 2?
What does a suitable dataset look like?
The paired-samples t-test in SPSS

Mean difference = 3.75 - 5.00 = -1.25

Standard error of difference scores = 0.329

$t = \frac{-1.25}{0.329} = -3.804$

The probability of obtaining a t-value at least this extreme if $H_0$ were true is 0.003 (0.3%)
We reject $H_0$ and conclude that the cloak does affect the amount of mischief
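The paired-samples calculation works on difference scores rather than on the two sets of scores separately. This is a minimal sketch, assuming four hypothetical participants (not the lecture dataset) and an illustrative helper named `paired_t`. The last lines redo the division with the rounded values from the slide, which gives roughly -3.80 rather than exactly -3.804 because the slide's inputs are rounded.

```python
import math
from statistics import mean, stdev

def paired_t(first, second):
    """Return (t, df) for a paired-samples t-test on two equal-length lists."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_diff = mean(diffs)
    # Standard error of the difference scores: their SD divided by sqrt(n).
    se_diff = stdev(diffs) / math.sqrt(n)
    return mean_diff / se_diff, n - 1

# Hypothetical mischief counts for four participants (illustration only).
week1_no_cloak = [3, 1, 5, 4]
week2_cloak = [4, 3, 6, 6]

t_value, df = paired_t(week1_no_cloak, week2_cloak)
print(f"hypothetical data: t({df}) = {t_value:.3f}")

# Same division using the rounded values shown on the slide.
slide_t = -1.25 / 0.329
print(f"slide values: t = {slide_t:.2f}")
```

With the same mean difference of -1.25, the paired design yields a much smaller standard error (0.329 versus 0.730) because each participant serves as their own control.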
Assumptions
• The t-test is a special case of the linear model, so the previously
discussed assumptions apply
• Both t-tests are parametric tests based on the normal distribution.
Therefore, they assume that
- Data are measured at least at the interval level
- The sampling distribution is normally distributed. In the dependent t-test
this means that the sampling distribution of the differences between
scores should be normal, not the scores themselves
• The independent t-test, as it is used to test different groups of
entities, also assumes that
- Variances in these populations are equal (homogeneity of variance)
- Scores in different treatment conditions are independent (since they
come from different entities)
Reporting
• Independent samples
- On average, participants given a cloak of invisibility engaged
in more acts of mischief (M = 5, SE = 0.48) than those without
a cloak (M = 3.75, SE = 0.55). This difference (-1.25) was not
significant, t(21.54) = -1.71, p = 0.101
• Dependent samples
- On average, participants given a cloak of invisibility engaged
in more acts of mischief (M = 5, SE = 0.48) than those without
a cloak (M = 3.75, SE = 0.55). This difference (-1.25) was
significant, t(11) = -3.80, p = 0.003
