
Bachelor in Psychology (2022-2023)

Statistics III

Jan Schepers

IPN/PSY3008

Period 4
Contents

A. General Information
   1. The Education Office and www.askpsy.nl
   2. Regulations, Including Code of Conduct & Education and Examination Regulations
   3. Attendance
   4. Covid-19
   5. Calculators at the Exam
B. Course Information
   1. Course Planning Group
   2. Course Description
   3. Intended Learning Outcomes
   4. Alignment with the Program
   5. Course Schedule
   6. Essential and Recommended Literature
   7. Overview of Significant Changes of the Course since Last Year
   8. Examination/Assessment Plan
   9. SPSS Practicals
C. Tasks
   Tutorial 1: Statistics III
      Subject: Contingency tables
   SPSS practical 1: Statistics III
      Subject: SPSS contingency table analysis
   Tutorial 2: Statistics III
      Subject: discussion of SPSS assignments practical 1
   Tutorial 3: Statistics III
      Subject: logistic regression
   SPSS practical 2: Statistics III
      Subject: SPSS logistic regression (bring the output from practical 2!)
   Tutorial 4: Statistics III
      Subject: discussion of SPSS assignments practical 2
   Tutorial 5: Statistics III
      Subject: classical test theory, reliability, item analysis
   SPSS practical 3: Statistics III
      Subject: SPSS reliability and item analysis
   Tutorial 6: Statistics III
      Subject: discussion of SPSS assignments practical 3
   Tutorial 7: Statistics III
      Subject: modern test theory, Rasch model
   Tutorial 8: Statistics III
      Subject: factor analysis
   SPSS practical 4: Statistics III
      Subject: SPSS factor analysis
   Tutorial 9: Statistics III
      Subject: discussion of SPSS assignments practical 4
   Tutorial 10: Statistics III
      Subject: validity and agreement
Appendix A: General tips for using SPSS for Windows
Appendix B: Importing SPSS output in Word
Appendix C: Training files
A. General Information
1. The Education Office and www.askpsy.nl
The Education Office is responsible for the practical organisation and coordination of all the
education related activities within the Faculty of Psychology and Neuroscience (FPN), for
example the schedules.

Askpsy.nl is the website for FPN faculty information, FAQs, and contact options.

Here you can find among other things:

- course and exam booking/cancelling


- provisional schedules
- repeat education
- academic calendar
- exam schedules, procedures and inspection
- requirements for passing a course
- information on resits
- information on exam inspection
- attendance requirements
- appointment with academic advisors
- and so on.

2. Regulations, Including Code of Conduct & Education and Examination Regulations
Each study programme offered at FPN has its own Education and Examination Regulation
(EER), which is updated every academic year. The EER of the year in which you started your
studies applies to your entire study programme and contains information on, for example,
attendance at tutorial group meetings, the determination and publication of results, and
exam inspection. The 'rules and regulations' apply equally to all students of a study
programme and are valid for one academic year only. The rules and regulations are part of
the Education and Examination Regulation (EER).

Note that the 2022/23 course may have COVID-19 related restrictions and based on the
situation we might need to adjust our way of education. Please contact your tutor, course
coordinator or mentor whenever you have questions or remarks related to the teaching
format. We are grateful for your input and flexibility in these times.

FPN regards behaviour in compliance with its core values as being of great importance. A
Code of Conduct has been developed to ensure a good and productive study environment
and to avoid undesirable and unwanted situations.

A link to all these regulations can be found at Askpsy.nl/regulations.

3. Attendance
The tutor registers your presence. Be aware that, if you arrive more than 10 minutes after
the official starting time of the meeting or if you leave more than 10 minutes before the
actual ending time of the meeting, you are considered to be absent.

There is an attendance obligation for the tutorial group meetings (IPN/PSY3008) and the
SPSS practicals (IPN/PSY3201). The rules and regulations of the education programme
define the attendance requirements for the tutorial meetings and the practical meetings;
see art. 5.8 EER and art. 7 Rules and Regulations Bachelor in Psychology for the complete
attendance rules (www.askpsy.nl/regulations). If you miss more than the allowed number of
tutorials for IPN/PSY3008, you will have to take the course again next year. Note: in some
courses it is possible to join a meeting in another tutorial group. Joining a meeting in
another tutorial group will, however, not be registered as part of your attendance
obligation; you need to meet all attendance requirements in your own tutorial group. If you
miss one SPSS practical (but not more than one), you may apply for a catch-up assignment
to meet your attendance obligation for IPN/PSY3201. Contact the course coordinator for the
catch-up assignment.

In this course, the following is applicable:

If you miss a meeting, you may join that meeting in another tutorial group.
However, this is only allowed if the tutor of the group you would like to join
agrees, since you are not officially registered for that group.

4. Covid-19
Due to COVID-19, this course may be offered online or partially online. FPN will comply with
the measures set by the Dutch government and Maastricht University. See
maastrichtuniversity.nl and askpsy.nl for latest information.

In COVID-19 times, you will be assigned randomly to groups, as in normal times. However,
the coordinator may reshuffle groups if necessary to configure them as well as possible (for
example, to balance students who are physically in Maastricht with students who have to
join online because they are still abroad due to travel restrictions). Any reshuffling will be
communicated by the coordinator to the Education Office, which will arrange the regrouping
in the system as well. This is important so that you and your peers always have correct
timetable information. If your timetable is not correct, please contact the Education Office.

5. Calculators at the Exam


The only calculators that are allowed at FPN exams are the non-programmable Casio FX-82
or the Casio FX-85. All subtypes of these models are allowed, e.g. Casio FX-82MS or Casio
FX-85ES. For the exams where calculators are permitted, all students are expected to bring
one of these two models. If you do not yet own one, please make sure you buy the right
type of calculator well in advance. No other brands or models of calculator will be
accepted at FPN exams. If you use a different model of calculator during your exam,
this will be reported to the Board of Examiners FPN and may have severe
consequences.

B. Course Information
1. Course Planning Group

Jan Schepers (course coordinator)


Department of Methodology & Statistics,
Faculty of Psychology and Neuroscience,
Email: jan.schepers@maastrichtuniversity.nl

Nick Broers
Department of Methodology & Statistics,
Faculty of Psychology and Neuroscience,
Email: alberto.cassese@maastrichtuniversity.nl

Philippe Verduyn
Department of Work and Social Psychology,
Faculty of Psychology and Neuroscience,
Email: philippe.verduyn@maastrichtuniversity.nl

2. Course Description

The course will cover three methods: logistic regression, reliability analysis and factor
analysis.

Logistic regression is the counterpart of ANOVA and regression analysis for dichotomous
rather than continuous dependent variables. Examples include: recovering from an illness
and passing an exam. Using logistic regression it is possible to correct for confounding and
to investigate interaction if there are multiple independent variables and a dichotomous
dependent variable. It is therefore an extension of the contingency table analysis (covered
in Statistics I), allowing multiple independent variables, both categorical and continuous, to
be handled. We will limit ourselves to between-subjects designs, because logistic regression
for repeated measures is too advanced a topic for the bachelor's programme.
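To illustrate the link with contingency table analysis: with a single dichotomous predictor, the fitted logistic regression slope equals the natural log of the table's odds ratio. A minimal Python sketch with made-up counts (not data from this course); the closed form shown here holds only in this saturated one-predictor case:

```python
import math

# Hypothetical 2x2 table (made-up counts): rows = outcome, columns = x.
#              x=0   x=1
# fail (y=0)    30    15
# pass (y=1)    10    45
fail0, fail1 = 30, 15
pass0, pass1 = 10, 45

# Odds of passing in each group, and the odds ratio.
odds0 = pass0 / fail0          # 10/30 = 1/3
odds1 = pass1 / fail1          # 45/15 = 3
odds_ratio = odds1 / odds0     # 9

# With one dichotomous predictor, the maximum-likelihood logistic
# regression coefficients have a closed form:
#   intercept b0 = ln(odds when x=0),  slope b1 = ln(odds ratio)
b0 = math.log(odds0)
b1 = math.log(odds_ratio)

print(f"OR = {odds_ratio:.2f}, b0 = {b0:.3f}, b1 = {b1:.3f}")
```

With more than one predictor there is no closed form and the model is fitted iteratively, which is what SPSS does behind the scenes.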

Reliability analysis is a classic psychometric method for analysing tests and questionnaires.
Tests and questionnaires are used as measuring tools in many studies. This involves
assigning binary scores for the answers provided by respondents to a number of multiple-
choice questions (items) and adding up the scores to obtain an overall score for the
characteristic being measured (such as intelligence or attitude). This is done on the
assumption that each item measures the same characteristic. Reliability analysis is a tool for
verifying how well each item fits the scale and how reliable the overall score is. The course
offers training in classical test theory (psychometrics) and an introduction to modern
psychometrics. Other topics that will be addressed are the relationship between reliability
and validity, and the difference between reliability and agreement.
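As a preview of the reliability computations covered later in the course, here is a minimal Python sketch of Cronbach's alpha for a small set of made-up binary item scores (the data and the use of population variances throughout are assumptions for illustration):

```python
from statistics import pvariance

# Made-up binary item scores: 6 respondents (rows) on 4 items (columns).
scores = [
    [1, 1, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 0],
    [1, 0, 1, 1],
]

k = len(scores[0])                     # number of items
items = list(zip(*scores))             # item-wise columns
totals = [sum(row) for row in scores]  # sum score per respondent

# Cronbach's alpha: (k / (k-1)) * (1 - sum of item variances / variance of total)
item_var = sum(pvariance(col) for col in items)
alpha = k / (k - 1) * (1 - item_var / pvariance(totals))
print(round(alpha, 3))
```

SPSS (Analyze > Scale > Reliability Analysis) reports the same statistic together with item-level diagnostics.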

Factor analysis provides a method for reducing a large number of variables to a small
number of underlying factors. Historically, the application of factor analysis mainly focused
on reducing scores from a range of tests to a small number of dimensions, such as verbal
and spatial intelligence or extraversion and neuroticism. Nowadays, factor analysis is often

used to group items within one and the same questionnaire into subscales. Factor analysis
is therefore related to classical reliability analysis. This course offers training in exploratory
factor analysis using SPSS, focusing on the selection of the number of factors and on
rotation and interpretation. In the final part, participants are introduced to confirmatory
factor analysis.

The course consists of three parts.


The first part of the course covers contingency tables and logistic regression. In week 1 we
will discuss contingency tables, stratification, odds ratio, confounding and interaction. In
week 2 we will look at logistic regression, with a strong focus on the similarities with
stratified analyses of contingency tables and the advantages of logistic regression if multiple
predictors are present.

The second part will deal with test theory. In week 3, the focus is on classical reliability
theory, measures of test reliability (Cronbach's α, split-half, test-retest) and item analysis.
Week 4 uses the Rasch model to provide an introduction to modern test theory, with an
emphasis on the similarities and differences compared with classical theories and
techniques, and the advantages of modern theories and techniques.

The third and last part begins in week 5 and first looks at factor analysis. The emphasis will
be on what is referred to as exploratory factor analysis and three choices that need to be
made in this regard: the extraction method, the number of factors and the method of
rotation. We will see that commonly used options are often not the best. In addition to this,
an outline of confirmatory factor analysis will be provided. In Week 6, the topics of validity
and agreement will form the conclusion of this course.

The teaching method for each component of the course will be a mix of lectures, tutorials,
SPSS practicals, response lectures (i.e., Q&A lectures) and formative tests. Each lecture will
discuss a statistical method using a real-world or simplified example. During the tutorial, the
relevant method will be repeated using general theory questions and applied using
calculation work on paper. During the practical, the method is then applied to another
example with the help of SPSS and the results will be discussed in the next tutorial. A
response lecture (Q&A lecture) will be held for each course component in order to provide a
safety net for any subject matter that was not discussed or understood. For each tutorial,
the course manual enables independent study through knowledge questions. The response
lectures will also be open to questions that arise after the student has practised
independently with the knowledge questions.
In addition to this, the following teaching materials will be made available via Canvas:
- two formative tests
- for each lecture, a PDF handout of the slides
- for each lecture, a summary
- Questions posted by students on Canvas > Discussions (the lecturer will reply on a
weekly basis)

3. Intended Learning Outcomes
Students:

- are able to explain relevant concepts central to this module, including confounding and
interaction, classical psychometrics, reliability, modern psychometrics, item response
theory, the Rasch model, validity, and agreement;
- are able to explain and apply specific statistical techniques, such as three-way
contingency table analysis, logistic regression, reliability analysis (including item
analysis) and exploratory factor analysis, and can interpret relevant output of these
techniques;
- are able to specify the assumptions of the statistical techniques discussed in this
module and are able to apply this knowledge when analysing data.

4. Alignment with the Program


We will extend the methods for analysing effects in experimental and correlational
research taught in Year 2 with methods for analysing dichotomous rather than quantitative
dependent variables. Moreover, we will focus on methods for analysing tests and
questionnaires. As such, this course provides the prior knowledge required for the
Psychodiagnostics course.

5. Course Schedule

Lectures:
6 Feb 08:30-10:30 Contingency tables, odds ratio, stratification,
confounding, interaction
14 Feb 08:30-10:30 Logistic regression
28 Feb 08:30-10:30 Classical psychometrics, reliability, item analysis
7 Mar 11:00-13:00 Modern psychometrics, the Rasch model, item
and test information
14 Mar 11:00-13:00 Factor analysis
23 Mar 08:30-10:30 Validity and agreement between assessors

Practical SPSS*:
9 Feb refer to your timetable Contingency table analysis
16 Feb refer to your timetable Logistic regression
7 Mar refer to your timetable Classical psychometrics
21 Mar refer to your timetable Factor analysis

* see section C of this course manual for the practical assignments

Tutorials**:
7 Feb refer to your timetable tutorial 1
10 Feb refer to your timetable tutorial 2
15 Feb refer to your timetable tutorial 3
27 Feb refer to your timetable tutorial 4
3 Mar refer to your timetable tutorial 5
8 Mar refer to your timetable tutorial 6
13 Mar refer to your timetable tutorial 7
17 Mar refer to your timetable tutorial 8
22 Mar refer to your timetable tutorial 9
27 Mar refer to your timetable tutorial 10

**see section C of this course manual for the tutorial assignments

Response lectures or Q & A lectures:


2 Mar 11:00-13:00
9 Mar 11:00-13:00
16 Mar 11:00-13:00
30 Mar 11:00-13:00

6. Essential and Recommended Literature


The course material that is available on Canvas (course manual, lecture handouts,
summaries of lectures and formative exams) is complete in terms of covering the learning
goals of this course. In addition to these sources, students may find the following literature
useful:

- For contingency tables:
  1) the chapter on "Three-way contingency tables" in Agresti, A. (2007). An
     introduction to categorical data analysis (2nd ed.). Wiley. A little technical but
     very good.
  2) the chapter "Analysis of two-way tables" in Introduction to the practice of
     statistics (see Statistics I).
- For logistic regression:
  1) Field, A. (2005). Discovering statistics using SPSS (2nd ed.). London: SAGE. A
     good book on how to do things with SPSS, but less suitable as a manual.
  2) the chapters on logistic regression in Agresti (2007).
- For psychometrics and factor analysis:
  1) DeVellis, R. F. (2003). Scale development: Theory and applications (2nd ed.).
     Thousand Oaks, CA: SAGE. A non-technical and fairly good introduction to
     both psychometrics and factor analysis.
  2) Crocker, L., & Algina, J. (1986). Introduction to classical and modern test theory.
     Orlando: Harcourt Brace Jovanovich College Publishers. A more technical
     introduction to psychometrics and factor analysis. See Chapter 15 on modern
     psychometrics in particular.
- The 'SPSS in steps' reader issued in the first bachelor year.

See the library link:


https://maastrichtuniversity.keylinks.org/#/list/699

7. Overview of Significant Changes of the Course since last Year


The content and organization of the course have not been changed compared to the
previous year. All teaching activities will be onsite.

8. Examination/Assessment plan

The exam will consist of 24 questions with three answer options each (2 or 3 questions for
each tutorial), with a number of questions based on extracts of SPSS output. To gain an
impression of the nature of the exam, a formative exam will be made available. You will
have three hours to complete the exam for this course. Sixteen or more correct answers
constitutes a pass.

The date and time of the exam inspection will be announced on Canvas on the date the
final results are submitted to the Exam Administration.

9. SPSS Practicals
This course is linked to the SPSS practical IPN/PSY3201. All assignments of that practical
are included in this syllabus.

C. Tasks
Tutorial 1: Statistics III

Subject: Contingency tables

This tutorial will include two types of exercises: calculations and general theory questions. The
emphasis will be on the calculations. The aim is twofold: (1) learning how to calculate odds
ratios for 2×2 and 2×2×2 contingency tables, and (2) learning how to investigate confounding
and interaction for dichotomous dependent variables.

Calculations

Objective:
Practising the calculation and interpretation of odds ratios and log odds and checking for the
presence of interaction or confounding.
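For the calculations below it can help to see the two basic quantities as small functions. A minimal Python sketch (the example table is made up for illustration and is not the assignment data):

```python
import math

def odds_ratio(table):
    """Odds ratio for a 2x2 table [[a, b], [c, d]]: (a*d) / (b*c)."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)

def log_odds(successes, failures):
    """Natural log of the odds, LN[P(success)/P(failure)]."""
    return math.log(successes / failures)

# Made-up 2x2 table: rows = outcome (fail/pass), columns = predictor (low/high).
table = [[25, 10],   # fail
         [15, 30]]   # pass
print(odds_ratio(table))           # (25*30)/(10*15) = 5.0
print(round(log_odds(15, 25), 3))  # ln(0.6) ≈ -0.511
```

Note that the odds ratio stays the same if rows and columns are swapped, which is why the stratified analyses in this tutorial can be read in either direction.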

Assignment 1. Using statistics to untangle the confusion caused by statistics

Many Psychology students find statistics a challenging topic. Previous education and study
effort are often pointed to as the causes: although statistics is an exact science, psychology
programmes attract many students with a non-science previous education, and because
statistics is not a popular topic, study efforts tend to be lower.

In a (fictitious) study we will verify the extent to which the probability of passing a statistics
exam is determined by previous education and study efforts. This results in the contingency table
below.

            Previous education Alfa            Previous education Beta
            (= language and culture studies)   (= exact sciences)
            Low effort     High effort         Low effort     High effort

Failed          30             20                  40             10

Passed          10             40                  20             30

a) First, set up a contingency table for the association between both predictors (previous
education and effort) without any distinction between fail/pass and then calculate the OR. Is this
a balanced or unbalanced design? Is there a confounding?

b) Now draw a graph expressing the effect of effort on the probability of a pass, whilst keeping
previous education constant (or put differently, for each type of previous education individually).
Do this in a similar way to a two-way ANOVA. Plot the logodds for a pass, LN [P(pass)/P(fail)],
against the Y axis. Put effort on the X axis and draw a separate line for each type of previous
education.

c) Calculate the odds ratio for the association between effort and the probability of a pass for
students with previous education Alfa on the one hand and for students with previous education
Beta on the other.
Code the variables as follows:
pass: 0=fail, 1=pass; effort: 0=low, 1=high; previous education:0=Alfa, 1=Beta.
Is there an interaction between previous education and effort? How is this reflected in the graph
for (b)? What is the exact relationship between the odds ratio and the slope of the lines in the
graph?

d) Now set up a contingency table for the association between effort and the probability of a pass
without distinguishing between the types of previous education, and calculate the odds ratio.
Compare this with question (c) and explain the difference.

e) Draw a graph expressing the effect of type of previous education on the probability of a pass,
whilst keeping effort constant (or put differently, for each level of effort individually). Plot the
log odds LN [ P(pass)/P(fail) ] against the Y axis. Put type of previous education on the X axis
and draw a separate line for each level of effort.

f) Calculate the odds ratio for the association between type of previous education and the
probability of a pass for students with a low effort level on the one hand and for students with a
high effort level on the other hand. Apply the same coding to the variables as in question (c).
Is there an interaction between previous education and effort? How is this reflected in the graph
for (e)? What is the exact relationship between the odds ratio and the slope of the lines in the
graph?

g) Now set up a contingency table for the association between type of previous education and
probability of a pass, without distinguishing between effort levels, and calculate the odds ratio.
Compare this with (f) and explain the difference.

Assignment 2. Is the whole greater than the sum of its parts?

Let’s assume the study resulted in the following contingency table:

            Previous education Alfa            Previous education Beta
            (= language and culture studies)   (= exact sciences)
            Low effort     High effort         Low effort     High effort

Failed          30             20                  40             10

Passed          20             30                  10             40

a) First, set up a contingency table for the association between both predictors (previous
education and effort) without any distinction between fail/pass and then calculate the OR. Is this
a balanced or unbalanced design? Is there a confounding?

b) Now draw a graph expressing the effect of effort on the probability of a pass, whilst keeping
previous education constant (or put differently, for each type of previous education individually).
Plot the log odds of a pass against the Y axis. Put effort on the X axis and draw a separate line
for each type of previous education.

c) Calculate the odds ratio for the association between effort and the probability of a pass for
students with previous education Alfa on the one hand and for students with previous education
Beta on the other.
Apply the same coding to the variables as in assignment 1.
Is there an interaction between previous education and effort? How is this reflected in the graph
for (b)? What is the exact relationship between the odds ratio and the slope of the lines in the
graph?

d) Now set up a contingency table for the association between effort and probability of a pass
without distinguishing between the types of previous education, and calculate the odds ratio.
Compare this with question (c) and explain the difference.

e) Draw a graph expressing the effect of the type of previous education on the probability of a
pass, whilst keeping effort level constant (i.e., for each level of effort individually). Plot the log
odds against the Y axis, put type of previous education on the X axis and draw a separate line for
each level of effort.

f) Calculate the odds ratio for the association between type of previous education and the
probability of a pass for students with a low effort level on the one hand and for students with a
high effort level on the other hand. Apply the same coding to the variables as in question (c).
Is there an interaction between previous education and effort? How is this reflected in the graph
for (e)? What is the exact relationship between the odds ratio and the slope of the lines in the
graph?

g) Now set up a contingency table for the association between the type of previous education and
a pass without distinguishing between effort levels, and calculate the odds ratio. Compare this
with (f) and explain the difference.

h) For each contingency table from assignment 1 and 2, indicate whether the table is plausible in
an actual study of previous education, effort and the probability of a pass and if so, explain why.

i) Let’s assume that the study from assignment 1 and 2 is carried out in practice. Which
independent variables would you add to the current set of two and why?

General theory question

Assignment 3. Level of measurement and statistics

Let’s assume we are investigating the association between variable X and variable Y, such as
effort and performance or lifestyle and health. For each instance below, indicate what figure or
table and what technique and measure you would use to analyse the association:
- X and Y are both quantitative variables (continuous and interval or ratio measurement)
- X and Y are both dichotomous (0/1)
- X is dichotomous and Y is continuous
- X is continuous and Y is dichotomous

SPSS practical 1: Statistics III

Subject: SPSS Contingency table analysis

Objective:
During this practical, we will apply the methods covered in tutorial 1 to other examples, with
the help of the computer. In addition to odds ratios, we will now also obtain standard errors and
tests of significance. This means we will no longer limit ourselves to descriptive statistics but
will also draw conclusions about population parameters.

Please note that a set of specific SPSS instructions for the following assignments can be found on
page 16.

Assignment 1. Necessity is the mother of invention

In his book "The paradoxicon" (Wiley, 1990), Falletta discusses a number of instructive
examples of contingency table analysis. One of them relates to the probability of recovery among
the male and female patients of two doctors. The contingency table below provides a summary of
the data. Based on statistical analyses that we will replicate next, researcher I concludes that men
have a higher probability of recovery compared to women whereas researcher II concludes there
is no difference in the probability of recovery between the sexes.

              Doctor A              Doctor B
Recovery    men      women        men      women

No          300       200         100       200

Yes         200       100         200       300

This contingency table can be found in the STAT3PR12a.SAV file and has the following
structure:
There are 8 records, with each record containing one of the eight cells from the contingency
table. The variables are: recovery, sex, doctor, frequency. The last variable indicates the number
of people in the relevant cell. This is a compact method for storing data that can only be used for
a small number of non-continuous variables.

Before starting the analysis, the user must indicate in SPSS that the file does not simply contain 8
persons, but that each record in the working memory must be multiplied by the frequency before
the analyses are carried out.

In SPSS, the option Data-Weight cases is used to specify this (select ‘Weight cases by’ and
choose the variable ‘frequency’).
Now it is possible to do a contingency table analysis. For checking purposes, the total N for the
following analyses must be 1600, not 8.
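Outside SPSS, the effect of Data > Weight Cases can be illustrated by expanding the frequency-weighted records by hand. A minimal Python sketch with made-up frequencies (not the counts in STAT3PR12a.SAV):

```python
# Hypothetical frequency-weighted file: one record per cell of a 2x2x2
# table, in the same compact style as STAT3PR12a.SAV (values are made up).
records = [
    # (recovery, sex, doctor, frequency)
    (0, 0, 0, 12), (0, 1, 0, 8),
    (1, 0, 0, 20), (1, 1, 0, 10),
    (0, 0, 1, 5),  (0, 1, 1, 15),
    (1, 0, 1, 9),  (1, 1, 1, 21),
]

# "Weighting cases" amounts to counting each record `frequency` times,
# so the effective N is the sum of the weights, not the number of records.
expanded = [rec[:3] for rec in records for _ in range(rec[3])]
print(len(records), len(expanded))  # 8 records, N = sum of the frequencies
```

SPSS does not actually duplicate the rows; it applies the weights during the computations, but the resulting tables are the same as for the expanded data.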

a) Calculate the contingency table of sex (columns) against recovery (rows) in SPSS (Analyze >
Descriptive Statistics > Crosstabs). Also request the following output: observed and expected
counts (see the tab ‘Cells…’), the recovery percentage for each sex (see the tab ‘Cells…’), as
well as the χ2 test, correlation and odds ratio (see the tab ‘Statistics …’; For the odds ratio, select
‘Risk’). What is your conclusion regarding the difference in the probability of recovery between
the sexes?

b) Repeat the analysis, now using doctor as the stratifying factor (place the variable ‘doctor’ in
the window ‘Layer 1 of 1’). As additional output, request the Mantel-Haenszel test for the
common odds ratio (see the tab ‘Statistics …’) and answer the following questions:
- Which null hypotheses do the ‘Chi-square Tests’ tables evaluate now?
- What does the ‘Risk Estimate’ table mean?
- Which null hypothesis is tested in the ‘Tests of Homogeneity of the Odds Ratio’ table and what
is your conclusion from this test?
- Which null hypothesis is tested in the ‘Tests of Conditional Independence’ table and what is
your conclusion from this test?
- Which null hypothesis is tested in the 'Mantel-Haenszel common odds ratio estimate' table and
what effect is being estimated?
- In what order should these five tables be read and what are the circumstances in which each
table is relevant?
- What is your conclusion regarding the difference between the sexes?
- How large is the observed effect of the patient’s sex in the random sample? What is the range
of plausible values of this effect in the population, given the data?

c) Now run the two following contingency tables with the same additional output as for question
1a:
- doctor (columns) * recovery (rows)
- doctor * recovery with the patient’s sex as stratifying factor (+ Mantel-Haenszel)
Now answer the following questions:
- Which of these two contingency tables is the most important and why is this?
- What conclusions can you derive from these tables regarding the impact of doctor on recovery?

d) Finally request the contingency table for doctor * sex.


Questions:
- Why do we not break down this table according to recovery?
- What conclusion can you derive from this table regarding the relationship between the doctor
and sex?

e) Using the three contingency tables in (c) and (d), explain the different conclusions in the
analyses for (a) and (b). Which researcher is right? And which has the bigger impact, the
patient’s sex or the doctor?

Assignment 2. It all depends on your approach.

In occupational psychology, there is an impressive amount of research into the relationship
between work stress and health. The general idea is that whereas high stress levels at work can
lead to serious health issues, work factors such as autonomy and social support and personal traits
such as age and personality also play a role (the rushed Type A personality).
This assignment focuses on the effects of work stress and personality type (A vs. B) on chronic
fatigue symptoms. A summary of the (fictitious) results is shown below.

                      Type B personality                   Type A personality
Chronic fatigue?   Low stress      High stress        Low stress      High stress
                   levels at work  levels at work     levels at work  levels at work
No                     60              40                 30              10
Yes                    40              60                 20              40

This contingency table can be found in the STAT3PR12c.SAV file and has the following
structure:
There are 8 records, with each record containing one of the eight cells from the table. The
variables are: fatigue, stress, personality type and number (of respondents).
The file structure is therefore the same as for assignment 1. Before starting the analysis, specify
in SPSS that each record must be weighted by the ‘number’ variable (using Data-Weight cases).
For checking purposes, the total N for each analysis must be 300, not 8.

a) Request the contingency table for work stress (columns) and fatigue (rows). Also request the
observed and expected cell frequencies and relevant percentages, as well as the χ2 test,
correlation and odds ratio. What is your conclusion regarding the effect of work stress on chronic
fatigue?

b) Repeat the analysis, now using personality type as the stratifying factor. As additional output,
request the Mantel-Haenszel test for the common odds ratio. For each combination of ‘work
stress’ and ‘personality type’, compute the log odds of fatigue by hand. Then draw on a piece of
paper a plot of these log odds against work stress, with a separate line for each personality type:
- What is the relationship between the odds ratio and these lines?
- Can you see an interaction between work stress and personality type? If so, is this interaction
significant?
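Once the hand calculations are done, they can be checked mechanically; a Python sketch using the cell counts from the table above:

```python
import math

# Cell counts from the table: {personality type: {stress level: (no fatigue, fatigue)}}
cells = {
    "B": {"low": (60, 40), "high": (40, 60)},
    "A": {"low": (30, 20), "high": (10, 40)},
}

# Log odds of fatigue per combination of work stress and personality type.
log_odds = {
    (ptype, stress): math.log(yes / no)
    for ptype, by_stress in cells.items()
    for stress, (no, yes) in by_stress.items()
}

# Odds ratio of fatigue for high vs. low stress, within each personality type:
# exp of the difference in log odds, i.e. the slope of each line in the plot.
or_by_type = {
    ptype: math.exp(log_odds[(ptype, "high")] - log_odds[(ptype, "low")])
    for ptype in cells
}

print(or_by_type)  # differing ORs (2.25 vs 6.0) signal an interaction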

c) Now request the contingency table for personality type and fatigue, along with any relevant
additional output (refer to question (a)). What is your conclusion regarding the effect of
personality type on chronic fatigue?

d) Repeat the analysis of question (c), now using work stress as the stratifying factor. Request the
Mantel-Haenszel test as additional output once more. Plot the log odds for fatigue against
personality type, with a separate line for each level of work stress. What is your interpretation of
this pattern? What is the significance, if any?

e) Finally, request the contingency table for both independent variables and exclude the
dependent variable. Is the design unbalanced? What are the implications of any unbalance for the
effect analysis? Draw a distinction between the presence and absence of interaction!

f) Which odds ratios and p-values would you report based on all the analyses in this assignment?
Give reasons for your answer (Tip: interaction? Confounding?)

SPSS instructions for practical 1, assignment 1

a) Specify in SPSS that the ‘frequency’ variable must be used as the weighting (via Data -
Weight cases).
Now run the contingency table analysis:
Select: Analyze - Descriptive Statistics - Crosstabs.
Fill in the rows (recovery) and columns (sex) and leave layers blank
Use the Statistics and Cells buttons to select the desired output
(Note: use the Risk box to obtain the odds ratio).
Skip the Exact and Format buttons.
Check that the total N in the output is equal to 1600 rather than just 8. Otherwise the weighting
based on the variable ‘frequency’ has not worked.

b) As under (a), but specify doctor as the variable for the layers. In the Statistics section, request
the Mantel-Haenszel test for a common odds ratio of 1.

c) As under (a) and (b), but switch sex and doctor around.

Tutorial 2: Statistics III

Subject: discussion of SPSS assignments practical 1

This tutorial is dedicated in its entirety to the discussion of the results of the SPSS contingency
table practical. Any remaining time can be spent on any assignments from the previous tutorials
that have not been discussed yet.

Instructions with regard to the working method:

1. For the relevant assignment, first review the global structure of the SPSS output.

2. Then discuss the output using the questions for this practical as set out in this course
textbook.

3. Any questions or issues that remain unclear can be raised during the Q&A session.

Tutorial 3: Statistics III

Subject: logistic regression

This tutorial is similar to tutorial 1. We will first perform two calculations and then continue with
a general theory question. The calculations are intended to provide insight into the relationships
between logistic regression weights on the one hand and odds ratios on the other hand, and this
with and without correcting for confounding, and as main effect or a simple effect.
If these relationships are not understood, the SPSS output for logistic regression (next practical
and tutorial) will be impossible to grasp. During this tutorial, we will for now limit ourselves to
descriptive statistics once again. We will return to inferential statistics in tutorial 4.

Assignment 1. Using statistics to untangle the confusion caused by statistics

The starting point is the contingency table below, taken from assignment 1 in tutorial 1. This
related to a fictitious study of the relationship between previous education, effort and the
probability of a pass for Psychology students taking a statistics exam.

              Previous education Alfa                 Previous education Beta
              (= language and culture studies)        (= exact sciences)
              Low effort      High effort          Low effort      High effort
Failed            30              20                   40              10
Passed            10              40                   20              30

a) Begin by calculating the odds ratio for the association between effort and the probability of a
pass for each level of previous education. Next, also calculate the odds ratio for the association
between previous education and the probability of a pass for each effort level.
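As a check on the hand calculations, the cross-product formula OR = (ad)/(bc) for a 2×2 table can be applied per level of previous education; a Python sketch using the counts above:

```python
def odds_ratio(table):
    """table = [[a, b], [c, d]]: cross-product odds ratio (a*d)/(b*c)."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)

# Rows = failed/passed, columns = low/high effort, per previous education.
alfa = [[30, 20], [10, 40]]
beta = [[40, 10], [20, 30]]

print(odds_ratio(alfa), odds_ratio(beta))  # 6.0 6.0
```

Equal odds ratios across the two strata mean there is no interaction in this sample.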

b) Let’s assume we apply logistic regression to these data, using the probability of a pass as the
dependent variable and previous education, effort and previous education*effort as the
predictors. Define the regression model for this study, using the equation
log odds = β0 + β1X1 + β2X2 + β3X1X2.

c) For each level of previous education and effort, define the log odds for the probability of a
pass in the form of betas, as demonstrated during the lecture. Use the following coding:
pass: 0 = fail, 1 = pass; effort: 0 = low, 1 = high; previous education: 0 = Alfa, 1 = Beta.

d) Using the contingency table, calculate the log odds for each combination of previous
education and effort. Now plot the log odds (Y axis) against effort (X axis) for each level of

previous education and do the same for the log odds against previous education for each effort
level.

e) Use the results under (c) and (d) to calculate the best estimate for each regression weight (see
lecture slides 35 and 43 for an example). In the plots for (d), indicate which regression weight
corresponds to which slope. What does the regression weight represent, in terms of probabilities
or log odds?

f) Review question (c). Express the odds ratio for the association between effort and the
probability of a pass for students with a previous education in Alfa studies as regression weights.
Do the same for students with a previous education in Beta studies (for an example, refer once
more to lecture slides 35 and 43).

g) Calculate the odds ratios referred to in (f) by completing the beta estimates in question (e).
Compare your results with the odds ratios calculated in question (a).

h) Repeat (f) and (g) for the association between previous education and the probability of a pass
for each effort level.

i) Explain why in this assignment we have calculated the odds ratios using logistic regression, if
the same can also be obtained simply on the basis of the contingency table as in question (a). Put
differently, how are contingency table analysis and logistic regression similar and how do they
differ?

j) How can this regression model be simplified without compromising on ‘goodness of fit’ (that
is, without reducing the match with the data)?

Assignment 2. Is the whole greater than the sum of its parts?

The starting point is the contingency table below, taken from assignment 2 in tutorial 1.
Now answer the same questions as in assignment 1d to 1h in the current tutorial.

              Previous education Alfa                 Previous education Beta
              (= language and culture studies)        (= exact sciences)
              Low effort      High effort          Low effort      High effort
Failed            30              20                   40              10
Passed            20              30                   10              40

Assignment 3. Interaction and main effects too?

a) Let’s assume that when studying the effects of effort and previous education on the probability
of passing statistics, you found a significant interaction between effort and previous education.
How should you go about analysing the effect of effort on the probability of passing?

b) And how will you do this if there is no interaction?

SPSS practical 2: Statistics III

Subject: SPSS logistic regression (bring the output from practical 1!)

Objective:
During this practical, we will use the computer to apply the methods covered in tutorial 3 to
other examples, so that we will now also obtain standard errors and tests of significance. This
means we are switching back from descriptive statistics to inferential statistics.
We will compare the SPSS output for logistic regression with the SPSS contingency table
analyses from the first practical (as discussed in tutorial 2). The aim is to gain a good
understanding of the exact relationship between the two analysis methods and the outcomes they
produce.

Please note that a set of specific SPSS instructions for the following assignments can be found on
page 24.

Assignment 1. Necessity is the mother of invention

This assignment relates to the same study as assignment 1 of practical 1.


Using a frequency table for one of the variables, begin by checking whether the weighting based
on the ‘frequency’ variable is being applied. The total N should be 1600, rather than just 8.
Apply the weighting if necessary (Data – Weight cases).

Now carry out a logistic regression analysis using recovery as dependent variable. Build the
model using three blocks: First, select only sex as the predictor, then add doctor and finally the
sex*doctor interaction. Also request 95% confidence intervals for the odds ratios and the
Hosmer-Lemeshow goodness of fit test.

Questions:

a) Begin by reviewing the output in its entirety. Which model is being fitted to the data in each
block? For block 3, now review the tables listed below, check which null hypothesis is being
tested in each table and what conclusion can be derived from this for the relevant block/model:
Omnibus Tests of Model Coefficients – Hosmer and Lemeshow Test – Variables in the
Equation. Then repeat these steps for block 2 and subsequently for block 1.

b) Which of the three regression models is preferable and why?

c) Compare the output for each model (= block) with the contingency tables in assignment 1 of
practical 1. Begin by checking which model corresponds with which analysis from practical 1.
Next, attempt to match the output of each regression model to one or more tables from practical
1. Use both the odds ratios and the p-values when you do this.

d) What conclusions can be drawn regarding the effects of sex (of the patient) and doctor on the
probability of recovery? Take account of the p-values, as well as the odds ratios and the
corresponding confidence intervals.

e) Let’s assume the selected model will be used to predict for each patient in the random sample
whether he or she will recover, based on their sex and doctor. How can you use this model to
obtain the best possible prediction? (Tip: How can you calculate the probability of recovery for
each individual? And how can you then convert this probability into a prediction?)

f) The result of these predictions is shown in the Classification Table. Which percentage of
patients had the correct prediction? Is this a high or a low percentage? (Tip: what percentage can
be achieved by guessing if the only fact that is known is that 50% in total will recover?)

Assignment 2. It all depends on your approach.

This assignment relates to the same study as assignment 2 of practical 1.


Using a frequency table for one of the variables, begin by checking whether the weighting based
on the ‘frequency’ variable is being applied. The total N should be 300, rather than just 8. Apply
the weighting if necessary (Data – Weight cases).
Now carry out a logistic regression analysis using fatigue as a dependent variable. Build the
model using three blocks: First, select only work stress as the predictor and then add personality
type and finally the interaction. Request 95% confidence intervals for the odds ratios and the
Hosmer-Lemeshow test. Use the Save button to save the probability of fatigue and predicted
outcome for each individual calculated in the last model (= block 3).

Questions:

a) Let’s assume we select the model that includes interaction. Express this model in the form of a
regression equation. Complete the equation for each group (work stress * personality type), using
the coding of each variable. Use the table below to do this. For each group, you will now obtain
the log odds for fatigue, expressed as regression weights or betas. Refer to tutorial 3, assignment
1b and 1c, for an example.

Work stress   Personality   Work stress *       Log odds expressed as         Log odds as reported in the
              type          personality type    regression weights (betas)    output for assignment 2 of
                                                                              practical 1
Low (0)       B (0)         0*0 = 0             …                             …
Low (0)       A (1)         0*1 = 0             …                             …
High (1)      B (0)         1*0 = 0             …                             …
High (1)      A (1)         1*1 = 1             …                             …
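The fourth column follows mechanically from substituting each row’s 0/1 codes into log odds = b0 + b1·(work stress) + b2·(personality type) + b3·(work stress × personality type); a Python sketch with hypothetical beta values (the real estimates come from the SPSS output):

```python
# Hypothetical regression weights; the real values come from the SPSS output.
b0, b1, b2, b3 = -0.4, 0.8, 0.0, 0.9

def log_odds(stress, ptype):
    # Dummy coding: stress 0 = low / 1 = high; ptype 0 = B / 1 = A.
    return b0 + b1 * stress + b2 * ptype + b3 * stress * ptype

# One line per row of the table above.
for stress in (0, 1):
    for ptype in (0, 1):
        print(stress, ptype, log_odds(stress, ptype))
```

For example, the high-stress Type A group gets log odds b0 + b1 + b2 + b3, because all three dummy terms equal 1 for that group.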

b) Now fill in the SPSS estimates of the regression weights in the equations for 2a). This
expresses the log odds for each group in numerical terms (example: tutorial 3, assignment 1d-
1e). Compare the results with those of assignment 2 of practical 1.

c) Use SPSS to calculate the log odds (of fatigue) for each individual based on the probability of
fatigue (which you previously calculated and saved during the logistic regression as variable
PRE_1 using Save). Do this by using COMPUTE to create a new variable called ‘logodds’,
where logodds = LN (PRE_1/(1 – PRE_1)). Next, request a graph that plots this new variable
against work stress, with a separate line for each personality type. Then request another graph to
plot the new variable against personality type, with a separate line for each level of work stress.
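The COMPUTE step above is simply the logit transform; the same transformation, as a one-line check in Python:

```python
import math

def logit(p):
    """Log odds corresponding to probability p (0 < p < 1)."""
    return math.log(p / (1 - p))

print(logit(0.5))   # 0.0: a 50% probability has log odds of zero
print(logit(0.75))  # positive: odds of 3 to 1
```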

d) Compare the lines in both graphs with your calculations for question 2a. What is the
relationship between each line and the betas in the regression model for question 2a? What
conclusions can you derive from the graphs with regard to the effects of work stress and
personality type?

e) How does SPSS convert the log odds of fatigue to a prediction for each individual (where 1=
fatigued, 0 = not fatigued)? Compare your thoughts with the fatigue prediction that you
calculated and saved during the logistic regression in SPSS through the use of Save. Next,
request a contingency table of the observed fatigue (row) and the predicted fatigue (column). In
other words, this is a contingency table of the dependent variable against the saved predicted
value which was retained using Save. Compare this contingency table with the Classification
Table for the regression model with interaction.
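SPSS’s Classification Table is built with a 0.5 cut value on the predicted probability; a minimal Python sketch with hypothetical probabilities and observed outcomes:

```python
# Hypothetical saved predicted probabilities (PRE_1) and observed outcomes.
pre_1 = [0.8, 0.3, 0.6, 0.2, 0.9, 0.4]
observed = [1, 0, 1, 1, 1, 0]

# Default rule: predict 1 when the predicted probability exceeds 0.5.
predicted = [1 if p > 0.5 else 0 for p in pre_1]

# Cross-tabulate observed against predicted (the Classification Table).
table = {(o, p): 0 for o in (0, 1) for p in (0, 1)}
for o, p in zip(observed, predicted):
    table[(o, p)] += 1

# Overall percentage correct = diagonal cells / total.
accuracy = sum(table[(k, k)] for k in (0, 1)) / len(observed)
print(table, accuracy)
```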

f) Use the outcomes of question a) to manually calculate the following odds ratios: the odds ratio
for the effect of work stress on fatigue in people with a Type B personality and the same odds
ratio for people with a Type A personality. Compare the results with assignment 2b of practical
1.

g) Repeat question f) for the effect of the personality type on fatigue for each level of work
stress. Compare the results with assignment 2d of practical 1.

h) Let’s assume we select the model with interaction. Which follow-up analysis could we use to
estimate and test the effects of work stress and personality type? Carry out this analysis with
logistic regression. Compare the results with those for question f) and g), as well as those for
assignment 2 of practical 1.

i.) Let’s assume we select the model without interaction. Compare the resulting main effects
(odds ratios and p-values) with assignment 2 of practical 1. Explain how the simple effects in
question h) could be used to roughly calculate these main effects.

j) Which of the models in the output do you prefer and why? Do not only consider the p-values
of the predictors, but also look at the Hosmer-Lemeshow Test and the Classification Table.

SPSS instructions for practical 2

Assignment 1

First check the weighting based on ‘frequency’. The total N must be 1600. If N = 8 in your
output, the weighting has not been applied. In this case, you must first enable the weighting
using Data -Weight cases.

Select: Analyze - Regression - Binary Logistic.


Specify ‘recovery’ as the dependent variable. Use the Next button in the logistic regression menu
(under Block) to specify three blocks, using method=enter each time: 1. sex as the covariate, 2.
doctor as the covariate, 3. sex*doctor (for block 3, highlight both variables and click >a*b>).
Under Options, request a 95% confidence interval for exp(B) and the Hosmer-Lemeshow test.
Skip the Select, Categorical and Save buttons.

Assignment 2

As for assignment 1, using a different file. Under Save, tick both predicted values.

2b) Calculation: Compute Variable, target = logodds, numeric expression = LN(p/(1-p)),
where p is the variable that represents the probability of fatigue.
Graph of the log odds: Graphs – Legacy Dialogs – Line.
Select the following in the first screen (Line charts): multiple, summaries for groups of cases.
Select the following in the second screen (Define multiple line): lines represent other statistic
(the average of the log odds), category axis: work stress; define lines by: personality type.
(for the second graph this will be: category axis: personality type; define lines by: work stress).

2h) Select Data – Split file – Organize output by groups, groups based on personality type and
click OK. Then carry out the logistic regression using work stress as a predictor for fatigue. Now
undo the file split using Data – Split file – Analyze all cases. Repeat these steps for the simple
effects of personality type by using work stress to split the file.

Tutorial 4: Statistics III

Subject: discussion of SPSS assignments practical 2

This tutorial is dedicated in its entirety to the discussion of the results of the SPSS logistic
regression practical. Any remaining time can be spent on any assignments from the previous
tutorials that have not been discussed yet.

Instructions with regard to the working method:

1. For the relevant assignment, first review the global structure of the SPSS output.

2. Then discuss the output using the questions for this practical as set out in this course
textbook.

3. Any questions or issues that remain unclear can be raised during the Q&A session.

Knowledge questions part 1: contingency tables and logistic regression

Contingency tables

1. What type of graph or table is used to represent the correlation between two continuous
variables, such as intelligence and income? Which statistical measure and test correspond with
this?

2. Answer the questions under 1 for the correlation between two dichotomous or binary
variables, such as gender and being in paid employment.

3. Let’s assume we have a contingency table for treatment (yes/no) and recovery (yes/no).
Formulate a null hypothesis to establish a difference and a null hypothesis to establish a
correlation. How do these hypotheses relate to each other?

4. How does the contingency table calculate the frequencies expected under the null hypothesis?

5. Which two assumptions are made in a Chi-square test for a contingency table?

6. If both variables are dichotomous, Pearson’s correlation coefficient can be rewritten as which
measure of association?

7. What is the formula for calculating the odds ratio as a measure of association for a 2*2 table?

8. What is the relation between the odds ratio, the log of the odds ratio and the correlation? What
value must each of these three measures have in order for the association to be negative, positive
or absent?

9. Let’s assume we split a 2*2 table for variables X and Y into the levels for C, a third variable.
We then perform the Mantel-Haenszel test to establish the so-called ‘common odds ratio’. Which
null hypothesis is being tested in this case? Which assumption must be applied in order for this
test to be meaningful?

10. How does one determine that an interaction between X and C exists in the sample? And what
part of the SPSS output for contingency table analysis tests whether there is interaction in the
population?

11. What is the correct follow-up analysis for the effect of X on Y if interaction is present? And
if there is no interaction?

12. When can confounding be said to exist between X and C?

13. Can confounding and interaction occur at the same time?

14. What is moderation? (Tip: think back to lecture 1).

Logistic regression

15. What is the principal difference between linear and logistic regression?

16. In the case of ANOVA and regression (Statistics 2), the expected value of continuous
variable Y is modelled as the sum total of a constant + main effects + interactions. In logistic
regression, Y is dichotomous. What is being modelled now as the sum total of the effects?

17. How large is ln(X) if X is 1, has a value between 0 and 1, or is greater than 1? How large are
the log odds if the probability is 50%, less than 50% or more than 50%?

18. How large is exp(X) if X is 0, smaller than 0 or greater than 0?

19. Assume we perform logistic regression using a single predictor, X, and X is dichotomous.
How do we interpret regression weight B for X in this case? And how do we interpret exp(B)?

20. How large is the odds ratio for the effect of dichotomous variable X on dichotomous variable
Y if the regression weight B for X is 0, negative or positive?

21. Answer the question in 20 if X is continuous, such as age in years.

22. Answer the question in 20 if the model contains additional predictors (no interactions).

23. Let’s assume that the logistic model for dichotomous variable Y (dementia: 0 = no, 1 = yes)
includes predictors X (sex: 0 = male, 1 = female) and C (age in years), and there is no interaction.
The regression weight (B) for X is significantly negative. Will the odds ratio be greater or
smaller than 1? And which of the sexes will have a higher probability of dementia?

24. Suppose we repeat the analysis in the previous question with the coding for the sexes
reversed: 0=female, 1=male. Which B value will you find now, and what odds ratio?

25. Suppose we repeat the analysis in the previous question with the coding for both the sexes
and dementia reversed: 0 = yes, 1 = no. Which B value will you find now, and what OR?

26. Assume the model in question 23. What is the formula for the 95% confidence interval of the
odds ratio (as a measure of the effect of gender on dementia)?

27. Let’s assume the logistic regression of Y against X, C and X*C reveals a significant
interaction effect. How should we estimate and test the effect of X on Y in this case?

28. Answer the previous question if the interaction effect is not significant at all. How does the
method that needs to be applied in this case relate to the Mantel-Haenszel test of the common
odds ratio?

29. In the previous tutorials and practicals, logistic regression and contingency table analysis
yielded similar results. Despite the fact that logistic regression appears much more complicated
than contingency table analysis, this method is preferable in most cases. Why is this?

Tutorial 5: Statistics III

Subject: classical test theory, reliability, item analysis

As before, this tutorial will include two types of exercises: calculations and general theory
questions. The emphasis will be on the calculations. The aim is to practise the correct application
and interpretation of classical methods for reliability estimation and item analysis when applied
to psychological tests and questionnaires.

Calculations

Assignment 1. The whole is equal to the sum of its parts

The ISI test (where ISI stands for Intelligence, Progress and Interest) is intended for pupils
completing the last two years of their primary education. Among other things,
the ISI comprises six intelligence subtests, which each consist of 20 questions with four possible
answers. Subtest 5, titled ‘Understanding linguistic categories’, measures the ability to place
words into categories. The following item provides a basic example:

Information:
Monday – Wednesday – Saturday
Question:
Which two of the following words also belong in this list?
January – Tuesday – April – Sunday – evening

A reliability analysis of ISI-5 (20 items) has yielded the following information: the sum-score
has a mean of 12.57 and an SD of 4.74; the item-p-values gradually decrease from 0.90 (item 1)
to 0.15 (item 20); the average of all 20 item variances is 0.20; the average of all item-item
covariances is 0.05; the average of all item-item correlations is 0.25.
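Assuming the standard formulas for Cronbach’s alpha (the covariance form and the standardized form based on the average inter-item correlation), these summary statistics allow a quick numeric check; a Python sketch (not a substitute for the hand calculations asked for below):

```python
K = 20                 # number of items
var_sum = 4.74 ** 2    # variance of the sum-score (SD = 4.74)
mean_item_var = 0.20   # average of the 20 item variances
mean_item_corr = 0.25  # average inter-item correlation

# Alpha from variances: (K/(K-1)) * (1 - sum of item variances / sum-score variance).
alpha = (K / (K - 1)) * (1 - K * mean_item_var / var_sum)

# Standardized alpha from the average inter-item correlation (Spearman-Brown form).
alpha_std = K * mean_item_corr / (1 + (K - 1) * mean_item_corr)

print(round(alpha, 3), round(alpha_std, 3))
```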

Questions:

a) What is the reliability of a random item from this subtest?


b) What is the reliability of the sum-score for this subtest?
c) What are the estimated true variance and measurement error variance for the sum-score?
d) Assume that the measurement errors have a normal distribution. What will be the margins of
the measurement error for approx. 68% of the pupils? And for approx. 95%? And for
approximately 99%?
e) Can the items in this subtest be regarded as parallel measurements? Explain.
f) How does your answer for question (e) impact on the reliability estimate in question (b)?

Assignment 2. Does high reliability translate to good agreement?

The case used in the previous assignment serves as the starting point. Here we will calculate two
scores for each pupil: the sum-score for odd items and the sum-score for even items. The sum-
score for odd items is 6.57 on average (SD = 2.54). The sum-score for even items is 6.00 on
average (SD = 2.49). The Pearson correlation coefficient between the even and odd sum-scores is
0.78. For each pupil, we also calculate the difference between the even and odd sum-scores. It
turns out this difference ranges from -5 to +6.

Questions:

a) Use this information to calculate the reliability of the sum-score for the entire subtest and
compare the outcome with that in the previous assignment.

b) Can the even and odd test halves be considered parallel tests? Explain.

c) Will there be a major difference for individual students as to which half of the test they take?
Explain.

d) How does your answer for question (c) impact on the assessment of end-of-year coursework or
essays? (Tip: assume the even and odd test halves represent two assessors.)

Assignment 3. The great leap forward

The case used in the two previous assignments serves as the starting point. We will assume that
the test-retest reliability is 0.80 (for a retest interval of half a year). At both times, the mean sum-
score and SD are the same as stated in question 1. Marieke (11), a pupil at ‘Great Leap’
elementary school, only gets 12 out of 20 test items correct when she first takes the test. When
the test is taken for the second time, she scores 16 out of 20 after intensive individual tutoring.

Questions:

a) Has Marieke genuinely improved the skill that is measured in this test? Provide a calculation
to substantiate your answer. Tip: standard error of measurement (SEM).

b) Let’s assume the test-retest reliability is not known. Is it permitted in this case to answer
question a) using the previously calculated internal consistency or split-half reliability? Provide
arguments for your answer.

Assignment 4. About autonomy, nurses and items

In a study into the association between working conditions and health among nursing and care
staff, an autonomy scale was included. The scale consisted of 10 items in relation to autonomy in
the workplace, with each answer having the form of a Likert scale (from 1 - very little autonomy
to 5 - a great deal of autonomy). The next page shows a reliability analysis for this scale.

Questions:

a) From a statistical point of view, which item fits least well in the scale and which the best?

b) Cronbach’s alpha is 0.84 for the overall scale. Would you remove one or more items from the
scale? Why (or why not)?

c) What is the average item-item correlation for this scale? (Tip: the Spearman-Brown formula
also applies if K < 1, in other words for shortening rather than lengthening a test.)

d) The average sum-score for the sample was 27.4 (SD = 6). Calculate the standard error of
measurement for the sum-score and indicate the maximum deviation of the sum-score from the
true score for 95% of subjects.

e) Can these items be considered parallel measurements? Explain.

item   mean   SD     item-rest correlation   Alpha excluding that item
 1     3.16   0.82          0.49                      0.82
 2     2.41   1.01          0.51                      0.82
 3     3.00   0.89          0.54                      0.82
 4     3.36   0.94          0.56                      0.82
 5     3.58   0.77          0.43                      0.83
 6     2.33   0.97          0.65                      0.81
 7     2.39   1.00          0.62                      0.81
 8     2.98   0.99          0.53                      0.82
 9     1.79   1.01          0.37                      0.84
10     2.43   0.98          0.58                      0.82

General theory questions

Assignment 5. Metaphysics

In contrast to actual measurements, it is not possible to directly observe true scores and
measurement errors. Despite this, we are still able to estimate the true score variance and the
measurement error variance. How?

Assignment 6. Let’s hear it for diversity

Think about an intelligence test. Will the reliability of this test among the student population be
greater than, smaller than or equivalent to the reliability for the total population group of 18-25
year olds? Explain.

Assignment 7. The more the merrier?

According to the Spearman-Brown formula for calculating the reliability of the mean or the sum
of a number of replications, reliability increases as the number of replications increases. With this
in mind, how is it possible that the removal of items from a scale may result in a higher rather
than a lower value for Cronbach's alpha?
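For reference, the Spearman-Brown formula referred to here, with ρ the reliability of a single replication and K the factor by which the test is lengthened (K > 1) or shortened (K < 1):

```latex
\rho_{K} = \frac{K\rho}{1 + (K - 1)\rho}
```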

Assignment 8. There are measurement errors and … measurement errors

Four methods for estimating reliability have been discussed. Explain which methods amount to
the same thing, and which do not. Suppose you can or wish to report on only two of these
measures in a paper, which two would you choose and why?

32
SPSS practical 3: Statistics III

Subject: SPSS Reliability and item analysis

Please note that a set of specific SPSS instructions for the following assignments can be found on
page 37.

Assignment 1. There is only one health

In 1981, UM staff published a large-scale survey into health perception among adult Dutch
nationals. The questionnaire that was used contained a large number of scales and individual
items. One of these, the VOEG, asked about the presence of 21 physical complaints. The
answers from a random sample of N=200 persons from the original, much larger sample have
been saved in the stat3pr34a.sav file and provide the case for this assignment (for more
information, see Appendix C). We will use classical psychometric methods to analyse these
data.

However, data from questionnaires require pre-processing before psychometric analyses can be
carried out, because of missing values (MV) and, in some cases, the answer category ‘Not
Applicable’ (N/A). Both the MV and the N/A category effectively turn a dichotomous or ordinal
item into a nominal one, which makes it unsuitable for correlational analyses such as reliability
analysis. In this instance, we will limit ourselves to a simple initial check for the presence and
implications of missing values.

Questions:

a) Use SPSS to determine the number of missing values per item. If listwise deletion is applied,
how many of the 200 persons are excluded from the reliability analysis for this scale?

b) Will listwise deletion have a serious impact in this instance? (Provide arguments for your
answer.) How can exclusion of respondents from the analysis due to MVs be prevented?

c) Perform a reliability analysis for the entire scale of 21 items (without imputing any missing
data). Do not limit the output to Cronbach’s alpha, but request additional statistics for each item
and for the scale in its entirety.

d) Which health issue occurs the least and which the most?

e) Which health issue demonstrates the smallest spread and which the largest? What is the
relation between spread and average for dichotomous items? Why and when might a large spread
be desirable?

f) How large is the reliability for an arbitrary item from this scale according to the output?

33
g) Which assumptions have been made for the reliability statement in f)? Check these
assumptions to the extent that the output permits this. What is the impact on the analysis if these
assumptions are violated?

h) Use the Spearman-Brown formula to calculate Cronbach’s alpha on the basis of the item
reliability. Compare the outcome with the alpha provided in SPSS.

i) What are the estimated true variance and measurement error variance for the sum-score? If the
measurement error has a normal distribution, what are the margins of the measurement error
assuming a probability of 95%?
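The decomposition asked for in question (i) follows from classical test theory: the true-score variance is alpha times the sum-score variance, and the error variance is the remainder. The sketch below uses placeholder values; substitute the alpha and sum-score variance from your own SPSS output.

```python
import math

# Placeholder values: substitute Cronbach's alpha and the sum-score
# variance from your own SPSS output.
alpha = 0.80
var_sum = 25.0

var_true = alpha * var_sum           # estimated true-score variance
var_error = (1 - alpha) * var_sum    # estimated measurement-error variance
sem = math.sqrt(var_error)           # standard error of measurement
margin_95 = 1.96 * sem               # 95% margin, assuming normal errors

print(round(var_true, 1), round(var_error, 1), round(margin_95, 2))
```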

j) Decide which item fits the scale least well. What is the internal consistency of the scale
without this item? Using the SPSS Reliability procedure, calculate the split-half reliability for
the 20 remaining items. Compare the result with the internal consistency of the scale excluding
the item with the worst fit.

k) Now perform a split-half reliability analysis according to the odd-even method. Once again
exclude the item with the worst fit. The SPSS Reliability procedure is not able to do this. You
must therefore calculate the sum-score for the 10 even items and the sum-score for the 10 odd
items (excluding the item with the worst fit) yourself. Compare the outcomes of questions (j) and
(k). Which split-half method is used in the SPSS Reliability procedure? Which method is
generally preferred and why? That of SPSS or the odd-even method?

l) Which assumptions are made when applying the split-half method? (Tip: parallelism.) Check
whether these assumptions were met to a reasonable extent with regard to the odd-even method.
What is the impact on the split-half reliability if these assumptions are violated?

m) Starting point: we will assume that the following subsets of items from the VOEG are two
subscales: CDGHQ and JPRTU. A reliability analysis results in a Cronbach’s alpha of 0.68 for
subset 1 and 0.82 for subset 2. The correlation between both sum-scores is 0.57. What is the true
correlation between both subscales? Take a closer look at the content of each subscale. Is the
VOEG a homogeneous scale?

n) We will now assume that the following items from the VOEG are also a subscale: ABELNS.
What is measured in this subscale? What items have not been included in a scale yet? Where do
these items fit in in terms of their content? How well do they fit into the overall VOEG from a
statistical point of view?

34
Assignment 2. Intelligent measurement

Among other things, the ISI (Intelligence, Study progress and Interest) test for pupils completing
the last two years of their primary education includes six subtests, each of which has 20 items
scored dichotomously:
1: synonyms (verbal)
2: cut figures (spatial)
3: oppositions (verbal)
4: rotating figures (spatial)
5: understanding linguistic categories (categorising)
6: understanding categories of figures (categorising)

Each of these tests is taken in a classroom setting, subject to a time-limit, in other words the
pupils must stop when the time specified in the test guidelines has elapsed. The test score is the
number of correct answers and is a function of power (the proportion of correct answers) as well as
speed (the number of items completed). During the study that resulted in this file, only around
half of all pupils were able to complete each subtest within the time-limit. This variation in the
number of items completed renders psychometric analysis of tests subject to a time-limit
complicated (as an example, see Crocker & Algina, 1986, p. 145). For this reason, the current
stat3pr34b.sav file only contains those students who were able to complete all subtests within
the time-limit. Please note that a real analysis of tests subject to a time-limit must also include
the respondents that were not able to complete everything. Psychometric methods to evaluate this
are still being developed.

Questions:

a) Carry out a reliability analysis for ISI subtest 4 (see Appendix C for an example of an item.)
Request the output you will need for an item analysis and for manually calculating the true
variance and measurement error variance of the sum-score.

b) Which items seem the easiest and which the hardest? What role could the time-limit for the
test play in this?

c) Which item fits the best in the scale and which the least well?

d) Using classical test theory, estimate the reliability of a random item. Which assumptions apply
when you do this? Verify these assumptions using the SPSS output.

e) Use the Spearman-Brown formula to calculate Cronbach’s alpha using the item reliability and
compare the outcome with the alpha provided in SPSS.

f) What are the estimated true variance and measurement error variance for the sum-score? If the
measurement error has a normal distribution, what are the margins of the measurement error
assuming a probability of 95%?

35
g) What does this margin tell you about the replicability of the test score for these individuals?
Does a small margin imply that when the same children retake the test a few months later each
child will achieve roughly the same test score as they did the first time the test was taken?
Provide arguments for your answer. (Tip: parallel-test reliability vs. test-retest reliability.)

h) Starting point: Cronbach’s alpha for ISI subtest 5 is 0.87. The sum-scores for ISI-4 and ISI-5
have a correlation of 0.49.
Question: Express an opinion on each of the statements below, providing substantive and
statistical arguments:
(i) ISI-4 and ISI-5 measure two independent personal traits
(ii) ISI-4 and ISI-5 measure the same personal trait

36
SPSS instructions for practical 3

Assignment 1

a) Select Analyze - Descriptive Statistics - Descriptives.

c) Select Analyze - Scale - Reliability Analysis. Select all 21 items as the input variables. Under
the Statistics button, request all the information under Descriptives for and Summaries.

j) Click the Model button within the Reliability procedure and select ‘split-half’. Uncheck all the
additional output.

k) In the data screen, use Transform – Compute Variable to calculate two variables with 10
items each (make sure you exclude the item with the poorest fit, see question j)):
- SUMODD, the sum-score for the odd items (voega + voegc etc.)
- SUMEVEN, the sum-score for the even items (voegb + voegd etc.)
Next, calculate the correlation between the two sum-scores and compute the split-half reliability
of the test by hand using the Spearman-Brown formula.
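The by-hand step can be sketched as follows; r_halves is a placeholder for the SUMODD-SUMEVEN correlation you obtain from your own output. With K = 2, the Spearman-Brown formula steps the half-test correlation up to the full-test reliability.

```python
# Split-half reliability by hand: correlate the two half-scores, then step
# the correlation up with Spearman-Brown (K = 2). The value of r_halves is
# a placeholder; use the SUMODD-SUMEVEN correlation from your own output.
r_halves = 0.75
split_half = 2 * r_halves / (1 + r_halves)
print(round(split_half, 3))
```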

Note:
If you add up items using + signs, you will only include respondents for whom there are no
missing values. Respondents with one or more missing values will be given a missing value as
their sum-score. As far as this practical is concerned, it is OK to use this addition method as there
are hardly any missing values. This also means the number of respondents is exactly the same as
for question (j), which enhances the comparability.
For practical applications, there are better yet more complicated methods to calculate
sum-scores if there are missing values. A brief discussion of these is provided in the SPSS
manual on Reliability.

l) Carry out a paired t test for both sum-scores calculated in question (k).
In addition, carry out a reliability analysis (Model: alpha) for each of the scale halves, skipping
the optional output. Make sure you use the individual items from the relevant scale half as the
input for these reliability analyses, rather than the sum-scores calculated in question (k)!

Assignment 2
a) As for assignment (1c). Analyse only the items for ISI subtest 4!

37
Tutorial 6: Statistics III

Subject: discussion of SPSS assignments practical 3

This tutorial is dedicated in its entirety to the discussion of the results of the SPSS reliability
analysis practical. Any remaining time can be spent reviewing any assignments from the
previous tutorial (classical psychometrics) that have not been discussed yet or a look ahead to the
next tutorial (modern psychometrics).

Instructions with regard to the working method:

1. For the relevant assignment, first review the global structure of the SPSS output.

2. Then discuss the output using the questions for this practical as set out in this course
textbook.

3. Any questions or issues that remain unclear can be raised during the Q&A session.

38
Tutorial 7: Statistics III

Subject: modern test theory, Rasch model

These assignments mostly concern the theory. Real-world calculations in modern test theory are
too advanced for this course and require specialised software. The aim of the assignments is to
gain a good understanding of the similarities and differences between classical and modern
psychometrics and an understanding of item parameters, latent traits, item information and test
information.

Assignment 1. Modernisation

Starting point:

The 1- and 2-parameter logistic models for dichotomous data (0/1) assume unidimensionality and
monotonicity of the ICCs (item characteristic curves): for all items in the test, the probability of
item score 1 (= ‘correct’ in tests, ‘agree’ or ‘yes’ in questionnaires) is a monotonically increasing
function of the same latent trait θ (unidimensionality). The slope and location of the curve in
relation to the θ axis depend on the item parameters. In addition, the models assume what is
referred to as local independence, which means that for a fixed trait value θ all item scores are
independent of each other and therefore uncorrelated.

Questions:

a) Draw a number of ICCs according to the 2-parameter model. Explain how the item parameters
determine the slope and location of the curve. What is the difference between the 1 and 2-
parameter model?
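A minimal sketch of the 2-parameter logistic response function may help here; the parameter values are illustrative. Location b shifts the curve along the θ axis, slope a steepens it, and the 1-parameter (Rasch) model fixes a to the same value for all items.

```python
import math

def icc_2pl(theta, a, b):
    """2-parameter logistic ICC: probability of item score 1 given theta."""
    return 1 / (1 + math.exp(-a * (theta - b)))

# Location b shifts the curve; slope a steepens it (illustrative values)
print(round(icc_2pl(0.0, a=1.0, b=0.0), 2))   # at theta = b, P = 0.50
print(round(icc_2pl(1.0, a=1.0, b=0.0), 2))   # 0.73
print(round(icc_2pl(1.0, a=2.0, b=0.0), 2))   # steeper curve: 0.88
```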

b) Which elements of classical test theory represent unidimensionality and monotonicity? (Tip:
what is the equivalent of a latent trait in classical test theory?)

c) Which element of classical test theory represents the assumption of local independence?
Does this assumption imply that all item-item correlations are equal to zero? Explain.

d) Which statistical measures in classical item analysis (as seen in the table for question 4 in
tutorial 5) are the equivalents of the item parameters from the 2-parameter logistic model?

e) What will roughly be the outcomes of classical item analysis if the data to be analysed
correspond with the 1-parameter (Rasch) model? And what if they correspond with the 2-
parameter model? In your answer, focus on the item-p-values and the item-rest correlations.

39
f) What shape will the ICCs of the 1-parameter model take in the case of a very high
discrimination parameter (= a)? How large will Cronbach’s alpha be in that case? Come up with
a test where this might occur.

g) Answer question (f) if the discrimination parameter is very small (close to 0).

h) In assignment 1 of the SPSS Reliability practical (tutorial 6) we analysed the VOEG, a scale
of 21 items relating to health issues. Review the analysis once more. Question: Assume that we
are to apply the 1-parameter logistic model to the VOEG data. Which item would have the
highest difficulty parameter b and which the smallest?

i) Assume that we are to apply the 2-parameter logistic model to the VOEG data. Which item
would have the highest discrimination parameter a and which the smallest?

j) In assignment 1m) of the SPSS Reliability practical (tutorial 6), we analysed two item subsets
from the VOEG, being the CDGHQ and JPRTU items. We observed the following: Cronbach’s
alpha was 0.68 for subset 1 and 0.82 for subset 2. The correlation between both sum-scores is
0.57. Question: which assumption in the 1 and 2-parameter logistic models is violated in the
VOEG data according to this analysis?

Assignment 2. Measuring is knowing - or is it?

Starting point:

In the 1- and 2-parameter logistic models, the general formula for item information can be defined
as I = (Da)² P(1 − P). Here, P is the probability of a correct answer and Da is the discrimination
parameter, with D being an arbitrary constant larger than zero.
Note:
Crocker and Algina (1986, p. 353) set D at 1.7 for a specific reason they explain in their text.
Others typically opt for D = 1, as is customary in most of the literature. The choice between the
two values is similar to expressing a length in cm or mm: the only effect is that a is expressed
on a different scale. Here we will assume D = 1.
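The item information formula above can be sketched directly (with D = 1, as assumed here, and illustrative parameter values). Information peaks at θ = b, where P = 0.5 and I = a² × 0.25.

```python
import math

def item_info(theta, a, b, D=1.0):
    """Item information I = (D*a)**2 * P * (1 - P) in the logistic model."""
    p = 1 / (1 + math.exp(-D * a * (theta - b)))
    return (D * a) ** 2 * p * (1 - p)

# With a = 1, information peaks at theta = b, where P = 0.5 and I = 0.25
print(round(item_info(theta=0.0, a=1.0, b=0.0), 2))   # 0.25
print(round(item_info(theta=2.0, a=1.0, b=0.0), 2))   # 0.10, far from b
```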

Questions:

a) What will P be if the value of latent trait θ goes to negative infinity, positive infinity or is
equal to item difficulty b?

b) What P will yield maximum item information and what P will yield minimum item
information?

c) What item difficulty will therefore provide maximum information on the individual’s ability if
the true ability θ of that individual is 1? What item difficulty will provide maximum information
on a respondent for whom θ = 0, and for a respondent for whom θ = -1?

40
d) What effect will discrimination parameter a have on the item information? If a increases, will
this information increase or decrease? Please note: P partly depends on a!

Assignment 3. Best test ever

Refer to the lecture notes on modern psychometrics and look at the slides titled ‘Test information
and Test construction’. We will now calculate item information and test information for a few
cases. Use the auxiliary tables for this.

Auxiliary table 1: calculation of the probability of success P using item parameters a and b and
latent trait value θ:

if a(θ-b) =  -3.0  -2.5  -2.0  -1.5  -1.0  -0.5   0.0  +0.5  +1.0  +1.5  +2.0  +2.5  +3.0
then P =     0.05  0.08  0.12  0.18  0.27  0.38  0.50  0.62  0.73  0.82  0.88  0.92  0.95

Auxiliary table 2: calculation of probability of success P for the 1-parameter model (with a = 1)

          θ = -2   θ = -1   θ = 0   θ = +1   θ = +2
b = -1     0.27     0.50    0.73     0.88     0.95
b =  0     0.12     0.27    0.50     0.73     0.88
b = +1     0.05     0.12    0.27     0.50     0.73

Auxiliary table 3: calculation of item information a² P(1 − P) for the 1-parameter model (with
a = 1):

          θ = -2   θ = -1   θ = 0   θ = +1   θ = +2
b = -1     0.20     0.25    0.20     0.11     0.05
b =  0     0.11     0.20    0.25     0.20     0.11
b = +1     0.05     0.11    0.20     0.25     0.20
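The auxiliary table entries can be spot-checked with the logistic function; the sketch below recomputes one entry from each table (with a = 1).

```python
import math

def p_success(x):
    """Logistic probability of success for x = a * (theta - b)."""
    return 1 / (1 + math.exp(-x))

# Spot-check a few entries of auxiliary tables 1-3 (with a = 1)
print(round(p_success(-1.0), 2))   # table 1 at a(theta-b) = -1.0: 0.27
print(round(p_success(1.0), 2))    # table 2 at b = 0, theta = +1: 0.73
p = p_success(0.0)
print(round(p * (1 - p), 2))       # table 3 at b = 0, theta = 0: 0.25
```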

Questions:

a) Check the calculation of item informations in auxiliary table 3 yourself (only for b = 0).

41
b) Now calculate the test information for the heterogeneous test (b1 = -1, b2 = 0, b3 = 1) as well
as the homogeneous test (b1 = b2 = b3 = 0) in the lecture slides. Which of the two tests is the
most informative?

c) Why does optimal test construction depend on the distribution of θ in the population in which
the test will be used? When thinking about test construction, consider two features: a high or low
value for discrimination parameter a and a homogeneous or heterogeneous difficulty parameter
b.

d) Why does optimal test construction also depend on the purpose of the test (accurate
measurements across the entire skills spectrum or a selection of pupils or staff members, where
for example θ > 1 means a pass)?

42
Knowledge questions part 2: psychometrics

Classical test theory

1. What types of items and scoring in a questionnaire or test form the focus of classical test
theory?

2. What is the formal definition of reliability in classical psychometrics?

3. Why is the reliability of a measurement tool often indicated by the letter ρ or r, which is the
symbol for correlation (in the population and the random sample respectively)?

4. What does the restriction of range effect entail?

5. And the attenuation effect? How can we correct for this effect?

6. Why would a progress test consist of so many items?

7. What is the Spearman-Brown formula used for? What input is needed and what output will
this produce?

8. Name the four methods for testing the reliability of a test or questionnaire.

9. Which of these four methods is very different from the other three? (Tip: error of
measurement.)

10. Which two types of reliability testing are in effect applications of the Spearman-Brown
formula? What input does each type require?

11. What does the ‘standard error of measurement’ (SEM) entail?

12. How can you apply the SEM in order to establish whether there is a difference in the true
scores of two individuals? Which assumptions do we make to do this?

13. Answer the questions in 12 for the difference between two measurements for the same
individual.

14. How does item analysis estimate the reliability of each item in a test or questionnaire?

15. Which results from an item analysis could provide grounds for possibly removing items from
a measuring tool?

43
Modern test theory

16. Give reasons why modern psychometrics came to be developed in addition to classical
psychometrics. (Tips: level of measurement, parallel items, tailored testing.)

17. What does the assumption of unidimensionality in modern test models mean? Which
assumption is the equivalent in classical test theory?

18. Answer the questions in 17 for the assumption of ‘monotonicity’. How can you roughly
verify this assumption using statistical methods from classical psychometrics?

19. Answer the questions in 17 for the assumption of ‘local independence’. Does this assumption
imply that all item-item correlations are equal to zero? How can you verify this assumption?

20. Which parameters does the 1-parameter (Rasch) model from modern test theory include?
And the 2-parameter model?

21. For each parameter in question 20, indicate which statistic in classical item analysis more or
less corresponds with this.

22. The true score model from classical test theory does not include item parameters. What type
of result should an item analysis produce according to that model? (Think back to question 21.)

23. In education, there is often a need to compare (groups of) individuals based on their test
results, although not all of these individuals have answered the same items. Examples are the
Dutch CITO tests in primary education (the tests change year on year) and the progress tests
administered by UM (each student selects the items he or she will answer). How is it possible to
draw fair comparisons between individuals and what assumptions are required to guarantee this?

24. What does the concept of item information mean? Will this information only depend on the
item or does the individual tested also have an impact?

25. How can item information be derived from a graph containing item characteristic curves?
What probability of a correct answer produces maximum item information and for what
probability is this information minimal?

26. What does the concept of test information mean and how does it depend on item
information?

27. Let’s assume a test is used to decide on a pass or a fail (an example could be a course exam).
What test construction should be used to obtain maximum information? Should the test include
items that vary from very easy to very difficult or should the test simply consist of moderately
difficult items?

28. Answer question 27 if the aim is to obtain the best possible measurements across the entire
skills spectrum. What role does the item discrimination parameter play in this?

44
Tutorial 8: Statistics III

Subject: factor analysis

During this tutorial, we will run through each step of a factor analysis as discussed in the lecture,
using a number of small ‘pen-and-paper’ calculations for the same data discussed in the lecture.
The aim is to gain a good understanding of the criteria and methods of each step.

General theory questions (brief discussion, no more than 20 minutes)

Assignment 1. Prewash, main cycle, spin cycle

a) What are the three steps in an exploratory factor analysis?

b) Which analyses does the first step entail and what is the purpose of each analysis?

c) Why is it useful to review a scatter plot for each pair of variables? How many plots would you
need to review if there are 10 variables?

d) Which methods can you list for the second step and how do these methods differ from each
other?

e) Which criteria can you list for the number of factors and what does each criterion entail?

f) Which methods can you list for the third step? Which of these methods is preferable and why?

g) Describe the differences between exploratory and confirmatory factor analysis.

Calculations

Assignment 2. Reducing data whilst retaining information

The starting point is the principal component analysis (PCA) of four intelligence subtests
(discussed in the lecture). Questions:

a) Use the table on slide 4 to perform your own calculation of the correlation between both
verbal tests and both spatial tests. Compare your outcomes with the table on slide 5.

b) Which tests have the strongest correlation, the verbal or the spatial tests?

c) How many factors should we expect based on these tables? Explain.

45
d) Use the table on slide 15 to calculate the communality and uniqueness of test scores V1 and
S2. Do this according to a 1-factor model first, followed by a 2-factor model. Compare your
outcomes with the relevant cells in the tables on slides 17 and 19.

e) Use the table on slide 15 to calculate the reproduced and residual correlation between test
scores V1 and S2. Do this according to a 1-factor model first, followed by a 2-factor model.
Compare your outcomes with the tables on slides 18 and 19.

f) Now review the table on slide 19. How many factors would you select and why?
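The mechanics of these calculations can be sketched as follows. The loadings below are invented for illustration (they are not the slide 15 values): communality is the sum of squared loadings, uniqueness is its complement, and the reproduced correlation between two variables is the sum of products of their loadings.

```python
# Communality, uniqueness and reproduced correlation from factor loadings.
# The loadings below are invented for illustration; use the values from the
# lecture slides for the actual assignment.
load_v1 = [0.8, 0.3]    # loadings of V1 on factors 1 and 2
load_s2 = [0.4, -0.6]   # loadings of S2 on factors 1 and 2

h2_v1 = sum(l ** 2 for l in load_v1)   # communality of V1
u2_v1 = 1 - h2_v1                      # uniqueness of V1
r_reproduced = sum(a * b for a, b in zip(load_v1, load_s2))
# residual correlation = observed correlation - reproduced correlation

print(round(h2_v1, 2), round(u2_v1, 2), round(r_reproduced, 2))
```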

Assignment 3. Factors vs. components

The starting point is the principal factor analysis (PFA) of the case used in assignment 2.
Questions:

a) Use the factor loadings in the table on slide 22 to calculate the following for test scores V1
and S2 based on the two-factor model: communality, uniqueness and the reproduced and residual
correlations between them. Compare your outcomes with the information in the table on slide 23.

b) Compare the communalities of the PFA (table on slide 23, and the outcomes for question (a))
with those for the PCA (table on slide 19). Which method is better able to explain the variance in
the test scores?

c) Compare the residual correlations in the table on slide 23 (PFA) with those in the table on
slide 19 (PCA). Which method is able to provide the best explanation for the correlations
between the four test scores?

d) Explain the outcomes of the comparisons in questions (b) and (c).

e) Which method (PCA or PFA) should be the preferred choice at what time?

Assignment 4. Simple and still effective

a) Review the criteria for selecting the number of factors (slide 25). State for each criterion how
many factors you would extract in the example of the four IQ tests using PFA.

b) Now review the factor plots for this case before rotation, after orthogonal rotation and after
oblique rotation (figures on slides 28, 30, 31). Which rotation method should be the preferred
choice and why?

c) Given the plot and the loadings, how would you interpret the factors?

d) Repeat questions (a), (b) and (c) for the factor analysis of 15 items relating to the working
conditions among nursing staff (slide 35).

46
SPSS practical 4: Statistics III

Subject: SPSS factor analysis

During this practical, we will apply the individual steps in a factor analysis that we practised in
the last tutorial to a different data set and we will also use SPSS, given the large number of
calculations.
The aim here is to gain a good understanding of the differences between the various methods for
factor extraction, of the criteria for selecting the number of factors and of methods for rotating
and interpreting factors. There will be an assignment for each step in factor analysis.

View the following YouTube video in order to prepare for, or complement, the practical:
https://www.youtube.com/watch?v=Nj9tj4AGAA0

Please note that a set of specific SPSS instructions for the following assignments can be found on
page 49.

Assignment 1. Uncritical acceptance: the prewash

In 1975, Statistics Netherlands (CBS) conducted a large-scale survey to elicit the opinions of
Dutch nationals on a wide range of topics. This assignment looks at the responses of 110
individuals to 8 Likert items vr1-vr8 (responses varying between ‘agree’ and ‘disagree’) in
relation to their view of the government, among other topics (file stat3pr34c.sav; see Appendix
C for the content of the items).

Questions:

a) For each item, check for the presence of missing values, outliers or strong non-normality.

b) Carry out a reliability analysis for this scale and also request the item-item correlation matrix.
What is your assessment of the internal consistency of this scale and of the extent to which each
item fits the scale? Provide arguments for your answer.

c) Can this scale be said to be unidimensional? How can you tell and what does this imply for the
sum-scores for this scale?

Assignment 2. Uncritical acceptance: the main cycle

a) Carry out a principal component analysis (PCA) for these eight items. Determine the number
of factors in accordance with the K1 criterion (eigenvalues). Request all output with the
exception of factor scores. Do not apply rotation. How many factors should be extracted in
accordance with each criterion? Which items demonstrate a similar factor pattern? Compare this
with assignment 1b.

47
b) Verify whether the chi-square test in the ML factor analysis confirms the number of factors
selected in 2a.

c) Perform a principal factor analysis (PFA) and extract the number of selected factors. Request
the reproduced correlation matrix, communalities and factor loadings. Compare the output with
that of the principal component analysis. What similarities and differences can you see?

d) The differences in output requested under c) are very small in this instance. Why is this?
(Tip: look at the initial communalities = R2.)

Assignment 3. Uncritical acceptance: the spin cycle

a) Repeat the principal factor analysis of assignment 2c, but now carry out a Varimax rotation.
Request a table and a plot for the rotated factor loadings.

b) Repeat question 3a with oblique rotation. Also request the correlations between the factors.
How high is the correlation between the factors?

c) Compare the unrotated factor pattern (assignment 2c) with both rotated patterns. Which
pattern should be the preferred choice and why?

Assignment 4. Uncritical acceptance: sorting the clean laundry

Based on the factor analyses, decide which items belong in the same subscale.
For each subscale, use COMPUTE to calculate the sum-score for the items in question.
Request the correlation matrix for the sum-scores. Compare this matrix with the correlation
matrix for the factors after oblique rotation. What do you notice? What does this tell you about
the meaning of the factors?

48
SPSS instructions for practical 4

Assignment 1

a) Use Analyze – Descriptive Statistics -Frequencies to request a histogram for each variable
(vr1 - vr8). Use Frequencies or Descriptives to request the mean, SD, minimum, maximum,
skewness and kurtosis. For ordinal variables, this will be adequate. In the case of continuous
variables, such as response times, it is better to use Explore (allows testing for normality and
provides an overview of outliers, among other things).

b) Select Analyze - Scale - Reliability Analysis. Select all eight items (vr1 – vr8) as the input
variables. Under Statistics, request all information under Descriptives for and Summaries, as
well as inter-item correlations.

Assignment 2

a) Select Analyze - Dimension Reduction - Factor. Select all eight items as variables and
select:
Descriptives: all output under this button (useful for an initial analysis of a file.
For follow-up analysis, use only the initial solution and the reproduced correlation matrix).
Extraction:
- under Method: principal components
- under Analyze: the correlation matrix
- under Extract: eigenvalues > 1
- under Display: unrotated factor solution, scree plot
Rotation: no rotation (not until assignment 3), only a loading plot
Scores: skip (only needed for factor scores)
Options: skip (only relevant if there are missing values).

b) As for assignment (2a), with the following adaptations:


Descriptives: initial solution
Extraction:
- under Method: Maximum Likelihood
- under Extract: fixed no. of factors ... (fill in the number of selected factors)
- under Display: unrotated factor solution, no scree plot
- under Rotation: nothing
If the selected factor model is rejected, test a model with one additional factor.
If the selected factor model is not rejected, test a model with one factor fewer.

c) As for (2b), but select Principal Axis Factoring under Method and specify the number of
factors selected in question 2b under Extract. Restrict the output under the Descriptives button to
the initial solution and the reproduced correlation matrix.

49
Assignment 3

a) As for assignment (2c), with the following adaptations: Under Extraction: select fixed no. of
factors and choose the number based on assignment 2. Under Rotation: Select Varimax and
request the rotated factor solution and the factor plot (‘loading plot’).

b) As for assignment (3a), but instead select Oblimin or Promax rotation and keep the default
values for delta and kappa, respectively.

50
Tutorial 9: Statistics III

Subject: discussion of SPSS assignments practical 4

This tutorial is dedicated in its entirety to the discussion of the results of the SPSS factor analysis
practical. Any remaining time can be spent on any assignments from the previous tutorials that
have not been discussed yet.

Instructions with regard to the working method:

1. For the relevant assignment, first review the global structure of the SPSS output.

2. Then discuss the output using the questions for this practical as set out in this course
textbook.

3. Any questions or issues that remain unclear can be raised during the Q&A session.

Tutorial 10: Statistics III

Subject: validity and agreement

With only a short while to go until the course exams, we will not introduce many more new or
complicated topics. That is why this tutorial only has a handful of assignments for the first hour
of the session. The second hour can be spent on a recap of the course or on questions.

Assignment 1. As many different aspects as there are people

Let’s assume there are four questionnaires, one for extraversion (E), one for impulsiveness (I),
one for remoteness (R) and one for neuroticism (N). All four questionnaires have a reliability of
0.70. We have established the following correlations between the sum-score for the extraversion
questionnaire and the three other sum-scores: r(E,I) = 0.50, r(E,R) = -0.70, r(E,N) = 0. Question:
For each of the other questionnaires, determine from a statistical point of view the extent to
which it measures the same thing as the extraversion questionnaire.
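If you want to check your hand calculations afterwards, the correction for attenuation, r_true = r_obs / sqrt(rel_X * rel_Y), fits in a minimal Python sketch. The reliabilities and observed correlations are the ones given above:

```python
from math import sqrt

# Reliability of every questionnaire (given: 0.70 for E, I, R and N)
rel_e = rel_other = 0.70

# Observed correlations of the extraversion sum-score with the other three
observed = {"I": 0.50, "R": -0.70, "N": 0.0}

# Correction for attenuation: r_true = r_obs / sqrt(rel_X * rel_Y)
corrected = {name: r / sqrt(rel_e * rel_other) for name, r in observed.items()}
print({name: round(r, 3) for name, r in corrected.items()})
```

The corrected values estimate how strongly the underlying true scores correlate, which is the statistical sense in which two questionnaires "measure the same thing".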

Assignment 2. Second opinion

Over the course of a few years, 200 patients with back problems from one region have been
referred to a physiotherapy practice run by two part-time therapists. The two therapists want to check
whether there is agreement between them on their decision whether to apply traction or not
(traction is a specific treatment method). In order to establish this, they assess each patient
without consulting one another. This results in the contingency table below.

                         therapist B
                      no   possibly   yes
therapist A  no       30      20      10
             possibly 20      40      20
             yes      10      20      30

Questions:
a) For what percentage of patients are the therapists in full agreement? And for what
percentage do they completely disagree? What is your opinion of this level of agreement?

b) Now calculate the kappa measure of agreement (unweighted). What is your opinion of
this level of agreement? Is this level good, average or poor?

c) Now calculate the kappa with linear weighting and answer question b once more.

d) Finally calculate the kappa with quadratic weighting and answer question b again.

e) Which kappa would you choose in this instance, and why?
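Questions (a) to (d) can be worked out by hand, but the same calculations fit in a short, plain-Python sketch (no SPSS needed). The unweighted kappa uses identity weights; with agreement weights w(i,j) = 1 − (|i − j|/(k − 1))^p, p = 1 gives linear and p = 2 quadratic weighting:

```python
# Contingency table from the assignment: rows = therapist A, columns = therapist B
# (categories in order: no, possibly, yes)
table = [[30, 20, 10],
         [20, 40, 20],
         [10, 20, 30]]
k = len(table)
n = sum(sum(row) for row in table)
row_tot = [sum(row) for row in table]
col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]

def weighted_kappa(power):
    """Cohen's kappa; power=None -> unweighted, 1 -> linear, 2 -> quadratic weights."""
    def w(i, j):  # agreement weight for cell (i, j)
        if power is None:
            return 1.0 if i == j else 0.0
        return 1.0 - (abs(i - j) / (k - 1)) ** power
    # observed and chance-expected (weighted) proportions of agreement
    p_o = sum(w(i, j) * table[i][j] for i in range(k) for j in range(k)) / n
    p_e = sum(w(i, j) * row_tot[i] * col_tot[j] for i in range(k) for j in range(k)) / n ** 2
    return (p_o - p_e) / (1 - p_e)

for p in (None, 1, 2):
    print(p, round(weighted_kappa(p), 3))
```

Use the sketch only to verify your own answers; the interpretation questions still have to be argued in words.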

Assignment 3. Academic progress

Let’s assume that we wish to use a progress test to closely monitor the academic progress of
bachelor students. Key aspects are the reliability and validity of the progress test. Question: for
each of the criteria below, indicate how the criterion could be assessed for the progress test: (a)
internal consistency, (b) retest reliability, (c) content validity and (d) construct validity (tip:
progress test)

Knowledge questions part 3: factor analysis and validity & agreement

Statistical pre-processing and factor extraction

1. What is the difference between covariance and correlation?

2. How is the covariance matrix for K variables constructed? And the correlation matrix? What do
we find on the diagonal and what off the diagonal?

3. What requirements must the measurement level and the distribution of the variables meet to do
a factor analysis?

4. And the correlations between the variables themselves?

5. What is the objective of factor analysis?

6. How is the first principal component constructed? And the second?

7. What is a factor pattern?

8. What is the eigenvalue of a principal component or factor?

9. How is this calculated from the factor pattern (before rotation)?

10. What are the communality and uniqueness of a variable?

11. How are they calculated from the factor pattern (before rotation)?

12. What is the reproduced and residual correlation between two variables?

13. How are they calculated from the factor pattern (before rotation)?

14. What is the difference between principal component analysis and principal factor analysis?

15. What advantage does each method offer compared to the other method?

16. What does the K1 criterion for factor selection mean? And the scree test?

17. What are the other criteria for determining the number of factors?

18. What is the best way to apply all of those criteria if they do not all result in the selection of
the same number of factors?
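Questions 8 to 13 can be made concrete with a small numerical example. The factor pattern below is made up: the eigenvalue of a factor is the column sum of squared loadings, the communality of a variable is the row sum of squared loadings, and the reproduced correlation between two variables is the sum of the products of their loadings:

```python
# Hypothetical unrotated factor pattern A: 4 variables (rows) x 2 factors (columns)
A = [[0.8, 0.10],
     [0.7, 0.20],
     [0.2, 0.75],
     [0.1, 0.80]]
n_factors = len(A[0])

# Eigenvalue of a factor = column sum of squared loadings (question 9)
eigenvalues = [sum(row[j] ** 2 for row in A) for j in range(n_factors)]

# Communality of a variable = row sum of squared loadings (question 11);
# the uniqueness is 1 minus the communality
communalities = [sum(a ** 2 for a in row) for row in A]

# Reproduced correlation between variables i and k = sum of products of their
# loadings (question 13); the residual is the observed minus the reproduced r
def reproduced_r(i, k):
    return sum(A[i][j] * A[k][j] for j in range(n_factors))

print([round(e, 4) for e in eigenvalues])
print([round(c, 4) for c in communalities])
print(round(reproduced_r(0, 2), 4))
```

Working through one row and one column of this pattern by hand is a good check of the definitions before the exam.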

Factor rotation and confirmatory factor analysis

19. What does factor rotation mean?

20. Once the factors have been extracted, they can be rotated arbitrarily. Why is this?

21. Factor rotation is used to produce a simple structure. What does that mean?

22. What is orthogonal rotation and what is the best-known method?

23. What does oblique rotation mean?

24. Which rotation method, orthogonal or oblique, is usually best, and why?

25. What is the difference between exploratory and confirmatory factor analysis?

26. What criteria are used to select the number of factors in a confirmatory analysis?

27. Why is there no application of rotation in confirmatory analysis?

28. What are factor scores? What is the simple alternative for factor scores?

29. What is cross-validation and when do you need to apply it?

Validity and agreement

30. What is content validity and how can this be verified for a test or questionnaire?

31. Answer the questions in 30 for the concept of ‘predictive validity’.

32. Answer the questions in 30 for the concept of ‘construct validity’.

33. Why is reliability a prerequisite for validity?

34. Why is it that reliability and validity are often also at odds with each other?

35. What statistical measure can be used to verify whether two tests or questionnaires measure
the same thing?

36. What do the sensitivity and specificity of a diagnostic test refer to?

37. Why is a high correlation between two assessors not sufficient to ensure good agreement?

38. Which is the common measure of agreement between two assessors for dichotomous or
nominal ratings?

39. Why is this measure less suitable for ordinal ratings? Which measure is suitable for ordinal
ratings?

40. Which is the best measure for verifying agreement between two assessors in the event of a
quantitative rating?

41. Does good agreement between the assessors mean that there will be little or no difference for
a random individual as to whom he or she is assessed by? Explain.
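As an illustration of question 36: sensitivity and specificity follow directly from a 2x2 table of test result against true status. The numbers below are made up:

```python
# Hypothetical 2x2 table: diagnostic test result vs. true status
tp, fn = 80, 20   # diseased persons: test positive / test negative
fp, tn = 10, 90   # healthy persons: test positive / test negative

sensitivity = tp / (tp + fn)  # proportion of diseased persons the test detects
specificity = tn / (tn + fp)  # proportion of healthy persons the test clears
print(sensitivity, specificity)  # 0.8 0.9
```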

Appendix A General tips for using SPSS for Windows

SPSS has three important windows that should be distinguished:

1. data (.sav): the raw data matrix (individuals as rows, variables as columns)

2. output viewer (.spv): results of analyses

3. syntax (.sps): save (via the Paste button available for all statistical analyses), edit and execute
SPSS commands in the old command style (as previously under DOS)
(This window is not opened by default, but can be opened via File – New or Open)

Data matrix:
individual X1 X2 .... XK
1
2
...
N

The SPSS toolbar has various pull-down menus, from File through to Help. Listed below are the
four menus that are specific to SPSS, along with the most frequently used options for each menu:

Data: defining new variables, sorting and selecting cases, merging files

Transform: calculating and recoding values, converting raw values to rankings, imputing
missing values. Options under Transform are applied to each individual (row) in the data file.
The result will be a new or amended variable (column).

Analyze: all statistical methods from our course and more

Graphs: creating all manner of graphs. The Legacy Dialogs option allows graphs to be defined in
the same manner as for analyses, that is, by placing variables in boxes using mouse clicks. The
Chart Builder option requires you to drag and drop variables.

Note:
Selection with filtering for unselected cases (in the Data menu) only has an impact on procedures
carried out using Analyze and Graphs. Procedures under Transform are not affected. This means

unselected cases will not be included in a statistical analysis, but they will be included when
calculating or recoding variables.
Selection with deletion of unselected cases (in the Data menu) results in the removal of cases
from the data window and also from the data file if this data window is saved.

Always check that the selection of persons or the creation/change of variables has been carried
out correctly. The result can be incorrect, especially if the selection or creation/change uses a
variable which has missing values. Always check selections or operations using a frequency
distribution (for new variables), a contingency table (new versus old variable) or a listing
(Analyze – Reports – Case Summaries).

A few other tips:

Saving paper: (1) Remove superfluous or incorrect output from the window, (2) reduce figures or
tables in size if necessary and (3) print two pages on each A4 sheet.

Adding SPSS commands to the output:
Select: Edit – Options – Viewer – Display commands in the log.

Names instead of labels for variables in the input screen and/or output file:
Select: Edit – Options – General (for input screens this only becomes effective after the data
have been loaded again), or Edit – Options – Output labels (effective immediately for output).

Values instead of labels to see the values assumed by the variables in the output:
Select Edit – Options – Output labels (effective immediately for output)

Including comments/explanation in the output (for later reference). Include lines that start with a
* and end in a full stop in the syntax window. If you ‘run’ these lines, they will be printed in the
output but not interpreted by SPSS as command lines.

Appendix B Importing SPSS output in Word

SPSS includes easy ways to copy output (both tables and graphs) to Word. Select a graph and
select <Copy> via <Edit>. Now open a Word document, hold down the right mouse button and
select <Edit> and <Paste>. If you hold down the Shift key, you can select and copy multiple
graphs or tables in one go if needed.
If a table doesn’t fit on the page, click the right mouse button and select <Autofit> and <Autofit
to content>.

The option below can be used as an alternative, particularly if there is a large volume of output:
 Exporting all output at once: go to the output window. At the top, select <File> and
<Export>:

- Objects to Export: select All to indicate that you wish to export all output. In this case, the
output will contain a number of useless tables that are not visible in the spv file. To exclude
these tables, select All Visible instead of All.
- Type: select Word/RTF (*.doc). As you can see, you also have the option of exporting the
output in PDF, Excel or PowerPoint format.
- File Name: define the location and name of the Word file to which you wish to export the
SPSS output.
- Change options: the default option is Wrap table (tables that are too wide are split into
multiple tables shown one underneath the other). Select Shrink table instead if you prefer
a wide table to be scaled down to fit the page.

Appendix C Training files

This appendix only describes the files for the psychometrics and factor analysis practicals. The
files for contingency table analysis and logistic regression each contain no more than a few
variables and a description is included with the relevant practical.

stat3pr34a.sav

This file contains the responses of 200 adult Dutch nationals to 21 items relating to physical
complaints. The items formed part of a survey on health in the Netherlands conducted around
1980 (items scored as follows: 1 = yes, 0 = no, blank = missing).
Please see below for the content of the items.

a. Is your appetite less than it would usually be?
b. Do you occasionally experience bloating or pressure in the abdominal area?
c. Do you quickly run out of breath?
d. Do you occasionally experience pain around your chest or heart?
e. Do you occasionally experience pain in your abdominal area?
f. Do you often have an unpleasant or sweet taste in your mouth?
g. Do you occasionally experience heart palpitations or pounding around your heart?
h. Do you occasionally experience tightness in your chest?
i. Do you experience pain in your bones or muscles?
j. Do you often feel fatigued?
k. Do you occasionally suffer from headaches?
l. Do you occasionally suffer from an upset belly?
m. Do you occasionally suffer from back pain?
n. Do you occasionally suffer from an upset stomach?
o. Do you experience numbness or tingling in your extremities?
p. Do you tire more quickly than you feel is normal?
q. Do you occasionally experience dizziness?
r. Do you sometimes feel listless?
s. Do you occasionally suffer from non-specific stomach problems?
t. Do you occasionally feel sleepy or drowsy?
u. Do you generally feel tired and not rested when you get up in the morning?

stat3pr34b.sav

This file contains the scores of 917 pupils in the last two years of primary education. The scores
concern two of the six intelligence subtests in the ISI: ISI-4 (rotation of figures) and ISI-5
(understanding word categories).
Names of the variables:
item4_1 to item4_20 incl. = scores for items 1-20 of subtest 4 (1 = correct, 0 = incorrect),
item5_1 to item5_20 incl. = scores for items 1-20 of subtest 5
sum4 and sum5 = sum-scores for both subtests.
To protect the copyright and confidentiality of the tests, the exact wording of the items has not
been reproduced here. Instead, fictional items are included below to give an impression of ISI-4
and an example of the instructions provided with ISI-5.

ISI-4 (rotation of figures): a fictional but similar item.


Instructions to the test subject:
On the answer sheet, underline the letters of the two figures that can be created by rotating the
figure to the left of the line. Underline two letters, no more and no fewer.

A B C D E

ISI-5 (understanding word categories): an example item from the instructions.


Instructions to the test subject:
The three words in bold print belong together. Two of the five words below also belong with
them. On the answer sheet, underline the letters corresponding with those two words.

Monday Wednesday Saturday

A January
B Tuesday
C April
D Sunday
E Evening

stat3pr34c.sav

This file contains the responses of 100 individuals to the following eight items from a survey
conducted by Statistics Netherlands in 1975. The items relate to their view of the government
and young people with a critical attitude.
For each item, the answer is chosen using an ordinal scale, where 1 = disagree and 5 = agree.
Items 4, 5 and 7 are worded in the opposite direction. The responses in the file have already
been mirrored (recoded), so for these items: 1 = agree, 5 = disagree.

1. There should be no cause to be angry if young people who protest against perceived injustices
occasionally break the law.
2. The significant increase in the number of government bodies has put personal freedom under
threat.
3. Young people should have a critical attitude regarding the status quo.
4. It is desirable for the government to introduce a law which allows all protests to be broken up.
5. The government is authorised to deploy soldiers in order to break a strike.
6. More state-owned companies should be transferred into private ownership.
7. Severe sentences should be introduced for people who disregard the instructions of the police
during protests.
8. By introducing tougher measures against terrorism and civil disobedience, the government
would be turning our country into a police state.

