Lecture 6 - Test Construction

How to make a psychological test

Uploaded by abdul aleem ali

Test Construction

Writing Items

 In the grand scheme of test construction, considerations related to the actual writing of the test’s items go hand in hand with scaling considerations.
 The prospective test developer or item writer immediately faces three
questions related to the test blueprint:
 What range of content should the items cover?
 Which of the many different types of item formats should be employed?
 How many items should be written in total and for each content area
covered?
Cont..

 When devising a standardized test using a multiple-choice format, it is usually advisable that the first draft contain approximately twice the number of items that the final version of the test will contain.
 If, for example, a test called “American History: 1940 to 1990” is to have
30 questions in its final version, it would be useful to have as many as
60 items in the item pool.
 Ideally, these items will adequately sample the domain of the test.
 An item pool is the reservoir or well from which items will or will not be drawn for the final version of the test.
Cont..

 A comprehensive sampling provides a basis for content validity of the final version of the test. Because approximately half of these items will be eliminated from the test’s final version, the test developer needs to ensure that the final version also contains items that adequately sample the domain.
 Thus, if all the questions sampling a particular domain were determined to be poorly written, then the test developer should either rewrite items sampling that domain or create new items.
 The new or rewritten items would then also be subjected to tryout so as
not to jeopardize the test’s content validity. As in earlier versions of the
test, an effort is made to ensure adequate sampling of the domain in the
final version of the test.
How does one develop items for the item pool?

 The test developer may write a large number of items from personal
experience or academic acquaintance with the subject matter.
 Help may also be sought from others, including experts.
 For psychological tests designed to be used in clinical settings, clinicians,
patients, patients’ family members, clinical staff, and others may be
interviewed for insights that could assist in item writing.
 For psychological tests designed to be used by personnel psychologists,
interviews with members of a targeted industry or organization will likely be
of great value.
 For psychological tests designed to be used by school psychologists,
interviews with teachers, administrative staff, educational psychologists, and
others may be valuable. Searches through the academic research literature may also prove fruitful.
Item format

 Considerations related to variables such as the purpose of the test and the number of examinees to be tested at one time enter into decisions regarding the format of the test under construction.
 Variables such as the form, plan, structure, arrangement, and layout of
individual test items are collectively referred to as item format.
Item format

 Two types of item format are the selected-response format and the
constructed-response format.
 Selected-response Format: Items presented in a selected-response format require testtakers to select a response from a set of alternative responses.
 Constructed-response Format: Items presented in a constructed-response format require testtakers to supply or to create the correct answer, not merely to select it.
Selected Response Format

 If a test is designed to measure achievement and if the items are written in a selected-response format, then examinees must select the response that is keyed as correct.
 If the test is designed to measure the strength of a particular trait and if the items are written in a selected-response format, then examinees must select the alternative that best answers the question with respect to themselves.
Cont..

 Three types of selected-response item formats are:
 Multiple-choice
 Matching
 True–false
Multiple Choice Format

 An item written in a multiple-choice format has three elements:
 (1) a stem
 (2) a correct alternative or option
 (3) several incorrect alternatives or options, variously referred to as distractors or foils.
Cont..

For example
 Stem ⟶ A psychological test, an interview, and a case study are:
 Correct alt. ⟶ a. psychological assessment tools
 Distractors ⟶ b. standardized behavioral samples
c. reliable assessment instruments
d. theory-linked measures
Matching item format

 In a matching item, the testtaker is presented with two columns: premises on the left and responses on the right.
 The testtaker’s task is to determine which response is best associated
with which premise.
 For very young testtakers, the instructions will direct them to draw a line
from one premise to one response.
 Testtakers other than young children are typically asked to write a letter
or number as a response.
True/False item format

 A multiple-choice item that contains only two possible responses is called a binary-choice item.
 Perhaps the most familiar binary-choice item is the true–false item.
 As you know, this type of selected-response item usually takes the form
of a sentence that requires the testtaker to indicate whether the
statement is or is not a fact.
 Other varieties of binary-choice items include sentences to which the
testtaker responds with one of two responses, such as agree or disagree,
yes or no, right or wrong, or fact or opinion.
Cont..

 A good binary-choice item contains a single idea, is not excessively long, and is not subject to debate; the correct response must indisputably be one of the two choices.
 Like multiple-choice items, binary-choice items are readily applicable to a
wide range of subjects. Unlike multiple-choice items, binary-choice items
cannot contain distractor alternatives.
 For this reason, binary-choice items are typically easier to write than
multiple-choice items and can be written relatively quickly.
 A disadvantage of the binary-choice item is that the probability of
obtaining a correct response purely on the basis of chance (guessing) on
any one item is .5, or 50%. In contrast, the probability of obtaining a
correct response by guessing on a four-alternative multiple-choice question
is .25, or 25%.
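These chance probabilities translate directly into expected chance scores. A minimal Python sketch (the item counts here are illustrative, not from the text):

```python
# Expected number of correct answers from pure guessing:
# P(correct by chance) on a k-alternative item is 1 / k.

def chance_score(n_items: int, n_alternatives: int) -> float:
    """Expected number of items answered correctly by random guessing."""
    return n_items * (1.0 / n_alternatives)

# A 30-item true-false (binary-choice) test: .5 chance per item.
print(chance_score(30, 2))   # 15.0 correct expected by chance
# A 30-item four-alternative multiple-choice test: .25 per item.
print(chance_score(30, 4))   # 7.5 correct expected by chance
```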
Constructed-response Items

 Items presented in a constructed-response format require testtakers to supply or to create the correct answer, not merely to select it.
 Three types of constructed-response items are
 The completion item
 The short answer
 The essay
Completion Item

 A completion item requires the examinee to provide a word or phrase that completes a sentence.
 For example: The standard deviation is generally considered the most useful
measure of __________. (The correct completion here is variability.)
 A good completion item should be worded so that the correct answer is
specific.
 Completion items that can be correctly answered in many ways lead to
scoring problems.
Short answer

 A completion item may also be referred to as a short-answer item.


 It is desirable for completion or short-answer items to be written clearly
enough that the testtaker can respond succinctly—that is, with a short
answer.
 For example: What descriptive statistic is generally considered the most
useful measure of variability?
 There are no hard-and-fast rules for how short an answer must be to be
considered a short answer; a word, a term, a sentence, or a paragraph
may qualify.
Essay Item

 Beyond a paragraph or two, the item is more properly referred to as an essay item.
 We may define an essay item as a test item that requires the testtaker
to respond to a question by writing a composition, typically one that
demonstrates recall of facts, understanding, analysis, and/or
interpretation.
 For example: Compare and contrast definitions and techniques of classical and operant conditioning. Include examples of how principles of each have been applied in clinical as well as educational settings.
 An essay item is useful when the test developer wants the examinee to
demonstrate a depth of knowledge about a single topic.
Cont..

 The skills tapped by essay items are different from those tapped by
true–false and matching items. Whereas these latter types of items
require only recognition, an essay requires recall, organization, planning,
and writing ability.
 A drawback of the essay item is that it tends to focus on a more limited
area than can be covered in the same amount of time when using a
series of selected-response items or completion items.
 Another potential problem with essays can be subjectivity in scoring and
inter-scorer differences.
Cont..

 Floor effect: test items are so difficult that all testtakers answer them incorrectly.
 Ceiling effect: test items are so easy that all testtakers answer them correctly.

 Because of floor and ceiling effects, a test would not discriminate between high achievers and low achievers.
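During item tryout, floor and ceiling items can be flagged from each item's proportion-correct (its p value). A sketch using made-up response data (rows are testtakers, columns are items; 1 = correct, 0 = incorrect):

```python
# Flag items that nearly everyone fails (floor) or passes (ceiling).
responses = [
    [1, 0, 1, 1],
    [1, 0, 1, 0],
    [1, 0, 1, 1],
    [1, 0, 1, 1],
]

n_takers = len(responses)
for item in range(len(responses[0])):
    # Proportion of testtakers answering this item correctly.
    p = sum(row[item] for row in responses) / n_takers
    if p == 1.0:
        label = "ceiling effect: too easy, no discrimination"
    elif p == 0.0:
        label = "floor effect: too hard, no discrimination"
    else:
        label = "discriminates between testtakers"
    print(f"item {item}: p = {p:.2f} ({label})")
```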
Scoring Items

Three models of scoring are:
 Cumulative scoring model
 Class scoring or category scoring model
 Ipsative scoring model
Cumulative Model

 Perhaps the model used most commonly, owing in part to its simplicity and logic, is the cumulative model.
 Typically, the rule in a cumulatively scored test is that the higher the
score on the test, the higher the testtaker is on the ability, trait, or other
characteristic that the test purports to measure.
 For each testtaker response to targeted items made in a particular way,
the testtaker earns cumulative credit with regard to a particular
construct.
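A minimal sketch of cumulative scoring, assuming a hypothetical answer key: each response matching the key earns one credit, and the total is the testtaker's score on the construct.

```python
# Cumulative scoring: one point per response that matches the key;
# the higher the total, the higher the testtaker stands on the
# measured attribute. Key and responses below are hypothetical.

def cumulative_score(responses: list[str], key: list[str]) -> int:
    """Count responses that match the scoring key."""
    return sum(1 for r, k in zip(responses, key) if r == k)

key       = ["a", "c", "b", "d", "a"]
testtaker = ["a", "c", "d", "d", "b"]
print(cumulative_score(testtaker, key))  # 3
```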
Class Scoring

 In tests that employ class scoring (also referred to as category scoring), testtaker responses earn credit toward placement in a particular class or category with other testtakers whose pattern of responses is presumably similar in some way.
 This approach is used by some diagnostic systems wherein individuals
must exhibit a certain number of symptoms to qualify for a specific
diagnosis.
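The symptom-count rule just described can be sketched as follows; the symptom labels and cutoff here are hypothetical, not drawn from any actual diagnostic system:

```python
# Class/category scoring: a testtaker's response pattern places them
# in a category when it meets a rule -- here, endorsing at least
# `cutoff` symptoms from a criterion set.

def classify(endorsed: set[str], criteria: set[str], cutoff: int) -> str:
    """Assign a category based on how many criterion symptoms are endorsed."""
    n_matched = len(endorsed & criteria)
    return "meets criteria" if n_matched >= cutoff else "does not meet criteria"

criteria = {"s1", "s2", "s3", "s4", "s5"}
print(classify({"s1", "s2", "s4"}, criteria, cutoff=3))  # meets criteria
print(classify({"s1"}, criteria, cutoff=3))              # does not meet criteria
```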
Ipsative Scoring

 A third scoring model, ipsative scoring, departs radically in rationale from both the cumulative and class models.
 A typical objective in ipsative scoring is comparing a testtaker’s score on
one scale within a test to another scale within that same test.
 Consider, for example, a personality test called the Edwards Personal
Preference Schedule (EPPS), which is designed to measure the relative
strength of different psychological needs.
 The EPPS ipsative scoring system yields information on the strength of
various needs in relation to the strength of other needs of the testtaker.
The test does not yield information on the strength of a testtaker’s need
relative to the presumed strength of that need in the general population.
Cont..

 Edwards constructed his test of 210 pairs of statements in such a way that respondents were “forced” to choose only one statement from each pair.
 Prior research by Edwards had indicated that the two statements in each pair were equivalent in terms of how socially desirable the responses were.
 Here is a sample of an EPPS-like forced-choice item, to which the respondent would indicate which statement is “more true” of themselves:
 I feel depressed when I fail at something.
 I feel nervous when giving a talk before a group.
Cont..

 On the basis of such an ipsatively scored personality test, it would be possible to draw only intra-individual conclusions about the testtaker.
 Here’s an example:
 “John’s need for achievement is higher than his need for affiliation.”
 It would not be appropriate to draw inter-individual comparisons on the
basis of an ipsatively scored test. It would be inappropriate, for example,
to compare two testtakers with a statement like “John’s need for
achievement is higher than Jane’s need for achievement.”
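The intra-individual logic of ipsative scores can be sketched with hypothetical scale values; only comparisons within one testtaker's own profile are meaningful:

```python
# Ipsative interpretation: rank one testtaker's need scales against
# each other. Scale names and scores below are hypothetical.
john = {"achievement": 18, "affiliation": 11, "autonomy": 14}

# Order John's needs from strongest to weakest (within-person only).
ranked = sorted(john, key=john.get, reverse=True)
print(ranked)  # ['achievement', 'autonomy', 'affiliation']

# Valid (intra-individual): John's need for achievement is higher
# than his need for affiliation.
assert john["achievement"] > john["affiliation"]
# Invalid: comparing John's achievement score with Jane's -- ipsative
# scores carry no common inter-individual metric.
```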
Guidelines for writing items

 Do not use trick items
 Measure a single construct
 Avoid opinion-based items
 Avoid negative statements
 Avoid double-barreled questions
 Once the test developer has decided on a scoring model and has done
everything else necessary to prepare the first draft of the test for
administration, the next step is test tryout.
THANK YOU!
