PA Ratio2

Outcome 2
process, research methods and statistics
used in test development and standardization
PCAS-06-701A PCAS-06-701E PCAS-06-701P

CAS-06-703E
Your Team
QUESTION 1
You'd like to see the strength of the relationship between sex (Male, Female) and level of introversion. Sex is ______ while introversion is ______, thus you will use _______ for correlation.
A. Nominal, continuous, point-biserial
B. Interval, interval, Pearson-r
C. Interval, nominal, point-biserial
D. Nominal, nominal, phi-coefficient
MALE FEMALE
Scale of Measurement
CATEGORICAL
Ex. Religion, sex, nationality, personality type, gender, marital status, college major, and blood type.
Ex. SES (“low income”, “middle income”, “high income”), education level (“high school”, “BS”, “MS”, “PhD”), income level (“less than 50K”, “50K-100K”, “over 100K”)
CONTINUOUS
Ex. Measures of length, periods of time, score in an exam.
Ex. Kelvin scale; income, height, weight, annual sales, market share, product defect rates, time to repurchase, unemployment rate, and crime rate
DIFFERENT TYPES OF CORRELATION COEFFICIENTS
Biserial Correlation
• Correlates one continuous and one artificial dichotomous data
• Ex. Score in a test (continuous/interval) and being highly aggressive (artificial dichotomy)
Point Biserial Correlation
• Correlates one continuous and one true dichotomous data.
• Ex. Score in the test (continuous/interval) and correctness in an item within the test (true dichotomous)
True dichotomy – dichotomy in which there are only two possible categories.
Ex. Sex (male – female)
Artificial Dichotomy – dichotomy in which there are other possibilities in a certain category
Ex. Non quota course vs. quota courses
Phi Coefficient
Correlates two dichotomous data; at least one true dichotomy
Ex. Gender and passing or failing a test
Tetrachoric Correlation
Correlates two dichotomous data; both are artificial dichotomy
Ex. Passing or failing a test and being highly aggressive or not.
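The point-biserial case (one continuous variable, one true dichotomy) can be sketched in a few lines. This is a minimal illustration, not a definitive implementation: the introversion scores and the 0/1 coding of sex are made-up assumptions, and the function uses the common textbook form r_pb = (M1 - M0) / SD * sqrt(p * q) with the population standard deviation.

```python
from math import sqrt

def point_biserial(scores, groups):
    """Point-biserial r between a continuous score and a 0/1 coded dichotomy."""
    n = len(scores)
    ones = [s for s, g in zip(scores, groups) if g == 1]
    zeros = [s for s, g in zip(scores, groups) if g == 0]
    p = len(ones) / n          # proportion coded 1
    q = 1 - p                  # proportion coded 0
    mean_all = sum(scores) / n
    # Population SD (divide by n) of all scores taken together.
    sd = sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    mean_diff = sum(ones) / len(ones) - sum(zeros) / len(zeros)
    return mean_diff / sd * sqrt(p * q)

# Hypothetical introversion scores; sex coded arbitrarily as 1 and 0.
introversion = [12, 15, 14, 16, 8, 10, 9, 11]
sex_code = [1, 1, 1, 1, 0, 0, 0, 0]
r_pb = point_biserial(introversion, sex_code)
print(round(r_pb, 2))
```

The result is numerically the same as a Pearson r computed on the 0/1 coded variable, which is why the coding direction only changes the sign, not the strength.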
QUESTION 1
You'd like to see the strength of the relationship between sex (Male, Female) and introversion. Sex is ______ while introversion is ______, thus you will use _______ for correlation.
A) Nominal, continuous, point-biserial
B) Interval, interval, Pearson-r
C) Interval, nominal, point-biserial
D) Nominal, nominal, phi-coefficient
QUESTION 2
Which of the following describes a normal curve?
A. Distribution of income in the Philippines
B. A greater percentage of cases distributed about the mean
C. The values trail off more sharply on one side than the other
D. The distribution where the mean is less than the median
Normal
• The normal distribution is a symmetrical, bell-shaped distribution in which the mean, median and mode are all equal.
• Majority of the test takers are bulked at the middle of the distribution; very few test takers are at the extremes.
Mean = median = mode
Q1 and Q3 have equal distances to the Q2/median
Positive (also known as right skewed)
• Positively skewed distribution – more test takers got a low score
Mean > median > mode
(Q3 - Q2) > (Q2 - Q1)
Negative (also known as left skewed)
• Negatively skewed distribution – more test takers got a high score
Mode > median > mean
(Q2 - Q1) > (Q3 - Q2)
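The mean/median comparison above can be checked on small score sets. The sketch below is only a rule-of-thumb classifier: the function name and the three score lists are hypothetical, and comparing mean with median is a quick heuristic rather than a formal skewness statistic.

```python
from statistics import mean, median, mode

def skew_direction(scores):
    """Classify a distribution by comparing its mean with its median."""
    m, md = mean(scores), median(scores)
    if m > md:
        return "positive (right skewed)"
    if m < md:
        return "negative (left skewed)"
    return "symmetric"

# Hypothetical score sets for a hard test, an easy test, and a balanced one.
hard_test = [1, 2, 2, 2, 3, 3, 4, 5, 9]   # most takers scored low
easy_test = [1, 5, 6, 7, 7, 8, 8, 8, 9]   # most takers scored high
balanced = [1, 2, 3, 3, 3, 4, 5]

for label, scores in [("hard", hard_test), ("easy", easy_test), ("balanced", balanced)]:
    print(label, mean(scores), median(scores), mode(scores), skew_direction(scores))
```

In the hard test the mean is pulled above the median by the single high score, and in the easy test it is pulled below by the single low score, matching the orderings listed for positive and negative skew.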
QUESTION 3
Ali's score in the last examination is equivalent to a T-score of 45. Ali's performance is _____.
A. Slightly passing the average
B. Average
C. Below average
D. Poor
A T-score of 45
STANDARD SCORES
• A raw score that has been converted from one scale to another scale
• Provide a context of comparing scores on different tests by converting scores from the two tests into z-score
• “z scores are golden”
STANDARD SCORES
Z-SCORES
- Mean of 0; SD of 1
- Zero plus or minus one scale
- When determined, can be used to translate one scale to another.
Example: Score = 65; Mean = 50; SD = 15, so z = (65 - 50) / 15 = 1.0
T-Score
Mean = 50; SD = 10
Created by McCall in honor of his professor Thorndike
Stanine
Mean = 5; SD = 2
Used by US Air Force
Takes whole numbers 1 – 9; no decimals
STANDARD SCORES
Sten
Standard ten
Mean = 5.5; SD = 2
Deviation IQ
Mean = 100; SD = 15
Used for interpreting IQ
GRE or SAT (Graduate Record Exam / Scholastic Aptitude Test)
Mean = 500; SD = 100
Used for admission for graduate school and college
STANDARD SCORES
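Because every standard score is a linear rescaling of z, the conversions can be sketched with two small helpers. The helper names are illustrative, the raw score of 65 (mean 50, SD 15) is the slide example, and rounding and clamping the stanine to 1 through 9 is an assumption about how the whole-number rule is applied.

```python
def z_score(raw, mean, sd):
    """Convert a raw score to a z-score."""
    return (raw - mean) / sd

def from_z(z, mean, sd):
    """Express a z-score on another standard-score scale."""
    return mean + sd * z

z = z_score(65, 50, 15)            # slide example: raw 65, mean 50, SD 15
t_score = from_z(z, 50, 10)        # T-score scale
deviation_iq = from_z(z, 100, 15)  # deviation IQ scale
sat_like = from_z(z, 500, 100)     # GRE/SAT-type scale
# Stanines take whole numbers 1 to 9 only (rounding rule assumed here).
stanine = max(1, min(9, round(from_z(z, 5, 2))))

# Going the other way: a T-score of 45 sits half an SD below the mean.
z_of_t45 = z_score(45, 50, 10)

print(z, t_score, deviation_iq, sat_like, stanine, z_of_t45)
```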
QUESTION 4
Find the range of the distribution 72, 25, 81, 63, 30, 20, 53.
A. 61
B. 51
C. 52
D. 20
MEASURES OF DISPERSION
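The range is the simplest measure of dispersion: the highest score minus the lowest score. A short sketch using the distribution from Question 4 follows; the population standard deviation is included only as a second, commonly used dispersion measure and is not taken from the slides.

```python
from statistics import pstdev

scores = [72, 25, 81, 63, 30, 20, 53]

# Range = highest score minus lowest score.
score_range = max(scores) - min(scores)

# Population standard deviation, shown for comparison with the range.
score_sd = pstdev(scores)

print(score_range, round(score_sd, 2))
```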
QUESTION 5
Which of the following is an illustration of a negative correlation?
A. Babies who are held more tend to cry less.
B. The student's level of anxiety tells us nothing about one's job performance.
C. Students who score low on Psych Assessment Drills tend to score low on the summative and final examination.
D. Students under modular learning scheme tend to maintain the same relative performance in an online learning mode.
TYPES/SHAPES OF RELATIONSHIP
Positive Linear Relationship
Increases in the values of one variable are accompanied by increases in the values of the second variable.
Negative Relationship
Higher values of one variable tend to be associated with lower values of the other.
No relationship
When there is no relationship between the two variables, the graph is simply a flat line.
Curvilinear Relationship
Increases in the values of one variable are accompanied by systematic increases and decreases in the values of the other variable. In other words, the direction of the relationship changes at least once. This type of relationship is sometimes referred to as a non-monotonic function.
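The sign of Pearson's r reflects these shapes, and a curvilinear pattern can produce an r near zero even though the variables are clearly related. The sketch below computes r by hand; all data values are invented for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: more hours held, fewer minutes crying (negative).
hours_held = [1, 2, 3, 4, 5]
minutes_crying = [50, 40, 35, 20, 10]
r_negative = pearson_r(hours_held, minutes_crying)

# Hypothetical data: drill scores and exam scores rise together (positive).
drill_scores = [1, 2, 3, 4, 5]
exam_scores = [2, 4, 5, 8, 10]
r_positive = pearson_r(drill_scores, exam_scores)

# A U-shaped (curvilinear) pattern: y = x squared, symmetric around zero.
x_curve = [-2, -1, 0, 1, 2]
y_curve = [4, 1, 0, 1, 4]
r_curvilinear = pearson_r(x_curve, y_curve)

print(round(r_negative, 2), round(r_positive, 2), round(r_curvilinear, 2))
```

A linear coefficient such as r is therefore a poor summary of a curvilinear relationship; a scatterplot is the safer first check.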
QUESTION 6
You are calculating the reliability of your newly developed Interest Questionnaire, answerable by YES/NO. What statistical treatment should you use?
A. Pearson-r
B. Spearman-Brown
C. Cronbach's Alpha
D. KR 20
TEST STATISTICS FOR CORRELATION
Pearson Product Moment Correlation
Correlates 2 variables in interval/ratio scale format
Devised by Karl Pearson
Spearman Rho
Also called rank-order correlation or Spearman Correlation
Correlates 2 variables in ordinal scale
CRONBACH
➔COEFFICIENT ALPHA
➔Non-dichotomous items
➔Preferred statistic for obtaining an estimate of internal consistency reliability
➔Widely used as a measure of reliability
➔Answer how similar sets of data are.
➔A value of alpha above .90 may be too high or redundant
Provide an indication of the likelihood that a test taker will score within some interval of scores on a criterion measure – an interval that can be categorized as “passing”, “acceptable”, or “failing”.
KR 20
❖ Items are highly homogeneous
❖ Determining the inter-item consistency of dichotomous items
❖ Right (1) or wrong (0)
❖ When used in heterogeneous items = lower reliability estimates than split-half methods
❖ Various difficulty
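A minimal sketch of KR-20 for right (1) / wrong (0) items follows. It assumes the usual textbook form, KR-20 = k / (k - 1) * (1 - sum(p * q) / variance of total scores), with the population variance of total scores; the response matrix is invented.

```python
def kr20(responses):
    """KR-20 for a list of examinee rows, each a list of 0/1 item scores."""
    n = len(responses)          # number of examinees
    k = len(responses[0])       # number of items
    totals = [sum(row) for row in responses]
    mean_total = sum(totals) / n
    # Population variance of total scores (divide by n).
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    sum_pq = 0.0
    for j in range(k):
        p = sum(row[j] for row in responses) / n   # proportion correct on item j
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)

# Hypothetical answers of five examinees on four yes/no items.
answers = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
reliability = kr20(answers)
print(round(reliability, 2))
```

Some texts use the sample variance (dividing by n - 1) instead, which gives a slightly different value on small samples.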
QUESTION 7
Which of the following questions can be put to a regression analysis?
A. Does seniority at work determine job commitment?
B. Do people's demographics correlate with their level of prejudice?
C. Is there a relationship between academic success and procrastination?
D. Which of the following social-cognitive variables (attitude, beliefs, social norms, and intention) have an impact on diet adherence?
QUESTION 8
Sarah would like to predict college achievement from her High School Grade Point Average, Scholastic Achievement Test (SAT), SAT reading score, SAT Math score, and SAT writing score. What statistical treatment is most applicable?
A. Factor analysis
B. Meta-analysis
C. Multiple Regression
D. Multi-variate analysis
REGRESSION ANALYSIS
MULTIPLE REGRESSION ANALYSIS
Independent Variables: attitude, beliefs, social norms
Dependent Variable: diet adherence
Factor Analysis
Factor analysis works by detecting sets of variables which correlate highly with each other. These variables may then be condensed into a single variable.
Meta-analysis
Meta-analysis is a quantitative, formal, epidemiological study design used to systematically assess the results of previous research to derive conclusions about that body of research.
Multiple Regression
Multiple linear regression is a dependence method which looks at the relationship between one dependent variable and two or more independent variables.
Multivariate Analysis
Multivariate analysis is used to describe analyses of data where there are multiple variables or observations for each unit or individual.
Statistical Technique: Factor Analysis
Multiple
Regression
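To make "one dependent variable and two or more independent variables" concrete, here is a small ordinary-least-squares sketch that solves the normal equations in plain Python. The function name and the toy data are assumptions for illustration; real analyses would normally rely on a statistics package.

```python
def fit_multiple_regression(predictors, outcome):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    # Prepend a constant 1.0 so the first coefficient is the intercept.
    rows = [[1.0] + [float(v) for v in row] for row in predictors]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(rows, outcome))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for i in range(k):
            if i != col:
                factor = aug[i][col] / aug[col][col]
                aug[i] = [a - factor * b for a, b in zip(aug[i], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]

# Toy data built so that outcome = 1 + 2*x1 + 3*x2 exactly.
predictors = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
outcome = [9, 8, 19, 18, 26]
intercept, b1, b2 = fit_multiple_regression(predictors, outcome)
print(round(intercept, 3), round(b1, 3), round(b2, 3))
```

Each slope estimates the change in the dependent variable for a one-unit change in that predictor while the other predictor is held constant.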
QUESTION 9
Which of the following is an Alternative hypothesis?
A. People are more likely to form committed relationship with someone equally attractive
B. Psychotherapy cannot treat the symptoms and problems of psychotic patients.
C. There is no significant difference between males and females concerning parenting style.
D. The COVID-19 restrictions do not affect people's mental health
Hypothesis
A research hypothesis is a statement of expectation or prediction that will be tested by research.
The null hypothesis is generally denoted as H0. It states the exact opposite of what an investigator or an experimenter predicts or expects. It basically states that there is no actual relationship between the variables.
The alternative hypothesis is generally denoted as H1. It makes a statement that suggests a potential result or outcome that an investigator or researcher may expect. It is divided into two categories: directional alternative hypothesis and non-directional alternative hypothesis.
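One way to see the difference between a directional and a non-directional alternative hypothesis is an exact permutation test on two tiny groups. This is a sketch under stated assumptions: the scores are invented, the function name is hypothetical, and a permutation test is just one convenient technique for illustrating H0 versus H1, not a method named in the slides.

```python
from itertools import combinations

def exact_permutation_test(group_a, group_b):
    """Exact permutation test on the difference of means (A minus B)."""
    pooled = group_a + group_b
    n_a = len(group_a)

    def mean_diff(indices_a):
        a = [pooled[i] for i in indices_a]
        b = [pooled[i] for i in range(len(pooled)) if i not in indices_a]
        return sum(a) / len(a) - sum(b) / len(b)

    observed = mean_diff(tuple(range(n_a)))
    # Under H0 every split of the pooled scores into two groups is equally likely.
    diffs = [mean_diff(c) for c in combinations(range(len(pooled)), n_a)]
    # Directional H1: group A scores higher than group B.
    p_directional = sum(d >= observed for d in diffs) / len(diffs)
    # Non-directional H1: the groups differ in either direction.
    p_nondirectional = sum(abs(d) >= abs(observed) for d in diffs) / len(diffs)
    return observed, p_directional, p_nondirectional

# Hypothetical scores for two small groups.
group_a = [5, 6, 7, 8]
group_b = [1, 2, 3, 4]
observed, p_one_sided, p_two_sided = exact_permutation_test(group_a, group_b)
print(observed, p_one_sided, p_two_sided)
```

The non-directional p-value counts extreme results in both tails, so it is larger than the directional one for the same data.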
QUESTION 10
Group 1 is conducting a study on the effects of caffeine on athletes' running performance. In this study, the control group should be given _____.
A. A high amount of caffeinated beverages.
B. One-half the amount given to the experimental group.
C. Running performance before and after taking caffeinated beverages
D. No caffeine at all.
CONTROL GROUP IN AN EXPERIMENT
A control group is a statistically significant portion of participants in an experiment that is shielded from exposure to the variables being tested. In a pharmaceutical drug study, for example, the control group receives a placebo, which has no effect on the body.
The treatment group (also called the experimental group) receives the treatment whose effect the researcher is interested in.
QUESTION 11
Results of the College Admission Test become the basis of qualifying incoming freshmen in most universities. The norm used is ____.
A. Criterion-referenced
B. Norm-referenced
C. Developmental norm
D. Cultural norm
NORMS
• Performance by defined groups on a particular test.
• Transformation of raw scores in making meaningful interpretations of scores on a test
*Norming – process of creating norms
*Normative samples – group of people whose performance on a particular test is analyzed for reference
*Race norming – norming based on race/culture
*User norms – norms provided by the test manuals.
NORMS
Criterion-Referenced Testing – interpretation of a test is based on a certain standard.
Method of evaluation and a way of deriving meaning from test scores by evaluating an individual’s score with reference to a set standard.
Also called Content-referenced or Domain-referenced
Criterion – a standard on which a judgement or decision is based.
Norm-Referenced Testing – score is interpreted based on the performance of a standardized group.
TYPES OF NORM-REFERENCED TESTING
DEVELOPMENTAL NORMS
NATIONAL NORMS
QUESTION 12
A Depression Scale which only contains items on persistent feelings of sadness, but no items related to loss of interest, hopelessness, and thoughts of death, has a problem with _____.
A. Construct underrepresentation
B. Construct-irrelevant variance
C. Criterion contamination
D. Poor divergence
Construct underrepresentation
Failure to capture important components of a construct (e.g. an English test which only contains vocabulary items but no grammar items will have poor content validity).
Construct-irrelevant variance
Happens when scores are influenced by factors irrelevant to the construct (e.g. test anxiety, reading speed, reading comprehension, illness)
Criterion Contamination – a situation in which a response measure (the criterion) is influenced by factors that are not related to the concept being measured. Contamination can occur for several reasons, such as low reliability, rater bias, cheating, or other construct-irrelevant influences.
QUESTION 13
Pre-board examinations should be concerned primarily with _____.
A. Predictive validity
B. Content validity
C. Construct validity
D. Convergent validity
QUESTION 14
Which of the following is NOT evidence of construct validity?
A. The test is homogeneous, measuring a single construct.
B. Test score increases/decreases as a function of age, the passage of time, or experimental manipulation.
C. Test scores correlate with scores on another test.
D. Test score and the criterion measure are obtained at present
EVIDENCE OF CONSTRUCT VALIDITY
Cohen and Swerdlik, 2018

QUESTION 15
When the test is altered in some way, such as its format or the language used, it should be followed by a ____.
A. Local validation
B. Item analysis
C. Distractor analysis
D. Cultural Check
EVIDENCE OF CONSTRUCT VALIDITY
Local validation studies
• are absolutely necessary when the test user plans to alter in some way the format, instructions, language, or content of the test. For example, a local validation study would be necessary if the test user sought to transform a nationally standardized test into Braille for administration to blind and visually impaired test-takers.
• Local validation studies would also be necessary if a test user sought to use a test with a population of test-takers that differed in some significant way from the population on which the test was standardized.
Cohen and Swerdlik, 2018

QUESTION 16
Two psychologists evaluated the "Autism" symptoms of their patients. If both of their judgments yield identical ratings, then it has _______.
A. Good parallel reliability
B. Split-half reliability
C. Inter-rater reliability
D. Test-retest reliability
QUESTION 17
Two of your expert validators agree on the interpretation or scoring of the test you are developing. This is evidence of _______.
A. Inter-rater reliability
B. Test-retest reliability
C. Internal Consistency
D. Standardization
RELIABILITY
RELIABILITY ESTIMATES
RELIABILITY
• Dependability or consistency in measurement.
• Inter-scorer reliability – the degree of agreement or consistency between two or more scorers (or judges or raters) with regard to a particular measure.
• Coefficient of inter-scorer reliability – the scores from different raters are correlated with one another.
RELIABILITY
• Test-retest reliability – an estimate of reliability obtained by correlating pairs of scores from the same people on two different administrations of the same test.
• Parallel forms – for each form of the test, the means and the variances of observed test scores are equal.
• Alternate forms – different versions of a test that have been constructed so as to be parallel.
RELIABILITY
• Split-half reliability – obtained by correlating two pairs of scores obtained from equivalent halves of a single test administered once.
• Spearman-Brown formula – allows a test developer or user to estimate internal consistency reliability from a correlation of two halves of a test.
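The Spearman-Brown adjustment can be written as a one-line function. In this sketch the half-test correlation of .60 is an invented value, and `length_factor` is an illustrative parameter name for how many times longer the full test is than the part that was correlated (2 for split halves).

```python
def spearman_brown(r_half, length_factor=2):
    """Estimate reliability of a test lengthened by length_factor."""
    return (length_factor * r_half) / (1 + (length_factor - 1) * r_half)

# Hypothetical correlation between odd-item and even-item half scores.
r_between_halves = 0.60
full_test_reliability = spearman_brown(r_between_halves)
print(round(full_test_reliability, 2))
```

The corrected value is higher than the half-test correlation because, other things equal, a longer test is more reliable than a shorter one.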
RELIABILITY
• Inter-item consistency – the degree of relatedness of items on a test. Able to gauge the homogeneity of a test.
• Kuder-Richardson formula 20 – statistic of choice for determining the inter-item consistency of dichotomous items.
QUESTION 18
An item with a low discrimination value means that the item is ____.
A. Good item
B. Poor item
C. Not discriminating
D. Strongly discriminating
ITEM-DISCRIMINATION INDEX
• Indicates how adequately an item separates or discriminates between high scorers and low scorers on an entire test
• A measure of the difference between the proportion of high scorers answering an item correctly and the proportion of low scorers answering the item correctly.
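The definition above translates directly into a difference of two proportions. The sketch assumes equal-sized upper and lower scoring groups; the function name and the counts are made up for illustration.

```python
def discrimination_index(upper_correct, lower_correct, group_size):
    """Proportion correct in the upper group minus proportion correct in the lower group."""
    p_upper = upper_correct / group_size
    p_lower = lower_correct / group_size
    return p_upper - p_lower

# Hypothetical item: 8 of 10 high scorers and 3 of 10 low scorers answered correctly.
d_discriminating = discrimination_index(8, 3, 10)

# Hypothetical item answered equally well by both groups.
d_flat = discrimination_index(5, 5, 10)

print(round(d_discriminating, 2), d_flat)
```

A value near zero means the item does not separate high scorers from low scorers, and a negative value means low scorers outperformed high scorers on that item.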
QUESTION 19
Which of the following explains a difficulty index of .84?
A. 84% of the test-takers did not answer the item correctly.
B. The item is difficult
C. The item is difficult for the lower group and easy for the upper group.
D. 84% of the test-takers answered the item correctly.
QUESTION 20
What is the primary source of error variance for a test-retest approach to reliability?
A. Time differences
B. Content differences
C. Rater differences
D. Test forms
Thank You
