
Outcome 2
process, research methods and statistics
used in test development and standardization

PCAS-06-701A PCAS-06-701E PCAS-06-701P


CAS-06-703E

Your Team
QUESTION 1

You'd like to see the strength of the relationship between sex (Male, Female) and level of introversion. Sex is ______ while introversion is ______, thus you will use _______ for correlation.

A. Nominal, continuous, point-biserial
B. Interval, interval, Pearson-r
C. Interval, nominal, point-biserial
D. Nominal, nominal, phi-coefficient
Scale of Measurement

CATEGORICAL
Ex. Religion, sex, nationality, personality type, gender, marital status, college major, and blood type.
Ex. SES (“low income”, “middle income”, “high income”), education level (“high school”, “BS”, “MS”, “PhD”), income level (“less than 50K”, “50K-100K”, “over 100K”)

CONTINUOUS
Ex. Measures of length, periods of time, score in an exam.
Ex. Kelvin scale, income, height, weight, annual sales, market share, product defect rates, time to repurchase, unemployment rate, and crime rate
DIFFERENT TYPES OF CORRELATION COEFFICIENTS

Biserial Correlation
• Correlates one continuous and one artificial dichotomous data
• Ex. Score in a test (continuous/interval) and being highly aggressive (artificial dichotomy)

Point Biserial Correlation
• Correlates one continuous and one true dichotomous data.
• Ex. Score in the test (continuous/interval) and correctness in an item within the test (true dichotomous)

True dichotomy – dichotomy in which there are only two possible categories.
Ex. Sex (male – female)

Artificial dichotomy – dichotomy in which there are other possibilities in a certain category.
Ex. Non-quota courses vs. quota courses

Phi Coefficient
Correlates two dichotomous data; at least one true dichotomy
Ex. Gender and passing or failing a test

Tetrachoric Correlation
Correlates two dichotomous data; both are artificial dichotomy
Ex. Passing or failing a test and being highly aggressive or not.
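
The coefficients above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the data are made up, the 0/1 coding of sex is an arbitrary assumption, and the function names are hypothetical. The point-biserial formula used here is the standard one based on the two group means and the population SD of all scores; phi is computed from the four cell counts of a 2x2 table.

```python
from math import sqrt
from statistics import mean, pstdev


def point_biserial(groups, scores):
    """Point-biserial r between a 0/1 true dichotomy and a continuous score."""
    ones = [s for g, s in zip(groups, scores) if g == 1]
    zeros = [s for g, s in zip(groups, scores) if g == 0]
    p = len(ones) / len(scores)   # proportion of cases coded 1
    q = 1 - p                     # proportion of cases coded 0
    # Difference between group means in SD units, weighted by the group split.
    return (mean(ones) - mean(zeros)) / pstdev(scores) * sqrt(p * q)


def phi(a, b, c, d):
    """Phi coefficient for a 2x2 table of counts [[a, b], [c, d]]."""
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical data: sex coded 0 = male, 1 = female; introversion scores.
sex = [0, 0, 0, 1, 1, 1]
introversion = [10, 12, 14, 16, 18, 20]
r_pb = point_biserial(sex, introversion)

# Hypothetical counts: rows = sex, columns = passed / failed a test.
r_phi = phi(30, 20, 20, 30)
```

Both coefficients are special cases of Pearson r applied to 0/1 coded data, so the sign depends only on which category is coded 1.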
QUESTION 2

Which of the following describes a normal curve?

A. Distribution of income in the Philippines
B. A greater percentage of cases distributed about the mean
C. The values trail off more sharply on one side than the other
D. The distribution where the mean is less than the median
Normal

• The normal distribution is a symmetrical, bell-shaped distribution in which the mean, median and mode are all equal.
• The majority of the test takers are bulked at the middle of the distribution; very few test takers are at the extremes.

Mean = median = mode
Q1 and Q3 have equal distances to the Q2/median
Positive (also known as right skewed)

• Positively skewed distribution – more test takers got a low score

Mean > median > mode
(Q3 - Q2) > (Q2 - Q1)

Negative (also known as left skewed)

• Negatively skewed distribution – more test takers got a high score

Mode > median > mean
(Q2 - Q1) > (Q3 - Q2)
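
The mean/median and quartile rules above can be checked on a small set of scores. The sketch below is only a rule-of-thumb classifier under stated assumptions: the function name and the two score lists are hypothetical, and quartiles come from Python's statistics.quantiles with its default method, which may differ slightly from a textbook hand computation.

```python
from statistics import mean, median, quantiles


def describe_shape(scores):
    """Return a skew label plus the two quartile gaps for a list of scores."""
    q1, q2, q3 = quantiles(scores, n=4)   # Q2 is the median
    m, md = mean(scores), median(scores)
    if m > md:
        label = "positively skewed"    # mean pulled toward the high tail
    elif m < md:
        label = "negatively skewed"    # mean pulled toward the low tail
    else:
        label = "symmetrical"
    return label, q2 - q1, q3 - q2


# Hypothetical scores: most test takers low, one very high score.
low_heavy = [1, 2, 2, 3, 10]
# Hypothetical scores: most test takers high, one very low score.
high_heavy = [1, 8, 9, 9, 10]

label_low, lower_gap_low, upper_gap_low = describe_shape(low_heavy)
label_high, lower_gap_high, upper_gap_high = describe_shape(high_heavy)
```

For the first list the gap above the median is wider than the gap below it, and for the second list the reverse holds, which matches the quartile inequalities stated for each skew.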
QUESTION 3

Ali's score in the last examination is equivalent to a T-score of 45. Ali's performance is _____.

A. Slightly passing the average
B. Average
C. Below average
D. Poor
STANDARD SCORES

• A raw score that has been converted from one scale to another scale
• Provide a context for comparing scores on different tests by converting scores from the two tests into z-scores
• “z scores are golden”

Z-Scores
- Mean of 0; SD of 1
- Zero plus or minus one scale
- When determined, can be used to translate one scale to another.
Example: Score = 65; Mean = 50; SD = 15

T-Score
Mean = 50; SD = 10
Created by McCall in honor of his professor Thorndike

Stanine
Mean = 5; SD = 2
Used by US Airforce
Takes whole numbers 1 – 9; no decimals
Sten
Standard ten
Mean = 5.5; SD = 2

GRE or SAT (Graduate Record Exam / Scholastic Aptitude Test)
Mean = 500; SD = 100
Used for admission for graduate school and college

Deviation IQ
Mean = 100; SD = 15
Used for interpreting IQ
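
Because every scale listed here is defined by a mean and an SD, one z-score is enough to move between them. The following is a minimal sketch using the slide's own example (score 65, mean 50, SD 15) and the means and SDs given above. The function names are hypothetical, and the stanine helper is a simplifying assumption that rounds and clips to 1-9; operational stanines are usually assigned from percentile bands.

```python
from math import floor


def z_score(raw, mean, sd):
    """Express a raw score as its distance from the mean in SD units."""
    return (raw - mean) / sd


def from_z(z, mean, sd):
    """Translate a z-score onto a scale with the given mean and SD."""
    return mean + z * sd


def stanine(z):
    """Approximate stanine: mean 5, SD 2, whole numbers 1 to 9 only."""
    return max(1, min(9, floor(5 + 2 * z + 0.5)))


# (mean, SD) pairs taken from the standard score slides.
SCALES = {
    "T-score": (50, 10),
    "Sten": (5.5, 2),
    "GRE/SAT": (500, 100),
    "Deviation IQ": (100, 15),
}

z = z_score(65, 50, 15)   # the slide's example: (65 - 50) / 15
converted = {name: from_z(z, m, sd) for name, (m, sd) in SCALES.items()}

# Going the other way: a T-score of 45 expressed as a z-score.
z_from_t45 = z_score(45, 50, 10)
```

A T-score of 45 corresponds to a z-score of -0.5, that is, half a standard deviation below the mean.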
QUESTION 4

Find the range of the distribution 72, 25, 81, 63, 30, 20, 53.

A. 61
B. 51
C. 52
D. 20
MEASURES OF DISPERSION
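
As a small worked illustration, the scores from Question 4 can be run through the usual dispersion measures. This sketch treats the seven scores as a whole population (an assumption, hence pvariance and pstdev rather than the sample versions).

```python
from statistics import pstdev, pvariance

scores = [72, 25, 81, 63, 30, 20, 53]

# Range: highest score minus lowest score (81 - 20).
score_range = max(scores) - min(scores)

# Variance: average squared distance from the mean (population formula).
variance = pvariance(scores)

# Standard deviation: square root of the variance, in raw-score units.
sd = pstdev(scores)
```

The range uses only the two extreme scores, while the variance and standard deviation use every score in the distribution.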
QUESTION 5

Which of the following is an illustration of a negative correlation?

A. Babies who are held more tend to cry less.
B. The student's level of anxiety tells us nothing about one's job performance.
C. Students who score low on Psych Assessment Drills tend to score low on the summative and final examination.
D. Students under modular learning scheme tend to maintain the same relative performance in an online learning mode.
TYPES/SHAPES OF RELATIONSHIP

Positive Linear Relationship – Increases in the values of one variable are accompanied by increases in the values of the second variable.

Negative Relationship – Higher values of one variable tend to be associated with lower values of the other.

No Relationship – When there is no relationship between the two variables, the graph is simply a flat line.

Curvilinear Relationship – Increases in the values of one variable are accompanied by systematic increases and decreases in the values of the other variable. In other words, the direction of the relationship changes at least once. This type of relationship is sometimes referred to as a non-monotonic function.
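
The shapes described above show up directly in the sign and size of Pearson r. The sketch below uses made-up data and a hand-written helper (pearson_r is a hypothetical name) to show a perfect positive, a perfect negative, and an inverted-U relationship.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))   # co-variation
    sxx = sum((a - mx) ** 2 for a in x)                    # variation in x
    syy = sum((b - my) ** 2 for b in y)                    # variation in y
    return sxy / sqrt(sxx * syy)


x = [1, 2, 3, 4, 5]
positive = pearson_r(x, [2, 4, 6, 8, 10])     # y rises as x rises
negative = pearson_r(x, [10, 8, 6, 4, 2])     # y falls as x rises
curvilinear = pearson_r(x, [1, 4, 5, 4, 1])   # y rises, then falls
```

The inverted-U data give an r of zero even though the two variables are clearly related, which is why a linear coefficient alone can miss a curvilinear relationship.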
QUESTION 6

You are calculating the reliability of your newly developed Interest Questionnaire, answerable by YES/NO. What statistical treatment should you use?

A. Pearson-r
B. Spearman-Brown
C. Cronbach's Alpha
D. KR 20
TEST STATISTICS FOR CORRELATION

Pearson Product Moment Correlation
Correlates 2 variables in interval/ratio scale format
Devised by Karl Pearson

Spearman Rho
Also called rank-order correlation or Spearman correlation
Correlates 2 variables in ordinal scale
CRONBACH

➔ COEFFICIENT ALPHA
➔ Non-dichotomous items
➔ Preferred statistic for obtaining an estimate of internal consistency reliability
➔ Widely used as a measure of reliability
➔ Answers how similar sets of data are.
➔ A value of alpha above .90 may be too high or redundant

Provide an indication of the likelihood that a test taker will score within some interval of scores on a criterion measure – an interval that can be categorized as “passing”, “acceptable”, or “failing”.
KR 20
❖ Items are highly homogeneous
❖ Determines the inter-item consistency of dichotomous items
❖ Right (1) or wrong (0)
❖ When used with heterogeneous items, yields lower reliability estimates than the split-half method
❖ Items of varying difficulty
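
To make the link between the two statistics concrete, here is a minimal sketch of KR-20 and coefficient alpha on a tiny, made-up response matrix (rows are test takers, columns are items scored 1 or 0). The function names and data are hypothetical, and population variances are assumed throughout. With dichotomous items and the same variance convention, the two formulas give the same value.

```python
from statistics import pvariance


def kr20(responses):
    """KR-20 for right (1) / wrong (0) items; rows are test takers."""
    n, k = len(responses), len(responses[0])
    totals = [sum(person) for person in responses]
    sum_pq = 0.0
    for i in range(k):
        p = sum(person[i] for person in responses) / n   # proportion correct
        sum_pq += p * (1 - p)                            # item variance p*q
    return (k / (k - 1)) * (1 - sum_pq / pvariance(totals))


def cronbach_alpha(responses):
    """Coefficient alpha; works for dichotomous or multi-point items."""
    k = len(responses[0])
    totals = [sum(person) for person in responses]
    item_vars = sum(pvariance([person[i] for person in responses])
                    for i in range(k))
    return (k / (k - 1)) * (1 - item_vars / pvariance(totals))


# Hypothetical answers of five test takers to four YES/NO (1/0) items.
answers = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
kr = kr20(answers)
alpha = cronbach_alpha(answers)
```

For items scored on more than two points, only the alpha function applies, since p*q is the variance of a 0/1 item only.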
QUESTION 7

Which of the following questions can be put to a regression analysis?

A. Does seniority at work determine job commitment?
B. Do people's demographics correlate with their level of prejudice?
C. Is there a relationship between academic success and procrastination?
D. Which of the following social-cognitive variables (attitude, beliefs, social norms, and intention) have an impact on diet adherence?
QUESTION 8

Sarah would like to predict college achievement from the variables High School Grade Point Average, Scholastic Achievement Test (SAT), SAT reading score, SAT Math score, and SAT writing score. What statistical treatment is most applicable?

A. Factor analysis
B. Meta-analysis
C. Multiple Regression
D. Multi-variate analysis
REGRESSION ANALYSIS
MULTIPLE REGRESSION ANALYSIS

Independent Variables: attitude, beliefs, social norms
Dependent Variable: diet adherence
Factor Analysis
Factor analysis works by detecting sets of variables which correlate highly with each other. These variables may then be condensed into a single variable.

Meta-analysis
Meta-analysis is a quantitative, formal, epidemiological study design used to systematically assess the results of previous research to derive conclusions about that body of research.

Multiple Regression
Multiple linear regression is a dependence method which looks at the relationship between one dependent variable and two or more independent variables.

Multivariate Analysis
Multivariate analysis is used to describe analyses of data where there are multiple variables or observations for each unit or individual.
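
A multiple regression with one dependent variable and two independent variables can be sketched without any external library. The example below is illustrative only: fit_ols is a hypothetical helper that solves the normal equations by Gauss-Jordan elimination, and the data are constructed so that the outcome equals exactly 1 + 2*x1 + 3*x2, which makes the recovered coefficients easy to check. Real analyses would normally rely on a statistics package.

```python
def fit_ols(X, y):
    """Least-squares coefficients [intercept, b1, b2, ...] for predictors X."""
    rows = [[1.0] + [float(v) for v in r] for r in X]   # add intercept column
    k = len(rows[0])
    # Augmented normal-equation matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        pivot_value = A[col][col]
        A[col] = [v / pivot_value for v in A[col]]
        for r in range(k):
            if r != col:
                factor = A[r][col]
                A[r] = [v - factor * w for v, w in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


# Hypothetical predictors (x1, x2) for five students.
predictors = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
# Outcome built as 1 + 2*x1 + 3*x2, so the fit should recover 1, 2 and 3.
outcome = [9, 8, 19, 18, 26]

intercept, b1, b2 = fit_ols(predictors, outcome)
predicted = intercept + b1 * 3 + b2 * 3   # prediction for x1 = 3, x2 = 3
```

Each slope describes the expected change in the dependent variable for a one-unit change in that predictor while the other predictor is held constant.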
QUESTION 9

Which of the following is an alternative hypothesis?

A. People are more likely to form committed relationships with someone equally attractive.
B. Psychotherapy cannot treat the symptoms and problems of psychotic patients.
C. There is no significant difference between males and females concerning parenting style.
D. The COVID-19 restrictions do not affect people's mental health.
Hypothesis

A research hypothesis is a statement of expectation or prediction that will be tested by research.

The null hypothesis is generally denoted as H0. It states the exact opposite of what an investigator or an experimenter predicts or expects. It basically defines the statement which states that there is no exact or actual relationship between the variables.

The alternative hypothesis is generally denoted as H1. It makes a statement that suggests or advises a potential result or an outcome that an investigator or the researcher may expect. It has been categorized into two categories: directional alternative hypothesis and non-directional alternative hypothesis.
QUESTION 10

Group 1 is conducting a study on the effects of caffeine on athletes' running performance. In this study, the control group should be given _____.

A. A high amount of caffeinated beverages.
B. One-half the amount given to the experimental group.
C. Running performance before and after taking caffeinated beverages
D. No caffeine at all.
CONTROL GROUP IN AN EXPERIMENT

A control group is the portion of participants in an experiment who are shielded from exposure to the variable being tested. In a pharmaceutical drug study, for example, the control group receives a placebo, which has no active effect on the body.

The treatment group (also called the experimental group) receives the treatment whose effect the researcher is interested in.
QUESTION 11

Results of the College Admission Test become the basis of qualifying incoming freshmen in most universities. The norm used is ____.

A. Criterion-referenced
B. Norm-referenced
C. Developmental norm
D. Cultural norm
NORMS

• Performance by defined groups on a particular test.
• Transformation of raw scores in making meaningful interpretations of scores on a test

*Norming – process of creating norms
*Normative samples – group of people whose performance on a particular test is analyzed and referred to
*Race norming – norming based on race/culture
*User norms – norms provided by the test manuals.
NORMS

Criterion-Referenced Testing – interpretation of the test is based on a certain standard.
Method of evaluation and a way of deriving meaning from test scores by evaluating an individual’s score with reference to a set standard.
Also called content-referenced or domain-referenced
Criterion – a standard on which a judgement or decision is based.

Norm-Referenced Testing – score is interpreted based on the performance of a standardized group.
TYPES OF NORMS REFERENCE TESTING
DEVELOPMENTAL NORMS
NATIONAL NORMS
QUESTION 12

A Depression Scale which only contains items on persistent feelings of sadness, but no items related to loss of interest, hopelessness, and thoughts of death, has a problem with _____.

A. Construct underrepresentation
B. Construct-irrelevant variance
C. Criterion contamination
D. Poor divergence
Construct underrepresentation
Failure to capture important components of a construct (e.g. An English
test which only contains vocabulary items but no grammar items will
have a poor content validity.)

Construct-irrelevant variance
Happens when scores are influenced by factors irrelevant to the
construct (e.g. test anxiety, reading speed, reading comprehension,
illness)

Criterion Contamination
A situation in which a response measure (the criterion) is influenced by factors that are not related to the concept being measured. Contamination can occur for several reasons, such as low reliability, rater bias, cheating, or other construct-irrelevant influences.
QUESTION 13

Pre-board examinations should be concerned primarily with _____.

A. Predictive validity
B. Content validity
C. Construct validity
D. Convergent validity
QUESTION 14

Which of the following is NOT evidence of construct validity?

A. The test is homogeneous, measuring a single construct.
B. Test scores increase/decrease as a function of age, the passage of time, or experimental manipulation.
C. Test scores correlate with scores on another test.
D. The test score and the criterion measure are obtained at present.
EVIDENCE OF CONSTRUCT VALIDITY

Cohen and Swerdlik, 2018


QUESTION 15

When a test is altered in some way, such as its format or the language used, it should be followed by a ____.

A. Local validation
B. Item analysis
C. Distractor analysis
D. Cultural Check
EVIDENCE OF CONSTRUCT VALIDITY

Local validation studies

• Are absolutely necessary when the test user plans to alter in some way the format, instructions, language, or content of the test. For example, a local validation study would be necessary if the test user sought to transform a nationally standardized test into Braille for administration to blind and visually impaired test-takers.

• Would also be necessary if a test user sought to use a test with a population of test-takers that differed in some significant way from the population on which the test was standardized.

Cohen and Swerdlik, 2018


QUESTION 16

Two psychologists evaluated the "Autism" symptoms of their patients. If their judgments yield identical ratings, then it has _______.

A. Good parallel reliability
B. Split-half reliability
C. Inter-rater reliability
D. Test-retest reliability
QUESTION 17

Two of your expert validators agree on the interpretation or scoring of the test you are developing. This is evidence of _______.

A. Inter-rater reliability
B. Test-retest reliability
C. Internal Consistency
D. Standardization
RELIABILITY
• Dependability or consistency in measurement.

RELIABILITY ESTIMATES
• Inter-scorer reliability – the degree of agreement or consistency between two or more scorers (or judges or raters) with regard to a particular measure.

• Coefficient of inter-scorer reliability – the scores from different raters are correlated with one another.
• Test-retest reliability – an estimate of reliability obtained by correlating pairs of scores from the same people on two different administrations of the same test.

• Parallel forms – for each form of the test, the means and the variances of observed test scores are equal.

• Alternate forms – different versions of a test that have been constructed so as to be parallel.
• Split-half reliability – obtained by correlating two pairs of scores obtained from equivalent halves of a single test administered once.

• Spearman-Brown formula – allows a test developer or user to estimate internal consistency reliability from a correlation of two halves of a test.
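
The Spearman-Brown adjustment mentioned above can be written as a one-line function. This is a minimal sketch: the half-test correlation of .60 is a made-up value, the function name is hypothetical, and the general form with a lengthening factor n reduces to 2r / (1 + r) when a half-test is doubled to full length.

```python
def spearman_brown(r, n=2):
    """Estimated reliability of a test lengthened n times, given reliability r.

    With n = 2 this is the split-half correction: 2r / (1 + r).
    """
    return n * r / (1 + (n - 1) * r)


# Hypothetical correlation between the two halves of a test.
r_halves = 0.60
r_full = spearman_brown(r_halves)   # about .75 for the full-length test
```

The corrected value is higher than the half-test correlation because, other things being equal, a longer test yields more reliable scores.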
• Inter-item consistency – the degree of relatedness of items on a test. Able to gauge the homogeneity of a test.

• Kuder-Richardson formula 20 – statistic of choice for determining the inter-item consistency of dichotomous items.
QUESTION 18

An item with a low discrimination value means that the item is ____.

A. Good item
B. Poor item
C. Not discriminating
D. Strongly discriminating
ITEM-DISCRIMINATION INDEX

• Indicates how adequately an item separates or discriminates between high scorers and low scorers on an entire test

• A measure of the difference between the proportion of high scorers answering an item correctly and the proportion of low scorers answering the item correctly.
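
The definition above translates directly into arithmetic. The sketch below uses made-up responses (1 = correct, 0 = incorrect) and hypothetical function names. It also shows the item-difficulty index, which is the proportion of all test takers who answered the item correctly.

```python
def item_difficulty(responses):
    """Proportion of test takers who answered the item correctly (p)."""
    return sum(responses) / len(responses)


def item_discrimination(upper, lower):
    """Proportion correct among high scorers minus that among low scorers (d)."""
    return sum(upper) / len(upper) - sum(lower) / len(lower)


# Hypothetical item responses for the upper and lower scoring groups.
upper_group = [1, 1, 1, 1, 0]   # 4 of 5 high scorers answered correctly
lower_group = [1, 0, 0, 0, 0]   # 1 of 5 low scorers answered correctly
d = item_discrimination(upper_group, lower_group)

# Hypothetical item answered correctly by 21 of 25 test takers.
p = item_difficulty([1] * 21 + [0] * 4)
```

A value of d near zero means high and low scorers did about equally well on the item, while a negative d means low scorers outperformed high scorers on it.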
QUESTION 19

Which of the following explains a difficulty index of .84?

A. 84% of the test-takers did not answer the item correctly.
B. The item is difficult.
C. The item is difficult for the lower group and easy for the upper group.
D. 84% of the test-takers answered the item correctly.
QUESTION 20

What is the primary source of error variance for a test-retest approach to reliability?

A. Time differences
B. Content differences
C. Rater differences
D. Test forms
Thank You
