Psychological Testing

Guidance and Counseling Board Examination Review

(Part I)

July 28, 2018

Prayer Before Study by St. Thomas

Aquinas, OP

Creator of all things, true source of light and wisdom, origin of all

being, graciously let a ray of your light / penetrate the darkness

of my understanding.

Take from me the double darkness / in which I have been born,

an obscurity of sin and ignorance.

Give me a keen understanding, a retentive memory, and the

ability to grasp things correctly and fundamentally.

Grant me the talent of being exact in my explanations / and the

ability to express myself with thoroughness and charm.

Point out the beginning, direct the progress, and help in the

completion. I ask this through Jesus Christ our Lord. Amen.

References

• Psychological Testing (Kaplan, Cohen, Gregory)

• Alexa Abrenica’s Workbook

• Bajo (Reviewer)

• Mastering the National Counselor Examination and the

Counselor Comprehensive Examination (Erford, Hays,

Crockett)

• De Jesus

• Munarriz, Cervera

• DSM 5

• Behavioral Research Method

• Assessment of Children and Adolescents

• List of Psychological Tests (Internet)

Psychological Testing

• Psychological testing means statistics

Purposes:

✏ Description

✏ Make Inferences

Psychological Testing

✏ a standardized measurement of a

sample of behavior

✏establishing norms

✏important test items that correspond

to what the test is to discover about the

test-taker

✏based on the uniformity of procedures

in administering and scoring the test

Standardized

✏ There is an established

reference point that a test scorer

can use to evaluate, judge,

measure against, and compare.

point?

• Norms are established

Norms

✏ Relies on the number of test takers

who take a given test, to establish

what is normal in the group

✏ Then scorers can determine where

an individual falls within that group

✏ The larger the sample, the better!

Test Items

✏ The questions that a test-taker is

asked on any given test

✏ Must be relevant to what the test is

trying to measure

✏ Must have large sets in order to

establish a proper measurement

Uniformity of Procedures in

Administering and Scoring

same way

✏ Test takers take the test the same

way

✏ Scorers score the test the same way

This helps with

✏ Validity

✏ Reliability

Scales of Measurement

Frequency Distribution

• Frequency is how often something occurs.

• By counting frequencies we can make a

Frequency Distribution table.

• Frequency Distribution: values and their

frequency (how often each value occurs).

• Normal Distribution: Bell Curve

• Measures of Central Tendency (Mean, Median, Mode)

• Measures of Spread (Range, Percentiles, Standard Deviation)
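As a minimal sketch of the counting step described above, the following builds a frequency distribution table from a short list of scores. The scores are illustrative only and are not taken from any real test.

```python
from collections import Counter

# Illustrative scores (an assumption for this sketch, not real test data).
scores = [1, 1, 1, 1, 2, 2, 4, 4]

# Count how often each value occurs.
frequency = Counter(scores)

# Print a simple frequency distribution table, lowest value first.
print("Value  Frequency")
for value in sorted(frequency):
    print(f"{value:>5}  {frequency[value]:>9}")
```

Each row pairs a value with its frequency, which is all a frequency distribution table is.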

Describing Distributions

• Mean

• Standard Deviation

• Z score

• T-score

• Quartiles and Deciles

Statistical Symbols

Mean

• The average score in a distribution

Exercise

Find the mean:

4, 2, 1, 1, 2, 1, 4, 1

Median and Mode

• Median

• Mode

Exercise

Find the median and mode:

4, 4, 2, 2, 1, 1, 1, 1

Mean = 2

Median = 1.5

Mode = 1
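The answers above can be checked with Python's standard `statistics` module. This is only a quick verification sketch using the scores from the exercise.

```python
import statistics

# Scores from the exercise above.
scores = [4, 4, 2, 2, 1, 1, 1, 1]

mean = statistics.mean(scores)      # sum of the scores divided by how many there are
median = statistics.median(scores)  # middle of the sorted scores; with an even count,
                                    # the average of the two middle scores
mode = statistics.mode(scores)      # the score that occurs most often

print(mean, median, mode)
```

Because there are eight scores, the median is the average of the fourth and fifth sorted scores (1 and 2).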

Range

• The range is the difference between the lowest and highest values.

Standard Deviation

around the mean

– a number used to tell how measurements for a

group are spread out from the average (mean), or

expected value.

– A low standard deviation means that most of the

numbers are very close to the average.

– A high standard deviation means that the

numbers are spread out.

Standard Deviation

deviation around the mean

Standard Deviation

Consider a group having the following eight

numbers/scores:

2, 4, 4, 4, 5, 5, 7, 9

These eight scores have a mean of 5:

(2 + 4 + 4 + 4 + 5 + 5 + 7 + 9) / 8 = 40 / 8 = 5

Standard Deviation

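To calculate the population standard deviation, first find the difference of each number in the list from the mean. Then square the result of each difference:

(2 − 5)² = 9, (4 − 5)² = 1, (4 − 5)² = 1, (4 − 5)² = 1, (5 − 5)² = 0, (5 − 5)² = 0, (7 − 5)² = 4, (9 − 5)² = 16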

Standard Deviation

Next, find the average of these values (sum divided by the number of numbers). Last, take the square root:

√((9 + 1 + 1 + 1 + 0 + 0 + 4 + 16) / 8) = √(32 / 8) = √4 = 2

The answer is the population standard deviation. The formula is only true if

the eight numbers we started with are the whole group. If they are only a

part of the group picked at random, then we should use 7 (which is n − 1)

instead of 8 (which is n) in the bottom (denominator) of the second-to-last

step. Then the answer is the sample standard deviation.
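The two denominators can be compared side by side. The sketch below follows the steps described above for the eight scores, once dividing by n (population) and once by n − 1 (sample). Variable names are chosen for this illustration.

```python
import math

scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)
mean = sum(scores) / n

# Squared difference of each score from the mean.
squared_deviations = [(x - mean) ** 2 for x in scores]
sum_of_squares = sum(squared_deviations)

# Population SD: divide by n. Sample SD: divide by n - 1.
population_sd = math.sqrt(sum_of_squares / n)
sample_sd = math.sqrt(sum_of_squares / (n - 1))

print(population_sd, round(sample_sd, 2))
```

The sample value is slightly larger because dividing by n − 1 compensates for estimating the mean from the same scores.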

Z score

• Difference between a score and the mean, divided by the standard deviation

• The deviation of a score from the mean in standard deviation units

Z score

Example 1:

X=6

Mean = 3

S=3
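Example 1 can be worked through with a small helper. The function name `z_score` is an assumption made for this sketch.

```python
def z_score(x: float, mean: float, sd: float) -> float:
    """Return how many standard deviations x lies above or below the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd

# Example 1: X = 6, Mean = 3, S = 3
z = z_score(6, 3, 3)  # (6 - 3) / 3
print(z)
```

A score of 6 is therefore one standard deviation above the mean.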

Frequency Distribution

Percentile Ranks

• Answers the question, “What percent of the scores fall below

a particular score?”
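A minimal sketch of that question in code, assuming the simple "percent strictly below" definition given above (some textbooks also count half of the tied scores). The helper name and the sample scores are illustrative.

```python
def percentile_rank(scores, score):
    """Percent of scores in the distribution that fall below `score`."""
    below = sum(1 for s in scores if s < score)
    return 100 * below / len(scores)

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative distribution
pr = percentile_rank(scores, 7)    # 6 of the 8 scores are below 7
print(pr)
```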

T score

• Exactly the same as standard scores (Z scores) except that the mean is 50 rather than 0 and the standard deviation is 10 rather than 1.

T = 10Z + 50
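The conversion is a one-line calculation. The sketch below applies T = 10Z + 50 to a few illustrative Z scores.

```python
def t_score(z: float) -> float:
    """Convert a Z score (mean 0, SD 1) to a T score (mean 50, SD 10)."""
    return 10 * z + 50

for z in (-1.5, 0.0, 1.0):
    print(z, t_score(z))
```

A Z score of 0 (exactly average) always maps to a T score of 50, and each standard deviation adds or subtracts 10 points.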

Quartiles and Deciles

• Quartiles are points that divide the distribution

into equal fourths. The first quartile is the 25th

percentile, the second is the median or the 50th

percentile; and the third quartile is the 75th

percentile.

• Deciles are similar to quartiles except that they use points that mark 10% rather than 25% intervals. The top decile, or D9, is the point below which 90% of the cases fall, D8 marks the 80th percentile, and so forth.
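Python's `statistics.quantiles` (available from Python 3.8) returns these cut points directly. The sketch below uses twenty made-up scores. Different textbooks and software use slightly different interpolation rules, so cut points other than the median may differ a little from a hand calculation.

```python
import statistics

scores = list(range(1, 21))  # illustrative scores 1 through 20

quartiles = statistics.quantiles(scores, n=4)   # three cut points: Q1, Q2, Q3
deciles = statistics.quantiles(scores, n=10)    # nine cut points: D1 ... D9

q1, q2, q3 = quartiles
d9 = deciles[8]  # point below which about 90% of the cases fall

print(quartiles)
print(deciles)
```

The second quartile and the fifth decile both coincide with the median, which is the 50th percentile.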

Correlation

• Scatter diagram

– Picture of the relationship between two variables

Correlation

• Correlational analysis is designed primarily to

examine linear relationships between

variables

• A correlation coefficient is a mathematical

index that describes the direction and

magnitude of a relationship.

Correlation Coefficient

and direction of straight-line correlation

Correlation Coefficient

Correlation Coefficient    Description
-1.00                      perfect negative correlation
-.60                       strong negative correlation
-.30                       moderate negative correlation
-.10                       weak negative correlation
.00                        no correlation
+.10                       weak positive correlation
+.30                       moderate positive correlation
+.60                       strong positive correlation
+1.00                      perfect positive correlation

Curvilinear Relationship

• A relationship between X and Y that begins as positive and then becomes negative, or vice versa (e.g., the relationship between anxiety levels and English test scores)

Pearson's Correlation Coefficient (r)

Coefficient

the correlational method to do agricultural

research

Pearson's Correlation Coefficient (r)

– For N greater than or equal to 30

– A straight-line relationship

– Interval data – two sets of scores or data, so

that scores may be assigned to the

respondents

– Random sampling (for applying a test of

significance)

Pearson's Correlation Coefficient (r)

• computational formula:
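The formula itself did not survive in these notes. A commonly used raw-score (computational) form is r = (NΣXY − ΣXΣY) / √([NΣX² − (ΣX)²][NΣY² − (ΣY)²]), and the sketch below implements that form. Treat it as an assumption about which version the slide showed. The five data pairs are made up and far below the N ≥ 30 guideline, so they only illustrate the arithmetic.

```python
import math

def pearson_r(x, y):
    """Pearson's r from raw scores, using the computational formula."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when a variable has no variability")
    return numerator / denominator

# Illustrative data: y rises in a perfect straight line with x.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
r = pearson_r(x, y)
print(r)
```

Reversing the order of one list would give a perfect negative correlation instead.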

Coefficient of Determination (r²)

• Squaring r gives r², which estimates the amount of variability in scores on one variable that can be explained by the other variable

• e.g., r = .40, so r² = .16

*meaning 16% of the variability in y can be explained by x and the remaining 84% can be explained by other factors

Formula of IQ

IQ = (MA / CA) × 100, where MA is mental age and CA is chronological age

IQ

Exercises:

MA = 7 MA = 20

CA = 7 CA = 16

IQ = ? IQ = ?

MA = 34 MA= 30

CA = 40 CA = 15

IQ = ? IQ = ?

IQ

Exercises:

MA = 7 MA = 20

CA = 7 CA = 16

IQ = 100 IQ = 125

MA = 34 MA= 30

CA = 40 CA = 15

IQ = 85 IQ = 200
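The four answers follow from the ratio IQ formula, mental age divided by chronological age, times 100. A short check, with `ratio_iq` as a name chosen for this sketch:

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Ratio IQ: mental age relative to chronological age, scaled to 100."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return 100 * mental_age / chronological_age

# (MA, CA) pairs from the exercise above.
pairs = [(7, 7), (20, 16), (34, 40), (30, 15)]
iqs = [ratio_iq(ma, ca) for ma, ca in pairs]
print(iqs)
```

An IQ of 100 means mental age equals chronological age. Values above 100 mean mental age is ahead of chronological age.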

Reliability vs Validity

Reliability

X = T + E (observed score = true score + error)

Reliability

• Standard Error of Measurement (SEM)

✏ Standard deviation of errors

✏ Estimates how repeated measures of a person on the same instrument tend to be distributed around his or her “true” score.

✏ The true score is always unknown because no measure can be constructed that provides a perfect reflection of the true score.

• Ex. Rubber yardstick as a measure
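The notes do not give a formula for the SEM. The standard one is SEM = SD × √(1 − reliability), and the sketch below applies it. The standard deviation of 15 and the reliability of .91 are hypothetical values chosen for illustration.

```python
import math

def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """SEM = SD * sqrt(1 - reliability coefficient)."""
    if not 0 <= reliability <= 1:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1 - reliability)

# Hypothetical test: SD of 15 and reliability of .91 (assumed values).
sem = standard_error_of_measurement(15, 0.91)
print(round(sem, 2))
```

The more reliable the test, the smaller the SEM, and the more tightly repeated scores would cluster around the true score.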

Models of Reliability

• Time sampling: the Test-retest Method

– Administration of the same test on two well-specified occasions, then finding the correlation between the scores from the two administrations

• Item sampling: Parallel Forms Method

– Compares two equivalent forms of a test that

measure the same attribute

• Split-half Method

Split-half Method

• A test is given and divided into halves that are

scored separately

• Commonly use the odd-even system

• Correlation of the two halves is computed

• Uses the Spearman-Brown Formula
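The steps above can be sketched as follows: split the items odd-even, total each half, correlate the halves, then apply the Spearman-Brown correction for a test twice as long, 2r / (1 + r). The response matrix is made up (1 = correct, 0 = incorrect), and the function names are assumptions for this illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation using deviations from the means."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(ss_x * ss_y)

def spearman_brown_full_length(r_half: float) -> float:
    """Estimate full-test reliability from the correlation of two halves."""
    return 2 * r_half / (1 + r_half)

# Made-up responses: one row per examinee, one column per item.
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

odd_totals = [sum(row[0::2]) for row in responses]   # items 1, 3, 5
even_totals = [sum(row[1::2]) for row in responses]  # items 2, 4, 6

r_half = pearson_r(odd_totals, even_totals)
reliability = spearman_brown_full_length(r_half)
print(round(r_half, 2), round(reliability, 2))
```

The correction is needed because each half is only half as long as the real test, and shorter tests are less reliable.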

Other Methods for Estimating the

Internal Consistency of a Test

• KR20 Formula

– Does not split the test into two halves

– It considers all the individual item variances

• Coefficient Alpha

– Similar to KR20 formula

– For types of tests with no right or wrong answer
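A minimal sketch of coefficient alpha, assuming the usual form α = (k / (k − 1)) × (1 − Σ item variances / total-score variance) with population variances throughout. For items scored 0 or 1, each item variance equals p × q, so the same calculation gives the KR20 value. The response matrix is made up.

```python
import statistics

def coefficient_alpha(item_scores):
    """item_scores: one row per examinee, one column per item."""
    k = len(item_scores[0])            # number of items
    columns = list(zip(*item_scores))  # regroup the scores item by item
    item_variances = [statistics.pvariance(col) for col in columns]
    total_variance = statistics.pvariance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Made-up right/wrong responses (1 = correct, 0 = incorrect).
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

alpha = coefficient_alpha(responses)
print(round(alpha, 3))
```

The same function accepts rating-scale items (for example 1 to 5), which is the case where coefficient alpha is used instead of KR20.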

What to do about low reliability?

• Increase the number of items

• Factor and item analysis

• Correction for attenuation

What to do about low reliability?

• Increase the number of items

– The reliability of the test increases as the number

of items increases

– The Spearman-Brown prophecy formula can estimate how many items will have to be added in order to bring a test to an acceptable level of reliability.
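As a sketch of that estimate, the prophecy formula can be solved for the factor by which the test must be lengthened: factor = r_target(1 − r_current) / (r_current(1 − r_target)). The 20-item test with reliability .70 and the target of .90 are hypothetical values.

```python
import math

def length_factor(current_r: float, target_r: float) -> float:
    """How many times longer the test must be to reach the target reliability."""
    return target_r * (1 - current_r) / (current_r * (1 - target_r))

def predicted_reliability(current_r: float, factor: float) -> float:
    """Spearman-Brown prediction for a test lengthened by `factor`."""
    return factor * current_r / (1 + (factor - 1) * current_r)

# Hypothetical test: 20 items, reliability .70, desired reliability .90.
current_items = 20
factor = length_factor(0.70, 0.90)
items_to_add = math.ceil(factor * current_items) - current_items

print(round(factor, 2), items_to_add)
```

The estimate assumes the added items are of the same quality as, and measure the same thing as, the existing ones.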

What to do about low reliability?

• Factor and item analysis

– Tests are most reliable when they are

unidimensional. One factor should account for

more of the variance than any other factor.

– Items that do not load on this factor might best be

omitted.

– Uses discriminability analysis

• When the correlation between the performance on a

single item and the total test score is low, the item is

probably measuring something different from the other

items on the test

What to do about low reliability?

• Correction for attenuation

– To use the method, one needs to know only the

reliabilities of two tests and the correlation

between them

– E.g. happiness and scholastic achievement
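A sketch of the correction, assuming the standard form: the observed correlation divided by the square root of the product of the two reliabilities. The numbers for the happiness and scholastic achievement measures are hypothetical.

```python
import math

def correct_for_attenuation(r_xy: float, r_xx: float, r_yy: float) -> float:
    """Estimate the correlation two measures would show if both were perfectly reliable."""
    if r_xx <= 0 or r_yy <= 0:
        raise ValueError("reliabilities must be positive")
    return r_xy / math.sqrt(r_xx * r_yy)

# Hypothetical values: observed r = .30, reliabilities .64 and .81.
corrected = correct_for_attenuation(0.30, 0.64, 0.81)
print(round(corrected, 2))
```

The corrected value is always at least as large as the observed one, because measurement error can only weaken an observed correlation.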

Validity

• The agreement between a test score or

measure and the quality it is believed to

measure.

• “Does the test measure what it is

supposed to measure?”

• Types

– Content-related

– Criterion-related

– Construct-related

Content-Related Validity

• Considers the adequacy of representation of the

conceptual domain the test is designed to cover

– E.g., the score on your history test should represent your

comprehension of the history you are expected to know

• Construct underrepresentation

– The failure to capture important components of a construct

– Ex., measure of general mathematical knowledge (but only

algebra problems are included)

• Construct-irrelevant variance

– Occurs when scores are influenced by other factors

irrelevant to the construct

– Ex., test anxiety, reading comprehension

Criterion-related Evidence for Validity

• How well a test corresponds with a particular

criterion

• Predictive validity evidence

– Forecasting function of test

– Ex., entrance test and grades

• Concurrent validity evidence

– Test scores and criterion measures are taken at the same time

– Ex., learning disability test and school performance

Validity Coefficient

• The relationship between a test and

a criterion

• Validity coefficients in the range of

.30 - .40 are commonly considered

high

Construct-related Evidence for Validity

• Established through a series of activities in which a researcher simultaneously defines some construct and develops the instrumentation to measure it

• Types

– Convergent Evidence

– Discriminant evidence

Convergent Validity

• When a measure correlates well with

other tests believed to measure the

same construct

• Ex., those who score low on a health index are expected to visit the doctor more often

Discriminant Validity

• A test should have low correlations with measures of unrelated constructs; this is evidence for what the test does not measure

• Provides evidence that a test measures something different from other tests, that is, a unique construct

• Ex., a health index should not correlate with IQ

Item Difficulty

An item’s difficulty level is usually measured in terms of the percentage of examinees who answer the item correctly. This percentage is referred to as the item difficulty index, or "p".

Optimal item difficulty (halfway between chance and 1.00) by item format:

2 alternatives (true-false) = .75

3 alternatives (multiple-choice) = .67

4 alternatives (multiple-choice) = .63

5 alternatives (multiple-choice) = .60
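The four values listed above match the rule of thumb that the optimal difficulty lies halfway between the chance success rate (1 divided by the number of alternatives) and 1.00. A sketch of that rule, with a function name chosen for illustration:

```python
def optimal_difficulty(alternatives: int) -> float:
    """Midpoint between the chance success rate and a perfect 1.00."""
    if alternatives < 2:
        raise ValueError("an item needs at least two alternatives")
    chance = 1 / alternatives
    return (1 + chance) / 2

for k in (2, 3, 4, 5):
    print(k, optimal_difficulty(k))
```

The unrounded value for four alternatives is .625, which the list above reports as .63.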

Item Discrimination

Refers to the degree to which the items

differentiate among examinees in terms of the

characteristic being measured (e.g., between

high and low scorers).
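One common way to quantify this is the discrimination index D: the proportion of high scorers who answer the item correctly minus the proportion of low scorers who do. The notes do not name a specific index, so treat this choice, the function name, and the made-up responses as assumptions.

```python
def discrimination_index(upper_group, lower_group):
    """D = p(correct) in the upper group minus p(correct) in the lower group.

    Each argument is a list of 1 (correct) / 0 (incorrect) responses to one item.
    """
    p_upper = sum(upper_group) / len(upper_group)
    p_lower = sum(lower_group) / len(lower_group)
    return p_upper - p_lower

# Made-up responses to a single item from four high and four low scorers.
d = discrimination_index([1, 1, 1, 0], [1, 0, 0, 0])  # 0.75 - 0.25
print(d)
```

A positive D means high scorers pass the item more often than low scorers. A value near zero or below suggests the item does not separate the two groups.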

Remember

If a test is unreliable, it cannot be valid.

For a test to be valid, it must be reliable.

However, just because a test is reliable

does not mean it will be valid.

For questions, you may reach me through:

mbbonifacio@ust.edu.ph

Lord God

be with you in

this journey!

