Different Kinds of Psychological Tests

Anselma R. Delos Santos

2BAF-1
Prof. Hanzen Reyes

Intelligence tests
Definition
Intelligence tests are psychological tests that are designed to measure a variety of mental
functions, such as reasoning, comprehension, and judgment.
Purpose
The goal of intelligence tests is to obtain an idea of the person's intellectual potential. The tests
center around a set of stimuli designed to yield a score based on the test maker's model of what
makes up intelligence. Intelligence tests are often given as a part of a battery of tests.
Precautions
There are many different types of intelligence tests, and they do not all measure the same
abilities. Although the tests often have aspects that are related to each other, one should not
expect that scores from one intelligence test that measures a single factor will be similar to
scores on another intelligence test that measures a variety of factors. Also, when determining
whether or not to use an intelligence test, a person should make sure that the test has been
adequately developed and has solid research to show its reliability and validity.
Additionally, psychometric testing requires a clinically trained examiner. Therefore, the test
should only be administered and interpreted by a trained professional.
A central criticism of intelligence tests is that psychologists and educators use these tests
to distribute the limited resources of our society. These test results are used to provide rewards
such as special classes for gifted students, admission to college, and employment. Those who
do not qualify for these resources based on intelligence test scores may feel angry and as if the
tests are denying them opportunities for success. Unfortunately, intelligence test scores have
not only become associated with a person's ability to perform certain tasks, but with self-worth.
Many people are under the false assumption that intelligence tests measure a person's inborn
or biological intelligence. Intelligence tests are based on an individual's interaction with the
environment and never exclusively measure inborn intelligence. Intelligence tests have been
associated with categorizing and stereotyping people. Additionally, knowledge of one's
performance on an intelligence test may affect a person's aspirations and motivation to obtain
goals. Intelligence tests can be culturally biased against certain groups.
Description
When taking an intelligence test, a person can expect to do a variety of tasks. These tasks may
include having to answer questions that are asked verbally, doing mathematical problems, and
doing a variety of tasks that require eye-hand coordination. Some tasks may be timed and
require the person to work as quickly as possible. Typically, most questions and tasks start out
easy and progressively get more difficult. It is unusual for anyone to know the answer to all of
the questions or be able to complete all of the tasks. If a person is unsure of an answer,
guessing is usually allowed.
The four most commonly used intelligence tests are:
Stanford-Binet Intelligence Scales
Wechsler-Adult Intelligence Scale
Wechsler Intelligence Scale for Children
Wechsler Primary & Preschool Scale of Intelligence
Advantages
In general, intelligence tests measure a wide variety of human behaviors better than any other
measure that has been developed. They allow professionals to have a uniform way of
comparing a person's performance with that of other people who are similar in age. These tests
also provide information on cultural and biological differences among people.
Intelligence tests are excellent predictors of academic achievement and provide an outline of a
person's mental strengths and weaknesses. Many times the scores have revealed talents in
many people, which have led to an improvement in their educational opportunities. Teachers,
parents, and psychologists are able to devise individual curricula that match a person's level
of development and expectations.
Disadvantages
Some researchers argue that intelligence tests have serious shortcomings. For example, many
intelligence tests produce a single intelligence score. This single score is often inadequate in
explaining the multidimensional aspects of intelligence. Another problem with a single score is
the fact that individuals with
similar intelligence test scores can vary greatly in their expression of these talents. It is
important to know the person's performance on the various subtests that make up the overall
intelligence test score. Knowing the performance on these various scales can influence the
understanding of a person's abilities and how these abilities are expressed. For example, suppose
two people have identical scores on intelligence tests. Although both people have the same test
score, one person may have obtained the score because of strong verbal skills while the other
may have obtained the score because of strong skills in perceiving and organizing various
tasks.
Furthermore, intelligence tests only measure a sample of behaviors or situations in which
intelligent behavior is revealed. For instance, some intelligence tests do not measure a person's
everyday functioning, social knowledge, mechanical skills, and/or creativity. Along with this, the
formats of many intelligence tests do not capture the complexity and immediacy of real-life
situations. Therefore, intelligence tests have been criticized for their limited ability to predict non-
test or nonacademic intellectual abilities. Since intelligence test scores can be influenced by a
variety of different experiences and behaviors, they should not be considered a perfect indicator
of a person's intellectual potential.
Results
The person's raw scores on an intelligence test are typically converted to standard scores. The
standard scores allow the examiner to compare the individual's score to other people who have
taken the test. Additionally, by converting raw scores to standard scores the examiner has
uniform scores and can more easily compare an individual's performance on one test with the
individual's performance on another test. Depending on the intelligence test that is used, a
variety of scores can be obtained. Most intelligence tests generate an overall
intelligence quotient or IQ. As previously noted, it is valuable to know how a person performs on
the various tasks that make up the test. This can influence the interpretation of the test and what
the IQ means. The average score for most intelligence tests is 100.
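The conversion from raw scores to standard scores can be illustrated with a minimal Python sketch. It assumes a simple linear rescaling based on the mean and standard deviation of a norm group; published tests rely on their own age-based norm tables rather than this formula. The function name to_standard_score, the norm-group values, and the default scale standard deviation of 15 are assumptions chosen for illustration only.

```python
def to_standard_score(raw, norm_mean, norm_sd, scale_mean=100.0, scale_sd=15.0):
    """Linearly rescale a raw score onto a standard-score scale.

    norm_mean and norm_sd describe the comparison (norm) group's raw scores.
    scale_sd=15.0 is an assumed default; tests differ in the value they use.
    """
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (raw - norm_mean) / norm_sd      # distance from the norm-group mean, in SDs
    return scale_mean + scale_sd * z     # same distance expressed on the target scale


# Hypothetical norm group: mean raw score 40, standard deviation 8.
print(to_standard_score(40, norm_mean=40, norm_sd=8))  # 100.0
print(to_standard_score(48, norm_mean=40, norm_sd=8))  # 115.0
```

Because both results sit on the same scale, a raw score from one test and a raw score from another can be compared once each has been converted against its own norm group.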
Keith Beard, Psy.D.
I Individual Intelligence Tests
a Wechsler Adult Intelligence Scale-R (WAIS-R)

Definition
The Wechsler adult intelligence scale (WAIS) is an individually administered measure of
intelligence, intended for adults aged 16 to 89.
Purpose
The WAIS is intended to measure human intelligence reflected in both verbal and performance
abilities. Dr. David Wechsler, a clinical psychologist, believed that intelligence is a global
construct, reflecting a variety of measurable skills and should be considered in the context of the
overall personality. The WAIS is also administered as part of a test battery to make inferences
about personality and pathology, both through the content of specific answers and patterns of
subtest scores.
Besides being utilized as an intelligence assessment, the WAIS is used
in neuropsychological evaluation, specifically with regard to brain dysfunction. Large differences
in verbal and nonverbal intelligence may indicate specific types of brain damage.
The WAIS is also administered for diagnostic purposes. Intelligence quotient (IQ) scores
reported by the WAIS can be used as part of the diagnostic criteria for mental retardation,
specific learning disabilities, and attention-deficit/hyperactivity disorder (ADHD).
Precautions
The Wechsler intelligence scales are not considered adequate measures of extremely high and
low intelligence (IQ scores below 40 and above 160). The nature of the scoring process does
not allow for scores outside of this range for test takers at particular ages. Wechsler himself was
even more conservative, stressing that his scales were not appropriate for people with an IQ
below 70 or above 130. Also, when administering the WAIS to people at extreme ends of the
age range (below 20 years of age or above 70), caution should be used when interpreting
scores. The age range for the WAIS overlaps with that of the Wechsler Intelligence Scale for
Children (WISC) for people between 16 and 17 years of age, and it is suggested that the WISC
provides a better measure for this age range.
Administration and scoring of the WAIS require an active test administrator who must interact
with the test taker and must know test protocol and specifications. WAIS administrators must
receive proper training and be aware of all test guidelines.
Description
The Wechsler intelligence tests, which include the WAIS, the WISC, and the WPPSI
(Wechsler preschool and primary scale of intelligence), are the most widely used intelligence
assessments and among the most widely used neuropsychological assessments. Wechsler
published the first version of the WAIS in 1939, initially called the Wechsler-Bellevue. The
newest version is the WAIS-III (the third edition, most recently updated in 1997). Since
Wechsler's death in 1981, the Wechsler tests have been revised by the publisher, the
Psychological Corporation.
The theoretical basis for the WAIS and the other Wechsler scales came from Wechsler's belief
that intelligence is a complex ability involving a variety of skills. Because intelligence is
multifaceted, Wechsler believed, a test measuring intelligence must reflect this multitude of
skills. After dividing intelligence into two major types of skills, verbal and performance,
Wechsler utilized the statistical technique of factor analysis to determine specific skills within
these two major domains. These more specific factors formed the basis of the Wechsler
subtests.
The WAIS-III consists of 14 subtests and takes about 60 to 75 minutes to complete. The test is
taken individually, with a test administrator present to give instructions. Each subtest is given
separately, and proceeds from very easy items to very difficult ones. There is some flexibility in
the administration of the WAIS: the administrator may end some subtests early if test takers
seem to reach the limit of their capacity. Tasks on the WAIS include questions of general
knowledge, traditional arithmetic problems, a test of vocabulary, completion of pictures with
missing elements, arrangements of blocks and pictures, and assembly of objects.
The WAIS is considered to be a valid and reliable measure of general intelligence. When
undergoing reliability and validity studies, other intelligence tests are often compared to the
Wechsler scales. It is regularly used by researchers in many areas of psychology as a measure
of intelligence. Research has demonstrated correlations between WAIS IQ scores and a variety
of socioeconomic, physiological, and environmental characteristics.
The WAIS has also been found to be a good measure of both fluid and crystallized intelligence.
Fluid intelligence refers to inductive and deductive reasoning, skills considered to be largely
influenced by neurological and biological factors. In the WAIS, fluid intelligence is reflected in
the performance subtests. Crystallized intelligence refers to knowledge and skills that are
primarily influenced by environmental and sociocultural factors. In the WAIS, crystallized
intelligence is reflected in the verbal subtests. Wechsler himself did not divide overall
intelligence into these two types. However, the consideration of fluid and crystallized intelligence
as two major categories of cognitive ability has been a focus for many intelligence theorists.
The Wechsler scales were originally developed and later revised
using standardization samples. The samples were meant to be demographically representative
of the United States population at the time of the standardization.
Results
The WAIS elicits three intelligence quotient scores, based on an average of 100, as well as
subtest and index scores. WAIS subtests measure specific verbal abilities and specific
performance abilities.
The WAIS elicits an overall intelligence quotient, called the full-scale IQ, as well as a verbal IQ
and a performance IQ. The three IQ scores are standardized in such a way that the scores have
a mean of 100 and a standard deviation of 15. Wechsler pioneered the use of deviation IQ
scores, allowing test takers to be compared to others of different as well as the same age. WAIS
scores are sometimes converted into percentile ranks. The verbal and performance IQ scores
are based on scores on the 14 subtests. The 14 subtest scores have a mean of 10 and a
standard deviation of three. The WAIS also elicits four indices, each based on a different set of
subtests: verbal comprehension, perceptual organization, working memory, and processing
speed.
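The relationship between deviation IQ scores and percentile ranks can be sketched in Python. The means and standard deviations below are the ones stated above (100 and 15 for IQ scores, 10 and 3 for subtest scores). The sketch assumes that scores follow a normal distribution in the reference population, which is an idealization; the helper name percentile_rank is hypothetical and is not part of any test's scoring materials.

```python
from statistics import NormalDist

IQ_SCALE = NormalDist(mu=100, sigma=15)     # IQ scores: mean 100, SD 15 (from the text)
SUBTEST_SCALE = NormalDist(mu=10, sigma=3)  # subtest scores: mean 10, SD 3 (from the text)


def percentile_rank(score, scale):
    """Percentage of the reference population expected to score at or below `score`.

    Assumes normally distributed scores, which real norm data only approximate.
    """
    return 100 * scale.cdf(score)


print(round(percentile_rank(100, IQ_SCALE)))      # 50
print(round(percentile_rank(115, IQ_SCALE)))      # 84
print(round(percentile_rank(13, SUBTEST_SCALE)))  # 84
```

An IQ of 115 and a subtest score of 13 are both one standard deviation above their respective means, which is why they map to the same percentile rank under this assumption.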
The full-scale IQ is based on scores on all of the subtests and is a reflection of both verbal IQ
and performance IQ. It is considered the single most reliable and valid score elicited by the
WAIS. However, when an examinee's verbal and performance IQ scores differ significantly, the
full-scale IQ should be interpreted cautiously.
The verbal IQ
The verbal IQ is derived from scores on seven of the subtests: information, digit span,
vocabulary, arithmetic, comprehension, similarities, and letter-number sequencing. Letter-
number sequencing is a new subtest added to the most recent edition of the WAIS (WAIS-III).
The information subtest is a test of general knowledge, including questions about geography
and literature. The digit span subtest requires test takers to repeat strings of digits. The
vocabulary and arithmetic subtests are general measures of a person's vocabulary and
arithmetic skills. The comprehension subtest requires test takers to solve practical problems and
explain the meaning of proverbs. The similarities subtest requires test takers to indicate the
similarities between pairs of things. The letter-number sequencing subtest involves ordering
numbers and letters presented in an unordered sequence. Scores on the verbal subtests are
based primarily on correct answers.
The performance IQ
The performance IQ is derived from scores on the remaining seven subtests: picture
completion, picture arrangement, block design, object assembly, digit symbol, matrix reasoning,
and symbol search. Matrix reasoning and symbol search are new subtests and were added to
the most recent edition of the WAIS (WAIS-III).
In the picture completion subtest, the test taker is required to complete pictures with missing
elements. The picture arrangement subtest entails arranging pictures in order to tell a story. The
block design subtest requires test takers to use blocks to make specific designs. The object
assembly subtest requires people to assemble pieces in such a way that a whole object is built.
In the digit symbol subtest, digits and symbols are presented as pairs and test takers then must
pair additional digits and symbols. The matrix reasoning subtest requires test takers to
identify geometric shapes. The symbol search subtest requires examinees to match symbols
appearing in different groups. Scores on the performance subtests are based on both response
speed and correct answers.
Ali Fahmy, Ph.D.
b WECHSLER INTELLIGENCE SCALE FOR CHILDREN (WISC)

Definition
The Wechsler Intelligence Scale for Children, often abbreviated as WISC, is an individually
administered measure of intelligence intended for children aged six years to 16 years and 11
months.
Purpose
The WISC is designed to measure human intelligence as reflected in both verbal and nonverbal
(performance) abilities. David Wechsler, the author of the test, believed that intelligence has a
global quality that reflects a variety of measurable skills. He also thought that it should be
considered in the context of the person's overall personality.
The WISC is used in schools as part of placement evaluations for programs for gifted children
and for children who are developmentally disabled.
In addition to its uses in intelligence assessment, the WISC is used
in neuropsychological evaluation, specifically with regard to brain dysfunction. Large differences
in verbal and nonverbal intelligence may indicate specific types of brain damage.
The WISC is also used for other diagnostic purposes. IQ scores reported by the WISC can be
used as part of the diagnostic criteria for mental retardation and specific learning disabilities.
The test may also serve to better evaluate children with attention-deficit/hyperactivity
disorder (ADHD) and other behavior disorders.
Precautions
The Wechsler intelligence scales are not considered adequate measures of extreme intelligence
(IQ scores below 40 and above 160). The scoring process does not allow for scores outside this
range for test takers at particular ages. Wechsler himself was even more conservative, stressing
that his scales were not appropriate for people with IQs below 70 or above 130. Despite this
restriction, many people use the WISC as a measure of the intelligence of gifted children, who
typically score above 130. The age range for the WISC overlaps with that of the Wechsler
Adult Intelligence Scale (WAIS) for people between 16 and 17 years of age, but experts
suggest that the WISC provides a better measure for people in this age range.
Administration and scoring of the WISC require a competent administrator who must be able to
interact and communicate with children of different ages and must know test protocol and
specifications. WISC administrators must receive training in the proper use of the instrument
and demonstrate awareness of all test guidelines.
Description
The Wechsler intelligence tests, which include the WISC, the WAIS, and the WPPSI
(Wechsler Preschool and Primary Scale of Intelligence), are the most widely used intelligence
and neuropsychological assessments. The first version of the WISC was written in 1949 by
David Wechsler. The newest version of the WISC is the WISC-III (Wechsler Intelligence Scale
for Children-Third Edition, most recently updated in 1991). Since Wechsler's death in 1981, the
tests have been revised by their publisher, the Psychological Corporation.
The theoretical basis for the WISC and the other Wechsler scales is Wechsler's belief that
human intelligence is a complex ability involving a variety of skills. Because intelligence is
multifaceted, Wechsler believed, a test measuring intelligence must reflect this diversity. After
dividing intelligence into two major types of skills, verbal and performance, Wechsler used
a statistical technique called factor analysis to determine which specific skills fit within these two
major domains.
The current version of the WISC (the WISC-III) consists of 13 subtests and takes between 50
and 75 minutes to complete. The test is taken individually, with an administrator present to give
instructions. Each subtest is given separately. There is some flexibility in the administration of
the WISC: the administrator may end some subtests early if the test taker appears to have
reached the limit of his or her capacity. Tasks on the WISC include questions of general
knowledge, traditional arithmetic problems, English vocabulary, completion of mazes, and
arrangements of blocks and pictures.
Children who take the WISC are scored by comparing their performance to other test takers of
the same age. The WISC yields three IQ (intelligence quotient) scores, based on an average of
100, as well as subtest and index scores. WISC subtests measure specific verbal and
performance abilities. The Wechsler scales were originally developed and later revised
using standardization samples. The samples were meant to be representative of the United
States population at the time of standardization.
The WISC is considered to be a valid and reliable measure of general intelligence in children. It
is regularly used by researchers in many areas of psychology and child development as a
general measure of intelligence. It has also been found to be a good measure of both fluid
and crystallized intelligence. Fluid intelligence refers to inductive and deductive reasoning, skills
that are thought to be largely influenced by neurological and biological factors. Fluid intelligence
is measured by the performance subtests of the WISC. Crystallized intelligence refers to
knowledge and skills that are primarily influenced by environmental and sociocultural factors. It
is measured by the verbal subtests of the WISC. Wechsler himself did not divide overall
intelligence into these two types. The definition of fluid and crystallized intelligence as two major
categories of cognitive ability, however, has been a focus of research for many intelligence
theorists.
Verbal IQ
The child's verbal IQ score is derived from scores on six of the subtests: information, digit span,
vocabulary, arithmetic, comprehension, and similarities.
The information subtest is a test of general knowledge, including questions about geography
and literature. The digit span subtest requires the child to repeat strings of digits recited by the
examiner. The vocabulary and arithmetic subtests are general measures of the child's
vocabulary and arithmetic skills. The comprehension subtest asks the child to solve practical
problems and explain the meaning of simple proverbs. The similarities subtest asks the child to
describe the similarities between pairs of items, for example that apples and oranges are both
fruits.
Performance IQ
The child's performance IQ is derived from scores on the remaining seven subtests: picture
completion, picture arrangement, block design, object assembly, coding, mazes, and symbol
search.
In the picture completion subtest, the child is asked to complete pictures with missing elements.
The picture arrangement subtest entails arranging pictures in order to tell a story. The block
design subtest requires the child to use blocks to make specific designs. The object assembly
subtest asks the child to put together pieces in such a way as to construct an entire object. In
the coding subtest, the child makes pairs from a series of shapes or numbers. The mazes
subtest asks the child to solve maze puzzles of increasing difficulty. The symbol search subtest
requires the child to match symbols that appear in different groups. Scores on the performance
subtests are based on both the speed of response and the number of correct answers.
Results
WISC scores yield an overall intelligence quotient, called the full scale IQ, as well as a verbal IQ
and a performance IQ. The three IQ scores are standardized in such a way that a score of 100
is considered average and serves as a benchmark for higher and lower scores. Verbal and
performance IQ scores are based on scores on the 13 subtests.
The full scale IQ is derived from the child's scores on all of the subtests. It reflects both verbal
IQ and performance IQ and is considered the single most reliable and valid score obtained by the
WISC. When a
child's verbal and performance IQ scores are far apart, however, the full scale IQ should be
interpreted cautiously.
Ali Fahmy, Ph.D.
c STANFORD BINET LM
Definition
The Stanford-Binet Intelligence Scale: Fourth Edition (SB: FE) is a standardized test that
measures intelligence and cognitive abilities in children and adults, from age two through
mature adulthood.
Purpose
The Stanford-Binet Intelligence Scale was originally developed to help place children in
appropriate educational settings. It can help determine the level of intellectual and cognitive
functioning in preschoolers, children, adolescents and adults, and assist in the diagnosis of a
learning disability, developmental delay, mental retardation, or giftedness. It is used to provide
educational planning and placement, neuropsychological assessment, and research. The
Stanford-Binet Intelligence Scale is generally administered in a school or clinical setting.
Precautions
The Stanford-Binet Intelligence Scale is considered to be one of the best and most widely used
intelligence tests available. It is especially useful in providing intellectual assessment in young
children, adolescents, and young adults. The test has been criticized for not being comparable
for all age ranges. This is because different age ranges are administered different subtests.
Additionally, for very young preschoolers, it is not uncommon to receive a score of zero due to
test difficulty or the child's unwillingness to cooperate. Consequently, it is difficult to discriminate
abilities in this age group among the lower scorers.
Administration and interpretation of results of the Stanford-Binet Intelligence Scale require a
competent examiner who is trained in psychology and individual intellectual assessment,
preferably a psychologist.
Description
The Stanford-Binet Intelligence Scale has a rich history. It is a descendant of the Binet-Simon
scale which was developed in 1905 and became the first intelligence test. The Stanford-Binet
Intelligence Scale was developed in 1916 and was revised in 1937, 1960, and 1986. The
present edition was published in 1986. The Stanford-Binet Intelligence Scale is currently being
revised and the Fifth Edition is expected to be available in the spring of 2003.
Administration of the Stanford-Binet Intelligence Scale typically takes between 45 and 90 minutes,
but can take as long as two hours, 30 minutes. The older the child and the more subtests
administered, the longer the test generally takes to complete. The Stanford-Binet Intelligence
Scale is comprised of four cognitive area scores which together determine the composite score
and factor scores. These area scores include: Verbal Reasoning, Abstract/Visual Reasoning,
Quantitative Reasoning, and Short-Term Memory. The composite score is considered to be what
the authors call the best estimate of "g" or "general reasoning ability" and is the sum of all of the
subtest scores. General reasoning ability or "g" is considered to represent a person's ability to
solve novel problems. The composite score is a global estimate of a person's intellectual
functioning.
The test consists of 15 subtests, which are grouped into the four area scores. Not all subtests
are administered to each age group, but six subtests are administered to all age levels. These
subtests are: Vocabulary, Comprehension, Pattern Analysis, Quantitative, Bead Memory, and
Memory for Sentences. The number of tests administered and general test difficulty are adjusted
based on the test taker's age and performance on the sub-test that measures word knowledge.
The subtest measuring word knowledge is given to all test takers and is the first subtest
administered.
The following is a review of the specific cognitive abilities that the four area scores measure.
The Verbal Reasoning area score measures verbal knowledge and understanding obtained
from the school and home learning environment and reflects the ability to apply verbal skills to
new situations. Examples of subtests comprising this factor measure skills which include: word
knowledge, social judgment and awareness, ability to isolate the inappropriate feature in visual
material and social intelligence, and the ability to differentiate essential from non-essential
detail.
The Abstract/Visual Reasoning area score examines the ability to interpret and perform
mathematic operations, the ability to visualize patterns, visual/motor skills, and problem-solving
skills through the use of reasoning. An example of a subtest which determines the
Abstract/Visual Reasoning score is a timed test that involves tasks such as completing a basic
puzzle and replicating black and white cube designs.
The Quantitative Reasoning area score measures numerical reasoning, concentration, and
knowledge and application of numerical concepts. The Quantitative Reasoning area is
combined with the Abstract/Visual Reasoning area score to create an Abstract/Visual Reasoning
Factor Score.
The Short-Term Memory score measures concentration skills, short-term memory, and
sequencing skills. Subtests comprising this area score measure visual short-term memory and
auditory short term memory involving both sentences and number sequences. In one subtest
that measures visual short-term memory, the participant is presented with pictures of a bead
design, and asked to replicate it from memory.
Results
The Stanford-Binet Intelligence Scale is a standardized test, which means that a large sample of
children and adults were administered the exam as a means of developing test norms. The
population in the sample was representative of the population of the United States based on
age, gender, race or ethnic group, geographic region, community size, parental education,
educational placement (normal versus special classes), etc. From this sample, norms were
established. Norms are the performance of a comparison group of subjects. The nature of the
group should be specified, and it usually constitutes a normal group, so that the performance
of the tested individual can be compared to this group and thus evaluated.
The numbers of correct responses on the given subtests are converted to a Standard Age Score
(SAS), which is based on the chronological age of the test subject. This score is
similar to an I.Q. score. Based on these norms, the Area Scores and Test Composite on the
Stanford-Binet Intelligence Scale each have a mean or average score of 100 and a standard
deviation of 16. For this test, as with most measures of intelligence, a score of 100 is in the
normal or average range. The standard deviation indicates how far above or below the norm a
child's score is. For example, a score of 84 is one standard deviation below the norm score of
100. Based on the number of correct responses on a given subtest, an age-equivalent is
available to help interpret the person's level of functioning.
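The arithmetic behind the example above can be written out as a short Python sketch. The mean of 100 and standard deviation of 16 are the values stated in the text; the function name sd_units_from_mean is a hypothetical helper used only for illustration.

```python
SB_MEAN = 100  # mean of the Area Scores and Test Composite (from the text)
SB_SD = 16     # standard deviation (from the text)


def sd_units_from_mean(score, mean=SB_MEAN, sd=SB_SD):
    """Signed distance of a score from the mean, in standard deviations."""
    return (score - mean) / sd


print(sd_units_from_mean(84))   # -1.0
print(sd_units_from_mean(132))  # 2.0
```

A negative result means the score falls below the norm and a positive result means it falls above, so a score of 84 is one standard deviation below 100 and a score of 132 is two standard deviations above it.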
Test scores provide an estimate of the level at which a child is functioning based on a
combination of many different subtests or measures of skills. A trained psychologist is needed to
evaluate and interpret the results, determine strengths and weaknesses, and make overall
recommendations based on the findings and behavioral observations.
Resources
BOOKS
Sattler, Jerome. Assessment of Children. 3rd Edition. San Diego, CA, Jerome Sattler, Publisher
Inc. 1992.
PERIODICALS
Caruso, J. "Reliable Component Analysis of the Stanford-Binet: Fourth Edition for 2-6-Year
Olds." Psychological Assessment 13, no. 2 (2001): 827-840.
Grunau, R., M. Whitfield, and J. Petrie. "Predicting IQ of Biologically 'At Risk' Children from Age
3 to School Entry." Developmental and Behavioral Pediatrics 21, no. 6 (2000): 401-407.
II Group Intelligence Tests
a The College Aptitude Examination
General Aptitude Test (GAT)
Objectives
This test measures a student's analytical and deductive skills. It focuses on testing the student's capacity
for learning in general regardless of any specific skill in a certain subject or topic. The test measures
abilities relevant to:
1. reading comprehension
2. recognizing logical relations
3. solving problems based on basic mathematical notions.
4. inference skills
5. measuring capacity
GAT Sections
The test is divided into two sections, the verbal and the quantitative.
The verbal section. This section includes the following.
o Reading comprehension: Testees are required to comprehend and analyze reading
passages by answering the questions given.
o Sentence completion: Testees are asked to fill in the missing parts in the text to make
complete meaningful sentences.
o Verbal analogy: Testees are required to match the relation between a pair of words given
at the beginning of the question and a pair given in the choices.
o Synonymy: Testees are asked to give a synonym that matches the meaning of the word
given.
This section includes 68 questions for science students and 91 questions for humanities
majors.
The quantitative section. This section includes mathematical problems suited to General Secondary
School science and humanities majors. It focuses on measurement, inference and problem-solving skills
and requires only basic knowledge. For science majors, the section includes 52 objective questions,
divided as follows:
Arithmetic questions 40 %
Geometry questions 24 %
Algebra questions 23 %
Statistical and analytical questions 13 %
For humanities majors, GAT includes 30 questions on arithmetic, geometry and
mathematical analysis.
Common information about the test
1. Some trial questions are included in both the verbal and the quantitative sections, but they do not
count towards the final score.
2. Questions alternate between the verbal type and the quantitative type in all six parts. A 25-minute
duration is allocated for each part.
3. The number of items is fixed in all tests and so are the sections, parts and time duration. Items,
however, vary, although the same difficulty standard is maintained. To ensure the validity and
reliability of the test, test scores are compared to those of the previous tests.
4. Questions are arranged in order of difficulty from the easiest to the more difficult in each section.
Students should answer questions fast enough to cover them all within the time allowed.
A student should not leave any question unanswered; guessing by elimination is one
strategy that he/she can follow while answering questions.
Test duration
The test normally takes two and a half hours, divided into 25-minute intervals for each of the six test parts.
Preparation for the test
The test is not based on a particular type of knowledge obtained from certain courses the student has
covered. The test, therefore, does not require special preparation except for some training in
general test-taking strategies. Some basic mathematical and geometry skills are covered in the
test. A student is advised to consult the GAT preparation manual produced by the Center, which includes
some questions and model answers. The test does not require complex mathematical operations,
so the use of a calculator is not allowed.
GAT timetable
The Center holds the GAT twice a year, once during the first semester and once during the second
semester. The student is entitled to pick the time he/she finds appropriate and the center nearest to
where he/she lives. Testing centers are spread nationwide to save students cost and effort. A circular
with the necessary registration information is sent by the Center to all secondary schools all over the
Kingdom. Students can also visit the Center's website and check the timetable.
Procedures on the test day
Be sure to arrive at the exam session on time, on the date and in the time slot specified in the
registration form.
GAT is administered in day and evening sessions. For the day session, the test starts in all
exam stations at 7:30 AM sharp. You are advised to be at your exam venue by 7:00 AM. Latecomers
are not allowed into the test session.
Evening session. All test-takers are required to be at the exam station 30 to 60 minutes after
Maghreb Athan according to Makkah's prayer calendar. Latecomers are not allowed into the exam
hall.
As soon as you arrive at the testing station, go directly to the test supervisor, who will review your
documents and check your eligibility and identity.
You will be shown to your assigned seat, where you will find an HB2 pencil that you will use to fill in
the blanks. You may bring an HB2 pencil and a sharpener with you as a precaution. Using
fountain or ball-point pens is strictly forbidden.
As soon as students are admitted into the test hall, question booklets will be distributed and you
will be instructed to write your personal information on one side; the answer sheet is on
the back side. All students start answering at the same time.
If you have any questions, raise your hand and a supervisor will come to your seat.
GAT duration is two and a half hours.
GAT is divided into six parts, each of which is allotted only 25 minutes. The test supervisor will tell you
the start and end time of each part.
During the 25-minute period allowed for a particular part, you may answer and review the
questions in that part only. You are not allowed to move to a previous or a following part. Your
test session might be cancelled in case of non-compliance.
How many times can you sit for GAT?
A student who is not satisfied with his/her GAT score is allowed to sit for the test more than once. It is
not expected, however, that a student will see a big difference in score if he/she takes it once more under
normal conditions. A student may present any of the GAT results he/she obtained to the university he/she is
applying to. The test is strictly standardized to ensure equity and fairness.
Results
Answer sheets are machine-graded, and results are listed, printed and then announced. A student can
receive the GAT score via the Center's website or through an SMS for those registered for this service.
Results are sent electronically to universities and colleges. For students who took the test more than once,
only the highest score is sent. GAT is not a pass/fail test. The recorded score that the student receives
represents the relative position that the student occupies among the total number of students taking the
test. Every higher education institution in Saudi Arabia has its own method of interpreting the relative
weight of GAT vs. the General Secondary School final score. Competition in university admission is then
based on the combined total scores of the GAT and General Secondary School in addition to the score
obtained in any achievement test administered by the university (if required).
Is GAT a Pass/Fail test?
As we have noted, GAT is not a pass/fail test. This 100-point test carries a certain relative weight
interpreted by the institution the student applies to. The GAT score should not be compared to the General
Secondary School score. What really matters is the relative position of the student compared to those of
other students, according to the following table:
Score           Student's position
81 and above    Top 5%
78 and above    Top 10%
65              The average
60 and below    Lowest 30%
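The reference points in the table can be read as a simple lookup. The sketch below is illustrative only: the function name `gat_position` is hypothetical, and scores that fall between the published reference points are reported as unlisted rather than interpolated, since the table gives no information about them.

```python
def gat_position(score: float) -> str:
    """Map a GAT score (0-100) to the position band given in the table above."""
    if not 0 <= score <= 100:
        raise ValueError("GAT scores range from 0 to 100")
    if score >= 81:
        return "Top 5%"
    if score >= 78:
        return "Top 10%"
    if score == 65:
        return "The average"
    if score <= 60:
        return "Lowest 30%"
    # Assumption: the table does not describe these scores, so nothing is inferred.
    return "Not listed in the table"


print(gat_position(85))  # Top 5%
print(gat_position(79))  # Top 10%
print(gat_position(55))  # Lowest 30%
```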
References: National Center for Assessment in Higher Education, 2016
b GENERAL MENTAL ABILITY TEST

GENERAL ABILITY TESTS (GATS)
The three Morrisby General Ability Tests can be used separately or as a unit. They have been
designed to assess Verbal, Numerical and Perceptual potential, rather than attainment, so they
are not intended to assess acquired knowledge like mathematical formulae or the possession of
a wide and complex vocabulary. The General Ability Tests are intended to establish how well
individuals can manipulate words, numbers and pictured visual symbols and where their
greatest strengths lie.
Verbal: Measures a candidate's level of skill in communicating and ability to manipulate and
understand words. It is composed of two sub-tests which examine the candidate's insight into
the meaning of words.
Numerical: Examines the ability to manipulate figures and perceive relationships between
them, and is a useful predictor of potential skill in mathematics, data handling and commercial
and financial matters. It requires no mathematical skills beyond a very basic knowledge of simple
arithmetic.
Perceptual: Assesses the candidate's ability to deal with real objects rather than verbal or
numerical concepts. It tests the ability to manipulate and perceive the relationships between
two-dimensional figures. This is closely related to the effective use of diagrams and graphics,
which are important factors in much scientific and design work, as well as engineering.
Benefits: An inexpensive and quick method of gaining reliable and objective information on
candidates' abilities. Well-established, BPS-registered measures of core general abilities.
Useful as preliminary screening tests. The three GATs relate to a type of learning ability; that of
remembering verbal, numerical and perceptual symbols associated with conceptual knowledge.
The three tests measure the base parameters of this part of the intellectual structure, which
confers the ability to memorise, manipulate and utilise such conceptual knowledge. The
tests are therefore useful in all forms of general selection. The tests are suitable for those of all
levels of ability and have been successfully used with groups ranging from graduate level to
those with no formal qualifications.
Items:
Verbal: Identifies the potential ability to use ideas and concepts expressed in words. It does not
directly measure knowledge of English. Vocabulary is weighted less than the insight into verbal
relationships.
Numerical: Measures the mental capacity for interpreting experience, ideas and concepts
quantitatively. The tests are weighted towards an insight into numerical relationships and away
from a facility for arithmetic.
Perceptual: Measures the mental function which experiences the world in terms of direct
observations and representations, such as charts, pictures, diagrams and so on.
Materials & Administration: The General Ability Tests are paper-and-pencil tests, using a
reusable Question Book and separate Answer Sheets. Performance times: Verbal (16 mins);
Numerical (24 mins); Perceptual (20 mins). The time to take the entire GAT is 1 hour, and total
administration time is approximately 80 mins, depending on group size. All the tests are hand
scored using plastic overlay scoring keys. Instructions for scoring, norms, interpretation
guidelines, as well as administration instructions and a photocopyable administration log are
included in the Manual. Age: 15 years to adult.
References: Morrisby.com
c SRA Verbal by Louis L. Thurstone (1887-1955)
Purpose: The SRA (Science Research Associates) Verbal is a test of general ability. It is used as a
measure of an individual's overall adaptability and flexibility in comprehending and following
instructions, and in adjusting to alternating types of problems.
This test has been designed for use in both school and industry. Forms A and
B can be used at all educational levels from junior high school to college and at all employee
levels from unskilled laborers to middle management. However, it is intended only for persons
who are familiar with the English language. To determine the general ability of persons who
speak foreign languages or of illiterates, a nonverbal form or pictorial test should be used.
Description: The SRA Verbal is a short test of general ability designed to measure adaptability.
The items are of two types: Vocabulary (Linguistic) and Arithmetic Reasoning (Quantitative).
Because these two item types measure separate mental skills and are presented in an
interspersed format with a restricted time limit, the test presents a situation in which the
individual must adjust rapidly from one item to another.
The score level depends on power (ability to handle test items) and speed of
responses (adaptability). A low score can be attributed either to inability to answer items or
inability to shift mental set. It would seem logical that this test would generally prove to be most valid
in a situation where both mental ability and speed of responses are required.
Construction of Test: Test items are selected by type (L or Q) and by level of difficulty. Items are
arranged by increasing difficulty in the following sequence: 2 linguistic, 1 quantitative, 2 linguistic,
2 quantitative (2:1:2:2) for both Forms A and B.
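The interspersed ordering can be pictured with a short sketch. It assumes, as an illustration only, that the 2:1:2:2 block simply repeats until the requested number of items is reached; the function name and the item counts used here are hypothetical, since the source does not state the total number of items.

```python
from itertools import cycle, islice

# One block of the 2:1:2:2 pattern: 2 linguistic, 1 quantitative, 2 linguistic, 2 quantitative.
PATTERN = ["L", "L", "Q", "L", "L", "Q", "Q"]


def item_type_sequence(n_items: int) -> list[str]:
    """Return the item types (L or Q) for the first n_items positions of the test."""
    return list(islice(cycle(PATTERN), n_items))


print("".join(item_type_sequence(14)))  # LLQLLQQLLQLLQQ
```

The frequent switches between L and Q items are what force the examinee to shift mental set, which is the adaptability the test is meant to capture.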
Reliability: A study of the equivalence of Forms A and B of the SRA Verbal was conducted on a
sample of 300 high school students in the Midwest. The age range of the students was from
14-19 years with a mean of 16.3. One-half of the sample took Form A first and then Form B.
This sequence was reversed for the other half of the students. Reliabilities shown were in the
high .70s for all scores: L, Q and Total.
Age/Range: Senior High School, College, and Adult
Time: 15 minutes
Requirements for Purchase: Level B
Requirements for Purchase (Philippine Psychological Corporation, 1995)
LEVEL A - available if the person administering the tests has had undergraduate courses in testing or
psychometrics, or sufficient training and experience in test administration.
LEVEL B - available only if the test administrator has completed an advanced-level course in testing
in a university, or its equivalent in training under the direction of a qualified superior or consultant.
LEVEL C - available only for use by, or under the supervision of, qualified psychologists, i.e. members of
the APA or the PAP or other persons with at least a Master's degree in psychology and at least one
year of experience under professional supervision.
Administration of the Test: The testing room should be free from distractions. The examiner
distributes a test booklet and a hard lead pencil to each examinee; in addition, each examinee
should be given a sheet of scratch paper for use in solving arithmetic problems. Each
examinee prints his name, group, age and date on the cover of the test booklet and works through the
practice exercises on the cover page.
The examiner should allow as much time as needed on the
practice exercises. When everyone has completed the practice page and all questions have
been answered, the examiner gives the starting signal for the test to begin. At the word
BEGIN, the timing of the test starts. At the end of 15 minutes, the examiner should say
STOP. PUT YOUR PENCIL DOWN AND CLOSE YOUR BOOKLET.
Scoring: Hand-scoring. Raw scores on either form may be converted to a percentile or
stanine rank for interpretation. To determine the examinee's rank, locate his raw score in the
appropriate norm group on the table (found in the manual).
Percentile    Classification
97-99         Very Superior
90-96         Superior
75-89         Above Average
60-74         High Average
40-59         Average
24-39         Low Average
10-23         Below Average
4-9           Low
1-3           Very Low
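The percentile classification can be expressed as a small lookup. This is a sketch rather than the manual's procedure: the function name is hypothetical, and it assumes the bands run in descending order with Low Average covering percentiles 24-39 and Below Average covering 10-23. Converting a raw score to a percentile still requires the norm tables in the manual, which are not reproduced here.

```python
# (lowest percentile, highest percentile, classification), highest band first.
BANDS = [
    (97, 99, "Very Superior"),
    (90, 96, "Superior"),
    (75, 89, "Above Average"),
    (60, 74, "High Average"),
    (40, 59, "Average"),
    (24, 39, "Low Average"),    # assumed ordering, see the note above
    (10, 23, "Below Average"),  # assumed ordering, see the note above
    (4, 9, "Low"),
    (1, 3, "Very Low"),
]


def classify_percentile(percentile: int) -> str:
    """Return the classification label for a percentile rank between 1 and 99."""
    for low, high, label in BANDS:
        if low <= percentile <= high:
            return label
    raise ValueError("percentile rank must be between 1 and 99")


print(classify_percentile(92))  # Superior
print(classify_percentile(50))  # Average
```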
Interpretation: The Linguistic (L) score represents proficiency in the use of language and is
probably most related to the student's performance in work requiring language comprehension,
such as English language, foreign languages, history and the various social sciences. HIGH
SCORE - may be expected to have a good command of the English language, to follow instructions
easily, and to show facility in reading and expression. LOW SCORE - will probably have difficulty
handling most verbal material.
Interpretation: The Quantitative (Q) score represents efficiency in perceiving and solving mathematical
problems. This score is probably most related to achievement in mathematics, bookkeeping and
science courses. HIGH SCORE - may be expected to adjust quickly and accurately to
situations involving numbers. LOW SCORE - will probably have difficulty handling most
quantitative problems.
Sample Interpretation # 1
Sample Interpretation # 2
Sample Interpretation # 3
The client garnered below average scores on both the Linguistic and Quantitative parts,
manifesting his difficulty with the English language and arithmetic reasoning.
Sample Interpretation # 4
On the SRA Verbal, the client got an average score on the Linguistic part, signifying her
adeptness with the English language, while she garnered an above average score on the
Quantitative part, denoting her skill in arithmetic reasoning.
Sample Interpretation # 5
Intellectually, the client garnered superior scores on both parts of the SRA Verbal
(Linguistic and Quantitative), indicative of his proficiency in the English language
and arithmetic reasoning.
Sample Interpretation # 6
On the SRA Verbal, the client got an average score on the Linguistic part, signifying her
adeptness with the English language, while she garnered an above average score on the
Quantitative part, denoting her skill in arithmetic reasoning.
References: Thurstone, T.G., Thurstone, L.L., Science Research Associates

PERSONALITY TESTS
A personality test is a questionnaire or other standardized instrument designed to reveal
aspects of an individual's character or psychological makeup.
The first personality tests were developed in the 1920s and were intended to ease the process
of personnel selection, particularly in the armed forces. Since these early efforts, a wide variety
of personality tests have been developed, notably the Myers-Briggs Type Indicator (MBTI),
the Minnesota Multiphasic Personality Inventory (MMPI), and a number of tests based on
the five factor model of personality, such as the Revised NEO Personality Inventory.
Estimates of how much the industry is worth are between $2 and $4 billion a year. Personality
tests are used in a range of contexts, including individual and relationship counseling, career
counseling, employment testing, occupational health and safety, and customer interaction
management.
Test development
A substantial amount of research and thinking has gone into the topic of personality
test development. Development of personality tests tends to be an iterative process whereby a
test is progressively refined. Test development can proceed on theoretical or statistical grounds.
There are three commonly used general strategies: Inductive, Deductive, and Empirical. Scales
created today will often incorporate elements of all three methods.
Deductive assessment construction begins by selecting a domain or construct to measure. The
construct is thoroughly defined by experts and items are created which fully represent all the
attributes of the construct definition. Test items are then selected or eliminated based upon
which will result in the strongest internal validity for the scale. Measures created through
deductive methodology are equally valid and take significantly less time to construct compared
to inductive and empirical measures. The clearly defined and face valid questions that result
from this process make them easy for the person taking the assessment to understand.
Although subtle items can be created through the deductive process, these measures often are
not as capable of detecting lying as other methods of personality assessment construction.
Inductive assessment construction begins with the creation of a multitude of diverse items. The
items created for an inductive measure are not intended to represent any theory or construct in
particular. Once the items have been created they are administered to a large group of
participants. This allows researchers to analyze natural relationships among the questions and
label components of the scale based upon how the questions group together. Several statistical
techniques can be used to determine the constructs assessed by the measure. Exploratory
Factor Analysis and Confirmatory Factor Analysis are two of the most common data reduction
techniques that allow researchers to create scales from responses on the initial items.
The Five Factor Model of personality was developed using this method. Advantages of this
method include the opportunity to discover previously unidentified or unexpected relationships
between items or constructs. It also may allow for the development of subtle items that prevent
test takers from knowing what is being measured and may represent the actual structure of a
construct better than a pre-developed theory. Criticisms include a vulnerability to finding item
relationships that do not apply to a broader population, difficulty identifying what may be
measured in each component because of confusing item relationships, or constructs that were
not fully addressed by the originally created questions.
Empirically derived personality assessments also require statistical techniques. One of the
central goals of empirical personality assessment is to create a test that validly discriminates
between two personality features. For example, this may include depressed and non-depressed
individuals, or individuals high or low in levels of aggression. In order to accomplish this goal,
items are selected that differentiate between the groups on the personality trait being assessed.
The Minnesota Multiphasic Personality Inventory was initially developed using this method.
Test evaluation
There are several criteria for evaluating a personality test. Fundamentally, a personality test is
expected to demonstrate reliability and validity.
Analysis
A respondent's responses are used to compute the analysis. Analysis of data is a long process.
Two major theories are used here: classical test theory (CTT), used for the observed score, and
item response theory (IRT), "a family of models for persons' responses to items". The two
theories focus upon different 'levels' of responses, and researchers are encouraged to use both in
order to fully appreciate their results.
Non-response
Firstly, non-response needs to be addressed. Non-response can be either 'unit', where a
person gave no response for any of the n items, or 'item', i.e., an individual question was left
unanswered. Unit non-response is generally dealt with by exclusion. Item non-response should be
handled by imputation; the method used can vary between test and questionnaire items. The
literature discusses which method is most appropriate and when.
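A minimal sketch of the two cases is shown below. It excludes respondents with unit non-response and fills item non-response with the respondent's own mean over the items they did answer. Person-mean imputation is only an assumption made for illustration; the text does not prescribe a specific imputation method, and the function name is hypothetical.

```python
from statistics import mean


def handle_non_response(responses):
    """Clean a list of respondents, each a list of item scores with None for missing.

    Unit non-response (every item missing) is excluded.
    Item non-response is imputed with the respondent's mean observed score
    (an illustrative choice, not a recommendation from the source).
    """
    cleaned = []
    for answers in responses:
        observed = [a for a in answers if a is not None]
        if not observed:
            continue  # unit non-response: drop the respondent
        fill = mean(observed)
        cleaned.append([fill if a is None else a for a in answers])
    return cleaned


data = [
    [1, None, 3],        # item non-response on the second item
    [None, None, None],  # unit non-response
    [2, 2, 2],           # complete
]
print(handle_non_response(data))  # the second respondent is dropped
```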
Scoring
The conventional method of scoring items is to assign '0' for an incorrect answer and '1' for a correct
answer. When items have more response options (e.g. ordinal polytomous items), '0' is assigned when
incorrect, '1' for being partly correct and '2' for being correct. Personality tests can also be
scored using a dimensional (normative) or a typological (ipsative) approach. Dimensional
approaches such as the Big 5 describe personality as a set of continuous dimensions on which
individuals differ. From the item scores, an 'observed' score is computed. This is generally found
by summing the un-weighted item scores.
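The observed score described above is a plain unweighted sum of item scores. The sketch below shows this for dichotomous (0/1) and polytomous (0/1/2) items; the function name and the sample responses are made up for illustration.

```python
def observed_score(item_scores):
    """Return the observed score as the unweighted sum of the item scores."""
    return sum(item_scores)


dichotomous = [1, 0, 1, 1, 0]  # 0 = incorrect, 1 = correct
polytomous = [2, 1, 0, 2]      # 0 = incorrect, 1 = partly correct, 2 = correct

print(observed_score(dichotomous))  # 3
print(observed_score(polytomous))   # 5
```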
References: Saccuzzo, Dennis P.; Kaplan, Robert M. (2009). Psychological Testing: Principles,
Applications, and Issues (7th ed.). Belmont, CA: Wadsworth Cengage Learning.
a) EDWARDS PERSONAL PREFERENCE SCHEDULE
-was derived from the System of Human Needs Theory by Henry Alexander Murray
Test Format
15 Personality Variability Scales
Test Taking
Administration is simple, because all the administrator must do is explain the use of the score
sheet, but the actual test and instructions are given on the test sheet to each individual test-
taker.
Scoring Procedures
The test can either be scored by hand, or through a machine.
Edwards Personal Preference Schedule
Developed by Allen L. Edwards (Professor of Psychology at the University of Washington) in 1957.
meant for people aged 16-85
takes approximately 45 minutes to complete
used primarily in Personal Counseling, but can also be used in groups.
System of Human Needs by Murray
a theory of personality that was organized in terms of motives, presses and needs.
Needs
potentiality or readiness to respond in a certain way under a certain given circumstance.
Theories of personality based upon needs and motives suggest that our personalities are a
reflection of behaviors controlled by needs.
While some needs are temporary and changing, other needs are more deeply seated in our
nature.
According to Murray, these psychogenic needs function mostly on the unconscious level, but
play a major role in our personality
Achievement
Deference
Order
Exhibition
Autonomy
Affiliation
Intraception
Succorance
Dominance
Abasement
Nurturance
Change
Endurance
Aggression
Heterosexuality
Each category is paired with 14 others, making a total of 210 items, with another item added for
each category just to check consistency.
225 Items.
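The item count can be checked with a short calculation. The sketch below follows the arithmetic as stated in the text (each of the 15 needs paired with the 14 others, plus one consistency item per need); it reproduces the totals only and is not a description of how the actual EPPS booklet is laid out.

```python
from itertools import permutations

NEEDS = [
    "Achievement", "Deference", "Order", "Exhibition", "Autonomy",
    "Affiliation", "Intraception", "Succorance", "Dominance", "Abasement",
    "Nurturance", "Change", "Endurance", "Aggression", "Heterosexuality",
]

# Each need paired with each of the 14 other needs: 15 * 14 = 210 pairings.
pairings = list(permutations(NEEDS, 2))

# One additional item per need to check consistency: 210 + 15 = 225.
total_items = len(pairings) + len(NEEDS)

print(len(pairings))  # 210
print(total_items)    # 225
```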
A. I like to share things with my friends.
B. I like to analyze my own motives or feelings.
The test taker reads all instructions and fills out the answers themselves.
Rapport is important, but considering the personal nature of the test, there will not be much
interaction during the test.
The score is based upon a category that the test taker most often agreed with on the
statements.
For example, the examinee agreed with every item on the test related to Nurturance (15 items), but only
agreed with 2 items on Aggression. One can then assume that the examinee is not very
aggressive.
Interpretation of Scores is usually done in the counseling setting with one-on-one discussion
regarding the results.
The official scoring involves adding up the number of times the examinee answered "A" or "B" in
each row. The scorer puts each number in the proper row that correlates to the given category
to determine that individual's score. (0-28)
Reliability
The reliability of the EPPS was determined through the test-retest procedure, with a group of
89 students at the University of Washington. They took it twice with a 1-week interval.
The intercorrelations among the scales were very low, which means they are measuring different things.
It was also determined through split-half reliability, with a group of 1,509 college students.
Overall reliability was close to 0.77.
Validity
The manual reports studies comparing the EPPS with the Guilford Martin Personality Inventory
and the Taylor Manifest Anxiety Scale.
Other researchers have correlated the California Psychological Inventory, the Adjective Check
List, the Thematic Apperception Test, the Strong Vocational Interest Blank, and the MMPI with
the EPPS.
In these studies there are often statistically significant correlations among the scales of these
tests and the EPPS, but the relationships are usually low-to-moderate and often are difficult for
the researcher to explain.
Norms
The men's highest category was that of Dominance.
Men's lowest score was Order.

Women's highest category was that of Affiliation.
Women's lowest score was also Order.
When averaged together, the highest score in men and women was Intraception.
Achievement: A need to accomplish tasks well
Deference: A need to conform to customs and defer to others
Order: A need to plan well and be organized
Exhibition: A need to be the center of attention in a group
Autonomy: A need to be free of responsibilities and obligations
Affiliation: A need to form strong friendships and attachments
Intraception: A need to analyze behaviors and feelings of others
Succorance: A need to receive support and attention from others
Dominance: A need to be a leader and influence others
Abasement: A need to accept blame for problems and confess errors to others
Nurturance: A need to be of assistance to others
Change: A need to seek new experiences and avoid routine
Endurance: A need to follow through on tasks and complete assignments
Heterosexuality: A need to be associated with and attractive to members of the opposite sex
Aggression: A need to express one's opinion and be critical of others
References: Kaplan, R. M., & Saccuzzo, D. P. (2009). "Psychological testing: Principles,
applications, and issues" (7th ed.). Belmont, CA: Wadsworth
b) PANUKAT NG PAGKATAONG PILIPINO
A 200-item inventory that assesses the nineteen (19) personality traits listed below.
It was initiated in 1978 and was motivated by several factors, such as:
Lack of agreement among Filipino researchers about the most salient dimensions of Filipino
personality
Choice of traits from foreign-made tests
Scarcity of indigenous measures
Steps on the Development of PPP
16 dimensions were identified (top ranked was Pagkaresponsable [responsibility]); 3 other traits
(Pagkamalikhain [creativity], Pagkamasikap [achievement orientation], and
Pagkamapagsapalaran [risk taking]) were added because of the interest of
the researchers.
The primary basis for trait identification and item development was an
inductive and empirical approach, whereas the final selection depended on
the internal consistency of the items in each subscale.
Psychometric properties
Reliability
Internal consistency reliability coefficients for the PPP range from .94 for
Pagkamatalino (Intelligence) to .44 for Pagkamadaldal (Social Curiosity), with
an average of .72. The standard error of measurement, on the other hand, ranges
from 1.58 for Pagkamatalino (Intelligence) to 4.67 for Pagkamadaldal (Social
Curiosity).
Validity
The construct validity of the PPP was determined by establishing subscale
intercorrelations. Average intercorrelations for the different subscales range
from .10 to .33.
In general, there were more positive than negative correlation values, thus
implying that the subscales measure a common factor, which is the construct
of personality.
Norms
The norm of this inventory was obtained from a heterogeneous Metro Manila
sample. The norm of the PPP is in the form of percentiles and normalized
standard scores with a mean of 50 and a standard deviation of 10.
Profile sheets on which standardized scores may be plotted are also
available.
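The PPP norms are reported as percentiles and normalized standard scores with a mean of 50 and a standard deviation of 10. One common way to obtain such a score from a percentile rank is to convert the rank to a z-value under the normal curve and rescale it. The sketch below shows that conversion as an assumption about what "normalized" means here; it is not the PPP manual's norm table, and the function name is hypothetical.

```python
from statistics import NormalDist


def normalized_standard_score(percentile_rank: float,
                              mean: float = 50.0,
                              sd: float = 10.0) -> float:
    """Convert a percentile rank (between 0 and 100, exclusive) to a normalized standard score."""
    if not 0 < percentile_rank < 100:
        raise ValueError("percentile rank must be strictly between 0 and 100")
    z = NormalDist().inv_cdf(percentile_rank / 100)  # z-value with that area below it
    return mean + sd * z


# The 50th percentile corresponds to the mean of the scale.
print(round(normalized_standard_score(50), 1))
```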
1. Pagkamaalalahanin/Thoughtfulness
2. Pagkamaayos/Orderliness
3. Pagkamadaldal/Social Curiosity
4. Pagkamagalang/Respectfulness
5. Pagkamahinahon/Emotional Stability
6. Pagkamalikhain/Creativity
7. Pagkamapagkumbaba/Humility
8. Pagkamapagsapalaran/Risk-taking
9. Pagkamadamdamin/Sensitiveness
10. Pagkamasayahin/Cheerfulness
11. Pagkamasikap/Achievement Orientation
12. Pagkamasunurin/Obedience
13. Pagkamatalino/Intelligence
14. Pagkamatapat/Honesty
15. Pagkamatiyaga/Patience
16. Pagkamatulungin/Helpfulness
17. Pagkamaunawain/Capacity for Understanding
18. Pagkapalakaibigan/Sociability
19. Pagkaresponsable/Responsibility
The respondent indicates on a five-point scale his/her degree of agreement/disagreement with
each of the items as applicable to him/her.
The inventory takes 45 minutes to 1 hour to complete, but there is no time limit for the
administration of the test.
Reference:
De Jesus, E. M. 1998. Handbook of Psychological Tests (Theories,
Administration, Scoring and Application). Rex Printing Company, Inc. Quezon
City, Philippines. pp. 117-118
c) MOONEY PROBLEM CHECKLIST
General Purpose:
help individuals express their personal problems
Developmental History
by Ross L. Mooney and Leonard V. Gordon
each blank consists of a list of items, varying in number from 210 to 330 on the several forms,
that represent problem areas.
original lists were compiled from large numbers of free responses, from case records and from
reviews of the literature on student problems
1950 revisions eliminated items infrequently checked, retained items having retest stability and
improved the grade placement of some statements
Test Administration
self-administered
group or individual
no time limit
underline all items of concern
circle those of the most concern
answer summary questions in their own words
Interpretation
Areas Covered
health and physical development, home and family, morals and religion, courtship, sex and
marriage
Target Group:
Adult, College, High School and Junior High School
Testing considerations/Accommodations:
language is simple and readily understood
this is NOT a test and does not yield scores

items which have been circled are counted
items which have been underlined are counted
areas with a high number of items marked should be examined.
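The counting procedure above can be sketched in a few lines. This is a minimal illustration, not part of the published checklist: the function name, the mark labels, and the item-to-area mapping are hypothetical, and the sketch assumes a circled item also counts as an underlined one, since respondents circle items from among those they have already marked as concerns.

```python
from collections import Counter


def tally_by_area(marks, item_area):
    """Count underlined and circled items per problem area.

    marks: dict of item number -> "underlined" or "circled"
    item_area: dict of item number -> area name (hypothetical mapping)
    """
    underlined = Counter()
    circled = Counter()
    for item, mark in marks.items():
        area = item_area[item]
        # Assumption: a circled item is also a marked concern,
        # so it is counted in both tallies.
        underlined[area] += 1
        if mark == "circled":
            circled[area] += 1
    return underlined, circled


def areas_to_examine(underlined):
    """List areas from most to fewest marked items."""
    return [area for area, _ in underlined.most_common()]
```

The counts are only a guide to which areas to discuss with the respondent; they are not scores.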
References: Mooney, Ross L. Problem Check List, College Form. Columbus, Ohio: Bureau of
Educational Research, Ohio State University, 1941.
INTEREST TEST
Interest tests are psychological tests that assess a person's interests and preferences. These tests are
used primarily for career counseling. Interest tests include items about daily activities
from among which applicants select their preferences. The rationale is that if a person
exhibits the same pattern of interests and preferences as people who are successful in a
given occupation, then the chances are high that the person taking the test will find
satisfaction in that occupation. A widely used interest test is the Strong Interest
Inventory, which is used in career assessment, career counseling, and educational
guidance.
a) BRAINARD OCCUPATIONAL PREFERENCE INVENTORY FORM R
Description of the Test
The Brainard Occupational Preference Inventory permits a systematic study of a person's
interests. It is a standardized questionnaire designed to bring to the fore the facts about a person
with respect to his occupational interests. The purpose is to help him discuss his occupational
and educational plans intelligently and objectively. The Inventory can be administered in about
30 minutes. It is intended for students in grades 8-12 and adults.
Caution
The Inventory requires a relatively low level of reading skill. Hence, it may appropriately be used
at lower educational levels than similar instruments which contain more difficult reading
material. Adults with limited educational backgrounds may also be able to react with greater
understanding to the items of the Inventory.
Components
The Brainard Occupational Preference Inventory yields scores in six broad occupational fields for
each sex. Both boys and girls obtain scores in the fields identified as Commercial, Mechanical,
Professional, Esthetic, and Scientific. Only boys answer the items which yield an Agricultural
score; only girls answer the items for a Personal Service score.
Each field contains twenty questions divided equally among four occupational sections.
The subject responds to each item by indicating whether he strongly dislikes the activity,
dislikes it, is neutral about it, likes it, or strongly likes it. Answers are marked
on a separate answer sheet by drawing a line which indicates the choice of response.
History
In 1932 the Specific Interest Inventory by Paul P. Brainard was published in four forms. The
forms were M and W, for men and women; B and G, for boys and girls.
The four inventories had been carried on since the 1920s. (Refer to Fryer's The Measurement of
Interests, Henry Holt and Company, 1931.)
All four forms of the Specific Interest Inventory were widely used.
Revisions
Classification of the 140 questions into 28 Sections and these, in turn, into 7 Occupational Fields.
Grouping of items which conformed in general to the code plan of the Dictionary of Occupational
Titles, and also to the large fields of interest (proposed by Dr. Alfred Lewerenz of the Educational
Research and Guidance Section of the Los Angeles City Schools).
Wording of items was made applicable to both sexes and for high school through adult ages.
In 1955, Form A was reviewed and changes in wording were made in a few items.
Agricultural score for girls and the Personal Service score for boys were removed since each of
these fields was apparently more meaningful for one sex than the other.
Scoring system was changed to eliminate negative scores on items.
Inventory items were put into a new booklet and the answer sheet was redesigned.
The answer sheet can now be scored either by hand or by machine.
Fields
Field                               Item numbers
I    Commercial                     1-20
II   Mechanical                     21-30; 41-50
III  Professional                   31-40; 51-60
IV   Esthetic                       61-70; 81-90
V    Scientific                     71-80; 91-100
VI   Agricultural (for boys)        101-120
     Personal Service (for girls)
Reliability
The first reliability study was a test-retest study in 1955, administered to the entire tenth grade in
an academic high school in Yonkers, NY. The time interval between tests was one week. The
test-retest reliability coefficients of the six field scores are reported for each sex. A second
reliability study, in which scores on odd and even items were correlated, was based on 683 boys
and 200 girls in grade 12. The odd-even correlation coefficients are corrected by the
Spearman-Brown formula.
Both studies indicate that the reliabilities of the field scores are adequately high for most
purposes to which these scores will be put. No data have been gathered to establish the
effectiveness of the Inventory scores as predictors of success in specific occupations.
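The odd-even procedure with the Spearman-Brown correction can be illustrated with a short sketch. The source does not reproduce the formula or the data, so this assumes the standard form of the correction, r_full = 2 r_half / (1 + r_half); the function names and the layout of the response data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half, factor=2.0):
    """Estimate the reliability of a test lengthened by `factor`."""
    return factor * r_half / (1.0 + (factor - 1.0) * r_half)


def odd_even_reliability(responses):
    """Split-half reliability of one field score.

    responses: one list of item weights per person, in item order
    (hypothetical layout).
    """
    odd = [sum(person[0::2]) for person in responses]   # items 1, 3, 5, ...
    even = [sum(person[1::2]) for person in responses]  # items 2, 4, 6, ...
    return spearman_brown(pearson(odd, even))
```

The correction is needed because each half contains only half the items of the field, and a shorter scale gives a lower correlation than the full-length score would.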
Validity
The scores from the BOPI show very little relationship to the scores derived from the Kuder
Personal Preference Record (Kuder PPR). The BOPI gains different data from the Kuder PPR.
The Kuder items measure interests by forcing the subject to choose among three activities
indicative of different types of interests, while the BOPI permits the student to indicate the
strength of his interest in activities without forcing him to subordinate interest in other activities.
Scoring
Score each field by summing the appropriate weights for the given responses. Record the scores
in the proper boxes at the top edge of the answer sheet.
The weights for the various responses are: 1 for SD; 2 for D; 3 for N; 4 for L; and 5 for SL.
The lowest possible score for any field is 20, meaning that the subject has marked the SD for
each item; the maximum score is 100, in which case every item has been marked through the SL.
Omitted Items
If only one item in a field has been omitted, treat it as a response of N (Neutral)
and give it a weight of 3. But when two or more items in a field have not been marked, no score
should be obtained for that particular field, as such a score could only be an approximation of
the subject's interest.
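The scoring and omitted-item rules can be sketched as follows. This is a minimal illustration under stated assumptions, not the publisher's scoring key: the names `WEIGHTS`, `FIELD_ITEMS`, and `score_field` are hypothetical, the item ranges are taken from the field table as given here, and Personal Service is left out because its item numbers are not listed.

```python
# Response weights: strongly dislike, dislike, neutral, like, strongly like.
WEIGHTS = {"SD": 1, "D": 2, "N": 3, "L": 4, "SL": 5}

# Item numbers per field, as listed in the field table (assumed complete
# for these six fields; Personal Service is omitted, numbers not given).
FIELD_ITEMS = {
    "Commercial": list(range(1, 21)),
    "Mechanical": list(range(21, 31)) + list(range(41, 51)),
    "Professional": list(range(31, 41)) + list(range(51, 61)),
    "Esthetic": list(range(61, 71)) + list(range(81, 91)),
    "Scientific": list(range(71, 81)) + list(range(91, 101)),
    "Agricultural": list(range(101, 121)),
}


def score_field(responses, field):
    """Return the field score, or None if two or more items are omitted.

    responses: dict of item number -> "SD", "D", "N", "L", "SL";
    an omitted item is either missing from the dict or set to None.
    """
    items = FIELD_ITEMS[field]
    marked = [responses[i] for i in items if responses.get(i) is not None]
    omitted = len(items) - len(marked)
    if omitted >= 2:
        return None  # no score is obtained for this field
    # A single omitted item is treated as Neutral (weight 3).
    return sum(WEIGHTS[r] for r in marked) + omitted * WEIGHTS["N"]
```

With twenty items per field, marking every item SD gives 20 and marking every item SL gives 100, which matches the stated minimum and maximum.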
References: Brainard, Paul P., and Ralph T. Brainard. New York: Psychological Corporation,
1945-1956.
b) OCCUPATIONAL INTEREST INVENTORY
Occupational Interest Inventory is designed to be used in a wide spectrum of career guidance
activities. It helps candidates choose an occupation, plan their career, and grow as
professionals in the workplace.
The assessment, which is based on the RIASEC model, measures levels of interest in 12
domains and matches the candidate's profile with a list of 138 occupations across various
sectors and fields. In this way, it helps to pinpoint the most suitable profession for them.
The report provides a constructive analysis, helping the evaluator to initiate a meaningful
dialogue with the candidate, understand their vocational interests, and match their profile with
suitable career options.
Orientation and mobility

Occupational Interest Inventory can be a point of reference for career guidance processes,
helping candidates to make informed decisions about their career. For those considering a
career transition, it provides good insight into their job preferences.
The combined approach of RIASEC profiles opens the field of possibilities to a deeper
exploration of an individual's aspirations by selecting occupations that align with their
personality.
Skills assessment/training courses
Occupational Interest Inventory facilitates a dialogue between the individual and the assessor
and, as such, is an indispensable tool for skills assessments and training. For training
courses, it can help trainers to optimise resources by identifying individuals' learning styles and
the environment most conducive to their development.
Recruitment
When combined with a personality questionnaire, Occupational Interest Inventory can be an
integral part of your recruitment process. Through an assessment of interests and professional
aspirations, recruiters can assign the positions and responsibilities that would be most
stimulating and rewarding for employees.
References: Centraltest.com
c) KUDER PREFERENCE RECORD
The Kuder Preference Record was one of the first interest inventories. It had 168 three-choice
items focusing on vocational interests, which returned scores on ten scales that were claimed
to measure such areas as artistic, clerical, mechanical and scientific interests. On this basis
suitable careers could be discussed with respondents.
Description:
The Kuder Preference Record helps make a systematic approach to the investigation of
occupations by measuring preferences in 10 broad areas: Outdoor, Mechanical, Computational,
Scientific, Persuasive, Artistic, Literary, Musical, Social Service, and Clerical. An individual's preferences
indicate that s/he likes certain types of activities. When his/her preferences are identified, s/he
can investigate the occupations that involve these activities. In this way s/he narrows the field of
investigation to those occupations most deserving of his/her attention.
Age Range:
Adults
Inventory:
1 Examiner Manual - Vocational Form C (1956) (original)

1 Examiner Manual - Vocational Form C (1960 - Revised)
1 Administrator's Manual - Vocational Form C (1956) (original)
2 Administrator's Manuals - Vocational Form C (Revised - 1960) (originals)
2 copies of Memorandum of Instructions, Form CP (original, copy)
4 Form CH Question Booklets
25 Form CH Answer Record Booklets
4 copies of Form CH, CM Profile Sheets (2 originals, 2 copies)
Packet of Pins (158A)
25 Padding Boards
1 Answer Pad
Reference: "Kuder Preference Record, n.," in A Dictionary of Psychology
PROJECTIVE TESTS
A projective test is a personality test designed to let a person respond to ambiguous
stimuli, presumably revealing hidden emotions and internal conflicts projected by the
person into the test. This is sometimes contrasted with a so-called "objective test" or
"self-report test" in which responses are analyzed according to a presumed universal
standard (for example, a multiple choice exam), and are limited to the content of the test.
The responses to projective tests are content analyzed for meaning rather than being
based on presuppositions about meaning, as is the case with objective tests. Projective
tests have their origins in psychoanalytic psychology, which argues that humans have
conscious and unconscious attitudes and motivations that are beyond or hidden from
conscious awareness.
How Do Projective Tests Work?
In many projective tests, the participant is shown an ambiguous image and then asked to give
the first response that comes to mind.
The key to projective tests is the ambiguity of the stimuli. According to the theory behind such
tests, clearly defined questions result in answers that are carefully crafted by the conscious
mind. By providing the participant with a question or stimulus that is not clear, the underlying
and unconscious motivations or attitudes are revealed.
Strengths and Weaknesses of Projective Tests
Projective tests are most frequently used in therapeutic settings. In many cases, therapists use
these tests to learn qualitative information about a client. Some therapists may use projective
tests as a sort of icebreaker to encourage the client to discuss issues or examine thoughts and
emotions.
While projective tests have some benefits, they also have a number of weaknesses and
limitations. For example, the respondent's answers can be heavily influenced by the examiner's
attitudes or the test setting. Scoring projective tests is also highly subjective, so interpretations
of answers can vary dramatically from one examiner to the next.
Additionally, projective tests that do not have standard grading scales tend to lack
both validity and reliability.
Validity refers to whether or not a test is measuring what it purports to measure while reliability
refers to the consistency of the test results.
However, these tests are still widely used by clinical psychologists and psychiatrists. Some
experts suggest that the latest versions of many projective tests have both practical value and
some validity.
References: Wood, J. M., Nezworski, M. T., & Stejskal, W. J. (1996). The comprehensive
system for the Rorschach: A critical examination. Psychological Science, 7(1), 3-10, 14-17.
a) THE BENDER VISUAL MOTOR GESTALT TEST
Definition
The Bender Gestalt Test, or the Bender Visual Motor Gestalt Test, is a psychological
assessment instrument used to evaluate visual-motor functioning and visual perception skills in
both children and adults. Scores on the test are used to identify possible organic brain damage
and the degree of maturation of the nervous system. The Bender Gestalt was developed by
psychiatrist Lauretta Bender in 1938.
Purpose
The Bender Gestalt Test is used to evaluate visual maturity, visual motor integration skills, style
of responding, reaction to frustration, ability to correct mistakes, planning and organizational
skills, and motivation. Copying figures requires fine motor skills, the ability to discriminate
between visual stimuli, the capacity to integrate visual skills with motor skills, and the ability to
shift attention from the original design to what is being drawn.
Precautions
The Bender Gestalt Test should not be administered to an individual with severe visual
impairment unless his or her vision has been adequately corrected with eyeglasses.
Additionally, the test should not be given to an examinee with a severe motor impairment, as the
impairment would affect his or her ability to draw the geometric figures correctly. The test scores
might thereby be distorted.
The Bender Gestalt Test has been criticized for being used to assess problems with organic
factors in the brain. This criticism stems from the lack of specific signs on the Bender Gestalt
Test that are definitively associated with brain injury, mental retardation, and other physiological
disorders. Therefore, when making a diagnosis of brain injury, the Bender Gestalt Test should
never be used in isolation. When making a diagnosis, results from the Bender Gestalt Test
should be used in conjunction with other medical, developmental, educational, psychological,
and neuropsychological information.
Finally, psychometric testing requires administration and evaluation by a clinically trained
examiner. If a scoring system is used, the examiner should carefully evaluate its reliability and
validity, as well as the normative sample being used. A normative sample is a group within a
population who takes a test and represents the larger population. This group's scores on a test
are then used to create "norms" with which the scores of test takers are compared.
Description
The Bender Gestalt Test is an individually administered pencil and paper test used to make a
diagnosis of brain injury. There are nine geometric figures drawn in black. These figures are
presented to the examinee one at a time; then, the examinee is asked to copy the figure on a
blank sheet of paper. Examinees are allowed to erase, but cannot use any mechanical aids
(such as rulers). The popularity of this test among clinicians is most likely due to the short amount of
time it takes to administer and score. The average amount of time to complete the test is five to
ten minutes.
The Bender Gestalt Test lends itself to several variations in administration. One method requires
that the examinee view each card for five seconds, after which the card is removed. The
examinee draws the figure from memory. Another variation involves having the examinee draw
the figures by following the standard procedure. The examinee is then given a clean sheet of
paper and asked to draw as many figures as he or she can recall. Last, the test is given to a
group, rather than to an individual (i.e., standard administration). It should be noted that these
variations were not part of the original test.
Results
A scoring system does not have to be used to interpret performance on the Bender Gestalt Test;
however, there are several reliable and valid scoring systems available. Many of the available
scoring systems focus on specific difficulties experienced by the test taker. These difficulties
may indicate poor visual-motor abilities that include:
Angular difficulty: This includes increasing, decreasing, distorting, or omitting an angle in a
figure.
Bizarre doodling: This involves adding peculiar components to the drawing that have no
relationship to the original Bender Gestalt figure.
Closure difficulty: This occurs when the examinee has difficulty closing open spaces on a figure,
or connecting various parts of the figure. This results in a gap in the copied figure.
Cohesion: This involves drawing a part of a figure larger or smaller than shown on the original
figure and out of proportion with the rest of the figure. This error may also include drawing a
figure or part of a figure significantly out of proportion with other figures that have been drawn.
Collision: This involves crowding the designs or allowing the end of one design to overlap or
touch a part of another design.
Contamination: This occurs when a previous figure, or part of a figure, influences the examinee
in adequate completion of the current figure. For example, an examinee may combine two
different Bender Gestalt figures.
Fragmentation: This involves destroying part of the figure by not completing or breaking up the
figures in ways that entirely lose the original design.
Impotence: This occurs when the examinee draws a figure inaccurately, seems to recognize
the error, and then makes several unsuccessful attempts to improve the drawing.
Irregular line quality or lack of motor coordination: This involves drawing rough lines, particularly
when the examinee shows a tremor motion, during the drawing of the figure.
Line extension: This involves adding or extending a part of the copied figure that was not on the
original figure.
Omission: This involves failing to adequately connect the parts of a figure or reproducing only
parts of a figure.
Overlapping difficulty: This includes problems in drawing portions of the figures that overlap,
simplifying the drawing at the point that it overlaps, sketching or redrawing the overlapping
portions, or otherwise distorting the figure at the point at which it overlaps.
Perseveration: This includes increasing, prolonging, or continuing the number of units in a
figure. For example, an examinee may draw significantly more dots or circles than shown on the
original figure.
Retrogression: This involves substituting more primitive figures for the original design; for
example, substituting solid lines or loops for circles, dashes for dots, dots for circles, circles for
dots, or filling in circles. There must be evidence that the examinee is capable of drawing more
mature figures.
Rotation: This involves rotating a figure or part of a figure by 45 degrees or more. This error is also
scored when the examinee rotates the stimulus card that is being copied.
Scribbling: This involves drawing primitive lines that have no relationship to the original Bender
Gestalt figure.
Simplification: This involves replacing a part of the figure with a more simplified figure. This error
is not due to maturation. Drawings that are primitive in terms of maturation would be categorized
under "Retrogression."
Superimposition of design: This involves drawing one or more of the figures on top of each
other.
Work over: This involves reinforcing, increased pressure, or overworking a line or lines in a
whole or part of a figure.
Additionally, observing the examinee's behavior while drawing the figures can provide the
examiner with an informal evaluation and data that can supplement the formal evaluation of the
examinee's visual and perceptual functioning. For example, if an examinee takes a large
amount of time to complete the geometric figures, it may suggest a slow, methodical approach
to tasks, compulsive tendencies, or depressive symptoms. If an examinee rapidly completes the
test, this could indicate an impulsive style.
Resources
BOOKS
Hutt, M. L. The Hutt Adaptation of the Bender Gestalt Test. New York: Grune and Stratton, 1985.
Kaufman, Alan S., and Elizabeth O. Lichtenberger. Assessing Adolescent and Adult
Intelligence. Boston: Allyn and Bacon, 2001.
Koppitz, E. M. The Bender Gestalt Test for Young Children. Vol. 2. New York: Grune and
Stratton, 1975.
Pascal, G. R., and B. J. Suttell. The Bender Gestalt Test: Quantification and Validation for
Adults. New York: Grune and Stratton, 1951.
Sattler, Jerome M. "Assessment of visual-motor perception and motor proficiency."
In Assessment of Children: Behavioral and Clinical Applications. 4th ed. San Diego: Jerome M.
Sattler, Publisher, Inc., 2002.
Watkins, E. O. The Watkins Bender Gestalt Scoring System. Novato, CA: Academic Therapy,
1976.
b) THE THEMATIC APPERCEPTION TEST
The Thematic Apperception Test (TAT) is a projective psychological test. Proponents of the
technique assert that subjects' responses, in the narratives they make up about ambiguous
pictures of people, reveal their underlying motives, concerns, and the way they see the social
world. Historically, the test has been among the most widely researched, taught, and used of
such techniques.
Procedure
The TAT is popularly known as the picture interpretation technique because it uses a series of
provocative yet ambiguous pictures about which the subject is asked to tell a story. The TAT
manual provides the administration instructions used by Murray, although these procedures are
commonly altered. The subject is asked to tell as dramatic a story as they can for each picture
presented, including the following:
what has led up to the event shown
what is happening at the moment
what the characters are feeling and thinking
what the outcome of the story was
If these elements are omitted, particularly for children or individuals of low cognitive abilities, the
evaluator may ask the subject about them directly. Otherwise, the examiner is to avoid
interjecting and should not answer questions about the content of the pictures. The examiner
records stories verbatim for later interpretation.
The complete version of the test contains 31 picture cards. Some of the cards show male
figures, some female, some both male and female figures, some of ambiguous gender, some
adults, some children, and some show no human figures at all. One card is completely blank
and is used to elicit both a scene and a story about the given scene from the storyteller.
Although the cards were originally designed to be matched to the subject in terms of age and
gender, any card may be used with any subject. Murray hypothesized that stories would yield
better information about a client if the majority of cards administered featured a character similar
in age and gender to the client.
Although Murray recommended using 20 cards, most practitioners choose a set of between 8
and 12 selected cards, either using cards that they feel are generally useful, or that they believe
will encourage the subject's expression of emotional conflicts relevant to their specific history
and situation. However, the examiner should aim to select a variety of cards in order to get a
more global perspective of the storyteller and to avoid confirmation bias (i.e., finding only what
you are looking for).
Many of the TAT drawings consist of sets of themes such as: success and failure, competition
and jealousy, feelings about relationships, aggression, and sexuality. These are usually depicted
through picture cards.
Psychometric characteristics
Thematic Apperception Tests are meant to evoke an involuntary display of one's subconscious.
There is no standardization for evaluating one's TAT responses; each evaluation is completely
subjective because each response is unique. Validity and reliability are, consequently, the
largest question marks of the TAT. There are trends and patterns, which help identify
psychological traits, but there are no distinct responses to indicate different conditions a patient
may or may not have. Medical professionals most commonly use it in the early stages of patient
treatment. The TAT helps professionals identify a broad range of issues that their patients may
suffer from. Even when individual scoring procedures are examined, the absence of
standardization or norms makes it difficult to compare the results of validity and reliability
research across studies. Specifically, even studies using the same scoring system often use
different cards, or a different number of cards. Standardization is also absent amongst
clinicians, who often alter the instructions and procedures. Murstein explained that different
cards may be more or less useful for specific clinical questions and purposes, making the use of
one set of cards for all clients impractical.
Reliability
Internal consistency, a reliability estimate focusing on how highly test items correlate to each
other, is often quite low for TAT scoring systems. Some authors have argued that internal
consistency measures do not apply to the TAT. In contrast to traditional test items, which should
all measure the same construct and be correlated to each other, each TAT card represents a
different situation and should yield highly different response themes. Lilienfeld and
colleagues countered this point by questioning the practice of compiling TAT responses to form
scores. Both inter-rater reliability (the degree to which different raters score TAT responses the
same) and test-retest reliability (the degree to which individuals receive the same scores over
time) are highly variable across scoring techniques. However, Murray asserted that TAT
answers are highly related to internal states such that high test-retest reliability should not be
expected. Gruber and Kreuzpointner (2013) developed a new method for calculating internal
consistency using categories instead of pictures. As they demonstrated in a mathematical proof,
their method provides a better fit for the underlying construction principles of TAT, and also
achieved adequate Cronbach's alpha scores up to .84.
Validity
The validity of the TAT, or the degree to which it measures what it is supposed to measure, is
low. Jenkins has stated that "the phrase 'validity of the TAT' is meaningless, because validity is
specific not to the pictures, but to the set of scores derived from the population, purpose, and
circumstances involved in any given data collection." That is, the validity of the test would be
ascertained by seeing how clinicians' decisions were assisted based on the TAT. Evidence on
this front suggests it is a weak guide at best. For example, one study indicated that clinicians
classified individuals as clinical or non-clinical at close to chance levels (57% where 50% would
be guessing) based on TAT data alone. The same study found that classifications were 88%
correct based on MMPI data. Interestingly, using TAT in addition to the MMPI reduced accuracy
to 80%.
Alternate considerations
Despite the conflicting information about the psychometric characteristics of the TAT, proponents
have argued that the TAT should not be judged using traditional standards of reliability and
validity. According to Holt, "the TAT is a complex method of assessing people, which does not
lend itself to the standard rules of thumb about test standards [. . .]" (p. 101). For example, it has
been argued that the purpose of the TAT is to reveal a wide range of personality characteristics
and complex, nuanced patterns, as opposed to traditional psychological tests that are designed
to measure unitary and narrow constructs. Hibbard and colleagues examined several
considerations about traditional views of reliability and validity as they apply to the TAT. First,
they noted that traditional views of reliability may limit the validity of a measure (such as occurs
with multi-faceted concepts in which characteristics are not necessarily related to each other,
but are meaningful in combination). Further, Cronbach's alpha, a commonly used measure of
internal consistency, is dependent on the number of items in a scale. For the TAT, most scales use
only a small number of cards (with each card treated like an item) so alphas would not be
expected to be very high. Many clinicians also discount the importance of psychometrics,
believing that generalizability of the findings to a given client's situation is more important than
generalizing findings to the population.
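The point about scale length can be illustrated with the standardized form of Cronbach's alpha, alpha = k * r / (1 + (k - 1) * r), where k is the number of items and r is the mean inter-item correlation. This is a sketch for illustration only: the function name is hypothetical, and the value r = 0.2 is an arbitrary choice, not a figure from TAT research.

```python
def standardized_alpha(k, mean_r):
    """Standardized Cronbach's alpha for k items with mean inter-item
    correlation mean_r (assumes items are on a common, standardized scale)."""
    return k * mean_r / (1.0 + (k - 1) * mean_r)


# Same mean inter-item correlation, different numbers of cards (items).
for n_cards in (5, 10, 20):
    print(n_cards, round(standardized_alpha(n_cards, 0.2), 3))
```

With the correlation held fixed, alpha rises as cards are added, so a scale scored from only a handful of cards would be expected to show a modest alpha even if the cards are no less related to one another than the items of a longer test.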
Scoring systems
When he created the TAT, Murray also developed a scoring system based on his need-press
theory of personality. Murray's system involved coding every sentence given for the presence of
28 needs and 20 presses (environmental influences), which were then scored from 1 to 5,
based on intensity, frequency, duration, and importance to the plot. However, implementing this
scoring system is time-consuming, and it was not widely used. Rather, examiners have
traditionally relied on their clinical intuition to come to conclusions about storytellers.
Although not widely used in the clinical setting, several formal scoring systems have been
developed for analyzing TAT stories systematically and consistently. Common methods that
are currently used in research include the:
Defense Mechanisms Manual (DMM). This assesses three defense mechanisms: denial (least
mature), projection (intermediate), and identification (most mature). A person's thoughts and
feelings are projected into the stories told.
Social Cognition and Object Relations (SCOR) scale. This assesses four different dimensions
of object relations: Complexity of Representations of People, Affect-Tone of Relationship
Paradigms, Capacity for Emotional Investment in Relationships and Moral Standards, and
Understanding of Social Causality.
Personal Problem-Solving System-Revised (PPSS-R). This assesses how people identify,
think about and resolve problems through the scoring of thirteen different criteria. This scoring
system is useful because theoretically, good problem-solving ability is an indicator of an
individual's mental health. Although the TAT is a projective personality technique that is based
primarily on the psychoanalytic perspective, the PPSS-R scoring system is designed for
clinicians and researchers working from a cognitive behavioral framework. The PPSS-R scoring
system has been studied in a wide range of populations, including college students, community
residents, jail inmates, university clinic clients, community mental health center clients, and
psychiatric day treatment clients. Thus, the PPSS-R scoring system allows clinicians and
researchers to assess for problem solving ability and social functioning in many types of people,
without being hindered by social desirability effects.
As with other scoring systems, TAT cards scored with the PPSS-R are typically administered
individually, and examinees' responses are recorded verbatim. Unlike other scoring systems, the
PPSS-R only uses six of the 31 TAT cards: 1, 2, 4, 7BM, 10, and 13MF. The PPSS-R provides
information about four different areas related to problem solving ability: Story Design, Story
Orientation, Story Solutions, and Story Resolution. These four areas are assessed by the 13
scoring criteria, 12 of which are rated on a 5-point scale that ranges from -1 to 3.
Each of these scoring categories attempts to measure the following information:
Story Design measures an individual's ability to identify and formulate a problem situation.
Story Orientation assesses an examinee's level of personal control, emotional distress,
confidence and motivation.
Story Solutions assesses how impulsive an examinee is. In addition to evaluating the types of
problem solutions that are provided, the number of problem solutions that examinees provide for
each of the TAT cards is summed.
Story Resolution provides information on the examinee's ability to formulate problem solutions
that maximize both short- and long-term goals.
Examiners are encouraged to explore information obtained from the TAT stories as hypotheses
for testing rather than concrete facts.
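The rating scheme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the published PPSS-R procedure: the text gives the six cards, the -1 to 3 rating range and the summing of solutions across cards, but it does not name the thirteen criteria, so none are listed here. The function names validate_rating and total_solutions are hypothetical.

```python
from typing import Dict

# The six TAT cards the text says the PPSS-R uses.
PPSS_R_CARDS = ("1", "2", "4", "7BM", "10", "13MF")

# The 5-point scale described in the text: -1, 0, 1, 2, 3.
RATING_MIN, RATING_MAX = -1, 3


def validate_rating(value: int) -> int:
    """Return the rating unchanged if it lies on the -1..3 scale, else raise."""
    if not isinstance(value, int) or not RATING_MIN <= value <= RATING_MAX:
        raise ValueError(f"rating {value!r} is outside {RATING_MIN}..{RATING_MAX}")
    return value


def total_solutions(solutions_per_card: Dict[str, int]) -> int:
    """Sum the number of problem solutions offered across the PPSS-R cards.

    Keys are card labels; cards with no entry are treated as zero solutions
    (an assumption made for this sketch).
    """
    unknown = set(solutions_per_card) - set(PPSS_R_CARDS)
    if unknown:
        raise ValueError(f"not a PPSS-R card: {sorted(unknown)}")
    return sum(solutions_per_card.values())
```

For example, total_solutions({"1": 2, "7BM": 1, "13MF": 3}) adds the solution counts for three cards, and validate_rating(4) raises an error because 4 is off the scale.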
References: Schacter, Daniel, Daniel Gilbert, and Daniel Wegner. Psychology. 2nd ed. New York:
Worth Publishers, 2009. 18.
c) RORSCHACH
The Rorschach test is a psychological test in which subjects' perceptions of inkblots are
recorded and then analyzed using psychological interpretation, complex algorithms, or both.
Some psychologists use this test to examine a person's personality characteristics and
emotional functioning. It has been employed to detect underlying thought disorder, especially in
cases where patients are reluctant to describe their thinking processes openly. The test is
named after its creator, Swiss psychologist Hermann Rorschach. In the 1960s, the Rorschach
was the most widely used projective test.
Although the Exner Scoring System (developed since the 1960s) claims to have addressed and
often refuted many criticisms of the original testing system with an extensive body of
research, some researchers continue to raise questions. The areas of dispute include the
objectivity of testers, inter-rater reliability, the verifiability and general validity of the test, bias of
the test's pathology scales towards greater numbers of responses, the limited number of
psychological conditions which it accurately diagnoses, the inability to replicate the test's norms,
its use in court-ordered evaluations, and the proliferation of the ten inkblot images, potentially
invalidating the test for those who have been exposed to them.
Method
The Rorschach test is appropriate for subjects from the age of five to adulthood. The
administrator and subject typically sit next to each other at a table, with the administrator slightly
behind the subject, to facilitate a "relaxed but controlled atmosphere". Side-by-side seating also
reduces the effect of inadvertent cues from the examiner, so that the examiner is less likely to
influence the subject's responses accidentally. There are ten official
inkblots, each printed on a separate white card, approximately 18 by 24 cm in size. Each of the
blots has near perfect bilateral symmetry. Five inkblots are of black ink, two are of black and red
ink and three are multicolored, on a white background. After the test subject has seen and
responded to all of the inkblots (free association phase), the tester then presents them again
one at a time in a set sequence for the subject to study: the subject is asked to note where he
sees what he originally saw and what makes it look like that (inquiry phase). The subject is
usually asked to hold the cards and may rotate them. Whether the cards are rotated, and other
related factors such as whether permission to rotate them is asked, may expose personality
traits and normally contribute to the assessment. As the subject is examining the inkblots, the
psychologist writes down everything the subject says or does, no matter how trivial. Analysis of
responses is recorded by the test administrator using a tabulation and scoring sheet and, if
required, a separate location chart.
The general goal of the test is to provide data about cognition and personality variables such
as motivations, response tendencies, cognitive operations, affectivity, and
personal/interpersonal perceptions. The underlying assumption is that an individual will class
external stimuli based on person-specific perceptual sets, including needs, base
motives, and conflicts, and that this clustering process is representative of the process used in
real-life situations. Methods of interpretation differ. Rorschach scoring systems have been described
as a system of pegs on which to hang one's knowledge of personality. The most widely used
method in the United States is based on the work of Exner.
Administration of the test to a group of subjects, by means of projected images, has also
occasionally been performed, but mainly for research rather than diagnostic purposes.
Test administration is not to be confused with test interpretation:
The interpretation of a Rorschach record is a complex process. It requires a wealth of
knowledge concerning personality dynamics generally as well as considerable experience with
the Rorschach method specifically. Proficiency as a Rorschach administrator can be gained
within a few months. However, even those who are able and qualified to become
Rorschach interpreters usually remain in a "learning stage" for a number of years.
Features or categories
The interpretation of the Rorschach test is not based primarily on the contents of the response,
i.e., what the individual sees in the inkblot. In fact, the contents of the response are
only a comparatively small portion of a broader cluster of variables that are used to interpret the
Rorschach data: for instance, information is provided by the time taken before giving a
response to a card (taking a long time can indicate "shock" on the card), as
well as by any comments the subject may make in addition to providing a direct response.
In particular, information about determinants (the aspects of the inkblots that triggered the
response, such as form and color) and location (which details of the inkblots triggered the
response) is often considered more important than content, although there is contrasting
evidence. "Popularity" and "originality" of responses can also be considered as basic
dimensions in the analysis.
Content
The goal in coding content of the Rorschach is to categorize the objects that the subject
describes in response to the inkblot. There are 27 established codes for identifying the name of
the descriptive object. The codes are classified and include terms such as "human", "nature",
"animal", "abstract", "clothing", "fire", and "x-ray", to name a few. Content described that does
not have a code already established should be coded using the code "idiographic contents" with
the shorthand code being "Idio." Items are also coded for statistical popularity (or, conversely,
originality).
More than any other feature in the test, content response can be controlled consciously by the
subject, and may be elicited by very disparate factors, which makes it difficult to use content
alone to draw any conclusions about the subject's personality; with certain individuals, content
responses may potentially be interpreted directly, and some information can at times be
obtained by analyzing thematic trends in the whole set of content responses (which is only
feasible when several responses are available), but in general content cannot be analyzed
outside of the context of the entire test record.
Location
Identifying the location of the subject's response is another element scored in the Rorschach
system. Location refers to how much of the inkblot was used to answer the question.
Administrators score the response "W" if the whole inkblot was used to answer the question, "D"
if a commonly described part of the blot was used, "Dd" if an uncommonly described or unusual
detail was used, or "S" if the white space in the background was used. A score of W is typically
associated with the subject's motivation to interact with his or her surrounding environment. D is
interpreted as indicating efficient or adequate functioning. A high frequency of responses
coded Dd indicates some maladjustment within the individual. Responses coded S indicate an
oppositional or uncooperative test subject.
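As an illustration of the location coding described above, the following Python sketch tallies how often each of the four codes appears in a record. It is a minimal example, not part of any official scoring software: the function name tally_locations is hypothetical, and the sketch only counts codes. It does not attempt the interpretation, which the text leaves to a trained examiner.

```python
from collections import Counter
from typing import Dict, Iterable

# Location codes named in the text: whole blot, common detail,
# unusual detail, and white space.
LOCATION_CODES = ("W", "D", "Dd", "S")


def tally_locations(codes: Iterable[str]) -> Dict[str, int]:
    """Count how many responses in a record received each location code."""
    counts = Counter(codes)
    unknown = set(counts) - set(LOCATION_CODES)
    if unknown:
        raise ValueError(f"unrecognized location code(s): {sorted(unknown)}")
    # Report every code, including those that never occurred.
    return {code: counts.get(code, 0) for code in LOCATION_CODES}
```

For example, tally_locations(["W", "D", "D", "Dd"]) reports one W, two D, one Dd and no S responses.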
Determinants
Systems for Rorschach scoring generally include a concept of "determinants": These are the
factors that contribute to establishing the similarity between the inkblot and the subject's content
response about it. They can also represent certain basic experiential-perceptual attitudes,
showing aspects of the way a subject perceives the world. Rorschach's original work used
only form, color and movement as determinants. Currently, however, another major determinant
considered is shading, which was inadvertently introduced by poor printing quality of the
inkblots. Rorschach initially disregarded shading, since the inkblots originally featured uniform
saturation, but later recognized it as a significant factor.
Form is the most common determinant, and is related to intellectual processes. Color responses
often provide direct insight into one's emotional life. Movement and shading have been
considered more ambiguously, both in definition and interpretation. Rorschach
considered movement only as the experiencing of actual motion, while others have widened the
scope of this determinant, taking it to mean that the subject sees something "going on".
More than one determinant can contribute to the formation of the subject's perception. Fusion of
two determinants is taken into account, while also assessing which of the two constituted the
primary contributor. For example, "form-color" implies a more refined control of impulse than
"color-form". It is, indeed, from the relation and balance among determinants that personality
can be most readily inferred.
Symmetry of the test items
A striking characteristic of the Rorschach inkblots is their symmetry. Many accept this
aspect of the images without question, but Rorschach, as well as other researchers,
certainly did not. Rorschach experimented with both asymmetric and symmetric images before
finally opting for the latter.
He gives this explanation for the decision:
Asymmetric figures are rejected by many subjects; symmetry supplied part of the necessary
artistic composition. It has a disadvantage in that it tends to make answers somewhat
stereotyped. On the other hand, symmetry makes conditions the same for right and left handed
subjects; furthermore, it facilitates interpretation for certain blocked subjects. Finally, symmetry
makes possible the interpretation of whole scenes.
The impact of symmetry in the Rorschach inkblots has also been investigated further by other
researchers.
Exner scoring system
The Exner scoring system, also known as the Rorschach Comprehensive System (RCS), is the
standard method for interpreting the Rorschach test. It was developed in the 1960s by Dr. John
E. Exner, as a more rigorous system of analysis. It has been extensively validated and shows
high inter-rater reliability. In 1969, Exner published The Rorschach Systems, a concise
description of what would be later called "the Exner system". He later published a study in
multiple volumes called The Rorschach: A Comprehensive System, the most accepted full
description of his system.
Creation of the new system was prompted by the realization that at least five related, but
ultimately different methods were in common use at the time, with a sizeable minority of
examiners not employing any recognized method at all, basing instead their judgment on
subjective assessment, or arbitrarily mixing characteristics of the various standardized systems.
The key components of the Exner system are the clusterization of Rorschach variables and a
sequential search strategy to determine the order in which to analyze them, framed in the
context of standardized administration, objective, reliable coding and a representative normative
database. The system places a lot of emphasis on a cognitive triad: information processing,
related to how the subject processes input data; cognitive mediation, referring to the way
information is transformed and identified; and ideation.
In the system, responses are scored with reference to their level of vagueness or synthesis of
multiple images in the blot, the location of the response, which of a variety of determinants is
used to produce the response (i.e., what makes the inkblot look like what it is said to resemble),
the form quality of the response (to what extent a response is faithful to how the actual inkblot
looks), the contents of the response (what the respondent actually sees in the blot), the degree
of mental organizing activity that is involved in producing the response, and any illogical,
incongruous, or incoherent aspects of responses. It has been reported that popular responses
on the first card include bat, badge and coat of arms.
Using the scores for these categories, the examiner then performs a series of calculations
producing a structural summary of the test data. The results of the structural summary are
interpreted using existing research data on personality characteristics that have been
demonstrated to be associated with different kinds of responses.
With the Rorschach plates (the ten inkblots), the area of each blot which is distinguished by the
client is noted and coded, typically as "commonly selected" or "uncommonly selected". There
were many different methods for coding the areas of the blots. Exner settled upon the area
coding system promoted by S. J. Beck (1944 and 1961). This system was in turn based upon
Klopfer's (1942) work.
As pertains to response form, a concept of "form quality" was present from the earliest of
Rorschach's works, as a subjective judgment of how well the form of the subject's response
matched the inkblots (Rorschach would give a higher form score to more "original" yet good
form responses), and this concept was followed by other methods, especially in Europe; in
contrast, the Exner system defines "good form" solely as a matter of word occurrence
frequency, reducing it to a measure of the subject's distance to the population average.
Rorschach performance assessment system
The Rorschach Performance Assessment System (R-PAS) is a scoring method created by several
members of the Rorschach Research Council. They believed that the Exner scoring system was
in need of an update, but after Exner's death, the Exner family forbade any changes
to the Comprehensive System. Therefore, they established a new system: the R-PAS. It is an
attempt at creating a current, empirically based, and internationally focused scoring system that
is easier to use than Exner's Comprehensive System. The R-PAS manual is intended to be a
comprehensive tool for administering, scoring, and interpreting the Rorschach. The manual
consists of two chapters covering the basics of scoring and interpretation, aimed at novice
Rorschach users, followed by numerous chapters containing more detailed and technical
information.
In terms of updated scoring, the authors only selected variables that have been empirically
supported in the literature. Notably, the authors did not create new variables or indices to be
coded, but systematically reviewed variables that had been used in past systems. While all of
these codes have been used in the past, many have been renamed to be more face valid and
readily understood. Scoring of the indices has been updated (e.g., using percentiles and
standard scores) to make the Rorschach more in line with other popular personality measures.
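To show what reporting a result as a standard score or a percentile means in general, the Python sketch below converts a raw score using a norm group's mean and standard deviation. It is a generic psychometric illustration, not the R-PAS procedure: the text does not give the R-PAS formulas or norm values, so the norm mean and standard deviation are inputs the caller must supply. The 100/15 scale is a common convention assumed here, and the percentile calculation assumes roughly normally distributed scores, which may not hold for many Rorschach variables.

```python
from statistics import NormalDist


def standard_score(raw: float, norm_mean: float, norm_sd: float,
                   scale_mean: float = 100.0, scale_sd: float = 15.0) -> float:
    """Rescale a raw score to a standard-score metric (default mean 100, SD 15)."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    z = (raw - norm_mean) / norm_sd  # distance from the norm mean in SD units
    return scale_mean + scale_sd * z


def percentile(raw: float, norm_mean: float, norm_sd: float) -> float:
    """Approximate percentile rank, ASSUMING normally distributed norm scores."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return 100.0 * NormalDist(norm_mean, norm_sd).cdf(raw)
```

With a hypothetical norm mean of 10 and standard deviation of 2, a raw score of 12 lies one standard deviation above the mean, so standard_score(12, 10, 2) gives 115.0, and a raw score equal to the mean falls at the 50th percentile.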
In addition to providing coding guidelines to score examinee responses, the R-PAS provides a
system to code an examinee's behavior during Rorschach administration. These behavioral
codes are included as it is believed that the behaviors exhibited during testing are a reflection of
someone's task performance and supplement the actual responses given. This allows
generalizations to be made between someone's responses to the cards and their actual
behavior.
The R-PAS authors also recognized that scoring on many of the Rorschach variables differed across
countries. Therefore, starting in 1997, Rorschach protocols from researchers around the world
were compiled. After compiling protocols for over a decade, a total of 15 adult samples were
used to provide a normative basis for the R-PAS. The protocols represent data gathered in the
United States, Europe, Israel, Argentina and Brazil.
References: Santo Di Nuovo; Maurizio Cuffaro (2004). Il Rorschach in pratica: strumenti per la
psicologia clinica e l'ambito giuridico. Milano: F. Angeli. p. 147.

