
1. Test scores are frequently expressed as numbers and statistical tools are frequently used to
A) describe test scores.
B) make inferences from test scores.
C) draw conclusions about test scores.
D) all of the above.

2. Measurement may be defined as the act of assigning numbers or symbols to characteristics of objects (as well as people, events, or other things) according to
A) tests.
B) scales.
C) rules.
D) intuition.

3. The French word for black (noir) may be thought of as an acronym wherein each of the four letters corresponds to the first letter in each of four
A) methods of testing.
B) scales of measurement.
C) tools of assessment.
D) major theories.

4. Through his writings, we know that Alfred Binet viewed intelligence test data as
A) continuous in nature.
B) comprehensive in nature.
C) ordinal in nature.
D) all of the above.

5. A type of scale with a true zero point is
A) an IQ scale.
B) an ordinal scale.
C) an interest scale.
D) a ratio scale.

6. The intersection of the vertical and horizontal axes of a graph
A) is customarily at zero.
B) must never be at zero.
C) may or may not be at zero.
D) might be at 3.

7. Look at the following data for scores of 10 students on a test (class interval of score: frequency): 80–100: 4; 60–79: 4; 40–60: 2. These data are presented in the form of
A) a frequency distribution.
B) a grouped frequency distribution.
C) a class interval distribution.
D) a nominal scale distribution.

8. Which does not belong?
A) Mean.
B) Measure.
C) Mode.
D) Median.

9. Distributions may vary in terms of their
A) variability.
B) skewness.
C) kurtosis.
D) all of the above.

10. The mode is
A) the most frequently occurring score in a distribution.
B) the arithmetic average of all of the scores in a distribution.
C) a reliable indicator of the variability within a distribution.
D) all of the above.

11. Which is true about the range of a distribution?
A) It provides a useful index of central tendency.
B) It is a quick but gross description of the spread of scores.
C) It can be used with other measures to calculate kurtosis.
D) It tends to be broader in nature than that of the "3 Tenors".

12. Resulting data from a test of academic knowledge are positively skewed. This suggests that
A) the test was too difficult.
B) the test was too easy.
C) an unbiased estimate must be calculated.
D) some students benefited from a "halo effect".

13. From which type of test administered to members of the general population would you LEAST expect the resulting data to be distributed normally?
A) A test to measure the strength of one's hand grip.
B) A test to measure general intelligence and fund of knowledge.
C) A test to measure knowledge of psychometric principles.
D) A test to measure self-esteem.

14. Which is NOT a type of standard score?
A) A stanine.
B) A deviation IQ.
C) A T score.
D) A J score.

15. A test developer intent on obtaining a distribution of scores that approximates the normal curve may statistically
A) normalize the distribution.
B) regress the distribution.
C) digest the distribution.
D) all of the above.
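
Several of the statistics items above (the mode, the range, grouped frequency distributions) can be computed directly. A minimal sketch with hypothetical scores — the data and interval widths are illustrative, not from the test bank:

```python
from collections import Counter

scores = [82, 91, 75, 82, 68, 95, 82, 74, 61, 88]  # hypothetical test scores

# Mode: the most frequently occurring score in the distribution.
mode = Counter(scores).most_common(1)[0][0]

# Range: a quick but gross description of the spread of scores.
spread = max(scores) - min(scores)

# Grouped frequency distribution with class intervals of width 20.
intervals = {(lo, lo + 19): sum(lo <= s <= lo + 19 for s in scores)
             for lo in (40, 60, 80)}

print(mode)       # 82 (occurs three times)
print(spread)     # 95 - 61 = 34
print(intervals)
```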
1. On the PBS series "This Old House," Norm is the name of the master carpenter. In psychology, "norm" is used to refer to
A) behavior that is usual, average, normal, or expected.
B) a level of work product anticipated under specific conditions.
C) what is typical at a particular age in a particular context.
D) all of the above.

2. In its plural form, "norms" is a term used in psychometrics to refer to the test performance data of
A) people tested at a different time than another group of test takers.
B) test takers who constitute a control group in an experiment.
C) a particular group of test takers to be used for comparison.
D) a sample of people with no prior training in the tested area.

3. The term "norming" refers to the process of
A) interpreting and re-interpreting norms.
B) deriving or generating norms.
C) distributing norms to members of target populations.
D) putting a carpenter's personal signature on a work product.

4. A raw score of 0 on a test alerts the test user to the possibility that the test taker
A) probably was unprepared for the examination or did not care about it.
B) may not have understood or been able to carry out the directions.
C) had previously obtained a copy of the test and studied it carefully.
D) all of the above.

5. Race norming is the controversial practice of norming
A) by computer as rapidly as possible.
B) intelligence test scores as a function of ethnic background.
C) GM Test Track scores on the basis of NASCAR rankings.
D) on the basis of race or ethnic background.

6. Cut scores
A) can be derived in only one of three different ways.
B) are numerical reference points used to classify.
C) are objective in nature, minimizing subjectivity.
D) all of the above.

7. According to research by Medvec and her colleagues, bronze medallists in the Olympic games are likely to be
A) happier with their performance than silver medallists.
B) happier with their performance than gold medallists.
C) emotionally devastated about not winning the gold.
D) thinking about how they can improve their performance.

8. In the context of norming a test, a sample of the population refers to
A) people deemed to be representative of the whole population.
B) people deemed to be atypical of the whole population.
C) a mixture of people who are both representative and atypical.
D) a randomly selected group of people who share a characteristic.

9. An incidental sample
A) is also known as a convenience sample.
B) is composed of people exposed to the same incident.
C) is purposive in nature.
D) all of the above.

10. In a bygone era, age norms were
A) used as a first step in establishing national anchor norms.
B) designed to help estimate average grade-level performance.
C) associated with the now out-of-favor concept of mental age.
D) all of the above.

11. Tests such as the SAT and the GRE employ these to help bring meaning to test takers' test scores. They are
A) criterion-referenced interpretation systems.
B) incidental samples.
C) race norming procedures.
D) fixed reference group scoring systems.

12. Criterion-referenced tests
A) have also been referred to as domain-referenced tests.
B) have also been referred to as content-referenced tests.
C) have criteria derived from the standards of the test developer.
D) all of the above.

13. A coefficient of correlation is an index of the
A) degree to which one variable influences another.
B) strength of the relationship between two things.
C) way in which one event may cause another event.
D) all of the above.

14. The most widely used measure of correlation is
A) the Pearson r.
B) Spearman's rho.
C) the rank-difference correlation coefficient.
D) the Spearman-Brown prophecy formula.

15. The primary use of a regression equation in testing and assessment is to
A) convert one score or variable to another.
B) transform scores derived in a different cultural context.
C) fix certain scores as an anchor for future scores.
D) predict one score or variable from another.
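
The correlation and regression items above turn on two related computations: the Pearson r as an index of the strength of a relationship, and the least-squares regression line used to predict one score from another. A minimal sketch on hypothetical paired scores:

```python
import math

x = [2.0, 4.0, 6.0, 8.0]   # hypothetical predictor scores
y = [3.0, 7.0, 9.0, 13.0]  # hypothetical criterion scores
n = len(x)
mx, my = sum(x) / n, sum(y) / n

# Pearson r: the sum of deviation cross-products divided by the
# square root of the product of the sums of squared deviations.
cov = sum((a - mb) * (b - my) for a, mb, b in zip(x, [mx] * n, y))
cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
r = cov / math.sqrt(sum((a - mx) ** 2 for a in x)
                    * sum((b - my) ** 2 for b in y))

# Least-squares regression line y' = a + b*x, used to predict a
# criterion score from a predictor score.
b = cov / sum((a - mx) ** 2 for a in x)
a = my - b * mx
predict = lambda score: a + b * score

print(round(r, 3))   # 0.992 for these data
print(predict(5.0))  # predicted criterion score for a predictor of 5
```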
1. In the language of psychometrics, reliability refers primarily to
A) expertise in measurement.
B) dependability in measurement.
C) speed of measurement.
D) consistency in measurement.

2. With regard to test reliability,
A) there are different types of reliability.
B) it is seldom an all-or-none matter.
C) tests are reliable to different degrees.
D) all of the above.

3. A reliability coefficient is an index of reliability that reflects the ratio between
A) the error variance and the error variance squared.
B) the true score variance on a test and the total variance.
C) the true score variance on a test and the error variance squared.
D) the true score variance and the error variance.

4. In the context of psychometrics, error refers to the component of the observed score on an ability test that
A) does not have to do with the ability being measured.
B) was distorted as a result of examiner error.
C) may have been measured inaccurately for whatever reason.
D) was administered solely for experimental reasons.

5. According to true score theory, an individual's score on a test of extraversion reflects a level of extraversion as defined by the test, and that level is presumed to be
A) the testtaker's "true" level of extraversion.
B) only an estimate of the testtaker's true level of extraversion.
C) greater than the degree of error inherent in the score.
D) less than or equal to the degree of error inherent in the score.

6. Which is a source of error variance?
A) Test construction.
B) Test administration.
C) Test scoring.
D) All of the above.

7. Item sampling is a source of error variance within the context of
A) test construction.
B) test administration.
C) test scoring.
D) all of the above.

8. A behavioral observation checklist requires the observer to note whether the person being observed smiles. A key source of error variance resulting from this requirement is
A) content variance.
B) scoring variance.
C) item sampling variance.
D) all of the above.

9. It has been said that a so-called true score is "not the ultimate fact in the book of the recording angel." This means that
A) subjectivity in scoring may lead to chance variations in examination scores.
B) factors due to luck, for lack of another word, must always be considered.
C) test variance due to true variance as opposed to error may never be known.
D) religion as a cultural variable has not been given sufficient consideration.

10. Which is NOT a form of reliability?
A) Test-retest reliability.
B) Past-present reliability.
C) Split-half reliability.
D) Alternate-forms reliability.

11. It is most appropriate to use the Spearman-Brown formula to estimate what form of reliability?
A) Test-retest reliability.
B) Past-present reliability.
C) Split-half reliability.
D) Alternate-forms reliability.

12. In general, as test length increases, test reliability
A) increases.
B) decreases.
C) is not affected either way.
D) is affected, but insignificantly.

13. The degree of correlation among all of the items on a scale
A) is referred to as inter-item consistency.
B) may be estimated by means of KR-20.
C) may be estimated by means of the Rulon formula.
D) all of the above.

14. Coefficient alpha is conceptually
A) the variance of all possible sources of error variance.
B) the mean of all possible split-half correlations.
C) the standard deviation of all possible sources of variation.
D) the estimate of inter-scorer reliability that is most robust.

15. The difference between a speed test and a power test has to do with
A) whether or not the range has been restricted.
B) the time limit allotted for completion of the items.
C) whether or not the variance has been restricted.
D) all of the above.
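
The items on the Spearman-Brown formula and on test length concern the same relationship: the prophecy formula estimates how reliability changes when a test is lengthened or shortened by a factor n. A minimal sketch (the .60 and .75 figures are illustrative, not from the test bank):

```python
def spearman_brown(r_xx: float, n: float) -> float:
    """Predicted reliability when test length is multiplied by n.

    r_xx: reliability of the existing test (e.g., a half-test correlation).
    n: factor by which the number of items changes (2.0 doubles the test).
    """
    return n * r_xx / (1 + (n - 1) * r_xx)

# Stepping up a split-half correlation of .60 to full-test length:
print(spearman_brown(0.60, 2.0))  # 0.75

# Lengthening a test raises estimated reliability...
assert spearman_brown(0.75, 2.0) > 0.75
# ...and shortening it lowers estimated reliability.
assert spearman_brown(0.75, 0.5) < 0.75
```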
1. Stated succinctly, test validity refers to a judgment concerning
A) how consistently a test measures what it purports to measure.
B) why the test should or should not be used for a specific purpose.
C) how well a test measures what it purports to measure.
D) how sound the evidence is that supports conclusions from it.

2. A local validation study is a study
A) conducted by test-takers from a particular locality or community.
B) conducted by test users who plan to alter a test in some way.
C) undertaken by certified test developers from a specific locality.
D) undertaken by the original test publisher for any purpose.

3. Which is NOT a traditional component of the so-called Trinitarian view of validity?
A) Validity.
B) Content validity.
C) Criterion-related validity.
D) Construct validity.

4. It has been characterized as the "Rodney Dangerfield of psychometric variables." It is
A) odd-even reliability.
B) marital satisfaction.
C) variance.
D) face validity.

5. A job applicant takes a company-administered test for employment and then questions the relevance to the job of certain test items. Stated another way, the applicant is expressing concern about the test's
A) incremental validity.
B) content validity.
C) factor loadings.
D) all of the above.

6. History tests in different regions of the world may
A) be used to illustrate that there are certain cultural absolutes.
B) have the same questions but different answers keyed correct.
C) be used to illustrate the irrelevancy of culture to historical fact.
D) be written in different languages but have similar content.

7. Which does NOT belong?
A) Criterion-related validity.
B) Content validity.
C) Concurrent validity.
D) Predictive validity.

8. When a criterion measure is itself based, at least in part, on predictor measures, what is said to exist is
A) criterion-related validity.
B) predictive validity.
C) criterion contamination.
D) concurrent validity.

9. The proportion of people accurately identified by a test as having a particular trait, behavior, characteristic, or attribute is referred to as a
A) hit rate.
B) accuracy rate.
C) trait rate.
D) all of the above.

10. In order to establish the construct validity of a test, evidence could be gathered indicating that
A) the test measures a single concept.
B) expected changes in test performance occur over time.
C) expected changes in test performance occur as a result of experience.
D) all of the above.

11. Discriminant evidence of construct validity is otherwise known as
A) discriminant validity.
B) convergent validity.
C) predictive validity.
D) factor analytic validity.

12. With regard to the issues of test validity, fairness, and bias,
A) it is possible for a test to be valid yet used unfairly.
B) it is possible for a test to be biased and used unfairly.
C) test bias systematically prevents accurate, impartial measurement.
D) all of the above.

13. An instructor evidences a tendency not to fail any student in the class, but not to give any grades of A either. This instructor may be evidencing a
A) leniency tendency error.
B) severity error.
C) central tendency error.
D) all of the above.

14. Your brother did very well in Mrs. Jones's class. Now you are in her class and can't seem to do any wrong. You are probably the beneficiary of
A) a generosity error.
B) a halo effect.
C) a clerical error.
D) the Mel Gibson effect.

15. With regard to adjusting test scores by group membership, Gottfredson has argued that
A) such adjustments must be made if society is to progress.
B) tools of measurement must not be blamed for group differences.
C) adverse impact must be prevented by means of differential scoring.
D) all of the above.
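
One of the items above characterizes a hit rate as the proportion of people a test accurately identifies with respect to a particular attribute. A minimal sketch on hypothetical classification data, pairing each test decision with the person's actual status:

```python
# Hypothetical data: each pair is (test says has trait, actually has trait).
results = [(True, True), (True, False), (False, False),
           (True, True), (False, True), (False, False),
           (True, True), (False, False), (True, True), (False, False)]

# Hit rate: proportion of cases the test classified accurately.
hits = sum(predicted == actual for predicted, actual in results)
hit_rate = hits / len(results)
print(hit_rate)  # 8 of 10 classifications were accurate -> 0.8
```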
1. The process of developing a test occurs in five stages, beginning with test conceptualization. What is the fifth stage of this process?
A) Item analysis.
B) Test revision.
C) Test tryout.
D) Test construction.

2. A preliminary question in test development is "How will meaning be attributed to scores on this test?" This question logically leads to a discussion of
A) test revision in succeeding editions.
B) inter-item consistency.
C) the sequential strategy in scale construction.
D) norm- versus criterion-referenced tests.

3. Pilot work is typically necessary in test development to
A) evaluate the utility of including specific items.
B) gather baseline physiological data on all test-takers.
C) make certain all nonverbal symbols are understood.
D) transform ratio-level data into interval-level data.

4. "I spent all last night studying Chapter 7, and there wasn't one item on the test from that chapter!" Translated into the language of psychometrics, this statement would read:
A) "I noted excessive error variance related to test administration."
B) "I question the examination's content validity."
C) "I have grave concerns about rater error affecting reliability."
D) "And staying up to watch Sledgehammer on VH-1 didn't help either!"

5. In his paper entitled "A Method of Scaling Psychological and Educational Tests," the test development pioneer L. L. Thurstone introduced the notion of
A) difficulty scaling.
B) positive scaling.
C) absolute scaling.
D) all of the above.

6. Rating scales are used to record judgments about
A) oneself.
B) objects.
C) others.
D) all of the above.

7. A Likert scale is a
A) unidimensional scale.
B) summative scale.
C) categorical scale.
D) liking scale.

8. Items on a Guttman scale are designed so that
A) they range sequentially from weaker to stronger expressions.
B) agreement with weaker statements implies agreement with stronger ones.
C) endorsement of one statement says nothing at all about another.
D) they mirror a Likert format but allow for more detailed methods of analysis.

9. In the language of psychometrics, a "foil" is
A) part of the stem of an item on a multiple-choice test.
B) an incorrect alternative to a question on a multiple-choice test.
C) a general reference to distraction created by fellow testtakers.
D) an aluminum sheet used primarily for computer test security.

10. An essay test is an example of a test that employs
A) a selected-response format.
B) an unselected-response format.
C) a constructed-response format.
D) an under-construction-response format.

11. The Edwards Personal Preference Schedule is a personality test that features ipsative scoring. This means that the strength of various needs of the testtaker may be compared
A) to the strength of those needs in other testtakers.
B) to the strength of other needs of the same testtaker.
C) to the strength of needs expressed on the Mooney Problem Checklist.
D) all of the above.

12. "What is a good item?" The answer to this question
A) can never be made with certainty.
B) can be made with reference to item analysis data.
C) much like beauty, is in the eyes of the beholder.
D) all of the above.

13. Which is most useful in determining whether different items on a test are measuring the same thing?
A) Co-validation.
B) Cross-validation.
C) Factor analysis.
D) Test tryout.

14. This index, symbolized by a lowercase italicized letter, compares performance on a particular item with performance in the upper and lower regions of a distribution of continuous scores. It is
A) a.
B) b.
C) c.
D) d.

15. Test revision
A) applies to both brand-new and existing tests.
B) occurs after each stage of the development process.
C) is usually followed by test standardization.
D) all of the above.
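
One of the item-analysis questions above refers to an index that compares performance on an item between the upper and lower regions of the score distribution: the item-discrimination index d. A minimal sketch with hypothetical item data (the group sizes and counts are illustrative):

```python
def discrimination_index(upper_correct: int, lower_correct: int,
                         group_size: int) -> float:
    """d = (U - L) / n, where U and L are the numbers of testtakers in the
    upper- and lower-scoring groups who answered the item correctly, and
    n is the number of testtakers in each group."""
    return (upper_correct - lower_correct) / group_size

# 24 of 30 top scorers but only 9 of 30 bottom scorers passed the item:
print(discrimination_index(24, 9, 30))  # 0.5 -> item discriminates well

# A negative d flags a flawed item: low scorers outperform high scorers.
print(discrimination_index(8, 20, 30))
```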
