
MIDLANDS STATE UNIVERSITY

FACULTY OF SOCIAL SCIENCES

DEPARTMENT OF PSYCHOLOGY

NAME: BOIKETLO P. NARE

REG NUMBER: R1810735V

LEVEL: 4:1

MODULE NAME: PSYCHOMETRICS

LECTURER: MR P MUTANDWA

QUESTION 6: DEFINE AND DISCUSS THE UTILITY VALUE OF ANY FOUR METHODS OF DETERMINING A MEASURE OR A TEST.

According to Cronbach (1960), a psychological test ‘is a systematic procedure for comparing the behavior of two or more people’. There are numerous methods that help one determine the quality of a measure or a test, and these include the correlation-coefficient method, the cross-validation method, the expectancy-table method and the item-analysis method. Furr and Bacharach (2008) state that ‘a correlation coefficient reflects the direction of the association between two variables’; a great benefit of a correlation is that it also reflects the magnitude of that association. Cross-validation is a resampling method that uses different portions of the data to train and test a model on different iterations (Stone, 1974); it is accomplished by trying out a previously developed and refined test on a completely new group. Helmstadter (1964) notes that an expectancy table provides the probabilities indicating that students achieving a given test score will behave in a certain way in some second situation. The last method is item analysis, which is a process that examines responses to individual test items in order to assess the quality of those items and of the test as a whole. These methods have different levels of utility, and the following essay discusses each of them.

Cross-validation, sometimes known as rotation estimation, is the statistical practice of partitioning a sample of data into subsets such that the analysis is initially performed on a single subset, while the other subset(s) are retained for subsequent use in confirming and validating the initial analysis. The initial subset of data is called the training set; the other subset(s) are called validation or testing sets. Browne (2000) states that cross-validation entails a set of techniques that partition the dataset and repeatedly generate models and test their future predictive power. Cross-validation has the advantage that it avoids fitting a model too closely to the peculiarities of a dataset (overfitting). Overfitting can occur if an overly complex model is fitted to the training set; for example, if the number of parameters of a model is the same as or greater than the number of observations, the model can perfectly predict the training set simply by memorizing the data in its entirety. However, such a model will typically fail severely when making predictions on new data. Cross-validation avoids this risk by evaluating the performance of the model on an independent dataset (the testing set). This protects against overfitting the training data and at the same time increases confidence that the effects obtained in a specific study will be replicated, instantiating a simulated replication of the original study. In this way, cross-validation mimics the advantages of an independent replication with the same set of collected data (Yarkoni & Westfall, 2017).
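
To make the procedure concrete, the following is a minimal Python sketch (not part of the original essay) of k-fold cross-validation, assuming a small made-up dataset and a simple straight-line prediction model: the model is fitted on the training folds and its error is estimated on the held-out testing fold.

```python
# A minimal k-fold cross-validation sketch on made-up data (illustrative only).
# A straight line is fitted to the training folds and evaluated on the held-out
# fold, so the error estimate comes from data the model has never seen.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 50)              # hypothetical predictor scores
y = 2.0 * x + rng.normal(0, 2, 50)      # hypothetical criterion scores

k = 5
indices = rng.permutation(len(x))
folds = np.array_split(indices, k)

errors = []
for i in range(k):
    test_idx = folds[i]                                                  # testing (validation) set
    train_idx = np.concatenate([folds[j] for j in range(k) if j != i])   # training set
    slope, intercept = np.polyfit(x[train_idx], y[train_idx], 1)
    predictions = slope * x[test_idx] + intercept
    errors.append(np.mean((y[test_idx] - predictions) ** 2))

print("Cross-validated mean squared error:", np.mean(errors))
```

Because every error estimate comes from a fold the model was not trained on, a model that merely memorizes its training data will show up with a poor cross-validated score.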
Item analysis, as mentioned above, is a way to determine the quality of a test by looking at each individual item or question and determining whether it is fair or not; in other words, item analysis is a process which examines student responses to individual test items (questions) in order to assess the quality of those items and of the test as a whole (Nunnally, 1967). It helps one decide whether an individual item of the test should be kept, discarded or revised, and it is used only after the test has been administered and the results summed up (it is said to be post hoc). The utility value of an item is evaluated in several ways. One such way is the difficulty index, or item difficulty (Lord, 1952), which is the proportion or probability of candidates, or students, answering a test item correctly; generally, more difficult items have a lower percentage, or p-value. The next is item discrimination, which refers to the ability of an item to differentiate among students on the basis of how well they know the material being tested. Various hand-calculation procedures have traditionally been used to compare item responses to total test scores using high- and low-scoring groups of students, while computerized analyses provide a more accurate assessment of the discrimination power of items because they take into account the responses of all students rather than just the high- and low-scoring groups. Lastly, there are frequencies and distributions, in which the number and percentage of students who choose each alternative are reported; frequently chosen wrong alternatives may indicate common misconceptions among the students.

However, item analysis data are not synonymous with item validity: an external criterion is required to accurately judge the validity of test items. By using the internal criterion of total test score, item analyses reflect the internal consistency of items rather than their validity. The discrimination index is also not always a measure of item quality. There is a variety of reasons an item may have low discriminating power: extremely difficult or easy items will have low ability to discriminate, yet such items are often needed to adequately sample course content and objectives, and an item may show low discrimination if the test measures many different content areas and cognitive skills. For example, if the majority of the test measures knowledge of facts, then an item assessing the ability to apply principles may have a low correlation with the total test score, yet both types of items are needed to measure attainment of course objectives (Holt, 1973).
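
As an illustration of how the difficulty and discrimination indices described above might be computed, here is a hedged Python sketch using a hypothetical matrix of scored (correct/incorrect) responses; the figures are invented purely for demonstration.

```python
# An illustrative item-analysis sketch on a hypothetical scored response matrix:
# item difficulty as the proportion answering correctly, and a simple
# upper/lower-group discrimination index.
import numpy as np

# Rows = students, columns = items; 1 = correct, 0 = incorrect (made-up data).
responses = np.array([
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
])

difficulty = responses.mean(axis=0)      # p-value per item (higher = easier item)

totals = responses.sum(axis=1)           # each student's total score
order = np.argsort(totals)
n_group = len(totals) // 3               # bottom and top thirds of scorers
low, high = order[:n_group], order[-n_group:]
discrimination = responses[high].mean(axis=0) - responses[low].mean(axis=0)

for i, (p, d) in enumerate(zip(difficulty, discrimination), start=1):
    print(f"Item {i}: difficulty p = {p:.2f}, discrimination D = {d:.2f}")
```

The upper/lower-group comparison shown here is the traditional hand-calculation shortcut; as noted above, computerized analyses that use all students' responses give a more accurate picture of discrimination.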
The other method is the correlation coefficient. In this method the scores of a newly constructed test are correlated with criterion scores, and the resulting coefficient of correlation gives the validity index of the test. For this purpose Pearson’s method of correlation is the most widely and popularly used, although the appropriate technique of correlation depends on the nature of the data obtained on the test as well as on the criterion. The correlation coefficient is most useful as a measure of the strength of a linear association between two variables; in effect it attempts to draw a line of best fit through the data of the two variables. In other words, with the correlation coefficient we want to know the relationship between variables. For example, if one wants to compare the performance of two physics classes with different teachers, the students in both classes can be given the same test and the resulting scores correlated with a chosen criterion.
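
By way of illustration, here is a minimal Python sketch (with made-up scores) of how a Pearson correlation between a newly constructed test and a criterion measure could be computed.

```python
# An illustrative Pearson correlation sketch with made-up scores: the newly
# constructed test is correlated with an established criterion measure.
import numpy as np

new_test = np.array([55, 62, 70, 48, 90, 75, 66, 81])    # hypothetical new-test scores
criterion = np.array([58, 60, 73, 50, 88, 70, 69, 84])   # hypothetical criterion scores

r = np.corrcoef(new_test, criterion)[0, 1]
print(f"Pearson r (validity coefficient) = {r:.2f}")
```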

The advantage of the correlation-coefficient method is that it is non-experimental: the two variables are simply measured as they occur, and it is up to the individuals conducting the study to assess and understand the statistical relationship between them without introducing extraneous influences. It is like when a child hears the music playing from an ice cream truck: there is a direct relationship between the loudness of the sound and how far away the vehicle is from the child’s current location. By understanding that relationship, the child knows whether to grab their money, ask their parents for some, or not bother making the effort.

An important limitation of the correlation coefficient is that it assumes a linear association. This also means that any linear transformation and any scale transformation of either variable X or Y, or both, will not affect the magnitude of the correlation coefficient. However, variables X and Y may have a non-linear association, which can yield a low correlation coefficient even though the two variables are clearly related. Nonetheless, the correlation coefficient will not always return 0 in the case of a non-linear association.
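
The linearity limitation can be illustrated with a small, hypothetical Python sketch: a variable that is perfectly but non-linearly related to another can still produce a Pearson correlation close to zero.

```python
# An illustrative sketch of the linearity limitation: a perfect but non-linear
# (quadratic) relationship can still give a Pearson r of approximately zero.
import numpy as np

x = np.linspace(-5, 5, 101)
y = x ** 2                    # y is completely determined by x, but not linearly

r = np.corrcoef(x, y)[0, 1]
print(f"Pearson r for y = x**2 over a symmetric range: {r:.3f}")   # close to 0
```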

An expectancy table is a two-way table showing the relationship between two tests. Helmstadter (1964) notes that an expectancy table provides the probabilities indicating that students achieving a given test score will behave in a certain way in some second situation (p. 52). Expectancy tables are discussed as a device for interpreting the meaning of test results for those untrained in statistics. Three ways of organizing an expectancy table are discussed: firstly, to determine the probability that a student with a given test score will succeed in a specified course; secondly, to find out how to pick the best applicants; and thirdly, to determine the probability that an office worker will attain an average rating or higher. Teachers can use the information from one test to help predict the performance level on another test; that is, an expectancy table can be used to display predictive validity data. In this method the scores of a newly constructed test are evaluated against, or correlated with, the ratings of supervisors, providing empirical probabilities that serve as a validity index. Taylor-Russell tables list numbers which illustrate the expected proportion of success on the job, given the validity of the test, the base rate, and the selection ratio (Murphy & Davidshofer, 2005). Expectancy tables provide a table of numbers from which, depending on the test score, one may determine the probability of an individual’s or a group of individuals’ attaining a specified level of success or “superiority” (Lawshe & Bolda, 1958).
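
To illustrate the idea, the following Python sketch (using invented scores and outcomes, and the pandas library) builds a simple expectancy table: test scores are grouped into bands and, for each band, the proportion of people attaining each outcome is reported.

```python
# An illustrative expectancy-table sketch with invented data: test scores are
# grouped into bands, and each row shows the probability of each outcome given
# a score in that band.
import pandas as pd

scores = [45, 52, 58, 63, 67, 71, 76, 82, 88, 93]            # made-up test scores
outcome = ["fail", "fail", "pass", "fail", "pass",
           "pass", "pass", "pass", "pass", "pass"]            # made-up second-situation result

bands = pd.Series(pd.cut(scores, bins=[0, 60, 80, 100], labels=["low", "middle", "high"]),
                  name="score band")
expectancy = pd.crosstab(bands, pd.Series(outcome, name="outcome"), normalize="index")
print(expectancy.round(2))
```

Each row of the resulting table can then be read as an empirical probability of success for people scoring in that band, which is exactly the kind of statement an expectancy table is meant to make accessible to readers untrained in statistics.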

In conclusion, this essay has tried to show the utility value of the correlation-coefficient, cross-validation, expectancy-table and item-analysis methods of determining a test. We have highlighted each method’s strengths and weaknesses, which together demonstrate its usefulness.
References.

Cronbach, L. J. (1960). Essentials of psychological testing (2nd ed.). Harper.

Furr, R. M., & Bacharach, V. R. (2008). Psychometrics: An introduction. Thousand Oaks, CA: Sage.

Helmstadter, G. C. (1964). Principles of psychological measurement. New York: Appleton-Century-Crofts.

Lawshe, C. H., & Bolda, R. A. (1958). Expectancy charts: I. Their use and empirical development. https://doi.org/10.1111/j.1744-6570.1958.tb00023

Lord, F. M. (1952). The relationship of the reliability of multiple-choice tests to the distribution of item difficulties. Psychometrika, 18, 181-194.

Measurement and evaluation in education and psychology. (1973). New York: Holt, Rinehart and Winston.

Murphy, K. R., & Davidshofer, C. I. (2005). Psychological testing: Principles and applications.

Nunnally, J. C. (1967). Psychometric theory. New York: McGraw-Hill.

Stone, M. (1974). Cross-validatory choice and assessment of statistical predictions. Journal of the Royal Statistical Society, Series B (Methodological), 36(2), 111-147.
