
Reliability and Types of Reliability

Subject: Psychological Assessment and Diagnosis

Submitted to: Ms. Maryam Razzaq

Dated: 6th May, 2020

Submitted by: Akhwand Abdur Raffi Saulat

Roll no: ADCP-021R20-6

Session: 2020-2021

Department of Psychology

Reliability:
Reliability means that scores from an instrument are stable and consistent. Scores should
be nearly the same when researchers administer the instrument multiple times at different points
in time. Scores also need to be internally consistent: when an individual answers certain questions
one way, the individual should answer closely related questions in the same way (Creswell, 2013).
Reliability can also be defined as how consistently a measuring instrument or model measures the
specific concept it is intended to measure (Razak et al., 2011).
Types of Reliability:
The reliability of a measure reflects both the stability and the consistency of the tool or
measurement.
Stability
Sekaran (1992) defined the stability of a measure as its ability to remain the same over time,
despite uncontrollable testing conditions and the states of the respondents themselves. Test-retest
reliability and parallel-form reliability are the two types of stability tests.
a) Test-Retest Reliability
Test–retest reliability involves administering the same questionnaire to a large sample of
people at two different times (hence, test and retest). For a questionnaire to yield reliable
measurements, people need not obtain identical scores on the two administrations of the
questionnaire, but a person’s relative position in the distribution of scores should be similar at the
two test times.
The consistency of this relative positioning is determined by computing a correlation
coefficient using the two scores on the questionnaire for each person in the sample. A desirable
value for test–retest reliability coefficients is .80 or above, but the size of the coefficient will
depend on factors such as the number and types of items (Shaughnessy et al., 2012).
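To make the computation concrete, here is a minimal sketch in Python (an illustration, not taken
from the sources above); the function name and the respondents' scores are hypothetical.

```python
import numpy as np

def test_retest_reliability(scores_time1, scores_time2):
    """Pearson's r between two administrations of the same questionnaire."""
    x = np.asarray(scores_time1, dtype=float)
    y = np.asarray(scores_time2, dtype=float)
    # np.corrcoef returns a 2x2 correlation matrix; the off-diagonal
    # entry is Pearson's r between the two score vectors.
    return np.corrcoef(x, y)[0, 1]

# Hypothetical total scores of five respondents tested two weeks apart.
time1 = [12, 18, 25, 30, 22]
time2 = [14, 17, 27, 29, 21]
print(f"Test-retest r = {test_retest_reliability(time1, time2):.2f}")
# Per the text above, .80 or higher is generally considered desirable.
```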
b) Parallel-Form Reliability
This involves administering two instruments, both measuring the same variables, to the
same group of individuals and relating (or correlating) the scores on the two instruments. In
practice, the two instruments need to be similar, with the same content, the same level of
difficulty, and the same types of scales. Thus, the items on both instruments represent the same
population of items.
The advantage of this approach is that it allows you to see whether the scores from one
instrument are equivalent to the scores from the other when both are intended to measure the same
variables. The difficulty, of course, is whether the two instruments are equivalent in the first place.
Assuming that they are, the researchers relate or correlate the scores from the one instrument with
those from its equivalent (Creswell, 2013).
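Computationally, this estimate looks just like the test-retest sketch above: assuming total scores
on form A and form B for the same group of people, Pearson's r between the two sets of totals
(for example, passing the form-A and form-B totals to the hypothetical test_retest_reliability
function shown earlier) yields the parallel-form reliability coefficient.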
Consistency
Consistency of measures, on the other hand, is indicative of the homogeneity of the items
in the measurement that tap the construct being measured. The types of consistency tests are
inter-item consistency reliability, inter-rater reliability, and split-half reliability (Razak et al., 2011).
a) Inter-Item Consistency Reliability
Inter-item consistency reliability is an essential element in conducting an item analysis of
a set of test questions. Inter-item correlations examine the extent to which scores on one item are
related to scores on all other items in a scale. It provides an assessment of item redundancy: the
extent to which items on a scale are assessing the same content (Cohen & Swerdlik, 2005).
Ideally, the average inter-item correlation for a set of items should be between .20 and .40,
suggesting that while the items are reasonably homogeneous, they contain sufficiently unique
variance so as not to be isomorphic with each other.
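The quantity described here can be computed directly, as in the following sketch (the function
name and the response data are hypothetical, supplied only for illustration).

```python
import numpy as np

def mean_inter_item_correlation(item_scores):
    """Average of the pairwise correlations among all items in a scale.

    item_scores: 2-D array with respondents in rows and items in columns.
    """
    scores = np.asarray(item_scores, dtype=float)
    r = np.corrcoef(scores, rowvar=False)  # item-by-item correlation matrix
    # Average only the upper-triangle entries, i.e. each item pair once.
    return r[np.triu_indices_from(r, k=1)].mean()

# Hypothetical responses of six people to a four-item Likert-type scale.
data = [[3, 4, 3, 2],
        [5, 5, 4, 4],
        [2, 3, 2, 3],
        [4, 4, 5, 4],
        [1, 2, 2, 1],
        [3, 3, 4, 3]]
print(f"Mean inter-item correlation = {mean_inter_item_correlation(data):.2f}")
# The text above suggests an ideal range of roughly .20 to .40.
```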
b) Inter-Rater Reliability
Inter-rater reliability is a procedure used when making observations of behavior. It involves
observations made by two or more individuals of one or more individuals' behavior. The observers
record their scores of the behavior and then compare the scores to see whether they are similar or
different.
Because this method obtains observational scores from two or more individuals, it has the
advantage of negating any bias that any one individual might bring to scoring. It has the
disadvantages of requiring the researcher to train the observers and requiring the observers to
negotiate outcomes and reconcile differences in their observations, something that may not be easy
to do (Creswell, 2013).
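The sources cited above do not prescribe a particular statistic for comparing observers' scores,
but one common index of inter-rater agreement is Cohen's kappa, which corrects raw percentage
agreement for chance. The sketch below, with hypothetical observer codes, is one illustrative
implementation.

```python
import numpy as np

def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters' categorical codes."""
    a = np.asarray(ratings_a)
    b = np.asarray(ratings_b)
    # Observed agreement: proportion of episodes on which the raters match.
    p_observed = np.mean(a == b)
    # Expected chance agreement: product of the raters' marginal
    # proportions, summed over all categories either rater used.
    categories = np.union1d(a, b)
    p_expected = sum(np.mean(a == c) * np.mean(b == c) for c in categories)
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical codes from two observers for ten behavior episodes
# (1 = behavior present, 0 = behavior absent).
rater_a = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]
rater_b = [1, 1, 0, 1, 1, 1, 1, 0, 0, 1]
print(f"Cohen's kappa = {cohens_kappa(rater_a, rater_b):.2f}")
```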
c) Split-Half Reliability
Consistency can also be measured by splitting the items of a multi-item scale into two
parts and determining the correlation between the parts; hence the use of the term split-half
reliability to describe this technique (Weiner et al., 2003).
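As a minimal sketch (an assumption for illustration; the splitting rule and correction are not
specified by the source above), the code below splits a hypothetical scale into odd- and
even-numbered items, correlates the half-totals, and applies the common Spearman-Brown
correction to estimate the reliability of the full-length test.

```python
import numpy as np

def split_half_reliability(item_scores):
    """Correlate odd- and even-item totals, then apply Spearman-Brown."""
    scores = np.asarray(item_scores, dtype=float)
    odd_total = scores[:, 0::2].sum(axis=1)   # items 1, 3, 5, ...
    even_total = scores[:, 1::2].sum(axis=1)  # items 2, 4, 6, ...
    r_half = np.corrcoef(odd_total, even_total)[0, 1]
    # Spearman-Brown correction: estimates the reliability of the
    # full-length test from the correlation between its two halves.
    return 2 * r_half / (1 + r_half)

# Hypothetical responses of six people to a six-item scale.
data = [[3, 4, 3, 2, 4, 3],
        [5, 5, 4, 4, 5, 5],
        [2, 3, 2, 3, 2, 2],
        [4, 4, 5, 4, 4, 5],
        [1, 2, 2, 1, 2, 1],
        [3, 3, 4, 3, 3, 4]]
print(f"Split-half reliability = {split_half_reliability(data):.2f}")
```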
Figure 1. Types of reliability and their descriptions (Sekaran, 1992)
References
Cohen, R. J., & Swerdlik, M. E. (2005). Psychological testing and assessment: An
introduction to tests and measurement (6th ed.). New York: McGraw-Hill.
Creswell, J. W. (2013). Educational Research: Pearson New International Edition:
Planning, Conducting, and Evaluating Quantitative and Qualitative Research (4th ed.). Harlow,
United Kingdom: Pearson.
Razak, I. H. A., Kamaruddin, S., & Azid, I. A. (2011). Towards Human Performance
Measurement from the Maintenance Perspective: A Review. International Journal of
Engineering Management and Economics, 2(1), 60. DOI: 10.1504/ijeme.2011.039613
Sekaran, U. (1992). Instructor's Resource Guide with Test Questions and Transparency
Masters to Accompany Research Methods for Business: A Skill-Building Approach (2nd ed.).
New York, NY: John Wiley & Sons.
Shaughnessy, J. J., Zechmeister, E. B., & Zechmeister, J. S. (2012). Research Methods in
Psychology (9th ed.). New York, NY: McGraw-Hill.
Weiner, I. B., Freedheim, D. K., Schinka, J. A., & Velicer, W. F. (2003). Handbook of
Psychology (Vol. 2). New York: Wiley.
