
RELIABILITY & VALIDITY

After assigning numerals to objects or events according to rules, an investigator
faces two major problems, viz., reliability and validity. He has devised his
measurement game and has administered the measuring instrument to a group
of subjects. He has a set of numbers, the end product of the measurement
game. He must now ask and answer the questions: What is the reliability of the
measuring instrument? What is its validity? These two aspects viz., reliability
and validity are of prime importance to a researcher.
Reliability:
Synonyms for reliability are: dependability, stability, consistency, predictability &
accuracy. A reliable man, for instance, is a man whose behavior is consistent,
dependable, and predictable: what he will do tomorrow and next week will be
consistent with what he does today and what he did last week. Thus, he is
a reliable man. On the other hand, an unreliable man is one whose behavior is
much more variable. He is unpredictably variable. Sometimes he does this,
sometimes that. He lacks stability. We say he is inconsistent.
Reliability Defined:
It is possible to approach the definition of reliability in three ways. One approach
is epitomized by the question: If we measure the same set of objects again and
again with the same or comparable measuring instrument, will we get the same
or similar results? This question implies a definition of reliability in terms of
stability, dependability & predictability.
A second approach is epitomized by the question: Are the measures obtained
from a measuring instrument the true measures of the property measured?
This is an accuracy definition. Compared to the first definition, it is further
removed from common sense and intuition, but it is also more fundamental.
These two approaches or definitions can be summarized in the words stability
and accuracy.
There is a third approach to the definition of reliability, an approach that not only
helps us better define and solve both theoretical and practical problems but also
implies other approaches and definitions. We can inquire: how much error of
measurement is there in a measuring instrument? Recall that there are two
general types of variance: systematic and random. Systematic variance leans in
one direction: scores tend to be all positive or all negative or all high or all low.
Error in this case is constant or biased. But random or error variance is
self-compensating: scores tend now to lean this way, now that way. Errors of
measurement are random errors. They are the sum or product of a number of
causes: the ordinary random or chance elements present in all measures due to
unknown causes, temporary or momentary fatigue, fortuitous conditions at a
particular time that temporarily affect the object measured or the measuring
instrument, fluctuations of memory or mood, and other factors that are temporary
and shifting. To the extent that errors of measurement are present in a
measuring instrument, to this extent the instrument is unreliable. In other words,
reliability can be defined as the relative absence of errors of measurement in a
measuring instrument.
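
The difference between systematic and random error described above can be illustrated with a small simulation. This is only a sketch under assumed values: the constant bias of 2 and the error spread of 3 are hypothetical numbers chosen for illustration, not figures from these notes.

```python
import random
import statistics


def simulate_errors(n, bias, spread, seed=0):
    """Return n measurement errors: a constant bias plus random noise."""
    rng = random.Random(seed)
    return [bias + rng.gauss(0, spread) for _ in range(n)]


# Hypothetical settings, chosen only for illustration.
random_only = simulate_errors(n=10_000, bias=0.0, spread=3.0)
with_bias = simulate_errors(n=10_000, bias=2.0, spread=3.0)

# Random errors are self-compensating: their mean stays close to zero.
mean_random = statistics.fmean(random_only)
# Systematic error leans in one direction: the mean stays close to the bias.
mean_biased = statistics.fmean(with_bias)

print(f"mean of random errors:     {mean_random:.2f}")
print(f"mean of systematic errors: {mean_biased:.2f}")
```

Averaging many measurements cancels most of the random error but leaves the constant bias untouched, which is why the two kinds of error are treated separately.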

The more widely accepted approach/explanation of reliability is through error: the
more error, the greater the unreliability; the less error, the greater the reliability.
Practically speaking, this means that if we can estimate the error variance of a
measure we can also estimate the measure's reliability. This brings us to two
equivalent definitions of reliability:
1. Reliability is the proportion of the true variance to the total obtained
variance of the data yielded by a measuring instrument.
2. Reliability is one minus the proportion of error variance to the total
obtained variance yielded by a measuring instrument.
It is easier to write these definitions in equation form:
$$r_{tt} = \frac{V}{V_t} \qquad (1)$$
$$r_{tt} = 1 - \frac{V_e}{V_t} \qquad (2)$$
and
$$V_t = V + V_e \qquad (3)$$
where $r_{tt}$ is the reliability coefficient,
$V$ is the true variance,
$V_t$ is the obtained (total) variance, and
$V_e$ is the error variance.
[Eq. 1 is theoretical and cannot be used for calculation unless there is an
assumption that the true variance is equal to the population variance. Eq. 2
can be used both to conceptualize the idea of reliability and to estimate the
reliability of an instrument.] An alternate form of Eq. 2 is
$$r_{tt} = \frac{V_t - V_e}{V_t} \qquad (4)$$
This alternate definition of reliability will be useful in helping us to understand
what reliability is.
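
A short simulation may help to show that Eqs. 1, 2 and 4 lead to the same value. It is a sketch under stated assumptions: the true scores and errors are generated artificially (true variance 4, error variance 1, so the expected reliability is 0.8), which is possible only in a simulation because true scores are never observed in practice. All variable names are illustrative.

```python
import random
import statistics

rng = random.Random(42)
n = 20_000

# Assumed population values: true variance 4 (sd 2), error variance 1 (sd 1).
true_scores = [rng.gauss(100, 2) for _ in range(n)]
errors = [rng.gauss(0, 1) for _ in range(n)]
observed = [t + e for t, e in zip(true_scores, errors)]

v_true = statistics.pvariance(true_scores)   # V  (true variance)
v_error = statistics.pvariance(errors)       # Ve (error variance)
v_total = statistics.pvariance(observed)     # Vt (obtained variance)

r_eq1 = v_true / v_total                 # Eq. 1
r_eq2 = 1 - v_error / v_total            # Eq. 2
r_eq4 = (v_total - v_error) / v_total    # Eq. 4

print(f"Eq. 1: {r_eq1:.3f}  Eq. 2: {r_eq2:.3f}  Eq. 4: {r_eq4:.3f}")
```

Eqs. 2 and 4 are algebraically identical. Eq. 1 agrees with them only approximately in a finite sample, because Eq. 3 holds exactly only when true scores and errors are uncorrelated.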
The Interpretation of the Reliability Coefficient:
If r, the coefficient of correlation, is squared, it becomes a coefficient of
determination, that is, it gives us the proportion or percentage of the variance
shared by two variables. If r = .90, then the two variables share (.90)^2 = .81,
or 81 percent, of their total variance in common. The coefficient of
determination has a wider use in regression analysis (R^2). Similarly, the
reliability coefficient is also a coefficient of determination. Theoretically, it tells
how much of the total variance of a measured variable is true
variance. If we had the true scores and could correlate them with the scores
of the measured variable and square the resulting coefficient of correlation, we
would obtain the reliability coefficient.
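
The statement that the squared correlation between true and obtained scores equals the reliability coefficient can be checked with simulated data. The sketch below assumes a true variance of 9 and an error variance of 1, so the expected reliability is 9/10 = 0.9. These values and the helper `pearson_r` are assumptions made for illustration only.

```python
import math
import random


def pearson_r(x, y):
    """Pearson coefficient of correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# The example from the text: r = .90 gives a shared variance of .81.
shared = round(0.90 ** 2, 2)

rng = random.Random(7)
# Assumed values: true variance 9 (sd 3), error variance 1 (sd 1).
true_scores = [rng.gauss(100, 3) for _ in range(20_000)]
observed = [t + rng.gauss(0, 1) for t in true_scores]

r = pearson_r(true_scores, observed)
r_squared = r ** 2   # estimate of the reliability coefficient

print(f"shared variance for r = .90: {shared}")
print(f"squared correlation of true and observed scores: {r_squared:.3f}")
```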

The Value of Reliability:


To be interpretable, a test must be reliable. Unless one can depend upon the
results of the measurement of one's variables, one cannot, with any
confidence, determine the relations between the variables. The goal of
science is to observe the behavior of variables and discover the
relations among them. Since unreliable measurement is measurement
overloaded with error, the discovery of these relations becomes a difficult and
tenuous business. We may consider the following questions. Is an obtained
coefficient of correlation between two variables low because one or both
measures are unreliable? Is an analysis of variance (F ratio) not significant
because the hypothesized relation does not exist or because the measure of
the dependent variable is too crude, too unreliable?
Reliability, while not the most important facet of measurement, is still
extremely important. In a way, this is like the money problem: the lack of it
is the real problem. High reliability is no guarantee of good scientific results,
but there can be no good scientific results without reliability. In brief, reliability
is a necessary but not sufficient condition of the value of research results and
their interpretation.
Validity:
The subject of validity is complex, controversial, and peculiarly important in
behavioral research. Here perhaps more than anywhere else, the nature of
reality is questioned. It is possible to study reliability without inquiring
into the meaning of variables. It is not possible to study validity, however,
without sooner or later inquiring into the nature of one's variables.
When measuring certain physical properties and relatively simple attributes of
persons, validity is not a great problem. There is often rather direct and close
congruence between the nature of the object measured and the measuring
instrument. The length of an object, for example, can be measured by laying
sticks, marked with a standard number system in feet or meters, on the
object. Similarly, weight, height, gender, domicile, income, sales volume, etc.
can be measured directly with specific measures.
On the other hand, suppose an educational scientist wishes to study the
relation between intelligence and school achievement, or between
authoritarianism and teaching style, or between brand preference and
economic status. Now there are no rulers to use, no scales with which to
weigh the degree of authoritarianism, no clear-cut physical or behavioral
attributes that point unmistakably to teaching style or brand preference.
Similarly, suppose one tries to find the relationship between brand loyalty
and educational status or between risk aversion and economic status. The
former in each of the above cases is a purely subjective phenomenon of
consumers/investors. There is no direct way in which it could be measured,
and there are no clear-cut physical or behavioral attributes that measure it
precisely or unmistakably. To take a few more examples, we do not have
direct and precise measurement techniques for intelligence, personality,
industrialization, brand preference, awareness, etc. Thus, it is necessary to
find indirect means to measure these psychological properties or
economic phenomena. These measures are so indirect that the validity of
the measurement and its product is often doubtful. Thus, we may summarize
that validity is a descriptive term used of a measure that accurately reflects
the concept that it is intended to measure. For example, your intelligence
quotient would seem a more valid measure of your intelligence than would the
number of hours you spend in the library. It is important to realize that the
ultimate validity of a measure can never be proven. Yet we may agree to its
relative validity. Researchers distinguish among three types of validity, viz.,
content, criterion-related & construct validity.
(i) Content Validity:
Content validity is the representativeness or sampling adequacy of
the content. Validation is guided by the question: Is the substance
or content of this measure representative of the content or the
universe of content of the property being measured? In empirical
research, content validity refers to the extent to which a measuring
instrument provides adequate coverage of the topic under study.
For example, if the instrument contains a representative sample of
the universe, the content validity is good.
(ii) Criterion-Related Validity:
Criterion-related validity is studied by comparing test or scale scores with one
or more external variables, or criteria, known or believed to measure the
attribute under study. When one predicts success or failure of students from
academic aptitude measures, one is concerned with criterion-related validity.
How well does the test (or tests) predict graduation or grade-point
average? One does not care so much what the test measures as one cares
for its predictive ability. In fact, in criterion-related validation, which is often
practical and applied research, the basic interest is usually more in the
criterion, some practical outcome, than in the predictors. The higher the
correlation between a measure or measures of academic aptitude and the
criterion, say grade-point average, the better the validity. In empirical research
where relationships between variables are studied, criterion-related validity may
relate to the predictive efficiency of the model and the variables used. It is
studied by comparing actual observations with the ones predicted using
external variables. Taking an example from the economy, one may study
investment potential with the help of either the composition of income groups or
the marginal propensity to save (MPS) of a group of people. The question arises
which of the two explanatory variables predicts investment potential more
accurately. In such cases we address the problem of criterion-related validity.
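
The comparison of two candidate predictors against a criterion can be sketched as follows. The figures are entirely hypothetical, invented only to show the mechanics: `investment` stands for the criterion, while `mps` and `income_group_index` stand for the two explanatory variables mentioned above. The predictor with the higher correlation with the criterion would be judged to have the better criterion-related validity.

```python
import math


def pearson_r(x, y):
    """Pearson coefficient of correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data for six groups of people (not real observations).
investment = [10, 12, 15, 18, 20, 25]            # criterion
mps = [0.10, 0.12, 0.16, 0.18, 0.21, 0.24]       # predictor 1
income_group_index = [3, 1, 4, 2, 5, 3]          # predictor 2

validity = {
    "mps": pearson_r(mps, investment),
    "income_group_index": pearson_r(income_group_index, investment),
}

# The predictor that correlates most strongly with the criterion.
best_predictor = max(validity, key=validity.get)

for name, r in validity.items():
    print(f"{name}: r = {r:.2f}")
print("better predictor in this made-up sample:", best_predictor)
```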

(iii) Construct Validity:
Scientifically speaking, construct validity is one of the most significant advances of
modern measurement theory and practice. It is a significant advance
because it unites psychometric notions with theoretical notions. A researcher
generally starts with the constructs or variables entering into the relations. He
has discovered, say, a positive correlation between sales volume and
advertising expenditure. The researcher wants to know why this relationship
exists, what is behind it. To learn why, he must know the meaning of the
constructs entering the relation. Taking another example, we may consider
that price may not play a significant role in the demand for a high-priced
consumer durable; rather, the attitude of people belonging to the high-income
group towards the product could be the determining factor of that
demand. Therefore, a researcher has to ask himself
which variable, price or attitude, could explain the variation in demand for the
above-mentioned good. He also tries to articulate why such a relationship
could exist. Thus, he goes into the meaning of the relationship, which is a
construct validity problem. In other words, proper articulation and valid logic
are associated with construct validity. One can see that construct validation
and empirical scientific inquiry are closely allied. It is not simply a question of
validating a test. One must try to validate the theory behind the test and the
logic behind the relationship. The significant point about construct validity,
which sets it apart from other types of validity, is its preoccupation with theory,
theoretical constructs, logical articulation and scientific empirical inquiry
involving the testing of hypothesized relations.
Note: Participants are required to understand the above three types of
validity by taking/articulating several real-life examples from the
economy/market.
They are advised to go through the relevant chapters on the above concepts
in the following book for a detailed study:
Foundations of Behavioral Research by F. N. Kerlinger.
