MEASUREMENT IN RESEARCH
In our daily life we are said to measure when we use some yardstick to determine
weight, height, or some other feature of a physical object.
We also measure when we judge how well we like a song, a painting or the
personalities of our friends. We, thus, measure physical objects as well as abstract
concepts.
Need for Measurement
Standardization: Measurement scales provide a standardized framework for collecting
and analyzing data in research. They ensure that data is collected consistently and can
be compared across different samples or studies.
Data Organization: Measurement scales aid in organizing and categorizing data. They
provide a structure for classifying variables into different levels or categories, making
it easier to analyze and interpret the data.
Reliability and Validity: Measurement scales play a crucial role in ensuring the
reliability and validity of research findings. Reliable measurement scales produce
consistent results over time, while valid scales accurately measure the intended
constructs or variables.
Levels of Measurement
1. Nominal Scale:
This type of scale allows a researcher to classify characteristics of persons, places
or objects into categories.
It is simply a system of assigning number symbols to events in order to label them.
2. Ordinal Scale:
In this case, the characteristics can be put into categories and the categories also can
be ordered in some meaningful way. The distance between the categories, however, is
unknown.
A student’s rank in class is an example of this scale.
It permits the ranking of items from highest to lowest, but the real difference between
adjacent ranks may not be equal.
It implies a statement of ‘greater than’ or ‘less than’ without our being able to state how
much greater or less.
The median can be used as the measure of central tendency.
Examples:
Socioeconomic Status
1 = Low
2 = Middle
3 = High
Health Status
1 = Poor
2 = Fair
3 = Good
4 = Excellent
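The point about the median can be illustrated with a minimal sketch. The health-status codes are the ones listed above; the list of responses is invented for illustration only. The sketch uses `median_low` from Python's standard library so that the result is always an observed category rather than an average of two ranks.

```python
from statistics import median_low

# Category codes from the health-status example above.
LABELS = {1: "Poor", 2: "Fair", 3: "Good", 4: "Excellent"}

# Hypothetical responses (invented data, not from the text).
responses = [3, 1, 4, 2, 3, 3, 2, 4, 1]

# median_low returns an actual data value, so it can be mapped back to a
# label; no arithmetic on the unequal distances between ranks is needed.
median_code = median_low(responses)
median_label = LABELS[median_code]

print(median_code, median_label)
```

A mean of these codes would treat the gap between "Poor" and "Fair" as equal to the gap between "Good" and "Excellent", which an ordinal scale does not guarantee.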
3. Interval Scale:
Numbers are assigned to objects or events which can be categorized, ordered and
assumed to have an equal distance between scale values.
It has an arbitrary zero but lacks a true (absolute) zero.
It does not have the capacity to measure the complete absence of a trait or
characteristic.
Example: the Fahrenheit or Celsius (centigrade) scale of temperature.
Addition and subtraction are permissible, but not multiplication and division.
Characteristics:
More powerful measurement than ordinal scale as it involves the concept of equality
of interval.
The mean is the appropriate measure of central tendency, and the standard deviation is
the most widely used measure of dispersion.
The product-moment correlation technique can be applied.
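Why ratios are not meaningful on an interval scale can be seen in a short sketch using the temperature example. The two temperatures are chosen arbitrarily for illustration; the conversion formula is the standard Celsius-to-Fahrenheit one.

```python
def c_to_f(celsius: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

low_c, high_c = 20.0, 40.0              # arbitrary illustrative readings
low_f, high_f = c_to_f(low_c), c_to_f(high_c)

# The ratio depends on which arbitrary zero is used, so "twice as hot"
# is not a meaningful statement.
ratio_c = high_c / low_c
ratio_f = high_f / low_f

# Equal intervals stay equal after conversion, so differences are meaningful.
gap_a_f = c_to_f(30.0) - c_to_f(20.0)
gap_b_f = c_to_f(40.0) - c_to_f(30.0)

print(ratio_c, round(ratio_f, 2), gap_a_f, gap_b_f)
```

The two ratios disagree, while the two 10-degree Celsius gaps map to the same Fahrenheit gap, which is what the equality-of-interval property promises.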
4. Ratio Scale:
The most precise level of measurement consists of meaningfully ordered
characteristics with equal intervals between them and the presence of a zero point that
is not arbitrary but determined by nature.
For example, the zero point on a centimeter scale indicates complete absence of
length or height, but absolute zero of temperature is theoretically unobtainable.
Ratio comparisons are possible, e.g. it can be said that 40 kg is four times as heavy as 10 kg.
Reliability
Measurement is said to be reliable when it gives consistent results, i.e. when repeated
measurements of the same thing give the same results.
Reliability is the extent to which the same finding will be obtained if the research is
repeated at another time by another researcher. If the same finding can be obtained
again, the instrument is consistent or reliable.
Reliability refers to the consistency of scores obtained by the same individuals when
re-examined with the same test on different occasions, or with different sets of
equivalent items, or under variable examining conditions.
Test-retest method:
A single form of the test is administered twice to the same sample, with a reasonable
time gap between the two administrations.
It yields two independent sets of scores, and the correlation between them gives the
reliability coefficient, which is also known as the temporal stability coefficient.
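As a minimal sketch of the test-retest computation, the snippet below correlates two sets of scores from the same respondents. The scores are invented, and the Pearson product-moment formula is used here as one common choice of correlation; the text above does not prescribe a specific coefficient.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores of six respondents on two occasions (invented data).
first_administration = [12, 15, 9, 20, 17, 11]
second_administration = [13, 14, 10, 19, 18, 10]

# The correlation between the two occasions serves as the reliability
# (temporal stability) coefficient.
reliability = pearson_r(first_administration, second_administration)
print(round(reliability, 3))
```

A coefficient close to 1 suggests that respondents keep roughly the same relative standing across the two administrations.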
Split-half method:
It indicates the homogeneity of the test. The test is divided into two halves; say, one
set contains the odd-numbered items and the other contains the even-numbered items.
A single administration of the two sets of items to a sample of respondents yields two
sets of scores. A positive and significant correlation between them indicates that the
test is reliable.
The advantage is that data necessary for computation of the reliability coefficient are
obtained in a single administration of the test, and hence variability produced by two
administrations is automatically eliminated.
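The odd/even split described above might be sketched as follows. The item scores are invented (1 = correct, 0 = incorrect), and the Pearson correlation is an assumed choice of coefficient, not something the text specifies.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Rows are respondents, columns are items 1..6 (hypothetical data).
scores = [
    [1, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0],
    [1, 0, 1, 1, 1, 1],
]

# One administration, two half-test totals per respondent.
odd_totals = [sum(row[0::2]) for row in scores]   # items 1, 3, 5
even_totals = [sum(row[1::2]) for row in scores]  # items 2, 4, 6

half_r = pearson_r(odd_totals, even_totals)
print(odd_totals, even_totals, round(half_r, 3))
```

Many textbooks then step this half-test correlation up to full test length with the Spearman-Brown formula, 2r / (1 + r); that correction goes beyond what the text above describes and is mentioned only as a pointer.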
Validity of Measurement
The validity of a measuring instrument is the degree to which it measures what it is
supposed to measure.
The term validity means truth or fidelity. It can be defined as the accuracy with which
an instrument measures what it is intended to measure.
Validity is epitomized by the question: ‘Are we measuring what we think we are
measuring?’ This is very difficult to assess. The following question is typical of
those asked to assess validity issues:
Has the researcher gained full access to the knowledge and meanings of
informants?
A measure cannot be valid unless it is reliable, but a reliable measure may not be
valid.
Content validity
When the items, individually and as a whole, are relevant to what the test is meant to
cover, the test has content validity.
It requires both:
Item validity: concerned with whether the test items represent measurement in the
intended content area, and
Sampling validity: concerned with the extent to which the test samples the total
content area.
Concurrent validity
In this method, a test is correlated with a criterion that is available at the present time.
E.g. a test of dictionary skills can estimate students’ current skills in the actual use of
a dictionary, as checked by observation.
E.g. the Scholastic Aptitude Test (SAT) is valid to the extent that it distinguishes
between students who do well in college and those who do not.
Predictive validity
E.g. a reading readiness test might be used to predict students’ achievement in reading.
Predictive validity is needed for tests that involve long-range forecasts of academic
achievement, industrial management, etc.
Construct validity
It is the extent to which the test may be said to measure a theoretical construct or
trait.
Construct validation is a more complex and difficult process than content validation
and criterion validation.
Construct validity is assessed only when the scope for investigating criterion-related
validity or content validity is bleak.
Practicability
From the operational point of view, the measuring instrument ought to have:
Economy,
Convenience and
Interpretability
The rules for assigning numbers should be standardized and applied uniformly.
Scale Characteristics
Description
By description, we mean the unique labels or descriptors that are used to designate
each value of the scale. All scales possess description.
Order
By order, we mean the relative sizes or positions of the descriptors. Order is denoted
by descriptors such as greater than, less than, and equal to.
Distance
The characteristic of distance means that absolute differences between the scale
descriptors are known and may be expressed in units.
Types of comparative scales are:
1. Paired comparison:
This is a widely used comparative scaling technique.
In this technique, the respondent is asked to pick one object from a pair of objects on
the basis of some criterion.
The respondent makes a series of judgements between objects.
The data obtained are ordinal in nature.
With n brands, n(n-1)/2 paired comparisons are required.
For example: A survey was conducted to find out consumers’ preference for dark
chocolate or white chocolate. The outcome was as follows:
Dark chocolate = 30%
Thus, it is evident that consumers prefer white chocolate over dark chocolate.
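The count of n(n-1)/2 comparisons stated for this technique can be checked with a short sketch. The brand names are placeholders, not taken from the text.

```python
from itertools import combinations

# Hypothetical brand list (placeholder names).
brands = ["Brand A", "Brand B", "Brand C", "Brand D"]

# Every unordered pair is one judgement the respondent must make.
pairs = list(combinations(brands, 2))

n = len(brands)
expected_count = n * (n - 1) // 2

for first, second in pairs:
    print(f"{first} vs {second}")
print(len(pairs), expected_count)
```

Because the number of pairs grows roughly with the square of n, the technique becomes tiring for respondents once the number of brands is large.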
2. Rank order:
In this technique, the respondent judges one item against others.
Respondents are presented with several objects and are asked to rank or order them
according to some criterion.
Rank order scaling is also ordinal in nature.
Only (n-1) scaling decisions need to be made in this technique.
For example: A respondent is asked to rank the following soft drinks:
Drinks          Rank
Pepsi           2
Thumbs Up       1
Mountain Dew    3
Mirinda         4
Products Money
Product A 250
Product B 150
Product C 100
Total 500
4. Q sort:
It is a sophisticated form of rank order.
In this technique, a set of objects is given to an individual to sort into piles according
to specified rating categories.
For example: A respondent is given 10 brands of shampoo and asked to place them in 2
piles, ranging from “most preferred” to “least preferred”.
Pile 1 (most preferred): Clinic Plus, Head & Shoulders, Dove, L’Oreal Paris, Pantene
Pile 2 (least preferred): Sunsilk, TRESemme, Biotique, Himalaya, Patanjali
Note: Generally the most preferred shampoo is placed on the top while the least preferred
at the bottom.
Non-comparative scales:
In non-comparative scales, each object of the stimulus set is scaled independently of the
others. The resulting data are generally assumed to be interval or ratio scaled.
Types of Non-comparative scales are:
a. Likert scale:
This scale requires the respondent to indicate a degree of agreement or disagreement
with each of the statements listed on the left side of the scale.
The analysis is often conducted on an item-by-item basis, or a total score can be
calculated.
When arriving at a total score, the categories assigned to the negative statements by the
respondent are scored by reversing the scale.
For example: A well-known shampoo brand carried out Likert scaling technique to find
the agreement or disagreement for ayurvedic shampoo.
Ayurvedic shampoo helps in maintaining hair    1 2 3 4 5
Ayurvedic shampoo damages hair                 5 4 3 2 1
Ayurvedic shampoo cleans your hair             5 4 3 2 1
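The reverse-scoring rule for negative statements can be sketched as below. The responses are invented, and the sketch assumes that only the "damages hair" statement is negatively worded; which items count as negative is a judgement made by the researcher, not something this code decides.

```python
SCALE_MIN, SCALE_MAX = 1, 5  # five-point agreement scale

def reverse_score(raw: int) -> int:
    """Map 1<->5, 2<->4, 3<->3 for a negatively worded statement."""
    if not SCALE_MIN <= raw <= SCALE_MAX:
        raise ValueError(f"response {raw} is outside the scale")
    return SCALE_MAX + SCALE_MIN - raw

# (statement, raw response, is_negative) -- hypothetical respondent data.
responses = [
    ("Ayurvedic shampoo helps in maintaining hair", 4, False),
    ("Ayurvedic shampoo damages hair", 2, True),
    ("Ayurvedic shampoo cleans your hair", 5, False),
]

# Reverse only the negative statements, then sum to get the total score.
total_score = sum(
    reverse_score(raw) if is_negative else raw
    for _, raw, is_negative in responses
)
print(total_score)
```

After reversal, a high total consistently indicates a favourable attitude, which is what makes the summed score interpretable.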
c. Stapel scale:
It is a unipolar rating scale with 10 categories scaled from -5 to +5.
It does not have a neutral point, that is, zero.
It is represented vertically.
For example: A well-known shoe brand carried out a Stapel scaling technique to find out
customers’ opinion of their product.
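The category structure described for this scale (ten values from -5 to +5 with no zero, shown vertically) can be sketched as follows. The descriptor word is a placeholder; the text does not say which attribute the shoe brand asked about.

```python
# Ten categories from +5 down to -5, skipping the neutral point 0.
stapel_categories = [value for value in range(5, -6, -1) if value != 0]

descriptor = "Comfortable"  # hypothetical attribute being rated

# Vertical layout: positive values above the descriptor, negative below.
for value in stapel_categories:
    if value == -1:
        print(descriptor)
    print(f"{value:+d}")
```

The respondent picks one number to show how well the descriptor fits the product, with no option to stay neutral.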