
Measurement and Scaling Techniques

As quantification has become easier, business research is now mainly concerned with
quantitative methods. This is made possible by expressing the hypotheses associated with a
research problem in quantitative form and then analyzing or testing them using quantitative
tools on the data generated or collected. For this we need high-quality data. If all the
attributes under consideration are directly measurable using scientific instruments, it is easy
to generate accurate quantitative data. For example, price, volume of sales, profit, turnover, growth
rate, etc. are directly measurable and naturally generate valid, reliable and sensitive data. But
in business research, a researcher may have to handle several psychological variables which cannot
be directly measured using standard measuring instruments. Consumer motivation, consumer
intention and consumer perception are some examples from marketing research. In these cases,
since there are no standard measuring instruments, the researcher has to develop his or her own
constructs as measuring instruments. While doing this a researcher should be very careful in
deciding what is to be measured and how it is to be measured. For example, suppose a researcher is
addressing the research question "What motivates a consumer to buy a luxury car?" The
researcher has to uncover the underlying motives and quantify those factors to obtain a measure of the
level of consumer motivation. This can be done in several ways. One researcher may ask a single question
answered on a five-point scale; another may ask a carefully developed set of questions, each
answered on a five-point scale. A third may ask customers to rank a given list of qualities
one by one, and yet another may use a set of yes/no questions.
Please note that the quality of research always depends on the precision of measurement, which in
turn depends on the measurement techniques adopted and how well they quantify the various
dimensions of the attribute. Construction of precise measurement tools requires a careful
conceptual definition, an operational definition and a system of consistent rules for assigning scores
or symbols. Thus, measurement becomes a three-part process:
1. Using the conceptual definition, develop well-defined observable empirical
events corresponding to the various states of nature.
2. Using the operational definition, develop a scheme for assigning numbers or symbols to
represent aspects of the events being measured.
3. Apply the mapping rules to assign scores or numbers to each observation.
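The three steps above can be sketched in a few lines of code. The categories, codes and responses below are entirely hypothetical, purely to show how a mapping rule turns observed events into scores:

```python
# 1. Conceptual definition -> observable empirical events (hypothetical)
observed_events = ["buys weekly", "buys monthly", "never buys"]

# 2. Operational definition -> a scheme assigning numbers to events
mapping_rule = {"buys weekly": 3, "buys monthly": 2, "never buys": 1}

# 3. Apply the mapping rule to each observation
responses = ["buys weekly", "never buys", "buys monthly", "buys weekly"]
scores = [mapping_rule[r] for r in responses]
print(scores)  # [3, 1, 2, 3]
```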
Measurement Levels
Before coming to the construction of measuring instruments, one should have a clear vision of
the nature and qualities of the outcomes that the measuring instrument is going to produce.
Note that the statistical tools used for the analysis depend on the nature of the
data generated. For example, consider the three numbers 2, 3 and 4 generated as outcomes of a
measuring instrument. If these numbers represent the weights of three
commodities, one can find the average weight. On the other hand, if these numbers represent the ranks
of three individuals, there is no meaning in finding the average, but we can use the median. Further, if
these numbers represent the identification numbers of commodities in a shop, there is no meaning
in finding either the mean or the median. Therefore, one should have a clear understanding of the level
of measurement in order to use appropriate statistical tools and techniques. The following are the four
commonly used measurement scales:
• Nominal scale
• Ordinal scale
• Interval scale
• Ratio scale
Nominal Scale: If the numbers in the data represent labels or names used to identify individuals, then
a nominal scale is used. For example, suppose in a survey respondents are classified into three
categories (students, employees and others) and 1 is used to denote students, 2 is used to denote
employees and 3 to denote others. In this case the numbers 1, 2 and 3 are measured on a nominal
scale. Here we cannot compare or add these numbers; the only arithmetic operation possible is
counting. The only statistical measure that can be used here is the mode, but the researcher can form
frequency tables and cross-tabulation tables to study patterns. Although nominal data are
statistically weak, they are very useful in surveys and business research.
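As a small illustration with made-up survey codes, the mode and a frequency table are about all one can compute from nominal data:

```python
from collections import Counter

# Nominal codes from the example above: 1 = student, 2 = employee, 3 = other
labels = {1: "student", 2: "employee", 3: "other"}
responses = [1, 2, 1, 3, 1, 2, 2, 1]   # hypothetical survey responses

# Frequency table: the main summary available for nominal data
freq = Counter(responses)
for code, count in sorted(freq.items()):
    print(f"{labels[code]:9s}: {count}")

# Mode: the only meaningful measure of central tendency here
mode = freq.most_common(1)[0][0]
print("mode:", labels[mode])  # prints "mode: student"
```

Note that averaging the codes (e.g. "mean respondent type = 1.625") would be meaningless, which is exactly the point of the scale hierarchy.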
Ordinal Scale: An ordinal scale includes all the characteristics of a nominal scale plus an indication
of order. For example, in a survey question about consumer perception, suppose each consumer is
asked to judge between 'excellent', 'good' and 'poor'. While coding, suppose 3, 2 and 1 are used to
represent excellent, good and poor respectively. Here the numbers 1, 2 and 3 are measured on an
ordinal scale. Comparison is possible: clearly excellent is better than good, which is better than
poor, so we have 3 > 2 > 1. The median is the appropriate measure of central tendency, correlation
analysis is to be done using rank correlation, and tests of statistical significance are to be done
using non-parametric methods. On an ordinal scale there is no meaning in the interval between
the values: the difference between 3 and 2 need not equal the difference between 2 and 1. The only
thing that can be said is that 3 is superior to 2, which is superior to 1.
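A minimal sketch of Spearman's rank correlation, the kind of correlation analysis appropriate for such ordinal codes. The two judges' ratings below are hypothetical; in practice one would use a statistical package, but the computation is simple enough to show directly:

```python
def ranks(values):
    """Assign average ranks (1 = smallest), handling ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1            # average of 1-based positions i..j
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx) *
           sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den

judge1 = [3, 2, 1, 3, 2]   # hypothetical ratings (3 = excellent ... 1 = poor)
judge2 = [3, 1, 1, 2, 2]
print(round(spearman(judge1, judge2), 3))
```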
Interval Scale: An interval scale has the power of nominal and ordinal scales plus the interval between
two values is meaningful. Time is an example of an interval scale: the elapsed time
between 3 am and 6 am is the same as the elapsed time between 5 am and 8 am. The Centigrade and Fahrenheit
temperature scales are other examples. There is no absolute zero for this scale (note that zero here
is relative: 0 °C is the same as 32 °F, since F = C × 9/5 + 32). When data are measured on an interval scale and
are symmetric about the mode, the arithmetic mean can be used as a measure of central tendency, the
standard deviation as a measure of dispersion, and other statistical tools such as the correlation
coefficient and other parametric methods can be used to analyze the data. If the data are skewed to one side, the median
is to be used as the measure of central tendency and the quartile deviation as the measure of
dispersion.
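The choice of summary statistics can be sketched with Python's standard statistics module. The two samples below are hypothetical interval-scale readings, one roughly symmetric and one skewed:

```python
import statistics

symmetric = [18, 20, 21, 22, 24]   # hypothetical, roughly symmetric data
skewed = [18, 19, 19, 20, 35]      # hypothetical, skewed by one large value

# Roughly symmetric data: mean and standard deviation are appropriate
print(statistics.mean(symmetric), round(statistics.stdev(symmetric), 2))

# Skewed data: prefer the median and the quartile deviation (Q3 - Q1) / 2
q1, q2, q3 = statistics.quantiles(skewed, n=4)
print(statistics.median(skewed), (q3 - q1) / 2)
```

On the skewed sample the mean is pulled toward 35 while the median stays with the bulk of the data, which is why the median is recommended there.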
Ratio Scale: The ratio scale incorporates all the powers of the previous scales plus an
absolute zero, or origin. In this case meaningful ratios of two values exist. Measures of physical
dimensions such as height, weight and area are examples; note that a weight of 80 kg is twice as
heavy as a weight of 40 kg. Interval- and ratio-level data are collected using precise
instruments and are known as quantitative data. All manipulations that can be carried out
with real numbers can be performed on ratio-scale values, and all statistical tools can be used to
analyze ratio-scale data.
Four Levels of Data Measurement: In terms of measurement capacity, nominal, ordinal, interval
and ratio-level data are in ascending order: nominal data are the weakest and
ratio data are the strongest. Please note that higher levels of measurement generally yield more
information. Because of the measurement precision at higher levels, more powerful and sensitive
statistical procedures can be used. Also, when one moves from a higher measurement level to a lower
one, there is always a loss of information. In terms of measurement level, statistical tools and
techniques can be divided into two categories: parametric tools and non-parametric tools.
Spearman's rank correlation, the run test, the Mann-Whitney U test, the Wilcoxon test, etc. are examples of
non-parametric tools, and the Z, t and F tests are examples of parametric tools. Almost all statistical
software packages provide these tools.
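To make the distinction concrete, the U statistic behind the Mann-Whitney test (one of the non-parametric tools listed above) can be computed by hand; statistical packages also report a p-value, which is omitted here, and the satisfaction codes below are invented:

```python
def mann_whitney_u(x, y):
    """U statistic for sample x: the number of (x_i, y_j) pairs with
    x_i > y_j, counting ties as 0.5. Works on ordinal codes because it
    uses only comparisons, never differences."""
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1
            elif a == b:
                u += 0.5
    return u

group_a = [3, 4, 2, 5]   # hypothetical satisfaction codes, branch A
group_b = [1, 2, 3, 2]   # hypothetical satisfaction codes, branch B
print(mann_whitney_u(group_a, group_b))
```

Because only order relations are used, this test is valid for ordinal data, where a parametric t-test (which assumes meaningful differences and, classically, normality) would not be.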
Measurement Errors or Differences: When a measuring instrument is constructed to measure a
psychological attribute, there is every possibility of errors occurring in the resulting outcomes.
Generally, two types of errors can occur: systematic and random errors. Random errors cannot
be controlled but have to be managed using suitable statistical tools. Systematic errors arise due to
bias, and hence one has to identify the source and try to minimize their occurrence. The following
are the identified sources of systematic errors:
1. The respondent 2. The situation 3. The measurer 4. The instrument
Characteristics of Good Measurement: To evaluate the quality of a measuring instrument,
three criteria are generally used. Validity, reliability and practicality are the three criteria for
good measurement.
The three qualities can be easily explained with the help of the example of a weighing machine:
It is said to be valid if it gives the true weight of the object weighed.
It is said to be reliable if it gives the same weight when the same object is weighed at different time
points.
It is said to be practical if there is no operational difficulty in finding the weights.
When it comes to constructs designed to measure psychological attributes, we can elaborate as
follows:
Validity: Validity is the ability of an instrument to measure what it is designed to measure. Even
though it sounds simple, it is very difficult to achieve in real-life situations. Hence researchers
are always concerned about the validity of their measuring instruments. Validity is evaluated
through three basic approaches: content validity, criterion validity and construct validity.
Content Validity: This is a subjective evaluation of the measuring instrument's ability to
measure what it is supposed to measure; it is the extent to which the instrument provides
adequate coverage of the topic under study. Based on the objective of the research, the researcher
identifies the relevant dimensions to be measured and, using his or her logical discretion, constructs the
instrument. A group of experts or researchers then examines the content of the measuring instrument,
with reference to the objective of the research, to judge whether the instrument provides adequate
coverage of the concept. As content validity is subjective in nature, it alone is not a
sufficient criterion for evaluation.
Criterion Validity: Criterion validity is the extent to which a measure is related to an outcome.
It is often divided into concurrent and predictive validity. Concurrent
validity refers to a comparison between the measure in question and an outcome assessed at the
same time; it can also be demonstrated when a test correlates well with a measure that has
previously been validated. Predictive validity, on the other hand, is established by examining the
usefulness of the measure in predicting some future performance. For example, the predictive
validity of a cognitive test for job performance is the positive correlation between test scores
and supervisors' future performance ratings.
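Predictive validity of this kind is quantified with an ordinary Pearson correlation. The test scores and supervisor ratings below are hypothetical, just to show the computation:

```python
def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) *
           sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

test_scores = [55, 60, 70, 80, 90]    # hypothetical selection-test scores
ratings = [3.0, 3.2, 3.8, 4.1, 4.6]   # later supervisor ratings, same people
print(round(pearson(test_scores, ratings), 3))  # near +1 -> good predictor
```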
Construct Validity: To evaluate construct validity we consider the measuring instrument together
with the theory from which it was developed; it assesses how well the instrument measures what it
claims to measure. This is usually evaluated through convergent and discriminant validity. Convergent
validity is established when the scores obtained using the new instrument correlate, or converge,
with the scores on other instruments designed to assess the same or similar constructs. On the same
grounds, discriminant validity is established when the new measuring instrument has low
correlation, or non-convergence, with measures of dissimilar concepts, i.e., those which are
supposed to be uncorrelated with the concept.
Reliability: A measuring instrument is said to be reliable if it supplies consistent results across
time on the various items being measured. Reliability is a necessary contributor to validity, but not a
sufficient condition. For example, suppose a weighing machine measures weight correctly; then it is
both reliable and valid. On the other hand, suppose it always overweighs by six pounds; then the
instrument is reliable but not valid. Note that if an instrument is not reliable, it cannot be valid at all.
Reliable instruments give a researcher confidence that transient and situational factors
are not interfering with the process, and hence that the instrument is robust. An unreliable measuring
instrument is extremely dangerous to a business researcher, as it will generate different responses
from respondents who have the same feeling about the phenomenon. A researcher can adopt the
following three approaches to handle the issue of reliability.
a) Test-Retest Reliability: This can be executed as follows: administer the same questionnaire
to the same set of respondents at two different times and then evaluate the degree of
similarity. The similarity can be assessed by computing the correlation coefficient: higher
correlation implies a higher degree of reliability, and lower correlation a lower degree.
b) Equivalent Forms Reliability: Two equivalent forms are constructed to measure the same
characteristic with different samples of items. Both forms contain the same type of questions
and the same structure, but with some specific differences in wording, sequence, etc.
When applying the two forms of the measurement device, they may be given one after the
other or after a specified time interval. The reliability is established by computing the
correlation coefficient of the results obtained from the two equivalent forms.
c) Internal Consistency Reliability: This is used to assess the reliability of a summated scale in
which several items are summed to form a total score. The split-half technique is used to
measure internal consistency reliability: the components are divided into two
equivalent groups, either on the basis of some pre-defined aspect, such as odd versus even
items, or by splitting the components randomly. Internal consistency reliability is
measured using Cronbach's alpha, or coefficient alpha. Cronbach's alpha is the mean
reliability coefficient over all the different ways of splitting the components. It varies from 0 to 1,
and a value of 0.6 or less is generally considered unsatisfactory, although different authors
interpret it in different ways. There is an argument that its value should be more than 0.7 for a
narrow construct and between 0.55 and 0.7 for a moderately broad construct.
The general guideline is: α ≥ 0.9 excellent; 0.8 ≤ α < 0.9 good; 0.7 ≤ α < 0.8 acceptable;
0.6 ≤ α < 0.7 questionable; 0.5 ≤ α < 0.6 poor; α < 0.5 unacceptable.
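Cronbach's alpha is easy to compute directly from its standard formula, α = k/(k−1) · (1 − Σ item variances / total-score variance). The response matrix below is hypothetical (rows are respondents, columns are the items of a summated scale):

```python
def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondent records, each a list
    of item scores, using sample variances (n - 1 divisor)."""
    k = len(rows[0])                       # number of items

    def variance(values):
        m = sum(values) / len(values)
        return sum((v - m) ** 2 for v in values) / (len(values) - 1)

    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

responses = [            # hypothetical 5-point responses to a 4-item scale
    [4, 5, 4, 4],
    [3, 3, 3, 4],
    [5, 5, 5, 5],
    [2, 2, 3, 2],
    [4, 4, 4, 5],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))   # ≥ 0.9, "excellent" by the guideline above
```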
Practicality: Practicality is to be viewed with reference to economy, convenience and
interpretability. Simply put, the designed construct or instrument should be economical (within
the budget), convenient to use, and should produce interpretable outcomes.
