
Scaling, Reliability and Validity

Scaling Techniques

▪ How do we assign numbers or symbols when measuring
variables?
▪ There are two categories of scales used in survey
instruments to measure variables
▪ Rating scales
▪ Scales with two or more response categories that elicit responses with
regard to the variable being measured
▪ Value is based on specific criteria
▪ Ranking scales
▪ Scales with two or more response categories to measure preferred
choice and ranking among the choices given
▪ Value is based on how one choice compares to another



Rating Scales

▪ The following rating scales are often used in
organisational research
▪ Dichotomous scale
▪ Category scale
▪ Likert scale
▪ Numerical scale
▪ Semantic Differential scale
▪ Itemised rating scale
▪ Fixed or constant sum rating scale



Dichotomous Scale

▪ The dichotomous scale is used to elicit a “Yes” or “No” answer


▪ Examples
▪ Do you smoke? Yes No
▪ Do you own a credit card? Yes No
▪ Do you use glasses to read? Yes No
▪ Do you have an e-mail account? Yes No



Category Scale

▪ The category scale uses multiple items to elicit a single response


▪ Example:
▪ Educational qualification
SPM
Certificate
Diploma
Bachelor degree
Postgraduate degree



Likert Scale

▪ The Likert scale is designed to examine how strongly subjects agree or
disagree with statements on a 5-point scale
▪ Example:

1 = Strongly Disagree, 2 = Disagree, 3 = Neither Agree Nor Disagree,
4 = Agree, 5 = Strongly Agree

My work is challenging          1  2  3  4  5
I think about my work at home   1  2  3  4  5
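
▪ As an illustration, a minimal Python sketch (with hypothetical items and
response values) of how Likert item responses are commonly summed or
averaged into a composite score:

    # Minimal sketch: combining Likert item responses into a composite score.
    # Items and response values below are hypothetical.
    responses = {
        "My work is challenging": 4,           # 1 = Strongly Disagree ... 5 = Strongly Agree
        "I think about my work at home": 2,
    }

    total_score = sum(responses.values())
    mean_score = total_score / len(responses)
    print(f"Total: {total_score}, Mean: {mean_score:.1f}")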



Numerical Scale

▪ The numerical scale is a 5-point or 7-point scale
with bipolar adjectives at both ends
▪ Example:

How satisfied are you with the car you are currently driving?

Extremely Satisfied   1  2  3  4  5  6  7   Extremely Dissatisfied



Semantic Differential Scale

▪ This scale can be constructed by placing bipolar
attributes at extreme ends of the scale and asking
respondents to tick where they think their opinion /
belief lies on the continuum
▪ Example:

Describe the personality of your boss


Cold ___ ___ ___ ___ ___ ___ ___ ___ Warm

Gloomy ___ ___ ___ ___ ___ ___ ___ ___ Cheerful

Self-assured ___ ___ ___ ___ ___ ___ ___ ___ Hesitant



Itemised Rating Scale

▪ A 5-point or 7-point scale with anchors, as needed, is
provided for each item and the respondent states the
appropriate number on the side of each item, or circles
the relevant number against each item
▪ Example

1 = Very Unlikely, 2 = Unlikely, 3 = Neither Likely Nor Unlikely,
4 = Likely, 5 = Very Likely

1) I will apply for a new job in the next 6 months ____

2) I will learn a new skill that may improve my career ____



Fixed or Constant Sum Rating Scale

▪ The respondents are asked to distribute a given number
of points across various items
▪ Example:

In choosing a car, indicate the importance you attach
to each of the following five aspects by allotting points
for each to total 100 in all
Fuel consumption ____ %
Seating capacity ____ %
Horse power / C.C. ____ %
Price ____ %
Potential cost of maintenance ____ %

Total: 100 %
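
▪ A minimal Python sketch (hypothetical point values) of the basic check
a researcher would run on such a response, namely that the allotted
points total exactly 100:

    # Minimal sketch: validating a constant sum response.
    # The point values below are hypothetical.
    allocation = {
        "Fuel consumption": 30,
        "Seating capacity": 10,
        "Horse power / C.C.": 15,
        "Price": 30,
        "Potential cost of maintenance": 15,
    }

    total = sum(allocation.values())
    print("Valid response" if total == 100 else f"Invalid: points total {total}, not 100")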



Ranking Scales

▪ Ranking scales are used to tap preferences between two
objects or among more than two objects or items (ordinal
in nature)



Forced Choice

▪ The forced choice scale enables respondents to rank
objects relative to one another, among the alternatives
provided
▪ Example

Rank the following cars that you would like to own in the order
of preference, assigning 1 for the most preferred choice and 5
for the least preferred
Proton Waja ____
Proton Savvy ____
Proton Gen-2 ____
Proton Satria ____
Proton Wira ____



Paired Comparison

▪ Paired comparison is a scale that asks the respondents to choose
between two objects at a time
▪ For example:
▪ Do you like coffee or tea?
▪ Do you like Coke or Pepsi?
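
▪ Because every object must be paired with every other, the number of
comparisons grows as n(n-1)/2, which makes the method unwieldy when
there are many objects. A minimal Python sketch (hypothetical object
list) that generates the full set of pairs:

    # Minimal sketch: generating all pairings for a paired comparison task.
    # With n objects there are n * (n - 1) / 2 pairs.
    from itertools import combinations

    objects = ["Coffee", "Tea", "Coke", "Pepsi"]
    pairs = list(combinations(objects, 2))

    for a, b in pairs:
        print(f"Do you like {a} or {b}?")
    print(f"{len(pairs)} comparisons for {len(objects)} objects")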



Comparative Scale

▪ The comparative scale provides a benchmark or a point of
reference to assess attitudes toward the current object,
event, or situation under study
▪ Example:

Do you think that the prices of vegetables are cheaper if you
were to buy from a “pasar malam” compared to a hypermarket?

1 = Much Cheaper, 3 = About the Same, 5 = More Expensive



Goodness of Measure

▪ Once an instrument (questionnaire) has been designed
with the appropriate scaling techniques, it is critical to
make sure that the instrument is indeed accurately
measuring the variable and that, in fact, it is actually
measuring the concept it was designed to measure
▪ Before using a measurement instrument, we need to
test the “goodness” of the measure and ensure the
instrument is reliable and valid



Item Analysis

▪ Item analysis is done to see if the items in the instrument belong
there or not
▪ Each item is examined for its ability to discriminate between those
subjects whose total scores are high and those with low scores
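
▪ A minimal Python sketch (hypothetical scores) of this idea: split
respondents into high and low total-score groups and compare each
item's group means; an item whose means barely differ discriminates
poorly:

    # Minimal sketch of item analysis with hypothetical data.
    # Rows = respondents, columns = items (e.g. 5-point ratings).
    import numpy as np

    scores = np.array([
        [5, 4, 5, 4],
        [4, 5, 4, 5],
        [2, 1, 2, 5],
        [1, 2, 1, 4],
    ])

    totals = scores.sum(axis=1)
    high = scores[totals >= np.median(totals)]   # high total scorers
    low = scores[totals < np.median(totals)]     # low total scorers

    # Item 4 barely separates the groups, so it discriminates poorly.
    for i in range(scores.shape[1]):
        print(f"Item {i + 1}: high mean {high[:, i].mean():.2f}, "
              f"low mean {low[:, i].mean():.2f}")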



Reliability

▪ The reliability of a measure indicates the extent to
which it is without bias (error free) and hence ensures
consistent measurement across time and across various
items in the instrument
▪ Reliability of a measure is an indication of the stability and
consistency with which the instrument measures the
concept, and helps to assess the “goodness” of a
measure



Reliability: Stability of Measure

▪ Test-retest reliability
▪ This test generates a reliability coefficient by repeating the
same measure on a second occasion
▪ Respondents are asked to answer the same questionnaire at
two separate times
▪ If their responses are consistent on these two occasions, then the
questionnaire is said to achieve reliability
▪ Parallel-form reliability
▪ When responses on two comparable sets of measures tapping
the same construct are highly correlated, we have parallel-form
reliability
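
▪ Both stability estimates come down to correlating two sets of scores
from the same respondents. A minimal Python sketch (hypothetical scores)
for test-retest reliability; for parallel-form reliability, the same
computation is applied to scores on the two comparable forms:

    # Minimal sketch: test-retest reliability as the correlation between
    # scores from the same respondents on two occasions (hypothetical data).
    from scipy.stats import pearsonr

    time1 = [12, 18, 15, 22, 9, 17]   # scores at first administration
    time2 = [13, 17, 16, 21, 10, 18]  # same respondents, second administration

    r, p_value = pearsonr(time1, time2)
    print(f"Test-retest reliability coefficient: r = {r:.2f}")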



Reliability: Internal Consistency of Measure

▪ Internal consistency of measures is indicative of
the homogeneity of the items in the measure
that tap the construct
▪ The items should “hang together” as a set
▪ Consistency can be examined through the
following
▪ Inter-item consistency reliability
▪ Split-half reliability



Reliability: Internal Consistency of Measure

▪ Inter-item consistency reliability
▪ This is a test of the consistency of respondents’
answers to all the items in a measure
▪ To the degree that items are independent measures
of the same concept, they will be correlated with
one another
▪ The most popular test of inter-item consistency
reliability is Cronbach’s alpha
▪ Split-half reliability
▪ Split-half reliability reflects the correlation between the
two halves of an instrument
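
▪ A minimal Python sketch (hypothetical item scores) of both estimates:
Cronbach’s alpha from the standard formula k/(k-1) x (1 - sum of item
variances / variance of total), and split-half reliability with the
standard Spearman-Brown correction 2r/(1+r):

    # Minimal sketch of internal consistency estimates (hypothetical data).
    # Rows = respondents, columns = items of one measure.
    import numpy as np

    scores = np.array([
        [4, 5, 4, 5],
        [3, 3, 4, 3],
        [5, 4, 5, 5],
        [2, 2, 1, 2],
        [4, 4, 3, 4],
    ], dtype=float)

    # Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of total)
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    alpha = (k / (k - 1)) * (1 - item_vars / total_var)

    # Split-half: correlate the two halves, then apply the Spearman-Brown
    # correction 2r / (1 + r) to estimate full-length reliability.
    first = scores[:, : k // 2].sum(axis=1)
    second = scores[:, k // 2:].sum(axis=1)
    r = np.corrcoef(first, second)[0, 1]
    split_half = 2 * r / (1 + r)

    print(f"Cronbach's alpha: {alpha:.2f}")
    print(f"Split-half reliability (corrected): {split_half:.2f}")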



Validity

▪ Validity ensures the ability of a scale to measure the intended
concept
▪ Internal validity addresses the authenticity of the cause-and-effect
relationships
▪ External validity is concerned with the generalisability to the external
environment



Validity

▪ Validity tests can be grouped under two broad headings:
▪ Content validity (internal)
▪ Criterion-related validity (external)



Validity: Content Validity

▪ Content validity ensures the measure includes an
adequate and representative set of items that tap the
concept
▪ The more the scale items represent the domain or
universe of the concept being measured, the greater
the content validity
▪ Content validity is a function of how well the
dimensions and elements of a concept have been
delineated
▪ Face validity indicates that the items intended to
measure a concept do, on the face of it, look like they
measure the concept



Validity: Criterion-Related Validity

▪ Criterion-related validity is established when the
measure differentiates individuals on a criterion it is
expected to predict
▪ This can be done by establishing concurrent validity or
predictive validity
▪ Concurrent validity is established when the scale
discriminates individuals who are known to be different
▪ Predictive validity indicates the ability of the measuring
instrument to differentiate among individuals with
reference to a future criterion

