
Scaling, Reliability and Validity

Scaling Techniques

▪ How do we assign numbers or symbols when measuring
variables?
▪ There are two categories of scales used in survey
instruments to measure variables
▪ Rating scales
▪ Scales with two or more response categories that elicit responses with
regard to the variable being measured
▪ Value is based on specific criteria
▪ Ranking scales
▪ Scales with two or more response categories to measure preferred
choice and ranking among the choices given
▪ Value is based on how one choice compares to another



Rating Scales

▪ The following rating scales are often used in
organisational research
▪ Dichotomous scale
▪ Category scale
▪ Likert scale
▪ Numerical scale
▪ Semantic Differential scale
▪ Itemised rating scale
▪ Fixed or constant sum rating scale



Dichotomous Scale

▪ The dichotomous scale is used to elicit a “Yes” or “No” answer


▪ Examples
▪ Do you smoke? Yes No
▪ Do you own a credit card? Yes No
▪ Do you use glasses to read? Yes No
▪ Do you have an e-mail account? Yes No



Category Scale

▪ The category scale uses multiple items to elicit a single response


▪ Example:
▪ Educational qualification
SPM
Certificate
Diploma
Bachelor degree
Postgraduate degree



Likert Scale

▪ The Likert scale is designed to examine how strongly subjects agree or
disagree with statements on a 5-point scale
▪ Example:

1 = Strongly Disagree, 2 = Disagree, 3 = Neither Agree Nor Disagree,
4 = Agree, 5 = Strongly Agree

My work is challenging          1  2  3  4  5
I think about my work at home   1  2  3  4  5
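
▪ As an illustration, a minimal Python sketch (with hypothetical items and
response values) of how Likert item responses are commonly summed or
averaged into a composite score:

    # Minimal sketch: combining Likert item responses into a composite score.
    # Items and response values below are hypothetical.
    responses = {
        "My work is challenging": 4,           # 1 = Strongly Disagree ... 5 = Strongly Agree
        "I think about my work at home": 2,
    }

    total_score = sum(responses.values())
    mean_score = total_score / len(responses)
    print(f"Total: {total_score}, Mean: {mean_score:.1f}")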



Numerical Scale

▪ The numerical scale is a 5-point or 7-point scale
with bipolar adjectives at both ends
▪ Example:

How satisfied are you with the car you are currently driving?

Extremely Satisfied   1  2  3  4  5  6  7   Extremely Dissatisfied



Semantic Differential Scale

▪ This scale can be constructed by placing bipolar
attributes at extreme ends of the scale and asking
respondents to tick where they think their opinion /
belief lies on the continuum
▪ Example:

Describe the personality of your boss


Cold ___ ___ ___ ___ ___ ___ ___ ___ Warm

Gloomy ___ ___ ___ ___ ___ ___ ___ ___ Cheerful

Self-assured ___ ___ ___ ___ ___ ___ ___ ___ Hesitant



Itemised Rating Scale

▪ A 5-point or 7-point scale with anchors, as needed, is
provided for each item and the respondent states the
appropriate number on the side of each item, or circles
the relevant number against each item
▪ Example

1 = Very Unlikely, 2 = Unlikely, 3 = Neither Likely Nor Unlikely,
4 = Likely, 5 = Very Likely

1) I will apply for a new job in the next 6 months ____

2) I will learn a new skill that may improve my career ____



Fixed or Constant Sum Rating Scale

▪ The respondents are asked to distribute a given number
of points across various items
▪ Example:

In choosing a car, indicate the importance you attach
to each of the following five aspects by allotting points
for each to total 100 in all
Fuel consumption ____ %
Seating capacity ____ %
Horse power / C.C. ____ %
Price ____ %
Potential cost of maintenance ____ %

Total: 100 %
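
▪ A minimal Python sketch (hypothetical point values) of the basic check
a researcher would run on such a response, namely that the allotted
points total exactly 100:

    # Minimal sketch: validating a constant sum response.
    # The point values below are hypothetical.
    allocation = {
        "Fuel consumption": 30,
        "Seating capacity": 10,
        "Horse power / C.C.": 15,
        "Price": 30,
        "Potential cost of maintenance": 15,
    }

    total = sum(allocation.values())
    print("Valid response" if total == 100 else f"Invalid: points total {total}, not 100")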



Ranking Scales

▪ Ranking scales are used to tap preferences between two
objects or among more than two objects or items (ordinal
in nature)



Forced Choice

▪ The forced choice scale enables respondents to rank
objects relative to one another, among the alternatives
provided
▪ Example

Rank the following cars that you would like to own in the order
of preference, assigning 1 for the most preferred choice and 5
for the least preferred
Proton Waja ____
Proton Savvy ____
Proton Gen-2 ____
Proton Satria ____
Proton Wira ____



Paired Comparison

▪ Paired comparison is a scale that asks the respondents to choose
between two objects at a time
▪ For example:
▪ Do you like coffee or tea?
▪ Do you like Coke or Pepsi?
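
▪ Because every object must be paired with every other, the number of
comparisons grows as n(n-1)/2, which makes the method unwieldy when
there are many objects. A minimal Python sketch (hypothetical object
list) that generates the full set of pairs:

    # Minimal sketch: generating all pairings for a paired comparison task.
    # With n objects there are n * (n - 1) / 2 pairs.
    from itertools import combinations

    objects = ["Coffee", "Tea", "Coke", "Pepsi"]
    pairs = list(combinations(objects, 2))

    for a, b in pairs:
        print(f"Do you like {a} or {b}?")
    print(f"{len(pairs)} comparisons for {len(objects)} objects")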



Comparative Scale

▪ The comparative scale provides a benchmark or a point of
reference to assess attitudes toward the current object,
event, or situation under study
▪ Example:

Do you think that the prices of vegetables are cheaper if you
were to buy from a “pasar malam” compared to a hypermarket?

1 = Much Cheaper, 3 = About the Same, 5 = More Expensive



Goodness of Measure

▪ Once an instrument (questionnaire) has been designed
with the appropriate scaling techniques, it is critical to
make sure that the instrument is indeed accurately
measuring the variable and that, in fact, it is actually
measuring the concept it was designed to measure
▪ Before using a measurement instrument, we need to
test the “goodness” of the measure and ensure the
instrument is reliable and valid



Item Analysis

▪ Item analysis is done to see if the items in the instrument belong
there or not
▪ Each item is examined for its ability to discriminate between those
subjects whose total scores are high and those with low scores
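
▪ A minimal Python sketch (hypothetical scores) of this idea: split
respondents into high and low total-score groups and compare each
item's group means; an item whose means barely differ discriminates
poorly:

    # Minimal sketch of item analysis with hypothetical data.
    # Rows = respondents, columns = items (e.g. 5-point ratings).
    import numpy as np

    scores = np.array([
        [5, 4, 5, 4],
        [4, 5, 4, 5],
        [2, 1, 2, 5],
        [1, 2, 1, 4],
    ])

    totals = scores.sum(axis=1)
    high = scores[totals >= np.median(totals)]   # high total scorers
    low = scores[totals < np.median(totals)]     # low total scorers

    # Item 4 barely separates the groups, so it discriminates poorly.
    for i in range(scores.shape[1]):
        print(f"Item {i + 1}: high mean {high[:, i].mean():.2f}, "
              f"low mean {low[:, i].mean():.2f}")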



Reliability

▪ The reliability of a measure indicates the extent to
which it is without bias (error free) and hence ensures
consistent measurement across time and across various
items in the instrument
▪ Reliability of a measure is an indication of the stability and
consistency with which the instrument measures the
concept, and helps to assess the “goodness” of a
measure



Reliability: Stability of Measure

▪ Test-retest reliability
▪ This test generates a reliability coefficient by repeating the
same measure on a second occasion
▪ Respondents are asked to answer the same questionnaire at
two separate times
▪ If their responses are consistent on these two occasions, then the
questionnaire is said to achieve reliability
▪ Parallel-form reliability
▪ When responses on two comparable sets of measures tapping
the same construct are highly correlated, we have parallel-form
reliability
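
▪ Both stability estimates come down to correlating two sets of scores
from the same respondents. A minimal Python sketch (hypothetical scores)
for test-retest reliability; for parallel-form reliability, the same
computation is applied to scores on the two comparable forms:

    # Minimal sketch: test-retest reliability as the correlation between
    # scores from the same respondents on two occasions (hypothetical data).
    from scipy.stats import pearsonr

    time1 = [12, 18, 15, 22, 9, 17]   # scores at first administration
    time2 = [13, 17, 16, 21, 10, 18]  # same respondents, second administration

    r, p_value = pearsonr(time1, time2)
    print(f"Test-retest reliability coefficient: r = {r:.2f}")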



Reliability: Internal Consistency of Measure

▪ Internal consistency of measures is indicative of
the homogeneity of the items in the measure
that tap the construct
▪ The items should “hang together” as a set
▪ Consistency can be examined through the
following
▪ Inter-item consistency reliability
▪ Split-half reliability



Reliability: Internal Consistency of Measure

▪ Inter-item consistency reliability
▪ This is a test of the consistency of respondents’
answers to all the items in a measure
▪ To the degree that items are independent measures
of the same concept, they will be correlated with
one another
▪ The most popular test of inter-item consistency
reliability is Cronbach’s alpha
▪ Split-half reliability
▪ Split-half reliability reflects the correlation between the
two halves of an instrument
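
▪ A minimal Python sketch (hypothetical item scores) of both estimates:
Cronbach’s alpha from the standard formula k/(k-1) x (1 - sum of item
variances / variance of total), and split-half reliability with the
standard Spearman-Brown correction 2r/(1+r):

    # Minimal sketch of internal consistency estimates (hypothetical data).
    # Rows = respondents, columns = items of one measure.
    import numpy as np

    scores = np.array([
        [4, 5, 4, 5],
        [3, 3, 4, 3],
        [5, 4, 5, 5],
        [2, 2, 1, 2],
        [4, 4, 3, 4],
    ], dtype=float)

    # Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of total)
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    alpha = (k / (k - 1)) * (1 - item_vars / total_var)

    # Split-half: correlate the two halves, then apply the Spearman-Brown
    # correction 2r / (1 + r) to estimate full-length reliability.
    first = scores[:, : k // 2].sum(axis=1)
    second = scores[:, k // 2:].sum(axis=1)
    r = np.corrcoef(first, second)[0, 1]
    split_half = 2 * r / (1 + r)

    print(f"Cronbach's alpha: {alpha:.2f}")
    print(f"Split-half reliability (corrected): {split_half:.2f}")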



Validity

▪ Validity ensures the ability of a scale to measure the intended
concept
▪ Internal validity addresses the authenticity of the cause-and-effect
relationships
▪ External validity is concerned with the generalisability to the external
environment



Validity

▪ Validity tests can be grouped under two broad headings:
▪ Content validity (internal)
▪ Criterion-related validity (external)



Validity: Content Validity

▪ Content validity ensures the measure includes an
adequate and representative set of items that tap the
concept
▪ The more the scale items represent the domain or
universe of the concept being measured, the greater
the content validity
▪ Content validity is a function of how well the
dimensions and elements of a concept have been
delineated
▪ Face validity indicates that the items intended to
measure a concept do, on the face of it, look like they
measure the concept



Validity: Criterion-Related Validity

▪ Criterion-related validity is established when the
measure differentiates individuals on a criterion it is
expected to predict
▪ This can be done by establishing concurrent validity or
predictive validity
▪ Concurrent validity is established when the scale
discriminates individuals who are known to be different
▪ Predictive validity indicates the ability of the measuring
instrument to differentiate among individuals with
reference to a future criterion

