
Criteria for Good Questionnaire Measurement

Validity

• Validity in research refers to the extent to which a tool, such as a
questionnaire, measures what it is intended to measure. It's about the
accuracy and appropriateness of the conclusions that are drawn from the
research.

• There are different types of validity that can be considered when assessing a
questionnaire, such as content validity, criterion validity, and construct
validity.
1. Content Validity
2. Criterion Validity
3. Construct Validity
Content Validity: This refers to the extent to which a measure represents all
facets of a given construct.

Example: Suppose you are designing a questionnaire to assess happiness. If the
questions only cover a person's financial satisfaction but ignore other aspects like
emotional well-being, family relationships, etc., then the content validity of the
questionnaire would be low.

Customer Satisfaction Survey: If a company wants to gauge customer
satisfaction with their online shopping experience, the survey must include
questions that cover all relevant aspects of the experience, such as website
navigation, product selection, payment process, delivery, customer service
interactions, etc. Leaving out any key aspect would diminish content validity.
Criterion Validity
• Criterion validity assesses how well the results of a new test or
questionnaire agree with the results of a well-established test or known
outcomes. Think of it like comparing a new measuring tool to a trusted
standard.

• Concurrent validity: checking whether a new questionnaire gives similar results
to a trusted questionnaire when both are taken at the same time.
• Example: You have a new quiz to test math skills. You give students this quiz
and a well-known math quiz at the same time. If the scores are similar on
both quizzes, the new quiz has good concurrent validity.

• It's like using two different thermometers to check a fever – if they show the
same temperature, they are probably both accurate.
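To make this concrete, here is a minimal Python sketch of how concurrent validity is commonly checked: correlating results on the new quiz with results on the established quiz. The scores are invented for illustration.

```python
import numpy as np

# Hypothetical scores for eight students who took both the new math quiz
# and a well-established math quiz in the same session.
new_quiz = np.array([72, 85, 90, 65, 78, 88, 70, 95])
established_quiz = np.array([70, 88, 92, 60, 75, 90, 68, 97])

# Pearson correlation between the two sets of scores; a strong positive
# correlation is taken as evidence of concurrent validity.
r = np.corrcoef(new_quiz, established_quiz)[0, 1]
print(f"Concurrent validity correlation: r = {r:.2f}")
```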
Construct Validity
• Construct validity is about ensuring that a questionnaire or measurement
tool is accurately measuring the theoretical idea or "construct" it's meant to
measure.
• In simpler terms, it checks whether the questions you're asking truly
capture the concept you're trying to study. Here's a closer look at construct
validity with examples:
If I want to measure Consumer Buying Behavior

• Buying behavior, or consumer behavior, refers to the decision-making
processes and actions of consumers when they purchase goods and services.

• It's a complex interplay of various sub-variables that can be categorized
under psychological, personal, social, and situational factors.

1. Psychological Factors
These are the internal factors that influence buying behavior:
•Motivation: The drive that compels a consumer to fulfill a need.
•Perception: How a consumer views a product or brand.
•Attitudes and Beliefs: A consumer's feelings and convictions related to a product or service.
•Learning: Past experiences and information that influence current purchasing decisions.
If I want to measure Service quality of a Hotel
• Service quality is a critical aspect in the evaluation of customer satisfaction
and loyalty. It's often examined through various dimensions or sub-variables
that capture different facets of the service experience.

• The SERVQUAL model is a widely recognized approach that identifies five
key dimensions of service quality. It was developed by A. Parasuraman,
Valarie Zeithaml, and Leonard Berry.
1. Tangibles:
   1. Appearance: Physical appearance of facilities, equipment, personnel, and
      communication materials.
   2. Facilities: The condition and functionality of physical facilities.
   3. Tools and Equipment: The quality and availability of tools and equipment used in
      service delivery.
2. Reliability:
   1. Consistency: Providing service as promised, dependably and accurately.
   2. Fulfillment: Meeting stated commitments, such as delivery times or pricing.
3. Responsiveness:
   1. Timeliness: Willingness to help customers and provide prompt service.
   2. Customer Support: Availability and accessibility of support, such as through
      helplines or online chat.
   3. Handling Requests and Complaints: Ability to deal with special requests or issues
      efficiently.
4. Assurance:
   1. Competence: Knowledge and courtesy of employees and their ability to convey
      trust and confidence.
   2. Credibility: The trustworthiness, believability, and honesty of the service provider.
   3. Security: The feeling of safety in interactions, including financial and personal
      data protection.
5. Empathy:
   1. Understanding the Customer: Making an effort to understand specific customer
      needs and requirements.
   2. Personalized Attention: Providing individualized attention to customers.
   3. Communication: Keeping customers informed in a language they can understand.
Reliability
• Reliability in research refers to the consistency or stability of a measurement
tool, such as a questionnaire, across different instances of measurement.

• If a tool is reliable, it should produce similar results under consistent
conditions. Reliability is crucial because it underpins the trust that
researchers, practitioners, and policymakers can have in the findings.
1. Test-Retest Reliability
This refers to the consistency of a measure over time.
Example: If you administer a job satisfaction questionnaire to a group of
employees and then administer the same questionnaire to the same group a
week later (assuming no significant changes in their job satisfaction), the
scores should be very similar.

2. Internal Consistency Reliability
This evaluates the extent to which items within a test consistently measure
the same construct.
Example: If you're using a questionnaire to measure anxiety, all the items
should be closely related to the concept of anxiety. Using a statistical measure
like Cronbach's alpha, you can gauge how closely the items correlate with one
another, reflecting the questionnaire's internal consistency.
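Internal consistency is typically quantified with Cronbach's alpha. Below is a minimal Python sketch of the standard formula applied to an invented respondents-by-items matrix of anxiety ratings; values of roughly 0.7 or above are conventionally read as acceptable.

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for a respondents x items matrix of scores."""
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of the summed scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Hypothetical data: six respondents answering four anxiety items on a 1-5 scale.
data = np.array([
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
    [4, 4, 4, 3],
])
print(f"Cronbach's alpha = {cronbach_alpha(data):.2f}")
```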
Equivalent forms reliability

• Equivalent forms reliability, also known as parallel forms reliability, refers to
the consistency of measurement between two different forms of a test.

• These two forms are designed to be as similar as possible in content, difficulty,
and length, but with different sets of questions or items.

• If the two forms yield similar results when administered to the same group of
individuals, then they are considered to have equivalent forms reliability.
Example of Equivalent Forms Reliability

Let's illustrate this concept with an example related to employee
training in a corporation.

Situation: A company wants to assess the effectiveness of a new
training program. They create two different tests (Test A and Test B) to
evaluate the employees' understanding of the material. These tests
cover the same topics, have the same number of questions, and are of
the same difficulty level, but the questions are worded differently.
Procedure:
1.Create Two Tests: The company develops Test A and Test B, each with 50
questions covering the same subject matter but phrased differently.

2.Divide Participants: Employees who have completed the training are divided
into two groups. Group 1 takes Test A, and Group 2 takes Test B.

3.Administer the Tests: Both tests are administered to the respective groups at
the same time, under the same conditions.

4.Compare Results: The scores from Test A and Test B are then compared to
see if they are consistent with each other.
Outcome:
•If the scores from both tests are very similar, it indicates that the
tests are interchangeable, demonstrating equivalent forms reliability.

•If the scores vary widely between the two tests, it might suggest that
the tests are not truly equivalent, and therefore, they lack equivalent
forms reliability.
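As a rough sketch of the comparison in step 4, the two groups' scores can be tested for equivalence. The Python example below uses invented scores and an independent-samples t-test (scipy.stats.ttest_ind); designs that give both forms to the same people would instead correlate the two sets of scores.

```python
import numpy as np
from scipy import stats

# Hypothetical scores (out of 50) for the two groups of trained employees.
test_a_scores = np.array([41, 38, 44, 36, 40, 42, 39, 43])  # Group 1, Test A
test_b_scores = np.array([40, 37, 45, 35, 41, 43, 38, 42])  # Group 2, Test B

# If the forms are equivalent, the two groups should score about the same.
t_stat, p_value = stats.ttest_ind(test_a_scores, test_b_scores)
print(f"Test A mean = {test_a_scores.mean():.1f}, Test B mean = {test_b_scores.mean():.1f}")
print(f"t = {t_stat:.2f}, p = {p_value:.2f}")  # a large p-value gives no evidence the forms differ
```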
Sensitivity

• Sensitivity refers to the ability of a questionnaire to accurately and effectively
capture the nuances or specifics of what it is intended to measure.

• It relates to the questionnaire's capacity to detect variations or changes in
the construct being studied, even when those changes might be subtle. Here
are a couple of examples to illustrate the concept of sensitivity in research
questionnaires:
Example 1: Measuring Customer Satisfaction

Situation: A company wants to gauge customer satisfaction with a new product.
They design a questionnaire to capture various aspects of the customer
experience, such as the quality of the product, ease of use, perceived value, and
overall satisfaction.

Sensitivity Concern: The company wants the questionnaire to be sensitive
enough to detect even slight changes in customer satisfaction over time. A
change in satisfaction might be influenced by subtle factors like a minor update
to the product, a slight change in price, or a small alteration in packaging.
Solution: To ensure sensitivity, the questionnaire includes detailed questions
that explore various dimensions of satisfaction, with Likert scale responses
ranging from "Strongly Disagree" to "Strongly Agree." By including specific and
nuanced questions and a finely graded response scale, the questionnaire
becomes sensitive enough to detect subtle changes in customer satisfaction.
Example 2: Assessing Mental Health Symptoms
Situation: A mental health professional wants to assess the anxiety levels of
patients in therapy. The goal is to have a sensitive measure that can detect even
small changes in anxiety, which could be indicative of progress or problems in
treatment.

Sensitivity Concern: Mental health symptoms like anxiety can be complex and
multifaceted, with small changes in symptoms possibly having significant
implications. A questionnaire that is not sensitive enough might overlook these
subtle changes, leading to a lack of recognition of important shifts in the
patient's condition.
• Solution: The researcher opts for a validated anxiety assessment tool known
for its sensitivity. It includes a variety of questions that tap into different
aspects of anxiety, such as physical symptoms, cognitive patterns, and
behavioral tendencies.
• By assessing a comprehensive array of symptoms and utilizing a graded
response scale, the questionnaire can sensitively detect changes in anxiety
levels over time.
Measurement scales
Single-item scales
• Single-item scales are measurement tools that use just one question or
statement to assess a particular construct or attribute. Unlike multi-item
scales, which gauge a construct through several interrelated questions,
single-item scales aim to capture the essence of what is being measured in a
concise manner. They are often used for simplicity and efficiency, especially
when the construct is clear and unambiguous.
Example 1: Measuring Overall Life Satisfaction
Single-Item Scale: "Overall, how satisfied are you with your life as a whole these
days?"
•Response Options: A 7-point scale ranging from "Completely Dissatisfied" (1) to
"Completely Satisfied" (7).
This question is simple and straightforward, aiming to capture a respondent's
general satisfaction with life in a single query.
Example 2: Assessing Customer Satisfaction with a Product
Single-Item Scale: "How satisfied are you with your recent purchase of [Product
Name]?"
•Response Options: A 5-point scale ranging from "Very Dissatisfied" (1) to "Very
Satisfied" (5).
This question provides a quick snapshot of a customer's satisfaction with a
specific product, without delving into various dimensions like quality, value, or
usability.
Example 3: Evaluating Employee Job Satisfaction
Single-Item Scale: "Overall, how satisfied are you with your current job?"
•Response Options: A 10-point scale ranging from "Not Satisfied At All" (1) to
"Extremely Satisfied" (10).
This question aims to gauge an employee's overall job satisfaction in a single
measure, without dissecting specific aspects like work environment,
relationships with colleagues, or job tasks.
Advantages and Disadvantages of Single-Item Scales

Advantages:
•Simplicity: They are easy to administer and understand.
•Efficiency: They save time, especially when survey length is a concern.
•Usefulness for Clear Constructs: They can be effective for measuring
straightforward and unidimensional concepts.

Disadvantages:
•Lack of Depth: They may miss nuances and subtleties of more complex
constructs.
Multiple choice scales

• Multiple choice scales, often used in questionnaires and surveys, provide
respondents with several predefined options, allowing them to select the
answer that best represents their opinion, preference, or experience.

• They differ from open-ended questions, where respondents are free to
articulate their answers in their own words. Multiple choice scales can take
various forms, including single answer selection or multiple answer selection.
Here are some examples:
Example 1: Single Answer Multiple Choice - Satisfaction with a Product
Question: How satisfied are you with our new smartphone model?
•a) Very satisfied
•b) Satisfied
•c) Neutral
•d) Dissatisfied
•e) Very dissatisfied
Respondents choose one option that best describes their level of satisfaction.

Example 2: Multiple Answer Multiple Choice - Preferred Features of a Product
Question: What features do you value most in a laptop? (Select all that apply)
•a) Battery life
•b) Screen size
•c) Processor speed
•d) Weight
•e) Brand reputation
Respondents can select multiple answers that correspond to their preferences.
Example 3: Multiple Choice with a "None of the Above" Option - Travel
Preferences
Question: What type of vacation do you prefer?
•a) Beach vacation
•b) City exploration
•c) Mountain hiking
•d) Historical site visits
•e) None of the above
Including a "None of the above" option can help account for preferences not
covered in the given choices.
Forced choice ranking

• Forced choice ranking is a type of question format where respondents are
given a set of items and are required to rank them in a specific order based
on certain criteria.

• Unlike other types of scales where respondents can rate each item
independently, forced choice ranking requires them to make a comparative
judgment, effectively "forcing" them to choose a hierarchy among the
options.
Example 1: Ranking Preferences for Vacation Activities
Question: Please rank the following vacation activities from your most preferred (1) to your
least preferred (5):
•a) Beach lounging
•b) Museum visiting
•c) Mountain hiking
•d) City sightseeing
•e) Dining at local restaurants
The respondent might rank them as follows:
•a) 3
•b) 5
•c) 2
•d) 4
•e) 1
This ranking reflects the respondent's preference for dining at local restaurants most and
visiting museums least.
Example 2: Ranking Importance of Job Benefits
Question: Rank the following job benefits by importance to you, with 1 being
the most important and 4 being the least important:
•a) Health insurance
•b) Retirement plan
•c) Paid vacation time
•d) Flexible work schedule
A possible ranking might be:
•a) 1
•b) 3
•c) 4
•d) 2
This order signifies that the respondent values health insurance the most and
paid vacation time the least.
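A simple way to aggregate forced choice rankings across respondents is to average the rank each item received; lower mean ranks indicate greater importance. Here is a small Python sketch with invented rankings for the job-benefits example:

```python
import numpy as np

benefits = ["Health insurance", "Retirement plan", "Paid vacation time", "Flexible work schedule"]

# Hypothetical rankings from four respondents (1 = most important, 4 = least).
# Each row is one respondent's ranking of the four benefits, in the order above.
rankings = np.array([
    [1, 3, 4, 2],
    [1, 2, 4, 3],
    [2, 3, 4, 1],
    [1, 4, 3, 2],
])

# A lower mean rank means the benefit was placed nearer the top more often.
mean_ranks = rankings.mean(axis=0)
for benefit, rank in sorted(zip(benefits, mean_ranks), key=lambda pair: pair[1]):
    print(f"{benefit}: mean rank {rank:.2f}")
```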
Paired comparison technique
• The paired comparison technique is a method used to compare and
evaluate different items or alternatives relative to one another.

• It is often used in various fields, such as marketing, psychology, and
human resource management, to assess preferences, priorities, or
the importance of different attributes.

• The technique involves comparing pairs of items and judging which
one is preferred or ranks higher on some criterion.

• By doing this for all possible pairs, a ranking or preference order can
be derived for the whole set of items.
Example of Paired Comparison Technique
Let's say a company wants to understand what features are most
important to consumers when purchasing a new smartphone. They
decide to use the paired comparison technique and identify the
following five key features:
1.Battery life
2.Camera quality
3.Screen size
4.Processing speed
5.Price
To use the paired comparison technique, each feature will be compared with
every other feature, one pair at a time. For example:
•Battery life vs. Camera quality
•Battery life vs. Screen size
•Battery life vs. Processing speed
•Battery life vs. Price
•Camera quality vs. Screen size
•Camera quality vs. Processing speed
•Camera quality vs. Price
•Screen size vs. Processing speed
•Screen size vs. Price
•Processing speed vs. Price
Participants in the study will be asked to choose their preference in each of these
pairings. After gathering enough responses, the company can create a ranking of
the features based on how often they were chosen over other features.
If the results are as follows:
1.Battery life
2.Camera quality
3.Price
4.Screen size
5.Processing speed
This ranking can be used to inform product development and marketing strategies,
focusing on the aspects most important to the target audience.
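The tallying step can be sketched in a few lines of Python: count how often each feature was chosen across all pairings and sort by wins. The choices below are invented for a single respondent so that the tally reproduces the ranking above; a real study would aggregate over many respondents.

```python
from collections import Counter
from itertools import combinations

features = ["Battery life", "Camera quality", "Screen size", "Processing speed", "Price"]

# Hypothetical winners chosen by one respondent for all ten pairings.
choices = {
    ("Battery life", "Camera quality"): "Battery life",
    ("Battery life", "Screen size"): "Battery life",
    ("Battery life", "Processing speed"): "Battery life",
    ("Battery life", "Price"): "Battery life",
    ("Camera quality", "Screen size"): "Camera quality",
    ("Camera quality", "Processing speed"): "Camera quality",
    ("Camera quality", "Price"): "Camera quality",
    ("Screen size", "Processing speed"): "Screen size",
    ("Screen size", "Price"): "Price",
    ("Processing speed", "Price"): "Price",
}
assert set(choices) == set(combinations(features, 2))  # every pair judged exactly once

# Tally the wins: features chosen more often rank higher.
wins = Counter(choices.values())
for feature in sorted(features, key=lambda f: -wins[f]):
    print(f"{feature}: {wins[f]} wins")
```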
Advantages and Disadvantages
Advantages:
•Simplicity: Easy to understand and apply.
•Sensitivity: Can detect subtle preferences or differences between items.
•Flexibility: Can be applied to various contexts and types of data.
Disadvantages:
•Complexity: The number of comparisons grows quadratically with the number
of items: n items require n(n-1)/2 pairings (5 items need 10; 20 items need 190).
Example: Selecting Teaching Methods
Imagine an educational institution wanting to revamp its teaching methodologies. They've
identified four methods they want to evaluate:
1.Traditional Lecture
2.Interactive Group Work
3.Flipped Classroom
4.Online Learning
They decide to evaluate these methods based on the following criteria:
•Student Engagement
•Information Retention
•Flexibility
•Cost-Effectiveness
Using the paired comparison technique, the decision-makers will compare each method
against every other method using each criterion, one pair at a time. This will result in six
comparisons for each criterion:
•Traditional Lecture vs. Interactive Group Work
•Traditional Lecture vs. Flipped Classroom
•Traditional Lecture vs. Online Learning
•Interactive Group Work vs. Flipped Classroom
•Interactive Group Work vs. Online Learning
•Flipped Classroom vs. Online Learning

Educators, students, and other stakeholders will then be asked to choose their preference in each
of these pairings. After gathering enough responses, the institution can create a ranking of the
methods based on how often they were chosen over other methods.
The results may look something like this:
1.Interactive Group Work
2.Flipped Classroom
3.Online Learning
4.Traditional Lecture
These rankings can help the institution decide on the teaching method that best aligns with their
goals and values, leading to more engaged students and better learning outcomes.
Constant sum scales

• Constant sum scales are a type of measurement scale used in survey
research to understand the relative importance or preference among a set of
attributes, items, or options. Respondents are given a constant sum (usually
100 points) and asked to allocate the sum across various attributes according
to their importance or preference.

• The idea behind this scale is to force respondents to make trade-offs
between the different attributes, ensuring that an increase in importance or
preference for one attribute must correspond to a decrease for another.
Example of Constant Sum Scales
Let's consider an example where a car manufacturer wants to understand the
relative importance of various features for potential customers while purchasing
a new car. The features they want to evaluate are:
1.Fuel Efficiency
2.Safety Features
3.Comfort
4.Design
5.Technology Integration

Using the constant sum scale method, the manufacturer can provide a survey to
potential customers asking them to distribute 100 points among these five
features according to their importance.
A respondent's answer might look like this:
•Fuel Efficiency: 30 points
•Safety Features: 25 points
•Comfort: 20 points
•Design: 15 points
•Technology Integration: 10 points

The total must sum up to 100 points. In this example, the respondent values fuel
efficiency most, followed by safety features, and technology integration least.
By collecting data from several respondents, the car manufacturer can analyze
the average point allocation for each feature and determine what aspects of the
car are most valuable to their target market.
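The analysis is straightforward: verify that each respondent used exactly 100 points, then average the allocation per feature. A minimal Python sketch with invented allocations from three respondents:

```python
import numpy as np

features = ["Fuel Efficiency", "Safety Features", "Comfort", "Design", "Technology Integration"]

# Hypothetical point allocations from three respondents (each row must total 100).
allocations = np.array([
    [30, 25, 20, 15, 10],
    [25, 35, 15, 10, 15],
    [40, 20, 20, 10, 10],
])

# A constant sum response is only valid if the full 100 points were used.
assert (allocations.sum(axis=1) == 100).all(), "each respondent must allocate exactly 100 points"

# The average allocation per feature shows its relative importance to the market.
for feature, avg in zip(features, allocations.mean(axis=0)):
    print(f"{feature}: {avg:.1f} points on average")
```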
Direct quantification scales
• Direct quantification scales, also known as direct rating scales, are a type of
measurement technique used to evaluate objects, items, or attributes by
assigning a numerical value to them.

• Respondents are asked to directly rate the objects on a numeric scale,
typically ranging from 0 to 100 or from 1 to 10. The numerical values
represent the level of preference, importance, satisfaction, or any other
dimension being measured.
Example of Direct Quantification Scales
Customer Satisfaction Survey
Imagine a restaurant that wants to measure customer satisfaction for various
aspects of its service. They decide to use a direct quantification scale ranging
from 1 to 10, where 1 represents "Completely Dissatisfied" and 10 represents
"Completely Satisfied."
Customers are asked to rate their satisfaction on the following aspects:
•Food Quality
•Service Speed
•Waitstaff Friendliness
•Atmosphere
•Overall Experience
A respondent might provide the following ratings:
•Food Quality: 8
•Service Speed: 7
•Waitstaff Friendliness: 9
•Atmosphere: 6
•Overall Experience: 7
The restaurant can then analyze the ratings across all respondents to identify
strengths and areas for improvement. For example, the lower rating for
"Atmosphere" might indicate that enhancements to the dining ambiance could
improve overall customer satisfaction.
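Analytically this amounts to averaging the direct ratings per aspect. A minimal Python sketch with invented ratings from five diners:

```python
# Hypothetical 1-10 satisfaction ratings from five diners, per aspect.
ratings = {
    "Food Quality": [8, 7, 9, 8, 6],
    "Service Speed": [7, 6, 8, 7, 7],
    "Waitstaff Friendliness": [9, 8, 9, 10, 8],
    "Atmosphere": [6, 5, 7, 6, 5],
    "Overall Experience": [7, 7, 8, 8, 6],
}

# Averaging the direct ratings highlights strengths and weak spots; here
# "Atmosphere" surfaces as the lowest-scoring aspect.
for aspect, scores in ratings.items():
    print(f"{aspect}: {sum(scores) / len(scores):.1f}")
```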
Employee Performance Evaluation
Another example could be an HR department evaluating employees on various
performance metrics using a scale from 0 to 100.

Employees might be rated on:
•Job Knowledge: 80/100
•Teamwork: 75/100
•Communication Skills: 70/100
•Adaptability: 85/100
The numerical ratings directly quantify the employees' performance in each
area, providing a clear and easily interpretable measurement.
Q sort scales

• The Q sort method, also known as Q methodology or Q sorting, is a research
technique used to study subjective viewpoints. It combines the strengths of
both qualitative and quantitative research and is often employed in psychology,
sociology, political science, and other fields to explore human subjectivity.

• The Q sort method involves a set of statements related to a specific topic.
Participants are asked to rank or sort these statements according to their
personal opinions or beliefs, usually using a predefined distribution that
resembles a normal distribution.

• Here's how the Q sort method generally works, along with an example to
illustrate its application.
Steps in Q Sort Method
1.Selection of Statements: A comprehensive set of statements that relate to the
subject under study is collected. This set should cover various aspects and
opinions related to the topic.

2.Sorting Process: Participants are given the statements and asked to sort them
into a specific distribution (often a quasi-normal distribution) based on how
much they agree or disagree with each one.

3.Data Analysis: The sorted data are then analyzed quantitatively, often using
factor analysis, to identify common patterns or factors among participants.
Example: Understanding Attitudes Toward Environmental Conservation
Imagine a research study aimed at understanding people's attitudes and beliefs
about environmental conservation.

Step 1: Selection of Statements
A set of 40 statements related to environmental conservation is developed,
covering various aspects such as recycling, energy conservation, climate change,
wildlife protection, etc.
Examples of statements:
•"Recycling should be mandatory for all citizens."
•"Climate change is the most critical issue of our time."
•"Investing in renewable energy is unnecessary."
Step 2: Sorting Process

• Participants are asked to sort these 40 statements into a specific
distribution, such as a quasi-normal distribution with 9 categories ranging
from "Most Disagree" to "Most Agree." They might be provided with a
grid or template to place the statements.

• For instance, a participant may place the statement "Recycling should be
mandatory for all citizens" in the "Most Agree" category, and "Investing in
renewable energy is unnecessary" in the "Most Disagree" category.
Step 3: Data Analysis
The sorted data from all participants are analyzed to identify common patterns
or factors. This might reveal groups of participants with similar beliefs, such as a
group that strongly supports renewable energy and recycling and another group
more skeptical about climate change.
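The distinctive analytic move in Q methodology is factoring the people rather than the items. The Python sketch below uses a tiny invented data set and an eigendecomposition of the person-by-person correlation matrix as a stand-in for a full factor analysis.

```python
import numpy as np

# Hypothetical Q sorts: each row is one participant's placement of six statements
# on a -2..+2 agreement grid (a real study would use ~40 statements and 9 columns).
sorts = np.array([
    [ 2,  1,  0, -1, -2,  0],   # participant 1
    [ 2,  0,  1, -2, -1,  0],   # participant 2 (similar viewpoint to 1)
    [-2, -1,  0,  2,  1,  0],   # participant 3 (opposing viewpoint)
    [-1, -2,  0,  1,  2,  0],   # participant 4 (similar to 3)
])

# In Q methodology it is the persons, not the items, that are correlated
# and factored: a person-by-person correlation matrix.
person_corr = np.corrcoef(sorts)

# Eigendecomposition as a stand-in for factor analysis: a large leading
# eigenvalue indicates a shared viewpoint, and the loadings group
# like-minded participants (here, 1-2 versus 3-4).
eigvals, eigvecs = np.linalg.eigh(person_corr)
print(f"Largest eigenvalue: {eigvals[-1]:.2f}")
print("Leading factor loadings:", eigvecs[:, -1].round(2))
```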

Advantages:
•Comprehensive: Captures a wide range of subjective opinions and beliefs.
•Combines Qualitative and Quantitative: Enables both rich qualitative insights
and rigorous quantitative analysis.
•Flexibility: Can be applied to various subjects and populations.
Disadvantages:
•Complexity: Requires careful design and analysis.
Summated scaling technique

• The summated scaling technique, the best-known example of which is the
Likert scale, is a widely used method in survey research to measure attitudes,
opinions, or beliefs. It consists of a series of statements or questions, and
respondents are asked to indicate their level of agreement or disagreement
with each statement on a predetermined scale.

• The responses are then summed to create a composite score, reflecting the
individual's overall attitude or perception of the subject being studied.
Example of Summated Scaling Technique
Employee Job Satisfaction Survey
Imagine a company that wants to measure its employees' overall job
satisfaction. They decide to use the summated scaling technique and create a
survey with the following five statements, each measured on a 5-point scale
ranging from 1 (Strongly Disagree) to 5 (Strongly Agree):
1."I feel satisfied with my current position."
2."My supervisor provides me with the support I need."
3."I have opportunities for professional growth within the company."
4."My workload is reasonable and manageable."
5."I feel fairly compensated for my work."
Respondents are asked to indicate their level of agreement or disagreement with
each statement, and their responses might look like this:
•Statement 1: Agree (4)
•Statement 2: Neutral (3)
•Statement 3: Strongly Agree (5)
•Statement 4: Disagree (2)
•Statement 5: Agree (4)

The summated score for this respondent would be 4 + 3 + 5 + 2 + 4 = 18.
By analyzing the summated scores across all respondents, the company can
identify overall trends in job satisfaction and specific areas that may need
improvement.
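Computing the composite is simple, as the Python sketch below shows with invented responses. One practical caution: if a scale includes negatively worded items, those must be reverse-scored (on a 5-point scale, 6 minus the response) before summing.

```python
# Hypothetical 5-point Likert responses (1 = Strongly Disagree ... 5 = Strongly Agree)
# to the five job-satisfaction statements, for three employees.
responses = [
    [4, 3, 5, 2, 4],   # sums to 18: the respondent from the example above
    [5, 4, 4, 3, 5],
    [2, 2, 3, 1, 2],
]

# The composite score is the sum of the item responses; with five items it
# can range from 5 (lowest satisfaction) to 25 (highest).
for i, answers in enumerate(responses, start=1):
    print(f"Respondent {i}: summated score = {sum(answers)}")
```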
Factors in selecting an appropriate measurement scale

1. Research Objectives
•Example: A company wants to measure customer satisfaction with its
products.
•Selection Factor: A Likert scale may be chosen to gauge customers'
satisfaction levels, as it is designed to measure attitudes and can provide
detailed insights into customer opinions.

2. Level of Measurement
•Example: A health study is collecting data on types of physical activity.
•Selection Factor: A nominal scale might be used, as the activities (e.g.,
running, swimming, cycling) can be categorized without a natural order or
ranking.
3. Complexity of the Concept
•Example: Researching the multifaceted concept of job satisfaction.
•Selection Factor: A composite scale combining different types of questions
(Likert, open-ended) might be chosen to capture the various dimensions of job
satisfaction, such as workload, relationships, compensation, etc.

4. Respondent Characteristics
•Example: Conducting a survey with children to understand their favorite school
subjects.
•Selection Factor: A simple visual scale with pictures or emojis might be chosen,
recognizing that children might struggle with more complex or text-heavy scales.
5. Reliability and Validity
•Example: A longitudinal study tracking mental health over time.
•Selection Factor: A standardized and validated psychological well-being scale
might be used to ensure consistent and accurate measurements across different
time points.

6. Sensitivity
•Example: Measuring subtle differences in taste preferences among different
types of coffee.
•Selection Factor: A finely graded scale, perhaps a 10-point scale, could be
chosen to capture subtle variations in taste perception that might be missed
with a coarser scale.
7. Practical Considerations
•Example: A quick poll to gauge public opinion on a topical issue.
•Selection Factor: A simple dichotomous scale (e.g., Agree/Disagree) might be
chosen for quick and easy administration and analysis.

8. Ethical Considerations
•Example: Surveying diverse populations about sensitive topics, such as
religious beliefs.
•Selection Factor: Careful choice of non-offensive language and consideration
of cultural norms might guide the selection of a scale that respects respondents'
sensitivities.
9. Previous Research and Standards
•Example: A study that compares results with previous research on smoking
habits.
•Selection Factor: Using the same scale as the earlier studies ensures
comparability and consistency in the results.
