Research Methodology Notes-1

The document provides an overview of research methodology, detailing the meaning, significance, types, and processes of research. It emphasizes the importance of scientific research in decision-making, highlighting aspects such as objectivity, reliability, and risk reduction. Additionally, it outlines various research designs and the measurement scales used in data collection and analysis.

Uploaded by

Subhasis Chakra
Copyright
© All Rights Reserved

MODULE I

Introduction to Research Methodology (RM)


Meaning of Research

1. Research is a systematic, organized effort to investigate a specific problem, collect data,
analyze information, and reach valid conclusions.
2. The word research literally means “to search again”, implying a careful, persistent
investigation.
3. It aims to discover new facts, verify existing knowledge, and develop new theories or
applications.

Significance of Research

 Expands knowledge

1. Research systematically adds to the body of human knowledge.


2. It uncovers new facts, relationships, and insights that were previously unknown.
3. By investigating unanswered questions, it helps correct misconceptions and refines
existing ideas.
4. Example: Discovering a new species in biology expands scientific understanding of
biodiversity.

 Solves practical problems

1. Research offers evidence-based solutions to real-world issues.


2. It helps design interventions, tools, or technologies to overcome challenges in health,
education, engineering, agriculture, and many other fields.
3. By applying rigorous methods, it ensures that solutions are reliable and replicable.
4. Example: Medical research leading to vaccines for infectious diseases.

 Aids in theory development

1. Research supports the creation and refinement of theories that explain how or why
things happen.
2. It tests hypotheses, validates existing theories, or builds new frameworks to explain
observed phenomena.
3. Theories developed through research guide further studies and provide foundations
for practice.
4. Example: Social research that builds theories on human motivation or learning.

 Improves practices and processes

1. Research identifies more efficient, effective, or ethical ways of doing things.


2. It evaluates current practices, identifies weaknesses, and recommends improvements.
3. This is especially important in sectors like education, healthcare, business, or
manufacturing.
4. Example: Educational research leading to better teaching strategies to improve
student outcomes.

 Supports policy-making and planning

1. Research generates data and evidence that inform public policies, laws, and
regulations.
2. Policymakers rely on research findings to make informed decisions and plan
interventions that benefit society.
3. Sound evidence helps avoid guesswork and improves transparency and
accountability.
4. Example: Economic research guiding minimum wage policies or climate research
informing environmental laws.

Importance of Scientific Research in Decision Making


 Objectivity:

1. Scientific research uses systematic and unbiased methods to gather evidence.


2. This helps decision-makers base their choices on facts, rather than opinions, emotions, or
political pressures.
3. Objectivity ensures fairness and credibility in decisions, improving public trust.
4. Example: Using randomized controlled trials to evaluate a new drug avoids subjective
bias.

 Reliability:

1. Scientific methods produce results that are repeatable and verifiable, increasing
confidence in the evidence used for decisions.
2. Reliable research findings help reduce errors or failures in policies and strategies.
3. Decisions based on reliable data are more likely to succeed and gain stakeholder support.
4. Example: Consistent economic indicators guiding central bank monetary policy.

 Risk Reduction:

1. Scientific research helps identify potential risks before actions are taken, enabling better
planning and preparedness.
2. It provides evidence to assess possible negative consequences and ways to mitigate
them.
3. This prevents costly mistakes and improves safety and sustainability.
4. Example: Environmental impact studies reducing risks of large construction projects.

 Predictive Power:

1. Scientific research can develop models and forecasts to predict future outcomes and
trends.
2. Predictive capability supports proactive rather than reactive decision-making.
3. It enables organizations and governments to plan long-term strategies with greater
confidence.
4. Example: Climate models predicting rainfall patterns to support agricultural planning.

 Resource Optimization:

1. Research provides data to allocate resources efficiently, avoiding waste.


2. By identifying priority areas and the most effective interventions, decision-makers
maximize the impact of limited budgets and time.
3. This supports cost-effective and targeted strategies.
4. Example: Health research showing which vaccines deliver the best public health results
for the investment.

 Accountability:

1. Decisions backed by scientific research can be defended and justified transparently,
increasing accountability.
2. Stakeholders (public, investors, policy makers) can trace decisions to evidence rather
than arbitrary choices.
3. This builds trust and supports ethical governance.
4. Example: Public health authorities citing peer-reviewed data to justify pandemic
restrictions.

Types of Research

1. Basic (Pure) Research:

I. It is also called fundamental research.


II. Conducted to generate new knowledge and advance theoretical understanding
without immediate practical application.
III. Driven by curiosity and intellectual interest.
IV. Example: Studying the behavior of subatomic particles in physics.

2. Applied Research:

I. Focuses on practical problem-solving.


II. Aims to find solutions that can be directly applied to real-world challenges.
III. Often uses results from basic research to address specific needs.
IV. Example: Developing a new drug to treat a disease.

3. Descriptive Research:

I. Seeks to describe characteristics of a phenomenon or population systematically and
accurately.
II. Answers “what” questions, rather than “why” or “how.”
III. Often involves surveys, observations, or case studies.
IV. Example: Surveying the literacy rate in a rural community.
4. Analytical Research:

I. Goes beyond describing: it analyzes information to understand relationships
and causes.
II. Uses data that is already available to draw conclusions or test hypotheses.
III. Often includes critical evaluation and comparison.
IV. Example: Analyzing crime records to determine factors influencing crime rates.

5. Exploratory Research:

I. Conducted to explore a relatively unknown area or gain new insights where
little previous research exists.
II. Helps identify variables, questions, or hypotheses for future studies.
III. Flexible and open-ended in design.
IV. Example: Interviewing startup founders to explore challenges in a new industry.

6. Experimental Research:

I. Involves manipulating variables to test cause-and-effect relationships under
controlled conditions.
II. Often includes experimental and control groups.
III. Often considered the most rigorous approach for establishing cause and effect.
IV. Example: Testing the effectiveness of a new fertilizer on crop yields in a
controlled field trial.

7. Quantitative Research:

I. Involves numerical data collection and statistical analysis.


II. Seeks to measure variables and test hypotheses objectively.
III. Results are often presented in graphs, tables, or charts.
IV. Example: Using a questionnaire to measure levels of stress among students and
analyzing it statistically.
8. Qualitative Research:

I. Focuses on understanding meanings, experiences, and perspectives in a
deeper, non-numerical way.
II. Uses interviews, observations, and textual analysis.
III. Generates rich, detailed data about human behavior or culture.
IV. Example: Conducting in-depth interviews to explore how cancer patients cope
emotionally.

Research Process

 Formulating the research problem

1. The first and most crucial step.


2. Involves identifying and clearly defining the issue, question, or gap the research aims
to address.
3. A well-formulated problem provides direction and focus for the entire study.
4. Should be specific, feasible, and researchable.
5. Example: Investigating the reasons for high dropout rates among rural high school
students.

 Reviewing literature

1. A systematic examination of existing studies, theories, and data related to the research
topic.
2. Helps understand what is already known, where gaps exist, and how the current study can
contribute.
3. Prevents duplication of effort and builds a theoretical framework for the research.
4. Example: Reading studies on student motivation, family background, and school
resources in relation to dropout.

 Developing objectives and hypotheses

1. Objectives state what the research aims to achieve, in clear and measurable terms.
a. Example: “To identify socio-economic factors influencing school dropout rates.”
2. Hypotheses are testable statements predicting relationships between variables.
a. Example: “Students from low-income families are more likely to drop out.”
3. Together, they guide the direction and scope of the research.
 Designing the research (methodology)

1. Involves deciding how the research will be carried out, including:
a. Research type (qualitative, quantitative, mixed)
b. Sampling techniques
c. Tools for data collection (questionnaires, interviews, experiments, etc.)
d. Ethical considerations
e. Data analysis plan
2. A well-planned design ensures validity, reliability, and feasibility.

 Collecting data

1. Executing the planned data-gathering process.


2. May involve surveys, interviews, experiments, observations, or secondary data sources.
3. Data collection must follow ethical guidelines, maintain accuracy, and protect
participants’ privacy.
4. Example: Administering a questionnaire to rural students on reasons for school leaving.

 Analyzing data

1. Transforming raw data into meaningful results using statistical or qualitative analysis
methods.
2. Includes coding, organizing, testing hypotheses, identifying patterns, and interpreting
results.
3. Analysis links the data back to the research questions and objectives.
4. Findings are usually presented with tables, charts, narratives, or models.
5. Example: Calculating the percentage of dropouts linked to family income levels.
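
The dropout example above can be sketched in a few lines of Python. The records, the income labels, and the function name `dropout_rate_by_group` are all hypothetical and serve only to show the arithmetic of the analysis step.

```python
from collections import Counter

# Hypothetical records: (family income level, dropped out?)
records = [
    ("low", True), ("low", True), ("low", False), ("low", True),
    ("middle", False), ("middle", True), ("middle", False),
    ("high", False), ("high", False), ("high", False),
]

def dropout_rate_by_group(rows):
    """Return {group: percentage of students in that group who dropped out}."""
    totals = Counter(group for group, _ in rows)
    dropouts = Counter(group for group, dropped in rows if dropped)
    return {g: 100.0 * dropouts[g] / totals[g] for g in totals}

rates = dropout_rate_by_group(records)
for group, pct in rates.items():
    print(f"{group}: {pct:.1f}% dropped out")
```

A table of such percentages describes a pattern in the sample; whether income actually causes dropout is a separate question for hypothesis testing and research design.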

 Interpreting results

 Reporting and presenting findings

Identification of Research Problem

 Involves recognizing an area where knowledge is lacking


 Should be specific, researchable, and feasible
 Criteria:
o Relevance
o Clarity
o Novelty
o Practical feasibility

Formulation of Hypothesis

 A tentative statement predicting the relationship between variables


 Must be testable and measurable
 Example: “Higher advertising expenditure increases sales.”

Types of hypotheses:

 Null hypothesis (H0): No relationship exists

Definition:

The null hypothesis is a statement that there is no effect, no difference, or no relationship
between variables. It represents the default assumption.

Purpose:

It is the hypothesis we attempt to reject through statistical testing; the data can provide
evidence against it but cannot prove it true.

 Examples:

A new drug has no effect on patients:
H₀: The drug has no effect (mean difference = 0)

A coin is fair:
H₀: The probability of heads = 0.5

 Alternative hypothesis (H1): Relationship exists

Definition: The alternative hypothesis is the statement the researcher seeks evidence for. It
suggests that there is an effect, a difference, or a relationship.

Purpose: If the data lead us to reject the null hypothesis, we conclude in favor of the
alternative hypothesis.

Examples:

 The drug does have an effect:
H1: The drug has an effect (mean difference ≠ 0)
 The coin is biased:
H1: Probability of heads ≠ 0.5
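
As a rough illustration of the coin example, the sketch below computes an exact two-sided binomial p-value using only the Python standard library. The observed data (60 heads in 100 flips), the 0.05 significance level, and the function name `binomial_two_sided_p` are assumptions chosen for the example, not part of the notes.

```python
from math import comb

def binomial_two_sided_p(heads, flips, p0=0.5):
    """Exact two-sided p-value: total probability, under H0, of every
    outcome that is no more likely than the observed one."""
    def pmf(k):
        return comb(flips, k) * p0**k * (1 - p0)**(flips - k)
    observed = pmf(heads)
    # Small tolerance guards against floating-point ties.
    return sum(pmf(k) for k in range(flips + 1) if pmf(k) <= observed * (1 + 1e-9))

alpha = 0.05  # significance level (an assumed convention)
p_value = binomial_two_sided_p(heads=60, flips=100)  # hypothetical data
print(f"p-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

The wording of the result matters: a large p-value means the data are compatible with H0, not that H0 has been shown to be true.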

Research Designs

 A blueprint for conducting the study


 Specifies methods of data collection, measurement, and analysis
 Ensures validity and reliability of results

Types of research designs:

1. Exploratory: Flexible, preliminary studies

✅ Purpose: To explore a problem or phenomenon when there is little prior knowledge
available.
✅ Features: Flexible, open-ended, aims to generate insights, hypotheses, or new research
questions rather than testing a theory.
✅ Methods: Literature reviews, focus groups, in-depth interviews, pilot studies.
✅ Example: Studying why a new consumer trend is emerging without predefined
hypotheses.

2. Descriptive: Structured, describes characteristics

✅ Purpose: To describe characteristics of a population, situation, or phenomenon.


✅ Features: Structured, uses clear definitions and measurement tools; answers “what”
rather than “why”.
✅ Methods: Surveys, observational studies, case studies.
✅ Example: Determining the demographic profile of smartphone users in a city.

3. Experimental: Tests cause-effect relationships

✅ Purpose: To test cause-and-effect relationships under controlled conditions.


✅ Features: Manipulates independent variables and observes effects on dependent variables,
usually with random assignment to groups.
✅ Methods: Laboratory experiments, field experiments, randomized controlled trials (RCTs).
✅ Example: Testing the effectiveness of a new teaching method on student performance.
4. Diagnostic: Identifies reasons for a problem

✅ Purpose: To determine the causes of a problem or to identify solutions.


✅ Features: Seeks to diagnose an issue, its sources, and possible remedies.
✅ Methods: Root cause analysis, case analysis, stakeholder interviews.
✅ Example: Investigating why employee turnover is high in an organization.

5. Cross-sectional: Observes at a single point in time

✅ Purpose: To study a population or phenomenon at one point in time.


✅ Features: Snapshot approach, often descriptive or correlational; cannot establish causality over
time.
✅ Methods: One-time surveys, observational studies.
✅ Example: A one-time survey measuring health behaviors among teenagers in 2025.

6. Longitudinal: Observes over a period of time

✅ Purpose: To study changes and developments over time.


✅ Features: Data is collected from the same subjects repeatedly over a period, enabling analysis
of trends and patterns and giving stronger (though not conclusive) evidence about causal relationships.
✅ Methods: Panel studies, cohort studies, repeated surveys.
✅ Example: Tracking the academic progress of students from grade 1 to grade 12.

MODULE-II
1. Measurement and Data Collection

 Measurement is the process of assigning numbers, symbols, or labels to characteristics
or variables of phenomena according to specific rules. It is essential for converting
abstract concepts (like intelligence, satisfaction, or performance) into observable and
quantifiable data for analysis.

Objectives of Measurement

 To quantify variables for comparison and statistical analysis


 To ensure consistency and accuracy in research
 To test hypotheses and validate theoretical models
 To enable replication of research studies

Key Concepts in Measurement

Concept | Description
Variable | A characteristic or attribute that can vary among subjects (e.g., age, income, satisfaction)
Construct | An abstract concept developed for scientific study (e.g., motivation, intelligence)
Operational Definition | How a variable or construct is measured or manipulated in a specific study

 Levels (Scales) of Measurement

Understanding the level of measurement helps determine the appropriate statistical tools to
use.

1. Nominal Scale

The nominal scale classifies data into distinct categories that are mutually exclusive and
without any inherent order.

Characteristics:

 Only labels or names are assigned


 No ranking or ordering
 Used for categorical variables

Examples:

 Gender: Male, Female, Other


 Blood Type: A, B, AB, O
 Religion: Hindu, Muslim, Christian, Sikh

Statistical Use:

 Frequency counts
 Mode
 Chi-square test
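
A minimal sketch of the three nominal-scale statistics listed above, using made-up blood-type responses. The data, the equal-share null hypothesis, and the helper name `chi_square_statistic` are assumptions for illustration. With only 12 responses the expected counts are too small for a real test, so this shows the arithmetic only.

```python
from collections import Counter

def chi_square_statistic(observed, expected_share):
    """Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E."""
    n = sum(observed.values())
    return sum(
        (observed.get(cat, 0) - n * share) ** 2 / (n * share)
        for cat, share in expected_share.items()
    )

# Hypothetical nominal data: blood types of 12 respondents.
blood_types = ["A", "O", "O", "B", "A", "O", "AB", "O", "A", "B", "O", "A"]
counts = Counter(blood_types)           # frequency counts
mode = counts.most_common(1)[0][0]      # most frequent category

# Assumed null hypothesis for illustration: all four types equally common.
equal_share = {"A": 0.25, "B": 0.25, "AB": 0.25, "O": 0.25}
stat = chi_square_statistic(counts, equal_share)
print(dict(counts), mode, round(stat, 3))
```

Turning the statistic into a p-value requires the chi-square distribution with (number of categories − 1) degrees of freedom, which is left out here.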

2. Ordinal Scale

The ordinal scale involves rank-ordering items, but does not assume equal spacing between
the ranks.
Characteristics:

 Categories are ordered


 Intervals are not equal or known
 Indicates relative position

Examples:

 Customer Satisfaction: Very Satisfied > Satisfied > Neutral > Dissatisfied
 Education Level: High School < Bachelor's < Master's < PhD
 Military Rank: Private < Sergeant < Captain

Statistical Use:

 Median
 Percentiles
 Non-parametric tests (e.g., Mann-Whitney U test)
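
The sketch below shows one way to take a median from ordinal responses: map each label to its rank, find the middle rank, and map back to a label. The responses and the rank coding are hypothetical. The rank numbers carry order only, so averaging them would assume equal spacing that the scale does not provide.

```python
from statistics import median_low

# Ordered categories; the position in the list is the rank (an assumed coding).
LEVELS = ["Dissatisfied", "Neutral", "Satisfied", "Very Satisfied"]
RANK = {label: i for i, label in enumerate(LEVELS)}

# Hypothetical survey answers.
responses = ["Satisfied", "Neutral", "Very Satisfied", "Satisfied",
             "Dissatisfied", "Satisfied", "Neutral"]

ranks = sorted(RANK[r] for r in responses)
# median_low always returns a value that was actually observed,
# so the result maps back to a real category.
median_label = LEVELS[median_low(ranks)]
# Share of respondents at or above "Satisfied" (a percentile-style summary).
share_at_least_satisfied = sum(r >= RANK["Satisfied"] for r in ranks) / len(ranks)

print(median_label, round(share_at_least_satisfied, 2))
```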

3. Interval Scale

The interval scale has ordered categories with equal intervals between values, but no true
zero point.

Characteristics:

 You can measure differences, but not true ratios


 Zero point is arbitrary

Examples:

 Temperature in Celsius or Fahrenheit (0°C ≠ no temperature)


 IQ scores
 Dates on a calendar

Statistical Use:

 Mean and standard deviation


 Correlation and regression
 t-tests and ANOVA
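
To see why ratios are not meaningful on an interval scale, the short sketch below expresses the same two temperatures in Celsius and Fahrenheit. The two temperature values are arbitrary examples.

```python
def celsius_to_fahrenheit(c):
    """Standard unit conversion: F = C * 9/5 + 32."""
    return c * 9 / 5 + 32

a_c, b_c = 10.0, 20.0  # two example temperatures in Celsius
a_f, b_f = celsius_to_fahrenheit(a_c), celsius_to_fahrenheit(b_c)

# Differences survive a change of units up to a constant factor (9/5)...
diff_factor = (b_f - a_f) / (b_c - a_c)
# ...but the ratio depends on where the arbitrary zero sits.
ratio_c = b_c / a_c
ratio_f = b_f / a_f

print(diff_factor, ratio_c, ratio_f)
```

The same pair of temperatures gives a ratio of 2 in Celsius but a different ratio in Fahrenheit, so "twice as hot" has no fixed meaning on either scale.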

4. Ratio Scale

The ratio scale includes all the properties of interval data plus a true zero, which allows for
meaningful ratios.
Characteristics:

 Has order, equal intervals, and a true zero


 Enables comparison using multiplication/division

Examples:

 Weight, Height, Age


 Income, Distance, Speed
 Number of children, Exam scores (out of 100)

Statistical Use:

 All descriptive and inferential statistics


 Can compute ratios (e.g., “twice as fast”)

 Data Collection:

Data collection is a vital part of the research process. It involves gathering accurate
and relevant information to answer research questions, test hypotheses, and analyze
relationships between variables. The technique chosen depends on the type of research
(qualitative or quantitative), the nature of the data, and the research objectives.

Common methods include:

o Surveys
o Interviews
o Observations
o Experiments

1. Surveys
Definition:

Surveys are a widely used method of data collection, especially in quantitative research. They
involve asking a set of standardized questions to a group of respondents to gather data about their
opinions, behaviors, characteristics, or experiences.

Tools Used:

 Printed or digital questionnaires


 Online survey platforms (e.g., Google Forms, SurveyMonkey)

Used For:

 Large sample sizes


 Gathering opinions, attitudes, behaviors, demographics

Types of Surveys

Type | Description | Use Case
Cross-Sectional | Data collected at a single point in time. | Snapshot of current opinions or behavior.
Longitudinal | Data collected from the same respondents over time. | Studying trends or changes.
Online Surveys | Web-based (e.g., Google Forms, SurveyMonkey). | Easy distribution, low cost.
Telephone Surveys | Conducted via phone calls. | Useful for quick, direct responses.
Face-to-Face Surveys | In-person interviews using a survey format. | Higher response rates, more detailed answers.
Mail Surveys | Paper surveys sent and returned by mail. | Useful for specific populations (e.g., elderly).

Advantages:

 Cost-effective for large populations


 Standardized data
 Easy to administer and to analyze statistically
 Can reach large and diverse populations

Limitations:

 Responses may lack depth


 Risk of non-response or biased answers
 Misinterpretation of questions
 Social desirability bias

2. Interviews

Definition:

An interview is a method of collecting data through direct, personal conversation between the
researcher and the respondent in order to gather information, insights, or opinions on a particular
topic.

Key Features of Interviews

1. Direct Interaction – Face-to-face, phone, or video communication.


2. Flexible Structure – Can range from structured (fixed questions) to unstructured (open-
ended discussion).
3. Depth of Data – Allows for in-depth understanding of experiences, motivations, and
perspectives.

Types of Interviews

Type | Description | Use Case
Structured | Pre-determined questions, fixed order. | Surveys, large sample studies.
Semi-Structured | Guided questions with flexibility to explore topics. | Most common in qualitative research.
Unstructured | Open conversation with no fixed questions. | Exploratory studies, ethnographic work.
Group Interviews / Focus Groups | Interviewing multiple participants simultaneously. | Gathering varied perspectives in social contexts.
Telephone/Online Interviews | Remote, often recorded for transcription. | When in-person is not feasible.

Used For:

 In-depth insights into beliefs, feelings, and motivations


 Qualitative research and case studies

Advantages:

 Rich, detailed information


 Opportunity to clarify and probe further
 Flexibility in exploring new topics
Limitations:

 Time-consuming and costly


 Requires skilled interviewer
 May introduce interviewer bias
 Difficult to analyze large volumes of qualitative data

3. Observations

Observations as a data collection method involve systematically watching, listening to, and
recording behaviors, events, or conditions in their natural setting. It is commonly used in
qualitative research, though it can also be structured for quantitative purposes.

Key Features of Observations

1. Direct Data Collection – No reliance on participants’ self-reports.


2. Contextual Insight – Captures behaviors as they occur in real environments.
3. Flexible or Structured – Can be open-ended or follow a checklist.

Types of Observations

Type | Description | Use Case
Structured | Uses a checklist or coding system. | Behavior studies, classroom monitoring.
Unstructured | Open-ended, descriptive notes. | Ethnographic research, exploratory studies.
Participant Observation | Researcher becomes part of the group. | Social sciences, anthropology.
Non-Participant Observation | Observer remains separate. | Objective recording of public behavior.
Naturalistic | Conducted in real-life settings. | Studying subjects in their environment.
Controlled | Conducted in a controlled environment (e.g., lab). | Experiments, usability testing.

Used For:

 Studying behavior, social interactions, physical settings


 Ethnographic and psychological studies

Advantages:

 Real-time data in natural context


 Non-verbal behavior can be recorded
 Captures actual behavior
 Useful when participants may not be fully aware of their actions

Limitations:

 Observer bias
 Limited access to internal thoughts or motivations
 Ethical concerns in covert observation
 Time-consuming and labor-intensive
 Behavior may change if people know they are being watched (Hawthorne effect)

4. Experiments
Definition:
Experiments are a structured method of data collection used primarily in quantitative research
to test hypotheses by manipulating one or more variables and observing the effect on other
variables. They are especially common in scientific, psychological, and social research.

Key Features of Experiments

1. Controlled Environment – Researchers manipulate variables in a systematic way.


2. Causal Inference – Allows determination of cause-and-effect relationships.
3. Randomization – Participants are often randomly assigned to groups to reduce bias.
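
Random assignment can be sketched as a shuffle followed by a split. The participant IDs, the fixed seed, and the function name `randomly_assign` are hypothetical; real studies often use more elaborate schemes such as blocked or stratified randomization.

```python
import random

def randomly_assign(participants, seed=None):
    """Shuffle participants and split them into two groups of equal size
    (treatment gets the extra person if the count is odd)."""
    rng = random.Random(seed)  # seed used only to make the example reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = (len(shuffled) + 1) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

participants = [f"P{i:02d}" for i in range(1, 11)]  # hypothetical IDs
groups = randomly_assign(participants, seed=42)
print(groups)
```

Because chance alone decides group membership, known and unknown participant characteristics tend to balance out across groups, which is what supports the causal interpretation.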

Types of Experiments

Type | Description | Use Case
Laboratory Experiment | Conducted in a controlled, artificial setting. | Psychology, biology, product testing.
Field Experiment | Conducted in a real-world setting. | Marketing, education, workplace studies.
Natural Experiment | Researcher does not manipulate variables; they occur naturally. | Policy impact, environmental studies.
Quasi-Experiment | Lacks random assignment to groups. | When randomization is not feasible or ethical.
True Experiment | Involves manipulation, control group, and random assignment. | Gold standard for testing causality.

Used For:

 Testing causal relationships between variables


 Hypothesis-driven research

Advantages:

 High internal validity (control over variables)


 Can determine cause-and-effect relationships
 Replicable and often statistically rigorous

Limitations:

 Artificial settings may reduce external validity


 Ethical constraints
 Often expensive and complex
 Can be time-consuming
 Participant behavior may be influenced by awareness of the experiment (demand
characteristics)

Summary Table

Technique | Data Type | Strengths | Limitations
Survey | Quantitative | Cost-effective, large samples | Superficial answers, bias possible
Interview | Qualitative | Deep insights, flexible | Time-consuming, risk of bias
Observation | Both | Real-time, behavioral data | Limited to observable phenomena
Experiment | Quantitative | Causality, control | May lack real-world relevance

2. Primary Data

Primary data is data collected directly from first-hand sources for a specific research purpose.
It is original, raw, and has not been previously published or interpreted by others.

Characteristics

 Original – Collected by the researcher directly from the source.


 Specific – Tailored to a particular research question or objective.
 Current – Reflects recent or real-time information.
 Controlled – Researcher has control over how data is collected (tools, timing, sample,
etc.).

Common Methods of Collecting Primary Data

Method | Description
Surveys/Questionnaires | Written or digital forms with structured questions.
Interviews | Verbal interaction (structured, semi-structured, or unstructured).
Observations | Watching and recording behavior or events in real-time.
Experiments | Controlled setups to test hypotheses by manipulating variables.
Focus Groups | Guided group discussions to gather opinions or insights.
Field Notes / Diaries | Self-recorded experiences or observations by participants or researchers.

Advantages

 Relevant to the specific study


 Greater accuracy and reliability
 Researcher controls data quality
 Timely and up-to-date information

Disadvantages

 Time-consuming to collect
 Can be expensive (especially for large samples)
 May require trained personnel and special tools
 Limited in scope (compared to large datasets available as secondary data)

 Examples of Primary Data

 A student conducting a survey of 100 classmates about study habits


 A company observing customer behavior in a store
 An experiment testing the effectiveness of a new drug
 A researcher interviewing local farmers about climate change impacts

3. Secondary Data

Secondary data refers to data that has already been collected, processed, and
published by someone else, often for a purpose different from your own research, and is
then reused for a new study.

Characteristics

 Pre-existing – Collected by another person, organization, or researcher.


 Easily Accessible – Often found in reports, databases, journals, or websites.
 Cost-effective – Saves time and resources compared to collecting new (primary) data.
 Broad Scope – Often covers large samples or long time spans.

Common Sources of Secondary Data


Source Type | Examples
Government Reports | Census data, labor statistics, health records
Academic Research | Journal articles, theses, dissertations
Institutional Databases | World Bank, WHO, IMF, OECD
Commercial Sources | Market research firms, company reports
Online Content | News archives, social media, blogs (used with caution)
Libraries and Archives | Historical documents, newspapers, records

Advantages

 Saves time and cost


 Data may come from large, reliable sources
 Allows for longitudinal or trend analysis
 Can complement or validate primary data

Disadvantages

 May not be specific to your research question


 Could be outdated or incomplete
 Risk of bias or data quality issues
 Limited control over how data was collected

Examples of Secondary Data

 Using UNICEF reports for child health statistics


 Analyzing past academic studies on climate change
 Referencing market trends from a commercial research firm
 Examining crime statistics from government databases

Primary vs. Secondary Data

Aspect | Primary Data | Secondary Data
Source | Collected directly | Already collected
Cost & Time | High | Low
Specificity | Custom to your study | May be general
Control | Full control | No control over collection
Accuracy | Can be high if collected well | Varies depending on source

4. Design of Questionnaire

A questionnaire is a structured set of questions used to collect data from respondents. It’s
commonly used in surveys for both qualitative and quantitative research.

Elements of good questionnaire design:

o Clear objectives

✅ The questionnaire should directly serve the research objectives.
✅ Questions must be relevant to what you want to measure, avoiding unnecessary items.
✅ Helps keep the questionnaire focused and concise.
Example: If studying customer satisfaction, do not include irrelevant questions
about unrelated products.

o Simple and unambiguous language

✅ Use clear, everyday language that is easily understood by the target audience.
✅ Avoid jargon, technical terms (e.g., SQL, NoSQL, ETL, APIs, or cloud tools such as
AWS), and complex sentences.
✅ Helps reduce misinterpretation of questions.
Example: Instead of “What is your perception of the efficacy of our service
delivery?” ask “How satisfied are you with our service?”

o Logical sequence

✅ Arrange questions in a logical, natural flow.


✅ Start with easy and non-sensitive questions to build rapport (mutual sense of trust).
✅ Group similar topics together to maintain respondent focus.
✅ Place sensitive or personal questions later, after trust has been established.
Example sequence: demographics → general behavior → specific opinions → sensitive
topics.
o Pre-testing (pilot testing)

✅ Test the questionnaire with a small sample of respondents before the actual survey.
✅ Helps identify confusing questions, unclear wording, or missing response options.
✅ Allows improvements in design and flow.
✅ Minimizes errors during full data collection.

o Balanced and unbiased questions

✅ Questions should be neutral, avoiding leading language that could push respondents to
a particular answer.
✅ Provide balanced options for responses.
✅ Avoid emotionally loaded words or one-sided framing.
Example: Instead of “Don’t you think our service is excellent?” ask “How would you
rate the quality of our service?”

5. Sampling Fundamentals and Sample Designs

Sampling is the process of selecting a subset (sample) from a larger group (population) to
estimate characteristics of the whole.

 Instead of surveying every individual in a population (which is often costly or
impossible), sampling allows you to draw conclusions from a manageable number of
observations.

Sampling Fundamentals:

1. Population

 The entire set of items or individuals of interest in a study.


 Can be finite (e.g., students in a school) or infinite (e.g., the results of all rolls of a
die that could ever be made).
 Example: All voters in a country.

2. Sample

 A portion of the population selected for analysis.


 Should ideally be representative of the whole population.
 Example: 1,000 voters selected randomly to predict election outcomes.

3. Sampling Frame

 A list or source containing all the elements of the population from which the sample is
drawn.
 Example: A city’s registered voter list.
 Must be as complete and current as possible to avoid sampling bias.

4. Sampling Unit

 A single element or group of elements considered for selection.


 Could be an individual, household, company, etc.
 Example: One household in a housing survey.

5. Sample Size

 The number of sampling units selected from the population.


 Larger samples generally give more precise estimates but cost more.
 Sample size is influenced by:
o Desired confidence level
o Margin of error
o Population variability
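
The three factors above are often combined in a standard formula for estimating a proportion, n = z²·p(1−p)/e². The sketch below is an illustration only and is not part of the original notes. The function name and the defaults are assumptions: z = 1.96 for roughly 95% confidence, p = 0.5 as the most cautious guess for variability, and a 5% margin of error.

```python
import math

def required_sample_size(z=1.96, p=0.5, e=0.05, population=None):
    """Estimate the sample size needed to estimate a proportion.

    z: z-score for the desired confidence level (1.96 for about 95%)
    p: expected proportion (0.5 gives the largest, most cautious size)
    e: margin of error as a proportion (0.05 means plus or minus 5 points)
    population: optional finite population size for the correction
    """
    n0 = (z ** 2) * p * (1 - p) / (e ** 2)
    if population is not None:
        # Finite population correction: small populations need fewer units
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)

print(required_sample_size())                 # large population
print(required_sample_size(population=1000))  # population of 1,000
```

With these defaults the function returns 385 for a very large population and 278 for a population of 1,000. A higher confidence level, a smaller margin of error, or more variability all push the number up.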

6. Sampling Error

 The difference between the sample result and the true population value.
 Occurs naturally because only part of the population is observed.
 Reduced by increasing sample size and using probability sampling.
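
To illustrate why a larger sample reduces sampling error, the sketch below uses the usual standard error of a sample proportion, sqrt(p(1−p)/n). The value p = 0.5 and the sample sizes are assumptions chosen for illustration.

```python
import math

def standard_error(p, n):
    """Standard error of a sample proportion under simple random sampling."""
    return math.sqrt(p * (1 - p) / n)

# Quadrupling the sample size halves the standard error
for n in (100, 400, 1600):
    print(n, round(standard_error(0.5, n), 4))
```

Because the error shrinks with the square root of n, each further gain in precision costs proportionally more observations.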

7. Non-Sampling Error

 Errors not related to the act of sampling itself.


 Examples:
o Data entry mistakes
o Misinterpretation of survey questions
o Non-response bias

8. Representativeness

 A sample is representative if it reflects the key characteristics of the population.


 Representativeness is critical for making valid inferences.

Sample Designs
Sample design refers to the plan or strategy used to select a sample from a population.
It determines how, from where, and how many units will be selected to ensure the
sample is representative and suitable for the research objectives.

Objectives of a Good Sample Design


A well-designed sample should:

 Be representative of the population


 Minimize bias
 Be efficient in terms of time and cost
 Allow valid statistical inferences
 Be simple and practical to implement

Types of Sample Designs

Sample designs are broadly categorized into two main types:

o Probability sampling (e.g., simple random, stratified, cluster, systematic)


o Non-probability sampling (e.g., convenience, purposive, quota, snowball)

1. Probability Sampling Designs

Every unit in the population has a known and non-zero chance of being selected.

Sampling Method | Description | Use Case
Simple Random Sampling | Every unit has an equal chance of selection. | Small, well-defined populations
Systematic Sampling | Select every kth unit after a random start. | Manufacturing, quality control
Stratified Sampling | Divide the population into subgroups (strata), then sample from each. | Surveys needing demographic representation
Cluster Sampling | Divide population into clusters, then randomly choose clusters. | Wide geographic studies
Multistage Sampling | Combines two or more sampling methods (e.g., clusters → individuals). | Large-scale national or educational studies

1. Simple Random Sampling


Definition:

Every member of the population has an equal and independent chance of being selected.

Working Principle:

 Use a random number generator, lottery method, or software to select individuals.


 Requires a complete list (sampling frame) of the population.
Example:

From a list of 1,000 students, randomly select 100 to take a survey.
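
The example above can be sketched with Python's standard `random` module. The student IDs and the fixed seed are assumptions made so that the draw is reproducible.

```python
import random

rng = random.Random(42)  # fixed seed (assumption) so the draw can be repeated

# Sampling frame: hypothetical student IDs 1..1000
sampling_frame = list(range(1, 1001))

# Draw 100 students without replacement; each ID has an equal chance
srs_sample = rng.sample(sampling_frame, 100)

print(len(srs_sample), len(set(srs_sample)))
```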

Advantages:

 Minimizes selection bias


 Easy to analyze statistically

Disadvantages:

 Not feasible for large populations without a complete list


 May not represent subgroups proportionally

2. Systematic Sampling
Definition:

Select every kᵗʰ element from a list after randomly selecting a starting point.

Working Principle:

 Determine the sampling interval k = N/n (where N is the population size and n is the
sample size)
 Randomly choose a starting point between 1 and k
 Select every kᵗʰ individual from the list

Example:

From a list of 1,000 households, if a sample of 100 is needed, choose every 10ᵗʰ household after
a random start between 1 and 10.
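
A minimal sketch of the household example, assuming the list is numbered 1 to 1,000 and that N is an exact multiple of n. The function name is hypothetical.

```python
import random

def systematic_sample(frame, n, rng):
    """Select every k-th element after a random start in the first interval."""
    k = len(frame) // n        # sampling interval k = N / n
    start = rng.randint(1, k)  # random start between 1 and k (inclusive)
    # Positions in the text are 1-based, so subtract 1 for list indexing
    return [frame[i - 1] for i in range(start, len(frame) + 1, k)][:n]

households = list(range(1, 1001))
sys_sample = systematic_sample(households, 100, random.Random(7))
print(sys_sample[:5])
```

Only the starting point is random; once it is fixed, the rest of the sample is fully determined, which is why a periodic pattern in the list can bias the result.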

Advantages:

 Simple and quick


 Ensures evenly spread sample

Disadvantages:

 Can lead to bias if there's a hidden pattern in the population list

3. Stratified Sampling
Definition:
Divide the population into homogeneous subgroups (strata) and randomly sample from each
subgroup.

Working Principle:

 Identify key subgroups (e.g., age, gender, income)


 Perform simple random sampling within each subgroup

Example:

In a school with 60% females and 40% males, to sample 100 students, randomly select 60
females and 40 males.
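
The school example uses proportional allocation, which can be sketched as follows. The school of 500 students and the function name are assumptions for illustration.

```python
import random

def stratified_sample(strata, n, rng):
    """Proportional allocation: each stratum contributes n times its share."""
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        quota = round(n * len(members) / total)
        # Simple random sampling inside the stratum
        sample[name] = rng.sample(members, quota)
    return sample

# Hypothetical school of 500 students: 300 female (60%), 200 male (40%)
strata = {
    "female": [f"F{i}" for i in range(300)],
    "male": [f"M{i}" for i in range(200)],
}
strat = stratified_sample(strata, 100, random.Random(1))
print({name: len(chosen) for name, chosen in strat.items()})
```

This prints 60 for the female stratum and 40 for the male stratum. When the shares do not divide evenly, rounding can make the total differ slightly from n, so real designs usually adjust the last stratum.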

Advantages:

 Ensures representation of all key subgroups


 Greater precision than simple random sampling

Disadvantages:

 Requires knowledge of population structure


 More complex to organize

4. Cluster Sampling
Definition:

Divide the population into clusters, then randomly select entire clusters for the sample.

Working Principle:

 Clusters are often naturally occurring groups (e.g., schools, neighborhoods)


 Randomly select some clusters
 Include all members of the selected clusters, or sample within them

Example:

Randomly choose 5 schools out of 50, and survey all students in those 5 schools.
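
A sketch of the same idea, assuming 50 hypothetical schools with 20 students each. Note that only a list of schools is needed to draw the sample, not a list of every student.

```python
import random

rng = random.Random(3)

# Hypothetical clusters: 50 schools, each with its own student list
schools = {
    f"school_{s}": [f"s{s}_student_{i}" for i in range(20)]
    for s in range(1, 51)
}

# Randomly pick 5 whole clusters
chosen_schools = rng.sample(sorted(schools), 5)

# One-stage cluster sampling: survey every student in the chosen schools
cluster_sample = [student for name in chosen_schools for student in schools[name]]
print(chosen_schools, len(cluster_sample))
```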

Advantages:

 Cost-effective and practical for large, spread-out populations


 Easier to implement when full population list is unavailable

Disadvantages:
 Higher sampling error when clusters differ greatly from one another (ideally each cluster
resembles the population as a whole)
 Less precision than stratified or simple random sampling

5. Multistage Sampling
Definition:

A complex form of cluster sampling, where sampling is done in multiple stages, often
combining several techniques.

Working Principle:

 Stage 1: Randomly select clusters


 Stage 2: Randomly sample individuals within those clusters

Example:

Stage 1: Randomly select districts →
Stage 2: Within each district, randomly select schools →
Stage 3: Within each school, randomly select students
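
The three stages can be sketched as nested random draws. The population structure (6 districts, 4 schools per district, 30 students per school) and the numbers drawn at each stage are assumptions for illustration.

```python
import random

rng = random.Random(11)

# Hypothetical nested population: 6 districts x 4 schools x 30 students
population = {
    f"district_{d}": {
        f"d{d}_school_{s}": [f"d{d}_s{s}_student_{i}" for i in range(30)]
        for s in range(4)
    }
    for d in range(6)
}

multistage_sample = []
for district in rng.sample(sorted(population), 2):             # Stage 1
    schools_in_district = population[district]
    for school in rng.sample(sorted(schools_in_district), 2):  # Stage 2
        students = schools_in_district[school]
        multistage_sample.extend(rng.sample(students, 10))     # Stage 3

print(len(multistage_sample))
```

Two districts, two schools in each, and ten students per school give 40 students in total, without ever needing a full list of students in the unselected districts.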

Advantages:

 Efficient for large, diverse populations


 Flexible and cost-effective

Disadvantages:

 Increased complexity
 Higher potential for sampling error

2. Non-Probability Sampling Designs

The chance of selecting each unit is unknown, and some units may have no chance of being
selected at all.

Sampling Method | Description | Use Case
Convenience Sampling | Selects the most accessible subjects (e.g., people nearby). | Pilot studies, quick surveys
Judgmental Sampling | Selection based on researcher’s judgment or expertise. | Expert panels, market research
Quota Sampling | The population is divided into specific subgroups, and a predetermined number (quota) of subjects is selected from each subgroup. | Opinion polls, market research, and social sciences when time or resources are limited
Snowball Sampling | Participants recruit other participants (useful for hidden populations). | Social networks, drug use research

1. Convenience Sampling
Definition:

A sampling method where subjects are selected based on their easy availability and willingness
to participate.

Working Principle:

 The researcher selects whoever is easiest to reach.


 No effort is made to ensure the sample is representative.

Example:

A college student surveys classmates in the cafeteria because they are easily accessible.

Advantages:

 Fast and inexpensive


 Easy to implement

Disadvantages:

 High risk of bias


 Results are not generalizable to the broader population

2. Judgmental Sampling (Purposive Sampling)


Definition:

The researcher intentionally selects individuals who are most appropriate or relevant to the
study.

Working Principle:
 Selection is based on the researcher’s expert judgment about who will provide the best
information.

Example:

A medical researcher selects only doctors with 10+ years of experience to study opinions on a
new treatment.

Advantages:

 Focuses on knowledge-rich individuals


 Useful in qualitative or exploratory research

Disadvantages:

 Subjective and prone to bias


 Cannot generalize findings to a broader population

3. Quota Sampling
Definition:

A method where the population is divided into subgroups, and a specific number (quota) is
filled for each subgroup—not using random selection.

Working Principle:

 Identify key characteristics (e.g., age, gender)


 Set quotas for each subgroup
 Select individuals non-randomly until quotas are met

Example:

Survey 100 people: 50 men and 50 women. The researcher interviews people until those
numbers are reached, based on convenience.
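
The quota logic can be sketched as "take whoever comes next until each quota is full". The stream of passers-by and the function name are hypothetical. No random selection is involved, which is the key difference from stratified sampling.

```python
def quota_sample(stream, quotas, group_of):
    """Take people in arrival order until every subgroup quota is filled."""
    remaining = dict(quotas)
    selected = []
    for person in stream:
        group = group_of(person)
        if remaining.get(group, 0) > 0:
            selected.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas met
    return selected

# Hypothetical passers-by, in the order the interviewer meets them
passers_by = [(f"p{i}", "man" if i % 3 else "woman") for i in range(400)]
quota = quota_sample(passers_by, {"man": 50, "woman": 50}, lambda p: p[1])
print(len(quota))
```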

Advantages:

 Ensures representation of specific subgroups


 Faster and cheaper than stratified random sampling

Disadvantages:

 Non-random selection can lead to sampling bias


 Less reliable than probability sampling
4. Snowball Sampling
Definition:

A method where existing study subjects recruit future subjects from among their social
networks.

Working Principle:

 Start with a few known individuals (seeds)


 Ask them to refer others who meet the criteria
 The sample “snowballs” as more people are referred

Example:

To study drug use in a hidden population, a researcher starts with a known user and asks them to
refer other users.
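
Referral chains can be modelled as a breadth-first walk over a network of contacts. The referral network, the seed, and the size limit below are assumptions for illustration.

```python
from collections import deque

def snowball_sample(referrals, seeds, max_size):
    """Breadth-first recruitment: each participant refers people they know."""
    recruited = list(seeds)
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(recruited) < max_size:
        person = queue.popleft()
        for contact in referrals.get(person, []):
            if contact not in seen and len(recruited) < max_size:
                seen.add(contact)
                recruited.append(contact)
                queue.append(contact)
    return recruited

# Hypothetical referral network (who can refer whom)
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "E": ["F"]}
print(snowball_sample(referrals, ["A"], max_size=5))
```

Starting from seed A, the sample grows to A, B, C, D, E. Everyone recruited is connected to the seed, which illustrates why snowball samples tend to be homogeneous.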

Advantages:

 Useful for hard-to-reach or hidden populations


 Cost-effective for rare subjects

Disadvantages:

 Sample may be homogeneous (limited diversity)


 Introduces referral bias
 No control over the sampling frame

Key Differences

Feature | Probability Sampling | Non-Probability Sampling
Selection Method | Randomized | Non-random
Representative? | Yes (more likely) | Not guaranteed
Bias Risk | Low | High
Cost and Time | Often higher | Usually lower
Statistical Validity | High | Limited

Example Scenario: Student Satisfaction Survey


Step | Example
Population | All students in a university
Sampling Frame | List of enrolled students
Sample Size | 300 students
Sampling Design | Stratified sampling by department
Sampling Unit | Each student

6. Measurement and Scaling Techniques

Measurement

Measurement is the process of assigning numbers or symbols to objects, events, or
characteristics according to specific rules, to represent quantities or qualities of attributes.

For example: measuring customer satisfaction, intelligence, or brand preference.

Scaling

Scaling is the process of creating a continuum on which measured objects are placed, so that
responses become comparable and manageable. It involves developing rules to quantify abstract
concepts (like attitudes or perceptions).

Objectives of Measurement & Scaling

 Quantify subjective data (e.g., opinions, preferences)


 Enable comparison and analysis
 Support decision-making
 Ensure consistency and reliability in data collection

Levels of Measurement (Scales of Measurement)

There are four primary types of measurement scales:

Level | Description | Example
Nominal | Categorizes data without any order | Gender (male/female), colors
Ordinal | Ranks data but doesn’t show exact difference | Satisfaction (satisfied > neutral > dissatisfied)
Interval | Numeric scales with equal intervals, no true zero | Temperature in Celsius, IQ scores
Ratio | Like interval, but has a meaningful zero | Age, weight, income, height

Types of Scaling Techniques


Scaling techniques are methods used to construct scales for measuring variables. They fall into
two broad categories: Comparative and Non-Comparative.

A. Comparative Scaling Techniques

Involve comparing one item directly with another. Respondents are asked to evaluate one object
relative to another, rather than in isolation. These methods are often used in marketing research,
psychology, and surveys where preference or priority needs to be identified.

Type | Description | Example
Paired Comparison | Respondents choose between two items at a time. | Coke vs. Pepsi
Rank Order | Rank items in order of preference. | Ranking 5 brands of smartphones
Constant Sum | Allocate a constant total (e.g., 100 points) among items. | Allocate 100 points among brands by preference
Q-Sort Scaling | Sort items into predefined categories. | Sorting personality traits into “most/least like me”

1. Paired Comparison Scaling

Definition: Respondents are presented with two items at a time and asked to select one, based
on a certain criterion (e.g., preference, importance, quality).

Example:

You are shown these pairs:

 Tea vs Coffee → You choose Coffee


 Coffee vs Juice → You choose Coffee
 Tea vs Juice → You choose Juice

From this, a preference order can be inferred.
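
One simple way to infer the order is to count how often each item wins. The sketch below applies that rule to the three choices above; counting wins is an assumption about how the order is derived, not the only possible method. It also shows that n items require n(n-1)/2 comparisons.

```python
from collections import Counter
from itertools import combinations

items = ["Tea", "Coffee", "Juice"]

# Every unordered pair is judged once: n(n-1)/2 comparisons
pairs = list(combinations(items, 2))

# Winners from the example above, keyed by the pair shown
choices = {
    ("Tea", "Coffee"): "Coffee",
    ("Coffee", "Juice"): "Coffee",
    ("Tea", "Juice"): "Juice",
}

wins = Counter({item: 0 for item in items})
wins.update(choices.values())

# Order items by number of wins, most preferred first
preference_order = [item for item, _ in wins.most_common()]
print(len(pairs), preference_order)
```

Coffee wins twice, Juice once, and Tea never, so the inferred order is Coffee, Juice, Tea.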

🔹 Characteristics:

 Easy for respondents (only two items at a time)


 Requires multiple comparisons (n(n-1)/2 for n items)
 Good for small sets (3–10 items)

🔹 Advantages:

 Simple and intuitive


 Reduces decision complexity

🔹 Disadvantages:

 Not practical with large item sets (time-consuming)


 Can lead to inconsistent choices

2. Rank Order Scaling

🔹 Definition: Respondents are shown all items simultaneously and asked to rank them from
most to least preferred (or vice versa).

🔹 Example:

Rank the following from most preferred (1) to least preferred (4):

1. Tea
2. Coffee
3. Juice
4. Soda

You might respond:

 1 = Coffee
 2 = Juice
 3 = Tea
 4 = Soda

🔹 Characteristics:

 Provides ordinal data (not interval)


 Allows for comparative insights

🔹 Advantages:

 Efficient even for moderate number of items


 Reveals full preference order

🔹 Disadvantages:

 No indication of how much more one item is preferred than another


 Can be cognitively demanding if the list is long
3. Constant Sum Scaling
🔹 Definition: Respondents allocate a fixed number of points (e.g., 100) across a set of items
to indicate the relative importance or preference.

🔹 Example:

Distribute 100 points among the following based on how important they are in choosing a drink:

 Taste
 Price
 Availability
 Brand

You might respond:

 Taste: 40
 Price: 30
 Availability: 20
 Brand: 10
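
A constant sum response is only valid if the points add up to the fixed total. The sketch below checks that and converts points to shares; the function name is hypothetical.

```python
def check_constant_sum(allocation, total=100):
    """Validate a response and return each item's share of the total."""
    if sum(allocation.values()) != total:
        raise ValueError(f"points must add up to {total}")
    return {item: points / total for item, points in allocation.items()}

response = {"Taste": 40, "Price": 30, "Availability": 20, "Brand": 10}
shares = check_constant_sum(response)

# Zero points means "no importance", so ratios between items are meaningful
print(shares["Taste"], response["Taste"] / response["Brand"])
```

Here Taste receives a share of 0.4 and is rated four times as important as Brand, a statement that a simple ranking could not support.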

🔹 Characteristics:

 Provides ratio-level data


 Reveals relative weight of each attribute

🔹 Advantages:

 More precise than simple ranking


 Useful in resource allocation and priority setting

🔹 Disadvantages:

 Requires more effort


 Some respondents may struggle with allocating exactly 100 points

4. Q-Sort Scaling

🔹 Definition: Respondents sort a set of statements or items into predefined categories (often a
quasi-normal distribution), based on how much they agree, prefer, or relate.

🔹 Example:

Given 10 product features, you might be asked to sort them into 5 categories:
 Most preferred (2 items)
 Preferred (2 items)
 Neutral (2 items)
 Less preferred (2 items)
 Least preferred (2 items)

🔹 Characteristics:

 Forces a distribution of preferences


 Often used in psychology and user experience studies

🔹 Advantages:

 Reduces central tendency bias


 Good for comparative studies involving subjective judgments

🔹 Disadvantages:

 May feel artificial or restrictive


 Not suitable for all types of stimuli or populations

Summary Table:

Technique | Type of Data | Comparison Style | Best Use Case
Paired Comparison | Ordinal | One pair at a time | Simple preference tests
Rank Order | Ordinal | Rank all items | When full preference order is needed
Constant Sum | Ratio | Allocate fixed points | Prioritizing or weighting factors
Q-Sort Scaling | Ordinal (quasi-normal) | Categorized sorting | Subjective traits or attitudes

B. Non-Comparative Scaling Techniques

In non-comparative scaling techniques, respondents evaluate a single object (product, service,
idea, etc.) on its own, rather than in comparison to others. This approach is also called
monadic scaling.

Common Types of Non-Comparative Scaling Techniques

1. Continuous Rating Scale (Graphic Rating Scale)


Description:

Respondents mark their opinion on a continuous line between two extreme points.

Example:

Indicate your satisfaction with our service:


Very Unsatisfied ──────────────|────────────── Very Satisfied

Advantages:

 Fine-grained feedback
 Visual and intuitive

Disadvantages:

 Manual scoring can be difficult without software


 Subjective interpretation of where to mark

2. Itemized Rating Scales

These are discrete, pre-defined categories used to rate an object.

a) Likert Scale

 Measures level of agreement/disagreement


 Usually has 5 or 7 points

Example:

"The product is easy to use."

 Strongly Disagree
 Disagree
 Neutral
 Agree
 Strongly Agree

 Data Level: Ordinal (often treated as interval in practice)
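
Likert responses are usually coded as numbers before analysis. The sketch below uses the common 1 to 5 coding and hypothetical responses, and sticks to the median and frequency counts, which are appropriate for ordinal data.

```python
from collections import Counter
from statistics import median

LIKERT_CODES = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly Agree": 5,
}

# Hypothetical responses to "The product is easy to use."
answers = ["Agree", "Strongly Agree", "Neutral", "Agree", "Disagree"]
codes = [LIKERT_CODES[a] for a in answers]

# Median and most frequent category do not assume equal intervals
print(median(codes), Counter(answers).most_common(1))
```

The median code is 4 ("Agree"), which is also the most frequent answer. Reporting a mean instead would treat the scale as interval, as the note above mentions is often done in practice.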

🔸 b) Semantic Differential Scale

 Uses bipolar adjectives (e.g., good–bad, fast–slow)


 Respondents mark on a 5 or 7-point scale

Example:
Rate our delivery speed:
Fast ◀─────●─────▶ Slow

 Data Level: Interval

🔸 c) Numerical Rating Scale

 Respondents rate items with numbers (e.g., 1 to 10)

Example:

Rate your satisfaction on a scale from 1 to 10.

 Data Level: Interval

3. Stapel Scale

Description:

 A single adjective is placed in the center of a scale ranging from +5 to –5, without a
neutral zero.

Example:

Rate the staff's friendliness


+5 (Extremely Friendly) to –5 (Extremely Unfriendly)

+5
+4
+3
+2
+1
Friendliness
-1
-2
-3
-4
-5

 Data Level: Interval

Advantages:

 Compact and easy to administer


 Can be used where bipolar adjectives don't make sense
Disadvantages:

 Less intuitive for some respondents

Summary Table

Scale Type | Format | Data Level | Typical Use
Graphic (Continuous) Scale | Mark on a line | Interval | Satisfaction, pain levels
Likert Scale | Agreement levels | Ordinal | Attitudes, opinions
Semantic Differential | Bipolar adjectives | Interval | Brand image, perception
Numerical Rating | 1–10 or 1–5 scale | Interval | Satisfaction, quality
Stapel Scale | +5 to –5 scale | Interval | Attitude measurement

Applications of Non-Comparative Scaling

 Marketing research (customer satisfaction, product evaluation)


 Psychology (attitudes, behavior measurement)
 Service quality studies
 Brand perception analysis
 UX/UI testing and feedback

Advantages

 Simple to administer and analyze


 Doesn’t overwhelm respondents with comparisons
 Generates absolute values, useful for benchmarking

Limitations

 Does not show relative preferences among items


 Potential bias due to respondents' interpretation of scales
 Assumes equal intervals in some types, which may not be accurate

7. Data Processing

 Data processing is the process of collecting and manipulating data to produce
meaningful information. It converts raw, unorganized data into a usable format through a
sequence of operations.
Stages of Data Processing

Data processing typically follows a systematic, multi-stage flow. Each stage plays a crucial role
in ensuring data is accurate, useful, and ready for analysis or decision-making.

1. Data Collection

Purpose: Gather raw data from various sources.


Sources can include:

 Surveys or questionnaires
 IoT devices and sensors
 Databases or data lakes
 Web scraping
 Government datasets (e.g., census data)
 Business transactions

Importance: The quality of your output depends on the quality of your input ("Garbage in,
garbage out").

2. Data Preparation (Data Cleaning)

Purpose: Make raw data usable by detecting and correcting errors or inconsistencies.

Common tasks:

 Handling missing or null values


 Removing duplicates
 Filtering out outliers
 Correcting typos and mislabels
 Normalizing or standardizing data formats
 Converting data types (e.g., string to date)

Goal: Ensure the dataset is accurate, consistent, and structured.
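
A few of the cleaning tasks listed above (removing duplicates, handling missing values, converting strings to dates) can be sketched with the standard library alone. The rows and field names are hypothetical; real projects often use a library such as pandas instead.

```python
from datetime import datetime

# Hypothetical raw survey rows: (respondent_id, age, response_date)
raw_rows = [
    ("r1", "34", "2024-01-15"),
    ("r2", "", "2024-01-16"),      # missing age
    ("r1", "34", "2024-01-15"),    # exact duplicate of the first row
    ("r3", " 41 ", "2024-01-17"),  # stray whitespace
]

cleaned = []
seen = set()
for row in raw_rows:
    if row in seen:  # remove exact duplicates
        continue
    seen.add(row)
    rid, age, date_text = (field.strip() for field in row)
    cleaned.append({
        "id": rid,
        "age": int(age) if age else None,  # keep missing values explicit
        "date": datetime.strptime(date_text, "%Y-%m-%d").date(),  # string to date
    })

print(cleaned)
```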

3. Data Input

Purpose: Feed cleaned data into a system for processing.

Methods:

 Manual input (e.g., forms)


 Automated loading (e.g., using scripts, APIs (Application Programming Interfaces), or
ETL (Extract → Transform → Load) tools)
 File imports (e.g., CSV (Comma-Separated Values), Excel, JSON (JavaScript Object
Notation), SQL (Structured Query Language) exports)
Goal: Ensure data is correctly stored in databases or data pipelines for further use.

4. Data Processing

Purpose: Perform operations that convert raw input into useful outputs.

Techniques vary by use case and may include:

 Sorting and filtering


 Aggregation (e.g., sum, average)
 Transformation (e.g., normalization, encoding)
 Joining datasets
 Applying algorithms or statistical models

This is the "engine room" where value is created.
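
As a small illustration of filtering, aggregation, and sorting, the sketch below processes hypothetical satisfaction scores by department. The records and the cut-off of 2 are assumptions.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical cleaned records: satisfaction scores by department
records = [
    {"dept": "Science", "score": 4},
    {"dept": "Arts", "score": 3},
    {"dept": "Science", "score": 5},
    {"dept": "Arts", "score": 2},
    {"dept": "Commerce", "score": 1},
]

# Filtering: keep scores of 2 or more
kept = [r for r in records if r["score"] >= 2]

# Aggregation: average score per department
by_dept = defaultdict(list)
for r in kept:
    by_dept[r["dept"]].append(r["score"])
averages = {dept: mean(scores) for dept, scores in by_dept.items()}

# Sorting: departments from highest to lowest average
ranking = sorted(averages, key=averages.get, reverse=True)
print(averages, ranking)
```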

5. Data Storage

Purpose: Save processed data for retrieval, analysis, and future use.

Storage systems include:

 Relational databases (e.g., MySQL, PostgreSQL)


 Data warehouses (e.g., Amazon Redshift, Snowflake)
 Cloud storage (e.g., AWS S3, Google Cloud Storage)
 NoSQL databases (e.g., MongoDB, Cassandra)

Importance: Well-organized storage ensures fast access and scalability.

6. Data Output

Purpose: Present the processed data in a user-friendly, actionable format.

Output formats can include:

 Reports (PDF, Excel, dashboards)


 Visualizations (charts, maps, graphs)
 APIs or data feeds
 Alerts or notifications

Goal: Translate data into information that stakeholders can understand and act on.

7. Data Interpretation/Analysis

Purpose: Draw insights, detect patterns, and support decision-making.


Approaches:

 Descriptive statistics (mean, median, mode)


 Predictive analytics (regression, classification)

 Inferential statistics (regression, hypothesis testing)


 Machine learning models
 Trend and pattern detection

Outcome: Data-driven decisions, policy-making, or business strategies.
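
The descriptive statistics named above are available in Python's standard `statistics` module. The ratings below are hypothetical.

```python
from statistics import mean, median, mode

# Hypothetical satisfaction ratings on a 1-10 scale
ratings = [7, 8, 8, 6, 9, 8, 5]

print(mean(ratings), median(ratings), mode(ratings))
```

For these ratings the median and mode are both 8, while the mean (51/7, about 7.29) is pulled down slightly by the two low scores.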

Summary Table

Stage | Key Focus | Main Tools
Data Collection | Acquiring raw data | Surveys, APIs, Web scraping
Data Preparation | Cleaning and formatting | Python (Pandas), Excel, OpenRefine
Data Input | Feeding data into systems | Scripts, ETL tools, SQL
Data Processing | Manipulating and transforming data | Python, R, SQL, Spark
Data Storage | Saving for future use | Databases, Data lakes, Cloud storage
Data Output | Reporting and visualizing | Power BI, Tableau, Dashboards
Data Interpretation | Analyzing and making decisions | Analytics, ML, Statistical methods
