

Chapter one: Introduction

Definitions of Research
The word research is composed of two syllables, re and search. The dictionary defines the
former as a prefix meaning again, anew or over again, and the latter as a verb meaning to
examine closely and carefully, to test and try, or to probe. Together they form a noun describing a
careful, systematic, patient study and investigation in some field of knowledge, undertaken to
establish facts or principles.
The Advanced Learner's Dictionary of Current English lays down the meaning of research as a
careful investigation or inquiry, especially through search for new facts, in any branch of
knowledge.
Research is a structured inquiry that utilizes acceptable scientific methodology to solve problems
and creates new knowledge that is generally applicable.
Research is a systematic investigation to find answers to a problem.
Research is a systematic, controlled, empirical and critical investigation of propositions about the
presumed relationships among various phenomena.
Research is defined as a search for knowledge.
Research is a scientific and systematic search for pertinent information on a specific topic.
Research is an art of investigation.
Research is a movement from the known to the unknown.
Research comprises defining and redefining problems; formulating hypotheses or suggested
solutions; collecting, organizing and evaluating data; making deductions and reaching
conclusions; and, lastly, carefully testing the conclusions to determine whether they fit the
formulated hypotheses.
Research is the manipulation of things, concepts or symbols for the purpose of generalizing to
extend, correct or verify knowledge, whether that knowledge aids in construction of theory or in
the practice of an art.
Research is an original contribution to the existing stock of knowledge, making for its advancement.
Research is the pursuit of truth with the help of study, observation, comparison and experiment.
Research is the search for knowledge through objective and systematic method of finding
solutions to a problem.

Research refers to the systematic method consisting of enumerating the problem, formulating a
hypothesis, collecting the facts or data, analyzing the facts, and reaching certain conclusions,
either in the form of solutions to the problem concerned or in certain generalizations for
some theoretical formulation.
In short, research is a systematic, controlled, empirical and critical method consisting of
enumerating the problem, formulating a hypothesis, collecting the facts or data, analyzing
the facts, and reaching certain conclusions, either in the form of solutions to the problem
concerned or in certain generalizations for some theoretical formulation.
Characteristics of Research
From the above definitions it is clear that research is a process for collecting, analyzing and
interpreting information to answer questions. But to qualify as research, a process must have
certain characteristics, as listed below:
Controlled
In exploring the causal relationship between two variables, the study must be set up in a way that
minimizes the effects of other factors affecting the relationship. In social science research,
however, since such control is almost impossible, the effects of the other variables must instead
be quantified.
Rigorous
One must be very careful (scrupulous) in ensuring that the procedures followed to find answers to
questions are relevant, appropriate and justified.
Valid and verifiable
This concept implies that whatever you conclude on the basis of your findings is correct and can
be verified by you and others.
Empirical
This means that any conclusions drawn are based upon hard evidence gathered from information
collected from real-life experiences or observations.
Critical
The methods employed and procedures used should be critically scrutinized. The process of
investigation must be foolproof and free from drawbacks, and the process adopted and the
procedures used must be able to withstand critical scrutiny.
Objectives of Research
To gain familiarity with a phenomenon or to achieve new insights into it (studies with this
object in view are termed exploratory or formulative research studies);

To portray accurately the characteristics of a particular individual, situation or group
(studies with this objective are called descriptive research studies);
To determine the frequency with which something occurs or with which it is associated with
something else (studies with this object in view are known as diagnostic research studies);
To test a hypothesis of a causal relationship between variables (such studies are known as
hypothesis testing research studies).
Motivations in Research
What makes people undertake research? The possible motives for doing research may be
one or more of the following, among others:
Desire to get a research degree along with its consequential benefits;
Desire to face the challenges in solving the unresolved problems, i.e., concern over
practical problems initiates research;
Desire to get intellectual joy of doing some creative work;
Desire to be of service to society;
Desire to get respectability.
Types of Research
Research can be classified on different bases; however, we will see only the most
common bases and those classifications which are relevant to our syllabus.
First, there are two broad classifications of research, as follows:
1. Research in physical sciences
2. Research in social sciences
Physical sciences deal with things which can be put to laboratory tests under guided
conditions. These researches deal with physical phenomena over which man has complete
control.
Research in the social sciences is based on human behavior, which is influenced by many
factors: physical, social, temperamental, psychological and economic. We dwell for
some time on this category of research in the forthcoming subtopics, as this whole handout is
about social science research methods.
Social Research
Social research is the part of research which studies human behavior as part of society. Its aim
is to find explanations for unexplained social phenomena, to clarify doubts and to correct
misconceived facts of social life.

Social research can be defined as:
Systematic investigation to gain new knowledge about social phenomena through surveys.
A systematic method of exploring, analyzing and conceptualizing social life in order to
extend, correct or verify knowledge, whether that knowledge aids in the construction of a
theory or in the practice of an art.
A scientific undertaking which, by means of logical and systematized techniques, aims to
discover new facts or verify and test old facts; analyze their sequences, interrelationships and
causal explanations, derived within an appropriate theoretical frame of reference; and
develop new scientific tools, concepts and theories which would facilitate reliable and valid
study of human behavior and social life, and thereby gain greater control over them.
A study of mankind in his social environment, concerned with improving his
understanding of social orders, groups, institutions and ethics.
A collection of methods and methodologies that researchers apply systematically to produce
scientifically based knowledge about the social world.
Characteristics of Social Research
From the above given definitions the following characteristics of social research may be drawn:
1. It deals with social phenomena. It studies the behavior of human beings as members of
society, and their feelings, responses and attitudes under different circumstances. It
encompasses the study of social phenomena covering economic, political, social, educational,
administrative and related aspects of social life. Social research was born out of the need to
solve social problems.
2. It aims at discovering new facts. Scientific research techniques are applied to find out the
truth about, and the reasoning or relationships behind, various kinds of human behavior.
3. It is a scientific undertaking in which logical and systematized techniques are used. It also
develops new scientific tools and concepts which facilitate reliable and valid study of human
behavior.
4. It assists in the evolution of new theories. Every research highlights some
broad principles, establishes some scientific truth and analyzes its sequences,
interrelationships and causal explanations. This results in expansion of knowledge,
improvement in the understanding of social phenomena and the evolution of new
theories.
5. It requires deep knowledge and minute investigation of the topic concerned.

6. It must be objective. The researcher should not pursue his own interests, because any personal
bias vitiates the universality criterion of scientific propositions.
7. Experimentation is generally not possible in social research. However, in some cases social
research takes shelter in controlled experiments.
8. Inter-relationship between the variables under study is a must. Besides, the variables of a
social research study cannot be measured exactly; only rough estimation of variables is possible.
9. It is dynamic in nature; therefore, what was true of past might not be true of present.
10. It is inter-related. Therefore, we cannot draw water-tight compartments for each sector, or
say whether a study is purely political, economic or sociological research.
11. It tells us that social events are governed by rules and regularities, just as physical events are.
12. It is complementary to research in physical sciences and both branches of knowledge help
each other and are the way to progress.
Motivating factors of social Research
The following are four motivating factors:
1. Curiosity about the unknown: Curiosity is an intrinsic trait of the human mind and a compelling
drive in the exploration of man's surroundings. It is a natural human instinct. A man is
always curious about the unknown and mysterious objects that he notices around him and
tries to understand them on his own. The same curiosity drives social scientists to explore,
reveal and understand the unknown factors behind social phenomena.
2. Desire to understand the cause-and-effect relationships of social problems: The search for
cause-and-effect relationships has been pursued more relentlessly than almost any other scientific
effort upon which human energies have been spent.
3. Appearance of New and Unexpected Situations: In the modern, complex and dynamic world,
man is often faced with many acute and difficult problems. It is the duty of the social scientist
to find out their real causes and suggest solutions to such problems.
4. Desire to Discover New and Improve Old Scientific Procedures: This concerns the techniques
or methods used in social research. Social scientists have been busy devising and developing
new methods and techniques in place of old ones for dealing with social problems.
Importance of Social Research
In a general way, some of the direct practical benefits and theoretical implications of social
research may be listed as follows:

1. Guides social planning: Adequate social planning depends for its success on
systematic knowledge of the social resources and liabilities; of the people and their
culture; of their similarities and differences; of organizations and operative controls; and
of their needs, hopes and problems.
2. Provides knowledge to control social phenomena: By affording first-hand
knowledge about the organization and working of society and its institutions, social
research acts as a source of power to control social phenomena. Furthermore, social
research has practical implications for formal and informal types of leadership, and for
patterns of influence and reform in different spheres of society.
3. Contributes to the betterment of social welfare.
4. Ascertains order among facts.
5. Contributes to the advancement and improvement of social research techniques.
6. Provides solutions to social problems.
7. Contributes to the development of developing countries.
Problems/challenges in Social Researches
Following are the main difficulties faced by the researchers in the application of scientific
methods in social research:
1. Complexity of Social Data: Human behavior and economic problems, the subjects of
social research, are influenced by so many factors that the researcher is easily confused.
2. Problems in Interpreting the Relationship Between Cause and Effect: In the case of social
phenomena, cause and effect are interdependent and one stimulates the other. It is very
difficult in the social sciences to establish a cause-and-effect relationship and to determine
which is the cause and which is the effect.
3. Problems of Concepts
4. Dynamic Nature of Social Phenomena: Human society is constantly changing and
improving itself in the light of past knowledge.
5. Problem of Maintaining Objectivity: Achieving an effective degree of objectivity in social
inquiry is a very difficult task. Different findings on the same issue arise quite often.
6. Unpredictability: Predictability is one of the most important characteristics of science.
Because of the complexity of social data and the irregularity of social behavior, prediction is
challenging in social research.

7. Difficulty in the Verification of Inferences: Verification of the results obtained is
possible in the physical sciences, but in the social sciences it is much more difficult. Events
in the social sciences are non-repetitive, and social scientists are ill-equipped with tools to
verify their predictions.
8. Difficulty in the Use of Experimental Methods: It is not possible to put human beings to
laboratory tests. Even if it is done, their responses would not be natural but subject to
awareness of the artificial conditions.
9. Incapability of Being Dealt with Through Empirical Methods: The exact sciences tend to
become increasingly quantitative in their units, measures and terminology, while most of the
subject matter of the social sciences is qualitative and does not admit of quantitative statement.
Direct quantification of socio-economic variables is not possible; we can have only rough
estimates. Data are more reliable in the physical sciences, because data obtained in social
research are always changing as the underlying social variables change. Empirical methods give
very accurate results when an experiment on a phenomenon is repeatedly carried out, but in the
social sciences repeated experimentation is not possible. Empirical methods are essentially
statistical methods; therefore, all the problems, limitations and distrusts of statistical methods
are also problems of social research, for example the problem of unbiased sampling, selection of
data, etc.
10. Problem of Inter-disciplinary Research: Social research in any field is interrelated;
therefore, we cannot draw watertight compartments for each sector of the social sciences. We
cannot say whether a study is purely political research, purely economic research or purely
sociological research. In the physical sciences, by contrast, it is possible to a very great extent
to state whether a problem is physical, chemical or biological. The main problem in
interdisciplinary research is that every branch of knowledge has its own line of approach and a
methodology suited to its purpose; when these are forced into a single frame, distortions are
bound to take place.
11. Less Finance: Social researchers get less finance than researchers in the physical sciences;
as a result, the rate of progress in social science research is lower than that of the physical
sciences.
To sum up, the social sciences are less precise in their findings than the natural sciences because
they deal with human society, whose group as well as individual behaviour has always been more
diverse, full of more surprises and less predictable. Unlike the natural sciences, we do not have
nice, neat equations that will yield answers to all situations.

Research can also be classified from three other perspectives. However, these perspectives are
not mutually exclusive:
1. the application of the research study;
2. the objectives in undertaking the research; and
3. the type of information sought.
Classification of research based on Application of the research study
If you examine a research endeavor from the perspective of its application, there are two broad
categories: pure research and applied research.
Pure research involves developing and testing theories and hypotheses that are intellectually
challenging to the researcher but may or may not have practical application at the present time or
in the future. Such work often involves the testing of hypotheses containing very abstract
and specialized concepts.
Pure research is also concerned with the development, examination, verification and refinement
of research methods, procedures, techniques and tools that form the body of research
methodology.
The knowledge produced through pure research is sought in order to add to the existing body of
knowledge of research methods.
In applied research, the research techniques, procedures and methods that form the body of
research methodology are applied to the collection of information about various aspects of a
situation, issue, problem or phenomenon, so that the information gathered can be used in other
ways, such as for policy formulation, administration, and the enhancement of understanding of a
phenomenon.
Most research in the social sciences is applied research.
Classification of researches based on Objectives of the study
a) Descriptive Research
Includes surveys and fact-finding enquiries of different kinds.
Its major purpose is description of the state of affairs as it exists at present.

In social science and business research it is often called ex post facto research. The researcher
has no control over the variables; he can only report what has happened or what is happening.

Also includes attempts by researchers to discover causes even when they cannot control the
variables. The methods used in descriptive research are survey methods of all kinds,
including comparative and correlational methods.
b) Correlational Research
c) Explanatory Research
d) Exploratory Research

Classification of researches based on the type of information Sought
a) Quantitative Research
Is based on the measurement of quantity or amount.
Is applicable to phenomena that can be expressed in terms of quantity.

b) Qualitative Research
is concerned with qualitative phenomena, i.e., phenomena relating to or involving quality or kind.

is especially important in the behavioral sciences, where the aim is to discover the underlying
motives of human behavior.
E.g., motivation research, attitude or opinion research, word association tests, sentence
completion tests, etc.
Significance of Research
Research inculcates scientific and inductive thinking, and it promotes the development of
logical habits of thinking and organization.
Research has become an important aid in solving operational problems, due to the
increasingly complex nature of business and government. Furthermore, research is a helpful
aid in economic policy.
Research provides the basis for nearly all government policies in our economic system.
Research has its special significance in solving various operational and planning problems of
business and industry.
Research is equally important for social scientists in studying social relationships and in
seeking answers to various social problems.
Research may mean a career or a way to attain a high position in the social structure,
particularly for students pursuing Master's or PhD degrees.
Research may mean a source of livelihood for professionals in research methodology.
Research may mean the outlet for new ideas and insights for philosophers and thinkers.
Research may mean the development of new styles and creative work for literary persons.
Research may mean the generalization of new theories to analysts and intellectuals.
Research Methods and Methodology
Research methods may be understood as all those methods/techniques that are used in the
conduct of research.
Research methods/techniques thus refer to the methods researchers use in performing
research operations.
In other words, all those methods which are used by the researcher during the course of studying
his research problem are termed research methods.
Research methods can be put into the following three groups:
1. In the first group we include those methods which are concerned with the collection
of data. These methods will be used where the data already available are not sufficient to
arrive at the required solution;

2. The second group consists of those statistical techniques which are used for establishing
relationships between the data and the unknowns;
3. The third group consists of those methods which are used to evaluate the accuracy of the
results obtained.
Research Methodology
-is a way to systematically solve the research problem.
-may be understood as a science of studying how research is done scientifically.
-consists of the various steps generally adopted by a researcher in studying a research problem,
along with the logic behind them.
-involves determining which research methods/techniques are relevant and which are not, and
what they would mean or indicate.
-may differ from problem to problem.
-constitutes many dimensions, and research methods are a part of it.
-has a wider scope than research methods.
Deciding on a research methodology means addressing the research methods to be used and the
logic behind them in the context of the research study, and explaining why we are using a
particular method and not others, so that the research results are capable of being evaluated
either by the researcher himself or by others.

Research and Scientific Methods
The two terms research and scientific methods are closely related.
Research, as already stated, can be termed an inquiry into the nature of, the reasons for, and the
consequences of any particular set of circumstances, whether these circumstances are
experimentally controlled or recorded just as they occur.
On the other hand, the philosophy common to all research methods and techniques, although they
may vary considerably from one science to another, is usually given the name of scientific method.
In a nutshell, research methodology addresses:
why a research study has been undertaken;
how the research problem has been defined;
in what way and why the hypothesis has been formulated;
what data have been collected and what particular method has been adopted;
why a particular technique of analyzing the data has been used.

Scientific method is the pursuit of truth as determined by logical considerations. The ideal of
science is to achieve a systematic interrelation of facts. Scientific method attempts to achieve this
ideal by experimentation, observation, logical arguments from accepted postulates, and a
combination of these three in varying proportions.
The research process
Research process consists of series of actions or steps necessary to effectively carry out research
and the desired sequencing of these steps. These activities indeed overlap continuously rather
than following a strictly prescribed sequence.
A brief description of these activities is as follows:
1. Formulating the Research Problem
Formulating a research problem is the first and most important step in the research process. It is
like determination of the destination before undertaking a journey.
There are two types of research problems, viz., those which relate to states of nature and those
which relate to relationships between variables.
Formulation of the problem means defining the problem precisely; in other words, a problem
well defined is half solved. Formulation of the problem is often more essential than its solution,
because once the problem is formulated, an appropriate technique can be applied to generate
alternative solutions.
Formulation of a problem involves the following steps:
a) Statement of the problem in a general way
b) Understanding the nature of the problem
c) Surveying the available literature
d) Developing the idea through discussion
e) Rephrasing the research problem into a working proposition.
Importance of formulating a research problem
a) It determines the research destination. It indicates the way for the researcher; without
it, a clear and economical plan is impossible.
b) The research problem is like the foundation of a building: the type and design of the
building depend upon the foundation. If the foundation is well designed and
strong, one can expect the building to be so as well. The research problem serves as the
foundation of a research study: if it is well formulated, one can expect a good study to
follow.
c) The way you formulate your research problem determines almost every step that
follows: the type of study design that can be used; the type of sampling strategy that
can be employed; the research instrument that can be used; and the type of analysis
that can be undertaken.
d) The quality of the research report (the output of the research undertaking) depends on
the quality of the problem formulation.
Considerations in selecting a research problem
When selecting a research problem/topic, there are a number of considerations to keep in
mind. These considerations are:
a) Interest
b) Magnitude
c) Measurement of concepts
d) Level of expertise
e) Relevance
f) Availability of data
g) Ethical issues
2. Extensive Literature Review
Once the problem is formulated, a brief summary of it should be written down.
Reasons for Reviewing Literature
Literature review has three functions:
a) Bringing clarity and focus to the research problem
b) Improving the methodology
c) Broadening the researcher's knowledge in the research area.
Procedures in reviewing the literature
Reviewing the literature is a continuous process. It often begins before a specific research
problem has been formulated and continues until the report is finished.
There are four steps involved in conducting a literature review:
a) Search for existing literature in your area of study
b) Review the literature selected
c) Develop a theoretical framework
d) Develop a conceptual framework.


3. Development of Working Hypothesis
After the extensive literature survey, the researcher should state the working hypothesis in clear
terms. A working hypothesis is a tentative assumption made in order to draw out and test its
logical or empirical consequences. Hypotheses affect the manner in which tests must be
conducted in the analysis of data, and indirectly the quality of data required for the analysis. A
hypothesis should be very specific and limited to the piece of research in hand, because it has to
be tested.
The role of the hypothesis is to guide the researcher by delimiting the area of research and to
keep him on the right track. It sharpens his thinking and focuses attention on the more important
facets of the problem. It also indicates the type of data and the type of methods of data analysis
to be used. Working hypotheses are most useful when stated in precise and clearly defined
terms. Sometimes, particularly in the case of exploratory research, we do not need a hypothesis
at all.
4. Preparing the Research Design
The research problem having been formulated in clear-cut terms, the researcher will be required
to prepare a research design, i.e., he will have to state the conceptual structure within which the
research will be conducted. The preparation of such a design helps the research to be as efficient
as possible, yielding maximal information.
A research design is the arrangement of conditions for collecting and analysis of data in a manner
that aims to combine relevance to the research purpose with economy in procedure.
Research design is the conceptual structure within which research is conducted; it constitutes the
blueprint for collection, measurement and analysis of data.
Research design is a plan, structure and strategy of investigation so conceived as to obtain
answers to research questions or problems. The plan is the complete scheme or programme of
the research. It includes an outline of what the investigator will do, from writing the hypotheses
and their operational implications to the final analysis of data.
Research design is also defined as a blueprint or detailed plan for how a research study is to be
completed: operationalizing variables so they can be measured, selecting a sample of interest to
study, collecting data to be used as a basis for testing hypotheses, and analyzing the results.
The function of research design is to provide for the collection of relevant evidence with minimal
expenditure of effort, time and money. Furthermore, research design explains how the researcher will
find answers to the research questions. It sets out the logic of the inquiry. But how all these can be
achieved depends mainly on the research purpose. Research purposes may be grouped into four
categories, viz.,

a. Exploration
b. Description
c. Diagnosis
d. Experimentation.
When the purpose happens to be an accurate description of a situation or of an association
between variables, the suitable design will be one that minimizes bias and maximizes the
reliability of the data collected and analyzed. When selecting a research design, it is important to
ensure that it is valid, workable and manageable.
The functions of a research design
The research design has two main functions. The first relates to the identification and/or development
of procedures and logistical arrangements required to undertake a study, and the second emphasizes
the importance of quality in these procedures to ensure their validity, objectivity, and accuracy.
A research design should include the following:
a) The study design per se and the logistical arrangements that you propose to undertake
b) The measurement procedures
c) The sampling strategy
d) The frame of analysis
e) Time frame
Selecting a study design
The study design is a part of the research design. It is the design of the study per se, whereas the
research design also includes other details related to the carrying out of the study.
The various designs have been classified by examining them from three different perspectives:
a) The number of contacts with the study population
b) The reference period of the study
c) The nature of the investigation.
The number of contacts
a) Cross-sectional study
b) Before-and- after study
c) Longitudinal studies


Features of Good Research Design
A research design appropriate for a particular research problem usually involves consideration
of the following factors:
1. the means of obtaining information
2. the availability and skills of the researcher
3. the objective of the problem to be studied
4. the availability of time and money for the research work.

5. Determining Sampling Design:
All the items under consideration in any field of inquiry constitute a universe or population.
A complete enumeration of all the items in the population is known as a census inquiry.
Because of the difficulty, relative inaccuracy and bias associated with a census inquiry, it often
becomes mandatory to select a sample, i.e., a few elements from the population.
In such cases, the researcher must decide the way of selecting a sample or what is popularly known
as the sample design.
In other words, a sample design is a definite plan determined before any data are actually collected
for obtaining a sample from a given population.
Samples can be either probability samples or non-probability samples.
With probability samples, each element has a known probability of being included in the sample,
whereas non-probability samples do not allow the researcher to determine this probability.
Probability sampling designs include simple random sampling, systematic sampling, stratified
sampling and cluster/area sampling, whereas non-probability samples are those based on
convenience sampling, judgment sampling and quota sampling techniques.
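As a minimal, hedged sketch (not a prescription), the following shows how three of the probability sampling designs named above might be drawn from a hypothetical sampling frame of 100 numbered households; the frame, strata and sample sizes are all invented for illustration:

```python
import random

random.seed(42)  # fixed seed so the sketch is reproducible
population = list(range(1, 101))  # hypothetical frame: households 1..100

# Simple random sampling: every element has an equal, known chance.
simple = random.sample(population, 10)

# Systematic sampling: random start, then every k-th element of the frame.
k = len(population) // 10          # sampling interval
start = random.randrange(k)        # random start within the first interval
systematic = population[start::k]

# Stratified sampling: divide the frame into strata (hypothetical
# urban/rural split), then draw proportionally from each stratum.
strata = {"urban": population[:60], "rural": population[60:]}
stratified = []
for group in strata.values():
    stratified += random.sample(group, len(group) // 10)

print(len(simple), len(systematic), len(stratified))
```

Each design yields a sample of ten households here; which design is appropriate depends on the frame, the strata of interest and the resources available.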


Types of study design
a) Number of contacts: one (cross-sectional studies), two (before-and-after studies), or three or
more (longitudinal studies)
b) Reference period: retrospective, prospective, or retro-prospective studies
c) Nature of the investigation: experimental, non-experimental, or quasi-experimental studies

The sample design to be used must be decided by the researcher taking into consideration the nature of
the inquiry and other related factors.

6. Collecting the data
In dealing with any real life problem it is often found that data at hand are inadequate, and hence, it
becomes necessary to collect data that are appropriate.
There are several ways of collecting the appropriate data which differ considerably in context of
money costs, time and other resources at the disposal of the researcher.
Primary data can be collected either through experiments or through surveys. If the researcher conducts
an experiment, he observes some quantitative measurements, or data, with the help of which he
examines the truth contained in his hypothesis. But in the case of a survey, data can be collected in any
one of the following ways:
a. Observation
b. Interview
c. Questionnaire
The researcher should select one of these methods of collecting data taking into consideration the
nature of investigation, objective and scope of the inquiry, financial resources, available time and the
desired degree of accuracy.
7. Execution of the project
The researcher should see, during this phase, that the project is executed in a systematic manner and in time.
8. Analysis of Data
After the data have been collected, the researcher turns to the task of analyzing them.
The analysis of data requires a number of closely related operations such as establishment of
categories, the application of these categories to raw data through coding, tabulation and then drawing
statistical inferences.
The unwieldy data should be condensed in to a few manageable groups and tables for further analysis.
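The coding-and-tabulation operations described above can be sketched as follows; the category labels, codebook and responses are hypothetical:

```python
from collections import Counter

# Raw responses as recorded in the field (hypothetical data).
raw = ["strongly agree", "agree", "Agree", "disagree", "agree", "strongly agree"]

# Coding: map each raw answer to a numeric category code via a codebook.
codebook = {"strongly agree": 5, "agree": 4, "neutral": 3,
            "disagree": 2, "strongly disagree": 1}
coded = [codebook[r.lower()] for r in raw]

# Tabulation: condense the coded data into a frequency table,
# from which statistical inferences can then be drawn.
table = Counter(coded)
print(sorted(table.items()))  # prints: [(2, 1), (4, 3), (5, 2)]
```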
9. Hypothesis Testing
After analyzing the data, the researcher is in a position to test the hypothesis, if any, he had formulated
earlier. Do the facts support the hypotheses, or do they happen to be contrary? This is the usual question
which should be answered while testing hypotheses.
Various tests, such as Chi square test, t-test, F-test may be applied.
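For illustration, the chi-square test statistic for independence can be computed by hand; the 2x2 table of counts below is entirely hypothetical, and in practice a statistical package would compute both the statistic and its p-value:

```python
# A minimal sketch of the chi-square test of independence for a 2x2 table
# (hypothetical counts: rows = group A/B, columns = yes/no).
observed = [[30, 10],
            [20, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Compare each observed count with the count expected under independence.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (obs - expected) ** 2 / expected

print(round(chi_square, 2))  # prints: 16.67
```

The resulting statistic is then compared against the chi-square distribution with the appropriate degrees of freedom (one, for a 2x2 table) to decide whether to reject the independence hypothesis.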

10. Generalization and Interpretation
If a hypothesis is tested and upheld several times, it may be possible for the researcher to arrive at a
generalization, i.e., to build a theory.
As a matter of fact, the real value of research lies in its ability to arrive at certain generalizations.
If the researcher had no hypotheses to start with, he might seek to explain his findings on the basis of
some theory. This is known as interpretation. The process of interpretation may quite often trigger off
new questions which in turn may lead to further research.
11. Preparation of the Research Report or the thesis
Finally, the researcher has to prepare the report of what has been done by him following the
appropriate formats and appropriate language.


Chapter 2: Survey Research
2.1. Introduction
Webster defines a survey as the action of ascertaining facts regarding conditions or the condition of
something to provide exact information especially to persons responsible or interested and as a
systematic collection and analysis of data on some aspect of an area or group.
A survey, then, is much more than the mere compiling of data. The data must be analyzed, interpreted,
and evaluated. Only after this processing can data become information. The "exactness" of the
information is determined by the surveyor's methods. Unless he makes a systematic collection of data,
followed by a careful analysis and evaluation with predefined objectives, his collection of data cannot
become exact information.
A survey can be anything from a short paper-and-pencil feedback form to an intensive one-on-one in-
depth interview.
In experiments, researchers place people in small groups, test one or two hypotheses with a few
variables, control the timing of the treatment and the dependent variable, and control for alternative
explanations. By contrast, survey researchers sample many respondents who answer the same
questions, measure many variables, test multiple hypotheses, and infer temporal order from
questions about past behavior, experiences or characteristics.
There are four different types of surveys: questionnaires, interviews, observations and
projective techniques.
2.2. A history of Survey Research
The modern survey can be traced back to ancient forms of the census. A census includes information on
characteristics of the entire population in a territory. It is based on what people tell officials or what
officials observe.
2.3. Types of Surveys
2.3.1. The Questionnaire
One of the steps in preparing the survey research is developing the data collection instrument.
The most common means of collecting data are the interview and the questionnaire.
In the past, the interview has been the most popular data-collecting instrument.
Recently, the questionnaire has surpassed the interview in popularity, especially in the military.


The Questionnaire: Pros and Cons
It is important to understand the advantages and disadvantages of the questionnaire as opposed to the
personal interview. This knowledge will allow you to maximize the strengths of the questionnaire
while minimizing its weaknesses.
The advantages of administering a questionnaire instead of conducting an interview are:
lower costs
better samples
respondent privacy (anonymity)
It is free from the bias of the interviewer; answers are in respondents' own words.
Respondents have adequate time to give well thought out answers.
Respondents, who are not easily approachable, can also be reached conveniently.
The primary advantage is lower cost, in time as well as money. Not having to train interviewers
eliminates a lengthy and expensive requirement of interviewing.
The questionnaire can be administered simultaneously to large groups whereas an interview requires
each individual to be questioned separately. This allows the questions to reach a given number of
respondents more efficiently than is possible with the interview. Finally, the cost of postage should be
less than that of travel or telephone expenses.
Since a typical questionnaire usually has a lower cost per respondent, it can reach more people within
a given budget (or time) limit. This can enhance the conduct of a larger and more representative sample.
The questionnaire provides a standardized data-gathering procedure. Using a well-constructed
questionnaire can minimize the effects of potential human errors (for example, altering the pattern of
question asking, calling at inconvenient times, and biasing by explaining). The use of a questionnaire
also eliminates any bias introduced by the feelings of the respondents towards the interviewer (or vice versa).
Although the point is debatable, most surveyors believe the respondent will answer a questionnaire
more frankly than he would answer an interviewer, because of a greater feeling of anonymity. The
respondent has no one to impress with his/her answers and need have no fear of anyone hearing them.
To maximize this feeling of privacy, it is important to guard, and emphasize, the respondent's anonymity.
The primary disadvantages of the questionnaire are:
Non-returns,
Misinterpretation, and
Validity problems.
It can be used only when respondents are educated and cooperative.
The control over the questionnaire may be lost once it is sent.
There is inbuilt inflexibility because of the difficulty of amending the approach once
questionnaires have been dispatched.
This method is the slowest of all.
Non-returns are questionnaires or individual questions that are not answered by the people to whom
they were sent. The important point about these low response rates is not the reduced size of the
sample, which could easily be overcome by sending out more questionnaires, but the possibility of
bias. Non-response is not a random process; it has its own determinants, which vary from survey to
survey.
those opposed to it might be afraid to speak out, and they might comprise the majority of the non-
returns. This would introduce non-random (or systematic) bias into your survey results, especially if
you found only a small number of the returns were in favor of the policy. Non-returns cannot be
overcome entirely. What we can do is try to minimize them. Techniques to accomplish this are
covered later in this chapter.
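The policy example above can be made concrete with some made-up numbers: suppose 1,000 questionnaires are sent, 60% of recipients actually favor the policy, but supporters return the form at a much higher rate than opponents. All figures below are hypothetical:

```python
# Illustration of non-return (non-response) bias with made-up numbers.
sent = 1000
favor, oppose = 600, 400                 # true split in the full sample
favor_return_rate, oppose_return_rate = 0.50, 0.20  # opponents reply less often

favor_returned = int(favor * favor_return_rate)     # 300 returns in favor
oppose_returned = int(oppose * oppose_return_rate)  # 80 returns opposed

true_share = favor / sent
observed_share = favor_returned / (favor_returned + oppose_returned)
print(true_share, round(observed_share, 2))  # prints: 0.6 0.79
```

Because non-response is systematic rather than random, the returned questionnaires overstate support (79% vs the true 60%), and no amount of extra mailing to the same kind of respondents would fix it.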
Misinterpretation occurs when the respondent does not understand either the survey instructions or
the survey questions. If respondents become confused, they will either give up on the survey
(becoming a nonreturn) or answer questions in terms of the way they understand it, but not necessarily
the way you meant it. Some view the latter problem as a more dangerous occurrence than merely
nonresponding. The questionnaire instructions and questions must be able to stand on their own and
must use terms that have commonly understood meanings throughout the population under study. If
novel terms must be used, be sure to define them so all respondents understand your meaning.
The third disadvantage of using a questionnaire is inability to check on the validity of the answer.
Did the person you wanted to survey give the questionnaire to a friend or complete it personally? Did
the individual respond indiscriminately? Did the respondent deliberately choose answers to mislead
the surveyor? Without observing the respondent's reactions (as would be the case with an interview)
while completing the questionnaire, you have no way of knowing the true answers to these questions.
The secret in preparing a survey questionnaire is to take advantage of the strengths of questionnaires
(lower costs, more representative samples, standardization, and privacy) while minimizing the number

of nonreturns, misinterpretations, and validity problems. This is not always as easy as it sounds. But an
inventive surveyor can very often find legitimate ways of overcoming the disadvantages.
The key to minimizing the disadvantages of the survey questionnaire lies in the construction of the
questionnaire itself.
A poorly developed questionnaire contains the seeds of its own destruction. Each of the three portions
of the questionnaire - the cover letter, the instructions, and the questions - must work together to
have a positive impact on the success of the survey.
The cover letter should explain to the respondent the purpose of the survey and motivate him to
reply truthfully and quickly. If possible, it should explain why the survey is important to him, how
he was chosen to participate, and who is sponsoring the survey (the higher the level of sponsorship
the better). Also the confidentiality of the results should be strongly stressed. A well written cover
letter can help minimize both nonreturn and validity problems. In support of the statement above
regarding level of sponsorship, the signature block on the letter should be as high level as you can
get commensurate with the topic being investigated. Another tip that seems to help improve response
rate is to identify the survey as official. In general, the more official the survey appears, the less
likely it is to be disregarded.
The cover letter should be followed by a clear set of instructions explaining how to complete the
survey and where to return it. If the respondents do not understand the mechanical procedures
necessary to respond to the questions, their answers will be meaningless. In the case of a mail
questionnaire, the instructions substitute for your presence, so you must anticipate any questions or
problems that may arise and attempt to prevent them from occurring. Remember anonymity! If you
do not want respondents to provide their names say so explicitly in the instructions. If you need
respondents' name included on the survey for tracking or analysis purposes, you will need to put a
Privacy Act Statement somewhere on the survey. The "Instructions" page is usually a good place for
this statement. It places it in a prominent place where all respondents will see it, but does not clutter
the instrument itself or the cover letter.
The third and final part of the questionnaire is the set of questions. Since the questions are the
means by which you are going to collect your data, they should be consistent with your survey
plan. They should not be ambiguous or encourage feelings of frustration or anger that will lead to
nonreturns or validity problems.


1. Types of Questionnaires Based on the Survey Situation
When most people think of questionnaires, they think of the mail survey.
There are many advantages to mail surveys.
a) They are relatively inexpensive to administer. You can send the exact same instrument
to a wide number of people.
b) They allow the respondent to fill it out at their own convenience.
But there are some disadvantages as well.
a) Response rates from mail surveys are often very low.
b) Mail questionnaires are not the best vehicles for asking for detailed written responses.
Self-administered Questionnaires
A second type is the group administered questionnaire. A sample of respondents is brought
together and asked to respond to a structured sequence of questions. Traditionally, questionnaires were
administered in group settings for convenience. The researcher could give the questionnaire to those
who were present and be fairly sure that there would be a high response rate. If the respondents were
unclear about the meaning of a question they could ask for clarification. And, there were often
organizational settings where it was relatively easy to assemble the group (in a company or business,
for instance).
What's the difference between a group administered questionnaire and a group interview or focus
group? In the group administered questionnaire, each respondent is handed an instrument and asked to
complete it while in the room. Each respondent completes an instrument. In the group interview or
focus group, the interviewer facilitates the session. People work as a group, listening to each other's
comments and answering the questions. Someone takes notes for the entire group -- people don't
complete an interview individually.
A less familiar type of questionnaire is the household drop-off survey. In this approach, a
researcher goes to the respondent's home or business and hands the respondent the instrument. In some
cases, the respondent is asked to mail it back or the interviewer returns to pick it up. This approach
attempts to blend the advantages of the mail survey and the group administered questionnaire. Like the
mail survey, the respondent can work on the instrument in private, when it's convenient. Like the
group administered questionnaire, the interviewer makes personal contact with the respondent -- they
don't just send an impersonal survey instrument. And, the respondent can ask questions about the

study and get clarification on what is to be done. Generally, this would be expected to increase the
percent of people who are willing to respond.
2. Types of Questionnaires Based on Level of Measurement
We can also classify questions in terms of their level of measurement. For instance, we might measure
occupation using a nominal question. Here, the number next to each response has no meaning except
as a placeholder for that response.
We might ask respondents to rank order their preferences for presidential candidates using an ordinal
question. We want the respondent to put a 1, 2, 3 or 4 next to the candidate, where 1 is the
respondent's first choice.
We can also construct survey questions that attempt to measure on an interval level. One of the most
common of these types is the traditional 1-to-5 rating (or 1-to-7, or 1-to-9, etc.). This is sometimes
referred to as a Likert response scale. Here, we see how we might ask an opinion question on a 1-to-5
bipolar scale (it's called bipolar because there is a neutral point and the two ends of the scale are at
opposite positions of the opinion):
Another interval question uses an approach called the semantic differential. Here, an object is
assessed by the respondent on a set of bipolar adjective pairs (using 5-point rating scale):

Finally, we can also get at interval measures by using what is called a cumulative or Guttman scale.
Here, the respondent checks each item with which they agree. The items themselves are constructed so
that they are cumulative -- if you agree to one, you probably agree to all of the ones above it in the list:
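As a sketch of both scoring approaches (all data hypothetical), a Likert item's 1-to-5 codes can be averaged as an interval measure, and a Guttman scale score is simply the count of cumulative items endorsed:

```python
# Hypothetical 1-to-5 Likert responses to a single opinion item
# (1 = strongly disagree, 5 = strongly agree).
responses = [4, 5, 3, 4, 2, 5, 4]

# Treating the codes as an interval scale lets us summarize with a mean.
mean_score = sum(responses) / len(responses)
print(round(mean_score, 2))  # prints: 3.86

# Guttman (cumulative) scale: items ordered from least to most demanding;
# a respondent's score is simply how many items they endorsed.
items_endorsed = [True, True, True, False]  # agreed to the first three items
guttman_score = sum(items_endorsed)
print(guttman_score)  # prints: 3
```

Whether Likert codes may truly be treated as interval-level data is itself debated; the sketch simply follows the common practice described above.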

3. Types of Questionnaires Based on Structure or Response Format
The types of questionnaire vary widely. Questionnaires may be classified on a number of different
bases. The classification of questionnaires used here is based on the variability of structure, that is,
how you collect the answer from the respondent. Accordingly, we have:
structured/ standardized questionnaire
Unstructured/ non-structured questionnaire.
Structured questionnaires/response formats are those in which there are definite, concrete and
preordained questions with additional questions limited to those necessary to clarify inadequate
answers or to elicit more detailed responses. The questions are presented with exactly the same
wording and in the same order to all respondents.
Structured questions/formats help the respondent to respond more easily and help the researcher to
accumulate and summarize responses more efficiently. But, they can also constrain the respondent and
limit the researcher's ability to understand what the respondent really means. There are many different
structured response formats, each with its own strengths and weaknesses. We'll review the major ones below.
a) Fill-In-The-Blank. One of the simplest response formats is a blank line. A blank line can be used
for a number of different response types.
b) Check The Answer. The respondent places a check next to the response(s). Sometimes, we supply
a box that the person can fill in with an 'X' (which is sort of a variation on the check mark). By
convention, we usually use the checkmark format when we want to allow the respondent to select
multiple items.
Whenever you use a checklist, you want to be sure that you ask the following questions:
- Are all of the alternatives covered?
- Is the list of reasonable length?
- Is the wording impartial?
- Is the form of the response easy, uniform?

Sometimes you may not be sure that you have covered all of the possible responses in a checklist. If
that is the case, you should probably allow the respondent to write in any other options that may apply.
c) Circle The Answer. Sometimes the respondent is asked to circle an item to indicate their
response. Usually we are asking them to circle a number.
Advantages of structured Questionnaire
a. It is easier and quicker for respondents to answer.
b. The answers of different respondents are easier to compare.
c. Answers are easier to code and statistically analyze.
d. The response choices can clarify question meaning for respondents.
e. Respondents are more likely to answer sensitive questions.
f. There are fewer irrelevant or confused answers to questions.
g. Less articulate or less literate respondents are not at a disadvantage.
h. Replication is easier.
Disadvantages of structured Questionnaire
a. They can suggest ideas that the respondent would not otherwise have.
b. Respondents with no opinion or no knowledge can answer anyway.
c. Respondents can be frustrated because their desired answer is not a choice.
d. It is confusing if many response choices are offered.
e. Misinterpretation of a question can go unnoticed.
f. Distinction between respondent answers may be blurred.
g. Clerical mistakes or making the wrong response is possible.
h. They force respondents to give simplistic responses to complex issues.
i. They force people to make choices they would not make in the real world.
Unstructured questionnaires/response formats: While there are a wide variety of structured response
formats, there are relatively few unstructured ones. What is an unstructured response format?
Generally, it's written text. If the respondent writes down text as the response, you've got an
unstructured response format. These can vary from short comment boxes to the transcript of an
interview. In almost every short questionnaire, there are one or more short text field questions.
Advantages of unstructured questionnaires/response format
a. They permit an unlimited number of possible answers.
b. Respondents can answer in detail and can qualify and clarify responses.
c. Unanticipated findings can be discovered.
d. They permit adequate answers to complex issues.

e. They permit creativity, self-expression, and richness of detail.
f. They reveal a respondent's logic, thinking process, and frame of reference.
Disadvantages of unstructured questionnaire/response format
a. Different respondents give different degrees of detail in answers.
b. Responses may be irrelevant or buried in useless detail.
c. Comparison and statistical analysis become very difficult.
d. Coding response is difficult.
e. Articulate and highly literate respondents have an advantage.
f. Questions may be too general for respondents, who may lose direction.
g. Responses are written verbatim, which is difficult for interviewers.
h. A great amount of respondent time, thought and effort is necessary.
i. Respondents can be intimidated by questions.
j. Answers take up a lot of space in the questionnaire.
Questionnaire Construction/ Wording Decision
Many researchers have investigated the complex art of question writing. From their experiences, they
offer valuable advice. Below are some helpful hints typical of those that appear most often in texts on
question construction.
1. Keep the language simple.
Analyze your audience and write on their level. It is usually suggested that writing at the sixth-grade
level may be appropriate.
Avoid the use of technical terms or jargon.
2. Keep the questions short.
Long questions tend to become ambiguous and confusing. A respondent, in trying to comprehend a
long question, may leave out a clause and thus change the meaning of the question.
3. Keep the number of questions to a minimum.
There is no commonly agreed on maximum number of questions that should be asked, but research
suggests higher return rates correlate highly with shorter surveys.
Ask only questions that will contribute to your survey.
Apply the So what? and Who cares? tests to each question.
Nice-to-know questions only add to the size of the questionnaire.
Do not leave out, however, questions that would yield necessary data simply because it will shorten
your survey. If the information is necessary, ask the question.


4. Limit each question to one idea or concept.
A question consisting of more than one idea may confuse the respondent and lead to a meaningless
answer. Consider this question: Are you in favor of raising pay and lowering benefits? What would a
yes (or no) answer mean?
5. Do not ask leading questions.
These questions are worded in a manner that suggests an answer. Some respondents may give the
answer you are looking for whether or not they think it is right. Such questions can alienate the
respondent and may open your questionnaire to criticism.
A properly worded question gives no clue as to which answer you may believe to be the correct one.
6. Use subjective terms such as good, fair, and bad sparingly, if at all.
These terms mean different things to different people. One person's fair may be another person's
bad. How much is often and how little is seldom?
7. Allow for all possible answers.
Respondents who cannot find their answer among your list will be forced to give an invalid reply or,
possibly, become frustrated and refuse to complete the survey.
Wording the question to reduce the number of possible answers is the first step.
Avoid dichotomous (two-answer) questions, except for obvious demographic questions.
If you cannot avoid them, add a third option, such as no opinion, don't know, or other. These may
not get the answers you need but they will minimize the number of invalid responses. A great number
of don't know answers to a question in a fact-finding survey can be a useful piece of information.
But a majority of other answers may mean you have a poor question, and perhaps should be cautious
when analyzing the results.
8. Avoid emotional or morally charged questions and too direct questions
There are times when asking a question too directly may be too threatening or disturbing for
respondents. The respondent may feel your survey is getting a bit too personal!
For instance, consider a study where you want to discuss battlefield experiences with former soldiers
who experienced trauma. Examine the following three question options:
- How did you feel about being in the war?
- How well did the equipment hold up in the field?
- How well were new recruits trained?

The first question may be too direct. For this population it may elicit powerful negative emotions
based on their recollections. The second question is a less direct one. It asks about equipment in the
field, but, for this population, may also lead the discussion toward more difficult issues to discuss
directly. The last question is probably the least direct and least threatening. If you are doing a study
where the respondents may experience high levels of stress because of the questions you ask, you
should reconsider the ethics of doing the study.
9. Understand the should-would question.
Usually respondents answer should questions from a social or moral point of view while answering
would questions in terms of personal preference.
10. Formulate your questions and answers to obtain exact information and to minimize confusion.
The survey author has to always be on the lookout for questions that could be misunderstood or
confusing. Some terms are just too vague to be useful. For instance, if you ask a question about the
"mass media," what do you mean? The newspapers? Radio? Television? Does "How old are you?" mean
on your last or your nearest birthday? Does "What is your (military) grade?" mean permanent or
temporary grade? As of what date?
By including instructions like Answer all questions as of (a certain date), you can alleviate many
such conflicts.
11. Include a few questions that can serve as checks on the accuracy and consistency of the
answers as a whole.
Have some questions that are worded differently, but are soliciting the same information, in different
parts of the questionnaire.
These questions should be designed to identify the respondents who are just marking answers
randomly or who are trying to game the survey (giving answers they think you want to hear).
If you find a respondent who answers these questions differently, you have reason to doubt the
validity of their entire set of responses. For this reason, you may decide to exclude their response
sheet(s) from the analysis.
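The screening rule described above can be sketched as follows. The question numbers, scale, and respondent data are hypothetical; here q1 and q4 are assumed to ask the same thing, with q4 reverse-worded so that a consistent respondent's two answers on a 1-to-5 scale should sum to about 6:

```python
# Sketch of a consistency check across differently worded duplicate questions.
respondents = [
    {"id": 1, "q1": 5, "q4": 1},  # consistent: 5 + 1 == 6
    {"id": 2, "q1": 4, "q4": 2},  # consistent: 4 + 2 == 6
    {"id": 3, "q1": 5, "q4": 5},  # suspicious: marking high on everything
]

TOLERANCE = 1  # allow the paired answers to differ by one scale point

suspect_ids = [r["id"] for r in respondents
               if abs((r["q1"] + r["q4"]) - 6) > TOLERANCE]
print(suspect_ids)  # prints: [3]
```

Flagged response sheets are candidates for exclusion from the analysis, as the text suggests.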
12. Organize the pattern of the questions appropriately:
Place demographic questions at the end of the questionnaire.
Have your opening questions arouse interest.
Ask easier questions first.
To minimize conditioning, have general questions precede specific ones.

Group similar questions together.
If you must use personal or emotional questions, place them at the end of the questionnaire.
Thank the respondent at the beginning for allowing you to conduct your study
Keep your survey as short as possible -- only include what is absolutely necessary
Be sensitive to the needs of the respondent
Be alert for any sign that the respondent is uncomfortable
Thank the respondent at the end for participating
Assure the respondent that you will send a copy of the final results.
Note: The next two hints apply to the entire questionnaire including the cover letter, instructions, and questions.
13. Pretest (pilot test) the questionnaire.
This is the most important step in preparing your questionnaire.
The purpose of the pretest is to see just how well your cover letter motivates your respondents and
how clear your instructions, questions, and answers are.
You should choose a small group of people (from three to ten should be sufficient) you feel are
representative of the group you plan to survey.
After explaining the purpose of the pretest, let them read and answer the questions without
interruption. When they are through, ask them to critique the cover letter, instructions, and each of the
questions and answers. Don't be satisfied with learning only what confused or alienated them.
Question them to make sure that what they thought something meant was really what you intended it
to mean.
Use the above 12 hints as a checklist, and go through them with your pilot test group to get their
reactions on how well the questionnaire satisfies these points. Finally, redo any parts of the
questionnaire that are weak.
14. Have your questionnaire neatly produced on quality paper.
A professional looking product will increase your return rate. But always remember the adage You
can't make a silk purse out of a sow's ear.
A poorly designed survey that contains poorly written questions will yield useless data regardless of
how pretty it looks.
15. Be realistic in your assumptions about the respondents

Sometimes we don't stop to consider how a question will appear from the respondent's point-of-view.
We don't think about the assumptions behind our questions. For instance, if you ask what social class
someone's in, you assume that they know what social class is and that they think of themselves as
being in one. In this kind of case, you may need to use a filter question first to determine whether
either of these assumptions is true.
16. Finally, make your survey interesting!
Question Placement
One of the most difficult tasks facing the survey designer involves the ordering of questions. Which
topics should be introduced early in the survey and which later? If you leave your most important
questions until the end, you may find that your respondents are too tired to give them the kind of
attention you would like. If you introduce them too early, they may not yet be ready to address the
topic, especially if it is a difficult or disturbing one. Whenever you think about question placement,
consider the following questions:
- Is the answer influenced by prior questions?
- Does the question come too early or too late to arouse interest?
- Does the question receive sufficient attention?
The Opening Questions
The opening few questions should, in general, be easy to answer. You might start with some simple
descriptive questions that will get the respondent rolling. You should never begin your survey with
sensitive or threatening questions.
Sensitive Questions
Before asking difficult and uncomfortable subjects, you should attempt to develop some trust or
rapport with the respondent. Often, preceding the sensitive questions with some easier warm-up ones
will help. But, you have to make sure that the sensitive material does not come up abruptly or appear
unconnected with the rest of the survey. It is often helpful to have a transition sentence between
sections of your instrument to give the respondent some idea of the kinds of questions that are coming.
For instance, you might lead into a section on personal material with the transition: In this next
section of the survey, we'd like to ask you about your personal relationships. Remember, we do not
want you to answer any questions if you are uncomfortable doing so.
Bias and How to Combat It
Surveyors must be aware of ways the surveys might become biased and of the available means for
combating bias.
The main sources of bias in a questionnaire are:
a non-representative sample
leading questions
question misinterpretation
untruthful answers
Surveyors can expose themselves to possible non-representative sample bias in two ways.
The first is to actually choose a non-representative sample. This bias can be eliminated
by careful choice of the sample.
The second way is to have a large number of non-returns.
The nonreturn bias (also called non-respondent bias) can affect both the sample survey and the
complete survey. The bias stems from the fact that the returned questionnaires are not necessarily
evenly distributed throughout the sample. The opinions or attitudes expressed by those who returned
the survey may or may not represent the attitudes or opinions of those who did not return the survey. It
is impossible to determine which is true since the non-respondents remain an unknown quantity.
The following are techniques used to get people to reply to surveys.
1. Use follow-up letters.
These letters are sent to the nonrespondents after a period of a couple of weeks asking them again to
fill out and return the questionnaire. The content of this letter is similar to that of the cover letter.
If you are conducting a volunteer survey, you should anticipate the need for following up with non-
respondents and code the survey in some unobtrusive way to tell who has and who has not yet
responded. If you don't do that, but still need to get in touch with nonrespondents, consider placing ads
in local papers or base bulletins, announcements at commander's call, or notices posted in public
places. If at all possible, provide a fresh copy of the survey with the follow-up letter. This often
increases the return rate over simply sending out a letter alone.
2. Use high-level sponsorship.

This hint was mentioned in an earlier section. People tend to reply to surveys sponsored by
organizations they know or respect. Effort spent in doing this will result in a higher percentage of
returns. If possible, use the letterhead of the sponsor on your cover letter.
3. Make your questionnaire attractive, simple to fill out, and easy to read.
A professional product usually gets professional results.
4. Keep the questionnaire as short as possible.
You are asking for a person's time, so make your request as small as possible.
5. Use your cover letter to motivate the person to return the questionnaire.
One form of motivation is to have the letter signed by an individual known to be respected by the
target audience for your questionnaire. In addition, make sure the individual will be perceived by the
audience as having a vested interest in the information needed.
6. Use inducements to encourage a reply.
These can range from a small amount of money attached to the survey to an enclosed stamped
envelope. A promise to report the results to each respondent can be helpful. If you do promise a report,
be sure to send it.
Proper use of these techniques can lower the nonreturn rate to acceptable levels. Keep in mind, though,
that no matter what you do, there will always be non-respondents to your surveys. Make sure the effort
and resources you spend are in proportion with the return you expect to get.
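The tracking step described in technique 1 (coding each survey so you can tell who has and has not responded) amounts to a simple set difference between the sample and the returns. The sketch below is illustrative only; the ID scheme and function names are assumptions, not part of the text:

```python
# Illustrative sketch: identify non-respondents from coded survey IDs.

def follow_up_list(sample_ids, returned_ids):
    """Return the survey codes that still need a follow-up letter."""
    # Everyone in the sample who has not yet returned a questionnaire.
    return sorted(set(sample_ids) - set(returned_ids))

sample = ["S001", "S002", "S003", "S004", "S005"]   # codes mailed out
returned = ["S002", "S005"]                         # codes received back

print(follow_up_list(sample, returned))  # ['S001', 'S003', 'S004']
```

The same list can then drive the second mailing, with a fresh copy of the survey enclosed as suggested above.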
The second source of bias is misinterpretation of questions. We have seen that these can be limited
by clear instructions, well-constructed questions, and judicious pilot testing of the survey.
Biased questions can also be eliminated by constructing the questions properly and by using a pilot test.
Finally, internal checks and a good motivational cover letter can control bias introduced by untruthful answers.
Although bias cannot be eliminated totally, proper construction of the questionnaire, a well-chosen
sample, follow-up letters, and inducements can help control it.
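The non-return rate discussed above is simple arithmetic: returns divided by questionnaires sent, and its complement. A minimal sketch (the 400/288 figures are made up for illustration; no particular threshold for "acceptable" is given in the text):

```python
# Illustrative arithmetic: return rate and non-return rate of a mail survey.

def return_rate(sent, returned):
    """Fraction of distributed questionnaires that came back."""
    return returned / sent

sent, returned = 400, 288  # hypothetical figures
rate = return_rate(sent, returned)
print(f"Return rate: {rate:.0%}, non-return rate: {1 - rate:.0%}")
# Return rate: 72%, non-return rate: 28%
```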

The questionnaire is the means for collecting your survey data. It should be designed with your data
collection plan in mind. Each of its three parts should take advantage of the strengths of questionnaires
while minimizing their weaknesses. Each of the different kinds of questions is useful for eliciting
different types of data, but each should be constructed carefully with well- developed construction
guidelines in mind. Properly constructed questions and well-followed survey procedures will allow
you to obtain the data needed to check your hypothesis and, at the same time, minimize the chance that
one of the many types of bias will invalidate your survey results.

2.3.2. Schedules
This method of data collection is very much like the collection of data through questionnaires, with
the difference that schedules (proformas containing a set of questions) are filled in by
enumerators who are specially appointed for the purpose.
These enumerators along with schedules go to respondents, put to them the questions from the
proforma in the order the questions are listed and record the replies in the space meant for the same in
the proforma.
In certain situations schedules may be handed over to respondents and enumerators may help them in
recording their answers to various questions in the said schedules.
Enumerators explain the aims and objects of the investigation and also remove the difficulties which
any respondent may feel in understanding the implications of a particular question or the definition or
concept of difficult terms.
This method requires the selection of enumerators for filling up schedules or assisting respondents to
fill up schedules, and as such the enumerators should be very carefully selected. The enumerators should
be trained to perform their job well, and the nature and scope of the investigation should be explained
to them thoroughly so that they may well understand the implications of the different questions put in the schedule.
Enumerators should be intelligent and must possess the capacity of cross-examination in order to find
out the truth. Above all, they should be honest, sincere, hard-working, and should have patience and perseverance.
This method of data collection is very useful in extensive enquiries and can lead to fairly reliable
results. It is, however, very expensive and is usually adopted in investigations conducted by
governmental agencies or by some big organizations. Population census all over the world is
conducted through this method.
This method is suitable where finance and trained enumerators are available to cover a wide field and
where some significance is attached to the accuracy of the results obtained.
It can be adopted even in those cases where informants are illiterate.
It eliminates to a great extent the problem of non-response.

The enumerator can explain the significance of the inquiry and the questions in the questionnaire
personally to the informants and thus ensure the collection of accurate and reliable information.
The enumerator might be biased and may not enter the answers given by the respondents
truthfully. He may twist or suppress the information provided by the informant.
Where there are many enumerators, they may interpret various terms in the questionnaire
according to their own understanding of the terms. The interpretation may thus be quite
different from one enumerator to another.
The bias might arise due to the state of mind of the informant or the environment in which
he is placed.
This method is somewhat costly and time-consuming, since it requires a large number of
enumerators who are paid persons.

Difference between questionnaires and Schedules
Both the questionnaire and the schedule are popularly used methods of collecting data in research surveys. There is
much resemblance in the nature of the two, and this fact has made many people remark that, from
a particular point of view, the two methods can be taken to be the same. But from a technical point of
view there is a difference between the two. The important points of difference are as under:
1. The questionnaire is generally sent through mail to informants to be answered as specified in a
covering letter, but otherwise without further assistance from the sender. The schedule is
generally filled out by the research worker or the enumerator, who can interpret questions when necessary.
2. The questionnaire method is relatively economical.
3. Non-response is usually high in case of questionnaire.
4. In case of questionnaire, it is not always clear as to who replies, but in case of schedule the
identity of the respondent is known.
5. The questionnaire method is likely to be slower than the schedule method.
6. Personal contact is generally not possible in case of the questionnaire method, but in case of
schedules direct personal contact is established with respondents.
7. Questionnaire method is only used when respondents are literate and cooperative, but in case of
schedules the information can be gathered even when the respondents happen to be illiterate.

8. Wider and more representative sample coverage is possible in case of the questionnaire method,
whereas in the case of schedules there usually remains the difficulty of sending enumerators over a relatively
wide area.
9. Risk of collecting incomplete and wrong information is relatively high in case of questionnaire
method than in case of schedule.
10. The success of the questionnaire method lies more in the quality of the questionnaire itself, whereas in
case of schedules much depends upon the honesty and competence of enumerators.
11. Along with schedules, the observation method can also be used, but such a thing is not possible in case of the
questionnaire method.
2.3.3. Interviews
Interviews are among the most challenging and rewarding forms of data collection techniques. They
require personal sensitivity and adaptability as well as the ability to stay within the bounds of the
designed protocol. Interviews are a far more personal form of research than questionnaires and schedules.
Types of Interviews
1. Face-to-Face Interviews / Personal Interviews
In the personal interview, the interviewer works directly with the respondent. Unlike with mail
surveys, the interviewer has the opportunity to probe or ask follow-up questions. And, interviews are
generally easier for the respondent, especially if what is sought is opinions or impressions.
Advantages of Personal Interviews
a. It has the highest response rates.
b. Quick response can be attained.
c. Personal contacts are involved
d. Follow up questions can be asked.
e. It permits the longest questionnaire.
f. Higher flexibility
g. Interviewers can observe the surroundings and can use nonverbal communication and
visual aids.

h. The interviewer can control who answers the questions.
i. All types of questions can be asked including complex questions using illustrations and
extensive probes.
Disadvantages of Personal Interviews
a. Interviews can be very time-consuming.
b. Interviews are resource-intensive or very expensive, as training, travel, and supervision costs are involved.
c. Interviewer bias is greatest.
d. The interviewer's wording, tone of voice, and appearance may matter.
2. Telephone Interview
Another type of interview is the telephone interview. It is a popular survey method. Most of the
major public opinion polls that are reported are based on telephone interviews.
Advantages of Telephone Interviews
a. Telephone interviews enable a researcher to gather information rapidly.
b. They allow for some personal contact between the interviewer and the respondent.
c. They allow the interviewer to ask follow-up questions.
d. They are cheaper than the personal interview.
e. No field staff is required.
f. Representative and wider distribution of sample is possible.
Disadvantages of Telephone Interviews
a. Many people don't have publicly-listed telephone numbers. Some don't have telephones.
b. People often don't like the intrusion of a call to their homes.
c. Telephone interviews have to be relatively short or people will feel imposed upon.
d. Noise may interrupt the process.
e. Possibility of the bias of the interviewer is relatively more.
f. It is not suitable for intensive surveys where comprehensive answers are required to various questions.

The process of conducting the interview
1. Preparation
1.1. Knowing the Role of the Interviewer and Preparing for It
The interviewer is really the "jack-of-all-trades" in survey research. The interviewer's role is complex
and multifaceted. It includes the following tasks:
a) Locate and enlist cooperation of respondents
b) Motivate respondents to do a good job
c) Clarify any confusion/concerns
d) Observe quality of responses
e) Conduct a good interview
1.2. Training the Interviewers
One of the most important aspects of any interview study is the training of the interviewers
themselves. In many ways the interviewers are your measures, and the quality of the results is totally
in their hands. Even in small studies involving only a single researcher-interviewer, it is important to
organize in detail and rehearse the interviewing process before beginning the formal study.
Here are some of the major topics that should be included in interviewer training:
a) Describe the entire study
b) State who is sponsor of research
c) Teach enough about survey research
d) Explain the sampling logic and process
e) Explain interviewer bias
f) "Walk through" the interview
g) Explain respondent selection procedures
h) Reading maps
i) Identifying households
j) Identifying respondents
k) Rehearse interview
l) Explain supervision
m) Explain scheduling

1.3. Preparing the Interviewer's Kit
2. Conducting the Interview
So all the preparation is complete, the training done, the interviewers ready to proceed, their "kits" in
hand. It's finally time to do an actual interview. Each interview is unique, like a small work of art (and
sometimes the art may not be very good). Every interview includes some common components.
There's the opening, where the interviewer gains entry and establishes the rapport and tone for what
follows. There's the middle game, the heart of the process, that consists of the protocol of questions
and the improvisations of the probe. And finally, there's the endgame, the wrap-up, where the
interviewer and respondent establish a sense of closure. Whether it's a two-minute phone interview or
a personal interview that spans hours, the interview is a bit of theater, a mini-drama that involves real
lives in real time.
2.1. Opening Remarks
In many ways, the interviewer has the same initial problem that a salesperson has. You have to get the
respondent's attention initially for a long enough period that you can sell them on the idea of
participating in the study. Many of the remarks here assume an interview that is being conducted at a
respondent's residence. But the analogies to other interview contexts should be straightforward.
- Gaining entry
The first thing the interviewer must do is gain entry. Several factors can enhance the prospects.
Probably the most important factor is your initial appearance. The interviewer needs to dress
professionally and in a manner that will be comfortable to the respondent. In some contexts a business
suit and briefcase may be appropriate. In others, it may intimidate. The way the interviewer appears
initially to the respondent has to communicate some simple messages -- that you're trustworthy,
honest, and non-threatening. Cultivating a manner of professional confidence, the sense that the
respondent has nothing to worry about because you know what you're doing -- is a difficult skill to
teach and an indispensable skill for achieving initial entry.
- Doorstep technique
You're standing on the doorstep and someone has opened the door, even if only halfway. You need to
smile. You need to be brief. State why you are there and suggest what you would like the respondent
to do. Don't ask -- suggest what you want. Instead of saying "May I come in to do an interview?", you
might try a more imperative approach like "I'd like to take a few minutes of your time to interview
you for a very important study."
- Introduction
If you've gotten this far without having the door slammed in your face, chances are you will be able to
get an interview. Without waiting for the respondent to ask questions, you should move to introducing
yourself. You should have this part of the process memorized so you can deliver the essential
information in 20-30 seconds at most. State your name and the name of the organization you represent.
Show your identification badge and the letter that introduces you. You want to have as legitimate an
appearance as possible. If you have a three-ring binder or clipboard with the logo of your organization,
you should have it out and visible. You should assume that the respondent will be interested in
participating in your important study -- assume that you will be doing an interview here.
- Explaining the study
At this point, you've been invited to come in (After all, you're standing there in the cold, holding an
assortment of materials, clearly displaying your credentials, and offering the respondent the chance to
participate in an interview -- to many respondents, it's a rare and exciting event. They hardly ever get
asked their views about anything, and yet they know that important decisions are made all the time
based on input from others.). Or, the respondent has continued to listen long enough that you need to
move onto explaining the study. There are three rules to this critical explanation: 1) Keep it short; 2)
Keep it short; and 3) Keep it short! The respondent doesn't need or want to know all of the neat
nuances of this study, how it came about, how you convinced your thesis committee to buy into it, and
so on. You should have a one or two sentence description of the study memorized. No big words. No
jargon. No detail. There will be more than enough time for that later (and you should bring some
written materials you can leave at the end for that purpose). This is the "25 words or less" description.
What you should spend some time on is assuring the respondent that you are interviewing them
confidentially, and that their participation is voluntary.
2.2. Asking the Questions
You've gotten in. The respondent has asked you to sit down and make yourself comfortable. It may be
that the respondent was in the middle of doing something when you arrived and you may need to
allow them a few minutes to finish the phone call or send the kids off to do homework. Now, you're
ready to begin the interview itself.
- Use questionnaire carefully, but informally
The questionnaire is your friend. It was developed with a lot of care and thoughtfulness. While you
have to be ready to adapt to the needs of the setting, your first instinct should always be to trust the
instrument that was designed. But you also need to establish a rapport with the respondent. If you have
your face in the instrument and you read the questions, you'll appear unprofessional and uninterested.
Even though you may be nervous, you need to recognize that your respondent is most likely even
more nervous. If you memorize the first few questions, you can refer to the instrument only
occasionally, using eye contact and a confident manner to set the tone for the interview and help the
respondent get comfortable.
- Ask questions exactly as written
Sometimes an interviewer will think that they could improve on the tone of a question by altering a
few words to make it simpler or more "friendly." DON'T. You should ask the questions as they are on
the instrument. If you had a problem with a question, the time to raise it was during the training and
rehearsals, not during the actual interview. It is important that the interview be as standardized as
possible across respondents (this is true except in certain types of exploratory or interpretivist research
where the explicit goal is to avoid any standardizing). You may think the change you made was
inconsequential when, in fact, it may change the entire meaning of the question or response.
- Follow the order given
Once you know an interview well, you may see a respondent bring up a topic that you know will come
up later in the interview. You may be tempted to jump to that section of the interview while you're on
the topic. DON'T. You are more likely to lose your place. You may omit questions that build a
foundation for later questions.
- Ask every question
Sometimes you'll be tempted to omit a question because you thought you already heard what the
respondent will say. Don't assume that. For example, let's say you were conducting an interview with
college-age women about the topic of date rape. In an earlier question, the respondent mentioned that
she knew of a woman on her dormitory floor who had been raped on a date within the past year. A few
questions later, you are supposed to ask "Do you know of anyone personally who was raped on a
date?" You figure you already know that the answer is yes, so you decide to skip the question. Instead,
you might say something like "I know you may have already mentioned this, but do you know of
anyone personally who was raped on a date?" At this point, the respondent may say something like
"Well, in addition to the woman who lived down the hall in my dorm, I know of a friend from high
school who experienced date rape." If you hadn't asked the question, you would never have discovered
this detail.
- Don't finish sentences
I don't know about you, but I'm one of those people who just hates to be left hanging. I like to keep a
conversation moving. Once I know where a sentence seems to be heading, I'm aching to get to the next
sentence. I finish people's sentences all the time. If you're like me, you should practice the art of
patience (and silence) before doing any interviewing. As you'll see below, silence is one of the most
effective devices for encouraging a respondent to talk. If you finish their sentence for them, you imply
that what they had to say is transparent or obvious, or that you don't want to give them the time to
express themselves in their own language.
2.3. Obtaining Adequate Responses - The Probe
OK, you've asked a question. The respondent gives a brief, cursory answer. How do you elicit a more
thoughtful, thorough response? You probe.
- Silent probe
The most effective way to encourage someone to elaborate is to do nothing at all - just pause and wait.
This is referred to as the "silent" probe. It works (at least in certain cultures) because the respondent is
uncomfortable with pauses or silence. It suggests to the respondent that you are waiting, listening for
what they will say next.
- Overt encouragement

At times, you can encourage the respondent directly. Try to do so in a way that does not imply
approval or disapproval of what they said (that could bias their subsequent results). Overt
encouragement could be as simple as saying "Uh-huh" or "OK" after the respondent completes a thought.
- Elaboration
You can encourage more information by asking for elaboration. For instance, it is appropriate to ask
questions like "Would you like to elaborate on that?" or "Is there anything else you would like to add?"
- Ask for clarification
Sometimes, you can elicit greater detail by asking the respondent to clarify something that was said
earlier. You might say, "A minute ago you were talking about the experience you had in high school.
Could you tell me more about that?"
- Repetition
This is the old psychotherapist trick. You say something without really saying anything new. For
instance, the respondent just described a traumatic experience they had in childhood. You might say
"What I'm hearing you say is that you found that experience very traumatic." Then, you should pause.
The respondent is likely to say something like "Well, yes, and it affected the rest of my family as well.
In fact, my younger sister..."
2.4. Recording the Response
Although we have the capability to record a respondent in audio and/or video, most interview
methodologists don't think it's a good idea. Respondents are often uncomfortable when they know
their remarks will be recorded word-for-word. They may strain to only say things in a socially
acceptable way. Although you would get a more detailed and accurate record, it is likely to be
distorted by the very process of obtaining it. This may be more of a problem in some situations than in
others. It is increasingly common to be told that your conversation may be recorded during a phone
interview. And most focus group methodologies use unobtrusive recording equipment to capture
what's being said. But, in general, personal interviews are still best when recorded by the interviewer
using pen and paper. Here, I assume the paper-and-pencil approach.
- Record responses immediately
The interviewer should record responses as they are being stated. This conveys the idea that you are
interested enough in what the respondent is saying to write it down. You don't have to write down
every single word -- you're not taking stenography. But you may want to record certain key phrases or
quotes verbatim. You need to develop a system for distinguishing what the respondent says verbatim
from what you are characterizing (how about quotations, for instance!).
- Include all probes
You need to indicate every single probe that you use. Develop a shorthand for different standard
probes. Use a clear form for writing them in (e.g., place probes in the left margin).
- Use abbreviations where possible
Abbreviations will help you to capture more of the discussion. Develop a standardized system (e.g.,
R=respondent; DK=don't know). If you create an abbreviation on the fly, have a way of indicating its
origin. For instance, if you decide to abbreviate Spouse with an 'S', you might make a notation in the
right margin saying "S=Spouse."
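A standardized shorthand like the one suggested above (R, DK, S) can even be expanded mechanically when handwritten notes are typed up afterwards. The abbreviation table below mirrors the examples in the text; the expansion function itself is an illustrative sketch, not part of any established transcription tool:

```python
# Illustrative sketch: expand standardized note-taking abbreviations
# when transcribing interview notes.
import re

# Mirrors the shorthand examples given in the text.
ABBREVIATIONS = {"R": "respondent", "DK": "don't know", "S": "spouse"}

def expand(note):
    """Replace whole-word abbreviations with their full forms."""
    pattern = re.compile(r"\b(" + "|".join(ABBREVIATIONS) + r")\b")
    return pattern.sub(lambda m: ABBREVIATIONS[m.group(1)], note)

print(expand("R says DK; S disagreed"))
# respondent says don't know; spouse disagreed
```

Whole-word matching matters here: without the `\b` boundaries, the single-letter codes would corrupt ordinary words that merely contain those letters.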
2.5. Concluding the Interview
When you've gone through the entire interview, you need to bring the interview to closure. Some
important things to remember:
- Thank the respondent
Don't forget to do this. Even if the respondent was troublesome or uninformative, it is important for
you to be polite and thank them for their time.
- Tell them when you expect to send results

I hate it when people conduct interviews and then don't send results and summaries to the people
they got the information from. You owe it to your respondents to show them what you learned. Now,
they may not want your entire 300-page dissertation. It's common practice to prepare a short, readable,
jargon-free summary of interviews that you can send to the respondents.
- Don't be brusque or hasty
Allow for a few minutes of winding down conversation. The respondent may want to know a little bit
about you or how much you like doing this kind of work. They may be interested in how the results
will be used. Use these kinds of interests as a way to wrap up the conversation. As you're putting away
your materials and packing up to go, engage the respondent. You don't want the respondent to feel as
though you completed the interview and then rushed out on them -- they may wonder what they said
that was wrong. On the other hand, you have to be careful here. Some respondents may want to keep
on talking long after the interview is over. You have to find a way to politely cut off the conversation
and make your exit.
- Immediately after leaving -- write down any notes about how the interview went
Sometimes you will have observations about the interview that you didn't want to write down while
you were with the respondent. You may have noticed them get upset at a question, or you may have
detected hostility in a response. Immediately after the interview you should go over your notes and
make any other comments and observations -- but be sure to distinguish these from the notes made
during the interview (you might use a different color pen, for instance).
Plus & Minus of Survey Methods
It's hard to compare the advantages and disadvantages of the major different survey types. Even
though each type has some general advantages and disadvantages, there are exceptions to almost every
rule. Here's my general assessment. Perhaps you would differ in your ratings here or there, but I think
you'll generally agree.


Issue                                              Questionnaire            Interview
                                                   Group  Mail  Drop-Off   Personal  Phone
Are Visual Presentations Possible?                 Yes    Yes   Yes        Yes       No
Are Long Response Categories Possible?             Yes    Yes   Yes        ???       No
Is Privacy A Feature?                              No     Yes   No         Yes       ???
Is the Method Flexible?                            No     No    No         Yes       Yes
Are Open-ended Questions Feasible?                 No     No    No         Yes       Yes
Is Reading & Writing Needed?                       ???    Yes   Yes        No        No
Can You Judge Quality of Response?                 Yes    No    ???        Yes       ???
Are High Response Rates Likely?                    Yes    No    Yes        Yes       No
Can You Explain Study in Person?                   Yes    No    Yes        Yes       ???
Is It Low Cost?                                    Yes    Yes   No         No        No
Are Staff & Facilities Needs Low?                  Yes    Yes   No         No        No
Does It Give Access to Dispersed Samples?          No     Yes   No         No        No
Does Respondent Have Time to Formulate Answers?    No     Yes   Yes        No        No
Is There Personal Contact?                         Yes    No    Yes        Yes       No
Is A Long Survey Feasible?                         No     No    No         Yes       No
Is There Quick Turnaround?                         No     Yes   No         No        Yes
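The comparison table can also be treated as data when selecting a survey method: given the features a study requires, list the methods whose cells clearly satisfy all of them. The sketch below is illustrative only; just three rows of the table are encoded, "???" cells are counted as not clearly satisfying a requirement, and reading the five value columns as group, mail, and drop-off questionnaires plus personal and phone interviews is an assumption about the table's layout:

```python
# Illustrative sketch: pick survey methods that satisfy required features,
# using a few rows transcribed from the comparison table.

TABLE = {
    "low_cost":      {"group": "Yes", "mail": "Yes", "drop_off": "No",
                      "personal": "No", "phone": "No"},
    "open_ended":    {"group": "No", "mail": "No", "drop_off": "No",
                      "personal": "Yes", "phone": "Yes"},
    "high_response": {"group": "Yes", "mail": "No", "drop_off": "Yes",
                      "personal": "Yes", "phone": "No"},
}

def suitable(requirements):
    """Methods whose cell is a clear 'Yes' for every required feature."""
    methods = ["group", "mail", "drop_off", "personal", "phone"]
    return [m for m in methods
            if all(TABLE[req][m] == "Yes" for req in requirements)]

print(suitable(["open_ended", "high_response"]))  # ['personal']
```

This mirrors the judgment the text asks for: no single rule decides the method, but listing which methods survive all of a study's hard requirements narrows the trade-off quickly.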


2.3.4. Observation Method
Observation is one of the methods of collecting data. It is the most commonly used method especially
in studies related to behavioral sciences. It is used to obtain both past and current data. Although it is
not possible to observe past behavior, we may observe the results of such behavior.
In a way we all observe things around us, but this is not scientific observation. Observation becomes
a scientific tool and a method of data collection for the researcher when it serves a
formulated research purpose, is systematically planned and recorded, and is subjected to checks and
controls on validity and reliability.
There are some advantages of observation method of data collection:
1. The direct observational technique enables the investigator to record the behavior as it occurs.
2. It can be used regardless of whether the respondent is willing to report or not.
3. It can be used even when it pertains to those who are unable to respond, such as infants.
There are some limitations as well to this method of data collection:
1. Only the current behavior of a person or group of persons can be observed. One can observe
neither past behavior nor a person's future behavior, because the act of observation takes place
in the present.
2. Observation doesn't help us in gauging a person's attitude, opinion, or knowledge on a
certain subject.
3. The observational method is very slow and, therefore, when a large number of persons are to
be contacted, it becomes unsuitable because of the long time required for this purpose.
4. It is an expensive method.
5. The information provided by this method is very limited.
6. Sometimes unforeseen factors may interfere with the observational task.
2.4. Selecting the Survey Method
Selecting the type of survey you are going to use is one of the most critical decisions in many social
research contexts. You'll see that there are very few simple rules that will make the decision for you --
you have to use your judgment to balance the advantages and disadvantages of different survey types.
Here are some points that will help you make a sound choice.

Population Issues
The first set of considerations has to do with the population and its accessibility.
- Can the population be enumerated?
For some populations, you have a complete listing of the units that will be sampled. For others, such a
list is difficult or impossible to compile. For instance, there are complete listings of registered voters
or persons with active driver's licenses. But no one keeps a complete list of homeless people. If you are
doing a study that requires input from homeless persons, you are very likely going to need to go and
find the respondents personally. In such contexts, you can pretty much rule out the idea of mail
surveys or telephone interviews.
- Is the population literate?
Questionnaires require that your respondents can read. While this might seem initially like a
reasonable assumption for many adult populations, we know from recent research that the incidence of
adult illiteracy is alarmingly high. And, even if your respondents can read to some degree, your
questionnaire may contain difficult or technical vocabulary. Clearly, there are some populations that
you would expect to be illiterate. Young children would not be good targets for questionnaires.
- Are there language issues?
We live in a multilingual world. Virtually every society has members who speak a language other than the
predominant one. Some countries (like Canada) are officially multilingual. And, our increasingly
global economy requires us to do research that spans countries and language groups. Can you produce
multiple versions of your questionnaire? For mail instruments, can you know in advance the language
your respondent speaks, or do you send multiple translations of your instrument? Can you be confident
that important connotations in your instrument are not culturally specific? Could some of the
important nuances get lost in the process of translating your questions?
- Will the population cooperate?
People who do research on immigration issues have a difficult methodological problem. They often
need to speak with undocumented immigrants or people who may be able to identify others who are.

Why would we expect those respondents to cooperate? Although the researcher may mean no harm,
the respondents are at considerable risk legally if information they divulge should get into the hands of
the authorities. The same can be said for any target group that is engaging in illegal or unpopular activities.
- What are the geographic restrictions?
Is your population of interest dispersed over too broad a geographic range for you to study feasibly
with a personal interview? It may be possible for you to send a mail instrument to a nationwide
sample. You may be able to conduct phone interviews with them. But it will almost certainly be less
feasible to do research that requires interviewers to visit directly with respondents if they are widely dispersed.
Sampling Issues
The sample is the actual group you will have to contact in some way. There are several important
sampling issues you need to consider when doing survey research.
- What data is available?
What information do you have about your sample? Do you know their current addresses? Their current
phone numbers? Are your contact lists up to date?
- Can respondents be found?
Can your respondents be located? Some people are very busy. Some travel a lot. Some work the night
shift. Even if you have an accurate phone or address, you may not be able to locate or make contact
with your sample.
- Who is the respondent?
Who is the respondent in your study? Let's say you draw a sample of households in a small city. A
household is not a respondent. Do you want to interview a specific individual? Do you want to talk
only to the "head of household" (and how is that person defined)? Are you willing to talk to any
member of the household? Do you state that you will speak to the first adult member of the household
who opens the door? What if that person is unwilling to be interviewed but someone else in the house

is willing? How do you deal with multi-family households? Similar problems arise when you sample
groups, agencies, or companies. Can you survey any member of the organization? Or, do you only
want to speak to the Director of Human Resources? What if the person you would like to interview is
unwilling or unable to participate? Do you use another member of the organization?
- Can all members of population be sampled?
If you have an incomplete list of the population (i.e., sampling frame) you may not be able to sample
every member of the population. Lists of various groups are extremely hard to keep up to date. People
move or change their names. Even though they are on your sampling frame listing, you may not be
able to get to them. And, it's possible they are not even on the list.
- Are response rates likely to be a problem?
Even if you are able to solve all of the other population and sampling problems, you still have to deal
with the issue of response rates. Some members of your sample will simply refuse to respond. Others
have the best of intentions, but can't seem to find the time to send in your questionnaire by the due
date. Still others misplace the instrument or forget about the appointment for an interview. Low
response rates are among the most difficult of problems in survey research. They can ruin an otherwise
well-designed survey effort.
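The arithmetic behind this point can be made concrete. The sketch below (illustrative only; the 25% figure is an assumption drawn from the 20-30% range commonly cited for mailed questionnaires) estimates how many questionnaires must be mailed to obtain a target number of responses:

```python
import math

def required_mailing(target_responses, expected_response_rate):
    """Number of questionnaires to mail to expect `target_responses` back."""
    return math.ceil(target_responses / expected_response_rate)

# With an assumed 25% response rate, 300 completed questionnaires
# require mailing out four times that many.
print(required_mailing(300, 0.25))  # -> 1200
```

This is one reason survey studies typically draw much larger samples than experimental studies.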
Question Issues
Sometimes the nature of what you want to ask respondents will determine the type of survey you use.
- What types of questions can be asked?
Are you going to be asking personal questions? Are you going to need to get lots of detail in the
responses? Can you anticipate the most frequent or important types of responses and develop
reasonable closed-ended questions?
- How complex will the questions be?
Sometimes you are dealing with a complex subject or topic. The questions you want to ask are going
to have multiple parts. You may need to branch to sub-questions.

- Will screening questions be needed?
A screening question may be needed to determine whether the respondent is qualified to answer your
question of interest. For instance, you wouldn't want to ask someone their opinions about a specific
computer program without first "screening" them to find out whether they have any experience using
the program. Sometimes you have to screen on several variables (e.g., age, gender, experience). The
more complicated the screening, the less likely it is that you can rely on paper-and-pencil instruments
without confusing the respondent.
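Screening logic of this kind can be sketched as a simple filter. The respondent fields (`age`, `used_program`) and the threshold below are hypothetical, chosen only to show screening on several variables at once:

```python
def screen(respondent):
    """Return True if the respondent qualifies to answer the main question."""
    return (
        respondent.get("age", 0) >= 18            # hypothetical age criterion
        and respondent.get("used_program", False)  # must have used the program
    )

sample = [
    {"age": 30, "used_program": True},   # qualifies
    {"age": 17, "used_program": True},   # screened out on age
    {"age": 40, "used_program": False},  # screened out on experience
]
qualified = [r for r in sample if screen(r)]
print(len(qualified))  # -> 1
```

The more conditions the filter accumulates, the harder it becomes for a respondent to follow the same logic unaided on a paper-and-pencil instrument.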
- Can question sequence be controlled?
Is your survey one where you can construct in advance a reasonable sequence of questions? Or, are
you doing an initial exploratory study where you may need to ask lots of follow-up questions that you
can't easily anticipate?
- Will lengthy questions be asked?
If your subject matter is complicated, you may need to give the respondent some detailed background
for a question. Can you reasonably expect your respondent to sit still long enough in a phone interview
to ask your question?
- Will long response scales be used?
If you are asking people about the different computer equipment they use, you may have to have a
lengthy response list (CD-ROM drive, floppy drive, mouse, touch pad, modem, network connection,
external speakers, etc.). Clearly, it may be difficult to ask about each of these in a short phone interview.
Content Issues
The content of your study can also pose challenges for the different survey types you might utilize.
- Can the respondents be expected to know about the issue?
If the respondent does not keep up with the news (e.g., by reading the newspaper, watching television
news, or talking with others), they may not even know about the news issue you want to ask them

about. Or, if you want to do a study of family finances and you are talking to the spouse who doesn't
pay the bills on a regular basis, they may not have the information to answer your questions.
- Will respondent need to consult records?
Even if the respondent understands what you're asking about, you may need to allow them to consult
their records in order to get an accurate answer. For instance, if you ask them how much money they
spent on food in the past month, they may need to look up their personal check and credit card records.
In this case, you don't want to be involved in an interview where they would have to go look things up
while they keep you waiting (they wouldn't be comfortable with that).
Bias Issues
People come to the research endeavor with their own sets of biases and prejudices. Sometimes, these
biases will be less of a problem with certain types of survey approaches.
- Can social desirability be avoided?
Respondents generally want to "look good" in the eyes of others. None of us likes to look like we don't
know an answer. We don't want to say anything that would be embarrassing. If you ask people about
information that may put them in this kind of position, they may not tell you the truth, or they may
"spin" the response so that it makes them look better. This may be more of a problem in an interview
situation where they are face-to-face or on the phone with a live interviewer.
- Can interviewer distortion and subversion be controlled?
Interviewers may distort an interview as well. They may not ask questions that make them
uncomfortable. They may not listen carefully to respondents on topics for which they have strong
opinions. They may make the judgment that they already know what the respondent would say to a
question based on their prior responses, even though that may not be true.
- Can false respondents be avoided?
With mail surveys it may be difficult to know who actually responded. Did the head of household
complete the survey or someone else? Did the CEO actually give the responses or instead pass the task
off to a subordinate? Is the person you're speaking with on the phone actually who they say they are?

At least with personal interviews, you have a reasonable chance of knowing who you are speaking
with. In mail surveys or phone interviews, this may not be the case.
Administrative Issues
Last, but certainly not least, you have to consider the feasibility of the survey method for your study.
- Costs
Cost is often the major determining factor in selecting survey type. You might prefer to do personal
interviews, but can't justify the high cost of training and paying for the interviewers. You may prefer
to send out an extensive mailing but can't afford the postage to do so.
- Facilities
Do you have the facilities (or access to them) to process and manage your study? In phone interviews,
do you have well-equipped phone surveying facilities? For focus groups, do you have a comfortable
and accessible room to host the group? Do you have the equipment needed to record and transcribe the sessions?
- Time
Some types of surveys take longer than others. Do you need responses immediately (as in an overnight
public opinion poll)? Have you budgeted enough time for your study to send out mail surveys and
follow-up reminders, and to get the responses back by mail? Have you allowed for enough time to get
enough personal interviews to justify that approach?
- Personnel
Different types of surveys make different demands of personnel. Interviews require interviewers who
are motivated and well-trained. Group administered surveys require people who are trained in group
facilitation. Some studies may be in a technical area that requires some degree of expertise in the subject matter.
Clearly, there are lots of issues to consider when you are selecting which type of survey you wish to
use in your study. And there is no clear and easy way to make this decision in many contexts. There

may not be one approach which is clearly the best. You may have to make tradeoffs of advantages and
disadvantages. There is judgment involved. Two expert researchers may, for the very same problem or
issue, select entirely different survey methods. But, if you select a method that isn't appropriate or
doesn't fit the context, you can doom a study before you even begin designing the instruments or
questions themselves.
2.5. Differences between Survey and Experiment
The following points are noteworthy so far as difference between survey and experiment is concerned:
I. Surveys are conducted in case of descriptive research studies whereas experiments are a
part of experimental research studies.
II. Survey type research studies have usually larger samples because the percentage of
responses generally happens to be low, as low as 20 to 30%, especially in mailed
questionnaire studies. Thus, the survey method gathers data from a relatively large
number of cases at a particular time; it is essentially cross-sectional. As against this,
experimental studies generally need small samples.
III. Surveys are concerned with describing, recording, analyzing and interpreting conditions
that either exist or existed. The researcher does not manipulate the variable or arrange for
events to happen. Surveys are only concerned with conditions or relationships that exist,
opinions that are held, processes that are going on, effects that are evident or trends that
are developing. They are primarily concerned with the present but at times do consider
past events and influences as they relate to current conditions. Thus in surveys, variables
that exist or have already occurred are selected and observed. Experimental research
provides a systematic and logical method of answering the question, "What will happen if
this is done when certain variables are carefully controlled or manipulated?" In fact,
deliberate manipulation is a part of the experimental method. In an experiment, the
researcher measures the effects of an experiment which he conducts intentionally.
IV. Surveys are usually appropriate in case of social and behavioral sciences (because many
types of behavior that interest the researcher cannot be arranged in a realistic setting)
whereas experiments are mostly an essential feature of physical and natural sciences.
V. Surveys are an example of field research whereas experiments generally constitute an
example of laboratory research.
VI. Surveys are concerned with hypothesis formulation and testing the analysis of the
relationship between non-manipulated variables. Experimentation provides a method of

hypothesis testing. After experimenters define a problem, they propose a hypothesis.
They then test the hypothesis and confirm or disconfirm it in the light of the controlled
variable relationship that they observed. The confirmation or rejection is always stated
in terms of probability rather than certainty. Experimentation, thus, is the most
sophisticated, exacting and powerful method of discovering and developing an organized
body of knowledge. The ultimate purpose of experimentation is to generalize the variable
relationships so that they may be applied outside the laboratory to a wider population of interest.
VII. Surveys may either be census or sample surveys. They may also be classified as social
surveys, economic surveys or public opinion surveys. Whatever their type, the method
of data collection happens to be either observation or interview or
questionnaire/opinionnaire or some projective technique(s). Case study method can as
well be used. But in case of experiments, data are collected from several readings of the experiment.
VIII. In case of surveys, research design must be rigid, must make enough provision for
protection against bias and must maximize reliability as the aim happens to be to obtain
complete and accurate information. Research design in case of experimental studies,
apart from reducing bias and ensuring reliability, must permit drawing inferences about causality.
IX. Possible relationships between the data and the unknowns in the universe can be studied
through surveys whereas experiments are meant to determine such relationships.
X. Causal analysis is considered relatively more important in experiments whereas in most
social and business surveys our interest lies in understanding and controlling
relationships between variables and as such correlation analysis is relatively more
important in surveys.
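The correlation analysis emphasized in point X can be illustrated with a minimal Pearson's r computation. The income and spending figures below are invented purely for illustration:

```python
def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient between x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical survey data: two non-manipulated variables.
income = [20, 25, 30, 35, 40]
spending = [15, 18, 24, 26, 32]
print(round(pearson_r(income, spending), 3))  # -> 0.99
```

Note that such a coefficient only measures how strongly the variables co-vary; as the text says, it cannot by itself establish which variable is cause and which is effect.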


Chapter Three
The case study method
A case study is a deep and intensive study of a particular social unit, confined to a very small number of
cases. Thus the field of study in the case study method is limited, but it aims at studying all aspects of the
social unit. It also seeks to determine social process; it reveals the complexity of factors and indicates
their sequences and their relationships. It is also a diagnostic study oriented towards finding out what
is happening and why it is happening and what can be done about it.
Case study is a method of exploring and analyzing the life of a social unit, be that a person, a family,
an institution, a cultural group or even an entire community.
Case study is a way of organizing social data so as to preserve the unitary character of the social object
being studied. It is an approach which views any social unit as a whole.
Case study is a complete analysis and report of the status of an individual subject with respect, as a
rule, to specific phases of his total personality.
Case studies are usually characterized as thorough examinations of specific social settings or particular
aspects of social settings, including varying detail and psychological descriptions of persons in those settings.
Case study is one of the important types of non-experimental or descriptive research.
If we trace the history of the case study method, it becomes obvious that Frederick Le Play (1806-1882)
had, for the first time, introduced this method into social science research in his studies of family
budgets. Herbert Spencer, an English sociologist (1820-1903) was the first to use case materials in his
ethnographic studies.

Sources of Data
Case studies are not limited to any single source of data collection. A number of methods or techniques
of data gathering may be employed by the researcher such as:
Observations of behavior, characters, and social qualities of the unit by the researcher.
Use of questionnaires, opinionnaires, inventories, checklists and other psychological tests.
Analysis of recorded data from newspapers, schools, clinics, courts or other similar sources.
Interviewing the subjects, friends, relatives and others.
However, the main sources of data include:
- Personal Documents

- Life History
Personal Documents
Most people keep personal records, documents and letters, and write their autobiographies or diaries.
These documents play an important role in the case study as they contain description of the important
events of the life of the writer as well as his relations towards them.
These documents may also contain the description of events in which the narrator has played his part
only as a witness.
Personal documents represent a continuity of experience which helps to illuminate the writer's
personality, social relations and philosophy of life, whether expressed in objective or subjective terms.
Personal documents are very helpful in studying the personality of the writer and his relations to
different circumstances of life. As the writer is an integral part of a group, they may represent not
only the reactions of the person but those of any typical member of the group.
Life history
Life history is the study of various events of a respondent's life together with an attempt to find their
social significance.
Life history data are generally gathered through prolonged interviews with the respondent, use of any
written material about his life, conferences at specified intervals, experimental studies, observations,
post-experimental interviews, and various tests, with analysis of the facts so collected in order to draw vivid
generalizations from them.

Characteristics of Case study Methods
1. Case study is an approach which views a social unit as a whole.
2. The social unit need not be an individual only but it may be a family, a social group, a social
institution, or a community.
3. In case study the unitary character of the social unit is maintained. It means that the social unit,
whatever it is, is studied as a whole.
4. In case study the researcher studies the aspects of what and why of the social unit. In other
words, here the researcher not only tries to explain the complex behavioral pattern of the social
unit but also tries to locate those factors which have given rise to such complex behavioral patterns.
5. Since the case study is a descriptive research, no variables are manipulated.

6. In case study, the researcher gathers data usually through methods of observation, interview,
questionnaire, opinionnaires, and other psychological tests. Analysis of recorded data from
newspapers, courts, government agencies and other similar sources is not uncommon.
Types of Case Study
1. Based on the number of individuals, the case study may be of two types:
Individual case study.
Community case study.
In individual case study the social unit consists of one individual or person. Since there is only
one person, it emphasizes analysis in depth. Such an individual case study may be fruitful in
developing some hypothesis to be studied but it is not useful in making broad generalizations.
Such individual case study is a time-honored procedure in the field of medicine and medical research.
The community case study is one in which the social unit is not a person rather a family or a
social group. Such a case study is a thorough observation and analysis of a group of people who
are living together in a particular geographical territory. The community case study tries to deal
with different elements of the community life such as location, prevailing economic activity,
climate and natural resources, historical development, social structure, life values, health,
education, religious expression, recreation, impact of the outside world, etc.
2. Based on their purpose, a case study may be divided into two categories:
Deviant case analysis.
Isolated clinical case analysis.
In deviant case study, the researcher starts with a difference already found between two persons
or groups of persons and his task is to read backward to deduce the condition that might have
produced the difference.
In isolated clinical case analysis the emphasis is upon the individual units with respect to some
analytical problem. Such study has been popular in psychoanalysis.
Advantages of case study
The main advantages of case study method are
It produces new ideas and fresh suggestions
It helps in formulating a sound hypothesis
It helps in exploring new areas of research.

1. Since the case study method makes an in-depth study of a particular unit of investigation and is
always approached with an open mind, it bestows upon the researcher a huge wealth of new ideas and
new suggestions for further exploration of the research field. Investigators of an institution may arrive
at fresh knowledge about problems that might not have occurred to the researcher before he
undertook the investigation. The researcher may also get new suggestions from the field of operation
by intensively carrying out the examination of the case study.
2. Case study method is very useful in helping the researcher to develop and formulate scientifically
sound hypotheses for more research on a broader level. A researcher may not start with a given hypothesis
but may desirably undertake a case study for formulating such a hypothesis for further research. It has
also an advantage in making a multi-dimensional exploration of the same unit and thus enriches the
knowledge pertaining to a particular case for further use in policy formulation.
3. When a case study is undertaken, some of the areas of research may not have occurred to the
researcher's mind, and the very case study may open up new avenues of research where fruitful
investigation can be undertaken either by the same researcher or other researchers.
Limitations of Case Study Method
1. Case study develops a false sense of confidence which is highly detrimental to any scientific
outlook. In the case study method each unit is studied in its complete dimensions and the
researcher feels as if he knows everything about that unit, but in reality major parts of the life
of the unit are hidden from us.
2. There is a tendency for a researcher to draw generalizations after studying a few cases which
may not be relevant to all situations. Thus what the researcher thinks to be a common
trait of human nature may be a personal peculiarity of the subject and therefore applicable only to a
particular person under particular circumstances.
3. Case study does not provide universal, impersonal and common aspects of a phenomenon.
4. Case study method is quite unsystematic in the absence of any control upon the informant or
the researcher. The data collected in this way are generally incapable of verification. Thus the
inferences drawn may not be very accurate.
5. Case study situations are seldom comparable. Therefore, there is a tendency under this method
to overemphasize unique or unusual events which are not necessarily repeated.
6. It is difficult to apply the usual scientific method without destroying the rationale of the life
documents method and the unique value of the personal document will be lost if it is
formalized and abstracted.

7. There is enough scope of errors due to inaccurate observation, faulty inferences, wrong
selection of a case and misreporting. Thus the inferences are at times far from the truth.
8. Case study method is costly, time consuming and wasteful in certain cases where the objectives
are limited.
9. There is a temptation to ignore the basic principles of research design because quite often a
personal relationship is developed between the researcher and the unit studied and, therefore,
objectivity is lost, which is very harmful.
10. It is also not useful if an intensive investigation has already been made on the subject.
11. The case study method is not in itself a scientific method at all, but merely a first step in
scientific procedure, because it is qualitative in character and is not very useful for quantitative analysis.
Basic Assumptions of the Case Study Method
The case study method is based upon the following assumptions:
1. The case study method is not in itself a scientific method at all; it is a first step in scientific procedure.
2. It is assumed that in the face of apparent diversity among different units, there is an underlying
unity. A particular unit has its uniqueness, but it is not different from other units in all aspects.
3. A unit selected is representative of a group. In many respects it is similar to a measure of
central tendency such as the average. It tries to locate the variations in the reactions and activities of the unit.
4. The study of a particular unit is helpful in the prediction and discussion of other units of the same category.
5. A unit is an indivisible whole and cannot be studied piecemeal and in fragments. We must study
its life history and its background to explain its behavior at a particular time.
6. A social phenomenon is of a very complex nature, and the deep study of a number of units is a
difficult task. Therefore, the researcher has to take the shelter of the case study method as a
single unit can be studied in wholeness and depth.
7. Social phenomena are of a dynamic nature and are influenced by time. In search of the root cause
behind an event a researcher has to study the problem in its historical perspective. It is
assumed in a case study that a study of a single unit would be able to explain the influence of
time over the variables.

Steps involved in Case study

1. Selection of Cases and Identification of Situations
Before taking up the case study a researcher has to take some decisions such as which unit has to
be taken up for study? What aspects or what period of the life of the unit are to be studied? What are the
situations in which the unit exists? All these questions should be answered. Keeping in view these
questions, a researcher has to choose representative and typical data. The selection of such a
representative unit highly depends on the ability and skill of the researcher.
2. Collection and Recording Data
The researcher should use different data collection tools and techniques and should collect the
different aspects of data: personal documents, life histories, observations, interviews,
questionnaires, schedules, etc. The data should be recorded carefully, uniformly topic-wise,
accurately and objectively, with clarity. The data should be complete and easy to reference. The
researcher should be suspicious of striking (abnormal) cases, which are far less significant than more
commonplace ones.
3. Interpretation of Data
Analysis and interpretation of data are considered to be a highly skilled and technical job. Facts and
figures never speak for themselves. The facts collected must be classified, explained and
interpreted. The interpretation must be in a logical and convenient form. Only by means of
interpretation are the underlying features of the data revealed and valid generalizations arrived at.
4. Report Writing
Report writing is the end product of a research. Reporting of the research findings is an
important part of any type of research. Reporting means the written presentation of the
evidences and findings of a research. The report must be written in such a manner that the reader can
easily understand and assess it and verify the validity of the conclusions.


Chapter 4
Experimental Method
- The essence of an experiment may be described as observing the effect on a dependent variable of
the manipulation of an independent variable.
- An experiment is the proof of a hypothesis which seeks to link two factors in a causal relationship
through the study of contrasting situations which have been controlled on all factors except the one of
interest, the latter being either the hypothetical cause or the hypothetical effect.
- An experiment is said to be successful if certain effects are shown to be the consequence of certain
identified causes and vice versa.
- Objective-to generalize the cause-effect relationship and to apply it to similar situations elsewhere.
- In experimental method the key word is control. Things and phenomena are deliberately
manipulated to identify and isolate certain features which provide the basis of interest.
Applicability in Social Sciences
- It cannot be used in its pure form in the social sciences, because man cannot control space that is too
vast, the past that is already over, or human behavior that changes under the pressure of circumstances.
Characteristics of Experimental Method
- In social sciences it is a method of testing hypotheses.
- Using this method researchers try to isolate complex social phenomena.
- In this method we see the effect of a variable keeping other things equal.
- It helps in scientific study of relationship between cause and effect.
- This approach to social research resembles scientific method so far as model building and
formulation of relationship among variables is concerned.
- This method has increased accuracy in prediction of social behavior.
Basic Principles of Experimental Design
1. Replication - by replication we mean the repetition of an experiment, which helps in estimating the
experimental error.
- A more precise estimate of the effect of any factor is obtained from the standard error of the mean:

  σ_x̄ = σ / √n

  where σ_x̄ = standard deviation (standard error) of the means, σ = true experimental error, and
  n = number of replications.
- The greater the number of replications, the smaller is σ_x̄.
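The replication principle can be illustrated numerically: the standard error of the mean, σ/√n, halves each time the number of replications quadruples. The value of σ below is an assumed figure used only for the demonstration:

```python
import math

sigma = 4.0  # assumed true experimental error (standard deviation)
for n in (4, 16, 64):
    se = sigma / math.sqrt(n)
    # the standard error halves each time n quadruples
    print(n, se)
```

This is why quadrupling the number of replications only doubles the precision: the gain is proportional to √n, not n.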
2. Randomization-
- Taking samples in a random or probabilistic manner without any prejudices helps to
ensure that the errors of observation are independently distributed which is the most
common principle in experimental design.
- It may be noted that randomization does not guarantee independence, but it allows us to
proceed as if independence were a fact.
- There may be situations where complete randomization may not be feasible or
economically possible. However, a judicious mixture of randomization and systematic
non-random designs may be more realistic.
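A minimal sketch of randomization, assuming twelve hypothetical experimental units split evenly between a treatment and a control group:

```python
import random

random.seed(42)  # fixed seed only so the sketch is reproducible
units = list(range(12))      # hypothetical experimental units
random.shuffle(units)        # assignment is purely by chance
treatment, control = units[:6], units[6:]

# Every unit is assigned to exactly one group.
print(sorted(treatment + control))
```

Because assignment depends on chance alone, any difference between the groups before treatment is attributable to sampling variation, which is what allows errors of observation to be treated as independently distributed.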
3. Local Controls
- The balancing, blocking and grouping of experimental units employed in an
experimental design.
- Replication and randomization make valid tests of significance possible, while local
control makes the test more sensitive.
- Grouping - placing a set of homogeneous experimental units in different groups.
- Blocking - assigning the experimental units to blocks in such a manner that the units
within a block are relatively homogeneous.
- Balancing - adjusting the grouping, blocking and assignment of the treatments in
such a manner that a balanced configuration results.
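Local control can be sketched as blocking followed by randomization within each block. The block names (`farm_A`, `farm_B`) and unit labels below are hypothetical:

```python
import random

random.seed(0)  # fixed seed only so the sketch is reproducible

# Units are first blocked on a known source of variation (here, farm).
blocks = {"farm_A": [1, 2, 3, 4], "farm_B": [5, 6, 7, 8]}

assignment = {}
for block, units in blocks.items():
    shuffled = units[:]
    random.shuffle(shuffled)       # randomize *within* the block
    half = len(shuffled) // 2
    for u in shuffled[:half]:
        assignment[u] = "treatment"
    for u in shuffled[half:]:
        assignment[u] = "control"

# Each block contributes equally to both groups: a balanced configuration.
print(sum(1 for u in [1, 2, 3, 4] if assignment[u] == "treatment"))  # -> 2
```

Because each block supplies both groups equally, block-to-block differences cancel out of the treatment comparison, which is what makes the significance test more sensitive.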
Problems of Social Experiments
1. Difficulty of cooperation
- Human subjects are not easily manipulated.
2. Difficulty of setting
3. Difficulty of Control
4. Errors of Measurement
- It is not possible in practical life to measure values of the variables used for the
estimation of regression models without error.

Merits of Experimental Method
1. It can be used to solve scientific problems, ascertain the appropriateness of techniques or determine the
accuracy of data.
2. It helps in the establishment of cause and effect more clearly than any other method. In the
case of other methods we can at least locate the degree of association or covariance but
cannot say with certainty which one is the cause and which one the effect.
3. It is more precise and accurate since the variable under study is manipulated leaving
others untouched.
4. It is the best method for testing a hypothesis. Casual observation may help in the
formulation of a hypothesis, but it is only through the experimental method that a hypothesis
can be tested and verified.
5. Conclusions drawn from an experimental method are subject to verification at any time.
6. Laws formed by this method are universal in their application.
7. Prediction can be done with sufficient accuracy.
Types of Experimental Design
1. After-only experimental design
- Under this method the dependent variables are measured after exposure of groups to
experimental variables.
- The steps involved in such experiments are:
i. Select two equivalent groups. One of these groups is considered the experimental
group and the other the control group. Both groups are selected at random, with or without
supplementary matching.
ii. These two groups are not measured in respect of the characteristic which is likely to
change due to the exposure of experimental variable. The two groups are assumed to be equal
in respect of this characteristic.
iii. The experimental group is exposed to the experimental variable for a specified period
of time.
iv. Uncontrolled events are certain events or factors whose effects on the dependent
variable are beyond the control of the experimenter.
v. The experimental and control groups are observed or measured with respect to the
dependent variable(Y) after the exposure of the experimental group to the assumed causal
variable (X).

vi. The inference whether the hypothesis "X produces Y" is tenable is arrived at simply by
comparing the occurrence of Y (or its extent or nature) in the experimental group after
exposure to variable X with the occurrence of Y in the control group which has not been
exposed to X. Measurements Y1 and Y2 are obtained in the experimental and control groups
respectively and are compared by means of subtraction to ascertain whether X and Y vary
concomitantly. Thus the difference (Y1 - Y2) is obtained to ascertain the effect of the
experimental variable.
This inference rests on the following assumptions:
i. Two groups are selected in such a way as to assure that they do not differ from each
other except by chance.
ii. The final problem of eliminating the effect of other factors is dealt with by the
assumption that both groups are exposed to the same external events and undergo similar
natural developmental process between the time of selection and the time of measurement.
iii. The difference between Y1 and Y2 can be taken as an indication of the effect of
the experimental variable.
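The comparison described in step vi can be sketched in a few lines of Python; the outcome values below are purely hypothetical illustration data, not drawn from the text:

```python
# After-only design: compare the dependent variable Y in the
# experimental group (exposed to X) with the control group (not exposed).
# All values are hypothetical illustration data.
experimental_y = [12, 15, 14, 16, 13]  # Y measured after exposure to X
control_y = [10, 11, 9, 12, 10]        # Y measured without exposure to X

def mean(values):
    return sum(values) / len(values)

# The difference in means is taken as an indication of the effect of X.
effect = mean(experimental_y) - mean(control_y)
print(round(effect, 2))  # 3.6
```

In a real study this raw difference would still need a test of significance before attributing it to X rather than to chance.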
- The major weakness of the after-only experimental design is that "before" measurements
are not taken.
- In many cases the experimental and control groups are not similar, so that the differences the
researcher attributes to the experimental variable may really be due to initial differences
between the two groups. However, in certain experimental situations "before" measurements
are not feasible due to practical difficulties.

2. Before-after Experimental design
- In this method the dependent variable is measured both before and after exposure of the
group(s) to the experimental variable.
- This design is mainly of four types. Namely
i. Before-after single group
ii. Before- after with one control group
iii. Before- after with two control groups
iv. Before- after with three or more control groups.


Chapter 5: Hypotheses
The meaning of Hypotheses
A hypothesis is a hunch, assumption, suspicion, assertion or idea about a phenomenon, relationship
or situation, the reality or truth of which you do not know.
A hypothesis is a conjectural statement of the relationship between two or more variables.
A hypothesis is a proposition, condition, or principle which is assumed, perhaps without belief, in order
to draw out its logical consequences and by this method to test its accord with facts which are known or
may be determined.
Hypothesis is a proposition that is stated in a testable form and that predicts a particular relationship
between two or more variables.
Hypothesis is a tentative statement about something, the validity of which is usually unknown.
Importance of Hypotheses
1. It provides direction to research
It defines what is relevant and what is irrelevant. Thus it prevents the review of irrelevant literature and
the collection of useless or excess data. It not only prevents wastage in the collection of data, but also
ensures the collection of the data necessary to answer the question posed in the statement of the problem.
2. It sensitizes the investigator to certain aspects of the situations which are relevant from
the standpoint of the problem in hand
It spells the difference between precision and haphazardness, between fruitful and fruitless research.
3. It is a guide to the thinking process and the process of discovery.
It is the investigator's eye, a sort of guiding light in the world of darkness.
4. It focuses research.
Without it research would be like random and aimless wandering.
5. It prevents blind research.
It prevents the indiscriminate gathering of data which may later turn out to be irrelevant.
6. It sensitizes the investigator to facts and conditions that might otherwise be overlooked.
7. It places clear and specific goals before the researcher.
These clear and specific goals provide the investigator with a basis for selecting samples and research
procedures to meet these goals.
8. It serves the function of linking together related facts and information and organizing
them into one comprehensive whole.

9. It enables the investigator to understand with greater clarity his problem and its
ramifications, as well as the data which bear on it.
It further enables a researcher to rule out procedures and methods which are incapable of providing
the necessary data.
10. It serves as a framework for drawing conclusions.
It makes possible the interpretation of data in the light of the tentative proposition or provisional guess.
It provides the outline for setting conclusions in a meaningful way.
11. A hypothesis may enable you to add to the formulation of theory and help you to bridge
the gaps in the body of knowledge.

Characteristics of Good Hypotheses
There are a number of considerations to keep in mind when constructing hypotheses, as they are
important for valid verification.
1. A hypothesis should be simple, specific, and conceptually clear.
There is no place for ambiguity in construction of hypotheses, as ambiguity will make the verification
of your hypotheses almost impossible. A good hypothesis is the one which is based on the
operationally defined concepts. It should be uni-dimensional; that is, it should test only one
relationship at a time.
2. A hypothesis should be capable of verification.
Methods and techniques must be available for data collection and analysis. It should be formulated in a
way that can be tested directly and found to be probably true or probably false.
3. A hypothesis should be related to the body of knowledge.
It is important that your hypothesis emerges from the existing body of knowledge, and that it adds to
it, as this is an important function of research. This can only be achieved if the hypothesis has its roots
in the existing body of knowledge.
4. A hypothesis should be operationalisable.
That is, it can be expressed in terms that can be measured. If it cannot be measured, it cannot be
tested and hence no conclusions can be drawn.

Types of Hypotheses

As explained, any assumption that you seek to validate through an inquiry is called a hypothesis. Hence,
theoretically there should be only one type of hypothesis, that is, the research hypothesis: the basis for
your investigation.
However, because of the convention in scientific inquiries and because of the wording used in the
construction of a hypothesis, hypotheses can be classified into several types.
Broadly, there are two categories of hypothesis:
1. Research Hypothesis
2. Alternate Hypothesis
The formulation of alternate hypothesis is a convention in scientific inquiries. Its main function is to
explicitly specify the relationship that will be considered as true in case the research hypothesis proves
to be wrong.
In a way an alternate hypothesis is the opposite of the research hypothesis. Again, by convention, a null
hypothesis, or hypothesis of no difference, is formulated as the alternate hypothesis.
Based on the following example we can differentiate the types of hypotheses
Suppose you want to study the smoking pattern in a community in relation to gender differentials. The
following hypotheses could be constructed.
1. There is no significant difference in the proportion of male and female smokers in the study population.
2. A greater proportion of females than males are smokers in the study population.
3. Sixty percent of females and thirty percent of males in the study population are smokers.
4. There are twice as many female smokers as male smokers in the study population.
The first hypothesis formulated indicates that there is no difference in the proportion of female and male
smokers. When you construct such a hypothesis, it is called a null hypothesis and is usually written as H0.
The second hypothesis implies that there is a difference in the proportion of male and female smokers
among the population, though the extent of the difference is not specified. A hypothesis in which a
researcher stipulates that there will be a difference but does not specify its magnitude is called a
hypothesis of difference.
The researcher might have enough knowledge about the topic to speculate the almost exact prevalence
of the situation, or the outcome of a treatment program, in quantitative units. Such a type of hypothesis
is called a hypothesis of point prevalence (e.g., example number 3).
A hypothesis that speculates the extent of a relationship, in terms of the effect of different treatment
groups on the dependent variable, is known as a hypothesis of association (e.g., example number 4).

There may be some confusion between null and research hypotheses, as indicated in the figure below,
since the null hypothesis is classified under research hypotheses as well. Any hypothesis, including a
null hypothesis, can become the basis of an inquiry. When a null hypothesis becomes the basis of an
investigation, it becomes a research hypothesis.

Fig. Types of hypotheses
Procedures for Hypotheses Testing
To test hypothesis means to tell (on the basis of the data the researcher has collected) whether or not
the hypothesis seems to be valid.
The procedure of hypothesis testing refers to all those steps that we undertake for making a choice
between the two actions, i.e., rejection and acceptance of a null hypothesis.
The various steps involved in hypothesis testing are stated below:
1. Making a formal statement
2. Selecting a significance level
3. Deciding the distribution to use
4. Selecting a random sample and computing an appropriate value
5. Calculation of the probability
6. Comparing the probability
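As an illustrative sketch, the six steps can be followed for a one-sample z-test of a proportion; the specific numbers (H0: p = 0.5, a hypothetical sample of 100 with 60 successes, alpha = 0.05) are assumptions for the example, not taken from the text:

```python
import math

# Step 1: formal statement. H0: p = 0.5 against Ha: p != 0.5 (two-tailed).
p0 = 0.5
# Step 2: select a significance level.
alpha = 0.05
# Step 3: distribution to use - normal approximation for a large-sample proportion.
# Step 4: sample result (hypothetical) and the appropriate test value.
n, successes = 100, 60
p_hat = successes / n
se = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / se
# Step 5: probability of a result at least this extreme if H0 were true,
# computed from the standard normal CDF.
p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
# Step 6: compare the probability with alpha.
reject_h0 = p_value <= alpha
print(round(z, 2), round(p_value, 4), reject_h0)  # 2.0 0.0455 True
```

Since the p-value (about 0.0455) is smaller than alpha, the null hypothesis is rejected at the 5% level.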


Limitations of the tests of Hypotheses
1. The tests do not explain the reasons why the difference exists, say between the means
of the two samples. They simply indicate whether the difference is due to fluctuations of
sampling or because of other reasons, but the tests do not tell us which is/are the other
reason(s) causing the difference.
Errors in Hypothesis Testing
When a hypothesis is tested, there are four possible outcomes:
- The hypothesis is true but our test leads to its rejection.
- The hypothesis is false but our test leads to its acceptance.
- The hypothesis is true and our test leads to its acceptance.
- The hypothesis is false and our test leads to its rejection.
Of these four possibilities, the first two lead to an erroneous decision. The first possibility leads to a
Type I error and the second possibility leads to a Type II error.
This can be shown as follows:
                     State of nature
Decision             H0 is true            H0 is false
Accept H0            Correct decision      Type II error (β)
Reject H0            Type I error (α)      Correct decision
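The Type I error row can be illustrated by simulation: when H0 is actually true, a test at α = 0.05 should wrongly reject it about 5% of the time. The simulation settings below (normal population, 2000 trials, samples of 30) are arbitrary choices for the sketch:

```python
import math
import random
import statistics

random.seed(0)

alpha = 0.05
z_crit = 1.96          # two-tailed critical value for alpha = 0.05
trials, n = 2000, 30   # arbitrary simulation settings
rejections = 0
for _ in range(trials):
    # Draw a sample from a population where H0 (mean = 0) is TRUE.
    sample = [random.gauss(0, 1) for _ in range(n)]
    z = statistics.mean(sample) * math.sqrt(n)  # known sigma = 1
    if abs(z) > z_crit:
        rejections += 1  # H0 true but rejected: a Type I error
type_i_rate = rejections / trials
print(type_i_rate)  # close to alpha = 0.05
```

Lowering α reduces Type I errors but, other things equal, raises the chance of a Type II error; the two risks trade off against each other.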


Fig. Flow diagram for hypothesis testing:
1. State H0 and Ha.
2. Specify the level of significance (the α value).
3. Decide the correct sampling distribution.
4. Select a random sample and work out an appropriate value from the sample data.
5. Calculate the probability that the sample result would diverge as widely as it has from
expectations, if H0 were true.
6. Is this probability equal to or smaller than the α value (in case of a one-tailed test; α/2 in
case of a two-tailed test)?
- Yes: reject H0, thereby running the risk of committing a Type I error.
- No: accept H0, thereby running some risk of committing a Type II error.

Deduction & Induction
In logic, we often refer to the two broad methods of reasoning as the deductive and inductive approaches.

Deductive reasoning works from the more general to the more specific. Sometimes this is informally
called a "top-down" approach. We might begin with thinking up a theory about our topic of interest.
We then narrow that down into more specific hypotheses that we can test. We narrow down even
further when we collect observations to address the hypotheses. This ultimately leads us to be able to
test the hypotheses with specific data -- a confirmation (or not) of our original theories.
Inductive reasoning works the other way, moving from specific observations to broader
generalizations and theories. Informally, we sometimes call this a "bottom up" approach. In inductive
reasoning, we begin with specific observations and measures, begin to detect patterns and regularities,
formulate some tentative hypotheses that we can explore, and finally end up developing some general
conclusions or theories.

These two methods of reasoning have a very different feel to them when you're conducting research.
Inductive reasoning, by its very nature, is more open-ended and exploratory, especially at the
beginning. Deductive reasoning is narrower in nature and is concerned with testing or confirming
hypotheses. Even though a particular study may look like it's purely deductive (e.g., an experiment
designed to test the hypothesized effects of some treatment on some outcome), most social research
involves both inductive and deductive reasoning processes at some time in the project. In fact, it
doesn't take a rocket scientist to see that we could assemble the two graphs above into a single circular
one that continually cycles from theories down to observations and back up again to theories. Even in
the most constrained experiment, the researchers may observe patterns in the data that lead them to
develop new theories.
Chapter 6: Sampling
o Introduction
o Important terminologies and concepts in sampling.
o Sampling theory
o Types of sampling techniques (Probability Sampling, Non- probability Sampling)
Sampling is the process of selecting units (e.g., people, organizations) from a population of interest so
that by studying the sample we may fairly generalize our results back to the population from which
they were chosen.
Important terminologies and concepts in sampling:
External Validity
External validity is related to generalizing. Validity, in general, refers to the approximate truth of
propositions, inferences, or conclusions. So, external validity refers to the approximate truth of
conclusions that involve generalizations. Put in more pedestrian terms, external validity is the degree
to which the conclusions in your study would hold for other persons in other places and at other times.

One way to improve external validity, based on the sampling model, is to do a good job of drawing a
sample from the population. For instance, you should use random selection, if possible,
rather than a nonrandom procedure. And, once selected, you should try to assure that the respondents
participate in your study and that you keep your dropout rates low.
The group you wish to generalize to is often called the population in your study. This is the group you
would like to sample from because this is the group you are interested in generalizing to. It is an
aggregate of items possessing a common trait or traits. It is a complete group of items about which
knowledge is sought. There is a distinction between the population you would like to generalize to,
and the population that will be accessible to you. We'll call the former the theoretical population and
the latter the accessible population. Furthermore, population could be finite or infinite. Similarly,
population could be hypothetical or existent.
Sampling Frame
Once you've identified the theoretical and accessible populations, you have to do one more thing
before you can actually draw a sample -- you have to get a list of the members of the accessible
population. (Or, you have to spell out in detail how you will contact them to assure
representativeness). The listing of the accessible population from which you'll draw your sample is
called the sampling frame.
The sample is the group of people who you select to be in your study. Notice that it is not said that the
sample was the group of people who are actually in your study. You may not be able to contact or
recruit all of the people you actually sample, or some could drop out over the course of the study. The
group that actually completes your study is a sub-sample of the sample -- it doesn't include non-
respondents or dropouts.
Statistic and Parameter
When we sample, the units that we sample supply us with one or more responses. When we look
across the responses that we get for our entire sample, we use a statistic. If you measure the entire

population and calculate a value like a mean or average, we don't refer to this as a statistic, we call it a
parameter of the population.
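A minimal illustration of the distinction, using a made-up population of eight values:

```python
import random
import statistics

random.seed(2)

# A tiny finite population (hypothetical values).
population = [2, 4, 4, 6, 8, 10, 10, 12]

# Parameter: a value computed from the ENTIRE population.
parameter_mean = statistics.mean(population)

# Statistic: the same kind of value, but computed from a sample
# drawn from that population.
sample = random.sample(population, 4)
statistic_mean = statistics.mean(sample)

# The parameter is a fixed value (7 here); the statistic varies
# from sample to sample around it.
print(parameter_mean)
```

Different random samples yield different values of `statistic_mean`; that sample-to-sample variation is exactly what sampling theory studies.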
Sampling Theory
Sampling theory is a study of the relationships existing between a population and the samples drawn
from the population. Sampling theory is applicable only to random samples.
The main problem of sampling theory is the problem of relationship between a parameter and a
statistic. The sampling theory is concerned with estimating the properties of the population from those
of the sample and also with gauging the precision of the estimates. This sort of movement from
particulars (samples) towards general (population) is what is known as statistical induction or
statistical inference.
In clear terms, from the sample we may attempt to draw inference concerning the population. In order
to be able to follow this inductive method, we first follow a deductive argument which is that we
imagine a population or universe and investigate the behavior of the sample drawn from this
population applying the laws of probability. The methodology dealing with all this is known as
sampling theory.
Sampling theory is designed to attain one of the following objectives:
Statistical Estimation
Sampling theory helps in estimating unknown population parameters from knowledge of statistical
measures based on sample studies. In other words, to obtain an estimate of a parameter from a statistic
is the main objective of sampling theory.
The estimate can be either a point estimate or an interval estimate.
A point estimate is a single estimate expressed in the form of a single figure, whereas an interval
estimate has two limits, viz. the upper limit and the lower limit, within which the parameter value may
lie. Interval estimates are often used in statistical induction.
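The two kinds of estimate can be sketched side by side; the sample values are invented, and the normal multiplier 1.96 is an assumption made for simplicity (a t-multiplier would suit so small a sample better):

```python
import math
import statistics

# Hypothetical sample measurements.
sample = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 4.7, 5.3]

# Point estimate: a single figure for the population mean.
point_estimate = statistics.mean(sample)

# Interval estimate: a lower and an upper limit within which the
# parameter value may lie (approximate 95% confidence level).
se = statistics.stdev(sample) / math.sqrt(len(sample))
lower = point_estimate - 1.96 * se
upper = point_estimate + 1.96 * se
print(round(point_estimate, 2), round(lower, 2), round(upper, 2))
```

The interval estimate conveys not just a value but also the precision attached to it, which the single-figure point estimate cannot.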

Test of hypothesis
The second objective of sampling theory is to enable us to decide whether to accept or reject a
hypothesis; sampling theory helps in determining whether observed differences are actually due to
chance or whether they are really significant.
Statistical inference
Sampling theory helps in making generalization about the population from the studies based on
samples drawn from it. It helps in determining the accuracy of such generalization.
Sampling theory can be studied under two heads viz., the sampling of attributes and the sampling of
variables and that too in the context of large and small samples.
When we study some qualitative characteristic of the items in a population, we obtain statistics of
attributes in the form of two classes; one class consisting of items wherein the attribute is present and
the other class consisting of items wherein the attribute is absent. The presence of the attribute may be
termed a success whereas its absence may be termed a failure.
We generally consider the following three types of problems in case of sampling attributes:
- The parameter value may be given and it is only to be tested whether an observed statistic is its estimate.
- The parameter value is not known and we have to estimate it from the sample.
- Examination of the reliability of the estimate i.e., the problem of finding out how far the
estimate is expected to deviate from the true value for the population.
The theory of sampling can also be applied in the context of statistics of variables (i.e., data relating to
some characteristic of the population which can be measured or enumerated with the help of some
well-defined statistical unit), in which case the objective happens to be:
- To compare the observed and expected values and find if the difference can be ascribed to the
fluctuations of sampling
- To estimate population parameters from the sample
- To find out the degree of the reliability of the estimate.

Sampling Design
A sample design is a definite plan for obtaining a sample from a given population. It refers to the
technique or the procedure the researcher would adopt in selecting items for the sample.
Sample design may as well lay down the number of items to be included in the sample, i.e., the sample size.
Sample design is determined before data are collected.
Researchers must select/prepare a sample design which should be reliable and appropriate for their
research study.
A. Steps in sample Designing
While developing a sampling design, the researcher must pay attention to the following points:
1. Defining clearly the population/ universe to be studied.
2. Determination of the sampling unit. (A sampling unit may be a geographical one such as a state,
district, village, etc., a construction unit such as a house, flat, etc., a social unit such as a family, club,
school, etc., or an individual.)
3. Identifying the sampling frame or source list. (The sampling frame contains the names of all items
of the universe.) If a source list is not available, the researcher has to prepare one.
4. Determining the sample size( this refers to determining the number of items to be selected
from the population to constitute a sample.) An optimum sample size is one which fulfills the
requirements of efficiency, representativeness, reliability and flexibility.
5. Identifying the parameters of interest (the type of population characteristic the researcher
wants to study, i.e., whether proportion, mean or variation, affects the sample design to be selected).
6. Determining the budgetary constraints (cost considerations affect not only the size of the
sample but also the overall sample design to be pursued).
7. Determining the Sampling Procedure (what type of sample is to be used that can minimize the
sampling error).
B. Errors in Sampling

There are two types of errors in sample studies or in inference making: sampling errors and non-
sampling errors.
Non-sampling errors, also called systematic bias, result from errors in the sampling procedures; they
cannot be reduced or eliminated by increasing the sample size. At best the causes responsible for these
errors can be detected and corrected. Usually a systematic bias is the result of one or more of the
following causes:
- Inappropriate sampling frame
- Defective measurement device(questionnaire or interview guide)
- Non respondents
- Indeterminacy principle (individuals may act differently when put under observation)
- Natural bias in the reporting of data basically by the respondents.
Sampling errors are the random variation in the sample estimates around the true population
parameters. Since they occur randomly and are equally likely to be in either direction, their nature
happens to be of compensatory type and the expected value of such errors happens to be equal to zero.
Sampling error decreases with the increase in the size of the sample, and it happens to be of a smaller
magnitude in the case of a homogeneous population. Sampling error is measured for each sample size
and design; this measure is called the precision of the sampling plan.
In a nutshell, while selecting a sampling procedure, the researcher must ensure that the procedure
causes a relatively small sampling error and helps to control systematic bias in a better way.
C. Characteristics of Good Sample Design
The characteristics of good sample design are
a. It must result in a truly representative sample
b. It must be such which results in a small sampling error
c. It must be viable in the context of funds available for the research study
d. It must be such so that systematic bias can be controlled in a better way
e. It must be such that the results of the sample study can be applied, in general, for the
population with a reasonable confidence.

Sampling techniques
There are different ways of classifying sampling methods. Sometimes they are classified as:
- Random sampling, where the members of the sample are chosen by some random mechanism;
- Quasi-random sampling, where the mechanism for choosing the sample is only partly random;
- Non-random sampling, where the sample is specifically selected rather than randomly selected.
However, the dominant classification is the probability-non-probability continuum.

1. Probability Sampling
A probability sampling method is any method of sampling that utilizes some form of random
selection. In order to have a random selection method, you must set up some process or procedure that
assures that the different units in your population have equal probabilities of being chosen.
In case of probability sample method:
The probability or chance of every unit in the population being included in the sample is known.
Selection of the specific units in the population depends entirely on chance.

a. Simple Random Sampling
The simplest form of random sampling is called simple random sampling. Here's the quick
description of simple random sampling:
- Objective: To select n units out of N such that each unit has an equal chance of being selected.
- Procedure: Use a table of random numbers, a computer random number generator, or a
mechanical device (lottery method) to select the sample.
- It is a more scientific method of drawing samples from the universe since it eliminates
personal bias.

- No advance knowledge of the characteristics of the population is necessary.
- Assessment of the accuracy of the results is possible by sample error estimation.
- The sample is a true representative of the population.
- It is a very easy and practicable procedure for selecting samples.
- Provides reliable information at a low cost of time and energy.
Simple random sampling is simple to accomplish and is easy to explain to others. Because simple
random sampling is a fair way to select a sample, it is reasonable to generalize the results from the
sample back to the population.
A simple random sample is one in which each member (person) in the total population has an equal
chance of being picked for the sample.
In addition, the selection of one member should in no way influence the selection of another.
Simple random sampling should be used with a homogeneous population.
The simple random sample requires less knowledge about the population than other techniques, but it
does have the following major drawbacks:
- One is that, if the population is large, a great deal of time must be spent listing and numbering the
members.
- The sampling method requires a complete list of the universe. But such an up-to-date list is not
available in many enquiries, which restricts the use of this method.
- In a field survey, if the area of coverage is fairly large, then the units selected under this method
are expected to be scattered widely geographically, and thus it may be quite a time-consuming and
costly affair to collect the requisite information.
- The selected sample may not be a true representative of the universe if its size is too small.
- A simple random sample will not adequately represent many population attributes (characteristics)
unless the sample is relatively large. That is, if you are interested in choosing a sample to be
representative of a population on the basis of the distribution in the population of gender, age, and
economic status, a simple random sample will need to be very large to ensure all these
distributions are equivalent to (or representative of) the population.
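A short sketch of simple random sampling from a numbered frame; the frame, its size N = 1000, and the sample size n = 10 are invented for the example:

```python
import random

random.seed(42)

# Hypothetical sampling frame: a numbered list of N = 1000 members.
N, n = 1000, 10
population = list(range(1, N + 1))

# Simple random sampling without replacement: every member has the
# same chance (n/N) of being in the sample, and no member can be
# picked twice.
sample = random.sample(population, n)
print(sorted(sample))
```

In practice the random numbers would index into the actual source list, so the quality of the sample depends directly on the completeness of that frame.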
b. Stratified Random Sampling

Stratified Random Sampling, also sometimes called proportional or quota random sampling, involves
dividing your population into homogeneous subgroups and then taking a simple random sample in
each subgroup.
Objective: Divide the population into non-overlapping groups (i.e., strata) N1, N2, N3, ..., Nk, such that
N1 + N2 + N3 + ... + Nk = N. Then do a simple random sample of f = n/N in each stratum.
When we use the same sampling fraction within strata we are conducting proportionate stratified
random sampling. When we use different sampling fractions in the strata, we call this disproportionate
stratified random sampling.
There are several major reasons why you might prefer stratified sampling over simple random sampling.
First, it assures that you will be able to represent not only the overall population, but also key
subgroups of the population, especially small minority groups.
Second, stratified random sampling will generally have more statistical precision than simple random
sampling. This will only be true if the strata or groups are homogeneous. If they are, we expect that the
variability within groups is lower than the variability for the population as a whole; stratified sampling
capitalizes on that fact.
- If a correct stratification has been made, even a small number of units will form a representative
sample.
- Under stratified sampling no significant group is left unrepresented.
- It is more precise and to a great extent avoids bias.
- It saves cost and time of data collection since the sample size can be less.
- Achieves different degree of accuracy for different segments of the population.
- Replacement of case is easy if the original case is not accessible to study.
- It is of a great advantage if the distribution of the universe is skewed.

- It is a very difficult task to divide the universe into homogeneous strata.
- If the strata are overlapping, unsuitable or disproportionate, the selection of the sample may not
be representative.
- If stratification is faulty, it cannot be corrected by taking a large sample.
- Disproportionate stratification requires weighting which adds complexity and bias.
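A sketch of proportionate stratified sampling with the same fraction f in each stratum; the strata names and sizes (urban 600, rural 400, f = 0.05) are hypothetical:

```python
import random

random.seed(1)

# Hypothetical population divided into non-overlapping strata.
strata = {
    "urban": list(range(600)),  # N1 = 600
    "rural": list(range(400)),  # N2 = 400
}
f = 0.05  # same sampling fraction in every stratum => proportionate

# Proportionate stratified sampling: a simple random sample of
# f * Nh units drawn within each stratum h.
sample = {name: random.sample(units, round(f * len(units)))
          for name, units in strata.items()}
sizes = {name: len(chosen) for name, chosen in sample.items()}
print(sizes)  # {'urban': 30, 'rural': 20}
```

Using different fractions per stratum (e.g., oversampling the smaller stratum) would make this disproportionate stratified sampling, which then requires weighting at the analysis stage.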
c. Systematic Random Sampling
Here are the steps you need to follow in order to achieve a systematic random sample:
- number the units in the population from 1 to N
- decide on the n (sample size) that you want
- k = N/n = the interval size
- randomly select an integer between 1 and k
- then take every kth unit
For this to work, it is essential that the units in the population are randomly ordered, at least with
respect to the characteristics you are measuring.
Why would you ever want to use systematic random sampling?
- It is fairly easy to do. You only have to select a single random number to start things off.
- It is very easy to operate and checking can also be done quickly.
- It may also be more precise than simple random sampling.
- In some situations there is simply no easier way to do random sampling.
- Randomness and probability features are present in this model, which makes the sample
representative.
- It works well only if the complete and up-to-date frame is available and if the units are
randomly arranged.
- Any hidden periodicity in the list will adversely affect the representativeness of the sample.
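The steps above can be sketched directly; the frame size N = 100 and sample size n = 10 are arbitrary illustration values:

```python
import random

random.seed(7)

# Steps: number the units 1..N, choose n, set k = N/n, pick a random
# start between 1 and k, then take every kth unit.
N, n = 100, 10
k = N // n                    # interval size k = 10
start = random.randint(1, k)  # the single random number needed
sample = list(range(start, N + 1, k))
print(len(sample))  # 10 units, evenly spaced k apart
```

Note how only one random draw is needed; if the list happened to have a hidden cycle of length k, every selected unit would fall at the same point in that cycle, which is the periodicity risk mentioned above.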

d) Cluster (Area) Random Sampling
The problem with random sampling methods, when we have to sample a population that is dispersed
across a wide geographic region, is that you will have to cover a lot of ground geographically in order
to get to each of the units you sampled. This problem can be minimized by cluster random sampling.
In cluster sampling, we follow these steps:
- divide population into clusters (usually along geographic boundaries)
- randomly sample clusters
- measure all units within sampled clusters
- Significant cost gain.
- Easier and more practical method which facilitates the field work.
- Probability and the representativeness of the sample are sometimes affected if the number of
clusters is very large.
- The results obtained are likely to be less accurate if the number of sampling units in each
cluster is not approximately the same.
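The two cluster-sampling stages can be sketched minimally as follows, using a hypothetical population grouped into villages:

```python
import random

def cluster_sample(clusters, m):
    """Stage 1: randomly sample m clusters. Stage 2: measure all units inside them."""
    chosen = random.sample(clusters, m)
    return [unit for cluster in chosen for unit in cluster]

# hypothetical geographic clusters (e.g. villages) with their member units
villages = [["A1", "A2", "A3"], ["B1", "B2"], ["C1", "C2", "C3", "C4"], ["D1", "D2"]]
sample = cluster_sample(villages, 2)
print(sample)   # all units from the two randomly selected villages
```

The cost gain comes from the fact that field work is confined to the chosen clusters rather than scattered across the whole region.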
e) Multi-Stage Sampling
The four methods we've covered so far -- simple random, stratified, systematic and cluster -- are the simplest random sampling strategies. Sometimes, however, we need to combine these simple methods in a variety of useful ways that help us address our sampling needs in the most efficient and effective manner possible. When we combine sampling methods, we call this multi-stage sampling.
For example, consider the problem of sampling students in schools. We might begin with a national
sample of school districts stratified by economics and educational level. Within selected districts, we
might do a simple random sample of schools. Within schools, we might do a simple random sample of

classes or grades. And, within classes, we might even do a simple random sample of students. In this
case, we have three or four stages in the sampling process and we use both stratified and simple
random sampling. By combining different sampling methods we are able to achieve a rich variety of
probabilistic sampling methods that can be used in a wide range of social research contexts.
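The school example can be sketched as three chained random samples. The district, school and student data below are invented purely for illustration:

```python
import random

def multi_stage_sample(districts, n_districts, n_schools, n_students):
    """Stage 1: sample districts; stage 2: sample schools within each sampled
    district; stage 3: sample students within each sampled school."""
    sampled = []
    for district in random.sample(districts, n_districts):
        for school in random.sample(district["schools"], n_schools):
            sampled.extend(random.sample(school["students"], n_students))
    return sampled

# invented frame: 10 districts, 5 schools each, 30 students per school
districts = [
    {"schools": [{"students": [f"s-{d}-{sc}-{i}" for i in range(30)]}
                 for sc in range(5)]}
    for d in range(10)
]
sample = multi_stage_sample(districts, n_districts=3, n_schools=2, n_students=5)
print(len(sample))   # 3 districts x 2 schools x 5 students = 30
```

In a real design the first stage would typically be stratified (e.g. by economic and educational level) rather than a simple random draw.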
- It is more flexible in comparison to the other methods of sampling.
- It is simple to carry out and results in administrative convenience by allowing the field work to
be concentrated and yet covering large area.
- It is of great significance in surveys of underdeveloped areas where no up-to-date and accurate frame is generally available for subdivision of the material into reasonably small sampling units.
- It is a reliable and satisfactory technique, and by using it a sample survey can be conducted with considerable speed.
On the other hand:
- Errors are likely to be large in comparison to other methods.
- It is less efficient than a suitable single-stage sampling of the same size.
- It involves considerable amount of listing of first stage units, second stage units etc., though
complete listing of units is not necessary.
2. Non-probability Sampling
The difference between non-probability and probability sampling is that non-probability sampling
does not involve random selection and probability sampling does.
Does that mean that non-probability samples aren't representative of the population? Not necessarily.
But it does mean that non-probability samples cannot depend upon the rationale of probability theory.
At least with a probabilistic sample, we know the odds or probability that we have represented the
population well. We are able to estimate confidence intervals for the statistic. With non-probability
samples, we may or may not represent the population well, and it will often be hard for us to know
how well we've done so. In general, researchers prefer probabilistic or random sampling methods over
non-probabilistic ones, and consider them to be more accurate and rigorous. However, in applied

social research there may be circumstances where it is not feasible, practical or theoretically sensible
to do random sampling. Here, we consider a wide range of non-probabilistic alternatives.
In case of non-probability sampling method:
The probability of inclusion of any unit (of the population) in the sample is not known.
The selection of units within a sample involves human judgment rather than pure chance.
The maximum information per Birr that can be obtained from a probability sample cannot be determined here; moreover, the degree of accuracy is not known.
Although probability sampling is scientific and accurate, non-probability samples are often preferred for reasons of convenience and economy.
Many times, samples are selected by interviewers "at random", meaning that the actual sample selection is left to the choice of the researcher; such samples are non-probability samples, not probability samples.
We can divide nonprobability sampling methods into two broad types: accidental or purposive. Most
sampling methods are purposive in nature because we usually approach the sampling problem with a
specific plan in mind. The most important distinctions among these types of sampling methods are the
ones between the different types of purposive sampling approaches.
a. Accidental, Haphazard or Convenience Sampling
One of the most common methods of sampling goes under the various titles listed here. An example is the "man on the street" (now more often the "person on the street") interview conducted frequently by television news programs to get a quick (although non-representative) reading of public opinion. Clearly, the problem with this type of sample is that we have no evidence that it is representative of the population we're interested in generalizing to -- and in many cases we would clearly suspect that it is not.
This method may be used in the following cases:
- The universe is not clearly defined
- Sampling unit is not clear
- A complete source list is not available.
b. Purposive Sampling

In purposive sampling, we sample with a purpose in mind. We usually would have one or more
specific predefined groups we are seeking.
Purposive sampling can be very useful for situations where you need to reach a targeted sample
quickly and where sampling for proportionality is not the primary concern. With a purposive sample,
you are likely to get the opinions of your target population, but you are also likely to overweight
subgroups in your population that are more readily accessible.
- More economical and less time consuming
- Ensures proper representation of a cross-section of various strata of the universe if the
researcher has full knowledge of the composition of the universe.
- It is very useful when some of the units are very important and their inclusion in the study is essential.
- It is a practical method where randomization is not possible.
- Considerable prior knowledge of the universe is necessary which in most cases is not possible.
- Controls and safeguards adopted under this method are sometimes not effective, and there is every possibility of the selection of biased samples.
- The calculation of sampling error is not possible. Therefore, the hypothesis cannot be tested.
All of the methods that follow can be considered subcategories of purposive sampling methods. We
might sample for specific groups or types of people as in modal instance, expert, or quota sampling.
We might sample for diversity as in heterogeneity sampling. Or, we might capitalize on informal
social networks to identify specific respondents who are hard to locate otherwise, as in snowball
sampling. In all of these methods we know what we want -- we are sampling with a purpose.
Modal Instance Sampling
In statistics, the mode is the most frequently occurring value in a distribution. In sampling, when we
do a modal instance sample, we are sampling the most frequent case, or the "typical" case.

Expert Sampling
Expert sampling involves the assembling of a sample of persons with known or demonstrable
experience and expertise in some area. Often, we convene such a sample under the auspices of a
"panel of experts."
There are two reasons you might do expert sampling.
First, because it would be the best way to elicit the views of persons who have specific
expertise. In this case, expert sampling is essentially just a specific subcase of purposive sampling.
The other reason you might use expert sampling is to provide evidence for the validity of
another sampling approach you've chosen. The disadvantage is that even the experts can be, and often
are, wrong.
Quota Sampling
In quota sampling, you select people non-randomly according to some fixed quota.
There are two types of quota sampling: proportional and non-proportional.
In proportional quota sampling you want to represent the major characteristics of the population by sampling a proportional amount of each group.
Non-proportional quota sampling is a bit less restrictive. In this method, you specify the minimum
number of sampled units you want in each category. Here, you're not concerned with having numbers
that match the proportions in the population. Instead, you simply want to have enough to assure that
you will be able to talk about even small groups in the population. This method is the non-probabilistic
analogue of stratified random sampling in that it is typically used to assure that smaller groups are
adequately represented in your sample.
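The contrast between the two quota types can be sketched as follows. The population proportions and minimum counts are assumptions chosen for illustration:

```python
def quota_targets(population_props, sample_size, minimums=None):
    """Proportional quotas scale population proportions to the sample size;
    non-proportional quotas simply enforce a minimum count per category."""
    if minimums is not None:                    # non-proportional quota sampling
        return dict(minimums)
    return {group: round(prop * sample_size)    # proportional quota sampling
            for group, prop in population_props.items()}

props = {"urban": 0.7, "rural": 0.3}            # assumed population split
print(quota_targets(props, 200))                # {'urban': 140, 'rural': 60}
print(quota_targets(props, 200, minimums={"urban": 50, "rural": 50}))
```

In either case the units that fill each quota are still chosen non-randomly by the interviewer, which is what keeps the method non-probabilistic.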
- Quota sampling is a stratified-cum-purposive sampling and thus enjoys the benefits of both
sampling techniques.
- It makes the best use of stratification economically.
- Practical and convenient method.
- Likely to give accurate results.
- It is the only useful method when no sampling frame is available.
On the other hand:
- It suffers from the limitations of both stratified and purposive sampling.
- As control over field work is difficult, the results may be biased.
- Since it is not based on random sampling, the sampling error as well as standard error cannot
be estimated.
- Since the samples are not selected randomly, it may be less representative.
- Bias may occur due to substitution of unlike sample units.
Heterogeneity Sampling
We sample for heterogeneity when we want to include all opinions or views, and we aren't concerned
about representing these views proportionately. Another term for this is sampling for diversity.
In many brainstorming or nominal group processes (including concept mapping), we would use some
form of heterogeneity sampling because our primary interest is in getting a broad spectrum of ideas, not
identifying the "average" or "modal instance" ones. In effect, what we would like to be sampling is not
people, but ideas. We imagine that there is a universe of all possible ideas relevant to some topic and
that we want to sample this population, not the population of people who have the ideas. Clearly, in
order to get all of the ideas, and especially the "outlier" or unusual ones, we have to include a broad
and diverse range of participants. Heterogeneity sampling is, in this sense, almost the opposite of
modal instance sampling.
Snowball Sampling
In snowball sampling, you begin by identifying someone who meets the criteria for inclusion in your study. You then ask them to recommend others they know who also meet the criteria.
Although this method would hardly lead to representative samples, there are times when it may be the
best method available. Snowball sampling is especially useful when you are trying to reach
populations that are inaccessible or hard to find. For instance, if you are studying the homeless, you
are not likely to be able to find good lists of homeless people within a specific geographical area.
However, if you go to that area and identify one or two, you may find that they know very well who
the other homeless people in their vicinity are and how you can find them.
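Snowball recruitment can be sketched as a breadth-first walk over a referral network. The names and referral lists below are entirely hypothetical:

```python
from collections import deque

def snowball_sample(seeds, referrals, max_size):
    """Grow a sample from seed respondents by following their referrals."""
    sampled, queue, seen = [], deque(seeds), set(seeds)
    while queue and len(sampled) < max_size:
        person = queue.popleft()
        sampled.append(person)
        for contact in referrals.get(person, []):  # people this respondent names
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)
    return sampled

# hypothetical referral network
referrals = {"ana": ["ben", "carla"], "ben": ["dawit"], "carla": ["dawit", "eyob"]}
print(snowball_sample(["ana"], referrals, max_size=4))
# ['ana', 'ben', 'carla', 'dawit']
```

The final sample clearly depends on who the seeds are and whom they choose to name, which is why the method cannot claim representativeness.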
Sample Size
When considering collecting data, it is important to ensure that the sample contains a sufficient
number of members of the population for adequate analysis to take place. Larger samples will
generally give more precise information about the population. Unfortunately, in reality, questions of
expense and time tend to limit the size of the sample it is possible to take. For example, national
opinion polls often rely on samples in the region of 1000.
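The figure of roughly 1000 reflects the usual margin-of-error calculation for a proportion estimated from a simple random sample. A minimal sketch, assuming the worst case p = 0.5 and a 95% confidence level (z = 1.96):

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion p estimated from a
    simple random sample of size n (worst case at p = 0.5)."""
    return z * math.sqrt(p * (1 - p) / n)

print(round(margin_of_error(1000), 3))  # ~0.031, i.e. about +/- 3 percentage points
```

Quadrupling the sample only halves the margin of error, which is why polls rarely go far beyond this size.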

Chapter 7
Method of Data Collection
When we talk of collection of data, we should be clear as to what the word data connotes. The
word datum is a Latin word, which literally means something given. It means pieces of information,
which can be either quantitative or qualitative. The term data is a plural of datum and means facts
and statistics collected together for reference or analysis. Thus any information collected is data.
Collection of data refers to a purposive gathering of information relevant to the subject matter of the
study from the units under investigation. The method of data collection depends mainly upon the nature, purpose and scope of the inquiry on the one hand, and the availability of resources on the other.
The task of data collection begins after a research problem has been defined and research design/plan
chalked out.
Sources of data
There are two types of data that are collected and analyzed in research endeavors. These are:
Secondary data
Primary data
Secondary sources of Data
Secondary data means data that are already available, i.e., data which have already been collected and analyzed by someone else. Secondary data are collected by others and used by the researcher.
Any data that has been collected earlier for some other purpose are secondary data in the hands of an
individual who is using them.
Collection of secondary data
Secondary data may either be published or unpublished data.
Usually published data are available in:
Various publications of the central, state, or local government
Various publications of international bodies or their subsidiaries or foreign governments
Technical or trade journals
Books, magazines and newspapers
Reports and publications of various organizations
Reports of research scholars in different fields

Public records and statistics, historical documents and other sources of published information.
Advantages and disadvantages of Secondary Data
There are certain distinct advantages, as also the limitations, of using secondary data. One should,
therefore, be fully aware of both the advantages and the limitations.
1. A major advantage of the use of secondary data is that it is far more economical as the cost of
collecting original data is saved. In the collection of primary data, a good deal of effort is
required- data collection form should be designed, and printed, field staff is to be assigned and
maintained until all the data have been collected, the traveling expenses are to be incurred, the
sample design has to be selected, data are to be collected and verified for their accuracy, and
finally all such data has to be tabulated. All these activities would need large funds, which can
be utilized elsewhere if secondary data alone can serve the purpose.
2. The use of secondary data saves much of our time. This leads to prompt completion of the
report for which, otherwise, primary data would have been required to be collected.
3. As one explores the availability of secondary data required for one's project, one finds in the process that one's understanding of the problem has improved. One may even have to change some of one's earlier ideas in the light of the secondary data.
4. Secondary data can be used as a basis for comparison with the primary data that have just been collected.
5. The search for secondary data is helpful not only in itself, but because familiarity with such data indicates deficiencies and gaps. As a result, one can make the primary data collection more specific and more relevant to one's study.
In practice, secondary data seldom fit perfectly into the framework of the proposed study. This is on account of a number of factors:
1. The unit in which secondary data are expressed may not be the same as is required in the
proposed study.
2. Even if the units are the same as those required by the research project, it may be that class
boundaries are different from those desired.
3. One does not always know how accurate the secondary data are. If the degree of accuracy is doubtful, the use of such dubious data would undermine the utility of our study. In most cases it is difficult to know with what care secondary data have been collected and tabulated. All the same, in the case of well-established and reputed organizations, both official and non-official, secondary data will be far more accurate and reliable, and they can be used without much hesitation.
4. A severe limitation in the use of secondary data is that they may be somewhat out of date. A
good deal of time is spent in the collection, processing, tabulation and publishing of such data
and by the time the data are available, they are already two or three years old. As a result, the data are no longer up-to-date. The utility of secondary data declines progressively as time goes by, until they are finally useful only for historical purposes.
Evaluating Secondary Data
Since the use of secondary data is relatively cheaper than that of primary data, it is advisable to
explore the possibility of using secondary data. In this connection there are four requirements that
must be met. These are:
1. Availability of Secondary Data
The first and foremost requirement is that secondary data must be available for use. At times, one may
find that secondary data are just not available on the problem at hand. In such cases, there is no alternative but to resort to the collection of primary data.
2. Relevance/suitability of the data
Another precondition for the use of secondary data is their relevance to the marketing problem.
Relevance means that the data available must fit the requirements of the problem. This would cover
several aspects
Unit of measurement should be the same as that in the marketing problem
The concepts used should be the same as are envisaged in the problem.
The data should not be obsolete.
3. Reliability of the data
The reliability can be tested by finding out such things about the said data:
Who collected the data
What were the sources of the data
Were they collected by using proper method
At what time were they collected
Was there any bias of the compiler
What level of accuracy was desired? Was it achieved?


4. Accuracy
The other requirement is that the data should be accurate. In this connection the researcher should
consult the original source. This would not only enable one to get more comprehensive information
but would also indicate the context in which data have been collected, the procedure followed and the
extent of care exercised in their collection.
5. Sufficiency
The data should be sufficient. If the data are inadequate, then compliance with the preceding requirements will be in vain.
Primary Data
Primary data are original observations collected by the researcher or his agents for the first time for
any investigation and used by them in the statistical analysis.
Primary data are those data which are collected afresh and for the first time, and thus happen to be
original in character.
Advantages of Primary Data
1. The primary source gives data in greater detail compared to secondary sources. Secondary sources often omit part of the information.
2. In a secondary source, there is a possibility of mistakes due to errors in transcription made when the figures were copied from the primary source.
3. The primary source includes definitions of the terms and units used. It is essential that the investigators understand the meaning of the units in which data are recorded.
4. The primary source also includes a copy of the schedule used in data collection, together with a description of the procedure used in selecting the sample and the size of the sample.
Methods of Primary Data Collection
We collect primary data during the course of doing experiments in experimental research; but when we do research of the descriptive type and perform surveys, we can obtain primary data either through observation or through direct communication with respondents in one form or another, for example through personal interviews.
In other words, this means that there are several methods of collecting primary data, particularly in
surveys and descriptive researches.
Important ones are:
Observation method
Interview method

Through questionnaires
Through schedules
Other methods
- Warranty cards
- Distributor audits
- Pantry audits
- Consumer panels
- Using mechanical devices
- Through projective techniques
- Depth interviews
- Content analysis


Chapter 8
Data Processing, Statistical Analysis and Interpretation
- Data analysis
- Data preparation/processing
Graphs & diagrams
- Statistical Methods in Research
Descriptive Statistics
- Interpretation
I. Data Analysis
By the time you get to the analysis of your data, most of the really difficult work has been done.
It's much more difficult to: define the research problem; develop and implement a sampling plan;
conceptualize, operationalize and test your measures; and develop a design structure. If you have
done this work well, the analysis of the data is usually a fairly straightforward affair.
Analysis of data involves a number of closely related operations that are performed with the purpose of summarizing the collected data and organizing these in such a manner that they will yield answers to the research questions and hypotheses that initiated the study.
Analysis of data includes comparison of the outcomes of the various treatments upon the several
groups and the making of the decision as to the achievement of the goals of research.
Analysis of data means to make the raw data meaningful or to draw some results from the data
after the proper treatment.
Process involved in Data Analysis
Some authors differentiate data analysis and data preparation stating that data preparation is one
of the steps in research activities whereas data analysis is the other step.
Others put the steps involved in data analysis, in general, as being classified as;

1. classification or establishment of categories for data
2. application of categories to raw data through coding
3. the tabulation of data
4. statistical analysis of data
5. Inferences about causal relationship among variables.
The other approach used to classify the data analysis in social research involves three major
steps, done in roughly this order:
1. Cleaning and organizing the data for analysis (Data Preparation/processing)
2. Describing the data (Descriptive Statistics)
3. Testing Hypotheses and Models (Inferential Statistics)
In most research studies, the analysis section follows these three phases of analysis. Descriptions
of how the data were prepared tend to be brief and to focus only on the more unique aspects of your study, such as specific data transformations that were performed. The descriptive statistics
that you actually look at can be voluminous. In most write-ups, these are carefully selected and
organized into summary tables and graphs that only show the most relevant or important
information. Usually, the researcher links each of the inferential analyses to specific research
questions or hypotheses that were raised in the introduction, or notes any models that were tested
that emerged as part of the analysis. In most analysis write-ups it's especially critical to not "miss
the forest for the trees." If you present too much detail, the reader may not be able to follow the
central line of the results. Often extensive analysis details are appropriately relegated to
appendices, reserving only the most critical analysis summaries for the body of the report itself.
Data Preparation/processing
Data Preparation or processing includes:
checking or logging the data in;
checking the data for accuracy/editing
coding the data
a) Logging the Data

In any research project you may have data coming from a number of different sources at different times: for example, from mail survey returns, coded interview data, pretest or posttest data, and observational data.
In all but the simplest of studies, you need to set up a procedure for logging the information
and keeping track of it until you are ready to do a comprehensive data analysis.
Different researchers differ in how they prefer to keep track of incoming data. In most
cases, you will want to set up a database that enables you to assess at any time what data is
already in and what is still outstanding.
You could do this with any standard computerized database program (e.g., Microsoft
Access, Claris Filemaker), although this requires familiarity with such programs. Or you can
accomplish these using standard statistical programs (e.g., SPSS, SAS, Minitab, Data desk)
and running simple descriptive analyses to get reports on data status.
It is also critical that the data analyst retain the original data records for a reasonable period
of time -- returned surveys, field notes, test protocols, and so on.
A database for logging incoming data is a critical component in good research record-keeping.
b) Checking the Data for Accuracy/ Editing
As soon as data is received you should screen it for accuracy. In some circumstances doing this
right away will allow you to go back to the sample to clarify any problems or errors.
Editing of data is a process of examining the raw collected data to detect errors and omissions
and to correct these when possible.
Editing is done to assure that the data are accurate, consistent with other facts gathered, uniformly entered, as complete as possible, and well arranged to facilitate coding and tabulation.
There are several questions you should ask as part of this initial data screening:
- Are the responses legible/ readable?
- Are all important questions answered?
- Are the responses complete?

- Is all relevant contextual information included (e.g., date, time, place, researcher)?
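The screening questions above can be automated for structured data. The field names and the plausibility rule below are assumptions made purely for illustration:

```python
def screen_response(response, required_fields):
    """Flag a returned form that fails the basic editing checks:
    missing answers or obviously out-of-range values."""
    problems = []
    for field in required_fields:
        if response.get(field) in (None, ""):
            problems.append(f"missing: {field}")
    # hypothetical range check on a numeric field
    if "age" in response and not (0 <= response["age"] <= 120):
        problems.append("age out of plausible range")
    return problems

form = {"id": 17, "date": "2024-05-01", "age": 134, "q1": "yes", "q2": ""}
print(screen_response(form, ["date", "q1", "q2"]))
# ['missing: q2', 'age out of plausible range']
```

Running such checks as soon as forms come in makes it possible to go back to respondents while clarification is still feasible.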
Editing could be:
Field editing and Central editing
Field editing:
Consists in the review of the reporting forms by the investigator for completing (translating or rewriting) what the interviewer has written in abbreviated and/or illegible form at the time of recording the respondents' responses.
Done to check whether the handwriting is readable or not.
Must not include correcting errors of omission by guessing.
Central Editing:
Takes place when all forms or schedules have been completed and returned to the office.
Implies that all forms should get a thorough editing by a single editor in a small study
and by a team in case of large inquiry.
The obvious errors may be corrected such as wrong place entry.
In case of omission of responses, sometimes the editor can enter the answer by
considering other information.
Editors must keep in view several points while performing their work:
They should be familiar with instructions given to the interviewers and coders as well
as with the editing instructions supplied to them for the purpose.
While crossing out an original entry for one reason or another, they should just draw a
single line on it so that the same may remain legible.
They must make entries in distinctive colors and in standard forms
They should initial all answers which they change or supply.
Editors' initials and the date of editing should be placed on each completed form or schedule.
c) Coding
Refers to the process of assigning numbers or other symbols to answers so that
responses can be put into a limited number of categories or classes.
Such classes are appropriate to the research problem under consideration.
The classes must be exhaustive, mutually exclusive and unidimensional.
Coding is necessary for efficient analysis; through it, the several replies may be reduced to a small number of classes which contain the critical information required for analysis.
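A minimal sketch of coding: a codebook maps each response category to a numeric code, with a catch-all code for answers that fit no class. The categories and codes below are invented for illustration:

```python
# codebook: categories should be exhaustive and mutually exclusive,
# with 9 as a catch-all for unclassifiable answers
CODEBOOK = {"strongly agree": 1, "agree": 2, "neutral": 3,
            "disagree": 4, "strongly disagree": 5}

def code_responses(raw):
    """Assign a numeric code to each raw answer (9 = unclassifiable)."""
    return [CODEBOOK.get(answer.strip().lower(), 9) for answer in raw]

print(code_responses(["Agree", "neutral", "no opinion", "Strongly Disagree"]))
# [2, 3, 9, 5]
```

Normalizing case and whitespace before lookup keeps the coding consistent across editors.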
d) Classification
Is the process of arranging data into sequences and groups according to their common
characteristics, or separating them into different but related parts.
Is the scheme of breaking a category into a set of parts, called classes, according to some precisely defined differing characteristics possessed by all the elements of the class.
Is the arrangement of data into different classes which are to be determined
depending upon the nature, objective, and scope of the enquiry.
Reduces a large volume of data into homogeneous groups of manageable size.
Is the process of arranging data in groups or classes on the basis of common characteristics.
Characteristics of Classification
When we make a classification, we break up the subject matter into a number of classes. It is important that the classification should possess the following characteristics:
Exhaustive: the classification system must be exhaustive. There must be no item which cannot find a class; every item of data must fall into one of the classes. If the classification is made exhaustive, there will be no place for ambiguity.

Mutually exclusive: there must not be overlap. That is, each item of data must find its
place in one class and one class only. There must be no item which can find its way into
more than one class.
Stability: classification must proceed at every stage in accordance with one principle and
that principle should be maintained throughout. If a classification is not stable and is
changed for every inquiry, then data would not be fit for comparison.
Flexibility: A good classification should be flexible and should have the capacity of
adjustment to new situations and circumstances.
Homogeneity: the items included in one class should be homogeneous.
Suitability: the classification should conform to the objects of enquiry. If an
investigation is carried on to enquire into the economic conditions of laborers, then it will
be useless to classify them on the basis of their religion.
Arithmetic Accuracy: the total of the items included in different classes, should tally
with the total of the universe.
Types of classification
Classification can be done in one of the following ways, depending on the nature of
the phenomenon involved:
1. Classification According to Attributes
Is sometimes called classification based on difference in kind, or qualitative classification.
Data can be classified on the basis of common characteristics which can either be
descriptive or numerical.
Descriptive characteristics refer to qualitative phenomenon which can not be
measured quantitatively. Such data are called statistics of attributes and their
classification is called classification according to attributes.
Such classification can be
o Simple classification or
o Manifold classification.

In simple classification, we consider only one attribute and divide the universe into two classes: one class possessing the attribute and the other not possessing it.
In manifold classification, we consider two or more attributes simultaneously, and
divide the data into a number of classes.
2. Classification according to class-intervals
Sometimes known as classification based on difference of degree of a given characteristic, or quantitative classification.
The numerical characteristics refer to quantitative phenomena which can be measured through some statistical units.
Such data are known as statistics of variables and are classified on the basis of class intervals.
Each class interval has an upper limit and a lower limit, which are known as class limits. The difference between the two class limits is called the class magnitude.
Class magnitudes may be equal or unequal.
The number of items which fall in a given class is known as the frequency of the
given class.
All the classes or groups, taken together with their respective frequencies and put in the form of a table, are described as a grouped frequency distribution.
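Building a grouped frequency distribution from raw values can be sketched as below, using exclusive-type class limits (lower limit included, upper limit excluded). The marks are invented for illustration:

```python
def frequency_distribution(values, lower, width, n_classes):
    """Group values into equal class intervals and count the frequency of each.
    Exclusive-type limits: lo <= x < hi."""
    counts = {}
    for i in range(n_classes):
        lo, hi = lower + i * width, lower + (i + 1) * width
        counts[f"{lo}-{hi}"] = sum(1 for v in values if lo <= v < hi)
    return counts

marks = [12, 25, 37, 41, 45, 52, 58, 63, 71, 78, 84, 92]
print(frequency_distribution(marks, lower=0, width=20, n_classes=5))
# {'0-20': 1, '20-40': 2, '40-60': 4, '60-80': 3, '80-100': 2}
```

Because the intervals are exhaustive and mutually exclusive, the class frequencies sum to the total number of items.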
3. Geographical classification
The data are classified according to the geographical location such as continents,
countries, states, districts, or other subdivisions.
4. Chronological Classification
When the given data are classified on the basis of time, this is named chronological classification.
In this type of classification, the data may be classified on the basis of time, i.e.,
years, months, weeks, days or hours.

5. Alphabetical Classification
When the data are arranged according to alphabetical order, this is called alphabetical classification.
This type of classification is mostly adopted for data of general use because it aids
in locating the items easily.
Objectives of Classification
The chief objectives of classification are:
- To present the facts in a simple form: the classification process eliminates unnecessary details and makes the mass of complex data simple, brief, logical and understandable.
- To bring out clearly points of similarity and dissimilarity: classification brings out clearly the points of similarity and dissimilarity of the data so that they can be easily grasped. Facts having similar characteristics are placed in one class, such as educated, uneducated, employed, unemployed, etc.
- To facilitate comparison: classification of data enables one to make comparisons, draw inferences and locate facts. This is not possible with unclassified data. If the marks obtained by students in two colleges are given, no comparison can be made of their intelligence levels. Classification of the students into first, second, third and failure classes on the basis of the marks obtained by them will make such comparison easy.
- To bring out relationship: classification helps in finding out cause effect
relationship, if there is any in the data.
- To present a mental picture: The process of classification enables one to form a
mental picture of objects of perception and conception. Summarized data can be
easily understood and remembered.
- To prepare the basis for tabulation: classification prepares the basis for tabulation
and statistical analysis of the data. Unclassified data cannot be presented in tables.

Classification, however, involves some difficult decisions:
- Determining the number of groups and the magnitude of each class is challenging.
- Choosing the class limits and the type of class intervals (inclusive or exclusive) is also a difficult decision.
- Determining the frequency of each class is another challenge.
e) Tabulation
Is the process of summarizing raw data and displaying the same in compact form
(i.e., in the form of statistical tables) for further analysis.
Is an orderly arrangement of data in columns and rows.
Is the orderly and systematic presentation of numerical data in a form designed to
elucidate the problem under consideration.
A statistical table is the logical listing of related quantitative data in vertical columns
and horizontal rows of numbers with sufficient explanatory and qualifying words,
phrases and statements in the form of titles, headings and notes to make clear the full
meaning of data and their origin.

Objectives of tabulation
Tabulation is a process which helps in understanding complex numerical facts.
The purpose of table is to summarize a mass of numerical information and to present
it in the simplest possible form consistent with the purpose for which it is to be used.
In general tabulation has the following objectives
To clarify the object of investigation
The function of tabulation in the general scheme of statistical investigation is to arrange
in easily accessible form the answers with which the investigation is concerned.
To clarify the characteristics of data
A table presents facts clearly and concisely, eliminating the need for wordy
explanation. It brings out the chief characteristics of data.
To present facts in the minimum space

A table presents facts in minimum of space and communicates information in a far
better way than textual material.
To facilitate statistical process
It simplifies references to data and facilitates comparative analysis and interpretation
of the facts.
Advantages of Tabulation
Tabulation is essential for the following reasons:
It conserves space and reduces explanatory and descriptive statements to a minimum.
It facilitates the process of comparison.
It facilitates the summation of items and the detection of errors and omissions.
It provides a basis for various statistical computations.
Limitations of Tabulation
A table contains only figures and not their description. It is not easily understood
by persons who are not adept in assimilating facts from tables.
It requires specialized knowledge to understand tables. A layman cannot derive
any conclusion from a table.
A table does not lay emphasis on any section of particular importance.
Main Parts of Statistical tables
1. Table number: every table should be numbered so that it can be identified. The
number is normally indicated at the top of the table.
2. Title: Each table must bear a title indicating the type of data contained. The title
should not be so lengthy as to run into several lines. It should be clear and precise.
3. Captions and Stubs: A table consists of rows and columns. The headings or
subheadings given in columns are known as captions while those given in rows
are stubs. It is necessary that a table should have captions and stubs to indicate

what columns and rows stand for. It is also desirable to provide for an extra
column and row in the table for the column and row totals.
4. Main body of the table: As this part of the table contains the data, it is the most
important part. Its size and shape should be suitable to accommodate the data. The
data are entered from top to bottom in columns and from left to right in rows.
5. Ruling and spacing: Proper ruling and spacing make the table attractive and its
contents easy to read.
6. Head note: A head note is given below the title, usually in brackets, to clarify the
contents of the table, such as the unit of measurement used.
7. Footnote: A footnote is given at the bottom of the table to explain any item or
feature of the data that is not self-explanatory.
8. Source note: A source note, placed below the footnote, indicates the source from
which the data have been obtained.
Tabulation can be classified as simple or complex.
- Simple tabulation gives information about one or more groups of independent
questions. It results in one-way tables which supply answers to questions
about one characteristic of the data only.
- Complex tabulation shows the division of data into two or more categories
and as such is designed to give information concerning one or more sets of
inter-related questions. It usually results in two-way tables (which give
information about two inter-related characteristics of data), three-ways, or still
higher order tables or manifold tables.
f) Graphic presentation of data
In the previous topic we have seen that tabulation is one method of presenting data. Another way
of presenting data is in the form of diagrams and graphs. However, this method of data presentation
is also not without limitations.
Importance of graphic and diagrammatic presentation of Data
1. On account of their visual impact, the data presented through graphic and diagrammatic
presentation are better grasped and remembered than the tabulated ones.
2. These forms of presentation present data in a simple, clear and effective manner.
3. They are able to attract the attention of the reader particularly when several colors and
pictures are used in preparation.

4. A major advantage of these presentations is that they have better appeal even to a layman.
For the layman, simple charts, maps and pictures facilitate a much better understanding
of the data on which these are based.
5. Since they lead to a better understanding, they save considerable time.
6. Even when data show highly complex relations among variables, these devices make
them much clear. They thus greatly facilitate in the interpretation and analysis of data.
7. These devices are extremely helpful in depicting mode, median, skewness, correlation
and regression, normal distribution, time series analysis and so on.
Limitations of Graphs and Diagrams
1. In presenting data by these devices, it is not possible to maintain 100% precision. As such
these devices are not suitable where precision is needed.
2. These devices cannot be a complete substitute for tabulation. They serve the purpose better
when they are accompanied by suitable tables.
3. When too many details are to be presented, these devices fail to present them without loss
of clarity.
4. In those cases, where mathematical treatment is required, these devices turn out to be
extremely unsuitable.
5. Small differences in large measurements cannot be properly brought out by means of
graphs and diagrams.
6. While graphs and diagrams are generally simple to understand, one should know that all
graphic devices are not simple. Particularly when ratio graphs and multidimensional
figures are used, these may be beyond the comprehension of the common man. A proper
understanding of these figures needs some expertise on the part of the reader.
g) Graphic devices
There are two major categories of graphs:
The natural scale graph and Ratio scale graph
The natural scale graph is more frequently used.
Within the natural scale graph, again there are two types of graphs:
Time series graph
Frequency graph

A time series graph shows the data against time, which could be measured in hours, days,
weeks, months or years.
In frequency graphs, time is not a dimension; instead, some other variable, such as the income of
employees, is plotted against the number of employees earning that income. Within the frequency
graph category, the histogram, the frequency polygon and the ogive are the popular ones.
Time series Graphs:
1. Line Graph
Time period is measured along X-axis and the corresponding values are on the Y-axis.
2. Silhouette or Net Balance Graph
In such a graph the two related series are plotted in such a manner as to highlight the
difference or gap between them.
3. Component or Band Graph
Under this device, phenomena which form part of a whole are shown by successive bands
or components to enable an overall picture along with the successive contributions of the
components.
4. Range Graph
This graph shows the range, that is, the highest and the lowest values, of a certain product or item
under reference.
Frequency Graphs:
The following are the types of frequency graphs
1. Line graph
2. Polygon
3. Ogive
4. Histogram
5. Frequency curve
6. Lorenz Curve
7. Z-chart.


Line Graph
Line graph is also used to present a discrete frequency distribution.
On the axis of X is measured the size of the items while on the axis of Y is measured the
corresponding frequency.
In a histogram, we measure the size of the item in question, given in terms of class intervals,
on the axis of X while the corresponding frequencies are shown on the axis of Y. Unlike the
line graph, here the frequencies are shown in the form of rectangles the basis of which is the
class interval. Furthermore, the rectangles are adjacent to each other without having any gap
amongst them. A histogram generally represents a continuous frequency distribution, in
contrast to a line graph, which represents either a discrete frequency distribution or a time series.
Each rectangle shows a distinctly separate class in the distribution.
The area of each rectangle in relation to all other rectangles shows the proportion of the
total number of observations pertaining to that class.
Frequency Polygon
A frequency polygon like any polygon consists of many angles. A histogram can be easily
transformed into frequency polygon by joining the mid-points of the rectangles by straight lines.
A frequency polygon can also be drawn by taking the mid-points of each class interval and
joining them by straight lines. This can be done only when we have a continuous frequency
distribution.
The frequency polygon is simpler as compared to the histogram.

It shows more vividly an outline of the data pattern.
As the number of classes and the number of observations increase, the frequency
polygon becomes increasingly smooth.
Frequency Curve
When a frequency polygon is smoothened and rounded at the top, then it is known as
frequency curve.
Cumulative Frequency Curve or Ogive
Cumulative frequency curve enables us to know how many observations are above or below a
certain value. It is also known as ogive.
Z-Chart
The Z-chart is commonly used in business. The name of this device is derived from its
shape. It is the combination of three curves, namely:
I. The curve based on the original data
II. The curve based on the cumulative frequency
III. The curve based on the moving totals (which can be obtained by
adding the past X number of data points).
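The cumulative frequency series plotted in an ogive can be sketched as follows; the class frequencies are invented for illustration:

```python
from itertools import accumulate

# Class frequencies for intervals 0-10, 10-20, 20-30, 30-40 (hypothetical data).
frequencies = [5, 12, 18, 7]

# "Less than" cumulative frequencies: each entry counts all observations
# below the upper limit of the corresponding class interval.
cumulative = list(accumulate(frequencies))
print(cumulative)  # the successive points of the ogive
```

Plotting these cumulative totals against the upper class limits gives the rising S-shaped curve described above.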
II. Statistical Analysis in Research
Analysis means the computation of certain indices or measures along with searching for patterns
of relationship that exist among the data groups.
Analysis involves estimating the values of unknown parameters of the population and testing of
hypothesis for drawing inferences.
Analysis, therefore, may be classified as descriptive analysis and inferential analysis.
In descriptive statistics we are simply describing what is or what the data shows.

With inferential statistics, we are trying to reach conclusions that extend beyond the
immediate data alone. For instance, we use inferential statistics to try to infer from the
sample data what the population might think. Or, we use inferential statistics to make
judgments of the probability that an observed difference between groups is a dependable
one or one that might have happened by chance in this study. Thus, we use inferential
statistics to make inferences from our data to more general conditions; we use descriptive
statistics simply to describe what's going on in our data.
Descriptive Statistics
Descriptive statistics are used to describe the basic features of the data in a study. They provide
simple summaries about the sample and the measures. Together with simple graphics analysis,
they form the basis of virtually every quantitative analysis of data.
Descriptive Statistics are used to present quantitative descriptions in a manageable form.
Descriptive statistics help us to simplify large amounts of data in a sensible way. Each
descriptive statistic reduces lots of data into a simpler summary. However, every time you try to
describe a large set of observations with a single indicator you run the risk of distorting the
original data or losing important detail. For instance, the GPA doesn't tell you whether the
student was in difficult courses or easy ones, or whether they were courses in their major field or

in other disciplines. Even given these limitations, descriptive statistics provide a powerful
summary that may enable comparisons across people or other units.
a. Univariate Analysis
Univariate analysis involves the examination across cases of one variable at a time. There are
three major characteristics of a single variable that we tend to look at:
- the distribution
- the central tendency
- the dispersion
In most situations, we would describe all three of these characteristics for each of the variables in
our study.
The Distribution
The distribution is a summary of the frequency of individual values or ranges of values for a
variable. The simplest distribution would list every value of a variable and the number of persons
who had each value. For instance, a typical way to describe the distribution of college students is
by year in college, listing the number or percent of students at each of the four/three years. Or,
we describe gender by listing the number or percent of males and females.

Table 1. Frequency distribution table.
One of the most common ways to describe a single variable is with a frequency distribution.

Depending on the particular variable, all of the data values may be represented, or you may
group the values into categories first (e.g., with age, price, or temperature variables, it would
usually not be sensible to determine the frequencies for each value. Rather, the values are
grouped into ranges and the frequencies determined.).
Frequency distributions can be depicted in two ways, as a table or as a graph. Table 1 shows
an age frequency distribution with five categories of age ranges defined. The same frequency
distribution can be depicted in a graph as shown in Figure 2. This type of graph is often
referred to as a histogram or bar chart.

Figure 2. Frequency distribution bar chart.
Distributions may also be displayed using percentages. For example, you could use
percentages to describe the:
- percentage of people in different income levels
- percentage of people in different age ranges
- percentage of people in different ranges of standardized test scores
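A percentage distribution of this kind can be sketched by grouping raw values into ranges first; the age ranges and the sample ages here are invented:

```python
# Percentage of people in different age ranges (hypothetical sample).
ages = [21, 34, 45, 23, 56, 38, 29, 61, 47, 33]
bins = [(20, 30), (30, 40), (40, 50), (50, 70)]  # assumed ranges
n = len(ages)

# For each range, count the members and convert the count to a percentage.
percentages = {}
for lo, hi in bins:
    count = sum(1 for a in ages if lo <= a < hi)
    percentages[(lo, hi)] = 100 * count / n

for (lo, hi), pct in percentages.items():
    print(f"{lo}-{hi}: {pct:.0f}%")
```

Because the ranges cover every observation exactly once, the percentages sum to 100.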
Central Tendency/statistical average
The central tendency of a distribution is an estimate of the "center" of a distribution of values.
Tells us the point about which items have a tendency to cluster.
There are three major types of estimates of central tendency:
- Mean
- Median
- Mode

The Mean or average or arithmetic mean is probably the most commonly used method of
describing central tendency. To compute the mean all you do is add up all the values and
divide by the number of values. For example, the mean or average quiz score is
determined by summing all the scores and dividing by the number of students taking the
exam. For example, consider the test score values:
15, 20, 21, 20, 36, 15, 25, 15
The sum of these 8 values is 167, so the mean is 167/8 = 20.875.
The Median is the score found at the exact middle of the set of values. One way to
compute the median is to list all scores in numerical order, and then locate the score in
the center of the sample. For example, if there are 500 scores in the list, score #250 would
be the median. If we order the 8 scores shown above, we would get:
15, 15, 15, 20, 20, 21, 25, 36
There are 8 scores and score #4 and #5 represent the halfway point. Since both of these
scores are 20, the median is 20. If the two middle scores had different values, you would
have to interpolate to determine the median.
The mode is the most frequently occurring value in the set of scores. To determine the
mode, you might again order the scores as shown above, and then count each one. The
most frequently occurring value is the mode. In our example, the value 15 occurs three
times and is the mode. In some distributions there is more than one modal value. For
instance, in a bimodal distribution there are two values that occur most frequently.
Notice that for the same set of 8 scores we got three different values -- 20.875, 20, and 15
-- for the mean, median and mode respectively. If the distribution is truly normal (i.e.,
bell-shaped), the mean, median and mode are all equal to each other.
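The three measures for the eight test scores above can be checked with Python's standard library:

```python
import statistics

scores = [15, 20, 21, 20, 36, 15, 25, 15]  # the test scores from the text

mean = statistics.mean(scores)      # sum of the values / number of values
median = statistics.median(scores)  # middle value of the ordered scores
mode = statistics.mode(scores)      # most frequently occurring value

print(mean, median, mode)  # 20.875 20.0 15
```

The results match the text: a mean of 20.875, a median of 20 and a mode of 15.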
Dispersion refers to the spread of the values around the central tendency. There are two
common measures of dispersion, the range and the standard deviation.

The range is simply the highest value minus the lowest value. In our example
distribution, the high value is 36 and the low is 15, so the range is 36 - 15 = 21.
There are two problems with the range as a measure of spread. When calculating the
range you are looking at the two most extreme points in the data, and hence the value
of the range can be unduly influenced by one particularly large or small value,
known as an outlier. The second problem is that the range is only really suitable for
comparing (roughly) equally sized samples as it is more likely that large samples
contain the extreme values of a population.

The Inter-Quartile Range
The inter-quartile range describes the range of the middle half of the data and so is less
prone to the influence of the extreme values.
To calculate the inter-quartile range (IQR) we simply divide the ordered data into
four quarters.
The three values that split the data into these quarters are called the quartiles. The
first quartile (lower quartile, Q1) has 25% of the data below it; the second quartile
(median, Q2) has 50% of the data below it; and the third quartile (upper quartile,
Q3) has 75% of the data below it.
Quartiles are calculated as follows: with n ordered observations, Q1 lies at position
(n + 1)/4, Q2 at position (n + 1)/2 and Q3 at position 3(n + 1)/4.
Just as with the median, these quartiles might not correspond to actual observations.
The inter-quartile range is simply the difference between the upper and lower quartiles,
that is
IQR = Q3 − Q1
The inter-quartile range is useful as it allows us to start to make comparisons between the
ranges of two data sets, without the problems caused by outliers or uneven sample sizes.
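The range and inter-quartile range for the same eight scores can be sketched as follows; `statistics.quantiles` with `method='exclusive'` interpolates at the (n + 1)/4 positions described above:

```python
import statistics

scores = [15, 20, 21, 20, 36, 15, 25, 15]

# Range: highest value minus lowest value.
value_range = max(scores) - min(scores)  # 36 - 15 = 21

# Quartiles via the (n + 1)/4 positional rule ('exclusive' method).
q1, q2, q3 = statistics.quantiles(scores, n=4, method='exclusive')

iqr = q3 - q1  # inter-quartile range
print(value_range, q1, q2, q3, iqr)
```

Note how the outlier value 36 inflates the range to 21, while the IQR, which ignores the extreme quarter on each side, stays much smaller.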


The sample variance is the standard measure of spread used in statistics. It is usually
denoted by s2 and is simply the average of the squared distances of the observations
from the sample mean.
Strictly speaking, the sample variance measures deviation about a value calculated from
the data (the sample mean) and so we use an n − 1 divisor rather than n.
In mathematical notation,
s² = Σ(x − x̄)² / (n − 1)
where x̄ is the sample mean and n is the number of observations.

The standard deviation is a more accurate and detailed estimate of dispersion, because
an outlier can greatly exaggerate the range (as was true in this example, where the single
outlier value of 36 stands apart from the rest of the values). The standard deviation shows
the relation a set of scores has to the mean of the sample.

In the top part of the ratio, the numerator, we see that each score has the mean
subtracted from it, the difference is squared, and the squares are summed. In the bottom
part, we take the number of scores minus 1. The ratio is the variance and the square root
is the standard deviation.
The standard deviation is the square root of the sum of the squared deviations from the
mean divided by the number of scores minus one.
The standard deviation allows us to reach some conclusions about specific scores in our
distribution. Assuming that the distribution of scores is normal or bell-shaped, the
following conclusions can be reached:

- approximately 68% of the scores in the sample fall within one standard deviation
of the mean
- approximately 95% of the scores in the sample fall within two standard deviations
of the mean
- approximately 99% of the scores in the sample fall within three standard
deviations of the mean
For instance, if mean for a given data is 20.875 and the standard deviation is 7.0799, we
can estimate that approximately 95% of the scores will fall in the range of 20.875-
(2*7.0799) to 20.875+(2*7.0799) or between 6.7152 and 35.0348. This kind of
information is a critical stepping stone to enabling us to compare the performance of an
individual on one variable with their performance on another, even when the variables are
measured on entirely different scales.
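The standard deviation of 7.0799 quoted above comes from the same eight test scores, and the two-standard-deviation interval can be verified as follows:

```python
import math
import statistics

scores = [15, 20, 21, 20, 36, 15, 25, 15]
n = len(scores)
mean = statistics.mean(scores)  # 20.875

# Sample variance: sum of squared deviations divided by n - 1.
variance = sum((x - mean) ** 2 for x in scores) / (n - 1)
sd = math.sqrt(variance)  # the standard deviation, about 7.0799

# Roughly 95% of a normal distribution lies within 2 SDs of the mean.
low, high = mean - 2 * sd, mean + 2 * sd
print(round(sd, 4), round(low, 4), round(high, 4))
```

The printed interval reproduces the 6.7152 to 35.0348 range given in the text.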
Measures of skewness (asymmetry)
When the distribution of items in a series happens to be perfectly symmetrical, we then
have the following type of curve for the distribution.

Curve showing no skewness, in which case we have X̄ = M = Z
(where X̄ = mean, M = median and Z = mode).
Such a curve is technically described as a normal curve and the related distribution as a
normal distribution.
Such a curve is a perfectly bell-shaped curve, in which case the value of X̄, M and Z is just
the same and skewness is altogether absent.
But if the curve is distorted (whether on the right side or on the left side), we have
asymmetrical distribution which indicates that there is skewness.
If the curve is distorted on the right side, we have positive skewness but when the curve
is distorted to the left, we have negative skewness.
Skewness is, thus, measure of asymmetry and shows the manner in which the items are
clustered around the average.
In a symmetrical distribution, the items show a perfect balance on either side of the
mode, but in a skewed distribution the balance is thrown to one side.
The amount by which the balance exceeds on one side measures the skewness of the series.
The difference between the mean, median and mode provides an easy way of
expressing skewness in a series.
In case of positive skewness, we have Z < M < X̄.
In case of negative skewness, we have X̄ < M < Z.
Graphic representation is as follows (curves of positively skewed and negatively skewed
distributions). Usually we measure skewness as:

Skewness = X̄ − Z, and its coefficient (j) is worked out as j = (X̄ − Z)/σ.
In case Z is not well defined, then we work out skewness as under:
Skewness = 3(X̄ − M), and its coefficient (j) is worked out as j = 3(X̄ − M)/σ,
where σ is the standard deviation of the series.
The significance of skewness lies in the fact that through it one can study the formation
of series and can have the idea about the shape of the curve, whether normal or otherwise,
when the items of a given series are plotted on graph.
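The median-based coefficient j = 3(X̄ − M)/σ can be computed for the eight test scores used earlier:

```python
import statistics

scores = [15, 20, 21, 20, 36, 15, 25, 15]

mean = statistics.mean(scores)      # X-bar = 20.875
median = statistics.median(scores)  # M = 20
sd = statistics.stdev(scores)       # sigma (sample), about 7.0799

# Coefficient of skewness based on the median.
j = 3 * (mean - median) / sd
print(round(j, 4))
```

Since the mean exceeds the median, j comes out positive, confirming that the single high score of 36 pulls the distribution into positive skewness.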
Kurtosis is the measure of flat-toppedness of a curve.
Kurtosis is the humpedness of the curve and points to the nature of distribution of items
in the middle of a series.
A bell-shaped curve, or the normal curve, is Mesokurtic because it is kurtic in the centre.
If the curve is relatively more peaked than the normal curve, it is called Leptokurtic
If a curve is more flat than the normal curve, it is called Platykurtic.
Knowing the shape of the distribution curve is crucial to the use of statistical method in
research analysis since most methods make specific assumptions about the nature of the
distribution curve.
b) Bivariate and Multivariate Analysis
Whenever we deal with data on two or more variables, we are said to have a bivariate or
multivariate population.

Such situations usually happen when we wish to know the relation of the two and/or
more variables in the data with one another.
There are different methods of determining the relationship between variables, but no
method can tell us for certain that a correlation is indicative of causal relationship.
Thus we have to answer two types of questions in a bivariate or multivariate population, viz.,
Does there exist association or correlation between the two (or more) variables? If
yes, of what degree?
Is there any cause and effect relationship between two variables in case of bivariate
population or between one variable on one side and two or more variables on the
other side in case of multivariate population? If yes, of what degree and in which direction?
The first question can be answered by the use of correlation technique and the second
question by the technique of regression.
There are several methods of applying the two techniques, but the important ones are as follows:
In case of bivariate population:
Correlation can be studied through:
- Cross tabulation
- Scattergram
- Charles Spearman's coefficient of correlation
- Karl Pearson's coefficient of correlation
Cause and effect relationship can be studied through;
- Simple regression analysis
In case of multivariate population
Correlation can be studied through;

- Coefficient of multiple correlation
- Coefficient of partial correlation
Cause and effect relationship can be studied through:
- Multiple regression
Cross tabulation
Is useful when the data is in nominal form.
We classify each variable into two or more categories and then cross classify the
variables in these subcategories.
Begins with the two-way table which indicates whether or not there is an
interrelationship between the variables.
Then we look for relationships between them, which may be symmetrical,
reciprocal or asymmetrical.
- A symmetrical relationship is one in which the two variables vary together, but
we assume that neither variable is due to the other.
- A reciprocal relationship exists when the two variables mutually influence or
reinforce each other.
- An asymmetrical relationship is said to exist if one variable (the independent
variable) is responsible for another variable (the dependent variable).
The strength of a relationship is determined by the pattern of difference between
the values of variables.
- If there are marked percentage differences between the different categories
of the variables, the relationship between them is strong
- If the percentage differences are slight, the relationship is weak.
The statistical significance of a relationship is determined by using appropriate test
of significance.
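A simple two-way cross tabulation of two nominal variables can be sketched as follows; the variables and responses are invented for illustration:

```python
from collections import Counter

# Hypothetical nominal data: gender cross-classified by employment status.
records = [
    ("male", "employed"), ("female", "employed"), ("male", "unemployed"),
    ("female", "employed"), ("male", "employed"), ("female", "unemployed"),
]

# Count each (row category, column category) combination of the two-way table.
table = Counter(records)

for gender in ("male", "female"):
    row = {status: table[(gender, status)]
           for status in ("employed", "unemployed")}
    print(gender, row)
```

Converting each row's counts to percentages, and comparing them across rows, gives the percentage differences used above to judge the strength of the relationship.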
Scattergram
Is a graph on which a researcher plots each case or observation, where each axis
represents the value of one variable.

Is used for variables measured at the interval or ratio level, rarely for ordinal
variables, and never if either of the variables is nominal.
Usually put independent variable on X-axis and dependent on Y-axis.
Can show three aspects of the bivariate relationship for the researcher.
- Form
- Direction
- Precision
Relationship can take three forms
- Independence
- Linear
- Curvilinear
No relationship
Is the easiest to see
Looks like a random scatter with no pattern, or a straight line that is exactly parallel
to the horizontal or vertical axis.
Linear Relationship
Means that a straight line can be visualized in the middle of a maze of cases
running from one corner to another.
Curvilinear Relationship
Means that the center of a maze of cases would form a U curve, right side up or
upside down or an S curve.
Linear relationships can have a positive or negative direction.
The plot of a positive relationship looks like a diagonal line from the
lower left to the upper right. Higher values on X-axis tend to go with
higher values on Y, and vice versa.

A negative relationship looks like a line from the upper left to the lower
right. It means that higher values on one variable go with lower values
on the other.
Bivariate relationships differ in their degree of precision.
Precision is the amount of spread in the points on the graph.
A higher level of precision occurs when the points hug the line that
summarizes the relationship.
Spearman's Rank Order Correlation Coefficient (Rank Correlation) or Rho
Is the oldest of the frequently used measures of ordinal association.
Rho (ρ) is the measure of the extent of agreement or disagreement between two sets
of ranks.
Is a non-parametric measure and so it does not require the assumption of a bivariate
normal distribution.
Its values ranges between -1.0(perfect negative association) and +1.0(perfect
positive association)
Its underlying logic centers on the difference between ranks.
The formula is:
ρ = 1 − (6 Σ D²) / (n(n² − 1))
where D = the difference between the ranks assigned to X and Y for an object, and
n = the number of observations.
Requirements for using Rho
The following conditions should be satisfied for using Rho
A straight line correlation should exist between the variables.
Both X and Y variables must be ranked or ordered.
Sample members must have been taken at random from a larger population.
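The rank-difference formula ρ = 1 − 6ΣD² / (n(n² − 1)) can be sketched as follows; the two sets of ranks are invented for illustration:

```python
# Ranks assigned to the same five objects on variables X and Y (hypothetical).
x_ranks = [1, 2, 3, 4, 5]
y_ranks = [2, 1, 4, 3, 5]

n = len(x_ranks)
# D is the difference between the ranks given to each object on X and Y.
sum_d2 = sum((x - y) ** 2 for x, y in zip(x_ranks, y_ranks))

rho = 1 - (6 * sum_d2) / (n * (n ** 2 - 1))
print(rho)
```

With these ranks the disagreements are small, so ρ comes out close to +1.0, indicating strong positive agreement between the two rankings.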

Research Example: A researcher in a study of the two-factor theory of job
satisfaction used Rho.
Ranks were given to perceived needs for supervisors and clerks on each
job factor according to the magnitude of mean scores, and Rho was
calculated. The calculated value was significant (ρ = 0.86, p < 0.01),
indicating similarity between the two groups in their perceived
need importance.
Karl Pearson's Coefficient of Correlation (or Simple Correlation)
Is the most widely used method of measuring the degree of relationship between
two variables.
Expresses both the strength and direction of linear correlation.
Also known as the product moment correlation coefficient.
Denoted by r, which lies between −1 and +1.
- A positive value of r indicates positive correlation between the two
variables (i.e., changes in both variables take place in the same
direction).
- A negative value of r indicates negative correlation (i.e., changes in the
two variables take place in opposite directions).
- A zero value of r indicates that there is no association between the
two variables.
Assumes the following:
- That there is linear relationship between the two variables.
- That the two variables are causally related which means that one of the
variables is independent and the other one is dependent.
- A large number of independent causes are operating in both variables
so as to produce a normal distribution.
Mathematically it can be expressed as:
r = Σ(X − X̄)(Y − Ȳ) / √[Σ(X − X̄)² × Σ(Y − Ȳ)²]
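Pearson's product-moment formula can be sketched as follows; the paired observations (hours studied against marks) are invented for illustration:

```python
import math

# Hypothetical paired observations: hours studied (x) and marks obtained (y).
x = [2, 4, 6, 8, 10]
y = [10, 14, 22, 26, 33]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# r = sum of products of deviations / sqrt(product of the two sums of squares)
num = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
den = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) *
                sum((yi - mean_y) ** 2 for yi in y))

r = num / den
print(round(r, 4))
```

For these data r is close to +1, reflecting the near-perfect linear relationship between the two invented variables.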

Testing the Significance of a Correlation
Once you've computed a correlation, you can determine the probability that the observed
correlation occurred by chance. That is, you can conduct a significance test. Most often
you are interested in determining the probability that the correlation is a real one and not
a chance occurrence. In this case, you are testing the mutually exclusive hypotheses:
Null Hypothesis: r = 0
Alternative Hypothesis: r ≠ 0
As in all hypotheses testing,
We need to first determine the significance level. For example, we use the
common significance level of alpha = .05. This means that we are conducting a
test where the odds that the correlation is a chance occurrence are no more than 5
out of 100.
The degrees of freedom or df is equal to N-2.
Finally, deciding the type of test to be applied (two-tailed test or one-tailed test) is
to be done.
Accept or reject the null hypothesis.
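The steps above are commonly carried out with the t statistic t = r√(n − 2) / √(1 − r²), compared against the critical t value for the chosen alpha and n − 2 degrees of freedom; the r and n values here are invented:

```python
import math

r = 0.86   # observed correlation (hypothetical)
n = 12     # sample size (hypothetical)

df = n - 2  # degrees of freedom
t = r * math.sqrt(df) / math.sqrt(1 - r ** 2)

# Two-tailed critical value for alpha = .05 and df = 10 (from a t table).
t_critical = 2.228
reject_null = abs(t) > t_critical
print(round(t, 3), reject_null)
```

Here the computed t far exceeds the critical value, so the null hypothesis r = 0 is rejected and the correlation is judged statistically significant.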
Other Correlations
There is a wide variety of other types of correlations for other circumstances. For example:
If you have two ordinal variables, you could use the Kendall rank order
correlation (tau).
When one measure is a continuous interval level one and the other is dichotomous
(i.e., two-category) you can use the Point-Biserial Correlation.
Partial Coefficient of correlation
Partial correlation measures separately the relationship between two variables in such a
way that the effects of other related variables are eliminated.
In partial correlation analysis, we aim at measuring the relations between a dependent
variable and a particular independent variable by holding all other variables constant.
Each partial coefficient of correlation measures the effect of its independent variable on
the dependent variable.
The partial correlation shows the relationship between two variables, excluding the effect
of other variables. In a way, the partial correlation is a special case of multiple correlation.
The difference between simple correlation and partial correlation is that the simple
correlation does not include the effect of other variables as they are completely ignored.
There is almost an implicit assumption that the variables not included do not have any
impact on the dependent variable. But such is not the case in the partial correlation,
where the impact of other independent variables is held constant.
N.B. In multiple correlation, three or more variables are studied simultaneously. But in
partial correlation we consider only two variables influencing each other while the effect
of other variables is held constant.
For example, suppose we have a problem comprising three variables X1, X2, and Y. X1
is the number of hours studied, X2 is I.Q. and Y is the number of marks obtained in the
examination. In a multiple correlation, we will study the relationship between the marks
obtained(Y) and the two variables, number of hours studied(X1) and I.Q.(X2). In

contrast, when we study the relationship between X1 and Y keeping an average I.Q. as
constant, it is said to be a study involving partial correlation.
If we denote r12.3 as the coefficient of partial correlation between X1 and X2, holding X3
constant, then:

r12.3 = (r12 − r13·r23) / √[(1 − r13²)(1 − r23²)]

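Assuming the three pairwise correlation coefficients are already known, the partial coefficient can be sketched as follows (the input values are hypothetical):

```python
import math

def partial_r(r12, r13, r23):
    """r12.3: correlation between X1 and X2 with X3 held constant."""
    return (r12 - r13 * r23) / math.sqrt((1 - r13 ** 2) * (1 - r23 ** 2))

# hypothetical pairwise correlations
r12_3 = partial_r(0.8, 0.5, 0.4)
```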
Multiple Correlation Analysis
Unlike the partial correlation, multiple correlation is based on three or more variables
without excluding the effect of any one of them. It is denoted by R.
In case of three variables X1, X2, and X3, the multiple correlation coefficients will be:
R1.23=Multiple correlation coefficient with X1 as a dependent variable while X2 and X3
as independent variables.
R2.13=Multiple correlation coefficient with X2 as a dependent variable while X1 and X3
as independent variables.
R3.12= Multiple correlation coefficient with X3 as a dependent variable while X1 and
X2 as independent variables.
It may be recalled that the concepts of dependent and independent variables were non-
existent in case of simple bivariate correlation. In contrast, the concepts of dependent and
independent variables are introduced here in multiple correlation.
Symbolically, the multiple correlation coefficient can be shown as follows:

R1.23 = √[(r12² + r13² − 2·r12·r13·r23) / (1 − r23²)]
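A sketch of the same coefficient in Python, again assuming the pairwise correlations are given (the values are hypothetical):

```python
import math

def multiple_R(r12, r13, r23):
    """R1.23: multiple correlation of X1 on X2 and X3 taken together."""
    return math.sqrt((r12 ** 2 + r13 ** 2 - 2 * r12 * r13 * r23)
                     / (1 - r23 ** 2))

R1_23 = multiple_R(0.8, 0.5, 0.4)   # hypothetical pairwise correlations
```

Note that R1.23 is at least as large as either simple correlation of X1 with a single predictor.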
Simple Regression Analysis

Regression is the determination of a statistical relationship between two or more variables.
Regression analysis is a mathematical measure of the average relationship between two
or more variables in terms of the original units of the data.
Regression analysis is a statistical method to deal with the formulation of a mathematical
model depicting the relationship amongst variables, which can be used for the purpose of
predicting the values of the dependent variable, given the values of the independent variable.
In simple regression, we have only two variables: one variable (defined as independent)
is the cause of the behaviour of another one (defined as the dependent variable).
Regression can only interpret what exists physically i.e., there must be a physical way in
which independent variable X can affect dependent variable Y.
The basic relationship between X and Y is given by:

Y = a + bX

This equation is known as the regression equation of Y on X (it also represents the
regression line of Y on X when drawn on a graph), which means that each unit change in
X produces a change of b in Y, which is positive for direct and negative for inverse
relationships.
Regression coefficients
As we saw the regression of Y on X, it is possible that we may think of X as dependent
variable and Y as an independent one.
In that case, we may have to use X = a + bY as an estimating equation.
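The least squares estimates of a and b for the regression of Y on X can be sketched as follows (the data are hypothetical):

```python
def fit_line(x, y):
    """Least squares estimates of a and b in the regression equation Y = a + bX."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    a = my - b * mx
    return a, b

# hypothetical data: each unit change in X changes the estimate of Y by b
a, b = fit_line([2, 4, 6, 8, 10], [50, 60, 65, 80, 90])
```

Swapping the two lists gives the estimating equation of X on Y instead.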
Regression coefficient and Correlation coefficient

If all the points on the scatter diagram fall on the regression line , the correlation between
the two variables involved is perfect.
This is as much true about the regression line of Y on X as about the line of X on Y.
This means that if the correlation is perfect, the two regression lines coincide,
since one and only one straight line can pass through the same set of
points.
If, however, the two lines diverge and intersect each other the correlation is not perfect.
Properties of Regression Coefficient
1. The coefficient of correlation is the geometric mean of the two regression
coefficients, i.e., r = ±√(byx · bxy).
2. As the coefficient of correlation cannot exceed 1, in case one of the regression
coefficients is greater than 1, then the other must be less than 1.
3. Both the regression coefficients will have the same sign, either positive or
negative. If one regression coefficient is positive, then the other will also be
positive.
4. The coefficient of correlation and the regression coefficient will have the same
sign. If the regression coefficients are positive, then the correlation will also be
positive and vice versa.
5. The average of the two regression coefficients will always be greater than or
equal to the correlation coefficient.
6. Regression coefficients are not affected by a change of origin, but they are
affected by a change of scale.
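Properties 1 and 5 can be verified numerically for any data set; the sketch below uses hypothetical data:

```python
import math

x = [2, 4, 6, 8, 10]            # hypothetical data
y = [50, 60, 65, 80, 90]
n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
sxx = sum((u - mx) ** 2 for u in x)
syy = sum((v - my) ** 2 for v in y)

b_yx = sxy / sxx                # regression coefficient of Y on X
b_xy = sxy / syy                # regression coefficient of X on Y
r = sxy / math.sqrt(sxx * syy)  # correlation coefficient

# property 1: |r| is the geometric mean of the two regression coefficients
assert abs(abs(r) - math.sqrt(b_yx * b_xy)) < 1e-9
# property 5: their average is at least the correlation coefficient
assert (b_yx + b_xy) / 2 >= abs(r)
```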
Multiple regression coefficients
When there are two or more than two independent variables, the analysis concerning
the cause and effect relationship is known as multiple regression analysis, and the equation
describing such a relationship as the multiple regression equation.
The multiple regression equation assumes the form:

Y = a + b1X1 + b2X2

where X1 and X2 are two independent variables, Y is the dependent variable,
and a, b1 and b2 are constants.
In multiple regression analysis, the regression coefficients (viz., b1, b2) become less
reliable as the degree of correlation between the independent variables (viz., X1, X2)
increases.
If there is a high degree of correlation between independent variables, we have a problem
of what is commonly described as the problem of multicollinearity.
In such a situation we should use only one set of the independent variables to make our
estimate. In fact, adding a second variable, say X2, that is correlated with the first variable,
say X1, distorts the values of the regression coefficients.
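For two predictors, the estimates can be obtained from the normal equations. In the sketch below, the denominator shrinks toward zero as X1 and X2 become highly correlated, which is how multicollinearity destabilizes the coefficients; the data are hypothetical:

```python
def two_predictor_fit(x1, x2, y):
    """Normal-equation estimates of a, b1, b2 in Y = a + b1*X1 + b2*X2."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((v - m1) ** 2 for v in x1)
    s22 = sum((v - m2) ** 2 for v in x2)
    s12 = sum((u - m1) * (v - m2) for u, v in zip(x1, x2))
    s1y = sum((u - m1) * (v - my) for u, v in zip(x1, y))
    s2y = sum((u - m2) * (v - my) for u, v in zip(x2, y))
    den = s11 * s22 - s12 ** 2   # near zero when X1 and X2 are collinear
    b1 = (s1y * s22 - s2y * s12) / den
    b2 = (s2y * s11 - s1y * s12) / den
    return my - b1 * m1 - b2 * m2, b1, b2

# hypothetical data: hours studied, I.Q., marks obtained
a, b1, b2 = two_predictor_fit([2, 4, 6, 8, 10],
                              [100, 95, 110, 105, 120],
                              [50, 60, 65, 80, 90])
```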
Measures of Association in Case of Attributes
When data are collected on the basis of some attribute or attributes, we have statistics
commonly termed statistics of attributes.
In such a situation our interest may remain in knowing whether the attributes are
associated with each other or not.
The (two) attributes are associated if they appear together in a greater number of cases
than is to be expected if they were independent, and not simply on the basis that they are
appearing together in a number of cases, as is done in ordinary life.
The association may be positive or negative (negative association is also known as
disassociation).
If class frequency of AB, symbolically written as (AB), is greater than the expectation of
AB being together if they are independent, then we say the two attributes are positively
associated; but if the class frequency of AB is less than this expectation, the two
attributes are said to be negatively associated.

In case the class frequency of AB is equal to expectation, the two attributes are
considered as independent i.e., are said to have no association.
If (AB) > (A)(B)/N, then A and B are positively associated.
If (AB) < (A)(B)/N, then A and B are negatively associated.
If (AB) = (A)(B)/N, then A and B are independent, i.e., have no association.
Where (AB) = the frequency of class AB, and (A)(B)/N = the expectation of AB if A and
B are independent, N being the number of items.

In order to find out the degree or intensity of association between two or more sets of
attributes, we should work out the coefficient of association. Yule's coefficient of
association is the most popular and is often used for this purpose.
It can be mentioned as under:

Q(AB) = [(AB)(ab) − (Ab)(aB)] / [(AB)(ab) + (Ab)(aB)]

Where:
Q(AB) = Yule's coefficient of association between attributes A and B
(AB)=Frequency of class AB in which A and B are present.
(Ab)=Frequency of class Ab in which A is present and B is absent
(aB)=Frequency of class aB in which A is absent and B is present.
(ab) = Frequency of class ab in which both A and B are absent.

The value of this coefficient will be in between +1 and -1.
If the attributes are completely associated (perfect positive association) with each
other, the coefficient will be +1.
If the attributes are completely disassociated (perfect negative association) with
each other, the coefficient will be -1.
If the attributes are completely independent of each other, the coefficient will be
0.
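A sketch of Yule's coefficient in Python; the 2x2 class frequencies below are hypothetical:

```python
def yules_q(f_AB, f_Ab, f_aB, f_ab):
    """Yule's coefficient of association from the four class frequencies."""
    return (f_AB * f_ab - f_Ab * f_aB) / (f_AB * f_ab + f_Ab * f_aB)

q = yules_q(40, 10, 20, 30)     # hypothetical frequencies: positive association
```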
In order to judge the significance of association between two attributes, we make use of
the Chi-square test by finding the value of Chi-square (χ²) and using the Chi-square
distribution. The value of χ² can be worked out as under:

χ² = Σ (Oij − Eij)² / Eij

Where Oij = observed frequencies and Eij = expected frequencies.
Chi Square Test
Is an important test amongst the several tests of significance.
Used in the context of sampling analysis for comparing a variance to a theoretical
variance.
Can be used as a non-parametric test to determine if categorical data show
dependency or if the two classifications are independent.
Can be used to make a comparison between theoretical populations and actual
data when categories are used.
Is a technique through the use of which it is possible for researchers to:
- Test the goodness of fit
- Test the significance of association between two attributes
- Test the homogeneity of population variance.

Chi-square as a Test for Comparing Variance
The chi-square test is often used to judge the significance of population variance, i.e., we
can use the test to judge if a random sample has been drawn from a normal population
with mean (μ) and with a specified variance (σp²).
The test is based on the χ² distribution.
The χ² distribution is not symmetrical and all the values are positive.
For making use of this distribution, one is required to know the degrees of freedom, since
for different degrees of freedom we have different curves.
The smaller the number of degrees of freedom, the more skewed the distribution is.
In brief, when we have to use chi-square as a test of population variance, we have to
work out the value of χ² to test the null hypothesis (viz., H0: σs² = σp²) as under:

χ² = (σs² / σp²) × (n − 1)
Then by comparing the calculated value with the table value of χ² for (n − 1) degrees of
freedom at a given level of significance, we may either accept or reject the null
hypothesis:
- If the calculated value of χ² is less than the table value, the null hypothesis is
accepted.
- If the calculated value of χ² is equal to or greater than the table value, the
hypothesis is rejected.
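A numeric sketch of the variance test; the sample and the hypothesized variance are hypothetical, and 14.067 is the standard upper 5% table value of χ² with 7 degrees of freedom:

```python
sample = [12, 14, 9, 11, 13, 10, 12, 15]   # hypothetical observations
n = len(sample)
mean = sum(sample) / n
s2 = sum((x - mean) ** 2 for x in sample) / (n - 1)  # sample variance
sigma2_p = 2.5                  # hypothesized population variance
chi2 = s2 / sigma2_p * (n - 1)  # test statistic with n - 1 = 7 df
chi2_table = 14.067             # upper-tail table value, alpha = .05, df = 7
decision = "accept H0" if chi2 < chi2_table else "reject H0"
```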
Chi Square As a Non Parametric Test
Chi square is an important non-parametric test and as such no rigid assumptions are
necessary in respect of the type of population.

We require the degrees of freedom (implicitly, of course, the size of the sample) for using
this test.
As a non-parametric test, chi-square can be used:
- As a test of goodness of fit
- As a test of independence.
As a test of goodness of fit, the χ² test enables us to see how well the assumed
theoretical distribution (such as the Binomial, Poisson or Normal distribution) fits the
observed data.
When some theoretical distribution is fitted to the given data, we are always interested in
knowing how well this distribution fits the observed data.
- If the calculated value of χ² is less than the table value at a certain level of
significance, the fit is considered to be a good one, which means that the
divergence between the observed and expected frequencies is attributable to
fluctuations of sampling.
- If the calculated value of χ² is greater than its table value, the fit is not considered
to be a good one.
As a test of independence, the χ² test enables us to explain whether or not two attributes
are associated. To do so, we first calculate the expected frequencies and then work out
the value of χ².
- If the calculated value of χ² is less than the table value at a certain level of
significance for the given degrees of freedom, we conclude that the null hypothesis
stands, which means that the two attributes are independent, i.e., not associated.
- If the calculated value of χ² is greater than its table value, the inference would be
that the null hypothesis does not hold good, which means that the two attributes are
associated and the association is not due to some chance factor but exists in
reality.

N.B. 1) χ² is not a measure of the degree of relationship or of the form of relationship
between two attributes; it is simply a technique of judging the significance of such
association or relationship between two attributes.
2) In order that we may apply the chi-square test either as a test of goodness of fit
or as a test to judge the significance of association between attributes, it is necessary
that the observed as well as the theoretical or expected frequencies be grouped in the
same way, and the theoretical distribution must be adjusted to give the same total
frequency as found in the observed distribution.
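As a goodness-of-fit sketch, suppose hypothetical counts in five categories are tested against a uniform theoretical distribution with the same total; 9.488 is the standard 5% table value of χ² with 4 degrees of freedom:

```python
observed = [18, 22, 20, 25, 15]   # hypothetical observed frequencies
expected = [20] * 5               # uniform theoretical distribution, same total
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
chi2_table = 9.488                # alpha = .05, df = 5 - 1 = 4
fit = "good fit" if chi2 < chi2_table else "poor fit"
```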
Conditions for the Application of the χ² Test
1. Observation recorded and used are collected on a random basis.
2. All the items in the sample must be independent.
3. No group should contain very few items, say less than 10. In case where the
frequencies are less than 10, regrouping is done by combining the frequencies of
adjoining groups so that the new frequencies become greater than 10.
4. The overall number of items must also be reasonably large. It should normally be
at least 50, howsoever small the number of groups may be.
5. The constraints must be linear. Constraints which involve linear equations in the
cell frequencies of a contingency table are known as linear constraints.
Important Characteristics of the χ² Test
1. The test (as a non-parametric test) is based on frequencies and not on the
parameters like mean and standard deviation.
2. The test is used for testing the hypothesis and is not useful for estimation.
3. This test possesses the additive property.
4. This test can also be applied to a complex contingency table with several
classes and as such is very useful in research work.
5. This test is an important non-parametric test, as no rigid assumptions are
necessary in regard to the type of population, no parameter values are needed,
and relatively few mathematical details are involved.

Steps involved in Applying Chi-Square Test
The various steps involved are as follows:
1. Calculate the expected frequencies on the basis of given hypothesis or on the
basis of null hypothesis. Usually in case of 2x2 or any contingency table, the
expected frequency for any given cell is worked out as under:
Expected frequency of any cell =
(Row total for the row of that cell × Column total for the column of that cell) / Grand total
2. Obtain the difference between observed and expected frequencies and find out
the squares of such differences, i.e., calculate (Oij − Eij)².
3. Divide the quantity (Oij − Eij)² obtained as stated above by the corresponding
expected frequency to get (Oij − Eij)²/Eij, and this should be done for all the cell
frequencies or the group frequencies.
4. Find the summation of the (Oij − Eij)²/Eij values, i.e., Σ (Oij − Eij)²/Eij. This
is the required χ² value.
5. Compare the calculated χ² value with its table value at the appropriate degrees of
freedom ((n − 1) for n groups, or (r − 1)(c − 1) for an r × c contingency table) and
draw the inference.
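Steps 1-4 can be sketched for a contingency table as follows. The 2x2 frequencies are hypothetical; for step 5, 3.841 is the standard 5% table value of χ² with (2 − 1)(2 − 1) = 1 degree of freedom:

```python
def chi_square(table):
    """Chi-square statistic for a contingency table, following steps 1-4."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / grand  # step 1
            chi2 += (obs - exp) ** 2 / exp               # steps 2-4
    return chi2

chi2 = chi_square([[30, 20], [10, 40]])                      # hypothetical 2x2 table
decision = "associated" if chi2 > 3.841 else "independent"   # step 5
```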
Interpretation of the Research Findings
Meaning of Interpretation
Interpretation refers to the task of drawing inferences from the collected facts after an
analytical and/or experimental study.
Interpretation is concerned with relationships within the collected data, partially
overlapping analysis. It also extends beyond the data of the study to include the results of
other research, theory and hypothesis.
Interpretation is the device through which the factors that seem to explain what has been
observed by researcher in the course of the study can be better understood and it also
provides a theoretical conception which can serve as a guide for further researches.
In general, interpretation has two major aspects:
1. The efforts to establish continuity in research through linking the results of a
given study with those of another

2. The establishment of some explanatory concepts.
Why interpretation?
Interpretation is essential for the simple reason that the usefulness and utility of research
findings lie in proper interpretation. It is considered a basic component of the research
process for the following reasons.
1. It is through interpretation that the researcher can well understand the abstract
principle that works beneath his findings. Through this he can link up his findings
with those of other studies, having the same abstract principle, and thereby can
predict about the concrete world of events. Fresh inquiries can test these
predictions later on. This way the continuity in research can be maintained.
2. Interpretation leads to the establishment of explanatory concepts that can serve as
a guide for future research studies; it opens new avenues for intellectual adventure
and stimulates the quest for more knowledge.
3. It is only through interpretation that the researcher can appreciate why his
findings are what they are, and can make others understand the real significance
of his research findings.
4. The interpretation of the findings of an exploratory research study often results in
hypotheses for experimental research, and as such interpretation is involved in the
transition from exploratory to experimental research.
Techniques of Interpretation
The task of interpretation is not an easy job; rather it requires great skill and dexterity
on the part of the researcher.
Interpretation is an art that one learns through practice and experience.
The techniques of interpretation often involve the following steps:
1. The researcher must give reasonable explanations of the relations which he has
found, interpret the lines of relationship in terms of the underlying processes,
and try to find out the thread of uniformity that lies under the surface layer
of his diversified research findings. In fact, this is the technique of how
generalization should be done and concepts formulated.
2. Extraneous information, if collected during the study, must be considered while
interpreting the final results of research study, for it may be a key factor in
understanding the problem under consideration.

Chapter 9: Scaling
General Issues in Scaling
Scaling is the branch of measurement that involves the construction of an instrument that
associates qualitative constructs with quantitative metric units.
Scaling describes the procedure of assigning numbers to various degrees of opinion,
attitudes and other categories.
Scale is a continuum, consisting of the highest point and the lowest point along with
several intermediate points between these two extreme points.
Scaling evolved out of efforts to measure "unmeasurable" constructs like authoritarianism
and self esteem.
Scaling attempts to do one of the most difficult of research tasks -- measure abstract
concepts.
Scaling is the assignment of objects to numbers according to a rule. In most scaling, the
objects are text statements, usually statements of attitude or belief. To scale these
statements, we have to assign numbers to them. Usually, we would like the result to be on
at least an interval scale.
Scaling is how we get numbers that can be meaningfully assigned to objects -- it's a set of
procedures for doing so.
Purposes of Scaling
Why do we do scaling? Why not just create text statements or questions and use response
formats to collect the answers?
First, sometimes we do scaling to test a hypothesis.
We might want to know whether the construct or concept is a single dimensional or
multidimensional one.
Sometimes, we do scaling as part of exploratory research. We want to know what
dimensions underlie a set of ratings.
But probably the most common reason for doing scaling is for scoring purposes. When a
participant gives their responses to a set of items, we often would like to assign a single
number that represents that person's overall attitude or belief.

Scales are common in situations in which a researcher wants to measure how an
individual feels or thinks about something. Some call this the hardness or potency of
attitudes.
Scales help in conceptualization and operationalization process.
Scale Classification Bases
The number assigning procedure or the scaling procedures may be broadly classified on
one or more of the following bases:
a) Subject Orientation
Scale may be designed to measure characteristics of the respondent who completes it or
to judge the stimulus object which is presented to the respondents.
With respect to measuring the characteristics of the respondents, we presume that the
stimuli presented are sufficiently homogeneous so that the between-stimuli variation is
small compared to the variation among respondents.
With respect to judging the stimulus object, we ask the respondents to judge
some specific object in terms of one or more dimensions and we presume that the
between-respondents variation will be small compared to the variation among the
different stimuli presented to the respondents for judging.
b) Response form
Scale may be classified as categorical and comparative.
I. Categorical Scales
- Are also known as rating scales.
- Used when a respondent scores some object without direct reference to other
objects.
- Involves qualitative description of a limited number of aspects of a thing or
traits of a person or an object.
- Judges properties of objects without reference to other similar objects.
- Involves such forms as like-dislike; above average, average, below average;
or other classifications with more categories such as like very much, like
somewhat, neutral, dislike somewhat, dislike very much.
- Usually 3-7 point scales.

- Can be either
Graphical rating scale
Itemized scale
- The Graphic Rating Scale
Is commonly used in practice.
Various points are usually put along a line to form a continuum.
A rater indicates his rating by marking at the appropriate point on a
line that runs from one extreme to the other.
Scale points with brief descriptions may be indicated along the line to
assist the rater in performing his job.
May cause respondent boredom, as they have to consider all points along the line.
The meaning of the terms used is sometimes difficult to
distinguish, e.g., very much and somewhat.
- Itemized Rating Scale
Is also known as numerical scale.
Presents a series of statements from which a respondent selects one as
best reflecting his evaluation.
Statements are ordered progressively in terms of more or less of some property.
Provides more information and meaning to the rater, thereby
increasing reliability.
It is relatively difficult to develop
The statements may not say exactly what the respondent would
like to express.

Types of Scales
[Diagram: classification of scale types by response format (rating vs rank order),
degree of preference, scale properties, dimensionality (uni-dimensional vs
multi-dimensional), and construction technique (item analysis, factor scales).]
Advantages of Rating Scales
The results obtained from their use compare favorably with alternative methods.
Requires less time.
Are interesting to use
Have a wide range of applications.
May be used with a large number of properties or variables.
Limitations of the Rating Scale
The results are highly dependent on the respondent's good judgement, as the
scale is otherwise liable to errors.
It is liable to three types of respondent error: the error of leniency, the
error of central tendency and the error of the halo effect.
II. Comparative Scales
- Are also known as ranking scales.
- The respondent is asked to compare two or more objects.
- In essence the comparison is relative to one another.
- Relative judgments against other similar objects are made.
- The respondents directly compare two or more objects and make choices among
them.
- Uses two approaches of ranking scales:
- Method of paired comparison
- Method of rank order
Method of Paired comparison
The respondent can express his attitudes by making a choice between two objects at a time.
When the number of stimuli or objects to be judged is a big figure, there is a risk of
respondents giving ill considered answers or they may even refuse to answer.
Provides ordinal data, but can be converted to an interval scale by using a technique
developed by L.L. Thurstone, which involves the conversion of frequencies of
preferences into a table of proportions which are then transformed into a Z matrix by
referring to the table of areas under the normal curve.
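The Thurstone conversion maps each preference proportion to a unit-normal deviate (a Z value). A minimal sketch, assuming the proportion preferring one object over another is already tabulated (the 0.84 below is hypothetical):

```python
from statistics import NormalDist

p_prefer = 0.84                       # hypothetical proportion preferring A over B
z = NormalDist().inv_cdf(p_prefer)    # proportion -> Z value under the normal curve
```

A full Thurstone scaling would apply this transformation to every cell of the table of proportions to build the Z matrix.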

Method of Rank Order
The respondents are asked to rank their choices.
Is easier and faster than the method of paired comparison.
There is no problem of transitivity unlike the paired comparison.
A complete ranking at a time is not required, as the respondents may rank only
the first few choices, say the first four.
To secure a simple ranking of all items involved, we simply total the rank values
received by each item.
c) Degree of subjectivity
The scale data may be based on whether we measure subjective personal preferences (in
case when the respondent is asked to choose person he favors or which solution he would
like to see employed) or simply make non-preference judgments (where the respondent is
asked to judge which person is more effective in some aspects or which solution will take
fewer resources without reflecting any personal preferences).
d) Scale properties
Scales, based on their properties, can be classified as:
- Nominal
- Ordinal
- Interval
- Ratio
Nominal scales merely classify without indicating order, distance or unique origin.
Ordinal scales indicate magnitude relationships of more than or less than, but indicate
no distance or unique origin.
Interval scales have both order and distance values but no unique origin.
Ratio scales possess all these features.
e) Dimensions
A scale can have any number of dimensions in it. Most scales that we develop have only a
few dimensions. What's a dimension? Think of a dimension as a number line. If we want to
measure a construct, we have to decide whether the construct can be measured well with

one number line or whether it may need more. For instance, height is a concept that is
unidimensional or one-dimensional. We can measure the concept of height very well with
only a single number line (e.g., a ruler). Weight is also unidimensional -- we can measure it
with a scale. Thirst might also be considered a unidimensional concept -- you are either
more or less thirsty at any given time. It's easy to see that height and weight are
unidimensional. But what about a concept like self esteem? If you think you can measure a
person's self esteem well with a single ruler that goes from low to high, then you probably
have a unidimensional construct.
What would a two-dimensional concept be? Many models of intelligence or achievement
postulate two major dimensions -- mathematical and verbal ability. In this type of two-
dimensional model, a person can be said to possess two types of achievement. Some people
will be high in verbal skills and lower in math. For others, it will be the reverse. Such
concepts are called multidimensional concepts.
In general, therefore, Scales can be classified as:
- Unidimensional
- Multidimensional
Unidimensional scales measure only one attribute of the respondent or object
Multidimensional scales measure more than one attribute of the respondent or object.
Unidimensional scales
The unidimensional scaling methods were developed in the first half of the twentieth
century and are generally named after their inventor. There are three major types of
unidimensional scaling methods.
They are similar in that they each measure the concept of interest on a number line. But
they differ considerably in how they arrive at scale values for different items.
There are three types of unidimensional scaling methods here:
- Thurstone or Equal-Appearing Interval Scaling
- Likert or "Summative" Scaling
- Guttman or "Cumulative" Scaling
In the late 1950s and early 1960s, measurement theorists developed more advanced
techniques for creating multidimensional scales. Although these techniques are not
considered here, they are powerful techniques in research.

f) Scale construction techniques
There are five main techniques by which scales can be developed:
I. Arbitrary approach
Is developed on an ad hoc basis and designed largely through the researcher's own
subjective selection of items.
Is the most widely used approach.
Presumed to measure the concepts for which it has been developed.
Has the following merits
Can be developed very easily and quickly.
Less expensive to develop.
Can be developed to be highly specific and adequate for the situation at hand.
Has the following limitations
There is no objective evidence that the scale measures the
concepts for which it has been developed.
II. Consensus approach
A panel of judges evaluates the items chosen for inclusion in the instrument in terms
of whether they are relevant to the topic.
III. Item analysis approach
A number of individual items are developed into a test which is given to a group of
respondents. After administering the test, the total scores are calculated for everyone.
Individual items are then analyzed to determine which item discriminate between
person or objects with high total scores and those with low.
IV. Cumulative scales
Chosen on the basis of their conforming to some ranking of items with ascending and
descending discriminating power.
V. Factor scales
Constructed on the basis of intercorrelations of items which indicate that a common
factor accounts for the relationship between items.


Measurement of Attitude
Attitude is the mental state of an individual which makes him act or respond for or
against objects, situations, etc., with which his/her vested feelings or interests, likings,
desires, and so on, are directly or indirectly linked or associated.
During the course of development, the person acquires tendencies to respond to objects.
These learned cognitive mechanisms are called attitudes.
Attitudes cannot be observed directly because such psychological variables are dormant or
latent. Being covert, attitudes are difficult to measure.
An information form that attempts to measure the attitude or belief of an individual is
known as an opinionnaire.
In the social sciences, while measuring attitudes of people we generally follow the
technique of preparing an opinionnaire or attitude scale in such a way that the score of an
individual's responses assigns him a place on the scale.
Psychologists and sociologists have developed several scale construction techniques for the
purpose of attitude measurement.
Some of the important approaches, along with the corresponding scales developed under
each approach are as follows:
Scale construction approach    Name of the scale
Arbitrary approach             Arbitrary scales
Consensus approach             Differential scales, e.g., Thurstone Differential Scale
Item analysis approach         Summated scales, e.g., Likert Scale
Cumulative approach            Cumulative scales, e.g., Guttman's Scale
Factor analysis approach       Factor scales, e.g., Osgood's Semantic
                               Differential, Multidimensional Scaling, etc.
Commonly used Attitude Measurement Scales
Commonly used attitude scales are of three types:
1. Differential Scales (Thurstone Scaling)
Thurstone was one of the first and most productive scaling theorists.

He developed an approach to scale development using consensus, in which a panel of
judges evaluates the items in terms of whether they are relevant to the topic area and
unambiguous in implication.
He actually invented three different methods for developing a unidimensional scale:
- Method of equal-appearing intervals
- Method of successive intervals
- Method of paired comparisons
The three methods differed in how the scale values for items were constructed, but in all
three cases, the resulting scale was rated the same way by respondents.
The Method of Equal-Appearing Intervals
The following steps are necessary in constructing a scale by the method of equal-appearing
intervals, as developed by L.L. Thurstone:
- Developing the Focus
- Generating Potential Scale Items
- Rating the Scale Items
- Computing Scale Score Values for Each Item
- Selecting the Final Scale Items
- Administering the Scale.
Developing the Focus: The Method of Equal-Appearing Intervals starts like almost every
other scaling method -- with the development of the focus for the scaling project. Because
this is a unidimensional scaling method, we assume that the concept you are trying to scale
is reasonably thought of as one-dimensional. The description of this concept should be as
clear as possible so that the person(s) who are going to create the statements have a clear
idea of what you are trying to measure. For instance, you might start with the focus

Generate statements that describe specific attitudes that people might have towards
persons with AIDS.
You want to be sure that everyone who is generating statements has some idea of what you
are after in this focus command. You especially want to be sure that technical language and
acronyms are spelled out and understood (e.g., what is AIDS?).
Generating Potential Scale Items. Now, you're ready to create statements. You want a
large set of candidate statements (e.g., 80 -- 100) because you are going to select your final
scale items from this pool. You also want to be sure that all of the statements are worded
similarly -- that they don't differ in grammar or structure. For instance, you might want
them each to be worded as a statement with which you could agree or disagree. You don't
want some of them to be statements while others are questions.
Rating the Scale Items. Now we have a set of statements. The next step is to have your
participants (i.e., judges) rate each statement on a 1-to-11 scale in terms of how much each
statement indicates a favorable attitude towards the subject matter. Pay close attention
here! You DON'T want the participants to tell you what their attitudes towards AIDS are,
or whether they would agree with the statements. You want them to rate the
"favorableness" of each statement in terms of an attitude towards the subject matter, where
1 = "extremely unfavorable attitude " and 11 = "extremely favorable attitude ."
Computing Scale Score Values for Each Item. The next step is to analyze the rating data.
For each statement, you need to compute the Median and the Interquartile Range. The
median is the value above and below which 50% of the ratings fall. The first quartile (Q1)
is the value below which 25% of the cases fall and above which 75% of the cases fall -- in
other words, the 25th percentile. The median is the 50th percentile. The third quartile, Q3,
is the 75th percentile. The Interquartile Range is the difference between third and first
quartile, or Q3 - Q1. To facilitate the final selection of items for your scale, you might want
to sort the table of medians and Interquartile Range in ascending order by Median and,
within that, in descending order by Interquartile Range.
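The median and Interquartile Range computations above can be sketched in Python; the statement ratings below are hypothetical values invented for illustration:

```python
from statistics import median, quantiles

# Hypothetical 1-to-11 favorableness ratings of ONE statement by nine judges
ratings = [5, 6, 6, 7, 7, 7, 8, 9, 9]

med = median(ratings)                # the 50th percentile
q1, _, q3 = quantiles(ratings, n=4)  # Q1 and Q3 (statistics module's default method)
iqr = q3 - q1                        # Interquartile Range = Q3 - Q1

print(f"Median = {med}, IQR = {iqr}")
```

Repeating this for every statement produces the table of medians and Interquartile Ranges used in the selection step; note that different quartile conventions (exclusive vs. inclusive) can shift Q1 and Q3 slightly.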

[Table: each statement listed with its Median, Q1, Q3, and Interquartile Range]
Selecting the Final Scale Items. Now, you have to select the final statements for your
scale. You should select statements that are at equal intervals across the range of medians.
Within each value, you should try to select the statement that has the smallest Interquartile
Range. This is the statement with the least amount of variability across judges. You don't
want the statistical analysis to be the only deciding factor here. Look over the candidate
statements at each level and select the statement that makes the most sense. If you find that
the best statistical choice is a confusing statement, select the next best choice.
Items with higher scale values should, in general, indicate a more favorable attitude.
Randomly scramble the order of the statements with respect to their scale values.
Administering the Scale. You now have a scale -- a yardstick you can use for measuring
attitudes. You can give it to a participant and ask them to agree or disagree with each
statement. To get that person's total scale score, you average the scale scores of all the
items that person agreed with.
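The scoring rule just described (average the scale values of the endorsed items) can be sketched as follows; the item numbers and scale values are invented for illustration:

```python
# Hypothetical scale values (item medians from the judging stage)
scale_values = {1: 1.8, 2: 3.5, 3: 5.0, 4: 6.9, 5: 8.4, 6: 10.2}

# Items this respondent agreed with
agreed = [4, 5, 6]

# Thurstone score = mean of the scale values of the endorsed items
score = sum(scale_values[i] for i in agreed) / len(agreed)
print(score)
```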
The Other Thurstone Methods
The other Thurstone scaling methods are similar to the Method of Equal-Appearing
Intervals. All of them begin by focusing on a concept that is assumed to be unidimensional
and involve generating a large set of potential scale items. All of them result in a scale
consisting of relatively few items which the respondent rates on Agree/Disagree basis.
The major differences are in how the data from the judges is collected. For instance, the
method of paired comparisons requires each judge to make a judgement about each pair of
statements. With lots of statements, this can become very time consuming indeed. Clearly,
the paired comparison method would be too time consuming when there are lots of
statements initially.

B. Likert Scaling
Like Thurstone or Guttman Scaling, Likert Scaling is a unidimensional scaling method.
Here are the basic steps in developing a Likert or "summative" scale.
Defining the Focus: As in all scaling methods, the first step is to define what it is you are
trying to measure. Because this is a unidimensional scaling method, it is assumed that the
concept you want to measure is one-dimensional in nature. You might operationalize the
definition as an instruction to the people who are going to create or generate the initial set
of candidate items for your scale.
Generating the Items. Next, you have to create the set of potential scale items. These
should be items that can be rated on a 1-to-5 or 1-to-7 Disagree-Agree response scale.
Sometimes you can create the items by yourself based on your intimate understanding of
the subject matter. But, more often than not, it's helpful to engage a number of people in
the item creation step. For instance, you might use some form of brainstorming to create
the items. It's desirable to have as large a set of potential items as possible at this stage,
about 80-100 would be best.
Rating the Items. The next step is to have a group of judges rate the items. Usually you
would use a 1-to-5 rating scale where:
1. = strongly unfavorable to the concept
2. = somewhat unfavorable to the concept
3. = undecided
4. = somewhat favorable to the concept
5. = strongly favorable to the concept
Notice that, as in other scaling methods, the judges are not telling you what they believe;
rather, they are judging how favorable each item is with respect to the construct of interest.

Selecting the Items. The next step is to compute the intercorrelations between all pairs of
items, based on the ratings of the judges. In making judgments about which items to retain
for the final scale there are several analyses you can do:
- Throw out any items that have a low correlation with the total (summed) score across
all items
In most statistics packages it is relatively easy to compute this type of Item-Total
correlation. First, you create a new variable which is the sum of all of the individual items
for each respondent. Then, you include this variable in the correlation matrix computation
(if you include it as the last variable in the list, the resulting Item-Total correlations will all
be the last line of the correlation matrix and will be easy to spot). How low should the
correlation be for you to throw out the item? There is no fixed rule here; you might
eliminate all items whose correlation with the total score is less than .6, for example.
- For each item, get the average rating for the top quarter of judges and the bottom
quarter. Then, do a t-test of the differences between the mean value for the item for the
top and bottom quarter judges.
Higher t-values mean that there is a greater difference between the highest and lowest
judges. In more practical terms, items with higher t-values are better discriminators, so
you want to keep these items. In the end, you will have to use your judgement about
which items are most sensibly retained. You want a relatively small number of items on
your final scale (e.g., 10-15) and you want them to have high Item-Total correlations and
high discrimination (e.g., high t-values).
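The Item-Total correlation described above can be sketched in Python. The judges' ratings below are invented for illustration; the t-test step could be coded analogously by comparing the mean item ratings of the top- and bottom-quarter judges:

```python
from statistics import mean

def pearson(x, y):
    # Plain Pearson correlation coefficient
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) *
           sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

# Hypothetical ratings: rows = judges, columns = candidate items
ratings = [
    [5, 4, 2],
    [4, 4, 3],
    [2, 1, 4],
    [1, 2, 3],
]
totals = [sum(row) for row in ratings]  # summed score per judge

# Item-Total correlation for each candidate item
item_total = [pearson([row[j] for row in ratings], totals)
              for j in range(len(ratings[0]))]
for j, r in enumerate(item_total, start=1):
    print(f"item {j}: r = {r:.2f}")
```

In this toy data the third item correlates negatively with the total score and would be discarded under the rule of thumb above.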
Administering the Scale. You're now ready to use your Likert scale. Each respondent is
asked to rate each item on some response scale. For instance, they could rate each item on a
1-to-5 response scale where:
1. = strongly disagree
2. = disagree
3. = undecided

4. = agree
5. = strongly agree
There are a variety of possible response scales (1-to-7, 1-to-9, 0-to-4). All of these odd-
numbered scales have a middle value that is often labeled Neutral or Undecided. It is also
possible to use a forced-choice response scale with an even number of responses and no
middle neutral or undecided choice. In this situation, the respondent is forced to decide
whether they lean more towards the agree or disagree end of the scale for each item.
The final score for the respondent on the scale is the sum of their ratings for all of the items
(this is why this is sometimes called a "summated" scale). On some scales, you will have
items that are reversed in meaning from the overall direction of the scale. These are called
reversal items. You will need to reverse the response value for each of these items before
summing for the total. That is, if the respondent gave a 1, you make it a 5; if they gave a 2
you make it a 4; 3 = 3; 4 = 2; and, 5 = 1.
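The reversal-and-sum scoring just described can be sketched as follows; the item names and responses are hypothetical:

```python
# Respondent's raw answers on a 1-to-5 Disagree-Agree scale (hypothetical)
responses = {"q1": 4, "q2": 2, "q3": 5, "q4": 1}

# q2 and q4 are assumed to be reversal items
reversal_items = {"q2", "q4"}

def scored(item, value, lo=1, hi=5):
    # Reversing maps v to (hi + lo - v): 1->5, 2->4, 3->3, 4->2, 5->1
    return hi + lo - value if item in reversal_items else value

# Summated (Likert) score = sum of the adjusted item ratings
total = sum(scored(item, v) for item, v in responses.items())
print(total)
```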
Example: The Employment Self Esteem Scale
Here's an example of a three-item Likert Scale that attempts to estimate the level of self
esteem a person has on the job. Notice that this instrument has no center or neutral point -
the respondent has to declare whether he/she is in agreement or disagreement with the item.
INSTRUCTIONS: Please rate how strongly you agree or disagree with each of the
following statements by placing a check mark in the appropriate box.
1. I feel good about my work on the job.
2. On the whole, I get along well with others at work.
3. I am proud of my ability to cope with difficulties at work.
Advantages of the Likert scale:
- It is relatively easy to construct.
- It is more reliable because respondents answer each statement included in the scale.
- It provides more information and data than other scales.
- It can be used in both respondent-centered and stimulus-centered studies.
- It takes much less time to construct.
Limitations of the Likert scale:
- It cannot accurately estimate the distance between responses; e.g., the distance
between "undecided" and "agree", and between "agree" and "strongly agree", may
not be equal, making the scale no more than an ordinal scale.
- Often the same total score can be secured by a variety of answer patterns, because
the questions may not reflect the real feelings of the respondents, or the
respondents may answer according to what they think they should feel rather than
their actual feelings.
C. Guttman Scaling
Guttman scaling is also sometimes known as cumulative scaling or scalogram analysis.
The purpose of Guttman scaling is to establish a one-dimensional continuum for a concept
you wish to measure.
What does that mean? Essentially, we would like a set of items or statements so that a
respondent who agrees with any specific question in the list will also agree with all
previous questions. Put more formally, we would like to be able to predict item responses
perfectly knowing only the total score for the respondent.

For example, imagine a ten-item cumulative scale. If the respondent scores a four, it
should mean that he/she agreed with the first four statements. If the respondent scores an
eight, it should mean they agreed with the first eight. The object is to find a set of items
that perfectly matches this pattern. In practice, we would seldom expect to find this
cumulative pattern perfectly. So, we use scalogram analysis to examine how closely a set
of items corresponds with this idea of cumulativeness.
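The cumulative property can be stated directly in code: on a perfect Guttman scale the total score alone predicts every item response. A minimal sketch:

```python
def predicted_responses(total_score, n_items):
    # On a perfectly cumulative scale, a total of k means the respondent
    # agreed with exactly the first k items (1) and none of the rest (0)
    return [1] * total_score + [0] * (n_items - total_score)

# A score of 4 on a hypothetical ten-item scale implies this response pattern:
print(predicted_responses(4, 10))
```

Scalogram analysis then measures how far a real response matrix departs from these predicted patterns.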
Define the Focus: As in all of the scaling methods, we begin by defining the focus for our
scale. Let's imagine that you wish to develop a cumulative scale that measures U.S. citizen
attitudes towards immigration. You would want to be sure to specify in your definition
whether you are talking about any type of immigration (legal and illegal) from anywhere
(Europe, Asia, Latin and South America, Africa).
Develop the Items: Next, as in all scaling methods, you would develop a large set of items
that reflect the concept. You might do this yourself or you might engage a knowledgeable
group to help. Let's say you came up with the following statements:
- I would permit a child of mine to marry an immigrant.
- I believe that this country should allow more immigrants in.
- I would be comfortable if a new immigrant moved next door to me.
- I would be comfortable with new immigrants moving into my community.
- It would be fine with me if new immigrants moved onto my block.
- I would be comfortable if my child dated a new immigrant.
Of course, we would want to come up with many more statements (about 80-100 would be best).
Rate the Items: Next, we would want to have a group of judges rate the statements or
items in terms of how favorable they are to the concept of immigration. They would give a
Yes if the item was favorable toward immigration and a No if it is not. Notice that we are
not asking the judges whether they personally agree with the statement. Instead, we're
asking them to make a judgment about how the statement is related to the construct of
interest.

Develop the Cumulative Scale: The key to Guttman scaling is in the analysis. We
construct a matrix or table that shows the responses of all the respondents on all of the
items. We then sort this matrix so that respondents who agree with more statements are
listed at the top and those agreeing with fewer are at the bottom. For respondents with the
same number of agreements, we sort the statements from left to right, from those that were
most agreed with to those that were least agreed with. We might get a table something like the figure.
Notice that the scale is very nearly cumulative when you read from left to right across the
columns (items). Specifically if someone agreed with Item 7, they always agreed with Item
2. And, if someone agreed with Item 5, they always agreed with Items 7 and 2. The matrix
shows that the cumulativeness of the scale is not perfect, however. While in general, a
person agreeing with Item 3 tended to also agree with 5, 7 and 2, there are several
exceptions to that rule.
While we can examine the matrix if there are only a few items in it, if there are lots of
items, we need to use a data analysis called scalogram analysis to determine the subsets of
items from our pool that best approximate the cumulative property. Then, we review these
items and select our final scale elements. There are several statistical techniques for
examining the table to find a cumulative scale. Because there is seldom a perfectly
cumulative scale we usually have to test how good it is. These statistics also estimate a
scale score value for each item. This scale score is used in the final calculation of a
respondent's score.
Administering the Scale: Once you've selected the final scale items, it's relatively simple
to administer the scale. You simply present the items and ask the respondent to check items
with which they agree. For our hypothetical immigration scale, the items might be listed in
cumulative order as:
- I believe that this country should allow more immigrants in.
- I would be comfortable with new immigrants moving into my community.
- It would be fine with me if new immigrants moved onto my block.
- I would be comfortable if a new immigrant moved next door to me.
- I would be comfortable if my child dated a new immigrant.

- I would permit a child of mine to marry an immigrant.
Of course, when we give the items to the respondent, we would probably want to mix up
the order. Our final scale might look like:
INSTRUCTIONS: Place a check next to each statement you
agree with.
_____ I would permit a child of mine to marry an immigrant.
_____ I believe that this country should allow more
immigrants in.
_____ I would be comfortable if a new immigrant moved next
door to me.
_____ I would be comfortable with new immigrants moving
into my community.
_____ It would be fine with me if new immigrants moved onto
my block.
_____ I would be comfortable if my child dated a new
immigrant.
Each scale item has a scale value associated with it (obtained from the scalogram analysis).
To compute a respondent's scale score we simply sum the scale values of every item they
agree with. In our example, their final value should be an indication of their attitude
towards immigration.
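The final scoring step can be sketched as follows, with invented scale values standing in for the output of the scalogram analysis:

```python
# Hypothetical scale values from the scalogram analysis (one per item)
scale_values = {"allow more": 1, "community": 2, "my block": 3,
                "next door": 4, "dated": 5, "marry": 6}

# Items this respondent checked
agreed = ["allow more", "community", "my block"]

# Respondent's score = sum of the scale values of the endorsed items
score = sum(scale_values[item] for item in agreed)
print(score)
```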


Chapter 10: Preparation of Thesis
What is a thesis and why write one?
Dictionary meaning of thesis: 1) a proposition to be maintained or proved. 2) a dissertation
especially by a candidate for a degree. One might infer from the etymology above that a
thesis is an (obligatory) offering placed at the desk of the examiner by a candidate who
wishes to get a degree. This is the most common, and often only, reason why a thesis is
written. But there are other reasons for writing a thesis.
A thesis is a written record of the work that has been undertaken by a candidate. It
constitutes objective evidence of the author's knowledge and capabilities in the field of
interest and is therefore a fair means to gauge them. Although thesis writing may be viewed
as an unpleasant obligation on the road to a degree, the discipline it induces may have
lifelong benefits.
Most of all, a thesis is an attempt to communicate. Science begins with curiosity, follows
on with experiment and analysis, and leads to findings which are then shared with the
larger community of scientists and perhaps even the public. The thesis is therefore not
merely a record of technical work, but also an attempt to communicate it to a larger
audience.
Differences between the undergraduate and postgraduate theses
The difference between the undergraduate and postgraduate theses is one of degree rather
than kind. They share a common structure and need for logical rigour. It is only in the
substance and the emphasis placed on it that the differences arise. Specifically, a PhD
thesis shall be a substantial and original contribution to scholarship, for example, through
the discovery of knowledge, the formulation of theories or the innovative re-interpretation
of known data and established ideas.
Indeed, the three most commonly cited qualities that earn an undergraduate thesis the first
class grade are originality, independence, and mastery. Candidates writing a higher degree
thesis, and the PhD thesis in particular, are required to present their research in the
context of existing knowledge. This means a thorough and critical review of the literature,
not necessarily limited to the narrow topic of research, but covering the general area. The
PhD candidate should also show clearly what original contributions she or he has made.

Although neither of these requirements applies strictly to undergraduate work, the
candidate should demonstrate familiarity with previous relevant work in his or her thesis.
In short, a thesis, whether undergraduate or postgraduate, is evidence of the candidate's
capacity to carry out independent research under the guidance of a supervisor, and to
analyze and communicate the significant results of that work. The candidate for higher
degrees must demonstrate, in addition, mastery of the literature and indicate clearly what
his or her original work is, and why it is significant.
There are several general considerations to keep in mind when generating a thesis report:
The Audience: Who is going to read the report? Reports will differ considerably
depending on whether the audience will want or require technical detail, whether they are
looking for a summary of results, or whether they are about to examine your research in a
Ph.D. exam.
The Story: Every research project has at least one major "story" in it. Sometimes the story
centers on a specific research finding. Sometimes it is based on a methodological problem
or challenge. When you write your report, you should attempt to tell the "story" to your
reader. Even in very formal journal articles where you will be required to be concise and
detailed at the same time, a good "storyline" can help make an otherwise very dull report
interesting to the reader.

The hardest part of telling the story in your research is finding the story in the first place.
Usually when you come to writing up your research you have been steeped in the details
for weeks or months (and sometimes even for years). You've been worrying about
sampling response, struggling with operationalizing your measures, dealing with the details
of design, and wrestling with the data analysis. You're a bit like the ostrich that has its head
in the sand. To find the story in your research, you have to pull your head out of the sand
and look at the big picture. You have to try to view your research from your audience's
perspective. You may have to let go of some of the details that you obsessed so much about
and leave them out of the write up or bury them in technical appendices or tables.

Formatting Considerations: Are you writing a research report that you will submit for
publication in a journal? If so, you should be aware that every journal requires that articles
follow specific formatting guidelines. Thinking of writing a book? Again, every
publisher will require specific formatting. Writing a term paper? Most faculties will require
that you follow specific guidelines. Doing your thesis or dissertation? Every university has
very strict policies about formatting and style except Andhra University. There are
legendary stories that circulate among graduate students about the dissertation that was
rejected because the page margins were a quarter inch off or the figures weren't labeled
correctly.
As per APA, all sections of the paper should be typed, double-spaced, on white 8 1/2 x 11
inch paper with a 12-point typeface and all margins set to 1 inch. Every page must have a
header in the upper right corner, with the running header right-justified on the top line and
the page number right-justified and double-spaced on the line below it. The paper must
have all the sections in the order given below, following the specifications outlined for
each section.
1. Preliminary Pages: In its preliminary pages the research report should carry:
a) Title page
b) Abstract (on a separate single page)
c) Table of Contents
d) Acknowledgements
2. Main Text: The main text provides the complete outline of the research report along
with all details. The title of the research study is repeated at the top of the first page of the
main text, and the other details follow on pages numbered consecutively,
beginning with the second page. Each main section of the report should begin on a new
page. The main text of the report should have the following sections:
a. Chapter 1: Introduction
b. Chapter 2: Review of the Literature
c. Chapter 3: Methods
I. Sample (1 page)

II. Measures (2-3 pages)
III. Design (2-3 pages)
IV. Procedures (2-3 pages)
d. Chapter 4: Results
e. Chapter (n + 1): General Discussion or Conclusions
3. End Matter: At the end of the report, appendices should be provided for
all technical data, such as questionnaires, mathematical derivations and the
like. A bibliography of sources consulted should also be given. An index (an alphabetical
listing of names, places and topics along with the numbers of the pages in a book or
report on which they are mentioned or discussed) should invariably be given at the
end of the report. In particular, the following elements should be included:
a. References
b. Tables (one to a page)
c. Figures (one to a page)
d. Appendices
Title Page
On separate lines and centered, the title page has the title of the study, the author's name,
and the institutional affiliation. At the bottom of the title page you should have the words
(in caps) RUNNING HEADER: followed by a short identifying title (2-4 words) for the
study. This running header should also appear on the top right of every page of the paper.
Abstract
The abstract is limited to one page, double-spaced. At the top of the page, centered, you
should have the word 'Abstract'. The abstract itself should be written in paragraph form
and should be a concise summary of the entire paper including: the problem; major
hypotheses; sample and population; a brief description of the measures; the name of the
design or a short description (no design notation here); the major results; and, the major
conclusions. Obviously, to fit this all on one page you will have to be very concise.

Body/Main text
The first page of the body of the paper should have, centered, the complete title of the
study.
Introduction
The first section in the body is the introduction. There is no heading that says 'Introduction';
you simply begin the paper in paragraph form following the title. Every introduction will
have the following (roughly in this order): a statement of the problem being addressed; a
statement of the cause-effect relationship being studied; a description of the major
constructs involved; a brief review of relevant literature (including citations); and a
statement of hypotheses. The entire section should be in paragraph form with the possible
exception of the hypotheses, which may be indented.
Methods
The next section of the paper has four subsections: Sample, Measures, Design, and
Procedure. The Methods section should begin immediately after the introduction (no page
break) and should have the centered title 'Methods'. Each of the four subsections should
have an underlined left justified section heading.
Sample
This section should describe the population of interest, the sampling frame, the method for
selecting the sample, and the sample itself. A brief discussion of external validity is
appropriate here, that is, you should state the degree to which you believe results will be
generalizable from your sample to the population.
Measures
This section should include a brief description of your constructs and all measures that will
be used to operationalize them. You may present short instruments in their entirety in this
section. If you have more lengthy instruments you may present some "typical" questions to

give the reader a sense of what you will be doing (and include the full measure in an
Appendix). You may include any instruments in full in appendices rather than in the body.
Appendices should be labeled by letter (e.g., 'Appendix A') and cited appropriately in the
body of the text. For pre-existing instruments you should cite any relevant information
about reliability and validity if it is available. For all instruments, you should briefly state
how you will determine reliability and validity, report the results and discuss. For
reliability, you must describe the methods you used and report results. A brief discussion of
how you have addressed construct validity is essential. In general, you should try to
demonstrate both convergent and discriminant validity. You must discuss the evidence in
support of the validity of your measures.
Design
You should state the name of the design that is used and tell whether it is a true or quasi-
experiment, nonequivalent group design, and so on. You should also present the design
structure in X and O notation (this should be indented and centered, not put into a
sentence). You should also include a discussion of internal validity that describes the major
likely threats in your study and how the design accounts for them, if at all. (Be your own
study critic here and provide enough information to show that you understand the threats to
validity, whether you've been able to account for them all in the design or not.)
Procedure
Generally, this section ties together the sampling, measurement, and research design. In
this section you should briefly describe the overall plan of the research, the sequence of
events from beginning to end (including sampling, measurement, and use of groups in
designs), how participants will be notified, and how their confidentiality will be protected
(where relevant). An essential part of this subsection is a description of the program or
independent variable that you are studying.

Results
The heading for this section is centered with upper and lower case letters. You should
indicate concisely what results you found in this research. Your results don't have to
confirm your hypotheses. In fact, the common experience in social research is the finding
of no effect.
Conclusions
Here you should describe the conclusions you reach (assuming you got the results
described in the Results section above). You should relate these conclusions back to the
level of the construct and the general problem area which you described in the Introduction
section. You should also discuss the overall strength of the research proposed (e.g. general
discussion of the strong and weak validity areas) and should present some suggestions for
possible future research which would be sensible based on the results of this work.
References
There are really two parts to a reference citation. First, there is the way you cite the item in
the text when you are discussing it. Second, there is the way you list the complete reference
in the reference section in the back of the report.
Reference Citations in the Text of Your Paper
Cited references appear in the text of your paper and are a way of giving credit to the
source of the information or quote you have used in your paper. They generally consist of
the following bits of information:
The author's last name, unless first initials are needed to distinguish between two authors
with the same last name. If there are six or more authors, the first author is listed followed
by the term et al., and then the year of the publication is given in parentheses. Page
numbers are given with a quotation or when only a specific part of a source was used, e.g.:
"To be or not to be" (Shakespeare, 1660, p. 241)

One Work by One Author:
Rogers (1994) compared reaction times...
One Work by Multiple Authors:
Wasserstein, Zappulla, Rosen, Gerstman, and Rock (1994) [first time you cite in text]
Wasserstein et al. (1994) found [subsequent times you cite in text]
Reference List in Reference Section
There are a wide variety of reference citation formats. Before submitting any research
report you should check to see which type of format is considered acceptable for that
context. If there is no official format requirement then the most sensible thing is for you to
select one approach and implement it consistently (there's nothing worse than a reference
list with a variety of formats). Here, we'll illustrate by example some of the major
reference items and how they might be cited in the reference section.
The References lists all the articles, books, and other sources used in the research and
preparation of the paper and cited with a parenthetical (textual) citation in the text. These
items are entered in alphabetical order according to the authors' last names; if a source does
not have an author, alphabetize according to the first word of the title, disregarding the
articles "a", "an", and "the" if they are the first word in the title.
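The alphabetizing rule above (sort by author's surname; for authorless titles, disregard a leading "a", "an" or "the") can be sketched as a sort key; the entries below are abbreviated invented examples:

```python
def alpha_key(entry):
    # Drop a leading article ("a", "an", "the") before comparing, per the rule above
    words = entry.split()
    if words and words[0].lower() in ("a", "an", "the"):
        words = words[1:]
    return " ".join(words).lower()

refs = [
    "The handbook of Korea (4th ed.). (1982).",
    "Jones, T. (1940). My life on the road.",
    "Bloom, H. (Ed.). (1988). James Joyce's Dubliners.",
]
ordered = sorted(refs, key=alpha_key)
for entry in ordered:
    print(entry)
```

The authorless "The handbook of Korea" sorts under "h", between Bloom and Jones.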
Jones, T. (1940). My life on the road. New York: Doubleday.
Williams, A., & Wilson, J. (1962). New ways with chicken. New York: Harcourt.

Smith, J., Jones, J., & Williams, S. (1976). Common names. Chicago: University of
Chicago Press.
Handbook of Korea (4th ed.). (1982). Seoul: Korean Overseas Information, Ministry of
Culture & Information.
Oates, J.C. (1990). Because it is bitter, and because it is my heart. New York: Dutton.
Oates, J.C. (1993). Foxfire: Confessions of a girl gang. New York: Dutton.
Note: Entries by the same author are arranged chronologically by the year of publication,
the earliest first. References with the same first author and different second and subsequent
authors are listed alphabetically by the surname of the second author, then by the surname
of the third author. References with the same authors in the same order are entered
chronologically by year of publication, the earliest first. References by the same author (or
by the same two or more authors in identical order) with the same publication date are
listed alphabetically by the first word of the title following the date; lower case letters (a, b,
c, etc.) are included after the year, within the parentheses.
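The ordering of several works by one author, with lower-case letter suffixes for works sharing a publication year, can likewise be sketched. A minimal illustration under the same assumptions as above; the titles used are only sample data.

```python
# Order one author's works chronologically (earliest first); works from
# the same year are ordered by title and labeled 1990a, 1990b, etc.
from itertools import groupby
from string import ascii_lowercase

def label_years(works):
    """works: list of (year, title) tuples for a single author.

    Returns (year_label, title) tuples, e.g. ('1990a', 'American appetites').
    """
    works = sorted(works)  # by year, then alphabetically by title
    labeled = []
    for year, group in groupby(works, key=lambda w: w[0]):
        group = list(group)
        if len(group) == 1:
            labeled.append((str(year), group[0][1]))
        else:
            # same year: append a, b, c, ... in title order
            for suffix, (_, title) in zip(ascii_lowercase, group):
                labeled.append((f"{year}{suffix}", title))
    return labeled

result = label_years([(1993, "Foxfire"),
                      (1990, "Because it is bitter"),
                      (1990, "American appetites")])
# [('1990a', 'American appetites'), ('1990b', 'Because it is bitter'),
#  ('1993', 'Foxfire')]
```

The single-year case keeps a plain year label, while same-year works receive the suffixed labels that also appear in the in-text citations.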
President's Commission on Higher Education. (1977). Higher education for American
democracy. Washington, D.C.: U.S. Government Printing Office.
Bloom, H. (Ed.). (1988). James Joyce's Dubliners. New York: Chelsea House.

Dostoevsky, F. (1964). Crime and punishment (J. Coulson, Trans.). New York: Norton.
(Original work published 1866)
O'Connor, M.F. (1975). Everything that rises must converge. In J.R. Knott, Jr. & C.R.
Raeske (Eds.), Mirrors: An introduction to literature (2nd ed., pp. 58-67). San Francisco:
Tortora, G.J., Funke, B.R., & Case, C.L. (1989). Microbiology: An introduction (3rd ed.).
Redwood City, CA: Benjamin/Cummings.
American Psychiatric Association. (1994). Diagnostic and statistical manual of mental
disorders (4th ed.). Washington, D.C.: Author.
Churchill, W.S. (1957). A history of the English speaking peoples: Vol. 3. The Age of
Revolution. New York: Dodd, Mead.
Cockrell, D. (1980). Beatles. In The new Grove dictionary of music and musicians (6th ed.,
Vol. 2, pp. 321-322). London: Macmillan.
Jones, W. (1970, August 14). Today's kids. Newsweek, 76, 10-15.

Howe, I. (1968, September). James Baldwin: At ease in apocalypse. Harper's, 237, 92-100.
Brody, J.E. (1976, October 10). Multiple cancers termed on increase. New York Times
(national ed.). p. A37.
Barber, B.K. (1994). Cultural, family, and personal contexts of parent-adolescent conflict.
Journal of Marriage and the Family, 56, 375-386.
U.S. Department of Labor. Bureau of Labor Statistics. (1980). Productivity. Washington,
D.C.: U.S. Government Printing Office.
Research and Training Center on Independent Living. (1993). Guidelines for reporting and
writing about people with disabilities. (4th ed.) [Brochure]. Lawrence, KS: Author.
Each Table should have a heading of the form 'Table #' (where # is the table number),
followed by a title that concisely describes what the table contains. Tables
and Figures are typed on separate sheets at the end of the paper, after the References and
before the Appendices. In the text you should put a reference where each Table or Figure
should be inserted, using this form:



Insert Table 1 about here
Figures are drawn on separate sheets at the end of the paper after the References and
Tables, and before the Appendices. In the text you should put a reference where each
Figure will be inserted using this form:

Insert Figure 1 about here
Appendices should be used only when absolutely necessary. Generally, you will use
them only for the presentation of extensive measurement instruments, for detailed descriptions of
the program or independent variable, and for any relevant supporting documents that you
don't include in the body. Even if you include such appendices, you should briefly describe
the relevant material in the body and give an accurate citation to the appropriate appendix
(e.g., 'see Appendix A').