
General Syllabus for PhD Preparation



1 Introduction - Meaning, Objectives and Types of research, Research Approach,
Research Process, Relevance & scope of research in management.

2 Research Design - Features of good Design, Types of Research Design, Basic
principles of experimental Design.

3 Sampling Design - Steps in sample Design, Characteristics of a good sample
Design, Probability & Non Probability sampling.

4 Measurement & scaling techniques - Errors in measurement. Test of sound
measurement, Scaling and scale construction technique.

5 Methods of data collection - Primary data: questionnaire and interviews;
Collection of secondary data.

6 Collection and Processing data - Survey Errors, Data coding; Editing and

7 Analysis of data - Analysis of Variance; Advanced Data Analysis Techniques-
Factor Analysis, Cluster Analysis, Discriminant Analysis, Conjoint Analysis, Multi
Dimensional Scaling.

8 Testing of hypothesis - Procedure for hypothesis testing; Use of statistical
techniques for testing of hypothesis.

9 Interpretation of data - Techniques of Interpretation, Report writing, Layout of a
project report, preparing research reports.

10 Research in various Functional Areas
Bibliography
11 All FAQ on Ph.D, Research Aptitude Test: Examination Pattern
12 Most likely asked questions in Ph.D entrance aptitude test
13 Most likely asked questions for Ph.D interview
Enclosed CD contents: Sample Ph.D Thesis, Synopsis, Summary etc.

[ 2011-12 : A Complete textbook on PhD (along with CD)] Page 2

Chapter 1: Research
Research comprises creative work undertaken on a systematic basis in order to increase the stock of
knowledge, including knowledge of man, culture and society, and the use of this stock of knowledge to
devise new applications.
Research can be defined as a search for knowledge, or as any systematic investigation to establish facts. The
primary purpose of basic research (as opposed to applied research) is discovering, interpreting, and
developing methods and systems for the advancement of human knowledge on a wide variety of
scientific matters of our world and the universe. Research can use the scientific method, but need not do
so.
Scientific research relies on the application of the scientific method, a harnessing of curiosity. This
research provides scientific information and theories for the explanation of the nature and the properties
of the world around us. It makes practical applications possible. Scientific research is funded by public
authorities, by charitable organisations and by private groups, including many companies. Scientific
research can be subdivided into different classifications according to academic and application
disciplines.
Research can be defined as a scientific and systematic search for information and knowledge on a
specific topic or phenomenon. In management, research is used extensively in various areas. For example,
marketing is the process of planning and executing the conception, pricing, promotion and
distribution of ideas, goods and services to create exchanges that satisfy individual and organizational
objectives. Thus, we can say that the marketing concept requires customer satisfaction, rather than
profit maximization, to be the goal of an organization. The organization should be consumer oriented
and should try to understand consumers' requirements and satisfy them quickly and efficiently, in ways
that are beneficial to both the consumer and the organization.

This means that any organization should try to obtain information on consumer needs and gather market
intelligence to help satisfy these needs efficiently. This can be done only through research.
Research in common parlance refers to a search for knowledge. It is an endeavour to discover answers to
problems (of intellectual and practical nature) through the application of scientific methods. Research,
thus, is essentially a systematic inquiry seeking facts (truths) through objective, verifiable methods in
order to discover the relationship among them and to deduce from them broad conclusions. It is thus a
method of critical thinking. Any type of organisation in the globalised environment
needs a systematic supply of information, coupled with tools of analysis, for making sound decisions that
involve minimum risk. In this chapter, we will discuss at length the need and significance of research,
types and methods of research, and the research process.
When research is used for decision-making, we are applying the methods of science to the art of
management. Every organization operates under some degree of uncertainty. This uncertainty cannot be
eliminated completely, although it can be minimized with the help of research methodology. Research is
particularly important in the decision-making process of business organizations, helping them choose the
best line of action in the light of growing competition and increasing uncertainty.
The research process usually starts with a broad area of interest, the initial problem that the researcher
wishes to study. For instance, the researcher could be interested in how to use computers to improve the
performance of students in mathematics. However, this initial interest is far too broad to study in any
single research project (it might not even be addressable in a lifetime of research).
The researcher has to narrow the question down to one that can reasonably be studied in a research
project. This might involve formulating a hypothesis or a focus question. For instance, the researcher
might hypothesize that a particular method of computer instruction in math will improve the ability of
elementary school students in a specific district. At the narrowest point of the research hourglass, the
researcher is engaged in direct measurement or observation of the question of interest.


The Random House Dictionary of the English Language defines the term "research" as "a meticulous and
systematic inquiry or investigation into a subject in order to discover or revise facts, theories,
applications, etc." This definition explains that research involves acquisition of knowledge. Research
means search for truth. Truth means the quality of being in agreement with reality or facts. It also means
an established or verified fact. To do research is to get nearer to truth, to understand reality. Research
is the pursuit of truth with the help of study, observation, comparison and experimentation. In other
words, the search for knowledge through an objective and systematic method of finding a solution to a
problem, or an answer to a question, is research. There is no guarantee that the researcher will always come out
with a solution or answer. Even then, to put it in Karl Pearson's words, "there is no short cut to truth, no
way to gain knowledge of the universe except through the gateway of scientific method." Let us see
some definitions of research:
L.V. Redman and A.V.H. Mory, in their book The Romance of Research, defined research as "a
systematized effort to gain new knowledge".
"Research is a scientific and systematic search for pertinent information on a specific topic" (C.R. Kothari,
Research Methodology - Methods and Techniques).
"A careful investigation or inquiry, specially through search for new facts in any branch of knowledge"
(Advanced Learner's Dictionary of Current English).
Research refers to a process of enunciating the
problem, formulating a hypothesis, collecting the facts or data, analyzing the same, and reaching certain
conclusions either in the form of a solution to the problem enunciated or in certain generalizations for some
theoretical formulation.
D. Slesinger and M. Stephenson, in the Encyclopedia of Social Sciences, defined research as:
"Manipulation of things, concepts or symbols for the purpose of generalizing and to extend, correct or
verify knowledge, whether that knowledge aids in the construction of a theory or in the practice of an
art."

To understand the term research clearly and comprehensively, let us analyze the above definition.
i) Research is manipulation of things, concepts or symbols
- Manipulation means purposeful handling.
- Things means objects like balls, rats, vaccine.
- Concepts means the terms designating the things, and the perceptions about them of which science
tries to make sense. Examples: velocity, acceleration, wealth, income.
- Symbols may be signs such as +, x, s, S, etc.
- Manipulation of a ball or a vaccine means asking: when the ball is kept on different degrees of incline, how and at
what speed does it move? When the vaccine is used, not used, used with different gaps, or used in different
quantities (doses), what are the effects?

ii) Manipulation is for the purpose of generalizing
The purpose of research is to arrive at generalizations, i.e., statements of generality, so that
prediction becomes easier. A generalization or conclusion of an enquiry tells us to expect something in a
class of things under a class of conditions. Examples: The debt repayment capacity of farmers
decreases during drought years. When price increases, demand falls. Advertisement has a favourable
impact on sales.
iii) The purpose of research (or generalization) is to extend, correct or verify knowledge
Generalization has in turn certain effects on the established corpus or body of knowledge. It may extend
or enlarge the boundaries of existing knowledge by removing inconsistencies if any. It may correct the
existing knowledge by pointing out errors if any. It may invalidate or discard the existing knowledge
which is also no small achievement. It may verify and confirm the existing knowledge which also gives
added strength to the existing knowledge. It may also point out the gaps in the existing corpus of
knowledge requiring attempts to bridge these gaps.
iv) This knowledge may be used for construction of a theory or practice of an art
The extended, corrected or verified knowledge has two possible uses:
a) It may be used for theory building, so as to form a more abstract conceptual system. E.g. theory of
relativity, theory of full employment, theory of wages.
b) It may be used for some practical or utilitarian goal. E.g. "Salesmanship and advertisement increase sales"
is the generalization. It follows that if sales have to be increased, salesmanship and advertisement should
be used. Theory and practice are not two independent things. They are interdependent. Theory
gives quality and effectiveness to practice. Practice in turn may enlarge, correct, confirm or even
reject theory.

Some other definitions of Research are:
1. Redman and Mory define research as "a systematized effort to gain new knowledge".
2. Some people consider research as a movement from the known to the unknown. It is actually a
voyage of discovery.
3. According to Clifford Woody, "Research comprises defining and redefining problems, formulating
hypotheses or suggested solutions; making deductions and reaching conclusions; and at last carefully
testing the conclusions to determine whether they fit the formulating hypothesis."
On evaluating these definitions we can conclude that research refers to the systematic method consisting of:
- Enunciating the problem,
- Formulating a hypothesis,
- Collecting the facts or data,
- Analyzing the facts, and
- Reaching certain conclusions, either in the form of solutions to the problem concerned or in
certain generalizations for some theoretical formulation.

Research covers the search for and retrieval of information for a specific purpose. Research has many
categories, from medical research to literary research.
Research is essentially a fact-finding process, which influences decision-making. It is a careful search or
inquiry into any subject or subject matter, an endeavour to discover valuable facts
that would be useful for further application or utilization. Research can be basic research or applied
research. Basic research consists of studies directed toward long-range questions or advancing scientific
knowledge.

Characteristics of Research
a. Systematic Approach
Each step of your investigation must be so planned that it leads to the next step. Planning and
organization are part of this approach. Planned and organized research saves time and money.
b. Objectivity
True research should attempt to find an unbiased answer to the decision-making problem.
c. Reproducible
A reproducible research procedure is one which an equally competent researcher could duplicate, and
from it deduce approximately the same results. Precise information regarding samples, methods,
collection etc. should be specified.
d. Relevancy
It serves three important purposes:
- It avoids collection of irrelevant information and saves time and money
- It compares the information to be collected with the researcher's criteria for action
- It enables one to see whether the research is proceeding in the right direction
e. Control
Research is affected not only by the factors one is investigating but also by other extraneous factors.
It is impossible to control all of them, but all the factors that we think may affect the study have to be
controlled and accounted for.

For example, suppose we are studying the relationship between income and shopping behaviour. Without
controlling for education and age, the study would be unsound, since our findings may reflect the effect
of education and age rather than income.
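The point about control can be illustrated with a small sketch. The records below are hypothetical (invented purely for illustration, not drawn from any study), and the helper name mean_by is an assumption of this example. The sketch compares shopping trips across income bands, first without and then with education held constant.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (income_band, education_level, monthly_shopping_trips)
records = [
    ("low", "school", 2), ("low", "school", 3),
    ("low", "graduate", 6),
    ("high", "school", 3),
    ("high", "graduate", 6), ("high", "graduate", 7),
]

def mean_by(rows, key_fn):
    """Average the shopping trips within each group defined by key_fn."""
    groups = defaultdict(list)
    for row in rows:
        groups[key_fn(row)].append(row[2])
    return {key: mean(values) for key, values in groups.items()}

# Naive comparison: income band only, education ignored.
naive = mean_by(records, lambda r: r[0])

# Controlled comparison: income bands compared within each education level.
controlled = mean_by(records, lambda r: (r[1], r[0]))

print("naive gap:", naive["high"] - naive["low"])
for level in ("school", "graduate"):
    gap = controlled[(level, "high")] - controlled[(level, "low")]
    print(f"gap within {level}:", gap)
```

In this made-up data the naive comparison suggests a gap of about 1.7 trips between income bands, but within each education level the gap shrinks to 0.5. This is the kind of distortion that controlling for extraneous factors is meant to expose.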

Objectives of research
Following are the key objectives of research:
1. Exploration - an understanding of an area of concern in very general terms. Example: I want to know
how to go about doing more effective research on school violence.
2. Description - an understanding of what is going on. Example: I want to know the attitudes of potential
clients toward Air-Conditioner use.
3. Explanation - an understanding of how things happen. Involves an understanding of cause and effect
relationships between events. Example: I want to know if a group of people who have gone through a
certain program have higher self-esteem than a control group.
4. Prediction - an understanding of what is likely to happen in the future. If I can explain, I may be able to
predict. Example: If one group had higher self-esteem, is it likely to happen with another group?
5. Intelligent intervention - an understanding of what or how in order to help more effectively.
6. Awareness - an understanding of the world, often gained by a failure to describe or explain.
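The Explanation example above, comparing a group that went through a program with a control group, can be sketched as a simple comparison of means. The scores below are hypothetical, and the exact permutation test used here is only one of several ways to judge whether such a difference could have arisen by chance; it is an assumption of this sketch, not a method prescribed by the text.

```python
from itertools import combinations
from statistics import mean

# Hypothetical self-esteem scores (illustrative values only).
program_group = [32, 35, 31, 36, 34]
control_group = [28, 30, 27, 31, 29]

observed_gap = mean(program_group) - mean(control_group)

# Exact permutation test: pool all scores, then look at every possible way
# of splitting them into two groups of the original sizes.
pooled = program_group + control_group
group_size = len(program_group)

at_least_as_large = 0
total_splits = 0
for chosen in combinations(range(len(pooled)), group_size):
    chosen_set = set(chosen)
    first = [pooled[i] for i in chosen_set]
    second = [pooled[i] for i in range(len(pooled)) if i not in chosen_set]
    # Count splits whose gap is at least as large as the observed one.
    if mean(first) - mean(second) >= observed_gap:
        at_least_as_large += 1
    total_splits += 1

p_value = at_least_as_large / total_splits
print(f"observed gap = {observed_gap:.1f}, p = {p_value:.4f}")
```

A small p-value means that few random regroupings of the same scores produce a gap as large as the observed one. That supports, but does not by itself prove, a cause and effect explanation.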

Types of research
Research may be classified into different types for the sake of better understanding of the concept.
Several bases can be adopted for the classification such as nature of data, branch of knowledge, extent of
coverage, place of investigation, method employed, time frame and so on. Depending upon the basis
adopted for the classification, research may be classified into a class or type. A piece of
research work can often be classified under more than one type, so there will be overlapping. It must be
remembered that good research uses a number of types, methods and techniques. Hence, rigid
classification is impossible. The following is only an attempt to classify research into different types.

i) According to the Branch of Knowledge
Different Branches of knowledge may broadly be divided into two:
a) Life and physical sciences such as Botany, Zoology, Physics and Chemistry.
b) Social Sciences such as Political Science, Public Administration, Economics, Sociology, Commerce and
Management.

Research in these fields is also broadly referred to as life and physical science research and social science
research. Business education covers both Commerce and Management, which are part of Social sciences.
Research is a broad term which covers many areas.

The research carried out in these areas is called management research, production research, personnel
research, financial management research, accounting research, marketing research etc.
a. Management research includes various functions of management such as planning, organizing,
staffing, communicating, coordinating, motivating, controlling. Various motivational theories are the
result of research.
b. Production (also called manufacturing) research focuses more on materials and equipment rather
than on human aspects. It covers various aspects such as new and better ways of producing goods,
inventing new technologies, reducing costs, improving product quality.
c. Research in personnel management may range from very simple problems to highly complex
problems of all types. It is primarily concerned with the human aspects of the business such as personnel
policies, job requirements, job evaluation, recruitment, selection, placement, training and development,
promotion and transfer, morale and attitudes, wage and salary administration, industrial relations. Basic
research in this field would be valuable as human behaviour affects organizational behaviour and
d. Research in Financial Management includes financial institutions, financing instruments (e.g. shares,
debentures), financial markets (capital market, money market, primary market, secondary market),
financial services (e.g. merchant banking, discounting, factoring), financial analysis (e.g. investment
analysis, ratio analysis, funds flow / cash flow analysis) etc.
e. Accounting research, though narrow in scope, is a highly significant area of business
management. Accounting information is used as a basis for reports to the management, shareholders,
investors, tax authorities, regulatory bodies and other interested parties. Areas for accounting research
include inventory valuation, depreciation accounting, generally accepted accounting principles,
accounting standards, corporate reporting etc.
f. Marketing research deals with product development and distribution problems, marketing
institutions, marketing policies and practices, consumer behaviour, advertising and sales promotion,
sales management, after-sales service etc. Marketing research is one of the most popular areas and
also a well-established one. It includes market potentials, sales forecasting, product
testing, sales analysis, market surveys, test marketing, consumer behaviour studies, marketing
information systems etc.
g. Business policy research is basically the research with policy implications. The results of such studies
are used as indices for policy formulation and implementation.
h. Business history research is concerned with the past. For example, how were trade and commerce
conducted during the Moghul regime?

ii) According to the Nature of Data
A simple dichotomous classification of research is Quantitative research and Qualitative research / non-
quantitative research.
a. Quantitative research is variable-based whereas qualitative research is attribute-based. Quantitative
research is based on measurement / quantification of the phenomenon under study. In other words, it is
data-based and hence more objective and more popular.
b. Qualitative research is based on the subjective assessment of attributes, motives, opinions, desires,
preferences, behaviour etc. Research in such a situation is a function of the researcher's insights and
impressions.

iii) According to the Coverage
According to the number of units covered, a study can be a macro study or a micro study. A macro study is a study
of the whole whereas a micro study is a study of a part. For example, working capital management in
State Road Transport Corporations in India is a macro study, whereas working capital management in
Andhra Pradesh State Road Transport Corporation is a micro study.

iv) According to Utility or Application
Depending upon the use of research results, i.e., whether they contribute to theory building or to
problem solving, research can be Basic or Applied.
a. Basic research is also called pure / theoretical / fundamental research. Basic research includes original
investigations for the advancement of knowledge that do not have the specific objective of answering
problems of sponsoring agencies.
b. Applied research, also called Action research, constitutes research activities on problems posed by
sponsoring agencies for the purpose of contributing to the solution of these problems.

v) According to the place where it is carried out
Depending upon the place where the research is carried out (according to the data generating source),
research can be classified into:
a) Field Studies or field experiments
b) Laboratory studies or Laboratory experiments
c) Library studies or documentary research

vi) According to the Research Methods used
Depending upon the research method used for the investigation, it can be classified as:
a) Survey research, b) Observation research, c) Case research, d) Experimental research, e) Historical
research, f) Comparative research.

vii) According to the Time Frame
Depending upon the time period adopted for the study, it can be
a) One-time or single time period research - e.g. one year or a point of time. Most sample studies and
diagnostic studies are of this type.
b) Longitudinal research - e.g. several years or several time periods (a time series analysis), such as
industrial development during the five year plans in India.

viii) According to the purpose of the Study
What is the purpose/aim/objective of the study? Is it to describe, analyze, evaluate or explore?
Accordingly, studies are classified as follows:
a) Descriptive Study: The major purpose of descriptive research is the description of a person, situation,
institution or an event as it exists. Generally fact-finding studies are of this type.
b) Analytical Study: The researcher uses facts or information already available and analyses them to
make a critical examination of the material. These are generally ex-post facto studies or post-mortem
studies.
c) Evaluation Study: This type of study is generally conducted to examine / evaluate the impact of a
particular event, e.g. the impact of a particular decision, project or investment.
d) Exploratory Study: Little is known about a particular subject matter. Hence, a study is
conducted to know more about it so as to formulate the problem and procedures of the study. Such a
study is called an exploratory / formulative study.

Research Approaches
The researcher has to provide answers, at the end, to the research questions raised at the beginning of the
study. For this purpose the researcher has to investigate and gather the relevant data and information as a basis or
evidence. The procedures adopted for obtaining the same are described in the literature as methods of
research or approaches to research. In fact, they are the broad methods used to collect the data. These
methods are as follows:
1) Survey Method
2) Observation Method
3) Case Method
4) Experimental Method
5) Historical Method
6) Comparative Method

Each of the above-mentioned approaches is explained briefly below.

1. Survey Method
The dictionary meaning of "survey" is to oversee, to look over, to study, to systematically investigate.
Survey research is used to study large and small populations (or universes). It is a fact-finding study.
Mostly empirical problems are investigated by this approach. It is a critical inspection to gather
information, often a study of an area with respect to a certain condition or its prevalence. For example: a
marketing survey, a household survey, the All India Rural Credit Survey.
Survey is a very popular branch of social science research. Survey research has developed as a separate
research activity along with the development and improvement of sampling procedures. Sample surveys
are very popular nowadays. As a matter of fact, "sample survey" has become synonymous with "survey".
For example, see the following definitions:
Survey research can be defined as "specification of procedures for gathering information about a large
number of people by collecting information from a few of them" (Black and Champion). Survey
research is "studying samples chosen from populations to discover the relative incidence,
distribution, and interrelations of sociological and psychological variables" (Fred N. Kerlinger).
In a survey, information may be collected by observation, personal interview, mailed
questionnaires, administered schedules or telephone enquiries.
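The idea in the definitions above, learning about a large number of people by collecting information from a few of them, can be sketched in a few lines. The population, its 30% ownership rate and the function name estimate_share are all hypothetical choices made for this illustration, and simple random sampling is assumed as the sampling procedure.

```python
import random

def estimate_share(population, sample_size, seed=0):
    """Estimate the share of 1s in a population from a simple random sample."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sample = rng.sample(population, sample_size)  # sampling without replacement
    return sum(sample) / sample_size

# Hypothetical population: 10,000 households, 30% of which own the product.
population = [1] * 3000 + [0] * 7000
true_share = sum(population) / len(population)

estimate = estimate_share(population, sample_size=500)
print(f"true share = {true_share:.3f}, sample estimate = {estimate:.3f}")
```

In a real survey the true share is unknown, which is why the quality of the result depends on how representative the sample is.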
Features of Survey method
The important features of survey method are as follows:
i) It is a field study, as it is always conducted in a natural setting.
ii) It solicits responses directly from the respondents or people known to have knowledge about the
problem under study.
iii) Generally, it gathers information from a large population.
iv) A survey covers a definite geographical area e.g. A village / city or a district.
v) It has a time frame.
vi) It can be an extensive survey involving a wider sample, or an intensive, in-depth and detailed study
covering a few samples.
vii) Survey research is best adapted for obtaining personal and socio-economic facts, beliefs and attitudes.
Survey research is not a clerical routine of gathering facts and figures. It requires a good deal of research
knowledge and sophistication. The competent survey investigator must know sampling procedures,
questionnaire / schedule / opinionnaire construction, techniques of interviewing and other technical
aspects of the survey. Ultimately the quality of the survey results depends on imaginative planning,
representative sampling, reliability of data, and appropriate analysis and interpretation of the data.

2. Observation Method
Observation means seeing or viewing. It is not a casual but systematic viewing. Observation may
therefore be defined as a systematic viewing of a specific phenomenon in its proper setting for the
purpose of gathering information for the specific study.
Observation is a method of scientific enquiry. We observe a person or an event or a situation or an
incident. The body of knowledge of various sciences such as biology, physiology, astronomy, sociology,
psychology, anthropology etc., has been built upon centuries of systematic observation.
Observation is also useful in social and business sciences for gathering information and conceptualizing
it. For example: What is the life style of tribals? How do marketing activities take place in
regulated markets? How are investment activities carried out in stock exchanges? How do
proceedings take place in the Indian Parliament or Assemblies? How is a corporate office maintained in
a public sector or a private sector undertaking? How do political leaders behave? How do traffic jams
build up in Delhi during peak hours?
Observation as a method of data collection has some features:
i) It is not only seeing and viewing but also hearing and perceiving.
ii) It is both a physical and a mental activity. The observing eye catches many things which are sighted,
but attention is focused on data that are relevant to the problem under study.
iii) It captures the natural social context in which the person's behaviour occurs.
iv) Observation is selective: The investigator does not observe everything but selects the range of things
to be observed depending upon the nature, scope and objectives of the study.
v) Observation is not casual but purposeful. It is made for the purpose of noting things relevant to the
study.
vi) The investigator first observes the phenomenon and then gathers and accumulates data.
Observation may be classified in different ways. According to the setting it can be (a) observation in a
natural setting, e.g. observing the live telecast of parliament proceedings or watching from the visitors'
gallery, or observing electioneering in India through election meetings, or (b) observation in an
artificially stimulated setting, e.g. business games, Tread Mill Test.
According to the mode of observation it may be classified as
(a) direct or personal observation, and (b) indirect or mechanical observation. In direct
observation, the investigator personally observes the event when it takes place, whereas
indirect observation is done through mechanical devices such as audio recordings, audio visual aids,
still photography, filming etc.
According to the participating role of the observer, it can be classified
as (a) participant observation and (b) non-participant observation. In participant observation, the
investigator takes part in the activity, i.e. acts both as an observer and as a participant. For example,
studying the customs and life style of tribals by living / staying with them. In non-participant
observation, the investigator observes from outside, merely as an onlooker.
The observation method is
suitable for a variety of research purposes such as a study of human behaviours, behaviour of social
groups, life styles, customs and traditions, interpersonal relations, group dynamics, crowd behaviour,
leadership and management styles, dressing habits of different social groups in different seasons,
behaviour of living creatures like birds and animals, the layout of a department store, a factory or a
residential locality, or the conduct of an event like a meeting, a conference or the Afro-Asian Games.

3. Case Method
Case method of study is borrowed from Medical Science. Just like a patient, the case is intensively studied
so as to diagnose and then prescribe a remedy. A firm, or a unit is to be studied intensively with a view to
finding out problems, differences, specialties so as to suggest remedial measures. It is an in-
depth/intensive study of a unit or problem under study. It is a comprehensive study of a firm or an
industry, or a social group, or an episode, or an incident, or a process, or a programme, or an institution
or any other social unit. According to P.V. Young, "a comprehensive study of a social unit, be that unit a
person, a group, a social institution, a district, or a community, is called a Case Study."
Case Study is one of the popular research methods. A case study aims at studying everything about
something rather than something about everything. It examines complex factors involved in a given
situation so as to identify causal factors operating in it. The case study describes a case in terms of its
peculiarities, typical or extreme features. It also helps to secure a fund of information about the unit
under study. It is a most valuable method of study for diagnostic and therapeutic purposes.

4. Experimental Method
Experimentation is the basic tool of physical sciences like Physics and Chemistry for establishing cause
and effect relationships and for verifying inferences. However, it is now also used in social sciences like
Psychology and Sociology. Experimentation is a research process used to observe cause and effect
relationships under controlled conditions. In other words, it aims at studying the effect of an independent
variable on a dependent variable, by keeping the other independent variables constant through some
type of control. In experimentation, the researcher can manipulate the independent variables and
measure their effect on the dependent variable. The main features of the experimental method are:
i) Isolation of factors or controlled observation.
ii) Replication of the experiment i.e. it can be repeated under similar conditions.
iii) Quantitative measurement of results.
iv) Determination of cause and effect relationship more precisely.
Three broad types of experiments are:
a) The natural or uncontrolled experiment, as in the case of astronomy, which is made up mostly of observations.
b) The field experiment, the one best suited to the social sciences. A field experiment is a research study in
a realistic situation in which one or more independent variables are manipulated by the experimenter
under as carefully controlled conditions as the situation will permit. (Fred N. Kerlinger)
c) The laboratory experiment, which is most closely associated with the physical sciences.
A laboratory experiment is a research study in which the variance of all or nearly all of the possible
influential independent variables, not pertinent to the immediate problem of the investigation, is kept at a
minimum. This is done by isolating the research in a physical situation apart from the routine of ordinary
living and by manipulating one or more independent variables under rigorously specified,
operationalized, and controlled conditions. (Fred N. Kerlinger). The contrast between the field
experiment and the laboratory experiment is not sharp; the difference is a matter of degree. The laboratory
experiment has a maximum of control, whereas the field experiment must operate with less control.
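
The features listed above (manipulating an independent variable, controlling other influences, replication and quantitative measurement) can be illustrated with a small simulation. The following Python sketch is only an illustration under stated assumptions: the outcome scores, the group size and the "true effect" of 5 points are fictitious values chosen for the example, and none of the names come from the text.

```python
import random
import statistics

def run_experiment(n_per_group=50, true_effect=5.0, seed=0):
    """Simulate one randomized experiment on fictitious data.

    Units are randomly assigned to a control or a treatment group (the
    manipulated independent variable). Everything else is generated the
    same way for both groups, so the difference in mean outcomes
    estimates the effect of the treatment on the dependent variable.
    """
    rng = random.Random(seed)
    units = list(range(2 * n_per_group))
    rng.shuffle(units)                       # random assignment to groups
    treated = set(units[:n_per_group])

    control_scores, treatment_scores = [], []
    for unit in units:
        baseline = rng.gauss(100.0, 10.0)    # variation the researcher cannot remove
        if unit in treated:
            treatment_scores.append(baseline + true_effect)
        else:
            control_scores.append(baseline)
    return statistics.mean(treatment_scores) - statistics.mean(control_scores)

# Replication: repeat the experiment under similar conditions
estimates = [run_experiment(seed=s) for s in range(200)]
mean_estimate = statistics.mean(estimates)
print(f"Mean estimated effect over 200 replications: {mean_estimate:.2f}")
```

A single run can miss the assumed effect by a few points because of the uncontrolled variation; averaging over many replications brings the estimate close to it, which is one reason replication is listed among the features of the method.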

5. Historical Method
When research is conducted on the basis of historical data, the researcher is said to have followed the
historical approach. To some extent, all research is historical in nature, because to a very large extent
research depends on the observations / data recorded in the past. Problems that are based on historical
records, relics, documents, or chronological data can conveniently be investigated by following this
method. Historical research depends on past observations or data and hence is non-repetitive; it is therefore
only a post facto analysis. Nevertheless, historians, philosophers, social psychiatrists, literary scholars and
social scientists all use the historical approach. Historical research is the critical investigation of events,
developments and experiences of the past, the careful weighing of evidence of the validity of the sources of
information of the past, and the interpretation of the weighed evidence. The historical method, also called
historiography, differs from other methods in its rather elusive subject matter, i.e. the past. In historical
research both primary and secondary sources of data can be used. A primary source is the original
repository of a historical datum, such as an original record kept of an important occasion, an eyewitness
description of an event, inscriptions on copper plates or stones, monuments and relics,
photographs, minutes of organization meetings, or documents. A secondary source is an account or record
of a historical event or circumstance, one or more steps removed from an original repository. Instead of
the minutes of the meeting of an organization, for example, if one uses a newspaper account of the
meeting, it is a secondary source.
The aim of historical research is to draw explanations and generalizations from the past trends in order to
understand the present and to anticipate the future. It enables us to grasp our relationship with the past
and to plan more intelligently for the future.
For historical data, only authentic sources should be depended upon, and their authenticity should be
tested by checking and cross-checking the data against as many sources as possible. It is often of
considerable interest to use time series data for assessing progress or for evaluating the impact of
policies and initiatives. This can be meaningfully done with the help of historical data.

6. Comparative Method
The comparative method is also frequently called the evolutionary or Genetic Method. The term
comparative method has come about in this way: Some sciences have long been known as Comparative
Sciences - such as comparative philology, comparative anatomy, comparative physiology, comparative
psychology, comparative religion etc. Now the method of these sciences came to be described as the
Comparative Method, an abridged expression for the method of the comparative sciences. When the
method of most comparative sciences came to be directed more and more to the determination of
evolutionary sequences, it came to be described as the Evolutionary Method.
The origin and the development of human beings, their customs, their institutions, their innovations and
the stages of their evolution have to be traced and established. The scientific method by which such
developments are traced is known as the Genetic method and also as the Evolutionary method. The
science which appears to have been the first to employ the Evolutionary method is comparative
philology. It is employed to compare the different languages in existence and to trace the history of their
evolution in the light of such similarities and differences as the comparisons disclose. Darwin's famous
work On the Origin of Species is the classic application of the Evolutionary method in comparative anatomy.
The whole theory of biological evolution rests on applications of evolutionary method. This method can
be applied not only to plants, to animals, to social customs and social institutions, to the human mind
(comparative psychology), to human ideas and ideals, but also to the evolution of geological strata, to the
differentiation of the chemical elements and to the history of the solar system. The term comparative
method as a method of research is used here in its restricted meaning as synonymous with Evolutionary
method. To say that the comparative method is a method of comparison is not convincing, for
comparison is not a specific method, but something which enters as a factor into every scientific method.
Classification requires careful comparison and every other method of science depends upon a precise
comparison of phenomena and the circumstances of their occurrence. All methods are, therefore,
comparative in a wider sense.

The Research Process
Having received the research brief, the researcher responds with a research proposal. This document is
developed after careful consideration of the contents of the research brief. The research proposal sets out
the research design and the procedures to be followed. The eight steps are set out in figure

Step -I: Problem definition
The point has already been made that the decision-maker should clearly communicate the purpose of the
research to the researcher, but it is often the case that the objectives are not fully explained to the
individual carrying out the study. Decision-makers seldom work out their objectives fully or, if they have,
they are not willing to fully disclose them. In theory, responsibility for ensuring that the research proceeds
along clearly defined lines rests with the decision-maker. In many instances, the researcher has to take the
initiative.
In situations in which the researcher senses that the decision-maker is either unwilling or unable to fully
articulate the objectives, he/she will have to pursue an indirect line of questioning. One approach is to
take the problem statement supplied by the decision-maker and to break
this down into key components and/or terms and to explore these with the decision-maker. For example,
the decision-maker could be asked what he has in mind when he uses the term market potential. This is a
valid question since the researcher is charged with the responsibility to develop a research design which
will provide the right kind of information. Another approach is to focus the discussions with the person
commissioning the research on the decisions which would be made given alternative findings which the
study might come up with. This process frequently proves of great value to the decision-maker in that it
helps him think through the objectives and perhaps select the most important of the objectives.
Whilst seeking to clarify the objectives of the research it is usually worthwhile having discussions with
other levels of management who have some understanding of the marketing problem and/or the
surrounding issues. Other helpful procedures include brainstorming, reviews of research on related
problems and researching secondary sources of information as well as studying competitive products.

The nature of problems
A decision maker's degree of uncertainty influences decisions about the type of research that will be
conducted. A business manager may be completely certain about the situation s/he is facing. Or, at the
other extreme, a manager or researcher may describe a decision-making situation as one of absolute ambiguity:
the nature of the problem to be solved is unclear, the objectives are vague and the alternatives are
difficult to define. This is by far the most difficult decision situation. Most business decisions
fall between these two extremes.

The importance of proper problem definition
Business research is conducted to help solve managerial problems. It is extremely important to define the
business problem carefully because such definition will determine the purpose of the research and,
ultimately, the research design.
Formal quantitative research should not begin until the problem has been clearly defined. However, when
a problem or opportunity is discovered, managers may have only vague insights about a complex
situation. If quantitative research is conducted before the researchers understand exactly what is
important, then false conclusions may be drawn from the investigation.
Problem definition indicates a specific business decision area that will be clarified by answering some
research questions.

The process of defining the problem
The process of defining the problem involves several interrelated steps. They are:
1. Ascertain the decision maker's objectives
2. Understand the background of the problem
3. Isolate and identify the problem, not the symptoms
4. Determine the unit of analysis
5. Determine the relevant variables
6. State the research questions (hypotheses)
7. State the research objectives

1) Ascertain the decision maker's objectives
The research investigation must attempt to satisfy the decision maker's objectives. Sometimes, decision
makers are not able to articulate precise research objectives. Both the research investigator and the
manager requesting the research should attempt to have a clear understanding of the purpose of
undertaking the research. Often, exploratory research, by illuminating the nature of the business
opportunity or problem, helps managers clarify their objectives and decisions.
The iceberg principle
The dangerous part of any business problem, like the submerged part of an iceberg, is neither visible to
nor understood by the business managers. If the submerged portions of the problem are omitted from
the problem definition, and subsequently from the research design, then the decision based on such
research may be less than optimal.
2) Understand the background of the problem.
The background of the problem is vital. A situation analysis is the logical first step in defining the
problem. This analysis involves the informal gathering of background information to familiarize
researchers or managers with the decision area. Exploratory research techniques have been developed to
help formulate clear definitions of the problem (see Chapter 7).
3) Isolate and identify the problem, not the symptoms.
Anticipating the many influences and dimensions of a problem is impossible for any researcher or
executive. Certain occurrences that appear to be the problem may only be symptoms of a deeper
problem. Executive judgment and creativity must be exercised in identifying a problem.
4) What is the unit of analysis?
The researcher must specify the unit of analysis. Will the individual consumer be the source of
information or will it be the parent-child dyad? Industries, organizations, departments, or individuals,
may be the focus for data collection and analysis. Many problems can be investigated at more than one
level of analysis.
5) What are the relevant variables?
One aspect of problem definition is identification of the key variables. A variable is a quality that can
exhibit differences in value, usually magnitude or strength.
In statistical analysis, a variable is identified by a symbol such as X. A category or classificatory variable
has a limited number of distinct values (e.g., sex: male or female). A continuous variable may
encompass an infinite range of numbers (e.g., sales volume).
Managers and researchers must be careful to include all relevant variables that must be studied in order
to be able to answer the managerial problem. Irrelevant variables should not be included.
In causal research, a dependent variable is a criterion or variable that is expected to be predicted or
explained. An independent variable is a variable that is expected to influence the dependent variable.
6 & 7) State the research questions and research objectives
The research question is the researcher's translation of the business problem into a specific need for
A. Clarity in Research Questions and Hypotheses
Research questions should be specific, clear, and accompanied by a well-formulated hypothesis.
A hypothesis is an unproven proposition or possible solution to a problem. In its simplest form, a
hypothesis is a guess. Problems and hypotheses are similar; both state relationships, but, whereas
problems are interrogative, hypotheses are declarative and more specifically related to the research
operations and testing. Hypotheses are statements that can be empirically tested.
A formal statement of hypothesis can force researchers to be clear about what they expect to find through
their study. The hypothesis can raise critical questions about the data that will be required in the analysis
When evaluating a hypothesis, researchers should make sure that the information collected will be useful
in decision making.
B. Decision-oriented research objectives
The research objective is the researcher's version of the problem. The research objective is derived from
the problem definition and it explains the purpose of the research in measurable terms, as well as
defining what standards the research should accomplish. Such objectives help ensure that the research
projects will be manageable in size.
In some instances the problem and the project's research objectives are identical. The objectives must,
however, specify the information needed to make a decision. Statements about the required precision
may be necessary to clearly communicate exactly what information is required.
It is useful if the research objective is a managerial action standard. That is, if the criterion being
measured turns out to be X, then management will do A; if it is Y, then management will do B. This
leaves no uncertainty concerning the decision to be made once the research is finished.
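
A managerial action standard of this kind amounts to a decision rule agreed before the data are collected. The short Python sketch below shows one way such a rule might be written down. The criterion (percentage of respondents stating an intention to buy), the 20 per cent threshold and the two actions are hypothetical and are used only to make the "if X then A, if Y then B" logic concrete.

```python
def action_standard(purchase_intent_pct, threshold=20.0):
    """Map the measured criterion to the action agreed in advance.

    The criterion, the threshold and the action labels are hypothetical;
    in practice they would be fixed by management before fieldwork.
    """
    if purchase_intent_pct >= threshold:
        return "A: proceed to test market"
    return "B: drop or rework the concept"

# The decision follows mechanically once the research result is known
for result in (12.5, 20.0, 31.0):
    print(f"{result:5.1f}% intending to buy -> {action_standard(result)}")
```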
The number of research objectives should be limited to a manageable number so that each one can be
addressed fully.

How much time should be spent defining the problem?
It is impractical to search for every conceivable cause and minor influence of a problem. The importance
of the recognized problem will usually dictate what is a reasonable amount of time and money for
determining which possible explanations are most likely.

The research proposal
The research proposal is a written statement of the research design: it explains the purpose of the study,
defines the problem, outlines the research methodology, details the procedures to be followed, and states
all costs and deadlines.
The proposal should be precise, specific, and concrete. All ambiguities about why and how the research
will be conducted must be "ironed out" before the proposal is complete.
The research proposal can act as a communication tool. It allows managers to evaluate the proposed
research design and determine if alterations are necessary. The proposal should be detailed enough that
managers are clear about exactly how the information will be obtained.
Misstatements and faulty communication may occur if the two parties rely on each other's memory of
what occurred at a planning meeting; therefore, it is wise to write down all proposals. Such a written
proposal eliminates many problems that may arise and acts as a record of the researcher's obligation. In
the case of an outside consultant, the written proposal serves as a bid to offer a specific service; a
company can then judge the relative quality of alternative research suppliers.
Anticipating outcomes
By anticipating the outcomes of a research study, possibly through the use of a dummy table (a table
filled by the researcher with fictitious data), managers may gain a better understanding of what the
actual outcome is likely to be. These tables help clarify what the findings of the research will be, and
whether these findings will meet the needs of the researcher.
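
As an illustration, a dummy table can be drafted in a few lines of Python before any data exist. In the sketch below, the row and column headings (age group by brand loyalty) and every figure are fictitious placeholders invented for the example; only the layout is meant to be discussed with the manager.

```python
# Dummy table: all figures are fictitious placeholders, to be replaced
# by the real survey results once fieldwork is complete.
rows = ["Under 30", "30 to 50", "Over 50"]
cols = ["Loyal to brand", "Switches brands"]
dummy_counts = {
    "Under 30": [40, 60],
    "30 to 50": [55, 45],
    "Over 50": [70, 30],
}

def format_dummy_table(rows, cols, counts):
    """Return the table as plain text with a row total in the last column."""
    header = f"{'Age group':<12}" + "".join(f"{c:>18}" for c in cols) + f"{'Total':>8}"
    lines = [header]
    for r in rows:
        values = counts[r]
        cells = "".join(f"{v:>18}" for v in values)
        lines.append(f"{r:<12}" + cells + f"{sum(values):>8}")
    return "\n".join(lines)

table_text = format_dummy_table(rows, cols, dummy_counts)
print(table_text)
```

If a manager looks at such a table and cannot say what decision would follow from it, the research question probably needs to be revised before money is spent on data collection.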

Step II: Hypothesis generation
Whilst it is true that the purpose of research is to address some question, nonetheless one does not test
research questions directly. For example, there may be interest in answering the question: "Does a
person's level of education have any bearing upon whether or not he/she adopts new products?" Or,
"Does a person's age bear any relation to brand loyalty behaviour?". Research questions are too broad to
be directly testable. Instead, the question is reduced to one or more hypotheses implied by these questions.
A hypothesis is a conjectural statement regarding the relation between two or more variables. There are
two key characteristics which all hypotheses must have: they must be statements of the relationship
between variables and they must carry clear implications for testing the stated relations. These
characteristics imply that it is relationships, rather than variables, which are tested; the hypotheses
specify how the variables are related and that these are measurable or potentially measurable. Statements
lacking either or both of these characteristics are not research hypotheses.
For example, consider the following hypothesis:
1. "Red meat consumption increases as real disposable incomes increase."
This is a relation stated between one variable, "red meat consumption", and another variable,
"disposable incomes". Moreover, both variables are potentially measurable. The criteria have been
met. However for the purposes of statistical testing it is more usual to find hypotheses stated in the
so-called null form, e.g.
"There is no relationship between red meat consumption and the level of disposable incomes."
2. Consider a second hypothesis:
"There is no relationship between a farmer's educational level and his degree of innovativeness
with respect to new farming technologies."
Again there is a clear statement of the relationship being investigated but there are question marks
over the measurability with respect to at least one of the variables i.e. "...a farmer's degree of
innovativeness." We may also encounter difficulties in agreeing an appropriate measure of the other
variable, i.e. "level of education". If these problems can be resolved then we may indeed have a
Hypotheses are central to progress in research. They will direct the researcher's efforts by forcing
him/her to concentrate on gathering the facts, which will enable the hypotheses to be tested. The point
has been made that it is all too easy when conducting research to collect "interesting data" as opposed to
"important data". Data and questions, which enable researchers to test explicit hypotheses, are important.
The rest are merely interesting.
There is a second advantage of stating hypotheses, namely that implicit notions or explanations for events
become explicit and this often leads to modifications of these explanations, even before data is collected.
On occasion a given hypothesis may be too broad to be tested. However, other testable hypotheses may
be deduced from it. A problem really cannot be solved unless it is reduced to hypothesis form, because a
problem is a question, usually of a broad nature, and is not directly testable.
Problem refinement: in most cases a problem statement is refined to a hypothesis: a proposed
hypothetical relationship between two or more variables in terms of cause (independent variable) and
effect (dependent variable). The possible solutions are that the proposed relation is valid or it is invalid.
Examples of possible hypotheses are:
- Hypothesis: Gender is related to income
- Hypothesis: Crime is related to population size
- Hypothesis: Crime is related to social class structure

These are good starting points but much more refinement can be done. For example, as a start:
- Hypothesis: females will earn lower annual wages than males
- Hypothesis: the crime index is related to population density
- Hypothesis: the crime index is related to the percentage of the population below the poverty level
In these we have begun to specify our variables, but even more refinement remains. The variable "female"
is indicative of some of the issues involved. We generally think we know what gender (male and female)
is, but after some consideration we realize that even the Olympic committees have some doubt about masculine
and feminine, and further that there is biological gender, sex roles, and personal sex identity.
This reinforces the need for clarity and specificity of definitions.
Alternative and Null hypothesis: The alternative hypothesis (H1) is simply the hypothesis
as stated above. The null hypothesis (H0) is that there is no relationship between the
variables, here crime and population density. It is a statement that there is no relationship between our
independent and dependent variables in the population.
H1, Alternative hypothesis
The CBI Crime Index is related to population density
H0, Null hypothesis
There is no relationship between the CBI Crime Index and the population density.
The goal of research is to reject the null hypothesis and thus support the alternative, theoretical
hypothesis. The theoretical hypothesis is the hypothesis we proposed based on our review of the
literature and theoretical considerations. Incidentally, we can see that this logic will never allow us to
conclusively prove our hypothesis, but rejecting the null hypothesis does provide support for accepting it.
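
The logic of testing a null hypothesis of "no relationship" can be sketched with a simple permutation test. This is one possible technique, chosen here because it needs nothing beyond the Python standard library; it is not presented as the method the text has in mind. The ten pairs of population density and crime index values are fictitious, and the helper names pearson_r and permutation_p_value are introduced only for this example.

```python
import random

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def permutation_p_value(x, y, n_permutations=5000, seed=0):
    """Two-sided p-value for H0: no relationship between x and y.

    Shuffling y breaks any real link with x, so the shuffled correlations
    show what values arise when the null hypothesis is true.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            at_least_as_extreme += 1
    # Add one to numerator and denominator so the p-value is never zero
    return (at_least_as_extreme + 1) / (n_permutations + 1)

# Fictitious data for ten districts
density = [120, 340, 560, 800, 1500, 2300, 3100, 4200, 5000, 6400]
crime_index = [21, 25, 24, 31, 38, 41, 40, 52, 55, 61]

r = pearson_r(density, crime_index)
p = permutation_p_value(density, crime_index)
print(f"r = {r:.3f}, permutation p-value = {p:.4f}")
```

A small p-value means that a correlation this strong would rarely arise if H0 were true, so H0 is rejected and H1 gains support. As noted above, this supports the alternative hypothesis without proving it.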

Step III: Decision on type of study
Research can be carried out at one of three levels: a) exploratory research, b) descriptive research and
c) experimental research.

Step IV: Decision on data collection method
The next set of decisions concerns the method(s) of data gathering to be employed. The main methods of
data collection are secondary data searches, observation, and the survey, experimentation and consumer
Under ideal conditions the researcher would select the most appropriate method (field research, survey,
experiment, or secondary data analysis) for the research problem. Realities of available money, time,
access to information, and own personal skills often are decisive factors in design choice and data
collection. Once the design is firm, follow through the steps in the design and collect the data.
All of us have collected data, not necessarily precisely and carefully in a scientific manner. Frequently we
observe people in a new situation to determine what is expected of us, such as when we first started
college, visited a new city, or started a new job; this is called participant observation, a particular type of
field research. We may ask friends how and why they are going to vote a certain way in an upcoming
election. This is known as interviewing. We may try different types or amounts of spices in a recipe to
find which combination tastes the best. This is called experimenting. Most of us have investigated sources
and data in the library to help us in making a decision about a trip, car, house or major appliance
purchase. This is known as secondary analysis, the analysis of data collected by others.
All of these are research "data collection" techniques, though they lack the rigor, care, and explicitness of
scientific research. Some may approach scientific quality for testing statements, while others would be
considered as primarily acceptable for the generation of hypothesis, but not acceptable for drawing
Research techniques vary in terms of the formal aspects of their structure. Some are more open-ended
and there is less consensus on structure (field studies, content analysis, focus groups, etc.). Most of these
techniques of study are not really lacking in numbers and counting of observations; where they differ
from other techniques is in their more open approach. Additionally, they frequently lack precise agreed
upon data collection techniques and sufficient numbers in their samples to allow using statistics and
generalizing conclusions.

Step V: Development of an analysis plan
Those new to research often intuitively believe that decisions about the techniques of analysis to be used
can be left until after the data has been collected. Such an approach is ill-advised. Before interviews are
conducted the following checklist should be applied:
- Is it known how each and every question is to be analysed? (e.g. which univariate or bivariate
descriptive statistics, tests of association, parametric or nonparametric hypothesis tests, or
multivariate methods are to be used?)
- Does the researcher have a sufficiently sound grasp of these techniques to apply them with
confidence and to explain them to the decision-maker who commissioned the study?
- Does the researcher have the means to perform these calculations? (e.g. access to a computer
which has an analysis program which he/she is familiar with? Or, if the calculations have to be
performed manually, is there sufficient time to complete them and then to check them?)
- If a computer program is to be used at the data analysis stage, have the questions been properly
- Have the questions been scaled correctly for the chosen statistical technique? (e.g. a t-test cannot
be used on data which is only ranked)
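
The last item on the checklist can be turned into a simple pre-fieldwork check. The Python sketch below is a minimal illustration, assuming the conventional ordering of measurement levels (nominal, ordinal, interval, ratio) and the minimum level usually required by a handful of common techniques. The questionnaire items and the analysis plan are hypothetical.

```python
# Levels of measurement, from weakest to strongest
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

# Minimum level of measurement conventionally required by each technique
MIN_LEVEL = {
    "chi-square test": "nominal",
    "Mann-Whitney U test": "ordinal",
    "Spearman rank correlation": "ordinal",
    "t-test": "interval",
    "Pearson correlation": "interval",
}

def is_scaled_correctly(question_level, technique):
    """True if the question's scale is strong enough for the technique."""
    return LEVELS.index(question_level) >= LEVELS.index(MIN_LEVEL[technique])

# Hypothetical analysis plan: (question, level of measurement, planned technique)
analysis_plan = [
    ("Q1 brand preferred", "nominal", "chi-square test"),
    ("Q2 ranking of flavours", "ordinal", "t-test"),
    ("Q3 monthly spend", "ratio", "t-test"),
]

problems = [q for q, level, technique in analysis_plan
            if not is_scaled_correctly(level, technique)]
print("Questions needing a different scale or technique:", problems)
```

Here the ranked question is flagged because a t-test is planned for it; either the question must be re-scaled or a rank-based technique chosen before the questionnaire is finalised.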
There is little point in spending time and money on collecting data, which subsequently is not or cannot
be analysed. Therefore consideration has to be given to issues such as these before the fieldwork is

Step VI: Data collection
Research involves the collection of data to obtain insight and knowledge into the needs and wants of
customers and the structure and dynamics of a market. In nearly all cases, it would be very costly and
time-consuming to collect data from the entire population of a market. Accordingly, in market research,
extensive use is made of sampling from which, through careful design and analysis, researchers can
draw information about the market.
i) Sample Design
Sample design covers the method of selection, the sample structure and plans for analysing and
interpreting the results. Sample designs can vary from simple to complex and depend on the type of
information required and the way the sample is selected.
Sample design affects the size of the sample and the way in which analysis is carried out. In simple terms
the more precision the market researcher requires, the more complex will be the design and the larger the
sample size.
The sample design may make use of the characteristics of the overall market population, but it does not
have to be proportionally representative. It may be necessary to draw a larger sample than would be
expected from some parts of the population; for example, to select more from a minority grouping to
ensure that sufficient data is obtained for analysis on such groups.
Many sample designs are built around the concept of random selection. This permits justifiable inference
from the sample to the population, at quantified levels of precision. Random selection also helps guard
against sample bias in a way that selecting by judgement or convenience cannot.
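
Random selection and deliberate over-sampling of a minority group can both be sketched with Python's standard random module. The sampling frame below (1,000 units, of which 100 belong to a minority group) and the sample sizes are fictitious, and stratified_sample is a helper name introduced only for this illustration.

```python
import random

rng = random.Random(42)  # fixed seed so the example is repeatable

# Hypothetical sampling frame: 100 minority units and 900 majority units
frame = [{"id": i, "group": "minority" if i < 100 else "majority"}
         for i in range(1000)]

# Simple random sample: every unit has the same chance of selection
simple_sample = rng.sample(frame, 100)

def stratified_sample(frame, sizes, rng):
    """Draw a random sample of the requested size from each stratum."""
    sample = []
    for group, size in sizes.items():
        stratum = [unit for unit in frame if unit["group"] == group]
        sample.extend(rng.sample(stratum, size))
    return sample

# Disproportionate design: the minority is over-sampled so that there
# are enough minority cases for separate analysis
boosted_sample = stratified_sample(frame, {"minority": 50, "majority": 50}, rng)

minority_in_simple = sum(u["group"] == "minority" for u in simple_sample)
minority_in_boosted = sum(u["group"] == "minority" for u in boosted_sample)
print("Minority units in simple random sample:", minority_in_simple)
print("Minority units in boosted sample:", minority_in_boosted)
```

Because the boosted sample is no longer proportionally representative, results for the whole population would normally be weighted back to the true group shares at the analysis stage.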
ii) Defining the Population
The first step in good sample design is to make the specification of the target population as clear
and complete as possible, so that all elements within the population are represented. The target
population is sampled using a sampling frame. Often the units in the population can be identified from
existing information; for example, payrolls, company lists, government registers etc. A sampling frame
could also be geographical; for example, postcodes have become a well-used means of selecting a sample.
iii) Sample Size
For any sample design deciding upon the appropriate sample size will depend on several key factors.
(1) No estimate taken from a sample is expected to be exact: Any assumptions about the overall
population based on the results of a sample will have an attached margin of error.
(2) To lower the margin of error usually requires a larger sample size. The amount of variability in the
population (i.e. the range of values or opinions) will also affect accuracy and therefore the size of sample.
(3) The confidence level is the likelihood that the results obtained from the sample lie within a required
precision. The higher the confidence level, that is, the more certain you wish to be that the results are not
atypical, the larger the sample required. Statisticians often use a 95 per cent confidence level to provide
strong conclusions.
(4) Population size does not normally affect sample size. In fact, the larger the population size, the lower
the proportion of that population that needs to be sampled to be representative. It is only when the
proposed sample size is more than 5 per cent of the population that the population size becomes part of
the formula used to calculate the sample size.
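
The four factors above can be combined in the standard sample size calculation for estimating a proportion. The sketch below is a minimal illustration, not a formula taken from the text: it assumes simple random sampling, uses p = 0.5 (maximum variability) when nothing is known about the population, and applies the usual finite population correction only when the first estimate exceeds 5 per cent of the population. The function name and the example figures are hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(margin_of_error, confidence=0.95, p=0.5,
                               population=None):
    """Sample size needed to estimate a proportion under simple random sampling.

    margin_of_error: half-width of the interval, e.g. 0.05 for +/- 5 points
    confidence:      confidence level, e.g. 0.95
    p:               assumed proportion; 0.5 gives the largest (safest) size
    population:      population size, used only if the sample would exceed
                     5 per cent of it
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 for 95%
    n = (z ** 2) * p * (1 - p) / margin_of_error ** 2
    if population is not None and n / population > 0.05:
        n = n / (1 + (n - 1) / population)               # finite population correction
    return math.ceil(n)

print(sample_size_for_proportion(0.05))                        # large population
print(sample_size_for_proportion(0.05, population=2000))       # small population
print(sample_size_for_proportion(0.03))                        # tighter margin
```

Tightening the margin of error from 5 to 3 percentage points roughly triples the required sample, whereas a very large population leaves the answer unchanged, which is the point made in (2) and (4).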
iv) Types of Sampling (discussed in detail in Chapter 3: Sampling)

Step VII: Analysis of data
The word 'analysis' has two component parts, the prefix 'ana' meaning 'above' and the Greek root 'lysis'
meaning 'to break up or dissolve'. Thus data analysis can be described as:
"...a process of resolving data into its constituent components, to reveal its characteristic elements and
Where the data is quantitative there are three determinants of the appropriate statistical tools for the
purposes of analysis. These are the number of samples to be compared, whether the samples being
compared are independent of one another and the level of data measurement.
Suppose a fruit juice processor wishes to test the acceptability of a new drink based on a novel
combination of tropical fruit juices. There are several alternative research designs which might be
employed, each involving different numbers of samples.

Test | Research design | Number of samples
Test A | Comparing sales in a test market and the market share of the product it is targeted to replace. | 1
Test B | Comparing the responses of a sample of regular drinkers of fruit juices to those of a sample of non-fruit juice drinkers to a trial formulation. | 2
Test C | Comparing the responses of samples of heavy, moderate and infrequent fruit juice drinkers to a trial formulation. | 3

The next consideration is whether the samples being compared are dependent (i.e. related) or
independent of one another (i.e. unrelated). Samples are said to be independent, or unrelated, when the
measurement taken from one sample in no way affects the measurement taken from another sample.
Take for example the outline of test B above. The measurement of the responses of fruit juice drinkers to
the trial formulation in no way affects or influences the responses of the sample of non-fruit juice
drinkers. Therefore, the samples are independent of one another. Suppose however a sample were given
two formulations of fruit juice to taste. That is, the same individuals are asked first to taste formulation X
and then to taste formulation Y. The researcher would have two sets of sample results, i.e. responses to
product X and responses to product Y. In this case, the samples would be considered dependent or
related to one another. This is because the individual will make a comparison of the two products and
his/her response to one formulation is likely to affect his/her reaction or evaluation of the other product.
The third factor to be considered is the level of measurement of the data being used. Data can be
nominal, ordinal, interval or ratio scaled. The table headed 'Levels of measurement' summarises the
mathematical properties of each of these levels of measurement.

Once the researcher knows how many samples are to be compared, whether these samples are related or
unrelated to one another and the level of measurement then the selection of the appropriate statistical test
is easily made. To illustrate the importance of understanding these connections consider the following
simple, but common, question in research. In many instances the age of respondents will be of interest.
This question might be asked in either of the two following ways:
Please indicate to which of the following age categories you belong :
(a) 15-21 years ___
22 - 30 years ___
Over 30 years ___
(b) How old are you? ___ Years

Levels of measurement
Measurement level | Examples | Mathematical properties
Nominal (frequency counts) | Producing grading categories | Confined to a small number of tests using the mode and ...
Ordinal (ranking of items) | Placing brands of cooking oil in order of preference | Wide range of nonparametric tests which test for order
Interval (relative differences of magnitude between ...) | Scoring products on a 10 point scale of like/dislike | Wide range of parametric tests
Ratio (absolute differences of ...) | Stating how much better one product is than another in absolute terms | All arithmetic operations
Choosing format (a) would give rise to nominal (or categorical) data and format (b) would yield ratio
scaled data. These are at opposite ends of the hierarchy of levels of measurement. If, by accident or design,
format (a) were chosen then the analyst would have only a very small set of statistical tests that could be
applied, and these are not very powerful in the sense that they are limited to showing association between
variables and could not be used to establish cause-and-effect. Format (b), on the other hand, since it gives
the analyst ratio data, allows all statistical tests to be used, including the more powerful parametric tests
which, within a suitably controlled research design, can help to establish cause-and-effect where it exists.
Thus a simple change in the wording of a question can have a fundamental effect upon the nature of the
data generated.
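
One practical consequence is that exact ages collected with format (b) can always be collapsed into the categories of format (a), whereas the reverse is impossible. A minimal sketch of this one-way conversion follows; the function name and the sample ages are made up for illustration, and ages below 15 fall outside the categories offered in the question, so they are returned as None.

```python
def age_category(age):
    """Map an exact age (format b) onto the answer categories of format (a)."""
    if age < 15:
        return None  # outside the categories offered in the question
    if age <= 21:
        return "15-21 years"
    if age <= 30:
        return "22-30 years"
    return "Over 30 years"


ages = [19, 24, 35, 22, 41]  # hypothetical answers to "How old are you?"
categories = [age_category(a) for a in ages]

# An arithmetic mean is meaningful for the exact ages only; the category
# labels support little more than frequency counts.
mean_age = sum(ages) / len(ages)
```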


Selecting statistical tests
The individual responsible for commissioning the research may be unfamiliar with the technicalities of
statistical tests but he/she should at least be aware that the number of samples, their dependence or
independence and the levels of measurement affect how the data can be analysed. Those who
submit research proposals involving quantitative data should demonstrate an awareness of the factors
that determine the mode of analysis and a capability to undertake such analysis.
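The way the three determinants (number of samples, dependence and level of measurement) narrow the choice of test can be sketched as a simple lookup. The pairings below are the conventional ones found in many introductory statistics texts; they are an assumption added for illustration, not a table from this text, and the function name is hypothetical. Ratio data are treated like interval data, and three or more samples are grouped together.

```python
# Conventional textbook pairings; illustrative, not exhaustive or authoritative.
TESTS = {
    ("nominal", 1, None): "chi-square goodness-of-fit test",
    ("nominal", 2, "independent"): "chi-square test of independence",
    ("nominal", 2, "related"): "McNemar test",
    ("nominal", 3, "independent"): "chi-square test of independence",
    ("nominal", 3, "related"): "Cochran's Q test",
    ("ordinal", 1, None): "Kolmogorov-Smirnov one-sample test",
    ("ordinal", 2, "independent"): "Mann-Whitney U test",
    ("ordinal", 2, "related"): "Wilcoxon signed-rank test",
    ("ordinal", 3, "independent"): "Kruskal-Wallis test",
    ("ordinal", 3, "related"): "Friedman test",
    ("interval", 1, None): "one-sample t-test",
    ("interval", 2, "independent"): "independent-samples t-test",
    ("interval", 2, "related"): "paired-samples t-test",
    ("interval", 3, "independent"): "one-way analysis of variance",
    ("interval", 3, "related"): "repeated-measures analysis of variance",
}


def suggest_test(level, n_samples, related=False):
    """Suggest a commonly used test for the given research situation."""
    if level not in ("nominal", "ordinal", "interval", "ratio"):
        raise ValueError("unknown level of measurement: " + level)
    if n_samples < 1:
        raise ValueError("at least one sample is required")
    scale = "interval" if level == "ratio" else level
    k = min(n_samples, 3)  # three or more samples share one entry
    relation = None if k == 1 else ("related" if related else "independent")
    return TESTS[(scale, k, relation)]
```

For instance, Test B of the fruit juice example (two independent samples) with ordinal ratings would point to suggest_test("ordinal", 2), while the same individuals tasting formulations X and Y on an interval scale would correspond to suggest_test("interval", 2, related=True).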
Researchers have to plan ahead for the analysis stage. It often happens that data processing begins whilst
the data gathering is still underway. Whether the data is to be analysed manually or through the use of a
computer program, data can be coded, cleaned (i.e. errors removed) and the proposed analytical tests
tried out to ensure that they are effective before all of the data has been collected.
Another important aspect relates to logistics planning. This includes ensuring that once the task of
preparing the data for analysis has begun there is a steady and uninterrupted flow of completed data
forms or questionnaires back from the field interviewers to the data processors. Otherwise, the whole
exercise becomes increasingly inefficient. A second logistical issue concerns any plan to build up a picture
of the pattern of responses as the data comes flowing in. This may require careful planning of the
sequencing of fieldwork. For instance, suppose that research was being undertaken within a particular
agricultural region with a view to establishing the size, number and type of milling enterprises which had
established themselves in rural areas following market liberalisation. It may be that in the west of the
district under study mainly wheat is grown, whilst in the east maize is the major crop. It would
make sense to coordinate the fieldwork with data analysis so that the interim picture was of either wheat
or maize milling, since the two are likely to differ in terms of the type of mill used (e.g. hammer versus
plate mills) as well as screen sizes and end use (e.g. the proportions prepared for animal versus human
consumption).

Step VIII: Drawing conclusions and making recommendations
The concluding chapters of this textbook are devoted to the topic of research report writing. However,
it is perhaps worth noting that the end products of research are conclusions and recommendations. With
respect to the marketing planning function, research helps to identify potential threats and opportunities,
generates alternative courses of action, provides information to enable marketing managers to evaluate
those alternatives and advises on the implementation of the alternatives.
Too often research reports chiefly comprise a lengthy series of tables of statistics accompanied by a few
brief comments which verbally describe what is already self-evident from the tables. Without
interpretation, data remains of potential, as opposed to actual, use. When conclusions are drawn from raw
data and when recommendations are made then data is converted into information. It is information
which management needs to reduce the inherent risks and uncertainties in management decision making.
Customer oriented researchers will have noted from the outset of the research which topics and issues
are of particular importance to the person(s) who initiated the research and will weight the content of
their reports accordingly. That is, the researcher should determine what the marketing manager's
priorities are with respect to the research study. In particular he/she should distinguish between what
the managers:
1. must know
2. should know
3. could know
This means that there will be information that is essential in order for the manager to make the particular
decision with which he/she is faced (must know), information that would be useful to have if time and
resources within the budget allocation permit (should know) and there will be information that it would
be nice to have but is not at all directly related to the decision at hand (could know). In writing a research
proposal, experienced researchers would be careful to limit the information which they firmly promise to
obtain, in the course of the study, to that which is considered 'must know' information. Moreover, within
their final report, experienced researchers will ensure that the greater part of the report focuses upon
'must know' type information.

Review Questions:
1. Define the concept of research and analyze its characteristics.
2. Write an essay on various types of research.
3. Explain the significance of research in various functional areas of business.
4. What are the difficulties faced by researchers of business in India?
5. What is meant by research process? What are the various stages / aspects involved in the research process?
6. What do you mean by a method of research? Briefly explain different methods of research.
7. Explain the significance of research in various functional areas of business.
8. What is Survey Research? How is it different from Observation Research?
9. Write short notes on:
a) Case Research
b) Experimental Research
c) Historical Research
d) Comparative Method of research


Chapter 2: Research Design
The decisions regarding what, where, when, how much, by what means concerning a research project
constitute a research design. A research design is the arrangement of conditions for collection and
analysis of data in a manner that aims to combine relevance to the research purpose with economy in
procedure. In fact, the research design is the conceptual structure within which research is conducted; it
constitutes the blueprint for the collection, measurement and analysis of data. As such the design
includes an outline of what the researcher will do from writing the hypothesis and its operational
implications to the final analysis of data. More explicitly, the design decisions happen to be in respect of:

- What is the study about?
- Why is the study being made?
- Where will the study be carried out?
- What type of data is required?
- Where can the required data be found?
- What periods of time will the study include?
- What will be the sample design?
- What techniques of data collection will be used?
- How will the data be analysed?
- In what style will the report be prepared?

Meaning of Research Design
Research design is also known by different names such as research outline, plan or blueprint. In the words
of Fred N. Kerlinger, it is the plan, structure and strategy of investigation conceived so as to obtain
answers to research questions and control variance. The plan includes everything the investigator will do
from writing the hypotheses and their operational implications to the final analysis of data. The structure
is the outline, the scheme, the paradigm of the operation of the variables. The strategy includes the
methods to be used to collect and analyze the data. At the beginning this plan (design) is generally vague
and tentative. It undergoes many modifications and changes as the study progresses and insights into it
deepen. The working out of the plan consists of making a series of decisions with respect to what, why,
where, when, who and how of the research.
According to Pauline V. Young, a research design is the logical and systematic planning and directing of
a piece of research. According to Roger E. Kirk, research designs are plans that specify how data should
be collected and analyzed.
Research design is the plan, structure and strategy of investigation conceived so as to obtain answers
to research questions and to control variance. (Kerlinger)
A research design is the specification of methods and procedures for acquiring the information needed. It is
the overall operational pattern or framework of the project that stipulates what information is to be
collected from which sources by what procedures. (Green and Tull)
The research has to be geared to the available time, energy, money and to the availability of data. There is
no such thing as a single or correct design. Research design represents a compromise dictated by many
practical considerations that go into research.

Why is research design required?
Research design is needed because it facilitates the smooth sailing of the various research operations,
thereby making research as efficient as possible yielding maximal information with minimal expenditure
of effort, time and money.
Just as the economical and attractive construction of a house requires a blueprint (or what is commonly
called the map of the house), well thought out and prepared by an expert architect, so we need a
research design or a plan in advance of data collection and analysis for our research project. Research
design stands for advance planning of the methods to be adopted for collecting the relevant data and the
techniques to be used in their analysis.

Functions of Research Design

Regardless of the type of research design selected by the investigator, all plans perform one or more
functions outlined below.
i) It provides the researcher with a blueprint for studying research questions.
ii) It dictates boundaries of research activity and enables the investigator to channel his energies in a
specific direction.
iii) It enables the investigator to anticipate potential problems in the implementation of the study.
iv) The common function of designs is to assist the investigator in providing answers to various kinds of
research questions.

A study design includes a number of component parts which are interdependent and which demand a
series of decisions regarding the definitions, methods, techniques, procedures, time, cost and
administration aspects.

Features of good Design
A research design is basically a plan of action. Once the research problem has been selected, the study
must be executed to obtain results. How should the researcher go about it? What is its scope? What are
the sources of data? What is the method of enquiry? What is the time frame? How are the data to be
recorded and analysed? What are the tools and techniques of analysis? What manpower and organization
are required? What resources are required? These and many similar questions demand decisions at the
very beginning, so that there is greater clarity about the research study. It is similar to having a building
plan before the building is constructed. Thus, according to P.V. Young, the various considerations which
enter into making decisions regarding what, where, when, how much and by what means constitute a plan
of study or a study design. Usually the features or components of a research design are as follows:
1) Need for the Study: Explain the need for and importance of this study and its relevance.
2) Review of Previous Studies: Review the previous works done on this topic, understand what they did,
identify gaps and make a case for this study and justify it.
3) Statement of Problem: State the research problem in clear terms and give a title to the study.
4) Objectives of Study: What is the purpose of this study? What are the objectives you want to achieve by
this study? The statement of objectives should not be vague. They must be specific and focused.
5) Formulation of Hypothesis: Conceive possible outcomes or answers to the research questions and
formulate them as hypotheses so that they can be tested.
6) Operational Definitions: If the study is using uncommon concepts or unfamiliar tools or using even
the familiar tools and concepts in a specific sense, they must be specified and defined.
7) Scope of the Study: It is important to define the scope of the study, because the scope decides what is
within its purview and what is outside.
Scope includes the geographical, content and chronological scope of the study. The territorial area to
be covered by the study should be decided, e.g. only Delhi, the northern states or all India. Content
scope depends on the problem: for a study of, say, industrial relations in a particular organization,
decide which aspects are to be studied and which fall outside the study. Chronological scope, i.e. the
selection of the time period and its justification, is also important: the study may relate to a single
point in time or be longitudinal, say 1991-2003.
8) Sources of Data: This is an important stage in the research design. At this stage, keeping in view the
nature of research, the researcher has to decide the sources of data from which the data are to be
collected. Basically the sources are divided into primary sources (field sources) and secondary sources
(documentary sources). Data from a primary source are called primary data, and data from a
secondary source are called secondary data. Hence, the researcher has to decide whether to collect from
primary sources, secondary sources or both.
9) Method of Collection: After deciding the sources for data collection, the researcher has to determine
the methods to be employed for data collection, primarily, either census method or sampling method.
This decision may depend on the nature, purpose, scope of the research and also time factor and financial
10) Tools & Techniques: The tools and techniques to be used for collecting data such as observation,
interview, survey, schedule, questionnaire, etc., have to be decided and prepared.
11) Sampling Design: If it is a sample study, the sampling techniques, the size of sample, the way
samples are to be drawn etc., are to be decided.
12) Data Analysis: How are you going to process and analyze the data and information collected? What
simple or advanced statistical techniques are going to be used for analysis and testing of hypothesis, so
that necessary care can be taken at the collection stage.
13) Presentation of the Results of Study: How are you going to present the results of the study? How
many chapters? What is the chapter scheme? The chapters, their purpose, their titles have to be outlined.
It is known as chapterisation.
14) Time Estimates: What is the time available for this study? Is it limited or unlimited? Generally, it
is a time-bound study. The available or permitted time must be apportioned between the different
activities, and the activities must be carried out within the specified time. For example: preparation of
research design one month, preparation of questionnaire one month, data collection two months, analysis
of data two months, drafting of the report two months, etc.
15) Financial Budget: The design should also take into consideration the various costs involved and the
sources available to meet them. The expenditures include salaries (if any), printing and stationery,
postage and telephone, computer and secretarial assistance, etc.
16) Administration of the Enquiry: How is the whole thing to be executed? Who does what and when?
All these activities have to be organized systematically, research personnel have to be identified and
trained. They must be entrusted with the tasks, the various activities are to be coordinated and the whole
project must be completed as per schedule.
Research designs provide guidelines for investigative activity and not necessarily hard and fast rules
that must remain unbroken. As the study progresses, new aspects, new conditions and new connecting
links come to light and it is necessary to change the plan / design as circumstances demand. A universal
characteristic of any research plan is its flexibility. Depending upon the method of research, the designs
are also known as survey design, case study design, observation design and experimental design.

Types of research designs
Various types of research design can be classified under three titles viz :

I) Experimental Research Design
II) Exploratory Research Design
III) Descriptive research Design

I) Experimental Research Design can also be called hypothesis testing research design. It refers to
that research process in which one or more variables are manipulated under conditions that permit the
collection of data that show the effect, if any, of such variables in unconfused fashion.

Basic Principles of Experimental Designs
There are three principles of experimental designs:
1. Principle of Replication
2. Principle of Randomization
3. Principle of Local Control
Now let us discuss each of these principles in turn.
1. Principle of Replication
Under this principle, the experiment should be repeated more than once. Thus, each treatment is applied in
many experimental units instead of one. By doing so the statistical accuracy of the experiment is
increased. For example, suppose we are to examine the effect of two varieties of rice.
For this purpose, we may divide the field into two parts and grow one variety in one part and the other
variety in the other part. We can then compare the yield of the two parts and draw conclusion on that
basis. But if we are to apply the principle of replication to this experiment, then we first divide the field
into several parts, grow one variety in half of these parts and the other variety in the remaining parts. We
can then collect the data of yield of the two varieties and draw conclusion by comparing the same. The
result so obtained will be more reliable in comparison to the conclusion we draw without applying the
principle of replication. The entire experiment can even be repeated several times for better results.
Conceptually replication does not present any difficulty, but computationally it does. For example, if an
experiment requiring a two-way analysis of variance is replicated, it will then require a three-way
analysis of variance since replication itself may be a source of variation in the data. However, it should be
remembered that replication is introduced in order to increase the precision of a study; that is to say, to
increase the accuracy with which the main effects and interactions can be estimated.
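The gain in precision from replication can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that plot yields of each variety vary independently with the same standard deviation sigma; under that assumption the standard error of the difference between two variety means, each based on r replicates, is sigma * sqrt(2 / r). The function name and the value of sigma are hypothetical.

```python
import math


def se_of_difference(sigma, replicates):
    """Standard error of the difference between two treatment means."""
    if replicates < 1:
        raise ValueError("at least one replicate per treatment is needed")
    return sigma * math.sqrt(2 / replicates)


sigma = 4.0  # hypothetical plot-to-plot standard deviation of yield
for r in (1, 2, 4, 8):
    # The standard error shrinks as the number of replicates grows.
    print(r, round(se_of_difference(sigma, r), 2))
```

Quadrupling the number of replicates halves the standard error, which is why the comparison of the two rice varieties becomes more reliable as the field is divided into more parts.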

2. Principle of Randomization
This principle indicates that we should design or plan the experiment in such a way that the variations
caused by extraneous factors can all be combined under the general heading of chance. For example, if
we grow one variety of rice in, say, the first half of the parts of a field and the other variety in the
other half, then it is just possible that the soil fertility may be different in the first half in comparison to
the other half. If this is so, our results would not be realistic. In such a situation, we may assign the variety
of rice to be grown in different parts of the field on the basis of some random sampling technique, i.e., we
may apply the randomization principle and protect ourselves against the effects of the extraneous factors
(soil fertility differences in the given case).
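A random assignment of the two rice varieties to the parts of the field might be sketched as follows. This is a minimal illustration rather than a prescribed procedure: the function name, the variety labels and the number of plots are hypothetical, and the seed is fixed only so that the layout can be reproduced.

```python
import random


def randomize_plots(n_plots, treatments, seed=None):
    """Randomly assign treatments to plots, with an equal number of plots each."""
    if n_plots % len(treatments) != 0:
        raise ValueError("plots must divide evenly among the treatments")
    rng = random.Random(seed)
    # Start from a balanced list, then shuffle so that position in the field
    # (and hence soil fertility) is unrelated to the variety grown.
    layout = list(treatments) * (n_plots // len(treatments))
    rng.shuffle(layout)
    return layout


layout = randomize_plots(8, ["variety A", "variety B"], seed=1)
```

Each variety still occupies half of the plots, but which plots it occupies is left to chance, so any fertility gradient across the field is no longer tied to one variety.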
3. The Principle of Local Control
It is another important principle of experimental designs. Under it the extraneous factor, the known
source of variability, is made to vary deliberately over as wide a range as necessary, and this needs to be
done in such a way that the variability it causes can be measured and hence eliminated from the
experimental error.
This means that we should plan the experiment in a manner that we can perform a two-way analysis of
variance, in which the total variability of the data is divided into three components attributed to
treatments (varieties of rice in our case), the extraneous factor (soil fertility in our case) and experimental
error.
In other words, according to the principle of local control, we first divide the field into several
homogeneous parts, known as blocks, and then each such block is divided into parts equal to the number
of treatments. Then the treatments are randomly assigned to these parts of a block.
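The three-way split of variability described above can be illustrated with a small calculation for a layout with one plot per treatment in each block. The yields are invented for illustration and the function name is hypothetical; the sums of squares follow the usual two-way analysis of variance without replication.

```python
def two_way_anova_ss(yields):
    """Split total variability into treatment, block and error components.

    yields[b][t] is the yield of treatment t in block b (one plot each).
    """
    n_blocks = len(yields)
    n_treat = len(yields[0])
    grand = sum(sum(row) for row in yields) / (n_blocks * n_treat)
    block_means = [sum(row) / n_treat for row in yields]
    treat_means = [sum(row[t] for row in yields) / n_blocks for t in range(n_treat)]

    ss_total = sum((y - grand) ** 2 for row in yields for y in row)
    ss_blocks = n_treat * sum((m - grand) ** 2 for m in block_means)
    ss_treat = n_blocks * sum((m - grand) ** 2 for m in treat_means)
    # Whatever is not explained by treatments or blocks is experimental error.
    ss_error = ss_total - ss_blocks - ss_treat
    return {"treatments": ss_treat, "blocks": ss_blocks,
            "error": ss_error, "total": ss_total}


# Hypothetical yields: three blocks of differing fertility, two rice varieties.
yields = [[20, 24],
          [25, 31],
          [30, 32]]
ss = two_way_anova_ss(yields)
```

With these made-up figures the total sum of squares of 112 splits into 24 for varieties, 84 for blocks and 4 for error. Had soil fertility not been isolated as a block effect, its 84 units of variability would have been left in the experimental error and would have obscured the difference between varieties.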

Some of the commonly used experimental designs are:
1) After-only design
2) Before-after design
3) Before-after with control group design
4) After-only with control group design

1) "After-only" designs
As the name suggests, with after-only experimental designs measures of the dependent variable are
only taken after the experimental subjects have been exposed to the independent variable. This is
a common approach in advertising research where a sample of target customers is interviewed
following exposure to an advertisement and their recall of the product, brand, or sales features is
measured. The advertisement could be one appearing on national television and/or radio or may appear
in magazines, newspapers or some other publication. The amount of information recalled by the sample
is taken as an indication of the effectiveness of the advertisement.
The chief problem with after-only designs is that they do not afford any control over extraneous factors
that could have influenced the post-exposure measurements. For example, marketing extension
personnel might have completed a trial campaign to persuade small-scale poultry producers, in a
localised area, to make use of better quality feeds in order to improve the marketability and price of the
end product. The decision to extend the campaign to other districts will depend on the results of this trial.
After-only measures are taken, following the campaign, by checking poultry feed sales with merchants
operating within the area. Suppose a rise in sales of good quality poultry feed mixes occurs four weeks
after the campaign ends. It would be dangerous to assume that this sales increase is wholly due to the
work of the marketing extension officers. A large part of the increase may be due to other factors such as
promotional activity on the part of feed manufacturers and merchants who took advantage of the
campaign, of which they were forewarned, and timed their marketing programme to coincide with the
extension campaign. If the extension service erroneously drew the conclusion that the sales increase was
entirely due to their own promotional activity, then they might be misled into repeating the same
campaign in other areas where there would not necessarily be the same response from feed
manufacturers and merchants.
After-only designs are not true experiments since little or no control is exercised over any of the variables
by the researcher. However, their inclusion here serves to underline the need for more complex designs.

2) "Before-after" designs
A before-after design involves the researcher in measuring the dependent variable both before and after
the participants have been exposed to the independent variable.
The before-after design is an improvement upon the after-only design, in that the effect of the
independent variable, if any, is established by observing differences between the value of the dependent
variable before and after the experiment. Nonetheless, before-after designs still have a number of
weaknesses.
Consider the case of the vegetable packer who is thinking about sending his/her produce to the
wholesale market in more expensive, but more protective, plastic crates, instead of cardboard boxes. The
packer is considering doing so in response to complaints from commissioning agents that the present
packaging affords little protection to produce from handling damage. The packer wants to be sure that
the economics of switching to plastic crates makes sense. Therefore, the packer introduces the plastic
crates for a trial period. Before introducing these crates, the packer records the prices received for his/her
top grade produce. Unless prices increase by more than the additional cost of plastic crates then there is
no economic advantage to using the more expensive packaging.
Suppose, for instance, that the packer was receiving Rs. 15 per crate, when these were of the cardboard
type, but that the price after the introduction of plastic crates had risen to Rs. 17 per crate. The Rs. 2 difference
would be attributed to better quality produce reaching the market as a result of the protection afforded
by the plastic crates. However, there are several equally reasonable explanations for the upward drift in
produce prices including a shortfall in supply, a fall in the quality of produce supplied by competitors
who operate in areas suffering adverse weather conditions, random fluctuation in prices, etc.

3) "Before-after with control group" design
This design involves establishing two samples or groups of respondents: an experimental group that
would be exposed to the marketing variable and a control group which would not be subjected to the
marketing variable under study. The two groups would be matched. That is, the two samples would be
identical in all important respects. The idea is that any confounding factors would impact equally on both
groups and therefore any differences in the data drawn from the two groups can be attributed to the
experimental variable.
The table below depicts how an experiment measuring the impact of a sugar beet seed promotional
campaign on brand awareness might be configured with a control group.
An example of a before-after with control group design
Measure | Experimental group | Control group
'Before' measure: % recalling Brand X sugar beet seed | 25.5% | 25.5%
Exposed to promotional campaign | Yes | No
'After' measure: % recalling Brand X sugar beet seed | 34.5% | 24.5%
First, the two groups would be matched: attributes such as age distribution of group members, spread of
sizes of farms operated, types of farms operated, ratio of dependence on hand tools, animal drawn tools
and tractor mounted equipment, etc. would be matched within each group so that the groups are
interchangeable for the purposes of the test. The initial level of awareness of the sugar beet brand would
be recorded within each group. Only the experimental group would see the test promotional campaign.
After the campaign, a second measure of brand awareness would be taken from each group. Any
difference between the 'after' and 'before' measurements of the control group (C2 - C1) would be due to
uncontrolled variables. Differences between the 'after' and 'before' measurements in the experimental
group (E2 - E1) would be the result of the experimental variable plus the same uncontrolled variables
affecting the control group. Isolating the effect of the experimental variable is simply a matter of
subtracting the difference in the two measurements of the control group from the difference in the two
measures taken from the experimental group. To illustrate the computation consider the following
hypothetical figures.
Awareness of the brand within the experimental group has increased by 9 percent. At the same time, the
awareness level, within the control group, appears to have fallen by 1 percent. This could be due to
random fluctuations or a real lowering of awareness due to some respondents forgetting the brand in the
absence of any supporting advertisements/promotions. Thus the effects of the test campaign would seem
to have been:
Effect of experimental variable = (34.5 - 25.5) - (24.5 - 25.5)
= (9%) - (-1%) = 10%
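
The same computation can be written as a short function. This is only a sketch of the subtraction (E2 - E1) - (C2 - C1) described above; the function and parameter names are hypothetical, and the figures are the hypothetical recall percentages from the example.

```python
def campaign_effect(e_before, e_after, c_before, c_after):
    """Effect of the experimental variable: (E2 - E1) - (C2 - C1)."""
    change_in_experimental = e_after - e_before  # campaign plus uncontrolled factors
    change_in_control = c_after - c_before       # uncontrolled factors only
    return change_in_experimental - change_in_control


effect = campaign_effect(e_before=25.5, e_after=34.5, c_before=25.5, c_after=24.5)
print(effect)  # 10.0 percentage points
```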

If a "before and after with control group" experiment is properly designed and executed then the effects
of maturation, pre-testing and measurement variability should be the same for the experimental group as
for the control group. In this case, these factors appear to have had a negative effect on awareness of one
percent. Had it not been for the experimental variable, the experimental group would have shown a
similar fall in awareness over the period of the test. Instead of recording a fall in the level of awareness of
the sugar beet brand, the experimental group actually showed a nine percent increase in brand
awareness. However, the design is not guaranteed to be unflawed. The accurate matching of the two
groups is a difficult, some would say impossible, task. Moreover, over time the rate and extent of
mortality, or drop out, is likely to vary between the groups and create additional problems in maintaining
a close match between groups.

4) "After-only with control group" experimental design
Again, this design involves establishing two matched samples or groups of respondents. There is no
measurement taken from either group before the experimental variable is introduced and the control
group is not subsequently subjected to the experimental variable. Afterwards measures are taken from
both groups and the effect of the experimental variable is established by deducting the control group
measure from the experimental group measure. An illustrative example will help clarify the procedure.
A Sri Lankan food technology research institute was trying to convince small-scale food processors to
adopt solar dryers to produce dried plantain and other dehydrated vegetables. Much of the initial
resistance to the adoption of this technology was due to the belief that the taste characteristics of this
snack food would be altered from those of traditional sun-dried plantain. The research institute was able
to convince the food manufacturers that there would be no perceptible changes in the taste characteristics
by carrying out an "after-only with control group" experiment. Sensory analysis experiments conclusively
showed that almost none of the participants were able to discriminate between plantain dehydrated by
means of the solar powered dryer and that which was sun-dried.

Many product tests are of the "after-only with control group" type. This design escapes the problems of
pre-testing, history and maturation. However, an "after-only" design does not facilitate an
analysis of the process of change, whereas a comparable "before-after" design would. In a before-after
design, the attitudes, opinions and/or behaviour of individual participants can be recorded both before
and afterwards and changes noted. For instance, the attitudes of participants who were unfavourable in
the "before" measurement can be compared with the attitudes they hold after exposure to the experimental
variable. Changes among those who held favourable attitudes in the "before" measurement can be assessed
in the same way.

II) Exploratory research design
Exploratory research: what it is and what it is not
Exploratory research helps ensure that a rigorous and conclusive study will not begin with an inadequate
understanding of the nature of the problem. Most exploratory research designs provide qualitative data,
which give a greater understanding of a concept. In contrast, quantitative data provide precise measurement.
Exploratory research may be a single research investigation or it may be a series of informal studies; both
methods provide background information. Researchers must be creative in the choice of information
sources. They should explore all appropriate inexpensive sources before embarking on expensive
research of their own. However, they should still be systematic and careful at all times.

Why conduct exploratory research?
There are three purposes for conducting exploratory research; all three are interrelated:
A. Diagnosing a situation: Exploratory research helps diagnose the dimensions of problems so that
successive research projects will be on target.
B. Screening alternatives: When several opportunities arise and budgets restrict the use of all possible
options, exploratory research may be utilized to determine the best alternatives. Certain evaluative
information can be obtained through exploratory research. Concept testing is a frequent reason for
conducting exploratory research. Concept testing refers to those research procedures that test some sort
of stimulus as a proxy for a new, revised, or remarketed product or service. Generally, consumers are
presented with an idea and asked whether they like it, whether they would use it, etc. Concept testing is a means of
evaluating ideas by providing a feel for the merits of the idea prior to the commitment of any research
and development, marketing, etc. Concept testing portrays the functions, uses, and possible situations for
the proposed product.
C. Discovering new ideas: Uncovering consumer needs is a great potential source of ideas. Exploratory
research is often used to generate new product ideas, ideas for advertising copy, etc.

Categories of exploratory research
The purpose, rather than the technique, of the research determines whether a study is exploratory,
descriptive, or causal. A manager may choose from four general categories of exploratory research:

A. Experience surveys: Concepts may be discussed with top executives and knowledgeable managers
who have had personal experience in the field being researched. This constitutes an informal experience
survey. Such a study may be conducted by the business manager rather than the research department.
On the other hand, an experience survey may be a small number of interviews with experienced people
who have been carefully selected from outside the organization. The purpose of such a study is to help
formulate the problem and clarify concepts rather than to develop conclusive evidence.

B. Secondary data analysis: A quick and economical source of background information is trade literature
in the public library. Searching through such material is exploratory research with secondary data;
research rarely begins without such an analysis. An informal situation analysis using secondary data and
experience surveys can be conducted by business managers. Should the project need further clarification,
a research specialist can conduct a pilot study.

C. Case study method: The purpose of a case study is to obtain information from one, or a few, situations
similar to the researcher's situation. A case study has no set procedures, but often requires the
cooperation of the party whose history is being studied. However, this freedom to research makes the
success of the case study highly dependent on the ability of the researcher. As with all exploratory
research, the results of a case study should be seen as tentative.
Case study research excels at bringing us to an understanding of a complex issue or object and can extend
experience or add strength to what is already known through previous research. Case studies emphasize
detailed contextual analysis of a limited number of events or conditions and their relationships.
Researchers have used the case study research method for many years across a variety of disciplines.
Social scientists, in particular, have made wide use of this qualitative research method to examine
contemporary real-life situations and provide the basis for the application of ideas and extension of
methods. Researcher Robert K. Yin defines the case study research method as an empirical inquiry that
investigates a contemporary phenomenon within its real-life context; when the boundaries between
phenomenon and context are not clearly evident; and in which multiple sources of evidence are used.
Many well-known case study researchers such as Robert E. Stake, Helen Simons, and Robert K. Yin have
written about case study research and suggested techniques for organizing and conducting the research
successfully. This introduction to case study research draws upon their work and proposes six steps
that should be used:
1. Determine and define the research questions
2. Select the cases and determine data gathering and analysis techniques
3. Prepare to collect the data
4. Collect data in the field
5. Evaluate and analyze the data
6. Prepare the report

D. Pilot studies: The term "pilot studies" is used as a collective term for a number of diverse
research techniques all of which are conducted on a small scale. Thus, a pilot study is a research project
which generates primary data from consumers, or other subjects of ultimate concern. There are four
major categories of pilot studies:
1. Focus group interviews: These interviews are free-flowing interviews with a small group of people.
They have a flexible format and can discuss anything from a brand to the product itself. The group typically
consists of six to ten participants and a moderator. The moderator's role is to introduce a topic and to
encourage the group to discuss it among themselves. There are four primary advantages of the focus
group: (1) it allows people to discuss their true feelings and convictions, (2) it is relatively fast, (3) it is
easy to execute and very flexible, (4) it is inexpensive.
One disadvantage is that a small group of people, no matter how carefully they are selected, will not be
representative of the wider population.
Specific advantages of focus group interviews have been categorized as follows:
a) Synergism: the combined effort of the group will produce a wider range of information, insights and
ideas than will the cumulation of separately secured responses.
b) Serendipity: an idea may drop out of the blue, and affords the group the opportunity to develop such
an idea to its full significance.
c) Snowballing: a bandwagon effect occurs. One individual often triggers a chain of responses from the
other participants.
d) Stimulation: respondents want to express their ideas and expose their opinions as the general level of
excitement over the topic increases.
e) Security: the participants are more likely to be candid because they soon realize that the things said are
not being identified with any one individual.
f) Spontaneity: people speak only when they have definite feelings about a subject; not because a
question requires an answer.
g) Specialization: the group interview allows the use of a more highly trained moderator because there
are certain economies of scale when a large number of people are "interviewed" simultaneously.
h) Scientific scrutiny: the group interview can be taped or even videoed for observation. This affords
closer scrutiny and allows the researchers to check for consistency in the interpretations.
i) Structure: the moderator, being one of the group, can control the topics the group discusses.
j) Speed: a number of interviews are, in effect, being conducted at one time.

The ideal size for a focus group is six to ten relatively homogeneous people. A group of this size is large
enough that one or two members are unlikely to intimidate the others, yet small enough to allow adequate
participation. Homogeneous groups avoid the confusion which might occur if there were too many differing
viewpoints. Researchers who wish to collect information from different groups should conduct several
different focus groups.
The sessions should be as relaxed and natural as possible. The moderator's job is to develop a rapport
with the group and to promote interaction among its members. The discussion may start out general, but
the moderator should be able to focus it on specific topics.
An effective focus group moderator prepares a discussion guide to help ensure that the focus group will
cover all topics of interest. The discussion guide consists of written prefatory remarks to inform the
group about the nature of the focus group and an outline of topics/questions that will be addressed in
the group session.
The focus group technique has two shortcomings:
- Without an experienced moderator, a self-appointed leader may dominate the session, resulting in
an abnormal "halo effect" on the interview.
- There may be sampling problems.

2. Interactive Media and online Focus Group: When a person uses the Internet, he or she interacts with a
computer. It is an interactive medium because the user clicks a command and the computer responds. The
use of the Internet for qualitative exploratory research is growing rapidly. The term online focus group
refers to qualitative research where a group of individuals provide unstructured comments by
keyboarding their remarks into a computer connected to the Internet. The group participants either
keyboard their remarks in a chat room format or when they are alone at their computers. Because
respondents enter their comments into the computer, transcripts of verbatim responses are available
immediately after the group session. Online groups can be quick and cost-efficient. However,
because there is less interaction between participants, group synergy and snowballing of ideas can suffer.
Research companies often set up a private chat room on their company Web sites for focus group
interviews. Participants in these chat rooms feel their anonymity is very secure. Often they will make
statements or ask questions they would never address under other circumstances. This can be a major
advantage for a company investigating sensitive or embarrassing issues.
Many online focus groups using the chat room format arrange for a sample of participants to be online at
the same time, typically for about 60 to 90 minutes. Because participants do not have to be together in the
same room at a research facility, the number of participants in online focus groups can be much larger
than traditional focus groups. A problem with online focus groups is that the moderator cannot see body
language and facial expressions (bewilderment, excitement, interest, etc.) to interpret how people are
reacting. Also, the moderator's ability to probe and ask additional questions on the spot is reduced in
online focus groups, especially those in which participants are not simultaneously involved. Research
that requires touch, such as a new easy-opening packaging design, or taste experiences cannot be
performed online.

3. Projective techniques: Individuals may be more likely to give a true answer if the question is
disguised. If respondents are presented with unstructured and ambiguous stimuli and are allowed
considerable freedom to respond, they are more likely to express their true feelings.
A projective technique is an indirect means of questioning that enables respondents to "project their
beliefs onto a third party." Thus, the respondents are allowed to express emotions and opinions that
would normally be hidden from others and even hidden from themselves. Common techniques are as follows:
a) Word association: The subject is presented with a list of words, one at a time, and asked to respond
with the first word that comes to mind. Both verbal and non-verbal responses are recorded. Word
association should reveal each individual's true feelings about the subject. Interpreting the results is
difficult; the researcher should avoid subjective interpretations and should consider both what the subject
said and did not say (e.g., hesitations).
b) Sentence completion method: This technique is also based on the assumption of free association.
Respondents are required to complete a number of partial sentences with the first word or phrase that
comes to mind. Answers tend to be more complete than in word association; however, the intention of
the study is more apparent.
c) Third-person technique and role playing: Providing a "mask" is the basic idea behind the third-person
technique. Respondents are asked why a third person does what he or she does, or what a third person
thinks of a product. The respondent can transfer his or her attitudes onto the third person. Role playing is a
dynamic reenactment of the third-person technique in a given situation. This technique requires the
subject to act out someone else's behavior in a particular setting.
d) Thematic apperception test (TAT): This test consists of a series of pictures in which consumers and
products are the center of attention. The investigator asks the subject what is happening in the picture
and what the people might do next. Themes ("thematic") are elicited on the basis of the perceptual-
interpretive ("apperception") use of the pictures. The researcher then analyses the content of the stories
that the subjects relate. The picture should present a familiar, interesting, and well-defined problem, but
the solution should be ambiguous. A cartoon test, or picture frustration version of TAT, uses a cartoon
drawing in which the respondent suggests dialogue that the cartoon characters might say. Construction
techniques request that the consumer draw a picture, construct a collage, or write a short story to express
their perceptions or feelings.

4. Depth interviews: Depth interviews are similar to the client interviews of a clinical psychiatrist. The
researcher asks many questions and probes for additional elaboration after the subject answers; the
subject matter is usually disguised.
Depth interviews have lost their popularity recently because they are time-consuming and expensive as
they require the services of a skilled interviewer.

- Exploratory research techniques have their limitations. Most of them are qualitative, and the
interpretation of their results is judgmental; thus, they cannot take the place of quantitative,
conclusive research.
- Because of certain problems, such as interpreter bias or sample size, exploratory findings should be
treated as preliminary. The major benefit of exploratory research is that it generates insights and
clarifies the problems for testing in future research.
- If the findings of exploratory research are very negative, then probably no further research should be
conducted. However, the researcher should proceed with caution because there is a possibility that a
potentially good idea could be rejected because of unfavorable results at the exploratory stage.
- In other situations, when everything looks positive in the exploratory stage, there is a temptation to
market the product without further research. In this situation, business managers should determine
the benefits of further information versus the cost of additional research. When a major commitment
of resources is involved, it is often well worth conducting a quantitative study.

III) Descriptive Research design
This is intended to describe certain factors that management is likely to be interested in such as market
conditions, customers' feelings or opinions toward a particular company, purchasing behaviour and so
forth. Such research is not intended to allow the researcher to establish causal relationships between
marketing variables and sales or consumer behaviour, or to enable the researcher to predict likely future
conditions. Descriptive research merely examines what is. Such research, just like exploratory research,
usually forms part of an ongoing research programme. Once the researcher has established the present
situation in terms of market size, main segments, main competitors, etc., they may then proceed to types
of research of a more predictive and/or conclusive nature. Descriptive research usually makes use of
descriptive statistics to help the user understand the structure of the data and any significant patterns
that may be found in the data. All measures of central tendency such as the mean, median and mode are
often used along with measures of dispersion such as the variance and standard deviation. Descriptive
research results are often presented using pictorial methods such as graphs, pie charts, histograms, etc.
When the nature of the initial decision problem is either to describe specific characteristics of existing
market phenomena or to evaluate current marketing mix strategies of a defined target population or
market structure, then a descriptive research design is appropriate.
If the research question(s) is linked to answering specified questions concerning who, what, where,
when, and how about known members or elements of the target population or market structures under
investigation, then the researcher should consider using a descriptive research design to gather the
needed primary data.
Remember, there are two basic ways to gather the primary data needed: observation and asking
questions. When the researcher needs to ask questions, the different approaches used are referred to as
survey methods.
Over time, descriptive research designs have come to be viewed and acknowledged as the different
survey methods available to researchers for collecting quantitative primary data from large groups of
people through the question and answer protocol process.
Examples of questions for descriptive research:
1. Do teachers hold favorable attitudes toward using computers in schools?
2. What kinds of activities that involve technology occur in 6th-grade classrooms and how frequently do
they occur?
3. Is there a relationship between experience with multimedia computers and problem solving skills?

Descriptive research can be either quantitative or qualitative.
Descriptive research involves gathering data that describe events and then organizing, tabulating, depicting,
and describing the data collected. Descriptive statistics are very important in reducing the data to
manageable form.

The Nature of Descriptive Research
1. The descriptive function of research is heavily dependent on instrumentation for measurement and
observation.
2. Once the instruments are developed, they can be used to describe phenomena of interest to the
researchers.
3. The intent of some descriptive research is to produce statistical information about aspects of education
that interest policymakers and educators.
4. There has been an ongoing debate among researchers about the value of quantitative vs. qualitative
research, with some saying descriptive research is less pure than traditional experimental, quantitative
designs.

Some of the Descriptive Techniques
The descriptive techniques that are commonly used include:
- Graphical description
o use graphs to summarize data
o examples: histograms, scatter diagrams, bar charts, pie charts
- Tabular description
o use tables to summarize data
o examples: frequency distribution schedule, cross tabs
- Parametric description
o estimate the values of certain parameters which summarize the data
    measures of location or central tendency
        arithmetic mean
        interquartile mean
    measures of statistical dispersion
        standard deviation
        statistical range
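
The tabular and parametric descriptions listed above can be illustrated with a short Python sketch. The purchase data are hypothetical, and `interquartile_mean` is a simplified helper written for this illustration (it assumes the number of observations is divisible by four); neither comes from the original text.

```python
from collections import Counter
import statistics


def interquartile_mean(values):
    """Mean of the middle half of the data.

    Simplified version: assumes len(values) is divisible by 4.
    """
    ordered = sorted(values)
    q = len(ordered) // 4
    return statistics.mean(ordered[q:len(ordered) - q])


# Hypothetical data: number of purchases made by 12 respondents
purchases = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 8, 12]

frequency_table = Counter(purchases)           # tabular description
mean = statistics.mean(purchases)              # measure of location
iq_mean = interquartile_mean(purchases)        # location, less sensitive to extremes
std_dev = statistics.stdev(purchases)          # sample standard deviation
value_range = max(purchases) - min(purchases)  # statistical range

for value, count in sorted(frequency_table.items()):
    print(f"{value:>3} purchases: {count} respondent(s)")
print(f"mean={mean:.2f}  interquartile mean={iq_mean:.2f}  "
      f"std dev={std_dev:.2f}  range={value_range}")
```

In this made-up data set the interquartile mean is lower than the arithmetic mean because it discards the two heaviest buyers, which is why it is sometimes preferred when a few extreme values are present.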

Review Questions:
1. What is a research design? Explain the functions of a research design.
2. Define a research design and explain its
3. What are the various features/components of a research design?

Chapter 3: Sampling Design
What is a Sample?
A sample is a finite part of a statistical population whose properties are studied to gain information about
the whole (Webster, 1985). When dealing with people, it can be defined as a set of respondents (people)
selected from a larger population for the purpose of a survey.
A population is a group of individuals, objects, or items from which samples are taken for
measurement, for example a population of presidents or professors, books or students.
Market research involves the collection of data to obtain insight and knowledge into the needs and
wants of customers and the structure and dynamics of a market. In nearly all cases, it would be very
costly and time-consuming to collect data from the entire population of a market. Accordingly, in market
research, extensive use is made of sampling from which, through careful design and analysis, marketers
can draw information about the market.
Sampling is the key to survey research. No matter how well a study is done in other ways, if the sample
has not been properly drawn, the results cannot be regarded as correct. Sampling applies mainly to surveys, but is
also important for planning other types of research.
Sampling is the process of selecting units (e.g., people, organizations) from a population of interest so
that by studying the sample we may fairly generalize our results back to the population from which they
were chosen. Let's begin by covering some of the key terms in sampling like "population" and "sampling
frame." Then, because some types of sampling rely upon quantitative models, we'll talk about some of
the statistical terms used in sampling. Finally, we'll discuss the major distinction between probability and
non-probability sampling methods and work through the major types in each. Sampling is often used
when conducting a census is impossible or unreasonable.

What is sampling?
Sampling is the act, process, or technique of selecting a suitable sample, or a representative part of a
population for the purpose of determining parameters or characteristics of the whole population.
What is the purpose of sampling? To draw conclusions about populations from samples, we must use
inferential statistics, which enable us to determine a population's characteristics by directly observing
only a portion (or sample) of the population. We obtain a sample rather than a complete enumeration (a
census) of the population for many reasons. Obviously, it is cheaper to observe a part rather than the
whole, but we should prepare ourselves to cope with the dangers of using samples. Some sampling methods
are better than others, but all may yield samples that are inaccurate and unreliable. We will learn how to
minimize these dangers, but some potential error is the price we must pay for the convenience and savings
the samples provide.

Why sampling?
One of the decisions to be made by a researcher in conducting a survey is whether to go for a census or a
sample survey. We obtain a sample rather than a complete enumeration (a census) of the population for
many reasons. The most important considerations for this are: cost, size of the population, accuracy of
data, accessibility of population, timeliness, and destructive observations.
1) Cost: The cost of conducting surveys through census method would be prohibitive and sampling helps
in substantial cost reduction of surveys. Since most often the financial resources available to conduct a
survey are scarce, it is imperative to go for a sample survey rather than a census.
2) Size of the Population: If the size of the population is very large, it is difficult, if not
impossible, to conduct a census. In such situations a sample survey is the only way to analyse the
characteristics of a population.
3) Accuracy of Data: Although reliable information can be obtained through a census, sometimes the
accuracy of information may be lost because of a large population. Sampling involves a small part of the
population and a few trained people can be involved to collect accurate data. On the other hand, a lot of
people are required to enumerate all the observations. Often it becomes difficult to involve trained
manpower in large numbers to collect the data thereby compromising accuracy of data collected. In such
a situation a sample may be more accurate than a census. A sloppily conducted census can provide less
reliable information than a carefully obtained sample.
4) Accessibility of Population: There are some populations that are so difficult to get access to that only a
sample can be used, e.g., people in prison, birds migrating from one place to another place etc. The
inaccessibility may be economic or time related. In a particular study, population may be so costly to
reach, like the population of planets, that only a sample can be used.
5) Timeliness: Since we are covering a small portion of a large population through sampling, it is
possible to collect the data in far less time than covering the entire population. Not only does it take less
time to collect the data through sampling but the data processing and analysis also takes less time
because fewer observations need to be covered. Suppose a company wants to get a quick feedback from
its consumers on assessing their perceptions about a new improved detergent in comparison to an
existing version of the detergent. Here the time factor is very significant. In such situations it is better to
go for a sample survey rather than a census because it saves a lot of time and the product launch decision
can be taken quickly.
6) Destructive Observations: Sometimes the very act of observing the desired characteristics of a unit of
the population destroys it for the intended use. Good examples of this occur in quality control. For
example, to test the quality of a bulb, to determine whether it is defective, it must be destroyed. To obtain
a census of the quality of a lorry load of bulbs, you have to destroy all of them. This is contrary to the
purpose served by quality-control testing. In this case, only a sample should be used to assess the quality
of the bulbs. Another example is a blood test of a patient.

The disadvantages of sampling are few but the researcher must be cautious. These are risk, lack of
representativeness and insufficient sample size, each of which can cause errors. If the researcher does not
pay attention to these flaws, they may invalidate the results.
1) Risk: Using a sample from a population and drawing inferences about the entire population involves
risk. In other words the risk results from dealing with a part of a population. If the risk is not acceptable
in seeking a solution to a problem then a census must be conducted.
2) Lack of representativeness: Determining the representativeness of the sample is the researcher's
greatest problem. By definition, sample means a representative part of an entire population. It is
necessary to obtain a sample that meets the requirement of representativeness otherwise the sample will
be biased. The inferences drawn from non-representative samples will be misleading and potentially
3) Insufficient sample size: The other significant problem in sampling is to determine the size of the
sample. The size of the sample for a valid sample depends on several factors such as extent of risk that
the researcher is willing to accept and the characteristics of the population itself.

What is Sampling Design?
Sample design covers the method of selection, the sample structure and plans for analysing and
interpreting the results. Sample designs can vary from simple to complex and depend on the type of
information required and the way the sample is selected.
Sample design affects the size of the sample and the way in which analysis is carried out. In simple terms
the more precision the market researcher requires, the more complex will be the design and the larger the
sample size.
The sample design may make use of the characteristics of the overall market population, but it does not
have to be proportionally representative. It may be necessary to draw a larger sample than would be
expected from some parts of the population; for example, to select more from a minority grouping to
ensure that sufficient data is obtained for analysis on such groups.
Many sample designs are built around the concept of random selection. This permits justifiable inference
from the sample to the population, at quantified levels of precision. Random selection also helps guard
against sample bias in a way that selecting by judgement or convenience cannot.
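
As a minimal sketch of random selection, the Python snippet below draws a simple random sample without replacement from a sampling frame. The frame of customer IDs, the sample size and the helper name `simple_random_sample` are hypothetical and chosen only for illustration.

```python
import random


def simple_random_sample(sampling_frame, sample_size, seed=None):
    """Draw a sample without replacement; every unit has an equal chance of selection."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(sampling_frame, sample_size)


# Hypothetical sampling frame of 1,000 customer IDs
frame = [f"customer_{i:04d}" for i in range(1, 1001)]
sample = simple_random_sample(frame, sample_size=50, seed=42)
print(len(sample), "units selected, for example:", sample[:3])
```

Because the choice of units is left to the random number generator rather than to the researcher's judgement or convenience, no unit is systematically favoured.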

Characteristics of a good sample Design
It is important that the sampling results reflect the characteristics of the population. Therefore,
while selecting the sample from the population under investigation it should be ensured that the sample
has the following characteristics:
1) A sample must represent a true picture of the population from which it is drawn.
2) A sample must be unbiased by the sampling procedure.
3) A sample must be taken at random so that every member of the population of data has an equal chance
of selection.
4) A sample must be sufficiently large but as economical as possible.
5) A sample must be accurate and complete. It should not leave any information incomplete and should
include all the respondents, units or items included in the sample.
6) Adequate sample size must be taken considering the degree of precision required in the results.

Sampling and non-sampling errors
The quality of a research project depends on the accuracy of the data collected and on how well the data
represent the population. There are two broad sources of error: sampling errors and non-sampling errors.
1 Sampling Errors
The principal sources of sampling error are the sampling method applied and the sample size. Sampling
error arises because only a part of the population is covered in the sample. The magnitude of the
sampling error varies from one sampling method to another, even for the same sample size. For example,
the sampling error associated with simple random sampling will be greater than that of stratified random
sampling if the population is heterogeneous in nature. Intuitively, we know that the larger the sample,
the more accurate the research.
In fact, the sampling error varies with samples of different sizes. Increasing the sample size decreases the
sampling error. The following Figure gives an approximate relationship between sample size and
sampling error. Study the following figure carefully.
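The inverse relationship between sample size and sampling error can also be sketched numerically. The
short Python example below is an illustration only: it assumes simple random sampling and a proportion
of 0.5, and uses the standard error of a proportion, sqrt(p(1 - p)/n), as the measure of sampling error.
The function name and the sample sizes are chosen for illustration.

```python
import math


def standard_error(p, n):
    """Standard error of a sample proportion under simple random sampling."""
    return math.sqrt(p * (1 - p) / n)


# Assumed proportion of 0.5 (the most variable case) and illustrative sample sizes.
sample_sizes = [100, 400, 1600]
errors = {n: standard_error(0.5, n) for n in sample_sizes}

for n, se in errors.items():
    print(f"n = {n:5d}   standard error = {se:.4f}")
```

Under these assumptions the sample must be made four times larger to halve the sampling error, which
is why the gain in accuracy flattens out as samples grow.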
2 Non-Sampling Errors
The non-sampling errors arise from faulty research design and mistakes in executing research. There are
many sources of non-sampling errors which may be broadly classified as: (a) respondent errors, and (b)
administrative errors.
a) Respondent Errors: If the respondents co-operate and give correct information, the objectives of the
researcher can be easily accomplished. In practice, however, this may not happen. A respondent may
refuse to provide information or, even if he/she provides it, the information may be biased.
If the respondent fails to provide information, we call it non-response error. Although this problem is
present in all types of surveys, it is more acute in mailed surveys. Non-response also distorts the
sample: respondents who are willing to provide information are over-represented, while those who are
indifferent are under-represented. To minimise non-response error, the researcher often re-contacts
non-respondents who were not available earlier. If the researcher finds that the non-response rate is
higher in a particular group of respondents (for example, higher income groups), additional efforts
should be made to obtain data from these under-represented groups. For example, personal interviews may
be conducted with people who do not respond to the mailed questionnaire. In a mailed questionnaire the
researcher never knows whether the respondent really refused to provide data or was simply indifferent.
There are several techniques which help to encourage respondents to reply.
Response bias occurs when the respondent does not give the correct information and misleads the
investigator in a certain direction. Respondents may consciously or unconsciously misrepresent the
truth. For example, if the investigator asks a question about the respondent's income, he/she may not
give the correct information for obvious reasons. Or the investigator may avoid putting a sensitive
question at all, to spare embarrassment. Response bias may also arise from problems in the design of the
questionnaire and the content of the questions: respondents who misunderstand the questions may
unconsciously provide biased information. It may also occur because the interviewer's presence
influences respondents to give untrue or modified answers. The tendency of respondents and interviewers
is to please the other person rather than to provide or elicit the correct information.
b) Administrative Errors: Errors that arise from improper administration of the research process are
called administrative errors. There are four types of administrative errors:
i) sample selection error,
ii) investigator error,
iii) investigator cheating, and
iv) data processing error.
i) Sample Selection Error: It is difficult to execute a sampling plan exactly as designed. For example,
we may plan to use systematic sampling in a market research study of a new product and decide to
interview every 5th customer coming out of a consumer store. If the day of interview happens to be a
working day, we exclude all those consumers who are at work. This may lead to an error because of the
unrepresentative sample selection.
ii) Investigator Error: When the investigator interviews the respondent, he/ she may fail to record the
information correctly or may fail to cross check the information provided by the respondent. Therefore,
the error may arise due to the way the investigator records the information.
iii) Investigator Cheating: Sometimes the investigator may fake the data without even meeting the
respondents concerned. There should be some mechanism to cross-check for this type of faking.
iv) Data Processing Error: Once the data are collected, the researcher's next job is to edit, code and
enter the data into a computer for further processing and analysis. Errors at this stage can be
minimised by careful editing, coding and data entry.

Control of Errors
In the above two sections we have identified the most significant sources of error. It is not possible to
eliminate them completely. However, the researcher's objective should be to minimise these sources of
error as much as possible. Some ways of reducing errors are:
(a) designing and executing a good questionnaire; (b) selection of appropriate sampling method; (c)
adequate sample size; (d) employing trained investigators to collect the data; and (e) care in editing,
coding and entering the data into the computer.

What are the Steps involved in sample Design?
The sampling design process consists of five stages:
1. Definition of population of concern
2. Specification of a sampling frame, a set of items or events that it is possible to measure
3. Specification of sampling method for selecting items or events from the frame
4. Sampling and data collecting
5. Review of sampling process

1) Population (Universe) definition
The first concept you need to understand is the difference between a population and a sample. To make a
sample, you first need a population. In non-technical language, population means "the number of people
living in an area." This meaning of population is also used in survey research, but it is only one of
many possible definitions. The word universe is sometimes used in survey research, and means exactly the
same in this context as population.
The unit of population is whatever you are counting: there can be a population of people, a population of
households, a population of events, institutions, transactions, and so forth. Anything you can count can
be a population unit. But if you can't get information from it, and you can't measure it in some way, it's
not a unit of population that is suitable for survey research.
For a survey, various limits (geographical and otherwise) can be placed on a population. Some
populations that could be covered by surveys are...
- All people living in India.
- All people aged 18 and over.
- All households in Nagpur.
- All schools in Maharashtra.
- All instances of tuning in to an FM radio station in the last seven days
...and so on. If you can express it in a phrase beginning "All," and you can count it, it's a population of
some kind. The commonest kind of population used in survey research uses the formula:
- All people aged X years and over, who live in area Y.
The "X years and over" criterion usually rules out children below a certain age, both because of the
difficulties involved in interviewing them and because many research questions don't apply to them.
Even though some populations can't be questioned directly, they're still populations. For example,
schools can't fill in questionnaires, but somebody can do so on behalf of each school. The distinction is
important when finding the answers to questions like "What proportion of Primary schools have
libraries?" You need only one questionnaire from each school - not one from each teacher, or one from
each student.
Often, the population you end up surveying is not the population you really wanted, because some part
of the population cannot be surveyed. For example, if you want to survey opinions among the whole
population of an area, and choose to do the survey by telephoning people at home, the population you
actually survey will be people with a telephone in their home. If the people with no telephone have
different opinions, you will not discover this.
As long as the surveyed population is a high proportion of the wanted population, the results obtained
should also be true for the larger population. For example, if 90% of homes have a telephone, the 10%
without a phone would have to be very different for the survey's results not to be true for the whole
population.

2) Sampling frames
A sampling frame can be one of two things: either a list of all members of a population, or a method of
selecting any member of the population. The term general population refers to everybody in a
particular geographical area. Common sampling frames for the general population are electoral rolls,
street directories, telephone directories, and customer lists from utilities which are used by almost all
households: water, electricity, sewerage, and so on.
It is best to use the list that is most accurate, most complete, and most up to date. This differs from
country to country. In some countries, the best lists are of households, in other countries, they are of
people. For most surveys, a list of households is more useful than a list of people. Another commonly
used sampling frame (which is not recommended for sampling people) is a map.

A sample is a part of the population from which it was drawn. Survey research is based on sampling,
which involves getting information from only some members of the population.
If information is obtained from the whole population, it is not a sample, but a census. Some surveys,
based on very small populations (such as all members of an organization) in fact are censuses and not
sample surveys. When you do a census, the techniques given in this book still apply, but there is no
sampling error - as long as the whole group participates in the census.
Samples can be drawn in several different ways, e.g. probability samples, quota samples, purposive
samples etc.

Sample size
Contrary to popular opinion, sample sizes do not have to be particularly large. Their size is not, as
commonly thought, determined by the size of the population they are to represent. The U.S., for
example, contains more than two hundred and fifty million people, yet the General Social Survey, a highly
valued yearly interview survey of the U.S. population, is based on a sample of around 1500 cases.
Political and attitudinal polls, such as the California Poll, typically draw a sample of around 1000, and
some local polls obtain samples of 500 or fewer. The determiners of sample size are the variability within
the population and the degree of accuracy of population estimates the researcher is willing to accept (pay
for). If you are, for example, interested in the gender distribution of crime victims, the sample could be
relatively small, because the variable has limited variability (only two categories, male and female),
compared to the size of the sample needed to make a statement of the same accuracy about the ethnicity of
crime victims (Germans, Italians, Irish, Poles, Canadians, etc.). To make a statement about the gender
makeup of crime victims that is within 3% of the population parameter, with 95% confidence, would
require a sample of roughly 1,100, while a similar statement about the ethnic makeup of victims would
require a much larger sample due to the greater variability.
For any sample design deciding upon the appropriate sample size will depend on several key factors:-
(1) No estimate taken from a sample is expected to be exact: Any assumptions about the overall
population based on the results of a sample will have an attached margin of error.
(2) To lower the margin of error usually requires a larger sample size. The amount of variability in the
population (i.e. the range of values or opinions) will also affect accuracy and therefore the size of sample.
(3) The confidence level is the likelihood that the results obtained from the sample lie within a
required precision. The higher the confidence level, that is, the more certain you wish to be that the
results are not atypical, the larger the sample needed. Statisticians often use a 95 per cent confidence
level to provide strong conclusions.
(4) Population size does not normally affect sample size. In fact, the larger the population, the lower
the proportion of that population that needs to be sampled to be representative. It is only when the
proposed sample size is more than 5 per cent of the population that the population size becomes part of
the formula used to calculate the sample size.
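The four factors above are commonly combined in a standard formula for estimating a proportion. The
Python sketch below is an illustration only and is not taken from the text: the function name
sample_size, the z-value of 1.96 for a 95 per cent confidence level and the assumed proportion p = 0.5
(maximum variability) are all assumptions. The optional population argument applies the usual finite
population correction, which matters only when the sample is a sizeable share of the population.

```python
import math


def sample_size(margin, z=1.96, p=0.5, population=None):
    """Sample size needed to estimate a proportion within +/- margin.

    z is the normal deviate for the chosen confidence level (1.96 for 95%),
    p is the assumed proportion (0.5 gives the largest, safest sample),
    population, if given, applies the finite population correction.
    """
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)


n_large = sample_size(0.03)                    # very large population
n_small = sample_size(0.03, population=5000)   # hypothetical population of 5,000

print("Large population:", n_large)
print("Population of 5,000:", n_small)
```

Halving the margin of error in this formula roughly quadruples the required sample, while the size of
the population changes the answer only when it is small.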
Sampling error is the error in sample estimates of a population characteristic. Of course you would like
to know the population characteristics precisely from your sample, but that is not likely. Suppose that
you wanted to know about the students at a school of 1000 students and you chose a random sample of 100.
With any variability at all, it is unlikely that your sample of 100 would have exactly the same
characteristics as another sample of 100 from the same 1000 students. This variation between samples is
called sampling error. It is at this point that statistics enters the picture. We know from the logic of
statistics that if we took all possible samples of 100 from our population, the distribution of the
sample means would be approximately "normal," centred on the population mean, with a spread (the
standard error) that becomes smaller as the sample size increases.
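The idea of repeated samples from the same school can be tried out with a small simulation. The Python
sketch below is an illustration under stated assumptions: the population of 1,000 test scores is
invented (normally distributed with mean 60 and standard deviation 15), 2,000 repeated samples stand in
for "all possible samples", and the seed is fixed only so that the run is repeatable.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: test scores of the 1,000 students in the school.
population = [random.gauss(60, 15) for _ in range(1000)]
population_mean = statistics.mean(population)

# Draw many random samples of 100 students and record the mean of each sample.
sample_means = [
    statistics.mean(random.sample(population, 100)) for _ in range(2000)
]

mean_of_means = statistics.mean(sample_means)     # close to the population mean
spread_of_means = statistics.stdev(sample_means)  # an estimate of the standard error

print(f"Population mean:      {population_mean:.2f}")
print(f"Mean of sample means: {mean_of_means:.2f}")
print(f"Spread of the means:  {spread_of_means:.2f}")
```

The individual sample means differ from one another, but they cluster tightly around the population
mean, and their spread is far smaller than the spread of the individual scores.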

3) Sampling method
The difference between non-probability and probability sampling is that non-probability sampling
does not involve random selection and probability sampling does. Does that mean that non-probability
samples aren't representative of the population? Not necessarily. But it does mean that non-probability
samples cannot depend upon the rationale of probability theory. At least with a probabilistic sample, we
know the odds or probability that we have represented the population well. We are able to estimate
confidence intervals for the statistic. With non-probability samples, we may or may not represent the
population well, and it will often be hard for us to know how well we've done so. In general, researchers
prefer probabilistic or random sampling methods over non-probabilistic ones, and consider them to be
more accurate and rigorous. However, in applied social research there may be circumstances where it is
not feasible, practical or theoretically sensible to do random sampling. Here, we consider a wide range of
non-probabilistic alternatives.
Probability sampling, or random sampling, is a sampling technique in which the probability of getting
any particular sample may be calculated. Nonprobability sampling does not meet this criterion and
should be used with caution. Nonprobability sampling techniques cannot be used to infer from the
sample to the general population. Any generalizations obtained from a nonprobability study must be
filtered through one's knowledge of the topic being studied. Performing nonprobability sampling is
considerably less expensive than doing probability sampling.

A) Probability sampling methods
Each subject or unit in the population has a known non-zero probability of being included in the
sample. This allows the application of probability theory to estimate how likely it is that the sample
reflects the target population. In statistical terms, a calculation of sampling error can be made.
A probability sampling method is any method of sampling that utilizes some form of random selection. In
order to have a random selection method, you must set up some process or procedure that assures that
the different units in your population have known (often equal) probabilities of being chosen. Humans have long
practiced various forms of random selection, such as picking a name out of a hat, or choosing the short
straw. These days, we tend to use computers as the mechanism for generating random numbers as the
basis for random selection.
- General advantages
- A high degree of representativeness is likely
- The sampling error can be calculated
- General disadvantages
- Expensive
- Time consuming
- Relatively complicated

Definition of basic terms:-
These are:
N = the number of cases in the sampling frame
n = the number of cases in the sample
N C n = the number of combinations (subsets) of n from N
f = n/N = the sampling fraction
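
These quantities are easy to compute for a small case. The Python sketch below is an illustration only;
the frame of 10 cases and the sample of 3 are hypothetical values chosen to keep the numbers small.

```python
import math

N = 10  # hypothetical number of cases in the sampling frame
n = 3   # hypothetical number of cases in the sample

combinations = math.comb(N, n)  # number of different samples of n that can be drawn from N
f = n / N                       # the sampling fraction

print("Possible samples:", combinations)
print("Sampling fraction:", f)
```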

In Probability sampling, all items have some chance of selection that can be calculated. Probability
sampling technique ensures that bias is not introduced regarding who is included in the survey.

Five common Probability sampling or random sampling techniques are:
1) Simple random sampling,
2) Systematic sampling,
3) Stratified sampling,
4) Cluster sampling, and
5) Multi-stage sampling

1) Simple random sampling
With simple random sampling, each item in a population has an equal chance of inclusion in the sample.
For example, each name in a telephone book could be numbered sequentially. If the sample size was to
include 2,000 people, then 2,000 numbers could be randomly generated by computer or numbers could be
picked out of a hat. These numbers could then be matched to names in the telephone book, thereby
providing a list of 2,000 people.
Example: - A lotto draw is a good example of simple random sampling. A sample of 6 numbers is
randomly generated from a population of 45, with each number having an equal chance of being selected.
The advantage of simple random sampling is that it is simple and easy to apply when small
populations are involved. However, because every person or item in a population has to be listed
before the corresponding random numbers can be read, this method is very cumbersome to use for
large populations.
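With a computer, the telephone book example takes only a few lines. The Python sketch below is an
illustration only: the frame of 10,000 numbered entries is hypothetical, and the random generator is
seeded purely so that the run can be repeated.

```python
import random

rng = random.Random(2024)  # seeded only so the sketch is repeatable

# Hypothetical sampling frame: 10,000 sequentially numbered directory entries.
frame = [f"Person {i}" for i in range(1, 10001)]

# Draw 2,000 entries without replacement; every entry has the same chance of selection.
sample = rng.sample(frame, 2000)

print(len(sample), "people selected; the first five are:", sample[:5])
```

Because the selection is made without replacement, no person can appear in the sample twice.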

2) Systematic sampling
Systematic sampling, sometimes called interval-sampling, means that there is a gap, or interval,
between each selection. This method is often used in industry, where an item is selected for testing
from a production line (say, every fifteen minutes) to ensure that machines and equipment are working
to specification.
Alternatively, the manufacturer might decide to select every 20th item on a production line to test for
defects and quality. This technique requires the first item to be selected at random as a starting point for
testing and, thereafter, every 20th item is chosen.
This technique could also be used when questioning people in a sample survey. A market researcher
might select every 10th person who enters a particular store, after selecting a person at random as a
starting point; or interview occupants of every 5th house in a street, after selecting a house at random as a
starting point.
It may be that a researcher wants to select a fixed size sample. In this case, it is first necessary
to know the whole population size from which the sample is being selected. The appropriate sampling
interval, I, is then calculated by dividing population size, N, by required sample size, n, as follows: I = N/n
Example:-If a systematic sample of 500 students were to be carried out in a University with an enrolled
population of 10,000, the sampling interval would be: I = N/n = 10,000/500 = 20
Note: if I is not a whole number, then it is rounded to the nearest whole number.
All students would be assigned sequential numbers. The starting point would be chosen by selecting a
random number between 1 and 20. If this number was 9, then the 9th student on the list of students
would be selected along with every following 20th student. The sample of students would be those
corresponding to student numbers 9, 29, 49, 69, ........ 9929, 9949, 9969 and 9989.
The advantage of systematic sampling is that it is simpler to select one random number and then every
Ith (e.g. 20th) member on the list, than to select as many random numbers as the sample size. It also
gives a good spread right across the population. A disadvantage is that you may need a list to start
with, if you wish to know your sample size and calculate your sampling interval.
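The University example can be sketched in Python as follows. This is an illustration only; the function
name systematic_sample is hypothetical, and students are represented simply by their sequential numbers.

```python
import random


def systematic_sample(population_size, sample_size, rng=random):
    """Return (selected numbers, interval, start) for a systematic sample."""
    interval = round(population_size / sample_size)  # I = N/n, rounded to a whole number
    start = rng.randint(1, interval)                 # random starting point between 1 and I
    selected = list(range(start, population_size + 1, interval))
    return selected, interval, start


selected, interval, start = systematic_sample(10_000, 500)
print("Interval:", interval, " Start:", start, " Sample size:", len(selected))

# With a starting point of 9, the selection follows the pattern described above.
from_nine = list(range(9, 10_000 + 1, 20))
print(from_nine[:4], "...", from_nine[-4:])
```

When N/n is not a whole number, rounding the interval means the achieved sample size can differ
slightly from the size that was planned.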

3) Stratified sampling
A general problem with random sampling is that you could, by chance, miss out a particular group in the
sample. However, if you form the population into groups, and sample from each group, you can make
sure the sample is representative.
In stratified sampling, the population is divided into groups called strata. A sample is then drawn from
within these strata. Some examples of strata commonly used by the research Organisation are States, Age
and Sex. Other strata may be religion, academic ability or marital status.

Example: - The committee of a school of 1,000 students wishes to assess any reaction to the reintroduction
of rural Care into the school timetable. To ensure a representative sample of students from all year levels,
the committee uses the stratified sampling technique.
In this case the strata are the year levels. Within each stratum the committee selects a sample. So, in a
sample of 100 students, all year levels would be included. The students in the sample would be selected
using simple random sampling or systematic sampling within each stratum.
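The school example can be sketched in Python as follows. This is an illustration under stated
assumptions: the school is taken to have five year levels of 200 students each, the student labels are
invented, and the same sampling fraction is applied in every stratum.

```python
import random

rng = random.Random(7)  # seeded only so the sketch is repeatable

# Hypothetical school: 5 year levels (the strata) of 200 students each, 1,000 in total.
strata = {
    f"Year {level}": [f"Y{level}-S{i}" for i in range(1, 201)]
    for level in range(8, 13)
}

total_students = sum(len(students) for students in strata.values())
sample_size = 100

sample = {}
for year_level, students in strata.items():
    # Apply the same sampling fraction within every stratum.
    n_stratum = round(sample_size * len(students) / total_students)
    # Simple random sampling inside the stratum.
    sample[year_level] = rng.sample(students, n_stratum)

for year_level, chosen in sample.items():
    print(year_level, "->", len(chosen), "students")
```

Every year level is guaranteed a place in the sample, which a single simple random sample of 100 from
the whole school could not promise.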
Stratification is most useful when the stratifying variables are simple to work with, easy to observe and
closely related to the topic of the survey.
An important aspect of stratification is that it can be used to select more of one group than another. You
may do this if you feel that responses are more likely to vary in one group than another. So, if you know
everyone in one group has much the same value, you only need a small sample to get information for
that group; whereas in another group, the values may differ widely and a bigger sample is needed.
If you want to combine group level information to get an answer for the whole population, you have to
take account of what proportion you selected from each group.
When stratified sampling designs are to be employed, there are 3 key questions which have to be
immediately addressed:
1 The bases of stratification, i.e. what characteristics should be used to subdivide the
universe/population into strata?
2 The number of strata, i.e. how many strata should be constructed and what stratum boundaries
should be used?
3 Sample sizes within strata, i.e. how many observations should be taken in each stratum?

1) Bases of stratification
Intuitively, it seems clear that the best basis would be the frequency distribution of the principal
variable being studied. For example, in a study of coffee consumption we may believe that behavioural
patterns will vary according to whether a particular respondent drinks a lot of coffee, only a moderate
amount of coffee or drinks coffee very occasionally. Thus we may consider that to stratify according to
"heavy users", "moderate users" and "light users" would provide an optimum stratification. However,
two difficulties may arise in attempting to proceed in this way. First, there is usually interest in many
variables, not just one, and stratification on the basis of one may not provide the best stratification for
the others. Secondly, even if one survey variable is of primary importance, current data on its frequency
is unlikely to be available. However, the latter difficulty can be dealt with, since it is possible to
stratify after data collection has been completed and before the analysis is undertaken. Before data
collection, the only approach is to create strata on the basis of variables for which information is, or
can be made, available and that are believed to be highly correlated with the principal survey
characteristics of interest, e.g. age, socio-economic group, sex, farm size, firm size, etc.
In general, it is desirable to make up strata in such a way that the sampling units within strata are as
similar as possible. In this way a relatively limited sample within each stratum will provide a generally
precise estimate of the mean of that stratum. Similarly it is important to maximise differences in
stratum means for the key survey variables of interest. This is desirable since stratification has the effect
of removing differences between stratum means from the sampling error.
Total variance within a population has two components: between-strata variance and within-strata
variance. Stratification removes the first of these, the between-strata variance, from the calculation
of the standard error. Suppose, for example, we stratified students in a particular University by subject
specialty - marketing, engineering, chemistry, computer science, mathematics, history, geography etc. -
and questioned them about the distinctions between training and education. The theory goes that
without stratification we would expect variation both in the views expressed by students within, say,
the marketing specialty and between the views of marketing students, as a whole, and engineering
students, as a whole. Stratification ensures that variation between strata does not enter into the
standard error, by taking account of this source in drawing the sample.
2) Number of strata
The next question is that of the number of strata and the construction of stratum boundaries. As regards
number of strata, as many as possible should be used. If each stratum could be made as homogeneous
as possible, its mean could be estimated with high reliability and, in turn, the population mean could be
estimated with high precision. However, some practical problems limit the desirability of a large
number of strata:
a) No stratification scheme will completely "explain" the variability among a set of observations. Past a
certain point, the "residual" or "unexplained" variation will dominate, and little improvement will be
effected by creating more strata.
b) Depending on the costs of stratification, a point may be reached quickly where creation of additional
strata is economically unproductive.
If a single overall estimate is to be made (e.g. the average per capita consumption of coffee) we would
normally use no more than about 6 strata. If estimates are required for population subgroups (e.g. by
region and/or age group), then more strata may be justified.
3) Sample sizes within strata
Proportional allocation: Once strata have been established, the question becomes, "How big a sample
must be drawn from each?" Consider a situation where a survey of a two-stratum population is to be
carried out:
Stratum    Number of Items in Stratum
A          10,000
B          90,000
If the budget is fixed at Rs. 3,000 and we know the cost per observation is Rs. 6 in each stratum, the
available total sample size is 500. The most common approach would be to sample the same proportion
of items in each stratum. This is termed proportional allocation. In this example, the overall sampling
fraction is:
f = n/N = 500/100,000 = 0.005 = 0.5%
Thus, this method of allocation would result in:
Stratum A (10,000 × 0.5%) = 50
Stratum B (90,000 × 0.5%) = 450
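The major practical advantage of proportional allocation is that it leads to estimates which are
computationally simple. The stratified estimate of the overall mean is the weighted sum of the stratum
means:
$\bar{x}_{sr} = W_1 \bar{x}_1 + W_2 \bar{x}_2 + W_3 \bar{x}_3 + \dots + W_k \bar{x}_k$
where $\bar{x}_h$ is the sample mean of stratum h and $W_h$ is the proportion of the population that
falls in stratum h. Where proportional sampling has been employed, each $W_h$ also equals the proportion
of the sample drawn from stratum h, so the weighted sum is simply the mean of all the sampled
observations and no separate weighting of the stratum means is needed.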
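The two-stratum example can be worked through in Python. This is an illustration only: the stratum
sizes, budget and cost per observation come from the example above, while the two stratum means used at
the end are hypothetical values invented to show how the overall mean is formed.

```python
strata_sizes = {"A": 10_000, "B": 90_000}
budget = 3000              # total budget in rupees
cost_per_observation = 6   # rupees per observation, the same in each stratum

total_sample = budget // cost_per_observation       # available total sample size
population_size = sum(strata_sizes.values())
sampling_fraction = total_sample / population_size  # f = n/N

# Proportional allocation: the same fraction is sampled in every stratum.
allocation = {
    name: round(size * sampling_fraction) for name, size in strata_sizes.items()
}
print("Sampling fraction:", sampling_fraction)
print("Allocation:", allocation)

# Hypothetical stratum means, invented purely to illustrate the overall mean.
stratum_means = {"A": 4.0, "B": 2.0}
weights = {name: size / population_size for name, size in strata_sizes.items()}
overall_mean = sum(weights[name] * stratum_means[name] for name in strata_sizes)
print("Overall mean:", round(overall_mean, 2))
```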
Optimum allocation: Proportional allocation is advisable when all we know of the strata is their sizes.
In situations where the standard deviations of the strata are known it may be advantageous to make a
disproportionate allocation.
Suppose that, once again, we had stratum A and stratum B, but we know that the individuals assigned
to stratum A were more varied with respect to their opinions than those assigned to stratum B.
Optimum allocation minimises the standard error of the estimated mean by ensuring that more
respondents are assigned to the stratum within which there is greatest variation.
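One widely used form of optimum allocation, often called Neyman allocation, makes each stratum's share
of the sample proportional to its size multiplied by its standard deviation. The Python sketch below is
an illustration only and is not taken from the text: it assumes equal cost per observation in both
strata, and the standard deviations of 9 and 1 are hypothetical values chosen to make the effect visible.

```python
strata_sizes = {"A": 10_000, "B": 90_000}
# Hypothetical standard deviations: opinions in stratum A are far more varied.
strata_sd = {"A": 9.0, "B": 1.0}
total_sample = 500

# Each stratum's share is proportional to (stratum size x standard deviation).
products = {name: strata_sizes[name] * strata_sd[name] for name in strata_sizes}
total_product = sum(products.values())

allocation = {
    name: round(total_sample * products[name] / total_product)
    for name in strata_sizes
}
print("Optimum allocation:", allocation)
```

With these assumed figures stratum A receives far more than the 50 observations it would get under
proportional allocation, because its greater variability makes each extra observation there more
valuable.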

4) Cluster sampling
It is sometimes expensive to spread your sample across the population as a whole. For example, travel
can become expensive if you are using interviewers to travel between people spread all over the country.
To reduce costs you may choose a cluster sampling technique.
Cluster sampling divides the population into groups, or clusters. A number of clusters are selected
randomly to represent the population, and then all units within selected clusters are included in the
sample. No units from non-selected clusters are included in the sample. They are represented by those
from selected clusters. This differs from stratified sampling, where some units are selected from each
group.
Examples of clusters may be factories, schools and geographic areas such as electoral sub-divisions. The
selected clusters are then used to represent the population.
Example:- Suppose an organisation wishes to find out which sports 11th Std students are participating in
across Maharashtra. It would be too costly and take too long to survey every student, or even some
students from every school. Instead, 100 schools are randomly selected from all over Maharashtra.
These schools are considered to be clusters. Then, every 11th Std student in these 100 schools is surveyed.
In effect, students in the sample of 100 schools represent all 11th Std students in Maharashtra.
Cluster sampling has several advantages: reduced costs, simplified fieldwork and more convenient
administration. Instead of having a sample scattered over the entire coverage area, the sample is more
localised in relatively few centres (clusters).
Cluster sampling's disadvantage is that less accurate results are often obtained due to higher sampling
error than for simple random sampling with the same sample size. In the above example, you might
expect to get more accurate estimates from randomly selecting students across all schools than from
randomly selecting 100 schools and taking every student in those chosen.
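The schools example can be sketched in Python as follows. This is an illustration only: the frame of
500 schools, the class sizes of 20 to 60 students and the labels are all hypothetical, and the seed is
fixed only so that the run is repeatable.

```python
import random

rng = random.Random(11)  # seeded only so the sketch is repeatable

# Hypothetical frame of clusters: 500 schools, each with 20 to 60 students in 11th Std.
schools = {
    f"School {s}": [f"S{s}-{i}" for i in range(1, rng.randint(20, 60) + 1)]
    for s in range(1, 501)
}

# Select 100 schools (clusters) at random...
chosen_schools = rng.sample(sorted(schools), 100)

# ...and include every student from each chosen school in the sample.
sample = [student for school in chosen_schools for student in schools[school]]

print(len(chosen_schools), "schools selected,", len(sample), "students surveyed")
```

Only a list of schools is needed to draw the sample; student lists are required for the chosen schools
alone, which is where the saving in cost and fieldwork comes from.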

5) Multi-stage sampling
Multi-stage sampling is like cluster sampling, but involves selecting a sample within each chosen cluster,
rather than including all units in the cluster. Thus, multi-stage sampling involves selecting a sample in at
least two stages. In the first stage, large groups or clusters are selected. These clusters are designed to
contain more population units than are required for the final sample.
In the second stage, population units are chosen from selected clusters to derive a final sample. If more
than two stages are used, the process of choosing population units within clusters continues until the
final sample is achieved.
Example:- An example of multi-stage sampling is where, firstly, electoral sub-divisions (clusters) are
sampled from a city or state. Secondly, blocks of houses are selected from within the electoral sub-
divisions and, thirdly, individual houses are selected from within the selected blocks of houses.
The advantages of multi-stage sampling are convenience, economy and efficiency. Multi-stage sampling
does not require a complete list of members in the target population, which greatly reduces sample
preparation cost. The list of members is required only for those clusters used in the final stage.
The main disadvantage of multi-stage sampling is the same as for cluster sampling: lower accuracy due
to higher sampling error.
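
A minimal sketch of the three-stage example above (sub-divisions, then blocks, then houses) is given below. The nested `city` dictionary, the stage sizes and the function name `multi_stage_sample` are assumptions made purely for illustration.

```python
import random

def multi_stage_sample(population, stage_sizes, seed=None):
    """Sample from nested dicts: pick groups at each stage, units at the last."""
    rng = random.Random(seed)

    def draw(node, sizes):
        k = min(sizes[0], len(node))
        if isinstance(node, dict):
            # Intermediate stage: choose k groups, then sample inside each one.
            picked = rng.sample(sorted(node), k)
            return [u for key in picked for u in draw(node[key], sizes[1:])]
        # Final stage: choose k individual units from the selected group.
        return rng.sample(list(node), k)

    return draw(population, stage_sizes)

# Hypothetical city: 8 sub-divisions, 5 blocks each, 10 houses per block.
city = {
    f"subdivision_{s}": {
        f"block_{s}_{b}": [f"house_{s}_{b}_{h}" for h in range(10)]
        for b in range(5)
    }
    for s in range(8)
}

# 3 sub-divisions, 2 blocks in each, 4 houses in each block.
houses = multi_stage_sample(city, stage_sizes=(3, 2, 4), seed=7)
print(len(houses), "houses selected")
```

Unlike the cluster-sampling case, only some units inside each selected cluster are taken, so a list of houses is needed only for the blocks reached in the final stage.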

B) Non-probability sampling techniques
The selection of subjects or units is left to the discretion of the researcher and methods are less structured
and less strict. Probability theory cannot be used to estimate sampling error.
Non-probability sampling methods are usually used for qualitative research when the purpose is
exploratory or interpretative.
We can divide non-probability sampling methods into two broad types: accidental or purposive. Most
sampling methods are purposive in nature because we usually approach the sampling problem with a
specific plan in mind. The most important distinctions among these types of sampling methods are the
ones between the different types of purposive sampling approaches.
- General advantages
- Typicality of subjects is aimed for
- Permits exploration
- General disadvantage
- Unrepresentative

Examples of non-probability sampling include:
1) Accidental, Haphazard or Convenience Sampling
Members of the population are chosen based on their relative ease of access. Sampling friends,
co-workers, or shoppers at a single mall are all examples of convenience sampling.
Accidental, convenience, available samples are all names for non-purposive non-probability samples. In
these, people in the samples are those who simply agreed to take part, were around and available at the
time. They are quick and cheap but their use is really limited to pilot or exploratory work; or, if one is
used because there is no alternative form of sampling available, caution must be exercised in the
analysis of the results. Tempting though it may be, you cannot assume the sample is representative.

2) Purposive Sampling
In purposive sampling the people, units or elements in the sample are selected because they are
regarded as having similar characteristics to the people in the designated research population. So, for
example, in research investigating the management skills of owner/managers of small enterprises, the
researcher might select some typical owner managers to take part in the study. They will not be selected
randomly. One advantage of this kind of sample is that it is usually possible to get a targeted sample
together very quickly - and hence cheaply.
All of the methods that follow can be considered subcategories of purposive sampling methods. We
might sample for specific groups or types of people as in modal instance, expert, or quota sampling.
We might sample for diversity as in heterogeneity sampling. Or, we might capitalize on informal
social networks to identify specific respondents who are hard to locate otherwise, as in snowball
sampling. In all of these methods we know what we want -- we are sampling with a purpose.

a) Modal Instance Sampling
In statistics, the mode is the most frequently occurring value in a distribution. In sampling, when we do a
modal instance sample, we are sampling the most frequent case, or the "typical" case. In a lot of informal
public opinion polls, for instance, they interview a "typical" voter. There are a number of problems with
this sampling approach. First, how do we know what the "typical" or "modal" case is? We could say that
the modal voter is a person who is of average age, educational level, and income in the population. But,
it's not clear that using the averages of these is the fairest (consider the skewed distribution of income, for
instance). And, how do you know that those three variables -- age, education, income -- are the only or
even the most relevant for classifying the typical voter? What if religion or ethnicity is an important
discriminator? Clearly, modal instance sampling is only sensible for informal sampling contexts.

b) Expert Sampling
Expert sampling involves the assembling of a sample of persons with known or demonstrable experience
and expertise in some area. Often, we convene such a sample under the auspices of a "panel of experts."
There are actually two reasons you might do expert sampling. First, because it would be the best way to
elicit the views of persons who have specific expertise. In this case, expert sampling is essentially just a
specific sub case of purposive sampling. But the other reason you might use expert sampling is to
provide evidence for the validity of another sampling approach you've chosen. For instance, let's say you
do modal instance sampling and are concerned that the criteria you used for defining the modal instance
are subject to criticism. You might convene an expert panel consisting of persons with acknowledged
experience and insight into that field or topic and ask them to examine your modal definitions and
comment on their appropriateness and validity. The advantage of doing this is that you aren't out on
your own trying to defend your decisions -- you have some acknowledged experts to back you. The
disadvantage is that even the experts can be, and often are, wrong.

c) Quota Sampling
In quota sampling, you select people non-randomly according to some fixed quota. There are two types
of quota sampling: proportional and non proportional.
i) In proportional quota sampling you want to represent the major characteristics of the population by
sampling a proportional amount of each. For instance, if you know the population has 40% women and
60% men, and that you want a total sample size of 100, you will continue sampling until you get those
percentages and then you will stop. So, if you've already got the 40 women for your sample, but not the
sixty men, you will continue to sample men but even if legitimate women respondents come along, you
will not sample them because you have already "met your quota." The problem here (as in much
purposive sampling) is that you have to decide the specific characteristics on which you will base the
quota. Will it be by gender, age, education, race, religion, etc.?
ii) Non-proportional quota sampling is a bit less restrictive. In this method, you specify the minimum
number of sampled units you want in each category. Here, you're not concerned with having numbers
that match the proportions in the population. Instead, you simply want to have enough to assure that
you will be able to talk about even small groups in the population. This method is the non-probabilistic
analogue of stratified random sampling in that it is typically used to assure that smaller groups are
adequately represented in your sample.
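
The proportional quota rule described above (40 women and 60 men in a sample of 100) can be sketched as follows. The respondent stream `arrivals`, the field name `gender` and the function name are hypothetical; the sketch assumes respondents are approached in arrival order, which is itself a non-random choice.

```python
def proportional_quota_sample(stream, key, quotas):
    """Accept respondents in arrival order until every category quota is met."""
    counts = {category: 0 for category in quotas}
    sample = []
    for person in stream:
        category = key(person)
        # Accept only if this person's category still has room in its quota.
        if counts.get(category, 0) < quotas.get(category, 0):
            counts[category] += 1
            sample.append(person)
        if counts == quotas:          # all quotas met: stop sampling
            break
    return sample

# Hypothetical arrivals: women ("F") and men ("M") alternate.
arrivals = [{"id": i, "gender": "F" if i % 2 == 0 else "M"} for i in range(200)]

sample = proportional_quota_sample(
    arrivals, key=lambda p: p["gender"], quotas={"F": 40, "M": 60}
)
women = [p for p in sample if p["gender"] == "F"]
men = [p for p in sample if p["gender"] == "M"]
print(len(women), "women and", len(men), "men sampled")
```

Once the 40 women have been found, later women in the stream are passed over even though they are legitimate respondents. For non-proportional quota sampling the same loop could be used with the quotas read as minimum counts per category rather than population proportions.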

d) Heterogeneity Sampling
We sample for heterogeneity when we want to include all opinions or views, and we aren't concerned
about representing these views proportionately. Another term for this is sampling for diversity. In many
brainstorming or nominal group processes (including concept mapping), we would use some form of
heterogeneity sampling because our primary interest is in getting a broad spectrum of ideas, not
identifying the "average" or "modal instance" ones. In effect, what we would like to be sampling is not
people, but ideas. We imagine that there is a universe of all possible ideas relevant to some topic and that
we want to sample this population, not the population of people who have the ideas. Clearly, in order to
get all of the ideas, and especially the "outlier" or unusual ones, we have to include a broad and diverse
range of participants. Heterogeneity sampling is, in this sense, almost the opposite of modal instance
sampling.

e) Snowball Sampling
In snowball sampling, you begin by identifying someone who meets the criteria
for inclusion in your study. You then ask them to recommend others who they may know who also meet
the criteria. Although this method would hardly lead to representative samples, there are times when it
may be the best method available. Snowball sampling is especially useful when you are trying to reach
populations that are inaccessible or hard to find. For instance, if you are studying the homeless, you are
not likely to be able to find good lists of homeless people within a specific geographical area. For
example, we might wish to know whether a new educational program causes subsequent achievement
score gains, whether a special work release program for prisoners causes lower recidivism rates, whether
a novel drug causes a reduction in symptoms, and so on.

The following summarises the characteristics of the most popular sampling techniques:

Sampling Method: Cluster Sampling
Definition: Units in the population can often be found in certain geographic groups or "clusters" (e.g. primary school children in Chandrapur). A random sample of clusters is taken, then all units within the cluster are surveyed.
Uses: Quick & easy; does not require complete information; good for face-to-face surveys.
Limitations: Expensive if the clusters are large; greater risk of sampling error.

Sampling Method: Convenience Sampling
Definition: Uses those who are willing to volunteer.
Uses: Readily available; large amount of information can be gathered quickly.
Limitations: Cannot extrapolate from sample to infer about the population; prone to volunteer bias.

Sampling Method: Judgement Sampling
Definition: A deliberate choice of a sample - the opposite of random.
Uses: Good for providing illustrative examples or case studies.
Limitations: Very prone to bias; samples often small; cannot extrapolate from sample.

Sampling Method: Quota Sampling
Definition: Aim is to obtain a sample that is "representative" of the overall population; the population is divided ("stratified") by the most important variables (e.g. income, age, location) and a required quota sample is drawn from each stratum.
Uses: Quick & easy way of obtaining a sample.
Limitations: Not random, so still some risk of bias; need to understand the population to be able to identify the basis of stratification.

Sampling Method: Simple Random Sampling
Definition: Ensures that every member of the population has an equal chance of selection.
Uses: Simple to design and interpret; can calculate estimate of the population and the sampling error.
Limitations: Need a complete and accurate population listing; may not be practical if the sample requires lots of small visits all over the country.

Sampling Method: Systematic Sampling
Definition: After randomly selecting a starting point from the population, between 1 and "n", every nth unit is selected, where n equals the population size divided by the sample size.
Uses: Easier to extract the sample than via simple random; ensures sample is spread across the population.
Limitations: Can be costly and time-consuming if the sample is not conveniently located.
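
The systematic sampling rule (random start, then every nth unit, with n equal to population size divided by sample size) can be illustrated with a short sketch. The population of 1,000 numbered units and the function name are assumptions; the sketch also assumes the sample size does not exceed the population size.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Pick a random start in the first interval, then take every nth unit."""
    rng = random.Random(seed)
    interval = len(population) // sample_size   # the "n" in "every nth unit"
    start = rng.randrange(interval)             # 0-based random starting point
    return [population[start + i * interval] for i in range(sample_size)]

# Hypothetical sampling frame: units numbered 1 to 1000.
units = list(range(1, 1001))
picked = systematic_sample(units, sample_size=50, seed=3)
print(picked[:5])
```

With 1,000 units and a sample of 50 the interval is 20, so the selected units are spread evenly across the whole frame.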

Chapter4: Measurement & Scaling Techniques
The data consists of quantitative variables like price, income, sales etc., and qualitative variables like
knowledge, performance, character etc. The qualitative information must be converted into numerical
form for further analysis. This is possible through measurement and scaling techniques. A common
feature of survey based research is to have respondents' feelings, attitudes, opinions, etc. in some
measurable form. For example, a bank manager may be interested in knowing the opinion of the
customers about the services provided by the bank. Similarly, a fast food company having a network in a
city may be interested in assessing the quality and service provided by them. As a researcher you may be
interested in knowing the attitude of the people towards the government announcement of a metro rail in
Delhi. In this unit we will discuss the issues related to measurement, different levels of measurement
scales, various types of scaling techniques and also selection of an appropriate scaling technique.

Measurement and scaling
Before we proceed further it will be worthwhile to understand the following two terms: (a)
Measurement, and (b) Scaling.
a) Measurement: Measurement is the process of observing and recording the observations that are
collected as part of research. The observations may be recorded by assigning numbers or other
symbols to characteristics of objects according to certain prescribed rules. The respondents'
characteristics are feelings, attitudes, opinions, etc. For example, you may assign 1 for Male and 2 for
Female respondents. In response to a question on whether he/she is using the ATM provided by a
particular bank branch, the respondent may say "yes" or "no". You may wish to assign the number 1 for
the response "yes" and 2 for the response "no". We assign numbers to these characteristics for two reasons.
First, the numbers facilitate further statistical analysis of data obtained. Second, numbers facilitate the
communication of measurement rules and results. The most important aspect of measurement is the
specification of rules for assigning numbers to characteristics. The rules for assigning numbers should be
standardised and applied uniformly. They must not change over time or across objects.
b) Scaling: Scaling is the assignment of objects to numbers or semantics according to a rule. In scaling, the
objects are text statements, usually statements of attitude, opinion, or feeling. For example, consider a
scale locating customers of a bank according to their agreement that the quality of service provided by
the branch is satisfactory. Each customer interviewed may respond with a semantic like "strongly
agree", "somewhat agree", "somewhat disagree", or "strongly disagree". We may even assign each of
the responses a number. For example, we may assign "strongly agree" as 1, "agree" as 2, "disagree" as 3, and
"strongly disagree" as 4. Therefore, each of the respondents may be assigned 1, 2, 3 or 4.
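
The coding rule just described can be written down as a fixed lookup so that it is applied uniformly to every respondent. The names `SCALE` and `code_responses` are hypothetical, and the decision to reject unrecognised answers is an assumption of this sketch.

```python
# The assignment rule from the text: one fixed number per verbal response.
SCALE = {"strongly agree": 1, "agree": 2, "disagree": 3, "strongly disagree": 4}

def code_responses(responses, scale=SCALE):
    """Translate verbal responses into numeric codes using one fixed rule."""
    coded = []
    for answer in responses:
        label = answer.strip().lower()      # tolerate case and stray spaces
        if label not in scale:
            raise ValueError(f"unrecognised response: {answer!r}")
        coded.append(scale[label])
    return coded

print(code_responses(["Strongly agree", "disagree ", "AGREE"]))
```

Keeping the rule in one place is a simple way of ensuring that the same response always receives the same number, whoever records it and whenever it is recorded.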

Issues in measurement
When a researcher is interested in measuring the attitudes, feelings or opinions of respondents he/she
should be clear about the following:
a) What is to be measured?
b) Who is to be measured?
c) The choices available in data collection techniques
The first issue that the researcher must consider is what is to be measured?
The definition of the problem, based on our judgments or prior research indicates the concept to be
investigated. For example, we may be interested in measuring the performance of a fast food company.
We may require a precise definition of the concept on how it will be measured. Also, there may be more
than one way that we can measure a particular concept. For example, in measuring the performance of a
fast food company we may use a number of measures to indicate the performance of the company. We
may use sales volume in terms of value of sales or number of customers or spread of network of the
company as measures of performance. Further, the measurement of concepts requires assigning numbers
to the attitudes, feelings or opinions. The key question here is: on what basis do we assign the
numbers to the concept? For example, if the task is to measure the agreement of customers of a fast food
company with the opinion that the food served by the company is tasty, we may create five categories: (1)
strongly agree, (2) agree, (3) undecided, (4) disagree, (5) strongly disagree. Then we may measure the
response of respondents. If a respondent states "disagree" with the statement that the food is
tasty, the measurement is 4.
The second important issue in measurement is: who is to be measured? That is, who are the
people we are interested in? The characteristics of the people such as age, sex, education, income, location,
profession, etc. may have a bearing on the choice of measurement. The measurement procedure must be
designed keeping in mind the characteristics of the respondents under consideration.

Levels of measurement
We know that the level of measurement is a scale by which a variable is measured. For 50 years, with few
detractors, science has used the Stevens (1951) typology of measurement levels (scales). There are three
things which you need to remember about this typology: anything that can be measured falls into one of
the four types; the higher the level of measurement, the more precise the measurement; and every level
up contains all the properties of the previous level. The four levels of measurement, from lowest to
highest, are as follows:
1. Nominal
2. Ordinal
3. Interval
4. Ratio

1) Nominal scales
This, the crudest of measurement scales, classifies individuals, companies, products, brands or other
entities into categories where no order is implied. Indeed it is often referred to as a categorical scale. It is a
system of classification and does not place the entity along a continuum. It involves a simple count of the
frequency of the cases assigned to the various categories, and if desired numbers can be nominally
assigned to label each category as in the example below:
An example of a nominal scale
Which of the following food items do you tend to buy at least once per month? (Please tick)
Okra Palm Oil Milled Rice
Peppers Prawns Pasteurised milk
The numbers have no arithmetic properties and act only as labels. The only measure of average which
can be used is the mode because this is simply a set of frequency counts. Hypothesis tests can be carried
out on data collected in the nominal form. The most likely would be the Chi-square test. However, it
should be noted that the Chi-square is a test to determine whether two or more variables are associated;
by itself it does not measure the strength of that relationship. Nor can it tell anything about the form of
that relationship, where it exists, i.e. it is not capable of establishing cause and effect.
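
As an illustration of a Chi-square test on nominal data, the sketch below computes the Pearson Chi-square statistic for a table of counts. The 2 x 2 table (area of residence against whether milled rice is bought) uses invented counts, and the function name is hypothetical.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof

# Hypothetical counts: rows = urban, rural; columns = buys, does not buy.
counts = [[30, 20],
          [10, 40]]
stat, dof = chi_square_statistic(counts)
print(round(stat, 3), dof)
```

The statistic is compared with the critical value of the Chi-square distribution for the given degrees of freedom (3.841 at the 5% level for one degree of freedom). A value above the critical value suggests the two variables are associated, but says nothing about which causes which.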

2) Ordinal scales
Ordinal scales involve the ranking of individuals, attitudes or items along the continuum of the
characteristic being scaled. For example, if a researcher asked farmers to rank 5 brands of pesticide in
order of preference he/she might obtain responses like those in table below.

An example of an ordinal scale used to determine farmers' preferences among 5 brands of pesticide.
Order of preference Brand
1 Rambo
2 Harpic
4 Bagyone
5 Rat kill

From such a table the researcher knows the order of preference but nothing about how much more one
brand is preferred to another, that is, there is no information about the interval between any two
brands. All of the information a nominal scale would have given is available from an ordinal scale. In
addition, positional statistics such as the median, quartile and percentile can be determined.
It is possible to test for order correlation with ranked data. The two main methods are Spearman's
Ranked Correlation Coefficient and Kendall's Coefficient of Concordance. Using either procedure one
can, for example, ascertain the degree to which two or more survey respondents agree in their ranking of
a set of items. Consider again the ranking of pesticides example in the table above. The researcher might
wish to measure similarities and differences in the rankings of pesticide brands according to whether the
respondents' farm enterprises were classified as "arable" or "mixed" (a combination of crops and
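livestock). Kendall's coefficient of concordance takes a value in the range 0 to 1. A zero would mean that
there was no agreement between the two groups, and 1 would indicate total agreement. Spearman's
coefficient, by contrast, ranges from -1 to +1, with negative values indicating rankings that tend to run in
opposite directions. It is more likely that an answer somewhere between the extremes would be found.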
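
Spearman's coefficient for two untied rankings can be computed from the rank differences d as 1 - 6Σd²/(n(n² - 1)). The sketch below applies this to two invented rankings of the five pesticide brands; the rankings and the function name are assumptions, and the shortcut formula is valid only when there are no tied ranks.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman's rank correlation for two untied rankings of the same items."""
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical rankings of the same five brands by two groups of farmers.
arable = [1, 2, 3, 4, 5]
mixed = [2, 1, 3, 5, 4]
rho = spearman_rho(arable, mixed)
print(round(rho, 2))
```

Identical rankings give a coefficient of 1 and exactly reversed rankings give -1, so a value such as the one printed here indicates substantial but not complete agreement.
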
The only other permissible hypothesis testing procedures are the runs test and sign test. The runs test
(also known as the Wald-Wolfowitz test) is used to determine whether a sequence of binary data -
meaning data that can take only one of two possible values, e.g. African/non-African, yes/no, male/female -
is random or contains systematic 'runs' of one or other value. Sign tests are employed when the objective is
to determine whether there is a significant difference between matched pairs of data. The sign test tells
the analyst if the number of positive differences in ranking is approximately equal to the number of
negative differences, in which case the distribution of differences is random, i.e. apparent differences are not
significant. The test takes into account only the direction of differences and ignores their magnitude and
hence it is compatible with ordinal data.
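
A sign test on matched pairs can be sketched with the exact binomial distribution. The paired scores below are invented, the function name is hypothetical, and dropping tied pairs is one common convention rather than the only possible one.

```python
from math import comb

def sign_test_p_value(before, after):
    """Two-sided exact sign test on matched pairs; tied pairs are dropped."""
    diffs = [a - b for b, a in zip(before, after) if a != b]
    n = len(diffs)
    positives = sum(1 for d in diffs if d > 0)
    k = min(positives, n - positives)        # size of the smaller group of signs
    # Probability of k or fewer signs of one kind if + and - are equally likely.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical ordinal scores from the same eight respondents on two occasions.
before = [3, 4, 2, 5, 3, 4, 2, 3]
after = [4, 5, 3, 5, 4, 5, 1, 4]
p = sign_test_p_value(before, after)
print(p)
```

Only the direction of each difference enters the calculation, which is why the test is suitable for ordinal data.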

3) Interval scales
It is only with interval-scaled data that researchers can justify the use of the arithmetic mean as the
measure of average. The interval or cardinal scale has equal units of measurement, thus making it
possible to interpret not only the order of scale scores but also the distance between them. However, it
must be recognised that the zero point on an interval scale is arbitrary and is not a true zero. This of
course has implications for the type of data manipulation and analysis we can carry out on data collected
in this form. It is possible to add or subtract a constant to all of the scale values without affecting the form
of the scale but one cannot multiply or divide the values. It can be said that two respondents with scale
positions 1 and 2 are as far apart as two respondents with scale positions 4 and 5, but not that a person
with score 10 feels twice as strongly as one with score 5. Temperature is interval scaled, being measured
either in Centigrade or Fahrenheit. We cannot speak of 50°F being twice as hot as 25°F since the
corresponding temperatures on the centigrade scale, 10°C and -3.9°C, are not in the ratio 2:1. Interval
scales may be either numeric or semantic. Study the examples below.
Examples of interval scales in numeric and semantic formats
Please indicate your views on Balkan Olives by scoring them on a scale of 5 down to 1 (i.e. 5 = Excellent; 1 =
Poor) on each of the criteria listed
Balkan Olives are: Circle the appropriate score on each line
Succulence 5 4 3 2 1
Fresh tasting 5 4 3 2 1
Free of skin blemish 5 4 3 2 1
Good value 5 4 3 2 1
Attractively packaged 5 4 3 2 1
Please indicate your views on Balkan Olives by ticking the appropriate responses below:
Excellent Very Good Good Fair Poor
Freedom from skin blemish
Value for money
Attractiveness of packaging
Most of the common statistical methods of analysis require only interval scales in order that they might
be used. These are not recounted here because they are so common and can be found in virtually all basic
texts on statistics.
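
The temperature example given earlier in this section can be checked numerically. The sketch below uses the standard Fahrenheit-to-Celsius conversion; the variable names are chosen only for illustration.

```python
def fahrenheit_to_celsius(f):
    """Standard conversion between the two interval scales."""
    return (f - 32) * 5 / 9

f_hot, f_cold = 50.0, 25.0
c_hot = fahrenheit_to_celsius(f_hot)     # 10 degrees Celsius
c_cold = fahrenheit_to_celsius(f_cold)   # about -3.9 degrees Celsius

ratio_f = f_hot / f_cold                 # ratio on the Fahrenheit scale
ratio_c = c_hot / c_cold                 # ratio on the Celsius scale
diff_f = f_hot - f_cold                  # difference in Fahrenheit degrees
diff_c = c_hot - c_cold                  # same difference in Celsius degrees

print(ratio_f, round(ratio_c, 2))
print(diff_f, round(diff_c, 2))
```

The ratio changes completely when the scale changes, whereas the difference is merely rescaled by the constant factor 5/9. This is why differences, but not ratios, are meaningful on an interval scale.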

4) Ratio scales
The highest level of measurement is a ratio scale. This has the properties of an interval scale together with
a fixed origin or zero point. Examples of variables which are ratio scaled include weights, lengths and
times. Ratio scales permit the researcher to compare both differences in scores and the relative magnitude
of scores. For instance the difference between 5 and 10 minutes is the same as that between 10 and 15
minutes, and 10 minutes is twice as long as 5 minutes.
Given that sociological and management research seldom aspires beyond the interval level of
measurement, it is not proposed that particular attention be given to this level of analysis. Suffice it to say
that virtually all statistical operations can be performed on ratio scales.

Errors in measurement
In principle, every operation of a survey is a potential source of measurement error. Some examples of
causes of measurement error are non-response, badly designed questionnaires, respondent bias and
processing errors.
Measurement errors can be grouped into two main types: systematic errors and random errors.
Systematic error (called bias) makes survey results unrepresentative of the target population by
distorting the survey estimates in one direction. For example, if the target population is the entire
population in a country but the sampling frame is just the urban population, then the survey results will
not be representative of the target population due to systematic bias in the sampling frame. On the other
hand, random error can distort the results on any given occasion but tends to balance out on average.
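
The difference between the two kinds of error can be seen in a small simulation. All numbers below (a true value of 50, a bias of 3, a noise standard deviation of 2 and 10,000 repeated surveys) are invented for illustration, as is the function name.

```python
import random
from statistics import mean

def simulate_estimates(true_value, bias, noise_sd, n_surveys, seed=0):
    """Each survey estimate = true value + systematic bias + random error."""
    rng = random.Random(seed)
    return [true_value + bias + rng.gauss(0, noise_sd) for _ in range(n_surveys)]

# Random error only: individual estimates vary, but the average is close to 50.
unbiased = simulate_estimates(true_value=50.0, bias=0.0, noise_sd=2.0, n_surveys=10_000)
# Systematic error as well: the average itself is pushed away from 50.
biased = simulate_estimates(true_value=50.0, bias=3.0, noise_sd=2.0, n_surveys=10_000)

print(round(mean(unbiased), 2), round(mean(biased), 2))
```

Repeating a survey reduces the effect of random error on the average but does nothing to remove a systematic bias.
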
Some of the types of measurement error are outlined below:
1. Failure to identify the target population
Failure to identify the target population can arise from the use of an inadequate sampling frame,
imprecise definition of concepts, and poor coverage rules. Problems can also arise if the target
population and survey population do not match very well. Failure to identify and adequately capture
the target population can be a significant problem for informal sector surveys. While establishment and
population censuses allow for the identification of the target population, it is important to ensure that the
sample is selected as soon as possible after the census is taken so as to improve the coverage of the
survey population.
2. Non-response bias
Non-respondents may differ from respondents in relation to the attributes/variables being measured.
Non-response can be total (where none of the questions were answered) or partial (where some
questions may be unanswered owing to memory problems, inability to answer, etc.). To improve
response rates, care should be taken in training interviewers, assuring the respondent of confidentiality,
motivating him or her to cooperate, and revisiting or calling back if the respondent has been previously
unavailable. 'Call backs' are successful in reducing non-response but can be expensive. It is also
important to ensure that the person who has the information required can be contacted by the
interviewer; that the data required are available and that an adequate follow up strategy is in place.
3. Questionnaire design
The content and wording of the questionnaire may be misleading and the layout of the questionnaire
may make it difficult to accurately record responses. Questions should not be misleading or ambiguous,
and should be directly relevant to the objectives of the survey. In order to reduce measurement error
relating to questionnaire design, it is important to ensure that the questionnaire:
- can be completed in a reasonable amount of time;
- can be properly administered by the interviewer;
- uses language that is readily understood by both the interviewer and the respondent; and
- can be easily processed.
In designing questionnaires and training interviewers for informal sector surveys, where there
is a strong potential for inaccurate information being provided by respondents, consideration should be
given to the use of random question sequencing, derived or imputed results, and the use of partial
questionnaires. The random question sequencing approach involves the interviewer asking the survey
respondent a number of questions about the relevant data items (e.g. input costs and quantities, output
prices and output units sold) in a random order. The interviewer would use a deck of questionnaire
cards. The cards would be shuffled and then the interviewer would ask a series of questions out of
sequence, record each answer and then reassemble the questions in the right sequence to get the final
response (e.g. profit or value added information) as a derived result. Another approach to consider,
where particular responding businesses form a reasonably homogeneous group operating with similar
cost structures and market conditions, is aggregating results from sample measures of inputs and
outputs. This approach involves using separate but representative random samples of businesses to
collect information about different data items. The data are then brought together to produce imputed
aggregate level estimates.
4. Interviewer bias
The respondent's answers to questions can be influenced by the interviewer's behaviour, choice of clothes,
sex, accent and prompting when a respondent does not understand a question. A bias may also be
introduced if interviewers receive poor training as this may have an effect on the way they prompt for,
or record, the answers. The best way to minimise interviewer bias is through effective training and by
ensuring manageable workloads.
Training can be provided in the form of manuals, formal training courses on questionnaire content and
interviewing techniques, and on-the-job training in the field. Topics that should be covered in
interviewer training include - the purpose of the survey; the scope and coverage of the survey; a general
outline of the survey design and sampling approach being used; the questionnaire; interviewing
techniques and recording answers; ways to avoid or reduce non-response; how best to maintain
respondent co-operation; field practice; quality assurance and editing of data; planning workloads; and
administrative arrangements.
5. Respondent bias
Refusals and inability to answer questions, memory biases and inaccurate information will lead to a bias
in the estimates. An increasing level of respondent burden (due to the number of times a person is
included in surveys) can also make it difficult to get the potential respondent to participate in a survey.
When designing a survey it should be remembered that uppermost in the respondent's mind will be
protecting their own personal privacy, integrity and interests. Also, the way the respondent interprets
the questionnaire and the wording of the answer the respondent gives can cause inaccuracies to enter the
survey data. Careful questionnaire design, effective training of interviewers and adequate survey testing
can overcome these problems to some extent.
6. Processing errors
There are four stages in the processing of the data where errors may occur: data grooming, data capture,
editing and estimation. Data grooming involves preliminary checking before entering the data onto the
processing system in the capture stage. Inadequate checking and quality management at this stage can
introduce data loss (where data are not entered into the system) and data duplication (where the same
data are entered into the system more than once). Inappropriate edit checks and inaccurate weights in
the estimation procedure can also introduce errors to the data at the editing and estimation stage. To
minimise these errors, processing staff should be given adequate training and realistic workloads.
Training material for processing staff should cover similar topics to those for interview staff, but
with greater emphasis on editing techniques and quality assurance practices.
There are five main editing checks that should be considered: structure checks, range edits,
sequencing checks, checks for duplication and omissions, and logic edits. Structure checks are
undertaken to ensure that all the information sought has been provided. This involves checking that all
documents for a record are together and correctly labelled. Range edits are used to ensure that only the
possible codes for each question are used and that no codes outside the valid range have been entered.
Sequencing checks involve the process of ensuring that all those who should have answered the question
(because they gave a particular answer to an earlier question) have done so and that respondents who
should not have answered the question did not do so. Duplication and omission checks ensure that the
specific data reported by a respondent have not been recorded more than once and that data reported have
not been omitted. Logic edits involve specifying checks in advance of data collection. An example of a
logic edit would be that males cannot report that they are pregnant.
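The editing checks described above can be sketched in a few lines of Python. This is only an illustrative sketch: the field names, the answer codes (1 = yes, 2 = no) and the sample records are assumptions made for the example, not part of any real survey.

```python
# Hypothetical survey records; field names and codes are assumptions for illustration.
records = [
    {"id": 1, "sex": "M", "employed": 1, "hours_worked": 40, "pregnant": 2},
    {"id": 2, "sex": "F", "employed": 2, "hours_worked": None, "pregnant": 1},
    {"id": 3, "sex": "M", "employed": 3, "hours_worked": None, "pregnant": 1},
    {"id": 3, "sex": "M", "employed": 3, "hours_worked": None, "pregnant": 1},
    {"id": 4, "sex": "F", "employed": 2, "hours_worked": 35, "pregnant": 2},
]

VALID_EMPLOYED = {1, 2}  # assumed codes: 1 = yes, 2 = no


def edit_failures(records):
    """Return (record id, failed check) pairs for a list of records."""
    failures = []
    seen_ids = set()
    for r in records:
        rid = r["id"]
        # Duplication check: the same record must not be captured twice.
        if rid in seen_ids:
            failures.append((rid, "duplicate"))
            continue
        seen_ids.add(rid)
        # Range edit: only the valid codes may appear.
        if r["employed"] not in VALID_EMPLOYED:
            failures.append((rid, "range"))
        # Sequencing check: hours_worked is answered only by the employed.
        should_answer = r["employed"] == 1
        answered = r["hours_worked"] is not None
        if r["employed"] in VALID_EMPLOYED and should_answer != answered:
            failures.append((rid, "sequence"))
        # Logic edit: males cannot report that they are pregnant.
        if r["sex"] == "M" and r["pregnant"] == 1:
            failures.append((rid, "logic"))
    return failures


failures = edit_failures(records)
```

Records that fail a check would normally be referred back for correction rather than silently changed.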
The key areas that an effective editing strategy should address to reduce processing errors are:
- target the editing effort to large contributors and large units within the survey population;
- do not over edit the data;
- automate the editing process as far as possible; and
- feedback information from the data processing stage to refine the conduct of the survey through
changes such as improvements in question wording, questionnaire design, training and
7. Misinterpretation of results
This can occur if the researcher is not aware of certain factors that influence the characteristics under
investigation. A researcher or any other user not involved in the data collection process may be unaware
of trends built into the data due to the nature of the collection (e.g. always conducting interviews at a
particular time on weekdays could result in only particular types of householders being
interviewed). Researchers should carefully investigate the methodology used in any given survey.
8. Non-response
Non-response results when data are not collected from respondents. The proportion of these non-
respondents in the sample is called the non-response rate. It is important to make all reasonable efforts to
maximise the response rate as non-respondents may have differing characteristics to respondents.
Significant non-response can bias the survey results. When a respondent replies to the survey answering
some but not all questions then it is called partial non-response. Partial non-response can arise due to
memory problems, inadequate information or an inability to answer a particular question. The
respondent may also refuse to answer questions if they find the questions particularly sensitive or have
been asked too many questions (the questionnaire is too long). Total non-response can arise if a
respondent cannot be contacted (the frame contains inaccurate or out-of-date contact information or the
respondent is not at home), is unable to respond (may be due to language difficulties or illness) or
refuses to answer any questions.
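As a small illustration of the definitions above, the non-response rate and the rate of partial (item) non-response for one question could be computed as follows. The function names and the figures used are hypothetical.

```python
def non_response_rate(sample_size, respondents):
    """Proportion of sampled units from which no data were collected."""
    if sample_size <= 0 or not 0 <= respondents <= sample_size:
        raise ValueError("respondents must lie between 0 and sample_size")
    return (sample_size - respondents) / sample_size


def item_non_response_rate(answers):
    """Proportion of responding units that left one question blank (None)."""
    return sum(a is None for a in answers) / len(answers)


# Hypothetical figures: 200 units sampled, 150 responded.
total_rate = non_response_rate(200, 150)
# Hypothetical answers to one question from four responding units.
item_rate = item_non_response_rate([3, None, 5, None])
```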
Response rates can be improved through good survey design via short, simple questions, good forms
design techniques and by effectively explaining survey purposes and uses. Assurances of confidentiality
are very important as many respondents are unwilling to respond due to privacy concerns. For informal
sector surveys, it is essential to ensure that the survey is directed to the person within the establishment
or household who can provide the data sought. Call backs for those not available and follow-ups can
increase response rates for those who, initially, were unable to reply. Refusals can be minimised through
the use of positive language; contacting the right person who can provide the information required;
explaining how and what the interviewer plans to do to help with completing the questionnaire;
stressing the importance of the survey and the authority under which the survey is being conducted;
explaining the importance of their response as being representative of other units; emphasising the
benefits from the survey results for the individual and/or broader community; giving adequate
assurances of the confidentiality of the responses; and finding out the reasons for their reluctance to
participate and trying to talk through the areas of concern.
Other measures that can improve respondent cooperation and maximise response include public
awareness activities including discussions with key organisations and interest groups, news releases,
media interviews and newspaper articles (aimed at informing the community about the survey,
identifying issues of concern and addressing them); and, where possible, using a primary approach letter,
which gives respondents advance notice and explains the purposes of the survey and how the survey
will be conducted.
In the case of a mail survey, most of the points above can be stated in an introductory letter or through a
publicity campaign. Other non-response minimisation techniques which could be used in a mail survey
include providing a postage-paid mail-back envelope with the survey form; and reminder letters.
Where non-response is at an unsatisfactory level after all reasonable attempts to follow-up are
undertaken, bias can be reduced by imputation for item non-response (non-response to a particular
question) or imputation for unit non-response (complete non-response for a unit). The main aim of
imputation is to produce consistent data without going back to the respondent for the correct values, thus
reducing both respondent burden and costs associated with the survey. Broadly speaking, the imputation
methods fall into three groups: the imputed value is derived from other information supplied by the
unit; values supplied by other units are used to derive a value for the non-respondent (e.g. an average); or
the exact value of another unit (called the donor) is used as the value for the non-respondent (called the
recipient).
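The three groups of imputation methods can be sketched as below. The data, the field names and in particular the rule for choosing a donor (the closest unit on wages within the same region) are assumptions made for illustration; real surveys define their own imputation rules.

```python
from statistics import mean

# Hypothetical units; income is missing (None) for the last unit.
units = [
    {"id": "u1", "region": "north", "wages": 300, "other": 50, "income": 350},
    {"id": "u2", "region": "north", "wages": 400, "other": 100, "income": 500},
    {"id": "u3", "region": "south", "wages": 200, "other": 0, "income": 200},
    {"id": "u4", "region": "north", "wages": 380, "other": 20, "income": None},
]


def impute_from_own_data(unit):
    """Group 1: derive the value from other information the unit supplied."""
    return unit["wages"] + unit["other"]


def impute_mean(all_units):
    """Group 2: use values reported by other units (here, their average)."""
    return mean(u["income"] for u in all_units if u["income"] is not None)


def impute_from_donor(unit, all_units):
    """Group 3: copy the exact value of a similar unit (the donor)."""
    donors = [
        u for u in all_units
        if u["income"] is not None and u["region"] == unit["region"]
    ]
    # Assumed similarity rule: the donor with the closest wages.
    donor = min(donors, key=lambda u: abs(u["wages"] - unit["wages"]))
    return donor["income"]


recipient = units[3]
own_value = impute_from_own_data(recipient)
mean_value = impute_mean(units)
donor_value = impute_from_donor(recipient, units)
```

The three methods give different values for the same recipient, which is one reason to check what effect the chosen method has on the final estimates.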
When deciding on the method for non-response imputation it is desirable to know what effect
imputation will have on the final estimates. If a large amount of imputation is performed the results can
be misleading particularly if the imputation used distorts the distribution of data. If at the planning stage
it is believed that there is likely to be a high non-response rate, then the sample size could be increased to
allow for this. However, the problem may not be overcome by just increasing the sample size,
particularly if the non-responding units have different characteristics to the responding units.
Imputation also fails to totally eliminate non-response bias from the results.
If a low response rate is obtained, estimates are likely to be biased and therefore misleading.
Determining the exact bias in estimates is difficult. However, an indication can be obtained by -
comparing the characteristics of respondents to non-respondents; comparing results with alternative
sources and/or previous estimates; and performing a post-enumeration survey on a sub-sample of the
original sample with intensive follow-up of non-respondents.

In conclusion, while measurement error may be difficult to measure accurately it can be minimised by:
- careful selection of the time the survey is conducted;
- using an up-to-date, accurate sample framework;
- revisiting or conducting 'call backs' to unavailable respondents;
- careful questionnaire design;
- providing thorough training for interviewers and processing staff; and
- being aware of all the factors affecting the topic under investigation.

Test of sound measurement
Scales should be tested for reliability, generalizability, and validity. Generalizability is the ability to
make inferences from a sample to the population, given the scale you have selected. Reliability is the
extent to which a scale will produce consistent results. Test-retest reliability checks how similar the
results are if the research is repeated under similar circumstances. Alternative forms reliability checks
how similar the results are if the research is repeated using different forms of the scale. Internal
consistency reliability checks how well the individual measures included in the scale are converted into a
composite measure.
Scales and indexes have to be validated. Internal validation checks the relation between the individual
measures included in the scale, and the composite scale itself. External validation checks the relation
between the composite scale and other indicators of the variable, indicators not included in the scale.
Content validation (also called face validity) checks how well the scale measures what it is supposed to
measure. Criterion validation checks how meaningful the scale criteria are relative to other possible
criteria. Construct validation checks what underlying construct is being measured. There are three
variants of construct validity. They are convergent validity, discriminant validity, and nomological
validity. The coefficient of reproducibility indicates how well the data from the individual measures
included in the scale can be reconstructed from the composite scale.

Reliability and Validity
For a research study to be accurate, its findings must be both reliable and valid.
Reliability means that the findings would be consistently the same if the study were done over again.
A valid measure is one that provides the information that it was intended to provide. The purpose of a
thermometer, for example, is to provide information on the temperature, and if it works correctly, it is a
valid thermometer.
A study can be reliable but not valid, and it cannot be valid without first being reliable. There are many
different threats to validity as well as reliability but an important early consideration is to ensure you
have internal validity.

Methods of Measuring Reliability
How do you measure the reliability of a particular measure? There are
four good methods of measuring reliability:
1. Test-retest
2. Multiple forms
3. Inter-rater
4. Split-half
1. Test-retest
The test-retest technique is to administer your test, instrument, survey, or measure to
the same group of people at different points in time. Most researchers administer what is called a pretest
for this, and to troubleshoot bugs at the same time.
Reliability estimates are usually in the form of a correlation coefficient, so here, all you do is calculate
the correlation coefficient between the two sets of scores for the group and report it as your reliability
coefficient.
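A minimal sketch of the test-retest calculation follows, assuming the Pearson correlation coefficient is the one intended. The scores are invented for illustration.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for the same five people at two points in time.
first_administration = [12, 15, 9, 20, 17]
second_administration = [13, 14, 10, 19, 18]

test_retest_reliability = pearson_r(first_administration, second_administration)
```

A coefficient close to 1 indicates that people kept much the same relative position on both occasions.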

2. Multiple Forms
The multiple forms technique has other names, such as parallel forms and disguised test-retest, but it's
simply the scrambling or mixing up of questions on your survey, for example, and giving it to the same
group twice. It's a more rigorous test of reliability.
3. Inter-rater
Inter-rater reliability is most appropriate when you use assistants to do interviewing or content analysis
for you. To calculate this kind of reliability, all you do is report the percentage of agreement on the same
subject between your raters, or assistants.
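The percentage of agreement between two raters can be computed as sketched below. The category codes assigned by the two raters are hypothetical.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of subjects that two raters coded identically."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of subjects")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)


# Hypothetical codes given by two assistants to the same ten interviews.
codes_a = ["pos", "neg", "pos", "neu", "neg", "pos", "pos", "neg", "neu", "pos"]
codes_b = ["pos", "neg", "neu", "neu", "neg", "pos", "neg", "neg", "neu", "pos"]

agreement = percent_agreement(codes_a, codes_b)
```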
4. Split-half
Split-half reliability is estimated by taking half of your test, instrument, or survey, and analyzing that
half as if it were the whole thing. Then, you compare the results of this analysis with your overall analysis.
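One common way to put the split-half idea into practice is to total the odd-numbered and even-numbered items separately and correlate the two half scores. That odd/even split is an assumption of this sketch, as are the item scores; the text itself only says that one half is analysed and compared with the overall analysis.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical answers of five respondents to a six-item instrument.
answers = [
    [4, 5, 4, 5, 4, 5],
    [2, 1, 2, 2, 1, 2],
    [3, 3, 4, 3, 3, 3],
    [5, 4, 5, 5, 5, 4],
    [1, 2, 1, 1, 2, 2],
]

# Total the 1st, 3rd, 5th items and the 2nd, 4th, 6th items per respondent.
odd_half = [sum(row[0::2]) for row in answers]
even_half = [sum(row[1::2]) for row in answers]

split_half_r = pearson_r(odd_half, even_half)
```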

Methods of Measuring Validity
Once you find that your measurement of the variable under study is reliable, you will want to measure its
validity. There are four good methods of estimating validity:
1. Face
2. Content
3. Criterion
4. Construct

1.Face Validity
Face validity is the least statistical estimate (validity overall is not as easily quantified as reliability) as it's
simply an assertion on the researcher's part claiming that they've reasonably measured what they
intended to measure. It's essentially a "take my word for it" kind of validity. Usually, a researcher asks
a colleague or expert in the field to vouch for the items measuring what they were intended to measure.

2.Content Validity
Content validity goes back to the ideas of conceptualization and operationalization. If the researcher has
focused in too closely on only one type or narrow dimension of a construct or concept, then it's
conceivable that other indicators were overlooked. In such a case, the study lacks content validity.
Content validity is making sure you've covered all the conceptual space.
There are different ways to estimate it, but one of the most common is a reliability approach where you
correlate scores on one domain or dimension of a concept on your pretest with scores on that domain or
dimension on the actual test.
Another way is to simply look over your inter-item correlations.

3.Criterion Validity
Criterion validity is using some standard or benchmark that is known to be a good indicator. There are
different forms of criterion validity:
- Concurrent validity is how well something estimates actual day-by-day behavior;
- Predictive validity is how well something estimates some future event or manifestation that hasn't
happened yet. It is commonly found in criminology.

4.Construct Validity
Construct validity is the extent to which your items are tapping into the underlying theory or model of
behavior. It's how well the items hang together (convergent validity) or distinguish different people on
certain traits or behaviors (discriminant validity). It's the most difficult validity to achieve. You have to
either do years and years of research or find a group of people to test that have the exact opposite traits
or behaviors you're interested in measuring.

Scaling techniques
Scaling is the measurement of a variable in such a way that it can be expressed on a continuum.
Rating your preference for a product from 1 to 10 is an example of a scale.
With comparative scaling, the items are directly compared with each other (example: Do you prefer
Pepsi or Coke?). In non-comparative scaling each item is scaled independently of the others (example:
How do you feel about Coke?).

Scale construction decisions
- What level of data is involved (nominal, ordinal, interval, or ratio)?
The type of information collected can influence scale construction. Different types of information are
measured in different ways.
1. Some data is measured at the nominal level. That is, any numbers used are mere labels: they
express no mathematical properties. Examples are SKU inventory codes and UPC bar codes.
2. Some data is measured at the ordinal level. Numbers indicate the relative position of items, but
not the magnitude of difference. An example is a preference ranking.
3. Some data is measured at the interval level. Numbers indicate the magnitude of difference
between items, but there is no absolute zero point. Examples are attitude scales and opinion
scales.
4. Some data is measured at the ratio level. Numbers indicate magnitude of difference and there is
a fixed zero point. Ratios can be calculated. Examples include: age, income, price, costs, sales
revenue, sales volume, and market share.

- What will the results be used for?
- Should you use a scale, index, or typology?
- What types of statistical analysis would be useful?
- Should you use a comparative scale or a non-comparative scale?
- How many scale divisions or categories to use (1 to 10; 1 to 7; -3 to +3)?
- Odd or even number of divisions - odd gives neutral center value; even forces respondents to take
a non-neutral position
- The nature and descriptiveness of the scale labels?
- The physical form or layout of the scale? (Graphic, simple linear, vertical, horizontal)
- Forced versus optional response?

Attitude measurement
Many of the questions in a research survey are designed to measure attitudes. Attitudes are a person's
general evaluation of something. Customer attitude is an important factor for the following reasons:
- Attitude helps to explain how ready one is to do something.
- Attitudes do not change much over time.
- Attitudes produce consistency in behavior.
- Attitudes can be related to preferences.

Attitudes can be measured using the following procedures:
- Self-reporting - subjects are asked directly about their attitudes. Self-reporting is the most
common technique used to measure attitude.
- Observation of behaviour - assuming that one's behaviour is a result of one's attitudes, attitudes
can be inferred by observing behaviour. For example, one's attitude about an issue can be
inferred by whether he/she signs a petition related to it.
- Indirect techniques - use unstructured stimuli such as word association tests.
- Performance of objective tasks - assumes that one's performance depends on attitude. For
example, the subject can be asked to memorize the arguments of both sides of an issue. He/she is
more likely to do a better job on the arguments that favour his/her stance.
- Physiological reactions - subject's response to a stimulus is measured using electronic or
mechanical means. While the intensity can be measured, it is difficult to know if the attitude is
positive or negative.
- Multiple measures - a mixture of techniques can be used to validate the findings, especially
worthwhile when self-reporting is used.

The attitude-measuring process
There is a remarkable variety of techniques that have been devised to measure attitudes. These
techniques range from direct to indirect, physiological to verbal, etc.
Obtaining verbal statements from respondents generally requires that the respondent perform a task such
as ranking, rating, sorting, or making a choice or a comparison.
Ranking tasks require that the respondent rank order a small number of objects in overall preference
on the basis of some characteristic or stimulus.
Rating asks the respondent to estimate the magnitude of a characteristic, or quality, that an object
possesses. The respondent indicates the position on a scale(s) where he or she would rate an object.
Sorting might present the respondent with several product concepts typed on cards and require that
the respondent arrange the cards into a number of piles or otherwise classify the product concepts.
Choice between two or more alternatives is another type of attitude measurement; it is assumed that
the chosen object is preferred over the other(s).
Physiological measures of attitudes provide a means of measuring attitudes without verbally questioning
the respondent. For example, galvanic skin response and blood pressure are physiological measures.

Scale construction technique
The various types of scales used in research fall into two broad categories: comparative and
non-comparative. In comparative scaling, the respondent is asked to compare one brand or product against
another. With non-comparative scaling respondents need only evaluate a single product or brand. Their
evaluation is independent of the other product and/or brands which the researcher is studying.
Non-comparative scaling is frequently referred to as monadic scaling and this is the more widely used
type of scale in commercial research studies.

I) Comparative scales
a) Paired comparison: It is sometimes the case that researchers wish to find out which are the most
important factors in determining the demand for a product. Conversely they may wish to know which
are the most important factors acting to prevent the widespread adoption of a product. Take, for example,
the very poor farmer response to the first design of an animal-drawn mould board plough. A
combination of exploratory research and shrewd observation suggested that the following factors played
a role in the shaping of the attitudes of those farmers who feel negatively towards the design:
- Does not ridge
- Does not work for inter-cropping
- Far too expensive
- New technology too risky
- Too difficult to carry.
Suppose the organisation responsible wants to know which factor is foremost in the farmer's mind. It
may well be the case that if those factors that are most important to the farmer are addressed, then the
others, being of a relatively minor nature, will cease to prevent widespread adoption. The alternatives are
to abandon the product's re-development or to completely re-design it, which is not only expensive and
time-consuming, but may well be subject to a new set of objections.
The process of rank ordering the objections from most to least important is best approached through the
questioning technique known as 'paired comparison'. Each of the objections is paired by the researcher so
that with 5 factors, as in this example, there are 10 pairs.

In 'paired comparisons' every factor has to be paired with every other factor in turn. However, only one
pair is ever put to the farmer at any one time.
The question might be put as follows:
Which of the following was the more important in making you decide not to buy the plough?

In most cases the question, and the alternatives, would be put to the farmer verbally. He/she then
indicates which of the two was the more important and the researcher ticks the box on his questionnaire.
The question is repeated with a second set of factors and the appropriate box ticked again. This process
continues until all possible combinations are exhausted, in this case 10 pairs. It is good practice to mix the
pairs of factors so that there is no systematic bias. The researcher should try to ensure that any particular
factor is sometimes the first of the pair to be mentioned and sometimes the second. The researcher would
never, for example, take the first factor (on this occasion 'Does not ridge') and systematically compare it to
each of the others in succession. That is likely to cause systematic bias.
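The pairing and mixing described above can be sketched as follows. The function name and the use of a fixed random seed are assumptions made for the example; the five factors are those listed in the text.

```python
import random
from itertools import combinations


def build_pair_schedule(factors, seed=0):
    """Pair every factor with every other one, then mix order and position."""
    rng = random.Random(seed)
    pairs = [list(pair) for pair in combinations(factors, 2)]
    for pair in pairs:
        rng.shuffle(pair)  # vary which factor of the pair is mentioned first
    rng.shuffle(pairs)     # vary the order in which the pairs are put
    return [tuple(pair) for pair in pairs]


factors = [
    "Does not ridge",
    "Does not work for inter-cropping",
    "Far too expensive",
    "New technology too risky",
    "Too difficult to carry",
]

schedule = build_pair_schedule(factors)
n = len(factors)
expected_pairs = n * (n - 1) // 2  # 5 factors give 10 pairs
```

Each tuple in `schedule` is one question to put to the farmer, with the first element mentioned first.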
Labels have been given to the factors below so that the worked example will be easier to understand. The
letters A - E have been allocated as follows:
A = Does not ridge
B = Far too expensive
C = New technology too risky
D = Does not work for inter-cropping
E = Too difficult to carry.
The data is then arranged into a matrix. Assume that 200 farmers have been interviewed and their
responses are arranged in the grid below. Further assume that the matrix is so arranged that we read
from top to side. This means, for example, that 164 out of 200 farmers said the fact that the plough was
too expensive was a greater deterrent than the fact that it was not capable of ridging. Similarly, 174
farmers said that the plough's inability to inter-crop was more important than the inability to ridge when
deciding not to buy the plough.
A preference matrix
       A     B     C     D     E
A    100   164   120   174   180
B     36   100   160   176   166
C     80    40   100   168   124
D     26    24    32   100   102
E     20    34    76    98   100

If the grid is carefully read, it can be seen that the rank order of the factors is -
Most important    E  Too difficult to carry
                  D  Does not inter-crop
                  C  New technology/high risk
                  B  Too expensive
Least important   A  Does not ridge.

It can be seen that it is more important for designers to concentrate on improving transportability and, if
possible, to give it an inter-cropping capability rather than focusing on its ridging capabilities (remember
that the example is entirely hypothetical).
One major advantage to this type of questioning is that whilst it is possible to obtain a measure of the
order of importance of five or more factors from the respondent, he is never asked to think about more
than two factors at any one time. This is especially useful when dealing with illiterate farmers. Having
said that, the researcher has to be careful not to present too many pairs of factors to the farmer during the
interview. If he does, he will find that the farmer will quickly get tired and/or bored. It is as well to
remember the formula of n(n - 1)/2. For ten factors, brands or product attributes this would give 45 pairs.
Clearly the farmer should not be asked to subject himself to having the same question put to him 45
times. For practical purposes, six factors is possibly the limit, giving 15 pairs.
It should be clear from the procedures described in these notes that the paired comparison scale gives
ordinal data.

b) Rupee Metric Comparisons: This type of scale is an extension of the paired comparison method in that
it requires respondents to indicate both their preference and how much they are willing to pay for their
preference. This scaling technique gives the researcher an interval-scaled measurement. An example is
given below:-
An example of a Rupee metric scale
Which of the following types of fish do you prefer?       How much more would you be prepared
                                                          to pay for your preferred fish? (Rs)
Option 1          Option 2          Preferred
Fresh             Fresh (gutted)    Fresh                  0.70
Fresh (gutted)    Smoked            Fresh (gutted)         0.50
Frozen            Smoked            Smoked                 0.60
Frozen            Fresh             Fresh                  0.70
Smoked            Fresh             Fresh                  0.20
Fresh (gutted)    Frozen            Fresh (gutted)         0.30
From the data above the preferences shown below can be computed as follows:
Fresh fish:            0.70 + 0.70 + 0.20 = 1.60
Smoked fish:           0.60 + (-0.20) + (-0.50) = (-0.10)
Fresh fish (gutted):   (-0.70) + 0.30 + 0.50 = 0.10
Frozen fish:           (-0.60) + (-0.70) + (-0.30) = (-1.60)
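The scoring can be sketched in Python as follows: each premium is added to the score of the preferred type and subtracted from the score of the other type, so the scores sum to zero across types. Which fish is preferred in each pair, and the 0.30 premium for the sixth pair, are taken from the signs and values used in the computations above, and should be read as assumptions of this sketch.

```python
from collections import defaultdict

# (preferred type, other type, premium in Rs) for each pair put to the respondent.
responses = [
    ("Fresh", "Fresh (gutted)", 0.70),
    ("Fresh (gutted)", "Smoked", 0.50),
    ("Smoked", "Frozen", 0.60),
    ("Fresh", "Frozen", 0.70),
    ("Fresh", "Smoked", 0.20),
    ("Fresh (gutted)", "Frozen", 0.30),
]


def rupee_metric_scores(responses):
    """Add each premium to the preferred item and subtract it from the other."""
    scores = defaultdict(float)
    for preferred, other, premium in responses:
        scores[preferred] += premium
        scores[other] -= premium
    return dict(scores)


scores = rupee_metric_scores(responses)
```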

c) The Unity-sum-gain technique: A common problem with launching new products is one of reaching a
decision as to what options, and how many options one offers. Whilst a company may be anxious to meet
the needs of as many market segments as possible, it has to ensure that each segment is large enough to
enable it to make a profit. It is always easier to add products to the product line but much more
difficult to decide which models should be deleted. One technique for evaluating the options which are
likely to prove successful is the unity-sum-gain approach.
The procedure is to begin with a list of features which might possibly be offered as 'options' on the
product, and alongside each you list its retail cost. A third column is constructed and this forms an index
of the relative prices of each of the items. The table below will help clarify the procedure. For the
purposes of this example the basic reaper is priced at Rs 20,000 and some possible 'extras' are listed along
with their prices.
The total value of these hypothetical 'extras' is Rs 7,460 but the researcher tells the farmer he has an
equally hypothetical Rs 3,950 or similar sum. The important thing is that he should have considerably less
hypothetical money to spend than the total value of the alternative product features. In this way the
farmer is encouraged to reveal his preferences by allowing researchers to observe how he trades one
additional benefit off against another. For example, would he prefer a side rake attachment on a 3 metre
head rather than have a transporter trolley on either a standard or 2.5m wide head? The farmer has to be
told that any unspent money cannot be retained by him so he should seek the best value-for-money he
can get.
In cases where the researcher believes that mentioning specific prices might introduce some form of bias
into the results, then the index can be used instead. This is constructed by taking the price of each item
over the total of Rs 7,460 and multiplying by 100. Survey respondents might then be given a maximum of
60 points and then, as before, are asked how they would spend these 60 points. In this crude example the
index numbers are not too easy to work with for most respondents, so one would round them as has been
done in the adjusted column. It is the relative and not the absolute value of the items which is important
so the precision of the rounding need not overly concern us.
The unity-sum-gain technique
Item                                          Additional Cost (Rs)   Index   Adjusted Index
2.5m wide rather than standard 2m                     2,000            27          30
Self lubricating chain rather than belt                 200            47          50
Side rake attachment                                    350             5          10
Polymer heads rather than steel                         250             3           5
Double rather than single edged cutters                 210           2.5           5
Transporter trolley for reaper attachment               650             9          10
Automatic levelling of table                            300             4           5
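The index construction described above (the price of each item over the total, multiplied by 100) can be sketched as follows. The function name is hypothetical, and only a few rows of the table are used as examples.

```python
TOTAL_VALUE_OF_EXTRAS = 7460  # Rs, the total stated in the text


def price_index(cost, total=TOTAL_VALUE_OF_EXTRAS):
    """Price of an item as a percentage of the total value of all extras."""
    return 100 * cost / total


# A few rows taken from the table (additional cost in Rs).
costs = {
    "2.5m wide rather than standard 2m": 2000,
    "Side rake attachment": 350,
    "Transporter trolley for reaper attachment": 650,
    "Automatic levelling of table": 300,
}

indices = {item: round(price_index(cost)) for item, cost in costs.items()}
```

Respondents would then be given a budget of points rather than money and asked to spend it across the items.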
The unity-sum-gain technique is useful for determining which product features are more important to
farmers. The design of the final market version of the product can then reflect the farmers' needs and
preferences. Practitioners treat data gathered by this method as ordinal.

II) Non-comparative scales
a) Continuous rating scales: The respondents are asked to give a rating by placing a mark at the
appropriate position on a continuous line. The scale can be written on card and shown to the respondent
during the interview. Two versions of a continuous rating scale are depicted in figure.
When version B is used, the respondent's score is determined either by dividing the line into as many
categories as desired and assigning the respondent a score based on the category into which his/her mark
falls, or by measuring the distance, in millimetres or inches, from either end of the scale.
Whichever of these forms of the continuous scale is used, the results are normally analysed as interval
scaled.
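The first scoring method, dividing the line into categories, can be sketched as below. The function name, the 100 mm line and the five categories are assumptions chosen for illustration.

```python
def category_score(mark_mm, line_mm, n_categories):
    """Score a mark by the equal-width category of the line it falls into.

    mark_mm is the distance of the mark from the left end of the line.
    Categories are numbered 1 (left end) to n_categories (right end).
    """
    if not 0 <= mark_mm <= line_mm:
        raise ValueError("the mark must lie on the line")
    width = line_mm / n_categories
    # A mark exactly at the right end belongs to the last category.
    return min(int(mark_mm // width) + 1, n_categories)


# Hypothetical mark 63 mm along a 100 mm line divided into 5 categories.
score = category_score(63, 100, 5)
```

The second method simply uses the measured distance itself as the score.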

b) Line marking scale: The line marked scale is typically used to measure perceived similarity differences
between products, brands or other objects. Technically, such a scale is a form of what is termed a
semantic differential scale since each end of the scale is labelled with a word/phrase (or semantic) that is
opposite in meaning to the other. The following figure provides an illustrative example of such a scale.
Consider the products below which can be used when frying food. In the case of each pair, indicate how
similar or different they are in the flavour which they impart to the food.
For some types of respondent, the line scale is an easier format because they do not find that discrete
numbers (e.g. 5, 4, 3, 2, 1) best reflect their attitudes/feelings. The line marking scale is a continuous scale.

c) Itemised rating scales: With an itemised scale, respondents are provided with a scale having numbers
and/or brief descriptions associated with each category and are asked to select one of the limited
number of categories, ordered in terms of scale position, that best describes the
product, brand, company or product attribute being studied. Examples of the itemised rating scale are
illustrated in figure.
Itemised rating scales can take a variety of innovative forms as demonstrated by the two illustrated in
figure, which are graphic.
Whichever form of itemised scale is applied, researchers usually treat the data as interval level.

d) Semantic scales: This type of scale makes extensive use of words rather than numbers. Respondents
describe their feelings about the products or brands on scales with semantic labels. When bipolar
adjectives are used at the end points of the scales, these are termed semantic differential scales. The
semantic scale and the semantic differential scale are illustrated in figure.

e) Likert scales: A Likert scale is what is termed a summated instrument scale. This means that the items
making up a Likert scale are summed to produce a total score. In fact, a Likert scale is a composite of
itemised scales. Typically, each scale item will have 5 categories, with scale values ranging from -2 to +2
with 0 as neutral response.
This explanation may be clearer from the example in the table below.
                                                         Strongly                                Strongly
                                                         agree      Agree   Neither   Disagree   disagree
If the price of raw materials fell firms would reduce
the price of their food products.                           -2        -1       0          1          2
Without government regulation the firms would
exploit the consumer.                                       -2        -1       0          1          2
Most food companies are so concerned about making
profits they do not care about quality.                     -2        -1       0          1          2
The food industry spends a great deal of money
making sure that its manufacturing is hygienic.             -2        -1       0          1          2
Food companies should charge the same price for
their products throughout the country                       -2        -1       0          1          2
The Likert scale
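Summation of the item scores into a total can be sketched as follows. The respondent's answers are hypothetical, and the function name is an assumption of the example.

```python
VALID_VALUES = {-2, -1, 0, 1, 2}  # the five scale values, with 0 as neutral


def likert_total(item_scores):
    """Sum one respondent's item scores into a single Likert score."""
    for value in item_scores:
        if value not in VALID_VALUES:
            raise ValueError(f"invalid scale value: {value}")
    return sum(item_scores)


# Hypothetical answers of one respondent to the five statements in the table.
respondent = [1, 2, -1, 0, 2]
total = likert_total(respondent)
```

With five items the total can range from -10 to +10.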
Likert scales are treated as yielding interval data by the majority of researchers. The scales which have
been described in this chapter are among the most commonly used in research. Whilst there are a great
many more forms which scales can take, if students are familiar with those described in this chapter they
will be well equipped to deal with most types of survey problem.

Selecting a measurement scale: some practical questions
There is no best scale that applies to all research projects. The choice of scale will be a function of the
nature of the attitudinal object to be measured, the manager's problem definition, and the backward and
forward linkages to other choices that have already been made (e.g., telephone survey versus mail
survey). There are several issues that will be helpful to consider:
Is a ranking, sorting, rating, or choice technique best? The answer to this question is largely determined
by the problem definition and especially by the type of statistical analysis that is desired.
Should a monadic or comparative scale be used? If a scale is other than a ratio scale, the researcher
must make a decision whether to use a standard of comparison. A monadic rating scale uses no such
comparison; it asks a respondent to rate a single concept in isolation. A comparative rating scale asks a
respondent to rate a concept in comparison with a benchmark; in many cases, "the ideal situation"
presents a reference for comparison with the actual situation.
What type of category labels, if any, will be used for the rating scale? We have discussed verbal labels,
numerical labels, and unlisted choices. The maturity and educational levels of the respondents and the
required statistical analysis will influence this decision.
How many scale categories or response positions are required to accurately measure an attitude? The
researcher must determine the number of meaningful positions that is best for each specific project.
Should a balanced or unbalanced rating scale be chosen? The fixed-alternative format may be
balanced (with a neutral or indifferent point at the center of the scale) or unbalanced. Unbalanced scales
may be used when the responses are expected to be distributed at one end of the scale; an unbalanced
scale may eliminate this type of "end piling."
Should respondents be given a forced-choice scale or a non-forced-choice scale? In many situations, a
respondent has not formed an attitude towards a concept, and simply cannot provide an answer. If many
respondents in the sample are expected to be unaware of the attitudinal object under investigation, this
problem may be eliminated by using a non-forced-choice scale that provides a "no opinion" category. The
argument for forced choice is that people really do have attitudes, even if they are unfamiliar with the
attitudinal object.
Should a single measure or an index measure be used? The researcher's conceptual definition will be
helpful in making this choice. The researcher has many scaling options. The choice is generally
influenced by what is planned for the later stages of the research project.


Chapter 5: Methods of Data Collection
A research design is a blueprint which directs the plan of action to complete the research work. The
collection of data is an important part of the research process. The quality and credibility of the results
derived from the application of research methodology depend upon relevant, accurate and adequate data.
In this unit, we shall study the various sources of data and the methods of collecting primary and
secondary data, with their merits and limitations, as well as the choice of a suitable method for data
collection.

Meaning and need for data
Data is required to make a decision in any business situation. The researcher is faced with one of the
most difficult problems of obtaining suitable, accurate and adequate data. Utmost care must be exercised
while collecting data because the quality of the research results depends upon the reliability of the data.
Suppose you are the Director of your company. Your Board of Directors has asked you to find out why
the profit of the company has decreased over the last two years. Your Board wants you to present facts
and figures. What are you going to do?
The first and foremost task is to collect the relevant information to analyse the above-mentioned
problem. Information collected from various sources for a specific purpose, and which can be expressed
in quantitative form, is called data. The rational decision maker seeks to evaluate information in order
to select the course of action that maximizes objectives. For decision making, the input data must be
appropriate. This depends on the appropriateness of the method chosen for data collection. The
application of a statistical technique is possible when the questions are answerable in quantitative terms,
for instance the cost of production and profit of the company measured in rupees, or the age of the
workers in the company measured in years. Therefore, the first step in statistical activities is to gather
data. The data may be classified as primary and secondary data. Let us now discuss these two kinds of
data in detail.

Primary and secondary data
Primary data are original data which are collected for the first time for a specific purpose. Such data
are published by authorities who themselves are responsible for their collection. Secondary data, on
the other hand, are those which have already been collected by some other agency and which have
already been processed. Secondary data may be available in the form of published or unpublished
sources. For instance, population census data collected by the Government in a country is primary data
for that Government, but the same data becomes secondary for those researchers who use it later. If you
have decided to collect primary data for your investigation, you have to identify the sources from which
you can collect that data. For example, if you wish to study the problems of the workers of X Company
Ltd., then the workers who are working in that company are the source. On the other hand, if you have
decided to use secondary data, you have to identify the secondary sources that have already collected the
related data for their own study purposes. From the above discussion, we can understand that the
difference between primary and secondary data is only one of degree: data which is primary in the hands
of one becomes secondary in the hands of another.

Primary data
Primary data can be obtained by communication or by observation. Communication involves questioning
respondents either verbally or in writing. This method is versatile, since one needs only to ask for the
information; however, the response may not be accurate. Communication usually is quicker and cheaper
than observation. Observation involves the recording of actions and is performed by either a person or
some mechanical or electronic device. Observation is less versatile than communication since some
attributes of a person may not be readily observable, such as attitudes, awareness, knowledge, intentions,
and motivation. Observation also might take longer since observers may have to wait for appropriate
events to occur, though observation using scanner data might be quicker and more cost effective.
Observation typically is more accurate than communication.
Some common types of primary data are:
- demographic and socioeconomic characteristics
- psychological and lifestyle characteristics
- attitudes and opinions
- awareness and knowledge - for example, brand awareness
- intentions - for example, purchase intentions. While useful, intentions are not a reliable indication
of actual future behaviour
- motivation - a person's motives are more stable than his/her behaviour, so motive is a better
predictor of future behaviour than is past behaviour
- behaviour

Methods of collecting primary data
If the available secondary data does not meet the requirements of the present study, the researcher has to
collect primary data. As mentioned earlier, data which is collected for the first time by the researcher
for his/her own purpose is called primary data. There are several methods of collecting primary data, such as
observation, interviews, local reporters and correspondents, questionnaires and schedules. Let us study
them in detail.

1. Observation Method
The Concise Oxford Dictionary defines observation as "accurate watching and noting of phenomena as
they occur in nature with regard to cause and effect or mutual relations". Thus observation is not only
systematic watching; it also involves listening and reading, coupled with consideration of the seen
phenomena. It involves three processes: sensation, attention or concentration, and perception.
Under this method, the researcher collects information directly through observation rather than through
the reports of others. It is a process of recording relevant information without asking anyone specific
questions and, in some cases, even without the knowledge of the respondents. This method of collection is
highly effective in behavioural surveys, for instance a study of the behaviour of visitors at trade fairs,
the attitude of workers on the job, or the bargaining strategies of customers. Observation can be
participant observation or non-participant observation. In the Participant Observation Method, the
researcher joins in the daily life of informants or organisations, and observes how they behave. In the
Non-participant Observation Method, the researcher does not join the informants or organisations but
watches from outside.

Merits:
1) This is the most suitable method when the informants are unable or reluctant to provide information.
2) This method provides deeper insights into the problem and generally the data is accurate and quicker
to process. Therefore, this is useful for intensive study rather than extensive study.

Despite the above merits, this method suffers from the following limitations:
1) In many situations, the researcher cannot predict when the events will occur. So when an event occurs
there may not be a ready observer to observe the event.
2) Participants may be aware of the observer and as a result may alter their behaviour.
3) The observer, because of personal biases and lack of training, may not record specifically what he/she
observes.
4) This method cannot be used extensively if the inquiry is large and spread over a wide area.

2. Interview Method
The interview is one of the most powerful tools and most widely used methods for primary data collection in
research. In our daily routine, we see interviews on T.V. channels on various topics related to social,
business, sports, budget etc. In the words of C. William Emory, "personal interviewing is a two way
purposeful conversation initiated by an interviewer to obtain information that is relevant to some
research purpose". Thus an interview is basically a meeting between two persons to obtain the
information related to the proposed study. The person who is interviewing is called the interviewer and
the person who is being interviewed is called the informant. It should be noted that the data collected
through this method are not limited to the conversation between the investigator and the informant:
glances, gestures, facial expressions, level of speech etc. are all part of the process.
Through this method, the researcher can collect varied types of data intensively and extensively.
Interviews can be classified as direct personal interviews and indirect personal interviews. Under the
technique of direct personal interview, the investigator meets the informants (who come under the
study) personally, asks them questions pertaining to the enquiry and collects the desired information. Thus if
a researcher intends to collect data on the spending habits of Delhi University (DU) students, he/she
would go to DU, contact the students, interview them and collect the required information.
The indirect personal interview is used where it is not possible to collect data directly from the
informants who come under the study. Under this method, the investigator contacts third parties or
witnesses who are closely associated with the persons/situations under study and are capable of
providing the necessary information. An example is an investigation into a bribery pattern in an office:
in such a case the desired information has to be obtained indirectly from other people who may know
about it. Similarly, clues about crimes are gathered by the CBI. Utmost care must be exercised to ensure
that the persons being questioned are fully aware of the facts of the problem under study, and are not
motivated to give a twist to the facts.
Interviews can also be classified as structured and unstructured. In the structured interview, set
questions are asked and the responses are recorded in a standardised form. This is useful in large scale
interviews where a number of investigators are assigned the job of interviewing, since the researcher can
minimise the bias of the interviewer. This technique is also called the formal interview. In the
unstructured interview, the investigator may not have a set of questions but only a number of key points
around which to build the interview. Normally, such interviews are conducted in the case of an
exploratory survey where the researcher is not completely sure about the type of data he/she will collect.
It is also called the informal interview. Generally, this method is used as a supplementary method of data
collection in conducting research in business areas.
Nowadays, telephone or cellphone interviews are widely used to obtain the desired information for
small surveys, for instance when banks interview credit card holders about the level of service they are
receiving. This technique is used in industrial surveys, especially in developed regions.

The major merits of this method are as follows:
1) People are more willing to supply information if approached directly. Therefore, personal interviews
tend to yield high response rates.
2) This method enables the interviewer to clarify any doubt that the interviewee might have while being
asked questions. Therefore, interviews are helpful in getting reliable and valid responses.
3) The informant's reactions to questions can be properly studied.
4) The researcher can adapt the language of communication to the standard of the informant, and can
obtain personal information about informants which is helpful in interpreting the results.

The limitations of this method are as follows:
1) The subjective factors or views of the investigator may come in, either consciously or unconsciously.
2) The interviewers must be properly trained, otherwise the entire work may be spoiled.
3) It is a relatively expensive and time-consuming method of data collection especially when the number
of persons to be interviewed is large and they are spread over a wide area.
4) It cannot be used when the field of enquiry is large (large sample).

Precautions : While using this method, the following precautions should be taken:
1. Obtain thorough details of the theoretical aspects of the research problem.
2. Identify who is to be interviewed.
3. The questions should be simple, clear and limited in number.
4. The investigator should be sincere, efficient and polite while collecting data.
5. The investigator should be of the same area (field of study, district, state etc.).

3. Through Local Reporters and Correspondents
Under this method, local investigators/agents or correspondents are appointed in different parts of the
area under investigation. This method is generally adopted by government departments in those cases
where regular information is to be collected. This method is also useful for newspapers, magazines, radio
and TV news channels. It is used when regular information is required and a high degree of accuracy is
not of much importance.

Merits:
1) This method is cheap and economical for extensive investigations.
2) It gives results easily and promptly.
3) It can cover a wide area under investigation.

Limitations:
1) The data obtained may not be reliable.
2) It gives approximate and rough results.
3) It is unsuited where a high degree of accuracy is desired.
4) As the agent/reporter or correspondent uses his own judgement, his personal bias may affect the
accuracy of the information sent.

4. Questionnaire and Schedule Methods
Questionnaire and schedule methods are the popular and common methods for collecting primary data
in research. Both the methods comprise a list of questions arranged in a sequence pertaining to the
investigation. Let us study these methods in detail one after another.
i) Questionnaire Method
Under this method, questionnaires are sent personally or by post to various informants with a request to
answer the questions and return the questionnaire. If the questionnaire is posted to informants, it is called
a Mail Questionnaire. Sometimes questionnaires may also be sent through e-mail, depending upon the
nature of the study and the availability of time and resources. After receiving the questionnaires, the
informants read the questions and record their responses in the space meant for the purpose on the
questionnaire. It is desirable to send the questionnaire with a self-addressed envelope to obtain a quick
and high rate of response.

Merits:
1) You can use this method in cases where informants are spread over a vast geographical area.
2) Respondents can take their own time to answer the questions. So the researcher can obtain original
data by this method.
3) This is a cheap method because its mailing cost is less than the cost of personal visits.
4) This method is free from bias of the investigator as the information is given by the respondents
themselves.
5) Large samples can be covered and thus the results can be more reliable and dependable.

Limitations:
1) Respondents may not return filled-in questionnaires, or they may delay in replying to the
questionnaire.
2) This method is useful only when the respondents are educated and co-operative.
3) Once the questionnaire has been despatched, the investigator cannot modify the questionnaire.
4) It cannot be ensured whether the respondents are truly representative.

ii) Schedule Method
As discussed above, a Schedule is also a list of questions, which is used to collect the data from the field.
This is generally filled in by the researcher or the enumerators. If the scope of the study is wide, then the
researcher appoints people who are called enumerators for the purpose of collecting the data. The
enumerators go to the informants, ask them the questions from the schedule in the order they are listed
and record the responses in the space meant for the answers in the schedule itself. For example, the
population census all over the world is conducted through this method. The difference between
questionnaire and schedule is that the former is filled in by the informants, the latter is filled in by the
researcher or enumerator.

Merits:
1) It is a useful method where the informants are illiterate.
2) The researcher can overcome the problem of non-response as the enumerators go personally to obtain
the information.
3) It is very useful in extensive studies and can obtain more reliable data.

Limitations:
1) It is a very expensive and time-consuming method as enumerators are paid persons and also have to
be trained.
2) Since the enumerator is present, the respondents may not respond to some personal questions.
3) Reliability depends upon the sincerity and commitment in data collection.
The success of data collection through the questionnaire method or schedule method depends on how the
questionnaire has been designed.


Primary data collection methods: Some Advantages & Disadvantages

Survey type | Advantages | Disadvantages
Spoken surveys | Effective in all situations, e.g. when literacy level is low. | Need a lot of organization.
Face to face surveys | Usually provide very accurate results. Any question can be asked. Can include observation and visual aids. | Expensive, especially when large areas are covered.
Face to face surveys at [...] | Can cover the entire population. | Expensive; much organization needed.
Face to face surveys in public places | Can do lots of interviews in a short time. | Samples are usually not representative of the whole population.
Telephone surveys | High accuracy obtainable if most members of population have telephones. | No visual aids possible. Only feasible with high telephone saturation.
Written surveys | Cheaper than face-to-face surveys. | Hard to tell if questions not correctly understood. More chance of question wording causing problems.
Mail surveys | Cheap. Allows anonymity. | Requires high level of literacy and good postal system. Slow to get [...]
[...] questionnaires collected and delivered | Cheap. Gives respondents time to check documents. | Respondents must be highly literate.
Fax surveys | Fast. | Questionnaires with more than one page are often only partially returned.
Email surveys | Very cheap. Quick results. | Samples not representative of whole population. Some respondents lie. High computer skills needed.
Web surveys | More easily processed than email [...] | Many people don't have good web [...]
Informal methods | Fast. | Can't produce accurate figures. Experience needed for comparisons. Subjective. Most suitable for preliminary studies.
Monitoring | Little work required. | Often not completely relevant. Samples often not representative. Most suitable when assessing [...]
Observation (can be combined with surveys) | More accurate than asking people about their behaviour. | Only works in limited situations.
Meters | More accurate than asking people about their behaviour. | Very expensive to set up; measure equipment rather than people. Can't find out reasons for behaviour.
Panels | Ability to discover changes in individuals' preferences and [...] | Need to maintain records of previous contact, etc.
Depth interviews | Provide insights not available with most other methods. | Expensive; need highly skilled interviewers.
Focus groups | Provide insights not available with most other methods. | Need highly skilled moderator, trained in psychology etc.
Consensus groups | Instant results. Clear wording. | Secretary and/or moderator need strong verbal skills. Don't work well in some cultures, e.g. Buddhist.
Internet qualitative [...] | Easy for a geographically dispersed group to meet. Low cost. | Doesn't provide the subtlety of personal interaction. Very new, so few experts available to help with [...]


Questionnaire Design
Questionnaires are frequently used in quantitative research. They are a valuable method of collecting a
wide range of information from a large number of respondents. Good questionnaire construction is
critical to the success of a survey. Inappropriate questions, incorrect ordering of questions, incorrect
scaling, or bad questionnaire format can make the survey valueless. A useful method for checking a
questionnaire for problems is to pretest it. This usually involves giving it to a small sample of
respondents, then interviewing the respondents to get their impressions and to confirm that the
questions accurately captured their opinions.
The design of a questionnaire will depend on whether the researcher wishes to collect exploratory
information (i.e. qualitative information for the purposes of better understanding or the generation of
hypotheses on a subject) or quantitative information (to test specific hypotheses that have previously
been generated).
If the data to be collected is qualitative or is not to be statistically evaluated, it may be that no formal
questionnaire is needed. For example, in interviewing the female head of the household to find out how
decisions are made within the family when purchasing breakfast foodstuffs, a formal questionnaire may
restrict the discussion and prevent a full exploration of the woman's views and processes. Instead one
might prepare a brief guide, listing perhaps ten major open-ended questions, with appropriate
probes/prompts listed under each.
If the researcher is looking to test and quantify hypotheses and the data is to be analysed statistically, a
formal standardised questionnaire is designed. Such questionnaires are generally characterised by:
- prescribed wording and order of questions, to ensure that each respondent receives the same
- prescribed definitions or explanations for each question, to ensure interviewers handle
questions consistently and can answer respondents' requests for clarification if they occur
- prescribed response format, to enable rapid completion of the questionnaire during the
interviewing process.
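
As a rough illustration of these three characteristics, a standardised questionnaire can be represented as a fixed, ordered collection of questions, each carrying its prescribed wording, explanation and response options. The sketch below is hypothetical throughout: the class name Question, the sample questions and the helper record_answer are assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Question:
    wording: str       # prescribed wording, asked exactly as written
    explanation: str   # prescribed clarification the interviewer may give
    options: tuple     # prescribed response format (fixed alternatives)

# Hypothetical instrument; the order of the tuple is the prescribed question order.
QUESTIONNAIRE = (
    Question(
        wording="How often do you buy breakfast cereal?",
        explanation="Count purchases made for the household only.",
        options=("Weekly", "Monthly", "Less often", "Never"),
    ),
    Question(
        wording="Who usually decides which brand is bought?",
        explanation="Name the person who makes the final choice.",
        options=("Respondent", "Spouse", "Children", "Other"),
    ),
)

def record_answer(question, answer):
    """Accept only an answer that matches one of the prescribed options."""
    if answer not in question.options:
        raise ValueError(f"{answer!r} is not a listed option")
    return {"question": question.wording, "answer": answer}

print(record_answer(QUESTIONNAIRE[0], "Weekly"))
```

Because every interviewer works from the same frozen definitions, each respondent is asked the same questions in the same order and answers in the same format, which is what makes the responses comparable in later statistical analysis.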
Given the same task and the same hypotheses, six different people will probably come up with six
different questionnaires that differ widely in their choice of questions, line of questioning, use of open-
ended questions and length. There are no hard-and-fast rules about how to design a questionnaire, but
there are a number of points that can be borne in mind:

1. A well-designed questionnaire should meet the research objectives. This may seem obvious, but
many research surveys omit important aspects due to inadequate preparatory work, and do not
adequately probe particular issues due to poor understanding. To a certain degree some of this is
inevitable. Every survey is bound to leave some questions unanswered and provide a need for further
research but the objective of good questionnaire design is to 'minimise' these problems.
2. It should obtain the most complete and accurate information possible. The questionnaire designer
needs to ensure that respondents fully understand the questions and are not likely to refuse to answer, lie
to the interviewer, or try to conceal their attitudes. A good questionnaire is organised and worded to
encourage respondents to provide accurate, unbiased, and complete information.
3. A well-designed questionnaire should make it easy for respondents to give the necessary information
and for the interviewer to record the answer and it should be arranged so that sound analysis and
interpretation are possible.
4. It should keep the interview brief and to the point and be so arranged that the respondent(s) remain
interested throughout the interview.
Even after the exploratory phase, two key steps remain to be completed before the task of designing the
questionnaire should commence. The first of these is to articulate the questions that research is intended
to address. The second step is to determine the hypotheses around which the questionnaire is to be
designed.
It is possible for the piloting exercise to be used to make necessary adjustments to administrative aspects
of the study. This would include, for example, an assessment of the length of time an interview actually
takes, in comparison to the planned length of the interview; or, in the same way, the time needed to
complete questionnaires. Moreover, checks can be made on the appropriateness of the timing of the study
in relation to contemporary events such as avoiding farm visits during busy harvesting periods.

Preliminary decisions in questionnaire design
There are nine steps involved in the development of a questionnaire:
1. Decide the information required.
2. Define the target respondents.
3. Choose the method(s) of reaching your target respondents.
4. Decide on question content.
5. Develop the question wording.
6. Put questions into a meaningful order and format.
7. Check the length of the questionnaire.
8. Pre-test the questionnaire.
9. Administer the questionnaires


1. Deciding on the information required
It should be noted that one does not start by writing questions. The first step is to decide 'what are the
things one needs to know from the respondent in order to meet the survey's objectives?' These, as has
been indicated in the opening chapter of this book, should appear in the research brief and the research
proposal.
One may already have an idea about the kind of information to be collected, but additional help can be
obtained from secondary data, previous rapid rural appraisals and exploratory research. In respect of
secondary data, the researcher should be aware of what work has been done on the same or similar
problems in the past, what factors have not yet been examined, and how the present survey questionnaire
can build on what has already been discovered. Further, a small number of preliminary informal
interviews with target respondents will give a glimpse of reality that may help clarify ideas about what
information is required.

2. Define the target respondents
At the outset, the researcher must define the population about which he/she wishes to generalise from
the sample data to be collected. For example, researchers often have to decide whether they should cover
only existing users of the generic product type or whether to also include non-users. Secondly,
researchers have to draw up a sampling frame. Thirdly, in designing the questionnaire we must take into
account factors such as the age, education, etc. of the target respondents.

3. Choose the method(s) of reaching target respondents
It may seem strange to be suggesting that the method of reaching the intended respondents should
constitute part of the questionnaire design process. However, a moment's reflection is sufficient to
conclude that the method of contact will influence not only the questions the researcher is able to ask but
the phrasing of those questions. The main methods available in survey research are:
- Personal interviews
- Group or focus interviews
- Mailed questionnaires
- Telephone interviews.
Within this region the first two mentioned are used much more extensively than the second pair.
However, each has its advantages and disadvantages. A general rule is that the more sensitive or
personal the information, the more personal the form of data collection should be.

4. Decide on question content
Researchers must always be prepared to ask, "Is this question really needed?" The temptation to include
questions without critically evaluating their contribution towards the achievement of the research
objectives, as they are specified in the research proposal, is surprisingly strong. No question should be
included unless the data it gives rise to is directly of use in testing one or more of the hypotheses
established during the research design.
There are only two occasions when seemingly "redundant" questions might be included:
Opening questions that are easy to answer and which are not perceived as being "threatening", and/or
are perceived as being interesting, can greatly assist in gaining the respondent's involvement in the
survey and help to establish a rapport.
This, however, should not be an approach that should be overly used. It is almost always the case that
questions which are of use in testing hypotheses can also serve the same functions.
"Dummy" questions can disguise the purpose of the survey and/or the sponsorship of a study. For
example, if a manufacturer wanted to find out whether its distributors were giving the consumers or end-
users of its products a reasonable level of service, the researcher would want to disguise the fact that the
distributors' service level was being investigated. If he/she did not, then rumours would abound that
there was something wrong with the distributor.

5. Develop the question wordings
There is a series of questions that should be posed as the researchers develop the survey questions:
a) "Is this question sufficient to generate the required information?"
For example, asking the question "Which product do you prefer?" in a taste panel exercise will reveal
nothing about the attribute(s) the product was judged upon. Nor will this question reveal the degree of
preference. In such cases a series of questions would be more appropriate.
b) "Can the respondent answer the question correctly?"
An inability to answer a question arises from three sources:
Having never been exposed to the answer, e.g. "How much does your husband earn?"
Forgetting, e.g. "What price did you pay when you last bought maize meal?"
An inability to articulate the answer: e.g. "What improvements would you want to see in food
preparation equipment?"
c) "Are there any external events that might bias response to the question?"
For example, judging the popularity of beef products shortly after a foot and mouth epidemic is likely to
have an effect on the responses.
d) "Do the words have the same meaning to all respondents?"
For example, "How many members are there in your family?"
There is room for ambiguity in such a question since it is open to interpretation as to whether one is
speaking of the immediate or extended family.
e) "Are any of the words or phrases loaded or leading in any way?"
For example, "What did you dislike about the product you have just tried?"
The respondent is not given the opportunity to indicate that there was nothing he/she disliked about the
product. A less biased approach would have been to ask a preliminary question along the lines of, "Did
you dislike any aspect of the product you have just tried?", and allow him/her to answer yes or no.
f) "Are there any implied alternatives within the question?"
The presence or absence of an explicitly stated alternative can have dramatic effects on responses. For
example, consider the following two forms of a question asked of a 'Pasta-in-a-Jar' concept test:
a. "Would you buy pasta-in-a-jar if it were locally available?"
b. "If pasta-in-a-jar and the cellophane pack you currently use were both available locally, would you:
- Buy only the cellophane packed pasta?
- Buy only the pasta-in-a-jar product?
- Buy both products?"
The explicit alternatives provide a context for interpreting the true reactions to the new product idea. If
the first version of the question is used, the researcher is almost certain to obtain a larger number of
positive responses than if the second form is applied.
g) "Will the question be understood by the type of individual to be interviewed?"
It is good practice to keep questions as simple as possible. Researchers must be sensitive to the fact that
some of the people they will be interviewing do not have a high level of education. Sometimes they
will have no idea how well or badly educated the respondents are until they get into the field. In the
same way, researchers should strive to avoid long questions. The fewer words in a question the better.
Respondents' memories are limited and absorbing the meaning of long sentences can be difficult: in
listening to something they may not have much interest in, the respondents' minds are likely to wander,
they may hear certain words but not others, or they may remember some parts of what is said but not all.
h) "Is there any ambiguity in my questions?"
The careless design of questions can result in the inclusion of two items in one question. For example:
"Do you like the speed and reliability of your tractor?"
The respondent is given the opportunity to answer only 'yes' or 'no', whereas he might like the speed, but
not the reliability, or vice versa. Thus it is difficult for the respondent to answer and equally difficult for
the researcher to interpret the response.
The use of ambiguous words should also be avoided. For example: "Do you regularly service your
tractor?"
The respondents' understanding and interpretation of the term 'regularly' will differ. Some may consider
that regularly means once a week, others may think once a year is regular. The inclusion of such words
again presents interpretation difficulties for the researcher.
i) "Are any words or phrases vague?"
Questions such as 'What is your income?' are vague and one is likely to get many different responses with
different dimensions. Respondents may interpret the question in different terms, for example:
- hourly pay?
- weekly pay?
- yearly pay?
- income before tax?
- income after tax?
- income in kind as well as cash?
- income for self or family?
- all income or just farm income?
The researcher needs to specify the 'term' within which the respondent is to answer.
j) "Are any questions too personal or of a potentially embarrassing nature?"
The researcher must be clearly aware of the various customs, morals and traditions in the community
being studied. In many communities there can be a great reluctance to discuss certain questions with
interviewers/strangers. Although the degree to which certain topics are taboo varies from area to area,
such subjects as level of education, income and religious issues may be embarrassing and respondents
may refuse to answer.
k) "Do questions rely on feats of memory?"
The respondent should be asked only for such data as he is likely to be able to clearly remember. One has
to bear in mind that not everyone has a good memory, so questions such as 'Four years ago was there a
shortage of labour?' should be avoided.

Putting questions into a meaningful order and format
i. Opening questions: Opening questions should be easy to answer and not in any way threatening
to the respondents. The first question is crucial because it is the respondent's first exposure to
the interview and sets the tone for the nature of the task to be performed. If they find the first
question difficult to understand, or beyond their knowledge and experience, or embarrassing in
some way, they are likely to break off immediately. If, on the other hand, they find the opening
question easy and pleasant to answer, they are encouraged to continue.
ii. Question flow: Questions should flow in some kind of psychological order, so that one leads
easily and naturally to the next. Questions on one subject, or one particular aspect of a subject,
should be grouped together. Respondents may feel it disconcerting to keep shifting from one
topic to another, or to be asked to return to some subject they thought they gave their opinions
about earlier.
iii. Question variety: Respondents quickly become bored and restless when asked similar questions
for half an hour or so. It usually improves response, therefore, to vary the respondent's task from
time to time. An open-ended question here and there (even if it is not analysed) may provide
much-needed relief from a long series of questions in which respondents have been forced to
limit their replies to pre-coded categories. Questions involving showing cards/pictures to
respondents can help vary the pace and increase interest.
iv. Closing questions: It is natural for a respondent to become increasingly indifferent to the
questionnaire as it nears the end. Because of impatience or fatigue, he may give careless answers
to the later questions. Those questions, therefore, that are of special importance should, if
possible, be included in the earlier part of the questionnaire. Potentially sensitive questions
should be left to the end, to avoid respondents cutting off the interview before important
information is collected.
In developing the questionnaire the researcher should pay particular attention to the
presentation and layout of the interview form itself. The interviewer's task needs to be made as
straight-forward as possible.

Types of Questions
Survey questions can be classified as follows:-
1. Contingency questions - A question that is answered only if the respondent gives a particular
response to a previous question. This avoids asking people questions that do not apply to them
(for example, asking men if they have ever been pregnant).
2. Matrix questions - Identical response categories are assigned to multiple questions. The questions
are placed one under the other, forming a matrix with response categories along the top and a list of
questions down the side. This is an efficient use of page space and respondents' time.
3. Scaled questions - Responses are graded on a continuum (example: rate the appearance of the
product on a scale from 1 to 10, with 10 being the most preferred appearance). Examples of types of
scales include the Likert scale, semantic differential scale, and rank-order scale.
4. Closed ended questions - Respondents' answers are limited to a fixed set of responses. Most scales
are closed ended. Other types of closed ended questions include:
* Dichotomous questions - The respondent answers with a yes or a no.
* Multiple choice - The respondent has several options from which to choose.
Advantages of closed format
The closed (that is, forced-choice) format:
- Is easy and quick to fill in
- Minimises discrimination against the less literate (in self-administered questionnaires) or the less
articulate (in interview questionnaires)
- Is easy to code, record, and analyse quantitatively
- Makes results easy to report
5. Open ended questions - No options or predefined categories are suggested. The respondent
supplies their own answer without being constrained by a fixed set of possible responses. Examples
of types of open ended questions include:
o Completely unstructured - For example, What is your opinion of questionnaires?
o Word association - Words are presented and the respondent mentions the first word that comes
to mind.
o Sentence completion - Respondents complete an incomplete sentence. For example, The most
important consideration in my decision to buy a new house is . . .
o Story completion - Respondents complete an incomplete story.
o Picture completion - Respondents fill in an empty conversation balloon.
o Thematic apperception test - Respondents explain a picture or make up a story about what they
think is happening in the picture
Advantages of open format
- Allows exploration of the range of possible themes arising from an issue
- Can be used even if a comprehensive range of alternative choices cannot be compiled
There are three commonly used rating scales: graphic, itemized, and comparative.
- Graphic - simply a line on which one marks an X anywhere between the extremes with an infinite
number of places where the X can be placed.
- Itemized - similar to graphic except there are a limited number of categories that can be marked.
- Comparative - the respondent compares one attribute to others. Examples include the Q-sort
technique and the constant sum method, which requires one to divide a fixed number of points
among the alternatives.
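The constant sum method lends itself to a simple check at the data-entry stage. The sketch below is illustrative only: the function name, the 100-point total and the attribute names are assumptions, not part of the method as described above.

```python
def check_constant_sum(allocation, total=100):
    """Return True if the points a respondent gave add up to the fixed total.

    allocation -- dict mapping each alternative to the points awarded
    total      -- the fixed number of points to be divided (assumed: 100)
    """
    # Negative points make no sense in a constant sum question.
    if any(points < 0 for points in allocation.values()):
        return False
    return sum(allocation.values()) == total


# Hypothetical respondent dividing 100 points among three attributes.
answer = {"price": 50, "brand name": 30, "perfume": 20}
print(check_constant_sum(answer))
```

Answers that fail the check can be sent back to the interviewer or set aside before analysis.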
Questionnaires typically are administered via a personal or telephone interview or via a mail
questionnaire. Newer methods include e-mail and the Web.

7. Checking and editing
Though completed questionnaires should already have been checked by interviewers and supervisors,
they need to be checked again before (or during) data entry.
What to check
Every questionnaire needs to be thoroughly checked:
All standard items at the beginning or end of a questionnaire should be filled in. They usually include:
- the questionnaire serial number
- the place where the interview was done (often in coded form)
- the interviewer's name (or initials, or number)
- the date and time of interview.
These are not questions asked of the respondent, but information supplied by the interviewer. If the
interviewer forgot to include something here, the supervisor should have noticed, and made sure it was
added. But sometimes newly trained supervisors do not notice these omissions. The sooner these
problems are found, the more easily they can be corrected.
- Check that every question which is supposed to have only one answer does not have more.
- Check that no question which should have been skipped has an answer entered.
- If an answer has been written in because no code applied, perhaps a new code will have to be
created. This will have to be done after looking at all answers to this question, after going
through all the questionnaires.
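Where the questionnaires are entered on a computer, the first two checks in the list above can be automated. The following is a minimal sketch under assumed conventions: each questionnaire is a dictionary of question ids and circled codes, and the question ids and skip rule shown are hypothetical.

```python
def check_questionnaire(record, single_answer, skip_rules):
    """Return a list of problems found in one completed questionnaire.

    record        -- dict mapping question id to a list of circled codes
    single_answer -- set of question ids that allow only one answer
    skip_rules    -- dict mapping (question id, code) to the question ids
                     that must be left blank when that code is circled
    """
    problems = []
    # Check 1: single-answer questions must not have more than one code.
    for question in sorted(single_answer):
        if len(record.get(question, [])) > 1:
            problems.append(f"{question}: more than one answer")
    # Check 2: questions that should have been skipped must be blank.
    for (question, code), skipped in skip_rules.items():
        if code in record.get(question, []):
            for target in skipped:
                if record.get(target):
                    problems.append(f"{target}: answered but should be skipped")
    return problems


# Hypothetical example: code 2 on Q1 means Q3 should have been skipped.
record = {"Q1": [2], "Q2": [1, 3], "Q3": [1]}
print(check_questionnaire(record, {"Q1", "Q2"}, {("Q1", 2): ["Q3"]}))
```

Written-in answers still need to be read by a person, since deciding whether a new code is justified is a matter of judgement.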

Physical appearance of the questionnaire
The physical appearance of a questionnaire can have a significant effect upon both the quantity and
quality of marketing data obtained. The quantity of data is a function of the response rate. Ill-designed
questionnaires can give an impression of complexity and of too big a time commitment. Data
quality can also be affected by the physical appearance of the questionnaire with unnecessarily confusing
layouts making it more difficult for interviewers, or respondents in the case of self-completion
questionnaires, to complete this task accurately.
Attention to just a few basic details can have a disproportionately advantageous impact on the data
obtained through a questionnaire.

Use of booklets: The use of booklets, in place of loose or stapled sheets of paper, makes it easier for the
interviewer or respondent to progress through the document. Moreover, fewer pages tend to get lost.
Simple, clear presentation: The clarity of questionnaire presentation can also help to improve the ease
with which interviewers or respondents are able to complete a questionnaire.
Creative use of space: In their anxiety to reduce the number of pages of a questionnaire, researchers tend
to put too much information on a page. This is counter-productive since it gives the questionnaire the
appearance of being complicated. Questionnaires that make use of blank space appear easier to use, enjoy
higher response rates and contain fewer errors when completed.
Use of colour: Colour coding can help in the administration of questionnaires. It is often the case that
several types of respondents are included within a single survey (e.g. wholesalers and retailers). Printing
the questionnaires on two different colours of paper can make the handling easier.
Interviewer instructions: Interviewer instructions should be placed alongside the questions to which they
pertain. Instructions on where the interviewers should probe for more information or how replies should
be recorded are placed after the question.

In general it is best for a questionnaire to be as short as possible. A long questionnaire leads to a long
interview and this is open to the dangers of boredom on the part of the respondent (and poorly
considered, hurried answers), interruptions by third parties and greater costs in terms of interviewing
time and resources. In a rural situation, an interview should not last longer than 30-45 minutes.

8. Piloting the questionnaires
Even after the researcher has proceeded along the lines suggested, the draft questionnaire is a product
evolved by one or two minds only. Until it has actually been used in interviews and with respondents, it
is impossible to say whether it is going to achieve the desired results. For this reason it is necessary to
pre-test the questionnaire before it is used in a full-scale survey, to identify any mistakes that need
correcting.
The purpose of pre-testing the questionnaire is to determine:
- whether the questions as they are worded will achieve the desired results
- whether the questions have been placed in the best order
- whether the questions are understood by all classes of respondent
- whether additional or specifying questions are needed or whether some questions should be
eliminated
- whether the instructions to interviewers are adequate.
Usually a small number of respondents are selected for the pre-test. The respondents selected for the pilot
survey should be broadly representative of the type of respondent to be interviewed in the main survey.
If the questionnaire has been subjected to a thorough pilot test, the questions and the
questionnaire will have evolved into their final form. All that remains to be done is the mechanical process
of laying out and setting up the questionnaire in its final form. This will involve grouping and sequencing
questions into an appropriate order, numbering questions, and inserting interviewer instructions.
Recoding frequent "other" answers
It is annoying to read a survey report and find that a large proportion of the answers to a question were
"other". The goal should be to make sure the "other" category is the one with the fewest answers -
certainly no more than 5%. Take for example this question:
"Which languages do you understand?"
(Circle all codes that apply)
1 Marathi
2 Hindi
3 Bengali
4 English
5 Other - write in: ......................
If 10% of people gave an "other" answer, the written-in responses will need to be counted. If 4% of people
understood Gujarati, and 3% understood Tamil, two new codes could be created:
6 = Gujarati
7 = Tamil
For each questionnaire mentioning these languages, the circled 5 should be crossed out (unless a different
"other" language was also mentioned), and 6 and/or 7 written in and circled. This should reduce the
remaining "other" figure to about 3%. Unless at least 2% of respondents give a particular "other" answer,
it is usually not worthwhile to create a separate code. Sometimes a number of "other" answers can be
grouped, e.g.
8 = South Indian languages
But when such a code has been made, there is no way to recode the question except by going back to all
the questionnaires with that code. The principle should be not to combine any answers which you might
later want to look at separately.
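The counting step can be sketched as follows, assuming the written-in answers have been typed into a list. The function name and the sample data are hypothetical; the 2% threshold is the rule of thumb given above.

```python
from collections import Counter


def propose_new_codes(write_ins, total_respondents, threshold_percent=2):
    """Return the write-in answers common enough to deserve their own code.

    write_ins         -- list of written-in "other" answers
    total_respondents -- number of respondents who were asked the question
    threshold_percent -- minimum share of respondents (rule of thumb: 2%)
    """
    # Normalise spacing and capitalisation so "tamil " and "Tamil" match.
    counts = Counter(answer.strip().title() for answer in write_ins)
    return {
        answer: count
        for answer, count in counts.items()
        if count * 100 >= threshold_percent * total_respondents
    }


# Hypothetical data: 10 write-ins out of 100 respondents.
write_ins = ["Gujarati"] * 4 + ["Tamil"] * 3 + ["Urdu", "Odia", "Nepali"]
print(propose_new_codes(write_ins, total_respondents=100))
# {'Gujarati': 4, 'Tamil': 3}
```

The three answers mentioned once each stay in the "other" category, which then accounts for 3% of respondents.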

Coding open-ended questions
With some open-ended questions, you expect to find many answers recurring.
For example: "What is your occupation?" There will be some occupations which are very common, some
less common, and there will probably be a lot of occupations which only one respondent in the sample
mentions. With other open-ended questions (such as "What do you like most about listening to FM
RADIOMIRCHI?") you may find that no two respondents give the same answer.
For both types of question, the standard coding method is the same: you take a sub-sample of answers to
that question - often the first 100 answers to come in. (That may be a lot more than 100 questionnaires, if
not everybody is asked the question.)
Each different answer is written on a slip of paper, and these answers are then sorted into groups with
similar meanings. Usually, there are 10 to 20 groups. If fewer than 2 people in 100 give a particular
answer, it is not worthwhile having a separate code for that answer - unless it has a very specific and
different meaning from all the others.

Having defined these 10 to 20 groups, a code number is then assigned to each. Following the example of
what people like about FM RADIOMIRCHI, these codes might be assigned.
01 = like everything about FM RADIOMIRCHI
02 = like nothing about FM RADIOMIRCHI
03 = the announcers in general
04 = the programs in general
05 = the music
06 = news bulletins
07 = talkback
08 = breakfast program
09 = lunchtime program
10 = other

A practical problem with such a coding scheme is that the more codes are defined, the more likely some
are to be very similar, and the coders may not be consistent in assigning codes to answers.
When consistency is very important, any codes which are not absolutely clear should be allocated by
several coders working together, or by a single supervisor. As new comments are found, which are not
covered by the original codes, new codes will need to be added.
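One simple way to see whether coders are being consistent is to have two coders code the same batch of answers independently and count how often they agree. This check is not described in the text; it is offered here as an illustrative sketch, and it does not correct for agreement that would occur by chance. The code numbers follow the example coding frame above; the data are hypothetical.

```python
def percent_agreement(coder_a, coder_b):
    """Share of answers to which two coders assigned the same code."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must code the same non-empty set of answers")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)


# Hypothetical codes given by two coders to the same five answers.
coder_a = ["05", "03", "06", "10", "05"]
coder_b = ["05", "04", "06", "10", "05"]
print(percent_agreement(coder_a, coder_b))  # 0.8
```

Answers on which the coders disagree (here, "the announcers" versus "the programs") are the ones worth referring to a supervisor.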
There are many ways in which an answer can be given a code - what is most useful depends on any
action you might take as a result of the survey. If there are particular types of answer you are looking for,
you could create codes for these. For example, if a station broadcasts programs in a particular language,
that language should be listed as a code. Even if no respondent understands that language, this in itself is
useful information.
For open-ended questions with predefined answers (such as occupations) there may be no need to build a
coding frame by looking at the answers. For example, occupation coding is often done using the 10 major
groups from the International Standard Classification of Occupations.
That is one way to code open-ended questions. It works well for questions with a limited number of
answers, but for questions on attitudes, opinions, and so on, counting the coded categories loses much of
the detail in the answers. Another approach is to use the whole wording of the answers - e.g. by entering
the verbatim answers on a computer file. The coding can then be very simple, and summarize the exact
wording. We can use coding schemes such as...
0 = made no comment
1 = made a comment
or
1 = positive or favourable comment
2 = negative or unfavourable comment
3 = neutral comment, or mixed positive and negative.
These very broad coding schemes are much quicker to apply, and less dependent on a coder's opinion.
But the broad codes are not very useful, unless you also report the exact wording of comments.

9. How to administer the questionnaires
There are several ways of administering questionnaires. They may be self administered or read out by
interviewers. Self administered questionnaires may be sent by post or email, or completed online.
Interview administered questionnaires may be by telephone or face to face.
The exact method of administration also depends on who the respondents are. For example, university
lecturers may be more appropriately surveyed by email; older people by telephone interviews; train
passengers by face-to-face interviews.
Advantages of self-administered questionnaires
Advantages of self-administered questionnaires include:
- They are less expensive than interviews.
- They do not require a large staff of skilled interviewers.
- They can be administered in large numbers all at one place and time.
- Anonymity and privacy encourage more candid and honest responses.
- Lack of interviewer bias.
- Speed of administration and analysis.
- Suitable for computer based research methods.
- Less pressure on respondents.
Advantages of researcher administered interviews
Advantages of researcher administered interviews include:
- Fewer misunderstood questions and inappropriate responses.
- Fewer incomplete responses.
- Higher response rates.
- Greater control over the environment that the survey is administered in.

Introduction, personalised letter, and ending
It is good practice to include either a personalised covering letter or at least an introduction explaining
briefly the purpose of the survey, the importance of the respondents' participation, who is responsible for
the survey, and a statement guaranteeing confidentiality. A personalised letter can be easily generated
using mail-merge on a word processor. It is also important to thank the respondent at the end of the
questionnaire.
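For readers who prefer a script to a word processor, the same mail-merge idea can be sketched with Python's standard `string.Template`. The wording of the letter, the function name and the respondent names are all hypothetical placeholders.

```python
from string import Template

# Hypothetical covering letter; $-prefixed names are filled in per respondent.
LETTER = Template(
    "Dear $name,\n\n"
    "We are conducting a survey on $topic. Your participation is important "
    "to us, and your answers will be kept confidential.\n\n"
    "$researcher\n$organisation"
)


def personalise(respondents, **common):
    """Return one letter per respondent, sharing the common details."""
    return [LETTER.substitute(name=person, **common) for person in respondents]


letters = personalise(
    ["Dr. Rao", "Ms. Iyer"],
    topic="the after-shave lotion market",
    researcher="(researcher name)",
    organisation="(organisation)",
)
print(letters[0])
```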

Questionnaire design Issues
The way questions are phrased is important and there are some general rules for constructing good
questions in a questionnaire.

1) Use short and simple sentences
Short, simple sentences are generally less confusing and ambiguous than long, complex ones. As a rule of
thumb, most sentences should contain one or two clauses. Sentences with more than three clauses should
be rephrased.
2) Ask for only one piece of information at a time
For example, "Please rate the lecture in terms of its content and presentation" asks for two pieces of
information at the same time. It should be divided into two parts: "Please rate the lecture in terms of (a)
its content, (b) its presentation."
3) Avoid negatives if possible
Negatives should be used only sparingly. For example, instead of asking students whether they agree
with the statement, "Small group teaching should not be abolished," the statement should be rephrased
as, "Small group teaching should continue." Double negatives should always be avoided.
4) Ask precise questions
Questions may be ambiguous because a word or term may have a different meaning. For example, if we
ask students to rate their interest in "medicine," this term might mean "general medicine" (as opposed to
general surgery) to some, but inclusive of all clinical specialties (as opposed to professions outside
medicine) to others.
Another source of ambiguity is a failure to specify a frame of reference. For example, in the question,
"How often did you borrow books from your library?" the time reference is missing. It might be
rephrased as, "How many books have you borrowed from the library within the past six months?"
5) Ensure those you ask have the necessary knowledge
For example, in a survey of University lecturers on recent changes in higher education, the question, "Do
you agree with the recommendations in the Dearing report on higher education?" is unsatisfactory for
several reasons. Not only does it ask for several pieces of information at the same time as there are several
recommendations in the report, the question also assumes that all lecturers know about the relevant
recommendations.
6) Level of details
It is important to ask for the exact level of details required. On the one hand, you might not be able to
fulfill the purposes of the survey if you omit to ask essential details. On the other hand, it is important to
avoid unnecessary details. People are less inclined to complete long questionnaires. This is particularly
important for confidential sensitive information, such as personal financial matters or marital relationship
issues.
7) Sensitive issues
It is often difficult to obtain truthful answers to sensitive questions. Clearly, the question, "Have you ever
copied other students' answers in a degree exam?" is likely to produce either no response or negative
responses. Less direct approaches have been suggested. Firstly, the casual approach: "By the way, do you
happen to have copied other students' answers in a degree exam?" may be used as a last part of another
decoy question. Secondly, the numbered card approach: "Please tick one or more of the following items
which correspond to how you have answered degree examination questions in the past." In the list of
items, include "copy from other students" as one of many items. Thirdly, the everybody approach: "As we
all know, most medical students have copied other students' answers in degree exams. Do you happen to
be one of them?" Fourthly, the other people approach. This approach was used in the recent medical student
survey. In this survey, students were given the scenario, "Anil copies answers in a degree exam from
Sunil." They were then asked, "Do you feel Anil is wrong, what penalty should be imposed on Anil, and
have you done or would you consider doing the above?"
8) Minimise bias
People tend to answer questions in a way they perceive to be socially desired or expected by the
questioner and they often look for clues in the questions. Many apparently neutral questions can
potentially lead to bias. For example, in the question, "Within the past month, how many lectures have
you missed due to your evening job?" students may perceive the desired response to be "never".
This question could be rephrased as, "Within the past month, how many times did your
evening job commitment clash with lectures? How many times did you give priority to your evening
job?" Take another example: "Please rate how useful the following text-books are. Please
also state whether they are included in your lecturer's recommended reading list." There is a risk that the
students may perceive that they should rate books recommended by lecturers more favourably than
those not recommended by their lecturers. This risk may be minimised by putting the second question
later on in the questionnaire.

Typical example of: - Questionnaire for consumers

Dear Respondent,
We are conducting a survey of the after-shave lotion market. We would be grateful if you could fill in
the following questionnaire in this regard.

Madhav B
Student of PDIMTR

1. Do you use an after shave lotion?
( ) Yes ( ) No
If you do not use an after shave lotion, go to Question 12.

2. Please name a few after shave lotions you have heard of.
a ..
b .

3. Which of the following brands have you heard of? TICK
a. Park Avenue b. Old Spice
c. Savage d. English Leather
e. Patricks f. Williams
g. Aramis h. Givenchy
i. Brut j. Yardley

4. a. Which after shave lotion are you using at present?.....
b. If you are to select an after shave brand now which brand
will you choose? ......

5. Can you recall the name of the previous brand of after shave lotion you used? Please mention......

6. Can you give reasons for consistency/change in your after shave lotion?
Consistency
a. Habitual
b. Value for money
c. Don't like others
d. Any other, please specify
Change
a. Like to try other brands
b. For a change
c. All brands are same
d. Any other, please specify

7. Why do you use an after shave lotion? TICK
a. For its antiseptic properties
b. As a perfume
c. To feel fresh
d. Girlfriend loves it
e. To get the sting.
f. Any other reason, please mention.

8. When do you use an after shave lotion?
a. Immediately after shaving
b. After a bath
c. Anytime of the day
d. Before going to a party.
e. ............

9. Given an easy availability of Indian and foreign brands of after shave lotion which brand do you
prefer?
( ) Indian ( ) Imported
a. Perfume is better
b. Quality is better
c. Brand image
d. Price is lower
e. Status
f. Easy availability
g. Any other, please specify.

10. Who buys the after shave lotion for you?
a. Self
b. Family members
c. Normally get it as a gift
d. ............

11. Here we have mentioned a set of factors that you may consider while buying an after shave lotion.
Give your response on a seven point scale ranging from (1) most important to (7) least
important for each of them.
a. Price
b. Brand name
c. Perfume
d. Antiseptic property
e. Type of bottle (with/without atomizer)

12. Personal Information:
Age: ( ) less than 18 years ( ) 18-25 years
( ) 26-35 years ( ) above 35 years
Family Income:
( ) less than Rs. 36000 p.a.
( ) Rs. 36000 to Rs. 72000 p.a.
( ) above Rs. 72000 p.a.
Occupation: Govt. Service/Private Service/Student/Business/Any Other ......

Thank you for your participation in this survey!

Collection of secondary data
Before going through the time and expense of collecting primary data, one should check for secondary
data that previously may have been collected for other purposes but that can be used in the immediate
study. Secondary data may be internal to the firm, such as sales invoices and warranty cards, or may be
external to the firm such as published data or commercially available data. The government census is a
valuable source of secondary data.
Secondary data has the advantage of saving time and reducing data gathering costs. The disadvantages
are that the data may not fit the problem perfectly and that the accuracy may be more difficult to verify
for secondary data than for primary data.
Some secondary data is republished by organizations other than the original source. Because errors can
occur and important explanations may be missing in republished data, one should obtain secondary data
directly from its source. One also should consider who the source is and whether the results may be
biased.

The nature of secondary sources of information
Secondary data is data that has been collected by individuals or agencies for purposes other than those
of our particular research study. For example, if a government department has conducted a survey of,
say, family food expenditures, then a food manufacturer might use this data in the organisation's
evaluations of the total potential market for a new product. Similarly, statistics prepared by a ministry on
agricultural production will prove useful to a whole host of people and organisations, including those
marketing agricultural supplies.
No research study should be undertaken without a prior search of secondary sources (also termed desk
research).

There are several grounds for making such a bold statement.
- Secondary data may be available which is entirely appropriate and wholly adequate to draw
conclusions and answer the question or solve the problem. Sometimes primary data collection simply
is not necessary.
- It is far cheaper to collect secondary data than to obtain primary data. For the same level of research
budget a thorough examination of secondary sources can yield a great deal more information than can
be had through a primary data collection exercise.
- The time involved in searching secondary sources is much less than that needed to complete primary
data collection.
- Secondary sources of information can yield more accurate data than that obtained through primary
research. This is not always true but where a government or international agency has undertaken a
large scale survey, or even a census, this is likely to yield far more accurate results than custom
designed and executed surveys when these are based on relatively small sample sizes.
- It should not be forgotten that secondary data can play a substantial role in the exploratory phase of the
research when the task at hand is to define the research problem and to generate hypotheses. The
assembly and analysis of secondary data almost invariably improves the researcher's understanding of
the marketing problem, the various lines of inquiry that could or should be followed and the
alternative courses of action which might be pursued.
- Secondary sources help define the population. Secondary data can be extremely useful both in
defining the population and in structuring the sample to be taken. For instance, government statistics
on a country's agriculture will help decide how to stratify a sample and, once sample estimates have
been calculated, these can be used to project those estimates to the population.
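
The last point, using secondary data to stratify a sample and then project sample estimates to the population, can be sketched as follows. This is only an illustration: the strata, the stratum sizes (which in practice would be taken from government statistics) and the sample values are all invented, and `project_total` is a hypothetical helper name.

```python
def project_total(strata):
    """Project stratum sample means to a population total.

    Each stratum is a dict with 'N' (number of population units in the
    stratum, taken from secondary data) and 'sample' (observed values).
    """
    total = 0.0
    for s in strata:
        sample_mean = sum(s["sample"]) / len(s["sample"])
        total += s["N"] * sample_mean  # weight each mean by stratum size
    return total

# Hypothetical farm strata; values are hectares under a crop per farm.
strata = [
    {"name": "small farms", "N": 8000, "sample": [2.0, 3.0, 2.5]},
    {"name": "medium farms", "N": 1500, "sample": [10.0, 12.0]},
    {"name": "large farms", "N": 500, "sample": [40.0, 50.0]},
]

estimated_total = project_total(strata)
estimated_mean = estimated_total / sum(s["N"] for s in strata)
print(estimated_total, estimated_mean)
```

Weighting each stratum mean by the stratum size from the secondary source keeps a few very large units from distorting the overall estimate.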

Precaution in Using Secondary Data

From the above discussion, we can see that there are many published and unpublished sources from
which a researcher can get secondary data. However, the researcher must be cautious in using this type of
data, because it may be full of errors arising from bias, inadequate sample size, errors of definition, etc.
Bowley observed that it is never safe to take published or unpublished statistics at their face value
without knowing their meaning and limitations. Hence, before using secondary data, you must examine
the following points.
1. Suitability of Secondary Data
Before using secondary data, you must ensure that the data are suitable for the purpose of your enquiry.
For this, you should compare the objectives, nature and scope of the given enquiry with the original
investigation. For example, suppose the objective of our enquiry is to study the salary pattern of a firm,
including perks and allowances of employees, but secondary data is available only on basic pay. Such data
is not suitable for the purpose of the study.
2. Reliability of Secondary Data
The reliability of secondary data can be tested by examining: i) the unbiasedness of the collecting person,
ii) whether the accuracy of the field work was properly checked, iii) whether the editing, tabulation and
analysis were done carefully, iv) the reliability of the source of information, and v) the methods used for
the collection and analysis of the data. If the data were collected by government, semi-government or
international organisations, the secondary data are generally more reliable than data collected by
individuals and private organisations.
3. Adequacy of Secondary Data
Adequacy of secondary data is to be judged in the light of the objectives of the research. For example,
suppose our objective is to study the growth of industrial production in India. If the published reports
provide information on only a few states, the data would not serve the purpose. Adequacy of the data may
also be considered in the light of the period for which the data is available. For example, to study the
trend of per capita income of a country we may need data for the last 10 years; if information is available
for the last 5 years only, it would not serve our objective. Hence, we should use secondary data only if it
is reliable, suitable and adequate.

Sources of information
Secondary sources of information may be divided into two categories: internal sources and external
sources.

1) Internal sources of secondary information
a. Sales data: All organisations collect information in the course of their everyday operations. Orders are
received and delivered, costs are recorded, sales personnel submit visit reports, invoices are sent out,
returned goods are recorded and so on. Much of this information is of potential use in marketing research,
but surprisingly little of it is actually used. Organisations frequently overlook this valuable resource by not
beginning their search of secondary sources with an internal audit of sales invoices, orders, inquiries
about products not stocked, returns from customers and sales force customer calling sheets. For example,
consider how much information can be obtained from sales orders and invoices:
- Sales by territory
- Sales by customer type
- Prices and discounts
- Average size of order by customer, customer type, geographical area
- Average sales by sales person and
- Sales by pack size and pack type, etc.
This type of data is useful for identifying an organisation's most profitable products and customers. It can
also serve to track trends within the enterprise's existing customer group.
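
As an illustration of how much can be drawn from invoice records, the sketch below computes sales by territory and the average order size by customer type. The records and field names are hypothetical, chosen only to show the idea.

```python
from collections import defaultdict

# Hypothetical invoice records; field names and amounts are illustrative only.
invoices = [
    {"territory": "North", "customer_type": "Retail", "amount": 1200.0},
    {"territory": "North", "customer_type": "Wholesale", "amount": 5000.0},
    {"territory": "South", "customer_type": "Retail", "amount": 800.0},
    {"territory": "South", "customer_type": "Retail", "amount": 1000.0},
]

def total_by(records, key):
    """Sum invoice amounts for each distinct value of `key`."""
    totals = defaultdict(float)
    for r in records:
        totals[r[key]] += r["amount"]
    return dict(totals)

def average_by(records, key):
    """Average invoice amount for each distinct value of `key`."""
    sums, counts = defaultdict(float), defaultdict(int)
    for r in records:
        sums[r[key]] += r["amount"]
        counts[r[key]] += 1
    return {k: sums[k] / counts[k] for k in sums}

sales_by_territory = total_by(invoices, "territory")
avg_order_by_type = average_by(invoices, "customer_type")
print(sales_by_territory)
print(avg_order_by_type)
```

The same two helpers could be applied to other keys, such as sales person or pack size, if those fields were recorded on the invoice.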
b. Financial data: An organisation has a great deal of data within its files on the cost of producing,
storing, transporting and marketing each of its products and product lines. Such data has many uses in
research, including measuring the efficiency of marketing operations. It can also be used to estimate the
costs attached to new products under consideration and the level of utilisation (in production, storage
and transportation) at which an organisation's unit costs begin to fall.
c. Transport data: Companies that keep good records relating to their transport operations are well
placed to establish which are the most profitable routes, and loads, as well as the most cost effective
routing patterns. Good data on transport operations enables the enterprise to perform trade-off analysis
and thereby establish whether it makes economic sense to own or hire vehicles, or the point at which a
balance of the two gives the best financial outcome.
d. Storage data: The rate of stock turn, stock handling costs and similar figures help in assessing the
efficiency of certain marketing operations and the efficiency of the marketing system as a whole. More
sophisticated accounting systems
assign costs to the cubic space occupied by individual products and the time period over which the
product occupies the space. These systems can be further refined so that the profitability per unit, and
rate of sale, are added. In this way, the direct product profitability can be calculated.

2) External sources of secondary information
The researcher who seriously seeks useful secondary data is more often surprised by its abundance
than by its scarcity. Too often, the researcher has secretly (sometimes subconsciously) concluded from the
outset that his/her topic of study is so unique or specialised that a search of secondary sources is futile.
Consequently, only a cursory search is made with no real expectation of success. Cursory searches
become a self-fulfilling prophecy. Dillon et al. give the following advice:
"You should never begin a half-hearted search with the assumption that what is being sought is so unique
that no one else has ever bothered to collect it and publish it. On the contrary, assume there are scrolling
secondary data that should help provide definition and scope for the primary research effort."
The same authors support their advice by citing the large numbers of organisations that provide
marketing information including national and local government agencies, quasi-government agencies,
trade associations, universities, research institutes, financial institutions, specialist suppliers of secondary
marketing data and professional research enterprises. Dillon et al. further advise that searches of printed
sources of secondary data begin with referral texts such as directories, indexes, handbooks and guides.
These sorts of publications rarely provide the data in which the researcher is interested but serve in
helping him/her locate potentially useful data sources.

The main sources of external secondary sources are (1) government (Central, state and local) (2) trade
associations (3) commercial services (4) national and international institutions.

Government statistics: These may include all or some of the following:
- Population censuses
- Social surveys, family expenditure surveys
- Import/export statistics
- Production statistics
- Agricultural statistics
Trade associations: Trade associations differ widely in the extent of their data collection and information
dissemination activities. However, it is worth checking with them to determine what they do publish. At
the very least one would normally expect that they would produce a trade directory and, perhaps, a
yearbook.
Commercial services: Published market research reports and other publications are available from a wide
range of organisations which charge for their information. Typically, marketing people are interested in
media statistics and consumer information which has been obtained from large scale consumer or farmer
panels. The commercial organisation funds the collection of the data, which is wide ranging in its content,
and hopes to make its money from selling this data to interested parties.
National and international institutions: Bank economic reviews, university research reports, journals and
articles are all useful sources to contact. International agencies such as the World Bank, IMF, UNDP,
ITC, FAO and ILO produce an abundance of secondary data which can prove extremely useful to the
researcher.

Merits and Limitations of Secondary Data
Merits
1) Secondary data is much more economical and quicker to collect than primary data, as we need not
spend time and money on designing and printing data collection forms (questionnaire/schedule),
appointing enumerators, editing and tabulating data etc.
2) It is impossible for an individual or a small institution to collect primary data on some subjects, such
as population census, imports and exports of different countries, national income data etc., but such data
can be obtained from secondary sources.
Limitations
1) Secondary data is risky because it may not be suitable, reliable or adequate, and it is difficult to find
data which exactly fits the need of the present investigation.
2) It is difficult to judge whether the secondary data is sufficiently accurate for our investigation.
3) Secondary data may not be available for some investigations, for example bargaining strategies in live
products marketing, the impact of T.V. advertisements on viewers, opinion polls on a specific subject, etc.
In such situations we have to collect primary data.

Collection of secondary Data
As already mentioned, secondary data involves the use of published or unpublished data. Published data
are available in:
a) publications of the central, state and local governments, b) publications of foreign governments or of
international bodies, c) technical and trade journals, d) reports prepared by research scholars, universities
etc. in different fields. The sources of unpublished data are many; they may be found in diaries, letters,
biographies and autobiographies, or obtained from trade associations etc.
Types of Secondary Published data

Newspapers
What it is:
- published daily, weekly, monthly
- written by freelancers, staff who are usually paid
- written for general public (although some target specific groups)
Why it might be useful:
- provide immediate news
- local news
- editorials
- can provide ...
- excellent for ...
Where to access:
- electronic
- print
- some have free ...
Examples:
- Economic Times
- Times of India
- Employment News
Note: Because newspapers are meant to provide immediate information, some facts might not
be accurate or will change over time.

Popular magazines
What it is:
- published weekly, monthly, etc.
- written for a wide, general (non-academic) audience
- written by journalists, staff and freelancers who are usually paid
- slick appearance, variety of formats
- lots of advertising which may be tied to editorial content
Why it might be useful:
- usually provide general information in short articles (can provide analysis)
- lots of photographs and ...
- can also be a source for public ...
- rarely provide overviews of topics, statistics, cited references
Where to access:
- electronic
- print
- some have free websites and some exist solely online
Examples:
- India Today
- Business World
- Sports week
Note: Popular magazines, in general, exist to entertain, sell products, express a particular
point of view, or provide news summaries of current events.

Scholarly journals
What it is:
- published monthly, quarterly, ...
- written for scholars, researchers, students and assumes scholarly background
- written by ...
- use language of specific discipline
- generally peer-reviewed (articles are evaluated by experts who make publication ...)
- serious appearance with few images or ...
Why it might be useful:
- cite sources and provide ...
- provide in-depth articles
- provide results of original research and ...
- often a preliminary step before publishing research in book ...
Where to access:
- electronic
- print
- some journals have websites and some exist solely online
Examples:
- Journal of Marketing
- The Strategist
- Western Criminology ...
Note: Scholarly journals are often published by scholarly societies and organizations or by
publishers of other scholarly information.

Books
What it is:
- written by and for a variety of audiences
- generally takes longer time to be ...
- often provides citations and ...
Why it might be useful:
- can provide very in-depth ...
- can be primary ...
- can present viewpoints in compilations and ...
Where to access:
- Use a catalogue to find out what a library owns
- some published in ... format (e-Books) and are ...
Examples:
- Marketing ...
- Organization Behaviour - Robbins

Reference sources
What it is:
- encyclopaedias
- dictionaries
- chronologies
- thesauri
- usually written by scholars/experts in a ...
Why it might be useful:
- provide general or in-depth information
- provide information and overview of topics
- statistics
- bibliographies
- facts and ...
- names, addresses, and ...
- define terms
Where to access:
- Use a catalogue to find out what a library owns
- some online via ...
- some only available in the Library
Examples:
- Encyclopaedia ...
- Oxford Dictionary
- Manorama

Statistics
What it is:
- population
- demographics
- crime
- health care
- education
- income
- public opinion etc.
Why it might be useful:
- provide a statistical look at a population or ...
Where to access:
- Use a catalogue to find out what a library owns
- some available via ...
- some only available in the Library
Examples:
- Statistical Abstract of India
- Indian Bureau of the ...

Websites
What it is:
All kinds of ...
- full-text books
- government
- online shopping
- greeting cards
Why it might be useful:
The web is:
- an infinite array of ...
- in a variety of ...
Where to access:
- Internet
Tip: Keep a research notebook or log of databases searched

Review questions
1. What do you mean by data? Why is it needed for research?
2. Distinguish between primary and secondary data. Illustrate your answer with examples.
3. Write the names of five web sources of secondary data which have not been included in the above
table.
4. Explain the merits and limitations of using secondary data.
5. What precautions must a researcher take before using secondary data?
6. In the following situations, indicate whether data from a census should be taken:
i) A TV manufacturer wants to obtain data on customer preferences with respect to size of TV.
ii) RTMNU wants to determine the willingness of its employees to subscribe to a new employee
insurance scheme.
7. How can data be collected through the Observation Method?
8. Distinguish between the observation and the interview method of data collection.

Chapter 6: Collection and Processing Data
In Chapter 5, we have discussed various methods of collection of data. Once the collection of data is over,
the next step is to organize data so that meaningful conclusions may be drawn. The information content
of the observations has to be reduced to a relatively few concepts and aggregates. The data collected from
the field has to be processed as laid down in the research plan. This is possible only through systematic
processing of data. Data processing involves editing, coding, classification and tabulation of the data
collected so that they are amenable to analysis. This is an intermediary stage between the collection of
data and their analysis and interpretation. In this chapter, therefore, we will learn about the different
stages of processing of data in detail.
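
The four stages named above (editing, coding, classification and tabulation) can be pictured as a small pipeline. The sketch below is only an illustration under stated assumptions: the raw records, the code book and the age classes are invented, and the function names are hypothetical.

```python
from collections import Counter

# Hypothetical raw questionnaire returns.
raw = [
    {"age": "23", "occupation": "Student"},
    {"age": "", "occupation": "Business"},        # incomplete answer
    {"age": "41", "occupation": "Govt. Service"},
    {"age": "19", "occupation": "student "},
]

# Assumed code book mapping answers to numeric codes.
OCCUPATION_CODES = {"student": 1, "business": 2, "govt. service": 3}

def edit(records):
    """Editing: discard records that have a missing answer."""
    return [r for r in records if all(v.strip() for v in r.values())]

def code(records):
    """Coding: convert cleaned answers to numeric values."""
    return [
        {
            "age": int(r["age"]),
            "occupation": OCCUPATION_CODES[r["occupation"].strip().lower()],
        }
        for r in records
    ]

def classify(record):
    """Classification: assign each record to an age class."""
    return "25 and under" if record["age"] <= 25 else "over 25"

def tabulate(records):
    """Tabulation: count the records falling in each class."""
    return Counter(classify(r) for r in records)

table = tabulate(code(edit(raw)))
print(table)
```

In a real study the editing step would usually try to correct or follow up on faulty returns rather than simply discard them; dropping them here just keeps the sketch short.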

Classification of data
Once the data is collected and edited, the next step towards further processing the data is classification. In
most research studies, voluminous data collected through various methods needs to be reduced into
homogeneous groups for meaningful analysis. This necessitates classification of data, which in simple
terms is the process of dividing data into different groups or classes according to their similarities and
dissimilarities. The groups should be homogeneous within and heterogeneous between themselves.
Classification condenses huge amounts of data and helps in understanding the important underlying
features. It enables us to make comparison, draw inferences, locate facts and also helps in bringing out
relationships, so as to draw meaningful conclusions. In fact classification of data provides a basis for
tabulation and analysis of data.
Classification is the process of arranging data under homogeneous groups. The process of arranging
data in groups or classes according to resemblances and similarities is technically called
classification. It is the process of arranging data either actually or notionally in groups or classes
according to their common characteristics. Classification precedes tabulation and makes tabulation easier.

Objectives of classification:
1) To identify similarity in the data collected.
2) To maintain homogeneity
3) To facilitate comparison
4) To maintain clarity
5) To simplify complex data
6) To achieve effective quantification and
7) To facilitate easy presentation and interpretation of data.

Types of Classification
Data may be classified according to one or more external characteristics or one or more internal
characteristics or both. Let us study these kinds with the help of illustrations.
1. Classification According to External Characteristics
In this classification, data may be classified according to area or region (Geographical) and according to
occurrences (Chronological).
a. Geographical: In this type of classification, data are organized in terms of geographical area or region.
State-wise production of manufactured goods is an example of this type. Data collected from an all India
market survey may be classified geographically. Usually the regions are arranged alphabetically or
according to the size to indicate the importance.
b. Chronological: When data is arranged according to time of occurrence, it is called chronological
classification. Profit of engineering industries over the last few years is an example. We may note that it is
possible to have chronological classification within geographical classification and vice versa, for
example in a large scale all India market survey spread over a number of years.
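
A chronological classification nested within a geographical one can be sketched as below. The states, years and production figures are invented for illustration, and `classify_by_state_and_year` is a hypothetical helper name.

```python
from collections import defaultdict

# Hypothetical production records: (state, year, units produced).
records = [
    ("Gujarat", 2009, 120),
    ("Gujarat", 2010, 135),
    ("Kerala", 2009, 80),
    ("Kerala", 2010, 90),
    ("Gujarat", 2010, 15),
]

def classify_by_state_and_year(rows):
    """Group units first by state (geographical), then by year (chronological)."""
    table = defaultdict(lambda: defaultdict(int))
    for state, year, units in rows:
        table[state][year] += units
    # States in alphabetical order, years in order of occurrence in time.
    return {s: dict(sorted(by_year.items())) for s, by_year in sorted(table.items())}

classified = classify_by_state_and_year(records)
print(classified)
```

Swapping the two grouping keys would give the reverse arrangement, a geographical classification within a chronological one.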
2. Classification According to Internal Characteristics
Data may be classified according to attributes (Qualitative characteristics which are not capable of being
described numerically) and according to the magnitude of variables (Quantitative characteristics which
are numerically described).

a. Classification according to attributes: In this classification, data are classified by descriptive
characteristics like sex, caste, occupation, place of residence etc. This is done in two ways: simple
classification and manifold classification. In simple classification (also called classification
according to dichotomy), data is simply grouped according to the presence or absence of a single
characteristic: male or female, employed or unemployed, rural or urban etc.
In manifold classification (also known as multiple classification), data is classified according to more
than one characteristic. First, the data may be divided into two groups according to one attribute
(employed and unemployed, say). Then, using the remaining attributes, data is sub-grouped again (male
and female, based on sex). This may go on based on other attributes, like married and unmarried, rural
and urban and so on. The following table is an example of manifold classification.
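
The difference between simple and manifold classification can also be shown with a short sketch. The records below are invented, and `classify_by` is a hypothetical helper: passing one attribute gives a simple classification, passing several gives a manifold one.

```python
from collections import Counter

# Hypothetical respondents described by three attributes.
people = [
    {"employment": "employed", "sex": "male", "area": "urban"},
    {"employment": "employed", "sex": "female", "area": "urban"},
    {"employment": "unemployed", "sex": "male", "area": "rural"},
    {"employment": "employed", "sex": "male", "area": "rural"},
    {"employment": "unemployed", "sex": "female", "area": "rural"},
]

def classify_by(records, attributes):
    """Count records in each combination of the given attributes."""
    return Counter(tuple(r[a] for a in attributes) for r in records)

simple = classify_by(people, ["employment"])            # one attribute
manifold = classify_by(people, ["employment", "sex"])   # two attributes
print(simple)
print(manifold)
```

Adding a third attribute, such as "area", to the list would split each group once more.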
b. Classification according to magnitude of the variable: This refers to the classification of data
according to some characteristic that can be measured. In this classification, there are two aspects:
one is the variable (age, weight, income etc.) and the other is the frequency (the number of observations
which can be put into a class). Quantitative variables may generally be divided into two groups: discrete
and continuous. A discrete variable is one which can take only isolated (exact) values; it does not carry
any fractional value. Examples are the number of children in a household, the number of departments in
an organization, the number of workers in a factory etc. Variables that can take any numerical value
within a specified range are called continuous variables. Examples of continuous variables are the height
of a person, the profit/loss of companies etc. One point may be noted. In practice, even continuous
variables are measured only up to some degree of precision, so they also essentially become discrete
variables. The following are two examples of discrete and continuous frequency distribution placed side
by side.
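
As a rough illustration of the two kinds of frequency distribution, the sketch below counts exact values for a discrete variable and counts values falling in class intervals for a continuous one. The data and the class limits are invented, and the upper limit of each class is treated as exclusive, which is an assumption of this sketch.

```python
from collections import Counter

def discrete_distribution(values):
    """Frequency of each exact value of a discrete variable."""
    return dict(sorted(Counter(values).items()))

def continuous_distribution(values, lower, width, n_classes):
    """Frequency of values in classes [lower, lower + width), and so on.

    The upper limit of each class is exclusive.
    """
    counts = [0] * n_classes
    for v in values:
        idx = int((v - lower) // width)
        if 0 <= idx < n_classes:
            counts[idx] += 1
    return {
        (lower + i * width, lower + (i + 1) * width): c
        for i, c in enumerate(counts)
    }

children = [0, 2, 1, 2, 3, 1, 2]                        # children per household
heights = [151.2, 158.9, 160.0, 164.5, 169.9, 172.3]    # height in cm

children_table = discrete_distribution(children)
height_table = continuous_distribution(heights, lower=150, width=10, n_classes=3)
print(children_table)
print(height_table)
```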

Surveys defined
Surveys are quantitative information collection techniques used in marketing, political polling, and
social science research.
All surveys involve questions of some sort. When the questions are administered by a researcher, the
survey is called an interview or a researcher administered survey. When the questions are administered
by the respondent, the survey is referred to as a questionnaire or a self-administered survey.

Advantages of surveys
The advantages of survey techniques include:
- It is an efficient way of collecting information from a large number of respondents. Very large
samples are possible. Statistical techniques can be used to determine validity, reliability, and
statistical significance.
- Surveys are flexible in the sense that a wide range of information can be collected. They can be used
to study attitudes, values, beliefs, and past behaviours.
- Because they are standardized, they are relatively free from several types of errors.
- They are relatively easy to administer.
- There is an economy in data collection due to the focus provided by standardized questions. Only
questions of interest to the researcher are asked, recorded, codified, and analyzed. Time and money
are not spent on tangential questions.

Disadvantages of surveys
Disadvantages of survey techniques include:
- They depend on subjects' motivation, honesty, memory, and ability to respond. Subjects may not be
aware of their reasons for any given action. They may have forgotten their reasons. They may not be
motivated to give accurate answers; in fact, they may be motivated to give answers that present
themselves in a favorable light.
- Surveys are not appropriate for studying complex social phenomena. The individual is not the best
unit of analysis in these cases. Surveys do not give a full sense of social processes and the analysis
seems superficial.
- Structured surveys, particularly those with closed ended questions, may have low validity when
researching affective variables.

Survey Methods
Once the researcher has decided on the size of sample, the next step is to decide on the method of data
collection. Each method has its advantages and disadvantages.

a) Personal Interviews
Interview is one of the most powerful tools and most widely used methods for primary data collection in
research. In our daily routine we see interviews on T.V. channels on various topics related to social
issues, business, sports, budget etc. In the words of C. William Emory, personal interviewing is a two way
purposeful conversation initiated by an interviewer to obtain information that is relevant to some
research purpose. Thus an interview is basically a meeting between two persons to obtain the
information related to the proposed study. The person who is interviewing is called the interviewer and
the person who is being interviewed is called the informant. It is to be noted that the research
data/information collected through this method is not just a simple conversation between the investigator
and the informant; the glances, gestures, facial expressions, level of speech etc. are all part of the
process. Through this method, the researcher can collect varied types of data intensively and extensively.
Interviews can be classified as direct personal interviews and indirect personal interviews. Under the
technique of direct personal interview, the investigator meets the informants (who come under the
study) personally, asks them questions pertaining to the enquiry and collects the desired information.
Thus if a researcher intends to collect data on the spending habits of Delhi University (DU) students,
he/she would go to the DU, contact the students, interview them and collect the required information.
Indirect personal interview is another technique of interview method where it is not possible to collect
data directly from the informants who come under the study. Under this method, the investigator
contacts third parties or witnesses, who are closely associated with the persons/situations under study
and are capable of providing necessary information. For example, in an investigation regarding a bribery
pattern in an office, the desired information has to be obtained indirectly from other people who may
know those involved. Similarly, clues about crimes are gathered by the CBI. Utmost care must be
exercised to ensure that the persons who are being questioned are fully aware of the facts of the problem
under study, and are not motivated to give a twist to the facts.
Interviews can also be classified as structured and unstructured. In a structured interview, set
questions are asked and the responses are recorded in a standardised form. This is useful in large scale
interviews where a number of investigators are assigned the job of interviewing, because the researcher
can minimise the bias of the interviewer. This technique is also called formal interview. In an
unstructured interview, the investigator may not have a set of questions but only a number of key points
around which to build the interview. Normally, such interviews are conducted in the case of an
exploratory survey where the researcher is not completely sure about the type of data he/she will
collect. It is also called informal interview. Generally, this method is used as a supplementary method of
data collection in conducting research in business areas.
Nowadays, telephone or cellphone interviews are widely used to obtain the desired information for
small surveys, for instance when banks interview credit card holders about the level of service they are
receiving. This technique is used in industrial surveys, especially in developed regions.
The major merits of this method are as follows:
1) People are more willing to supply information if approached directly. Therefore, personal interviews
tend to yield high response rates.
2) This method enables the interviewer to clarify any doubt that the interviewee might have while asking
him/her questions. Therefore, interviews are helpful in getting reliable and valid responses.
3) The informants' reactions to questions can be properly studied.
4) The researcher can adapt the language of communication to the standard of the informant, so
as to obtain personal information about informants which is helpful in interpreting the results.
The limitations of this method are as follows:
1) The subjective factors or the views of the investigator may come in, either consciously or
unconsciously.
2) The interviewers must be properly trained; otherwise the entire work may be spoiled.
3) It is a relatively expensive and time-consuming method of data collection especially when the number
of persons to be interviewed is large and they are spread over a wide area.
4) It cannot be used when the field of enquiry is large (large sample).
Precautions: While using this method, the following precautions should be taken:
- Obtain thorough details of the theoretical aspects of the research problem.
- Identify who is to be interviewed.
- The questions should be simple, clear and limited in number.
- The investigator should be sincere, efficient and polite while collecting data.
- The investigator should be of the same area (field of study, district, state etc.).

b) Telephone Surveys
Surveying by telephone is the most popular interviewing method in most of the country. This is made
possible by wide telephone coverage (approximately 70% of homes in urban areas have a telephone).
Advantages:
1. People can usually be contacted faster over the telephone than with other methods. If the
Interviewers are using CATI (computer-assisted telephone interviewing), the results can be
available minutes after completing the last interview.
2. You can dial random telephone numbers when you do not have the actual telephone numbers of
potential respondents.
3. CATI software, such as The Survey System, makes complex questionnaires practical by offering
many logic options. It can automatically skip questions, perform calculations and modify
questions based on the answers to earlier questions. It can check the logical consistency of
answers and can present questions or answer choices in a random order (the last two are
sometimes important for reasons described later).
4. Skilled interviewers can often elicit longer or more complete answers than people will give on
their own to mail or email surveys (though some people will give longer answers to Web page
surveys). Interviewers can also ask for clarification of unclear responses.
5. Some software, such as The Survey System, can combine survey answers with pre-existing
information you have about the people being interviewed.
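
The skip logic and random ordering of answer choices mentioned in the list above can be pictured with a small sketch. This is not the format or interface of The Survey System or any other real CATI product; the questions, the `QUESTIONS` structure and `run_interview` are all hypothetical.

```python
import random

# Hypothetical questionnaire. Each question names the question to ask next,
# which may depend on the answer just given (skip logic).
QUESTIONS = {
    "q1": {"text": "Do you use after shave lotion?",
           "choices": ["Yes", "No"],
           "next": lambda a: "q2" if a == "Yes" else "q3"},
    "q2": {"text": "Which brand do you use?",
           "choices": ["Brand A", "Brand B", "Brand C"],
           "randomize": True,               # present choices in random order
           "next": lambda a: "q3"},
    "q3": {"text": "Which age group are you in?",
           "choices": ["Under 25", "25 and over"],
           "next": lambda a: None},         # end of interview
}

def run_interview(answer_fn, rng=None, start="q1"):
    """Ask questions in order, following skip rules, and return the answers.

    answer_fn(question_id, text, choices) stands in for the respondent.
    """
    rng = rng or random.Random()
    answers = {}
    qid = start
    while qid is not None:
        q = QUESTIONS[qid]
        choices = list(q["choices"])
        if q.get("randomize"):
            rng.shuffle(choices)
        answer = answer_fn(qid, q["text"], choices)
        if answer not in q["choices"]:      # simple consistency check
            raise ValueError(f"invalid answer to {qid}: {answer!r}")
        answers[qid] = answer
        qid = q["next"](answer)
    return answers

# A respondent who answers "No" to q1 is never asked q2.
non_user = run_interview(lambda qid, text, choices: "No" if qid == "q1" else choices[0])
print(non_user)
```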
Disadvantages:
1. Many telemarketers have given legitimate research a bad name by claiming to be doing research
when they start a sales call. Consequently, many people are reluctant to answer phone interviews
and use their answering machines to screen calls.
2. The growing number of working women often means that no one is home during the day. This
limits calling time to a "window" of about 6-9 p.m. (when you can be sure to interrupt dinner or a
favourite TV program).
3. You cannot show or sample products by phone.

c) Mail Surveys
One way of improving response rates to mail surveys is to mail a postcard telling your sample to watch
for a questionnaire in the next week or two. Another is to follow up a questionnaire mailing after a couple
of weeks with a card asking people to return the questionnaire. The downside is that this doubles or
triples your mailing cost. If you have purchased a mailing list from a supplier, you may also have to pay
a second (and third) use fee - you often cannot buy the list once and re-use it.
Another way to increase responses to mail surveys is to use an incentive. One possibility is to send a
dollar bill (or more) along with the survey (or offer to donate the dollar to a charity specified by the
respondent). If you do so, be sure to say that the dollar is a way of saying "thanks," rather than payment
for their time. Many people will consider their time worth more than a dollar. Another possibility is to
include the people who return completed surveys in a drawing for a prize. A third is to offer a copy of the
(non-confidential) result highlights to those who complete the questionnaire. Any of these techniques will
increase the response rates.
Remember that if you want a sample of 1,000 people, and you estimate a 10% response level, you need to
mail 10,000 questionnaires. You may want to check with your local post office about bulk mail rates - you
can save on postage using this mailing method. However, most researchers do not use bulk mail, because
many people associate "bulk" with "junk" and will throw it out without opening the envelope, lowering
your response rate. Also bulk mail moves slowly, increasing the time needed to complete your project.
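The arithmetic behind the 10,000 figure can be written out as a small calculation. This is a sketch only; the function name `mailings_needed` is introduced here for illustration, and the response rate is whatever estimate the researcher is prepared to assume.

```python
import math
from fractions import Fraction


def mailings_needed(target_completes, response_rate_percent):
    """Number of questionnaires to mail to expect `target_completes` returns.

    The rate is given in percent (e.g. 10 for 10%). Fractions are used so
    that the rounding up is not disturbed by floating-point error.
    """
    rate = Fraction(str(response_rate_percent)) / 100
    if not 0 < rate <= 1:
        raise ValueError("response rate must be above 0% and at most 100%")
    return math.ceil(target_completes / rate)


print(mailings_needed(1000, 10))  # 10000, as in the text
print(mailings_needed(1000, 3))   # 33334 at a 3% response rate
```

The same calculation shows how quickly the mailing grows when the expected response rate is low.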
Advantages:
1. Mail surveys are among the least expensive.
2. This is the only kind of survey you can do if you have the names and addresses of the target
population, but not their telephone numbers.
3. The questionnaire can include pictures - something that is not possible over the phone.
4. Mail surveys allow the respondent to answer at their leisure, rather than at the often inconvenient
moment they are contacted for a phone or personal interview. For this reason, they are not
considered as intrusive as other kinds of interviews.

Disadvantages:
1. Time! Mail surveys take longer than other kinds. You will need to wait several weeks after
mailing out questionnaires before you can be sure that you have gotten most of the responses.
2. In populations of lower educational and literacy levels, response rates to mail surveys are often
too small to be useful. This, in effect, eliminates many immigrant populations that form
substantial markets in many areas. Even in well-educated populations, response rates vary from
as low as 3% up to 90%. As a rule of thumb, the best response levels are achieved from highly-
educated people and people with a particular interest in the subject (which, depending on your
target population, could lead to a biased sample).

d) Computer Direct Interviews
These are interviews in which the Interviewees enter their own answers directly into a computer. They
can be used at malls, trade shows, offices, and so on. The Survey System's optional Interviewing Module
and Interview Stations can easily create computer-direct interviews. Some researchers set up a Web page
survey for this purpose.
Advantages:
1. The virtual elimination of data entry and editing costs.
2. You will get more accurate answers to sensitive questions. Recent studies of potential blood
donors have shown respondents were more likely to reveal HIV-related risk factors to a
computer screen than to either human interviewers or paper questionnaires. The National
Institute of Justice has also found that computer-aided surveys among drug users get better
results than personal interviews. Employees are also more often willing to give more honest
answers to a computer than to a person or paper questionnaire.
3. The elimination of interviewer bias. Different interviewers can ask questions in different ways,
leading to different results. The computer asks the questions the same way every time.
4. Ensuring skip patterns are accurately followed. The Survey System can ensure people are not
asked questions they should skip based on their earlier answers. These automatic skips are more
accurate than relying on an Interviewer reading a paper questionnaire.
5. Response rates are usually higher. Computer-aided interviewing is still novel enough that some
people will answer a computer interview when they would not have completed another kind of
interview.
Disadvantages:
1. The Interviewees must have access to a computer or one must be provided for them.
2. As with mail surveys, computer direct interviews may have serious response rate problems in
populations of lower educational and literacy levels. This method may grow in importance as
computer use increases.

e) Email Surveys
Email surveys are both very economical and very fast. More people have email than have full Internet
access. This makes email a better choice than a Web page survey for some populations. On the other
hand, email surveys are limited to simple questionnaires, whereas Web page surveys can include
complex logic.
Advantages:
1. Speed. An email questionnaire can gather several thousand responses within a day or two.
2. There is practically no cost involved once the set up has been completed.
3. You can attach pictures and sound files.
4. The novelty element of an email survey often stimulates higher response levels than ordinary
snail mail surveys.
Disadvantages:
1. You must possess (or purchase) a list of email addresses.
2. Some people will respond several times or pass questionnaires along to friends to answer. Many
programs have no check to stop people from responding multiple times and biasing the results. The
Survey System's Email Module will only accept one reply from each address sent the
questionnaire. It eliminates duplicate and pass-along questionnaires and checks to ensure that
respondents have not ignored instructions (e.g., giving 2 answers to a question requesting only
one).
3. Many people dislike unsolicited email even more than unsolicited regular mail. You may want to
send email questionnaires only to people who expect to get email from you.
4. You cannot use email surveys to generalize findings to the whole population. People who have
email are different from those who do not, even when matched on demographic characteristics,
such as age and gender.
5. Email surveys cannot automatically skip questions or randomize question or answer choice order
or use other automatic techniques that can enhance surveys the way Web page surveys can.
Although use of email is growing very rapidly, it is not universal - and is even less so outside the urban
areas. Many average citizens still do not possess email facilities, especially older people and those in
lower income and education groups. So email surveys do not reflect the population as a whole. At this
stage they are probably best used in a corporate environment where email is common or when most
members of the target population are known to have email.
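The idea of accepting only one reply from each address that was sent the questionnaire (disadvantage 2 above) can be sketched as below. This is a generic illustration and not the method used by any particular survey package; `filter_email_responses` and its arguments are hypothetical names.

```python
def filter_email_responses(invited, responses):
    """Keep the first reply from each invited address; reject the rest.

    invited:   iterable of email addresses the questionnaire was sent to
    responses: list of (sender_address, answers) pairs in arrival order
    """
    invited_set = {address.strip().lower() for address in invited}
    seen = set()
    accepted, rejected = [], []
    for sender, answers in responses:
        key = sender.strip().lower()
        if key not in invited_set:
            # a pass-along questionnaire answered by someone not in the sample
            rejected.append((sender, "not on the mailing list"))
        elif key in seen:
            rejected.append((sender, "duplicate reply"))
        else:
            seen.add(key)
            accepted.append((sender, answers))
    return accepted, rejected
```

Matching on the address alone is a simplifying assumption; a person with two invited addresses could still reply twice.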

f) Internet/Intranet (Web Page) Surveys
Web surveys are rapidly gaining popularity. They have major speed, cost, and flexibility advantages, but
also significant sampling limitations. These limitations make software selection especially important and
restrict the groups you can study using this technique.
Advantages:
1. Web page surveys are extremely fast. A questionnaire posted on a popular Web site can gather
several thousand responses within a few hours. Many people who will respond to an email
invitation to take a Web survey will do so the first day, and most will do so within a few days.
2. There is practically no cost involved once the set up has been completed. Large samples do not
cost more than smaller ones (except for any cost to acquire the sample).
3. You can show pictures. Some Web survey software can also show video and play sound.
4. Web page questionnaires can use complex question skipping logic, randomisations and other
features not possible with paper questionnaires or most email surveys. These features can assure
better data.
5. Web page questionnaires can use colours, fonts and other formatting options not possible in most
email surveys.
6. A significant number of people will give more honest answers to questions about sensitive topics,
such as drug use or sex, when giving their answers to a computer, instead of to a person or on
paper.
7. On average, people give longer answers to open-ended questions on Web page questionnaires
than they do on other kinds of self-administered surveys.
8. Some Web survey software, such as The Survey System, can combine the survey answers with
pre-existing information you have about individuals taking a survey.
Disadvantages:
1. Current use of the Internet is far from universal. Internet surveys do not reflect the population as
a whole. This is true even if a sample of Internet users is selected to match the general population
in terms of age, gender and other demographics.
2. People can easily quit in the middle of a questionnaire. They are not as likely to complete a long
questionnaire on the Web as they would be if talking with a good interviewer.
3. If your survey pops up on a web page, you often have no control over who replies - anyone from
Antarctica to Zanzibar cruising that web page may answer.
4. Depending on your software, there is often no control over people responding multiple times to
bias the results.
At this stage we recommend using the Internet for surveys mainly when your target population consists
entirely or almost entirely of Internet users. Business-to-business research and employee attitude
surveys can often meet this requirement. Surveys of the general population usually will not. Another
reason to use a Web page survey is when you want to show video or both sound and graphics. A Web
page survey may be the only practical way to have many people view and react to a video.
In any case, be sure your survey software prevents people from completing more than one questionnaire.
You may also want to restrict access by requiring a password (good software allows this option) or by
putting the survey on a page that can only be accessed directly (i.e., there are no links to it from other
pages).

g) Scanning Questionnaires
Scanning questionnaires is a method of data collection that can be used with paper questionnaires that
have been administered in face-to-face interviews, mail surveys or surveys completed by an Interviewer
over the telephone. The Survey System can produce paper questionnaires that can be scanned using
Remark Office OMR (Optical Mark Reader). Other software can scan questionnaires and produce ASCII
Files that can be read into The Survey System.
Advantages:
1. Scanning can be the fastest method of data entry for paper questionnaires.
2. Scanning is more accurate than a person in reading a properly completed questionnaire.
Disadvantages:
1. Scanning is best-suited to "check the box" type surveys and bar codes. Scanning programs have
various methods to deal with text responses, but all require additional data entry time.
2. Scanning is less forgiving (accurate) than a person in reading a poorly marked questionnaire.
3. Scanning requires investment in additional hardware to do the actual scanning.
The choice of survey method will depend on several factors. These include:
Speed: Email and Web page surveys are the fastest methods, followed by telephone interviewing. Mail
surveys are the slowest.
Cost: Personal interviews are the most expensive, followed by telephone and then mail. Email and Web
page surveys are the least expensive for large samples.
Internet Usage: Web page and Email surveys offer significant advantages, but you may not be able to
generalize their results to the population as a whole.
Literacy Levels: Illiterate and less-educated people rarely respond to mail surveys.
Sensitive Questions: People are more likely to answer sensitive questions when interviewed directly by a
computer in one form or another.
Video, Sound, Graphics: A need to get reactions to video, music or a picture limits your options. You
can play a video on a Web page, in a computer-direct interview, or in person. You can play music when
using these methods or over a telephone. You can show pictures with the first three methods and in a
mail survey.

Survey Errors
A. Random sampling error: Most surveys try to portray a representative cross section of a particular
target population, but even with technically proper random probability samples, statistical errors will
occur because of chance variation. Without increasing sample size, these statistical problems are
unavoidable.

B. Systematic error: Systematic errors result from some imperfect research design or from a mistake in
the execution of the research. These errors are also called non-sampling errors. A sample bias exists
when the results of a sample show a persistent tendency to deviate in one direction from the true value of
the population parameter. The two general categories of systematic error are respondent error and
administrative error.
1. Respondent error: If the respondents do not cooperate or do not give truthful answers then two types
of error may occur.
a) Non-response error: To utilize the results of a survey, the researcher must be sure that those who did
respond to the questionnaire were representative of those who did not. If the respondents differ
systematically from the non-respondents, non-response error will occur. Non-respondents are most common in
mail surveys, but may also occur in telephone and personal surveys in the form of no contacts (not-at-
homes) or refusals. The number of no contacts has been increasing because of the proliferation of
answering machines and growing usage of Caller ID to screen telephone calls. Self-selection may also
occur in self-administered questionnaires; in this situation, only those who feel strongly about the subject
matter will respond, causing an over-representation of extreme positions. Comparing demographics of
the sample with the demographics of the target population is one means of inspecting for possible biases.
Additional efforts should be made to obtain data from any underrepresented segments of the population.
For example, call-backs can be made on the not-at-homes.
b) Response bias: Response bias occurs when respondents tend to answer in a certain direction. This bias
may be caused by an intentional or inadvertent falsification or by a misrepresentation of the respondents'
answers.
(1) Deliberate falsification: People may misrepresent answers in order to appear intelligent, to avoid
embarrassment, to conceal personal information, to "please" the interviewer, etc. Interviewees may also
prefer to be viewed as average and alter their responses accordingly.
(2) Unconscious misrepresentation: Response bias can arise from question format, question ambiguity or
content. Time-lapse may lead to best-guess answers.
Types of response bias: There are five specific categories of response bias. These categories overlap and
are by no means mutually exclusive.
(i) Agreement bias: This is a response bias caused by a respondent's tendency to concur with a
particular position. An example is "yea-sayers", who accept all statements they are asked about.
(ii) Extremity bias: Some individuals tend to use extremes when responding to questions which
may cause extremity bias.
(iii) Interviewer bias: If an interviewer's presence influences respondents to give untrue or
modified answers, the survey will contain interviewer bias. Respondents may wish to appear
wealthy or intelligent, or they may try to give the "right" answer or the socially acceptable answer.
(iv) Patronage bias: The answers to a survey may be deliberately or unintentionally misrepresented
because the respondent is influenced by the organization conducting the survey.
(v) Social desirability bias: This may occur consciously or subconsciously. Answers to questions
that seek factual information or matters of public knowledge are usually quite accurate, but the
interviewer's presence may increase a respondent's tendency toward an inaccurate response to a
sensitive question in an attempt by the respondent to gain prestige in the interviewer's mind.

2. Administrative error: The results of improper administration or execution of the research task are
examples of administrative error. Such errors are inadvertently caused by confusion, neglect, omission, or
some other blunder. There are four types of administrative error:
a) Data processing error: The accuracy of the data processed by computer depends on correct data entry
and programming. Mistakes can be avoided if verification procedures are employed at each processing
stage.
b) Sample selection error: This type of error is a systematic error that results in an unrepresentative
sample because of an error in either the sample design or execution of the sampling procedure.
c) Interviewer error: Interviewers may record an answer incorrectly or selective perception may influence
them to record data supportive of their own attitudes.
d) Interviewer cheating: To avoid possible cheating, it is wise to inform the interviewers that a small
sample of respondents will be called back to confirm that the interview actually took place.
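The check suggested under non-response error, comparing the demographics of the sample with those of the target population, can be sketched as follows. The function names, the example groups and the 5-point threshold are assumptions made for illustration; the population shares would come from a census or similar source.

```python
from collections import Counter


def representation_gaps(sample_values, population_shares):
    """Sample share minus population share for each group (as proportions)."""
    if not sample_values:
        raise ValueError("the sample is empty")
    counts = Counter(sample_values)
    n = len(sample_values)
    return {group: counts.get(group, 0) / n - share
            for group, share in population_shares.items()}


def under_represented(sample_values, population_shares, threshold=0.05):
    """Groups whose sample share falls short by more than `threshold`."""
    gaps = representation_gaps(sample_values, population_shares)
    return [group for group, gap in gaps.items() if gap < -threshold]


# Hypothetical example: 80 urban and 20 rural respondents, while the
# population is assumed to be 60% urban and 40% rural.
sample = ["urban"] * 80 + ["rural"] * 20
print(under_represented(sample, {"urban": 0.6, "rural": 0.4}))  # ['rural']
```

Groups flagged in this way are the ones for which call-backs or other additional efforts would be considered.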

Rule-of-thumb estimates for systematic error
Sampling error may be estimated using certain statistical tools, but ways to estimate systematic error are
less precise. Many researchers have found it useful to use some standard of comparison in order to
understand how much error can be expected. For example, one cable TV company knocks down the
number of people saying that they intend to purchase the service by a "ballpark 10 percent" because
previous experience has indicated a 10 percent upward bias on the intention questions.
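The cable TV adjustment can be expressed as a one-line correction. The sketch below assumes that "knocking down by 10 percent" means a relative reduction of the stated figure; the text does not say whether a relative or a percentage-point reduction is meant, and the function name is hypothetical.

```python
def adjust_stated_intentions(stated_count, upward_bias=0.10):
    """Reduce a stated purchase-intention count by a rule-of-thumb bias.

    Assumption: the bias is applied as a relative reduction, so 0.10
    removes 10 percent of the stated figure.
    """
    if not 0 <= upward_bias < 1:
        raise ValueError("upward_bias must be a proportion in [0, 1)")
    return stated_count * (1 - upward_bias)


print(adjust_stated_intentions(2000))  # 1800.0
```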

Data processing
Data are raw facts. When organised and presented properly, they become information. Turning data into
information involves several steps. These steps are known as data processing. This section looks at
data processing and the use of computers to do it easily and quickly. The diagram below shows a
simplified view of the procedure for turning data into information. Data, in a range of forms and from
various sources, may be entered into a computer where it can be manipulated to produce useful
information (output).

Data processing includes the following steps:
1. Data coding,
2. Data input,
3. Data editing, and
4. Data manipulation.

1) Data coding
Coding is placing data in a usable form. The researcher must make decisions about the level of measurement
needed and assign numbers to variables, including codes for variables where the data is missing or
unusable. This is likely already done if the researcher is using a pre-coded questionnaire, but for other
data collection techniques, such as using public records, this is a step that has to be taken.
Before raw data is entered into a computer it may need to be coded. Coding involves labelling the
responses in a unique and abbreviated way (often by simple numerical codes). The reason raw data are
coded is that it makes data entry and data manipulation easier. Coding can be done by interviewers in
the field or by people in an office.
A closed question implies that only a fixed number of predetermined responses are allowed, and these
responses can have codes affixed on the form. An open question implies that any response is allowed,
making subsequent coding more difficult. One may select a sample of responses, and design a code
structure which captures and categorizes most of these.
Each variable should be carefully examined in terms of the research problem. In general, the level of
measurement for a variable should be the highest level possible to retain the most information and allow
the most powerful statistics to be used. For example, education could be classified into categories such as
(1) less than 12 years, (2) high school degree, (3) some college, and (4) college degree. This may be
perfectly acceptable for the research problem as long as we are examining differences based on degrees.
Frequently, a research hypothesis is modified in the process, but while the original categories worked for
the original hypothesis, the new hypothesis might need more specific data. For example, we may find we
need specific number of years of education and not just degrees, because degrees alone do not seem to be
the relevant categories of education. Thus, it is preferable to code at the highest level of measurement
possible. You can always recode data into simpler categories for testing hypotheses if the original data is
there but you can't create higher-level data from lower level measurement.
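Recoding from a higher to a lower level of measurement can be sketched as below. The category codes follow the education example above; the cut-off years (12 for a high school degree, 16 for a college degree) are assumptions made for illustration and would depend on the education system being studied.

```python
def recode_education(years):
    """Collapse years of education (ratio level) into four ordinal codes.

    Codes follow the categories in the text: 1 = less than 12 years,
    2 = high school degree, 3 = some college, 4 = college degree.
    The cut-offs of 12 and 16 years are illustrative assumptions.
    """
    if years < 0:
        raise ValueError("years of education cannot be negative")
    if years < 12:
        return 1
    if years == 12:
        return 2
    if years < 16:
        return 3
    return 4


years_of_education = [10, 12, 14, 16, 18]
print([recode_education(y) for y in years_of_education])  # [1, 2, 3, 4, 4]
```

The recoding only works in one direction: respondents with 16 and 18 years both receive code 4, and the original years cannot be recovered from the codes.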
Level of measurement: the issue of measurement levels is very complex. Luckily we don't have to
become experts but we do have to know enough to define our variables and later to choose appropriate
statistics. A simple outline of levels of measurement: -
We can demonstrate these levels by defining sex/gender two different ways.
(1) A self-selected choice on a questionnaire
What is your gender? Please check the appropriate selection.
(1) Female: ______
(2) Male: ______
The first definition of gender is a nominal level measure, a simple classification system with limited
statistics appropriate for analysis: only the mode would be acceptable for measuring central tendency.
Incidentally, while gender is our variable, the choices 1 and 2 are referred to as attributes or values of the
variable gender.
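Why only the mode is meaningful for a nominal variable can be seen in a short sketch. The codes 1 and 2 are the attributes from the questionnaire item above; the sample of answers and the function name are made up for illustration.

```python
from collections import Counter


def modes(codes):
    """Return the most frequent code(s) of a nominal variable."""
    if not codes:
        raise ValueError("no observations")
    counts = Counter(codes)
    highest = max(counts.values())
    return sorted(code for code, count in counts.items() if count == highest)


gender_codes = [1, 2, 1, 1, 2]   # 1 = Female, 2 = Male (hypothetical answers)
print(modes(gender_codes))       # [1]: "Female" is the modal category
```

An arithmetic mean of these codes (1.4 here) could be computed but would have no interpretation, because the numbers are only labels.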
Coding refers to the process by which data are categorized into groups and numerals or other symbols or
both are assigned to each item depending on the class it falls in. Hence, coding involves: (i) deciding the
categories to be used, and (ii) assigning individual codes to them. In general, coding reduces the huge
amount of information collected into a form that is amenable to analysis. A careful study of the answers is
the starting point of coding. Next, a coding frame is to be developed by listing the answers and by
assigning the codes to them. A coding manual is to be prepared with the details of variable names, codes
and instructions. Normally, the coding manual should be prepared before collection of data, except for
open-ended and partially coded questions, which are to be taken care of after the data collection.
The following are the broad general rules for coding:
1) Each respondent should be given a code number (an identification number).
2) Each qualitative question should have codes. Quantitative variables may or may not be coded
depending on the purpose. Monthly income should not be coded if one of the objectives is to compute
average monthly income. But if it is used as a classificatory variable it may be coded to indicate poor,
middle or upper income group.
3) All responses, including "don't know", "no opinion", "no response", etc., are to be coded.
Sometimes it is not possible to anticipate all the responses and some questions are not coded before
collection of data. Responses to all such questions are to be studied carefully and codes are to be decided
by examining the essence of the answers. In partially coded questions, there is usually an option "Any
Other (specify)". Depending on the purpose, responses to this option may be examined and additional
codes may be assigned.
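The general rules above can be sketched as a small coding frame. The numeric values chosen for the missing-data codes (97, 98, 99) and all names in the block are assumptions for illustration; a real coding manual would fix its own values.

```python
# Coding frame for one closed question (codes as in the gender item above).
GENDER_CODES = {"female": 1, "male": 2}

# Rule 3: non-substantive responses also get codes (values are hypothetical).
MISSING_CODES = {"don't know": 97, "no opinion": 98, "no response": 99}


def code_response(raw, frame):
    """Translate one raw answer into its numeric code."""
    key = (raw or "").strip().lower()
    if key == "":
        return MISSING_CODES["no response"]
    if key in frame:
        return frame[key]
    if key in MISSING_CODES:
        return MISSING_CODES[key]
    # An unanticipated answer: the coding frame has to be extended first.
    raise KeyError(f"no code for {raw!r}")


def code_records(raw_records, frame, field):
    """Rule 1: give every respondent an identification number, then code."""
    return [
        {"respondent_id": number, field: code_response(record.get(field), frame)}
        for number, record in enumerate(raw_records, start=1)
    ]


raw = [{"gender": "Female"}, {"gender": "don't know"}, {}]
print(code_records(raw, GENDER_CODES, "gender"))
```

Raising an error for an unanticipated answer is a deliberate choice in this sketch: it forces the coder to add a category to the frame rather than silently losing the response.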

2) Data input
The keyboard of a computer is one of the more commonly known input, or data entry, devices in current
use. In the past, punched cards or paper tapes were used.
Other input devices in current use include light pens, trackballs, scanners, mice, optical mark readers and
bar code readers. Some common everyday examples of data input devices are:
- Bar code readers used in shops, supermarkets or libraries, and
- Scanners used in desktop publishing.

3) Data editing
Before being presented as information, data should be put through a process called editing. This process
checks for accuracy and eliminates problems that can produce disorganised or incorrect information.
Data editing may be performed by clerical staff, computer software, or a combination of both, depending
on the medium in which the data is submitted.
Editing may be broadly defined as a procedure that uses available information and assumptions to
replace inconsistent values in a data set. In other words, editing is the process of examining the data
collected through various methods to detect errors and omissions and correct them for further analysis.
While editing, care has to be taken to see that the data are as accurate and complete as possible, and that
units of measurement and the number of decimal places are consistent for each variable.
The following practical guidelines may be handy while editing the data:
1) The editor should have a copy of the instructions given to the interviewers.
2) The editor should not destroy or erase the original entry. The original entry should be crossed out in
such a manner that it is still legible.
3) All answers, which are modified or filled in afresh by the editor, have to be indicated.
4) All completed schedules should have the signature of the editor and the date.
For checking the quality of data collected, it is advisable to take a small sample of the questionnaires and
examine them thoroughly. This helps in understanding the following types of problems: (1) whether all
the questions are answered, (2) whether the answers are properly recorded, (3) whether there is any bias,
(4) whether there is any interviewer dishonesty, (5) whether there are inconsistencies. At times, it may be
worthwhile to group the same set of questionnaires according to the investigators (whether any
particular investigator has specific problems) or according to geographical regions (whether any
particular region has specific problems) or according to the sex or background of the investigators, and
corrective actions may be taken if any problem is observed.

Before tabulation of data it may be good to prepare an operation manual to decide the process for
identifying inconsistencies and errors and also the methods to edit and correct them. The following broad
rules may be helpful.
i. Incorrect answers: It is quite common to get incorrect answers to many of the questions. A person with
a thorough knowledge will be able to notice them. For example, against the question "Which brand of
biscuits do you purchase?" the answer may be "We purchase biscuits from ABC Stores." Now, this
questionnaire can be corrected if ABC Stores stocks only one brand of biscuits, otherwise not. The answer to
the question "How many days did you go shopping in the last week?" should be a number between 0
and 7. A number beyond this range indicates a mistake, and such a mistake cannot be corrected. The
general rule is that changes may be made if one is absolutely sure, otherwise this question should not be
used. Usually a schedule has a number of questions and although answers to a few questions are
incorrect, it is advisable to use the other correct information from the schedule rather than discarding the
schedule entirely.
ii. Inconsistent answers: When there are inconsistencies in the answers or when there are incomplete or
missing answers, the questionnaire should not be used. Suppose that in a survey, per capita expenditure
on various items is reported as follows: Food ₹ 700, Clothing ₹ 300, Fuel and Light ₹ 200, other
items ₹ 550 and Total ₹ 1600. The answers are obviously inconsistent, as the individual items of
expenditure add up to ₹ 1750, which exceeds the reported total.
iii. Modified answers: Sometimes it may be necessary to modify or qualify the answers. They have to be
indicated for reference and checking. Numerical answers should be converted to the same units: against the
question "What is the plinth area of your house?" answers could be either in square feet or in square
metres. It will be convenient to convert all the answers to this question to the same unit, square metres,
for example.
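The three editing rules above lend themselves to simple automated checks. The following is a minimal sketch using the examples from the text; the function names and the unit labels "sqft" and "sqm" are hypothetical.

```python
SQFT_PER_SQM = 10.7639  # approximate number of square feet in one square metre


def shopping_days_valid(days):
    """Range check: days shopped in the last week must be a whole number, 0 to 7."""
    return isinstance(days, int) and 0 <= days <= 7


def expenditure_discrepancy(items, reported_total):
    """Consistency check: sum of the items minus the reported total (0 = consistent)."""
    return sum(items.values()) - reported_total


def to_square_metres(area, unit):
    """Unit edit: express a plinth area in square metres."""
    if unit == "sqm":
        return area
    if unit == "sqft":
        return area / SQFT_PER_SQM
    raise ValueError(f"unknown unit: {unit!r}")


items = {"food": 700, "clothing": 300, "fuel_and_light": 200, "other": 550}
print(shopping_days_valid(9))                    # False: outside the range 0 to 7
print(expenditure_discrepancy(items, 1600))      # 150: the items exceed the total
print(round(to_square_metres(1000, "sqft"), 1))  # 92.9
```

Checks of this kind only flag a problem; whether the answer is corrected or left unused is still decided by the rules given above.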

4) Data manipulation
After editing, data may be manipulated by computer to produce the desired output. The software used to
manipulate data will depend on the form of output required.
Software applications such as word processing, desktop publishing, graphics (including graphing and
drawing), databases and spreadsheets are commonly used. Following are some ways that software can
manipulate data:
- Spreadsheets are used to create formulas that automatically add columns or rows of figures, calculate
means and perform statistical analyses. They can be used to create financial worksheets such as
budgets or expenditure forecasts, balance accounts and analyse costs.
- Databases are electronic filing cabinets: systematically storing data for easy access to produce
summaries, stocktakes or reports. A database program should be able to store, retrieve, sort, and
analyse data.
- Charts can be created from a table of numbers and displayed in a number of ways, to show the
significance of a selection of data. Bar, line, pie and other types of charts can be generated and
manipulated to advantage.
Processing data provides useful information called output. Computer output may be used in a variety of
ways. It may be saved in storage for later retrieval and use. It may be laser printed on paper as tables or
charts, put on a transparent slide for overhead projector use, saved on floppy disk for portable use in
other computers, or sent as an electronic file via the internet to others. Types of output are limited only by
the available output devices, but their form is usually governed by the need to communicate information
to someone. For whom is output being produced? How will they best understand it? The answers to
these questions help determine one's output type.

Before analysis can be performed, raw data must be transformed into the right format. First, it must be
edited so that errors can be corrected or omitted. The data must then be coded; this procedure converts
the edited raw data into numbers or symbols. A codebook is created to document how the data was
coded. Finally, the data is tabulated to count the number of samples falling into various categories.
Simple tabulations count the occurrences of each variable independently of the other variables. Cross
tabulations, also known as contingency tables or cross tabs, treat two or more variables simultaneously.
However, because the variables are arranged in a two-dimensional table, cross tabbing more than two
variables is difficult to visualize, as more than two dimensions would be required. Cross tabulation can be
performed for nominal and ordinal variables.
Cross tabulation is the most commonly utilized data analysis method in research. Many studies take the
analysis no further than cross tabulation. This technique divides the sample into sub-groups to show how
the dependent variable varies from one subgroup to another. A third variable can be introduced to
uncover a relationship that initially was not evident.
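A two-variable cross tabulation can be sketched in a few lines. The records and variable names below are invented for illustration, and the nested-dictionary layout is only one possible way of holding the table.

```python
from collections import Counter


def cross_tabulate(records, row_var, col_var):
    """Count records for every combination of two categorical variables."""
    counts = Counter((record[row_var], record[col_var]) for record in records)
    rows = sorted({row for row, _ in counts})
    cols = sorted({col for _, col in counts})
    return {row: {col: counts.get((row, col), 0) for col in cols} for row in rows}


records = [
    {"gender": "F", "buys": "yes"},
    {"gender": "F", "buys": "no"},
    {"gender": "M", "buys": "yes"},
    {"gender": "F", "buys": "yes"},
]
print(cross_tabulate(records, "gender", "buys"))
# {'F': {'no': 1, 'yes': 2}, 'M': {'no': 0, 'yes': 1}}
```

Introducing a third variable would amount to building one such table for each value of that variable.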
Tabulation is an orderly arrangement of data in columns and rows. It is a systematic presentation of
classified data on the basis of the nature of analysis & investigation.
Tabulation refers to the orderly arrangement of data in a table or other summary format. Counting the
number of responses to a question and putting them into a frequency distribution is a simple tabulation,
or marginal tabulation, which provides the most basic form of information for the researcher. Often such
simple tabulation is presented in the form of a frequency table. A frequency table is the arrangement of
statistical data in a row and column format that exhibits the count of responses or observations for each
of the categories or codes assigned to a variable. Large samples generally require computer tabulation of
the data.
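A simple (marginal) tabulation of this kind might be computed as follows. This is only a sketch: the codebook
and the list of coded responses are hypothetical.

```python
from collections import Counter

# Hypothetical codebook and coded responses to a single question.
codebook = {1: "agree", 2: "neutral", 3: "disagree"}
responses = [1, 1, 2, 3, 1, 2, 1, 3, 3, 1]

def frequency_table(values):
    """Return (code, count, percent) rows, one per code, in code order."""
    counts = Counter(values)
    total = len(values)
    return [(code, counts[code], 100.0 * counts[code] / total) for code in sorted(counts)]

freq = frequency_table(responses)
for code, count, pct in freq:
    print(f"{codebook[code]:<10}{count:>6}{pct:>8.1f}%")
```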
Presentation of collected data in the tabular form is one of the techniques of data presentation. The two
other techniques are diagrammatic and graphic presentation. Arranging the data in an orderly manner in
rows and columns is called tabulation of data.
Sometimes data collected by survey or even from publications of official bodies are so numerous that it is
difficult to understand the important features of the data. Therefore it becomes necessary to summarize
data through tabulation to an easily intelligible form. It may be noted that there may be loss of some
minor information in certain cases, but the essential underlying features come out more clearly. Quite
frequently, data presented in tabular form is much easier to read and understand than the data presented
in the text.
In classification, as discussed in the previous section, the data is divided on the basis of similarity and
resemblance, whereas tabulation is the process of recording the classified facts in rows and columns.
Therefore, after classifying the data into various classes, they should be shown in the tabular form.

Tabulation is important because:-
1) It conserves space and reduces explanatory and descriptive statement to the minimum
2) It facilitates the process of comparison
3) It saves time, and interpretation, induction, deduction and conclusion become easier.
Tabulation may be simple or complex. Simple tabulation gives information about one or more groups of
independent questions. A complex tabulation shows the division of data into two or
more categories. A complex table generally results in two-way tables (which give information about two
interrelated characteristics of data), three-way tables or still higher order tables, which supply
information about several interrelated characteristics of data.

Requisites of a Good Statistical Table
After having an understanding of the parts of a statistical table, now let us discuss the features of an ideal
statistical table. Besides the rules relating to the parts of the table, certain guidelines are very helpful in its
preparation. They are as follows:
1) A good table must present the data in as clear and simple a manner as possible.
2) The title should be brief and self-explanatory. It should represent the description of the contents of the
table.
3) Rows and Columns may be numbered to facilitate easy reference.
4) Table should not be too narrow or too wide. The space of columns and rows should be carefully
planned, so as to avoid unnecessary gaps.
5) Columns and rows which are directly comparable with one another should be placed side by side.
6) Units of measurement should be clearly shown.
7) All the column figures should be properly aligned. Decimal points and plus or minus signs also should
be in perfect alignment.
8) Abbreviations should be avoided in a table. If their use is unavoidable, their meanings must be clearly
explained in a footnote.
9) If necessary, the derived data (percentages, indices, ratios, etc.) may also be incorporated in the tables.
10) The sources of the data should be clearly stated so that the reliability of the data could be verified, if
needed.

Review questions
1. What do you mean by editing of data? Explain the guidelines to be kept in mind while editing the
statistical data.
2. Explain the meaning of coding. How would you code your research data?
3. Classification of data provides a basis for tabulation of data. Comment.
4. Discuss the various methods of
5. What is tabulation? Draw the format of a statistical table and indicate its various parts.
6. Describe the requisites of a good statistical table.
7. Prepare a blank table showing the age, sex and literacy of the population in a city, according to five age
groups from 0 to 100
8. The following figures relate to the number of crimes (nearest hundred) in four metropolitan cities in
India. In 1961, Bombay recorded the highest number of crimes, i.e. 19,400, followed by Calcutta with
14,200, Delhi 10,000 and Madras 5,700. In the year 1971, there was an increase of 5,700 in Bombay over its
1961 figure. The corresponding increase was 6,400 in Delhi and 1,500 in Madras. However, the number
of these crimes fell to 10,900 in the case of Calcutta for the corresponding period. In 1981, Bombay
recorded a total of 36,300 crimes. In that year, the number of crimes was 7,000 less in Delhi as compared to
Bombay. In Calcutta the number of crimes increased by 3,100 in 1981 as compared to 1971. In the case of
Madras the increase in crimes was by 8,500 in 1981 as compared to 1971. Present this data in tabular form.

Chapter7: Analysis of Data
Data Analysis is the process of systematically applying statistical and/or logical techniques to describe
and illustrate, condense and recap, and evaluate data. According to Shamoo and Resnik (2003), various
analytic procedures provide a way of drawing inductive inferences from data and distinguishing the
signal (the phenomenon of interest) from the noise (statistical fluctuations) present in the data.
While data analysis in qualitative research can include statistical procedures, many times analysis
becomes an ongoing iterative process where data is continuously collected and analyzed almost
simultaneously. Indeed, researchers generally analyze for patterns in observations through the entire
data collection phase (Savenye, Robinson, 2004).
An essential component of ensuring data integrity is the accurate and appropriate analysis of research
findings. Improper statistical analyses distort scientific findings, mislead casual readers, and may
negatively influence the public perception of research. Integrity issues are just as relevant to the analysis
of non-statistical data.

Considerations/issues in data analysis
There are a number of issues that researchers should be aware of with respect to data analysis. These
include:
- Having the necessary skills to analyze
- Concurrently selecting data collection methods and appropriate analysis
- Drawing unbiased inference
- Inappropriate subgroup analysis
- Following acceptable norms for disciplines
- Determining statistical significance
- Lack of clearly defined and objective outcome measurements
- Providing honest and accurate analysis
- Manner of presenting data
- Environmental/contextual issues
- Data recording method
- Partitioning text when analyzing qualitative data
- Training of staff conducting analyses
- Reliability and Validity
- Extent of analysis

Whether statistical or non-statistical methods of analyses are used, researchers should be aware of the
potential for compromising data integrity. While statistical analysis is typically performed on quantitative
data, there are numerous analytic procedures specifically designed for qualitative material including
content, thematic, and ethnographic analysis. Regardless of whether one studies quantitative or
qualitative phenomena, researchers use a variety of tools to analyze data in order to test hypotheses,
discern patterns of behavior, and ultimately answer research questions. Failure to understand or
acknowledge data analysis issues presented can compromise data integrity.

Advanced Data Analysis Techniques
The next step is that of choosing the appropriate statistical test. There are basically two types of statistical
test: parametric and non-parametric. Parametric tests are those which make assumptions about the
nature of the population from which the scores were drawn (i.e. population values are "parameters", e.g.
means and standard deviations). If we can assume, for example, that the distribution of the sample means is
normal, then we may use a parametric test. Non-parametric tests do not require this type of
assumption and relate mainly to that branch of statistics known as "order statistics": we discard actual
numerical values and focus on the way in which things are ranked or classed. Thereafter, the choice
between alternative types of test is determined by three factors:
(1) whether we are working with dependent or independent samples, (2) whether we have two or more
than two levels of the independent variable, and (3) the mathematical properties of the scale which we
have used, i.e. ratio, interval, ordinal or nominal.
We will reject H0, our null hypothesis, if a statistical test yields a value whose associated probability of
occurrence is equal to or less than some small probability, known as the significance level (or critical level).
Common values of this critical level are 0.05 and 0.01.

Bivariate statistical analysis: tests of differences
What is the appropriate test of difference?
One of the most frequently tested hypotheses states that two (or more) groups are different with respect
to some behavior, characteristic, or attitude. Such tests are called tests of differences. For example, a
researcher may be interested to see if male and female consumers purchase a product with equal
frequency. Bivariate statistical analysis is data analysis and hypothesis testing when the investigation
concerns simultaneous investigation of two variables.

1) Tests of Statistical Significance
The chi-square (χ²) goodness-of-fit test is used to determine whether a set of proportions has specified
numerical values. It is often used to analyze bivariate cross-tabulated data. Some examples of situations
that are well suited to this test are:
- A manufacturer of packaged products test-markets a new product and wants to know if sales of
the new product will be in the same relative proportion of package sizes as sales of existing
products.
- A company's sales revenue comes from Product A (50%), Product B (30%), and Product C (20%).
The firm wants to know whether recent fluctuations in these proportions are random or whether
they represent a real shift in sales.
The chi-square test is performed by defining k categories and observing the number of cases falling into
each category. Knowing the expected number of cases falling in each category, one can define chi-squared
as:

$$\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$$

where
O_i = the number of observed cases in category i,
E_i = the number of expected cases in category i,
k = the number of categories,
and the summation runs from i = 1 to i = k.

Before calculating the chi-square value, one needs to determine the expected frequency for each cell. When
every category is expected to be equally likely, this is done by dividing the number of samples by the
number of cells in the table; otherwise, the number of samples is multiplied by the expected proportion for
each category.
To use the output of the chi-square function, one uses a chi-square table. To do so, one needs to know the
number of degrees of freedom (df). For chi-square applied to cross-tabulated data, the number of degrees
of freedom is equal to (Number of columns - 1) × (Number of rows - 1). For a goodness-of-fit test on a
single variable, it is equal to the number of categories minus one. The conventional critical level of 0.05 is
normally used. If the calculated output value from the function is greater than the chi-square look-up table
value, the null hypothesis is rejected.
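The calculation can be sketched in Python using the Product A/B/C proportions from the example above. The
observed counts are hypothetical, and the critical value 5.991 (df = 2, 0.05 level) is copied from a standard
chi-square table rather than computed.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O_i - E_i)^2 / E_i over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Expected proportions from the text: Product A 50%, B 30%, C 20%.
proportions = [0.5, 0.3, 0.2]
# Hypothetical observed counts in a recent sample of 200 sales.
observed = [90, 70, 40]

n = sum(observed)
expected = [p * n for p in proportions]
chi2 = chi_square_statistic(observed, expected)
df = len(observed) - 1  # goodness of fit: number of categories minus one

# Critical value for df = 2 at the 0.05 level, taken from a standard table.
CRITICAL_05_DF2 = 5.991
reject_null = chi2 > CRITICAL_05_DF2
print(f"chi-square = {chi2:.3f}, df = {df}, reject H0: {reject_null}")
```

With these invented counts the statistic stays below the table value, so the shift in proportions would be
treated as random fluctuation.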

2) The t-test for comparing two means
The t-test may be used to test a hypothesis that the mean scores on some interval- or ratio-scaled variable
will be significantly different for two independent samples or groups. It is used when the number of
observations (sample size) in either group is small (less than 30) and the population standard deviation is
unknown. To use the t-test for difference of means, we assume the two samples are drawn from normal
distributions and the variances of the two populations or groups are equal (homoscedasticity). Further,
we assume interval data.
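A pooled estimate of the standard error is a better estimate of the standard error than one based on the
variance from either sample alone.
In a test of two means, the degrees of freedom are calculated as follows:
d.f. = n - k
where n is the total number of observations in the two samples (n1 + n2) and k is the number of groups
(here, 2).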
An illustration of the t-test would be to test the difference between sociology majors and business majors
on scores on a scale measuring attitudes toward business. The null hypothesis would be that there is no
difference in attitudes toward business (mean score) between the two groups.
Computer programs, such as SPSS, are commonly used to do the calculations in testing the mean
differences of two groups.
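For readers without such a package, the pooled-variance t statistic can be sketched with the Python standard
library. The scores for the two groups are hypothetical, and the function name is illustrative only; the result
would still be compared with a t table at n1 + n2 - 2 degrees of freedom.

```python
import math
from statistics import mean, variance

def pooled_t_test(sample1, sample2):
    """Return (t, degrees_of_freedom) for two independent samples,
    assuming equal population variances."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2  # d.f. = n - k with k = 2 groups
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean(sample1) - mean(sample2)) / std_error
    return t, df

# Hypothetical attitude-toward-business scores for the two groups.
sociology = [1, 2, 3, 4, 5]
business = [2, 3, 4, 5, 6]

t_value, df = pooled_t_test(sociology, business)
print(f"t = {t_value:.3f}, df = {df}")
```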

3) The z-test for comparing two proportions
When the observed statistic is a proportion, the Z-test for differences of proportions is used to test the
hypothesis that the two proportions will be significantly different for two independent samples or
groups. Again, sample size is the appropriate criterion when selecting either a t-test or a Z-test.
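A minimal sketch of the two-proportion Z-test follows. It uses the usual pooled-proportion standard error,
which the text does not spell out, and the counts of buyers are hypothetical.

```python
import math

def two_proportion_z_test(successes1, n1, successes2, n2):
    """Return (z, two_sided_p) for H0: the two population proportions are equal."""
    p1, p2 = successes1 / n1, successes2 / n2
    # Pooled proportion under the null hypothesis.
    pooled = (successes1 + successes2) / (n1 + n2)
    std_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / std_error
    # Standard normal CDF evaluated through the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return z, 2 * (1 - cdf)

# Hypothetical counts: buyers among 100 male and 100 female consumers.
z, p_value = two_proportion_z_test(60, 100, 45, 100)
print(f"z = {z:.3f}, p = {p_value:.4f}")
```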

4) Analysis of Variance (ANOVA): the F-test
The analysis of variance is a powerful statistical tool for tests of significance. The term Analysis of
Variance was introduced by Prof. R.A. Fisher to deal with problems in agricultural research. The test of
significance based on t-distribution is an adequate procedure only for testing the significance of the
difference between two sample means. In a situation where we have three or more samples to consider at
a time, an alternative procedure is needed for testing the hypothesis that all the samples are drawn from
the same population, i.e., they have the same mean. For example, five fertilizers are applied to four plots
each of wheat, and the yield of wheat on each plot is recorded. We may be interested in finding out
whether the effect of these fertilizers on the yields is significantly different or, in other words, whether the
samples have come from the same normal population. The answer to this problem is provided by the
technique of analysis of variance. Thus the basic purpose of the analysis of variance is to test the
homogeneity of several means.
Another test of significance is the Analysis of Variance (ANOVA) test. The primary purpose of ANOVA
is to test for differences between multiple means. Whereas the t-test can be used to compare two means,
ANOVA is needed to compare three or more means. If multiple t-tests were applied, the probability of a
Type I error (rejecting a true null hypothesis) would increase as the number of comparisons increases.
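The size of this effect can be sketched numerically. The calculation below assumes, purely for simplicity, that
the pairwise tests are independent, which is not strictly true when they share groups; the figures are
therefore indicative only.

```python
from math import comb

def familywise_error_rate(groups, alpha=0.05):
    """Chance of at least one Type I error across all pairwise t-tests,
    under the simplifying assumption that the tests are independent."""
    comparisons = comb(groups, 2)
    return comparisons, 1 - (1 - alpha) ** comparisons

for k in (3, 4, 5):
    m, rate = familywise_error_rate(k)
    print(f"{k} groups: {m} comparisons, error rate about {rate:.3f}")
```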
The term analysis of variance (ANOVA) is used in the field of study called designed experiments. In
this field the goal is to try to maximize the amount of information that is collected when an
experiment (production trial) is performed.
The technique was developed by Sir Ronald Fisher in the 1920s as a way to interpret the results from
agricultural experiments.
The usual way of running experiments is to hold everything constant while varying only one item at a
time. This is a most inefficient way to do things and not very representative of
what happens in the real world.
In designed experimental approaches, items are allowed to vary simultaneously and the respective data is
gathered and analyzed. This analysis can detect not only differences in means, but also the effects of
interactions.
As mentioned the area of ANOVA is a whole field of study in itself, and we will only look at one of the
simpler types. One word of caution should be given before ever starting any data collection; the data
gathering should be randomized allowing equal chance of occurrence. This is necessary to prevent any
bias that might result in misinterpretation.
ANOVA is efficient for analyzing data using relatively few observations and can be used with categorical
variables. Note that regression can perform a similar analysis to that of ANOVA.
An example of an ANOVA problem might be to compare women who are working full time outside the
home, working part time outside the home, or working full time inside the home on their willingness to
purchase a microwave oven. Here there is one independent variable, working status, but there are
three groups (levels) and therefore a t-test cannot be used for the testing of statistical significance.
The null hypothesis in such a test is that all the means are equal. The logic of this technique goes as
follows. The variance of the means of the three groups will be large if these women differ from one
another in terms of purchasing intentions. If we calculate the variance within groups and compare it with
the variance of the group means about a grand mean, we can determine if the means are significantly
different.
Variation is inherent in nature. The total variation in any set of numerical data is due to a number of
causes which may be classified as:
(i) Assignable causes and (ii) Chance causes
The variation due to assignable causes can be detected and measured whereas the variation due to chance
causes is beyond the control of human hand and cannot be traced separately.

According to R.A. Fisher, Analysis of Variance (ANOVA) is the separation of variance ascribable to
one group of causes from the variance ascribable to other groups. By this technique the total variation in
the sample data is expressed as the sum of its nonnegative components where each of these components
is a measure of the variation due to some specific independent source or factor or cause.
For the validity of the F-test in ANOVA the following assumptions are made.
(i) The observations are independent
(ii) Parent population from which observations are taken is normal and
(iii) Various treatment and environmental effects are additive in nature.

The F-Test
One-way ANOVA examines whether multiple means differ. The test is called an F-test. ANOVA
calculates the ratio of the variation between groups to the variation within groups (the F ratio). While
ANOVA was designed for comparing several means, it also can be used to compare two means. Two-
way ANOVA allows for a second independent variable and addresses interaction.

To run a one-way ANOVA, use the following steps:
1. Identify the independent and dependent variables.
2. Describe the variation by breaking it into three parts - the total variation, the portion that is
within groups, and the portion that is between groups (or among groups for more than two
groups). The total variation (SStotal) is the sum of the squares of the differences between each
value and the grand mean of all the values in all the groups. The in-group variation (SSwithin) is
the sum of the squares of the differences in each element's value and the group mean. The
variation between group means (SSbetween) is the total variation minus the in-group variation
(SStotal - SSwithin).
3. Measure the difference between each group's mean and the grand mean.
4. Perform a significance test on the differences.
5. Interpret the results.
This F-test assumes that the group variances are approximately equal and that the observations are
independent. It also assumes normally distributed data; however, since this is a test on means the Central
Limit Theorem holds as long as the sample size is not too small.
The F-test is a procedure for comparing one sample variance with another sample variance. The key
question is whether the two sample variances are different from each other or if they are from the same
population.
The F-test utilizes measures of sample variance rather than the sample standard deviation because
summation is allowable with the sample variance.
To test the null hypothesis of no difference between the sample variances, a table of the F-distribution is
used.
Identifying and Partitioning the Total Amount of Variation
In the F-test there will be two forms of variation: (1) variation of scores due to random error or
within-group variation due to individual differences (within-group variance) and (2) systematic variation of
scores between the groups as the result of the manipulation of an independent variable or due to
characteristics of the independent variable (between-group variance).
The larger the ratio of between-group variance to within-group variance, the greater the value of F. If the
F value is large, the results are likely to be statistically significant.
Calculation of the F Ratio
The calculation of the F ratio requires that we partition the total variance into two parts:

Total sum of squares = Within-group sum of squares + Between-group sum of squares or
SS total = SS within + SS between

The total sum of squares, or SS total, is computed by squaring the deviation of each score from the grand
mean and summing these squares. SS within, the variability that we observe within each group, is
calculated by squaring the deviation of each score from its group mean and summing these squares.
SS between, which is the variability of the group means about a grand mean, is calculated by squaring the
deviation of each mean from the grand mean, multiplying by the number of items in the group, and
summing these squares.
The next calculation requires dividing the various sums of squares by their appropriate degrees of
freedom. The results of these divisions produce the variances, or mean squares.
To obtain the mean square between the groups, SS between is divided by c - 1 degrees of freedom, and to
obtain the mean square within the groups, SS within is divided by cn - c degrees of freedom, where c is the
number of groups and n is the number of observations in each group.
Finally, the F ratio is calculated by taking the ratio of the mean square between groups to the mean
square within groups:

$$F = \frac{MS_{between}}{MS_{within}} = \frac{SS_{between} / (c - 1)}{SS_{within} / (cn - c)}$$

There will be c - 1 degrees of freedom in the numerator and cn - c degrees of freedom in the denominator.
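The partitioning and the F ratio described above can be sketched as follows. The scores are hypothetical
purchase-intention ratings for the three working-status groups, and the function name is illustrative only.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    # SS between: squared deviation of each group mean from the grand mean,
    # weighted by the number of items in the group.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # SS within: squared deviation of each score from its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1               # c - 1
    df_within = len(all_scores) - len(groups)  # cn - c when groups are equal in size
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical purchase-intention scores for three working-status groups.
full_time_outside = [1, 2, 3]
part_time_outside = [2, 3, 4]
full_time_inside = [3, 4, 5]

f_ratio, df_b, df_w = one_way_anova([full_time_outside, part_time_outside, full_time_inside])
print(f"F = {f_ratio:.2f} with ({df_b}, {df_w}) degrees of freedom")
```

The resulting F value is then compared with the F-distribution table entry for the same pair of degrees of
freedom.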

Table: Various types of inferential test

Goal | Design | Scale of Measurement | Inferential Test
Establish that a group was drawn from a population. | Single-Group (sample to population) | Interval or Ratio | Z-test (requires that population mean and variance are known)
 | | Interval or Ratio | T-test: Single Sample
 | | | Chi-square: Goodness of Fit
Establish a causal relationship between one level of an independent variable and a dependent variable. | Between-Subject: Two Groups | Interval or Ratio | T-test: Independent Samples
 | | | Chi-square: Test of Independence
Establish a causal relationship between multiple levels of an independent variable and a dependent variable. | Between-Subject: One Independent Variable that Contains Three or More Levels | Interval or Ratio | ANOVA: Fisher's F-test

Advanced Data Analysis Techniques
Some of the Advanced Data Analysis Techniques are as follows:
1) Conjoint analysis
2) Factor analysis
3) Multi dimensional scaling
4) Discriminant analysis
5) Cluster analysis

1) Conjoint analysis
Conjoint analysis, also called multiattribute compositional models, is a statistical technique that
originated in mathematical psychology. Today it is used in many of the social sciences and applied
sciences including marketing, product management, and operations research. The objective of conjoint
analysis is to determine what combination of a limited number of attributes is most preferred by
respondents. It is used frequently in testing customer acceptance of new product designs and
assessing the appeal of advertisements. It has been used in product positioning, but there are some
problems with this application of the technique.
When asked to do so outright, many consumers are unable to accurately determine the relative
importance that they place on product attributes. For example, when asked which attributes are the more
important ones, the response may be that they all are important. Furthermore, individual attributes in
isolation are perceived differently than in the combinations found in a product. It is difficult for a survey
respondent to take a list of attributes and mentally construct the preferred combinations of them. The task
is easier if the respondent is presented with combinations of attributes that can be visualized as different
product offerings. However, such a survey becomes impractical when there are several attributes that
result in a very large number of possible combinations.
Fortunately, conjoint analysis can facilitate the process. Conjoint analysis is a tool that allows a subset of
the possible combinations of product features to be used to determine the relative importance of each
feature in the purchasing decision. Conjoint analysis is based on the fact that the relative values of
attributes considered jointly can better be measured than when considered in isolation.
In a conjoint analysis, the respondent may be asked to arrange a list of combinations of product attributes
in decreasing order of preference. Once this ranking is obtained, a computer is used to find the utilities of
different values of each attribute that would result in the respondent's order of preference. This method is
efficient in the sense that the survey does not need to be conducted using every possible combination of
attributes. The utilities can be determined using a subset of possible attribute combinations. From these
results one can predict the desirability of the combinations that were not tested.

We can best understand Conjoint analysis with the help of an example:
Example 1
Suppose we have to design a public transport system. We wish to test the relative desirability of three
attributes.
The company aims to provide a service. It wishes to test three levels of frequency and three levels of
price. Further, it wants to test the weightage given by consumers to add-on features such as AC and
music. The conjoint problem can be presented as follows:
Fare (three levels: Rs 10, Rs 15, Rs 20)
Frequency of service (10 minutes, 15 minutes, 20 minutes)
AC vs. non-AC vs. music (AC and music, AC, music, nothing)
A sample of 500 respondents is selected and asked to rank their preferences for all possible
combinations and for each level. These are shown below along with one respondent's sample rankings.
We can present our trade-off information in the form of a table:
Basically, the respondent's preference rankings help reveal how desirable a particular feature is to a
respondent. Features that respondents are unwilling to give up from one preference ranking to the next are
given a higher utility. Thus in the above example the respondent gives a high weightage to service,
followed by AC. The offer of music is clearly not very important, as he ranks it below AC. However, he is
not willing to trade off frequency of service with either AC or music.
Conjoint analysis uses preference rankings to calculate a set of utilities for each respondent, where one
utility is calculated for each level of each attribute or feature. The calculation of utilities is such
that the sum of utilities for a particular combination shows a good correspondence with that
combination's position in the individual's original preference rankings. The utilities basically show the
importance of each level of each attribute to respondents. We can also identify the more important
attributes by looking at the range of utilities for each of the different levels.
For example:
- Frequency of service has a range from 1.6 to 0.4. The range is therefore equal to 1.6 - 0.4 = 1.2. A high
range implies that the respondent is more sensitive to changes in the level of this attribute.
- These utilities are calculated across all respondents for all attributes and for different levels of each
attribute.
At the end of the analysis, 3-4 of the most popular combinations would be identified,
for which the relative costs and benefits can be worked out.
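The range comparison can be sketched in Python. Only the frequency range of 1.2 echoes the example; every
other utility below is invented for illustration. Expressing each range as a share of the sum of ranges is a
common convention for relative importance, not something stated in the text, and the additive total utility
is likewise an assumption of the simple part-worth model.

```python
# Hypothetical part-worth utilities for one respondent. Only the frequency
# range of 1.2 echoes the text; all other numbers are invented.
utilities = {
    "frequency": {"10 min": 1.6, "15 min": 1.0, "20 min": 0.4},
    "fare": {"10": 1.2, "15": 0.8, "20": 0.6},
    "comfort": {"AC & music": 0.9, "AC": 0.8, "music": 0.5, "nothing": 0.3},
}

def attribute_ranges(part_worths):
    """Range of utilities (highest level minus lowest) for each attribute."""
    return {a: max(lv.values()) - min(lv.values()) for a, lv in part_worths.items()}

def relative_importance(part_worths):
    """Each attribute's range as a share of the sum of all ranges."""
    ranges = attribute_ranges(part_worths)
    total = sum(ranges.values())
    return {a: r / total for a, r in ranges.items()}

def total_utility(part_worths, profile):
    """Additive utility of one combination of attribute levels."""
    return sum(part_worths[a][level] for a, level in profile.items())

ranges = attribute_ranges(utilities)
importance = relative_importance(utilities)
best = total_utility(utilities, {"frequency": "10 min", "fare": "10", "comfort": "AC & music"})
for a in utilities:
    print(f"{a:<10} range = {ranges[a]:.1f}  importance = {importance[a]:.0%}")
```

Summing part-worths in this way is also how the desirability of a combination that was never shown to
respondents could be predicted.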

Steps in Developing a Conjoint Analysis
1. Choose product attributes, for example, appearance, size, or price.
2. Choose the values or options for each attribute. For example, for the attribute of size, one may
choose the levels of 5", 10", or 20". The higher the number of options used for each attribute, the
greater the burden placed on the respondents.
3. Define products as a combination of attribute options. The set of combinations of attributes that
will be used will be a subset of the possible universe of products.
4. Choose the form in which the combinations of attributes are to be presented to the respondents.
Options include verbal presentation, paragraph description, and pictorial presentation.
5. Decide how responses will be aggregated. There are three choices - use individual responses,
pool all responses into a single utility function, or define segments of respondents who have
similar preferences.
6. Select the technique to be used to analyze the collected data. The part-worth model is one of the
simpler models used to express the utilities of the various attributes. There also are vector (linear)
models and ideal-point (quadratic) models.
The data is processed by statistical software written specifically for conjoint analysis.
Conjoint analysis was first used in the early 1970s and has become an important research tool. It is
well suited for defining a new product or improving an existing one.
Information collection
Respondents are shown a set of products, prototypes, mock-ups or pictures. Each example is similar
enough that consumers will see them as close substitutes, but dissimilar enough that respondents can
clearly determine a preference. Each example is composed of a unique combination of product features.
Rank-order preferences are obtained. The responses are codified and input into a statistical program like
The computer uses monotonic analysis of variance or linear programming techniques to create utility
functions for each feature. These utility functions indicate the perceived value of the feature and how
sensitive consumer perceptions and preferences are to changes in product features.

Uses of conjoint analysis
- It is used in industrial marketing where a product can have many combinations and features and
not all features would be important to all consumers. In industrial marketing the analysis can be
done at the individual level, as each individual is important.
- In case of consumer goods the analysis should be done segment wise. To avoid unnecessarily long
questionnaires a preliminary factor analysis should be run to select only testable attributes. Also the
number of attributes should be restricted.

Advantages of conjoint analysis
- estimates psychological tradeoffs that consumers make when evaluating several attributes
- measures preferences at the individual level
- uncovers real or hidden drivers which may not be apparent to the respondents themselves
- realistic choice or shopping task
- able to use physical objects
- if appropriately designed, the ability to model interactions between attributes can be used to
develop needs based segmentation
Disadvantages of conjoint analysis
- designing conjoint studies can be complex
- with too many options, respondents resort to simplification strategies
- difficult to use for product positioning research because there is no procedure for converting
perceptions about actual features to perceptions about a reduced set of underlying features
- respondents are unable to articulate attitudes toward new categories
- poorly designed studies may over-value emotional/preference variables and undervalue
concrete variables
- does not take into account the number of items per purchase, so it can give a poor reading of market
share

2) Factor Analysis
Factor analysis is a statistical technique that originated in mathematical psychology. It is used in the
social sciences and in marketing, product management, operations research, and other applied sciences
that deal with large quantities of data. The objective is to discover patterns among variations in the
values of multiple variables. This is done by generating artificial dimensions (called factors) that
correlate highly with the real variables.
Factor analysis is a very popular technique to analyze interdependence. Factor analysis studies the entire
set of interrelationships without defining variables to be dependent or independent. Factor analysis
combines variables to create a smaller set of factors. Mathematically, a factor is a linear combination of
variables. A factor is not directly observable; it is inferred from the variables. The technique identifies
underlying structure among the variables, reducing the number of variables to a more manageable set.
Factor analysis groups variables according to their correlation.
The factor loading can be defined as the correlations between the factors and their underlying variables.
A factor loading matrix is a key output of the factor analysis. An example of the layout of such a matrix
is shown below.

                            Factor 1    Factor 2    Factor 3
Variable 1
Variable 2
Variable 3
Column's Sum of Squares:

Each cell in the matrix represents correlation between the variable and the factor associated with that
cell. The square of this correlation represents the proportion of the variation in the variable explained by
the factor. The sum of the squares of the factor loadings in each column is called an eigenvalue. An
eigenvalue represents the amount of variance in the original variables that is associated with that factor.
The communality is the amount of the variable variance explained by common factors.
A rule of thumb for deciding on the number of factors is that each included factor must explain at least
as much variance as does an average variable. In other words, only factors for which the eigenvalue is
greater than one are used. Other criteria for determining the number of factors include the Scree plot
criteria and the percentage of variance criteria.
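
The relationship between loadings, eigenvalues and communalities can be checked numerically. The following is a minimal sketch using NumPy and principal-component extraction; the correlation matrix is hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical correlation matrix for three standardized variables.
R = np.array([
    [1.0, 0.6, 0.1],
    [0.6, 1.0, 0.2],
    [0.1, 0.2, 1.0],
])

# Principal-component extraction: eigen-decomposition of R.
eigvals, eigvecs = np.linalg.eigh(R)      # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]         # largest eigenvalue first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Loadings: correlations between variables (rows) and factors (columns).
loadings = eigvecs * np.sqrt(eigvals)

# The column sums of squared loadings reproduce the eigenvalues.
column_ss = (loadings ** 2).sum(axis=0)

# Eigenvalue-greater-than-one rule for the number of factors to keep.
keep = eigvals > 1.0

# Communality: share of each variable's variance explained by the kept factors.
communalities = (loadings[:, keep] ** 2).sum(axis=1)

print("eigenvalues:", np.round(eigvals, 3))
print("factors kept:", int(keep.sum()))
print("communalities:", np.round(communalities, 3))
```

Because the variables are standardized, the eigenvalues sum to the number of variables, which is why an eigenvalue of one corresponds to the variance of an average variable.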

To facilitate interpretation, the axes can be rotated. Rotation of the axes is equivalent to forming linear
combinations of the factors. A commonly used rotation strategy is the varimax rotation. Varimax
attempts to force the entries in each column of the loading matrix to be close to either zero or one in
absolute value.
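
A rotation of this kind can be sketched in a few lines of NumPy. The function below is a commonly used iterative formulation of varimax based on the singular value decomposition; the loading matrix is hypothetical, and statistical packages may differ in details such as row normalization.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-6):
    """Return (rotated_loadings, rotation_matrix) for the varimax criterion."""
    p, k = loadings.shape
    rotation = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion with respect to the rotation.
        gradient = loadings.T @ (
            rotated ** 3 - rotated @ np.diag((rotated ** 2).sum(axis=0)) / p
        )
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt                  # orthogonal matrix closest to the gradient
        new_criterion = s.sum()
        if criterion != 0.0 and new_criterion / criterion < 1 + tol:
            break
        criterion = new_criterion
    return loadings @ rotation, rotation

# Hypothetical unrotated loadings: four variables on two factors.
unrotated = np.array([
    [0.7,  0.5],
    [0.8,  0.4],
    [0.6, -0.6],
    [0.5, -0.7],
])
rotated, T = varimax(unrotated)
print(np.round(rotated, 3))
```

Since the rotation matrix is orthogonal, the communality of each variable (the row sum of squared loadings) is unchanged; only the way the variance is shared out among the factors changes.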

The basic steps are:
- Identify the salient attributes consumers use to evaluate products in this category.
- Use quantitative research techniques (such as surveys) to collect data from a sample of potential
customers concerning their ratings of all the product attributes.
- Input the data into a statistical program and run the factor analysis procedure. The computer
will yield a set of underlying attributes (or factors).
- Use these factors to construct perceptual maps and other product positioning devices.

Typical Problem Studied Using Factor Analysis
Factor analysis is used to study a complex product or service to identify the major characteristics
considered important by consumers.
The two major uses of factor analysis:
1. To simplify a set of data by reducing a large number of measures (which may be interrelated in
some way and so cause multicollinearity) for a set of respondents to a smaller, more manageable set
of measures which are not interrelated and still retain most of the original information.
2. To identify the underlying structure of the data, in which a very large number of variables may really
be measuring a small number of basic characteristics or constructs of our sample. For example, a survey may
throw up between 15 and 20 attributes which a consumer considers when buying a product. However, there is a
need to find out what the key drivers are. Factor analysis identifies latent or underlying factors from an
array of seemingly important variables.

Uses of Factor Analysis
- To reduce a large number of variables to a smaller number of factors for modeling purposes, where the
large number of variables precludes modeling all the measures individually. As such, factor analysis is
integrated in structural equation modeling (SEM), helping create the latent variables modeled by SEM.
However, factor analysis can be and is often used on a stand-alone basis for similar purposes.
- To select a subset of variables from a larger set, based on which original variables have the highest
correlations with the principal component factors.
- To create a set of factors to be treated as uncorrelated variables as one approach to handling
multicollinearity in such procedures as multiple regression.
- To validate a scale or index by demonstrating that its constituent items load on the same factor, and
to drop proposed scale items which cross-load on more than one factor.
- To establish that multiple tests measure the same factor, thereby giving justification for
administering fewer tests.
- To identify clusters of cases and/or outliers.
- To determine network groups by determining which sets of people cluster together (using Q-mode
factor analysis, discussed below)

Information collection
The data collection stage is usually done by research professionals. Survey questions ask the respondent
to rate a product from one to five (or 1 to 7, or 1 to 10) on a range of attributes. Anywhere from five to
twenty attributes are chosen. They could include things like: ease of use, weight, accuracy, durability,
colourfulness, price, or size. The attributes chosen will vary depending on the product being studied. The
same question is asked about all the products in the study. The data for multiple products is codified and
input into a statistical program such as SPSS or SAS.

The analysis will isolate the underlying factors that explain the data. Factor analysis is an
interdependence technique. The complete set of interdependent relationships is examined. There is no
specification of either dependent variables, independent variables, or causality. Factor analysis assumes
that all the rating data on different attributes can be reduced down to a few important dimensions. This
reduction is possible because the attributes are related. The rating given to any one attribute is partially
the result of the influence of other attributes. The statistical algorithm deconstructs the rating (called a
raw score) into its various components, and reconstructs the partial scores into underlying factor scores.
The degree of correlation between the initial raw score and the final factor score is called a factor loading.
There are two approaches to factor analysis: "principal component analysis" (the total variance in the data
is considered); and "common factor analysis" (the common variance is considered).
The use of principal components in a semantic space can vary somewhat because the components may
only "predict" but not "map" to the vector space. This produces a statistical principal component use
where the most salient words or themes represent the preferred basis.

Advantages
1. both objective and subjective attributes can be used
2. it is fairly easy to do, inexpensive, and accurate
3. it is based on direct inputs from customers
4. there is flexibility in naming and using dimensions

Disadvantages
1. usefulness depends on the researcher's ability to develop a complete and accurate set of product
attributes - if important attributes are missed, the procedure is valueless.
2. naming of the factors can be difficult - multiple attributes can be highly correlated with no
apparent reason.
3. factor analysis will always produce a pattern between variables, no matter how random.

3) Multidimensional scaling
Multidimensional scaling (MDS) is a statistical technique often used in marketing and the social
sciences. It is a procedure for taking the preferences and perceptions of respondents and representing
them on a visual grid. These grids, called perceptual maps, are usually two-dimensional, but they can
represent more than two dimensions. Potential customers are asked to compare pairs of products and make
judgements about their similarity. Whereas other techniques (such as factor analysis, discriminant
analysis, and conjoint analysis) obtain underlying dimensions from responses to product attributes
identified by the researcher, MDS obtains the underlying dimensions from respondents' judgements
about the similarity of products. This is an important advantage. It does not depend on researchers'
judgments. It does not require a list of attributes to be shown to the respondents. The underlying
dimensions come from respondents' judgements about pairs of products. Because of these advantages,
MDS is the most common technique used in perceptual mapping.

Multidimensional Scaling Procedure
There are several steps in conducting MDS research:
1. Formulating the problem - What brands do you want to compare? How many brands do you want
to compare? More than 20 is cumbersome. Fewer than 8 (28 pairs) will not give valid results. What
purpose is the study to be used for?
2. Obtaining input data - Respondents are asked a series of questions. For each product pair they are
asked to rate similarity (usually on a 7-point Likert scale from very similar to very dissimilar). The
first question could be for Coke/Pepsi, for example, the next for Coke/Hires root beer, the next for
Pepsi/Dr Pepper, the next for Dr Pepper/Hires root beer, etc. The number of questions is a function
of the number of brands and can be calculated as Q = N (N - 1) / 2, where Q is the number of
questions and N is the number of brands. This approach is referred to as the "Perception data:
direct approach". There are two other approaches. There is the "Perception data: derived approach",
in which products are decomposed into attributes which are rated on a semantic differential scale.
The other is the "Preference data approach", in which respondents are asked their preference rather
than similarity.
3. Running the MDS statistical program - Software for running the procedure is available in most of
the better statistical applications programs. Often there is a choice between Metric MDS (which
deals with interval or ratio level data), and Nonmetric MDS (which deals with ordinal data). The
researchers must decide on the number of dimensions they want the computer to create. The more
dimensions, the better the statistical fit, but the more difficult it is to interpret the results.
4. Mapping the results and defining the dimensions - The statistical program (or a related module)
will map the results. The map will plot each product (usually in two-dimensional space). The
proximity of products to each other indicates either how similar they are or how preferred they are,
depending on which approach was used. The dimensions must be labelled by the researcher. This
requires subjective judgement and is often very challenging. The results must be interpreted (see
Perceptual mapping).
5. Testing the results for reliability and validity - Compute R-squared to determine what proportion of
variance of the scaled data can be accounted for by the MDS procedure. An R-squared of 0.6 is
considered the minimum acceptable level. Other possible tests are Kruskal's Stress, split data tests,
data stability tests (i.e., eliminating one brand), and test-retest reliability.
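
The steps above can be sketched in NumPy. The example counts the pairwise questions with Q = N (N - 1) / 2, runs classical (Torgerson) metric MDS, and computes an R-squared between the input dissimilarities and the distances on the map. The brand names and positions are made up, the dissimilarities are generated from those positions so that the result can be checked, and classical MDS is only one of several MDS algorithms; packaged software may use a different one.

```python
import numpy as np
from itertools import combinations

brands = ["Brand A", "Brand B", "Brand C", "Brand D"]   # hypothetical brands
n = len(brands)
pairs = list(combinations(brands, 2))
num_questions = n * (n - 1) // 2                        # Q = N (N - 1) / 2

# Made-up "true" positions, used only to generate consistent dissimilarities.
true_xy = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])

def pairwise_distances(points):
    diff = points[:, None, :] - points[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Stands in for the averaged dissimilarity ratings collected in step 2.
D = pairwise_distances(true_xy)

def classical_mds(dissimilarities, dims=2):
    """Classical (Torgerson) metric MDS via double centring."""
    m = dissimilarities.shape[0]
    centring = np.eye(m) - np.ones((m, m)) / m
    B = -0.5 * centring @ (dissimilarities ** 2) @ centring
    eigvals, eigvecs = np.linalg.eigh(B)
    top = np.argsort(eigvals)[::-1][:dims]              # largest eigenvalues
    return eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))

coords = classical_mds(D, dims=2)
fitted = pairwise_distances(coords)

# R-squared between input dissimilarities and map distances (upper triangle).
upper = np.triu_indices(n, k=1)
r_squared = np.corrcoef(D[upper], fitted[upper])[0, 1] ** 2

print("questions needed:", num_questions)
print("R-squared:", round(float(r_squared), 3))
```

The recovered coordinates are only determined up to rotation and reflection, which is one reason the dimensions have to be labelled by the researcher.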

Perceptual mapping
Perceptual mapping is a graphics technique used by marketers that attempts to visually display the
perceptions of customers or potential customers. Typically the position of a product, product line,
brand, or company is displayed relative to its competition.
Perceptual maps can have any number of dimensions, but the most common is two dimensions. Any more is
a challenge to draw and confusing to interpret.
The first perceptual map below shows consumer perceptions of various automobiles on the two dimensions
of sportiness/conservative and classy/affordable. This sample of consumers felt Porsche was the
sportiest and classiest of the cars in the study (top right corner). They felt Plymouth was most practical
and conservative (bottom left corner).
Cars that are positioned close to each other are seen as similar on the relevant dimensions by the
consumer. For example consumers see Buick, Chrysler, and Oldsmobile as similar. They are close
competitors and form a competitive grouping. A company considering the introduction of a new model
will look for an area on the map free from competitors. Some perceptual maps use different size circles to
indicate the sales volume or market share of the various competing products.
Displaying consumers' perceptions of related products is only half the story. Many perceptual maps also
display consumers' ideal points. These points reflect ideal combinations of the two dimensions as seen
by a consumer. The next diagram shows a study of consumers' ideal points in the alcohol/spirits
product space. Each dot represents one respondent's ideal combination of the two dimensions. Areas
where there is a cluster of ideal points (such as A) indicate a market segment. Areas without ideal
points are sometimes referred to as demand voids.
A company considering introducing a new product will look for areas with a high density of ideal
points. They will also look for areas without competitive rivals. This is best done by placing both the
ideal points and the competing products on the same map.
Some maps plot ideal vectors instead of ideal points. The map below displays various aspirin products
as seen on the dimensions of effectiveness and gentleness. It also shows two ideal vectors. The slope of
the ideal vector indicates the preferred ratio of the two dimensions by those consumers within that
segment. This study indicates there is one segment that is more concerned with effectiveness than
harshness, and another segment that is more interested in gentleness than strength.
Perceptual maps need not come from a detailed study. There are also intuitive maps (also called
judgmental maps or consensus maps) that are created by marketers based on their understanding of
their industry. Management uses its best judgement. It is questionable how valuable this type of map is.
Often they just give the appearance of credibility to management's preconceptions.
When detailed research studies are done methodological problems can arise, but at least the information
is coming directly from the consumer. There is an assortment of statistical procedures that can be used
to convert the raw data collected in a survey into a perceptual map.

4) Discriminant analysis
Discriminant analysis is a statistical technique used in marketing and the social sciences. It is applicable
when there is only one dependent variable but multiple independent variables (similar to ANOVA and
regression). But unlike ANOVA and regression analysis, the dependent variable must be categorical. It is
similar to factor analysis in that both look for underlying dimensions in responses given to questions
about product attributes. But it differs from factor analysis in that it builds these underlying dimensions
based on differences rather than similarities. Discriminant analysis is also different from factor analysis
in that it is not an interdependence technique: a distinction between independent variables and
dependent variables (also called criterion variables) must be made.
Discriminant analysis works by creating a new variable called the Discriminant function score which is
used to predict to which group a case belongs.
Discriminant function scores are computed similarly to factor scores, i.e. using eigenvalues. The
computations find the coefficients for the independent variables that maximize the measure of distance
between the groups defined by the dependent variable.
The discriminant function is similar to a regression equation in which the independent variables are
multiplied by coefficients and summed to produce a score.
The data structure for DFA is a single grouping variable that is predicted by a series of other variables.
The grouping variable must be nominal, which might also be a reclassification of a continuous variable
into groups. The function is presented thus:
Y = X1W1 + X2W2 + X3W3 + ...XnWn + Constant
In form this is essentially identical to a multiple regression equation, but in reality the two techniques
are quite different. Regression is built on a linear combination of variables that maximizes the regression
relationship, i.e., the least squares regression, between a continuous dependent variable and the regression
variate. In DFA, the dependent variable consists of discrete groups, and what you want to do with the
function is to maximize the distance between those groups, i.e., to come up with a function that has strong
discriminatory power among the groups. Logit regression does somewhat the same thing when you have a
binary (two group) variable, and the similarities between the two are often emphasized, but the way in
which they compute the functions is quite different.
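
To make the idea of a function that maximizes the distance between groups concrete, here is a minimal two-group sketch in NumPy using Fisher's linear discriminant. The two groups and their attribute ratings are invented, and placing the cutoff at the midpoint between the group means is an assumption that suits equal group sizes; it is not necessarily what a given statistical package does.

```python
import numpy as np

# Hypothetical ratings on two attributes (X1, X2) for cases in two known groups.
group_1 = np.array([[2.0, 3.0], [3.0, 3.5], [2.5, 4.0], [3.5, 4.5]])
group_2 = np.array([[6.0, 7.0], [7.0, 6.5], [6.5, 8.0], [7.5, 7.5]])

mean_1, mean_2 = group_1.mean(axis=0), group_2.mean(axis=0)

# Pooled within-group scatter matrix.
within = ((group_1 - mean_1).T @ (group_1 - mean_1)
          + (group_2 - mean_2).T @ (group_2 - mean_2))

# Weights W1, W2 that maximize between-group separation relative to
# within-group spread (Fisher's criterion).
weights = np.linalg.solve(within, mean_2 - mean_1)

# Constant chosen so that the midpoint between the group means scores zero.
constant = -weights @ (mean_1 + mean_2) / 2.0

def discriminant_score(x):
    """Y = X1*W1 + X2*W2 + Constant; Y > 0 points to group 2, Y < 0 to group 1."""
    return np.asarray(x) @ weights + constant

scores_1 = discriminant_score(group_1)
scores_2 = discriminant_score(group_2)
print("group 1 scores:", np.round(scores_1, 2))
print("group 2 scores:", np.round(scores_2, 2))
```

The score has the same form as the function Y given above, but the weights are chosen to separate the groups rather than to minimize squared prediction errors.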

Discriminant Analysis Involves:
1. Formulate the problem and gather data - Identify the salient attributes consumers use to evaluate
products in this category - Use quantitative research techniques (such as surveys) to collect data
from a sample of potential customers concerning their ratings of all the product attributes. The data
collection stage is usually done by research professionals. Survey questions ask the respondent to
rate a product from one to five (or 1 to 7, or 1 to 10) on a range of attributes chosen by the
researcher. Anywhere from five to twenty attributes are chosen. They could include things like:
ease of use, weight, accuracy, durability, colourfulness, price, or size. The attributes chosen will
vary depending on the product being studied. The same question is asked about all the products in
the study. The data for multiple products is codified and input into a statistical program such as
SPSS or SAS. (This step is the same as in Factor analysis).
2. Estimate the Discriminant Function Coefficients and determine the statistical significance and
validity - Choose the appropriate discriminant analysis method. The direct method involves
estimating the discriminant function so that all the predictors are assessed simultaneously. The
stepwise method enters the predictors sequentially. The two-group method should be used when
the dependent variable has two categories or states. The multiple discriminant method is used
when the dependent variable has three or more categorical states. Use Wilks' Lambda to test for
significance in SPSS or the F statistic in SAS. The most common method used to test validity is to split the
sample into an estimation or analysis sample, and a validation or holdout sample. The estimation
sample is used in constructing the discriminant function. The validation sample is used to
construct a classification matrix which contains the number of correctly classified and incorrectly
classified cases. The percentage of correctly classified cases is called the hit ratio.
3. Plot the results on a two-dimensional map, define the dimensions, and interpret the results. The
statistical program (or a related module) will map the results. The map will plot each product
(usually in two-dimensional space). The distance between products indicates how
different they are. The dimensions must be labelled by the researcher. This requires subjective
judgement and is often very challenging.
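
The classification matrix and hit ratio described in step 2 are simple to compute once the holdout sample has been classified. A minimal sketch follows; the group labels and predictions are hypothetical.

```python
from collections import Counter

# Hypothetical holdout sample: actual group and the group predicted
# by the discriminant function for each case.
actual = ["buyer", "buyer", "buyer", "buyer", "buyer",
          "non-buyer", "non-buyer", "non-buyer", "non-buyer", "non-buyer"]
predicted = ["buyer", "buyer", "buyer", "buyer", "non-buyer",
             "non-buyer", "non-buyer", "non-buyer", "buyer", "non-buyer"]

# Classification matrix: counts of each (actual, predicted) combination.
classification_matrix = Counter(zip(actual, predicted))

# Hit ratio: proportion of holdout cases that were classified correctly.
correct = sum(a == p for a, p in zip(actual, predicted))
hit_ratio = correct / len(actual)

for (a, p), count in sorted(classification_matrix.items()):
    print(f"actual={a:<10} predicted={p:<10} count={count}")
print("hit ratio:", hit_ratio)
```

The hit ratio is usually judged against the proportion that would be classified correctly by chance, which depends on the relative sizes of the groups.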

Applications of Discriminant Function Analysis
General Purpose
Discriminant function analysis is used to determine which variables discriminate between two or more
naturally occurring groups. For example, an educational researcher may want to investigate which
variables discriminate between high school graduates who decide (1) to go to college, (2) to attend a
trade or professional school, or (3) to seek no further training or education. For that purpose the
researcher could collect data on numerous variables prior to students' graduation. After graduation,
most students will naturally fall into one of the three categories.
Discriminant Analysis could then be used to determine which variable(s) are the best predictors of
students' subsequent educational choice. A medical researcher may record different variables relating to
patients' backgrounds in order to learn which variables best predict whether a patient is likely to recover
completely (group 1), partially (group 2), or not at all (group 3). A biologist could record different
characteristics of similar types (groups) of flowers, and then perform a discriminant function analysis to
determine the set of characteristics that allows for the best discrimination between the types.

5) Cluster analysis
Cluster analysis is a class of statistical techniques that can be applied to data that exhibits natural
groupings. Cluster analysis sorts through the raw data and groups them into clusters. A cluster is a
group of relatively homogeneous cases or observations. Objects in a cluster are similar to each other.
They are also dissimilar to objects outside the cluster, particularly objects in other clusters.
The diagram below illustrates the results of a survey that studied drinkers' perceptions of spirits
(alcohol). Each point represents the results from one respondent. The research indicates there are four
clusters in this market.
Another example is the vacation travel market. Recent research has identified three clusters or market
segments. They are: 1) the demanders - they want exceptional service and expect to be pampered; 2) the
escapists - they want to get away and just relax; 3) the educationalists - they want to see new things, go
to museums, go on a safari, or experience new cultures.
Cluster analysis, like factor analysis and multidimensional scaling, is an interdependence technique: it
makes no distinction between dependent and independent variables. The entire set of interdependent
relationships is examined. It is similar to multidimensional scaling in that both examine inter-object
similarity by examining the complete set of interdependent relationships. The difference is that
multidimensional scaling identifies underlying dimensions, while cluster analysis identifies clusters.
Cluster analysis is the obverse of factor analysis. Whereas factor analysis reduces the number of variables
by grouping them into a smaller set of factors, cluster analysis reduces the number of observations or
cases by grouping them into a smaller set of clusters.
In marketing, cluster analysis is used for:
1. Segmenting the market and determining target markets
2. Product positioning and New Product Development
3. Selecting test markets

The basic procedure is:
1. Formulate the problem - select the variables that you wish to apply the clustering technique to
2. Select a distance measure - various ways of computing distance:
o Squared Euclidean distance - the sum of the squared differences in value for each variable
(its square root is the ordinary Euclidean distance)
o Manhattan distance - the sum of the absolute differences in value across all variables
o Chebyshev distance - the maximum absolute difference in values for any variable
3. Select a clustering procedure (see below)
4. Decide on the number of clusters
5. Map and interpret clusters - draw conclusions - illustrative techniques like perceptual maps,
icicle plots, and dendrograms are useful
6. Assess reliability and validity - various methods:
o repeat analysis but use different distance measure
o repeat analysis but use different clustering technique
o split the data randomly into two halves and analyze each part separately
o repeat analysis several times, deleting one variable each time
o repeat analysis several times, using a different order each time
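
The distance measures named in step 2 can be written out in a few lines of plain Python, using their standard definitions. The two cases below are hypothetical and are described by three variables each.

```python
import math

def squared_euclidean(a, b):
    """Sum of squared differences across all variables."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def euclidean(a, b):
    """Square root of the squared Euclidean distance."""
    return math.sqrt(squared_euclidean(a, b))

def manhattan(a, b):
    """Sum of absolute differences across all variables."""
    return sum(abs(x - y) for x, y in zip(a, b))

def chebyshev(a, b):
    """Largest absolute difference on any single variable."""
    return max(abs(x - y) for x, y in zip(a, b))

# Two hypothetical cases measured on three variables.
case_1 = [1.0, 5.0, 3.0]
case_2 = [4.0, 1.0, 3.0]

print(squared_euclidean(case_1, case_2))  # 25.0
print(euclidean(case_1, case_2))          # 5.0
print(manhattan(case_1, case_2))          # 7.0
print(chebyshev(case_1, case_2))          # 4.0
```

Because the measures weight large differences differently, repeating the analysis with another distance measure (as suggested in step 6) is a useful check on the stability of the clusters.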

Chapter 8: Testing of Hypothesis
Many a time, we strongly believe some result to be true. But after taking a sample, we notice that the
sample data do not wholly support the result. The difference is due to either (i) the original belief being
wrong, or (ii) the sample being slightly one-sided.
Tests are, therefore, needed to distinguish between the two possibilities. These tests tell about the likely
possibilities and reveal whether or not the difference can be due to only chance elements. If the difference
is not due to chance elements, it is significant and, therefore, these tests are called tests of significance. The
whole procedure is known as Testing of Hypothesis.
Setting up and testing hypotheses is an essential part of statistical inference. In order to formulate such a
test, usually some theory has been put forward, either because it is believed to be true or because it is to
be used as a basis for argument, but has not been proved. For example, the hypothesis may be the claim
that a new drug is better than the current drug for treatment of a disease, diagnosed through a set of
symptoms.
In each problem considered, the question of interest is simplified into two competing claims/hypotheses
between which we have a choice: the null hypothesis, denoted by H0, against the alternative hypothesis, denoted
by H1. These two competing claims / hypotheses are not however treated on an equal basis; special
consideration is given to the null hypothesis. We have two common situations :
(i) The experiment has been carried out in an attempt to disprove or reject a particular
hypothesis, the null hypothesis; thus we give that one priority so it cannot be rejected unless
the evidence against it is sufficiently strong. For example, null hypothesis H0: there is no
difference in taste between Coke and Diet Coke, against the alternative hypothesis H1: there is a
difference in the tastes.
(ii) If one of the two hypotheses is simpler, we give it priority so that a more complicated
theory is not adopted unless there is sufficient evidence against the simpler one. For
example, it is simpler to claim that there is no difference in flavour between Coke and Diet
Coke than it is to say that there is a difference.
The hypotheses are often statements about population parameters like expected value and variance. For
example, H0 might be the statement that the expected value of the height of ten-year-old boys in the
Indian population is not different from that of ten-year-old girls. A hypothesis might also be a statement
about the distributional form of a characteristic of interest; for example, that the height of ten-year-old
boys is normally distributed within the Indian population.

What is a Hypothesis?
A hypothesis is the assumption that we make about the population parameter. This can be any
assumption about a population parameter not necessarily based on statistical data. For example it can
also be based on the gut feel of a manager. Managerial hypotheses are based on intuition; the market
place decides whether the manager's intuitions were in fact correct.
In fact managers propose and test hypotheses all the time. For example:
- If a manager says "if we drop the price of this car model by Rs. 15,000, we'll increase sales by 25,000
units", that is a hypothesis. To test it in reality we have to wait until the end of the year and count sales.
- A manager's estimate that sales per territory will grow on average by 30% in the next quarter is also
an assumption or hypothesis.
To understand the meaning of a hypothesis, let us see some definitions:
A hypothesis is a tentative generalization, the validity of which remains to be tested. In its most
elementary stage the hypothesis may be any guess, hunch, imaginative idea, which becomes the basis for
action or investigation. (G.A.Lundberg)
It is a proposition which can be put to test to determine validity. (Goode and Hatt).
A hypothesis is a question put in such a way that an answer of some kind can be forthcoming.
(Rummel and Ballaine).
These definitions lead us to conclude that a hypothesis is a tentative solution or explanation or a guess or
assumption or a proposition or a statement to the problem facing the researcher, adopted on a cursory
observation of known and available data, as a basis of investigation, whose validity is to be tested.

How would the manager go about testing this assumption?
Suppose he has 70 territories under him.
- One option for him is to audit the results of all 70 territories and determine whether the average
growth is greater than or less than 30%. This is a time-consuming and expensive procedure.
- Another way is to take a sample of territories and audit sales results for them. Once we have our sales
growth figure, it is likely that it will differ somewhat from our assumed rate. For example we may get
a sample rate of 27%. The manager is then faced with the problem of determining whether his
assumption or hypothesized rate of growth of sales is correct or the sample rate of growth is more
representative. To test the validity of our assumption about the population we collect sample data
and determine the sample value of the statistic.
We then determine whether the sample data support our hypothesized assumption regarding the average
sales growth.

How is this Done?
If the difference between our hypothesized value and the sample value is small, then it is more likely that
our hypothesized value of the mean is correct. The larger the difference, the smaller the probability that
the hypothesized value is correct. In practice, however, very rarely is the difference between the sample
mean and the hypothesized population value large enough or small enough for us to be able to accept or
reject the hypothesis prima facie. We cannot accept or reject a hypothesis about a parameter simply on
intuition; instead we need to use objective criteria based on sampling theory to accept or reject the
hypothesis. Hypothesis testing is the process of making inferences about a population based on a sample.
The key question therefore in hypothesis testing is: how likely is it that a population such as the one we have
hypothesized would produce a sample such as the one we are looking at?
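
The manager's problem can be turned into a small worked example. The sketch below is a one-sample t-test written with the Python standard library. The ten growth figures are invented so that their mean is 27%, the 5% significance level is an assumption, and the calculation ignores the finite-population correction that could be applied when sampling 10 of only 70 territories.

```python
import math
import statistics

hypothesized_growth = 30.0   # H0: mean sales growth per territory is 30%

# Hypothetical audited growth rates (%) for a random sample of 10 territories.
sample = [25, 31, 22, 28, 30, 24, 29, 26, 27, 28]

n = len(sample)
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)        # sample standard deviation (n - 1)
standard_error = sample_sd / math.sqrt(n)

# How many standard errors the sample mean lies from the hypothesized value.
t_stat = (sample_mean - hypothesized_growth) / standard_error

# Two-sided 5% critical value of Student's t with n - 1 = 9 degrees of freedom.
T_CRITICAL = 2.262
reject_h0 = abs(t_stat) > T_CRITICAL

print("sample mean:", sample_mean)
print("t statistic:", round(t_stat, 3))
print("reject H0 at the 5% level:", reject_h0)
```

With these made-up figures the sample mean of 27% lies more than three standard errors below 30%, so a population with 30% average growth would rarely produce such a sample and the hypothesized rate would be rejected. With more variable data the same 27% could easily be consistent with the hypothesis.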

Types of Hypothesis
Hypotheses can be classified in a variety of ways into different types or kinds. The following are some of
the types of hypotheses:
i) Explanatory Hypothesis: The purpose of this hypothesis is to explain a certain fact. All hypotheses are
in a way explanatory for a hypothesis is advanced only when we try to explain the observed fact. A large
number of hypotheses are advanced to explain the individual facts in life. A theft, a murder, an accident
are examples.
ii) Descriptive Hypothesis: Sometimes a researcher comes across a complex phenomenon. He/she does
not understand the relations among the observed facts. But how to account for these facts? The answer is
a descriptive hypothesis. A hypothesis is descriptive when it is based upon the points of resemblance of
something. It describes the cause and effect relationship of a phenomenon, e.g., the current
unemployment rate of a state exceeds 25% of the work force. Similarly, the consumers of locally made
products constitute a significant market segment.
iii) Analogical Hypothesis: When we formulate a hypothesis on the basis of similarities (analogy), it is
called an analogical hypothesis e.g., families with higher earnings invest more surplus income on long
term investments.
iv) Working hypothesis: Sometimes certain facts cannot be explained adequately by existing hypotheses,
and no new hypothesis comes up. Thus, the investigation is held up. In this situation, a researcher
formulates a hypothesis which enables the investigation to continue. Such a hypothesis, though inadequate
and formulated for the purpose of further investigation only, is called a working hypothesis. It is simply
accepted as a starting point in the process of investigation.
v) Null Hypothesis: It is an important concept that is widely used in sampling theory. It forms the
basis of many tests of significance. Under this type, the hypothesis is stated negatively, i.e. as a statement
of no difference or no effect. It is null because it may be nullified if the evidence of a random sample is
unfavourable to the hypothesis. It is the hypothesis being tested (H0). If the calculated value of the test
statistic is less than the critical (table) value, the null hypothesis is not rejected (loosely, it is accepted);
otherwise it is rejected. The rejection of a null hypothesis implies that the difference is unlikely to
have arisen due to chance or sampling fluctuations alone.
vi) Statistical Hypothesis: Statistical hypotheses are statements about a population that are tested using
sample data. They are quantitative in nature and numerically measurable. For example, the market share of
product X is 70%, the average life of a tube light is 2000 hours, etc.

Criteria for Workable Hypothesis
A hypothesis controls and directs the research study. When a problem is felt, we require the hypothesis
to explain it. Generally, there is more than one hypothesis which aims at explaining the same fact. But all
of them cannot be equally good. Therefore, how can we judge a hypothesis to be true or false, good or
bad? Agreement with facts is the sole and sufficient test of a true hypothesis. Therefore, certain conditions
can be laid down for distinguishing a good hypothesis from bad ones. The formal conditions laid down
by thinkers provide the criteria for judging a hypothesis as good or valid. These conditions are as follows:
i) A hypothesis should be empirically verifiable: The most important condition for a valid hypothesis is
that it should be empirically verifiable. A hypothesis is said to be verifiable, if it can be shown to be either
true or false by comparing with the facts of experience directly or indirectly. A hypothesis is true if it
conforms to facts and it is false if it does not. Empirical verification is the characteristic of the scientific method.
ii) A hypothesis should be relevant: The purpose of formulating a hypothesis is always to explain some
facts. It must provide an answer to the problem which initiated the enquiry. A hypothesis is called
relevant if it can explain the facts of enquiry.
iii) A hypothesis must have predictive and explanatory power: Explanatory power means that a good
hypothesis, over and above the facts it proposes to explain, must also explain some other facts which are
beyond its original scope. We must be able to deduce a wide range of observable facts from the
hypothesis. The wider the range, the greater is its explanatory power.
iv) A hypothesis must furnish a base for deductive inference on consequences: In the process of
investigation, we always pass from the known to the unknown. It is impossible to infer anything from the
absolutely unknown. We can only infer what would happen under supposed conditions by applying the
knowledge of nature we possess. Hence, our hypothesis must be in accordance with our previous knowledge.
v) A hypothesis does not go against the traditionally established knowledge: As far as possible, a new
hypothesis should not go against any previously established law or knowledge. The new hypothesis is
expected to be consistent with the established knowledge.
vi) A hypothesis should be simple: A simple hypothesis is preferable to a complex one. It sometimes
happens that there are two or more hypotheses which explain a given fact equally well. Both of them are
verified by observable facts. Both of them have a predictive power and both are consistent with
established knowledge. All the important conditions of hypothesis are thus satisfied by them. In such
cases the simpler one is to be accepted in preference to the complex one.
vii) A hypothesis must be clear, definite and certain: It is desirable that the hypothesis must be simple
and specific to the point. It must be clearly defined in a manner commonly accepted. It should not be
vague or ambiguous.
viii) A hypothesis should be related to available techniques: If tools and techniques are not available,
we cannot test the hypothesis. Therefore, the hypothesis should be formulated only after due thought is
given to the methods and techniques that can be used to measure the concepts and variables related to
the hypothesis.

Stages in Hypothesis
There are four stages. The first stage is feeling of a problem. The observation and analysis of the
researcher reveals certain facts. These facts pose a problem. The second stage is formulation of a
hypothesis or hypotheses. A tentative supposition/ guess is made to explain the facts which call for an
explanation. At this stage some past experience is necessary to pick up the significant aspects of the
observed facts. Without previous knowledge, the investigation becomes difficult, if not impossible. The
third stage is deductive development of hypothesis using deductive reasoning. The researcher uses the
hypothesis as a premise and draws a conclusion from it. And the last stage is the verification or testing of
hypothesis. This consists in finding whether the conclusion drawn at the third stage is really true.
Verification consists in finding whether the hypothesis agrees with the facts. If the hypothesis stands the
test of verification, it is accepted as an explanation of the problem. But if the hypothesis does not stand
the test of verification, the researcher has to search for further solutions.
To explain the above stages let us consider a simple example. Suppose, you have started from your home
for college on your scooter. A little while later the engine of your scooter suddenly stops. What can be the
reason? Why has it stopped? From your past experience, you start guessing that such problems generally
arise due to either petrol or spark plug. You then deduce that the cause could be: (i) that the petrol
knob is not on, (ii) that there is no petrol in the tank, or (iii) that the spark plug has to be cleaned. You then
verify them one after another to solve the problem. First see whether the petrol knob is on. If it is not,
switch it on and start the scooter. If it is already on, then see whether there is petrol or not by opening the
lid of the petrol tank. If the tank is empty, go to the nearby petrol bunk to fill the tank with petrol. If there
is petrol in the tank, this is not the reason either, so you check the spark plug. You clean the plug and refit it.
The scooter starts. That means the problem is with the spark plug. You have identified it. So you got the
answer. That means your problem is solved.

Reliability and validity
Research should be tested for reliability, generalizability, and validity.
- Generalizability is the ability to make inferences from a sample to the population.
- Reliability is the extent to which a measure will produce consistent results. Test-retest reliability
checks how similar the results are if the research is repeated under similar circumstances.
Stability over repeated measures is assessed with the Pearson coefficient. Alternative forms
reliability checks how similar the results are if the research is repeated using different forms.
Internal consistency reliability checks how well the individual measures included in the research
are converted into a composite measure. Internal consistency may be assessed by correlating
performance on two halves of a test (split-half reliability).
- Validity asks whether the research measured what it intended to. Content validation (also called
face validity) checks how well the content of the research is related to the variables to be
studied: are the research questions representative of the variables being researched? It is a
demonstration that the items of a test are drawn from the domain being measured. Criterion
validation checks how meaningful the research criteria are relative to other possible criteria.
When the criterion is collected later the goal is to establish predictive validity. Construct
validation checks what underlying construct is being measured. There are three variants of
construct validity. They are convergent validity (how well the research relates to other measures
of the same construct), discriminant validity (how poorly the research relates to measures of
opposing constructs), and nomological validity (how well the research relates to other variables
as required by theory).
Internal validation, used primarily in experimental research designs, checks the relation
between the dependent and independent variables. Did the experimental manipulation of the
independent variable actually cause the observed results? External validation checks whether
the experimental results can be generalized.
Validity implies reliability: a valid measure must be reliable. But reliability does not necessarily imply
validity: a reliable measure need not be valid.
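
The reliability checks described above can be illustrated with a small calculation. The following is a minimal Python sketch, assuming hypothetical scores for five respondents (the data and the helper name pearson_r are invented for illustration). It computes the Pearson coefficient for a test-retest comparison and for the two halves of a test (split-half reliability).

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores of five respondents on the same test, taken twice.
first_round = [12, 15, 9, 20, 17]
second_round = [13, 14, 10, 19, 18]
test_retest = pearson_r(first_round, second_round)

# Hypothetical split-half data: each respondent's total on odd and even items.
odd_half = [6, 8, 4, 10, 9]
even_half = [6, 7, 5, 10, 8]
split_half = pearson_r(odd_half, even_half)

print(f"test-retest r = {test_retest:.3f}")
print(f"split-half r  = {split_half:.3f}")
```

A coefficient close to 1 suggests that the measure gives consistent results; it says nothing about whether the measure is valid.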

Testing of Hypothesis
When the hypothesis has been framed in the research study, it must be verified as true or false.
Verifiability is one of the important conditions of a good hypothesis. Verification of hypothesis means
testing of the truth of the hypothesis in the light of facts. If the hypothesis agrees with the facts, it is said
to be true and may be accepted as the explanation of the facts. But if it does not agree it is said to be false.
Such a false hypothesis is either totally rejected or modified. Verification is of two types viz., Direct
verification and Indirect verification.
1. Direct verification may be either by observation or by experiments. When direct observation shows
that the supposed cause exists where it was thought to exist, we have a direct verification. When a
hypothesis is verified by an experiment in a laboratory it is called direct verification by experiment. When
the hypothesis is not amenable for direct verification, we have to depend on indirect verification.
2. Indirect verification is a process in which certain possible consequences are deduced from the
hypothesis and they are then verified directly. Two steps are involved in indirect verification. (i)
Deductive development of hypothesis: By deductive development certain consequences are predicted
and (ii) finding whether the predicted consequences follow. If the predicted consequences come true, the
hypothesis is said to be indirectly verified. Verification may be done directly or indirectly or through
logical methods.
Testing of a hypothesis is done by using statistical methods. Testing is used to accept or reject an
assumption or hypothesis about a random variable using a sample from the distribution. The assumption
is the null hypothesis (H0), and it is tested against some alternative hypothesis (H1). Statistical tests of
hypothesis are applied to sample data. The procedure involved in testing a hypothesis is: (A) select a
sample and collect the data; (B) convert the variables or attributes into statistical form such as a mean or
proportion; (C) formulate the hypotheses; (D) select an appropriate test for the data, such as a t-test or Z-test; (E)
perform the computations; (F) finally, draw the inference of accepting or rejecting the null hypothesis.

Procedure for hypothesis testing
Hypothesis testing involves the following steps:
1. Formulate the null and alternative hypotheses.
2. Choose the appropriate test.
3. Choose a level of significance (alpha) - determine the rejection region.
4. Gather the data and calculate the test statistic.
5. Determine the probability of the observed value of the test statistic under the null hypothesis
given the sampling distribution that applies to the chosen test.
6. Compare the value of the test statistic to the rejection threshold.
7. Based on the comparison, reject or do not reject the null hypothesis.
8. Make the research conclusion.
In order to analyze whether research results are statistically significant or simply due to chance, a test of
statistical significance can be run.
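
The eight steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a two-sided z-test for a population mean with a known population standard deviation, and the figures (a hypothesized mean life of 2000 hours for a tube light, a standard deviation of 100 hours, a sample of 25 with a mean of 1950 hours) are hypothetical. The function name one_sample_z_test is likewise invented for the example.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-sided z-test of H0: population mean == mu0 (sigma assumed known)."""
    std_normal = NormalDist()                       # standard normal: mean 0, sd 1
    # Step 3: the significance level fixes the rejection region |z| > z_crit.
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Step 4: calculate the test statistic from the sample.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Step 5: probability of a value at least this extreme if H0 is true.
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    # Steps 6 and 7: compare with the threshold and decide.
    reject = abs(z) > z_crit
    return z, z_crit, p_value, reject

# Steps 1 and 2: H0: mean life = 2000 hours, H1: mean life != 2000 hours; z-test chosen.
z, z_crit, p_value, reject = one_sample_z_test(
    sample_mean=1950, mu0=2000, sigma=100, n=25, alpha=0.05
)
# Step 8: state the research conclusion.
print(f"z = {z:.2f}, critical value = {z_crit:.2f}, p = {p_value:.4f}")
print("Reject H0" if reject else "Do not reject H0")
```

With these hypothetical figures the sample mean lies 2.5 standard errors below the hypothesized mean, which falls in the rejection region at the 5% level.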

How do we use Sampling to accept or Reject Hypothesis?
Again we go back to the normal sampling distribution. We use the result that there is a certain fixed
probability associated with intervals from the mean defined in terms of number of standard deviations
from the mean. Therefore our problem of testing a hypothesis reduces to determining the probability that
a sample statistic such as the one we have obtained could have arisen from a population with a
hypothesized mean μ. In hypothesis tests we need two numbers to decide whether to
accept or reject the null hypothesis:
- an observed value computed from the sample
- a critical value defining the boundary between the acceptance and rejection regions
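Instead of measuring the variables in original units, we calculate a standardized z variable, which follows a
standard normal distribution with mean 0 and standard deviation 1. The z statistic tells us how many standard
deviations above (z > 0) or below (z < 0) the mean our observation falls. We can convert our
observed data into the standardized scale using the transformation:

$$ z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} $$

where x̄ is the sample mean, μ the hypothesized population mean, σ the population standard deviation and n
the sample size.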
The z statistic measures the number of standard deviations away from the hypothesized mean the sample
mean lies. From the standard normal tables we can calculate the probability of the sample mean differing
from the true population mean by a specified number of standard deviations.
For example:
- we can find the probability that the sample mean differs from the population mean by two or more
standard deviations.
It is this probability value that tells us how likely it is that a given sample mean can be obtained
from a population with a hypothesized mean μ.
- If the probability is low, for example less than 5%, it can reasonably be concluded that the
difference between the sample mean and hypothesized population mean is too large and the chance
that the population would produce such a random sample is too low.
What probability constitutes too low or acceptable level is a judgment for decision makers to make.
Certain situations demand that decision makers be very sure about the characteristics of the items being
tested and even a 2% probability that the population produces such a sample is too high. In other
situations there is greater latitude and a decision maker may be willing to accept a hypothesis with a 5%
probability of chance variation.
In each situation what needs to be determined are the costs resulting from an incorrect decision and the
exact level of risk we are willing to assume. Our minimum standard for an acceptable probability, say,
5%, is also the risk we run of rejecting a hypothesis that is true.
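
The probabilities read from the standard normal tables can also be computed directly. The short Python sketch below (the function name prob_beyond is an assumption made for this example) gives the probability that a standardized value lies a given number of standard deviations or more from the mean, in either direction.

```python
from statistics import NormalDist

def prob_beyond(k):
    """P(|Z| >= k) for a standard normal variable Z (both tails together)."""
    return 2 * (1 - NormalDist().cdf(k))

for k in (1.0, 1.96, 2.0, 3.0):
    print(f"P(|z| >= {k}) = {prob_beyond(k):.4f}")
```

For two standard deviations the probability is a little under 5%, which is why a difference of about two standard errors is commonly treated as the boundary of the rejection region at the 5% level.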

Hypothesis errors:
- type I error (also called alpha error)
o the study results lead to the rejection of the null hypothesis even though it is actually true
- type II error (also called beta error)
o the study results lead to the acceptance (non-rejection) of the null hypothesis even though it is
actually false
The choice of significance level affects the ratio of correct and incorrect conclusions which will be drawn.
Given a significance level there are four alternatives to consider:
Type I and type II errors
Correct Conclusion                 Incorrect Conclusion
Accept a correct hypothesis        Reject a correct hypothesis (type I error)
Reject an incorrect hypothesis     Accept an incorrect hypothesis (type II error)

Consider the following example. In a straightforward test of two products, we may decide to market
product A if, and only if, 60% of the population prefer the product. Clearly we can set a sample size, so as
to reject the null hypothesis of A = B = 50% at, say, a 5% significance level. If we get a sample which
yields 62% (and there will be 5 chances in a 100 that we get a figure greater than 60%) and the null
hypothesis is in fact true, then we make what is known as a Type I error.
If however, the real population is A = 62%, then we shall accept the null hypothesis A = 50% on nearly
half the occasions as shown in the diagram overleaf. In this situation we shall be saying "do not market
A" when in fact there is a market for A. This is the type II error. We can of course increase the chance of
making a type I error which will automatically decrease the chance of making a type II error.
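
The two risks in this example can be worked out numerically. The Python sketch below is a rough illustration only: it assumes a normal approximation to the sampling distribution of a proportion, takes the decision rule to be "market A if the sample proportion exceeds 60%", and uses invented helper names (sample_size_for_cutoff, prob_above_cutoff). It first finds a sample size that gives at most a 5% chance of exceeding 60% when the true preference is 50%, and then the chance of not exceeding 60% when the true preference is 62%.

```python
from math import ceil, sqrt
from statistics import NormalDist

std_normal = NormalDist()

def sample_size_for_cutoff(p0, cutoff, alpha):
    """Smallest n with P(sample proportion > cutoff | true p0) <= alpha (normal approx.)."""
    z_alpha = std_normal.inv_cdf(1 - alpha)      # one-sided critical value
    se_needed = (cutoff - p0) / z_alpha          # standard error that puts the cutoff at z_alpha
    return ceil(p0 * (1 - p0) / se_needed ** 2)

def prob_above_cutoff(p_true, cutoff, n):
    """P(sample proportion > cutoff) when the true proportion is p_true (normal approx.)."""
    se = sqrt(p_true * (1 - p_true) / n)
    return 1 - std_normal.cdf((cutoff - p_true) / se)

n = sample_size_for_cutoff(p0=0.50, cutoff=0.60, alpha=0.05)
type_1 = prob_above_cutoff(0.50, 0.60, n)        # reject H0 although A = 50% is true
type_2 = 1 - prob_above_cutoff(0.62, 0.60, n)    # keep H0 although A = 62% is true

print(f"sample size n = {n}")
print(f"P(type I error)  = {type_1:.3f}")
print(f"P(type II error) = {type_2:.3f}")
```

Raising the permitted type I risk moves the cutoff closer to 50% and so lowers the type II risk, which is the trade-off described above.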
Obviously some sort of compromise is required. This depends on the relative importance of the two types
of error. If it is more important to avoid rejecting a true hypothesis (type I error), a high confidence
coefficient (low value of α) will be used. If it is more important to avoid accepting a false hypothesis, a
low confidence coefficient may be used. An analogy with the legal profession may help to clarify the
matter. Under our system of law, a man is presumed innocent of murder until proved otherwise. Now, if
a jury convicts a man when he is, in fact, innocent, a type I error will have been made: the jury has
rejected the null hypothesis of innocence although it is actually true. If the jury absolves the man, when
he is, in fact, guilty, a type II error will have been made: the jury has accepted the null hypothesis of
innocence when the man is really guilty. Most people will agree that in this case, a type I error, convicting
an innocent man, is the more serious.
In practice, of course, researchers rarely base their decisions on a single significance test. Significance tests
may be applied to the answers to every question in a survey, but the results will be convincing only if
consistent patterns emerge. For example, we may conduct a product test to find out consumers'
preferences. We do not usually base our conclusions on the results of one particular question, but we ask
several, make statistical tests on the key questions and look for consistent significances. We must
remember that when one makes a series of tests, some of the correct hypotheses will be rejected by
chance. For example, if 20 questions were asked in our "before" and "after" survey and we test each
question at the 5% level, then one of the differences is likely to give significant results, even if there is no
real difference in the population.
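
The arithmetic behind this warning is simple. The sketch below assumes, for illustration, that the 20 tests are independent of one another (answers within one survey are often correlated, so the second figure is only indicative). The function names are invented for the example.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of true null hypotheses rejected by chance."""
    return n_tests * alpha

def prob_at_least_one_false_positive(n_tests, alpha):
    """Chance of one or more false rejections, assuming independent tests."""
    return 1 - (1 - alpha) ** n_tests

n_tests, alpha = 20, 0.05
print(f"expected false positives: {expected_false_positives(n_tests, alpha):.1f}")
print(f"P(at least one): {prob_at_least_one_false_positive(n_tests, alpha):.3f}")
```

With 20 questions tested at the 5% level, one spurious "significant" difference is expected on average, and the chance of at least one is roughly 64% under the independence assumption.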
No mention is made in these notes of considerations of costs of incorrect decisions. Statistical significance
is not always the only criterion for basing action. Economic considerations of alternative actions are often
just as important.
These, therefore, are the basic steps in the statistical testing procedure. The majority of tests are likely to
be parametric tests where researchers assume some underlying distribution like the normal or binomial
distribution. Researchers will obtain a result, say a difference between two means, calculate the standard
error of the difference and then ask "How far away from the zero difference hypothesis is the difference
we have found from our samples?"
To answer this question, researchers convert their actual difference into "standard errors"
by dividing it by its standard error, then refer to a table to ascertain the probability of such a
difference occurring.
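
As a minimal sketch of this calculation, the Python code below converts a difference between two sample means into standard errors and looks up its probability under the zero difference hypothesis. It assumes two large independent samples, so that the normal distribution applies. The sample figures and the function name difference_in_standard_errors are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def difference_in_standard_errors(mean1, sd1, n1, mean2, sd2, n2):
    """Express the difference between two sample means in standard errors."""
    # Standard error of the difference between two independent sample means.
    se_diff = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z = (mean1 - mean2) / se_diff
    # Two-sided probability of a difference at least this large if the true difference is zero.
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return se_diff, z, p_two_sided

# Hypothetical "before" and "after" survey scores, 100 respondents each.
se_diff, z, p = difference_in_standard_errors(52, 10, 100, 49, 10, 100)
print(f"standard error of difference = {se_diff:.3f}")
print(f"difference = {z:.2f} standard errors, p = {p:.4f}")
```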

Uses of Hypothesis
If a clear scientific hypothesis has been formulated, half of the research work is already done. The
advantages/utility of having a hypothesis are summarized here underneath:
i) It is a starting point for many a research work.
ii) It helps in deciding the direction in which to proceed.
iii) It helps in selecting and collecting pertinent facts.
iv) It is an aid to explanation.
v) It helps in drawing specific conclusions.
vi) It helps in testing theories.
vii) It works as a basis for future knowledge.

Use of statistical techniques for testing of hypothesis
A hypothesis test is a statistical method that uses sample data to evaluate a hypothesis about a
population parameter.
Hypothesis testing is standard and follows a specific order:
(i) first state a hypothesis about a population (a population parameter, e.g. the mean μ),
(ii) obtain a random sample from the population and find its mean x̄, and
(iii) compare the sample data with the hypothesis on a common scale (the standard z or normal distribution).
A hypothesis test is typically used in the context of a research study, i.e. a researcher completes one
round of a field investigation and then uses a hypothesis test to evaluate the results. Depending on the
type of research and the type of data, the details will differ from one research situation to another.

The following are some of the statistical techniques for testing of hypothesis
1. Z-Score Statistics
The z-score is called a test statistic. The purpose of a test statistic is to determine whether the result of a
research study (the obtained difference) is more than what would be expected by chance alone.

$$ z = \frac{\text{Obtained difference}}{\text{Difference due to chance}} $$
Now suppose a manufacturer produces articles of good quality. A purchaser
selects a sample at random. It so happens that the sample contains many defective articles, and this leads the
purchaser to reject the whole lot. The manufacturer suffers a loss even though he has produced
articles of good quality. Therefore, this Type-I error is called the producer's risk.
On the other hand, if we accept the entire lot on the basis of a sample and the lot is not really good, the
consumers suffer a loss. Therefore, this Type-II error is called the consumer's risk.
In practical situations, still other aspects are considered while accepting or rejecting a lot. The risks
involved for both producer and consumer are compared. Then Type-I and Type-II errors are fixed; and a
decision is reached.

2. Student's t-distribution
This concept was introduced by W. S. Gosset (1876 - 1937). He adopted the pen name "Student".
Therefore, the distribution is known as Student's t-distribution.
It is used to establish confidence limits and test the hypothesis when the population variance is not
known and sample size is small (< 30).
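If a random sample x1, x2, . . . , xn of n values is drawn from a normal population with mean μ and
standard deviation σ, then the mean of the sample, the sample variance and the t statistic are

$$ \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad t = \frac{\bar{x} - \mu}{s / \sqrt{n}} $$

and t follows Student's t-distribution with n − 1 degrees of freedom.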
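
A small worked example may help here. The Python sketch below computes the usual one-sample t statistic, t = (sample mean − hypothesized mean) / (s / √n) with n − 1 degrees of freedom, for a hypothetical sample of ten tube lights tested against a hypothesized mean life of 2000 hours. The data and the function name one_sample_t are invented. The critical value 2.262 is the standard two-sided 5% table value for 9 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """t statistic and degrees of freedom for H0: population mean == mu0."""
    n = len(sample)
    # stdev() uses the n - 1 divisor, as required when the population variance is unknown.
    t = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))
    return t, n - 1

# Hypothetical lives (in hours) of a small sample of ten tube lights.
lives = [1980, 2010, 1950, 2025, 1990, 1960, 2005, 1975, 1940, 1985]
t, df = one_sample_t(lives, mu0=2000)

T_CRIT_5PCT_DF9 = 2.262   # two-sided 5% critical value from the t table, 9 df
print(f"t = {t:.3f} with {df} degrees of freedom")
print("Reject H0" if abs(t) > T_CRIT_5PCT_DF9 else "Do not reject H0")
```

Here the calculated |t| is slightly below the table value, so the sample does not give sufficient grounds to reject the hypothesized mean at the 5% level.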

3. Chi-square test
Tests like the z-score and t-test are based on the assumption that the samples were drawn from normally
distributed populations or, more accurately, that the sample means were normally distributed. As these
tests require assumptions about the type of population or parameters, they are known as parametric tests.
There are many situations in which it is impossible to make any rigid assumption about the distribution
of the population from which samples are drawn. This limitation led to the search for non-parametric tests.
The chi-square (read as "ki-square") test of independence and goodness of fit is a prominent example of a
non-parametric test. The chi-square (χ²) test can be used to evaluate a relationship between two nominal or
ordinal variables.
χ² (chi-square) is a measure of the actual divergence of the observed and expected frequencies. In sampling
studies, we never expect a perfect coincidence between expected and observed frequencies,
and the question that we have to tackle is the degree to which the difference between expected and
observed frequencies can be ignored as arising due to fluctuations of sampling. If there is no difference
between expected and observed frequencies then χ² = 0. If there is a difference, then χ² would be more than
0. But the difference may also be due to sampling fluctuation, in which case the value of χ² should be ignored in
drawing the inference. The values of χ² that can arise from sampling fluctuation under different conditions are
given in the form of tables, and if the calculated value is greater than the table value, it indicates that the
difference is not solely due to sampling fluctuation and that there is some other reason.
On the other hand, if the calculated χ² is less than the table value, it indicates that the difference may have
arisen due to chance fluctuations and can be ignored. Thus the χ²-test enables us to find out the divergence
between theory and fact or between expected and actual frequencies.
If the calculated value of χ² is very small compared to the table value, then the divergence between the
expected and the observed frequencies is very small and the fit is good.
If the calculated value of χ² is very large compared to the table value, then the divergence between the
expected and the observed frequencies is very big and the fit is poor.
We know that the degrees of freedom ν (df) are the number of values in a set of data that are free to vary independently.
Suppose there is a 2 × 2 association table and the actual frequencies of the various classes are as follows:

              A           a         Total
B          (AB) 22     (aB) 38       60
b          (Ab) 8      (ab) 32       40
Total         30          70        100

Now the formula for calculating the expected frequency of any class (cell) is:

$$ \text{Expected frequency} = \frac{\text{Row total for the row containing the cell} \times \text{Column total for the column containing the cell}}{\text{Total number of observations}} $$

In notation: $E = \frac{RT \times CT}{N}$, where RT is the row total, CT the column total and N the total
number of observations.
For example, if we have two attributes A and B that are independent, then the expected frequency of the
class (cell) AB would be (60 × 30) / 100 = 18.
Once the expected frequency of cell (AB) is decided the expected frequencies of remaining three classes
are automatically fixed.
Thus, for class (aB) it would be 60 − 18 = 42
for class (Ab) it would be 30 − 18 = 12
for class (ab) it would be 70 − 42 = 28
This means that so far as a 2 × 2 association (contingency) table is concerned, there is
1 degree of freedom.
In such tables, the degrees of freedom are given by the formula ν = (c − 1)(r − 1),
where c = number of columns and r = number of rows.
Thus in a 2 × 2 table df = (2 − 1)(2 − 1) = 1
3 × 3 table df = (3 − 1)(3 − 1) = 4
4 × 4 table df = (4 − 1)(4 − 1) = 9 etc.
If the data is not in the form of contingency tables but as a series of individual observations or discrete or
continuous series, then it is calculated by ν = n − 1, where n is the number of frequencies or the number of
independent individual values.

$$ \chi^2 = \sum \frac{(\text{Observed frequency} - \text{Expected frequency})^2}{\text{Expected frequency}} = \sum \frac{(O - E)^2}{E} $$

where O = Observed frequency and E = Expected frequency.
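
The whole calculation for the 2 × 2 table above can be sketched in Python as follows. The function name chi_square_independence is an assumption made for this illustration. The critical value 3.841 is the standard 5% table value of χ² for 1 degree of freedom.

```python
def chi_square_independence(table):
    """Chi-square statistic, degrees of freedom and expected frequencies
    for a contingency table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    expected = []
    for i, row in enumerate(table):
        expected_row = []
        for j, observed in enumerate(row):
            # Expected frequency = row total x column total / total observations.
            e = row_totals[i] * col_totals[j] / grand_total
            expected_row.append(e)
            chi2 += (observed - e) ** 2 / e
        expected.append(expected_row)
    df = (len(col_totals) - 1) * (len(row_totals) - 1)   # (c - 1)(r - 1)
    return chi2, df, expected

# Observed frequencies from the table above: rows B and b, columns A and a.
observed = [[22, 38],
            [8, 32]]
chi2, df, expected = chi_square_independence(observed)

CHI2_CRIT_5PCT_DF1 = 3.841   # 5% table value for 1 degree of freedom
print("expected frequencies:", expected)
print(f"chi-square = {chi2:.4f}, df = {df}")
print("Reject independence" if chi2 > CHI2_CRIT_5PCT_DF1 else "Do not reject independence")
```

The expected frequencies reproduce the values 18, 42, 12 and 28 derived above, and the calculated χ² is below the table value, so the divergence may be attributed to sampling fluctuation at the 5% level.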

Review questions
1. Distinguish between estimation and testing of hypothesis.
2. Explain the procedure for testing a statistical hypothesis.
3. Discuss the role of normal distribution in interval estimation and also in testing of hypothesis.
4. Discuss how far the sample proportion satisfies the desirable properties of a good estimator.
5. How do you proceed to set confidence limits to population mean?
6. Describe how you could set confidence limits to population proportion on the basis of a large sample.
7. Explain how you would test for population mean.
8. Describe the different steps for testing the significance of population proportion.
9. Describe a situation where you can apply the t-test.
10. How would you distinguish between a t-test for independent samples and a paired t-test?
11. Distinguish between large samples and small samples.

Chapter9: Interpretation of Data
Statistics are not an end in themselves but they are a means to an end, the end being to draw certain
conclusions from them. This has to be done very carefully, otherwise misleading conclusions may be
drawn and the whole purpose of doing research may get vitiated. A researcher/statistician besides the
collection and analysis of data, has to draw inferences and explain their significance. Through
interpretation the meanings and implications of the study become clear. Analysis is not complete without
interpretation, and interpretation cannot proceed without analysis. Both are, thus, inter-dependent. In
this unit, therefore, we will discuss the interpretation of analysed data, summarizing the interpretation
and statistical fallacies.

Meaning of interpretation
The following definitions can explain the meaning of interpretation.
- The task of drawing conclusions or inferences and of explaining their significance after a careful
analysis of selected data is known as interpretation.
- It is an inductive process, in which you make generalizations based
on the connections and common aspects among the categories and
- Scientific interpretation seeks relationship between the data of a
study and between the study findings and other scientific
- Interpretation in a simple way means the translation of a statistical
result into an intelligible description.
Thus, analysis and interpretation are central steps in the research
process. The purpose of analysis is to summarize the collected data,
whereas interpretation is the search for the broader meaning of research findings. In interpretation, the
researcher goes beyond the descriptive data to extract meaning and insights from the data.

Why interpretation?
A researcher/ statistician is expected not only to collect and analyse the data but also to interpret the
results of his/ her findings. Interpretation is essential for the simple reason that the usefulness and utility
of research findings lie in proper interpretation. It is only through interpretation that the researcher can
expose relations and patterns that underlie his findings. In case of hypothesis testing studies the
researcher may arrive at generalizations. In case the researcher had no hypothesis to start with, he would
try to explain his findings on the basis of some theory. It is only through interpretation that the researcher
can appreciate why his findings are what they are, and can make others understand the real significance
of his research findings.
Interpretation is not a mechanical process. It calls for a critical examination of the results of one's analysis
in the light of all the limitations of data gathering. For drawing conclusions you need a basis. Some of the
common and important bases of interpretation are: relationships, ratios, rates and percentages, averages
and other measures of comparison.

Essentials for interpretation
Certain points should be kept in mind before proceeding to draw conclusions from statistics. It is
essential that:
a) The data are homogeneous: It is necessary to ascertain that the data are strictly comparable. We must
be careful to compare the like with the like and not with the unlike.
b) The data are adequate: Sometimes it happens that the data are incomplete or insufficient and it is
neither possible to analyze them scientifically nor is it possible to draw any inference from them. Such
data must be completed first.
c) The data are suitable: Before considering the data for interpretation, the researcher must confirm the
required degree of suitability of the data. Inappropriate data are like no data. Hence, no conclusion is
possible with unsuitable data.
d) The data are properly classified and tabulated: Every care is to be taken as a pre-requisite, to base all
types of interpretations on systematically classified and properly tabulated data and information.
e) The data are scientifically analyzed: Before drawing conclusions, it is necessary to analyze the data by
applying scientific methods. Wrong analysis can play havoc with even the most carefully collected data.
If interpretation is based on uniform, accurate, adequate, suitable and scientifically analyzed data, there is
every possibility of attaining a better and representative result. Thus, from the above considerations we
may conclude that it is essential to have all the pre-requisites/pre-conditions of interpretation satisfied to
arrive at better conclusions.

Precautions in interpretation
It is important to recognize that errors can be made in interpretation if proper precautions are not taken.
The interpretation of data is a very difficult task and requires a high degree of skill, care, judgement and
objectivity. In the absence of these, there is every likelihood of data being misused to prove things that
are not true. The following precautions are required before interpreting the data.
1) The interpreter must be objective.
2) The interpreter must understand the problem in its proper perspective.
3) He / she must appreciate the relevance of various elements of the problem.
4) See that all relevant, adequate and accurate data are collected.
5) See that the data are properly classified and analyzed.
6) Find out whether the data are subject to limitations and, if so, what they are.
7) Guard against the sources of errors.
8) Do not make interpretations that go beyond the information / data.
9) Factual interpretation and personal interpretation should not be confused. They should be kept apart.
If these precautions are taken at the time of interpretation, reasonably good conclusions can be arrived at.

Techniques of Interpretation
There are many different interpretation techniques, such as graphs and charts, but most are not used in
business research.
Those used most often include:
1. pie charts
2. vertical bar charts (often loosely called histograms)
3. horizontal bar charts (also loosely called histograms)
4. pictograms
5. line charts
6. area charts
Some other types of chart, well suited to audience research but less often used, include:
- perceptual maps (discussed under data analysis techniques)
Though many different kinds of graph are possible, a report that includes too many types is often
confusing for readers, who must work out how to interpret each new type of graph, and why it is
different from an earlier one. It is recommended to use as few types of graph as are necessary.
If you have a spreadsheet or graphics program, such as Excel or Deltagraph, it's very easy to produce
graphs. You simply enter the numbers and labels in a table, click a symbol to show which type of graph
you want, and it appears before your eyes. These graphs are usually not very clear when first produced,
but the software has many options for changing headings, scales, and graph layout. You can waste a lot
of time perfecting these graphs. Excel (actually, Microsoft Graph, which Excel uses) has dozens of
options, and it takes a lot of clicking of the right-hand mouse button to discover them all. If you don't
have a recent and powerful computer, Excel can be a very slow and frustrating program to use.
The main types of graph include pie charts, bar charts (loosely, histograms), line charts, area charts, and
several others.

1) Pie chart
A round graph, cut (like a pie) into slices of varying size, all adding to 100%. Because a pie chart is round,
it's useful for communicating data which takes a "round" form: for example, the answers to "How many
minutes in each hour would you like FM RADIOMIRCHI to spend on each of the following types of
program...?" In this case, the pie corresponds to a clock face, and the slices can be interpreted as
fractions of an hour.
Pie charts are easily understood when the slices are similar in size, but if several slices are less than 5%,
or lots of different colours are used, it can be quite difficult to read a pie chart. In that case the chart has
to be very big, taking perhaps half a page to convey one set of numbers. This is not a very efficient way to
display information.

2) Vertical bar chart
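Sometimes loosely called a histogram, although strictly a histogram shows the distribution of a numeric
variable grouped into intervals. A very common type of graph, easily understood. But when one of these
charts has more than about 6 vertical bars, there's very little space below each bar to explain what it's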

3) Horizontal bar chart
Exactly like a vertical bar chart, but turned sideways. The big advantage of the horizontal bar chart is
that you can easily read a description with more than one word.
Unfortunately, most graphics software displays the bars upside down: you're expected to read from
the bottom, upwards to the top. A standard bar chart looks like this. (Like the two above charts, this was
created with Excel.)

You don't need graphics software to produce a horizontal bar chart: you can do it easily with a word
processing program. One of the easiest ways to do this is to use the | symbol to produce the bars. This
symbol is usually found on the \ key; it is not a lower-case L or upper-case I or number 1. It stands out
best in bold type. This is what we call a blobbogram.
For example:

Male 47.4% |||||||||||||||||||||||||
Female 52.6% |||||||||||||||||||||||||||
Total 100.0% = 325 cases

If each symbol represents 2% of the sample, you can usually fit the graph on a single line. Round each
figure to the nearest 2% to work out how many times to press the symbol key. In the above example,
47.4% is closer to 48% than to 46%, so I pressed the | key 24 times to graph the percentage of men. This is
a very clear layout, and quick to produce, so it is well suited to a preliminary report.
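The rounding rule described above can also be automated. The following is only a minimal sketch in Python, not part of the original method: the function name blobbogram and its parameters are hypothetical, and it assumes that each symbol stands for 2% and that halves are rounded upwards.

```python
def blobbogram(rows, percent_per_symbol=2.0, symbol="|"):
    """Return one text line per (label, percentage) pair.

    Hypothetical helper for illustration: each symbol stands for
    `percent_per_symbol` percent of the sample.
    """
    width = max(len(label) for label, _ in rows)
    lines = []
    for label, pct in rows:
        # Round to the nearest whole number of symbols (halves round up).
        count = int(pct / percent_per_symbol + 0.5)
        lines.append(f"{label:<{width}} {pct:5.1f}% {symbol * count}")
    return lines


# Figures taken from the example above.
rows = [("Male", 47.4), ("Female", 52.6)]
chart = blobbogram(rows)
print("\n".join(chart))
```

With the figures from the example, the first line carries 24 symbols, matching the count worked out by hand above.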
A more elaborate looking graph can be made by using special symbols. For example, if you have the font
Zapf Dingbats or Wingdings, you can use the shaded-
This is wider than the | symbol, and no more than about 20 will fit on a normal-width line, if half the line

Male 47.4%
Female 52.6%
Total 100.0% = 325 cases

4) Pictograms
Like a bar chart, a pictogram can be either vertical or horizontal, but instead of showing a solid bar, a
pictogram shows a number of symbols - e.g. small diagrams of people. In fact, the above bar chart with
pictograms show partial symbols. If one little man
means 10%, and the number to be graphed is 45%,
you see four and a half little men...

5) Line chart
This is used when the variable you are graphing is a numeric one. In audience research, most variables
are nominal, not numeric, so line charts aren't needed much. But to plot the answers to a question such as
"How many people live in your household?" you could produce a graph like this:
It's normal to show the measurement (e.g. percentage) on the vertical axis, and the scale (e.g. hours per
week) on the horizontal axis. Unlike a bar chart, it will confuse people if the axes are exchanged. You'll
find that many line charts have a peak near the middle and fall off to each side, resembling what's known
as the "normal curve."
A line chart is really another form of a vertical bar chart. You could turn a vertical bar chart into a line
chart by drawing a line connecting the top of each bar, then deleting the bars.
A line chart can have more than one line. For example, you could have a line chart comparing the number
of hours per week that men and women watch TV. There'd be two lines, one for each sex. Each line needs
to be shown with a different style, or a different colour. With more than 3 or 4 lines, a line chart becomes
very confusing, especially when the lines cross each other.

6) Area chart
In a line chart with several lines (such as the above example, with two sexes), each line starts from the
bottom of the chart. That way, you can compare the height of the lines at any point. An area chart is a
little different, in that each line starts from the line below it. So you don't compare the height of the
lines, but the areas between them. These areas always add up to a height of 100%. You can think of an
area chart as a lot of pie charts, flattened out and laid end-to-end.
A common use of area charts in audience research is to show how people's behaviour changes across the
24 hours of the day. The horizontal scale runs from midnight to midnight, and the vertical scale from 0 to
100%. This area chart, taken from a survey in Vietnam, shows how people divide their day into sleep,
work, watching TV, listening to radio, and everything else.
An area chart needs to be studied closely: the results aren't obvious at a glance. However, area charts
provide a lot of information in a small space.
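The idea that each line in an area chart "starts from the line below it" amounts to a running total of the percentages. The sketch below is for illustration only: the function name stacked_boundaries is hypothetical, and the midday percentages are invented figures, not results from the survey mentioned above.

```python
from itertools import accumulate


def stacked_boundaries(shares):
    """Return the height of each line in an area chart at one point in time.

    `shares` are the percentages for each activity, listed from the bottom
    band upwards; they must add up to 100.
    """
    if abs(sum(shares) - 100) > 1e-9:
        raise ValueError("shares must add up to 100%")
    # Each line sits on top of the one below it, so take a running total.
    return list(accumulate(shares))


activities = ["sleep", "work", "TV", "radio", "everything else"]
# Invented percentages for a single time of day (hypothetical data).
midday_shares = [5, 60, 5, 10, 20]
midday_lines = stacked_boundaries(midday_shares)
print(dict(zip(activities, midday_lines)))
```

The top line always reaches 100%, which is why the bands of an area chart can be read like the slices of a pie chart at each point along the horizontal scale.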

Which type of graph is best?
There are dozens of other chart types not mentioned above, and also dozens of variations on the above
types, especially bar charts. However, the above graph types cover most situations. It becomes confusing
to readers of reports if many different types of graph are presented, so it is recommended that any report
should include no more different graph types than necessary.
The most appropriate type of graph to present depends on the number of variables being displayed, and
whether these are nominal variables (with a limited number of separate values) or metric variables
(whose value can be any number). It is suggested to use a horizontal bar chart whenever possible. In a
normal audience survey, less than a third of the graphs are unsuited to being shown as horizontal bar
charts.
Number of variables   Type of variables       Recommended chart type
1                     nominal                 bar chart, pictogram, or pie chart
1                     metric                  line graph, or box and whisker plot
2                     both nominal            multiple bar chart, or domino chart
2                     both metric             bubble chart, or scattergram
2                     1 metric, 1 nominal     box and whisker plot, or area chart
3-D charts can look very impressive, but it is strongly suggested to avoid using them: it's just too easy
to misread them. The simpler a graph is, the more effective it is at communicating.
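The recommendations in the table above can be written as a simple lookup. This is only a sketch: the names RECOMMENDED_CHARTS and recommend_chart are hypothetical, and the entries simply restate the table.

```python
# Recommended chart types, keyed by the sorted types of the variables shown.
# The entries restate the table above; the names here are illustrative only.
RECOMMENDED_CHARTS = {
    ("nominal",): ["bar chart", "pictogram", "pie chart"],
    ("metric",): ["line graph", "box and whisker plot"],
    ("nominal", "nominal"): ["multiple bar chart", "domino chart"],
    ("metric", "metric"): ["bubble chart", "scattergram"],
    ("metric", "nominal"): ["box and whisker plot", "area chart"],
}


def recommend_chart(*variable_types):
    """Return the chart types suggested for one or two variables."""
    key = tuple(sorted(variable_types))
    if key not in RECOMMENDED_CHARTS:
        raise ValueError(f"no recommendation for {variable_types!r}")
    return RECOMMENDED_CHARTS[key]


print(recommend_chart("nominal"))
print(recommend_chart("nominal", "metric"))
```

Sorting the variable types means the order in which they are given does not matter.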

Statistical fallacies
Interpretation of data, as we stated earlier, is a very difficult task and requires a high degree of care,
objectivity, skill and judgement. In the absence of these things, it is likely that the data may be misused.
In fact, experience shows that the largest number of mistakes are committed knowingly or unknowingly
while interpreting statistical data which may lead to misinterpretation of data by most of the readers.
Statistical fallacies may arise at any stage in the collection, presentation, analysis and interpretation of
data. The following are some of the (i) specific examples illustrating how statistics can be misinterpreted,
(ii) Sources of errors leading to false generalizations, (iii) examples how fallacies arise in using statistical
data and statistical methods.
1. Bias: Bias, whether it is conscious or unconscious, is very common in statistical work and it leads to
false generalizations. Wrong interpretations are sometimes made wantonly by people trying to prove their
point, and statistical information is sometimes deliberately twisted to grind one's own axe. For example, a
businessman may use statistics to prove the superiority of his firm over others by saying that "our firm
earned a profit of ₹ 1,00,000 whereas firm X earned only ₹ 80,000 this year." On the face of it, it appears
that firm X has not performed well. But a little thinking reveals that many other variables have to be
considered before drawing such a conclusion, such as the capital employed and, if the capital employed
is the same, the quality of the product and so on. Unconscious bias is even more insidious.
Perhaps all statistical reports contain some unconscious bias, since the statistical results are interpreted
by human beings after all. Each may look at things in terms of his own experience and his attitude
towards the problem under study. People suffer from several inhibitions, prejudices, ideologies and
hardened attitudes. They cannot help reflecting these in their interpretation of results. For example, a
pessimist will see the future as being dark, whereas an optimist may see it as being bright.
2. Inconsistency in Definitions: Sometimes false conclusions are drawn because of failure to define
properly the object being studied and hold that definition in mind for making comparisons. When the
working capital of two firms is compared, net working capital of one must be compared with only net
working capital of the other and not with gross working capital. Even within the organization, for
facilitating comparison over a period of time it is necessary to keep the definition constant.
3. Inappropriate Comparisons: Comparisons between two things cannot be made unless they are really
alike. Unfortunately, this point is generally forgotten and comparisons are made between two dissimilar
things, thereby, leading to fallacious conclusions. For example, the cost of living index of Bangalore is 150
(with base year 1999) and that of Hyderabad is 155 (with base 1995). Therefore, Hyderabad is a costlier
city than Bangalore city. This conclusion is misleading as the base years of the Indices are different.
4. Faulty Generalizations: Many a time people jump to conclusions or generalizations on the basis of
either too small a sample or a sample that is not representative of the population. For example, a
foreigner comes to Delhi, has his purse stolen by a pickpocket, and comments that there is no safety
and security for foreigners in India. This is not true, as thousands of foreigners come to India and are
safe and secure. Sometimes the sample size may be adequate but not representative.
5. Drawing Wrong Inferences: Sometimes wrong inferences may be drawn from the data. For example,
the population of a town has doubled in 10 years. From this it is interpreted that the birth rate in the town
has doubled. Obviously, this is a wrong inference, as the population of the town can double in many
ways other than a doubling of the birth rate (for example, an exodus from villages or migration from
other places).
6. Misuse of Statistical Tools: The various tools of analysis such as measures of central tendency,
measures of variation, measures of correlation, ratios, percentages etc., are very often misused to present
information in such a manner as to convince the public or to camouflage things. In a company there are
1,00,000 shares and 1,000 shareholders. The company claims that its shares are well distributed as the
average shareholding is 100. But a close scrutiny reveals that 10 persons hold 90,000 shares whereas 990
persons hold 10,000 shares, an average of only about 10 each. Similarly, range can be misused to exaggerate
disparities. For example, in a factory the wages may range from ₹ 1,000 to ₹ 1,500 a month and the
Manager gets ₹ 20,000 a month. It is reported that the earnings of their employees range from ₹ 1,000 to
₹ 20,000.
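7. Failure to Comprehend the Data: Very often figures are interpreted without comprehending the total
background of the data, and this may lead to wrong conclusions. For example, consider the following:
- The death rate in the army is 9 per thousand, whereas in the city of Delhi it is 15 per thousand.
Therefore, it is safer to be in the army than in the city. (This ignores the fact that the army consists
mainly of healthy young adults, while the city population includes infants, the aged and the sick.)
- Most of the patients who were admitted to the intensive care (IC) ward of a hospital died.
Therefore, it is unsafe to be admitted to the intensive care ward in that hospital. (This ignores the fact
that only the most seriously ill patients are admitted to intensive care in the first place.)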
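The shareholding example under "Misuse of Statistical Tools" can be checked with a few lines of arithmetic. The sketch below uses the figures given in the text; the variable names are illustrative, and the median calculation rests on the added assumption that shares are split equally within each group, which the text does not state.

```python
from statistics import median

# (number of shareholders, shares held by that group), as given in the text.
groups = [(10, 90_000), (990, 10_000)]

total_holders = sum(holders for holders, _ in groups)
total_shares = sum(shares for _, shares in groups)

# The figure the company quotes: the overall average holding.
overall_average = total_shares / total_holders

# The averages within each group tell a very different story.
group_averages = [shares / holders for holders, shares in groups]

# Assumption (not in the text): shares are split equally within each group,
# so that a median holding can be computed.
holdings = [shares / holders for holders, shares in groups for _ in range(holders)]
typical_holding = median(holdings)

print(overall_average, group_averages, typical_holding)
```

Under that assumption the median holding is about 10 shares, far below the quoted average of 100, which is why an average alone can camouflage a very uneven distribution.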

Concluding remarks on interpretation
The task of interpretation is not an easy job. It requires skill and dexterity on the part of the researcher.
Interpretation is an art that one learns through practice and experience. The researcher may seek the
guidance of experts for accomplishing the task of interpretation.
The element of comparison is fundamental to all research interpretations. Comparison of one's findings
with a criterion, or with results of other comparable investigations or with normal (ideal) conditions, or
with existing theories or with the opinions of a panel of judges / experts forms an important aspect of
interpretation.
The researcher must accomplish the task of interpretation only after considering all relevant factors
affecting the problem to avoid false generalizations. He/she should not conclude without evidence.
He/she should not draw hasty conclusions. He/she should take all possible precautions for proper
interpretation of the data.

Report writing
The last and final phase of the journey in research is writing of the report. After the collected data has
been analyzed and interpreted and generalizations have been drawn the report has to be prepared. The
task of research is incomplete till the report is presented.
Writing of a report is the last step in a research study and requires a set of skills somewhat different from
those called for in respect of the earlier stages of research. This task should be accomplished by the
researcher with utmost care.

Purpose of a report
The report may be meant for the people in general, when the investigation has not been carried out at the
instance of any third party. Research is essentially a cooperative venture and it is essential that every
investigator should know what others have found about the phenomena under study. The purpose of a
report is thus the dissemination of knowledge and the broadcasting of generalizations so as to ensure
their widest use.
A report of research has only one function: it must inform. It has to propagate knowledge. Thus, the
purpose of a report is to convey to the interested persons the results and findings of the study in
sufficient detail, and so arranged as to enable each reader to comprehend the data, and to determine for
himself the validity of conclusions. Research results must invariably enter the general store of
knowledge. A research report is always an addition to knowledge. All this explains the
significance of writing a report. In a broader sense, report writing is common to both academics and
organizations. However, the purpose may be different. In academics, reports are used for comprehensive
and application-oriented learning, whereas in organizations, reports form the basis for decision making.

Reporting simply means communicating or informing through reports. The researcher has collected some
facts and figures, analyzed the same and arrived at certain conclusions. He has to inform or report the
same to the parties interested. Therefore reporting is communicating the facts, data and information
through reports to the persons for whom such facts and data are collected and compiled.
A report is not a complete description of what has been done during the period of survey/research. It is
only a statement of the most significant facts that are necessary for understanding the conclusions drawn
by the investigator. Thus, a report by definition, is simply an account. The report thus is an account
describing the procedure adopted, the findings arrived at and the conclusions drawn by the investigator
of a problem.

Types of reports
Broadly speaking reporting can be done in two ways:
a) Oral or Verbal Report: Reporting verbally in person, for example, presenting the findings in a
conference or seminar or reporting orally to one's superiors.
b) Written Report: Written reports are more formal, authentic and popular.

Written reports can be presented in different ways as follows.
i) Sentence form reports: Communicating in sentence form
ii) Tabular reports: Communicating through figures in tables
iii) Graphic reports: Communicating through graphs and diagrams
iv) Combined reports: Communicating using all the three of the above. Generally, this is the most
Research reports vary greatly in length and type. In each individual case, both the length and the form
are largely dictated by the purpose of the study and problems at hand. For example, business
organizations generally prefer reports in letter form, that too short in length. Banks, insurance and other
financial institutions generally prefer figure form in tables. The reports prepared by government bureaus,
enquiry commissions etc., are generally very comprehensive on the issues involved. Similarly, research
theses/dissertations usually prepared by students for Ph.D. degree are also elaborate and methodical.
It is, thus, clear that the results of a research enquiry can be presented in a number of ways. They may be
termed as a technical report, a popular report, an article, or a monograph.
1) Technical Report: A technical report is used whenever a full written report (e.g. a Ph.D. thesis) of the
study is required either for evaluation or for record keeping or for public dissemination. The main
emphasis in a technical report is on:
a) the methodology employed.
b) the objectives of the study.
c) the assumptions made / hypotheses formulated in the course of the study.
d) how and from what sources the data were collected and how the data have been analyzed.
e) the detailed presentation of the findings with evidence, and their limitations.
2) Popular Report: A popular report is one which emphasizes simplicity and attractiveness. Its aim
is to make the general public understand the findings and implications. Generally, it is simple. Simplicity
is achieved through clear language and minimization of technical details. Attention of the
readers is sought through an attractive layout and liberal use of graphs, charts, diagrams and
pictures. In a popular report emphasis is placed on practical aspects and policy implications.
3) Research Article: Sometimes the findings of a research study can be published in the form of a short
paper called an article. This is one form of dissemination. The research papers are generally prepared
either to present in seminars and conferences or to publish in research journals. Since one of the
objectives of doing research is to make a positive contribution to knowledge, in the field, publication
(publicity) of the work serves the purpose.
4) Monograph: A monograph is a treatise or a long essay on a single subject.
For the sake of convenience, reports may also be classified either on the basis of approach or on the basis
of the nature of presentation, such as:
i) Journalistic Report
ii) Business Report
iii) Project Report
iv) Dissertation
v) Enquiry Report (Commission Report), and
vi) Thesis
Reports prepared by journalists for publication in the media may be journalistic reports. These reports
have news and information value. A business report may be defined as a report for business
communication from one departmental head to another, one functional area to another, or even from top
to bottom in the organizational structure on any specific aspect of business activity. These are
observational reports which facilitate business decisions. A project report is the report on a project
undertaken by an individual or a group of individuals relating to any functional area or any segment of a
functional area or any aspect of business, industry or society. A dissertation, on the other hand, is a
detailed discourse or report on the subject of study. Dissertations are generally used as documents to be
submitted for the acquisition of higher research degrees from a University or an academic institution.
The thesis is an example in point.
An enquiry report or a commission of enquiry report is a detailed report prepared by a commission
appointed for the specific purpose of conducting a detailed study of any matter of dispute or of a subject
requiring greater insight. These reports facilitate action, since they contain expert opinions.

Preparing research report
Research reports are the product of slow and painstaking and accurate work. Therefore, the preparation
of the report may be viewed in the following major stages.
1) The logical understanding and analysis of the subject matter.
2) Planning/designing the final outline of the report.
3) Write up/preparation of rough draft.
4) Polishing/finalization of the Report.
Logical Understanding of the Subject Matter: It is the first stage which is primarily concerned with the
development of a subject. There are two ways to develop a subject viz. a. logically and b. chronologically.
The logical development is done on the basis of mental connections and associations between one aspect
and another by means of logical analysis. Logical treatment often consists of developing material from the
simple to the most complex. Chronological development is based on a connection or sequence in time or
happening of the events. The directions for doing something usually follow the chronological order.
Designing the Final Outline of the Report: It is the second stage in writing the report. Having
understood the subject matter, the next stage is structuring the report and ordering the parts and
sketching them. This stage can also be called the planning and organization stage. Ideas may pass through
the author's mind. Unless he first makes his plan/sketch/design he will be unable to achieve a
harmonious succession and will not even know where to begin and how to end. Better communication of
research results is partly a matter of language but mostly a matter of planning and organizing the report.
Preparation of the Rough Draft: The third stage is the write up/drafting of the report. This is the most
crucial stage for the researcher, as he/she now sits down to write what he/she has done in the
research study and what he/she wants to communicate, and how. Here the clarity in
communicating/reporting is influenced by factors such as who the readers are, how technical the
problem is, the researcher's hold over the facts and techniques, the researcher's command over language
(his communication skills), the data and completeness of his notes and documentation and the
availability of analyzed results. Depending on the above factors some authors may be able to write the
report with one or two drafts. Some people who have less command over language, or no clarity about the
problem and subject matter, may take more time for drafting the report and have to prepare more drafts
(first draft, second draft, third draft, fourth draft etc.).
Finalization of the Report: This is the last stage, perhaps the most difficult stage of all formal writing. It
is easy to build the structure, but it takes more time for polishing and giving finishing touches. Take for
example the construction of a house. Up to the roofing (structure) stage the work is very quick, but
finishing the building so that it is ready takes up a lot of time. The rough draft (whether it is the second
draft or the nth draft) has to be rewritten and polished in terms of requirements. The careful revision of
the rough draft makes the difference between a mediocre and a good piece of writing. While polishing and
finalizing, one should check the report for weaknesses in the logical development of the subject and in
the cohesion of the presentation. He/she should also check the mechanics of writing: language, usage,
grammar, spelling and punctuation.

Characteristics of a good report
A research report is a channel for communicating the research findings to its readers. A good
report is one which does this task efficiently and effectively. As such it should have the following
characteristics:
i) It must be clear in informing the what, why, who, whom, when, where and how of the research study.
ii) It should be neither too short nor too long. One should keep in mind the fact that it should be long
enough to cover the subject matter but short enough to sustain the reader's interest.
iii) It should be written in an objective style and simple language. Correctness, precision and clarity
should be the watchwords of the scholar. Wordiness, indirection and pompous language are barriers to
communication.
iv) A good report must combine clear thinking, logical organization and sound interpretation.
v) It should not be dull. It should be such as to sustain the reader's interest.
vi) It must be accurate. Accuracy is one of the requirements of a report. It should be factual with objective
presentation. Exaggerations and superlatives should be avoided.
vii) Clarity is another requirement of presentation. It is achieved by using familiar words and
unambiguous statements, explicitly defining new concepts and unusual terms.
viii) Coherence is an essential part of clarity. There should be a logical flow of ideas (i.e. continuity of
thought) and a logical sequence of sentences. Each sentence must be linked with the other sentences so
that the thoughts move smoothly.
ix) Readability is an important requirement of good communication. Even a technical report should be
easily understandable. Technicalities should be translated into language understandable by the readers.
x) A research report should be prepared according to the best composition practices. Ensure readability
through proper paragraphing, short sentences, illustrations, examples, section headings, use of charts,
graphs and diagrams.
xi) Draw sound inferences/conclusions from the statistical tables. But don't repeat the tables in text
(verbal) form.
xii) Footnote references should be in proper form. The bibliography should be reasonably complete and
in proper form.
xiii) The report must be attractive in appearance, neat and clean whether typed or printed.
xiv) The report should be free from mistakes of all types, viz. language mistakes, factual mistakes, spelling
mistakes, calculation mistakes etc.
The researcher should try to achieve these qualities in his report as far as possible.

Layout of a Report
Under this head, the format/outline/sketch of a comprehensive technical report or research report is
discussed below. A research report has a number of clearly defined sections. The headings of the sections
and their order may differ from one situation to another. The contents of a report can broadly be divided
into the following parts:

A) Front Matters
1. Title Page
2. Certificate
3. Declaration
4. Acknowledgments
5. Executive Summary
6. Table of Contents
7. List of Illustrations and List of Tables
8. List of abbreviations used

B) Main Text
1. Introduction
2. Research methodology
3. Background to the research problem
4. Data collection
5. Sample and sampling method
Statistical or qualitative methods used for data analysis
Sample description
6. Tabulation and Analysis of Data
7. Finding of study
8. Conclusions
9. Recommendations of study

C) Reference Matters
1. Bibliography
2. Appendices (optional)
3. Glossary (optional)
4. References (optional)

A) Front Pages
1) Title Page
The cover page should display full name of researcher, guide along with qualification, and the title of
2) Certificate
The format is given in the sample pages at the end of the book.
3) Declaration
The format is given in the sample pages at the end of the book.
4) Acknowledgments
The researcher may wish to acknowledge people who helped in the preparation of the report. For example, you
may wish to thank someone you interviewed, or someone who provided you with some special
5) Table of Contents and List of Figures
The report should have a Table of Contents that lists the report's sections and page numbers. If figures
are included in the report (charts, tables, diagrams), one must also include a list of figures, indicating
titles and page numbers. Figures should be numbered, titled, and mentioned in the text preceding them.
6) List of tables and illustrations used
7) Executive Summary
One of the most important components of the report is the Executive Summary. It answers the
question, "What does the report contain?" and should be written after the rest of the report is
complete. The Executive Summary should be complete in itself and may be consulted by readers who
wish to determine whether they need to read the whole report.
Limit the Executive Summary to two to three pages and discuss:
- Purpose and extent of the report
- Major points contained in the body of the report
- Highlights of key conclusions
- Highlights of key recommendations

B) Main Text
1) Introduction:-The Introduction should establish the purpose of the report and should convey what is
in the body of the report. One should provide the reader with the following information:
- Necessary background information
- major points that will be covered in the report
- the situation or problem that will be analyzed
- what your aims are in compiling the report
Analysis
- Why does a problem exist?
- How does the problem affect the environment?
- What efforts may solve this problem?
- What aspects of the problem have been measured and improved? How?
- What problems does the potential solution not solve? Why not?
- What could be improved?
2) Research Methodology: -
- Goals of the study, specific objectives, and purpose of the study.
- Statistical design: universe of study, sampling method, sample size and unit, secondary data
sources, and limitations of study.
- Tools of data collection, and the response rate
3) Tabulation and Analysis: -
Analysis is the most important part of the report because it contains the "workings out": how one reaches the
conclusions. Analysis should contain the thoughts, reasons, and judgments based on the facts, figures, and
data collected. In analysis one makes INFERENCES, that is, conclusions drawn from the research.
4) Finding, Conclusions and Recommendations of study: -
The conclusions are the final results of analysis. They should be brief and should contain no new
information. They should not make direct reference to sources, figures, or tables. The conclusions should
be listed and numbered, with brief explanation for each. Each conclusion should follow logically from the
facts and arguments presented in the main text (body). RECOMMENDATIONS are suggestions, based on
the conclusions reached from the research. These should be brief and should follow logically from the

C) Reference Matter

A bibliography is an alphabetical list of all materials consulted in the preparation of research.

Why do a bibliography?
Some reasons:
1. To acknowledge and give credit to sources of words, ideas, diagrams, illustrations, and quotations
borrowed, or any materials summarized or paraphrased.
2. To show that you are respectfully borrowing other people's ideas, not stealing them, i.e. to prove that
you are not plagiarizing (copying).
3. To offer additional information to readers who may wish to further pursue the topic.
4. To give readers an opportunity to check out the sources for accuracy. An honest bibliography inspires
reader confidence in writing.

What must be included in a bibliography?
1. Author
2. Title
3. Place of publication
4. Publisher
5. Date of publication
6. Page number(s) (for articles from magazines, journals, periodicals, newspapers,
encyclopaedias etc.)

1. Author
Ignore any titles, designations or degrees, etc. which appear before or after the name, e.g., The
Honourable, Dr., Mr., Mrs., Ms., Rev., S.J., Esq., Ph.D., M.D., Q.C., etc. Exceptions are Jr. and Sr. Do
include Jr. and Sr. as John Smith, Jr. and John Smith, Sr. are two different individuals. Include also I, II,
III, etc. for the same reason.
a) Last name, first name:
Kotler, Philip.
Christensen, Asger.
Wilson-Smith, Anthony.
b) Last name, first and middle names:
Wyse, Cassandra Ann Lee.
c) Last name, first name and middle initial:
Schwab, Charles R.
d) Last name, initial and middle name:
Holmes, A. William.
e) Last name, initials:
Meister, F.A.
f) Last name, first and middle names, Jr. or Sr. designation:
Davis, Benjamin Oliver, Jr.
g) Last name, first name, I, II, III, etc.:
Stilwell, William E., IV.
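
The author rules above can be sketched as a small Python helper. This is only an illustrative sketch: the function name `format_author` and the title and degree lists are assumptions made for this example, and it handles only simple names whose last name is a single (possibly hyphenated) word.

```python
# Hypothetical helper illustrating the author rules; not a complete name parser.
TITLES = {"the", "honourable", "dr", "mr", "mrs", "ms", "rev"}   # dropped before the name
DEGREES = {"sj", "esq", "phd", "md", "qc"}                       # dropped after the name
ROMAN = {"I", "II", "III", "IV", "V"}                            # kept after the name


def _norm(token):
    """Lower-case a token and remove its periods, e.g. 'Ph.D.' -> 'phd'."""
    return token.replace(".", "").lower()


def _is_kept_suffix(token):
    """Jr., Sr. and roman numerals distinguish individuals, so they are kept."""
    return _norm(token) in {"jr", "sr"} or token in ROMAN


def format_author(name):
    words = name.replace(",", " ").split()
    # Ignore titles that appear before the name.
    while words and _norm(words[0]) in TITLES:
        words.pop(0)
    # Walk back from the end: drop degrees, remember a kept suffix.
    suffix = None
    while len(words) > 2 and (_norm(words[-1]) in DEGREES or _is_kept_suffix(words[-1])):
        last = words.pop()
        if _is_kept_suffix(last):
            suffix = last
    entry = f"{words[-1]}, {' '.join(words[:-1])}"
    if suffix:
        entry += f", {suffix}"
    # A bibliography entry's author element ends with a period.
    return entry if entry.endswith(".") else entry + "."


print(format_author("Charles R. Schwab"))           # Schwab, Charles R.
print(format_author("Benjamin Oliver Davis, Jr."))  # Davis, Benjamin Oliver, Jr.
```

Names with multi-word last names would need manual handling; the sketch always treats the final remaining word as the last name.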

2. Title and subtitle
a) If the title on the front cover or spine of the book differs from the title on the title page, use the title on
the title page for your citation.
b) UNDERLINE the title and subtitle of a book, magazine, journal, periodical, newspaper, or
encyclopaedia, e.g., What to Do When Things Go Wrong, Sports Illustrated, New York Times,
Encyclopaedia Britannica.
c) If the title of a newspaper does not indicate the place of publication, add the name of the city or town
after the title in square brackets, e.g. National Post [Toronto].
Freeze, Colin. "Illinois Puts the Death Penalty Itself on Trial." Globe and Mail [Toronto]
29 Oct. 2002: A3.
Furuta, Aya. "Japan Races to Stay Ahead in Rice-Genome Research." Nikkei Weekly [Tokyo]
5 June 2000: 1+.
d) DO NOT UNDERLINE the title and subtitle of an article in a magazine, journal, periodical, newspaper,
or encyclopedia; put the title and subtitle between quotation marks:
Baker, Peter, and Susan B. Glasser. "No Deals with Terrorists: Putin." Toronto Star
29 Oct. 2002: A1+.
Fisher, Dennis. "Safe Data: At What Price?" eWeek 21 Oct. 2002: 26.
Penny, Nicholas B. "Sculpture, The History of Western." New Encyclopaedia Britannica.
1998 ed.
e) CAPITALIZE the first word of the title, the first word of the subtitle, as well as all important words
except for articles, prepositions, and conjunctions, e.g., Flash and XML: A Developer's Guide, or The Red
Count: The Life and Times of Harry Kessler.
f) Use LOWER CASE letters for conjunctions such as and, because, but, and however; for prepositions
such as in, on, of, for, and to; as well as for articles: a, an, and the, unless they occur at the beginning of a
title or subtitle, or are being used emphatically, e.g., "And Now for Something Completely Different: A
Hedgehog Hospital," "Court OKs Drug Tests for People on Welfare," or "Why Winston Churchill Was The
Man of The Hour."
g) Separate the title from its subtitle with a COLON (:), e.g. "Belfast: A Warm Welcome Awaits."
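
The capitalization rules in (e) to (g) can be sketched in Python as follows. The function name `title_case` is hypothetical, the list of minor words contains only the examples named above, and emphatic use of a minor word (rule f) is not detected, so treat this as a rough aid rather than a complete implementation.

```python
# Minor words named in rule (f): conjunctions, prepositions, and articles.
MINOR_WORDS = {"and", "because", "but", "however",
               "in", "on", "of", "for", "to",
               "a", "an", "the"}


def _capitalize(word):
    """Capitalize the first letter; leave words such as 'XML' or 'OKs' untouched."""
    if any(ch.isupper() for ch in word):
        return word
    return word[:1].upper() + word[1:]


def title_case(title):
    """Apply rules (e)-(g): the title and subtitle are separated by a colon."""
    parts = []
    for segment in title.split(":"):
        words = segment.split()
        fixed = []
        for i, word in enumerate(words):
            # Minor words stay lower case unless they begin the title or subtitle.
            if i > 0 and word.lower() in MINOR_WORDS:
                fixed.append(word.lower())
            else:
                fixed.append(_capitalize(word))
        parts.append(" ".join(fixed))
    return ": ".join(parts)


print(title_case("flash and XML: a developer's guide"))
# Flash and XML: A Developer's Guide
```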

3. Place of publication - for books only
a) DO NOT use the name of a country, state, province, or county as a Place of Publication, e.g. do not list
India, Australia, Canada, United Kingdom, Great Britain, United States of America, California, or
Maharashtra as a place of publication.
b) Use only the name of a city or a town.
c) Choose the first city or town listed if more than one Place of Publication is indicated in the book.
d) It is not necessary to indicate the Place of Publication when citing articles from major encyclopaedias,
magazines, journals, or newspapers.
e) If the city is well known, it is not necessary to add the State or Province after it, e.g.:
New Delhi:
New York:
f) If the city or town is not well known, or if there is a chance that the name of the city or town may create
confusion, add the abbreviated letters for State, Province, or Territory after it for clarification. Example:
Amravati, MS
Hyderabad, AP
Austin, TX:
g) Use "n.p." to indicate that no place of publication is given.

4. Publisher - for books only
a) Be sure to write down the Publisher, NOT the Printer.
b) If a book has more than one publisher, not one publisher with multiple places of publication, list the
publishers in the order given each with its corresponding year of publication, e.g.: Conrad, Joseph. Lord
Jim. 1920. New York: Doubleday; New York: Signet, 1981.
c) Shorten the Publisher's name, e.g. use Macmillan, not Macmillan Publishing Co., Inc.
d) No need to indicate Publisher for encyclopaedias, magazines, journals, and newspapers.
e) If you cannot find the name of the publisher anywhere in the book, use "n.p." to indicate there is no
publisher listed.

5. Date of publication
a) For a book, use the copyright year as the date of publication, e.g.: 2003, not ©2003 or Copyright 2003,
i.e. do not draw the symbol for copyright or add the word Copyright in front of the year.
b) For a monthly or quarterly publication use month and year, or season and year. Spell out the months
May, June, and July; for all other months, which have five or more letters, use abbreviations: Jan.,
Feb., Mar., Apr., Aug., Sept., Oct., Nov., and Dec. Note that no additional period follows the month:
the period after Jan., for instance, marks the abbreviation of January only. If no months are stated,
use Spring, Summer, Fall, Winter, etc. as given, e.g.:
Alternatives Journal Spring 2004.
Classroom Connect Dec. 2003/Jan. 2004.
Discover July 2003.
Scientific American Apr. 2004.
c) For a weekly or daily publication use date, month, and year, e.g.:
Newsweek 11 Aug. 2003.
d) Use the most recent Copyright year if two or more years are listed, e.g., 1988, 1990, 2004. Use 2004.
e) Do not confuse Date of Publication with Date of Printing, e.g., 7th Printing 2004, or Reprinted in 2004.
These are not publication dates.
f) If you cannot find a publication date anywhere in the book, use "n.d." to indicate there is "No Date"
listed for this publication.
g) If there is no publication date, but you are able to find out from reliable sources the approximate date
of publication, use [c. 2004] for circa 2004, or use [2003?]. Always use square brackets [ ] to indicate
information that is not given but is supplied by you.
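
The month and date conventions in (b) and (c) can be sketched in Python. The function name `publication_date` and its `frequency` parameter are assumptions made for this illustration; seasons, double issues such as "Dec. 2003/Jan. 2004", and bracketed or missing dates are not covered.

```python
from datetime import date

# May, June, and July are spelled out; all other months are abbreviated.
MONTHS = ["Jan.", "Feb.", "Mar.", "Apr.", "May", "June",
          "July", "Aug.", "Sept.", "Oct.", "Nov.", "Dec."]


def publication_date(d, frequency="monthly"):
    """Format a date for a periodical citation.

    frequency: "monthly" (month and year) or "weekly"/"daily"
    (day, month, and year).
    """
    month = MONTHS[d.month - 1]
    if frequency in ("weekly", "daily"):
        return f"{d.day} {month} {d.year}"
    return f"{month} {d.year}"


print(publication_date(date(2003, 8, 11), "weekly"))  # 11 Aug. 2003
print(publication_date(date(2003, 7, 1)))             # July 2003
```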

6. Page number(s)
a) Page numbers are not needed for a book, unless the citation comes from an article or essay in an
anthology, i.e. a collection of works by different authors.
Example of a work in an anthology (page numbers are for the entire essay or piece of work):
Fish, Barry, and Les Kotzer. "Legals for Life." Death and Taxes: Beating One of the Two Certainties in
Life. Ed. Jerry White. Toronto: Warwick, 1998. 32-56.
b) If there is no page number given, use "n. pag."
(Works Cited example)
Schulz, Charles M. The Meditations of Linus. N.p.: Hallmark, 1967.
(Footnote or Endnote example)

Charles M. Schulz, The Meditations of Linus (N.p.: Hallmark, 1967) n. pag.
c) To cite a source with no author, no editor, no place of publication or publisher stated, no year of
publication, but you know where the book was published, follow this example:
Full View of Temples of Taiwan - Tracks of Pilgrims. [Taipei]: n.p., n.d.
d) Frequently, page numbers are not printed on some pages in magazines and journals. Where page
numbers may be counted or guessed accurately, count the pages and indicate the page number or


Presentation of research report
The research report should be typed following the requirements detailed below:
1. Use Executive bond A-4 size paper, type on one side of the paper only.
2. Use 1.5 spacing
3. Include margins: left-hand 3.8 cm (1 1/2 inches)
Right-hand 2.5 cm (1 inch)
4. Paragraphs should not be indented.
5. Pages should be numbered.
6. Tables should be numbered
7. Figures (e.g. diagrams and graphs) should be treated in a similar way to tables but should be
numbered "Figure 2" etc
8. Headings: Section Heading : upper case (e.g. INTRODUCTION), Subsection Heading: lower case
underlined, numbered 1.1, 1.2 etc indented to start of lettering on main heading
1. INTRODUCTION: Technological advances have opened many doors in education.....
1.1 The model presented: In the final year the occupational therapy course is being
1.2 The task: A tutorial workbook.....
1.2.1 Using the programs: The programs designed are very varied....

9. Length of project: The project should be approximately 15,000 - 22,000 words (For project at Post
graduate level)
10. Submitted copies of the project should be hard-bound volume only.
11. If you wish to acknowledge any individual's contribution to the project, this should be stated on a
separate acknowledgement page.
12. Your project should contain a list of contents which states the page number of each section of the
13. Appendices should not be considered part of the project report (for example, raw data could be
included in this way). Appendices should be placed at the very end of the project and referred to in the
contents section.

Review questions
1. What are the preconditions for drawing
better conclusions?
2. State any five precautionary steps to be
taken before interpretation.
3. What is meant by interpretation of statistical
data? What precautions should be taken
while interpreting the data?
4. What do you understand by interpretation
of data? Illustrate the types of mistakes
which frequently occur in interpretation.
5. Explain the need, meaning and essentials of
6. What is reporting? What are the different
stages in the preparation of a report?
7. What is a report? What are the
characteristics/qualities of a good report?
8. Briefly describe the structure of a report.
9. What are the various aspects that have to be
checked before going to final typing?
10. What are the points to be kept in mind in
revising the draft report?
11. Give a brief note on the prefatory items.
12. What are the various items that will find a
place in the text / body of the report?
13. Describe briefly how a research report
should be presented.
14. Describe the considerations and steps
involved in planning a report writing work.
15. Write short notes on:
a) Characteristics of a good report.
b) Research article
c) Sources of data
d) Chapter plan

Chapter 10: Research in various Functional Areas
Through research, an executive can quickly get a synopsis of the current scenario, which improves his
information base for making sound decisions affecting future operations of the enterprise. The following
are the major areas in which research plays a key role in making effective decisions.
There are many topics that benefit from research. Some major topics are: general business, economic,
and corporate research; financial and accounting research; management and organizational research;
sales and marketing research; information systems research; and corporate responsibility research.
A few of the important areas above are covered in detail below:

1. Marketing
Marketing research is undertaken to assist the marketing function. Marketing research stimulates the
flow of marketing data from the consumer and his environment to marketing information system of the
enterprise. Market research involves the process of
- Systematic collection
- Compilation
- Analysis
- Interpretation of relevant data for marketing decisions
This information goes to the executive in the form of data. On the basis of this data the executive develops
plans and programmes. Advertising research, packaging research, performance evaluation research,
sales analysis, distribution channel, etc., may also be considered in management research. Research tools
are applied effectively for studies involving:
1. Demand forecasting
2. Consumer buying behaviour
3. Measuring advertising effectiveness
4. Media selection for advertising
5. Test marketing
6. Product positioning
7. Product potential

Marketing Research
i. Product Research: Assessment of suitability of goods with respect to design and price.
ii. Market Characteristics Research (Qualitative): Who uses the product? Relationship between buyer
and user, buying motive, how a product is used, analysis of consumption rates, units in which product is
purchased, customs and habits affecting the use of a product, consumer attitudes, shopping habits of
consumers, brand loyalty, research of special consumer groups, survey of local markets, basic economic
analysis of the consumer market, etc.
iii. Size of Market (Quantitative): Market potential, total sales quota, territorial sales quota, quota for
individuals, concentration of sales and advertising efforts; appraisal of efficiency, etc.
iv. Competitive position and Trends Research
v. Sales Research: Analysis of sales records.
vi. Distribution Research: Channels of distribution, distribution costs.
vii. Advertising and Promotion Research: Testing and evaluating, advertising and promotion
viii. New product launching and Product Positioning.

2. Production
In the field of production, research helps an enterprise to decide:
- What to produce
- How much to produce
- When to produce
- For whom to produce
Some of the areas where research can be applied are:
- Product development
- Cost reduction
- Work simplification
- Profitability improvement
- Inventory control

The materials department uses research to frame suitable policies regarding:
- Where to buy
- How much to buy
- When to buy
- At what price to buy

3. Human Resource Development
The Human Resource Development department uses research to study wage
rates, incentive schemes, cost of living, employee turnover rates, employment trends, and performance
appraisal. It also uses research effectively for its most important activity, namely manpower planning.

4. Solving Various Operational and Planning Problems of Business and Industry
Various types of research, e.g., market research, operations research and motivational research, when
combined together, help in solving various complex problems of business and industry in a number of
ways. These techniques help in replacing intuitive business decisions by more logical and scientific
i. Government and Economic System
Research helps a decision maker in a number of ways, e.g., it can help in examining the consequences of
each alternative and help in bringing out the effect on economic conditions. Various examples can be
quoted, such as problems of big and small industries due to various factors: upgradation of technology
and its impact on labour and supervisory deployment, effect of the government's liberal policy, WTO and its
new guidance, ISO 9000/14000 standards and their impact on our exports, allocation of national resources
on national priority basis, etc. Research lays the foundation for all Government Policies in our economic
We are all aware that research is applied in preparing the union finance budget and railway
budget every year. Government also uses research for economic planning and optimum utilization of
resources for the development of the country. Research is needed for systematic collection of information
on the economic and social structure of the country. Such information indicates what is
happening to the national economy and what changes are taking place.
ii. Social Relationships
Research in social sciences is concerned with both knowledge for its own sake and knowledge for helping in
solving immediate problems of human relations. It is a sort of formal training, which helps an individual
in a better way, e.g.
- It helps professionals to earn their livelihood
- It helps students to know how to write and report various findings.
- It helps philosophers and thinkers in their new thinking and ideas.
- It helps in developing new styles for creative work.
- It may help researchers, in general, to generalize new theories.


Frequently asked questions (FAQs) on Ph.D
[ University directions and rules on Ph.D made easy ]

1. Do I need to clear the Research
aptitude Test/Entrance Test for enrolling
myself as Ph.D student?
Yes, UGC has made it mandatory to clear the
entrance test for Ph.D enrolment. The format
and the pattern of test may vary from one
University to other University.
- Check the website of the respective
University for more details.

2. Who shall be exempted from Ph.D.
entrance test ?
The candidates fulfilling one of the following
conditions shall be exempted from PET.
(i) Qualified in GATE/SET/NET/JRF
examination of the apex bodies such as IIT/
(ii) Candidates holding M.Phil. degree in the
concerned subject from any Statutory
(iii) Full time teacher of any statutory University
or full time approved teacher in an affiliated
college of any statutory University with
minimum 7 years of teaching experience.
(iv) Scientists/ Officers working in Government
organizations, National laboratories and
research institutions having 7 years research/
professional experience,
The Ph.D. registration form shall be submitted
by the candidates exempted from PET with
relevant supporting documents, to the Head,
Place of Research.

3. How do I register for the online
entrance test?
You can register for online entrance test by
logging on to the website of the University and
after filling the form, submitting the hardcopy of
the same to the university along with relevant
documents [e.g. draft for entrance fees specified
by the University, mark sheet & degree
certificate of your PG, caste certificate (if
applicable), etc.]

4. What will be the pattern of the
Entrance test?
Two Solved Model question Papers along with
explanation are given from page no 216 in this
text book for your ready reference and two
additional Solved Model question Papers are
also written in the CD enclosed.

5. Who are eligible for Ph.D. programme ?
a) Persons should have valid score in Ph.D.
Entrance Test (PET) as prescribed in the rules.
(Candidates who score 50% and more shall be
declared as successful) and
b) Persons having passed Post Graduate Degree
(Masters Degree) Examination with at least 50%
marks or equivalent Grade Point Average (GPA)
of this university or any other examination
recognized as equivalent thereto.
Provided that, relaxation of 5% marks shall be
provided in case of (a) & (b) above for the
candidates belonging to reserved category in the
State of Maharashtra.
c) Persons working in National Laboratories/
Institutes /Government / Private organizations
nominated/sponsored by the respective
employers. Such persons should have a Post
Graduate Degree and should be holding rank of
Assistant Director or above. The candidates who
have obtained Masters degree of any statutory
Indian University but working outside India
shall be included in this category,
d) Persons with exceptional research abilities/
contribution to be judged by Research and
Recognition Committee who have passed
Graduate Degree Examination with 50% of
marks and with 15 years experience after
graduation in related fields.
e) The fellow members of the Institute of
Chartered Accountants and/ or Institute of Cost
and Works Accountants and/ or having
qualification of C.S. shall be held eligible for
registration for Ph.D. in the subject in the
concerned Board of Studies in the faculty of
Commerce provided that they possess a
Bachelors Degree of any statutory University.
Such candidate should have at least 5 years of
professional experience.
f) A Graduate in any faculty who has developed
important new techniques (new for the country)
or designed and fabricated special instruments
or apparatus which are deemed by competent
judge to be a valuable contribution to
Engineering/Pharmacy field may be permitted
by the Research and Recognition Committee of
concerned faculty. Such a candidate must have
at least five years of experience after obtaining
Bachelors degree in the concerned faculty.

6. Is there any age bar for taking Ph.D
entrance examination?
There is no maximum or minimum age bar for
doing Ph.D. The basic eligibility criteria is a
TWO Master Degrees with at least 50% marks or
equivalent Grade Point Average (GPA) of this
University or any other examination recognized
as equivalent thereto in respective faculty.
There shall be relaxation of 5% for reserved
category candidates in the state of Maharashtra.

7. How many marks do I need to score to
clear the Ph.D entrance exam?
You need to score 50% marks to clear the exam.
There shall be relaxation of 5% for reserved
category candidates in the state of Maharashtra.
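
The pass rule above can be illustrated with a short Python sketch. The function name `clears_pet` is hypothetical, and the sketch assumes that the 5% relaxation means a qualifying mark of 45% for reserved category candidates; check the University rules for the exact interpretation.

```python
def clears_pet(marks_obtained, max_marks, reserved_category=False):
    """Return True if the candidate meets the qualifying percentage.

    Assumption: the 5% relaxation lowers the 50% requirement to 45%.
    """
    required = 45.0 if reserved_category else 50.0
    percentage = 100.0 * marks_obtained / max_marks
    return percentage >= required


print(clears_pet(47, 100))                          # False
print(clears_pet(47, 100, reserved_category=True))  # True
```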

8. What if I completed my post-graduation at
another University?
If you completed your post-graduation at another
University, then you need to get an Eligibility
Certificate from RTMNU Nagpur University
and a Migration Certificate from your home
University before you are finally enrolled for

9. Once I clear my entrance , what shall I
The eligible candidate who is declared to be
successful in the PET or the candidate who is
exempted from PET shall approach the Place of
Research where he/she intends to do the
research work. On the basis of number of seats
available with the approved Ph.D. Guides, the
available specialization among the Ph.D. Guides
and the research interest of the candidate, the
guide shall be allotted by Head of the Place of
Research to the candidate in consultation with
the guide and student in formal way.
While granting admission to candidates for
Ph.D. programme, due attention shall be paid to
the State Reservation Policy.

10. How long will my Ph.D entrance exam
result be valid?
The result of PET shall be valid for a period of
12 months from the date of holding of entrance
examination. The candidate who has been
declared successful shall be eligible to
submit application(s) for registration within the
period of 12 months. However, after expiry of
period of 12 months , the candidate shall be
required to appear for PET afresh if he fails to
submit application or if the application for
registration is not approved by Research and
Recognition Committee.

11. Will I get a suitable supervisor of my
Normally a candidate shall be required to
complete his/her doctoral research under the
supervision of allotted (original) approved
guide. However, the Research & Recognition
Committee concerned may allow change of
guide on the production of a No Objection
Certificate from the original guide and an
acceptance letter from the new guide. In case of
such a change, the candidate shall work for a
minimum period of one calendar year under the
new guide before he/she submits the thesis. The
requirement of No Objection Certificate shall
not be necessary if the candidate justifies the
non-availability of his original guide. The
justification will have to be endorsed by the
Head, place of research.
Provided further that in specific cases Co-
guide/ second Supervisor shall also be
permitted for justified reasons. However, Guide
and Co-guide shall not be from the same

12. What happens after I get my supervisor?
(1) Every registered candidate shall submit to
the Controller of Examinations of the University
through the Head, place of research and the
guide the progress report of his/her research
after every six months. If a candidate fails to
submit three reports consecutively, his/her
registration may be cancelled by Research and
Recognition Committee on recommendation of
guide and of Head of place of Research.
(2) The Head, Place of Research after the
completion of the given period (one and a half
years) shall send to the University office within
15 days a report on the noncompliance of the

13. What is a synopsis and how do I
prepare it? Are there any specific
guidelines for it?
No, there are no specific guidelines for the same.
Refer to the enclosed CD for the general format.

14. How many copies of the synopsis are
to be submitted to the University?
The applicant shall submit, along with the
application, eight (8) copies of the synopsis of the
proposed research work to the University.

15. Do I need to register myself with a
Research Institute before submission of
my synopsis?
Yes, you are required to register yourself with a
Research Institute before submission of your

16. Is it compulsory for me to secure
admission at a Research Institute for
carrying out my research?
Yes. Without the endorsement of the
head of the Research Institute, you cannot carry
out your research.

17. What is the last date for securing
admission in research cell?
The entrance examination is usually conducted
twice a year, tentatively on 15th July & 15th
January. So the admissions can be secured after
you clear the entrance test.

18. What facilities are available in the
Research Institute to conduct research?
The Institute provides a well-stocked library with
the latest books, journals & other secondary data for
the use of the scholar. The Research Institute
usually has a well-equipped computer laboratory
with broadband internet connection for the use
of research scholars.

19. What about COURSE WORK for
(a) The course work is compulsory and it is
treated as pre Ph.D. preparation. The course
work must include topics on research
methodology , quantitative methods of
computer application , seminars, review of
published research work in the relevant field.
(b) The evaluation of course work is done by the
concerned Guide. Completion report of the
Course work shall be submitted by Guide to
Head of the Place of Research in duplicate. Copy
of completion report shall be thereafter
forwarded by Head of the Place of Research to
the University Ph.D. Cell.
(c) If found necessary by guide with consent of
Head of the Place of Research, course work may
be carried out by the candidates in sister
departments/institutes either within or outside
the University. In such a case, completion report of
the course work shall be submitted by the Head
of the concerned sister department/institute to
the guide who shall forward it to the Head of
the Place of Research. Copy of the completion
report shall be thereafter forwarded by the Head
of Place of Research to University Ph.D. Cell.

20. How do I choose my research topic?
Ph.D students can choose a research topic in their
area of interest under the supervision and
guidance of a suitable supervisor of their
faculty. The research cell helps the students to
narrow down these areas of interest and
formulate a well-designed research topic.

21. What is the procedure for taking
admission in the research cell of the
The student should either be a registered scholar
with RTM Nagpur University or should have
passed the Ph.D entrance examination
conducted by the University. The student then
has to buy the prospectus from the Institute,
pay the required fees for enrolment, and
secure admission. (Kindly refer to
question no. 5.)

22. What is the Progress Report and when
it is to be submitted?
It is a report in which the researcher shows
the progress of his/her research work. It has to
be submitted every 6 months, in the prescribed
format, along with the retention fees. (Format
enclosed in CD)

23. What are some of the common
mistakes/ errors committed by the
The following are a few of the most common
mistakes/errors committed by researchers.
- Faulty/wrong problem definition.
- Objectives of the study that do not start
with "To", e.g. To find, To know, To analyze etc.
- Improper hypothesis formulation.

24. What is the right time for submission
of Ph.D thesis?
The researcher has to mandatorily carry out
his/her research work for a minimum period of
TWO years. After that, at any point of time once
the thesis and summary are ready, you can submit
the same to Ph.D cell. There is no fixed date for
submission of Ph.D thesis.
The summary and thesis shall be processed for
the final viva-voce/open defense examination by
the University through meetings of the RRC
(Research & Recognition Committee).

25. What shall be my tenure of
registration and When do I submit my
The registration of the candidate shall be valid
and shall remain in force for a period of 5 years
from the date of registration and shall stand
cancelled automatically on expiry of 5 years.
Provided that an extension of up to a maximum period
of 12 months shall be permissible in those cases
which are recommended by the Guide and the
Head of the Place of Research, and the decision
on the extension shall be taken by the Research and
Recognition Committee. The application for
extension is required to be submitted at least 3
months prior to the date of expiry of
registration. After expiry of the extended period of
registration the candidate shall be required to
apply for registration afresh following the
de novo procedure.
(1) The submission of summary of the thesis
may be permitted only after completion of
twenty two months from the date of
registration. The summary should contain
introduction, chapter wise brief account of the
work done and overall conclusions. Ph.D.
candidate has to publish one research paper in a
standard refereed journal/ monograph before
the submission of the thesis for adjudication,
and produce evidence for the same in the form
of acceptance letter or the reprint. The list of the
reputed journals in the subject shall be prepared
and maintained by the respective Research and
Recognition Committee.
(2) The thesis can be submitted after two months
from the date of submission of summary. At
least three months before the date of submission
of the thesis each candidate shall give a pre-
submission seminar to be arranged by the Head
of the place of research on the request of the
candidate duly endorsed by the guide. The
relevant suggestions, if any, given by other
research scholars, other research guides and the
Head of the Place of Research or his/her nominee
present at such a seminar may be considered
while preparing the final draft of the thesis.
(3) On the basis of discussions and suggestions
made in the pre submission seminar the
candidate shall submit to the Controller of
Examinations ten copies of the summary of
his/her thesis through his/her guide within one
month from the date of seminar. (The guide may
suggest list of referees to the Research and
Recognition Committee.)
(4) The candidate shall be allowed to submit
his/her thesis after the completion of a period of
two months and before six months from the date
of submission of the summary, failing which the
candidate will have to pay the prescribed fine to
be decided by the University from time to time for
late submission. Late submission of thesis shall
be allowed up to the completion of one year
from the date of submission of the summary or
till the expiry of the registration period,
whichever is earlier.

26. What should be the colour of the
cover page?
The colour of cover page of thesis should be
black in Hardbound volume.

27. What are the specifications for
submission of final thesis?
The final thesis shall be presented in accordance
with the following specifications:
(a) The paper used for printing shall be A4
size paper.
(b) Printing shall be in a standardized form on
one side of the paper and with a minimum
of one-and-a-half line spacing.
(c) A margin of one and a half inches shall be
on the left hand side.
(d) The title of the thesis, name of the
candidate, degree, name of the guide,
place of research, the month and year of
submission shall be printed on the title
page and front cover.
(e) Side cover (Spine) should mention Ph.D
thesis on the top, name of the candidate
and month and year.
(f) There is no binding on the use of executive
bond paper.
(g) All the pages should be properly

28. What is the ideal font size and font
type for the contents of Ph.D?
There are no specified guidelines as such,
but generally Times New Roman with a font size
of 12 and a line spacing of 1.5, or Book Antiqua
with a font size of 11 and a line spacing of 1.5,
is followed.

29. Is there any thumb rule on the
number of pages of Ph.D thesis?
No, but the coverage of the topic should be
adequate, self-explanatory and must justify your
research for the award of the said degree.

30. Is the Certificate and Declaration
format specified by the University?
Yes, RTMNU has a prescribed format for the
same. A soft copy of the same is available in the
CD enclosed.

31. How many copies of summary the
researcher has to submit for final
TEN(10) copies of summary shall be submitted
to the Controller of Examination.

32. What are the documents that are
needed at the time of final submission of
(a) Ten (10) copies of the summary & five (5) copies
of the thesis, along with one soft copy on CD.
(b) Receipt of submission fees
(c) Photocopy of receipts of retention fees.
(d) Photocopy of progress reports.
(e) Photocopy of Ph.D. Registration Certificate.
(f) NOC of University library.
(g) Photocopy of PG mark sheet and degree

33. What is the fee for thesis submission
and how many copies has to be submitted
during final submission?
The fee for thesis submission may vary from one
University to another. Five (5) copies of the
thesis have to be submitted during final
submission, along with a soft copy (CD),
through the research guide and Head of Place of

34. After submission of the summary and
thesis, when will be the PhD awarded to
Once you successfully pass the open-defense
examination, the University usually issues a
notification to that effect within a fortnight.
You will get your PhD degree at the
Convocation function organized by the

35. As an approved teacher for UG and PG
by profession, what benefits shall I get
once I am awarded PhD?
As per 6th Pay Commission you are entitled to
get three increments in your salary.

N.B:- All the answers provided to questions in
the FAQ on Ph.D are generalized in nature; they are
not for any specific University and are based on the
directions issued by various Universities from
time to time. These directions may change based
on the norms and acts set by UGC and other
Govt. agencies.
Candidates are advised to refer to the respective
University website for the latest directions and
norms. The authors do not take any legal or
other responsibility for changes in the same.
For more details about the RTMNU PET (Ph.D
Entrance Test) and the Directions regarding the
same, refer to the soft copy enclosed in the CD.

Research Aptitude Test: Examination Pattern
In most Universities the Research Aptitude Examination usually consists of:
Paper-I :- Research Aptitude Test and Paper-II :- Subject specific Test
Some Universities (e.g. RTMNU & SGBAU) use only the Research Aptitude Test for qualifying
the Ph.D entrance test. This is because they offer a large number of Ph.D options in
different faculties compared to other Universities.

Paper-I :- Research Aptitude Test
Paper-I consists of 4 Parts : Time : 90 minutes Total Marks : 100
Part-1 :- Analytical Reasoning
Part-2 :- Numerical Ability
Part-3 :- Language Competency/ Computer/ Environment/ Logical Reasoning
Part-4 :- Data Interpretation
Minimum marks required to clear Paper-I : a) OPEN Category: 50% b) Reserved Category: 45%

Paper-II : - Subject specific Test Total Marks : 100
Paper-II consists of 2 Parts :
a) Multiple Choice Question : 20 Marks b) Theoretical/Descriptive Questions : 80 Marks

Online Ph.D Entrance Test (PET) : Sample Test Paper I(Solved with explanation)

Time: 90 minutes] [Max Marks: 100

N.B:-
(a) There are in all 100 Multiple Choice Questions.
(b) Each correct answer carries 1 Mark.
(c) There is no negative marking.
(d) Click on the correct option for each question online.
(e) Use of an electronic/scientific calculator is not allowed.
(f) The Multiple Choice Questions are divided into four parts:
1. Analytical Reasoning
2. Numerical Ability
3. Language Competency/ Computer/ Environment/ Logical Reasoning
4. Data Interpretation

Part-1 :- Analytical Reasoning

Logical Sequence of Words

In each of the following questions, arrange the given words in a meaningful sequence and thus find
the correct answer from the alternatives.
1. Arrange the words given below in a meaningful sequence.
1. Key 2. Door 3. Lock 4. Room 5. Switch on

A.5, 1, 2, 4, 3 B. 4, 2, 1, 5, 3
C. 1, 3, 2, 4, 5 D. 1, 2, 3, 5, 4

Answer & Explanation
Answer: Option C
The correct order is :
Key Lock Door Room Switch on
1 3 2 4 5

2. Arrange the words given below in a meaningful sequence.
1. Word 2. Paragraph 3. Sentence 4. Letters 5. Phrase

A.4, 1, 5, 2, 3 B. 4, 1, 3, 5, 2
C. 4, 2, 5, 1, 3 D. 4, 1, 5, 3, 2

Answer & Explanation
Answer: Option D
The correct order is :
Letters Word Phrase Sentence Paragraph
4 1 5 3 2

3. Arrange the words given below in a meaningful sequence.
1. Police 2. Punishment 3. Crime 4. Judge 5. Judgement

A.3, 1, 2, 4, 5 B. 1, 2, 4, 3, 5
C. 5, 4, 3, 2, 1 D. 3, 1, 4, 5, 2

Answer & Explanation
Answer: Option D
The correct order is :
Crime Police Judge Judgement Punishment
3 1 4 5 2

4. Arrange the words given below in a meaningful sequence.
1. Family 2. Community 3. Member 4. Locality 5. Country

A.3, 1, 2, 4, 5 B. 3, 1, 2, 5, 4
C. 3, 1, 4, 2, 5 D. 3, 1, 4, 5, 2

Answer & Explanation
Answer: Option A
The correct order is :
Member Family Community Locality Country
3 1 2 4 5

5. Arrange the words given below in a meaningful sequence.
1. Poverty 2. Population 3. Death 4. Unemployment 5. Disease

A.2, 3, 4, 5, 1 B. 3, 4, 2, 5, 1
C. 2, 4, 1, 5, 3 D. 1, 2, 3, 4, 5
Answer & Explanation
Answer: Option C
The correct order is :
Population Unemployment Poverty Disease Death
2 4 1 5 3

Seating Arrangement

6. A, P, R, X, S and Z are sitting in a row. S and Z are in the centre. A and P are at the ends. R is sitting
to the left of A. Who is to the right of P ?
A.A B. X
C. S D. Z
Answer & Explanation
Answer: Option B
The seating arrangement is as follows:
P X S Z R A (S and Z may be interchanged)
Therefore, right of P is X.

7. There are 8 houses in a line and in each house only one boy lives, with the conditions as given
below:
1. Jack is not the neighbour of Simon.
2. Harry is just next to the left of Larry.
3. There is at least one to the left of Larry.
4. Paul lives in one of the two houses in the middle.
5. Mike lives in between Paul and Larry.
If at least one lives to the right of Robert and Harry is not between Taud and Larry, then which one of
the following statements is not correct ?
A.Robert is not at the left end.
B. Robert is in between Simon and Taud.
C. Taud is in between Paul and Jack.
D. There are three persons to the right of Paul.

Answer: Option C

8. A, B, C, D and E are sitting on a bench. A is sitting next to B, C is sitting next to D, D is not sitting
with E who is on the left end of the bench. C is on the second position from the right. A is to the right
of B and E. A and C are sitting together. In which position is A sitting ?
A.Between B and D B. Between B and C
C. Between E and D D. Between C and E

Answer: Option B

The seating arrangement from left to right is E, B, A, C, D. Therefore, A is sitting in between B and C.

9. Six friends P, Q, R, S, T and U are sitting around a hexagonal table, each at one corner, and are
facing the centre of the hexagon. P is second to the left of U. Q is a neighbour of R and S. T is second
to the left of S.
1. Which one is sitting opposite to P ?
A.R B. Q
C. T D. S
Answer & Explanation
Answer: Option D

S is sitting opposite to P.

10.Who is the fourth person to the left of Q ?
A.P B. U
C. R D. Data inadequate
Answer & Explanation
Answer: Option A

P is the fourth person to the left of Q.
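
The arrangement for questions 9 and 10 can also be checked by brute force. The sketch below is only an illustration: it assumes the seats are numbered 0 to 5 clockwise and that, for people facing the centre, "left" means the next seat clockwise. The helper names solve_hexagon and person_at are hypothetical and not part of the original solution.

```python
from itertools import permutations

def solve_hexagon():
    """Return every seating (tuple indexed by seat 0..5) that satisfies the clues."""
    solutions = []
    for order in permutations("PQRSTU"):
        if order[0] != "U":  # pin U to seat 0 so rotations are not counted twice
            continue
        seat = {person: i for i, person in enumerate(order)}

        def left_of(person, steps):
            # Seat reached by moving `steps` places to the person's left (assumed clockwise).
            return (seat[person] + steps) % 6

        def neighbours(a, b):
            return (seat[a] - seat[b]) % 6 in (1, 5)

        if (seat["P"] == left_of("U", 2)
                and neighbours("Q", "R") and neighbours("Q", "S")
                and seat["T"] == left_of("S", 2)):
            solutions.append(order)
    return solutions

def person_at(order, person, steps):
    """Person sitting `steps` seats to the left of `person`."""
    return order[(order.index(person) + steps) % 6]

solutions = solve_hexagon()
for order in solutions:
    # Opposite seat is 3 places away; question 10 asks for 4 places to the left of Q.
    print(order, person_at(order, "P", 3), person_at(order, "Q", 4))
```

Under these assumptions the search leaves a single arrangement, with S opposite P and P fourth to the left of Q, in agreement with the answers given for questions 9 and 10.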

Verification of Truth

11. A train always has
A.Rails B. Driver
C. Guard D. Engine

Answer: Option D

12. Which one of the following is always found in 'Bravery'?
A.Experience B. Power
C. Courage D. Knowledge

Answer: Option C

13. A song always has
A.Word B. Chorus
C. Musician D. Tymbal

Answer: Option A

14. Yesterday I saw an ice cube which had already melted due to heat of a nearby furnace.
A.Always B. Never
C. Often D.Sometimes
Answer & Explanation
Answer: Option B
Since the ice cube had already melted due to the heat of a nearby furnace, it could no longer remain
an ice cube and so could not have been seen as one.

15. What is found necessarily in milk?
A.Cream B. Curd
C. Water D. Whiteness

Answer: Option D

Cause and Effect

In each of the questions below, two statements I and II are given. These statements may be either
independent causes or may be effects of independent causes or of a common cause. One of these
statements may be the effect of the other statement. Read both the statements and decide which of
the following answer choices correctly depicts the relationship between these two statements.
Mark answer
(A) If statement I is the cause and statement II is its effect.
(B) If statement II is the cause and statement I is its effect.
(C) If both the statements I and II are independent causes.
(D) If both the statements I and II are effects of independent causes.
(E) If both the statements I and II are effects of some common cause.

16. Statements:
1. Standard of living among the middle class society has been constantly going up for the past few years.
2. Indian Economy is observing remarkable growth.
A.Statement I is the cause and statement II is its effect.
B. Statement II is the cause and statement I is its effect.
C. Both the statements I and II are independent causes.
D.Both the statements I and II are effects of independent causes.
E. Both the statements I and II are effects of some common cause.
Answer & Explanation
Answer: Option A
Since the standard of living among the middle class society is constantly going up so Indian Economy is
observing remarkable growth.

17. Statements:
1. The meteorological Department has issued a statement mentioning deficient rainfall during
monsoon in many parts of the country.
2. The Government has lowered the revised estimated GDP growth from the level of earlier estimates.
A.Statement I is the cause and statement II is its effect.
B. Statement II is the cause and statement I is its effect.
C. Both the statements I and II are independent causes.
D. Both the statements I and II are effects of independent causes.
E. Both the statements I and II are effects of some common cause.
Answer & Explanation
Answer: Option D
Both the statements I and II are effects of independent causes.

18. Statements:
1. The staff of Airport Authorities called off the strike they were observing in protest against
2. The staff of Airport Authorities went on strike anticipating a threat to their jobs.
A.Statement I is the cause and statement II is its effect.
B. Statement II is the cause and statement I is its effect.
C. Both the statements I and II are independent causes.
D. Both the statements I and II are effects of independent causes.
E. Both the statements I and II are effects of some common cause.
Answer & Explanation
Answer: Option D
Both the statements I and II are effects of independent causes.

19. Statements:
1. A huge truck overturned on the middle of the road last night.
2. The police had cordoned off the entire area in the locality this morning for half of the day.
A.Statement I is the cause and statement II is its effect.
B. Statement II is the cause and statement I is its effect.
C. Both the statements I and II are independent causes.
D.Both the statements I and II are effects of independent causes.
E. Both the statements I and II are effects of some common cause.
Answer & Explanation
Answer: Option A
Since a huge truck overturned on the middle of the road last night, the police had cordoned off the
entire area in the locality this morning for half of the day.

20. Statements:
1. Importance of Yoga and exercise is being realized by all sections of the society.
2. There is an increasing awareness about health in the society, particularly among the middle-aged group
of people.
A.Statement I is the cause and statement II is its effect.
B. Statement II is the cause and statement I is its effect.
C. Both the statements I and II are independent causes.
D. Both the statements I and II are effects of independent causes.
E. Both the statements I and II are effects of some common cause.
Answer & Explanation
Answer: Option B
As the awareness about health in the society is increasing particularly among middle-aged group of
people, the importance of Yoga and exercise is being realized by all sections of the society.

Data Sufficiency

Each of the questions below consists of a question and two statements numbered I and II given
below it. You have to decide whether the data provided in the statements are sufficient to answer the
question. Read both the statements and
Give answer
(A) If the data in statement I alone are sufficient to answer the question, while the data in statement II
alone are not sufficient to answer the question
(B) If the data in statement II alone are sufficient to answer the question, while the data in statement I
alone are not sufficient to answer the question
(C) If the data either in statement I alone or in statement II alone are sufficient to answer the question
(D) If the data given in both statements I and II together are not sufficient to answer the question and
(E) If the data in both statements I and II together are necessary to answer the question.

21. Question: In which year was Rahul born ?
1. Rahul at present is 25 years younger than his mother.
2. Rahul's brother, who was born in 1964, is 35 years younger than his mother.
A.I alone is sufficient while II alone is not sufficient
B. II alone is sufficient while I alone is not sufficient
C. Either I or II is sufficient
D. Neither I nor II is sufficient
E. Both I and II are sufficient
Answer & Explanation
Answer: Option E
From both I and II, we find that Rahul is (35 - 25) = 10 years older than his brother, who was born in 1964.
So, Rahul was born in 1954.

22. Question: What will be the total weight of 10 poles, each of the same weight ?
1. One-fourth of the weight of each pole is 5 kg.
2. The total weight of three poles is 20 kilograms more than the total weight of two poles.
A.I alone is sufficient while II alone is not sufficient
B. II alone is sufficient while I alone is not sufficient
C. Either I or II is sufficient
D.Neither I nor II is sufficient
E. Both I and II are sufficient
Answer & Explanation
Answer: Option C
From I, we conclude that weight of each pole = (4x5) kg = 20 kg.
So, total weight of 10 poles = (20 x 10) kg = 200 kg.
From II, we conclude that:
Weight of each pole = (weight of 3 poles) - (weight of 2 poles) = 20 kg.
So, total weight of 10 poles = (20 x 10) kg = 200 kg.

23. Question: How much was the total sale of the company ?
1. The company sold 8000 units of product A each costing Rs. 25.
2. This company has no other product line.
A.I alone is sufficient while II alone is not sufficient
B. II alone is sufficient while I alone is not sufficient
C. Either I or II is sufficient
D.Neither I nor II is sufficient
E. Both I and II are sufficient
Answer & Explanation
Answer: Option E
From I, total sale of product A = Rs. (8000 x 25) = Rs. 200000.
From II, we know that the company deals only in product A.
This implies that sale of product A is the total sale of the company, which is Rs. 200000.

24. Question: The last Sunday of March, 2006 fell on which date ?
1. The first Sunday of that month fell on 5th.
2. The last day of that month was Friday.
A.I alone is sufficient while II alone is not sufficient
B. II alone is sufficient while I alone is not sufficient
C. Either I or II is sufficient
D.Neither I nor II is sufficient
E. Both I and II are sufficient
Answer & Explanation
Answer: Option C
From I, we conclude that 5th, 12th, 19th and 26th of March, 2006 were Sundays.
So, the last Sunday fell on 26th.
From II, we conclude that 31st March, 2006 was Friday. Thus, 26th March, 2006 was the last Sunday of the month.
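
Both conclusions in question 24 can be cross-checked against the calendar. The snippet below is a small sketch using Python's standard datetime and calendar modules; the helper name sundays_in_month is an illustrative choice, not something from the original text.

```python
import calendar
from datetime import date

def sundays_in_month(year, month):
    """Return the day numbers of all Sundays in the given month."""
    days_in_month = calendar.monthrange(year, month)[1]
    # date.weekday() counts Monday as 0, so Sunday is 6.
    return [d for d in range(1, days_in_month + 1)
            if date(year, month, d).weekday() == 6]

sundays = sundays_in_month(2006, 3)
last_day_weekday = date(2006, 3, 31).weekday()  # 4 would mean Friday

print(sundays, last_day_weekday)
```

The list of Sundays corresponds to statement I and the weekday of 31st March to statement II, so either statement alone fixes the last Sunday.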

25. One morning Udai and Vishal were talking to each other face to face at a crossing. If Vishal's
shadow was exactly to the left of Udai, which direction was Udai facing?
A. East B. West
C. North D. South
Answer & Explanation
Answer: Option C

Part-2:- Numerical Ability

Problems on Ages

1. Father is aged three times more than his son Ronit. After 8 years, he would be two and a half times
of Ronit's age. After further 8 years, how many times would he be of Ronit's age?
A. 2 times B. 2
C. 2
D. 3 times
Answer & Explanation
Answer: Option A
Let Ronit's present age be x years. Then, father's present age = (x + 3x) years = 4x years.
(4x + 8) = (5/2)(x + 8)
8x + 16 = 5x + 40
3x = 24
x = 8.
Hence, required ratio = (4x + 16)/(x + 16) = 48/24 = 2.
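
The same linear equation can be solved exactly with rational arithmetic. This is only a sketch under the reading used in the solution (the father is now 4 times Ronit's age and 2.5 times after 8 years); the function name ronit_age and its parameters are hypothetical.

```python
from fractions import Fraction

def ronit_age(multiple_now=4, years_later=8, multiple_later=Fraction(5, 2)):
    """Solve m*x + t = k*(x + t) for x, i.e. x = t*(k - 1)/(m - k)."""
    return Fraction(years_later) * (multiple_later - 1) / (multiple_now - multiple_later)

x = ronit_age()
# Ratio of the ages after a further 8 years (16 years from now).
ratio_after_16 = (4 * x + 16) / (x + 16)

print(x, ratio_after_16)
```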

2. The sum of ages of 5 children born at the intervals of 3 years each is 50 years. What is the age of the
youngest child?
A. 4 years B. 8 years
C. 10 years D. None of these
Answer & Explanation
Answer: Option A
Let the ages of children be x, (x + 3), (x + 6), (x + 9) and (x + 12) years.
Then, x + (x + 3) + (x + 6) + (x + 9) + (x + 12) = 50
5x = 20
x = 4.
Age of the youngest child = x = 4 years.

3. A father said to his son, "I was as old as you are at the present at the time of your birth". If the
father's age is 38 years now, the son's age five years back was:
A. 14 years B. 19 years
C. 33 years D. 38 years
Answer & Explanation
Answer: Option A
Let the son's present age be x years. Then, (38 - x) = x
2x = 38
x = 19
Son's age 5 years back (19 - 5) = 14 years.

4. A is two years older than B who is twice as old as C. If the total of the ages of A, B and C be 27, then
how old is B?
A. 7 B. 8
C. 9 D. 10
E. 11

Answer & Explanation
Answer: Option D
Let C's age be x years. Then, B's age = 2x years. A's age = (2x + 2) years.
(2x + 2) + 2x + x = 27
5x = 25
x = 5.
Hence, B's age = 2x = 10 years.

5. Present ages of Sameer and Anand are in the ratio of 5 : 4 respectively. Three years hence, the ratio
of their ages will become 11 : 9 respectively. What is Anand's present age in years?
A. 24 B. 27
C. 40 D. Cannot be determined
E. None of these

Answer & Explanation
Answer: Option A
Let the present ages of Sameer and Anand be 5x years and 4x years respectively.
(5x + 3)/(4x + 3) = 11/9
9(5x + 3) = 11(4x + 3)
45x + 27 = 44x + 33
45x - 44x = 33 - 27
x = 6.
Anand's present age = 4x = 24 years.


6. A batsman scored 110 runs which included 3 boundaries and 8 sixes. What percent of his total score
did he make by running between the wickets?
A. 45% B. 45 5/11%
C. 54
D. 55%
Answer & Explanation
Answer: Option B
Number of runs made by running = 110 - (3 x 4 + 8 x 6)
= 110 - (60)
= 50.
Required percentage = (50/110 x 100)% = 45 5/11 %

7. Two students appeared at an examination. One of them secured 9 marks more than the other and his
marks were 56% of the sum of their marks. The marks obtained by them are:
A. 39, 30 B. 41, 32
C. 42, 33 D. 43, 34
Answer & Explanation
Answer: Option C
Let their marks be (x + 9) and x.
Then, x + 9 = (56/100)(x + 9 + x)
25(x + 9) = 14(2x + 9)
3x = 99
x = 33
So, their marks are 42 and 33.

8. A fruit seller had some apples. He sells 40% apples and still has 420 apples. Originally, he had:
A. 588 apples B. 600 apples
C. 672 apples D. 700 apples
Answer & Explanation
Answer: Option D
Suppose originally he had x apples.
Then, (100 - 40)% of x = 420.
60x/100 = 420
x = (420 x 100)/60 = 700.

9. What percentage of numbers from 1 to 70 have 1 or 9 in the unit's digit?
A. 1 B. 14
C. 20 D. 21
Answer & Explanation
Answer: Option C
Clearly, the numbers which have 1 or 9 in the unit's digit, have squares that end in the digit 1. Such
numbers from 1 to 70 are 1, 9, 11, 19, 21, 29, 31, 39, 41, 49, 51, 59, 61, 69.
Number of such numbers = 14
Required percentage = (14/70 x 100)% = 20%.
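
The count of 14 can be confirmed by listing the numbers directly. The lines below are a minimal sketch; the variable names are illustrative only.

```python
# Numbers from 1 to 70 whose unit's digit is 1 or 9.
matches = [n for n in range(1, 71) if n % 10 in (1, 9)]
percentage = 100 * len(matches) / 70

print(len(matches), percentage)
```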

10. If A = x% of y and B = y% of x, then which of the following is true?
A. A is smaller than B. B. A is greater than B.
C. Relationship between A and B cannot be determined.
D. If x is smaller than y, then A is greater than B.
E. None of these

Answer & Explanation
Answer: Option E
x% of y = xy/100 = yx/100 = y% of x
A = B.


11. An accurate clock shows 8 o'clock in the morning. Through how many degrees will the hour hand
rotate when the clock shows 2 o'clock in the afternoon?
A. 144 B. 150
C. 168 D. 180
Answer & Explanation
Answer: Option D
Angle traced by the hour hand in 6 hours = (360/12 x 6)° = 180°.

12. The reflex angle between the hands of a clock at 10.25 is:
A. 180 B. 192

C. 195 D. 197 1/2

Answer & Explanation
Answer: Option D
Angle traced by hour hand in 125/12 hrs = (360/12 x 125/12)° = 312 1/2°.
Angle traced by minute hand in 25 min = (360/60 x 25)° = 150°.
Reflex angle = 360° - (312 1/2 - 150)° = 360° - 162 1/2° = 197 1/2°.

13. A clock is started at noon. By 10 minutes past 5, the hour hand has turned through:
A. 145 B. 150
C. 155 D. 160
Answer & Explanation
Answer: Option C
Angle traced by hour hand in 12 hrs = 360°.
Angle traced by hour hand in 5 hrs 10 min, i.e. 31/6 hrs = (360/12 x 31/6)° = 155°.
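
The clock-angle working in questions 11 to 13 rests on two rates: the hour hand turns 360/12 = 30 degrees per hour and the minute hand 360/60 = 6 degrees per minute. The sketch below applies those rates with exact fractions; the function names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

def hour_hand_travel(hours, minutes=0):
    """Degrees turned by the hour hand in the given elapsed time (30 degrees per hour)."""
    return Fraction(360, 12) * (hours + Fraction(minutes, 60))

def minute_hand_travel(minutes):
    """Degrees turned by the minute hand in the given minutes (6 degrees per minute)."""
    return Fraction(360, 60) * minutes

def reflex_angle(hour, minute):
    """Larger of the two angles between the hands at hour:minute."""
    gap = abs(hour_hand_travel(hour % 12, minute) - minute_hand_travel(minute))
    smaller = min(gap, 360 - gap)
    return 360 - smaller

print(hour_hand_travel(6))      # question 11: 8 a.m. to 2 p.m. is 6 hours
print(reflex_angle(10, 25))     # question 12
print(hour_hand_travel(5, 10))  # question 13: noon to 5.10
```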

14. A watch which gains 5 seconds in 3 minutes was set right at 7 a.m. In the afternoon of the same
day, when the watch indicated quarter past 4 o'clock, the true time is:
A. 59
min. past 3
B. 4 p.m.
C. 58
min. past 3
D. 2
min. past 4
Answer & Explanation
Answer: Option B
Time from 7 a.m. to 4.15 p.m. = 9 hrs 15 min. = 37/4 hrs.
3 min. 5 sec. of this clock = 3 min. of the correct clock.
37/720 hrs of this clock = 1/20 hrs of the correct clock.
37/4 hrs of this clock = (1/20 x 720/37 x 37/4) hrs of the correct clock
= 9 hrs of the correct clock.
The correct time is 9 hrs after 7 a.m. i.e., 4 p.m.

15. How much does a watch lose per day, if its hands coincide every 64 minutes?
A. 32 8/11 min.
B. 36
C. 90 min. D. 96 min.
Answer & Explanation
Answer: Option A
55 min. spaces are covered in 60 min.
60 min. spaces are covered in (60/55 x 60) min. = 65 5/11 min.
Loss in 64 min. = (65 5/11 - 64) min. = 16/11 min.
Loss in 24 hrs = (16/11 x 1/64 x 24 x 60) min. = 32 8/11 min.
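
The arithmetic in question 15 can be reproduced with exact fractions. The sketch below follows the same steps as the solution (interval between coincidences on a correct clock, difference per 64 minutes, scaled to 24 hours); the function name daily_drift_minutes is an illustrative assumption.

```python
from fractions import Fraction

def daily_drift_minutes(coincide_every):
    """Difference accumulated in 24 hours, following the book's computation."""
    # On a correct clock the hands coincide every 60 * 60 / 55 = 65 5/11 minutes.
    true_interval = Fraction(60 * 60, 55)
    difference_per_interval = true_interval - coincide_every
    return difference_per_interval / coincide_every * 24 * 60

drift = daily_drift_minutes(64)
whole, remainder = divmod(drift, 1)

print(drift, whole, remainder)
```

Splitting the result with divmod gives the whole minutes and the fractional part separately, which is how the mixed number 32 8/11 is read off.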


16. It was Sunday on Jan 1, 2006. What was the day of the week on Jan 1, 2010?
A. Sunday B. Saturday
C. Friday D. Wednesday
Answer & Explanation
Answer: Option C
On 31st December, 2005 it was Saturday.
Number of odd days from the year 2006 to the year 2009 = (1 + 1 + 2 + 1) = 5 days.
On 31st December, 2009 it was Thursday.
Thus, on 1st Jan, 2010 it was Friday.

17. What was the day of the week on 28th May, 2006?
A. Thursday B. Friday
C. Saturday D. Sunday
Answer & Explanation
Answer: Option D
28th May, 2006 = (2005 years + Period from 1.1.2006 to 28.5.2006)
Odd days in 1600 years = 0
Odd days in 400 years = 0
5 years = (4 ordinary years + 1 leap year) = (4 x 1 + 1 x 2) ≡ 6 odd days
Jan. Feb. March April May
(31 + 28 + 31 + 30 + 28) = 148 days
148 days = (21 weeks + 1 day) ≡ 1 odd day.
Total number of odd days = (0 + 0 + 6 + 1) = 7 ≡ 0 odd days.
Given day is Sunday.

18. What was the day of the week on 17th June, 1998?

A. Monday B. Tuesday
C. Wednesday D. Thursday
Answer & Explanation
Answer: Option C
17th June, 1998 = (1997 years + Period from 1.1.1998 to 17.6.1998)
Odd days in 1600 years = 0
Odd days in 300 years = (5 x 3) = 15 ≡ 1
97 years has 24 leap years + 73 ordinary years.
Number of odd days in 97 years = (24 x 2 + 73) = 121 ≡ 2 odd days.
Jan. Feb. March April May June
(31 + 28 + 31 + 30 + 31 + 17) = 168 days
168 days = 24 weeks = 0 odd day.
Total number of odd days = (0 + 1 + 2 + 0) = 3.
Given day is Wednesday.

19. What will be the day of the week on 15th August, 2010?
A. Sunday B. Monday
C. Tuesday D. Friday
Answer & Explanation
Answer: Option A
15th August, 2010 = (2009 years + Period from 1.1.2010 to 15.8.2010)
Odd days in 1600 years = 0
Odd days in 400 years = 0
9 years = (2 leap years + 7 ordinary years) = (2 x 2 + 7 x 1) = 11 odd days ≡ 4 odd days.
Jan. Feb. March April May June July Aug.
(31 + 28 + 31 + 30 + 31 + 30 + 31 + 15) = 227 days
227 days = (32 weeks + 3 days) ≡ 3 odd days.
Total number of odd days = (0 + 0 + 4 + 3) = 7 ≡ 0 odd days.
Given day is Sunday.
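
The odd-day method used in questions 17 to 19 can be sketched in code. The version below is an illustration, not a prescribed procedure: it counts an ordinary year as 1 odd day and a leap year as 2, adds the days elapsed in the current year, and takes the remainder on division by 7, with 0 read as Sunday. The names odd_days and day_of_week are hypothetical, and the cross-check against Python's datetime module is an added assumption.

```python
from datetime import date

DAY_NAMES = ["Sunday", "Monday", "Tuesday", "Wednesday",
             "Thursday", "Friday", "Saturday"]

def is_leap(year):
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

def odd_days(day, month, year):
    """Total odd days up to and including the given date (0 = Sunday)."""
    completed = year - 1
    leap_years = completed // 4 - completed // 100 + completed // 400
    ordinary_years = completed - leap_years
    total = leap_years * 2 + ordinary_years * 1
    month_lengths = [31, 29 if is_leap(year) else 28, 31, 30, 31, 30,
                     31, 31, 30, 31, 30, 31]
    # Days in the completed months of the current year plus the day of the month.
    total += sum(month_lengths[:month - 1]) + day
    return total % 7

def day_of_week(day, month, year):
    return DAY_NAMES[odd_days(day, month, year)]

for d, m, y in [(28, 5, 2006), (17, 6, 1998), (15, 8, 2010)]:
    # date.weekday() counts Monday as 0, so shift by one to compare.
    agrees = odd_days(d, m, y) == (date(y, m, d).weekday() + 1) % 7
    print(d, m, y, day_of_week(d, m, y), agrees)
```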

20. Today is Monday. After 61 days, it will be:
A. Wednesday B. Saturday
C. Tuesday D. Thursday
Answer & Explanation
Answer: Option B
Each day of the week is repeated after 7 days.
So, after 63 days, it will be Monday.
After 61 days, it will be Saturday.

Odd Man Out and Series

21.Find the odd man out.
1. 3, 5, 11, 14, 17, 21
A. 21 B. 17
C. 14 D. 3
Answer & Explanation
Answer: Option C
Each of the numbers except 14 is an odd number.
The number '14' is the only EVEN number.

22. 8, 27, 64, 100, 125, 216, 343
A. 27 B. 100
C. 125 D. 343
Answer & Explanation
Answer: Option B
The pattern is 2³, 3³, 4³, 5³, 6³, 7³. But, 100 is not a perfect cube.

23. 10, 25, 45, 54, 60, 75, 80
A. 10 B. 45
C. 54 D. 75
Answer & Explanation
Answer: Option C
Each of the numbers except 54 is multiple of 5.

24. 396, 462, 572, 396, 427, 671, 264
A. 396 B. 427
C. 671 D. 264
Answer & Explanation
Answer: Option B
In each number except 427, the middle digit is the sum of other two.
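
The rule stated for question 24 is easy to test mechanically. The short sketch below checks each three-digit number in the series; the function name middle_is_sum_of_ends is chosen only for this illustration.

```python
def middle_is_sum_of_ends(n):
    """True if the middle digit of a three-digit number equals the sum of the other two."""
    hundreds, tens, units = n // 100, (n // 10) % 10, n % 10
    return tens == hundreds + units

numbers = [396, 462, 572, 396, 427, 671, 264]
odd_ones = [n for n in numbers if not middle_is_sum_of_ends(n)]

print(odd_ones)
```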

25. 6, 9, 15, 21, 24, 28, 30
A. 28 B. 21
C. 24 D. 30
Answer & Explanation
Answer: Option A
Each of the numbers except 28, is a multiple of 3.

Part-3:- (a) Language Competency

Spotting Errors
Read each sentence to find out whether there is any grammatical error in it. The error, if any, will
be in one part of the sentence. The letter of that part is the answer. If there is no error, the answer is
'D'. (Ignore the errors of punctuation, if any).

1. (solve as per the direction given above)
A.We discussed about the problem so thoroughly
B. on the eve of the examination
C. that I found it very easy to work it out.
D. No error.
Answer & Explanation
Answer: Option A
We discussed the problem so thoroughly

2. (solve as per the direction given above)
A.An Indian ship
B. laden with merchandise
C. got drowned in the Pacific Ocean.
D. No error.
Answer & Explanation
Answer: Option C
sank in the Pacific Ocean

3. (solve as per the direction given above)
A.I could not put up in a hotel
B. because the boarding and lodging charges
C. were exorbitant
D. No error.
Answer & Explanation
Answer: Option A
'I could not put up at a hotel'

4. (solve as per the direction given above)
A.The Prime Minister has said that India would not have spent so much on defence
B. if some of the neighbouring countries
C. adopted the policy of restricting defence expenditure
D. No error.
Answer & Explanation
Answer: Option A
The Prime Minister has said that India would not have had to spend so much on defence

5. (solve as per the direction given above)
A.The Indian radio
B. which was previously controlled by the British rulers
C. is free now from the narrow vested interests.
D. No error.
Answer & Explanation
Answer: Option C
is now free from the narrow vested interests.

Selecting Words
Pick out the most effective word(s) from the given words to fill in the blank to make the sentence
meaningfully complete.
6. Fate smiles ...... those who untiringly grapple with stark realities of life.
A.with B. over
C. on D. round

Answer: Option C

7. The miser gazed ...... at the pile of gold coins in front of him.
A.avidly B. admiringly
C. thoughtfully D. earnestly

Answer: Option A

8. Catching the earlier train will give us the ...... to do some shopping.
A.chance B. luck
C. possibility D. occasion

Answer: Option A

9. I saw a ...... of cows in the field.
B. herd
C. swarm D. flock

Answer: Option B

10. The grapes are now ...... enough to be picked.
A.ready B. mature
C. ripe D. advanced

Answer: Option C

(b) Computer Competency

11) The various cards in a PC require _______ voltage to function. A) AC B) DC

Answer: Option B

12) Which has more storage capacity, CD or DVD? A) DVD B) CD

Answer: Option A

13) LCD monitor is also known as___________ a) TFT b) CRT

Answer: Option A

14) Acronym of HDD? A) Hard Disk Drive B) Hard Drive Disk

Answer: Option A

15) What type of memory is a pen drive? A) Flash Memory B) Cache Memory

Answer: Option A

(c) Environment Competency

16. Branch of Biology which is concerned with the inter-relationship between plants and animals is
called :
(A) Physiology
(B) Ecology
(C) Anatomy
(D) Morphology

Answer: Option B

17. The largest unit of living organisms on earth is :
(A) Ecosystem
(B) Biome
(C) Biosphere
(D) Population

Answer: Option C

18. The two components of an ecosystem are :
(A) Plants and animals
(B) Biotic and abiotic
(C) Plants and light
(D) Weeds and micro-organisms

Answer: Option B

19. The green plants are called :
(A) Producers
(B) Consumers
(C) Decomposers
(D) None of these

Answer: Option A

20. Total organic matter present in an ecosystem is called :
(A) Biome
(B) Biomass
(C) Biotic community
(D) Litter

Answer: Option B

(d) Logical Reasoning Competency
Each problem consists of three statements. Based on the first two statements, the third statement may
be true, false, or uncertain.

1. Tanya is older than Eric.
Cliff is older than Tanya.
Eric is older than Cliff.
If the first two statements are true, the third statement is
A. true
B. false
C. uncertain
Answer & Explanation
Answer: Option B
Because the first two statements are true, Eric is the youngest of the three, so the third statement must be
false.

2. Blueberries cost more than strawberries.
Blueberries cost less than raspberries.
Raspberries cost more than both strawberries and blueberries.
If the first two statements are true, the third statement is
A. true
B. false
C. uncertain
Answer & Explanation
Answer: Option A
Because the first two statements are true, raspberries are the most expensive of the three.

3. All the trees in the park are flowering trees.
Some of the trees in the park are dogwoods.
All dogwoods in the park are flowering trees.
If the first two statements are true, the third statement is
A. true
B. false
C. uncertain
Answer & Explanation
Answer: Option A
All of the trees in the park are flowering trees, so all dogwoods in the park are flowering trees.

4. Mara runs faster than Gail.
Lily runs faster than Mara.
Gail runs faster than Lily.
If the first two statements are true, the third statement is
A. true
B. false
C. uncertain
Answer & Explanation
Answer: Option B
We know from the first two statements that Lily runs fastest. Therefore, the third statement must be false.

5. Apartments in the Riverdale Manor cost less than apartments in The Gaslight Commons.
Apartments in the Livingston Gate cost more than apartments in The Gaslight Commons.
Of the three apartment buildings, the Livingston Gate costs the most.
If the first two statements are true, the third statement is
A. true
B. false
C. uncertain
Answer & Explanation
Answer: Option A
Since the Gaslight Commons costs more than the Riverdale Manor and the Livingston Gate costs more
than the Gaslight Commons, it is true that the Livingston Gate costs the most.

Part-4:- Data Interpretation

Pie Charts

The following pie-chart shows the percentage distribution of the expenditure incurred in publishing a
book. Study the pie-chart and answer the questions based on it.
Various Expenditures (in percentage) Incurred in Publishing a Book

1. If for a certain quantity of books, the publisher has to pay ₹ 30,600 as printing cost, then what will
be the amount of royalty to be paid for these books?
A. ₹ 19,450 B. ₹ 21,200
C. ₹ 22,950 D. ₹ 26,150
Answer & Explanation
Answer: Option C
Let the amount of Royalty to be paid for these books be ₹ r.
Then, 20 : 15 = 30600 : r, so r = ₹ (30600 x 15)/20 = ₹ 22,950.

2. What is the central angle of the sector corresponding to the expenditure incurred on Royalty?
A. 15° B. 24°
C. 54° D. 48°
Answer & Explanation
Answer: Option C
Central angle corresponding to Royalty = (15% of 360°) = (15/100 x 360)° = 54°.

3. The price of the book is marked 20% above the C.P. If the marked price of the book is ₹ 180, then
what is the cost of the paper used in a single copy of the book?
A. ₹ 36 B. ₹ 37.50
C. ₹ 42 D. ₹ 44.25

Answer & Explanation
Answer: Option B
Clearly, marked price of the book = 120% of C.P.
Also, cost of paper = 25% of C.P.
Let the cost of paper for a single book be ₹ n.
Then, 120 : 25 = 180 : n, so n = ₹ (25 x 180)/120 = ₹ 37.50.

4. If 5500 copies are published and the transportation cost on them amounts to ₹ 82500, then what
should be the selling price of the book so that the publisher can earn a profit of 25%?
A. ₹ 187.50 B. ₹ 191.50
C. ₹ 175 D. ₹ 180
Answer & Explanation
Answer: Option A
For the publisher to earn a profit of 25%, S.P. = 125% of C.P.
Also, Transportation Cost = 10% of C.P.
Let the S.P. of 5500 books be ₹ x.
Then, 10 : 125 = 82500 : x, so x = ₹ (125 x 82500)/10 = ₹ 1031250.
S.P. of one book = ₹ 1031250/5500 = ₹ 187.50.

5. Royalty on the book is less than the printing cost by:
A. 5% B. 33
C. 20% D. 25%
Answer & Explanation
Answer: Option D
Printing Cost of book = 20% of C.P.
Royalty on book = 15% of C.P.
Difference = (20% of C.P.) - (15% of C.P) = 5% of C.P.
Percentage difference = (Difference / Printing Cost) x 100 %
= (5% of C.P. / 20% of C.P.) x 100 % = 25%.

6. If the difference between the two expenditures is represented by 18° in the pie-chart, then these
expenditures possibly are
A. Binding Cost and Promotion Cost
B. Paper Cost and Royalty
C. Binding Cost and Printing Cost
D. Paper Cost and Printing Cost
Answer & Explanation
Answer: Option D
Central angle of 18° = (18/360 x 100)% of the total expenditure
= 5% of the total expenditure.
From the given chart it is clear that:
Out of the given combinations, only in combination (D) the difference is 5%, i.e.
Paper Cost - Printing Cost = (25% - 20%) of the total expenditure
= 5% of the total expenditure.

7. For an edition of 12,500 copies, the amount of Royalty paid by the publisher is ₹ 2,81,250. What
should be the selling price of the book if the publisher desires a profit of 5%?
A. ₹ 152.50 B. ₹ 157.50
C. ₹ 162.50 D. ₹ 167.50
Answer & Explanation
Answer: Option B
Clearly, S.P. of the book = 105% of C.P.
Let the selling price of this edition (of 12500 books) be ₹ x.
Then, 15 : 105 = 281250 : x, so x = ₹ (105 x 281250)/15 = ₹ 1968750.
S.P. of one book = ₹ 1968750/12500 = ₹ 157.50.

8. If for an edition of the book, the cost of paper is ₹ 56250, then find the promotion cost for this
edition.
A. ₹ 20,000 B. ₹ 22,500
C. ₹ 25,500 D. ₹ 28,125
Answer & Explanation
Answer: Option B
Let the Promotion Cost for this edition be ₹ p.
Then, 25 : 10 = 56250 : p, so p = ₹ (56250 x 10)/25 = ₹ 22,500.

9. Which two expenditures together have a central angle of 108°?
A. Binding Cost and Transportation Cost
B. Printing Cost and Paper Cost
C. Royalty and Promotion Cost
D. Binding Cost and Paper Cost
Answer & Explanation
Answer: Option A
Central angle of 108° = (108/360 x 100)% of the total expenditure
= 30% of the total expenditure.
From the pie chart it is clear that:
Binding Cost + Transportation Cost = (20% + 10%) of the total expenditure
= 30% of the total expenditure.
Binding Cost and Transportation Cost together have a central angle of 108°.
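
The proportion and central-angle calculations in this set can be checked with a short Python sketch. The percentage shares below are the ones used in the worked solutions above (the chart image is not reproduced here); the function names are illustrative assumptions only.

```python
# Percentage shares of the total cost, as used in the worked solutions above.
SHARES = {
    "Paper": 25, "Printing": 20, "Binding": 20,
    "Royalty": 15, "Transportation": 10, "Promotion": 10,
}

def amount_for(item, known_item, known_amount):
    """Scale a known expenditure to another item using the ratio of shares."""
    return known_amount * SHARES[item] / SHARES[known_item]

def central_angle(item):
    """Central angle (in degrees) of the sector for one item."""
    return SHARES[item] * 360 / 100

def selling_price_per_copy(known_item, known_amount, copies, profit_pct):
    """Selling price per copy that gives profit_pct profit on the total cost."""
    total_cost = known_amount * 100 / SHARES[known_item]
    return total_cost * (100 + profit_pct) / 100 / copies

print(amount_for("Royalty", "Printing", 30600))                   # question 1
print(central_angle("Royalty"))                                   # question 2
print(selling_price_per_copy("Transportation", 82500, 5500, 25))  # question 4
print(selling_price_per_copy("Royalty", 281250, 12500, 5))        # question 7
```

Every question in the set reduces to one of these three ratios, which is why a single table of shares is enough.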

The pie chart shows the distribution of New York market share by value of different computer
companies in 2005.

The pie chart shows the distribution of New York market share by volume of different computer
companies in 2005.
Number of units sold in 2005 in New York = 1,500
Value of units sold in 2005 in New York = US $1,650,000.

1. For the year 2005, which company has realised the lowest average unit sales price for a PC ?
A. Commodore B. IBM
C. Tandy D. Cannot be determined
Answer & Explanation
Answer: Option D
Although it seems to be Commodore, the answer cannot be determined due to the fact that we are
unaware of the break-up of the sales value and volume of companies comprising the other

2. Over the period 2005-2006, if sales (value-wise) of IBM PC's increased by 50% and of Apple by 15%
assuming that PC sales of all other computer companies remained the same, by what percentage
(approximately) would the PC sales in New York (value-wise) increase over the same period ?
A. 16.1 % B. 18 %
C. 14 % D. None of these
Answer & Explanation
Answer: Option A
If we assume the total sales to be 100 in the first year, IBM's sales would go up by 50% (from 28 to 42)
contributing an increase of 14 to the total sales value.
Similarly, Apple's increase of 15% would contribute an increase of 2.1 to the total sales value. The net
change would be 14 + 2.1 on 100. (i.e., 16.1%)

3. In 2005, the average unit sale price of an IBM PC was approximately (in US$)
A. 3180 B. 2800
C. 393 D. 3080
Answer & Explanation
Answer: Option D
IBM accounts for 28% of the share by value and 10% of the share by volume.
28% of 1650000 = 28 x 1650000/100 = 462000
10% of 1500 = 10 x 1500/100 = 150
Therefore, average unit sale price = 462000/150 = 3080.
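
As a quick check of the arithmetic above, the average unit price can be computed from the two shares and the two totals. This is a minimal sketch; the function name `average_unit_price` is an assumption made for illustration.

```python
def average_unit_price(value_share_pct, volume_share_pct, total_value, total_units):
    """Average price per unit for a company, given its share by value and by volume."""
    value = total_value * value_share_pct / 100    # sales value of the company
    units = total_units * volume_share_pct / 100   # units sold by the company
    return value / units

# IBM: 28% share by value, 10% share by volume (figures quoted in the solution).
ibm_price = average_unit_price(28, 10, 1_650_000, 1_500)
print(ibm_price)
```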

Bar Charts
Study the following bar charts and answer the questions.
Foreign Trade (Imports and Exports) by countries for the year (1993 - 1994)

1. The ratio of the maximum exports to the minimum imports was closest to ?
A. 64 B. 69
C. 74 D. 79
Answer & Explanation
Answer: Option B
The value of maximum exports = 6045.
The value of minimum imports = 87.
Therefore, the required ratio (6045/87) = 69.48 = 69 (approximately).

2. How many countries exhibited a trade surplus ?
A. 5 B. 4
C. 3 D. 6
Answer & Explanation
Answer: Option B
Out of a total of 12 countries, 8 showed a deficit while 4 showed a surplus.

3. The total trade deficit/surplus for all the countries put together was ?
A. 11286 surplus B. 11286 deficit
C. 10286 deficit D. None of these
Answer & Explanation
Answer: Option B
Sum of exports - Sum of imports = deficit(11286).

4. The highest trade deficit was shown by which country ?
A. C B. G
C. H D. L
Answer & Explanation
Answer: Option D
Visually, it is clear that L has the highest trade deficit.

5. The ratio of Exports to Imports was highest for which country ?
A. A B. I
C. J D. K
Answer & Explanation
Answer: Option B
I has a ratio of 4002/2744 = 1.45, which is the highest.

The following bar chart shows the composition of the GDP of two countries (India and Pakistan).
Composition of GDP of Two Countries

1. If the total GDP of Pakistan is ₹ 10,000 crore, then the GDP accounted for by Manufacturing is ?
A. ₹ 200 crore B. ₹ 600 crore
C. ₹ 2,000 crore D. ₹ 6,000 crore
Answer & Explanation

Answer: Option C
20% of 10000 = 2000

2. What fraction of India's GDP is accounted for by Services ?
A. (6/33)th B. (1/5)th
C. (2/3)rd D. None of these
Answer & Explanation
Answer: Option B
Service accounts for 20%, i.e., (1/5)th of the GDP of India.

3. If the total GDP of India is ₹ 30,000 crore, then the GDP accounted for by Agriculture, Services and
Miscellaneous is ?
A. ₹ 18,500 crore B. ₹ 18,000 crore
C. ₹ 21,000 crore D. ₹ 15,000 crore
Answer & Explanation
Answer: Option C
(40 + 20 + 10)% of 30,000 = ₹ 21,000 crore.

4. Which country accounts for higher earning out of Services and Miscellaneous together ?
A. India B. Pakistan
C. Both spend equal amounts D. Cannot be determined
Answer & Explanation
Answer: Option D
Although the percentage on Services and Miscellaneous put together is equal for both the countries, we
cannot comment on this since we have no data about the respective GDPs.

5. If the total GDP is the same for both the countries, then what percentage is Pakistan's income
through agriculture over India's income through Services ?
A. 100 % B. 200 %
C. 133.33 % D. None of these
Answer & Explanation
Answer: Option A
Since the GDP is the same, the answer is given by (40 - 20)/20 = 100%.

Table Charts

The following table shows the number of new employees added to different categories of employees
in a company and also the number of employees from these categories who left the company every
year since the foundation of the Company in 1995.
Year   Managers      Technicians   Operators     Accountants   Peons
       New    Left   New    Left   New    Left   New    Left   New    Left
1995   760    -      1200   -      880    -      1160   -      820    -
1996   280    120    272    120    256    104    200    100    184    96
1997   179    92     240    128    240    120    224    104    152    88
1998   148    88     236    96     208    100    248    96     196    80
1999   160    72     256    100    192    112    272    88     224    120
2000   193    96     288    112    248    144    260    92     200    104

1. What is the difference between the total number of Technicians added to the Company
and the total number of Accountants added to the Company during the years 1996 to
2000?
Answer & Explanation
Answer: Option D
Required difference
= (272 + 240 + 236 + 256 + 288) - (200 + 224 + 248 + 272 + 260)
= 88.

2. What was the total number of Peons working in the Company in the year 1999?
A. 1312 B. 1192
C. 1088 D. 968
Answer & Explanation
Answer: Option B
Total number of Peons working in the Company in 1999
= (820 + 184 + 152 + 196 + 224) - (96 + 88 + 80 + 120)
= 1192.
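
Both table questions are running sums over the "New" and "Left" columns. The sketch below encodes three of the table's categories and reproduces the two results; the data layout and the helper names `total_added` and `headcount` are assumptions chosen for illustration.

```python
# (new, left) per year, copied from the table above; "-" in 1995 is stored as 0.
STAFF = {
    "Technicians": {1995: (1200, 0), 1996: (272, 120), 1997: (240, 128),
                    1998: (236, 96), 1999: (256, 100), 2000: (288, 112)},
    "Accountants": {1995: (1160, 0), 1996: (200, 100), 1997: (224, 104),
                    1998: (248, 96), 1999: (272, 88), 2000: (260, 92)},
    "Peons": {1995: (820, 0), 1996: (184, 96), 1997: (152, 88),
              1998: (196, 80), 1999: (224, 120), 2000: (200, 104)},
}

def total_added(category, first_year, last_year):
    """Total new employees of a category added between two years (inclusive)."""
    return sum(new for year, (new, _) in STAFF[category].items()
               if first_year <= year <= last_year)

def headcount(category, year):
    """Employees of a category working in a given year: all joiners minus all leavers so far."""
    return sum(new - left for y, (new, left) in STAFF[category].items() if y <= year)

print(total_added("Technicians", 1996, 2000) - total_added("Accountants", 1996, 2000))
print(headcount("Peons", 1999))
```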

Online Ph.D Entrance Test (PET) : Sample Test Paper I(Solved with explanation)

Time: 90 minutes] [Max Marks: 100

N.B.:- (a) There are in all 100 Multiple Choice Questions.
(b) Each correct answer carries 1 Mark.
(c) There is No negative marking system.
(d) Click online the correct option for each question.
(e) Use of Electronic / Scientific calculator is not allowed
(f) Multiple Choice Questions are divided into four parts
1. Analytical Reasoning,
2. Numerical Ability,
3. Language Competency/ Computer/ Environment/ Logical Reasoning, and
4. Data Interpretation

Part-1 :- Analytical Reasoning

Series Completion

Choose the correct alternative that will continue the same pattern and replace the question mark in the
given series.
1. 120, 99, 80, 63, 48, ?
A. 35 B. 38
C. 39 D. 40
Answer & Explanation
Answer: Option A
The pattern is - 21, - 19, - 17, - 15,.....
So, missing term = 48 - 13 = 35.

2. 589654237, 89654237, 8965423, 965423, ?
A. 58965 B. 65423
C. 89654 D. 96542
Answer & Explanation
Answer: Option D
The digits are removed one by one from the beginning and the end in order alternately
so as to obtain the subsequent terms of the series.

3. 3, 10, 101,?
A.10101 B. 10201
C. 10202 D.11012
Answer & Explanation
Answer: Option C
Each term in the series is obtained by adding 1 to the square of the preceding term:
3^2 + 1 = 10 and 10^2 + 1 = 101.
So, missing term = 101^2 + 1 = 10202.

4. In the series 2, 6, 18, 54, ...... what will be the 8th term ?
A.4370 B. 4374
C. 7443 D.7434
Answer & Explanation
Answer: Option B
Clearly, 2 x 3 = 6, 6 x 3 = 18, 18 x 3 = 54,.....
So, the series is a G.P. in which a = 2, r = 3.
Therefore 8th term = ar^(8 - 1)
= ar^7
= 2 x 3^7
= (2 x 2187) = 4374.
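
The n-th term formula for a geometric progression, a x r^(n - 1), can be verified with a small Python sketch. The function name `gp_term` is an assumption used only for this illustration.

```python
def gp_term(a: int, r: int, n: int) -> int:
    """n-th term (1-based) of a geometric progression with first term a and ratio r."""
    return a * r ** (n - 1)

series = [2, 6, 18, 54]
# Confirm that the given terms really share a common ratio before applying the formula.
ratios = {later // earlier for earlier, later in zip(series, series[1:])}
print(ratios)
print(gp_term(2, 3, 8))
```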

5. 125,80,45,20,?
A.5 B. 8
C. 10 D.12
Answer & Explanation
Answer: Option A
The pattern is - 45, - 35, - 25, .....
So, missing term = 20 - 15 = 5.


In each of the following questions, five words have been given out of which four are alike in some
manner, while the fifth one is different. Choose the word which is different from the rest.

6. Choose the word which is different from the rest.
A.Chicken B. Snake
C. Swan D. Crocodile
E. Frog
Answer & Explanation
Answer: Option A
All except Chicken can live in water.

7. Choose the word which is different from the rest.
A.Cap B. Turban
C. Helmet D.Veil
E. Hat
Answer & Explanation
Answer: Option D
All except Veil cover the head, while veil covers the face.

8. Choose the word which is different from the rest.
A.Kiwi B. Eagle
C. Emu D.Ostrich
Answer & Explanation
Answer: Option B
All except Eagle are flightless birds.

9. Choose the word which is different from the rest.
A.Rigveda B. Yajurveda
C. Atharvaveda D.Ayurveda
E. Samveda
Answer & Explanation
Answer: Option D
All except Ayurveda are names of holy scriptures, the four Vedas. Ayurveda is a branch of medicine.

10. Choose the word which is different from the rest.
A.Curd B. Butter
C. Oil D. Cheese
E. Cream
Answer & Explanation
Answer: Option C
All except Oil are products obtained from milk.

Blood Relation Test

11. Pointing to a photograph of a boy Suresh said, "He is the son of the only son of my mother." How
is Suresh related to that boy?
A.Brother B. Uncle
C. Cousin D. Father
Answer & Explanation
Answer: Option D
The boy in the photograph is the son of the only son of Suresh's mother, i.e., the son of Suresh. Hence,
Suresh is the father of the boy.

12. If A + B means A is the mother of B; A - B means A is the brother of B; A % B means A is the father of
B and A x B means A is the sister of B, which of the following shows that P is the maternal uncle of Q?
A.Q - N + M x P B. P + S x N - Q
C. P - M + N x Q D. Q - S % P
Answer & Explanation
Answer: Option C
P - M means P is the brother of M
M + N means M is the mother of N
N x Q means N is the sister of Q
Therefore, P is the maternal uncle of Q.

13. If A is the brother of B; B is the sister of C; and C is the father of D, how is D related to A?
A.Brother B. Sister
C. Nephew D.Cannot be determined
Answer & Explanation
Answer: Option D
If D is Male, the answer is Nephew.
If D is Female, the answer is Niece.
As the sex of D is not known, hence, the relation between D and A cannot be determined.
Note: Niece - A daughter of one's brother or sister, or of one's brother-in-law or sister-in-law. Nephew - A
son of one's brother or sister, or of one's brother-in-law or sister-in-law.

Character Puzzles

14. Which one will replace the question mark ?

A.L10 B. K15
C. I15 D.K8

Answer & Explanation
Answer: Option D

How the number is obtained?
2 + 4 = 6
5 + 9 = 14
3 + 5 = 8
Therefore, the answer is K8.

15. Which one will replace the question mark ?

A.1 B. 4
C. 3 D. 6
Answer & Explanation
Answer: Option D
(5 + 4 + 7)/2 = 8
(6 + 9 + 5)/2 = 10
(3 + 7 + 2)/2 = 6.

16. Which one will replace the question mark ?

A.18 B. 12
C. 9 D. 6
Answer & Explanation
Answer: Option C
(12 + 18 + 30)/10 = 6
(16 + 24 + 40)/10 = 8
Similarly, (45 + 18 + 27)/10 = 9.

17. Which one will replace the question mark ?

A.25 B. 37
C. 41 D. 47
Answer & Explanation
Answer: Option C
(5 x 3) + 4 = 19
and (6 x 4) + 5 = 29
Therefore, (7 x 5) + 6 = 41

18. Which one will replace the question mark ?

A.45 B. 41
C. 32 D. 40
Answer & Explanation
Answer: Option A
(15 x 2 - 3) = 27,
(31 x 2 - 6) = 56
and (45 x 2 - 9) = 81

19. Y is in the East of X which is in the North of Z. If P is in the South of Z, then in which direction of
Y, is P?
A.North B. South
C. South-East D. None of these
Answer & Explanation
Answer: Option D

P is in South-West of Y.

20. If South-East becomes North, North-East becomes West and so on. What will West become?
A.North-East B. North-West
C. South-East D.South-West
Answer & Explanation
Answer: Option C


Each direction is turned through 135° anticlockwise (South-East becomes North and North-East becomes West),
so the new name of West will be South-East.

21. A man walks 5 km toward south and then turns to the right. After walking 3 km he turns to the left
and walks 5 km. Now in which direction is he from the starting place?
A.West B. South
C. North-East D. South-West
Answer & Explanation
Answer: Option D

Hence required direction is South-West.

22. Rahul put his timepiece on the table in such a way that at 6 P.M. hour hand points to North. In
which direction the minute hand will point at 9.15 P.M. ?
A.South-East B. South
C. North D.West
Answer & Explanation
Answer: Option D

At 9.15 P.M., the minute hand will point towards west.

23. Rasik walked 20 m towards north. Then he turned right and walked 30 m. Then he turned right and
walked 35 m. Then he turned left and walked 15 m. Finally he turned left and walked 15 m. In which
direction and how many metres is he from the starting position?
A.15 m West B. 30 m East
C. 30 m West D. 45 m East
Answer & Explanation
Answer: Option D
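
Direction-sense questions of this kind can be checked by tracking a position and a heading. The sketch below is one possible way to do it in Python; the `walk` helper and its move format are assumptions made for illustration, not a method given in the book.

```python
# Unit steps for the four headings, listed clockwise so that a right turn
# moves one place forward in the list and a left turn one place back.
NAMES = ["N", "E", "S", "W"]
STEPS = [(0, 1), (1, 0), (0, -1), (-1, 0)]

def walk(start_heading, moves):
    """Follow (turn, distance) moves, where turn is 'L', 'R' or '' (no turn).

    Returns the final (east, north) displacement from the starting point.
    """
    index = NAMES.index(start_heading)
    x = y = 0
    for turn, distance in moves:
        if turn == "R":
            index = (index + 1) % 4
        elif turn == "L":
            index = (index - 1) % 4
        dx, dy = STEPS[index]
        x += dx * distance
        y += dy * distance
    return x, y

# Rasik: 20 m north, right 30 m, right 35 m, left 15 m, left 15 m.
rasik = walk("N", [("", 20), ("R", 30), ("R", 35), ("L", 15), ("L", 15)])
print(rasik)
```

A result of (45, 0) means 45 m to the east and no net movement north or south, which matches option D. The same helper can be applied to questions 21 and 25 by changing the starting heading and the list of moves.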

24. Two cars start from the opposite places of a main road, 150 km apart. First car runs for 25 km and
takes a right turn and then runs 15 km. It then turns left and then runs for another 25 km and then
takes the direction back to reach the main road. In the mean time, due to minor break down the other
car has run only 35 km along the main road. What would be the distance between two cars at this
A.65 km B. 75 km
C. 80 km D.85 km
Answer & Explanation
Answer: Option A

25. Starting from the point X, Jayant walked 15 m towards west. He turned left and walked 20 m. He
then turned left and walked 15 m. After this he turned to his right and walked 12 m. How far and in
which directions is now Jayant from X?
A.32 m, South B. 47 m, East
C. 42 m, North D. 27 m, South

Answer & Explanation
Answer: Option A

Part-2:- Numerical Ability

1. Which one of the following is not a prime number?
A.31 B. 61
C. 71 D. 91
Answer & Explanation
Answer: Option D
91 is divisible by 7. So, it is not a prime number.

2. (112 x 5^4) = ?
A. 67000 B. 70000
C. 76500 D. 77200
Answer & Explanation
Answer: Option B
(112 x 5^4) = 112 x (10/2)^4 = (112 x 10^4)/2^4 = 1120000/16 = 70000

3. It is being given that (2^32 + 1) is completely divisible by a whole number. Which of the following
numbers is completely divisible by this number?
A. (2^16 + 1) B. (2^16 - 1)
C. (7 x 2^23) D. (2^96 + 1)
Answer & Explanation
Answer: Option D
Let 2^32 = x. Then, (2^32 + 1) = (x + 1).
Let (x + 1) be completely divisible by the natural number N. Then,
(2^96 + 1) = [(2^32)^3 + 1] = (x^3 + 1) = (x + 1)(x^2 - x + 1), which is completely divisible by N, since (x + 1) is divisible
by N.

4. What least number must be added to 1056, so that the sum is completely divisible by 23 ?
A.2 B. 3
C. 18 D. 21
E. None of these
Answer & Explanation
Answer: Option A
On dividing 1056 by 23, the quotient is 45 and the remainder is 21 (1056 = 23 x 45 + 21).
Required number = (23 - 21)
= 2

5. 1397 x 1397 = ?
A.1951609 B. 1981709
C. 18362619 D. 2031719
E. None of these
Answer & Explanation
Answer: Option A
1397 x 1397 = (1397)^2
= (1400 - 3)^2
= (1400)^2 + (3)^2 - (2 x 1400 x 3)
= 1960000 + 9 - 8400
= 1960009 - 8400
= 1951609.

6. How many of the following numbers are divisible by 132 ?
264, 396, 462, 792, 968, 2178, 5184, 6336
A.4 B. 5
C. 6 D. 7
Answer & Explanation
Answer: Option A
132 = 4 x 3 x 11
So, if a number is divisible by all three of the numbers 4, 3 and 11, then it is divisible by 132 also.
Number   Divisible by   Divisible by 132?
264      11, 3, 4       Yes
396      11, 3, 4       Yes
462      11, 3          No
792      11, 3, 4       Yes
968      11, 4          No
2178     11, 3          No
5184     3, 4           No
6336     11, 3, 4       Yes
Therefore the following numbers are divisible by 132 : 264, 396, 792 and 6336.
Required count of numbers = 4.
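
The factor test above works because 4, 3 and 11 are pairwise co-prime, so passing all three tests is equivalent to being divisible by their product, 132. A short Python sketch of the same filter (the helper name `divisible_by_all` is an illustrative assumption):

```python
numbers = [264, 396, 462, 792, 968, 2178, 5184, 6336]

def divisible_by_all(n, factors=(4, 3, 11)):
    """True if n is divisible by every factor in the tuple."""
    return all(n % f == 0 for f in factors)

by_factors = [n for n in numbers if divisible_by_all(n)]
by_product = [n for n in numbers if n % 132 == 0]
print(by_factors, len(by_factors))
```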

7. (935421 x 625) = ?
A. 575648125 B. 584638125
C. 584649125 D. 585628125
Answer & Explanation
Answer: Option B
935421 x 625 = 935421 x 5^4
= 935421 x (10/2)^4
= (935421 x 10^4)/2^4
= 9354210000/16
= 584638125

8. The largest 4 digit number exactly divisible by 88 is:
A.9944 B. 9768
C. 9988 D.8888
E. None of these
Answer & Explanation
Answer: Option A
Largest 4-digit number = 9999
On dividing 9999 by 88, the quotient is 113 and the remainder is 55 (9999 = 88 x 113 + 55).
Required number = (9999 - 55) = 9944.
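
Both remainder-based questions in this part (the least number to add, and the largest number below a limit) follow directly from the remainder operator. A minimal Python sketch, with helper names that are assumptions for illustration:

```python
def largest_multiple_at_most(limit: int, divisor: int) -> int:
    """Largest multiple of divisor that does not exceed limit."""
    return limit - limit % divisor

def least_addend(n: int, divisor: int) -> int:
    """Smallest non-negative number to add to n to make it divisible by divisor."""
    return (-n) % divisor

print(largest_multiple_at_most(9999, 88))  # question 8
print(least_addend(1056, 23))              # question 4
```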

Problems on Trains

1. A train running at the speed of 60 km/hr crosses a pole in 9 seconds. What is the length of the train?
A. 120 metres B. 180 metres
C. 324 metres D. 150 metres
Answer & Explanation
Answer: Option D

Speed = (60 x 5/18) m/sec = (50/3) m/sec.
Length of the train = (Speed x Time) = (50/3 x 9) m = 150 m.

2. A train 125 m long passes a man, running at 5 km/hr in the same direction in which the train is
going, in 10 seconds. The speed of the train is:
A. 45 km/hr B. 50 km/hr
C. 54 km/hr D. 55 km/hr
Answer & Explanation
Answer: Option B
Speed of the train relative to man = (125/10) m/sec = (25/2) m/sec
= (25/2 x 18/5) km/hr = 45 km/hr.
Let the speed of the train be x km/hr. Then, relative speed = (x - 5) km/hr.
x - 5 = 45, so x = 50 km/hr.

3. The length of the bridge, which a train 130 metres long and travelling at 45 km/hr can cross in 30
seconds, is:
A. 200 m B. 225 m
C. 245 m D. 250 m
Answer & Explanation
Answer: Option C
Speed = (45 x 5/18) m/sec = (25/2) m/sec.
Time = 30 sec.
Let the length of bridge be x metres.
Then, (130 + x)/30 = 25/2
2(130 + x) = 750
x = 245 m.

4. Two trains running in opposite directions cross a man standing on the platform in 27 seconds and 17
seconds respectively and they cross each other in 23 seconds. The ratio of their speeds is:
A.1 : 3 B. 3 : 2
C. 3 : 4 D. None of these
Answer & Explanation
Answer: Option B
Let the speeds of the two trains be x m/sec and y m/sec respectively.
Then, length of the first train = 27x metres,
and length of the second train = 17y metres.
(27x + 17y)/(x + y) = 23
27x + 17y = 23x + 23y
4x = 6y
x/y = 3/2.

5. A train passes a station platform in 36 seconds and a man standing on the platform in 20 seconds. If
the speed of the train is 54 km/hr, what is the length of the platform?
A. 120 m B. 240 m
C. 300 m D. None of these
Answer & Explanation
Answer: Option B
Speed = (54 x 5/18) m/sec = 15 m/sec.
Length of the train = (15 x 20) m = 300 m.
Let the length of the platform be x metres.
Then, (x + 300)/36 = 15
x + 300 = 540
x = 240 m.
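
The train problems above all rest on two facts: 1 km/hr = 5/18 m/sec, and the distance covered while crossing an object equals the train length plus the object's length. The Python sketch below uses exact fractions to reproduce three of the answers; the function names are illustrative assumptions.

```python
from fractions import Fraction

def kmph_to_mps(speed_kmph):
    """Convert km/hr to m/sec using the factor 5/18."""
    return Fraction(speed_kmph) * 5 / 18

def train_length(speed_kmph, seconds_to_pass_point):
    """Length of a train that passes a pole or a standing man in the given time."""
    return kmph_to_mps(speed_kmph) * seconds_to_pass_point

def crossed_length(train_len_m, speed_kmph, seconds_to_cross):
    """Length of a bridge or platform: distance covered minus the train's own length."""
    return kmph_to_mps(speed_kmph) * seconds_to_cross - train_len_m

print(train_length(60, 9))                           # question 1
print(crossed_length(130, 45, 30))                   # question 3
print(crossed_length(train_length(54, 20), 54, 36))  # question 5
```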


1. Tickets numbered 1 to 20 are mixed up and then a ticket is drawn at random. What is the probability
that the ticket drawn has a number which is a multiple of 3 or 5?
Answer & Explanation
Answer: Option D
Here, S = {1, 2, 3, 4, ...., 19, 20}.
Let E = event of getting a multiple of 3 or 5 = {3, 6 , 9, 12, 15, 18, 5, 10, 20}.
P(E) = n(E)/n(S) = 9/20.

2. A bag contains 2 red, 3 green and 2 blue balls. Two balls are drawn at random. What is the
probability that none of the balls drawn is blue?
Answer & Explanation
Answer: Option A
Total number of balls = (2 + 3 + 2) = 7.
Let S be the sample space.
Then, n(S) = Number of ways of drawing 2 balls out of 7
= 7C2 = (7 x 6)/(2 x 1) = 21.
Let E = Event of drawing 2 balls, none of which is blue.
n(E) = Number of ways of drawing 2 balls out of (2 + 3) balls
= 5C2 = (5 x 4)/(2 x 1) = 10.
P(E) = n(E)/n(S) = 10/21.

3. In a box, there are 8 red, 7 blue and 6 green balls. One ball is picked up randomly. What is the
probability that it is neither red nor green?
Answer & Explanation
Answer: Option A
Total number of balls = (8 + 7 + 6) = 21.
Let E = event that the ball drawn is neither red nor green

= event that the ball drawn is blue.
n(E) = 7.
P(E) = n(E)/n(S) = 7/21 = 1/3.

4. What is the probability of getting a sum of 9 from two throws of a die?
Answer & Explanation
Answer: Option C
In two throws of a die, n(S) = (6 x 6) = 36.
Let E = event of getting a sum of 9 = {(3, 6), (4, 5), (5, 4), (6, 3)}.
P(E) = n(E)/n(S) = 4/36 = 1/9.

5. Three unbiased coins are tossed. What is the probability of getting at most two heads?
Answer & Explanation
Answer: Option D
Here, n(S) = 2 x 2 x 2 = 8.
Let E = event of getting at most two heads, i.e. every outcome except HHH, so n(E) = 7.
P(E) = n(E)/n(S) = 7/8.
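
Each probability above is a count of favourable outcomes divided by the size of the sample space, P(E) = n(E)/n(S). For small sample spaces this can be checked by brute-force enumeration, as in the following Python sketch (the `probability` helper and the ball labels are assumptions made for illustration):

```python
from fractions import Fraction
from itertools import combinations, product

def probability(sample_space, is_favourable):
    """P(E) = n(E)/n(S) for equally likely outcomes, as an exact fraction."""
    outcomes = list(sample_space)
    favourable = sum(1 for outcome in outcomes if is_favourable(outcome))
    return Fraction(favourable, len(outcomes))

# Question 1: ticket numbered 1 to 20 is a multiple of 3 or 5.
tickets = probability(range(1, 21), lambda n: n % 3 == 0 or n % 5 == 0)

# Question 2: two balls drawn from 2 red, 3 green and 2 blue, neither one blue.
balls = ["R1", "R2", "G1", "G2", "G3", "B1", "B2"]
no_blue = probability(combinations(balls, 2),
                      lambda pair: all(not b.startswith("B") for b in pair))

# Question 4: sum of 9 in two throws of a die.
sum_nine = probability(product(range(1, 7), repeat=2), lambda roll: sum(roll) == 9)

# Question 5: at most two heads when three coins are tossed.
at_most_two_heads = probability(product("HT", repeat=3),
                                lambda toss: toss.count("H") <= 2)

print(tickets, no_blue, sum_nine, at_most_two_heads)
```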


1. A man has ₹ 480 in the denominations of one-rupee notes, five-rupee notes and ten-rupee notes. The
number of notes of each denomination is equal. What is the total number of notes that he has ?
A.45 B. 60
C. 75 D. 90
Answer & Explanation
Answer: Option D
Let number of notes of each denomination be x.
Then x + 5x + 10x = 480
16x = 480
x = 30.
Hence, total number of notes = 3x = 90.

2. There are two examinations rooms A and B. If 10 students are sent from A to B, then the number of
students in each room is the same. If 20 candidates are sent from B to A, then the number of students
in A is double the number of students in B. The number of students in room A is:
A.20 B. 80
C. 100 D. 200
Answer & Explanation
Answer: Option C
Let the number of students in rooms A and B be x and y respectively.
Then, x - 10 = y + 10 => x - y = 20 .... (i)
and x + 20 = 2(y - 20) => x - 2y = -60 .... (ii)
Solving (i) and (ii) we get: x = 100 , y = 80.
The required answer A = 100.

3. The price of 10 chairs is equal to that of 4 tables. The price of 15 chairs and 2 tables together is
₹ 4000. The total price of 12 chairs and 3 tables is:
A. ₹ 3500 B. ₹ 3750
C. ₹ 3840 D. ₹ 3900
Answer & Explanation
Answer: Option D
Let the cost of a chair and that of a table be ₹ x and ₹ y respectively.
Then, 10x = 4y or y = (5/2)x.
15x + 2y = 4000
15x + 2 x (5/2)x = 4000
20x = 4000
x = 200.
So, y = (5/2) x 200 = 500.
Hence, the cost of 12 chairs and 3 tables = 12x + 3y
= ₹ (2400 + 1500)
= ₹ 3900.

4. If a - b = 3 and a^2 + b^2 = 29, find the value of ab.
A. 10 B. 12
C. 15 D. 18
Answer & Explanation
Answer: Option A
2ab = (a^2 + b^2) - (a - b)^2
= 29 - 9 = 20
ab = 10.

5. The price of 2 sarees and 4 shirts is ₹ 1600. With the same money one can buy 1 saree and 6 shirts. If
one wants to buy 12 shirts, how much shall he have to pay ?
A. ₹ 1200 B. ₹ 2400
C. ₹ 4800 D. Cannot be determined
E. None of these

Answer & Explanation
Answer: Option B
Let the price of a saree and a shirt be ₹ x and ₹ y respectively.
Then, 2x + 4y = 1600 .... (i)
and x + 6y = 1600 .... (ii)

Divide equation (i) by 2, we get the below equation.

=> x + 2y = 800. --- (iii)

Now subtract (iii) from (ii)

x + 6y = 1600 (-)
x + 2y = 800
4y = 800

Therefore, y = 200.

Now apply value of y in (iii)

=> x + 2 x 200 = 800

=> x + 400 = 800

Therefore x = 400

Solving (i) and (ii) we get x = 400, y = 200.
Cost of 12 shirts = ₹ (12 x 200) = ₹ 2400.
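
The pairs of simultaneous equations in this part can also be solved mechanically. The sketch below uses Cramer's rule rather than the elimination steps shown in the solutions, purely as a cross-check; the function name `solve_2x2` is an assumption made for illustration.

```python
from fractions import Fraction

def solve_2x2(a1, b1, c1, a2, b2, c2):
    """Solve a1*x + b1*y = c1 and a2*x + b2*y = c2 by Cramer's rule."""
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("the equations have no unique solution")
    x = Fraction(c1 * b2 - c2 * b1, det)
    y = Fraction(a1 * c2 - a2 * c1, det)
    return x, y

# Sarees and shirts: 2x + 4y = 1600 and x + 6y = 1600.
saree, shirt = solve_2x2(2, 4, 1600, 1, 6, 1600)
print(saree, shirt, 12 * shirt)

# Examination rooms: x - y = 20 and x - 2y = -60.
room_a, room_b = solve_2x2(1, -1, 20, 1, -2, -60)
print(room_a, room_b)
```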

Part-3:- Language Competency
Ordering of Words
In each question below, there is a sentence of which some parts have been jumbled up. Rearrange these
parts which are labelled P, Q, R and S to produce the correct sentence. Choose the proper sequence.

1. When he
P : did not know
Q : he was nervous and
R : heard the hue and cry at midnight
S : what to do
The Proper sequence should be:

Answer: Option A

2. It has been established that
P : Einstein was
Q : although a great scientist
R : weak in arithmetic
S : right from his school days
The Proper sequence should be:

Answer: Option B

3. Then
P : it struck me
Q : of course
R : suitable it was
S : how eminently
The Proper sequence should be:

Answer: Option C

4. I read an advertisement that said
P : posh, air-conditioned
Q : gentleman of taste
R : are available for
S : fully furnished rooms
The Proper sequence should be:

Answer: Option B


5. Since the beginning of history
P : have managed to catch
Q : the Eskimos and Red Indians
R : by a very difficult method
S : a few specimens of this aquatic animal
The Proper sequence should be:
Answer: Option D
Completing Statements

In each question, an incomplete statement (stem) is given, followed by fillers. Pick the filler that
completes the stem correctly and meaningfully.

6. Despite his best efforts to conceal his anger ......
A. we could detect that he was very happy
B. he failed to give us an impression of his agony
C. he succeeded in camouflaging his emotions
D. he could succeed in doing it easily
E. people came to know that he was annoyed

Answer: Option E

7. Even if it rains I shall come means ......
A. if I come it will not rain
B. if it rains I shall not come
C. I will certainly come whether it rains or not
D. whenever there is rain I shall come
E. I am less likely to come if it rains

Answer: Option C

8. His appearance is unsmiling but ......
A. his heart is full of compassion for others
B. he looks very serious on most occasions
C. people are afraid of him
D. he is uncompromising on matters of task performance
E. he is full of jealousy towards his colleagues

Answer: Option A

9. She never visits any zoo because she is a strong opponent of the idea of ......
A. setting the animals free into forest
B. feeding the animals while others are watching
C. watching the animals in their natural abode
D. going out of the house on a holiday
E. holding the animals in captivity for our joy

Answer: Option E


10. I felt somewhat more relaxed ......
A. but tense as compared to earlier
B. and tense as compared to earlier
C. as there was already no tension at all
D. and tension-free as compared to earlier
E. because the worry had already captured my mind

Answer: Option D

Computer Competency

11) What is the name of the main printed circuit board in a computer? A) RAM B) Motherboard

Answer: Option B

12) To write, erase and rewrite data on a CD, what type of CD should you use?

Answer: Option A

13) A byte is equivalent to...? A) 8 bits B) 10 bits

Answer: Option A

14) Which of the following retains the information it's storing when the power to the system is turned
off? a) CPU b) ROM c) DRAM d) DIMM

Answer: Option B

15) Hard Disk, DVD, CD-ROM are the examples of what type of Memory? a) Primary b) Secondary

Answer: Option B

Environment Competency

16. Plants are killed at low temperature because :
(A) Desiccation takes place owing to the withdrawal of water from vacuolated protoplasm
(B) Precipitation of cell proteins
(C) Cells rupture due to the mechanical pressure of ice
(D) All the above three are correct

Answer: Option D

17. Which one of the chemicals is responsible for the reduction of ozone content of the atmosphere?
(A) SO2 (B) Chlorofluorocarbon (C) HCl (D) Photochemical smog

Answer: Option B

18. Acid rains occur when the atmosphere is heavily polluted with :
(A) CO, CO2 (B) Smoke particles (C) Ozone (D) SO2 and NO2

Answer: Option D

19. Spraying of DDT on crops causes pollution of:
(A) Soil and Water (B) Air and Soil (C) Crops and Air (D) Air and Water

Answer: Option A

20. Soil erosion can be prevented by :
(A) Increasing bird population (B) Afforestation (C) Removal of vegetation (D) Overgrazing

Answer: Option B

Logical Reasoning Competency

In these series, you will be looking at both the letter pattern and the number pattern. Fill the blank in the
middle of the series or end of the series.
1. SCD, TEF, UGH, ____, WKL
Answer & Explanation
Answer: Option C
There are two alphabetical series here. The first series is with the first letters only: STUVW. The second
series involves the remaining letters: CD, EF, GH, IJ, KL.

2. B2CD, _____, BCD4, B5CD, BC6D
A. B2C2D B. BC3D
C. B2C3D D. BCD7
Answer & Explanation
Answer: Option B
Because the letters are the same, concentrate on the number series, which is a simple 2, 3, 4, 5, 6 series,
and follows each letter in order.

3. FAG, GAF, HAI, IAH, ____
Answer & Explanation
Answer: Option A
The middle letters are static, so concentrate on the first and third letters. The series involves an
alphabetical order with a reversal of the letters. The first letters are in alphabetical order: F, G, H, I, J.
The second and fourth segments are reversals of the first and third segments. The missing segment
begins with a new letter.

4. ELFA, GLHA, ILJA, _____, MLNA
Answer & Explanation
Answer: Option D
The second and fourth letters in the series, L and A, are static. The first and third letters consist of an
alphabetical order beginning with the letter E.

5. CMM, EOO, GQQ, _____, KUU
Answer & Explanation
Answer: Option C
The first letters are in alphabetical order with a letter skipped in between each segment: C, E, G, I, K.
The second and third letters are repeated; they are also in order with a skipped letter: M, O, Q, S, U.
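
The explanations for series 1, 4 and 5 above all rest on the same idea: each letter position moves forward by a fixed number of letters from one segment to the next. A minimal sketch of that rule follows, assuming uppercase segments of equal length and a blank marked with None; the function name fill_blank is an illustrative assumption. It does not cover series 2 and 3, which follow different patterns (a moving digit and a letter reversal).

```python
def fill_blank(series):
    """Return the missing segment (marked None) of a letter series.

    Assumes every letter position advances by a constant step from one
    segment to the next, and that at least two segments are known.
    """
    blank = series.index(None)
    known = [(i, seg) for i, seg in enumerate(series) if seg is not None]
    (i0, s0), (i1, s1) = known[0], known[1]
    # Step per position between consecutive segments, from the first two known ones.
    steps = [(ord(b) - ord(a)) // (i1 - i0) for a, b in zip(s0, s1)]
    # Shift each letter of the first known segment to the blank's position,
    # wrapping around the alphabet if necessary.
    return "".join(
        chr((ord(c) - ord("A") + step * (blank - i0)) % 26 + ord("A"))
        for c, step in zip(s0, steps)
    )


print(fill_blank(["SCD", "TEF", "UGH", None, "WKL"]))
print(fill_blank(["ELFA", "GLHA", "ILJA", None, "MLNA"]))
print(fill_blank(["CMM", "EOO", "GQQ", None, "KUU"]))
```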

Part-4:- Data Interpretation

Bar Charts
The bar graph given below shows the sales of books (in thousand number) from six branches of a
publishing company during two consecutive years 2000 and 2001. Sales of Books (in thousand
numbers) from Six Branches - B1, B2, B3, B4, B5 and B6 of a publishing Company in 2000 and 2001.

1. What is the ratio of the total sales of branch B2 for both years to the total sales of branch B4 for both
years?
A. 2:3 B. 3:5
C. 4:5 D. 7:9
Answer & Explanation
Answer: Option D
Required ratio = (75 + 65) / (85 + 95) = 140/180 = 7/9.

2. Total sales of branch B6 for both the years is what percent of the total sales of branches B3 for both
the years?

A. 68.54% B. 71.11%
C. 73.17% D. 75.55%
Answer & Explanation
Answer: Option C
Required percentage = [(70 + 80) / (95 + 110) × 100]%
= (150/205 × 100)%
= 73.17%.

3. What percent of the average sales of branches B1, B2 and B3 in 2001 is the average sales of branches
B1, B3 and B6 in 2000?
A. 75% B. 77.5%
C. 82.5% D. 87.5%
Answer & Explanation
Answer: Option D
Average sales (in thousand number) of branches B1, B3 and B6 in 2000
= 1/3 × (80 + 70 + 95) = 245/3.
Average sales (in thousand number) of branches B1, B2 and B3 in 2001
= 1/3 × (105 + 65 + 110) = 280/3.
Required percentage = [(245/3) / (280/3) × 100]% = (245/280 × 100)% = 87.5%.

4. What is the average sales of all the branches (in thousand numbers) for the year 2000?

A. 73 B. 80
C. 83 D. 88
Answer & Explanation
Answer: Option B
Average sales of all the six branches (in thousand numbers) for the year 2000
= 1/6 × [80 + 75 + 95 + 85 + 75 + 70]
= 80.

5. Total sales of branches B1, B3 and B5 together for both the years (in thousand numbers) is?
A. 250 B. 310
C. 435 D. 560
Answer & Explanation
Answer: Option D
Total sales of branches B1, B3 and B5 for both the years (in thousand numbers)
= (80 + 105) + (95 + 110) + (75 + 95)
= 560.
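
The five bar-chart answers above can be re-derived in a few lines. The sketch below is only a check under one assumption: the sales figures are the ones quoted in the worked solutions, since the chart itself is not reproduced in the text. All names are illustrative.

```python
from fractions import Fraction

# Sales in thousands, taken from the figures used in the worked solutions.
sales_2000 = {"B1": 80, "B2": 75, "B3": 95, "B4": 85, "B5": 75, "B6": 70}
sales_2001 = {"B1": 105, "B2": 65, "B3": 110, "B4": 95, "B5": 95, "B6": 80}


def both_years(branch):
    """Total sales of one branch over 2000 and 2001."""
    return sales_2000[branch] + sales_2001[branch]


def average(values):
    """Exact arithmetic mean of an iterable of numbers."""
    values = list(values)
    return Fraction(sum(values), len(values))


q1_ratio = Fraction(both_years("B2"), both_years("B4"))
q2_percent = Fraction(both_years("B6"), both_years("B3")) * 100
q3_percent = (
    average(sales_2000[b] for b in ("B1", "B3", "B6"))
    / average(sales_2001[b] for b in ("B1", "B2", "B3"))
    * 100
)
q4_average = average(sales_2000.values())
q5_total = sum(both_years(b) for b in ("B1", "B3", "B5"))

print(q1_ratio, round(float(q2_percent), 2), float(q3_percent), q4_average, q5_total)
```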

Pie Charts
The following pie-chart shows the percentage distribution of the expenditure incurred in publishing a
book. Study the pie-chart and answer the questions based on it.
Various Expenditures (in percentage) Incurred in Publishing a Book

6. If for a certain quantity of books, the publisher has to pay Rs. 30,600 as printing cost, then what will
be the amount of royalty to be paid for these books?

A. Rs. 19,450 B. Rs. 21,200
C. Rs. 22,950 D. Rs. 26,150
Answer & Explanation
Answer: Option C
Let the amount of Royalty to be paid for these books be Rs. r.
Then, 20 : 15 = 30600 : r, so r = Rs. (30600 × 15 / 20) = Rs. 22,950.

7. What is the central angle of the sector corresponding to the expenditure incurred on Royalty?
A. 15° B. 24°
C. 54° D. 48°
Answer & Explanation
Answer: Option C
Central angle corresponding to Royalty = (15% of 360)°
= (15/100 × 360)°
= 54°.


8. The price of the book is marked 20% above the C.P. If the marked price of the book is Rs. 180, then what
is the cost of the paper used in a single copy of the book?
A. Rs. 36 B. Rs. 37.50
C. Rs. 42 D. Rs. 44.25
Answer & Explanation
Answer: Option B
Clearly, marked price of the book = 120% of C.P.
Also, cost of paper = 25% of C.P.
Let the cost of paper for a single book be Rs. n.
Then, 120 : 25 = 180 : n, so n = Rs. (25 × 180 / 120) = Rs. 37.50.

9. If 5500 copies are published and the transportation cost on them amounts to Rs. 82500, then what
should be the selling price of the book so that the publisher can earn a profit of 25%?

A. Rs. 187.50 B. Rs. 191.50
C. Rs. 175 D. Rs. 180
Answer & Explanation
Answer: Option A
For the publisher to earn a profit of 25%, S.P. = 125% of C.P.
Also, Transportation Cost = 10% of C.P.
Let the S.P. of 5500 books be Rs. x.
Then, 10 : 125 = 82500 : x, so x = Rs. (125 × 82500 / 10) = Rs. 1031250.
S.P. of one book = Rs. (1031250 / 5500) = Rs. 187.50.

10. Royalty on the book is less than the printing cost by:
A. 5% B. 33

C. 20% D. 25%
Answer & Explanation
Answer: Option D
Printing Cost of book = 20% of C.P.
Royalty on book = 15% of C.P.
Difference = (20% of C.P.) - (15% of C.P.) = 5% of C.P.
Percentage difference = (Difference / Printing Cost × 100)%
= (5% of C.P. / Printing Cost × 100)% = 25%.
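
All the pie-chart answers above come from simple proportions between sector percentages. The sketch below repeats them as a check, assuming only the four shares that the worked solutions quote (the remaining sectors of the chart are not stated in the text and are left out). The dictionary and function names are illustrative.

```python
from fractions import Fraction

# Percentage shares of the cost price (C.P.) quoted in the worked solutions.
share = {"printing": 20, "royalty": 15, "paper": 25, "transportation": 10}


def scale(known_amount, known_item, wanted_item):
    """Amount spent on wanted_item, given the amount spent on known_item."""
    return Fraction(known_amount) * share[wanted_item] / share[known_item]


# Q6: royalty when the printing cost is 30,600.
q6_royalty = scale(30600, "printing", "royalty")

# Q7: central angle of the royalty sector, in degrees.
q7_angle = Fraction(share["royalty"], 100) * 360

# Q8: marked price 180 is 120% of C.P.; paper is 25% of C.P.
q8_cost_price = Fraction(180) * 100 / 120
q8_paper = q8_cost_price * share["paper"] / 100

# Q9: transportation (10% of C.P.) is 82,500 for 5500 copies; S.P. is 125% of C.P.
q9_total_cp = Fraction(82500) * 100 / share["transportation"]
q9_sp_each = q9_total_cp * 125 / 100 / 5500

# Q10: royalty falls short of printing cost by this percentage of printing cost.
q10_percent = Fraction(share["printing"] - share["royalty"], share["printing"]) * 100

print(q6_royalty, q7_angle, float(q8_paper), float(q9_sp_each), q10_percent)
```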

Line Charts
The following line graph gives the percentage of the number of candidates who qualified an
examination out of the total number of candidates who appeared for the examination over a period of
seven years from 1994 to 2000.
Percentage of Candidates Qualified to Appeared in an Examination Over the Years

11. The difference between the percentage of candidates qualified to appeared was maximum in which
of the following pairs of years?
A. 1994 and 1995 B. 1997 and 1998
C. 1998 and 1999 D. 1999 and 2000
Answer & Explanation
Answer: Option B
The differences between the percentages of candidates qualified to appeared for the given pairs of years are:
For 1994 and 1995 = 50 - 30 = 20.
For 1998 and 1999 = 80 - 80 = 0.
For 1994 and 1997 = 50 - 30 = 20.
For 1997 and 1998 = 80 - 50 = 30.
For 1999 and 2000 = 80 - 60 = 20.
Thus, the maximum difference is between the years 1997 and 1998.


12. In which pair of years was the number of candidates qualified, the same?
A. 1995 and 1997 B. 1995 and 2000
C. 1998 and 1999 D. Data inadequate
Answer & Explanation
Answer: Option D
The graph gives the data for the percentage of candidates qualified to appeared, and unless the
absolute number of candidates qualified or candidates appeared is known, we cannot
compare the absolute values for any two years.
Hence, the data is inadequate to solve this question.

13. If the number of candidates qualified in 1998 was 21200, what was the number of candidates
appeared in 1998?
A. 32000 B. 28500
C. 26500 D. 25000
Answer & Explanation
Answer: Option C
Let the number of candidates appeared in 1998 be x.
Then, 80% of x = 21200, so x = (21200 × 100) / 80
= 26500 (required number).

14. If the total number of candidates appeared in 1996 and 1997 together was 47400, then the total
number of candidates qualified in these two years together was?
A. 34700 B. 32100
C. 31500 D. Data inadequate
Answer & Explanation
Answer: Option D
The total number of candidates qualified in 1996 and 1997 together, cannot be determined until we
know at least, the number of candidates appeared in any one of the two years 1996 or 1997 or the
percentage of candidates qualified to appeared in 1996 and 1997 together.
Hence, the data is inadequate.

15. The total number of candidates qualified in 1999 and 2000 together was 33500 and the number of
candidates appeared in 1999 was 26500. What was the number of candidates appeared in 2000?
A. 24500 B. 22000
C. 20500 D. 19000
Answer & Explanation
Answer: Option C
The number of candidates qualified in 1999 = (80% of 26500) = 21200.
Number of candidates qualified in 2000 = (33500 - 21200) = 12300.
Let the number of candidates appeared in 2000 be x.
Then, 60% of x = 12300, so x = (12300 × 100) / 60 = 20500.
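
The line-chart questions above all convert between a percentage and the two absolute counts behind it. A minimal sketch of that conversion follows, assuming the yearly percentages quoted in the worked solutions (the value for 1996 is not stated in the text, so it is omitted). The names are illustrative.

```python
from fractions import Fraction

# Percentage of candidates qualified to appeared, as quoted in the solutions.
pct = {1994: 30, 1995: 50, 1997: 50, 1998: 80, 1999: 80, 2000: 60}


def appeared_from_qualified(qualified, year):
    """Number appeared, given the number qualified in that year."""
    return Fraction(qualified) * 100 / pct[year]


def qualified_from_appeared(appeared, year):
    """Number qualified, given the number appeared in that year."""
    return Fraction(appeared) * pct[year] / 100


# Q11: the option pair with the largest difference in percentage.
pairs = [(1994, 1995), (1997, 1998), (1998, 1999), (1999, 2000)]
q11_pair = max(pairs, key=lambda p: abs(pct[p[1]] - pct[p[0]]))

# Q13: 21,200 qualified in 1998.
q13_appeared = appeared_from_qualified(21200, 1998)

# Q15: 33,500 qualified in 1999 and 2000 together; 26,500 appeared in 1999.
q15_qualified_2000 = 33500 - qualified_from_appeared(26500, 1999)
q15_appeared_2000 = appeared_from_qualified(q15_qualified_2000, 2000)

print(q11_pair, q13_appeared, q15_qualified_2000, q15_appeared_2000)
```

Questions 12 and 14 cannot be checked this way, for the reason the solutions give: the chart supplies only percentages, not the counts for each year.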


Table Charts

Study the following table and answer the questions based on it.
Expenditures of a Company (in Lakh Rupees) per Annum Over the Given Years.

                        Item of Expenditure
Year    Salary    Fuel and Transport    Bonus    Interest on Loans    Taxes
1998    288       98                    3.00     23.4                 83
1999    342       112                   2.52     32.5                 108
2000    324       101                   3.84     41.6                 74
2001    336       133                   3.68     36.4                 88
2002    420       142                   3.96     49.4                 98

16. What is the average amount of interest per year which the company had to pay during
this period?
A. Rs. 32.43 lakhs B. Rs. 33.72 lakhs
C. Rs. 34.18 lakhs D. Rs. 36.66 lakhs
Answer & Explanation
Answer: Option D
Average amount of interest paid by the Company during the given period
= Rs. [(23.4 + 32.5 + 41.6 + 36.4 + 49.4) / 5] lakhs
= Rs. (183.3 / 5) lakhs
= Rs. 36.66 lakhs.

17. The total amount of bonus paid by the company during the given period is approximately what
percent of the total amount of salary paid during this period?
A. 0.1% B. 0.5%
C. 1% D. 1.25%
Answer & Explanation
Answer: Option C
Required percentage = [(3.00 + 2.52 + 3.84 + 3.68 + 3.96) / (288 + 342 + 324 + 336 + 420) × 100]%
= (17 / 1710 × 100)%
= 1% (approximately).


18. Total expenditure on all these items in 1998 was approximately what percent of the total
expenditure in 2002?

A. 62% B. 66%
C. 69% D. 71%
Answer & Explanation
Answer: Option C
Required percentage = [(288 + 98 + 3.00 + 23.4 + 83) / (420 + 142 + 3.96 + 49.4 + 98) × 100]%
= (495.4 / 713.36 × 100)%
= 69.45%.

19. The total expenditure of the company over these items during the year 2000 is?
A. Rs. 544.44 lakhs B. Rs. 501.11 lakhs
C. Rs. 446.46 lakhs D. Rs. 478.87 lakhs
Answer & Explanation
Answer: Option A
Total expenditure of the Company during 2000
= Rs. (324 + 101 + 3.84 + 41.6 + 74) lakhs
= Rs. 544.44 lakhs.

20. The ratio between the total expenditure on Taxes for all the years and the total expenditure on Fuel
and Transport for all the years respectively is approximately?
A. 4:7 B. 10:13
C. 15:18 D. 5:8
Answer & Explanation
Answer: Option B
Required ratio = (83 + 108 + 74 + 88 + 98) / (98 + 112 + 101 + 133 + 142)
= 451/586
= 10/13 (approximately).
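
The table questions above reduce to row totals (one year, all items) and column totals (one item, all years). The sketch below recomputes them from the table as a check. The field names are illustrative assumptions, and values are parsed as exact fractions so the decimal figures compare cleanly.

```python
from fractions import Fraction

ITEMS = ("salary", "fuel_transport", "bonus", "interest", "taxes")

# Expenditure in lakh rupees, copied from the table above.
rows = {
    1998: ("288", "98", "3.00", "23.4", "83"),
    1999: ("342", "112", "2.52", "32.5", "108"),
    2000: ("324", "101", "3.84", "41.6", "74"),
    2001: ("336", "133", "3.68", "36.4", "88"),
    2002: ("420", "142", "3.96", "49.4", "98"),
}
data = {year: dict(zip(ITEMS, map(Fraction, values))) for year, values in rows.items()}


def item_total(item):
    """Total spent on one item over all the years (a column total)."""
    return sum(data[year][item] for year in data)


def year_total(year):
    """Total spent on all items in one year (a row total)."""
    return sum(data[year].values())


q16_avg_interest = item_total("interest") / len(data)
q17_bonus_pct = item_total("bonus") / item_total("salary") * 100
q18_pct = year_total(1998) / year_total(2002) * 100
q19_total_2000 = year_total(2000)
q20_ratio = item_total("taxes") / item_total("fuel_transport")

print(float(q16_avg_interest), round(float(q17_bonus_pct), 2),
      round(float(q18_pct), 2), float(q19_total_2000), q20_ratio)
```

The exact ratio for question 20 is 451/586, which is close to, but not exactly, the listed option 10:13.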






[More Solved MS-Word PET/PAT Question
Papers can be found in the enclosed CD ]

Most Likely Asked Questions for PhD Entrance Interview (at the time of
synopsis submission) and Final Open Defense Viva Voce Examination

A Few Ice-Breaking Questions
1. Tell me about yourself.
2. What are your strengths and weaknesses?
3. What is the difference between confidence and overconfidence?
4. What is the difference between hard work and smart work?
5. What are your goals? What motivates you to do a good job?
6. Give me an example of your creativity.
7. Who has inspired you in your life and why?
8. What was the toughest decision you ever had to make?

A Few Research Appetite Questions
9. Why do you want to do a Ph.D? What will be your Ph.D topic? Why have you chosen this topic?
10. What are going to be the steps of your research work?
11. What is your aim behind doing this research?
12. What are the objectives behind your study? What is the importance of this study?
13. What are the scope and limitations of your study?
14. What is going to be your research area?
15. What benefits are the masses going to derive from your study?
16. Have you chosen any specific area or sector to conduct your research work?
17. Have you chosen any particular organization or institution for conducting the research work?
18. Will this organization give you permission to conduct this study?
19. Why do you want to do your research work in this particular sector?
20. If you are not in the education sector, then why do you want to do a Ph.D?
21. Do you have work experience? If so, how will it help in your research work?
22. What do you understand by the term "research"? What are the various stages in the development
of research?
23. Explain the stages in the research process with the help of a flow chart of the research process.
24. Define the term 'hypothesis'. What will be your hypothesis?
25. Define a hypothesis. Discuss the importance of a hypothesis in research and the process of
forming a hypothesis.
26. What are your primary and secondary sources for data collection?

27. How much time do you think will be required to complete the research work?
28. What is going to be the sample size? Have you decided upon the demographics of the sample?
29. What is going to be the research methodology of your study?
30. How will you analyse the data collected during the research work?
31. Explain a few sources and methods of primary and secondary data collection.
32. How is the analysis of your study going to help or guide future researchers?
33. The textbook says that one does not start by writing questions. How should the researcher begin?
34. Define sampling. State briefly the various methods of sampling.
35. A researcher is interested in knowing the answer to a "why" question, but does not know what sort
of answer will be satisfying. Is this exploratory, descriptive, or causal research? Explain.
36. What are the major characteristics of sampling? State the types of sampling with suitable
illustrations.
37. What is the task of problem definition? The city police wish to understand their image from the
public's point of view. Define the business problem.
38. How do you recognize a research problem? Describe the criteria of a good research problem.
39. With the help of examples, classify survey research methods.
40. Discuss the use of self-administered questionnaires along with their
41. Design a complete questionnaire to evaluate job satisfaction of entry-level marketing executives.
42. Define the interviewing and the questionnaire techniques of data collection.
43. Define and classify secondary data. Discuss the process of evaluating secondary data.
44. Discuss the various contents required in the layout of an Internet questionnaire.
45. Discuss various factors that influence the validity of experimental studies in
46. What type of research should be conducted? Give reasons to support your answer.
47. Design the research process in detail. Support your answer with a flow diagram.
48. Explain the significance of statistical tools in the interpretation of data. What its
49. Discuss briefly the various methods of data collection. What steps will you follow while writing
a research report?
50. Define a research report. Explain the characteristics of a good research report.