DBA1657-Research Methods in Business
UNIT - 1
NOTES
Introduction to Research
Unit structure:
1.1 Introduction
1.2 Learning objectives
1.3 Definition of research
1.4 Importance of research
1.5 Hallmarks of scientific research
1.6 The Building Blocks of Science in research
1.7 Research process: An Overview
1.7.1. Defining the research problem
1.7.2. Establishing research objectives
1.7.3 Developing the research design
1.7.4 Preparing a research proposal
1.7.5 Data Collection
1.7.6 Data Analysis and Interpretation
1.7.7 Research report
1.8 Theoretical Framework
1.8.1 Types of Variables
1.8.2 Theoretical framework: The need and features
1.9 Hypotheses: Types and testing
1.10 Research Design
1.10.1 The need for research design
1.10.2 Classification of research designs
-------------------------------------------------------------------------------------
1.1 INTRODUCTION
Managers are mostly involved in studying and analyzing issues that lead to
decision making. They are involved in some form of research for making an appropriate
decision. Decision making today is complicated and complex. There is a vast flow of information, enabled by data mining and warehousing, which provides vital input for decision making. The success or failure of a business decision depends on the data associated with the decision. Decisions can be made in an objective or subjective manner. Objective decision making is rational and scientific. To arrive at objective decisions, business managers often involve themselves in some form of research.
Research is simply the process of finding solutions to a problem after a thorough
study and analysis of the situational and other related factors. Business research is a
systematic and organized effort to investigate a specific problem encountered in the
work setting that needs a solution. It comprises a series of steps designed and executed
with the goal of finding answers to the issues that are of concern to the manager.
This unit provides a basic understanding of research, the process involved and
the steps involved in the development and testing of hypotheses. Further, the need for and the major types of research design are dealt with in detail.
1.3 DEFINITION OF RESEARCH

Research is an organized, systematic investigation into a specific problem undertaken with the purpose of finding solutions to
it. Research provides the needed information that guides managers to make informed
decisions to successfully deal with problems.
1.4 IMPORTANCE OF RESEARCH

The business world today is complicated and complex. In this context, research enables the manager to face the competitive global market with greater confidence. Research enables managers to consider the available information in a sophisticated and creative way.
Managers need to understand, predict and control events that are dysfunctional to the organization. Research enables them to understand, predict and control the environment.

Research enables managers to sense, spot and deal with problems before they get out of hand.
Organizations may not be able to solve all the problems they encounter in-house, and consultants may be engaged for expert advice. The manager needs knowledge of research to interact with research consultants effectively and to get the maximum benefit from them.
Not all published research findings can be accepted as such. The soundness of findings should be evaluated before decisions are made on their basis. Managers need to know about research in order to evaluate and discriminate among research findings based on the soundness of the methodology.
Knowledge of research and research methods sensitizes managers to the various variables operating in a situation and reminds them of the multicausality and multifinality of situations, thereby helping them avoid inappropriate, simplistic notions of one variable causing another.
Knowledge about scientific investigation enables managers to eliminate or avoid making decisions in a subjective or biased manner.
Knowledge about research helps the manager understand the need to share pertinent information with the research consultants.
1.5 HALLMARKS OF SCIENTIFIC RESEARCH
1.5.1 Purposiveness
Research is conducted with a purpose; it has a focus. The purpose of the research should be stated clearly, in an understandable and unambiguous manner. The statement of the decision problem should include its scope, its limitations and the precise meaning of all words and terms significant to the research. Failure to state the purpose clearly will raise doubts in the minds of the stakeholders of the research as to whether the researcher has a sufficient understanding of the problem.
1.5.2. Rigor
Rigor means carefulness, scrupulousness and the degree of exactness in research
investigation. In order to make a meaningful and worthwhile contribution to the field of
knowledge, research must be carried out rigorously. Conducting rigorous research requires good theoretical knowledge and a clearly laid out methodology. This will eliminate bias and facilitate proper data collection and analysis, which in turn lead to sound and reliable research findings.
1.5.3. Testability
Research should be based on testable assumptions/hypotheses developed after
a careful study of the problems involved. Scientific research should enable the testing of logically developed hypotheses to see whether or not the data collected support the hypotheses developed.
Anna University Chennai
1.5.4. Replicability
Research findings command more faith and credence if the same results are obtained on different sets of data. The results of hypothesis testing should be supported again and again when the same type of research is repeated in other, similar circumstances. This ensures the scientific nature of the research conducted, and more confidence can be placed in the research findings. It also removes the doubt that the hypotheses were supported merely by chance and ensures that the findings reflect the true state of affairs.
1.5.6. Objectivity
Research findings should be factual, data-based and free from bias. The conclusions drawn should be based on the facts of the findings derived from the actual data, and not on subjective or emotional values. Business organizations will suffer greater damage if non-data-based or misleading conclusions drawn from research are implemented. A scientific approach ensures the objectivity of research.
1.5.7. Generalizability
It refers to the scope for applying the research findings of one organizational setting to other settings of an almost similar nature. Research is more useful if the solutions are applicable to a wider range of settings. The more generalizable the research, the greater its usefulness and value. However, it is not always possible to generalize research findings to all other settings, situations or organizations. To achieve generalizability, the sampling design has to be logically developed and the data collection method needs to be very sound. This may increase the cost of conducting the research. In most cases, though the research findings are based on scientific methods, they are applicable only to a particular organization, setting or situation.
1.5.8. Parsimony
Research needs to be conducted in a parsimonious, i.e., simple and economical, manner. Simplicity in explaining the problems and generalizing solutions for them is preferred to a complex research framework. Economy in research models can be achieved by considering a smaller number of variables that explain greater variance rather than a larger number of variables that explain less variance. A clear understanding of the problem and the factors influencing it will lead to parsimony in research activities. Sound understanding can be achieved through structured and unstructured interviews with the people concerned and by undertaking a study of the related literature in the problem area.
Scientific research in the management area cannot fulfil all the above hallmarks to the fullest extent. In management research it is not always possible to conduct investigations that are 100% scientific, as in the physical sciences, because it is difficult to collect and measure data regarding feelings, emotions, attitudes and perceptions. It is also difficult to obtain a representative sample. These aspects restrict the generalizability of the findings. Though it is not possible to meet all the above characteristics of scientific research, to the extent possible research activities should be pursued in a scientific manner.
Deduction

In deduction, the conclusions drawn must necessarily be based on reasons. The reasons are said to imply the conclusions and
represent a proof. The bond between the reasons and conclusions is much stronger
than in the case of induction. To be correct, a deduction should be both valid and true.
True in the sense that the reasons given for the conclusion must agree with the real world; valid in the sense that the conclusion must necessarily follow from the reasons.
Researchers often use deduction to reason out the implications of various acts and conditions. For example, in a survey a researcher may reason as follows:
Reason 1: Surveying households in an urban area is difficult and expensive.
Reason 2: The study involves interviews with households in an urban area.
Conclusion: The study will be difficult and expensive.
Induction
Induction is a process in which a certain phenomenon is observed and, on that basis, conclusions are arrived at. The conclusions are drawn from one or more facts or pieces of evidence. The conclusions of induction result in hypotheses. Induction leads to the establishment of a general proposition based on observed facts. For example, a researcher observes that production is the prime feature of factories. It is therefore concluded that factories exist for production purposes.
Research is based on both deduction and induction. It helps us to understand,
explain and predict business phenomena.
The building blocks of scientific inquiry include the following sequences:
1. Observing a phenomenon
2. Identifying a problem
3. Constructing a theory
4. Developing hypotheses
5. Developing research design
6. Collecting data
7. Analyzing data and
8. Interpreting results
Observation of a phenomenon may be casual or purposeful. A casual scanning of the environment may lead us to knowledge of interesting facts. This observation may lead to identifying a problem in the concerned area. Problem identification requires gathering primary data from the customers, employees or management concerned with the particular problem. Further insights may be obtained to refine the problem in a more specific manner. The next step is to build a conceptual
model or theoretical framework taking into consideration all the factors contributing to
the problem. The framework enables the researcher to integrate all the information collected in a meaningful manner. From this theoretical framework several hypotheses can be generated
and tested to support the concept. A research design provides the blue print of the
mechanism or insight regarding the methods of collecting data, analyzing the same and
interpreting them in order to solve the problem.
The building blocks of science discussed above provide the genesis for the hypothetico-deductive method of scientific inquiry. The steps are discussed below:
1. Observation
Observation is the first stage in scientific investigation. In this process, the
researcher takes into account the changes that are occurring in the environment. To
proceed further the changes observed in the environment should have important
consequences. The changes may take the form of a sudden drop in sales, an increase in employee turnover, a decrease in the number of customers and the like.
2. Preliminary information gathering
This involves seeking in depth information regarding the facts being observed.
The information may be gathered through formal questionnaires, interview schedules or through informal or casual talk with the people concerned. Desk research may also be
conducted to enrich the information gathered. The next step is to make sense out of the
factors identified in the information gathering stage by assembling them together in a
meaningful manner.
3. Formulation of theory
Theory formulation enables the researcher to integrate all the information in a logical manner so as to conceptualize and test the factors responsible for the problem. The critical
variables contributing to the problem are examined. The association or relationship
among the variables contributing to the problem is studied in order to formulate the
theory.
4. Developing Hypotheses
The next logical step is the framing of testable hypotheses. Hypothesis testing is called deductive research. Sometimes hypotheses that were not originally formulated are generated through the process of induction: after the collection of data, an insight may occur on the basis of which new hypotheses can be formulated. Thus hypothesis testing through deductive research and hypothesis generation through induction are both common.
5. Data Collection
After the hypotheses are developed, the data with respect to each variable in the hypotheses need to be obtained in a scientific manner so as to test them. Both primary and secondary sources can be explored in order to collect the data. Data should be collected on every variable in the theoretical framework from which the hypotheses were generated.
6. Data Analysis
The data gathered are to be statistically analyzed to validate the hypotheses postulated. Both qualitative and quantitative data need to be analyzed. Qualitative data refer to information gathered through interviews and observations. Through scaling techniques, qualitative data can be converted into quantifiable form and subjected to analysis. Appropriate statistical tools should be used to analyze the data.
7. Deduction
Deduction is the process of arriving at conclusions by interpreting the meaning of the results of the data analysis. Based on the deductions, recommendations can be made to solve the problem encountered.
There must be an individual or a group that faces some difficulty or problem.

There must be alternative means or courses of action for attaining the objectives.

There must be some doubt in the mind of the researcher with regard to the selection of alternatives.
The researcher should be familiar with the subject chosen for research and should have enough knowledge, qualification and training in the selected problem area. The resources needed to solve the problem in terms of time, money, effort and manpower should be taken into account before embarking on a problem.

The subject of research should be familiar and feasible, so that related research material or sources of research can be obtained easily.
Research problems trigger the research process. Defining the research problem
is a critical activity. A thorough understanding of research problem is a must for achieving
success in the research endeavor. Defining the research problem begins with identifying
the basic dilemma that prompts the research. It can be developed further by progressively breaking the original dilemma down into more specific, focused objectives. Five steps can be envisaged: (1) identifying the broad problem area, (2) literature review, (3) identifying the research question, (4) refining the research question and (5) developing investigative questions. They are discussed below:
1.7.1.1 Identifying the broad problem area
The process begins with specifying the problem at the most general level, e.g., declining sales, increased cost, increased employee turnover, etc. From this general specification of the problem, the next step is to move towards the question. The question restates the general problem, e.g., what is the reason for declining sales? The questions that can be raised can be grouped into three categories: (1) choice of purposes or objectives, where the question focuses on what objectives the researcher wishes to achieve by conducting the research; (2) generation and evaluation of solutions, where the question focuses on the alternatives available to solve the problem in hand; and (3) troubleshooting or control situations, where the query focuses on monitoring and diagnosing why an organization is not achieving its established goals.
The researcher can identify the problem through the following sources:
The above techniques enable the researcher to understand the problem better and to outline the possible variables that might exert an influence.
The nature of information needed by the researcher could be broadly classified under
three headings:
1. Background information on the organization for which the research is conducted, viz., the origin and history of the company, its assets, number of employees, location, etc. This information can be obtained from company records, published data, the Census of Business and Industry and the web.

2. Information regarding managerial philosophy, company policies and other structural aspects, which can be collected by asking direct questions of the management.

3. Information regarding the perceptions, attitudes and behavioural aspects of employees, which can be obtained by way of observations, interviews and questionnaires.
1.7.1.2 Literature survey
A literature survey is the review of published and unpublished work from secondary sources in the area of interest to the researcher. The purposes of conducting a literature survey at this stage are:

To ensure that no variable taken up in past related studies is ignored.

To avoid conducting a similar type of study, thereby stopping the researcher from investing resources of time and effort in a research question that has already been answered.

To provide a good framework and a solid foundation to proceed further in the investigation.

To enhance the testability and replicability of the findings of the current research.

To aid clear conceptualization.
exploration and question revision to refine the original question and generate the material
for constructing the investigative questions.
1.7.1.5 Developing investigative questions
Investigative questions are questions that the researcher must answer
satisfactorily to arrive at a conclusion about the research question. To formulate them,
the researcher should break down the research question into more specific questions
for which data are to be gathered. This fractioning process can be continued down to several levels of increasing specificity. The investigative questions guide the development of a suitable research design and are the foundation for creating the research data collection instrument. In developing the investigative questions, performance considerations, attitudinal issues and behavioural issues can be included, depending on the research problem.
The problems in defining research questions
There might be some problems in defining the research questions which are
discussed below:
Some problems are complex, value-laden and bound by constraints. These ill-defined questions have characteristics that are virtually the opposite of those of well-defined problems. Such problems require a thorough exploratory study before proceeding.
Secondary data are historical data previously collected and assembled for some other research problem. Secondary data can usually be gathered in a faster and more economical manner than primary data. However, the data may not fit the researcher's information needs. Secondary data can be obtained from libraries, websites, and published as well as unpublished documents.
1.7.3.3 Sampling methodology and procedure
Sampling refers to selecting a subgroup of people or objects from the overall membership pool of a defined target population. Sampling plans can be broadly classified into probability and non-probability sampling. In a probability sampling plan, each member of the defined target population has a known, nonzero chance of being drawn into the sample group. Probability sampling gives the researcher the opportunity to assess the sampling error. In the case of non-probability sampling, the research findings cannot be generalized and the sampling error cannot be assessed; the findings are limited to the sample that provided the original raw data. However, non-probability sampling may be the only choice in cases where the population cannot be ascertained. (Sampling is dealt with in more detail in Unit 3.)
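The simplest probability sampling plan, simple random sampling, can be sketched in a few lines using Python's standard library. The population of 1,000 customer IDs and the sample size of 100 below are made-up numbers for illustration only.

```python
import random

# Hypothetical target population: 1,000 customer IDs.
population = list(range(1, 1001))

random.seed(42)  # fixed seed so the draw is reproducible

# Simple random sampling: every member has the same known chance
# (100/1000 = 10%) of being drawn into the sample.
sample = random.sample(population, k=100)

print(len(sample))       # 100
print(len(set(sample)))  # 100 -- drawn without replacement
```

Because every member's selection probability is known, the sampling error of estimates computed from such a sample can be assessed, which is exactly what non-probability plans do not allow.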
1.7.3.4 The time schedule and the budget
The time schedule for completing the research, along with the breakup of time required for each task, has to be ascertained. Scheduling will enable completion of the project in time. A budget displays the sources and application of funds for the research. The budget may require less attention in the case of an in-house project or one which is financed by the researcher. However, a budget prepared for financial grants needs to be prepared very systematically, supported with proper documentation. The budget may be prepared on various bases, e.g., rule-of-thumb budgeting, where a fixed percentage is arrived at on some criterion such as a percentage of sales or the previous year's research budget. Task budgeting selects specific research projects to support on an ad hoc basis.
The data gathering phase begins with pilot testing. It is done to detect weaknesses in the research design and questionnaire/interview schedule, and it provides proxy data for the selection of a probability sample. The pilot testing should simulate the procedures and protocols designed for data collection. If the study is to be conducted by email, then the pilot questionnaire should be emailed. The size of the pilot group may range from 25 to 100 respondents, who need not be statistically selected. There are a number of variations of pilot testing, some of which may be restricted to data collection only. One form is pretesting, where responses are collected from colleagues, respondent surrogates or actual respondents for the main purpose of refining the questionnaire. Based on the pilot testing, the questionnaire may be redesigned, rephrased and improved. Pretesting may be repeated many times to refine questions or procedures.
Data are the facts presented to the researcher from the study environment. Data can be gathered from a single location or from all over the world, based on the research objectives and the resource allocation. Data collection methods range from observation and questionnaires to laboratory notes and other modern instruments and devices. Data can be characterized by their abstractness, verifiability, elusiveness and closeness to the phenomenon. As abstractions, data are more metaphorical than real. When sensory experiences consistently produce the same result, the data are said to be trustworthy because they can be verified. Data capture is elusive, complicated by the speed at which events occur and the time-bound nature of observation. Data reflect their truthfulness through their degree of closeness to the phenomenon. Secondary data have at least one level of interpretation inserted between the event and its recording; primary data are closest to the truth.
Collected data need to be edited to ensure consistency and to locate omissions. In the case of the survey method, editing reduces errors in recording, improves legibility and clarifies unclear and inappropriate responses. Edited data are then converted into analyzable form. Computers can be used to find missing data, validate data, edit and code, so that further analysis can be carried out in a valid manner.
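The editing step described above can be sketched programmatically. The records, field names and the 1-5 satisfaction scale below are hypothetical, chosen only to show how omissions and inconsistent responses might be flagged.

```python
# Hypothetical survey records; None marks an omitted answer and an
# out-of-range code illustrates an inconsistent entry.
records = [
    {"id": 1, "age": 34, "satisfaction": 4},
    {"id": 2, "age": None, "satisfaction": 5},  # omission
    {"id": 3, "age": 29, "satisfaction": 9},    # invalid: scale is 1-5
]

def edit_record(rec):
    """Flag missing values and responses outside the 1-5 scale."""
    problems = []
    if rec["age"] is None:
        problems.append("missing age")
    if rec["satisfaction"] is not None and not 1 <= rec["satisfaction"] <= 5:
        problems.append("satisfaction out of range")
    return problems

for rec in records:
    issues = edit_record(rec)
    if issues:
        print(rec["id"], issues)
```

Records flagged in this way can be queried back to the field staff or the respondent before the data are coded for analysis.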
Various statistical software packages are available to make the job of data analysis easier and more scientific. However, the interpretation needs to be made with expertise, as the recommendations are made on the basis of it.
A variable, as the name suggests, takes varied values. The values may differ at various times for the same object or person, or at the same time for different objects or persons. For example, age is a variable: it is different for different consumers, and for a single consumer it varies as time passes.
[Figure: Age (independent variable) → Choice of product (dependent variable)]
A moderating variable is one that has a strong contingent effect on the relationship between the independent and the dependent variable. The presence of such a third variable modifies the original relationship between the independent and dependent variables. In the example discussed above, the price of the product is a moderating variable: though age influences the choice of the product, price may moderate that choice.
[Figure: Age (independent variable) → Attitude (intervening variable) → Choice of product (dependent variable); Price acts as a moderating variable]
A theoretical framework elaborates the logic underlying the relations among the variables and describes the nature and direction of the relationships. The theoretical foundations provide the basis for developing testable hypotheses.
The variables influencing the research problem should be clearly identified, defined and discussed.

The discussion should also highlight the relationships between the variables so identified.

The reasons for assuming the type of relationship should be mentioned, drawing on the previous research studies identified through the literature review.

A model showing the relationships among the variables can be given so that the concepts can be visualized and understood clearly by the reader.
E.g., the education of the respondent does not have an influence on the importance given to the information source.

Non-directional hypotheses are formulated in cases where previous studies have not explored the direction of the relationship, or where there is no evidence on which to assume the direction of the relationship among the variables. Previous research studies may also give rise to conflicting findings, which is another reason for a non-directional hypothesis.
3. Null and alternative hypotheses

The null hypothesis states that there is no significant relationship between the variables. It also states that there is no difference between the population characteristics and the sample being studied. The null hypothesis implies that any difference between two sample groups, or any relationship between two variables based on our sample, is simply due to random sampling fluctuations and not due to true differences between the two population groups. The null hypotheses so formulated are tested for possible rejection. A null hypothesis may state, for example, that the population correlation between two variables is equal to zero, or that the difference between the means of two groups in the population is equal to zero.
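The idea that an observed difference may be due merely to random sampling fluctuations can be demonstrated with a small permutation sketch: if randomly relabelling the two groups rarely reproduces a difference as large as the one observed, the null hypothesis becomes implausible. The scores below are invented for illustration.

```python
import random

# Hypothetical scores for two sample groups of eight subjects each.
group_a = [12, 15, 14, 10, 13, 16, 11, 14]
group_b = [14, 17, 15, 16, 18, 15, 17, 16]

def mean(xs):
    return sum(xs) / len(xs)

# H0: the difference between the two group means is zero.
observed_diff = mean(group_b) - mean(group_a)

# Under H0 the group labels are interchangeable, so a difference at
# least this large should arise often under random relabelling.
random.seed(0)  # fixed seed for a reproducible result
pooled = group_a + group_b
trials, extreme = 10_000, 0
for _ in range(trials):
    random.shuffle(pooled)
    if abs(mean(pooled[:8]) - mean(pooled[8:])) >= abs(observed_diff):
        extreme += 1

p_value = extreme / trials
# A small p-value says the observed difference is unlikely to be a
# mere sampling fluctuation, so H0 is rejected.
print(observed_diff, p_value)
```

This permutation approach tests the same null hypothesis as the parametric tests discussed later in this unit, but without assuming a particular population distribution.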
Hypothesis generation and testing can be done through both induction and deduction. In deduction, the theoretical model is first developed, testable hypotheses are formulated on the basis of the theoretical framework, data are collected and then the hypotheses are tested. In the inductive process, new hypotheses are formulated on the basis of facts already collected, and these hypotheses are then subjected to testing. The findings add to knowledge and help to build a theoretical framework.
Bayesian statistics also use sampling data for decisions but go beyond them and consider all other available information. The additional information consists of subjective probability estimates stated in terms of degrees of belief. The subjective estimates are based on general experience rather than on specific data collected. They are expressed as a prior distribution that can be revised after sample information is gathered. The revised estimate, known as the posterior distribution, can be further revised by additional information, and so on. Various decision rules are established, cost and other estimates can be introduced, and the expected outcomes of the combinations of these elements are used to judge the decision alternatives.
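The prior-to-posterior revision described above can be sketched with the conjugate beta-binomial case, where the update is a simple parameter addition. The prior parameters and trial counts below are made-up numbers, chosen only to show the mechanics.

```python
# Prior belief about a success rate, expressed as a Beta(a, b)
# distribution; a=2, b=8 encodes a prior mean of 2/(2+8) = 0.2.
a, b = 2, 8
prior_mean = a / (a + b)

# Sample information: 30 trials, 12 successes (hypothetical data).
successes, trials = 12, 30

# Beta prior + binomial data -> Beta posterior (conjugate update):
# simply add successes to a and failures to b.
a_post = a + successes
b_post = b + (trials - successes)
posterior_mean = a_post / (a_post + b_post)

print(prior_mean, posterior_mean)

# The posterior can itself serve as the prior for the next batch of
# sample information, revising the estimate again, and so on.
```

The degrees-of-belief prior pulls the estimate away from the raw sample proportion (12/30 = 0.4) toward the prior mean, with the sample's weight growing as more data arrive.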
The observations must be independent, i.e., the selection of one case should not affect the chances of any other case being included in the sample.
Non-parametric tests have few assumptions. They are easy to understand and simple to use. They do not require a normally distributed population or homogeneity of variance. Some tests require independence of cases, while others are designed for related cases. Non-parametric tests are the only ones usable with nominal data, and they are the only technically correct tests to use with ordinal data. Non-parametric tests can also be used with interval and ratio data, although this wastes some of the information available. Non-parametric tests are nearly as efficient as parametric tests: a non-parametric test with a sample of 100 provides about the same statistical testing power as a parametric test with a sample of 95.
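As one concrete instance of a non-parametric test that is technically correct for ordinal data, a sign test can be sketched in a few lines. The before/after satisfaction ratings (1-5 scale) below are hypothetical.

```python
from math import comb

# Hypothetical before/after ratings of the same ten respondents
# on an ordinal 1-5 scale.
before = [3, 2, 4, 3, 2, 3, 4, 2, 3, 3]
after  = [4, 3, 4, 4, 3, 2, 5, 3, 4, 4]

# Sign test: keep only the direction of each change; drop ties.
signs = [1 if a > b else -1 for b, a in zip(before, after) if a != b]
n = len(signs)
pluses = sum(1 for s in signs if s == 1)

# Under H0, + and - are equally likely, so the number of pluses
# follows Binomial(n, 0.5); double the tail for a two-sided test.
p_one_sided = sum(comb(n, k) for k in range(pluses, n + 1)) / 2 ** n
p_value = min(1.0, 2 * p_one_sided)

print(n, pluses, p_value)
```

Only the direction of each paired change is used, never the magnitude, which is why the test remains valid when the scale is merely ordinal.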
I. One-sample Tests
One-sample tests are used when a single sample is taken and a test is undertaken to determine whether the sample comes from a specified population.

Parametric tests

The parametric Z or t-test can be used to determine the statistical significance of the difference between a sample distribution mean and a population parameter. When sample sizes exceed 120, the t and z distributions are virtually identical.
Non-parametric tests
Different types of non-parametric tests may be used in the one-sample case, depending on the measurement scale used and other conditions. If the measurement scale is nominal, the binomial or chi-square test can be used. The binomial test is appropriate when the population is viewed as having only two classes, such as male and female, or buyers and non-buyers, and all observations fall into one of these categories. The binomial test is useful when the sample size is so small that the chi-square test cannot be used.
Chi-square (χ²) test

The chi-square test is the most widely used non-parametric test of significance. It is particularly useful in tests involving nominal data but can also be used with higher scales. Using this technique, the significance of the differences between the observed distribution of data among categories and the expected distribution is tested under the null hypothesis. The test can be used with one sample, two independent samples or k independent samples. It must be calculated with actual counts rather than percentages. The formula for the chi-square test is

χ² = Σᵢ₌₁ᵏ (Oᵢ − Eᵢ)² / Eᵢ

where
Oᵢ = observed number of cases in the ith category
Eᵢ = expected number of cases in the ith category under H0
k = the number of categories
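The formula can be applied directly to actual counts. The figures below are hypothetical: 90 respondents classified into three brand-preference categories, with H0 positing an even split of 30 per category.

```python
# Hypothetical observed counts across k = 3 categories, against an
# even-split expectation under H0.
observed = [45, 30, 15]
expected = [30, 30, 30]

# chi-square = sum over categories of (O_i - E_i)^2 / E_i
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print(chi_square)  # (15**2 + 0 + 15**2) / 30 = 15.0
```

With k − 1 = 2 degrees of freedom, the critical value at the 0.05 level is 5.99, so H0 of an even split would be rejected for these counts.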
II. Two-independent samples tests
The need for two-independent-samples tests is often encountered in research. One can, for example, compare the purchasing predispositions of samples of subscribers from two magazines to discover whether they are from the same population.

Parametric tests

The z and t-tests are frequently used parametric tests for independent samples; however, the F test can also be used.
The z test is used with sample sizes exceeding 30 for both independent samples, or with smaller samples when the data are normally distributed and the population variances are known. The formula for the z test is

z = [(X̄₁ − X̄₂) − (μ₁ − μ₂)₀] / √(S₁²/n₁ + S₂²/n₂)
In the case of small sample sizes, normally distributed populations and assuming equal population variances, the t-test is appropriate:

t = [(X̄₁ − X̄₂) − (μ₁ − μ₂)₀] / √[S_p² (1/n₁ + 1/n₂)]

where (μ₁ − μ₂) is the difference between the two population means and S_p² is the pooled variance estimate:

S_p² = [(n₁ − 1)S₁² + (n₂ − 1)S₂²] / (n₁ + n₂ − 2)
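The pooled t statistic can be computed directly from the formulas above. The two subscriber samples below are invented for illustration, and the hypothesized difference under H0 is taken as zero.

```python
from math import sqrt

def pooled_t(sample1, sample2):
    """Two-independent-samples t statistic with pooled variance,
    testing H0: mu1 - mu2 = 0."""
    n1, n2 = len(sample1), len(sample2)
    m1 = sum(sample1) / n1
    m2 = sum(sample2) / n2
    v1 = sum((x - m1) ** 2 for x in sample1) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in sample2) / (n2 - 1)
    # Pooled variance estimate S_p^2
    sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    return (m1 - m2) / sqrt(sp2 * (1 / n1 + 1 / n2))

# Hypothetical purchase scores from two magazines' subscriber samples.
magazine_a = [7, 9, 6, 8, 7, 9]
magazine_b = [5, 6, 7, 5, 6, 5]

t = pooled_t(magazine_a, magazine_b)
print(t)  # compare with the t critical value at n1 + n2 - 2 = 10 df
```

A computed t larger in magnitude than the tabled critical value for 10 degrees of freedom would lead to rejecting H0 that the two subscriber populations have equal means.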
Non-parametric tests

The chi-square test is appropriate for situations in which a test for differences between samples is required. It is especially valuable for nominal data, but it can be used with ordinal measurements as well. The formula differs slightly from the earlier one:

    chi-square = sum over i, sum over j of (Oij - Eij)^2 / Eij

where Oij is the observed and Eij the expected number of cases in the cell of the ith row and jth column.
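For two independent samples the expected counts Eij come from the row and column totals under the null hypothesis of independence. A minimal sketch (the magazine-reader counts are invented for illustration):

```python
def chi_square_contingency(table):
    """Chi-square over an r x c table of observed counts; each expected count
    Eij = (row total * column total) / grand total under H0."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    return sum((o - row_totals[i] * col_totals[j] / grand) ** 2
               / (row_totals[i] * col_totals[j] / grand)
               for i, row in enumerate(table) for j, o in enumerate(row))

# Illustrative: purchase predisposition (rows: yes/no) for readers of two magazines.
stat = chi_square_contingency([[30, 20], [20, 30]])
# degrees of freedom = (rows - 1) * (columns - 1) = 1
```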
III. Two-Related Samples Tests

Parametric Tests

For related (before-after or matched) samples, the t-test for paired observations is

    t = D-bar / (S_D / sqrt(n))

where
    D-bar = (sum of D) / n
    S_D = sqrt{ [sum of D^2 - (sum of D)^2 / n] / (n - 1) }

and D is the difference between each pair of observations.
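A pure-Python sketch of the paired-samples t statistic defined above (the before/after scores are invented for illustration):

```python
import math

def paired_t(before, after):
    """Paired-samples t: t = Dbar / (S_D / sqrt(n)), where D is the
    difference between the paired before/after observations."""
    n = len(before)
    d = [a - b for a, b in zip(after, before)]          # pairwise differences D
    d_bar = sum(d) / n                                   # mean difference
    s_d = math.sqrt((sum(x * x for x in d) - sum(d) ** 2 / n) / (n - 1))
    return d_bar / (s_d / math.sqrt(n))

t_stat = paired_t([10, 12, 14, 16], [12, 13, 15, 18])
```

The statistic is referred to a t table with n - 1 degrees of freedom.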
Nonparametric Tests
The McNemar test may be used with either nominal or ordinal data and is
especially useful with before-after measurement of the same subjects.
IV. K Independent Samples Tests
K independent samples tests are normally used in management and economic
research when three or more samples are involved. The test is concerned with whether
the samples might come from the same or identical population. When the data are
measured on an interval-ratio scale and the necessary assumptions are met then the
Analysis of Variance and the F test are used. If the assumptions cannot be met, or if the data are measured on an ordinal or nominal scale, then a non-parametric test can be selected. The samples are assumed to be independent.
Parametric Tests
Analysis of Variance (ANOVA) is a statistical method of testing the null hypothesis that the means of several populations are equal. To use ANOVA, certain conditions must be met:

1. The distance from one value to its group's mean should be independent of the distances of other values to that mean.
2. The samples must be drawn from normally distributed populations.
3. The populations should have equal variances.

In the ANOVA model each group has its own mean and values that deviate from that mean. Similarly, all the data points from all of the groups produce an overall grand mean. The total deviation is the sum of the squared differences between each data point and the overall grand mean.
The total deviation of any particular data point may be partitioned into between-groups variance and within-groups variance. The between-groups variance represents the effect of the treatment or factor. Differences between the group means imply that each group was treated differently, and the treatment will appear as deviations of the sample means from the grand mean. The within-groups variance describes the deviations of the data points within each group from the sample mean. This results from variability among subjects and from random variation, and is often called error. When the variability attributable to the treatment exceeds the variability arising from error and random fluctuations, the viability of the null hypothesis begins to diminish.
The test statistic for ANOVA is the F ratio. It compares the variance from two sources:

    F = Between-groups variance / Within-groups variance
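The F ratio can be computed from raw group data; a minimal one-way sketch in Python (the three small groups are illustrative):

```python
def f_ratio(*groups):
    """One-way ANOVA F: mean square between groups over mean square within."""
    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)
    k, n = len(groups), len(values)
    means = [sum(g) / len(g) for g in groups]
    # Between-groups sum of squares: deviations of group means from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-groups sum of squares: deviations of each value from its group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

f_stat = f_ratio([1, 2, 3], [2, 3, 4], [3, 4, 5])
```

The statistic is compared with the F table value for (k - 1, n - k) degrees of freedom.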
In this method it is often necessary to measure subjects several times. These repeated measurements are called trials. The repeated-measures ANOVA is a special type of n-way analysis of variance.
The overall research design can be split into the following parts:
- The sampling design, which deals with the method of selecting the samples for the purpose of conducting the study
- The observational design, which deals with the conditions under which the observations are made
- The statistical design, which concerns how many items are to be observed and how the information gathered is to be analysed
- The operational design, which relates to the techniques by which the procedures specified in the sampling, statistical and observational designs can be carried out
Since the research design is the plan regarding the sampling procedure, data collection method and various other activities to be performed in the proposed research, it can be discussed with others and, based on the critical comments, the flaws and inadequacies can be tackled, leading to an effective research design.

The research design affects the reliability of the research findings and as such it constitutes the foundation of the entire research work.
Research designs may be classified as follows:

    Criteria                          Types
    Method of data collection         Monitoring; Interrogation/communication
    Researcher control of variables   Experimental; Ex post facto
    Purpose of study                  Causal; Descriptive
    Time dimension                    Cross-sectional; Longitudinal
    Scope                             Case; Statistical study
    Research environment              Field setting; Laboratory research; Simulation
    Participants' perception          Actual routine; Modified routine
    Type of investigation             Causal; Correlational
    Unit of analysis                  Single; Dyad; Group; Organization/Nation
    Extent of crystallization         Formal study; Exploratory study
or is responsible for changes in the other variables. There are three possible relationships that can occur between variables:

1. Symmetrical
2. Reciprocal
3. Asymmetrical

In the case of a symmetrical relationship, two variables fluctuate together but it is assumed that changes in neither variable are due to changes in the other. Symmetrical conditions are usually found when two variables are alternate indicators of another cause or independent variable.

A reciprocal relationship exists where two variables mutually influence or reinforce each other. An asymmetrical relationship exists where the changes in one variable, viz., the independent variable, are responsible for changes in another variable, viz., the dependent variable. The dependent and independent variables are identified on the basis of:

- The degree to which each variable may be altered. The variable which is relatively unalterable is called the independent variable.
- The time order between the variables. The independent variable precedes the dependent variable.
All factors except the independent variable must be held constant and not confounded with another variable that is not part of the study. This is called control.

Each person in the study must have an equal chance of exposure to each level of the independent variable. This is called random assignment of subjects to groups.
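Random assignment can be sketched in a few lines of Python; the helper below (its name and the subject IDs are illustrative) shuffles the subject pool and then deals it out to the groups:

```python
import random

def random_assign(subjects, n_groups=2, seed=None):
    """Shuffle the subjects so each has an equal chance of landing in any
    group, then deal them out to the groups round-robin."""
    rng = random.Random(seed)
    pool = list(subjects)
    rng.shuffle(pool)
    return [pool[i::n_groups] for i in range(n_groups)]

# Twenty subject IDs split into an experimental and a control group.
experimental, control = random_assign(range(20), n_groups=2, seed=42)
```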
8. Type of investigation

A research study can take the form of a causal or correlational investigation. A causal study is conducted to establish a definitive cause-and-effect relationship. In this case the objective of the research is to delineate one or more factors that are causing the problem. The intention of the researcher conducting a causal study is to be able to state that variable X causes variable Y. Thus, a study in which the researcher wants to delineate the cause of one or more problems is called a causal study.
In-depth interviewing
Participant observation
Case studies
Street ethnography
Document analysis
Experience survey
Focus groups
Two-stage design
i. Secondary data analysis
The researcher can explore the organization's archives for data. Reports of prior research studies would reveal the successful and unsuccessful methods adopted in previous research studies. Browsing through earlier research studies will also reveal the less-attempted problem areas which can be addressed in the present research.

The researcher can look into published documents, in the form of books and journals, by outside organizations. These can be a rich source of hypotheses. The e-resources and the library will provide the needed information.

The search of secondary sources will provide the background information about the research to be conducted and will also provide a fair idea about the areas to be pondered.
ii. Experience survey
An experience survey involves collecting information from people experienced or knowledgeable in the particular area of study. The data would be collected from their memories and experiences. The ideas on important issues and the subject matter can be explored. The investigative format is more flexible. The outcome of the interview would be a new hypothesis, the discarding of an old one, or information for doing the study in a better manner.
iii. Focus groups
A focus group is a panel of people who meet for about 90 minutes to 2 hours and discuss the subject matter, led by a trained moderator. The facilitator uses group dynamics to focus or guide the group in the exchange of ideas, feelings and experiences on a specific topic. The focus group is made up of 6 to 10 respondents. Too small or too large a group may not be effective in meeting the objective. The outcome of the focus group will be a list of ideas and behavioural observations, along with the recommendations made by the moderator. The qualitative data produced from the focus group can be used for enriching knowledge.

Depending on the topic, separate focus groups could be run for different subsets of the population. Homogeneity in the focus group will be more effective and produce maximum results. Focus groups can be conducted in a face-to-face manner, through telephones, the internet (e-groups) and through video conferencing.
iv. Two-stage design

In the exploratory stage, the researcher does not know much about the problem in hand but needs to know more before proceeding further in terms of time and resources. A two-stage design would be useful in this situation. With this approach, exploration becomes a separate first stage with limited objectives: (1) clearly defining the research question and (2) developing the research design. A limited exploration at a lesser cost carries little risk for the researcher and enables him or her to uncover information that reduces the total research cost.
SUMMARY:

This unit has examined some of the basic aspects of research. The importance of knowledge of research in a business setting was emphasized. The hallmarks of scientific research, viz., purposiveness, rigor, testability, replicability, precision and confidence, objectivity, generalization and parsimony, were described. The steps involved in hypothetico-deductive research were discussed. The various steps involved in undertaking research were dealt with in detail. The issues involved in the development of hypotheses and the parametric and non-parametric tests were examined.

With this background on research, the next unit deals with the issues concerning the research design and its types.

Discuss the need for a theoretical framework and highlight its features.
What are the basic research design issues? Discuss them in detail.

Is a single research design suitable for all research studies? If not, why?

Discuss the different types of research design. Cite a situation to which each design is applicable.
Unit -2
Experimental Design
Unit structure:
2.1 Introduction
2.2 Learning Objectives
2.3 The benefits and drawbacks
2.4 Activities involved in conducting an experiment
2.5 Validity in Experimentation
2.5.1 Factors affecting internal validity
2.5.2 Factors affecting external Validity
2.6 Experimental research designs
2.6.1. Pre-experimental designs
2.6.2 True experiments/ Lab experiments
2.6.3. Field experiments: Quasi or semi - experiments
2.7 Measurement
2.8 Measurement Scales
2.8.1 Selection of measurement scale
2.8.2 Methods of scaling
2.8.3 Construction of measurement scales
-------------------------------------------------------------------------------------
2.1 INTRODUCTION
Experimental design enables a researcher to alter systematically the variables
involved in the study. The experimental design involves intervention by the researcher.
The researcher intervenes by way of manipulating the variables in a setting and observes
the effect on the subjects studied. Under experimental design the independent variables
are manipulated and the effects of the same on the dependent variables are observed.
This unit deals with the activities involved in conducting an experiment, the factors affecting validity in experimentation and the various types of experimental designs. Measurement of variables is necessary for testing hypotheses. The nominal, ordinal, interval and ratio scales are dealt with in detail. The process involved in the selection and construction of measurement scales is discussed in detail.
Understand the meaning of scaling and the six critical decisions involved in
selecting an appropriate measurement scale
The researcher can manipulate the independent variable and thereby understand the effect on the dependent variable. This leads to an understanding of the existence and potency of the manipulation.

The experiment can be replicated with different subject groups and conditions, thereby enabling the researcher to understand the effect of independent variables across people, situations, conditions and time.

The researcher in some situations can use field experiments to reduce the subjects' perception of the research as an intervention or deviation in their everyday lives.

The research is undertaken in artificial settings and hence the subjects may not behave as they do under normal circumstances.
experimental treatment has been the source of change in the dependent variable. The factors listed below affect internal validity:
History
Maturation
Testing
Instrumentation
Selection
Statistical regression
Experimental mortality
Compensatory equalization
Compensatory rivalry
Local history
1. History
In experimental designs a control measurement (O1) of the dependent variable is taken before introducing the manipulation (X). After the manipulation, an after-measurement (O2) of the dependent variable is taken. The difference between O1 and O2 is then attributed to the manipulation. However, some events may occur during the course of the experimental study which will affect the relationship between the variables under study.
2. Maturation
The subjects considered for experimentation might change with the passage of
time and may not be due to the occurrence of any specific event. This happens particularly
when the study covers a long period of time.
3. Testing
The process of taking a test can affect the scores of further tests. The first test
would have created some awareness and learning experience which influences the results
of the subsequent tests.
4. Instrumentation
The threat to validity may arise due to the observer or the instrument. Using different observers or interviewers affects the validity of the study. If the same observer is used for a longer period of time, validity may be affected due to the observer's experience, boredom, fatigue and anticipation of results. Differences in the questions used for each measurement also affect the validity.
5. Selection
Differential selection of subjects for experimental and control groups affects
the validity. Validity considerations require the groups to be equivalent in every aspect.
The problem can be overcome by randomly assigning the subjects to experimental and
control groups. In addition, matching can be done. Matching is a control procedure to
ensure that experimental and control groups are equated on one or more variables
before the experiment. Matching the members of the groups on key factors also enhances
the equivalence of the groups.
6. Statistical Regression
This factor operates especially when members chosen for the experimental group have extreme scores on the dependent variable. For example, if a manager wants to test whether he can increase the salesmanship qualities of the sales personnel through a training program, he should not choose those with extremely low or extremely high abilities for the experiment. This is because those with very low scores, i.e., those with low current sales abilities, have a greater probability of showing improvement and scoring closer to the mean on the post-test after being exposed to the treatment. This phenomenon of low scorers tending to score closer to the mean is known as regressing towards the mean. Likewise, those with very high abilities would have a greater tendency to regress towards the mean; they will score lower on the post-test than on the pretest. Thus, those who are at either end of the continuum with respect to the variable would not truly reflect the cause-and-effect relationship. This phenomenon of statistical regression is a threat to internal validity.
7. Experimental mortality
This factor arises due to the changes in the composition of study groups during
the test. There may be drop outs in the study group leading to the changes in the
membership of the group. This problem does not arise for the control group as they are
not affected by the testing situation and they are less likely to withdraw.
All the above threat factors can be controlled to a certain extent by random assignment. However, the following factors affecting internal validity cannot be controlled by randomization. Both the control group and the experimental group are affected by the first three factors.
2. One-Group Pretest-Post-test Design

         O              X              O
     (Pretest)    (Manipulation)   (Post-test)

Here O denotes an observation or measurement of the dependent variable and X the manipulation. However, other aspects of internal validity such as history, maturation and the testing effect are not taken into account. Hence, it is still a weak design.
3. Static Group Comparison
This design uses two groups; one receives the experimental stimulus and the other serves as a control group and is not given the treatment. The dependent variable is measured in both groups after the treatment. For example, in a field setting an experiment is designed to study the effect of a natural disaster (experimental treatment) on psychological trauma (measured outcome). A pretest before the natural disaster, say a tsunami, is possible but not on a large scale. Moreover, the timing of the pretest would be problematic. The control group, receiving only the post-test, would consist of subjects whose property is safe. The design can be represented in the following manner:

    X    O1
    -    O2

The addition of a comparison group increases the validity over the previous two designs. However, there is no way to be certain that the two groups are equivalent.
Factors: The independent variables of an experiment are often called the factors of the experiment. Active factors are those the experimenter can manipulate by causing a subject to receive one level or another. A blocking factor is one where the experimenter can only identify and classify the subject on an existing level.

Treatment: This is another word used for condition. It also refers to the statistical test of the effect of various conditions of the experiment.

Test unit: The experimental subjects are referred to as test units. The test units may be people, organizations, machine types, materials and other entities.
    R    X    O1
    R         O2
The experimental effect is measured by the difference between O1 and O2. The design is simpler and more attractive. Internal validity threats from history, maturation, selection and statistical regression are adequately controlled by random assignment. Since the subjects are measured only once, the threats of testing and instrumentation are handled. The different mortality rates between experimental and control groups continue to be a problem. The design reduces the external validity problem of the testing interaction effect.
3. Extensions of True Experimental Design
The researcher normally uses an operational extension to the basic design.
These extensions differ from classical design in terms of
A Latin square may be illustrated with three price levels (rows) and three blocks (columns), with treatments X1, X2 and X3 assigned so that each treatment appears once in every row and every column:

    Price level    Block 1    Block 2    Block 3
    High           X1         X2         X3
    Medium         X2         X3         X1
    Low            X3         X1         X2

Treatments are assigned based on random number tables. From the above, the effects of the price reduction can be ascertained. The major limitation of the Latin square is that it is assumed that there are no interactions between the treatments and the blocking factors.
d. Factorial design
In the case of a factorial design, a researcher can deal with more than one factor simultaneously. This design is especially important in several economic and social phenomena where, usually, a large number of factors affect a particular problem. Factorial designs can be of two types:

(1) Simple factorial designs
(2) Complex factorial designs
(1) Simple factorial designs: When the effect of varying two factors on the dependent variable is studied, the design is called a simple factorial design. This design is also known as a two-factor factorial design. A simple factorial design may be a 2 x 2, 3 x 4, 5 x 3 or similar type of design. A 2 x 2 simple factorial design can be depicted as below:

                          Experimental variable
    Control variable      Treatment A    Treatment B
    Level I               Cell 1         Cell 3
    Level II              Cell 2         Cell 4
The results are more generalizable and so the external validity is greater in the case of field experiments. However, the internal validity is lower, as the extent to which variable X alone causes variable Y cannot be ascertained.

1. Nonequivalent Control Group Design

    O1    X    O2
    O3         O4
Two designs are possible, viz., the intact equivalent design and the self-selected experimental group design.

In the intact equivalent design, members of the experimental and control groups are naturally assembled. This design is useful when any type of individual selection process would be reactive.

In the self-selected experimental group design, volunteers are recruited to form the experimental group, while non-volunteer subjects are used for control.
Comparison of the pretest results (O1 - O3) is one indicator of the degree of equivalence between the test and control groups. If the pretest results are significantly different, there is a real question about the groups' comparability. On the other hand, if the pretest observations are similar between groups, there is more reason to believe the internal validity of the experiment is good.
2. Separate Pretest and Post-test Design

This design is most applicable in situations where the researcher cannot know when and to whom to introduce the treatment but can decide when and whom to measure. The basic design is

    R    O1    (X)
    R    (X)    O2

The bracketed treatment (X) indicates that the researcher cannot control the treatment. This is not a strong design because several threats to internal validity are not handled adequately. History can confound the results, but this can be overcome by repeated experiments. This design is considered to be superior to true experiments in external validity. Its strength is that samples are drawn from the general population, which extends the generalization of the study.
3. Group Time Series Design

This design introduces repeated observations before and after the treatment and allows subjects to act as their own controls. The single treatment group design has before and after measurements as the only controls. There is also a multiple design with two or more comparison groups as well as repeated measurements in each treatment group. This format is useful where regularly kept records are a natural part of the environment and are unlikely to be reactive. It is also a good way to study unplanned events in an ex post facto manner.
2.7 MEASUREMENT
In normal parlance, measurement refers to an attempt to fix quantitatively the
form or other features of a physical object. In research, measurement refers to assigning
numbers to empirical events in compliance with a set of rules. This definition brings out
the three steps involved in the process of measurement:
1. Selecting the observable empirical events.
2. Developing a set of mapping rules i.e. a scheme for assigning numbers or symbols
to represent aspects of the event being measured.
3. Applying the mapping rules to each observation of that event.
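These three steps can be illustrated with a small Python sketch; the five-point agreement rule below is a hypothetical example, not one prescribed by the text:

```python
# Step 2 - a mapping rule: a scheme assigning a number to each observable
# response category of the empirical event (here, a five-point agreement item).
mapping_rule = {"strongly disagree": 1, "disagree": 2, "neutral": 3,
                "agree": 4, "strongly agree": 5}

def measure(responses, rule):
    """Step 3 - apply the mapping rule to each observation of the event."""
    return [rule[r] for r in responses]

# Step 1 - the observed empirical events are the respondents' answers.
scores = measure(["agree", "neutral", "strongly agree"], mapping_rule)
```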
The goal of measurement is to provide the highest-quality, lowest-error data for the purpose of testing the identified hypotheses and for other related analysis and interpretation. Variables dealt with in research studies can be classified as objects and properties. Objects include the things of ordinary experience, such as a laptop, chair or car. They also include things which are not concrete, such as attitudes, peer-group pressures and perceptions. Properties are the characteristics of the objects; they include the level of motivation, leadership skills, etc. Strictly speaking, researchers are not involved in measuring objects or properties; rather, they measure indicants of the properties or indicants of the properties of the objects.
2. Order: Numbers are arranged in some order, in such a way that one number is greater than, smaller than or equal to another number.

In the ordinal scale an object can be stated to be greater than, less than or equal to another without stating how much greater or less. Other descriptors may also be used, viz., superior to, happier than, poorer than, above. It is also possible to rank more than one property at a time. For example, the researcher can ask the respondent to rank various airlines on the basis of certain properties.

In ordinal scaling, the differences in the ranking of the objects, persons or events investigated are clearly known. However, ordinal data do not give any indication of the magnitude of the differences among the ranks.
3. Interval scale
Interval data have the power of nominal and ordinal data and, in addition, incorporate the concept of equality of interval. The interval scale allows one to measure the distance between any two points on the scale. It not only enables the grouping of individuals into certain categories and taps the order of the groups; it also measures the magnitude of the differences in the preferences among the individuals. The interval scale is thus more powerful than the nominal and ordinal scales. The measure of central tendency, the arithmetic mean, is applicable. Its measures of dispersion are the range, the standard deviation and the variance.
4. Ratio scale
Ratio data have the power of the nominal, ordinal and interval scales and, in addition, provide for an absolute zero or origin. The ratio scale overcomes the disadvantage of the arbitrary origin point of the interval scale, i.e., it has an absolute zero point. The ratio scale not only measures the magnitude of the differences between points on the scale but also the proportion of the differences. Multiplication or division preserves the ratios. It is the most powerful of the four scales because it has a unique zero origin and subsumes all the properties of the other three scales.
The measure of central tendency of the ratio scale could be either the arithmetic
or the geometric mean and the measure of dispersion could be either the standard
deviation or variance or the coefficient of variation. Some examples of ratio scales are
those pertaining to actual age, income and work experience in organizations.
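As an illustration of a statistic that is permissible only for ratio data, the coefficient of variation can be computed as follows (the work-experience figures are invented):

```python
import statistics

def coefficient_of_variation(values):
    """CV = standard deviation / mean; meaningful only for ratio data,
    because dividing by the mean presupposes a true zero point."""
    return statistics.stdev(values) / statistics.mean(values)

# Illustrative work-experience data (in years) - a ratio-scaled variable.
cv = coefficient_of_variation([2, 4, 6, 8, 10])
```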
1. Content Validity

Content validity refers to the extent to which a measuring instrument provides adequate coverage of the investigative questions guiding the study. Content validity is good if the instrument contains a representative sample of the universe of the subject matter of interest. The determination of content validity is judgmental and can be approached in several ways. Generally, content validity is treated as higher if the scale items used represent, to a greater extent, the domain or universe of the concept being measured. The researcher may determine content validity through a careful definition of the topic of concern, the items to be scaled and the scales to be used. Another way is to use a panel of persons to judge whether the instrument meets the standards.

Face validity is considered a basic and very minimum index of content validity. It indicates that, on the face of it, the items look as if they measure the intended concept.
2. Criterion-Related Validity
Criterion related validity reflects the success of measures used for prediction or
estimation. Predictive validity refers to the extent to which an outcome could be predicted
and concurrent validity refers to the extent to which estimate of current behaviour or
condition could be made. The researcher must ensure that the validity criterion used is
itself valid. This can be judged in terms of four qualities viz., relevance, freedom from
bias, reliability and availability.
3. Construct Validity

This is the most complex and abstract feature. Construct validity testifies that the results obtained from the use of a measure fit the theories around which the test is designed. In other words, a measure has construct validity to the degree that it conforms to predicted correlations with other theoretical propositions. The researcher may wish to measure or infer the presence of abstract characteristics for which no empirical validation seems possible. Attitude, aptitude and personality scales generally fall in this category. Although it is difficult, assurance is still needed that the measurement has an acceptable degree of validity. This is assessed through convergent and discriminant validity. Convergent validity is established when the scores obtained with two different instruments measuring the same concept are highly correlated. Discriminant validity is established when, based on theory, two variables are predicted to be uncorrelated and this is also empirically proved. The validity can be proved through the use of correlational analysis, factor analysis, etc.
II. Reliability

Reliability refers to consistency, i.e., a measure is reliable to the degree that it supplies consistent results. Reliability is concerned with estimates of the degree to which a measurement is free from random or unstable error. Reliable instruments can be used with confidence that transient and situational factors are not interfering. Reliable instruments are robust and work well at different times under different conditions. The reliability of an instrument is measured on the basis of stability, equivalence and internal consistency.
1. Stability

Stability is securing consistent results with repeated measurements of the same person with the same instrument. An observation is said to be stable if it gives the same reading on a particular person when repeated one or more times. Stability measurement in survey situations is more difficult than in observational studies. Observation can be done repeatedly, but a resurvey can be conducted only once. Two tests of stability are test-retest reliability and parallel-form reliability.
(a) Test-Retest Reliability

The conduct of a resurvey is called a test-retest arrangement, which involves comparisons between the two tests to learn about the reliability. The reliability coefficient obtained with a repetition of the same measure on a second occasion is called test-retest reliability. When a questionnaire containing items that are supposed to measure a concept is administered to a set of respondents, and the same questionnaire is administered again after some time, the correlation between the scores obtained at the two different times from the same set of respondents is called the test-retest coefficient. The higher the coefficient, the better the reliability and stability.
The following difficulties can occur in test-retest methodology;
Topic sensitivity occurs when the respondents seek to learn more about the
topic or form new and different opinions before the retest.
(b) Parallel-Form Reliability

Parallel-form reliability occurs when two comparable sets of measures tapping the same construct are highly correlated. The forms have similar items and the same response format; only the wording and the order or sequence of the questions are changed. This is done in order to establish the error variability arising due to changes in the wording or ordering of questions. High correlation between the two forms ensures that the measures are reasonably reliable, with minimal error variance caused by wording, ordering or other factors.
2. Equivalence
Equivalence is concerned with how much error may be introduced by different
investigators or different sample of items being studied. Equivalence is concerned with
variations at one point of time among observers and samples of items. One way to test
for the equivalence of measurements by different observers is to compare their scoring
of the same event. One test for item sample equivalence is by using alternative or parallel
forms of the same test administered to the same person simultaneously. The results of
the two tests are then correlated.
The major interest with equivalence is typically not how respondents differ
from item to item but how well a given set of items will categorize individuals. There
may be differences in responses between two samples of items, but if a person is
classified the same way by each test, then the test has good equivalence.
3. Internal consistency
The internal consistency indicates the homogeneity of the items in the course of
measuring a construct. The items should hang together as a set and should also be
capable of independently measuring the same concept so that the respondents attach
the same overall meaning to each of the items. This can be ensured by examining if the
items and the subsets of items in the measuring instrument are correlated highly. The
internal consistency among the items can be measured by using the split-half reliability
and the inter-item consistency reliability test.
(a) Split-half reliability
Split-half reliability reflects the correlation between two halves of an instrument.
This technique can be used when the measuring tool has many similar questions or
statements. The instrument is administered and the items are separated into even and
odd numbers, or into randomly selected halves. If the correlation between the two halves
is high, the instrument is said to have high reliability as regards internal consistency.
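The split-half procedure can be sketched as follows. The odd/even split and the Spearman-Brown step-up (which estimates full-length reliability from the half-length correlation) are standard; the item scores themselves are hypothetical:

```python
# Split-half reliability: sum odd- and even-numbered items separately,
# correlate the two half-scores, then apply the Spearman-Brown correction.

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

responses = [            # rows = respondents, columns = six similar items
    [4, 5, 4, 4, 5, 4],
    [2, 3, 2, 3, 2, 2],
    [5, 5, 4, 5, 5, 5],
    [3, 3, 3, 2, 3, 3],
    [1, 2, 1, 2, 1, 1],
]
odd_half = [sum(row[0::2]) for row in responses]    # items 1, 3, 5
even_half = [sum(row[1::2]) for row in responses]   # items 2, 4, 6
r_half = pearson(odd_half, even_half)
r_full = 2 * r_half / (1 + r_half)   # Spearman-Brown corrected reliability
```

The correction is needed because each half contains only half the items, and reliability rises with test length.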
(b) Inter-item consistency reliability
It is a test of the consistency of respondents' answers to all the items in a
measure. If the items are independent measures of the same concept, they will be
correlated with one another. The most popular test of inter-item consistency reliability
is Cronbach's coefficient alpha, which is used for multipoint-scaled items. For
dichotomous items the Kuder-Richardson formulas are used. Higher coefficients
indicate a better measuring instrument.
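Cronbach's alpha can be computed directly from the item variances and the variance of the total scores. A minimal sketch with hypothetical 5-point ratings:

```python
# Cronbach's coefficient alpha for multipoint-scaled items:
# alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)   # population variance

scores = [           # rows = respondents, columns = items on a 5-point scale
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [3, 3, 3, 2],
    [1, 2, 1, 2],
]
k = len(scores[0])                                          # number of items
item_vars = [variance([row[i] for row in scores]) for i in range(k)]
total_var = variance([sum(row) for row in scores])
alpha = (k / (k - 1)) * (1 - sum(item_vars) / total_var)    # closer to 1 is better
```

When the items hang together as a set, the variance of the totals dwarfs the sum of the individual item variances and alpha approaches 1.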
III Practicality
The operational requirements of the project require it to be practical. Practicality
is defined as economy, convenience and interpretability. Economy is concerned with
minimizing the cost of conducting the research project. The method of
data collection, the length of the instrument, etc. will have implications for the research
budget. Convenience refers to ease in administering the questionnaire. This can be
achieved by giving clear and complete instructions and by paying proper attention to
design and layout. The interpretability issue arises when persons other than
the test designers must interpret the results. To enable interpretation, the designer of the
data collection instrument should provide enough information regarding the scoring keys,
norms, guidelines for test use, etc.
attitude. Ranking scales enable comparison among two or more indicants
or objects. Categorization enables the subjects involved to be put into groups or
categories.
3. Degree of preference: Measurement scales may involve preference measurement
or non-preference evaluation. In preference measurement, respondents
are asked to choose the object they prefer. In non-preference evaluation,
the respondents are asked to make judgments without any personal preference
towards the objects or solutions.
4. Data properties: The data properties should also be considered when deciding
on measurement scales. The data can be classified as nominal, ordinal,
interval and ratio. The statistical applications depend on the assumptions
underlying each data type.
5. Number of dimensions: Measurement scales can be unidimensional or
multidimensional. In a unidimensional scale only one attribute of the
respondent is measured. Multidimensional scaling recognizes objects as
consisting of n dimensions.
6. Scale construction: Five construction approaches are available, viz., arbitrary,
consensus, item analysis, cumulative and factoring. The researcher should take
into consideration both the type of measurement and the scale construction approach
when selecting an appropriate scale.
Some of the rating scales often used by researchers are explained below:
The category scale uses multiple items to elicit a single response. The multiple-choice,
single-response scale is appropriate when there are multiple options
but only one answer is sought.
E.g., Age:
- 21 to 40 years
- 41 to 50 years
- Above 50 years
The check list or multiple-choice, multiple-response scale allows the
respondent to select one or several alternatives. E.g., in eliciting the source
through which information about a new product was obtained, a respondent
may select one or more of the choices given below:
Source of information - Advertisement
- Sales person
- Sales materials
- Showrooms
- Friends/ relatives/ Neighbours
- Other sources
The Likert scale is designed to examine how strongly the respondents agree
or disagree with statements relating to the attitude or object on a 5-point scale.
The scores on the individual items are summed to produce a total score for the
respondent, and hence it is also called a summated scale. A Likert scale usually
contains two parts, the item part and the evaluative part. The item part usually
contains a statement about a product, event or attitude. The evaluative part is a
list of response categories ranging from strongly agree to strongly disagree.
The item and evaluative parts are shown below:
Each statement is rated on the evaluative scale:
Strongly Disagree (1) - Disagree (2) - Neutral (3) - Agree (4) - Strongly Agree (5)
Item statements:
- I am satisfied with the working environment
- I am happy with the work assigned
The responses over a number of items or statements tapping a particular concept
or variable are summated for every respondent. It is assumed that all the statements
measure some aspect of a single common factor. This is an interval scale and the
differences in the responses between any two points on the scale remain the same.
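Summated scoring is just addition over the items tapping the construct. A tiny illustration (the respondent names and ratings are hypothetical):

```python
# Summated (Likert) scoring: each respondent's item responses, coded
# 1 = strongly disagree ... 5 = strongly agree, are added to a total score.

responses = {
    "respondent_1": [4, 5],   # e.g. ratings on the two workplace items above
    "respondent_2": [2, 3],
}
totals = {name: sum(items) for name, items in responses.items()}
print(totals)   # {'respondent_1': 9, 'respondent_2': 5}
```

Because the scale is treated as interval, these totals can be compared and averaged across respondents.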
The semantic differential scale is widely used to describe the set of
beliefs a person holds. Several bipolar attributes are identified at the extremes
of the scale and respondents are asked to indicate their attitudes, in semantic
space, toward a particular individual, object or event on each of the attributes.
The semantic space may consist of five- or seven-point rating scales bounded at
each end by polar adjectives or phrases. There may be as many as 15 to 25
semantic differential scales for each attitude or object. The procedure is also
insightful for comparing the images of competing brands, stores or services.
The semantic differential may also be analyzed as a summated rating scale.
Each scale is assigned a value from -3 to +3 or 1 to 7, and the scores
across all adjective pairs are summed for each respondent. Individuals can be
compared on the basis of their total scores. An example of a semantic differential
scale is given below:
Responsive  ____ ____ ____ ____ ____ ____ ____  Unresponsive
Beautiful   ____ ____ ____ ____ ____ ____ ____  Ugly
Courageous  ____ ____ ____ ____ ____ ____ ____  Timid
The numerical scale is similar to the semantic differential scale, with the
difference that numbers on a 5-point or 7-point scale are provided, with bipolar
adjectives at both ends. This is also an interval scale. The scale provides both
an absolute measure of importance and a relative measure of the various items
rated. The scale's linearity, simplicity and production of ordinal or interval data
make it very popular.
The itemized rating scale is a 5-point or 7-point scale with anchors provided
for each item; the respondent states the appropriate number beside
each item or circles the relevant number against each item. The responses to
the items are then summated. This uses an interval scale. An example is shown
below:
Indicate your response number on the line for each item.
1 - Very Unlikely
2 - Unlikely
3 - Neither unlikely nor likely
4 - Likely
5 - Very Likely
The itemized rating scale provides the flexibility to use as many points in the scale as
considered necessary (4, 5, 7, 9, etc.). It is also possible to use different
anchors. When a neutral point is provided, it is a balanced rating scale; when the neutral
point is missing, it is an unbalanced rating scale. The itemized rating scale is frequently
used in business research, as the number of points desired, as well as the nomenclature
of the anchors, can be accommodated to suit the needs of the researcher.
In the fixed or constant sum scale the respondents are asked to distribute a given
number of points across various items. It enables the researcher to discover
proportions and is more in the nature of an ordinal scale. A minimum of two categories
and a maximum of ten can be presented to the respondents. Presenting too many
stimuli taxes both the precision and the patience of the respondents, as well as their
ability to add. For example, in selecting a particular brand of computer, a respondent
may be asked to distribute 100 points across the following aspects:
Hardware configuration   -----
Freebies given           -----
Brand image              -----
Total points               100
The graphic rating scale is simple and commonly used in practice. In this
scale various points are marked along a line to form a continuum. The respondent
indicates his rating by simply making a mark at the appropriate point on a line that runs
from one extreme to the other. A brief description of the scale points is given to act as
a guide in locating the rating. Faces scales, depicting faces ranging from smiling to
sad, can be used on a rating scale to obtain responses regarding people's feelings with
respect to some aspect. A major limitation of this scale is that the respondent may mark
almost any position on the line, which poses difficulty in analysis.
also tested for validity and reliability. For example, the Thurstone Equal-Appearing
Interval Scale is a consensus scale: a panel of judges selects the statements which describe
the concept under study, and the scale is developed based on their consensus. Developing
this scale involves time and as such it is rarely used in the organizational context.
(For example, respondents may be asked to rank newspapers such as Business Line
and Indian Express.) If the number of stimuli to be ranked is 5 or less, it is a
comparatively easy task. Respondents tend to become careless if the items exceed 10.
Comparative scale
It involves a standard against which comparison is done. The comparative scale
provides a point of reference against which the current object under study is compared,
enabling benchmarking. However, it can be used only when the respondents have
knowledge of the standard against which the comparison is made. Researchers
can treat the data produced by comparative scales as interval data, since the scores
reveal the interval between the standard and the actual; the data can also be treated as
ordinal data, since the rank or position of the items is dealt with.
to the theme of study may be selected. Each item is scored from 1 to 5 depending on
the responses obtained. The results are then totaled. Arbitrary scales are easy to develop,
inexpensive and highly specific to the theme of the study. However, the major limitation
is that the design approach is subjective: there is no assurance, other than the researcher's
insight, that the items chosen are representative of the universe of content.
2. Consensus Scaling
In consensus scaling the items are selected by a panel of judges after evaluation
on the basis of criteria such as relevance to the topic area, the risk of ambiguity and
the level of attitude represented by the items. This approach is widely known as the
Thurstone Equal-Appearing Interval Scale. The procedure followed in constructing
the scale is described below:
(i) A large number of statements relevant to the attitude or concept under study
are gathered.
(ii) A panel of judges evaluates the statements. Each statement is written on a
separate card. The judges sort each card into one of 11 piles representing the
degree of favourableness the statement expresses.
(iii) The sorting yields a composite position for each of the items. In case of disagreement
between the judges, the item is discarded.
(iv) For the items that are retained, a median scale value between one and eleven is assigned.
(v) A final selection of statements is made on the basis of the median score. Of
the 11 piles, 3 are identified by the judges as favourable, unfavourable and neutral.
The eight intermediate piles are unlabelled.
The Thurstone method is widely used for developing differential scales to measure
attitudes, and the scale is most reliable for measuring a single attitude. However, this method
of construction involves considerable cost, time and people, which often makes it impractical,
and the values assigned to the items by the judges are subjective.
3. Item Analysis scaling
In item analysis scaling, an item is evaluated on the basis of how well it
discriminates between those persons whose total score is high and those whose total score
is low. It involves calculating the mean score for each scale item among the low scorers
and the high scorers. The item means of the high-score group and the low-score
group are then tested for significance by calculating t values. Finally, the items that have
the greatest t values are selected for inclusion in the final scale.
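The discrimination test described above can be sketched as an independent-samples t value between the high-total and low-total groups (equal-variance form; the item scores below are hypothetical):

```python
# Item analysis: compare an item's mean between high-total and low-total
# scorers with a pooled-variance independent-samples t value.

import math

def t_value(high, low):
    n1, n2 = len(high), len(low)
    m1, m2 = sum(high) / n1, sum(low) / n2
    s1 = sum((x - m1) ** 2 for x in high) / (n1 - 1)   # sample variances
    s2 = sum((x - m2) ** 2 for x in low) / (n2 - 1)
    sp = math.sqrt(((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2))
    return (m1 - m2) / (sp * math.sqrt(1 / n1 + 1 / n2))

item_high = [5, 4, 5, 4]   # item scores among the top 25% of total scorers
item_low = [2, 3, 2, 1]    # item scores among the bottom 25%
print(round(t_value(item_high, item_low), 2))   # larger t => better discriminator
```

Items whose t values are largest separate the favourable and unfavourable groups most sharply and are retained for the final scale.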
(ii) A trial test can be conducted with a small group of respondents who form
part of the final study. The agreement or disagreement towards each
statement is obtained on a five-point scale.
(iii) The responses are scored in such a way that the response indicating the most
favourable attitude is given the highest score of 5 and the most unfavourable
attitude the lowest score of 1.
(iv) The total score of each respondent is obtained by adding the scores for
the individual statements.
(v) The next step is to array the total scores and find out those statements
which have a high discriminatory power. For this purpose the researcher
may select some part of the highest and the lowest total scores, e.g., the top
25 percent and the bottom 25 percent. These two extreme groups are
interpreted to represent the most favourable and the least favourable attitudes
and are used as criterion groups by which to evaluate individual statements.
Thus the statements which consistently correlate with low favourability and
with high favourability are identified.
(vi) The statements which correlate with the total test are retained in the final
instrument and all others are discarded.
The advantages of the Likert scale are that it is relatively easy to construct,
considered more reliable and less time consuming. A major limitation is that
the scale only examines whether respondents are more or less favourable
towards the subject under study; it cannot reveal how much more or less
they are. There is also no basis for believing that the five positions indicated on the
scale are equally spaced.
4. Cumulative scales
Cumulative scales consist of a series of statements to which a respondent expresses
agreement or disagreement. The special feature of this scale is that it forms a cumulative
series: the statements are related to one another in such a way that an individual who
replies favourably to item 3 also replies favourably to items 2 and 1. An individual
whose attitude is at a certain point on a cumulative scale will answer favourably all the
items on one side of this point and unfavourably all the items on the other side of
this point. The individual's score is arrived at by counting the number of
statements answered favourably. If the total score is known, it is easy to
estimate the respondent's answers to the individual statements constituting the cumulative
scale. A major scale of this type is Guttman's scalogram.
Scalogram analysis refers to the procedure for determining whether a set of
items forms a unidimensional scale. A scale is unidimensional if the responses fall into a
pattern in which endorsement of the item reflecting the extreme position also results in
endorsing all items which are less extreme. Under this technique, the respondents are
asked to indicate, in respect of each item, whether they agree or disagree with it. If the
items form a unidimensional scale, the response pattern will take the following form
(X = agree, - = disagree):

Respondent score   Item 1   Item 2   Item 3
       3              X        X        X
       2              X        X        -
       1              X        -        -
       0              -        -        -

A score of 3 means that the respondent agrees with all the statements, which
expresses the most favourable attitude. A score of 2 reveals that the respondent
does not agree with the third statement but agrees with all the other statements. In this
way, the scores can be interpreted.
The procedure for developing a scalogram is described below:
(i)
(ii)
(iii) The next step is to pretest the items to determine the scalability. The pretest
should include a minimum of 12 items and should be administered on at least
[Scalogram worksheet: respondents' agreement (X) or disagreement (-) with items
9, 6, 2, 5 and 7, arrayed by total score from 4 down to 0. Rows whose response
pattern exactly matches the total score are marked "perfect"; deviating rows are
marked "nonscale", and the errors per case are tallied.]
The total scores for the various opinions are obtained. The order is then
shifted so that it results in a reduced number of items. The above example
shows that five items (9, 6, 2, 5 and 7) are selected for the final scale. Perfect
scales are those in which the respondent's answers fit the pattern that would
be reproduced by using the person's total score. Non-scale types are those
in which the category pattern differs from that expected from the
respondent's total score, i.e., non-scale items have deviations from
unidimensionality, or errors. The selection of an item for the final unidimensional
scale is made on the basis of the coefficient of reproducibility. Guttman has
set 0.9 as the minimum level of reproducibility required to accept a scale.
The following formula is used for measuring the reproducibility:
Coefficient of reproducibility = 1 - e / (n × N)
where
e = number of errors
n = number of items
N = number of cases
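The coefficient is straightforward to compute; the error count in this sketch is hypothetical:

```python
# Guttman's coefficient of reproducibility for a scalogram:
# 1 - e / (n * N), where e = errors, n = items, N = cases.

def reproducibility(errors, n_items, n_cases):
    return 1 - errors / (n_items * n_cases)

coeff = reproducibility(errors=3, n_items=5, n_cases=10)
acceptable = coeff >= 0.9     # Guttman's minimum level to accept the scale
print(coeff, acceptable)
```

With 3 errors across 5 items and 10 cases the coefficient is 0.94, so the scale would be accepted under Guttman's 0.9 criterion.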
5. Factor scales
Factor scales include a variety of techniques that have been developed to address
two issues, viz., the problem of dealing with a universe of content that is multidimensional,
and the problem of uncovering underlying dimensions that have not been identified by
exploratory research. Factor scales are developed through factor analysis, or on the
basis of intercorrelations of items which indicate a common factor responsible for the
relationships between items. The techniques are designed to intercorrelate items so that
the degree of interdependence may be detected. Important factor scales based on
factor analysis are the semantic differential scale and multidimensional scales. They are
discussed below:
(a) Semantic Differential Scale
The semantic differential scale (S.D.) was developed by Osgood and his
associates to measure the psychological meaning of an object to an individual. The
scale is built on the presumption that the object under study can have different dimensions
of connotative meaning, which can be located in multidimensional property space, or, in
the context of the S.D. scale, in semantic space. The scaling consists of a set of bipolar
rating scales, usually of 7 points, on which the respondents rate each concept. An
example of the scale being used by a panel of corporate leaders to rate candidates for
a leadership position is shown below. Three contributing factors, viz., evaluation (E),
potency (P) and activity (A), are considered.
(E) Sociable    ____ ____ ____ ____ ____ ____ ____
(P) Strong      ____ ____ ____ ____ ____ ____ ____
(A) Active      ____ ____ ____ ____ ____ ____ ____
(P) Tenacious   ____ ____ ____ ____ ____ ____ ____
(A) Fast        ____ ____ ____ ____ ____ ____ ____
The nature of the problem determines the selection of dimensions and bipolar
pairs. The SD scale is adapted to each research problem. The construction of SD scale
involves the following steps:
SUMMARY:
This unit examined the meaning and the various types of experimental design.
The various activities involved in conducting an experiment were discussed, and the
factors affecting internal and external validity were dealt with. The pre-experiments,
true/lab experiments and field experiments were covered. The various rating and
ranking scales were discussed, and the construction of arbitrary, cumulative, consensus,
item analysis and factor scales was examined.
Equipped with the knowledge of research design and measurement scales,
the next unit presents the various data collection methods, sampling techniques and the
parametric and non-parametric tests available to test hypotheses.
A retail grocery chain wants to study the effects of the various levels of advertising
effort and price reduction on the sale of specific branded grocery products.
What type of experimental design would you recommend? Suggest in detail the
design for the study.
What are the essential differences among nominal, ordinal, interval and ratio
scales? How do these differences affect the application of statistical techniques?
What are the four sources of measurement error? Illustrate by example how
each of these might affect the measurement results in a face-to-face interview.
How is the interval scale more sophisticated than the nominal and ordinal scales?
Why is ratio scale considered to be the most powerful of the four scales?
Describe the difference between the rating scales and ranking scales and indicate
the application areas where they can be used
Unit - 3
3.1 INTRODUCTION
Once the problem is defined and the research design is finalized, the various sources
of data, and the ways in which data can be collected for the purpose of analysis, testing of
hypotheses and answering research questions, should be explored. Data collection can
be made through primary, secondary and tertiary sources, which are dealt with in detail.
This unit also highlights the sources of data and the methods of collecting the same. Data
cannot always be collected from the entire population, due to various reasons like
difficulty in estimating the population, cost constraints, time, etc. A sampling technique
has to be adopted by the researcher for collection of data. This unit provides a detailed
account of the probability and non-probability sampling techniques. The issues
regarding determination of sample sizes are also presented.
Know the difference between primary and secondary data and their sources
Understand the various data collection methods and the advantages and
disadvantages of each method
Focus groups
A focus group involves a formalized process of bringing a small group of people
together for an interactive and spontaneous discussion on one particular topic.
A focus group generally consists of 6 to 12 participants, with a moderator leading the
unstructured discussion, which generally lasts between 90 minutes and two hours. By
facilitating the discussion, the moderator elicits as many ideas, attitudes, feelings and
experiences as possible regarding the issue concerned. Participants are generally chosen
on the basis of their expertise in the topic on which information is sought.
The goal of conducting a focus group is to give researchers access to as
much information as possible regarding the product, service, concept or organization.
The focus group is not restricted to merely asking and answering questions. Its success
relies to a great extent on the group dynamics, the willingness of the
participants to engage in interactive dialogue and the ability of the moderator to keep
the discussion on track. The fundamental idea behind the focus group is that one
participant's remark or response may initiate comments and discussion from the other
participants, thus generating spontaneous and free interplay among all the participants.
Focus groups are relatively inexpensive and can provide access to dependable data
within a short period of time.
Focus group objectives
The objectives of forming focus groups are listed below:
Focus groups provide data for defining and refining problems. In situations
where it is difficult for the researcher to pinpoint the specific problem, the
focus group helps differentiate between symptoms and root-cause problems.
In certain situations, researchers may not be sure about the specific types of
data or information that should be investigated. In these situations, focus groups
reveal unexpected components of the problem and thus can help researchers
determine the specific data to be collected.
There are situations when quantitative research investigations lead to results
which are not understandable or explainable. In such situations the focus group
provides data for a better understanding of the results derived from
quantitative studies.
The general interactions and discussions among the focus group members
help generate new ideas, products or services, or innovative ways of
solving problems hitherto unexplored.
The focus group plays a critical role in the process of developing new constructs
and creating reliable and valid measurement scales. In the exploratory stage,
the focus group reveals additional insights into the underlying dimensions that
may or may not make up the construct. This insight can help researchers
develop scales that can later be tested and refined through larger survey research
designs.
The issue of sampling needs to be addressed carefully while planning the
focus group. Random sampling may eliminate bias and produce dependable conclusions.
However, it may not be possible or necessary in certain situations; for example, in
qualitative research, a more flexible research design can be followed.
The moderator plays a central role in the successful conduct of the focus group
discussions. The moderator should be able to stimulate and control the focus group
discussions over the predetermined topics in a skillful manner, and should be able to
draw the best and most innovative ideas from the participants regarding the topic or
problem under discussion. The moderator is responsible for creating positive group
dynamics and a comfort zone between himself and each group member, as well as
among the group members.
The moderator should have enough background knowledge regarding the topic
of discussion. Apart from the skills discussed above, moderating the session requires
objectivity, self-discipline, concentration and careful listening. The moderator should
be completely prepared with the questioning route, yet should allow flexibility depending
on the situation.
The actual conduct of the focus group discussion can be arranged in three
phases, viz., opening the session, the main session and closing the session. They are dealt
with below:
Opening the session
The moderator should warmly receive the participants and make them feel
comfortable. The participants should be instructed to write their names on the name
cards. A few minutes should be allowed for socializing before seating the participants,
so that a warm, friendly and congenial environment is set. The socializing session
can be used by the moderator to observe the participants and place them in groups.
The moderator should discuss the ground rules for the session (only one person should
talk at a time, each one should be given a chance, and so on) and brief the group about
the purpose of the session. The moderator can begin the discussion with an open question
designed to engage all participants. This breaks the ice and helps build
positive group dynamics and a comfort zone.
The main session
The topic area is introduced in the main session, and as the discussion starts the
moderator steers its direction, using probing techniques to get as many details as possible.
As there are no hard and fast rules regarding how long a discussion can be carried on, the
moderator should use his judgment in deciding when to close one topic and move to the
next. Critical questions should be given more time so that ideas, feelings
and thoughts can be elicited to the maximum.
Closing the session
After covering all the topics for which the focus group was formed, the session
can be wound up. In this process the moderator can summarize the conclusions arrived
at in the discussion and also invite closing comments from the participants regarding
further contributions or disagreements over certain ideas. If nothing else arises, the
moderator can close the discussion after thanking the participants and distributing the
promised incentives.
The spontaneous and unrestricted interaction among the participants gives rise
to new ideas, thoughts and feelings which cannot be elicited in a one-to-one
interview. The respondents provide creative and honest opinions, and the
conducive environment enhances creativity.
The underlying reasons behind attitudes, feelings, emotions, behaviour, etc. can be
explored in the focus group discussion.
The researcher has first-hand information and the opportunity to be
involved in the overall process, right from starting the focus group till closing it.
This gives an in-depth insight into the various dimensions of the problem which
are hitherto unexplored.
A focus group interview can cover a number of topics; the discussion can be
directed successfully over a number of issues.
The data structures developed from a focus group interview cannot as such be
applied to the target population; the generalization of the research findings is
questionable.
The researcher has only limited ways to substantiate the reliability of the data. Added
to this, the data collected from the participants may not be structured and
amenable to further statistical inferences.
The data collected from the focus group can be interpreted subjectively by the
researcher according to the researcher's preconceived views. This bias
reduces the credibility and trustworthiness of the data and the information derived.
The cost per participant, in terms of identifying, recruiting and compensating, is
relatively high.
The members of the panel are randomly chosen. They may be exposed to an
advertisement, or their attitude towards a particular brand may be recorded. After a few
days or months the panel may be exposed to a different set of advertisements, or their
attitude may be measured again, to identify changes in the behavioural pattern.
Thus the continuing set of members forms the sample base, or the platform, for assessing
the effects of change. Such members are called panels, and the research that uses them
is called a panel study.
The panels can be static or dynamic. In a static panel the same members
form part of the panel over an extended period of time. In a dynamic panel the
members change from time to time as the study progresses through successive phases. The
static panel offers a good and sensitive measure of change. However, due to continuous
interviews the panel members are over-exposed to the issues at hand and may not
reflect the view of the population. Members may also not continue to be part of the panel
for a long period of time; there may be dropouts. The major drawback of the dynamic
panel is that it deals with different people, which may give rise to different opinions, so
the changes cannot be tracked in an objective manner.
3.3.2 Secondary data sources
Secondary data refers to the information gathered from already existing sources.
Secondary data may be either published or unpublished data. The published data are
available in the following forms:
Publications of central, state and local governments
Public records and statistics, historical documents and other sources of published
information.
Collection of secondary data involves less time and cost. However, a researcher should
not rely solely on secondary data, for the following reasons: the data may have become
obsolete and may not provide current, updated information; and the data would have been
collected for some other purpose and hence may not meet the specific requirements
of the researcher.
The researcher before using secondary data should ensure the following:
The reliability of data should be ensured by way of finding out the type of
people involved in data collection, the sources from which the data is collected,
the methods used to collect the data, the time of data collection and the level of
accuracy associated.
The secondary data used by the researcher would have been collected for a
problem different from the one the researcher is presently attempting to solve.
Hence the researcher should ensure that the data collected is suitable for the
purpose of the present study.
The secondary data should be adequate for the conduct of the study. It should
be related to the area and should neither be narrower nor wider than the problem
attempted by the researcher.
Availability of resources
3.4.1 Interviews
In this method the respondents are interviewed for the purpose of obtaining
information on the issues pertaining to the research. The interview may be either
unstructured or structured, and it can be a personal interview or conducted through
telephone, mail, the internet or a combination of these.
Unstructured interviews
In unstructured interviews, the interviewer does not conduct the interview
with a planned sequence of questions. The aim of this interview is to highlight the
preliminary issues so that the researcher can determine the variables which need further
in-depth investigation. The researcher resorts to unstructured interviews when the
problem is not clearly formulated or when a clear understanding of the variables involved
is not present. In the attempt to obtain information, the researcher may adopt different
styles and sequencing of questions for different respondents. Some respondents may provide
information in response to open-ended questions, whereas others may require more direction.
Some may be defensive and unwilling to share information; some
may even be reluctant to undergo the interview and may refuse to respond. The
researcher has to employ various questioning techniques so as to bring the respondents'
defenses down and make them more amenable to revealing information. The researcher
should also know when to retreat or terminate the interview if the respondents
cannot be convinced to participate or impart the information.
The unstructured interview will direct the researcher to understand the variables
which need greater focus based on which a structured interview can be planned.
Structured interviews
Structured interviews are conducted when the interviewer knows the types
of questions to be asked of the respondents or when the information needs are clearly
known. The questions may focus on the issues that were highlighted during the
unstructured interviews and are considered relevant to the identified problem. The
interview may be conducted by the researcher himself or by a team of interviewers.
The researcher or interviewer should be very clear about the purpose of each question,
particularly when a team of interviewers conduct the survey. The same questions are
posed in the same sequence or manner to all the respondents and the responses are
noted down. Depending on the situation and the respondent's willingness and knowledge,
the researcher can also ask other relevant questions which may not be in the list so as
to gain more insight into the identified problem. The researcher may also include visual
aids, drawings, pictures and other materials in conducting the interviews. Visual aids
are particularly useful in situations where ideas cannot be clearly articulated with words
alone.
1. Personal interviews
Personal interviews or face to face communication is a two-way conversation
initiated by the interviewer to obtain information from the participants. The interviewer
and the participants may be strangers. The interviewer controls the topic and pattern of
discussion. The participant or the respondents may not gain anything out of their
participation in the interview.
The success of the personal interview rests, among other things, on the respondent's
ability to provide the information needed and to understand the importance
of the information provided. The researcher should take the necessary steps to motivate
the respondents to cooperate so as to ensure the successful conduct of the interview.
Increasing participation
The researcher can enhance the respondents' participation by explaining
the kind of answer sought, the terms in which it should be expressed, the depth and clarity of
information needed and so on. Coaching can be provided to the participants, but care should
be taken to avoid introducing bias. The interviewer can make the session an interesting
and enjoyable experience by administering adequate motivation techniques.
Some of the techniques for successful interviewing of the participants are listed below:
The interviewer should introduce himself by name and the organization to which
he is affiliated. The interviewer can establish his identity with introductory
letters or other information that confirms the legitimacy of the work. Enough
details regarding the work to be done should be given, and more information
may be provided wherever it is demanded. The interviewer should be able to kindle
the interest of the respondent.
In the process of gathering data the interviewer should ensure that the objective
of each question is achieved and the needed response is obtained. The
interviewer can resort to probing, but steps should be taken to avoid the bias.
ii. Non-response error
Non-response error arises due to the selection of samples through the probability sampling method. The problem
can be tackled by way of attempting to contact the respondent again. Another approach
is to treat all the remaining non-participants as a new sub-population after a few callbacks.
A random sample is drawn from the non-participant group and an attempt is made to
contact and complete this sample at a hundred percent success rate. Findings from this
non-participant sample can then be weighted into the total population estimate. The
researcher can also try to substitute the missing participant, but care should be taken
that the substitute possesses the significant characteristics of the replaced participant;
for example, the respondent should belong to the same occupation, educational
status, income level and so on.
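The weighting step described above can be sketched as follows. This is an illustrative Python sketch, not a procedure from the text: the function name and all figures (a hypothetical sample of 1,000 with 700 respondents) are invented for the example.

```python
# Sketch (hypothetical figures): combining respondent data with a follow-up
# subsample of non-respondents, weighting each group's mean by its share
# of the original sample.

def weighted_estimate(respondent_values, followup_values,
                      n_respondents, n_nonrespondents):
    """Population mean estimate after a callback subsample of non-respondents.

    Each group's mean is weighted by the proportion of the original
    sample that the group represents.
    """
    n_total = n_respondents + n_nonrespondents
    mean_resp = sum(respondent_values) / len(respondent_values)
    mean_nonresp = sum(followup_values) / len(followup_values)
    return ((n_respondents / n_total) * mean_resp
            + (n_nonrespondents / n_total) * mean_nonresp)

# 700 of 1000 sampled members responded; a random subsample of the 300
# non-respondents was then contacted until fully complete.
estimate = weighted_estimate([4, 5, 3, 4], [2, 3], 700, 300)
```

Because the callback group's mean enters with weight 300/1000, the estimate is pulled toward the non-respondents' answers instead of ignoring them.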
iii. Response error
Response error occurs when the data reported differ from the actual data. The
error can be caused by the respondent or the interviewer or during the preparation of
data for analysis. Participant initiated error occurs when the participant fails to answer
accurately either by choice or due to lack of knowledge. Interviewer error arises due to
the inability to conduct the interview in a controlled manner. This may take many forms
like the failure to secure cooperation, lack of consistent interview procedures, inability
to establish appropriate interview environment, bias due to physical presence, failure to
record answers correctly. These errors affect the quality of the data collected.
iv. Cost
To conduct a personal interview, the respondents should be met individually.
They might be scattered geographically, and the time and cost involved in administrative
and travel tasks are higher. Sometimes the respondents may not be available and repeated
contacts have to be made, which adds to the cost. In addition, the researcher may
employ interviewers who have to be paid. To reduce the cost, telephonic interviews and
self-administered surveys can be attempted.
Advantages and drawbacks
The major advantage of personal interviewing is the ability to secure in-depth
information and detail. The ability to harness information is greater in personal interviewing
than in telephone, mail or internet surveys. The researcher can adapt
the questioning technique to the respondent's ability to understand, and further
clarification can be made immediately by repeating or rephrasing the questions concerned.
The researcher can also gather information from the nonverbal cues exhibited through the
body language of the respondent.
However, personal interviewing involves cost in terms of both money and
time. Costs may escalate where the study covers a wide geographic area or
has a large sample. The chance of the outcome being affected by the interviewer's
bias is greater in personal interviews, and the respondents may feel uneasy about
the secrecy of their responses in a face-to-face interaction.
2 Telephonic interviews
Interviewing through the telephone offers the following advantages:
Conducting the interview through telephone reduces cost. The reduction
arises from lower travelling and administrative expenses involved
in training and supervision. Fewer interviewers need to be trained
since the interview is conducted over the telephone, and the coverage per
interviewer is greater than in face-to-face interviews.
Telephonic interviews enable the researcher to screen and cover a large population
spread over a wide geographical area, making a much more representative
sample possible.
Unlike face to face interview where the respondent may avoid contact with the
researcher, the contact rate is higher in telephonic interviews as the respondent
has to pick up the ringing phone. However, the use of caller identification facility
may reduce the contact rate.
The following drawbacks arise out of telephonic interviews:
The length or duration for which a telephonic interview can be conducted is
limited. A ten-minute interview is considered ideal, though sometimes the
interview may extend to more than an hour.
3.4.2 Observation
Observation is the most commonly used data collection method in many
studies relating to the behavioral sciences. Observation enables data to be collected without
asking questions of the respondents. The respondents can be observed in their natural
work environment or in a lab setting, and their activities and behaviors of interest can be
recorded.
In conducting research, casual examination without purpose cannot be called
observation. Observation becomes a scientific tool for data collection if it is conducted
specifically to answer a research question. It should be systematically planned and
executed using proper controls and should provide a reliable and valid account of what
has happened.
Types of observation
Observation can be grouped under the following categories:
1. Type of activity under observation
Observation includes monitoring both behavioral and non-behavioral activities
and conditions. Behavioral observation includes nonverbal analysis, linguistic analysis,
extralinguistic analysis and spatial analysis.
There are four dimensions to extralinguistic behaviour viz., (i) vocal which
includes pitch, loudness and timbre (ii) temporal which includes the rate of
speaking, duration of utterance and rhythm (iii) interaction which includes the
tendencies to interrupt, dominate or inhibit and (iv) verbal stylistic including
vocabulary and pronunciation, peculiarities, dialect and characteristic
expressions.
NOTES
The records include historical or current records and public or private records;
they may be written or printed.
3. Concealment
This categorization is based on whether the participant is aware of the observer's
presence. The presence of the observer may cause the participant to behave differently,
which might defeat the very purpose of the observation. If the activity in which the
participants are involved is highly absorbing, there is a high chance that the participant
may remain unaffected by the presence of the observer. However, the potential bias
due to the presence of the observer cannot be totally ruled out.
In order to rule out the bias in behaviour, observers may conceal themselves
from the object being observed using mechanical means, for example a one-way mirror,
camera or microphone. However, this has to be carefully evaluated on ethical
grounds.
Partial concealment is where the presence of the observer is not concealed but
his objectives or interest are not revealed. For example, in order to evaluate the
performance of a salesperson, a sales manager may be present when the salesman is
dealing with a customer; however, the purpose of the sales manager's presence may
be concealed and he may pretend to be involved in some other task.
4. Participation
The presence of the observer and his involvement in the research setting is
called participant observation. He plays the role of observer as well as the participant.
The participants may or may not know about the same. The observer should be more
efficient as he has to play a dual role.
Non-participant observation occurs when the observer collects the data without
becoming an integral part of the research setting. The observer merely observes the
activities, records them and tabulates them in a systematic manner. This type of
observation requires the observer to be physically present in the research setting for an
extended period of time, which makes it a time-consuming task.
5. Definiteness of structure
The observation can be grouped as structured and unstructured observation.
Clear definition of various aspects of observation viz., the units to be observed, method
of recording, extent of accuracy needed, conditions of observation and selection of
pertinent data of observation etc are the characteristic of structured observation.
Structured observation is appropriate in case of descriptive studies.
If the observation is conducted without the above characteristics defined in
advance, it is termed as unstructured observation. This method of observation is usually
followed in exploratory studies.
6. Extent of control
The observation can be carried out in controlled or uncontrolled settings.
Uncontrolled observation is carried out in a natural setting. No attempt is made to use
precision instruments. The main aim of this method is to get a spontaneous picture
of reality. It provides naturalness and completeness to the observation. However, it may
lead to subjective interpretation and to overconfidence that the observer knows more
about the observed phenomena than he actually does. It is usually used in exploratory research.
Controlled observation takes place according to a definite predetermined plan.
It involves experimental procedure and involves the use of precision instruments to
record the observation. The observation is usually carried out in a standardized and
accurate manner leading to certain assured degree of generalization. It is usually carried
out in the form of experiments in laboratory or under controlled conditions.
Decision involved in conducting the observational study
Observational studies involve the decision regarding the type of the study, content
to be observed, training requirement of the observer/researcher and the data collection.
1. Type of the study
Observation in various forms is practiced in different types of studies. In
exploratory studies, data collection is done through simple observation, which may not
be carried out in a structured manner. In studies other than those of an exploratory
nature, systematic observation employing standardized scientific procedures is
followed.
2. Content specification
In observational studies the variables to be observed and other variables that
may affect them should be specified. From the specified variables, the variables that are
to be observed should be selected. The variables should be operationally defined so as
to avoid confusion in the minds of observers.
3. Training the observers
The validity and reliability of the findings from observation depend on the
observer. If the observer is not trained properly, the data collected may not lead to
valid results. The observer is prone to fatigue, halo effects and observer's drift, which
affect the dependability of the data collected. Hence, certain guidelines should be
followed in the selection of observers. The observer should have the ability to function
amidst a lot of distractions, remember details of the activity observed, blend with the
setting being observed and extract the most from the observational study. The observer
should be given clear instructions regarding the outcome sought and the precise content
to be observed.
4. Data collection
Data collection plans deal with answers to questions like who, what, when,
how and where: the qualification of a participant to be observed, the characteristics of
the observation, the time of observation, the method of recording data by the observers
and the place where the observation is to be conducted.
3.4.3. Questionnaires
Most of the research studies carried out for solving business problems require
the researcher to depend on primary data. The researcher should collect data through
questionnaires/ interview schedules and process the same so as to provide solution to
the identified problem. A questionnaire is a formalized framework consisting of a set of
questions and scales designed to generate primary raw data. It is a preformulated written
set of questions to which the respondents record their answers. The answers are mostly
chosen by a respondent from within the closely defined alternatives. The questionnaires
can be administered personally, mailed to the respondents or electronically distributed.
A. Personally administered questionnaire
If the study is confined to a local area, the questionnaires can be administered
and collected personally. The main advantage is that the researcher can collect
all the completed responses within a short period of time. The researcher has an
opportunity to introduce the research topic and motivate the respondents to offer frank
answers, and any doubts that the respondents have about the questions can be clarified
on the spot.
Administering the questionnaire to a large number of respondents at a time
saves time and expense and ensures quick collection of data as against
personal interviewing. Hence, wherever possible, group administration of the questionnaire
should be opted for, depending on the sample framework. The major drawback is
the reluctance of organizations to give time to conduct a survey among a group of
employees.
B. Mail questionnaire
Where the respondents are scattered over a wide geographical area, the
researcher has to resort to mail questionnaires. The questionnaires are mailed to the
respondents, who can complete them at their convenience, in their homes, at their own
pace. The main advantage is that the anonymity of the respondents is maintained, which
leads to a free and frank disclosure of information. Respondents spread over a
wide geographical area can be reached, and they can take time and fill in the questionnaire
at their convenience. It can also be administered electronically.
However, the return rates of mail questionnaires are typically low. Doubts
about the questionnaire cannot be cleared as easily as in a personally administered
questionnaire, and the representativeness of the sample is questionable due to the low
return rates. The respondents can be motivated by sending follow-up letters, enclosing
small monetary amounts as incentives, providing respondents with self-addressed,
stamped return envelopes and keeping the questionnaire as brief as possible.
The question should have a proper scope and should cover the issue. The
questions asked should reveal all that is needed to know. Questions are
considered to be ineffective if they do not provide the right information that is
needed.
The question should ask precisely what is needed. For example, if the researcher
needs to know the family income of the respondent but the question simply asks
about income, the respondent may interpret it as his own income rather than the
family income. Unambiguous words should be used so that clarity
is ensured.
The question asked by the researcher may be contributing towards the theme
and may be precise but it may not be possible for the respondent to answer the
same adequately. The respondent may require time to think and answer certain
questions. Sometimes the respondent may not be able to give an accurate
answer due to his inability to recall things from memory.
probing questions. If correctly administered, open-ended questions can provide the
researcher with a rich array of information.
to the choice of a laptop will only reveal the factors considered but not their order of
importance. A ranking question leads the respondent to rank the most important
factor as 1, the next most important as 2, and so on.
6. Positively and negatively worded questions
The questionnaire should include both positively and negatively worded
questions. If all the questions are positively worded, the respondent will tend to
mechanically circle all the points toward one end of the scale; a respondent who is
interested in completing the questionnaire quickly will tend to circle all the questions toward
one end. The researcher can keep a respondent more alert by including both positively
and negatively worded questions. Double negatives and excessive use of
words such as 'not' and 'only' should be avoided in negatively worded questions,
as they tend to confuse the respondents.
7. Double-barreled questions
A question that leads to different possible responses to its sub-parts is called a
double-barreled question. Such questions should be avoided by breaking them
into two or more parts. For example, the question 'Do you like the flavour
and the taste of the soft drink?' may lead to an ambiguous reply; it should be
broken into two questions addressing flavour and taste separately so as to obtain an
unambiguous response.
The types of questions dealt with below should be carefully avoided or used with
caution by the researcher.
8. Ambiguous question
A question may not be double-barreled but may still lead to ambiguity. For
example, if a researcher studying job satisfaction asks the respondent
to rate the level of satisfaction, the respondent may be confused as to whether the
question addresses satisfaction related to the work environment, salary, team spirit or
overall satisfaction. The question should not give rise to ambiguous responses and bias.
9. Memory related questions
If the questions require respondents to recall experiences from the distant past
that are very hazy in their memory, the answers to such questions might be biased.
10. Leading / Loaded questions
Questions should not be asked in such a way that the respondents are forced
or directed to respond in a manner they would not have under normal situations
where all possible alternatives are given. Questions should not prompt the respondents
to answer in the way the researcher wants them answered. For example, 'Don't you think
that salary is the main reason for software employees to quit the job?'. Questions
which emotionally charge the respondents are called loaded questions. Such
questions lead to bias in response and should be avoided.
The vocabulary should be simple, direct and familiar to all respondents. If the
words or jargon used, or the language itself, are not understood by the respondent,
wrong or biased answers may result. The wording and language should
be selected keeping in mind the educational level of the respondents, the terms
used in their culture and their frames of reference.
The words used should not give rise to ambiguity or vagueness. This problem
arises when the respondent is not given an adequate frame of reference, in
time and space, for interpreting the question. Words such as 'often' and 'usually'
lack an appropriate time referent, leading the respondents to choose their own,
which makes the answers incomparable. Similarly, an appropriate space or
location is often not specified. For example, does the question 'Mention your place of
origin' refer to the district, the state or the country?
The instructions provided for answering the questions should not confuse the
respondent. The questions should be directed more towards measuring the
respondent's knowledge of or interest in the subject.
Simple, short questions should be asked instead of long ones. The researcher should
see that each question or statement in the questionnaire is worded as briefly
as possible.
Questions should not be asked in such a manner that they elicit socially desirable
responses. For example, to the question 'Do you think that physically challenged people
should be given more weightage in employment opportunities?', a socially desirable
answer would be provided irrespective of the true feeling of the respondent.
Well-sequenced questions and response alternatives will make things easier for the
respondents to answer. These aspects are explained below:
In the introduction section, the researcher can disclose his identity and
communicate the purpose of the research. It is also used to motivate the
respondents to answer the questions by conveying the importance of the research
work and by specifying the importance of contribution from the respondent.
The researcher should also ensure the confidentiality of the information provided.
The introduction section should end with a courteous note, thanking the
respondent for the time devoted to respond to the survey.
Questions relating to the personal profile of the respondents viz., name, gender,
age, education, income, marital status etc., can appear in the beginning or at the
end of the questionnaire. The questions should provide a range of response
options rather than seeking an exact figure. The personal profile related questions
asked at the end may have a greater chance of response because the respondent
would have gone through other questions which would have convinced him
about the legitimacy and genuineness of the questions framed. This would make
them more amenable to reveal the personal information. Some researchers feel
that asking personal data in the beginning would enable the respondent to
psychologically identify themselves with the questionnaire and enhance the
commitment to respond.
Open-ended questions should be placed at the end so that the respondent finds
it easy to comment on the various aspects.
The most important purpose of pretesting is to know whether the meaning of
the questions is interpreted in the manner intended. A problem may arise because
the respondent is not familiar with a word, which results in distortion of the
meaning of the question; the respondent is likely to modify a difficult question
in a way that makes it easier for him to respond.
Flow of the questionnaire should be tested to know whether the transition from
one topic to another is natural, logical and ensures a coherent flow.
Task difficulty should also be identified through the pretest. The respondent may
be confused if a question requires him to make connections or put
together information in an unfamiliar way. For example, a question on annual
income involves calculation by the respondent; instead, the researcher can
ask for the monthly income and calculate the annual income on his own.
The ability to capture and maintain the interest of the respondent throughout the
entire questionnaire is a major challenge. The extent to which this is achieved
should be pretested.
Testing the items for an acceptable level of variation in the target population is
one of the common goals of pretesting. The researcher should look out for
items showing greater variability. A very skewed distribution in a pretest can
serve as a warning signal that the question is not tapping the intended construct.
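The variability check described above can be sketched in Python. This is an illustrative example, not from the text: the function name and the response data are hypothetical. It computes each item's variance and a simple population skewness coefficient so that low-variation or heavily skewed items can be flagged.

```python
# Sketch (hypothetical function name and data): diagnosing pretest items
# for too little variation or a heavily skewed response distribution.
import statistics

def item_diagnostics(responses):
    """Return (variance, skewness) for one item's pretest responses."""
    mean = statistics.fmean(responses)
    sd = statistics.pstdev(responses)
    if sd == 0:
        return 0.0, 0.0  # every respondent gave the same answer
    # Population skewness: mean cubed deviation divided by sd cubed.
    skew = sum((x - mean) ** 3 for x in responses) / (len(responses) * sd ** 3)
    return sd ** 2, skew

# Item A spreads across a 5-point scale; item B is piled up at one end.
var_a, skew_a = item_diagnostics([1, 2, 3, 4, 5, 3, 2, 4])
var_b, skew_b = item_diagnostics([5, 5, 5, 5, 4, 5, 5, 5])
```

Items like B, with low variance and a strongly skewed distribution, are candidates for rewording before the final survey.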
The flaws identified in the questionnaire should be corrected. Finally, the pretest
analysis should return to the first step in the design process: each question should be
reviewed again for its contribution to the objectives of the study, leading to
the other steps. The last step in the process may be another pretest, if major changes are
needed.
be handed over to the respondents, and the researcher may help them in recording their
answers to the various questions in the schedules. The researcher can explain the aims and
objectives of the investigation and clear the doubts and difficulties the
respondents feel in understanding the implications of a particular question.
The success of this method depends on the selection of enumerators for filling
up the schedules or assisting the respondents to fill them up. The enumerators should be
trained to perform their job well, and the nature and scope of the investigation should be
clearly explained to them. The purpose of each question and the type of response
expected should be explained to them. Enumerators should possess the patience to tackle
the respondents and should be intelligent enough to cross-examine and find the truth. They
should be sincere, hardworking and persevering.
Collection of data using interview schedules and enumerators leads to fairly
reliable results and extensive inquiry. However, it is expensive and takes time.
Difference between questionnaire and interview schedule
Questionnaire and interview schedule are both used for data collection and
they resemble each other. However, the important points of difference are highlighted
below:
i. The questionnaire can be sent through mail with a covering letter and the same
does not require further assistance. The schedule is filled out by the researcher,
who interprets the questions whenever needed.
ii. Collecting the questionnaire requires less expense as it is filled by the respondent
himself. In the case of schedules, enumerators should be appointed. This involves
additional expenses in terms of payments made to them and training provided.
iii. The rate of non-response is usually higher in case of mailed questionnaire. In
case of schedules the non-response rate is lesser as the enumerator himself fills
the schedules and is personally present. However, the danger of bias and cheating
prevails.
iv. The identity of the respondent is not clear in the case of the questionnaire, but
in case of the schedules the identity is known.
v. The questionnaire method of data collection takes time, as it requires several
reminders in spite of which the questionnaire may not be returned. In the case of
schedules, direct personal contact is established and responses are elicited soon.
vi. The questionnaire method can be used only in the case of educated or literate
respondents, but interview schedules can be administered even to illiterate
persons.
vii. A wider and more representative population is possible in the questionnaire method
of data collection, but this remains a difficulty in the case of schedules, particularly
when the respondents are distributed over a wide geographical area.
conclude with an acknowledgement to the respondent for the time and effort spent in
completing the questionnaire. These aspects are detailed below:
Welcome → Registration/Login → Introduction → Screening test → Questionnaire questions → Additional information → Thank you
i. Welcome:
The site or domain name that brings the respondents to the survey page should
be easy to remember and should reflect the purpose of the questionnaire. Several domain
names could be used to attract the respondents. The welcome page should be designed
so that it loads quickly. The page should provide information regarding
the organization on whose behalf the questionnaire is administered. It should motivate
the respondent to take part in the survey and emphasize the ease of responding. The
procedure to start should also be made evident. For a questionnaire with password
restriction, this fact should be mentioned clearly on the welcome screen so that the
respondent does not waste time over it. Too many animations and gimmicks
should be avoided, as they take more time to download and may distract the
respondent's attention.
ii. Registration/login
The registration or login screen is needed if access to the questionnaire is
restricted to specific people. Password access should be provided to the
appropriate respondents so as to enable them to participate in the survey. While
processing the PIN number and password, it is better to accept dashes and hyphens as
part of the string of numbers. In order to alleviate the respondents' frustration, as soon as
the data is entered in the required fields, all correct data should be accepted and only
the fields that have been erroneously omitted or are completely incorrect should be
highlighted for re-entry with a proper explanation regarding the required data. Sufficient
time should be provided to read and complete the registration forms before automatic
time out.
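The accept-and-flag behaviour described above can be sketched in code. This is a hypothetical illustration: the field names, error messages and the e-mail pattern are all assumptions, not part of any particular survey package.

```python
import re

def validate_registration(form: dict) -> dict:
    """Return {field: explanation} only for fields needing re-entry.

    All correctly entered data is accepted; only missing or invalid
    fields are flagged back to the respondent with an explanation.
    """
    errors = {}
    if not form.get("name", "").strip():
        errors["name"] = "Please enter your full name."
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", form.get("email", "")):
        errors["email"] = "Please enter a valid e-mail address."
    # Accept dashes/hyphens as part of the PIN string, as recommended above.
    pin = form.get("pin", "").replace("-", "")
    if not pin.isdigit():
        errors["pin"] = "PIN should contain only digits (dashes are allowed)."
    return errors

# A fully correct submission produces no re-entry requests:
print(validate_registration({"name": "A. Kumar", "email": "a@ex.com",
                             "pin": "12-34-56"}))  # {}
```

A real deployment would run this on the server and re-render only the flagged fields, leaving accepted values in place.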
iii. Introduction
This section should provide a brief description of the survey, its purpose and
the importance of the response received. It should also outline all the security and
privacy practices associated with the survey so as to reassure the respondents.
Alternatively, this information can also be included in the registration/login page.
iv. Screening test
If the screening test is very simple, it can be located within the introduction
page. If it is more extensive, it should be dealt with on a separate page linked
to the preceding and succeeding pages. If a respondent fails a screening test, the
chance to participate in the survey should still not be denied, as this would offend the
respondent. However, the contribution can be discarded in the study.
v. Questions
The questions should follow all the basic guidelines of the offline questionnaire.
In addition the following should also be considered in the designing of the electronic
questions:
1. The total number of questions should be kept to a minimum.
2. Initial questions should be routine, easy-to-answer questions so as to ease the
respondent's mindset. The first question should be engaging and should attract
the attention and interest of the respondent.
3. Difficult, sensitive and most important questions should appear after the
respondent has completed at least one-third of the questionnaire, at a point when the
respondent would have settled down.
4. In order to ensure consistency of responses, repeated questions that are
worded in a different manner often form part of the questionnaire. Such
questions should be placed apart.
5. Open-ended questions should appear before close-ended on the same topic
so as to prevent influencing respondents with the fixed option choices of the
close-ended questions.
6. Frames can make pages difficult to read and print, increase load time and cause
problems, so the use of frames should be minimized or avoided.
7. Forms and fields are commonly used for data entry. The field labels should be
placed close to the associated fields. The submit button should be located
adjacent to the last field. The tab order for keyboard navigation around the fields in the
questionnaire should be logical and reflect the visual appearance as far as
possible. Fields should be stacked in a vertical column, and any instruction
pertaining to a given field should appear before, not after, the field.
3. Navigation
To enable easy navigation within the website presenting the questionnaire, online
questionnaires usually include buttons, links, site maps and scrolling. All mechanisms
for navigation should be clearly defined and placed in such a manner that they
can be easily identified and accessed by the respondents. The navigational aids are
typically located at the top right-hand corner of web pages and should appear
consistently in the same place on each page of a questionnaire's website. The guidelines
regarding the navigational elements are listed below:
1. Buttons enable a respondent to exit a questionnaire or return to the previous/
next section of a questionnaire. They should be placed consistently in the same
place on all pages and should be designed in an easily identifiable manner.
Graphical presentation can be used to name the button.
2. Links are commonly used in web pages. They should be designed in a simple
manner, used sparingly and placed in a clearly identifiable manner.
Bold, coloured or underlined text can be used. A link that has been visited by the
respondent should be indicated by a change in colour. Text-based links
should be used rather than image-based links. Clear distinctions should be
made between links to locations within the same page and links to a different page.
3. Site maps provide an overview of the entire website at a single glance. They
help users navigate through the website, saving time and frustration.
The pathway is usually linear, and therefore the orientation should
not be overly complex. Site maps should be scalable and should be
consistently placed. They should be downloadable in minimum time and used only
when there are a large number of pages in a questionnaire.
4. Scroll bars should be avoided, as some respondents find scrolling hard to use
and content may also be overlooked by them. The welcome page should fit into a
single screen and not require scrolling. If scrolling cannot be avoided, the
respondents should be informed of the need to scroll. Scrolling can be avoided
by using jump buttons which take the respondents to the next screenful of
information or questions.
4. Formatting
Formatting of a questionnaire includes several aspects i.e. text, colour, graphics,
flash, frames and tables, feedback and other miscellaneous factors. Guidelines pertaining
to each of these aspects are discussed below:
A. Text
i. The font used should be readable and the text should be presented
in standard sentence format. Capital letters should be used only for
emphasizing titles, captions etc.
ii. Sentences should contain as few words as possible and should be presented
with a minimum of characters per line. Paragraphs should be of minimum
size.
iii. Technical instructions should be written in such a way that non-technical people can understand them.
iv. Questions should be easily distinguishable in terms of formatting
from instructions and answers.
v. The relative position of questions and answers should be consistent
throughout the questionnaire. Where different types of questions
are to be included in the same questionnaire, each question type
should have a unique visual appearance.
vi. A minimum font size of 12 pt should be used. The font colour should
contrast significantly with the background colour.
vii. The text should be left justified and use of italics should be avoided.
B. Colour
Colour has a great impact on the respondents and their responses, so
it is important to use colours wisely. Consistent colour coding
should be used throughout the questionnaire to reinforce meaning or information
in an unambiguous fashion. A neutral background colour, excluding patterns, should
be used to make text easy to read. When using two colours, colours of high
contrast can be used to ensure maximum discernibility. The following
combinations of colours should be avoided, since visual vibrations and after-images
can occur: red and green, yellow and blue, blue and red, blue and green. While
using colours, the standard cultural colour associations should be kept in mind.
C. Graphics
In order to minimize the download time, the graphics should be kept to a minimum.
F. Feedback
It is important to get feedback on the online questionnaire so as to
understand whether a respondent will abandon completion or persevere
with it. With each new section/page, respondents should be given real-time
feedback on their degree of progress through the questionnaire. This may take the
form of a "30% completed" progress bar. Respondents' answers to the
questions should be made immediately visible to them in a clear and concise
manner to reinforce the effect of their actions.
G. Miscellaneous
The following guidelines relating to formatting do not fall into any of the
previously discussed categories. The total website content should remain below
60 KB of text and graphics. A version of the questionnaire, as well as all referenced
articles or documentation, should be provided in an alternative format that can
be printed fully. All introductory pages in the survey website should include a
date-last-modified notification as well as a copyright notice, if applicable.
5. Response Formats
Electronic equivalents to the various paper-based response styles have to be
selected to best meet the needs of the questionnaire and the target audience. Some guidelines
in this respect are discussed below:
A. Matrix questions:
If a question involves many response options, matrix formats can be used to
condense and simplify it. They should be used sparingly, as they require a lot of
work to be done on a single screen. It is also hard to predict how such questions will
appear on the respondent's web browser, and the size and format of such questions
demand a significant amount of screen space, which cannot be guaranteed on smaller-scale
technology.
B. Drop-down boxes:
A drop-down box appears in a one-line text format; when expanded, it
presents a list of response options from which a respondent can select one or more.
Drop-down boxes are fast to download and can be used when very long lists of response
options are required. They should be used sparingly, as they require very accurate mouse
clicks, and should be avoided when it would be faster to simply type the response. It is
important that the first option in the drop-down list box is not visible by default, as this can
lead respondents to select it.
C. Radio Buttons:
Radio buttons are small circles placed next to the response options of a
close-ended question. By default, only one radio button within any given group of radio
buttons can be selected at a time, so they can be used for mutually exclusive options.
They closely resemble paper-based questionnaire answer formats. They demand a
relatively high degree of mouse precision, and users with limited computer exposure
may find it frustrating to click the options or to change them.
D.Check boxes
Check boxes are typically small squares that contain a tick mark when checked,
and they allow multiple rather than exclusive options. They also require a high degree
of mouse precision. The advantage of using check boxes and radio buttons within the
same questionnaire is that their appearance is visibly different, so respondents are
given visual clues as to how to answer any question using either of the two response
formats.
6. General technical guidelines
In addition to the above technicalities of online questionnaire design, the
following additional details should also be taken into consideration in the design of an
online questionnaire:
i. Privacy and Protection
It is important to ensure that the respondents' privacy and perception of privacy
are protected. The survey data should be encrypted and the anonymity of the respondents
should be assured.
ii. Computer literacy
The questionnaire should be designed keeping in mind the less knowledgeable,
low-end computer user. Specific instructions should be provided without offending
the inexperienced respondents. Prior knowledge or preconceptions in terms of
technological know-how should not be assumed. The need for double-clicks should be
eliminated, since they can be difficult for an inexperienced user.
iii. Automation
Many aspects of the online questionnaire can be automated, unlike the paper-based
questionnaire, for example skip questions. Skip questions are used to
determine, on the basis of an individual respondent's answer, which question the
respondent should jump (or skip) to when the question path is response-directed. When
automation is used, it should be carefully designed in order to avoid disorientation or
confusion for the respondents.
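The response-directed skip logic described above can be sketched as a small routing table. The question ids and routes here are purely illustrative assumptions, not any standard survey format.

```python
# Hypothetical questionnaire: the answer to each question determines the
# next question shown, so irrelevant branches are skipped automatically.
QUESTIONS = {
    "q1": "Do you own a car? (yes/no)",
    "q2": "Which brand is it?",
    "q3": "Do you plan to buy one this year? (yes/no)",
    "q4": "Thank you - survey complete.",
}
# Routing table: (question, answer) -> next question.
ROUTES = {("q1", "yes"): "q2", ("q1", "no"): "q3"}
# Default flow when no answer-specific route applies.
DEFAULT_NEXT = {"q2": "q4", "q3": "q4"}

def next_question(current: str, answer: str) -> str:
    return ROUTES.get((current, answer), DEFAULT_NEXT.get(current, "q4"))

print(next_question("q1", "no"))   # q3 - the car-brand question is skipped
print(next_question("q3", "yes"))  # q4
```

Keeping the routing in one table makes the path easy to review, which helps avoid the disorientation the text warns about.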
Advantages
There is practically no cost involved once the setup has been completed.
Large samples do not cost more than smaller ones.
The researcher can use audio-visuals in the collection of the data. Some Web survey
software can also show video and play sound.
Web page questionnaires can use colours, fonts and other formatting options
not possible in most email surveys.
Some Web survey software can combine the survey answers with pre-existing information on the individuals taking a survey.
It is possible to link the online questionnaire to a database, so the
information received can be updated immediately without the further need for manual
data entry, as in the case of a paper-based questionnaire.
Disadvantages
Coverage error is prevalent in online surveys, i.e., not all members of the
population have an equal chance of being included in the survey. This is
particularly so in countries where internet access is very low and
computer illiteracy is high.
Non-response error is higher because respondents may not opt to fill in the online
questionnaire. Also, respondents can easily quit in the middle of a questionnaire.
They are not as likely to complete a long questionnaire on the Web as they
would be if talking with a good interviewer.
A survey on a web page cannot exercise control over who replies: anyone
from anywhere who is surfing may answer. The demographic
pattern of the respondents cannot be restricted.
There is often no control over people responding multiple times, which can bias the
results.
In the present context, online surveys should mainly be used when the target
population consists entirely or almost entirely of Internet users. Business-to-business
research and employee attitude surveys can often meet this requirement; surveys of the
general population usually will not. Web page surveys are appropriate when the researcher
needs to use audio or video, or both sound and graphics. A Web page survey may be the only
practical way to have many people view and react to a video.
The researcher should make sure that the software used for conducting online surveys
prevents people from completing more than one questionnaire. The access can also be
restricted by requiring a password or by putting the survey on a page that can only be
accessed directly i.e., there are no links to it from other pages.
product with a request to the consumer to fill in the card and post the same to the
dealer.
2. Store audits
Store audits are performed by distributors as well as manufacturers, through
their salesmen, at regular intervals. The information is used to estimate market size,
market share, seasonal purchasing patterns etc. The data is obtained mostly by the
observational method. Store audits are invariably a panel operation, since the derivation of
sales estimates and the compilation of sales trends form the basis of the calculation. They
provide an efficient way to evaluate the effect of various in-store promotions on sales.
3. Pantry audits
This is used to estimate the consumption of a basket of goods at the consumer
level. The investigator collects an inventory of the types, quantities and prices of the
commodities consumed. In pantry audits the data is recorded from an examination of the
consumer's pantry. The objective of a pantry audit is to identify the types of consumers
who buy certain products and certain brands. The basic assumption is that the contents
of the pantry accurately portray the consumer's preferences. Pantry audits are usually
supplemented by direct questioning relating to reasons for preferring a product.
4. Consumer panels
A consumer panel consists of a group of consumers who are interviewed on a
regular basis over a period of time. Consumer panels may be transitory or continuing.
A transitory panel is set up to measure the effect of a particular phenomenon and
is conducted on a before-and-after basis: an interview is conducted before the
phenomenon takes place and another after it has occurred, so
as to measure the changes in the attitude and behaviour of the consumers. A continuing
consumer panel is set up for an indefinite period with a view to collecting data on a particular
aspect of consumer behaviour over a period of time.
5. Mechanical devices
The use of mechanical devices enables data to be recorded accurately. The eye camera,
pupilometric camera, psychogalvanometer, motion picture camera and audiometer
are some of the devices used for data collection. Eye cameras are designed to record
the focus of a respondent's eyes on a specific portion of a sketch, diagram or
product package. Pupilometric cameras record dilation of the pupil as a result of
visual stimuli; the extent of dilation shows the degree of interest aroused by the stimuli.
A psychogalvanometer is used to measure the extent of bodily excitement as a result of the
visual stimulus. Motion pictures are used to record the movement of the buyer while
deciding to buy a consumer good. Audiometers are used with television to find out the
types of programmes as well as the channels preferred by viewers. A device fitted in the
television itself records the changes, which can be used to ascertain the market share.
6. Projective techniques
Certain ideas and thoughts cannot be easily verbalized, as they remain at the
unconscious level in the minds of respondents. These can be brought to the surface by
trained professionals who apply different probing techniques to bring the deep-rooted
ideas and thoughts to the surface. Some techniques are explained below:
i. Word association test
The test is used to extract information regarding the words that have maximum
association. Respondents are asked to quickly associate a word, say "happy", with the
first thing that comes to mind. This is often used to get at the true attitudes and feelings of the
respondent. The same idea is used in marketing research to find out the quality that is
most associated with a brand of product. This technique is quick and easy to use, and
yields reliable results when applied to words that are widely known and possess
essentially one meaning.
ii. Sentence completion tests
It is an extension of the word association test. The respondent is provided
with several half-completed statements regarding a subject. Analysis of the replies from the
respondent reveals his attitude towards the subject. This technique permits the testing of
not only words but ideas too. It is quick and easy to use; however, it leads to analytical
problems, as the responses are multidimensional.
iii. Story completion test
This test goes a step further: the researcher may contrive stories instead of
sentences and ask the respondent to complete them. The respondent is given just enough
of a story to focus attention on a given subject and is asked to provide a conclusion to
the story.
iv. Verbal projection tests
The respondent is asked to comment on or explain what other people do.
For example: why do people own a particular product? Answers may reveal the
respondent's own motivations.
v. Pictorial techniques
Several pictorial techniques are available. They are discussed below:
Sampling frame
After defining the target population, the researcher must assemble a list of all
eligible sampling units, referred to as a sampling frame. A common source of a
sampling frame for a study about customers is the customer list from credit card
companies.
Sample
A sample is a subset or subgroup of the population. It comprises some members
selected from the population; only some, and not all, elements of the population form the
sample. If 200 members are drawn from a population of 500 workers, these 200
members form the sample for the study. From the study of these 200 members, the researcher
would draw conclusions about the entire population.
Subject
A subject is a single member of the sample, just as an element is a single member
of the population. If 200 members from the total population of 500 workers form the
sample for the study, then each worker in the sample is a subject.
Lower cost: The cost of conducting a study based on a sample is much lower
than the cost of conducting a census study.
required data from the target population elements. The method of data collection guides
the researcher in identifying and securing the necessary sampling frame for conducting
the research.
Identify the sampling frames needed
The researcher should identify and assemble a list of eligible sampling units.
The list should contain enough information about each prospective sampling unit so as
to enable the researcher to contact them. Drawing an incomplete frame decreases the
likelihood of drawing a representative sample.
Select the appropriate sampling method
The researcher can choose between probability and non-probability sampling
methods. Using a probability sampling method will always yield better and more accurate
information about the target population's parameters than the non-probability sampling
methods. Seven factors should be considered in deciding the appropriateness of the
sampling method, viz., research objectives, degree of desired accuracy, availability of
resources, time frame, advance knowledge of the target population, scope of the
research and perceived statistical analysis needs.
Determine necessary sample sizes and overall contact rates
The sample size is decided based on the precision required from the sample
estimates, time and money available to collect the required data. While determining the
sample size due consideration should be given to the variability of the population
characteristic under investigation, the level of confidence desired in the estimates and
the degree of the precision desired in estimating the population characteristic. The number
of prospective units to be contacted to ensure that the estimated sample size is obtained
and the additional cost involved should be considered. The researcher should calculate
the reachable rates, overall incidence rate and expected completion rates associated
with the sampling situation.
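As an illustration of the three considerations above (variability, confidence and precision), the widely used textbook formula n = (z·σ/e)² for estimating a mean can be sketched as follows. The numeric values are illustrative assumptions, not taken from this text.

```python
import math

def sample_size(sigma: float, z: float, e: float) -> int:
    """Sample size for estimating a mean: n = (z * sigma / e) ** 2,
    rounded up, where sigma is population variability, z reflects the
    desired confidence level and e is the desired precision."""
    return math.ceil((z * sigma / e) ** 2)

# 95% confidence (z = 1.96), population std. dev. 12, precision +/- 2:
n = sample_size(sigma=12, z=1.96, e=2)
print(n)  # 139

# Inflate for an expected 70% completion rate to get the number of
# prospective units that must be contacted:
contacts = math.ceil(n / 0.70)
print(contacts)  # 199
```

The second step mirrors the text's point that the number of prospective units contacted must exceed the estimated sample size once completion rates are considered.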
Creating an operating plan for selecting sampling units
The actual procedure to be used in contacting each of the prospective
respondents selected to form the sample should be clearly laid out. The instructions
should be clearly written so that interviewers know exactly what should be done and
the procedure to be followed in case of problems encountered in contacting the
prospective respondents.
Executing the operational plan
The sample respondents are met and actual data collection activities are executed
in this stage. Consistency and control should be maintained at this stage.
The ultimate test of a good sample is how well it represents the
characteristics of the population. In terms of measurement, the sample
should be valid. The validity of the sample depends on two considerations, viz., accuracy
and precision.
Accuracy
Accuracy is determined by the extent to which bias is eliminated from the
sample. When the sample elements are drawn properly, some sample elements
underestimate the population values being studied and others overestimate them;
variations in these values offset each other. This counteraction results in a sample value
that is generally close to the population value. An accurate, i.e., unbiased, sample is one
in which the underestimators and the overestimators are balanced among the members
of the sample. There is no systematic variance with an accurate sample. Systematic
variance has been defined as the variation in measures due to some unknown influences
that cause the scores to lean in one direction more than another. Even a large sample
size cannot counteract systematic bias.
Precision
A second criterion of a good sample design is precision of estimate. No sample
will fully represent its population in all respects. The numerical descriptors that describe
samples may be expected to differ from those that describe the population because of
random fluctuations inherent in the sampling process. This is called sampling error.
Sampling error is what is left after all known sources of systematic variance have been
accounted for. In theory, sampling error consists of random fluctuations only, although
some unknown systematic variance may be included when too many or too few sample
elements possess a particular characteristic. Precision is measured by standard error of
estimate, a type of standard deviation measurement; the smaller the standard error of
estimate, the higher is the precision of the sample. The ideal sample design produces a
small standard error of estimate.
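A minimal sketch of computing the standard error of estimate for a sample mean (the data values below are illustrative):

```python
import math
import statistics

def standard_error(sample: list) -> float:
    """Standard error of the sample mean: s / sqrt(n).
    The smaller this value, the higher the precision of the sample."""
    s = statistics.stdev(sample)        # sample standard deviation
    return s / math.sqrt(len(sample))

data = [52, 48, 50, 51, 49, 50, 53, 47]
print(round(standard_error(data), 3))   # 0.707
```

Doubling the sample size divides the standard error by roughly √2, which is why precision improves with larger (unbiased) samples.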
Sampling designs can be classified by representation basis (probability versus
non-probability) and element selection (unrestricted versus restricted):
Unrestricted, probability: Simple random
Restricted, probability: Complex random (Systematic, Stratified, Cluster, Double)
Unrestricted, non-probability: Convenience
Restricted, non-probability: Purposive (Judgement, Quota), Snowball
Random numbers can also be generated through computer programs. Using the random
numbers, the sample can be selected.
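A minimal sketch of a computer-generated simple random draw; the frame of 500 workers echoes the earlier example, and the unit names are hypothetical:

```python
import random

# Sampling frame: a list of all 500 eligible units (names hypothetical).
frame = [f"worker_{i:03d}" for i in range(1, 501)]

random.seed(42)                       # fixed seed for a reproducible draw
sample = random.sample(frame, 200)    # 200 members, drawn without repeats

print(len(sample))        # 200
print(len(set(sample)))   # 200 - every unit appears at most once
```

Each unit in the frame has the same chance of selection, which is the defining property of simple random sampling.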
It is important that the natural order of the defined target population list be
unrelated to the characteristic being studied.
The skip interval should not correspond to a systematic change in the target
population.
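The skip-interval idea can be sketched as follows, assuming a frame of 500 units and a desired sample of 50, so that k = 500/50 = 10 (the values are illustrative):

```python
import random

def systematic_sample(frame, n, start=None):
    """Systematic sampling: skip interval k = N / n, a random start
    within the first k units, then every k-th unit thereafter."""
    k = len(frame) // n                    # skip interval
    if start is None:
        start = random.randrange(k)        # random start in [0, k)
    return frame[start::k][:n]

frame = list(range(1, 501))                # population of 500 units
sample = systematic_sample(frame, n=50, start=3)
print(len(sample), sample[:4])   # 50 [4, 14, 24, 34]
```

If the frame had a periodic pattern with period 10, this draw would be biased, which is exactly the warning in the two points above.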
have to be selected to estimate the true population parameter accurately for that
subgroup. This method is also opted for in situations where it is easier, simpler and less
expensive to collect data from one or more strata than from others.
Advantages and disadvantages
Stratified random sampling provides several advantages, viz., the assurance of
representativeness in the sample, the opportunity to study each stratum and make relative
comparisons between strata, and the ability to make estimates for the target population
with the expectation of greater precision or less error.
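A sketch of proportionate stratified random sampling, where each stratum contributes to the sample in proportion to its share of the population; the strata names and sizes are illustrative assumptions:

```python
import random

def stratified_sample(strata: dict, n: int) -> dict:
    """Proportionate stratified random sampling: each stratum's quota
    is its share of the population times the total sample size."""
    total = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        quota = round(n * len(units) / total)   # proportional allocation
        sample[name] = random.sample(units, quota)
    return sample

random.seed(7)
strata = {"managers": list(range(50)),
          "clerks": list(range(200)),
          "workers": list(range(250))}
picked = stratified_sample(strata, n=100)
print({k: len(v) for k, v in picked.items()})
# {'managers': 10, 'clerks': 40, 'workers': 50}
```

A disproportionate design would simply replace the proportional quota with larger allocations for strata that need more precise estimates.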
iii. Cluster sampling
Cluster sampling is a probability sampling method in which the sampling units
are divided into mutually exclusive and collectively exhaustive subpopulations called
clusters. Each cluster is assumed to be representative of the heterogeneity of the
target population. Groups of elements that have heterogeneity among the members
within each group are chosen for study in cluster sampling. Several groups with intragroup
heterogeneity and intergroup homogeneity are identified; a random sample of the clusters
or groups is taken, and information is gathered from each of the members in the randomly
chosen clusters. Cluster sampling thus offers more heterogeneity within groups and more
homogeneity among the groups.
Single stage and Multistage cluster sampling
In single-stage cluster sampling, the population is divided into convenient clusters
and the required number of clusters is randomly chosen as sample subjects; each element
in each of the randomly chosen clusters is investigated in the study. Cluster sampling can
also be done in several stages, which is known as multistage cluster sampling. For example,
to study the banking behaviour of customers in a national survey, cluster sampling can
be used to select the urban, semi-urban and rural geographical locations of the study.
At the next stage, particular areas in each of the locations would be chosen. At the third
stage, the banks within each area would be chosen. Thus multistage sampling involves
a probability sampling of the primary sampling units; from each of the primary units, a
probability sample of the secondary sampling units is drawn; a third level of probability
sampling is done from each of these secondary units, and so on, until the final stage of
breakdown for the sample units is arrived at, where every member of the unit will be
a sample.
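The two-stage selection described above can be sketched as follows; the areas and banks are simulated placeholders, not real data:

```python
import random

# Simulated frame: 10 geographic clusters ("areas"), 20 banks in each.
random.seed(1)
clusters = {f"area_{i}": [f"a{i}_bank_{j}" for j in range(20)]
            for i in range(10)}

# Stage 1: randomly choose 3 clusters.
chosen_areas = random.sample(list(clusters), 3)

# Stage 2: randomly choose 5 elements within each chosen cluster.
# (Single-stage cluster sampling would instead take EVERY element
# of each chosen cluster.)
sample = [bank
          for area in chosen_areas
          for bank in random.sample(clusters[area], 5)]

print(len(sample))  # 15 banks drawn from 3 randomly chosen areas
```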
Area sampling
Area sampling is a form of cluster sampling in which the clusters are formed by
geographic designations, for example state, district, city or town; any geographic unit
with identifiable boundaries can be used. Area sampling is less expensive than most
other probability designs and does not depend on a population frame. A city map showing
the blocks of the city would be adequate information to allow a researcher to take a
sample of the blocks and obtain data from the residents therein.
In stratified sampling, the population is divided into a few subgroups, each with
many elements in it, and the subgroups are selected according to some criterion
that is related to the variables under study. In cluster sampling, the population
is divided into many subgroups, each with a few elements in it, and the subgroups
are selected according to some criterion of ease or availability in data collection.
The elements are chosen randomly within each subgroup in stratified sampling.
In cluster sampling, the subgroups are randomly chosen and each and every
element of the subgroup is studied in depth.
and later a sub-sample of this primary sample is used to examine the matter in more
detail. The process includes collecting data from a sample using a previously defined
technique. Based on this information, a sub-sample is selected for further study. It is
more convenient and economical to collect some information by sampling and then use
this information as the basis for selecting a sub-sample for further study.
B. Purposive sampling
A non-probability sample that conforms to certain criteria is called a purposive
sample. There are two major types of purposive sampling, viz., judgment sampling
and quota sampling.
i. Judgment sampling
Judgment sampling is a non-probability sampling method in which participants
are selected according to an experienced individual's belief that they will meet the
requirements of the study. The researcher selects sample members who conform to
some criterion. It is appropriate in the early stages of an exploratory study and involves
the choice of subjects who are most advantageously placed, or in the best position, to
provide the information required. It is used when a limited number or category of
people have the information that is being sought. The underlying assumption is
the researcher's belief that the opinions of a group of perceived experts on the topic of
interest are representative of the entire target population.
Advantages and disadvantages
If the judgment of the researcher or expert is correct then the sample generated
from the judgment sampling will be much better than one generated by convenience
sampling. However, as in the case of all non-probability sampling methods, the
representativeness of the sample cannot be measured. The raw data and information
collected through judgment sampling provides only a preliminary insight.
ii. Quota sampling
The quota sampling method involves the selection of prospective participants
according to prespecified quotas regarding either demographic characteristics
(gender, age, education, income, occupation etc.), specific attitudes (satisfied, neutral,
dissatisfied) or specific behaviours (regular, occasional or rare user of a product). The
purpose of quota sampling is to provide an assurance that prespecified subgroups of
the defined target population are represented on pertinent sampling factors that are
determined by the researcher. It ensures that certain groups are adequately represented
in the study through the assignment of quotas.
Advantages and disadvantages
The greatest advantage of quota sampling is that the sample generated contains
specific subgroups in the proportion desired by researchers. In those research projects
that require interviews the use of quotas ensures that the appropriate subgroups are
identified and included in the survey. The quota sampling method may eliminate or
reduce selection bias.
An inherent limitation of quota sampling is that the success of the study will be
dependent on subjective decisions made by the researchers. As a non-probability method,
it is incapable of measuring true representativeness of the sample or accuracy of the
estimate obtained. Therefore, attempts to generalize the data results beyond those
respondents who were sampled and interviewed become very questionable and may
misrepresent the given target population.
iii. Snowball Sampling
Snowball sampling is a non-probability sampling method in which a set of
respondents are chosen who help the researcher to identify additional respondents to
be included in the study. This method of sampling is also called referral sampling
because one respondent refers other potential respondents. Snowball sampling is
typically used in research situations where the defined target population is very small
and unique and compiling a complete list of sampling units is a nearly impossible task.
While the traditional probability and other non-probability sampling methods would
normally require an extreme search effort to qualify a sufficient number of prospective
respondents, the snowball method would yield better result at a much lower cost. The
researcher has to identify and interview one qualified respondent and then solicit his
help to identify other respondents with similar characteristics.
Advantages and disadvantages
Snowball sampling enables the researcher to identify and select prospective respondents
from a small, hard-to-reach and uniquely defined target population. It is most useful in
qualitative research practices. Reduced sample sizes and costs are the primary advantages
of this sampling method. The major drawback is that the chance of bias is higher. If
there is a significant difference between the people who are identified through snowball
sampling and those who are not, it may give rise to problems. The results cannot
be generalized to members of the larger defined target population.
3.7 DETERMINATION OF APPROPRIATE SAMPLING DESIGN
Determining an appropriate sampling design is a challenging issue and has important
implications for the application of the research findings. Apart from considering the
theoretical components, sampling issues, and the advantages and drawbacks of the different
sampling techniques, the decision should take into consideration the following factors:
1. Research objectives
A clear understanding of the statement of the problem and the objectives will
provide the initial guidelines for determining the appropriate sampling design. If the
research objectives include the need to generalize the findings of the research study,
then a probability sampling design should be preferred.
Since the sample data are used for drawing inferences regarding the population,
the inferences should be as accurate as possible, and it should also be possible
to estimate the error. An interval estimation should be made to ensure a relatively
accurate estimation of the population parameter. For this purpose, a statistic that has
the same distribution as the sampling distribution of the mean, usually a Z or t statistic, is used.
For example, suppose the problem at hand is to estimate the mean value of purchases
made by a customer from department stores. A sample of 64 customers is identified
through the systematic sampling method, and it is found that the sample mean X = 105 and
the sample standard deviation S = 10. X, the sample mean, is a point estimate of μ, the
population mean. A confidence interval could be constructed around X to estimate the
range within which μ would fall. The standard error S_x and the percentage or level of
confidence required determine the width of the interval, which is given by the formula

μ = X ± K·S_x

where

S_x = S / √n = 10 / √64 = 1.25
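The interval for this department-store example can be computed in a few lines. This sketch uses the Z value of 1.96 for a 95 percent confidence level; the figures are those of the example above.

```python
import math

# Data from the department-store example.
n, sample_mean, s = 64, 105.0, 10.0

standard_error = s / math.sqrt(n)          # S / sqrt(n) = 10 / 8 = 1.25
z = 1.96                                   # Z value for 95% confidence

lower = sample_mean - z * standard_error   # 105 - 1.96 * 1.25 = 102.55
upper = sample_mean + z * standard_error   # 105 + 1.96 * 1.25 = 107.45
```

The population mean is thus estimated to lie between 102.55 and 107.45 with 95 percent confidence.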
The width of the interval has increased and as such the precision in the estimation is
comparatively less though the confidence level in the estimation has increased. A larger
sample size is required if the precision and the confidence level have to be increased.
The sample size, n, is a function of:
1. The variability in the population
2. The precision or accuracy needed
3. The confidence level desired
4. The type of sampling plan used.
If the sample size cannot be increased, the only way to maintain the same level of
precision would be to lower the confidence level in the estimation; that is, the confidence
level or certainty of the estimate will be reduced. Researchers must consider
four aspects while making decisions regarding the sample size:
1. The precision level needed in estimating the population characteristics, i.e., the
allowable margin of error.
2. The level of confidence required, i.e., the percentage chance of error the researcher is
willing to take in the estimation of the population parameters.
3. The extent of variability in the population on the characteristics investigated.
4. The cost-benefit analysis of increasing the sample size.
3.8.2 Sample data and hypothesis testing
In addition to estimating the population parameters, the sample data can also
be used to test hypotheses about population values. For example, if we want to determine
whether a customer spent the same average amount in purchases at Department Store A
as at Department Store B, a null hypothesis can be formed. The null hypothesis proposes
that there is no significant difference in the amount spent by customers at the two different
stores. This would be expressed as:
H0 : μ_A − μ_B = 0
The alternate hypothesis can be stated as follows:
H1 : μ_A − μ_B ≠ 0
If a sample of 20 customers is taken from each of the two stores and it is found that the
mean value of purchases of customers in Store A is 105 with a standard deviation of 10,
and the corresponding figures for Store B are 100 and 15 respectively, then

X_A − X_B = 105 − 100 = 5
The null hypothesis states that there is no significant difference. The probability
of the two group means having a difference of 5 under the null hypothesis should
be determined. This can be done by converting the difference in the sample means to a
t statistic and identifying the probability of obtaining a t of that value; the t distribution has
known probabilities attached to it. The critical value of the t distribution for two samples of
20 each, with (n1 + n2) − 2 = 38 degrees of freedom, is 2.021. A two-tailed test is
used because the difference between Store A and Store B could be either positive or negative.
The t statistic for testing the hypothesis is calculated as follows:
t = [(X_A − X_B) − (μ_A − μ_B)] / S_(X_A − X_B)

where

S_(X_A − X_B) = √{[(n1·S1² + n2·S2²) / (n1 + n2 − 2)] × (1/n1 + 1/n2)}

= √{[(20 × 10² + 20 × 15²) / (20 + 20 − 2)] × (1/20 + 1/20)} = 4.136

With μ_A − μ_B = 0 (null hypothesis),

t = 5 / 4.136 = 1.209
The t value of 1.209 is much below the critical value of 2.021 at the 95 percent
confidence level; even the 90 percent confidence level requires a value of 1.684. Thus
the difference of 5 found between the two stores is not significant. The conclusion is
that there is no significant difference between the spending patterns of the customers
in Store A and in Store B. Thus the null hypothesis is accepted and the alternate
hypothesis is rejected.
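The t statistic above can be reproduced with a few lines, following the pooling formula used in the text (which weights each group's variance by n rather than n - 1):

```python
import math

# Store A: n1 = 20, mean 105, s.d. 10; Store B: n2 = 20, mean 100, s.d. 15.
n1, m1, s1 = 20, 105.0, 10.0
n2, m2, s2 = 20, 100.0, 15.0

# Standard error of the difference, per the text's pooling formula.
se_diff = math.sqrt(((n1 * s1**2 + n2 * s2**2) / (n1 + n2 - 2))
                    * (1 / n1 + 1 / n2))    # = 4.136

t = ((m1 - m2) - 0) / se_diff               # null hypothesis: mu_A - mu_B = 0
# t = 1.209, below the critical value of 2.021, so H0 is not rejected.
```

A ready-made equivalent exists in `scipy.stats` (with the usual n - 1 pooling), but the hand computation makes the textbook formula explicit.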
be taken to ensure that neither too small a sample is selected, which would enhance the
risk of sampling error, nor too many units are selected, which would increase the cost of
the study. It is necessary to make a trade-off between (i) increasing the sample size,
which would reduce the sampling error but increase the cost, and (ii) decreasing the
sample size, which might increase the sampling error while decreasing the cost.
Several factors should be considered before deciding the sample size. The first
and foremost is the size of the error that would be tolerable for the purposes of decision
making. The second is the degree of confidence required in the results of the study. If
100 percent confidence in the result is needed, the entire population must be studied;
however, this is impractical and costly. Normally, the confidence level is accepted at 99%,
95% or 90%. The confidence and precision aspects are discussed in detail under the
heading precision and confidence in sample size estimation dealt with earlier.
For determining the sample size, the following relationship is used: the standard error
of the mean, σ_x = σ/√n, can be calculated if we know the upper and lower confidence
limits. If these limits are assumed to be ±Y, then

Z·σ_x = Y

where Z is the value of the normal variate for a given confidence level.

The procedure for determining the sample size can be illustrated through an example.
A management consultancy is performing a survey to determine the annual salary of the
3000 managers of the textile concerns within a district. The sample size it should take
has to be ascertained in order to estimate the mean annual earnings within plus and
minus Rs.1000 at the 95 percent confidence level. The standard deviation of the annual
earnings of the entire population is known to be Rs.3000. The desired upper and lower
limit is Rs.1000, i.e., the estimate of annual earnings should be within plus and minus
Rs.1000. Then

Z·σ_x = 1000

σ_x = 1000 / 1.96 = 510.20

Since σ_x = σ/√n,

√n = σ / σ_x = 3000 / 510.20 = 5.88

n = 5.88² = 34.57
Therefore the desired sample size is approximately 35.
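The computation condenses into the standard formula n = (Zσ/E)². This sketch mirrors the salary example above.

```python
import math

sigma = 3000.0   # population standard deviation of annual earnings (Rs.)
error = 1000.0   # tolerable margin of error (Rs.)
z = 1.96         # Z value for the 95% confidence level

n = (z * sigma / error) ** 2   # (1.96 * 3000 / 1000)^2 = 34.57
sample_size = math.ceil(n)     # round up to 35
```

Rounding up rather than down guarantees that the desired precision is at least met.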
SUMMARY
This chapter dealt in detail with the various sources of data and the data collection
methods. The primary data sources, viz., the focus group and panels, were discussed in
detail. The data collection methods, viz., the interview, questionnaire, observation and
other methods, were examined. Sampling design is an important element of research as
it decides the validity and the reliability of the research findings. The various probability
and non-probability techniques were discussed in detail. The methods of determining
the sample size and the precision and confidence desired in estimating the population
parameters were explained.
With this background, the next unit provides a detailed discussion on the various
multivariate techniques used to analyze the data collected.
Discuss the different data sources, explaining their usefulness and disadvantages.
Discuss the types of error and the steps to avoid the same.
Discuss the issues concerned with precision and confidence in sampling design.
Unit-4
4.1 INTRODUCTION
Business problems today are more complex. The various functional areas of
management are confronted by multiple independent and/or dependent variables. This
requires the application of multivariate techniques to gain an insight into the problems or
to make decisions regarding the choices involved. The availability of computers with
fast processing speed and versatile software has enhanced the application of these
techniques, which involve complex mathematical calculations. Multivariate analysis can
be defined as those statistical techniques which focus upon, and bring out in bold
relief, the structure of simultaneous relationships among three or more phenomena.
Thus multivariate analysis refers to a group of statistical techniques used when there are
two or more measurements on each element and the variables are analyzed
simultaneously. It is concerned with the simultaneous relationship among two or more
phenomena. Multivariate techniques are largely empirical and deal with reality. The
basic objective underlying the use of multivariate techniques is to represent a collection
of massive data in a simplified manner. In other words, multivariate techniques transform
a mass of observations into smaller number of composite scores in such a way that they
reflect as much information as possible contained in the raw data obtained in a research
study.
This unit explains some of the multivariate techniques and the application of
statistical package to solve the same.
Know the use of cluster analysis techniques for grouping similar objects or
people
Figure: Classification of multivariate methods. The first criterion is the number of
dependent variables. Dependence methods (one or more dependent variables) are chosen
by the dependent variable's level of measurement: a nonmetric (nominal) dependent
variable calls for discriminant analysis or conjoint analysis; an ordinal dependent
variable for Spearman's rank correlation; a metric (interval or ratio) dependent variable
for multiple regression, ANOVA, MANOVA or conjoint analysis. Interdependence methods
(no dependent variable) include factor analysis, cluster analysis and perceptual mapping.
The data matrix contains the scores of N persons (objects) on k variables:

Persons (objects)    a     b     c    ...    k
1                    a1    b1    c1   ...    k1
2                    a2    b2    c2   ...    k2
3                    a3    b3    c3   ...    k3
...
N                    aN    bN    cN   ...    kN
The data matrix is standardized so that the mean of the scores in any column
is zero and the variance of the scores in any column is 1. A factor is any linear combination
of the variables in a data matrix and can be stated in a general manner as:

A = Wa·a + Wb·b + ... + Wk·k

The factor values are obtained and the factor loadings, i.e., the factor-variable
correlations, are calculated. Then the communality, symbolized as h², the eigenvalue
and the total sum of squares are obtained and the results are interpreted.
The technique of rotation is applied in order to obtain realistic results; the rotation
reveals different structures in the data. Finally, the factor scores are obtained, which
enable the researcher to explain the factors. After obtaining the factor scores, several
other multivariate analyses like cluster analysis, multiple regression and discriminant
analysis can be performed.
4.4.1 Statistics and terms associated with factor analysis
The statistics and some of the basic terms used in factor analysis are explained
below:
iv.
Communality (h²): A high value of communality means that not much of the
variable is left over after whatever the factors represent is taken into
consideration. It is worked out for each variable as the sum of the squared
factor loadings of that variable.
v.
Total sum of squares: When the eigenvalues of all factors are totalled, the
resulting value is termed the total sum of squares. This value, when divided
by the number of variables involved in the study, results in an index that
shows how well the particular factor solution accounts for what all the
variables taken together represent.
vii.
Factor scores: Factor scores are composite scores estimated for each
respondent on the derived factors. With the factor scores several other
multivariate analyses can be performed.
xi.
Scree plot: Scree plot is a plot of the eigen values against the number of
factors in order of extraction.
primary concern is to determine the minimum number of factors that will account for
maximum variance in the data for use in subsequent multivariate analysis. The factors
are called principal components. If the researcher is attempting to uncover underlying
dimensions surrounding the original variables, common factor analysis is used. Principal
component analysis is based on the total information in each variable, whereas common
factor analysis is concerned only with the variance shared among all the variables.
Principal component analysis
Factor analysis begins with the construction of a new set of variables based on
the relationships in the correlation matrix. Principal component analysis method transforms
a set of variables into a new set of composite variables or principal components that are
not correlated with each other. The linear combination of factors accounts for the variance
in the data as a whole. The best combination makes up the first principal component
and is the first factor. The second principal component is defined as the best linear
combination of variables for explaining the variance not accounted for by the first factor.
Likewise, there may be a third, fourth and kth component, each being the best linear
combination of variables not accounted for by the previous factors.
This process continues till all the variance is accounted for. However, extraction is
usually stopped after a small number of factors have been extracted. The output of the
principal component analysis might look like the data given below:
Extracted components    % of variance accounted for    Cumulative variance
Component no. 1         74%                            74%
Component no. 2         15%                            89%
Component no. 3         11%                            100%
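The cumulative column is simply a running total of the variance percentages, as this short sketch shows:

```python
from itertools import accumulate

# % of variance accounted for by each extracted component (from the table).
variance_pct = [74, 15, 11]

# Running total gives the cumulative variance column.
cumulative = list(accumulate(variance_pct))   # [74, 89, 100]
```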
Numerical results from a factor analysis are presented in a table like the following.
The values in the table are the correlation coefficients between the factors and the
variables.
Variable        Unrotated factors              Rotated factors
                I        II       h2           I        II
A               .70      -.40     .65          .79      .15
B               .60      -.50     .61          .75      .03
C               .60      -.35     .48          .68      .14
D               .50      .50      .50          .06      .70
E               .60      .50      .61          .13      .77
F               .60      .60      .72          .07      .85
Eigenvalue      2.18     1.39
% of variance   36.30    23.20
Cumulative %    36.30    59.50
In the above table, .70 is the correlation coefficient between variable A and
factor I. The correlation coefficients are called loadings. Eigenvalues are the sums of
the squared loadings of the factor. For factor I the eigenvalue is the sum
.70² + .60² + .60² + .50² + .60² + .60², which is 2.18. The eigenvalue 2.18, divided by
the number of variables, i.e., 6, yields an estimate of the amount of total variance
explained by the factor; in the example given above, factor I accounts for 36% of the
total variance. The column heading h2 gives the communalities, i.e., the estimates of the
variance in each variable that is explained by the two factors. From the above table it
can be seen that in the case of variable A the communality is .70² + (-.40)² = .65, which
indicates that 65 percent of the variance in variable A is statistically explained in terms
of factors I and II.

In the unrotated factor solution, a loading does not provide much information: it is not
possible to identify the variables with high loadings on factor I and factor II. Rotation
enables the researcher to identify the variables associated with each factor.
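The arithmetic for factor I's eigenvalue and variable A's communality can be verified directly from the loadings in the table:

```python
# Unrotated loadings on factor I for variables A-F (from the table above).
factor1_loadings = [0.70, 0.60, 0.60, 0.50, 0.60, 0.60]

# Eigenvalue of factor I: sum of its squared loadings.
eigenvalue = sum(l ** 2 for l in factor1_loadings)     # 2.18

# Share of total variance explained by factor I.
variance_share = eigenvalue / len(factor1_loadings)    # 2.18 / 6 = 0.363

# Communality of variable A: squared loadings on factors I and II.
h2_a = 0.70 ** 2 + (-0.40) ** 2                        # 0.65
```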
4. Determine the number of factors
It is possible to compute as many principal components as there are variables,
but doing so does not serve the purpose of conducting a factor analysis. In order to
summarize the information contained in the original variables, a smaller number of
factors should be extracted. The question then arises of how many factors to extract.
Several procedures for determining the number of factors are discussed below.
A Priori determination:
Because of prior knowledge, the researcher may know how many factors to extract
and can thus specify the number of factors to be extracted beforehand. The extraction
of factors is completed as soon as the desired number of factors is extracted.
Determination based on Eigenvalues:
In this approach only factors with eigenvalues greater than 1.0 are retained;
the other factors are not included in the model. An eigenvalue represents the amount of
variance associated with the factor; hence, factors with variance greater than 1.0 are
included. If the number of variables is less than 20, this approach will result in a
conservative number of factors.
Determination based on Scree plot:
A scree plot is a plot of the eigen values against the number of factors in order
of extraction. The shape of the plot is used to determine the number of factors. The plot
typically has a distinct break between the steep slope of the factors with large eigenvalues
and the gradual trailing off associated with the rest of the factors. This gradual trailing
off is referred to as the scree. The point at which the scree begins denotes the true
number of factors.
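Both rules can be sketched with plain lists. The eigenvalues below are hypothetical except for the first two, which match the earlier worked example:

```python
# Hypothetical eigenvalues in order of extraction (the first two are taken
# from the worked example earlier in this unit).
eigenvalues = [2.18, 1.39, 0.40, 0.30, 0.20]

# Eigenvalue criterion: retain factors with eigenvalues greater than 1.0.
n_factors = sum(1 for e in eigenvalues if e > 1.0)    # 2

# Scree heuristic: the largest drop between successive eigenvalues marks
# the start of the scree; the factors before it are retained.
drops = [a - b for a, b in zip(eigenvalues, eigenvalues[1:])]
elbow = drops.index(max(drops)) + 1                   # 2 factors before the scree
```

Here the two rules agree; with real data they may not, and judgment is required.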
factor scores. The variables can be selected by examining the factor matrix and selecting,
for each factor, the variable with the highest loading on that factor. That variable can
then be used as a surrogate variable for the associated factor. This process works well if
one factor loading for a variable is clearly higher than all other factor loadings. However,
if two or more variables have similar loadings, the choice becomes difficult; in such cases
the choice of variables should be made on the basis of theoretical and measurement
considerations. For example, theory may suggest that a variable with a slightly lower
loading is more important than one with a slightly higher loading. Likewise, if a variable
having a slightly lower loading has been measured more precisely, it should be selected
as the surrogate variable.
analysis in place of the original variables, with the knowledge that the meaningful
variation in the original data has not been lost. Likewise, a large number of
dependent variables also can be reduced through factor analysis.
ii. Factor analysis can be used in product research to determine the brand attributes
that influence the consumers' choice.
iii. In advertising studies, factor analysis can be used to understand the media
consumption habits of the consumers.
iv. In pricing studies, it can be used to identify the characteristics of price-sensitive
consumers.
Limitations:
ii.
The results of a single factor analysis are generally considered less reliable and
dependable, as factor analysis mostly starts with a set of imperfect data. The
factor analysis should therefore be done at least twice; if similar results are
obtained, confidence in the results will increase.
iii.
Factor analysis is a complicated decision tool that can be used only when one
has thorough knowledge and enough experience of handling this tool.
Select the variables you want to enter into the factor analysis by double-clicking on
them, or use the Shift or Control keys to select them, and click the right arrow key to
move the selected variables to the Variables list on the right. Then click Extraction.
Extracting factors and factor rotation:
There is no hard and fast rule to determine the number of factors. A commonly
used convention is to retain the factors with eigenvalues greater than 1; the
statistical package will select this number by default. The scree plot may also be used
to determine the number of factors.
The variance explained by the initial solution, extracted components, and rotated
components is displayed. This first section of the table shows the Initial Eigenvalues.
The Total column gives the eigenvalue, or amount of variance in the original variables
accounted for by each component. The % of Variance column gives the ratio of the
variance accounted for by each component to the total variance in all of the variables.
The Cumulative % column gives the percentage of variance accounted for by the first n
components. For example, the cumulative percentage for the second component is the
sum of the percentage of variance for the first and second components. For the initial
solution, there are as many components as variables.
The second section of the table shows the extracted components. They explain
nearly 88% of the variability in the original ten variables, so the complexity of the data
set can be considerably reduced by using these components, with only a 12% loss of
information.
components. The large changes in the individual totals suggest that the rotated component
matrix will be easier to interpret than the unrotated matrix.
The scree plot enables the analyst to determine the optimal number of components. The
eigenvalue of each component in the initial solution is plotted. Generally, the components
on the steep slope are extracted, while the components on the shallow slope contribute
little to the solution. The last big drop occurs between the third and fourth components,
so the first three components are selected.
Rotated Component Matrix

                                Component
                      1            2            3
Price in thousands    .935         -3.45E-03    4.136E-02
Horsepower            .933         .242         5.565E-02
Engine size           .753         .436         .292
Length                .155         .943         6.862E-02
Wheelbase             3.616E-02    .884         .314
Width                 .384         .759         .231
Vehicle type          -.101        9.478E-02    .954
Fuel efficiency       -.543        -.318        -.681
Fuel capacity         .398         .495         .676
Curb weight           .519         .533         .581
Figure: Scatter plot of objects on Variable 1 and Variable 2, showing natural groupings
(clusters).
ii. Cluster centroid: The cluster centroid is the mean values of the variables for
all the objects in a particular cluster.
iii. Cluster centers: The cluster centers are the initial starting points in non-hierarchical clustering. Clusters are built around these centers or seeds.
iv. Cluster membership: Cluster membership indicates the cluster to which each
object or case belongs.
v. Dendrogram: A dendrogram, or tree graph, is a graphical device for displaying
clustering results. Vertical lines represent clusters that are joined together, and the
position of a line on the scale indicates the distance at which the clusters were
joined. The dendrogram is read from left to right.
Anna University Chennai
vi. Distance between cluster centers: The distance indicates how separated the
individual pairs of clusters are. Clusters that are widely separated are distinct
and desirable.
i. The Euclidean distance is the most commonly used measure. It is the square
root of the sum of the squared differences in values for each variable.
ii. The city-block or Manhattan distance measures the distance between two
objects in terms of the sum of the absolute differences in values for each variable.
iii. The Chebychev distance between two objects is the maximum absolute
difference in values for any variable.
The variables involved in the study may be measured in different units, for example
on a Likert scale, or as frequencies, percentages, etc. In such cases, before
clustering the respondents, the data must be standardized by rescaling each variable
to have a mean of zero and a standard deviation of unity. The outliers, or cases
with non-conforming values, should also be eliminated.
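The three distance measures, and the zero-mean/unit-deviation rescaling mentioned above, can be sketched in a few lines (the sample points are hypothetical):

```python
import math

def euclidean(a, b):
    """Square root of the sum of squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Sum of the absolute differences (city-block distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def chebychev(a, b):
    """Maximum absolute difference on any single variable."""
    return max(abs(x - y) for x, y in zip(a, b))

def standardize(values):
    """Rescale one variable to mean 0 and standard deviation 1."""
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / sd for v in values]

a, b = (1.0, 2.0, 3.0), (4.0, 6.0, 3.0)
# euclidean(a, b) = 5.0, manhattan(a, b) = 7.0, chebychev(a, b) = 4.0
```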
3. Select a Clustering Procedure
Clustering procedures may be broadly categorized as hierarchical or non-hierarchical.
Hierarchical clustering is characterized by the development of a hierarchy or
tree-like structure. Hierarchical methods can be of two types, viz., divisive or
agglomerative. Divisive clustering starts with all the objects grouped in a single cluster;
clusters are divided until each object is in a separate cluster. Agglomerative clustering
starts with each object in a separate cluster; clusters are formed by grouping objects
into bigger and bigger clusters, and this process continues until all objects are formed
into a single cluster. Agglomerative methods consist of (i) linkage methods, (ii) variance
methods and (iii) centroid methods.
Figure: Single linkage, based on the minimum distance between Cluster 1 and Cluster 2.
Single linkage method does not work well when the clusters are poorly defined.
The complete linkage method is similar to single linkage except that it is based on the
maximum distance or the furthest neighbour approach. The distance between two
clusters is calculated as the distance between their two furthest points.
Figure: Complete linkage (maximum distance between Cluster 1 and Cluster 2) and
average linkage (average distance between Cluster 1 and Cluster 2).
In the average linkage method the distance between two clusters is defined as
the average of the distances between all pairs of objects, where one member of the pair
is from each of the clusters. This method uses information on all pairs of distances, not
merely the minimum or maximum distances. Hence it is preferable to single and complete
linkage method.
The average linkage method and Ward's method perform better than the other
procedures.
Non-hierarchical clustering
The non-hierarchical clustering method is also known as k-means clustering.
This method includes sequential threshold, parallel threshold and optimizing partitioning.
iii.
The optimizing partitioning method differs from the other threshold methods
in that the objects can later be reassigned to clusters to optimize an overall
criterion, such as the average within-cluster distance for a given number of
clusters.
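A minimal one-dimensional k-means sketch illustrates the assign-then-recompute loop behind non-hierarchical clustering. The data points and starting seeds here are hypothetical:

```python
def kmeans_1d(points, centers, iterations=10):
    """Tiny k-means: assign each point to its nearest center, then
    recompute each center as the mean of its assigned points."""
    for _ in range(iterations):
        clusters = {c: [] for c in centers}
        for p in points:
            nearest = min(centers, key=lambda c: abs(p - c))
            clusters[nearest].append(p)
        centers = [sum(members) / len(members)
                   for members in clusters.values() if members]
    return sorted(centers)

points = [1.0, 1.2, 0.8, 7.8, 8.0, 8.2]
centers = kmeans_1d(points, [0.0, 5.0])   # seed centers chosen arbitrarily
# centers converge to approximately [1.0, 8.0]
```

Real k-means works the same way with multidimensional points and Euclidean distance; statistical packages also let the analyst supply or optimize the seeds.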
the variables. The centroids enable us to describe each cluster by assigning it a name or
label. It will be more helpful to profile the clusters in terms of variables that are not used
for clustering. The demographic, psychographic, product usage, media usage or other
variables can be used for profiling. The variables that significantly differentiate between
clusters can be identified via discriminant analysis and one-way analysis of variance.
6. Assess Reliability and Validity
Several decisions are made on the basis of cluster analysis; hence, clustering
solutions should not be accepted without assessing their reliability and validity. The
following procedures can be followed to provide adequate checks on the quality of the
clustering results:
• Perform cluster analysis on the same data using different distance measures.
Compare the results across measures to determine the stability of the solutions.
• Use different methods of clustering and compare the results.
• Split the data randomly into halves, perform clustering separately on each half,
and compare the cluster centroids across the two sub-samples.
• Delete variables randomly, perform clustering based on the reduced set of
variables, and compare the results with those obtained by clustering based on the
entire set of variables.
• In non-hierarchical clustering, the solution may depend on the order of cases in
the data set. Perform multiple runs using different orders of cases until the
solution is stable.
Segmenting the market: The consumers may be clustered on the basis of the
benefits sought from the purchase of a product. Each cluster would consist of
consumers who are relatively homogeneous in terms of the benefits they seek.
This is called benefit segmentation.
firm can examine its current offerings compared to those of the competitors to
identify potential new product opportunities.
Select the variables on the basis of which clusters are to be formed. Also select the
case labeling variable.
Click Plots.
Select Dendrogram.
Select None in the Icicle group.
Click Continue.
Cases are listed along the left vertical axis. The horizontal axis shows the
distance between clusters when they are joined. Parsing the classification tree to
determine the number of clusters is a subjective process; generally, one looks for gaps
between joinings along the horizontal axis. Starting from the right, there is a gap
between 20 and 25, which splits the automobiles into two clusters. There is another
gap from approximately 4 to 15, which suggests 6 clusters.
The agglomeration schedule

The agglomeration schedule is a numerical summary of the cluster solution.
At the first stage, cases 8 and 11 are combined because they have the smallest
distance. The cluster created by their joining next appears in stage 7. In stage 7, the
clusters created in stages 1 and 3 are joined, and the resulting cluster next appears in
stage 8. When there are many cases the table becomes rather long, but it may be easier
to scan the coefficients column for large gaps than to scan the dendrogram. A good
cluster solution shows a sudden jump (gap) in the distance coefficient, and the solution
before the gap indicates the good solution. The largest gaps in the coefficients column
occur between stages 5 and 6, indicating a 6-cluster solution, and between stages 9 and
10, indicating a 2-cluster solution. These are the same as the findings from the
dendrogram.
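Scanning the coefficients column for the largest gap can be automated. The coefficient values below are hypothetical, not taken from the actual agglomeration schedule:

```python
# Hypothetical distance coefficients from an agglomeration schedule,
# one per joining stage, in order.
coefficients = [0.3, 0.5, 0.7, 0.9, 1.1, 3.0, 3.4, 3.8, 4.2, 9.5]

# Gap between each pair of successive stages.
gaps = [b - a for a, b in zip(coefficients, coefficients[1:])]

# The stage just before the largest jump suggests a good stopping point.
best_stage = gaps.index(max(gaps)) + 1    # stage number (1-based)
n_cases = len(coefficients) + 1           # an n-case solution has n - 1 stages
n_clusters = n_cases - best_stage         # clusters remaining at that stage
```

With these illustrative numbers the largest jump follows stage 9, giving a 2-cluster solution; the second-largest jump follows stage 5, giving 6 clusters, mirroring the pattern described above.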
Classification matrix:
This is also called the confusion matrix or prediction matrix. It contains the number
of correctly classified and misclassified cases. The correctly classified cases appear on the
diagonal because the predicted and actual groups are the same. The off-diagonal elements
represent cases that have been incorrectly classified. The sum of the diagonal elements
divided by the total number of cases gives the hit ratio.
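The hit-ratio calculation can be sketched directly; the counts in the matrix below are invented for illustration.

```python
# Sketch: computing the hit ratio from a classification (confusion) matrix.
# Rows are actual groups, columns predicted groups.

def hit_ratio(matrix):
    correct = sum(matrix[i][i] for i in range(len(matrix)))  # diagonal
    total = sum(sum(row) for row in matrix)
    return correct / total

matrix = [[50, 10],   # actual group 1: 50 classified correctly, 10 not
          [8, 32]]    # actual group 2: 8 misclassified, 32 correct
print(hit_ratio(matrix))  # -> 0.82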
Discriminant function coefficients:
The unstandardised discriminant function coefficients are the multipliers of
variables, when the variables are in the original units of measurement.
Discriminant scores:
The unstandardized coefficients are multiplied by the values of the variables.
These products are summed and added to the constant term to obtain the discriminant
scores.
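The score computation just described is a simple linear combination; the constant, coefficients and variable values below are hypothetical.

```python
# Sketch: a discriminant score is the constant plus the sum of each
# unstandardized coefficient times the value of its variable.

def discriminant_score(constant, coefficients, values):
    return constant + sum(b * x for b, x in zip(coefficients, values))

score = discriminant_score(constant=-1.2,
                           coefficients=[0.5, -0.3],
                           values=[4.0, 2.0])
print(score)  # -1.2 + 2.0 - 0.6, approximately 0.2
```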
Eigenvalue:
For each discriminant function, the eigenvalue is the ratio of between-group to
within-group sums of squares. Large eigenvalues imply superior functions.
F values and their significance:
These are calculated from a one-way ANOVA, with the grouping variable
serving as the categorical independent variable. Each predictor serves as metric dependent
variable in the ANOVA.
Group means and group standard deviation:
These are computed for each predictor for each group.
Pooled within-group correlation matrix:
The pooled within group correlation matrix is computed by averaging the
separate covariance matrices for all the groups.
Structure correlations:
Also referred to as discriminant loadings, the structure correlations
represent the simple correlations between the predictors and the discriminant function.
Total correlation matrix:
If the cases are treated as if they were from a single sample and the correlations
computed, a total correlation matrix is obtained.
Wilks' lambda:
Sometimes also called the U statistic, Wilks' lambda for each predictor is the ratio of
the within-group sum of squares to the total sum of squares. The values range between
0 and 1. Values near 1 indicate that the group means are not different;
values near 0 indicate that the group means are different.
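Both Wilks' lambda (within SS / total SS) and the between-to-within ratio used as the eigenvalue criterion can be computed from the same sums-of-squares decomposition; the two small groups below are invented.

```python
# Sketch: Wilks' lambda and the between/within ratio for one predictor
# measured in two hypothetical groups.

def ss(values, mean):
    return sum((v - mean) ** 2 for v in values)

groups = [[1.0, 2.0, 3.0], [7.0, 8.0, 9.0]]     # invented data
all_values = [v for g in groups for v in g]
grand_mean = sum(all_values) / len(all_values)

total_ss = ss(all_values, grand_mean)
within_ss = sum(ss(g, sum(g) / len(g)) for g in groups)
between_ss = total_ss - within_ss

wilks = within_ss / total_ss        # near 0 -> group means differ
ratio = between_ss / within_ss      # large -> superior function
print(round(wilks, 3), round(ratio, 1))  # 0.069 13.5
```

The small lambda and large ratio both say the same thing here: almost all the variation in this predictor is between the two groups.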
4.6.2 Steps in conducting two-group discriminant analysis
The steps in conducting two-group discriminant analysis are discussed below:
1. Formulate the problem
2. Research design issues
3. Assumptions
4. Estimation of the discriminant function and assessment of overall fit
5. Interpretation of discriminant functions
6. Validation of the results
When the dependent variable is interval or ratio scaled, it must first be converted into
categories. The predictor variables should be selected based on a theoretical model or
previous research; in the case of exploratory research, the experience of the researcher
should guide the selection.
2. Research design issues
Research design for discriminant analysis requires consideration of the following
issues: (1) selection of both dependent and independent variables, (2) deciding the sample
size needed for estimation of the discriminant function and (3) division of the sample for
validation purposes.
(i) Selection of dependent and independent variable
To apply discriminant analysis the researcher should specify the dependent and
the independent variables. The dependent variable should be categorical and the independent
variables metric. The number of dependent variable categories can be two or
more, but these groups must be mutually exclusive and exhaustive. Each observation
should be such that it can be placed into only one group. The dependent variable in
some cases may involve two groups, e.g., purchasers and non-purchasers. In some
cases it may also involve several groups such as heavy users, medium users, light users
and non-users of a product.
After the decision regarding the dependent variable, the researcher must decide about
the independent variables to be included in the analysis. Independent variables can be
selected in the following two ways.
The first approach is identifying the variables from previous research or from the
theoretical model that underlies the research question.
The second approach is intuition, i.e., utilizing the researcher's knowledge and
intuitively selecting variables for which previous research is not available.
In the direct method all the independent variables are entered simultaneously; it is
appropriate when the researcher has selected the variables for theoretical reasons and is
not interested in viewing intermediate results based only on the most discriminating
variables. In step-wise discriminant analysis the independent variables are entered one
at a time, based on their ability to discriminate among groups. The stepwise method is
useful when the researcher wants to consider a relatively large number of independent
variables for inclusion in the function.
Statistical significance
The researcher must assess the level of significance of the discriminant function
computed. It would not be meaningful to interpret the analysis if the discriminant functions
estimated were not statistically significant. Significance tests can be done on the basis of
a number of statistical criteria, viz., Wilks' lambda, Hotelling's trace and Pillai's criterion.
A significance criterion of .05 or beyond is often used. If higher levels of risk for
including non-significant results are acceptable, the significance level may be fixed
at .2 or .3.
If the number of groups is three or more, the researcher must decide not only
whether the discrimination between groups is significant but also whether each of the
estimated discriminant functions is statistically significant.
Assessing Overall Fit
Assessing overall fit of the selected discriminant function involves three tasks:
calculating discriminant Z scores for each observation, evaluating group differences on
the discriminant Z scores and assessing group membership prediction accuracy.
5. Interpretation of discriminant functions
Interpretation involves examining the discriminant functions to determine the
relative importance of each independent variable in discriminating between the groups.
Three methods are available to assess the relative importance of the independent
variables:
i. Standardized discriminant function coefficients: the sign and magnitude of each
coefficient indicate a variable's relative contribution to the function.
ii. Structure correlations (discriminant loadings): the simple correlations between
each predictor and the discriminant function.
iii. Partial F values: the sizes of the significant F values are examined and ranked.
Large F values indicate greater discriminatory power.
Discriminant analysis can help to distinguish between heavy, medium and light
users of a product in terms of consumption habits and lifestyles.
Click Continue
Click Classify in the Discriminant Analysis dialog box.
When there are lots of predictors, the stepwise method can be useful in automatically
selecting the best variables to use in the model. The stepwise method starts
with a model that doesn't include any of the predictors. At each step, the predictor with
the largest F-to-Enter value that exceeds the entry criterion (by default, 3.84) is added
to the model. The variables left out of the analysis have F-to-Enter values smaller
than 3.84, and so are not added.
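The entry rule alone can be sketched as below. Note this is a simplification: in SPSS the F-to-Enter values are recomputed after every entry, whereas here they are held fixed purely for illustration, and the predictor names and F values are invented.

```python
# Sketch of the stepwise entry rule: at each step the predictor with the
# largest F-to-Enter above the default criterion (3.84) is added.

F_TO_ENTER = 3.84

def stepwise_select(f_to_enter):
    remaining = dict(f_to_enter)
    selected = []
    while remaining:
        best = max(remaining, key=remaining.get)
        if remaining[best] <= F_TO_ENTER:
            break                      # nothing left that qualifies
        selected.append(best)
        del remaining[best]
    return selected

print(stepwise_select({"education": 60.0, "age": 12.5, "gender": 1.9}))
# -> ['education', 'age']  (gender's F is below 3.84, so it is left out)
```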
The following table displays statistics for the variables that are in the analysis at each
step.
Variables in Analysis
Eigenvalues
Three functions are fit automatically, but due to its minuscule eigenvalue, the
third function can be ignored.
Wilks' lambda shows that only the first two functions are useful.
Wilks' Lambda
Structure Matrix
The structure matrix enables one to identify the significant variables within each
function.
When there is more than one discriminant function, an asterisk (*) marks each
variable's largest absolute correlation with one of the canonical functions. Within each
function, these marked variables are then ordered by the size of the correlation. Level
of education is most strongly correlated with the first function, and it is the only variable
most strongly correlated with this function. Years with current employer, Age in years,
Household income in thousands, Years at current address, Retired, and Gender
are most strongly correlated with the second function, although Gender and Retired
are more weakly correlated than the others. The other variables mark this function as a
stability function. Number of people in household and Marital status are most
strongly correlated with the third discriminant function, but since this function is not
useful, these predictors are of little value.
The territorial map
The territorial map helps to study the relationships between the groups and the
discriminant functions. Combined with the structure matrix results, it gives a graphical
interpretation of the relationship between predictors and groups.
The first function indicates that group 4 customers are, in general, the most highly
educated. The second function separates groups 1 and 3. Since the third function was
found to be rather insignificant, only the first two discriminant functions are plotted.
From Wilks' lambda, it can be understood that the model is doing better than
guessing, but the classification results should be considered to determine how much
better the model is.
From the observed data in the above table it can be seen that the null
model (that is, one without predictors) would classify the maximum number of customers
into the modal group, Plus service. Thus, the null model would be correct about
28.1% of the time (roughly 112 of the 400 customers).
The discriminant model classifies 11.4% more, or 39.5%, of the customers correctly.
In particular, the model excels at identifying Total service customers. However, it does an
exceptionally poor job of classifying E-service customers.
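The comparison with the null model is simple arithmetic. The group counts below are illustrative only, chosen so the modal group holds about 28% of 400 cases; they are not the actual counts from the table.

```python
# Sketch: comparing the discriminant model's hit rate with the null model
# that assigns every case to the largest (modal) group.

group_counts = [112, 100, 98, 90]          # hypothetical group sizes
n = sum(group_counts)                      # 400 cases
null_accuracy = max(group_counts) / n      # classify everyone as modal group

model_accuracy = 0.395                     # hit ratio reported by the model
improvement = model_accuracy - null_accuracy

print(round(null_accuracy, 3), round(improvement, 3))  # 0.28 0.115
```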
3. Assumptions
In carrying out multiple regression analysis, several assumptions are made about the
dependent and independent variables and about the relationship as a whole.
Once the variate has been derived through multiple regression, it acts collectively in
predicting the dependent variable. The assumptions are made not only for the individual
variables but also for the variate itself. The variate and its relationship with the dependent
variable should also meet the assumptions of multiple regression. The assumptions are:
(i) Linearity of the relationship between the dependent and independent variables
(ii) Constant variance of the error terms (homoscedasticity)
(iii) Independence of the error terms
(iv) Normality of the error term distribution
The collinearity among the variables needs to be verified from the collinearity
diagnostics in the output. If the eigenvalues are close to 0, the predictors
are highly inter-correlated and small changes in the data values may lead to large
changes in the estimates of the coefficients. Condition index values greater than 15
indicate a possible problem with collinearity; greater than 30, a serious problem.
The following collinearity table shows that there are no eigenvalues close to 0,
and all of the condition indexes are much less than 15. The model built using the stepwise
method does not have problems with collinearity.
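The condition indexes come directly from the eigenvalues: each index is the square root of the largest eigenvalue divided by that dimension's eigenvalue. The eigenvalues below are invented to show one dimension crossing the "serious" threshold.

```python
import math

# Sketch: condition indexes from collinearity-diagnostics eigenvalues.
# Above 15 suggests a possible collinearity problem; above 30, serious.

def condition_indexes(eigenvalues):
    largest = max(eigenvalues)
    return [math.sqrt(largest / e) for e in eigenvalues]

eigs = [2.9, 0.07, 0.02, 0.0004]           # hypothetical eigenvalues
for e, ci in zip(eigs, condition_indexes(eigs)):
    flag = "serious" if ci > 30 else "possible" if ci > 15 else "ok"
    print(f"eigenvalue={e:<8} index={ci:6.1f}  {flag}")
```

With these numbers the last dimension's index is well above 30, flagging a serious collinearity problem.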
Anna University Chennai
The ability of the model to predict the dependent variable can be checked
through the model fit summary.
Model Summary
The adjusted R square value indicates the fitness of the model. A higher value is
preferable.
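The adjusted R square statistic can be sketched from its standard formula, which penalizes R square for the number of predictors; the R square, sample size and predictor count below are invented.

```python
# Sketch: adjusted R-square. n is the sample size, k the number of
# predictors; the penalty grows as k approaches n.

def adjusted_r_square(r_square, n, k):
    return 1 - (1 - r_square) * (n - 1) / (n - k - 1)

print(round(adjusted_r_square(r_square=0.80, n=30, k=2), 3))  # -> 0.785
```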
Stepwise Co-efficients
The step-wise algorithm chooses price and vehicle size (wheelbase) as
predictors. Sales are negatively affected by price and positively affected by size. Hence
the conclusion is that cheaper, bigger cars sell well.
Pooled Rc² (pooled canonical correlation) is the sum of the squares of all the canonical
correlation coefficients, representing all the orthogonal dimensions in the solution by
which the two sets of variables are related. Pooled Rc² is used to assess the extent to
which one set of variables can be predicted or explained by the other set.
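The pooled statistic is a one-line sum; the three canonical correlations below are hypothetical.

```python
# Sketch: pooled Rc² as the sum of squared canonical correlation
# coefficients across all dimensions of the solution.

def pooled_rc2(canonical_correlations):
    return sum(r ** 2 for r in canonical_correlations)

print(pooled_rc2([0.8, 0.5, 0.2]))  # 0.64 + 0.25 + 0.04, approximately 0.93
```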
Eigenvalues: They reflect the proportion of variance in the canonical variate explained
by the canonical correlation relating the two sets of variables.
Canonical weight: This is also called the canonical function coefficient or the
canonical coefficient. The standardized canonical weights are used to assess the relative
importance of individual variables' contributions to a given canonical correlation.
Structure correlation coefficient: This is also called the canonical factor loading. A
structure correlation is the correlation of a canonical variable with an original variable in
its set. Structure correlations are used for the following purposes.
Interpreting the Canonical Variables: The magnitudes of the structure
correlations help in interpreting the meaning of the canonical variables with which they
are associated. Larger canonical factor loadings should be weighted more when assigning
an interpretive label to the given canonical correlation. A rule of thumb is for variables
with correlations of 0.3 or above to be interpreted as being part of the canonical variable,
and those below not to be considered part of the canonical variable.
Calculating Variance Explained in a Given Original Variable: The square
of the structure correlation is the percent of the variance in a given original variable
accounted for by a given canonical variable on a given canonical correlation.
Canonical communality coefficient: This is the sum of the squared structure coefficients
for a given variable. The canonical communality coefficient measures how much of a
given original variable's variance is reproducible from the canonical variables.
Redundancy coefficient: d, also called Rd, measures the percent of the variance of
the original variables of one set that may be predicted from a (usually the first) canonical
variable from the other set. High redundancy means high ability to predict.
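Both summary measures can be sketched from structure correlations. The redundancy formula below (average squared loading of a set times the squared canonical correlation) is one common formulation of the redundancy index; the loadings and Rc² values are hypothetical.

```python
# Sketch: communality and redundancy built from structure correlations
# (canonical loadings).

def communality(loadings_across_functions):
    # sum of squared structure coefficients for one variable
    return sum(l ** 2 for l in loadings_across_functions)

def redundancy(loadings_in_set, rc2):
    # average squared loading of a set times the squared canonical
    # correlation for that dimension
    mean_sq = sum(l ** 2 for l in loadings_in_set) / len(loadings_in_set)
    return mean_sq * rc2

print(round(communality([0.8, 0.3]), 2))           # 0.64 + 0.09 = 0.73
print(round(redundancy([0.8, 0.6, 0.4], 0.5), 3))  # approximately 0.193
```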
Two sets of variables - dependent and independent are identified in the canonical
correlation. Once the variables are identified, the canonical correlation can be performed
for the following purposes:
(i) Determining the magnitude of the relationship that may exist between the two
sets of variables.
(ii) Deriving a set of weights for each set of dependent and independent variables
so that the linear combinations of each set are maximally correlated.
(iii) Explaining the nature of whatever relationships exist between the sets of
variables, generally by measuring the relative contribution of each variable to
the canonical functions.
(a) Canonical weights can be used to interpret the canonical functions. This involves
examining the sign and magnitude of the canonical weight assigned to each
variable in its canonical variate. Variables with relatively larger weights contribute
more to the variates and vice versa.
(b) Canonical loadings can be used to interpret the functions. A loading measures the
simple linear correlation between an original observed variable in the dependent or
independent set and the set's canonical variate. The larger the coefficient, the
more important it is in deriving the canonical variate.
(c) Canonical cross-loadings can be used as an alternative to canonical loadings.
This involves correlating each of the original observed dependent variables directly
with the independent canonical variate and vice versa.
6. Validation and Diagnosis
Canonical correlation analysis should be subjected to validation methods to
ensure that the results are not specific only to the sample data and can be generalized
to the population. For the purpose of validation two sub samples can be created
and analyses can be performed on each sub-sample separately. Then the results are
compared for similarity of canonical functions, variate loadings, etc. If marked
differences are found, additional investigation should be performed.
Another approach is to assess the sensitivity of the results to the removal of a
dependent or independent variable. To ensure the stability of the canonical weights
and loadings, multiple canonical correlations can be performed, each time
removing a different independent or dependent variable.
4.4.3 Application of Statistical Package : Canonical correlation
Canonical correlation can be carried out in SPSS using syntax. There are two
ways to perform the same. One is to use the Canonical correlation.sps macro. The
other way is to use MANOVA with DISCRIM subcommand.
(1) Canonical correlation.sps macro
The macro is a part of the SPSS package and can be found in a subdirectory
where SPSS is installed. To use the canonical correlation macro, locate the file Canonical
correlation.sps on the computer. Suppose that it is in c:\Program Files\spss. In the
syntax window, type
INCLUDE FILE='c:\Program Files\spss\Canonical correlation.sps'.
(2) MANOVA
To use MANOVA, the following syntax should be typed in the syntax window:
SUMMARY
Selection of multivariate techniques to analyze the data is based on two criteria:
whether the variables can be divided into dependent and independent sets, and the type
of data, i.e., metric or non-metric. The various multivariate techniques, viz., factor
analysis, cluster analysis, multiple regression and correlation, discriminant analysis and
canonical correlation, were presented. The criteria for applying the statistical tests and
the steps involved in conducting them were explained in detail. Applications of these
statistical tests using the software package were also discussed. Once the data analysis
is done, a report has to be prepared to communicate the results to all concerned. The
next unit, on report writing, deals with the same.
• Explain the application of cluster analysis with an example. Elucidate the process
of performing the same in SPSS. How will you interpret the results?
• What are the uses of discriminant analysis? Explain the process of building a
discriminant model.
• What is multiple regression? Explain the steps involved in the application of the
same.
• When can you apply canonical correlation? Explain the steps involved in building
the model.
UNIT 5
REPORT WRITING
5.1 INTRODUCTION
Report writing is an integral part of a research process. Research reports are
written to communicate to the world at large the results of the research, field work, and
other activities. Research report is a concrete outcome of the research work undertaken.
The quality of the research is judged by the quality of the writing and how well the
importance of the findings is conveyed. Research carried out very scientifically, revealing
findings of great importance, may not be of value if it is not communicated
effectively. In the context of business, the report assumes importance as it is through the
reports the management gets information regarding the activities performed at various
levels of the organization. The management takes decisions and controls various activities
of the business on the basis of information provided through the business reports.
According to Louis L.N., a business report is an unbiased and arranged presentation of
facts by one or more persons for a definite and specified important business
purpose. Koontz and O'Donnell define a report as "a documentation in which, for the
purpose of providing information, a specified problem is researched and analyzed and
conclusions, thoughts and sometimes references are presented." In a nutshell, a business
report is any factual, objective document that serves a business purpose. This chapter
provides an insight into the basics of writing research reports in addition to the contents
and characteristic features of a good report. The contents of a research proposal and
the use of visual aids in preparing reports are dealt in detail.
A report may also be prepared to convince the reader or to sell an idea. The
report in this case would be more detailed and convincing as to how the proposed
idea could add to the organization's value, or the justification as to why it should
be adopted.
• Reports may be prepared to provide an insight into the problem and may also
provide a final solution to the same.
5. Function
The reports may be classified as informative and interpretative on the basis of
function performed. Informative reports present facts pertinent to the issue or situation.
Common types of informational reports include those for monitoring and controlling
operations, statements of policies and procedures, compliance reports and progress
reports. It may take the form of an operating or a periodic report. Operating reports
provide managers with detailed information regarding all activities like sales, inventory,
costs, etc. Periodic reports describe the activities in a department during a particular
period.
Interpretative reports, also known as analytical or investigative reports, analyse the facts
and present recommendations and conclusions. The report presents facts and persuades
the reader to accept a stated decision, action or the recommendations detailed throughout
the report. It may take the form of a problem-solving report providing the background
information and analysis about the various options. A troubleshooting report is a form
of problem-solving report which discusses the source of the problem, the extent of the
damage done and the solutions possible. A feasibility report is a problem-solving report that studies
proposed options to assess whether all or any one of them is sound.
6. Subject dealt
The reports may be categorized as problem-determining, fact-finding, performance
reports, technical reports, etc. The problem-determining report focuses on identifying the
underlying problem or ascertaining whether a problem actually exists. Technical reports
are concerned with presenting data on a specialized subject, with or without comments.
7. Legal reports
Reports may be prepared to meet government regulations. For example, a
compliance report explains what a company is doing to conform to government
regulations. It may be prepared on an annual basis, like income tax returns, the annual
shareholders' report, etc. Interim compliance reports can also be prepared to monitor
and control the licenses granted by the government.
5.5 The Concept of audience
Reports are written for the sake of the audience, i.e., the readers of the reports.
The goal of the report writer is to enable the audience to act, and hence the audience should
be taken into consideration in everything from word choice, planning, organizing and
deciding about the visual aids to sentence structure. A good report requires tuning to the
various aspects of the audience, viz., their knowledge level, their role in the given situation,
their place in the organization and their attitude.
The more powerful the reader, the less likely the report will give orders and the
more likely it is to make suggestions.
The report type that is appropriate should be selected. For analytical reports, the problem
should be defined before stating the purpose of the report.
Problem definition
The problem addressed by a report may be defined by the person who authorizes
the report or by the researcher himself. The readers of the report should be convinced
about the existence of the problem; this requires persuasive writing. The problem
definition can be made by answering the following issues:
• What needs to be ascertained?
• When did the problem start?
• What is the importance of the issue?
• Who is involved in the situation?
• Where is the trouble located?
Problem factoring can also be done which involves breaking down the perceived
problem into a series of logical, connected questions that try to identify the cause and
effect. Speculating the cause for a problem leads to forming a hypothesis. A hypothesis
is a potential explanation that needs to be tested. Dividing the problem and framing the
hypothesis based on the available evidence enables the researcher to tackle even the
most complex situation.
Developing the statement of purpose
The problem statement defines what is going to be investigated, whereas
the statement of purpose defines why the report is prepared. The purpose statement
can be started with an infinitive phrase, e.g., To analyse the reasons for the fall in the
share price. Using an infinitive phrase (to plus a verb) encourages the writer to take
control and decide where to start. The purpose statement should be highly
specific, and it should be checked with the person who has authorized the report.
The confirmed statement can be used as the basis for developing the preliminary outline
of the report.
Developing a preliminary outline
The preliminary outline establishes the framework for the report preparation. It
provides a visual diagram of the report to be prepared: the important points, the order in
which the discussion will take place and the details to be included. The preliminary
outline might look different from the final outline of the report; however, the outline
guides the research effort and acts as a foundation for organizing and composing the
report. Since the outline is only a working draft, it will be revised and modified in the further
steps. The two common outline formats used to guide the writing efforts are alphanumeric
and decimal. Grammatical parallelism should be ensured among the various items
presented at the same level. Parallelism ensures consistency by showing that the ideas are
related and that they are of similar importance.
• A consistent time perspective should be ensured in the report, i.e., the report
should be in the past or present tense. A chronological sequence should also be
adopted in presenting the events.
• The reader's perspective of the report might be different from the researcher's
perspective. Hence a preview or road map of the report structure should be
included. This will clarify for the reader the overall organization and flow
of the report.
III. Post-writing stage
A research report will undergo many drafts before finalization. The report is
revised many times to check the content, organization, style and tone, readability, clarity
and conciseness. The post-writing stage involves revision of the report, production and
proofreading.
(1) Revision
Revision takes place during and after preparation of the first draft. It is an
ongoing process that occurs throughout the writing process. Revision involves searching
for the best way of saying something, probing for the right words, rephrasing sentences,
reshaping, juggling elements, etc. Revision is a never-ending process; however, every
research report has a deadline, and hence schedules should be drawn and met. Revision
consists of three main activities, viz., (i) evaluating content, organization, style and tone,
(ii) reviewing for readability and scannability and (iii) editing for clarity and conciseness.
(i) Evaluating content, organization, style and tone
During the process of evaluating the content the following aspects should be
given due attention:
• Accuracy of the information presented
• Relevance of the facts presented to the concerned audience
• Completeness of the information provided to suit the audience's needs
• Balance between specific and general information
While reviewing the organization the following aspects should be considered:
• Logical order in presentation and coverage of all main points
• Assurance that the main theme is given more space and prominence
• Words ending with -ion, -tion, -ing, -ment, -ant, -ent, -ance and -ency should
be used with care, as they change verbs into nouns and adjectives. Verbs should
be used instead of noun phrases.
The writing process can be summarized as three stages:
Prewriting: analyzing, investigating, adaptation.
Writing: format and length, structure, order, composing.
Post-writing: revision, production, proofreading.
• The report should be free of technical or statistical jargon when it is addressed
to audiences who may not understand such terms.
Title page
Most organizations have their own form of title page for the research
report, and the same should be complied with. The title page generally has the
following information:
• Title of the report.
• The month and year of submission.
• For whom and by whom the report is submitted.
• If the report is submitted for the award of a degree, the degree for which the
dissertation is submitted should be listed.
The best practice is to centre the title of the report on the page in upper-case
letters. If the title is too long to be centered on one line, an inverted-pyramid arrangement
should be followed without splitting words or phrases.
Preface
The preface may include the writer's purpose in conducting the study, a brief
resume of the background, scope, purpose and general nature of the research for which
the report is prepared, and the acknowledgments. A preface can be prepared only after
the final form of the report is ready. In the case of a dissertation submitted for the award
of a degree, the preface is omitted and an acknowledgment is added instead.
Acknowledgment recognizes the persons to whom the writer is indebted for
guidance and assistance during the study. It also credits the institution for providing
funds to conduct the study and for granting permission to use the facilities. The researcher
should acknowledge the assistance provided by all concerned honestly in a simple and
tactful manner.
Executive summary
An executive summary is a brief account of the research study. It is a report in
miniature covering all aspects in the body of the report but in a brief manner. It provides
an overview of the research problem identified and highlights the important information
such as the sampling design, data collection method used, results of data analysis, findings
and recommendations. The length of the executive summary will normally be two to
three pages. The executive summary is usually written after the completion of the report.
Sometimes a synopsis or an abstract may be included instead of the executive
summary; however, they are not one and the same. Executive summaries are more
comprehensive than a synopsis. They include headings, visual aids and enough information
to help busy people make quick decisions. Although executive summaries are not
designed to replace the report, in some cases the summary may be the only part read
by the audience. By contrast, a synopsis is only a brief overview of the entire report and
may either highlight the main points as they appear in the report or simply inform the
reader as to the content of the report. The purpose of synopsis is to entice the audience
to read the report.
Table of contents
The table of contents includes the major divisions of the report. It indicates in
outline form the topics included in the report. The purpose of a table of contents is to
provide an analytical overview of the topics included in the report together with the
sequence of presentation. Depending on the length and complexity of the report, the
content page may show only the top two or three levels of headings or only the first-level
headings. Care should be exercised to see that the titles of chapters and captions
of subdivisions within chapters correspond exactly with those included in the body of
the report. Page numbers for each of the divisions are given. The relationship between
major divisions and minor subdivisions should be shown by using capital letters and
indentation or by using numeric system.
The table of contents is prepared after the other parts of the report have been
typed, so that the page numbers can be given. If there are fewer than four visual aids,
they may be listed in the table of contents, but if there are more than four, a
separate list of illustrations should be prepared. Some guidelines for writing the table of
contents are given below:
• The page is titled Table of Contents or Contents.
• The name of each section should be worded and formatted as it appears in the
text.
• Entries in the table of contents should not be underlined, as the lines may
overwhelm the words.
• Only the page number on which the section starts should be used.
• The margins should be set such that the page numbers align on the right.
• Not more than three levels of headings should be given.
• Leaders, a series of dots, can be used to connect the words to the page numbers.
List of Tables
The researcher should prepare a list of tables compiled under the heading LIST
OF TABLES, centered on a separate page by itself. Two spaces below the heading, the
columns Table number, Title and Page number should be given. The table number
should be aligned to the left, page number should be aligned at the right and the title
should be centered.
List of Illustrations
The list of figures should be prepared in the same form as the list of tables. The page is headed LIST OF FIGURES. The list includes the figure number, the title of the figure, and the page number. Normally, Arabic numerals are used for numbering.
B. The Text
The text is the most important part of a report, as it is in this section that the writer presents the facts. The researcher should devote the greater part of attention to the careful organization and presentation of the findings or arguments. The text may be organized into an introduction, a methodology section, and as many chapters as required for presenting the report.
Introduction
The introduction prepares the reader for the report by describing its parts: background, problem statement, and research objectives.
Background
The background information provides a prelude for the reader of the research report. It may draw on the preliminary results of exploration, the survey, or any other source. Secondary data from the literature review could also be highlighted. Previous research, theory, or situations that led to the research issue can be discussed. The literature should be organized, integrated, and presented in a logical manner. The background includes definitions, assumptions, etc. It provides the information needed to understand the remainder of the research report. It contains information pertinent to the management problem or the situation that led to the study. It may be placed before the problem statement.
Problem statement
The problem statement conveys the need for the research project. The problem is usually represented by a management question, which is followed by a more detailed set of objectives. The guidelines are given below:
• It gives basic facts about the problem.
• It specifies the causes or origin of the problem.
• It explains the significance of the problem.
Research objectives
The research objectives state the purpose of the research. The objectives may be research questions and associated investigative questions. In a correlational study, the hypothesis statements are included. Hypotheses are declarative statements describing the relationship between two or more variables. They state clearly the variables of concern, the relationships among them, and the target group being studied. Operational definitions of the variables should be included.
Methodology
The methodology contains the following sections:
• The type of the study, viz., descriptive or exploratory, should be mentioned.
• The sampling design explains the sampling method and sample size.
• The data collection method is described in the report.
• The tools used for the analysis of data should be explained.
Findings and Conclusions
The findings section is generally the longest section of the report. Its objective is to explain the data. Wherever needed, the data should be supplemented with charts and graphs. The conclusion serves the important function of tying together the whole thesis or assignment. The recommendations of the study are also presented in this section; they provide ideas about corrective actions. In academic research, the suggestions broaden the understanding of the subject area. In applied research, the recommendations include guidelines for further managerial action. Several alternatives may be provided with justifications. The conclusion should leave the reader with an impression of completeness and of positive gain.
C. Reference material
The reference material includes the bibliography, appendix, and index.
Bibliography
The bibliography follows the main body of the text and is a separate but integral part of a thesis, preceded by a division sheet or introduced by the centered, capitalized heading BIBLIOGRAPHY. A bibliography is a list of the secondary sources consulted while preparing the report. Strictly speaking, a bibliography differs from a reference list. A bibliography lists the works relevant to the main topic of research interest, arranged in alphabetical order of the authors' last names. A reference list is a subset of the bibliography: it includes details of all the citations used in the literature survey and elsewhere in the research report, likewise arranged in alphabetical order of the authors' last names. These citations are provided to credit the authors and to enable the reader to find the works cited.
Proper citation styles and formats should be followed in providing references. Various methods of referencing are available, viz., the Publication Manual of the American Psychological Association (APA), The Chicago Manual of Style, the Modern Language Association (MLA) system, and the American Chemical Society (ACS) system. Each of these manuals specifies, with examples, how books, journals, newspaper articles, dissertations, and so on should be referenced.
For books the order may be as under:
1. Name of the author, last name first
2. Title of the book in italics
3. Place of publication and the publisher
4. Year of publication
Example:
Peeru Mohamed et al., Customer Relationship Management, Delhi, Vikas Publishing House, 2002.
References for articles in journals could be cited as under:
1. Name of the author, last name first
2. Title of the article, in quotation marks
3. Name of the periodical, in italics
4. The volume, or volume and number
5. The date of the issue
6. The pagination
Example:
Chitra, K., "In Search of Green Consumer: A Perceptual Study", Journal of Services Research, Volume 7, No.1, April-September, 2007, pp.173-191.
The above examples are just samples for bibliography entries. There are many
other acceptable forms which can be used. However, a researcher should follow a
consistent style of reference throughout the report.
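One way to keep a reference style consistent throughout a report is to assemble every entry from the same ordered fields. The two helpers below sketch the book and journal-article orderings given above; the function names and field names are invented for illustration and do not correspond to any official style manual:

```python
def book_reference(author, title, place, publisher, year):
    # Order: author (last name first), title, place of publication
    # and publisher, year of publication.
    return f"{author}, {title}, {place}, {publisher}, {year}."

def article_reference(author, title, journal, volume, number, date, pages):
    # Order: author, article title in quotation marks, journal name,
    # volume and number, date of issue, pagination.
    return (f'{author}, "{title}", {journal}, '
            f"Volume {volume}, No.{number}, {date}, pp.{pages}.")

print(book_reference("Peeru Mohamed et al.", "Customer Relationship Management",
                     "Delhi", "Vikas Publishing House", "2002"))
print(article_reference("Chitra, K.", "In Search of Green Consumer: A Perceptual Study",
                        "Journal of Services Research", 7, 1,
                        "April-September, 2007", "173-191"))
```

Generating every entry from one template guarantees the same ordering and punctuation across the whole bibliography.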
Appendix
The appendix contains information of a subordinate, supplementary, or highly technical nature that the researcher does not want to place in the body of the report. Each appendix should be clearly separated from the others and should be listed in the table of contents. The guidelines for preparing an appendix are:
• Each appendix item should be referred to at the appropriate place in the body of the report.
• In short reports, the page numbers may be continued in sequence from the last page of the body.
unsolicited proposal. This section should be geared to convince the sponsor that their
needs will be met by the conduct of the study.
Research design
The design module describes the technical issues involved in conducting the study: what is going to be done, described in technical terms. It can be divided into several subsections, viz., type of study, sampling design, data collection method, tools for analysis, scope of the study, and limitations. The justification for the particular sampling or data collection method chosen should also be discussed.
Qualifications of the researcher
This section should provide the names of the principal investigator, the co-investigators, and the other individuals involved in the project. The professional research competence and experience of the researchers should be highlighted to assure the sponsor. The academic experience, research experience, and similar projects conducted for internal and external agencies should be listed. The researcher's membership in various associations and other relevant accomplishments can be mentioned. A profile of the researcher can be enclosed in the appendix of the report.
Budget
The budget should be prepared in the format required by the sponsoring agents. The details to be presented in the budget vary depending on the sponsor's requirements. It should not be more than one or two pages. All the expenses should be presented with a proper break-up.
Schedule
The schedule should indicate the major phases of the project, the time required for each phase, and the milestones that mark the completion of the project. For example, the major phases may be refining the problem based on interaction with management, tuning up the objectives, designing the questionnaire, conducting the pilot study, data collection, analysis and interpretation, and report writing. Each of the phases should be presented along with the time schedule and the resources, including the people assigned to complete the work.
Facilities and special resources
The special facilities or resources needed to complete the project should be described in detail, along with the justification for them. The proposal should carefully list the relevant facilities and resources that will be used. The costs for such facilities should also be detailed in the budget.
Apart from the above, a bibliography listing the books, journals, and websites referred to should be given in alphabetical order. The appendixes, including the glossary of terms, the questionnaire, the profile of the investigator, etc., should be prepared. For a detailed discussion of these sections, refer to the integral parts of the research report.
• The visual aids should be positioned in the report at logical and convenient places.
• The visuals should be revised to eliminate clutter in terms of unnecessary words, lines, three dimensions, etc.
• High-quality visuals should be created, with clarity in lines, words, numbers, and organization, as this is an important aspect determining the effectiveness of a report.
Various types of visuals are available to present data. Some types of visuals depict certain kinds of data better than others:
• Tables can be used to present detailed, exact values.
• Frequencies and percentages can be represented better with a pie chart, segmented bar chart, or area chart.
• A line chart or bar chart can be used to illustrate a trend over a time period.
• A bar chart is used to compare one item with another.
• A pie chart is used to compare one part with the whole.
• A line chart, bar chart, or scatter chart can be used to depict correlations.
• A map is used to show geographical relationships.
• A flowchart or diagram is used to illustrate a process or a procedure.
1. Tables
• The tables should be numbered consecutively throughout the report. The number and the title are given above the table.
• The table title should be informative and identify the main points of the table.
• Horizontal rules are used to separate the parts of the table; they are placed above and below the column heads and below the last row of the table. Vertical lines can also be used to separate the columns.
• Spanner heads should be used to characterize the column headings; spanners eliminate repetition in column headings.
• Commonly understood units should be used. All items in a column should be expressed in the same unit and rounded off for simplicity.
• Column or row totals should be provided wherever needed.
• Explanatory comments should be placed below the table, introduced by the word Note.
• The source of the data given in the table should be mentioned.
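Several of these rules, such as aligned columns, rules around the column heads, and a column total, can be sketched in plain text with a small Python helper. The headers and sales figures are invented for illustration:

```python
def format_table(headers, rows, total_label="Total"):
    """Render (label, value) rows as a plain-text table: labels
    left-aligned, numbers right-aligned, rules below the column
    heads and below the last row, followed by a column total."""
    label_w = max(len(headers[0]), len(total_label),
                  *(len(r[0]) for r in rows))
    total = sum(r[1] for r in rows)
    value_w = max(len(headers[1]), len(str(total)),
                  *(len(str(r[1])) for r in rows))
    rule = "-" * (label_w + 2 + value_w)
    lines = [f"{headers[0]:<{label_w}}  {headers[1]:>{value_w}}", rule]
    for label, value in rows:
        lines.append(f"{label:<{label_w}}  {value:>{value_w}}")
    lines.append(rule)
    lines.append(f"{total_label:<{label_w}}  {total:>{value_w}}")
    return "\n".join(lines)

print(format_table(("Region", "Sales"), [("North", 120), ("South", 80)]))
```

Right-aligning the numeric column keeps the digits of the same place value in the same position, which makes the values easy to compare and to total.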
2. Line graphs
A line graph depicts trends or relationships. It shows the relationship between two variables by a line connecting points on the X axis and Y axis. Line graphs usually show trends over time: the line connects the points, and its ups and downs illustrate the changes. Line graphs have conventional parts: a caption that contains the number and title, axis rules, axis labels, and a legend. Some guidelines for creating line graphs are given below:
• The figures should be numbered consecutively using Arabic numerals.
• A brief, clear title should be used to specify the content of the graph.
• The caption can be given either above or below the figure, but a consistent pattern should be followed throughout the report.
• The independent variable is recorded on the X axis and the dependent variable on the Y axis.
• Clear axis labels should be provided.
• If the graph has more than one line, the lines should be made visually distinct and identified with labels or in a legend.
A surface chart, also called an area chart, is a form of line chart with a cumulative effect: all the lines add up to the top line, which represents the total. This form of chart helps to illustrate changes in the composition of something over time. In preparing a surface chart, the most important segment should be placed at the baseline, and the number of strata should be restricted to four or five.
3. Bar graphs
A bar chart depicts numbers by the height or length of its rectangular bars. It makes numbers easy to read and understand. Bar charts are very useful to:
• Compare the sizes of several items at one time.
(The original figure here showed a 100 percent stacked bar chart, with percentages from 0% to 100% on the vertical axis and years on the horizontal axis.)
A bar chart can be created in many ways, depending on the need and creativity of the researcher. However, care should be exercised to see that the widths of all bars are uniform and that the bars are placed evenly in a logical order.
4. Pie charts
A pie chart is used to show the relative sizes of the parts of a whole. It uses segments of a circle to indicate percentages of a total: the whole circle represents 100 percent, and each segment represents an item's percentage of the total. Pie charts are effective ways to show percentages or to compare one segment with another. General guidelines are:
• A pie chart should not be divided into more than five segments, as the reader may have difficulty differentiating the sizes of small segments.
• Segments should be identified with legends or callouts.
• The segments should be arranged in sequence clockwise from largest to smallest.
• Different colors or patterns can be used to distinguish the various pieces.
• If percentages are used, all the segments put together should add up to 100 percent. Percentages can be placed inside the segments.
• A segment that needs greater attention can be exploded, i.e., pulled out from the rest of the segments.
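The percentage and ordering rules above can be checked with a small Python sketch. The brand names and counts are invented for illustration:

```python
def pie_segments(data):
    """Convert raw counts into percentage segments, ordered from
    largest to smallest as recommended for a clockwise layout."""
    total = sum(data.values())
    segments = [(label, round(100 * value / total, 1))
                for label, value in data.items()]
    # Largest segment first, so the slices follow clockwise in
    # decreasing size and the percentages sum to 100.
    return sorted(segments, key=lambda s: s[1], reverse=True)

shares = {"Brand A": 45, "Brand B": 30, "Brand C": 15, "Others": 10}
print(pie_segments(shares))
```

Computing the percentages from the raw counts, rather than entering them by hand, is a simple way to guarantee the segments add up to 100 percent.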
5. Pictograms
A chart that uses symbols instead of words or numbers to portray data is known as a pictogram. It is a novel way of presentation and conveys a more literal visual message. Pictograms enhance a report's value.
6. Flow charts
Flow charts are used to show a time sequence, a decision sequence, or conceptual relationships. Flowcharts are indispensable when illustrating processes, procedures, and sequential relationships. Arrows indicate the direction of the action, and symbols represent steps or particular points in the action. In computer programming, the symbols have special shapes for certain activities.
7. Organization charts
The organization chart illustrates the positions, units, or functions of an organization and the way they interrelate. Organization charts are used to depict the interrelationships among the parts of an organization. An organization's normal communication channels can be explained in detail with the help of organization charts.
8. Decision charts
A decision chart, or decision tree, is a flow chart that shows whether or not to perform a certain action in a certain situation. At each point, the reader must decide yes or no and then follow the appropriate path until the final goal is reached.
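The yes/no traversal described above can be sketched as a nested dictionary in Python. The questions and actions are invented purely for illustration:

```python
# A decision chart as a nested dictionary: each node asks a yes/no
# question, and each leaf (a plain string) is a final action.
decision_chart = {
    "question": "Is the defect covered by warranty?",
    "yes": {"question": "Is a replacement in stock?",
            "yes": "Replace the product",
            "no": "Repair the product"},
    "no": "Charge the customer for repair",
}

def follow_chart(node, answers):
    """Follow a sequence of 'yes'/'no' answers through the chart
    until a final action (a string leaf) is reached."""
    for answer in answers:
        node = node[answer]
        if isinstance(node, str):  # reached a leaf: the final action
            return node
    return node

print(follow_chart(decision_chart, ["yes", "no"]))
```

Each answer selects one branch, exactly as a reader follows one path through a drawn decision chart.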
9. Gantt Charts
A Gantt chart represents the schedule of a project. Units of time are shown along the horizontal axis, and the sub-processes are listed on the vertical axis. The bars indicate the starting and ending points of each sub-process.
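The layout just described can be sketched as a rough text-only Gantt chart in Python. The phase names and week numbers are invented for illustration:

```python
def gantt(tasks, total_weeks):
    """Render a rough text Gantt chart: one row per sub-process,
    with '=' marking the weeks from its start to its end point."""
    name_w = max(len(name) for name, _, _ in tasks)
    lines = []
    for name, start, end in tasks:  # start and end are 1-based weeks
        bar = " " * (start - 1) + "=" * (end - start + 1)
        lines.append(f"{name:<{name_w}} |{bar:<{total_weeks}}|")
    return "\n".join(lines)

phases = [("Refine problem", 1, 2), ("Design questionnaire", 2, 4),
          ("Data collection", 4, 8), ("Report writing", 8, 10)]
print(gantt(phases, 10))
```

Because each bar starts at its phase's first week, overlapping phases are visible at a glance, which is the main point of a Gantt chart.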
10. Maps
Maps are used to represent statistics by geographical area and to show location relationships; for example, maps can show regional differences in the sales of a company. Maps can be illustrated to suit the need, using dots, shading, colored lines, labels, numbers, and symbols. Software such as Excel and CorelDRAW has templates that make the production of maps easier.
11. Photographs
Photographs capture the exact appearance of an object and use visual appeal to catch the reader's attention. Advances in technology, such as digital cameras, have drastically reduced the cost of including photographs, and photographs can be modified to the requirement with the help of software. A photograph duplicates the item being discussed and also shows the relationships among its various parts. Photographs can be used to provide a general introduction that orients the reader towards the object.
12. Drawings and diagrams
Drawings and diagrams are often used to show how something looks or operates. Diagrams can be much clearer than words in explaining to readers a process or the use of an object. A variety of software programs can be used to add a decorative touch to the report. Drawings and diagrams make it possible to eliminate unnecessary details so that readers can focus on the important aspects. Two commonly used drawings are the exploded view and the detailed drawing. An exploded view shows the parts disconnected but arranged in the order in which they fit together; such views are used to show the internal parts of a small, intricate object or to explain how the parts are assembled, and manuals often use exploded drawings with named or numbered parts. Detailed drawings are renditions of particular parts or assemblies.
SUMMARY
The research report is prepared to communicate the research findings. This unit covered the different types of reports, and the importance of audience analysis was explained. The steps involved in the preparation of a report and the integral parts of the report were discussed. The contents of a research proposal were highlighted. In addition, the basic guidelines for using visual aids and the various types of visual aids were dealt with.
Prepare a research proposal for identifying the market potential of a new product launched by your concern.
What types of visual aids can be used for presenting a report on customer satisfaction with a new brand of laptop introduced by your concern in the market?