Semester – III
MIHMCT
UNIT 1
TERMS – RESEARCH
Search back
Search for knowledge
Careful / diligent search, studious enquiry
Critical exhaustive investigation / experimentation
Aimed at discovery / interpretation of facts
Revision of accepted theories/laws in light of new facts
Or practical application of such new theories or laws
Research
"Research is a systematic inquiry to describe, explain, predict, and control the observed phenomenon. Research involves inductive and deductive methods."
Inductive research methods are used to analyze an observed event. Deductive methods are
used to verify the observed event. Inductive approaches are associated with qualitative
research and deductive methods are more commonly associated with quantitative research.
1. A systematic approach must be followed to obtain accurate data. Rules and procedures are an integral part of the process and set the objective. Researchers need to practice ethics and a code of conduct while making observations or drawing conclusions.
2. Research is based on logical reasoning and involves both inductive and deductive
methods.
3. The data or knowledge that is derived is in real time from actual observations in
natural settings.
4. There is an in-depth analysis of all data collected so that there are no anomalies
associated with it.
5. Research creates a path for generating new questions. Existing data helps create more
opportunities for research.
6. Research is analytical in nature. It makes use of all the available data so that there is
no ambiguity in inference.
7. Accuracy is one of the most important aspects of research. The information that is
obtained should be accurate and true to its nature. For example, laboratories provide a
controlled environment to collect data. Accuracy is measured in the instruments used,
the calibrations of instruments or tools, and the final result of the experiment.
Purpose of Research
Types of Research
Descriptive Research
Analytical Research
Applied Research
Fundamental Research
Qualitative Research
Quantitative Research
Conceptual Research
Empirical Research
Qualitative Methods
Qualitative research is a method that collects data using conversational methods. Participants are asked open-ended questions. The responses collected are essentially non-numerical. This method helps a researcher understand not only what participants think but also why they think in a particular way.
Focus Groups: Focus groups are small groups comprising around 6-10 participants who are usually experts in the subject matter. A moderator is assigned to a focus group to facilitate the discussion amongst the group members. The moderator's experience in conducting the focus group plays an important role: an experienced moderator can probe the participants by asking the right questions, which helps collect a sizable amount of information related to the research.
Case Study: Case study research is used to study an organization or an entity. This method is one of the most valuable options for modern research. This type of research is used in fields like the education sector, philosophical studies, and psychological studies. This method involves a deep dive into ongoing research and collecting data.
Quantitative Methods
Quantitative research deals with numbers and measurable forms. It uses a systematic way of investigating events or data. It is used to answer questions in terms of justifying relationships with measurable variables to explain, predict, or control a phenomenon.
There are three methods that are often used by researchers:
Survey Research — The ultimate goal of survey research is to learn about a large
population by deploying a survey. Today, online surveys are popular as they are
convenient and can be sent in an email or made available on the internet. In this
method, a researcher designs a survey with the most relevant survey questions and
distributes the survey. Once the researcher receives responses, they summarize them
to tabulate meaningful findings and data.
Descriptive Research — Descriptive research is a method which identifies the
characteristics of an observed phenomenon and collects more information. This
method is designed to depict the participants in a very systematic and accurate
manner. In simple words, descriptive research is all about describing the phenomenon,
observing it, and drawing conclusions from it.
Correlational Research — Correlational research examines the relationship between two or more variables. Consider a researcher studying the correlation between cancer and married women: suppose married women have a negative correlation with cancer. In this example, there are two variables: cancer and married women. When we say negative correlation, it means women who are married are less likely to develop cancer. However, it doesn't mean that marriage directly prevents cancer.
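As an illustrative sketch of measuring such a relationship numerically, the following computes a Pearson correlation coefficient from first principles. The data here are entirely hypothetical, invented only to show the calculation:

```python
# Illustrative sketch: computing a Pearson correlation coefficient by hand.
# The data below are hypothetical, invented only to show the calculation.
import math

def pearson_r(xs, ys):
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

hours_exercised = [1, 2, 3, 4, 5]
resting_pulse = [80, 76, 74, 71, 69]   # falls as exercise rises
print(pearson_r(hours_exercised, resting_pulse))  # close to -1 (negative correlation)
```

A value near -1 indicates a strong negative correlation, near +1 a strong positive one, and near 0 no linear relationship; as the text stresses, this does not by itself establish causation.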
Generally, an investigator has a hypothesis. The hypothesis may be that the sample mean is less than the population mean, or the mean of one group is greater than that of the other group, or the means of more than two groups are not all the same. These hypotheses may be shown symbolically as follows:
𝐻 ∶ 𝑋̅ < 𝜇
𝐻 ∶ 𝜇1 < 𝜇2
𝐻 ∶ 𝜇1 ≠ 𝜇2 ≠ 𝜇3
Whatever hypothesis an investigator puts forward, its statistical significance is obtained by subjecting the null form of the hypothesis to an appropriate test of significance. The null form of the hypothesis (𝐻0) states that there is no significant difference between the means of two or more groups. The null forms of the above hypotheses are as follows:
𝐻0 ∶ 𝑋̅ = 𝜇
𝐻0 ∶ 𝜇1 = 𝜇2
𝐻0 ∶ 𝜇1 = 𝜇2 = 𝜇3
Verbally, 𝐻0 states that there is no significant difference between the sample mean and the population mean, or between the means of two populations, or among the means of more than two populations.
Any admissible hypothesis that differs from a null hypothesis is called an alternative
hypothesis and is denoted by 𝐻1.
Data Collection:
Quantitative data collection methods rely on random sampling and structured data collection instruments that fit diverse experiences into predetermined response categories. They produce results that are easy to summarize, compare, and generalize.
INTERVIEWS
Sampling Methods
Sampling methods fall into two broad classes: probability sampling and non-probability sampling.
Random sampling
Under this method, every unit of the population at any stage has an equal chance of selection, (or) each unit is drawn with a known probability. This makes it possible to estimate the reliability of the sample results. That is not possible in non-probability sampling, which is used advantageously when there is no sampling frame or when the respondents are expected to be non-cooperative.
Under probability sampling there are two procedures
1. Sampling with replacement(SWR)
2. Sampling without replacement(SWOR)
When the successive draws are made by placing back the units selected in the preceding draws, it is known as sampling with replacement. When such replacement is not made, it is known as sampling without replacement. Sampling with replacement makes a finite population behave effectively like an infinite one; in practice, SWOR is adopted for a finite population.
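The two procedures can be sketched with Python's standard library; the population values here are hypothetical labels used only for illustration:

```python
# Sketch of the two probability-sampling procedures described above,
# using Python's standard library (population values are hypothetical).
import random

population = ["A", "B", "C", "D", "E", "F", "G", "H"]
random.seed(42)  # fixed seed for a reproducible illustration

# Sampling with replacement (SWR): a unit may be drawn more than once.
swr = random.choices(population, k=5)

# Sampling without replacement (SWOR): every drawn unit is distinct.
swor = random.sample(population, k=5)

print(swr)    # duplicates are possible here
print(swor)   # all five units are distinct
```

`random.choices` models SWR (each draw is independent of earlier draws), while `random.sample` models SWOR (each selected unit is removed from further consideration).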
There are many kinds of random sampling; the main ones are:
1. Simple Random Sampling
2. Systematic Random Sampling
3. Stratified Random Sampling
4. Cluster Sampling
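Two of the designs listed above can be sketched briefly on a hypothetical numbered population of 100 units; the stratum labels and sample sizes are invented for illustration:

```python
# A brief sketch of systematic and stratified random sampling on a
# hypothetical numbered population of 100 units.
import random

random.seed(1)
population = list(range(1, 101))

# Systematic random sampling: pick a random start, then every k-th unit.
k = 10                                  # sampling interval (N/n)
start = random.randint(1, k)
systematic = population[start - 1::k]   # 10 units, equally spaced

# Stratified random sampling: split into strata, sample from each stratum.
strata = {"low": population[:50], "high": population[50:]}
stratified = {name: random.sample(units, 5) for name, units in strata.items()}

print(systematic)
print(stratified)
```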
Lab Experiment
A laboratory experiment is an experiment conducted under highly controlled conditions (not
necessarily a laboratory), where accurate measurements are possible. The researcher decides
where the experiment will take place, at what time, with which participants, in what
circumstances and using a standardized procedure. Participants are randomly allocated to
each independent variable group.
Field Experiments
Field experiments are done in the everyday (i.e. real life) environment of the participants. The experimenter still manipulates the independent variable, but in a real-life setting (so cannot really control extraneous variables).
Strength:
Behaviour in a field experiment is more likely to reflect real life because of its natural
setting, i.e. higher ecological validity than a lab experiment. There is less likelihood
of demand characteristics affecting the results, as participants may not know they are
being studied. This occurs when the study is covert.
Limitation:
There is less control over extraneous variables that might bias the results. This makes
it difficult for another researcher to replicate the study in exactly the same way.
Natural Experiments
Natural experiments are conducted in the everyday (i.e. real life) environment of the
participants, but here the experimenter has no control over the independent variable as it
occurs naturally in real life.
Strength:
Behaviour in a natural experiment is more likely to reflect real life because of its
natural setting, i.e. very high ecological validity. There is less likelihood of demand
characteristics affecting the results, as participants may not know they are being
studied.
Limitation:
They may be more expensive and time consuming than lab experiments. There is no
control over extraneous variables that might bias the results. This makes it difficult for
another researcher to replicate the study in exactly the same way.
OBSERVATION METHODS
The observation method is described as a method to observe and describe the behaviour of a subject. As the name suggests, it is a way of collecting relevant information and data by observing. It is also referred to as a participatory study because the researcher has to establish a link with the respondents and, for this, has to immerse himself in the same setting as theirs. Only then can he use the observation method to record and take notes.
Participant observation
Participant observation was first introduced by Prof. Eduard Lindeman. It refers to the activities of a group in which the observer himself participates and notes the situation. He willingly mixes with the group and performs his activities as an observer, not merely as a participator who criticizes the situation. In other words, he takes part in and shares the activities of his group. For example, when we study the rural and urban conditions of Asian people, we have to go there and watch what is going on. The basic philosophy of participant observation is that we watch the phenomena rather than ask about them. The actual behaviour of the group can be observed only by participant observation, not by any other method.
Merits
1. The observer is personally involved in group activities and shares their feelings and prejudices.
2. He participates himself and gets insight into the behaviour of the group.
3. It motivates and stimulates a mutual relationship between the observer and the observed.
4. He can get more information with accuracy and precision.
5. The information is recorded in the presence of the group members.
Demerits
1. The observer may develop an emotional attachment to the group, which will reduce the objectivity of the study.
Non-Participant Observation
In non-participant observation, the observer does not participate in the group's activities. He either watches the phenomena from a distance or joins the group but never takes part in its activities. He only sits in the group but does not involve himself in the process.
The difference between participant and non-participant observation is that in the former the observer himself takes part in the group, becomes a member of that group, and participates fully in its activities, while the latter involves little or no participation by the observer in the group, its membership, or its activities. He watches from a distance and does not have a close view of what is going on in the field of research.
Merits
1. Although the observer never attaches himself to the group, objectivity is maintained.
2. Less emotional involvement of the observer leads to accuracy and greater objectivity.
3. Having only a secondary relationship with the group, the observer can collect information more completely.
4. Through non-participant observation, the research remains very smooth.
Demerits
1. Do not have full knowledge about the group activities.
2. Cannot understand the whole phenomena.
3. Cannot get real and deep insight into the phenomena.
Controlled Observation
Here both the observer and the observed (the subject) are controlled. For systematic data collection, control is imposed on both for accuracy and precision. When observation is pre-planned and definite, it is termed controlled observation. In controlled observation, mechanical devices are used for precision and standardization. Thus, control increases accuracy, reduces bias, and ensures reliability and standardization. Some of the devices used are as under:
1. Observational plan.
2. Observational schedule.
3. Mechanical appliances like cameras, maps, films, video, tape recorders, etc.
4. Team of observers.
5. Sociometric scale.
Scaling Techniques
A scaling technique is a method of placing respondents along a continuum of gradual change in pre-assigned values, symbols or numbers, based on the features of a particular object, as per the defined rules. All the scaling techniques are based on four pillars, i.e., order, description, distance and origin.
Marketing research is highly dependent upon scaling techniques, without which no market analysis can be performed.
The major four scales used in statistics for market research consist of the following:
1. Dichotomous: A nominal scale that has only two labels is called 'dichotomous'; for example, Yes/No.
2. Nominal without Order: A nominal scale which has no sequence is called 'nominal without order'; for example, Black, White.
Ordinal Scale
The ordinal scale functions on the concept of the relative position of objects or labels based on the individual's choice or preference.
For example, at Amazon.in, every product has a customer review section where the buyers
rate the listed product according to their buying experience, product features, quality, usage,
etc.
5 Star – Excellent
4 Star – Good
3 Star – Average
2 Star – Poor
1 Star – Worst
Interval Scale
An interval scale is also called a cardinal scale which is the numerical labelling with the same
difference among the consecutive measurement units. With the help of this scaling technique,
researchers can obtain a better comparison between the objects.
In an interval scale such as a 1-to-5 rating, every unit has the same difference, i.e., 1, whether it is between 2 and 3 or between 4 and 5.
Ratio Scale
One of the most superior measurement techniques is the ratio scale. Similar to an interval scale, a ratio scale is an abstract number system. It allows measurement with proper intervals, order, categorization and distance, with the added property of originating from a fixed zero point. Here, the comparison can be made in terms of the acquired ratio.
For example, a health product manufacturing company conducted a survey to identify the level of obesity in a particular locality. It released the following survey questionnaire:
Select the category to which your weight belongs:
40-59 Kilograms
60-79 Kilograms
80-99 Kilograms
100-119 Kilograms
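Grouping ratio-scale measurements into the survey categories above can be sketched as follows; the respondent weights are hypothetical:

```python
# Sketch of grouping ratio-scale measurements (weights in kilograms)
# into the survey categories above; the weights themselves are invented.
def weight_category(kg):
    if 40 <= kg <= 59:
        return "40-59 Kilograms"
    if 60 <= kg <= 79:
        return "60-79 Kilograms"
    if 80 <= kg <= 99:
        return "80-99 Kilograms"
    if 100 <= kg <= 119:
        return "100-119 Kilograms"
    return "Out of surveyed range"

responses = [55, 72, 101, 68, 84]
for kg in responses:
    print(kg, "->", weight_category(kg))
```

Note that because weight is a ratio-scale variable (it has a true zero), statements like "80 kg is twice 40 kg" are meaningful, even after the answers are grouped into categories.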
Scaling of objects can be used for a comparative study between two or more objects (products, services, brands, events, etc.), or it can be carried out individually to understand the consumer's behaviour and response towards a particular object.
Following are the two categories under which other scaling techniques are placed based on
their comparability:
Comparative Scales
For comparing two or more objects, a comparative scale is used by the respondents. Following are the different types of comparative scaling techniques:
Paired Comparison
A paired comparison symbolizes two variables from which the respondent needs to select one. This technique is mainly used at the time of product testing, to facilitate the consumers with a comparative analysis of the two major products in the market.
To compare more than two objects say comparing P, Q and R, one can first compare P with Q
and then the superior one (i.e., one with a higher percentage) with R.
For example, a market survey was conducted to find out consumers' preference between the network service provider brands A and B. The outcome of the survey was as follows:
Brand 'A' = 57%
Brand 'B' = 43%
Thus, it is visible that the consumers prefer brand 'A' over brand 'B'.
Rank Order Scaling
In rank order scaling, the respondent needs to rank or arrange the given objects according to his or her preference.
For example, a soap manufacturing company conducted a rank order scaling to find out the orderly preference of the consumers. It asked the respondents to rank the following brands in the sequence of their choice.
The above scaling shows that soap 'Y' is the most preferred brand, followed by soap 'X', then soap 'Z', and the least preferred one is soap 'V'.
Constant Sum
It is a scaling technique where a constant sum of units, like dollars, points, chits, or chips, is allocated by the respondents to the features or attributes of a particular product or service according to their importance.
For example, the respondents belonging to 3 different segments were asked to allocate 50 points to the following attributes of a cosmetic product 'P':
Segment 1 considers product 'P' mainly for its competitive price.
But segments 2 and 3 prefer the product because it is skin-friendly.
Q-Sort Scaling
Q-sort scaling is a technique used for sorting the most appropriate objects out of a large number of given variables. It emphasizes ranking the given objects in descending order to form similar piles based on specific attributes.
It is suitable where the number of objects is not less than 60 and not more than 140, the most appropriate range being 60 to 90.
For example, The marketing manager of a garment manufacturing company sorts the most
efficient marketing executives based on their past performance, sales revenue generation,
dedication and growth.
The Q-sort scaling was performed on 60 executives, and the marketing head creates three
piles based on their efficiency as follows:
Non-Comparative Scales
A continuous (graphic) rating scale is a non-comparative scale where the respondents are free to place the object at a position of their choice. It is done by selecting and marking a point along a vertical or horizontal line which ranges between two extreme criteria.
Such a scale can show a non-comparative analysis of one particular product, e.g. comfy bedding, making it very clear whether the customers are satisfied with the product and its features.
The itemized scale is another essential technique under the non-comparative scales. It emphasizes choosing a particular category among the various given categories by the respondents. Each category is briefly defined by the researchers to facilitate such selection.
The three most commonly used itemized rating scales are as follows:
Likert Scale: In the Likert scale, the researcher provides some statements and asks the respondents to mark their level of agreement or disagreement with these statements by selecting one of the five given alternatives.
For example, a shoe manufacturing company adopted the Likert scale technique for its new sports shoe range named Z sports shoes. The purpose is to know the agreement or disagreement of the respondents.
For this, the researcher asked the respondents to circle a number representing the most suitable answer, in the following representation:
1 – Strongly Disagree
2 – Disagree
3 – Neither Agree nor Disagree
4 – Agree
5 – Strongly Agree
From such responses, we can analyze, for instance, that the customers find the product of superior quality but that the brand needs to focus more on styling.
Stapel Scale: A Stapel scale is an itemized rating scale which measures the response, perception or attitude of the respondents for a particular object through a unipolar rating. The range of a Stapel scale is from -5 to +5, excluding 0, thus confining it to 10 units.
For example, a tours and travel company asked the respondents to rank their holiday package in terms of value for money and user-friendly interface as follows:
Introduction
Statistics has originated as a science of statehood and found applications slowly and steadily
in Agriculture, Economics, Commerce, Biology, Medicine, Industry, planning, education and
so on.
Sir Ronald A. Fisher, (1890 – 1962), who is called “the Father of Statistics”, drew many
solid conclusions from statistical data.
STATISTICS – Definition
Statistics is the science which deals with the
1. Collection of data,
2. Organization of data (or) Classification of data,
3. Presentation of data,
4. Analysis of data,
5. Interpretation of data, which are known as the statistical methods.
Limitations of Statistics
1. Statistics does not deal with individual items.
2. Statistics deals with quantitative data only.
3. Statistical laws are true only on averages.
4. Statistical results are only approximately correct.
5. Statistics is liable to be misused.
Functions of Statistics
1. Simplifies complexity
2. Helps to compare.
3. Formulates and tests hypothesis.
4. Studies relationships.
5. Helps the government.
6. Helps in forecasting.
7. Formulation of suitable policies.
PROBABILITY:
The concept of probability is difficult to define in precise terms. In ordinary language, the word probable means likely (or) chance. Generally, the word probability is used to denote the likelihood of occurrence of a certain event, based on past experience. By looking at a clear sky, one will say that there will not be any rain today. On the other hand, by looking at a cloudy or overcast sky, one will say that there will be rain today. In the first sentence we expect no rain, and in the latter we expect rain. A mathematician says that the probability of rain is 0 in the first case and 1 in the second case. In between 0 and 1, there are fractions denoting the chance of the event occurring.
Exhaustive Events:
The total number of possible outcomes in any trial is known as exhaustive events (or)
exhaustive cases.
Example:
1. In tossing of a coin there are two exhaustive cases, namely head and tail.
2. In throwing of a die, there are six exhaustive cases, since anyone of the 6 faces
1, 2, 3, 4, 5, 6 may come uppermost.
Favourable Events:
The number of cases favourable to an event in a trial is the number of outcomes which
entail the happening of the event.
Example:
1. In throwing of two dice, the number of cases favourable to getting the sum 5 is
(1, 4), (2, 3), (3,2), (4, 1) = 4.
2. In drawing a card from a pack of cards the number of cases favourable to drawing
of an ace is 4, for drawing a spade is 13 and for drawing a red card is 26.
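Example 1 above (two dice, sum equal to 5) can be checked by direct enumeration of the exhaustive cases:

```python
# Enumerating the favourable cases in Example 1 above: throwing two dice
# and counting outcomes whose sum is 5.
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))   # 36 exhaustive cases
favourable = [o for o in outcomes if sum(o) == 5]

print(favourable)        # (1, 4), (2, 3), (3, 2), (4, 1)
print(len(favourable))   # 4 favourable cases out of 36
```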
Mutually Exclusive Events:
Events are said to be mutually exclusive (or) incompatible if the happening of any one of the events excludes (or) precludes the happening of all the others, (i.e.) if no two or more of the events can happen simultaneously in the same trial; the joint occurrence is not possible.
Example:
1. In tossing a coin the events head and tail are mutually exclusive.
2. In throwing a die, all the 6 faces numbered 1 to 6 are mutually exclusive since if
any one of these faces comes, the possibility of others, in the same trial, is ruled
out.
Equally Likely Events:
Outcomes of a trial are said to be equally likely if, taking into consideration all the relevant evidence, there is no reason to expect one in preference to the others, (i.e.) two or more events are equally likely if each has the same chance of occurring.
Note:
1. If m = 0, then P(A) = 0 and 'A' is called an impossible event; (i.e.) P(∅) = 0.
2. If m = n, then P(A) = 1 and 'A' is called a sure (or) certain event.
3. The probability is a non-negative real number and cannot exceed unity, (i.e.) it lies between 0 and 1.
4. The probability of non-happening of the event 'A' is P(Ā), denoted by 'q', and P(A) + P(Ā) = 1.
5. If A and B are mutually exclusive (or) disjoint events, then the probability of occurrence of either A (or) B, denoted by P(A∪B), is given by
P(A∪B) = P(A) + P(B)
More generally, P(E1∪E2∪…∪En) = P(E1) + P(E2) + …… + P(En) if E1, E2, …, En are mutually exclusive events.
Conditional Probability:
Two events A and B are said to be dependent when B can occur only when A is known to have occurred (or vice versa). The probability attached to such an event is called the conditional probability and is denoted by P(A/B) (read as: A given B), in other words, the probability of A given that B has occurred.
Writing P(A) = n(A)/n and P(B) = n(B)/n, the addition theorem for two events that are not mutually exclusive is
P(A∪B) = P(A) + P(B) − P(A∩B)
Note:
(i) In the case of 3 events (not mutually exclusive),
P(A or B or C) = P(A∪B∪C)
= P(A) + P(B) + P(C) − P(A∩B) − P(B∩C) − P(A∩C) + P(A∩B∩C)
(ii) In the case of 3 events (mutually exclusive),
P(A or B or C) = P(A∪B∪C) = P(A) + P(B) + P(C)
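The three-event addition rule can be verified by direct enumeration on a small sample space; the events chosen here (on one throw of a die) are hypothetical examples:

```python
# Checking the three-event addition rule by direct enumeration.
# The events below (on one throw of a die) are hypothetical examples.
sample_space = set(range(1, 7))
A = {1, 2, 3}   # "at most 3"
B = {2, 4, 6}   # "even"
C = {5, 6}      # "at least 5"

def P(event):
    return len(event) / len(sample_space)

lhs = P(A | B | C)
rhs = (P(A) + P(B) + P(C)
       - P(A & B) - P(B & C) - P(A & C)
       + P(A & B & C))
print(lhs, rhs)   # both sides agree
```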
Proof:
Let n be the total number of outcomes, and n(A) the number of outcomes in A. Then
P(A∩B) = n(A∩B)/n = [n(A)/n] · [n(A∩B)/n(A)]
P(A∩B) = P(A) · P(B/A) ……(I)
Similarly,
P(A∩B) = n(A∩B)/n = [n(B)/n] · [n(A∩B)/n(B)]
P(A∩B) = P(B) · P(A/B) ……(II)
Note:
(i) In the case of 3 (dependent) events,
P(A∩B∩C) = P(A) · P(B/A) · P(C/A∩B)
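As a quick worked example of the multiplication rule P(A∩B) = P(A)·P(B/A), consider drawing two aces in succession from a standard 52-card pack without replacement:

```python
# Applying the multiplication rule P(A∩B) = P(A) · P(B/A):
# drawing two aces in succession from a 52-card pack without replacement.
from fractions import Fraction

p_first_ace = Fraction(4, 52)             # P(A)
p_second_given_first = Fraction(3, 51)    # P(B/A): one ace already removed
p_both = p_first_ace * p_second_given_first

print(p_both)   # 1/221
```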
A random variable is a variable that assumes numerical values associated with events
of an experiment.
Example
1. Toss a coin 10 times and count the heads; the number of heads obtained is a random variable.
2. Observe 100 babies to be born in a clinic. The number of boys born is a random variable.
3. Select one student from a university and measure the height, recording the height as 'x'. Then 'x' is a random variable; its value may be anywhere from 100 cm to 250 cm.
A discrete random variable is one that can assume only a countable number of values.
A continuous random variable can assume any value in one or more intervals.
In the examples above, the 1st and 2nd are discrete random variables; the height of the students is a continuous random variable.
Probability Distribution
Suppose 'x' is a random variable taking a countable (possibly infinite) number of values x1, x2, …. With each possible outcome 'xi' we associate a number pi = P(X = xi) = p(xi), called the probability of xi. The set {xi, p(xi)} is called the probability distribution (pd) of the random variable 'x'.
The probability distribution can be classified in to two categories
1. Discrete Probability Distribution (or) Probability Mass Function (or) (pmf)
2. Continuous Probability Distribution (or) Probability Density Function (or) (pdf)
The standard theoretical distributions are:
1. Binomial distribution (discrete)
2. Poisson distribution (discrete)
3. Normal distribution (continuous)
Bernoulli distribution
A random variable x that takes two values 0 and 1, with probabilities p and q, i.e., p(x=1) = p and p(x=0) = q, where q = 1 − p, is called a Bernoulli variate and is said to follow a Bernoulli distribution, where p and q are the probabilities of success and failure. It was given by the Swiss mathematician James Bernoulli (1654-1705).
Binomial distribution
Consider the probability of x successes and consequently n−x failures in n independent trials. The x successes in n trials can occur in nCx ways, and the probability for each of these ways is p^x q^(n−x):
P(ss…sff…f) = p(s)p(s)…p(s) · p(f)p(f)…p(f)
= (p·p…p)(q·q…q) = p^x q^(n−x)
Hence P(x) = nCx p^x q^(n−x), x = 0, 1, 2, …, n.
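The binomial probability nCx · p^x · q^(n−x) can be evaluated directly with the standard library; n = 5 trials with p = 0.5 is a hypothetical case chosen for illustration:

```python
# Sketch of the binomial probability P(x) = nCx * p**x * q**(n-x),
# evaluated for a hypothetical case of n = 5 trials with p = 0.5.
from math import comb

def binomial_pmf(x, n, p):
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

n, p = 5, 0.5
for x in range(n + 1):
    print(x, binomial_pmf(x, n, p))
```

Summed over all x from 0 to n, these probabilities total 1, as a probability mass function must.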
Poisson distribution
P(x) = (e^(−λ) λ^x) / x!, x = 0, 1, 2, …, is called the probability mass function of the Poisson distribution, where λ is the average number of occurrences per unit of time; λ = np.
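The Poisson mass function e^(−λ) λ^x / x! can likewise be sketched directly; the rate λ = 2 occurrences per unit time is a hypothetical value:

```python
# Sketch of the Poisson mass function P(x) = exp(-lam) * lam**x / x!,
# with a hypothetical average rate of lam = 2 occurrences per unit time.
from math import exp, factorial

def poisson_pmf(x, lam):
    return exp(-lam) * lam**x / factorial(x)

lam = 2.0
for x in range(6):
    print(x, round(poisson_pmf(x, lam), 4))
```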
Normal distribution
The normal distribution is a continuous probability distribution. It is also known as the error law, the Normal law, the Laplacian law, or the Gaussian distribution. Many of the sampling distributions, like Student's t, the F distribution and the χ2 distribution, are derived from it.
A continuous random variable x is said to follow a normal distribution with parameters µ and σ2 if the density function is given by the probability law
f(x) = (1/(σ√(2π))) e^(−(x−µ)²/2σ²), −∞ < x < ∞
ESTIMATION
The process of generalizing from the sample to the population is known as statistical
inference.
In statistical inference we have
1. Estimation of population parameters and
2. Testing of hypothesis.
The process of obtaining an estimate of the unknown value of a parameter by a statistic is
termed as estimation. There are two types of estimation.
1. Point Estimation.
2. Interval Estimation.
Point Estimation
When a single statistic is used to estimate an unknown parameter θ, it is termed point estimation. The value of the statistic θ̂ is computed from the random sample taken from the population.
The statistic θ̂ used for estimating a parameter θ is called an estimator of θ.
Example
1. The sample mean 𝑋̅ is an estimator of the population mean µ.
2. The sample SD s is an estimator of the population SD σ.
Interval Estimation
Interval estimation involves the determination of an interval within which the population value must lie with a specified degree of confidence, i.e., a 100(1−α)% confidence interval computed from a sample of size n.
If 𝜒² < 𝜒²(1−α/2) or 𝜒² > 𝜒²(α/2), i.e., when the computed value of 𝜒² lies in the rejection region, we reject the null hypothesis; otherwise we fail to reject the null hypothesis.
't' test:
Let 𝑋1, 𝑋2, …, 𝑋𝑛 be a random sample of size 'n' drawn from a normal population with mean 𝜇 and variance 𝜎². Then Student's 't' statistic is given by
𝑡 = (𝑋̅ − 𝜇) / (𝑆/√𝑛)
This follows a Student's t distribution with (n−1) d.f., where
𝑋̅ = ∑𝑋𝑖 / 𝑛 and 𝑆² = (1/(𝑛−1)) ∑(𝑋𝑖 − 𝑋̅)² (unbiased variance)
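The one-sample t statistic above can be computed on a small sample as a sketch; the observations and the hypothesised mean here are invented for illustration:

```python
# Computing the one-sample t statistic t = (xbar - mu0) / (s / sqrt(n))
# on a small hypothetical sample, testing H0: mu = 50.
from math import sqrt

sample = [52, 48, 55, 51, 49, 53]
mu0 = 50                                 # hypothesised population mean
n = len(sample)
xbar = sum(sample) / n
s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)   # unbiased variance
t = (xbar - mu0) / (sqrt(s2) / sqrt(n))

print(round(t, 3))   # compare with the t table at n-1 = 5 d.f.
```

The computed t is then compared with the tabulated critical value at (n−1) degrees of freedom to decide whether to reject H0.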
The t-distribution is used to test:
1. Test for single mean for single sample case.
2. Test for equality of two means for the two-sample case (independent samples and dependent samples: the paired t-test).
3. Test for significance of observed correlation coefficient.
4. Test for significance of observed partial and multiple correlation coefficient.
5. Test for significance of observed regression coefficient.
Properties of t-distribution
(1) The t-distribution ranges from −∞ to ∞ just as does a normal distribution.
(2) The t-distribution like the standard normal distribution is bell shaped and symmetrical
around zero
(3) The shape of the distribution changes as the number of degrees of freedom changes. Therefore, there is a family of t-distributions, one for each number of degrees of freedom; hence the degrees of freedom is a parameter of the t-distribution.
(4) The variance of the t-distribution is always greater than one and is defined only when 𝜈 ≥ 3; it is given by var(t) = 𝜈/(𝜈 − 2).
(5) The t-distribution is more platykurtic (less peaked at the centre and higher in the tails) than the normal distribution.
The t-distribution has a greater dispersion than the standard normal distribution. As n gets larger, the t-distribution approaches the normal distribution. When n is as large as 30, the difference is very small and the t-distribution approaches the normal distribution in shape.
(4) The distribution of √(2𝜒²) has mean √(2𝜈 − 1) and standard deviation 1.
The sum of independent 𝜒² variates is also a 𝜒² variate. Therefore, if 𝜒₁² is a 𝜒² variate with 𝜈1 d.f. and 𝜒₂² is another 𝜒² variate with 𝜈2 d.f., independent of 𝜒₁², then their sum 𝜒₁² + 𝜒₂² is also a 𝜒² variate with 𝜈1 + 𝜈2 d.f. This property is known as the additive property of 𝜒².
Conditions for the Application of 𝝌² Test
The following five basic conditions must be met in order for chi-square analysis to be applied:
(i) The experimental data (sample observations) must be independent of each other.
(ii) The sample data must be drawn at random from the target population.
(iii) The data should be expressed in original units for convenience of comparison, and not in percentage or ratio form.
(iv) The sample should contain at least 50 observations.
(v) There should not be less than five observations in any cell (each data entry is known as a cell). For less than 5 observations, the value of 𝜒² will be overestimated, resulting in too many rejections of the null hypothesis.
Application of 𝝌² Test
The 𝜒² distribution has a large number of applications in statistics, some of which are enumerated below:
1. To test whether the hypothetical value of the population variance is 𝜎² = 𝜎o².
2. To test the goodness of fit.
3. To test the independence of attributes.
4. To test the homogeneity of independent estimates of the population variance.
Remarks
1. The greater the differences between O and E, the greater the value of χ², and vice
versa; if there is no difference between O and E, χ² will be zero.
2. Only frequencies can be used.
3. Percentages, proportions, etc., cannot be used.
4. All observations must be independent and mutually exclusive.
5. The number of observations must be large.
6. A minimum expected frequency of 5 is necessary in each O–E combination.
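As a minimal sketch of the goodness-of-fit application, the statistic Σ(O − E)²/E can be computed directly. The die-roll frequencies and the tabulated critical value below are illustrative assumptions, not data from this text:

```python
# Chi-square goodness of fit: is a die fair? (hypothetical frequencies)
observed = [22, 17, 20, 26, 22, 13]   # 120 rolls in total
expected = [20] * 6                   # a fair die: 120 / 6 rolls per face

# chi-square statistic: sum over cells of (O - E)^2 / E
chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

df = len(observed) - 1      # degrees of freedom = number of cells - 1
critical_5pct = 11.070      # tabulated chi-square value for df = 5, alpha = 0.05

print(round(chi_sq, 2))        # 5.1
print(chi_sq > critical_5pct)  # False -> do not reject the null hypothesis
```

Note that this toy sample also satisfies the conditions above: it has more than 50 observations and every cell has at least five.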
Correlation:
Correlation is the study of the relationship between two or more variables. Whenever we conduct
an experiment we gather information on two or more related variables. When there are two related
variables their joint distribution is known as the bivariate normal distribution, and if there are
more than two variables their joint distribution is known as the multivariate normal
distribution.
In the case of a bivariate (or) multivariate normal distribution, we are interested in discovering
and measuring the magnitude and direction of the relationship between two (or) more variables;
for this we use the tool known as correlation.
Types of Correlation:
There are four important ways of classifying correlation, viz.:
When the variables move in the same direction, the variables are said to be positively
correlated (or) directly correlated, and if they move in opposite directions they are said to
be negatively correlated (or) indirectly (or) inversely correlated.
If the amount of change in one variable tends to bear a constant ratio to the amount of change
in the other variable, then the correlation is said to be linear; otherwise it is non-linear
(curvilinear).
Example: if rainfall is doubled, the production of rice would not necessarily be doubled, so
the relationship between rainfall and rice production is non-linear.
When both the variables are not normal, the linear correlation coefficient procedure is not
applicable and we have to use rank correlation. The two methods of computing rank
correlation are the one proposed by Spearman and the other by Kendall. Spearman's rank
correlation procedure starts with ranking the measurements of the values of X and Y
separately.
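A minimal sketch of Spearman's procedure with hypothetical scores; there are no tied values, so the shortcut formula ρ = 1 − 6Σd²/(n(n² − 1)) applies directly:

```python
# Spearman's rank correlation: rank X and Y separately, then compare ranks.
x = [86, 68, 72, 91, 55]   # hypothetical scores from judge 1
y = [84, 71, 60, 89, 58]   # hypothetical scores from judge 2

def ranks(values):
    # rank 1 = smallest value; this simple helper assumes no ties
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

rx, ry = ranks(x), ranks(y)
n = len(x)
d_sq = sum((a - b) ** 2 for a, b in zip(rx, ry))   # sum of squared rank differences

rho = 1 - 6 * d_sq / (n * (n ** 2 - 1))
print(rho)   # 0.9 -> strong positive rank correlation
```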
Scatter Diagram:
To investigate whether there is any relation between the variables X and Y we use a scatter
diagram. Let (x1, y1), (x2, y2), …, (xn, yn) be 'n' pairs of observations. If the variables X
and Y are plotted along the X axis and Y axis respectively in the X-Y plane of a graph sheet,
the resultant diagram of dots is known as a scatter diagram. From the scatter diagram we can
say whether there is any correlation between X and Y, whether it is positive (or) negative,
and whether the correlation is linear (or) curvilinear.
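The idea can be sketched even as a text grid, with each (x, y) pair becoming one dot in the X-Y plane; the data pairs below are hypothetical, and in practice a plotting tool would be used:

```python
# Text-mode scatter diagram: each (x, y) pair becomes a '*' in the X-Y plane.
pairs = [(1, 2), (2, 3), (3, 3), (4, 5), (5, 6)]   # hypothetical observations

max_x = max(x for x, _ in pairs)
max_y = max(y for _, y in pairs)

rows = []
for y in range(max_y, 0, -1):   # highest y value printed first
    rows.append("".join("*" if (x, y) in pairs else "." for x in range(1, max_x + 1)))

for row in rows:
    print(row)
# the dots rise to the right: a positive, roughly linear correlation
```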
Karl Pearson's correlation coefficient assumes that:
1. The variables under study are continuous random variables and they are normally
distributed.
The index of the degree of relationship between two continuous variables is
known as the correlation coefficient. The correlation coefficient is symbolized as 'r' in the
case of a sample and as 'ρ' in the case of a population. The correlation coefficient 'r' is known
as Karl Pearson's correlation coefficient. It is often referred to as the product moment correlation.
Properties:
The correlation coefficient ranges between –1 and +1, i.e. −1 ≤ r ≤ +1.
The correlation coefficient is not affected by a change of origin (or) scale (or) both.
If r > 0 it denotes positive correlation, and if r < 0 it denotes negative correlation
between the two variables X and Y. If r = 0, the two variables are not linearly
correlated. If X and Y are independent, then r = 0, i.e. no correlation. If r = +1 the
correlation is perfect positive, and if r = –1 the correlation is perfect negative.
The correlation coefficient between X and Y is the same as that between Y and X, i.e.
rxy = ryx = r (r is symmetric).
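A minimal computation of r from its definition, r = Σ(x − x̄)(y − ȳ) / √(Σ(x − x̄)² Σ(y − ȳ)²), on hypothetical data:

```python
from math import sqrt

x = [1, 2, 3, 4, 5]      # hypothetical paired observations
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n

sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))  # sum of co-deviations
sxx = sum((a - mean_x) ** 2 for a in x)
syy = sum((b - mean_y) ** 2 for b in y)

r = sxy / sqrt(sxx * syy)   # Karl Pearson's product-moment correlation
print(round(r, 4))          # 0.7746 -> fairly strong positive correlation
```

Swapping x and y leaves r unchanged, which is the symmetry property rxy = ryx noted above.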
Multiple Correlation:
The terms multiple correlation and partial correlation refer to the theory of correlation
involving more than two variables. Multiple correlation is used to find the degree of
relationship among three or more variables. Let 'x1' be the dependent variable and x2, x3 be
the independent variables.
Regression Analysis:
In the correlation section we saw how to measure the relationship between two related
variables. Correlation analysis serves as a technique to estimate the degree of association
between two random variables. But in many situations one may be interested in predicting
the value, or the expected value, of one variable when the value of the other variable is
known. In such cases we use the principle of regression.
In simple regression analysis only two variables are considered, where one may
represent cause and the other may represent effect. The variable representing cause is known
as the independent variable and is denoted by 'X'. The variable 'X' is also known as the
predictor variable (or) regressor. The variable representing effect is known as the dependent
variable and is denoted by 'Y'. 'Y' is also known as the predicted variable or response.
Example:
If we know the past history of the age and height of plants, we may be interested in
predicting the height of a plant of a given age. In this example, height is the dependent
variable and age is the independent variable.
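The plant example can be sketched as a least-squares fit of Y on X; the numbers are invented for illustration, and the slope and intercept come from the usual formulas b = Σ(x − x̄)(y − ȳ)/Σ(x − x̄)² and a = ȳ − b·x̄:

```python
age    = [1, 2, 3, 4, 5]    # X: independent / predictor variable (hypothetical weeks)
height = [3, 5, 6, 8, 9]    # Y: dependent / response variable (hypothetical cm)

n = len(age)
mx, my = sum(age) / n, sum(height) / n

# least-squares slope and intercept of the line Y = a + b * X
b = sum((x - mx) * (y - my) for x, y in zip(age, height)) / \
    sum((x - mx) ** 2 for x in age)
a = my - b * mx

def predict(x):
    return a + b * x

print(round(b, 2), round(a, 2))   # 1.5 1.7
print(round(predict(6), 2))       # 10.7 -> predicted height at age 6
```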
Uses of Regression:
Regression analysis is useful in predicting the value of one variable from a given
value of another variable. Such predictions are useful when it is very difficult or expensive to
measure the dependent variable, Y. The other use of regression analysis is to find out the
causal relationship between variables. Suppose we manipulate the variable X and obtain a
significant regression of the variable Y on the variable X. Thus we can say that there is a
causal relationship between X and Y.
Variance and the F-Distribution:
Variance is the square of the standard deviation. For us humans, standard deviations are
easier to understand than variances because they're in the same units as the data rather than
squared units. However, many analyses actually use variances in the calculations.
F-statistics are based on the ratio of mean squares. The term "mean squares" may sound
confusing, but it is simply an estimate of population variance that accounts for the degrees of
freedom (DF) used to calculate that estimate.
A more important use of the F-distribution is in analyzing variance to see if three or more
samples come from populations with equal means. This is an important statistical test, not so
much because it is frequently used, but because it is a bridge between univariate statistics and
multivariate statistics and because the strategy it uses is one that is used in many multivariate
tests and procedures.
This is also the beginning of multivariate statistics. Notice that in the one-way ANOVA, each
observation is for two variables: the x variable and the group of which the observation is a
part. In later chapters, observations will have two, three, or more variables.
The F-test for equality of variances is sometimes used before using the t-test for equality of
means because the t-test, at least in the form presented in this text, requires that the samples
come from populations with equal variances.
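The one-way ANOVA described above can be sketched on three hypothetical samples, building F as the ratio of the between-group mean square to the within-group mean square:

```python
groups = [
    [4, 5, 6],      # three hypothetical samples; H0: all population means are equal
    [6, 7, 8],
    [9, 10, 11],
]

k = len(groups)                        # number of groups
n = sum(len(g) for g in groups)        # total number of observations
grand_mean = sum(sum(g) for g in groups) / n

# between-group and within-group sums of squares
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)

ms_between = ss_between / (k - 1)      # mean square = SS / its degrees of freedom
ms_within = ss_within / (n - k)

F = ms_between / ms_within
print(round(F, 2))    # 19.0 -> compare against the F table with (2, 6) d.f.
```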
MANOVA is used under the same circumstances as ANOVA but when there are multiple
dependent variables as well as independent variables within the model which the researcher
wishes to test. MANOVA is also considered a valid alternative to the repeated measures
ANOVA when sphericity is violated.
There need to be more participants than dependent variables. If there were only one
participant in any one combination of conditions, it would be impossible to determine
the amount of variance within that combination (since only one data point would be
available). Furthermore, the statistical power of any test is limited by a small sample size: a
greater amount of variance will be attributed to error in smaller samples, reducing the
power of the test.
Cross Tabulation
Contingency Table: When individuals in the same problem have two characters and a frequency
distribution is to be made by classifying them on the basis of both characters so as to
show the relation between the characters, the resulting table is called a contingency table;
for example, the height and weight of plants.
Cross tabulation is usually performed on categorical data — data that can be divided into
mutually exclusive groups.
An example of categorical data is the region of sales for a product. Typically, region can be
divided into categories such as geographic area (North, South, Northeast, West, etc.) or state
(Andhra Pradesh, Rajasthan, Bihar, etc.). The important thing to remember about categorical
data is that a categorical data point cannot belong to more than one category.
Cross tabulations are used to examine relationships within data that may not be readily
apparent. Cross tabulation is especially useful for studying market research or survey
responses. Cross tabulation of categorical data can be done through tools such as SPSS,
SAS, and Microsoft Excel.
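A minimal cross tabulation can also be built with the standard library alone; the survey responses below are hypothetical, and tools such as SPSS or SAS automate the same counting:

```python
from collections import Counter

# hypothetical survey responses: (region of sale, preferred product)
responses = [
    ("North", "A"), ("North", "B"), ("South", "A"),
    ("South", "A"), ("North", "A"), ("South", "B"),
]

table = Counter(responses)                   # cell counts of the contingency table
regions = sorted({r for r, _ in responses})
products = sorted({p for _, p in responses})

# print the table: one row per region, one column per product
print("       " + "  ".join(products))
for region in regions:
    counts = [table[(region, p)] for p in products]
    print(f"{region:6s} " + "  ".join(str(c) for c in counts))
```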
Discriminant analysis
Discriminant analysis is a technique that is used by the researcher to analyze the research data
when the criterion or the dependent variable is categorical and the predictor or the
independent variable is interval in nature. The term categorical variable means that the
dependent variable is divided into a number of categories. For example, three brands of
computers, Computer A, Computer B and Computer C can be the categorical dependent
variable. The objective of discriminant analysis is to develop discriminant functions that are
nothing but the linear combination of independent variables that will discriminate between
the categories of the dependent variable in a perfect manner. It enables the researcher to
examine whether significant differences exist among the groups, in terms of the predictor
variables. It also evaluates the accuracy of the classification. Discriminant analysis is
characterized by the number of categories possessed by the dependent variable. It can be
understood as a statistical method that analyses whether the classification of the data is
adequate with respect to the research data; researchers implement it when the dependent
variable is categorical and the predictor variables are interval in nature.
Cluster analysis
Cluster analysis or clustering is the task of grouping a set of objects in such a way that objects
in the same group (called a cluster) are more similar (in some sense) to each other than to
those in other groups (clusters). In statistics, cluster analysis is a set of tools and algorithms
used to classify different objects into groups in such a way that the similarity between two
objects is maximal if they belong to the same group and minimal otherwise.
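As a sketch of the idea, the classic k-means procedure alternates two steps: assign each object to its nearest cluster centre, then move each centre to the mean of its cluster. The one-dimensional data and starting centres below are invented for illustration:

```python
points = [1.0, 1.5, 2.0, 8.0, 8.5, 9.0]   # hypothetical measurements
centres = [1.0, 9.0]                       # hand-picked initial cluster centres

for _ in range(10):                        # a few iterations converge here
    # step 1: assign each point to the nearest centre
    clusters = [[] for _ in centres]
    for p in points:
        nearest = min(range(len(centres)), key=lambda i: abs(p - centres[i]))
        clusters[nearest].append(p)
    # step 2: move each centre to the mean of its cluster
    # (this sketch assumes no cluster ever becomes empty)
    centres = [sum(c) / len(c) for c in clusters]

print(centres)    # [1.5, 8.5] -> two well-separated clusters
```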
Factor analysis
Factor analysis is a technique that is used to reduce a large number of variables into a smaller
number of factors. This technique extracts the maximum common variance from all variables
and puts it into a common score. As an index of all the variables, we can use this score for
further analysis. Factor analysis is part of the general linear model (GLM) and rests on
several assumptions: there is a linear relationship, there is no multicollinearity, the analysis
includes the relevant variables, and there is a true correlation between the variables and the
factors. Several methods are available, but principal component analysis is used most
commonly.
Exploratory factor analysis: Assumes that any indicator or variable may be associated with
any factor. This is the most common form of factor analysis used by researchers, and it is not
based on any prior theory.
Confirmatory factor analysis (CFA): Used to determine the factors and factor loadings of
measured variables, and to confirm what is expected on the basis of pre-established theory.
CFA assumes that each factor is associated with a specified subset of measured variables. It
commonly uses two approaches:
Conjoint analysis
Conjoint analysis is a popular method of product and pricing research that uncovers
consumers' preferences and uses that information to help select product features, assess
sensitivity to price, forecast market shares, and predict adoption of new products or services.
Conjoint analysis is frequently used across different industries for all types of products, such
as consumer goods, electrical goods, life insurance plans, retirement housing, luxury goods,
and air travel. It is applicable in various instances that centre around discovering what type of
product consumers are likely to buy and what consumers value the most (and least) about a
product. As such, it is commonplace in marketing, advertising, and product management.
Businesses of all sizes can benefit from conjoint analysis, including even local grocery stores
and restaurants, and its scope is not limited to profit motives; for example, charities
can use conjoint analysis techniques to find out donor preferences.
Conjoint analysis works by breaking a product or service down into its components (referred
to as attributes and levels) and then testing different combinations of these components
to identify consumer preferences. For example, consider a conjoint study on smartphones.
The smartphone is sorted into four attributes, which are further broken down into different
variations to create levels.
Conjoint analysis can take various forms. Some of the most common include:
Choice-Based Conjoint (CBC) Analysis: This is one of the most common forms of
conjoint analysis and is used to identify how a respondent values combinations of
features.
Full-Profile Conjoint Analysis: This form of analysis presents the respondent with a
series of full product descriptions and asks them to select the one they'd be most
inclined to buy.
MaxDiff Conjoint Analysis: This form of analysis presents multiple options to the
respondent, which they're asked to organize on a scale of "best" to "worst" (or "most
likely to buy" to "least likely to buy").