
# Quantitative Methods

This document is authorized for internal use only at IBS campuses - Batch of 2012-2014 - Semester I. No part of this publication may be reproduced, stored in a retrieval system, used in a spreadsheet, or transmitted in any form or by any means - electronic, mechanical, photocopying or otherwise - without prior permission in writing from IBS Hyderabad.
Chapter 1: Introduction
In this chapter we will discuss:
- Introduction to Statistics
- Data, Measurement and Scales
- Case Study: College Canteen’s Decreasing Sales: Analysis Dilemmas
Section 1
Introduction to Statistics
What is Statistics?
Let us look at the following facts:
- India’s GDP grew at 6.9% during 2011-12.
- India’s exports during the financial year 2011-12 amounted to \$300 billion.
- The BSE Sensex was 17094.51 points at the closure of the market on 13th April 2012.
- Tata Motors reported a “profit after tax” of Rs.25.71 billion for the financial year 2009-10.
- Total irrigated land in Andhra Pradesh is 4.4 million hectares in the year 2012.
In all the above examples, reference is made to some kind
of data. “Statistics”, in common parlance, is understood as
data relating to some aspects of an individual or item or
unit. The individuals could be people, companies or
economies while the data could pertain to a certain time
period.
In Italian “stato” means state and “statista” refers to the
person involved with the administration of state. Born out
of a combination of these two words, “statistics” originally
meant collection of facts useful to the state. Records of
land, population, etc., have been maintained for long for
official purposes by the governments/rulers across the
globe. However, the formal term was introduced only in
the 18th century.
The modern meaning of statistics is somewhat different
from the above meaning (though the word is very much
used in the sense of data even today). Clearly, in the past
also, the interest in various records was with a view to use
them for better future predictions and planning. Today, the
discipline of statistics is about transforming data into
useful information for decision makers. Thanks to the
development of mathematical tools and powerful
computers, statistics has emerged as an even
stronger discipline in its own right. One may also define
statistics as the study of uncertainty.
In general, statistics can be broadly divided as descriptive
statistics and inferential statistics. Descriptive statistics
deals with collection of data related to a characteristic or a
few characteristics and its application in profiling the
individuals or units, whom the data pertains to. For
instance, if income data is collected on a sample of
individuals in a city, the data may be summarized in the
form of tables and graphs to understand the income status
of the sampled residents in the city better. However, if we
wish to estimate the average income of the residents of the
city, we need to get into the art of inferential statistics, i.e.,
the statistical tool that enables us to generalize beyond the
sample. These generalizations are made with a probability
attached to them.
Why Statistics?
“Converting raw data to useful information for decision-
making” is the essence of statistics. This skill is essential for
any business manager who is under constant pressure to
make decisions, often with incomplete and imperfect
information. A probabilistic guidance for decision-making is
superior to intuition and hunches, as it gives measurable
indication of the uncertainty. Thus knowledge of statistics will
help managers in making informed decisions.
A manager is often required to handle the following:
- Summarizing the data he or she handles in the work situation.
- Playing a leadership role in a statistical study, either by conducting it or by liaising with consultants.
Such responsibilities would call for an understanding of basic
statistical concepts.
Managerial Applications of Statistics
In today’s globalized, computerized and Internet-enabled
world, there is an abundance of data, whether at the micro-
level, macro-level or at the organizational levels. The
challenge is to convert this data into useful information which
can be used by the organization. Several application areas
are listed below:
- With the availability of point-of-purchase data obtained through electronic scanners at supermarkets, marketing managers can derive valuable information useful for future planning, product positioning and marketing.
- Quality control in production processes using statistical control charts is another well-known application.
- Comparing the movement of individual stocks with the stock market averages is another important statistical application in financial analysis.
- Auditing and tax authorities often use a sampling approach to verify accounts and, based on the accuracy of the sample, draw conclusions about the entire lot.
- Economic forecasts are often obtained through the application of statistical tools to past data under certain assumptions and conditions.
Thus statistics is a useful tool in business and economic
analysis.
Video 1.1.1: Should Managers study
Statistics?
Data and Measurement
The term “data” refers to the information collected on the
characteristic of interest on an individual or item. The
characteristic on which data is collected is termed as a
variable. The data can be quantitative or qualitative
(categorical). Consider a financial analyst collecting
closing equity prices data of all FMCG companies in India
as of 30th April 2012. This would be an example of
quantitative data; closing equity price being the variable.
In contrast, consider a market researcher (in the US) who believes
that ethnic background influences the purchase behavior of an
FMCG product and hence records data on ethnic background in
five categories (White, Black, Asian, Hispanic and others), along
with data on the amount spent on the products. In this case, the
variable “ethnic background” is qualitative in nature, whereas the
variable “amount spent” is quantitative.
Scales of Measurement
The point to note above is that we had variables and we
had a way of measuring them. Clearly, the way of
measurement should be precisely defined for each
characteristic. In general, the data on any characteristic is
collected using one of the appropriate scales of
measurement from the following: Nominal, Ordinal,
Interval and Ratio.
Nominal scale: Observations are labeled so that they fall
into different categories such as color of the eyes, social
group/occupation, housing type, gender and so on. Any
number used in a nominal scale is a category label only
and no mathematical operation can be performed on it
because its assignment to the category is arbitrary. Consider,
for example, a list of the names of students in a class:
1. Anita 6. Mallika
4. Kanti 9. Srisha
5. Krishna 10. Tarun and so on.
This list represents only names and therefore has none of
the three qualities (magnitude, equal interval or absolute
zero). The numbers next to the names are used for
convenience only and are used simply to label groups or
classes. For example, when you are filling in a form you may be
asked to indicate your gender by entering 1 if male or 2 if
female. Or you may be asked to indicate the color of your eyes
by entering 1 if blue, 2 if green, 3 if brown. The only
permissible mathematical operation for this kind of nominal data
is counting. Ethnicity and gender are examples of variables that
would be measured on a nominal scale, and the numbers
assigned to the different categories are arbitrary.

Table 1.2.1

| Scale of Measurement | Scale qualities | Measurement principles | Examples | Permissible operations |
| --- | --- | --- | --- | --- |
| Nominal | None | People or objects with the same scale value are the same on some attribute. The values of the scale have no ‘numeric’ meaning in the way that you usually think of numbers. | Names, Lists of words, Gender, Ethnicity, Marital Status | Counting |
| Ordinal | Magnitude | People or objects with a higher scale value have more of some attribute. The intervals between adjacent scale values are indeterminate. Scale assignment is by the property of “greater than,” “equal to,” or “less than”. | Anything rank ordered | Greater than or less than operations |
| Interval | Magnitude, equal intervals | Intervals between adjacent scale values are equal with respect to the attribute being measured. | Temperature, most personality measures, WAIS intelligence score | Addition and subtraction of scale values |
| Ratio | Magnitude, equal intervals, absolute zero | There is a rational zero point for the scale. Ratios are equivalent, e.g. the ratio of 2 to 1 is the same as the ratio of 8 to 4. | Age, Height, Weight, Percentage, etc. | Multiplication and division of scale values |

Ordinal scale: The categories that make up this scale are
ranked in terms of magnitude. Observations or any set of data
are put into categories, which can be ranked in some order such
as from greatest to lowest. For example, wealthy, middle-class,
and poor neighborhoods; expensive, moderate, or cheap
restaurants or a product ranked by the customers as best=1,
second best as 2 and so on. The rankings do not tell us how
much is the difference between the wealthy and middle-class,
as there is no absolute zero and no equal intervals in this scale.
No precise value can be assigned to a difference between
ranks. (When does "wealthy" become "middle-class", etc).
Interval scale: The third type of scale is called an interval scale.
It possesses both magnitude and equal intervals, but no
absolute zero. For example, the difference between 1 and 2 is
the same as the difference between 99 and 100. In the interval
scale, the categories have a meaningful unit of distance
separating them. A classic example of an interval scale is
temperature, because we know that each degree is the same
distance apart and we can easily tell if one temperature is
greater than, equal to, or less than another, but we cannot
"really" say 20°C is twice as hot as 10°C, as the Celsius scale
has no absolute zero, i.e., if the thermometer records that the
temperature outdoors is zero, it does not mean that there is no
temperature!
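The point about ratios on an interval scale can be checked with a short sketch. It is only an illustration: the conversion helpers below are standard formulas, not something given in this text, and the Kelvin scale is brought in as an example of a temperature scale that does have an absolute zero.

```python
# Illustrative sketch: differences survive a change of temperature unit,
# but ratios do not, because 0 on the Celsius scale is not "no temperature".

def celsius_to_kelvin(c):
    """Kelvin has an absolute zero, so it behaves as a ratio scale."""
    return c + 273.15

def celsius_to_fahrenheit(c):
    """Fahrenheit is another interval scale with an arbitrary zero."""
    return c * 9 / 5 + 32

warm, cool = 20.0, 10.0

# Differences are meaningful on an interval scale.
diff_c = warm - cool
diff_k = celsius_to_kelvin(warm) - celsius_to_kelvin(cool)

# Ratios depend on where the arbitrary zero sits.
ratio_c = warm / cool
ratio_f = celsius_to_fahrenheit(warm) / celsius_to_fahrenheit(cool)
ratio_k = celsius_to_kelvin(warm) / celsius_to_kelvin(cool)

print(f"Difference: {diff_c:.2f} Celsius degrees, {diff_k:.2f} kelvins")
print(f"Ratio in Celsius:    {ratio_c:.3f}")
print(f"Ratio in Fahrenheit: {ratio_f:.3f}")
print(f"Ratio in Kelvin:     {ratio_k:.3f}")
```

The "twice as hot" ratio of 2 appears only in Celsius; the same two temperatures give a different ratio in Fahrenheit and a ratio close to 1 in Kelvin, while the 10-degree difference is preserved between Celsius and Kelvin.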
Ratio scale: The fourth scale of measurement is the ratio scale.
A ratio scale contains all the three qualities, magnitude, equal
interval and absolute zero. Statisticians often prefer this scale
because the data can be more easily analyzed. Height, weight,
age and percentage of people who pass can be measured on a
ratio scale. For example, if you are 20 years old, you not only
know that you are older than your sister who is 15 years old
(magnitude), but you also know that you are five years older
(equal intervals) than her. A ratio scale also has a point where
none of the scale exists; i.e., when a person is born his or her
age is zero.
The scales of measurement, the scale qualities, measurement
principles, their examples and permissible operations are given
in a tabular form in table 1.2.1 for easy understanding and
meaningful comparison.
Equipped with an understanding of the different types of data,
we now proceed to the next major objective of statistical
method, that is, to organize and summarize the gathered
quantitative data in order to understand it better. The first step
in organizing data is to tabulate the scores into a frequency
distribution. In this chapter we will be focusing our attention to
the statistical concepts of frequency distribution, computation of
the mean, median, and mode, variance and standard deviation,
and then move on to understanding correlation.
Scaling is useful in a number of ways. It improves objectivity.
The matter under study can be expressed accurately. Even
small variations can be known with accuracy. Scaling makes
the matter concise. A lot of material is expressed with brief
and to-the-point numbers. Scaling facilitates standardization.
The findings can be replicated elsewhere. Precision facilitates
comparison provided the scale possesses the required
qualities.
Keynote 1.2.1: Scales and measurements
Section 3
Case Study: College Canteen’s Decreasing Sales: Analysis Dilemmas
This case study was written by Thalluri Prashnath Vidya Sagar, under the direction of R Muthukumar IBSCDC. It is intended to be
used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situation. The
case was prepared from generalized experiences.
One fine morning, Raghu, the owner-manager of the canteen, was
[...] of fast food items and beverages. One of his friends, Ramesh,
who was also a supplier, came to meet him to discuss the pending
payment and further supplies. As the canteen had not been doing
well over the past few months, Raghu wanted to identify where he
was going wrong. His friend suggested that he conduct a survey about
the sales of beverages. So he randomly selected a sample of
60 students comprising 38 male and 22 female students. The
students were asked to fill in a comment/feedback form. Raghu
believed that this survey would help the team to better understand
its customers’ needs and serve them better.
He decided to use some statistical measures to assess the
following information obtained:
- Name, age, gender and phone number
- Impressions of the service offered by canteen employees
- Preference of beverages
- Amount spent on beverages
After he collected the data through feedback forms, he computed
simple statistical measures for analyzing the data. First, he
divided the entire sample into two broad categories based on
gender, i.e. male and female, and assigned the number 1 to male
and 2 to female.
To find out the actual interests of the students with respect to the
beverages and brewed beverages, the students were asked to
rank the four beverages based on their preferences. They had to
rank their strongest preference as ‘1’ and their lowest preference
as ‘4’. After tabulating the data, he presented the results in the
form of tables (Exhibits II (a) and II (b)).
Raghu analyzed his percentage of profits against the sales of the
beverages, including Pepsi, Coke, Coffee and Tea. He also observed
that most of the students prefer the cold beverages, particularly
Pepsi. He has tabulated his observations (Exhibit III).
Exhibit I: Coding of Broad Categories of Students (bar chart of the number of students in each category: Male (1), Female (2))
He came to understand that most of the students like Pepsi more
than any other beverage. He also wanted to assess the service
quality of his staff, believing that this would help him to improve
the quality of service. Respondents were asked to state
their degree of agreement or disagreement with a statement by
selecting a response from a list such as the following one:
1. Agree very strongly, 2. Agree fairly strongly, 3. Agree,
4. Undecided, 5. Disagree, 6. Disagree fairly strongly, and
7. Disagree very strongly (Exhibits IV (a) and IV (b)).
With all his observations, the canteen manager wants to implement
certain measures later on to increase sales by improving his
product mix and marketing mix, so as to get maximum
profit without investing in new ventures.
What is the significance of the given data in statistics? In
what way will the data help him in his analysis?
Exhibit II (a)
Ranking of Preferences of Beverages by Students

| Beverage | Student 1 | Student 2 | Student 3 | Student 4 | Student 5 | Student 6 | Student 7 | ... |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Pepsi | 1 | 3 | 1 | 3 | 1 | 1 | 1 | ... |
| Coke | 2 | 4 | 3 | 2 | 1 | 4 | 3 | ... |
| Coffee | 4 | 2 | 4 | 1 | 3 | 3 | 4 | ... |
| Tea | 3 | 1 | 2 | 4 | 2 | 2 | 2 | ... |

Prepared by the author
Exhibit II (b)
Students’ First Preferences of the Beverages

| Rank | Beverages | Frequency | % |
| --- | --- | --- | --- |
| 1 | Pepsi | 18 | 30.0 |
| 4 | Coke | 12 | 20.0 |
| 2 | Coffee | 15 | 25.0 |
| 2 | Tea | 15 | 25.0 |
| | Total | 60 | 100.0 |

Prepared by author
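The percentage column of Exhibit II (b) follows directly from the frequencies. The short sketch below recomputes it; the variable names are chosen here for illustration and are not part of the case study.

```python
# Frequencies of first preferences as reported in Exhibit II (b).
first_preference = {"Pepsi": 18, "Coke": 12, "Coffee": 15, "Tea": 15}

# Counting is the operation that beverage names (nominal data) support.
total = sum(first_preference.values())

# Relative frequency of each beverage, expressed as a percentage.
percentages = {
    beverage: round(100 * count / total, 1)
    for beverage, count in first_preference.items()
}

# The category with the highest count (the modal category).
most_preferred = max(first_preference, key=first_preference.get)

for beverage, count in first_preference.items():
    print(f"{beverage:<7}{count:>4}{percentages[beverage]:>7.1f}")
print(f"{'Total':<7}{total:>4}{sum(percentages.values()):>7.1f}")
print("Most frequently ranked first:", most_preferred)
```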
Exhibit III
Students’ First Preferences of the Beverages

| Beverages/Sales | Gender: Male (1) | Gender: Female (2) | % Profit Margin |
| --- | --- | --- | --- |
| Pepsi | 10 | 8 | 18 |
| Coke | 8 | 4 | 12 |
| Coffee | 9 | 6 | 15 |
| Tea | 11 | 4 | 15 |
| Total | 38 | 22 | |

Prepared by author
Exhibit IV (b)
Students’ Preferences of the Quality of Service

| Assigned Codes for Quality of Service | Frequency | % |
| --- | --- | --- |
| 1. Agree very strongly | 10 | 16.7 |
| 2. Agree fairly strongly | 15 | 25.0 |
| 3. Agree | 17 | 28.3 |
| 4. Undecided | 15 | 25.0 |
| 5. Disagree | 3 | 5.0 |
| 6. Disagree fairly strongly | 0 | 0 |
| 7. Disagree very strongly | 0 | 0 |
| Total | 60 | 100 |

Prepared by author
Exhibit IV (a)
Students’ Response Towards Quality of Service

| | Student 1 | Student 2 | Student 3 | Student 4 | Student 5 | Student 6 | Student 7 | Student 8 | Student 9 | Student 10 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Quality of service | 1 | 3 | 1 | 5 | 2 | 3 | 2 | 4 | 3 | 3 |

Prepared by author
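The agreement codes in Exhibit IV (a) are ordinal: they can be counted and ranked, but the distance between two codes is not defined. As a minimal sketch, assuming only the ten responses shown in the exhibit (not the full sample of 60), the summaries that suit ordinal data can be computed as follows.

```python
from collections import Counter
from statistics import median_low, mode

# Responses of the ten students shown in Exhibit IV (a); 1 = agree very strongly.
responses = [1, 3, 1, 5, 2, 3, 2, 4, 3, 3]

labels = {
    1: "Agree very strongly",
    2: "Agree fairly strongly",
    3: "Agree",
    4: "Undecided",
    5: "Disagree",
    6: "Disagree fairly strongly",
    7: "Disagree very strongly",
}

# Counting responses per code works for nominal and ordinal data alike.
counts = Counter(responses)

# Because the codes are ordered, a middle response and a most common
# response are meaningful; median_low always returns an observed code.
middle = median_low(responses)
most_common = mode(responses)

for code in sorted(labels):
    print(f"{code}. {labels[code]:<25}{counts.get(code, 0):>3}")
print("Middle response:", labels[middle])
print("Most common response:", labels[most_common])
```

An arithmetic mean of the codes is deliberately left out, since it would treat the steps between adjacent codes as equal, which an ordinal scale does not guarantee.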
Chapter 2: Arranging Data
In this chapter we will discuss:
- Arranging Data: Why and How?
Section 1
Arranging Data : Why and How?
Arranging Data : Why?
In business, statistics is used to study the demand and
market characteristics of the product or service being sold.
In fact, market research has evolved into a separate
discipline. The planning process, whereby the firm seeks to
match its future activities with expected future conditions
and developments, can be facilitated by the use of
statistical probability. In the performance evaluation of
personnel, machinery, departments, etc., measures of
central tendency and dispersion can be used to provide a
certain degree of objectivity. In the field of finance, statistics
can be used to reveal long-term trends and seasonal
variations in sales, expenses and incomes. Statistics is
useful in the management of inventories and receivables. In
the management of investments, statistics can be used to
determine the alternative that provides the highest return
per unit risk. Statistics can also be used to test the validity
of various tools that are said to be useful in investment
selection.
Firms often study their profits over the years and attempt to
find clues for future performance. Investors compare
expected rates of return on various investment alternatives
to determine where to place their money. Merchant bankers
study the projected profits of their client companies to
advise on the right price at which equity issues may be
made. Credit rating agencies consider various factors
related to the creditworthiness of issuers of debt in order to
estimate the likelihood of default.
In the above mentioned cases, statistics is used to examine
real life situations for a description and assessment of what
is happening and to obtain some pointers to an uncertain
future. Statistics is all about number crunching and the
ultimate number cruncher – the computer – has placed
statistics in the center spot in today’s business environment.
COMPILE, COMPARE, CONCLUDE
Statistics may be used to reveal, conceal, guide and misguide. PCS
Data Products Ltd. is engaged in the manufacture of computer
hardware and copper clad laminates. In 2002-03 its sales were Rs.
29.86 crore, double the previous year’s sales. Suppose another
company Shady Ltd. (an imaginary company) claims that it has
outperformed PCS Data Products because its sales increased by
200% whereas PCS sales increased by only 100%. Such a statement
should be treated with extreme caution. For example, Shady Ltd. may
have had very low sales in the previous year, say sales of Rs.10,000
only. If they increased to Rs.30,000 in 2002-03, the growth rate would
be 200% which is higher than the growth rate of PCS. However, at
Shady’s low level of sales, such a high growth rate is totally
unimpressive as compared with the growth rate of PCS. In general,
when comparing growth rates, it is always useful to keep in view the
amounts involved. Otherwise, even a growth from zero to Re.1 can be
claimed to be an infinite growth rate! Suppose Shady Ltd. wants to
make a public issue of equity shares. In deciding whether to invest in
Shady’s shares, the public would consider the profits earned by it. If
Shady has incurred a loss, the public would be reluctant to take up its
shares. In such a case Shady may extend its current accounting year
to a period of say 15 months in the hope of covering up the loss with
the earnings in the additional three months. It may also use various
window-dressing measures to inflate its profits so that it displays a
higher profitability than PCS. In such cases the data provided by
Shady cannot be compared with the data provided by PCS. A
mediocre company like Shady is likely to produce mediocre products.
Hence it would resort to fair and unfair means to sell its product. Here
too statistics may be used to distort the truth. For example, Shady may
claim that on the basis of a survey it was found that Shady’s products
were considered the best available. The truth could be that the survey
covered only friends and relatives of Shady’s management.
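The growth-rate comparison above can be made concrete with a short sketch. The figures are those quoted in the text; the helper name `growth_rate` and the conversion of crore to rupees (1 crore = 10,000,000) are introduced here for illustration.

```python
def growth_rate(previous, current):
    """Percentage growth from the previous value to the current value."""
    if previous == 0:
        # Growth from zero is undefined (the "infinite growth rate" in the text).
        raise ValueError("growth rate is undefined when the previous value is zero")
    return 100.0 * (current - previous) / previous

# PCS: Rs. 29.86 crore in 2002-03, double the previous year's sales.
pcs_current = 29.86e7
pcs_previous = pcs_current / 2

# Shady Ltd. (imaginary): Rs. 10,000 growing to Rs. 30,000.
shady_previous, shady_current = 10_000, 30_000

pcs_growth = growth_rate(pcs_previous, pcs_current)
shady_growth = growth_rate(shady_previous, shady_current)

pcs_increase = pcs_current - pcs_previous
shady_increase = shady_current - shady_previous

print(f"PCS:   {pcs_growth:.0f}% growth, increase of Rs. {pcs_increase:,.0f}")
print(f"Shady: {shady_growth:.0f}% growth, increase of Rs. {shady_increase:,.0f}")
```

Shady's percentage is twice that of PCS, yet its absolute increase is a tiny fraction of the increase at PCS, which is why the amounts involved should be read alongside the growth rates.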
INFLATION
Inflation is a general increase in the price of goods and services. The
inflation rate, as measured by the Consumer Price Index (CPI), was
9.9% in 20x2-x3 and is expected to be 8% in 20x3-x4. This does not
mean that prices will be lower in 20x3-x4. It merely means that the
general increase in prices will be lower. An example will clarify the
point.
The prices of certain items are included in calculating the CPI. If a
given quantity of these items cost Rs.10,000 at the beginning of
20x2-x3, then at the end of 20x2-x3 they would cost Rs.10,990, which is
9.9% more. Further, at the end of 20x3-x4 they will be expected to cost
Rs.11,869, which is 8% more than the cost at the beginning of that year.
Clearly the prices have not fallen; only the rate of increase has slowed
down.
Figure 2.1.1: Inflation as Reflected by the Cost of Items Worth Rs.10,000 on 1st April 1992
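The arithmetic behind the inflation example can be reproduced in a few lines. This is a sketch using the rates and the base cost quoted in the text; the variable names are illustrative.

```python
# Cost of the CPI basket at the beginning of 20x2-x3 and the yearly inflation rates.
base_cost = 10_000.0
inflation_rates = {"20x2-x3": 0.099, "20x3-x4": 0.08}

cost = base_cost
cost_at_year_end = {}
for year, rate in inflation_rates.items():
    # Each year's increase is applied to the cost at the start of that year.
    cost = cost * (1 + rate)
    cost_at_year_end[year] = round(cost, 2)

for year, value in cost_at_year_end.items():
    print(f"End of {year}: Rs. {value:,.2f}")
```

The cost rises in both years; the lower rate in the second year only means that it rises more slowly.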
Arranging Data : How?
Frequency Distributions
We will discuss this through the following example.
However, before we do that, we wish to differentiate
between raw data and processed data. Raw data is
information before it is processed and/or analyzed.
Processed data is information presented in a form so that
the reader can draw valid conclusions from it.
Example 2.1.1
The following table 2.1.1 lists the supposed share prices of
30 companies:

Table 2.1.1

| Company | Rs. |
| --- | --- |
| ACC | 1690.00 |
| Ballarpur | 155.00 |
| Bharat Forge | 158.75 |
| Bombay Dyeing | 236.25 |
| Ceat | 71.00 |
| Century | 5250.00 |
| GE Shipping | 73.75 |
| Glaxo | 200.00 |
| Grasim | 357.50 |
| Gujarat Fertilizers | 205.00 |
| Hindustan Motors | 26.00 |
| Hindustan Lever | 350.00 |
| Hindalco | 585.00 |
| Indian Rayon | 315.00 |
| Indian Organic | 35.00 |
| Indian Hotels | 420.00 |
| ITC | 441.25 |
| Kirloskar Cummins | 305.00 |
| Larsen & Toubro | 175.00 |
| Mahindra & Mahindra | 143.00 |
| Mukand | 197.50 |
| Nestle India | 282.50 |
| Peico | 125.00 |
| Premier Automobiles | 35.00 |
| Reliance | 191.00 |
| Siemens | 355.00 |
| Tata Power | 870.00 |
| Tata Steel | 147.00 |
| Telco | 185.00 |
| Voltas | 52.50 |

The presentation of data in this form requires a great deal
of space. If you refer to the newspaper pages which report
share prices of all shares traded on the previous day, you
will see that a wide space is covered. The above method of
presentation also does not allow one to quickly determine
the answers to the following types of questions:
- What is the minimum share price among those given?
- What is the maximum share price among those given?
- Are the share prices evenly spread between the minimum and maximum values? If not, are they concentrated in any interval?
We can improve upon the above presentation of data by
creating an array in which the prices are arranged in
ascending or descending order. Table 2.1.2 is an ascending
array of the data given in table 2.1.1.
Now, the questions posed earlier can be answered more
quickly, but the data still covers the same amount of space.
Besides, without the assistance of a computer, the sorting
work involved in preparing the array is quite laborious. One
would have to repeatedly scan through the data to determine
the lowest share price, then the next lowest share price and
so on.
A more concise way to present the above data would be the
frequency distribution. A frequency distribution of the above
data is given in table 2.1.3.
Table 2.1.2

| Company | Rs. |
| --- | --- |
| Hindustan Motors | 26.00 |
| Indian Organic | 35.00 |
| Premier Automobiles | 35.00 |
| Voltas | 52.50 |
| Ceat | 71.00 |
| GE Shipping | 73.75 |
| Peico | 125.00 |
| Mahindra & Mahindra | 143.00 |
| Tata Steel | 147.00 |
| Ballarpur | 155.00 |
| Bharat Forge | 158.75 |
| Larsen & Toubro | 175.00 |
| Telco | 185.00 |
| Reliance | 191.00 |
| Mukand | 197.50 |
| Glaxo | 200.00 |
| Gujarat Fertilizers | 205.00 |
| Bombay Dyeing | 236.25 |
| Nestle India | 282.50 |
| Kirloskar Cummins | 305.00 |
| Indian Rayon | 315.00 |
| Hindustan Lever | 350.00 |
| Siemens | 355.00 |
| Grasim | 357.50 |
| Indian Hotels | 420.00 |
| ITC | 441.25 |
| Hindalco | 585.00 |
| Tata Power | 870.00 |
| ACC | 1690.00 |
| Century | 5250.00 |

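With a computer, the laborious sorting described in the text reduces to a single call. The sketch below rebuilds the ascending array from the unsorted share prices and reads off the minimum and maximum; the dictionary name and layout are illustrative choices, and the Century price is taken as Rs. 5250.00, as in table 2.1.2.

```python
# Share prices (Rs.) of the 30 companies in the unsorted order of table 2.1.1.
share_prices = {
    "ACC": 1690.00, "Ballarpur": 155.00, "Bharat Forge": 158.75,
    "Bombay Dyeing": 236.25, "Ceat": 71.00, "Century": 5250.00,
    "GE Shipping": 73.75, "Glaxo": 200.00, "Grasim": 357.50,
    "Gujarat Fertilizers": 205.00, "Hindustan Motors": 26.00,
    "Hindustan Lever": 350.00, "Hindalco": 585.00, "Indian Rayon": 315.00,
    "Indian Organic": 35.00, "Indian Hotels": 420.00, "ITC": 441.25,
    "Kirloskar Cummins": 305.00, "Larsen & Toubro": 175.00,
    "Mahindra & Mahindra": 143.00, "Mukand": 197.50, "Nestle India": 282.50,
    "Peico": 125.00, "Premier Automobiles": 35.00, "Reliance": 191.00,
    "Siemens": 355.00, "Tata Power": 870.00, "Tata Steel": 147.00,
    "Telco": 185.00, "Voltas": 52.50,
}

# Ascending array: (company, price) pairs ordered by price.
ascending = sorted(share_prices.items(), key=lambda item: item[1])

lowest_company, lowest_price = ascending[0]
highest_company, highest_price = ascending[-1]
price_range = highest_price - lowest_price

for company, price in ascending:
    print(f"{company:<22}{price:>9.2f}")
print(f"Minimum: {lowest_company} at Rs. {lowest_price:.2f}")
print(f"Maximum: {highest_company} at Rs. {highest_price:.2f}")
print(f"Range:   Rs. {price_range:.2f}")
```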
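Table 2.1.3

| Class Interval: Share Price (Rs.) | Tally Marks | Frequency |
| --- | --- | --- |
| 20-895 | ~~IIII~~ ~~IIII~~ ~~IIII~~ ~~IIII~~ ~~IIII~~ III | 28 |
| 895-1770 | I | 1 |
| 1770-2645 | | 0 |
| 2645-3520 | | 0 |
| 3520-4395 | | 0 |
| 4395-5270 | I | 1 |
| Total | | 30 |

(~~IIII~~ denotes a struck-through group of five tally marks.)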
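The tallying that produces table 2.1.3 can be sketched as a single pass over the data. The code assumes exclusive classes of width Rs. 875 starting at Rs. 20, as in the table, so that a price equal to an upper limit falls into the next class; the variable names are illustrative.

```python
# The 30 share prices (Rs.) from table 2.1.1, in their original order.
prices = [
    1690.00, 155.00, 158.75, 236.25, 71.00, 5250.00, 73.75, 200.00, 357.50,
    205.00, 26.00, 350.00, 585.00, 315.00, 35.00, 420.00, 441.25, 305.00,
    175.00, 143.00, 197.50, 282.50, 125.00, 35.00, 191.00, 355.00, 870.00,
    147.00, 185.00, 52.50,
]

lower_limit = 20   # lower limit of the first class
class_width = 875  # length of each class interval
n_classes = 6

# One pass through the data: each price adds one "tally" to its class.
frequencies = [0] * n_classes
for price in prices:
    index = int((price - lower_limit) // class_width)
    frequencies[index] += 1

for i, frequency in enumerate(frequencies):
    low = lower_limit + i * class_width
    high = low + class_width
    print(f"{low:>5}-{high:<5}{frequency:>4}")
print(f"{'Total':<11}{sum(frequencies):>4}")
```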
Notes:
1. There are no hard and fast rules regarding the number
and size of class intervals. However, the following guidelines
are to be followed:
 a. Every item of data or data point (in this case, share
 price) should be included in one and only one class. Hence:
  i. The lowest share price should be included in
  the first class and the highest share price in the last class.
  ii. Adjacent classes should not have intervals in between. For
  example, we cannot have adjacent classes like
  20 – 895
  900 – 1775
  because neither class would include the data
  points 896, 897, 898 and 899.
  iii. Classes should not overlap. Hence we cannot
  have classes like
  20 – 895
  890 – 1765
  because the classes overlap and the data
  points 890, 891, 892, 893, 894 fall in both classes.
  However, classes like
  20 – 895
  895 – 1770
  do not overlap because the data point 895 is
  included only in the class 895 – 1770. Such classes,
  where the upper limit 895 of one class equals the lower limit
  895 of the next class, are called “exclusive” classes because
  the upper limit of a class is excluded from the class.
  We could also have classes of the type
  20 – 894.99
  895 – 1769.99
  These are called “inclusive” classes because
  the upper limit of each class is included in that class. Also
  note that there are no intervals between the classes because
  all data points are rounded off to the nearest paisa. If we had
  data points like Rs.894.993 then the above inclusive classes
  would have to be adjusted as
  20 – 894.999
  895 – 1769.999
 b. Class intervals should be of the same length to the extent
 possible. (An example where it is not so is in item 3.)
 In order to have the same definition of length for
 inclusive and exclusive classes, the length of a class interval
 is defined as the difference between the lower limits of
 adjacent classes. Hence in the case of classes
 20 – 895 and 895 – 1770, or
 20 – 894.99 and 895 – 1769.99
 the first class interval has a length of 895 – 20 = Rs.875.
 c. The number of classes should usually be between
 six and fifteen.
 d. Subject to (c) above, the number of classes may be
 equal to the square root of the number of data points. In our
 example there are 30 observations or data points. Hence the
 number of classes should be around $\sqrt{30}$ or 6.
2. The “Tally Marks” are merely a simple way of obtaining all the
class frequencies by running through the given data just once.
They are usually omitted in the presentation of a frequency
distribution.
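Two of the guidelines above lend themselves to a short sketch: the square-root rule for the number of classes, and the difference between exclusive and inclusive class limits. The function names are hypothetical, and combining guidelines (c) and (d) by rounding the square root and then keeping the result between six and fifteen is one reading of the text rather than a fixed rule.

```python
import math

def suggested_classes(n_points, minimum=6, maximum=15):
    """Square root of the number of data points, kept between 6 and 15."""
    root = round(math.sqrt(n_points))
    return max(minimum, min(maximum, root))

def in_exclusive_class(value, lower, upper):
    """Exclusive class: the upper limit belongs to the next class."""
    return lower <= value < upper

def in_inclusive_class(value, lower, upper):
    """Inclusive class: the upper limit belongs to this class."""
    return lower <= value <= upper

print("Classes suggested for 30 data points:", suggested_classes(30))

# With exclusive classes 20 – 895 and 895 – 1770, the price 895 is counted once.
print("895 in 20 – 895 (exclusive):  ", in_exclusive_class(895, 20, 895))
print("895 in 895 – 1770 (exclusive):", in_exclusive_class(895, 895, 1770))

# With inclusive classes 20 – 894.99 and 895 – 1769.99, 894.99 stays in the first.
print("894.99 in 20 – 894.99 (inclusive): ", in_inclusive_class(894.99, 20, 894.99))
print("894.99 in 895 – 1769.99 (inclusive):", in_inclusive_class(894.99, 895, 1769.99))
```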
3. Note that from the original data we were able to
construct a frequency distribution. However, given only the
frequency distribution we cannot reconstruct the original data.
Hence in obtaining a summarized presentation we have lost
information like the names of the companies and the exact
price of each company’s shares.
In the illustration given above (refer table 2.1.3) it may
be noticed that there are zero frequencies for the classes
1770 - 2645, 2645 - 3520 and 3520 - 4395. In fact, these
classes have been necessitated because of the single data
point in the class 4395 - 5270. At the same time, there is
overcrowding of data points in the class 20 - 895. To remedy
the above drawbacks we may use the type of classification
shown in table 2.1.4.
However, the above frequency distribution violates one of our
guidelines, i.e., all class intervals should be of equal length.
This is because the last class “900 and over” has an infinite
length. Such classes are called “open-ended classes”
because we cannot numerically fix the upper (or in some
cases lower) end of the classes.
Example 2.1.2
Below are the debt-equity ratios of some companies as
shown in table 2.1.5:
How would you classify these companies in a frequency
distribution according to their debt-equity ratios?
First, you would count the number of observations or data
points – they are 17 in all. Hence we should have $\sqrt{17}$ or
approximately 4 classes. However, since the number of classes
should usually be at least six, let us settle for the minimum of 6
classes. The data points range from 0 for Colgate to 4.4 for
Hindustan Motors (refer table 2.1.6 for the class width).
Hence we can get the following frequency distribution as
shown in table 2.1.7.
Table 2.1.4

| Class Interval: Share Price (Rs.) | Frequency |
| --- | --- |
| 25-200 | 15 |
| 200-375 | 9 |
| 375-550 | 2 |
| 550-725 | 1 |
| 725-900 | 1 |
| 900 and over | 2 |
| Total | 30 |

Frequency Polygons
Example 2.1.3
The growth of a particular industry may be ascertained by
several indicators. One simple indicator is the growth in sales
as compared with the previous year. Given below (table 2.1.8)
is an industry-wise performance of the corporate sector for two
consecutive years.
Table 2.1.5

| Company | Debt-Equity Ratio |
| --- | --- |
| Parke Davis | 0.4 |
| Pfizer | 0.9 |
| East India Hotels | 1.4 |
| Reliance Industries | 1.2 |
| NOCIL | 0.8 |
| Videocon | 1.4 |
| Colgate | 0 |
| Essar Shipping | 1.9 |
| Ceat | 2.6 |
| Hindustan Motors | 4.4 |
| Voltas | 2.1 |
| Baroda Rayon | 2.5 |
| ITC | 1.7 |
| GE Shipping | 0.5 |
| Larsen & Toubro | 0.7 |
| TISCO | 1.3 |
| Hindustan Lever | 0.6 |

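Table 2.1.7

| Class: Debt-equity Ratio | Tally Marks | Frequency |
| --- | --- | --- |
| 0.00-0.75 | ~~IIII~~ | 5 |
| 0.75-1.50 | ~~IIII~~ I | 6 |
| 1.50-2.25 | III | 3 |
| 2.25-3.00 | II | 2 |
| 3.00-3.75 | | 0 |
| 3.75-4.50 | I | 1 |
| Total | | 17 |

(~~IIII~~ denotes a struck-through group of five tally marks. The frequencies of the first two classes are counted from the ratios listed in table 2.1.5.)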
Table 2.1.6
We can find the minimum class width that would cover the data
points by using the formula
$$\text{Class width} = \frac{\text{Maximum Value Data Point} - \text{Minimum Value Data Point}}{\text{Number of Classes}} = \frac{4.4 - 0}{6} = 0.73 \approx 0.75$$
Table 2.1.8
Industry  No. of companies  Net Sales  Percentage  Industry  No. of companies  Net Sales  Percentage
Tea & Coffee 20 1,018.92 6.8 Paper & Pulp 21 2,339.33 11.9
Vegetable Oils & Vanaspati 20 1,506.77 20.7 Textiles 68 4,748.63 18.1
Sugar 11 671.30 34.1 Man-Made Fibres 23 6,278.31 33.1
Other Food Products 14 606.71 10.7 Other Textile Products 12 397.59 24.0
Alcohol 7 858.22 18.7 Cement 16 1,253.72 5.9
Cigarettes 4 3,439.02 17.8 Cement & Asbestos Products 4 225.86 18.3
Mineral Products 6 253.38 39.7 Ceramics 7 228.55 16.2
Alkalies 8 1,125.10 18.9 Glass 7 269.36 11.2
Other Inorganic Chemicals 16 585.82 30.7 Granite & Marble 6 93.05 53.8
Organic Chemicals 14 764.73 14.7 Gems & Jewellery 6 498.89 35.8
Drugs & Pharmaceuticals 27 2,430.97 29.0 Steel 29 5,463.41 22.5
Fertilizers 8 1,927.59 -10.2 Castings & Forgings 9 287.75 39.9
Pesticides 4 313.86 24.0 Steel Tubes & Pipes 6 504.62 36.2
Dyes 9 528.84 16.6 Steel Products 17 983.56 20.3
Paints 6 457.70 5.9 Aluminum 5 1,419.36 16.7
Cosmetics & Toiletries 8 1,055.38 -2.4 Non-ferrous Metals 7 367.44 44.7
Other Chemicals 14 633.43 37.6 Industrial Machinery 25 1,128.11 13.9
Plastic in primary form 5 526.78 11.5 Machine Tools 7 93.52 35.2
Plastic Products 13 555.68 -12.8 Other Non-Elect. Machinery 26 1,830.80 16.5
Vegetable Oils & Vanaspathi 7 618.80 8.0 (contd.)
The corresponding frequency distribution of change in sales is
given below in table 2.1.9:
Table 2.1.9
Class: Change in Sales (%)    Frequency
-13 to -4.4                   2
-4.4 to 4.2                   3
4.2 to 12.8                   12
12.8 to 21.4                  16
21.4 to 30.0                  5
30.0 to 38.6                  10
38.6 to 47.2                  6
47.2 to 55.8                  3
Total                         57
We can draw a graph for this frequency distribution by taking
classes or class marks (mid-points of classes) on the X-axis
and frequencies on the Y-axis.
Table 2.1.8 (Contd.)
Industry  No. of companies  Net Sales  Percentage
Electrical Machinery 16 2,353.20 16.1
Dry Cells & Batteries 2 198.82 -3.0
Electric Lamps and Bulbs 4 34.74 31.2
Wires & Cables 11 1,055.74 47.5
Electronics 31 623.44 55.4
Consumer Electronics 9 2,018.81 6.8
Computers & Office Equip. 4 205.59 45.3
Automobiles 9 6,229.62 3.3
Automobiles Ancillaries 35 995.72 12.2
Miscellaneous Mfg. 13 1,048.62 8.4
Construction 16 1,043.78 27.9
Hotels 8 192.42 36.3
Transport Services 7 822.88 12.3
Financial Services 50 729.82 46.7
Other Services 6 113.44 38.7
Diversified 13 6,264.05 15.8
Electricity 3 1,588.17 30.1
Total for all industries 777 75,120.03 17.3
Here rectangles have been erected with their bases equal
to the lengths of the class intervals and their heights equal
to the frequencies on a suitable scale. This type of graph is
called a Histogram.
While the histogram indicates the fluctuations in
frequencies from class to class, it does not clearly reveal
the rate of change in frequency from one class to the next.
For example, it is difficult to say by examining the
histogram whether the decline in frequency from the class
30-38.6 to the class 38.6-47.2 is the same as the decline in
frequency from the class 38.6-47.2 to the class 47.2-55.8.
Such a question can be easily answered by using a
frequency polygon.
In the case of a frequency polygon, the mid-points of the
classes are taken on the X-axis and the frequencies are
taken on the Y-axis. The plotted points are joined by
straight lines. The last point B is joined to the X-axis at the
mid-point of the next class, 55.8 to 64.4. Similarly, the first
point A is joined to the mid-point of the preceding class,
-21.6 to -13.
In the frequency polygon, we can see that the line a slopes
more steeply than the line b. Hence we can
conclude that the frequency drop from the class
30-38.6 to the class 38.6-47.2 is greater than the frequency
drop from the class 38.6-47.2 to the class 47.2-55.8.
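The comparison of the two slopes can also be made numerically. The following is a minimal sketch using the classes of Table 2.1.9; the names `midpoints`, `changes`, `drop_a` and `drop_b` are assumptions made for the illustration.

```python
# Classes (lower limits) and frequencies from Table 2.1.9.
lower_limits = [-13.0, -4.4, 4.2, 12.8, 21.4, 30.0, 38.6, 47.2]
frequencies = [2, 3, 12, 16, 5, 10, 6, 3]
width = 8.6

# Class marks (mid-points) used on the X-axis of the polygon.
midpoints = [round(lo + width / 2, 1) for lo in lower_limits]

# Change in frequency between consecutive class marks. Because all
# classes have the same width, this change is proportional to the slope.
changes = [b - a for a, b in zip(frequencies, frequencies[1:])]

drop_a = changes[5]  # from class 30.0-38.6 to class 38.6-47.2
drop_b = changes[6]  # from class 38.6-47.2 to class 47.2-55.8

print(midpoints)
print(drop_a, drop_b)  # -4 -3
```

A larger drop over the same horizontal distance means a steeper line segment.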
We may, similarly, define cumulative frequency distribution
and the graph of this distribution is called an Ogive.
For example, consider the sales data (refer table 2.1.10).
Cumulative Frequency Table
It may be noticed from the “less than” Ogive
curve below that it slopes up to the right.
Table 2.1.10
Class               Frequency    Cumulative Frequency
-13 ≤ x < -4.4      2            2
-4.4 ≤ x < 4.2      3            5
4.2 ≤ x < 12.8      12           17
12.8 ≤ x < 21.4     16           33
21.4 ≤ x < 30.0     5            38
30.0 ≤ x < 38.6     10           48
38.6 ≤ x < 47.2     6            54
47.2 ≤ x < 55.8     3            57
Total               57
We may similarly construct relative frequency tables
where the frequency of a class is divided by the total
number of observations.
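Both the cumulative and the relative frequencies follow mechanically from the class frequencies. A minimal sketch, assuming the frequencies of the sales data in Table 2.1.9 (the variable names are illustrative):

```python
from itertools import accumulate

# Class frequencies of the change-in-sales data (Table 2.1.9).
frequencies = [2, 3, 12, 16, 5, 10, 6, 3]
total = sum(frequencies)

# "Less than" cumulative frequencies: running total up to each class.
cumulative = list(accumulate(frequencies))

# Relative frequencies: each class frequency divided by the total.
relative = [f / total for f in frequencies]

print(cumulative)  # [2, 5, 17, 33, 38, 48, 54, 57]
print([round(r, 3) for r in relative])
```

The last cumulative frequency always equals the total number of observations, and the relative frequencies always add up to 1.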
A frequency polygon (or a relative frequency polygon)
indicates the skewness of the distribution.
Curve B is symmetrical, while curve A is said to be skewed to
the right and curve C is skewed to the left.
Skewness refers to the lack of symmetry. A distribution for
which the mean, median and mode are equal is known as
a symmetrical distribution. In such a distribution curve, a
vertical line drawn from the peak of the curve to the
horizontal axis will divide the area of the curve into two
equal parts and each part is the mirror image of the other.
An asymmetrical distribution for which the mean, median
and mode are not equal is known as a skewed
distribution. In a skewed distribution curve the values are
not equally distributed but are concentrated at the lower
or higher end of the frequency distribution.
In a curve, if many values are concentrated at the lower
end and very few values are concentrated at the higher
end, the curve is said to be skewed to the right or
positively skewed. A positively skewed distribution curve
tails off towards the higher end and for such a curve A.M
> Median > Mode. For a negatively skewed curve the
values are concentrated at the higher end and it is
skewed to the left because it tails off towards the low end.
Here the A.M < Median < Mode.
Remark
At this point, we want to distinguish between a Parameter
and a Statistic.
Suppose we compute the annual returns for the past year
of all the scrips listed on the Bombay Stock Exchange.
From the data, we may compute, say, the mode M and
the variance, $\sigma^2$. We may also compute the mode, m,
and the variance, $s^2$, of the data restricting ourselves to
the 30 scrips in the Sensex.
M and $\sigma^2$ are called parameters as they pertain to the
entire population. m and $s^2$ are called statistics as they
pertain to a sample.
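The distinction can be illustrated with Python's standard `statistics` module. The figures below are hypothetical returns invented for the illustration, not actual market data, and the names `population`, `sample`, `sigma_sq` and `s_sq` are assumptions. Note that `statistics.pvariance` divides by the number of observations N, while `statistics.variance` divides by n - 1.

```python
import statistics

# Hypothetical annual returns (%) standing in for all listed scrips.
population = [12, 8, 15, 8, -3, 20, 8, 11, 15, 6]
# Hypothetical subset standing in for the scrips in the index.
sample = population[:5]

# Parameters: computed from the entire population.
M = statistics.mode(population)
sigma_sq = statistics.pvariance(population)

# Statistics: computed from the sample only.
m = statistics.mode(sample)
s_sq = statistics.variance(sample)

print(M, sigma_sq)  # 8 35.2
print(m, s_sq)      # 8 46.5
```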
CHAPTER 3
Measures of Central Tendency
In this chapter we will discuss
Objectives of Averaging
Types of Averages: Mathematical & Positional Averages
Case Study: Mattel’s Global Expansion: Analyzing Growth Trends
Section 1
Objectives of Averaging
The most important objective of a statistical analysis is to
calculate a single value that represents the characteristics of
the entire available raw data. This single value representing
the entire data is called the ‘central value’ or an ‘average’.
This value is the point around which all the other values of the
data cluster. Therefore it is known as a measure of location,
and since this value is located at a central point nearest to
the other values of the data, it is also known as a measure of
central tendency. This chapter discusses various measures of
central tendency like mean, median and mode and their use
in day-to-day management activities. For example, the mean
sales of a territory give a rough idea to the sales manager
about the sales potential of that territory.
a. To find out one value that represents the whole mass
of data
The objective of averaging is to represent a set of individual
values in a concise way, so that the researcher can have an
instant idea about the typical size of the entities in the group.
Averages help the researcher or manager to grasp the
characteristics of the data group without studying every value in
the group. For example, a manager gets a good idea about the
age profile of trainees of a fresh batch by looking at the
average age (calculated by dividing the total of the ages of all
the trainees by the number of trainees). This average is a value
that enables the manager to have an overall idea about the
characteristics of the large number of trainees.
b. To enable comparison
Averages help in comparing two or more sets of data on the
same variable. They also help in drawing conclusions about
the characteristics of different sets of data. For example, a
manager can use the average sales of two territories to
compare the performance of the sales executives of the two
territories. The average sales figure of each territory reduces
the burden of going through volumes of sales data to know the
performance of each territory. Thus, a quick and easy
comparison of the sales of the two territories is made possible
for a manager by these averages.
c. To establish relationship
Averages play a major role in establishing relationships
between separate groups in quantitative terms. It is vague if
one states that the productivity of an employee of Wipro is more
than that of an employee of Satyam Computer Solutions. It
would make sense if both the productivities are expressed in
terms of averages.
d. To derive inferences about a universe from a sample
Averages help a manager to get valuable inferences about
the whole universe by means of sample data. The average
calculated from sample data gives a reliable idea about the
average of the entire universe.
e. To aid decision-making
Averages act as benchmarks or standards for managerial
control and decision-making. A production manager may rely
on average employee productivity to set future production
targets for individuals and the organization as a whole. Thus
these averages (average turnover, etc.) act as benchmarks
for performance appraisal and decision-making in future.
Requisites of a Good Average
An ideal average should have the following characteristics:
Should be rigidly defined
Should be mathematically expressed (have a mathematical
formula)
Should be readily comprehensible and easy to calculate
Should be calculated based on all the observations
Should be least affected by extreme fluctuations in sampling
data
Should be suitable for further mathematical treatment
In addition to the above requisites, a good average should
also retain the maximum characteristics of the data, i.e., it
should be the value nearest to all the data elements. Averages
should be calculated for homogeneous data, e.g., ages, sales, etc.
Section 2
Types of Averages
Averages or measures of central tendency are of the
following types:
I. Mathematical averages
i. Arithmetic mean
ii. Geometric mean
II. Positional averages
i. Median
ii. Mode
Of the above, arithmetic mean, median and mode are the most
widely used averages, in that order. Keynote 3.2.1 shows
the types of averages.
I. Mathematical Averages
i. Arithmetic Mean
The arithmetic mean or mean is the most simple and
frequently used average.
Arithmetic mean is represented by the notation $\bar{x}$ (read x-bar).
Calculating the Mean from Ungrouped Data
Ungrouped data refers to a collection of observations
$x_1, x_2, \ldots, x_n$.
The mean is then calculated as:
Keynote 3.2.1: Types of Averages
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$
where,
$i$ indicates the $i$th observation,
$\sum_{i=1}^{n} x_i$ is the sum of values of all observations,
$n$ is the number of observations,
$\sum$ indicates that all the values of $x$ are summed
together.
When the mean is calculated for the entire population, it is
known as population arithmetic mean ($\mu$). ‘N’ is the number
of elements (observations) in the population.
Then $\mu = \sum x_i / N$
Example 3.2.1
Absentee List of Drivers of the Transport Department over a
Span of 90 Days is shown below in table 3.2.1:
When a manager wants to know the average number of days
a driver is on leave in 90 days, he can calculate the mean of
the ungrouped data as follows:
$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{8+6+6+7+4+5+6+2+4+7}{10}$$
= 55/10
= 5.5 days per driver out of 90 days
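The same calculation can be checked with a short Python sketch; the variable names are illustrative.

```python
# Days on leave for the 10 drivers (Table 3.2.1).
days_on_leave = [8, 6, 6, 7, 4, 5, 6, 2, 4, 7]

total_days = sum(days_on_leave)      # sum of all observations
n = len(days_on_leave)               # number of observations
mean_days = total_days / n

print(total_days, n, mean_days)  # 55 10 5.5
```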
In the above example, the mean is calculated by adding
every observation separately, in no set order. Such data is
called ungrouped data. One can calculate the mean using the
above method for a limited number of values. But the task becomes
difficult while calculating the average for a vast data set, say
for 5000 employees. In such cases a frequency distribution of
the data will be helpful to a manager, and the mean should be
calculated using a different method.
Calculating the Mean from Grouped Data (Frequency
distribution)
A frequency distribution consists of data that are grouped
into classes and hence called grouped data. Every
observation (value) is placed in one of the classes. Unlike
the earlier example, the manager is unaware of the individual
values of every observation of the universe. For example, a
finance manager wants to find out the average monthly pay
of 600 employees in an organization, and he has a
Table 3.2.1
Driver                     1  2  3  4  5  6  7  8  9  10
Number of days on leave    8  6  6  7  4  5  6  2  4  7
Table 3.2.2: Average Monthly Pay of 600 Employees
Class (Rupees)     Frequency
1000 - 2999        50
3000 - 4999        110
5000 - 6999        162
7000 - 8999        100
9000 - 10999       83
11000 - 12999      45
13000 - 14999      25
15000 - 16999      15
17000 - 18999      8
19000 - 20999      2
Total              600
Table 3.2.3: Calculating Arithmetic Mean for Grouped data
Class (Rupees) (1)   Class Mark (X)* (2)   Frequency (f) (3)   (f) x (X) (2) x (3)
1000 - 2999          2000                  50                  1,00,000
3000 - 4999          4000                  110                 4,40,000
5000 - 6999          6000                  162                 9,72,000
7000 - 8999          8000                  100                 8,00,000
9000 - 10999         10000                 83                  8,30,000
11000 - 12999        12000                 45                  5,40,000
13000 - 14999        14000                 25                  3,50,000
15000 - 16999        16000                 15                  2,40,000
17000 - 18999        18000                 8                   1,44,000
19000 - 20999        20000                 2                   40,000
n = 600, ∑(f x X) = 44,56,000
Sample mean, $\bar{X}$ = ∑(f x X)/n = 44,56,000/600 = Rs. 7426.66
*Class mark adjusted to nearest integers.
frequency distribution (shown in Table 3.2.2).
To compute the arithmetic mean of grouped data, calculate
the midpoint of each class and multiply each midpoint (class
mark) by the frequency of observations in the corresponding
class. Then add all these results and divide the sum
by the total number of observations.
Midpoint (class mark) = x = (lower limit + upper limit)/2
The formula for computing Arithmetic mean for grouped data
is:
$$\bar{x} = \frac{\sum (f \times x)}{n}$$
Where, $\sum$ = Notation for “Sum”
$f$ = Number of observations in each class
$x$ = class mark (mid point of each class)
$n$ = Number of observations
In the above example, the approximate mean (average
salary) is Rs.7426.66.
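The grouped-data calculation can be reproduced with the following sketch. It assumes the classes of Table 3.2.2 and rounds each class mark up to the nearest integer, as the table footnote indicates; the names `classes`, `class_mark` and `total_fx` are illustrative.

```python
# (lower limit, upper limit, frequency) for each class of Table 3.2.2.
classes = [
    (1000, 2999, 50), (3000, 4999, 110), (5000, 6999, 162),
    (7000, 8999, 100), (9000, 10999, 83), (11000, 12999, 45),
    (13000, 14999, 25), (15000, 16999, 15), (17000, 18999, 8),
    (19000, 20999, 2),
]


def class_mark(lower, upper):
    """Mid-point of the class, rounded up to the nearest integer."""
    return (lower + upper + 1) // 2


n = sum(f for _, _, f in classes)
total_fx = sum(f * class_mark(lo, hi) for lo, hi, f in classes)
grouped_mean = total_fx / n

print(n, total_fx, grouped_mean)
```

With these classes the sum of f x X is 4,456,000, and dividing by 600 gives a mean of roughly Rs. 7426.67.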
In case we had the data on the income of each of the 600
employees (i.e., ungrouped data), we could have calculated
the mean using the previous method. While there would be
some difference between the means obtained by both the
methods, mostly it would be small.
The first advantage of arithmetic mean is that its concept is
familiar and clear to most people. The second advantage is
that it is easy to understand and easy to calculate. Every data
set has one and only one mean. Finally, arithmetic average
provides a good basis for comparison. For example, if a
manager wants to compare the performance of salesmen of
four different regions of a state, arithmetic average provides
the correct basis for assessing the relative efficiency of the
regions.
However, arithmetic mean suffers from a few drawbacks.
First, it may be affected by extreme values that are far
from the other values of the group. Consider the units
produced in a day by 5 workers of a batch, as in Table 3.2.4.
The mean units produced per day is
µ = ∑ x/n = (23 + 22 + 24 + 21 + 5) / 5 = 19 units
When the mean is calculated leaving out the fifth worker,
it is 22.5 units. Thus, one extreme value ‘5’ has
affected the mean. Hence, it is more appropriate to calculate
the mean excluding the extreme value in order to make it
more representative.
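The effect of the extreme value can be seen in a small sketch using Python's standard `statistics` module. The median is shown only for comparison; the variable names are illustrative.

```python
import statistics

# Units produced in a day by the 5 workers (Table 3.2.4).
units = [23, 22, 24, 21, 5]

mean_all = statistics.mean(units)               # includes the extreme value
mean_without = statistics.mean(units[:-1])      # fifth worker left out
median_all = statistics.median(units)           # middle value, for comparison

print(mean_all, mean_without, median_all)  # 19 22.5 22
```

Leaving out a single observation moves the mean from 19 to 22.5 units, whereas the median of all five values (22) stays close to the typical output.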
Table 3.2.4: Number of Units Produced by Workers in a Day
Worker    1   2   3   4   5
Units     23  22  24  21  5
The second disadvantage is that we cannot calculate the
mean for a grouped data set with open-ended classes at
either end of the scale. A class that allows either the upper
or lower end of a quantitative classification scheme to be
limitless is called an open-ended class.
The Weighted Arithmetic Mean
The weighted mean is calculated by taking into account the
relative importance of each of the values to the total value.
Consider, for example, the manufacturing company in Table
3.2.5 that employs three grades of labor (unskilled,
semiskilled, and skilled) to produce each of the two
products. When the company wants to know the average
wage per hour for each product, the simple arithmetic
average of the wages of the three types of labor will not
be appropriate, as it gives equal weight to each category of
labor even though the products use the categories in
different proportions.
An appropriate method to calculate the average wage
per hour for the products is to take a ‘weighted average’ of
the wages of the three classes of labor, weighted in
proportion to the labor hours required from each of the three
classes to produce the product.
Here one unit of Product 1 requires 10 hours of labor, of
which
Unskilled labor requires 2 hours,
Semi-skilled labor requires 3 hours,
Skilled labor requires 5 hours.
When this information is used as weights, the
wage of labor (per hour) for Product 1 is:
= (2 × 10 + 3 × 15 + 5 × 20)/(2 + 3 + 5)
= Rs. 16.5 / hour
Similarly, for Product 2 the cost of labor (per hour) for 1 unit is:
= (6 × 10 + 2 × 15 + 1 × 20)/(6 + 2 + 1)
= Rs. 12.22 / hour
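The two calculations above can be expressed as one small function. This is an illustrative sketch; the function name `weighted_mean` and the variable names are assumptions.

```python
def weighted_mean(values, weights):
    """Sum of weight x value divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


# Wage per hour (Rs.) for unskilled, semiskilled and skilled labor.
wages = [10, 15, 20]

# Labor hours per unit of each product (Table 3.2.5) act as weights.
product_1 = weighted_mean(wages, [2, 3, 5])
product_2 = weighted_mean(wages, [6, 2, 1])

print(product_1)            # 16.5
print(round(product_2, 2))  # 12.22
```

For comparison, the simple average of the three wages is Rs. 15 per hour for both products, which ignores the different labor mix.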
As can be seen, in general, the formula for calculating the
weighted average is:
$$\bar{x}_w = \frac{\sum (w \times x)}{\sum w}$$
where,
Table 3.2.5: Labor - Capital Involved in Manufacturing Two Products
Class of Labour   Wage per hour (x) (Rs)   Labour hours per unit
                                           Product 1   Product 2
Unskilled         10                       2           6
Semiskilled       15                       3           2
Skilled           20                       5           1
$w$ = weight allocated to each observation (2, 3, 5 for product
1 in the above example)
$\sum (w \times x)$ = sum of each weight multiplied by that element.
$\sum w$ will be equal to 1, if the weights are expressed as
proportions.
ii. Geometric Mean
Managers often come across quantities that change over a
period of time and may need to know the average rate of
change over that period. Arithmetic mean is inaccurate in
tracking such changes. Hence a different measure of central
tendency, called Geometric Mean, is needed to calculate the
average rate of change. It is defined as:
$$GM = \sqrt[n]{X_1 \times X_2 \times \cdots \times X_n}$$
where, ‘n’ is the number of values.
Geometric mean is applicable in many cases. Its use in
calculating the average growth rate of a textile unit in the
southern region over the last five years (table 3.2.6) is shown
below:
The geometric mean
$$GM = \sqrt[5]{1.07 \times 1.08 \times 1.10 \times 1.12 \times 1.18} = 1.1093$$
Where,
$X_1, X_2, \ldots, X_n$ are termed as the growth factors and each
is equal to 1 + (rate/100).
1.1093 is the average growth factor. The growth rate is
calculated as
1.1093 – 1 = 0.1093 or the average growth rate is 10.93
percent per year.
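The growth-rate calculation can be verified with a short sketch; `math.prod` is part of the Python standard library, and the variable names are illustrative.

```python
import math

# Yearly growth rates (%) of the textile unit (Table 3.2.6).
growth_rates = [7, 8, 10, 12, 18]

# Growth factor for each year: 1 + (rate / 100).
factors = [1 + r / 100 for r in growth_rates]

# Geometric mean: n-th root of the product of the n growth factors.
gm = math.prod(factors) ** (1 / len(factors))
average_rate = (gm - 1) * 100

print(round(gm, 4))            # 1.1093
print(round(average_rate, 2))  # 10.93
```

The simple arithmetic mean of the five rates is 11 percent, slightly higher than the 10.93 percent obtained from the geometric mean.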
Example 3.2.2
Matel Plastics Ltd got a raw material delivery order from
Blowplast Inc. However, the condition was that the delivery
would be considered cancelled. Robert, the salesman at
Matel, was assigned the responsibility to make the delivery.
Robert had to be careful not to exceed the 80 kmph speed
limit, otherwise he would be flouting the traffic rules. The
Table 3.2.6: Growth Rate of Textile Units
Year 1 2 3 4 5
Growth rate (%) 7 8 10 12 18
marketing manager asked him not to go below 60 kmph as
there was a risk of the order being cancelled. Robert divided
his journey into four equal parts. He traveled the first quarter
of the distance at the speed of 50 kmph, the second quarter
at 65 kmph, the third quarter at 80 kmph and the last quarter
at 55 kmph. What is the average speed of his journey?
Solution:
Let the speed of Robert’s vehicle over the first, second, third
and fourth quarters of the distance be $X_1$, $X_2$, $X_3$ and $X_4$
respectively.
From the given information in the problem, we have
$X_1$ = 50
$X_2$ = 65
$X_3$ = 80
$X_4$ = 55
and n = 4.
After inserting the values in the formula for calculating the
harmonic mean, we get:
$$HM = \frac{n}{\frac{1}{X_1} + \frac{1}{X_2} + \frac{1}{X_3} + \frac{1}{X_4}} = \frac{4}{\frac{1}{50} + \frac{1}{65} + \frac{1}{80} + \frac{1}{55}} \approx 60.5$$
The average speed (HM) of Robert’s whole journey from
Matel to Blowplast is therefore 60.5 kmph.
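The harmonic mean for this example can be checked with Python's standard `statistics.harmonic_mean`, alongside the direct formula n divided by the sum of reciprocals. The variable names are illustrative.

```python
import statistics

# Speeds (kmph) over the four equal quarters of the distance.
speeds = [50, 65, 80, 55]

# Library function from the standard statistics module.
hm = statistics.harmonic_mean(speeds)

# Direct formula: n divided by the sum of the reciprocals.
hm_manual = len(speeds) / sum(1 / s for s in speeds)

print(round(hm, 1))  # 60.5
```

The arithmetic mean of the four speeds is 62.5 kmph, which overstates the average speed because more time is spent on the slower quarters.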
II. Positional Averages
i. The Median
The median, as the name suggests, is the middle value of a data
series arranged in increasing or decreasing order of
magnitude.
Unlike the arithmetic mean (which is calculated from the
value of every observation in the series), median is a
positional average. It is the middle-most value in the data or
the 50th percentile observation below which 50% of the
observations in the sample fall. The object of median is
therefore not merely to fix a value that shall be
representative of a data set, but also to establish a dividing
line separating the higher values from the lower values.
Video 3.2.1: Central tendency, mean, median and mode
Calculating the Median from Ungrouped Data
If the data set contains an odd number of observations, the
middle observation of the array is the median. If there is an
even number of observations, the median is the average of
the two middle observations. If the total number of
observations is odd, say n, the value of the $\left(\frac{n+1}{2}\right)$th item
gives the median, and when the total number of observations is
even, say, 2n, then the $n$th and $(n+1)$th items are the two central
observations and the arithmetic mean of these two
observations gives the median.
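The odd/even rule can be written as a short function. This is an illustrative sketch; the function name `median` and the use of the data of Tables 3.2.1 and 3.2.4 are choices made for the example.

```python
def median(values):
    """Middle value of the sorted data; for an even count,
    the mean of the two central observations."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Even number of observations: days on leave of 10 drivers (Table 3.2.1).
print(median([8, 6, 6, 7, 4, 5, 6, 2, 4, 7]))  # 6.0

# Odd number of observations: units produced by 5 workers (Table 3.2.4).
print(median([23, 22, 24, 21, 5]))  # 22
```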
Example 3.2.3:
The data in table 3.2.7 relates to the sales figures of certain
companies relating to the year 2002-03:
Solution:
The median for the above data can be obtained as follows
in table 3.2.8:
The series should first be arranged in an order. In the
present case, it has been arranged in the descending order.
As there are 10 elements, the median will be the mean of
the 5th and the 6th items, i.e.,
(412 + 312)/2= Rs. 362 lakhs.
Thus the median sales value of the ten companies is Rs.
362 lakhs.
Calculating the Median from Grouped Data
In order to find the median, first the median class (i.e., the
class containing the 50th percentile observation) is to be
Table 3.2.7: Sales Figures of Companies
Companies    Sales (Rs. Lakhs)
JCCement 1520
Compex Inds 228
Hotel India 239
Hydro Power Co. 292
Thermal Power Co. 734
Star Tea 412
Cooling Ind. 980
Vegetable Oil Co. 312
Plating Ind. 256
located and then interpolation is to be used by assuming that
observations are evenly spaced over the entire class interval.
The formula used for the calculation of median is:
$$\text{Median} = L_m + \frac{\frac{N+1}{2} - (F+1)}{f_m} \times W$$
where, $L_m$ = lower limit of the median class
$f_m$ = frequency of the median class
$F$ = cumulative frequency up to $L_m$
$W$ = width of the median class
$N$ = total frequency
Example 3.2.4
Let us find the median for the following data of Table 3.2.9.
Here the total frequency N = 153.
Median is the size of the $\left(\frac{N+1}{2}\right)$th item, i.e., the
$\left(\frac{153+1}{2}\right)$th item, i.e., the size of the 77th item. It lies in
the class 20-30.
Table 3.2.8: Sales Figures of Companies Arranged in Descending Order
Company    Sales    Rank
JCCement 1520 1
Cooling Ind. 980 2
Thermal Power Co. 734 3
Star Tea 412 5
Vegetable Oil Co. 312 6
Hydro Power Co. 292 7
Plating Ind. 256 8
Hotel India 239 9
Compex Inds 228 10
Table 3.2.9: Gross Profit as a Percentage of Sales
Gross Profit as a Percentage of Sales    0-10   10-20   20-30   30-40   40-50
No. of Companies                         21     32      43      34      23
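Hence 20-30 is the median class, of which the lower limit is
20. The cumulative frequency up to this class is 53 and its
frequency is 43.
$$\text{Median} = 20 + \frac{77 - (53 + 1)}{43} \times 10 = 20 + 5.35 = 25.35$$
Thus 25.35% is the median gross profit (as percentage of
sales) of the companies.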
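The interpolation can be sketched as a small function. It is written under an assumption: the interpolation step is taken as ((N + 1)/2 - (F + 1)) / f x W, which is the form that reproduces the 25.35% figure of this example. The function name `grouped_median` and the data layout are illustrative.

```python
def grouped_median(classes):
    """classes: list of (lower, upper, frequency) in ascending order.
    Assumes observations are evenly spaced within the median class."""
    total = sum(f for _, _, f in classes)
    position = (total + 1) / 2          # position of the median item
    cumulative = 0                      # F: frequency below the class
    for lower, upper, f in classes:
        if cumulative + f >= position:
            width = upper - lower
            return lower + (position - (cumulative + 1)) / f * width
        cumulative += f
    raise ValueError("no observations")


# Gross profit as a percentage of sales (Table 3.2.9).
profit_classes = [(0, 10, 21), (10, 20, 32), (20, 30, 43),
                  (30, 40, 34), (40, 50, 23)]

print(round(grouped_median(profit_classes), 2))  # 25.35
```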
Median is not strongly affected by extreme or abnormal
values. In this sense, median is a better average than mean
(as seen in the example). Median is easy to understand and it can
be computed from any kind of data (even for grouped data
with open-ended classes, except when the median
falls in the open-ended class). Median can also be determined
for qualitative data that can be arranged in order.
However, median has some disadvantages. First, it is a
time-consuming process, as the data must be arranged in order
before calculating the median. Second, unlike the mean, the
median is difficult to compute for a data set with a large number
of observations.
ii. The Mode
Mode is defined as the value of the observation of the variable
which occurs most frequently in the data set.
Calculating the Mode from Ungrouped Data
Table 3.2.11 shows the weights of 20 workers of an
organization. The mode of workers weights is 67 kgs as a
maximum number of workers (4 of them) have this weight.
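Finding the mode of ungrouped data amounts to counting how often each value occurs. A minimal sketch using `collections.Counter` from the Python standard library (the variable names are illustrative):

```python
from collections import Counter

# Weights (in kgs) of the 20 workers (Table 3.2.11).
weights = [58, 60, 62, 56, 59, 56, 67, 68, 70, 55,
           67, 58, 59, 60, 69, 67, 67, 63, 61, 70]

# most_common(1) returns the value with the highest count.
mode_value, mode_count = Counter(weights).most_common(1)[0]

print(mode_value, mode_count)  # 67 4
```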
Calculating the Mode from Grouped Data
When the data is grouped in a frequency distribution the
manager must assume that the mode is located in the class
Table 3.2.10: Cumulative Frequency
Gross Profit (%)    No. of Companies (f)    Cumulative Frequency (cf)
0-10 21 21
10-20 32 53
20-30 43 96
30-40 34 130
40-50 23 153
Table 3.2.11: Weights of 20 Workers (in kgs)
58 60 62 56 59 56 67 68 70 55
67 58 59 60 69 67 67 63 61 70
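with the highest frequency. The mode can be found using the
following formula:
$$\text{Mode} = L_{mo} + \frac{d_1}{d_1 + d_2} \times w$$
where, $L_{mo}$ = lower limit of the modal class
$d_1$ = frequency of the modal class minus the frequency of
the class just below it (the preceding, lower-valued class)
$d_2$ = frequency of the modal class minus the frequency of
the class just above it (the following, higher-valued class)
$w$ = width of the modal class.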
Example 3.2.5
Consider the salary example in Table 3.2.12 for computing
mode of that data.
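Solution:
The modal class is 5000-7000, as it has the highest frequency
(162). Hence $d_1$ = 162 - 110 = 52, $d_2$ = 162 - 100 = 62 and
w = 2000.
$$\text{Mode} = 5000 + \frac{52}{52 + 62} \times 2000$$
= 5000 + 912.28
= Rs. 5912.28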
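The grouped-mode formula can be sketched as follows. The sketch rests on a stated assumption: d1 is read as the difference from the preceding, lower-valued class (3000-5000) and d2 as the difference from the following, higher-valued class (7000-9000); swapping the two readings changes the answer. The function name `grouped_mode` and the data layout are illustrative.

```python
def grouped_mode(classes):
    """classes: list of (lower, upper, frequency) in ascending order."""
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))              # modal class
    lower, upper, f_modal = classes[i]
    f_prev = freqs[i - 1] if i > 0 else 0    # lower-valued neighbour
    f_next = freqs[i + 1] if i < len(freqs) - 1 else 0
    d1 = f_modal - f_prev
    d2 = f_modal - f_next
    return lower + d1 / (d1 + d2) * (upper - lower)


# Monthly income classes and frequencies (Table 3.2.12).
income_classes = [
    (1000, 3000, 50), (3000, 5000, 110), (5000, 7000, 162),
    (7000, 9000, 100), (9000, 11000, 83), (11000, 13000, 45),
    (13000, 15000, 25), (15000, 17000, 15), (17000, 19000, 8),
    (19000, 21000, 2),
]

print(round(grouped_mode(income_classes), 2))
```

Because the class below the modal class (110) is more heavily populated than the class above it (100), this reading places the mode in the lower half of the modal class.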
Mode can be used as a measure of central location for
qualitative as well as quantitative data. It is not affected by
extreme values. It can also be used even when the classes
are open ended.
However, mode is not used widely as a measure of central
tendency, as it has a few drawbacks. For example, at times,
Table 3.2.12: Average Monthly Income of 600 employees
Class (Rs)    Frequency    Cumulative frequency
1000-3000 50 50
3000-5000 110 160
5000-7000 162 322
7000-9000 100 422
9000-11000 83 505
11000-13000 45 550
13000-15000 25 575
15000-17000 15 590
17000-19000 8 598
19000-21000 2 600
a data set contains no value that occurs more than once.
Further, all values in a data set might occur equal number of
times i.e., each observation has the same frequency. Another
disadvantage is that some data sets contain two, three or
many modes, making it difficult to interpret them.
Relationship between Mean, Median and Mode
In case of a symmetrical distribution, mean, median and
mode coincide. However, according to Karl Pearson, if the
distribution is moderately asymmetrical, the mean, median
and mode are related in the following manner:
Mean-Median = (Mean-Mode)/3
Thus Mode = 3 Median - 2 Mean
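The step from the first relation to the second can be written out as a short derivation. The numerical figures in the last line (a mean of 50 and a median of 48) are hypothetical values chosen only to illustrate the use of the relation.

```latex
\begin{align*}
\text{Mean} - \text{Median} &= \frac{\text{Mean} - \text{Mode}}{3} \\
3\,\text{Mean} - 3\,\text{Median} &= \text{Mean} - \text{Mode} \\
\text{Mode} &= 3\,\text{Median} - 2\,\text{Mean} \\
\text{e.g. } \text{Mode} &= 3(48) - 2(50) = 44
\end{align*}
```

In this hypothetical case Mean > Median > Mode, which is the ordering of a positively skewed distribution.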
In a positively skewed distribution (skewed to the right), we
have AM > Median > Mode (Refer figure 3.2.1). For a
negatively skewed distribution (skewed to the left), we have
AM < Median < Mode (Refer figure 3.2.2).
Figure 3.2.1
Figure 3.2.2
Section 3
Case Study: Mattel’s Global Expansion
Toys are one of the world’s oldest consumer products. The
traditional toy industry, which was worth \$2 billion to \$3 billion
in 1968, evolved into a global market of over \$61.8 billion in
2007. The US is the largest toys and games market in the
world, accounting for 34.1% of the global market’s value.
Though only 2% of the world’s children reside in the US,
the US’ toys and games market is Mattel, which holds a
7.8% market share, followed by Hasbro with 5.3%.
Mattel is also the world’s largest toy manufacturer and its
best known brands include Barbie, Matchbox, Fisher-Price
and Hot Wheels.
Mattel was founded in 1945 by Harold Matson and Elliot
Handler (hence the name ‘Matt-El’) in a garage workshop
in California. The company started as a picture-frame
manufacturer, but Elliot soon started a side business of
making dollhouse accessories out of picture-frame scraps.
The success of the dollhouse furniture turned the
company’s focus to toys. In 1959, Mattel introduced the Barbie
product line, which remains one of the most successful and
most popular brands even today. Mattel went public in
1960 and throughout the decade the company witnessed
growth through acquisitions of smaller toy manufacturers.
In 1968, Mattel created the first Hot Wheels products,
which eventually became another highly successful brand.
During the 1990s, Mattel merged with the Fisher-Price
company (1993) and acquired Tyco Toys (1997), the
third-largest manufacturer of toys at that time. The Fisher-Price
ing toy company. The deal was referred to as the most
significant acquisition in the toy industry since the acquisition
of Tonka Corp. by Hasbro in 1991. However, as competition
in the toy industry was intense, the sales at Mattel
slumped in 1996 and 1997.
This case study was written by R Muthukumar, IBSCDC. It is intended to be used as the basis for class discussion rather than to illustrate either
effective or ineffective handling of a management situation. The case was compiled from published sources.
Mattel’s sales further dropped in 1998 owing to a massive
recall of its battery-powered cars. By 1998, the company
had sold approximately 10 million battery-powered cars.
However, many consumers began to complain that their
vehicles had caught fire. Subsequently, in November 1998,
the US Consumer Products Safety Commission urged
Fisher-Price to issue a massive recall. An estimated 10
million vehicles were recalled by the company, making this
one of the largest recalls in the history of the US toy
industry. Fisher-Price maintained that the fires were in virtually
every case caused by consumers tinkering with the
engines. The company spent \$30 million on repair of its
recalled products. In the fall of that year, the company took
the first step towards a major reorganization.
Mattel: Towards Developing Markets
Mattel began to sell its products directly to retailers and
wholesalers in Canada and most of the European, Asian
and Latin American countries. Europe is Mattel’s largest
market outside North America. It manufactured toy
products for all segments in both company-owned facilities and
through independent contractors. Mattel’s principal
manufacturing facilities were established in China, Indonesia,
Malaysia, Mexico and Thailand; while the independent
contractors were positioned in the US, Europe, Mexico, the Far
East and Australia.
At present, the company operates in 42 countries and sells
products in more than 150 nations. Mattel’s segments are
separately managed business units, divided on a geo-
graphic basis between domestic and international. The do-
mestic segment of Mattel is further sub-divided into – Mat-
tel Girls & Boys Brands US, Fisher-Price Brands US and
American Girl Brands.
Mattel’s business is divided into two primary sectors: Do-
mestic (North America Region) and International. Mattel
products are sold directly to retailers in most European,
Latin American and Asian countries; while in Australia, Can-
ada and New Zealand, its products are sold through agents
and distributors (Mattel has no direct sales presence). Except for American Girl, Mattel offers all its products worldwide. It tailors its products to regional fads, though quality is compromised in certain countries owing to price sensitivity. The company sets itself apart by establishing close partnerships with its licensors and building
their brands.
Mattel distinguishes itself by producing a wide line of qual-
ity toys. It has outstanding brand name recognition and cus-
tomer loyalty. Mattel turned its attention to its new markets
way back in the 1970s. Since then, the company has been
taking advantage of global distribution and marketing net-
work to bolster sales in Mexico, Italy, Germany and Spain.
Since 2003, Mattel’s sales in the developing markets have
more than doubled (Exhibit I) and its sales of baby swings
and infant rockers in those markets have increased tenfold.
During the period 2006–2007, Mattel’s international sales
increased in comparison to its domestic sales. Particularly,
in Latin America, Mattel saw a rise in its sales by more than
23% in 2007. The company reported that its international
sales accounted for 49% of its gross sales in 2007.
Commenting on its international strategy, the company said
that it will continue to pursue localised and international pro-
grammes that are innovative and boost the growth of the
brand. The company hopes to cash in on countries where
US toys are seen as novelties. According to many experts,
it is very important to adopt local culture for toy companies
like Fisher-Price to attract more customers overseas.
“Fisher-Price is the tip of the spear for Mattel into these de-
veloping markets”,4 says Kevin Curran, Fisher-Price’s sen-
ior vice president and general manager.
If you are the head of operations, how will you analyse the company’s sales performance with the data given? If all the brands are expected to achieve sales growth of 7.25%, 8.2% and 7.15% respectively, what will be the average rate of growth forecast for the next year?
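One way to approach the last question is to treat the overall forecast as a weighted mean of the brand-level growth rates. The sketch below is only illustrative: it assumes that the three rates apply, in the order given, to Mattel Girls & Boys Brands, Fisher-Price Brands and American Girl Brands, and that each brand's 2007 revenue from the segment table serves as its weight. The case does not state either assumption, and the function name `weighted_mean` is hypothetical.

```python
# Illustrative sketch: simple vs. weighted mean of forecast growth rates.
# Assumption: weights are the 2007 domestic segment revenues (USD million).
revenue_2007 = {
    "Mattel Girls & Boys Brands": 1445.0,
    "Fisher-Price Brands": 1511.1,
    "American Girl Brands": 431.5,
}
# Assumption: the three rates map to the brands in the order listed above.
growth_pct = {
    "Mattel Girls & Boys Brands": 7.25,
    "Fisher-Price Brands": 8.2,
    "American Girl Brands": 7.15,
}


def weighted_mean(values, weights):
    """Return sum(w * v) / sum(w)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


brands = list(revenue_2007)
simple = sum(growth_pct[b] for b in brands) / len(brands)
weighted = weighted_mean(
    [growth_pct[b] for b in brands],
    [revenue_2007[b] for b in brands],
)

print(f"Simple mean growth:   {simple:.2f}%")
print(f"Weighted mean growth: {weighted:.2f}%")
```

Under these assumptions the weighted figure is slightly higher than the simple mean, because the brand with the highest forecast also has the largest revenue.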
Footnotes:
1. “Global Toys & Games – Industry Profile”, Datamoni-
tor, January 2008
2. “Toys & Games in the United States”, Datamonitor,
January 2008
3. Casey Nicholas, “Fisher-Price pursues toy sales in
developing markets”, The Financial Times, June 2nd
2008
4. “Fisher-Price pursues toy sales in developing mar-
kets”, op.cit.
Mattel, Inc and Subsidiaries Segment Information (1999-2007)
Segment Revenues (in \$ million, except percentage information)

| | 1999 | 2000 | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 |
|---|---|---|---|---|---|---|---|---|---|
| Domestic | | | | | | | | | |
| Mattel Girls & Boys Brands | 1,835.8 | 1,890.4 | 1,817.3 | 1,790.0 | 1,594.1 | 1,511.6 | 1,364.9 | 1,507.5 | 1,445.0 |
| Fisher-Price Brands | 1,185.5 | 1,233.0 | 1,234.2 | 1,282.2 | 1,265.2 | 1,319.2 | 1,358.6 | 1,471.6 | 1,511.1 |
| American Girl Brands | 298.6 | 324.0 | 340.8 | 350.2 | 344.4 | 379.1 | 436.1 | 440.0 | 431.5 |
| Total Domestic | 3,319.9 | 3,447.4 | 3,392.3 | 3,422.4 | 3,203.7 | 3,209.9 | 3,159.6 | 3,419.1 | 3,387.6 |
| International | 1,556.2 | 1,517.7 | 1,680.3 | 1,890.9 | 2,175.7 | 2,336.2 | 2,463.9 | 2,739.0 | 3,205.3 |
| Gross Sales | 4,876.1 | 4,965.1 | 5,072.6 | 5,313.3 | 5,379.4 | 5,546.1 | 5,623.5 | 6,158.0 | 6,592.9 |
| Sales Adjustments | -373.4 | -399.6 | -384.7 | -428.0 | -419.3 | -443.3 | -444.5 | -507.9 | -622.8 |
| Net Sales from Continuing Operations | 4,502.7 | 4,565.5 | 4,687.9 | 4,885.3 | 4,960.1 | 5,102.8 | 5,179.0 | 5,650.2 | 5,970.1 |

Gross Sales by Geographic Area

| | 1999 | 2000 | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 |
|---|---|---|---|---|---|---|---|---|---|
| Domestic | 3,319.9 | 3,447.4 | 3,392.3 | 3,422.4 | 3,203.7 | 3,209.9 | 3,159.6 | 3,419.1 | 3,387.6 |
| % Change | 1% | 4% | -2% | 1% | -6% | 0% | -2% | 8% | -1% |
| International | 1,556.2 | 1,517.7 | 1,680.3 | 1,890.9 | 2,175.7 | 2,336.2 | 2,463.9 | 2,739.0 | 3,205.3 |
| % Change | -7% | -2% | 11% | 13% | 15% | 7% | 5% | 11% | 17% |
| % of Total Gross Sales | 32% | 31% | 33% | 36% | 40% | 42% | 44% | 44% | 49% |

Source: compiled by author
Chapter 4
Measures of Dispersion
In this chapter we will discuss:
Various Measures of Dispersion
Section 1
Various Measures of Dispersion
In the previous chapter we discussed how one can calculate a single value that represents the characteristics of the entire raw data using three main measures: mean, median and mode. Another important characteristic of a data set is the spread in the data, or how far each element is from some measure of central tendency (average). There are several ways to measure the variability of the data. The most common and most important is the standard deviation, which provides an average distance of each element from the mean, but there are also several other important methods, which are discussed here. They include: range, interquartile range and quartile deviation, mean deviation, variance and standard deviation.
Range
Range is the simplest method of studying dispersion. Range is defined as the difference between the value of the largest observation (L) and the value of the smallest observation (S) present in the data set, i.e.,
Range = L - S
For a grouped frequency distribution, range is defined as
Range = Upper limit of the highest class - Lower limit of the
lowest class.
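The two definitions of range can be sketched in a few lines of Python. The function names `data_range` and `grouped_range` are hypothetical, and the sample values are borrowed from the leave data and class intervals used in the examples later in this chapter.

```python
def data_range(values):
    """Range for ungrouped data: largest observation (L) minus smallest (S)."""
    return max(values) - min(values)


def grouped_range(class_limits):
    """Range for a grouped distribution.

    class_limits is a list of (lower_limit, upper_limit) pairs; the range is
    the upper limit of the highest class minus the lower limit of the lowest.
    """
    lowest = min(lower for lower, _ in class_limits)
    highest = max(upper for _, upper in class_limits)
    return highest - lowest


leave_days = [10, 15, 18, 20, 20, 22, 23, 25, 27, 30]
classes = [(0, 4), (4, 8), (8, 12), (12, 16)]

print(data_range(leave_days))   # 30 - 10 = 20
print(grouped_range(classes))   # 16 - 0 = 16
```

Note that only two numbers enter each calculation, which is exactly why range says nothing about how the remaining observations are scattered.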
Merits and Limitations of Range
Merits:
Range is simple to understand and easy to calcu-
late.
Range is the quickest way to get a measure of dis-
persion, although it is not accurate.
Limitations:
It is not based on all the observations in the data. It
is computed based on highest and lowest values
and ignores the nature of dispersion among other
values of observations in the data set.
It is influenced by extreme values and hence fluctuates
from sample to sample of a population even though
the values that fall in between the highest and lowest
values may be similar.
Range cannot be computed from frequency distribu-
tions with open-end classes.
Range fails to explain about the character of the distri-
bution within two extreme observations (i.e., L and S).
Range is unreliable as a measure of dispersion of the
values within a distribution.
Uses of Range
In spite of the above limitations and shortcomings, range, as a
measure of dispersion, has many applications.
Range is used in industry for the quality control of prod-
ucts without 100% inspection. Range plays an important
role in construction of charts used for quality control. For
example, when the range of weight of a spare exceeds a
particular level, the entire production line is checked to en-
sure pre-specified quality in the production process.
Range is also useful in studying the fluctuations in finan-
cial and share markets.
Interquartile Range and Quartile Deviation
Range as a measure of dispersion has many limitations as it
is based on two extreme observations. It fails to explain the
scatter within the range. So when these extreme observations
are discarded the limited range would be more reliable and
representative of the entire data. The range calculated based
on the middle 50 percent of the observations is called the interquartile range. It is calculated from the observations that remain after discarding one quarter of the observations at the lower end and another quarter of the observations at the upper end of the distribution. Thus the interquartile range is the difference between the third quartile (Q3) and the first quartile (Q1). The quartiles are the highest values in each of the first three parts of the data set when the data set is divided into four equal parts.
Therefore, interquartile range = Q3 - Q1
Figure 4.1.1: Interquartile Range
Figure 4.1.1 shows the concept of interquartile range graphi-
cally. Notice that the observations are divided into four equal
parts (25% each).
Quartile deviation is defined as one half of the interquartile range:
Quartile deviation (Q.D.) = (Q3 - Q1) / 2
Quartile deviation gives the average amount by which the two quartiles differ from the median. In a symmetrical distribution, the quartiles Q3 and Q1 are equidistant from the median, i.e.,
Median - Q1 = Q3 - Median
This difference can be taken as a measure of variation.
In a perfectly symmetrical distribution, median ± quartile deviation covers exactly 50 percent of the observations. As economic data or any other business data is seldom perfectly symmetrical, in practice it covers approximately 50 percent of the observations. A small quartile deviation denotes less variation in the central 50% of the observations, whereas a high quartile deviation indicates large variations.
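A minimal sketch of the interquartile range and quartile deviation in Python follows. It uses `statistics.quantiles` from the standard library (Python 3.8 or later) with its "exclusive" method, which places the quartiles at positions (n + 1)/4 and 3(n + 1)/4 and interpolates between neighbouring observations. This is one of several quartile conventions and is an assumption here; other conventions can give slightly different values. The leave data from the examples later in this chapter is reused as sample input.

```python
import statistics

leave_days = [10, 15, 18, 20, 20, 22, 23, 25, 27, 30]

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median) and Q3.
q1, q2, q3 = statistics.quantiles(leave_days, n=4, method="exclusive")

interquartile_range = q3 - q1
quartile_deviation = interquartile_range / 2

print(f"Q1 = {q1}, median = {q2}, Q3 = {q3}")
print(f"Interquartile range = {interquartile_range}")
print(f"Quartile deviation  = {quartile_deviation}")
```

With this convention the sample gives Q1 = 17.25 and Q3 = 25.5, so the interquartile range is 8.25 and the quartile deviation is 4.125. The extreme values 10 and 30 could be changed without affecting either measure.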
Merits and Limitations of Quartile Deviation
Quartile deviation (Q.D.) has many merits compared to range
and other measures of variation, but it also has some limita-
tions.
Merits:
Q.D. can be used as a measure of variation for open-ended distributions.
Q.D. is a better measure of variation for highly skewed
distribution or distribution with extreme values as Q.D.
is not affected by the presence of extreme values.
Limitations:
As the Q.D is calculated using only 50% of the total ob-
servations, it cannot be regarded as a good measure
of variation.
Q.D. is not a real measure of variation as it does not
measure the spread of observations from the average.
Q.D. is only a positional measure, like range.
Mean Deviation
Mean deviation is obtained by averaging the absolute deviations of the observations from the mean.
Mean deviation for ungrouped data
To compute the mean deviation for ungrouped data, the absolute value of the difference between each observation in the data set and the mean is calculated, i.e., the mean is subtracted from every value in the data set and the positive or negative signs are ignored (everything is considered positive). Finally, all those differences are added and the sum is divided by the number of items in the sample.
$$\text{Mean deviation} = \frac{\sum \lvert X - \bar{X} \rvert}{N}$$
where,
$X$ = value of observation
$\bar{X}$ = mean of observations, and
$N$ = number of observations in the sample
Example 4.1.1
Calculate the mean deviation of the leave patterns of 10 driv-
ers in one year for the values given in Table 4.1.1
Table 4.1.1: Calculation of Mean Deviation of the Leave Patterns of 10 Drivers in One Year

| S. no. | Observation in days ($X$) | Deviation from mean ($X - \bar{X}$) | Absolute deviation ($\lvert X - \bar{X} \rvert$) |
|---|---|---|---|
| 1 | 10 | -11 | 11 |
| 2 | 15 | -6 | 6 |
| 3 | 18 | -3 | 3 |
| 4 | 20 | -1 | 1 |
| 5 | 20 | -1 | 1 |
| 6 | 22 | 1 | 1 |
| 7 | 23 | 2 | 2 |
| 8 | 25 | 4 | 4 |
| 9 | 27 | 6 | 6 |
| 10 | 30 | 9 | 9 |
| $N = 10$ | $\sum X = 210$ | | $\sum \lvert X - \bar{X} \rvert = 44$ |

Mean = 210 / 10 = 21 days
Mean deviation = 44 / 10 = 4.4 days
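The calculation in Table 4.1.1 can be reproduced with a short Python sketch. The function name `mean_deviation` is hypothetical; it simply follows the steps described above (subtract the mean, drop the signs, average).

```python
def mean_deviation(values):
    """Average absolute deviation of the observations from their mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n


leave_days = [10, 15, 18, 20, 20, 22, 23, 25, 27, 30]

mean_leave = sum(leave_days) / len(leave_days)
md_leave = mean_deviation(leave_days)

print(f"Mean = {mean_leave} days")            # 210 / 10
print(f"Mean deviation = {md_leave} days")    # 44 / 10
```

On average, a driver's leave differs from the mean of 21 days by 4.4 days.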

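Mean Deviation for Grouped Data
Mean deviation (M.D.) for grouped data can be calculated about the average (mean) using the following formula:
$$\text{M.D.} = \frac{\sum f_i \lvert x_i - \bar{x} \rvert}{N}$$
where,
$x_i$ = mid value of the i-th class interval
$f_i$ = the corresponding frequency
$\bar{x}$ = mean of the grouped data
$N$ = total frequency $= \sum f_i$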
Example 4.1.2
Compute the mean deviation for the data given in Table 4.1.2.
Table 4.1.2

| Class Interval | 0-4 | 4-8 | 8-12 | 12-16 |
|---|---|---|---|---|
| Frequency | 4 | 2 | 1 | 3 |

Here, in computation,
$$\bar{x} = \frac{\sum f x}{N} = \frac{72}{10} = 7.2$$
Table 4.1.3: Computation of Mean Deviation

| Class Interval | Frequency ($f$) | Mid-value of class interval ($x$) | $f \times x$ | $\lvert x - \bar{x} \rvert$ | $f \lvert x - \bar{x} \rvert$ |
|---|---|---|---|---|---|
| 0-4 | 4 | 2 | 8 | 5.2 | 20.8 |
| 4-8 | 2 | 6 | 12 | 1.2 | 2.4 |
| 8-12 | 1 | 10 | 10 | 2.8 | 2.8 |
| 12-16 | 3 | 14 | 42 | 6.8 | 20.4 |
| | $N = \sum f = 10$ | | $\sum f x = 72$ | | $\sum f \lvert x - \bar{x} \rvert = 46.4$ |

Mean deviation = 46.4 / 10 = 4.64
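The grouped-data computation in Table 4.1.3 can be checked with the following sketch. The helper name `grouped_mean_deviation` is hypothetical, and, as in the table, every observation in a class is assumed to sit at the class mid-value.

```python
def grouped_mean_deviation(midpoints, frequencies):
    """Return (mean, mean deviation) for a grouped frequency distribution."""
    n = sum(frequencies)
    mean = sum(f * x for x, f in zip(midpoints, frequencies)) / n
    md = sum(f * abs(x - mean) for x, f in zip(midpoints, frequencies)) / n
    return mean, md


classes = [(0, 4), (4, 8), (8, 12), (12, 16)]
frequencies = [4, 2, 1, 3]

# Mid-value of each class interval: (lower limit + upper limit) / 2
midpoints = [(lower + upper) / 2 for lower, upper in classes]

grouped_mean, grouped_md = grouped_mean_deviation(midpoints, frequencies)

print(f"Mean = {grouped_mean:.2f}")            # 72 / 10
print(f"Mean deviation = {grouped_md:.2f}")    # 46.4 / 10
```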
Merits and Limitations of Absolute Mean Deviation
Merits:
Absolute mean deviation is simple and easy to under-
stand.
Absolute mean deviation is a more comprehensive
measure of dispersion as it is dependent on all obser-
vations of a distribution.
As it is obtained by taking the average of the devia-
tions of every observation from the mean, it is a true
measure of dispersion.
Limitations:
Absolute Mean deviation is less reliable as it is the
arithmetic mean of the absolute values (ignoring the
positive and negative signs).
Absolute Mean deviation is not conducive to further al-
gebraic treatment.
Absolute Mean deviation cannot be computed for distri-
butions with open-end classes.
Variance
Variance is similar to mean deviation, except that the deviations are squared instead of being taken in absolute value: the sum of the squared distances between the mean and each observation is divided by the total number of observations. Squaring the differences (deviations) makes them all positive.
For Ungrouped Data
$$\sigma^2 = \frac{\sum (x_i - \bar{x})^2}{N}$$
where,
$x_i$ = the value of the i-th observation
$\bar{x}$ = mean of the observations
$N$ = total number of observations
For Grouped Data
$$\sigma^2 = \frac{\sum f_i (x_i - \bar{x})^2}{N}$$
where,
$x_i$ = mid-point of the i-th class interval
$f_i$ = frequency of the i-th class interval
$N$ = total number of observations $= \sum f_i$
Standard Deviation
Standard deviation is the square root of the variance. The stan-
dard deviation is expressed in the same units as those used in
the data set, whereas variance is expressed in squared units. In
the case of both ungrouped and grouped data, the square root of
the respective variances will give the respective standard devia-
tions.
Properties
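The value of the standard deviation remains the same if each observation in a series is increased or decreased by a constant quantity. In statistical language we say that standard deviation is independent of change of origin.
For example, for the observations 3, 10 and 12, the mean is 25/3 = 8.333 and
$$\sigma = \sqrt{\frac{(3 - 8.333)^2 + (10 - 8.333)^2 + (12 - 8.333)^2}{3}} = \sqrt{14.889} = 3.859$$
If we increase the value of each observation by 4.5 we get the observations 7.5, 14.5 and 16.5. The mean becomes 12.833, every deviation from the mean is unchanged, and the standard deviation is again 3.859.
For a given data series, if each observation is multiplied or divided by a constant quantity (change of scale), the standard deviation will also be similarly affected.
For example, multiplying the observations 3, 10 and 12 by 6 gives 18, 60 and 72, whose mean is 50, so that
$$\sigma = \sqrt{\frac{(18 - 50)^2 + (60 - 50)^2 + (72 - 50)^2}{3}} = \sqrt{536} = 23.152 = 6 \times 3.859$$
Thus the standard deviation has also been multiplied by 6.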
The finding holds true even if we were to divide all the observa-
tions by a non-zero constant.
Therefore, the standard deviation is independent of any change
of origin, but is dependent on the change of scale.
Standard deviation is the minimum root-mean-square de-
viation. In other words, the sum of the squares of the de-
viations of items of any series from a value other than the
arithmetic mean would always be greater.
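As it is possible to compute the combined mean of two or more groups, it is also possible to compute the combined standard deviation of two or more groups. For two groups, the combined standard deviation is computed as follows:
$$\sigma_{12} = \sqrt{\frac{n_1 \sigma_1^2 + n_2 \sigma_2^2 + n_1 d_1^2 + n_2 d_2^2}{n_1 + n_2}}$$
where,
$\sigma_{12}$ = combined standard deviation
$\sigma_1$, $\sigma_2$ = standard deviations of the first and second groups
$d_1 = \bar{x}_1 - \bar{x}_{12}$ and $d_2 = \bar{x}_2 - \bar{x}_{12}$
$\bar{x}_1$ = mean of first group
$\bar{x}_2$ = mean of second group
$n_1$ = number of observations in the first group
$n_2$ = number of observations in the second group
$\bar{x}_{12}$ = the combined mean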
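The properties listed above can be checked numerically. The sketch below uses `statistics.pstdev` (the population standard deviation, i.e., division by N as in this chapter) on the observations 3, 10 and 12, and compares the usual two-group combined standard deviation formula with the standard deviation of the pooled data. The second group of values and the helper names `rms_about` and `combined_sd` are hypothetical and chosen only for illustration.

```python
import math
import statistics


def rms_about(values, a):
    """Root-mean-square deviation of the values about an arbitrary point a."""
    return math.sqrt(sum((x - a) ** 2 for x in values) / len(values))


def combined_sd(n1, mean1, sd1, n2, mean2, sd2):
    """Combined standard deviation of two groups from their summaries."""
    combined_mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)
    d1 = mean1 - combined_mean
    d2 = mean2 - combined_mean
    return math.sqrt(
        (n1 * sd1 ** 2 + n2 * sd2 ** 2 + n1 * d1 ** 2 + n2 * d2 ** 2) / (n1 + n2)
    )


obs = [3, 10, 12]
sd = statistics.pstdev(obs)
sd_shifted = statistics.pstdev([x + 4.5 for x in obs])  # change of origin
sd_scaled = statistics.pstdev([6 * x for x in obs])     # change of scale

# Minimum property: deviations about the mean give the smallest RMS value.
rms_mean = rms_about(obs, statistics.mean(obs))
rms_other = rms_about(obs, 8)

# Combined SD: formula on group summaries vs. SD of the pooled observations.
group1 = [3, 10, 12]
group2 = [7, 9, 14, 18]  # hypothetical second group
sd_formula = combined_sd(
    len(group1), statistics.mean(group1), statistics.pstdev(group1),
    len(group2), statistics.mean(group2), statistics.pstdev(group2),
)
sd_pooled = statistics.pstdev(group1 + group2)

print(round(sd, 3), round(sd_shifted, 3), round(sd_scaled, 3))
print(round(rms_mean, 3), round(rms_other, 3))
print(round(sd_formula, 6), round(sd_pooled, 6))
```

The shifted series has the same standard deviation as the original, the scaled series has six times that value, and the formula agrees with the pooled calculation, so the combined figure can be obtained without access to the individual observations.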
Coefficient of Variation (C.V)
The coefficient of variation is a measure of relative dispersion
and is given by:
C.V. = Standard deviation / Mean
This is generally expressed as a percentage,
i.e., C.V. (%) = (Standard deviation / Mean) × 100
Hence the coefficient of variation measures the spread of a set
of data as a proportion of its mean. It is used in problem situa-
tions where we want to compare the variability, homogeneity,
stability, uniformity and consistency of two or more data sets.
The data set for which the coefficient of variation is greater is
said to be more variable i.e., less consistent or less homogene-
ous. On the other hand, if the coefficient of variation is less it is
said to be less variable, i.e., more consistent or more homoge-
neous.
Example 4.1.3
Compute the variance, standard deviation and coefficient of variation given the profitability of 50 companies (Table 4.1.4).
Table 4.1.4: Profitability of 50 Companies

| Profit % ($x_i$) | Number of Companies ($f_i$) |
|---|---|
| 10 | 15 |
| 15 | 10 |
| 20 | 15 |
| 25 | 6 |
| 30 | 4 |

Solution:
Using the column totals worked out in Table 4.1.5,
$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{870}{50} = 17.4$$
$$\sigma^2 = \frac{\sum f_i (x_i - \bar{x})^2}{\sum f_i} = \frac{1962}{50} = 39.24$$
$$\sigma = \sqrt{39.24} = 6.26$$
So the variance of profitability among the 50 companies is 39.24 and the standard deviation is 6.26. Hence
C.V. = 6.26/17.4 = 0.3598 or 35.98%
Video 4.1.1: Range, variance, standard deviation.
Example 4.1.4
A security analyst studied one hundred companies and obtained the following Return on Investment (ROI) data for the year 1992. Calculate the standard deviation and the coefficient of variation in ROI of the companies.
Table 4.1.6: ROI of 100 Companies

| Returns % | 0-10 | 10-20 | 20-30 | 30-40 |
|---|---|---|---|---|
| No. of Companies | 19 | 32 | 41 | 8 |

Solution:
We can find the variability in the ROI of the companies by calculating the standard deviation for the above data.
The steps involved are:
Find the mean for grouped data.
Find the deviations from the mean for grouped data.
Find the squares of the above deviations.
Sum up the squared deviations, taking frequency into account.
Divide the sum by the total frequency and take the square root.
Using the column totals worked out in Table 4.1.7,
$$\bar{x} = \frac{\sum f X}{\sum f} = \frac{1880}{100} = 18.8$$
$$\sigma = \sqrt{\frac{\sum f (X - \bar{x})^2}{\sum f}} = \sqrt{\frac{7756}{100}} = \sqrt{77.56} = 8.81$$
Thus the standard deviation for the return on investment is 8.81%.
In this calculation, we always assume that all the observations in a class interval are located at the mid-point of the class. For example, the first class interval (0 - 10) has mid-point 5 and frequency 19. Hence the assumption is that all the 19 companies have an ROI of 5%.
The coefficient of variation could be computed as:
C.V. = S.D./Mean = 8.81/18.8 = 0.4686 or 46.86%
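The calculations in Examples 4.1.3 and 4.1.4 can be reproduced with one small helper. The function name `grouped_stats` is hypothetical; it follows the grouped-data formulas of this chapter (division by the total frequency N) and, for the ROI data, uses class mid-points as stated in the assumption above.

```python
import math


def grouped_stats(values, frequencies):
    """Return (mean, variance, standard deviation, C.V. in %) for grouped data."""
    n = sum(frequencies)
    mean = sum(f * x for x, f in zip(values, frequencies)) / n
    variance = sum(f * (x - mean) ** 2 for x, f in zip(values, frequencies)) / n
    sd = math.sqrt(variance)
    cv_percent = sd / mean * 100
    return mean, variance, sd, cv_percent


# Example 4.1.3: profit % and number of companies
profit = grouped_stats([10, 15, 20, 25, 30], [15, 10, 15, 6, 4])

# Example 4.1.4: class mid-points of the ROI classes and number of companies
roi = grouped_stats([5, 15, 25, 35], [19, 32, 41, 8])

for name, (mean, variance, sd, cv) in (("Profit", profit), ("ROI", roi)):
    print(f"{name}: mean={mean:.2f} variance={variance:.2f} "
          f"sd={sd:.2f} cv={cv:.2f}%")
```

The coefficients of variation printed here differ from the worked figures in the second decimal place, because the text divides the rounded standard deviations (6.26 and 8.81) by the mean while the code keeps full precision. Either way, the ROI data is relatively more variable than the profitability data.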

Table 4.1.5: Computation of Variance and Coefficient of Variation

| $x_i$ | $f_i$ | $f_i x_i$ | $(x_i - \bar{x})$ | $(x_i - \bar{x})^2$ | $f_i (x_i - \bar{x})^2$ |
|---|---|---|---|---|---|
| 10 | 15 | 150 | -7.4 | 54.76 | 821.40 |
| 15 | 10 | 150 | -2.4 | 5.76 | 57.60 |
| 20 | 15 | 300 | 2.6 | 6.76 | 101.40 |
| 25 | 6 | 150 | 7.6 | 57.76 | 346.56 |
| 30 | 4 | 120 | 12.6 | 158.76 | 635.04 |
| Total | 50 | 870 | | | 1962.00 |

Table 4.1.7: Calculation of Standard Deviation

| Return on investment (%) | Mid-point ($X$) | No. of companies ($f$) | $fX$ | Deviation ($X - \bar{x}$) | $f(X - \bar{x})^2$ |
|---|---|---|---|---|---|
| 0-10 | 5 | 19 | 95 | -13.8 | 3618.36 |
| 10-20 | 15 | 32 | 480 | -3.8 | 462.08 |
| 20-30 | 25 | 41 | 1025 | 6.2 | 1576.04 |
| 30-40 | 35 | 8 | 280 | 16.2 | 2099.52 |
| Total | | 100 | 1880 | | 7756 |

Bienaymé-Chebyshev Rule
This rule was developed by the French statistician I.J. Bienaymé and the Russian mathematician P.L. Chebyshev. According to it, whatever the shape of a distribution (i.e., spread of data), at least 75 percent of the values in the population will fall within 2 standard deviations of the mean and at least 89 percent will fall within 3 standard deviations of the mean.
The rule states that the percentage of data observations lying within ± k standard deviations of the mean is at least
$$\left(1 - \frac{1}{k^2}\right) \times 100\%$$
This formula applies to distances greater than one standard deviation about the mean, i.e., k must be greater than 1. For example, k = 2 gives 1 - 1/4 = 75 percent and k = 3 gives 1 - 1/9, or about 89 percent.
In the case of a symmetrical bell-shaped curve, we can say that:
Approximately 68 percent of the observations in the population fall within ±1 standard deviation from the mean.
Approximately 95 percent of the observations in the population fall within ±2 standard deviations from the mean.
Approximately 99.7 percent of the observations in the population fall within ±3 standard deviations from the mean.
The diagrammatic representation of the location of observations around the mean of a bell-shaped frequency distribution is given in Figure 4.1.2.
Figure 4.1.2: Diagrammatic representation of the Bienaymé-Chebyshev Rule for a bell-shaped curve
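The Bienaymé-Chebyshev bound can be sketched in code and compared with what is actually observed in a data set. The function names `chebyshev_lower_bound` and `share_within` are hypothetical, and the skewed sample is invented purely for illustration.

```python
import math


def chebyshev_lower_bound(k):
    """Minimum proportion of observations within k standard deviations (k > 1)."""
    if k <= 1:
        raise ValueError("the rule is informative only for k > 1")
    return 1 - 1 / k ** 2


def share_within(values, k):
    """Observed proportion of values within k population SDs of the mean."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    return sum(1 for x in values if abs(x - mean) <= k * sd) / n


# Hypothetical, strongly skewed sample (one large value).
sample = [1, 1, 2, 2, 3, 3, 4, 5, 8, 40]

for k in (2, 3):
    bound = chebyshev_lower_bound(k)
    observed = share_within(sample, k)
    print(f"k={k}: at least {bound:.1%} guaranteed, {observed:.1%} observed")
```

Even for this lopsided sample the observed share is never below the guaranteed minimum. The bound is deliberately conservative: for bell-shaped data the observed shares are much higher, as the 68-95-99.7 figures above indicate.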
REVIEW 4.1
Question 1 of 12
Which of the following is a measure of central
tendency?
A. Median
B. Mode
C. Geometric mean
D. All the above
SECTION 2
CASE STUDY: MATTEL’S GLOBAL EXPANSION
Refer to the case study in Chapter 3.
Chapter 5
Concepts of Probability
In this chapter we will discuss:
Basic Probability Concepts
Types of Probability
Probability Rules
Bayes’ Theorem
Case Study: Mitra Insurance Company
Case Study: Ram Publishers
Section 1
Basic Probability Concepts
The concept of probability originated in the seventeenth
century and has become one of the most fascinating
subjects in the recent years. Probability has gained a lot of
importance and the mathematical theory of probability has
become the basis for statistical applications in the areas of
management, space technology, atomic physics, and the
like. In fact, most people use probability in their day-to-day lives without being aware of it. Statements like “It may rain today”, “Probably I will continue with the same job”, “India might win the cricket series against Australia,” etc., are examples of the usage of probability in day-to-day life. There are many situations in which a decision maker is uncertain as to what will happen after the decisions are made. The theory of probability is of great help in all such areas. In particular, it enables a person to make ‘educated guesses’ on matters where either full facts are not known or there is uncertainty about the outcome. The probability formulae and techniques were developed by Jacob Bernoulli, De Moivre, Thomas Bayes and Joseph Lagrange. Later, Pierre-Simon Laplace unified all these early ideas and compiled the first general theory of probability. Even though volumes have been written on probability, the controversies concerning the concepts of probability theory continue.
The concept of probability was used by gamblers during the
early days in games of chance such as throwing a die,
drawing a card from the deck or tossing a coin. In these
games of chance, there is an uncertainty regarding the face
of the die that will appear in a throw or the card that will
appear in a draw or the face of a coin that will appear when
it is tossed. Although there is an uncertainty concerning the
outcome of any particular throw or any particular drawing,
there is a predictable long-term outcome. For instance, if a fair die is thrown many times, experimental studies have shown that each number appears in about one sixth of the throws (as the die has 6 faces).
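The long-run regularity described above can be illustrated with a small simulation. This is only a sketch: the function name `roll_frequencies`, the number of throws and the fixed seed are arbitrary choices, and Python's pseudo-random generator stands in for a fair physical die.

```python
import random
from collections import Counter


def roll_frequencies(n_rolls, seed=0):
    """Simulate n_rolls throws of a fair die; return each face's relative frequency."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = Counter(rng.randint(1, 6) for _ in range(n_rolls))
    return {face: counts[face] / n_rolls for face in range(1, 7)}


frequencies = roll_frequencies(60_000)

for face, freq in frequencies.items():
    print(f"face {face}: {freq:.4f}  (1/6 = {1 / 6:.4f})")
```

No single throw can be predicted, yet over many throws each face settles close to one sixth of the total.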
Basic Probability Concepts
Experiment
Any operation / process that results in two or more
outcomes is called an experiment.
Examples of an experiment:
Rolling an unbiased die is an experiment, where the
number that is to appear on the face of the die is
unpredictable and subject to change.
Tossing a fair coin is an experiment, where the
outcome head or tail is unpredictable.
Random Experiment
Any well-defined process of observing a given chance phenomenon through a series of trials that are finite or infinite, each of which leads to a single outcome, is known as a random experiment.
Examples of random experiment:
Drawing a card from a pack of 52 cards. This is also a
chance phenomenon with only one outcome.
Drawing a ball from a bag containing a given number of
red, blue and white balls. This is also a chance
phenomenon with only one outcome.
A random experiment is different from experiments under controlled conditions (for example, an experiment in a physical laboratory) because the observation in a random experiment involves chance phenomena and is not performed under controlled conditions.
Possible Outcome
The result of a random experiment is called an outcome.
For example, picking a card from a pack of 52 cards and
getting an ace or a Jack or a Queen or a King or any other
card is an outcome.
Event
An event is one or more possible outcomes of an
experiment or a result of a trial or an observation. In other
words, an event is used to denote a phenomenon that
occurs with every realization of a set of conditions.
Elementary Event / Outcome
A simple or elementary event is a single possible outcome
of an experiment. A simple event cannot be further
subdivided into a combination of other events.
Sample Space
A collection of all possible elementary events of an
experiment is called Sample Space.
Example
In throwing a die, the event of getting a six (6) is a simple event. The Sample Space consists of all possible elementary outcomes of this experiment, i.e., {1, 2, 3, 4, 5, 6}.
Compound Event
When two or more events occur in connection with each
other, then their simultaneous occurrence is called a
compound event. The compound event is an aggregate of
simple events.
Example
When we roll two dice, then the event of getting a six on
either the first or second die is a compound event.
Favorable Event
The number of cases favorable to an event in a trial is the
number of outcomes that result in the happening of a
particular event.
Examples
In drawing a card from a pack of 52 cards, the number of favorable cases for drawing an ace is 4, for drawing a spade is 13 and for drawing a black card is 26.
In throwing three dice, the cases favorable to getting a sum of 4 are (1, 1, 2), (1, 2, 1) and (2, 1, 1), i.e., three favorable outcomes in total.
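Sample spaces, compound events and favorable cases such as those above can be enumerated directly. The sketch below uses `itertools.product` from the Python standard library; the variable names and the rank and suit labels are illustrative only.

```python
from itertools import product

faces = range(1, 7)

# Sample space for two dice: all ordered pairs (first die, second die).
two_dice = list(product(faces, repeat=2))

# Compound event: a six on either the first or the second die (or both).
six_on_either = [outcome for outcome in two_dice if 6 in outcome]

# Favorable cases for a sum of 4 when three dice are thrown.
sum_of_four = [o for o in product(faces, repeat=3) if sum(o) == 4]

# A pack of 52 cards as (rank, suit) pairs.
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "clubs", "hearts", "diamonds"]
deck = list(product(ranks, suits))

aces = [card for card in deck if card[0] == "A"]
spades = [card for card in deck if card[1] == "spades"]
black_cards = [card for card in deck if card[1] in ("spades", "clubs")]

print(len(two_dice), len(six_on_either))            # 36 outcomes, 11 with a six
print(sum_of_four)                                  # the three favorable triples
print(len(aces), len(spades), len(black_cards))     # 4, 13, 26
```

Counting the compound event gives 11 rather than 12 outcomes because (6, 6) satisfies both conditions and is counted once.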
Mutually Exclusive Events
Two events are said to be mutually exclusive or
incompatible if the happening of any one of them
precludes the happening of the other i.e., both the events
cannot happen simultaneously in a single trial or, the
happening of one prevents the happening of the other
and vice-versa.
Examples
In throwing a die, the events of getting each of the six
faces numbered 1 to 6 are mutually exclusive since if
any one of these faces comes, the possibility of others,
in the same trial is ruled out.
If a single coin is tossed, head can be up or tail can be up; both cannot be up at the same time.
Mutually exclusive events are those which do not overlap when represented in Venn diagrams (see Gallery 2.1.1).
Gallery 2.1.1: Mutually Exclusive Events
Dependent and Independent Events
Two or more events are said to be independent if the happening of any one of them is not affected by knowledge of the occurrence of any number of the remaining events. The question of dependence or independence of events is relevant when experiments are consecutive and not simultaneous.
Examples
In tossing an unbiased coin, a trial is not affected by the result of the previous or subsequent trials. The events are therefore independent.
If a card is drawn from a pack of 52 well-shuffled cards, only 51 cards are left. If the first card is put back before a second card is drawn, the pack again has 52 cards and the trials are independent. However, if the first card is not replaced, the composition of the pack is changed, the probability for the second card is affected, and thus the second event is dependent on the previous trial.
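The card example can be made concrete by computing the chance that the second card is an ace, given that the first card drawn was an ace. The sketch below uses exact fractions; the function name `p_second_ace_given_first_ace` is hypothetical, and the classical ratio of favorable to total cases is assumed.

```python
from fractions import Fraction


def p_second_ace_given_first_ace(with_replacement):
    """Probability that the second card is an ace when the first card was an ace."""
    aces, cards = 4, 52
    if not with_replacement:
        # The first ace stays out of the pack: one ace and one card fewer.
        aces -= 1
        cards -= 1
    return Fraction(aces, cards)


p_with = p_second_ace_given_first_ace(with_replacement=True)
p_without = p_second_ace_given_first_ace(with_replacement=False)

print(p_with)      # 4/52 reduces to 1/13
print(p_without)   # 3/51 reduces to 1/17
```

With replacement the result of the first draw leaves the probability unchanged (independent trials); without replacement the probability changes, which is what makes the second draw dependent on the first.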
Exhaustive Events
The total number of possible outcomes in any trial is known
as exhaustive events or exhaustive cases.
Examples
In tossing a fair coin, there are two possible outcomes,
head and tail. The list of these outcomes is exhaustive
since the result of any toss must be either head or tail, if
the possibility of the coin standing on an edge is ignored.
The two outcomes are also mutually exclusive.
For throwing two dice, the exhaustive number of cases is $6^2 = 36$. In general, for throwing ‘n’ dice, the exhaustive number of events is $6^n$. This is because any of the six numbers from 1 to 6 on the first die may be associated with any of the six numbers on the other die. All the 36 outcomes are mutually exclusive. The sum of the probabilities for mutually exclusive and collectively exhaustive events should be equal to one.
Figure 2.1.1: Exhaustive events
Equally Likely Events
Events are said to be equally likely if, taking into consideration all the relevant evidence, there is no reason to expect one in preference to the others. In other words, when no event occurs more often than the others, they are said to be equally likely events.
Examples
In throwing an unbiased die, the outcome of a number
from 1 to 6 is equally likely.
In picking a card from a pack of 52 cards (with
replacement), each card can be picked up equally often.
When an unbiased coin is tossed, the chance of getting
either head or tail is equal.
Complementary Events
A complementary event consists of all the outcomes of an experiment that are unfavorable to a given event. Suppose ‘E’ is an event made up of the favorable outcomes in the experiment; then the complementary event, denoted by $\bar{E}$, is made up of the unfavorable outcomes in that experiment. The events E and $\bar{E}$ are mutually exclusive and exhaustive.
Examples
In drawing a card from a pack of 52 cards, the event of
getting an ace of diamond is only one and that of getting
the complementary i.e., unfavorable event is 51.
In throwing a die, the favorable event of getting a face
with number 1 is 1 and the unfavorable event of getting it
is 5.
The sum of the probabilities of an event and its
complementary event is one.
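An event and its complement can be modelled as sets, which makes the "mutually exclusive and exhaustive" property easy to check. The sketch below is illustrative: the rank and suit labels are arbitrary, and equally likely outcomes are assumed when the counts are turned into probabilities.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "clubs", "hearts", "diamonds"]
sample_space = set(product(ranks, suits))

event = {("A", "diamonds")}          # E: drawing the ace of diamonds
complement = sample_space - event    # complement of E: any other card

# Mutually exclusive: no outcome belongs to both sets.
is_mutually_exclusive = event.isdisjoint(complement)
# Exhaustive: together the two sets make up the whole sample space.
is_exhaustive = (event | complement) == sample_space

# Assuming equally likely outcomes, probability = favorable / total.
p_event = Fraction(len(event), len(sample_space))
p_complement = Fraction(len(complement), len(sample_space))

print(len(event), len(complement))       # 1 and 51
print(is_mutually_exclusive, is_exhaustive)
print(p_event + p_complement)            # 1
```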
Section 2
Types of Probability
There are four basic ways of classifying probability based on
the conceptual approaches to the study of probability theory.
There is disagreement among the experts regarding the
appropriate approach of probability. The basic approaches
are:
Classical approach
Relative frequency approach
Subjective approach
Axiomatic approach
Classical Approach
The classical approach is based on the assumption that each outcome is equally likely to occur. This is an a priori assumption (the term a priori refers to something that is known by reason alone) and the probability based on this assumption is known as a priori probability. This approach employs abstract mathematical logic and hence is also called ‘abstract’ or ‘mathematical’ probability. This is the reason for the considerable use of familiar objects like cards, coins and dice, for which the probability can be stated before picking a card, tossing a coin or throwing a die, respectively.
Definition
If a random experiment results in ‘N’ exhaustive, mutually
exclusive, and equally likely outcomes, out of which ‘f’ are
favorable to the happening of an event ‘E’, then the
probability of occurrence of E, usually denoted by P(E) is
given by
P (E) = f / N

James Bernoulli was the first man to obtain a quantitative
measure of uncertainty and the above definition was given
by him.
The probability that the event ‘E’ will not occur (i.e., the
event Ē complementary to E) is given by
P (Ē) = 1 – P (E) = 1 – f / N = (N – f) / N
If for an event E, P (E)= 0 then, the event is called an
impossible event and if P (E) = 1 then the event is called a
certain event.
The classical approach can be illustrated for the tossing of a coin
or a die. Suppose that the probability of getting a head on a
single toss is to be calculated; then, using formal terms,
P (Head) = f / N = 1 / 2
since one of the two equally likely outcomes is favorable.
Similarly, if the probability of getting ‘3’ on a single throw of
a die is to be calculated, then, using formal terms,
P (3) = f / N = 1 / 6
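The classical definition P (E) = f / N can be sketched in a few lines of Python. This is only an illustration: the function name classical_probability and the way outcomes are labelled are assumptions made here, not part of the text.

```python
from fractions import Fraction


def classical_probability(favorable, sample_space):
    """Return P(E) = f / N, assuming all outcomes are equally likely."""
    favorable = set(favorable)
    sample_space = set(sample_space)
    if not favorable <= sample_space:
        raise ValueError("favorable outcomes must belong to the sample space")
    return Fraction(len(favorable), len(sample_space))


coin = {"H", "T"}           # N = 2 equally likely outcomes
die = {1, 2, 3, 4, 5, 6}    # N = 6 equally likely outcomes

p_head = classical_probability({"H"}, coin)   # f = 1, N = 2
p_three = classical_probability({3}, die)     # f = 1, N = 6
p_not_three = 1 - p_three                     # complementary event

print(p_head, p_three, p_not_three)
```

Exact fractions are used instead of floating-point numbers so that the results read the same way as the hand calculation.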

Limitations of classical approach to probability
The limitations of this approach are:
The classical definition is applicable only when the outcomes
of a trial are equally likely or equally probable. For instance, the
probability that a candidate, attending an interview, will
succeed is not 50% since the two possible outcomes
viz. success and failure are not equally likely.
The classical definition is applicable only when the
number of exhaustive cases in a trial is finite.
The classical definition is applicable only when the
events are mutually exclusive.
Thus the classical approach to probability is useful in card
games, dice games, tossing coins and the like, but has
serious problems when it is applied to less orderly decision
problems that are encountered in the area of management.
Probabilities of occurrences such as an employee
resigning from a job before his/her retirement age or the
delay in delivery of a product to a nearby customer cannot
be predicted using this approach.
Relative Frequency Approach
The relative frequency of occurrence approach defines
probability as:
The observed relative frequency of an event in a very
large number of trials, when the conditions are stable
(i.e., the proportion of times that an event occurs in the
long run).
In this approach, the probability of happening of an event is
calculated knowing how often the event has happened in
the past. In other words, this method uses the relative
frequencies of past occurrences as probabilities. For
instance, suppose that an organization knows from the
past data that about 25 of its 300 employees entering every
year leave the organization due to good opportunities
elsewhere. Then the organization can predict the
probability of the employee turnover for this reason as:
25 / 300 = 1/12 = 0.083
Another characteristic of probabilities established by the
relative frequency of occurrence approach can be
illustrated by tossing a fair coin 1000 times. In this case it is
found that the proportion of heads (or of tails) may differ
noticeably from 0.5 in the first few tosses, but as the number of
tosses increases, the proportions of heads and of tails both
settle close to 0.5, so that the probability of the
event showing a head is 0.5 and that of the event showing a tail is
0.5. Thus accuracy is gained as the experiment is repeated
and the number of observations grows. But the limitation
of this approach is the time and cost consumed by
such large repetitions and additional observations.
Moreover, a probability predicted using this approach
can be seriously misleading if the prediction is not based on
sufficient data.
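The behaviour of the relative frequency as the number of tosses grows can be explored with a small simulation. The sketch below assumes a fair coin simulated with Python's pseudo-random generator; the function name and the fixed seed are illustrative choices, and the exact proportions printed depend on the seed.

```python
import random


def relative_frequency_of_heads(n_tosses, seed=0):
    """Simulate n_tosses of a fair coin and return the proportion of heads."""
    rng = random.Random(seed)  # fixed seed so that a run can be repeated
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses


# The proportion of heads is typically erratic for few tosses
# and settles near 0.5 as the number of tosses grows.
for n in (10, 100, 1000, 100_000):
    print(n, relative_frequency_of_heads(n))
```

The same idea applies to the employee turnover illustration: 25 / 300 is a relative frequency that is used as an estimate of the underlying probability.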
Subjective Approach
The approach was introduced by Frank Ramsey in 1926.
Subjective probabilities are those assigned to events by the
manager or the researcher based on past experience
or on the evidence available. It may be
an educated guess or intuition. At higher levels of
management, where a decision is important, specific
and often unique, managers use subjective probability.
Axiomatic Approach
According to the axiomatic approach, probability is a number
assigned to the occurrence of an event in a sample
space. Let S be a sample space consisting of all possible
elementary outcomes of a random experiment, i.e.,
S = {s_1, s_2, ..., s_n}, assuming n elementary
outcomes for the experiment.
Then,
i. The probability of the entire sample space S is 1,
i.e., P(S) = 1.
ii. For each i, 0 ≤ P(s_i) ≤ 1.
iii. For i ≠ j, P(s_i and s_j) = 0.
iv. ∑ P(s_i) = 1, where the sum is taken over i = 1, ..., n.
An event A is a collection of those elementary outcomes
meeting the requirements of the event. Clearly, the
probability of the event A must be greater than or equal to
0 and less than or equal to 1 or 100%.
i.e., 0 ≤ P(A) ≤ 1.
If A and B are mutually exclusive events, then the probability
of (A or B) is equal to the sum of the probabilities of A and B.
P (A or B) = P (A) + P (B) because P (A and B) = 0 as A and B
are mutually exclusive.
Two events A and B are mutually exclusive if the occurrence
of one implies the non-occurrence of the other. Hence
obtaining a head on tossing a coin and obtaining a tail are
mutually exclusive events.
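The axioms can be checked mechanically for any proposed assignment of probabilities to elementary outcomes. The following sketch assumes the sample space is given as a Python dictionary from outcome to probability; the helper names are hypothetical.

```python
from fractions import Fraction


def is_valid_assignment(probabilities):
    """Check axioms ii and iv: each P(s_i) lies in [0, 1] and they sum to 1."""
    values = list(probabilities.values())
    return all(0 <= p <= 1 for p in values) and sum(values) == 1


def event_probability(event, probabilities):
    """P(A) is the sum of the probabilities of the elementary outcomes in A."""
    return sum(probabilities[outcome] for outcome in event)


# Tossing a fair coin: two elementary outcomes.
coin = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
heads, tails = {"H"}, {"T"}

# Heads and tails share no outcome, so they are mutually exclusive
# and P(A or B) = P(A) + P(B).
p_union = event_probability(heads | tails, coin)
print(is_valid_assignment(coin), p_union)
```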
Section 3
Probability Rules
For Mutually exclusive events
If events A, B and C are mutually exclusive, the probability that
one of them occurs is the sum of their individual probabilities.
This can be represented by the Venn diagram as shown in
Figure 2.1.2.
P (A or B or C) = P (A) + P (B) + P (C)
Suppose A = getting 1 on throwing the die
B = getting 2 on throwing the die
C = getting 3 on throwing the die
As there are six possible equally likely outcomes on
throwing the die,
P (A or B or C) = P (A) + P (B) + P (C) = 1/6 + 1/6 + 1/6 = 3/6 = 1/2
For Non-mutually exclusive events
If two events are not mutually exclusive, the probability of
at least one of them occurring is the sum of the marginal
probabilities of the events minus the joint probability of the
occurrence of the events (refer to the multiplication rule for an
explanation of marginal and joint probability).
P (A or B) = P (A) + P (B) – P (A and B)
where A and B are not mutually exclusive events.
Figure 5.3.1: Rules of Probability
Example 5.3.1
The Warwick Systems Company markets personal
computers (See Table 5.3.1). Some computers have two disk
drives (A) and some have one disk drive (B). Another feature
of these machines is the capacity in terms of K (kilo) bytes –
that is, whether they have 256K or 128K capacity. Presently,
the firm’s finished goods inventory consists of 300 machines
equipped with varying features (see Table 5.3.1). At any time,
the Warwick Systems Company may receive an order for a
machine or machines with specific features. If Warwick has a
sufficient number of machines to satisfy its customers, the
customers will continue to order machines from Warwick. But
if Warwick cannot satisfy its customers’ needs, they will
probably order machines elsewhere. Hence the
management of Warwick wishes to know the likelihood that
its inventories contain machines with desirable features.
In the above example, the sample space S is the set of all
machines in inventory. What is the probability of a random
selection of a two-disk drive machine from inventory or P(A)?
Also, find the probability of randomly selecting a two-disk
drive machine with 256K capacity.
Solution:
Let us represent two disk drive machines by A and one disk
drive machine by B. We will represent 256K by C and 128K
by D.
Table 5.3.1: Inventory of Warwick Systems Company
                          2DD (A)   1DD (B)   Total
With 256K capacity (C)      100        50      150
With 128K capacity (D)      100        50      150
Total                       200       100      300
Figure 5.3.2: Non-Exclusive Events
The probability of randomly selecting a two-disk drive
machine is:
P(A) = 200/300 = 0.667
The probability of randomly selecting a machine with 256K
capacity is:
P(C) = 150/300 = 0.5
Each of the above probabilities is designated as a marginal
or unconditional probability. Events A and C are not
mutually exclusive since a machine may have both
characteristics. The probability of a machine having two
disk drives or having 256K capacity involves the addition
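rule with a twist. Since A and C are not mutually exclusive
events, machines that have both features would otherwise be
counted twice, so the joint probability must be subtracted.
Hence, the probability of A or C is:
P (A or C) = P (A) + P (C) – P (A and C)
= 200/300 + 150/300 – 100/300 = 250/300 = 0.833
Similarly, the probability of A or D is:
P (A or D) = P (A) + P (D) – P (A and D)
= 200/300 + 150/300 – 100/300 = 250/300 = 0.833
Event (A or C) includes all elements except the 50
elements of B that are elements of neither A nor C.
The probability of a machine having features B and D is:
P (B and D) = 50/300 = 0.167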
The probability of the complement of (A or C) is P (B and
D). These two events account for all 300 computers.
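The marginal, joint and "or" probabilities of the Warwick example can be recomputed directly from the inventory counts. The sketch below is one possible way to organise the calculation; the dictionary layout and the helper probability are assumptions made for illustration.

```python
from fractions import Fraction

# Inventory counts from Table 5.3.1, keyed by (disk drives, capacity).
inventory = {
    ("A", "C"): 100,  # two disk drives, 256K
    ("B", "C"): 50,   # one disk drive, 256K
    ("A", "D"): 100,  # two disk drives, 128K
    ("B", "D"): 50,   # one disk drive, 128K
}
total = sum(inventory.values())


def probability(predicate):
    """Probability that a randomly selected machine satisfies the predicate."""
    count = sum(n for key, n in inventory.items() if predicate(*key))
    return Fraction(count, total)


p_a = probability(lambda drive, cap: drive == "A")                      # marginal
p_c = probability(lambda drive, cap: cap == "C")                        # marginal
p_a_and_c = probability(lambda drive, cap: drive == "A" and cap == "C") # joint
p_a_or_c = p_a + p_c - p_a_and_c                                        # addition rule
p_b_and_d = probability(lambda drive, cap: drive == "B" and cap == "D")

print(p_a, p_c, p_a_and_c, p_a_or_c, p_b_and_d)
```

Counting the machines that satisfy "A or C" directly gives the same value as the addition rule, and the remainder is P (B and D), the complement.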
Example 5.3.2
Consider a bag containing 4 white and 5 black balls. If a
man draws 3 balls at random, without replacement, what
is the probability that all three are black?
Solution:
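The total number of ways in which 3 balls can be drawn
from the 9 balls is 9C3 = 84, and
the number of ways of drawing 3 black balls from the
5 black balls is 5C3 = 10;
therefore the probability of drawing 3 black balls is given
by:
P (all three black) = 5C3 / 9C3 = 10/84 = 5/42 ≈ 0.119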

Example 5.3.3
Consider a bag containing 5 white and 7 black balls. If
two balls are drawn at random without replacement, what
is the probability that one is white and the other is black?
Solution:
P (one is white and the other black) = (5C1 × 7C1) / 12C2
= (5 × 7) / 66 = 35/66 ≈ 0.530
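Both ball-drawing examples reduce to counting combinations, which can be checked with Python's math.comb (available from Python 3.8). The variable names below are illustrative.

```python
from fractions import Fraction
from math import comb

# Example 5.3.2: 4 white and 5 black balls, 3 drawn without replacement.
p_three_black = Fraction(comb(5, 3), comb(9, 3))

# Example 5.3.3: 5 white and 7 black balls, 2 drawn without replacement.
p_one_white_one_black = Fraction(comb(5, 1) * comb(7, 1), comb(12, 2))

print(p_three_black, float(p_three_black))
print(p_one_white_one_black, float(p_one_white_one_black))
```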
Conditional Probability: Independent Events
If the probability of an event is subject to a restriction on the
sample space, the probability is said to be conditional.
Conditional probability is the probability of the occurrence of
an event, say A, subject to the occurrence of a previous
event, say B. We define the conditional probability of event A,
given that B has occurred as P (A|B). If A and B are
independent events, P (A|B) is simply P (A), the probability of
event A, because independent events are those whose
probabilities are in no way affected by the occurrence of each
other:
P (A|B) = P (A) or P (A and B) = P (A) × P (B)
In other words, two events A and B are said to be
independent if the probability of happening or not happening
of an event is not affected by the probability of happening or
not happening of the other, i.e., probability of both A and B
occurring is equal to the product of probability of A occurring
and probability of B occurring.
Let us take the example of a true-false test. As the
answers are independent of each other, we can say that the
probability of success on the second answer, given that the
first answer is a success, is simply the probability of
success on the second answer, i.e.,
P (success on second answer | success on first answer)
= P (success on second answer)

Conditional Probability: Dependent Events
We can define the conditional probability of event A, given
that event B has occurred, when A and B are dependent
events, as the ratio of the number of elements common to
both A and B to the number of elements in B:
P (A | B) = P (A and B) / P (B)

Example 5.3.4
The data regarding the membership of workers is given
in Table 5.3.2. Calculate the conditional probability that a
worker is a member of a labor organization given that he is
working in a non-agricultural industry.
Table 5.3.2: Membership in Labor Organizations
Membership Status                                           Non-agricultural   Agricultural      Total
                                                            Industries (B1)    Industries (B2)
Members of labor organizations (A1)                              20,044               51         20,095
Non-members represented by labor organizations (A2)               2,394                4          2,398
Non-members not represented by labor organizations (A3)          63,586            1,400         64,986
Total                                                            86,024            1,455         87,479
Solution:
Let A1 denote members of labor organizations. The
probability of an employed worker being a member of a labor
organization (event A1) is:
P (A1) = 20,095 / 87,479 = 0.230
The probability of a worker being employed in a
non-agricultural industry (event B1) is:
P (B1) = 86,024 / 87,479 = 0.983
Now, we wish to determine the probability that a worker is a
member of a labor organization given that the worker is
employed in a non-agricultural industry. So we must calculate
the conditional probability of event A1 occurring given that
event B1 has occurred. The formula for the conditional
probability is:
P (A1 | B1) = P (A1 and B1) / P (B1)
The probability of a worker being both a member of a labor
organization and employed in a non-agricultural industry is:
P (A1 and B1) = 20,044 / 87,479 = 0.229
The conditional probability is then computed as:
P (A1 | B1) = 0.229 / 0.983 = 0.233
The probability is 0.233 that a worker is a member of a labor
organization given that the worker is in a non-agricultural
industry.
Note that this probability can also be computed directly from
the data in the Table. The conditional probability is
P (A1 | B1) = 20,044 / 86,024 = 0.233
The answer is the same as computed by using the formula for
conditional probability.
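The two routes to the conditional probability (the formula and the direct count) can be compared in code. The sketch below stores the counts of Table 5.3.2 in a dictionary; this layout and the variable names are assumptions made for illustration.

```python
from fractions import Fraction

# Counts from Table 5.3.2, keyed by (membership status, industry).
membership = {
    ("A1", "B1"): 20044, ("A1", "B2"): 51,
    ("A2", "B1"): 2394,  ("A2", "B2"): 4,
    ("A3", "B1"): 63586, ("A3", "B2"): 1400,
}
total = sum(membership.values())

# Marginal probability of B1 and joint probability of (A1 and B1).
b1_count = sum(n for (status, industry), n in membership.items() if industry == "B1")
p_b1 = Fraction(b1_count, total)
p_a1_and_b1 = Fraction(membership[("A1", "B1")], total)

# Conditional probability by the formula P(A1 | B1) = P(A1 and B1) / P(B1).
p_a1_given_b1 = p_a1_and_b1 / p_b1

# The same probability read directly from the B1 column.
p_direct = Fraction(membership[("A1", "B1")], b1_count)

print(round(float(p_a1_given_b1), 3), round(float(p_direct), 3))
```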
Multiplication Rule
Dependent events
The joint probability of two events A and B which are
dependent is equal to the probability of A multiplied by the
probability of B given that A has occurred.
P (A and B) = P (A) P (B | A)
or P (B and A) = P (B)P (A | B)
This formula is derived from the formula for the conditional
probability of dependent events, P (B | A) = P (A and B) / P (A),
which on rearranging gives
P (A and B) = P (B | A) × P (A)
Joint probability of several dependent events is equal to the
product of the probabilities of occurrence of the preceding
outcomes in the sequence.
P (A and B and C...) = P (A)P (B | A) P (C | A and B)....
The marginal probability of a simple event, in the case of
dependent events, is just the sum of the probabilities of all
the joint events in which the simple event occurs.
Example 5.3.5
A study of an insurance company shows that the probability of
an employee being absent on any given day P (A) is 0.1.
Given that an employee is absent, the probability of that
employee being absent a second day in succession P (B |
A) is 0.4. Find the probability of the employee being
absent on two successive days.
Solution:
Events A and B are dependent events because B cannot
occur unless event A has occurred. The probability of an
employee being absent on two successive days:
P (A and B) = P (A) P (B | A)
= (0.1) x (0.4) = 0.04
Thus the probability of an employee being absent on two
successive days is 0.04 or 4% of the time.
Example 5.3.6
Let us consider a project which involves an outlay of
Rs.1,00,000. The cash inflows expected to be generated by
the project are shown in Table 5.3.3. From the table,
we find that there are eight possible cash flow
streams. The first cash flow stream consists of Rs.30,000
in year 1, Rs.30,000 in year 2 and Rs.35,000 in year 3,
the second cash flow stream consists of Rs.30,000 in
year 1, Rs.30,000 in year 2 and Rs.40,000 in year 3, and so
on. The probabilities associated with these
cash flow streams are also given. Calculate the
probability of generating a cash inflow of Rs.30,000 in the
first year.
Solution:
It may be noted that the probability with which a cash flow
stream occurs is simply the joint probability of the individual
elements in that cash flow stream. The probability of the
first cash flow stream, i.e.,
P (Rs.30,000 in year 1, Rs.30,000 in year 2 and Rs.35,000
in year 3) = P (Rs.30,000 in year 1) × P (Rs.30,000 in
year 2 | Rs.30,000 in year 1) × P (Rs.35,000 in year
3 | Rs.30,000 in year 1 and Rs.30,000 in year 2)
= (0.5) (0.8) (0.6) = 0.24
Figure 5.3.3: Probability Tree
In the cash flow streams problem, given only the joint
probabilities of cash flows in three years from all streams
involving the cash inflow of Rs.30,000 in year one, we can
calculate the probability of the cash inflow of Rs.30,000
in year 1.
TABLE 5.3.3: CASH INFLOWS FOR PROJECT
YEAR 1                    YEAR 2                    YEAR 3
Net cash   Initial        Net cash   Conditional    Net cash   Conditional    Cash flow   Joint
flow       probability    flow       probability    flow       probability    stream      probability
           P(1)                      P(2 | 1)                  P(3 | 2,1)                 P(1,2,3)
30,000     0.5            30,000     0.8            35,000     0.6            1           0.24
                                                    40,000     0.4            2           0.16
                          40,000     0.2            45,000     0.5            3           0.05
                                                    50,000     0.5            4           0.05
50,000     0.5            50,000     0.6            60,000     0.7            5           0.21
                                                    70,000     0.3            6           0.09
                          60,000     0.4            75,000     0.8            7           0.16
                                                    90,000     0.2            8           0.04
P (Rs. 30,000, Rs. 30,000, Rs. 35,000
in years 1, 2, and 3) = 0.24
P (Rs. 30,000, Rs. 30,000, Rs. 40,000
in years 1, 2 and 3) = 0.16
P (Rs. 30,000, Rs. 40,000, Rs. 45,000
in years 1, 2 and 3) = 0.05
P (Rs. 30,000, Rs. 40,000, Rs. 50,000
in years 1, 2 and 3) = 0.05
Therefore, the probability of the cash inflow of Rs.30,000 in year
1, given the above joint probabilities, is:
P (Rs.30,000 in year 1) = 0.24 + 0.16 + 0.05 + 0.05 = 0.50
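The joint probabilities of the eight cash flow streams, and the marginal probability for year 1, follow mechanically from the multiplication rule. The nested-dictionary representation of the probability tree below is an assumption made for this sketch; the figures are those of the cash inflow table.

```python
from fractions import Fraction as F

# Probability tree: year-1 flow -> (P(1), {year-2 flow -> (P(2|1), {year-3 flow -> P(3|2,1)})})
tree = {
    30000: (F(5, 10), {
        30000: (F(8, 10), {35000: F(6, 10), 40000: F(4, 10)}),
        40000: (F(2, 10), {45000: F(5, 10), 50000: F(5, 10)}),
    }),
    50000: (F(5, 10), {
        50000: (F(6, 10), {60000: F(7, 10), 70000: F(3, 10)}),
        60000: (F(4, 10), {75000: F(8, 10), 90000: F(2, 10)}),
    }),
}

# Multiplication rule: P(1, 2, 3) = P(1) * P(2 | 1) * P(3 | 2, 1).
joint = {}
for y1, (p1, year2) in tree.items():
    for y2, (p2, year3) in year2.items():
        for y3, p3 in year3.items():
            joint[(y1, y2, y3)] = p1 * p2 * p3

# Marginal probability: add the joint probabilities of all streams
# in which the year-1 inflow is Rs.30,000.
p_year1_30000 = sum(p for (y1, _, _), p in joint.items() if y1 == 30000)

for stream, p in joint.items():
    print(stream, float(p))
print("P(Rs.30,000 in year 1) =", float(p_year1_30000))
```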
Example 5.3.7
Suppose that a sample of size 2 is chosen from a population
of 6 elementary units. The sampling is performed without
replacement. Thus, an element of the population can only be
selected once in a sample. Calculate joint probability.
Solution:
Each possible sample of size 2 has the same chance of being
selected.
Let the elementary units of the population be denoted by A, B,
C, D, E, and F. Then, the possible samples of size 2 are those
listed in Table 5.3.4.
Each of these 15 equally likely samples of size 2 has a
probability of 1/15 of being selected.
Consider the sample denoted by CE. Units C and E can
be selected in any order. We consider the order CE and
EC as separate events. The probability of selecting C and
then E is
P(C and E) = P(C) P (E | C) = (1/6) (1/5) = 1/30
Likewise, the probability of selecting E and then C is
P (E and C) = P (E) P(C | E) = (1/6) (1/5) = 1/30
These two joint events are mutually exclusive, and the
probability of one or the other occurring is
P [(C and E) or (E and C)] = (1/30) + (1/30) = 1/15
This value is the probability of C and E occurring in any
order.
Table 5.3.4
AB AE BD CD DE
AC AF BE CE DF
AD BC BF CF EF
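The list of possible samples and the probability of any one of them can be verified by enumeration. The sketch below uses itertools from the Python standard library; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import combinations, permutations

units = "ABCDEF"

# Unordered samples of size 2 drawn without replacement.
samples = ["".join(pair) for pair in combinations(units, 2)]
p_each_sample = Fraction(1, len(samples))

# Ordered selections: CE and EC are treated as separate events.
ordered = list(permutations(units, 2))
p_c_then_e = Fraction(1, 6) * Fraction(1, 5)   # P(C) * P(E | C)
p_e_then_c = Fraction(1, 6) * Fraction(1, 5)   # P(E) * P(C | E)

# The two ordered events are mutually exclusive, so their probabilities add.
p_ce_any_order = p_c_then_e + p_e_then_c

print(samples)
print(p_each_sample, p_ce_any_order)
```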
Section 4
Bayes’ Theorem
In business, there are an increasing number of instances
when occurrence of a particular event may impact the sales
and hence profits. For example, a home appliances retailer
calculates that it would be wise to stock his showroom with
microwave ovens to the extent of 15 percent of his available
shelf space. But later he finds out that the sales for
microwave ovens are showing a decline due to increase in
electricity tariff.
It is therefore important at this stage for the retailer to
recalculate the probability of a microwave oven selling under
the new circumstances. This would help him in making a
more profitable product mix decision for his showroom.
Here we find that some probabilities were changed after the
people involved (the retailer) got additional information
(information about increased electricity tariff). The new
probability thus obtained is known as posterior probability.
Since probabilities can be revised as new information is
gathered, the study of probability is of great significance in
managerial decision-making.
The concept of posterior probabilities was introduced by the
18th century British Presbyterian minister Thomas Bayes.
Known as Bayes’ Theorem, it helps us to find the conditional
probability of one event occurring (A), given that another (B)
has already occurred.
The terms prior and posterior refer to the time when information is
collected. Before information is obtained, we have prior
probabilities. Bayes’ Theorem provides a means of
calculating posterior probabilities from prior probabilities.
More formally, Bayes’ Theorem is stated as follows:
Let A_1, A_2, ..., A_n be mutually exclusive and
collectively exhaustive events, such that
A_1 U A_2 U ... U A_n = the sample space. Then
the posterior probability of the mutually exclusive events
(A_i’s), posterior to event B, may be computed as
P (A_i | B) = P (A_i) P (B | A_i) / [P (A_1) P (B | A_1) + P (A_2) P (B | A_2) + ... + P (A_n) P (B | A_n)]
The example below illustrates the use of Bayes’ Theorem.
Example 2.1.8
Dandakaranya Oil Exploration Company is considering a
particular site for drilling for oil. A priori (based on past
experience), the company expects three possible
outcomes - nil oil, moderate quantum of oil or huge
quantum of oil, with associated chances as
P(Nil oil) = 0.6
P(Moderate oil) = 0.3
P(Huge oil) = 0.1
The company can conduct a seismic experiment, which can lead
to one of the three readings - low, medium or high. The
company’s past records show the following:
i. Of the 140 past sites that were drilled and produced no
oil, seven were high on seismic reading,
i.e., P (high reading | nil oil) = 7/140 = 0.05
ii. Of the 500 past sites that were drilled and produced
moderate oil, 10 were high on seismic reading,
i.e., P (high reading | moderate oil) = 10/500 =
0.02
iii. Of the 250 past sites that were drilled and produced
huge oil, 200 were high on seismic reading,
i.e., P (high reading | huge oil) = 200/250 = 0.8
A seismic survey at the site under consideration gave a
high reading. Should the company undertake drilling at the
site?
Solution
Clearly, the company is concerned about the possibility of
finding no oil despite a high seismic reading, i.e., the
company would like to find out
P (nil oil | high reading).
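We apply Bayes’ Theorem to find this probability, which is
shown in a tabular form below:
Outcome        Prior    P (high reading | outcome)   Joint    Posterior P (outcome | high reading)
Nil oil        0.6      0.05                         0.030    0.030 / 0.116 = 0.26
Moderate oil   0.3      0.02                         0.006    0.006 / 0.116 = 0.05
Huge oil       0.1      0.80                         0.080    0.080 / 0.116 = 0.69
Total          1.0                                   0.116    1.00
Thus we see that even though the seismic reading for
the site is high, there is still a 26% chance of not finding oil.
While the probability of no oil has come down substantially
with this knowledge from the earlier level of 60%, it is
for the company to take the final call.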
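The posterior probabilities for the oil exploration example can be reproduced with a short function that applies Bayes' Theorem to a set of priors and likelihoods. The function name posterior and the dictionary keys are assumptions made for this sketch; the priors and likelihoods are those given in the example.

```python
from fractions import Fraction as F


def posterior(priors, likelihoods):
    """Return P(A_i | B) for every A_i, given P(A_i) and P(B | A_i)."""
    joint = {k: priors[k] * likelihoods[k] for k in priors}   # P(A_i) P(B | A_i)
    evidence = sum(joint.values())                            # P(B)
    return {k: v / evidence for k, v in joint.items()}


priors = {"nil": F(6, 10), "moderate": F(3, 10), "huge": F(1, 10)}
p_high_given = {"nil": F(7, 140), "moderate": F(10, 500), "huge": F(200, 250)}

post = posterior(priors, p_high_given)
print({outcome: round(float(p), 3) for outcome, p in post.items()})
```

The posterior for "nil" is the 26% quoted in the solution; the posteriors for the three outcomes add up to one because the outcomes are mutually exclusive and collectively exhaustive.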
REVIEW 5.1
Question 1 of 8
In probability, any operation/process that results
in two or more outcomes is called
__________.
A. An Experiment
B. An Event
C. Possible Outcome
D. Equally Likely Event
Section 5
Case Study: Mitra Insurance Company
This case study was written by Sravanthi Vemulawada,
under the direction of R Muthukumar, IBSCDC. It is intended
to be used as the basis for class discussion
rather than to illustrate either effective or ineffective handling
of a management situation. The case was written
from generalised experiences.
Insurance is defined as the unbiased transfer of a risk of
loss from one being to another in exchange for a premium.
It can also be defined as a guaranteed small loss to
prevent a large, unpleasant loss. In law and economics,
it is defined as a form of risk management primarily used to
guard against the risk of uncertain loss. The company
which sells the insurance is called the insurer and the
person or unit buying the insurance is called the insured. The
amount to be charged for a certain amount of insurance
coverage is called the premium and the insurance rate is a
factor which is used to determine the premium. Nowadays,
risk management, which is the practice of appraising
and controlling risks, has evolved as a distinct field of study
and practice. Insurance emerged back in the 7th century in
the Greek and Roman societies. Basically, insurable risks
consist of seven common characteristics. They are:
Large number of standardised exposure units: If a
very large number of standardised exposure units are
present and are increasing, it helps the insurers benefit
because the actual results (claims) are more likely to
become close to expected results (claims). If we
consider the case of automobile insurance, it covers
about 175 million automobiles in India, which is an
example of large number of standardised exposure
units.
Definite Loss: Definite Loss comes in cases where the
event gives rise to definite loss at a known time, known
place and from a known cause. Preferably, the time,
place and cause of a loss should be clear enough that a
person, with sufficient information, could, without bias,
verify all three elements. Fire accidents, automobile
accidents and injuries for a worker come under this
characteristic.
Accidental Loss: The event that gives rise to the loss
should be accidental, or at least unexpected, and
the loss should be pure. Ordinary business risks, for
example, are not considered insurable.
Large Loss: The size of the loss must be significant
from the perspective of the insured. The premiums
need to cover the expected cost of losses as well as the
cost of supplying the capital needed to practically assure
that the insurer will be able to pay claims.
Affordable Premium: The premium should be
affordable in the sense that it should not cause
significant loss to the insurer. If the chances of an event
happening are so high, or the cost of the event is so huge,
that the resulting premium is large relative to the
protection offered, then there are fewer chances of
the insurance being purchased.
Calculable Loss: The loss should be calculable. If not
exactly calculable, it should at least be estimable.
Estimating the possibility of loss is generally an empirical
exercise, while cost has more to do with the ability of a person
who has a copy of the insurance policy to make a reasonably
definite and unbiased assessment of the amount of
the loss recoverable as a result of the claim.
Limited Risk of Disastrously Large Losses: If a risk
can cause large losses to a very large number of
people holding various policies, the ability of the insurer
to issue policies becomes constrained, for example in
the case of earthquakes, hurricanes, etc.
Any risk that can be measured can potentially be insured.
There are different types of insurance like Auto Insurance,
Home Insurance, Health Insurance, Disability Insurance,
Casualty Insurance, Life Insurance, Property Insurance,
Liability Insurance, Credit Insurance, etc.
The actual application for benefits provided by an
insurance company is called an insurance claim. All the
policyholders must first file an insurance claim before any
money can be paid out to the hospital, to the repair shop or
to any contracted service. Now it is completely up to the
insurance company whether to approve the claim or
disapprove it based on their assessment of all conditions.
Depending on the type of insurance, the policyholders have
to make regular payments. In the case of home, life,
health and automobile insurance policies, the individual has to
maintain regular payments, called premiums, to the insurers.
By and large, these premiums are used to settle another
person’s insurance claim or to build up the available
assets of the company. But when an accident or a natural
calamity causes real financial damage,
the policyholder has the right to file
an insurance claim so as to receive money from the
insurance company.
Generally, the insurance claim is filed with the local agent
of the insurance company who is responsible for studying
the details of the insurance claim and negotiating the
payments from the required insurer. Recognised
authorities like doctors, repair shops, etc., can file for the
insurance claims directly. Sometimes, the policyholder
would not want to file a claim because the damage would
have been minor or the opposite party has agreed to pay
out of their pockets for the mistake.
Once the insurance claim is filed with the local agent, the
insurance company sends an investigator, who is called
an adjuster or an appraiser. The appraiser’s job is to
evaluate the claim and determine if the repair valuation is
reasonable so that any fraud by the contractors can be
prevented. Most of the time, the appraiser’s evaluation is
considered final. Some insurance companies may not
recognise claims for reasons such as careless
accidents, or because the policy is not active when the
claimant’s premiums have not been paid in full.
Mitra Insurance Company is a nationally recognised
insurance company in India. The kinds of claims handled
by this company include hospitalisation, physician’s visit
and outpatient treatment. The company received claims
from the eastern, western, northern and southern parts of
the country (Exhibit I).
Using Exhibit I, discuss the various entries as
conditional probabilities.
What is the probability of the event that the claim is from
west and the type is hospitalization?
Exhibit I
Mitra Insurance Company: Claims (Geographical Regions)
Kind of claim          East   South   North   West
Hospitalization          75     128      29     52
Physician visit         233     514     104    251
Outpatient Treatment    100     326      65     99
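One way to start on the case questions is to turn the claim counts into joint and conditional probabilities. The sketch below assumes that each entry of the exhibit is a count of claims and that every claim appears in exactly one cell; the variable names are illustrative.

```python
from fractions import Fraction

# Claim counts from the exhibit, by kind of claim and region.
claims = {
    "Hospitalization":      {"East": 75,  "South": 128, "North": 29,  "West": 52},
    "Physician visit":      {"East": 233, "South": 514, "North": 104, "West": 251},
    "Outpatient Treatment": {"East": 100, "South": 326, "North": 65,  "West": 99},
}
total = sum(sum(row.values()) for row in claims.values())

# Joint probability: the claim is from the West AND is a hospitalization claim.
p_west_and_hosp = Fraction(claims["Hospitalization"]["West"], total)

# Conditional probability: hospitalization GIVEN that the claim is from the West.
west_total = sum(row["West"] for row in claims.values())
p_hosp_given_west = Fraction(claims["Hospitalization"]["West"], west_total)

print(total, float(p_west_and_hosp), float(p_hosp_given_west))
```

Dividing any cell by its row total or column total, instead of the grand total, gives the corresponding conditional probability.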
Section 6
Case Study: Ram Publishers
On 20th February 2004, Siva Raman, President of Ram Publishers,
met R.K.Mohan, Vice President, Marketing, and Robert
Wilson, Chief Editor, to exchange notes on the negotiations
under way with N. Periyasamy regarding his soon-to-be-written
autobiography. Periyasamy, a 65 year old retired
IAS officer, had been appointed to the Election Commission
by the government in 2000. Periyasamy planned to resign
before his term expired in 2005. He had approached Ram
Publishers, as well as two other publishing houses, to publish
his memoirs.
sought by friend and foe alike. He had cultivated friendships
with various national political leaders. Periyasamy was a
regular participant in various meetings convened by political
A year back, Periyasamy had decided to cash in on these
experiences by writing a book. Ram Publishers had queried
him about the likely content of the autobiography. While it
was clear he intended to narrate the political intrigues he
had known, he also seemed to be well-informed on other issues.
Ram Publishers believed Periyasamy’s autobiography
might become a best seller.
Periyasamy was very clear about his profit expectation: Rs.
2 lakhs to sign a deal and another Rs. 2 lakhs upon delivery
of the script. It was also understood that the manuscript
would be ghost-written. Periyasamy would tell his reminiscences
to Ram Publishers’ staff who would compile them
into a book.
At a meeting between Siva Raman, Mohan, and Robert Wilson,
the conversation went as follows:
Mohan: I think this book could be a big hit of 2005 and the
sales could be as much as one lakh copies assuming a
price of Rs.250 retail. This is first and foremost a political
book. But let us not get too excited. We have to consider
the possibility that Periyasamy’s personal appeal, which is
at its peak at the moment, might dissipate over the next
year. We also don't know which other politicians might publish
their memoirs around the same time. Remember, 2004
is an election year. The situation is quite fluid. I believe that
at a retail price of Rs.250, there is a 40% chance of sales
of around one lakh books, a 30% chance of sales of
around 40,000 books, and a 30% chance of sales of
around 10,000 books. Those are just representative scenarios
for the purpose of our calculations, of course.
Wilson: One thing is important. The book has to be written
before we can sell it. He has never written a book before,
so he doesn't know what it involves.
Siva Raman: We are also not completely confident that his
memoirs are going to be as exciting as we are expecting.
Let's face it, when our staff start looking at his stories, they
may find that the book is dull.
Mohan: We should be careful. We have to be sure we can make a profit if
we publish it. One good thing: Periyasamy has accepted the possibility
that we may not wish to publish the book once we get to look at the
manuscript.
Wilson: That's right. But after his delivery of manuscript, we
have to pay him the second Rs.2 lakhs, whether we publish
it or not.
Siva Raman: I think there is only a 70% chance Periyasamy will actually
deliver a manuscript. Even after his delivery of manuscript, there's a
30% chance of a poor script that we cannot publish. If we decide to
publish the book, then we have to examine the sales forecasts
accurately. I don't see how we can learn much more about our likely
sales before we make our final decision about going to press.
Wilson: Why should we hand over Rs. 2 lakhs to someone who may never
deliver a manuscript?
Siva Raman: Before we get into that, let us use these sales projections
and probabilities and check to see if this deal makes sense.
Mohan: Let us first look at the costs. The cost of editorial services
(editing, proofreading and obtaining permissions for photographs, etc)
will be Rupees One lakh, which will be incurred even if we decide to
stop publishing. If we decide to publish the book, we will also incur
the cost of pre-
costs will be Rs.75 per copy.
Siva Raman: Will the unit cost come down, if we generate
more volume?
Mohan: Yes, but we'll need to print 10,000 copies no matter what. So,
although it would cost much more per copy if we were printing, say,
1,000 copies, for the numbers we are talking about, it is effectively a
flat rate. Furthermore, for orders of our size, the printer will allow
us to order copies on an "as-needed" basis, and we'll still get the same
rate. This means we won't get stuck with unsold inventory.
We'll get returns if the retailers cannot sell them.
Mohan: My proposed retail price of Rs.250 assumes a wholesale price of
Rs.160. For a generous margin like that we will not permit returns.
That's a common enough practice with books of a very topical nature.
Distribution costs will be about Rs.5 per copy. Marketing costs make up
about 40% of the wholesale price.
Siva Raman: But much of that marketing cost is fixed. We
have a marketing department and sales force whether we
sell Periyasamy's book or not. What are our incremental
marketing costs?
Mohan: We will pay 5% of the wholesale price as a commission to the
sales force. We will also spend about Rs.15 lakhs on advance publicity.
We can avoid this cost if we decide, based on our judgment, not to
publish the book.
Wilson: I feel that if we're only considering incremental expenses, then
the cost of editorial services would be more like Rs.50 thousand rather
than Rupees One lakh, since the permanent editorial staff are not very
busy these days.
Ram Publishers’ senior management wondered whether
they should go ahead with the agreement.
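As a starting point for the analysis, two of the quantities quoted in the conversation can be combined directly: the expected sales volume under Mohan's three scenarios, and the chance that a publishable manuscript is obtained under Siva Raman's estimates. The sketch below uses only those quoted figures; the variable names are illustrative, and it deliberately stops short of a full profit calculation, which depends on the cost figures discussed above.

```python
# Sales scenarios quoted by Mohan: (probability, copies sold at Rs.250 retail).
scenarios = [(0.40, 100_000), (0.30, 40_000), (0.30, 10_000)]

# Expected sales = sum over scenarios of probability x copies.
expected_copies = sum(p * copies for p, copies in scenarios)

# Siva Raman's estimates: 70% chance the manuscript is delivered and,
# given delivery, a 30% chance that it is too poor to publish.
p_delivered = 0.70
p_publishable_given_delivered = 1 - 0.30
p_publishable = p_delivered * p_publishable_given_delivered

print(f"Expected sales if published: {expected_copies:,.0f} copies")
print(f"Chance of obtaining a publishable manuscript: {p_publishable:.2f}")
```

These two numbers are inputs to a decision tree rather than an answer in themselves; the signing decision still has to weigh them against the payments and costs.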
CHAPTER 6
Probability Distributions
In this chapter we will discuss
Random Variable and Probability Distribution
Some Common Discrete Distributions
The Binomial Distribution
The Poisson Distribution
Some Common Continuous Distributions
Normal Distribution
t-Distribution
F-Distribution
Case Study: The Problem of a Medical Representative
Section 1
Random Variable and Probability Distribution
In this chapter, we will discuss the concepts of probability
distributions. In fact, probability distributions are related to
frequency distributions and are considered as theoretical
frequency distributions. As these distributions deal with
expectations, they can be used as models in making
inferences and decisions under uncertain conditions. To
have a better understanding of the concepts of probability
distributions, let us consider the case of tossing a fair
(unbiased) coin twice. The possible outcomes of the
experiment are as shown in Table 6.1.1.
Suppose that an analyst is interested in knowing the number
of heads that can possibly result when the coin is tossed
twice. The analyst can conclude that out of the four possible
outcomes, one does not show the head at all, two show a
single head, and one shows two heads. This is a theoretical
outcome and represents the way in which the analyst
expects the two-toss experiment to behave over time. This is
called the probability distribution of the experiment.
Probability distributions can also be based on experience.
Insurance actuaries do this when they determine insurance
premiums, using experience with death rates to establish
probabilities of dying among various age groups.
Table 6.1.1. Possible Outcomes of Tossing a Fair Coin Twice
First toss   Second toss   Number of heads on two tosses   Probability of the outcome
H            H             2                               0.5 × 0.5 = 0.25
H            T             1                               0.5 × 0.5 = 0.25
T            T             0                               0.5 × 0.5 = 0.25
T            H             1                               0.5 × 0.5 = 0.25
Total Probability = 1.00
Random Variables
Before proceeding further, let us first understand the concept of
random variables, which will enable us to understand the concept of
probability distributions better.
A random variable is a variable that takes on different values as a
result of the outcomes of a random experiment. A random variable is
said to be continuous if it is allowed to assume any value within a
specified range and is said to be discrete if it is allowed to take
only a limited or countable number of values, which can be listed.
This can be further explained through the following example. Suppose
an unbiased pair of dice is tossed. The sum of the upper faces of the
two dice can take on any integer value from 2 to 12. The outcome is
said to be discrete because it can take only a finite (or countable)
number of values.
On the other hand, if the task is to determine the mean age of a
sample of 1000 voters, the possible outcome (X) can take any value
within an interval of numbers and is hence continuous. It is general
practice to use capital letters for random variables and lower-case
letters for the actual values they take, as in X = x.
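The dice example can be made concrete by listing every outcome. The following is a minimal sketch, assuming two fair six-sided dice; the variable names are illustrative. It enumerates the 36 equally likely outcomes and tabulates the probability of each possible sum.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely outcomes of two fair dice and count each sum.
sum_counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# Probability of each sum = number of favourable outcomes / 36.
dice_distribution = {s: Fraction(c, 36) for s, c in sorted(sum_counts.items())}

for s, prob in dice_distribution.items():
    print(f"P(X = {s:2d}) = {prob}")
```

Because the values 2 to 12 can be listed one by one, each with its own probability, the sum is a discrete random variable.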
Expected Value of a Random Variable
Imagine tossing a coin ten times and getting 6 heads. The result
will not always be the same if the experiment is repeated under
similar conditions; it is bound to vary from experiment to
experiment, even though the coin is totally unbiased.
Expected value is a fundamental idea in the study of
probability distribution and is obtained by multiplying each
value that the variable can assume by the probability of
occurrence of that value and then summing up these
products. Let us illustrate the process of calculating the
expected value with the help of Example 6.1.1
Example 6.1.1
The daily records of a dental clinic indicate that the number
of patients arriving at the clinic ranges from 30 to 45 per day.
Table 6.1.2 shows the number of days on which each level was
observed during the past 100 days and the corresponding
probability that the same level will recur the next day. Calculate
the expected number of patients arriving at the clinic.
Solution:
To obtain the expected value of patients, we have to
multiply each value that the variable can assume with the
probability of occurrence of that value and then sum these
products. This is illustrated in Table 6.1.3.
Table 6.1.2. Number of Patients at Dental Clinic
Number of Patients   Number of days the level was observed   Probability for reaching the level
30                    3                                      0.03
31                    2                                      0.02
32                    1                                      0.01
33                    5                                      0.05
34                    6                                      0.06
35                    7                                      0.07
36                    9                                      0.09
37                   10                                      0.10
38                   12                                      0.12
39                   11                                      0.11
40                    9                                      0.09
41                    6                                      0.06
42                    5                                      0.05
43                    8                                      0.08
44                    2                                      0.02
45                    4                                      0.04
Table 6.1.3. Calculation of Expected Value
Number of Patients   Probability for reaching the level   (1) × (2)
(1)                  (2)                                  (3)
30                   0.03                                 0.90
31                   0.02                                 0.62
32                   0.01                                 0.32
33                   0.05                                 1.65
34                   0.06                                 2.04
35                   0.07                                 2.45
36                   0.09                                 3.24
37                   0.10                                 3.70
38                   0.12                                 4.56
39                   0.11                                 4.29
40                   0.09                                 3.60
41                   0.06                                 2.46
42                   0.05                                 2.10
43                   0.08                                 3.44
44                   0.02                                 0.88
45                   0.04                                 1.80
Expected value (total of column 3)                        38.05
However, the expected value in the table does not mean that
38.05 patients will arrive the next day. This only helps the
dentist as a basis for his decisions on daily visits because the
expected value is a weighted average of the outcomes that
can be expected in the future. The dentist should recompute
the expected value and update his information on a regular
basis.
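The calculation in Example 6.1.1 can be reproduced in a few lines. This is a minimal sketch that takes the observed day counts from Table 6.1.2 as given; the variable names are illustrative. It converts the counts into probabilities and then sums value times probability.

```python
# Days (out of the past 100) on which each patient count was observed (Table 6.1.2).
days_observed = {
    30: 3, 31: 2, 32: 1, 33: 5, 34: 6, 35: 7, 36: 9, 37: 10,
    38: 12, 39: 11, 40: 9, 41: 6, 42: 5, 43: 8, 44: 2, 45: 4,
}

total_days = sum(days_observed.values())

# Relative frequency of each level, used as its probability.
probabilities = {x: d / total_days for x, d in days_observed.items()}

# Expected value = sum of (value x probability of that value).
expected_patients = sum(x * p for x, p in probabilities.items())

print(f"Total days observed: {total_days}")
print(f"Expected number of patients: {expected_patients:.2f}")
```

Re-running the same calculation with fresh counts is one way the dentist could update the expected value on a regular basis.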
Types of Probability Distributions
Probability distributions are basically of two types:
Discrete Probability Distribution
Continuous Probability Distribution
A discrete variable can take only a limited number of values,
which can be listed. The month in which a person is born is a
discrete variable because there are only 12 possible values
(the 12 months of the year) in the distribution. On the other hand,
in a continuous probability distribution, the variable is allowed
to take on any value within a given range.
Discrete Probability Distributions
Since each value of a discrete random variable is linked to an
outcome of an experiment, the values of a random variable
can be related to the probabilities of outcomes. The result of
this process is called a discrete probability distribution. To
illustrate the concepts of discrete probability distributions, let
us consider an experiment of tossing a balanced coin
thrice. The outcome has to be one of the following:
TTT HTT THT TTH
HHT HTH THH HHH
If the aim is to determine the number of times head
occurs (X), the results can be depicted as given in Table
6.1.4.
Table 6.1.4. Theoretical Results of Tossing a Balanced Coin
X    Frequency    Relative Frequency
0    1            1/8
1    3            3/8
2    3            3/8
3    1            1/8
The values given in the relative frequency column of Table 6.1.4
are nothing but the probabilities associated with the values of X.
So, the above findings can be slightly modified as shown in
Table 6.1.5.
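Table 6.1.5 is a typical example of a discrete probability
distribution. The mean number of heads in 3 tosses is
calculated as below:
$$\mu = \sum x\,P(X = x) = 0\left(\tfrac{1}{8}\right) + 1\left(\tfrac{3}{8}\right) + 2\left(\tfrac{3}{8}\right) + 3\left(\tfrac{1}{8}\right) = \tfrac{12}{8} = 1.5$$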
µ = 1.5 has a practical interpretation. If this experiment of
tossing a coin 3 times were repeated an infinite number of
times and the values of X were recorded, then theoretically,
1.5 would represent the average number of times heads would
come up. For this reason the mean is often called the
expected value E(X).
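The distribution of the number of heads in three tosses, and its mean, can be checked by brute-force enumeration. The sketch below assumes a balanced coin, so that all eight sequences are equally likely; the names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2**3 = 8 equally likely sequences of three tosses, e.g. ('H', 'T', 'T').
outcomes = list(product("HT", repeat=3))

# Frequency of each possible number of heads X.
head_counts = Counter(seq.count("H") for seq in outcomes)

# P(X = x) = frequency / number of sequences.
distribution = {x: Fraction(n, len(outcomes)) for x, n in sorted(head_counts.items())}

# Mean (expected value) = sum of x * P(X = x).
mean_heads = sum(x * p for x, p in distribution.items())

for x, p in distribution.items():
    print(f"P(X = {x}) = {p}")
print(f"E(X) = {mean_heads} = {float(mean_heads)}")
```

Using exact fractions keeps the probabilities in the same form as the table (1/8, 3/8, 3/8, 1/8).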
Continuous Probability Distribution
In such distributions, the variable can assume any value within
a given range. Therefore it is impossible to list all possible
values. If we were studying the waiting time for customers at
a bank teller counter, the waiting time would be a continuous
variable, as it can take on any value within a continuum or
interval, depending on the precision of the measuring
instrument. This distribution would therefore be called a
continuous probability distribution.

Table 6.1.5. Probabilities of Getting Heads
X        P(X=x)
0        1/8
1        3/8
2        3/8
3        1/8
Total    1
Section 2
Some Common Discrete Distributions
Binomial Distribution
The binomial distribution is one of the most widely used probability
distributions of a discrete random variable. It describes discrete,
non-continuous data resulting from an experiment known as a Bernoulli
process (named after Jacob Bernoulli, a Swiss mathematician of the
seventeenth century). The tossing of a fair coin a fixed number of
times is a typical example of a Bernoulli process, and the outcomes
(say, number of heads) can be represented by the binomial probability
distribution.
The binomial distribution has an expected value (or mean µ)
which can be represented by the formula
µ = np
and a variance given by
σ² = npq
Where,
n = Total number of Bernoulli trials
p = Probability of success in one trial
q = Probability of failure in one trial = 1 – p
Characteristics of a Bernoulli Process
Each trial (each toss in our example) will have only
two possible outcomes: Success or Failure (head or
tail in our case).
The probability of the outcome of any trial remains
constant over time. That is, the probability of getting a
tail, in our example, is always 0.5 irrespective of the
number of times the coin is tossed.
The outcome of one trial cannot influence the outcome of any
other trial, and each trial is statistically independent.
In technical parlance, the symbol ‘p’ is used to represent the
probability of a success and the symbol ‘q’ (q = 1 - p) to represent
the probability of failure. To represent a certain number of
successes, the symbol ‘r’ is generally used and the symbol ‘n’ is
used to represent the total number of trials.
The formula used to determine the probability of ‘r’ successes in
‘n’ trials is given by
$$P(r) = \frac{n!}{r!\,(n-r)!}\; p^{r}\, q^{\,n-r}$$
Example 6.2.1
A fair coin is tossed ten times. If getting head is defined as
success, find out the probability of getting 4 successes in the
ten trials.
Solution:
p = probability of getting head = 0.5
(Since it is a fair coin)
q = probability of not getting head = 1-p = 0.5
r =number of successes = 4
n = number of trials = 10
Probability of getting 4 successes in 10 trials
$$P(4) = \frac{10!}{4!\,6!}\,(0.5)^{4}(0.5)^{6} = 210 \times 0.0625 \times 0.015625 = 0.2051$$
Thus there is a 0.2051 probability of getting four heads on
ten tosses of a fair coin.
Example 6.2.2
A binomial experiment is repeated nine times. If the probability
of a success is 0.6, find the probability of getting four successes.
Solution:
Here, n = 9, p = 0.6, q = 0.4, r = 4
$$P(4) = \frac{9!}{4!\,5!}\,(0.6)^{4}(0.4)^{5} = 126 \times 0.1296 \times 0.01024 = 0.167$$
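Examples 6.2.1 and 6.2.2 can be verified numerically. The following is a minimal sketch of the binomial probability computation; the function name `binomial_pmf` is an illustrative choice, not something defined in the text. It also checks the mean np and variance npq for the coin example by summing over the whole distribution.

```python
from math import comb

def binomial_pmf(r: int, n: int, p: float) -> float:
    """Probability of exactly r successes in n independent Bernoulli trials."""
    q = 1 - p
    return comb(n, r) * p**r * q**(n - r)

# Example 6.2.1: four heads in ten tosses of a fair coin.
p_example_1 = binomial_pmf(4, 10, 0.5)

# Example 6.2.2: four successes in nine trials with p = 0.6.
p_example_2 = binomial_pmf(4, 9, 0.6)

# Mean and variance for n = 10, p = 0.5, computed from the distribution itself.
n, p = 10, 0.5
mean = sum(r * binomial_pmf(r, n, p) for r in range(n + 1))
variance = sum((r - mean) ** 2 * binomial_pmf(r, n, p) for r in range(n + 1))

print(f"Example 6.2.1: {p_example_1:.4f}")
print(f"Example 6.2.2: {p_example_2:.3f}")
print(f"mean = {mean:.2f} (np = {n * p}), variance = {variance:.2f} (npq = {n * p * (1 - p)})")
```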
Poisson Distribution
The Poisson distribution applies to the situation when an
event occurs at random
points in time or space.
The observations on such
an event are characterized
by an average number of
occurrences of that event
per unit time or space.
This distribution is named
after its developer Siméon
Denis Poisson, a French
Mathematician. It can be
used to describe a number
of processes like distribution of telephone calls going
through a switch board system, the arrivals of trucks at a
toll booth, and so on.
A process is said to produce a Poisson probability distribution if
the following conditions are met:
(i) Independence
The number of times an event S occurs in any time interval is
independent of the number of times it occurs in any other disjoint
time interval.
(ii) Rate
In a very small time interval, t to t + h (where h is infinitesimally
small), the probability that the event occurs once is approximately
λh (where λ is the average rate at which the event S occurs per unit
of time).
(iii) Lack of Clustering
The chance of two or more occurrences of S in a very small interval,
t to t + h, is insignificant in comparison with λh, the chance of one
occurrence.
In other words, we can describe the Poisson distribution as a
limiting case of the binomial distribution where the probability of
success (p) is infinitesimally small and the number of trials (n) is
so large that the product np equals λ, a finite constant. The
probability mass function that represents the number of times the
event S occurs in a given interval of time can be written as
$$P(X = x) = \frac{e^{-\lambda}\,\lambda^{x}}{x!}, \qquad x = 0, 1, 2, \ldots$$
Where X = discrete random variable
x = specific value X can take
λ = the mean number of occurrences per interval of time
The mean and the variance of a Poisson distribution are both equal
to λ.
Suppose that we are measuring events in time, occurring
with the following properties:
The number of events occurring in one time interval is independent
of the number occurring in any other disjoint time interval. (It has
no memory.)
The probability that a single event will occur during a very
short time interval is proportional to the length of that time
interval.
The probability that more than one event will occur in such
a short time interval is negligible.
Then the number of events occurring in a fixed time interval is a
random variable X that has the Poisson distribution.
Example 6.2.3
The average number of radioactive particles passing through a
counter during one millisecond in a laboratory experiment is
four. What is the probability that six particles enter the counter
in a given millisecond?
Solution:
We know that λ = 4 and x = 6
$$P(X = 6) = \frac{e^{-4}\,4^{6}}{6!} = \frac{0.018316 \times 4096}{720} = 0.1042$$
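The Poisson probability in Example 6.2.3 can be checked with a short computation. This is a minimal sketch; `poisson_pmf` is an illustrative name. To illustrate the limiting-case description given earlier, it also evaluates a binomial probability with a large number of trials (n = 10,000, an arbitrary choice) and p chosen so that np = 4.

```python
from math import comb, exp, factorial

def poisson_pmf(x: int, lam: float) -> float:
    """Probability of exactly x occurrences when the mean number per interval is lam."""
    return exp(-lam) * lam**x / factorial(x)

# Example 6.2.3: mean of 4 particles per millisecond, probability of exactly 6.
p_six = poisson_pmf(6, 4)

# Binomial with many trials and a tiny success probability, keeping np = 4.
n = 10_000
p = 4 / n
binomial_approx = comb(n, 6) * p**6 * (1 - p) ** (n - 6)

print(f"Poisson  P(X = 6) = {p_six:.4f}")
print(f"Binomial P(X = 6) with n = {n}, np = 4: {binomial_approx:.4f}")
```

The two values agree closely, which is what the limiting argument predicts when n is large and p is small.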
Section 3
Some Common Continuous Distributions
Normal Distribution
The normal distribution reflects the various values taken by
many real life variables like the heights and weights of
people or the marks of students in a large class. In all these
cases a large number of observations are found to be
clustered around the mean value and their frequency drops
sharply as we move away from the mean in either direction.
For example, if the mean height of an adult in a city is 6 feet
then a large number of adults will have heights around 6
feet. Relatively few adults will have heights of 5 feet or 7
feet.
Further, if we draw samples of size n (where n is a fixed
number over 30) from any population, then the sample mean
X̄ will be (approximately) normally distributed with a mean
equal to µ, the mean of the population.
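The statement about sample means can be explored with a small simulation. The sketch below is illustrative only: it assumes a uniform population on [0, 1] (which is clearly not normal, with mean 0.5 and standard deviation √(1/12)), a sample size of 36, and 2,000 repeated samples; all of these choices, and the variable names, are assumptions made for the demonstration.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(42)  # fixed seed so that repeated runs give the same numbers

sample_size = 36     # n, chosen to be over 30
num_samples = 2000   # number of repeated samples

# Draw repeated samples from a uniform (non-normal) population and record each sample mean.
sample_means = [
    mean([random.random() for _ in range(sample_size)])
    for _ in range(num_samples)
]

mean_of_means = mean(sample_means)
sd_of_means = stdev(sample_means)

# Theoretical values for this population: mean 0.5, sd of the sample mean sqrt(1/12) / sqrt(n).
theoretical_sd = sqrt(1 / 12) / sqrt(sample_size)

print(f"Mean of sample means: {mean_of_means:.4f} (population mean 0.5)")
print(f"SD of sample means:   {sd_of_means:.4f} (theory {theoretical_sd:.4f})")
```

A histogram of `sample_means` would show the bell shape described in this section, even though the population itself is flat.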
The normal variable is a continuous variable. The
characteristics of the normal probability distribution with
reference to Figure 6.3.1 are:
The curve has a single peak; thus it is unimodal.
The mean of a normally distributed population lies at the
center of its normal curve.
Because of the symmetry of the normal probability
distribution, the median and the mode of the distribution
are also at the center.
The two tails of the normal probability distribution extend
indefinitely and never touch the horizontal axis.
Figure 6.3.1
The Standard Normal Distribution
The Standard Normal Distribution is a normal distribution
with a mean µ = 0 and a standard deviation σ = 1. The
observation values in a standard normal distribution are
denoted by the letter Z.
Example 6.3.1
A population is normally distributed with mean = 0 and
standard deviation = 1. What is the probability that an
observation from the population will have a value between
–1.28 and 1.28?
Solution:
We know that for a normal distribution 80% of the
observations lie between µ – 1.28σ and µ + 1.28σ.
Here, µ = 0 and σ = 1, i.e., it is a standard normal distribution.
So 80% of the observations will lie between –1.28 and +1.28
(from the normal table).
Hence the probability that an observation will have a value
between –1.28 and 1.28 is 80%.
Example 6.3.2
What is the probability that an observation from a standard
normal distribution will lie in the interval –1.96 to 1.96 ?
Solution:
From normal table, the probability is 95%.
Example 6.3.3
What is the probability that an observation from a standard
normal distribution will lie between –2.33 and + 2.33 ?
Solution:
From normal table, the probability is 98%.
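The three table look-ups in Examples 6.3.1 to 6.3.3 can be cross-checked in code. This minimal sketch uses `statistics.NormalDist` from the Python standard library in place of a printed normal table; the helper name `central_probability` is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def central_probability(z: float) -> float:
    """P(-z < Z < z) for a standard normal variable Z."""
    return standard_normal.cdf(z) - standard_normal.cdf(-z)

for z in (1.28, 1.96, 2.33):
    print(f"P(-{z} < Z < {z}) = {central_probability(z):.4f}")
```

The computed values round to the 80%, 95% and 98% quoted from the normal table.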
Standardizing Normal Variables
Suppose we have a normal population. We can represent it
by a normal variable X. Further, we can convert any value of
X into a corresponding value Z of the standard normal
variable, by using the formula
$$Z = \frac{X - \mu}{\sigma}$$
Where
X = the value of any random variable
µ = the mean of the distribution of the random variable
σ = the standard deviation of the distribution
Z = the number of standard deviations from X to the mean
of the distribution, known as the z score or standard score.
Example 6.3.4
A normal variable X has a mean of 56 and a standard
deviation of 12. Find the Z value corresponding to the X
value of –5.
Solution:
$$Z = \frac{X - \mu}{\sigma} = \frac{-5 - 56}{12} = -5.08$$
Example 6.3.5
A normal variable has a mean of 10 and a standard
deviation of 5. What is the probability that the normal
variable will take a value in the interval 0.2 to 19.8?
Solution:
Standardizing the end points of the interval gives
Z = (0.2 – 10)/5 = –1.96 and Z = (19.8 – 10)/5 = 1.96.
Probability (0.2 < X < 19.8)
= Probability (–1.96 < Z < 1.96)
= 95%
[Because 95% of the area under the standard normal curve
lies in the interval –1.96 to 1.96]
We can see this from the Normal Table:
Area under the standard normal curve between 0 and
1.96 is 0.4750.
Due to symmetry of the standard normal distribution, area
under the curve between –1.96 and +1.96 is twice the
area under the curve between 0 and +1.96, i.e.,
2 × 0.4750 = 0.95.
Probability (–1.96 < Z < +1.96) = 0.95 or 95%
Any normal variable can be converted into a standard
normal variable as illustrated above. Hence, we can use
the standard normal distribution table to find the
probability that the variable will take a value within any
given interval.
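The standardizing step can be sketched in code as follows. The example reuses the figures of Examples 6.3.4 and 6.3.5; `z_score` is an illustrative helper name, and `statistics.NormalDist` from the standard library stands in for the normal table.

```python
from statistics import NormalDist

def z_score(x: float, mu: float, sigma: float) -> float:
    """Number of standard deviations between x and the mean mu."""
    return (x - mu) / sigma

# Example 6.3.4: mean 56, standard deviation 12, X = -5.
z_example_4 = z_score(-5, 56, 12)

# Example 6.3.5: mean 10, standard deviation 5, interval 0.2 to 19.8.
mu, sigma = 10, 5
z_low = z_score(0.2, mu, sigma)
z_high = z_score(19.8, mu, sigma)

# Probability via the standard normal distribution after standardizing ...
prob_standardized = NormalDist().cdf(z_high) - NormalDist().cdf(z_low)

# ... and directly from the original normal distribution, as a cross-check.
original = NormalDist(mu, sigma)
prob_direct = original.cdf(19.8) - original.cdf(0.2)

print(f"Example 6.3.4: Z = {z_example_4:.2f}")
print(f"Example 6.3.5: Z from {z_low:.2f} to {z_high:.2f}, probability = {prob_standardized:.4f}")
print(f"Direct calculation without standardizing: {prob_direct:.4f}")
```

Both routes give the same probability, which is why a single standard normal table is enough for any normal variable.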
The Lognormal Distribution
If ln (X) is a normally distributed random variable, then X
is said to be a lognormal variable.
If P1, P2, P3, ... are the prices of a scrip in periods 1, 2, 3,
..., some applications in finance require ln (P2/P1), ln (P3/
P2),... to be normally distributed, that is, continuously
compounded returns are required to be normal. This
property is described as “Stock Prices are Lognormal”.
t-Distribution
Suppose we randomly select an Indian and find his/her
weight. Then X = “Weight of the person” is a random
variable. We may assume that X is normally distributed.
Moreover, suppose E(X) = 60 kg and that V(X) is
unknown, where E(X) is the population mean and V(X) is
the population variance.
Suppose we take a random sample of five people and
compute the average weight, say X̄. Then X̄ is also a
random variable, since different samples may give different
values for X̄.
It is a fact that E(X̄) = E(X) for any such experiment. It is
also true that V(X̄) = V(X)/n, where n is the sample size.
It is also true that if X is normal, so is X̄. Therefore
$$Z = \frac{\bar{X} - E(\bar{X})}{\sqrt{V(\bar{X})}}$$
has mean 0 and variance 1. But we do not have V(X̄),
since V(X) is unknown.
We may compute the sample variance, s², from the five
individuals. We may consider s² as an approximation of V(X),
and replace V(X) by s². In doing so we are losing one degree
of freedom. And
$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$
follows a t-distribution with (n – 1) degrees of freedom,
where,
µ = Population mean
s = Sample standard deviation
n = The sample size
Figure 2.2.2: Distribution Curves with Different Degrees of Freedom
As shown in the figure above, the t-distribution is symmetrical like
the normal distribution, but its peak is lower than that of the normal
curve and its tails lie a little higher above the abscissa than those
of the normal curve.
As the degrees of freedom increase, the t-distribution approaches the
normal distribution. The t-distribution is therefore used when the
sample size is 30 or less and the population standard deviation is
unknown.
Example 6.3.6
Consider the t-distribution with df = 13. What is the area to the
right of 1.771?
Solution:
From the t-distribution table, it can be seen that the area
under both the tails is 0.10. Therefore, the area under the
right tail will be 0.05.
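The table value in Example 6.3.6 can be cross-checked numerically. The Python standard library has no t-distribution, so this sketch integrates the standard t density with Simpson's rule; the function names and the number of integration steps are illustrative choices, not part of the text.

```python
from math import gamma, pi, sqrt

def t_pdf(t: float, df: int) -> float:
    """Density of the t-distribution with df degrees of freedom."""
    coeff = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coeff * (1 + t * t / df) ** (-(df + 1) / 2)

def t_right_tail(t: float, df: int, steps: int = 2000) -> float:
    """Area to the right of t (t >= 0): 0.5 minus the Simpson's-rule integral over [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

right_tail = t_right_tail(1.771, 13)
print(f"Area to the right of 1.771 with df = 13: {right_tail:.4f}")
```

The symmetry of the distribution is what allows the right-tail area to be written as 0.5 minus the area between 0 and t.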
F-Distribution
The F-distribution is the distribution of the ratio of two
independent Chi-square variables, each divided by its degrees of
freedom. The degrees of freedom of the numerator are n₁ and those
of the denominator are n₂. We will come across this distribution
while studying regression.
Example 6.3.7
Consider the F-distribution with degrees of freedom 2 in the
numerator and 13 in the denominator. What is the area to the
right of 3.81?
Solution:
From the F-distribution table, this area equals 0.05.
REVIEW 6.1
Question 1 of 25
Collection of all possible events of an experi-
ment is called
A. Sample space
B. Population space
C. Null set
D. Probability space
Section1
Case Study: The Problem of a Medical Representative
This case study was written by R.P. Suresh, Indian Institute of
Management Kozhikode, India. It is intended to be used as the
basis for class discussion rather than to illustrate either effective
or ineffective handling of a management situation. The case was
prepared from the generalised experiences.
Mr. Muralidharan Nair, a sales representative of WKPIL (Well known
Pharmaceuticals India limited), is one of the promising
representatives located in Kozhikode City in South India. He has won
several awards for his excellent job of meeting the targets. Last
year, he also won the National award of BEST REPRESENTATIVE of the
year.
One of the most important jobs of the medical representatives is to
meet the practicing doctors and introduce to them some of their new
products, and discuss with them
the awareness of the doctors about the products of WKPIL
has a direct relation to the sales, the company fixes targets on the
number of doctors to be visited over a period of time. WKPIL has a
policy of finalizing the annual as well as quarterly targets in
consultation with the concerned officials. The company believes that
this is the best way of involving the entire organization in the
decision making process, and it is observed that the officials become
more accountable and are generally bound by the decision as they were
part of the decision making process. To meet the current target,
considering the number of visits that can be made per day, Mr. Nair
needs to meet 100 more doctors in Kozhikode in the 27 days that
remain in the quarter.
The regional manager of the Western Region, Mr. Saurav Deshpande, has
extended an invitation to Mr. Nair to address and interact with his
fellow representatives, in the current term, highlighting the factors
that helped in his achievements. The company feels that this will be
a motivating factor for other representatives. The venue for this
meeting is identified as Pune, which is about 800 km away from
Kozhikode. Mr. Nair needs a day exclusively for this purpose. Mr.
Nair knows that he requires at least 25 days to complete his target.
As such, it looks possible to take the day off required to go to
Pune. However, he is also aware that he cannot walk a tightrope like
this, because there are some days on which he cannot travel to meet
the doctors due to the following reasons, which are exhaustive.
In this region, some political or social organizations announce a
bandh or hartal, as a mark of protest against some policy of the
Government or to highlight a specific problem facing the society.
During these days, there is a total restriction on movement of the
public. And, therefore, during the days when a bandh or hartal is
declared, Mr. Nair will not be able to meet the doctors.
And also, during the current season (viz. the monsoon season), when
it rains quite heavily some parts of the city get flooded. As a
result, some of the roads get blocked and, hence, on these days again
Mr. Nair will not be able to meet the doctors.
Since Mr. Nair is not willing to miss the target, he wants to make
sure that he works for at least 25 days to meet the target. At the
same time, he is very keen to go to Pune to address his fellow
workers in the Western region, as this will be a professional boost
to his career and, in the process, he may also help his fellow
workers to excel. In order to ensure that he gets enough working
days, he wishes to find out the frequency of occurrence of these two
events. After scanning through the newspapers of the last two years,
Mr. Nair observed that during the monsoon there is a one in 30 chance
that, on any day in this season, the roads are blocked due to
flooding in the city. He also observed from the records of the civic
administration that movement in the city was restricted due to bandh
or hartal, etc. for 14 days in the last 2 years, viz., about 730
days. What conclusion did Mr. Nair arrive at? What are the methods
Mr. Nair used to arrive at this conclusion?
This case is taken from a detailed paper by the author with the
permission of the author.
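One possible way to frame the question, offered only as a sketch for class discussion and not as the author's solution, is to treat each remaining day as a Bernoulli trial. The code below assumes that flooding and bandh/hartal occur independently of each other, that days are independent, and that the two quoted frequencies apply to every remaining day; the variable and function names are illustrative.

```python
from math import comb

p_flood = 1 / 30     # quoted chance that roads are blocked by flooding on a given day
p_bandh = 14 / 730   # bandh/hartal on 14 of about 730 days

# Assumption: the two events are independent, so a day is workable only if neither occurs.
p_workable = (1 - p_flood) * (1 - p_bandh)

def prob_at_least(k_needed: int, n_days: int, p: float) -> float:
    """Binomial probability of at least k_needed workable days out of n_days."""
    return sum(
        comb(n_days, k) * p**k * (1 - p) ** (n_days - k)
        for k in range(k_needed, n_days + 1)
    )

# 27 days remain; the Pune trip would use up one of them.
p_without_trip = prob_at_least(25, 27, p_workable)
p_with_trip = prob_at_least(25, 26, p_workable)

print(f"Chance that a given day is workable: {p_workable:.4f}")
print(f"P(at least 25 working days) staying in Kozhikode: {p_without_trip:.3f}")
print(f"P(at least 25 working days) after the Pune trip:  {p_with_trip:.3f}")
```

Whether the resulting probabilities justify the trip is a judgment about acceptable risk, and different modelling assumptions (for example, flooding that persists over consecutive days) would change the numbers.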
CHAPTER 7
Sampling and Sampling Distributions
In this chapter we will discuss
Population & Sample; Parameter & Statistic
Methods of Enumeration
• Census or Complete Enumeration
• Sampling Methods
Sampling and Non-Sampling Errors
Sampling Distribution
Estimation
Case Study:
Sampling the Population Favorite
Ascertaining Customer Satisfaction
Customer Satisfaction with DTH Services in India
Swarnamuki Public Bank Limited’s SME Loans.
Section 1
Population & Sample; Parameter & Statistic
The process of inferring something about a large group of
elements by studying only a part of it is known as sampling.
The collection of all elements about which some inference is
to be made is called the population. For example, in an
effort to study talcum powder usage in the urban areas of a
state, the population could be a collection of all talcum
powder users in major cities and towns in the state.
What we are interested in is to measure some particular
characteristic of the selected population. For example, it
could be the average life of a fluorescent tube, the
percentage of talcum powder users in a state or the
percentage of defectives in an engineering manufacturing
industry. Such a numerical measure, which describes a
characteristic of the population, is known as a parameter of
the population.
Usually we are interested in some population parameter and
we infer about the parameter by studying only a part of the
population, called the sample. Sampling, therefore, refers to the
process of choosing a sample from the population so that some
inference about the population can be made by studying the
sample.
A numerical measure which describes a characteristic of a sample is
called a statistic. To study the population characteristics, a
manager can either go for complete enumeration (census) or a sampling
study. However, limitations of time, money and energy may restrict
the manager from going for complete enumeration of the entire
population. It is common practice that we check a handful of rice at
the grocery store before buying a bag of rice or taste a piece of
sweet at a sweet shop before ordering it for a party. This practice
is based on the assumption that the sample will provide approximate
population information, representing the population characteristic
under examination. For example, consider an automatic steel casting
machine that casts thousands of steel bars daily. To check the
performance of the machine, a manager need not wait to check the
entire day's output. Instead, he can check samples taken at random
intervals, and if any
defects are detected in the cast the machine can be reset
Section 2
Methods of Enumeration
There are two methods of enumeration: the complete
enumeration or census method, and the selective
enumeration or sample method. The first method studies
the entire population, whereas the second studies a selected
part of the population that is representative of the entire
population and is referred to as the sampling method.
Census or Complete Enumeration
In the case of census or complete enumeration, information
relating to the characteristics of each and every unit of the
population is collected. The unit may be an employee, a
product or a department in an organization. The
collection of all the units under study is called the
‘population’ or the ‘universe’. For example, when the study is
intended to find out the working conditions of workers in
the cement industry, the ‘universe’ of the study will consist of all
the workers in this industry (spread over a geographical
location). Scanning through all the applications received for
recruitment is a good example of complete
enumeration.
Sampling Methods
When the population/universe is large or difficult to
enumerate, information about its characteristics has to be
inferred from a subset of this population, called a sample.
The most difficult (but most important) aspect of selecting a
sample is to ensure the drawing of a representative
sample, i.e., ensuring that the sample chosen reflects the
population it is drawn from. We can use the sample to
make reasonable (probabilistic) inferences about the
population only when we can reasonably be sure that the
sample reflects the population.
A sample is a part of a larger group or set, that is usually
called a population. A sample is used to discover one or
more properties of the population. There are several
techniques that can be used to obtain a representative
sample. The technique used depends on the prior
knowledge of the properties of the population that will be
measured. There are two methods of selecting samples
from the population. They are:
Random or Probability Sampling
Non-Probability Sampling
Refer to Figure 7.2.1 for samples.
Random or Probability Sampling Methods
In probability sampling, the decision whether a
particular element is included in the sample or not is
governed by chance alone. All probability sampling
methods ensure that each element in the population has
some non-zero probability of getting included in the
sample. This would mean defining a procedure for picking
up the sample, based on chance, and avoiding changes in
the sample except by way of a pre-defined process again.
The picking up of the sample is therefore totally insulated
against the judgment, convenience or whims of any person
involved with the study. That is why probability sampling
procedures tend to become rigorous and at times quite
time-consuming. Probability based selection of sample also
makes it free from individual biases and hence more
representative. Also, when probability sampling designs are
used, it is possible to quantify the magnitude of the likely
error in inference made and this is of great help in many
situations in building up confidence in the inference.
Figure 7.2.1: Samples
Some of the Random Sampling Methods are:
• Simple Random Sampling
• Systematic Sampling
• Stratified Sampling
• Cluster Sampling
• Multistage Sampling
Simple Random Sampling
Conceptually, simple random sampling is one of the
simplest sampling designs and can work well for relatively
small populations.
Suppose we have a population of N elements and that we
want to pick up a sample of size ‘n’ (< N). Obviously, there
are $\binom{N}{n} = \frac{N!}{n!\,(N-n)!}$ possible samples of size ‘n’. Simple random
sampling is a process which ensures that each of the
samples of size ‘n’ has an equal probability of being picked
up as the chosen sample. This also implies that under
simple random sampling, each element of the population
has an equal probability of getting included in the sample.
All other forms of probability sampling use this basic
concept of simple random sampling but applied to a part of
the population at a time and not to the whole population.
It is imperative to have a list of all the members of the
population (called Population Frame) before a simple
random sample can be picked up. For example, to draw a
sample of 10 students out of a class of 70, we can write a
name chit for each student and mix the 70 chits in a bowl
well. Then draw chits one by one, 10 times.
It is easy to see that if we replace the chits in the bowl after
noting down the name of the element, we will have a simple
random sample with replacement, and one without
replacement if we do not.
As the population size increases, the chit method becomes
impractical. Instead, we associate a serial number with each
member of our population and then instruct a computer to select
members from 1 through N using its pseudo-random
number generator. This ensures that every number from 1
through N has an equal probability of getting selected and
so the sample selected is a simple random sample. We
can also use a table of random numbers to draw a simple
random sample.
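The chit procedure can be sketched in a few lines of Python. This is only an illustration, assuming the population frame is held as a list of serial numbers; the function name `simple_random_sample` and the seed value are made up for the example and are not part of the text.

```python
import math
import random


def simple_random_sample(frame, n, replace=False, seed=None):
    """Draw n units from the population frame, each unit equally likely.

    replace=False mimics drawing chits without putting them back;
    replace=True mimics returning each chit to the bowl before the next draw.
    """
    rng = random.Random(seed)  # pseudo-random number generator
    if replace:
        return [rng.choice(frame) for _ in range(n)]
    return rng.sample(frame, n)


# Population frame: serial numbers 1 through 70 for a class of 70 students.
students = list(range(1, 71))

sample = simple_random_sample(students, 10, seed=42)
print("Chosen serial numbers:", sorted(sample))

# Number of distinct samples of size 10 that could have been drawn
# without replacement, i.e. N! / (n! (N - n)!).
print("Possible samples:", math.comb(len(students), 10))
```

Fixing the seed only makes the illustration repeatable; in an actual study the seed would normally be left unset.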
In practice, however, simple random sampling is not popular,
as we often do not have a population frame and, operationally,
it is more inconvenient and costly.
Systematic Sampling
Suppose we wish to draw a sample of size n from a population
of size N, where n < N. We order the population units based
on some identification and divide them into n partitions, with each
partition containing k units, where k = N/n (rounded off to the
nearest integer). A unit is then drawn randomly from the first
partition of k units and thereafter every k-th unit is drawn, thus
finally getting a sample of approximately n units (the rounding
of k can add or drop a unit).
For example, if we want to have a sample of size 6 from a
population of size 100, then k would be 16.7
(rounded off to 17). We would, therefore, have to decide
where to start from among the first 17 units in our frame. If
this number happens to be 7, for example, then the sample
would contain members having serial numbers 7, 24, 41, 58,
75 and 92 in the frame. It is to be noted that the random
process establishes only the first member of the sample - the
rest are pre-ordained automatically by the value of k.
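The worked example above can be reproduced with a short sketch. It assumes the frame is numbered 1 through N; the function name `systematic_sample` and its `start` argument are illustrative choices, not notation from the text.

```python
import random


def systematic_sample(N, n, start=None, seed=None):
    """Return the serial numbers chosen by systematic sampling.

    k = N / n rounded to the nearest integer; a random start is picked
    from the first k units and every k-th unit after it is taken.
    """
    k = int(N / n + 0.5)  # round half up, e.g. 16.7 -> 17
    if start is None:
        start = random.Random(seed).randint(1, k)  # random start in 1..k
    return list(range(start, N + 1, k))


# Example from the text: N = 100, n = 6, random start happens to be 7.
print(systematic_sample(100, 6, start=7))  # [7, 24, 41, 58, 75, 92]
```

Only the starting point is random here; every later member follows mechanically from k, which is exactly why a periodic pattern in the frame can bias the sample.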
Systematic sampling is much easier to implement
than simple random sampling. It is popular in
sampling from pre-numbered receipts, invoices, cheques,
etc. However, if there is a pattern or periodicity in the
population frame, such as a greater rush at banks on Mondays
and Saturdays when studying the number of customers
visiting a bank, it could result in selection bias.
Another situation could be when a population frame is
arranged in an order, ascending or descending, of some
attribute (say, in descending order of marks while
studying marks distribution); then the location of the first
sample element may affect the result of the study. Both
simple random sampling and systematic sampling are
generally less efficient as compared to more sophisticated
probability sampling methods.
Figure 7.2.1: Systematic sampling
Stratified Sampling
Stratified sampling is more complex than simple random
sampling, but when applied properly, stratification can
significantly increase the statistical efficiency of sampling.
Figure 7.2.2: Stratified sampling
Suppose we are interested in estimating the demand for
non-aerated beverages in a residential colony. We know that
the consumption of these beverages has some relationship
with the family income and that the families residing in this
colony can be classified into three categories, namely high-income,
middle-income and low-income families. If we are doing a sampling
study we would like to make sure that our sample does have
some members from each of the three categories - perhaps
in the same proportion as the total number of families
belonging to that category - in which case we would have
used proportional stratified sampling.
The basis for using stratified sampling is the existence of
strata such that each stratum is homogeneous within
itself and markedly different from the other strata, and the strata are
mutually exclusive and collectively exhaustive. The higher
the homogeneity within each stratum, the higher will be the
gain in statistical efficiency due to stratification. Samples are
usually drawn in proportion to the strata sizes. Each stratum is
looked at as a population and samples are drawn from it using any of
the methods described earlier.
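A minimal sketch of proportional stratified sampling for the residential colony example follows. The stratum sizes (100, 300 and 100 families) and the names `proportional_stratified_sample` and `colony` are assumptions made purely for illustration; within each stratum a simple random sample is drawn.

```python
import random


def proportional_stratified_sample(strata, n, seed=None):
    """Draw a sample of about n units, allocated in proportion to stratum size.

    strata maps a stratum name to the list of units in that stratum.
    Rounding means the allocations may not add up to exactly n in general.
    """
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        n_h = round(n * len(units) / total)   # proportional allocation
        sample[name] = rng.sample(units, n_h)  # simple random sample in stratum
    return sample


# Hypothetical colony: 100 high-, 300 middle- and 100 low-income families.
colony = {
    "high": [f"H{i}" for i in range(1, 101)],
    "middle": [f"M{i}" for i in range(1, 301)],
    "low": [f"L{i}" for i in range(1, 101)],
}

chosen = proportional_stratified_sample(colony, 50, seed=1)
sizes = {name: len(units) for name, units in chosen.items()}
print(sizes)  # {'high': 10, 'middle': 30, 'low': 10}
```

With these assumed sizes, each category contributes to the sample in the same 1:3:1 ratio in which it appears in the colony.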
Cluster Sampling
In cluster sampling, the population is divided into well
defined groups or clusters, in such a way that each cluster
is representative of the entire population. In practice,
clusters are identified based on some naturally occurring
phenomenon such as villages, city blocks, sales territories,
etc. After that, a few of these clusters are randomly selected
and usually completely enumerated. In case the cluster sizes
are large, one may resort to random sampling in the selected
clusters. The selection of these clusters is done by using any
one of the sampling methods discussed above. For example, when a
pre-poll survey is conducted in an assembly segment,
the entire voting population is divided into clusters. Then
some clusters are selected as samples and every element
of these clusters is studied to arrive at a final opinion
regarding the entire population.
Cluster sampling is used primarily because it allows for
great economies in data collection costs since the travel
related costs etc. are smaller. Although it is statistically
less efficient than simple random sampling, in most cases
this deficiency may be offset by the high economic
efficiency that it offers.
For example, to get a certain precision level one might
need a sample size of 100 under simple random sampling
and a sample size of 175 under cluster sampling.
However, if the cost of data collection per unit is Rs. 20 under
simple random sampling and only Rs. 5 under cluster
sampling, the total cost would be Rs. 2,000 against Rs. 875, so
it would be cost effective to use cluster sampling.
Cluster sampling is mostly used in multi-stage sampling.
Figure 7.2.3: Cluster sampling
Multistage Sampling
When large national level surveys are undertaken, for better
representation and economy of costs, the samples are drawn
at different stages. For instance, a study on rural
unemployment may identify states as the first stage unit,
districts as the second stage unit, villages as the third stage
units, and households as the ultimate stage units. At each
stage we will take a sample using an appropriate random
sampling method for that stage. Most national level surveys
are carried out using such multistage sampling.
Non-Probability Sampling Methods
In non-probability sampling the sample units are selected on
a non-random basis, ignoring their probability of occurrence in
the population (since we may not know it). We resort to
such approaches under the pressure of non-availability
of a sampling frame, cost, time and ease of work,
and when high accuracy is not of primary importance.
Some of the non-random sampling methods are as follows:
Judgment Sampling
Convenience Sampling
Quota Sampling
Sequential Sampling
Judgment Sampling
In judgment sampling, the selection of the sample is
based on the judgment of the manager who is studying a
situation. This method is also known as “purposive
sampling” or “deliberate sampling”. This sampling method
should be carried out by an expert in the field as his
judgment will influence the final outcome of the study.
Convenience Sampling
This method is based on the convenience of the
researcher. The researcher uses the sources available to
him to come to a conclusion. For example, he may use a
telephone directory to select the respondents for an
opinion poll, or take the list of employees of an organization
to study the employees.
Quota Sampling
In quota sampling, as in stratified sampling, we first
partition the population into mutually exclusive sub-groups.
Then a pre-specified proportion of the sample is
drawn from each sub-group on a judgement basis. For
example, while carrying out opinion interviews (on streets)
on events like a budget announcement, tele-journalists
work on a quota (say, on an age group basis, or on a gender
basis). The journalists may involuntarily tend to
interview the more “cooperative” people. Such a sample
may not be representative.
Sequential Sampling
In sequential sampling the size of the sample is not fixed in
advance; it is decided as the sampling process takes
place, depending on the results of the first sample. A number
of sample lots are drawn in sequence, one after another, from
the population depending on the results of the earlier samples.
This sampling method is used for statistical quality control.
For example, a manager draws a lot from the inventory and
tests it for acceptability. If it is clearly acceptable, no
further samples are required, and if it is clearly unacceptable, the
entire stock is rejected. But when the results of the first
sample fall close to the acceptable standard, the manager will go
for another sample before deciding on the quality of the
inventory.
Exercise for Discussion:
If you want to find the average height of all the students in
your MBA batch
a. What is the best way to draw a representative sample of
b. What do you think should be a good sample size?
REVIEW 7.1
Question 1 of 10
A population is normally distributed with
mean = 0 and standard deviation = 1.
What is the approximate probability that
an observation from the population will
A. 0.7156
B. 0.8435
C. 0.9065
D. 0.9974
Section 3
Sampling and Non-Sampling Errors
A sample survey studies only a limited number of units of the total
population; hence there is scope for inaccuracy or
error in the process of collection, processing and analysis of
the sample data. These errors can be broadly classified
into sampling and non-sampling errors.
Sampling Errors
The purpose of taking a sample from a population is to
estimate a population parameter through a sample statistic.
Since chance dictates the selection of units in each sample,
the estimate of the population parameter would vary over
different samples. The variation in the estimates
due to this chance variation over samples is
referred to as sampling error. It is possible to
estimate this error statistically from even a single
sample. This is of great help in judging the worthiness of an
estimate of the parameter.
Some of the causes for error in sampling are:
Error in selection of the sample
Bias in the reporting of data
Diversity of population
Substitution of sampling units for convenience
Faulty demarcation of sampling universe.
Non-Sampling Errors
Non-sampling errors occur at the time of observation,
approximation and processing of data. This error is common
to both sample surveys and census surveys. In fact, it is usually larger
in a census survey, simply because many more units are
surveyed. Non-sampling errors can arise at any stage of
the planning or execution of complete enumeration or
sample survey. The non-sampling error may be due to
faulty sampling plan, errors in design of the survey, sample
substitution at the field level, measurement error, lack of
trained and qualified investigators, inaccuracy in responses
collected due to bias on the part of the respondent or the
researcher, and finally the errors in compilation or
publication.
REVIEW 7.2
Question 1 of 5
In sampling surveys, the errors are
A. Standard Error of Mean and Population
B. Type I and Type II Error
C. Probability Errors and Non-Probability Errors
D. Sampling Errors and Non-Sampling Errors
Section 4
Sampling Distribution
At IBS, we have 14 sections of first year MBA students,
each section containing 70 students. Thus if we look at the
first year students as population, our population size is 980.
Each section can be looked upon as a sample from this
population. Let us consider the variable, marks obtained in
a common QM examination (out of 100 marks). Clearly,
each student will score a value between zero and hundred.
The distribution of marks for different sections may be as in
figure 7.4.1.
The mean score ($\bar{x}$) for each section can be expected
to be somewhere in the middle. If we look at all the $\bar{x}$'s,
one thing that we can say intuitively is that they will have
much less dispersion, possibly lying between 50 and 60. Now if
we look at the distribution of $\bar{x}$ over the sections, we can
expect its mean to be the same as it is for the population of all sections, but
clearly expect the variance, and hence the standard deviation, to
be considerably less.
Figure 7.4.1: Distribution of marks
This intuitive result is more formally established through the
celebrated Central Limit Theorem in statistics. The theorem
states that when samples are taken from a large population
with mean (µ) and standard deviation (σ), then the
distribution of the sample mean ($\bar{x}$) would be
approximately normal, for sufficiently large samples, with mean (µ)
and standard deviation $\sigma/\sqrt{n}$,
irrespective of the shape of the population distribution,
where n is the sample size. To restate, if $X_1, X_2, X_3, \ldots, X_n$
are independently and identically distributed with mean (µ)
and standard deviation ($\sigma_x$), then, approximately,
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \sim N\!\left(\mu, \frac{\sigma_x^2}{n}\right).$$
As the sample size n increases, the standard deviation of $\bar{x}$
will decrease, which is likely to bring the sample mean ($\bar{x}$)
closer to the population mean. For this reason, this standard
deviation is referred to as the standard error of $\bar{x}$.
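In the case of a finite population of size N, the standard error of
$\bar{x}$ ($\sigma_{\bar{x}}$) is adjusted with a multiplier and given by
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}.$$
Notice that when N is large and N >> n, the multiplier will be
close to one.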
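The effect of the finite population multiplier can be seen numerically with a small sketch. The function name `standard_error` is illustrative, and the population standard deviation of 15 marks used for the IBS example (N = 980, n = 70) is an assumed value, not a figure from the text.

```python
import math


def standard_error(sigma, n, N=None):
    """Standard error of the sample mean.

    sigma : population standard deviation
    n     : sample size
    N     : population size; if given, the finite population
            multiplier sqrt((N - n) / (N - 1)) is applied.
    """
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return se


sigma = 15  # assumed standard deviation of marks, for illustration only

print(round(standard_error(sigma, 70), 4))           # without the multiplier
print(round(standard_error(sigma, 70, N=980), 4))    # one section out of 980 students
print(round(standard_error(sigma, 70, N=10**6), 4))  # N >> n: multiplier close to one
```

The second value is smaller than the first because a section of 70 is a noticeable fraction of 980 students, while for a very large N the adjustment practically disappears.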
We can find out the sampling distribution of not only the mean,
but also any other statistic estimated from the sample, such as
the standard deviation (s). The
sampling distributions of different statistics provide the
basis for estimation and testing of hypotheses to be
discussed in the subsequent chapters.
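The behaviour of the sampling distribution of the mean can also be checked by simulation. The sketch below is a rough illustration under assumed settings: a deliberately skewed artificial population of 10,000 values, samples of size 30 and 2,000 repetitions. None of these numbers come from the text.

```python
import random
import statistics

rng = random.Random(0)

# A skewed (exponential) population, chosen to show that the result
# does not depend on the population itself being normal.
population = [rng.expovariate(1.0) for _ in range(10_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

n = 30  # sample size
sample_means = [
    statistics.fmean(rng.sample(population, n))  # mean of one random sample
    for _ in range(2_000)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

print("population mean        :", round(mu, 3))
print("mean of sample means   :", round(mean_of_means, 3))
print("sigma / sqrt(n)        :", round(sigma / n ** 0.5, 3))
print("std dev of sample means:", round(sd_of_means, 3))
```

The mean of the sample means should come out close to the population mean, and their standard deviation close to sigma divided by the square root of n, which is the standard error described above.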
REVIEW 7.3
Question 1 of 5
Which of the following sampling methods is most susceptible
to subjectivity in selection?
A. Stratified sampling
B. Simple random sampling
C. Cluster sampling
D. Judgment sampling
Section 5
Estimation
In most statistical studies, the population parameters are
unknown and must be estimated. Therefore, developing
methods for estimating, as accurately as possible, the values
of the population parameters is an important part of statistical
analysis. The primary goal of a sampling activity is to make
an inference about something using the least amount of
information possible. Here, we must be quite certain of things
like the number of observations to be made, the number of
points to sample and the number of people to survey.
Point Estimates
We can make two types of estimates about a population:
point estimates and interval estimates. A point estimate is a
single number that is used to estimate an unknown
population parameter.
For example, an estimate that the average weight of a
classroom of students is 50 kg or that the number of students
to register online for a particular university course is 250 is a
point estimate. Often, a point estimate is insufficient, as it is
either right or wrong. If it is said that a particular estimate is
wrong, we cannot be certain how wrong that estimate is or
about the reliability of that estimate.
The sample mean ($\bar{x}$) is the best estimator of the population
mean µ. It is unbiased, consistent, and efficient, and as long as
the sample is large, its sampling distribution can be approximated
by a normal distribution.
The value of the sample mean is then an estimate of the
population mean. Similarly, point estimates
can also be determined for the population variance, standard
deviation and the population proportion.
Interval Estimates and Confidence Interval
If the actual result varies from the estimate by a little margin,
then it can be accepted as a good estimate whereas if it were
off by a large margin, it would be rejected as a poor estimate.
Therefore, a point estimate is useful if it is accompanied by
an estimate of the error that might be involved. Equivalently,
we can state an interval estimate, whose lower and
upper limits are obtained with the help of the standard error of the
statistic used to estimate the population parameter. It indicates the inherent
error in estimation in two ways: by the extent of its range and
by the probability of the true population parameter lying within
that range. Using the above example, an interval estimate
would be something like this: the average weight of a class of
students is expected to be between 40 kgs and 55 kgs with
95% confidence, i.e., it is 95% certain that the exact average
weight falls in this range. Because of this way of looking at the interval
estimate, we also call it the confidence interval.
Estimators and Estimates
Any sample statistic that is used to estimate an unknown
population parameter is called an estimator. The sample
mean ($\bar{x}$) can be an estimator of the population mean µ, and
the sample proportion can be used as an estimator of the
population proportion. Similarly we can also use the sample
range as an estimator for the population range.
Suppose the sales manager of a dry cell manufacturing firm
needs an estimate of the average life (in months) of the
batteries. To proceed, let us take a sample of 500 batteries and
survey people who use those batteries about the battery life
they have experienced. Let us say that the present sample of
500 batteries has a mean battery life of 45 months. This gives
the point estimate for the life of the batteries.
However, the sales manager is not satisfied with this and can
ask for the amount of uncertainty that accompanies this
estimate, which in essence is a range within which the unknown
population mean is likely to lie. Over several years he has
observed the standard
deviation of the life of batteries to be 18 months. Hence, this
can be taken to be the standard deviation of the population.
Thus, the standard error of $\bar{x}$ is given by:
$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{18}{\sqrt{500}} \approx 0.805$$
We can now tell the sales manager that our estimate of the
life of the firm’s batteries is 45 months, and the standard error
that accompanies it is 0.805. In other words, the 95%
confidence interval can be given as:
$$\bar{x} \pm z_{(95\%)}\,\sigma_{\bar{x}}$$
where z (95%) is a value to be read from the standard normal
table and indicates that 95% of the observations of a standard
normal variate will lie within ±z (95%) of the mean (which
is zero here). When read from the table, z (95%) = 1.96.
Hence the confidence interval for the above example is
45 ± 1.96 × 0.805, i.e., (43.42, 46.58).
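The battery calculation can be verified with a short sketch. It assumes, as in the example, that the population standard deviation is known; the function name `confidence_interval` is illustrative, and the z value is taken from Python's `statistics.NormalDist` instead of a printed table.

```python
import math
from statistics import NormalDist


def confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Confidence interval for the population mean when sigma is known."""
    # z such that the central `confidence` share of a standard normal
    # variate lies within +/- z of zero (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sigma / math.sqrt(n)  # z times the standard error
    return sample_mean - margin, sample_mean + margin


# Battery example: mean life 45 months, sigma = 18 months, n = 500.
low, high = confidence_interval(45, 18, 500)
print(round(low, 2), round(high, 2))  # 43.42 46.58
```

Raising the confidence level, say to 99%, widens the interval, since a larger z is needed to cover more of the distribution.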
A detailed list of confidence interval formula under different
situations is available at the end of the chapter on testing of
hypothesis.
Section 6
Case Study: Sampling the Population Favourite
This case study was written by Siva V Gabbita, Professor, IBS, Hyderabad. It is intended to be used as the basis for class discussion
rather than to illustrate either effective or ineffective handling of a management situation. The case was prepared from
generalised experiences.
“Have you tasted Hyderabadi Biryani?” That is the inevitable
Amar Singh (Amar) is a graduate from Delhi. Recently, he
came to Hyderabad to do a diploma course in management.
An avid foodie, Amar had heard of quite a few places that
served the popular dish, but he was a little sceptical about the
wedding hosted at a well-known 5-star hotel. He was a
vegetarian and had helped himself to small servings each of
Hyderabadi Vegetable Biryani, Kashmiri Pulao and Vegetable
Fried Rice, but did not find much variation amongst them.
However, his friends from Hyderabad suggested that he postpone
his judgment until he visited places that specialised in
the dish. They told him a list of hotels renowned for Biry-
Qahna, Garden Restaurant, Alpha Hotel, Hotel Madina,
Hotel Niagara, Cafe Bahar, Shadab Restaurant and not to
mention the restaurants at Grand Kakatiya and Taj Banjara.
come to stay in Hyderabad. He decided to taste the dish at
Paradise that week. He also decided to taste the Biryani at a
few other hotels and compare the tastes to decide if the
Biryani at Paradise was significantly superior to that prepared
at other restaurants.
Paradise restaurant was started as a small shop and currently
it has three floors of the same building to itself. In spite of all
the fame it has, the food at Paradise is fairly priced and big on
quantity. It is the Quality/Price (=Value) that provides paisa vasool
and is one of the basic reasons for the enduring popularity
of the Biryani at Paradise restaurant.
The best offerings at Paradise are arguably the Chicken Tikka
Kabab starter followed by a Hyderabadi Chicken Biryani. For
vegetarians, it is the Paneer Tikka starter followed by a Vegetable
Biryani, and the course is not completed without the
up of Vanilla ice cream topped with apricot puree.
Do-it-Yourself
Over the next few months, Amar visited different places including
his friends’ homes. Once, the mother of one of his friends mentioned
that there is a difference in preparation at home from
what is prepared in restaurants because of the quantity in
which the dishes are prepared. This statement made an impression
on Amar. He and his like-minded friends wanted to
try their hand at preparing Hyderabadi Vegetable Biryani. After
browsing the Internet and speaking to a few chefs, they
came to know of the essential ingredients and the recipe for
The Recipe
Potatoes, carrots, french beans and cauliflower are boiled
with salted water. Sliced onions are deeply fried in oil until
they turn reddish-brown and the onions along with beaten yogurt,
garam masala, ginger and garlic paste (optionally) are
allowed to marinate for 1 hour. Simultaneously, rice is
washed, soaked and cooked until half done. Ghee heated in a
thick-bottomed vessel is sautéed for 2–3 minutes with the
marinated vegetables and brought to boil in water before layering
the rice over the cooked vegetables. Biryani masala, mint,
coriander leaves, and more deep-fried onions, cashew nuts,
almonds and raisins are added and saffron milk is sprinkled
before covering the vessel with a moist cloth. Then the vessel
is covered with a lid, sealed with dough and cooked for 15–20
minutes at 180°C (alternatively cooked over a slow fire for
15–20 minutes).
Trial and Error
Over the next few months, cooking the perfect Biryani became
an obsession with Amar. In his inaugural attempt, he
used too little water while cooking the rice and ended up burning
it at the bottom. Later on, he learnt to be careful while uniformly
mixing the rice, so that the burnt taste would not
spread to the upper well-cooked layers.
Another revelation was that the spices and garam masala
made a big difference to the outcome. The extent to which the
vegetables were boiled, which in turn was determined by the
amount of water used, also brought a lot of variation in the
taste. He found that frying the boiled vegetables before adding
them to the cooked rice sometimes improved the taste.
Apart from these, he also found that there are many other variables,
which changed the taste subtly. For instance, the sequence
in which the sliced onions were fried, i.e., whether
fried separately or along with the vegetables, adding salt separately
to the onions and to the vegetables, boiling the rice
along with the vegetables, the time taken to cook the rice, etc.
tion. However, no two Biryani preparations tasted exactly the
same. Nevertheless, there was an essential taste, which was
unique to Amar’s Biryani. Though his Biryani tasted similar to
the ones prepared by anyone else, it was not exactly identical.
There was ‘something’ in Amar’s Biryani that enabled his
friends to identify whether it had been prepared by Amar or
not just by helping themselves to a small portion.
Then Amar realised that this is also true of the Biryani prepared
at restaurants. Though the Biryani at Paradise might
taste subtly different from day to day, there was ‘something’
realised that there was also variation across the Biryani prepared
by different chefs. There was a distinct quality to the
Biryani prepared at each of the places, making it reasonably
simple to identify its source as well as compare and contrast
between Biryani prepared at different restaurants. However,
the average taste at Paradise restaurant varied from the average
taste at Bawarchi restaurant, which varied from the average
taste prepared at Café Bahar and so on.
Reflecting back on his experience at the wedding dinner at
the 5-star hotel, Amar once again wondered why then there
had not been any significant difference between the Hyderabadi
Biryani, Kashmiri Pulao and Vegetable Fried Rice. He
concluded that they were all perhaps made once and served
separately with different garnishing and separate nametags.
Questions for Discussion
1. Can we make a judgment about an entire batch by
evaluating a single portion drawn from that batch? Substantiate
with suitable reasons.
2. If there is variation within the batches produced each
time by a producer, can we compare and contrast (to
find a significant difference) between all of them just by
evaluating a single portion from one producer and comparing
it with a single portion from the other producer?
3. Is it possible to make an error in judgment? When,
therefore, is this technique justified and when is it not
justified?
4. Was Amar correct in his analysis of variance (based on
samples drawn from the Hyderabadi Biryani, Kashmiri
Pulao and Vegetable Fried Rice) that all the three samples
must have come from the same population since
there seemed to be no significant difference between
them? Is it not possible that this was coincidence?
Section 7
Case Study: Ascertaining Customer Satisfaction
This case study was written by Dr. Sunil Bharadwaj, Professor (Department of Decision Sciences), IBS, Hyderabad. It is intended
to be used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management
situation. The case was written from generalised experiences.
Sakuma India (Sakuma), an FMCG company, is planning to achieve a
significant lead in the country’s digital still camera market with
the strongest ever product line-up in the category. With a 42%
share of the still camera market, Sakuma is well ahead of
the competition and currently holds the dominating
market share. The market share figure is a part of Sakuma’s
estimate based on expected total sales of Cyber-shot
digital still cameras in FY2009, compared to total market size
for the category.
According to a press release, Sakuma has already launched
existing camera series. Sporting colourful fresh looks, the
cameras have slimmer dimensions and a futuristic design that
is easy to flaunt or carry around. The Cyber-shot lineup is
equipped with high 10–12 megapixel resolution, a newly-developed
Exmor CMOS sensor, intelligent features and powerful
imaging innovations to deliver enhanced imaging performance
and convenient photo sharing solutions for the Indian
consumer.
Sakuma has already sold around 50,000 Cyber-shot cameras
and is now seeking to assess the satisfaction level of the users.
The company is not planning to contact each and every
user of the Cyber-shot camera for the obvious reasons of high
cost, time and effort involved in the process of contacting all
the customers. The marketing head of the company, Joe Phillip
(Phillip), is assigned 2 months to complete the job. Phillip
has called for a meeting with his team and the team is contemplating
various options for doing the job. All the team
members agreed that a questionnaire-based survey method
of data collection would be a good option to assess the satisfaction
level of the users with the product. The questionnaires
can be administered through either e-mail, postal survey
or telephonic interview. However, the big question before
the team is – who and how many customers should they contact?
Dev Anand, an executive, who joined Sakuma recently, sug-
gested an e-mail survey by e-mailing the questionnaire to
those customers who provided them with e-mail IDs. With suf-
ficient data available in the customer information form at retail
outlets, collection of e-mail IDs will not be a problem. A sample
of an adapted version of the customer information form is given
in Exhibit I.
However, other members did not feel enthusiastic about a sur-
vey through e-mails. They had experimented with this idea
earlier and felt that most of the time e-mail ID is either not
available or it is not furnished by many of the customers. Also,
customers do not respond well to e-mail surveys. The typical
response rate of a usual e-mail survey is as low as 2% and the
usable responses are fewer still. In an e-mail survey, most of
the time, the e-mail lands up in the spam folder and the cus-
tomer neglects it. Also, there may not be any motivation for
the customer to open the e-mail and go through the survey
questionnaire sincerely.
The other options left were using telephonic interviews and
mail surveys. While the response rate and quality are usually
very good in telephonic surveys, the cost of survey is high. On
the other hand, in mail surveys, the response rate is better
than that in e-mail survey and a higher response rate can be
guaranteed through lucky-draw reward schemes. Moreover,
unlike e-mail IDs, the postal addresses of all the customers
are mostly available. However, one problem with the mail sur-
vey is the time taken by the customer to respond, if at all he
responds. The customer will not be motivated to fill the form
and post it back on the same day. The response is further de-
layed by the usual process of postal procedures. After a long
discussion on the methods of survey, the team finally agreed
to go for telephonic interviews, as the time available to com-
plete the study was limited.
The other part of the decision was to determine the number of
customers to be contacted and the technique to be adopted to
identify them. From their past experiences, all the team members
knew that if 10%–20% of the customers are contacted, a fair
idea of the situation can be obtained. As such, the team
needs to find ways to select the 10%–20% of the customers.
However, the team wound up the discussion at this juncture
and agreed to meet after 2 days with possible options for the
survey. In the next meeting, the executives gave suggestions
on their approach to the problem.
Raman believed that the product is doing well and it is evident
by the appreciation letters and entries made by the customers
in the company’s blog. He suggested that the company can
contact only those bloggers and can get a very favourable re-
sponse. Collecting data from them would be very easy, as
they have already registered on the blog.
Exhibit I: Customer Information Form
Customer Name
Mobile/Telephone No.
E-mail ID
Age
Profession
Product Bought
Details of the Product
Other Information
Prepared by the author
On the other hand, Ravi Saxena (Saxena) suggested that the
company should treat all its customers on the same footing
and every one should have an equal chance of appearing in
the survey. He questioned the idea of relying on the blog. Sax-
ena is a strong supporter of random sampling. He recalled his
earlier experience, where he was supposed to take opinions
of doctors (from established hospitals) on electronic equip-
ment manufactured by the company. At that time, he had gen-
erated a list of all hospitals and physicians working in each of
the hospitals, wrote their names on a piece of paper, put them
in a box, mixed them well and drew certain names. In the
same way, he wanted the survey to be conducted through the
‘box’ approach. However, Ajay Jadeja strongly objected to this
idea. He argued that with a list of 50,000 customers, the ‘box’
approach is not practical.
Rohan Pillai, while supporting the need to generate the list on
a completely random basis, suggested choosing every fifth or
tenth (or nth) customer from the list. He felt that through this
method one can easily and quickly generate the details of de-
sired sample and the sample would still maintain a fair degree
of randomness.
However, Ram Tarneja (Tarneja), a senior executive, who was
patiently listening to the above discussion, suggested another
way of approaching the problem. He agreed that randomness
is fair but raised a query, “Instead of generating the sample
from the consolidated list of customers, why don’t we make
groups of customers according to zones, states and metropolitan
cities in India?” If they go by this method, he felt, they
can easily assess in which states or zones customers are
satisfied and in which they are not, and act accordingly. For
example, if they go by this method, it may turn out that Delhi
customers need more attention than Mumbai customers or vice versa.
At this point, Amrita Basu, who was keenly following the
discussions, chipped in. While appreciating Tarneja’s suggestion,
she raised another query, “Why not group the customers
on the basis of camera models and then go for sampling for
each of the camera models?” She felt that this way, they can
be sure of the models. The discussion went on for another 2
hours, but a conclusion could not be reached.
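The selection schemes proposed by the team can be compared with a small Python sketch. It is only an illustration under stated assumptions: the customer list, the four zone labels and the 10% sampling fraction are hypothetical stand-ins, not Sakuma data, and the function names are invented for this example.

```python
import random

def simple_random_sample(customers, k, seed=0):
    """The 'box' approach done by computer: every customer is equally likely."""
    rng = random.Random(seed)
    return rng.sample(customers, k)

def systematic_sample(customers, step, seed=0):
    """Every step-th customer from the list, starting at a random position."""
    rng = random.Random(seed)
    start = rng.randrange(step)
    return customers[start::step]

def stratified_sample(customers, key, fraction, seed=0):
    """Group customers (by zone, model, ...) and sample randomly within each group."""
    rng = random.Random(seed)
    groups = {}
    for customer in customers:
        groups.setdefault(key(customer), []).append(customer)
    sample = []
    for members in groups.values():
        k = max(1, round(fraction * len(members)))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical list of 50,000 customers as (customer_id, zone) pairs.
zones = ["North", "South", "East", "West"]
customers = [(i, zones[i % 4]) for i in range(50_000)]

srs = simple_random_sample(customers, 5_000)
systematic = systematic_sample(customers, step=10)
stratified = stratified_sample(customers, key=lambda c: c[1], fraction=0.10)
print(len(srs), len(systematic), len(stratified))
```

Grouping by camera model instead of zone only requires a different `key` function, which is why the two grouping suggestions are handled by the same routine here.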
SECTION 8
Case Study: Customer Satisfaction with DTH Services in India
This case study was written by Sravanthi Vemulawada under the direction of R Muthukumar, IBSCDC. It is intended to be used as
the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situation. The case
was written from generalised experiences.
Since Cable TV entered India in 1992, entertainment on tele-
vision has grown rapidly. Out of the 71 million TV households
with the soaring viewership, complaints on quality also in-
creased. Digitalization of Cable TV took a new form when
Direct-To-Home (DTH) was launched in India in 2003.
The DTH service is an encrypted transmission. It is a digital
satellite service that provides television services direct to
subscribers anywhere within the country. Unlike the regular
cable connection, the Set-Top-Box (STB) decodes the
encrypted transmission. Since it makes use of wireless
technology, programs are sent to the subscriber’s television direct
from the satellite. This eliminates the need for cables and ca-
ble infrastructure. DTH service is particularly effective in re-
mote areas, where cables and even normal television serv-
ices are poor or nonexistent. These services provide the fin-
est picture and sound quality. Like the quality of any modern
movie theatre, DTH also provides the best quality surround
sound.
Although DTH services were proposed in India way back in
1996, they were not permitted until 2003. The government
withheld approval for DTH due to concerns over national
security and cultural invasion. To prevent the implementation of
DTH service, even the cable operators had heavily lobbied
the government. Finally, the Government laid out certain regu-
lations for DTH providers to operate in India. To name a few,
no foreign player can invest more than 49% in an Indian DTH
venture, no broadcaster or cable network can earn more
than 20% share in DTH venture and the DTH provider has to
be an Indian company. Apart from that, players are required
to pay an initial amount of INR 100 million while entering the
business along with a bank guarantee of INR 400 million for
a license period of 10 years. They are also required to pay
the government 10% of their gross revenues, 12.36% of sub-
scription fees, entertainment tax ranging between 10%–20%
(varies from city to city) and VAT of 12.5%.
The Indian Government issued the first private DTH license
to Dish TV in 2003 and Dish TV started its operations in
2004. Dish TV installed a pizza size dish antenna and STB
for INR 3,190 at the subscriber’s end and charged a monthly
subscription fee depending upon the package opted for.
To suit the needs and pockets of different customers, Dish
TV offered four different packages made available through
25,000 dealers across the country. In 2 years, Dish TV
garnered a subscriber base of 1.5 million.
Dish TV has around 500 channels. Now it is planning to add
one more channel to its basic services because it wants to
increase its sales. This channel is expected to be one
amongst few entertainment channels. One of your friends is
appointed as a consultant with Dish TV. Since Mumbai is the
hub of the entertainment industry in India, your friend is of
the opinion that a survey of the channel’s Mumbai viewers
would be sufficient to know if a new channel has to be in-
cluded or not. Is his approach proper?
In 2006, Dish TV faced a new contender as Tata Sky entered
the DTH market. Tata Sky invested INR 25 billion and
launched its service simultaneously in 300 cities across India,
concentrating mainly on Tier 1 cities. Tata Sky offered an initial
package of 55 channels. Its packages were priced similar to
those of Dish TV. By 2007, Tata Sky had 1.5 million subscribers.
Tata Sky has introduced a new service called Tata Sky+. It
service. It divided its customer base into five age groups i.e.,
10–19, 20–29, 30–39, 40–49, 50–59 and surveyed these
groups accordingly. Is this approach proper?
Apart from Dish TV and Tata Sky, customers got another
option in December 2007. South India’s first DTH provider, Sun
Direct TV⁶, launched its services at a price of INR 1,999 and
monthly subscriptions ranging between INR 75–250. Apart
from the usual offerings, it even provided add-on packages
and customer care service to its subscribers. In just 200 days
of its inception, Sun Direct was successful in reaching 1 mil-
lion subscribers. In South India, Sun Direct holds 65% market
share.
The number of subscribers of Sun Direct service is 3.1 mil-
lion. The company wanted some inputs about its service from
its existing subscribers. They proposed to select a sample of
100,000 subscribers. What should be the approach?
On August 19th 2008, Reliance Group (a Fortune 500 company
worth INR 1,564 billion) entered the DTH sector with Big
TV⁸, investing INR 20.5 billion. Arun Kapoor, CEO, Big TV,
says that they are planning to capture 40% market share
within a year. At the outset, Big TV plans to spend INR 2 billion
on marketing and promotions. The company is using the
internet, hoardings, radio and print media to make people
aware. A live demo of the product was also made available at
the demo closets of different TV outlets. Reliance plans to
offer 200 channels packaged and priced differently between
INR 1,490 and INR 4,999. In future, it also plans to add 130
channels.
Big TV has a million customers. Since the management re-
ceived complaints of its poor customer service, they worked
on it and resolved the problem. Later on, the management
wanted to know whether proper customer service was being
provided. So, they wanted to survey 10,000 customers of Big
TV. How should the sample be selected?
Indian Telecom conglomerate Bharti Airtel launched its DTH
service called Airtel Digital on October 9th, 2008 in 5,000
cities across the country. Currently it has about 175 channels.
The company holds nearly 24.2% market share of wireless
subscribers and has 300,000¹² subscribers. The company
plans to leverage this subscriber base. By 2009, DTH
customers are expected to reach around 10 million–12 million.
Airtel service is planning to add one more channel to its basic
service. There are five channels to choose from, and the com-
pany would like some input from its subscribers. There are
about 1 million subscribers, and the company knows that
35% of these are college students, 45% are white-collar work-
ers, 15% are blue-collar workers and 5% are others. What
type of sampling should be used here and why?
Foot Notes
1. Dish TV is a DTH entertainment service, which brings 500
channels and services straight from the satellite to the home.
It provides uninterrupted viewing without any transmission
cuts along with crystal-clear digital quality picture and stereo-
phonic sound.
2. Chatterjee Purvita, “DTH makes
stores/2007011800120300.htm, January 18, 2007.
3. Tata Sky is a DTH entertainment service, which has rede-
fined the television viewing experience for thousands of fami-
lies across India. They offer over 170 television channels in
DVD quality picture and CD quality sound along with a host of
new-age interactive services.
4. “Tata and Sky finally launches Tata Sky DTH Service in India”,
August 12, 2006.
5. RajGopal, “Can DTH compete with cable?”,
www.hindu.com/2005/12/24/stories/2005122404921100.htm,
December 24th 2005.
6. Sun Direct is a DTH entertainment service wherein the
viewers can watch all their favourite programmes in true
DVD quality, it treats the viewers’ ears to a true theatre ex-
perience by providing awesome CD quality sound.
7. Iyer Byravee, “Sun Direct: Go national, think regional”,
ional/21/57/344653/, December 30th 2008.
8. Reliance’s DTH entertainment service Big TV is powered
by MPEG-4 technology, which is being used for the first time
in India. It has fantastic features like pure digital viewing expe-
rience, more channel choice, many exclusive movie chan-
nels, easy programming guide, interactive services, parental
control and 24x7 customer service.
9. Sinha Ashish, “ADAG to launch DTH service on Tuesday”,
1705, August 18, 2008.
10. Bharti Airtel launched its DTH Satellite TV called Airtel
Digital TV which is available to customers through 21,000 re-
tail points including Airtel Relationship Centres in 62 cities. It
uses the latest MPEG-4 standard with DVB S2 technology.
This latest technology enables delivery of more complex inter-
active content and is High Definition ready.
11. “Airtel DTH offers 175 channels”,
http://discuss.itacumens.com/index.php?topic=31581, October 7th 2008.
12. “DTH Networks India Forums”, http://
html, April 18th 2009.
standard.com/india/storypage.php?autono=339736, Novem-
ber 11th 2008.
SECTION 9
Case Study: Swarnamukhi Public Bank Limited’s SME Loans
This case study was written by Sravanthi Vemulavada, IBSCDC. It is intended to be used as the basis for class discussion rather
than to illustrate either effective or ineffective handling of a management situation. The case was written from generalised experi-
ences.
Small and Medium Enterprises (SMEs) are enterprises
wherein the number of employees and the turnover of the
company are below certain defined limits. SME is a very
commonly used term in the European Union, in the United Nations, by
the World Bank and the World Trade Organization.
However, the size of an SME varies from nation to nation. In
the US, a company with fewer than 100 employees is termed
a Small Enterprise (SE) and a company with fewer than 500
employees is termed a Medium Enterprise (ME). In the
European Union, a company with fewer than 50 employees is
termed an SE, while a company with fewer than 250 employees
is called an ME. In Germany, a company is called an
SME if it has 250 employees, while in Belgium, an SME
consists of 100 employees. In South Africa, the term Small
Medium Micro Enterprise (SMME) is used, whereas elsewhere in
Africa, the nomenclature is Micro, Small and Medium Enterprise
(MSME).
Most of the economies in the world are dominated by smaller
enterprises. They comprise approximately 99% of all the firms
and they even account for about 40%–50% of the industrial
production. These smaller firms employ around 65 million peo-
ple.
SMEs have a major advantage of employing people at a low
capital cost. As per statistics, the sector is one of the biggest
employment providers, employing around 31 million people
through 12.8 million micro and small enterprises in India.
The labour intensity in the SME sector is estimated to be
around four times that in large enterprises.
Indian SMEs
In India, the SMEs are known by the term Micro and Small En-
terprise (MSE). This sector plays a key role in the overall in-
dustrial economy. MSEs account for about 39% of the manu-
facturing output and around 33% of the total exports of the
country in terms of value. These MSEs also produce over
have consistently registered higher growth rate when com-
pared to the overall industrial sector.
More recently, in India the MSE sector has been enlarged to
include a medium category. Thus, the Micro, Small and
Medium Enterprises (MSMEs) are classified into two classes³:
1. Manufacturing Enterprises based on investment in plant
and machinery (Micro up to INR 25 lakh, Small between
INR 25 lakh and INR 5 crore and Medium between INR 5
crore and INR 10 crore).
2. Service Enterprises based on investment in equipment
(Micro up to INR 10 lakh, Small between INR 10 lakh and
INR 2 crore and Medium between INR 2 crore and INR 5
crore).
Globally, SMEs have been a source of innovation and SMEs
that integrated innovation are known to have garnered
significant benefits. However, in India, most of the MSEs still
believe in importing technology rather than developing it
in-house or in association with some of the national Research
and Development (R&D) centres – this despite the fact that
India has the third largest pool of technologically trained man-
power. In short, Indian MSEs have mostly neglected their
R&D, and even their new product development and techno-
Even though MSEs constitute more than 80% of the total num-
ber of industrial enterprises and form the backbone for indus-
trial development in India, they suffer from some serious prob-
lems such as sub-optimal level of operation, technological out-
datedness and even lack of capital. In recent years, Indian
MSEs have started facing tough competition, particularly from
China. Their performance is also affected by the uncertain
market conditions due to the ongoing recession. Owing to the
same, the banks are sceptical about the repayment of loans
by the MSEs. Swarnamukhi Public Bank Limited is one such
bank, which is apprehensive about the repayment of the loans
by the MSEs in India.
Swarnamukhi Public Bank Limited
Bangalore-based Swarnamukhi Public Bank Limited, a me-
dium sized bank, has its presence across India. The top man-
agement of the Swarnamukhi Public Bank Limited is con-
cerned that the default rate may go up among MSEs as a con-
sequence of the recent economic recession. They wanted to
understand the chances of default among MSE loan ac-
counts. In particular, Vasanth Desai, the managing director of
the Bank wanted to know the region wise chances for 10% de-
fault, 15% default and 20% default. He knew that during the
recession in the ’90s, about 9% of the SMEs turned out to be
defaulters on an all-India basis. To avoid the repetition of such a
situation and to take necessary initiatives, he
wanted a branch-wise report from each region.
To respond to the queries of the MD, Albert Pinto (Pinto), re-
gional manager, Nagpur region, called for a meeting of branch
managers of all those branches, which are specially focusing
on MSEs. There were six such branches, mostly located at in-
dustrial centers /estates. The collective number of loans ad-
vanced by these branches to MSEs prior to September 2008
(i.e., prior to the emergence of recession in India) was 752, in-
cluding the 150 loans that they recently approved. In the meet-
ing, most of the branch managers expressed their concern
about MSEs’ ability to sustain the economic slowdown.
After a long discussion on the performance of the old as well
as the recently established MSEs, they could assess that
most of the MSEs are hardly concentrating on developing
new products and they are importing either the products or
the concept from the West. Finally, the branch managers con-
cluded that 8% of the MSEs, who received loans, would not
be able to make payments on time.
Given the scenario, what is the probability that more than
10% of the 150 loan takers would not make payments on
time? While discussions were informative and rich in
experience sharing, Pinto wondered how to respond to the MD’s
query at each branch level.
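One way to frame the probability question arithmetically is sketched below in Python. It rests on assumptions that the case itself does not state: each of the 150 loans is treated as failing to pay on time independently of the others, with the same probability of 8%, so that the number of late payers follows a binomial distribution. The variable names are illustrative only.

```python
from math import comb, erf, sqrt

n = 150          # recently approved loans (from the case)
p = 0.08         # managers' estimate of the late-payment rate (from the case)
threshold = 15   # 10% of 150; "more than 10%" means 16 or more loans

# Exact binomial tail: P(X > 15) = 1 - P(X <= 15)
p_exact = 1 - sum(comb(n, k) * p**k * (1 - p)**(n - k)
                  for k in range(threshold + 1))

# Normal approximation with a continuity correction
mean = n * p                      # expected number of late payers
sd = sqrt(n * p * (1 - p))        # standard deviation of that number
z = (threshold + 0.5 - mean) / sd
p_normal = 1 - 0.5 * (1 + erf(z / sqrt(2)))

print(round(p_exact, 3), round(p_normal, 3))
```

If defaults within a branch or region tend to move together, as they may in a recession, the independence assumption fails and both figures would understate the risk.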
CHAPTER 8
Testing of Hypothesis
In this chapter we will discuss
The Basic Notion
The Formal Process
Steps in Hypothesis Testing
Tests for different situations
Case Studies
Smoking: A Costly Affair
Care Hygiene
Conversys Inc. (A & B)
Strategic Break
Shopper’s Stop
Hindustan Foods
A Study of Soap segment
Melting Delicacies (A)
Section 1
Testing of Hypothesis
The Basic Notion
The notion of “testing of hypothesis” comes naturally to us.
Consider the following common situations:
1. Whenever we buy fruits/vegetables/sweets etc. we decide
whether or not to buy based on our assessment, often of
a single, small portion of the whole.
2. When we buy clothes we evaluate the quality of the cloth
by checking certain characteristics of the cloth and/or
tailoring/styling. Our purchasing decision is thus
influenced.
3. When we meet a stranger we decide whether we like or
dislike them based on an assessment of whether their
personality matches or does not match our expectations.
4. We sometimes decide whether or not to watch a movie
based on the reviews and promos of the film. If the promo
is good we infer that the movie must be good too.
5. Investigators at a crime scene proceed first by identifying
a suspect and then try to collect evidence to establish
the criminality of the suspect.
6. A judge decides about the innocence or guilt of a
defendant based on the overall balance of the evidence
produced by the lawyers on both sides.
In all these examples we start with an assumption/
expectation and then, taking a sample of evidence, we
check whether the sample evidence is within an acceptable
region or not, and accordingly take the decision. Against this
framework, the above examples are summed up in the
following table 8.1.1:
Table 8.1.1
| S.No. | Example | Initial guess/assumption | Evidences (sample) | Processing to a summary measure | Acceptance Threshold | Conclusion |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Buying fruits/vegetables/sweets | | Sample a few items | Taking a position on quality | Evidence within acceptance threshold | |
| 2 | Buying clothes | | Sample a few items | Taking a position on quality | Evidence within acceptance threshold | |
| 3 | Meeting a stranger | Like minded | Various behavioral aspects of the stranger observed | Taking a view on the stranger | They match with a threshold level acceptable to me | Befriend him/her |
| 4 | Watching a movie | Good movie | Watch reviews and promos | Taking a view on the movie | Reviews and promos within a threshold level acceptable to me | Go for movie |
| 5 | Crime investigation | Identified suspect is guilty | Collect evidences towards this | Taking a view on the suspect | Incriminating evidence above a threshold level | Suspect is guilty |
| 6 | Jury trial | Defendant is innocent | Evidences (for and against) presented to the jury | Taking a view on the defendant | Incriminating evidence above a threshold level | Defendant is guilty |
Thus, based on the available evidence, we have a way of
reaching a conclusion, a very likely conclusion, and yet
we cannot vouch for it to be the “absolute truth”. Our
conclusions are always most likely, in other words
probabilistic. This means that there is always a probability
that our conclusions are wrong. The wrong decision could
occur in two ways: (a) we may reject a hypothesis when it
is actually true, and (b) we may accept a hypothesis when
it is actually false. For instance, we might have drawn an
unrepresentative sample and hence our conclusion goes
wrong. In the jury trial, while the available evidence may
incriminate the defendant, we are aware of cases
where, sometimes after years of punishment to the convict,
evidence has emerged establishing the convict to be
innocent beyond doubt.
Visit: www.guardian.co.uk>News>Worldnews>Capital
punishment or http://www3.law.columbia.edu/hrlr/ltc
While approaching the problems statistically, essentially
around the premise laid down above, we use some formal
terminologies:
a. The initial guess/assumption is clearly about some
characteristic of the population. In other words, we are
b. Thus the initial guess/assumption about the population
parameter is referred to as the Null Hypothesis (H0). A
hypothesis negating this position is called the Alternate
Hypothesis (H1).
c. The evidences are primarily obtained through samples.
d. The summary measure is referred to as the test
statistic, which is obtained through statistical
considerations and could differ from context to context.
e. The acceptance threshold is referred to as the critical value
and the region in which the test statistic is acceptable is
called the acceptance region. Consequently the region
beyond the critical value is called the rejection region.
f. The entire process goes under the broad name of
statistical inference.
The Formal Process
Let us now discuss the topic more formally.
Typically, we hypothesize a point estimate of a population
parameter. We take a sample and compute the sample
statistic. We test it by comparing the observed value of
the sample statistic with the expected value of sample
statistic (assuming the hypothesized parameter to be
true) and judging if the difference is significant. The
smaller the difference, the greater the chances of our
hypothesized value being correct and vice versa.
However, there is some amount of arbitrariness in
judging as to what should be considered as large
difference or otherwise. In practice we use standardized
values of the sampled statistic, which follows a known
probability distribution under assumptions. This
standardized statistic is called test statistic.
When the observed (or calculated) test statistic is
compared against a value (called critical value of the test
statistic) obtained from statistical tables for the probability
distribution of the test statistic, it allows us to decide with
a certain degree of confidence if the difference between the
observed value of the sample statistic and the expected value
of the sample statistic is significant.
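The comparison just described can be sketched in Python for the simplest case, a test on a population mean when the population standard deviation is taken as known. All numbers below are hypothetical and chosen only to make the arithmetic easy to follow; the function name is invented for this illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(sample_mean, mu0, sigma, n):
    """Standardize the gap between the observed and the hypothesized mean."""
    return (sample_mean - mu0) / (sigma / sqrt(n))

# Hypothetical inputs (not taken from the text)
mu0 = 50.0          # hypothesized population mean
sigma = 10.0        # population standard deviation, assumed known
n = 100             # sample size
sample_mean = 52.5  # observed sample mean

z = z_statistic(sample_mean, mu0, sigma, n)

# Critical value of the standard normal for a two-tailed test at the 5% level
critical = NormalDist().inv_cdf(0.975)

significant = abs(z) > critical
print(z, round(critical, 2), significant)
```

Standardizing removes the units of measurement, which is what allows a single table of the standard normal distribution to serve every problem of this type.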
However, one should bear in mind that we are trying to
conclude something about the nature of the population based
on a sample from it. Hence, there is always a chance of our
going wrong if the sample does not happen to be
representative of the population (which we can never really be
sure about). Thus, we always make a probabilistic
statement about the conclusion reached, such as “we accept
(or reject) the hypothesis with 95% confidence”. This means
the decision rule is set up so that, if the hypothesis is true,
the difference between the observed and expected
values of the sample statistic (under the hypothesized
parameter value) will be judged significant in only 5% of samples.
In other words, there is a 5% chance of wrongly rejecting a true
hypothesis through statistical inference.
Two types of error can be committed while carrying out a test,
as is clear from table 8.1.2:
Table 8.1.2
| Actual (true state) \ Decision (conclusion) | H0 accepted | H0 rejected (i.e., effectively H1 accepted) |
| --- | --- | --- |
| H0 true | Correct decision | Type I Error; P(H0 rejected / H0 true) = α |
| H0 false (i.e., effectively H1 true) | Type II Error; P(H0 accepted / H0 false) = β | Correct decision |
While ideally it is preferable to reduce both type I
and type II errors, it is not possible to do so simultaneously
for a given sample size. If we reduce the type I error, the
type II error will increase and vice versa, for reasons clear
from the accompanying figure. Hence we always keep the type I
error fixed at α and minimize the type II error. In practice α
is mostly taken as 5% or 1%.
Thus the null hypothesis is the status quo solution to each
of the examples indicated earlier. It is only a possible
solution, but the null hypothesis is what we will believe in
unless we have evidence to the contrary.
We restate the various terminologies related to hypothesis
testing with the illustration of a coin tossing experiment.
(a) Null hypothesis
It is the hypothesis we wish to test about some population
parameter. Usually this is specified in mathematical terms,
e.g. the hypothesis that a coin is unbiased can be
written as p = ½, where p = probability of a head in a toss. The
null hypothesis is generally denoted as:
H0 : p = ½
(b) Alternative hypothesis
It is a hypothesis which contradicts the null hypothesis.
Thus, while testing for unbiased-ness of a coin, the alternative
hypothesis can be
(i) It is biased (in which case p ≠ ½)
(ii) Biased in favor of head (in which case p >½)
(iii) Biased in favor of tail (in which case p < ½)
Alternative hypothesis is generally denoted as
HA or H1 : p ≠ ½
or, H1 : p > ½
or, H1 : p < ½ .
At a time, we test one of the following situations:
H0 : p=½ versus H1 : p ≠ ½ (Two tailed test)
H0 : p=½ versus H1 : p > ½ (Right tailed test)
H0 : p=½ versus H1 : p < ½ (Left tailed test)
The above idea will be clear from the following figure 8.1.1.
(c) Test criterion or test statistic
This is a formula (it differs from situation to situation)
which is used in formulating a test.
(d) p – value
The probability, computed under H0, of obtaining a value of the
test criterion at least as extreme as the calculated value
is called the p – value.
(e) Critical Region and Critical Value
The set of values of the test criterion that lead to the
rejection of the hypothesis is called the critical region (or
rejection region) of the test. On the other hand, the
values that lead to the acceptance of the hypothesis are
said to form the acceptance region. The cut-off point
separating the two regions is referred to as the critical value.
Figure 8.1.1: Selecting the tail of the test
(f) Level of Significance
This is the probability level (under H0) which is employed in
defining the critical region. It is generally denoted by the
symbol α and is customarily taken as 0.05 or 0.01
(alternatively referred to as 5% or 1% level of
significance). We have to take this approach, as theoretically
it is not possible to minimize both Type I Error ( α ) and
Type II Error ( β ) simultaneously.
(g) Power of a Test
(1 - β) is referred to as the power of the test. The test
criterion is always such that for a given level of significance
(α), the power of the test (1 - β) is maximized.
(h) Test of Hypothesis
Based on all the above, this is a rule telling us when to
accept H0 and when to reject it. The decision depends on the
value of the statistic in relation to the critical value obtained
from the corresponding statistical table.
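The terms (a) to (h) can be tied together with the coin tossing illustration in a short Python sketch. The design choices here are assumptions made only for the example: the coin is tossed 20 times, the two-tailed critical region is taken as "5 or fewer heads, or 15 or more heads", the power is evaluated against a coin with p = 0.7, and 16 heads are supposed to have been observed.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k heads in n tosses when P(head) = p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n = 20                 # number of tosses (assumed for illustration)
lower, upper = 5, 15   # critical region: X <= 5 or X >= 15 (assumed)

def prob_in_critical_region(p):
    """Probability that the number of heads falls in the critical region."""
    return sum(binom_pmf(k, n, p) for k in range(n + 1)
               if k <= lower or k >= upper)

alpha = prob_in_critical_region(0.5)   # Type I error: H0 (p = 1/2) true but rejected
power = prob_in_critical_region(0.7)   # chance of rejecting H0 when p is really 0.7
beta = 1 - power                       # Type II error against p = 0.7

observed = 16                          # hypothetical outcome of the experiment
# Two-tailed p-value: twice the probability of 16 or more heads under H0
p_value = 2 * sum(binom_pmf(k, n, 0.5) for k in range(observed, n + 1))
reject = observed <= lower or observed >= upper

print(round(alpha, 4), round(beta, 4), round(p_value, 4), reject)
```

Widening the critical region (say to X <= 6 or X >= 14) would raise the power but also raise α, which is the trade-off between the two types of error described earlier.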
Steps in Hypothesis Testing
Common Steps:
State H0 and H1 (be clear if H1 is two tailed, right tailed or
left tailed)
Define the rejection region (specifying the level of significance,
i.e., α, will take care of it)
Decide on the test statistic (z, t, F, ...)
Collect sample data and compute the test statistic
Steps if the critical value approach is followed
Determine the appropriate critical value depending on H1
(i.e., the value(s) on the distribution of the test statistic
beyond which the probability is α).
Compare the test statistic with the critical value(s) to decide
whether to reject H0
Steps if the p-value approach is followed
Determine the p-value for the test statistic
Reject H0 at level of significance α if p-value < α.
Common step
Interpret the conclusion in managerial terms.
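The steps above can be sketched as one Python routine for a large-sample test on a proportion. This is a minimal illustration rather than a prescribed procedure: the function name, its arguments and the data (60 heads in 100 tosses) are hypothetical, and the normal approximation to the test statistic is assumed to be adequate.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(successes, n, p0, alternative="two-sided", alpha=0.05):
    """Run both the critical value and the p-value approach for H0: p = p0."""
    std_normal = NormalDist()
    p_hat = successes / n
    # Test statistic: standardized gap between observed and hypothesized proportion
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

    if alternative == "two-sided":      # H1: p != p0
        critical = std_normal.inv_cdf(1 - alpha / 2)
        reject_by_critical = abs(z) > critical
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    elif alternative == "greater":      # H1: p > p0 (right tailed)
        critical = std_normal.inv_cdf(1 - alpha)
        reject_by_critical = z > critical
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":         # H1: p < p0 (left tailed)
        critical = std_normal.inv_cdf(alpha)
        reject_by_critical = z < critical
        p_value = std_normal.cdf(z)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")

    return {
        "z": z,
        "critical": critical,
        "p_value": p_value,
        "reject_by_critical": reject_by_critical,
        "reject_by_p_value": p_value < alpha,
    }

# Hypothetical data: 60 heads in 100 tosses, testing H0: p = 1/2 against H1: p != 1/2
result = proportion_z_test(successes=60, n=100, p0=0.5)
print(result)
```

Both approaches lead to the same decision because they compare the same test statistic against the same tail probability α, once on the scale of the statistic and once on the scale of probability.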
Video 8.1.1: Type I and Type II errors
Video 8.1.2: Test criterion
Section 2
Tests for Different Situations
With these basic concepts we shall indicate different
situations and tests appropriate for them. Most of these
relate to the means, standard deviations and proportions.
Generally in practice a sample of size 30 or more is
referred to as large sample and in such cases it is possible
to use some large sample approximation which is due to the
celebrated Central Limit Theorem.
These tests can be broadly classified into the following two
categories:
(a) Small Sample tests, and
(b) Large Sample tests.
As the name suggests, small sample tests are applied
when the sample at hand is of small size. Large sample
tests are used when the sample size is large (sample size of
30 or more). Most of these tests are based on four well-known
distributions in statistics, i.e., Normal distribution,
t-distribution, F-distribution and Chi-square (χ²)
distribution. In the discussion below, we shall assume that
we have drawn a random sample x1, x2, ..., xn of size n
from a given population, our problem being to infer about the
nature of some parameter of the population.
Symbols and Notations
Before describing the tests for different contexts/situations, the notations and symbols used are defined below (here $Q = 1 - P$ and $q = 1 - p$):

| Quantity | One Population | Two Populations |
|---|---|---|
| Population size | $N$ | $N_1, N_2$ |
| Sample size | $n$ | $n_1, n_2$ |
| Sample | $x_1, x_2, \ldots, x_n$ | $x_{11}, x_{12}, \ldots, x_{1n_1}$ for Population I; $x_{21}, x_{22}, \ldots, x_{2n_2}$ for Population II |
| Population mean | $\mu$ | $\mu_1, \mu_2$ |
| Population variance | $\sigma^2$ | $\sigma_1^2, \sigma_2^2$ |
| Population standard deviation | $\sigma$ | $\sigma_1, \sigma_2$ |
| Sample mean | $\bar{x}$ | $\bar{x}_1, \bar{x}_2$ |
| Estimate of $\sigma^2$ | $s^2 = \dfrac{\sum (x_i - \bar{x})^2}{n-1}$ | $s_1^2 = \dfrac{\sum (x_{1i} - \bar{x}_1)^2}{n_1-1}$ and $s_2^2 = \dfrac{\sum (x_{2i} - \bar{x}_2)^2}{n_2-1}$ |
| Estimate of $\sigma$ | $s$ | $s_1, s_2$ |
| Population proportion of 'successes' | $P$ | $P_1, P_2$ |
| Population variance | $PQ/N$ | $P_1Q_1/N_1,\ P_2Q_2/N_2$ |
| Population standard deviation | $\sqrt{PQ/N}$ | $\sqrt{P_1Q_1/N_1},\ \sqrt{P_2Q_2/N_2}$ |
| Sample proportion of 'successes' | $p$ | $p_1, p_2$ |
| Estimate of variance of sample proportion | $pq/n$ | $p_1q_1/n_1,\ p_2q_2/n_2$ |
| Estimate of standard deviation of sample proportion | $\sqrt{pq/n}$ | $\sqrt{p_1q_1/n_1},\ \sqrt{p_2q_2/n_2}$ |

Some other symbols:

| Symbol | Nature of variate | Defined through |
|---|---|---|
| $z_\alpha$ | $Z$ follows Normal (0, 1) | $P(Z > z_\alpha) = \alpha$, or equivalently $P(Z < z_\alpha) = 1 - \alpha$ |
| $z_{\alpha/2}$ | $Z$ follows Normal (0, 1) | $P(Z > z_{\alpha/2}) = \alpha/2$, or equivalently $P(Z < -z_{\alpha/2}) = \alpha/2$ |
| $t_{(n-1),\alpha}$ | $t$ follows the t-distribution with $(n-1)$ degrees of freedom | $P(t > t_{(n-1),\alpha}) = \alpha$, or equivalently $P(t < t_{(n-1),\alpha}) = 1 - \alpha$ |
| $t_{(n-1),\alpha/2}$ | $t$ follows the t-distribution | $P(t > t_{(n-1),\alpha/2}) = \alpha/2$, or equivalently $P(t < -t_{(n-1),\alpha/2}) = \alpha/2$ |
| $F(n_1-1, n_2-1, \alpha)$ | $F$ follows the F-distribution | $P(F > F(n_1-1, n_2-1, \alpha)) = \alpha$, or equivalently $P(F < F(n_1-1, n_2-1, \alpha)) = 1 - \alpha$ |
| $F(n_1-1, n_2-1, \alpha/2)$ | $F$ follows the F-distribution | $P(F > F(n_1-1, n_2-1, \alpha/2)) = \alpha/2$, or equivalently $P(F < F(n_1-1, n_2-1, \alpha/2)) = 1 - \alpha/2$ |
| $\chi^2(n-1, \alpha)$ | $\chi^2$ follows the Chi-square distribution | $P(\chi^2 > \chi^2(n-1, \alpha)) = \alpha$, or equivalently $P(\chi^2 < \chi^2(n-1, \alpha)) = 1 - \alpha$ |
| $\chi^2(n-1, \alpha/2)$ | $\chi^2$ follows the Chi-square distribution | $P(\chi^2 > \chi^2(n-1, \alpha/2)) = \alpha/2$, or equivalently $P(\chi^2 < \chi^2(n-1, \alpha/2)) = 1 - \alpha/2$ |
1. Some Well Known Tests
Some well known and frequently used tests are given below along with their contexts / situations:
A. Common Tests based on Normal Distribution (GENERALLY for LARGE SAMPLES)
| Test | Situation | Large/small sample test | Standard error | Confidence interval | Null hypothesis | Test statistic |
|---|---|---|---|---|---|---|
| A-1 | $\mu$ unknown, $\sigma$ known; one population under consideration (one-sample problem) | Large & small | $\sigma_{\bar{x}} = \sigma/\sqrt{n}$ | $\bar{x} \pm z_{\alpha/2}\,\sigma_{\bar{x}}$ | $H_0: \mu = \mu_0$ | $z = \dfrac{\bar{x} - \mu_0}{\sigma_{\bar{x}}}$ |
| A-2 | $\mu$ unknown, $\sigma$ unknown; one population under consideration (one-sample problem) | Large | $\hat{\sigma}_{\bar{x}} = s/\sqrt{n}$ | $\bar{x} \pm z_{\alpha/2}\,\hat{\sigma}_{\bar{x}}$ | $H_0: \mu = \mu_0$ | $z = \dfrac{\bar{x} - \mu_0}{\hat{\sigma}_{\bar{x}}}$ |
| A-3 | $\mu$ unknown, $\sigma$ unknown; finite population of size $N$ (one-sample problem) | Large | $\hat{\sigma}_{\bar{x}} = (s/\sqrt{n}) \times \text{FPM}$, where $\text{FPM} = \sqrt{(N-n)/(N-1)}$ is the Finite Population Multiplier | $\bar{x} \pm z_{\alpha/2}\,\hat{\sigma}_{\bar{x}}$ | $H_0: \mu = \mu_0$ | $z = \dfrac{\bar{x} - \mu_0}{\hat{\sigma}_{\bar{x}}}$ |
| A-4 | $\mu_1, \mu_2$ unknown; $\sigma_1, \sigma_2$ known; two populations under consideration (two-sample problem) | Large | $\sigma_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$ | $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\,\sigma_{(\bar{x}_1 - \bar{x}_2)}$ | $H_0: \mu_1 = \mu_2$ | $z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sigma_{(\bar{x}_1 - \bar{x}_2)}}$ |
| A-5 | $\mu_1, \mu_2$ unknown; $\sigma_1, \sigma_2$ unknown; two populations under consideration (two-sample problem) | Large | $\hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{s_1^2/n_1 + s_2^2/n_2}$ | $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\,\hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)}$ | $H_0: \mu_1 = \mu_2$ | $z = \dfrac{\bar{x}_1 - \bar{x}_2}{\hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)}}$ |
| A-6 | $\mu_1, \mu_2$ unknown; common s.d. $\sigma$ known; two populations under consideration (two-sample problem) | Large | $\sigma_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\sigma^2 (1/n_1 + 1/n_2)}$ | $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\,\sigma_{(\bar{x}_1 - \bar{x}_2)}$ | $H_0: \mu_1 = \mu_2$ (i.e., given two normal populations with common KNOWN s.d. $\sigma$, can we say that the two samples come from the same population?) | $z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sigma_{(\bar{x}_1 - \bar{x}_2)}}$ |
| A-7 | $\mu_1, \mu_2$ unknown; common s.d. $\sigma$ unknown; two populations under consideration (two-sample problem) | Large | $\hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{s^2 (1/n_1 + 1/n_2)}$, where $s^2 = \dfrac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1 + n_2 - 2}$ | $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\,\hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)}$ | $H_0: \mu_1 = \mu_2$ (i.e., given two normal populations with common but UNKNOWN s.d. $\sigma$, can we say that the two samples come from the same population?) | $z = \dfrac{\bar{x}_1 - \bar{x}_2}{\hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)}}$ |
| A-8 | $P$ unknown; $np > 5$ (one-sample problem) | Large | $\sigma_p = \sqrt{P_0 Q_0 / n}$, where $Q_0 = 1 - P_0$ | $p \pm z_{\alpha/2}\,\sigma_p$ | $H_0: P = P_0$ | $z = \dfrac{p - P_0}{\sigma_p}$ |
| A-9 | $P$ unknown; $np > 5$; finite population of size $N$ (one-sample problem) | Large | $\sigma_p = \text{FPM} \times \sqrt{P_0 Q_0 / n}$, where $Q_0 = 1 - P_0$ and $\text{FPM} = \sqrt{(N-n)/(N-1)}$ is the Finite Population Multiplier | $p \pm z_{\alpha/2}\,\sigma_p$ | $H_0: P = P_0$ | $z = \dfrac{p - P_0}{\sigma_p}$ |
| A-10 | $P_1, P_2$ unknown; $n_1 p_1 > 5$, $n_2 p_2 > 5$ (two-sample problem) | Large | $\sigma_{(p_1 - p_2)} = \sqrt{pq\,(1/n_1 + 1/n_2)}$, where $p = \dfrac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$ and $q = 1 - p$ | $(p_1 - p_2) \pm z_{\alpha/2}\,\sigma_{(p_1 - p_2)}$ | $H_0: P_1 = P_2$ | $z = \dfrac{p_1 - p_2}{\sigma_{(p_1 - p_2)}}$ |
| A-11 | $\sigma_1, \sigma_2$ unknown (two-sample problem) | Large | $\hat{\sigma}_{(s_1 - s_2)} = \sqrt{s_1^2/(2n_1) + s_2^2/(2n_2)}$ | $(s_1 - s_2) \pm z_{\alpha/2}\,\hat{\sigma}_{(s_1 - s_2)}$ | $H_0: \sigma_1 = \sigma_2$ | $z = \dfrac{s_1 - s_2}{\hat{\sigma}_{(s_1 - s_2)}}$ |

For every test in this group, $H_1$ is the two-tailed, right-tailed or left-tailed counterpart of $H_0$ (for example, $\mu \neq \mu_0$, $\mu > \mu_0$ or $\mu < \mu_0$ for A-1), and the conclusion is:

| Alternative hypothesis | Conclusion |
|---|---|
| Two-tailed ($\neq$) | Reject $H_0$ if $\lvert z \rvert > z_{\alpha/2}$ |
| Right-tailed ($>$) | Reject $H_0$ if $z > z_\alpha$ |
| Left-tailed ($<$) | Reject $H_0$ if $z < -z_\alpha$ |
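As an illustration of test A-3, the sketch below applies the finite population multiplier to the standard error and then builds both the confidence interval and the two-tailed test. The function name `finite_population_z_test` and all figures are hypothetical; only the formulas come from the table.

```python
from math import sqrt
from statistics import NormalDist

def finite_population_z_test(x_bar, s, n, N, mu0, alpha=0.05):
    """Two-tailed test A-3 with the finite population multiplier (FPM)."""
    fpm = sqrt((N - n) / (N - 1))
    std_error = (s / sqrt(n)) * fpm
    z_half = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    interval = (x_bar - z_half * std_error, x_bar + z_half * std_error)
    z = (x_bar - mu0) / std_error
    return {"fpm": fpm, "std_error": std_error, "interval": interval,
            "z": z, "reject_h0": abs(z) > z_half}

# Hypothetical figures: 100 items sampled from a population of 1,000.
result = finite_population_z_test(x_bar=105.0, s=20.0, n=100, N=1000, mu0=100.0)
print(result)
```

Because the FPM is below 1 whenever n > 1, it shrinks the standard error; ignoring it for a sample that is a sizeable fraction of the population would make the test more conservative than necessary.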
B. Tests Based on t-distribution (Small Sample Tests)

| Test | Situation | Large/small sample test | Standard error | Confidence interval | Null hypothesis | Test statistic | Degrees of freedom |
|---|---|---|---|---|---|---|---|
| B-1 | $\mu$ unknown, $\sigma$ unknown (one-sample problem) | Small | $\hat{\sigma}_{\bar{x}} = s/\sqrt{n}$ | $\bar{x} \pm t_{\alpha/2}\,\hat{\sigma}_{\bar{x}}$ | $H_0: \mu = \mu_0$ | $t = \dfrac{\bar{x} - \mu_0}{\hat{\sigma}_{\bar{x}}}$ | $n - 1$ |
| B-2 | $\mu_1, \mu_2$ unknown; common s.d. $\sigma$ unknown; two populations under consideration (two-sample problem) | Small | $\hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{s^2 (1/n_1 + 1/n_2)}$, where $s^2 = \dfrac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1 + n_2 - 2}$ | $(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\,\hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)}$ | $H_0: \mu_1 = \mu_2$ (i.e., given two normal populations with common s.d. $\sigma$, can we say that the two samples come from the same population?) | $t = \dfrac{\bar{x}_1 - \bar{x}_2}{\hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)}}$ | $n_1 + n_2 - 2$ |
| B-3 | $\mu_x, \mu_y$ unknown; $n$ paired observations (paired test) | Small | $\hat{\sigma}_{\bar{d}} = s/\sqrt{n}$, where $d_i = x_i - y_i$ and $s^2 = \dfrac{\sum (d_i - \bar{d})^2}{n-1}$ | $\bar{d} \pm t_{\alpha/2}\,\hat{\sigma}_{\bar{d}}$ | $H_0: \mu_x = \mu_y$ | $t = \dfrac{\bar{d}}{\hat{\sigma}_{\bar{d}}}$ | $n - 1$ |
| B-4 | $\rho$ = population correlation coefficient between X & Y; $r$ = sample correlation coefficient between X & Y; $n$ = sample size (correlation test) | Large & small | | | $H_0: \rho = 0$ | $t = \dfrac{r\sqrt{n-2}}{\sqrt{1 - r^2}}$ | $n - 2$ |

For every test in this group, $H_1$ is the two-tailed, right-tailed or left-tailed counterpart of $H_0$, and the conclusion, with df denoting the degrees of freedom listed above, is:

| Alternative hypothesis | Conclusion |
|---|---|
| Two-tailed ($\neq$) | Reject $H_0$ if $\lvert t \rvert > t(\text{df}, \alpha/2)$ |
| Right-tailed ($>$) | Reject $H_0$ if $t > t(\text{df}, \alpha)$ |
| Left-tailed ($<$) | Reject $H_0$ if $t < -t(\text{df}, \alpha)$ |
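A minimal sketch of test B-2 follows, computing the pooled variance, the standard error and the t statistic for two small samples. The function name `pooled_t_statistic` and the two samples are hypothetical; the critical value 2.306 is the usual tabulated t(8, 0.025), which would normally be read from a t table.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t_statistic(sample1, sample2):
    """Return the B-2 t statistic and its degrees of freedom."""
    n1, n2 = len(sample1), len(sample2)
    s1_sq, s2_sq = variance(sample1), variance(sample2)   # divisor n - 1
    pooled_var = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean(sample1) - mean(sample2)) / std_error
    return t, n1 + n2 - 2

# Hypothetical samples of five observations each.
sample_a = [12, 14, 16, 18, 20]
sample_b = [10, 11, 13, 15, 16]
t, df = pooled_t_statistic(sample_a, sample_b)

T_CRITICAL = 2.306   # t(8, 0.025) from a standard t table
print(f"t = {t:.3f}, df = {df}, reject H0: {abs(t) > T_CRITICAL}")
```

The same pattern covers B-1 and B-3 by swapping in the corresponding standard error and degrees of freedom.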
C. Tests Based on Chi-square Distribution

| Test | Situation | Large/small sample test | Null hypothesis | Test statistic | Alternative hypothesis | Conclusion |
|---|---|---|---|---|---|---|
| C-1 | Goodness of fit against a theoretical or specified distribution; expected frequencies $E_i > 5$; sample size ($n$) reasonably large (say, > 50) | Large | $H_0$: the sample follows the specified distribution | $\chi^2 = \sum_{i=1}^{k} \dfrac{(O_i - E_i)^2}{E_i}$ | $H_1$: the sample does not follow the specified distribution | Reject $H_0$ if $\chi^2 > \chi^2(k-1, \alpha)$ |
| C-2 | Independence of two attributes (say, A & B) in an $r \times s$ contingency table | Large & small | $H_0$: A & B are independent | $\chi^2 = \sum_{i}\sum_{j} \dfrac{\left[n_{ij} - (n_{io}\,n_{oj}/n)\right]^2}{n_{io}\,n_{oj}/n}$ | $H_1$: A & B are not independent | Reject $H_0$ if $\chi^2 > \chi^2[(r-1)(s-1), \alpha]$ |
| C-3 | Equality of several population proportions | Large & small | $H_0: P_1 = P_2 = \ldots = P_s$, where $s$ = number of populations and $r$ = number of characteristics being observed | Same statistic as C-2 | $H_1$: $P_1, P_2, \ldots, P_s$ are not all equal | Reject $H_0$ if $\chi^2 > \chi^2[(r-1)(s-1), \alpha]$ |
| C-4 | Test for population variance of a normal population | Small | $H_0: \sigma^2 = \sigma_0^2$ | $\chi^2 = \dfrac{(n-1)s^2}{\sigma_0^2}$, where $n$ = sample size | $H_1: \sigma^2 \neq \sigma_0^2$ | Reject $H_0$ if $\chi^2 > \chi^2(n-1, \alpha/2)$ or $\chi^2 < \chi^2(n-1, 1-\alpha/2)$ |
| C-4 | Test for population variance of a normal population | Large | $H_0: \sigma^2 = \sigma_0^2$ | $Z = \sqrt{2\chi^2} - \sqrt{2n-1}$, where $\chi^2$ is as above | $H_1: \sigma^2 \neq \sigma_0^2$ | Reject $H_0$ if $\lvert Z \rvert > z_{\alpha/2}$ |

In C-1, $O_i$ = observed frequency, $E_i$ = expected frequency, $\sum O_i = \sum E_i = n$, $i = 1, 2, \ldots, k$, and $k$ = number of classes.

In C-2 and C-3, $n_{ij}$ = frequency of the $(A_i, B_j)$ cell, $n_{io}$ = marginal total for $A_i$, $n_{oj}$ = marginal total for $B_j$, $n$ = total frequency, $i = 1, 2, \ldots, r$ (number of rows) and $j = 1, 2, \ldots, s$ (number of columns).
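The statistic for the test of independence (C-2) can be computed directly from a contingency table, as in the sketch below. The function name `chi_square_independence` and the 2 × 2 table are hypothetical; 3.841 is the usual tabulated value of χ²(1, 0.05).

```python
def chi_square_independence(table):
    """Return the C-2 chi-square statistic and its degrees of freedom.

    table is a list of rows of observed frequencies n_ij.
    """
    row_totals = [sum(row) for row in table]            # n_io
    col_totals = [sum(col) for col in zip(*table)]      # n_oj
    n = sum(row_totals)                                 # total frequency
    chi_sq = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi_sq += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)         # (r - 1)(s - 1)
    return chi_sq, df

# Hypothetical 2 x 2 table of observed frequencies.
observed = [[30, 20],
            [20, 30]]
chi_sq, df = chi_square_independence(observed)

CHI_SQ_CRITICAL = 3.841   # chi-square(1, 0.05) from a standard table
print(f"chi-square = {chi_sq:.3f}, df = {df}, reject H0: {chi_sq > CHI_SQ_CRITICAL}")
```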
Keynote 8.2.1: Example for A1
Keynote 8.2.2: Example for A2
Keynote 8.2.3: Example for A3
Keynote 8.2.4: Example for A4
Keynote 8.2.5: Example for A5
Keynote 8.2.6: Example for A8
Keynote 8.2.7: Example for B2
Keynote 8.2.8: Example for C1
Keynote 8.2.9: Example for C2
SECTION 3
Case Study: Smoking: A Costly Affair Now?
This case study was written by Thalluri Prashanth Vidya Sagar, IBSCDC. It is intended to be used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situation. The case was compiled from published sources.
No part of this publication may be copied, stored, transmitted, reproduced or distributed in any form or medium whatsoever without the permission of the copyright owner.
Background Reading: Chapters 8 and 9, “Testing Hypotheses: One Sample Tests” and “Testing Hypotheses: Two-Sample Tests”, Statistics for Management, 7th Edition (Richard I. Levin and David S. Rubin)
Ref. No.: QM0009
“An estimated 440,000 Americans die each year from diseases caused by smoking. Smoking is responsible for an estimated one in five U.S. deaths and costs the U.S. over \$150 billion each year in health care costs and lost productivity.”1
– American Lung Association
1 “Smoking Cessation Resources Fact Sheet”, http://www.lungusa.org/site/c.dvLUK9O0E/b.44456/k.7B2A/Smoking_Cessation_Resources_Fact_Sheet.htm, July 2004
On April 1st 2009, the US government raised the federal cigarette-tax rate from 39¢ to \$1.01 per pack. As smoking had been taking a toll on human lives, anti-smoking advocates welcomed the administration’s move, stating that it would save an estimated 900,000 lives. However, some smokers worried about the rising cost of their habit (Exhibit I).
Exhibit I
Worried Smokers
Source: “Can Raising the Tobacco Tax Reduce the Number of Smokers?”, http://www.bjreview.com.cn/forum/txt/2009-02/10/content_177671.htm, February 12th 2009
This kind of taxation is often called a ‘sin tax’, as it is mainly imposed on vices like gambling, drinking and smoking. The recent hike in sin tax was expected to stop around 2 million kids from trying to smoke for the first time and prompt almost 1 million …
The sin tax has had historical roots since the 1500s. Pope Leo X taxed licensed prostitutes. Peter the Great levied a tax on men who grew beards. American sin taxation began with the proposal of an American patriot, Alexander Hamilton, who proposed taxation on alcohol to contain its consumption and simultaneously to raise revenues for the government (Exhibit II).
In China too, the State Administration of Taxation and the University of California (Berkeley) had released a report titled Tobacco Tax and Its Potential Impact on China. In December 2008, they asked the Chinese government to increase the tax rate on cigarettes substantially to reduce cigarette consumption in China.
Experts estimated that an increase of 51% in the retail price would reduce the number of smokers by as many as 13.7 million and save the lives of 3.4 million. It was also estimated that the tax could generate as much as 64.9 billion yuan (\$9.5 billion) per annum as additional revenue for the government. The experts unanimously agreed on raising tax rates to effect a price increase. They pointed out that, on average, cigarette tax was levied at 65%–70% of the retail price across the globe.
Some analysts criticized the sin tax, stating that such a move would promote the interests of producers of cheap, low-quality cigarettes while further damaging the health of smokers. However, a World Bank survey found a reduction of 4% in cigarette consumption for every 10% increase in retail price in developed countries, while the reduction was 8% in developing countries.
The experts also cited the example of New York City in successfully controlling tobacco usage. The local government in New York City had initiated a comprehensive anti-smoking measure in 2002 by raising the cigarette tax rate. It was found in 2006 that the city’s smoking rate had declined dramatically, by 20%, to stand at only 17.5%. A survey also showed that 45.3% of smokers in New York were smoking less often than before or considering plans and ways to quit smoking. It was also found that a number of adolescent smokers, who were more sensitive to cigarette prices, cut their tobacco consumption due to their limited finances. As a result of cigarette price hikes, there were more cigarette quitters in low-income groups than in high-income groups in the city.
According to another survey on smoking habits, 400 out of a random sample of 500 men were found to be smokers. After the tax on tobacco had been increased, another random sample of 600 men in New York City included 400 smokers. An analyst wondered whether the observed decrease in the proportion of smokers was significant or not. He wanted to test the data at the 5% level of significance.
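One way the analyst’s question could be set up is as the two-sample proportion test A-10, with H0: P1 = P2 against the right-tailed H1: P1 > P2 (the proportion of smokers was higher before the tax increase). The sketch below is an illustration under that assumed framing; the function name `two_proportion_z` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """z statistic of test A-10, using the pooled proportion p."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)        # same as (n1*p1 + n2*p2) / (n1 + n2)
    q = 1 - p
    std_error = sqrt(p * q * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_error

ALPHA = 0.05                                  # level stated in the case
z = two_proportion_z(400, 500, 400, 600)      # before versus after the tax increase
z_alpha = NormalDist().inv_cdf(1 - ALPHA)     # right-tailed critical value

print(f"z = {z:.2f}, critical value = {z_alpha:.3f}, reject H0: {z > z_alpha}")
```

With these figures the statistic is well beyond the critical value, so under this framing the decrease would be judged significant at the 5% level.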
Exhibit II
LEGISLATING MORALITY

| Year | Event |
|---|---|
| 1787 | Alexander Hamilton advocated taxing “ardent spirits” in Federalist No. 12 |
| 1794 | US liquor tax sparks the Whiskey Rebellion |
| 1864 | To raise money for the Civil War, US authorities levied a federal cigarette tax of up to 2.4¢ per pack for the first time in US history |
| 1963 | Annual per capita cigarette consumption among US adults peaks at 4,345 |
| 2005 | Nine Democratic Senators introduced an unsuccessful bill that would have imposed a 25% tax on purveyors of online pornography |
| 2009 | Amid a public outcry, New York Governor David Paterson backtracks on plans to raise taxes on goods ranging from … |

Source: Altman Alex, “A Brief History of: Sin Taxes”, http:/www.time.com/time/magazine/article/0,9171,1889187,00.html, April 2nd 2009
SECTION 4
Case Study: Care Hygiene
In 2003, Mumbai-based Care Hygiene Co (Care Hygiene), a well-known company dealing in healthcare products, recorded sales of Rs.665 crores and a net income of Rs.45.6 crores. ‘Nutravit,’ a chocolate flavored health drink, was one of its flagship products. But of late, Nutravit had been facing stiff competition from a number of other chocolate-based … significantly come down. Care Hygiene decided it was high time it made some effort to tackle competition and regain market share. The company decided to market a new variant of Nutravit, with an improved formulation and in a new flavor.
By mid-2003, Care Hygiene was ready with its new variant: a powdered mix which, when mixed with milk, gave a nutritious as well as tasty vanilla cum chocolate flavored drink. Care Hygiene’s marketing manager decided to test market the new product. He selected Mumbai and Nagpur as the test cities because there were significant similarities in the consumption patterns of its health drink ‘Nutravit’ in these two cities. In Mumbai, Care continued to market its established health beverage, while in Nagpur it replaced it with the new vanilla cum chocolate flavor.
In each city, a sample of 200 households was selected and interviewed over a six-month period. Based on each household’s reported consumption of the beverages, Care’s marketing manager charted the results showing the different household consumption rates in Mumbai and Nagpur (refer to Table I for the household consumption rates).
In Mumbai, where the chocolate flavor was marketed, 114 households reported using the beverage. In Nagpur, where the new variant was test marketed, 136 households reported using the beverage mix.
TABLE I: HOUSEHOLD CONSUMPTION RATES

| Consumption rate | Nagpur (Vanilla cum Chocolate): Number | Nagpur (Vanilla cum Chocolate): % of households | Mumbai (Chocolate): Number | Mumbai (Chocolate): % of households |
|---|---|---|---|---|
| Heavy | 34 | 17 | 28 | 14 |
| Moderate | 52 | 26 | 44 | 22 |
| Light | 50 | 25 | 42 | 21 |
| Non-user | 64 | 32 | 86 | 43 |
| Total | 200 | 100 | 200 | 100 |
Questions for Discussion:
1. Care’s marketing manager wondered if the difference in
usage rates (57% in Mumbai and 68% in Nagpur) could
be attributed to the new vanilla formulation or if the differ-
ence had merely resulted by chance due to sampling.
2. Since the new formulation was an improved one with a
new flavor, it was more expensive. The management de-
cided to proceed with it only if there was sufficient evi-
dence that the new variant would yield better results.
While test marketing the new variant, Care’s marketing
manager had decided that if it achieved a 75% usage
rate among target households, he would recommend the
launching of the product. What should he do? Based on
the sample of 200 households, the new variant had
achieved a usage rate of 68%. Should he recommend to
the management for or against launching of the new
product?
3. Among the 200 households sampled in each city, Care
found different consumption rates. While 86 heavy and
moderate consumption households were reported in Nag-
pur, 72 heavy and moderate consuming households
were reported in Mumbai. Care’s marketing manager
wanted to know if the difference between the consump-
tion rates in the two cities was statistically significant. If
there was a statistically significant difference, he could
conclude that the new flavor was causing a heavier con-
sumption pattern.
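The three questions map naturally onto tests from Section 2: questions 1 and 3 compare two sample proportions (A-10), while question 2 compares one sample proportion with the 75% target (A-8). The sketch below computes the three z statistics. It is an illustration only: the function names are hypothetical, and the 5% level of significance is an assumption, since the case does not state one.

```python
from math import sqrt
from statistics import NormalDist

ALPHA = 0.05   # assumed; the case does not state a level of significance

def two_proportion_z(x1, n1, x2, n2):
    """Test A-10: z for H0: P1 = P2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)
    std_error = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_error

def one_proportion_z(x, n, p0):
    """Test A-8: z for H0: P = P0."""
    std_error = sqrt(p0 * (1 - p0) / n)
    return (x / n - p0) / std_error

z_alpha = NormalDist().inv_cdf(1 - ALPHA)    # one-tailed critical value

# Question 1: users in Nagpur (136 of 200) versus Mumbai (114 of 200).
z_usage = two_proportion_z(136, 200, 114, 200)
# Question 2: Nagpur usage rate against the 75% target (left-tailed).
z_target = one_proportion_z(136, 200, 0.75)
# Question 3: heavy and moderate households, Nagpur (86) versus Mumbai (72).
z_heavy = two_proportion_z(86, 200, 72, 200)

print(f"Q1: z = {z_usage:.2f}, exceeds z_alpha: {z_usage > z_alpha}")
print(f"Q2: z = {z_target:.2f}, below -z_alpha: {z_target < -z_alpha}")
print(f"Q3: z = {z_heavy:.2f}, exceeds z_alpha: {z_heavy > z_alpha}")
```

Whether each comparison is treated as one-tailed or two-tailed is itself a judgment the discussion should justify; the one-tailed framing here is only one reasonable choice.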
SECTION 5
Case Study: Conversys Inc (A)
This case study was written by Dr. Sourabh Bhattacharya, Professor (Department of Decision Sciences), IBS, Hyderabad. It is
intended to be used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a manage-
ment situation. The case was written from generalised experiences.
Conversys Inc. (Conversys), which started its operations in July 2000, soon became one of the most reputed call centres in Hyderabad, India. Conversys provides both inbound and outbound call services to a wide range of clients, including consumer product firms, financial product firms, automobile firms and telecommunication firms. It also provides internal functional services such as payroll maintenance, help desk and sales support to many firms.
Performance Evaluation Method at Conversys
One of the most important groups of employees at Conversys is the Customer Service Representatives (CSRs), or Agents. These agents are the ones who answer customers’ telephone inquiries. Therefore, the performance of the agents plays a vital role in building the company’s reputation of providing a 99% service rate. Moreover, agents are paid by the hour. Hence, their productivity becomes an important issue. The typical performance measure for call centre agents is Average Handling Time (AHT) in seconds, or the number of calls handled in an hour. Every month, the Unit Managers (UMs) compute a simple statistic (i.e., the mean of AHTs) for each agent, taking into account the tenure of the agent. The UMs prepare reports that are presented and discussed in the monthly performance meetings with the higher-ups, to screen for well-performing and under-performing agents. The agents performing below standards are identified in these monthly meetings and provided further training. Agents are given 2 months of On-the-Job Training (OJT) to improve their accuracy, speed and efficiency while processing phone calls.
After the OJT, UMs again monitor phone calls to ensure that the agents achieve the company’s courtesy and accuracy standards. The performance measure of each agent before and after the OJT is compared to decide whether the agent has improved enough to be retained.
The Dilemma of Amit Vardhan
Amit Vardhan (Amit), the UM of one of the project teams, has
recently become concerned about the performance of one of
his agents, Ishan Singh (Ishan). The company standards
specify the AHT to be less than 180 sec. Amit collects Ishan’s
AHT data (Exhibit I) for the last 1 month and wonders if Is-
han should undergo a training to further improve his perform-
ance. After analyzing Ishan’s performance data, Amit con-
cludes that Ishan is below the company standards and he
needs to undergo 2 months of OJT.
One month after Ishan’s training is over, Amit decides to evaluate Ishan’s performance and give him a salary hike, provided his performance has improved. However, another agent, Devang Parekh (Devang), is also a potential candidate for the salary hike. Hence, Amit decides that the salary hike would be given to the one who shows the better performance in the coming month. Amit went through the AHT records of Ishan and Devang for the very next month (Exhibits II & III). However, he was confused about who should be given the salary hike: Ishan or Devang.
Exhibit I
Ishan’s Performance Before OJT

| Day | AHT (in seconds) | Day | AHT (in seconds) | Day | AHT (in seconds) |
|---|---|---|---|---|---|
| 1 | 185 | 11 | 180 | 21 | 178 |
| 2 | 180 | 12 | 183 | 22 | 178 |
| 3 | 175 | 13 | 180 | 23 | 179 |
| 4 | 185 | 14 | 179 | 24 | 180 |
| 5 | 182 | 15 | 181 | 25 | 185 |
| 6 | 185 | 16 | 185 | 26 | 180 |
| 7 | 196 | 17 | 176 | 27 | 183 |
| 8 | 180 | 18 | 180 | 28 | 185 |
| 9 | 182 | 19 | 186 | 29 | 180 |
| 10 | 189 | 20 | 180 | 30 | 180 |

Compiled by author

Exhibit II
Ishan’s Performance After OJT

| Day | AHT (in seconds) | Day | AHT (in seconds) | Day | AHT (in seconds) |
|---|---|---|---|---|---|
| 1 | 180 | 11 | 180 | 21 | 180 |
| 2 | 175 | 12 | 180 | 22 | 185 |
| 3 | 173 | 13 | 178 | 23 | 183 |
| 4 | 183 | 14 | 180 | 24 | 180 |
| 5 | 178 | 15 | 183 | 25 | 180 |
| 6 | 182 | 16 | 180 | 26 | 183 |
| 7 | 185 | 17 | 179 | 27 | 181 |
| 8 | 170 | 18 | 175 | 28 | 184 |
| 9 | 180 | 19 | 178 | 29 | 181 |
| 10 | 180 | 20 | 180 | 30 | 182 |

Compiled by author

Exhibit III
Devang’s Performance

| Day | AHT (in seconds) | Day | AHT (in seconds) | Day | AHT (in seconds) |
|---|---|---|---|---|---|
| 1 | 185 | 11 | 190 | 21 | 182 |
| 2 | 193 | 12 | 183 | 22 | 178 |
| 3 | 178 | 13 | 183 | 23 | 177 |
| 4 | 175 | 14 | 181 | 24 | 176 |
| 5 | 190 | 15 | 185 | 25 | 175 |
| 6 | 187 | 16 | 185 | 26 | 180 |
| 7 | 176 | 17 | 183 | 27 | 180 |
| 8 | 179 | 18 | 182 | 28 | 179 |
| 9 | 185 | 19 | 178 | 29 | 174 |
| 10 | 182 | 20 | 187 | 30 | 180 |

Compiled by author
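Amit’s first question, whether Ishan’s AHT before the OJT exceeds the 180-second standard, can be framed as a one-sample test of H0: µ = 180 against the right-tailed H1: µ > 180, with σ estimated from the sample (tests A-2 or B-1). The sketch below computes the statistic from the data in Exhibit I. The framing and the 5% level are assumptions for illustration, not part of the case.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Ishan's daily AHT before OJT (Exhibit I), days 1 to 30.
ishan_before = [185, 180, 175, 185, 182, 185, 196, 180, 182, 189,
                180, 183, 180, 179, 181, 185, 176, 180, 186, 180,
                178, 178, 179, 180, 185, 180, 183, 185, 180, 180]
STANDARD_AHT = 180      # company standard in seconds
ALPHA = 0.05            # assumed level of significance

n = len(ishan_before)
x_bar = mean(ishan_before)
s = stdev(ishan_before)                 # sample s.d. with divisor n - 1
std_error = s / sqrt(n)
statistic = (x_bar - STANDARD_AHT) / std_error

z_alpha = NormalDist().inv_cdf(1 - ALPHA)   # large-sample critical value (A-2)
print(f"mean = {x_bar:.1f}, s = {s:.2f}, statistic = {statistic:.2f}")
print(f"exceeds z_alpha = {z_alpha:.3f}: {statistic > z_alpha}")
```

The same statistic can be compared with a tabulated t value on n - 1 = 29 degrees of freedom if the sample is treated as small; the comparison of Ishan and Devang in Exhibits II and III would follow the two-sample pattern instead.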
SECTION 6
Case Study: Conversys Inc. (B)
This case study was written by Dr. Sourabh Bhattacharya, Professor, Department of Operations & IT, IBS Hyderabad. It is
intended to be used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a
management situation. The case was prepared from the generalized experiences.
The management of Conversys has recently become concerned about its on-the-job training policy. OJT requires the trainers to possess specialist teaching skills. However, most of Conversys’s trainers lacked the skill and knowledge to train, resulting in output of insufficient standard. Moreover, the trainers, being employees themselves, were not given sufficient time to spend with the trainees, which again led to substandard training and insufficient learning.
The vice president (HR) of Conversys, Shailja Goel, is of the opinion that instead of OJT, employees should be given training by external agencies in a more systematic and structured manner. A number of such agencies were contacted over the next few months and VoiceTutorial was shortlisted for the job. However, before signing the contract with VoiceTutorial, Ms. Shailaja wanted to test the effectiveness of the training methods used by VoiceTutorial. She negotiated with VoiceTutorial to run a pilot training program for 15 of her employees. The pilot program was scheduled to start after a month. The average monthly performance of these 15 employees was recorded for a month before the pilot program started. The employees were given one week of extensive training on data collection and entry, customer service and call handling techniques. The monthly average performance of these 15 employees was also recorded after the training program was over. The performance data are shown in the table below. Ms. Shailaja is now wondering how to use these data for assessing the effectiveness of the pilot training program of VoiceTutorial.
| Employee | AHT (in seconds) before the pilot program | AHT (in seconds) after the pilot program |
|---|---|---|
| 1 | 180 | 185 |
| 2 | 193 | 183 |
| 3 | 178 | 182 |
| 4 | 175 | 175 |
| 5 | 185 | 187 |
| 6 | 187 | 182 |
| 7 | 176 | 178 |
| 8 | 179 | 177 |
| 9 | 189 | 176 |
| 10 | 182 | 175 |
| 11 | 185 | 180 |
| 12 | 183 | 180 |
| 13 | 183 | 179 |
| 14 | 181 | 174 |
| 15 | 185 | 180 |

Prepared by author
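Because the same 15 employees are measured before and after the pilot program, one natural framing is the paired test B-3, with H0: µ(before) = µ(after) against the right-tailed H1: µ(before) > µ(after), i.e. AHT has fallen. The sketch below works through it on the table above. The 5% level is an assumption for illustration, and 1.761 is the usual tabulated value of t(14, 0.05).

```python
from math import sqrt
from statistics import mean, stdev

# AHT (in seconds) for the 15 employees, from the table above.
before = [180, 193, 178, 175, 185, 187, 176, 179, 189, 182, 185, 183, 183, 181, 185]
after = [185, 183, 182, 175, 187, 182, 178, 177, 176, 175, 180, 180, 179, 174, 180]

# d_i = x_i - y_i: a positive difference means AHT went down after training.
differences = [b - a for b, a in zip(before, after)]
n = len(differences)
d_bar = mean(differences)
std_error = stdev(differences) / sqrt(n)    # s / sqrt(n), s with divisor n - 1
t = d_bar / std_error

T_CRITICAL = 1.761   # t(14, 0.05) from a standard t table; alpha = 0.05 is assumed
print(f"mean difference = {d_bar:.2f}, t = {t:.3f}, df = {n - 1}")
print(f"reject H0: {t > T_CRITICAL}")
```

Treating the two columns as independent samples (test B-2) would ignore the pairing and typically lose power, which is why the paired form is used in this sketch.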
SECTION 7
Case Study: The Strategic Break: To be or Not to be
This case study was written by Dr. Sourabh Bhattacharya, Professor (Department of Decision Sciences), IBS, Hyderabad. It is in-
tended to be used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a manage-
ment situation. The case was written from generalised experiences.
Newspapers and internet media are flooded with such criticisms of the recently introduced “Strategic Break” in season two of the IPL T20. On the one hand, players are of the opinion that the strategic breaks hamper the momentum of a team; on the other hand, the media is looking at these breaks suspiciously. The media believes that the strategic breaks are born out of the purely commercial interests of the Board of Control for Cricket in India (BCCI).
Rubbishing the media allegations, Lalit Modi, the Chairman of IPL, claims that the strategic break is an innovation brought into the 20-20 format of the game. “The ‘strategy break’ is an innovative deviation from tradition, which gives teams an opportunity to consult and alter strategies after 10 overs to get their acts right,” says Modi. However, Modi also assures that the idea will be reassessed, taking the views of the players into consideration, once the season two games are over.
Introduction
On the lines of the National Basketball Association (NBA) of the USA and football’s English Premier League, the Board of Control for Cricket in India (BCCI) launched the Indian Premier League (IPL) in the year 2008. IPL was established as a professional Twenty20 cricket league with the approval of the International Cricket Council (ICC). The format of the Twenty20 game is completely different from the format of the usual one-day game. The most important difference is that Twenty20 is a game of 20 overs per innings, which allows a bowler to bowl a maximum of 4 overs, whereas in the one-day game, which has 50 overs per innings, 10 overs are the maximum for each bowler. Apart from the maximum number of overs, there are many other differences in terms of fielding restrictions, rules for time out and a no ball, rules in the event of a tie, etc. With all these changes the Twenty20 game has become faster and more exciting.
Season 1 of the IPL Twenty20 (also known as DLF IPL 2008) was played in various cities of India between eight teams (Exhibit I). The season lasted for 45 days, in which 59 matches were played. On June 1st 2008, the final match was played between Rajasthan Royals and Chennai Super Kings at DY Patil Sports Academy in Mumbai. Under the captaincy of Shane Warne3 Rajasthan Royals defeated Chennai Super Kings by 3 wickets.
By the end of season 1, IPL Twenty20 had earned enormous popularity among viewers as well as cricketers across the world. BCCI had always been positive and certain about the popularity and acceptance of the Twenty20 format of the game and had already chalked out its plans for the season 2 games (also known as IPL 2), to be held in 2009. Initially, the IPL 2 games were planned to be held in India, but because the general elections in India were taking place at the same time, the Indian government could not guarantee adequate security for the tournament. At this time, when doubts were being cast on the future of IPL 2, the Government of South Africa came to the rescue of BCCI and offered South Africa as the venue for the IPL 2 games.
With a lot of fanfare, IPL 2 was kick-started in the city of Johannesburg, South Africa on April 18th 2009.
The Strategic Break
With the idea of bringing innovation and variety, BCCI decided to introduce two new rules in IPL 2. The first alteration was to the rule of bowl-out in the event of a tie. In IPL 1, in case of a tie, each team had to bowl five balls at unguarded wickets, and whichever team hit the wickets the greater number of times won. In case both teams hit the same number of wickets after the first five balls per side, the bowling continued and the winner was decided by sudden death4. In IPL 2, the bowl-out rule was replaced with the rule of super-over. In a super-over, each team nominates three batsmen and one bowler to play a one-over “mini-match”. Each side bats one over bowled by the one nominated opposition bowler. If the batting side loses two wickets, its innings is over. The side with the higher score from its over wins. If the teams finish tied on runs scored in that one over, the side with the higher number of sixes in its full innings and in the one-over eliminator is declared the winner. If the teams are still tied, the one with the higher number of fours in both innings wins.
The second alteration was the introduction of “the strategic
break” in IPL 2. The strategic break is the official time-out of 7
minutes 30 seconds in duration midway through the innings.
The idea of strategic break is to allow the teams to re-group
tactically. During the time-out, the fielding team and the two
Exhibit I Team Players in IPL 2
strategic break, which has given rise to a lot of controversy in season 2 of IPL. Players in general, and batsmen in particular, came down heavily on the idea of a time-out in the middle of the innings. They felt that this break hampers the momentum of the team. The media, on the other hand, criticized the introduction of the strategic break from a different point of view. It alleged that the strategic breaks serve the commercial interest of BCCI in earning more advertising revenue. The Chairman of IPL, Lalit Modi, rubbished the media allegations and described strategic breaks as an innovation brought into the game of cricket. He said that the concept of time-outs already existed in games like football and basketball and had simply been adapted to cricket. However, taking the concern of the players into account, Modi gave an assurance that the idea of strategic breaks would be reassessed once the IPL 2 tournament is over.
In order to evaluate the idea of the strategic break, Modi will have to look at the performances of the teams before and after the break. Exhibits II and III show the first-innings and second-innings performances, respectively, in the 17 matches played so far in IPL 2. Can Modi reach a conclusion on whether the players’ claim that the strategic break hampers the momentum of the game is correct?

Exhibit II: Performance in the First Innings of Match

| Match no. | Batted 1st | Score before strategy break | Score in next 5 overs after break |
|---|---|---|---|
| 1 | MI v CSK | 64-1 | 41-3 |
| 2 | RCB v RR | 57-4 | 30-1 |
| 3 | KXIP v DD | 67-1 | 37-6 |
| 4 | KKR v DC | 31-3 | 33-2 |
| 5 | CSK v RCB | 106-0 | 29-2 |
| 6 | KXIP v KKR | 67-3 | 50-1 |
| 8 | DC v RCB | 91-2 | 48-1 |
| 9 | DD v CSK | 90-3 | 33-1 |
| 10 | RR v KKR | 78-4 | 29-0 |
| 11 | RCB v KXIP | 71-3 | 29-1 |
| 12 | DC v MI | 88-1 | 49-3 |
| 14 | RCB v DD | 74-3 | 26-1 |
| 15 | KXIP v RR | 60-4 | 38-0 |
| 16 | CSK v DD | 88-2 | 25-2 |
| 17 | MI v KKR | 111-0 | 40-3 |

DNB = did not bat
Notes: Matches # 7 and 13 were abandoned without a ball being bowled.
In Match # 3, KXIP were allocated 12 overs and the strategy break was taken after 6 overs. The corresponding figure after the strategy break is their performance in the next 6 overs. Delhi Daredevils won the match in only 4.5 overs and there was no strategy break in their innings.
In Match # 4, DC won in 13.1 overs. In Match # 6, KKR won in 9.2 overs.
Compiled by author
Exhibit III: Performance in the Second Innings of Match

| Match no. | Batted 2nd | Score before strategy break | Score in next 5 overs after break |
|---|---|---|---|
| 1 | CSK v MI | 70-3 | 38-2 |
| 2 | RR v RCB | 32-5 | 26-4 |
| 3 | DD v KXIP | 58-0 | DNB |
| 4 | DC v KKR | 69-2 | 35-0 |
| 5 | RCB v CSK | 56-5 | 29-4 |
| 6 | KKR v KXIP | 79-1 | DNB |
| 8 | RCB v DC | 57-3 | 52-1 |
| 9 | CSK v DD | 106-2 | 42-2 |
| 10 | KKR v RR | 67-3 | 31-2 |
| 11 | KXIP v RCB | 80-1 | 47-1 |
| 12 | MI v DC | 84-1 | 24-3 |
| 14 | DD v RCB | 64-2 | 35-1 |
| 15 | RR v KXIP | 48-6 | 34-0 |
| 16 | DD v CSK | 85-2 | 44-1 |
| 17 | KKR v MI | 70-2 | 25-5 |

DNB = did not bat
Notes: Matches # 7 and 13 were abandoned without a ball being bowled.
In Match # 3, KXIP were allocated 12 overs and the strategy break was taken after 6 overs. The corresponding figure after the strategy break is their performance in the next 6 overs. Delhi Daredevils won the match in only 4.5 overs and there was no strategy break in their innings.
In Match # 4, DC won in 13.1 overs. In Match # 6, KKR won in 9.2 overs.
Compiled by author
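A possible starting point for Modi's evaluation is to convert the scores into run rates and compare each team's rate before and after the break with a paired t-test. The sketch below uses only the first-innings runs from Exhibit II and rests on several assumptions that are not stated in the case: the break is taken after 10 overs (6 in Match 3, as per the notes), the "after" figure covers 5 overs (6 in Match 3), and wickets lost are ignored. The helper `overs_for` and all variable names are hypothetical.

```python
import math
import statistics

# First-innings data from Exhibit II: (match no., runs before break, runs after break)
first_innings = [
    (1, 64, 41), (2, 57, 30), (3, 67, 37), (4, 31, 33), (5, 106, 29),
    (6, 67, 50), (8, 91, 48), (9, 90, 33), (10, 78, 29), (11, 71, 29),
    (12, 88, 49), (14, 74, 26), (15, 60, 38), (16, 88, 25), (17, 111, 40),
]

def overs_for(match_no):
    """Assumed overs (before, after) the break: 10 and 5, except Match 3 (6 and 6)."""
    return (6, 6) if match_no == 3 else (10, 5)

# Change in run rate (runs per over) after the break, one value per match
rate_diffs = []
for match_no, runs_before, runs_after in first_innings:
    overs_before, overs_after = overs_for(match_no)
    rate_diffs.append(runs_after / overs_after - runs_before / overs_before)

n = len(rate_diffs)
mean_diff = statistics.mean(rate_diffs)
t_stat = mean_diff / (statistics.stdev(rate_diffs) / math.sqrt(n))  # df = n - 1

print(f"mean change in run rate = {mean_diff:.2f}, t = {t_stat:.3f}, df = {n - 1}")
```

A negative mean change would be in line with the players' claim, but run rates and wicket losses change over the course of an innings for reasons unrelated to the break, so the result should be read with caution. The second-innings data in Exhibit III need extra care because of the DNB entries and the chases completed early.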
SECTION 8
Case Study: Shoppers’ Stop Private Labels
This case study was written by Siva V Gabbita, Professor, IBS, Hyderabad. It is intended to be used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situation. The case was written from generalised experiences.
Shoppers’ Stop Private Labels
On October 27, 1991 the K. Raheja Corp. group of compa-
nies, one of India’s biggest hospitality and real estate players
crossed another milestone with the foundation of its lifestyle
venture - Shoppers’ Stop. Shoppers Stop is today one of the
From its inception, when it began by operating a chain of departmental stores, Shoppers’ Stop has progressed from being a single-brand shop to becoming a Fashion & Lifestyle store for the family. Shoppers’ Stop is now a household name, known for its superior quality products, services and, above all, for providing a complete shopping experience.
Today Shoppers’ Stop has twenty-six (26) stores across the country (and three stores under the name HomeStop), and over the years it has also begun operating a number of specialty stores, namely Crossword Bookstores, Mothercare, Brio, Desi Café and Arcelia.
Shoppers’ Stop has become a benchmark for the Indian retail
industry. In fact, the company’s continuing expansion plans
aim to help Shoppers’ Stop meet the challenges of the retail
industry in an even better manner than it does today.
Shoppers Stop retails a range of branded apparel and private
labels in apparel, footwear, fashion jewellery, leather products,
accessories and home products. These are complemented by
cafe, food, entertainment, personal care and various beauty
related services.
Shoppers Stop retails products of domestic and international brands such as Louis Philippe, Pepe, Arrow, BIBA, Gini & Jony, Carbon, Corelle, Magppie, Nike, Reebok, LEGO, and Mattel.
Shoppers Stop retails merchandise under its own labels, such as STOP, Kashish, LIFE, Vettorio Fratini, Elliza Donatein and Acropolis. The company is also the licensee for Austin Reed (London), an international brand whose men’s and women’s outerwear is retailed in India exclusively through the chain.
Retailers today understand the role that private label brands play in long-term business strategy and marketing strategy. Store brands play a significant role as part of the marketing mix of retail chains. On the supply side, effective category management enables retailers to optimize supply chain relationships, whereas on the demand side strategic brand management works in tandem in each aisle of each store. Well-known national brands are available everywhere and are not store-specific. Therefore the retailer’s store brand portfolio has the advantage of obtaining as well as providing synergies with well-known brands, which attracts customers to establish a relationship with the franchise.
History of Private labels
Private label brands traditionally competed with well-known brands in the same product category because their price-value proposition allowed them to be positioned as the “cheaper alternative”. As a result of such positioning, while they attracted consumer attention, they were also perceived as inferior in quality. However, retailers pushed private label products because they yielded high profit margins with minimum marketing effort.
Private labels therefore grew to provide competition to national brands. On the flip side, the entire product category was undermined by commoditization, since private labels forced a price competition that eroded profit margins all around. Also, this cost-based competition significantly reduced the focus on product differentiation. Therefore all entities along the supply chain missed the opportunities that existed for tapping latent consumer needs which these categories sometimes had the ability to fulfill.
Private label success and Loyalty programs
In some cases, however, the reverse has been true, where well-known brands have been unable to escape the innovator’s dilemma. Store brands have succeeded in identifying customer needs and have provided alternative value propositions. The success of private label brands also allowed for diversification into other product categories which were hitherto dominated by the well-known brands. In this way the capacity of private labels to provide value, visibility, consumer involvement and therefore interest has exceeded that of the well-known brands.
More importantly, private labels have perhaps largely succeeded because retailers have focused on promoting them. Store brands have the advantage due to their potential for store association, whereas national brands are ubiquitous and therefore not store-specific. Retailers therefore use proprietary brands to draw people into their own stores. Binding the consumer favorably to the store is additionally driven through loyalty programs. Shoppers Stop has a loyalty program called First Citizen. It also offers its members a co-branded credit card with Citibank.
Questions for Discussion
The Marketing Manager of SHOPPERS STOP wants to assess the popularity of one of its own store brands, STOP, against two well-known brands, viz. John Players and Provogue. If resource rationalization demands that a sample of only 150 qualified consumers can be surveyed, can the brand preference (or lack of it) for the STOP brand over the other brands be established?
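One way to frame this question is a chi-square goodness-of-fit test of the hypothesis that the three brands are equally preferred. The case gives no survey results, so the counts in the sketch below are purely hypothetical placeholders that add up to the stated sample size of 150; only the procedure is of interest. The critical value 5.991 is the tabulated chi-square value for 2 degrees of freedom at the 5% level.

```python
# HYPOTHETICAL counts of respondents naming each brand as their preferred one.
# These numbers are NOT from the case; they only illustrate the calculation.
observed = {"STOP": 60, "John Players": 50, "Provogue": 40}

sample_size = sum(observed.values())
expected = sample_size / len(observed)   # equal preference under the null hypothesis

# Chi-square statistic: sum of (observed - expected)^2 / expected over the brands
chi_square = sum((count - expected) ** 2 / expected for count in observed.values())
degrees_of_freedom = len(observed) - 1

CRITICAL_VALUE_5_PERCENT_DF2 = 5.991     # tabulated chi-square value, df = 2, alpha = 0.05
reject_equal_preference = chi_square > CRITICAL_VALUE_5_PERCENT_DF2

print(f"chi-square = {chi_square:.2f}, df = {degrees_of_freedom}, "
      f"reject equal preference: {reject_equal_preference}")
```

With these placeholder counts the statistic is below the critical value, so equal preference would not be rejected; actual survey data could of course lead to a different conclusion.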
SECTION 9
Case Study : Hindustan Foods
Hindustan Foods, a leading manufacturer of food products, recorded sales of Rs. 445.6 crores and a net income of Rs. 54.57 crores in 2003. The company manufactured fruitcakes, cookies, biscuits, confectionery and a variety of other food products including baby foods. The domestic confectionery market was loosely divided into seven categories: hard-boiled candies, toffees, éclairs, chewing gum, bubble gum, mints and lozenges. Hard-boiled candies occupied the largest share of this market. Hindustan Foods did not have a presence in this segment. It manufactured and marketed toffees as ‘Tasty Bite’ toffees, while in the chewing and bubble gum segment it had a significant presence with its ‘Fresh mint’ brand.
Hindustan Foods planned to enter the hard-boiled fruit candy segment under its Tasty Bite brand. The objective was to gain significant presence and market share in a segment that was rapidly growing. The company wanted to test three new flavours for the proposed candy: strawberry, apricot and pineapple. Hindustan Foods also wanted to measure the impact of three different retail prices (50 paise, 75 paise and Re. 1) for the three flavours.
The company selected nine geographically separated stores as the test stores for the new flavours and different price points. These stores were similar with respect to Hindustan Foods’ confectionery sales and were located in neighbourhoods that had similar demographic characteristics. Because each of the three flavours was to be tested at
each price, a total of nine different flavour – price combina-
Hindustan Foods arranged for the delivery of the three new
flavours across the stores. At the end of four weeks, the
company collected the unsold candy cases. It determined
the number of cases sold for each flavour at each price.
With the data so determined, Hindustan Foods wanted to
know if the difference in sales was due to the difference in
flavours and what effect the different prices had on sales.
Hindustan Foods’ Experimental Results
Number of Cases of New Flavours sold at Different Prices

| Price | Flavour: Strawberry | Flavour: Apricot | Flavour: Pineapple |
|---|---|---|---|
| 50 paise | 22 | 54 | 35 |
| 75 paise | 24 | 45 | 32 |
| Re 1 | 15 | 35 | 31 |
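The two questions Hindustan Foods raises (effect of flavour, effect of price) fit the layout of a two-way analysis of variance without replication, since there is exactly one observation for each flavour-price combination. The sketch below is an illustration under that assumption, not the case author's prescribed solution. With a single observation per cell, any flavour-price interaction cannot be separated from the error term. Variable names are illustrative.

```python
# Cases sold: outer keys are prices, inner keys are flavours (from the results table)
sales = {
    "50 paise": {"Strawberry": 22, "Apricot": 54, "Pineapple": 35},
    "75 paise": {"Strawberry": 24, "Apricot": 45, "Pineapple": 32},
    "Re 1":     {"Strawberry": 15, "Apricot": 35, "Pineapple": 31},
}

prices = list(sales)
flavours = list(sales[prices[0]])
n_prices, n_flavours = len(prices), len(flavours)

values = [sales[p][f] for p in prices for f in flavours]
grand_mean = sum(values) / len(values)

# Mean sales for each price level and each flavour
price_means = [sum(sales[p].values()) / n_flavours for p in prices]
flavour_means = [sum(sales[p][f] for p in prices) / n_prices for f in flavours]

# Sums of squares for the two factors and the residual
ss_total = sum((v - grand_mean) ** 2 for v in values)
ss_price = n_flavours * sum((m - grand_mean) ** 2 for m in price_means)
ss_flavour = n_prices * sum((m - grand_mean) ** 2 for m in flavour_means)
ss_error = ss_total - ss_price - ss_flavour

df_price, df_flavour = n_prices - 1, n_flavours - 1
df_error = df_price * df_flavour

ms_error = ss_error / df_error
f_price = (ss_price / df_price) / ms_error
f_flavour = (ss_flavour / df_flavour) / ms_error

print(f"F(price) = {f_price:.3f}, F(flavour) = {f_flavour:.3f}, "
      f"df = ({df_price}, {df_error})")
```

Each F ratio can be compared with the tabulated critical value of F for (2, 4) degrees of freedom, which is 6.94 at the 5% level.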
SECTION 10
Case Study : A Study of Soap Segment in Indian FMCG Market
This case study was written by P Sashikala, Professor (Department of Decision sciences), IBS, Hyderabad. It is intended to be
used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situation.
The case was written from generalized experiences.
Across the globe, the significance of sales promotion in the
marketing mix of the Fast Moving Consumer Goods (FMCG)
industry has been increasing day-by-day. In the times of slow-
down/ recession in the economy, the marketers may depend
on sales promotion techniques to boost up the consumer
demand. Sales promotions generally hit directly at the decision
affect the consumer’s buying pattern directly producing immedi-
ate results. Generally sales promotion is a tool for boosting
sales for FMCG sector as certain products are price sensitive.
With various brands of FMCG in the market, the emphasis on sales promotion in India increased by over 500%–600% from 2000 to 2008. It is estimated that the marketing companies have spent about INR 5,000 crore (approximately \$1054 million) as sales promotion expenditure. However, the use of techniques to improve sales in the FMCG sector requires manufacturers to understand consumer perceptions, attitudes and preferences while channelling their sales promotion efforts. These efforts should aim at building product awareness, creating interest in the product, stimulating demand by convincing the customers and reinforcing the brand among the customers.
Generally, in the FMCG sector, especially in a vast market like India, the consumer may switch from one product to another based on the promotional offers. However, not all sales promotion techniques have the same impact on all consumers.
Indian FMCG Market
The Indian FMCG sector is the fourth largest sector in the economy, with a total market size of about \$13.1 billion in 2007. It has a strong presence of MNCs. It is also characterized by a well-established distribution network, intense competition between the organized and unorganized segments, and low operational cost. The FMCG market is also leveraging the rural market segments.
Exhibit I: FMCG Category and Products

| Category | Products |
|---|---|
| Household care | Fabric wash (laundry soaps and synthetic detergents); household cleaners (dish/utensil cleaners, floor cleaners, toilet cleaners, air fresheners, insecticides and mosquito repellents, metal polish and furniture polish). |
| Food and Beverages | Health beverages; soft drinks; staples/cereals; bakery products (biscuits, bread, cakes); snack food; chocolates; ice cream; tea; coffee; soft drinks; processed fruits, vegetables; dairy products; bottled water; branded flour; branded rice; branded sugar; juices etc. |
| Personal Care | Oral care, hair care, skin care, personal wash (soaps); cosmetics and toiletries; deodorants; perfumes; feminine hygiene; paper products |
The total number of rural households is expected to rise from 135 million in 2001-2002 to 153 million in 2009-2010, which also presents the largest potential market in the world. The FMCG market is estimated to increase from \$11.6 billion in 2003 to \$33.4 billion in 2015. The penetration level and the per capita consumption in India for most products, such as toothpaste, skin care and hair care, are low, which indicates an untapped market. For example, the per capita consumption of toilet or bathing soap in the country is 800 gm, whereas it is 6.5 kg in the US, 4 kg in China and 2.5 kg in Indonesia.5
With the burgeoning Indian population, particularly the middle class and the rural segments, manufacturers have an opportunity to convert consumers to using more and more branded products in the FMCG segment.6 The following table (Exhibit I) gives an overview of FMCG categories and products.
In the year 2004, the size of the personal wash products market was estimated at US\$ 989 million, hair care products at US\$ 831 million and oral care products at US\$ 537 million. While the overall personal wash market is growing at one per cent, the premium and middle-end soaps are growing at a rate of 10 per cent. The leading players in this market are HLL, Nirma, Godrej Soaps and P&G. The production status of the Indian FMCG industry (in 2004) is given in the table below (Exhibit II).
Brief on Soap Market
According to Pradipta (2007), the soap segment is one of the biggest FMCG categories in the country. Bathing and toilet soaps contribute around 30% to the soaps market. There are 38 companies in India manufacturing soaps. Major players include HUL, Reckitt Benckiser, Godrej Consumer Products, Henkel Spic, Procter & Gamble and Nirma. Some of the major brands in the soap segment are Lux, Hamam and Lifebuoy, Cinthol, Shikakai and Godrej No. 1 (GCPL), Camay (P&G) and Dettol (Reckitt Benckiser). The present approximate size of the branded soap market is around INR 7,500 crore (\$1581.8 million approximately). With increasing competition, this sector is expected to register 20% growth in 2009, despite the economic downturn.
According to industry estimates, HUL led with a 46.7% market share in 2007, with brands including Lifebuoy, Lux, Rexona, Breeze and Hamam. After HUL come Nirma and Godrej with their respective brands. The medicated soap brands include Dettol and Margo. Another major player in the FMCG sector is P&G, which had a portfolio of products in healthcare; feminine-care; hair
In the light of intense competition and companies offering sales promotions, the author is interested in conducting a study of the soap market. The objective is to ascertain consumers’ preferences among various sales promotion offers, such as discount on market price, buy 2 get 1 free, contests/games and lucky draws, and surprise gifts/coupons. A brief description of the promotional offers is given below:
It should be noted that not all offers are available at the same time.
Price Discounts
Exhibit II: Product Wise Production (2004)

| Segment | Unit | Size | Key Players | Share of Market holder % |
|---|---|---|---|---|
| Household Care | | | | 62 |
| Fabric wash market | Mn tonnes | 50 | HLL, P&G, Nirma, SPIC | 38 |
| Laundry soaps/bars | US \$ mn | 1102 | | |
| Detergent cakes | Mn tonnes | 15 | | |
| Washing powder | Mn tonnes | 26 | | |
| Dish wash | US \$ mn | 93 | HLL | 59 |
| Personal care | | | | 58 |
| Soap & Toiletries | Mn tonnes | 60 | HLL, Nirma, Godrej | |
| Personal wash market | US \$ mn | 989 | HLL, Nirma, Godrej | |
| Oral care | US \$ mn | 537 | Colgate Palmolive, HLL | 40 |
| Skin care & Cosmetics | US \$ mn | 274 | HLL, Dabur, P&G | 58 |
| Hair care | US \$ mn | 831 | Marico, HLL, CavinKare, Procter & Gamble, Dabur, Godrej | 54 |
| Feminine Hygiene | US \$ mn | 44 | Procter & Gamble, Johnson & Johnson | |
| Food and Beverages | | | | |
| Bakery products | Mn tonnes | 30 | Britannia, Parle, ITC | |
| Tea | 000 tonnes | 870 | HLL, Tata Tea | 31 |
| Coffee | 000 tonnes | 20 | Nestle, HLL, Tata Tea | 49* |
| Mineral water | Mn tonnes | 65 | Parle, Bisleri, Parle Agro, Coca Cola, Pepsi | |
| Soft Drink | Mn crates | 284 | Coca Cola, Pepsi | |
| Branded atta | 000 tonnes | 750 | Pillsbury, HLL, Agro Tech, Nature Fresh, ITC | 15 |
| Health beverages | 000 tonnes | 120 | Smithkline Beecham, Cadbury, Nestle, Amul | |
| Milk and Dairy products | US \$ mn | 653 | Amul, Britannia, Nestle | |
| Chocolates | US \$ mn | 174 | Cadbury, Nestle | |
| Culinary products | Mn tonnes | 326 | HLL, Nestle | 78 |
| Edible oil | Mn tonnes | 13 | Ruchi Soya, Marico, ITC, Agro Tech | 28 |
The study is mainly intended to analyse the overall effect of various sales promotions on the consumer buying decision. A suitable sample is selected for the study and data is collected through a balanced and unbiased questionnaire. It attempts to examine customers’ preferences among the aforesaid promotional offers. The questionnaire is administered to 250 respondents within the age groups of 15-25, 25-35 and 35-45 years. Respondents are asked to give their preference towards various sales schemes offered with soaps. The perceptions of the consumers are measured on a preference scale of 1 to 5, with ‘1’ being ‘Not preferred’ and ‘5’ being ‘Most preferred’. There can be respondents who may not prefer a certain type of promotional offer. Another question put to the respondents is whether they prefer to buy existing products (stick to their brand) or new products (shift brands).
Results of Analysis of Data Collected
It is observed in the research study that as many as 70% of the consumers prefer to buy one soap at a time, while the other 30% prefer to buy more than one or a multiple-pack.
Out of 250, 100 respondents preferred buying new products and 150 preferred existing products.
It is also observed that, out of the 100 respondents who preferred buying new products, 25 are in the age group of 15-25, 45 in 25-35 and 30 in 35-45. The corresponding figures for the 150 who preferred existing products are 15, 75 and 60 respectively.
Out of all 250, 180 preferred promotional offers. Of these 180, 30 are in the age group of 15-25, 95 in 25-35 and 55 in 35-45; the corresponding figures for those who do not prefer a particular type of promotional offer are 20, 20 and 30.
Out of the 100 who preferred buying new products, 80 respondents preferred a promotional offer on the new product and 20 did not prefer a particular type of promotional offer. An offer on a new product gives them a feeling of either low quality or of the product not doing well on an overall basis.
Out of 250 respondents, 125 are male and 125 are female. It was also observed that 85 of the male respondents prefer promotional offers whereas 40 do not, and 95 of the female respondents prefer promotional offers whereas 30 do not prefer a certain type of promotional offer. Out of the 150 who preferred existing products, 80 are male and the rest are female, and out of the 100 who preferred new products, 45 are male and the rest are female.
While analyzing deep into the promotional offers, it is observed
that the consumer’s preference for cash discounts is more than
any other type of promotional techniques including buy-two-get-
one free, contests and lucky draws as well as surprise gifts and
Questions for Discussion
Do you expect the influence of various sales promotion offers on the purchasing decision of consumers to be the same? If not, why? What are the managerial implications?
If you were the regional marketing manager of a company, would you decide to go for a promotional offer while launching a new product?
Based on the data collected in the present study, would you, as a manager, take factors like gender and age into consideration while deciding the promotional offer?
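The question on gender and age can be explored with a chi-square test of independence on the counts reported in the results. The sketch below is one possible approach, not a prescribed solution: it builds the gender-by-preference and age-by-preference tables from the figures in the case and computes the chi-square statistic without any continuity correction. The function name `chi_square_independence` is illustrative.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table
    given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column factors were independent
            expected = row_totals[i] * col_totals[j] / grand_total
            chi_square += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi_square, dof

# Columns: [prefer promotional offers, do not prefer]
gender_table = [[85, 40],   # male
                [95, 30]]   # female
age_table = [[30, 20],      # 15-25
             [95, 20],      # 25-35
             [55, 30]]      # 35-45

chi2_gender, df_gender = chi_square_independence(gender_table)
chi2_age, df_age = chi_square_independence(age_table)

print(f"gender: chi-square = {chi2_gender:.3f}, df = {df_gender}")
print(f"age:    chi-square = {chi2_age:.3f}, df = {df_age}")
```

The statistics can be compared with the tabulated 5% critical values of 3.841 (1 degree of freedom) and 5.991 (2 degrees of freedom) to judge whether preference for promotional offers is associated with gender or with age group.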
Exhibit III
Most Favored Sales Promotional Measures
SECTION 11
Case Study: Melting Delicacies (A)
This case study was written by Sushama Marathe, Professor (Department of Decision Sciences), IBS, Hyderabad. It is intended to be used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situation. The case was written from generalized experiences.
In contemporary India, we see Ice cream chains located in cit-
ies all over the country. Their menus offer a variety of ice
cream flavors, ice cream sundaes, banana splits. People like
them because they can get a quick ice cream and dessert for
a reasonable price and with ease. When you open a chain
store, you are cashing in on the name of the franchise. People
know that whether your store is in Mumbai or Chennai or Hy-
This is why you take a franchise but the down side is there are
tight regulations on franchisees.
A franchise when it comes to small business is a business one
can buy into for a fee. Most fast food chains are made up of
people buying the right and territory of that chain for a fee.
Owning a franchise does not mean only having a recognized
vertising as well. Most of the time the parent company will is-
sue sales fliers, coupons and TV ads as part of the franchise
fee1 . Franchises are not just limited to food & eateries but
come in all sorts of products including brick and mortar types
The Indian tropical climate is right for ice-cream consumption. As opposed to many other countries, India has a very low per capita consumption of ice cream, even if we look at only the middle class and above. This primarily indicates that there is a large untapped market potential. An industry snapshot of the Indian ice cream market reads: “Market Size – Rs. 800 Crores and market growing at 10 to 12%”.
Maria Fernandez, a young entrepreneur in her late twenties, had taken a franchise for a retail ice-cream chain, “Melting delicacies”. She entered into a contract for a chain of five outlets in Hyderabad, a metropolitan city and capital of the state of Andhra Pradesh in southern India. Within a year of starting this venture, she was making good profits. The quality and variety of the ice-creams, prompt service, polite service personnel and ambience were the factors behind this instant success. “Melting delicacies” was renowned for its exotic flavors and reasonable prices. It had Mango Mawa ice-cream looking like a cake with a silver foil on it; Dry fruit ice-cream with almonds, pista and figs with a dash of saffron; Vanilla ice cream with a base of cream and cashew nuts; Mango Rich Duet; and Custard Apple ice cream. These ice creams were priced anywhere between Rs3 25 and Rs 35 per scoop/cup/plate. The outlets also had a variety of sundaes and banana splits, but the five flavors of ice cream were hot favorites. The “Melting delicacies” outlets were frequented by young and old alike with the same enthusiasm and passion.
All five outlets had their own distinctive surroundings. One of the outlets was located in the Cyber City area, where all the IT giants and multinational companies (MNCs) had their offices. This joint was flooded with young people on weekdays and was relatively less crowded on weekends. Another location, near a boating facility, recreation center and amusement park in the heart of the city, was so overcrowded on weekends that many customers had to leave disappointed. The third location was in an institutional area and was surrounded mainly by women’s educational institutions and colleges. It was observed that this joint had a heavy demand for the Mango Duet flavor. One more was in an elite residential locality, and the fifth was in a sprawling shopping mall which also housed a multiplex.
In the Market there was tough competition from chains like
Baskin Robbins, Havmore, Naturals and the local chain of Dairy
Cream. Recently some of these competitors had opened their out-
lets in close proximity of “Melting delicacies”. Maria knew she had
to be on a close guard and avoid situations of out of stock on the
most favored flavors, delay in service due to non availability of
adequate service personnel or because of lack of place for the sit
and eat clients. These would only mean loss of customers and
grown tremendously over the past one year. With growing popu-
larity of the chain and tough competition in the market, any slip in
decision making would only mean trouble for the business. As a
CEO she felt that at this point of time knowing answers to certain
ground realities on trends, patterns, associations of the product
and the consumers was essential for right decisions on inventory
planning to human resource and marketing management.
She discussed this with the young managers in her outlets. They
said that intuitively they felt there were preferences and
associations, but they were not certain about them. To verify this they
needed documented data. She assigned the task of identifying
and gathering relevant information from the outlets, which would
support informed decisions for better performance of her
business, to one of the management trainees working in her
office.
After a series of discussions with the outlet managers and Maria,
and after getting clarity on the objective, he outlined a study which
required data collection.
The outlets had both “sit ‘n’ eat” (in-store) and take-away
facilities. The take-away was available in packs of 0.5 litre, 1 litre and
2 litres. For the in-store service the requested flavor was served
and charged per scoop. As collecting information on the preferred
or consumed flavor and on gender or age was not possible for the
take-away sales, this information was documented for the sit
and eat customers only. A quick recording of information on day
of the week, gender, age and flavor of the ice-cream consumed
was done at all the outlets. The other essential information was
also appropriately documented.
Many consolidated tables were generated from the data. Three
such consolidated tables are given in Exhibits (I), (II) and (III).
Exhibit I
Number of Scoops Consumed by In Store Consumers*
                   Week days (Monday to Friday)    Weekends (Saturday-Sunday)
Flavor choice      Male    Female    Total         Male    Female    Total
Mango Mawa          100        75      175           45        30       75
Mango Duet           75       150      225           60        65      125
Vanilla Cashew       95        55      150           20        30       50
Custard Apple        60        90      150           50        50      100
Dry fruit            40        60      100           10        40       50
Total               370       430      800          185       215      400
* Data for one week from one location only
Exhibit II
Traffic Density at the “Sit and Eat”* Venue by Gender
                     Week days (Monday to Friday)    Weekends (Saturday-Sunday)
Location             Male    Female    Total         Male    Female    Total
Cyber City            550       500     1050          150       125      275
Institutional area    500      1000     1500          175       150      325
NTR Gardens           375       350      725          650       675     1325
Banjara Hills         375       425      800          185       215      400
Central Mall          300       475      775          575       675     1250
Total                2100      2750     4850         1735      1840     3575
* Data for one week from one location only
Exhibit III
Number of Scoops Sold on One Particular Day
Location             Mango Mawa   Mango Duet   Vanilla Cashew   Dry fruit   Custard Apple   Total
Cyber City                   25           35               15          15              15     275
Institutional area           35           20               15          10              20     325
NTR Gardens                 100          150               50          60              85    1325
Banjara Hills                85           75               65          60              80     400
Central Mall                 50          175               75          65             125    1250
Total                       295          455              220         210             325    3575
* Data for one week from one location only
CHAPTER 9
Analysis of Variance (ANOVA)
In this chapter we will discuss
Analysis of Variance
Assumptions and Basics of ANOVA
Applying ANOVA to the Emoluments Problem
Multiple Comparisons
ANOVA in Practice
Section 1
What is ANOVA?
Analysis of Variance (ANOVA) is a statistical technique that allows us to
test whether the differences observed among more than
two sample means are significant or not. In other words, our
concern is whether the samples come from the same
population or not. This is a generalization of the test of
significance for the means drawn from two populations.
Managers are often required to test the significance of the
differences among the means drawn from more than two
populations. Several applications of ANOVA can be seen:
A transport company would like to compare the mileage
given by different brands of tyres.
A fertilizer company would like to compare the effectiveness
of different fertilizers on productivity.
An engineering company would like to compare the
productivities of machines producing the same
products.
In general we have a response variable (or
dependent variable). We then collect data to decide if one
or more factors (or independent variables) influence the
response variable. In many cases, the classes or categories
may be predefined and we would have to take them as
given. For instance, while comparing average heights of
different ethnic groups, the ethnic groups are taken as given
and we observe samples from each group.
Another type of situation is when the influencing factor (the
independent variable, also called treatment) is in our control
and we experimentally manipulate it. For instance, a
pharmaceutical company which has developed three
different types of drugs for treating a disease may
consciously conduct an experiment in which the affected
people are divided into four groups (one for each drug
and one for placebo application) and each group is
administered its treatment for a period, after which the
responses can be observed. Clearly, the assignment of
patients to the drug/treatment should be done randomly.
This is a typical situation of design of experiments and the
particular approach indicated here is referred to as a
completely randomized design. The ultimate interest here is
to compare the mean effectiveness of the drugs and the
placebo on the four groups.
The analysis tool for both the ethnic group example and the
pharmaceutical example will be one way ANOVA; one way
because we are observing the impact of only one factor/
treatment in these examples. This suggests that we can
have two way or, in general, m way ANOVA when we have
more than one influencing factor/treatment under
consideration. For instance, in the pharmaceutical example
one may be interested in studying the effectiveness of the
drugs in relation to the age groups of the patients, as prima
facie the drugs are expected to impact different age groups
differently. Thus age group emerges as another factor,
besides drugs.
Let us consider an example.
Example: Emoluments Comparability
From four premier institutes, 6, 7, 8 and 8 management
graduates respectively were selected. The amounts (in Rupees
lakhs) they were offered as annual emoluments during
their placement are shown below in table 9.1.1.
Can we say that, on average, graduates of all the institutions
are being offered the same emoluments, or are some
institutions preferred over the others?
Figure 9.1.1: ANOVA
Table 9.1.1. Table B
Institute 1   Institute 2   Institute 3   Institute 4
11            8             10            7.75
12            9             11            8.25
9             9.5           10.5          8.75
10.5          9.75          10.25         9
11.5          10            10.75         9.5
12            10.25         9.75          10
-             10.5          9             10.5
-             -             8.5           11
(- indicates no observation; the institutes contribute 6, 7, 8 and 8 graduates respectively)
Section 2
Assumptions and Basics of ANOVA
Assumptions
The various populations from which the samples are
drawn should be normal and have equal variances. The
requirement of normality can be relaxed if the sample
sizes are large enough.
The samples under each class/treatment are drawn
randomly and independently.
Basics of ANOVA
Let there be n sample observations on a random variable
X, divided into k classes on the basis of some criteria or
factors or treatments applied.
Let
k = number of classes/treatments
$n_i$ = number of observations in the i-th class (say, treated
with the i-th fertilizer)
n = total number of observations = $\sum_{i} n_i$
$X_{ij}$ = j-th observation from the i-th class, i = 1, 2, ..., k; j =
1, 2, ..., $n_i$
$T_i = \sum_{j} X_{ij}$ (summed over j) and $\bar{X}_i = T_i / n_i$
The sample data structure would look as follows in table
9.1.1:
Table 9.1.1. Sample Data Structure
Classes/Treatments   Sample Observations                    Total   Mean
1                    $X_{11}, X_{12}, \dots, X_{1n_1}$      $T_1$   $\bar{X}_1$
2                    $X_{21}, X_{22}, \dots, X_{2n_2}$      $T_2$   $\bar{X}_2$
......
i                    $X_{i1}, X_{i2}, \dots, X_{in_i}$      $T_i$   $\bar{X}_i$
......
k                    $X_{k1}, X_{k2}, \dots, X_{kn_k}$      $T_k$   $\bar{X}_k$
We wish to test the following hypothesis:
Null Hypothesis: H0 : µ1 = µ2 = µ3 = ... = µk, i.e.,
all the means are equal.
Alternate Hypothesis: H1 : Not all means are equal, i.e., at
least two means are different.
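We have two methods to test the above hypotheses using
ANOVA. While conceptually both methods are the same, the
second method is more convenient for manual computation.
Method 1:
Step 1: Compute the mean and the sum of squared deviations
for each class by the formulae:
$$\bar{X}_i = \frac{T_i}{n_i} = \frac{1}{n_i}\sum_{j=1}^{n_i} X_{ij}, \qquad S_i^2 = \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2$$
Also compute the grand mean of all the data observations
in the k classes by the formula:
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{k}\sum_{j=1}^{n_i} X_{ij} = \frac{1}{n}\sum_{i=1}^{k} T_i$$
Step 2: Obtain the Between Classes Sum of Squares (BSS)
by the formula:
$$BSS = \sum_{i=1}^{k} n_i (\bar{X}_i - \bar{X})^2$$
Step 3: Obtain the Between Classes Mean Sum of Squares
(MBSS):
$$MBSS = \frac{BSS}{k-1}$$
Step 4: Obtain the Within Classes Sum of Squares (WSS) by
the formula:
$$WSS = \sum_{i=1}^{k} S_i^2 = \sum_{i=1}^{k}\sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2$$
Step 5: Obtain the Within Classes Mean Sum of Squares
(MWSS):
$$MWSS = \frac{WSS}{n-k}$$
Step 6: Obtain the test statistic F or Variance Ratio (V.R):
$$F = \frac{MBSS}{MWSS} \sim F(k-1, n-k)$$
Step 7: Reject $H_0$ if $F > F(k-1, n-k, \alpha)$, where
$\alpha$ is the desired level of significance.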
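The steps of Method 1 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name one_way_anova and the toy data are assumptions made for the example and are not part of the text. The comparison with the tabulated F value (Step 7) is left to the reader.

```python
def one_way_anova(groups):
    """Return (BSS, WSS, F) for a list of samples, following Method 1."""
    k = len(groups)                       # number of classes/treatments
    n = sum(len(g) for g in groups)       # total number of observations

    # Step 1: class means, within-class sums of squared deviations, grand mean
    means = [sum(g) / len(g) for g in groups]
    s2 = [sum((x - m) ** 2 for x in g) for g, m in zip(groups, means)]
    grand_mean = sum(sum(g) for g in groups) / n

    # Steps 2 and 3: between classes sum of squares and its mean square
    bss = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    mbss = bss / (k - 1)

    # Steps 4 and 5: within classes sum of squares and its mean square
    wss = sum(s2)
    mwss = wss / (n - k)

    # Step 6: variance ratio
    f_ratio = mbss / mwss
    return bss, wss, f_ratio


# Hypothetical toy data: three classes with three observations each
toy = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
bss, wss, f_ratio = one_way_anova(toy)
print(f"BSS = {bss:.3f}, WSS = {wss:.3f}, F = {f_ratio:.3f}")
```

For the toy data the class means are 2, 3 and 6 and the grand mean is 33/9, so BSS = 26, WSS = 6 and F = 13.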
Method 2:
Step 1: Compute $G = \sum_{i=1}^{k}\sum_{j=1}^{n_i} X_{ij}$ = Grand total of all the
observations
Step 2: Compute Correction Factor $CF = G^2 / n$,
where $n$ is the total number of observations.
Step 3: Compute Raw Sum of Squares (RSS) = $\sum_{i}\sum_{j} X_{ij}^2$
Step 4: Total Sum of Squares (TSS) = RSS - CF
Step 5: Compute $T_i = \sum_{j} X_{ij}$ = The sum of all the
observations in the i-th class (i = 1, 2, ..., k)
Step 6: Between Classes (or Treatment) S.S (BSS) = $\sum_{i} T_i^2 / n_i - CF$
Step 7: Within Classes or Error S.S (WSS) = Total S.S -
Between Classes S.S
Step 8: Now follow steps 3, 5, 6 and 7 of Method 1.
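The shortcut formulae of Method 2 follow from expanding the squared deviations used in Method 1. A brief sketch of the algebra is given here; the symbol G for the grand total is a notational assumption, with $\bar{X} = G/n$ and $\bar{X}_i = T_i/n_i$.

```latex
\begin{aligned}
TSS &= \sum_{i}\sum_{j} (X_{ij}-\bar{X})^2
     = \sum_{i}\sum_{j} X_{ij}^2 - n\bar{X}^2
     = RSS - \frac{G^2}{n} = RSS - CF, \\
BSS &= \sum_{i} n_i(\bar{X}_i-\bar{X})^2
     = \sum_{i} n_i\bar{X}_i^2 - n\bar{X}^2
     = \sum_{i} \frac{T_i^2}{n_i} - CF, \\
WSS &= \sum_{i}\sum_{j} (X_{ij}-\bar{X}_i)^2 = TSS - BSS .
\end{aligned}
```

The last line holds because the cross-product term vanishes when each deviation $X_{ij}-\bar{X}$ is split into $(X_{ij}-\bar{X}_i) + (\bar{X}_i-\bar{X})$, so both methods give identical sums of squares.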
These calculations are much simpler than those
in the previous method. The computations
(from either method) can be summarized in a tabular form as
shown in Table 9.1.2.
For the emoluments example, which is worked out in Section 3,
F(critical) = F(3, 25, 0.05) = 2.99. Since the computed F > F(critical),
the null hypothesis of equality of emoluments across the
institutions is rejected.
Table 9.1.2. ANOVA Table for one way classification
Sources of Variation           Sum of Squares   d.f.   Mean Sum of Squares   Variance Ratio (F)
Treatments (Between Classes)   BSS              k-1    MBSS                  F = MBSS/MWSS ~ F(k-1, n-k)
Error (Within Classes)         WSS              n-k    MWSS
Total                          TSS              n-1
Section 3
Applying ANOVA to the Emoluments Problem
Here, we have:
n = 29, k = 4, n1 = 6, n2 = 7, n3 = 8 and n4 = 8.
Using Method 1: On computation we get,
T1 = 66.00, T2 = 67.00, T3 = 79.75, T4 = 74.75
and the grand mean $\bar{X}$ = 9.913.
To compute $S_1^2$, $S_2^2$, $S_3^2$ and $S_4^2$, we carry out the computations in
table 9.1.1.
With the help of the computations in the table we get:
$S_1^2$ = 6.5, $S_2^2$ = 4.339, $S_3^2$ = 5.180 and $S_4^2$ = 8.742.
From here,
WSS = $\sum_i S_i^2$ = 24.761 and MWSS = WSS/(n-k) = 24.761/(29-4) = 0.990
To obtain BSS and MBSS, we carry out the computations in
table 9.1.2.
From here, BSS = 10.523 and MBSS = BSS/(k-1) =
10.523/(4-1) = 3.508
The F ratio comes to F = MBSS / MWSS = 3.508/0.990 = 3.54
Here, F(critical) = F(3, 25, 0.05) = 2.99
Since computed F > F(critical), we reject the null
hypothesis of equality of emoluments across the
institutions. The ANOVA results may be summed up in a
tabular form as shown in the Table 9.1.3.
Table 9.1.1
(Dev. = deviation of the observation from the mean of its own institute)
Institute 1                Institute 2                Institute 3                Institute 4
X1i      Dev.     Square   X2i      Dev.     Square   X3i      Dev.     Square   X4i      Dev.     Square
11.000   0.000    0.000    8.000    -1.571   2.469    10.000   0.031    0.001    7.750    -1.594   2.540
12.000   1.000    1.000    9.000    -0.571   0.327    11.000   1.031    1.063    8.250    -1.094   1.196
9.000    -2.000   4.000    9.500    -0.071   0.005    10.500   0.531    0.282    8.750    -0.594   0.353
10.500   -0.500   0.250    9.750    0.179    0.032    10.250   0.281    0.079    9.000    -0.344   0.118
11.500   0.500    0.250    10.000   0.429    0.184    10.750   0.781    0.610    9.500    0.156    0.024
12.000   1.000    1.000    10.250   0.679    0.460    9.750    -0.219   0.048    10.000   0.656    0.431
                           10.500   0.929    0.862    9.000    -0.969   0.938    10.500   1.156    1.337
                                                      8.500    -1.469   2.157    11.000   1.656    2.743
Total             6.5                        4.3393                     5.1797                     8.7422
WSS      24.761
Table 9.1.2.
Institute   ni   Class mean   Grand mean   Diff.    Diff. sq.   ni*Diff. sq.
1           6    11.000       9.913        1.087    1.182       7.089
2           7    9.571        9.913        -0.342   0.117       0.817
3           8    9.969        9.913        0.056    0.003       0.025
4           8    9.344        9.913        -0.569   0.324       2.592
BSS 10.523   MBSS 3.508   F 3.542
Table 9.1.3: ANOVA Table for one way classification
Source of Variation   SS       df   MS      F       P-value   F crit
Between Groups        10.523   3    3.508   3.542   0.029     2.991
Within Groups         24.761   25   0.990
Total                 35.284   28
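The figures of this section can be cross-checked with a short Python sketch that follows the correction-factor route of Method 2 on the emoluments data of the example. The function name anova_shortcut and the dictionary keys are assumptions made for illustration; the critical value 2.99 is the tabulated F(3, 25, 0.05) quoted in the text and is simply typed in.

```python
# Annual emoluments (Rs. lakhs) from the example, one list per institute
emoluments = [
    [11, 12, 9, 10.5, 11.5, 12],
    [8, 9, 9.5, 9.75, 10, 10.25, 10.5],
    [10, 11, 10.5, 10.25, 10.75, 9.75, 9, 8.5],
    [7.75, 8.25, 8.75, 9, 9.5, 10, 10.5, 11],
]


def anova_shortcut(groups):
    """One way ANOVA by the correction-factor route (Method 2)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_total = sum(sum(g) for g in groups)             # Step 1
    cf = grand_total ** 2 / n                             # Step 2
    rss = sum(x * x for g in groups for x in g)           # Step 3
    tss = rss - cf                                        # Step 4
    class_totals = [sum(g) for g in groups]               # Step 5
    bss = sum(t * t / len(g)
              for t, g in zip(class_totals, groups)) - cf  # Step 6
    wss = tss - bss                                       # Step 7
    mbss = bss / (k - 1)
    mwss = wss / (n - k)
    return {"T": class_totals, "BSS": bss, "WSS": wss, "TSS": tss,
            "MBSS": mbss, "MWSS": mwss, "F": mbss / mwss}


result = anova_shortcut(emoluments)
F_CRITICAL = 2.99   # F(3, 25, 0.05), taken from the text, not computed here

print("Class totals:", result["T"])
print(f"BSS = {result['BSS']:.3f}, WSS = {result['WSS']:.3f}, "
      f"TSS = {result['TSS']:.3f}, F = {result['F']:.3f}")
print("Reject H0" if result["F"] > F_CRITICAL else "Do not reject H0")
```

The sums of squares and the F ratio agree with Table 9.1.3 up to rounding, and the computed F exceeds the critical value.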
Section 4
Multiple Comparisons
In ANOVA, when the null hypothesis is rejected, it
indicates that the samples represent different populations.
If so, it would be of interest to identify the sub-groups of
populations which are homogeneous among themselves
(i.e., have the same means). For instance, if a
manufacturer has several suppliers for the supply of a
particular component, he would like to group them on the
basis of quality levels of the components supplied by
them. Based on our knowledge of the t-test, we know that we
can conduct a pairwise test of equality of means (on quality level)
for all sample pairs.
Formally, the null hypothesis would be H0 : µi = µj (i≠j),
and we use the t-test for comparing populations i and j,
with the estimate of $\sigma^2$ obtained from the two samples.
As an alternative to this approach, it is suggested that the
estimate of $\sigma^2$ may be obtained based on all the samples,
instead of only the samples being compared. Hence, it is
a pooled estimate of $\sigma^2$ based on all samples. Thus,
$$t = \frac{\bar{X}_i - \bar{X}_j}{\sqrt{MWSS\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}}$$
Since MWSS is an unbiased estimate of $\sigma^2$, we would
reject H0 by comparing the computed t with the t(critical)
with (n-k) degrees of freedom and α level of significance,
depending on the alternative.
Consider the alternative hypothesis as H1 : µi ≠ µj. We can
simplify the test and restate it as follows:
Reject H0 at α level of significance, if
$$|\bar{X}_i - \bar{X}_j| > LSD,$$
where
$$LSD = t(n-k, \alpha/2) \sqrt{MWSS\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$
Here, LSD stands for least significant difference,
sometimes called the critical difference.
If all $n_i$'s are equal, then we need to calculate the LSD only
once and then compare the differences of all pairs of sample
means with the computed LSD. If the difference for a pair is less
than the LSD, the two samples are taken to belong to the same population; if not, they
are taken to belong to different populations.
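As an illustration, the LSD comparison can be sketched in Python for the emoluments data of the earlier example. Since the sample sizes are unequal there, the LSD is computed separately for each pair. The critical value t(25, 0.025) = 2.060 is read from a standard t-table and typed in as an assumed input; the variable names are illustrative only.

```python
from itertools import combinations
from math import sqrt

# Annual emoluments (Rs. lakhs) from the emoluments example
emoluments = {
    "Institute 1": [11, 12, 9, 10.5, 11.5, 12],
    "Institute 2": [8, 9, 9.5, 9.75, 10, 10.25, 10.5],
    "Institute 3": [10, 11, 10.5, 10.25, 10.75, 9.75, 9, 8.5],
    "Institute 4": [7.75, 8.25, 8.75, 9, 9.5, 10, 10.5, 11],
}

T_CRITICAL = 2.060   # t(n-k, alpha/2) = t(25, 0.025), from a t-table (assumed input)

n = sum(len(v) for v in emoluments.values())
k = len(emoluments)
means = {name: sum(v) / len(v) for name, v in emoluments.items()}

# MWSS: pooled estimate of the variance based on all the samples
wss = sum((x - means[name]) ** 2 for name, v in emoluments.items() for x in v)
mwss = wss / (n - k)

significant_pairs = []
for a, b in combinations(emoluments, 2):
    lsd = T_CRITICAL * sqrt(mwss * (1 / len(emoluments[a]) + 1 / len(emoluments[b])))
    diff = abs(means[a] - means[b])
    if diff > lsd:
        significant_pairs.append((a, b))
    verdict = "different" if diff > lsd else "not different"
    print(f"{a} vs {b}: |difference| = {diff:.3f}, LSD = {lsd:.3f} -> {verdict}")
```

With these inputs, only the pairs Institute 1 versus Institute 2 and Institute 1 versus Institute 4 show a difference larger than their LSD.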
In ANOVA, multiple comparisons could also be carried out
using the Tukey-Kramer procedure. As earlier, our hypotheses
may be stated as
H0 : µi = µj Vs. H1 : µi ≠ µj (for all i≠j)
According to this procedure, we compute a critical range (CR)
as below.
$$CR = Q_{\alpha} \sqrt{\frac{MWSS}{2}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$
where $Q_{\alpha}$ is the upper tail critical value from a Studentized
range distribution with degrees of freedom k and (n-k)
respectively for the numerator and the denominator, with level
of significance $\alpha$. A pair of means is declared significantly
different if $|\bar{X}_i - \bar{X}_j| > CR$. Critical values for the Studentized range
distribution are available in tables in standard textbooks on
statistics.
Section 5
ANOVA in Practice
As we noted earlier, two important assumptions for
conducting ANOVA are: (i) all the populations are normal
and (ii) all the populations have equal variance. On both
counts, some relaxation is possible in practice. If the size
of each sample is large enough (>30), ANOVA can be
applied even if the underlying populations deviate from
normal distribution. Similarly, in the case of one-way
ANOVA, if the sample sizes are nearly equal over the
groups, ANOVA can tolerate some fluctuations in
variance. The thumb rule is: the largest sample standard
deviation should be no more than twice the smallest
sample standard deviation.
There are, however, more formal tests for testing the
equality of variances over the populations considered.
One such test is Levene’s test.
Although the one way ANOVA is relatively robust (as
explained above), large differences in the variances can
significantly affect the validity of the F test. Thus in such
situations, we can first test for the equality of the
variances over the different classes/treatments (using
Levene’s test), and only if the homogeneity of the
variances is not rejected do we proceed with ANOVA.
Formally, Levene’s test is described as follows:
H0 : $\sigma_1^2 = \sigma_2^2 = \dots = \sigma_k^2$ Vs. H1 : not all $\sigma_i^2$ are equal (i =
1, 2, ..., k)
For conducting the test, for each class, we first compute
the absolute difference between each observation and the
median of the class. Thus we will obtain $n_1$ absolute
differences for the first class, $n_2$ for the second class and
so on, with finally $n_k$ absolute differences for the k-th class.
We then perform a one way ANOVA on these differences,
testing for equality of mean absolute differences over the
classes. We reject the original null hypothesis of equality
of variances if this null hypothesis is rejected.
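The procedure described above can be sketched in Python as follows, again using the emoluments data of the earlier example. The helper name f_ratio is an assumption for illustration, and the critical value 2.99 is the F(3, 25, 0.05) value quoted earlier in the chapter, typed in rather than computed.

```python
from statistics import median

# Annual emoluments (Rs. lakhs) from the emoluments example
emoluments = [
    [11, 12, 9, 10.5, 11.5, 12],
    [8, 9, 9.5, 9.75, 10, 10.25, 10.5],
    [10, 11, 10.5, 10.25, 10.75, 9.75, 9, 8.5],
    [7.75, 8.25, 8.75, 9, 9.5, 10, 10.5, 11],
]


def f_ratio(groups):
    """Variance ratio F = MBSS / MWSS of a one way ANOVA."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    bss = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    wss = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (bss / (k - 1)) / (wss / (n - k))


# Absolute difference between each observation and the median of its class
abs_dev = [[abs(x - median(g)) for x in g] for g in emoluments]

# One way ANOVA on the absolute differences
levene_f = f_ratio(abs_dev)
F_CRITICAL = 2.99   # F(3, 25, 0.05), taken from the text

print(f"F on absolute deviations = {levene_f:.3f}")
if levene_f > F_CRITICAL:
    print("Reject equality of variances")
else:
    print("Equality of variances is not rejected")
```

For this data the F ratio on the absolute differences is well below the critical value, so the assumption of equal variances is not rejected and the ANOVA of Section 3 can proceed.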
References:
http://www2.sas.com/proceedings/sugi29/192-29.pdf
SECTION 6
Case Study: Real Foods
In 2003, Real Foods, a mango juice manufacturer had a pre-
dominant market presence in South India. ‘Enjoy,’ the bottled
mango juice from Real Foods enjoyed a comfortable position
in the branded fruit juices market. For the first time, Real
Foods ventured into another product – an orange juice con-
centrate. Since the market was already full of canned and bot-
tled orange juices, Real Foods opted for the concentrate
form, targeting the home consumption segment. Liquid con-
centrates were available in the market already but Real
Foods had developed a powder concentrate available in tetra-
packs. The powder concentrate when mixed with water gave
a litre of orange juice. Real Foods decided to market the new
product under the ‘Enjoy’ brand name, to leverage the brand’s
equity.
The new product had several attractive features. First of all,
the powder concentrate was much more convenient than the
canned and bottled orange juices. Secondly, Real Foods be-
lieved the quality of the juice made out of the concentrate was
better because unlike canned juices, the juice from the con-
centrate could be prepared just before consumption. Another
very important feature was that the powder concentrate was
available at a much lower price than the other juices. The marketing
manager was in a dilemma as to how to advertise the
new product. He could opt for advertisements that emphasized the
convenience of the product, the quality attribute, or the price
advantage. To facilitate a decision, he conducted an experiment in
three cities – Bangalore, Chennai, and Hyderabad.
In Bangalore, the marketing manager launched the new prod-
of the product. The product was easy to carry from the store
to home. The powder did not require storage in the refrigera-
tor. Even households without a refrigerator could buy the prod-
lighted the ease with which one litre of juice was ready in a
the quality of the product – the freshness proposition, how it
tisements stressed upon the price advantage.
The marketing manager recorded the weekly sales of the new
concentrate in tetrapacks, for 20 weeks in all the three cities
(Refer to Table I). He wanted to know if the difference in sales
was on account of the different communication strategies
adopted by the company for the three cities.
Weekly sales (for 20 weeks) in 3 cities
Week   Bangalore (Convenience)   Chennai (Quality)   Hyderabad (Price)
1 75 45 65
2 60 54 45
3 75 65 56
4 45 56 60
5 55 65 64
6 75 70 54
7 65 62 80
8 80 70 56
9 75 71 67
10 89 60 50
11 95 67 67
12 87 64 70
13 64 56 72
14 71 65 65
15 84 57 65
16 75 54 63
17 54 67 56
18 65 70 64
19 65 59 68
20 55 63 72
SECTION 7
Case Study: “Melting Delicacies” Ice cream Parlour chain (B)
This case study was written by L. Shridharan, Professor, Department of Decision Sciences, IBS, Hyderabad. It is intended to be
used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situation.
The case was written from generalized experiences.
Maria and her outlet managers generally believed that the
weekly sales across the five outlets of “Melting Delicacies”
were more or less the same. Of late, however, some of the outlet
managers reported some fluctuations in the sales, and they
attributed this to the opening of a competitor’s outlets in the vicinity
of some of her outlets. Naturally, Maria was concerned.
This could pose a threat and Maria knew that she had to act
fast. However, the balanced person in her suggested that she
should ascertain the views of the outlet managers on an objec-
tive basis. She once again called Kiran, the management
trainee in her office and expressed her concern. She wanted
him to verify if there is a difference in sales across the outlets
and if so to indicate the outlets in the order / groups of sales
importance. With these facts established statistically, she felt
she would be on a firmer ground to evolve her strategy to
counter the competition.
After some discussions with the outlet managers, and with
guidance from his B-School professor, Kiran decided on a plan
for data collection and analysis. As a first step, he collected
the weekly sales revenue data for each outlet for the past 15
weeks, though data was not available
for some outlets for some weeks (Exhibit I).
Based on this information, will Kiran be able to help Maria in
ascertaining her concern one way or the other?
Exhibit I
Weekly Sales (in Rs. Lakhs) at Different Outlets
Week   Cyber City   Institutional Area   NTR Garden   Banjara Hills   Central Mall
1      3.9          3.7                  6.1          4.5             6.5
2      4.2          NA                   5.4          3.7             5.3
3      6.6          4.7                  5.8          5.4             6.4
4      5.1          5.8                  5.4          3.8             6.1
5      4.1          3.6                  7.2          5.1             6.4
6      3.8          4.6                  5.3          3.8             7.4
7      5.7          3.2                  4.4          5.4             NA
8      5.1          4.5                  6.7          5.7             5.7
9      5.2          3.9                  4.5          3.9             5.9
10     NA           3.2                  5.0          5.2             6.6
11     2.7          NA                   6.4          5.9             7.3
12     3.1          3.2                  NA           5.1             5.8
13     6.1          4.2                  5.5          4.5             6.2
14     5.1          3.9                  NA           3.3             6.3
15     4.8          3.3                  7.0          4.7             7.4
NA: Not Available
Prepared by author
CHAPTER 10
Correlation and Regression Analysis
In this chapter we will discuss
Correlation Analysis
• Correlation Coefficient
• Properties of Correlation Coefficient
Simple Linear Regression
• Simple Regression of Y on X
• Simple Regression of X on Y
• Some Properties of Regression Coefficients
Multiple Regression
• How good the regression fit is (simple or multiple)?
• Standard Error of Estimate
• Testing for Significance of Regression Relation
• Testing for Significance of Regression Coefficients
• Confidence Interval for $b_i$
• Prediction using Regression equation
• Some Important Considerations
Section 1
Correlation Analysis
Correlation Analysis is the study of changes in one
variable in relation to changes in another variable. The
phenomenon can be observed in several natural and
economic contexts. Illustratively,
(a). Higher the rainfall, higher the agricultural production.
(b). Higher the income, higher the expenditure.
(c). Higher the price, lower the demand.
(d). Higher the age of an equipment, higher the
maintenance cost.
Therefore, it is a study of variation in one variable in relation
to variation in the other variable.
Consider the maintenance cost of a particular type of
equipment at different vintage levels (refer table 10.1.1).
We can plot the data as in figure 10.1.1. It is suggestive of a
linear relation between maintenance cost and vintage.
The question is - can we measure the extent of this linear
relationship between the variables vintage and maintenance cost?
Table 10.1.1
Vintage (in years)   Maintenance Cost (in Rs. 000s)
2 6
7 18
5 13
9 23
4 9
3 5
8 22
Figure 10.1.1: Plotting data (scatter plot of maintenance cost, in Rs. 000s, against vintage, in years)
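Correlation Coefficient
The strength of correlation between two variables X and Y
is measured through the correlation coefficient, which is
defined as:
$$r_{XY} = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i-\bar{X})^2 \, \sum_{i=1}^{n}(Y_i-\bar{Y})^2}}$$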
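As a quick illustration, the product-moment (Pearson) correlation coefficient for the vintage and maintenance cost data of table 10.1.1 can be computed with the following Python sketch. The function name pearson_r is an assumption made for the example.

```python
from math import sqrt

# Data of table 10.1.1
vintage = [2, 7, 5, 9, 4, 3, 8]        # in years
cost = [6, 18, 13, 23, 9, 5, 22]       # maintenance cost


def pearson_r(x, y):
    """Correlation coefficient from deviations about the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


r = pearson_r(vintage, cost)
print(f"r = {r:.3f}")
```

The value comes out close to 0.985, which confirms the strong positive linear relation suggested by the plot.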
Properties of Correlation Coefficient
a. The correlation coefficient $r_{XY}$ measures only the extent of
linear relationship between X and Y.
b. Always, we have $-1 \le r_{XY} \le +1$.
Further,
$r_{XY} = 1$ => perfect positive relation between X and Y (i.e., as X
increases, Y increases)
$r_{XY} = -1$ => perfect negative relation between X and Y (i.e., as
X increases, Y decreases)
$r_{XY} = 0$ => no correlation (i.e., X and Y each behave in their
own way).
c. Change of origin and scale does not affect the
correlation coefficient.
Let U and V be defined as:
U = (X - a)/c, V = (Y - b)/d, where a, b, c and d are constants, with c and d positive.
Then $r_{XY} = r_{UV}$.
d. If X and Y are independent, then $r_{XY} = 0$, but the
converse is not true, as can be seen from the following
example:
Let:
X   -3   -2   -1   +1   +2   +3
Y    9    4    1    1    4    9
Here $r_{XY} = 0$, but actually $Y = X^2$ (a non-linear relation) and
hence X and Y are perfectly related.
e. Spurious Correlation
Let (x1, y1), (x2, y2) --------------(xn, yn) be n pairs of
observations. Mathematically one can calculate the
correlation coefficient between X and Y. However, to make
meaningful sense out of it, one must look for theoretical or
other reasons for the cause and effect relationship. While
agricultural production of Country A can be expected to
depend on the rainfall in that country, clearly rainfall in
Country A cannot provide any meaningful explanation for
agricultural production in Country B. On computation, we
will get some value for the correlation coefficient due to
influence of some common factors like nature or time, but
clearly such correlations cannot be meaningfully
interpreted.
f. Sometimes it may be more meaningful to correlate
variables with a lag, e.g., current months’ sale would
months ago ( i.e., a lag of 2 periods). Then, we may
correlate $Y_t$ with $X_{t-2}$.
g. $r_{XY}^2$ is referred to as the coefficient of determination.
h. Test for significance of correlation coefficient
Let $\rho_{XY}$ = Population Correlation Coefficient between X and
Y. Then, we may test H0 : $\rho_{XY} = 0$ against the alternatives
H1 : $\rho_{XY} \ne 0$, $\rho_{XY} > 0$ or $\rho_{XY} < 0$ through a t-test.
The test statistic is given by
$$t = \frac{r_{XY}\sqrt{n-2}}{\sqrt{1-r_{XY}^2}} \sim t_{n-2}$$
The decision criteria at $\alpha$ level of significance are as follows:
If H1 : $\rho_{XY} \ne 0$, then reject H0 if | t | > t(n-2, α/2)
If H1 : $\rho_{XY} > 0$, then reject H0 if t > t(n-2, α)
If H1 : $\rho_{XY} < 0$, then reject H0 if t < -t(n-2, α)
This testing is made possible under the assumption that
the error terms $e_i$'s are mutually independent and are
distributed normally with zero mean and a constant
variance $\sigma^2$.
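Three of the properties listed above (c, d and h) lend themselves to a numerical check. The Python sketch below is illustrative only: the constants used for the change of origin and scale are arbitrary choices, and the critical value t(5, 0.025) = 2.571 is read from a standard t-table and typed in as an assumed input.

```python
from math import sqrt


def pearson_r(x, y):
    """Correlation coefficient from deviations about the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Data of table 10.1.1
vintage = [2, 7, 5, 9, 4, 3, 8]
cost = [6, 18, 13, 23, 9, 5, 22]
r_xy = pearson_r(vintage, cost)

# Property (c): change of origin and scale (a = 5, c = 2, b = 10, d = 4 are arbitrary)
u = [(x - 5) / 2 for x in vintage]
v = [(y - 10) / 4 for y in cost]
r_uv = pearson_r(u, v)
print(f"r_XY = {r_xy:.4f}, r_UV = {r_uv:.4f}")

# Property (d): Y = X^2 gives zero correlation although Y is fully determined by X
x = [-3, -2, -1, 1, 2, 3]
y = [value ** 2 for value in x]
r_square_example = pearson_r(x, y)
print(f"r for Y = X^2: {r_square_example:.4f}")

# Property (h): t-test for H0: rho = 0 against H1: rho != 0
n = len(vintage)
t_stat = r_xy * sqrt(n - 2) / sqrt(1 - r_xy ** 2)
T_CRITICAL = 2.571   # t(n-2, alpha/2) = t(5, 0.025), from a t-table (assumed input)
print(f"t = {t_stat:.3f}")
print("Reject H0" if abs(t_stat) > T_CRITICAL else "Do not reject H0")
```

For the vintage data the computed t is far above the critical value, so the hypothesis of zero correlation is rejected.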
Section 2
Simple Linear Regression
The correlation coefficient measures the degree of linear
relationship between two variables, though it does not
probe the cause and effect relationship. On the other hand,
Linear Regression probes the cause-effect relation by
specifying the nature of the relationship between Y
(dependent variable) and X (independent variable) in the
case of Simple Regression, and X1, X2, ..., Xk
(independent variables) in the case of Multiple Regression.
Simple Regression of Y on X
Let (x1, y1) ................ (xn, yn) be n - observations. We
believe that X is the cause and Y is the effect. We try to
identify the relationship through the following simple linear
model (see figure 10.2.1).
Yi = a + b Xi + ei,
where, ei is the error term
Figure 10.2.1: Simple Regression of Y on X
If we know a and b, we would know the relationship between
Y and X. We try to obtain a and b in such a way that the
“error sum of squares” is minimized, i.e., we minimize
$$\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (Y_i - a - bX_i)^2$$
over the choice of ‘a’ and ‘b’ using the
calculus approach. When this is done, we get $\hat{a}$ and $\hat{b}$
(estimates of a and b) as:
$$\hat{b} = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sum_{i=1}^{n}(X_i-\bar{X})^2}, \qquad \hat{a} = \bar{Y} - \hat{b}\,\bar{X}$$
This regression is called the regression of Y on X; ‘a’ and ‘b’
are called regression coefficients, and $\hat{a}$ and $\hat{b}$ the
respective estimates. This regression equation can be
equivalently written as $(Y - \bar{Y}) = \hat{b}\,(X - \bar{X})$.
The method of determining a and b by minimizing the “error
sum of squares” is called the Least Squares Method.
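The least squares estimates can be illustrated on the vintage and maintenance cost data of table 10.1.1. The Python sketch below uses the standard closed-form least squares solution for a straight line; the function name fit_line is an assumption made for the example.

```python
# Data of table 10.1.1
vintage = [2, 7, 5, 9, 4, 3, 8]        # X, in years
cost = [6, 18, 13, 23, 9, 5, 22]       # Y, maintenance cost


def fit_line(x, y):
    """Least squares estimates (a_hat, b_hat) for Y = a + b X + e."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b_hat = sxy / sxx                    # slope
    a_hat = mean_y - b_hat * mean_x      # intercept
    return a_hat, b_hat


a_hat, b_hat = fit_line(vintage, cost)
print(f"Estimated line: cost = {a_hat:.3f} + {b_hat:.3f} * vintage")

# Error sum of squares at the fitted line
ess_fit = sum((yi - a_hat - b_hat * xi) ** 2 for xi, yi in zip(vintage, cost))
print(f"Error sum of squares = {ess_fit:.3f}")
```

Any other choice of a and b gives a larger error sum of squares than the fitted values, which is what the Least Squares Method guarantees.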
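Simple regression of X on Y
In a similar fashion, we can obtain the regression of X on Y,
say denoted as X = a* + b* Y
In this case, we get:
$$\hat{b}^* = \frac{\sum_{i=1}^{n}(X_i-\bar{X})(Y_i-\bar{Y})}{\sum_{i=1}^{n}(Y_i-\bar{Y})^2}, \qquad \hat{a}^* = \bar{X} - \hat{b}^*\,\bar{Y}$$
This regression equation can be equivalently written as:
$(X - \bar{X}) = \hat{b}^*\,(Y - \bar{Y})$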
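For the data of table 10.1.1, the two regression slopes can be compared with a short Python sketch. The helper name slope is an assumption; the check at the end uses the fact that the product of the two slopes equals the squared correlation coefficient.

```python
from math import sqrt

# Data of table 10.1.1
vintage = [2, 7, 5, 9, 4, 3, 8]        # X
cost = [6, 18, 13, 23, 9, 5, 22]       # Y


def slope(explanatory, response):
    """Least squares slope of the regression of `response` on `explanatory`."""
    n = len(explanatory)
    mean_e, mean_r = sum(explanatory) / n, sum(response) / n
    cross = sum((e - mean_e) * (r - mean_r) for e, r in zip(explanatory, response))
    return cross / sum((e - mean_e) ** 2 for e in explanatory)


b_hat = slope(vintage, cost)      # regression of Y on X
b_star = slope(cost, vintage)     # regression of X on Y

# Both slopes are positive here, so r takes the positive square root
r_from_slopes = sqrt(b_hat * b_star)
print(f"b = {b_hat:.4f}, b* = {b_star:.4f}, sqrt(b * b*) = {r_from_slopes:.4f}")
```

Note that the two regression lines are different lines in general; they coincide only when the correlation is perfect.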
Some Properties of Regression Coefficients
i). Note that $\hat{b}\,\hat{b}^* = r_{XY}^2$ in the case of simple regressions.
This means
$r_{XY} = \pm\sqrt{\hat{b}\,\hat{b}^*}$, where the sign would be decided
by the sign of $\hat{b}$ and $\hat{b}^*$. They will always have the same
signs.
ii). Also,
References:
Simple%20Regression.pdf
Section 3
Multiple Regression
In the case of simple regression, we had only one
independent (explanatory) variable (X) to explain the
dependent variable (Y). In the case of multiple regression,
we consider several independent (explanatory) variables
(say, X1.........Xk) to explain the dependent variable (Y). The
data structure looks as in the table 10.3.1:
Once again, we consider the linear model only (Simple
regression can be considered as a special case of multiple
regression where k=1). Thus the regression relation is
expressed as follows:
$$Y_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \cdots + b_k X_{ki} + e_i$$
The estimation of b’s is done following the same logic as in
the case of simple regression, i.e., by minimizing the error
sum of squares for the choice of b’s. While we will not
present here the formulae for b’s, computer software (Excel,
SPSS, SAS, etc.) gives the estimated values of the b’s (the
regression coefficients), written as $\hat{b}$’s.
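Although the formulae are left to software, the estimation can be sketched in plain Python. This is a minimal sketch, assuming the usual normal-equations approach (solve (X'X) b = X'Y) with a small Gaussian elimination helper; the function names and data are hypothetical, and statistical packages use more robust numerical methods.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_multiple_regression(rows, ys):
    """rows[i] = [X_1i, ..., X_ki]; returns [b0_hat, b1_hat, ..., bk_hat]."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 carries the intercept b0
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(p)]
    return solve_linear_system(xtx, xty)

# Hypothetical data generated exactly from Y = 1 + 2*X1 + 3*X2
rows = [[1, 0], [0, 1], [1, 1], [2, 1], [1, 2]]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]
b_hats = fit_multiple_regression(rows, ys)
print(b_hats)
```

With k = 1 the same routine reproduces simple regression, in line with the remark that simple regression is a special case of multiple regression.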
Table 10.3.1: Data Structure

| Observation | $Y_i$ | $X_{1i}, \ldots, X_{ki}$ |
|---|---|---|
| 1 | $Y_1$ | $X_{11}, \ldots, X_{k1}$ |
| 2 | $Y_2$ | $X_{12}, \ldots, X_{k2}$ |
| : | : | : |
| n | $Y_n$ | $X_{1n}, \ldots, X_{kn}$ |
How good is the regression fit (simple or multiple)?
Once the regression line is estimated, we can obtain the
estimated values of the $Y_i$’s (denoted by $\hat{Y}_i$’s) for given values
of the $X_i$’s. We should expect the $Y_i$’s and $\hat{Y}_i$’s to be close for a
good regression relation. In other words, we would expect a
high correlation between $Y_i$ and $\hat{Y}_i$ if the regression relation
is good, i.e., the correlation between $Y_i$ and $\hat{Y}_i$ can be taken as indicative of the
goodness of the regression fit.
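Another way of looking at the degree of closeness between the
$Y_i$’s and $\hat{Y}_i$’s is through the break-up of the total sum of
squares as given below:
$$\sum (Y_i - \bar{Y})^2 = \sum (\hat{Y}_i - \bar{Y})^2 + \sum (Y_i - \hat{Y}_i)^2$$
(This relation can be proved.)
or, Total SS = Explained SS + Unexplained SS (or Error SS)
or,
$$1 = R^2 + \frac{\text{Unexplained SS}}{\text{Total SS}},$$
where,
$$R^2 = \frac{\text{Explained SS}}{\text{Total SS}}$$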
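Clearly, the higher the proportion of the explained sum of squares
in the total sum of squares ($R^2$), the better or more reliable
the regression relation would be. It is also clear that $0 \le R^2 \le 1$.
In fact, it can be proved that $R^2 = r^2_{Y\hat{Y}}$, the squared correlation between the $Y_i$’s and the $\hat{Y}_i$’s.
$R^2$ closer to 1 is indicative of a good regression fit.
However, $R^2$ will keep on increasing (it can never decrease) if we continue to add
more independent variables to the regression relation even
if their contribution is not significant. To take care of this
situation, we use the adjusted $R^2$ (written $\bar{R}^2$) as given below:
$$\bar{R}^2 = 1 - (1 - R^2)\,\frac{n-1}{n-k-1},$$
where n is the number of observations and k the number of independent variables.
$\bar{R}^2$ will always be slightly lower than $R^2$ and would fall when
the addition of a variable does not contribute significantly.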
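The two measures can be computed directly from the observed and fitted values. The sketch below assumes the fitted values come from a least squares fit with an intercept (so that Explained SS / Total SS equals 1 - Unexplained SS / Total SS) and uses the usual adjustment factor (n - 1)/(n - k - 1); the data are hypothetical.

```python
def r_squared(ys, y_hats):
    """R^2 = 1 - Unexplained SS / Total SS."""
    y_bar = sum(ys) / len(ys)
    total_ss = sum((y - y_bar) ** 2 for y in ys)
    unexplained_ss = sum((y - yh) ** 2 for y, yh in zip(ys, y_hats))
    return 1.0 - unexplained_ss / total_ss

def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 for n observations and k independent variables."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# Hypothetical data: X = 1..5, fitted line Y_hat = 2.2 + 0.6 X
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
y_hats = [2.8, 3.4, 4.0, 4.6, 5.2]
r2 = r_squared(ys, y_hats)
adj_r2 = adjusted_r_squared(r2, n=5, k=1)
print(r2, adj_r2)
```

Here the adjusted value is lower than the unadjusted one, as the text states; the gap widens as k grows relative to n.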
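Standard Error of Estimate
The estimate of $\sigma$, called the standard error of estimate, is given as:
$$s_e = \sqrt{\frac{\sum (Y_i - \hat{Y}_i)^2}{n-k-1}}$$
where, k = the number of independent variables.
Here $s_e^2$ is an unbiased estimate of $\sigma^2$, i.e.,
$E(s_e^2) = \sigma^2$.
In the case of Simple Regression:
$$s_e = \sqrt{\frac{\sum (Y_i - \hat{Y}_i)^2}{n-2}}$$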
Keynote 10.3.2: Example
Keynote 10.3.3: Example
Keynote 10.3.4: Example
Keynote 10.3.1:Example
Testing for significance of Regression Relation
This amounts to testing $H_0: R^2 = 0$ against the alternative $H_1: R^2 > 0$. This is equivalent to testing
$H_0: b_1 = b_2 = \cdots = b_k = 0$ against $H_1$: not all b’s equal to zero. All tests are carried out under the
assumptions that the error terms ($e_i$’s) are mutually
independent and normally distributed with zero mean and
constant variance ($\sigma^2$).
To test for the significance of $R^2$, we use the following
F-statistic:
$$F = \frac{R^2 / k}{(1 - R^2)/(n-k-1)} = \frac{\text{Explained SS}/k}{\text{Unexplained SS}/(n-k-1)}$$
Here, $H_0: R^2 = 0$ vs $H_1: R^2 > 0$.
We reject $H_0$ if $F > F_{k,\,(n-k-1)}(\alpha)$.
An ANOVA presentation can be made for the above
hypothesis testing as given in Table 10.3.2.
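As a small numerical illustration of the F-test, the sketch below computes F from $R^2$, n and k, assuming the form F = (R²/k) / ((1 - R²)/(n - k - 1)), which is equivalent to the ratio of mean squares in the ANOVA presentation. The inputs are hypothetical and the critical value is taken from a standard F-table rather than computed.

```python
def f_statistic(r2, n, k):
    """F-statistic for H0: R^2 = 0 with k regressors and n observations."""
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))

# Hypothetical inputs: R^2 = 0.6 from a simple regression on 5 observations
f_value = f_statistic(r2=0.6, n=5, k=1)

# Tabulated F value with (1, 3) degrees of freedom at the 5% level
F_CRITICAL = 10.13
reject_h0 = f_value > F_CRITICAL
print(f_value, reject_h0)
```

With so few observations, even a moderately high $R^2$ does not lead to rejection of $H_0$, which is why the test is needed alongside $R^2$ itself.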
Testing for significance of Regression Coefficients
If a regression relation is found to be significant, the next
logical question to ask is: which independent variables
are contributing significantly to the regression relation?
This amounts to testing for the significance of the b’s individually.
This is done through appropriate t-tests.
In general, we can test $H_0: b_i = \beta_i$ against
alternatives, where $\beta_i$ is the hypothesized value for the
regression coefficient from past experience or other
sources. The testing procedure is as below:
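Table 10.3.2

| Source of Variation | SS | DF | MSS | F-Ratio |
|---|---|---|---|---|
| Regression | $\sum (\hat{Y}_i - \bar{Y})^2$ = Explained SS (ESS) | k | ESS/k | F = (ESS/k) / (Un SS/(n-k-1)) |
| Error | $\sum (Y_i - \hat{Y}_i)^2$ = Unexplained SS (Un SS) | (n-k-1) | Un SS/(n-k-1) | |
| Total | $\sum (Y_i - \bar{Y})^2$ = TSS | (n-1) | | |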
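$H_0: b_i = \beta_i$ vs $H_1: b_i \ne \beta_i$, or $H_1: b_i > \beta_i$, or $H_1: b_i < \beta_i$
Test statistic is given by:
$$t = \frac{\hat{b}_i - \beta_i}{s_{\hat{b}_i}},$$
where $s_{\hat{b}_i}$ is the standard error of $\hat{b}_i$.
The decision criteria at the $\alpha$ level of significance are given by:
If $H_1: b_i \ne \beta_i$, reject $H_0$ if $|t| > t_{(n-k-1)}(\alpha/2)$
If $H_1: b_i > \beta_i$, reject $H_0$ if $t > t_{(n-k-1)}(\alpha)$
If $H_1: b_i < \beta_i$, reject $H_0$ if $t < -t_{(n-k-1)}(\alpha)$
(In the case of simple regression, k = 1 and the degrees of freedom are n - 2.)
If the $b_i$’s are tested against ‘0’ (i.e., $\beta_i = 0$), then we refer to it
as the test of significance of regression coefficients.
Confidence Interval for $b_i$
The confidence interval for $b_i$ is given by
$$\hat{b}_i \pm t_{(n-k-1)}(\alpha/2)\; s_{\hat{b}_i}$$
Here, $t_{(n-k-1)}(\alpha/2)$ is to be read from the t-table
appropriately.
In the case of Simple Regression, we have:
$$s_{\hat{b}} = \frac{s_e}{\sqrt{\sum (X_i - \bar{X})^2}}$$
and the confidence interval for b is:
$$\hat{b} \pm t_{(n-2)}(\alpha/2)\; s_{\hat{b}}$$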
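For simple regression, the t-test and confidence interval for the slope can be sketched as follows. This assumes the standard error of the slope is s_e divided by the square root of the sum of squared X-deviations, with n - 2 degrees of freedom; the data are hypothetical and the critical value is read from a t-table, as the text suggests.

```python
import math

def slope_t_test(xs, ys, beta_0, t_critical):
    """t-statistic and confidence interval for the slope b in Y = a + bX + e."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b_hat = sxy / sxx
    a_hat = y_bar - b_hat * x_bar
    error_ss = sum((y - a_hat - b_hat * x) ** 2 for x, y in zip(xs, ys))
    s_e = math.sqrt(error_ss / (n - 2))   # standard error of estimate
    s_b = s_e / math.sqrt(sxx)            # standard error of b_hat
    t_stat = (b_hat - beta_0) / s_b
    interval = (b_hat - t_critical * s_b, b_hat + t_critical * s_b)
    return b_hat, s_b, t_stat, interval

# Hypothetical data with n = 5, so the degrees of freedom are n - 2 = 3
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
T_CRITICAL = 3.182  # t-table value for 3 degrees of freedom, alpha/2 = 0.025
b_hat, s_b, t_stat, interval = slope_t_test(xs, ys, beta_0=0.0, t_critical=T_CRITICAL)
reject_h0 = abs(t_stat) > T_CRITICAL  # two-sided test of H0: b = 0
print(b_hat, t_stat, interval, reject_h0)
```

The two-sided test and the confidence interval agree: $H_0: b = 0$ is rejected exactly when the interval excludes zero.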
Keynote: Example 10.3.5
Prediction using Regression equation
An important purpose of estimating regression equation is
to predict the value of dependent variable for given values
of independent variables. It is possible to give the
confidence interval for such prediction.
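For the simple-regression case, such intervals can be sketched as below. The sketch assumes the standard textbook forms: the half-width for the mean of Y at X = x0 uses the factor sqrt(1/n + (x0 - x̄)²/Σ(xi - x̄)²), and the half-width for an individual value adds 1 under the square root. The data are hypothetical and the t value is taken from a t-table.

```python
import math

def prediction_intervals(xs, ys, x0, t_critical):
    """Return (y0_hat, interval for the mean of Y, interval for an individual Y) at X = x0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b_hat = sxy / sxx
    a_hat = y_bar - b_hat * x_bar
    error_ss = sum((y - a_hat - b_hat * x) ** 2 for x, y in zip(xs, ys))
    s_e = math.sqrt(error_ss / (n - 2))
    y0_hat = a_hat + b_hat * x0
    core = 1.0 / n + (x0 - x_bar) ** 2 / sxx
    half_mean = t_critical * s_e * math.sqrt(core)
    half_individual = t_critical * s_e * math.sqrt(1.0 + core)
    return (y0_hat,
            (y0_hat - half_mean, y0_hat + half_mean),
            (y0_hat - half_individual, y0_hat + half_individual))

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]
# 3.182 is the t-table value for n - 2 = 3 degrees of freedom, alpha/2 = 0.025
y0_hat, mean_interval, individual_interval = prediction_intervals(xs, ys, x0=3.0, t_critical=3.182)
print(y0_hat, mean_interval, individual_interval)
```

The interval for an individual observation is always wider than the interval for the mean, and both widen as x0 moves away from the mean of the X values.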
The case of Simple Regression
Confidence Interval for predicting the mean value of Y given $X = X_0$
(see Figure 10.3.1):
$$\hat{Y}_0 \pm t_{(n-2)}(\alpha/2)\; s_e \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$$
Confidence Interval for predicting an individual value of Y
given $X = X_0$ (see Figure 10.3.2):
$$\hat{Y}_0 \pm t_{(n-2)}(\alpha/2)\; s_e \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$$
Figure 10.3.1: Prediction of Mean
Figure 10.3.2: Prediction of Individual Observation
Some Important Considerations
If a regression line is not significant (i.e., $H_0: R^2 = 0$
accepted), then the best prediction of Y is $\bar{Y}$, for
any values of the X’s.
While predicting the Y-value, the X-values should be
within the maximum and minimum observations of the
respective X’s, or close to them. In other words, we
consider the regression relation valid only within the observed range of the X-values.
Regression relation obtained based on data from
one population, cannot be extended over another
population for prediction. However, one can test if
the regression relation obtained for Population I can
be taken as statistically equal to the relation
obtained for Population II through suitable statistical
tests. This amounts to testing for equality of
corresponding coefficients (all together) from the two
relations.
Example 10.3.6
Let us assume that the demand for wheat in the Northern
states is
D = a + b P
and in the Southern states is
D* = a* + b* P.
The query is: Is the consumption pattern of wheat the
same for the Northern and Southern states?
This amounts to testing $H_0$: a = a* and b = b* (together)
against the alternative $H_1$: not so. We can test for the
equality of individual coefficients also.
Lagged Regression: When studying the impact of an expenditure (E) on the sales (S)
of a product, it is reasonable to expect a time lag
before the impact is seen. We can identify many other
similar situations with lagged impact. In such cases,
we incorporate the lag in the regression model. If we assume a lag
of three periods, then we can regress sales ($S_t$) on expenditure ($E_t$) as:
$$S_t = a + b\,E_{t-3}$$
The idea can be carried forward to multiple regression
also, with different lags for different explanatory
variables. However, the onus of identifying the lags for
different explanatory variables is on us, based on the
understanding of the phenomena being studied.
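Preparing data for a lagged regression mostly means aligning each response value with the explanatory value from a fixed number of periods earlier. The helper below is a minimal sketch; the function name, variable names and series are hypothetical.

```python
def lagged_pairs(explanatory, response, lag):
    """Pair response[t] with explanatory[t - lag]; the first `lag` periods are dropped."""
    return [(explanatory[t - lag], response[t]) for t in range(lag, len(response))]

expenditure = [10, 12, 11, 15, 14, 18, 20]  # hypothetical E_t for periods 0..6
sales = [50, 52, 51, 60, 64, 62, 70]        # hypothetical S_t for periods 0..6
pairs = lagged_pairs(expenditure, sales, lag=3)
print(pairs)
```

The resulting (E, S) pairs can then be fitted by ordinary least squares in the usual way. Note that a lag of three periods costs three observations, which matters for short series.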
Transformations to obtain linearity: Sometimes
attempting a simple linear relation between the
dependent and independent variables may not
produce a good relation (i.e., $R^2$ not very high). In
such situations, we try some transformation on the Y and
X variables and attempt fitting a linear relation in terms
of the transformed variables. Popular transformations are the
log-transformation, semi-log transformation, square-root
transformation, reciprocal transformation, etc.
Double log transformation will appear as given below:
$$\log Y = A + B \log X$$
In this case, B can be interpreted as the elasticity of Y
with respect to X.
Semi-log transformation will appear as given below:
$$\log Y = A + B X$$
In this case, B can be interpreted as the growth rate
in Y, if X represents time.
Keynote: Example 10.3.6
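A double-log fit is just a simple regression on the logged variables. The sketch below uses hypothetical data generated from Y = 3·X^1.5, so the fitted slope should come out as the elasticity 1.5; natural logarithms are assumed, and the function name is illustrative.

```python
import math

def fit_line(us, vs):
    """Least squares intercept and slope for v = A + B u."""
    n = len(us)
    u_bar = sum(us) / n
    v_bar = sum(vs) / n
    slope = (sum((u - u_bar) * (v - v_bar) for u, v in zip(us, vs))
             / sum((u - u_bar) ** 2 for u in us))
    return v_bar - slope * u_bar, slope

# Hypothetical data following Y = 3 * X^1.5 exactly
xs = [1.0, 2.0, 4.0, 8.0]
ys = [3.0 * x ** 1.5 for x in xs]

# Regress log Y on log X: the slope B estimates the elasticity of Y with respect to X
A, B = fit_line([math.log(x) for x in xs], [math.log(y) for y in ys])
print(A, B)
```

For a semi-log fit, only Y would be logged and X (for example, time) would enter untransformed.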
Section 4
Some Financial Applications
Risk of a Portfolio
In the real world, investors may hold various securities and other assets. Any such collection of assets is called a portfolio. For example, if you have shares of the Tata Iron and Steel Company Ltd. and Reliance Industries Ltd., you have a portfolio consisting of two shares.
The return on a portfolio is equal to the weighted average of the returns on the assets in the portfolio. The weights used are the proportions of the portfolio’s total value held in the individual assets.
The Standard Deviation of the returns on a security measures the risk of investing in the security. In the same way, the Standard Deviation of the returns on a portfolio measures the risk of investing in the portfolio.
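The return and risk of a two-share portfolio can be sketched as below. The return is the value-weighted average described above; for the risk, the sketch assumes the standard two-asset variance formula w1²σ1² + w2²σ2² + 2·w1·w2·ρ·σ1·σ2, which is not stated in the text. All figures are hypothetical.

```python
import math

def portfolio_return(weights, returns):
    """Weighted average of the individual asset returns."""
    return sum(w * r for w, r in zip(weights, returns))

def two_asset_portfolio_sd(w1, sd1, w2, sd2, correlation):
    """Standard deviation of a two-asset portfolio (standard formula, assumed here)."""
    variance = (w1 * sd1) ** 2 + (w2 * sd2) ** 2 + 2 * w1 * w2 * correlation * sd1 * sd2
    return math.sqrt(variance)

# Hypothetical holdings: value invested in each of the two shares
values = [60000.0, 40000.0]
total = sum(values)
weights = [v / total for v in values]  # proportions of portfolio value

expected_return = portfolio_return(weights, [0.12, 0.18])
risk = two_asset_portfolio_sd(weights[0], 0.20, weights[1], 0.30, correlation=0.25)
print(expected_return, risk)
```

With a correlation below 1, the portfolio standard deviation is lower than the weighted average of the individual standard deviations, which is the effect of diversification.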
Keynote: Example 10.4.1
Characteristic Line
Financial analysts often talk of the beta of a share. In this section, we will describe what the beta signifies and the method commonly used to estimate it.
Beta of a share is a number that is used to describe how sensitive the share is to the movements in the market as a whole.
The market as a whole is represented by a market index such as the Bombay Stock Exchange (BSE) Sensitive Index (Sensex), the BSE National Index or the Economic Times Index. Suppose that, during a period under study, the Sensex has doubled from 900 to 1800; an investor would expect the prices of the shares held by him also to have doubled. Every investor would like the shares held by him to do at least as well as the market, if not better. Whether the individual shares do as well as the market or remain unaffected by the market trends depends upon the sensitivity of stock prices to the market movements. This sensitivity of stock prices to the market movements is measured by beta. If the stock has trebled while the market index has doubled, the stock is considered to be highly sensitive and its beta would be greater than one. If the stock’s performance exactly matches that of the market index, the beta of the stock would be equal to one. If the stock’s appreciation is only 75% compared to the 100% appreciation in the market index, then the stock is less sensitive and the beta would be less than one.
Depending upon the beta, shares can also be classified as aggressive and defensive. If the beta of a share is greater than one, the share is classified as an aggressive security. Performance of an aggressive security is directly proportional to the performance of the market. In a booming market, an aggressive security will perform much better than the market, while in a bearish market, the performance of an aggressive security would drop at a rate faster than the market. If the beta of a share is less than one, the share is classified as a defensive security. Performance of a defensive security is also directly proportional to the performance of the market. When the market moves up, the holders of defensive securities would reap less than proportionate benefits. However, when the market moves down, the decline in the prices of defensive securities would also be less than the market movement.
Beta is also used to measure the systematic risk of a security. The total risk of a security can be divided into two broad components. The first is the risk specific to the security, also called diversifiable risk or non-systematic risk. An investor holding a well-diversified portfolio can completely eliminate the non-systematic risk. The systematic risk, which is the second component of the total risk, is the risk associated with the general market movement and it cannot be eliminated through diversification. All securities do not have the same degree of systematic risk because the impact of economy-wide factors could differ from company to company.
Modern Portfolio Theory contends that the required rate of return of a security (which in turn determines the price of the security) depends only on the systematic risk of the security, or its beta. The total risk is irrelevant because, through diversification, the investor can eliminate the non-systematic risk and hence the market would not consider the non-systematic risk in the pricing process.
The foregoing discussion brings out the importance of the study of the beta of a security. How is the beta estimated? The returns from a given security are regressed on the returns from the market index. The regression line, or the line of best fit for the observations, is called the characteristic line. The slope of the line is the beta of the security. While regressing, the return on the market index is taken as the independent variable and the return on the security is taken as the dependent variable.
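Estimating beta is therefore a simple regression of security returns on market returns. The sketch below uses hypothetical periodic returns constructed so that the security moves 1.5 times as much as the market; the function name and the data are illustrative only.

```python
def characteristic_line(market_returns, security_returns):
    """Regress security returns on market returns; returns (intercept, beta)."""
    n = len(market_returns)
    m_bar = sum(market_returns) / n
    s_bar = sum(security_returns) / n
    cross = sum((m - m_bar) * (s - s_bar)
                for m, s in zip(market_returns, security_returns))
    market_ss = sum((m - m_bar) ** 2 for m in market_returns)
    beta = cross / market_ss              # slope of the characteristic line
    intercept = s_bar - beta * m_bar
    return intercept, beta

# Hypothetical returns: the security follows 0.01 + 1.5 * market return exactly
market = [0.02, -0.01, 0.03, 0.00, 0.01]
security = [0.01 + 1.5 * m for m in market]
intercept, beta = characteristic_line(market, security)
classification = "aggressive" if beta > 1 else "defensive"
print(intercept, beta, classification)
```

In practice the observed returns scatter around the line, and the significance tests discussed earlier in the chapter apply to the estimated beta as to any regression slope.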
Cost-Volume-Proﬁt Analysis
The Cost-Volume-Profit (CVP) analysis provides answers to vital questions such as:
• At what sales volume would the firm break even?
• How sensitive is the profit to variations in output?
• How sensitive is the profit to variations in selling prices?
• What should be the sales level in quantity terms for the firm to earn the target level of profits?
Keynote: 10.4.2
One basic assumption of CVP analysis is that all costs can be segregated into fixed and variable, and that costs of a semi-fixed or semi-variable nature can be segregated into their fixed and variable components.
The method of simple linear regression is commonly used to segregate the fixed and variable components of semi-fixed or semi-variable costs. The illustration given below explains the application of the regression technique in CVP analysis.
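As a rough sketch of this use of regression, total cost can be regressed on output volume: the intercept estimates the fixed component and the slope estimates the variable cost per unit. The cost figures and the selling price below are hypothetical, and the break-even formula (fixed cost divided by contribution per unit) is the standard one rather than something given in the text.

```python
def segregate_costs(volumes, total_costs):
    """Regress total cost on volume; returns (fixed_cost, variable_cost_per_unit)."""
    n = len(volumes)
    v_bar = sum(volumes) / n
    c_bar = sum(total_costs) / n
    slope = (sum((v - v_bar) * (c - c_bar) for v, c in zip(volumes, total_costs))
             / sum((v - v_bar) ** 2 for v in volumes))
    return c_bar - slope * v_bar, slope

# Hypothetical observations of output (units) and total cost (Rs.)
volumes = [100, 200, 300, 400, 500]
total_costs = [6200, 7400, 8600, 9800, 11000]
fixed_cost, variable_cost = segregate_costs(volumes, total_costs)

# Break-even volume at an assumed selling price of Rs. 20 per unit
SELLING_PRICE = 20.0
break_even_units = fixed_cost / (SELLING_PRICE - variable_cost)
print(fixed_cost, variable_cost, break_even_units)
```

The estimates are only as reliable as the regression fit, so the caution about predicting within the observed range of volumes applies here as well.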
Keynote: Example 10.4.3
SECTION 5
Case Study: Boosting Sales of Double Kola
This case study was written by Dr. Sunil Bharadwaj, Professor (Department of Decision Sciences), IBS, Hyderabad. It is intended to
be used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situation.
The case was written from generalized experiences.
The world famous ‘Cola war’ has been growing rapidly in the Indian
battleground for over a decade. The two cola giants, who have
been waging marketing war since the time they stepped into
the country, are trying to take full advantage of Indian weather
conditions and fast food habits of the Indians. The main charac-
teristic of the war is to use innovative, promotional and advertis-
ing campaigns and to strengthen distribution networks. Deliver-
ing value for money, delivering advertising around houses and
conducting market coups have been the standard operating pro-
cedure in the Coke versus Pepsi saga for decades worldwide.
Though Coke has turned out to be the leader in the market,
Pepsi is always trying to snatch the No.1 position in the market-
place. In 1993, Coke started with a huge 69% share of the mar-
ket, according to the data from the Indian Market Research Bu-
reau. It garnered a huge share by buying out Parle’s popular
brands – Limca, Thums Up and Gold Spot. However, it could
not leverage on such a large portfolio of products and soon the
collective strength seemed to fade away. As a result, Coke’s
market share dropped by more than 10% by the end of 2000,
while Pepsi’s market share went up from 23% to 43% in the
same period.1 According to AC Neilson, a leading market re-
search company, Coca-Cola India’s consolidated share of car-
bonated soft drinks was 57.8% in 2008, whereas Pepsico was
at a distant second with 35.6% share.2 However, PepsiCo is de-
termined to increase its market share by as much as it can.
The Issue
Double Koala, a renowned cola-maker, is facing a problem of
slow rate of growth in its sales in South India. It is lagging be-
hind the industry growth. Vijay Botliwala (Botliwala), the CEO of
the company, called for a meeting of marketing officials for a
better understanding of the variability associated with sales. Af-
ter greeting everybody, Botliwala threw the discussion open to
ing, started the discussion by highlighting the role of advertise-
vital to the business they are in. She believes that as the over-
all market size grows, the number of users of the product in-
creases, and hence the importance of attracting and converting
the users into customers of the company. Under such a situa-
tion, she argued, increasing emphasis must be placed on adver-
tising and informing potential customers about the availability of
the products. In the present era of Information and Communica-
tion Technology (ICT), there are advertising methods, which are
not only cost-effective but also capable in reaching out to a
large numbers of consumers. Advertising, for an industry like
cold drinks, can secure leads for salesmen and middlemen.
dealers’ as well as the consumers’ confidence at large in the
company and its products. She believes in the motto: ‘Advertis-
ing is to stimulate market demand’. She stresses on more budg-
campaigns. The company has tried celebrity endorsement at
various points of time. The celebrities were from the arena of
sports and movies. Kochar feels that in the past this has contrib-
uted to sales growth.
On the other hand, Kaushik Agarwal (Agarwal), head of sales
tisement can stimulate demand, the sales-force of the com-
pany should also be ready to walk that extra mile to encash
the opportunity. His analysis revealed that the incentives given
to salespersons were not adequate to motivate them. He then
highlights certain incentive schemes, which are prevalent in
rival companies. He cited instances where salespersons from
rival companies were doing better than their own salesper-
sons and were being suitably rewarded. He observed that the
competitors’ sales teams have been aggressive in tying up
with local restaurants and fast food joints, whereas Double Ko-
ala’s salespersons were not taking up any such initiatives. He
felt that the main problem of slow growth is the old incentive
system which needs immediate upward revision. At this junc-
ture, Zahir Khan (Khan), deputy to Agarwal, supplemented
that given the low margin their company offered to the distribu-
tors, it is unlikely to attract new distributors, they preferring a
competitor instead. He said, “In our company, in the first place
we give lesser margin as compared to our competitors and we
particularly have no scheme for rewarding the best performing
distributors.” Khan believes that a little higher margin to the
distributors will be fruitful in attracting more distributors, par-
ticularly in new areas. This will lead to better sales for the com-
pany.
At this juncture, Kapil Singhvi (Singhvi), the finance manager,
brought up the issue of pricing. In his opinion, if the price is re-
duced, it may lead to an increase in demand. However, he is
not sure how far it will help in boosting the sales. At this point,
Botliwala took charge of the discussion. He ruled out consider-
ing a decrease in price, as it may instill a price war, which will
result in erosion of profits for the players in the industry. Botli-
wala also felt that they need to discuss the matter with hard
facts and figures, rather than on the basis of intuitions. He
asked Manoj Poddar (Poddar), the young market research
analyst at Double Koala, to come up with some key quantita-
tive information at the next meeting scheduled for Monday.
With only 3 days left for the meeting, Poddar worked hard on
the weekend to gather relevant information. In the next meeting,
Poddar presented data on quarterly sales, number of distributors,
distributors’ margins, company’s sales force strength,
nearest competitor’s sales force strength, total incentives
ment. While there were quick responses, comments and con-
clusions made based on the data presented by Poddar (Ex-
hibit I), Botliwala was aghast seeing the haphazard way in
which the conclusions were drawn. He felt that in these days
of mathematical modeling done with computer and software
support, there should be a more objective way of drawing con-
clusions from data. He also raised the issue of assessing the
impact of celebrity endorsement on the sales of their cola.
Footnotes
1. Rekhi Shefali, “COKE VS PEPSI – Cola Quarrels”,
http://www.indiatoday.com/itoday/04051998/biz2.html, May
4th 1998
2. Bhushan Ratna, “Coca-Cola Thums down for PepsiCo”,
http://economictimes.indiatimes.com/News/Coca-Cola_Th,
September 30th 2008
Exhibit I: Sales Data

| Year | Quarter | Sales in INR | Number of Distributors | Distributor's margin (%) | Competitor's sales force | Company's sales force | Total incentive | (not labelled) |
|---|---|---|---|---|---|---|---|---|
| 2002 | 1 | 20 | 155 | 5 | 1300 | 1000 | 0.5 | 1.2 |
| 2002 | 2 | 22 | 160 | 5 | 1400 | 1100 | 0.6 | 1.4 |
| 2002 | 3 | 24 | 160 | 5.5 | 1420 | 1300 | 0.6 | 1.5 |
| 2002 | 4 | 26 | 165 | 5.5 | 1425 | 1300 | 0.6 | 1.4 |
| 2003 | 1 | 26 | 165 | 6 | 1425 | 1326 | 0.7 | 1.4 |
| 2003 | 2 | 28 | 165 | 6 | 1400 | 1410 | 0.7 | 1.5 |
| 2003 | 3 | 28 | 165 | 6 | 1420 | 1420 | 0.7 | 1.6 |
| 2003 | 4 | 32 | 170 | 6.5 | 1425 | 1450 | 1 | 1.6 |
| 2004 | 1 | 30 | 170 | 6.5 | 1460 | 1460 | 1 | 1.5 |
| 2004 | 2 | 34 | 175 | 7 | 1460 | 1490 | 1 | 1.8 |
| 2004 | 3 | 34 | 175 | 7 | 1450 | 1510 | 1 | 1.8 |
| 2004 | 4 | 32 | 175 | 7 | 1500 | 1610 | 1 | 1.7 |
| 2005 | 1 | 36 | 180 | 7.5 | 1510 | 1650 | 1.5 | 1.8 |
| 2005 | 2 | 38 | 180 | 7.5 | 1500 | 1700 | 1.5 | 1.9 |
| 2005 | 3 | 38 | 185 | 8 | 1520 | 1750 | 1.5 | 1.9 |
| 2005 | 4 | 40 | 190 | 8 | 1530 | 1760 | 1.5 | 2.4 |
| 2006 | 1 | 44 | 200 | 8.5 | 1550 | 1790 | 1.8 | 2.1 |
| 2006 | 2 | 46 | 210 | 9.5 | 1560 | 1800 | 1.8 | 2.4 |
| 2006 | 3 | 48 | 200 | 9.5 | 1560 | 1800 | 1.8 | 2.5 |
| 2006 | 4 | 50 | 210 | 10 | 1550 | 1800 | 2 | 2.5 |
| 2007 | 1 | 58 | 220 | 10 | 1580 | 2000 | 2.1 | 2.9 |
| 2007 | 2 | 60 | 220 | 12 | 1600 | 2000 | 2.1 | 3.1 |
| 2007 | 3 | 62 | 230 | 12 | 1610 | 2000 | 2.2 | 3.2 |
| 2007 | 4 | 68 | 230 | 12.5 | 1650 | 2000 | 2.3 | 3.4 |
| 2008 | 1 | 72 | 250 | 12.5 | 1660 | 2400 | 2.9 | 3.6 |
| 2008 | 2 | 78 | 260 | 14 | 1750 | 2500 | 3 | 4.1 |
| 2008 | 3 | 84 | 265 | 15.5 | 1800 | 2500 | 3.5 | 4.2 |
| 2008 | 4 | 102 | 270 | 16 | 1850 | 2500 | 4 | 5.4 |

1 Crore = 10 million
Prepared by author
SECTION 6
Case Study: Planning for Road Safety
This case study was written by Dr. Sunil Bharadwaj, Professor (Department of Decision Sciences), IBS, Hyderabad. It is intended
to be used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situa-
tion. The case was written from generalised experiences.
The Mayor of Ivory city, Cooper Mandela (Mandela), has
called in a meeting with the traffic police officials. The agenda
is to discuss the matter of the increasing number of fatal accidents
in the city in the last few years. After greeting everybody, Man-
dela starts the discussion on the matter. During the discus-
sion, the traffic police officials try to present their perspective
on the recurring accidents.
The Meeting
The excerpts from the meeting are as follows:
As the discussion seemed to end nowhere, Mandela was
caught in a quandary with regard to finding a suitable solution
to the problem. In spite of all the prevailing confusion, he is
sure of one thing i.e., once the reasons for the variability in the
accidents are understood, he will immediately move forward
making necessary changes in the policies to solve the prob-
lem. To begin with, Mandela wants to understand the following
to enable him to tackle the problem:
What are the factors causing accidents in the city?
Which variable describes the variability in the number of
accidents the most?
Which variables significantly describe the variability in the
number of accidents?
Fortunately, at this point, the Statistician of the Police Depart-
ment dished out some statistics relating to the road accidents
in Ivory city (Exhibit I).
Exhibit I: Statistics of the Road Accidents in Ivory City

| Year | Quarter | Number of officials in the field | Number of people visiting bars (in hundreds) | Number of the young | Number of vehicles | Traffic police investments | Prescribed speed limits in the city |
|---|---|---|---|---|---|---|---|
| 2002 | 1 | 20 | 500 | 13000 | 25000 | 1000 | 40 |
| 2002 | 2 | 22 | 500 | 14000 | 25500 | 1100 | 40 |
| 2002 | 3 | 24 | 600 | 14200 | 25600 | 1300 | 40 |
| 2002 | 4 | 26 | 600 | 14250 | 25700 | 1300 | 40 |
| 2003 | 1 | 26 | 600 | 13600 | 25750 | 1326 | 40 |
| 2003 | 2 | 28 | 650 | 14000 | 25800 | 1410 | 40 |
| 2003 | 3 | 28 | 600 | 14200 | 25900 | 1420 | 40 |
| 2003 | 4 | 32 | 650 | 14250 | 26000 | 1450 | 40 |
| 2004 | 1 | 30 | 650 | 14500 | 26500 | 1460 | 40 |
| 2004 | 2 | 34 | 700 | 14600 | 26700 | 1490 | 40 |
| 2004 | 3 | 34 | 700 | 14600 | 26800 | 1510 | 40 |
| 2004 | 4 | 32 | 700 | 14500 | 26850 | 1610 | 40 |
| 2005 | 1 | 36 | 900 | 15000 | 26900 | 1650 | 60 |
| 2005 | 2 | 38 | 900 | 15100 | 27000 | 1700 | 60 |
| 2005 | 3 | 38 | 900 | 15000 | 27500 | 1750 | 60 |
| 2005 | 4 | 40 | 900 | 15200 | 27900 | 1760 | 60 |
| 2006 | 1 | 44 | 1000 | 15300 | 29950 | 1790 | 60 |
| 2006 | 2 | 46 | 1000 | 15500 | 30000 | 1800 | 60 |
| 2006 | 3 | 48 | 1000 | 15600 | 30500 | 1800 | 60 |
| 2006 | 4 | 50 | 1000 | 15500 | 31000 | 1800 | 60 |
| 2007 | 1 | 58 | 1200 | 15800 | 32000 | 2000 | 60 |
| 2007 | 2 | 60 | 1200 | 16000 | 33000 | 2000 | 60 |
| 2007 | 3 | 62 | 1200 | 16100 | 35000 | 2000 | 60 |
| 2007 | 4 | 68 | 1200 | 16500 | 36500 | 2000 | 60 |
| 2008 | 1 | 72 | 1500 | 16600 | 38000 | 3400 | 80 |
| 2008 | 2 | 78 | 1500 | 17500 | 40000 | 3600 | 80 |
| 2008 | 3 | 84 | 1500 | 18000 | 45000 | 4000 | 80 |
| 2008 | 4 | 102 | 1800 | 18500 | 48000 | 4400 | 80 |

Prepared by author
SECTION 7
Case Study: Measuring Growth and Responsiveness
This case study was written by L. Shridharan, Professor, IBS Hyderabad. It is intended to be used as the basis for class discus-
sion rather than to illustrate either effective or ineffective handling of a management situation. The case was compiled from gen-
eralized experience.
Suziland is a prosperous country, belonging to the league of
‘developed nations’. With a population of about 270 million,
the country’s growth has been keeping pace with the popula-
tion growth. In early 2011 the National Planning Committee,
headed by Dr. Peter Mugabe (a well known economist), was
engaged in drawing up development plan for the next four
years (2012- 2016). Being a free market economy, the coun-
try believed in indirect management of economic instruments
than direct interventions. Dr. Mugabe firmly believed that the
prosperity of the nation must reflect in the growth of ‘personal
consumption expenditure’ and its components, such as ex-
penditures on durables, non-durables and services. As a prel-
ude to plan for peoples’ prosperity, Dr. Mugabe emphasized
the need for assessing the existing growth pattern and the re-
sponsiveness of expenditures on different heads to a change
in the total personal consumption expenditure. He called Ms.
Julie Obama, the Research Officer with the Committee, and
asked her to provide the information within two days. Ms.
Obama got on the job immediately. By contacting the Depart-
ment of Statistics within the Government, she could get quar-
terly data on personal expenditures and its components for
the past six years in billions of Suziland dollar (SZ\$), the cur-
rency of Suziland (Exhibit I). With data at hand she now
needed to answer Dr. Mugabe’s queries on growth in expendi-
ture pattern and responsiveness of individual components to
a change in the overall personal expenditure.
How should Ms. Obama proceed?
Exhibit I: Total Personal Consumption Expenditure and Its Components (in billions of SZ\$)

| Year | Quarter | Time | Expenditure on services | Expenditure on durables | Expenditure on non-durables | Personal consumption expenditure |
|---|---|---|---|---|---|---|
| 2005 | 1 | 1 | 2274.1 | 529.7 | 1169.2 | 3973.0 |
| 2005 | 2 | 2 | 2284.0 | 545.7 | 1178.2 | 4008.0 |
| 2005 | 3 | 3 | 2306.0 | 556.7 | 1186.1 | 4049.4 |
| 2005 | 4 | 4 | 2319.8 | 569.7 | 1190.5 | 4080.0 |
| 2006 | 1 | 5 | 2335.1 | 578.7 | 1205.0 | 4118.9 |
| 2006 | 2 | 6 | 2354.1 | 578.7 | 1205.0 | 4118.9 |
| 2006 | 3 | 7 | 2365.7 | 590.3 | 1217.9 | 4174.0 |
| 2006 | 4 | 8 | 2377.0 | 605.9 | 1226.1 | 4209.0 |
| 2007 | 1 | 9 | 2390.5 | 604.5 | 1233.0 | 4227.9 |
| 2007 | 2 | 10 | 2413.2 | 613.2 | 1237.8 | 4264.1 |
| 2007 | 3 | 11 | 2427.6 | 625.6 | 1240.1 | 4293.2 |
| 2007 | 4 | 12 | 2439.3 | 633.1 | 1246.3 | 4318.6 |
| 2008 | 1 | 13 | 2463.1 | 642.1 | 1253.2 | 4358.4 |
| 2008 | 2 | 14 | 2481.1 | 661.5 | 1267.9 | 4411.1 |
| 2008 | 3 | 15 | 2499.9 | 658.4 | 1271.7 | 4430.0 |
| 2008 | 4 | 16 | 2512.6 | 669.9 | 1280.8 | 4463.3 |
| 2009 | 1 | 17 | 2531.6 | 689.7 | 1292.0 | 4513.2 |
| 2009 | 2 | 18 | 2551.5 | 689.1 | 1291.3 | 4529.9 |
| 2009 | 3 | 19 | 2581.5 | 714.2 | 1307.5 | 4602.9 |
| 2009 | 4 | 20 | 2608.5 | 681.8 | 1306.3 | 4596.6 |
| 2010 | 1 | 21 | 2631.2 | 746.5 | 1329.8 | 4707.5 |
| 2010 | 2 | 22 | 2666.1 | 766.5 | 1347.1 | 4779.7 |
| 2010 | 3 | 23 | 2701.5 | 771.0 | 1354.2 | 4826.7 |
| 2010 | 4 | 24 | 2742.6 | 701.1 | 1453.6 | 4897.3 |

Prepared by author
CHAPTER 11
Time Series Analysis & Exponential Smoothing
In this chapter we will discuss:
Components of a Time Series
• Secular Trend
• Cyclical Variation
• Seasonal Variation
• Irregular Variation
The Multiplicative Model
Exponential Smoothing
Case Study: Predicting Sales of a Company
Case Study: The Electric Fan Industry
Section 1
Components of a Time Series
A sequence of values of a variable, which change with the
course of time constitutes a Time Series. The time aspect
of such variables plays a very important role as it affects
the variable to a large extent. The analysis of time series
helps in forecasting or projecting the future value of the
variable.
The primary components of Time Series are:
Secular Trend
Cyclical Variation
Seasonal Variation
Irregular or Random Variation
Secular Trend
Secular trend is the general tendency of the data to grow,
decline or to remain constant in values over a period of
time. It relates to the movement of data over a fairly long
period of time. There are two types of secular trends:
Linear and Non-Linear.
Linear Secular trend is a straight line trend. When the
data relating to a series is plotted against time, if most of
the observations cluster around a straight line, it is a
situation of linear trend. It can be upward sloping,
downward sloping or horizontal to the time axis.
Video 11.1.1: Time Series Analysis
Non-Linear Secular trend is a trend which does not give
rise to a straight line when the time series is plotted
against time. It takes a concave, convex or curvilinear form
with ups and downs. One of the widely used methods to fit a
secular trend and estimate the model parameters is the Least
Squares Method discussed earlier in the regression chapter.
Commonly, we use the following models for trend fitting:

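where, t = the time period
$Y_t$ = the data at time t
The least squares approach is used to fit the above trend curves.
In the case of the linear trend $Y_t = a + bt$, the parameters are estimated as:
$$\hat{b} = \frac{\sum (t - \bar{t})(Y_t - \bar{Y})}{\sum (t - \bar{t})^2}, \qquad \hat{a} = \bar{Y} - \hat{b}\,\bar{t}$$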
Video 11.1.2:Trend Analysis
Keynote 11.1.1: Time series analysis
Figure 11.1.1: Components of a Time Series
For manual calculations, see the link for simpler calcula-
tions.
Example 11.1.1: (Refer Keynote 11.1.2)
Sometimes, we also include dummy variables while defining
the linear trend. Suppose we have quarterly data on sales
of a product for a few years. We can model this situation
with the following model:
$$Y_t = a + bt + c_1 Q_1 + c_2 Q_2 + c_3 Q_3,$$
where, t = time (expressed in quarters)
$Y_t$ = the data value at time t
$Q_1$ = 1 if it is quarter 1, = 0 otherwise
$Q_2$ = 1 if it is quarter 2, = 0 otherwise
$Q_3$ = 1 if it is quarter 3, = 0 otherwise
No dummy is needed for quarter 4, which serves as the base quarter against which $c_1$, $c_2$ and $c_3$ are measured.
Here too, we can estimate all the regression coefficients
through the least squares method. This approach takes care
of linear trend and seasonality together.
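Setting up the explanatory variables for this model is mostly bookkeeping. The sketch below builds one row [t, Q1, Q2, Q3] per quarter, assuming the series starts in quarter 1 unless stated otherwise; the function name and its parameters are hypothetical.

```python
def trend_with_quarter_dummies(n_periods, first_quarter=1):
    """Build rows [t, Q1, Q2, Q3] for t = 1..n_periods; quarter 4 is the base quarter."""
    rows = []
    for t in range(1, n_periods + 1):
        quarter = (first_quarter - 1 + t - 1) % 4 + 1
        dummies = [1 if quarter == q else 0 for q in (1, 2, 3)]
        rows.append([t] + dummies)
    return rows

rows = trend_with_quarter_dummies(8)
for row in rows:
    print(row)
```

These rows, together with the observed values $Y_t$, can be passed to any multiple regression routine to estimate a, b, $c_1$, $c_2$ and $c_3$.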
Cyclical Variation
Cyclical variation is the gradual fluctuation in a time series
taking place over a long time period (years). Business cycles
present a common example of cyclical fluctuation,
with boom, slump, recession and recovery phases.
Most time series relating to prices, investment, income,
wages, production, etc., exhibit this type of cycle.
Keynote 11.1.2: Example
Residual Method
The Residual Method is the common method used for calculating
cyclical variations. The ratio of the actual values to
the corresponding trend values is used as indicative of
cyclical fluctuation.
$$\text{Cyclical Variation} = \frac{Y}{\hat{Y}}$$
where, $Y$ = Actual values
$\hat{Y}$ = Estimated trend values
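The ratio itself is a one-line computation once the trend values are available. The figures below are hypothetical; in practice the trend values would come from a fitted trend line.

```python
def cyclical_ratios(actuals, trend_values):
    """Ratio of each actual value to the corresponding estimated trend value."""
    return [y / y_hat for y, y_hat in zip(actuals, trend_values)]

actuals = [105.0, 98.0, 120.0, 126.5]  # hypothetical observed values
trend = [100.0, 105.0, 110.0, 115.0]   # hypothetical fitted trend values
ratios = cyclical_ratios(actuals, trend)
print(ratios)
```

Ratios above 1 indicate periods when the series is running above its trend, and ratios below 1 indicate periods when it is running below.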
Seasonal Variation
Seasonal variation refers to fluctuations that occur regularly
within a year, over seasons. For instance, the sale of refrigerators
would be influenced by the seasons (summer, winter,
autumn or rainy). These are short-term fluctuations which
can change weekly, monthly, quarterly or half yearly. The
main reasons for such variations are natural causes such
as weather or climate and social causes such as habits,
Ratio to Moving Average Method
A widely used technique for calculating the seasonal
variation is the Ratio to Moving Average Method. In general,
the moving average of a time series indicates running averages
for the data taken over a given contiguous period. In
the context of seasonal variation, we take the average over
the number of periods in a year (4 if quarterly data, 12 if
monthly data). Each time, the average is recorded at the centre of
the period. If the number of periods is odd, then there is a
unique centre. If it is even, then we centre the two middle-most
averages by taking their average, so as to represent it
against a particular period. It should be easy to see that these
moving averages smooth out the seasonal effect.
Consequently, the ratio of the actual value to the corresponding
moving average value would be indicative of the seasonal impact.
Using this logic, we develop a seasonal index, illustrated
in the example (Refer keynote).
Keynote 11.1.3: Example
Keynote 11.1.4: Example
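The ratio to moving average logic can be sketched in a few lines. The code below is an illustrative sketch, not the keynote example: the function names and the quarterly data are hypothetical, and the final rescaling of the indices so that they average to 1 is a common convention assumed here.

```python
def centered_moving_average(y, period):
    """Moving average recorded at the centre of each window.

    Returns a list aligned with y; positions without a full window are None.
    """
    n = len(y)
    cma = [None] * n
    half = period // 2
    for i in range(half, n - half):
        if period % 2 == 1:
            # Odd period: the window has a unique centre
            cma[i] = sum(y[i - half:i + half + 1]) / period
        else:
            # Even period: average the two middle-most moving averages
            first = sum(y[i - half:i + half]) / period
            second = sum(y[i - half + 1:i + half + 1]) / period
            cma[i] = (first + second) / 2
    return cma


def seasonal_indices(y, period):
    """Average ratio of actual value to moving average, per season.

    Needs enough data for every season to have at least one ratio
    (at least two full years).
    """
    cma = centered_moving_average(y, period)
    ratios = [[] for _ in range(period)]
    for i, (actual, average) in enumerate(zip(y, cma)):
        if average is not None:
            ratios[i % period].append(actual / average)
    raw = [sum(r) / len(r) for r in ratios]
    scale = period / sum(raw)  # rescale so the indices average to 1
    return [value * scale for value in raw]


# Hypothetical quarterly sales for three years with a fixed seasonal pattern
quarterly_sales = [80, 100, 120, 100] * 3
indices = seasonal_indices(quarterly_sales, period=4)
print([round(x, 3) for x in indices])
```

An index above 1 marks a season that typically runs above the moving average, and an index below 1 a season that runs below it.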
Irregular Variation
Irregular variations follow an indistinct and unequal
pattern. They do not repeat in any specific pattern. They
are also called erratic, accidental or episodic variations.
These variations are caused by accidental and random
factors like earthquakes, famines, floods, wars, strikes,
lockouts, epidemics, etc. They include variations which
are not attributable to secular, seasonal or cyclical
variations. There are no models to find the irregular component,
as such variations occur unexpectedly and inconsistently, though
some methods are used to isolate them.
A Multiplicative Model
A time series can be expressed as an additive or a multiplicative
model.
In practice, the multiplicative model is popularly used. The
multiplicative model is expressed as:
$$Y_t = T_t \times C_t \times S_t \times I_t$$
where $Y_t$ = Actual value of the time series at time t,
$T_t$ = Trend value of the time series at time t,
$C_t$ = Cyclical Index at time t,
$S_t$ = Seasonal Index at time t and
$I_t$ = Irregularity ratio at time t.
As stated at the beginning, the purpose of studying a time
series is to make forecasts for the near future. Using the
multiplicative model, we can forecast taking into account the trend,
cyclical and seasonal indices. We studied earlier how
to quantify each of these components. We presume the
irregularity ratio to be unity on average. Thus, a forecast
based on the multiplicative model would be more reliable
than one based on trend alone. However, we should keep in
mind that, in as far as ‘time’ is used as an “overall” explanatory
variable for the behavior of the time series, such forecasts
should be made only for the near future, i.e., for the
short-term.
Keynote 11.1.5: Example
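A forecast from the multiplicative model simply multiplies the components together, with the irregularity ratio taken as 1. The sketch below assumes a linear trend $T_t = a + bt$; the function name and all the numbers are hypothetical.

```python
def multiplicative_forecast(a, b, t, seasonal_index, cyclical_index=1.0):
    """Forecast Y_t = T_t x C_t x S_t, taking the irregularity ratio as 1."""
    trend = a + b * t  # assumed linear trend
    return trend * cyclical_index * seasonal_index


# Hypothetical inputs: trend 100 + 2t, forecasting quarter t = 13,
# whose seasonal index is 1.10, with no cyclical effect assumed
forecast = multiplicative_forecast(a=100.0, b=2.0, t=13, seasonal_index=1.10)
print(round(forecast, 2))
```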
References:
www.clt.astate.edu/crbrown/multiplicativemodel.ppt
Keynote 11.1.6: Example
Section 2
Exponential Smoothing
Exponential smoothing has become very popular as a
forecasting method for a wide variety of time series data.
Historically, the method was independently developed by
Brown and Holt. Brown worked for the US Navy during
World War II, where his assignment was to design a
tracking system for fire-control information to compute the
location of submarines. Later, he applied this technique to
the forecasting of demand for spare parts (an inventory
control problem). Since then, various types of exponential
smoothing models have evolved. Generally, exponential
smoothing techniques find application in financial and economic
time series, though they can be used with any discrete
set of repeated observations, as done by Brown earlier.
Moving Average
We discussed the moving average earlier in the context
of a time series. Suppose a time series is more or less
hovering around a constant value, but for some random
errors. Then we can write:
i.e., the average of the observed time series.
Thus moving average gives equal weight to each observa-
tion and can be said to be an appropriate smoothing tech-
nique in the case of a “constant” time series, which is mod-
eled as above.
Single Exponential Smoothing
When the value of the parameter in the model is
slowly changing over time, giving equal weight to each observation
may not be appropriate. Instead, it may be preferable
to attach greater weight to the recent past than to the remote past
in a graded manner. The simple exponential smoothing method achieves this
through a smoothing constant ($\alpha$). The model can be written
as:
Being a recursive relation, this can be simplified as:
This implies that each smoothed value is the weighted average
of the previous observations, where the weights decrease
exponentially.
Refer keynote for example (example 11.2.1).
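A minimal sketch of single exponential smoothing follows. It assumes the common recursive form $S_t = \alpha Y_t + (1 - \alpha) S_{t-1}$ with the first smoothed value set to the first observation; the symbol $S_t$, the starting rule, the function name and the data are assumptions for illustration and may differ from the notation used in the keynote.

```python
def single_exponential_smoothing(y, alpha):
    """Return the smoothed series S_t = alpha*Y_t + (1 - alpha)*S_(t-1).

    The first smoothed value is taken equal to the first observation.
    """
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must lie between 0 and 1")
    smoothed = [y[0]]
    for observation in y[1:]:
        smoothed.append(alpha * observation + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical demand figures
demand = [10.0, 20.0, 30.0]
smoothed = single_exponential_smoothing(demand, alpha=0.5)

# The last smoothed value serves as the forecast for the next period
next_forecast = smoothed[-1]
print(smoothed, next_forecast)
```

Unrolling the recursion shows that an observation j periods back carries the weight $\alpha(1-\alpha)^j$, which is why the weights are said to decrease exponentially.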
Double Exponential Smoothing Model
If a time series exhibits a linear trend, then the Holt-Winter double
exponential smoothing model is recommended for forecasting.
The model smooths an exponentially smoothed component
(E), with a smoothing factor ($\alpha$), and a trend component
(T), with another smoothing factor ($\beta$). The model is given as
Keynote 11.2.1: Example
where,
$F_t$ = Forecast value for period t
$Y_t$ = Actual value for period t
$E_t$ = Estimated value for period t
$T_t$ = Trend value for period t
$\alpha$ = Smoothing factor for estimates ($0 < \alpha < 1$)
$\beta$ = Smoothing factor for trends ($0 < \beta < 1$)
k = number of periods ahead for which the forecast is being made
In the model, (a) $E_1$ and $T_1$ are not defined and (b) we
take $E_2 = Y_2$ and $T_2 = Y_2 - Y_1$.
By taking separate values for $\alpha$ and $\beta$, both between 0
and 1, we can obtain the forecast for k periods ahead.
However, the critical question is how to decide the values for
$\alpha$ and $\beta$. We do this by “trial and error” so as to minimize
the Root Mean Square Error (RMSE) of the model. This
would be equivalent to minimizing the Mean Square Error
(MSE). In practice, we minimize MSE. This can be done in
an organized way using Excel Solver, where we try to minimize
MSE (for the choice of $\alpha$ and $\beta$) subject to $\alpha \geq 0$,
$\alpha \leq 1$, $\beta \geq 0$, $\beta \leq 1$.
With the values for $\alpha$ and $\beta$ that minimize the
MSE, we can make the actual forecast of
the time series.
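The trial and error search can also be carried out with a simple grid instead of Excel Solver. The sketch below assumes the standard Holt form of the model, $E_t = \alpha Y_t + (1-\alpha)(E_{t-1} + T_{t-1})$, $T_t = \beta(E_t - E_{t-1}) + (1-\beta)T_{t-1}$ and $F_{t+k} = E_t + kT_t$, together with the starting values $E_2 = Y_2$ and $T_2 = Y_2 - Y_1$ stated above. The function names, the grid step and the data are hypothetical.

```python
def holt_double_smoothing(y, alpha, beta):
    """Return lists E and T; index 0 (period 1) is left undefined as None."""
    n = len(y)
    E = [None] * n
    T = [None] * n
    E[1] = y[1]           # E_2 = Y_2
    T[1] = y[1] - y[0]    # T_2 = Y_2 - Y_1
    for i in range(2, n):
        E[i] = alpha * y[i] + (1 - alpha) * (E[i - 1] + T[i - 1])
        T[i] = beta * (E[i] - E[i - 1]) + (1 - beta) * T[i - 1]
    return E, T


def forecast(y, alpha, beta, k=1):
    """Forecast k periods beyond the last observation."""
    E, T = holt_double_smoothing(y, alpha, beta)
    return E[-1] + k * T[-1]


def one_step_mse(y, alpha, beta):
    """Mean squared error of one-period-ahead forecasts (needs len(y) >= 3)."""
    E, T = holt_double_smoothing(y, alpha, beta)
    errors = [y[i] - (E[i - 1] + T[i - 1]) for i in range(2, len(y))]
    return sum(e * e for e in errors) / len(errors)


def best_smoothing_factors(y, step=0.1):
    """Grid search for the (alpha, beta) pair with the smallest MSE."""
    steps = int(round(1 / step))
    grid = [i / steps for i in range(steps + 1)]
    candidates = [(a, b) for a in grid for b in grid]
    return min(candidates, key=lambda ab: one_step_mse(y, ab[0], ab[1]))


# Hypothetical series with an upward trend
series = [10.0, 13.0, 15.0, 18.0, 22.0, 23.0, 27.0]
alpha, beta = best_smoothing_factors(series)
print(alpha, beta, forecast(series, alpha, beta, k=2))
```

A finer grid step, or a numerical optimizer, gives a more precise pair at the cost of more computation.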
Forecast Error
Let $Y_t$ be a time series with data for n periods and $F_t$ be
the corresponding forecast (obtained by any method).
Forecast errors are then popularly measured by the Mean Absolute
Error (MAE) or the Root Mean Squared Error (RMSE).
The former is defined as
$$\text{MAE} = \frac{1}{n}\sum_{t=1}^{n} \left| Y_t - F_t \right|$$
and the latter as
$$\text{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n} \left( Y_t - F_t \right)^2}$$
If forecasts for a situation are obtained through several
methods (say, by the decomposition method, single exponential
smoothing and double exponential smoothing), then
we can choose the method giving the least forecast error
(based on MAE or RMSE) as the “best” method for the
particular forecast. For single exponential smoothing, one
could compute RMSE over various values of $\alpha$ itself, and
for double exponential smoothing over various combinations
of values of $\alpha$ and $\beta$.
Having said this, periodically we need to reconfirm that
the identified method and parameter values (like $\alpha$ and $\beta$)
continue to be the best by recomputing the forecast errors
and comparing them over different methods. In practice,
use of RMSE or squared RMSE (called the Mean Squared
Error (MSE)) is more popular.
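The two error measures are straightforward to compute once the actual and forecast values are lined up period by period. The following sketch uses hypothetical figures; the function names are illustrative.

```python
import math


def mean_absolute_error(actual, forecast):
    """Average of the absolute forecast errors."""
    return sum(abs(y - f) for y, f in zip(actual, forecast)) / len(actual)


def root_mean_squared_error(actual, forecast):
    """Square root of the average squared forecast error."""
    mse = sum((y - f) ** 2 for y, f in zip(actual, forecast)) / len(actual)
    return math.sqrt(mse)


# Hypothetical actual values and forecasts for four periods
actual = [100.0, 110.0, 120.0, 130.0]
forecast = [98.0, 113.0, 119.0, 134.0]

mae = mean_absolute_error(actual, forecast)
rmse = root_mean_squared_error(actual, forecast)
print(mae, round(rmse, 4))
```

Because RMSE squares the errors before averaging, it penalizes a few large misses more heavily than MAE does.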
Question 1 of 8
The weighting factor ($\alpha$) in simple exponential
smoothing ranges from
A. -1 to 1
B. 0 to 1
C. 1 to 2
D. 2 to 3
Section 3
Case Study: Predicting Sales of a Company
This case study was written by Sunil Bhardwaj, Professor, Department of Operations & IT, IBS Hyderabad. It is intended to be
used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situation.
The case was prepared from the generalised experiences.
Amar Corporation is in the business of manufacturing washing
machines, and the company has been doing fairly well.
One of the important activities carried out at Amar Corporation is
to generate production schedules so as to meet anticipated
demand for the coming year. Accurate planning for production
and supplies depends on the quality of the forecasts the company is
able to generate.
On his first visit to the company, in the first round of talks
with the people at the corporate office, Amit observed that
the forecasts are mostly based on the qualitative assessment
of market conditions by the sales manager and his
team. He observed that business forecasting has always
been one of the important components of running such an
enterprise. However, forecasting has traditionally been based
less on concrete and comprehensive data than on face-to-face
meetings and common sense. The typical practice is to
take the opinion of the sales force about the sales of
washing machines, and then a number is mutually decided
and agreed upon by the team. This practice has worked well
for the company; however, in the last few years they were not able
to forecast with much accuracy.
During his MBA course, he had learned that in recent years
business forecasting has developed into a much more scientific
exercise, with a host of theories, mathematical techniques
and models designed for forecasting certain types of
data. The development of information technologies and the
Internet has given a boost to this development, as companies
tices, but into forecasting schemes as well. In the 2000s, forecasting
or predicting the optimal levels of goods to buy or
products to produce involved specialized software and electronic
networks that incorporate a great deal of data and advanced
mathematical algorithms tailored to a company’s particular
market conditions and nature of business.
Amit understands that forecasting sales and profits, particularly
on a short-term basis (one to three years), is necessary
in planning for business success. This process of estimating
future business performance based on the actual results
from prior periods enables the business owner or manager to
modify or manage the operation of the business on a timely
basis. It allows the business to decide on sales targets with a
better understanding and to avoid losses or major financial
problems in case some future results from operations
do not conform to reasonable expectations. Amit was excited
about the consulting project, and he sees it as an opportunity to
train some of the managers on a few of the quantitative forecasting
techniques. Amit has been thinking about the various
parameters which may be helpful in preparing the forecast
for the client. Some of them are:
Company Specific Data
Sales Force size,
Incentive schemes,
Promotional Budget
Price
Competitor Price
Environment Specific Data
Overall state of the economy,
Economic status of Amar Corporation as well as the in-
dustry within the economy
Population growth,
Disposable income
Elasticity of demand for the product or service
Threats from the substitutes or competitor products
Data from the Past
Previous sales levels and trends,
Average past administrative, and selling expenses,
Trends in the company’s credit policy (supplier,
trade credit, and bank credit) to support various levels of
inventory
Trends in accounts receivable required to achieve previ-
ous sales volumes
After a few rounds of meetings with the managers at Amar
Corporation, they invited Amit to teach a few of the quantitative
approaches to forecasting. The following data was made
available (Exhibit I).
I. What techniques can be used to forecast sales with this
data? What are the drawbacks of the techniques?
II. Which is better: qualitative forecasting, quantitative
forecasting or both?
| Year | Sales (In lakhs) | (In lakhs) |
|---|---|---|
| 1991 | 150 | 15 |
| 1992 | 120 | 16 |
| 1993 | 160 | 15 |
| 1994 | 150 | 17 |
| 1995 | 150 | 18 |
| 1996 | 130 | 19 |
| 1997 | 180 | 18 |
| 1998 | 160 | 18 |
| 1999 | 170 | 18 |
| 2000 | 140 | 15 |
| 2001 | 200 | 20 |
| 2002 | 180 | 21 |
| 2003 | 200 | 22 |
| 2004 | 150 | 17 |
| 2005 | 230 | 24 |
| 2006 | 200 | 24 |
| 2007 | 250 | 25 |
| 2008 | 260 | 26 |
| 2009 | 270 | 28 |

prepared by the author
Exhibit I
Section 4
Case Study: The Electric Fan Industry
This case study was written by Prof. L. Shridharan, Department of Decision Sciences, IBS, Hyderabad. It is intended to be used
as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situation. The
case was written from generalized experiences.
The first electric fan was manufactured in India in 1921. While
Orient Fans started in the 1940s, the growth in the industry came
about after independence with a ban on imports. Jay Engineering
Works (with the Usha brand) started in the 1950s. Other major
organised sector players in the industry today are Khaitan, Polar,
Crompton Greaves, Bajaj, Havell’s, Metro and a few others, besides
several smaller units in the small scale sector. The fan market in
India consists of ceiling fans, table fans, pedestal fans, wall fans,
exhaust fans, and industrial exhaust and special purpose fans. Given
the tropical nature of India, it is the ceiling fan which has a dominant
share in the total production and market.
While in fifties, Kolkata emerged as the major production cen-
medium cities in India. Hyderabad emerged as a major cen-
tre in the nineties, though Jay Engineering Works had started
its unit in the sixties. Today, Hyderabad is considered the larg-
est ceiling fan manufacturing centre in the country. While 10
to 15 units in the organized sector manufacture complete
fans, a few hundred units (mostly in unorganized sector)
manufacture various components.
Tibrewala, owner and chief
executive officer of Bhagyanagar Fans Limited, Hyderabad, is
also the President of the association formed by the local fan
manufacturers in the organized sector. With many units having
faced shortages in the supply of components in the last peak season,
the association felt that a better understanding of
the short-term demand pattern is necessary. After the meeting,
Tibrewala called Ravi Kumar, the young market research
executive (an MBA) with the association, and explained what
he wanted. He asked him to come forward with a short-term
forecast for the next six months, before the next meeting
scheduled after two weeks.
Ravi Kumar got on to the job immediately with an internet
search, a library search and visits to the industry department
for relevant data. Despite spending a week on these efforts,
he could not find past data on sales or demand for electric
fans. What he could find, however, was the monthly production
by the organized sector since 2000 (Exhibit I). With only
a week left to generate the forecast, Ravi is under pressure since
he is not clear on the data or the approach to generate the
forecast.
Interactive 11.1: Production of Electric Fans in India
(In lakh numbers)
Month 2000-01 2001-02 2002-03 2004-05 2005-06 2006-07 2007-08 2008-09
April 6.6 6.9 7.7 9 9.2 9.6 10.1 10.4
May 6.7 7.3 8.1 8.5 9.5 9.7 11.3 11.6
June 5.9 7.2 8.7 8.7 7.9 8.8 10.8 10.7
July 4.9 7.2 8.2 7.2 7 9.7 9.7 9.5
August 5.8 7.2 6.8 8.1 6.1 6.9 8.8 8.8
September 6.4 7.5 6.5 7.7 6.6 7.5 8.4 8.9
October 6.2 7.2 7.6 8.5 6.8 7.3 8.1 8.4
November 6.3 7.4 7.7 8.2 7 7.7 8.6 7.6
December 9.6 7.1 7.7 8.3 8.2 8.8 9.4 8.2
January 7.5 7.8 7.6 8.2 8.9 9.4 10.5
February 7.6 8.1 7.3 8.3 9.8 10.2 10.9
March 8.1 8.7 8 9.3 10.2 11.8 11.1
Source: Monthly Abstract of Statistics, central Statistical organization, Government of India
Several issues between January 2001 to March 2009
Exhibit I
CHAPTER 12
Decision Theory
In this chapter we will discuss:
The Framework for Decision-making
Decision-Making Environment
• Decision-making under Certainty
• Decision-making under Uncertainty
• Decision-making under Risk
a. Expected Monetary Value
b. Expected Opportunity Loss
c. Expected Value under Perfect Information
Decision Tree Analysis
Posterior Probability and Decision-making
Section 1
The Framework for Decision-Making under different Environments
Decision-making, both long-term and short-term, is an
integral part of any management process. Managers at all
levels have to deal with planning, organizing, monitoring and
control at their levels of operations, be it in finance,
production, investment, pricing, demand or research
decisions. The import of these decisions is expected to be
seen through an increase in profit, a reduction in cost, an increase in
turnover, an increase in market share, etc. Decision theory
helps us in arriving at appropriate decisions under different
circumstances.
The Framework for Decision-making
Regardless of the context and environment, all decision
problems faced by managers have the following common
features:
Decision Maker’s Objectives: There should be a clearly stated,
measurable objective for the problem. For instance, a
production manager would be concerned with minimizing
the downtime of a machine whose downtime may prove costly
to the organization. His problem could be to determine the
optimal number of spare motors to maintain to enable quick
repair of the machine.
Courses of Action: This is a list of the alternative acts
available to the manager to address the problem. He can
choose any one of them. Clearly, he would like to identify
the act which would be “optimal” for his situation. If n
alternatives are available for the problem, they are denoted by
$A_1, A_2, \ldots, A_n$.
For the production manager’s problem, the courses of action
are the inventory of spare motors he should maintain, say
1, 2 or 3.
States of Nature: These are events which are beyond the
control of the decision maker but have an impact on the
problem at hand. They are generally denoted by $S_1, S_2, \ldots, S_m$ to
indicate the m states of nature faced by the problem. While the
states of nature are uncertain, it may be possible, based on past
experience or on some other basis, to indicate the probability
of occurrence of each state of nature. For the
production manager’s problem, the state of nature is how
many motors would fail at a time (1, 2 or 3).
Payoffs: This is a calculable measure of benefit or loss for
each combination of action and state of nature. The payoff
for action $A_j$ and state of nature $S_i$ is denoted by $X_{ij}$. While
the payoffs are generally in monetary terms, this is not a
must. A payoff could be time saved, material saved, measurable
quality improvement, etc. Thus the production manager
may estimate the production lost or the downtime of the
machine for each combination of inventory and breakdown
as the payoff.
The following payoff table 12.1.1 sums up the production
manager’s problem:
Example: A Pricing Problem
Consider an Ayurvedic cosmetic company contemplating the
introduction of a multi-herbal cream in place of its existing
uni-herbal cream. The issue before the company is to decide
on the price under three options: offer the existing price ($A_1$),
a moderate increase in price highlighting the multi-herbal
character ($A_2$), or a substantial increase in price with a new
attractive packaging highlighting the multi-herbal character
($A_3$). These are the courses of action, of which the
company wants to choose one. However, the company is
aware of the following possible market conditions (states of
nature): no competitor emerging ($S_1$), a small competitor
emerging ($S_2$) and a major competitor emerging ($S_3$). The
marketing department estimates the annual net profits
(payoffs) for each course of action under the different market
conditions in table 12.1.2 as follows:
The company has to decide on a strategy (action) to be
followed.
Table 12.1.1

| Number of motors that may fail | Spare motors to keep (Act): 1 (A1) | 2 (A2) | 3 (A3) |
|---|---|---|---|
| 1 (S1) | X11 | X12 | X13 |
| 2 (S2) | X21 | X22 | X23 |
| 3 (S3) | X31 | X32 | X33 |
Table 12.1.2: Annual Net Profit (in Rs. mn) under different pricing strategies

| Market condition | No increase | Moderate increase | Substantial increase |
|---|---|---|---|
| No competition | 6.00 | 5.00 | 3.50 |
| Minor competition | 5.00 | 4.50 | 2.50 |
| Major competition | 4.00 | 3.00 | 1.80 |
Decision-Making Environment
We have three types of decision-making environments:
Under Certainty
Under Uncertainty
Under Risk
Decision-making under certainty
Under a certain environment, there is only one state of nature
and all information is known with definite results. Hence a
deterministic choice of action can be made. Techniques like
Linear Programming, Transportation and Assignment
models, Goal Programming, Break-Even Analysis, etc., are
used in an environment of certainty.
Decision Making under Uncertainty
Under an uncertain environment, more than one state of
nature exists. However, beyond identifying the states of
nature, we have no information on them, and hence it is not
possible to assign any probability to each state.
In this situation, decisions are based on specific criteria
depending on one’s choice of principles. Several
alternative criteria are available: (a) Maximin; (b) Minimax; (c)
Maximax; (d) Laplace; (e) Hurwitz Realism; and (f) Regret.
We explain each of the criteria below, illustrated in the
context of the pricing problem:
(a). Maximin Criterion: This is a pessimistic approach,
where we go for the action with the best payoff under the
worst state of nature.
For the problem, the decision would be to go for “no
increase” in price, as the company makes Rs. 4 million
under the worst scenario of a major competitor emerging.
(b). Minimax Criterion: This is an optimistic approach,
where the worst pricing strategy for the best market
conditions is selected. Here, this approach suggests the
strategy of “substantial increase” in price, with “no
competition” as the best market condition, and the
company makes Rs. 3.50 mn.
(c). Maximax Criterion: This is the “best of the best”
approach. Here, the best market condition is “no
competition” and the best strategy is “no increase” in
price, with a payoff of Rs. 6.00 mn.
(d). Laplace Criterion: With no information on the probabilities
of the various states of nature, we assume equal
probability (1/3 in this case) and compute the expected
payoff for each action. Hence we get:
E (No price change) = (6.00 + 5.00 + 4.00)/3 = Rs. 5 mn
E (Moderate price change) = (5.00 + 4.50 + 3.00)/3 = Rs. 4.17 mn
E (Substantial price change) = (3.50 + 2.50 + 1.80)/3 = Rs. 2.6 mn
Hence, by the Laplace criterion, we prefer “no increase” in
price with a payoff of Rs. 5 mn.
(e). Hurwitz Realism Criterion: The Maximax and Maximin
are the two extremes, optimistic and pessimistic.
Hurwitz proposed that realism would be somewhere in
between. Representing the degree of optimism by $\alpha$
(where $0 < \alpha < 1$), Hurwitz suggested that for each
strategy (act) a decision index ($D_i$) be calculated as the
weighted average of the optimistic and pessimistic payoffs,
the former weighted by the degree of optimism ($\alpha$) and the
latter by the degree of pessimism ($1 - \alpha$). Taking $\alpha$ = 0.6, for
the pricing problem we get:
D (No price rise) = 0.6 x 6.00 + 0.4 x 4.00 = 5.2
D (Moderate price rise) = 0.6 x 5.00 + 0.4 x 3.00 = 4.2
D (Substantial price rise) = 0.6 x 3.50 + 0.4 x 1.80 = 2.82
Thus we would go for “no increase” in price based on the
Hurwitz criterion. The conclusion could depend on the value
of $\alpha$. Hence a realistic guess of $\alpha$ needs to be arrived at.
(f) Regret criterion: This approach takes into account the loss
of missed opportunity due to not knowing the state of
nature in advance. An opportunity loss can be computed
as the difference between the maximum payoff under a
given state of nature and the payoff for the chosen action
under that state. The opportunity loss table 12.1.3 for the pricing
problem would be as follows:
The maximum regret is 0 for “no increase”, 1 for “moderate
increase” and 2.5 for “substantial increase”.
Thus, the minimum of the maximum regrets is zero. Hence,
“no price increase” is the strategy to be adopted under the
regret criterion.
While there are differences in the conclusions reached
following different criteria, they primarily reflect the
underlying principles of each criterion. The company has to
take a call on the principle to follow depending on its
outlook and judgement.
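The criteria can be checked against the payoff table with a short script. The sketch below uses the payoffs of table 12.1.2; the function names are illustrative, the Minimax rule is coded as described in the text (the smallest of the maximum payoffs), and the degree of optimism 0.6 is the value used in the example.

```python
# Payoffs in Rs. mn for each strategy under
# [no competition, minor competition, major competition]
payoffs = {
    "No increase": [6.00, 5.00, 4.00],
    "Moderate increase": [5.00, 4.50, 3.00],
    "Substantial increase": [3.50, 2.50, 1.80],
}


def maximin(p):
    return max(p, key=lambda act: min(p[act]))


def minimax_as_in_text(p):
    # Smallest of the maximum payoffs, following the description in the text
    return min(p, key=lambda act: max(p[act]))


def maximax(p):
    return max(p, key=lambda act: max(p[act]))


def laplace(p):
    return max(p, key=lambda act: sum(p[act]) / len(p[act]))


def hurwitz(p, alpha):
    return max(p, key=lambda act: alpha * max(p[act]) + (1 - alpha) * min(p[act]))


def regret_table(p):
    n_states = len(next(iter(p.values())))
    best = [max(p[act][s] for act in p) for s in range(n_states)]
    return {act: [best[s] - p[act][s] for s in range(n_states)] for act in p}


def minimax_regret(p):
    regrets = regret_table(p)
    return min(regrets, key=lambda act: max(regrets[act]))


for name, choice in [
    ("Maximin", maximin(payoffs)),
    ("Minimax (as in text)", minimax_as_in_text(payoffs)),
    ("Maximax", maximax(payoffs)),
    ("Laplace", laplace(payoffs)),
    ("Hurwitz, alpha = 0.6", hurwitz(payoffs, 0.6)),
    ("Regret", minimax_regret(payoffs)),
]:
    print(f"{name}: {choice}")
```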
Decision-Making under Risk
(a). Expected Monetary Value
Unlike in the previous case, when the probabilities of the
states of nature are known, it is a situation of decision-making
under risk. The probabilities are usually
known on the basis of past data or experience. In this case,
we have only one criterion, leading to a unique selection of
strategy, though probabilistic in nature. We find the
expected payoff for each strategy, called the Expected
Monetary Value (EMV), using the probabilities of the
states of nature. In the pricing example, suppose the
probabilities of the three states were given as 0.2 (for no
competition), 0.5 (for minor competition) and 0.3 (for
major competition). Then, the EMV or expected profit (EP in
this case) is given as
EMV (No change in Price) = 0.2 x 6.00 + 0.5 x 5.00 + 0.3 x 4.00 = Rs. 4.9 mn
EMV (Moderate change in Price) = 0.2 x 5.00 + 0.5 x 4.50 + 0.3 x 3.00 = Rs. 4.15 mn
EMV (Substantial change in Price) = 0.2 x 3.50 + 0.5 x 2.50 + 0.3 x 1.80 = Rs. 2.49 mn
Since the EMV is highest for “no change” in price, the
company should choose this strategy.
Table 12.1.3: Regret payoffs by pricing strategy

| Market condition | No increase | Moderate increase | Substantial increase |
|---|---|---|---|
| No competition | (6-6)=0 | (6-5)=1 | (6-3.5)=2.5 |
| Minor competition | (5-5)=0 | (5-4.5)=0.5 | (5-2.5)=2.5 |
| Major competition | (4-4)=0 | (4-3)=1 | (4-1.8)=2.2 |
(b) Expected Opportunity Loss: Under the regret criterion, we
discussed payoffs in terms of opportunity loss. If we are working
with such payoffs, we call the corresponding expected value the Expected
Opportunity Loss (EOL) and would go for the strategy with the least
expected loss. This approach would also be followed if the payoffs
were defined in terms of cost or downtime. In the pricing
example,
EOL (No increase) = 0.2 x 0 + 0.5 x 0 + 0.3 x 0 = 0
EOL (Moderate increase) = 0.2 x 1 + 0.5 x 0.5 + 0.3 x 1 = 0.75
EOL (Substantial increase) = 0.2 x 2.5 + 0.5 x 2.5 + 0.3 x 2.2 = 2.41
Thus the “no increase” in price option is selected.
(c) Expected Value under Perfect Information:
In the above kind of situation, suppose a soothsayer (a modern-day
market research consultant) says that he can give perfect
information on the occurrence of the state of nature. Could this
knowledge affect the choice of strategy? How much would the
information be worth, as soothsayers do not come free? For
instance, in the pricing example, with the probability of each state
remaining the same, the soothsayer tells the state of nature (i.e., the
market condition) for every day of the next one year. As
opposed to the previous case, where we knew the percentage of
days for which each state of nature would apply, now we
additionally know which state will apply on any given day.
Clearly, with the market condition known for each day, the company
would choose the strategy with the maximum payoff for that day.
Thus, under perfect information, the relevant part of the payoff
table 12.1.4 would look as below:
Therefore, Expected Profit of Perfect Information (EPPI)
= 0.2 x 6.00 + 0.5 x 5.00 + 0.3 x 4.00
= Rs. 4.9 mn
and, Expected Value of Perfect Information
= EPPI - EMV (max)
= 4.9 - 4.9 = 0
Normally, the difference between the EPPI and EMV (max) is
the amount of money the company will be willing to pay for the
perfect information. In this pricing example, the difference is zero
because “no increase” yields the highest payoff under every market
condition, so the company will not pay for the perfect information.
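The expected values under risk can be verified numerically. The sketch below uses the payoffs of table 12.1.2 and the probabilities 0.2, 0.5 and 0.3 from the example; the variable and function names are illustrative.

```python
# Probabilities of [no competition, minor competition, major competition]
probabilities = [0.2, 0.5, 0.3]

# Payoffs in Rs. mn for each strategy under the three market conditions
payoffs = {
    "No increase": [6.00, 5.00, 4.00],
    "Moderate increase": [5.00, 4.50, 3.00],
    "Substantial increase": [3.50, 2.50, 1.80],
}


def expected_value(values, probs):
    return sum(p * v for p, v in zip(probs, values))


# Expected Monetary Value of each strategy
emv = {act: expected_value(v, probabilities) for act, v in payoffs.items()}

# Best payoff attainable under each state of nature
best_per_state = [
    max(v[s] for v in payoffs.values()) for s in range(len(probabilities))
]

# Opportunity loss (regret) and Expected Opportunity Loss of each strategy
regret = {
    act: [best - x for best, x in zip(best_per_state, v)]
    for act, v in payoffs.items()
}
eol = {act: expected_value(r, probabilities) for act, r in regret.items()}

# Expected profit with perfect information and its value
eppi = expected_value(best_per_state, probabilities)
evpi = eppi - max(emv.values())

for act in payoffs:
    print(f"{act}: EMV = {emv[act]:.2f}, EOL = {eol[act]:.2f}")
print(f"EPPI = {eppi:.2f}, EVPI = {evpi:.2f}")
```

For every strategy the EMV and the EOL add up to the EPPI, so the strategy with the highest EMV is always the one with the lowest EOL.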
Table 12.1.4: Payoffs by pricing strategy

| Market condition | Probability of the state | No change | Moderate change | Substantial change |
|---|---|---|---|---|
| No competition | 0.2 | 6.00 | - | - |
| Minor competition | 0.5 | 5.00 | - | - |
| Major competition | 0.3 | 4.00 | - | - |
Section 2
Decision Tree Analysis
We discussed decision-making under risk earlier.
A decision tree is a diagrammatic presentation of the same
decision process, which essentially helps in easy
comprehension of the logical relations in the process.
A decision tree is a convenient tool for making financial or
number-based decisions where a lot of complex information
needs to be taken into account. Decision trees provide an effective
model in which alternative decisions and the implications of
taking those decisions can be laid down and evaluated.
They also help managers to get an accurate, balanced
picture of the risks and rewards that can result from a
particular decision. Decision trees can be drawn to evaluate
the risk in decisions concerning investments, new product
launches, outsourcing, etc.
Guidelines to Draw a Decision Tree
Drawing a decision tree starts with a decision that needs to
be made. This decision is represented by a small square
called a decision node (usually decision trees are drawn from
left to right). Each possible alternative is represented by a
line (drawn from the decision square diverging to the right),
and the payoff is written at the end of the line. When the
outcomes at a point are uncertain, we draw a small
circle to represent that node. Each alternative from this
node, called a chance node, is represented by a line and
an associated probability. The payoffs are written at the end
of the line. If the result is a decision, draw another square
at the end of the line. Figure 12.2.1 shows what a decision
tree looks like.
with a basic decision
tree, review this to
find out whether any
other solutions or outcomes can be considered for further
evaluation. Then prepare a final decision tree diagram.
Figure 12.2.1: Decision Tree Model
Example 12.2.1
An FMCG company is making plans either to launch a new
product or to consolidate its existing products. The
company can launch new products in the market in two
ways: (1) through detailed product development or (2) through
rapid product development. If the company wants to consolidate,
it would do so either by strengthening its existing
products through advertising and promotion or by
reaping the benefits of the company’s brand name
without making any additional investments. For this, the
company has employed a market research firm to find out
the market reaction to its products. After surveying the
company’s products, the market research firm found
that the market may have three reactions (good, average
and poor) and accordingly calculated the profits for each
reaction.
If the company goes for detailed product development for
launching new products, it can make a profit
of Rs. 10,00,000 when the market reaction is good, Rs. 50,000
when average and Rs. 2,000 when poor, and the
probabilities of such reactions are 0.4, 0.4 and 0.2
respectively.
If the company goes for rapid product development for
launching new products, it can make a profit
of Rs. 8,00,000 when the market reaction is good, Rs. 25,000
when average and Rs. 2,000 when poor, and the
probabilities of such reactions are 0.2, 0.1 and 0.7
respectively.
If the company goes for strengthening its existing products for
consolidation, it can make a profit of Rs. 3,00,000
when the market reaction is good, Rs. 20,000 when
average and Rs. 6,000 when poor, and the probabilities of
such reactions are 0.2, 0.4 and 0.4 respectively.
If the company goes for consolidating its existing products
without making any additional investments, i.e., reaping the
existing products, it can make a profit of Rs.
20,000 when the market reaction is good, Rs. 9,000 when
average and Rs. 6,000 when poor, and the probabilities of
such reactions are 0.3, 0.2 and 0.5 respectively. The detailed
development cost is Rs. 1,50,000, the rapid development cost is
Rs. 80,000, while the cost of strengthening the existing
products is Rs. 30,000. Now the company has to decide
whether to launch new products or to consolidate existing
products.
Solution:
Evaluation of Decision Tree
When a decision tree diagram has been drawn, a manager can read
from it the decision which will give him the greatest payoff.
Managers can evaluate each possible outcome by assigning cash
or numeric values to it. The lines from a circle (chance point)
are each given a probability depending on the chance of that
event (outcome) occurring. Clearly, at each circle the total
probability must be 1. These probabilities are assigned based
on past data (if data is available) or on the experience-based
judgment of the manager (subjective probability). When
probabilities are assigned, the
decision tree in Figure 12.2.2 will look like the tree in Figure
12.2.3.
Calculating Tree Values
Once the manager has decided the values of the outcomes and
has assessed the probability of occurrence of these outcomes,
he can start calculating the value of each alternative. In our
problem, the probability-weighted values are shown in Figure
12.2.4.
Figure 12.2.2: Decision Tree Model for an FMCG Company
[Decision tree. The decision "New Product" branches into Detailed
Development and Rapid Development; the decision "Consolidate"
branches into Strengthen Product and Reap Product (reap benefits of
its brand name). Each of the four branches ends in a market reaction
chance node with three outcomes: Good, Average and Poor.]
Figure 12.2.3: Decision Tree with Probabilities
[The same tree with the probability and payoff of each market
reaction. G - Good, A - Average, P - Poor.]
Detailed Development: (0.4) G Rs. 10,00,000; (0.4) A Rs. 50,000; (0.2) P Rs. 2,000
Rapid Development: (0.2) G Rs. 8,00,000; (0.1) A Rs. 25,000; (0.7) P Rs. 2,000
Strengthen Product: (0.2) G Rs. 3,00,000; (0.4) A Rs. 20,000; (0.4) P Rs. 6,000
Reap Product: (0.3) G Rs. 20,000; (0.2) A Rs. 9,000; (0.5) P Rs. 6,000
Figure 12.2.4: Decision Tree
[The same tree with the probability-weighted payoff of each outcome
and the resulting value of each chance node, in Rs.]
Detailed Development: G - 4,00,000; A - 20,000; P - 400; node value 4,20,400
Rapid Development: G - 1,60,000; A - 2,500; P - 1,400; node value 1,63,900
Strengthen Products: G - 60,000; A - 8,000; P - 2,400; node value 70,400
Reap Products: G - 6,000; A - 1,800; P - 3,000; node value 10,800
Calculating the values for chance nodes (Circles)
Consider the chance node under new product with detailed
development approach. The EMV at this chance node due to
probabilistic market condition can be computed as:
EMV1 = 10,00,000 × 0.4 + 50,000 × 0.4 + 2,000 × 0.2 = Rs.
4,20,400.
Similarly, we get:
EMV2 = Rs. 1,63,900
EMV3 = Rs. 70,400
EMV4 = Rs. 10,800
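The four chance-node values can be checked with a few lines of Python. This is only a minimal sketch: the payoffs and probabilities are those of Example 12.2.1, while the dictionary layout and the function name `emv` are illustrative choices and not part of the original solution.

```python
# Payoffs (Rs.) and probabilities for the good, average and poor market
# reactions, taken from Example 12.2.1. All names are illustrative.
alternatives = {
    "detailed development": [(1_000_000, 0.4), (50_000, 0.4), (2_000, 0.2)],
    "rapid development":    [(800_000, 0.2), (25_000, 0.1), (2_000, 0.7)],
    "strengthen products":  [(300_000, 0.2), (20_000, 0.4), (6_000, 0.4)],
    "reap products":        [(20_000, 0.3), (9_000, 0.2), (6_000, 0.5)],
}

def emv(outcomes):
    """Expected monetary value of a chance node: sum of payoff x probability."""
    total_probability = sum(p for _, p in outcomes)
    # At every chance node the probabilities must add up to 1.
    assert abs(total_probability - 1.0) < 1e-9
    return sum(payoff * p for payoff, p in outcomes)

chance_node_values = {name: emv(o) for name, o in alternatives.items()}
for name, value in chance_node_values.items():
    print(f"EMV ({name}) = Rs. {value:,.0f}")
```

Note that Python's `,` format groups digits in thousands (420,400), whereas the text uses the Indian grouping (4,20,400); the amounts are the same.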
Calculating the values of decision nodes (Squares)
While evaluating a decision node, one should write down the
cost of each alternative solution along the decision line. Then
this cost is subtracted from the value of the outcome that is
already calculated. This will give a value which represents the
benefits of that decision (sunk costs, amounts already spent
are not considered while calculating the node value). After
calculating the benefit of each decision, one can select the
decision that offers the greatest (highest) benefit. Figure
12.2.5 shows the expected net monetary benefits of each
decision.
In Figure 12.2.5, one can see that the net benefit for “new
product, detailed development” is Rs. 2,70,400 (after
deducting the cost of this decision). The net benefit for “rapid
development” is Rs. 83,900. Thus the benefits from “new
product, detailed development” are greater than those from “new
product, rapid development”. Hence the most valuable option,
i.e. “new product, detailed development”, is selected and its
value is assigned to the decision node. Similarly, the value for
the consolidation decision node is obtained.
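The roll-back at the decision nodes can be sketched in the same way. The EMVs and costs below are the figures of Example 12.2.1; the dictionary names and the helper `net_benefit` are illustrative assumptions, not the author's notation.

```python
# EMVs of the four chance nodes (Rs.), as computed in the text.
emv = {
    "detailed development": 420_400,
    "rapid development": 163_900,
    "strengthen products": 70_400,
    "reap products": 10_800,
}
# Cost of each alternative (Rs.); reaping needs no additional investment.
cost = {
    "detailed development": 150_000,
    "rapid development": 80_000,
    "strengthen products": 30_000,
    "reap products": 0,
}
# Alternatives available at each decision node.
decision_nodes = {
    "new product": ["detailed development", "rapid development"],
    "consolidation": ["strengthen products", "reap products"],
}

def net_benefit(alternative):
    """Chance-node value minus the cost written along the decision line."""
    return emv[alternative] - cost[alternative]

# Roll back: each decision node takes the value of its best alternative.
node_choice = {n: max(alts, key=net_benefit) for n, alts in decision_nodes.items()}
node_value = {n: net_benefit(a) for n, a in node_choice.items()}
best_node = max(node_value, key=node_value.get)

for node in decision_nodes:
    print(f"{node}: choose {node_choice[node]}, value Rs. {node_value[node]}")
print("Overall decision:", best_node)
```

The sketch gives Rs. 2,70,400 for the new product node (detailed development) and Rs. 40,400 for the consolidation node (strengthening the products).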
Final Result
When the values of the decision nodes are available, the
manager can go for the most rewarding decision. In this
example, the question is “should we develop a new product or
consolidate?” The best option is to develop a new product
through detailed development.
Figure 12.2.5: Decision Tree Showing Expected Gross Monetary
Benefits of New Product Development and Consolidation
[Decision tree, values in Rs. Final result: 2,70,400.]
New Product (node value 2,70,400):
Detailed dev cost = 1,50,000; 4,20,400 - 1,50,000 = 2,70,400
Rapid dev cost = 80,000; 1,63,900 - 80,000 = 83,900
Consolidation (node value 40,400):
Strengthen products: chance node value 70,400; net value 40,400
Reap products cost = 0; chance node value 10,800
The value of "strengthening the product" is assigned to the second
decision node (consolidation).
Section 3
Posterior Probability and Decision-making
The kind of decision-making problem that we have handled so far
had different “states of nature”, and we had an idea of the
probability of occurrence of each state. Any additional
information should help in arriving at a better decision. In
particular, we consider a situation in which the old information
on the states of nature is re-evaluated in the light of some new
conditional information. We may recall that Bayes’ Theorem helps
us in such a situation to obtain posterior probabilities, i.e.
probabilities posterior to the identified conditions. With such
additional information it would be logical to expect to arrive
at a better decision. Let us experience this through an example.
Example: Housing problem
“Ansal Lifestyles” is a leading housing development
company. The company has acquired a prime piece of land in a
metro city and proposes to develop a dwelling complex on
this land. The company is considering three options for the
dwelling complex: a small complex with only 30 flats, a
medium complex with 60 flats, or a large complex with 90
flats. The demand for the flats is expected to be either
strong (with 0.8 probability) or weak (with 0.2 probability).
Ansal Lifestyles has to decide on one of the options for
construction. The payoff for different combinations of
demand and options is as below.
Which of the options would you recommend?
Solution:
Here, EMV (Small complex) = Rs.7.8 mn
EMV (Medium complex) = Rs.12.2 mn
EMV (Large complex) = Rs. 14.2 mn
Hence Ansal’s should go for large complex. A decision tree
for this problem would look as in Keynote 12.3.1-Fig (A)
Further, EVPI = EPPI - EMV (max)
= 17.4 - 14.2
= Rs.3.2 mn
Thus Ansal’s would be willing to pay a maximum of Rs. 3.2
mn for perfect information on the state of demand and no
more.
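The EVPI arithmetic can be reproduced directly from the figures quoted in the solution. The following is a minimal sketch that uses only those stated values (the three EMVs and the EPPI, in Rs. mn); the variable names are illustrative.

```python
# Figures (Rs. mn) quoted in the solution above.
emv = {"small": 7.8, "medium": 12.2, "large": 14.2}
eppi = 17.4  # expected payoff with perfect information

# Without further information, choose the option with the highest EMV.
best_option = max(emv, key=emv.get)
# EVPI is the gap between the EPPI and the best EMV.
evpi = eppi - emv[best_option]
print(f"Best option: {best_option} complex; EVPI = Rs. {evpi:.1f} mn")
```

Since no study can do better than perfect information, the EVPI acts as an upper limit on what any additional information about demand is worth.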
Housing example: Extended
Even as Ansals are debating the options, a Market
Research Consultant offers to prepare a detailed demand
study for a fee. Ansals at this point estimate the
likelihood of a favorable report under different demand
conditions as below:
P{Favorable Report (FR)/Strong Demand (SD)} = 0.9
P{Favorable Report (FR)/Weak Demand (WD)} = 0.25
(That is, the report would provide additional
information, but not perfect information.)
Should Ansals engage the Market Research Consultant
and if so, what is maximum fee they can consider paying
him?
Solution:
At this stage (before engaging the MR consultant), we
know the chances of a favorable report given the state of
demand (strong or weak). However, Ansals would be
interested in knowing the chances of a strong or weak
demand given that the report is favorable (or unfavorable).
The hope is that we will have a better estimate of the
probability of demand with the MR study findings. In other
words, we would like to know P(SD/FR), P(WD/FR), P(SD/UFR)
and P(WD/UFR), where UFR denotes an unfavorable report.
These can be easily found using Bayes’ Theorem. The decision
tree to decide whether or not Ansals should go for the
MR study would be as in Keynote 12.3.1 - Fig (B)
and would use the posterior probabilities.
As can be seen,
EMV (MR Study) = Rs.15.94 mn
EMV (No MR Study) = Rs. 14.2 mn
Clearly, it is worth going for the MR Study. The consultant
can be paid a maximum of Rs. 1.74 mn.
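The posterior probabilities used in the tree follow from Bayes' Theorem. The sketch below uses only the prior probabilities and the two conditional probabilities given in the example; the function name `posterior` and the dictionary keys are illustrative, and an unfavorable report is assumed to be the complement of a favorable one.

```python
# Prior probabilities of demand and likelihoods of a favorable report,
# all taken from the example.
prior = {"SD": 0.8, "WD": 0.2}
p_fr_given = {"SD": 0.9, "WD": 0.25}

def posterior(likelihood):
    """Return (P(report), {state: P(state | report)}) by Bayes' theorem."""
    joint = {s: prior[s] * likelihood[s] for s in prior}
    p_report = sum(joint.values())
    return p_report, {s: joint[s] / p_report for s in joint}

p_fr, post_fr = posterior(p_fr_given)
# Assumption: the report is either favorable or unfavorable.
p_ufr, post_ufr = posterior({s: 1 - p for s, p in p_fr_given.items()})

print(f"P(FR) = {p_fr:.2f}, P(SD/FR) = {post_fr['SD']:.3f}, "
      f"P(WD/FR) = {post_fr['WD']:.3f}")
print(f"P(UFR) = {p_ufr:.2f}, P(SD/UFR) = {post_ufr['SD']:.3f}, "
      f"P(WD/UFR) = {post_ufr['WD']:.3f}")
```

A favorable report raises the probability of strong demand from 0.8 to about 0.935, while an unfavorable report lowers it to about 0.348.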
Keynote 12.3.1: Decision Tree
SECTION 4
Case Study: Mining for Precious Metals
This case study was written by Dr. Sunil Bharadwaj, Professor (Department of Decision Sciences), IBS, Hyderabad. It is intended to
be used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situation.
The case was written from generalized experiences.
National Mineral Corporation (NMC), a leading metal exploration
company, has recently started its operations in South India.
Its business is to find sites where potential sources of
metals are present. The company's procedure involves evaluating
a piece of land by taking samples of earth at various depths
and analyzing the extract for its contents, which is termed
‘geological exploration’. Before starting explorations, the
company buys a small area of land for its trials. If the
extracts are found to be rich in certain minerals, the company
estimates the appropriate size of the land to be bought in that
area. Once this is done, negotiations are carried out with the
owner, a license is obtained from the local authorities, and
the actual mining for mineral ores then starts. The mineral
ores obtained are sold to a variety of businesses, which use
the ore as a raw material for further processing and use.
The Dilemma
The director and other officials of the company are sitting in
the boardroom discussing the various possibilities for buying
a piece of land which has attracted the explorers of the
company.
A senior explorer specifies that the company’s earlier
experience in dealing with the type of land under consideration
indicates that geological explorations would cost approximately
INR 1 lakh and would yield significant metal deposits as
follows:
Manganese 1% chance
Gold 0.05% chance
Silver 0.2% chance
However, geological facts indicate that, most of the time, only
one of these three metals is found, i.e., there is neither a
chance of finding two or more of these metals at one place
nor a chance of finding any other metal.
Another officer, who has been working closely with the
authorities in the area, has come up with an option. The
company, if it wishes, may pay INR 75,000 for the right to
conduct a 3-day test exploration before deciding whether or not
to purchase the piece of land. Such 3-day test explorations can
only give a preliminary indication of whether or not significant
metal deposits are present.
The company had previously tried these kinds of options, and
past experience indicates that 3-day test explorations
cost an average of INR 25,000 and that significant metal
deposits are present 50% of the time.
Given the past experiences and geological facts, the director
asks the officer to identify the possible outcomes of opting for
a 3-day test. The officer developed two scenarios. First, if the
3-day test exploration indicates significant metal deposits, then
the chances of finding manganese, gold and silver increase to
3%, 2% and 1% respectively. Second, if the 3-day test
exploration fails to indicate significant metal deposits, then
the chances of finding manganese, gold and silver decrease to
0.75%, 0.04% and 0.175% respectively.
Questions for Discussion
What should NMC do? Should NMC abandon the plans
One of NMC’s competitors is prepared to pay half of all
costs associated with this piece of land in return for half of
all revenues. Under these circumstances, what should
NMC do?
SECTION 5
Roja Silks
This case study was written by R Muthukumar, IBSCDC. It is intended to be used as the basis for class discussion rather than to
illustrate either effective or ineffective handling of a management situation. The case was compiled from published sources.
started operations in 1996. It had three separate sections for
children, ladies, and men. Roja offered casuals, formals, and
western wear. For ladies, there were salwars in soft velvets,
cool cottons, printed clothes, and also accessories like
branded leather bags, shoes, nightwear, etc. For men, there
were formal shirts, trousers, jeans, T–shirts, suits, ties, socks,
and undergarments, and accessories like sunglasses and
leather products. The company had two manufacturing units
in Trichy and Salem.
spending during 2002. According to industry estimates, the
category accounted for Rs. 30 crores worth of ad spends in
Chennai alone (2003). Roja spent around Rs. 2 crores on its
launch alone. Another big spender was Nandhini Silks
(Nandhini), which spent around Rs. 1 crore on advertising to
drive home the message that it was not a force that could be
easily ignored. In the past, well–entrenched players such as
only on occasions such as Diwali, Christmas, and Pongal.
Now the needs had changed. The key issue was differentia-
tion and brand building, in what was turning out to be a highly
competitive market. This probably explained the boom in ad
spending.
Roja believed that it was possible to generate faster growth.
The retailer believed that capacity had to be built ahead of the
ager (GM) to decide the location of the plant. With two options
before him, GM was somewhat confused:
Construction of a large plant to meet the possible demand
in the future.
Construction of a small plant to meet a low demand, and
expanding it when the demand increased.
After detailed discussions with his colleagues, the GM
decided to get the help of a consultant for conducting market
research to find out more about the demand pattern. The
consultant believed the probabilities of low, medium, and
high demands were 0.3, 0.4, and 0.3 respectively. The
following data was also collected as part of the market
research exercise.
If a large plant was constructed at a cost of Rs. 12 lakhs,
it would be able to meet the demand in the future. The
operating returns for low, medium and high demands were
estimated at Rs. 10 lakhs, Rs. 16 lakhs and Rs. 24 lakhs
respectively.
If a small plant was constructed at a cost of Rs. 6 lakhs, it
would meet only low demand and it would have to be
expanded if the demand increased in the future.
Depending upon the demand, a small plant might require
no expansion (for low demand), or might require a small
expansion at a cost of Rs. 3 lakhs (for medium demand),
or might require a large expansion at a cost of Rs. 5 lakhs
(for high demand).
In future, because of the sudden expansion of the plant to
meet the demand, some revenues might be lost. The operating
returns to be realized in case of small plant expansion
and large plant expansion were projected at Rs. 14 lakhs and
Rs. 22 lakhs, respectively.
The GM was wondering which of the options he must pursue.
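One possible way to set up the GM's comparison as expected net returns is sketched below. It is an assumption-laden illustration rather than a model answer: the case does not state the operating return of an unexpanded small plant under low demand, so that figure is left as a hypothetical parameter (`low_demand_return`) which the reader must supply. All other figures (in Rs. lakhs) come from the case; the function names are illustrative.

```python
# Figures in Rs. lakhs, taken from the case; names are illustrative.
p_demand = {"low": 0.3, "medium": 0.4, "high": 0.3}

large_cost = 12
large_return = {"low": 10, "medium": 16, "high": 24}

small_cost = 6
expansion_cost = {"low": 0, "medium": 3, "high": 5}

def net_emv_large():
    """Expected operating return of the large plant minus its cost."""
    return sum(p_demand[d] * large_return[d] for d in p_demand) - large_cost

def net_emv_small(low_demand_return):
    """Expected net return of the small plant.

    low_demand_return is NOT given in the case; it is a hypothetical
    input that the caller must assume.
    """
    small_return = {"low": low_demand_return, "medium": 14, "high": 22}
    expected = sum(
        p_demand[d] * (small_return[d] - expansion_cost[d]) for d in p_demand
    )
    return expected - small_cost

print(f"Large plant: {net_emv_large():.2f}")
# Placeholder value of 10 lakhs, used purely for illustration.
print(f"Small plant: {net_emv_small(10):.2f}")
```

Under this reading the large plant has an expected net return of Rs. 4.6 lakhs, and the small plant is preferable only if the assumed low-demand return exceeds roughly Rs. 3.67 lakhs.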
SECTION 6
Universal home care products
In 2003, Universal Home Care Products Ltd. (Universal),
was one of India’s largest producers of detergents and
cleaning agents with sales of Rs. 1775 crores and a net in-
come of Rs. 112.3 crores. The company’s product line con-
sisted of over 1000 products ranging from industrial chemi-
cals to a variety of household cleaners and detergents. Con-
sumer products accounted for around 50% of the com-
pany’s turnover. Some of its brands had been highly suc-
cessful over a long period of time, while others had been
modified or dropped depending upon market conditions.
The company had an active new product development func-
tion.
Universal’s policy, with respect to any of its household clean-
ing products, was very rigid. The product had to capture at
least 5% share in that particular market within a year of its
introduction, failing which the product was dropped. Re-
cently, the company had developed an all–purpose house-
hold cleaner, ‘Sparkle,’ which was the first of its kind. The
cleaner differed from the traditional cleaners in its versatility.
It could clean a variety of surfaces like wood, glass, metal,
plastic, and ceramic. According to Universal, Sparkle could
remove the toughest of stains on any kind of surface. In ad-
dition, the new product was available as a spray cleaner of-
fering ease of use.
The company’s product management group saw in the new
spray cleaner, an opportunity to market a new product that
could improve the company’s position in the household
cleaner market. Sparkle was tested among 500 house-
wives. Though the product was not complete in all respects,
it got instant sampling. After making minor changes in the
packaging and fragrance, the product would be ready for
the market. Universal projected a market share of 6% by
the first year, 10% by the second year, and 14% after two
years.
Daychem Ltd. (Daychem) was an aggressive competitor
known for its proactive strategies. In the past, Universal and
almost all the segments in which the companies had a pres-
ence. Past experience had proved that Daychem was fast
in coming out with substitute products in a very short pe-
riod. For all that Universal knew, Daychem might already be
planning to launch a similar multi–purpose household
cleaner, with similar positioning, targeting the same seg-
ments.
The success of ‘Sparkle’ depended on Daychem’s ability to
bring out a competing product and on the relationship between
the firm’s pricing structure for Sparkle and the competitor’s
pricing structure for the competing product. Daychem’s ability
to bring out a competing product was estimated at 60%.
Universal estimated the profits for its new product for three
different prices, in the absence of competition. The estimated
profits were Rs. 60,000 at a low price, Rs. 75,000 at a medium
price, and Rs. 90,000 at a high price. There was another
dimension to the problem because Universal had to take into
account the competing product’s pricing as well (refer to
Table I).
Universal had to set its price first because it was entering the
market first with its product. Estimates of the probability of
competitor’s prices are as follows:
What should Universal do, with respect to pricing Sparkle,
taking into account all these dimensions?
Table I: Estimated profits in Rs. '000
If Universal’s       If Competitor’s Price is:
Price is:            LOW      MEDIUM    HIGH
LOW                  32       40        49
MEDIUM               35       48        50
HIGH                 12       30        49

Estimates of the probability of competitor’s prices
If Universal’s       Competitor’s Price is Expected to be:
Price is:            LOW      MEDIUM    HIGH
LOW                  80%      15%       5%
MEDIUM               20%      70%       10%
HIGH                 5%       30%       65%
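One way to combine the two tables with the no-competition profits is an expected-profit calculation for each of Universal's prices. The sketch below is one possible reading of the case, not a model answer: it assumes that the 60% estimate is the probability that Daychem launches a competing product, and all names are illustrative.

```python
# All profits in Rs. '000, taken from the case.
p_competition = 0.6  # assumed reading of "estimated at 60%"
profit_no_competition = {"LOW": 60, "MEDIUM": 75, "HIGH": 90}
# profit[universal_price][competitor_price], from Table I
profit = {
    "LOW":    {"LOW": 32, "MEDIUM": 40, "HIGH": 49},
    "MEDIUM": {"LOW": 35, "MEDIUM": 48, "HIGH": 50},
    "HIGH":   {"LOW": 12, "MEDIUM": 30, "HIGH": 49},
}
# p_competitor[universal_price][competitor_price]
p_competitor = {
    "LOW":    {"LOW": 0.80, "MEDIUM": 0.15, "HIGH": 0.05},
    "MEDIUM": {"LOW": 0.20, "MEDIUM": 0.70, "HIGH": 0.10},
    "HIGH":   {"LOW": 0.05, "MEDIUM": 0.30, "HIGH": 0.65},
}

def expected_profit(price):
    """Expected profit for one of Universal's prices."""
    with_competition = sum(
        p_competitor[price][c] * profit[price][c] for c in profit[price]
    )
    return ((1 - p_competition) * profit_no_competition[price]
            + p_competition * with_competition)

expected = {price: expected_profit(price) for price in profit}
best_price = max(expected, key=expected.get)
for price, value in expected.items():
    print(f"{price}: {value:.2f}")
print("Highest expected profit at price:", best_price)
```

Under these assumptions the expected profits are about 44.43, 57.36 and 60.87 (Rs. '000) for the low, medium and high prices respectively.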
Section 6
Case Study: Ram Publishers
Refer to the case study in Chapter 5.
CHAPTER 13
Comprehensive Case Studies
In this chapter we will discuss
Doughnut Bakers
KATT: An Outsourcing Company
SECTION 1
Case Study: Doughnut Bakers
This case study was written by Dr. Sunil Bharadwaj, Professor (Department of Decision Sciences), IBS, Hyderabad. It is
intended to be used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a
management situation. The case was written from generalized experiences.
Doughnut Bakers is one of the famous bakeries in the heart
of the city. The location of the bakery is good as it lies within
a densely populated area of the city. It is famous for the clas-
sic interiors, exotic cakes, pastries & dishes at reasonable
prices. About a year ago there was a change of manage-
ment. The new management has brought in some changes in
the menu and the staff. However, the management feels that
it has not been able to come up to the expectations of the
customers. Under the previous management around 90% of
the customers used to revisit. The monthly sales averaged
around Rs. 14 lakhs. However in the last nine months the
monthly sales averaged around Rs 11 lakhs with a standard
deviation of around Rs.4 lakhs. The bakery manager has re-
ported an increase in number of complaints which is also evi-
dent from the written customer feedback. Most of the custom-
ers report that the service is slow. Around 45 % of the custom-
ers have indicated dissatisfaction with the service quality of
the bakery.
To look into the matter, a consultant was hired. He visited the
bakery as a disguised customer during the peak hour (evening
time). He knows (based on past information) that the average
arrival rate of customers during this particular hour is around
20 customers an hour, against a seating capacity of 40
customers at a time. As a mathematician at heart, he starts
contemplating the probability that he finds the bakery
empty, i.e. without customers, or the probability that the
bakery is fully packed.
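The consultant's two probabilities can be sketched numerically if the number of arrivals in the hour is modelled as a Poisson variable with a mean of 20. Both the Poisson model and the reading of "fully packed" as at least 40 arrivals within the hour are assumptions made here for illustration; the case itself does not specify them.

```python
import math

arrival_rate = 20   # average customers per hour, from the case
capacity = 40       # seating capacity, from the case

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Bakery empty: no arrivals during the hour.
p_empty = poisson_pmf(0, arrival_rate)
# One reading of "fully packed": at least `capacity` arrivals in the hour.
p_full = 1 - sum(poisson_pmf(k, arrival_rate) for k in range(capacity))

print(f"P(empty) = {p_empty:.3e}")
print(f"P(fully packed) = {p_full:.3e}")
```

Both probabilities turn out to be very small under these assumptions, which reflects how far 0 and 40 lie from the mean of 20.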
He then has some snacks and meets the management.
He goes through the data provided by the management. At
the outset, the consultant is worried about the seeming
decrease in sales in the recent past. He wants to confirm
this with not more than a 5% chance of going wrong.
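The check on sales can be framed as a one-tailed test of the mean, with the null hypothesis that average monthly sales are still at least Rs. 14 lakhs. The sketch below assumes that the nine monthly figures behave like a random sample from a roughly normal population and that the quoted standard deviation is the sample standard deviation; the critical value is read from a standard t table.

```python
import math

mu_0 = 14.0      # earlier monthly average sales (Rs. lakhs)
x_bar = 11.0     # average over the last nine months (Rs. lakhs)
s = 4.0          # standard deviation over the same period (Rs. lakhs)
n = 9            # number of months in the sample

# Test statistic for H0: mean >= 14 against H1: mean < 14.
t_stat = (x_bar - mu_0) / (s / math.sqrt(n))
# Lower-tail critical value of t for alpha = 0.05 and n - 1 = 8 degrees
# of freedom, read from a standard t table.
t_critical = -1.860
reject_h0 = t_stat < t_critical

print(f"t = {t_stat:.2f}, critical value = {t_critical}, reject H0: {reject_h0}")
```

With t = -2.25 falling below -1.860, the data would support the consultant's concern at the 5% level under these assumptions.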
Also he is aware of the slow service delivery. In his experi-
ence if customers spend less time in a bakery on a crowded
day, it is always better for the bakery. Faster service will re-
duce the waiting time for the customers awaiting their turn.
He has observed that on weekends a typical customer
spends around 60 minutes in the bakery. However in most of
the competing bakeries the time is not more than 50 minutes
(In his earlier assignment he has studied a majority of com-
peting bakeries and found that the average time was 45 min-
utes, with a standard deviation of 5 minutes). He is contem-
plating whether the bakery has become inefficient under the
new management as compared to competing bakeries.
He wants to meet a few of the staff members chosen ran-
domly. The overall staff distribution is as shown in Exhibit I.
Nine members are supposed to report to him. Given his
passion for mathematics, he is wondering about the probability
of five of the members being from the helpers’ category.
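Because the nine members are drawn without replacement from a fixed staff, this probability can be sketched with the hypergeometric distribution. The staff counts are those of Exhibit I; the assumption that every group of nine is equally likely to be chosen is made here for illustration.

```python
from math import comb

# Staff distribution from Exhibit I.
staff = {"bakers": 5, "service managers": 4, "waiters": 16,
         "helpers": 8, "bakery manager": 1}
total_staff = sum(staff.values())
sample_size = 9
helpers_wanted = 5

helpers = staff["helpers"]
others = total_staff - helpers
# Ways to pick 5 of the helpers and 4 of the others, divided by the
# number of ways to pick any 9 staff members.
p_five_helpers = (
    comb(helpers, helpers_wanted)
    * comb(others, sample_size - helpers_wanted)
    / comb(total_staff, sample_size)
)
print(f"P(exactly 5 helpers among 9) = {p_five_helpers:.4f}")
```

The result is about 1.6%, so such a group would be unusual under purely random selection.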
One of the service managers, who happens to be an accountant,
gives him the data in Exhibit II related to the bakery business
of their firm.
Looking into the data, he contemplates whether or not it is
advisable to offer a larger number of items.
Based on his meeting with the staff, he assesses that a
training program can boost the morale and skills of the team at
the bakery and will help them in improving the service quality.
Accordingly, he designs and conducts a training program for
the staff and collects data on the time taken by the trained
staff to perform various operations, with a view to comparing
it with the performance prior to the training. Exhibit III
shows the data collected on the performance of waiters.
The consultant wants to be 95% sure that his training
program has reduced the time taken by the waiters to perform
operations, so that he gets his pending payment.
The consultant is further promised a reward if the customers’
revisit rate is better than the past rate. After a few months a
survey is carried out by the management and it is found that
out of 200 customers 191 have revisited. Should the manage-
ment offer the reward to the consultant? The consultant has
Staff Members Distribution
Bakers               5
Service managers     4
Waiters             16
Helpers              8
Bakery Manager       1
Exhibit I

Sales Data
Year    Monthly Average Sales    Monthly Average    Total number
        (in lakhs)                                  of items kept
2000    10                       50                 20
2001    10.5                     55                 20
2002    11                       58                 24
2003    11.6                     60                 24
2004    12                       67                 24
2005    12.5                     69                 26
2006    12.8                     70                 26
2007    13.5                     74                 24
2008    14                       78                 24
2009    11                       52                 22
Exhibit II
it will be better if advertising budget is increased. Was he
right?
the consultant – both institutional and personal. Are they
really personal?
Data on Time Taken by Waiters to Service a Customer
Waiter     Time taken before         Time taken after
           training (in minutes)     training (in minutes)
Waiter1    4                         3
Waiter2    4.5                       3.5
Waiter3    4                         3
Waiter4    5                         4
Waiter5    4                         2
Waiter6    5.5                       4
Exhibit III
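Since the same six waiters are measured before and after the training, the data in Exhibit III lend themselves to a paired t-test on the differences. The sketch below assumes the differences are roughly normally distributed and uses a one-tailed test at the 5% level; the critical value is read from a standard t table.

```python
import math
import statistics

# Service times in minutes, from Exhibit III.
before = [4, 4.5, 4, 5, 4, 5.5]
after = [3, 3.5, 3, 4, 2, 4]

# Reduction in service time for each waiter.
differences = [b - a for b, a in zip(before, after)]
mean_diff = statistics.mean(differences)
sd_diff = statistics.stdev(differences)   # sample standard deviation
t_stat = mean_diff / (sd_diff / math.sqrt(len(differences)))
# Upper-tail critical value of t for alpha = 0.05 and 6 - 1 = 5 degrees
# of freedom, read from a standard t table.
t_critical = 2.015
training_helped = t_stat > t_critical

print(f"mean reduction = {mean_diff:.2f} min, t = {t_stat:.2f}, "
      f"significant: {training_helped}")
```

The mean reduction is 1.25 minutes and the t statistic of about 7.32 is well above 2.015, so under these assumptions the reduction is significant at the 5% level.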
SECTION 2
Case Study: KATT: An Outsourcing Company
This case study was written by Sunil Bhardwaj, Professor, Department of Operations & IT, IBS Hyderabad. It is intended to be
used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situation.
The case was prepared from the generalized experiences.
KATT Outsourcing offers a wide range of outsourcing services
including IT outsourcing, back office outsourcing, HR
outsourcing and many others. Today the company handles
many US clients, including Fortune 500 clients, and caters to
clients’ diverse needs in a cost-effective manner.
The company was started by a young man Ashok, B Tech,
who began his career in IT as a service engineer for IT prod-
ucts in a reputed company in 1986. Within a span of three
years, he was made the head of their North and East service
division. After four years, i.e. in 1990, Ashok joined a com-
pany in US.
As per Ashok, “During my days in US, I learnt the best prac-
tices of software development, client service and business
processes and also established a good rapport with major IT
companies in US.” In 1995, Ashok took a decision to come
back to India and start his own company which will meet the
requirements of US companies at a lower price and thus
KATT outsourcing was born.
In 1995,the concept of outsourcing was new to India and to
credibility of the new company and its ability to deliver on the
strict parameters required. But he was able to overcome all
hurdles with a small team of twenty-five employees each at
Gurgaon and Bangalore.
Like any new set-up, KATT Outsourcing also faced a capital
crunch, but Ashok acted as a one-man army and took all the
responsibilities, including HR, accounts, administration and
even service delivery, on his own shoulders. As he would say,
“KATT Outsourcing is my baby; I was responsible for both its
success and failures”.
Over a period of 15 years, KATT Outsourcing has evolved as
one stop outsourcing solution provider, with its activities rang-
ing from call centers to product development and mainte-
nance services.
Owing to rapid growth, the company has recently set-up a
separate marketing and business development department
as well as a separate department for customer satisfaction.
KATT has expanded its operations after considering the
pain-points of its customers. Today KATT has 1500+ techni-
cally qualified engineers and technicians, 40+ service cen-
ters with a presence in 20+ locations across India.
The company statistics show that the largest percentage of
jobs being outsourced is in Information Technology, by
around 28%. The next largest field is human resources tak-
ing 15% of the outsourcing market, followed closely by sales
and marketing outsourcing with 14% and financial services
outsourcing at 11%. The remaining 32% is made up of a vari-
ety of processes, including administrative outsourcing.
Discussions on three themes are getting popular at KATT
nowadays. One is the new office location, second is the cus-
tomer survey and the third is a proposed bid for Microsoft’s
outsourcing contract. The details of some of the related activi-
ties are as follows:
Decision for New Location
The Chennai office of the company has evolved as a major
center over the years. Recently the company has been plan-
ning to shift its Chennai office to a new building in an SEZ
(which is 40 kms. away from the city) to reap various bene-
fits offered to SEZs, besides the spacious building. A survey
is done by the HR department to assess the acceptability of
the plan. 56 employees from the Chennai office were ran-
domly chosen. The employees were asked whether or not
they favored moving to the new location. Following were the
gender- wise responses of the employees. (Exhibit I)
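Whether the response depends on gender can be examined with a chi-square test of independence on the counts in Exhibit I. The sketch below is one possible analysis, not part of the case: the 5% significance level is an assumption, no continuity correction is applied, and the critical value is read from a standard chi-square table.

```python
# Observed counts from Exhibit I: rows = response, columns = (men, women).
observed = {"in favour": (14, 3), "opposed": (8, 31)}

row_totals = {r: sum(c) for r, c in observed.items()}
col_totals = [sum(c[j] for c in observed.values()) for j in range(2)]
n = sum(row_totals.values())

chi_square = 0.0
for response, counts in observed.items():
    for j, obs in enumerate(counts):
        # Expected count if response and gender were independent.
        expected = row_totals[response] * col_totals[j] / n
        chi_square += (obs - expected) ** 2 / expected

# Critical value of chi-square for alpha = 0.05 and (2-1)(2-1) = 1 degree
# of freedom, read from a standard table.
chi_critical = 3.841
dependent = chi_square > chi_critical

print(f"chi-square = {chi_square:.2f}, critical value = {chi_critical}, "
      f"response depends on gender: {dependent}")
```

The statistic comes to about 18.98, far above 3.841, so under these assumptions the opinion on the move is not independent of gender.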
Insights Into the Customers
In order to gain insights of US outsourcing business KATT de-
cided to carry out a customer survey. The company first
needs to understand the reasons for the clients to choose
outsourcing of their business processes. Earlier a similar
study was done. However, it was a qualitative study in which
experts’ opinions were gathered. The study concluded that
organizations that outsource are seeking to realize benefits
the service to the business. This will involve reducing the
scope, defining quality levels, re-pricing, re-negotiation, cost
re-structuring etc. These are approaches to cost economies
through off-shoring called “labor arbitrage” enabled by the
wage gap between industrialized and developing nations.
2.Core Competency: Resources (investment, people, infra-
structure etc.) are focused on developing the core business.
For example often organizations outsource their IT support
to specialized IT services companies.
3.Cost Restructuring: Operating leverage is a measure that
compares fixed costs to variable costs. Outsourcing changes
the balance of this ratio by offering a move from fixed to vari-
able cost and also by making variable costs more predict-
able.
4.Best Practices: Operational best practices which would
be too difficult or time consuming to develop in-house.
5.Binding Performance Contract: Services will be provided to a
legally binding contract with financial penalties and legal
redress. This is not the case with internal services.

Results of Employees Survey
             Men    Women    Total
In favour    14     3        17
Opposed      8      31       39
Total        22     34       56
Exhibit I
6.Quality Improvement: Achieve a step change in quality
through contracting out the service with a new service level
agreement.
tainable source of skills, in particular in science and engineer-
ing.
8.Management of Capacity: An improved method of capacity
management of services and technology where the risk in pro-
viding the excess capacity is borne by the supplier.
9.Catalyst for Change: An organization can use an out-
sourcing agreement as a catalyst for major step change that
can not be achieved alone. The outsourcer becomes a Change
agent in the process.
10. Reduce time to Market: The acceleration of the develop-
ment or production of a product through the additional capabil-
ity brought by the supplier.
However, the first five aspects were considered to be the most
important by the experts. These were studied in detail. The survey
resulted in a lot of data collection. For example, Exhibit II
shows the percentage of 200 respondents (each representing
a separate firm) giving a particular rating to the importance of a
particular reason for outsourcing (1 = least important, 5 = most
important).
Another set of data collected (Exhibit III) related to the extent of
cost savings achieved by firms of different sizes (all KATT cli-
ents). Size was defined as Small, Medium and Large depend-
ing on the turnover of the client companies.
KATT is also examining the impact of size on the product devel-
opment collaboration with the outsourcing partners (Exhibit IV).
Results of the Customers’ Survey: Number of Responses
Ratings/Reasons       1     2     3     4     5     Total Respondents
Cost                  10    10    20    60    100   200
Core Competency       20    20    20    50    90    200
Cost Restructuring    10    10    20    70    90    200
Best Practices        20    30    20    50    80    200
Contract              30    30    10    50    70    200
Exhibit II
Another set of data gives the breakup of KATT’s monthly revenue by region (Exhibit V).
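Whether the three regional revenue series differ on average can be examined with a one-way analysis of variance. The following is a minimal Python sketch and not the case's prescribed method: it assumes the months are independent observations with roughly equal variances in each region, and the names `region_1` to `region_3` are placeholders because the exhibit does not name the regions.

```python
# One-way ANOVA on the Exhibit V revenues (in $ millions).
# Region labels are placeholders: the exhibit does not name the regions.
region_1 = [6.1, 4.3, 7.2, 5.5, 5.9, 6.8, 5.3, 4.9, 6.1, 7.0, 4.3, 6.0]
region_2 = [4.9, 5.7, 5.7, 6.3, 6.0, 4.2, 5.8, 5.4, 6.7, 4.0, 5.9, 5.9]
region_3 = [5.0, 4.7, 4.0, 5.2, 6.0, 3.9, 4.2, 4.9, 5.0, 3.7, 4.2, 4.5]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def one_way_anova(groups):
    """Return (F statistic, df between groups, df within groups)."""
    observations = [x for group in groups for x in group]
    grand_mean = mean(observations)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(observations) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df_between, df_within = one_way_anova([region_1, region_2, region_3])
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

With these figures the statistic comes out at about 6.9 on (2, 33) degrees of freedom. It can be compared with the tabulated 5% critical value of F (roughly 3.3) to judge whether the regional means differ.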
Bid for Microsoft Outsourcing Contract
KATT is trying to decide whether or not to bid for an outsourcing contract with Microsoft. It is estimated that merely preparing the bid will cost Rs 2 lakh. Past data reveal that there is a 50% chance that an Indian company like KATT will be short-listed (otherwise its bid will be rejected).
Once short-listed, KATT has to furnish further detailed information and prove its competence in handling the project. This may involve expenses as high as Rs 1 lakh. After this stage its bid will either be accepted or rejected.
Number of Firms by Extent of Savings and Firm Size

| Extent of cost savings by outsourcing | Small firms (75) | Medium firms (75) | Large firms (50) |
|---|---|---|---|
| 0% | 8 | 7 | 0 |
| 1-10% | 22 | 15 | 8 |
| 11-20% | 15 | 23 | 13 |
| 20-30% | 15 | 16 | 12 |
| 30% & above | 15 | 14 | 17 |

Exhibit III

Product Development Collaboration and size: percentages of responses by client firms

| Response | Small | Medium | Large |
|---|---|---|---|
| Yes | 56% | 60% | 76% |
| No | 44% | 40% | 24% |
| Total (number of firms) | 75 | 75 | 50 |

Exhibit IV

KATT’s Monthly Revenues by Regions

| Month | Monthly revenue (in \$ millions), one column per region | | |
|---|---|---|---|
| 1 | 6.1 | 4.9 | 5.0 |
| 2 | 4.3 | 5.7 | 4.7 |
| 3 | 7.2 | 5.7 | 4.0 |
| 4 | 5.5 | 6.3 | 5.2 |
| 5 | 5.9 | 6.0 | 6.0 |
| 6 | 6.8 | 4.2 | 3.9 |
| 7 | 5.3 | 5.8 | 4.2 |
| 8 | 4.9 | 5.4 | 4.9 |
| 9 | 6.1 | 6.7 | 5.0 |
| 10 | 7.0 | 4.0 | 3.7 |
| 11 | 4.3 | 5.9 | 4.2 |
| 12 | 6.0 | 5.9 | 4.5 |

Exhibit V

The company estimates that the labour and material costs associated with the contract are Rs 10 lakhs. It is considering three possible bid prices, namely Rs 15 lakhs, Rs 17 lakhs and Rs 19 lakhs, and estimates that the probabilities of these bids being accepted (once it has been short-listed) are 0.90, 0.75 and 0.35 respectively.
A consultant is hired to assess the above deal. His opinion is not
to bid for the project.
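The bid decision can be laid out as a small decision tree and rolled back using expected monetary value (EMV). The sketch below is illustrative and is not the consultant's method: it assumes KATT is risk-neutral, takes the short-listing expense at its stated upper limit of Rs 1 lakh, and treats a rejected bid as earning nothing. All amounts are in Rs lakh, and the function and constant names are hypothetical.

```python
# Expected monetary value (EMV) of the bid decision, in Rs lakh.
# Assumptions (not stated in the case): risk-neutral decision maker,
# short-listing expense at its upper bound, rejected bids earn nothing.

PREPARATION_COST = 2.0   # cost of preparing the bid
SHORTLIST_PROB = 0.5     # chance of being short-listed
SHORTLIST_COST = 1.0     # further expense once short-listed (upper bound)
CONTRACT_COST = 10.0     # labour and material cost if the bid is accepted

# Bid price -> probability of acceptance once short-listed.
ACCEPTANCE_PROB = {15.0: 0.90, 17.0: 0.75, 19.0: 0.35}


def emv_after_shortlist(price):
    """EMV at the short-listed node for one bid price."""
    margin = price - CONTRACT_COST
    return ACCEPTANCE_PROB[price] * margin - SHORTLIST_COST


def emv_of_bidding():
    """Roll the tree back and return (best bid price, EMV of bidding)."""
    best_price = max(ACCEPTANCE_PROB, key=emv_after_shortlist)
    # After short-listing, KATT would continue only if that beats stopping.
    value_if_shortlisted = max(emv_after_shortlist(best_price), 0.0)
    emv = SHORTLIST_PROB * value_if_shortlisted - PREPARATION_COST
    return best_price, emv


best_price, emv = emv_of_bidding()
print(best_price, emv)
```

Under these assumptions the best bid price is Rs 17 lakhs and the EMV of bidding is Rs 0.125 lakh, only marginally above the zero EMV of not bidding. A small change in the cost or probability estimates, or any aversion to risk, could therefore reverse the recommendation.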
Questions for Discussion:
1. A survey of 150 outsourcing companies shows that the largest percentage (about 25%) of jobs outsourced to India is in the area of Information Technology. Can Ashok safely conclude that KATT’s share of IT business is significantly above the industry average?
2. Refer to Exhibit I and answer:
   a. What is the probability that a randomly selected resident is a man and is in favor of the new building?
   b. What is the probability that a randomly selected resident is a man?
   c. What is the probability that a randomly selected resident is in favor of the new building?
   d. What is the probability that a randomly selected resident is a man or is in favor of the new building?
   e. A randomly selected resident turns out to be male. Compute the probability that he is in favor of the new building.
3. Refer to Exhibit II. Can Ashok conclude that all five reasons are considered to be of equal importance by the respondents?
4. Refer to Exhibit III. Can Ashok conclude that savings from outsourcing are independent of company size?
5. Refer to Exhibit IV. Is product development collaboration related to the size of the firm?
6. Refer to Exhibit V. Are the monthly revenues of KATT similar across the regions?
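Questions 4 and 5 both ask whether a categorical outcome is related to firm size, which is the setting of a chi-square test of independence. The sketch below is one possible way to compute the statistic in Python, not a prescribed solution. It assumes the Exhibit IV percentages can be converted to counts using group sizes of 75, 75 and 50 (giving 42, 45 and 38 "Yes" responses), and the 5% critical values are taken from a standard chi-square table.

```python
# Chi-square test of independence for a table of observed counts.

def chi_square_statistic(observed):
    """Return (chi-square statistic, degrees of freedom)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (count - expected) ** 2 / expected
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, dof


# Exhibit III: rows are savings bands, columns are Small, Medium, Large.
savings_by_size = [
    [8, 7, 0],
    [22, 15, 8],
    [15, 23, 13],
    [15, 16, 12],
    [15, 14, 17],
]

# Exhibit IV: percentages converted to counts (assumed group sizes 75, 75, 50).
collaboration_by_size = [
    [42, 45, 38],   # Yes
    [33, 30, 12],   # No
]

# 5% critical values from a standard chi-square table, keyed by df.
CRITICAL_5PCT = {2: 5.991, 8: 15.507}

for name, table in [("Exhibit III", savings_by_size),
                    ("Exhibit IV", collaboration_by_size)]:
    stat, dof = chi_square_statistic(table)
    verdict = "reject" if stat > CRITICAL_5PCT[dof] else "do not reject"
    print(f"{name}: chi-square = {stat:.2f}, df = {dof}, {verdict} independence")
```

Under these assumptions the statistics come out at roughly 13.2 (8 degrees of freedom) and 5.4 (2 degrees of freedom), both below the tabulated 5% critical values, so neither test on its own rejects independence at that level. One expected count in Exhibit III falls below 5, so the chi-square approximation there should be treated with caution.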