
Quantitative Methods

This document is authorized for internal use only at IBS campuses - Batch of 2012-2014 - Semester I. No part of this publication may be reproduced, stored in a retrieval system, used in a spreadsheet, or transmitted in any form or by any means - electronic, mechanical, photocopying or otherwise - without prior permission in writing from IBS Hyderabad.

CHAPTER 1

Introduction

In this chapter we will discuss

Introduction to Statistics

Data, Measurement and Scales

Case Study: College Canteen’s Decreasing Sales: Analysis Dilemmas

Section 1

Introduction to Statistics

What is Statistics?

Let us look at the following facts:

India’s GDP grew at 6.9% during 2011-12.

India’s exports during the financial year 2011-12 amounted to $300 billion.

The BSE Sensex was 17094.51 points at the close of the market on 13th April 2012.

Tata Motors reported a “profit after tax” of Rs.25.71 billion for the financial year 2009-10.

Total irrigated land in Andhra Pradesh was 4.4 million hectares in the year 2012.

In all the above examples, reference is made to some kind

of data. “Statistics”, in common parlance, is understood as

data relating to some aspects of an individual or item or

unit. The individuals could be people, companies or

economies while the data could pertain to a certain time

period.

In Italian “stato” means state and “statista” refers to the

person involved with the administration of state. Born out

of a combination of these two words, “statistics” originally

meant collection of facts useful to the state. Records of

land, population, etc., have been maintained for long for

official purposes by the governments/rulers across the

globe. However, the formal term was introduced only in

the 18th century.

The modern meaning of statistics is somewhat different

from the above meaning (though the word is very much

used in the sense of data even today). Clearly, in the past

also, the interest in various records was with a view to use

them for better future predictions and planning. Today, the

discipline of statistics is about transforming data into

useful information for decision makers. Thanks to the development of mathematical tools and powerful computers, statistics has emerged as an even stronger discipline in its own right. One may also define statistics as the study of uncertainty.

In general, statistics can be broadly divided into descriptive statistics and inferential statistics. Descriptive statistics deals with the collection of data on one or a few characteristics and its use in profiling the individuals or units to whom the data pertains. For instance, if income data is collected on a sample of individuals in a city, the data may be summarized in the form of tables and graphs to better understand the income status of the sampled residents. However, if we wish to estimate the average income of all the residents of the city, we need inferential statistics, i.e., the statistical tools that enable us to generalize beyond the sample. These generalizations are made with a probability attached to them.
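The distinction can be sketched in a few lines of Python. The income figures below are hypothetical, invented purely for illustration, and the 95% interval relies on a normal approximation, which is only one of several ways of attaching a probability to a generalization (and a rough one for so small a sample).

```python
import math
import statistics

# Hypothetical monthly incomes (Rs. thousand) for a sample of 10 residents.
sample_incomes = [18, 22, 25, 27, 30, 32, 35, 40, 45, 60]

# Descriptive statistics: summarize the sampled residents only.
sample_mean = statistics.mean(sample_incomes)
sample_sd = statistics.stdev(sample_incomes)  # sample standard deviation

# Inferential statistics: estimate the city-wide average income with an
# approximate 95% confidence interval (normal approximation, an assumption).
z = statistics.NormalDist().inv_cdf(0.975)  # about 1.96
margin = z * sample_sd / math.sqrt(len(sample_incomes))
interval = (sample_mean - margin, sample_mean + margin)

print(f"Sample mean: {sample_mean:.1f}")
print(f"Approximate 95% interval for the city mean: "
      f"{interval[0]:.1f} to {interval[1]:.1f}")
```

The first two quantities describe the sample itself; the interval is a statement about the whole city, and it is the part that carries uncertainty.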

Why Statistics?

“Converting raw data to useful information for decision-making” is the essence of statistics. This skill is essential for any business manager who is under constant pressure to make decisions, often with incomplete and imperfect information. Probabilistic guidance for decision-making is superior to intuition and hunches, as it gives a measurable indication of the uncertainty. Thus, knowledge of statistics helps managers make informed decisions.

A manager is often required to:

Summarize the data handled in his or her work.

Play a leadership role in a statistical study, either by conducting it or by liaising with consultants.

Such responsibilities call for an understanding of basic statistical concepts.

Managerial Applications of Statistics

In today’s globalized, computerized and Internet-enabled world, there is an abundance of data, whether at the micro level, the macro level or the organizational level. The challenge is to convert this data into useful information which can be used by the organization. Several application areas are listed below:

With the availability of point-of-purchase data obtained through electronic scanners at supermarkets, marketing managers can derive valuable information about buying behavior, which could be useful for future planning, product positioning and marketing.

Quality control in production processes using statistical control charts is another well-known application.

Comparing the movement of individual stocks with the stock market averages is another important statistical application in financial analysis.

Auditors and tax authorities often use a sampling approach to verify accounts and, based on the accuracy of the sample, draw conclusions about the entire lot.

Economic forecasts are often obtained by applying statistical tools to past data under certain assumptions and conditions.

Thus statistics is a useful tool in business and economic analysis.

Video 1.1.1: Should Managers study Statistics?

Section 2

Data, Measurement and Scales

Data and Measurement

The term “data” refers to the information collected on the characteristic of interest on an individual or item. The characteristic on which data is collected is termed a variable. The data can be quantitative or qualitative (categorical). Consider a financial analyst collecting the closing equity prices of all FMCG companies in India as of 30th April 2012. This would be an example of quantitative data, with closing equity price being the variable. In contrast, consider a market researcher (in the US) who believes that ethnic background influences the purchase behavior of an FMCG product and hence records data on ethnic background in five categories (White, Black, Asian, Hispanic and others), along with data on the amount spent on the products. In this case, the variable “ethnic background” is qualitative in nature, whereas the variable “amount spent” is quantitative.

Scales of Measurement

The point to note above is that we had variables and we had a way of measuring them. Clearly, the way of measurement should be precisely defined for each characteristic. In general, the data on any characteristic is collected using one of the following scales of measurement: Nominal, Ordinal, Interval and Ratio.

Nominal scale: Observations are labeled so that they fall into different categories such as color of the eyes, social group/occupation, housing type, gender and so on. Any number used in a nominal scale is a category label only and no mathematical operation can be performed on it because its assignment to the category is arbitrary. Consider, for example, a list of the names of students in a class:

1. Anita
2. Arjun
3. Aruna
4. Kanti
5. Krishna
6. Mallika
7. Radha
8. Radhika
9. Srisha
10. Tarun
and so on.

This list represents only names and therefore has none of the three qualities (magnitude, equal interval or absolute zero). The numbers next to the names are used for convenience only and simply label groups or classes. For example, when filling in a form you may be asked to state your gender as 1 if male or 2 if female, or the color of your eyes as 1 if blue, 2 if green or 3 if brown. The only permissible mathematical operation for this kind of nominal data is counting. Ethnicity and gender are examples of variables that would be measured on a nominal scale, and the numbers assigned to the different categories are arbitrary.

Table 1.2.1

| Scale of Measurement | Scale qualities | Measurement principles | Examples | Permissible operations |
|---|---|---|---|---|
| Nominal | None | People or objects with the same scale value are the same on some attribute. The values of the scale have no ‘numeric’ meaning in the way that you usually think about numbers. | Names, lists of words, gender, ethnicity, marital status | Counting |
| Ordinal | Magnitude | People or objects with a higher scale value have more of some attribute. The intervals between adjacent scale values are indeterminate. Scale assignment is by the property of “greater than,” “equal to,” or “less than”. | Anything rank ordered | Greater than or less than operations |
| Interval | Magnitude, equal intervals | Intervals between adjacent scale values are equal with respect to the attribute being measured. | Temperature, most personality measures, WAIS intelligence score | Addition and subtraction of scale values |
| Ratio | Magnitude, equal intervals, absolute zero | There is a rational zero point for the scale. Ratios are equivalent, e.g. the ratio of 2 to 1 is the same as the ratio of 8 to 4. | Age, height, weight, percentage, etc. | Multiplication and division of scale values |

Ordinal scale: The categories that make up this scale are ranked in terms of magnitude. Observations are put into categories that can be ranked in some order, such as from highest to lowest. Examples are wealthy, middle-class and poor neighborhoods; expensive, moderate and cheap restaurants; or a product ranked by customers as best = 1, second best = 2 and so on. The rankings do not tell us how large the difference between the wealthy and the middle-class is, as this scale has neither equal intervals nor an absolute zero. No precise value can be assigned to a difference between ranks. (At what point, for instance, does “middle-class” become “wealthy”?)

Interval scale: The third type of scale is called an interval scale. It possesses both magnitude and equal intervals, but no absolute zero. For example, the difference between 1 and 2 is the same as the difference between 99 and 100. In the interval scale, the categories have a meaningful unit of distance separating them. A classic example of an interval scale is temperature: each degree is the same distance apart and we can easily tell whether one temperature is greater than, equal to, or less than another, but we cannot really say that 20°C is twice as hot as 10°C, because temperature has no absolute zero. If the thermometer records that the temperature outdoors is zero, it does not mean that there is no temperature.

Ratio scale: The fourth scale of measurement is the ratio scale. A ratio scale has all three qualities: magnitude, equal intervals and an absolute zero. Statisticians often prefer this scale because the data can be more easily analyzed. Height, weight, age and the percentage of people who pass can be measured on a ratio scale. For example, if you are 20 years old, you not only know that you are older than your sister who is 15 years old (magnitude), but you also know that you are five years older than her (equal intervals). A ratio scale also has a point at which none of the measured quantity exists (absolute zero); for example, when a person is born his or her age is zero.

The scales of measurement, the scale qualities, measurement

principles, their examples and permissible operations are given

in a tabular form in table 1.2.1 for easy understanding and

meaningful comparison.
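The permissible operations for each scale can be illustrated with a short Python sketch. The eye-color codes and beverage ranks are hypothetical values chosen for illustration; the temperatures and ages are the ones used in the text above.

```python
from collections import Counter

# Nominal: codes are labels only, so counting is the only meaningful operation.
eye_color_codes = [1, 3, 3, 2, 3, 1]  # 1 = blue, 2 = green, 3 = brown
nominal_counts = Counter(eye_color_codes)

# Ordinal: ranks can be compared, but differences between ranks mean nothing.
rank_pepsi, rank_tea = 1, 3  # 1 = most preferred (hypothetical ranks)
pepsi_preferred = rank_pepsi < rank_tea

# Interval: differences are meaningful, ratios are not (no absolute zero).
temp_morning_c, temp_noon_c = 10, 20
temp_difference = temp_noon_c - temp_morning_c

# Ratio: an absolute zero exists, so ratios are meaningful as well.
age_you, age_sister = 20, 15
age_difference = age_you - age_sister
age_ratio = age_you / age_sister

print(nominal_counts, pepsi_preferred, temp_difference, age_difference, age_ratio)
```

Nothing in the language stops one from averaging the eye-color codes; the scale of measurement is what tells the analyst that such an average would be meaningless.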

Equipped with an understanding of the different types of data, we now proceed to the next major objective of statistical method, that is, to organize and summarize the gathered quantitative data in order to understand it better. The first step in organizing data is to tabulate the scores into a frequency distribution. In this chapter we will focus on the statistical concepts of frequency distribution, the computation of the mean, median and mode, variance and standard deviation, and then move on to understanding correlation.

Advantages of Scaling Techniques

Scaling is useful in a number of ways. It improves objectivity. The matter under study can be expressed accurately. Even small variations can be known with accuracy. Scaling makes the matter concise: a lot of material is expressed in brief, to-the-point numbers. Scaling facilitates standardization. The findings can be replicated elsewhere. Precision facilitates comparison, provided the scale possesses the required qualities.

Keynote 1.2.1: Scales and measurements

Section 3

Case Study: College Canteen’s Decreasing Sales: Analysis Dilemmas

This case study was written by Thalluri Prashnath Vidya Sagar, under the direction of R Muthukumar IBSCDC. It is intended to be

used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situation. The

case was prepared from generalized experiences.

One fine morning, Raghu, the owner-manager of a college canteen, was thinking seriously about his business. He sells a variety of fast food items and beverages. Ramesh, a friend who is also one of his suppliers, came to meet him to discuss a pending payment and further supplies. As the canteen had not been doing well over the past few months, Raghu wanted to identify where he was going wrong. Ramesh suggested that he conduct a survey on the sales of beverages. So Raghu randomly selected a sample of 60 students comprising 38 male and 22 female students. The students were asked to fill in a comment/feedback form. Raghu believed that this survey would help the team to better understand its customers’ needs and serve them better.

He decided to use some statistical measures to assess the following information obtained from the forms:

Name, age, gender and phone number

Impressions on the service offered by canteen employees

Preference of beverages

Amount spent on beverages.

After collecting the data through the feedback forms, he computed simple statistical measures to analyze it. First, he divided the entire sample into two broad categories based on gender, i.e., male and female, assigning the number 1 to male and 2 to female.

To find out the actual interests of the students with respect to the beverages (including the brewed ones), the students were asked to rank the four beverages based on their preferences. They had to rank their strongest preference as ‘1’ and their lowest preference as ‘4’. After tabulating the data, he presented the results in the form of tables (Exhibits II (a) and II (b)).

Raghu analyzed his percentage of profits against the sales of the beverages, namely Pepsi, Coke, Coffee and Tea. He also observed that most of the students prefer the cold beverages, particularly Pepsi. He tabulated his observations (Exhibit III).

Exhibit I
Coding of Broad Categories of Students
[Bar chart: number of students in each category, Male (1) and Female (2)]

He came to understand that more students prefer Pepsi than any other beverage. He also wanted to assess the service quality of his staff, believing that this would help him to improve the quality of service. Respondents were asked to state their degree of agreement or disagreement with a statement by selecting a response from a list such as the following one:

1. Agree very strongly, 2. Agree fairly strongly, 3. Agree, 4. Undecided, 5. Disagree, 6. Disagree fairly strongly, and 7. Disagree very strongly (Exhibits IV (a) and IV (b)).

With all his observations, the canteen manager wants to implement certain measures to increase sales by improving his product mix and marketing mix, so as to get maximum profit without investing in new ventures.

What is the significance of the given data in statistics? In what way will the data help him in his analysis?

Exhibit II (a)
Ranking of Preferences of Beverages by Students

| Beverage | Student 1 | Student 2 | Student 3 | Student 4 | Student 5 | Student 6 | Student 7 | ... |
|---|---|---|---|---|---|---|---|---|
| Pepsi | 1 | 3 | 1 | 3 | 1 | 1 | 1 | |
| Coke | 2 | 4 | 3 | 2 | 1 | 4 | 3 | |
| Coffee | 4 | 2 | 4 | 1 | 3 | 3 | 4 | |
| Tea | 3 | 1 | 2 | 4 | 2 | 2 | 2 | |

Prepared by the author

Exhibit II (b)
Students’ First Preferences of the Beverages

| Rank | Beverages | Frequency | % |
|---|---|---|---|
| 1 | Pepsi | 18 | 30.0 |
| 4 | Coke | 12 | 20.0 |
| 2 | Coffee | 15 | 25.0 |
| 2 | Tea | 15 | 25.0 |
| | Total | 60 | 100.0 |

Prepared by author
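The percentages and ranks in Exhibit II (b) can be reproduced from the frequencies alone. The sketch below is illustrative only; the ranking rule used (tied beverages share a rank and the next rank is skipped) is an assumption that happens to match the ranks shown in the exhibit.

```python
# First-preference counts taken from Exhibit II (b).
first_preferences = {"Pepsi": 18, "Coke": 12, "Coffee": 15, "Tea": 15}

total = sum(first_preferences.values())

# Percentage share of each beverage, rounded to one decimal place.
percentages = {name: round(100 * count / total, 1)
               for name, count in first_preferences.items()}

# Assumed ranking rule: a beverage's rank is one more than the number of
# beverages with a strictly higher frequency, so ties share a rank.
ranks = {name: 1 + sum(other > count for other in first_preferences.values())
         for name, count in first_preferences.items()}

for name in first_preferences:
    print(ranks[name], name, first_preferences[name], percentages[name])
```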

Exhibit III
Students’ First Preferences of the Beverages

| Beverages/Sales | Gender: Male (1) | Gender: Female (2) | % Profit Margin |
|---|---|---|---|
| Pepsi | 10 | 8 | 18 |
| Coke | 8 | 4 | 12 |
| Coffee | 9 | 6 | 15 |
| Tea | 11 | 4 | 15 |
| Total | 38 | 22 | |

Prepared by author

Exhibit IV (b)
Student’s Preferences of the Quality of Service

| Assigned Codes for Quality of Service | Frequency | % |
|---|---|---|
| 1. Agree very strongly | 10 | 16.7 |
| 2. Agree fairly strongly | 15 | 25.0 |
| 3. Agree | 17 | 28.3 |
| 4. Undecided | 15 | 25.0 |
| 5. Disagree | 3 | 5.0 |
| 6. Disagree fairly strongly | 0 | 0 |
| 7. Disagree very strongly | 0 | 0 |
| Total | 60 | 100 |

Prepared by author

Exhibit IV (a)
Student’s Response Towards Quality of Service

| | Student 1 | Student 2 | Student 3 | Student 4 | Student 5 | Student 6 | Student 7 | Student 8 | Student 9 | Student 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Quality of service | 1 | 3 | 1 | 5 | 2 | 3 | 2 | 4 | 3 | 3 |

Prepared by author

CHAPTER 2

Arranging Data

In this chapter we will discuss

Arranging Data: Why and How?

Section 1

Arranging Data: Why and How?

Arranging Data: Why?

In business, statistics is used to study the demand and

market characteristics of the product or service being sold.

In fact, market research has evolved into a separate

discipline. The planning process, whereby the firm seeks to

match its future activities with expected future conditions

and developments, can be facilitated by the use of

statistical probability. In the performance evaluation of

personnel, machinery, departments, etc., measures of

central tendency and dispersion can be used to provide a

certain degree of objectivity. In the field of finance, statistics

can be used to reveal long-term trends and seasonal

variations in sales, expenses and incomes. Statistics is

useful in the management of inventories and receivables. In

the management of investments, statistics can be used to

determine the alternative that provides the highest return

per unit risk. Statistics can also be used to test the validity

of various tools that are said to be useful in investment

selection.

Firms often study their profits over the years and attempt to

find clues for future performance. Investors compare

expected rates of return on various investment alternatives

to determine where to place their money. Merchant bankers

study the projected profits of their client companies to

advise on the right price at which equity issues may be

made. Credit rating agencies consider various factors

related to the creditworthiness of issuers of debt in order to

estimate the likelihood of default.

In the above mentioned cases, statistics is used to examine

real life situations for a description and assessment of what

is happening and to obtain some pointers to an uncertain

future. Statistics is all about number crunching and the

ultimate number cruncher – the computer – has placed

statistics in the center spot in today’s business environment.


COMPILE, COMPARE, CONCLUDE

Statistics may be used to reveal, conceal, guide and misguide. PCS

Data Products Ltd. is engaged in the manufacture of computer

hardware and copper clad laminates. In 2002-03 its sales were Rs.

29.86 crore, double the previous year’s sales. Suppose another

company Shady Ltd. (an imaginary company) claims that it has

outperformed PCS Data Products because its sales increased by

200% whereas PCS sales increased by only 100%. Such a statement

should be treated with extreme caution. For example, Shady Ltd. may

have had very low sales in the previous year, say sales of Rs.10,000

only. If they increased to Rs.30,000 in 2002-03, the growth rate would

be 200% which is higher than the growth rate of PCS. However, at

Shady’s low level of sales, such a high growth rate is totally

unimpressive as compared with the growth rate of PCS. In general,

when comparing growth rates, it is always useful to keep in view the

amounts involved. Otherwise, even a growth from zero to Re.1 can be

claimed to be an infinite growth rate! Suppose Shady Ltd. wants to

make a public issue of equity shares. In deciding whether to invest in

Shady’s shares, the public would consider the profits earned by it. If

Shady has incurred a loss, the public would be reluctant to take up its

shares. In such a case Shady may extend its current accounting year

to a period of say 15 months in the hope of covering up the loss with

the earnings in the additional three months. It may also use various

window-dressing measures to inflate its profits so that it displays a

higher profitability than PCS. In such cases the data provided by

Shady cannot be compared with the data provided by PCS. A

mediocre company like Shady is likely to produce mediocre products.

Hence it would resort to fair and unfair means to sell its product. Here

too statistics may be used to distort the truth. For example, Shady may

claim that on the basis of a survey it was found that Shady’s products

were considered the best available. The truth could be that the survey

covered only friends and relatives of Shady’s management.
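The growth-rate comparison above is easy to check numerically. The following Python sketch uses the figures quoted in the box; the helper name `growth_rate_percent` is hypothetical, and the previous year's PCS sales are inferred from the statement that sales doubled.

```python
def growth_rate_percent(previous, current):
    """Percentage growth from previous to current; undefined for a zero base."""
    if previous == 0:
        raise ValueError("growth rate is undefined for a zero base")
    return 100 * (current - previous) / previous

# PCS: sales of Rs. 29.86 crore, double the previous year's sales.
CRORE = 10_000_000
pcs_current = 29.86 * CRORE
pcs_previous = pcs_current / 2

# Shady Ltd. (imaginary): sales of Rs. 10,000 growing to Rs. 30,000.
shady_previous, shady_current = 10_000, 30_000

pcs_growth = growth_rate_percent(pcs_previous, pcs_current)
shady_growth = growth_rate_percent(shady_previous, shady_current)

# Absolute increases put the two growth rates in perspective.
pcs_increase = pcs_current - pcs_previous
shady_increase = shady_current - shady_previous

print(f"PCS:   {pcs_growth:.0f}% growth, increase of Rs. {pcs_increase:,.0f}")
print(f"Shady: {shady_growth:.0f}% growth, increase of Rs. {shady_increase:,.0f}")
```

Reporting the absolute increase next to the percentage is one simple way of keeping the amounts involved in view. The zero-base check reflects the remark that growth from zero cannot be expressed as a finite rate.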

INFLATION

Inflation is a general increase in the price of goods and services. The inflation rate, as measured by the Consumer Price Index (CPI), was 9.9% in 20x2-x3 and is expected to be 8% in 20x3-x4. This does not mean that prices will be lower in 20x3-x4. It merely means that the general increase in prices will be lower. An example will clarify the point.

The prices of certain items are included in calculating the CPI. If a given quantity of these items cost Rs.10,000 at the beginning of 20x2-x3, then at the end of 20x2-x3 they would cost Rs.10,990, which is 9.9% more. Further, at the end of 20x3-x4 they would be expected to cost Rs.11,869, which is 8% more than the cost at the beginning of that year. Clearly the prices have not fallen; only the rate of increase has slowed down.
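The arithmetic behind these figures is simple compounding, sketched below in Python. The function name `price_path` is hypothetical, and costs are rounded to the nearest rupee as in the text.

```python
def price_path(start_cost, yearly_inflation_percent):
    """Cost of the same basket at the end of each year, compounding inflation."""
    costs = []
    cost = start_cost
    for rate in yearly_inflation_percent:
        cost = cost * (1 + rate / 100)
        costs.append(round(cost))
    return costs

# Rs. 10,000 basket, 9.9% inflation in 20x2-x3 and 8% expected in 20x3-x4.
costs = price_path(10_000, [9.9, 8])
print(costs)  # [10990, 11869]
```

Each year's increase is applied to the previous year's cost, which is why the cost keeps rising even though the rate falls from 9.9% to 8%.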

Figure 2.1.1: Inflation as Reflected by the Cost of Items Worth Rs.10,000 on 1st April 1992

Arranging Data : How?

Frequency Distributions

We will discuss this through the following example.

However, before we do that, we wish to differentiate

between raw data and processed data. Raw data is

information before it is processed and/or analyzed.

Processed data is information presented in a form so that

the reader can draw valid conclusions from it.

Example 2.1.1

The following table 2.1.1 lists the supposed share prices of

30 companies:

The presentation of data in this form requires a great deal

of space. If you refer to the newspaper pages which report

share prices of all shares traded on the previous day, you

will see that a wide space is covered. The above method of

presentation also does not allow one to quickly determine

the answers to the following types of questions:

What is the minimum share price among those given?

What is the maximum share price among those given?

Are the share prices evenly spread between the minimum

and maximum values? If not, are they concentrated in any

interval?

We can improve upon the above presentation of data by creating an array in which the prices are arranged in ascending or descending order. Table 2.1.2 gives an ascending array of the data in table 2.1.1.

Now, the questions posed earlier can be answered more quickly, but the data still covers the same amount of space. Besides, without the assistance of a computer, the sorting work involved in preparing the array is quite laborious. One would have to repeatedly scan through the data to determine the lowest share price, then the next lowest share price and so on.

A more concise way to present the above data would be a frequency distribution. A frequency distribution of the above data is given in table 2.1.3.

Table 2.1.1

| Company | Rs. | Company | Rs. |
|---|---|---|---|
| ACC | 1690.00 | Indian Hotels | 420.00 |
| Ballarpur | 155.00 | ITC | 441.25 |
| Bharat Forge | 158.75 | Kirloskar Cummins | 305.00 |
| Bombay Dyeing | 236.25 | Larsen & Toubro | 175.00 |
| Ceat | 71.00 | Mahindra & Mahindra | 143.00 |
| Century | 5250.00 | Mukand | 197.50 |
| GE Shipping | 73.75 | Nestle India | 282.50 |
| Glaxo | 200.00 | Peico | 125.00 |
| Grasim | 357.50 | Premier Automobiles | 35.00 |
| Gujarat Fertilizers | 205.00 | Reliance | 191.00 |
| Hindustan Motors | 26.00 | Siemens | 355.00 |
| Hindustan Lever | 350.00 | Tata Power | 870.00 |
| Hindalco | 585.00 | Tata Steel | 147.00 |
| Indian Rayon | 315.00 | Telco | 185.00 |
| Indian Organic | 35.00 | Voltas | 52.50 |

Table 2.1.2

| Company | Rs. | Company | Rs. |
|---|---|---|---|
| Hindustan Motors | 26.00 | Glaxo | 200.00 |
| Indian Organic | 35.00 | Gujarat Fertilizer | 205.00 |
| Premier Automobiles | 35.00 | Bombay Dyeing | 236.25 |
| Voltas | 52.50 | Nestle India | 282.50 |
| Ceat | 71.00 | Kirloskar Cummins | 305.00 |
| GE Shipping | 73.75 | Indian Rayon | 315.00 |
| Peico | 125.00 | Hindustan Lever | 350.00 |
| Mahindra & Mahindra | 143.00 | Siemens | 355.00 |
| Tata Steel | 147.00 | Grasim | 357.50 |
| Ballarpur | 155.00 | Indian Hotels | 420.00 |
| Bharat Forge | 158.75 | ITC | 441.25 |
| Larsen & Toubro | 175.00 | Hindalco | 585.00 |
| Telco | 185.00 | Tata Power | 870.00 |
| Reliance | 191.00 | ACC | 1690.00 |
| Mukand | 197.50 | Century | 5250.00 |

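Table 2.1.3

| Class Interval Share Price (Rs.) | Tally Marks | Frequency |
|---|---|---|
| 20-895 | IIII/ IIII/ IIII/ IIII/ IIII/ III | 28 |
| 895-1770 | I | 1 |
| 1770-2645 | | 0 |
| 2645-3520 | | 0 |
| 3520-4395 | | 0 |
| 4395-5270 | I | 1 |
| Total | | 30 |

(IIII/ denotes a crossed bundle of five tally marks.)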

Notes:

1. There are no hard and fast rules regarding the number and size of class intervals. However, the following guidelines are to be followed:

a. Every item of data or data point (in this case, share price) should be included in one and only one class. Hence:

i. The lowest share price should be included in the first class and the highest share price in the last class.

ii. Adjacent classes should not have intervals in between. For example, we cannot have adjacent classes like

20 – 895
900 – 1775

because neither class would include the data points 896, 897, 898 and 899.

iii. Classes should not overlap. Hence we cannot have classes like

20 – 895
890 – 1765

because the classes overlap and the data points 890, 891, 892, 893 and 894 fall in both classes.

Please note that the classes

20 – 895
895 – 1770

do not overlap because the data point 895 is included only in the class 895 – 1770. Such classes, where the upper limit (895) of one class equals the lower limit (895) of the next class, are called “exclusive” classes because the upper limit of a class is excluded from the class.

We could also have classes of the type

20 – 894.99
895 – 1769.99

These are called “inclusive” classes because the upper limit of each class is included in that class. Also note that there are no intervals between the classes because all data points are rounded off to the nearest paisa. If we had data points like Rs.894.993, then the above inclusive classes would have to be adjusted as

20 – 894.999
895 – 1769.999

b. Class intervals should be of the same length to the extent possible. (The open-ended class in table 2.1.4 is an example where this is not so.)

In order to have the same definition of length for inclusive and exclusive classes, the length of a class interval is defined as the difference between the lower limits of adjacent classes. Hence in the case of the classes

20 – 895 and 895 – 1770, or
20 – 894.99 and 895 – 1769.99

the first class interval has a length of 895 – 20 = Rs.875.

c. The number of classes should usually be between six and fifteen.

d. Subject to (c) above, the number of classes may be equal to the square root of the number of data points. In our example there are 30 observations or data points. Hence the number of classes should be around √30 ≈ 5.5, or 6.

2. The “Tally Marks” are merely a simple way of obtaining all the class frequencies by running through the given data just once. They are usually omitted in the presentation of a frequency distribution.

3. Note that from the original data we were able to construct a frequency distribution. However, given only the frequency distribution we cannot reconstruct the original data. Hence in obtaining a summarized presentation we have lost information like the names of the companies and the exact price of each company’s shares.

4. In the frequency distribution given above (table 2.1.3) it may be noticed that there are zero frequencies for the classes 1770 - 2645, 2645 - 3520 and 3520 - 4395. In fact, these classes have been necessitated by the single data point in the class 4395 - 5270. At the same time, there is overcrowding of data points in the class 20 - 895. To remedy these drawbacks we may use the type of classification shown in table 2.1.4.

However, the frequency distribution in table 2.1.4 violates one of our guidelines, i.e., that all class intervals should be of equal length. This is because the last class “900 and over” has an infinite length. Such classes are called “open-ended classes” because we cannot numerically fix the upper (or in some cases lower) end of the classes.
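The construction of both frequency distributions can be sketched in Python. The function `frequency_distribution` is a hypothetical helper written for this illustration; it uses exclusive classes, so a data point equal to a class limit is counted in the class that starts at that limit.

```python
# Share prices (Rs.) from table 2.1.2.
prices = [26.00, 35.00, 35.00, 52.50, 71.00, 73.75, 125.00, 143.00, 147.00,
          155.00, 158.75, 175.00, 185.00, 191.00, 197.50, 200.00, 205.00,
          236.25, 282.50, 305.00, 315.00, 350.00, 355.00, 357.50, 420.00,
          441.25, 585.00, 870.00, 1690.00, 5250.00]

def frequency_distribution(data, lower_limits, last_upper=None):
    """Count data points in exclusive classes [lower, next lower).

    lower_limits: ascending lower limits of the classes.
    last_upper:   upper limit of the last class; None makes it open-ended.
    """
    counts = [0] * len(lower_limits)
    for x in data:
        if x < lower_limits[0] or (last_upper is not None and x >= last_upper):
            raise ValueError(f"{x} falls outside every class")
        # Index of the last class whose lower limit does not exceed x.
        index = max(i for i, low in enumerate(lower_limits) if low <= x)
        counts[index] += 1
    return counts

# Six equal classes of length 875 starting at 20 (table 2.1.3).
equal_classes = [20 + 875 * i for i in range(6)]
equal_counts = frequency_distribution(prices, equal_classes, last_upper=5270)

# Classes of length 175 with an open-ended last class (table 2.1.4).
open_classes = [25, 200, 375, 550, 725, 900]
open_counts = frequency_distribution(prices, open_classes)

print(min(prices), max(prices))
print(equal_counts)  # [28, 1, 0, 0, 0, 1]
print(open_counts)   # [15, 9, 2, 1, 1, 2]
```

Raising an error for a value outside every class enforces guideline (a): each data point must fall in one and only one class.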

Example 2.1.2

Below are the debt-equity ratios of some companies as

shown in table 2.1.5:

How would you classify these companies in a frequency

distribution according to their debt-equity ratios?

First, you would count the number of observations or data points; there are 17 in all. By the square-root guideline we should have √17, or approximately 4, classes. Let us settle for the minimum of 6 classes. The data points range from 0 for Colgate to 4.4 for Hindustan Motors (refer table 2.1.6).

Hence we can get the frequency distribution shown in table 2.1.7.


Table 2.1.4

| Class Interval: Share Price (Rs.) | Frequency |
|---|---|
| 25-200 | 15 |
| 200-375 | 9 |
| 375-550 | 2 |
| 550-725 | 1 |
| 725-900 | 1 |
| 900 and over | 2 |
| Total | 30 |

Frequency Polygons

Example 2.1.3

The growth of a particular industry may be ascertained by

several indicators. One simple indicator is the growth in sales

as compared with the previous year. Given below (table 2.1.8)

is an industry-wise performance of the corporate sector for two

consecutive years.


Table 2.1.5

| Company | Debt-Equity Ratio |
|---|---|
| Parke Davis | 0.4 |
| Pfizer | 0.9 |
| East India Hotels | 1.4 |
| Reliance Industries | 1.2 |
| NOCIL | 0.8 |
| Videocon | 1.4 |
| Colgate | 0 |
| Essar Shipping | 1.9 |
| Ceat | 2.6 |
| Hindustan Motors | 4.4 |
| Voltas | 2.1 |
| Baroda Rayon | 2.5 |
| ITC | 1.7 |
| GE Shipping | 0.5 |
| Larsen & Toubro | 0.7 |
| TISCO | 1.3 |
| Hindustan Lever | 0.6 |

Table 2.1.7

| Class: Debt-equity Ratio | Tally Marks | Frequency |
|---|---|---|
| 0.00-0.75 | IIII/ | 5 |
| 0.75-1.50 | IIII/ I | 6 |
| 1.50-2.25 | III | 3 |
| 2.25-3.00 | II | 2 |
| 3.00-3.75 | | 0 |
| 3.75-4.50 | I | 1 |
| Total | | 17 |

Table 2.1.6

We can find the minimum class width that would cover the data points by using the formula

$$\text{Class width} = \frac{\text{Maximum Value Data Point} - \text{Minimum Value Data Point}}{\text{Number of Classes}} = \frac{4.4 - 0}{6} = 0.73 \approx 0.75$$
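The steps of Example 2.1.2 can be sketched in Python as follows. This is only an illustrative sketch: the function name `frequency_distribution` and the variable names are our own, and each class is assumed to include its lower limit and exclude its upper limit.

```python
import math

# Debt-equity ratios listed in Table 2.1.5
ratios = [0.4, 0.9, 1.4, 1.2, 0.8, 1.4, 0.0, 1.9, 2.6, 4.4,
          2.1, 2.5, 1.7, 0.5, 0.7, 1.3, 0.6]

def frequency_distribution(data, start, width, num_classes):
    """Count the data points in each class [lower, upper)."""
    counts = [0] * num_classes
    for value in data:
        index = int((value - start) / width)
        index = min(index, num_classes - 1)  # keep the maximum in the last class
        counts[index] += 1
    return counts

# Square-root guideline for the number of classes
suggested_classes = round(math.sqrt(len(ratios)))

# Minimum class width for the 6 classes chosen in the text
min_width = (max(ratios) - min(ratios)) / 6

# Frequencies for classes of width 0.75 starting at 0
class_counts = frequency_distribution(ratios, start=0.0, width=0.75, num_classes=6)
print(suggested_classes, round(min_width, 2), class_counts)
```

Printing `class_counts` gives one frequency per class, which can be compared against the frequency column of Table 2.1.7.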


Table 2.1.8

| Industry | No. of companies | Net Sales | Change in Sales (%) |
|---|---|---|---|
| Tea & Coffee | 20 | 1,018.92 | 6.8 |
| Vegetable Oils & Vanaspati | 20 | 1,506.77 | 20.7 |
| Sugar | 11 | 671.30 | 34.1 |
| Other Food Products | 14 | 606.71 | 10.7 |
| Alcohol | 7 | 858.22 | 18.7 |
| Cigarettes | 4 | 3,439.02 | 17.8 |
| Mineral Products | 6 | 253.38 | 39.7 |
| Alkalies | 8 | 1,125.10 | 18.9 |
| Other Inorganic Chemicals | 16 | 585.82 | 30.7 |
| Organic Chemicals | 14 | 764.73 | 14.7 |
| Drugs & Pharmaceuticals | 27 | 2,430.97 | 29.0 |
| Fertilizers | 8 | 1,927.59 | -10.2 |
| Pesticides | 4 | 313.86 | 24.0 |
| Dyes | 9 | 528.84 | 16.6 |
| Paints | 6 | 457.70 | 5.9 |
| Cosmetics & Toiletries | 8 | 1,055.38 | -2.4 |
| Other Chemicals | 14 | 633.43 | 37.6 |
| Plastic in primary form | 5 | 526.78 | 11.5 |
| Plastic Products | 13 | 555.68 | -12.8 |
| Vegetable Oils & Vanaspathi | 7 | 618.80 | 8.0 |
| Paper & Pulp | 21 | 2,339.33 | 11.9 |
| Textiles | 68 | 4,748.63 | 18.1 |
| Man-Made Fibres | 23 | 6,278.31 | 33.1 |
| Other Textile Products | 12 | 397.59 | 24.0 |
| Cement | 16 | 1,253.72 | 5.9 |
| Cement & Asbestos Products | 4 | 225.86 | 18.3 |
| Ceramics | 7 | 228.55 | 16.2 |
| Glass | 7 | 269.36 | 11.2 |
| Granite & Marble | 6 | 93.05 | 53.8 |
| Gems & Jewellery | 6 | 498.89 | 35.8 |
| Steel | 29 | 5,463.41 | 22.5 |
| Castings & Forgings | 9 | 287.75 | 39.9 |
| Steel Tubes & Pipes | 6 | 504.62 | 36.2 |
| Steel Products | 17 | 983.56 | 20.3 |
| Aluminum | 5 | 1,419.36 | 16.7 |
| Non-ferrous Metals | 7 | 367.44 | 44.7 |
| Industrial Machinery | 25 | 1,128.11 | 13.9 |
| Machine Tools | 7 | 93.52 | 35.2 |
| Other Non-Elect. Machinery | 26 | 1,830.80 | 16.5 |

(contd.)

The corresponding frequency distribution of change in sales is

given below in table 2.1.9:

Table 2.1.9

| Class: Change in Sales (%) | Frequency |
|---|---|
| -13 to -4.4 | 2 |
| -4.4 to 4.2 | 3 |
| 4.2 to 12.8 | 12 |
| 12.8 to 21.4 | 16 |
| 21.4 to 30.0 | 5 |
| 30.0 to 38.6 | 10 |
| 38.6 to 47.2 | 6 |
| 47.2 to 55.8 | 3 |
| Total | 57 |

We can draw a graph of this frequency distribution by taking the classes or class marks (mid-points of classes) on the X-axis and the frequencies on the Y-axis.


Table 2.1.8 (Contd.)

| Industry | No. of companies | Net Sales | Change in Sales (%) |
|---|---|---|---|
| Electrical Machinery | 16 | 2,353.20 | 16.1 |
| Dry Cells & Batteries | 2 | 198.82 | -3.0 |
| Electric Lamps and Bulbs | 4 | 34.74 | 31.2 |
| Wires & Cables | 11 | 1,055.74 | 47.5 |
| Electronics | 31 | 623.44 | 55.4 |
| Consumer Electronics | 9 | 2,018.81 | 6.8 |
| Computers & Office Equip. | 4 | 205.59 | 45.3 |
| Automobiles | 9 | 6,229.62 | 3.3 |
| Automobiles Ancillaries | 35 | 995.72 | 12.2 |
| Miscellaneous Mfg. | 13 | 1,048.62 | 8.4 |
| Construction | 16 | 1,043.78 | 27.9 |
| Trading | 18 | 1,310.31 | 17.0 |
| Hotels | 8 | 192.42 | 36.3 |
| Transport Services | 7 | 822.88 | 12.3 |
| Financial Services | 50 | 729.82 | 46.7 |
| Other Services | 6 | 113.44 | 38.7 |
| Diversified | 13 | 6,264.05 | 15.8 |
| Electricity | 3 | 1,588.17 | 30.1 |
| Total for all industries | 777 | 75,120.03 | 17.3 |

Here rectangles have been erected with their bases equal

to the lengths of the class intervals and their heights equal

to the frequencies on a suitable scale. This type of graph is

called a Histogram.

While the histogram indicates the fluctuations in

frequencies from class to class, it does not clearly reveal

the rate of change in frequency from one class to the next.

For example, it is difficult to say by examining the

histogram whether the decline in frequency from the class

30-38.6 to the class 38.6-47.2 is the same as the decline in

frequency from the class 38.6-47.2 to the class 47.2-55.8.

Such a question can be easily answered by using a

frequency polygon.

In the case of a frequency polygon, the mid-points of the classes are taken on the X-axis and the frequencies are taken on the Y-axis. The plotted points are joined by straight lines. The last point B is joined to the X-axis at the mid-point of the next class, 55.8 to 64.4. Similarly, the first point A is joined to the X-axis at the mid-point of the preceding interval, –21.6 to –13.

In the frequency polygon, we can see that line a slopes more steeply than line b. Hence we can conclude that the frequency drop from the class 30-38.6 to the class 38.6-47.2 is greater than the frequency drop from the class 38.6-47.2 to the class 47.2-55.8.
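The comparison of the two slopes can also be checked numerically. The sketch below, with variable names of our own choosing, computes the class marks that a frequency polygon would use and the change in frequency between consecutive classes of Table 2.1.9. Because all classes have the same width, a larger change in frequency means a steeper line.

```python
# Class boundaries and frequencies taken from Table 2.1.9
boundaries = [-13.0, -4.4, 4.2, 12.8, 21.4, 30.0, 38.6, 47.2, 55.8]
frequencies = [2, 3, 12, 16, 5, 10, 6, 3]

# Class mark = mid-point of each class (X-axis of the frequency polygon)
class_marks = [round((lo + hi) / 2, 1)
               for lo, hi in zip(boundaries, boundaries[1:])]

# Change in frequency from each class to the next (rise or fall of the polygon)
changes = [nxt - cur for cur, nxt in zip(frequencies, frequencies[1:])]

for mark, freq in zip(class_marks, frequencies):
    print(mark, freq)
print(changes)
```

The last two entries of `changes` correspond to lines a and b discussed above.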

We may, similarly, define cumulative frequency distribution

and the graph of this distribution is called an Ogive.

For example, consider the sales data (refer table 2.1.10).

Cumulative Frequency Table

It may be noticed from the “less than” Ogive

curve below that it slopes up to the right.


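Table 2.1.10

| Class | Frequency | Cumulative Frequency |
|---|---|---|
| -13 ≤ x < -4.4 | 2 | 2 |
| -4.4 ≤ x < 4.2 | 3 | 5 |
| 4.2 ≤ x < 12.8 | 12 | 17 |
| 12.8 ≤ x < 21.4 | 16 | 33 |
| 21.4 ≤ x < 30.0 | 5 | 38 |
| 30.0 ≤ x < 38.6 | 10 | 48 |
| 38.6 ≤ x < 47.2 | 6 | 54 |
| 47.2 ≤ x < 55.8 | 3 | 57 |
| Total | 57 | |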

We may similarly construct relative frequency tables

where the frequency of a class is divided by the total

number of observations.
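As a small illustration, the cumulative ("less than") and relative frequencies for the change-in-sales distribution of Table 2.1.9 can be computed as below. The variable names are our own; the frequencies are those given in the table.

```python
from itertools import accumulate

# Class frequencies from Table 2.1.9
frequencies = [2, 3, 12, 16, 5, 10, 6, 3]
total = sum(frequencies)

# "Less than" cumulative frequency: running total of the class frequencies
cumulative = list(accumulate(frequencies))

# Relative frequency: class frequency divided by the total number of observations
relative = [f / total for f in frequencies]

print(cumulative)
print([round(r, 3) for r in relative])
```

The cumulative frequencies never decrease from one class to the next, which is why the "less than" Ogive slopes up to the right, and the relative frequencies add up to 1.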

A frequency polygon (or a relative frequency polygon)

indicates the skewness of the distribution.

Curve B is symmetrical, while curve A is said to be skewed to the right and curve C is skewed to the left.

Skewness refers to the lack of symmetry. A distribution for

which the mean, median and mode are equal is known as

a symmetrical distribution. In such a distribution curve, a

vertical line drawn from the peak of the curve to the

horizontal axis will divide the area of the curve into two

equal parts and each part is the mirror image of the other.

An asymmetrical distribution for which the mean, median

and mode are not equal is known as a skewed

distribution. In a skewed distribution curve the values are

not equally distributed but are concentrated at the lower

or higher end of the frequency distribution.

In a curve, if many values are concentrated at the lower

end and very few values are concentrated at the higher

end, the curve is said to be skewed to the right or

positively skewed. A positively skewed distribution curve

tails off towards the higher end and for such a curve A.M

> Median > Mode. For a negatively skewed curve the

values are concentrated at the higher end and it is

skewed to the left because it tails off towards the low end.

Here the A.M < Median < Mode.

Remark

At this point, we want to distinguish between a Parameter

and a Statistic.

Suppose we compute the annual returns for the past year

of all the scrips listed on the Bombay Stock Exchange.

From the data, we may compute, say, the mode M and the variance, σ². We may also compute the mode, m, and the variance, s², of the data restricting ourselves to the 30 scrips in the Sensex.

M and σ² are called parameters as they pertain to the entire population. m and s² are called statistics as they pertain to a sample.
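The distinction can be illustrated with Python's standard `statistics` module. The returns below are hypothetical numbers invented purely for illustration (they are not stock exchange data), and the "sample" is simply the first four values. `pvariance` divides by the population size, while `variance` divides by one less than the sample size.

```python
import statistics

# Hypothetical annual returns (%) for a tiny "population" of 8 scrips
population_returns = [12, 8, 15, 8, 10, 20, 8, 15]

# Hypothetical sample: a subset of the population
sample_returns = population_returns[:4]

# Parameters: computed from the entire population
population_mode = statistics.mode(population_returns)
population_variance = statistics.pvariance(population_returns)

# Statistics: computed from the sample only
sample_mode = statistics.mode(sample_returns)
sample_variance = statistics.variance(sample_returns)

print(population_mode, population_variance)
print(sample_mode, round(sample_variance, 4))
```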


CHAPTER 3

Measures of Central Tendency

In this chapter we will discuss:

Objectives of Averaging

Types of Averages: Mathematical & Positional Averages

Case Study: Mattel’s Global Expansion: Analyzing Growth Trends

Section 1

Objectives of Averaging

The most important objective of a statistical analysis is to calculate a single value that represents the characteristics of the entire available raw data. This single value representing the entire data is called the ‘central value’ or an ‘average’. This value is the point around which all the other values of the data cluster. Therefore it is known as a measure of location, and since this value is located at a central point nearest to the other values of the data, it is also known as a measure of central tendency. This chapter discusses various measures of central tendency like mean, median and mode and their use in day-to-day management activities. For example, the mean sales of a territory give a rough idea to the sales manager about the sales potential of that territory.

a. To find out one value that represents the whole mass

of data

The objective of averaging is to represent a set of individual values in a concise way, so that the researcher can have an instant idea about the size of each entity in the group. Averages help the researcher or manager to grasp the characteristics of the data group without studying every value in the group. For example, a manager gets a good idea about the age profile of trainees of a fresh batch by looking at the average age (calculated by dividing the total of the ages of all the trainees by the number of trainees). This average is a value that enables the manager to have an overall idea about the characteristics of the large number of trainees.

b. To enable comparison

Averages help in comparing two or more sets of data on the same variable. They also help in drawing conclusions about the characteristics of different sets of data. For example, a manager can use the average sales of two territories to compare the performance of the sales executives of the two territories. These average sales figures of each territory reduce the burden of going through volumes of sales data to know the performance of each territory. Thus, a quick and easy comparison of the sales of the two territories is made possible for a manager by these averages.


c. To establish relationship

Averages play a major role in establishing relationships between separate groups in quantitative terms. It is vague if one states that the productivity of an employee of Wipro is more than that of an employee of Satyam Computer Solutions. It would make sense if both productivities were expressed in terms of averages.

d. To derive inferences about a universe from a sample

Averages help a manager to draw valuable inferences about the whole universe by means of sample data. The average calculated from sample data gives a reliable idea about the average of the entire universe.

e. To aid decision-making

Averages act as benchmarks or standards for managerial control and decision-making. A production manager may rely on average employee productivity to set future production targets for individuals and the organization as a whole. Thus these averages (average turnover, etc.) act as benchmarks for performance appraisal and decision-making in future.

Requisites of a Good Average

An ideal average should have the following characteristics:

Should be rigidly defined

Should be mathematically expressed (have a mathematical formula)

Should be readily comprehensible and easy to calculate

Should be calculated based on all the observations

Should be least affected by extreme fluctuations in sampling data

Should be suitable for further mathematical treatment

In addition to the above requisites, a good average should also retain the maximum characteristics of the data; it should be the value nearest to all the data elements. Averages should be calculated for homogeneous data, i.e., ages, sales, etc.


Section 2

Types of Averages

Averages or measures of central tendency are of the

following types:

I. Mathematical averages

i. Arithmetic mean

ii. Geometric mean

II. Positional averages

i. Median

ii. Mode

Of the above, the arithmetic mean, median and mode are the most widely used averages, in that order. Keynote 3.2.1 shows the types of averages.

I. Mathematical Averages

i. Arithmetic Mean

The arithmetic mean, or simply the mean, is the simplest and most frequently used average.

The arithmetic mean is represented by the notation $\bar{x}$ (read x-bar).

Calculating the Mean from Ungrouped Data

Ungrouped data refers to a collection of observations $x_1, x_2, \ldots, x_n$.

The mean is then calculated as:

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$


Keynote 3.2.1: Types of Averages

where i indicates the ith observation,

$\sum_{i=1}^{n} x_i$ is the sum of the values of all observations,

n is the number of observations,

∑ indicates that all the values of x are summed together.

When the mean is calculated for the entire population, it is

known as population arithmetic mean (µ). ‘N’ is the number

of elements (observations) in the population.

Then $\mu = \dfrac{\sum_{i=1}^{N} x_i}{N}$

Example 3.2.1

Absentee List of Drivers of the Transport Department over a

Span of 90 Days is shown below in table 3.2.1:

When a manager wants to know the average number of days

a driver is on leave in 90 days, he can calculate the mean of

the ungrouped data as follows:

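$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{8 + 6 + 6 + 7 + 4 + 5 + 6 + 2 + 4 + 7}{10} = \frac{55}{10}$$

= 5.5 days per driver out of 90 days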
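The same calculation can be written as a short Python sketch, using the leave data of Table 3.2.1 (the variable names are our own):

```python
# Number of days on leave for the 10 drivers (Table 3.2.1)
leave_days = [8, 6, 6, 7, 4, 5, 6, 2, 4, 7]

# Arithmetic mean = sum of all observations / number of observations
total_days = sum(leave_days)
mean_leave = total_days / len(leave_days)

print(total_days, mean_leave)
```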

In the above example, the mean is calculated by adding every observation separately, in no set order. Such data is called ungrouped data. One can calculate the mean using the above method when the number of values is limited. But the task becomes difficult when calculating the average for a vast data set, say for 5000 employees. In such cases a frequency distribution of the data will be helpful to a manager, and the mean should be calculated using a different method.

Calculating the Mean from Grouped Data (Frequency

distribution)

A frequency distribution consists of data that are grouped into classes and are hence called grouped data. Every observation (value) is placed in one of the classes. Unlike in the earlier example, the manager is unaware of the individual values of every observation of the universe. For example, a finance manager wants to find out the average monthly pay of 600 employees in an organization, and he has only a

Table 3.2.1

| Driver | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Number of days on leave | 8 | 6 | 6 | 7 | 4 | 5 | 6 | 2 | 4 | 7 |


Table 3.2.2: Average Monthly Pay of 600 Employees

| Class (Rupees) | Frequency |
|---|---|
| 1000 - 2999 | 50 |
| 3000 - 4999 | 110 |
| 5000 - 6999 | 162 |
| 7000 - 8999 | 100 |
| 9000 - 10999 | 83 |
| 11000 - 12999 | 45 |
| 13000 - 14999 | 25 |
| 15000 - 16999 | 15 |
| 17000 - 18999 | 8 |
| 19000 - 20999 | 2 |
| Total | 600 |

Table 3.2.3: Calculating Arithmetic Mean for Grouped data

| Class (Rupees) (1) | Class Mark (X)* (2) | Frequency (f) (3) | (f) × (X) (2) × (3) |
|---|---|---|---|
| 1000 - 2999 | 2000 | 50 | 1,00,000 |
| 3000 - 4999 | 4000 | 110 | 4,40,000 |
| 5000 - 6999 | 6000 | 162 | 9,72,000 |
| 7000 - 8999 | 8000 | 100 | 8,00,000 |
| 9000 - 10999 | 10000 | 83 | 8,30,000 |
| 11000 - 12999 | 12000 | 45 | 5,40,000 |
| 13000 - 14999 | 14000 | 25 | 3,50,000 |
| 15000 - 16999 | 16000 | 15 | 2,40,000 |
| 17000 - 18999 | 18000 | 8 | 1,44,000 |
| 19000 - 20999 | 20000 | 2 | 40,000 |

n = 600, ∑(f × X) = 44,56,000

Sample mean, X̄ = ∑(f × X)/n = 44,56,000/600 = Rs. 7426.66

*Class mark adjusted to nearest integers.

frequency distribution (shown in Table 3.2.2).

To compute the arithmetic mean of grouped data, the manager calculates the mid-point (class mark) of each class and multiplies it by the frequency of observations in the corresponding class. He then adds all these products and divides the sum by the total number of observations.

Mid-point (class mark) = x = (lower limit + upper limit)/2

The formula for computing the arithmetic mean for grouped data is:

$$\bar{x} = \frac{\sum (f \times x)}{n}$$

where ∑ = notation for “sum”

f = number of observations in each class

x = class mark (mid-point of each class)

n = total number of observations

In the above example, the approximate mean (average

salary) is Rs.7426.66.
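As a check on the arithmetic, the grouped mean can be recomputed from the class marks and frequencies of Table 3.2.3. This is a minimal sketch with variable names of our own:

```python
# Class marks (adjusted mid-points) and frequencies from Table 3.2.3
class_marks = [2000, 4000, 6000, 8000, 10000, 12000, 14000, 16000, 18000, 20000]
frequencies = [50, 110, 162, 100, 83, 45, 25, 15, 8, 2]

# n = total number of observations
n = sum(frequencies)

# Sum of (frequency x class mark) over all classes
weighted_total = sum(f * x for f, x in zip(frequencies, class_marks))

# Grouped (approximate) arithmetic mean
grouped_mean = weighted_total / n

print(n, weighted_total, round(grouped_mean, 2))
```

The result is only an approximation of the true mean, because every observation in a class is treated as if it were equal to the class mark.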

In case we had the data on the income of each of the 600

employees (i.e., ungrouped data), we could have calculated

the mean using the previous method. While there would be

some difference between the means obtained by both the

methods, mostly it would be small.

Advantages and Disadvantages of Mean

The first advantage of arithmetic mean is that its concept is

familiar and clear to most people. The second advantage is

that it is easy to understand and easy to calculate. Every data

set has one and only one mean. Finally, arithmetic average

provides a good basis for comparison. For example, if a

manager wants to compare the performance of salesmen of

four different regions of a state, arithmetic average provides

the correct basis for assessing the relative efficiency of the

regions.

However, the arithmetic mean suffers from a few drawbacks. First, it may be affected by extreme values that are far from the other values of the group. Consider the units produced in a day by 5 workers of a batch, as shown in Table 3.2.4.

The mean number of units produced per day is

µ = ∑x/n = (23 + 22 + 24 + 21 + 5)/5 = 19 units

When the mean is calculated leaving out the fifth worker, it is 22.5 units. Thus, one extreme value, 5, has affected the mean. Hence, it is more appropriate to calculate the mean excluding the extreme value in order to make it more representative.
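The effect of the extreme value can be reproduced with a few lines of Python. The sketch below uses the figures of Table 3.2.4; the median is added only for comparison and is computed with the standard `statistics` module.

```python
import statistics

# Units produced in a day by the 5 workers (Table 3.2.4)
units = [23, 22, 24, 21, 5]

# Mean with all five workers
mean_all = sum(units) / len(units)

# Mean after leaving out the fifth worker (the extreme value 5)
units_without_extreme = units[:4]
mean_without_extreme = sum(units_without_extreme) / len(units_without_extreme)

# The median of all five values, for comparison
median_all = statistics.median(units)

print(mean_all, mean_without_extreme, median_all)
```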


Table 3.2.4: Number of Units Produced by Workers in a Day

| Worker | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Units | 23 | 22 | 24 | 21 | 5 |

The second disadvantage is that we cannot calculate the mean for a grouped data set with open-ended classes at either end of the scale. A class that allows either the upper or lower end of a quantitative classification scheme to be limitless is called an open-ended class.

The Weighted Arithmetic Mean

The weighted mean is calculated by taking into account the

relative importance of each of the values to the total value.

Consider, for example, the manufacturing company in Table

3.2.5 that employs three grades of labor (unskilled,

semiskilled, and skilled) to produce each of the two

products. When the company wants to know the average

wage per hour for each product, the simple arithmetic

average of the labor wage of the three types of labor will not

be appropriate as it gives equal weight to each category of

labor and this is not proper.

An appropriate method to calculate the average wage per hour for the products is to take a ‘weighted average’ of the wages of the three classes of labor, weighted in proportion to the labor hours each class requires to produce the product.

Here one unit of Product 1 requires 10 hours of labor, of which

Unskilled labor requires 2 hours,

Semi-skilled labor requires 3 hours,

Skilled labor requires 5 hours.

When this information is used as weights, the wage of labor (per hour) for Product 1 is:

= (2 × 10 + 3 × 15 + 5 × 20)/(2 + 3 + 5)

= Rs. 16.5 / hour

Similarly, for Product 2 the cost of labor (per hour) for 1 unit is:

= (6 × 10 + 2 × 15 + 1 × 20)/(6 + 2 + 1)

= Rs. 12.22 / hour

As can be seen, in general, the formula for calculating the weighted average is:

$$\bar{x}_w = \frac{\sum (w \times x)}{\sum w}$$

where,


Table 3.2.5: Labor - Capital Involved in Manufacturing Two Products

| Class of Labour | Wage per hour (x) (Rs) | Labour hours per unit: Product 1 | Labour hours per unit: Product 2 |
|---|---|---|---|
| Unskilled | 10 | 2 | 6 |
| Semiskilled | 15 | 3 | 2 |
| Skilled | 20 | 5 | 1 |

w = weight allocated to each observation (2, 3, 5 for Product 1 in the above example)

∑(w × x) = sum of each weight multiplied by the corresponding observation

∑w = sum of the weights; it will be equal to 1 if the weights are expressed as proportions.
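The weighted average for both products can be sketched in Python as follows. The helper name `weighted_mean` is our own; the wages and labour hours are those of Table 3.2.5.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Wage per hour (Rs) for unskilled, semiskilled and skilled labour
wages = [10, 15, 20]

# Labour hours per unit used as weights
hours_product_1 = [2, 3, 5]
hours_product_2 = [6, 2, 1]

wage_product_1 = weighted_mean(wages, hours_product_1)
wage_product_2 = weighted_mean(wages, hours_product_2)

# Simple (unweighted) average of the three wages, for comparison
simple_average = sum(wages) / len(wages)

print(wage_product_1, round(wage_product_2, 2), simple_average)
```

The simple average is the same for both products, which shows why it cannot reflect the different mix of labour used by each product.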

ii. Geometric Mean

Managers often come across quantities that change over a period of time and may need to know the average rate of change over that period. The arithmetic mean is inaccurate in tracking such changes. Hence a different measure of central tendency, called the geometric mean, is needed to calculate the average rate of change. It is defined as:

$$GM = \sqrt[n]{X_1 \times X_2 \times \cdots \times X_n}$$

where n is the number of values.

The geometric mean is applicable in many cases. Its use is illustrated with the growth rates of a textile unit in the southern region over the last five years, given in table 3.2.6.

The geometric mean is

$$GM = \sqrt[5]{1.07 \times 1.08 \times 1.10 \times 1.12 \times 1.18} = 1.1093$$

where $X_1, X_2, \ldots, X_n$ are termed growth factors, each equal to 1 + (rate/100).

1.1093 is the average growth factor. The growth rate is calculated as 1.1093 – 1 = 0.1093, i.e., the average growth rate is 10.93 percent per year.
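A minimal Python sketch of this calculation is given below, using the growth rates of table 3.2.6. The variable names are our own, and `math.prod` assumes Python 3.8 or later.

```python
import math

# Yearly growth rates (%) of the textile unit (table 3.2.6)
growth_rates = [7, 8, 10, 12, 18]

# Growth factor = 1 + (rate / 100)
growth_factors = [1 + rate / 100 for rate in growth_rates]

# Geometric mean = nth root of the product of the n growth factors
product = math.prod(growth_factors)
average_factor = product ** (1 / len(growth_factors))

# Average growth rate per year, in percent
average_growth_rate = (average_factor - 1) * 100

# Arithmetic mean of the rates, for comparison
arithmetic_mean_rate = sum(growth_rates) / len(growth_rates)

print(round(average_factor, 4), round(average_growth_rate, 2), arithmetic_mean_rate)
```

The arithmetic mean of the rates is slightly higher than the geometric-mean rate, which is why it overstates the growth actually achieved over the five years.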

Table 3.2.6: Growth Rate of Textile Units

| Year | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|
| Growth rate (%) | 7 | 8 | 10 | 12 | 18 |

Example 3.2.2

Matel Plastics Ltd got a raw material delivery order from Blowplast Inc. However, the condition was that the delivery had to be made within four hours, failing which the order would be considered cancelled. Robert, the salesman at Matel, was assigned the responsibility of making the delivery. Robert had to be careful not to exceed the 80 kmph speed limit; otherwise he would be flouting the traffic rules. The marketing manager asked him not to go below 60 kmph as there was a risk of the order being cancelled. Robert divided his journey into four equal parts. He traveled the first quarter of the distance at a speed of 50 kmph, the second quarter at 65 kmph, the third quarter at 80 kmph and the last quarter at 55 kmph. What is the average speed of his journey?

Solution:

Let the speed of Robert’s vehicle in the first, second, third and fourth quarters of the distance be $X_1$, $X_2$, $X_3$ and $X_4$ respectively.

From the given information in the problem, we have $X_1 = 50$, $X_2 = 65$, $X_3 = 80$, $X_4 = 55$ and n = 4.

After inserting the values in the formula for calculating the harmonic mean, we get:

$$HM = \frac{n}{\sum_{i=1}^{n} \frac{1}{X_i}} = \frac{4}{\frac{1}{50} + \frac{1}{65} + \frac{1}{80} + \frac{1}{55}} \approx 60.5$$

The average speed (HM) of Robert’s whole journey from Matel to Blowplast is 60.5 kmph.
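The harmonic mean of the four speeds can be computed with the short sketch below. It assumes, as the example states, that each speed applies to an equal quarter of the distance; the variable names are our own.

```python
# Speeds (kmph) over the four equal quarters of the distance
speeds = [50, 65, 80, 55]
n = len(speeds)

# Harmonic mean = n / sum of the reciprocals of the values
harmonic_mean_speed = n / sum(1 / s for s in speeds)

# Arithmetic mean of the speeds, for comparison
arithmetic_mean_speed = sum(speeds) / n

print(round(harmonic_mean_speed, 1), arithmetic_mean_speed)
```

The harmonic mean is lower than the arithmetic mean of the speeds because more time is spent on the quarters covered at the lower speeds.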

II. Positional Averages

i. The Median

The median, as the name suggests, is the middle value of a data series arranged in increasing or decreasing order of magnitude.

Unlike the arithmetic mean (which is calculated from the value of every observation in the series), the median is a positional average. It is the middle-most value in the data, or the 50th percentile observation, below which 50% of the observations in the sample fall. The object of the median is therefore not merely to fix a value that is representative of a data set, but also to establish a dividing line separating the higher values from the lower values.

Video 3.2.1: Central tendency: mean, median and mode

Calculating the Median from Ungrouped Data

If the data set contains an odd number of observations, the middle observation of the array is the median. If there is an even number of observations, the median is the average of the two middle observations. That is, if the total number of observations is odd, say n, the value of the ((n + 1)/2)th item gives the median; when the total number of observations is even, say 2n, the nth and (n + 1)th items are the two central observations and the arithmetic mean of these two observations gives the median.

Example 3.2.3:

The data in table 3.2.7 gives the sales figures of certain companies for the year 2002-03:

Solution:

The median for the above data can be obtained as follows

in table 3.2.8:

The series should first be arranged in an order. In the

present case, it has been arranged in the descending order.

As there are 10 elements, the median will be the mean of

the 5th and the 6th items, i.e.,

(412 + 312)/2= Rs. 362 lakhs.

Thus the median sales value of the ten companies is Rs.

362 lakhs.
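The procedure for ungrouped data can be sketched in Python as below. The function name `median_ungrouped` is our own; the sales figures are those of table 3.2.7. Sorting in ascending rather than descending order does not change the middle values.

```python
def median_ungrouped(values):
    """Median of ungrouped data: the middle item, or the mean of the two middle items."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of observations: the single middle item
        return ordered[mid]
    # Even number of observations: average of the two central items
    return (ordered[mid - 1] + ordered[mid]) / 2

# Sales (Rs. lakhs) of the ten companies in table 3.2.7
sales = [1520, 436, 228, 239, 292, 734, 412, 980, 312, 256]

median_sales = median_ungrouped(sales)
print(median_sales)
```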

Table 3.2.7: Sales Figures of Companies

| Companies | Sales (Rs. Lakhs) |
|---|---|
| JCCement | 1520 |
| Hyderabad Valley | 436 |
| Compex Inds | 228 |
| Hotel India | 239 |
| Hydro Power Co. | 292 |
| Thermal Power Co. | 734 |
| Star Tea | 412 |
| Cooling Ind. | 980 |
| Vegetable Oil Co. | 312 |
| Plating Ind. | 256 |

Calculating the Median from Grouped Data

In order to find the median, first the median class (i.e., the class containing the 50th percentile observation) is to be located, and then interpolation is to be used, assuming that observations are evenly spaced over the entire class interval. The formula used for the calculation of the median is:

$$\text{Median} = L_m + \left(\frac{\frac{N + 1}{2} - (F + 1)}{f_m}\right) \times W$$

where, $L_m$ = lower limit of the median class

$f_m$ = frequency of the median class

F = cumulative frequency up to $L_m$

W = width of the median class

N = total frequency

Example 3.2.4

Let us find the median for the data in Table 3.2.9.

Here the total frequency N = 153.

The median is the size of the ((N + 1)/2)th item, i.e., the ((153 + 1)/2)th item, i.e., the size of the 77th item. It lies in the class 20-30.

Table 3.2.8: Sales Figures of Companies Arranged in Descending Order

| Company | Sales | Rank |
|---|---|---|
| JCCement | 1520 | 1 |
| Cooling Ind. | 980 | 2 |
| Thermal Power Co. | 734 | 3 |
| Hyderabad Valley | 436 | 4 |
| Star Tea | 412 | 5 |
| Vegetable Oil Co. | 312 | 6 |
| Hydro Power Co. | 292 | 7 |
| Plating Ind. | 256 | 8 |
| Hotel India | 239 | 9 |
| Compex Inds | 228 | 10 |

Table 3.2.9: Gross Profit as a Percentage of Sales

| Gross Profit as a Percentage of Sales | 0-10 | 10-20 | 20-30 | 30-40 | 40-50 |
|---|---|---|---|---|---|
| No. of Companies | 21 | 32 | 43 | 34 | 23 |

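Hence 20-30 is the median class, of which the lower limit is 20. With F = 53, $f_m$ = 43 and W = 10,

$$\text{Median} = 20 + \left(\frac{77 - (53 + 1)}{43}\right) \times 10 = 20 + 5.35 = 25.35$$

Thus 25.35% is the median gross profit (as percentage of sales) of the companies.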
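The interpolation can be sketched in Python as follows. This sketch assumes the convention that reproduces the 25.35% result above: the median is the ((N + 1)/2)th item, and the items of the median class are evenly spaced, so the distance into the class is ((N + 1)/2 − (F + 1))/f × W. Other textbooks use (N/2 − F)/f × W, which gives a slightly different value. The function name `grouped_median` is our own.

```python
def grouped_median(lower_limits, frequencies, width):
    """Interpolated median of a frequency distribution with equal class widths."""
    total = sum(frequencies)
    position = (total + 1) / 2          # position of the median item
    cumulative_before = 0               # F: cumulative frequency below the class
    for lower, freq in zip(lower_limits, frequencies):
        if cumulative_before + freq >= position:
            # Median class found: interpolate within it
            return lower + (position - (cumulative_before + 1)) / freq * width
        cumulative_before += freq
    raise ValueError("frequencies must contain at least one observation")

# Gross profit classes (lower limits) and number of companies (Table 3.2.9)
lower_limits = [0, 10, 20, 30, 40]
companies = [21, 32, 43, 34, 23]

median_gross_profit = grouped_median(lower_limits, companies, width=10)
print(round(median_gross_profit, 2))
```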

Advantages and Disadvantages of Median

The median is not strongly affected by extreme or abnormal values. In this sense, the median is a better average than the mean (as seen in the example). The median is easy to understand and it can be computed from any kind of data (even for grouped data with open-ended classes, excluding the case when the median falls in the open-ended class). The median can also be calculated for qualitative data.

However, the median has some disadvantages. First, it is a time-consuming process, as the data must be arranged in order before calculating the median. Second, unlike the mean, the median is difficult to compute for a data set with a large number of observations.

ii. The Mode

Mode is defined as the value of the observation of the variable

which occurs most frequently in the data set.

Calculating the Mode from Ungrouped Data

Table 3.2.11 shows the weights of 20 workers of an

organization. The mode of workers weights is 67 kgs as a

maximum number of workers (4 of them) have this weight.
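For ungrouped data, the mode can be found by counting how often each value occurs. A minimal sketch using the standard `collections.Counter` and the weights of Table 3.2.11 (variable names are our own):

```python
from collections import Counter

# Weights (in kgs) of the 20 workers (Table 3.2.11)
weights = [58, 60, 62, 56, 59, 56, 67, 68, 70, 55,
           67, 58, 59, 60, 69, 67, 67, 63, 61, 70]

# Count the occurrences of each weight
counts = Counter(weights)

# The mode is the value with the highest count
mode_weight, mode_count = counts.most_common(1)[0]

print(mode_weight, mode_count)
```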

Table 3.2.10: Cumulative Frequency

| Gross Profit (%) | No. of Companies (f) | Cumulative Frequency (cf) |
|---|---|---|
| 0-10 | 21 | 21 |
| 10-20 | 32 | 53 |
| 20-30 | 43 | 96 |
| 30-40 | 34 | 130 |
| 40-50 | 23 | 153 |

Table 3.2.11: Weights of 20 Workers (in kgs)

| | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| 58 | 60 | 62 | 56 | 59 | 56 | 67 | 68 | 70 | 55 |
| 67 | 58 | 59 | 60 | 69 | 67 | 67 | 63 | 61 | 70 |

Calculating the Mode from Grouped Data

When the data is grouped in a frequency distribution, the manager must assume that the mode is located in the class with the highest frequency (the modal class). The mode can be found using the following formula:

$$\text{Mode} = L_{mo} + \left(\frac{d_1}{d_1 + d_2}\right) \times w$$

where, $L_{mo}$ = lower limit of the modal class

$d_1$ = frequency of the modal class minus the frequency of the class just below it (the preceding, lower-valued class)

$d_2$ = frequency of the modal class minus the frequency of the class just above it (the next, higher-valued class)

w = width of the modal class.

Example 3.2.5

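Consider the salary data in Table 3.2.12 for computing the mode.

Solution:

The modal class is 5000-7000, which has the highest frequency (162). Hence $L_{mo}$ = 5000, $d_1$ = 162 – 110 = 52, $d_2$ = 162 – 100 = 62 and w = 2000.

$$\text{Mode} = 5000 + \left(\frac{52}{52 + 62}\right) \times 2000$$

= 5000 + 912.28

= Rs. 5912.28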
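The grouped-mode formula can be sketched in Python as below. The function name `grouped_mode` is our own. The sketch takes d1 as the difference from the lower-valued neighbouring class (3000-5000, frequency 110) and d2 as the difference from the higher-valued one (7000-9000, frequency 100); with these definitions the mode is pulled towards the neighbour with the larger frequency. Note that interchanging d1 and d2 would give Rs. 6087.72 instead.

```python
def grouped_mode(lower_limit, width, f_modal, f_previous, f_next):
    """Interpolated mode: L + d1 / (d1 + d2) * w for the modal class."""
    d1 = f_modal - f_previous   # modal frequency minus that of the lower class
    d2 = f_modal - f_next       # modal frequency minus that of the higher class
    return lower_limit + d1 / (d1 + d2) * width

# Modal class 5000-7000 of Table 3.2.12 and its two neighbouring classes
salary_mode = grouped_mode(lower_limit=5000, width=2000,
                           f_modal=162, f_previous=110, f_next=100)

print(round(salary_mode, 2))
```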

Advantages and Disadvantages of Mode

Mode can be used as a measure of central location for

qualitative as well as quantitative data. It is not affected by

extreme values. It can also be used even when the classes

are open ended.

However, the mode is not used widely as a measure of central tendency, as it has a few drawbacks. For example, at times, a data set contains no value that occurs more than once. Further, all values in a data set might occur an equal number of times, i.e., each observation has the same frequency. Another disadvantage is that some data sets contain two, three or many modes, making it difficult to interpret them.

Table 3.2.12: Average Monthly Income of 600 employees

| Class (Rs) | Frequency | Cumulative frequency |
|---|---|---|
| 1000-3000 | 50 | 50 |
| 3000-5000 | 110 | 160 |
| 5000-7000 | 162 | 322 |
| 7000-9000 | 100 | 422 |
| 9000-11000 | 83 | 505 |
| 11000-13000 | 45 | 550 |
| 13000-15000 | 25 | 575 |
| 15000-17000 | 15 | 590 |
| 17000-19000 | 8 | 598 |
| 19000-21000 | 2 | 600 |

Relationship between Mean, Median and Mode

In case of a symmetrical distribution, mean, median and

mode coincide. However, according to Karl Pearson, if the

distribution is moderately asymmetrical, the mean, median

and mode are related in the following manner:

Mean-Median = (Mean-Mode)/3

Thus Mode = 3 Median - 2 Mean
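The step from the first relation to the second can be written out as follows; this is only the algebra implied by the two lines above, not an additional result.

```latex
\begin{aligned}
\text{Mean} - \text{Median} &= \tfrac{1}{3}\,(\text{Mean} - \text{Mode}) \\
3\,\text{Mean} - 3\,\text{Median} &= \text{Mean} - \text{Mode} \\
\text{Mode} &= 3\,\text{Median} - 2\,\text{Mean}
\end{aligned}
```

Since the relation holds only for moderately asymmetrical distributions, it gives an estimate of the mode rather than an exact value.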

In a positively skewed distribution (skewed to the right), we

have AM > Median > Mode (Refer figure 3.2.1). For a

negatively skewed distribution (skewed to the left), we have

AM < Median < Mode (Refer figure 3.2.2).

Figure 3.2.1

Figure 3.2.2


Section 3

Case Study: Mattel’s Global Expansion

Toys are one of the world’s oldest consumer products. The traditional toy industry, which was worth $2 billion–$3 billion in 1968, evolved into a global market of over $61.8 billion in 2007.¹ The US is the largest toys and games market in the world, accounting for 34.1% of the global market’s value. Though only 2% of the world’s children reside in the US, they buy half of the world’s toys. The leading company in the US toys and games market is Mattel, which holds a 7.8% market share,² followed by Hasbro with 5.3%. Mattel is also the world’s largest toy manufacturer and its best known brands include Barbie, Matchbox, Fisher-Price and Hot Wheels.

Mattel was founded in 1945 by Harold Matson and Elliot Handler (hence the name ‘Matt-El’) in a garage workshop in California. The company started as a picture-frame manufacturer, but Elliot soon started a side business of making dollhouse accessories out of picture-frame scraps. The success of the dollhouse furniture turned the company’s focus to toys. In 1959, Mattel introduced the Barbie product line, which remains one of the most successful and popular brands even today. Mattel went public in 1960 and throughout the decade the company witnessed growth through acquisitions of smaller toy manufacturers. In 1968, Mattel created the first Hot Wheels products, which eventually became another highly successful brand.

During the 1990s, Mattel merged with the Fisher-Price com-

pany (1993) and acquired Tyco Toys (1997), the third-

largest manufacturer of toys at that time. The Fisher-Price

deal enabled Mattel to overtake Hasbro and become the leading

toy company. The deal was referred to as the most significant

acquisition in the toy industry since the acquisition

of Tonka Corp. by Hasbro in 1991. However, as competition

in the toy industry was intense, the sales at Mattel

slumped in 1996 and 1997.

This case study was written by R Muthukumar, IBSCDC. It is intended to be used as the basis for class discussion rather than to illustrate either

effective or ineffective handling of a management situation. The case was compiled from published sources.

Mattel’s sales further dropped in 1998 owing to a massive

recall of its battery-powered cars. By 1998, the company

had sold approximately 10 million battery-powered cars. However,

many consumers began to complain that their vehicles

had caught fire. Subsequently, in November 1998,

the US Consumer Product Safety Commission urged

Fisher-Price to issue a massive recall. An estimated 10 mil-

lion vehicles were recalled by the company, making this

one of the largest recalls in the history of the US toy indus-

try. Fisher-Price maintained that the fires were in virtually

every case caused by consumers tinkering with the en-

gines. The company spent $30 million on repair of its re-

called products. In the fall of that year, the company took

the first step towards a major reorganization.

Mattel: Towards Developing Markets

Mattel began to sell its products directly to retailers and

wholesalers in Canada and most of the European, Asian

and Latin American countries. Europe is Mattel’s largest

market outside North America. It manufactured toy prod-

ucts for all segments in both company-owned facilities and

through independent contractors. Mattel’s principal manu-

facturing facilities were established in China, Indonesia, Ma-

laysia, Mexico and Thailand; while the independent contrac-

tors were positioned in the US, Europe, Mexico, the Far

East and Australia.

At present, the company operates in 42 countries and sells

products in more than 150 nations. Mattel’s segments are

separately managed business units, divided on a geographic

basis between domestic and international. The domestic

segment of Mattel is further subdivided into Mattel

Girls & Boys Brands US, Fisher-Price Brands US and

American Girl Brands.

Mattel’s business is divided into two primary sectors: Do-

mestic (North America Region) and International. Mattel

products are sold directly to retailers in most European,

Latin American and Asian countries; while in Australia, Can-

ada and New Zealand, its products are sold through agents

and distributors (Mattel has no direct sales presence). Except

for American Girl, Mattel offers all its products worldwide.

It tailors its products to regional fads, though

quality is compromised owing to price sensitivity in

certain countries. The company sets itself apart by estab-

lishing close partnerships with its licensors and building

their brands.

Mattel distinguishes itself by producing a wide line of qual-

ity toys. It has outstanding brand name recognition and cus-

tomer loyalty. Mattel turned its attention to its new markets

way back in the 1970s. Since then, the company has been

taking advantage of its global distribution and marketing

network to bolster sales in Mexico, Italy, Germany and Spain.

Since 2003, Mattel’s sales in the developing markets have

more than doubled (Exhibit I) and its sales of baby swings

and infant rockers in those markets have increased tenfold.


During the period 2006–2007, Mattel’s international sales

increased in comparison to its domestic sales. Particularly,

in Latin America, Mattel saw a rise in its sales by more than

23% in 2007. The company reported that its international

sales accounted for 49% of its gross sales in 2007.

Commenting on its international strategy, the company said

that it will continue to pursue localised and international pro-

grammes that are innovative and boost the growth of the

brand. The company hopes to cash in on countries where

US toys are seen as novelties. According to many experts,

it is very important to adopt local culture for toy companies

like Fisher-Price to attract more customers overseas.

“Fisher-Price is the tip of the spear for Mattel into these

developing markets”,⁴ says Kevin Curran, Fisher-Price’s

senior vice president and general manager.

If you are the head of operations, how will you analyse the

company’s sales performance with the data given? If all the

brands are expected to achieve sales growth of 7.25%,

8.2% and 7.15% respectively, what will be the average rate

of growth forecast for the next year?
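One possible way to approach the growth question is sketched below in Python. It assumes that the three rates apply, in the order given, to the three domestic brands, and that each brand is weighted by its 2007 revenue from the segment table of this case; both points are assumptions, since the case does not state them. The simple (unweighted) mean is shown for comparison.

```python
# 2007 segment revenues ($ million), taken from the segment table in this case
sales_2007 = {
    "Mattel Girls & Boys Brands": 1445.0,
    "Fisher-Price Brands": 1511.1,
    "American Girl Brands": 431.5,
}
# Expected growth (%); pairing each rate with a brand in this order is an assumption
growth = {
    "Mattel Girls & Boys Brands": 7.25,
    "Fisher-Price Brands": 8.2,
    "American Girl Brands": 7.15,
}


def weighted_mean_growth(sales, rates):
    """Mean growth rate, weighting each brand by its sales."""
    total = sum(sales.values())
    return sum(sales[brand] * rates[brand] for brand in sales) / total


def simple_mean_growth(rates):
    """Unweighted arithmetic mean of the growth rates."""
    return sum(rates.values()) / len(rates)


weighted = weighted_mean_growth(sales_2007, growth)
simple = simple_mean_growth(growth)
print(round(weighted, 2))  # 7.66
print(round(simple, 2))    # 7.53
```

The weighted figure is higher because the fastest-growing brand in this assumed pairing also has the largest sales.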

Footnotes:

1. “Global Toys & Games – Industry Profile”, Datamonitor, January 2008

2. “Toys & Games in the United States”, Datamonitor, January 2008

3. Casey Nicholas, “Fisher-Price pursues toy sales in developing markets”, The Financial Times, June 2nd 2008

4. “Fisher-Price pursues toy sales in developing markets”, op.cit.


Mattel, Inc and Subsidiaries Segment Information (1999-2007)

Segment Revenues (in $ million, except percentage information)

                                           1999     2000     2001     2002     2003     2004     2005     2006     2007
Domestic
  Mattel Girls & Boys Brands            1,835.8  1,890.4  1,817.3  1,790.0  1,594.1  1,511.6  1,364.9  1,507.5  1,445.0
  Fisher-Price Brands                   1,185.5  1,233.0  1,234.2  1,282.2  1,265.2  1,319.2  1,358.6  1,471.6  1,511.1
  American Girl Brands                    298.6    324.0    340.8    350.2    344.4    379.1    436.1    440.0    431.5
  Total Domestic                        3,319.9  3,447.4  3,392.3  3,422.4  3,203.7  3,209.9  3,159.6  3,419.1  3,387.6
International                           1,556.2  1,517.7  1,680.3  1,890.9  2,175.7  2,336.2  2,463.9  2,739.0  3,205.3
Gross Sales                             4,876.1  4,965.1  5,072.6  5,313.3  5,379.4  5,546.1  5,623.5  6,158.0  6,592.9
Sales Adjustments                        -373.4   -399.6   -384.7   -428.0   -419.3   -443.3   -444.5   -507.9   -622.8
Net Sales from Continuing Operations    4,502.7  4,565.5  4,687.9  4,885.3  4,960.1  5,102.8  5,179.0  5,650.2  5,970.1

Gross Sales by Geographic Area

                                           1999     2000     2001     2002     2003     2004     2005     2006     2007
Domestic                                3,319.9  3,447.4  3,392.3  3,422.4  3,203.7  3,209.9  3,159.6  3,419.1  3,387.6
  % Change                                   1%       4%      -2%       1%      -6%       0%      -2%       8%      -1%
International                           1,556.2  1,517.7  1,680.3  1,890.9  2,175.7  2,336.2  2,463.9  2,739.0  3,205.3
  % Change                                  -7%      -2%      11%      13%      15%       7%       5%      11%      17%
  % of Total Gross Sales                    32%      31%      33%      36%      40%      42%      44%      44%      49%

Compiled by author

CHAPTER 4

Measures of Dispersion

In this chapter we will discuss

Various Measures of Dispersion

Section 1

Various Measures of Dispersion

In the previous chapter we discussed how one can calcu-

late a single value that represents the characteristics of the

entire raw data using three main measures: mean, median

and mode. Another important characteristic of a data set is

the spread in the data or how far each element is from

some measure of central tendency (average). There are

several ways to measure the variability of the data. Al-

though the most common and most important is the stan-

dard deviation, which provides an average distance for

each element from the mean, there are also several other

important methods which are discussed here. They include:

range, interquartile range and quartile deviation, mean

deviation, variance and standard deviation.

Range

Range is the simplest method of studying dispersion.

Range is defined as the difference between the value of the

largest observation (L) and the value of the smallest observation

(S) present in the data set, i.e.,

Range = L - S

For a grouped frequency distribution, range is defined as

Range = Upper limit of the highest class - Lower limit of the

lowest class.
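The two definitions of range can be sketched in Python as follows. The function names are illustrative; the data are borrowed from Example 4.1.1 and Table 4.1.2 of this chapter.

```python
def data_range(values):
    """Range = L - S for ungrouped data."""
    return max(values) - min(values)


def grouped_range(classes):
    """Upper limit of the highest class minus lower limit of the lowest class.

    `classes` is a list of (lower_limit, upper_limit) pairs.
    """
    return max(upper for _, upper in classes) - min(lower for lower, _ in classes)


leave_days = [10, 15, 18, 20, 20, 22, 23, 25, 27, 30]  # data of Example 4.1.1
classes = [(0, 4), (4, 8), (8, 12), (12, 16)]          # classes of Table 4.1.2

print(data_range(leave_days))  # 20
print(grouped_range(classes))  # 16
```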

Merits and Limitations of Range

Merits:

Range is simple to understand and easy to calcu-

late.

Range is the quickest way to get a measure of dis-

persion, although it is not accurate.

Limitations:

It is not based on all the observations in the data. It

is computed based on highest and lowest values

and ignores the nature of dispersion among other

values of observations in the data set.


It is influenced by extreme values and hence fluctuates

from sample to sample of a population even though

the values that fall in between the highest and lowest

values may be similar.

Range cannot be computed from frequency distribu-

tions with open-end classes.

Range fails to explain the character of the distribution

within the two extreme observations (i.e., L and S).

Range is unreliable as a measure of dispersion of the

values within a distribution.

Uses of Range

In spite of the above limitations and shortcomings, range, as a

measure of dispersion, has many applications.

Range is used in industry for the quality control of prod-

ucts without 100% inspection. Range plays an important

role in construction of charts used for quality control. For

example, when the range of weights of a spare part exceeds a

particular level, the entire production line is checked to en-

sure pre-specified quality in the production process.

Range is also useful in studying the fluctuations in finan-

cial and share markets.

Interquartile Range and Quartile Devia-

tion

Range as a measure of dispersion has many limitations as it

is based on two extreme observations. It fails to explain the

scatter within the range. So when these extreme observations

are discarded the limited range would be more reliable and

representative of the entire data. The range calculated based

on the middle 50 percent of the observations is called the interquartile

range. This interquartile range is calculated from observations

obtained after discarding one quartile of the observations

at the lower end and another quartile of the observations

at the upper end of the distribution. Thus the interquartile

range is the difference between the third quartile and the first

quartile. The quartiles are the highest values

in each of the first three parts of the data set when the data

set is divided into four equal parts.

Therefore, interquartile range = Q3 - Q1

Figure 4.1.1: Interquartile Range

Figure 4.1.1 shows the concept of interquartile range graphically.

Notice that the observations are divided into four equal

parts (25% each).

Quartile deviation is defined as one half of the interquartile

range.

Quartile deviation (Q.D.) = (Q3 - Q1)/2

Quartile deviation gives the average value by which the two

quartiles differ from the median. In a symmetrical distribution,

the quartiles Q3 and Q1 are equidistant from the median, i.e.

Median - Q1 = Q3 - Median

This difference can be taken as a measure of variation.

In a perfectly symmetrical distribution, median ± quartile deviation

covers exactly 50 percent of the observations. As economic or other

business data are seldom perfectly symmetrical, in practice it covers

approximately 50 percent of the observations. A small quartile

deviation denotes less variation in the central 50% of the

observations, whereas a high quartile deviation indicates

large variations.
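A minimal Python sketch of the interquartile range and quartile deviation is given below, using the leave data of Example 4.1.1. It relies on `statistics.quantiles` (Python 3.8 or later) with its default interpolation method, which is only one of several conventions for locating quartiles, so hand calculations based on a different convention may give slightly different values.

```python
from statistics import quantiles

leave_days = [10, 15, 18, 20, 20, 22, 23, 25, 27, 30]

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median) and Q3
q1, q2, q3 = quantiles(leave_days, n=4)

iqr = q3 - q1   # interquartile range
qd = iqr / 2    # quartile deviation

print(q1, q3)   # 17.25 25.5
print(iqr, qd)  # 8.25 4.125
```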

Merits and Limitations of Quartile Deviation

Quartile deviation (Q.D.) has many merits compared to range

and other measures of variation, but it also has some limita-

tions.

Merits:

Q.D. can be used as a measure of variation for open-ended

distributions.

Q.D. is a better measure of variation for highly skewed

distribution or distribution with extreme values as Q.D.

is not affected by the presence of extreme values.

Limitations:

As the Q.D is calculated using only 50% of the total ob-

servations, it cannot be regarded as a good measure

of variation.

Q.D. is not a real measure of variation as it does not

measure the spread of observations from the average.

Q.D. is only a positional measure, like range.

Mean Deviation

Mean deviation is the average of the absolute deviations

of each observation from the mean.

Mean deviation for ungrouped data


To compute mean deviation for ungrouped data, absolute value

of the difference between each observation in the data set and

the mean is calculated, i.e., subtract the mean from every

value in the data set and ignore the positive or negative signs,

(considering everything to be positive). Finally, all those differ-

ences are added and this sum is divided by the number of

items in the sample.

Mean deviation (M.D.) = ∑|X - X̄| / N

Where, X = value of observation

X̄ = mean of observations, and

N = number of observations in the sample

Example 4.1.1

Calculate the mean deviation of the leave patterns of 10 driv-

ers in one year for the values given in Table 4.1.1

Table 4.1.1: Calculation of Mean Deviation of the Leave Patterns of 10 Drivers in One Year

S. no. (N)   Observation in days (X)   Deviation from mean (X - X̄)   Absolute deviation (|X - X̄|)
1            10                        -11                            11
2            15                        -6                             6
3            18                        -3                             3
4            20                        -1                             1
5            20                        -1                             1
6            22                        1                              1
7            23                        2                              2
8            25                        4                              4
9            27                        6                              6
10           30                        9                              9
N = 10       ∑X = 210                                                 ∑|X - X̄| = 44

Mean (X̄) = ∑X / N = 210/10 = 21 days

Mean deviation = ∑|X - X̄| / N = 44/10 = 4.4 days
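The calculation of Example 4.1.1 can be reproduced with a short Python sketch. The function name is illustrative and the data are those of Table 4.1.1.

```python
def mean_deviation(values):
    """Average absolute deviation of the observations from their mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


leave_days = [10, 15, 18, 20, 20, 22, 23, 25, 27, 30]  # Table 4.1.1

print(sum(leave_days) / len(leave_days))  # 21.0
print(mean_deviation(leave_days))         # 4.4
```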

Mean Deviation for Grouped Data:

Mean deviation (M.D.) for grouped data can be calculated

about the average (mean) using the following formula.

M.D. = ∑fi|xi - x̄| / N

where,

xi = mid value of the ith class interval

fi = the corresponding frequency

N = total frequency

Example 4.1.2

Compute the Mean Deviation for the data given in Table

4.1.2.

Table 4.1.2

Class Interval   0-4   4-8   8-12   12-16
Frequency        4     2     1      3

Here, in computation, mean (x̄) = ∑fx / N = 72/10 = 7.2

Table 4.1.3: Computation of Mean Deviation

Class Interval   Frequency (f)   Mid-value of class interval (x)   fx         |x - x̄|   f|x - x̄|
0-4              4               2                                 8          5.2       20.8
4-8              2               6                                 12         1.2       2.4
8-12             1               10                                10         2.8       2.8
12-16            3               14                                42         6.8       20.4
                 N = ∑f = 10                                       ∑fx = 72             ∑f|x - x̄| = 46.4

Mean deviation = ∑f|x - x̄| / N = 46.4/10 = 4.64
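The grouped-data calculation of Example 4.1.2 can be sketched in Python as follows. The function name is illustrative; every observation in a class is treated as lying at the class mid-value, as in Table 4.1.3.

```python
def grouped_mean_deviation(classes, freqs):
    """Return (mean, mean deviation) for a grouped frequency distribution."""
    mids = [(lower + upper) / 2 for lower, upper in classes]
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, mids)) / n
    md = sum(f * abs(x - mean) for f, x in zip(freqs, mids)) / n
    return mean, md


classes = [(0, 4), (4, 8), (8, 12), (12, 16)]  # Table 4.1.2
freqs = [4, 2, 1, 3]

mean, md = grouped_mean_deviation(classes, freqs)
print(round(mean, 2), round(md, 2))  # 7.2 4.64
```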

Merits and Limitations of Absolute Mean Deviation

Merits:

Absolute mean deviation is simple and easy to under-

stand.

Absolute mean deviation is a more comprehensive

measure of dispersion as it is dependent on all obser-

vations of a distribution.

As it is obtained by taking the average of the devia-

tions of every observation from the mean, it is a true

measure of dispersion.

Limitations:

Absolute Mean deviation is less reliable as it is the

arithmetic mean of the absolute values (ignoring the

positive and negative signs).

Absolute Mean deviation is not conducive to further al-

gebraic treatment.

Absolute Mean deviation cannot be computed for distri-

butions with open-end classes.

Variance

Variance is similar to mean deviation, except that it is calculated

by dividing the sum of the squared distances between the

mean and each observation by the total number of

observations. While calculating variance, the differences (deviations)

are squared to make them positive.

For Ungrouped Data

Variance (σ²) = ∑(xi - x̄)² / N

Where,

xi = the value of the ith observation

x̄ = mean of the observations

N = Total number of observations

For Grouped Data

Variance (σ²) = ∑fi(xi - x̄)² / N

where,

xi = mid-point of the ith class interval

fi = frequency of the ith class interval

N = Total number of observations = ∑fi
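A minimal Python sketch of both variance calculations follows. For ungrouped data it uses `statistics.pvariance`, which divides by N as in this section; the grouped function is an illustrative helper, and the data are borrowed from Table 4.1.1 and Table 4.1.3.

```python
from statistics import pvariance


def grouped_variance(mids, freqs):
    """Variance of grouped data, using class mid-points and frequencies."""
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, mids)) / n
    return sum(f * (x - mean) ** 2 for f, x in zip(freqs, mids)) / n


leave_days = [10, 15, 18, 20, 20, 22, 23, 25, 27, 30]  # Table 4.1.1
ungrouped = pvariance(leave_days)                      # divides by N
grouped = grouped_variance([2, 6, 10, 14], [4, 2, 1, 3])  # Table 4.1.3

print(round(ungrouped, 2))  # 30.6
print(round(grouped, 2))    # 25.76
```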

Standard Deviation

Standard deviation is the square root of the variance. The stan-

dard deviation is expressed in the same units as those used in

the data set, whereas variance is expressed in squared units. In

the case of both ungrouped and grouped data, the square root of

the respective variances will give the respective standard devia-

tions.

Properties

The value of standard deviation remains the same, if in a se-

ries each of the observation is increased or decreased by a

constant quantity. In statistical language we say, standard de-

viation is independent of change of origin.

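For example, for the observations 3, 10 and 12, the mean is 8.33 and

the standard deviation is 3.859.

If we increase the value of each observation by 4.5 we get the

observations 7.5, 14.5 and 16.5. The mean rises to 12.83, but the

standard deviation remains 3.859.

For a given data series, if each observation is multiplied or

divided by a constant quantity (change of scale), the standard

deviation will also be similarly affected.

For example, if each of the observations 3, 10 and 12 is multiplied

by 6, we get 18, 60 and 72, for which the standard deviation

= 23.152 = 6 x 3.859

Thus the standard deviation has also been multiplied by 6.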

The finding holds true even if we were to divide all the observa-

tions by a non-zero constant.

Therefore, the standard deviation is independent of any change

of origin, but is dependent on the change of scale.

Standard deviation is the minimum root-mean-square de-

viation. In other words, the sum of the squares of the de-

viations of items of any series from a value other than the

arithmetic mean would always be greater.


As it is possible to compute combined mean of two or

more groups, it is also possible to compute combined

standard deviation of two or more groups. Combined stan-

dard deviation is computed as follows:

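σ12 = √[ (n1σ1² + n2σ2² + n1(x̄1 - x̄12)² + n2(x̄2 - x̄12)²) / (n1 + n2) ]

where,

σ1, σ2 = standard deviations of the first and second groups

x̄1 = mean of first group

x̄2 = mean of second group

n1 = number of observations in the first group

n2 = number of observations in the second group

x̄12 = the combined mean = (n1x̄1 + n2x̄2) / (n1 + n2)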
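The properties listed above can be checked numerically with a short Python sketch. The first group is the set of observations 3, 10 and 12 used in the text; the second group and the helper names are hypothetical. The combined standard deviation is written in the usual textbook form and is verified here against the standard deviation of the pooled observations.

```python
from math import isclose, sqrt
from statistics import mean, pstdev

data = [3, 10, 12]                          # observations used in the text
base = pstdev(data)                         # population standard deviation
shifted = pstdev([x + 4.5 for x in data])   # change of origin
scaled = pstdev([x * 6 for x in data])      # change of scale


def rms_deviation(values, about):
    """Root-mean-square deviation of the values about an arbitrary point."""
    return sqrt(sum((x - about) ** 2 for x in values) / len(values))


def combined_sd(n1, mean1, sd1, n2, mean2, sd2):
    """Combined standard deviation of two groups."""
    combined_mean = (n1 * mean1 + n2 * mean2) / (n1 + n2)
    d1 = mean1 - combined_mean
    d2 = mean2 - combined_mean
    return sqrt((n1 * (sd1**2 + d1**2) + n2 * (sd2**2 + d2**2)) / (n1 + n2))


group2 = [18, 60, 72, 50]                   # hypothetical second group
sd12 = combined_sd(len(data), mean(data), pstdev(data),
                   len(group2), mean(group2), pstdev(group2))

print(round(base, 3), round(shifted, 3), round(scaled, 3))  # 3.859 3.859 23.152
print(rms_deviation(data, 9) > base)         # True: the mean gives the minimum
print(isclose(sd12, pstdev(data + group2)))  # True
```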

Coefficient of Variation (C.V)

The coefficient of variation is a measure of relative dispersion

and is given by:

C.V. = Standard deviation/Mean

This is generally expressed as a percentage,

i.e., C.V. (%) = (Standard deviation/Mean) × 100

Hence the coefficient of variation measures the spread of a set

of data as a proportion of its mean. It is used in problem situa-

tions where we want to compare the variability, homogeneity,

stability, uniformity and consistency of two or more data sets.

The data set for which the coefficient of variation is greater is

said to be more variable i.e., less consistent or less homogene-

ous. On the other hand, if the coefficient of variation is less it is

said to be less variable, i.e., more consistent or more homoge-

neous.
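A small Python sketch of the comparison described above is given below. The function name and the two data sets are hypothetical; the point is only that the data set with the larger coefficient of variation is the less consistent one, even when its standard deviation is smaller.

```python
def coefficient_of_variation(sd, mean):
    """C.V. (%) = (standard deviation / mean) x 100."""
    return sd / mean * 100


# Hypothetical data sets: (standard deviation, mean)
cv_a = coefficient_of_variation(sd=5, mean=50)
cv_b = coefficient_of_variation(sd=4, mean=20)

print(round(cv_a, 2), round(cv_b, 2))  # 10.0 20.0
print("B is less consistent" if cv_b > cv_a else "A is less consistent")
```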

Example 4.1.3

Compute the Variance, Standard Deviation and Coefficient of

Variation given the profitability of 50 companies.

Table 4.1.4: Profitability of 50 Companies

Profit % (xi)   Number of Companies (fi)
10              15
15              10
20              15
25              6
30              4

Mean (x̄) = ∑fixi / ∑fi = 870/50 = 17.4

Variance (σ²) = ∑fi(xi - x̄)² / ∑fi = 1962/50 = 39.24

Standard deviation (σ) = √39.24 = 6.26

So variance of profitability among 50 companies is 39.24 and

the standard deviation is 6.26. Hence

C.V = 6.26/17.4 = 0.3598 or 35.98 %

Video 4.1.1: Range, variance, Standard deviation.

Example 4.1.4

A security analyst studied hundred companies and obtained the

following Return on Investment (ROI) data for the year 1992.

Calculate the standard deviation and the coefficient of variation

in ROI of the companies.

Table 4.1.6: ROI of 100 Companies

Returns %          0-10   10-20   20-30   30-40
No. of Companies   19     32      41      8

Solution:

We can find the variability in the ROI of the companies by

calculating the standard deviation for the above data.

The steps involved are:

Find mean for grouped data.

Find deviations from mean for grouped data.

Find square of above deviations.

Sum up the squared deviations taking frequency into account

and divide by the total frequency.

Take square root.

Mean (x̄) = ∑fX / ∑f = 1880/100 = 18.8

Standard deviation (σ) = √(∑f(X - x̄)² / ∑f) = √(7756/100) = √77.56 = 8.81

Thus the standard deviation for the return on investment is

8.8%.

In this calculation, we always assume that all the observations

in a class interval are located at the mid-point of the class. For

example, the first class interval (0 - 10) has mid-point 5 and

frequency 19. Hence the assumption is that all the 19 compa-

nies have an ROI of 5%.

The coefficient of variation could be computed as:

C.V = S.D/Mean = 8.81/18.8 = 0.4686 or 46.86%
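The steps of Example 4.1.4 can be sketched in Python as follows, treating every company in a class as having the mid-point ROI. The variable names are illustrative. The coefficient of variation computed from the unrounded standard deviation differs slightly from the figure in the text, which uses the rounded value 8.81.

```python
from math import sqrt

classes = [(0, 10), (10, 20), (20, 30), (30, 40)]  # Returns %
freqs = [19, 32, 41, 8]                            # No. of companies

mids = [(lower + upper) / 2 for lower, upper in classes]
n = sum(freqs)

mean = sum(f * x for f, x in zip(freqs, mids)) / n
variance = sum(f * (x - mean) ** 2 for f, x in zip(freqs, mids)) / n
sd = sqrt(variance)
cv = sd / mean * 100

print(round(mean, 1), round(variance, 2), round(sd, 2))  # 18.8 77.56 8.81
print(round(cv, 1))                                      # 46.8
```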

Bienayme Chebyshev’s Rule

This rule was developed by the French mathematician

Bienayme and the Russian mathematician P.L. Chebyshev. According to it,

whatever may be the shape of a distribution (i.e., spread

of data), at least 75 percent of the values in the population

will fall within 2 standard deviations from the mean

and at least 89 percent will fall within 3 standard deviations

from the mean.

Table 4.1.5: Computation of Variance and Coefficient of Variation

xi      fi    fixi   (xi - x̄)   (xi - x̄)²   fi(xi - x̄)²
10      15    150    -7.4       54.76       821.40
15      10    150    -2.4       5.76        57.60
20      15    300    2.6        6.76        101.40
25      6     150    7.6        57.76       346.56
30      4     120    12.6       158.76      635.04
Total   50    870                           1962.00

Table 4.1.7: Calculation of Standard Deviation

Return on investment (%)   Mid-point (X)   No. of companies (f)   fX     Deviation (X - x̄)   f(X - x̄)²
0-10                       5               19                     95     -13.8               3618.36
10-20                      15              32                     480    -3.8                462.08
20-30                      25              41                     1025   6.2                 1576.04
30-40                      35              8                      280    16.2                2099.52
Total                                      100                    1880                       7756

The rule states that the percentage of data observations lying

within ± k standard deviations of the mean is at least

(1 - 1/k²) × 100

This formula applies to differences greater than one standard

deviation about the mean, i.e., k must be greater than 1.

In case of a symmetrical bell-shaped curve, we can say that:

Approximately 68 percent of the observations in the population

fall within ±1 standard deviation from the mean.

Approximately 95 percent of the observations in the population

fall within ±2 standard deviations from the mean.

Approximately 99.7 percent of the observations in the population

fall within ±3 standard deviations from the mean.

The diagrammatic representation of the location of observa-

tions around the mean of a bell-shaped frequency distribution

is given in Figure 4.1.2
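The lower bound given by the rule can be sketched in Python as follows. The function name is illustrative, and the check at the end uses the leave data of Example 4.1.1 merely as a sample data set.

```python
from statistics import mean, pstdev


def chebyshev_lower_bound(k):
    """Minimum share of observations within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k**2


print(chebyshev_lower_bound(2))            # 0.75
print(round(chebyshev_lower_bound(3), 4))  # 0.8889

# Check the bound on a sample data set (leave data of Example 4.1.1)
leave_days = [10, 15, 18, 20, 20, 22, 23, 25, 27, 30]
m, s = mean(leave_days), pstdev(leave_days)
share = sum(abs(x - m) <= 2 * s for x in leave_days) / len(leave_days)
print(share >= chebyshev_lower_bound(2))   # True
```

The bound holds for any distribution, which is why it is weaker than the 68, 95 and 99.7 percent figures that apply only to a bell-shaped curve.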


Figure 4.1.2: Diagrammatic representation of the Bienayme-Chebyshev Rule for a bell-shaped curve


REVIEW 4.1


Which of the following is a measure of central

tendency?

A. Median

B. Mode

C. Geometric mean

D. All the above

SECTION 2

CASE STUDY: MATTEL’S GLOBAL EXPANSION

Refer to the case study in Chapter 3.

CHAPTER 5

Concepts of Probability

In this chapter we will discuss

Basic Probability Concepts

Types of Probability

Probability Rules

Bayes’ Theorem

Case Study: Mitra Insurance Company

Case Study: Ram Publishers

Section 1

Basic Probability Concepts

The concept of probability originated in the seventeenth

century and has become one of the most fascinating

subjects in the recent years. Probability has gained a lot of

importance and the mathematical theory of probability has

become the basis for statistical applications in the areas of

management, space technology, atomic physics, and the

like. In fact, most of the people use probability in their day-

to-day lives without being aware of it. Statements like “It

may rain today”, “Probably I will continue with the same job”,

“India might win the cricket series against Australia,” etc.,

are examples of the usage of probability in day-to-day life.

Various business decisions in real life are made under

situations when a decision maker is uncertain as to what will

happen after the decisions are made. The theory of

probability is of great help in all such areas. In particular, it

enables a person to make ‘educated guesses’ on matters

where either full facts are not known or there is uncertainty

about the outcome. The probability formulae and techniques

were developed by Jacob Bernoulli, De Moivre, Thomas

Bayes, and Joseph Lagrange. Later Pierre-Simon

Laplace unified all these early ideas and compiled the first

general theory of probability. Even though volumes have

been written on probability, the controversies concerned with

the concepts of probability theory continue.

The concept of probability was used by gamblers during the

early days in games of chance such as throwing a die,

drawing a card from the deck or tossing a coin. In these

games of chance, there is an uncertainty regarding the face

of the die that will appear in a throw or the card that will

appear in a draw or the face of a coin that will appear when

it is tossed. Although there is an uncertainty concerning the

outcome of any particular throw or any particular drawing,

there is a predictable long-term outcome. For instance, if a

fair die is thrown many times, experimental studies have shown

that each number appears in about one sixth of the throws (as

the die has 6 faces).
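The long-run behaviour described above can be illustrated with a small simulation in Python. This is only a sketch: the function name, the number of throws and the seed are arbitrary choices, and a pseudo-random generator stands in for a physical die.

```python
import random
from collections import Counter


def relative_frequencies(throws, seed=0):
    """Simulate throws of a fair die and return the share of each face."""
    rng = random.Random(seed)
    counts = Counter(rng.randint(1, 6) for _ in range(throws))
    return {face: counts[face] / throws for face in range(1, 7)}


freqs = relative_frequencies(60_000)
for face, share in freqs.items():
    print(face, round(share, 3))  # each share is close to 1/6
```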

Basic Probability Concepts

Experiment

Any operation / process that results in two or more

outcomes is called an experiment.

Examples of an experiment:

Rolling an unbiased die is an experiment, where the

number that is to appear on the face of the die is

unpredictable and subject to change.

Tossing a fair coin is an experiment, where the

outcome head or tail is unpredictable.

Random Experiment

Any well-defined process of observing a given chance

phenomenon through a series of trials that are finite or

infinite and each of which leads to a single outcome is

known as a random experiment.

Examples of random experiment:

Drawing a card from a pack of 52 cards. This is also a

chance phenomenon with only one outcome.

Drawing a ball from a bag containing a given number of

red, blue and white balls. This is also a chance

phenomenon with only one outcome.

A random experiment is different from experiments under

controlled conditions (for example, an experiment in a physics

laboratory) because the observation in a random

experiment involves chance phenomena and is not

performed under controlled conditions.

Possible Outcome

The result of a random experiment is called an outcome.

For example, picking a card from a pack of 52 cards and

getting an ace or a Jack or a Queen or a King or any other

card is an outcome.

Event

An event is one or more possible outcomes of an

experiment or a result of a trial or an observation. In other

words, an event is used to denote a phenomenon that

occurs with every realization of a set of conditions.

Elementary Event / Outcome

A simple or elementary event is a single possible outcome

of an experiment. A simple event cannot be further

subdivided into a combination of other events.

Sample Space


A collection of all possible elementary events of an

experiment is called Sample Space.

Example

Throwing a die and the event of getting a six (6) is a simple

event. The Sample Space consists of all possible

elementary outcomes of this experiment, i.e., {1, 2, 3, 4, 5, 6}.

Compound Event

When two or more events occur in connection with each

other, then their simultaneous occurrence is called a

compound event. The compound event is an aggregate of

simple events.

Example

When we roll two dice, then the event of getting a six on

either the first or second die is a compound event.

Favorable Event

The number of cases favorable to an event in a trial is the

number of outcomes that result in the happening of a

particular event.

Examples

In drawing a card from a pack of 52 cards, the number

of favorable cases for drawing an ace is 4, for drawing

a spade is 13 and for drawing a black card is 26.

In throwing three dice, the number of cases favorable

to getting the sum of 4 is three: (1, 1, 2), (1, 2, 1) and (2, 1, 1).
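The favorable cases in both examples can be counted by listing the sample space, as in the Python sketch below. The representation of cards as (rank, suit) pairs is an illustrative choice.

```python
from itertools import product

# Three dice: every ordered triple of faces whose sum is 4
faces = range(1, 7)
favourable = [t for t in product(faces, repeat=3) if sum(t) == 4]
print(favourable)  # [(1, 1, 2), (1, 2, 1), (2, 1, 1)]

# A pack of 52 cards as (rank, suit) pairs
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = list(product(ranks, suits))

aces = sum(1 for rank, _ in deck if rank == "A")
spades = sum(1 for _, suit in deck if suit == "spades")
black = sum(1 for _, suit in deck if suit in ("spades", "clubs"))
print(aces, spades, black)  # 4 13 26
```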

Mutually Exclusive Events

Two events are said to be mutually exclusive or

incompatible if the happening of any one of them

precludes the happening of the other i.e., both the events

cannot happen simultaneously in a single trial or, the

happening of one prevents the happening of the other

and vice-versa.

Examples

In throwing a die, the events of getting each of the six

faces numbered 1 to 6 are mutually exclusive since if

any one of these faces comes, the possibility of others,

in the same trial is ruled out.


Gallery 2.1.1: Mutually Exclusive Events

If a single coin is tossed, head can be up or tail can be

up, both cannot be up at the same time.

Mutually exclusive events are those which do not overlap

when represented in Venn diagrams. (See gallery 2.1.1)

Dependent and Independent Events

Two or more events are said to be independent if the

happening of an event is not affected by the supplementary

knowledge concerning the occurrence of any number of the

remaining events. The question of dependence or

independence of events is relevant when experiments are

consecutive and not simultaneous.

Examples

In tossing an unbiased coin, a trial is not affected by the

result of the previous or subsequent trials. The events

therefore are independent.

If a card is drawn from a pack of 52 well-shuffled cards,

then only 51 cards are left. Now, if a second card is drawn

after replacing the first card (the picked card) then the pack

again has 52 cards and the trials are independent.

However, if the first card is not replaced back, the

composition of the pack stands changed and the

probability of the second card is affected and thus the

event is dependent on the previous trial.
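The card example can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that the event of interest is "the second card is an ace"; the variable names are hypothetical.

```python
from fractions import Fraction

ACES, DECK = 4, 52

# With replacement: the pack is restored to 52 cards, so the first draw is irrelevant.
p_second_ace_with_replacement = Fraction(ACES, DECK)

# Without replacement, given that the first card was an ace:
# 3 aces remain among 51 cards.
p_second_ace_after_ace = Fraction(ACES - 1, DECK - 1)

# Without replacement, given that the first card was not an ace:
# 4 aces remain among 51 cards.
p_second_ace_after_non_ace = Fraction(ACES, DECK - 1)

print(p_second_ace_with_replacement, p_second_ace_after_ace, p_second_ace_after_non_ace)
```

Without replacement the probability for the second card changes with the result of the first draw, which is what makes the events dependent.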

Exhaustive Events

The total number of possible outcomes in any trial is known

as exhaustive events or exhaustive cases.

Examples

In tossing a fair coin, there are two possible outcomes,

head and tail. The list of these outcomes is exhaustive

since the result of any toss must be either head or tail, if

the possibility of the coin standing on an edge is ignored.

The two outcomes are also mutually exclusive.

For throwing two dice, the exhaustive number of cases is 6 x 6 = 6^2 = 36. In general, for throwing ‘n’ dice, the exhaustive number of events is 6^n. This is because any of the six numbers from 1 to 6 on the first die may be associated with any of the six numbers on each of the other dice. All the 36 outcomes are mutually exclusive. The sum of the probabilities of mutually exclusive and collectively exhaustive events should be equal to one.
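The 6^n rule can be verified by enumeration for small n. This is a minimal sketch; the helper name `exhaustive_cases` is an assumption introduced for the illustration.

```python
from fractions import Fraction
from itertools import product

def exhaustive_cases(num_dice):
    """List every ordered outcome when `num_dice` dice are thrown."""
    return list(product(range(1, 7), repeat=num_dice))

two_dice = exhaustive_cases(2)
print(len(two_dice))             # number of exhaustive cases for two dice
print(len(exhaustive_cases(3)))  # number of exhaustive cases for three dice

# Each of the equally likely outcomes has probability 1/N; together they sum to one.
total_probability = sum(Fraction(1, len(two_dice)) for _ in two_dice)
print(total_probability)
```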

Equally Likely Events

Figure 2.1.1: Exhaustive events

Events are said to be equally likely if, taking into consideration all the relevant evidence, there is no reason to expect one in preference to the others. In other words, when no event occurs more often than the others, the events are said to be equally likely.

Examples

In throwing an unbiased die, the outcome of a number

from 1 to 6 is equally likely.

In picking a card from a pack of 52 cards (with

replacement), each card can be picked up equally often.

When an unbiased coin is tossed, the chance of getting

either head or tail is equal.

Complementary Events

The complementary event of an event is made up of all the outcomes of an experiment that are unfavorable to it. Suppose ‘E’ is an event made up of the favorable outcomes in the experiment; then the complementary event, denoted by Ē, is made up of the unfavorable outcomes in that experiment. The events E and Ē are mutually exclusive and exhaustive.

Examples

In drawing a card from a pack of 52 cards, the number of cases favorable to getting the ace of diamonds is 1 and the number of cases favorable to the complementary (unfavorable) event is 51.

In throwing a die, the number of cases favorable to getting the face with number 1 is 1 and the number of cases unfavorable to it is 5.

The sum of the probabilities of an event and its

complementary event is one.


Section 2

Types of Probability

There are four basic ways of classifying probability based on

the conceptual approaches to the study of probability theory.

There is disagreement among the experts regarding the

appropriate approach of probability. The basic approaches

are:

Classical approach

Relative frequency approach

Subjective approach

Axiomatic approach

Classical Approach

The classical approach is based on the assumption that each outcome is equally likely to occur. This is an a priori assumption (the term a priori refers to something that is known by reason alone) and the probability based on this assumption is known as a priori probability. This approach employs abstract mathematical logic and hence is also called ‘abstract’ or ‘mathematical’ probability. This is the reason for the considerable use of familiar objects like cards, coins, dice, etc., where the answer can be stated in advance, before picking a card, tossing a coin or throwing a die, respectively.

Definition

If a random experiment results in ‘N’ exhaustive, mutually

exclusive, and equally likely outcomes, out of which ‘f’ are

favorable to the happening of an event ‘E’, then the

probability of occurrence of E, usually denoted by P(E) is

given by

P (E) = f / N


James Bernoulli was the first man to obtain a quantitative

measure of uncertainty and the above definition was given

by him.

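The probability that the event ‘E’ will not occur (i.e., the event Ē complementary to E) is given by

P (Ē) = (N – f) / N = 1 – f / N = 1 – P (E)

If for an event E, P (E) = 0, then the event is called an impossible event, and if P (E) = 1, then the event is called a certain event.

The classical approach can be illustrated for the tossing of a coin or the throwing of a die. Suppose that the probability of getting a head on a single toss is to be calculated; then, using formal terms,

P (head) = f / N = 1 / 2

where f = 1 is the number of outcomes favorable to a head and N = 2 is the number of exhaustive, equally likely outcomes (head and tail). Similarly, if the probability of getting ‘3’ on a single throw of a die is to be calculated, then, using formal terms,

P (3) = f / N = 1 / 6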
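The classical definition P (E) = f / N translates directly into a short function. The sketch below is illustrative only; the name `classical_probability` and its argument checks are assumptions, not part of the text.

```python
from fractions import Fraction

def classical_probability(favorable, exhaustive):
    """P(E) = f / N for N exhaustive, mutually exclusive, equally likely outcomes."""
    if exhaustive <= 0 or not 0 <= favorable <= exhaustive:
        raise ValueError("need 0 <= f <= N and N > 0")
    return Fraction(favorable, exhaustive)

p_head = classical_probability(1, 2)   # a head on a single toss of a coin
p_three = classical_probability(1, 6)  # a '3' on a single throw of a die
p_not_three = 1 - p_three              # probability of the complementary event

print(p_head, p_three, p_not_three)
```

Using exact fractions rather than decimals keeps the results in the same f / N form used in the definition.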

Limitations of classical approach to probability

The limitations of this approach are:

The classical definition is applicable only when the outcomes of a trial are equally likely or equally probable. For instance, the probability that a candidate attending an interview will succeed is not 50%, since the two possible outcomes, viz. success and failure, are not equally likely.

The classical definition is applicable only when the

exhaustive number of cases in a trial are finite.

The classical definition is applicable only when the

events are mutually exclusive.

Thus the classical approach to probability is useful in card

games, dice games, tossing coins and the like, but has

serious problems when it is applied to less orderly decision

problems that are encountered in the area of management.

Probabilities of occurrences such as an employee

resigning from a job before his/her retirement age or the

delay in delivery of a product to a nearby customer cannot

be predicted using this approach.

Relative Frequency Approach

The relative frequency of occurrence approach defines

probability as:

The observed relative frequency of an event in a very

large number of trials, when the conditions are stable

( i.e., the proportion of times that an event occurs in the

long-run.)


In this approach, the probability of happening of an event is

calculated knowing how often the event has happened in

the past. In other words, this method uses the relative

frequencies of past occurrences as probabilities. For

instance, suppose that an organization knows from the

past data that about 25 of its 300 employees entering every

year leave the organization due to good opportunities

elsewhere. Then the organization can predict the

probability of the employee turnover for this reason as:

25 / 300 = 1/12 = 0.083

Another characteristic of probabilities established by the relative frequency of occurrence approach can be illustrated by tossing a fair coin 1000 times. In the first few tosses, the proportion of heads (or of tails) may be well away from one half, but as the number of tosses increases, the proportions of heads and tails both settle close to 0.5, so that the probability of the event showing a head is 0.5 and that of the event showing a tail is 0.5. Thus accuracy is gained as the experiment is repeated and the number of observations grows. The limitation of this approach is the time and cost that such large repetitions and additional observations consume. Moreover, predicting probability using this approach becomes a blunder if the prediction is not based on sufficient data.
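The way the relative frequency settles as the number of tosses grows can be seen in a small simulation. This is a sketch under stated assumptions: the coin is simulated with Python's pseudo-random generator, and the function name and the fixed seed are choices made only for this illustration.

```python
import random

def running_proportion_of_heads(num_tosses, seed=0):
    """Simulate fair-coin tosses and return the proportion of heads after each toss."""
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    heads = 0
    proportions = []
    for toss in range(1, num_tosses + 1):
        if rng.random() < 0.5:  # treat this as "the coin shows a head"
            heads += 1
        proportions.append(heads / toss)
    return proportions

proportions = running_proportion_of_heads(1000)
for n in (10, 100, 1000):
    print(n, round(proportions[n - 1], 3))
```

The printed proportions depend on the seed, but for a fair coin the later values tend to lie closer to 0.5 than the early ones.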

Subjective Approach

The approach was introduced by Frank Ramsey in 1926.

Subjective probabilities are those assigned to events by the manager or the researcher based on past experiences or occurrences, or on the evidence available. A subjective probability may be an educated guess or intuition. At higher levels of managerial decision making, when a decision is very important, specific and unique, managers use subjective probability.

Axiomatic Approach

According to axiomatic approach, probability is a number

assigned to the occurrence of an event in a sample

space. Let S be a sample space consisting of all possible

elementary outcomes of a random experiment, i.e.,

S = {s1, s2, ..., sn}, assuming n elementary outcomes for the experiment.

Then,

i. The probability of the entire sample space S is 1, i.e., P(S) = 1.

ii. For each i, 0 ≤ P(si) ≤ 1.

iii. For i ≠ j, P(si and sj) = 0.

iv. ∑ P(si) = 1.

An event A is a collection of those elementary outcomes

meeting the requirements of the event. Clearly, the

probability of the event A must be greater than or equal to

0 and less than or equal to 1 or 100%.


i.e., 0 ≤ P(A) ≤ 1.

If A and B are mutually exclusive events, then the probability

of (A or B) is equal to the sum of the probabilities of A and B.

P (A or B) = P (A) + P (B) because P (A and B) = 0 as A and B

are mutually exclusive.

Two events A and B are mutually exclusive if the occurrence

of one implies the non-occurrence of the other. Hence

obtaining a head on tossing a coin and obtaining a tail are

mutually exclusive events.
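The axioms can be checked mechanically for any assignment of probabilities to elementary outcomes. The following is a minimal sketch; the fair die is chosen only as an example, and `satisfies_axioms` and `probability` are hypothetical helper names.

```python
from fractions import Fraction

def satisfies_axioms(distribution):
    """Check that 0 <= P(s_i) <= 1 for every outcome and that the P(s_i) sum to 1."""
    probs = list(distribution.values())
    return all(0 <= p <= 1 for p in probs) and sum(probs) == 1

def probability(event, distribution):
    """P(A): add the probabilities of the elementary outcomes that make up event A."""
    return sum(distribution[outcome] for outcome in event)

die = {face: Fraction(1, 6) for face in range(1, 7)}  # sample space of a fair die

low = {1, 2}   # event A
high = {5, 6}  # event B, which shares no outcome with A

print(satisfies_axioms(die))
print(probability(low | high, die), probability(low, die) + probability(high, die))
```

Because A and B have no elementary outcome in common, P (A or B) equals P (A) + P (B), as stated above for mutually exclusive events.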


Section 3

Probability Rules

The Addition Rule

For Mutually exclusive events

If events are mutually exclusive, the probability that any one of them occurs is the sum of their individual probabilities. This can be represented by the Venn diagram as shown in Figure 2.1.2.

P (A or B or C) = P (A) + P (B) + P (C)

Suppose A = getting 1 on throwing a die

B = getting 2 on throwing a die

C = getting 3 on throwing a die

As there are six possible equally likely outcomes on throwing a die,

P (A or B or C) = 1/6 + 1/6 + 1/6 = 3/6 = P (A) + P (B) + P (C)

For Non-mutually exclusive events

If two events are not mutually exclusive the probability of

one of them occurring is the sum of the marginal

probabilities of the events minus the joint probability of the

occurrence of the events (Refer multiplication rule for

marginal and joint probability explanation).

P (A or B) = P (A) + P (B) – P (A and B)


Figure 5.3.1: Rules of Probability

where A and B are not mutually exclusive events.

Example 5.3.1

The Warwick Systems Company markets personal computers (see Table 5.3.1). Some computers have two disk drives (A) and some have one disk drive (B). Another feature of these machines is the capacity in terms of K (kilo) bytes, that is, whether they have 256K or 128K capacity. Presently, the firm’s finished goods inventory consists of 300 machines equipped with varying features (see Table 5.3.1). At any time, the Warwick Systems Company may receive an order for a machine or machines with specific features. If Warwick has a sufficient number of machines to satisfy its customers, the customers will continue to order machines from Warwick. But if Warwick cannot satisfy its customers’ needs, they will probably order machines elsewhere. Hence the management of Warwick wishes to know the likelihood that its inventories contain machines with desirable features.

In the above example, the sample space S is the set of all

machines in inventory. What is the probability of a random

selection of a two-disk drive machine from inventory or P(A)?

Also, find the probability of randomly selecting a two-disk

drive machine with 256K capacity.

Solution:

Let us represent two disk drive machines by A and one disk

drive machine by B. We will represent 256K by C and 128K

by D.

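From Table 5.3.1, the probability of randomly selecting a two-disk drive machine is:

P(A) = 200/300 = 0.667

The probability of randomly selecting a machine with 256K capacity is: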


Table 5.3.1: Inventory of Warwick Systems Company

                           2DD (A)    1DD (B)    Total
With 256 K capacity (C)    100        50         150
With 128 K capacity (D)    100        50         150
Total                      200        100        300

Figure 5.3.2: Non-Exclusive Events


P(C) = 150/300 = 0.5

Each of the above probabilities is designated as a marginal or unconditional probability. Events A and C are not mutually exclusive since a machine may have both characteristics. The probability of a machine having two disk drives or having 256K capacity involves the addition rule with a twist. Since A and C are not mutually exclusive events, the machines that have both features would be counted twice if P (A) and P (C) were simply added, so the joint probability must be subtracted. From Table 5.3.1, P (A) = 200/300, P (A and C) = 100/300 and P (A and D) = 100/300. Hence, the probability of A or C is:

P (A or C) = P (A) + P (C) – P (A and C) = 200/300 + 150/300 – 100/300 = 250/300 = 0.833

Similarly, the probability of A or D is:

P (A or D) = P (A) + P (D) – P (A and D) = 200/300 + 150/300 – 100/300 = 250/300 = 0.833

Event (A or C) includes all elements except the 50 elements of (B and D), which are elements of neither A nor C.

The probability of a machine having features B and D is:

P (B and D) = 50/300 = 0.167

The probability of the complement of (A or C) is P (B and D). These two events account for all 300 computers.
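The addition rule for the Warwick inventory can be checked against a direct count from Table 5.3.1. The sketch below is only an illustration; the dictionary layout and the helper `p` are assumptions made for this example.

```python
from fractions import Fraction

# Cell counts from Table 5.3.1, keyed by (disk-drive type, capacity).
inventory = {
    ("A", "C"): 100,  # two disk drives, 256K
    ("B", "C"): 50,   # one disk drive, 256K
    ("A", "D"): 100,  # two disk drives, 128K
    ("B", "D"): 50,   # one disk drive, 128K
}
total = sum(inventory.values())

def p(predicate):
    """Probability that a randomly selected machine satisfies `predicate`."""
    return Fraction(sum(n for cell, n in inventory.items() if predicate(cell)), total)

p_a = p(lambda cell: "A" in cell)
p_c = p(lambda cell: "C" in cell)
p_a_and_c = p(lambda cell: cell == ("A", "C"))

p_a_or_c = p_a + p_c - p_a_and_c                 # addition rule
p_direct = p(lambda cell: "A" in cell or "C" in cell)  # direct count
p_b_and_d = p(lambda cell: cell == ("B", "D"))   # complement of (A or C)

print(p_a_or_c, p_direct, p_b_and_d)
```

The addition rule and the direct count agree, and adding P (B and D) brings the total to one.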

Example 5.3.2

Consider a bag containing 4 white and 5 black balls. If a

man draws 3 balls at random, without replacement, what

is the probability that all three are black?

Solution:

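The total number of ways in which 3 balls can be drawn from the 9 balls in the bag is 9C3 = (9 x 8 x 7) / (3 x 2 x 1) = 84, and the number of ways of drawing 3 black balls from the 5 black balls is 5C3 = 10. Therefore the probability of drawing 3 black balls is given by:

P (all three black) = 5C3 / 9C3 = 10/84 = 5/42 = 0.119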

Example 5.3.3

Consider a bag containing 5 white and 7 black balls. If

two balls are drawn at random without replacement, what

is the probability that one is white and the other is black?

Solution:

P (One is white and other, black) = (5C1 x 7C1) / 12C2 = (5 x 7) / 66 = 35/66 = 0.530
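Both ball-drawing examples reduce to counting combinations, so they can be checked with `math.comb` from the Python standard library. This is a sketch for verification only; the variable names are assumptions.

```python
from fractions import Fraction
from math import comb

# Example 5.3.2: 4 white and 5 black balls, 3 drawn without replacement.
p_three_black = Fraction(comb(5, 3), comb(9, 3))

# Cross-check by drawing one ball at a time: 5/9, then 4/8, then 3/7.
p_three_black_sequential = Fraction(5, 9) * Fraction(4, 8) * Fraction(3, 7)

# Example 5.3.3: 5 white and 7 black balls, 2 drawn without replacement.
p_one_white_one_black = Fraction(comb(5, 1) * comb(7, 1), comb(12, 2))

print(p_three_black, p_three_black_sequential, p_one_white_one_black)
```

The sequential calculation is the multiplication rule for dependent events applied to the same draw, so it must agree with the counting approach.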

Conditional Probability: Independent Events


If the probability of an event is subject to a restriction on the sample space, the probability is said to be conditional. Conditional probability is the probability of the occurrence of an event, say A, subject to the occurrence of a previous event, say B. We denote the conditional probability of event A, given that B has occurred, by P (A|B). When A and B are independent events, the conditional probability P (A|B) is simply P (A), the probability of event A. This is so because independent events are those whose probabilities are in no way affected by the occurrence of each other.

P (A|B) = P (A) or P(A and B) = P(A) x P(B)

In other words, two events A and B are said to be independent if the probability of one of them happening (or not happening) is not affected by whether the other happens, i.e., the probability of both A and B occurring is equal to the product of the probability of A occurring and the probability of B occurring.

Let us take the example of a true-false test. As the successes of the answers are independent of each other, we can say that the probability of success of the second answer, given that the first answer is a success, is simply the probability of success of the second answer, i.e.,

P (second answer correct | first answer correct) = P (second answer correct)

Conditional Probability: Dependent Events

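When A and B are dependent events, we can define the conditional probability of event A, given that event B has occurred, as the ratio of the number of elements common to both A and B to the number of elements in B. Dividing both numbers by the total number of elements gives the equivalent form

P (A | B) = P (A and B) / P (B)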

Example 5.3.4

The data regarding the membership of workers in labor organizations is given in Table 5.3.2. Calculate the conditional probability that a worker is a member of a labor organization given that he is working in a non-agricultural industry.

Table 5.3.2: Membership in Labor Organizations

Membership Status                                           Non-agricultural    Agricultural       Total
                                                            Industries (B1)     Industries (B2)
Members of labor organizations (A1)                         20,044              51                 20,095
Non-members represented by labor organizations (A2)         2,394               4                  2,398
Non-members not represented by labor organizations (A3)     63,586              1,400              64,986
Total                                                       86,024              1,455              87,479

Solution:

Let A1 denote members of labor organizations. The probability of an employed worker being a member of a labor organization (event A1) is:

P (A1) = 20,095 / 87,479 = 0.230

The probability of a worker being employed in a non-agricultural industry (event B1) is:

P (B1) = 86,024 / 87,479 = 0.983

Now, we wish to determine the probability that a worker is a member of a labor organization given that the worker is employed in a non-agricultural industry. So we must calculate the conditional probability of event A1 occurring given that event B1 has occurred. The formula for the conditional probability is:

P (A1 | B1) = P (A1 and B1) / P (B1)

The probability of a worker being both a member of a labor organization and employed in a non-agricultural industry is:

P (A1 and B1) = 20,044 / 87,479 = 0.229

The conditional probability is then computed as:

P (A1 | B1) = 0.229 / 0.983 = 0.233

The probability is 0.233 that a worker is a member of a labor organization given that the worker is in a non-agricultural industry.

Note that this probability can also be computed directly from the data in the Table. The conditional probability is

P (A1 | B1) = 20,044 / 86,024 = 0.233

The answer is the same as computed by using the formula for conditional probability.
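The two routes to the conditional probability in Example 5.3.4 can be compared numerically. The sketch below uses the counts of Table 5.3.2; the dictionary layout and variable names are assumptions made for the illustration.

```python
# Counts from Table 5.3.2, keyed by (membership status, industry).
counts = {
    ("A1", "B1"): 20044, ("A1", "B2"): 51,
    ("A2", "B1"): 2394,  ("A2", "B2"): 4,
    ("A3", "B1"): 63586, ("A3", "B2"): 1400,
}
total = sum(counts.values())

# Joint and marginal probabilities.
p_a1_and_b1 = counts[("A1", "B1")] / total
p_b1 = sum(n for (_, industry), n in counts.items() if industry == "B1") / total

# Route 1: the formula P(A1 | B1) = P(A1 and B1) / P(B1).
p_a1_given_b1 = p_a1_and_b1 / p_b1

# Route 2: restrict the sample space to the B1 column of the table.
n_b1 = sum(n for (_, industry), n in counts.items() if industry == "B1")
p_direct = counts[("A1", "B1")] / n_b1

print(round(p_a1_given_b1, 3), round(p_direct, 3))
```

The second route shows why a conditional probability is described as a restriction on the sample space: only the workers in B1 are considered.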


Multiplication Rule

Dependent events

The joint probability of two events A and B which are

dependent is equal to the probability of A multiplied by the

probability of B given that A has occurred.

P (A and B) = P (A) P (B | A)

or P (B and A) = P (B)P (A | B)

This formula is derived from the formula for the conditional probability of dependent events, P (B | A) = P (A and B) / P (A), which on rearranging gives

P (A and B) = P (B | A) x P (A)

Joint probability of several dependent events is equal to the

product of the probabilities of occurrence of the preceding

outcomes in the sequence.

P (A and B and C...) = P (A)P (B | A) P (C | A and B)....

The marginal probability of a simple event, in the case of dependent events, is the sum of the probabilities of all the joint events in which that simple event occurs.

Example 5.3.5

A study of an insurance company shows that the probability of

an employee being absent on any given day P (A) is 0.1.

Given that an employee is absent, the probability of that

employee being absent a second day in succession P (B |

A) is 0.4. Find the probability of the employee being

absent on two successive days.

Solution:

Events A and B are dependent events because B cannot

occur unless event A has occurred. The probability of an

employee being absent on two successive days:

P (A and B) = P (A) P (B | A)

= (0.1) x (0.4) = 0.04

Thus the probability of an employee being absent on two

successive days is 0.04 or 4% of the time.

Example 5.3.6

Let us consider a project which involves an outlay of Rs.1,00,000. The cash inflows expected to be generated by the project are shown in Table 5.3.3. From the table, we find that there are eight possible cash flow streams. The first cash flow stream consists of Rs.30,000 in year 1, Rs.30,000 in year 2 and Rs.35,000 in year 3; the second cash flow stream consists of Rs.30,000 in year 1, Rs.30,000 in year 2 and Rs.40,000 in year 3; and so on. The probabilities associated with these cash flow streams are also given. Calculate the probability of generating a cash inflow of Rs.30,000 in the first year.


Solution:

It may be noted that the probability with which a cash flow stream occurs is simply the joint probability of the individual elements in that cash flow stream. The probability of the first cash flow stream is, i.e.,

P (Rs.30,000 in year 1, Rs.30,000 in year 2 and Rs.35,000 in year 3) = P (Rs.30,000 in year 1) × P (Rs.30,000 in year 2 | Given Rs.30,000 in year 1) × P (Rs.35,000 in year 3 | Given Rs.30,000 in year 1 and Rs.30,000 in year 2)

= (0.5) (0.8) (0.6) = 0.24

Figure 5.3.3: Probability Tree

In the cash flow streams problem, given only the joint

probabilities of cash flows in three years from all streams

involving the cash inflow of Rs.30,000 in year one, we can

calculate the probability of the cash inflow of Rs. 30,000

in year 1.


TABLE 5.3.3: CASH INFLOWS FOR PROJECT

Year 1: net cash flow and initial probability P(1). Year 2: net cash flow and conditional probability P(2 | 1). Year 3: net cash flow and conditional probability P(3 | 2,1).

Cash flow    Year 1       P(1)    Year 2       P(2 | 1)    Year 3       P(3 | 2,1)    Joint probability
stream       cash flow            cash flow                cash flow                  P(1,2,3)
1            30,000       0.5     30,000       0.8         35,000       0.6           0.24
2            30,000       0.5     30,000       0.8         40,000       0.4           0.16
3            30,000       0.5     40,000       0.2         45,000       0.5           0.05
4            30,000       0.5     40,000       0.2         50,000       0.5           0.05
5            50,000       0.5     50,000       0.6         60,000       0.7           0.21
6            50,000       0.5     50,000       0.6         70,000       0.3           0.09
7            50,000       0.5     60,000       0.4         75,000       0.8           0.16
8            50,000       0.5     60,000       0.4         90,000       0.2           0.04


P (Rs. 30,000, Rs. 30,000, Rs. 35,000

in years 1, 2, and 3) = 0.24

P (Rs. 30,000, Rs. 30,000, Rs. 40,000

in years 1, 2 and 3) = 0.16

P (Rs. 30,000, Rs. 40,000, Rs. 45,000

in years 1, 2 and 3) = 0.05

P (Rs. 30,000, Rs. 40,000, Rs. 50,000

in years 1, 2 and 3) = 0.05

Therefore, the probability of the cash inflow of Rs.30,000 in year 1, given the above joint probabilities, is:

= 0.24 + 0.16 + 0.05 + 0.05 = 0.50
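The joint and marginal probabilities of the cash flow streams can be recomputed from the branch probabilities in Table 5.3.3. The sketch below is an illustration only; the tuple layout and the function names are assumptions, and exact fractions are used so that the sums come out without rounding error.

```python
from fractions import Fraction as F

# (year-1 inflow, P(1), year-2 inflow, P(2 | 1), year-3 inflow, P(3 | 2,1))
streams = [
    (30000, F("0.5"), 30000, F("0.8"), 35000, F("0.6")),
    (30000, F("0.5"), 30000, F("0.8"), 40000, F("0.4")),
    (30000, F("0.5"), 40000, F("0.2"), 45000, F("0.5")),
    (30000, F("0.5"), 40000, F("0.2"), 50000, F("0.5")),
    (50000, F("0.5"), 50000, F("0.6"), 60000, F("0.7")),
    (50000, F("0.5"), 50000, F("0.6"), 70000, F("0.3")),
    (50000, F("0.5"), 60000, F("0.4"), 75000, F("0.8")),
    (50000, F("0.5"), 60000, F("0.4"), 90000, F("0.2")),
]

def joint_probability(stream):
    """Multiplication rule: P(1) x P(2 | 1) x P(3 | 2,1)."""
    _, p1, _, p2, _, p3 = stream
    return p1 * p2 * p3

def marginal_year1(inflow):
    """Add the joint probabilities of every stream with the given year-1 inflow."""
    return sum(joint_probability(s) for s in streams if s[0] == inflow)

print([float(joint_probability(s)) for s in streams])
print(float(marginal_year1(30000)))
```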

Example 5.3.7

Suppose that a sample of size 2 is chosen from a population

of 6 elementary units. The sampling is performed without

replacement. Thus, an element of the population can only be

selected once in a sample. Calculate joint probability.

Solution:

Each possible sample of size 2 has the same chance of being

selected.

Let the elementary units of the population be denoted by A, B, C, D, E, and F. Then, the possible samples of size 2 are the 15 pairs listed in Table 5.3.4.

Each of these 15 equally likely samples of size 2 has a probability of 1/15 of being selected.

Consider the sample denoted by CE. Units C and E can

be selected in any order. We consider the order CE and

EC as separate events. The probability of selecting C and

then E is

P(C and E) = P(C) P (E | C) = (1/6) (1/5) = 1/30

Likewise, the probability of selecting E and then C is

P (E and C) = P (E) P(C | E) = (1/6) (1/5) = 1/30

These two joint events are mutually exclusive, and the

probability of one or the other occurring is

P [(C and E) or (E and C)] = (1/30) + (1/30) = 1/15

This value is the probability of C and E occurring in any

order.

Table 5.3.4: Possible samples of size 2

AB    AE    BD    CD    DE
AC    AF    BE    CE    DF
AD    BC    BF    CF    EF
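The samples in Table 5.3.4 and the probability of drawing C and E in any order can be reproduced by enumeration. This is a minimal sketch; the variable names are assumptions made for the example.

```python
from fractions import Fraction
from itertools import combinations, permutations

units = "ABCDEF"

# Unordered samples of size 2, as listed in Table 5.3.4.
samples = ["".join(pair) for pair in combinations(units, 2)]

# Ordered selections without replacement: each has probability (1/6) x (1/5).
ordered = list(permutations(units, 2))
p_ordered = Fraction(1, 6) * Fraction(1, 5)

# C then E, or E then C: two mutually exclusive ordered selections.
p_ce_any_order = sum(p_ordered for pair in ordered if set(pair) == {"C", "E"})

print(samples)
print(len(samples), len(ordered), p_ce_any_order)
```

Each unordered sample corresponds to exactly two ordered selections, which is why 30 ordered selections collapse into 15 samples.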


Section 4

Bayes’ Theorem

In business, there are an increasing number of instances

when occurrence of a particular event may impact the sales

and hence profits. For example, a home appliances retailer

calculates that it would be wise to stock his showroom with

microwave ovens to the extent of 15 percent of his available

shelf space. But later he finds out that the sales for

microwave ovens are showing a decline due to increase in

electricity tariff.

It is therefore important at this stage for the retailer to

recalculate the probability of a microwave oven selling under

the new circumstances. This would help him in making a

more profitable product mix decision for his showroom.

Here we find that some probabilities were changed after the

people involved (the retailer) got additional information

(information about increased electricity tariff). The new

probability thus obtained is known as posterior probability.

Since probabilities can be revised as new information is

gathered, the study of probability is of great significance in

managerial decision-making.

The concept of posterior probabilities was founded by the

18th century British Presbyterian minister Thomas Bayes.

Known as Bayes’ Theorem, it helps us to find the conditional

probability of one event occurring (A), given that another (B)

has already occurred.

The terms prior and posterior refer to the time when information is collected. Before information is obtained, we have prior probabilities. Bayes’ Theorem provides a means of calculating posterior probabilities from prior probabilities.

More formally, Bayes’ Theorem is stated as follows:

Let A1, A2, ..., An be mutually exclusive and collectively exhaustive events, such that A1 U A2 U ... U An = the sample space. Then the posterior probability of each of the mutually exclusive events Ai, posterior to event B, may be computed as

P (Ai | B) = P (Ai) P (B | Ai) / [P (A1) P (B | A1) + P (A2) P (B | A2) + ... + P (An) P (B | An)]

The example below illustrates the use of Bayes’ Theorem.

Example 2.1.8

Dandakaranya Oil Exploration Company is considering a particular site for drilling for oil. A priori (based on past experience), the company expects three possible outcomes - nil oil, a moderate quantum of oil or a huge quantum of oil, with associated chances as

P(Nil oil) = 0.6

P(Moderate oil) = 0.3

P(Huge oil) = 0.1

To obtain more information about the site, the company can conduct a seismic experiment, which can lead to one of three readings - low, medium or high. The company’s past records show the following:

i. Of the 140 past sites that were drilled and produced no oil, seven were high on seismic reading, i.e., P (high reading | nil oil) = 7/140 = 0.05

ii. Of the 500 past sites that were drilled and produced moderate oil, 10 were high on seismic reading, i.e., P (high reading | moderate oil) = 10/500 = 0.02

iii. Of the 250 past sites that were drilled and produced huge oil, 200 were high on seismic reading, i.e., P (high reading | huge oil) = 200/250 = 0.8

A seismic survey at the site under consideration gave a

high reading. Should the company undertake drilling at the

site?

Solution

Clearly, the company is concerned about the possibility of

finding no oil despite a high seismic reading, i.e., the

company would like to find out

P (nil oil | high reading).

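We apply Bayes’ Theorem to find this probability, which is shown in a tabular form below:

Event (Ai)       Prior      P (high reading | Ai)    Joint probability             Posterior
                 P (Ai)                              P (Ai) x P (high | Ai)        P (Ai | high reading)
Nil oil          0.6        0.05                     0.030                         0.030 / 0.116 = 0.2586
Moderate oil     0.3        0.02                     0.006                         0.006 / 0.116 = 0.0517
Huge oil         0.1        0.80                     0.080                         0.080 / 0.116 = 0.6897
Total            1.0                                 0.116                         1.0000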

Thus we see that even though the seismic prediction for

the site is high, still there is a 26% chance of not finding oil.

While the probability of no oil has come down with this

knowledge substantially from the earlier level of 60%, it is

for the company to take the final call.
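The revision of the prior probabilities after a high seismic reading can be reproduced with a few lines of code. The sketch below is illustrative; the function name `posterior` and the dictionary keys are assumptions, while the numbers are those given in the example.

```python
from fractions import Fraction

# Prior probabilities of the three outcomes.
priors = {
    "nil": Fraction(6, 10),
    "moderate": Fraction(3, 10),
    "huge": Fraction(1, 10),
}

# P(high reading | outcome), from the company's past records.
likelihood_high = {
    "nil": Fraction(7, 140),
    "moderate": Fraction(10, 500),
    "huge": Fraction(200, 250),
}

def posterior(priors, likelihoods):
    """Bayes' Theorem: P(Ai | B) = P(Ai) P(B | Ai) / sum over j of P(Aj) P(B | Aj)."""
    joint = {k: priors[k] * likelihoods[k] for k in priors}
    evidence = sum(joint.values())  # P(B), the probability of a high reading
    return {k: v / evidence for k, v in joint.items()}

revised = posterior(priors, likelihood_high)
for outcome, p in revised.items():
    print(outcome, round(float(p), 4))
```

The same function could be applied to a low or medium reading if the corresponding likelihoods were available; the example gives them only for a high reading.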


Section 5

Case Study: Mitra Insurance Company

This case study was written by Sravanthi Vemulawada, under the direction of R Muthukumar, IBSCDC. It is intended to be used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situation. The case was written from generalised experiences.


Insurance is defined as the equitable transfer of the risk of a loss from one entity to another in exchange for a premium. It can also be defined as a guaranteed small loss accepted to prevent a large, unpleasant loss. In law and economics, it is defined as a form of risk management primarily used to guard against the risk of uncertain loss. The company which sells the insurance is called the insurer and the person or unit buying the insurance is called the insured. The amount to be charged for a certain amount of insurance coverage is called the premium, and the insurance rate is a factor which is used to determine the premium. Nowadays, risk management, which is the practice of appraising and controlling risks, has evolved as a distinct field of study and practice. Insurance emerged back in the 7th century in the Greek and Roman societies. Basically, insurable risks share seven common characteristics. They are:

Large number of standardised exposure units: When a very large number of standardised exposure units is present, insurers benefit because the actual results (claims) are more likely to be close to the expected results (claims). If we consider the case of automobile insurance, it covers about 175 million automobiles in India, which is an example of a large number of standardised exposure units.

Definite Loss: A definite loss arises where the event gives rise to a loss at a known time, in a known place and from a known cause. Preferably, the time, place and cause of a loss should be clear enough that a person with sufficient information could, without bias, verify all three elements. Fire accidents, automobile accidents and worker injuries come under this characteristic.

Accidental Loss: The event that gives rise to the loss should be accidental, that is, unexpected, and the loss should be pure. Ordinary business risks, for example, are not considered insurable.

Large Loss: The size of the loss must be significant

from the perspective of the insured. The premiums

need to cover the expected cost of losses, the cost of

issuing and administering the policy, adjusting losses

and supplying the capital needed to practically assure

that the insurer will be able to pay claims.

Affordable Premium: The premium should be affordable in the sense that it should not cause significant loss to the insurer. If the chance of an event happening is so high, or the cost of the event so large, that the resulting premium is large relative to the protection offered, then few people are likely to buy that insurance.

Calculable Loss: The loss should be calculable. If not exactly calculable, it should at least be estimable. Estimating the probability of loss is generally an empirical exercise, while estimating the cost has more to do with the ability of a person who has a copy of the insurance policy to make a reasonably definite and unbiased assessment of the amount of the loss recoverable as a result of the claim.

Limited Risk of Disastrously Large Losses: If a risk

can cause large losses to a very large number of

people holding various policies, the ability of the insurer

to issue policies becomes constrained, for example in

the case of earthquakes, hurricanes, etc.

Any risk that can be measured can potentially be insured.

There are different types of insurance like Auto Insurance,

Home Insurance, Health Insurance, Disability Insurance,

Casualty Insurance, Life Insurance, Property Insurance,

Liability Insurance, Credit Insurance, etc.

The actual application for benefits provided by an

insurance company is called an insurance claim. All the

policyholders must first file an insurance claim before any

money can be paid out to the hospital, to the repair shop or

to any contracted service. Now it is completely up to the

insurance company whether to approve the claim or

disapprove it based on their assessment of all conditions.

Depending on the type of insurance, policyholders have to make regular payments. For home, life, health and automobile insurance policies, the individual makes regular payments, called premiums, to the insurer. By and large, these premiums are used to settle other policyholders’ insurance claims or to build up the assets of the company. When an accident or a natural calamity causes real financial damage, the policyholder has the right to file an insurance claim so as to receive money from the insurance company.

Generally, the insurance claim is filed with the local agent

of the insurance company who is responsible for studying

the details of the insurance claim and negotiating the

payments from the required insurer. Recognised

authorities like doctors, repair shops, etc., can file for the

insurance claims directly. Sometimes, the policyholder

would not want to file a claim because the damage would

have been minor or the opposite party has agreed to pay

out of their pockets for the mistake.

Once the insurance claim is filed with the local agent, the insurance company sends an investigator, who is called an adjustor or an appraiser. The appraiser’s job is to evaluate the claim and determine whether the repair valuation is reasonable, so that any fraud by contractors can be prevented. Most of the time, the appraiser’s evaluation is considered final. An insurance company may refuse to recognize a claim for several reasons: for example, the accident was caused by carelessness, or the policy is not active because the claimant’s premiums have not been paid in full.

Mitra Insurance Company is a nationally recognised insurance company in India. The kinds of claims handled by this company include hospitalisation, physician’s visit and outpatient treatment. The company received claims from the eastern, western, northern and southern parts of the country (Exhibit I).

Using Exhibit I, discuss the various entries as


conditional probabilities.

What is the probability of the event that the claim is from

west and the type is hospitalization?

Exhibit I. Mitra Insurance Company: Claims (Geographical Regions)

Kind of claim           East   South   North   West
Hospitalization           75     128      29     52
Physician visit          233     514     104    251
Outpatient treatment     100     326      65     99
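The entries of Exhibit I can be turned into joint and conditional probabilities by dividing each cell by the appropriate total. The sketch below is only one way to organise that arithmetic; the dictionary layout and the function names are illustrative assumptions, not part of the case.

```python
# Claim counts from Exhibit I, keyed by kind of claim and then by region.
claims = {
    "Hospitalization":      {"East": 75,  "South": 128, "North": 29,  "West": 52},
    "Physician visit":      {"East": 233, "South": 514, "North": 104, "West": 251},
    "Outpatient treatment": {"East": 100, "South": 326, "North": 65,  "West": 99},
}

# Grand total of all claims in the exhibit.
total = sum(sum(row.values()) for row in claims.values())


def joint(kind, region):
    """P(kind and region): one cell divided by the grand total."""
    return claims[kind][region] / total


def kind_given_region(kind, region):
    """P(kind | region): one cell divided by its column (region) total."""
    region_total = sum(row[region] for row in claims.values())
    return claims[kind][region] / region_total


def region_given_kind(kind, region):
    """P(region | kind): one cell divided by its row (kind) total."""
    return claims[kind][region] / sum(claims[kind].values())


print(total)                                             # 1976
print(round(joint("Hospitalization", "West"), 4))        # 52 / 1976
print(round(kind_given_region("Hospitalization", "West"), 4))
print(round(region_given_kind("Hospitalization", "West"), 4))
```

Dividing by the grand total answers the "west and hospitalization" question, while dividing by a row or column total gives the conditional readings of the same cell.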

Section 6

Case Study: Ram Publishers

On 20th February 2004, Siva Raman, President of Ram Publishers, met R.K. Mohan, Vice President, Marketing, and Robert Wilson, Chief Editor, to exchange notes on the negotiations under way with N. Periyasamy regarding his soon-to-be-written autobiography. Periyasamy, a 65-year-old retired IAS officer, had been appointed to the Election Commission by the government in 2000. Periyasamy planned to resign before his term expired in 2005. He had approached Ram Publishers, as well as two other publishing houses, to publish his memoirs.

Periyasamy was widely respected and his advice had been

sought by friend and foe alike. He had cultivated friendships

with various national political leaders. Periyasamy was a

regular participant in various meetings convened by political

leaders.

A year back, Periyasamy had decided to cash in on these experiences by writing a book. Ram Publishers had queried him about the likely content of the autobiography. While it was clear he intended to narrate the political intrigues he had known, he also seemed to be well-informed on other issues. Ram Publishers believed Periyasamy’s autobiography might become a best seller.

Periyasamy was very clear about his profit expectation: Rs. 2 lakhs to sign a deal and another Rs. 2 lakhs upon delivery of the script. It was also understood that the manuscript would be ghost-written. Periyasamy would tell his reminiscences to Ram Publishers’ staff, who would compile them into a book.

At a meeting between Siva Raman, Mohan, and Robert Wilson, the conversation went as follows:

Mohan: I think this book could be a big hit of 2005 and the sales could be as much as one lakh copies assuming a price of Rs.250 retail. This is first and foremost a political book. But let us not get too excited. We have to consider the possibility that Periyasamy’s personal appeal, which is at its peak at the moment, might dissipate over the next year. We also don't know which other politicians might publish their memoirs around the same time. Remember, 2004 is an election year. The situation is quite fluid. I believe that at a retail price of Rs.250, there is a 40% chance of sales of around one lakh books, a 30% chance of sales of around 40,000 books, and a 30% chance of sales of around 10,000 books. Those are just representative scenarios for the purpose of our calculations, of course.

Wilson: One thing is important. The book has to be written

before we can sell it. He has never written a book before,

so he doesn't know what it involves.

Siva Raman: We are also not completely confident that his

memoirs are going to be as exciting as we are expecting.

Let's face it, when our staff start looking at his stories, they

may find that the book is dull.

Mohan: We should be careful. We have to be sure we can make a profit if we publish it. One good thing: Periyasamy has accepted the possibility that we may not wish to publish the book once we get to look at the manuscript.

Wilson: That's right. But after his delivery of manuscript, we

have to pay him the second Rs.2 lakhs, whether we publish

it or not.

Siva Raman: I think there is only a 70% chance Periyasamy will actually deliver a manuscript. Even after his delivery of the manuscript, there's a 30% chance of a poor script that we cannot publish. If we decide to publish the book, then we have to examine the sales forecasts accurately. I don't see how we can learn much more about our likely sales before we make our final decision about going to press.

Wilson: Why should we hand over Rs. 2 lakhs to someone who may never deliver a manuscript?

Siva Raman: Before we get into that, let us use these sales projections and probabilities and check to see if this deal makes sense.

Mohan: Let us first look at the costs. The cost of editorial services (editing, proofreading and obtaining permissions for photographs, etc.) will be Rupees One lakh, which will be incurred even if we decide to stop publishing. If we decide to publish the book, we will also incur the cost of preparing camera-ready proofs, about Rs.50,000. Printing costs will be Rs.75 per copy.

Siva Raman: Will the unit cost come down, if we generate

more volume?

Mohan: Yes, but we'll need to print 10,000 copies no matter what. So, although it would cost much more per copy if we were printing, say, 1,000 copies, for the numbers we are talking about, it is effectively a flat rate. Furthermore, for orders of our size, the printer will allow us to order copies on an "as-needed" basis, and we'll still get the same rate. This means we won't get stuck with unsold inventory. We'll get returns if the retailers cannot sell them.

Mohan: My proposed retail price of Rs.250 assumes a wholesale price of Rs.160. For a generous margin like that we will not permit returns. That's a common enough practice with books of a very topical nature. Distribution costs will be about Rs.5 per copy. Marketing costs make up about 40% of the wholesale price.

Siva Raman: But much of that marketing cost is fixed. We

have a marketing department and sales force whether we

sell Periyasamy's book or not. What are our incremental

marketing costs?

Mohan: We will pay 5% of the wholesale price as a commission to the sales force. We will also spend about Rs.15 lakhs on advance publicity. We can avoid this cost if we decide not to publish the book based on our judgment.

Wilson: I feel that if we're only considering incremental expenses, then the cost of editorial services would be more like Rs.50 thousand rather than Rupees One lakh, since the permanent editorial staff are not very busy these days.

Ram Publishers’ senior management wondered whether

they should go ahead with the agreement.
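As a starting point for the discussion, Mohan's three representative scenarios can be combined into an expected sales volume, and the per-copy figures quoted in the conversation into an incremental contribution per copy. The sketch below is one possible reading of the case, not its solution: treating the 5% commission, printing and distribution as the only per-copy incremental costs is an assumption, and the variable names are illustrative.

```python
# Mohan's three representative sales scenarios: (probability, copies sold).
scenarios = [(0.40, 100_000), (0.30, 40_000), (0.30, 10_000)]

# Expected sales = sum of probability x copies over the scenarios.
expected_copies = sum(p * copies for p, copies in scenarios)

# Per-copy figures in rupees, as quoted in the conversation.
wholesale_price = 160
printing_cost = 75
distribution_cost = 5
commission = 0.05 * wholesale_price  # 5% of the wholesale price

# Assumed incremental contribution per copy: price minus per-copy costs.
contribution_per_copy = (
    wholesale_price - printing_cost - distribution_cost - commission
)

print(round(expected_copies))           # 55000
print(round(contribution_per_copy, 2))  # 72.0
```

A full analysis would still have to bring in the advance payments, the fixed publishing costs, and the chances that the manuscript is never delivered or turns out to be unpublishable.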

Chapter 6

Probability Distributions

In this chapter we will discuss:

Random Variable and Probability Distribution

Some Common Discrete Distributions

The Binomial Distribution

The Poisson Distribution

Some Common Continuous Distributions

Normal Distribution

t-Distribution

F-Distribution

Case Study: The Problem of a Medical Representative

Section 1

Random Variable and Probability Distribution

In this chapter, we will discuss the concepts of probability

distributions. In fact, probability distributions are related to

frequency distributions and are considered as theoretical

frequency distributions. As these distributions deal with

expectations, they can be used as models in making

inferences and decisions under uncertain conditions. To

have a better understanding of the concepts of probability

distributions, let us consider the case of tossing a fair

(unbiased) coin twice. The possible outcomes of the

experiment are as shown in Table 6.1.1.

Suppose that an analyst is interested in knowing the number

of heads that can possibly result when the coin is tossed

twice. The analyst can conclude that out of the four possible

outcomes, one does not show the head at all, two show a

single head and one shows two heads. This is a theoretical

outcome and represents the way in which the analyst

expects the two-toss experiment to behave over time. This is

called probability distribution of the experiment.

Probability distributions can also be based on experience. Insurance actuaries do this when they determine insurance premiums, using experience with death rates to establish the probabilities of dying among various age groups.

Table 6.1.1. Possible Outcomes of Tossing a Fair Coin Twice

First toss   Second toss   Number of heads on two tosses   Probability of the outcome
H            H             2                               0.5 × 0.5 = 0.25
H            T             1                               0.5 × 0.5 = 0.25
T            T             0                               0.5 × 0.5 = 0.25
T            H             1                               0.5 × 0.5 = 0.25
Total                                                      1.00

Random Variables

Before proceeding further, let us first understand the concept of random variables, which will enable us to understand the concept of probability distributions better.

A random variable is a variable that takes on different values as a result of the outcomes of a random experiment. A

random variable is said to be continuous if it is allowed to

assume any value within a specified range and is said to be

discrete if it is allowed to take only a limited or countable

number of values, which can be listed. This can be further

explained through the following example. Suppose an

unbiased pair of dice is tossed. The possible outcome of the

sum of the upper faces of the two dice can take on any

integer value between 2 and 12. The outcome is said to be

discrete because it can take only a finite (or countable)

number of values.

On the other hand, if the task is to determine the mean age of a sample of 1000 voters, the possible outcome (X) can take any value in an interval of numbers and is hence continuous. It is general practice to use capital letters for random variables and lower-case letters for the actual values they take, as in X = x.

Expected Value of a Random Variable

Imagine a situation of tossing a coin ten times and getting 6

heads out of the experiment. The result is not always the

same if the same experiment is repeated under similar

conditions and is bound to vary from experiment to

experiment, though the coin is totally unbiased.

Expected value is a fundamental idea in the study of

probability distribution and is obtained by multiplying each

value that the variable can assume by the probability of

occurrence of that value and then summing up these

products. Let us illustrate the process of calculating the

expected value with the help of Example 6.1.1

Example 6.1.1

The daily records of a dental clinic indicate that the number of patients arriving at the clinic ranges from 30 to 45 per day. Table 6.1.2 shows the number of days on which each level was observed during the past 100 days, together with the probability that the same level recurs the next day. Calculate the expected number of patients arriving at the clinic.


Solution:

To obtain the expected value of patients, we have to

multiply each value that the variable can assume with the

probability of occurrence of that value and then sum these

products. This is illustrated in Table 6.1.3.

Table 6.1.2. Number of Patients at Dental Clinic

Number of Patients   Number of days the level was observed   Probability for reaching the level
30                    3                                      0.03
31                    2                                      0.02
32                    1                                      0.01
33                    5                                      0.05
34                    6                                      0.06
35                    7                                      0.07
36                    9                                      0.09
37                   10                                      0.10
38                   12                                      0.12
39                   11                                      0.11
40                    9                                      0.09
41                    6                                      0.06
42                    5                                      0.05
43                    8                                      0.08
44                    2                                      0.02
45                    4                                      0.04

Table 6.1.3. Calculation of Expected Value

Number of Patients (1)   Probability for reaching the level (2)   (1) × (2)
30                       0.03                                     0.90
31                       0.02                                     0.62
32                       0.01                                     0.32
33                       0.05                                     1.65
34                       0.06                                     2.04
35                       0.07                                     2.45
36                       0.09                                     3.24
37                       0.10                                     3.70
38                       0.12                                     4.56
39                       0.11                                     4.29
40                       0.09                                     3.60
41                       0.06                                     2.46
42                       0.05                                     2.10
43                       0.08                                     3.44
44                       0.02                                     0.88
45                       0.04                                     1.80
Expected value (sum of the products)                              38.05

However, the expected value in the table does not mean that

38.05 patients will arrive the next day. This only helps the

dentist as a basis for his decisions on daily visits because the

expected value is a weighted average of the outcomes that

can be expected in the future. The dentist should recompute

the expected value and update his information on a regular

basis.
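The calculation in Example 6.1.1 can be checked with a few lines of code. This is a minimal sketch that assumes the 100 days of records in Table 6.1.2 are used directly as relative frequencies; the variable names are illustrative.

```python
# Number of days (out of the past 100) on which each patient count was seen,
# copied from Table 6.1.2.
days_observed = {
    30: 3, 31: 2, 32: 1, 33: 5, 34: 6, 35: 7, 36: 9, 37: 10,
    38: 12, 39: 11, 40: 9, 41: 6, 42: 5, 43: 8, 44: 2, 45: 4,
}

total_days = sum(days_observed.values())

# Relative frequency of each level, used as its probability.
probability = {n: d / total_days for n, d in days_observed.items()}

# Expected value: sum of (value x probability of that value).
expected_patients = sum(n * p for n, p in probability.items())

print(total_days)                   # 100
print(round(expected_patients, 2))  # 38.05
```

Recomputing the expected value as new daily records arrive only requires updating the `days_observed` counts.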

Types of Probability Distributions

Probability distributions are basically of two types:

Discrete Probability Distribution

Continuous Probability Distribution

A discrete variable can take only a limited number of values, which can be listed. The month in which a person is born, for example, is a discrete variable because there are only 12 possible values (the 12 months of the year) in the distribution. On the other hand, in a continuous probability distribution, the variable is allowed to take on any value within a given range.

Discrete Probability Distributions

Since each value of a discrete random variable is linked to an

outcome of an experiment, the values of a random variable

can be related to the probabilities of outcomes. The result of

this process is called a discrete probability distribution. To

illustrate the concepts of discrete probability distributions, let

us consider an experiment of tossing a balanced coin thrice. The outcome has to be one of the following:

TTT HTT THT TTH

HHT HTH THH HHH

If the aim is to determine the number of times head

occurs (X), the results can be depicted as given in Table

6.1.4.

Table 6.1.4. Theoretical Results of Tossing a Balanced Coin

X    Frequency    Relative Frequency
0    1            1/8
1    3            3/8
2    3            3/8
3    1            1/8

The values given in the relative frequency column of Table 6.1.4 are nothing but the probabilities associated with the values of X. So, the above findings can be slightly modified as shown in Table 6.1.5.

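Table 6.1.5 is a typical example of a discrete probability distribution. The mean number of heads in 3 tosses is calculated as below:

$$\mu = \sum_x x \, P(X = x) = 0 \times \tfrac{1}{8} + 1 \times \tfrac{3}{8} + 2 \times \tfrac{3}{8} + 3 \times \tfrac{1}{8} = 1.5$$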

µ = 1.5 has a practical interpretation. If this experiment of

tossing a coin 3 times were repeated an infinite number of

times and the values of X were recorded, then theoretically,

1.5 would represent the average number of times heads would

come up. For this reason the mean is often called the

expected value E(X).
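The distribution of the number of heads and its mean can also be obtained by listing the eight outcomes mechanically. The following is a minimal sketch; exact fractions are used so that the probabilities match the 1/8 and 3/8 entries, and the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2 x 2 x 2 = 8 equally likely outcomes of tossing a balanced coin thrice.
outcomes = list(product("HT", repeat=3))

# X = number of heads in each outcome; count how often each value occurs.
frequency = Counter(outcome.count("H") for outcome in outcomes)

# Probability of each value of X = frequency / number of outcomes.
pmf = {x: Fraction(n, len(outcomes)) for x, n in sorted(frequency.items())}

# Mean (expected value) = sum of x * P(X = x).
mean = sum(x * p for x, p in pmf.items())

for x, p in pmf.items():
    print(x, p)
print(float(mean))  # 1.5
```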

Continuous Probability Distribution

In such distributions, the variable can assume any value within

a given range. Therefore it is impossible to list all possible

values. If we were studying the waiting time for customers at a bank teller counter, the waiting time would be a continuous variable

as the variable can take on any value within a continuum or

interval, depending on the precision of the measuring

instrument. This distribution would therefore be called a

continuous probability distribution.

Table 6.1.5. Probabilities of Getting Heads

X        P(X = x)
0        1/8
1        3/8
2        3/8
3        1/8
Total    1

Section 2

Some Common Discrete Distributions

Binomial Distribution

The binomial distribution is one of the most widely used probability distributions of a discrete random variable. It describes discrete, non-continuous data resulting from an experiment known as a Bernoulli process (named after Jacob Bernoulli, a Swiss mathematician of the seventeenth century). The tossing of a fair coin a fixed number of times is a typical example of a Bernoulli process, and the outcomes (say, number of heads) of such tosses can be represented by the binomial probability distribution.

The binomial distribution has an expected value (or mean µ)

which can be represented by the formula

µ = n p

Variance of the binomial distribution

= npq

Where,

n = Total number of Bernoulli trials

p = Probability of success in one trial

q = Probability of failure in one trial = 1– p

Characteristics of a Bernoulli Process

Each trial (each toss in our example) will have only

two possible outcomes: Success or Failure (head or

tail in our case).

The probability of the outcome of any trial remains

constant over time. That is, the probability of getting a

tail, in our example, is always 0.5 irrespective of the

number of times the coin is tossed.

The outcome of one trial cannot influence the outcome of any other trial, and each trial is statistically independent.

In technical parlance, the symbol ‘p’ is used to represent the probability of a success and the symbol ‘q’ (q = 1 – p) to represent the probability of failure. To represent a certain number of successes, the symbol ‘r’ is generally used, and the symbol ‘n’ is used to represent the total number of trials.

The formula used to determine the probability of ‘r’ successes in ‘n’ trials is given by

$$P(r) = \frac{n!}{r!\,(n-r)!} \, p^r q^{n-r}$$

Example 6.2.1

A fair coin is tossed ten times. If getting head is defined as

success, find out the probability of getting 4 successes in the

ten trials.

Solution:

p = probability of getting head = 0.5

(Since it is a fair coin)

q = probability of not getting head = 1-p = 0.5

r =number of successes = 4

n = number of trials = 10

Probability of getting 4 successes in 10 trials

$$P(4) = \frac{10!}{4!\,6!} \, (0.5)^4 (0.5)^6 = 210 \times 0.0009765625 = 0.2051$$

Thus there is a 0.2051 probability of getting four heads on

ten tosses of a fair coin.

Example 6.2.2

A binomial experiment is repeated nine times. If the probability of a success is 0.6, find the probability of getting four successes.

Solution:

Here, n = 9, p = 0.6, q = 0.4, r = 4

$$P(4) = \frac{9!}{4!\,5!} \, (0.6)^4 (0.4)^5 = 126 \times 0.1296 \times 0.01024 = 0.167$$
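Examples 6.2.1 and 6.2.2 can be reproduced with a short function. This is a minimal sketch of the binomial probability of r successes in n trials; the function name `binomial_pmf` is an illustrative choice, and `math.comb` (Python 3.8 or later) supplies the number of combinations.

```python
from math import comb


def binomial_pmf(r, n, p):
    """Probability of exactly r successes in n Bernoulli trials."""
    q = 1 - p
    return comb(n, r) * p**r * q ** (n - r)


# Example 6.2.1: four heads in ten tosses of a fair coin.
print(round(binomial_pmf(4, 10, 0.5), 4))  # 0.2051

# Example 6.2.2: four successes in nine trials with p = 0.6.
print(round(binomial_pmf(4, 9, 0.6), 3))   # 0.167

# The mean of the distribution should equal n * p.
n, p = 9, 0.6
mean = sum(r * binomial_pmf(r, n, p) for r in range(n + 1))
print(round(mean, 6))                      # 5.4
```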

Poisson Distribution

The Poisson distribution applies to the situation when an event occurs at random points in time or space. The observations on such an event are characterized by an average number of occurrences of that event per unit time or space. This distribution is named after its developer Siméon Denis Poisson, a French mathematician. It can be used to describe a number of processes, like the distribution of telephone calls going through a switchboard system, the arrivals of trucks at a toll booth, and so on.

A process is said to be producing a Poisson probability distribution if the following conditions are met:

(i) Independence

The number of times an event S occurs in any time interval is independent of the number of times it occurs in any other disjoint time interval.

(ii) Rate

In a very small time interval, t to t + h (where h is infinitesimally small), the probability that the event occurs once is approximately λh (where λ is the average rate at which the event S occurs per unit of time).

(iii) Lack of Clustering

The chance of two or more occurrences of S in a very small interval, t to t + h, is insignificant in comparison with λh, the chance of one occurrence.

In other words, we can describe the Poisson distribution as a limiting case of the binomial distribution where the probability of success (p) is infinitesimally small and the number of trials (n) is so large that the product np equals λ, a finite constant. The probability mass function that represents the number of times the event S occurs in a given interval of time can be written as

$$P(X = x) = \frac{e^{-\lambda} \lambda^x}{x!}, \quad x = 0, 1, 2, \ldots$$

Where X = discrete random variable

x = specific value X can take

λ = the mean number of occurrences per interval of time

The mean and the variance of a Poisson distribution are both equal to λ.

Suppose that we are measuring events in time, occurring with the following properties:

The number of events occurring in one time interval is independent of the number occurring in any other disjoint time interval. (It has no memory.)

The probability that a single event will occur during a very short time interval is proportional to the length of that time interval.

The probability that more than one event will occur in such a short time interval is negligible.

Then the number of events occurring in a fixed time interval is a random variable X that has the Poisson distribution.

Example 6.2.3

The average number of radioactive particles passing through a counter during one millisecond in a laboratory experiment is four. What is the probability that six particles enter the counter in a given millisecond?

Solution:

We know that λ = 4 and x = 6, so

$$P(X = 6) = \frac{e^{-4} \, 4^6}{6!} = 0.1042$$
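The Poisson probability in Example 6.2.3 can be computed directly, and the same code illustrates the description of the Poisson distribution as a limiting case of the binomial. This is a minimal sketch; the function names are illustrative, and the values of n in the loop are arbitrary choices made only to show the convergence.

```python
from math import comb, exp, factorial


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson variable with mean lam."""
    return exp(-lam) * lam**x / factorial(x)


def binomial_pmf(r, n, p):
    """P(r successes in n trials), used here only for the comparison."""
    return comb(n, r) * p**r * (1 - p) ** (n - r)


# Example 6.2.3: lambda = 4 particles per millisecond, x = 6.
print(round(poisson_pmf(6, 4), 4))  # 0.1042

# Binomial with np held at 4: as n grows (and p shrinks), the binomial
# probability of 6 successes approaches the Poisson value above.
for n in (10, 100, 10_000):
    print(n, round(binomial_pmf(6, n, 4 / n), 4))
```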


Section 3

Some Common Continuous Distributions

Normal Distribution

The normal distribution reflects the various values taken by

many real life variables like the heights and weights of

people or the marks of students in a large class. In all these

cases a large number of observations are found to be

clustered around the mean value and their frequency drops

sharply as we move away from the mean in either direction.

For example, if the mean height of an adult in a city is 6 feet

then a large number of adults will have heights around 6

feet. Relatively few adults will have heights of 5 feet or 7

feet.

Further, if we draw samples of size n (where n is a fixed number over 30) from any population, then the sample mean $\bar{X}$ will be (approximately) normally distributed with a mean equal to µ, the mean of the population.

The normal variable is a continuous variable. The characteristics of the normal probability distribution with reference to Figure 6.3.1 are:

The curve has a single peak; thus it is unimodal.

The mean of a normally distributed population lies at the

center of its normal curve.

Because of the symmetry of the normal probability

distribution, the median and the mode of the distribution

are also at the center.

The two tails of the normal probability distribution extend

indefinitely and never touch the horizontal axis.


Figure 6.3.1

The Standard Normal Distribution

The Standard Normal Distribution is a normal distribution with a mean µ = 0 and a standard deviation σ = 1. The observation values in a standard normal distribution are denoted by the letter Z.

Example 6.3.1

A population is normally distributed with mean = 0 and

standard deviation = 1. What is the probability that an

observation from the population will have a value between –

1.28 and 1.28?

Solution:

We know that for a normal distribution 80% of the observations lie between µ – 1.28σ and µ + 1.28σ.

Here, µ = 0 and σ = 1, i.e., it is a standard normal distribution.

So 80% of the observations will lie between –1.28 and +1.28 (from the normal table).

Hence the probability that an observation will have a value

between –1.28 and 1.28 is 80%.

Example 6.3.2

What is the probability that an observation from a standard

normal distribution will lie in the interval –1.96 to 1.96 ?

Solution:

From normal table, the probability is 95%.

Example 6.3.3

What is the probability that an observation from a standard

normal distribution will lie between –2.33 and + 2.33 ?

Solution:

From normal table, the probability is 98%.
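The normal-table values used in Examples 6.3.1 to 6.3.3 can be checked numerically. The sketch below relies on the standard identity that, for a standard normal variable, P(–z < Z < z) equals erf(z/√2), where erf is the error function available in Python's math module; the function name `central_probability` is an illustrative choice.

```python
from math import erf, sqrt


def central_probability(z):
    """P(-z < Z < z) for a standard normal variable Z."""
    return erf(z / sqrt(2))


# The three intervals used in Examples 6.3.1 to 6.3.3.
for z in (1.28, 1.96, 2.33):
    print(z, round(central_probability(z), 2))
```

Rounded to two decimals, the three results agree with the 80%, 95% and 98% read from the normal table.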

Standardizing Normal Variables

Suppose we have a normal population. We can represent it by a normal variable X. Further, we can convert any value of X into a corresponding value Z of the standard normal variable, by using the formula

$$Z = \frac{X - \mu}{\sigma}$$

Where

X = the value of any random variable

µ = the mean of the distribution of the random variable

σ = the standard deviation of the distribution

Z = the number of standard deviations from X to the mean of the distribution, known as the z score or standard score.

Example 6.3.4

A normal variable X has a mean of 56 and a standard

deviation of 12. Find the Z value corresponding to the X

value of –5.

Solution:

$$Z = \frac{X - \mu}{\sigma} = \frac{-5 - 56}{12} = -5.08$$

Example 6.3.5

A normal variable has a mean of 10 and a standard

deviation of 5. What is the probability that the normal

variable will take a value in the interval 0.2 to 19.8?

Solution:

Probability (0.2 < X < 19.8)

= Probability ((0.2 – 10)/5 < Z < (19.8 – 10)/5)

= Probability (–1.96 < Z < 1.96)

= 95%

[Because 95% of the area under the standard normal curve

lies in the interval -1.96 to 1.96]

We can see this from the Normal Table:

Area under the standard normal curve between 0 and

1.96 is 0.4750.

Due to symmetry of the standard normal distribution, area

under the curve between –1.96 and + 1.96 is twice the

area under the curve between 0 and + 1.96.

Probability (–1.96 < Z < + 1.96) = 0.95 or 95%

Any normal variable can be converted into a standard

normal variable as illustrated above. Hence, we can use

the standard normal distribution table to find the

probability that the variable will take a value within any

given interval.
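The standardizing step in Examples 6.3.4 and 6.3.5 can be sketched in code. The helper `z_score` is an illustrative name; `statistics.NormalDist` from the Python standard library (version 3.8 or later) stands in for the normal table.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Number of standard deviations between x and the mean mu."""
    return (x - mu) / sigma


# Example 6.3.4: mean 56, standard deviation 12, X = -5.
print(round(z_score(-5, 56, 12), 2))  # -5.08

# Example 6.3.5: mean 10, standard deviation 5, interval 0.2 to 19.8.
z_low = z_score(0.2, 10, 5)
z_high = z_score(19.8, 10, 5)

# Probability via the standard normal variable Z ...
standard = NormalDist(mu=0, sigma=1)
p_standardized = standard.cdf(z_high) - standard.cdf(z_low)

# ... and directly from the original normal variable X.
original = NormalDist(mu=10, sigma=5)
p_direct = original.cdf(19.8) - original.cdf(0.2)

print(round(p_standardized, 2), round(p_direct, 2))  # 0.95 0.95
```

Both routes give the same probability, which is the reason a single standard normal table is enough for any normal variable.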

The Lognormal Distribution

If ln (X) is a normally distributed random variable, then X

is said to be a lognormal variable.

If P1, P2, P3, ... are the prices of a scrip in periods 1, 2, 3,

..., some applications in finance require ln (P2/P1), ln (P3/

P2),... to be normally distributed, that is, continuously

compounded returns are required to be normal. This

property is described as “Stock Prices are Lognormal”.
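The continuously compounded returns mentioned above are simply logarithms of successive price ratios. The sketch below uses a short, entirely hypothetical price series (the numbers are made up for illustration) to show how ln(P2/P1), ln(P3/P2), ... are computed and how they add up over several periods.

```python
from math import exp, log

# Hypothetical prices of a scrip in periods 1, 2, 3, 4 (illustrative only).
prices = [100.0, 102.0, 99.5, 101.0]

# Continuously compounded return for each period: ln(P_next / P_current).
log_returns = [log(nxt / cur) for cur, nxt in zip(prices, prices[1:])]

# Log returns are additive: their sum is the log return over the whole span.
total_log_return = sum(log_returns)

print([round(r, 4) for r in log_returns])
print(round(exp(total_log_return) * prices[0], 2))  # recovers the last price
```

If these log returns are normally distributed, the price ratios themselves are lognormal in the sense defined above.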

t-Distribution

Suppose we randomly select an Indian and find his/her

weight. Then X = “Weight of the person” is a random

variable. We may assume that X is normally distributed.

Moreover, suppose E(X) = 60 kg and that V(X) is unknown, where E(X) is the population mean and V(X) is the population variance.

Suppose we take a random sample of five people and compute the average weight, say $\bar{X}$. Then $\bar{X}$ is also a random variable, since different samples may give different values for $\bar{X}$.

It is a fact that $E(\bar{X}) = E(X)$ for any such experiment. It is also true that $V(\bar{X}) = V(X)/n$, where n is the sample size. It is also true that if X is normal, so is $\bar{X}$. Hence

$$Z = \frac{\bar{X} - \mu}{\sqrt{V(X)/n}}$$

has mean 0 and variance 1. But we do not have $V(\bar{X})$ since V(X) is unknown.

We may compute the sample variance, $s^2$, from the five individuals. We may consider $s^2$ as an approximation of V(X), and replace V(X) by $s^2$. In doing so we are losing one degree of freedom. And

$$t = \frac{\bar{X} - \mu}{s/\sqrt{n}}$$

follows a t-distribution with (n – 1) degrees of freedom,

where,

µ = Population mean

s = Sample standard deviation

n = The sample size

Figure 2.2.2: Distribution Curves with Different Degrees of Freedom

As shown in the figure, the t-distribution is symmetrical like the normal distribution, but its peak is lower than the normal curve and its tails are a little higher above the abscissa than those of the normal curve.

As the degrees of freedom increase, the distribution approaches the normal distribution. So the t-distribution is used when the sample size is 30 or less. Another condition for using this distribution is that the population standard deviation is unknown.

Example 6.3.6

Consider the t-distribution with df = 13. What is the area to the

right of 1.771?

Solution:

From the t-distribution table, it can be seen that the area

under both the tails is 0.10. Therefore, the area under the

right tail will be 0.05.
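The table value in Example 6.3.6 can be cross-checked by integrating the t density numerically. This is a sketch under stated assumptions: it uses the standard density of the t-distribution, Simpson's rule with an arbitrarily chosen number of steps, and illustrative function names; it is not how the printed table was produced.

```python
from math import gamma, pi, sqrt


def t_pdf(t, df):
    """Density of the t-distribution with df degrees of freedom."""
    constant = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return constant * (1 + t * t / df) ** (-(df + 1) / 2)


def t_right_tail(t, df, steps=2000):
    """Area to the right of t (t >= 0), using Simpson's rule on [0, t].

    The distribution is symmetric about 0, so the area to the right of 0
    is 0.5; subtracting the area between 0 and t leaves the right tail.
    `steps` must be even.
    """
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t


# Example 6.3.6: df = 13, area to the right of 1.771.
print(round(t_right_tail(1.771, 13), 3))  # 0.05
```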

F-Distribution

The F-distribution is the distribution of the ratio of two independent chi-square variables, each divided by its degrees of freedom. The degrees of freedom of the numerator are $n_1$ and those of the denominator are $n_2$. We will come across this distribution while studying regression.

Example 6.3.7

Consider the F-distribution with degrees of freedom 2 in the

numerator and 13 in the denominator. What is the area to the

right of 3.81?

Solution:

From the F-distribution table, this area equals 0.05.
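The description of the F-distribution as a ratio of chi-square variables can be illustrated by simulation. The sketch below makes several assumptions that should be read as such: chi-square variables are generated as gamma variables with shape df/2 and scale 2, the number of trials and the seed are arbitrary, and the closed-form check applies only to the special case of 2 degrees of freedom in the numerator.

```python
import random


def simulate_f_right_tail(x, n1, n2, trials=100_000, seed=0):
    """Estimate P(F > x) by simulating ratios of scaled chi-square variables."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        chi_sq_1 = rng.gammavariate(n1 / 2, 2)  # chi-square with n1 df
        chi_sq_2 = rng.gammavariate(n2 / 2, 2)  # chi-square with n2 df
        f_value = (chi_sq_1 / n1) / (chi_sq_2 / n2)
        if f_value > x:
            hits += 1
    return hits / trials


def f_right_tail_numerator_df_2(x, n2):
    """Exact P(F > x) when the numerator has 2 degrees of freedom."""
    return (1 + 2 * x / n2) ** (-n2 / 2)


# Example 6.3.7: 2 and 13 degrees of freedom, area to the right of 3.81.
print(round(f_right_tail_numerator_df_2(3.81, 13), 3))  # 0.05
print(simulate_f_right_tail(3.81, 2, 13))               # close to 0.05
```

The simulated proportion fluctuates from seed to seed, so it should be compared with the table value only to about two decimal places.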

REVIEW 6.1

Collection of all possible events of an experiment is called

A. Sample space

B. Population space

C. Null set

D. Probability space

Section1

Case Study: The Problem of a Medical Representative

This case study was written by R.P. Suresh, Indian Institute of

Management Kozhikode, India. It is intended to be used as the

basis for class discussion rather than to illustrate either effective

or ineffective handling of a management situation. The case was

prepared from the generalised experiences.


Mr. Muralidharan Nair, a sales representative of WKPIL (Well known Pharmaceuticals India limited), is one of the promising representatives located in Kozhikode City in South India. He has won several awards for his excellent job of meeting the targets. Last year, he also won the National award of BEST REPRESENTATIVE of the year.

One of the most important jobs of the medical representatives is to meet the practicing doctors, introduce to them some of their new products, and discuss with them their advantages over the other existing products. As the awareness of the doctors about the products of WKPIL has a direct relation to the sales, the company fixes targets on the number of doctors to be visited over a period of time. WKPIL has a policy of finalizing the annual as well as quarterly targets in consultation with the concerned officials. The company believes that this is the best way of involving the entire organization in the decision making process, and it is observed that the officials become more accountable and are generally bound by the decision as they were part of the decision making process. To meet the current target, considering the number of visits that can be made per day, Mr. Nair needs to meet 100 more doctors in Kozhikode in the 27 days that remain in the quarter.

The regional manager of Western Region, Mr. Saurav Deshpande, has extended an invitation to Mr. Nair to address and interact with his fellow representatives, in the current term, highlighting the factors that helped in his achievements. The company feels that this will be a motivating factor for other representatives. The venue for this meeting is identified as Pune, which is about 800 km away from Kozhikode. Mr. Nair needs a day exclusively for this purpose. Mr. Nair knows that he requires at least 25 days to complete his target. As such, it looks possible to take the day off required to go to Pune. However, he is also aware that he cannot walk a tightrope like this, because there are some days on which he cannot travel to meet the doctors, for the following reasons, which are exhaustive.

In this region, some political or social organizations an-

nounce bandh or hartal, as a mark of protest against

some policy of the Government or to highlight a specific

problem facing the society. During these days, there is

a total restriction on movement of the public. And, there-

fore, during the days when a bandh or hartal is de-

clared, Mr. Nair will not be able to meet the doctors.

Also, during the current season (the monsoon), when it rains heavily, some parts of the city get flooded. As a result, some of the roads get blocked, and on these days too Mr. Nair will not be able to meet the doctors.

Since Mr. Nair is not willing to miss the target, he wants to make sure that he works for at least 25 days. At the same time, he is very keen to go to Pune to address his fellow workers in the Western region, as this will be a professional boost to his career and, in the process, he may also help his fellow workers to excel. In order to ensure that he gets enough working days, he wishes to find out how frequently these two events occur. After scanning the newspapers of the last two years, Mr. Nair observed that during the monsoon there is a one in 30 chance that, on any given day, the roads are blocked due to flooding in the city. He also observed from the records of the civic administration that movement in the city was restricted due to bandh, hartal, etc. on 14 days in the last 2 years, i.e., about 730 days. What conclusion did Mr. Nair arrive at? What methods did Mr. Nair use to arrive at this conclusion?

This case is taken from a detailed paper by the author with the permission of the author. It is intended to be used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situation.
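One possible way to set up the numbers for class discussion is sketched below. It is not the case author's solution. It assumes (none of this is stated in the case) that floods and bandhs occur independently of each other and from day to day, and that the Pune trip uses exactly one of the 27 remaining days, leaving 26 days of which at least 25 must be workable. The helper `prob_at_least` is a hypothetical name introduced for illustration.

```python
from math import comb

# Figures quoted in the case
p_flood = 1 / 30      # chance that roads are blocked by flooding on a given day
p_bandh = 14 / 730    # bandh/hartal days observed over about 730 days

# Assumption (not stated in the case): the two events are independent,
# so P(flood or bandh) = P(flood) + P(bandh) - P(flood) * P(bandh)
p_lost = p_flood + p_bandh - p_flood * p_bandh
p_work = 1 - p_lost

def prob_at_least(k, n, p):
    """P(at least k successes in n independent trials, each with success probability p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

days_available = 27 - 1   # one of the 27 remaining days is reserved for Pune
days_needed = 25

chance_target_met = prob_at_least(days_needed, days_available, p_work)
print(f"P(a given day is lost)         = {p_lost:.4f}")
print(f"P(at least 25 of 26 workable)  = {chance_target_met:.4f}")
```

Changing `days_available` to 27 shows how much the chance of meeting the target improves if the Pune trip is declined, which is the trade-off Mr. Nair is weighing.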


CHAPTER 7

Sampling and Sampling Distributions

In this chapter we will discuss

Population & Sample; Parameter & Statistic

Methods of Enumeration

• Census or Complete Enumeration

• Sampling Methods

Sampling and Non-Sampling Errors

Sampling Distribution

Estimation

Case Study:

Sampling the Population Favorite

Ascertaining Customer Satisfaction

Customer Satisfaction with DTH Services in India

Swarnamuki Public Bank Limited’s SME Loans

Section 1

Population & Sample; Parameter & Statistic

The process of inferring something about a large group of

elements by studying only a part of it is known as sampling.

The collection of all elements about which some inference is

to be made is called the population. For example, in an

effort to study talcum powder usage in the urban areas of a

state, the population could be a collection of all talcum

powder users in major cities and towns in the state.

What we are interested in is to measure some particular

characteristic of the selected population. For example, it

could be the average life of a fluorescent tube, the

percentage of talcum powder users in a state or the

percentage of defectives in an engineering manufacturing

industry. Such a numerical measure, which describes a

characteristic of the population, is known as a parameter of

the population.

Usually we are interested in some population parameter, and we infer about the parameter by studying only a part of the population, called the sample. Sampling, therefore, refers to the process of choosing a sample from the population so that some inference about the population can be made by studying the sample.

A numerical measure which describes a characteristic of a sample is called a statistic. To study the population characteristics, a manager can go for either complete enumeration (census) or a sampling study. However, limitations of time, money and energy may prevent the manager from enumerating the entire population. It is common practice to check a handful of rice at the grocery store before buying a bag of rice, or to taste a piece of a sweet at a sweet shop before ordering it for a party. This practice is based on the assumption that the sample will provide approximate population information, representing the population characteristic under examination. For example, consider an automatic steel casting machine that casts thousands of steel bars daily. To check the performance of the machine, a manager need not wait to check the entire day's output. Instead, he can check samples taken at random intervals, and if any defects are detected in the castings, the machine can be reset or readjusted to function accurately.

Video on Sampling methods


Section 2

Methods of Enumeration

There are two methods of enumeration: complete enumeration, or the census method, and selective enumeration, or the sample method. The first method studies the entire population, whereas the second studies a selected part of the population that is representative of the whole and is referred to as the sampling method.

Census or Complete Enumeration

In a census or complete enumeration, information relating to the characteristics of each and every unit of the population is collected. The unit may be an employee, a product or a department in an organization. The collection of all the units under study is called the ‘population’ or the ‘universe’. For example, when the study is intended to find out the working conditions of workers in the cement industry, the ‘universe’ of the study will consist of all the workers in this industry (spread over a geographical location). Scanning through all the applications received for recruitment is a good example of complete enumeration.


Keynote 7.2.1 : Advantages and Disadvantages of

census, Sample Enumeration & Characteristics of

a good sample

Sampling Methods

When the population/universe is large or difficult to

enumerate, information about its characteristics has to be

inferred from a subset of this population, called a sample.

The most difficult (but most important) aspect of selecting a

sample is to ensure the drawing of a representative

sample, i.e., ensuring that the sample chosen reflects the

population it is drawn from. We can use the sample to

make reasonable (probabilistic) inferences about the

population only when we can reasonably be sure that the

sample reflects the population.

A sample is a part of a larger group or set, that is usually

called a population. A sample is used to discover one or

more properties of the population. There are several

techniques that can be used to obtain a representative

sample. The technique used depends on the prior

knowledge of the properties of the population that will be

measured. There are two methods of selecting samples

from the population. They are:

Random or Probability Sampling

Non Probability Sampling

Refer to Figure 7.2.1 for samples.

Random or Probability Sampling Methods

In probability sampling, the decision whether a

particular element is included in the sample or not is

governed by chance alone. All probability sampling

methods ensure that each element in the population has

some non-zero probability of getting included in the

sample. This would mean defining a procedure for picking

up the sample, based on chance, and avoiding changes in

the sample except by way of a pre-defined process again.

The picking up of the sample is therefore totally insulated

against the judgment, convenience or whims of any person

involved with the study. That is why probability sampling

procedures tend to become rigorous and at times quite

time-consuming. Probability based selection of sample also

makes it free from individual biases and hence more

representative. Also, when probability sampling designs are

used, it is possible to quantify the magnitude of the likely

error in inference made and this is of great help in many

situations in building up confidence in the inference.

Figure 7.2.1: Samples

Some of the Random Sampling Methods are:

• Simple Random Sampling

• Systematic Sampling

• Stratified Sampling

• Cluster Sampling

• Multistage Sampling

Simple Random Sampling

Conceptually, simple random sampling is one of the

simplest sampling designs and can work well for relatively

small populations.

Suppose we have a population of N elements and that we

want to pick up a sample of size ‘n’ (< N). Obviously, there

are $\binom{N}{n} = \frac{N!}{n!\,(N-n)!}$ possible samples of size ‘n’. Simple random

sampling is a process which ensures that each of the

samples of size ‘n’ has an equal probability of being picked

up as the chosen sample. This also implies that under

simple random sampling, each element of the population

has an equal probability of getting included in the sample.

All other forms of probability sampling use this basic

concept of simple random sampling but applied to a part of

the population at a time and not to the whole population.

It is imperative to have a list of all the members of the

population (called Population Frame) before a simple

random sample can be picked up. For example, to draw a

sample of 10 students out of a class of 70, we can write a

name chit for each student and mix the 70 chits in a bowl

well. Then draw chits one by one, 10 times.

It is easy to see that if we replace the chits in the bowl after noting down the name of the element, we will have a simple random sample with replacement, and one without replacement if we do not.

As the population size increases, the chit method becomes impractical. Instead, we associate a serial number with each member of the population and then instruct a computer to select members from 1 through N using its pseudo-random number generator. This ensures that every number from 1 through N has an equal probability of getting selected, so the sample selected is a simple random sample. We can also use a table of random numbers to draw a simple random sample.
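The computer-based draw described above can be sketched in a few lines. This is only an illustration using Python's standard `random` module; the class of 70 students and sample of 10 come from the chit example, while the seed value and variable names are assumptions made for the sketch.

```python
import random

random.seed(7)  # fixed seed so the draw can be repeated; any seed would do

# Population frame: serial numbers 1..N stand in for the 70 students
N, n = 70, 10
frame = list(range(1, N + 1))

# Without replacement: a drawn "chit" is not returned to the bowl,
# so no student can appear twice
srs_without = random.sample(frame, n)

# With replacement: each chit is returned before the next draw,
# so the same student may be drawn more than once
srs_with = random.choices(frame, k=n)

print("without replacement:", sorted(srs_without))
print("with replacement:   ", sorted(srs_with))
```

`random.sample` gives every subset of size n the same chance of being chosen, which is the defining property of a simple random sample without replacement.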

In practice, however, simple random sampling is not popular, as we often do not have a population frame, and operationally it is more inconvenient and costly.

Video 7.2.1: Sampling

Video 7.2.2: Simple random sampling

Systematic Sampling

Suppose we wish to draw a sample of size n from a population of size N, where n < N. We order the population units based on some identification and divide them into n partitions, each containing k units, where k = N/n (rounded off to the nearest integer). A unit is then drawn randomly from the first partition of k units, and every k-th unit after it is drawn. Because k is rounded, the final sample may contain n − 1, n or n + 1 units, depending on the direction of rounding and the starting point.

For example, if we want a sample of size 6 from a population of size 100, then k would be 16.7 (rounded off to 17). We would, therefore, have to decide where to start from among the first 17 units in our frame. If this number happens to be 7, for example, then the sample would contain the members having serial numbers 7, 24, 41, 58, 75 and 92 in the frame. Note that the random process establishes only the first member of the sample; the rest are determined automatically by the value of k.
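The worked example (N = 100, n = 6, start at 7) can be reproduced with a short sketch. The function name `systematic_sample` and its parameters are hypothetical, chosen only for illustration.

```python
import random

def systematic_sample(N, n, start=None, rng=random):
    """Return the 1-based serial numbers chosen by systematic sampling."""
    # Sampling interval, rounded to the nearest integer.
    # Note: Python's round() sends exact halves to the nearest even integer.
    k = round(N / n)
    if start is None:
        start = rng.randint(1, k)    # random start among the first k units
    # Every k-th unit from the starting point, up to the end of the frame
    return list(range(start, N + 1, k))

print(systematic_sample(100, 6, start=7))   # [7, 24, 41, 58, 75, 92]
```

Calling `systematic_sample(100, 6, start=17)` returns only five serial numbers, since the sixth would be 102. This is why the achieved sample size can differ slightly from n when k has been rounded.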

Systematic sampling is much easier to implement than simple random sampling. It is popular in sampling from pre-numbered receipts, invoices, cheques, etc. However, if there is a pattern or periodicity in the population frame, such as a greater rush at banks on Mondays and Saturdays when studying the number of customers visiting a bank, it could result in selection bias.

Another such situation arises when the population frame is arranged in ascending or descending order of some attribute (say, in descending order of marks while studying the distribution of marks); the location of the first sample element may then affect the result of the study. Both simple random sampling and systematic sampling are generally less efficient than more sophisticated probability sampling methods.

Stratified Sampling

Stratified sampling is more complex than simple random sampling, but when applied properly, stratification can significantly increase the statistical efficiency of sampling.

Suppose we are interested in estimating the demand for non-aerated beverages in a residential colony. We know that the consumption of these beverages has some relationship with family income and that the families residing in this colony can be classified into three categories, namely high-income, middle-income and low-income families. In a sampling study we would like to make sure that our sample has some members from each of the three categories, perhaps in the same proportion as the number of families belonging to each category, in which case we would be using proportional stratified sampling.

Figure 7.2.2: Stratified sampling

Figure 7.2.1: Systematic sampling

The basis for using stratified sampling is the existence of strata such that each stratum is homogeneous within itself and markedly different from the other strata, and the strata are mutually exclusive and collectively exhaustive. The higher the homogeneity within each stratum, the higher the gain in statistical efficiency due to stratification. Samples are usually drawn in proportion to the strata sizes. Each stratum is treated as a population in its own right, and a sample is drawn from it using any of the methods described earlier.
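A minimal sketch of proportional stratified sampling follows. The colony of 1,000 households split 100/300/600 across income categories is invented purely for illustration, as are the function name `proportional_stratified_sample` and the household IDs. Within each stratum the sketch uses simple random sampling, which is one of the options the text allows.

```python
import random

def proportional_stratified_sample(strata, n, seed=None):
    """Draw about n units in total, allocating to each stratum in proportion to its size."""
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        n_h = round(n * len(units) / total)       # proportional allocation
        sample[name] = rng.sample(units, n_h)     # simple random sample within the stratum
    return sample

# Hypothetical colony: household IDs grouped by income category
strata = {
    "high":   [f"H{i}" for i in range(100)],
    "middle": [f"M{i}" for i in range(300)],
    "low":    [f"L{i}" for i in range(600)],
}
chosen = proportional_stratified_sample(strata, n=50, seed=1)
print({name: len(units) for name, units in chosen.items()})
# {'high': 5, 'middle': 15, 'low': 30}
```

When the proportional shares are not whole numbers, rounding can make the total differ from n by a unit or two, so a real design would need a rule for distributing the remainder.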

Cluster Sampling

In cluster sampling, the population is divided into well defined groups or clusters, in such a way that each cluster is representative of the entire population. In practice, clusters are identified based on some naturally occurring grouping such as villages, city blocks, sales territories, etc. After that, a few of these clusters are randomly selected and usually completely enumerated. In case the cluster sizes are large, one may resort to random sampling in the selected clusters. The selection of the clusters is done by using any one of the sampling methods discussed above. For example, when a pre-poll survey is conducted in an assembly segment, the entire voting population is divided into clusters. Some clusters are then selected as the sample, and every element of these clusters is studied to arrive at a final opinion regarding the entire population.

Cluster sampling is used primarily because it allows for

great economies in data collection costs since the travel

related costs etc. are smaller. Although it is statistically

less efficient than simple random sampling, in most cases

this deficiency may be offset by the high economic

efficiency that it offers.

For example, to get a certain precision level one might need a sample size of 100 under simple random sampling and a sample size of 175 under cluster sampling. However, if the cost of data collection is Rs. 20 under simple random sampling and only Rs. 5 under cluster sampling, it would be cost effective to use cluster sampling. Cluster sampling is mostly used in multi-stage sampling.

Video 7.2.2: Samples

Figure 7.2.3: Cluster sampling
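The cost comparison can be made explicit with a short calculation, assuming (the text does not say so directly) that the quoted costs of Rs. 20 and Rs. 5 are per sampled unit:

$$\text{Cost}_{\text{SRS}} = 100 \times 20 = \text{Rs. } 2000, \qquad \text{Cost}_{\text{cluster}} = 175 \times 5 = \text{Rs. } 875$$

Under this assumption, cluster sampling reaches the same precision at less than half the total cost, even though it needs 75% more observations.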

Multistage Sampling

When large national level surveys are undertaken, for better

representation and economy of costs, the samples are drawn

at different stages. For instance, a study on rural

unemployment may identify states as the first stage unit,

districts as the second stage unit, villages as the third stage

units, and households as the ultimate stage units. At each

stage we will take a sample using an appropriate random

sampling method for that stage. Most national level surveys

are carried out using such multistage sampling.

Non-Probability Sampling Methods

In non-probability sampling, the sample units are selected on a non-random basis, ignoring their probability of occurrence in the population (since we may not know it). We resort to such approaches when no sampling frame is available, when cost, time and ease of work are pressing constraints, and when high accuracy is not of prime importance.

Some of the non-random sampling methods are as follows:

Judgment Sampling

Convenience Sampling

Quota Sampling

Sequential Sampling

Judgment Sampling

In judgment sampling, the selection of the sample is based on the judgment of the manager who is studying a situation. This method is also known as “purposive sampling” or “deliberate sampling”. It should be carried out by an expert in the field, as his judgment will influence the final outcome of the study.

Convenience Sampling

This method is based on the convenience of the researcher. The researcher uses the sources readily available to him to come to a conclusion. For example, he may use a telephone directory to select the respondents for an opinion poll, or take the list of employees of an organization to study the employees.

Quota Sampling

In quota sampling, as in stratified sampling, we first partition the population into mutually exclusive sub-groups. Then a pre-specified proportion of the sample is drawn from each sub-group on a judgement basis. For example, while carrying out opinion interviews (on streets) on events like a budget announcement, tele-journalists work on a quota (say, by age group or by gender). The journalists may unintentionally tend to interview the more “cooperative” people. Such a sample may not be representative.

Sequential Sampling


In sequential sampling, the size of the sample is not fixed in advance; it is decided as the sampling process takes place, depending on the results of the earlier samples. A number of sample lots are drawn in sequence, one after another, from the population. This sampling method is used for statistical quality control. For example, a manager draws a lot from the inventory and tests it for acceptability. If it is clearly acceptable, no further samples are required, and if it is clearly unacceptable, the entire stock is rejected. When the results of the first sample fall close to the acceptance standard, the manager goes for another sample before deciding on the quality of the inventory.

Exercise for Discussion:

If you want to find the average height of all the students in

your MBA batch

a. What is the best way to draw a representative sample of

students from your batch?

b. What do you think should be a good sample size?


REVIEW 7.1


Question 1 of 10

A population is normally distributed with

mean = 0 and standard deviation = 1.

What is the approximate probability that

an observation from the population will

A. 0.7156

B. 0.8435

C. 0.9065

D. 0.9974

Section 3

Sampling and Non-Sampling Errors


A sample survey studies only a limited number of units of the total population; hence there is scope for inaccuracy or error in the collection, processing and analysis of the sample data. These errors can be broadly classified into sampling and non-sampling errors.

Sampling Errors

The purpose of taking a sample from a population is to estimate a population parameter through a sample statistic. Since chance dictates the selection of units in each sample, the estimate of the population parameter varies from sample to sample. This chance variation in the estimates over samples is referred to as sampling error. It is possible to estimate this error statistically even from a single sample, which is of great help in judging the worthiness of an estimate of the parameter.

Some of the causes for error in sampling are:

Error in selection of the sample

Bias in the reporting of data

Diversity of population

Substitution of sampling units for convenience

Faulty demarcation of sampling universe.

Non-Sampling Errors

Non-sampling errors occur at the time of observation, approximation and processing of data. This error is common to both sample surveys and census surveys. In fact, it tends to be larger in a census survey, simply because many more units are surveyed. Non-sampling errors can arise at any stage of the planning or execution of a complete enumeration or sample survey. They may be due to a faulty sampling plan, errors in the design of the survey, sample substitution at the field level, measurement error, lack of trained and qualified investigators, inaccuracy in the responses collected due to bias on the part of the respondent or the researcher, and finally errors in compilation or publication.


REVIEW 7.2


Question 1 of 5

In sampling surveys, the errors are broadly classified as

A. Standard Error of Mean and Population

B. Type I and Type II Error

C. Probability Errors and Non-Probability Errors

D. Sampling Errors and Non-Sampling Errors

Section 4

Sampling Distribution


At IBS, we have 14 sections of first year MBA students, each section containing 70 students. Thus, if we look at the first year students as the population, our population size is 980. Each section can be looked upon as a sample from this population. Let us consider the variable, marks obtained in a common QM examination (out of 100 marks). Clearly, each student will score a value between zero and hundred. The distribution of marks for different sections may be as in Figure 7.4.1.

The mean score ($\bar{X}$) for each section can be expected to be somewhere in the middle. If we look at all the $\bar{X}$'s, one thing that we can say intuitively is that they will have much less dispersion, possibly lying between 50 and 60. Now if we look at $\bar{X}$ over the sections, we can expect its mean to be the same as the mean for the population of all sections, but clearly expect its variance, and hence its standard deviation, to be considerably less.

This intuitive result is more formally established through the celebrated Central Limit Theorem in statistics. The theorem states that when samples of size n are taken from a large population with mean (µ) and standard deviation (σ), then, for sufficiently large n, the distribution of the sample mean ($\bar{X}$) would be approximately normal with mean (µ) and standard deviation $\sigma/\sqrt{n}$, irrespective of the shape of the population distribution. To restate, if $X_1, X_2, X_3, \ldots, X_n$ are independently and identically distributed with mean (µ) and standard deviation ($\sigma_x$), then, approximately,

$$\bar{X} \sim N\!\left(\mu,\ \frac{\sigma_x^2}{n}\right)$$

where the second argument is the variance of $\bar{X}$.

As the sample size n increases, the standard deviation of $\bar{X}$ decreases, which makes it more likely that the sample mean ($\bar{X}$) lies close to the population mean. For this reason, this standard deviation is referred to as the standard error of $\bar{X}$.

In the case of a finite population of size N, the standard error of $\bar{X}$ ($\sigma_{\bar{X}}$) is adjusted with a multiplier and given by

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} \sqrt{\frac{N-n}{N-1}}$$

Notice that when N is large and N >> n, the multiplier will be close to one.

Video 7.4.1: Central Limit Theorem

Figure 7.4.1: Distribution of marks

We can find out the sampling distribution of not only the mean, but of any other statistic estimated from the sample, such as the standard deviation (s). The sampling distributions of different statistics provide the basis for estimation and testing of hypotheses, to be discussed in the subsequent chapters.
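The behaviour of the sample mean can be checked with a small simulation. The sketch below is illustrative only: it invents a population of 980 marks drawn uniformly from 0 to 100 (deliberately not normal), draws random samples of 70 without replacement, and compares the spread of the sample means with the usual standard error $\sigma/\sqrt{n}$ multiplied by the finite population multiplier $\sqrt{(N-n)/(N-1)}$. Unlike real class sections, the samples here are drawn at random.

```python
import math
import random
import statistics

rng = random.Random(42)

# Hypothetical population of 980 marks, deliberately non-normal (uniform on 0..100)
N = 980
population = [rng.randint(0, 100) for _ in range(N)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)   # population standard deviation

n = 70  # sample size, mirroring one section of 70 students

# Standard error of the mean, with the finite population multiplier
se_infinite = sigma / math.sqrt(n)
fpc = math.sqrt((N - n) / (N - 1))
se_finite = se_infinite * fpc

# Empirical check: draw many samples without replacement and record their means
means = [statistics.fmean(rng.sample(population, n)) for _ in range(5000)]

print(f"population mean {mu:.2f}, mean of sample means {statistics.fmean(means):.2f}")
print(f"theoretical SE {se_finite:.3f}, observed SD of sample means {statistics.pstdev(means):.3f}")
```

A histogram of `means` would look roughly bell-shaped even though the population itself is flat, and rerunning with a larger `n` shrinks the standard error, as the text describes.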

Video 7.4.3: Sampling distribution sample problem

Keynote 7.4.1: Effect of Sample Size on Standard Error

Video 7.4.2: Standard error

REVIEW 7.3


Question 1 of 5

Which of the following sampling methods is most susceptible to subjectivity in selection?

A. Stratified sampling

B. Simple random sampling

C. Cluster sampling

D. Judgment sampling

Section 5

Estimation

In most statistical studies, the population parameters are

unknown and must be estimated. Therefore, developing

methods for estimating, as accurately as possible, the values

of the population parameters is an important part of statistical

analysis. The primary goal of a sampling activity is to make

an inference about something using the least amount of

information possible. Here, we must be quite certain of things

like the number of observations to be made, the number of

points to sample and the number of people to survey.

Point Estimates

We can make two types of estimates about a population:

point estimates and interval estimates. A point estimate is a

single number that is used to estimate an unknown

population parameter.

For example, an estimate that the average weight of a

classroom of students is 50 kg or that the number of students

to register online for a particular university course is 250 is a

point estimate. Often, a point estimate is insufficient, as it is

either right or wrong. If it is said that a particular estimate is

wrong, we cannot be certain how wrong that estimate is or

about the reliability of that estimate.

The sample mean ($\bar{x}$) is the best estimator of the population mean µ. It is unbiased, consistent and efficient, and as long as the sample is large, its sampling distribution can be approximated by a normal distribution.

This value of the sample mean is then an estimate of the

population mean. Similarly, point estimates of other statistics

can also be determined for population variance, standard

deviation and the population proportion.

Interval Estimates and Confidence Interval

If the actual result differs from the estimate by a small margin, the estimate can be accepted as a good one, whereas if it is off by a large margin, it would be rejected as a poor estimate. Therefore, a point estimate is useful only if it is accompanied by an estimate of the error that might be involved. Equivalently, we can state an interval estimate, whose lower and upper limits are obtained with the help of the standard error of the statistic used to estimate the population parameter. It indicates the inherent error in estimation in two ways: by the extent of its range and by the probability of the true population parameter lying within that range. Using the above example, an interval estimate would be something like this: the average weight of a class of students is expected to be between 40 kg and 55 kg with 95% confidence, i.e., we are 95% confident that the true average weight falls in this range. Because of this way of looking at an interval estimate, we also call it the confidence interval.

Estimators and Estimates

Any sample statistic that is used to estimate an unknown population parameter is called an estimator. A specific observed value of an estimator is called an estimate. The sample mean ($\bar{x}$) can be an estimator of the population mean µ, and the sample proportion can be used as an estimator of the population proportion. Similarly, we can also use the sample range as an estimator for the population range.

Suppose the sales manager of a dry cell manufacturing firm needs an estimate of the average life (in months) of the batteries. To proceed, let us take a sample of 500 batteries and survey the people who use those batteries about the battery life they have experienced. Let us say that this sample of 500 batteries has a mean battery life of 45 months. This gives the point estimate for the life of the batteries.

However, the sales manager is not satisfied with this and asks for the amount of uncertainty that accompanies this estimate, which in essence is a statement about the range within which the unknown population mean is likely to lie. Over several years he has observed the standard deviation of the life of batteries to be 18 months. Hence, this can be taken to be the standard deviation of the population. Thus, the standard error of $\bar{x}$ is given by:

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{18}{\sqrt{500}} \approx 0.805 \text{ months}$$

We can now tell the sales manager that our estimate of the life of the firm’s batteries is 45 months, and the standard error that accompanies it is 0.805 months. The 95% confidence interval can then be given as:

$$\bar{x} \pm z(95\%)\,\sigma_{\bar{x}}$$

where z (95%) is a value read from the standard normal table such that 95% of the observations of a standard normal variate lie within ±z (95%) of the mean (which is zero here). When read from the table, z (95%) = 1.96. Hence the confidence interval for the above example is 45 ± 1.96 × 0.805, i.e., (43.42, 46.58).
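The battery example can be verified with a few lines of code. This sketch uses `statistics.NormalDist` from the Python standard library in place of the printed standard normal table; the variable names are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

x_bar = 45.0       # sample mean battery life, in months
sigma = 18.0       # population standard deviation, in months
n = 500            # sample size
confidence = 0.95

standard_error = sigma / sqrt(n)

# Two-sided critical value: leaves (1 - confidence) / 2 in each tail
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

lower = x_bar - z * standard_error
upper = x_bar + z * standard_error
print(f"SE = {standard_error:.3f}, z = {z:.2f}, CI = ({lower:.2f}, {upper:.2f})")
# SE = 0.805, z = 1.96, CI = (43.42, 46.58)
```

Changing `confidence` to 0.99 widens the interval, which illustrates the trade-off between the level of confidence and the precision of the estimate.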

A detailed list of confidence interval formula under different

situations is available at the end of the chapter on testing of

hypothesis.


Video 7.5.1:Point Estimates


Section 6

Case Study: Sampling the Population Favourite


This case study was written by Siva V Gabbita, Professor, IBS, Hyderabad. It is intended to be used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situation. The case was prepared from generalised experiences.

Sampling the Population Favorite

“Have you tasted Hyderabadi Biryani?” That is the inevitable question asked of visitors to the twin cities of Hyderabad and Secunderabad.

Amar Singh (Amar) is a graduate from Delhi. Recently, he came to Hyderabad to do a diploma course in management. An avid foodie, Amar had heard of quite a few places that served the popular dish, but he was a little sceptical about the taste of Hyderabadi Biryani. He had recently been invited to a wedding hosted at a well-known 5-star hotel. Being a vegetarian, he had helped himself to small servings each of Hyderabadi Vegetable Biryani, Kashmiri Pulao and Vegetable Fried Rice, but did not find much variation amongst them.

However, his friends from Hyderabad suggested that he postpone his judgment until he had visited places that specialised in the dish. They gave him a list of hotels renowned for Biryani, namely Paradise Hotel, Bawarchi, Aadaab/Bawarchi Qahna, Garden Restaurant, Alpha Hotel, Hotel Madina, Hotel Niagara, Cafe Bahar and Shadab Restaurant, not to mention the restaurants at Grand Kakatiya and Taj Banjara.

Paradise on Earth

Paradise was one restaurant that Amar had heard spoken of

in the same breath as Hyderabadi Biryani even before he had

come to stay in Hyderabad. He decided to taste the dish at

Paradise that week. He also decided to taste the Biryani at a

few other hotels and compare the tastes to decide if the

Biryani at Paradise was significantly superior to that prepared

at other restaurants.

Paradise restaurant started as a small shop and currently occupies three floors of the same building. In spite of all its fame, the food at Paradise is fairly priced and big on quantity. It is this Quality/Price (= Value) ratio that provides paisa vasool and is one of the basic reasons for the enduring popularity of the Biryani at Paradise restaurant.

The best offerings at Paradise are arguably the Chicken Tikka

Kabab starter followed by a Hyderabadi Chicken Biryani. For

vegetarians, it is the Paneer Tikka starter followed by a Vege-

table Biryani, and the course is not completed without the

typical Hyderabadi dessert khubani ka meetha, which is made

up of Vanilla ice cream topped with apricot puree.

Do-it-Yourself

Over the next few months, Amar visited different places, including his friends’ homes. Once, the mother of one of his friends mentioned that Biryani prepared at home differs from what is prepared in restaurants because of the quantity in which the dishes are prepared. This statement made an impression on Amar. He and his like-minded friends wanted to try their hand at preparing Hyderabadi Vegetable Biryani. After browsing the Internet and speaking to a few chefs, they came to know of the essential ingredients and the recipe for preparing Hyderabadi Vegetable Biryani.

The Recipe

Potatoes, carrots, french beans and cauliflower are boiled

in salted water. Sliced onions are deep-fried in oil until

they turn reddish-brown and the onions along with beaten yo-

gurt, garam masala, ginger and garlic paste (optionally) are

allowed to marinate for 1 hour. Simultaneously, rice is

washed, soaked and cooked until half done. The marinated

vegetables are sautéed for 2–3 minutes in ghee heated in a

thick-bottomed vessel and brought to boil in water before the

rice is layered over the cooked vegetables. Biryani masala, mint,

coriander leaves, and more deep-fried onions, cashew nuts,

almonds and raisins are added and saffron milk is sprinkled

before covering the vessel with a moist cloth. Then the vessel

is covered with a lid, sealed with dough and cooked for 15–20

minutes at 180°C (alternatively cooked over a slow fire for

15–20 minutes).

Trial and Error

Over the next few months, cooking the perfect Biryani be-

came an obsession with Amar. In his inaugural attempt, he

used too little water while cooking the rice and ended up burn-

ing it at the bottom. Later on, he learnt to be careful while uni-

formly mixing the rice, so that the burnt taste would not

spread to the upper well-cooked layers.

Another revelation was that the spices and garam masala

made a big difference to the outcome. The extent to which the

vegetables were boiled, which in turn was determined by the

amount of water used, also brought a lot of variation in the

taste. He found that frying the boiled vegetables before add-

ing them to the cooked rice sometimes improved the taste.

Apart from these, he also found that there are many other vari-

ables, which changed the taste subtly. For instance, the se-

quence in which the sliced onions were fried, i.e., whether

fried separately or along with the vegetables, adding salt sepa-

rately to the onions and to the vegetables, boiling the rice

along with the vegetables, the time taken to cook the rice, etc.

In due course, Amar had become adept in Biryani prepara-

tion. However, no two Biryani preparations tasted exactly the

same. Nevertheless, there was an essential taste, which was

unique to Amar’s Biryani. Though his Biryani tasted similar to

the ones prepared by anyone else, it was not exactly identi-

cal. There was ‘something’ in Amar’s Biryani that enabled his

friends to identify whether it had been prepared by Amar or

not just by helping themselves to a small portion.

Then Amar realised that this is also true of the Biryani pre-

pared at restaurants. Though the Biryani at Paradise might

taste subtly different from day to day, there was ‘something’

that made people identify it as Paradise Biryani. Likewise, he

realised that there was also variation across the Biryani pre-

pared by different chefs. There was a distinct quality to the

Biryani prepared at each of the places, making it reasonably

simple to identify its source as well as compare and contrast

between Biryani prepared at different restaurants. However,

the average taste at Paradise restaurant varied from the aver-

age taste at Bawarchi restaurant, which varied from the aver-

age taste prepared at Café Bahar and so on.
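
The distinction Amar draws here, between day-to-day variation within one restaurant and variation in the average taste across restaurants, is the comparison that an F-ratio puts into numbers. The sketch below is only an illustration: the taste scores and the function name `f_ratio` are hypothetical and do not come from the case.

```python
from statistics import mean

def f_ratio(groups):
    """One-way ANOVA F-ratio: between-group mean square / within-group mean square."""
    k = len(groups)                                   # number of restaurants
    n_total = sum(len(g) for g in groups)             # total number of tastings
    grand_mean = mean(x for g in groups for x in g)
    # Variation of each restaurant's average around the overall average
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Day-to-day variation of tastings around their own restaurant's average
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical taste scores (out of 10) from three visits to each restaurant
taste_scores = {
    "Paradise": [8, 9, 7],
    "Bawarchi": [6, 5, 7],
    "Café Bahar": [4, 5, 3],
}
f_value = f_ratio(list(taste_scores.values()))
print(f"F-ratio: {f_value:.2f}")
```

A large ratio suggests that the restaurants differ by more than their own day-to-day variation; judging whether it is significant would still require comparison with the F distribution.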

Reflecting back on his experience at the wedding dinner at

the 5-star hotel, Amar once again wondered why then there

had not been any significant difference between the Hydera-

badi Biryani, Kashmiri Pulao and Vegetable Fried Rice. He

concluded that they were all perhaps made once and served

separately with different garnishing and separate nametags.

Questions for Discussion

1. Can we make a judgment about an entire batch by

evaluating a single portion drawn from that batch? Sub-

stantiate with suitable reasons.

2. If there is variation within the batches produced each

time by a producer, can we compare and contrast (to

find a significant difference) between all of them just by

evaluating a single portion from one producer and com-

paring it with a single portion from the other producer?

3. Is it possible to make an error in judgment? When,

therefore, is this technique justified and when is it not

justified?

4. Was Amar correct in his analysis of variance (based on

samples drawn from the Hyderabadi Biryani, Kashmiri

Pulao and Vegetable Fried Rice) that all the three sam-

ples must have come from the same population since

there seemed to be no significant difference between

them? Is it not possible that this was coincidence?

SECTION 7

Case Study: Ascertaining Customer Satisfaction

This case study was written by Dr. Sunil Bharadwaj, Professor (Department of Decision Sciences), IBS, Hyderabad. It is in-

tended to be used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a manage-

ment situation. The case was written from generalised experiences.

Sakuma India (Sakuma), an FMCG company, is planning to achieve a

significant lead in the country’s digital still camera market with

the strongest ever product line-up in the category. With a 42%

share of the still camera market, Sakuma is well ahead of

the competition and currently holds the dominant

market share. The market share figure is part of Sakuma’s

estimate based on expected total sales of Cyber-shot

digital still cameras in FY2009, compared to the total market size

for the category.

According to a press release, Sakuma has already launched

its Cyber-shot collection with 11 new additions to its already

existing camera series. Sporting colourful fresh looks, the

cameras have slimmer dimensions and a futuristic design that

is easy to flaunt or carry around. The Cyber-shot lineup is

equipped with high 10–12 megapixel resolution, a newly

developed Exmor CMOS sensor, intelligent features and powerful

imaging innovations to deliver enhanced imaging performance

and convenient photo sharing solutions for the Indian

consumer.

Sakuma has already sold around 50,000 Cyber-shot cameras

and is now seeking to assess the satisfaction level of the us-

ers. The company is not planning to contact each and every

user of Cyber-shot camera for the obvious reasons of high

cost, time and effort involved in the process of contacting all

the customers. The marketing head of the company, Joe Phillip

(Phillip), has been assigned 2 months to complete the job. Phillip

has called for a meeting with his team and the team is contemplating

various options for doing the job. All the team

members agreed that a questionnaire-based survey method

of data collection would be a good option to assess the satisfaction

level of the users with the product. The questionnaires

can be administered through e-mail, postal survey

or telephonic interview. However, the big question before

the team is: which customers, and how many, should they

contact?

Dev Anand, an executive, who joined Sakuma recently, sug-

gested an e-mail survey by e-mailing the questionnaire to

those customers who provided them with e-mail IDs. With suf-

ficient data available in the customer information form at retail

outlets, collection of e-mail IDs will not be a problem. A sample

of an adapted version of the customer information form is given

in Exhibit I.

However, other members did not feel enthusiastic about a sur-

vey through e-mails. They had experimented with this idea

earlier and felt that most of the time the e-mail ID is either not

available or not furnished by many of the customers. Also,

customers do not respond well to e-mail surveys. The typical

response rate of an e-mail survey is as low as 2% and the

usable responses are fewer still. In an e-mail survey, most of

the time, the e-mail ends up in the spam folder and the customer

neglects it. Also, there may not be any motivation for

the customer to open the e-mail and go through the survey

questionnaire sincerely.

The other options left were using telephonic interviews and

mail surveys. While the response rate and quality are usually

very good in telephonic surveys, the cost of survey is high. On

the other hand, in mail surveys, the response rate is better

than that in e-mail survey and a higher response rate can be

guaranteed through lucky-draw reward schemes. Moreover,

unlike e-mail IDs, the postal addresses of most customers

are available. However, one problem with the mail survey

is the time taken by the customer to respond, if the customer

responds at all. The customer will not be motivated to fill the form

and post it back on the same day. The response is further delayed

by the usual postal procedures. After a long

discussion on the methods of survey, the team finally agreed

to go for telephonic interviews, as the time available to com-

plete the study was limited.

The other part of the decision was to determine the number of

customers to be contacted and the technique to be adopted to

identify them. From their past experiences, all the team members

knew that if 10%–20% of the customers are contacted, a fair

idea of the situation can be obtained. As such, the team

needs to find ways to select the 10%–20% of the customers.

However, the team wound up the discussion at this juncture

and agreed to meet after 2 days with possible options for the

survey. In the next meeting, the executives gave suggestions

on their approach to the problem.

Raman believed that the product is doing well and that this is evident

from the appreciation letters and entries made by the customers

in the company’s blog. He suggested that the company could

contact only those bloggers and get a very favourable response.

Collecting data from them would be very easy, as

they have already registered on the blog.

Exhibit I: Customer Information Form

Customer Name

Mobile/Telephone No.

Address

E-mail ID

Age

Profession

Product Bought

Details of the Product

Other Information

Source: Prepared by author

On the other hand, Ravi Saxena (Saxena) suggested that the

company should treat all its customers on the same footing

and everyone should have an equal chance of appearing in

the survey. He questioned the idea of relying on the blog. Saxena

is a strong supporter of random sampling. He recalled his

earlier experience, where he was supposed to take opinions

of doctors (from established hospitals) on electronic equipment

manufactured by the company. At that time, he had generated

a list of all hospitals and the physicians working in each of

the hospitals, written their names on pieces of paper, put them

in a box, mixed them well and drawn certain names. In the

same way, he wanted the survey to be conducted through the

‘box’ approach. However, Ajay Jadeja strongly objected to this

idea. He argued that with a list of 50,000 customers, the ‘box’

approach is not practical.
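
The ‘box’ approach is simple random sampling, and the objection to it concerns the manual effort rather than the principle. A minimal sketch of how the same draw could be made by computer, assuming (hypothetically) that the 50,000 customers carry ID numbers 1 to 50,000 and that a 10% sample is wanted:

```python
import random

def simple_random_sample(customer_ids, sample_size, seed=None):
    """Draw sample_size customers without replacement; every customer is equally likely."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable for auditing
    return rng.sample(customer_ids, sample_size)

# Hypothetical customer IDs standing in for the company's customer list
customer_ids = list(range(1, 50_001))
srs = simple_random_sample(customer_ids, 5_000, seed=42)
print(len(srs), "customers selected")
```

Because the draw is made without replacement, no customer can be selected twice, just as a slip of paper cannot be drawn twice from the box.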

Rohan Pillai, while supporting the need to generate the list on

a completely random basis, suggested choosing every fifth or

tenth (or nth) customer from the list. He felt that through this

method one can easily and quickly generate the details of de-

sired sample and the sample would still maintain a fair degree

of randomness.
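
Choosing every nth customer from the list is usually called systematic sampling. The sketch below assumes a hypothetical list of 50,000 customer IDs and an interval of 10; the random starting point is an addition that gives every customer the same chance of selection, and is not something the case itself specifies.

```python
import random

def systematic_sample(items, interval, seed=None):
    """Pick every interval-th item, starting from a random position in the first interval."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    rng = random.Random(seed)
    start = rng.randrange(interval)  # random start between 0 and interval - 1
    return items[start::interval]

# Hypothetical customer IDs standing in for the company's customer list
customer_list = list(range(1, 50_001))
systematic = systematic_sample(customer_list, 10, seed=7)
print(len(systematic), "customers selected")
```

If the list happens to be ordered in a pattern that repeats with the same interval, the sample can be biased, which is why the method retains only "a fair degree" of randomness.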

However, Ram Tarneja (Tarneja), a senior executive, who was

patiently listening to the above discussion, suggested another

way of approaching the problem. He agreed that randomness

is fair but raised a query, “Instead of generating the sample

from the consolidated list of customers, why don’t we make

groups of customers according to zones, states and metropolitan

cities in India?” He felt that by this method they

can easily assess in which states or zones customers are

satisfied or dissatisfied and act accordingly. For example,

it may turn out that Delhi customers need more

attention than Mumbai customers or vice versa.
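
Grouping customers first and then sampling within each group is stratified sampling. The following is a rough sketch under stated assumptions: the four city names, the way customers are assigned to cities and the 10% sampling fraction are all hypothetical, chosen only to make the example run.

```python
import random
from collections import defaultdict

def stratified_sample(records, key, fraction, seed=None):
    """Group records by key(record), then draw the same fraction at random from each group."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in records:
        strata[key(record)].append(record)
    sample = {}
    for name, members in strata.items():
        size = max(1, round(len(members) * fraction))  # at least one per stratum
        sample[name] = rng.sample(members, size)
    return sample

# Hypothetical data: (customer_id, city) pairs spread evenly over four cities
cities = ["Delhi", "Mumbai", "Chennai", "Kolkata"]
customers = [(cid, cities[cid % 4]) for cid in range(1, 50_001)]
by_city = stratified_sample(customers, key=lambda rec: rec[1], fraction=0.10, seed=1)
for city, chosen in by_city.items():
    print(city, len(chosen))
```

Any other attribute recorded for the customers could serve as the grouping key in the same way.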

At this point, Amrita Basu, who was keenly following the discussions,

chipped in. While appreciating Tarneja’s suggestion,

she raised another query, “Why not group the customers

on the basis of camera models and then go for sampling for

each of the camera models?” She felt that this way, they can

be sure of the models. The discussion went on for another 2

hours, but a conclusion could not be reached.

SECTION 8

Case Study: Customer Satisfaction with DTH Services in India

This case study was written by Sravanthi Vemulawada under the direction of R Muthukumar, IBSCDC. It is intended to be used as

the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situation. The case

was written from generalised experiences.

Since Cable TV entered India in 1992, entertainment on tele-

vision has grown rapidly. Out of the 71 million TV households

in 1999, 32 million had access to Cable TV. However, along

with the soaring viewership, complaints on quality also in-

creased. Digitalization of Cable TV took a new form when

Direct-To-Home (DTH) was launched in India in 2003.

The DTH service is an encrypted transmission. It is a digital

satellite service that provides television services direct to subscribers

anywhere within the country. Unlike the regular cable

connection, the Set-Top-Box (STB) decodes the encrypted

transmission. Since it makes use of wireless technology,

programs are sent to the subscriber’s television direct

from the satellite. This eliminates the need for cables and ca-

ble infrastructure. DTH service is particularly effective in re-

mote areas, where cables and even normal television serv-

ices are poor or nonexistent. These services provide the fin-

est picture and sound quality. Like the quality of any modern

movie theatre, DTH also provides the best quality surround

sound.

Although DTH services were proposed in India way back in

1996, they were not permitted until 2003. The government had

withheld approval for DTH due to concerns over national security

and cultural invasion. To prevent the implementation of

DTH service, even the cable operators had heavily lobbied

the government. Finally, the Government laid out certain regulations

for DTH providers to operate in India. To name a few,

no foreign player can invest more than 49% in an Indian DTH

venture, no broadcaster or cable network can hold more

than a 20% share in a DTH venture and the DTH provider has to

be an Indian company. Apart from that, players are required

to pay an initial amount of INR 100 million while entering the

business along with a bank guarantee of INR 400 million for

a license period of 10 years. They are also required to pay

the government 10% of their gross revenues, 12.36% of sub-

scription fees, entertainment tax ranging between 10%–20%

(varies from city to city) and VAT of 12.5%.

The Indian Government issued the first private DTH license

to Dish TV in 2003 and Dish TV started its operations in

2004. Dish TV installed a pizza-sized dish antenna and STB

for INR 3,190 at the subscriber’s end and charged a monthly subscription

fee depending upon the package the subscriber opted for.

To suit the needs and pockets of different customers, Dish

TV offered four different packages made available through

25,000 dealers across the country. In 2 years, Dish TV garnered

a subscriber base of 1.5 million.

Dish TV has around 500 channels. Now it is planning to add

one more channel to its basic services because it wants to

increase its sales. This channel is expected to be one

amongst few entertainment channels. One of your friends is

appointed as a consultant with Dish TV. Since Mumbai is the

hub of the entertainment industry in India, your friend is of

the opinion that a survey of the channel’s Mumbai viewers

would be sufficient to know if a new channel has to be in-

cluded or not. Is his approach proper?

In 2006, Dish TV faced a new contender as Tata Sky entered

the DTH market. Tata Sky invested INR 25 billion and

launched its service simultaneously in 300 cities across India,

concentrating mainly on Tier 1 cities. Tata Sky offered an initial

package of 55 channels. Its packages were priced similar to

those of Dish TV. By 2007, Tata Sky had 1.5 million subscribers.

Tata Sky has introduced a new service called Tata Sky+. It

wanted to know its existing customers’ feedback about this

service. It divided its customer base into five age groups i.e.,

10–19, 20–29, 30–39, 40–49, 50–59 and surveyed these

groups accordingly. Is this approach proper?

Apart from Dish TV and Tata Sky, customers got another option

in December 2007. South India’s first DTH provider, Sun

Direct TV⁶, launched its services at a price of INR 1,999 and

monthly subscriptions ranging between INR 75–250. Apart

from the usual offerings, it even provided add-on packages

and customer care service to its subscribers. In just 200 days

of its inception, Sun Direct was successful in reaching 1 mil-

lion subscribers. In South India, Sun Direct holds 65% market

share.

The number of subscribers of Sun Direct service is 3.1 mil-

lion. The company wanted some inputs about its service from

its existing subscribers. They proposed to select a sample of

100,000 subscribers. What should be the approach?

On August 19th 2008, Reliance Group (a Fortune 500 company

worth INR 1,564 billion) entered the DTH sector with Big

TV⁸, investing INR 20.5 billion. Arun Kapoor, CEO, Big TV,

says that they are planning to capture 40% market share

within a year. At the outset, Big TV plans to spend INR 2 billion

on marketing and promotions. The company is using the

internet, hoardings, radio and print media to make people

aware. A live demo of the product was also made available at

the demo closets of different TV outlets. Reliance plans to offer

200 channels packaged and priced differently between

INR 1,490 and INR 4,999. In future, it also plans to add 130

channels.

Big TV has a million customers. Since the management re-

ceived complaints of its poor customer service, they worked

on it and resolved the problem. Later on, the management

wanted to know whether proper customer service was being

provided. So, they wanted to survey 10,000 customers of Big

TV. How should the sample be selected?

Indian Telecom conglomerate Bharti Airtel launched its DTH

service called Airtel Digital on October 9th, 2008 in 5,000 cities

across the country. Currently it has about 175 channels.

The company holds nearly 24.2% market share of wireless

subscribers and has 300,000¹² subscribers. The company

plans to leverage this subscriber base. By 2009, DTH customers

are expected to reach around 10 million–12 million.

Airtel service is planning to add one more channel to its basic

service. There are five channels to choose from, and the com-

pany would like some input from its subscribers. There are

about 1 million subscribers, and the company knows that

35% of these are college students, 45% are white-collar work-

ers, 15% are blue-collar workers and 5% are others. What

type of sampling should be used here and why?
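
One candidate worth discussing for this question is stratified sampling with proportional allocation, in which each occupational group appears in the sample in the same proportion as in the subscriber base. The sketch below only illustrates that arithmetic; the case does not state a sample size, so the figure of 1,000 is a hypothetical assumption.

```python
def proportional_allocation(percent_shares, sample_size):
    """Split sample_size across groups in proportion to their percentage shares."""
    if sum(percent_shares.values()) != 100:
        raise ValueError("percentage shares must add up to 100")
    # Integer arithmetic avoids floating-point rounding surprises; for sample
    # sizes that do not divide evenly, leftovers would need to be assigned by hand.
    return {group: sample_size * pct // 100 for group, pct in percent_shares.items()}

# Shares of the subscriber base as given in the case
shares = {
    "college students": 35,
    "white-collar workers": 45,
    "blue-collar workers": 15,
    "others": 5,
}
allocation = proportional_allocation(shares, sample_size=1_000)  # hypothetical sample size
for group, count in allocation.items():
    print(f"{group}: {count}")
```

Within each group, the allotted number of subscribers would still have to be drawn at random.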

Footnotes

1. Dish TV is a DTH entertainment service, which brings 500 channels and services straight from the satellite to the home. It provides uninterrupted viewing without any transmission cuts along with crystal-clear digital quality picture and stereophonic sound.

2. Chatterjee Purvita, “DTH makes merry”, www.thehindubusinessline.com/catalyst/2007/01/18/stores/2007011800120300.htm, January 18, 2007.

3. Tata Sky is a DTH entertainment service, which has redefined the television viewing experience for thousands of families across India. They offer over 170 television channels in DVD quality picture and CD quality sound along with a host of new-age interactive services.

4. “Tata and Sky finally launches Tata Sky DTH Service in India”, www.sifybroadband.techwhack.com/532-tata-sky-dth-service, August 12, 2006.

5. RajGopal, “Can DTH compete with cable?”, www.hindu.com/2005/12/24/stories/2005122404921100.htm, December 24, 2005.

6. Sun Direct is a DTH entertainment service wherein the viewers can watch all their favourite programmes in true DVD quality; it treats the viewers’ ears to a true theatre experience by providing awesome CD quality sound.

7. Iyer Byravee, “Sun Direct: Go national, think regional”, www.business-standard.com/india/news/go-national-think-regional/21/57/344653/, December 30, 2008.

8. Reliance’s DTH entertainment service Big TV is powered by MPEG-4 technology, which is being used for the first time in India. It has fantastic features like pure digital viewing experience, more channel choice, many exclusive movie channels, easy programming guide, interactive services, parental control and 24x7 customer service.

9. Sinha Ashish, “ADAG to launch DTH service on Tuesday”, www.business-standard.com/india/storypage.php?autono=331705, August 18, 2008.

10. Bharti Airtel launched its DTH Satellite TV called Airtel Digital TV which is available to customers through 21,000 retail points including Airtel Relationship Centres in 62 cities. It uses the latest MPEG-4 standard with DVB S2 technology. This latest technology enables delivery of more complex interactive content and is High Definition ready.

11. “Airtel DTH offers 175 channels”, http://discuss.itacumens.com/index.php?topic=31581, October 7, 2008.

12. “DTH Networks India Forums”, http://www.saveondish.com/forum/archive/index.php/thread-12244.html, April 18, 2009.

13. Iyer Byravee, “Triple, trick or treat?”, http://www.business-standard.com/india/storypage.php?autono=339736, November 11, 2008.

SECTION 9

Case Study: Swarnamukhi Public Bank Limited’s SME Loans

This case study was written by Sravanthi Vemulavada, IBSCDC. It is intended to be used as the basis for class discussion rather

than to illustrate either effective or ineffective handling of a management situation. The case was written from generalised experi-

ences.

Small and Medium Enterprises (SMEs) are enterprises

wherein the number of employees and the turnover of the

company are below certain defined limits. SME is a very commonly

used term in the European Union, in the United Nations, by

the World Bank and the World Trade Organization.

However, the size of an SME varies from nation to nation. In

the US, a company with fewer than 100 employees is termed

a Small Enterprise (SE) and a company with fewer than 500

employees is termed a Medium Enterprise (ME). In the

European Union, a company with fewer than 50 employees is

termed an SE, while a company with fewer than 250 employees

is called an ME. In Germany, a company is called an

SME if it has 250 employees, while in Belgium, an SME consists

of 100 employees. In South Africa, the term Small Medium

Micro Enterprise (SMME) is used, whereas elsewhere in Africa, the

nomenclature is Micro, Small and Medium Enterprise

(MSME).

Most of the economies in the world are dominated by smaller

enterprises. They comprise approximately 99% of all the firms

and they even account for about 40%–50% of the industrial

production. These smaller firms employ around 65 million peo-

ple.

SMEs have a major advantage of employing people at a low

capital cost. As per statistics, the sector is one of the biggest

employment providers, employing around 31 million people

through 12.8 million micro and small enterprises in India.

The labour intensity in the SME sector is estimated to be

around four times that in large enterprises.

Indian SMEs

In India, the SMEs are known by the term Micro and Small En-

terprise (MSE). This sector plays a key role in the overall in-

dustrial economy. MSEs account for about 39% of the manu-

facturing output and around 33% of the total exports of the

country in terms of value. These MSEs also produce over

8,000 value added products.² In addition to the above, MSEs

have consistently registered a higher growth rate than

the overall industrial sector.

More recently, in India the MSE sector has been enlarged to

include a medium category. Thus, the Micro, Small and

Medium Enterprises (MSMEs) are classified into two classes³:

1. Manufacturing Enterprises based on investment in plant

and machinery (Micro up to INR 25 lakh, Small between

INR 25 lakh and INR 5 crore and Medium between INR 5

crore and INR 10 crore).

2. Service Enterprises based on investment in equipment

(Micro up to INR 10 lakh, Small between INR 10 lakh and

INR 2 crore and Medium between INR 2 crore and INR 5

crore).

Globally, SMEs have been a source of innovation and SMEs

that integrated innovation are known to have garnered significant

benefits. However, in India, most of the MSEs still believe

in importing technology rather than developing it in-house

or in association with some of the national Research

and Development (R&D) centres – this despite the fact that

India has the third largest pool of technologically trained man-

power. In short, Indian MSEs have mostly neglected their

R&D, and even their new product development and technological

upgradation.

Even though MSEs constitute more than 80% of the total num-

ber of industrial enterprises and form the backbone for indus-

trial development in India, they suffer from some serious prob-

lems such as sub-optimal level of operation, technological out-

datedness and even lack of capital. In recent years, Indian

MSEs have started facing tough competition, particularly from

China. Their performance is also affected by the uncertain

market conditions due to the ongoing recession. Owing to the

same, the banks are sceptical about the repayment of loans

by the MSEs. Swarnamukhi Public Bank Limited is one such

bank, which is apprehensive about the repayment of the loans

by the MSEs in India.

Swarnamukhi Public Bank Limited

Bangalore-based Swarnamukhi Public Bank Limited, a me-

dium sized bank, has its presence across India. The top man-

agement of the Swarnamukhi Public Bank Limited is con-

cerned that the default rate may go up among MSEs as a con-

sequence of the recent economic recession. They wanted to

understand the chances of default among MSE loan ac-

counts. In particular, Vasanth Desai, the managing director of

the Bank wanted to know the region wise chances for 10% de-

fault, 15% default and 20% default. He knew that during the

recession in the ’90s, about 9% of the SMEs turned out to be

defaulters on an all-India basis. To avoid a repetition of such a

situation and to take the necessary initiatives, he

wanted a branch-wise report from each region.

To respond to the queries of the MD, Albert Pinto (Pinto), re-

gional manager, Nagpur region, called for a meeting of branch

managers of all those branches, which are specially focusing

on MSEs. There were six such branches, mostly located at industrial

centres/estates. The collective number of loans advanced

by these branches to MSEs prior to September 2008

(i.e., prior to the emergence of recession in India) was 752, in-

cluding the 150 loans that they recently approved. In the meet-

ing, most of the branch managers expressed their concern

about MSEs’ ability to sustain the economic slowdown.

After a long discussion on the performance of the old as well

as the recently established MSEs, they could assess that

most of the MSEs are hardly concentrating on developing

new products and they are importing either the products or

the concept from the West. Finally, the branch managers concluded

that 8% of the MSEs that received loans would not

be able to make payments on time.

Given the scenario, what is the probability that more than

10% of the 150 loan takers would not make payments on

time? While discussions were informative and rich in experience

sharing, Pinto wondered how to respond to the MD’s

query at each branch level.
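
The probability asked for can be sketched numerically. The calculation below assumes, as a simplification that the case does not state, that the 150 recent loans default independently of one another, each with the 8% probability the branch managers arrived at. "More than 10% of 150" then means more than 15 defaulting loans. Both an exact binomial tail and the normal approximation based on the sampling distribution of a proportion are shown; the function names are illustrative only.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_tail(n, p, k):
    """Exact P(X > k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1, n + 1))

def normal_tail_for_proportion(n, p, threshold):
    """Approximate P(sample proportion > threshold) using the normal distribution."""
    standard_error = sqrt(p * (1 - p) / n)
    z = (threshold - p) / standard_error
    return 1 - NormalDist().cdf(z)

n_loans, p_default = 150, 0.08
exact = binomial_tail(n_loans, p_default, 15)               # more than 15 of 150 loans
approx = normal_tail_for_proportion(n_loans, p_default, 0.10)
print(f"Exact binomial probability: {exact:.4f}")
print(f"Normal approximation:       {approx:.4f}")
```

The two figures differ somewhat because the approximation ignores the discreteness and skew of the binomial distribution; repeating the calculation with each branch's own number of loans would be one way to approach the branch-level query.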

CHAPTER 8

Testing of Hypothesis

In this chapter we will discuss

The Basic Notion

The Formal Process

Steps in Hypothesis Testing

Tests for different situations

Case Studies

Smoking: A Costly Affair

Care Hygiene

Conversys Inc. (A & B)

Strategic Break

Shopper’s Stop

Hindustan Foods

A Study of Soap segment

Melting Delicacies (A)

Section 1

Testing of Hypothesis

The Basic Notion

The notion of “testing of hypothesis” comes naturally to us.

Consider the following common situations:

1. Whenever we buy fruits/vegetables/sweets etc. we decide

whether or not to buy based on our assessment, often of

a single, small portion of the whole.

2. When we buy clothes we evaluate the quality of the cloth

by checking certain characteristics of the cloth and/or

tailoring/styling. Our purchasing decision is thus

influenced.

3. When we meet a stranger we decide whether we like or

dislike them based on an assessment of whether their

personality matches or does not match our expectations.

4. We sometimes decide whether or not to watch a movie

based on the reviews and promos of the film. If the promo

is good we infer that the movie must be good too.

5. Investigators at a crime scene proceed first by identifying

a suspect and then try to collect evidence to establish

the guilt of the suspect.

6. A judge decides about the innocence or guilt of a

defendant based on the overall balance of the evidence

produced by the lawyers on both sides.

In all these examples we start with an assumption/

expectation and then take a sample of evidence; we check

whether the sample evidence is within an acceptable

region or not, and accordingly take the decision. Against this

framework, the above examples are summed up in

Table 8.1.1:

Table 8.1.1

| S.No. | Example | Initial guess/assumption | Evidences (sample) | Processing to a summary measure | Acceptance threshold | Conclusion |
|---|---|---|---|---|---|---|
| 1 | Vegetable buying | Good quality | Sample a few items | Taking a position on quality | Evidence within acceptance threshold | Buy |
| 2 | Cloth buying | Good quality | Sample a few items | Taking a position on quality | Evidence within acceptance threshold | Buy |
| 3 | Meeting a stranger | Like minded | Various behavioral aspects of the stranger observed | Taking a view on the stranger | They match with a threshold level acceptable to me | Befriend him/her |
| 4 | Watching a movie | Good movie | Watch reviews and promos | Taking a view on the movie | Reviews and promos within a threshold level acceptable to me | Go for movie |
| 5 | Crime investigation | Identified suspect is guilty | Collect evidences towards this | Taking a view on the suspect | Incriminating evidence above a threshold level | Suspect is guilty |
| 6 | Jury trial | Defendant is innocent | Evidences (for and against) presented to the jury | Taking a view on the defendant | Incriminating evidence above a threshold level | Defendant is guilty |

Thus, based on the available evidence, we have a way of

reaching a conclusion (a very likely conclusion), and yet

we cannot vouch for it to be the “absolute truth”. Our

conclusions are always most likely, in other words

probabilistic. This means that there is always a probability

that our conclusions are wrong. The wrong decision could

occur in two ways: (a) we may reject a hypothesis when it

is actually true, and (b) we may accept a hypothesis when

it is actually false. For instance, we might have drawn an

unrepresentative sample and hence our conclusion may go

wrong. In the jury trial, while the available evidence may

incriminate the defendant as guilty, we are aware of cases

where sometimes, after years of punishment to the convict,

evidence has emerged establishing the convict to be

innocent beyond doubt.

Visit: www.guardian.co.uk > News > World news > Capital punishment, or http://www3.law.columbia.edu/hrlr/ltc

While approaching the problems statistically, essentially

around the premise laid down above, we use some formal

terminologies:

a. The initial guess/assumption is clearly about some characteristic of the population. In other words, we are concerned about a population parameter.

b. The initial guess/assumption about the population parameter is referred to as the Null Hypothesis (H0). A hypothesis negating this position is called the Alternative Hypothesis (H1).

c. The evidences are primarily obtained through samples.

d. The summary measure is referred to as the test statistic, which is obtained through statistical considerations and could differ from context to context.

e. The acceptance threshold is referred to as the critical value, and the region in which the test statistic is acceptable is called the acceptance region. Consequently, the region beyond the critical value is called the rejection region.

f. The entire process goes under the broad name of statistical inference.

The Formal Process

Let us now discuss the topic more formally.

Typically, we hypothesize a point estimate of a population

parameter. We take a sample and compute the sample

statistic. We test it by comparing the observed value of

the sample statistic with the expected value of sample

statistic (assuming the hypothesized parameter to be

true) and judging if the difference is significant. The

smaller the difference, the greater the chances of our

hypothesized value being correct and vice versa.

However, there is some amount of arbitrariness in

judging as to what should be considered as large

difference or otherwise. In practice we use standardized

values of the sampled statistic, which follows a known

probability distribution under assumptions. This

standardized statistic is called test statistic.

When the observed (or calculated) test statistic is compared against a value (called the critical value of the test statistic) obtained from statistical tables for the probability distribution of the test statistic, it allows us to decide with a certain degree of confidence if the difference between the observed value of the sample statistic and the expected value of the sample statistic is significant.

However, one should bear in mind that we are trying to conclude something about the nature of the population based on a sample from it. Hence, there is always a chance of our going wrong if the sample does not happen to be representative of the population (which we can never really be sure about). Thus, we always make a probabilistic statement about the conclusion reached, such as “we accept (or reject) the hypothesis with 95% confidence”. By this we mean that the decision rule is set so that a hypothesis that is actually true would be wrongly rejected in only 5% of samples; the difference between the observed and expected values of the sample statistic (under the hypothesized parameter value) is judged not significant (or significant) against that standard. This means that there is a 5% chance of rejecting a true hypothesis through statistical inference.

There are possibilities for two types of error being committed

while carrying out a test as is clear from below (table 8.1.2):

While ideally it is preferable to reduce both Type I and Type II errors, it is not possible to do so simultaneously for a given sample size. If we reduce the Type I error, the Type II error will increase and vice versa, for reasons clear from the accompanying figure. Hence we always keep the Type I error fixed at α and minimize the Type II error. In practice α is mostly taken as 5% or 1%.
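The trade-off between the two errors can also be seen numerically. The following is only a minimal sketch under stated assumptions: it takes a right-tailed z-test of H0: µ = µ0 with known σ, and the values µ0 = 50, true mean 52, σ = 6 and n = 36 are hypothetical, as is the function name `type_two_error`. It computes β for three choices of α.

```python
from statistics import NormalDist

def type_two_error(alpha, mu0, mu1, sigma, n):
    """Return beta = P(H0 accepted | true mean is mu1) for a right-tailed z-test.

    Hypothetical helper for illustration; assumes sigma is known.
    """
    se = sigma / n ** 0.5                        # standard error of the sample mean
    z_alpha = NormalDist().inv_cdf(1 - alpha)    # critical value: P(Z > z_alpha) = alpha
    cutoff = mu0 + z_alpha * se                  # H0 is accepted when the sample mean <= cutoff
    # Probability that the sample mean lands in the acceptance region
    # when the population mean is really mu1.
    return NormalDist(mu1, se).cdf(cutoff)

# Hypothetical inputs: mu0 = 50, true mean 52, sigma = 6, n = 36.
for alpha in (0.10, 0.05, 0.01):
    beta = type_two_error(alpha, mu0=50, mu1=52, sigma=6, n=36)
    print(f"alpha = {alpha:.2f}  beta = {beta:.3f}  power = {1 - beta:.3f}")
```

With the sample size held fixed, a smaller α pushes the cut-off further from µ0, so β rises and the power (1 - β) falls.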

Thus the null hypothesis is the status quo solution to each

of the examples indicated earlier. It is only a possible

solution, but the null hypothesis is what we will believe in

unless we have evidence to the contrary.

Table 8.1.2

| Actual (true state) ▼ / Decision (conclusion) ▶ | H0 accepted | H0 rejected (i.e., effectively H1 accepted) |
| --- | --- | --- |
| H0 true | Correct decision | Type I Error (α) = P(H0 rejected / H0 true) |
| H0 false (i.e., effectively H1 true) | Type II Error (β) = P(H0 accepted / H0 false) | Correct decision |

We restate the various terminologies related to hypothesis testing with the illustration of a coin tossing experiment.

(a) Null hypothesis

It is the hypothesis we wish to test on some population

parameter. Usually this is specified in mathematical terms,

e.g. the hypothesis whether a coin is unbiased or not can be

written as p=½, where p=probability of a head in a toss. The

null hypothesis is generally denoted as:

H0 : p = ½

(b) Alternative hypothesis

It is a hypothesis which contradicts the null hypothesis.

Thus, while testing for unbiased-ness of a coin, the alternative

hypothesis can be

(i) It is biased (in which case p ≠ ½)

(ii) Biased in favor of head (in which case p >½)

(iii) Biased in favor of tail (in which case p < ½)

Alternative hypothesis is generally denoted as

HA or H1 : p ≠ ½

or, H1 : p > ½

or, H1 : p < ½ .

At a time, we test one of the following situations:

H0 : p=½ versus H1 : p ≠ ½ (Two tailed test)

H0 : p=½ versus H1 : p > ½ (Right tailed test)

H0 : p=½ versus H1 : p < ½ (Left tailed test)

The above idea will be clear from the following figure 8.1.1.

(c) Test criterion or test statistic

This is a formula (differs from situation to situation)

which is used in formulating a test.

(d) p-value

The probability, computed under H0, of obtaining a value of the test criterion at least as extreme as the calculated value (that is, the probability beyond the calculated value in the tail or tails specified by H1) is called the p-value.

(e) Critical Region and Critical Value

The set of values of the test criterion that lead to the rejection of the hypothesis is called the critical region (or rejection region) of the test. On the other hand, the values that lead to the acceptance of the hypothesis are said to form the acceptance region. The cut-off point separating the two regions is referred to as the critical value.

Figure 8.1.1: Selecting the tail of the test

(f) Level of Significance

This is the probability level (under H0) which is employed in

defining the critical region. It is generally denoted by the

symbol α and is customarily taken as 0.05 or 0.01

(alternatively referred to as 5% or 1% level of

significance). We have to take this approach, as theoretically

it is not possible to minimize both Type I Error ( α ) and

Type II Error ( β ) simultaneously.

(g) Power of a Test

(1 - β) is referred to as the power of the test. The test

criterion is always such that for a given level of significance

(α), the power of the test (1 - β) is maximized.

(h) Test of Hypothesis

Based on all the above, this is a rule telling us when to

accept H0 and when to reject it. The decision depends on the

value of the statistic in relation to the critical value obtained

from the corresponding statistical table.

Steps in Hypothesis Testing

Common steps:

State H0 and H1 (be clear whether H1 is two-tailed, right-tailed or left-tailed).

Define the rejection region (specifying the level of significance, i.e., α, will take care of it).

Decide on the test statistic (z, t, F, ...).

Collect sample data and compute the test statistic.

Steps if the critical value approach is followed:

Determine the appropriate critical value depending on H1 (i.e., the value(s) on the distribution of the test statistic beyond which the probability is α).

Compare the test statistic with the critical value(s) to decide whether to reject H0.

Steps if the p-value approach is followed:

Determine the p-value for the test statistic.

Reject H0 at level of significance α if p-value < α.

Common step:

Interpret the conclusion in managerial terms.
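The steps above can be traced on the coin tossing illustration. The following is a minimal sketch, not a prescribed procedure: it assumes a large-sample z-test of H0: p = ½ against the two-tailed H1: p ≠ ½, and the data (61 heads in 100 tosses) and the function name `coin_test` are hypothetical. It applies both the critical value approach and the p-value approach to the same test statistic.

```python
from statistics import NormalDist

def coin_test(heads, tosses, alpha=0.05):
    """Two-tailed large-sample z-test of H0: p = 1/2 against H1: p != 1/2."""
    std_normal = NormalDist()
    p0 = 0.5                                     # value of p under H0
    p_hat = heads / tosses                       # sample proportion of heads
    se = (p0 * (1 - p0) / tosses) ** 0.5         # standard error under H0
    z = (p_hat - p0) / se                        # test statistic
    z_crit = std_normal.inv_cdf(1 - alpha / 2)   # critical value z_(alpha/2)
    p_value = 2 * (1 - std_normal.cdf(abs(z)))   # probability in both tails beyond |z|
    return {
        "z": z,
        "z_crit": z_crit,
        "p_value": p_value,
        "reject_by_critical_value": abs(z) > z_crit,
        "reject_by_p_value": p_value < alpha,
    }

# Hypothetical data: 61 heads observed in 100 tosses.
result = coin_test(heads=61, tosses=100)
print(result)
```

Both approaches use the same statistic and the same α, so they always lead to the same decision; the p-value additionally shows how far inside or outside the rejection region the sample falls.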


Video 8.1.1:

Type I and Type II errors

Video 8.1.2:

Test criterion

Section 2

Tests for Different Situations

With these basic concepts we shall indicate different

situations and tests appropriate for them. Most of these

relate to the means, standard deviations and proportions.

Generally, in practice, a sample of size 30 or more is referred to as a large sample, and in such cases it is possible to use a large-sample approximation based on the celebrated Central Limit Theorem.

These tests can be broadly classified into the following two categories:

(a) Small Sample tests, and

(b) Large Sample tests.

As the name suggests, small sample tests are applied when the sample at hand is of small size. Large sample tests are used when the sample size is large (sample size ≥ 30). Most of these tests are based on four well-known distributions in statistics, i.e., the Normal distribution, t-distribution, F-distribution and Chi-square (χ²) distribution. In the discussion below, we shall assume that we have drawn a random sample x1, x2, ..., xn of size n from a given population, our problem being to infer about the nature of some parameter of the population.

Symbols and Notations

Before detailing on tests for different contexts / situations,

notations & symbols used are defined below:

| | One Population | Two Populations |
| --- | --- | --- |
| Population Size | $N$ | $N_1, N_2$ |
| Sample Size | $n$ | $n_1, n_2$ |
| Sample | $x_1, x_2, \ldots, x_n$ | $x_{11}, x_{12}, \ldots, x_{1n_1}$ for Population I; $x_{21}, x_{22}, \ldots, x_{2n_2}$ for Population II |
| Population Mean | $\mu$ | $\mu_1, \mu_2$ |
| Population Variance | $\sigma^2$ | $\sigma_1^2, \sigma_2^2$ |
| Population Standard Deviation | $\sigma$ | $\sigma_1, \sigma_2$ |
| Sample Mean | $\bar{x}$ | $\bar{x}_1, \bar{x}_2$ |
| Estimate of $\sigma^2$ | $s^2 = \dfrac{\sum (x_i - \bar{x})^2}{n-1}$ | $s_1^2 = \dfrac{\sum (x_{1i} - \bar{x}_1)^2}{n_1-1}$ and $s_2^2 = \dfrac{\sum (x_{2i} - \bar{x}_2)^2}{n_2-1}$ |
| Estimate of $\sigma$ | $s$ | $s_1, s_2$ |
| Population proportion of ‘successes’ | $P$ | $P_1, P_2$ |
| Population Variance | $PQ/N$ | $P_1Q_1/N_1,\ P_2Q_2/N_2$ |
| Population Standard Deviation | $\sqrt{PQ/N}$ | $\sqrt{P_1Q_1/N_1},\ \sqrt{P_2Q_2/N_2}$ |
| Sample proportion of ‘successes’ | $p$ | $p_1, p_2$ |
| Estimate of Variance of sample proportion | $pq/n$ | $p_1q_1/n_1,\ p_2q_2/n_2$ |
| Estimate of Standard Deviation of sample proportion | $\sqrt{pq/n}$ | $\sqrt{p_1q_1/n_1},\ \sqrt{p_2q_2/n_2}$ |

Some other symbols:

| Symbol | Nature of Variate | Defined through |
| --- | --- | --- |
| $z_{\alpha}$ | Z follows Normal (0,1) | $P(Z > z_{\alpha}) = \alpha$ or equivalently $P(Z < z_{\alpha}) = 1 - \alpha$ |
| $z_{\alpha/2}$ | Z follows Normal (0,1) | $P(Z > z_{\alpha/2}) = \alpha/2$ or equivalently $P(Z < -z_{\alpha/2}) = \alpha/2$ |
| $t_{(n-1),\alpha}$ | t follows t-distribution with (n-1) Degrees of Freedom | $P(t > t_{(n-1),\alpha}) = \alpha$ or equivalently $P(t < t_{(n-1),\alpha}) = 1 - \alpha$ |
| $t_{(n-1),\alpha/2}$ | t follows t-distribution | $P(t > t_{(n-1),\alpha/2}) = \alpha/2$ or equivalently $P(t < -t_{(n-1),\alpha/2}) = \alpha/2$ |
| $F(n_1-1, n_2-1, \alpha)$ | F follows F-distribution | $P(F > F(n_1-1, n_2-1, \alpha)) = \alpha$ or equivalently $P(F < F(n_1-1, n_2-1, \alpha)) = 1 - \alpha$ |
| $F(n_1-1, n_2-1, \alpha/2)$ | F follows F-distribution | $P(F > F(n_1-1, n_2-1, \alpha/2)) = \alpha/2$ or equivalently $P(F < F(n_1-1, n_2-1, \alpha/2)) = 1 - \alpha/2$ |
| $\chi^2(n-1, \alpha)$ | $\chi^2$ follows Chi-square distribution | $P(\chi^2 > \chi^2(n-1, \alpha)) = \alpha$ or equivalently $P(\chi^2 < \chi^2(n-1, \alpha)) = 1 - \alpha$ |
| $\chi^2(n-1, \alpha/2)$ | $\chi^2$ follows Chi-square distribution | $P(\chi^2 > \chi^2(n-1, \alpha/2)) = \alpha/2$ or equivalently $P(\chi^2 < \chi^2(n-1, \alpha/2)) = 1 - \alpha/2$ |

1. Some Well Known Tests

Some well known and frequently used tests are given below along with their contexts / situations:

A. Common Tests based on Normal Distribution (GENERALLY for LARGE SAMPLES)

In every row below, the three alternative hypotheses (i), (ii) and (iii) are paired with the rejection rules (i), (ii) and (iii) in the Conclusion column; $z$ denotes the calculated value of the test statistic.

| Situation | Large/small sample test | Standard Error | Confidence Interval | Null Hypothesis | Test Statistic | Alternative Hypothesis | Conclusion |
| --- | --- | --- | --- | --- | --- | --- | --- |
| A-1: µ unknown, σ known; one population under consideration. ONE SAMPLE PROBLEM | Large & Small | $\sigma_{\bar{x}} = \sigma/\sqrt{n}$ | $\bar{x} \pm z_{\alpha/2} \times \sigma_{\bar{x}}$ | $H_0: \mu = \mu_0$ | $z = \dfrac{\bar{x} - \mu_0}{\sigma_{\bar{x}}}$ | (i) $H_1: \mu \neq \mu_0$; (ii) $H_1: \mu > \mu_0$; (iii) $H_1: \mu < \mu_0$ | Reject $H_0$ if (i) $\lvert z \rvert > z_{\alpha/2}$; (ii) $z > z_{\alpha}$; (iii) $z < -z_{\alpha}$ |
| A-2: µ unknown, σ unknown; one population under consideration. ONE SAMPLE PROBLEM | Large | $\hat{\sigma}_{\bar{x}} = s/\sqrt{n}$ | $\bar{x} \pm z_{\alpha/2} \times \hat{\sigma}_{\bar{x}}$ | $H_0: \mu = \mu_0$ | $z = \dfrac{\bar{x} - \mu_0}{\hat{\sigma}_{\bar{x}}}$ | (i) $H_1: \mu \neq \mu_0$; (ii) $H_1: \mu > \mu_0$; (iii) $H_1: \mu < \mu_0$ | Reject $H_0$ if (i) $\lvert z \rvert > z_{\alpha/2}$; (ii) $z > z_{\alpha}$; (iii) $z < -z_{\alpha}$ |
| A-3: µ unknown, σ unknown; finite population of size N. ONE SAMPLE PROBLEM | Large | $\hat{\sigma}_{\bar{x}} = (s/\sqrt{n}) \times \mathrm{FPM}$, where $\mathrm{FPM} = \sqrt{(N-n)/(N-1)}$ = Finite Population Multiplier | $\bar{x} \pm z_{\alpha/2} \times \hat{\sigma}_{\bar{x}}$ | $H_0: \mu = \mu_0$ | $z = \dfrac{\bar{x} - \mu_0}{\hat{\sigma}_{\bar{x}}}$ | (i) $H_1: \mu \neq \mu_0$; (ii) $H_1: \mu > \mu_0$; (iii) $H_1: \mu < \mu_0$ | Reject $H_0$ if (i) $\lvert z \rvert > z_{\alpha/2}$; (ii) $z > z_{\alpha}$; (iii) $z < -z_{\alpha}$ |
| A-4: µ1, µ2 unknown, σ1, σ2 known; two populations under consideration. TWO SAMPLE PROBLEM | Large | $\sigma_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$ | $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \times \sigma_{(\bar{x}_1 - \bar{x}_2)}$ | $H_0: \mu_1 = \mu_2$ | $z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sigma_{(\bar{x}_1 - \bar{x}_2)}}$ | (i) $H_1: \mu_1 \neq \mu_2$; (ii) $H_1: \mu_1 > \mu_2$; (iii) $H_1: \mu_1 < \mu_2$ | Reject $H_0$ if (i) $\lvert z \rvert > z_{\alpha/2}$; (ii) $z > z_{\alpha}$; (iii) $z < -z_{\alpha}$ |
| A-5: µ1, µ2 unknown, σ1, σ2 unknown; two populations under consideration. TWO SAMPLE PROBLEM | Large | $\hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{s_1^2/n_1 + s_2^2/n_2}$ | $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \times \hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)}$ | $H_0: \mu_1 = \mu_2$ | $z = \dfrac{\bar{x}_1 - \bar{x}_2}{\hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)}}$ | (i) $H_1: \mu_1 \neq \mu_2$; (ii) $H_1: \mu_1 > \mu_2$; (iii) $H_1: \mu_1 < \mu_2$ | Reject $H_0$ if (i) $\lvert z \rvert > z_{\alpha/2}$; (ii) $z > z_{\alpha}$; (iii) $z < -z_{\alpha}$ |
| A-6: µ1, µ2 unknown, common s.d. σ known; two populations under consideration. TWO SAMPLE PROBLEM | Large | $\sigma_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{\sigma^2 (1/n_1 + 1/n_2)}$ | $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \times \sigma_{(\bar{x}_1 - \bar{x}_2)}$ | $H_0: \mu_1 = \mu_2$ (i.e., given two normal populations with common KNOWN s.d. σ, can we say that the two samples come from the same population?) | $z = \dfrac{\bar{x}_1 - \bar{x}_2}{\sigma_{(\bar{x}_1 - \bar{x}_2)}}$ | (i) $H_1: \mu_1 \neq \mu_2$; (ii) $H_1: \mu_1 > \mu_2$; (iii) $H_1: \mu_1 < \mu_2$ | Reject $H_0$ if (i) $\lvert z \rvert > z_{\alpha/2}$; (ii) $z > z_{\alpha}$; (iii) $z < -z_{\alpha}$ |
| A-7: µ1, µ2 unknown, common s.d. σ unknown; two populations under consideration. TWO SAMPLE PROBLEM | Large | $\hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{s^2 (1/n_1 + 1/n_2)}$, where $s^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$ | $(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2} \times \hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)}$ | $H_0: \mu_1 = \mu_2$ (i.e., given two normal populations with common, but UNKNOWN s.d. σ, can we say that the two samples come from the same population?) | $z = \dfrac{\bar{x}_1 - \bar{x}_2}{\hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)}}$ | (i) $H_1: \mu_1 \neq \mu_2$; (ii) $H_1: \mu_1 > \mu_2$; (iii) $H_1: \mu_1 < \mu_2$ | Reject $H_0$ if (i) $\lvert z \rvert > z_{\alpha/2}$; (ii) $z > z_{\alpha}$; (iii) $z < -z_{\alpha}$ |
| A-8: P unknown, np > 5. ONE SAMPLE PROBLEM | Large | $\sigma_p = \sqrt{P_0 Q_0 / n}$, where $Q_0 = 1 - P_0$ | $p \pm z_{\alpha/2} \times \sigma_p$ | $H_0: P = P_0$ | $z = \dfrac{p - P_0}{\sigma_p}$ | (i) $H_1: P \neq P_0$; (ii) $H_1: P > P_0$; (iii) $H_1: P < P_0$ | Reject $H_0$ if (i) $\lvert z \rvert > z_{\alpha/2}$; (ii) $z > z_{\alpha}$; (iii) $z < -z_{\alpha}$ |
| A-9: P unknown, np > 5; finite population of size N. ONE SAMPLE PROBLEM | Large | $\sigma_p = \mathrm{FPM} \times \sqrt{P_0 Q_0 / n}$, where $Q_0 = 1 - P_0$ and $\mathrm{FPM} = \sqrt{(N-n)/(N-1)}$ = Finite Population Multiplier | $p \pm z_{\alpha/2} \times \sigma_p$ | $H_0: P = P_0$ | $z = \dfrac{p - P_0}{\sigma_p}$ | (i) $H_1: P \neq P_0$; (ii) $H_1: P > P_0$; (iii) $H_1: P < P_0$ | Reject $H_0$ if (i) $\lvert z \rvert > z_{\alpha/2}$; (ii) $z > z_{\alpha}$; (iii) $z < -z_{\alpha}$ |
| A-10: P1, P2 unknown, n1p1 > 5, n2p2 > 5. TWO SAMPLE PROBLEM | Large | $\sigma_{(p_1 - p_2)} = \sqrt{pq\,(1/n_1 + 1/n_2)}$, where $p = \dfrac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$ and $q = 1 - p$ | $(p_1 - p_2) \pm z_{\alpha/2} \times \sigma_{(p_1 - p_2)}$ | $H_0: P_1 = P_2$ | $z = \dfrac{p_1 - p_2}{\sigma_{(p_1 - p_2)}}$ | (i) $H_1: P_1 \neq P_2$; (ii) $H_1: P_1 > P_2$; (iii) $H_1: P_1 < P_2$ | Reject $H_0$ if (i) $\lvert z \rvert > z_{\alpha/2}$; (ii) $z > z_{\alpha}$; (iii) $z < -z_{\alpha}$ |
| A-11: σ1, σ2 unknown. TWO SAMPLE PROBLEM | Large | $\hat{\sigma}_{(s_1 - s_2)} = \sqrt{s_1^2/(2n_1) + s_2^2/(2n_2)}$ | $(s_1 - s_2) \pm z_{\alpha/2} \times \hat{\sigma}_{(s_1 - s_2)}$ | $H_0: \sigma_1 = \sigma_2$ | $z = \dfrac{s_1 - s_2}{\hat{\sigma}_{(s_1 - s_2)}}$ | (i) $H_1: \sigma_1 \neq \sigma_2$; (ii) $H_1: \sigma_1 > \sigma_2$; (iii) $H_1: \sigma_1 < \sigma_2$ | Reject $H_0$ if (i) $\lvert z \rvert > z_{\alpha/2}$; (ii) $z > z_{\alpha}$; (iii) $z < -z_{\alpha}$ |
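The one-sample tests for a mean with σ unknown (cases A-2 and A-3) can be sketched in a few lines. This is an illustrative sketch only: the function name `z_test_mean`, its parameters, and the sample figures (sample mean 52, µ0 = 50, s = 8, n = 64) are hypothetical. When a population size is supplied, the standard error is multiplied by the Finite Population Multiplier as in case A-3.

```python
from statistics import NormalDist

def z_test_mean(x_bar, mu0, s, n, alpha=0.05, population_size=None):
    """Large-sample two-tailed z-test of H0: mu = mu0 (cases A-2 and A-3)."""
    se = s / n ** 0.5                              # estimated standard error s / sqrt(n)
    if population_size is not None:
        # Finite Population Multiplier sqrt((N - n) / (N - 1)), case A-3
        se *= ((population_size - n) / (population_size - 1)) ** 0.5
    z = (x_bar - mu0) / se                         # test statistic
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # z_(alpha/2)
    ci = (x_bar - z_crit * se, x_bar + z_crit * se)
    return z, z_crit, ci, abs(z) > z_crit

# Hypothetical sample: mean 52 and s = 8 from n = 64 observations, H0: mu = 50.
z, z_crit, ci, reject = z_test_mean(x_bar=52.0, mu0=50.0, s=8.0, n=64)
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject}")
print(f"95% confidence interval: ({ci[0]:.2f}, {ci[1]:.2f})")
```

For a two-tailed test, rejecting H0 is equivalent to µ0 falling outside the confidence interval built with the same standard error.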

B. Tests Based on t-distribution (Small Sample Tests)

In every row below, the three alternative hypotheses (i), (ii) and (iii) are paired with the rejection rules (i), (ii) and (iii) in the Conclusion column.

| Situation | Large/small sample test | Standard Error | Confidence Interval | Null Hypothesis | Test Statistic | Alternative Hypothesis | Conclusion |
| --- | --- | --- | --- | --- | --- | --- | --- |
| B-1: µ unknown, σ unknown. ONE SAMPLE PROBLEM | Small | $\hat{\sigma}_{\bar{x}} = s/\sqrt{n}$ | $\bar{x} \pm t_{(n-1),\alpha/2} \times \hat{\sigma}_{\bar{x}}$ | $H_0: \mu = \mu_0$ | $t = \dfrac{\bar{x} - \mu_0}{\hat{\sigma}_{\bar{x}}}$ | (i) $H_1: \mu \neq \mu_0$; (ii) $H_1: \mu > \mu_0$; (iii) $H_1: \mu < \mu_0$ | Reject $H_0$ if (i) $\lvert t \rvert > t_{(n-1),\alpha/2}$; (ii) $t > t_{(n-1),\alpha}$; (iii) $t < -t_{(n-1),\alpha}$ |
| B-2: µ1, µ2 unknown, common s.d. σ unknown; two populations under consideration. TWO SAMPLE PROBLEM | Small | $\hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)} = \sqrt{s^2 (1/n_1 + 1/n_2)}$, where $s^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$ | $(\bar{x}_1 - \bar{x}_2) \pm t_{(n_1+n_2-2),\alpha/2} \times \hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)}$ | $H_0: \mu_1 = \mu_2$ (i.e., given two normal populations with common s.d. σ, can we say that the two samples come from the same population?) | $t = \dfrac{\bar{x}_1 - \bar{x}_2}{\hat{\sigma}_{(\bar{x}_1 - \bar{x}_2)}}$ | (i) $H_1: \mu_1 \neq \mu_2$; (ii) $H_1: \mu_1 > \mu_2$; (iii) $H_1: \mu_1 < \mu_2$ | Reject $H_0$ if (i) $\lvert t \rvert > t_{(n_1+n_2-2),\alpha/2}$; (ii) $t > t_{(n_1+n_2-2),\alpha}$; (iii) $t < -t_{(n_1+n_2-2),\alpha}$ |
| B-3: µx, µy unknown; n paired observations. PAIRED TEST | Small | $\hat{\sigma}_{\bar{d}} = s/\sqrt{n}$, where $d_i = x_i - y_i$ and $s^2 = \dfrac{\sum (d_i - \bar{d})^2}{n-1}$ | $\bar{d} \pm t_{(n-1),\alpha/2} \times \hat{\sigma}_{\bar{d}}$ | $H_0: \mu_x = \mu_y$ | $t = \dfrac{\bar{d}}{\hat{\sigma}_{\bar{d}}}$ | (i) $H_1: \mu_x \neq \mu_y$; (ii) $H_1: \mu_x > \mu_y$; (iii) $H_1: \mu_x < \mu_y$ | Reject $H_0$ if (i) $\lvert t \rvert > t_{(n-1),\alpha/2}$; (ii) $t > t_{(n-1),\alpha}$; (iii) $t < -t_{(n-1),\alpha}$ |
| B-4: ρ = population correlation coefficient between X & Y; r = sample correlation coefficient between X & Y; n = sample size. CORRELATION TEST | Large & Small | - | - | $H_0: \rho = 0$ | $t = \dfrac{r\sqrt{n-2}}{\sqrt{1 - r^2}}$ | (i) $H_1: \rho \neq 0$; (ii) $H_1: \rho > 0$; (iii) $H_1: \rho < 0$ | Reject $H_0$ if (i) $\lvert t \rvert > t_{(n-2),\alpha/2}$; (ii) $t > t_{(n-2),\alpha}$; (iii) $t < -t_{(n-2),\alpha}$ |
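As an illustration of the paired test (case B-3), the sketch below computes the t statistic from two hypothetical lists of paired scores. The data, the function name `paired_t` and the variable names are assumptions made for the example. Python's standard library has no t-distribution, so the critical value 2.262 is taken as read from a t-table for 9 degrees of freedom and an upper-tail area of 0.025.

```python
from statistics import mean, stdev

def paired_t(x, y):
    """Return (t, degrees of freedom) for the paired test of H0: mu_x = mu_y."""
    d = [xi - yi for xi, yi in zip(x, y)]   # differences d_i = x_i - y_i
    n = len(d)
    d_bar = mean(d)                         # mean difference
    se = stdev(d) / n ** 0.5                # stdev() divides by (n - 1), as in s^2
    return d_bar / se, n - 1

# Hypothetical paired scores for 10 subjects (x = after, y = before).
x = [72, 65, 80, 58, 77, 69, 84, 61, 75, 90]
y = [70, 64, 77, 58, 75, 68, 80, 60, 73, 86]

t, df = paired_t(x, y)
t_crit = 2.262   # t-table value for 9 degrees of freedom, alpha/2 = 0.025 (assumed lookup)
print(f"t = {t:.3f} with {df} degrees of freedom; reject H0: {abs(t) > t_crit}")
```

Pairing removes the subject-to-subject variation, which is why the standard error is built from the differences rather than from the two samples separately.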

C. Tests Based on Chi-square distribution

| Situation | Large/small sample test | Null Hypothesis | Test Statistic | Alternative Hypothesis | Conclusion |
| --- | --- | --- | --- | --- | --- |
| C-1: GOODNESS OF FIT against a theoretical or specified distribution; expected frequencies $E_i > 5$, sample size (n) reasonably large (say, > 50) | Large | $H_0$: The sample follows the specified distribution | $\chi^2 = \sum \dfrac{(O_i - E_i)^2}{E_i}$, where $O_i$ = observed frequency, $E_i$ = expected frequency, $\sum O_i = \sum E_i = n$, $i = 1, 2, \ldots, k$ and $k$ = number of classes | $H_1$: The sample does not follow the specified distribution | Reject $H_0$ if $\chi^2 > \chi^2(k-1, \alpha)$ |
| C-2: INDEPENDENCE OF TWO ATTRIBUTES (say, A & B) (r × s contingency table) | Large & Small | $H_0$: A & B are independent | $\chi^2 = \sum \dfrac{\left[ n_{ij} - (n_{io} n_{oj} / n) \right]^2}{(n_{io} n_{oj} / n)}$, where $n_{ij}$ = frequency of the $(A_i, B_j)$ cell, $n_{io}$ = marginal total for $A_i$, $n_{oj}$ = marginal total for $B_j$, $n$ = total frequency, $i = 1, 2, \ldots, r$ (no. of rows) and $j = 1, 2, \ldots, s$ (no. of columns) | $H_1$: A & B are not independent | Reject $H_0$ if $\chi^2 > \chi^2[(r-1)(s-1), \alpha]$ |
| C-3: EQUALITY OF SEVERAL POPULATION PROPORTIONS | Large & Small | $H_0: P_1 = P_2 = \ldots = P_s$, where s = no. of populations and r = no. of characteristics being observed | $\chi^2 = \sum \dfrac{\left[ n_{ij} - (n_{io} n_{oj} / n) \right]^2}{(n_{io} n_{oj} / n)}$, with $n_{ij}$, $n_{io}$, $n_{oj}$ and $n$ as in C-2 | $H_1$: Not all of $P_1, P_2, \ldots, P_s$ are equal | Reject $H_0$ if $\chi^2 > \chi^2[(r-1)(s-1), \alpha]$ |
| C-4: TEST FOR POPULATION VARIANCE OF NORMAL POPULATION | Small: $\chi^2$ statistic; Large: Z statistic | $H_0: \sigma^2 = \sigma_0^2$ | Small sample: $\chi^2 = \dfrac{(n-1)s^2}{\sigma_0^2}$, where n = sample size. Large sample: $Z = \sqrt{2\chi^2} - \sqrt{2n-1}$, where $\chi^2$ is as above | $H_1: \sigma^2 \neq \sigma_0^2$ | Small sample: reject $H_0$ if $\chi^2 > \chi^2(n-1, \alpha/2)$ or $\chi^2 < \chi^2(n-1, 1-\alpha/2)$. Large sample: reject $H_0$ if $\lvert Z \rvert > z_{\alpha/2}$ |
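The goodness-of-fit statistic of case C-1 is simple to compute directly. The sketch below is illustrative only: the die-rolling counts and the function name `chi_square_gof` are hypothetical, and the critical value 11.070 is taken as read from a chi-square table for k - 1 = 5 degrees of freedom at α = 0.05, since Python's standard library has no chi-square distribution.

```python
def chi_square_gof(observed, expected):
    """Return the goodness-of-fit statistic: sum of (O_i - E_i)^2 / E_i."""
    if abs(sum(observed) - sum(expected)) > 1e-9:
        raise ValueError("observed and expected frequencies must have the same total")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical data: a die rolled 120 times; H0 says all six faces are equally likely.
observed = [25, 17, 15, 23, 24, 16]
expected = [120 / 6] * 6            # E_i = 20 for every face, so each E_i > 5

chi2 = chi_square_gof(observed, expected)
chi2_crit = 11.070                  # chi-square table value for 5 d.f., alpha = 0.05 (assumed lookup)
reject = chi2 > chi2_crit
print(f"chi-square = {chi2:.2f}, critical value = {chi2_crit}, reject H0: {reject}")
```

The test is right-tailed by construction: only large discrepancies between observed and expected frequencies count as evidence against H0.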

Keynote 8.2.1: Example for A1

Keynote 8.2.2: Example for A2

Keynote 8.2.3: Example for A3

Keynote 8.2.4: Example for A4

Keynote 8.2.5: Example for A5

Keynote 8.2.6: Example for A8

Keynote 8.2.7: Example for B2

Keynote 8.2.8: Example for C1

Keynote 8.2.9: Example for C2

SECTION 3

Case Study: Smoking a Costly Affair Now?

This case study was written by Thalluri Prashanth Vidya Sagar, IBSCDC. It is intended to be used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situation. The case was compiled from published sources.

© 2009, IBSCDC. No part of this publication may be copied, stored, transmitted, reproduced or distributed in any form or medium whatsoever without the permission of the copyright owner.

Background Reading: Chapters 8 and 9, “Testing Hypotheses: One Sample Tests” and “Testing Hypotheses: Two-Sample Tests”, Statistics for Management, 7th Edition (Richard I. Levin and David S. Rubin)

Ref. No.: QM0009

“An estimated 440,000 Americans die each year from diseases caused by smoking. Smoking is responsible for an estimated one in five U.S. deaths and costs the U.S. over $150 billion each year in health care costs and lost productivity.”¹
– American Lung Association

¹ “Smoking Cessation Resources Fact Sheet”, http://www.lungusa.org/site/c.dvLUK9O0E/b.44456/k.7B2A/Smoking_Cessation_Resources_Fact_Sheet.htm, July 2004

On April 1st 2009, the US government raised the federal cigarette-tax rate from 39¢ to $1.01 per pack. As smoking had been taking a toll on human lives, anti-smoking advocates welcomed the administration’s move, stating that it would save an estimated 900,000 lives. However, some smokers worried about the rising cost of their habit (Exhibit I).

Exhibit I: Worried Smokers
Source: “Can Raising the Tobacco Tax Reduce the Number of Smokers?”, http://www.bjreview.com.cn/forum/txt/2009-02/10/content_177671.htm, February 12th 2009

This kind of taxation is often called a ‘sin tax’, as it was mainly imposed on vices like gambling, drinking and smoking. The recent hike in sin tax was expected to stop around 2 million kids from trying to smoke for the first time and prompt almost 1 million adults to quit.

The sin tax has historical roots going back to the 1500s. Pope Leo X had taxed licensed prostitutes. Peter the Great levied a tax on men who grew beards. American sin taxation began with the proposal of an American patriot, Alexander Hamilton, who proposed taxation on alcohol to contain its consumption and simultaneously to raise revenues for the government (Exhibit II).

In China too, the State Administration of Taxation and the University of California (Berkeley) had released a report titled Tobacco Tax and Its Potential Impact on China. In December 2008, they asked the Chinese government to increase the tax rate on cigarettes substantially to reduce cigarette consumption in China.

Experts estimated that a 51% increase in the retail price would reduce the number of smokers by as many as 13.7 million and save 3.4 million lives. It was also estimated that the tax rate could generate as much as 64.9 billion yuan ($9.5 billion) per annum as additional revenue for the government. All the experts unanimously agreed on the issue of raising tax rates to effect a price increase. They pointed out that, on average, the cigarette tax rate was levied at 65%–70% of the retail price across the globe.

Some analysts criticized the sin tax, stating that such a move would promote the interests of producers of cheap, low-quality cigarettes while further harming the health of smokers. However, a World Bank survey found a 4% reduction in cigarette consumption for every 10% increase in retail price in developed countries, while the reduction was 8% in developing countries.

The experts also cited the example of New York City in successfully controlling tobacco usage. The local government in New York City had initiated a comprehensive anti-smoking measure in 2002 by raising the cigarette tax rate. It was found in 2006 that the city’s smoking rate had declined dramatically, by 20%, to stand at only 17.5%. A survey also showed that 45.3% of smokers in New York were smoking less often than before or considering plans and ways to quit smoking. It was also found that a number of adolescent smokers, who were more sensitive to cigarette prices, cut their tobacco consumption due to their limited finances. As a result of cigarette price hikes, there were more cigarette quitters in low-income groups than in high-income groups in the city.

According to another survey on smoking habits, 400 out of a random sample of 500 men were found to be smokers. After the tax on tobacco had been increased, another random sample of 600 men in New York City included 400 smokers. An analyst wondered whether the observed decrease in the proportion of smokers was significant or not. He wanted to test the data at the 5% level of significance.
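One way to frame the analyst’s question is the two-sample test for proportions (case A-10). The sketch below is offered only as an illustration for class discussion, not as the case’s official solution: it assumes H0: P1 = P2 against the right-tailed H1: P1 > P2 (the proportion of smokers before the tax increase exceeds the proportion after), and the function name `two_proportion_z` is hypothetical.

```python
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Return z for H0: P1 = P2 using the pooled standard error of case A-10."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)            # pooled proportion, equal to (n1*p1 + n2*p2) / (n1 + n2)
    q = 1 - p
    se = (p * q * (1 / n1 + 1 / n2)) ** 0.5
    return (p1 - p2) / se

# Figures from the case: 400 smokers out of 500 before, 400 out of 600 after the tax increase.
z = two_proportion_z(400, 500, 400, 600)
z_crit = NormalDist().inv_cdf(0.95)      # z_alpha for a right-tailed test at the 5% level
reject = z > z_crit
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```

A two-tailed alternative would use the critical value for α/2 instead; which alternative is appropriate is a matter for the discussion.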

Exhibit II

LEGISLATING MORALITY

| Year | Event |
| --- | --- |
| 1787 | Alexander Hamilton advocated taxing “ardent spirits” in Federalist No. 12 |
| 1794 | US liquor tax sparks the Whiskey Rebellion |
| 1864 | To raise money for the Civil War, US authorities levied a federal cigarette tax of up to 2.4¢ per pack for the first time in US history |
| 1963 | Annual per capita cigarette consumption among US adults peaks at 4,345 |
| 2005 | Nine Democratic Senators introduced an unsuccessful bill that would have imposed a 25% tax on purveyors of online pornography |
| 2009 | Amid a public outcry, New York Governor David Paterson backtracks on plans to raise taxes on goods ranging from downloads of pornography to sugary soft drinks |

Source: Altman Alex, “A Brief History of: Sin Taxes”, http://www.time.com/time/magazine/article/0,9171,1889187,00.html, April 2nd 2009

SECTION 4

Case Study: Care Hygiene

In 2003, Mumbai-based Care Hygiene Co (Care Hygiene), a well-known company dealing in healthcare products, recorded sales of Rs.665 crores and a net income of Rs.45.6 crores. ‘Nutravit,’ a chocolate flavored health drink, was one of its flagship products. But of late, Nutravit had been facing stiff competition from a number of other chocolate-based beverages that had flooded the market. Its market share had come down significantly. Care Hygiene decided it was high time it made an effort to tackle competition and regain market share. The company decided to market a new variant of Nutravit, with an improved formulation and in a new flavor. By mid-2003, Care Hygiene was ready with its new variant: a powdered mix which, when mixed with milk, gave a nutritious as well as tasty vanilla cum chocolate flavored drink.

Care Hygiene’s marketing manager decided to test market the new product. He selected Mumbai and Nagpur as the test cities because there were significant similarities in the consumption patterns of its health drink ‘Nutravit’ in these two cities. In Mumbai, Care continued to market its established health beverage, while in Nagpur it replaced it with the new vanilla cum chocolate flavor.

In each city, a sample of 200 households was selected and interviewed over a six-month period. Based on each household’s reported consumption of the beverages, Care’s marketing manager charted the results showing the different household consumption rates in Mumbai and Nagpur (refer to Table I for the household consumption rates).

In Mumbai, where the chocolate flavor was marketed, 114 households reported using the beverage. In Nagpur, where the new variant was test marketed, 136 households reported using the beverage mix.

TABLE I: HOUSEHOLD CONSUMPTION RATES

| Consumption rate | Nagpur (Vanilla cum Chocolate): Number | Nagpur (Vanilla cum Chocolate): % of households | Mumbai (Chocolate): Number | Mumbai (Chocolate): % of households |
| --- | --- | --- | --- | --- |
| Heavy | 34 | 17 | 28 | 14 |
| Moderate | 52 | 26 | 44 | 22 |
| Light | 50 | 25 | 42 | 21 |
| Non-user | 64 | 32 | 86 | 43 |
| Total | 200 | 100 | 200 | 100 |


Questions for Discussion:

1. Care’s marketing manager wondered if the difference in usage rates (57% in Mumbai and 68% in Nagpur) could be attributed to the new vanilla formulation or if the difference had merely resulted by chance due to sampling.

2. Since the new formulation was an improved one with a new flavor, it was more expensive. The management decided to proceed with it only if there was sufficient evidence that the new variant would yield better results. While test marketing the new variant, Care’s marketing manager had decided that if it achieved a 75% usage rate among target households, he would recommend the launching of the product. Based on the sample of 200 households, the new variant had achieved a usage rate of 68%. What should he do? Should he recommend to the management for or against launching the new product?

3. Among the 200 households sampled in each city, Care found different consumption rates. While 86 heavy and moderate consumption households were reported in Nagpur, 72 heavy and moderate consumption households were reported in Mumbai. Care’s marketing manager wanted to know if the difference between the consumption rates in the two cities was statistically significant. If there was a statistically significant difference, he could conclude that the new flavor was causing a heavier consumption pattern.

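The questions above can be approached with large-sample z-tests for proportions. The following is a minimal sketch, not a prescribed solution: the function names `two_proportion_z` and `one_proportion_z` are hypothetical, the tests are taken to be one-sided, and the household counts come from Table I and the case text.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for H0: p1 = p2 against H1: p1 > p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, 1 - NormalDist().cdf(z)  # upper-tail p-value


def one_proportion_z(x, n, p0):
    """z statistic for H0: p = p0 against H1: p < p0."""
    se = sqrt(p0 * (1 - p0) / n)
    z = (x / n - p0) / se
    return z, NormalDist().cdf(z)  # lower-tail p-value


# Usage rates: Nagpur (136 of 200 users) vs Mumbai (114 of 200 users)
z_usage, p_usage = two_proportion_z(136, 200, 114, 200)

# Nagpur usage rate (136 of 200) against the 75% launch target
z_target, p_target = one_proportion_z(136, 200, 0.75)

# Heavy + moderate households: Nagpur (86 of 200) vs Mumbai (72 of 200)
z_heavy, p_heavy = two_proportion_z(86, 200, 72, 200)

for label, z, p in [
    ("Usage, Nagpur vs Mumbai", z_usage, p_usage),
    ("Usage vs 75% target", z_target, p_target),
    ("Heavy/moderate, Nagpur vs Mumbai", z_heavy, p_heavy),
]:
    print(f"{label}: z = {z:.2f}, one-sided p-value = {p:.4f}")
```

Each p-value is then compared with the chosen significance level (5% is a common convention); a p-value below that level leads to rejecting the corresponding null hypothesis.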


SECTION 5

Case Study: Conversys Inc (A)


This case study was written by Dr. Sourabh Bhattacharya, Professor (Department of Decision Sciences), IBS, Hyderabad. It is

intended to be used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a manage-

ment situation. The case was written from generalised experiences.

Conversys Inc. (Conversys), which started its operations in July 2000, soon became one of the most reputed call centres in Hyderabad, India. Conversys provides both inbound and outbound call services to a wide range of clients, including consumer product firms, financial product firms, automobile firms and telecommunication firms. It also provides internal functional services such as payroll maintenance, help desk and sales support to many firms.

Performance Evaluation Method at Conversys

One of the most important groups of employees at Conversys is the Customer Service Representatives (CSRs), or Agents. These agents are the ones who answer customers’ telephone inquiries. Therefore, the performance of the agents plays a vital role in building the company’s reputation of providing a 99% service rate. Moreover, agents are paid by the hour. Hence, their productivity becomes an important issue. The typical performance measure for call centre agents is Average Handling Time (AHT)² in seconds, or the number of calls handled in an hour. Every month, the Unit Managers (UMs) compute a simple statistic (i.e., the mean of AHTs) for each agent, taking into account the tenure of the agent. The UMs prepare reports that are presented and discussed in the monthly performance meetings with the higher-ups, to screen for well-performing and under-performing agents. The agents performing below standards are identified in these monthly meetings and provided further training. Agents are given 2 months of On-the-Job Training (OJT) to improve their accuracy, speed and efficiency while processing phone calls.

After the OJT, UMs again monitor phone calls to ensure that the agents achieve the company’s courtesy and accuracy standards. The performance measure of each agent before and after the OJT is compared to decide whether the agent has improved enough to be retained.

The Dilemma of Amit Vardhan

Amit Vardhan (Amit), the UM of one of the project teams, has recently become concerned about the performance of one of his agents, Ishan Singh (Ishan). The company standards specify that the AHT should be less than 180 seconds. Amit collects Ishan’s AHT data for the last month (Exhibit I) and wonders if Ishan should undergo training to further improve his performance. After analyzing Ishan’s performance data, Amit concludes that Ishan is below the company standards and needs to undergo 2 months of OJT.
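Amit’s conclusion can be checked with a one-sample t-test of Ishan’s mean AHT against the 180-second standard. The sketch below is one possible approach and not necessarily the one Amit used: the variable names are hypothetical, the 30 daily values are copied from Exhibit I, and the test is taken to be one-sided (H0: mean AHT is at most 180 seconds; H1: mean AHT exceeds 180 seconds).

```python
from math import sqrt
from statistics import mean, stdev

# Ishan's daily AHT in seconds before OJT, days 1 to 30, copied from Exhibit I
aht_before = [185, 180, 175, 185, 182, 185, 196, 180, 182, 189,
              180, 183, 180, 179, 181, 185, 176, 180, 186, 180,
              178, 178, 179, 180, 185, 180, 183, 185, 180, 180]

STANDARD = 180  # company standard for AHT, in seconds

n = len(aht_before)
x_bar = mean(aht_before)
s = stdev(aht_before)  # sample standard deviation (divisor n - 1)
t_stat = (x_bar - STANDARD) / (s / sqrt(n))

# One-tailed 5% critical value of t with 29 degrees of freedom (from a t-table)
T_CRITICAL = 1.699

print(f"mean = {x_bar:.2f}, s = {s:.2f}, t = {t_stat:.2f}")
print("Exceeds standard" if t_stat > T_CRITICAL else "No evidence of exceeding standard")
```

A t statistic larger than the critical value indicates that the observed mean is above 180 seconds by more than day-to-day variation alone would explain.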

One month after Ishan’s training is over, Amit decides to evaluate Ishan’s performance and give him a salary hike, provided his performance has improved. However, another agent, Devang Parekh (Devang), is also a potential candidate for the salary hike. Hence, Amit decides that the salary hike would be given to whoever shows the better performance in the coming month. Amit went through the AHT records of Ishan and Devang for the very next month (Exhibits II & III). However, he was confused about who should be given the salary hike: Ishan or Devang.

Exhibit I: Ishan’s Performance Before OJT

Day    AHT (in seconds)    Day    AHT (in seconds)    Day    AHT (in seconds)
1      185                 11     180                 21     178
2      180                 12     183                 22     178
3      175                 13     180                 23     179
4      185                 14     179                 24     180
5      182                 15     181                 25     185
6      185                 16     185                 26     180
7      196                 17     176                 27     183
8      180                 18     180                 28     185
9      182                 19     186                 29     180
10     189                 20     180                 30     180

Compiled by author

Exhibit II: Ishan’s Performance After OJT

Day    AHT (in seconds)    Day    AHT (in seconds)    Day    AHT (in seconds)
1      180                 11     180                 21     180
2      175                 12     180                 22     185
3      173                 13     178                 23     183
4      183                 14     180                 24     180
5      178                 15     183                 25     180
6      182                 16     180                 26     183
7      185                 17     179                 27     181
8      170                 18     175                 28     184
9      180                 19     178                 29     181
10     180                 20     180                 30     182

Compiled by author

Exhibit III: Devang’s Performance

Day    AHT (in seconds)    Day    AHT (in seconds)    Day    AHT (in seconds)
1      185                 11     190                 21     182
2      193                 12     183                 22     178
3      178                 13     183                 23     177
4      175                 14     181                 24     176
5      190                 15     185                 25     175
6      187                 16     185                 26     180
7      176                 17     183                 27     180
8      179                 18     182                 28     179
9      185                 19     178                 29     174
10     182                 20     187                 30     180

Compiled by author
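To compare the two agents, one option is to look at both the average and the consistency of their AHTs and then test the difference between the two means. The sketch below uses Welch’s two-sample t statistic, which is an assumption on our part (the case does not prescribe a method); the variable names are hypothetical and the data are copied from Exhibits II and III.

```python
from math import sqrt
from statistics import mean, stdev, variance

# Daily AHT in seconds, days 1 to 30
ishan_after = [180, 175, 173, 183, 178, 182, 185, 170, 180, 180,
               180, 180, 178, 180, 183, 180, 179, 175, 178, 180,
               180, 185, 183, 180, 180, 183, 181, 184, 181, 182]  # Exhibit II
devang = [185, 193, 178, 175, 190, 187, 176, 179, 185, 182,
          190, 183, 183, 181, 185, 185, 183, 182, 178, 187,
          182, 178, 177, 176, 175, 180, 180, 179, 174, 180]       # Exhibit III

mean_ishan, mean_devang = mean(ishan_after), mean(devang)
sd_ishan, sd_devang = stdev(ishan_after), stdev(devang)

# Welch's t statistic: does not assume equal variances for the two agents
v1 = variance(ishan_after) / len(ishan_after)
v2 = variance(devang) / len(devang)
t_stat = (mean_devang - mean_ishan) / sqrt(v1 + v2)

# Welch-Satterthwaite approximation to the degrees of freedom
df = (v1 + v2) ** 2 / (
    v1 ** 2 / (len(ishan_after) - 1) + v2 ** 2 / (len(devang) - 1)
)

print(f"Ishan : mean = {mean_ishan:.2f}, sd = {sd_ishan:.2f}")
print(f"Devang: mean = {mean_devang:.2f}, sd = {sd_devang:.2f}")
print(f"Welch t = {t_stat:.2f} with about {df:.0f} degrees of freedom")
```

A lower mean AHT indicates faster call handling and a lower standard deviation indicates more consistent performance; the t statistic, compared with a t-table value at the chosen significance level, indicates whether the gap between the means is larger than chance variation would explain.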

SECTION 6

Case Study: Conversys Inc. (B)


This case study was written by Dr. Sourabh Bhattacharya, Professor, Department of Operations & IT, IBS Hyderabad. It is

intended to be used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a

management situation. The case was prepared from the generalized experiences.

The management of Conversys has recently become concerned about its on-the-job training policy. OJT requires the trainers to possess specialist teaching skills. However, most of Conversys’s trainers lacked the skill and knowledge to train, resulting in output of an insufficient standard. Moreover, the trainers, being employees themselves, were not given sufficient time to spend with the trainees, which again led to substandard training and insufficient learning.

The vice president (HR) of Conversys, Shailja Goel, is of the opinion that instead of OJT, employees should be given training by external agencies in a more systematic and structured manner. A number of such agencies were contacted over the next few months and VoiceTutorial was shortlisted for the job. However, before signing the contract with VoiceTutorial, Ms. Shailaja wanted to test the effectiveness of its training methods. She negotiated with VoiceTutorial to run a pilot training program for 15 of her employees. The pilot program was scheduled to start after a month. The average monthly performance of these 15 employees was recorded for a month before the pilot program started. The employees were given one week of extensive training on data collection and entry, customer service and call handling techniques. The monthly average performance of these 15 employees was also recorded after the training program was over. The performance data are shown in the table below. Ms. Shailaja is now wondering how to use these data to assess the effectiveness of VoiceTutorial’s pilot training program.

Employee    AHT (in seconds) before the pilot program    AHT (in seconds) after the pilot program
1           180                                          185
2           193                                          183
3           178                                          182
4           175                                          175
5           185                                          187
6           187                                          182
7           176                                          178
8           179                                          177
9           189                                          176
10          182                                          175
11          185                                          180
12          183                                          180
13          183                                          179
14          181                                          174
15          185                                          180

Prepared by author
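Because the same 15 employees are measured before and after the pilot program, the data form matched pairs, and a paired t-test on the differences is a natural way to use them. The following is a minimal sketch under that assumption; the variable names are hypothetical and the test is taken to be one-sided (H1: mean AHT is lower after the training).

```python
from math import sqrt
from statistics import mean, stdev

# Monthly average AHT in seconds for the 15 employees, copied from the table
before = [180, 193, 178, 175, 185, 187, 176, 179, 189, 182, 185, 183, 183, 181, 185]
after = [185, 183, 182, 175, 187, 182, 178, 177, 176, 175, 180, 180, 179, 174, 180]

# A positive difference means the employee handled calls faster after training
diffs = [b - a for b, a in zip(before, after)]

n = len(diffs)
d_bar = mean(diffs)
s_d = stdev(diffs)  # sample standard deviation of the differences
t_stat = d_bar / (s_d / sqrt(n))

# One-tailed 5% critical value of t with 14 degrees of freedom (from a t-table)
T_CRITICAL = 1.761

print(f"mean reduction = {d_bar:.2f} s, s_d = {s_d:.2f}, t = {t_stat:.2f}")
print("Training effective" if t_stat > T_CRITICAL else "No evidence of improvement")
```

Working with the per-employee differences removes the variation between employees, which is why the paired test is usually preferred here over a test that treats the two columns as independent samples.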


SECTION 7

Case Study: The Strategic Break: To be or Not to be


This case study was written by Dr. Sourabh Bhattacharya, Professor (Department of Decision Sciences), IBS, Hyderabad. It is in-

tended to be used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a manage-

ment situation. The case was written from generalised experiences.

Newspapers and internet media are flooded with criticism of the recently introduced “Strategic Break” in season two of the IPL T20. On one hand, players are of the opinion that the strategic breaks hamper the momentum of a team; on the other hand, the media is looking at these breaks suspiciously. The media believes that the strategic breaks are born out of the purely commercial interests of the Board of Control for Cricket in India (BCCI).

Rubbishing the media allegations, Lalit Modi, the Chairman of IPL, claims that the strategic break is an innovation brought into the 20-20 format of the game. “The ‘strategy break’ is an innovative deviation from tradition, which gives teams an opportunity to consult and alter strategies after 10 overs to get their acts right,” says Modi. However, Modi also assures that the idea will be reassessed, taking the views of the players into consideration, once the season two games are over.

Introduction

On the lines of the National Basketball Association (NBA) of the USA and football’s English Premier League, the Board of Control for Cricket in India (BCCI) launched the Indian Premier League (IPL) in 2008. IPL was established as a professional Twenty20 cricket league with the approval of the International Cricket Council (ICC). The format of the Twenty20 game is completely different from that of the usual one-day game. The most important difference is that Twenty20 is played over 20 overs per innings, with a bowler allowed a maximum of 4 overs, whereas the one-day game is played over 50 overs per innings, with a maximum of 10 overs for each bowler. Apart from the maximum number of overs, there are many other differences in terms of fielding restrictions, rules for time-outs and no-balls, rules in the event of a tie, etc. With all these changes, the Twenty20 game has become faster and more exciting.

Season 1 of the IPL Twenty20 (also known as DLF IPL 2008) was played in various cities of India between eight teams (Exhibit I). The season lasted for 45 days, in which 59 matches were played. On June 1st 2008, the final match was played between Rajasthan Royals and Chennai Super Kings at DY Patil Sports Academy in Mumbai. Under the captaincy of Shane Warne³, Rajasthan Royals defeated Chennai Super Kings by 3 wickets.

By the end of season 1, IPL Twenty20 had earned enormous popularity among viewers as well as cricketers across the world. BCCI had always been positive and certain about the popularity and acceptance of the Twenty20 format of the game and had already chalked out its plans for the season 2 (also known as IPL 2) games to be held in 2009. Initially, the IPL 2 games were planned to be held in India, but because the general elections in India were taking place at the same time, adequate security for the tournament could not be guaranteed by the Indian government. It was at this time, when doubts were cast on the future of IPL 2, that the Government of South Africa came to the rescue of BCCI and offered South Africa as the venue for the IPL 2 games.

With a lot of fanfare, IPL 2 was kick started in the city of Jo-

hannesburg, South Africa on April 18th 2009.

The Strategic Break

With the idea of bringing innovation and variety, BCCI decided to introduce two new rules in IPL 2. The first alteration was to the rule of bowl-out in the event of a tie. In IPL 1, in case of a tie, each team had to bowl five balls at unguarded wickets, and whichever team hit the wickets the greater number of times won. If both teams hit the wickets the same number of times after the first five balls per side, the bowling continued and the winner was decided by sudden death⁴. In IPL 2, the bowl-out rule was replaced with the super-over rule. In a super-over, each team nominates three batsmen and one bowler to play a one-over “mini-match”. Each side bats one over bowled by the one nominated opposition bowler. If the batting side loses two wickets, its innings is over. The side with the higher score from its over wins. If the teams finish tied on runs scored in that one over, the side with the higher number of sixes in its full innings and in the one-over eliminator is declared the winner. If the teams are still tied, the one with the higher number of fours in both innings wins.

The second alteration was the introduction of “the strategic break” in IPL 2. The strategic break is an official time-out of 7 minutes 30 seconds midway through the innings. The idea of the strategic break is to allow the teams to re-group tactically. During the time-out, the fielding team and the two batsmen may return to the dug-outs. It is the introduction of the strategic break that has given rise to a lot of controversy in season 2 of IPL. Players in general, and batsmen in particular, came down heavily on the idea of a time-out in the middle of the innings. They felt that this break hampers the momentum of the team. The media, on the other hand, criticized the introduction of the strategic break from a different point of view. They alleged that the strategic breaks serve the commercial interests of BCCI by earning more advertising revenue. The Chairman of IPL, Lalit Modi, rubbished the media allegations and explained the strategic break as an innovation brought into the game of cricket. He said that the concept of time-outs already existed in games like football and basketball and had simply been adapted to cricket. However, taking the concerns of the players into account, Modi gave an assurance that the idea of strategic breaks would be reassessed once the IPL 2 tournament is over.

Exhibit I: Team Players in IPL 2

In order to evaluate the idea of the strategic break, Modi will have to look at the performances of the teams before and after the strategic break. Exhibits II and III show the first innings and the second innings performances, respectively, of the 17 matches played so far in IPL 2. Can Modi reach a conclusion on whether the players’ claim that the strategic break hampers the momentum of the game is correct?

Exhibit II: Performance in the First Innings of Match

Match no.    Batted 1st     Score before strategy break    Score in next 5 overs after break
1            MI v CSK       64-1                           41-3
2            RCB v RR       57-4                           30-1
3            KXIP v DD      67-1                           37-6
4            KKR v DC       31-3                           33-2
5            CSK v RCB      106-0                          29-2
6            KXIP v KKR     67-3                           50-1
8            DC v RCB       91-2                           48-1
9            DD v CSK       90-3                           33-1
10           RR v KKR       78-4                           29-0
11           RCB v KXIP     71-3                           29-1
12           DC v MI        88-1                           49-3
14           RCB v DD       74-3                           26-1
15           KXIP v RR      60-4                           38-0
16           CSK v DD       88-2                           25-2
17           MI v KKR       111-0                          40-3

DNB = did not bat
Notes: Matches # 7 and 13 were abandoned without a ball being bowled.
In Match # 3, KXIP were allocated 12 overs and the strategy-break was taken after 6 overs. The figure after the strategy-break corresponds to their performance in the next 6 overs. Delhi Daredevils won the match in only 4.5 overs and there was no strategy-break in their innings.
In Match # 4, DC won in 13.1 overs. In Match # 6, KKR won in 9.2 overs.

Compiled by author

Exhibit III: Performance in the Second Innings of Match

Match no.    Batted 2nd     Score before strategy break    Score in next 5 overs after break
1            CSK v MI       70-3                           38-2
2            RR v RCB       32-5                           26-4
3            DD v KXIP      58-0                           DNB
4            DC v KKR       69-2                           35-0
5            RCB v CSK      56-5                           29-4
6            KKR v KXIP     79-1                           DNB
8            RCB v DC       57-3                           52-1
9            CSK v DD       106-2                          42-2
10           KKR v RR       67-3                           31-2
11           KXIP v RCB     80-1                           47-1
12           MI v DC        84-1                           24-3
14           DD v RCB       64-2                           35-1
15           RR v KXIP      48-6                           34-0
16           DD v CSK       85-2                           44-1
17           KKR v MI       70-2                           25-5

DNB = did not bat
Notes: Matches # 7 and 13 were abandoned without a ball being bowled.
In Match # 3, KXIP were allocated 12 overs and the strategy-break was taken after 6 overs. The figure after the strategy-break corresponds to their performance in the next 6 overs. Delhi Daredevils won the match in only 4.5 overs and there was no strategy-break in their innings.
In Match # 4, DC won in 13.1 overs. In Match # 6, KKR won in 9.2 overs.

Compiled by author
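One way to examine the players’ claim is to compare each team’s scoring rate before and after the break with a paired t-test. The sketch below is only an illustration under stated assumptions: it uses the first-innings data of Exhibit II, measures momentum by runs per over alone (wickets are ignored), takes the break to fall after 10 overs with 5 overs counted after it (6 and 6 overs in Match 3, as stated in the notes), and uses hypothetical variable names.

```python
from math import sqrt
from statistics import mean, stdev

# (match no., runs before break, overs before, runs after break, overs after)
first_innings = [
    (1, 64, 10, 41, 5), (2, 57, 10, 30, 5), (3, 67, 6, 37, 6),
    (4, 31, 10, 33, 5), (5, 106, 10, 29, 5), (6, 67, 10, 50, 5),
    (8, 91, 10, 48, 5), (9, 90, 10, 33, 5), (10, 78, 10, 29, 5),
    (11, 71, 10, 29, 5), (12, 88, 10, 49, 5), (14, 74, 10, 26, 5),
    (15, 60, 10, 38, 5), (16, 88, 10, 25, 5), (17, 111, 10, 40, 5),
]

# Change in run rate (runs per over): after the break minus before the break
diffs = [ra / oa - rb / ob for _, rb, ob, ra, oa in first_innings]

n = len(diffs)
d_bar = mean(diffs)
s_d = stdev(diffs)
t_stat = d_bar / (s_d / sqrt(n))

# Two-tailed 5% critical value of t with 14 degrees of freedom (from a t-table)
T_CRITICAL = 2.145

print(f"mean change in run rate = {d_bar:.2f}, t = {t_stat:.2f}")
print("Significant change" if abs(t_stat) > T_CRITICAL else "No significant change")
```

The same calculation can be repeated for the second innings (Exhibit III) after dropping the matches marked DNB; wickets lost before and after the break could be compared in the same paired fashion.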

SECTION 8

Case Study: Shoppers’ Stop Private Labels


This case study was written by Siva V Gabbita, Professor, IBS, Hyderabad. It is intended to be used as the basis for class discus-

sion rather than to illustrate either effective or ineffective handling of a management situation. The case was written from general-

ised experiences.

Shoppers’ Stop Private Labels

On October 27, 1991, the K. Raheja Corp. group of companies, one of India’s biggest hospitality and real estate players, crossed another milestone with the foundation of its lifestyle venture, Shoppers’ Stop. Shoppers Stop is today one of the leading retail stores in India.

From its inception when it began by operating a chain of de-

partmental stores Shoppers’ Stop has progressed from being

a single brand shop to becoming a Fashion & Lifestyle store

for the family. Shoppers’ Stop is now a household name,

known for its superior quality products, services and above

all, for providing a complete shopping experience.

Today Shoppers’ Stop has twenty six (26) stores across the country (and three stores under the name HomeStop), and over the years it has also begun operating a number of specialty stores, namely Crossword Bookstores, Mothercare, Brio, Desi Café and Arcelia.

Shoppers’ Stop has become a benchmark for the Indian retail

industry. In fact, the company’s continuing expansion plans

aim to help Shoppers’ Stop meet the challenges of the retail

industry in an even better manner than it does today.

Shoppers Stop retails a range of branded apparel and private

labels in apparel, footwear, fashion jewellery, leather products,

accessories and home products. These are complemented by

cafe, food, entertainment, personal care and various beauty

related services.

Shoppers Stop retails products of domestic and interna-

tional brands such as Louis Philippe, Pepe, Arrow, BIBA,

Gini & Jony, Carbon, Corelle, Magppie, Nike, Reebok,

LEGO, and Mattel.

Shoppers Stop retails merchandise under its own labels, such as STOP, Kashish, LIFE, Vettorio Fratini, Elliza Donatein, Acropolis, etc. The company is also the licensee for Austin Reed (London), an international brand whose men’s and women’s outerwear are retailed in India exclusively through the chain.

Retailers today understand the role that private label

brands play in long-term business strategy and marketing

strategy. Store brands play a significant role as part of the

marketing mix of retail chains. On the supply side effec-

tive category management enables retailers to optimize

supply chain relationships whereas on the demand side

strategic brand management works in tandem in each

aisle of each store. Well known national brands are avail-

able everywhere and are not store specific. Therefore the

retailer’s store brand portfolio has the advantage of obtain-

ing as well as providing synergies with well known

brands, which attracts customers to establish a relation-

ship with the franchise.

History of Private labels

Private label brands traditionally competed with well known brands in the same product category because their price-value proposition allowed them to be positioned as the “cheaper alternative”. As a result of such positioning, while they attracted consumer attention, they were also perceived as inferior in quality. However, retailers pushed private label products because they yielded high profit margins with minimum marketing effort.

Private labels therefore grew to provide competition to na-

tional brands. On the flip side the entire product category was

undermined by commoditization since they forced a price com-

petition erasing profit margins all around. Also, this cost-

based competition significantly reduced a focus on product dif-

ferentiation. Therefore all entities along the supply chain

missed the opportunities that existed for tapping latent con-

sumer needs which these categories sometimes had the abil-

ity to fulfill.

Private label success and Loyalty programs

In some cases however the reverse has been true – where

well-known brands have been unable to escape the innova-

tor’s dilemma. Store brands have succeeded in identifying cus-

tomer needs and have provided alternative value proposi-

tions. The success of private label brands also allowed for di-

versification into other product categories which were hitherto

dominated by the well-known brands. In this way the capacity

of private labels to provide value, visibility, consumer involve-

ment and therefore interest has exceeded that of the well

known brands.

More importantly, private labels have perhaps largely succeeded because retailers have focused on promoting them. Store brands have the advantage due to their potential for store association, whereas national brands are ubiquitous and therefore not store-specific. Retailers therefore use proprietary brands to draw people into their own stores. Binding the consumer favorably to the store is additionally driven through loyalty programs. Shoppers Stop has a loyalty program called First Citizen. It also offers a co-branded credit card with Citibank for its members.

Questions for Discussion

The Marketing Manager of SHOPPERS STOP wants to assess the popularity of one of its own store brands, STOP, against two well known brands, viz. John Players and Provogue. If resource rationalization demands that only a sample of 150 qualified consumers can be surveyed, can the brand preference (or lack of it) for the STOP brand over the other brands be established?
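A first step in judging whether 150 respondents are enough is to ask how precisely a preference share can be estimated from a sample of that size. The sketch below computes the worst-case margin of error for a proportion; it assumes simple random sampling and a 95% confidence level, neither of which is stated in the case, and the function name `margin_of_error` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(n, p=0.5, confidence=0.95):
    """Half-width of the normal-approximation confidence interval for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sqrt(p * (1 - p) / n)


# p = 0.5 gives the widest (most conservative) interval for any sample size
moe_150 = margin_of_error(150)
print(f"Worst-case margin of error with n = 150: +/- {moe_150 * 100:.1f} percentage points")
```

If the observed share of respondents preferring STOP differs from the share expected under no preference by more than this margin, the sample of 150 is large enough to detect that preference; smaller gaps would need a larger sample.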


SECTION 9

Case Study : Hindustan Foods

Hindustan Foods, a leading manufacturer of food products, recorded sales of Rs. 445.6 crores and a net income of Rs. 54.57 crores in 2003. The company manufactured fruitcakes, cookies, biscuits, confectionery and a variety of other food products including baby foods. The domestic confectionery market was loosely divided into seven categories: hard-boiled candies, toffees, éclairs, chewing gum, bubble gum, mints and lozenges. Hard-boiled candies occupied the largest share of this market. Hindustan Foods did not have a presence in this segment. It manufactured and marketed toffees as ‘Tasty Bite’ toffees, while in the chewing and bubble gum segment it had a significant presence with its ‘Fresh mint’ brand.

Hindustan Foods planned to enter the hard-boiled fruit candy segment under its Tasty Bite brand. The objective was to gain a significant presence and market share in a segment that was rapidly growing. The company wanted to test three new flavours for the proposed candy: strawberry, apricot and pineapple. Hindustan Foods also wanted to measure the impact of three different retail prices (50 paise, 75 paise and Re. 1) for the three flavours.

The company selected nine geographically separated stores as the test stores for the new flavours and different price points. These stores were similar with respect to Hindustan Foods’ confectionery sales and were located in neighbourhoods that had similar demographic characteristics. Because each of the three flavours was to be tested at each price, a total of nine different flavour-price combinations had to be tested.

Hindustan Foods arranged for the delivery of the three new

flavours across the stores. At the end of four weeks, the

company collected the unsold candy cases. It determined

the number of cases sold for each flavour at each price.

With the data so determined, Hindustan Foods wanted to

know if the difference in sales was due to the difference in

flavours and what effect the different prices had on sales.

Hindustan Foods’ Experimental Results

Number of Cases of New Flavours sold at Different Prices

PRICE       FLAVOUR
            STRAWBERRY    APRICOT    PINEAPPLE
50 paise    22            54         35
75 paise    24            45         32
Re 1        15            35         31
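With one observation for each flavour-price combination, the two questions (the effect of flavour and the effect of price) can be examined with a two-way analysis of variance without replication. The following is a minimal sketch of that calculation, assuming no interaction between flavour and price; the variable names are hypothetical and the data come from the table above.

```python
prices = ["50 paise", "75 paise", "Re 1"]
flavours = ["Strawberry", "Apricot", "Pineapple"]

# Cases sold: one row per price, one column per flavour
cases = [
    [22, 54, 35],
    [24, 45, 32],
    [15, 35, 31],
]

r, c = len(cases), len(cases[0])
grand_total = sum(sum(row) for row in cases)
correction = grand_total ** 2 / (r * c)  # correction factor T^2 / N

ss_total = sum(x * x for row in cases for x in row) - correction
ss_price = sum(sum(row) ** 2 for row in cases) / c - correction
ss_flavour = sum(sum(col) ** 2 for col in zip(*cases)) / r - correction
ss_error = ss_total - ss_price - ss_flavour

df_price, df_flavour = r - 1, c - 1
df_error = df_price * df_flavour

ms_error = ss_error / df_error
f_price = (ss_price / df_price) / ms_error
f_flavour = (ss_flavour / df_flavour) / ms_error

# 5% critical value of F with (2, 4) degrees of freedom (from an F-table)
F_CRITICAL = 6.94

print(f"F (price)   = {f_price:.2f}")
print(f"F (flavour) = {f_flavour:.2f}")
print(f"Critical F(2, 4) at 5% = {F_CRITICAL}")
```

An F ratio above the critical value indicates that the corresponding factor has a statistically significant effect on the number of cases sold.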

SECTION 10

Case Study : A Study of Soap Segment in Indian FMCG Market


This case study was written by P Sashikala, Professor (Department of Decision sciences), IBS, Hyderabad. It is intended to be

used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situation.

The case was written from generalized experiences.

Across the globe, the significance of sales promotion in the marketing mix of the Fast Moving Consumer Goods (FMCG) industry has been increasing day by day. In times of slowdown or recession in the economy, marketers may depend on sales promotion techniques to boost consumer demand. Sales promotions generally act directly on the decision and purchasing stages of the buying process. Thus, they may affect the consumer’s buying pattern directly, producing immediate results. Generally, sales promotion is a tool for boosting sales in the FMCG sector, as certain products are price sensitive. With various brands of FMCG in the market, the emphasis on sales promotion in India has increased by over 500%–600% from 2000 to 2008. It is estimated that the marketing companies have spent about INR 5,000 crore (approximately $1054 million) on sales promotion. However, the use of techniques to improve sales in the FMCG sector requires manufacturers to understand consumer perceptions, attitudes and preferences while channelling their sales promotional efforts. These efforts should aim at building product awareness, creating interest in the product, stimulating demand by convincing the customers and reinforcing the brand among the customers.

Generally, in the FMCG sector, especially in a vast market like

India, the consumer may switch from one product to other

based on the promotional offers. However, all sales promo-

tional techniques may not have the same impact on all the con-

sumers alike.

Indian FMCG Market

The Indian FMCG sector is the fourth largest sector in the economy, with a total market size of about $13.1 billion in 2007. It has a strong presence of MNCs. It is also characterized by a well-established distribution network, intense competition between the organized and unorganized segments, and low operational costs. The FMCG market is also leveraging the rural market segments.

Exhibit I: FMCG Category and Products

CATEGORY              PRODUCTS
Household care        Fabric wash (laundry soaps and synthetic detergents); household cleaners (dish/utensil cleaners, floor cleaners, toilet cleaners, air fresheners, insecticides and mosquito repellents, metal polish and furniture polish).
Food and Beverages    Health beverages; soft drinks; staples/cereals; bakery products (biscuits, bread, cakes); snack food; chocolates; ice cream; tea; coffee; processed fruits, vegetables; dairy products; bottled water; branded flour; branded rice; branded sugar; juices etc.
Personal Care         Oral care, hair care, skin care, personal wash (soaps); cosmetics and toiletries; deodorants; perfumes; feminine hygiene; paper products

The total number of rural households is expected to rise from 135 million in 2001-2002 to 153 million in 2009-2010, which presents the largest potential market in the world. The FMCG market is estimated to increase from $11.6 billion in 2003 to $33.4 billion in 2015. The penetration level and the per capita consumption in India for most products, such as toothpaste, skin care and hair care, are low, which indicates an untapped market. For example, the per capita consumption of toilet or bathing soap in the country is 800 gm, whereas it is 6.5 kg in the US, 4 kg in China and 2.5 kg in Indonesia.5

With a burgeoning Indian population, particularly the middle class and the rural segments, manufacturers have an opportunity to convert consumers to more and more branded products in the FMCG segment.6 The following table (Exhibit I) gives an overview of FMCG categories and products.

In 2004, the size of the personal wash products market was estimated at US$ 989 million, hair care products at US$ 831 million and oral care products at US$ 537 million. While the overall personal wash market is growing at one per cent, premium and middle-end soaps are growing at a rate of 10 per cent. The leading players in this market are HLL, Nirma, Godrej Soaps and P&G. The production status of the Indian FMCG industry (in 2004) is given in the table below (Exhibit II).

Brief on Soap Market

According to Pradipta (2007), the soaps segment is one of the biggest FMCG categories in the country. Bathing and toilet soaps contribute around 30% to the soaps market. There are 38 companies in India manufacturing soaps. Major players include HUL, Reckitt Benckiser, Godrej Consumer Products, Henkel Spic, Procter & Gamble and Nirma.

Some of the major brands in the soap segment are Lux, Hamam and Lifebuoy (HUL); Cinthol, Shikakai and Godrej No. 1 (GCPL); Camay (P&G); and Dettol (Reckitt Benckiser). The present approximate size of the branded soap market is around INR 7,500 crore (approximately $1,581.8 million). With increasing competition, this sector is expected to register 20% growth in 2009, despite the economic downturn.

According to industry estimates, HUL led the market with a 46.7% share in 2007, with brands including Lifebuoy, Lux, Rexona, Breeze and Hamam. After HUL come Nirma and Godrej with their respective brands. The medicated soap brands include Dettol and Margo. Another major player in the FMCG sector is P&G, which has a portfolio of products in the healthcare, feminine care, hair care and fabric care businesses.

In the light of intense competition and companies offering sales promotions, the author is interested in conducting a study of the soap market. The objective is to ascertain consumers’ preferences towards various sales promotion offers such as discount on market price, buy 2 get 1 free, contests/games and lucky draws, and surprise gifts/coupons. A brief description of the promotional offers is given below.

It should be noted that not all offers are available at the same time.

Price Discounts

Exhibit II: Product-Wise Production (2004)

| Segment | Unit | Size | Key players | Share of market holder (%) |
|---|---|---|---|---|
| Household care | | | | 62 |
| Fabric wash market | Mn tonnes | 50 | HLL, P&G, Nirma, SPIC | 38 |
| Laundry soaps/bars | US$ mn | 1102 | | |
| Detergent cakes | Mn tonnes | 15 | | |
| Washing powder | Mn tonnes | 26 | | |
| Dish wash | US$ mn | 93 | HLL | 59 |
| Personal care | | | | 58 |
| Soap & toiletries | Mn tonnes | 60 | HLL, Nirma, Godrej | |
| Personal wash market | US$ mn | 989 | HLL, Nirma, Godrej | |
| Oral care | US$ mn | 537 | Colgate Palmolive, HLL | 40 |
| Skin care & cosmetics | US$ mn | 274 | HLL, Dabur, P&G | 58 |
| Hair care | US$ mn | 831 | Marico, HLL, CavinKare, Procter & Gamble, Dabur, Godrej | 54 |
| Feminine hygiene | US$ mn | 44 | Procter & Gamble, Johnson & Johnson | |
| Food and beverages | | | | |
| Bakery products | Mn tonnes | 30 | Britannia, Parle, ITC | |
| Tea | 000 tonnes | 870 | HLL, Tata Tea | 31 |
| Coffee | 000 tonnes | 20 | Nestle, HLL, Tata Tea | 49* |
| Mineral water | Mn tonnes | 65 | Parle, Bisleri, Parle Agro, Coca Cola, Pepsi | |
| Soft drink | Mn crates | 284 | Coca Cola, Pepsi | |
| Branded atta | 000 tonnes | 750 | Pillsbury, HLL, Agro Tech, Nature Fresh, ITC | 15 |
| Health beverages | 000 tonnes | 120 | SmithKline Beecham, Cadbury, Nestle, Amul | |
| Milk and dairy products | US$ mn | 653 | Amul, Britannia, Nestle | |
| Chocolates | US$ mn | 174 | Cadbury, Nestle | |
| Culinary products | Mn tonnes | 326 | HLL, Nestle | 78 |
| Edible oil | Mn tonnes | 13 | Ruchi Soya, Marico, ITC, Agro Tech | 28 |

The study is mainly intended to analyse the overall effect of various sales promotions on consumer buying decisions. A suitable sample is selected for the study and data are collected through a balanced and unbiased questionnaire. The study attempts to examine customers’ preferences among the aforesaid promotional offers. The questionnaire is administered to 250 respondents in the age groups of 15-25, 25-35 and 35-45 years. Respondents are asked to give their preference towards various sales schemes offered with soaps. Consumer perceptions are measured on a preference scale of 1 to 5, with ‘1’ being ‘Not preferred’ and ‘5’ being ‘Most preferred’. There can be respondents who do not prefer a certain type of promotional offer. Another question put to the respondents is whether they prefer to buy existing products (stick to their brand) or new products (shift brands).

Results of Analysis of Data Collected

It is observed in the research study that as many as 70% of the consumers prefer to buy one soap at a time, while the other 30% prefer to buy more than one or a multiple-pack.

Out of 250 respondents, 100 preferred buying new products and 150 preferred existing products.

It is also observed that out of the 100 respondents who preferred buying new products, 25 are in the age group of 15-25, 45 in 25-35 and 30 in 35-45. The corresponding figures for the 150 who preferred existing products are 15, 75 and 60 respectively.
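The case leaves the choice of technique to the reader. One possible approach, assumed here rather than prescribed by the case, is a chi-square test of independence between age group and product preference. The sketch below uses only the counts quoted above; the critical value 5.991 is the standard chi-square table value for 2 degrees of freedom at the 5% level.

```python
# Observed counts from the survey: rows are product preference,
# columns are the age groups 15-25, 25-35 and 35-45.
observed = {
    "new products": [25, 45, 30],
    "existing products": [15, 75, 60],
}


def chi_square_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for a two-way table."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for r, row in enumerate(rows):
        for c, obs in enumerate(row):
            # Expected count if age group and preference were independent.
            expected = row_totals[r] * col_totals[c] / grand_total
            chi2 += (obs - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return chi2, dof


chi2, dof = chi_square_statistic(observed)
CRITICAL_5_PERCENT_DF2 = 5.991  # chi-square table value, 2 d.f., alpha = 0.05
print(f"chi-square = {chi2:.3f} with {dof} d.f.")
print("association at the 5% level:", chi2 > CRITICAL_5_PERCENT_DF2)
```

The same function could be applied to the other cross-classifications reported in the case (for example, gender against preference for promotional offers), provided the appropriate critical value is looked up for the resulting degrees of freedom.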

Out of all 250 respondents, 180 preferred promotional offers. Of these 180, 30 are in the age group of 15-25, 95 in 25-35 and 55 in 35-45; the corresponding figures for those who do not prefer a particular type of promotional offer are 20, 20 and 30.

Out of the 100 who preferred buying new products, 80 preferred a promotional offer on the new product and 20 did not prefer a particular type of promotional offer. An offer on a new product gives them a feeling of either low quality or of the product not doing well overall.

Out of 250 respondents, 125 are male and 125 are female. It was also observed that 85 of the male respondents prefer promotional offers whereas 40 do not, and 95 of the female respondents prefer promotional offers whereas 30 do not prefer a certain type of promotional offer. Out of the 150 who preferred existing products, 80 are male and the rest are female; out of the 100 who preferred new products, 45 are male and the rest are female.

On analyzing the promotional offers in more depth, it is observed that consumers’ preference for cash discounts is greater than for any other type of promotional technique, including buy-two-get-one-free, contests and lucky draws, and surprise gifts and gift hampers (Exhibit III).

Questions for Discussion

Do you expect the influence of the various sales promotion offers on consumers’ purchasing decisions to be the same? If not, why? What are the managerial implications of your answer?

If you were the regional marketing manager of a company, would you go for a promotional offer while launching a new product?

Based on the data collected in the present study, would you, as a manager, take factors like gender and age into consideration while deciding on the promotional offer?

Exhibit III

Most Favored Sales Promotional Measures


SECTION 11

Case Study: Melting Delicacies (A)


This case study was written by Sushama Marathe, Professor (Department of Decision Sciences), IBS, Hyderabad. It is intended to be used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situation. The case was written from generalized experiences.

In contemporary India, we see ice cream chains located in cities all over the country. Their menus offer a variety of ice cream flavors, ice cream sundaes and banana splits. People like them because they can get a quick ice cream or dessert at a reasonable price and with ease. When you open a chain store, you are cashing in on the name of the franchise. People know that whether your store is in Mumbai, Chennai or Hyderabad, they will get the same service, variety and menu. This is why you take a franchise, but the downside is that there are tight regulations on franchisees.

A franchise, in the context of small business, is a business one can buy into for a fee. Most fast food chains are made up of people buying the rights and territory of that chain for a fee. Owning a franchise means not only having a recognized trademark but also benefiting from the parent company’s advertising. Most of the time the parent company will issue sales fliers, coupons and TV ads as part of the franchise fee1. Franchises are not limited to food and eateries but cover all sorts of products, including brick-and-mortar types of businesses.

The Indian tropical climate is right for ice cream consumption. Compared with many other countries, India has a very low per capita consumption of ice cream, even if we look only at the middle class and above. This indicates that there is a large untapped market potential. An industry snapshot of the Indian ice cream market reads: “Market size: Rs. 800 crores, with the market growing at 10 to 12%”.

Maria Fernandez, a young entrepreneur in her late twenties, had taken a franchise for a retail ice cream chain, “Melting Delicacies”. She entered into a contract for a chain of five outlets in Hyderabad, a metropolitan city and capital of the state of Andhra Pradesh in southern India. Within a year of starting this venture, she was making good profits. The quality and variety of the ice creams, prompt service, polite service personnel and ambience were the factors behind this instant success. “Melting Delicacies” was renowned for its exotic flavors and reasonable prices. The outlets offered Mango Mawa ice cream, which looked like a cake with silver foil on it; Dry Fruit ice cream with almonds, pista and figs with a dash of saffron; Vanilla ice cream with a base of cream and cashew nuts; Mango Rich Duet; and Custard Apple ice cream. These ice creams were priced anywhere between Rs 25 and Rs 35 per scoop/cup/plate.3 The outlets also had a variety of sundaes and banana splits, but the five ice cream flavors were the hot favorites. The “Melting Delicacies” outlets were frequented by young and old alike with the same enthusiasm and passion.

All five outlets had their own distinctive surroundings. One of the outlets was located in the Cyber City area, where all the IT giants and multinational companies (MNCs) had their offices. This joint was flooded with young people on weekdays and was relatively less crowded on weekends. Another location, near a boating facility, recreation center and amusement park in the heart of the city, was so overcrowded on weekends that many customers had to leave disappointed. The third location was in an institutional area and was surrounded mainly by women’s educational institutions and colleges. It was observed that this joint had a heavy demand for the Mango Duet flavor. One more was in an elite residential locality and the fifth was in a sprawling shopping mall which also housed a multiplex.

In the market there was tough competition from chains like Baskin Robbins, Havmor, Naturals and the local chain Dairy Cream. Recently some of these competitors had opened outlets in close proximity to “Melting Delicacies”. Maria knew she had to be on close guard and avoid stock-outs of the most favored flavors, delays in service due to a shortage of service personnel, and a lack of space for sit-and-eat clients. These would only mean loss of customers and business opportunities. The popularity of this chain of outlets had grown tremendously over the past year. With the growing popularity of the chain and tough competition in the market, any slip in decision making would only mean trouble for the business. As CEO, she felt that at this point, knowing the ground realities on trends, patterns and associations between the products and the consumers was essential for making the right decisions, from inventory planning to human resource and marketing management.

She discussed this with the young managers in her outlets. They said that intuitively they felt that there were preferences and associations, but they were not certain about it. To verify this, they needed documented data. She assigned the task of identifying and gathering relevant information from the outlets, which would support informed decisions for better performance of her business, to one of the management trainees working in her office.

After a series of discussions with the outlet managers and Maria, and after getting clarity on the objective, he outlined a study which required data collection.

The outlets had both “sit ‘n’ eat” (in-store) and take-away facilities. Take-away was available in packs of 0.5 litre, 1 litre and 2 litres. For the in-store service, the requested flavor was served and charged per scoop. As collecting information on the preferred/consumed flavor and the gender or age of the customer was not possible for take-away sales, this information was documented for the sit-and-eat customers only. A quick recording of the day of the week, gender, age and flavor of the ice cream consumed was done at all the outlets. Other essential information was also appropriately documented.

Many consolidated tables were generated from the data. Three such consolidated tables are given in Exhibits I, II and III.


Exhibit I
Number of Scoops Consumed by In-Store Consumers*

| Flavor choice | Weekday male | Weekday female | Weekday total | Weekend male | Weekend female | Weekend total |
|---|---|---|---|---|---|---|
| Mango Mawa | 100 | 75 | 175 | 45 | 30 | 75 |
| Mango Duet | 75 | 150 | 225 | 60 | 65 | 125 |
| Vanilla Cashew | 95 | 55 | 150 | 20 | 30 | 50 |
| Custard Apple | 60 | 90 | 150 | 50 | 50 | 100 |
| Dry Fruit | 40 | 60 | 100 | 10 | 40 | 50 |
| Total | 370 | 430 | 800 | 185 | 215 | 400 |

Weekdays are Monday to Friday; weekends are Saturday and Sunday.
* Data for one week from one location only

Exhibit II
Traffic Density at the “Sit and Eat”* Venue by Gender

| Location | Weekday male | Weekday female | Weekday total | Weekend male | Weekend female | Weekend total |
|---|---|---|---|---|---|---|
| Cyber City | 550 | 500 | 1050 | 150 | 125 | 275 |
| Institutional area | 500 | 1000 | 1500 | 175 | 150 | 325 |
| NTR Gardens | 375 | 350 | 725 | 650 | 675 | 1325 |
| Banjara Hills | 375 | 425 | 800 | 185 | 215 | 400 |
| Central Mall | 300 | 475 | 775 | 575 | 675 | 1250 |
| Total | 2100 | 2750 | 4850 | 1735 | 1840 | 3575 |

Weekdays are Monday to Friday; weekends are Saturday and Sunday.
* Data for one week from one location only

Exhibit III
Number of Scoops Sold on One Particular Day

| Location | Mango Mawa | Mango Duet | Vanilla Cashew | Dry Fruit | Custard Apple | Total |
|---|---|---|---|---|---|---|
| Cyber City | 25 | 35 | 15 | 15 | 15 | 275 |
| Institutional area | 35 | 20 | 15 | 10 | 20 | 325 |
| NTR Gardens | 100 | 150 | 50 | 60 | 85 | 1325 |
| Banjara Hills | 85 | 75 | 65 | 60 | 80 | 400 |
| Central Mall | 50 | 175 | 75 | 65 | 125 | 1250 |
| Total | 295 | 455 | 220 | 210 | 325 | 3575 |

* Data for one week from one location only

CHAPTER 9

Analysis of Variance (ANOVA)

In this chapter we will discuss

Analysis of Variance

Assumptions and Basics of ANOVA

Applying ANOVA to the Emoluments Problem

Multiple Comparisons

ANOVA in Practice

Section 1

What is ANOVA?

Analysis of Variance (ANOVA) is a statistical technique that allows us to test whether the differences observed among more than two sample means are significant or not. In other words, our concern is whether the samples come from the same population or not. This is a generalization of the test of significance between the means drawn from two populations. Managers are often required to test the significance of the differences among the means drawn from more than two populations. Several applications of ANOVA can be seen:

A transport company would like to compare the mileage given by different brands of tyres.

A fertilizer company would like to compare the effectiveness of different fertilizers on productivity.

An engineering company would like to compare the machine productivities for machines producing the same products.

In general we have a response variable (or dependent variable). We then collect data to decide whether one or more factors (or independent variables) influence the response variable. In many cases, the classes or categories may be predefined and we have to take them as given. For instance, while comparing the average heights of different ethnic groups, the ethnic groups are taken as given and we observe samples from each group.

Another type of situation is when the influencing factor (the independent variable, also called the treatment) is in our control and we experimentally manipulate it. For instance, a pharmaceutical company which has developed three different drugs for treating a disease may consciously conduct an experiment in which the affected people are divided into four groups (one for each drug and one for placebo application) and each group is administered its treatment for a period, after which the responses can be observed. Clearly, the assignment of patients to the drug/treatment should be done randomly. This is a typical situation of design of experiments, and the particular approach indicated here is referred to as a completely randomized design. The ultimate interest here is to compare the mean effectiveness of the drugs and the placebo on the four groups.
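The random assignment step of a completely randomized design can be sketched in a few lines. This is a minimal illustration only: the patient identifiers, the treatment labels, the group size of five and the helper name `completely_randomized_design` are all hypothetical and do not come from the text.

```python
import random


def completely_randomized_design(subjects, treatments, seed=None):
    """Randomly split subjects into one group per treatment.

    Group sizes differ by at most one. The seed is optional and only
    makes the illustration reproducible.
    """
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups = {treatment: [] for treatment in treatments}
    for index, subject in enumerate(shuffled):
        # Deal the shuffled subjects out to the treatments in turn.
        groups[treatments[index % len(treatments)]].append(subject)
    return groups


patients = [f"patient_{i:02d}" for i in range(1, 21)]   # hypothetical identifiers
treatments = ["drug A", "drug B", "drug C", "placebo"]  # hypothetical labels
assignment = completely_randomized_design(patients, treatments, seed=42)
for treatment, group in assignment.items():
    print(treatment, len(group), sorted(group))
```

Because every patient is equally likely to land in any group, systematic differences between the groups other than the treatment itself are left to chance, which is what the F test in ANOVA accounts for.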


The analysis tool for both the ethnic group example and the pharmaceutical example will be one-way ANOVA; one-way because we are observing the impact of only one factor/treatment in these examples. This suggests that we can have two-way or, in general, m-way ANOVA when we have more than one influencing factor/treatment under consideration. For instance, in the pharmaceutical example one may be interested in studying the effectiveness of the drugs in relation to the age groups of the patients, as prima facie the drugs are expected to impact different age groups differently. Thus age group emerges as another factor, besides drugs.

Let us consider an example.

Example: Emoluments Comparability

From four premier institutes, 6, 7, 8 and 8 management graduates respectively were selected. The amounts (in Rupees lakhs) they were offered as annual emoluments during their placement are shown below in Table 9.1.1.

Can we say that, on average, graduates of all the institutions are being offered the same emoluments, or are some institutions preferred over the others?

Figure 9.1.1: ANOVA

Table 9.1.1. Table B (annual emoluments offered, in Rupees lakhs)

| Institute 1 | Institute 2 | Institute 3 | Institute 4 |
|---|---|---|---|
| 11 | 8 | 10 | 7.75 |
| 12 | 9 | 11 | 8.25 |
| 9 | 9.5 | 10.5 | 8.75 |
| 10.5 | 9.75 | 10.25 | 9 |
| 11.5 | 10 | 10.75 | 9.5 |
| 12 | 10.25 | 9.75 | 10 |
| | 10.5 | 9 | 10.5 |
| | | 8.5 | 11 |

Section 2

Assumptions and Basics of ANOVA

Assumptions

The various populations from which the samples are

drawn should be normal and have equal variances. The

requirement of normality can be relaxed if the sample

sizes are large enough.

The samples under each class/treatment are drawn

randomly and independently.

Basics of ANOVA

Let there be n sample observations on a random variable X divided into k classes on the basis of some criteria or factors or exposed treatments.

Let

$n_i$ = number of observations in the $i$th class (say, treated with the $i$th fertilizer)

$n$ = total number of observations = $\sum_{i=1}^{k} n_i$

$X_{ij}$ = $j$th observation from the $i$th class, $i = 1, 2, \ldots, k$; $j = 1, 2, \ldots, n_i$

$k$ = number of classes/treatments

$T_i = \sum_{j=1}^{n_i} X_{ij}$ (summed over $j$) and $\bar{X}_i = T_i / n_i$

The sample data structure would look as follows in table 9.1.1:

Table 9.1.1. Sample Data Structure

| Classes/Treatments | Sample observations | Total | Mean |
|---|---|---|---|
| 1 | $X_{11}, X_{12}, \ldots, X_{1n_1}$ | $T_1$ | $\bar{X}_1$ |
| 2 | $X_{21}, X_{22}, \ldots, X_{2n_2}$ | $T_2$ | $\bar{X}_2$ |
| ... | | | |
| i | $X_{i1}, X_{i2}, \ldots, X_{in_i}$ | $T_i$ | $\bar{X}_i$ |
| ... | | | |
| k | $X_{k1}, X_{k2}, \ldots, X_{kn_k}$ | $T_k$ | $\bar{X}_k$ |

We wish to test the following hypothesis:

Null Hypothesis: $H_0: \mu_1 = \mu_2 = \mu_3 = \cdots = \mu_k$, i.e., all the means are equal.

Alternate Hypothesis: $H_1$: Not all means are equal, i.e., at least two means are different.

We have two methods to test the above hypotheses using

ANOVA. While conceptually both methods are the same, the

second method is convenient for manual computation.

Method 1:

Step 1: Compute the means and sum of squared deviations for each class by the formulae:

$\bar{X}_i = \frac{T_i}{n_i} = \frac{1}{n_i}\sum_{j=1}^{n_i} X_{ij}$ and $S_i^2 = \sum_{j=1}^{n_i} (X_{ij} - \bar{X}_i)^2$

Also compute the grand mean of all the data observations in the k classes by the formula:

$\bar{X} = \frac{1}{n}\sum_{i=1}^{k}\sum_{j=1}^{n_i} X_{ij}$

Step 2: Obtain the Between Classes Sum of Squares (BSS) by the formula:

$BSS = \sum_{i=1}^{k} n_i (\bar{X}_i - \bar{X})^2$

Step 3: Obtain the Between Classes Mean Sum of Squares (MBSS):

$MBSS = \frac{BSS}{k-1}$

Step 4: Obtain the Within Classes Sum of Squares (WSS) by the formula:

$WSS = \sum_{i=1}^{k} S_i^2$

Step 5: Obtain the Within Classes Mean Sum of Squares (MWSS):

$MWSS = \frac{WSS}{n-k}$

Step 6: Obtain the test statistic F or Variance Ratio (V.R):

$F = \frac{MBSS}{MWSS}$, which follows the F distribution with $(k-1, n-k)$ degrees of freedom under $H_0$.

Step 7: Reject $H_0$ if $F > F(k-1, n-k, \alpha)$, where $\alpha$ is the desired level of significance.

Method 2:

Step 1: Compute $G = \sum_{i=1}^{k}\sum_{j=1}^{n_i} X_{ij}$ = Grand total of all the observations

Step 2: Compute the Correction Factor $CF = \frac{G^2}{n}$, where $n$ is the total number of observations.

Step 3: Compute the Raw Sum of Squares $RSS = \sum_{i=1}^{k}\sum_{j=1}^{n_i} X_{ij}^2$

Step 4: Total Sum of Squares $TSS = RSS - CF$

Step 5: Compute $T_i$ = the sum of all the observations in the $i$th class ($i = 1, 2, \ldots, k$)

Step 6: Between Classes (or Treatment) S.S $BSS = \sum_{i=1}^{k} \frac{T_i^2}{n_i} - CF$

Step 7: Within Classes or Error S.S (WSS) = Total S.S - Between Classes S.S

Step 8: Now follow steps 3, 5, 6 and 7 of Method 1.

These calculations are much simpler than those in the previous method. The computations from either method can be summarized in a tabular form, as shown in Table 9.1.2.

For the emoluments example, which is worked out in Section 3, F(critical) = F(3, 25, 0.05) = 2.99. Since the computed F > F(critical), we reject the null hypothesis of equality of emoluments across the institutions.

Table 9.1.2. ANOVA Table for one way classification

| Sources of variation | Sum of squares | d.f. | Mean sum of squares | Variance ratio (F) |
|---|---|---|---|---|
| Treatments (between classes) | BSS | k-1 | MBSS | $F = \frac{MBSS}{MWSS} \sim F(k-1, n-k)$ |
| Error (within classes) | WSS | n-k | MWSS | |
| Total | TSS | n-1 | | |
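The two computational methods can be sketched side by side to confirm that they yield the same sums of squares. The data are the emoluments from the example in Section 1; the function names `anova_method_1`, `anova_method_2` and `f_ratio` are illustrative choices, not names used in the text.

```python
# Emoluments (Rs. lakhs) offered to graduates of the four institutes.
samples = [
    [11, 12, 9, 10.5, 11.5, 12],
    [8, 9, 9.5, 9.75, 10, 10.25, 10.5],
    [10, 11, 10.5, 10.25, 10.75, 9.75, 9, 8.5],
    [7.75, 8.25, 8.75, 9, 9.5, 10, 10.5, 11],
]


def anova_method_1(groups):
    """BSS and WSS from class means and squared deviations (Method 1)."""
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    bss = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    wss = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return bss, wss


def anova_method_2(groups):
    """BSS and WSS from totals and the correction factor (Method 2)."""
    n = sum(len(g) for g in groups)
    grand_total = sum(sum(g) for g in groups)
    correction_factor = grand_total ** 2 / n
    raw_ss = sum(x * x for g in groups for x in g)
    tss = raw_ss - correction_factor
    bss = sum(sum(g) ** 2 / len(g) for g in groups) - correction_factor
    return bss, tss - bss


def f_ratio(bss, wss, n, k):
    """Variance ratio F = MBSS / MWSS."""
    return (bss / (k - 1)) / (wss / (n - k))


n = sum(len(g) for g in samples)
k = len(samples)
bss1, wss1 = anova_method_1(samples)
bss2, wss2 = anova_method_2(samples)
print(f"Method 1: BSS = {bss1:.3f}, WSS = {wss1:.3f}")
print(f"Method 2: BSS = {bss2:.3f}, WSS = {wss2:.3f}")
print(f"F = {f_ratio(bss1, wss1, n, k):.3f}")
```

Method 2 avoids computing deviations from the means, which is why it is more convenient by hand; on a computer either form works, although Method 1 is less prone to rounding problems when the observations are large relative to their spread.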

Section 3

Applying ANOVA to the Emoluments Problem

Here, we have:

n = 29, k = 4, $n_1$ = 6, $n_2$ = 7, $n_3$ = 8 and $n_4$ = 8.

Using Method 1: On computation we get,

$T_1$ = 66.00, $T_2$ = 67.00, $T_3$ = 79.75, $T_4$ = 74.75

and the grand mean $\bar{X}$ = 9.913

To compute $S_1^2$, $S_2^2$, $S_3^2$ and $S_4^2$ we compute the following table 9.1.1.

With the help of the computations in the table we get:

$S_1^2$ = 6.5, $S_2^2$ = 4.339, $S_3^2$ = 5.180 and $S_4^2$ = 8.742.

From here,

WSS = $\sum S_i^2$ = 24.761 and MWSS = WSS/(n-k) = 24.761/(29-4) = 0.990

To obtain BSS and MBSS, we compute the following in table 9.1.2.

From here, BSS = 10.523 and MBSS = BSS/(k-1) = 10.523/(4-1) = 3.508

The F ratio comes to F = MBSS / MWSS = 3.508/0.990 = 3.542 (computed from unrounded values)

Here, F(critical) = F(3, 25, 0.05) = 2.99

Since computed F > F(critical), we reject the null hypothesis of equality of emoluments across the institutions. The ANOVA results may be summed up in a tabular form as shown in the Table 9.1.3.

Table 9.1.1

| $X_{1i}$ | $X_{1i} - \bar{X}_1$ | Square | $X_{2i}$ | $X_{2i} - \bar{X}_2$ | Square | $X_{3i}$ | $X_{3i} - \bar{X}_3$ | Square | $X_{4i}$ | $X_{4i} - \bar{X}_4$ | Square |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 11.000 | 0.000 | 0.000 | 8.000 | -1.571 | 2.469 | 10.000 | 0.031 | 0.001 | 7.750 | -1.594 | 2.540 |
| 12.000 | 1.000 | 1.000 | 9.000 | -0.571 | 0.327 | 11.000 | 1.031 | 1.063 | 8.250 | -1.094 | 1.196 |
| 9.000 | -2.000 | 4.000 | 9.500 | -0.071 | 0.005 | 10.500 | 0.531 | 0.282 | 8.750 | -0.594 | 0.353 |
| 10.500 | -0.500 | 0.250 | 9.750 | 0.179 | 0.032 | 10.250 | 0.281 | 0.079 | 9.000 | -0.344 | 0.118 |
| 11.500 | 0.500 | 0.250 | 10.000 | 0.429 | 0.184 | 10.750 | 0.781 | 0.610 | 9.500 | 0.156 | 0.024 |
| 12.000 | 1.000 | 1.000 | 10.250 | 0.679 | 0.460 | 9.750 | -0.219 | 0.048 | 10.000 | 0.656 | 0.431 |
| | | | 10.500 | 0.929 | 0.862 | 9.000 | -0.969 | 0.938 | 10.500 | 1.156 | 1.337 |
| | | | | | | 8.500 | -1.469 | 2.157 | 11.000 | 1.656 | 2.743 |
| Total | | 6.5 | | | 4.3393 | | | 5.1797 | | | 8.7422 |

WSS = 24.761

Table 9.1.2.

| Institute | $n_i$ | $\bar{X}_i$ | $\bar{X}$ | Diff. | Diff. sq. | $n_i$ * Diff. sq. |
|---|---|---|---|---|---|---|
| 1 | 6 | 11.000 | 9.913 | 1.087 | 1.182 | 7.089 |
| 2 | 7 | 9.571 | 9.913 | -0.342 | 0.117 | 0.817 |
| 3 | 8 | 9.969 | 9.913 | 0.056 | 0.003 | 0.025 |
| 4 | 8 | 9.344 | 9.913 | -0.569 | 0.324 | 2.592 |

BSS = 10.523, MBSS = 3.508, F = 3.542

Table 9.1.3: ANOVA Table for one way classification

| Source of variation | SS | df | MS | F | P-value | F crit |
|---|---|---|---|---|---|---|
| Between groups | 10.523 | 3 | 3.508 | 3.542 | 0.029 | 2.991 |
| Within groups | 24.761 | 25 | 0.990 | | | |
| Total | 35.284 | 28 | | | | |
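The P-value and F crit entries of Table 9.1.3 can be cross-checked without statistical tables. The sketch below is one possible approach, not the method used to produce the table: it integrates the F density numerically with Simpson's rule. The function names `f_pdf` and `f_upper_tail` and the number of intervals are illustrative choices.

```python
import math


def f_pdf(x, d1, d2):
    """Density of the F distribution with (d1, d2) d.f.; assumes d1 > 2."""
    if x <= 0:
        return 0.0
    log_pdf = (
        math.lgamma((d1 + d2) / 2) - math.lgamma(d1 / 2) - math.lgamma(d2 / 2)
        + (d1 / 2) * math.log(d1 / d2)
        + (d1 / 2 - 1) * math.log(x)
        - ((d1 + d2) / 2) * math.log(1 + d1 * x / d2)
    )
    return math.exp(log_pdf)


def f_upper_tail(f_value, d1, d2, intervals=20000):
    """P(F > f_value): one minus the Simpson's-rule area from 0 to f_value."""
    h = f_value / intervals  # intervals must be even for Simpson's rule
    total = f_pdf(0.0, d1, d2) + f_pdf(f_value, d1, d2)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return 1.0 - total * h / 3


p_value = f_upper_tail(3.542, 3, 25)       # computed F from Table 9.1.3
tail_at_crit = f_upper_tail(2.991, 3, 25)  # should be close to alpha = 0.05
print(f"P-value = {p_value:.3f}")
print(f"Tail area beyond F crit = {tail_at_crit:.3f}")
```

Since the P-value is below 0.05, this agrees with the decision reached by comparing the computed F with F(critical).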

Section 4

Multiple Comparisons

In ANOVA, when the null hypothesis is rejected, it

indicates that the samples represent different populations.

If so, it would be of interest to identify the sub-groups of

populations which are homogenous among themselves,

(i.e.,have the same means). For instance, if a

manufacturer has several suppliers for the supply of a

particular component, he would like to group them on the

basis of quality levels of the components supplied by

them. Based on our knowledge of t-test, we know that we

can conduct a pair wise equality of mean (on quality level)

test for all sample pairs.

Formally, the null hypothesis would be $H_0: \mu_i = \mu_j$ ($i \neq j$), and we use the t-test for comparing populations i and j, with the estimate of $\sigma^2$ obtained from the two samples.

In an alternative to this approach, it is suggested that the estimate of $\sigma^2$ may be obtained based on all the samples, instead of only the samples being compared. Hence, it is a pooled estimate of $\sigma^2$ based on all samples. Thus,

$t = \frac{\bar{X}_i - \bar{X}_j}{\sqrt{MWSS\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}}$

Since MWSS is an unbiased estimate of $\sigma^2$, we would reject $H_0$ by comparing the computed t with the t(critical) with (n-k) degrees of freedom and $\alpha$ level of significance, depending on the alternative.

Consider the alternative hypothesis as $H_1: \mu_i \neq \mu_j$. We can simplify the test and restate it as follows:

Reject $H_0$ at $\alpha$ level of significance, if

$|\bar{X}_i - \bar{X}_j| > LSD$,

where $LSD = t(n-k, \alpha/2) \cdot \sqrt{MWSS\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$

Here, LSD stands for least significant difference, sometimes called critical difference.

If all $n_i$’s are equal, then we need to calculate the LSD only once and then compare the differences of all pairs of sample means with the computed LSD. If the difference for a pair is less than the LSD, the two samples belong to the same population; if not, they belong to different populations.
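As an illustration, the LSD comparison can be applied to the emoluments example. Because the sample sizes there are unequal, the LSD is recomputed for each pair. The sketch assumes the sample means and MWSS from Section 3 and takes t(25, 0.025) = 2.060 from a standard t table; the names `lsd` and `significant_pairs` are illustrative.

```python
import math
from itertools import combinations

# Sample means and sizes of the four institutes (from Section 3).
means = {1: 11.000, 2: 9.571, 3: 9.969, 4: 9.344}
sizes = {1: 6, 2: 7, 3: 8, 4: 8}
MWSS = 0.990
T_CRITICAL = 2.060  # t(n-k = 25, alpha/2 = 0.025), from a standard t table


def lsd(ni, nj, mwss=MWSS, t_crit=T_CRITICAL):
    """Least significant difference for a pair of classes of sizes ni and nj."""
    return t_crit * math.sqrt(mwss * (1 / ni + 1 / nj))


significant_pairs = []
for i, j in combinations(means, 2):
    difference = abs(means[i] - means[j])
    threshold = lsd(sizes[i], sizes[j])
    if difference > threshold:
        significant_pairs.append((i, j))
    print(f"Institutes {i} and {j}: |diff| = {difference:.3f}, LSD = {threshold:.3f}")

print("Pairs with significantly different means:", significant_pairs)
```

Under these assumptions, Institute 1 differs significantly from Institutes 2 and 4, while the remaining pairs do not differ at the 5% level.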

In ANOVA, multiple comparisons could also be carried out using the Tukey-Kramer procedure. As earlier, our hypotheses may be stated as

$H_0: \mu_i = \mu_j$ Vs. $H_1: \mu_i \neq \mu_j$ (for all $i \neq j$)

According to this procedure, we compute a critical range (CR) as below.

$CR = Q_\alpha \sqrt{\frac{MWSS}{2}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$

where $Q_\alpha$ is the upper-tail critical value from a Studentized range distribution with degrees of freedom k and (n-k) respectively for the numerator and the denominator, with level of significance $\alpha$. A pair of means is declared significantly different if $|\bar{X}_i - \bar{X}_j| > CR$. Critical values for the Studentized range distribution are available in tables in standard textbooks on statistics.
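The critical range is simple to compute once the Studentized range value has been looked up. In the sketch below, the function names are illustrative, and the value 3.89 used for Q is an approximate, assumed figure for k = 4, n - k = 25 and a 5% level of significance; it should be verified against a Studentized range table before being relied upon.

```python
import math


def critical_range(q_crit, mwss, ni, nj):
    """Tukey-Kramer critical range for classes of sizes ni and nj."""
    return q_crit * math.sqrt((mwss / 2) * (1 / ni + 1 / nj))


def means_differ(mean_i, mean_j, q_crit, mwss, ni, nj):
    """True if the absolute difference of the means exceeds the critical range."""
    return abs(mean_i - mean_j) > critical_range(q_crit, mwss, ni, nj)


# ASSUMPTION: approximate Studentized range value; verify against a table.
Q_ASSUMED = 3.89

# Institutes 1 and 4 of the emoluments example (means and MWSS from Section 3).
cr_1_4 = critical_range(Q_ASSUMED, 0.990, 6, 8)
print(f"CR for Institutes 1 and 4 = {cr_1_4:.3f}")
print("Means differ:", means_differ(11.000, 9.344, Q_ASSUMED, 0.990, 6, 8))
```

The critical range is wider than the corresponding LSD because the Studentized range value allows for all pairwise comparisons being made simultaneously.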


Section 5

ANOVA in Practice

As we noted earlier, two important assumptions for

conducting ANOVA are: (i) all the populations are normal

and (ii) all the populations have equal variance. On both

counts, some relaxation is possible in practice. If the size

of each sample is large enough (>30), ANOVA can be

applied even if the underlying populations deviate from

normal distribution. Similarly, in the case of one-way

ANOVA, if the sample sizes are nearly equal over the

groups, ANOVA can tolerate some fluctuations in

variance. The thumb rule is: the largest sample standard

deviation should be no more than twice the smallest

sample standard deviation.
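The thumb rule is easy to check before running ANOVA. The sketch below applies it to the emoluments data from Section 1, using the sample standard deviation (divisor n - 1) from Python's standard library; the variable names are illustrative.

```python
import statistics

# Emoluments (Rs. lakhs) offered to graduates of the four institutes.
samples = {
    "Institute 1": [11, 12, 9, 10.5, 11.5, 12],
    "Institute 2": [8, 9, 9.5, 9.75, 10, 10.25, 10.5],
    "Institute 3": [10, 11, 10.5, 10.25, 10.75, 9.75, 9, 8.5],
    "Institute 4": [7.75, 8.25, 8.75, 9, 9.5, 10, 10.5, 11],
}

# Sample standard deviation of each class.
std_devs = {name: statistics.stdev(values) for name, values in samples.items()}
ratio = max(std_devs.values()) / min(std_devs.values())

for name, sd in std_devs.items():
    print(f"{name}: s = {sd:.3f}")
print(f"largest / smallest = {ratio:.3f}")
print("Thumb rule satisfied:", ratio <= 2)
```

For these data the ratio is well below 2, so the equal-variance assumption appears reasonable by this rule.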

There are however, more formal tests for testing the

equality of variances over the populations considered.

One such test is the Levene’s test.

Although the one way ANOVA is relatively robust (as

explained above), large differences in the variances can

significantly affect the validity of the F test. Thus in such

situations, we can first test for the equality of the

variances over different classes/treatments (called

Levene’s test), and only if the homogeneity of the

variances is accepted, we proceed for ANOVA.

Formally, the Levene’s test is described as follows:

$H_0: \sigma_1^2 = \sigma_2^2 = \dots = \sigma_k^2$ Vs. $H_1$: not all $\sigma_i^2$ are equal (i = 1,2,....,k)

For conducting the test, for each class, we first compute the absolute difference between each observation and the median of the class. Thus we will obtain $n_1$ absolute differences for the first class, $n_2$ for the second class and so on, with finally $n_k$ absolute differences for the $k$-th class.

We then perform a one way ANOVA on these differences,

testing for equality of mean absolute differences over the

classes. We reject the original null hypotheses of equality

of variances if this null hypothesis is rejected.
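The procedure just described (absolute deviations from the class median, followed by a one-way ANOVA on those deviations) can be sketched as below. The three samples are hypothetical and the helper names are ours; the resulting F value still has to be compared with a tabulated F critical value.

```python
from statistics import mean, median

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    f = (ss_between / (k - 1)) / (ss_within / (n - k))
    return f, k - 1, n - k

def levene_f(groups):
    """Levene's test: ANOVA on absolute deviations from each class median."""
    devs = [[abs(x - median(g)) for x in g] for g in groups]
    return one_way_anova_f(devs)

# Hypothetical samples: the second class is visibly more spread out.
tight = [10, 11, 12, 13, 14]
wide = [2, 7, 12, 17, 22]
also_tight = [20, 21, 22, 23, 24]

f_stat, df1, df2 = levene_f([tight, wide, also_tight])
print(round(f_stat, 3), df1, df2)
```

A large F here points to unequal variances, in which case the F test of the main ANOVA should be interpreted with caution.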

References:

http://www2.sas.com/proceedings/sugi29/192-29.pdf


SECTION 6

Case Study: Real Foods


In 2003, Real Foods, a mango juice manufacturer had a pre-

dominant market presence in South India. ‘Enjoy,’ the bottled

mango juice from Real Foods enjoyed a comfortable position

in the branded fruit juices market. For the first time, Real

Foods ventured into another product – an orange juice con-

centrate. Since the market was already full of canned and bot-

tled orange juices, Real Foods opted for the concentrate

form, targeting the home consumption segment. Liquid con-

centrates were available in the market already but Real

Foods had developed a powder concentrate available in tetra-

packs. The powder concentrate when mixed with water gave

a litre of orange juice. Real Foods decided to market the new

product under the ‘Enjoy’ brand name, to leverage the brand’s

equity.

The new product had several attractive features. First of all,

the powder concentrate was much more convenient than the

canned and bottled orange juices. Secondly, Real Foods be-

lieved the quality of the juice made out of the concentrate was

better because unlike canned juices, the juice from the con-

centrate could be prepared just before consumption. Another

very important feature was that the powder concentrate was

available at a much lower price than the other juices. The marketing manager was in a dilemma as to how to advertise the new product. He could opt for advertisements that emphasized the convenience of the product, its quality, or its price advantage. To facilitate a decision, he conducted an experiment in three cities – Bangalore, Chennai, and Hyderabad.

In Bangalore, the marketing manager launched the new product backed by advertisements stressing the convenience of the product. The product was easy to carry from the store

to home. The powder did not require storage in the refrigera-

tor. Even households without a refrigerator could buy the prod-

uct without fearing spoilage. The advertisements also high-

lighted the ease with which one litre of juice was ready in a

short time. In Chennai, the advertisements emphasized the quality of the product – the freshness proposition, how it tasted better than bottled juices, etc. In Hyderabad, the advertisements stressed the price advantage.

The marketing manager recorded the weekly sales of the new

concentrate in tetrapacks, for 20 weeks in all the three cities

(Refer to Table I). He wanted to know if the difference in sales

was on account of the different communication strategies

adopted by the company for the three cities.

Table I: Weekly sales (for 20 weeks) in 3 cities

Week   Bangalore (Convenience)   Chennai (Quality)   Hyderabad (Price)
1      75                        45                  65
2      60                        54                  45
3      75                        65                  56
4      45                        56                  60
5      55                        65                  64
6      75                        70                  54
7      65                        62                  80
8      80                        70                  56
9      75                        71                  67
10     89                        60                  50
11     95                        67                  67
12     87                        64                  70
13     64                        56                  72
14     71                        65                  65
15     84                        57                  65
16     75                        54                  63
17     54                        67                  56
18     65                        70                  64
19     65                        59                  68
20     55                        63                  72
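One possible starting point for the analysis is a one-way ANOVA with the three cities (advertising themes) as treatments. The sketch below only computes the sums of squares and the F ratio from the table above; it assumes the weekly figures can be treated as independent samples, and the F ratio must still be compared with a tabulated critical value before any conclusion is drawn.

```python
from statistics import mean

# Weekly sales from Table I, one list per city (advertising theme).
sales = {
    "Bangalore (Convenience)": [75, 60, 75, 45, 55, 75, 65, 80, 75, 89,
                                95, 87, 64, 71, 84, 75, 54, 65, 65, 55],
    "Chennai (Quality)": [45, 54, 65, 56, 65, 70, 62, 70, 71, 60,
                          67, 64, 56, 65, 57, 54, 67, 70, 59, 63],
    "Hyderabad (Price)": [65, 45, 56, 60, 64, 54, 80, 56, 67, 50,
                          67, 70, 72, 65, 65, 63, 56, 64, 68, 72],
}

groups = list(sales.values())
k = len(groups)                      # number of treatments
n = sum(len(g) for g in groups)      # total number of observations
grand_mean = sum(sum(g) for g in groups) / n

# Between-group and within-group sums of squares.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

df_between, df_within = k - 1, n - k
f_ratio = (ss_between / df_between) / (ss_within / df_within)

city_means = {city: mean(values) for city, values in sales.items()}
print(city_means, df_between, df_within, round(f_ratio, 3))
```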


SECTION 7

Case Study: “Melting Delicacies” Ice cream Parlour chain (B)


This case study was written by L. Shridharan, Professor, Department of Decision Sciences, IBS, Hyderabad. It is intended to be

used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situation.

The case was written from generalized experiences.

Maria and her outlet managers generally believed that the

weekly sales across the five outlets of “Melting Delicacies”

were more or less the same. Of late, however, some of the outlet

managers indicated some fluctuations in the sales and they at-

tributed this to the opening of a competitor’s outlets in the vicin-

ity of some of her outlets. Naturally, Maria was concerned.

This could pose a threat and Maria knew that she had to act

fast. However, the balanced person in her suggested that she

should ascertain the views of the outlet managers on an objec-

tive basis. She once again called Kiran, the management

trainee in her office and expressed her concern. She wanted

him to verify if there is a difference in sales across the outlets

and if so to indicate the outlets in the order / groups of sales

importance. With these facts established statistically, she felt

she would be on a firmer ground to evolve her strategy to

counter the competition.

After some discussions with the outlet managers, and with

guidance from his B-School professor, Kiran decided on a plan

for data collection and analysis. As a first step, he collected

the weekly sales revenue data for each outlet for the past 15 weeks, though data was not available for some outlets for some weeks (Exhibit I).

Based on this information, will Kiran be able to help Maria in resolving her concern one way or the other?

Exhibit I

Weekly Sales (in Rs. Lakhs) at Different Outlets of “Melting Delicacies” at Hyderabad

Week   Cyber City   Institutional Area   NTR Garden   Banjara Hills   Central Mall
1      3.9          3.7                  6.1          4.5             6.5
2      4.2          NA                   5.4          3.7             5.3
3      6.6          4.7                  5.8          5.4             6.4
4      5.1          5.8                  5.4          3.8             6.1
5      4.1          3.6                  7.2          5.1             6.4
6      3.8          4.6                  5.3          3.8             7.4
7      5.7          3.2                  4.4          5.4             NA
8      5.1          4.5                  6.7          5.7             5.7
9      5.2          3.9                  4.5          3.9             5.9
10     NA           3.2                  5.0          5.2             6.6
11     2.7          NA                   6.4          5.9             7.3
12     3.1          3.2                  NA           5.1             5.8
13     6.1          4.2                  5.5          4.5             6.2
14     5.1          3.9                  NA           3.3             6.3
15     4.8          3.3                  7.0          4.7             7.4

NA: Not Available

Prepared by author

CHAPTER 10

Correlation and Regression Analysis

In this chapter we will discuss:

Correlation Analysis

• Correlation Coefficient

• Properties of Correlation Coefficient

Simple Linear Regression

• Simple Regression of Y on X

• Simple Regression of X on Y

• Some Properties of Regression Coefficients

Multiple Regression

• How good the regression fit is (simple or multiple)?

• Standard Error of Estimate

• Testing for Significance of Regression Relation

• Testing for Significance of Regression Coefficients

• Confidence Interval for $b_i$

• Prediction using Regression equation

• Some Important Considerations

Section 1

Correlation Analysis

Correlation Analysis is about the study of changes in one

variable in relation to changes in another variable. The

phenomenon can be observed in several natural and

economic contexts. Illustratively,

(a). Higher the rainfall, higher the agricultural production

(b). Higher the income, higher the expenditure

(c). Higher the price, lower the demand.

(d). Higher the age of an equipment, higher the

maintenance cost.

Therefore, it is a study of variation in one variable in relation

to variation in the other variable.

Consider the maintenance cost of a particular type of equipment at different vintage levels (refer table 10.1.1). We can plot the data as in figure 10.1.1. It is suggestive of a linear relation between maintenance cost and vintage.

The question is - can we measure the extent of this linear relationship between the variables vintage and maintenance cost?

TABLE 10.1.1

Vintage (in years)   Maintenance Cost (in Rs. 000’s)
2                    6
7                    18
5                    13
9                    23
4                    9
3                    5
8                    22

Figure 10.1.1: Plotting data (scatter plot of maintenance cost, in Rs. 000’s, against vintage, in years)

Correlation Coefficient

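The strength of correlation between two variables X and Y is measured through the correlation coefficient, which is defined as:

$r_{XY} = \dfrac{\sum (X_i-\bar X)(Y_i-\bar Y)}{\sqrt{\sum (X_i-\bar X)^2 \,\sum (Y_i-\bar Y)^2}}$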

Properties of Correlation Coefficient

a. Correlation coefficient $r_{XY}$ measures only the extent of linear relationship between X and Y.

b. Always, we have $-1 \le r_{XY} \le +1$. Further,

$r_{XY} = 1$ => perfect positive relation between X & Y (i.e., as X increases, Y increases)

$r_{XY} = -1$ => perfect negative relation between X & Y (i.e., as X increases, Y decreases)

$r_{XY} = 0$ => No correlation (i.e., each of X and Y behave their own way).

c. Change of origin and scale does not affect the correlation coefficient.

Let U and V be defined as:

U = (X - a)/c, V = (Y - b)/d, where a, b, c & d are constants, with c and d of the same sign.

Then $r_{XY} = r_{UV}$

d. If X and Y are independent, then $r_{XY} = 0$, but the converse is not true as can be seen from the following example:

Let:

X:   -3   -2   -1   +1   +2   +3
Y:    9    4    1    1    4    9

Here $r_{XY} = 0$, but actually $Y = X^2$ (a non-linear relation) and hence X and Y are perfectly related.

e. Spurious Correlation

Let (x1, y1), (x2, y2), ..., (xn, yn) be n pairs of

observations. Mathematically one can calculate the

correlation coefficient between X and Y. However, to make

meaningful sense out of it, one must look for theoretical or

other reasons for the cause and effect relationship. While


agricultural production of Country A can be expected to

depend on the rainfall in that country, clearly rainfall in

Country A cannot provide any meaningful explanation for

agricultural production in Country B. On computation, we

will get some value for the correlation coefficient due to

influence of some common factors like nature or time, but

clearly such correlations cannot be meaningfully

interpreted.

f. Sometimes it may be more meaningful to correlate variables with a lag, e.g., the current month’s sales would depend on the advertisement expenditure incurred, say, 2 months ago (i.e., a lag of 2 periods). Then, we may correlate $Y_t$ with $X_{t-2}$.

g. $r_{XY}^2$ is referred to as the coefficient of determination.

h. Test for significance of correlation coefficient

Let $\rho_{XY}$ = Population Correlation Coefficient between X and Y. Then, we may test $H_0: \rho_{XY} = 0$ against the alternatives $H_1: \rho_{XY} \neq 0$, $\rho_{XY} > 0$, $\rho_{XY} < 0$ through a t-test.

The test statistic is given by

$t = \dfrac{r_{XY}\sqrt{n-2}}{\sqrt{1-r_{XY}^2}} \sim t_{n-2}$

The decision criteria at level of significance $\alpha$ are as follows:

If $H_1: \rho_{XY} \neq 0$, then reject $H_0$ if $|t| > t_{(n-2)}(\alpha/2)$

If $H_1: \rho_{XY} > 0$, then reject $H_0$ if $t > t_{(n-2)}(\alpha)$

If $H_1: \rho_{XY} < 0$, then reject $H_0$ if $t < -t_{(n-2)}(\alpha)$

This testing is made possible under the assumption that the error terms $e_i$’s are mutually independent and are distributed normally with ‘zero’ mean and a constant variance $\sigma^2$.
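As an illustration, the correlation coefficient and its t statistic can be computed for the vintage and maintenance cost data of Table 10.1.1, and the coefficient checked for the $Y = X^2$ example of property (d). This is a minimal sketch using the usual product-moment formula; the function names are ours.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation coefficient between xs and ys."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# Table 10.1.1: vintage (years) and maintenance cost.
vintage = [2, 7, 5, 9, 4, 3, 8]
cost = [6, 18, 13, 23, 9, 5, 22]
r = pearson_r(vintage, cost)
t = t_statistic(r, len(vintage))

# Property (d): Y = X^2 is an exact relation, yet r is zero.
x = [-3, -2, -1, 1, 2, 3]
y = [v * v for v in x]
r_parabola = pearson_r(x, y)

print(round(r, 4), round(t, 2), round(r_parabola, 4))
```

For the equipment data r is about 0.985, and the t statistic (about 12.9 with 5 degrees of freedom) is well above the two-tailed 5% table value of 2.571, so the hypothesis of zero correlation would be rejected.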


Section 2

Simple Linear Regression

Correlation coefficient measures the degree of linear relationship between two variables, though it does not probe the cause and effect relationship. On the other hand, Linear Regression probes the cause-effect relation by specifying the nature of the relationship between Y (dependent variable) and X (independent variable) in the case of Simple Regression, and X1, X2, ..., Xk (independent variables) in the case of Multiple Regression.

Simple Regression of Y on X

Let (x1, y1), ..., (xn, yn) be n observations. We

believe that X is the cause and Y is the effect. We try to

identify the relationship through the following simple linear

model (see figure 10.2.1).

Yi = a + b Xi + ei,

where, ei is the error term

If we know a and b, we would know the relationship between Y and X. We try to obtain a and b in such a way that the “error sum of squares” is minimized, i.e., we minimize

$\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (Y_i - a - bX_i)^2$

over the choice of ‘a’ and ‘b’ using calculus approach. When this is done, we get $\hat a$ and $\hat b$ (estimates of a and b) as:

$\hat b = \dfrac{\sum (X_i-\bar X)(Y_i-\bar Y)}{\sum (X_i-\bar X)^2}$, $\hat a = \bar Y - \hat b\,\bar X$

Figure 10.2.1: Simple Regression of Y on X

This regression is called regression of Y on X, ‘a’ and ‘b’ are called regression coefficients, and $\hat a$ and $\hat b$, the respective estimates. This regression equation can be equivalently written as $(Y - \bar Y) = \hat b\,(X - \bar X)$.

The method of determining a and b by minimizing the “error

sum of squares” is called the Least Squares Method.

Simple regression of X on Y

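In a similar fashion, we can obtain the regression of X on Y, say denoted as X = a* + b* Y

In this case, we get:

$\hat b^* = \dfrac{\sum (X_i-\bar X)(Y_i-\bar Y)}{\sum (Y_i-\bar Y)^2}$, $\hat a^* = \bar X - \hat b^*\,\bar Y$

This regression equation can be equivalently written as:

$(X - \bar X) = \hat b^*\,(Y - \bar Y)$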
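Both regressions can be fitted to the data of Table 10.1.1 with a few lines of code. The sketch below uses the least squares formulas (slope = sum of cross-products of deviations divided by sum of squared deviations of the explanatory variable); the function name `fit_line` is ours.

```python
def fit_line(xs, ys):
    """Least squares estimates (a_hat, b_hat) for the line y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b_hat = sxy / sxx
    a_hat = my - b_hat * mx
    return a_hat, b_hat

# Table 10.1.1: vintage (years) and maintenance cost.
vintage = [2, 7, 5, 9, 4, 3, 8]
cost = [6, 18, 13, 23, 9, 5, 22]

a_hat, b_hat = fit_line(vintage, cost)     # regression of Y (cost) on X (vintage)
a_star, b_star = fit_line(cost, vintage)   # regression of X (vintage) on Y (cost)

# The product of the two slopes equals the squared correlation coefficient.
r_squared = b_hat * b_star
print(round(a_hat, 3), round(b_hat, 3), round(b_star, 4), round(r_squared, 4))
```

The two fitted lines are not the same line unless the points lie exactly on a straight line; the closer the product of the slopes is to 1, the closer the two lines are to each other.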


Keynote:Example 10.2.1

Keynote: Example 10.2.2

Some Properties of Regression Coefficients

i). Note that $\hat b \cdot \hat b^* = r_{XY}^2$ in the case of simple regressions. This means $r_{XY} = \pm\sqrt{\hat b \cdot \hat b^*}$, where the sign would be decided by the sign of $\hat b$ and $\hat b^*$. They will always have the same signs.

ii). Also,

References:

http://biocomp.cnb.uam.es/~coss/Docencia/ADAM/Sample/

Simple%20Regression.pdf


Section 3

Multiple Regression

In the case of simple regression, we had only one

independent (explanatory) variable (X) to explain the

dependent variable (Y). In the case of multiple regression,

we consider several independent (explanatory) variables

(say, X1, ..., Xk) to explain the dependent variable (Y). The

data structure looks as in the table 10.3.1:

Once again, we consider the linear model only (Simple

regression can be considered as a special case of multiple

regression where k=1). Thus the regression relation is

expressed as follows:

$Y_i = b_0 + b_1 X_{1i} + b_2 X_{2i} + \dots + b_k X_{ki} + e_i$

The estimation of b’s is done following the same logic as in

the case of simple regression, i.e., by minimizing the error

sum of squares for the choice of b’s. While we will not

present here the formulae for b’s, computer software (Excel,

SPSS, SAS, etc.) gives the estimated values of the b’s (the

regression coefficients), written as $\hat b$’s.

Table 10.3.1: Data Structure

Observation   Y_i    X_1i ... X_ki
1             Y_1    X_11 ... X_k1
2             Y_2    X_12 ... X_k2
:             :      :
n             Y_n    X_1n ... X_kn

How good the regression fit is (simple or

multiple) ?

Once the regression line is estimated, we can obtain the estimated values for $Y_i$’s (denoted by $\hat Y_i$’s) for given values of $X_i$’s. We should expect the $Y$’s and $\hat Y$’s to be close for a good regression relation. In other words, we would expect a high correlation between $Y_i$ and $\hat Y_i$ if the regression relation is good, i.e., $r_{Y\hat Y}$ can be taken as indicative of the goodness of the regression fit.

Another way of looking at the degree of closeness between $Y$’s and $\hat Y$’s could be through the break-up of total sum of squares as given below:

$\sum (Y_i - \bar Y)^2 = \sum (\hat Y_i - \bar Y)^2 + \sum (Y_i - \hat Y_i)^2$

(This relation can be proved.)

or, Total SS = Explained SS + Unexplained SS (or Error SS)

or, $1 = R^2 + \dfrac{\text{Unexplained SS}}{\text{Total SS}}$

where, $R^2 = \dfrac{\text{Explained SS}}{\text{Total SS}}$

Clearly, the higher the proportion of explained sum of squares in the total sum of squares ($R^2$), the better or more reliable would be the regression relation. It is also clear that $0 \le R^2 \le 1$.

Actually, it can be proved that $R^2 = r_{Y\hat Y}^2$.

$R^2$ closer to 1 is indicative of a good regression fit. However, $R^2$ will keep on increasing if we continue to add more independent variables to the regression relation even if their contribution is not significant. To take care of this situation, we use adjusted $R^2$ as given below:

$\bar R^2 = 1 - (1 - R^2)\,\dfrac{n-1}{n-k-1}$

where n is the number of observations and k the number of independent variables. $\bar R^2$ will always be slightly lower than $R^2$ and would fall when the addition of a variable does not contribute significantly.

Standard Error of Estimate

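Estimate of $\sigma$, called standard error of estimate, is given as:

$s_e = \sqrt{\dfrac{\sum (Y_i - \hat Y_i)^2}{n-k-1}}$

where, k = the number of independent variables.

Here $s_e^2$ is an unbiased estimate of $\sigma^2$, i.e., $E(s_e^2) = \sigma^2$

In the case of Simple Regression:

$s_e = \sqrt{\dfrac{\sum (Y_i - \hat Y_i)^2}{n-2}}$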
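The break-up of the total sum of squares, $R^2$, adjusted $R^2$ and the standard error of estimate can be illustrated on the simple regression of maintenance cost on vintage (Table 10.1.1, so k = 1). A minimal sketch, with variable names of our own choosing:

```python
import math

# Table 10.1.1: vintage (years) and maintenance cost.
vintage = [2, 7, 5, 9, 4, 3, 8]
cost = [6, 18, 13, 23, 9, 5, 22]
n, k = len(vintage), 1   # k = number of independent variables

# Least squares fit of cost on vintage.
x_bar, y_bar = sum(vintage) / n, sum(cost) / n
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(vintage, cost))
sxx = sum((x - x_bar) ** 2 for x in vintage)
b_hat = sxy / sxx
a_hat = y_bar - b_hat * x_bar
fitted = [a_hat + b_hat * x for x in vintage]

# Break-up of the total sum of squares.
total_ss = sum((y - y_bar) ** 2 for y in cost)
explained_ss = sum((f - y_bar) ** 2 for f in fitted)
error_ss = sum((y - f) ** 2 for y, f in zip(cost, fitted))

r2 = explained_ss / total_ss
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
s_e = math.sqrt(error_ss / (n - k - 1))

print(round(r2, 4), round(adj_r2, 4), round(s_e, 4))
```

Here about 97% of the variation in maintenance cost is explained by vintage, and the adjusted value is only slightly lower because a single explanatory variable is used.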


Keynote 10.3.2: Example

Keynote 10.3.3: Example

Keynote 10.3.4: Example

Keynote 10.3.1:Example

Testing for significance of Regression

Relation

This amounts to testing for $H_0: R^2 = 0$ against the alternative $H_1: R^2 > 0$. This is equivalent to testing $H_0: b_1 = b_2 = \dots = b_k = 0$ against $H_1$: not all b’s equal to zero. All tests are carried out under the assumptions that the error terms ($e_i$’s) are mutually independent and distributed normally with zero mean and constant variance ($\sigma^2$).

To test for significance of $R^2$, we use the following F-statistic:

$F = \dfrac{R^2 / k}{(1 - R^2)/(n-k-1)} \sim F_{k,\,(n-k-1)}$

Here, $H_0: R^2 = 0$ vs $H_1: R^2 > 0$

We reject $H_0$ if $F > F_{k,\,(n-k-1)}(\alpha)$

An ANOVA presentation can be made for the above

hypothesis testing as given below:

Testing for significance of Regression

Coefficients

If a regression relation is found to be significant, the next

logical question to ask is: which all independent variables

are contributing significantly to the regression relation?

This amounts to testing for significance of b’s individually.

This is done through appropriate t-tests.

In general, we can test for H

0

: b¡ = ß¡ against

alternatives, where ß¡ is the hypothesized value for the

regression coefficient from past experience or other

sources. The testing procedure is as below:

Table 10.3.2

Source of Variation   SS                                                   DF        MSS             F-Ratio
Regression            $\sum(\hat Y_i - \bar Y)^2$ = Explained SS (ESS)     k         ESS/k           F = (ESS/k) / (Un SS/(n-k-1))
Error                 $\sum(Y_i - \hat Y_i)^2$ = Unexplained SS (Un SS)    (n-k-1)   Un SS/(n-k-1)
Total                 $\sum(Y_i - \bar Y)^2$ = TSS                         (n-1)

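$H_0: b_i = \beta_i$ vs $H_1: b_i \neq \beta_i$, or $H_1: b_i > \beta_i$, or $H_1: b_i < \beta_i$

Test statistic is given by:

$t = \dfrac{\hat b_i - \beta_i}{s_{\hat b_i}} \sim t_{(n-k-1)}$, where $s_{\hat b_i}$ is the standard error of $\hat b_i$.

The Decision Criteria at $\alpha$ level of significance are given by:

If $H_1: b_i \neq \beta_i$, reject $H_0$ if $|t| > t_{(n-k-1)}(\alpha/2)$

If $H_1: b_i > \beta_i$, reject $H_0$ if $t > t_{(n-k-1)}(\alpha)$

If $H_1: b_i < \beta_i$, reject $H_0$ if $t < -t_{(n-k-1)}(\alpha)$

(In the case of Simple Regression, k = 1 and the degrees of freedom are n - 2.)

If the $b_i$’s are tested against ‘0’ (i.e., $\beta_i = 0$), then we refer to it as test of significance of regression coefficients.

Confidence Interval for $b_i$

The confidence interval for $b_i$ is given by

$\hat b_i \pm t_{(n-k-1)}(\alpha/2)\; s_{\hat b_i}$

Here, $t_{(n-k-1)}(\alpha/2)$ is to be read from the t-table appropriately.

In the case of Simple Regression, we have:

$s_{\hat b} = \dfrac{s_e}{\sqrt{\sum (X_i - \bar X)^2}}$

and the confidence interval for b is:

$\hat b \pm t_{(n-2)}(\alpha/2)\,\dfrac{s_e}{\sqrt{\sum (X_i - \bar X)^2}}$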
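For the simple regression of maintenance cost on vintage (Table 10.1.1), the test of significance of the slope, its confidence interval and the F statistic for the regression relation can be sketched as follows. The t-table value 2.571 (5 degrees of freedom, two-tailed 5% level) is supplied by hand; everything else is computed from the data.

```python
import math

# Table 10.1.1: vintage (years) and maintenance cost.
vintage = [2, 7, 5, 9, 4, 3, 8]
cost = [6, 18, 13, 23, 9, 5, 22]
n = len(vintage)

x_bar, y_bar = sum(vintage) / n, sum(cost) / n
sxx = sum((x - x_bar) ** 2 for x in vintage)
syy = sum((y - y_bar) ** 2 for y in cost)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(vintage, cost))

b_hat = sxy / sxx
a_hat = y_bar - b_hat * x_bar
error_ss = sum((y - a_hat - b_hat * x) ** 2 for x, y in zip(vintage, cost))

s_e = math.sqrt(error_ss / (n - 2))   # standard error of estimate
s_b = s_e / math.sqrt(sxx)            # standard error of the slope

# Test of significance of the slope: H0: b = 0.
t_stat = (b_hat - 0.0) / s_b

# 95% confidence interval for b; 2.571 is the t-table value for 5 df.
T_CRIT = 2.571
ci = (b_hat - T_CRIT * s_b, b_hat + T_CRIT * s_b)

# F statistic for H0: R^2 = 0 (k = 1 independent variable).
r2 = 1 - error_ss / syy
f_stat = (r2 / 1) / ((1 - r2) / (n - 2))

print(round(t_stat, 2), tuple(round(v, 3) for v in ci), round(f_stat, 2))
```

With a single explanatory variable the F statistic is the square of the t statistic for the slope, so the two tests lead to the same conclusion.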

Keynote: Example 10.3.5

Prediction using Regression equation

An important purpose of estimating regression equation is

to predict the value of dependent variable for given values

of independent variables. It is possible to give the

confidence interval for such prediction.

The case of Simple Regression

Confidence Interval for predicting the mean value of Y given X = Xo (see Figure 10.3.1)

$\hat Y_0 \pm t_{(n-2)}(\alpha/2)\; s_e \sqrt{\dfrac{1}{n} + \dfrac{(X_0 - \bar X)^2}{\sum (X_i - \bar X)^2}}$

Confidence Interval for predicting individual value of Y given X = Xo (see Figure 10.3.2)

$\hat Y_0 \pm t_{(n-2)}(\alpha/2)\; s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{(X_0 - \bar X)^2}{\sum (X_i - \bar X)^2}}$

Figure 10.3.1: Prediction of Mean

Figure 10.3.2: Prediction of Individual Observation

Some Important Considerations

If a regression line is not significant (i.e., $H_0: R^2 = 0$ is not rejected), then the best prediction of Y is $\bar Y$, for any values of X’s.

While predicting the Y-value, the X-values should be within the maximum and minimum observations of the respective X’s or near about. In other words, we consider the regression relation valid within the X-ranges or near about them.
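The two prediction intervals for simple regression, together with the caution about staying within the observed X-range, can be sketched as below for the data of Table 10.1.1. The function name is ours, the t-table value is passed in by hand, and refusing any X outside the observed range is a deliberately strict choice made for this sketch (the text also allows values "near about" the range).

```python
import math

def prediction_intervals(xs, ys, x0, t_crit):
    """Return (y0_hat, half-width for the mean, half-width for an individual Y)."""
    if not min(xs) <= x0 <= max(xs):
        raise ValueError("x0 lies outside the observed X range")
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b_hat = sxy / sxx
    a_hat = y_bar - b_hat * x_bar
    error_ss = sum((y - a_hat - b_hat * x) ** 2 for x, y in zip(xs, ys))
    s_e = math.sqrt(error_ss / (n - 2))
    spread = 1.0 / n + (x0 - x_bar) ** 2 / sxx
    y0_hat = a_hat + b_hat * x0
    mean_half = t_crit * s_e * math.sqrt(spread)
    indiv_half = t_crit * s_e * math.sqrt(1.0 + spread)
    return y0_hat, mean_half, indiv_half

# Table 10.1.1: vintage (years) and maintenance cost.
vintage = [2, 7, 5, 9, 4, 3, 8]
cost = [6, 18, 13, 23, 9, 5, 22]

# 2.571 is the t-table value for 5 df at the two-tailed 5% level.
y0, mean_half, indiv_half = prediction_intervals(vintage, cost, x0=6, t_crit=2.571)
print(round(y0, 2), round(mean_half, 2), round(indiv_half, 2))
```

The interval for an individual observation is always wider than the interval for the mean, because it also carries the variability of a single observation around the regression line.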

Regression relation obtained based on data from

one population, cannot be extended over another

population for prediction. However, one can test if

the regression relation obtained for Population I can

be taken as statistically equal to the relation

obtained for Population II through suitable statistical

tests. This amounts to testing for equality of

corresponding coefficients (all together) from the two

relations.

Example:10.3.6

Let us assume that the Demand for Wheat in Northern states be D = a + b P and in Southern states be D* = a* + b* P.

The query is: Is the consumption pattern of wheat the same for Northern and Southern states?

This amounts to testing $H_0$: a = a* & b = b* (together) against the alternative $H_1$: not so. We can test for equality of individual coefficients also.

Lagged Regression

If we consider the influence of advertisement on sales

of a product, it is reasonable to expect a time lag

before the impact is seen. We can identify many other

similar situations with lagged impact. In such cases,

we incorporate the lag in the regression model. If we believe that advertisement influences sales with a lag of three periods, then we can regress sales ($S_t$) on advertisement ($E_t$) as:

$S_t = a + b\,E_{t-3}$

The idea can be carried forward to multiple regression

also with different lags for different explanatory

variables. However, the onus of identifying the lags for

different explanatory variables is on us, based on the

understanding of the phenomena being studied.
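Fitting a lagged regression only requires pairing each value of the dependent variable with the value of the explanatory variable from the chosen number of periods earlier. A minimal sketch with hypothetical data, constructed so that sales respond to advertisement with a lag of three periods:

```python
def fit_line(xs, ys):
    """Least squares estimates (a_hat, b_hat) for the line y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b_hat = sxy / sxx
    return my - b_hat * mx, b_hat

def lagged_pairs(x, y, lag):
    """Pair y[t] with x[t - lag]; the first `lag` values of y have no partner."""
    return x[:len(x) - lag], y[lag:]

# Hypothetical series: advertisement E_t and sales S_t for 8 periods.
adverts = [4, 6, 5, 8, 7, 9, 10, 12]
# Sales from period 4 onwards are built as 10 + 2 * E_{t-3}; the first three
# sales figures are arbitrary because no lagged advertisement exists for them.
sales = [30, 31, 29] + [10 + 2 * e for e in adverts[:-3]]

e_lagged, s_current = lagged_pairs(adverts, sales, lag=3)
a_hat, b_hat = fit_line(e_lagged, s_current)
print(round(a_hat, 3), round(b_hat, 3))
```

Note that a lag of three periods costs three observations, so long lags need correspondingly longer series.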

Keynote: Example 10.3.6

Transformations to obtain linearity: Sometimes attempting a simple linear relation between the dependent and independent variables may not produce a good relation (i.e., $R^2$ not very high). In such situations, we try some transformation on Y and X variables and attempt fitting linear relation in terms of transformed variables. Popular transformations are log-transformation, semi-log transformation, square-root transformation, reciprocal transformation, etc.

Double log transformation will appear as given below:

$\log Y = A + B \log X$

In this case, B can be interpreted as the elasticity of Y with respect to X.

Semi-log transformation will appear as given below:

$\log Y = A + B\,X$

In this case, B can be interpreted as the growth rate in Y, if X represents time.
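Both transformations amount to fitting a straight line to transformed data. The sketch below uses hypothetical data generated from known relations ($Y = 3X^2$ for the double log case, 10% growth per period for the semi-log case) so that the fitted B can be checked against the value built into the data. Natural logarithms are assumed.

```python
import math

def fit_line(xs, ys):
    """Least squares estimates (a_hat, b_hat) for the line y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b_hat = sxy / sxx
    return my - b_hat * mx, b_hat

# Double log: hypothetical data with Y = 3 * X^2, so the elasticity is 2.
x = [1, 2, 4, 8, 16]
y = [3 * v ** 2 for v in x]
A, B = fit_line([math.log(v) for v in x], [math.log(v) for v in y])

# Semi-log: hypothetical series growing 10% per period from a base of 100.
periods = [0, 1, 2, 3, 4]
series = [100 * 1.1 ** t for t in periods]
A2, B2 = fit_line(periods, [math.log(v) for v in series])
growth_rate = math.exp(B2) - 1   # per-period growth implied by B2

print(round(B, 4), round(B2, 4), round(growth_rate, 4))
```

In the semi-log case B is the continuously compounded growth rate; for small rates it is close to the per-period growth rate, which is recovered exactly as exp(B) - 1.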


Section 4

Some Financial Applications

Risk of a Portfolio

In the real world, investors may hold various securities and

other assets. Any such collection of assets is called a portfo-

lio. For example, if you have shares of the Tata Iron and

Steel Company Ltd. and Reliance Industries Ltd., you have a

portfolio consisting of two shares.

The return on a portfolio is equal to the weighted average of the returns on the assets in the portfolio. The weights used are the proportions of the total portfolio value invested in the individual assets.

The Standard Deviation of the returns on a security meas-

ures the risk of investing in the security. In the same way,

the Standard Deviation of a Portfolio measures the risk of in-

vesting in the portfolio.
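A small numerical sketch may help here. The portfolio return is the value-weighted average of the asset returns, and for two assets the portfolio standard deviation follows from the usual variance formula $w_1^2\sigma_1^2 + w_2^2\sigma_2^2 + 2 w_1 w_2 \rho\,\sigma_1\sigma_2$, where $\rho$ is the correlation between the two returns. All figures below are hypothetical.

```python
import math

def portfolio_return(weights, returns):
    """Weighted average of the asset returns."""
    return sum(w * r for w, r in zip(weights, returns))

def two_asset_sd(w1, sd1, w2, sd2, corr):
    """Standard deviation of a two-asset portfolio."""
    variance = (w1 * sd1) ** 2 + (w2 * sd2) ** 2 + 2 * w1 * w2 * corr * sd1 * sd2
    return math.sqrt(max(variance, 0.0))   # guard against tiny negative rounding

# Hypothetical holdings: Rs. 60,000 in share A and Rs. 40,000 in share B.
values = [60000.0, 40000.0]
weights = [v / sum(values) for v in values]

exp_ret = portfolio_return(weights, [0.10, 0.20])
risk = two_asset_sd(weights[0], 0.25, weights[1], 0.30, corr=0.4)
print(round(exp_ret, 4), round(risk, 4))
```

Unless the two returns are perfectly positively correlated, the portfolio standard deviation is smaller than the weighted average of the individual standard deviations, which is the benefit of diversification.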


Keynote: Example 10.4.1

Characteristic Line

Financial analysts often talk of the beta of a share. We will

describe what the beta signiﬁes and the method commonly

used to estimate it in this section.

Beta of a share is a number that is used to describe how sen-

sitive the share is to the movements in the market as a

whole.

The market as a whole is represented by a market index such as the Bombay Stock Exchange (BSE) Sensitive Index (Sensex), BSE National Index and Economic Times Index. Suppose, during a period under study, the Sensex has doubled from 900 to 1800; an investor would expect the prices of the shares held by him also to have doubled. Every investor

would like the shares held by him to do at least as well as

the market, if not better. Whether the individual shares do

as well as the market or remain unaffected by the market

trends depends upon the sensitivity of stock prices to the

market movements. This sensitivity of stock prices to the

market movements is measured by beta. If the stock has

trebled while the market index has doubled, the stock is

considered to be highly sensitive and its beta would be

greater than one. If the stock’s performance exactly

matches that of the market index, the beta of the stock

would be equal to one. If the stock’s appreciation is only

75% compared to the 100% appreciation in market index,

then the stock is less sensitive and the beta would be less

than one.

Depending upon the beta, shares can also be classiﬁed as

aggressive and defensive. If the beta of a share is greater

than one, the share is classiﬁed as an aggressive security.

Performance of an aggressive security is directly propor-

tional to the performance of the market. In a booming mar-

ket, aggressive security will perform much better than the

market performance. While in a bearish market, perform-

ance of aggressive security would drop at a rate faster than

the market. If the beta of a share is less than one, the share

is classiﬁed as defensive security. Performance of a defen-

sive security is also directly proportional to the perform-

ance of the market. When the market moves up, the hold-

ers of defensive securities would reap less than proportion-

ate beneﬁts. However, when the market moves down, the

decline in the defensive securities prices would also be less

than market movement.

Beta is also used to measure the systematic risk of a secu-

rity. The total risk of a security can be divided into two

broad components. The ﬁrst is the risk speciﬁc to the secu-

rity or diversiﬁable risk or non-systematic risk. The inves-

tor by holding a portfolio which is well-diversiﬁed can com-

pletely eliminate the unsystematic risk. The systematic risk,


which is the second component of the total risk, is the risk

associated with the general market movement and it can-

not be eliminated through diversiﬁcation. All securities do

not have the same degree of systematic risk because the im-

pact of economy-wide factors could differ from company

to company.

Modern Portfolio Theory contends that the required rate of

return of a security (which in turn determines the price of

the security) depends only on the systematic risk of a secu-

rity or its beta. The total risk is irrelevant because through

diversiﬁcation, the investor can eliminate the non-

systematic risk and hence the market would not consider

the non-systematic risk in the pricing process.

The foregoing discussion brings out the importance of the

study of the beta of a security. How is the beta estimated?

The returns from a given security are regressed on the return from the market index. The regression line or the line of best fit for the observations is called the characteristic line. The slope of the line is the beta of the security. While regressing, the return on market index is taken as the independent variable and the return on the security is taken as the dependent variable.
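The estimation of beta as the slope of the characteristic line can be sketched as follows. The return series are hypothetical and are constructed so that the security moves 1.5 times as much as the market.

```python
def beta(security_returns, market_returns):
    """Slope of the regression of security returns on market returns."""
    n = len(market_returns)
    m_bar = sum(market_returns) / n
    s_bar = sum(security_returns) / n
    cov = sum((m - m_bar) * (s - s_bar)
              for m, s in zip(market_returns, security_returns))
    var = sum((m - m_bar) ** 2 for m in market_returns)
    return cov / var

# Hypothetical periodic returns on the market index and on one security.
market = [0.02, -0.01, 0.03, 0.01, -0.02]
security = [0.001 + 1.5 * m for m in market]

b = beta(security, market)
label = "aggressive" if b > 1 else "defensive"
print(round(b, 4), label)
```

With real data the points do not lie exactly on a line, and the intercept and the scatter around the characteristic line carry the security-specific part of the return.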

Cost-Volume-Proﬁt Analysis

The Cost-Volume-Proﬁt (CVP) analysis provides answers

to vital questions such as:

At what sales volume would the ﬁrm break-even?

How sensitive is the proﬁt to variations in output?

How sensitive is the proﬁt to variations in selling

prices?

What should be the sales level in quantity terms for the

ﬁrm to earn the target level of proﬁts?


Keynote: 10.4.2

One basic assumption of CVP analysis is that all costs could

be segregated into ﬁxed and variable, and costs which are

of a semi-ﬁxed or semi-variable nature could be segregated

into the ﬁxed and variable components.

The method of simple linear regression is commonly used

to segregate the ﬁxed and variable components of semi-

ﬁxed or semi-variable costs. The illustration given below ex-

plains the application of regression technique in CVP analy-

sis.
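As a rough sketch of the idea, total cost can be regressed on volume: the intercept estimates the fixed component and the slope the variable cost per unit. The cost data and the selling price below are hypothetical, and the break-even volume uses the usual relation fixed cost / (price - variable cost per unit).

```python
def fit_line(xs, ys):
    """Least squares estimates (a_hat, b_hat) for the line y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b_hat = sxy / sxx
    return my - b_hat * mx, b_hat

# Hypothetical semi-variable cost observed at five output levels (units, Rs.).
volume = [100, 200, 300, 400, 500]
cost = [5000 + 12 * v for v in volume]

fixed_cost, variable_cost_per_unit = fit_line(volume, cost)

# Break-even volume at an assumed selling price of Rs. 20 per unit.
price = 20.0
break_even_units = fixed_cost / (price - variable_cost_per_unit)
print(round(fixed_cost, 2), round(variable_cost_per_unit, 2), round(break_even_units, 1))
```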


Keynote: Example 10.4.3

SECTION 5

Case Study: Boosting Sales of Double Kola


This case study was written by Dr. Sunil Bharadwaj, Professor (Department of Decision Sciences), IBS, Hyderabad. It is intended to

be used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situation.

The case was written from generalized experiences.

The world famous ‘Cola war’ has been growing rapidly in the Indian

battleground for over a decade. The two cola giants, who have

been waging marketing war since the time they stepped into

the country, are trying to take full advantage of Indian weather

conditions and fast food habits of the Indians. The main charac-

teristic of the war is to use innovative, promotional and advertis-

ing campaigns and to strengthen distribution networks. Deliver-

ing value for money, delivering advertising around houses and

conducting market coups have been the standard operating pro-

cedure in the Coke versus Pepsi saga for decades worldwide.

Though Coke has turned out to be the leader in the market,

Pepsi is always trying to snatch the No.1 position in the market-

place. In 1993, Coke started with a huge 69% share of the mar-

ket, according to the data from the Indian Market Research Bu-

reau. It garnered a huge share by buying out Parle’s popular

brands – Limca, Thums Up and Gold Spot. However, it could

not leverage on such a large portfolio of products and soon the

collective strength seemed to fade away. As a result, Coke’s

market share dropped by more than 10% by the end of 2000,

while Pepsi’s market share went up from 23% to 43% in the

same period.1 According to AC Nielsen, a leading market research company, Coca-Cola India’s consolidated share of carbonated soft drinks was 57.8% in 2008, whereas PepsiCo was a distant second with a 35.6% share.2 However, PepsiCo is determined to increase its market share by as much as it can.

The Issue

Double Koala, a renowned cola-maker, is facing a problem of

slow rate of growth in its sales in South India. It is lagging be-

hind the industry growth. Vijay Botliwala (Botliwala), the CEO of

the company, called for a meeting of marketing officials for a

better understanding of the variability associated with sales. Af-

ter greeting everybody, Botliwala threw the discussion open to

the marketing team. Priya Kochar (Kochar), head of advertis-

ing, started the discussion by highlighting the role of advertise-

ments in sales promotion. In her view, advertisement is quite

vital to the business they are in. She believes that as the over-

all market size grows, the number of users of the product in-

creases, and hence the importance of attracting and converting

the users into customers of the company. Under such a situa-

tion, she argued, increasing emphasis must be placed on adver-

tising and informing potential customers about the availability of

the products. In the present era of Information and Communica-

tion Technology (ICT), there are advertising methods, which are

not only cost-effective but also capable in reaching out to a

large numbers of consumers. Advertising, for an industry like

cold drinks, can secure leads for salesmen and middlemen.

Visibility of the product through advertisement will gain both the

dealers’ as well as the consumers’ confidence at large in the

company and its products. She believes in the motto: ‘Advertis-

ing is to stimulate market demand’. She stresses on more budg-

ets for advertising and more focus on aggressive advertising

campaigns. The company has tried celebrity endorsement at

various points of time. The celebrities were from the arena of

sports and movies. Kochar feels that in the past this has contrib-

uted to sales growth.


On the other hand, Kaushik Agarwal (Agarwal), head of sales (who broadly agrees with Kochar), believes that though advertisement can stimulate demand, the sales force of the company should also be ready to walk that extra mile to capitalize on the opportunity. His analysis revealed that the incentives given to salespersons were not adequate to motivate them. He then highlighted certain incentive schemes that are prevalent in rival companies. He cited instances where salespersons from rival companies were doing better than their own salespersons and were being suitably rewarded. He observed that the competitors’ sales teams have been aggressive in tying up with local restaurants and fast food joints, whereas Double Koala’s salespersons were not taking up any such initiatives. He felt that the main cause of slow growth is the old incentive system, which needs immediate upward revision. At this juncture, Zahir Khan (Khan), deputy to Agarwal, added that given the low margin their company offers to distributors, it is unlikely to attract new distributors, who would prefer a competitor instead. He said, “In our company, in the first place we give lesser margin as compared to our competitors and we particularly have no scheme for rewarding the best performing distributors.” Khan believes that a slightly higher margin for the distributors will be fruitful in attracting more distributors, particularly in new areas. This will lead to better sales for the company.

At this juncture, Kapil Singhvi (Singhvi), the finance manager, brought up the issue of pricing. In his opinion, if the price is reduced, it may lead to an increase in demand. However, he is not sure how far it will help in boosting the sales. At this point, Botliwala took charge of the discussion. He ruled out considering a decrease in price, as it may trigger a price war, which will result in erosion of profits for the players in the industry. Botliwala also felt that they need to discuss the matter with hard facts and figures, rather than on the basis of intuition. He asked Manoj Poddar (Poddar), the young market research analyst at Double Koala, to come up with some key quantitative information at the next meeting scheduled for Monday.

With only 3 days left for the meeting, Poddar worked hard on the weekend to gather relevant information. In the next meeting, Poddar presented data on quarterly sales, number of distributors, distributors’ margins, company’s sales force strength, nearest competitor’s sales force strength, total incentives paid, and the expenditure on advertisement and celebrity endorsement. While there were quick responses, comments and conclusions made based on the data presented by Poddar (Exhibit I), Botliwala was aghast seeing the haphazard way in which the conclusions were drawn. He felt that in these days of mathematical modeling done with computer and software support, there should be a more objective way of drawing conclusions from data. He also raised the issue of assessing the impact of celebrity endorsement on the sales of their cola.

Footnotes

1. Rekhi Shefali, “COKE VS PEPSI – Cola Quarrels”, http://www.indiatoday.com/itoday/04051998/biz2.html, May 4th 1998

2. Bhushan Ratna, “Coca-Cola Thums down for PepsiCo”, http://economictimes.indiatimes.com/News/Coca-Cola_Thums_down_for_PepsiCo/rssarticleshow/3542480.cms, September 30th 2008


Exhibit I: Sales Data

Year | Quarter | Sales in INR | Number of Distributors | Distributor's margin (%) | Competitor's sales force | Company's sales force | Total incentive | Advertising budget
2002 | 1 | 20 | 155 | 5 | 1300 | 1000 | 0.5 | 1.2
2002 | 2 | 22 | 160 | 5 | 1400 | 1100 | 0.6 | 1.4
2002 | 3 | 24 | 160 | 5.5 | 1420 | 1300 | 0.6 | 1.5
2002 | 4 | 26 | 165 | 5.5 | 1425 | 1300 | 0.6 | 1.4
2003 | 1 | 26 | 165 | 6 | 1425 | 1326 | 0.7 | 1.4
2003 | 2 | 28 | 165 | 6 | 1400 | 1410 | 0.7 | 1.5
2003 | 3 | 28 | 165 | 6 | 1420 | 1420 | 0.7 | 1.6
2003 | 4 | 32 | 170 | 6.5 | 1425 | 1450 | 1 | 1.6
2004 | 1 | 30 | 170 | 6.5 | 1460 | 1460 | 1 | 1.5
2004 | 2 | 34 | 175 | 7 | 1460 | 1490 | 1 | 1.8
2004 | 3 | 34 | 175 | 7 | 1450 | 1510 | 1 | 1.8
2004 | 4 | 32 | 175 | 7 | 1500 | 1610 | 1 | 1.7
2005 | 1 | 36 | 180 | 7.5 | 1510 | 1650 | 1.5 | 1.8
2005 | 2 | 38 | 180 | 7.5 | 1500 | 1700 | 1.5 | 1.9
2005 | 3 | 38 | 185 | 8 | 1520 | 1750 | 1.5 | 1.9
2005 | 4 | 40 | 190 | 8 | 1530 | 1760 | 1.5 | 2.4
2006 | 1 | 44 | 200 | 8.5 | 1550 | 1790 | 1.8 | 2.1
2006 | 2 | 46 | 210 | 9.5 | 1560 | 1800 | 1.8 | 2.4
2006 | 3 | 48 | 200 | 9.5 | 1560 | 1800 | 1.8 | 2.5
2006 | 4 | 50 | 210 | 10 | 1550 | 1800 | 2 | 2.5
2007 | 1 | 58 | 220 | 10 | 1580 | 2000 | 2.1 | 2.9
2007 | 2 | 60 | 220 | 12 | 1600 | 2000 | 2.1 | 3.1
2007 | 3 | 62 | 230 | 12 | 1610 | 2000 | 2.2 | 3.2
2007 | 4 | 68 | 230 | 12.5 | 1650 | 2000 | 2.3 | 3.4
2008 | 1 | 72 | 250 | 12.5 | 1660 | 2400 | 2.9 | 3.6
2008 | 2 | 78 | 260 | 14 | 1750 | 2500 | 3 | 4.1
2008 | 3 | 84 | 265 | 15.5 | 1800 | 2500 | 3.5 | 4.2
2008 | 4 | 102 | 270 | 16 | 1850 | 2500 | 4 | 5.4

1 Crore = 10 million

Prepared by author

SECTION 6

Case Study: Planning for Road Safety


This case study was written by Dr. Sunil Bharadwaj, Professor (Department of Decision Sciences), IBS, Hyderabad. It is intended

to be used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situa-

tion. The case was written from generalised experiences.

The Mayor of Ivory city, Cooper Mandela (Mandela), has called a meeting with the traffic police officials. The agenda is to discuss the increasing number of fatal accidents in the city in the last few years. After greeting everybody, Mandela starts the discussion. During the discussion, the traffic police officials try to present their perspective on the recurring accidents.

The Meeting

The excerpts from the meeting are as follows:

As the discussion seemed to lead nowhere, Mandela was caught in a quandary with regard to finding a suitable solution to the problem. In spite of all the prevailing confusion, he is sure of one thing: once the reasons for the variability in the accidents are understood, he will immediately move forward with the necessary policy changes to solve the problem. To begin with, Mandela wants to understand the following to enable him to tackle the problem:

What are the factors causing accidents in the city?

Which variable describes the variability in the number of

accidents the most?

Which variables significantly describe the variability in the

number of accidents?

Fortunately, at this point, the Statistician of the Police Department produced some statistics relating to the road accidents in Ivory city (Exhibit I).


Exhibit I: Statistics of the road accidents in Ivory city

Year | Quarter | Number of officials in the field | Number of people visiting bars (in hundreds) | Number of licences issued to the young | Number of vehicles | Traffic police investments | Prescribed speed limits in the city
2002 | 1 | 20 | 500 | 13000 | 25000 | 1000 | 40
2002 | 2 | 22 | 500 | 14000 | 25500 | 1100 | 40
2002 | 3 | 24 | 600 | 14200 | 25600 | 1300 | 40
2002 | 4 | 26 | 600 | 14250 | 25700 | 1300 | 40
2003 | 1 | 26 | 600 | 13600 | 25750 | 1326 | 40
2003 | 2 | 28 | 650 | 14000 | 25800 | 1410 | 40
2003 | 3 | 28 | 600 | 14200 | 25900 | 1420 | 40
2003 | 4 | 32 | 650 | 14250 | 26000 | 1450 | 40
2004 | 1 | 30 | 650 | 14500 | 26500 | 1460 | 40
2004 | 2 | 34 | 700 | 14600 | 26700 | 1490 | 40
2004 | 3 | 34 | 700 | 14600 | 26800 | 1510 | 40
2004 | 4 | 32 | 700 | 14500 | 26850 | 1610 | 40
2005 | 1 | 36 | 900 | 15000 | 26900 | 1650 | 60
2005 | 2 | 38 | 900 | 15100 | 27000 | 1700 | 60
2005 | 3 | 38 | 900 | 15000 | 27500 | 1750 | 60
2005 | 4 | 40 | 900 | 15200 | 27900 | 1760 | 60
2006 | 1 | 44 | 1000 | 15300 | 29950 | 1790 | 60
2006 | 2 | 46 | 1000 | 15500 | 30000 | 1800 | 60
2006 | 3 | 48 | 1000 | 15600 | 30500 | 1800 | 60
2006 | 4 | 50 | 1000 | 15500 | 31000 | 1800 | 60
2007 | 1 | 58 | 1200 | 15800 | 32000 | 2000 | 60
2007 | 2 | 60 | 1200 | 16000 | 33000 | 2000 | 60
2007 | 3 | 62 | 1200 | 16100 | 35000 | 2000 | 60
2007 | 4 | 68 | 1200 | 16500 | 36500 | 2000 | 60
2008 | 1 | 72 | 1500 | 16600 | 38000 | 3400 | 80
2008 | 2 | 78 | 1500 | 17500 | 40000 | 3600 | 80
2008 | 3 | 84 | 1500 | 18000 | 45000 | 4000 | 80
2008 | 4 | 102 | 1800 | 18500 | 48000 | 4400 | 80

Prepared by author

SECTION 7

Case Study: Measuring Growth and Responsiveness


This case study was written by L. Shridharan, Professor, IBS Hyderabad. It is intended to be used as the basis for class discus-

sion rather than to illustrate either effective or ineffective handling of a management situation. The case was compiled from gen-

eralized experience.

Suziland is a prosperous country, belonging to the league of ‘developed nations’. With a population of about 270 million, the country’s growth has been keeping pace with the population growth. In early 2011 the National Planning Committee, headed by Dr. Peter Mugabe (a well known economist), was engaged in drawing up a development plan for the next four years (2012-2016). Being a free market economy, the country believed in indirect management of economic instruments rather than direct interventions. Dr. Mugabe firmly believed that the prosperity of the nation must be reflected in the growth of ‘personal consumption expenditure’ and its components, such as expenditures on durables, non-durables and services. As a prelude to planning for people’s prosperity, Dr. Mugabe emphasized the need for assessing the existing growth pattern and the responsiveness of expenditures on different heads to a change in the total personal consumption expenditure. He called Ms. Julie Obama, the Research Officer with the Committee, and asked her to provide the information within two days. Ms. Obama got on the job immediately. By contacting the Department of Statistics within the Government, she could get quarterly data on personal expenditures and its components for the past six years in billions of Suziland dollars (SZ$), the currency of Suziland (Exhibit I). With data at hand she now needed to answer Dr. Mugabe’s queries on growth in expenditure pattern and responsiveness of individual components to a change in the overall personal expenditure.

How should Ms. Obama proceed?


Exhibit I: Total Personal Consumption Expenditure and Its Components

Year | Quarter | Time | Expenditure on services | Expenditure on durables | Expenditure on non-durables | Personal consumption expenditure
2005 | 1 | 1 | 2274.1 | 529.7 | 1169.2 | 3973.0
2005 | 2 | 2 | 2284.0 | 545.7 | 1178.2 | 4008.0
2005 | 3 | 3 | 2306.0 | 556.7 | 1186.1 | 4049.4
2005 | 4 | 4 | 2319.8 | 569.7 | 1190.5 | 4080.0
2006 | 1 | 5 | 2335.1 | 578.7 | 1205.0 | 4118.9
2006 | 2 | 6 | 2354.1 | 578.7 | 1205.0 | 4118.9
2006 | 3 | 7 | 2365.7 | 590.3 | 1217.9 | 4174.0
2006 | 4 | 8 | 2377.0 | 605.9 | 1226.1 | 4209.0
2007 | 1 | 9 | 2390.5 | 604.5 | 1233.0 | 4227.9
2007 | 2 | 10 | 2413.2 | 613.2 | 1237.8 | 4264.1
2007 | 3 | 11 | 2427.6 | 625.6 | 1240.1 | 4293.2
2007 | 4 | 12 | 2439.3 | 633.1 | 1246.3 | 4318.6
2008 | 1 | 13 | 2463.1 | 642.1 | 1253.2 | 4358.4
2008 | 2 | 14 | 2481.1 | 661.5 | 1267.9 | 4411.1
2008 | 3 | 15 | 2499.9 | 658.4 | 1271.7 | 4430.0
2008 | 4 | 16 | 2512.6 | 669.9 | 1280.8 | 4463.3
2009 | 1 | 17 | 2531.6 | 689.7 | 1292.0 | 4513.2
2009 | 2 | 18 | 2551.5 | 689.1 | 1291.3 | 4529.9
2009 | 3 | 19 | 2581.5 | 714.2 | 1307.5 | 4602.9
2009 | 4 | 20 | 2608.5 | 681.8 | 1306.3 | 4596.6
2010 | 1 | 21 | 2631.2 | 746.5 | 1329.8 | 4707.5
2010 | 2 | 22 | 2666.1 | 766.5 | 1347.1 | 4779.7
2010 | 3 | 23 | 2701.5 | 771.0 | 1354.2 | 4826.7
2010 | 4 | 24 | 2742.6 | 701.1 | 1453.6 | 4897.3

Prepared by author

CHAPTER 11

Time Series Analysis & Exponential Smoothing

In this chapter we will discuss

Components of a Time Series
• Secular Trend
• Cyclical Variation
• Seasonal Variation
• Irregular Variation
The Multiplicative Model
Exponential Smoothing
Case Study: Predicting Sales of a Company
Case Study: The Electric Fan Industry

Section 1

Components of a Time Series

A sequence of values of a variable that changes over the course of time constitutes a Time Series. The time aspect of such variables plays a very important role as it affects the variable to a large extent. The analysis of time series helps in forecasting or projecting the future value of the variable.

The primary components of Time Series are:

Secular Trend

Cyclical Variation

Seasonal Variation

Irregular or Random Variation

Secular Trend

Secular trend is the general tendency of the data to grow,

decline or to remain constant in values over a period of

time. It relates to the movement of data over a fairly long

period of time. There are two types of secular trends:

Linear and Non-Linear.

Linear Secular trend is a straight line trend. When the data relating to a series is plotted against time, if most of the observations cluster around a straight line, it is a situation of linear trend. It can be upward sloping, downward sloping or horizontal to the time axis.

Video 11.1.1: Time Series Analysis

Non-Linear Secular trend is a trend which does not give rise to a straight line when the time series is plotted against time. It takes a concave, convex or curvilinear form with ups and downs. One of the widely used methods to fit a secular trend and estimate the model parameters is the Least Squares Method discussed earlier in the regression chapter.

Commonly, we use the following models for trend fitting:

where $t$ refers to the time period and $Y_t$ is the data at time $t$.

The least squares approach is used to fit the above trend curves. In the case of a linear trend, the parameters are estimated as:
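A minimal sketch of the standard least-squares estimates, assuming the linear trend model $Y_t = a + bt$ with $t = 1, 2, \ldots, n$. Under that assumption the slope is $b = \sum (t - \bar{t})(Y_t - \bar{Y}) / \sum (t - \bar{t})^2$ and the intercept is $a = \bar{Y} - b\bar{t}$. The function name and the sample series are illustrative only.

```python
def fit_linear_trend(y):
    """Least-squares fit of Y_t = a + b*t for t = 1..n (assumed time coding)."""
    n = len(y)
    t = list(range(1, n + 1))
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations of t
    s_ty = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    s_tt = sum((ti - t_bar) ** 2 for ti in t)
    b = s_ty / s_tt
    a = y_bar - b * t_bar
    return a, b


# Hypothetical series lying exactly on the line Y_t = 1 + 2t
a, b = fit_linear_trend([3, 5, 7, 9])
print(a, b)
```

The trend value for any period, including a future one, is then `a + b * t`.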


Video 11.1.2:Trend Analysis

Keynote 11.1.1: Time series analysis

Figure 11.1.1: Components of a Time Series

For manual calculations, see the link for simpler calculations.

Example 11.1.1 :(Refer keynote 11.1.2)

Sometimes, we also include dummy variables while defining the linear trend. Suppose we have quarterly data on sales of a product for a few years. We can model this situation with the following model:

$$Y_t = a + bt + c_1 Q_1 + c_2 Q_2 + c_3 Q_3,$$

where $t$ refers to time (expressed in quarters),
$Y_t$ = the data value at time $t$,
$Q_1$ = 1 if it is quarter 1, 0 otherwise,
$Q_2$ = 1 if it is quarter 2, 0 otherwise,
$Q_3$ = 1 if it is quarter 3, 0 otherwise.

Here too, we can estimate all the regression coefficients through the least squares method. This approach takes care of linear trend and seasonality together.
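The dummy-variable trend model can be sketched in plain Python as follows. The sketch assumes the series starts in quarter 1, treats quarter 4 as the baseline, and solves the normal equations by Gaussian elimination. The helper names `solve` and `fit_trend_with_quarter_dummies` are hypothetical, and a statistics package would normally be used instead.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_trend_with_quarter_dummies(y):
    """Least-squares estimates [a, b, c1, c2, c3] for Y_t = a + b*t + c1*Q1 + c2*Q2 + c3*Q3.

    Assumes y[0] is a quarter-1 observation and t = 1, 2, ..., n.
    """
    X = []
    for i in range(len(y)):
        quarter = i % 4 + 1
        dummies = [1.0 if quarter == q else 0.0 for q in (1, 2, 3)]
        X.append([1.0, float(i + 1)] + dummies)
    p = 5
    # Normal equations: (X'X) coef = X'y
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)
```

The forecast for a future quarter is obtained by plugging its value of $t$ and its dummy values into the fitted equation; for a quarter-4 period all three dummies are zero.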

Cyclical Variation

Cyclical variation is the gradual fluctuation in a time series taking place over a long time period (years). Business cycles present a common example of cyclical fluctuation, with boom, recession, slump and recovery phases. Most of the time series relating to price, investment, income, wage, production, etc., exhibit this type of cycle.

Residual Method

Keynote 11.1.2: Example

The Residual Method is the common method used for calculating cyclical variations. The ratio of actual values to the corresponding trend values is used as indicative of cyclical fluctuation.

$$\text{Cyclical Variation} = \frac{Y}{\hat{Y}}$$

where $Y$ = Actual values and $\hat{Y}$ = Estimated trend values.

Seasonal Variation

Seasonal Variation refers to fluctuations that occur regularly within a year over seasons. For instance, the sale of refrigerators would be influenced by the seasons (summer, winter, autumn or rainy). These are short-term fluctuations which can change weekly, monthly, quarterly or half yearly. The main reasons for such variations are natural causes such as weather or climate and social causes such as habits, customs, traditions, conventions and fashions.

Ratio to Moving Average Method

A widely used technique for calculating seasonal variations is the Ratio to Moving Average Method. In general, the moving average of a time series indicates running averages for the data taken over a given contiguous period. In the context of seasonal variation, we take the average over the number of periods in a year (4 if quarterly data, 12 if monthly data). Each time, the average is recorded at the centre of the period. If the number of periods is odd, then there is a unique centre. If it is even, then we centre the two middle-most averages by taking their average, so as to represent it against a particular period. It should be easy to see that these moving averages smooth out the seasonal effect. Consequently, the ratio of the actual value to the corresponding moving average value would be indicative of the seasonal impact. Using this logic, we develop a seasonal index, illustrated in the example (Refer keynote).

Keynote 11.1.3: Example

Keynote 11.1.4: Example
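The ratio-to-moving-average logic can be sketched as below. The function names are illustrative. The final rescaling, which makes the indices average to 1, is a common convention assumed here rather than a step stated in the text. The series is assumed to start at the first period of a year and to cover at least two full years.

```python
def centered_moving_average(y, period):
    """Centred moving average; None where a full window is not available."""
    n = len(y)
    cma = [None] * n
    half = period // 2
    if period % 2 == 1:
        # Odd period: the window has a unique centre
        for i in range(half, n - half):
            cma[i] = sum(y[i - half:i + half + 1]) / period
    else:
        # Even period: average the two adjacent period-length averages
        for i in range(half, n - half):
            first = sum(y[i - half:i + half]) / period
            second = sum(y[i - half + 1:i + half + 1]) / period
            cma[i] = (first + second) / 2
    return cma


def seasonal_indices(y, period):
    """Average ratio of actual value to centred moving average, per season."""
    cma = centered_moving_average(y, period)
    ratios = [[] for _ in range(period)]
    for i, (actual, average) in enumerate(zip(y, cma)):
        if average is not None:
            ratios[i % period].append(actual / average)
    raw = [sum(r) / len(r) for r in ratios]
    scale = period / sum(raw)  # assumed convention: indices average to 1
    return [value * scale for value in raw]


# Hypothetical quarterly series with a pure, repeating seasonal pattern
print(seasonal_indices([10, 20, 30, 40] * 3, 4))
```

An index above 1 marks a season in which the series typically runs above its moving average, and an index below 1 marks a season in which it runs below.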

Irregular Variation

Irregular variations follow an indistinct and unequal pattern. They do not repeat in any specific pattern. They are also called erratic, accidental or episodic variations. These variations are caused by accidental and random factors like earthquakes, famines, floods, wars, strikes, lockouts, epidemics, etc. They include variations which are not attributable to secular, seasonal or cyclical variations. There are no models to find out the irregular variations, as they occur unexpectedly and inconsistently, though some methods are used to isolate them.

A Multiplicative Model

A time series can be expressed as an additive or a multiplicative model. In practice, the multiplicative model is popularly used. The multiplicative model is expressed as:

$$Y_t = T_t \times C_t \times S_t \times I_t,$$

where $Y_t$ = Actual value of the time series at time $t$,
$T_t$ = Trend value of the time series at time $t$,
$C_t$ = Cyclical Index at time $t$,
$S_t$ = Seasonal Index at time $t$, and
$I_t$ = Irregularity ratio at time $t$.

As stated at the beginning, the purpose of studying a time series is to make forecasts for the near future. Using the multiplicative model, we can forecast taking into account the trend, cyclical and seasonal indices. We earlier studied how to quantify each of these components. We presume/expect the irregularity ratio to be unity on an average. Thus, a forecast based on the multiplicative model would be more reliable than one based on trend alone. However, we should keep in mind that insofar as ‘time’ is used as an “overall” explanatory variable for the behavior of the time series, such forecasts should be made only for the near future, i.e., for the short-term.

Keynote 11.1.5: Example
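As a small illustration of forecasting with the multiplicative model, the sketch below assumes a fitted linear trend $T_t = a + bt$ and takes the irregularity ratio as 1. The function name, parameter names and figures are hypothetical.

```python
def multiplicative_forecast(a, b, t, seasonal_index, cyclical_index=1.0):
    """Forecast Y_t = T_t * C_t * S_t with I_t taken as 1 and T_t = a + b*t (assumed)."""
    trend = a + b * t
    return trend * cyclical_index * seasonal_index


# Hypothetical inputs: trend 100 + 2t, period t = 25, seasonal index 1.1
print(multiplicative_forecast(100, 2, 25, 1.1))
```

If no cyclical index is available for the forecast period, leaving `cyclical_index` at 1 reduces the forecast to trend times seasonal index.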

References:

www.clt.astate.edu/crbrown/multiplicativemodel.ppt


Keynote 11.1.6: Example

Section 2

Exponential Smoothing

Exponential smoothing has become very popular as a forecasting method for a wide variety of time series data. Historically, the method was independently developed by Brown and Holt. Brown worked for the US Navy during World War II, where his assignment was to design a tracking system for fire-control information to compute the location of submarines. Later, he applied this technique to the forecasting of demand for spare parts (an inventory control problem). Since then, various types of exponential smoothing models have evolved. Generally, exponential smoothing techniques find application in financial and economic time series, though they can be used with any discrete set of repeated observations, as done by Brown earlier.

Moving Average

We earlier discussed the moving average in the context of a time series. Suppose a time series is more or less hovering around a constant value, except for some random errors; then we can write:

i.e., the average of the observed time series.

Thus the moving average gives equal weight to each observation and can be said to be an appropriate smoothing technique in the case of a “constant” time series, which is modeled as above.


Single Exponential Smoothing

When the value of the parameter in the model (the constant level) is slowly changing over time, giving equal weight to each observation may not be appropriate. Instead, it may be preferable to attach greater weight to the recent past than to the remote past in a graded manner. The Simple Exponential Smoothing method achieves this through a smoothing constant ($\alpha$). The model can be written as:

Being a recursive relation, this can be simplified as:

This implies that each smoothed value is the weighted average of the previous observations, where the weights decrease exponentially.

Refer keynote for example (example 11.2.1).
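A minimal sketch of single exponential smoothing, assuming the commonly used recursion $E_t = \alpha Y_t + (1 - \alpha) E_{t-1}$ with the first smoothed value set to the first observation. Both the symbol $E_t$ and this initialization are assumptions made for illustration. Expanding the recursion gives weights $\alpha, \alpha(1-\alpha), \alpha(1-\alpha)^2, \ldots$ on successively older observations, which is the exponential decay described above.

```python
def single_exponential_smoothing(y, alpha):
    """Return smoothed values E_t = alpha*Y_t + (1 - alpha)*E_{t-1}, with E_1 = Y_1 (assumed)."""
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must lie between 0 and 1")
    smoothed = [y[0]]
    for observation in y[1:]:
        smoothed.append(alpha * observation + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical series; the last smoothed value serves as the next-period forecast
print(single_exponential_smoothing([10, 20, 30], 0.5))
```

A value of `alpha` close to 1 makes the smoothed series follow recent observations closely, while a value close to 0 gives a flatter series that reacts slowly.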

Double Exponential Smoothing Model

If a time series exhibits a linear trend, then the Holt-Winter double exponential smoothing model is recommended for forecasting. The model smooths an exponentially smoothed estimate component (E), with a smoothing factor ($\alpha$), and a trend component (T), with another smoothing factor ($\beta$). The model is given as

Keynote 11.2.1: Example

where,
$F_t$ = Forecast value for period $t$
$Y_t$ = Actual value for period $t$
$E_t$ = Estimated value for period $t$
$T_t$ = Trend value for period $t$
$\alpha$ = Smoothing factor for estimates ($0 < \alpha < 1$)
$\beta$ = Smoothing factor for trends ($0 < \beta < 1$)
$k$ = number of periods ahead for which the forecast is being made.

In the model, (a) $E_1$ and $T_1$ are not defined and (b) we take $E_2 = Y_2$ and $T_2 = Y_2 - Y_1$.

By taking separate values for $\alpha$ and $\beta$, both between 0 and 1, we can obtain the forecast for $k$ periods ahead. However, the critical question is: how do we decide the values for $\alpha$ and $\beta$? We do this by “trial and error” so as to minimize the Root Mean Square Error (RMSE) of the model. This would be equivalent to minimizing the Mean Square Error (MSE). In practice, we minimize MSE. This can be done in an organized way using Excel Solver, where we try to minimize MSE (for the choice of $\alpha$ and $\beta$) subject to $\alpha \geq 0$, $\alpha \leq 1$, $\beta \geq 0$, $\beta \leq 1$.

With the values of $\alpha$ and $\beta$ that minimize the MSE, we can make the actual forecast of the time series.
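The procedure can be sketched in Python under stated assumptions. The recursions used are the standard Holt form, $E_t = \alpha Y_t + (1-\alpha)(E_{t-1} + T_{t-1})$, $T_t = \beta(E_t - E_{t-1}) + (1-\beta)T_{t-1}$ and $F_{t+k} = E_t + kT_t$, which are assumed here to match the model described above. The starting values $E_2 = Y_2$ and $T_2 = Y_2 - Y_1$ follow the text. A coarse grid search stands in for Excel Solver, and all function names are illustrative.

```python
def holt_forecasts(y, alpha, beta):
    """One-step-ahead forecasts for periods 3..n, plus the final level and trend."""
    level, trend = y[1], y[1] - y[0]           # E_2 = Y_2, T_2 = Y_2 - Y_1
    forecasts = []
    for observation in y[2:]:
        forecasts.append(level + trend)        # F_t = E_{t-1} + T_{t-1}
        new_level = alpha * observation + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    return forecasts, level, trend


def mse(y, alpha, beta):
    """Mean squared one-step-ahead forecast error over periods 3..n."""
    forecasts, _, _ = holt_forecasts(y, alpha, beta)
    errors = [actual - f for actual, f in zip(y[2:], forecasts)]
    return sum(e * e for e in errors) / len(errors)


def grid_search(y, step=0.1):
    """Trial-and-error search over a grid of (alpha, beta) pairs in [0, 1]."""
    grid = [round(i * step, 10) for i in range(int(round(1 / step)) + 1)]
    pairs = ((a, b) for a in grid for b in grid)
    return min(pairs, key=lambda pair: mse(y, pair[0], pair[1]))


def forecast_ahead(y, alpha, beta, k):
    """Forecast k periods beyond the last observation: E_n + k*T_n."""
    _, level, trend = holt_forecasts(y, alpha, beta)
    return level + k * trend
```

A typical use is `alpha, beta = grid_search(sales)` followed by `forecast_ahead(sales, alpha, beta, 1)`, where `sales` is a list of observations in time order. A finer `step` gives a closer approximation to what a numerical optimizer would return, at the cost of more evaluations.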

Forecast Error

Let $Y_t$ be a time series with data for $n$ periods and $F_t$ be the corresponding forecast (obtained by any method). Forecast errors are popularly measured by Mean Absolute Error (MAE) or Root Mean Squared Error (RMSE). The former is defined as

$$\mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} \left| Y_t - F_t \right|$$

If forecasts for a situation are obtained through several methods (say, by the decomposition method, single exponential smoothing and double exponential smoothing), then we can choose the method giving the least forecast error (based on MAE or RMSE) as the “best” method for the particular forecast. For single exponential smoothing, one could compute RMSE over various values of $\alpha$ itself, and for double exponential smoothing over various combinations of values of $\alpha$ and $\beta$. Having said this, periodically we need to reconfirm that the identified method and parameter values (like $\alpha$ and $\beta$) continue to be the best by recomputing the forecast errors and comparing them over different methods. In practice, use of RMSE or squared RMSE (called Mean Squared Error (MSE)) is more popular.
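The two error measures and the comparison across methods can be sketched as follows. RMSE is taken in its usual form, the square root of the average squared error. The method names and forecast figures in the usage lines are hypothetical.

```python
import math


def mae(actual, forecast):
    """Mean Absolute Error: average of |Y_t - F_t|."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root Mean Squared Error: square root of the average of (Y_t - F_t)^2."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def best_method(actual, forecasts_by_method, error=rmse):
    """Name of the method whose forecasts give the smallest error."""
    return min(forecasts_by_method, key=lambda name: error(actual, forecasts_by_method[name]))


# Hypothetical actuals and forecasts from two methods
actual = [10, 12, 14]
candidates = {"single": [11, 12, 12], "double": [10, 12, 15]}
print(best_method(actual, candidates))
```

Passing `error=mae` ranks the methods by MAE instead; the two measures need not agree, since RMSE penalizes large individual errors more heavily.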


The weighting factor ($\alpha$) in simple exponential smoothing ranges from

A. -1 to 1

B. 0 to 1

C. 1 to 2

D. 2 to 3

Section 3

Case Study: Predicting Sales of a Company


This case study was written by Sunil Bhardwaj, Professor, Department of Operations & IT, IBS Hyderabad. It is intended to be

used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situation.

The case was prepared from the generalised experiences.

Amar Corporation is in the business of manufacturing washing machines and the company has been doing fairly well. One of the important activities carried out at Amar Corporation is to generate production schedules so as to meet the anticipated demand for the coming year. Accurate planning for production and supplies depends on the quality of the forecast they are able to generate.

On his first visit to the company, in the first round of talks with the people at the corporate office, Amit observed that the forecasts are mostly based on the qualitative assessment of market conditions by the sales manager and his team. He observed that business forecasting has always been one of the important components of running such an enterprise. However, forecasting traditionally has been based less on concrete and comprehensive data than on face-to-face meetings and common sense. The typical practice is to take the opinion of the sales force team about the sales of washing machines, and then a number is mutually decided and agreed upon by the team. This practice has worked well for the company; however, for the last few years they have not been able to forecast with much accuracy.

During his MBA course he learned that in recent years, business forecasting has developed into a much more scientific exercise, with a host of theories, mathematical techniques and models designed for forecasting certain types of data. The development of information technologies and the Internet has given a boost to this development, as companies adopted such technologies not only into their business practices but into forecasting schemes as well. In the 2000s, forecasting or predicting the optimal levels of goods to buy or products to produce involved specialized software and electronic networks that incorporate a great deal of data and advanced mathematical algorithms tailored to a company’s particular market conditions and nature of business.

Amit understands that forecasting sales and profits, particularly on a short-term basis (one year to three years), is necessary for planning for business success. This process, estimating future business performance based on the actual results from prior periods, enables the business owner/manager to modify or manage the operation of the business on a timely basis. This allows the business to decide on sales targets with better understanding and to avoid losses or major financial problems in case some future results from operations do not conform to reasonable expectations. Amit was excited about the consulting project and he thinks it is an opportunity to train some of the managers on a few of the quantitative forecasting techniques. Amit has been thinking about the various parameters which may be helpful in preparing the forecast for the client. Some of them are:

Company Specific Data

Sales Force size,

Incentive schemes,


Promotional Budget

Price

Competitor Price

Environment Specific Data

Overall state of the economy,

Economic status of Amar Corporation as well as the industry within the economy

Population growth,

Disposable income

Elasticity of demand for the product or service

Threats from the substitutes or competitor products

Data from the Past

Previous sales levels and trends,

Average past administrative, and selling expenses,

Trends in the company’s credit policy (supplier, trade credit, and bank credit) to support various levels of inventory

Trends in accounts receivable required to achieve previous sales volumes

After a few rounds of meetings with the managers at Amar Corporation, they have invited Amit to teach a few of the quantitative approaches to forecasting. The following data was made available to Amit to start with:

I. What techniques can be used to forecast sales with this

data ? What are the drawbacks of the techniques?

II. Which is better qualitative forecasting, quantitative fore-

casting or both?


Exhibit I

Year | Sales (in lakhs) | Advertising Budget (in lakhs)
1991 | 150 | 15
1992 | 120 | 16
1993 | 160 | 15
1994 | 150 | 17
1995 | 150 | 18
1996 | 130 | 19
1997 | 180 | 18
1998 | 160 | 18
1999 | 170 | 18
2000 | 140 | 15
2001 | 200 | 20
2002 | 180 | 21
2003 | 200 | 22
2004 | 150 | 17
2005 | 230 | 24
2006 | 200 | 24
2007 | 250 | 25
2008 | 260 | 26
2009 | 270 | 28

Prepared by the author

Section 4

Case Study: The Electric Fan Industry


This case study was written by Prof. L. Shridharan, Department of Decision Sciences, IBS, Hyderabad. It is intended to be used

as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situation. The

case was written from generalized experiences.

The first electric fan was manufactured in India in 1921. While Orient Fans started in the 1940s, the growth in the industry came about after independence with a ban on imports. Jay Engineering Works (with Usha brand) started in the 1950s. Other major organised sector players in the industry today are Khaitan, Polar, Crompton Greaves, Bajaj, Havell’s, Metro and a few others, besides several smaller units in the small scale sector. The fan market in India consists of ceiling fans, table fans, pedestal fans, wall fans, exhaust fans and industrial exhaust & special purpose fans. Given the tropical nature of India, it is the ceiling fan which has a dominant share in the total production and market.

While in the fifties Kolkata emerged as the major production centre, manufacturing gradually spread to other major and medium cities in India. Hyderabad emerged as a major centre in the nineties, though Jay Engineering Works had started its unit in the sixties. Today, Hyderabad is considered the largest ceiling fan manufacturing centre in the country. While 10 to 15 units in the organized sector manufacture complete fans, a few hundred units (mostly in the unorganized sector) manufacture various components.

Tibrewala, owner and chief executive officer of Bhagyanagar Fans Limited, Hyderabad, is also the President of the association formed by the local fan manufacturers in the organized sector. With many units facing shortages in the supply of components in the last peak season, the association felt that a better understanding of the short-term demand pattern is necessary. After the meeting, Tibrewala called Ravi Kumar, the young market research executive (an MBA) with the association, and explained what he wanted. He asked him to come forward with a short-term forecast for the next six months, before the next meeting scheduled after two weeks.

Ravi Kumar got on to the job immediately with an internet search, library search and visits to the industry department for relevant data. Despite spending a week on these efforts, he could not find past data on sales or demand for electric fans. What he could find, however, was the monthly production by the organized sector since 2000 A.D. (Exhibit I). With only a week left to generate the forecast, Ravi is under pressure, since he is not clear on the data or the approach to generate the forecast.


INTERACTIVE 11.1 Production of Electric Fan in India

Production of Electric Fans in India (in lakh numbers)

Month 2000-01 2001-02 2002-03 2004-05 2005-06 2006-07 2007-08 2008-09

April 6.6 6.9 7.7 9 9.2 9.6 10.1 10.4

May 6.7 7.3 8.1 8.5 9.5 9.7 11.3 11.6

June 5.9 7.2 8.7 8.7 7.9 8.8 10.8 10.7

July 4.9 7.2 8.2 7.2 7 9.7 9.7 9.5

August 5.8 7.2 6.8 8.1 6.1 6.9 8.8 8.8

September 6.4 7.5 6.5 7.7 6.6 7.5 8.4 8.9

October 6.2 7.2 7.6 8.5 6.8 7.3 8.1 8.4

November 6.3 7.4 7.7 8.2 7 7.7 8.6 7.6

December 9.6 7.1 7.7 8.3 8.2 8.8 9.4 8.2

January 7.5 7.8 7.6 8.2 8.9 9.4 10.5

February 7.6 8.1 7.3 8.3 9.8 10.2 10.9

March 8.1 8.7 8 9.3 10.2 11.8 11.1

Source: Monthly Abstract of Statistics, Central Statistical Organization, Government of India. Several issues between January 2001 and March 2009

Exhibit I

Decision Theory

The Framework for Decision-making

Decision-Making Environment

• Decision-making under Certainty

• Decision-making under Uncertainty

• Decision Making under Risk

a. Expected Monetary Value

b. Expected Opportunity Loss

c. Expected value under Perfect Information

Decision Tree Analysis

Posterior Probability and Decision-making

CHAPTER 12

In this chapter we will discuss

Section 1

The Framework for Decision-Making under different Environments

Decision-making, both long-term and short-term, is an integral part of any management process. Managers at all levels have to deal with planning, organizing, monitoring and control at their levels of operation, be it in finance, production, investment, pricing, demand or research decisions. The impact of these decisions is expected to be seen through an increase in profit, a reduction in cost, an increase in turnover, an increase in market share, etc. Decision theory helps us in arriving at appropriate decisions under different circumstances.

The Framework for Decision-making

Regardless of the context and environment, all decision problems faced by managers have the following common features:

Decision Maker’s Objectives: These should be clearly stated, measurable objectives for the problem. For instance, a production manager would be concerned with minimizing the downtime of a machine whose downtime may prove costly to the organization. His problem could be to determine the optimal number of spare motors to maintain to enable quick repair of the machine.

Courses of Action: This is the list of alternative acts available to the manager to address the problem. He can choose any one of them. Clearly, he would like to identify the act which would be “optimal” for his situation. If n alternatives are available for the problem, they are denoted by A1, A2, ..., An.

For the production manager’s problem, the courses of action are the inventory levels of spare motors he could maintain, say 1, 2 or 3.

States of Nature: These are events which are beyond the control of the decision maker but have an impact on the problem at hand. They are generally denoted by S1, S2, ..., Sm to indicate the m states of nature faced by the problem. While the states of nature are uncertain, it may be possible, based on past experience or on some other basis, to indicate the probability of occurrence of each state of nature. For the production manager’s problem, the state of nature is how many motors would fail at a time (1, 2 or 3).

Payoffs: This is a calculable measure of benefit or loss for each combination of action and state of nature. The payoff for action Aj under state of nature Si is denoted by Xij. While the payoffs are generally in monetary terms, this is not a must. The payoff could be time saved, material saved, measurable quality improvement, etc. Thus the production manager may estimate the production lost or the downtime of the machine for each combination of inventory and breakdown as the payoff.

The following payoff table 12.1.1 sums up the production

manager’s problem:

Example: A Pricing Problem

Consider an Ayurvedic cosmetic company contemplating the introduction of a multi-herbal cream in place of its existing uni-herbal cream. The issue before the company is to decide on the price under three options: offer the existing price (A1); a moderate increase in price highlighting the multi-herbal character (A2); or a substantial increase in price with a new attractive packaging highlighting the multi-herbal character (A3). These are the courses of action, of which the company wants to choose one. However, the company is aware of the following possible market conditions (states of nature): no competitor emerging (S1), a small competitor emerging (S2) and a major competitor emerging (S3). The marketing department estimates the annual net profits (payoffs) for each course of action under different market conditions in table 12.1.2 as follows:

The company has to decide on a strategy (action) to be followed.


Table 12.1.1

Number of motors      Spare motors to keep (Act)
that may fail         1 (A1)    2 (A2)    3 (A3)
1 (S1)                X11       X12       X13
2 (S2)                X21       X22       X23
3 (S3)                X31       X32       X33

Table 12.1.2

Annual Net Profit (in Rs. mn) under different pricing strategies

Market condition      No increase    Moderate increase    Substantial increase
No competition        6.00           5.00                 3.50
Minor competition     5.00           4.50                 2.50
Major competition     4.00           3.00                 1.80

Decision-Making Environment

We have three types of decision-making environment:

Under Certainty

Under Uncertainty

Under Risk

Decision-making under Certainty

Under a certain environment there is only one state of nature and all information is known with definite results. Hence a deterministic choice of action can be made. Techniques like Linear Programming, Transportation and Assignment models, Goal Programming, Break-Even Analysis, etc., are used under a certain environment.

Decision Making under Uncertainty

Under an uncertain environment, more than one state of nature exists. However, beyond identifying the states of nature, we have no information on them, and hence it is not possible to assign a probability to each state. In this situation, decisions are based on specific criteria, depending on one’s choice of principles. Several alternative criteria are available: (a) Maximin; (b) Minimax; (c) Maximax; (d) Laplace; (e) Hurwitz Realism; and (f) Regret. We explain each of the criteria below, illustrated in the context of the pricing problem:

(a). Maximin Criterion: This is a pessimistic approach,

where we go for the best action under the worst state of

nature.

For the problem, the decision would be to go for “no

increase” in price, as the company makes Rs. 4 million

under the worst scenario of a major competitor emerging.

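(b). Minimax Criterion: Here we note the maximum payoff of each strategy (which occurs under the best market condition, “no competition”) and select the strategy with the smallest of these maxima. The maxima are Rs. 6.00 mn, Rs. 5.00 mn and Rs. 3.50 mn, so this approach suggests the strategy of “substantial increase” in price, with which the company makes Rs. 3.50 mn. The criterion is more natural when the payoffs are costs or losses, where the smallest maximum is desirable.

(c). Maximax Criterion: This is the “best of the best” approach: we select the strategy with the highest payoff under the best state of nature. Here, the best market condition is “no competition” and the best strategy is “no increase” in price, with a payoff of Rs. 6.00 mn.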

(d). Laplace Criterion: With no information on the probabilities of the various states of nature, we assume equal probability (1/3 in this case) and compute the expected payoff for each action. Hence we get:

E (No price change) = (6.00 + 5.00 + 4.00)/3 = Rs. 5 mn

E (Moderate price change) = (5.00 + 4.50 + 3.00)/3 = Rs. 4.17 mn

E (Substantial price change) = (3.50 + 2.50 + 1.80)/3 = Rs. 2.6 mn

Hence, by the Laplace criterion, we prefer “no increase” in price, with an expected payoff of Rs. 5 mn.

(e). Hurwitz Realism Criterion: The Maximax and Maximin criteria are the two extremes, optimistic and pessimistic. Hurwitz proposed that realism would lie somewhere in between. Representing the degree of optimism by α (where 0 < α < 1), Hurwitz suggested that for each strategy (act) a decision index (Di) be calculated as the weighted average of the optimistic and pessimistic payoffs, the former weighted by the degree of optimism (α) and the latter by the degree of pessimism (1 - α). Taking α = 0.6, for the pricing problem we get:

D (No price rise) = 0.6 x 6.00 + 0.4 x 4.00 = 5.2

D (Moderate price rise) = 0.6 x 5.00 + 0.4 x 3.00 = 4.2

D (Substantial price rise) = 0.6 x 3.50 + 0.4 x 1.80 = 2.82

Thus we would go for “no increase” in price based on the Hurwitz criterion. The conclusion could depend on the value of α. Hence a realistic estimate of α needs to be arrived at.

(f) Regret Criterion: This approach takes into account the loss from a missed opportunity due to not knowing the state of nature in advance. An opportunity loss can be computed as the difference between the maximum payoff under a given state of nature and the payoff for a given outcome under that state. The opportunity loss table 12.1.3 for the pricing problem would be as follows:

The maximum regret is zero for “no increase”, 1 for “moderate increase” and 2.5 for “substantial increase”. Thus, the minimum of the maximum regrets is zero. Hence, “no price increase” is the strategy to be adopted under the regret criterion.

While there are differences in the conclusions reached following different criteria, they primarily reflect the underlying principles of each criterion. The company has to take a call on the principle to follow depending on its outlook and judgement.
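The six criteria can be checked mechanically. The following is a minimal Python sketch, assuming the payoffs of Table 12.1.2 and the degree of optimism 0.6 used in the text; the variable and function names (`payoffs`, `best`, `ALPHA`) are illustrative choices, not part of the original material.

```python
# Payoffs (annual net profit, Rs. mn) from Table 12.1.2.
# Each list is ordered by market condition:
# [no competition, minor competition, major competition].
payoffs = {
    "no increase": [6.00, 5.00, 4.00],
    "moderate increase": [5.00, 4.50, 3.00],
    "substantial increase": [3.50, 2.50, 1.80],
}
ALPHA = 0.6  # degree of optimism, as assumed in the text


def best(scores, pick=max):
    """Return the strategy with the largest score (smallest if pick=min)."""
    return pick(scores, key=scores.get)


# Maximin: best of the worst payoffs. Maximax: best of the best payoffs.
maximin = best({a: min(row) for a, row in payoffs.items()})
maximax = best({a: max(row) for a, row in payoffs.items()})
# Minimax: smallest of the per-strategy maximum payoffs.
minimax = best({a: max(row) for a, row in payoffs.items()}, pick=min)

# Laplace: simple average, i.e. equal probability for every state.
laplace = {a: sum(row) / len(row) for a, row in payoffs.items()}

# Hurwitz: weighted average of the best and worst payoff of each strategy.
hurwitz = {a: ALPHA * max(row) + (1 - ALPHA) * min(row)
           for a, row in payoffs.items()}

# Regret: best payoff in each state minus the payoff actually obtained.
n_states = 3
state_best = [max(row[s] for row in payoffs.values()) for s in range(n_states)]
regret = {a: [state_best[s] - row[s] for s in range(n_states)]
          for a, row in payoffs.items()}
minimax_regret = best({a: max(r) for a, r in regret.items()}, pick=min)

print("Maximin:", maximin)
print("Minimax:", minimax)
print("Maximax:", maximax)
print("Laplace:", best(laplace), {a: round(v, 2) for a, v in laplace.items()})
print("Hurwitz:", best(hurwitz), {a: round(v, 2) for a, v in hurwitz.items()})
print("Minimax regret:", minimax_regret)
```

Changing `ALPHA` shows how the Hurwitz decision index moves between the Maximin and Maximax values for each strategy.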

Decision-Making under Risk

(a). Expected Monetary Value

Unlike in the previous case, when the probabilities of the states of nature are known, it is a situation of decision-making under risk. The probabilities are usually known on the basis of past data or experience. In this case, we have only one criterion, leading to a unique selection of strategy, though probabilistic in nature. We find the expected payoff for each strategy, called the Expected Monetary Value (EMV), using the probabilities of the states of nature. In the pricing example, suppose the probabilities of the three states are given as 0.2 (for no competition), 0.5 (for minor competition) and 0.3 (for major competition). Then, the EMV or expected profit (EP in this case) is given as

EMV (No change in Price) = 0.2 x 6.00 + 0.5 x 5.00 + 0.3 x 4.00 = Rs. 4.9 mn

EMV (Moderate change in Price) = 0.2 x 5.00 + 0.5 x 4.50 + 0.3 x 3.00 = Rs. 4.15 mn

EMV (Substantial change in Price) = 0.2 x 3.50 + 0.5 x 2.50 + 0.3 x 1.80 = Rs. 2.49 mn

Since the EMV is highest for “no change” in price, the company should follow this strategy.
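As a quick check of the EMV figures, here is a short Python sketch. It assumes the payoffs of Table 12.1.2 and the probabilities 0.2, 0.5 and 0.3 given above; the names `probs`, `payoffs` and `emv` are illustrative.

```python
# State probabilities: no, minor and major competition.
probs = [0.2, 0.5, 0.3]

# Payoffs (Rs. mn) from Table 12.1.2, one list per pricing strategy,
# ordered by market condition in the same order as probs.
payoffs = {
    "no change": [6.00, 5.00, 4.00],
    "moderate change": [5.00, 4.50, 3.00],
    "substantial change": [3.50, 2.50, 1.80],
}

# EMV of a strategy = sum over states of probability x payoff.
emv = {a: sum(p * x for p, x in zip(probs, row)) for a, row in payoffs.items()}
best_act = max(emv, key=emv.get)

for act, value in emv.items():
    print(f"EMV ({act}) = Rs. {value:.2f} mn")
print("Strategy with the highest EMV:", best_act)
```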


Table 12.1.3

Regret payoffs (in Rs. mn) under different pricing strategies

Market condition      No increase    Moderate increase    Substantial increase
No competition        (6-6)=0        (6-5)=1              (6-3.5)=2.5
Minor competition     (5-5)=0        (5-4.5)=0.5          (5-2.5)=2.5
Major competition     (4-4)=0        (4-3)=1              (4-1.8)=2.2

(b) Expected Opportunity Loss: Under the regret criterion, we discussed payoffs in terms of opportunity loss. If we are working with such payoffs, we call the corresponding expected value the Expected Opportunity Loss (EOL) and go for the strategy with the least expected loss. This approach would also be followed if the payoffs were defined in terms of cost or downtime. In the pricing example, using the probabilities 0.2, 0.5 and 0.3:

EOL (No increase) = 0.2 x 0 + 0.5 x 0 + 0.3 x 0 = 0

EOL (Moderate increase) = 0.2 x 1 + 0.5 x 0.5 + 0.3 x 1 = 0.75

EOL (Substantial increase) = 0.2 x 2.5 + 0.5 x 2.5 + 0.3 x 2.2 = 2.41

Thus the “no increase” in price option is selected.
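The EOL calculation can be sketched in Python as below. The sketch assumes the payoffs of Table 12.1.2 and the state probabilities 0.2, 0.5 and 0.3; it derives the regret of each strategy from the payoffs rather than typing the regret table in, and all names are illustrative.

```python
probs = [0.2, 0.5, 0.3]  # no, minor and major competition

# Payoffs (Rs. mn) from Table 12.1.2, ordered by market condition.
payoffs = {
    "no increase": [6.00, 5.00, 4.00],
    "moderate increase": [5.00, 4.50, 3.00],
    "substantial increase": [3.50, 2.50, 1.80],
}

n_states = len(probs)
# Best achievable payoff under each market condition.
best_in_state = [max(row[s] for row in payoffs.values()) for s in range(n_states)]

# Opportunity loss (regret) = best payoff in the state - payoff obtained.
regret = {a: [best_in_state[s] - row[s] for s in range(n_states)]
          for a, row in payoffs.items()}

# EOL = probability-weighted average of the regrets of a strategy.
eol = {a: sum(p * r for p, r in zip(probs, row)) for a, row in regret.items()}
choice = min(eol, key=eol.get)

for act, value in eol.items():
    print(f"EOL ({act}) = {value:.2f}")
print("Strategy with the least EOL:", choice)
```

The strategy with the least EOL is always the one with the highest EMV, since the EOL and EMV of any strategy add up to the same constant.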

(c) Expected Value under Perfect Information:

In the above kind of situation, suppose a soothsayer (a modern-day Market Research Consultant) says that he can give perfect information on the occurrence of the states of nature. Could this knowledge affect our choice of strategy? How much would the information be worth, as soothsayers do not come free? For instance, in the pricing example, with the probability of each state remaining the same, the soothsayer tells us the state of nature (the market condition) for every day of the next year. As opposed to the previous case, where we knew the percentage of days for which each state of nature would apply, we now additionally know which state will apply on any given day. Clearly, with the market condition known for each day, the company would choose the strategy with the maximum payoff for that day. Thus, under perfect information, the relevant part of the payoff table 12.1.4 would look as below:

Therefore, Expected Profit of Perfect Information (EPPI)

= 0.2 x 6.00 + 0.5 x 5.00 + 0.3 x 4.00

= Rs. 4.9 mn

and, Expected Value of Perfect Information (EVPI)

= EPPI - EMV (max)

= 4.9 - 4.9 = 0

Normally, the difference between the EPPI and EMV (max) is the maximum amount of money the company would be willing to pay for perfect information. In this pricing example, “no increase” is the best strategy under every market condition, so the difference is zero and the company will not pay for perfect information.
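The EPPI and EVPI figures can be reproduced with a few lines of Python. This is only a sketch under the same assumptions as before (payoffs of Table 12.1.2, probabilities 0.2, 0.5 and 0.3); the variable names are illustrative.

```python
probs = [0.2, 0.5, 0.3]  # no, minor and major competition

# Payoffs (Rs. mn) from Table 12.1.2, ordered by market condition.
payoffs = {
    "no change": [6.00, 5.00, 4.00],
    "moderate change": [5.00, 4.50, 3.00],
    "substantial change": [3.50, 2.50, 1.80],
}

# EMV of each strategy without any extra information.
emv = {a: sum(p * x for p, x in zip(probs, row)) for a, row in payoffs.items()}

# With perfect information the best strategy is picked state by state,
# so EPPI weights the best payoff of each state by its probability.
eppi = sum(p * max(row[s] for row in payoffs.values())
           for s, p in enumerate(probs))

# EVPI is the most that perfect information could be worth.
evpi = eppi - max(emv.values())

print(f"EPPI = Rs. {eppi:.2f} mn")
print(f"EVPI = Rs. {evpi:.2f} mn")
```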


Table 12.1.4

Payoffs (annual net profit in Rs. mn) under perfect information

Market condition      Probability of the state    No change    Moderate change    Substantial change
No competition        0.2                         6.00         -                  -
Minor competition     0.5                         5.00         -                  -
Major competition     0.3                         4.00         -                  -

Section 2

Decision Tree Analysis

We discussed decision-making under risk earlier. A decision tree is a diagrammatic presentation of the same decision process, which helps essentially in easy comprehension of the logical relations in the process. A decision tree is a convenient tool for making financial or number-based decisions where a lot of complex information needs to be taken into account. Decision trees provide an effective model in which alternative decisions and the implications of taking those decisions can be laid out and evaluated. They also help managers to get an accurate, balanced picture of the risks and rewards that can result from a particular decision. Decision trees can be drawn to evaluate the risk in decisions concerning investments, new product launches, outsourcing, etc.

Guidelines to Draw a Decision Tree

Drawing a decision tree starts with a decision that needs to be made. This decision is represented by a small square called a decision node (decision trees are usually drawn from left to right). Each possible alternative is represented by a line drawn from the decision square, diverging to the right, and the payoff is written at the end of the line. When the outcomes at a point are uncertain, we draw a small circle to represent that node. Each outcome from this node, called a chance node, is represented by a line with an associated probability. The payoffs are written at the end of the line. If the result is another decision, draw another square at the end of the line. Figure 12.2.1 shows what a decision tree looks like.

Once one is ready with a basic decision tree, review it to find out whether any other solutions or outcomes can be considered for further evaluation. Then prepare a final decision tree diagram.


Figure 12.2.1: Decision Tree Model

Example 12.2.1

An FMCG company is making plans either to launch a new product or to consolidate its existing products. The company can launch new products in the market in two ways: (1) through detailed product development, or (2) through rapid product development. If the company wants to consolidate, it would do so either by strengthening its existing products through advertising and promotion, or by reaping the benefits of the brand name of the company without making any additional investments. For this, the company has employed a market research firm to find out the market reaction to its products. The market research firm, after surveying the company’s products, found that the market may have three reactions (good, average and poor) and accordingly calculated the profits for each reaction.

If the company goes for detailed product development for launching new products, it can make a profit of Rs. 10,00,000 when the market reaction is good, Rs. 50,000 when average and Rs. 2,000 when poor; the probabilities of these reactions are 0.4, 0.4 and 0.2 respectively.

If the company goes for rapid product development for launching new products, it can make a profit of Rs. 8,00,000 when the market reaction is good, Rs. 25,000 when average and Rs. 2,000 when poor; the probabilities of these reactions are 0.2, 0.1 and 0.7 respectively.

If the company goes for strengthening its existing products for consolidation, it can make a profit of Rs. 3,00,000 when the market reaction is good, Rs. 20,000 when average and Rs. 6,000 when poor; the probabilities of these reactions are 0.2, 0.4 and 0.4 respectively.

If the company consolidates its existing products without making any additional investments, i.e., reaps the existing products, it can make a profit of Rs. 20,000 when the market reaction is good, Rs. 9,000 when average and Rs. 6,000 when poor; the probabilities of these reactions are 0.3, 0.2 and 0.5 respectively.

The detailed development cost is Rs. 1,50,000, the rapid development cost is Rs. 80,000, while the cost of strengthening the existing products is Rs. 30,000. Now the company has to decide whether to launch new products or to consolidate existing products.

Solution:

Evaluation of Decision Tree

Once a decision tree diagram is made, a manager can use it to take the decision that will give him the greatest payoff. Managers can evaluate each possible outcome by assigning cash or numeric values to it. The lines from a circle (chance point) are then given probabilities depending on the chance of that event (outcome) occurring. Clearly, at each circle the probabilities must total 1. These probabilities are assigned based on past data (if data is available) or on an experience-based guess by the manager (subjective probability). When probabilities are assigned, the decision tree in Figure 12.2.2 will look like the tree in Figure 12.2.3.

Calculating Tree Values

Once the manager has decided the values of the outcomes and has assessed the probability of occurrence of these outcomes, he can start calculating the value of each alternative. In our problem, the probability-weighted values are shown in Figure 12.2.4.

Figure 12.2.2: Decision Tree Model for a FMCG Company

New Product
    Detailed Development -> Market Reaction: Good, Average, Poor
    Rapid Development -> Market Reaction: Good, Average, Poor
Consolidate
    Strengthen Product -> Market Reaction: Good, Average, Poor
    Reap benefits of its brand name -> Market Reaction: Good, Average, Poor

Figure 12.2.3: Decision Tree with Probabilities

(G - Good, A - Average, P - Poor)

New Product
    Detailed Development -> Market Reaction: (0.4) G Rs. 10,00,000; (0.4) A Rs. 50,000; (0.2) P Rs. 2,000
    Rapid Development -> Market Reaction: (0.2) G Rs. 8,00,000; (0.1) A Rs. 25,000; (0.7) P Rs. 2,000
Consolidate
    Strengthen Product -> Market Reaction: (0.2) G Rs. 3,00,000; (0.4) A Rs. 20,000; (0.4) P Rs. 6,000
    Reap Product -> Market Reaction: (0.3) G Rs. 20,000; (0.2) A Rs. 9,000; (0.5) P Rs. 6,000

Figure 12.2.4: Decision Tree

(G - Good, A - Average, P - Poor)

New Product
    Detailed Development: 4,20,400 (G - 4,00,000; A - 20,000; P - 400)
    Rapid Development: 1,63,900 (G - 1,60,000; A - 2,500; P - 1,400)
Consolidation
    Strengthen Products: 70,400 (G - 60,000; A - 8,000; P - 2,400)
    Reap Products: 10,800 (G - 6,000; A - 1,800; P - 3,000)

Calculating the values for chance nodes (Circles)

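Consider the chance node under new product with the detailed development approach. The EMV at this chance node, given the probabilistic market reaction, can be computed as:

EMV1 = 10,00,000 × 0.4 + 50,000 × 0.4 + 2,000 × 0.2 = Rs. 4,20,400.

Similarly, for rapid development, strengthening products and reaping products, we get:

EMV2 = 8,00,000 × 0.2 + 25,000 × 0.1 + 2,000 × 0.7 = Rs. 1,63,900

EMV3 = 3,00,000 × 0.2 + 20,000 × 0.4 + 6,000 × 0.4 = Rs. 70,400

EMV4 = 20,000 × 0.3 + 9,000 × 0.2 + 6,000 × 0.5 = Rs. 10,800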
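The chance-node values can be verified with a small Python sketch. It assumes the probabilities and profits stated in Example 12.2.1; the dictionary `branches` and the helper `chance_node_value` are illustrative names, not part of the original material.

```python
# Each chance node lists (probability, profit in Rs.) for the
# good, average and poor market reactions, as given in Example 12.2.1.
branches = {
    "detailed development": [(0.4, 1_000_000), (0.4, 50_000), (0.2, 2_000)],
    "rapid development": [(0.2, 800_000), (0.1, 25_000), (0.7, 2_000)],
    "strengthen products": [(0.2, 300_000), (0.4, 20_000), (0.4, 6_000)],
    "reap products": [(0.3, 20_000), (0.2, 9_000), (0.5, 6_000)],
}


def chance_node_value(outcomes):
    """Return the EMV of a chance node after checking its probabilities."""
    total = sum(p for p, _ in outcomes)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("probabilities at a chance node must total 1")
    return sum(p * value for p, value in outcomes)


emv = {name: chance_node_value(outcomes) for name, outcomes in branches.items()}

for name, value in emv.items():
    print(f"EMV ({name}) = Rs. {value:,.0f}")
```

Note that the output uses international digit grouping (420,400) rather than the Indian grouping (4,20,400) used in the text.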

Calculating the values of decision nodes (Squares)

While evaluating a decision node, one should write down the cost of each alternative along its decision line. This cost is then subtracted from the value of the outcome that has already been calculated. This gives a value which represents the benefit of that decision (sunk costs, i.e., amounts already spent, are not considered while calculating the node value). After calculating the benefit of each decision, one can select the decision that offers the greatest benefit. Figure 12.2.5 shows the expected net monetary benefits of each decision.

In Figure 12.2.5, one can see that the net benefit for “new product, detailed development” is Rs. 2,70,400 (after deducting the cost of this decision, 4,20,400 - 1,50,000). The net benefit for “new product, rapid development” is Rs. 83,900. Thus the benefits from “new product, detailed development” are greater than those from “new product, rapid development”. Hence the most valuable option, i.e. “new product, detailed development”, is selected and its value is assigned to the decision node. Similarly, for the consolidation decision node, “strengthening the product” has a net benefit of Rs. 40,400 (70,400 - 30,000) against Rs. 10,800 for reaping the products, so the value of “strengthening the product” is assigned to that node.

Final Result

When the values of the decision nodes are available, the manager can go for the most rewarding decision. In this example, the question is “should we develop a new product or consolidate?” Since Rs. 2,70,400 exceeds Rs. 40,400, the best option is to develop a new product through detailed development.
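The evaluation of the decision nodes can be sketched in Python as follows. The sketch assumes the chance-node EMVs and the costs stated in the example (with no cost for reaping the products); the names `emv`, `cost`, `options` and `node_value` are illustrative.

```python
# Chance-node EMVs (Rs.) computed earlier and the cost of each alternative.
emv = {
    "detailed development": 420_400,
    "rapid development": 163_900,
    "strengthen products": 70_400,
    "reap products": 10_800,
}
cost = {
    "detailed development": 150_000,
    "rapid development": 80_000,
    "strengthen products": 30_000,
    "reap products": 0,  # no additional investment
}
# Alternatives available at each second-level decision node.
options = {
    "new product": ["detailed development", "rapid development"],
    "consolidate": ["strengthen products", "reap products"],
}

# Net benefit of an alternative = EMV of its chance node - its cost.
net = {alt: emv[alt] - cost[alt] for alt in emv}

# Each decision node takes the value of its best alternative.
node_choice = {d: max(alts, key=net.get) for d, alts in options.items()}
node_value = {d: net[node_choice[d]] for d in options}

# The root decision picks the decision node with the highest value.
final = max(node_value, key=node_value.get)

for d in options:
    print(f"{d}: choose {node_choice[d]} (Rs. {node_value[d]:,})")
print("Final decision:", final, "via", node_choice[final])
```

Working from the end of the tree back to the root in this way is what makes the method scale to trees with more levels.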

Figure 12.2.5: Decision Tree Showing Expected Gross Monetary Benefits of New Product Development and Consolidation

Final Result: 2,70,400
    New Product: 2,70,400
        Detailed dev cost = 1,50,000; 4,20,400 - 1,50,000 = 2,70,400
        Rapid dev cost = 80,000; 1,63,900 - 80,000 = 83,900
    Consolidation: 40,400
        Strengthen products: 70,400 before cost; net 40,400
        Reap products cost = 0; 10,800

The value of “strengthening the product” is assigned to the second decision node (consolidation).

Section 3

Posterior Probability and Decision-making

The decision-making problems that we have handled so far had different “states of nature”, and we had an idea of the probability of occurrence of each state. Any new additional information should help in arriving at a better decision. In particular, we consider a situation where the old information on the states of nature is re-evaluated in the light of some new conditional information. We may recall that Bayes’ Theorem helps us in such situations in obtaining posterior probabilities, i.e., probabilities revised after the identified conditions are observed. With such additional information, it would be logical to expect to arrive at a better decision. Let us see this through an example.

Example : Housing problem

“Ansal Lifestyles” is a leading housing development company. The company has acquired a prime plot of land in a metro city and proposes to develop a dwelling complex on it. The company is considering three options for the dwelling complex: a small complex with only 30 flats, a medium complex with 60 flats, or a large complex with 90 flats. The demand for the flats is expected to be either strong (with 0.8 probability) or weak (with 0.2 probability). Ansal Lifestyles has to decide on one of the options for construction. The payoffs for the different combinations of demand and options are as below.

Which of the options would you recommend?

Solution:

Here, EMV (Small complex) = Rs. 7.8 mn

EMV (Medium complex) = Rs. 12.2 mn

EMV (Large complex) = Rs. 14.2 mn

Hence Ansals should go for the large complex. A decision tree for this problem would look as in Keynote 12.3.1 - figure (A).

Further, EVPI = EPPI - EMV (max)

= 17.4 - 14.2

= Rs. 3.2 mn

Thus Ansals would be willing to pay a maximum of Rs. 3.2 mn for perfect information on the state of demand, and no more.

Housing example: Extended

Even as Ansals are debating the options, a Market Research (MR) Consultant offers to prepare a detailed demand study for a fee. Ansals at this point estimate the chances of a favorable report under different demand conditions as below:

P {Favorable Report (FR) / Strong Demand (SD)} = 0.9

P {Favorable Report (FR) / Weak Demand (WD)} = 0.25

(Please note that we now have some new/additional information, but not perfect information.)

Should Ansals engage the Market Research Consultant and, if so, what is the maximum fee they can consider paying him?

Solution:

At this stage (before engaging the MR consultant), we know the chances of a favorable report given the state of demand (strong or weak). However, Ansals would be interested in knowing the chances of a strong or weak demand given that the report is favorable (FR) or unfavorable (UFR). The hope is that we will have a better estimate of the probability of demand with the MR study findings. In other words, we would like to know P(SD/FR), P(WD/FR), P(SD/UFR) and P(WD/UFR).

These can be easily found using Bayes’ Theorem. The decision tree to decide whether Ansals should go for the MR study or not would be as in Keynote 12.3.1 - figure (B), and would use the posterior probabilities.
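The posterior probabilities follow directly from Bayes’ Theorem. The Python sketch below assumes only the prior probabilities (0.8 and 0.2) and the conditional probabilities of a favorable report (0.9 and 0.25) given in the example; the function `posterior` and the other names are illustrative. The payoffs are not used here.

```python
# Prior probabilities of strong demand (SD) and weak demand (WD).
prior = {"SD": 0.8, "WD": 0.2}

# P(favorable report | demand), as estimated by Ansals.
p_fav_given = {"SD": 0.9, "WD": 0.25}
# P(unfavorable report | demand) is the complement.
p_unfav_given = {state: 1 - p for state, p in p_fav_given.items()}


def posterior(prior, likelihood):
    """Apply Bayes' Theorem: return P(report) and P(demand | report)."""
    joint = {s: prior[s] * likelihood[s] for s in prior}
    marginal = sum(joint.values())
    return marginal, {s: joint[s] / marginal for s in joint}


p_fr, post_fr = posterior(prior, p_fav_given)
p_ufr, post_ufr = posterior(prior, p_unfav_given)

print(f"P(FR) = {p_fr:.2f}, P(UFR) = {p_ufr:.2f}")
print(f"P(SD/FR) = {post_fr['SD']:.3f}, P(WD/FR) = {post_fr['WD']:.3f}")
print(f"P(SD/UFR) = {post_ufr['SD']:.3f}, P(WD/UFR) = {post_ufr['WD']:.3f}")
```

A favorable report raises the probability of strong demand above its prior of 0.8, while an unfavorable report lowers it sharply, which is why the best option can differ between the two branches of the tree.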

As can be seen,

EMV (MR Study) = Rs.15.94 mn

EMV (No MR Study) = Rs. 14.2 mn

Clearly, it is worth going for the MR Study. The consultant

can be paid a maximum of Rs. 1.74 mn.


Keynote 12.3.1: Decision Tree

SECTION 4

Case Study: Mining for Precious Metals


This case study was written by Dr. Sunil Bharadwaj, Professor (Department of Decision Sciences), IBS, Hyderabad. It is intended to be used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situation. The case was written from generalized experiences.

National Mineral Corporation (NMC), a leading metal exploration company, has recently started its operations in South India. Its business is to find sites where potential sources of metals are present. The company’s procedure involves evaluating a piece of land by taking samples of earth at various depths and analyzing the extract for its contents, which is termed ‘geological exploration’. Before starting explorations, the company buys a small area of land for its trials. If the extracts found are rich in certain minerals, the company estimates the appropriate size of the land to be bought in that area. Once this is done, negotiations are carried out with the owner, a license is obtained from the local authorities, and thereafter the actual mining for mineral ores starts. The mineral ores obtained are sold to a variety of businesses, which use the ore as a raw material for further processing and use.

The Dilemma

The director and other officials of the company are sitting in the boardroom discussing the various possibilities for buying a piece of land which has attracted the explorers of the company.

A senior explorer specifies that the company’s earlier experience in dealing with the type of land under consideration indicates that geological explorations would cost approximately INR 1 lakh and would yield significant metal deposits as follows:

Manganese 1% chance

Gold 0.05% chance

Silver 0.2% chance

However, geological facts indicate that, most of the time, only one of these three metals is found, i.e., there is neither a chance of finding two or more of these metals at one place, nor a chance of finding any other metal.

Another officer, who has been working closely with the authorities in the area, has come up with an option. The company, if it wishes, may pay INR 75,000 for the right to conduct a 3-day test exploration before deciding whether or not to purchase the piece of land. Such 3-day test explorations can only give a preliminary indication of whether significant metal deposits are present.

The company had previously tried these kinds of options, and past experience indicates that 3-day test explorations cost an average of INR 25,000 and that significant metal deposits are present 50% of the time.

Given the past experience and geological facts, the director asks the officer to identify the possible outcomes of opting for a 3-day test. He developed two scenarios. Firstly, if the 3-day test exploration indicates significant metal deposits, then the chances of finding manganese, gold and silver increase to 3%, 2% and 1% respectively. Secondly, if the 3-day test exploration fails to indicate significant metal deposits, then the chances of finding manganese, gold and silver decrease to 0.75%, 0.04% and 0.175% respectively.

Questions for Discussion


What should NMC do? Should NMC abandon the plans

to buy the land?

One of NMC’s competitors is prepared to pay half of all

costs associated with this piece of land in return for half of

all revenues. Under these circumstances, what should

NMC do?


SECTION 5

Roja Silks


This case study was written by R Muthukumar, IBSCDC. It is intended to be used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situation. The case was compiled from published sources.

Roja Silks (Roja), a leading apparel retailer in Madras, had

started operations in 1996. It had three separate sections for

children, ladies, and men. Roja offered casuals, formals, and

western wear. For ladies, there were salwars in soft velvets,

cool cottons, printed clothes, and also accessories like

branded leather bags, shoes, nightwear, etc. For men, there

were formal shirts, trousers, jeans, T–shirts, suits, ties, socks,

and undergarments, and accessories like sunglasses and

leather products. The company had two manufacturing units

in Trichy and Salem.

The silk Saree business had suddenly seen a big boom in ad

spending during 2002. According to industry estimates, the

category accounted for Rs. 30 crores worth of ad spends in

Chennai alone (2003). Roja spent around Rs. 2 crores on its

launch alone. Another big spender was Nandhini Silks

(Nandhini), which spent around Rs. 1 crore on advertising to

drive home the message that it was not a force that could be

easily ignored. In the past, well–entrenched players such as

Aruna Silks (Aruna) had released ads through local outfits,

only on occasions such as Diwali, Christmas, and Pongal.

Now the needs had changed. The key issue was differentiation and brand building, in what was turning out to be a highly

competitive market. This probably explained the boom in ad

spending.

Roja believed that it was possible to generate faster growth.

The retailer believed that capacity had to be built ahead of the

market demand. The promoters had asked the General Manager (GM) to decide the location of the plant. With two options before him, the GM was somewhat confused:

Construction of a large plant to meet the possible demand

in the future.

Construction of a small plant to meet a low demand and

expanding it when the demand increased.

After detailed discussions with his colleagues, the GM de-

cided to get the help of a consultant for conducting market

research to find out more about the demand pattern. The

consultant believed the probabilities of low, medium, and

high demands were 0.3, 0.4, and 0.3 respectively. The fol-

lowing data was also collected as part of the market re-

search exercise.

If a large plant was constructed at a cost of Rs. 12 lakhs,

it would be able to meet the demand in the future. The op-

erating returns for low, medium and high demands were

estimated at Rs. 10 lakhs, Rs. 16 lakhs and 24 lakhs re-

spectively.

If a small plant was constructed at a cost of Rs. 6 lakhs, it

would meet only low demand and it would have to be ex-

panded, if the demand increased in the future.

Depending upon the demand, a small plant might require

no expansion (for low demand), or might require a small

expansion at a cost of Rs. 3 lakhs (for medium demand),

or might require a large expansion at a cost of Rs. 5 lakhs

(for high demand).

In future, for the sudden expansion of the plant to meet

the demand, some revenues might be lost. The operating

returns to be realized in case of small plant expansion


and large plant expansion were projected at Rs. 14 lakhs and

Rs. 22 lakhs, respectively.

The GM was wondering which of the options he should pursue.
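One way to compare the two options is by expected monetary value (EMV). The sketch below is illustrative only: it nets the construction and expansion costs against the operating returns quoted in the case. The case does not state the return from an unexpanded small plant under low demand, so the figure of Rs. 10 lakhs used here is an assumption (taken to equal the large plant's low-demand return), and all variable names are hypothetical.

```python
# Expected monetary value (EMV) of each plant option, in Rs. lakhs.
demand_probs = {"low": 0.3, "medium": 0.4, "high": 0.3}

# Large plant: construction cost and operating returns as given in the case.
LARGE_COST = 12
large_returns = {"low": 10, "medium": 16, "high": 24}

# Small plant: construction cost and expansion costs as given in the case.
SMALL_COST = 6
expansion_cost = {"low": 0, "medium": 3, "high": 5}
# ASSUMPTION: the case gives no return for an unexpanded small plant, so the
# low-demand return is set to Rs. 10 lakhs (same as the large plant).
small_returns = {"low": 10, "medium": 14, "high": 22}


def emv(net_payoffs, probs):
    """Probability-weighted average of the net payoffs."""
    return sum(probs[level] * net_payoffs[level] for level in probs)


large_net = {d: large_returns[d] - LARGE_COST for d in demand_probs}
small_net = {
    d: small_returns[d] - SMALL_COST - expansion_cost[d] for d in demand_probs
}

emv_large = emv(large_net, demand_probs)
emv_small = emv(small_net, demand_probs)
best_option = "small plant" if emv_small > emv_large else "large plant"

print(f"EMV of large plant: Rs. {emv_large:.2f} lakhs")
print(f"EMV of small plant: Rs. {emv_small:.2f} lakhs")
print("Option with the higher EMV:", best_option)
```

Under these assumptions the small plant shows the higher EMV (about Rs. 6.5 lakhs against Rs. 4.6 lakhs). A different assumed low-demand return for the small plant could change the comparison.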


SECTION 6

Universal Home Care Products

In 2003, Universal Home Care Products Ltd. (Universal),

was one of India’s largest producers of detergents and

cleaning agents with sales of Rs. 1775 crores and a net in-

come of Rs. 112.3 crores. The company’s product line con-

sisted of over 1000 products ranging from industrial chemi-

cals to a variety of household cleaners and detergents. Con-

sumer products accounted for around 50% of the com-

pany’s turnover. Some of its brands had been highly suc-

cessful over a long period of time, while others had been

modified or dropped depending upon market conditions.

The company had an active new product development func-

tion.

Universal’s policy, with respect to any of its household clean-

ing products, was very rigid. The product had to capture at

least 5% share in that particular market within a year of its

introduction, failing which the product was dropped. Re-

cently, the company had developed an all–purpose house-

hold cleaner, ‘Sparkle,’ which was the first of its kind. The

cleaner differed from the traditional cleaners in its versatility.

It could clean a variety of surfaces like wood, glass, metal,

plastic, and ceramic. According to Universal, Sparkle could

remove the toughest of stains on any kind of surface. In ad-

dition, the new product was available as a spray cleaner of-

fering ease of use.

The company’s product management group saw in the new

spray cleaner, an opportunity to market a new product that

could improve the company’s position in the household

cleaner market. Sparkle was tested among 500 house-

wives. Though the product was not complete in all respects,

it got instant sampling. After making minor changes in the

packaging and fragrance, the product would be ready for

the market. Universal projected a market share of 6% by

the first year, 10% by the second year, and 14% after two

years.

Daychem Ltd. (Daychem) was an aggressive competitor

known for its proactive strategies. In the past, Universal and

Daychem had fought fierce battles for market leadership in

almost all the segments in which the companies had a pres-

ence. Past experience had proved that Daychem was fast

in coming out with substitute products in a very short pe-

riod. For all that Universal knew, Daychem might already be

planning to launch a similar multi–purpose household


cleaner, with similar positioning, targeting the same seg-

ments.

The success of ‘Sparkle’ depended on Daychem’s ability to

bring out a competing product and the relationship between

the firm’s pricing structure for Sparkle and the competitor’s

pricing structure for the competing product. The probability that Daychem would bring out a competing product was estimated at 60%. Universal estimated the profits for its new product at three different prices in the absence of competition: Rs. 60,000 at a low price, Rs. 75,000 at a medium price, and Rs. 90,000 at a high price. There was another dimension to the problem because Universal had to take into account the competing product’s pricing as well (refer to Table I).

Universal had to set its price first because it was entering the

market first with its product. Estimates of the probability of

competitor’s prices are as follows:

What should Universal do, with respect to pricing Sparkle, tak-

ing into account all these dimensions?


Table I: Estimated profits (in Rs. '000)

If Universal’s      If competitor’s price is:
price is:           LOW      MEDIUM    HIGH
LOW                 32       40        49
MEDIUM              35       48        50
HIGH                12       30        49

Estimated probabilities of the competitor’s price

If Universal’s      Competitor’s price is expected to be:
price is:           LOW      MEDIUM    HIGH
LOW                 80%      15%       5%
MEDIUM              20%      70%       10%
HIGH                5%       30%       65%
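The pricing question can be approached as a two-stage expected-value calculation. The sketch below assumes that the 60% figure is the probability that a competing product appears, that Table I gives Universal's profit when it does, and that the Rs. 60,000, Rs. 75,000, and Rs. 90,000 figures apply otherwise. All amounts are in Rs. '000, and the names are illustrative.

```python
# Expected profit for each of Universal's price levels, in Rs. '000.
P_COMPETITION = 0.60  # probability that Daychem launches a competing product

# Profit if no competing product appears (from the case text).
profit_no_competition = {"LOW": 60, "MEDIUM": 75, "HIGH": 90}

# Table I: profit given Universal's price (outer key) and competitor's price.
profit_with_competition = {
    "LOW": {"LOW": 32, "MEDIUM": 40, "HIGH": 49},
    "MEDIUM": {"LOW": 35, "MEDIUM": 48, "HIGH": 50},
    "HIGH": {"LOW": 12, "MEDIUM": 30, "HIGH": 49},
}

# Probability of the competitor's price given Universal's price (outer key).
competitor_price_probs = {
    "LOW": {"LOW": 0.80, "MEDIUM": 0.15, "HIGH": 0.05},
    "MEDIUM": {"LOW": 0.20, "MEDIUM": 0.70, "HIGH": 0.10},
    "HIGH": {"LOW": 0.05, "MEDIUM": 0.30, "HIGH": 0.65},
}


def expected_profit(price):
    """Combine the competition and no-competition branches for one price."""
    with_comp = sum(
        competitor_price_probs[price][c] * profit_with_competition[price][c]
        for c in competitor_price_probs[price]
    )
    no_comp = profit_no_competition[price]
    return P_COMPETITION * with_comp + (1 - P_COMPETITION) * no_comp


expected = {price: expected_profit(price) for price in profit_no_competition}
best_price = max(expected, key=expected.get)

for price, value in expected.items():
    print(f"{price:<6} expected profit: Rs. {value:.2f} thousand")
print("Price with the highest expected profit:", best_price)
```

On these figures the high price gives the largest expected profit (about Rs. 60.87 thousand), with the medium price close behind at about Rs. 57.36 thousand.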


Section 6

Case Study: Ram Publishers

Refer to the case study in Chapter 5.


CHAPTER 13

Comprehensive Case Studies

In this chapter we will discuss

Doughnut Bakers

KATT: An Outsourcing Company

SECTION 1

Case Study: Doughnut Bakers


This case study was written by Dr. Sunil Bharadwaj, Professor, Department of Decision Sciences, IBS Hyderabad. It is intended to be used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situation. The case was written from generalized experiences.

Doughnut Bakers is one of the famous bakeries in the heart

of the city. The location of the bakery is good as it lies within

a densely populated area of the city. It is famous for the clas-

sic interiors, exotic cakes, pastries & dishes at reasonable

prices. About a year ago there was a change of manage-

ment. The new management has brought in some changes in

the menu and the staff. However, the management feels that

it has not been able to come up to the expectations of the

customers. Under the previous management, around 90% of the customers used to revisit, and the monthly sales averaged around Rs. 14 lakhs. In the last nine months, however, the monthly sales have averaged around Rs. 11 lakhs, with a standard deviation of around Rs. 4 lakhs. The bakery manager has reported an increase in the number of complaints, which is also evident from the written customer feedback. Most of the customers report that the service is slow. Around 45% of the customers have indicated dissatisfaction with the service quality of the bakery.

To look into the matter a consultant was hired. He visited the

bakery as a disguised customer during peak hour (evening

time). He knows (based on past information) that the average

arrival rate of customers during this particular hour is around

20 customers in an hour, against the seating capacity for 40

customers at a time. As a mathematician at heart he starts

contemplating about the probability that he finds the bakery

empty i.e. without customers or the probability that the bak-

ery is fully packed.
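If customer arrivals during the peak hour are modelled as a Poisson process with a mean of 20 per hour, both probabilities can be sketched as below. The Poisson model is an assumption, and so is the simplification that "fully packed" means at least 40 arrivals within the hour (it ignores how long each customer stays).

```python
import math

ARRIVAL_RATE = 20      # average customers per hour (from the case)
SEATING_CAPACITY = 40  # seats available at a time (from the case)


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


# Probability of no arrivals at all during the hour.
p_empty = poisson_pmf(0, ARRIVAL_RATE)

# ASSUMPTION: "fully packed" is read as 40 or more arrivals in the hour.
p_full = 1.0 - sum(poisson_pmf(k, ARRIVAL_RATE) for k in range(SEATING_CAPACITY))

print(f"P(no customers in the hour)   = {p_empty:.3e}")
print(f"P(40 or more customers arrive) = {p_full:.3e}")
```

Both probabilities come out very small under this model, which suggests that neither extreme is a typical peak-hour situation.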

He then has some snacks and then meets the management.

He goes through the data provided by the management. At the outset the consultant is worried about the seeming decrease in the sales in the recent past. He wants to confirm this with not more than a 5% chance of going wrong.
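The consultant's concern about sales can be framed as a one-tailed, one-sample t-test. The sketch below assumes that the nine recent monthly figures form a random sample from an approximately normal population and that the earlier average of Rs. 14 lakhs is the hypothesised mean. The critical value is taken from a standard t table (one-tailed, 5%, 8 degrees of freedom).

```python
import math

MU_0 = 14.0         # earlier average monthly sales, Rs. lakhs (null hypothesis)
SAMPLE_MEAN = 11.0  # average over the last nine months, Rs. lakhs
SAMPLE_SD = 4.0     # standard deviation over the last nine months, Rs. lakhs
N = 9               # number of months observed

# One-tailed 5% critical value of t with N - 1 = 8 degrees of freedom.
T_CRITICAL = 1.860

# H0: mean sales = 14 lakhs; H1: mean sales < 14 lakhs.
standard_error = SAMPLE_SD / math.sqrt(N)
t_stat = (SAMPLE_MEAN - MU_0) / standard_error
reject_h0 = t_stat < -T_CRITICAL

print(f"t statistic = {t_stat:.3f}, critical value = -{T_CRITICAL}")
print("Evidence of a decrease in sales at the 5% level:", reject_h0)
```

With these figures the statistic is -2.25, which lies beyond the critical value, so the data support a decrease in average sales at the 5% level under the stated assumptions.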

Also he is aware of the slow service delivery. In his experi-

ence if customers spend less time in a bakery on a crowded

day, it is always better for the bakery. Faster service will re-

duce the waiting time for the customers awaiting their turn.

He has observed that on weekends a typical customer

spends around 60 minutes in the bakery. However in most of

the competing bakeries the time is not more than 50 minutes

(In his earlier assignment he has studied a majority of com-

peting bakeries and found that the average time was 45 min-

utes, with a standard deviation of 5 minutes). He is contem-

plating whether the bakery has become inefficient under the

new management as compared to competing bakeries.

He wants to meet a few of the staff members chosen ran-

domly. The overall staff distribution is as shown in Exhibit I.


Nine members are supposed to report to him. Given his passion for mathematics, he is wondering about the probability of five of the members being from the helpers’ category.
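Because the nine staff members are drawn without replacement, this is a hypergeometric probability. The sketch below uses the staff counts listed in Exhibit I and assumes that all 34 staff members, including the bakery manager, are equally likely to be chosen and that "five" means exactly five helpers.

```python
from math import comb

# Staff counts from Exhibit I.
staff = {
    "Bakers": 5,
    "Service managers": 4,
    "Waiters": 16,
    "Helpers": 8,
    "Bakery Manager": 1,
}

total_staff = sum(staff.values())
helpers = staff["Helpers"]
others = total_staff - helpers

SAMPLE_SIZE = 9      # members who report to the consultant
HELPERS_WANTED = 5   # ASSUMPTION: exactly five of the nine are helpers

# Hypergeometric probability: choose 5 of the helpers and 4 of the rest.
p_five_helpers = (
    comb(helpers, HELPERS_WANTED)
    * comb(others, SAMPLE_SIZE - HELPERS_WANTED)
    / comb(total_staff, SAMPLE_SIZE)
)

print(f"Total staff: {total_staff}")
print(f"P(exactly 5 helpers among 9) = {p_five_helpers:.4f}")
```

The result is roughly 0.016, so such a sample would be unusual if the selection is truly random.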

One of the service managers, who happens to be an accountant, gives him the data on the firm’s bakery business shown in Exhibit II.

Looking into the data, he contemplates whether or not it is advisable to offer a larger number of items.
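A first look at this question is the correlation between monthly sales and the number of items kept, using the Exhibit II figures. The sketch below also reports the correlation with the advertising budget, since Exhibit II records it as well. The `pearson_r` helper is written out for illustration, and a correlation on ten yearly observations does not by itself establish cause and effect.

```python
import math

# Exhibit II data, years 2000 to 2009.
sales = [10, 10.5, 11, 11.6, 12, 12.5, 12.8, 13.5, 14, 11]  # Rs. lakhs
advertising = [50, 55, 58, 60, 67, 69, 70, 74, 78, 52]      # units as in Exhibit II
items = [20, 20, 24, 24, 24, 26, 26, 24, 24, 22]            # items kept


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r_items = pearson_r(items, sales)
r_advertising = pearson_r(advertising, sales)

print(f"Correlation of sales with items kept:         {r_items:.3f}")
print(f"Correlation of sales with advertising budget: {r_advertising:.3f}")
```

Sales move more closely with the advertising budget than with the number of items in this data, which is worth keeping in mind before attributing sales growth to a larger menu.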

Based on his meeting with the staff, he assesses that a train-

ing program can boost the morale and skills of the team at

the bakery and will help them in improving the service quality.

Accordingly, he designs and conducts a training program for the staff and collects data on the time taken by the trained staff to perform various operations, with a view to comparing it with the performance prior to the training. Exhibit III shows the data collected on the performance of waiters.

The consultant wants to be 95% sure that his training pro-

gram has reduced the time taken to perform operations by

the waiters so that he gets his pending payment.
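Since the same six waiters are measured before and after training, a paired t-test on the differences is a natural fit. The sketch below uses the Exhibit III times and assumes the differences are approximately normally distributed. The critical value is from a standard t table (one-tailed, 5%, 5 degrees of freedom).

```python
import math
from statistics import mean, stdev

# Exhibit III: service time per customer in minutes, Waiter1 to Waiter6.
before = [4, 4.5, 4, 5, 4, 5.5]
after = [3, 3.5, 3, 4, 2, 4]

# Positive differences mean the waiter became faster after training.
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)

# One-tailed 5% critical value of t with n - 1 = 5 degrees of freedom.
T_CRITICAL = 2.015

# H0: mean difference = 0; H1: mean difference > 0 (time has reduced).
t_stat = mean(diffs) / (stdev(diffs) / math.sqrt(n))
training_helped = t_stat > T_CRITICAL

print(f"Mean reduction = {mean(diffs):.2f} minutes")
print(f"t statistic = {t_stat:.2f}, critical value = {T_CRITICAL}")
print("Reduction significant at the 5% level:", training_helped)
```

The statistic is about 7.32, well above the critical value, so the data support a reduction in service time with 95% confidence under these assumptions.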

The consultant is further promised a reward if the customers’

revisit rate is better than the past rate. After a few months a

survey is carried out by the management and it is found that

out of 200 customers 191 have revisited. Should the management offer the reward to the consultant? The consultant has made a request not to cut down the advertisement budget; rather, he feels it would be better if the advertising budget were increased. Was he right?

Exhibit I: Staff Members Distribution

Bakers              5
Service managers    4
Waiters             16
Helpers             8
Bakery Manager      1

Exhibit II: Sales Data

Year    Monthly Average Sales    Monthly Average        Total number
        (in lakhs)               Advertising Budget     of items kept
2000    10                       50                     20
2001    10.5                     55                     20
2002    11                       58                     24
2003    11.6                     60                     24
2004    12                       67                     24
2005    12.5                     69                     26
2006    12.8                     70                     26
2007    13.5                     74                     24
2008    14                       78                     24
2009    11                       52                     22
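The revisit-rate question raised above can be framed as a one-sample test of a proportion against the earlier rate of 90%. The sketch below uses the normal approximation and a 5% significance level. The case does not state a level for this decision, so the 5% figure is an assumption.

```python
import math
from statistics import NormalDist

P_0 = 0.90      # revisit rate under the previous management
REVISITED = 191  # customers who revisited in the survey
SURVEYED = 200   # customers surveyed
ALPHA = 0.05     # ASSUMPTION: significance level not given in the case

# H0: p = 0.90; H1: p > 0.90 (revisit rate has improved).
p_hat = REVISITED / SURVEYED
standard_error = math.sqrt(P_0 * (1 - P_0) / SURVEYED)
z_stat = (p_hat - P_0) / standard_error
p_value = 1 - NormalDist().cdf(z_stat)
rate_improved = p_value < ALPHA

print(f"Sample revisit rate = {p_hat:.3f}")
print(f"z statistic = {z_stat:.3f}, one-tailed p-value = {p_value:.4f}")
print("Improvement significant at the 5% level:", rate_improved)
```

The statistic is about 2.59, so on this evidence the revisit rate is significantly better than the past rate at the 5% level.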

Prepare a management report addressing all the concerns of

the consultant – both institutional and personal. Are they

really personal?

Exhibit III: Data on Time Taken by Waiters to Service a Customer

Waiter      Time taken before         Time taken after
            training (in minutes)     training (in minutes)
Waiter1     4                         3
Waiter2     4.5                       3.5
Waiter3     4                         3
Waiter4     5                         4
Waiter5     4                         2
Waiter6     5.5                       4

SECTION 2

Case Study: KATT: An Outsourcing Company


This case study was written by Sunil Bhardwaj, Professor, Department of Operations & IT, IBS Hyderabad. It is intended to be

used as the basis for class discussion rather than to illustrate either effective or ineffective handling of a management situation.

The case was prepared from the generalized experiences.

KATT Outsourcing offers a wide range of outsourcing services including IT outsourcing, back office outsourcing, HR outsourcing and many others. Today the company handles many US clients, including Fortune 500 clients, and caters to clients’ diverse needs in a cost effective manner.

The company was started by a young man Ashok, B Tech,

who began his career in IT as a service engineer for IT prod-

ucts in a reputed company in 1986. Within a span of three

years, he was made the head of their North and East service

division. After four years, i.e. in 1990, Ashok joined a com-

pany in US.

As per Ashok, “During my days in US, I learnt the best prac-

tices of software development, client service and business

processes and also established a good rapport with major IT

companies in US.” In 1995, Ashok decided to come back to India and start his own company, which would meet the requirements of US companies at a lower price, and thus KATT Outsourcing was born.

In 1995, the concept of outsourcing was new to India and to the US vendors. Vendors also had apprehensions about the credibility of the new company and its ability to deliver on the strict parameters required. But he was able to overcome all hurdles with a small team of twenty-five employees each at Gurgaon and Bangalore.

Like any new set-up, KATT outsourcing also faced a capital

crunch, but Ashok acted as a one man army and took all the

responsibilities including HR, accounts, administration and

even service delivery on his own shoulders. As he would say,

“KATT Outsourcing is my baby, I was responsible for both its

success and failures”.

Over a period of 15 years, KATT Outsourcing has evolved into a one-stop outsourcing solution provider, with its activities ranging from call centers to product development and maintenance services.

Owing to rapid growth, the company has recently set up a separate marketing and business development department as well as a separate department for customer satisfaction. KATT has expanded its operations after considering the pain points of its customers. Today KATT has 1500+ technically qualified engineers and technicians and 40+ service centers, with a presence in 20+ locations across India.

The company statistics show that the largest percentage of jobs being outsourced is in Information Technology, at around 28%. The next largest field is human resources, with 15% of the outsourcing market, followed closely by sales and marketing outsourcing with 14% and financial services outsourcing with 11%. The remaining 32% is made up of a variety of processes, including administrative outsourcing.

Discussions on three themes are getting popular at KATT nowadays. One is the new office location, the second is the customer survey, and the third is a proposed bid for Microsoft’s outsourcing contract. The details of some of the related activities are as follows:

Decision for New Location

The Chennai office of the company has evolved as a major

center over the years. Recently the company has been plan-

ning to shift its Chennai office to a new building in an SEZ

(which is 40 kms. away from the city) to reap various bene-

fits offered to SEZs, besides the spacious building. A survey

is done by the HR department to assess the acceptability of

the plan. Fifty-six employees from the Chennai office were randomly chosen and asked whether or not they favored moving to the new location. The gender-wise responses of the employees are shown in Exhibit I.

Exhibit I: Results of Employees Survey

              Men    Women    Total
In favour     14     3        17
Opposed       8      31       39
Total         22     34       56

Insights Into the Customers

In order to gain insights into the US outsourcing business, KATT decided to carry out a customer survey. The company first needs to understand the reasons for the clients to choose outsourcing of their business processes. A similar study was done earlier. However, it was a qualitative study in which experts’ opinions were gathered. The study concluded that organizations that outsource are seeking to realize benefits or address the following issues:

1. Cost Advantage: It is about lowering the overall cost of the service to the business. This will involve reducing the scope, defining quality levels, re-pricing, re-negotiation, cost re-structuring, etc. These are approaches to cost economies through off-shoring, called “labor arbitrage”, enabled by the wage gap between industrialized and developing nations.

2. Core Competency: Resources (investment, people, infrastructure, etc.) are focused on developing the core business. For example, organizations often outsource their IT support to specialized IT services companies.

3. Cost Restructuring: Operating leverage is a measure that compares fixed costs to variable costs. Outsourcing changes the balance of this ratio by offering a move from fixed to variable cost and also by making variable costs more predictable.

4. Best Practices: Access to operational best practices which would be too difficult or time consuming to develop in-house.

5. Binding Performance Contract: Services will be provided under a legally binding contract with financial penalties and legal redress. This is not the case with internal services.

6. Quality Improvement: Achieve a step change in quality through contracting out the service with a new service level agreement.

7. Access to Talent: Access to a larger talent pool and a sustainable source of skills, in particular in science and engineering.

8. Management of Capacity: An improved method of capacity management of services and technology where the risk in providing the excess capacity is borne by the supplier.

9. Catalyst for Change: An organization can use an outsourcing agreement as a catalyst for major step change that cannot be achieved alone. The outsourcer becomes a change agent in the process.

10. Reduce Time to Market: The acceleration of the development or production of a product through the additional capability brought by the supplier.

However, the first five aspects were considered to be the most important by the experts. These were studied in detail. The survey resulted in a lot of data collection. For example, Exhibit II shows the number of the 200 respondents (each representing a separate firm) giving a particular rating to the importance of a particular reason for outsourcing (1 = least important, 5 = most important).
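As a first descriptive step, the ratings in Exhibit II can be summarised as a mean rating per reason. The sketch below is only a summary and not a formal test of equal importance. It divides by the responses actually tabulated in each row, because the "Contract" row as printed adds up to 190 rather than 200.

```python
# Exhibit II: number of respondents giving each rating (1 to 5) per reason.
ratings = [1, 2, 3, 4, 5]
responses = {
    "Cost Advantage": [10, 10, 20, 60, 100],
    "Core Competency": [20, 20, 20, 50, 90],
    "Cost Restructuring": [10, 10, 20, 70, 90],
    "Best Practices": [20, 30, 20, 50, 80],
    # NOTE: this row sums to 190 in the exhibit as printed, not 200.
    "Contract": [30, 30, 10, 50, 70],
}


def mean_rating(counts):
    """Weighted average rating, using the counts tabulated for the row."""
    return sum(r * c for r, c in zip(ratings, counts)) / sum(counts)


mean_ratings = {reason: mean_rating(c) for reason, c in responses.items()}

for reason, value in sorted(mean_ratings.items(), key=lambda kv: -kv[1]):
    print(f"{reason:<20} mean rating = {value:.2f}")
```

The means range from about 3.5 to 4.15, which hints that the reasons may not be rated equally. A chi-square test of homogeneity on the same counts would be the usual way to check this formally.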

Another set of data collected (Exhibit III) related to the extent of

cost savings achieved by firms of different sizes (all KATT cli-

ents). Size was defined as Small, Medium and Large depend-

ing on the turnover of the client companies.

KATT is also examining the impact of size on the product devel-

opment collaboration with the outsourcing partners (Exhibit IV).
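Both size-related questions can be examined with a chi-square test of independence. The sketch below applies one helper to the Exhibit III counts and to counts derived from the Exhibit IV percentages. It assumes 50 large firms, which is what the large-firm column of Exhibit III adds up to and what Exhibit IV reports. The critical values are from a standard chi-square table at the 5% level.

```python
def chi_square_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Exhibit III: rows are savings bands, columns are small, medium, large firms.
savings_by_size = [
    [8, 7, 0],
    [22, 15, 8],
    [15, 23, 13],
    [15, 16, 12],
    [15, 14, 17],
]

# Exhibit IV converted to counts: 56% and 60% of 75 firms, 76% of 50 firms.
collaboration_by_size = [
    [42, 45, 38],  # Yes
    [33, 30, 12],  # No
]

# 5% critical values of chi-square for 8 and 2 degrees of freedom.
CRITICAL = {8: 15.507, 2: 5.991}

chi2_savings, df_savings = chi_square_statistic(savings_by_size)
chi2_collab, df_collab = chi_square_statistic(collaboration_by_size)

for label, chi2, df in [
    ("Savings vs size", chi2_savings, df_savings),
    ("Collaboration vs size", chi2_collab, df_collab),
]:
    verdict = "reject" if chi2 > CRITICAL[df] else "do not reject"
    print(f"{label}: chi2 = {chi2:.2f}, df = {df}, {verdict} independence")
```

On these counts neither statistic exceeds its 5% critical value (about 13.23 against 15.507, and 5.44 against 5.991). One expected count in the Exhibit III table is below 5, so that result should be read with some caution.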

Exhibit II: Results of the Customers’ Survey: Number of Responses

Reasons / Ratings     1     2     3     4     5      Total Respondents
Cost Advantage        10    10    20    60    100    200
Core Competency       20    20    20    50    90     200
Cost Restructuring    10    10    20    70    90     200
Best Practices        20    30    20    50    80     200
Contract              30    30    10    50    70     200

Another set of data gives the break-up of KATT’s monthly revenue by region (Exhibit V).
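Whether the regions have similar average monthly revenues can be checked with a one-way analysis of variance (ANOVA) on the Exhibit V figures. The sketch below treats the three regions as independent samples, which is an assumption (the data are matched by month), and uses an approximate 5% critical value for F with 2 and 33 degrees of freedom.

```python
from statistics import mean

# Exhibit V: monthly revenue in $ millions, months 1 to 12.
revenue = {
    "US": [6.1, 4.3, 7.2, 5.5, 5.9, 6.8, 5.3, 4.9, 6.1, 7.0, 4.3, 6.0],
    "UK": [4.9, 5.7, 5.7, 6.3, 6.0, 4.2, 5.8, 5.4, 6.7, 4.0, 5.9, 5.9],
    "Canada": [5.0, 4.7, 4.0, 5.2, 6.0, 3.9, 4.2, 4.9, 5.0, 3.7, 4.2, 4.5],
}

all_values = [x for group in revenue.values() for x in group]
grand_mean = mean(all_values)

# Between-group and within-group sums of squares.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in revenue.values())
ss_within = sum((x - mean(g)) ** 2 for g in revenue.values() for x in g)

df_between = len(revenue) - 1
df_within = len(all_values) - len(revenue)

f_stat = (ss_between / df_between) / (ss_within / df_within)

# Approximate upper 5% point of F with (2, 33) degrees of freedom.
F_CRITICAL = 3.28
means_differ = f_stat > F_CRITICAL

for region, values in revenue.items():
    print(f"{region:<7} mean revenue = {mean(values):.2f}")
print(f"F = {f_stat:.2f} with ({df_between}, {df_within}) degrees of freedom")
print("Regional means differ at the 5% level:", means_differ)
```

The F statistic is about 6.9, above the critical value, which points to a difference among the regional means. Canada has the lowest average.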

Exhibit III: Number of Firms by Extent of Savings and Firm Size

Extent of cost savings    Number of firms by size
by outsourcing            Small firms (75)    Medium firms (75)    Large firms (50)
0%                        8                   7                    0
1-10%                     22                  15                   8
11-20%                    15                  23                   13
20-30%                    15                  16                   12
30% & above               15                  14                   17

Exhibit IV: Product Development Collaboration and Size: percentages of responses by client firms

          Small    Medium    Large
Yes       56%      60%       76%
No        44%      40%       24%
Total     75       75        50

Exhibit V: KATT’s Monthly Revenues by Regions (monthly revenue in $ millions)

Month    US     UK     Canada
1        6.1    4.9    5.0
2        4.3    5.7    4.7
3        7.2    5.7    4.0
4        5.5    6.3    5.2
5        5.9    6.0    6.0
6        6.8    4.2    3.9
7        5.3    5.8    4.2
8        4.9    5.4    4.9
9        6.1    6.7    5.0
10       7.0    4.0    3.7
11       4.3    5.9    4.2
12       6.0    5.9    4.5

Bid for Microsoft Outsourcing Contract

KATT is trying to decide whether or not to bid for an outsourcing contract with Microsoft. It is estimated that mere preparations for the bid will cost Rs 2 lakhs. Past data reveals that there is a 50% chance that an Indian company like KATT will be short-listed (otherwise its bid will be rejected).

Once short-listed, KATT has to furnish further detailed information and prove its competence in handling the project. This may involve expenses as high as Rs 1 lakh. After this stage its bid will either be accepted or rejected.

The company estimates that the labour and material costs associated with the contract are Rs 10 lakhs. It is considering three possible bid prices, namely Rs 15 lakhs, Rs 17 lakhs, and Rs 19 lakhs. It estimates that the probabilities of these bids being accepted (once it has been short-listed) are 0.90, 0.75, and 0.35 respectively.

A consultant is hired to assess the above deal. His opinion is not

to bid for the project.
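The consultant's opinion can be checked against a simple decision-tree calculation. The sketch below assumes that the full Rs 1 lakh is spent once KATT is short-listed, that the bid price is chosen at that stage, that the Rs 10 lakhs of labour and material costs are incurred only if the bid is accepted, and that not bidding has a payoff of zero. All amounts are in Rs lakhs.

```python
PREPARATION_COST = 2.0   # cost of preparing the bid
P_SHORTLISTED = 0.5      # chance of being short-listed
FURTHER_INFO_COST = 1.0  # ASSUMPTION: the full Rs 1 lakh is spent
CONTRACT_COST = 10.0     # labour and material costs if the bid is accepted

# Bid price -> probability of acceptance once short-listed.
acceptance_prob = {15.0: 0.90, 17.0: 0.75, 19.0: 0.35}


def value_if_shortlisted(price):
    """Expected payoff at the short-listed stage for a given bid price."""
    margin = price - CONTRACT_COST
    return acceptance_prob[price] * margin - FURTHER_INFO_COST


stage_values = {price: value_if_shortlisted(price) for price in acceptance_prob}
best_price = max(stage_values, key=stage_values.get)

# KATT may also abandon the bid after short-listing (payoff 0 at that stage).
best_stage_value = max(stage_values[best_price], 0.0)

# Roll back to the initial decision: bid, or do not bid (payoff 0).
emv_bid = P_SHORTLISTED * best_stage_value - PREPARATION_COST
should_bid = emv_bid > 0

for price, value in stage_values.items():
    print(f"Bid Rs {price:.0f} lakhs: value once short-listed = {value:.2f}")
print(f"Best bid price: Rs {best_price:.0f} lakhs")
print(f"EMV of bidding: Rs {emv_bid:.3f} lakhs")
```

Under these assumptions bidding at Rs 17 lakhs has a small positive EMV of Rs 0.125 lakh, so the expected-value criterion alone does not support the consultant's advice, although the margin is thin and sensitive to the estimates.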

Questions for Discussion:

1. A survey of 150 outsourcing companies shows that the largest percentage (about 25%) of jobs outsourced to India is in the area of Information Technology. Can Ashok safely conclude that KATT’s share of IT business is significantly above the industry average?

2. Refer to Exhibit I and answer:

a. What is the probability that a randomly selected employee is a man and is in favor of the new building?

b. What is the probability that a randomly selected employee is a man?

c. What is the probability that a randomly selected employee is in favor of the new building?

d. What is the probability that a randomly selected employee is a man or is in favor of the new building?

e. A randomly selected employee turns out to be male. Compute the probability that he is in favor of the new building.

3. Refer to Exhibit II. Can Ashok conclude that all the five reasons are considered to be of equal importance by the respondents?

4. Refer to Exhibit III. Can Ashok conclude that savings from outsourcing are independent of company size?

5. Refer to Exhibit IV. Is product development collaboration related to the size of the firm?

6. Refer to Exhibit V. Are monthly revenues of KATT across the regions similar?

7. Should KATT follow the consultant’s advice? Why?
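The probabilities asked for in Question 2 follow directly from the counts in Exhibit I. The sketch below uses exact fractions and treats the 56 surveyed employees as the population from which one person is drawn. The variable names are illustrative.

```python
from fractions import Fraction

# Exhibit I counts.
men_favour, women_favour = 14, 3
men_opposed, women_opposed = 8, 31

men = men_favour + men_opposed
in_favour = men_favour + women_favour
total = men_favour + women_favour + men_opposed + women_opposed

# (a) joint probability: man and in favour.
p_man_and_favour = Fraction(men_favour, total)
# (b) marginal probability: man.
p_man = Fraction(men, total)
# (c) marginal probability: in favour.
p_favour = Fraction(in_favour, total)
# (d) addition rule: man or in favour.
p_man_or_favour = p_man + p_favour - p_man_and_favour
# (e) conditional probability: in favour, given man.
p_favour_given_man = p_man_and_favour / p_man

answers = {
    "a": p_man_and_favour,
    "b": p_man,
    "c": p_favour,
    "d": p_man_or_favour,
    "e": p_favour_given_man,
}
for part, value in answers.items():
    print(f"({part}) {value} = {float(value):.4f}")
```

Part (e) is much larger than part (c), which already suggests that opinion on the move is related to gender in this sample.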


